See Also
I've come across skepticism about Tcl's facility for bit manipulation one too many times. While RS's usual tours de force such as "Playing with bits" and "Big bitstring operations" show how good Tcl can be at bit manipulation, this page has a far more limited ambition: simply to help hardware- or C-oriented developers to feel comfortable working on low-level data in a higher-level language.
# Let's experiment: one model for "low-level data" is a byte array, or
# byte sequence. A sequence of eight-bit values is often the manifestation
# of information received from a physical device through a serial port, or
# from a remote host through a network connection. Start, then, with
# sample data, a sequence of seven eight-bit quantities.
set sample \x63\x77\x54\x00\x83\x41\x42
# While they're all there, only some are displayable.
puts $sample
# Let's make a utility that'll show the content of a byte
# sequence as hex data:
proc show byte_sequence {
    binary scan $byte_sequence H* x
    puts [regsub -all (..) $x {\1 }]
}
# Display our sample data.
show $sample
# Suppose two bits combine to make some type specification. The mask
# 0x06 selects bits 1 and 2, counting the least-significant bit as bit 0.
# Let's look at them:
foreach byte [split $sample {}] {
    puts "The type of this byte is '[expr {0x06 & [scan $byte %c]}]'."
}
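# The mask 0x06 leaves the selected bits where they sit, so the values
# printed above are 0, 2, 4, or 6. A minimal sketch, assuming the type
# field should read as a small number from 0 to 3: shift the byte right
# by one before masking. The name "type_field" is hypothetical.
proc type_field byte {
    # Convert the character to its integer code, drop bit 0, keep two bits.
    expr {([scan $byte %c] >> 1) & 0x03}
}
foreach byte [split \x63\x77\x54 {}] {
    puts "The shifted type of this byte is '[type_field $byte]'."
}
# For these three bytes the shifted types are 1, 3, and 2.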
# We can "mask off" unwanted bits. Here 0x3F keeps the low six bits
# and clears the top two.
foreach byte [split $sample {}] {
    puts "After masking, we're looking at '[format %02X [expr {0x3F & [scan $byte %c]}]]'."
}
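# Splitting the string and scanning each character is one route to the
# numeric byte values. As an alternative sketch, "binary scan" with "c*"
# converts the whole sequence at once. It yields signed values, so each
# is masked with 0xFF to recover 0..255. The variable names here are
# hypothetical.
set demo_bytes \x63\x77\x54\x00\x83\x41\x42
binary scan $demo_bytes c* signed_values
set unsigned_values {}
foreach v $signed_values {
    lappend unsigned_values [expr {$v & 0xFF}]
}
# signed_values is "99 119 84 0 -125 65 66".
puts $signed_values
# unsigned_values is "99 119 84 0 131 65 66".
puts $unsigned_values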
# This sample is a sequence of five sixteen-bit quantities.
set word_sample \u0001\u0020\u0300\u4000\uFEDC
proc show_words word_sequence {
    foreach word [split $word_sequence {}] {
        puts -nonewline "[format %04X [scan $word %c]] "
    }
    puts ""
}
# Suppose I need to look just at words three and four, counting from zero.
set subsample [string range $word_sample 3 4]
# This next displays "4000 FEDC".
show_words $subsample
# We can mask off any bits we choose.
foreach word [split $word_sample {}] {
    puts "After masking, we see '[format %04X [expr {0xF0FF & [scan $word %c]}]]'."
}
# The output from that last should have been:
# After masking, we see '0001'.
# After masking, we see '0020'.
# After masking, we see '0000'.
# After masking, we see '4000'.
# After masking, we see 'F0DC'.
Thanks to CLN for a drastic simplification of what follows.
# It sometimes happens that vendors define sixteen-bit protocols that they, in effect,
# force through eight-bit pipes. A network receiver might, for example, receive bytes
# we'll label \x01\x03\x54\x80, with the direction to interpret these as the two
# sixteen-bit words \u0301\u8054 (notice that we're entering the realm of endian
# affairs). Here's a model for handling such cases:
set byte_sequence \x01\x03\x54\x80\x33\x34
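# Notice that "s*" and "S*" account for the two byte orders: "s" reads
# little-endian sixteen-bit words and "S" reads big-endian ones. Both
# yield signed decimal integers.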
binary scan $byte_sequence s* display_word_sequence
puts "Here are the words: '$display_word_sequence'."
binary scan [string range $byte_sequence 0 3] S2 first_two_words
puts "Here are the first two words of the byte sequence: '$first_two_words'."
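# Because "binary scan" returns signed integers for "s" and "S", a word
# with its top bit set comes back negative. A minimal sketch of getting
# back to the unsigned hex view: mask each word with 0xFFFF and format
# it. The variable names here are hypothetical.
set wire_bytes \x01\x03\x54\x80\x33\x34
binary scan $wire_bytes s* signed_words
set hex_words {}
foreach w $signed_words {
    lappend hex_words [format %04X [expr {$w & 0xFFFF}]]
}
# signed_words is "769 -32684 13363".
puts "Signed words: '$signed_words'."
# hex_words is "0301 8054 3433".
puts "Hex words: '$hex_words'."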
# Also: remarks on RE (string trim; (..?)).
Size in Bits
PYK 2016-01-16: bitsize returns the number of bits used to represent a positive integer.
proc bitsize value {
    set p -1
    while {$value >= (1 << [incr p])} {}
    return $p
}
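The same count can be reached from the other direction. As a sketch only, under the hypothetical name bitsize_shift, this variant shifts the value right until nothing is left and counts the shifts. It is not PYK's code, just an illustration of what the result means.
proc bitsize_shift value {
    set bits 0
    # Each right shift discards one bit; stop once the value is exhausted.
    while {$value > 0} {
        set value [expr {$value >> 1}]
        incr bits
    }
    return $bits
}
# Prints 0, 1, 8, and 9 bits for the values 0, 1, 255, and 256.
foreach v {0 1 255 256} {
    puts "$v needs [bitsize_shift $v] bit(s)"
}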