How to convert a sequence of 32 char (0/1) to 32 bits (uint32_t)?

I have an array of char (usually thousands of bytes long) read from a file, where every element holds the value 0 or 1 (not the characters '0' and '1', in which case I could use strtoul). I want to pack these into single bits, converting every 32 chars into a single uint32_t. Should I write a bit shift expression with 32 terms, or is there a saner way?

out[i/32] = 
    data[i] << 31 |
    data[i+1] << 30 |
    data[i+2] << 29 |
    data[i+3] << 28 |
    data[i+4] << 27 |
    data[i+5] << 26 |
    data[i+6] << 25 |
    data[i+7] << 24 |
    data[i+8] << 23 |
    data[i+9] << 22 |
    data[i+10] << 21 |
    data[i+11] << 20 |
    data[i+12] << 19 |
    data[i+13] << 18 |
    data[i+14] << 17 |
    data[i+15] << 16 |
    data[i+16] << 15 |
    data[i+17] << 14 |
    data[i+18] << 13 |
    data[i+19] << 12 |
    data[i+20] << 11 |
    data[i+21] << 10 |
    data[i+22] << 9 |
    data[i+23] << 8 |
    data[i+24] << 7 |
    data[i+25] << 6 |
    data[i+26] << 5 |
    data[i+27] << 4 |
    data[i+28] << 3 |
    data[i+29] << 2 |
    data[i+30] << 1 |
    data[i+31];

If this monstrous bit shift is the fastest in run time, then I'll have to stick to it.

Restricted to the x86 platform, you can use the PEXT instruction. It is part of the BMI2 instruction set extension on newer processors.

Use the 32-bit form of the instruction several times in a row and then merge the results into one value with shifts.

This is probably the optimal approach on Intel processors, but the disadvantage is that this instruction is slow on older AMD Ryzen processors (those before Zen 3).
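As a rough sketch of what this could look like (an interpretation, not code from the answer): `_pext_u32` with the mask `0x01010101` gathers the lowest bit of each of 4 input bytes into 4 contiguous bits, and eight such results are merged with shifts. The function name `pack32_pext` is made up for illustration, the output is LSB-first (`in[0]` lands in bit 0, unlike the MSB-first layout in the question), a little-endian host is assumed, and a plain loop is used as a fallback when the compiler is not targeting BMI2.

```cpp
#include <cstdint>
#include <cstring>
#if defined(__BMI2__)
#include <immintrin.h>
#endif

// Hypothetical helper: packs 32 bytes, each holding 0 or 1, into one word.
// in[i] ends up in bit i of the result (LSB-first).
uint32_t pack32_pext(const unsigned char* in) {
    uint32_t out = 0;
    for (int g = 0; g < 8; ++g) {
        uint32_t nibble = 0;
#if defined(__BMI2__)
        uint32_t chunk;
        std::memcpy(&chunk, in + 4 * g, 4);  // 4 input bytes; byte j sits at bit 8*j
        // Gather bit 0 of every byte into 4 adjacent bits.
        nibble = _pext_u32(chunk, 0x01010101u);
#else
        // Portable fallback producing the same bits when BMI2 is not enabled.
        for (int j = 0; j < 4; ++j)
            nibble |= static_cast<uint32_t>(in[4 * g + j] & 1u) << j;
#endif
        out |= nibble << (4 * g);  // merge the partial results with shifts
    }
    return out;
}
```

On x86-64 the 64-bit form `_pext_u64` with the mask `0x0101010101010101` would handle 8 input bytes per instruction. With GCC or Clang the PEXT path is only compiled in when BMI2 is enabled, for example with `-mbmi2`.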

If you don't need the output bits to appear in exactly the same order as the input bytes, and they can instead be "interleaved" in a specific way, then a fast and portable way to accomplish this is to take 8 blocks of 8 bytes (64 bytes total) and combine all the LSBs into a single 8-byte value.

Something like this, shown here in its 32-bit form (8 blocks of 4 bytes packed into a uint32_t):

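#include <cstdint>
#include <cstring>

// Packs 32 bytes, each holding 0 or 1, into one 32-bit word (interleaved bit order).
uint32_t extract_lsbs2(const uint8_t (&input)[32]) {
  uint32_t t0, t1, t2, t3, t4, t5, t6, t7;
  // memcpy sidesteps alignment and aliasing issues; compilers typically emit plain loads.
  std::memcpy(&t0, input + 0 * 4, 4);
  std::memcpy(&t1, input + 1 * 4, 4);
  std::memcpy(&t2, input + 2 * 4, 4);
  std::memcpy(&t3, input + 3 * 4, 4);
  std::memcpy(&t4, input + 4 * 4, 4);
  std::memcpy(&t5, input + 5 * 4, 4);
  std::memcpy(&t6, input + 6 * 4, 4);
  std::memcpy(&t7, input + 7 * 4, 4);

  // Each block carries its data in bits 0, 8, 16 and 24; shifting block k by k interleaves them.
  return
    (t0 << 0) |
    (t1 << 1) |
    (t2 << 2) |
    (t3 << 3) |
    (t4 << 4) |
    (t5 << 5) |
    (t6 << 6) |
    (t7 << 7);
}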

This generates "not terrible, not great" code on most compilers.

If you use uint64_t instead of uint32_t it would generally be twice as fast (assuming you have more than 32 total bytes to transform) on a 64-bit platform.
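A minimal sketch of that 64-bit variant, assuming a little-endian host and input bytes that are strictly 0 or 1. The name `extract_lsbs64` is made up for illustration, and the loop stands in for the eight explicit loads of the 32-bit version.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical 64-bit counterpart of extract_lsbs2: 64 input bytes
// (each 0 or 1) become one 64-bit word in interleaved order.
uint64_t extract_lsbs64(const uint8_t* input) {
    uint64_t result = 0;
    for (int k = 0; k < 8; ++k) {
        uint64_t t;
        std::memcpy(&t, input + 8 * k, 8);  // block k: byte j sits at bit 8*j
        result |= t << k;                   // input[8*k + j] lands in bit 8*j + k
    }
    return result;
}
```

The interleaving means that bit `8*j + k` of the result comes from `input[8*k + j]`, so a reader of the packed data has to apply the same mapping in reverse.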

With SIMD you could easily vectorize the entire operation in something like two instructions (shown for AVX2, but any x86 SIMD ISA will work): a compare and pmovmskb.
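A sketch of the compare-plus-movemask idea using SSE2 intrinsics, which handle 16 bytes per step, so 32 input bytes take two steps. The name `pack32_movemask` is an assumption for illustration, the result is LSB-first (`in[0]` lands in bit 0), and a plain loop is used when SSE2 is not available.

```cpp
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper: packs 32 bytes, each holding 0 or 1, into one word,
// with in[i] ending up in bit i of the result.
uint32_t pack32_movemask(const unsigned char* in) {
#if defined(__SSE2__)
    const __m128i ones = _mm_set1_epi8(1);
    __m128i lo = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in));
    __m128i hi = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in + 16));
    // Bytes equal to 1 become 0xFF, bytes equal to 0 become 0x00.
    lo = _mm_cmpeq_epi8(lo, ones);
    hi = _mm_cmpeq_epi8(hi, ones);
    // movemask collects the top bit of every byte: byte i goes to bit i.
    uint32_t lo_bits = static_cast<uint32_t>(_mm_movemask_epi8(lo));
    uint32_t hi_bits = static_cast<uint32_t>(_mm_movemask_epi8(hi));
    return lo_bits | (hi_bits << 16);
#else
    // Portable fallback with the same bit order.
    uint32_t out = 0;
    for (int i = 0; i < 32; ++i)
        out |= static_cast<uint32_t>(in[i] & 1u) << i;
    return out;
#endif
}
```

With AVX2 the same pattern should fit in one 32-byte load, one compare and one movemask, which is presumably what the "two instructions" estimate refers to.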

Bit shifting is the simplest way to go about this. Better to write code that reflects what you're actually doing rather than trying to micro-optimize.

So you want something like this:

char bits[32];
// populate bits
uint32_t value = 0;
for (int i=0; i<32; i++) {
    value |= (uint32_t)(bits[i] & 1) << i;
}
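Note that the loop above stores bits[i] in bit i, while the expression in the question stores data[i] in bit 31. A sketch that applies the same loop to a whole buffer and keeps the question's MSB-first layout could look like the following; the name `pack_bits_msb_first` and the zero padding of a short final group are assumptions, not part of the original answer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: packs n bytes (each 0 or 1) into 32-bit words.
// Within each group of 32, the first byte goes to bit 31 and the last to bit 0.
// A final group shorter than 32 bytes is padded with zero bits.
std::vector<uint32_t> pack_bits_msb_first(const char* data, std::size_t n) {
    std::vector<uint32_t> out((n + 31) / 32, 0);
    for (std::size_t i = 0; i < n; ++i) {
        // Cast before shifting so the shift is done on an unsigned 32-bit value.
        uint32_t bit = static_cast<uint32_t>(data[i] & 1);
        out[i / 32] |= bit << (31 - i % 32);
    }
    return out;
}
```

The cast matters because a bare `data[i] << 31` shifts a promoted int into its sign bit, which C and some C++ standards leave undefined or implementation-defined.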

Comments
  • Yep, bitwise or and shift as you iterate over each byte is about all you can do in this case I think.
  • The entire file is only 32 bytes long? Or you actually want to do this to a long stream of bytes?
  • @BeeOnRope It depends on the size of the raster. I'm testing it on a 40,000 char long (equivalent to a row in the raster).
  • How to create a byte out of 8 bool values (and vice versa)?, What's the fastest way to pack 32 0/1 values into the bits of a single 32-bit variable?, How to efficiently convert an 8-bit bitmap to array of 0/1 integers with x86 SIMD
  • Possible duplicate of What's the fastest way to pack 32 0/1 values into the bits of a single 32-bit variable?
  • I can imagine a sequence of pand, pmaddubsw, phaddw, phaddd that would be compatible with more machines than AVX2 and likely to be quite fast...
  • @IwillnotexistIdonotexist - I wasn't thinking of any special instructions in AVX2, just that it happens that the 32 bytes the OP suggested need only 1 AVX instruction. You could do it with SSE, just at half the speed. To be clear, I am thinking of a cmpeq; movmskb sequence, with a scalar store. This should achieve close to 64 bits of output a cycle (combining two 32-bit regs into a 64-bit one to save store bandwidth), which means close to 64 bytes of input per cycle. It will be hard to beat that, I think, with a fully-SIMD solution?
  • Thank you. Please take a look at my edit. Isn't a loop less efficient than a single operation? Or are both the same to the processor?
  • @Rodrigo Many compilers will perform loop unrolling as an optimization. Don't worry about that level of manual optimization unless you've identified a specific bottleneck.