Portable conversion of data endianness using the Linux kernel's API

How can I improve the following code, that is, make it more robust with respect to type safety and endianness using the functions and macros in the Linux kernel's API? For instance, in the following example src_data is an array of two 16-bit signed integers (typically stored in little endian order) and is to be sent out via UART in big endian byte order.

s16 src_data[2] = {...}; /* note: this is signed data! */
u8 tx_data[4];

u8* src_data_u8 = (u8*)src_data;

tx_data[0] = src_data_u8[1];
tx_data[1] = src_data_u8[0];
tx_data[2] = src_data_u8[3];
tx_data[3] = src_data_u8[2];

I think the functions cpu_to_be16 and cpu_to_be16p should play a role in this conversion, although I'm not sure how to use them in a way that is safe and robust to endianness.

As I understand it, two 16-bit words are to be sent one after another, each converted to big-endian format first. I think the following should be OK.

s16 src_data[2] = {...}; /* note: this is signed data! */
__be16 tx_data[2];       /* holds the values in big-endian byte order */

/* cpu_to_be16() takes an unsigned 16-bit argument, hence the casts */
tx_data[0] = cpu_to_be16((u16)src_data[0]);
tx_data[1] = cpu_to_be16((u16)src_data[1]);

Your safety concern seems to be that the htons(x) function/macro expects an unsigned integer, but you have a signed one. That is not a problem:

union {
    int16_t signed_repr;
    uint16_t unsigned_repr;
} data;

data.signed_repr = ...;

u16 unsigned_big_endian_data = htons(data.unsigned_repr);

memcpy(tx_data, &unsigned_big_endian_data,
       min(sizeof tx_data, sizeof unsigned_big_endian_data));

P.S. In C, type punning through a union is well-defined.
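
For readers who want to try the union approach outside the kernel, here is a minimal userspace sketch. It assumes a POSIX system that provides htons() in <arpa/inet.h>; the helper name s16_to_wire is hypothetical and not part of any API.

```c
#include <arpa/inet.h> /* htons() */
#include <stdint.h>
#include <string.h>    /* memcpy() */

/*
 * Hypothetical helper: store one signed 16-bit value into out[0..1]
 * in big-endian (network) byte order.
 */
static void s16_to_wire(int16_t value, uint8_t out[2])
{
    union {
        int16_t signed_repr;
        uint16_t unsigned_repr;
    } data;
    uint16_t be;

    data.signed_repr = value;       /* write through the signed member */
    be = htons(data.unsigned_repr); /* read the same bits as unsigned  */
    memcpy(out, &be, sizeof be);    /* copy the bytes, no pointer cast */
}
```

Calling s16_to_wire(-5, out) should leave the bytes 0xFF, 0xFB in out on both little-endian and big-endian hosts, since -5 is 0xFFFB as a 16-bit two's complement value.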

I believe the following is one of the best answers to my question. I have used the links provided by @0andriy to existing examples in the kernel source code.

Converting a signed 16-bit value for transmitting

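s16 src = -5;
u8 dst[2];

/* cpu_to_be16() takes an unsigned argument, hence the cast.
 * Note: the store through __be16 * assumes dst is 16-bit aligned. */
*(__be16 *)dst = cpu_to_be16((u16)src);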

Converting multiple signed 16-bit values for transmitting

s16 src[2] = {-5,-2};
u8 dst[4];
s16* psrc = src;
u8* pdst = dst;
int len = sizeof(src);

for ( ; len > 1; len -= 2) {
    /* cpu_to_be16p() expects a pointer to an unsigned 16-bit value */
    *(__be16 *)pdst = cpu_to_be16p((const u16 *)psrc++);
    pdst += 2;
}

A quick disclaimer: I still need to check whether this code is correct and compiles.

Overall, I'm a bit unsatisfied with the solution for copying and converting the endianness of multiple values, since it is prone to typos and could easily be wrapped in a macro or helper function.
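
One way to avoid both the typo-prone loop and the pointer cast on dst is a small helper function. The sketch below is userspace C, not kernel code: put_be16 is a stand-in modeled on the kernel's put_unaligned_be16(), and pack_s16_be is a hypothetical name introduced only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in modeled on the kernel's put_unaligned_be16():
 * writes val to p[0..1], most significant byte first. */
static void put_be16(uint16_t val, uint8_t *p)
{
    p[0] = (uint8_t)(val >> 8);
    p[1] = (uint8_t)(val & 0xFF);
}

/* Hypothetical helper: serialize n signed 16-bit values as big-endian
 * bytes. dst must have room for 2 * n bytes. */
static void pack_s16_be(const int16_t *src, size_t n, uint8_t *dst)
{
    size_t i;

    for (i = 0; i < n; i++) {
        /* Converting to uint16_t keeps the two's complement bit pattern. */
        put_be16((uint16_t)src[i], dst + 2 * i);
    }
}
```

Because the bytes are written one at a time, dst needs no particular alignment, and the element count is passed explicitly instead of being derived from sizeof inside the loop.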

If the Linux machine will always be little endian, and the protocol will always be big endian, then the code works fine and you don't need to change anything.

If you for some reason need to make the Linux code endian-independent, then you'd use:

tx_data[0] = ((unsigned int)src_data[0] >> 8) & 0xFF;
tx_data[1] = ((unsigned int)src_data[0] >> 0) & 0xFF;
tx_data[2] = ((unsigned int)src_data[1] >> 8) & 0xFF;
tx_data[3] = ((unsigned int)src_data[1] >> 0) & 0xFF;

The cast ensures that the right shifts are not carried out on a signed type; right-shifting a negative signed value is implementation-defined and therefore non-portable.

The advantage of bit shifts compared to any other version is that they work on an abstraction level above the hardware and endianness, letting the specific compiler generate the instructions for the underlying memory access. Code such as u16 >> 8 always means "give me the most significant byte" regardless of where that byte is stored in memory.
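
A small userspace sketch can make the distinction between value and memory layout concrete. The function names below are hypothetical; host_is_little_endian looks at how a value is laid out in memory, while high_byte and low_byte operate purely on the value.

```c
#include <stdint.h>
#include <string.h>

/* Inspects the first byte in memory of a known 16-bit value.
 * Returns 1 on a little-endian host and 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    uint16_t probe = 0x1234;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 0x34;
}

/* Value-level extraction: the result does not depend on the host's
 * memory layout. The cast avoids right-shifting a signed type. */
static uint8_t high_byte(int16_t v)
{
    return (uint8_t)(((unsigned int)v >> 8) & 0xFF);
}

static uint8_t low_byte(int16_t v)
{
    return (uint8_t)((unsigned int)v & 0xFF);
}
```

The result of host_is_little_endian() differs between machines, but high_byte(0x1234) is 0x12 and low_byte(0x1234) is 0x34 everywhere, which is why the shift-based serialization needs no conversion macro.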

Comments
  • What exactly is the issue you want to solve, that the comments in the header file you link do not help you with? Make your question more specific.
  • If you write code like this, be careful with the de-serialization of incoming protocols. You can't go from, let's say, u8 rx_data[4] to s16 [2] by using pointer conversions - that would be a strict aliasing violation. You get away with it here because you go from "any type" to character type, which is a special allowed case.
  • @MichaelFoukarakis The functions/macros in the header file don't seem safe since they only accept unsigned arguments... Furthermore, it is not clear to me how I can safely store the intermediate result before putting the converted data into my u8 tx_data array...
  • Thanks for the input @0andriy, the protocol is fixed as big endian, and the target machine (an AVR) is little endian. As I understand, in practice, pretty much all Linux based machines are also little endian, but nonetheless, I would like to know how to achieve this conversion in a clean, platform independent way.
  • @0andriy It is fairly clear that you have never written a single line of hardware-related programming in your life. Sorry, but TCP/IP sockets is not hardware, it is layer 3 and 4 or so. I write drivers for embedded systems everyday and have done so for the past 15 years, but thanks for sharing your wisdom about something you have never worked with.
  • Just would like to keep it here: davmac.wordpress.com/2010/02/26/c99-revisited
  • This is almost a complete answer. If you could add a short explanation as to why this solution is endian-independent and confirm that (or speculate as to why) there isn't already a function in the Linux kernel's API to do or assist this conversion, I will accept the answer.
  • More specifically, why would tx_data[0] contain the MSB of src_data[0]? On a little endian machine, I would have thought that ((unsigned int)src_data[0] >> 8) would give you the LSB, since the LSB is at the lower address...
  • @allsey87 I added a bit of explanation. Basically, C bit shifts don't know or care where bytes are stored in memory. There's probably some library functions, but this is such fundamental stuff that I see no reason for using a function.
  • I see! So essentially, any data type in C can be visualised as a series of adjacent bytes with the MSB on the left and the LSB on the right, regardless of endianness.
  • @allsey87 Only integer data types, since these are the only ones we can shift.
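
Regarding the de-serialization caveat raised in the comments, the receive direction can be handled with shifts as well, so that no u8 pointer is ever converted to an s16 pointer. This is a minimal userspace sketch under that assumption; get_s16_be and unpack_s16_be are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Rebuild one signed 16-bit value from two big-endian bytes. */
static int16_t get_s16_be(const uint8_t *p)
{
    uint16_t u = (uint16_t)(((unsigned int)p[0] << 8) | p[1]);

    /* Map the bit pattern to a signed value explicitly rather than
     * relying on an implementation-defined out-of-range conversion. */
    if (u & 0x8000u)
        return (int16_t)((int32_t)u - 65536);
    return (int16_t)u;
}

/* Hypothetical helper: decode n big-endian 16-bit values from src
 * (2 * n bytes) into dst. */
static void unpack_s16_be(const uint8_t *src, size_t n, int16_t *dst)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = get_s16_be(src + 2 * i);
}
```

Since the bytes are only ever read as bytes, the code has no alignment requirement on the receive buffer and does not depend on the host's byte order.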