Understanding Big and Little Endian Byte Order

Question from Doran: "How can 2 chars allocate to an unsigned short? It just doesn’t make sense to me.
I’ve heard about NUXI a few times before, and I still can’t get it. Can you please explain it for me? (even in C)"

My reply:

On a typical 32-bit computer, a short is composed of 16 bits (2 bytes). To set the value, you can specify the short using two bytes, which is 4 hex characters (in C):

short a = 0x1234;
short b = 0x5678;

So short a holds 0x1234 (4,660 decimal) and b holds 0x5678 (22,136 decimal). Now, instead of using the hex digits “0-9” and “A-F”, let’s just use U, N, I and X to represent each byte. For example, U could be 0x12, N could be 0x34, I could be 0x56, and X could be 0x78.

short a = 0xUN;
short b = 0xIX;
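To see how two byte-sized chars fit into one short, here is a minimal sketch in C. The function names (high_byte, low_byte, join_bytes) are made up for this illustration, and uint16_t stands in for a 16-bit unsigned short.

```c
#include <assert.h>
#include <stdint.h>

/* High-order byte of a 16-bit value: the "U" in 0xUN. */
static uint8_t high_byte(uint16_t v) {
    return (uint8_t)(v >> 8);
}

/* Low-order byte of a 16-bit value: the "N" in 0xUN. */
static uint8_t low_byte(uint16_t v) {
    return (uint8_t)(v & 0xFF);
}

/* Rebuild the 16-bit value from its two bytes. */
static uint16_t join_bytes(uint8_t high, uint8_t low) {
    uint16_t v = (uint16_t)(((uint16_t)high << 8) | low);
    /* Sanity check: splitting the result gives back the inputs. */
    assert(high_byte(v) == high && low_byte(v) == low);
    return v;
}
```

With a = 0x1234, high_byte(a) is 0x12 and low_byte(a) is 0x34. These shifts work on the value, not on memory, so they give the same answer on big-endian and little-endian machines.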

On either kind of machine, suppose these shorts are stored consecutively in memory, starting at address 0. Addresses 0 and 1 would hold “a”, and addresses 2 and 3 would hold “b”. [Again, each short takes up 2 bytes].

On a big-endian machine, the data would look like this:

Addr 0: U
Addr 1: N
Addr 2: I
Addr 3: X

On a little-endian machine, we store the smallest part of the number first. That is, in a = 0xUN, we store “N” first, since it holds the low-order bits. So in memory it would look like this:

Addr 0: N
Addr 1: U
Addr 2: X
Addr 3: I

Hence the “NUXI” problem. On a big-endian machine the data looks like UNIX, on a little-endian machine the data looks like NUXI. This isn’t a problem if you stay on the same machine (each machine knows how to convert appropriately), but can be a problem if you are exchanging binary data between machines.
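The two memory layouts above can be reproduced with a short C sketch. It assumes ASCII, so the bytes U, N, I and X are 0x55, 0x4E, 0x49 and 0x58, and the helper names (store_be16, store_le16, layout, host_is_little_endian) are invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write v into out[0..1], most significant byte first (big-endian). */
static void store_be16(uint16_t v, unsigned char *out) {
    out[0] = (unsigned char)(v >> 8);
    out[1] = (unsigned char)(v & 0xFF);
}

/* Write v into out[0..1], least significant byte first (little-endian). */
static void store_le16(uint16_t v, unsigned char *out) {
    out[0] = (unsigned char)(v & 0xFF);
    out[1] = (unsigned char)(v >> 8);
}

/* Lay out two shorts back to back (addresses 0..3) and return the
   four bytes as a NUL-terminated string. */
static void layout(uint16_t a, uint16_t b, int little_endian, char out[5]) {
    unsigned char mem[4];
    assert(out != NULL);
    if (little_endian) {
        store_le16(a, &mem[0]);
        store_le16(b, &mem[2]);
    } else {
        store_be16(a, &mem[0]);
        store_be16(b, &mem[2]);
    }
    memcpy(out, mem, 4);
    out[4] = '\0';
}

/* Returns 1 if this machine stores the low-order byte first. */
static int host_is_little_endian(void) {
    uint16_t probe = 0x0001;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0x01;
}
```

Calling layout(0x554E, 0x4958, 0, buf) fills buf with "UNIX", and passing 1 for little_endian gives "NUXI". The host_is_little_endian check copies the first byte of a real short, which is one common way to find out which layout your own machine uses.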

Hope this helps,

-Kalid

Well, the information is good!
I have one question: if big-endian has no great advantage over little-endian, then why is network byte order big-endian?

While there are advantages to each, I don’t think one is clearly better than the other. I think they just had to choose one or the other, and big-endian may have been the more popular format at the time :).

Thank you so much. I’ve been lucky thus far doing high-level coding, but having rolled my sleeves up to start mucking about with bits and bytes, this has been very helpful indeed.

kudos!

Thanks Steve, glad you found it useful! It’s fun to dip into bits & bytes every once in a while :).

That was a damn good explanation. Thanks a lot for the post.

Hi Ramkumar, you’re welcome. Glad you liked it.

this is a great read

Thanks Socal!

How can I change the byte order from big-endian to little-endian, and vice versa, without breaking the structure when we are sending a struct?

Hi avvy, you can use the “host to network” and “network to host” functions to convert data (more info here: http://linux.die.net/man/3/htons). You’d have to convert each field in the structure separately.
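As a rough sketch of the field-by-field conversion, assuming a POSIX system that provides <arpa/inet.h>: the struct message and its two fields are hypothetical, chosen only to show one 16-bit and one 32-bit field.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structure, used only for illustration. */
struct message {
    uint16_t id;      /* 16-bit field: use htons / ntohs */
    uint32_t length;  /* 32-bit field: use htonl / ntohl */
};

/* Convert each field from host order to network (big-endian) order. */
static struct message message_to_network(struct message m) {
    /* The conversion function must match the width of the field. */
    assert(sizeof m.id == 2 && sizeof m.length == 4);
    m.id = htons(m.id);
    m.length = htonl(m.length);
    return m;
}

/* Convert each field from network order back to host order. */
static struct message message_to_host(struct message m) {
    m.id = ntohs(m.id);
    m.length = ntohl(m.length);
    return m;
}
```

The sender would call message_to_network before writing the fields out, and the receiver would call message_to_host after reading them in. Single-byte fields need no conversion, and padding between fields is a separate concern that this sketch does not address.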

Hi Kalid,

I could not find a better picture of this topic anywhere I searched. I was looking up endianness in a hurry and wanted something that would at least give the details in short, so I feel lucky to have found your post. It came like an angel, as I urgently needed to learn about endianness.
You told the whole story very briefly. Details perfect… Flow of explanation perfect! Kudos!
I’d appreciate it if you could drop me a short mail whenever you post stuff like this. :)

Thanks a ton!

Hi Shweta, glad you liked the article! If you’d like to receive emails when new posts appear, just enter your email address in the “subscribe” form on the upper-right of the page. Thanks for the comment.

Hey thanks for info…done it :)
Hoping to see something informative soon.

You have a ‘locaiton’ in there, if you care to fix it.
I second everyone else and say that this is awesome. I didn’t even know this issue existed and now I understand it well (I think).

I think you should mention explicitly that if you store ‘UNIX’ in little-endian it will end up as NUXI in big-endian. Not strictly necessary, but I think it would Explain it Better. (Even Better.) Or perhaps lainExp it terBet. (enEv terBet.)

Whenever I see the word endian I think first of Ender Wiggin. The enemy is down and so forth.

Thanks Alrenous, glad you’re enjoying it. Appreciate the tip – went through and cleaned up a bunch of typos (it’s a bit embarrassing how many were in there).

Good suggestion on the explanation, I’m always looking for ways to make things clearer (that’s why this isn’t best explained :) ).

I hadn’t made the Ender Wiggin association, but I love Ender’s Game… maybe there’s a way to fit him and Bean into this article somewhere.

Gr8 post. really helpful.

Thanks Breeson, glad it was useful.

Thanks a lot Kalid, your explanation of endianness is awesome. I have one question: in a mixed binary file with 4-byte integers and single-byte characters, is there a way to identify whether a byte we read from the file is a character or part of a 4-byte integer?

It would be very helpful if you can answer me

Hi Kumar, glad you enjoyed it. Offhand, I don’t think there’s a way, looking at the raw data, to tell whether it’s supposed to be a character or integer.

I think you’d need the file format spec to figure out the structure of the data – for example, the TCP header defines the byte ranges for each field.
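To make that concrete, here is a small C sketch that reads a record according to a made-up spec: one character tag followed by a 4-byte big-endian integer. The record layout, struct record and parse_record are all assumptions for illustration, not a real file format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical spec: byte 0 is a character tag,
   bytes 1..4 are a 32-bit integer in big-endian order. */
#define RECORD_SIZE 5

struct record {
    char tag;
    uint32_t count;
};

/* Returns 0 on success, -1 if the buffer is too short for one record. */
static int parse_record(const unsigned char *buf, size_t len,
                        struct record *out) {
    assert(buf != NULL && out != NULL);
    if (len < RECORD_SIZE) {
        return -1;
    }
    out->tag = (char)buf[0];
    /* Assemble the integer byte by byte so the result does not
       depend on the byte order of the machine running this code. */
    out->count = ((uint32_t)buf[1] << 24) |
                 ((uint32_t)buf[2] << 16) |
                 ((uint32_t)buf[3] << 8)  |
                  (uint32_t)buf[4];
    return 0;
}
```

Given the bytes 0x41 0x00 0x00 0x01 0x02, this yields the tag 'A' and the count 258. Nothing in the byte 0x41 itself says it is a character; only the agreed layout tells the parser to treat it as one.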