C, C++: Understanding Big and Little Endian; Code to Determine at Runtime
Before getting to endianness, let’s review a few fundamentals of computer organization.
The basic addressable unit is a ‘byte,’ and a group of bytes is referred to as a ‘word.’
Important note: In computer organization, load/store operations work on a byte or a word at a time, not on individual bits. The reason is simple: we cannot afford a separate address for every bit in memory. That’s why the designers of a computer architecture specify whether it is byte addressable or word addressable.
For the purpose of this article, let’s assume a word size of 32 bits throughout, which equates to 4 bytes.
Picture a word as four consecutive byte-sized slots, each with its own address.
Endianness is about how computers store data in memory.
Let’s use an example to understand this.
Imagine we have a number, let’s call it ‘a,’ with the value 0x01020304.
We are assuming byte-addressable memory. I will write another article explaining the difference between byte-addressable and word-addressable memory.
int a = 0x01020304;
This number takes up 4 bytes in memory, and if we break it down into individual bytes, from most significant to least significant, we get 0x01, 0x02, 0x03 and 0x04.
Now, let’s think in a general way. What are the possible ways to store this information in a word of the computer?
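1st way: map the bytes directly, in the order we humans read the number, so 0x01 goes to the lowest address and 0x04 to the highest.
2nd way: start from the opposite end, so 0x04 goes to the lowest address and 0x01 to the highest.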
Did you notice that I am dealing with bytes at a time, not with bits? In the beginning, I mentioned a very important point, and it’s because of that.
That’s it. To identify which one is which, we engineers gave two names. The first format is ‘big endian,’ and the second one is ‘little endian.’
To match with the practical world definition, let’s describe them:
- Big Endian: The Most Significant Byte (MSB) is stored at the lowest byte address.
- Little Endian: The Least Significant Byte (LSB) is stored at the lowest byte address.
I hope that is clear. This is a very important topic, especially if you are taking FAANG coding tests. They may even specify the type of endianness they assume in their questions.
Now, let’s look at the code and understand ‘which format your computer is using.’
Do you have any idea how we can test this?
Why don’t we store a number in a word and check its first or last byte? Then we can cross-check which format it matches. Will it help us? Absolutely. Let’s do that.
Let’s determine the endianness used by ideone.com:
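#include <iostream>
using namespace std;

int main() {
    int a = 0x01020304;
    // View the same memory one byte at a time, starting at the lowest address.
    const char *c = (const char *)&a;
    for (int i = 0; i < 4; i++) {
        // Cast to void* so the stream prints the address instead of a C string.
        cout << "addr: " << (const void *)c << " , val: " << (int)*c << endl;
        c = c + 1;
    }
    return 0;
}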
output:
Success #stdin #stdout 0s 5304KB
addr: 0x7ffc2c10f924 , val: 4
addr: 0x7ffc2c10f925 , val: 3
addr: 0x7ffc2c10f926 , val: 2
addr: 0x7ffc2c10f927 , val: 1
Conclusion:
It is using a little-endian architecture because the Least Significant Byte (LSB), 0x04, is stored at the lowest byte address.
If you want to check without printing every byte as above, here is a shorter method I came across on some website during my interview preparation (I don’t remember the link, sorry for that). You can try this one:
#include <iostream>
using namespace std;

int main() {
    int a = 0x1;
    // Look only at the byte stored at the lowest address.
    const char *c = (const char *)&a;
    if ((int)*c) {
        // The least significant byte (0x01) comes first.
        cout << "little endian" << endl;
    } else {
        cout << "big endian" << endl;
    }
    return 0;
}
output:
Success #stdin #stdout 0s 5312KB
little endian