Efficient Unpacking of Array Data in C: Stream Processing Techniques
This post is about arrays and structures in the C language.
The question goes like this:
A stream of data arrives from some source; say your friend sent you a text. You know that certain groups of contiguous bits each encode a specific piece of meaningful information. The task is to decode the stream and then perform some operations on it.
This post focuses on the decoding step. In simple terms, you have an array of data, say 2 bytes. The first 3 bits hold one piece of information, say ‘a’, the next 4 bits hold ‘b’, the next 6 bits hold ‘c’, and the last 3 bits hold ‘d’.
For example, assume big-endian ordering, i.e. the first bit of the stream is the most significant bit of the first byte:
unsigned char arr[] = {0xe1, 0x7b};
Decoding this 16-bit stream (binary 111 0000 101111 011 when grouped as 3, 4, 6 and 3 bits) gives:
‘a’ as ‘0x7’
‘b’ as ‘0x0’
‘c’ as ‘0x2f’
‘d’ as ‘0x3’
Of all the possible approaches, the interviewer expects this one: decoding without using any bit-shift operators.
This approach relies on two fundamentals:
1. Array memory is allocated contiguously.
2. With structure padding disabled, a structure occupies exactly the memory its members need.
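Example code: I use 4-bit fields here so the output is easy to follow. The listing prints with C++ streams, but the bit-field structure and the pointer cast are the same constructs you would write in C.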
#include <iostream>
using namespace std;
// disable structure padding
#pragma pack(1)
// create a structure for decoding the stream
// fields are allocated from top to bottom, starting at the LSB of each byte
// (typical on little-endian targets; the order is implementation-defined)
struct info {
    unsigned char a : 4;
    unsigned char b : 4;
    unsigned char c : 4;
    unsigned char d : 4;
};
// reset structure padding
#pragma pack()
int main() {
    // NOTE: this was run on ideone.com, which is a little-endian platform.
    unsigned char arr[] = {0xba, 0xdc};
    // point a struct pointer at the start of the contiguous array
    // (the explicit cast is required)
    struct info *temp = (struct info *)arr;
    // print each field in hex
    cout << "a: " << hex << int(temp->a) << endl;
    cout << "b: " << hex << int(temp->b) << endl;
    cout << "c: " << hex << int(temp->c) << endl;
    cout << "d: " << hex << int(temp->d) << endl;
    return 0;
}
Code output (on a little-endian platform):
a: a
b: b
c: c
d: d