I have the following code:
#include <iostream>

struct datos
{
    char c;
    int i;
    long l;
};

// Prints the expression as text, followed by its value.
#define SO(x) #x << " = " << (x)

int main()
{
    datos d;
    std::cout
        << SO(sizeof(d)) << '\n'
        << SO(sizeof(d.c)) << '\n'
        << SO(sizeof(d.i)) << '\n'
        << SO(sizeof(d.l)) << '\n'
        << SO(sizeof(char) + sizeof(int) + sizeof(long)) << '\n';
    return 0;
}
Which produces the following output:
sizeof(d) = 16
sizeof(d.c) = 1
sizeof(d.i) = 4
sizeof(d.l) = 8
sizeof(char) + sizeof(int) + sizeof(long) = 13
I finished primary school a long time ago, but the last time I checked, one plus four plus eight was thirteen; yet asking for the size of datos returns sixteen, which is not the sum of one, four and eight.
What's going on?
This is due to a decision the compiler makes when it generates the type: it pads the type so that its size is a multiple of the processor word size.
Processor word.
The processor word [1] roughly indicates how many bits a processor can handle in a single operation. As an analogy, imagine a polygraph: the more needles it has, the more lines it can draw at the same time.
The processor word also tells us how much data can travel across the bus in each cycle; e.g., a 16-bit processor needs two bus cycles to send a 32-bit integer.
Memory alignment.
Since int is 4 bytes on your system, let us assume for this explanation that your processor word is 32 bits. What is the size in bits of the structure datos?
The size should be 13 bytes (104 bits). With a 32-bit processor word, the structure datos occupies 3.25 processor words; this means that reading an instance of the structure takes 4 reads, and in the last one the final 3 bytes read are ignored.
If right after this 104-bit instance of datos we stored another one, its memory address would be misaligned by 1 byte with respect to the multiples of the processor word: it would still take 4 reads, but after reading, the data would have to be realigned by discarding the first byte and shifting the rest one byte to the left [2].
To avoid this extra work, objects are laid out so that they occupy multiples of the processor word, so the structure datos actually looks like this in memory: datos::c (1 byte), 3 bytes of padding, datos::i (4 bytes) and datos::l (8 bytes).
In total, the structure ends up occupying 16 bytes: the original 13 plus 3 bytes of padding that make its size a multiple of the processor word (4 bytes).
How do I know that the padding is between datos::c and datos::i? Because the data inside the structure datos is also aligned; without that alignment, datos::i would straddle two processor words and would need two reads (and a realignment) to be read. We can also check the address of each member: doing so shows that consecutive members of the structure datos are 4 bytes apart.
Align to my liking.
If for some reason you want to prevent the compiler from deciding the size of your structures (you may be serializing them, for example), you can use the attribute __attribute__((packed)) in GCC and Clang, or #pragma pack(1) in MSVC.
With these changes, the size of the structure datos becomes 13 and the distance between datos::c and datos::i is 1 byte (the size of datos::c).
[1] There is nothing biblical about it.
[2] This is only a conceptual description; the processor may read memory in a different way.