I have the following program:
#include <stdio.h>

typedef struct
{
    int a;
    int b;
} Letter;

int main()
{
    Letter arr[2] = { {1, 101}, {2, 23} };
    Letter* p1 = arr;
    Letter** p2 = &p1;

    printf("%d\n", p2[0]->a); // prints 1
    printf("%d\n", p2[0]->b); // prints 101
    printf("%d\n", p2[1]->a); // the program stops here
    printf("%d\n", p2[1]->b);
    return 0;
}
The problem occurs when this line of code is executed:
printf("%d\n", p2[1]->a);
The program stops working. I ran it under the debugger, which reports a segmentation fault, but I don't understand the reason for the error.
A curious detail is that when these two lines are executed:
printf("%d\n", p2[0]->a);
printf("%d\n", p2[0]->b);
no segmentation fault occurs, and the data is displayed correctly.
Now, the million dollar question: why does a segmentation fault not occur in the first case, but does occur in the second?
p2 is a double pointer, so in order to access the array data we need an additional access operator (*). p2 points to a single Letter * (p1), so p2[1] reads past that object, which is undefined behavior: the compiler is free to do what it wants with the code, which can cause strange behavior at runtime. For example, on seeing expressions such as:
p2[0]->a
p2[1]->a
the compiler could implicitly use this pointer arithmetic (written in bytes):
*(*(p2 + sizeof(Letter) * index) + offset_member)
And this means that the memory address of the data is not calculated properly.
Where:
sizeof(Letter): returns the size in bytes that the structure Letter occupies (in this case, 8 bytes). Strictly, indexing p2 advances in steps of sizeof(Letter *), because p2 points to a Letter *; on a typical 64-bit platform that is also 8 bytes, so the numbers used in this deduction do not change.
index: the position of the structure that we want to access.
offset_member: each member of a structure has an associated offset, which is used to reach the memory address of that member within the structure Letter.
I emphasize that the offset is not calculated by us, but by the compiler. How do we get it?
One option is to use the offsetof macro.
Example:
Screen result:
Now let's start with the deduction.
When the compiler sees this expression:
p2[0]->a
it will convert it to something like:
*(*(p2 + sizeof(Letter) * 0) + offsetof(Letter, a))
Resulting in:
*(*(p2 + 0) + 0)
The code above does three things:
1.- It accesses the content of p2, which is precisely the memory address of p1.
2.- Then the content of the pointer p1 is accessed, which is precisely the base address of the first structure of the array.
3.- Finally, we access the content of that address.
So with this we can understand that the first expression will never give a segmentation fault, because this subexpression always results in 0 (as long as the index is 0, of course):
sizeof(Letter) * index
The same logic applies to this second expression:
p2[0]->b
The compiler converts it to:
*(*(p2 + sizeof(Letter) * 0) + offsetof(Letter, b))
Resulting in:
*(*p2 + 4)
*p2 gives the base address of the first structure of the array, then 4 is added to reach the memory address of the member b, and finally the data is accessed.
With all this explanation we can answer this question:
Why does a segmentation fault not occur in the first case?
The answer depends on how the compiler translates the expression; however, if we continue with our deduction, it is because this subexpression:
sizeof(Letter) * index
always results in 0. So at no time would we be accessing a memory address that does not belong to the program.
However, this third expression:
p2[1]->a
The compiler converts it to:
*(*(p2 + sizeof(Letter) * 1) + offsetof(Letter, a))
Resulting in:
*(*(p2 + 8))
First we access the content of p2, which is basically the memory address of p1, then we add 8 to it, and finally we access the content of that address. There is the problem! This expression possibly gives a segmentation fault:
*(p2 + 8)
However, there is the possibility that this expression:
p2 + 8
calculates the memory address of a variable that is part of the program.
Imagine that, in memory, the address 0x16 belongs to the program and holds the value 0x12, while the address 0x12 does not belong to the program.
So when evaluating this expression:
*(*(p2 + 8))
it gives us as a result:
*(*(0x16))
Since the address 0x16 does belong to the program, it is perfectly valid to access its content, giving as a result *(0x12). But therein lies the problem: next we would be accessing the address 0x12 (which was actually the value stored at the address 0x16), and there a segmentation fault would occur (in our example, yes). This is crazy! Never try this at home!
Answering the second question:
Because this subexpression:
sizeof(Letter) * index
does not result in 0, and this is because its index is different from 0.
What is the correct way to access?
In this way:
In this case we have added the access operator (*) that was missing from the beginning. In this way, we ensure that the compiler uses an appropriate addressing mode (it depends on this to calculate the memory address of a given member of the structure Letter).
Usually the compiler should convert the above code to this:
*(*p2 + sizeof(Letter) * 1 + offsetof(Letter, a))
Resulting in:
*(*p2 + 8)
Now *p2 returns the base address of the first structure of the array (basically what p1 points to), and then we add 8. With this we calculate the base address of the second structure of the array; finally, we access that address and thus obtain the data stored in the member a.
Recommendation:
Do not access the array of structures through a double pointer; it makes the syntax less readable. Instead, use a plain pointer.
Imagine that you have a function called llenarDatos whose only parameter is a double pointer.
Example:
Simple, isn't it? In the function main we can keep a simple pointer that holds the base address of the array of structures, so we can use it anywhere in main.
Discussion:
The deduction was based on the pointer arithmetic that is usually used to access an array of structures through a simple pointer:
*(pointer + sizeof(Letter) * index + offset_member)
Where this expression:
pointer + sizeof(Letter) * index
should give the base address of a given structure in the array, and adding offset_member gives the memory address of a given member of that structure.
Taking the above into account, we can deduce that the pointer arithmetic needed to access an array of structures with a double pointer would be:
*(*pointer + sizeof(Letter) * index + offset_member)
It is almost the same arithmetic that we saw before; the difference is that we must make one more memory access (because pointer is a double pointer).
Taking into account the two previous arithmetics, we can arrive at this:
*(*(pointer + sizeof(Letter) * index) + offset_member)
Why? Because with this arithmetic we can verify that when we execute p2[0]->b, it will not give a segmentation fault, because the addressing mode will always be equal to:
*(*pointer + offset_member)
On the other hand, when the position (or index) is different from 0, the arithmetic is preserved as such; for that reason, an address is calculated that may not belong to the program.
Conclusion:
This whole problem is related to the way the compiler translates the statements. We don't need to do this arithmetic by hand, since the compiler does all of it implicitly; however, working through it helps us resolve doubts.