Could it be that a header is the same as an include? Or are they different? It's just that a friend told me they were different but he never tried to tell me the difference.
What I understand is that the include is the directive that includes the external file and the header is basically as if it were the library where the code to include is. I don't know if what I said is right or what.
Before the compiler has a chance to see the code, the preprocessor reads it and potentially transforms it into something else, following whatever preprocessor directives are included in the source.
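As a small illustration of that transformation, the sketch below uses two macros and a conditional block. All the names in it (MAX_ITEMS, SQUARE, VERBOSE, LOG_LEVEL, square_plus_max, log_level) are made up for this example and do not come from any real header.

```c
/* Hypothetical names, chosen only for this illustration. */
#define MAX_ITEMS 8              /* object-like macro */
#define SQUARE(x) ((x) * (x))    /* function-like macro */

/* Conditional compilation: only one of these two definitions is kept,
   depending on whether VERBOSE was defined before this point. */
#ifdef VERBOSE
#define LOG_LEVEL 2
#else
#define LOG_LEVEL 0
#endif

/* After preprocessing, the compiler sees: return ((n) * (n)) + 8; */
int square_plus_max(int n)
{
    return SQUARE(n) + MAX_ITEMS;
}

/* After preprocessing (VERBOSE not defined): return 0; */
int log_level(void)
{
    return LOG_LEVEL;
}
```

With gcc, the -E option stops after the preprocessing phase and prints the transformed text, which is a convenient way to see exactly what the compiler would receive.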
A #include is one of those directives to the preprocessor, telling it to include the contents of another file at that point. What the compiler receives is as if you yourself had opened the file named in the #include and literally copied and pasted it into the main program.
Therefore the content of a .h file must be valid C code. In fact it can be any code (it could include, for example, the source code that implements some functions), but that is unusual. Usually it contains only declarations (data types or function prototypes), while the implementation itself is not in the .h but in other .c files that are compiled separately and added to the main one, or stored in libraries.
Thanks to the prototypes brought in from the .h, the compiler can check that when you call a function you pass it the right number of arguments, of the appropriate types. The compiler does not need the code of the functions, though; their prototypes are enough.
To assemble the final executable, once the compiler has finished, the linker takes over. It combines the machine code generated by the compiler with that of other compiled files and of any external libraries needed, and with all of this it resolves the calls to the functions that have been used. If any function declared in a .h (and then called from the program) cannot be found at that point, there will be an error at link time.
.h files are called "header files", or headers. Many people also call them "libraries". This is wrong, and not for pedantic reasons about how the word "library" ought to be translated, but because a .h contains source code (and is used by the compiler), while a library contains machine code and is used by the linker.
Update
This long update is to answer some additional questions raised by the user in the comments about link-time errors.
The following example helps illustrate the difference between the two phases of creating an executable:
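A minimal sketch of what such an area.c might look like, assuming a program that reads the area of a circle and prints its radius. The exact code is an assumption, written only to match the description: it includes math.h, uses M_PI, calls scanf(), printf() and sqrt() from main, and stores the result in a variable named radio.

```c
/* area.c: hypothetical sketch, not the original listing */
#include <stdio.h>
#include <math.h>   /* prototype of sqrt(); M_PI on gcc/Linux in its default mode */

int main(void)
{
    double area, radio;

    printf("Area of the circle: ");
    if (scanf("%lf", &area) != 1 || area < 0.0) {
        printf("Invalid input\n");
        return 1;
    }

    /* area = pi * radio^2, so radio = sqrt(area / pi) */
    radio = sqrt(area / M_PI);
    printf("Radius: %f\n", radio);
    return 0;
}
```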
When math.h is included, the preprocessor inserts the entire contents of that file at that point. Among other things, that file defines the constant M_PI with the value 3.1415926... and provides the prototype of the function sqrt(). In reality it is somewhat more complex, because the prototype of that function is defined through a macro, but for our purposes we can imagine that the file contains a line like double sqrt(double x);
Thanks to this declaration, the compiler can verify, when we call sqrt() in our program, that the type we pass as a parameter is correct and that what the function returns can be assigned to the variable radio or, if radio were a float, which automatic conversions should be applied.
However, if you try to compile this example, you will get an error (at least with gcc on Linux) reporting an undefined reference to sqrt. This is a link-time error (the ld that the error message refers to is the name of the linker).
The error does not occur if you tell gcc to "compile only" (i.e. do the first phase but not the second), which is what the -c option does, as in gcc -c area.c. In this case there are no errors, and the result of the compilation is area.o. But this is not a complete executable. It only has the machine code for the function main, but not for other functions that are called from main, such as scanf(), printf() or sqrt().
For this reason the second phase, the linking phase, is needed. We can run it manually (for example with gcc area.o -o area), but again we get the same undefined-reference error.
Notice that since the linker takes area.o as input instead of area.c, it is no longer looking at the source code, but at the machine code that the compiler generated. In that machine code, the linker's mission is to "fill in the gaps". The compiler left marks like "here goes a call to a function printf, here goes another to a function sqrt". The linker must find the machine code of those functions, add it to the executable, and fill those gaps with the addresses needed by each CALL.
Where does the linker get the machine code for those other functions? Basically from three places:
- Other .o files that you have specified in the command when invoking the linker (we have not given any in this case)
- Libraries that you have specified with the -l option (we have not given any in this case)
- The C standard library, which is searched by default
It turns out that the C standard library has the machine code for the functions printf and scanf, so these are fine. But it does not have the code for sqrt, because that is in the math library, which is not searched by default. That is why the linker does not find it and gives an error.
Note that the linker error only shows the name of the function it cannot find and the name of the function from which it was called. It cannot tell you which specific line of area.c made that call, because the linker does not read the source (in fact you could already have deleted it after the first compilation phase).
To avoid this error you have to tell the linker to also look in the math library, using the -lm option, for example gcc area.o -o area -lm. Now the file area is the final executable.
We could also have run the two phases with a single command such as gcc area.c -o area -lm, instead of doing them separately.
Note that the only difference is that I put area.c (so gcc will compile first and then link) instead of area.o (in which case the compile phase was skipped because it had already been done).
There's still more
One last detail, somewhat surprising. We said that the compiler only needs the prototype of the function, not its code. In fact, it does not even need the prototype!
You can try the following experiment. In area.c, change the call to sqrt() into a call to raiz_cuadrada(), a function that has neither been declared nor exists, and try to "compile only", without linking. You get a warning, but not an error (at least with compilers that still accept implicit declarations; C99 removed them from the language, and recent versions of gcc treat this as an error by default). The compiler will generate a .o, leaving a gap for the linker that says "and here should go the call to the function raiz_cuadrada". Naturally the linker will not find it and will report an error at link time, as before.
But how is it possible that the compiler gives no error when we call a function that does not exist? Because when the compiler encounters a call to an undeclared function, it makes up the declaration itself. This is called an implicit declaration, and it is what the warning is about. The declaration the compiler makes up is based on the call we made. Since it sees that we pass a parameter of type double, it assumes that the declaration would be raiz_cuadrada(double). For the return value, however, it always assumes int, which would be wrong in this case. That is why it is important to include the right .h files, to prevent the compiler from "inventing" incorrect declarations.
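To make the contrast concrete, here is a minimal sketch in which raiz_cuadrada() does have an explicit prototype, so the compiler knows that it returns a double and has nothing to invent. Everything in it is hypothetical: the header name raiz.h mentioned in the comment, the helper radio_desde_area(), and the Newton-style implementation, which is used here only so that the example does not depend on the math library.

```c
#include <assert.h>

/* Prototype: in a real project this line would live in a header
   (a hypothetical raiz.h) brought in with #include. */
double raiz_cuadrada(double x);

/* The caller compiles correctly with only the prototype in view:
   the compiler knows the argument and the result are both double. */
double radio_desde_area(double area)
{
    const double pi = 3.14159265358979323846;
    return raiz_cuadrada(area / pi);
}

/* Definition: it could equally sit in a separate .c file or a library,
   and the linker would connect the call above to it. */
double raiz_cuadrada(double x)
{
    assert(x >= 0.0);
    if (x == 0.0)
        return 0.0;

    /* Newton's iteration, started at or above the true root so that
       the estimates decrease until they stop changing. */
    double estimate = x > 1.0 ? x : 1.0;
    for (;;) {
        double next = 0.5 * (estimate + x / estimate);
        if (next >= estimate)
            break;
        estimate = next;
    }
    return estimate;
}
```

If the prototype line were removed, a compiler that still accepts implicit declarations would treat the call inside radio_desde_area() as returning int, and would then reject or warn about the later definition because it conflicts with that assumption.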