Translated from: https://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong
I've seen a lot of people trying to read files this way lately.
Code
#include <stdio.h>
#include <stdlib.h>

int main( int argc, char **argv )
{
    char * path = ( argc > 1 ) ? argv[1] : "input.txt";
    FILE * fp = fopen ( path, "r" );

    if ( fp == NULL )
    {
        perror ( path );
        return EXIT_FAILURE;
    }

    while( !feof ( fp ) ) /* --THIS IS WRONG-- */
    {
        /* Read and process data from the file… */
    }

    if( fclose ( fp ) == 0 )
    {
        return EXIT_SUCCESS;
    }
    else
    {
        perror ( path );
        return EXIT_FAILURE;
    }
}
What is wrong with the loop while( !feof ( fp ) )?
Translated from: https://stackoverflow.com/a/26557243/8607301
I'd like to give an abstract, high-level perspective.
Concurrency and simultaneity
I/O operations interact with the environment. The environment is not part of your program and is not under your control. It truly exists concurrently with your program. As with all things concurrent, questions about the "current state" do not make sense: there is no concept of "simultaneity" across concurrent events. Many properties of state simply do not exist in a concurrent setting.
Let's be more precise. Suppose you want to ask, "do you have more data?" You could ask this of a concurrent container or of your I/O system. But in general the answer cannot be acted on, and is therefore meaningless. If the container says "yes", it may no longer have data by the time you try to read. Similarly, if the answer is "no", data may have arrived by the time you try to read. The conclusion is that there simply is no property like "I have data", since you cannot act meaningfully in response to any possible answer. (The situation is slightly better with buffered input, where you might conceivably get a "yes, I have data" that constitutes some kind of guarantee, but you would still have to be able to deal with the opposite case. And with output the situation is certainly just as bad as described: you never know whether the disk or the network buffer is full.)
So we conclude that it is impossible, and in fact unreasonable, to ask an I/O system whether it will be able to perform an I/O operation. The only possible way to interact with it (just as with a concurrent container) is to attempt the operation and check whether it succeeded or failed. Only at the moment you interact with the environment can you know whether the interaction was actually possible, and at that point you must commit to performing it. (This is a "synchronization point", if you like.)
EOF
Now we come to EOF. EOF is the response you get from an attempted I/O operation. It means that you tried to read or write something, but that no data could be read or written, and the end of the input or output was encountered instead. This is true for virtually all I/O APIs, whether the C standard library, C++ iostreams, or other libraries. As long as the I/O operations succeed, you simply cannot know whether further, future operations will succeed. You must always try the operation first and then respond to success or failure.
Examples
In each of the examples, note that we first attempt the I/O operation and then consume the result if it is valid. Note also that we must always use the result of the I/O operation, even though the result takes a different shape and form in each example.
Read from a file using stdio, in C:
The result we should use is n, the number of elements we just read (which can be as small as zero).
C stdio, scanf:
The result we should use is the return value of scanf, the number of items converted.
Formatted extraction with iostreams, in C++:
The result we should use is std::cin itself, since it can be evaluated in a boolean context and tells us whether the stream is still in the good() state.
C++, getline with iostreams:
The result we should use is the same as before, std::cin.
POSIX, using write(2) to flush a buffer:
The result we use here is k, the number of bytes written. The point is that we can only know how many bytes were written after the write operation.
POSIX getline():
The result we should use is nbytes, the number of bytes up to and including the newline character (or up to EOF if the file does not end with a newline).
Note that this function explicitly returns -1 (and not EOF!) when an error occurs or when it reaches EOF.
As you can see, we very rarely spell out the actual word "EOF". We usually detect the error condition in some other way that is of more immediate interest to us (for example, failing to perform as much I/O as we wanted). In every example there is some API feature that could tell us explicitly that the EOF state has been encountered, but this is in fact not very useful information. It is much more of a detail than we often care about. What matters is whether the I/O succeeded, more so than how it failed.
One last example that actually queries the EOF state: suppose you have a string and want to test that it represents an integer in its entirety, with no extra characters at the end except whitespace. Using C++ iostreams, you extract the integer from a string stream, skip any trailing whitespace, and then attempt one more read.
We use two results here. The first is iss, the stream object itself, to verify that the formatted extraction into value succeeded. But then, after also consuming the whitespace, we perform another I/O operation, iss.get(), and expect it to fail with EOF, which is the case if the formatted extraction has already consumed the entire string.
In the C standard library, you can achieve something similar with the strto*l functions by checking that the end pointer has reached the end of the input string.
The answer
while(!eof) is wrong because it tests for something that is irrelevant and fails to test for something that you need to know. The result is that you erroneously execute code that assumes it is accessing data that was read successfully, when in fact this never happened.