Surely you have already encountered a problem like the following:
> (2.3 - 1.8) == 0.5
[1] FALSE
> sqrt(2)^2 == 2
[1] FALSE
The general problem of handling floating-point numbers is explained here: Why can't my programs do arithmetic calculations correctly?
This problem is not particular to R; it affects any language that handles floating-point numbers.
Now, how can we resolve or handle these "inconsistencies" in the language when making comparisons?
First of all, a simple demonstration of the problem. In the first case, 2.3 - 1.8 is not exactly 0.5, and likewise sqrt(2)^2 is not exactly 2. It just so happens that, for clarity, R does not display a large number of decimal places by default.
This problem does not come from R. It is inherent to the computational architecture itself (see: Why can't my programs do arithmetic calculations correctly?): certain numbers cannot be represented exactly in floating point. One might think the problem is limited to numbers like 1/3 or pi, but it is not. Because numbers are stored in binary, even "simpler" decimal numbers cannot be represented exactly, and what is actually handled is always an approximation.
Obviously, only the most significant digits are displayed. We can see how many significant digits will be shown with getOption("digits"), and if we raise that value we clearly notice where the problem lies. How do we resolve this inconsistency?
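The display behaviour described above can be sketched as follows. This is only an illustration: the variable names (x, y, shown_x, differs) are made up for the example, and 17 is simply a value large enough to expose the stored digits of a double.

```r
x <- 2.3 - 1.8
y <- sqrt(2)^2

# By default R prints getOption("digits") significant digits (7 unless changed),
# so both values look exact.
default_digits <- getOption("digits")
print(x)
print(y)

# Asking for more significant digits exposes the stored approximations.
print(x, digits = 17)
print(y, digits = 17)
print(0.1, digits = 17)  # even 0.1 has no exact binary representation

# The same effect, changed globally and then restored.
old <- options(digits = 17)
print(x)
options(old)

# Formatting with 7 significant digits hides the difference,
# but the stored value is still not 0.5.
shown_x <- format(x, digits = 7)
differs <- x != 0.5
```

Changing the number of printed digits only affects the display; the stored values, and therefore the result of ==, stay the same.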
Comparisons (equality)
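One of the main problems, as we have already seen, is comparing two numbers when at least one of them is the result of an operation on floating-point values. The natural way to solve it is to use the all.equal() function combined with isTRUE(). all.equal() compares within a tolerance, and isTRUE() is needed because, when the values differ, all.equal() returns a description of the difference instead of FALSE.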
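A minimal sketch of this comparison, reusing the values from the opening example (the variable names are illustrative only):

```r
a <- 2.3 - 1.8
b <- 0.5

strict_equal <- a == b                    # exact comparison fails
approx_equal <- isTRUE(all.equal(a, b))   # comparison within a tolerance succeeds

# When the values really differ, all.equal() returns a character
# description of the difference rather than FALSE ...
difference_report <- all.equal(1, 1.1)

# ... which is why the result is wrapped in isTRUE().
clearly_different <- isTRUE(all.equal(1, 1.1))

print(strict_equal)
print(approx_equal)
print(difference_report)
print(clearly_different)
```

If the default tolerance is not appropriate, all.equal() also accepts a tolerance argument.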
This handles scalars or, to be consistent with the language, vectors with a single element. For vectors with several elements, however, the previous solution must be "vectorized".
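One possible way to vectorize the comparison is to apply isTRUE(all.equal()) pair by pair with mapply(). The helper name elementwise_equal is hypothetical and is not taken from the original answer; it is only a sketch of the idea.

```r
x <- c(2.3 - 1.8, sqrt(2)^2, 1)
y <- c(0.5, 2, 1.1)

# Exact comparison fails for every pair.
exact <- x == y

# all.equal() on the whole vectors gives a single answer for the pair of
# vectors, not one answer per element.
whole_vector <- isTRUE(all.equal(x, y))

# Hypothetical helper: compare element by element within a tolerance.
# Extra arguments (for example tolerance) are passed on to all.equal().
elementwise_equal <- function(x, y, ...) {
  unname(mapply(function(a, b) isTRUE(all.equal(a, b, ...)), x, y))
}

per_element <- elementwise_equal(x, y)
print(exact)
print(whole_vector)
print(per_element)
```

Here the first two pairs are treated as equal and the third (1 versus 1.1) is not.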
Note: if you use dplyr, there is near(), which performs this comparison element by element.
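As a rough sketch of what such a comparison does: to my understanding, dplyr::near(x, y, tol = .Machine$double.eps^0.5) checks whether the absolute difference is below a tolerance. The function near_sketch below is a hypothetical base-R stand-in written for this example, not dplyr's own code, and the real near() is only called if dplyr happens to be installed.

```r
# Hypothetical stand-in: element-wise comparison with an absolute tolerance.
near_sketch <- function(x, y, tol = .Machine$double.eps^0.5) {
  abs(x - y) < tol
}

x <- c(2.3 - 1.8, sqrt(2)^2, 1)
y <- c(0.5, 2, 1.1)

near_result <- near_sketch(x, y)
print(near_result)

# With dplyr available, the packaged function can be used directly.
if (requireNamespace("dplyr", quietly = TRUE)) {
  print(dplyr::near(x, y))
}
```

Note that this uses an absolute tolerance, whereas all.equal() typically works with a relative difference, so the two can disagree for very large or very small values.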
Other comparisons
For the rest of the comparisons (such as < or >=), we must attack the problem in another way, since a tiny representation error can push a value to the wrong side of the bound it is compared against.
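A minimal sketch of the kind of problem meant here, reusing the values from the opening example as an assumption:

```r
x <- 2.3 - 1.8   # mathematically 0.5, stored as slightly less

at_least_half <- x >= 0.5   # we would expect TRUE
below_half    <- x < 0.5    # we would expect FALSE

print(at_least_half)
print(below_half)
```

Both results come out the opposite of what the arithmetic suggests, because the stored value of x falls just below 0.5.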
Here what we can do is use zapsmall(), which is nothing more than a "wrapper" around round(), to remove the unnecessary digits.
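A short sketch of this approach, again with the values from the opening example. The digits = 7 argument is passed explicitly only so the example does not depend on the current value of getOption("digits").

```r
x <- 2.3 - 1.8   # stored as slightly less than 0.5

raw_cmp <- x >= 0.5                         # fails on the raw value

# zapsmall() rounds away the digits beyond the requested precision,
# so the cleaned value compares as expected.
zapped     <- zapsmall(x, digits = 7)
zapped_cmp <- zapped >= 0.5

# For a single number this matches rounding to the same number of decimals.
rounded_cmp <- round(x, 7) >= 0.5

print(raw_cmp)
print(zapped_cmp)
print(rounded_cmp)
```

One design point to keep in mind: zapsmall() chooses the rounding precision from the largest absolute value in the vector, so in a vector with very different magnitudes the small elements may be rounded to 0.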