2

After googling for a while, I'm posting here for help.

I have two float64 variables returned from a function. Both of them are apparently 1:

>>> x, y = somefunc()
>>> print x, y
1.0 1.0
>>> if x < 1: print "x < 1"
>>> if y < 1: print "y < 1"
y < 1

The behavior changes when the variables are defined as float32, in which case the 'y < 1' line is not printed.

I tried setting

np.set_printoptions(precision=10)

expecting to see the difference between the variables, but even so, both of them appear as 1.0 when printed.

I am a bit confused at this point. Is there a way to visualize the difference of these float64 numbers? Can "if/then" be used reliably to check float64 numbers?

Thanks Trevarez

2
  • 3
    I don't understand your question. Obviously the error is in the printed representation: y is less than one in the float64 case, and is equal to (or greater than) 1 when using float32, due to rounding errors. When dealing with floating point values you should never use equality comparisons. Fix a minimum error (for example epsilon=1e-16, or smaller/bigger depending on the application) and do if abs(number - 1) < epsilon: # number is sufficiently close to 1 to be considered as 1. Commented Aug 19, 2013 at 10:01
  • @Bakuriu you can post this as an answer... you pretty much explained what is going on Commented Aug 19, 2013 at 10:05

3 Answers

4

The printed values are rounded, not exact. In your case y is smaller than 1 when using float64 and greater than or equal to 1 when using float32. This is expected, since rounding errors depend on the size of the float.
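As a rough illustration of why the two types disagree, here is a minimal sketch. The value of y64 is only a hypothetical stand-in for the asker's y (the real value returned by somefunc() is unknown); it is the largest float64 below 1.0.

```python
import numpy as np

# Hypothetical stand-in for y: the largest float64 strictly below 1.0
y64 = np.nextafter(np.float64(1.0), np.float64(0.0))

# Converting to float32 rounds to the nearest representable float32
y32 = np.float32(y64)

print(y64 < 1)   # the float64 value is still below 1
print(y32 < 1)   # the float32 value has been rounded up to exactly 1.0

# Machine epsilon (spacing just above 1.0) for each type
print(np.finfo(np.float64).eps, np.finfo(np.float32).eps)
```

float32 has far fewer mantissa bits, so a value this close to 1 has no float32 representation of its own and lands on 1.0.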

To avoid this kind of problem when working with floating point numbers, decide on a maximum acceptable error, usually called epsilon, and instead of comparing for equality, check whether the result is within epsilon of the target value:

In [13]: epsilon = 1e-11

In [14]: number = np.float64(1) - 1e-16

In [15]: target = 1

In [16]: abs(number - target) < epsilon   # instead of number == target
Out[16]: True

In particular, numpy already provides np.allclose, which compares arrays for equality within a given tolerance. It also works when the arguments aren't arrays (e.g. np.allclose(1 - 1e-16, 1) -> True).
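For an element-wise result instead of a single True/False, np.isclose takes the same rtol and atol parameters. The sketch below uses made-up sample values, and the tighter tolerances are chosen only for illustration:

```python
import numpy as np

# Made-up sample values around 1.0
values = np.array([1.0 - 1e-16, 1.0 + 1e-9, 1.001])

# The test is |a - b| <= atol + rtol * |b|, with defaults rtol=1e-05, atol=1e-08
close_default = np.isclose(values, 1.0)

# A stricter, purely absolute tolerance (illustrative choice)
close_tight = np.isclose(values, 1.0, rtol=0.0, atol=1e-12)

print(close_default)
print(close_tight)
print(np.allclose(values, 1.0))  # True only if every element is close
```

The default tolerances are fairly loose, so it is worth passing rtol and atol explicitly when the scale of the data is known.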

Note, however, that numpy.set_printoptions doesn't affect how np.float32/64 scalars are printed. It only affects how arrays are printed:

In [1]: import numpy as np

In [2]: float(1) - 1e-16  # np.float was an alias for the builtin float
Out[2]: 0.9999999999999999

In [3]: np.array([1 - 1e-16])
Out[3]: array([ 1.])

In [4]: np.set_printoptions(precision=16)

In [5]: np.array([1 - 1e-16])
Out[5]: array([ 0.9999999999999999])

In [6]: float(1) - 1e-16  # np.float was an alias for the builtin float
Out[6]: 0.9999999999999999

Also note that doing print y or evaluating y in the interactive interpreter gives different results:

In [1]: import numpy as np

In [2]: float(1) - 1e-16  # np.float was an alias for the builtin float
Out[2]: 0.9999999999999999

In [3]: print(np.float64(1) - 1e-16)
1.0

The difference is that print calls str while evaluating calls repr:

In [9]: str(np.float64(1) - 1e-16)
Out[9]: '1.0'

In [10]: repr(np.float64(1) - 1e-16)
Out[10]: '0.99999999999999989'
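To see the difference regardless of how str and repr happen to round, you can ask for the digits explicitly. This is a small sketch using a plain Python float as a stand-in for y; np.float64 subclasses Python's float, so the same calls are expected to work on it.

```python
# Stand-in for the asker's y (the real value is unknown)
y = 1.0 - 1e-16

# 17 significant digits are enough to round-trip any float64
digits = format(y, ".17g")

# Exact binary value written in hexadecimal
bits = y.hex()

# The gap to 1.0 is often the most readable view
gap = 1.0 - y

print(digits)
print(bits)
print(gap)
```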

5 Comments

Thank you Bakuriu, it was quite a comprehensive answer that clarified my doubts.
I was just wondering what the point is of NumPy providing functions such as "numpy.where" without a warning about these subtleties of floats. From the documentation, it seems they can be used directly with floats without worrying about precision issues: >>> x = np.arange(9.).reshape(3, 3) >>> x[np.where( x > 3.0 )] array([ 4., 5., 6., 7., 8.])
@Trevarez It's not a problem with numpy. It's a problem with any floating point computation. This is something you ought to know (see for example What Every Computer Scientist Should Know About Floating-Point Arithmetic). If you want to check where the values are bigger than 3.0 up to a certain precision, you can simply subtract the epsilon first (i.e. x[np.where((x - epsilon) > 3.0)]).
Great article. I think I finally understood the scope of the "problem". Thank you again for your explanations.
Note: you should use np.allclose(value, other_value). Do not define your own arbitrary epsilon value.
1
In [26]: x  = numpy.float64("1.000000000000001")

In [27]: print x, repr(x)
1.0 1.0000000000000011

In other words, you are seeing a loss of precision in the print statement. The value is very slightly different from 1.

1 Comment

See, print outputs 12 significant digits, while float64 has about 15 to 17.
0

Following the advice given here, I would summarize the answers this way:

To compare floats, the programmer has to define a minimum distance (eps) beyond which two values are considered different (eps=1e-12, for example). The conditions should then be written like this:

Instead of (x>a), use (x-a)>eps
Instead of (x<a), use (a-x)>eps
Instead of (x==a), use abs(x-a)<eps

This doesn't apply to comparisons between integers, since the difference between two distinct integers is always at least 1.
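The three rules above could be wrapped in small helpers along these lines. The function names and the EPS value are hypothetical, picked only for this sketch; math.isclose from the standard library (Python 3.5+) covers the equality case.

```python
import math

EPS = 1e-12  # illustrative tolerance; pick one that suits your data


def approx_eq(x, a, eps=EPS):
    """x == a, allowing a difference smaller than eps."""
    return abs(x - a) < eps


def definitely_gt(x, a, eps=EPS):
    """x > a by more than eps."""
    return (x - a) > eps


def definitely_lt(x, a, eps=EPS):
    """x < a by more than eps."""
    return (a - x) > eps


y = 1.0 - 1e-16  # stand-in for the value from the question

print(y < 1.0)                # the plain comparison is True
print(definitely_lt(y, 1.0))  # False: the gap is far below eps
print(approx_eq(y, 1.0))      # True
print(math.isclose(y, 1.0, rel_tol=0.0, abs_tol=EPS))  # stdlib equivalent
```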

Hope it helps others as it helped me.
