2002-04-17 14:05:38

by Andrey Ulanov

[permalink] [raw]
Subject: FPU, i386

Look at this:

$ cat test.c
#include <stdio.h>

int main(void)
{
        double h = 0.2;

        if (1/h == 5.0)
                printf("1/h == 5.0\n");

        if (1/h < 5.0)
                printf("1/h < 5.0\n");
        return 0;
}
$ gcc test.c
$ ./a.out
1/h < 5.0
$

I also ran the same a.out under FreeBSD. It says "1/h == 5.0".
It seems there is a difference somewhere in the FPU
initialization code, and I think it should be fixed.

ps. cc to me
--
with best regards, Andrey Ulanov.
[email protected]


2002-04-17 14:20:55

by Mike Black

[permalink] [raw]
Subject: Re: FPU, i386

When you use -O2 it works: gcc folds the division and both comparisons at
compile time, as the assembly below shows (the extra "%llx %llx\n" printf
dumps the raw bit patterns of the two values).

Assembly with -O2:
.file "t1.c"
.version "01.01"
gcc2_compiled.:
.section .rodata
.LC2:
.string "1/h == 5.0\n"
.LC3:
.string "1/h < 5.0\n"
.LC4:
.string "%llx %llx\n"
.align 8
.LC1:
.long 0x0,0x40140000
.text
.align 4
.globl main
.type main,@function
main:
pushl %ebp
movl %esp,%ebp
subl $8,%esp
addl $-12,%esp
pushl $.LC2
call printf
addl $16,%esp
addl $-12,%esp
pushl .LC1+4
pushl .LC1
pushl .LC1+4
pushl .LC1
pushl $.LC4
call printf
xorl %eax,%eax
movl %ebp,%esp
popl %ebp
ret
.Lfe1:
.size main,.Lfe1-main
.ident "GCC: (GNU) 2.95.3 20010315 (release)"


Assembly with a plain compile:
.file "t1.c"
.version "01.01"
gcc2_compiled.:
.section .rodata
.LC2:
.string "1/h == 5.0\n"
.LC3:
.string "1/h < 5.0\n"
.LC4:
.string "%llx %llx\n"
.align 8
.LC0:
.long 0x9999999a,0x3fc99999
.align 8
.LC1:
.long 0x0,0x40140000
.text
.align 4
.globl main
.type main,@function
main:
pushl %ebp
movl %esp,%ebp
subl $24,%esp
fldl .LC0
fstpl -8(%ebp)
fld1
fdivl -8(%ebp)
fldl .LC1
fucompp
fnstsw %ax
andb $68,%ah
xorb $64,%ah
jne .L3
addl $-12,%esp
pushl $.LC2
call printf
addl $16,%esp
.L3:
fld1
fdivl -8(%ebp)
fldl .LC1
fcompp
fnstsw %ax
andb $69,%ah
jne .L4
addl $-12,%esp
pushl $.LC3
call printf
addl $16,%esp
.L4:
addl $-12,%esp
fldl .LC1
subl $8,%esp
fstpl (%esp)
fld1
fdivl -8(%ebp)
subl $8,%esp
fstpl (%esp)
pushl $.LC4
call printf
addl $32,%esp
xorl %eax,%eax
jmp .L2
.p2align 4,,7
.L2:
movl %ebp,%esp
popl %ebp
ret
.Lfe1:
.size main,.Lfe1-main
.ident "GCC: (GNU) 2.95.3 20010315 (release)"


________________________________________
Michael D. Black Principal Engineer
[email protected] 321-676-2923,x203
http://www.csihq.com Computer Science Innovations
http://www.csihq.com/~mike My home page
FAX 321-676-2355

2002-04-17 14:24:09

by Richard B. Johnson

[permalink] [raw]
Subject: Re: FPU, i386

/*
 * Note: FPU control exists only per process. Therefore, you have
 * to set up the FPU before you use it in any program.
 */
#include <i386/fpu_control.h>   /* <fpu_control.h> on glibc systems */

/* Unmask all six FPU exceptions: invalid operand, denormal, zero
 * divide, overflow, underflow, and precision (inexact result). */
#define FPU_MASK (_FPU_MASK_IM |\
                  _FPU_MASK_DM |\
                  _FPU_MASK_ZM |\
                  _FPU_MASK_OM |\
                  _FPU_MASK_UM |\
                  _FPU_MASK_PM)

void fpu(void)
{
        /* newer glibc spells this _FPU_SETCW(cw) */
        __setfpucw(_FPU_DEFAULT & ~FPU_MASK);
}

int main(void)
{
        double zero = 0.0;
        double one = 1.0;
        fpu();

        one /= zero;    /* with ZM unmasked, this traps */
        return 0;
}
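
(For what it's worth: with _FPU_MASK_ZM unmasked, the divide by zero
should die with SIGFPE instead of quietly producing +Inf; exactly how
the signal is delivered is up to the libc and the kernel.)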

2002-04-17 14:29:08

by Nikita Danilov

[permalink] [raw]
Subject: Re: FPU, i386

Andrey Ulanov writes:
> Look at this:
>
> $ cat test.c
> [...]
> $ gcc test.c

$ gcc -O test.c
$ ./a.out
1/h == 5.0

without -O, gcc initializes h to 0.2000000000000000111
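
That value is just the nearest double to 0.2, which any C program can
show (a throwaway example of mine, not from the mails):

#include <stdio.h>

int main(void)
{
        /* 0.2 rounded to the nearest IEEE double */
        printf("%.19f\n", 0.2);  /* prints 0.2000000000000000111 */
        return 0;
}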


2002-04-17 14:41:15

by Jesse Pollard

[permalink] [raw]
Subject: Re: FPU, i386

> Andrey Ulanov writes:
> > $ cat test.c
> > [...]
> > $ gcc test.c
>
> $ gcc -O test.c
> $ ./a.out
> 1/h == 5.0
>
> without -O, gcc initializes h to 0.2000000000000000111
>
> > $ ./a.out
> > 1/h < 5.0
> > $
> >
> > I also ran same a.out under FreeBSD. It says "1/h == 5.0".
> > It seems there is difference somewhere in FPU
> > initialization code. And I think it should be fixed.

Nope. -O2 implies constant folding, and h is a constant. What you are
comparing is run-time vs. compile-time values. 5.0 is a compile-time
constant, and with -O2, 1/h (h being constant) is folded at compile
time, so it comes out as exactly 5.0 as well.

Been there, done that... My solution (based on the problem I was working
on) was to multiply both sides by 10^<number of significant digits of
the problem set>. Taking the simplistic approach:

if ((int)(1/h * 100) == (int)(5.0 * 100))

will give a "proper" result to within two decimal places. This is still
limited, since there are irrational numbers within that range that COULD
still come out with a wrong answer, but it is much less likely to occur.

Exact match of floating point is not possible - the 1 in 1/h is promoted to floating point.

If your 1/h was actually num/h, with num computed by summing .01 100 times,
I suspect the result would also be "wrong".

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: [email protected]

Any opinions expressed are solely my own.

2002-04-17 14:49:46

by John Alvord

[permalink] [raw]
Subject: Re: FPU, i386

On Wed, 17 Apr 2002, Jesse Pollard wrote:

> Been there, done that... My solution (based on the problem I was working
> on) was to multiply both sides by 10^<number of significant digits of
> the problem set>.
> [...]

In most portable floating-point work, you never use ==. Instead, the
difference is compared to a small epsilon value, and division is avoided
by scaling.
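
A sketch of that idiom (my own names and tolerance, nothing from this
thread; fmax is C99 <math.h>):

#include <math.h>

/* Equal to within a relative tolerance: scale the epsilon by the
 * magnitudes involved instead of dividing by either value. */
static int nearly_equal(double a, double b, double eps)
{
        double scale = fmax(fabs(a), fabs(b));
        return fabs(a - b) <= eps * scale;
}

For example, nearly_equal(1/h, 5.0, 1e-12) holds where a plain
1/h == 5.0 fails.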

john alvord

2002-04-17 16:20:17

by Gunther Mayer

[permalink] [raw]
Subject: Re: FPU, i386

Andrey Ulanov wrote:

> Look at this:
>
> $ cat test.c
> [...]
> $ ./a.out
> 1/h < 5.0

Google for "What Every Computer Scientist Should Know About
Floating-Point Arithmetic".


2002-04-18 08:31:10

by Jakob Oestergaard

[permalink] [raw]
Subject: Re: FPU, i386

On Wed, Apr 17, 2002 at 09:40:48AM -0500, Jesse Pollard wrote:
...
> Been there, done that... My solution (based on the problem I was working
> on) was to multiply both sides by 10^<number of significant digits of
> the problem set>. Taking the simplistic approach:
>
> if ((int)(1/h * 100) == (int)(5.0 * 100))

The common solution I've seen and used is
if (fabs(a-b) < e)

Set e according to your needs: 1E-4 is good enough for many practical
purposes, 1E-12 is better by my personal gut feeling for many problems,
and 1E-16 and beyond makes no sense whatsoever (for double precision).
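
Applied to the test case upthread (a sketch; the tolerance is my own
pick):

#include <math.h>
#include <stdio.h>

int main(void)
{
        double h = 0.2;

        if (fabs(1/h - 5.0) < 1e-12)    /* tolerant comparison */
                printf("1/h == 5.0 to within 1e-12\n");
        return 0;
}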

--
................................................................
: [email protected] : And I see the elder races, :
:.........................: putrid forms of man :
: Jakob Østergaard : See him rise and claim the earth, :
: OZ9ABN : his downfall is at hand. :
:.........................:............{Konkhra}...............:

2002-04-25 13:06:57

by rpm

[permalink] [raw]
Subject: Re: FPU, i386

On Wednesday 17 April 2002 08:10 pm, Jesse Pollard wrote:
> Exact match of floating point is not possible - the 1 in 1/h is
> promoted to floating point.
> [...]
>

Why is an exact match of floating point not possible?
What I understand is that if you do an "x/y" division in hardware
(where x and y are two integers), you should always get the same value,
so why can't we compare with the floating-point "==" operation?
I understand that in the case of an irrational number it will not give an
exact value... but a division like 1/.2 is not irrational! It should
always come to 5!

rpm

2002-04-25 13:24:03

by Andreas Schwab

[permalink] [raw]
Subject: Re: FPU, i386

rpm <[email protected]> writes:

|> I understand that in the case of an irrational number it will not give an
|> exact value... but a division like 1/.2 is not irrational! It should
|> always come to 5!

It is not about irrational numbers (of course, the quotient of two
rational numbers is always a rational number), but about finite precision.
You simply cannot represent 0.2 finitely in base 2.
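
You can see the rounded value directly (an illustration of mine; %a is
the C99 hex-float format):

#include <stdio.h>

int main(void)
{
        /* In binary, 0.2 = 0.001100110011... repeating forever; a
         * double has to cut it off (and round up) after 53 bits. */
        printf("%a\n", 0.2);    /* prints 0x1.999999999999ap-3 */
        return 0;
}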

Andreas.

--
Andreas Schwab, SuSE Labs, [email protected]
SuSE GmbH, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

2002-04-25 14:22:16

by Richard B. Johnson

[permalink] [raw]
Subject: Re: FPU, i386

On Thu, 25 Apr 2002, rpm wrote:

> On Wednesday 17 April 2002 08:10 pm, Jesse Pollard wrote:
> > [...]
>
> Why is an exact match of floating point not possible?

Because many (read: most) numbers are not exactly representable
in floating point. The purpose of floating point is to represent
real numbers with a large dynamic range. The trade-off is that
few such internal representations are exact.

As a simple example, 0.33333333333..... can't be represented exactly
even with paper and pencil. However, as the ratio of two integers
it can be represented exactly, i.e., 1/3. Both 1 and 3 must
be integers to represent this ratio exactly.

All real numbers (except transcendentals) can be represented exactly
as the ratio of two integers, but floating point uses only one
value, not two integers, to represent the value. So an exact
representation of a real number, when using a single variable
in a general-purpose way, is, for all practical purposes, not
possible. Instead, we get very close.

When it comes to '==', close is not equal. There are macros in
<math.h> that can be used for most floating-point logic. You
should check them out. If we wanted to check for '==' we really
need to do something like this:

double a, b;
some_loop() {
        if (fabs(a - b) < 1.0e-38)
                break;
}

where we take the absolute value of the difference between the two
FP variables and compare it against some very small number.

To use the math macros, the comparison should be something like:

        if (isless(fabs(a - b), 1.0e-38))
                break;


Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

Windows-2000/Professional isn't.

2002-04-25 14:39:12

by Nicholas Berry

[permalink] [raw]
Subject: Re: FPU, i386

Nitpick:

> All real numbers (except transcendentals) can be represented exactly
> as the ratio of two integers

should be 'except irrationals', of which transcendentals are a subset :)

Nik


2002-04-25 14:52:18

by Richard B. Johnson

[permalink] [raw]
Subject: Re: FPU, i386

On Thu, 25 Apr 2002, Nicholas Berry wrote:

> Nitpick:
>
> > All real numbers (except transcendentals) can be represented exactly
> > as the ratio of two integers
> >
> should be 'except irrationals', of which transcendentals are a subset :)
>
> Nik

Yep. That's the definition of an irrational number.
Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

Windows-2000/Professional isn't.

2002-04-25 15:30:07

by Mark Mielke

[permalink] [raw]
Subject: Re: FPU, i386

On Thu, Apr 25, 2002 at 10:22:49AM -0400, Richard B. Johnson wrote:
> To use the math macros, the comparison should be something like:
> if (isless(fabs(a-b), 1.0e-38))
> break;

I might be saying something stupid, but, I was under the impression
that floating point '==', assuming it follows IEEE rules, does exactly
this.

I know for certain that it does not do a memcmp(), as it has to deal
with representations whose exponent and mantissa are each off by +/-1
and shifted one bit, respectively.

mark

--
[email protected]/[email protected]/[email protected] __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/

2002-04-25 16:06:49

by Richard B. Johnson

[permalink] [raw]
Subject: Re: FPU, i386

On Thu, 25 Apr 2002, Mark Mielke wrote:

> On Thu, Apr 25, 2002 at 10:22:49AM -0400, Richard B. Johnson wrote:
> > To use the math macros, the comparison should be something like:
> > if (isless(fabs(a-b), 1.0e-38))
> > break;
>
> I might be saying something stupid, but, I was under the impression
> that floating point '==', assuming it follows IEEE rules, does exactly
> this.
>

A floating-point '==' tests whether the two operands are exactly equal.
The only time they will be exactly equal is:

float a, b;
a = b = 10.000;

At this instant a equals b, and (a==b) will return TRUE.
However, if by previous math you expected 'a' to equal 10.0, by
doing:

a = b = 10.00;

a += 12345.678;
a -= 12345.678;

then 'a' is not equal to 'b' if the 'C' compiler actually performed
the indicated operations (the compiler may optimize them away).

In this case (a==b) will return FALSE because they are not equal.

Script started on Thu Apr 25 11:52:46 2002

# cat >xxx.c
#include <stdio.h>
#include <math.h>

int main()
{
        double a, b;
        a = b = 10.0;

        printf("%d\n", (a==b));
        a += 10.234567890;
        printf("%d\n", (a==b));
        a -= 10.234567890;
        printf("%d\n", (a==b));
        return 0;

}

# gcc -o xxx xxx.c
# ./xxx
1
0
0
# exit
exit

Script done on Thu Apr 25 11:53:10 2002

So, as you can see, when first initialized with the same value,
regardless of how inaccurately it can be represented, it does
show equality.

However, as soon as I muck with it - add, then subtract the same
number - it is no longer equal.

This is why you NEVER use '==' when dealing with floating-point
numbers. The use of '==' in floating-point operations is a bug.
'C' allows you to write buggy code. It will probably emit
a floating-point compare instruction for the FPU (fcom, fcomp, fcompp),
which will promptly report 'not equal' in the FPU condition codes
and the CPU flags.

'C' will even allow you to use floating-point numbers in loops, i.e.,

        while (a--)
                do_something_forever();

Just because it's allowed, it doesn't mean that it's not a bug.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

Windows-2000/Professional isn't.

2002-04-26 22:10:28

by Kerl, John

[permalink] [raw]
Subject: RE: FPU, i386

There is an error here which shouldn't be propagated:

if (fabs(a-b) < 1.0e-38)
...

"Machine epsilon" for doubles (namely, the difference between 1.0 and
the next largest number) is on the order of 2e-16. This is a rough
estimate of the granularity of round-off error; and in fact 1.0 / 0.2
and 5.0 can't possibly be as close as 1.0e-38, unless they're exactly
equal.

There are four epsilon-ish things to be aware of:

* Difference between 0.0 and next float above: ~= 1.4e-45
* Difference between 0.0 and next double above: ~= 4.9e-324
* Difference between 1.0 and next float above: ~= 1.2e-7
* Difference between 1.0 and next double above: ~= 2.2e-16

The first two are more useful for things like detecting underflow; the
last two (some numerical folks suggest using their square roots) are
more useful for implementing an "approximately equals".
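
Those four values are easy to verify (a sketch of mine using C99's
nextafter family and <float.h>; link with -lm):

#include <stdio.h>
#include <math.h>
#include <float.h>

int main(void)
{
        /* smallest positive (denormal) values above 0.0 */
        printf("%g\n", (double)nextafterf(0.0f, 1.0f)); /* ~1.4e-45  */
        printf("%g\n", nextafter(0.0, 1.0));            /* ~4.9e-324 */

        /* the gap between 1.0 and the next value above it */
        printf("%g\n", (double)FLT_EPSILON);            /* ~1.2e-07  */
        printf("%g\n", DBL_EPSILON);                    /* ~2.2e-16  */
        return 0;
}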

----------------------------------------------------------------

The poster was incorrect in expecting 1.0 / 0.2 to be exactly equal to
anything, as was explained to him. But the problem doesn't have to do
with whether a number is transcendental, or irrational, or rational:
the number must be rational *and* must have a mantissa whose
denominator is a power of two *and* the exponent of that power of two
must be at most 23 (for single) or 52 (for double). And of course 1/5 is
2^-3 * 8/5, of which the mantissa has denominator 5, which isn't a power
of two.

So we all should know not to expect floating-point numbers to be
exactly equal to anything; that's been established. However, another
more basic question was not answered; curiosity (if nothing else)
demands an answer. Namely, it's OK to say we can't expect 1.0/0.2 ==
5.0. But why is the result of (what is apparently) the same
computation *sometimes* the same, and *sometimes* different? That's the
question.

And I think it's fair for the poster to want to know why.

If you disassemble the sample program, you'll see that without
optimization, 1.0 is divided by 0.2 at *run* time, and compared with
5.0; with optimization, the division is done, and the "<" and
"==" comparisons are done, at *compile* time. OK, but: If we're not
cross-compiling (most people don't), then the compiler creating a.out
is running on perhaps the same box as a.out is! Why does gcc, folding
the constant in the optimized a.out, get a different answer for 1.0/0.2
than the unoptimized a.out gets for 1.0/0.2?

Not only that, without optimization:

if (1/h < 5.0)
...

gives a different answer inside a.out than

x = 1/h;
if (x < 5.0)
...

The key is that Pentiums (Pentia?) have 80-bit floating-point numbers
in the FPU. Without optimization, at compile time, gcc represents 5.0
as 0x4014000000000000. 0.2 is 0x3fc999999999999a. These are both
64-bit doubles -- 1 sign bit, 11 exponent bits, & 52 explicit mantissa
bits (and 1 implicit leading mantissa bit, not stored in memory.)

In the case "if (1/h < 5.0)", at run time, 1.0 is loaded into the FPU
using fld1; then "fdivl {address of 0.2 in memory}". The result is the
*80-bit* number 0x40019ffffffffffffd80. The 64-bit number 5.0
(0x4014000000000000) is loaded into the FPU to become the 80-bit number
0x4001a000000000000000. Then, these two 80-bit numbers are compared in
the FPU; they're of course not the same.

What's different in the case "x = 1/h; if (x < 5.0) ..." is that both
80-bit numbers are stored from the FPU to memory as 64-bit (rounding
off the mantissa bits which differ), at which point they're both
0x4014000000000000, then loaded *back* into the FPU where they're
both 0x4001a000000000000000.

This isn't an FPU bug, by any stretch of the imagination, nor is it a
compiler bug. But it's a subtle difference between the Pentium's FPU
and other FPUs, of which it may occasionally be useful to be aware.
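
That difference is easy to reproduce. Here is a sketch (my code, not
from the thread); volatile forces the 64-bit store even when the
optimizer runs, and the "direct" branch assumes x87 code generation:

#include <stdio.h>

int main(void)
{
        volatile double h = 0.2;        /* volatile: keep h in memory    */
        volatile double x = 1 / h;      /* quotient rounded to 64 bits   */

        if (1 / h == 5.0)               /* compared at x87 register width */
                printf("direct: equal\n");
        else
                printf("direct: not equal\n");

        if (x == 5.0)                   /* compared after 64-bit rounding */
                printf("stored: equal\n");
        else
                printf("stored: not equal\n");
        return 0;
}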





2002-04-29 12:32:07

by Richard B. Johnson

[permalink] [raw]
Subject: RE: FPU, i386

On Fri, 26 Apr 2002, Kerl, John wrote:

> There is an error here which shouldn't be propagated:
>
> if (fabs(a-b) < 1.0e-38)
> ...

It is not an error at all.

>
> "Machine epsilon" for doubles (namely, the difference between 1.0 and
> the next largest number) is on the order of 2e-16. This is a rough
> estimate of the granularity of round-off error; and in fact 1.0 / 0.2
> and 5.0 can't possibly be as close as 1.0e-38, unless they're exactly
> equal.


This is not correct. The FPU in (even) the i486 stores 'tiny' values
(as defined in the Intel book) in extended-real format. Quantities as
small as +/- 3.37 * 10^-4932 are represented internally. Any comparison
of real numbers is (or certainly should be) done internally. The
hard-coded example of 1.0e-38 is well within the dynamic range of both
the FPU and the double returned by fabs().

As explained on page 16-3 of the Intel486 Programmer's Reference
Manual, the FPU tries to prevent denormalization. Denormalization
produces either a denormal or a zero. Even single-precision
denormals extend down to about 10^-45, also well beyond the -38 cited.

>
> There are four epsilon-ish things to be aware of:
>
> * Difference between 0.0 and next float above: ~= 1.4e-45
> * Difference between 0.0 and next double above: ~= 4.9e-324
> * Difference between 1.0 and next float above: ~= 1.2e-7
> * Difference between 1.0 and next double above: ~= 2.2e-16
>
> The first two are more useful for things like detecting underflow; the
> last two (some numerical folks suggest using their square roots) are
> more useful for implementing an "approximately equals".
>


I agree with your explanations of everything that follows:



> [...]

Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

Windows-2000/Professional isn't.

2002-04-29 16:19:36

by Kerl, John

[permalink] [raw]
Subject: RE: FPU, i386

OK, I'll say that the snippet in question *is*
an error on any processor I've ever worked with
(Sparc, PPC, MIPS), *except* for x86. Personally,
I always want to make sure my C code isn't
processor-dependent. Pretty much everybody
implements IEEE-standard single- and double-precision
floating point these days, but Intel's extended
format is not standard. (Arguably it *should* be,
but it isn't.)
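
Incidentally, the x86 dependence can be switched off at run time. If I
remember the details right, FreeBSD initializes the x87 control word to
53-bit precision (which is why the original poster's a.out printed
"1/h == 5.0" there), and a Linux program can do the same by hand. A
sketch using glibc's <fpu_control.h> names:

#include <stdio.h>
#include <fpu_control.h>        /* glibc, x86 only */

int main(void)
{
        fpu_control_t cw;
        double h = 0.2;

        _FPU_GETCW(cw);                           /* read the control word */
        cw = (cw & ~_FPU_EXTENDED) | _FPU_DOUBLE; /* 53-bit significands   */
        _FPU_SETCW(cw);                           /* write it back         */

        if (1 / h == 5.0)                         /* now rounds as a double */
                printf("1/h == 5.0\n");
        return 0;
}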

Thanks for the clarification.


