Intel Underestimates Error Bounds by 1.3 quintillion


Intel’s manuals for their x86/x64 processors clearly state that the fsin instruction (calculating the trigonometric sine) has a maximum error, in round-to-nearest mode, of one unit in the last place. This is not true. It’s not even close.

The worst-case error for the fsin instruction for small inputs is actually about 1.37 quintillion units in the last place, leaving fewer than four bits correct. For huge inputs it can be much worse, but I’m going to ignore that.

I was shocked when I discovered this. Both the fsin instruction and Intel’s documentation are hugely inaccurate, and the inaccurate documentation has led to poor decisions being made.

The great news is that when I shared an early version of this blog post with Intel they reacted quickly and the documentation is going to get fixed!

I discovered this while playing around with my favorite mathematical identity. If you add a double-precision approximation for pi to the sin() of that same value then the sum of the two values (added by hand) gives you a quad-precision estimate (about 33 digits) for the value of pi. This works because the sine of a number very close to pi is almost equal to the error in the estimate of pi. This is just calculus 101, and a variant of Newton’s method, but I still find it charming.
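
Spelled out, writing e for the error in double(pi): double(pi) = pi - e, with e around 1.2e-16. Because e is tiny, sin(double(pi)) = sin(pi - e) = sin(e) ≈ e, so double(pi) + sin(double(pi)) ≈ (pi - e) + e = pi. The identity itself is good to about e^3/6 ≈ 3e-49; in practice the result is limited by the roughly 16 significant digits of the computed sin() value, which is where the 33-digit figure comes from.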

The sin() function in VC++ is accurate enough for this technique to work well. However this technique relies on printing the value of the double-precision approximation of pi to 33 digits, and up to VC++ 2013  this doesn’t work – the extra digits are all printed as zeroes. So, you either need custom printing code or you need to wait for Dev 14 which has improved printing code.

So, I tried g++ on 32-bit Linux (Ubuntu 12.04) because I knew that glibc handles the printing just fine. However the results weren’t adding up correctly. I eventually realized that the sin() function in the 32-bit version of glibc 2.15 is just a thin wrapper around fsin, and this instruction is painfully inaccurate when the input is near pi.

Catastrophic cancellation, enshrined in silicon

The first step in calculating trigonometric functions like sin() is range reduction. The input number, which Intel says can be up into the quintillion range, needs to be reduced to the range +/- pi/2.

Range reduction in general is a tricky problem, but range reduction around pi is actually very easy. You just have to subtract the input number from a sufficiently precise approximation of pi. The worst-case for range reduction near pi will be the value that is closest to pi. For double-precision this will be a 53-bit approximation to pi that is off by, at most, half a unit in the last place (0.5 ULP). If you want to have 53 bits of accuracy in the result then you need to subtract from an approximation to pi that is accurate to at least 106 bits – half the bits will cancel out but you’ll still be left with enough.

But you actually need more than that. The x87 FPU has 80-bit registers which have a 64-bit mantissa. If you are doing long-double math and you want your range reduction of numbers near pi to be accurate then you need to subtract from at least a 128-bit approximation to pi.

This is still easy. If you know your input number is near pi then just extract the mantissa and do high-precision integer math to subtract from a hard-coded 128-bit approximation to pi. Round carefully and then convert back to floating-point. In case you want to give this a try I converted pi to 192-bits of hexadecimal:

C90FDAA2 2168C234 C4C6628B 80DC1CD1 29024E08 8A67CC74
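
If you want to experiment without writing the multi-word integer subtraction, here is a minimal sketch of the same cancellation-avoiding idea, using a two-piece (hi/lo) representation of pi instead. PI_HI, PI_LO, and reduce_near_pi are names I made up for this example; it only covers inputs near pi and only to double precision, and it is not Intel’s or glibc’s implementation:

#include <stdio.h>

// pi split across two doubles: PI_HI + PI_LO matches pi to roughly 107 bits.
static const double PI_HI = 3.14159265358979323846;  // rounds to the double nearest pi
static const double PI_LO = 1.2246467991473532e-16;  // pi - PI_HI, correctly rounded

// For x close enough to pi, x - PI_HI is exact (Sterbenz's lemma), so the only
// rounding error comes from folding in the low part of pi.
static double reduce_near_pi(double x) {
    return (x - PI_HI) - PI_LO;
}

int main(void) {
    double pi_d = 3.14159265358979323846;
    // Near pi, sin(x) is just -(x - pi), so this prints an accurate sin(double(pi)).
    printf("%.33f\n", -reduce_near_pi(pi_d));
    return 0;
}

The two-double trick only handles the neighborhood of pi, which is all the identity above needs; full range reduction for arbitrary inputs is the genuinely hard part.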

But Intel doesn’t use a 128-bit approximation to pi. Their approximation has just 66 bits. Here it is, taken from their manuals:

C90FDAA2 2168C234 C

Therein lies the problem. When doing range reduction from double-precision (53-bit mantissa) pi the results will have about 13 bits of precision (66 minus 53), for an error of up to 2^40 ULPs (53 minus 13). The measured error is lower at approximately 164 billion ULPs (~37 bits), but that’s still large. When doing range reduction from the 80-bit long-double format there will be almost complete cancellation and the worst-case error is 1.376 quintillion ULPs.

Oops.
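
If you want to measure errors like these yourself, here is one common way to count the ULPs between two computed doubles. It is only a generic measuring stick, not the methodology behind the numbers above, and UlpDifference is a name I picked for this sketch:

#include <stdint.h>
#include <string.h>

// For finite doubles of the same sign, the ordering of the raw bit patterns
// (viewed as 64-bit integers) matches the ordering of the values, so the
// integer difference counts how many representable doubles lie between them.
static int64_t UlpDifference(double a, double b) {
    int64_t ia, ib;
    memcpy(&ia, &a, sizeof(ia));
    memcpy(&ib, &b, sizeof(ib));
    return ia - ib;
}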

Actually calculating sin

After range reduction the next step is to calculate sin. For numbers that are sufficiently close to pi this is trivial – the range-reduced number is the answer. Therefore, for double(pi) the error in the range reduction is the error in fsin, and fsin is inaccurate near pi, contradicting Intel’s documentation. QED.

Note that sin(double(pi)) should not return zero. The sine of pi is, mathematically, zero, but since float/double/long-double can’t represent pi it would actually be incorrect for sin() to return zero when passed an approximation to pi.

Frustration

It is surprising that fsin is so inaccurate. I could perhaps forgive it for being inaccurate for extremely large inputs (which it is) but it is hard to forgive it for being so inaccurate on pi which is, ultimately, a very ‘normal’ input to the sin() function. Ah, the age-old decisions that we’re then stuck with for decades.

I also find Intel’s documentation frustrating. In their most recent Volume 2: Instruction Set Reference they say that the range for fsin is -2^63 to +2^63. Then, in Volume 3: System Programming Guide, section 22.18.8 (Transcendental Instructions) they say “results will have a worst case error of less than 1 ulp when rounding to the nearest-even”. They reiterate this in section 8.3.10 (Transcendental Instruction Accuracy) of Volume 1: Basic Architecture.

Section 8.3.8 (Pi) of the most recent Volume 1: Basic Architecture documentation from Intel discusses the 66-bit approximation for pi that the x87 FPU uses, and says that it has been “chosen to guarantee no loss of significance in a source operand”, which is simply not true. This optimistic documentation has consequences. In 2001 an up-and-coming Linux kernel programmer believed this documentation and used it to argue against adding an fsin wrapper function to glibc.

Similar problems apply to fcos near pi/2. See this blog post for more details and pretty pictures.

Intel has known for years that these instructions are not as accurate as promised. They are now making updates to their documentation. Updating the instruction is not a realistic option.

I ran my tests on the 2.19 version of glibc (Ubuntu 14.04) and found that it no longer uses fsin. Clearly the glibc developers are aware of the problems with fsin and have stopped using it. That’s good, because g++ sometimes calculates sin at compile time and the compile-time version is far more accurate than fsin. With glibc 2.15 this mismatch makes for some bizarre inconsistencies, and it can make floating-point determinism challenging.

Is it a bug?

When Intel’s fdiv instruction was found to be flawed it was clearly a bug. The IEEE-754 spec says exactly how fdiv should be calculated (perfectly rounded) but imposes no restrictions on fsin. Intel’s implementation of fsin is clearly weak, but for compatibility reasons it probably can’t be changed so it can’t really be considered a bug at this point.

However the documentation is buggy. By claiming near-perfect accuracy across a huge domain and then only providing it in a tiny domain Intel has misled developers and caused them to make poor decisions. Scott Duplichan’s research shows that at long-double precision fsin fails to meet its guarantees at numbers as small as 3.01626.

For double-precision the guarantees are not met for numbers as small as about 3.14157. Another way of looking at this is that there are tens of billions of doubles near pi where the double-precision result from fsin fails to meet Intel’s precision guarantees.

The absolute error in the range I was looking at was fairly constant at about 4e-21, which is quite small. For many purposes this will not matter. However the relative error can get quite large, and for the domains where that matters the misleading documentation can easily be a problem.

I’m not the first to discover this by any means, but many people are not aware of this quirk. While it is getting increasingly difficult to actually end up using the fsin instruction, it still shows up occasionally, especially when using older versions of glibc. And, I think this behavior is interesting, so I’m sharing.

That’s it. You can stop reading here if you want. The rest is just variations and details on the theme.

How I discovered this

I ran across this problem when I was planning to do a tweet version of my favorite pi/sin() identity:

double pi_d = 3.14159265358979323846;
printf("pi = %.33f\n   + %.33f\n", pi_d, sin(pi_d));

I compiled this with g++, confident that it would work. But before sending the tweet I checked the results by manually adding the two rows of numbers, and I was confused when I got the wrong answer: I was expecting 33 digits of pi and I only got 21.

I then printed the integer representation of sin(pi_d) in hex and it was 0x3ca1a60000000000. That many zeroes is very strange for a transcendental function.
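
If you want to peek at bit patterns the same way, here is one simple way to do it; PrintDoubleBits is just a name for this sketch:

#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

// Dump a double's raw 64-bit pattern; memcpy sidesteps strict-aliasing issues.
static void PrintDoubleBits(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof(bits));
    printf("%.17g = 0x%016llx\n", d, (unsigned long long)bits);
}

int main(void) {
    double pi_d = 3.14159265358979323846;
    // With the fsin-backed sin() of 32-bit glibc 2.15 this printed
    // 0x3ca1a60000000000; an accurate sin() fills in the low bits.
    PrintDoubleBits(sin(pi_d));
    return 0;
}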

At first I thought that the x87 precision control had been set to float for some reason, although in hindsight the result did not even have float accuracy. When I discovered that the code worked on my desktop Linux machine I assumed that the virtual machine I’d been using for testing must have a bug.

It took a bit of testing and experimentation and stepping into various sin() functions to realize that the ‘bug’ was in the actual fsin instruction. It’s hidden on 64-bit Linux (my desktop Linux machine is 64-bit) and with VC++ because they both have a hand-written sin() function that doesn’t use fsin.

I experimented a bit with comparing sin() to fsin with VC++ to characterize the problem and eventually, with help from Scott Duplichan’s exposé on this topic, I recognized the problem as being a symptom of catastrophic cancellation in the range reduction. I’m sure I would have uncovered the truth faster if Intel’s documentation had not filled me with unjustified optimism.
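
If you want to reproduce that comparison with g++, here is roughly how you can call the fsin instruction directly and put it next to the library sin(). fsin_x87 is my name for the wrapper, the inline assembly is gcc/clang-specific and x86-only, and this is just a sketch rather than the test harness I actually used:

#include <math.h>
#include <stdio.h>

// Invoke the x87 fsin instruction directly (x86/x64, gcc-style inline asm).
static double fsin_x87(double x) {
    long double result = x;
    __asm__("fsin" : "+t"(result));  // "t" pins the operand to the top of the x87 stack
    return (double)result;
}

int main(void) {
    double pi_d = 3.14159265358979323846;
    printf("fsin:  %.33f\n", fsin_x87(pi_d));
    printf("sin(): %.33f\n", sin(pi_d));
    return 0;
}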

Amusingly enough, if I’d followed my usual practice and declared pi_d as const then the code would have worked as expected and I would never have discovered the problem with fsin!

Code!

If you want to try this technique for calculating pi but you don’t like manually adding 34-digit numbers then you can try this very crude code to do it for you:

#include <math.h>    // sin
#include <stdio.h>   // printf, snprintf
#include <string.h>  // strlen

void PiAddition() {
    double pi_d = 3.14159265358979323846;
    double sin_d = sin(pi_d);
    printf("pi = %.33f\n   + %.33f\n", pi_d, sin_d);

    char pi_s[100]; char sin_s[100]; char result_s[100] = {};
    snprintf(pi_s, sizeof(pi_s), "%.33f", pi_d);
    snprintf(sin_s, sizeof(sin_s), "%.33f", sin_d);

    // Add the two decimal strings digit by digit, from right to left.
    int carry = 0;
    for (int i = (int)strlen(pi_s) - 1; i >= 0; --i) {
        result_s[i] = pi_s[i];
        if (pi_s[i] != '.') {
            char d = pi_s[i] + sin_s[i] + carry - '0' * 2;
            carry = d > 9;
            result_s[i] = d % 10 + '0';
        }
    }
    printf("   = %s\n", result_s);
}

The code makes lots of assumptions (numbers aligned, both numbers positive) but it does demonstrate the technique quite nicely. Here are some results:

With g++ on 32-bit Linux with glibc 2.15:

pi = 3.141592653589793115997963468544185

   + 0.000000000000000122460635382237726

   = 3.141592653589793238458598850781911

With VC++ 2013 Update 3:

pi = 3.141592653589793100000000000000000

   + 0.000000000000000122464679914735320

   = 3.141592653589793222464679914735320

With VC++ Dev 14 CTP 3 and g++ on 64-bit Linux or glibc 2.19:

pi = 3.141592653589793115997963468544185

   + 0.000000000000000122464679914735321

   = 3.141592653589793238462643383279506

In other words, if the sixth significant digit of sin(double(pi)) is four then you have the correct value for sin(), but if it is zero then you have the incorrect fsin version.

Compile-time versus run-time sin

g++ tries to do as much math as possible at compile time, which further complicates this messy situation. If g++ detects that pi_d is constant then it will calculate sin(pi_d) at compile time, using MPFR. In the sample code above if pi_d is declared as ‘const’ then the results are quite different. This difference between compile-time and run-time sin() in g++ with glibc 2.15 can be seen with this code and its non-unity result:

const double pi_d = 3.14159265358979323846;
const int zero = argc / 99;
printf("%f\n", sin(pi_d + zero) / sin(pi_d));

0.999967

Or, this equally awesome code that also prints 0.999967:

const double pi_d = 3.14159265358979323846;
double pi_d2 = pi_d;
printf("%f\n", sin(pi_d2) / sin(pi_d));

It’s almost a shame that these goofy results are going away in glibc 2.19.

Sine ~zero == ~zero

Some years ago I was optimizing all of the trigonometric functions for an embedded system. I was making them all run about five times faster and my goal was to get identical results. However I had to abandon that goal when I realized that many of the functions, including sin(), gave incorrect answers on denormal inputs. Knowing that the sine of a denormal is the denormal helped me realize that the original versions were wrong, and I changed my goal to be identical results except when the old results were wrong.
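
A quick sanity check you can run against any sin() implementation; 1e-310 is just an arbitrary denormal chosen for the example:

#include <math.h>
#include <stdio.h>

int main(void) {
    double denormal = 1e-310;  // well below DBL_MIN (~2.2e-308), so denormal
    // For a denormal d, the closest double to the true sine is d itself,
    // so a correct sin() must hand the input straight back.
    printf("sin(%g) == %g ? %s\n", denormal, denormal,
           sin(denormal) == denormal ? "yes" : "no");
    return 0;
}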

66 == 68

Intel describes their internal value for pi as having 66 bits, but it arguably has 68 bits because the next three bits are all zero, so their internal 66-bit value is a correctly rounded 68-bit approximation of pi. That’s why some of the results for fsin are slightly more accurate than you might initially expect.

Sin() is trivial (for almost half of input values)

For small enough doubles, the best answer for sin(x) is x. That is, for numbers smaller than about 1.49e-8, the closest double to the sine of x is actually x itself. I learned this from the glibc source code for sin(). Since half of all doubles are numbers below 1.0, and since doubles have enormous range, it turns out that those numbers below 1.49e-8 represent about 48.6% of all doubles. In the glibc sin() source code it tests for the top half of the integer representation of abs(x) being less than 0x3e500000, roughly equivalent to <= 1.49e-8, and that represents 48.6% of the entire range for positive doubles (0x80000000). The ‘missing’ 1.4% is the numbers between about 1.49e-8 and 1.0.
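
Here is roughly what that early-out test looks like; this is my paraphrase of the logic described above, not a copy of glibc’s source:

#include <stdint.h>
#include <string.h>

// True when sin(x) should simply return x: the top 32 bits of |x| are below
// 0x3e500000, i.e. |x| is less than roughly 1.49e-8 (about 2^-26).
static int sin_is_trivial(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof(bits));
    uint32_t high = (uint32_t)(bits >> 32) & 0x7fffffff;  // clear the sign bit
    return high < 0x3e500000;
}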
