Loss of precision when doing LVecBase3d arithmetic

I encountered a very weird problem: if I perform this kind of arithmetic operation with LPoint3d, (a - p) + b, the Y component of the result suffers from precision loss!

I have the following C++ test code:

static LPoint3d a(0.0, -2.8311460325405014e+17, 1.2274520263792942e+17);
static LPoint3d b(127576048.4776305, 43032144.01342494, 0.0);

LPoint3d test_diff(LPoint3d p)
{
  return (a - p) + b;
}

If I invoke it from Python with the following code:

        p = LPoint3d(0.0, -2.8311460325405014e+17, 1.2274520263792942e+17)
        b = LPoint3d(127576048.4776305, 43032144.01342494, 0.0)
        print(test_diff(p) - b)

I expect to get a null vector, but instead I get this result:

LVector3d(0, 15.9866, 0)

If I run the same code in pure Python:

        p = LPoint3d(0.0, -2.8311460325405014e+17, 1.2274520263792942e+17)
        a = LPoint3d(0.0, -2.8311460325405014e+17, 1.2274520263792942e+17)
        b = LPoint3d(127576048.4776305, 43032144.01342494, 0.0)
        print(((a - p) + b) - b)

the result is as expected:

LVecBase3d(0, 0, 0)

Now, if I change the C++ code to split up the operation:

static LPoint3d a(0.0, -2.8311460325405014e+17, 1.2274520263792942e+17);
static LPoint3d b(127576048.4776305, 43032144.01342494, 0.0);

LPoint3d f(LPoint3d a, LPoint3d b)
{
  return a - b;
}

LPoint3d test_diff(LPoint3d p)
{
  return f(a, p) + b;
}

it works fine:

LVector3d(0, 0, 0) 

At first I thought it was memory corruption, but it happens consistently, even in trivial code like this. It also doesn't seem to be a compiler or platform issue, as I get the same problem on both Linux and macOS!

And for the test I'm using one of the latest Panda3D 1.11 SDK builds: 68f0931f43284345893a90d5bba9ba5df8aa53bb (by cmu, Dec 13 2021 08:35:38).

Not really sure what's going on. I tend to believe there is a wrong cast to single-precision float happening somewhere, as the delta is always between roughly -15 and +15 (at least with the values I'm using in my real code). But is it a bug, or am I doing something wrong?

I'm not sure, but maybe it's a difference in how the memory was allocated. In C++ you used static allocation, while Python uses dynamic allocation. You could try using dynamic memory allocation for testing; please note that this is just a theory.

Actually, I discovered the problem with dynamically allocated objects :wink: I switched to static constant values for the test to be sure it wasn't a problem linked to memory allocation.

We build Panda with -ffast-math (or /fp:fast on Windows), maybe that’s it?

Ah, I forgot about that flag! That could indeed be the problem, as -ffast-math implies -funsafe-math-optimizations, which allows the compiler to reorder operations without any regard for precision or rounding (as long as the math is algebraically valid).

Here, as I'm subtracting two (identical) large numbers and adding a much smaller one, if the order of operations is changed the small value may be added to one of the large numbers first, resulting in precision loss!

I will rebuild Panda3D without that flag and check the result.

Indeed, that was the root cause of the problem: without -ffast-math the problem disappears.

And, to be fair, the problem is not in Panda3D but in P3DModuleBuilder. With an optimize level greater than or equal to 3 it enables -ffast-math, like Panda3D; however, if you look at makepanda.py you see this nice line, which is missing in P3DModuleBuilder:


Ah, yeah, I had wondered if you had removed that flag or something like that.

I checked in a change to P3DModuleBuilder to add the missing flag.
