[FIXED] odespace.collide leaking memory?

Hi,

In my code I use ODE collisions with the space.collide method instead of Panda’s AutoCollide feature. Recently I’ve noticed a memory leak at a rate of up to 0.1 MB per second.

I’ve searched for the problem and tracked it down to my collision handling task. After commenting out almost everything in there, I was left with only the space.collide() call, yet the problem remained. This method takes a callback, which I replaced with an empty method for testing.

However, as soon as I commented out space.collide() as well, the problem seemed to disappear.

My tracking the cause of the leak might not be perfect, and the cause might still lie somewhere in my code (which I will continue to investigate), but currently I’m almost certain that either the collide method itself or my usage of it is the problem.
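
A rough way to put a number on the growth while commenting things out is sketched below. It is only an illustration, not the code from the sample: the helper names are made up, and it assumes Linux, where ru_maxrss is reported in kilobytes. Since ru_maxrss is the peak resident size, it can only show growth, never shrinkage.

```python
import resource


def peak_rss_mb():
    # Assumption: Linux, where ru_maxrss is in kilobytes (macOS reports bytes).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0


def rss_growth_mb(step, iterations):
    """Call step() repeatedly and return how far peak RSS rose, in MB."""
    before = peak_rss_mb()
    for _ in range(iterations):
        step()
    return peak_rss_mb() - before
```

Here step would be a small function that performs one space.collide() call with the empty callback; a figure that keeps climbing as iterations is increased points at that call.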

A sample that reproduces the leak is here: dl.dropbox.com/u/196274/coppertopLeak.tar.gz

There’s not much going on in this sample when you run it, because the only thing the ODE update task does is invoke the space.collide() method, which in turn calls an empty callback. The most relevant code is odeWorldManager.py from line 810 to 822.

Note that for this simple demo it takes a while (at least on my system) for the leak to show up. If memory usage stays constant at first, try waiting a few minutes. The leak is slower here than in my real application, but it is still present.

The problem doesn’t (seem to) show when using the standard AutoCollide, which is yet another clue that it might be space.collide() related, because AutoCollide uses ODE’s functions directly, omitting space.collide().

I’ve tried to look into odeSpace.cxx for clues. I’m not into C++, but I was thinking it might be related to the near_callback method at line 260 of that file. It looks like the geoms are copied there. The odespace_cat.spam() call also looks a little suspicious to me, but as I said, I’m not good at C++, so it’s just a thought that came to mind when browsing the source.

Just in case, I’m on 32-bit Ubuntu 10.04 and use the packages from the Panda’s ppa.

Thanks very much in advance for any help (even if only running to confirm) on this.

EDIT:
For testing purposes, I’ve replaced the space.collide() call in the code I work on with this:

def spaceCollide(self):
    numGeoms = self.space.getNumGeoms()
    for i in range(numGeoms):
        for j in range(numGeoms):
            if i == j:
                continue

            g1 = self.space.getGeom(i)
            g2 = self.space.getGeom(j)

            self.handleCollisions("", g1, g2)

def simulationTask(self, task):
    #self.space.collide("", self.handleCollisions)
    self.spaceCollide()
    self.world.quickStep(self.stepSize)
    self.contactGroup.empty()

The rest of the code remained unchanged (including def handleCollisions which is normally used as callback in space.collide). Everything works (far too slowly, obviously) and the leak is not present in this setup.
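
Part of the slowness comes from the nested loops visiting every ordered pair, so each pair of geoms is handled twice. The sketch below compares that with visiting each unordered pair once. It uses plain indices rather than real geoms, the function names are made up for illustration, and it assumes the callback treats (g1, g2) and (g2, g1) the same way.

```python
from itertools import combinations


def ordered_pairs(n):
    """Index pairs visited by the nested loops: every (i, j) with i != j."""
    return [(i, j) for i in range(n) for j in range(n) if i != j]


def unordered_pairs(n):
    """Each pair exactly once, with i < j."""
    return list(combinations(range(n), 2))


# For 5 geoms the nested loops make 20 callback calls; 10 would cover every pair.
print(len(ordered_pairs(5)), len(unordered_pairs(5)))
```

Either way this is still a brute-force test of all pairs, so it remains a debugging stand-in rather than a replacement for space.collide().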

I wanted to dive into the code of the space.collide, but I need instructions to correctly build ODE for use with Panda, so I can in turn build Panda myself. The package included with Ubuntu causes problems, rendering ODE unusable. If you can provide me with those directions I would be very grateful.

Peter

Hmm, that method does indeed look like trouble. In particular, it creates Python wrapper objects p1 and p2, and the Python result object result; and it frees none of these.

It should call Py_DECREF() for each of the three of these before it returns.
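
To illustrate why a missing Py_DECREF() leaks, here is a small Python sketch of the same effect. It is not the code from odeSpace.cxx: it uses ctypes to call Py_IncRef and Py_DecRef (the function forms of the macros that CPython exports), and the Wrapper class is a made-up stand-in for objects like p1 and p2. It assumes CPython.

```python
import ctypes
import weakref

# Function forms of the Py_INCREF / Py_DECREF macros, exported by CPython.
py_incref = ctypes.pythonapi.Py_IncRef
py_incref.argtypes = [ctypes.py_object]
py_incref.restype = None
py_decref = ctypes.pythonapi.Py_DecRef
py_decref.argtypes = [ctypes.py_object]
py_decref.restype = None


class Wrapper(object):
    """Hypothetical stand-in for a wrapper object such as p1 or p2."""


def leak_then_release():
    obj = Wrapper()
    probe = weakref.ref(obj)   # lets us observe whether the object was freed

    py_incref(obj)             # native code now owns one extra reference
    del obj                    # Python's own reference goes away
    alive_while_unbalanced = probe() is not None

    tmp = probe()              # temporary strong reference
    py_decref(tmp)             # the decrement that had been missing
    del tmp
    freed_after_decref = probe() is None
    return alive_while_unbalanced, freed_after_decref
```

While the extra reference is outstanding, the object can never be freed, which is what happens on every callback invocation if the wrappers and the result are not released.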

David

Thanks for your reply David. Can this problem be easily fixed?

Sure, by adding the three calls to Py_DECREF(). I’m reluctant to do it, though, since I don’t build ODE myself, and I wouldn’t be able to test the change.

Perhaps someone else who regularly builds Panda with ODE can make the change and confirm that it works?

David

You can check it in, and people can test it using the daily builds.

Hmm, bold. OK, I’ve done it.

David

Thanks for the fast reaction to both of you, you’re great.

However, I have installed the newest build from buildbot and there seems to be a different problem. Here’s the output I get:

*** glibc detected *** python: free(): invalid pointer: 0xbf860844 ***

This error causes the whole application to freeze completely; it can only be shut down by killing it or closing its parent terminal. Neither escape (bound to sys.exit) nor closing the window with the close button or alt+f4 works.

Based on when and where it happens, I’m fairly sure it’s the same area, so it most probably is related to the fix, unfortunately.

This same code of mine works fine on the stable Panda 1.7 from ppa (except for the leak of course).

Thanks again for your help and interest so far.

Hmm, disturbing. I just tried replacing the Py_DECREF() calls with Py_XDECREF(), though that really shouldn’t matter. I also tested it out with my own ODE build, and it appears to work without problems. Try out the next auto-build and we’ll see if that has any effect.

David

I just fired a new 32-bits karmic build, this should contain David’s new fix. But I doubt it makes a difference.

Could you install the new build and, if it doesn’t work, get a gdb traceback? You can do so by running “gdb python”, and at the gdb prompt typing “run” followed by the name of your script (e.g. “run game.py”).
As soon as you get the crash, type “bt” to get a stack trace of the crash.

Unfortunately it didn’t make a difference…

I’ve tried getting a traceback from gdb before, but failed with my real application – I only got “corrupt stack” and not much more. Besides, the game didn’t actually crash in this case – it just froze.

However, with a much simpler program (the one I linked to in the first post) I was able to finally get something, hopefully, more useful. The difference in on-crash behavior might be related to lots of other stuff going on in my actual game, so I’ll stick to the simplified version for the sake of debugging.

Anyway, here it is:

(gdb) run main.py
Starting program: /usr/bin/python main.py
[Thread debugging using libthread_db enabled]
DirectStart: Starting the game.
Known pipe types:
  glxGraphicsPipe
(all display modules loaded.)
[New Thread 0xb740cb70 (LWP 2089)]
*** glibc detected *** /usr/bin/python: free(): invalid pointer: 0xbfffe954 ***
======= Backtrace: =========
/lib/tls/i686/cmov/libc.so.6(+0x6b591)[0x38d591]
/lib/tls/i686/cmov/libc.so.6(+0x6cde8)[0x38ede8]
/lib/tls/i686/cmov/libc.so.6(cfree+0x6d)[0x391ecd]
/usr/lib/panda3d/libp3dtool.so.1.7(_ZN10MemoryHook16heap_free_singleEPv+0x37)[0x7d0657]
/usr/lib/panda3d/libpandaexpress.so(_ZN11MemoryUsage16heap_free_singleEPv+0x40)[0x5743a0]
/usr/lib/panda3d/libpandaode.so(_ZN7OdeGeomD0Ev+0xa2)[0x58f8902]
/usr/lib/panda3d/libpandaode.so(+0x5537e)[0x590c37e]
/usr/lib/panda3d/libpandaode.so(_Z24Dtool_Deallocate_GeneralP7_object+0x16)[0x5976b96]
/usr/lib/panda3d/libpandaode.so(_ZN8OdeSpace13near_callbackEPvP6dxGeomS2_+0x169)[0x58f8d99]
/usr/lib/panda3d/libpandaode.so(+0xcb667)[0x5982667]
======= Memory map: ========
00110000-0012b000 r-xp 00000000 08:01 146678     /lib/ld-2.11.1.so
0012b000-0012c000 r--p 0001a000 08:01 146678     /lib/ld-2.11.1.so
0012c000-0012d000 rw-p 0001b000 08:01 146678     /lib/ld-2.11.1.so
0012d000-0012e000 r-xp 00000000 00:00 0          [vdso]
0012e000-00143000 r-xp 00000000 08:01 192776     /lib/tls/i686/cmov/libpthread-2.11.1.so
00143000-00144000 r--p 00014000 08:01 192776     /lib/tls/i686/cmov/libpthread-2.11.1.so
00144000-00145000 rw-p 00015000 08:01 192776     /lib/tls/i686/cmov/libpthread-2.11.1.so
00145000-00147000 rw-p 00000000 00:00 0 
00147000-00149000 r-xp 00000000 08:01 187561     /lib/tls/i686/cmov/libdl-2.11.1.so
00149000-0014a000 r--p 00001000 08:01 187561     /lib/tls/i686/cmov/libdl-2.11.1.so
0014a000-0014b000 rw-p 00002000 08:01 187561     /lib/tls/i686/cmov/libdl-2.11.1.so
0014b000-0014d000 r-xp 00000000 08:01 192781     /lib/tls/i686/cmov/libutil-2.11.1.so
0014d000-0014e000 r--p 00001000 08:01 192781     /lib/tls/i686/cmov/libutil-2.11.1.so
0014e000-0014f000 rw-p 00002000 08:01 192781     /lib/tls/i686/cmov/libutil-2.11.1.so
0014f000-00191000 r-xp 00000000 08:01 147349     /lib/i686/cmov/libssl.so.0.9.8
00191000-00192000 r--p 00042000 08:01 147349     /lib/i686/cmov/libssl.so.0.9.8
00192000-00195000 rw-p 00043000 08:01 147349     /lib/i686/cmov/libssl.so.0.9.8
00195000-002cd000 r-xp 00000000 08:01 147348     /lib/i686/cmov/libcrypto.so.0.9.8
002cd000-002d5000 r--p 00137000 08:01 147348     /lib/i686/cmov/libcrypto.so.0.9.8
002d5000-002e3000 rw-p 0013f000 08:01 147348     /lib/i686/cmov/libcrypto.so.0.9.8
002e3000-002e7000 rw-p 00000000 00:00 0 
002e7000-002fa000 r-xp 00000000 08:01 146790     /lib/libz.so.1.2.3.3
002fa000-002fb000 r--p 00012000 08:01 146790     /lib/libz.so.1.2.3.3
002fb000-002fc000 rw-p 00013000 08:01 146790     /lib/libz.so.1.2.3.3
002fc000-00320000 r-xp 00000000 08:01 187562     /lib/tls/i686/cmov/libm-2.11.1.so
00320000-00321000 r--p 00023000 08:01 187562     /lib/tls/i686/cmov/libm-2.11.1.so
00321000-00322000 rw-p 00024000 08:01 187562     /lib/tls/i686/cmov/libm-2.11.1.so
00322000-00475000 r-xp 00000000 08:01 187558     /lib/tls/i686/cmov/libc-2.11.1.so
00475000-00476000 ---p 00153000 08:01 187558     /lib/tls/i686/cmov/libc-2.11.1.so
00476000-00478000 r--p 00153000 08:01 187558     /lib/tls/i686/cmov/libc-2.11.1.so
00478000-00479000 rw-p 00155000 08:01 187558     /lib/tls/i686/cmov/libc-2.11.1.so
00479000-0047c000 rw-p 00000000 00:00 0 
0047c000-00795000 r-xp 00000000 08:01 104119     /usr/lib/panda3d/libpandaexpress.so.1.7
00795000-00799000 r--p 00318000 08:01 104119     /usr/lib/panda3d/libpandaexpress.so.1.7
00799000-007aa000 rw-p 0031c000 08:01 104119     /usr/lib/panda3d/libpandaexpress.so.1.7
007aa000-007b1000 rw-p 00000000 00:00 0 
007b1000-007e0000 r-xp 00000000 08:01 104122     /usr/lib/panda3d/libp3dtool.so.1.7
007e0000-007e1000 r--p 0002e000 08:01 104122     /usr/lib/panda3d/libp3dtool.so.1.7
007e1000-007e2000 rw-p 0002f000 08:01 104122     /usr/lib/panda3d/libp3dtool.so.1.7
007e2000-0083b000 r-xp 00000000 08:01 104124     /usr/lib/panda3d/libp3dtoolconfig.so.1.7
0083b000-0083c000 r--p 00058000 08:01 104124     /usr/lib/panda3d/libp3dtoolconfig.so.1.7
0083c000-0083e000 rw-p 00059000 08:01 104124     /usr/lib/panda3d/libp3dtoolconfig.so.1.7
0083e000-00927000 r-xp 00000000 08:01 100224     /usr/lib/libstdc++.so.6.0.13
00927000-00928000 ---p 000e9000 08:01 100224     /usr/lib/libstdc++.so.6.0.13
00928000-0092c000 r--p 000e9000 08:01 100224     /usr/lib/libstdc++.so.6.0.13
0092c000-0092d000 rw-p 000ed000 08:01 100224     /usr/lib/libstdc++.so.6.0.13
0092d000-00934000 rw-p 00000000 00:00 0 
00934000-00951000 r-xp 00000000 08:01 146675     /lib/libgcc_s.so.1
00951000-00952000 r--p 0001c000 08:01 146675     /lib/libgcc_s.so.1
00952000-00953000 rw-p 0001d000 08:01 146675     /lib/libgcc_s.so.1
00953000-02033000 r-xp 00000000 08:01 104103     /usr/lib/panda3d/libpanda.so.1.7
02033000-02034000 ---p 016e0000 08:01 104103     /usr/lib/panda3d/libpanda.so.1.7
02034000-0205a000 r--p 016e0000 08:01 104103     /usr/lib/panda3d/libpanda.so.1.7
0205a000-020b9000 rw-p 01706000 08:01 104103     /usr/lib/panda3d/libpanda.so.1.7
020b9000-020f3000 rw-p 00000000 00:00 0 
020f3000-0219e000 r-xp 00000000 08:01 131424     /usr/lib/i686/cmov/libavformat.so.52.31.0
0219e000-0219f000 r--p 000aa000 08:01 131424     /usr/lib/i686/cmov/libavformat.so.52.31.0
0219f000-021a5000 rw-p 000ab000 08:01 131424     /usr/lib/i686/cmov/libavformat.so.52.31.0
021a5000-021f1000 rw-p 00000000 00:00 0 
021f1000-02720000 r-xp 00000000 08:01 131405     /usr/lib/i686/cmov/libavcodec.so.52.20.1
02720000-02721000 r--p 0052f000 08:01 131405     /usr/lib/i686/cmov/libavcodec.so.52.20.1
02721000-0272a000 rw-p 00530000 08:01 131405     /usr/lib/i686/cmov/libavcodec.so.52.20.1
0272a000-02a38000 rw-p 00000000 00:00 0 
02a38000-02a44000 r-xp 00000000 08:01 131403     /usr/lib/i686/cmov/libavutil.so.49.15.0
02a44000-02a45000 r--p 0000b000 08:01 131403     /usr/lib/i686/cmov/libavutil.so.49.15.0
02a45000-02a46000 rw-p 0000c000 08:01 131403     /usr/lib/i686/cmov/libavutil.so.49.15.0
02a46000-02a49000 rw-p 00000000 00:00 0 
02a49000-02a91000 r-xp 00000000 08:01 131541     /usr/lib/i686/cmov/libswscale.so.0.7.1
02a91000-02a92000 r--p 00048000 08:01 131541     /usr/lib/i686/cmov/libswscale.so.0.7.1
02a92000-02a93000 rw-p 00049000 08:01 131541     /usr/lib/i686/cmov/libswscale.so.0.7.1
02a93000-02abe000 r-xp 00000000 08:01 103367     /usr/lib/libfftw.so.2.0.5
02abe000-02abf000 rw-p 0002a000 08:01 103367     /usr/lib/libfftw.so.2.0.5
02abf000-02ae5000 r-xp 00000000 08:01 103370     /usr/lib/librfftw.so.2.0.5
02ae5000-02ae6000 rw-p 00026000 08:01 103370     /usr/lib/librfftw.so.2.0.5
02ae6000-02b57000 r-xp 00000000 08:01 99619      /usr/lib/libfreetype.so.6.3.22
02b57000-02b5b000 r--p 00070000 08:01 99619      /usr/lib/libfreetype.so.6.3.22
02b5b000-02b5c000 rw-p 00074000 08:01 99619      /usr/lib/libfreetype.so.6.3.22
02b5c000-03098000 r-xp 00000000 08:01 100477     /usr/lib/libCg.so
03098000-032e7000 rw-p 0053b000 08:01 100477     /usr/lib/libCg.so
032e7000-032eb000 rw-p 00000000 00:00 0 
032eb000-03343000 r-xp 00000000 08:01 100249     /usr/lib/libtiff.so.4.3.2
03343000-03345000 r--p 00057000 08:01 100249     /usr/lib/libtiff.so.4.3.2
Program received signal SIGABRT, Aborted.
0x0012d422 in __kernel_vsyscall ()
(gdb) bt
#0  0x0012d422 in __kernel_vsyscall ()
#1  0x0034c651 in raise () from /lib/tls/i686/cmov/libc.so.6
#2  0x0034fa82 in abort () from /lib/tls/i686/cmov/libc.so.6
#3  0x0038349d in ?? () from /lib/tls/i686/cmov/libc.so.6
#4  0x0038d591 in ?? () from /lib/tls/i686/cmov/libc.so.6
#5  0x0038ede8 in ?? () from /lib/tls/i686/cmov/libc.so.6
#6  0x00391ecd in free () from /lib/tls/i686/cmov/libc.so.6
#7  0x007d0657 in MemoryHook::heap_free_single(void*) () from /usr/lib/panda3d/libp3dtool.so.1.7
#8  0x005743a0 in MemoryUsage::heap_free_single(void*) () from /usr/lib/panda3d/libpandaexpress.so
#9  0x058f8902 in OdeGeom::~OdeGeom() () from /usr/lib/panda3d/libpandaode.so
#10 0x0590c37e in Dtool_FreeInstance_OdeGeom(_object*) () from /usr/lib/panda3d/libpandaode.so
#11 0x05976b96 in Dtool_Deallocate_General(_object*) () from /usr/lib/panda3d/libpandaode.so
#12 0x058f8d99 in OdeSpace::near_callback(void*, dxGeom*, dxGeom*) () from /usr/lib/panda3d/libpandaode.so
#13 0x05982667 in collideAABBs (g1=0x8d1f228, g2=0x8d1ed78, data=0xb7fa70b0, callback=0x58f8c30 <OdeSpace::near_callback(void*, dxGeom*, dxGeom*)>)
    at collision_space_internal.h:81
#14 0x05982846 in dxSimpleSpace::collide (this=0x8cb2158, data=0xb7fa70b0, callback=0x58f8c30 <OdeSpace::near_callback(void*, dxGeom*, dxGeom*)>)
    at collision_space.cpp:281
#15 0x05982a7f in dSpaceCollide (space=0x8cb2158, data=0xb7fa70b0, callback=0x58f8c30 <OdeSpace::near_callback(void*, dxGeom*, dxGeom*)>) at collision_space.cpp:738
#16 0x058f1cd7 in OdeSpace::collide(_object*, _object*) () from /usr/lib/panda3d/libpandaode.so
#17 0x0592638d in Dtool_OdeSpace_collide_245(_object*, _object*, _object*) () from /usr/lib/panda3d/libpandaode.so
#18 0x080e0a21 in PyEval_EvalFrameEx ()
#19 0x080e2807 in PyEval_EvalCodeEx ()
#20 0x0816b2ac in ?? ()
#21 0x0806245a in PyObject_Call ()
#22 0x0806a45c in ?? ()
#23 0x0806245a in PyObject_Call ()
#24 0x011ba3d2 in Thread::call_python_func(_object*, _object*) () from /usr/lib/panda3d/libpanda.so
#25 0x011d5b0f in PythonTask::do_python_task() () from /usr/lib/panda3d/libpanda.so
#26 0x011d5e2d in PythonTask::do_task() () from /usr/lib/panda3d/libpanda.so
#27 0x011db4e1 in AsyncTask::unlock_and_do_task() () from /usr/lib/panda3d/libpanda.so
#28 0x011e4c46 in AsyncTaskChain::service_one_task(AsyncTaskChain::AsyncTaskChainThread*) () from /usr/lib/panda3d/libpanda.so
#29 0x011e5a5c in AsyncTaskChain::do_poll() () from /usr/lib/panda3d/libpanda.so
#30 0x011e5b9b in AsyncTaskManager::poll() () from /usr/lib/panda3d/libpanda.so
#31 0x011f989e in Dtool_AsyncTaskManager_poll_121(_object*, _object*, _object*) () from /usr/lib/panda3d/libpanda.so
#32 0x080e0a21 in PyEval_EvalFrameEx ()
#33 0x080e1bb0 in PyEval_EvalFrameEx ()
#34 0x080e2807 in PyEval_EvalCodeEx ()
#35 0x080e0c8b in PyEval_EvalFrameEx ()
#36 0x080e1bb0 in PyEval_EvalFrameEx ()
#37 0x080e2807 in PyEval_EvalCodeEx ()
#38 0x080e2907 in PyEval_EvalCode ()
#39 0x081005ad in PyRun_FileExFlags ()
#40 0x08100812 in PyRun_SimpleFileExFlags ()
#41 0x0805de5c in Py_Main ()
#42 0x0805d03b in main ()

Including the error message on crash.

David, when you tested with your own build did you use my code or some standard Panda ODE-using program? I’m asking this just in case, because the ODE samples from the manual don’t use space.collide, and instead use the ODE’s methods directly through autocollide, in which case everything should work nicely.

Ah, I have it now. I just committed the proper fix. It was properly cleaning up the Python objects now, but double-deleting the Panda objects.

In fact I used your code, but for some reason Windows didn’t crash during the double-delete. Whatever.

David

Awesome, many thanks!

For the record, I just added a follow-up fix, after realizing that the callback function might expect the OdeGeoms that it receives to persist for longer than its own lifetime.
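
As a sketch of the situation that follow-up guards against, consider a callback that stores the geoms it is handed. The example is plain Python with made-up names (FakeGeom, dispatch, kept_pairs) standing in for the real OdeGeom wrappers and the dispatch code; it only shows the lifetime expectation and assumes CPython’s reference counting.

```python
import weakref


class FakeGeom(object):
    """Hypothetical stand-in for an OdeGeom wrapper."""

    def __init__(self, name):
        self.name = name


kept_pairs = []


def handle_collisions(entry, geom1, geom2):
    # Storing the arguments extends their lifetime past this call.
    kept_pairs.append((geom1, geom2))


def dispatch(callback):
    """Simulate one callback invocation with freshly created wrappers."""
    g1, g2 = FakeGeom("a"), FakeGeom("b")
    probes = (weakref.ref(g1), weakref.ref(g2))
    callback("", g1, g2)
    return probes  # the local g1 and g2 go out of scope here
```

With handle_collisions the geoms stay alive after dispatch() returns, because kept_pairs still refers to them. With a callback that keeps nothing, they are freed as soon as the call ends.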

David

I’m very happy to confirm that everything works perfectly, no leaking and no crashes :smiley:. Thank you very much, you’re great. :slight_smile: