SegFault when loading a texture and compiling a shader in different tasks


#1

At startup, my application loads a lot of textures in the background (using an asynchronous task) and compiles several shaders. From time to time it crashes with a segmentation fault or an abort.

When running under gdb, the stack traces almost always end either in malloc, called by the NVidia driver while compiling a shader, or in dlmalloc, called by the PNMFileTypePNG::Reader::read_data method.

Is this a bug in the NVidia driver or in Panda, or is background texture loading not supported?

PNMFileTypePNG::Reader::read_data backtrace:

#0  0x00007ffff49c72e0 in dlmalloc () from /usr/lib/x86_64-linux-gnu/panda3d/libp3dtool.so.1.10
#1  0x00007ffff49c955c in MemoryHook::heap_alloc_array(unsigned long) () from /usr/lib/x86_64-linux-gnu/panda3d/libp3dtool.so.1.10
#2  0x00007ffff5560908 in PNMFileTypePNG::Reader::read_data(pixel*, unsigned short*) () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#3  0x00007ffff558806d in PNMImage::read(PNMReader*) () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#4  0x00007ffff545089a in Texture::do_read_one(Texture::CData*, Filename const&, Filename const&, int, int, int, int, LoaderOptions const&, bool, BamCacheRecord*) ()
   from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#5  0x00007ffff547f20d in Texture::do_read(Texture::CData*, Filename const&, Filename const&, int, int, int, int, bool, bool, LoaderOptions const&, BamCacheRecord*) ()
   from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#6  0x00007ffff545ae39 in Texture::read(Filename const&, LoaderOptions const&) () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#7  0x00007ffff6133f3f in Dtool_Texture_read_1229(_object*, _object*, _object*) () from /usr/lib/python2.7/dist-packages/panda3d/core.so
#8  0x00000000004bc4aa in PyEval_EvalFrameEx ()
#9  0x00000000004b9b66 in PyEval_EvalCodeEx ()
#10 0x00000000004d5669 in ?? ()
#11 0x00000000004a587e in PyObject_Call ()
#12 0x00000000004be51e in PyEval_EvalFrameEx ()
#13 0x00000000004b9b66 in PyEval_EvalCodeEx ()
#14 0x00000000004d5669 in ?? ()
#15 0x00000000004eef5e in ?? ()
#16 0x00000000004a587e in PyObject_Call ()
#17 0x00007ffff636eeb9 in PythonThread::call_python_func(_object*, _object*) () from /usr/lib/python2.7/dist-packages/panda3d/core.so
#18 0x00007ffff63739cc in PythonTask::do_python_task() () from /usr/lib/python2.7/dist-packages/panda3d/core.so
#19 0x00007ffff6374f38 in PythonTask::do_task() () from /usr/lib/python2.7/dist-packages/panda3d/core.so
#20 0x00007ffff53b77e8 in AsyncTask::unlock_and_do_task() () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#21 0x00007ffff53c2cfa in AsyncTaskChain::service_one_task(AsyncTaskChain::AsyncTaskChainThread*) () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#22 0x00007ffff53c34f8 in AsyncTaskChain::AsyncTaskChainThread::thread_main() () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#23 0x00007ffff53af2a4 in ThreadPosixImpl::root_func(void*) () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#24 0x00007ffff7bc16ba in start_thread (arg=0x7fffe2366700) at pthread_create.c:333
#25 0x00007ffff78f741d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

NVidia backtrace:

#0  _int_malloc (av=av@entry=0x7ffff7bb4b20 , bytes=bytes@entry=2064) at malloc.c:3727
#1  0x00007ffff7874184 in __GI___libc_malloc (bytes=2064) at malloc.c:2913
#2  0x00007fffeb8466a9 in ?? () from /usr/lib/nvidia-367/libGLX_nvidia.so.0
#3  0x00007fffe9b5259a in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#4  0x00007fffe9b52738 in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#5  0x00007fffe9ba5e51 in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#6  0x00007fffe9bbbe7c in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#7  0x00007fffe9b73b89 in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#8  0x00007fffe9b65df1 in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#9  0x00007fffea61eba5 in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#10 0x00007fffea65d14b in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#11 0x00007fffea626845 in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#12 0x00007fffeda17264 in GLShaderContext::glsl_compile_and_link() () from /usr/lib/x86_64-linux-gnu/panda3d/libpandagl.so
#13 0x00007fffeda17cbe in GLShaderContext::GLShaderContext(GLGraphicsStateGuardian*, Shader*) () from /usr/lib/x86_64-linux-gnu/panda3d/libpandagl.so
#14 0x00007fffeda184c3 in GLGraphicsStateGuardian::prepare_shader(Shader*) () from /usr/lib/x86_64-linux-gnu/panda3d/libpandagl.so
#15 0x00007ffff545e62e in PreparedGraphicsObjects::prepare_shader_now(Shader*, GraphicsStateGuardianBase*) () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#16 0x00007ffff5479516 in Shader::prepare_now(PreparedGraphicsObjects*, GraphicsStateGuardianBase*) () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#17 0x00007fffed9f358e in GLGraphicsStateGuardian::do_issue_shader() () from /usr/lib/x86_64-linux-gnu/panda3d/libpandagl.so
#18 0x00007fffeda14aba in GLGraphicsStateGuardian::set_state_and_transform(RenderState const*, TransformState const*) ()
   from /usr/lib/x86_64-linux-gnu/panda3d/libpandagl.so
#19 0x00007ffff5263396 in CullBinStateSorted::draw(bool, Thread*) () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#20 0x00007ffff51b2f0b in CullResult::draw(Thread*) () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#21 0x00007ffff5390370 in GraphicsEngine::do_draw(GraphicsOutput*, GraphicsStateGuardian*, DisplayRegion*, Thread*) ()
   from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#22 0x00007ffff539051c in GraphicsEngine::draw_bins(ov_set<PointerTo, IndirectLess, pvector<PointerTo > > const&, Thread*) () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#23 0x00007ffff5392d24 in GraphicsEngine::WindowRenderer::do_frame(GraphicsEngine*, Thread*) () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#24 0x00007ffff539365c in GraphicsEngine::render_frame() () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#25 0x00007ffff606e820 in Dtool_GraphicsEngine_render_frame_518(_object*, _object*) () from /usr/lib/python2.7/dist-packages/panda3d/core.so
#26 0x00000000004c0e41 in PyEval_EvalFrameEx ()
#27 0x00000000004b9b66 in PyEval_EvalCodeEx ()
#28 0x00000000004d5669 in ?? ()
#29 0x00000000004eef5e in ?? ()
#30 0x00000000004a587e in PyObject_Call ()
#31 0x00007ffff636ef0d in PythonThread::call_python_func(_object*, _object*) () from /usr/lib/python2.7/dist-packages/panda3d/core.so
#32 0x00007ffff63739cc in PythonTask::do_python_task() () from /usr/lib/python2.7/dist-packages/panda3d/core.so
#33 0x00007ffff6374f38 in PythonTask::do_task() () from /usr/lib/python2.7/dist-packages/panda3d/core.so
#34 0x00007ffff53b77e8 in AsyncTask::unlock_and_do_task() () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#35 0x00007ffff53c2cfa in AsyncTaskChain::service_one_task(AsyncTaskChain::AsyncTaskChainThread*) () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#36 0x00007ffff53c38a3 in AsyncTaskChain::do_poll() () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#37 0x00007ffff53c3a99 in AsyncTaskManager::poll() () from /usr/lib/x86_64-linux-gnu/panda3d/libpanda.so.1.10
#38 0x00007ffff60a4ba7 in Dtool_AsyncTaskManager_poll_138(_object*, _object*) () from /usr/lib/python2.7/dist-packages/panda3d/core.so
#39 0x00000000004c0e41 in PyEval_EvalFrameEx ()
#40 0x00000000004c141f in PyEval_EvalFrameEx ()
#41 0x00000000004b9b66 in PyEval_EvalCodeEx ()
#42 0x00000000004c1f56 in PyEval_EvalFrameEx ()
#43 0x00000000004c141f in PyEval_EvalFrameEx ()
#44 0x00000000004b9b66 in PyEval_EvalCodeEx ()
#45 0x00000000004eb69f in ?? ()
#46 0x00000000004e58f2 in PyRun_FileExFlags ()
#47 0x00000000004e41a6 in PyRun_SimpleFileExFlags ()
#48 0x00000000004938ce in Py_Main ()
#49 0x00007ffff7810830 in __libc_start_main (main=0x493370 <main>, argc=2, argv=0x7fffffffdff8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7fffffffdfe8) at ../csu/libc-start.c:291
#50 0x0000000000493299 in _start ()

#2

Please state exact version of Panda:

import panda3d
print(panda3d.__version__)

#3

Forgot the obvious:

1.10.0 Commit 9035b4a6105fe383ab7091104f7a238c4594b70b

Note: Built using STDFLOAT_DOUBLE=1 override


#4

Just tested with an official dev build (and so without the double override): the crashes still occur at the same places.


#5

Hmm, nothing obvious jumps out at me, especially without the debugging symbols. There are two different memory allocators at play here, so it’s hard to see how they might be causing a race condition, unless some form of memory corruption has already taken place.

Do you always observe the crash at the same location? If we can collect more stack traces of the crash happening in different places, it might give more clues as to the cause of this issue.
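To compare many crashes, it may help to tally where each one ends. Below is a minimal sketch, assuming each backtrace has been saved as plain gdb text. The helper names (`frame_names`, `crash_site`, `tally_crash_sites`) are made up for illustration and are not part of Panda3D or gdb.

```python
import re
from collections import Counter

# Matches gdb frame lines such as
#   "#0  0x00007ffff49c72e0 in dlmalloc () from ..."
#   "#0  sys_alloc (m=0x..., nb=1552) at ..."
# Group 1 is the frame number, group 2 the function name.
FRAME_RE = re.compile(r"^\s*#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(.+?) \(")


def frame_names(backtrace):
    """Return the function names of one gdb backtrace, innermost first."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(2))
    return names


def crash_site(backtrace, skip=("??",)):
    """Return the innermost frame that has a symbol name, or None."""
    for name in frame_names(backtrace):
        if name not in skip:
            return name
    return None


def tally_crash_sites(backtraces):
    """Count how often each crash site occurs across several backtraces."""
    return Counter(crash_site(bt) for bt in backtraces)
```

Frames without symbols (shown as `??`, as in the driver frames above) are skipped, so a crash inside the driver is attributed to the nearest named caller.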

Looking at the PNG loading code, it does make an awful lot of separate allocations. I could make it more efficient by allocating memory for all rows in one allocation, though this might just obscure the crash by making it rarer without actually fixing the underlying problem.


#6

Always at the same locations, either in PNMFileTypePNG::Reader::read_data or in GLShaderContext::glsl_compile_and_link. However, since loading the textures and generating the shaders are the main activities of the app at startup, this might be biased.

I will check if using JPG textures triggers the problem too.

What parameters should I add to makepanda to rebuild panda3d with debugging symbols enabled?


#7

The easiest way to get debugging symbols is just to add this before invoking makepanda:

export CXXFLAGS=-ggdb

You can also use --optimize=1, but this will also have many other effects on the build.
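If you drive the build from a script, the same thing can be sketched in Python. This is only an illustration: the script path `makepanda/makepanda.py` and whatever arguments you pass in `extra_args` are assumptions about your checkout, and the helper name is made up.

```python
import os
import sys


def makepanda_debug_invocation(extra_args, script="makepanda/makepanda.py"):
    """Return (command, environment) for a makepanda run with CXXFLAGS=-ggdb.

    The script path and extra_args are assumptions; adjust them to your
    own source tree and usual build options.
    """
    # Copy the current environment and set CXXFLAGS, which has the same
    # effect as "export CXXFLAGS=-ggdb" in the shell before the build.
    env = dict(os.environ)
    env["CXXFLAGS"] = "-ggdb"
    command = [sys.executable, script] + list(extra_args)
    return command, env
```

The returned pair can then be handed to `subprocess.check_call(command, env=env)` from the root of the source tree.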

Oh, when you get the gdb traceback, please also report the tracebacks of the other threads (you can use “thread 1” followed by “bt” to get the stack trace for thread 1, for example).
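Since the crash is intermittent, it can be convenient to let gdb dump every thread automatically instead of switching threads by hand. A small sketch that builds such a command line, assuming the application is started as `python main.py` (a placeholder name) and using gdb's `-batch`, `-ex` and `--args` options with the `thread apply all bt` command:

```python
import shlex


def gdb_all_threads_command(program_args):
    """Build a gdb invocation that runs the program and, once it stops
    (for example on SIGSEGV or SIGABRT), prints a backtrace for every thread."""
    return [
        "gdb", "-batch",
        "-ex", "run",
        "-ex", "thread apply all bt",
        "--args",
    ] + list(program_args)


# "main.py" is a placeholder for the real entry point of the application.
command = gdb_all_threads_command(["python", "main.py"])
print(" ".join(shlex.quote(part) for part in command))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` in a loop to collect traces from many runs.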


#8

FWIW, if I only use JPEG textures then there are no more crashes so far (though as it is a race condition it might be that I’m just lucky).
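To put a number on "just lucky": a rough sketch, assuming every run crashes independently with the same probability. The 10% figure in the usage note is only an example, not a measured rate, and the function names are made up.

```python
import math


def chance_of_clean_streak(crash_probability, runs):
    """Probability of seeing no crash at all in `runs` independent runs."""
    return (1.0 - crash_probability) ** runs


def runs_needed(crash_probability, confidence=0.95):
    """Smallest number of consecutive clean runs after which a bug with the
    given per-run crash probability would have shown up with probability
    of at least `confidence`."""
    if not 0.0 < crash_probability < 1.0:
        raise ValueError("crash_probability must be between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - crash_probability))
```

For example, if the PNG build crashed in roughly one run out of ten, `runs_needed(0.1)` gives the number of clean JPEG runs to aim for before concluding that the crash is really gone.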


#9

Here are the backtraces with the debug build:

(gdb) info threads
  Id   Target Id         Frame 
* 1    Thread 0x7ffff7fc1700 (LWP 19654) "python" 0x00007fffe9dabd5c in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
  5    Thread 0x7fffe6cdc700 (LWP 19660) "threaded-ml" 0x00007ffff78eb74d in poll () at ../sysdeps/unix/syscall-template.S:84
  6    Thread 0x7fffe64db700 (LWP 19661) "alsoft-mixer" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:143
  7    Thread 0x7fffe639a700 (LWP 19662) "python" 0x00007ffff7825428 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:54

Thread 1:

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007fffe9dabd5c in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
(gdb) bt
#0  0x00007fffe9dabd5c in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#1  0x00007fffe9dac2f1 in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#2  0x00007fffe9b66346 in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#3  0x00007fffea6206d3 in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#4  0x00007fffea6257f4 in ?? () from /usr/lib/nvidia-367/libnvidia-glcore.so.367.57
#5  0x00007fffed9f403b in GLShaderContext::glsl_compile_shader (this=this@entry=0x7fffe5e3c860, type=type@entry=Shader::ST_fragment)
    at panda/src/glstuff/glShaderContext_src.cxx:3098
#6  0x00007fffeda17605 in GLShaderContext::glsl_compile_and_link (this=this@entry=0x7fffe5e3c860)
    at panda/src/glstuff/glShaderContext_src.cxx:3168
#7  0x00007fffeda17cbe in GLShaderContext::GLShaderContext (this=0x7fffe5e3c860, glgsg=<optimized out>, s=0x1e20b70)
    at panda/src/glstuff/glShaderContext_src.cxx:278
#8  0x00007fffeda184c3 in GLGraphicsStateGuardian::prepare_shader (this=0x18e9b10, se=0x1e20b70)
    at panda/src/glstuff/glGraphicsStateGuardian_src.cxx:6068
#9  0x00007ffff545e62e in PreparedGraphicsObjects::prepare_shader_now (this=0x18eaee0, se=se@entry=0x1e20b70, gsg=0x18e9b10)
    at panda/src/gobj/preparedGraphicsObjects.cxx:873
#10 0x00007ffff5479516 in Shader::prepare_now (this=this@entry=0x1e20b70, prepared_objects=0x18eaee0, gsg=gsg@entry=0x18e9b10)
    at panda/src/gobj/shader.cxx:3696
#11 0x00007fffed9f358e in GLGraphicsStateGuardian::do_issue_shader (this=this@entry=0x18e9b10)
    at panda/src/glstuff/glGraphicsStateGuardian_src.cxx:7292
#12 0x00007fffeda14aba in GLGraphicsStateGuardian::set_state_and_transform (this=0x18e9b10, target=0x7fffe5884c50, 
    transform=0x7fffe5ebf190) at panda/src/glstuff/glGraphicsStateGuardian_src.cxx:11071
#13 0x00007ffff5263396 in CullBinStateSorted::draw (this=0x1f0a3d0, force=false, current_thread=0xa91660)
    at panda/src/cull/cullBinStateSorted.cxx:80
#14 0x00007ffff51b2f0b in CullResult::draw (this=this@entry=0x1f7aa20, current_thread=current_thread@entry=0xa91660)
    at panda/src/pgraph/cullResult.cxx:300
#15 0x00007ffff5390370 in GraphicsEngine::do_draw (this=this@entry=0x16284a0, win=win@entry=0x172f550, gsg=gsg@entry=0x18e9b10, 
    dr=0x1c6d010, current_thread=current_thread@entry=0xa91660) at panda/src/display/graphicsEngine.cxx:2080
#16 0x00007ffff539051c in GraphicsEngine::draw_bins (this=this@entry=0x16284a0, wlist=..., current_thread=current_thread@entry=0xa91660)
    at panda/src/display/graphicsEngine.cxx:1684
#17 0x00007ffff5392d24 in GraphicsEngine::WindowRenderer::do_frame (this=this@entry=0x16285a0, engine=engine@entry=0x16284a0, 
    current_thread=current_thread@entry=0xa91660) at panda/src/display/graphicsEngine.cxx:2509
#18 0x00007ffff539365c in GraphicsEngine::render_frame (this=0x16284a0) at panda/src/display/graphicsEngine.cxx:792
#19 0x00007ffff606e820 in Dtool_GraphicsEngine_render_frame_518 (self=<optimized out>) at built/tmp/libp3display_igate.cxx:20689
#20 0x00000000004c0e41 in PyEval_EvalFrameEx ()
#21 0x00000000004b9b66 in PyEval_EvalCodeEx ()
#22 0x00000000004d5669 in ?? ()
#23 0x00000000004eef5e in ?? ()
#24 0x00000000004a587e in PyObject_Call ()
#25 0x00007ffff636ef0d in PythonThread::call_python_func (function=0x7fffef82f460, args=args@entry=0x7fffef7606d0)
    at panda/src/pipeline/pythonThread.cxx:137
#26 0x00007ffff63739cc in PythonTask::do_python_task (this=this@entry=0x7ffff7fea2b0) at panda/src/event/pythonTask.cxx:474
#27 0x00007ffff6374f38 in PythonTask::do_task (this=0x7ffff7fea2b0) at panda/src/event/pythonTask.cxx:439
#28 0x00007ffff53b77e8 in AsyncTask::unlock_and_do_task (this=0x7ffff7fea2b0) at panda/src/event/asyncTask.cxx:427
#29 0x00007ffff53c2cfa in AsyncTaskChain::service_one_task (this=this@entry=0x10154b0, thread=thread@entry=0x0)
    at panda/src/event/asyncTaskChain.cxx:675
#30 0x00007ffff53c38a3 in AsyncTaskChain::do_poll (this=0x10154b0) at panda/src/event/asyncTaskChain.cxx:1196
#31 0x00007ffff53c3a99 in AsyncTaskManager::poll (this=0x1015010) at panda/src/event/asyncTaskManager.cxx:482
#32 0x00007ffff60a4ba7 in Dtool_AsyncTaskManager_poll_138 (self=<optimized out>) at built/tmp/libp3event_igate.cxx:4343
#33 0x00000000004c0e41 in PyEval_EvalFrameEx ()
#34 0x00000000004c141f in PyEval_EvalFrameEx ()
#35 0x00000000004b9b66 in PyEval_EvalCodeEx ()
#36 0x00000000004c1f56 in PyEval_EvalFrameEx ()
#37 0x00000000004c141f in PyEval_EvalFrameEx ()
#38 0x00000000004b9b66 in PyEval_EvalCodeEx ()
#39 0x00000000004eb69f in ?? ()
#40 0x00000000004e58f2 in PyRun_FileExFlags ()
#41 0x00000000004e41a6 in PyRun_SimpleFileExFlags ()
#42 0x00000000004938ce in Py_Main ()
#43 0x00007ffff7810830 in __libc_start_main (main=0x493370 <main>, argc=2, argv=0x7fffffffdff8, init=<optimized out>,
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdfe8) at ../csu/libc-start.c:291
#44 0x0000000000493299 in _start ()

Thread 7:

#0  0x00007ffff7825428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff782702a in __GI_abort () at abort.c:89
#2  0x00007ffff49c6675 in dlfree (mem=mem@entry=0x2828f00) at dtool/src/dtoolbase/dlmalloc_src.cxx:4798
#3  0x00007ffff49c91c1 in MemoryHook::heap_free_array (this=0xa68910, ptr=0x2828f00) at dtool/src/dtoolbase/memoryHook.cxx:459
#4  0x00007ffff5560aca in PNMFileTypePNG::Reader::read_data (this=0x7fffc80018d0, array=0x7fffe50f4010, alpha_data=0x0)
    at panda/src/pnmimagetypes/pnmFileTypePNG.cxx:405
#5  0x00007ffff558806d in PNMImage::read (this=this@entry=0x7fffe6398ab0, reader=0x7fffc80018d0) at panda/src/pnmimage/pnmImage.cxx:357
#6  0x00007ffff545089a in Texture::do_read_one (this=0x1f0a5f0, cdata=0x7fffe5e379d0, fullpath=..., alpha_fullpath=..., z=0, n=0, 
    primary_file_num_channels=0, alpha_file_channel=0, options=..., header_only=false, record=0x0) at panda/src/gobj/texture.cxx:3122
#7  0x00007ffff547f20d in Texture::do_read (this=this@entry=0x1f0a5f0, cdata=0x7fffe5e379d0, fullpath=..., alpha_fullpath=..., 
    primary_file_num_channels=primary_file_num_channels@entry=0, alpha_file_channel=alpha_file_channel@entry=0, z=0, n=0, 
    read_pages=false, read_mipmaps=false, options=..., record=0x0) at panda/src/gobj/texture.cxx:2987
#8  0x00007ffff545ae39 in Texture::read (this=0x1f0a5f0, fullpath=..., options=...) at panda/src/gobj/texture.cxx:557
#9  0x00007ffff6133f3f in Dtool_Texture_read_1229 (self=<optimized out>, args=<optimized out>, kwds=<optimized out>)
    at built/tmp/libp3gobj_igate.cxx:46059
#10 0x00000000004bc4aa in PyEval_EvalFrameEx ()
#11 0x00000000004b9b66 in PyEval_EvalCodeEx ()
#12 0x00000000004d5669 in ?? ()
#13 0x00000000004a587e in PyObject_Call ()
#14 0x00000000004be51e in PyEval_EvalFrameEx ()
#15 0x00000000004b9b66 in PyEval_EvalCodeEx ()
#16 0x00000000004d5669 in ?? ()
#17 0x00000000004eef5e in ?? ()
#18 0x00000000004a587e in PyObject_Call ()
#19 0x00007ffff636eeb9 in PythonThread::call_python_func (function=0x7fffef82f5a0, args=args@entry=0x7fffef6f1d10)
    at panda/src/pipeline/pythonThread.cxx:241
#20 0x00007ffff63739cc in PythonTask::do_python_task (this=this@entry=0x7ffff7feadd0) at panda/src/event/pythonTask.cxx:474
#21 0x00007ffff6374f38 in PythonTask::do_task (this=0x7ffff7feadd0) at panda/src/event/pythonTask.cxx:439
#22 0x00007ffff53b77e8 in AsyncTask::unlock_and_do_task (this=0x7ffff7feadd0) at panda/src/event/asyncTask.cxx:427
#23 0x00007ffff53c2cfa in AsyncTaskChain::service_one_task (this=0x1cf7f10, thread=thread@entry=0x1cf7a00)
    at panda/src/event/asyncTaskChain.cxx:675
#24 0x00007ffff53c34f8 in AsyncTaskChain::AsyncTaskChainThread::thread_main (this=0x1cf7a00) at panda/src/event/asyncTaskChain.cxx:1430
#25 0x00007ffff53af2a4 in ThreadPosixImpl::root_func (data=0x1cf7a68) at panda/src/pipeline/threadPosixImpl.cxx:264
#26 0x00007ffff7bc16ba in start_thread (arg=0x7fffe639a700) at pthread_create.c:333
#27 0x00007ffff78f741d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Threads 5 and 6:

#0  0x00007ffff78eb74d in poll () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007fffe8b48861 in ?? () from /usr/lib/x86_64-linux-gnu/libpulse.so.0
#2  0x00007fffe8b39e11 in pa_mainloop_poll () from /usr/lib/x86_64-linux-gnu/libpulse.so.0
#3  0x00007fffe8b3a4ae in pa_mainloop_iterate () from /usr/lib/x86_64-linux-gnu/libpulse.so.0
#4  0x00007fffe8b3a560 in pa_mainloop_run () from /usr/lib/x86_64-linux-gnu/libpulse.so.0
#5  0x00007fffe8b487a9 in ?? () from /usr/lib/x86_64-linux-gnu/libpulse.so.0
#6  0x00007fffe86e0078 in ?? () from /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-8.0.so
#7  0x00007ffff7bc16ba in start_thread (arg=0x7fffe6cdc700) at pthread_create.c:333
#8  0x00007ffff78f741d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb) thread 6
[Switching to thread 6 (Thread 0x7fffe64db700 (LWP 19661))]
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:143
143	../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S: No such file or directory.
(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:143
#1  0x00007fffe8b48e48 in pa_threaded_mainloop_wait () from /usr/lib/x86_64-linux-gnu/libpulse.so.0
#2  0x00007fffe8fb6a69 in ?? () from /usr/lib/x86_64-linux-gnu/libopenal.so.1
#3  0x00007fffe8fb9f27 in ?? () from /usr/lib/x86_64-linux-gnu/libopenal.so.1
#4  0x00007ffff7bc16ba in start_thread (arg=0x7fffe64db700) at pthread_create.c:333
#5  0x00007ffff78f741d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

#10

That one looks like an assertion that’s triggering within dlmalloc. What was the exact text of the assertion?


#11

It’s actually rather random. Most of the time the thread is killed with SIGABRT without any text. I will try to get the assertion text, but so far I’ve been unlucky. It did crash several times at another place in dlmalloc though, see below.

By the way, the files I’m trying to load are 8-bit grayscale PNGs.

Thread 7 "python" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe639a700 (LWP 20528)]
sys_alloc (m=0x7ffff4be2240 <_gm_>, nb=1552) at dtool/src/dtoolbase/dlmalloc_src.cxx:4213
4213	      while (sp != 0 && tbase != sp->base + sp->size)
(gdb) bt
#0  sys_alloc (m=0x7ffff4be2240 <_gm_>, nb=1552) at dtool/src/dtoolbase/dlmalloc_src.cxx:4213
#1  dlmalloc (bytes=bytes@entry=1536) at dtool/src/dtoolbase/dlmalloc_src.cxx:4685
#2  0x00007ffff49c955c in MemoryHook::heap_alloc_array (this=0xa68920, size=1536) at dtool/src/dtoolbase/memoryHook.cxx:332
#3  0x00007ffff5560908 in PNMFileTypePNG::Reader::read_data (this=0x7fffc80018d0, array=0x7fffe50f4010, alpha_data=0x0)
    at panda/src/pnmimagetypes/pnmFileTypePNG.cxx:345
#4  0x00007ffff558806d in PNMImage::read (this=this@entry=0x7fffe6398ab0, reader=0x7fffc80018d0) at panda/src/pnmimage/pnmImage.cxx:357
#5  0x00007ffff545089a in Texture::do_read_one (this=0x1eda1b0, cdata=0x7fffe5cceaf0, fullpath=..., alpha_fullpath=..., z=0, n=0, 
    primary_file_num_channels=0, alpha_file_channel=0, options=..., header_only=false, record=0x0) at panda/src/gobj/texture.cxx:3122
#6  0x00007ffff547f20d in Texture::do_read (this=this@entry=0x1eda1b0, cdata=0x7fffe5cceaf0, fullpath=..., alpha_fullpath=..., 
    primary_file_num_channels=primary_file_num_channels@entry=0, alpha_file_channel=alpha_file_channel@entry=0, z=0, n=0, 
    read_pages=false, read_mipmaps=false, options=..., record=0x0) at panda/src/gobj/texture.cxx:2987
#7  0x00007ffff545ae39 in Texture::read (this=0x1eda1b0, fullpath=..., options=...) at panda/src/gobj/texture.cxx:557
#8  0x00007ffff6133f3f in Dtool_Texture_read_1229 (self=<optimized out>, args=<optimized out>, kwds=<optimized out>)
    at built/tmp/libp3gobj_igate.cxx:46059
#9  0x00000000004bc4aa in PyEval_EvalFrameEx ()
#10 0x00000000004b9b66 in PyEval_EvalCodeEx ()
#11 0x00000000004d5669 in ?? ()
#12 0x00000000004a587e in PyObject_Call ()
#13 0x00000000004be51e in PyEval_EvalFrameEx ()
#14 0x00000000004b9b66 in PyEval_EvalCodeEx ()
#15 0x00000000004d5669 in ?? ()
#16 0x00000000004eef5e in ?? ()
#17 0x00000000004a587e in PyObject_Call ()
#18 0x00007ffff636eeb9 in PythonThread::call_python_func (function=0x7fffef82e5f0, args=args@entry=0x7fffef51e6d0)
    at panda/src/pipeline/pythonThread.cxx:241
#19 0x00007ffff63739cc in PythonTask::do_python_task (this=this@entry=0x7ffff7feadd0) at panda/src/event/pythonTask.cxx:474
#20 0x00007ffff6374f38 in PythonTask::do_task (this=0x7ffff7feadd0) at panda/src/event/pythonTask.cxx:439
#21 0x00007ffff53b77e8 in AsyncTask::unlock_and_do_task (this=0x7ffff7feadd0) at panda/src/event/asyncTask.cxx:427
#22 0x00007ffff53c2cfa in AsyncTaskChain::service_one_task (this=0x1cf6f10, thread=thread@entry=0x1cf6a00)
    at panda/src/event/asyncTaskChain.cxx:675
#23 0x00007ffff53c34f8 in AsyncTaskChain::AsyncTaskChainThread::thread_main (this=0x1cf6a00) at panda/src/event/asyncTaskChain.cxx:1430
#24 0x00007ffff53af2a4 in ThreadPosixImpl::root_func (data=0x1cf6a68) at panda/src/pipeline/threadPosixImpl.cxx:264
#25 0x00007ffff7bc16ba in start_thread (arg=0x7fffe639a700) at pthread_create.c:333
#26 0x00007ffff78f741d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

#12

Small follow-up: since the changes in the PNG loader, I think I got only a couple of crashes out of hundreds of tests. The exact place is always either in dlmalloc (at a random place) or in the Nvidia driver. I am starting to believe that there is a bug in the Nvidia GLSL compiler that corrupts the whole application heap.


#13

If you are getting weird, random segfaults, you might want to run some tests on your system memory (e.g., by using memtest) just to rule out bad memory.