Anyone up for OSX work?

When I said console access, I meant access to the keyboard/display, in contrast to remote access (ssh) :slight_smile:

I tried pview, and it runs pretty normally with no errors.

If I do however run my full project through gdb with setShaderAuto() enabled, I get the following:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
0x0294d11c in RenderState::get_generated_shader ()
(gdb) where 
#0  0x0294d11c in RenderState::get_generated_shader ()
#1  0x0623744e in GLGraphicsStateGuardian::set_state_and_transform ()
#2  0x02b4fbf3 in CullBinStateSorted::draw ()
#3  0x02898690 in CullResult::draw ()
#4  0x02d55266 in GraphicsEngine::do_draw ()
#5  0x02d56bb9 in GraphicsEngine::draw_bins ()
#6  0x02d56ef8 in GraphicsEngine::draw_bins ()
#7  0x02d59d2a in GraphicsEngine::WindowRenderer::do_frame ()
#8  0x02d6b29e in GraphicsEngine::render_frame ()
#9  0x02d879a6 in Dtool_GraphicsEngine_render_frame_482 ()
#10 0x0018d806 in PyEval_EvalFrameEx ()
#11 0x0018f45b in PyEval_EvalCodeEx ()
#12 0x00139c27 in PyFunction_SetClosure ()
#13 0x0011fd3d in PyObject_Call ()
#14 0x001285f8 in PyMethod_New ()
#15 0x0011fd3d in PyObject_Call ()
#16 0x02dcfc6a in Thread::call_python_func ()
#17 0x02de5d2b in PythonTask::do_python_task ()
#18 0x02de7094 in AsyncTask::unlock_and_do_task ()
#19 0x02df0998 in AsyncTaskChain::service_one_task ()
#20 0x02df12c6 in AsyncTaskChain::do_poll ()
#21 0x02df137b in AsyncTaskManager::poll ()
#22 0x02e07306 in Dtool_AsyncTaskManager_poll_123 ()
#23 0x0018d806 in PyEval_EvalFrameEx ()
#24 0x0018d9e8 in PyEval_EvalFrameEx ()
#25 0x0018f45b in PyEval_EvalCodeEx ()
#26 0x0018da85 in PyEval_EvalFrameEx ()
#27 0x0018d9e8 in PyEval_EvalFrameEx ()
#28 0x0018f45b in PyEval_EvalCodeEx ()
#29 0x00139c27 in PyFunction_SetClosure ()
#30 0x0011fd3d in PyObject_Call ()
#31 0x001285f8 in PyMethod_New ()
#32 0x0011fd3d in PyObject_Call ()
#33 0x00188b15 in PyEval_CallObjectWithKeywords ()
#34 0x0012482d in PyInstance_New ()
#35 0x0011fd3d in PyObject_Call ()
#36 0x0018db1a in PyEval_EvalFrameEx ()
#37 0x0018f45b in PyEval_EvalCodeEx ()
#38 0x0018f548 in PyEval_EvalCode ()
#39 0x001a69ec in PyErr_Display ()
#40 0x001a7016 in PyRun_FileExFlags ()
#41 0x001a8982 in PyRun_SimpleFileExFlags ()
#42 0x001b3c03 in Py_Main ()
#43 0x00001fca in ?? ()
(gdb) 

Mhm, that explains it, it wasn't a debug build. Try again with the new dmg I just uploaded. It also includes a potential fix to what might be the problem, but I'm not sure.

Doesn't pview crash too when you hit the "P" key? (Sorry, I forgot to mention that.) (Also, a full traceback would be useful too, with "bt full".)

Sorry about that, I didn't know of the "P" key :slight_smile:

Yes, it crashes.

The new dmg doesn't seem to have debug symbols either, since even if I don't have the sources installed, it should show something like this:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000090
0x00001fdf in main (argc=1, argv=0xbffff548) at test.c:14
14	test.c: No such file or directory.
	in test.c
(gdb) list
9	in test.c

Here's the binary info:

machine:proj user$ ls -l /Applications/Panda3D/1.6.0/bin/pview
-rwxr-xr-x@ 1 user  admin  114096 28 feb 13:55 /Applications/Panda3D/1.6.0/bin/pview
machine:proj user$ file /Applications/Panda3D/1.6.0/bin/pview
/Applications/Panda3D/1.6.0/bin/pview: Mach-O universal binary with 2 architectures
/Applications/Panda3D/1.6.0/bin/pview (for architecture ppc7400):	Mach-O executable ppc
/Applications/Panda3D/1.6.0/bin/pview (for architecture i386):	Mach-O executable i386

Here is the gdb output from the latest version:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
0x02da7caf in RenderState::get_generated_shader ()
(gdb) where
#0  0x02da7caf in RenderState::get_generated_shader ()
#1  0x01126136 in GLGraphicsStateGuardian::set_state_and_transform ()
#2  0x02faa923 in CullBinStateSorted::draw ()
#3  0x02ceda10 in CullResult::draw ()
#4  0x031c2ad6 in GraphicsEngine::do_draw ()
#5  0x031c4429 in GraphicsEngine::draw_bins ()
#6  0x031c4768 in GraphicsEngine::draw_bins ()
#7  0x031c759a in GraphicsEngine::WindowRenderer::do_frame ()
#8  0x031d8b0e in GraphicsEngine::render_frame ()
#9  0x00068a7b in PandaFramework::task_igloop ()
#10 0x03251874 in GenericAsyncTask::do_task ()
#11 0x03255cf4 in AsyncTask::unlock_and_do_task ()
#12 0x0325f5f8 in AsyncTaskChain::service_one_task ()
#13 0x0325ff26 in AsyncTaskChain::do_poll ()
#14 0x0325ffdb in AsyncTaskManager::poll ()
#15 0x00068a45 in PandaFramework::do_frame ()
#16 0x00068c0c in PandaFramework::main_loop ()
#17 0x00002d80 in main ()
(gdb) bt full
#0  0x02da7caf in RenderState::get_generated_shader ()
No symbol table info available.
#1  0x01126136 in GLGraphicsStateGuardian::set_state_and_transform ()
No symbol table info available.
#2  0x02faa923 in CullBinStateSorted::draw ()
No symbol table info available.
#3  0x02ceda10 in CullResult::draw ()
No symbol table info available.
#4  0x031c2ad6 in GraphicsEngine::do_draw ()
No symbol table info available.
#5  0x031c4429 in GraphicsEngine::draw_bins ()
No symbol table info available.
#6  0x031c4768 in GraphicsEngine::draw_bins ()
No symbol table info available.
#7  0x031c759a in GraphicsEngine::WindowRenderer::do_frame ()
No symbol table info available.
#8  0x031d8b0e in GraphicsEngine::render_frame ()
No symbol table info available.
#9  0x00068a7b in PandaFramework::task_igloop ()
No symbol table info available.
#10 0x03251874 in GenericAsyncTask::do_task ()
No symbol table info available.
#11 0x03255cf4 in AsyncTask::unlock_and_do_task ()
No symbol table info available.
#12 0x0325f5f8 in AsyncTaskChain::service_one_task ()
No symbol table info available.
#13 0x0325ff26 in AsyncTaskChain::do_poll ()
No symbol table info available.
#14 0x0325ffdb in AsyncTaskManager::poll ()
No symbol table info available.
#15 0x00068a45 in PandaFramework::do_frame ()
No symbol table info available.
#16 0x00068c0c in PandaFramework::main_loop ()
No symbol table info available.
#17 0x00002d80 in main ()
No symbol table info available.
(gdb) 

Darn it. Must be a bug in makepanda.
I've made it print some debug info now - can you try the new dmg?

Since where and bt full don't give any usable info, I'll skip those.

$ pview
Known pipe types:
  osxGraphicsPipe
(all display modules loaded.)
a:0x1c698714:3
b:1
c:0xf8e6cc:3
d:0xf8e1a4:16
1:0xf8e6cc
2:0
3:0
Bus error

$ gdb pview
...
a:0x1c82a524:3
b:1
c:0xf8e6cc:3
d:0xf8e1a4:16
1:0xf8e6cc
2:0
3:0

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
0x02da7b6e in RenderState::get_generated_shader ()

The first run is direct, and the second is through gdb.

The problem seems to be that ShaderGeneratorBase::get_default() returns NULL. I am lost though, I don't get the ShaderGeneratorBase thing - it's new to me. I'm even more confused as to why I'm not getting the error here.
It appears David created that file when he separated pgraph and pgraphnodes.
It's supposed to be initialized in config_pgraphnodes.cxx:

ShaderGenerator::set_default(new ShaderGenerator());

But I donā€™t see anything like ShaderGeneratorBase::set_default being set.

drwr, could you maybe shine some light on this?

I only created ShaderGeneratorBase to allow the division of pgraph into two smaller directories, pgraph and pgraphnodes, simply because pgraph was getting too large to compile on certain platforms.

But ShaderGeneratorBase::set_default() is the same thing as ShaderGenerator::set_default(), which is, as you noted, called in config_pgraphnodes. There's no need to call anything else.

David

It's very weird that it's still NULL on his machine, then. WhiteFang, try out the new build, which prints some more info, to see if set_default gets called at all.

From the March 02 version:

$ pview
Set here
default SG :0x1013024
default SGB:0x1013024
Known pipe types:
osxGraphicsPipe
(all display modules loaded.)
a:0x1c716814:3
b:1
c:0xf8e6cc:3
d:0xf8e1a4:16
1:0xf8e6cc
2:0
3:0
Bus error

Mysterious. I'm stumped.

It means set_default certainly does get set to a ShaderGenerator instance.
But later, get_default returns NULL.
Somewhere in between that, it must have been set to NULL.
But I can't find any single reference to set_default or _default_generator in the Panda source except for the one in pgraphnodes.

I concur. There are no references that I can find.

Furthermore, the code:

ShaderGenerator *ShaderGenerator::
get_default() {
  if (_default_generator == (ShaderGenerator *)NULL) {
    _default_generator = new ShaderGenerator;
  }
  return _default_generator;
}

Can't get much simpler...

GCC on Mac OS X returns 0 for undefined variables - same as NULL, so the if() should work.

So if this is the problem, then I'm stumped as well :neutral_face:

Try putting a cerr statement within the get_default() method, to ensure that it is being called and that it is initializing the pointer correctly. Also put a cerr statement when the pointer is being queried, to ensure that this happens after get_default() has been called.

David

Okay. WhiteFang, try the new DMG. I totally recompiled it from scratch (this time, gdb "bt full" should work too). I added some extra debug output in set_default, get_default, and the constructor and destructor.

(gdb) run
Starting program: /Applications/Panda3D/1.6.0/bin/pview
Reading symbols for shared libraries ++++++++++warning: .o file "/Developer/SDKs/MacOSX10.4u.sdk/usr/lib/gcc/i686-apple-darwin9/4.0.1/libgcc_eh.a(unwind-dw2.o)" more recent than executable timestamp
warning: .o file "/Developer/SDKs/MacOSX10.4u.sdk/usr/lib/gcc/i686-apple-darwin9/4.0.1/libgcc_eh.a(unwind-dw2-fde-darwin.o)" more recent than executable timestamp
... done
Set here
SGB Constructed : 0x7212f74
current :0
set_default called, with SGB 0x7212f74
SGB::get_default called: 0x7212f74:1
default SG :0x7212f74
SGB::get_default called: 0x7212f74:1
default SGB:0x7212f74
Reading symbols for shared libraries warning: Could not find object file "/Users/pro-rsoft/panda3d/built/tmp/pandagl_pandagl.o" - no debug information available for "panda/metalibs/pandagl/pandagl.cxx".

warning: Could not find object file "/Users/pro-rsoft/panda3d/built/tmp/glgsg_config_glgsg.o" - no debug information available for "panda/src/glgsg/config_glgsg.cxx".

warning: Could not find object file "/Users/pro-rsoft/panda3d/built/tmp/glgsg_glgsg.o" - no debug information available for "panda/src/glgsg/glgsg.cxx".

warning: Could not find object file "/Users/pro-rsoft/panda3d/built/tmp/osxdisplay_composite.o" - no debug information available for "panda/src/osxdisplay/osxdisplay_composite.mm".

.warning: Could not find object file "/Users/pro-rsoft/panda3d/built/tmp/glstuff_glpure.o" - no debug information available for "panda/src/glstuff/glpure.cxx".

... done
Known pipe types:
osxGraphicsPipe
(all display modules loaded.)
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries ... done
Reading symbols for shared libraries . done
Reading symbols for shared libraries ... done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries ... done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
a:0x1e429964:3
b:1
c:0x27d6cc:3
d:0x27d1a4:16
1:0x27d6cc
2:0
SGB::get_default called: 0
3:0

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
0x039f4dfe in RenderState::get_generated_shader ()
(gdb) where
#0 0x039f4dfe in RenderState::get_generated_shader ()
#1 0x07325969 in GLGraphicsStateGuardian::set_state_and_transform ()
#2 0x03bf86e7 in CullHandler::draw ()
#3 0x03bf6d31 in CullBinStateSorted::draw ()
#4 0x0392eefc in CullResult::draw ()
#5 0x03e0d6c1 in GraphicsEngine::do_draw ()
#6 0x03e0e3c9 in GraphicsEngine::draw_bins ()
#7 0x03e0e6f4 in GraphicsEngine::draw_bins ()
#8 0x03e116b5 in GraphicsEngine::WindowRenderer::do_frame ()
#9 0x03e124ab in GraphicsEngine::render_frame ()
#10 0x000d870d in PandaFramework::task_igloop ()
#11 0x03ea0dd4 in GenericAsyncTask::do_task ()
#12 0x03ea9c8f in AsyncTask::unlock_and_do_task ()
#13 0x03eade4e in AsyncTaskChain::service_one_task ()
#14 0x03eae981 in AsyncTaskChain::do_poll ()
#15 0x03eaea6c in AsyncTaskManager::poll ()
#16 0x000d8505 in PandaFramework::do_frame ()
#17 0x000d8549 in PandaFramework::main_loop ()
#18 0x00003288 in main ()
(gdb) bt full
#0 0x039f4dfe in RenderState::get_generated_shader ()
No symbol table info available.
#1 0x07325969 in GLGraphicsStateGuardian::set_state_and_transform ()
No symbol table info available.
#2 0x03bf86e7 in CullHandler::draw ()
No symbol table info available.
#3 0x03bf6d31 in CullBinStateSorted::draw ()
No symbol table info available.
#4 0x0392eefc in CullResult::draw ()
No symbol table info available.
#5 0x03e0d6c1 in GraphicsEngine::do_draw ()
No symbol table info available.
#6 0x03e0e3c9 in GraphicsEngine::draw_bins ()
No symbol table info available.
#7 0x03e0e6f4 in GraphicsEngine::draw_bins ()
No symbol table info available.
#8 0x03e116b5 in GraphicsEngine::WindowRenderer::do_frame ()
No symbol table info available.
#9 0x03e124ab in GraphicsEngine::render_frame ()
No symbol table info available.
#10 0x000d870d in PandaFramework::task_igloop ()
No symbol table info available.
#11 0x03ea0dd4 in GenericAsyncTask::do_task ()
No symbol table info available.
#12 0x03ea9c8f in AsyncTask::unlock_and_do_task ()
No symbol table info available.
#13 0x03eade4e in AsyncTaskChain::service_one_task ()
No symbol table info available.
#14 0x03eae981 in AsyncTaskChain::do_poll ()
No symbol table info available.
#15 0x03eaea6c in AsyncTaskManager::poll ()
No symbol table info available.
#16 0x000d8505 in PandaFramework::do_frame ()
No symbol table info available.
#17 0x000d8549 in PandaFramework::main_loop ()
No symbol table info available.
#18 0x00003288 in main ()
No symbol table info available.
(gdb)

Really? That's weird. Can you try the new DMG? (Not through gdb this time.)
It just prints out the address of the default SG pointer.

I still don't know why gdb prints no extra debug information.

I'll try when I get home (in a few hours).

I did some testing, and yes, if a variable isn't assigned a value, it returns 0.

That is - unless some other variable hasn't been declared/assigned a value. Then it returns 4096.

$ cat und.c
#include <stdio.h>

void main(void)
{
	int i=4;
	int j;
	int *k=&i;
	int *l;
	printf("%d\n%d\n",i,j);
	printf("%d\n%d\n",k,l);
}

gives

$ ./und
4
0
-1073743696
4096

No, I doubt it. It's just returning a random value, whatever happens to be in memory at that particular address at the time the program started. It might be 0 more often than not, but that's just the way the ball bounces.

David

And the latest run :slight_smile:

$ pview
Set here
SGB Constructed : 0x7213014
  current :0x5cbc698
set_default called
  new 0x5cbc698
SGB::get_default called: 0x5cbc698
default SG :0x7213014
SGB::get_default called: 0x5cbc698
default SGB:0x7213014
Known pipe types:
  osxGraphicsPipe
(all display modules loaded.)
a:0x1e29daf4:3
b:1
c:0x27d6cc:3
d:0x27d1a4:16
1:0x27d6cc
2:0
SGB::get_default called: 0x5cbc698
3:0
Bus error

The address of the pointer is the same. Nothing else sets the variable. Still, suddenly it becomes NULL.
So unless Panda calls *((int *)0x5cbc698) = 0; somewhere, this is impossible in my eyes. I'm lost here.

Ah, I think I understand what's going on. This is a static-init ordering problem.

This is one of those real nasty C++ problems; and one that I'm quite familiar with (we've been fighting it in various forms for years). It's also one of the reasons I'm not at all a fan of doing a lot of stuff automatically in static init, but one of our early Panda developers thought this was a swell idea and started us down this path, and it's too late to go back now.

Static init is a concept that was introduced with the development of C++ and its constructors. Originally, when all compiled programs were written in C or some similar non-object-oriented language, there wasn't much code that ran before main() was called; just some startup stuff hardcoded into the system runtime libraries. C allows you to define global or "static" variables outside of any function scope, and even give them initial values, like this:

int x = 10;
int main() {
  ...
}

which means that at the time main() is called, x already exists and has the value 10. This was implemented by preloading a memory image that already had the right bits in the right place when it was loaded from disk; no code was necessary to run before main in order to assign 10 to x.

But, now introduce C++ and its constructors. Now you can declare an object outside of main that has a constructor. According to C++ semantics, that constructor has to be called to initialize that object, and thus you now have user code that is running before main:

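#include <iostream>
using std::cerr;

class Thing {
public:
  // Runs during static initialization, before main() is entered.
  Thing() { cerr << "initializing\n"; }
};
Thing x;
int main() {
  cerr << "running main\n";
  return 0;
}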

This caused a sea change in system library support, because suddenly the system runtime loader has to support calling user code automatically when a program is started, or even when a .so is loaded in at runtime.

But anyway. Part of Panda's low-level design takes advantage of these static initializers to call all sorts of setup functions when the libraries are loaded. init_libpgraphnodes() is one of those functions, and one of the things it calls is ShaderGenerator::set_default(new ShaderGenerator()). This gets called at static init time, by virtue of a class object with a constructor, and so it is supposed to be called automatically when libpgraphnodes.so gets loaded into the running program. So, we're supposed to be guaranteed that the ShaderGenerator already has a default value set by the time we start running.

But wait! We also have a static constructor in libpgraph.so. It looks like this:

PT(ShaderGeneratorBase) ShaderGeneratorBase::_default_generator;

Don't see the static constructor? It's hard to see, isn't it? Welcome to the joys of C++, where code can be hidden from the programmer. In fact, there's a default constructor for the class PT(ShaderGeneratorBase), and the default constructor's job is to initialize its pointer to NULL.

So, as long as libpgraph.so's static constructors are called before the ones in libpgraphnodes.so, then everything is good: the default constructor for _default_generator will be called, ensuring that pointer is NULL. Then the static constructors in libpgraphnodes.so will be called, which will call set_default(), reassigning the pointer to a valid value. But, if the static constructors happen to get called in the opposite order, we have a terrible situation: the set_default() will be called first, assigning the pointer to a valid value, and then the default constructor will be called later, reassigning the pointer to NULL! That's certainly what's happening here.

Unfortunately, the system does not guarantee any ordering of static init constructors between different .so's. It's absolutely unpredictable. So on one system, it might call these in the correct order, and on another system, it might call them in the incorrect order. The ordering might even change from one day to the next.

So, basically, I introduced this bug when I split up libpgraph.so and libpgraphnodes.so, because in doing so I introduced a nondeterministic behavior between these static initializers. But because C++ tries so hard to make things automatic, the bug is extremely hard to see until it bites you, and you spend days isolating it down to discover that a pointer is getting reset to NULL after you had thought it was properly set.

I'll fix the bug now. It's easy to fix, by replacing the PT(ShaderGeneratorBase) with an ordinary ShaderGeneratorBase * pointer. The reason this will fix the problem is that an ordinary pointer doesn't have a constructor, so its default value will be set to NULL by preloading the memory image, and so there's no longer an ordering issue between static initializers. (I'll also have to explicitly manage the reference counts in set_default() to compensate for this change, but that's not so bad.)

My apologies for the long trip down a dark corridor I caused you guys.

David