[size=100]History[/size]
[size=75](Skip to ‘Isolating the problem’ if you have a short attention span and just want to look at the 3-line snippet illustrating the actual problem.)[/size]
I was trying to re-create the Panda physics example in C++ when I stumbled on a strange problem.
The application would segfault in PhysicsManager->attach_physical_node() with the following backtrace:
#0 0x080558e8 in PointerTo<Physical>::operator Physical* (this=0x15) at /usr/local/panda3d/include/pointerTo.I:83
#1 0x08055911 in PhysicalNode::get_physical (this=0x82af0f4, index=0) at /usr/local/panda3d/include/physicalNode.I:32
#2 0x08056ced in PhysicsManager::attach_physical_node (this=0xbfd0477c, at /usr/local/panda3d/include/physicsManager.I:70
In other words, the physics manager calls get_physical on the ActorNode and gets a bad pointer back.
At first I figured I had forgotten to add something to the ActorNode, but as it turns out, the ActorNode constructor in the Panda source does this:
ActorNode(const string &name) : PhysicalNode(name) {
_contact_vector = LVector3f::zero();
add_physical(new Physical(1, true));
// <snip> ...
In other words, a Physical object should always be in the _physicals vector (which is in ActorNode’s parent class, PhysicalNode), as one gets added in the ctor.
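To spell out what I expect here, below is a minimal sketch in plain C++ that mirrors the pattern. FakePhysical, FakePhysicalNode and FakeActorNode are hypothetical stand-ins I made up for illustration; they are not the Panda classes and use std::shared_ptr and std::vector instead of PT() and pvector.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins, NOT the real Panda classes.
struct FakePhysical {};

class FakePhysicalNode {
public:
  explicit FakePhysicalNode(std::string name) : _name(std::move(name)) {}

  void add_physical(std::shared_ptr<FakePhysical> p) {
    _physicals.push_back(std::move(p));
  }

  std::size_t get_num_physicals() const { return _physicals.size(); }

private:
  std::string _name;
  // By-value member: constructed together with the node itself.
  std::vector<std::shared_ptr<FakePhysical>> _physicals;
};

class FakeActorNode : public FakePhysicalNode {
public:
  explicit FakeActorNode(const std::string &name) : FakePhysicalNode(name) {
    // Same idea as the ActorNode ctor: one physical is added up front.
    add_physical(std::make_shared<FakePhysical>());
  }
};

// Builds a node and reports how many physicals it holds right afterwards.
std::size_t count_after_construction() {
  FakeActorNode an("jetpack-guy-physics");
  return an.get_num_physicals();
}
```

With ordinary classes like these, the count right after construction can only be 1, which is why a garbage value from the real thing is so surprising.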
So I continued debugging by printing out ->get_num_physicals().
get_num_physicals simply returns the vector’s size. The vector is a plain by-value member declared in the PhysicalNode header (not a pointer), so there is no reason for it to be uninitialised or whatever. Yet I get:
num_physicals: 34261536
which smells like uninitialised/bad memory to me.
So I tried to make the simplest possible test case:
create an ActorNode and add a Physical to it.
[size=100]Isolating the problem[/size]
[size=75](Start reading here if you have a short attention span.)[/size]
#include <actorNode.h>
#include <physical.h>

int main(int argc, char* argv[])
{
    PT(ActorNode) an = new ActorNode("jetpack-guy-physics");
    PT(Physical) test = new Physical();

    // The line below throws these runtime errors:
    //
    // Invalid TypeHandle index -1208867616! Is memory corrupt?
    //
    // testprog: dtool/src/dtoolbase/typeHandle.cxx:56:
    // void TypeHandle::inc_memory_usage
    // (TypeHandle::MemoryClass, int):
    //
    // Assertion `rnode != (TypeRegistryNode *)__null' failed.
    an->add_physical(test);

    return 0;
}
As mentioned in the source comments above, this very simple, essentially 3-line example throws runtime errors that point at uninitialised memory!
Nothing can possibly go out of scope, everything is allocated with new, and used immediately afterward. How is this possible?!
The only way I thought it could be possible is if the object called delete this in the constructor or something, or had a self-destructing = operator, who knows. None of the logical explanations made sense, until I looked at how _physicals is defined in physicalNode.h:
typedef pvector<PT(Physical)> PhysicalsVector;
PhysicalsVector _physicals;
That would look fine if pvector were a normal STL vector.
But when I look at the pvector source, I see it uses a custom STL allocator and does some very, very creative memory management:
class pvector : public vector<Type, pallocator_array<Type> >
This means the problem goes beyond the scope of my intellect, as lots of things can happen that normally don’t happen once you’re dealing with ‘creative’ memory management.
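For anyone unfamiliar with the pattern, here is a rough sketch of what "a vector subclass with a custom allocator" looks like in general. counting_allocator, bytes_allocated and my_pvector are hypothetical names of my own; this is not what pallocator_array actually does, only the general shape of such a declaration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical global counter of bytes requested through the allocator.
inline std::size_t &bytes_allocated() {
  static std::size_t total = 0;
  return total;
}

// Hypothetical minimal allocator: forwards to operator new/delete
// and records how many bytes were requested.
template <class T>
struct counting_allocator {
  using value_type = T;

  counting_allocator() = default;
  template <class U>
  counting_allocator(const counting_allocator<U> &) {}

  T *allocate(std::size_t n) {
    bytes_allocated() += n * sizeof(T);
    return static_cast<T *>(::operator new(n * sizeof(T)));
  }

  void deallocate(T *p, std::size_t) { ::operator delete(p); }
};

// All instances are interchangeable, so they always compare equal.
template <class T, class U>
bool operator==(const counting_allocator<T> &, const counting_allocator<U> &) {
  return true;
}
template <class T, class U>
bool operator!=(const counting_allocator<T> &, const counting_allocator<U> &) {
  return false;
}

// Same shape as the pvector declaration: a vector with a custom allocator.
template <class Type>
class my_pvector : public std::vector<Type, counting_allocator<Type> > {};

// Pushes three ints and reports the resulting size.
std::size_t push_three() {
  my_pvector<int> v;
  v.push_back(1);
  v.push_back(2);
  v.push_back(3);
  return v.size();
}
```

As far as I understand it, an allocator like this only changes where the vector gets its memory from; size() and element access are supposed to behave exactly like a plain vector, which is why the garbage size still puzzles me.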
So, what did I stumble upon here?
Did I stumble on a panda bug, or am I violating the laws of nature and the universe?