RigidBodyCombiner and flattenstrong algorithms?

zhao · June 9, 2010, 6:58am

How do RigidBodyCombiner and flattenstrong work behind the scenes?

I’m guessing that flattenstrong just burns the transforms into the objects’ vertices and creates one gigantic mesh.

Is RigidBodyCombiner doing the same thing, i.e., transforming all of the vertices of an object on every new setPos, or is it doing something more clever with vertex buffers and just streaming in a single new transform for each change?

The reason I’m asking is to find out how sensitive RigidBodyCombiner is to a new setPos. If I combine 100 objects x 4000 vertices each and change each object’s position every frame, am I going to incur a penalty on the order of 400,000 calculations or just 100 calculations?

Another way to phrase the question: with RigidBodyCombiner, is changing the position/hpr of an object with 40,000 vertices more expensive than changing the position/hpr of a 400-vertex object? Or is it the same cost?
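
The two cost hypotheses in this question can be put side by side with a little arithmetic. This is only a sketch: the function name and both models are illustrative assumptions, not a description of what Panda3D does internally.

```python
def per_frame_work(num_objects, verts_per_object):
    """Return (cpu_vertex_model, per_object_model) operation counts per frame.

    cpu_vertex_model: every vertex is re-transformed on the CPU.
    per_object_model: only one transform per object is updated.
    Both models are hypothetical and exist only for comparison.
    """
    cpu_vertex_model = num_objects * verts_per_object
    per_object_model = num_objects
    return cpu_vertex_model, per_object_model


heavy, light = per_frame_work(100, 4000)
print(heavy, light)  # 400000 100
```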

zhao · June 9, 2010, 8:50am

After some IRC help and more reading, I understand that the RigidBodyCombiner (RBC) is based on the animation system.

I tried astelix’s example
discourse.panda3d.org/viewtopic.php?t=7858

and got what I expected with 2000 x box.egg. Each box.egg is 12 tris, for a total of ~24k tris.

It was ~15 fps w/o RBC and 60 fps (Vsync) with.

However, when I tried 2000 x smileys (each smiley is 1.3k tris, for a total of ~2.6 million tris),
I got 15 fps w/o RBC and 5 fps with RBC.

What’s even stranger to me is that even when I disabled all smiley movement, the fps stayed at the same 5 fps with RBC. (I can render a single 3 million mesh at 60+ fps without any problem.)

This is leading me to conclude that the RBC applies a CPU transformation to every vertex every frame, regardless of whether or not the object is moving.

Thus it’s very good for collecting lots of low-tri objects into a single batch, but not so good at collecting lots of moderately high-tri objects.

Is this correct or am I missing something?

(Does this suggest that animations aren’t cached in Panda?)

zhao · June 9, 2010, 9:28am

After some more tinkering, it appears that for an object like smiley (~1.3k tris), using RigidBodyCombiner is across the board worse than just parenting it to render.

Smileys    Using RBC    Parenting to render
2000       5 fps        15 fps
1000       10 fps       30 fps
200        46 fps       60 fps (Vsync)
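
To make the gap easier to read, the fps figures above can be converted to milliseconds per frame. The snippet below is just a sketch of that arithmetic; the helper name `frame_time_ms` is made up for illustration, and the 200-smiley row is capped by Vsync, so its difference is only a lower bound.

```python
def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


# (smileys, fps with RBC, fps parented to render), copied from the figures above
results = [(2000, 5, 15), (1000, 10, 30), (200, 46, 60)]

for count, rbc_fps, plain_fps in results:
    extra = frame_time_ms(rbc_fps) - frame_time_ms(plain_fps)
    print(f"{count} smileys: RBC costs about {extra:.1f} ms more per frame")
```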

I’m just running a GTX 260 video card with a 1.8 GHz CPU.

ThomasEgi · June 9, 2010, 11:17am

your earlier assumption is correct.
GPUs have a hard time dealing with many hundred individual batches of geometry. the gpu prefers fewer, large batches.

the RBC collects your objects, applies the transform using the CPU, and sends it to the GPU in one batch.
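
A rough way to picture this is the plain-Python sketch below. It is not Panda3D's implementation: the `combine` function, the translation-only transforms, and the tuple-based vertices are simplifying assumptions. They are chosen to show that the CPU work grows with the total vertex count while the number of batches stays at one.

```python
def combine(objects):
    """Apply each object's offset to its vertices and merge into one batch.

    objects: list of (offset, vertices), where offset is an (x, y, z)
    translation and vertices is a list of (x, y, z) tuples in local space.
    Returns (batch, transforms_done).
    """
    batch = []
    transforms_done = 0
    for (ox, oy, oz), vertices in objects:
        for x, y, z in vertices:
            batch.append((x + ox, y + oy, z + oz))
            transforms_done += 1
    return batch, transforms_done


# Two small "objects" sharing the same local-space triangle.
triangle = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
objects = [((10, 0, 0), triangle), ((0, 5, 0), triangle)]

batch, work = combine(objects)
print(len(batch), work)  # 6 6
```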

so if your bottleneck is a high geom-count, RBC can reduce the problem by offloading some work to the CPU.

if your CPU is already at its limit you might be better off not using it. it’s a trade-off, and depending on your situation, one or the other might be better.

as you pointed out, RBC is most efficient when you have many many small objects with low vertex count.

zhao · June 9, 2010, 6:15pm

I’m just surprised that it doesn’t attempt to cache any of the animation transforms for objects that aren’t moving.

Is the base animation system also cache-less, i.e., does it compute every frame regardless of movement?

drwr · June 9, 2010, 6:40pm

It is not completely cacheless, but in the normal usage of the animation system, either all of the vertices are moving, or none of them. So if any joints have changed, it recomputes all of the vertices.

It should follow that if you move no objects, it should be faster than if you move any one of them (but probably still not as fast as if you did not use the RBC at all).

David
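
The all-or-nothing behaviour described in this reply can be sketched as a single dirty flag over the whole combined mesh. The class below is a hypothetical toy model, not Panda3D code: the names `CombinedMesh`, `set_offset`, and `vertices_for_frame` are invented for illustration, and transforms are reduced to plain translations.

```python
class CombinedMesh:
    """Toy model: one dirty flag guards the whole set of combined vertices."""

    def __init__(self, objects):
        # objects: list of (offset, local_vertices); offsets are (x, y, z).
        self.offsets = [offset for offset, _ in objects]
        self.local = [verts for _, verts in objects]
        self.dirty = True          # nothing computed yet
        self.cached = []
        self.recomputed = 0        # vertices re-transformed so far

    def set_offset(self, index, offset):
        if self.offsets[index] != offset:
            self.offsets[index] = offset
            self.dirty = True      # one moved object invalidates everything

    def vertices_for_frame(self):
        if self.dirty:
            self.cached = [
                (x + ox, y + oy, z + oz)
                for (ox, oy, oz), verts in zip(self.offsets, self.local)
                for x, y, z in verts
            ]
            self.recomputed += len(self.cached)
            self.dirty = False
        return self.cached


tri = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
mesh = CombinedMesh([((0, 0, 0), tri), ((5, 0, 0), tri)])
mesh.vertices_for_frame()        # first frame: all 6 vertices computed
mesh.vertices_for_frame()        # nothing moved: cache reused
mesh.set_offset(1, (6, 0, 0))    # move one object
mesh.vertices_for_frame()        # all 6 vertices recomputed, not just 3
print(mesh.recomputed)           # 12
```

The toy model charges nothing for a frame in which no object moved. As the reply notes, a real engine would probably still pay some overhead in that case.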

zhao · June 9, 2010, 9:25pm

Ok. Got it. It seems like this is a pretty cool system for something whose mesh complexity is in between particles and a larger mesh. I think it would be good for asteroids, bushes or trees.