Panda3D Performance questions

maiklof · August 15, 2013, 12:57pm

Hi!
I have written an app that shows roller conveyors (up to a hundred of them), which can differ in size, color, etc. To draw a conveyor I load a simple cube model and a cylinder model, each of which is copied several times. One conveyor is drawn with 4 cubes and around 10 cylinders.
This works very well: it is very easy to implement with Panda3D, and it does not use much CPU. But when I add lots of conveyors and move the camera to a spot where several (maybe 20) conveyors are on screen, the frame rate starts to drop, and it can go down to 5 fps when a few hundred conveyors are shown, while CPU use stays quite low. I am not using any lighting or shaders.

I am running on a Dell laptop:
i5 CPU
M560 2.67 GHz
3.42 GB of RAM
Intel HD Graphics

I wonder whether I am doing something wrong, whether I could clone the models instead of loading them each time, or whether the graphics card simply cannot handle this many geometries.
I am pasting a code example in case someone can help me…
Thanks a lot!

from panda3d.core import Vec4,LineSegs
from direct.showbase.ShowBase import ShowBase


class PandaWorld(ShowBase):
    
    def __init__(self):
        ShowBase.__init__(self)
        base.setFrameRateMeter(True)
        
        #grid
        GridMax = 50
        segs = LineSegs( ) 
        segs.setThickness( 1.0 ) 
        segs.setColor( Vec4(1,1,1,1) )
        for i in range(-GridMax,GridMax,1):
            segs.moveTo(i,-GridMax,0)
            segs.drawTo(i,+GridMax,0)
            segs.moveTo(-GridMax,i,0)
            segs.drawTo(+GridMax,i,0)
        render.attachNewNode(segs.create())

        # Show conveyors
        for i in range(20):
            for k in range(20):
                self.createConveyor().setPos(i*2,k*2,0)
     

    def createConveyor(self, length=2.0, width=0.8, height=0.5, radius=0.05, color=[1.0,0.0,0.0,1.0]):
        node = render.attachNewNode("conveyor")
        # Structure
        for i in range(2):
            side = self.loader.loadModel("cube")
            side.setScale(length, 0.05, radius*2)
            side.setColor(color[0],color[1],color[2],color[3])
            side.reparentTo(node)
            side.setPos(0,-width/2-0.03+i*(width+0.06),height/2-radius)
            for j in range(2):
                foot = self.loader.loadModel("cube")
                foot.setScale(0.1, 0.05, height-0.1)
                foot.setColor(color[0],color[1],color[2],color[3])
                foot.reparentTo(node)
                foot.setPos(-length/2+0.1+j*(length-0.2),-width/2-0.03+i*(width+0.06),-0.05)
        # Cylinders
        offset = 0.025
        num_cyl = int(length/(radius*2)/2)
        dist = (length-offset*2-radius*2)/(num_cyl-1)
        for i in range(num_cyl):
            cyl = self.loader.loadModel("cylinder")
            cyl.setScale(width, radius*2, radius*2)
            cyl.setColor(0.2,0.2,0.2,1.0)
            cyl.reparentTo(node)
            cyl.setPos(offset-length/2+radius+i*dist,0,height/2-radius)
            cyl.setHpr(90,0,0)
        # Return nodepath
        return node
        
app = PandaWorld()
app.run()

maiklof · August 15, 2013, 1:00pm

I attach the models if someone wants to try the code.
models.zip (939 Bytes)

cslos77 · August 16, 2013, 2:25am

Here are the results of “render.analyze()”:

13204 total nodes (including 0 instances); 0 LODNodes.
6799 transforms; 48% of nodes have some render attribute.
6401 Geoms, with 3 GeomVertexDatas and 2 GeomVertexFormats, appear on 6401 GeomNodes.
456 vertices, 56 normals, 400 colors, 56 texture coordinates.
GeomVertexData arrays occupy 9K memory.
GeomPrimitive arrays occupy 1K memory.
140800 triangles:
  140800 of these are on 26400 tristrips (5.33333 average tris per strip).
  0 of these are independent triangles.
200 lines, 0 points.
0 textures, estimated minimum 0K texture memory required.

You have 13204 total nodes in the scene graph, which for most systems is way too many. It’s not an issue of the graphics card being able to handle the geometry, but of Panda attempting to process 13,000+ nodes per frame. Try to keep your total nodes in the hundreds at most.

The problem here is that you’re creating a large assembly of small models (cylinder, etc.), each with its own node. Can you not build an entire conveyor as one model and then import that? Then there would only be one node per conveyor, which would get your node count down to 400. That is still fairly high, but much better than it currently is.

maiklof · August 16, 2013, 6:59am

Thanks for your reply!
I didn’t know that having many nodes was a performance issue…
The reason I cannot build an entire conveyor model is that each conveyor has a different size, number of cylinders, color, etc. This code was only a simple example…
But I have used the analyze tool and seen some strange things:

In the example, we see 13204 nodes (400 conveyors), but each conveyor is only 14 models, for a total of 5600 models. Where is the rest coming from?
I have tried to flattenStrong() the node for each conveyor, under which the models are parented, and this does not seem to reduce the node count. Why? node.flattenStrong() returns 0…
I have tried instancing: loading the cube and cylinder models only once under render, and then instancing them as many times as I need. This gives me worse performance. Why?
Any idea what else I can do?
Thanks again!

maiklof · August 16, 2013, 7:07am

I have found that calling node.clearModelNodes() before node.flattenStrong() seems to work!
But do you have any other recommendations to improve performance?

cslos77 · August 16, 2013, 5:33pm

Actually each conveyor is made up of 16 models (6 cubes, 10 cylinders) and each one of these models is made up of 2 nodes (ModelRoot + GeomNode); that puts the total at 12800. Then add the 400 nodes you’re creating, one per conveyor, and you have 13200. Here are the results of ls on a conveyor node (node.ls()):

PandaNode conveyor T:(pos 38 38 0)
  ModelRoot cube.egg T:(pos 0 -0.43 0.2 scale 2 0.05 0.1) S:(ColorAttrib)
    GeomNode Cube (1 geoms)
  ModelRoot cube.egg T:(pos -0.9 -0.43 -0.05 scale 0.1 0.05 0.4) S:(ColorAttrib)
    GeomNode Cube (1 geoms)
  ModelRoot cube.egg T:(pos 0.9 -0.43 -0.05 scale 0.1 0.05 0.4) S:(ColorAttrib)
    GeomNode Cube (1 geoms)
  ModelRoot cube.egg T:(pos 0 0.43 0.2 scale 2 0.05 0.1) S:(ColorAttrib)
    GeomNode Cube (1 geoms)
  ModelRoot cube.egg T:(pos -0.9 0.43 -0.05 scale 0.1 0.05 0.4) S:(ColorAttrib)
    GeomNode Cube (1 geoms)
  ModelRoot cube.egg T:(pos 0.9 0.43 -0.05 scale 0.1 0.05 0.4) S:(ColorAttrib)
    GeomNode Cube (1 geoms)
  ModelRoot cylinder.egg T:(pos -0.925 0 0.2 hpr 90 0 0 scale 0.8 0.1 0.1) S:(ColorAttrib)
    GeomNode Cylinder (1 geoms)
  ModelRoot cylinder.egg T:(pos -0.719444 0 0.2 hpr 90 0 0 scale 0.8 0.1 0.1) S:(ColorAttrib)
    GeomNode Cylinder (1 geoms)
  ModelRoot cylinder.egg T:(pos -0.513889 0 0.2 hpr 90 0 0 scale 0.8 0.1 0.1) S:(ColorAttrib)
    GeomNode Cylinder (1 geoms)
  ModelRoot cylinder.egg T:(pos -0.308333 0 0.2 hpr 90 0 0 scale 0.8 0.1 0.1) S:(ColorAttrib)
    GeomNode Cylinder (1 geoms)
  ModelRoot cylinder.egg T:(pos -0.102778 0 0.2 hpr 90 0 0 scale 0.8 0.1 0.1) S:(ColorAttrib)
    GeomNode Cylinder (1 geoms)
  ModelRoot cylinder.egg T:(pos 0.102778 0 0.2 hpr 90 0 0 scale 0.8 0.1 0.1) S:(ColorAttrib)
    GeomNode Cylinder (1 geoms)
  ModelRoot cylinder.egg T:(pos 0.308333 0 0.2 hpr 90 0 0 scale 0.8 0.1 0.1) S:(ColorAttrib)
    GeomNode Cylinder (1 geoms)
  ModelRoot cylinder.egg T:(pos 0.513889 0 0.2 hpr 90 0 0 scale 0.8 0.1 0.1) S:(ColorAttrib)
    GeomNode Cylinder (1 geoms)
  ModelRoot cylinder.egg T:(pos 0.719444 0 0.2 hpr 90 0 0 scale 0.8 0.1 0.1) S:(ColorAttrib)
    GeomNode Cylinder (1 geoms)
  ModelRoot cylinder.egg T:(pos 0.925 0 0.2 hpr 90 0 0 scale 0.8 0.1 0.1) S:(ColorAttrib)
    GeomNode Cylinder (1 geoms)
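
The node arithmetic at the top of this post can be double-checked with a few lines of plain Python. This is only a sketch with made-up constant names. The few extra nodes in the analyze() totals (13204 nodes, 6401 Geoms) are not modelled here; presumably they come from the grid and the default nodes under render.

```python
# Counts taken from the example code and the ls() listing above.
CONVEYORS = 20 * 20                 # 20 x 20 grid of conveyors
CUBES_PER_CONVEYOR = 2 * (1 + 2)    # per side: 1 rail + 2 feet
CYLINDERS_PER_CONVEYOR = 10
NODES_PER_MODEL = 2                 # ModelRoot + GeomNode

models_per_conveyor = CUBES_PER_CONVEYOR + CYLINDERS_PER_CONVEYOR
model_nodes = CONVEYORS * models_per_conveyor * NODES_PER_MODEL
total_nodes = model_nodes + CONVEYORS   # plus one "conveyor" node each
total_geoms = CONVEYORS * models_per_conveyor  # one Geom per loaded model

print("models per conveyor:", models_per_conveyor)
print("nodes from models:", model_nodes)
print("total nodes:", total_nodes)
print("total geoms:", total_geoms)
```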

Since you’ve got flattenStrong working you could improve performance even more by combining conveyors into rows, each with a single node. Keep using “clearModelNodes” on each conveyor node but don’t attach each conveyor to render directly, instead return it to the loop that creates it and do something like this:

# Show conveyors
for i in range(20):
    row_node = render.attachNewNode("row")
    for k in range(20):
        node = self.createConveyor()    # return node from createConveyor.
        node.setPos(i*2,k*2,0)
        node.reparentTo(row_node)
    row_node.flattenStrong()

This gets your node count down to 44 or so, but in the end I would still pre-build the conveyors and save them as models themselves. It really depends what your final goal for the scene is and how many different size, colour conveyors you need.

rdb · August 16, 2013, 7:32pm

Just to clear this up, the number of nodes is not the problem. The number of Geoms is.

It’s not a limit of Panda, or the graphics card, but of the data bus. Each Geom needs to be sent as a separate batch, and as a very rough rule of thumb, you can render about 300 geoms to maintain 60 FPS and 600 geoms to maintain 30 FPS. You should use flattening techniques in order to reduce the number of geoms drastically.
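
As a rough illustration of that rule of thumb: both figures work out to about 18,000 geoms per second. The sketch below extrapolates linearly from that number. The linear extrapolation is an assumption made here for illustration, not something Panda3D guarantees, and the function names are hypothetical.

```python
# Both figures in the rule of thumb imply the same throughput:
# 300 geoms * 60 FPS == 600 geoms * 30 FPS == 18000 geoms per second.
GEOMS_PER_SECOND = 300 * 60


def geom_budget(target_fps):
    """Roughly how many geoms fit in a frame at the given frame rate."""
    return GEOMS_PER_SECOND / target_fps


def estimated_fps(num_geoms):
    """Frame rate suggested by the rule of thumb for a given geom count."""
    return GEOMS_PER_SECOND / num_geoms


print("budget at 60 FPS:", geom_budget(60))
print("budget at 30 FPS:", geom_budget(30))
print("estimate for 6401 geoms: %.1f FPS" % estimated_fps(6401))
```

For the 6401 Geoms reported by analyze() this comes out at roughly 3 FPS, in the same range as the 5 fps mentioned in the first post.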

cslos77 · August 16, 2013, 7:56pm

Ok, thanks for pointing that out, I had always just assumed it was a question of the number of nodes when I ran into this kind of issue. The last example I posted gets it down to 21 geoms so that should help depending on the final requirements of the scene.

maiklof · August 21, 2013, 11:10am

Thanks for the posts!
The requirement is that I need to be able to move each conveyor independently, and the sizes, colors, etc. will differ. And it is not only about conveyors: I want to add more objects and move them around the world, like in a world editor.
I have made some improvements and managed to reduce the number of geoms to the number of objects… But I would like to be able to draw hundreds of objects… Is there nothing more I can do?
Now I have realised that I have another problem: before, I could change the color of a roller in the conveyor, but now that it is flattened, I do not know how to access it…
Thanks again!!!

rdb · August 21, 2013, 11:24am

You cannot change the state of an individual component after flattening it. That’s part of the point of flattenStrong; it reduces the number of state changes. You should put that roller under a ModelNode with the appropriate flags set if you want to prevent it from being flattened in with the rest of the conveyor.

If you want to use more than a few hundred objects, you could flatten groups of these objects together (if they don’t have to move independently). Otherwise, you may consider the RigidBodyCombiner or hardware instancing.