Performance killer: geom or primitive?

skylet · November 5, 2012, 1:53pm

Hello guys,

Could anyone please help me out with some internal information? I am about to design chunks for my Minecraft clone.

Now I am looking for a compromise between easy handling with individual geom nodes (later gathered in an RBC, i.e. a RigidBodyCombiner) and big geoms whose primitives cover several cubes.

What is the killer for graphics performance (given that the number of primitives is the same in total over all geoms)?

Is using many geoms much slower, and if so, is there an easy way to gather the primitives of all geoms into one and drop the rest? Is this done by RBC/flattenStrong? I thought they affect the nodes only.

Many thanks for your ideas!

Michael

Nemesis_13 · November 7, 2012, 6:15pm

I guess by primitives you mean vertices, edges and polygons.
In that case the number of primitives doesn’t really count, since modern hardware can handle millions of vertices at a decent framerate. The real killer is the number of separate batches of geometry, called geoms. Try to flatten multiple geoms into a single one and you’re fine.
You could probably also keep the unflattened geometry in memory (a copy of it made before flattening) and simply swap the rendered geometry (the flattened chunk) with the cached unflattened objects at runtime.

EDIT: I was just told that there also is a thing called GeomPrimitive in Panda (panda3d.org/manual/index.php/GeomPrimitive). If you meant that, then I actually have no idea, sry.

skylet · November 9, 2012, 7:25am

Hello nemesis,

Many thanks for your reply, I very much appreciate your answer.

As vertices themselves are not visible, I was actually referring to the tris, tristrips, lines etc. that you add to a Geom using addPrimitive, like this:

        self.vertex_data = GeomVertexData(name, self.format, Geom.UHStatic)
        self.tristrips = GeomTristrips(Geom.UHStatic)

        self.vertex = GeomVertexWriter(self.vertex_data, 'vertex')
#[..]
        _v0 = self.vertex.getWriteRow()
        self.vertex.addData3f(x1, y1, z1)
        self.vertex.addData3f(x2, y1, z1)
        self.vertex.addData3f(x2, y2, z2)
        self.vertex.addData3f(x1, y2, z2)

        self.tristrips.addVertices(_v0, _v0 + 1, _v0 + 3, _v0 + 2)
        self.tristrips.closePrimitive()
#[..]
        geom_node = GeomNode(self.name)
        mesh = Geom(self.vertex_data)
        mesh.addPrimitive(self.tristrips)       
        geom_node.addGeom(mesh)

So yes, I think that is what I mean.

This is exactly the type of answer I hoped to get, clear and precise, thanks!

Only for information:
I wanted to keep my question simple and general, that’s why I did not go into detail with my initial post.

As a first try I thought I would use flattenStrong to reduce the number of nodes/geoms, because using the predefined boxes would be convenient and simple. It looked like this:

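from random import choice
import direct.directbase.DirectStart  # provides loader, render, base, taskMgr, run

BLOCK_CHOICE = [0, 1, 2]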
BLOCK_COLOR_MAP = {0: (0, 0, 0, 1),
                   1: (0.5, 0.5, 0.5, 1),
                   2: (1, 1, 1, 1)
                  }

# Load the box
box = loader.loadModel("box")
# Make sure its center is at 0, 0, 0 like OdeBoxGeom
box.setPos(-.5, -.5, -.5)
box.flattenLight() # Apply transform
box.setTextureOff()


def create_chunk(n, name="Chunk"):
    chunk_node = render.attachNewNode(name)
    
    for z in range(n):
        for y in range(n):
            for x in range(n):
                if not z or not y or not x or z==n-1 or y == n-1 or x==n-1:
                    block_type = choice(BLOCK_CHOICE)
                    np = chunk_node.attachNewNode("Base Box %i %i %i" % (x, y, z))
                    copy = box.copyTo(np)
                    copy.setPos(x, y, z)
                    copy.setColor(*BLOCK_COLOR_MAP.get(block_type)) 
                
    chunk_node.flattenStrong()
    return chunk_node
    
# Queue the 3x3x3 grid of chunk positions to build
boxes = []
n = 10  # cubes along each chunk edge
for z in range(3):
    for y in range(3):
        for x in range(3):
            boxes.append((x, y, z))

np = render.attachNewNode("Base Box")
copy = box.instanceTo(np)
copy.setPos(0, 0, 0)
copy.setColor(0.5, 0.5, 0.5, 1) 
 
# Set the camera position
base.disableMouse()
base.camera.setPos(60, 60, 40)
base.camera.lookAt(0, 0, 0)

#[..]

def build_task(task):
    if boxes:
        x, y, z = boxes.pop()
        chunk_name = "Chunk %i %i %i" % (x, y, z)
        print("generating %s" % chunk_name)
        chunk = create_chunk(n, chunk_name)
        chunk.setPos(x*n, y*n, z*n)
        task.setDelay(1)
        return task.again
 
taskMgr.doMethodLater(1, build_task, "Chunk Build Task")
run()

I have created 3x3x3 = 27 flattened chunks of cubes, where I already took care to create only the cubes on the outer shell of each chunk (x, y or z equal to 0 or n-1).
But still: as long as the block of chunks is only partially on screen, everything runs smoothly. As soon as I get all of it into view, the framerate drops significantly.
(I use a GeForce GTX 560 and usually have no problems with Minecraft using HD texture packs.)

Actually I was really surprised. I didn’t think that the (estimated) < 20k cube models would be so expensive. Hence my question whether it was the amount of visible elements or the use of geoms that made the framerate drop. Maybe I should have asked about the cost of models instead of geoms.
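
The estimate can be checked with a few lines of plain Python. This is only a sketch of the arithmetic for the loop condition above; the function name `shell_cube_count` is made up for illustration:

```python
def shell_cube_count(n):
    """Cubes on the outer shell of an n x n x n chunk (interior left empty)."""
    inner = max(n - 2, 0)
    return n ** 3 - inner ** 3


cubes_per_chunk = shell_cube_count(10)   # 10^3 - 8^3
total_cubes = 27 * cubes_per_chunk       # 3 x 3 x 3 chunks
```

With n = 10 that comes to 488 cubes per chunk and 13176 cube models in total before any flattening.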

BR,
Michael

ThomasEgi · November 9, 2012, 3:49pm

As a rough guideline, you want to keep the number of individual batches sent to the GPU somewhere below 200 or 300 per frame, so 20k is totally off-limits.

You should use a lot fewer batches with a lot more geometry in each. You can use pstats to check how many are sent to the GPU each frame.

skylet · November 9, 2012, 5:58pm

Hello Thomas,

I have not heard of “batches” in the context of handling model data with Panda3D. Does every model lead to a batch that is sent to the GPU? I assumed that flattenStrong would reduce the models’ geoms to a single geom node:

chunk_node.flattenStrong()

So I will stick with my current approach of creating one geom (and node) per chunk that holds all the necessary primitives (faces). I have already had success this way, but I wanted to try using flattenStrong or RBC with whole cube models.
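
Roughly, the face generation for such a chunk can be sketched in plain Python as follows. This is an illustration under assumptions, not my actual code: `FACES` and `build_chunk_mesh` are made-up names, each cube is a unit cube at integer coordinates, and faces between two solid cubes are skipped. Each vertex would then go through addData3f and each index triple into a GeomTriangles via addVertices.

```python
# Outward direction -> four corner offsets, counter-clockwise seen from outside.
FACES = {
    (1, 0, 0):  ((1, 0, 0), (1, 1, 0), (1, 1, 1), (1, 0, 1)),
    (-1, 0, 0): ((0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 0)),
    (0, 1, 0):  ((0, 1, 0), (0, 1, 1), (1, 1, 1), (1, 1, 0)),
    (0, -1, 0): ((0, 0, 0), (1, 0, 0), (1, 0, 1), (0, 0, 1)),
    (0, 0, 1):  ((0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)),
    (0, 0, -1): ((0, 0, 0), (0, 1, 0), (1, 1, 0), (1, 0, 0)),
}


def build_chunk_mesh(solid):
    """Return (vertices, triangles) for the exposed faces of a set of cells.

    solid is a set of (x, y, z) integer cells that contain a cube.
    """
    vertices = []
    triangles = []
    for (x, y, z) in sorted(solid):
        for (dx, dy, dz), corners in FACES.items():
            if (x + dx, y + dy, z + dz) in solid:
                continue  # face is hidden between two solid cubes
            base = len(vertices)
            for cx, cy, cz in corners:
                vertices.append((x + cx, y + cy, z + cz))
            # Split the quad into two triangles with the same winding.
            triangles.append((base, base + 1, base + 2))
            triangles.append((base, base + 2, base + 3))
    return vertices, triangles


# A completely filled 10x10x10 chunk only exposes its outer surface.
full_chunk = {(x, y, z) for x in range(10) for y in range(10) for z in range(10)}
verts, tris = build_chunk_mesh(full_chunk)
```

All of this ends up in one vertex table and one primitive, so the chunk is a single geom no matter how many cubes it contains.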

It is as usual: if you need performance, go to the roots.

BR,
Michael

ThomasEgi · November 9, 2012, 7:37pm

Hm, I’m not 100% sure on this one, as in most of my cases the primitive batches and geoms come out at exactly the same number (which doesn’t have to be the case for you).

Either way, keeping an eye on pstats’ geom and primitive-batches counts will probably solve this mystery.

Flatten should help, but flatten is a rather slow operation, so you may find it faster to generate geometry with the right structure directly. Back to the roots, as you put it.