All of my framerate is being consumed by... RopeNode?

Hi, I have been using direct.showutil.Rope/panda3d.core.RopeNode to dynamically draw paths on the screen. Since I am still in the early stages of my project I had not been paying much attention to the frame rate until now, but I have run into some major issues when using the RopeNode class. Having set up PStats and started profiling, it seems that RopeNode is using all of my processing power. Specifically, adding a 103-vertex loop to my game scene dropped my frame rate from ~60 FPS to ~35 FPS. Adding another RopeNode can drop the frame rate below 10 FPS. According to PStats, after making a couple more shapes the RopeNodes are taking 150ms to process, whereas the entire Draw portion takes ~1.5ms and all of my per-frame scripting takes ~5ms. As far as I know I am not doing anything fancy with the RopeNodes, just creating them once and attaching them to render. Am I making some sort of rookie mistake, or is the RopeNode class just not meant for extensive use? If so, is there a more performant alternative that I should be using? Thanks.
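To put those PStats numbers in perspective, the arithmetic can be sketched as below. This is only an illustration built on the timings quoted in the post; the helper name is made up for the example, and it assumes the three costs simply add up within one frame.

```python
def fps_from_frame_ms(frame_ms):
    """Convert a per-frame cost in milliseconds to frames per second."""
    return 1000.0 / frame_ms

# Timings quoted in the post (milliseconds per frame).
rope_ms, draw_ms, script_ms = 150.0, 1.5, 5.0

# Assumption: the three costs are serial, so they sum to the frame time.
total_ms = rope_ms + draw_ms + script_ms

print(f"total frame time: {total_ms} ms")
print(f"upper bound on frame rate: {fps_from_frame_ms(total_ms):.1f} FPS")
print(f"share of the frame spent on ropes: {rope_ms / total_ms:.0%}")
```

Under that assumption the ropes alone cap the frame rate at a little over 6 FPS, which is consistent with the sub-10 FPS figure reported above.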

I’ll confess that I haven’t much used RopeNode, so I feel that I’m not in a position to speak to the matter of its use.

However, I do think that there might be an alternative for the purpose of drawing dynamic paths: MeshDrawer.

In short, MeshDrawer allows you to generate simple geometry through a fairly easy interface. Of likely particular use here, one of its features is the ability to draw “linked segments”–essentially quads whose ends meet, thus allowing one to generate a path from them.

MeshDrawer manual entry:

MeshDrawer API:

Looking through the implementation of RopeNode, it seems that it recreates all of its vertex data from scratch every render cycle, regardless of whether the curve has actually changed. I am not familiar enough with the underlying implementation to know if this is a major bottleneck or just a red herring, and since I am mostly inferring which methods are getting called based on their signatures, I could be completely off base about it being recalculated on every render cycle. If someone more knowledgeable wants to clarify, I am all ears. In either case, I will look into MeshDrawer to see if it is more performant. Other alternatives might be LineSegs, or just getting my hands dirty and generating the GeomNodes manually.
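For reference, the LineSegs alternative mentioned above might look roughly like the sketch below. It builds the path once as static line geometry, so nothing is regenerated per frame. The function names and the circle generator are hypothetical helpers for this example; only LineSegs and its moveTo/drawTo/create calls come from Panda3D. Note that LineSegs draws flat lines whose thickness is in pixels, so it is not a visual replacement for the tube mode.

```python
import math

def circle_points(count, radius):
    """Return `count` (x, y, z) tuples evenly spaced on a circle in the XY plane."""
    step = 2.0 * math.pi / count
    return [(radius * math.cos(i * step), radius * math.sin(i * step), 0.0)
            for i in range(count)]

def build_line_path(points, thickness=2.0, color=(1, 1, 1, 1), closed=True):
    """Build a static polyline through `points` and return it as a GeomNode."""
    # Imported here so circle_points() stays usable without Panda3D installed.
    from panda3d.core import LineSegs

    segs = LineSegs("path")
    segs.setThickness(thickness)  # line width in pixels
    segs.setColor(*color)
    segs.moveTo(*points[0])
    for point in points[1:]:
        segs.drawTo(*point)
    if closed:
        # Draw back to the first point to close the loop.
        segs.drawTo(*points[0])
    return segs.create()
```

Inside a ShowBase application this could be attached with something like `render.attachNewNode(build_line_path(circle_points(103, 21.3)))`.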

Oh, yeah, it does appear to be inefficiently implemented. There are probably a number of easy things that can be done to speed this up. Do you have a simple stress test program that I can profile?

Most of it is tied up in the rest of my project, but I have copied some basic output into a simple program to test with. The spacebar toggles whether the Rope curve is rendered. On my machine I get a flat 60 FPS when the scene is empty, and it drops to ~38 FPS when the Rope is attached to render. Interesting to note: the Draw section of my PStats feed runs at about ~15ms when the scene is empty (it seems to all be from the ‘clear’ portion), but drops to ~0.5ms when the Rope is rendered. Sample code here:

from direct.showbase.ShowBase import ShowBase
from direct.showutil.Rope import Rope
from panda3d.core import LPoint3f, PStatClient, RopeNode

class DemoWorld(ShowBase):

    def __init__(self):
        ShowBase.__init__(self)

        # Connect to a running PStats server, if there is one.
        PStatClient.connect()

        self.setBackgroundColor(0, 0, 0)
        # Disable the default mouse camera control so the placement below takes effect.
        self.disableMouse()
        self.camera.setPos(0, 0, 180)
        self.camera.setHpr(0, -90, 0)

        self.rope_curve = self._generate_rope_curve(self._get_sample_verticies())
        self.rope_curve.reparentTo(self.render)

        self.display_curve = True
        self.accept("space", self.toggle_curve)

    def toggle_curve(self):
        # Detach or re-attach the rope so its cost can be compared against an empty scene.
        if self.display_curve:
            self.rope_curve.detachNode()
        else:
            self.rope_curve.reparentTo(self.render)
        self.display_curve = not self.display_curve

    def _generate_rope_curve(self, points, thickness=1, color=(1, 1, 1, 1)):
        # The curve order cannot exceed the number of control points.
        curve_order = min(4, len(points))

        verts = [{'point': x, 'thickness': thickness, 'color': color} for x in points]

        rope = Rope()
        rope.setup(curve_order, verts)
        rope_node = rope.ropeNode
        rope_node.render_mode = RopeNode.RM_tube
        return rope

    def _get_sample_verticies(self):
        return [LPoint3f(21.3316, 0, 0),
                LPoint3f(21.2911, 1.31319, 0),
                LPoint3f(21.1699, 2.6214, 0),
                LPoint3f(20.9683, 3.91966, 0),
                LPoint3f(20.6873, 5.20306, 0),
                LPoint3f(20.3277, 6.46672, 0),
                LPoint3f(19.8911, 7.70585, 0),
                LPoint3f(19.379, 8.91574, 0),
                LPoint3f(18.7934, 10.0918, 0),
                LPoint3f(18.1365, 11.2296, 0),
                LPoint3f(17.4108, 12.3248, 0),
                LPoint3f(16.619, 13.3733, 0),
                LPoint3f(15.7642, 14.371, 0),
                LPoint3f(14.8496, 15.3142, 0),
                LPoint3f(13.8787, 16.1993, 0),
                LPoint3f(12.8551, 17.023, 0),
                LPoint3f(11.7828, 17.782, 0),
                LPoint3f(10.6658, 18.4737, 0),
                LPoint3f(9.50829, 19.0952, 0),
                LPoint3f(8.31474, 19.6443, 0),
                LPoint3f(7.08965, 20.119, 0),
                LPoint3f(5.83766, 20.5172, 0),
                LPoint3f(4.56353, 20.8377, 0),
                LPoint3f(3.27208, 21.0791, 0),
                LPoint3f(1.96823, 21.2406, 0),
                LPoint3f(0.656907, 21.3214, 0),
                LPoint3f(-0.656907, 21.3214, 0),
                LPoint3f(-1.96823, 21.2406, 0),
                LPoint3f(-3.27208, 21.0791, 0),
                LPoint3f(-4.56353, 20.8377, 0),
                LPoint3f(-5.83766, 20.5172, 0),
                LPoint3f(-7.08965, 20.119, 0),
                LPoint3f(-8.31474, 19.6443, 0),
                LPoint3f(-9.50829, 19.0952, 0),
                LPoint3f(-10.6658, 18.4737, 0),
                LPoint3f(-11.7828, 17.782, 0),
                LPoint3f(-12.8551, 17.023, 0),
                LPoint3f(-13.8787, 16.1993, 0),
                LPoint3f(-14.8496, 15.3142, 0),
                LPoint3f(-15.7642, 14.371, 0),
                LPoint3f(-16.619, 13.3733, 0),
                LPoint3f(-17.4108, 12.3248, 0),
                LPoint3f(-18.1365, 11.2296, 0),
                LPoint3f(-18.7934, 10.0918, 0),
                LPoint3f(-19.379, 8.91574, 0),
                LPoint3f(-19.8911, 7.70585, 0),
                LPoint3f(-20.3277, 6.46672, 0),
                LPoint3f(-20.6873, 5.20306, 0),
                LPoint3f(-20.9683, 3.91966, 0),
                LPoint3f(-21.1699, 2.6214, 0),
                LPoint3f(-21.2911, 1.31319, 0),
                LPoint3f(-21.3316, 0, 0),
                LPoint3f(-21.2911, -1.31319, 0),
                LPoint3f(-21.1699, -2.6214, 0),
                LPoint3f(-20.9683, -3.91966, 0),
                LPoint3f(-20.6873, -5.20306, 0),
                LPoint3f(-20.3277, -6.46672, 0),
                LPoint3f(-19.8911, -7.70585, 0),
                LPoint3f(-19.379, -8.91574, 0),
                LPoint3f(-18.7934, -10.0918, 0),
                LPoint3f(-18.1365, -11.2296, 0),
                LPoint3f(-17.4108, -12.3248, 0),
                LPoint3f(-16.619, -13.3733, 0),
                LPoint3f(-15.7642, -14.371, 0),
                LPoint3f(-14.8496, -15.3142, 0),
                LPoint3f(-13.8787, -16.1993, 0),
                LPoint3f(-12.8551, -17.023, 0),
                LPoint3f(-11.7828, -17.782, 0),
                LPoint3f(-10.6658, -18.4737, 0),
                LPoint3f(-9.50829, -19.0952, 0),
                LPoint3f(-8.31474, -19.6443, 0),
                LPoint3f(-7.08965, -20.119, 0),
                LPoint3f(-5.83766, -20.5172, 0),
                LPoint3f(-4.56353, -20.8377, 0),
                LPoint3f(-3.27208, -21.0791, 0),
                LPoint3f(-1.96823, -21.2406, 0),
                LPoint3f(-0.656907, -21.3214, 0),
                LPoint3f(0.656907, -21.3214, 0),
                LPoint3f(1.96823, -21.2406, 0),
                LPoint3f(3.27208, -21.0791, 0),
                LPoint3f(4.56353, -20.8377, 0),
                LPoint3f(5.83766, -20.5172, 0),
                LPoint3f(7.08965, -20.119, 0),
                LPoint3f(8.31474, -19.6443, 0),
                LPoint3f(9.50829, -19.0952, 0),
                LPoint3f(10.6658, -18.4737, 0),
                LPoint3f(11.7828, -17.782, 0),
                LPoint3f(12.8551, -17.023, 0),
                LPoint3f(13.8787, -16.1993, 0),
                LPoint3f(14.8496, -15.3142, 0),
                LPoint3f(15.7642, -14.371, 0),
                LPoint3f(16.619, -13.3733, 0),
                LPoint3f(17.4108, -12.3248, 0),
                LPoint3f(18.1365, -11.2296, 0),
                LPoint3f(18.7934, -10.0918, 0),
                LPoint3f(19.379, -8.91574, 0),
                LPoint3f(19.8911, -7.70585, 0),
                LPoint3f(20.3277, -6.46672, 0),
                LPoint3f(20.6873, -5.20306, 0),
                LPoint3f(20.9683, -3.91966, 0),
                LPoint3f(21.1699, -2.6214, 0),
                LPoint3f(21.2911, -1.31319, 0),
                LPoint3f(21.3316, 0, 0),
                LPoint3f(21.2911, 1.31319, 0)]

if __name__ == "__main__":
    demo = DemoWorld()
    demo.run()

I suspect this is partially a driver issue. When I run the code, it runs at 60 fps on my laptop. Which renderer are you using? Which video card? Do you have updated drivers?

I can optimize the “thread” render mode significantly, and quite easily, but the “tube” mode turns out to be somewhat harder to optimize.

Likewise, I’m getting ~100fps on my machine with the code posted above. (When the rope is showing, that is.)

Admittedly, and in line with the above, switching to “thread” mode produces better performance still by far: I’m getting ~400fps. “Tape” and “billboard” are even better.

Thus it does seem that, if there’s optimising to be done, then the “tube” mode is perhaps the one that most calls for it. (Unless, I suppose, that mode turns out to be seldom-used.)
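For anyone wanting to repeat that comparison, a minimal sketch of switching render modes at runtime follows. The helper name and the RENDER_MODES tuple are made up for this example; it assumes `rope` is a direct.showutil.Rope like the one in the sample program, and it relies on the RopeNode enum values following the same RM_<name> pattern as the RM_tube constant used there.

```python
# Names of the four RopeNode render modes mentioned in this thread.
RENDER_MODES = ("thread", "tape", "billboard", "tube")

def set_rope_render_mode(rope, mode):
    """Switch a direct.showutil.Rope to the named render mode."""
    if mode not in RENDER_MODES:
        raise ValueError(f"unknown render mode: {mode!r}")
    # Imported here so the name check above works without Panda3D installed.
    from panda3d.core import RopeNode
    # Look up the enum value, e.g. "tube" -> RopeNode.RM_tube.
    rope.ropeNode.render_mode = getattr(RopeNode, "RM_" + mode)
```

Binding this to a key in the sample program, for example cycling through RENDER_MODES on each press, makes it easy to watch the frame rate change per mode.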

One more thing that might be worth asking, in addition to the above: what are the specs of your computer?

I am also generating scenes with large numbers (hundreds to thousands) of Ropes. I was able to maintain some level of performance by using LOD and by setting numSlices and numSubdiv as low as I can tolerate. I need the tube render mode, since the others don’t look good for my application.
I would be really interested in better performance, particularly since my ropes are static.
I have also found that RopeNodes do not save/load properly through pickle or bam. Is this something I can fix in the source? I don’t want to hijack this thread; should I create a separate one, or a GitHub issue?
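As a rough illustration of why lowering those two settings helps, the sketch below estimates how many vertices a tube needs and shows how the settings might be applied. The estimate is an assumption made for this example (one ring of num_slices vertices per subdivision step), not RopeNode's exact bookkeeping, and the function names and sample values are hypothetical. setNumSubdiv and setNumSlices are the RopeNode setters.

```python
def estimate_tube_vertices(num_points, order, num_subdiv, num_slices):
    """Rough vertex count for a tube drawn along a NURBS curve.

    Assumes the curve has (num_points - order + 1) segments, each split into
    num_subdiv steps, with one ring of num_slices vertices per step.
    """
    segments = max(num_points - order + 1, 0)
    if segments == 0:
        return 0
    rings = segments * num_subdiv + 1
    return rings * num_slices

def reduce_rope_detail(rope, num_subdiv=2, num_slices=3):
    """Lower the tessellation of a direct.showutil.Rope (values are illustrative)."""
    rope.ropeNode.setNumSubdiv(num_subdiv)
    rope.ropeNode.setNumSlices(num_slices)

# Compare a few settings for the 103-point, order-4 loop from the sample program.
for subdiv, slices in [(10, 8), (4, 4), (2, 3)]:
    print(subdiv, slices, estimate_tube_vertices(103, 4, subdiv, slices))
```

Since the tube appears to be rebuilt every frame, a lower vertex count should reduce the per-frame cost roughly in proportion, at the price of a coarser tube.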

Specs are on the very low end: an integrated graphics card from 3+ years ago, running on stock Linux OS drivers rather than anything manufacturer-specific. I have not specified a renderer, so I assume it’s defaulting to OpenGL, but I am definitely not familiar with how they are configured. Based on the other posts here, it is probably a combination of middling optimization and terrible specs that has made the tubed RopeNodes unusable on my system. I can experiment with the other Rope render modes, or just use a different class for rendering my paths. Either way, thanks for the quick responses and good feedback on the issue.

I have added a generate method to RopeNode, following the CardMaker code. I think this will allow me to generate static geometry that should be more performant, and fix the problem of writing to bam. The performance is worse, but I am sure I have done something stupid. Here is my diff. Does anyone see anything obvious?

Sorry, the poor performance turned out to be some of my compile options. Looks like this is working for me. Is this a direction that we could consider including in the next version?

Any thoughts on adding something like a generate method to Ropes to create static geometry?
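Until something like that exists in RopeNode itself, static tube geometry can be approximated from Python by building the Geom by hand. The sketch below is not the generate method from the diff above; it is a simplified stand-in with hypothetical function names. It assumes the path lies in the XY plane (as the sample loop does), takes points as (x, y, z) tuples, places rings directly at those points rather than along the evaluated NURBS curve, and writes no normals or texture coordinates.

```python
import math

def tube_rings(points, radius, num_slices):
    """Return one ring of `num_slices` (x, y, z) vertices around each point."""
    rings = []
    last = len(points) - 1
    for i, (x, y, z) in enumerate(points):
        # Tangent from the neighbouring points (clamped at the ends).
        px, py, _ = points[max(i - 1, 0)]
        nx, ny, _ = points[min(i + 1, last)]
        tx, ty = nx - px, ny - py
        length = math.hypot(tx, ty) or 1.0
        # In-plane normal: the tangent rotated by 90 degrees.
        ox, oy = -ty / length, tx / length
        ring = []
        for s in range(num_slices):
            angle = 2.0 * math.pi * s / num_slices
            c, sn = math.cos(angle), math.sin(angle)
            # The ring is spanned by the in-plane normal and the Z axis.
            ring.append((x + radius * c * ox, y + radius * c * oy, z + radius * sn))
        rings.append(ring)
    return rings

def tube_triangles(num_rings, num_slices):
    """Return vertex-index triples joining each ring to the next one."""
    tris = []
    for r in range(num_rings - 1):
        for s in range(num_slices):
            a = r * num_slices + s
            b = r * num_slices + (s + 1) % num_slices
            c, d = a + num_slices, b + num_slices
            tris.append((a, b, c))
            tris.append((b, d, c))
    return tris

def build_static_tube(points, radius=0.5, num_slices=6):
    """Build the tube once as a GeomNode with static vertex data."""
    # Imported here so the pure-Python helpers above work without Panda3D.
    from panda3d.core import (Geom, GeomNode, GeomTriangles, GeomVertexData,
                              GeomVertexFormat, GeomVertexWriter)

    rings = tube_rings(points, radius, num_slices)
    vdata = GeomVertexData("tube", GeomVertexFormat.getV3(), Geom.UHStatic)
    writer = GeomVertexWriter(vdata, "vertex")
    for ring in rings:
        for vx, vy, vz in ring:
            writer.addData3(vx, vy, vz)

    prim = GeomTriangles(Geom.UHStatic)
    for a, b, c in tube_triangles(len(rings), num_slices):
        prim.addVertices(a, b, c)

    geom = Geom(vdata)
    geom.addPrimitive(prim)
    node = GeomNode("static_tube")
    node.addGeom(geom)
    return node
```

The result can be attached with `render.attachNewNode(build_static_tube(points))`. Because it is an ordinary GeomNode, it is built once rather than every frame. If the faces appear inside out, calling setTwoSided(True) on the resulting NodePath is a quick workaround.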