Aspect2d and framerate

Greetings all!

I’ve been doing some performance-tuning in a Panda3d application I’m working on, and I noticed an unexpected side-effect of using aspect2d. It seems that any objects parented to aspect2d have a large effect on framerate. In particular, we have a scene with several OnscreenText and OnscreenImage objects (about six total, though one of the images is a full-screen transparent mask), and a framerate of about 23 FPS (pstats shows the draw loop executing in around 40ms). Simply removing one of the larger OnscreenText objects jumps the framerate to 35 FPS (draw loop about 28ms). If we reparent aspect2d to hidden, the framerate leaps to 51 FPS (draw loop about 18ms).

I’m not terribly surprised at the slow performance with the full-screen transparency; what I do find surprising is that simple text objects seem to have such a large impact on the framerate. Is this to be expected given the way Panda3d layers the render2d image atop the render image, or is it likely that I have some options mis-configured?

On a related note: does anyone have suggestions on how to optimize the full-screen transparency? Ideally, I’d like to do something akin to an OpenGL stencil (it’s really unnecessary for anything in render that would show up outside the mask’s transparent region to even be drawn). Is there a feature like that in the Panda engine?

Thanks for the help!

Actually, full-screen transparency is not supposed to be that expensive.

I suspect you have encountered a problem that we have discovered with certain graphics drivers, particularly on NVidia GeForce2 cards and earlier. These cards don’t support the “clamp-to-border” texture mode, which Panda enables by default to render text. The driver nevertheless accepts the mode and implements it by rendering the texture in software, which is very slow.

A workaround is to put the following in your Config.prc file:


gl-support-clamp-to-border 0

David

David,

Thank you for the suggestion!

I’m afraid it appears that changing the setting didn’t solve the problem; I’m still seeing framerates around 20 with the text visible, 30 with it hidden. I don’t see the “gl-support-clamp-to-border” option when I execute cvMgr.listVariables; should it be listed in that system?

My graphics card is an NVIDIA GeForce2 MX/MX 400, and I’m using the pandagl display system. Switching to pandadx8 did appear to make the problem go away (and also alleviated the slowdown I had observed in rendering the fullscreen transparent mask, although the maximum framerate caps out at 40 instead of 50 or 60).

Is there any way to change the option at runtime? Perhaps I could toggle it on and off to get a comparison in the same session. Enabling DirectX 8 support will work as a short-term solution, but I’d love to be able to port the program to Linux (and Mac OS X! :wink: ) in the future.

Thanks,
Mark

Oops, my mistake–the gl-support-clamp-to-border option hasn’t been released as part of Panda3D 1.1.0 yet. So, yeah, that won’t work.

I suggest continuing to use dx8 for now. Don’t worry about future ports to other platforms; this variable is just a temporary hack around a driver bug (of a sort), and we’ll have a better solution in the future, but in the meantime you’re not likely to encounter this same driver bug on Linux or OSX anyway.

David

Any changes to this?

(1) Where is Config.prc?
(2) I’m getting the same problem with simple text nodes. However, it occurs on both an old Nvidia card and a new laptop with an Nvidia NForce GO 7400, which has more than enough processing power even if the text were being rendered in software.

The gl-support-clamp-to-border option has been part of Panda since 1.2.0. You should be able to set it in your Config.prc and see an immediate effect.

The Config.prc file is the primary file that configures Panda runtime options. There is a default Config.prc file in c:/Panda3D-version/etc .

If your driver is kicking your card into software mode, you are going to see a drop in framerate, sometimes a significant drop, even if you are running on a CPU with lots of horsepower. CPUs just aren’t built for the kinds of fill rates that graphics cards achieve as a matter of course.

Can you tell me more about the problems you are experiencing? Does it go away in DX8 mode, like the original poster reported?

David

I have the same problem. Even displaying a very simple text on screen (couple of variables) can cost 20 fps. I have a high-end graphics card with the latest drivers, therefore obviously gl-support-clamp-to-border does not help. I can’t switch to DX8 because my shaders start behaving weird then. I’m using Panda 1.3.2 by the way. Is there any chance this might be a bug?

Thanks.

Well, there’s always the possibility of a bug, of course. :slight_smile: In this case, though, if it’s affecting your frame rate dramatically, but not having the same effect for everyone (for instance, I don’t observe this problem), it certainly sounds like a driver issue. The “bug” might be that something Panda is doing is triggering your driver to go into software rendering mode.

Have you tried the gl-support-clamp-to-border option? It is helpful to know whether you have tried it and it has no effect, or whether you have not tried it.

Does switching (temporarily) to DX8 solve the problem, even if it makes your shaders go funny? It will be helpful to know, since if it behaves differently in DX8, it’s another sign that it is a driver issue, and not something that Panda is necessarily doing wrong directly. How about in DX9 mode?

There’s another config variable to try:

text-wrap-mode clamp

Please let me know if this has an effect.

David

Unfortunately, neither text-wrap-mode clamp nor gl-support-clamp-to-border helps. In DX mode, my overall frame rate is lower and my shaders look funny, but enabling the HUD doesn’t decrease the frame rate dramatically (~3 fps). I hope this helps.

My question is, if something makes my program switch to software mode, shouldn’t it cause other artifacts like weird lighting? Or is it possible to switch to software mode just for aspect2d?

Thanks.

Your symptoms sound different from the original poster’s. I think you may be experiencing some different problem; it may not be related to software rendering at all.

But, to answer your question: software rendering can look exactly like hardware rendering. (In fact, the reference OpenGL implementation, which all the hardware manufacturers are trying to imitate, runs in software on an ordinary CPU.) A driver might decide to switch to software rendering for any part of the scene, then switch back to hardware rendering for the rest of it. OpenGL doesn’t really provide any interfaces to control whether the driver uses software or hardware rendering, or even determine which it chose, so it can be frustratingly difficult to diagnose or fix problems of this nature.

However, I don’t think that’s necessarily what you’re seeing. Since the frame rate is slower in DX8, it’s probably not a problem with software rendering at all (DX8 doesn’t have the same ambiguity with software rendering that OpenGL has; it will simply choose not to render something if the hardware doesn’t support it, rather than render it in software).

You didn’t specify the precise nature of the problems you’re experiencing, but I assume that it’s something to do with a dramatic frame rate decrease when you render onscreen text. How much decrease are we talking about? Does it happen with any text at all–even just one letter? Does the scale of the text–the size of each individual letter–make any difference? Or how about the quantity of text–does having more letters, or more words, matter? How about the number of individual DirectGui elements? Or is it something to do with the particular text that you are displaying? Is it possible it’s related to the Python code that updates the text?

Are you familiar with PStats? It can be a very useful tool for diagnosing mysterious performance issues like this. With PStats running, you can alternately hide and show the HUD, and visually see which part of the graphics engine experiences the slowdown.

David

I’ve been working with Cody, so we are talking about the same problem.

A further look into the code shows that the framerate drop is not associated with the display of on screen text, but with the use of setText every frame to update the onscreen text. (I’m using TextNodes to display text using a ttf font).

I have a task that updates the variable to be displayed (it’s a speed value). When I don’t update the text, my frame rate is 60 fps on average. Updating one variable drops it to about 45. I need to display about 6-7 variable texts; fortunately, the decrease in frame rate doesn’t get any worse with the additional ones.

Oh, right. setText() is a relatively expensive operation, not meant to be done every frame. Fortunately, it’s not necessary to do it every frame, since the user can’t visually process a number that changes 60 times a second anyway.

I suggest writing a task that updates the number every half second or so. This is the approach taken by the fps meter in the corner, for instance.
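
As a rough illustration of the throttling idea, here is a minimal sketch in plain Python. The class name ThrottledText and its parameters are made up for this example; it only assumes that you hand it some callable that accepts a string, such as a TextNode’s setText.

```python
import time


class ThrottledText:
    """Forward text to `set_text` at most once per `interval` seconds.

    `set_text` is any callable taking a string, for example a TextNode's
    setText (hypothetical wiring; this class knows nothing about Panda3D).
    """

    def __init__(self, set_text, interval=0.5, clock=time.monotonic):
        self._set_text = set_text
        self._interval = interval
        self._clock = clock
        self._next_update = None   # None means "nothing pushed yet"
        self._last_text = None

    def update(self, text):
        """Call every frame; returns True only if the text was pushed."""
        if text == self._last_text:
            return False           # unchanged, nothing to regenerate
        now = self._clock()
        if self._next_update is not None and now < self._next_update:
            return False           # too soon since the last push
        self._next_update = now + self._interval
        self._last_text = text
        self._set_text(text)
        return True


# Hypothetical use inside a per-frame task:
#   speed_label = ThrottledText(speed_text_node.setText, interval=0.5)
#   speed_label.update("%.0f" % speed)
```

Alternatively, the task itself can be scheduled at an interval, for instance with taskMgr.doMethodLater(0.5, ...) and returning task.again, so that the update code simply never runs more often than that.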

David

That works. It’s what I tried on a hunch, and it worked. setText is now being called 4 times a second and the framerate hasn’t taken a hit.

Thanks drwr.

I don’t have a folder under Panda3d called /etc (I’m on Linux and the folder I’m looking at is /usr/local/panda3d). Should I just create Config.prc?

I currently have 2 different nodes parented to aspect2d, one with 4 OnscreenText subnodes and the other with 6 (mayChange=True for all). They slow down performance by 133% (measured subjectively by counting the movements of a model).
When I add a line of code to detach them, the performance hit is gone. When I detach only one, the performance hit is halved. I checked the task I am running every frame and it does nothing with them.

I’m not sure where the Config.prc file is stored by default under Linux, but it must already exist somewhere, or you wouldn’t be able to open a graphics window. Try:

print(cpMgr)

to list the various *.prc files that have been loaded by Panda.

If you detach a TextNode, then that eliminates the cost to calling setText(). I bet you are calling it more than you think. Try putting:

notify-level-text debug

in your Config.prc (as soon as you locate it, of course). That will print a line of output each time the text is recomputed.
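
If you would rather check from the Python side, one option is to wrap the function you suspect and count how often it runs. This is only a sketch; count_calls is a helper invented for this example, and the list append below stands in for a real setText.

```python
import functools


def count_calls(func):
    """Wrap func so that wrapper.calls records how often it was invoked."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


# Stand-in for a real setText: collect the strings in a list.
shown = []
set_text = count_calls(shown.append)

for frame in range(3):
    set_text("speed: %d" % frame)

print("setText was called %d times" % set_text.calls)
```

Printing the counter once a second from a task should show quickly whether the text is being regenerated every frame.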

You can measure your frame rate much less subjectively by putting:

show-frame-rate-meter 1

in your Config.prc.
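
If you can’t locate or edit Config.prc, the same settings can usually be supplied from Python with loadPrcFileData, as long as that happens before the graphics window is opened. The sketch below assumes a Panda3D version that provides the panda3d.core module (older releases import the same function from pandac.PandaModules); the names PRC_OVERRIDES and apply_prc_overrides are made up for this example, and whether each variable exists depends on your Panda version.

```python
# Settings mentioned in this thread, in the same "name value" form
# that Config.prc uses.
PRC_OVERRIDES = """
gl-support-clamp-to-border 0
text-wrap-mode clamp
show-frame-rate-meter 1
"""


def apply_prc_overrides(prc_text=PRC_OVERRIDES):
    """Feed prc_text to Panda3D as if it were an extra .prc file."""
    # Imported here so this module can be loaded without Panda3D installed.
    from panda3d.core import loadPrcFileData
    # The first argument is just a label for this block of settings.
    loadPrcFileData("thread-overrides", prc_text)


# Hypothetical use, before the window is opened:
#   apply_prc_overrides()
#   from direct.showbase.ShowBase import ShowBase
#   ShowBase().run()
```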

David

Aha. Config.prc is under the root folder /etc on Linux.

text-wrap-mode clamp did the trick!!

(Oh, and the objective framerate? It was 74 vs 15 :open_mouth:)

I’m having a similar issue.
We have a HUD with several frames and labels (it reports 105 nodes in total, though it should be around 2-3 frames, 19 labels or so, and maybe a few more things). When this particular group is hidden, the frame rate jumps from around 240-250 fps to around 430-450 fps.
I’ve tried several things, including the config settings above in all the prc files in use; the only setText calls going on are from the fps meter.
The system is Ubuntu Jaunty, Panda 1.6.1, using GL.

Does anyone have any idea what it might be? Or whether it’s even abnormal?

The difference between 240 fps and 430 fps is only 1.8 ms. That means that if you were running at 75 fps, you would be running at 66 fps when you enabled the text. I’m not sure if you’ve really got a problem here. I do understand that every millisecond counts, but so far you seem to be doing fine.
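
To make that arithmetic explicit, here is a small sketch that converts frame rates to per-frame times. The function names are just for illustration.

```python
def frame_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def fps_after_extra_cost(base_fps, extra_ms):
    """Frame rate once extra_ms of work is added to every frame."""
    return 1000.0 / (frame_ms(base_fps) + extra_ms)


# Cost of the HUD: the frame time at 240 fps minus the frame time at 430 fps.
hud_cost_ms = frame_ms(240) - frame_ms(430)

# The same cost applied to a machine that would otherwise run at 75 fps.
slow_machine_fps = fps_after_extra_cost(75, hud_cost_ms)

print("HUD cost: %.2f ms per frame" % hud_cost_ms)
print("75 fps becomes %.1f fps" % slow_machine_fps)
```

Comparing milliseconds per frame rather than frames per second keeps the cost of a feature the same number regardless of how fast the rest of the scene runs.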

Still, there are things that you can do to optimize all this. 105 nodes is quite a few; you could attempt to reduce this count by combining neighboring nodes, or use more aggressive flattening. You should check in PStats to see if that 1.8 ms is spent mostly in cull (indicates you should reduce your node count) or mostly in draw (you should reduce your geom count or the rendering complexity).

David

Thanks,

Yeah, that’s pretty much it. I got a bit worried about the big change in numbers and wondered about the effect on a slower machine, but it didn’t cross my mind to check the proportional difference. Especially since I’ve been tackling performance hits lately :slight_smile:.

I checked with PStats, and the difference with the UI on or off seems to be shared between cull and draw. By that I mean they increase and drop by pretty much the same amount. I will try to do more regarding flattening; there’s a considerably high number of nodes in certain scenes (around 500).