set_shader_input() performance problems

Hello!

In my application, I have ~500 simple objects on screen that are rendered using a custom Cg shader. Each object has a shader uniform that must be updated independently every frame. Calling set_shader_input() on each of these objects every frame costs ~8ms for the set_shader_input() calls alone, plus a ~8ms increase in GarbageCollectStates time according to pstats.

In other words, setting a uniform on 500 different objects in my scene every frame increases frame time by ~16ms.

I realize that there are ways to accomplish my goal without using shader uniforms, but why is this so expensive? Setting a uniform is already done many times per object for transformation matrices and other values without performance problems. I read the thread "Performance questions on setShaderInput and setShaderAuto", but it only applies to auto-generated shaders, which neither my objects nor their children are using.

I did take a look at the pandaNode.cxx source, and set_attrib(), which is called by set_shader_input(), seems to mark the bounding volumes of the node and all of its parents as stale, even if we are simply setting a shader attrib or color attrib! Could this be the issue?

Any help or workarounds would be greatly appreciated. Thanks!
Matt

Hmm, that does sound problematic. Each call to set_shader_input creates a new RenderState, and the fact that GarbageCollectStates shows up in pstats makes me think that this may be the problem. What happens when you disable the state cache by setting “state-cache 0” in Config.prc?
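
For reference, the setting goes on its own line in Config.prc. The comment line is only there for illustration:

```
# Disable the RenderState cache for this test
state-cache 0
```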

When setting “state-cache 0”, time spent in GarbageCollectStates drops to nearly zero, but the time spent calling set_shader_input() is still ~8ms, and now the time spent in Cull has significantly increased. Overall, performance is slightly worse.

Any ideas for ways that I can work around this issue? Maybe tweaking my copy of the Panda source to modify a RenderState in-place without making a copy if the set_attrib() argument is of ShaderAttrib type, or is that just asking for trouble?

Matt

Hmm, I don’t think that would work. Could you perhaps reduce the problem down to a simple Python script that demonstrates the issue, so that I can debug it for myself?
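
A repro along those lines could start from something like the sketch below. It is not the original application: the helper names (make_nodes, time_updates) and the input name per_object_value are made up for illustration, and it assumes bare NodePaths with no window or shader are enough to show the cost of the call itself.

```python
import time


def make_nodes(count=500):
    """Build `count` bare NodePaths under one root (no window or shader)."""
    # Imported here so the timing helper below does not itself depend on Panda3D.
    from panda3d.core import NodePath

    root = NodePath("root")
    nodes = [root.attach_new_node(f"obj-{i}") for i in range(count)]
    return root, nodes


def time_updates(nodes, frames=100, input_name="per_object_value"):
    """Return the mean milliseconds per simulated frame spent in set_shader_input()."""
    start = time.perf_counter()
    for frame in range(frames):
        for i, node in enumerate(nodes):
            # Each object gets its own value, and it changes every frame.
            node.set_shader_input(input_name, float(frame + i))
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / frames
```

With Panda3D installed, `root, nodes = make_nodes()` followed by `print(time_updates(nodes))` gives a rough per-frame figure for the calls alone. It does not render anything, so the GarbageCollectStates time reported by pstats is not captured here.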