Look at my pstats; what is your conclusion?

This is my pstats:

What do you think is taking the most time? 27fps is not great, and it looks like it spends most of its time in Draw. Would I have to put in LODs to fix this problem?

How would one find the source of the problem using pstats?
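For reference, pstats data comes from running the PStats server and pointing the client at it. A minimal Config.prc fragment, assuming the server runs on the same machine:

```
want-pstats 1
pstats-host localhost
```

Start the pstats server program before launching the game, then drill down from the Frame strip chart into App, Cull, and Draw to see which stage dominates.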

Hmm, just a guess: "State changes" for transforms takes some time. Have you tried using flattenLight on your objects, which applies transforms before rendering? You might also try a heavier flattening method to reduce the number of batches.

The ‘make current’ something in the Draw graph, what is that exactly?

“make current” - I think that draws the scene; it’s the heaviest call in the system. I will try to think about where to flatten, but it’s really hard because all of them are moving parts.

Actually, “make current” doesn’t draw the scene, it’s just asking the graphics device to make the indicated window the current context and prepare it for drawing. It’s normally a very, very tiny measurement.

Since you’re getting substantial time in “make current”, it may mean that your graphics card is still busy drawing the previous scene (too much depth complexity? shaders too complex?) or it may mean some problem with multiple windows at once.

David

Yes, I do have complex shaders. I will yank them out and see what happens. I don’t think I have much depth complexity. Do you have any tips on speeding up shaders?

EDIT:
I removed most of my shaders, but the make current call still dominates.

glxMakeCurrent is used to switch between OpenGL contexts - a slow operation. There are two possible reasons I can think of that it might be switching contexts:

possibility 1: Those offscreen buffers might be pbuffers. Each pbuffer has to have its own OpenGL context, therefore necessitating the constant use of glxMakeCurrent. This is one of the big reasons that pbuffers were deprecated in favor of FBOs.

possibility 2: Those offscreen buffers were created without passing in a handle of the main window’s gsg. In that case, the offscreen buffers will create their own gsgs, which in turn means they will create their own OpenGL contexts, which means they will need to use glxMakeCurrent.

So what it comes down to is this: to speed this up you need to take control of the offscreen buffer creation process.

  1. When creating an offscreen buffer, supply an explicit gsg from the main window.

  2. Try forcing the type of the offscreen buffer. To force a parasite buffer, use the BF_require_parasite flag. To force an FBO, use the BF_can_bind_every flag.

Then, there’s the question of whether or not FBOs are functional under Linux - I’ve heard that they’re sometimes not. That could explain why you’re getting pbuffers, if you are getting pbuffers.

This is how I create the buffers. It’s very similar to the Tron blur demo.

from panda3d.core import Vec4, NodePath, Shader

def makeFilterBuffer(srcbuffer, name, sort, prog):
    # Offscreen buffer hosted by the main window; bufferX and bufferY
    # are module-level globals holding the buffer size, and base is the
    # global ShowBase instance.
    blurBuffer = base.win.makeTextureBuffer(name, bufferX, bufferY)
    blurBuffer.setSort(sort)
    blurBuffer.setClearColor(Vec4(1, 0, 0, 1))
    # Render a private 2-d scene into the buffer.
    blurCamera = base.makeCamera2d(blurBuffer)
    blurScene = NodePath("new Scene")
    blurCamera.node().setScene(blurScene)
    # Full-screen card textured with the previous stage's output.
    shader = Shader.load(prog)
    card = srcbuffer.getTextureCard()
    card.reparentTo(blurScene)
    card.setShader(shader)
    return blurBuffer

I have removed the buffers completely for this run. No buffers get created, but the make_current call still takes up a lot of time.

I think in the other post drwr said that cards can be tricky about where they report their time spent. I would think that the Draw call is the time it takes to draw the screen. I would love to see inside it, though.

I’m puzzled. I can’t imagine why it would be calling make_current if the application doesn’t use buffers. Does it have multiple windows?

No, I think the card is just lying about the draw times. When it’s time to switch contexts, it just waits till drawing is done. It would be nice if I could check when drawing is done and do my own thing.

What I’m saying is, I don’t know why it would be calling make_current at all.

Yes, that is a bit odd. The people on ##openGL said not to make the call at all. Normally I have to make the call in order to update the render-to-textures, but not in the last picture, because I disabled them. That is a very bad FPS for the amount of stuff I have on screen.

Panda will call make_current() at the start of every frame, even if there is only one window. This function normally returns instantly.

What do you see if you run pstats on an empty pview window? Does it still spend a long time in make_current?

David

I ran it with just pview. It’s a little hard to measure because of such small time intervals, but make_current is still the dominant call. I have a Mobile ATI X600.

It might be worthwhile building a custom version of Panda, with the glxMakeCurrent() call in glxGraphicsWindow::begin_frame() commented out, just to see what effect this has on your graphics card. There are two possibilities:

(1) The time we are currently seeing in make_current will go away completely, and you suddenly have a great frame rate. If this is the case, then something about glxMakeCurrent() itself is particularly expensive in your particular graphics driver, and we will need to find a way to avoid its call altogether in Panda.

(2) The time we are currently seeing in make_current will now appear in some other call, for instance, flip, and you will still have the same overall frame rate. If this is the case, then your graphics driver is just performing an implicit glFlush() or equivalent inside glxMakeCurrent(), and the make_current call is not itself the problem. We’ll have to research more closely to determine what is causing the slowdown.

David

With no texture buffers there is no need for glxMakeCurrent, right? Also, the people on ##openGL suggested I check whether the context is already current and just not call it if it is. Though I think it’s required for render-to-texture stuff, right?

Where do cards normally spend their time? Is there any sort of database of cards and their known performance problems with Panda3D?

Right, well, with no buffers and no other windows–in short, with only one graphics context–there’s only a need to call glxMakeCurrent() once, when the context is created. (This is already done by Panda in another place, though.)

The spec doesn’t clarify what’s supposed to happen if the context is already the current one when you call glxMakeCurrent(), but most drivers seem to trivially return in this case. It might be the case that yours is doing something stupid, in which case it does make sense to check if the context is already current before calling glxMakeCurrent() again. But if it’s not the case that your driver is doing something stupid, there’s no advantage to making this check first.

But, yeah, we could just make three calls to glXGetCurrentDisplay(), glXGetCurrentDrawable(), and glXGetCurrentContext(), and only call glxMakeCurrent() if one of them doesn’t match the context we want to set. If the OpenGL consortium really recommends this convoluted check, and assuming it actually helps, I’m happy to do it.
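The check being described can be sketched in plain Python. This is an illustration only, not Panda's actual C++ code: the callback stands in for glXMakeCurrent(), and the cached value stands in for what glXGetCurrentContext() would return.

```python
class ContextGuard:
    """Skip redundant context switches, mimicking the proposed glxMakeCurrent guard."""

    def __init__(self, make_current):
        self._make_current = make_current  # the expensive call (stand-in for glXMakeCurrent)
        self._current = None               # last context made current (stand-in for glXGetCurrentContext)
        self.switches = 0                  # count of real switches performed

    def ensure_current(self, context):
        # Only pay for the switch when the context actually changes.
        if context != self._current:
            self._make_current(context)
            self._current = context
            self.switches += 1

guard = ContextGuard(make_current=lambda ctx: None)
for frame in range(100):
    guard.ensure_current("main-window")  # called at the start of every frame
# Only the first call performs a real switch; the other 99 return after one comparison.
```

With a single window, only the first frame pays for a real switch, which is what a well-behaved driver should already be doing internally.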

I’ve checked in this modification now. If you like, you could just pick up the latest Panda and build it to see if it makes a difference. (You might need to wait for the change to propagate to the anonymous cvs repository.)

David

While I try to figure out how cvs works: could the timer be timing not this call but something else that is around there? Maybe there is some odd function doing something stupid that’s being timed along with it.

Hmm, that is possible. It looks like there are several things that happen in Panda within the same timer bracket. Most of these things should be trivially fast, but it’s complex enough that I can’t promise it will be just by looking at the code. It’s possible that something in there is taking unexpectedly long.

If it’s something in Panda, though, it’s very strange that it’s only bothering you and not me. (I don’t measure any significant time spent in the make_current timer.)

David

Compiling the new version of Panda (1.4.3): after an hour of compiling off cvs, I get this error:
CG ERROR : The profile is not supported.
Did someone change the way shaders work?

Even though I am unable to run the game, I ran pview and pstats:

the make_current is no more, and it looks significantly faster - I wish I could run my game!

I also ran a little stress test: at almost 300k vertices I am still at 40fps. But these are unshaded, untextured ones. I think the other Panda build runs just as fast.
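Just as a sanity check on those stress-test numbers (plain arithmetic, nothing Panda-specific):

```python
# Rough throughput implied by the stress test quoted above.
vertices = 300_000          # vertices in the test scene
fps = 40                    # measured frame rate
verts_per_second = vertices * fps
print(verts_per_second)     # 12000000, i.e. about 12M vertices/second
```

That gives a baseline of roughly 12 million unshaded, untextured vertices per second to compare against once shaders and textures go back in.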

Hopefully there is some way to get 40fps in-game with all the stuff on. It’s rare that one would see all the ships at once, and most of them will probably be LODed out … The ideal game would look something like this aff2aw.com/affdata/nice/ but currently it runs at 5-15fps.
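"LODed out" here is just distance-based model selection. Panda3D's LODNode handles this for you; the sketch below, with made-up threshold distances, only shows the concept:

```python
def pick_lod(distance, thresholds):
    """Return the LOD index for a given camera distance.

    thresholds: ascending switch-out distances, e.g. [50, 200, 800];
    index 0 is the highest-detail model, len(thresholds) means culled.
    """
    for level, limit in enumerate(thresholds):
        if distance < limit:
            return level
    return len(thresholds)  # beyond the last threshold: don't draw at all

# A hypothetical three-level setup for the ships:
assert pick_lod(10, [50, 200, 800]) == 0    # close: full detail
assert pick_lod(500, [50, 200, 800]) == 2   # far: lowest detail
assert pick_lod(1000, [50, 200, 800]) == 3  # culled entirely
```

The point is that most ships end up at the cheap end of the table (or culled), so the worst-case "all ships at once" view stays rare.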

Well, Josh kindly helped me with the shaders, so now it all works. It looks like it spends all of its time in the flip. Sadly the FPS did not change, so I am looking for other ways to optimize the scene.