Can you describe the Rendering Pipeline please?

There are a lot of references in the source code to the Rendering Pipeline, but it’s hard to work out exactly what this means in practice.

Can you describe what the Rendering Pipeline is in Panda3D? How do things like GSG, Window, DisplayRegion, GraphicsEngine, CullTraverser, etc. relate?

Hugh

Yes, this certainly deserves some explanation.

The GraphicsEngine is the owner of all of this stuff; it owns pointers to all of the GraphicsWindows you create, and is responsible for the render loop. There is normally one GraphicsEngine in the application, or at least, one per isolated window chain.

There are also one or more GraphicsPipe objects. Each of these represents a handle to the underlying graphics API; for instance, there is a different GraphicsPipe for OpenGL, DirectX7, DirectX8, and DirectX9. Again, there is typically only one of these in an application, although you can load all of them at the same time if you like. The global GraphicsPipeSelection object is responsible for maintaining the list of available graphics pipes; loading a DLL like libpandagl.dll automatically registers its GLGraphicsPipe with the GraphicsPipeSelection.

The GraphicsPipe actually contains the hooks for creating a new graphics context and window (although the public API to do this is via the GraphicsEngine).

The GraphicsStateGuardian object is the object that represents a particular rendering context. This object is responsible for changing rendering state; it keeps track of the current context’s state, and when it is issued a request to change to a new state, it issues only the minimum delta between the current state and the new state. Thus the name. The GSG is also responsible for issuing the actual draw-primitive commands like glDrawElements(). You can create one or multiple GraphicsStateGuardians for a given graphics API; typically, there will be just one.
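To make the "minimum delta" idea concrete, here is a toy sketch in plain Python. It is not Panda's code: ToyStateGuardian and its attribute names are made up for illustration, and it ignores details such as attributes being removed from the state. It only shows that a repeated request for an attribute that is already set results in no state change being issued.

```python
_UNSET = object()  # sentinel meaning "this attribute has never been set"


class ToyStateGuardian:
    """Hypothetical stand-in for a GSG: remembers the current state and
    issues a change only for attributes that actually differ."""

    def __init__(self):
        self.current = {}  # attribute name -> value currently set on the context
        self.issued = []   # log of the state changes that were really issued

    def set_state(self, target):
        for name, value in target.items():
            if self.current.get(name, _UNSET) != value:
                self.issued.append((name, value))
                self.current[name] = value


gsg = ToyStateGuardian()
gsg.set_state({"texture": "brick", "depth_test": True})
# Same texture again: only the depth_test change is issued.
gsg.set_state({"texture": "brick", "depth_test": False})
print(gsg.issued)
```

The second call asks for two attributes but only one change is logged, which is the whole point of having an object that guards the state.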

The GraphicsWindow is, of course, the window that presents the output from a particular rendering context. There might be a one-to-one relationship between GraphicsStateGuardians and GraphicsWindows, or it might be one-to-many (a single GraphicsStateGuardian can render to multiple windows, by drawing to each one in sequence).

Actually, there is also a GraphicsBuffer class, which is similar to GraphicsWindow except it encapsulates an offscreen buffer. GraphicsWindow and GraphicsBuffer both inherit from a common class, GraphicsOutput.

Within a GraphicsWindow (actually, within a GraphicsOutput) there may be one or more DisplayRegions. Each DisplayRegion is a rectangular region within the window for rendering into; OpenGL calls this a viewport. Each DisplayRegion is associated with a Camera.

During GraphicsEngine::render_frame(), all of the windows are visited in sequence, and for each window, all of the DisplayRegions are visited. For each DisplayRegion, a SceneSetup object is created with the camera information, and then a CullTraverser object is created to perform the rendering traversal.

During the traversal, the CullTraverser discovers all of the Geoms that are to be rendered, and determines the net RenderState and TransformState for each one. It also determines which bin the Geom goes into (that’s part of the render state). It creates a CullableObject for each Geom that records the Geom along with its RenderState and TransformState, and saves this object in the appropriate bin.

After the cull traversal has been performed, the draw traversal begins; it walks through all of the bins in order, and for each bin, it pulls out all of the CullableObjects in the bin-defined order. For each CullableObject, it issues a set_state_and_transform() request on the GSG, and then it issues the draw primitive requests.
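The two traversals can be sketched roughly like this in plain Python. This is a simplified model, not the real CullTraverser: ToyCullable, cull(), draw() and the scene data are all hypothetical, the visibility test is skipped entirely, and objects within a bin simply keep their insertion order instead of being sorted by the bin's own rule.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for CullableObject: a geom plus its net state and transform.
ToyCullable = namedtuple("ToyCullable", ["geom", "state", "transform"])


def cull(visible_geoms):
    """Cull step: file each geom into the bin named by its render state."""
    bins = defaultdict(list)
    for geom, state, transform in visible_geoms:
        bins[state["bin"]].append(ToyCullable(geom, state, transform))
    return bins


def draw(bins, bin_order):
    """Draw step: walk the bins in order; for each object, set the state
    and transform first, then issue the draw request."""
    calls = []
    for bin_name in bin_order:
        for obj in bins.get(bin_name, []):
            calls.append(("set_state_and_transform", obj.state["bin"], obj.transform))
            calls.append(("draw", obj.geom))
    return calls


scene = [
    ("window_glass", {"bin": "transparent"}, (0, 5, 0)),
    ("floor", {"bin": "opaque"}, (0, 0, 0)),
    ("wall", {"bin": "opaque"}, (0, 10, 0)),
]
bins = cull(scene)
calls = draw(bins, ["opaque", "transparent"])
drawn = [call[1] for call in calls if call[0] == "draw"]
print(drawn)
```

Note that the glass is listed first in the scene but drawn last, because the draw step follows the bin order rather than the order in which cull happened to find things.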

Eventually, the draw traversal will be run in a separate thread, so that the main thread can get on with the job of culling the next frame in parallel. For now, they are both run in sequence for simplicity.

David

Oh, a bit more clarification on the phrase “rendering pipeline”–this refers to the three-step process of rendering: App, Cull, Draw. “App” is anything handled by the application, most of which is outside of Panda’s domain. This is all of your Python code and some other miscellaneous stuff, like collisions. “Cull” begins more or less when you call GraphicsEngine::render_frame(); it is the process of walking through the scene graph and discovering all of the renderable geometry that is within the viewing frustum (and culling out all of the rest, hence the name). “Draw” is the third step, which is sending all of the geometry discovered by “Cull” to the graphics card.

In a multithreaded environment, all three of these steps can run in parallel, in principle. First, App runs on frame 0. Then, Cull starts processing frame 0, while App starts in on the next frame, frame 1. Then, Draw starts drawing frame 0, while Cull processes frame 1, and App starts in on frame 2. And so on. With this approach, you can theoretically triple your frame rate if you happen to have three CPUs (or at least double it if you only have two), although you don’t improve the total latency between the time something happens in App and the time you actually see it onscreen.
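The staggering described above can be written out as a small schedule. This is just an illustration in plain Python, assuming every stage takes exactly one time step per frame; pipeline_schedule() is a made-up helper, not anything in Panda.

```python
STAGES = ["App", "Cull", "Draw"]


def pipeline_schedule(num_frames):
    """Return, for each time step, a dict of stage -> frame being processed."""
    steps = []
    total_steps = num_frames + len(STAGES) - 1
    for t in range(total_steps):
        busy = {}
        for offset, stage in enumerate(STAGES):
            frame = t - offset  # each stage lags the previous one by one step
            if 0 <= frame < num_frames:
                busy[stage] = frame
        steps.append(busy)
    return steps


schedule = pipeline_schedule(4)
for t, busy in enumerate(schedule):
    print(t, busy)
```

Under that assumption, four frames finish in 6 time steps instead of the 12 they would take run strictly in sequence, yet each individual frame still needs three steps to get from App to Draw, which is why the latency does not improve.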

This kind of parallel processing is called a pipeline, hence the term “graphics pipeline”.

Panda does not actually support this multithreaded pipeline yet; instead, App, Cull, and Draw are always run in sequence, all in the main thread. But all of the infrastructure is in place, and it will be supported eventually. This is the CData stuff you see in all of the scene graph objects; these are required because we need to maintain up to three different versions of each scene graph object, since App, Cull, and Draw may need to be working on different versions of the same object.
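As a rough picture of why three versions are needed, here is a toy model in plain Python. It is only an assumption about the general idea, not how the CData classes are actually implemented: ToyCycledData, its methods, and the stage numbering (0 for App, 1 for Cull, 2 for Draw) are all invented for this sketch.

```python
import copy


class ToyCycledData:
    """Hypothetical holder that keeps one copy of an object's data per stage."""

    NUM_STAGES = 3  # 0 = App, 1 = Cull, 2 = Draw (numbering is an assumption)

    def __init__(self, initial):
        self.copies = [copy.deepcopy(initial) for _ in range(self.NUM_STAGES)]

    def write(self, value):
        # App always modifies the newest copy.
        self.copies[0] = value

    def read(self, stage):
        return self.copies[stage]

    def cycle(self):
        # At a frame boundary, each stage inherits the data of the stage before it,
        # and App starts the new frame from a copy of its latest data.
        self.copies = [copy.deepcopy(self.copies[0])] + self.copies[:-1]


pos = ToyCycledData((0, 0, 0))
pos.write((1, 0, 0))  # App moves the object during frame 1
pos.cycle()
pos.write((2, 0, 0))  # App moves it again during frame 2
print(pos.read(0), pos.read(1), pos.read(2))
```

At this point App sees the frame 2 position, Cull sees the frame 1 position, and Draw still sees the original one, so each stage works on a consistent snapshot of its own frame.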

David

Oh, a bit more clarification on the phrase “rendering pipeline”–this refers to the three-step process of rendering: App, Cull, Draw. “App” is anything handled by the application, most of which is outside of Panda’s domain. This is all of your Python code and some other miscellaneous stuff, like collisions. “Cull” begins more or less when you call GraphicsEngine::render_frame(); it is the process of walking through the scene graph and discovering all of the renderable geometry that is within the viewing frustum (and culling out all of the rest, hence the name). “Draw” is the third step, which is sending all of the geometry discovered by “Cull” to the graphics card.

Ah, that’s a big bit of insight that I was missing. That explains the meaning of “cull_callback”.

One thing, a little off-topic for this thread perhaps: when would a node not have a cull_callback? I.e., there is an inline function to state whether a node has one or not, so presumably there are some that don’t?

Panda does not actually support this multithreaded pipeline yet; instead, App, Cull, and Draw are always run in sequence, all in the main thread. But all of the infrastructure is in place, and it will be supported eventually

I guess this is going to become important as we near the end of Moore’s law (transistor density doubling every X years), and multicores become more popular?

Hugh

Sure. Lots of nodes don’t. The base class, PandaNode, doesn’t, for instance. A PlaneNode doesn’t; it just holds a plane. A Light doesn’t. A Camera doesn’t. Basically, any node that doesn’t have any fancy rendering requirements doesn’t have a cull_callback.

Exactly. We are starting to see this now; the multicores have entered the consumer market and more and more gamers are picking them up.

David