Suggestions wanted for unusual rendering problems

I’m working on a project right now in Panda3D (1.6.2) that views and edits mesh files from an old PlayStation game (Final Fantasy Tactics). The program is mostly working, but I’ve encountered two problems caused by the way the mesh files are stored and rendered on the PlayStation itself, and I don’t know the best way of dealing with them.

So you can get an idea of what I’m talking about, here’s an example of what the mesh files look like:

ffhacktics.com/maps.php?id=3

The first problem is that there’s a single 4-bit-per-pixel 256x1024 texture image for the whole scene; the 4-bit values are palette indexes, and each polygon selects one of sixteen 16-color palettes to apply to the texture while that polygon is being rendered.

One way to solve this would be to create 16 copies of the texture in Panda3D, each with a different palette applied. Then, I could create 16 empty NodePaths, each with a different texture applied to it, and then parent each polygon to the appropriate NodePath. This is a fairly simple way to do it, but it requires duplicating the texture in memory, which makes editing a little more of a pain.
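
For concreteness, here’s roughly what I have in mind (untested sketch; `indexes` and `palettes` are stand-ins for my actual data structures):

```python
from pandac.PandaModules import PNMImage, Texture, NodePath

def make_palette_texture(indexes, palette):
    # indexes: 256x1024 2D array of 4-bit palette indexes.
    # palette: list of 16 VBase4D colors.
    img = PNMImage(256, 1024)
    img.addAlpha()
    for y in range(1024):
        for x in range(256):
            img.setXelA(x, y, palette[indexes[y][x]])
    tex = Texture('scene-palette')
    tex.load(img)
    return tex

# One parent NodePath per palette; each polygon would be parented to the
# NodePath matching its palette number.
palette_roots = []
for i in range(16):
    root = NodePath('palette-%d' % i)
    root.setTexture(make_palette_texture(indexes, palettes[i]), 1)
    root.reparentTo(render)
    palette_roots.append(root)
```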

Another possibility would be to use shaders. If I could pass the palettes to the shaders (I’ve done this before in OpenGL without Panda3D by making a special 1D texture that contained a pixel for each palette color), then I could have the shader decide which palette to apply to the polygon. I haven’t used shaders in Panda yet, though, and I’m not sure I’d be able to use that same 1D texture trick, so I don’t know if this method will work.
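
If the 1D-texture trick does carry over, I imagine the Cg would look something like this, with the 16 palettes packed into a 16x16 lookup texture instead (one row per palette). Completely untested, and the texel addressing in particular is a guess:

```python
from pandac.PandaModules import Shader, Vec4

PALETTE_SHADER = """//Cg
void vshader(float4 vtx_position : POSITION,
             float2 vtx_texcoord0 : TEXCOORD0,
             uniform float4x4 mat_modelproj,
             out float4 l_position : POSITION,
             out float2 l_texcoord0 : TEXCOORD0)
{
    l_position = mul(mat_modelproj, vtx_position);
    l_texcoord0 = vtx_texcoord0;
}

void fshader(float2 l_texcoord0 : TEXCOORD0,
             uniform sampler2D tex_0,       // indexed scene texture
             uniform sampler2D k_palettes,  // 16x16 palette lookup
             uniform float4 k_palette,      // palette number in .x
             out float4 o_color : COLOR)
{
    // Assumes index i is stored in the red channel as i/15.
    float i = tex2D(tex_0, l_texcoord0).r;
    float u = (i * 15.0 + 0.5) / 16.0;   // map to texel centers
    float v = (k_palette.x + 0.5) / 16.0;
    o_color = tex2D(k_palettes, float2(u, v));
}
"""

shader = Shader.make(PALETTE_SHADER)
node.setShader(shader)
node.setShaderInput('palettes', palette_tex)
node.setShaderInput('palette', Vec4(3, 0, 0, 0))  # e.g. palette 3
```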

The second problem is that each polygon has a set of 12 bits indicating when it should be drawn, based on the rotation of the camera around the scene. This works somewhat like back-face culling, but it’s explicit and provides more flexibility, because there are 12 angle ranges (NE, NNE, NNW, etc.) instead of 2 (front and back).

One solution for this is to have 12 empty NodePaths, each of which contains the full set of polygons that should be visible from that angle. Then, as the camera rotates, check the camera’s angle and show/hide the appropriate NodePaths. This is simple, but creates a lot of duplicate information (many polygons would appear in all 12 NodePaths).
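
The show/hide step itself would be cheap; something like this (assuming the 12 ranges divide the circle evenly, which I still need to verify):

```python
def update_visible_range(camera, angle_roots):
    # angle_roots: 12 NodePaths, each holding every polygon whose
    # 12-bit mask has that range's bit set.
    h = camera.getH() % 360.0
    current = int(h / 30.0) % 12  # 12 ranges of 30 degrees each
    for i, root in enumerate(angle_roots):
        if i == current:
            root.show()
        else:
            root.hide()
```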

I guess I could do away with the empty NodePaths in both cases and individually show or hide polygons or set their textures, but I’m a little worried about the performance impact that might have with 1500 or so polygons. Maybe that’s not such a big deal, though.

Is there an easy solution to either of these problems that I’m missing?

Thanks,

Hmm, interesting.

Your two solutions for the texture palette both sound sensible. If you’ve previously written an OpenGL shader to solve this problem in another context, I don’t expect you would have difficulty doing so again in Panda. But I also don’t think the texture-duplication approach would be a bad idea, either.

The 12-NodePath approach sounds like a fine idea for the camera-angle problem, too. I don’t think the duplicate information would be anything to worry about; modern PCs are so much more capable than the PlayStation that they won’t blink at the extra vertices. It seems you could also solve this problem with a shader that duplicates the camera-angle logic on a per-polygon basis, but that would be complicated and probably slower than the 12-NodePath idea.

You can’t individually operate on polygons at the scene graph level, unless you split each polygon into its own Geom, which would indeed be extremely expensive. So I don’t recommend attempting your final solution.

David

Whoops! I’ve been putting each polygon in a separate Geom all along because I wanted each one to be editable (since it’s a mesh-editing program). I didn’t realize that was bad. The program already runs fairly fast, but now you’ve got me wondering if I should change it.

So you’re saying I should make a NodePath for each combination of viewing angle and texture (so there would be 16 * 12 = 192, not counting empty ones for nesting). Each NodePath contains a single Geom, and each Geom contains a GeomPrimitive for each polygon that matches the NodePath’s viewing angle / texture combination. Is that right?
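
In other words, each bucket would be built something like this (a sketch assuming triangles for simplicity, though my real data also has quads):

```python
from pandac.PandaModules import (Geom, GeomNode, GeomTriangles,
                                 GeomVertexData, GeomVertexFormat,
                                 GeomVertexWriter, NodePath)

def build_bucket(polys):
    # polys: the polygons for one viewing-angle/palette combination,
    # each a list of three ((x, y, z), (u, v)) vertex tuples.
    vdata = GeomVertexData('bucket', GeomVertexFormat.getV3t2(),
                           Geom.UHStatic)
    vertex = GeomVertexWriter(vdata, 'vertex')
    texcoord = GeomVertexWriter(vdata, 'texcoord')
    geom = Geom(vdata)
    vi = 0
    for poly in polys:
        prim = GeomTriangles(Geom.UHStatic)  # one primitive per polygon
        for (x, y, z), (u, v) in poly:
            vertex.addData3f(x, y, z)
            texcoord.addData2f(u, v)
        prim.addConsecutiveVertices(vi, 3)
        prim.closePrimitive()
        geom.addPrimitive(prim)
        vi += 3
    node = GeomNode('bucket')
    node.addGeom(geom)
    return NodePath(node)
```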

When the user edits a polygon, I’ll need to rebuild all the Geoms that contain that polygon, correct?

Thanks,

That’s right; that arrangement gives the fastest raw rendering performance. Of course, you’ll have to balance that against the cost of regenerating all of those NodePaths when a polygon is edited.

If you’re getting acceptable performance with the one-Geom-per-polygon method, you don’t need to change it. But you should understand that the ceiling on the number of Geoms is quite low; a typical PC can render about 300 Geoms at 60fps, or 600-1000 at 30fps. So you are sharply limited in the number of polygons you can render in this fashion. (With multiple polygons per Geom, you can render millions of polygons at a steady 60fps, depending on your graphics card.)

David

Such fast replies! Thanks, you’ve been very helpful. I think I’ll stick with separate Geoms for each polygon for now, since it’s easier. I can always optimize later, right? :slight_smile:

I underestimated how long it would take to make 16 copies of my texture: about 7 seconds. It would be nice if I could get that down to 1 second or less.

To make each copy of the texture, I create a PNMImage and then loop through a 2D array of palette indexes, calling setXelA(x, y, palette[palette_index]) for each pixel (palette contains a VBase4D for each color in the palette). I’ve already optimized this loop as much as I know how to. Is there a different way to do this that’s faster?

A while ago, you mentioned that I might be able to do it another way, but it sounded like my textures aren’t in a good format for that (since they’re not RGBA with 8 bits per channel). Might it still be faster to convert the texture to a different format (at runtime) and then use setRamImage or something similar?

I suppose I could try a shader instead, but I’m not looking forward to that. :confused:

Thanks,

That 7 seconds is probably Python time for running through the loops. There’s not much way to improve that if Python is going to have to visit every pixel, short of reducing the number of pixels (for instance, by reducing the size of your textures). You might be able to get a slight gain by storing PNMImage.PixelSpec values, instead of VBase4D values, and using setPixel() instead of setXelA(). (The Pixel values are in the range 0…255 instead of 0…1, so they don’t require being counterscaled.)
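
Untested, but something like this is what I mean; convert your palette to PixelSpec once, outside the loop:

```python
from pandac.PandaModules import PNMImage

# Convert the 16 VBase4D colors to 0..255 PixelSpec values up front.
pixels = [PNMImage.PixelSpec(int(c[0] * 255), int(c[1] * 255),
                             int(c[2] * 255), int(c[3] * 255))
          for c in palette]

img = PNMImage(256, 1024)
img.addAlpha()
for y in range(1024):
    row = indexes[y]  # hoist the row lookup out of the inner loop
    for x in range(256):
        img.setPixel(x, y, pixels[row[x]])
```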

The biggest bang would be to code the loop in C++, and call it from Python.

David