Hitting a speed limit

Well, I could give the on-and-off colors different normals. This is sounding more and more hackish.

I would have to drop the ambient light for a directional light, and just have one direction be one color. Then I’m limited to at most six colors ever, or I’d have to roll back the change.

Also, I’m a little unclear on whether and how LODs can help. All of my nodes are at the same Y-coordinate, Y=0. However, some are smaller than others.

I have switched to an OrthographicLens, though without having made a firm decision. Do LODs operate the same way with it? Can I have some nodes go out of scope once my film size gets too large or too small? Can I simulate it with a fake Y-coordinate on the camera, even though it won’t affect the image?

At the highest LOD, all vertices will be calculated anyway, so is it worth it to add LOD nodes?

LODs are designed to switch geometry out at certain distances. They would behave the same way with an orthographic lens, but because there is no change in size with distance, the switch would be kind of arbitrary.
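If you do want to try it, the setup is small. Here is a sketch, where ‘detailed’ and ‘cheap’ stand in for two versions of your geometry, and the distances are arbitrary examples:

  from panda3d.core import LODNode, NodePath

  # Each addSwitch(in, out) pairs with the next child added under the
  # node: that child is visible while the camera distance is between
  # 'out' and 'in'.
  lod = LODNode('tiles_lod')
  lod_np = NodePath(lod)

  lod.addSwitch(50.0, 0.0)        # show 'detailed' from 0 to 50 units away
  detailed.reparentTo(lod_np)     # assumed NodePath with the full geometry

  lod.addSwitch(10000.0, 50.0)    # show 'cheap' from 50 units outward
  cheap.reparentTo(lod_np)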

You could always change out the nodes yourself, of course, whenever you deem it appropriate based on your own calculations.

I can’t tell you whether this is worthwhile or not without knowing more specifics about your scene. It amounts to reducing the average number of vertices onscreen at any given time.

David

So how do LODs fit into a RigidBodyCombiner? What if I change the LOD distances after ‘collect’? Or does that go under the “everything but vertices and normals” category?

I had an idea for roughly a +1/4x performance gain by precomputing textures. There may be any combination of high- and low-color tiles in the scene, so: render a texture filled with all lows, then just scale the high ones in front of them to size 1 or size 0 as needed. Then you have one texture plus the individual tiles. But the size of the tiles can get larger than 1/2048, so I will need smaller groups, possibly corresponding loosely to the nested structures I alluded to some pages ago; so the savings won’t be quite 1/2x polygons… unless this is where the ‘tile pyramid’ strategy comes in.

Yeah, unfortunately, LODNodes are not supported below a RigidBodyCombiner.

Sounds plausible. :)

David

Now ya tell me. Um, something similar might be to divide the larger and smaller tiles into RBCs by size, then either put the RBCs into LODs, or swap them in and out manually.

There’s a possibility that I’ll be adding a number of line segs to this diagram. If there get to be too many, this might start to look attractive, especially since the line segs won’t be changing color.

I’m setting the film size with a pseudo-zoom factor. I chose orthographic because I had specific ideas about what I wanted on the screen and what I didn’t. So setting the normal distance of the camera would just be the same as selecting which LOD nodes I’m rendering.

My quandary, which I mentioned, is that at the highest “zoom” or pseudo-zoom factor, when the camera is at its closest logical distance to the figure, every LOD will be rendered. So the performance increase only covers part of what I’d call the “control space”, the combinations of user settings. Then I want to switch out the super-close ones to get the performance back, but also keep the ones that are partly visible, even if they’re huge.

Then I’m left with something I can do in batches that I “feel” should be done by the engine: not even bothering to calculate the positions of most of the vertices that are off screen, since most of their world positions stay constant most of the time.

Plus I just realized that LOD distance is not simply the scale, since tiles far away off the screen could have farther distances than tiny tiles on screen, and an LOD cross-section is really a sphere around the camera; so the RBC buckets I’m doing by hand should be based on Y-plane position in addition to size, and with too many of them the frame rate starts to suffer again. ARGH!

Sorry for the rant.

Drwr,

Speaking of plausible, I am looking at a line in rigidBodyCombiner.cxx:collect() that says:

  gr.collect_vertex_data(_internal_root, ~(SceneGraphReducer::CVD_format | SceneGraphReducer::CVD_name | SceneGraphReducer::CVD_animation_type));

If we were to pass our own flags into ‘collect’, would we be able to enable changing color, and perhaps, say, lose the names and the animations? Or is my understanding of SceneGraphReducer way off?

Yeah, it’s more than just these flags. Changing these flags would indeed change the data that gets collapsed together, but it wouldn’t suddenly make color animatable by the RigidBodyCombiner.

The RigidBodyCombiner works by converting the geometry into a structure that is animated by Panda’s built-in animation system, the same system used for Actors. This system is only designed to animate joint transforms; it doesn’t animate state changes, UV changes, or color changes (other than via a morph, which is a different system).

David

Ok ok, shot in the dark.

But how would you feel about telling me what would be involved in adding a mode to the RBC or subclassing it?

So that you have a picture of my motives: I switched from LineSegs to Cards for my line segs, because I needed to click on them and to change their widths with the rest of the film size. So I would not gain 2x polygons from it… closer to 1.25x, whatever that would mean for my reliability.

I couldn’t promise anything, of course.

In my case, my ideal strategy would be to get the GeomPrim and the index of the vertex color and write to that; I will already know it, so I wouldn’t need the calculated end-product of an entire NodePath. Other than that I would be looking for slight generalizations or variations of the RBC, or crosses of other systems with it. I think I am asking what understanding I would need to acquire before being able to make these changes (either in my own fork or something presentable).

Lastly, you didn’t respond to my idea for a hack using normals. In my specific application, you could just change the normals by 90°, and use 2 directional lights (up to 6) to get my change; and you said the RBC can tolerate changes to normals.

Well, the RBC basically sets up the geometry as I described above, with a TransformBlendTable and an extra column that indexes into that table. Then it manipulates the values of the TransformBlendTable according to the changes you apply to the scene graph. The low-level animation system does the rest.

Assuming you’re not proposing rewriting the low-level animation system, which is a pretty major subsystem in Panda, you can’t really add any new functionality to the RBC beyond what that already supports. So, no direct fiddling with colors or whatnot. And you can manipulate normals only insofar as the transform manipulates them.

But, you could write your own object that directly pokes at the vertices, using the GeomVertexWriter class, as we started out discussing in the beginning. With this object you could do anything you like to the vertices; you could change the color, position, and everything. The only catch is that you have to do all the work; it wouldn’t be done for you automatically by Panda’s built-in systems. Also, it might not be very efficient to do per-vertex operations in Python, so this sort of thing would be best implemented in C++.

Edit: Actually, it may not be too bad to do in Python, if you only changed a few vertices at a time, for instance to flip colors on your tiles here and there. And in any case, it would make sense to prototype it in Python first before attempting to port it to C++.
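For instance, here is a minimal sketch of that kind of poking, assuming a GeomNode whose vertex format includes a ‘color’ column (the helper name is made up):

  from panda3d.core import GeomVertexWriter

  def recolor(geom_node_path, rgba):
      # Grab the modifiable vertex data of the node's first Geom; Panda
      # uploads the changed vertices to the graphics card as needed.
      vdata = geom_node_path.node().modifyGeom(0).modifyVertexData()
      writer = GeomVertexWriter(vdata, 'color')
      while not writer.isAtEnd():
          writer.setData4f(*rgba)   # overwrite in place, row by row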

David

My application is outside the fundamental assumptions of the engine. Some applications have to be. Color changes are really rare within an RBC; they would really only be useful for diagram / visualization tasks. And setting normals distinctly from vertices is extremely rare, almost entirely unattested.

My last idea is a parallel scene graph. It’s not connected to the camera, but does have the vertices before and after all the transformations I want to make. Then my index is just:

nodepaths2indices[ nodepath ]

and I can write to my GeomVertexWriter at will. My polygons-per-frame will be higher, but my transformations-per-frame will be lower. ‘nodepaths2indices’ should generally be a WeakKeyDictionary.
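Roughly, the bookkeeping I have in mind (a sketch, assuming four vertices per tile and that I register each tile’s first row when I build the data):

  import weakref
  from panda3d.core import GeomVertexWriter

  # Map each tile's NodePath to its first vertex row in the shared
  # GeomVertexData; weak keys, so removed tiles drop out automatically.
  nodepaths2indices = weakref.WeakKeyDictionary()

  def set_tile_color(vdata, nodepath, rgba, verts_per_tile=4):
      writer = GeomVertexWriter(vdata, 'color')
      writer.setRow(nodepaths2indices[nodepath])   # jump to this tile
      for _ in range(verts_per_tile):
          writer.setData4f(*rgba)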

The more options you know you have, and the more you know about them, before making any decision, the better; so tell me more about TransformBlendTable.

Speaking of which, possible fatal flaw: Can I detect collisions on GeomPrims? Hit the deck!

The TransformBlendTable is a system for storing a table of transforms that are automatically applied each frame to certain vertices. It’s designed to implement skeleton-based animation, where the different vertices of a model are bound to one or more different animated joints, each of which is represented by a transform. Each entry of a TransformBlendTable corresponds to the weighted combination of one or more joints, and is then applied to one or more vertices.

In the context of the RBC, it analyzes the nodes underneath the RBC when you call collect(), adds an entry to the TransformBlendTable for each node, and also copies the node’s vertices into the GeomVertexData it is building up. When you animate the nodes later, it adjusts the corresponding transform in the TransformBlendTable.
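In outline, the structure it builds looks something like this (a sketch, not the RBC’s actual code; ‘animated_nodes’ is a stand-in for the nodes it finds):

  from panda3d.core import (TransformBlendTable, TransformBlend,
                            NodeVertexTransform)

  # One blend entry per animated node; NodeVertexTransform makes the
  # entry track the node's net transform automatically.
  tbt = TransformBlendTable()
  for node in animated_nodes:          # assumed list of PandaNodes
      blend_index = tbt.addBlend(TransformBlend(NodeVertexTransform(node), 1.0))
      # blend_index is what the "transform_blend" column stores for
      # each vertex that should follow this node.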

You can indeed test for collisions against a GeomPrimitive; however, the code that implements this is extremely inefficient. It is designed primarily for occasional use, for instance for picking a node based on a mouse selection; it is probably too slow to use for ordinary physics-like collision detection.
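For reference, the standard mouse-picking setup looks roughly like this (assuming a running ShowBase app, so ‘base’ and ‘render’ are in scope):

  from panda3d.core import (CollisionTraverser, CollisionHandlerQueue,
                            CollisionNode, CollisionRay, GeomNode)

  picker = CollisionTraverser()
  queue = CollisionHandlerQueue()
  ray = CollisionRay()
  picker_node = CollisionNode('mouse_ray')
  picker_node.addSolid(ray)
  # Test against visible geometry, not just CollisionSolids; this is
  # the slow path mentioned above, so use it sparingly.
  picker_node.setFromCollideMask(GeomNode.getDefaultCollideMask())
  picker.addCollider(base.camera.attachNewNode(picker_node), queue)

  def pick_under_mouse():
      mpos = base.mouseWatcherNode.getMouse()
      ray.setFromLens(base.camNode, mpos.getX(), mpos.getY())
      picker.traverse(render)
      if queue.getNumEntries() > 0:
          queue.sortEntries()               # nearest entry first
          return queue.getEntry(0).getIntoNodePath()
      return None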

David

How does it know, when a transform is applied, whether the node is in the TBT or under an RBC or not? That is, whether auxiliary steps happen in addition to the plain transform?

Oh good, I just need a shortcut for the mouse ray. But I think I misstated my question: I would need a smaller unit than the primitive.

Not sure I understand. When you call collect(), it analyzes the scene graph and finds all the nodes with transforms. It adds these to an internal table, and also creates an entry in the TransformBlendTable for them. Then, each frame, it walks through its internal table, gets the transform currently on each node, and applies it to the appropriate entry in the TransformBlendTable.

It will tell you the particular triangle that generated the intersection, but the only way you’ll have to identify it back to the source triangle is to examine the placement of the vertices.

David

Oh, lol. Please don’t try this in Python. The participants in the TransformBlendTable have colored vertices, of course, and they are already in a GeomVertexData array. What are the relationships between the TBT and the GVD? Does it rebuild the GVD from scratch every frame? I need the GVD and the index of a given NodePath on this frame, or its permanent index if it has one. I do not see a GeomVertexData member in transformBlendTable.h.

I think you mean “particular GeomTriangle”. Unfortunately every GeomTriangle comes out of my batch count budget.

  • plane of triangle collides with mouse ray
  • get intersection point relative to triangle
  • if the mapped point is in [0,0]…[1,1], then it was clicked on (sketched below).

If this is too expensive to do in Python, that would make sense, and it seems reasonable to have to be clever about this thing, such as always maxing my batch count, or just using buckets to check them myself.
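For the record, the one-shot version of that test is Möller–Trumbore; a sketch of my own, assuming Panda Point3/Vec3 inputs:

  def ray_hits_triangle(origin, direction, a, b, c, eps=1e-7):
      # Solve origin + t*direction = a + u*(b-a) + v*(c-a); the hit is
      # inside the triangle when u >= 0, v >= 0, and u + v <= 1.
      e1, e2 = b - a, c - a
      p = direction.cross(e2)
      det = e1.dot(p)
      if abs(det) < eps:                 # ray parallel to the triangle
          return False
      inv = 1.0 / det
      s = origin - a
      u = s.dot(p) * inv
      if u < 0.0 or u > 1.0:
          return False
      q = s.cross(e1)
      v = direction.dot(q) * inv
      if v < 0.0 or u + v > 1.0:
          return False
      return e2.dot(q) * inv >= 0.0      # t >= 0: in front of the origin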

Regarding the parallel scene graph, I don’t think there would be much performance gain over the RBC, so the RBC+TBT=GVD is my preferred strategy.

Does setTwoSided have a performance impact?

Why would my mouse freeze briefly every second if I have 100k triangles in one primitive?

Well, sure. It would be silly to reimplement TBT in Python. But I wasn’t proposing you do that; I was just describing how the TBT works because you asked for more information about it.

Incidentally, the TBT doesn’t modify the GeomVertexData directly. Nor does the RBC. The RBC only has to update the entries in the TBT, and Panda will automatically apply the changes to the GeomVertexData when it is rendered, by looking up the transform for each vertex based on the “transform_blend” column.

If you were to implement your own custom animation, I would not recommend going this route, because there is no easy facility for animating colors this way. (Though there is some facility: you could use an RGBA morph, which would accomplish the desired goal, especially for flipping between two colors. But you’d have to set up the GeomVertexData yourself, eschewing the RBC.)

But setting up a GeomVertexData for animation, including a TransformBlendTable and/or morph animation, is complicated. It might be much easier just to directly manipulate the vertex values with a GeomVertexWriter.

It might, depending on your scene and your hardware. Enabling two-sided rendering can actually improve your performance, if you are spending more time processing vertices than you are filling triangles (which appears to be likely in your case). But don’t just ask theoretically: try it, and see what happens.
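The experiment itself is tiny (‘model’ stands in for the root of your diagram):

  base.setFrameRateMeter(True)   # show the frame rate on screen
  model.setTwoSided(True)        # disable back-face culling for the subtree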

I don’t know. If your mouse is freezing, it’s either a driver issue, or your system is swapping virtual memory to disk. What does it look like in PStats?

Since I anticipate your next question being about RGBA morphs, I will briefly describe those too. But be warned that morphs are a complex topic, and there’s lots of literature about them on the net.

A morph is a linear blend between two different vertex values. In any given frame, the vertex may have value A, or value B, or some value linearly in between. Conceptually, for a given morph “x”, each vertex in the GeomVertexData has two definitions, x[0] and x[1], and by varying the value “x”, you can smoothly change the vertex from x[0] to x[1]. All the vertices change at the same time, though you can have certain vertices whose values for x[0] and x[1] are the same, and these vertices don’t change when you change x.

So, by varying x, you can change the entire shape (or color, or normals) of the surface from x[0] to x[1]. You just have to define what the shape is at x[0], and what the shape is at x[1].

To implement this, we store a new column in the GeomVertexData for each different morph. This column defines the value of x[1] for each vertex. (x[0] is the initial position of the vertex). Actually, the data in the column is the delta: x[1] - x[0]. The name of the column is “basename.morph.slidername”, where “basename” is the name of the column it is affecting (and might be something like “vertex” or “color”), “morph” is literal, and “slidername” is the name of the particular morph, e.g. “x” in my example so far. This column must also have a type of CMorphDelta.

Then we construct a SliderTable with an entry for each named morph slider, and store it on the GeomVertexData. Then manipulating the morph values is simply a matter of manipulating the slider values, and when the mesh is rendered, Panda will automatically apply the requested morph operations to the vertices.
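For concreteness, here is a sketch of such a format, using a hypothetical slider named “flip”:

  from panda3d.core import (GeomVertexArrayFormat, GeomVertexFormat,
                            InternalName, Geom)

  aformat = GeomVertexArrayFormat()
  aformat.addColumn(InternalName.make('vertex'), 3, Geom.NTFloat32, Geom.CPoint)
  aformat.addColumn(InternalName.make('color'), 4, Geom.NTFloat32, Geom.CColor)
  # The morph column holds the delta x[1] - x[0] for the slider "flip".
  aformat.addColumn(InternalName.make('color.morph.flip'), 4,
                    Geom.NTFloat32, Geom.CMorphDelta)

  vformat = GeomVertexFormat()
  vformat.addArray(aformat)
  vformat = GeomVertexFormat.registerFormat(vformat)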

To use this technique to color individual tiles of a grid, you will need to define a different morph for each group of tiles that will be independently controlled. If each tile could potentially be independent of all the other tiles, you will need a morph for each tile, which could be prohibitively expensive (there is a small cost per morph).

David

drwr,

PStats and performance tools were very unpopular where I come from, so I’m not prepared to examine that right now. I have a lot of internal resistance to it. Acclimation is a process. If I’m to use PStats, you’ll have to tell me what to ask for in IRC to get help with it.

I don’t need the GeomVertexData if you can get me the GeomVertexWriter. That’s what I wanted but not what I asked. Of course I will need the right one every frame.

Do we incur this for every morph for every frame, or only for every change to the morphs?

Let me get this straight. NodePath X is in an RBC which has been collected. I rotate X. Then the RBC updates the TBT, then Panda updates the GVD. Fine fine, skip it all, just give me the GVD!

My ideas for getting the GVD are:

  • Build ctypes duplicates of the RBC and TBT classes, then put a ctypes RBC where the RBC is, get its ctypes TBT, then get its GVD.
  • Build my own little module that takes the Py_RBC object and includes the headers for RBC and TBT, then gets the GVD from that.
  • Build Panda myself with the accessor function in it.

I am also open to ideas for how to get the index once in the GVD.

Huh? You don’t like to use performance analysis tools? How else can you analyze your performance?

I think you misunderstand a bit. All of this stuff about the RBC and the TBT and morph sliders is just a distraction. Don’t use any of it; it’s the wrong tool for the job.

Just create your GeomVertexData by hand. Then you will have the GVD, and you can create a GeomVertexWriter on-the-fly as you need one.
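In outline, something like this (one quad, with a dynamic usage hint so the vertices can be rewritten at any time):

  from panda3d.core import (GeomVertexFormat, GeomVertexData, GeomVertexWriter,
                            GeomTriangles, Geom, GeomNode)

  vdata = GeomVertexData('tiles', GeomVertexFormat.getV3c4(), Geom.UHDynamic)
  vwriter = GeomVertexWriter(vdata, 'vertex')
  cwriter = GeomVertexWriter(vdata, 'color')
  for x, z in ((0, 0), (1, 0), (1, 1), (0, 1)):
      vwriter.addData3f(x, 0, z)       # tiles lie in the Y=0 plane
      cwriter.addData4f(1, 1, 1, 1)

  tris = GeomTriangles(Geom.UHStatic)
  tris.addVertices(0, 1, 2)
  tris.addVertices(0, 2, 3)

  geom = Geom(vdata)
  geom.addPrimitive(tris)
  node = GeomNode('tiles')
  node.addGeom(geom)
  # render.attachNewNode(node); later, recolor with a fresh GeomVertexWriter.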

David

I’m baffled myself. Self, why don’t you want to?! Actually my statement was inaccurate; I’ve used Python’s profiler a decent bit. Panda is just scary… in the sense of extremely intimidating. Perhaps I don’t have the resources to interpret a performance analysis at this point in the learning process; or, if there aren’t theoretical answers, such as asymptotic running time, then I would just prefer to defer it for the moment.

Ok ok. I actually got distracted when you said, “But you’d have to set up the GeomVertexData yourself, eschewing the RBC.” If I build all the GeomVertexDatas myself, I’ll just use the parallel scene graph strategy. I infer that there can’t be any improvements over that, or can there?

However, if the RBC leaves any avenues available for changing color, I’d like to explore them, because I’m especially clumsy when it comes to parallel structures specifically.

I’m basing most of my inquiries on a picture I have of vertices and graphics memory, namely that there’s more or less a big GeomVertexData somewhere on (or near) my graphics chip that I want to write to, for which Panda, as well as the chip itself, provides various facilities. My picture of the drivers is particularly vague.

The closest thing I found in rigidBodyCombiner.h is a vector of NodeVertexTransform pointers, called _internal_transforms, unless it would be in the second field of _vd_table.

Using ctypes to get at C++ objects sounds reasonably discouraging anyway. There’s always the possibility of trying to step through one render cycle, from Py_Initialize() down.

The “parallel scene graphs” approach sounds like it might be right, though I’m not sure I understand what you mean by “parallel” here. Is there a GeomVertexData that contains data that is not to be rendered? What is that for? Unless you mean that to be the source of data to copy into the live GVD from time to time?

Your picture of graphics memory is a reasonably accurate model.

The RBC doesn’t actually keep a pointer to the GVD it creates; it doesn’t need it. But that’s just an implementation detail.

Please don’t try to use ctypes to get at C++ objects. That sounds awful. There should be no reason to, anyway.

David