hitting a speed limit

castironpi · March 26, 2010, 2:31am

My idealistic side preferred the ‘clear them all to 0’ route. The search locates all the ‘low’ nodes after ‘collect’ and gives them the proper size.

	def _subcard( name, scale, color ):
	...
	path1= _subcard( 'high', 0, ( 1, .776, 0 ) )
	path2= _subcard( 'low', 0, ( 0, .620, 1 ) )
	...
	for rbc in rbcs:
		rbc.collect( )
	for rbcnp in rbcnps:
		for low in rbcnp.findAllMatches( "Tex-Placeholder */card/low" ):
			low.setScale( 1 )

drwr · March 26, 2010, 4:52pm

Well, you could try pre-creating them and saving them in a bam file. Put them all under a common node, and then use nodePath.writeBamFile() to save them out. Then, when you start up, load up the bam file and find them in there. I think you can’t write out the RigidBodyCombiner this way, but you can re-create this on the fly without too much trouble.

Sometimes it is, especially when you’re doing per-vertex operations. Python is really good for doing high-level operations, but since its function call overhead is so very high, it’s not always a good choice for tiny repetitive operations. But it depends on how much you’re doing each frame.

I mean, maybe we have just one texture that is applied to the entire group of tiles, with each tile given a specific UV value that maps to a different texel, and we change individual texels of the texture in order to change the colors of the tiles.

David

castironpi · March 26, 2010, 8:23pm

This is highly subjective, but the UV strategy starts to sound like premature optimization / engineering. You don’t sound like you think it has huge potential, just some. I’m inclined to back-burner this one… inexplicably… expected frustration level exceeds tolerance… unless you recommend it.

It must not be common or you’d have it. But I wouldn’t mind getting the cards created and collected and attached to nodepaths outside the interpreter, then just assigning their attributes. Still sounds very meddling… even trifling.

If collect is heavy on the GPU, maybe I could add a thread or something.

My favorite option so far is to get the screen up, then set to work on building the scene, starting largest to smallest or something, similar to “tile pyramid” strategies. It would be nice to put ‘collect’ in a separate thread so it doesn’t freeze the renderer, or otherwise get it into the background.

I ought to profile the start-up.

I won’t pursue the pre-comp’ed format immediately. I’m not prepared to make a decision about that.

ThomasEgi · March 26, 2010, 8:48pm

if you only want to color the tiles, it should be possible to simply apply a texture and make your UV coords match one pixel each. if you turn off texture filtering that should work; color manipulation is easy and your gpu doesn’t choke.
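
One way to picture the one-pixel-per-tile mapping suggested here is to compute, for each tile index, the UV at the center of its texel. The sketch below is plain Python rather than Panda3D API; the helper name `texel_center_uv`, the row-by-row tile numbering, and the 256-pixel texture size are assumptions made for illustration.

```python
def texel_center_uv(index, size=256):
    """Return (u, v) at the center of the texel assigned to tile `index`.

    Hypothetical helper: tiles are numbered row by row, `size` texels per
    row, with row 0 starting at v = 0.
    """
    col = index % size
    row = index // size
    # Adding 0.5 aims at the texel center, so nearest-neighbour sampling
    # cannot drift onto a neighbouring texel.
    return ((col + 0.5) / size, (row + 0.5) / size)


print(texel_center_uv(0))
print(texel_center_uv(257))
```

Aiming at the texel center rather than its corner is what makes the lookup robust once filtering is set to nearest.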

castironpi · March 26, 2010, 10:59pm

Strategy #3:

Creating a ‘flood’ basis texture, to dedicate one pixel each to tiles:

colimg = PNMImage( 256, 256 )
colimg.fill( *colLow )
colimg.setXel( 0, 0, colHigh )
coltex= Texture( )
coltex.load( colimg )
coltex.setMagfilter( Texture.FTNearest )
ts = TextureStage('ts')

Card generation, texturing, and UV configuration:

		card1 = cm.generate()
		path1= NodePath( card1 )
		path1.setTexture( texts[ count% 3 ] )
		path1.setTexture( ts, coltex )
		path1.setTexOffset( ts, ( count% 256+ .5 )/ 256.0, ( count// 256+ .5 )/ 256.0 )
		path1.setTexScale( ts, 0 )

Currently, all colors are correctly set to LOW, but the tile at 0,0 should be colored HIGH. It’s not.

drwr · March 26, 2010, 11:15pm

      path1.setTexOffset( ts, ( count% 256+ .5 )/ 256.0, ( count// 256+ .5 )/ 256.0 )
      path1.setTexScale( ts, 0 )

These lines will set the entire contents of path1 to the same texture coordinates, because you are scaling the existing UV’s to 0 and then adding a fixed offset. So all of path1 will have the same color. Is this your intention?

If you omit the setTexOffset line (or if it evaluates to 0) it will certainly apply whatever color is at uv (0, 0), but note that this is the lower-left corner of your texture image, not the upper-left, so replace this:

colimg.setXel( 0, 0, colHigh )

with this:

colimg.setXel( 0, 255, colHigh )

Also, it’s probably necessary to ensure your geometry in fact includes texture coordinates in the first place, even though you’re scaling them to 0 (I don’t think it’s defined what would happen if you scaled a nonexistent value to 0). Or you can use one of the setTexGen() modes to ensure this if you don’t want to add them to your GeomVertexData.

David
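
The origin mismatch described in this post can be captured in a small helper. This is a plain-Python sketch, assuming image rows are counted from the top (as in the `setXel` call) while v = 0 is the bottom edge of the texture; `pixel_for_tile` is a hypothetical name, not a Panda3D call.

```python
def pixel_for_tile(index, size=256):
    """Return the (x, y) image pixel that tile `index` samples.

    Assumes the tile's UV offset is ((index % size + .5) / size,
    (index // size + .5) / size), that image row 0 is the TOP row,
    and that v = 0 is the BOTTOM edge of the texture.
    """
    x = index % size
    row_from_bottom = index // size
    y = (size - 1) - row_from_bottom  # flip to top-down image rows
    return (x, y)


print(pixel_for_tile(0))
print(pixel_for_tile(256))
```

For tile 0 this yields the pixel (0, 255), which agrees with the corrected `setXel( 0, 255, colHigh )` call.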

castironpi · March 27, 2010, 12:10am

David’s fix to the y-value in the flood texture worked. Here’s the revised version.

# Flood texture: one pixel per tile, everything LOW except tile 0.
colHigh= VBase3D( 1, .776, 0 )
colLow= VBase3D( 0, .620, 1 )
colimg = PNMImage( 256, 256 )
colimg.fill( *colLow )
# Image row 255 is the bottom row, which is where uv (0, 0) samples.
colimg.setXel( 0, 255, colHigh )
coltex= Texture( )
coltex.setMagfilter( Texture.FTNearest )
coltex.load( colimg )
ts = TextureStage('ts')

cm = CardMaker('card')
count= 0
for j in range( 100 ):
	parent= NodePath( 'parent %i'% j )
	for k in range( 100 ):
		card1 = cm.generate()
		path1= NodePath( card1 )
		# texts is the list of tile textures defined elsewhere in the script.
		path1.setTexture( texts[ count% 3 ] )
		path1.setTexture( ts, coltex )
		# Collapse the card's UVs to a point and aim at the center of this tile's texel.
		# '//' keeps the row index an integer under Python 2 and Python 3 alike.
		path1.setTexOffset( ts, ( count% 256+ .5 )/ 256.0, ( count// 256+ .5 )/ 256.0 )
		path1.setTexScale( ts, 0 )
		path1.setScale( .8 )
		path1.setPos( count% 100, 0, count// 100 )
		path1.reparentTo( parent )
		count+= 1
	parent.flattenStrong( )
	parent.reparentTo( render )

Then, to change colors:

		self.col= 1- self.col
		colimg.setXel( 0, 255, [ colLow, colHigh ][ self.col ] )
		coltex.load( colimg )

Performance was roughly comparable. Both performed poorly when scrolling at the farthest-out zoom levels at 100000 cards, which wasn’t observed earlier. Performance of flattenStrong degraded more sharply than RBC as the number of tiles on screen increased. With all 10000 on screen, flattenStrong got about 2fps; RBC got about 9.5fps. Load time on flattenStrong was much better.

drwr · March 27, 2010, 12:42am

If you’re using setTexOffset() to set the UV coordinates differently for each tile, you’re defeating the effectiveness of flattenStrong and the RBC to combine nodes into a single batch. (All geometry in a batch has to have all of the same state, and setTexOffset() is part of the state.)

My original idea had been to set the UV’s appropriately when you set up the GeomVertexData, so that each tile has a different UV. Then let them all flatten together.

Of course, in your case, the bottleneck is almost certainly more tied to the number of vertices than to the number of batches, but still it’s probably a worthwhile optimization.

David
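
To see why a per-tile `setTexOffset()` gets in the way of combining, here is a toy model in plain Python. It is not how Panda3D implements flattening; it only assumes, as stated in the post, that geometry can share a batch only when its render state is identical. The function `count_batches` and the tuple-based "state" are made up for the illustration.

```python
def count_batches(states):
    """Count batches, assuming one batch per distinct render state."""
    return len(set(states))


# 10000 tiles that each carry their own texture offset in the state:
per_tile_offset = [("coltex", (n % 256, n // 256)) for n in range(10000)]

# 10000 tiles whose UVs live in the vertex data, so the state is shared:
shared_state = [("coltex", None) for _ in range(10000)]

print(count_batches(per_tile_offset))
print(count_batches(shared_state))
```

In this model the per-tile offsets leave every tile in its own batch, while moving the UVs into the vertex data lets all tiles collapse into one.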

castironpi · March 27, 2010, 1:03am

Each tile is going to be unique in some way. It can either have its own UVs on a color, its own texture from two pre-flooded versions of the raster, or its own UVs to a pre-flooded side-by-side texture.

ThomasEgi was claiming on the IRC channel to get millions of vertices on other hardware than mine. It’s a matter of how many are worth bothering about. The RBC gives me transforms for free, but Geoms are fast to load, small, and fast to run.

In my design, tiles will occur in nested structures. Instances of the same structure will be identical, except for the color of their tiles, which will be unique combinations. Does that mean that instancing won’t work? If so, I’ll have to transform all the instances in sequence. If not, I can just issue the transform to the one and only instance.

In the Advanced Instancing section in the manual, there is only one ‘chorusline’ and one ‘dancer’. Is it possible for each path from level 1 to level 2 to correspond to its own path from level 2 to level 3?

krid · March 27, 2010, 1:39am

your hardware would also be able to display millions of verts.
but not when processing 10000 squares (4 verts per square) into a rigidbody combiner hardware shader (this just stores your geom arrays in a single array); for that your cpu is too slow! if you split those (rbc) up you need to use threads, otherwise it wouldn’t make a difference, or else you have to use a multicore cpu.

so it’s not your graphics hardware, it’s your cpu. and please ask thomas about this code; i’m sure a lot of people are wondering now how it’s possible to store such a high number of squares in rigid combiners (ok, it’s not hard to store them, but which computer is he using)! i mean 1 mio. verts means 250000 squares (tiles). my hardware can handle around 10000 squares on one processor, using only the rigid combiner without flattening them, at a framerate around 20 frames. i’m very curious about thomas’ supercomputer; so he uses a supercomputer with more than 250 cpus!!! holy moly! i want to have such a thing too. or maybe i’m completely wrong! but that’s why i’m asking! i mean i’m happy now with 5000 independently moving objects. but i would say WOW with 100000.

ThomasEgi · March 27, 2010, 12:23pm

@krid: thx for asking, you are indeed completely wrong.
it’s about !static! geometry here, which only has to change color.
6.553.600 vertices (that’s 6 mio) making up 3.276.800 triangles, which make up 1.638.400 quads. on my machine, rendered in 100 geomnodes (65536 verts each), brute force with no optimisation, fullscreen 1440x900 with the entire geometry visible, at 40fps. and my “supercomputer” is a dell inspiron 1720 with a geforce 8600m gt. i run the default panda 1.7.0 build for ubuntu 64bit, no threading and nothing, everything runs on a single core.
the time it takes to load the model from disk (bam file) and arrange 100 copies of it is 0.037 seconds.
if i zoom in a little so i don’t see the entire thing at once, my fps jumps up to 400fps. memory usage is below 40MB
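
The figures quoted here are consistent with each other, which is easy to check. The short calculation below assumes four vertices and two triangles per quad, with no vertex sharing between quads; the variable names are illustrative.

```python
geom_nodes = 100
verts_per_node = 65536

vertices = geom_nodes * verts_per_node  # total vertices across all nodes
quads = vertices // 4                   # four unshared vertices per quad
triangles = quads * 2                   # two triangles per quad

print(vertices, triangles, quads)
```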

given texturing comes for free these days, i would expect no performance penalty from displaying 2 textures on top of those quads. if you somewhat match up the position of the logic gates with the position of the pixels on the texture image, you could easily re-use the image and replace like 1000 logical quads with one big quad with the appropriate texture part on it, which would make a quite good LOD. pyramid-like, as you mentioned earlier.

@castironpi. i didn’t really get the description in your last post. it sorta sounded like you want to display 2 different textures (logic symbols) on each quad. if so, you may be able to simply display both of them all the time, and toggle them using an extra texture (like texture splatting on terrain). most gpus can handle up to 4 textures with no performance penalty. this way you might be able to completely avoid any kind of messing with the geoms themselves.

instancing is for animated geometry. if your geometry is just static copies, no special tricks should be necessary.

drwr · March 27, 2010, 3:00pm

Instancing doesn’t give you any performance benefit. It may give you a slight programming convenience if you thereby avoid having to modify multiple transforms simultaneously, but it sounds like it isn’t worth it in your case. Though for the record, if you were coloring the nodes with textures, you could apply a different texture to each instance, and thereby have a different set of colors for each instance.

You couldn’t have a different set of children for each instance, though.

David

krid · March 27, 2010, 4:37pm

i thought he was going to build a picture out of tiles, like a mosaic, and wanted to animate these tiles. but now i get it, he is going to do these huge-scale pictures

but for that, don’t use a huge number of tiles; just use one geom which is tessellated into 10000 pieces. scale every square’s uvs into 0..1 (quad). but for that you have to write your own model format and look into the tiff format, where you can store layers. a similar format could help you; the layers should point at the polygon that needs them. so for each square polygon you need layers which get loaded at a resolution that fits your cam position. but i’m sure it’s not as easy to write as it sounds. this would take a lot of time.

drwr · March 27, 2010, 5:13pm

ditus a.k.a krid, we do appreciate your enthusiasm, but I’m sorry to say your very specific advice seems confused at best.

David

rdb · March 27, 2010, 5:27pm

a.k.a. logen a.k.a. solutionX
Having your first nick banned is no reason to create three more accounts.

krid · March 27, 2010, 7:02pm

how do you get it?

@drwr: hmm, i think disney developed something very close to that, but don’t ask me what it’s called, something like Ptex or so… it’s only a bit different.

ThomasEgi · March 27, 2010, 7:59pm

there is a reason why some people are called “admins”. and please stop making useless and confusing posts (that includes your last one, too).
instead, let’s get back to fixing the original problem of this thread.

krid · March 27, 2010, 8:16pm

yes, it would be useless if the whole thread is not about these huge zoom-in pictures. if it is, i’m sure time will show you it wasn’t useless. ^^ and that’s exactly how it would and will work, and maybe it’s already working.

sorry for the short detour, now back to the main track. i’m also very curious about solving the performance problem.

castironpi · March 31, 2010, 12:28am

Would it be possible to alter the texture UVs of a node in a rigid collector?

My expected gain would be 2x, not needing to store 2 cards per tile, but still get to rotate & translate. It would be a “worst of neither” of strategies #2 and #3.

drwr · March 31, 2010, 4:58pm

No, sorry, the RBC only adjusts vertex and normal positions according to transforms.

David