CommonFilters - some new filters, and the future

Technologicat · October 22, 2014, 8:48am

True. To think of it, it was a silly convenience feature.

Mm, maybe.

I think at least almost everything is, including stageName and sort. Only isMergeable is a “design-time parameter” - maybe it shouldn’t be a property.

OTOH, isMergeable interacts with sort, which is a property. My first reaction would be to eliminate isMergeable, and let sort=0 declare a new render pass, but that wouldn’t work in the case with lens flare and bloom. Both are non-mergeable, belong in the same logical stage, yet must have different sort values so that both can be enabled at the same time.

So yeah, let’s return to this later

Ok.

They provide information to FilterPipeline controlling the options needed for renderSceneInto(), and to FilterStage controlling the creation of the compositing shader.

The flag merge logic takes in the metadata defined for the same texture by all enabled filters, and determines what that means “as a whole”. Explicitly: in the case of FilterStage, the metadata from each individual filter controls the parameter list of the filter’s own Cg function call, and the merged metadata controls the parameter list of the compositing fshader. FilterPipeline needs only the merged metadata, for determining which scene textures and which aux bits are needed.

CustomInputMetadata is simpler; it just lists the custom input names and types needed by each filter. It is used by FilterStage. Again, the metadata from each individual filter controls the parameter list of the filter’s own Cg function call, and the merged data is used in constructing the parameter list of the compositing fshader.

In TextureInputMetadata, a filter can currently request the following:

needTexpix = pixel size is required by the filter’s Cg function (for this texture).
internalOnly = this is a scene texture needed in the shaders of the filter’s internal stages (so FilterPipeline must provide this texture), but the filter’s compositing Cg function does not need it.
metaOnly = the filter’s Cg function wants the metadata for this texture, i.e. texcoord (and texpix, if needTexpix is True), but does not need the actual texture. This is useful in some special cases, e.g. in ScanlinesFilter. Be aware also that due to StageInitializationFilter, the current pixel’s up-to-date color is always available in the “pixcolor” variable - so if a filter wanted e.g. only that and texcoord (for whatever reason; e.g. for applying vignetting), it could set metaOnly to avoid passing the unused k_txcolor to the filter function.

The thisAndThatOnly flags are used to eliminate shader inputs that would not be used, resulting in a faster shader.

(I’m also considering adding needTexpad similarly to needTexpix, but we’ll see.)

True. I was speaking of triggering FilterStage- and FilterPipeline-level reconfigures from inside the filter, but with your suggestion below, that is no longer needed.

Usually true, but there are special cases.

See params1.y in AmbientOcclusionFilter - it contains a step parameter that depends on both “amount” and “numsamples”. This saves an instruction in the shader, as it gets a precomputed step value as input. Here “numsamples” is a compile-time parameter (to make SSAO work with the arbfp1 profile), but “amount” is a run-time parameter.

Hence params1.y behaves like a “hybrid type” parameter: the shader input “params1” must be updated whenever “amount” changes (like usual), but also when the shader is recompiled (unlike usual).

Currently, it already does something like this, by setting the _needsCompile flag at the appropriate level of the hierarchy.

But I’ll need to reconsider where such flags should live - there are changes requiring only a local recompile (of the internal shaders), changes requiring a FilterStage-level recompile (affecting code synthesis of the compositing fshader), and changes requiring a FilterPipeline-level recompile (e.g. filters being added/removed, or a filter being moved to a different stageName).

However, due to how FilterManager works, it is currently not possible to support FilterStage-level recompiles, anyway. So, maybe either FilterManager should be modified (so that it would be possible to clean up only some buffers by reference; but then it would also require a system to let the caller modify the buffer sort order - and FilterPipeline doesn’t really want to do that), or this system should be simplified so that there are only Filter-level and FilterPipeline-level reconfigures.

This is very elegant! Thank you!

(But what about updating stageName and sort? Does the task manager run in its own thread? If it does, there is a possible race condition here, unless we store those parameters in a single property as a tuple.)

rdb:

Technologicat:

The filter type as a way to identify filter instances obviously must go when support for multiple instances of the same filter type in the same pipeline is added. But I didn’t yet have time to spare much thought as to what would be the appropriate replacement.

Maybe the cleanest way is to let the caller create/destroy the instances that the application wants, and then add them to the pipeline using FilterPipeline.addFilterInstance(). Then, there could also be a FilterPipeline.delFilterInstance() that removes a filter from the pipeline by its object reference.

I think the new approach should be to let the user deal with it. He creates a filter, and is himself responsible for only adding that filter instance once (or face the consequences), without having FilterPipeline second-guess a decision to add a same-type filter twice.

Sounds good.

rdb:

As for CommonFilters, I think we can just do it like this:

class CommonFilters(object):
    def __init__(self):
        self.volumetricLighting = None

    def setVolumetricLighting(self, setting1):
        if self.volumetricLighting is None:
            self.volumetricLighting = VolumetricLighting()
            self.addFilterInstance(self.volumetricLighting)
        self.volumetricLighting.setting1 = setting1

    def delVolumetricLighting(self):
        if self.volumetricLighting is not None:
            self.removeFilter(self.volumetricLighting)
            self.volumetricLighting = None

Mm, yes, that should be fine.

(Although a detail: to conform to the old API, the set***() method must update all configuration parameters of the filter.)

Yes and yes.

BloomFilter and AmbientOcclusionFilter already hit me with some surprises, so when the framework is 90% done, it may be time to try to port the new filters over to it and see if any new features are needed

(Also, I think that the new filters need to be prepared relatively quickly after the framework is done, in order not to delay the release of 1.9.0.)

rdb · October 22, 2014, 11:45am

This isn’t true; there is absolutely no penalty to leaving unused shader inputs in a shader. The shader compiler knows which inputs you don’t use, and automatically removes them from the compiled code. Panda also notices that the input isn’t used and doesn’t bother passing the data to the GPU.

For Panda-provided uniforms like texpad, in fact, there is not a significant reason to leave them out of a shader (except readability of the generated shader) if doing otherwise makes things more complicated. And in fact, I would suggest that we do that if it means we can remove more code. (Remember the old saying: perfection in design is attained not when there is nothing more to add, but when there is nothing more to take away.)

For custom inputs, there is some overhead associated with having the shader inputs cycle down the scene graph and in the setShaderInput call itself, though not very much.

OK. (Though this is one case where I’d suggest that if it becomes more complex to support this case, I’d say we have to just change the SSAO shader rather than bend over backward to support it).
In this case, though, it doesn’t seem to be a problem since the make method calls setShaderInput anyway - or we just both call setShaderInput and markDirty() in the appropriate setter.

I think the filter’s setter should mark the flag wherever it is most appropriate: if it requires a pipeline-level recompile, it should mark it on the pipeline, if it requires a filter-level recompile, it should mark it on the filter. It seems you are already doing something like that in your _set() method, anyway.

No, it should run in the same thread (the app thread), because of an existing problem in Panda where we can’t create buffers in any other thread. There is no risk of data races; we just make sure the task sorts right before the igLoop task (which has a sort of 50) to avoid a one-frame lag.

Yes, I understand - I was pretending there was only one parameter for the sake of simplicity.

Technologicat · October 27, 2014, 7:53am

Ok, I’m back. Now I should be able to un-busy myself for the next couple of days to finish the framework.

Expect a version some time this week. What remains to be done are the additional features needed to support VolumetricLighting, and the design cleanup as per the review.

Ah, ok. I stand corrected, then

Does this apply also to texture (sampler2D) inputs?

And how deep is the code analysis? Does this optimization occur also when the “unused” parameter is passed to a function, which takes the parameter but never reads it? And in the future, can we rely on that GLSL compilers will apply this same optimization? (This is critical now that the compositing pass of each filter is implemented as a Cg function.)

((Currently, the functions are actually defined as “inline”, which eliminates the call, but I have no idea at which point the compiler processes that.))

In that case, maybe we only need “needTexpix” and “needTexpad” flags, since these add more inputs (instead of removing inputs as the *Only flags do).

Or as an extreme simplification, leave those out, too, and always provide texpix and texpad for each registered input texture?

As you seem to have noticed, I was essentially trying to make the generated code as clean as possible, but if that is not important, we should be able to get rid of at least the *Only flags. (This is assuming that a bunch of unused textures doesn’t hurt; e.g. SSAO wants depth and normals for its internal shader, but in the compositing shader it only needs to blend ssao2 onto the current pixel of the color texture.)

In the case of custom inputs, I think we can safely assume the filter authors will only request them if they are needed, so this should not be a problem anyway.

I agree, a custom setter will solve this cleanly.

Having make() call setShaderInput() was a hack - generally, make() should not do that. This was needed because “params1” did not conform to the RT/CT param setter templates. Given this model, “the least wrong” solution was to treat it as a CT param, and apply the RT param processing manually.

Actually, “params1” failed to conform to the RT/CT param model in two separate ways: first, there was no one-to-one mapping between parameters and shader inputs (as RT params expected), and secondly, it required both an update to a shader input and a recompile (i.e. it was neither type). By removing the RT/CT param model (which, as is now clear, was not a good idea to begin with, although like many ideas, it did seem so at the time), these problems go away.

Yes, that’s exactly what the _set() method is doing.

This remark was because I may have misunderstood the intent of your earlier comment that a Filter does not need to check its own existence in the setter. This is the case I was thinking of when I mentioned earlier that the setter should check for the existence of the relevant object. Naturally, the Filter exists, but the target where the relevant recompile flag should be set, might not. (I was speaking of invoking reconfigures earlier, but of course the same observation applies also if we only set the recompile flag.)

Specifically in this case, the Filter must check for the presence of the pipeline before it tries to set the pipeline level flag. The check will fail, by design, if the Filter is not currently attached to a pipeline. This is exactly what the pipeline-level attach() is meant for: it sets up communication between the filter and the pipeline. (The pipeline will have its recompile flag set by the filter when needed, and during attach(), the Filter tells the pipeline which textures and custom inputs it needs.)

So, I think we agree that the flags should live at the level relevant to each flag (as they already do).

Ah, I see.

From the “50”, I take it that the priorities of the internal tasks in ShowBase have not been moved to the new-style priority/sort system? (Referring to your old detective work in events without redraw - #4 by rdb )

Ok. I thought that might be the case but decided it’s better to be sure

rdb · October 27, 2014, 1:06pm

It applies to all uniforms. Technically to all inputs, but we still have a limited set of registers for varyings under ARB profiles.

Good question. I think so. I just ran a test both with and without inlining enabled and the compiler stripped out the additional parameter to the function in both cases. In fact, it was even stripped out when I used the variable but multiplied it with 0.

Since we compile all Cg shaders to arbfp1, which doesn’t support function calls, all functions will automatically be inlined, so it won’t matter anyway.

As for GLSL: I don’t know, as each vendor has their own implementation of the GLSL standard. I know that the standard mandates the removal of unused inputs, but I don’t think it goes to far as to specify how well the optimizer actually looks to determine that.
It is true that we should be thinking soon about actually putting GLSL code in the filter system as Cg is no longer even being maintained, hmm.

That sounds good. I’ll leave the decision of which of those two options to go for to you.

They have. (That post is 7 years old already.) These tasks use sort values rather than priority values in current versions of Panda3D.

Technologicat · October 28, 2014, 12:04pm

Ok. In the case of uniforms, this should be safe then.

Considering the compositing fshader, I think the only varyings will be the texture coordinates, anyway (since textures themselves are uniforms of a sampler type).

(Which reminds me - I have a vague feeling that I may have observed some missing “const” in some filters for quantities that are constants. This (rather obviously) affects “shader complexity” in the Cg sense. I’ll double-check this before sending the eventual final version of the code.)

Interesting.

I’ve also noticed that Cg is very aggressive in its optimization. Based on my own tests, it seems it will omit any code that will not affect the final result, at least inside the same function.

(This was when I was trying to profile my new inker. I observed that Cg omitted a for loop that would normally generate several texture lookups, when the result from that loop was simply not used in calculation of o_color. I wanted to know the relative time taken in the loop vs. in the math that used its result, but instead got this observation )

Mm, a valid point. I didn’t recall that limitation.

Although, considering the future port to GLSL, there may come a point where we no longer want to inline them. Mainly, I would imagine this to occur if some future filter contains a lengthy, repetitive operation that would cut down on the size of the shader program if it was a genuine function.

As an aside, the new inker almost contains such operations: the analysis of discontinuities in the data source textures is very similar for depth and normals, differing only in the calculations that depend on the scalar/vector nature of the data source. If we later add more data sources (I would like to have an “object ID” aux bitplane generated in the main renderer, to guarantee object outlines even for edges at the same depth and similar normals), at that point it will be useful to refactor the scalar/vector discontinuity checkers into functions. They could be combined into one function with callbacks defined via function pointers, but that makes the code less readable, so I think it’s better to keep the scalar/vector versions separate.

Thus, although not yet, we may need to revisit the “inline” issue sooner than we expect.

And as an aside to the aside, an “object ID” aux bitplane is much more generally useful than it may sound at first. For example, there is a postprocessing shader Thaumaturge has been experimenting with that needs this information. But most importantly: given a small amount of infrastructure, it can be used for instantly identifying the object occupying any given pixel on the screen, without doing any collision ray tracing, and working correctly even in the presence of image warping postprocessing filters such as lens distortion (provided only that those filters apply the same distortion to this bitplane, too).

I think that for cursor-based object selection, this solution is superior to 3D collision ray techniques because of the “semantic matching”: when done this way, the pixel-to-object identification process operates on the exact data that is in the final output - i.e. the picture the user actually sees - which is also the space where the cursor naturally resides. (Something I’ve learned dealing with mathematics is to pay close attention to the choice of spaces.)

The applications are endless - for example, one could generate a fullscreen glow map by filtering the object ID texture leaving only values that match the one under the cursor (setting the glow for those pixels to 1 and zeroing the rest), and then combine that with bloom or other applicable postprocessing to make the “selected” object glow in a natural-looking manner. Applying a tint to the bloom output would allow further customization of the glow (e.g. to match the primary colour of HUD elements).

Improving on this, small objects with complex shapes (i.e. hard to hit with a single-pixel cursor) could be made easy to select by pre-filtering inside a circle centered on the cursor (effectively, casting out a large number of collision rays at O(1) cost each), and choosing the object ID that covers the largest number of pixels in that circle. The pixel counts could be optionally weighted by a per-object multiplier that can be used to prioritize those objects (practically, in games, object types) the user/player is most likely interested in (or equivalently, to de-prioritize “prop” objects) - this should eliminate the “Take bucket” problem familiar to players of many first-person RPGs. Often, the player wants to pick up the various small items inside the bucket and not the bucket itself, while it adds to the sense of immersion to be able to pick up the bucket, too. Eliminating the need to constantly fiddle around with aiming in the most common use case should also improve immersion.

Another possible application, in first-person shooters, are weapons with zero flight time - given where the player aims, hits to actual scene geometry can be instantly determined without needing to raytrace (or to set up collision shapes). Then, if needed, the exact position (and trajectory vector) of the hit in 3D space can be extracted from the screen space data. The pixel position and depth (plus any relevant lens information such as FOV) determine the point uniquely in the camera’s 3D coordinate system, leaving only conversion to the collision target’s coordinate system, which is conveniently provided by Panda. (This particular application doesn’t play well with postprocess image warping, though.)

But this is getting very far beside the point. What I wanted to say in this aside is that eventually, the filter system will need new aux bitplanes from the main renderer, and in addition to “gloss” (for SSLR), “object ID” would be an especially useful one to have.

Ok.

Yes. I was originally thinking that we could do that post-1.9 (in order to avoid delaying the release unnecessarily), but on the other hand, we are encouraging people to write and share new filters by publishing the new API.

If we move to GLSL later, we will need to support both Cg and GLSL to preserve backward compatibility. Also in the long run, there will potentially be lots of filters “in the wild” written in Cg because the interface happened to be there, so then it won’t be realistic to deprecate Cg at least in any future version of this particular API.

If we move to GLSL now (before releasing 1.9), we will only need to support GLSL, which ensures that any new filters that people write will be implemented in GLSL.

It will probably take some time for me to learn GLSL and port the existing filters, but not too much. Cg was easy to learn, and apparently the languages are quite similar. Of course, if you prefer, I can also leave the conversion to you (but I think one of us should convert the new filters, too, to get them into 1.9).

What do you think?

Ok.

(Hmm, good point.)

Ok. Maybe I should have asked what I wanted to know, instead

So, according to the manual ( panda3d.org/manual/index.php/Tasks ), sort=49 and priority=0 should make this task run just before igLoop. Or sort=50 and priority=10000?

I’ll probably move the individual filter update() calls into this task, too - should be cleaner and more scalable. FilterPipeline can start the task at the end of reconfigure(), and tear it down in cleanup().

rdb · October 28, 2014, 1:44pm

Yes-- shader compilers tend to be. And NVIDIA actually feeds GLSL code through their Cg compiler, so we know we’re good on NVIDIA cards, at least.

The ARB assembly language specification didn’t include a “jump” instruction.

GLSL doesn’t even have a working “inline” keyword; it is there but only because it’s reserved for future use. This is because shader compilers are almost always better at determining whether or not there is a benefit from inlining a particular function.

Given that compilers have to be smart enough to do that, it’s very likely that they’ll be smart enough to figure out whether or not there exists a code path that ends up using a certain variable. Cgc does seem to be, anyway.

Technologicat:

And as an aside to the aside, an “object ID” aux bitplane is much more generally useful than it may sound at first. For example, there is a postprocessing shader Thaumaturge has been experimenting with that needs this information. But most importantly: given a small amount of infrastructure, it can be used for instantly identifying the object occupying any given pixel on the screen, without doing any collision ray tracing, and working correctly even in the presence of image warping postprocessing filters such as lens distortion (provided only that those filters apply the same distortion to this bitplane, too).
…
But this is getting very far beside the point. What I wanted to say in this aside is that eventually, the filter system will need new aux bitplanes from the main renderer, and in addition to “gloss” (for SSLR), “object ID” would be an especially useful one to have.

A bit mask would work better for some of the things you describe, but yeah, the general principle is sound. In general, I think that Panda needs a more general solution for render targets.
I have a general solution for custom bitplanes that I plan on implementing after 1.9, which makes render targets work in a very similar way to texture stages: you add them to a node, and can associate a custom name and meaning for each of them. That’s for later, though.

I’m afraid of stalling 1.9 too much. I’ll take a brief look at it when you submit the code and see if there’s an easy way to at least prepare the code for more easily adding GLSL later.

sort=49 should be enough.

Technologicat · October 28, 2014, 2:50pm

Ok.

Ah!

Ok. Then I suppose we won’t need to worry about that (at least for now).

Yeah, to some extent this may be a case of “all objects look like nails”

That sounds good. I’m fine with waiting until then for doing the things I described - the project I’m building won’t likely be in a stage where I’d need pixel-perfect object selection until much later, anyway.

Ok. Cg it is for now, then.

Ok, thanks!

I think I now know what to do. One final question (for now), returning to the texture metadata issue:

I’m tempted to go for the option of keeping the need* flags, because that preserves the possibility to easily add new metadata flags later.

But looking at the code, I now see I wasn’t thinking clearly when I presented those two options.

Essentially, there are only two sensible ways to merge flags: you either OR or AND the whole set for each particular kind of flag. A need* flag says it wants to be OR’d, while an *Only flag wants to be AND’d. There is no advantage in having support for both kinds of merging - I think this is one of the important points where to simplify.

It is possible to standardize on one of the combination operators simply by rewording. It is almost universally clearer to use positive wording. Hence, if the system is to have any flags, one should standardize on need*, and get rid of *Only. Then all flag types can be OR’d regardless of what they actually do, and we can use standard named tuples because the metadata data structures then no longer need any nontrivial logic.

Next, observe that in the code actually using the flags, there is not much difference between the *need and the *Only flags. See FilterStage.reconfigure() - during the synthesis of the filter functions and the fshader definition, each flag introduces one check that either emits code or doesn’t.

The only exception is internalOnly (which actually means “not needForCompositing”), which skips the whole texture. This is because the same texture registration system is doing two unrelated things. We might want to change that.

I think a cleaner solution would be to have a different function to register scene texture inputs for internal stages, and keep those registrations in a separate registry. FilterPipeline must know, so that it can request the texture from FilterManager, but the compositing shader synthesis system doesn’t want to even know about such textures.

Thus, I think the options should have been:

Preserve the metadata system. Keep needTexpix, add needTexpad, change metaOnly → needSampler.
Remove TextureInputMetadata altogether, and always provide all possible inputs for each registered texture.

Whichever option we choose, internalOnly needs to go (and we need some clean way for a Filter to request scene textures that will not be needed in compositing).

So, right now I’m thinking that maybe the first option would be preferable. Is this ok?

The needSampler flag would work similarly as needTexpix and needTexpad (which were present in the original first option), while this solution allows for easy future expansion.

rdb · October 28, 2014, 4:21pm

I like the second option because it provides the lowest number of potential points of failure.

By the way, I just realised: since all uniform inputs are globals in GLSL, we don’t have to pass the uniforms to the individual functions anyway in GLSL, so the compiler really doesn’t need to do any dark magic to figure that a particular uniform is unused.

Technologicat · October 29, 2014, 11:42am

I implemented the “texture borrowing” and “participates in compositing” features last night, and began working on getting several instances of the same type of filter to work in the same FilterStage. (These all will be needed to support VolumetricLighting, specifically its “source” parameter.)

Making the custom input names unique for each instance wasn’t too hard, but the texture parameter logic still needs some work.

I expect to be able to implement VolumetricLighting soon, and then the next step is to clean up the architecture.

That’s a good point.

This option is actually becoming more tempting, because handling the texture metadata flags is already becoming more complicated than it used to be, with the introduction of the “texture borrowing” and “multiple filter instances” features. The code seems to be telling me that if the flags aren’t essential, I should get rid of them

I noticed also that there is no point in merging the flags at registration time; the sane thing is to compute whatever is needed for the compositing shader in the logic that actually builds the compositing shader.

Hmm. Although it does simplify the compiler, that’s a bit unfortunate for what I was planning.

In Cg, one used to be able to do this (observe the placement of the _XXXs):

float4 myFilterFunction_XXX( uniform float k_myparam )
{
  // ...user code using k_myparam here...
}

void fshader( ...
              uniform float k_myparam_XXX )
{
  // ...
  pixcolor = myFilterFunction_XXX( k_myparam_XXX );
  // ...
}

where _XXX is a unique instance identifier (basically “_0x%x” % id(filterInstance)) to prevent name conflicts. This is needed, since now several filters of the same type are allowed in the same FilterStage, and hence in the same compositing shader (and thus any relevant input names in the compositing shader must be unique for each instance).

(This is a simplified picture, concerning only custom inputs.)

Note that the name “myFilterFunction” must be mangled too, because Filters are allowed to emit different code depending on arbitrary user-defined configuration options (consider GammaAdjustFilter). Thus, the only safe thing to do is to make each Filter instance provide its own implementation, even if this causes duplicated code. It doesn’t affect shader complexity anyway, because the functions are inlined.

But since in GLSL, uniforms are globals, the k_myparam - which is above passed as a parameter to myFilterFunction - must have a unique name, and hence also in the function implementation (which is defined in user code, in the implementation for each filter type), the mangled name must be used.

It would have been cleaner if the user code implementing the filter function didn’t need to care about name mangling, but I guess there’s no easy solution that’s forward-compatible with GLSL.

And in any case, even now, custom shader inputs sent to the compositing shader must use the mangled name - this I think is unavoidable to make several instances play together in the same FilterStage.

Note that custom shader inputs sent to the filter’s internal stages do not need name mangling, because each Filter instance has its own copy of the internal stages. The intent of the name mangling is only to uniqify the inputs going into the same compositing shader.

So, I suppose that to make it more transparent what is happening (and keep this forward-compatible with GLSL), we should require user code to use the mangled parameter name, too. (It can be obtained by self.getMangledName(originalName), similarly to how self.texcoord(texName) retrieves the name of the texture coordinate variable related to texName.)

Usually, of course there won’t be several instances of the same type of filter in the same FilterStage, even when several instances are present in the same pipeline. For example, in the case of VolumetricLighting, its bloom filter (from which VL “borrows” the bloom3 texture) will live in the “SceneOptics” logical stage, while a user-enabled bloom filter will be placed much later, in the “LensOptics” stage. Also, bloom is not mergeable, so two instances can only end up in the same render pass if they are specifically placed in the same logical stage.

(Hmm, it might actually be that any filter that declares internal render passes cannot be mergeable? It’s a one-way implication, though - a filter can be non-mergeable even if it has no internal passes. (Example: any filter that must access up-to-date color information outside the current pixel, even if it does all the work in the compositing shader.))

But the system needs to be general, because in the future VolumetricLighting might not be the only one that wants to use another filter for preprocessing. Also, render passes (FilterStage) are an internal thing; to the outside, the API exposes only logical stages. These are merged automatically, so robustness requires support for several instances of the same type of filter also in the same FilterStage.

rdb · October 29, 2014, 12:45pm

Although I don’t really understand any use case where internal stages should be mergeable, I think that a general solution for this problem is to just implement name mangling across the board.

If we want to do this without things getting cluttered, we could let people write using Python formatting syntax, eg. “float localvar = {myuniform};”, and then use str.format(), passing a dictionary containing the mappings between canonical uniform names and mangled names for that shader.

Technologicat · October 29, 2014, 1:20pm

Ah, that wasn’t my intention. Internal stages belong to a filter instance - logical stage merging only controls which filter instances go into which FilterStage. So there is no case where only the internal stages would be merged - it’s either the whole filter or nothing.

The way I’m planning to implement VolumetricLighting is to insert an extra BloomFilter in the main pipeline, but with its “participates in compositing” flag cleared. This BloomFilter will live in the same logical stage with VolumetricLightingFilter, with a smaller sort so that it goes first.

Then VolumetricLighting registers the “bloom3” texture (or whatever has been set as “source”), and FilterStage walks its list of filters backward (starting from that particular instance of VolumetricLighting), looking for one that provides it.

This is just a few lines of code, and in any case I think this “flattened pipeline” solution is simpler than having filters contain other filters.

Yes.

I expect that internally, the Filter implementations will do that anyway (either that or the variant with “%{myuniform}s” % myparams).

So maybe we don’t need anything special here? I’ll see how it looks, and then return to this.

Technologicat · November 3, 2014, 8:30pm

Status update: name mangling is now complete and working. Unless there are any more surpises, I expect to have VolumetricLighting working tomorrow.

Then it’s just the cleanup of the architecture (which will probably take a few more days) and then this overhaul will finally be done.

I opted to get rid of the need* flags - it’s simpler to just always provide texpix and texpad since doing that doesn’t hurt performance.

Intermediate code snapshot attached (but do note that the next two versions will be far more interesting).
CommonFilters190_name_mangling_ver.zip (144 KB)

rdb · November 4, 2014, 12:18pm

Great work, thanks for all the effort!

Technologicat · November 6, 2014, 10:39pm

Looking at the code, I decided to go for the architecture cleanup first, after all.

Done:

Eliminated the overly complicated and not very successful RT/CT/UT param abstraction, in favor of plain Python properties.
Deferred reconfigure. Update task with sort=49 managed by FilterPipeline; this triggers reconfigures and tells each FilterStage to call any registered update methods.
Eliminated BatchUpdatable (for great justice!).
Removed unnecessary FilterStage._needsCompile, since single FilterStage reconfigures are not supported anyway (this simplifies the surrounding code).
Noticed and fixed something about texpad: if the texture is unpadded, also the fshader should use the constant float2(0.5, 0.5) (this is actually more important to do in the fshader, which is called once per pixel, than in the vshader, which is only called for the corners of the fullscreen quad).
Some minor cleanups, like the removal of the silly Filter.getPropertiesWithDocs() method.
README updated. This architecture cleanup actually eliminated many of the original remaining issues!

(I decided to keep the plain Filter.getProperties() for now, though; it is used by ls(), setConfiguration() (which used to be updateConfigurationFromKeywords()) and getConfiguration() (which used to be getConfigurationAsKeywords()). I think these should be simple enough not to be evil?)

TODO (this should be all that’s left):

One and only one obvious way to do it: have addFilterInstance(), removeFilterInstance(), let the caller own the Filter instances, and get rid of cruft like addFilter, setFilter, findFilter.
Implement CommonFilters using the above approach, keeping Filter instance references in member variables like suggested by rdb.
Port VolumetricLighting to this framework.
Port the old CartoonInk to this framework.
Check appropriate use of “const” in filter implementations.
Reorganize and rename setup() (which should be two methods), make(), and synthesize().
Sanity check method names: is e.g. reconfigure() descriptive of what it does?
Check all docstrings.
Check all comments.
Port the new filters to this framework.

Another version to follow within the next few days. Shouldn’t take too long to wrap this up, now
CommonFilters190_cleanup_vol1.zip (126 KB)

Technologicat · November 7, 2014, 9:27pm

Architecture cleanup, part 2.

Done:

One and only one obvious way to do it: have addFilterInstance(), removeFilterInstance(), let the caller own the Filter instances, and get rid of cruft like addFilter, setFilter, findFilter.
Implement CommonFilters using the above approach, keeping Filter instance references in member variables like suggested by rdb.
Split the FilterPipeline update task into two tasks: one that handles the pipeline-level deferred reconfigure, and one that handles updating the filters.

TODO:

Port VolumetricLighting to this framework.
Port the old CartoonInk to this framework.
Check appropriate use of “const” in filter implementations.
Reorganize and rename setup() (which should be two methods), make(), and synthesize().
Sanity check method names: is e.g. reconfigure() descriptive of what it does?
Get rid of the *Filter suffix in class names of concrete filter classes? The fully qualified path “direct.filter.XXX” is already enough of a hint.
Check all docstrings.
Check all comments.
Port the new filters to this framework.

New version to follow within the next few days.
CommonFilters190_cleanup_vol2.zip (124 KB)

Technologicat · November 9, 2014, 12:26am

Feature parity reached.

Done:

Port VolumetricLighting to this framework.
Port the old CartoonInk to this framework.
Check appropriate use of “const” in filter implementations.
Old API support requires FilterPipeline.reconfigure() to call filter reconfigures so that it can catch exceptions immediately. This avoids unexpected exception related crashes in legacy applications.
Remove the silly “force” flag from the reconfigure() methods.

The VolumetricLightingFilter comes from CVS head; the “source” parameter has been implemented. I made a rudimentary test, adding a bloom filter for preprocess. While my quickly made radioactive cartoon animals test setup is pretty much nonsensical, this filter combination performs exactly as advertised:

Observe especially the smooth look of the light rays - the smoothing of the bloom filter really helps here.

TODO:

Reorganize and rename setup() (which should be two methods), make(), and synthesize().
Sanity check method names: is e.g. reconfigure() descriptive of what it does?
Get rid of the *Filter suffix in class names of concrete filter classes? The fully qualified path “direct.filter.XXX” is already enough of a hint.
Check all docstrings.
Check all comments.
Port the new filters to this framework.

So, the overhaul is almost done. New version to follow.
CommonFilters190_feature_parity_ver.zip (134 KB)

rdb · November 9, 2014, 12:37am

Excellent work, thanks so much for doing this! I’m extremely busy for at least the next few days, but I may be able to take a closer look near the end of next week.

And yeah, those look like very smooth rays. We’ll have to see if it still works well for lots of small objects, such as leaves in a tree, where the rays have to be sharper.

Technologicat · November 9, 2014, 1:47am

Thanks. I’m looking forward to what others will do with this

Ok. I’ll try to have some of the remaining items done by then.

Hmm. The current blur kernel might be too wide for that. Maybe we need a configurable blur filter to cover different cases.

It would be useful for other applications, too, and it would also allow for some optimization for speed: if the desired blur radius is small, it is in some cases more efficient to blur in a single pass, utilizing the free bilinear filtering available from the GPU. The most optimal I’ve seen was 5 texture lookups for blurring 17 texels in a single pass (this was used as part of the approximate depth of field algorithm described in GPU Gems 3).

And even when using separate x and y passes, bilinear filtering can be exploited to cut down on the number of texture lookups - the current blur filter does not do this. It is possible to average 7 texels in the same row/column using just four linearly interpolated lookups, reducing the total texture lookups for the two passes from 14 to 8 (while still blurring an area of 7x7 = 49 texels).

I’d also like to have a blur filter with a gaussian kernel - I’m thinking the result might look more natural than with the current constant-weight kernel. But that remains to be tested.

This leads to the following idea: now that a filter can borrow a previous one’s output texture, we could split bloom into two filters, BloomGenerator and BloomCompositor. This would allow the blur step to live in a separate filter, the full setup being as follows:

BloomGenerator (compositing disabled) > Blur (compositing disabled) > BloomCompositor (compositing enabled)

The current code looks like it wants to be something like this - already in 1.8.x, bloom borrows the shader implementations of the blur filter for its blur passes.

Finally, this suggests the general idea of preprocessor filters, which only generate textures, and are never meant to participate in compositing. These could be useful as building blocks for more complex multipass filters.

I’ll have to check my code, but to do this it might actually be sufficient to update just the documentation - explicitly mentioning the possibility of filters which always have enableRender=False and do not even provide a synthesize() method for the compositing shader. Currently, I’ve been working in the mindset that synthesize() is always required, but it might make sense to relax that.

On the other hand, splitting the more complex filters into a set of separate parts would make those filters harder to use, because the user must then instantiate the parts separately and make sure that they are ordered correctly.

But I think in the long run, this feature would be desirable in one form or another. There might be some better abstraction I haven’t yet thought of.

Technologicat · November 10, 2014, 1:13am

Status update:

Indeed, it turned out that synthesize() is only required if enableRender=True.

I had an idea this morning for this: introduce a CompoundFilter base class, which inherits from Filter. It represents a filter that consists of other filters (incl. other CompoundFilters if desired), providing an adapter to the Filter API.

This makes it possible to “package” the described kind of filter chains into a single filter to build more complex filters. The advantages over simply defining internal stages are twofold: the “building block” in question may already be a more complex filter (such as Bloom), and this makes it possible for such modular “building blocks” to work as filters in their own right (E.g. VolumetricLighting uses Bloom, which uses Blur - which is already useful by itself). I think this promotes re-use.

The advantage over the flat pipeline is that each CompoundFilter only takes up one slot (sort value), so it’s easier to manage its placement in the pipeline. It’s also easier to instantiate, because it doesn’t require manual setup of the parts at each use site.

(I haven’t removed the possibility to use the flat pipeline approach. I’m thinking it may be useful for cases where it would be too bureaucratic to define a new CompoundFilter just to be able to connect some parts. But I don’t know if supporting both approaches leads to chaos or not.)

The drawback is that it’s yet more code for the unlucky filter author to write, because the filter parameters must be manually re-exposed as properties of the CompoundFilter (assigning values to the original contained filter properties in the setters for the CompoundFIlter properties). This is a design decision - it gives the programmer exact control over what parameters to expose and under what names.

I was thinking that if there are no name conflicts, the inspect module could be used to synthesize appropriate properties automatically but maybe that’s not the way to go.

The compositing is done either by custom code like in Filter, or by the compositing logic of the last contained filter (useful for e.g. VolumetricLighting, which only wants to include another filter as a preprocessor, the compositing code and its shader inputs being fine as they are). The compositing logic in any other contained filters is always ignored, the idea being that the earlier contained filters just generate internal textures that the last one (or the CompoundFilter itself) can then use.

Initial code sketch along these lines (CompoundFilter.py) included in the attached version. Comments would be appreciated - it’s very probable that I’m doing at least something wrong

As a minor edit, I also got rid of the *Filter suffix in the class names. I think the result looks cleaner, but it’s harder to search, since filter names can consist of single words that can also match elsewhere in the code (e.g. “Tint” instead of “TintFilter”). I don’t yet know which choice to go for in the final version.
CommonFilters190_compoundfilter.zip (148 KB)

Technologicat · November 10, 2014, 11:46pm

I did some testing of the CompoundFilter idea. Now the implementation provided in the attachment should be correct

I’ve also added a RapidPrototypeCompoundFilter, which inherits from CompoundFilter, and at runtime, automatically exposes all non-conflicting properties from the contained filters. It does not and cannot produce production-quality code, so it is intended only for quickly trying out new ideas (hence the name). (It reduced my minimal test example for CompoundFilter to just a few lines, provided in the docstring.)

Furthermore, I’ve added a small developer tool called ListAllFilters, which finds all the Filters defined in the current directory, and prints out module/class/stageName/sort information. It’s a bit hacky, but it’s useful to quickly get an up-to-date summary of the default stageName/sort data, which in the code is sprinkled across the filter definitions.
CommonFilters190_compoundfilter2.zip (160 KB)