I’m currently considering using normals as well (thanks for the hint). One big advantage is that you only have to sample a hemisphere instead of the full sphere; in my implementation, 50% of all sphere samples were wasted because they fell on the wrong side of the surface. Still, I’m not sure whether your implementation is faster with or without a normal buffer, because more samples mean more cache misses. According to some NVIDIA papers, random sampling is torture for the GPU’s internal texture cache, so reducing the buffer width and height by a factor of 2 may raise the speed by more than a factor of 4 (2*2).
The most advanced implementation is maybe the one at developer.download.nvidia.com/SD … mples.html, but I have to admit that I don’t fully understand it.
There is nothing special about the blur shader (and I bet that most other implementations are more sophisticated than this one). The horizontal and vertical shaders differ only in the line “float2 st = …”. Exactly one of RESPECT_DEPTH or RESPECT_NORMAL should be enabled; RESPECT_NORMAL is IMO inferior in this implementation.

const float FILTER[7] = { 0.05, 0.1, 0.2, 0.3, 0.2, 0.1, 0.05 };

#define RESPECT_DEPTH
//#define RESPECT_NORMAL

struct BlurVertexIn {
    float4 position : POSITION;
    float2 texcoord : TEXCOORD0;
};

struct BlurVertexOutFragmentIn {
    float4 position : POSITION;
    float2 texcoord : TEXCOORD0;
};

struct BlurFragmentOut {
    float4 color : COLOR;
};

void BlurVertexProgram(in BlurVertexIn i, out BlurVertexOutFragmentIn o) {
    o.position = i.position;
    o.texcoord = i.texcoord;
}

void BlurSSAOHorizontalFragmentProgram(in BlurVertexOutFragmentIn i, out BlurFragmentOut o, uniform sampler2D samplerColor, uniform sampler2D samplerNormalDepth) {
    const float2 scale = float2(1.0 / ViewPortPixelSize.x, 1.0 / ViewPortPixelSize.y);
    float currentColor = tex2D(samplerColor, i.texcoord).r;
#ifdef RESPECT_NORMAL
    float3 currentNormal = tex2D(samplerNormalDepth, i.texcoord).xyz;
#endif
#ifdef RESPECT_DEPTH
    float currentDepth = tex2D(samplerNormalDepth, i.texcoord).w;
#endif
    float color = 0;
    for (int n = 0; n < 7; n++) {
        float2 st = i.texcoord + float2(scale.x * (n - 3), 0.0);
        float sampleColor = tex2D(samplerColor, st).r;
#ifdef RESPECT_NORMAL
        // If the sample's normal deviates too much, keep the center
        // value instead of blurring across the discontinuity.
        float3 sampleNormal = tex2D(samplerNormalDepth, st).xyz;
        if (dot(sampleNormal, currentNormal) < SSAOBlurThreshold) {
            color += currentColor * FILTER[n];
        } else {
            color += sampleColor * FILTER[n];
        }
#endif
#ifdef RESPECT_DEPTH
        // If the sample lies at a significantly different depth, keep the
        // center value instead of blurring across the edge.
        float sampleDepth = tex2D(samplerNormalDepth, st).w;
        if (abs(sampleDepth - currentDepth) > SSAOBlurThreshold) {
            color += currentColor * FILTER[n];
        } else {
            color += sampleColor * FILTER[n];
        }
#endif
    }
    o.color = color;
}

void BlurSSAOVerticalFragmentProgram(in BlurVertexOutFragmentIn i, out BlurFragmentOut o, uniform sampler2D samplerColor, uniform sampler2D samplerNormalDepth) {
    const float2 scale = float2(1.0 / ViewPortPixelSize.x, 1.0 / ViewPortPixelSize.y);
    float currentColor = tex2D(samplerColor, i.texcoord).r;
#ifdef RESPECT_NORMAL
    float3 currentNormal = tex2D(samplerNormalDepth, i.texcoord).xyz;
#endif
#ifdef RESPECT_DEPTH
    float currentDepth = tex2D(samplerNormalDepth, i.texcoord).w;
#endif
    float color = 0;
    for (int n = 0; n < 7; n++) {
        float2 st = i.texcoord + float2(0.0, scale.y * (n - 3));
        float sampleColor = tex2D(samplerColor, st).r;
#ifdef RESPECT_NORMAL
        // Same normal test as in the horizontal pass.
        float3 sampleNormal = tex2D(samplerNormalDepth, st).xyz;
        if (dot(sampleNormal, currentNormal) < SSAOBlurThreshold) {
            color += currentColor * FILTER[n];
        } else {
            color += sampleColor * FILTER[n];
        }
#endif
#ifdef RESPECT_DEPTH
        // Same depth test as in the horizontal pass.
        float sampleDepth = tex2D(samplerNormalDepth, st).w;
        if (abs(sampleDepth - currentDepth) > SSAOBlurThreshold) {
            color += currentColor * FILTER[n];
        } else {
            color += sampleColor * FILTER[n];
        }
#endif
    }
    o.color = color;
}
Playing around with shaders I normally do in FX Composer (IMO it is somewhat faster for testing ideas than Panda3D). The code can’t be copied directly into the FilterManager. I’ll clean up the whole test suite in the next few days and upload everything somewhere. One more note: the screen space normal (xyz) and depth (w) are stored together in one floating point buffer, and the normals are already normalized. If the offscreen buffers contain bytes, a “tex2D(…) * 2.0 - 1.0” is perhaps needed.
More details (with a link to the source): discourse.panda3d.org/viewtopic … 3&start=30.