Is Panda’s AA using supersampling instead of multisampling? This link explains the difference between the two pretty well.
But in essence, conventional MSAA is much quicker but has poorer quality (especially with alpha testing) than supersampling. MSAA is the standard AA used in most games, as it gets rid of polygon edges very well at very little cost. Supersampling is super expensive, as it renders the scene at 2x or 4x screen resolution before downscaling to fit the screen.
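As a rough illustration of that cost gap (my own back-of-the-envelope numbers, not from the thread), the difference comes down to how many fragment-shader invocations each scheme needs:

```python
# Back-of-the-envelope comparison of per-frame shading work for MSAA
# vs. SSAA. Illustrative only: it counts fragment-shader invocations
# and ignores bandwidth, caches, resolve cost, etc.

def shaded_fragments(width, height, samples, supersampled):
    """MSAA shades roughly once per covered pixel (the extra samples
    reuse one shading result); SSAA shades every sample independently."""
    pixels = width * height
    return pixels * samples if supersampled else pixels

msaa_4x = shaded_fragments(1920, 1080, 4, supersampled=False)
ssaa_4x = shaded_fragments(1920, 1080, 4, supersampled=True)
print(msaa_4x, ssaa_4x, ssaa_4x // msaa_4x)  # SSAA shades 4x the fragments
```

This is why MSAA can smooth polygon edges almost for free while SSAA scales directly with resolution.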
Without enabling any AA, I can draw 8 fullscreen quads at 1920x1080 without breaking a sweat on my GTX 260. The frame rate only drops from 880 to something like 750.
When I enable Panda’s AA using
loadPrcFileData('', 'framebuffer-multisample 1')
loadPrcFileData('', 'multisamples 2')
or any combination of the above
A single fullscreen quad drops the frame rate down to 350 fps, meaning it took almost 3 ms to draw a single quad. When I decrease the resolution from 1920x1080 to 1024x768, the frame rate jumps back to ~850 fps. So it is highly sensitive to the resolution of the screen – much more sensitive than it should be for MSAA (as used in regular games), and more indicative of supersampling.
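For reference, frame rates convert to per-frame times like this (a quick sketch; the fps figures are the ones quoted above – the "almost 3 ms" is the whole frame time at 350 fps):

```python
def frame_time_ms(fps):
    """Convert a frame rate to a per-frame time in milliseconds."""
    return 1000.0 / fps

baseline = frame_time_ms(880)   # ~1.14 ms per frame without AA
with_aa  = frame_time_ms(350)   # ~2.86 ms per frame with AA enabled
print(round(baseline, 2), round(with_aa, 2), round(with_aa - baseline, 2))
```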
So my question is: what is Panda actually doing under the hood? And how do I enable MSAA, or choose which objects get drawn with multisampling? As far as I can tell, setAntialias doesn’t do anything. I can only get AA with ‘multisamples 2’ or
‘framebuffer-multisample 1’, but once those are enabled, I can’t turn antialiasing off for any particular object using setAntialias(AntialiasAttrib.MNone).
You’re using polygon smoothing to do anti-aliasing. If you want to use multisample antialiasing, set this:
render.setAntialias(AntialiasAttrib.MMultisample)
I believe Schell Games implemented coverage sampling too, but I’m not sure whether coverage antialiasing is in there and working.
For the record, multisampling is slower no matter how you put it. The GPU will sample fragments several times and then merge the samples into one pixel. Multisampling is just a specific optimisation of supersampling, so that fewer computations are done.
I’ve tried AntialiasAttrib.MMultisample – it makes no difference. The only AA I can get under GL is with ‘multisamples X’ or ‘framebuffer-multisample 1’, at which point every object is AA’d.
I also understand that multisampling is slower, but I’m suggesting that the current implementation is way slower than you would expect for plain MSAA, and more indicative of SSAA. Also, under wgl, I can’t get any edge quality better than 2x, while the dx9 pipeline can produce clearly better AA’d edges. This also suggests to me that the wgl pipeline is using some sort of SSAA mode, since you can’t do more than 2x SSAA at resolutions greater than 1024.
If setting the “framebuffer-multisample” and “multisamples” variables alone enables multisample antialiasing, that means something is wrong. They are only supposed to request a multisampling framebuffer, not enable antialiasing. You probably have it enabled in your driver settings – it’s best to configure your driver to leave AA up to the application for the sake of analysing this.
If you want multisample antialiasing, you should do two things:
- Request a framebuffer with multisampling, using those config vars.
- Enable multisample antialiasing in the scene graph, using setAntialias(AntialiasAttrib.MMultisample).
If you omit either, then you won’t get multisample antialiasing, unless your driver is overriding Panda’s commands.
There’s no magic on Panda’s end: Panda3D simply asks WGL for a multisampling framebuffer, and asks OpenGL to enable multisampling before issuing the draw calls.
If you’re sure you’re getting SSAA instead of MSAA, then that means that your drivers are lying to you. (Or there’s some kind of obscure bug that manifests only in certain cases.) Perhaps they don’t support MSAA under OpenGL and emulate support for it using SSAA?
The limit on the number of samples could be a driver limitation, but it could perhaps be that coverage sampling is not supported on OpenGL while we do support it on DirectX, or something like that. I could take a look at it if you file a bug report for it on the bug tracker.
I just checked the driver, and it is not overriding the application’s/Panda’s settings. I also tried a win64/AMD system, and on that system what you say is true – you will only get an antialiasing effect after 1) requesting a framebuffer with ‘multisamples’ and 2) enabling setAntialias. But as a practical matter, even on that machine, once you request a multisample framebuffer, leaving setAntialias off doesn’t make an object render any faster – i.e., there is no speed difference between rendering AA’d and non-AA’d objects when a multisample framebuffer is requested. So I feel that even on the AMD system, SSAA is still enabled.
I’ve also done some tests with rendering to a double-size offscreen buffer and downsizing the texture back into the main visible screen – effectively what SSAA does – and the speed is roughly the same as the current Panda AA implementation. So this reinforces my opinion that wgl is auto-enabling SSAA when a multisample buffer is requested.
It’s normal to see a performance loss regardless of whether or not antialiasing is enabled. This is just the extra performance cost of rendering to a multisampling framebuffer.
I don’t know what you want me to say to your analysis. If your driver is giving you SSAA while Panda is requesting MSAA, then it’s out of Panda’s control.
I’ve discovered this phenomenon too. Some drivers just seem to enable SSAA regardless of what Panda asks for, and regardless of the settings in the driver interface. On a machine that was doing this, I couldn’t find a way to turn it off.
Well, the whole point of my analysis, which David has confirmed (thanks David!), is to 1) establish whether or not SSAA is being enabled – without confirming 1), it’s not possible to try to fix the problem. 2) The next step is for me to find some OpenGL programs that use antialiasing and see if they have the same problem with my driver. If they don’t, then the problem is with Panda’s implementation. 3) At which point we can try to figure out the problem in Panda’s source.
I don’t see how you can get to 3) without having gone through 1). I also don’t think it’s fair to automatically blame the driver for everything. I love Panda, which is why I’m using it, and why I keep spending a lot of effort pinpointing problems. But you have to admit that Panda can be very flaky with ‘advanced’ features. It’s not going to improve if we stick our heads in the sand like ostriches and pretend the problems don’t exist.
To follow up on this issue, I downloaded the demo for Ogre3D and benchmarked their postprocessing framework on my GTX 260 at 1920x1080.
gl/dx9, 0 AA, No Bloom: 471 / 400 fps
gl/dx9, 0 AA, Bloom: 350 / 323 fps
gl/dx9, 4x AA, No Bloom: 450 / 330 fps
gl/dx9, 4x AA, Bloom: 350 / 260 fps
gl/dx9, 16x AA, No Bloom: 267 / 281 fps
gl/dx9, 16x AA, Bloom: 102 / 178 fps
So basically, what the numbers say is: yes, it is possible for my OpenGL driver to do AA without SSAA, and for the most part GL’s performance is superior to dx9’s. So it seems that the problem is with Panda’s implementation.
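Converting the 4x AA rows of that table into per-frame cost makes the gap concrete (my arithmetic from the fps values above, nothing more):

```python
def aa_cost_ms(fps_no_aa, fps_aa):
    """Extra per-frame time, in milliseconds, attributable to enabling AA."""
    return 1000.0 / fps_aa - 1000.0 / fps_no_aa

gl_4x = aa_cost_ms(471, 450)   # GL:  0 AA -> 4x AA, no bloom
dx_4x = aa_cost_ms(400, 330)   # DX9: 0 AA -> 4x AA, no bloom
print(round(gl_4x, 2), round(dx_4x, 2))  # ~0.1 ms vs ~0.53 ms per frame
```

On these figures, 4x AA costs the GL path only a fraction of what it costs the DX9 path, which is what you'd expect from a working MSAA implementation.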
Hopefully, this information will be useful to other people. With AA enabled, your Panda application will be much more sensitive to bandwidth and fill rate than you would expect. I’ve been blitting lots of fullscreen quads to the main (SSAA’d) buffer as part of a post-processing framework, but now it makes much more sense to open another non-AA offscreen buffer, sum everything there, and then lay a final card on top of the main buffer.
I’ll also try the 1.6.2 branch, because I remember a lot of buffer-related things broke between 1.6.2 and the initial 1.7.0 branch – hopefully this is just another one of those issues that can be fixed with a diff.
If you can find a way to open a graphics context on a multisample-enabled framebuffer without also enabling SSAA in your driver, I’d be happy to know. If you could incorporate that into Panda, even better.