I only recently broke free of iOS 3.x for Flower Garden, so I’m finally adding all the features I had been itching to add that required higher OS support. I had already added some iOS 4+ features, but I was keeping them to a minimum because targeting multiple versions of the OS at once is always a huge cause of bugs.
One of the first features I looked into adding was multisample antialiasing (MSAA) support for OpenGL, which was originally introduced in iOS 4.0. The geometry generated for the petals in Flower Garden is fairly high contrast, and since it’s not like the textures were carefully created and laid out by an artist, the result is pretty bad aliasing around the edges. Perfect candidate for multisampling!
I based everything on the Apple documentation on multisampling. It was very straightforward and it works for both off-screen and view-based render targets. It was also extremely helpful to keep the ability to use both regular and multisampled render targets. That way I can easily run performance and visual comparisons, and, if necessary, I can disable multisampling on a particular device.
The main gotcha was making sure I used the GL_RGBA8_OES color format on the multisampling color buffer (otherwise it won’t work, and all you’ll get are 1282 “invalid operation” OpenGL errors). Also, if you’re in the same boat as Flower Garden, which uses OpenGL ES 1.1 and the fixed function pipeline, you’ll have to add _OES to just about every constant in the documentation.
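If the raw number 1282 in a log doesn't ring a bell, it helps to map `glGetError()` values back to their symbolic names. Here is a minimal sketch in Python, just as a lookup aid. The values are the standard OpenGL error codes, and the helper name `gl_error_name` is made up for this illustration.

```python
# Standard OpenGL error codes as returned by glGetError().
# On OpenGL ES 1.1 the framebuffer error carries an _OES suffix
# (GL_INVALID_FRAMEBUFFER_OPERATION_OES) but has the same value.
GL_ERROR_NAMES = {
    0x0000: "GL_NO_ERROR",
    0x0500: "GL_INVALID_ENUM",
    0x0501: "GL_INVALID_VALUE",
    0x0502: "GL_INVALID_OPERATION",
    0x0503: "GL_STACK_OVERFLOW",
    0x0504: "GL_STACK_UNDERFLOW",
    0x0505: "GL_OUT_OF_MEMORY",
    0x0506: "GL_INVALID_FRAMEBUFFER_OPERATION",
}


def gl_error_name(code: int) -> str:
    """Hypothetical helper: turn a glGetError() value into a readable name."""
    return GL_ERROR_NAMES.get(code, f"unknown GL error 0x{code:04X}")


if __name__ == "__main__":
    # 1282 decimal is 0x0502, the "invalid operation" error mentioned above.
    print(1282, hex(1282), gl_error_name(1282))
```

Logging the symbolic name next to the decimal value makes a wrong color format on the multisample buffer much quicker to spot.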
Visual results
I will let the screenshots speak for themselves.
In this screenshot you can see the rendering of the pots using off-screen render targets.
Pretty impressive improvement, even if I say so myself! The improvement is even more striking in-game because the animation of the flowers moving in the wind doesn’t have jaggies popping in and out.
The improvements are more noticeable on the iPad because the pixels are bigger, and, not surprisingly, less so on retina-display iPhones. But even on the retina displays, jaggies are less noticeable during the flower wind-swaying animations.
Performance
So, no doubt: They look pretty. But how about performance? Under the hood, the rasterizer is generating four samples for every pixel, and then combining them at the end in a separate step. So you’ll get more of a performance hit with complex pixel shaders. Edit: My bad. I spaced out and I forgot that MSAA only evaluates the pixel shader once, so performance doesn’t depend on pixel shader complexity. Instead, the performance hit probably comes from the extra writes (four per pixel) to the render target, and should be fairly similar between games.
It turns out that Flower Garden is still using OpenGL ES 1.1, which is implemented using shaders at the driver level. Fortunately, even though I’m using some texture combiner operations and several input textures, those pixel shaders aren’t all that complex.
These are the frames per second I recorded with one particular flower pot on the different devices I have for testing. Notice that the lowest device I have listed is the iPhone 3GS. That’s because I have also stopped supporting armv6 CPUs to keep things simpler (and the market share is minimal).
Update: I was only running the game loop at 30 fps, so the initial performance numbers I had listed are pretty meaningless. Here are the correct numbers running at a max of 60 fps.
Device | Without MSAA | With MSAA |
---|---|---|
iPhone 3GS | 39 fps (25.6 ms) | 37 fps (27.0 ms) |
iPhone 4 | 48 fps (20.8 ms) | 23 fps (43.5 ms) |
iPhone 4S | 60 fps (16.7 ms) | 60 fps (16.7 ms) |
iPad 1 | 60 fps (16.7 ms) | 18 fps (55.6 ms) |
iPad 2 | 60 fps (16.7 ms) | 60 fps (16.7 ms) |
Of the devices I tested, MSAA didn’t slow things down much on the iPhone 3GS, and the iPhone 4S was maxed out at 60 fps both ways. The iPad 1 was a different story, and performance crashed from 60 fps to 18 fps. As Rory pointed out in the comments, looking at the actual frame time instead of the fps gives better insight: on the iPad 1, using MSAA adds 38.9 ms to each frame! The iPhone 4 also suffered from a big performance hit due to all the extra pixels in the retina display. I don’t care how pretty the flowers are, that’s just not acceptable (and no, I’m not adding user-tweakable graphics settings like PC games).
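To make the fps-versus-milliseconds point concrete, here is a small sketch that converts the fps figures from the table into frame times and works out how much time MSAA adds per frame, both in ms and as a percentage of the non-MSAA frame. The fps values come straight from the table. The function names are just for illustration.

```python
# fps figures copied from the table: (without MSAA, with MSAA)
MEASURED_FPS = {
    "iPhone 3GS": (39, 37),
    "iPhone 4": (48, 23),
    "iPhone 4S": (60, 60),
    "iPad 1": (60, 18),
    "iPad 2": (60, 60),
}


def frame_ms(fps: float) -> float:
    """Frame time in milliseconds for a given frame rate."""
    return 1000.0 / fps


def msaa_cost(fps_without: float, fps_with: float) -> tuple[float, float]:
    """Return (added ms per frame, added time as % of the non-MSAA frame)."""
    base = frame_ms(fps_without)
    added = frame_ms(fps_with) - base
    return added, 100.0 * added / base


if __name__ == "__main__":
    for device, (off, on) in MEASURED_FPS.items():
        added, pct = msaa_cost(off, on)
        print(f"{device:<10} +{added:5.1f} ms  (+{pct:5.1f}%)")
```

Keep in mind that the game loop is capped at 60 fps, so a difference of zero on the iPhone 4S and iPad 2 only means the extra cost fits inside the 16.7 ms frame, not that MSAA is free there.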
I should also add that Flower Garden is hugely CPU bound. It generates all that geometry every frame on the CPU, and that’s a lot of vector transforms. So if your game is GPU bound, you’re likely to see higher performance hits.
I was initially very surprised to see that the performance on the simulator tanked big time (as in, going from 30 fps to 5 fps). This was news to me, but apparently the iOS simulator runs completely in software, so the quadrupling of pixels brings it to its knees. I’m really, really, impressed at how smoothly it usually runs for being in software though. It had me fooled thinking it was using OpenGL under the hood!
As far as memory goes, after an animated discussion on Twitter, it seems that the iPhone does create full buffers for the multisample frame buffers, so there is a significant increase in memory usage. It’s not something easy to track because it all happens at the driver level, but it’s something to be aware of. Because of this, I might consider turning multisampling off on retina displays, since the improvement is not mind-blowing (and the extra memory is very significant because of the amount of pixels). I’ll also probably turn it off in the simulator so I can achieve a decent frame rate during development.
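For a rough feel of the numbers, here is a back-of-the-envelope sketch. It assumes the driver really does allocate a full-size color buffer with four samples per pixel at 4 bytes each (RGBA8), and it ignores any multisampled depth buffer, padding, or other driver overhead, so treat the output as an estimate and not a measurement. The screen resolutions are the devices' native sizes, and the function name is made up for this illustration.

```python
# Native screen sizes in pixels (not taken from my measurements).
SCREEN_PIXELS = {
    "iPhone 3GS": (480, 320),
    "iPhone 4 / 4S": (960, 640),
    "iPad 1 / iPad 2": (1024, 768),
}


def msaa_color_bytes(width: int, height: int,
                     samples: int = 4, bytes_per_pixel: int = 4) -> int:
    """Estimated size of a full multisampled color buffer.

    Assumes every sample is stored (samples * bytes_per_pixel per pixel);
    the real allocation is up to the driver and may differ.
    """
    return width * height * samples * bytes_per_pixel


if __name__ == "__main__":
    for device, (w, h) in SCREEN_PIXELS.items():
        size = msaa_color_bytes(w, h)
        print(f"{device:<16} {size / (1024 * 1024):6.2f} MiB")
```

Under those assumptions the retina iPhone needs four times as much multisample color storage as the 3GS, which is why turning MSAA off there is tempting.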
Conclusions
Multisampling only required adding a few lines of code and resulted in an impressive visual improvement and minimal performance impact on some devices. It’s a no-brainer on the 3GS and iPad 2. I’m wishing I had implemented it earlier!
Great information! Thanks for writing this up.
Have you looked into FXAA? http://www.codinghorror.com/blog/2011/12/fast-approximate-anti-aliasing-fxaa.html
I have – http://aras-p.info/blog/2011/08/17/fast-mobile-shaders-or-i-did-a-talk-at-siggraph/ -, and it’s too costly for the current generation of iOS GPUs. On iPad2, for example, FXAA was taking around 12 milliseconds (compared to 4xMSAA which usually costs between 1-4 milliseconds depending on the scene).
Interesting stuff Noel. You should probably use milliseconds for performance numbers so that we can compare performance on a linear scale. It would be nice to see some iPad screenshots too!
Very true. I usually give people a hard time for putting performance results in fps instead of ms. I guess I was just being lazy. I’ll update that when I get home along with the iPad 2 numbers.
The iPad 1 and iPhone 4 have the same GPU… The iPhone 4 has about 25% fewer pixels, so I suspect the performance will be very bad. You can try using scaleFactor to draw fewer pixels with MSAA, but I suspect the rendering will look better without MSAA. The iPad 2 will run at 60 fps.
Fillrate issue on ipad1 and iphone 4: http://fabiensanglard.net/fillrate_issues/
Good call. That’s pretty close to what I found out when I measured it.
Interesting post. Thanks for sharing.
You should be able to see the memory footprint increase in the VM Tracker Instrument.
How did you stop supporting armv6? By requiring iOS 4.3? Otherwise iTunes Connect will not let you drop armv6 in my experience.
Hi Noel, great comparison of the MSAA performance. iPad 1 is a killer…
I was thinking, now that you require armv7, maybe you could try using ARM NEON for those geometry calculations:
“It generates all that geometry every frame on the CPU, and that’s a lot of vector transforms.”
Or, if you’re planning to switch to OpenGL ES 2.x at some point, you could do some of the animation calculations in the vertex shader.
Interesting.
The comparison chart might be easier to read if you added a %-additional-time column.
Otherwise, nice writeup! 🙂
I just wanted to say thank you!
Mr. Noel,
I’m just beginning to learn to use shaders. It’s been almost 24h since I finally discovered that my render-to-texture algorithm did actually work on the simulator (not on device, though).
According to you and to Apple, a few changes in the code should solve my problem… but it didn’t! I know it is a stretch but can you send me a minimal example of render-to-texture and texture-to-screen that actually works on the device?
I would really appreciate it.
hardan@gmail.com
According to your results there is no performance impact on iPad 2. This does not match my experience at all.