Update March 2012: read our updated benchmarks here
Yesterday we released our new beta WebGL renderer in r68. The release notes cover it briefly, but some of the technology is really cool, so I thought I would go into more detail in a blog post. This will be pretty long and technical, but it's always good to know what's going on under the hood! I'll also compare the performance of different renderers and engines. Here at Scirra we're in the fairly unusual position of having three renderers to compare: the Canvas 2D renderer, the new WebGL renderer, and our old C++/DirectX renderer from Construct Classic. It's certainly interesting to compare all three, since they have all been written in a similar way for a practical, real-world engine.
How GPUs draw 2D
First of all, let's cover the rendering process so you can understand what's happening in the tests. This is somewhat simplified, but it should give you the basic idea. To draw a 2D sprite in a low-level renderer (like WebGL or DirectX) you need to draw a quad. This is basically the box around the sprite, and you have to provide the X and Y co-ordinates of each corner. Each corner is called a vertex, shown by a red dot in the image below.
To efficiently draw lots of 2D sprites, you need to make a big list of the vertices of every sprite you want to draw. This list is stored in a vertex buffer, and the graphics card can use that to draw. In the image below there are five pirate princesses, some rotated and scaled, making for a total of 20 vertices.
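To make the vertex list concrete, here is a minimal TypeScript sketch of turning sprites into a flat list of quad corners. It is not the engine's actual code: the `Sprite` fields, `writeQuad` and `buildVertexBuffer` are hypothetical names invented for illustration, and it assumes each sprite is positioned by its centre with a uniform scale.

```typescript
// Hypothetical sprite description; the field names are illustrative only.
interface Sprite {
  x: number;      // centre X in pixels
  y: number;      // centre Y in pixels
  width: number;  // unscaled width in pixels
  height: number; // unscaled height in pixels
  angle: number;  // rotation in radians
  scale: number;  // uniform scale factor
}

const FLOATS_PER_QUAD = 8; // 4 vertices, each with an X and a Y

// Write the four corners of one sprite into `out`, starting at `offset`.
function writeQuad(out: Float32Array, offset: number, s: Sprite): void {
  const hw = (s.width * s.scale) / 2;  // half width after scaling
  const hh = (s.height * s.scale) / 2; // half height after scaling
  const cos = Math.cos(s.angle);
  const sin = Math.sin(s.angle);
  // Rotate a corner offset (dx, dy) around the centre and store it.
  const put = (i: number, dx: number, dy: number): void => {
    out[offset + i * 2] = s.x + dx * cos - dy * sin;
    out[offset + i * 2 + 1] = s.y + dx * sin + dy * cos;
  };
  put(0, -hw, -hh); // top-left
  put(1, hw, -hh);  // top-right
  put(2, hw, hh);   // bottom-right
  put(3, -hw, hh);  // bottom-left
}

// Build one big list holding the vertices of every sprite.
function buildVertexBuffer(sprites: Sprite[]): Float32Array {
  const buffer = new Float32Array(sprites.length * FLOATS_PER_QUAD);
  sprites.forEach((s, i) => writeQuad(buffer, i * FLOATS_PER_QUAD, s));
  return buffer;
}

// Five sprites, some rotated and scaled, as in the pirate princess example.
const princesses: Sprite[] = [];
for (let i = 0; i < 5; i++) {
  princesses.push({ x: 100 * i, y: 50, width: 32, height: 48, angle: 0.3 * i, scale: 1 + 0.25 * i });
}
const princessBuffer = buildVertexBuffer(princesses);
console.log(princessBuffer.length / 2); // 20 vertices
```

In a real WebGL renderer an array like this would then be uploaded to the graphics card (for example with `gl.bufferData`) before issuing the draw call. That step is left out here so the sketch runs without a browser.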
Graphics cards are sophisticated technology. They've been pushed by the billion-dollar 3D gaming industry for years, and now they're so blindingly fast they can render 2D too quickly. Yes, too quickly, because they can draw a sprite faster than you can tell them to. In other words, it might take you 2 microseconds to work out the positions of the vertices and send them to the graphics card, and then the graphics card finishes the work in 1 microsecond and sits idle waiting for the next command. If you're not impressed, you should be, because the CPUs issuing the commands are mind-blowingly fast too.
Anyway, this has the interesting side effect that performance in 2D games is limited by how fast you can fill the vertex buffer. You can probably assume the graphics card can draw it faster than you can fill it, so the question is: how quickly can you calculate those vertex positions and put them in the buffer?
The performance test
In order to test the performance of each renderer we've written a standard test. We have a simple blue square sprite, and we create as many on-screen as possible until the framerate drops to 30fps. This minimises the effect of the typical per-frame calculations and ensures we're measuring raw vertex-filling speed. The square is see-through so you can see them piling up. It looks a bit like this:
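The idea of the test can be sketched in a few lines of TypeScript. This is only an illustration, not the real test project: `findSpriteLimit` is a hypothetical helper, and the `measureFps` callback stands in for actually rendering frames with a given number of sprites and timing them.

```typescript
// Hypothetical harness: keep adding sprites in batches until the measured
// framerate would fall below the target, then report how many sprites fit.
function findSpriteLimit(
  measureFps: (spriteCount: number) => number, // stand-in for timing real frames
  targetFps: number = 30,
  step: number = 100,
  maxSprites: number = 1000000 // safety cap so the loop always terminates
): number {
  let count = 0;
  while (count + step <= maxSprites && measureFps(count + step) >= targetFps) {
    count += step;
  }
  return count; // largest multiple of `step` that still held the target
}

// Simulated renderer (an assumption, not a measurement): every sprite costs
// a fixed number of microseconds and nothing else takes any time.
function simulatedFps(microsPerSprite: number): (spriteCount: number) => number {
  return (spriteCount: number) => 1000000 / (spriteCount * microsPerSprite);
}

const simulatedLimit = findSpriteLimit(simulatedFps(2.3));
console.log(simulatedLimit); // 14400 with this simulated cost model
```

The simulated cost model is deliberately crude. A real run has per-frame overhead and noisy timings, so the score should be read as approximate.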
You can try the test for yourself in a variety of browsers and see how they hold up - there's the Canvas 2D performance test and the WebGL-enabled performance test that you can try online. Note that WebGL may not be supported by your browser or computer, and if not it'll fall back to the 2D renderer - check it says "renderer: webgl" at the top. The Construct 2 project for the test is here in case you want to poke around. We'll also compare results with the equivalent test in Construct Classic (warning: EXE), which is our C++/DirectX powered engine, and its project file is here.
The test system I've used to get the results below is: Intel Core i5-2500 (4 cores @ 3.3 GHz), 8 GB RAM, AMD Radeon HD 6500 (1 GB VRAM), running Windows 7 64-bit (experience index 6.8) with the latest stable releases of the top browsers: Internet Explorer 9, Chrome 15 and Firefox 8.
I've not included Safari: a Mac would have different hardware and make the comparison unfair, and it supports WebGL on Mac but not on Windows, so testing it on Windows would probably unfairly make it look bad. Opera 11.52 is the latest stable release, but it performs very badly because it doesn't have hardware acceleration yet. The Opera 12 alpha comes with hardware acceleration, but since it's an alpha I don't want to include its results yet. Sometimes making software more reliable also means making it a little slower, and since I used the stable branches of all the other browsers, I thought it would be unfair to include the Opera 12 alpha (until the end, where I throw it in for fun). I'll follow up with its results when it's finished and released, but they already look extremely promising.
You might be wondering: why is performance so important? There are a number of reasons. Firstly, games running at a solid 60 fps look better and make for a more enjoyable playing experience. Secondly, slow engines put a limit on creativity. Do you want a moment in your game when 100 enemies rush you and the screen is full of explosions? How about a crazy bullet hell game? If your game engine is slow, the game may be unplayably slow, and then you have to revise your game to something less exciting - how disappointing! Finally, faster engines are more efficient, which means they use less battery on laptops, phones and tablets. So you really want your engine to be as fast as possible!
The Canvas 2D renderer
HTML5 defines a 2D canvas for drawing 2D content like games. This is much higher level than WebGL. You don't actually send vertices. You pretty much just tell it to draw a 2D image at a position, and the browser does the calculations to work out where the vertices should go. (This is assuming it's a hardware-accelerated 2D canvas, and it is in most browsers today.) How does it fare in modern browsers? Remember the test measures the most sprites it can get on-screen at 30fps, so higher is better.
Pretty close - all managed a few thousand sprites at 30 fps, but IE9 came out on top, about 42% faster than Firefox 8.
The WebGL renderer
The interesting thing about the WebGL renderer is that our engine happens to track the location of all objects' vertices for the collision engine. This means no calculations are necessary to work out where the vertices are - they're already known. This removes the overhead of the browser figuring out for itself where the vertices are. They can simply be copied directly into the vertex buffer! Obviously not doing calculations is much faster than doing calculations, so how much does this improve performance?
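As a rough illustration of why this is cheap, here is a TypeScript sketch of filling the vertex buffer from positions that are already known. The `GameObject` shape and the `fillFromCachedQuads` helper are assumptions made up for this example, not the engine's real data structures.

```typescript
// Hypothetical object shape: `quad` holds the 8 floats (X and Y for each of
// the four corners) that are assumed to be kept up to date for collisions.
interface GameObject {
  quad: Float32Array;
}

// Copy every object's cached corners straight into the vertex buffer.
// Returns the number of floats written.
function fillFromCachedQuads(objects: GameObject[], out: Float32Array): number {
  let offset = 0;
  for (const obj of objects) {
    out.set(obj.quad, offset); // plain copy: no trigonometry, no per-vertex maths
    offset += obj.quad.length;
  }
  return offset;
}

// Two objects whose corners are already known.
const cachedObjects: GameObject[] = [
  { quad: new Float32Array([0, 0, 10, 0, 10, 10, 0, 10]) },
  { quad: new Float32Array([20, 20, 30, 20, 30, 30, 20, 30]) },
];
const cachedBuffer = new Float32Array(cachedObjects.length * 8);
const floatsWritten = fillFromCachedQuads(cachedObjects, cachedBuffer);
console.log(floatsWritten); // 16
```

The per-sprite work here is a single typed-array copy, which is the cost the WebGL path is left with once the positions are known.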
Now you can see clearly how important the overhead of filling the vertex buffer is. Firefox 8 was previously the slowest, but with WebGL it's 70% faster than IE9, and since IE9 does not support WebGL it simply can't compete. Chrome 15 is king though with a massive 356% performance boost over its own 2D renderer, and also managing to be 3 times faster than IE9's 2D renderer. It's clear that WebGL takes performance to a whole new level.
I have to mention Opera 12's alpha though, because it's so promising - I'm not including it in the graphs for the reason I stated earlier, but it scored about 21,000 sprites at 30 fps. If they can carry that performance through to the stable release, it will come out on top: 44% faster than Chrome 15, and easily 4 times faster than IE9's 2D renderer! They've come from a software-rendered 2D canvas which only scored 400, which makes it over 50 times faster than their last release! Keep up the good work, Opera!
Construct Classic's C++/DirectX renderer
I don't want to be unfair to Opera, but I am after all going to throw it in this one graph, to show the scales involved. I'll try to level it out by including both the slow Opera 11.52 and the fast Opera 12 alpha!
Another way to interpret the results is the average time to render each sprite. Classic manages to render each sprite in about 0.39 microseconds, Chrome 15 in about 2.3 microseconds (still a difference of about 5.8x), and IE9 in about 7.1 microseconds.
Let's imagine a really intense game with 1000 objects on screen. To get a reliable 60fps framerate, you have to render each frame within about 16 milliseconds (16,000 microseconds). In this case, IE9 would take 7.1 milliseconds to render each frame (44% of the frame time), Chrome 15 would take 2.3 milliseconds (14% of the frame time) and Classic would take 0.39 milliseconds (2.4% of the frame time). Remember that's only drawing time, and there must be time for the game logic to run as well. IE9's renderer takes up nearly half the frame time for drawing alone, significantly reducing the time left for game logic. If there's a lot of game logic it could cause a framerate drop, and that's especially likely with 1000 objects. Chrome 15 has much more headroom, leaving about 86% of the frame time free for logic.
Also, consider that the rendering time difference between IE9 and Chrome 15 is 4.8 milliseconds, but only 1.91 milliseconds between Chrome 15 and Classic. That's about 2.5x less. What this shows is that the low-hanging fruit has been picked by supporting WebGL: C++ is a lot faster, but the gains to be made in rendering time are much smaller once you've reached Chrome 15's performance level. High framerates look impressive, but in practice if the renderer can leave the majority of the frame time free for game logic to run, it's done its job. Consider it this way: Chrome 15 leaves 86% of the frame time free and Classic leaves 97.6% free, even though the math worked out that Chrome 15 is 5.8x slower. It is slower, but it's still doing its job nearly as well as Classic by leaving most of the frame time free.
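The arithmetic above can be reproduced with a short TypeScript sketch. The function names are hypothetical, and it uses the same rounded 16 millisecond frame budget as the text rather than the exact 16.67 milliseconds of a 60 fps frame.

```typescript
const FRAME_BUDGET_MS = 16; // rounded 60 fps frame time, as used in the text

// Convert a test score (sprites on screen at 30 fps) into the average
// time spent per sprite, in microseconds.
function microsPerSprite(spritesAt30Fps: number): number {
  const frameMicros = 1000000 / 30; // one frame at 30 fps
  return frameMicros / spritesAt30Fps;
}

// Render time and share of the frame budget for a given number of objects.
function frameShare(perSpriteMicros: number, objectCount: number): { renderMs: number; percentOfFrame: number } {
  const renderMs = (perSpriteMicros * objectCount) / 1000;
  return { renderMs, percentOfFrame: (renderMs / FRAME_BUDGET_MS) * 100 };
}

// Per-sprite times quoted in the text, applied to 1000 objects.
const ie9Share = frameShare(7.1, 1000);      // about 44% of the frame
const chromeShare = frameShare(2.3, 1000);   // about 14% of the frame
const classicShare = frameShare(0.39, 1000); // about 2.4% of the frame
console.log(ie9Share.percentOfFrame, chromeShare.percentOfFrame, classicShare.percentOfFrame);

// Going the other way: the per-sprite time implied by a test score,
// here the roughly 21,000 sprites quoted for the Opera 12 alpha.
console.log(microsPerSprite(21000));
```

The gap between the percentages is the point: moving from 44% to 14% frees far more frame time than moving from 14% to 2.4%.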
From this point of view I think you could argue WebGL is definitely fast enough for even intense 2D gaming. It leaves the majority of the frame time for game logic. The very fastest Canvas 2D renderer, IE9's, still ate up close to half the frame time, which can mean the game logic starts to hurt framerates. C++ is much faster, but beyond WebGL there are rapidly diminishing returns on frame time. Therefore I think we're happy to sacrifice the performance of C++ in favour of the massive advantage of having one HTML5 game that runs on all devices, unlike the Classic runtime, which despite its speed is tied specifically to the Windows desktop.
Finally, it's a shame that no mobile devices support WebGL yet. iOS actually includes WebGL support, but Apple have disabled it! I'm not sure why they've done this, because enabling WebGL is crucial to high performance HTML5 games on mobiles. Mobiles typically have much weaker hardware, so the performance gains that WebGL gives are essential to ensuring a wide variety of games are playable. I'm sure WebGL is the future of mobile gaming in HTML5, so let's hope phone makers add support for it soon! Once they have, hopefully your PhoneGap games will be running great as well, so let's hope the market moves along quickly.