Understanding Draw Calls.

Discussion and feedback on Construct 2

Post » Thu May 18, 2017 8:11 am

Ashley wrote: You're worrying over nothing. There is nothing here to suggest any performance problems.


Are you kidding me? Here's a new screenshot. The only thing I did was increase the number of sprites in the layout to about 1000. Take note: IN THE LAYOUT, not on screen. None of them are moving, they are just static sprites, and the framerate dropped to 30 fps.

Image

Draw calls also increase along with the number of sprites, because you're not using buffers!

Of course, 100 draw calls is not very many for a small game on a powerful device, but people making large games and games for mobile ARE noticing the bad performance, because you're not even implementing general best practices.

https://developer.mozilla.org/en-US/docs/Web/API/WebGL_API/WebGL_best_practices

"Fewer, larger draw operations will improve performance. If you have 1000 sprites to paint, try to do it as a single drawArrays() or drawElements() call. You can draw degenerate (flat) triangles if you need to draw discontinuous objects as a single drawArrays() call."

That's exactly what bunnymark is doing when I check it with the WebGL inspector.

Using a WebGL inspector I can clearly see you're not doing that! As I said, you may not notice it for small games on powerful devices, but you will notice it for LARGE games and mobile games.
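For reference, here is a minimal sketch of the degenerate-triangle trick the MDN text mentions. It is not taken from bunnymark or from c2runtime.js. The helper name buildStripIndices and the vertex layout (quad q owns vertices 4q to 4q+3, in strip order) are assumptions made up for the illustration, and the trick only applies when all the quads share the same texture and render state.

```typescript
// Hypothetical helper: builds one index list that draws `quadCount` separate
// quads as a SINGLE triangle strip. Assumes quad q owns vertices 4q..4q+3,
// already stored in strip order.
function buildStripIndices(quadCount: number): Uint16Array {
  if (quadCount <= 0) return new Uint16Array(0);
  if (quadCount * 4 > 65536) {
    throw new RangeError("too many quads for 16-bit indices");
  }
  const out: number[] = [];
  for (let q = 0; q < quadCount; q++) {
    const base = q * 4;
    if (q > 0) {
      // Repeat the previous quad's last vertex and this quad's first vertex.
      // The zero-area (degenerate) triangles this creates cover no pixels,
      // so the strip "jumps" between quads without drawing anything.
      out.push(base - 1, base);
    }
    out.push(base, base + 1, base + 2, base + 3);
  }
  return Uint16Array.from(out);
}

console.log(Array.from(buildStripIndices(2)).join(","));
// 0,1,2,3,3,4,4,5,6,7
```

With a real WebGL context, a list like this would be uploaded to an element array buffer and drawn with one gl.drawElements(gl.TRIANGLE_STRIP, indices.length, gl.UNSIGNED_SHORT, 0) call.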

Please, please, please, just look into at least following best practices and using drawArrays(). It's a known fact that WebGL overhead is an issue, and you're doing nothing to minimize it.

Or is my only option to modify c2runtime.js myself to prove you wrong?

I can confidently say that this little tweak alone would give us a LOT better performance.

If I'm wrong I'd be happy to send you a fine bottle of whiskey.
If you're wrong, the only thing you lose is a little time, and you gain happier customers because of a small tweak to how things are drawn ;)

Post » Thu May 18, 2017 10:07 am

Your screenshot shows FPS < 60 and CPU well under 100%, which is typically indicative of the GPU hardware being the bottleneck. So there's no evidence draw calls are the limitation there.

tunepunk wrote: "Fewer, larger draw operations will improve performance. If you have 1000 sprites to paint, try to do it as a single drawArrays() or drawElements() call. You can draw degenerate (flat) triangles if you need to draw discontinuous objects as a single drawArrays() call."

That's exactly what bunnymark is doing when I check it with the WebGL inspector.

Using a WebGL inspector I can clearly see you're not doing that! As I said, you may not notice it for small games on powerful devices, but you will notice it for LARGE games and mobile games.

The engine does already do that, with a sophisticated batching engine. But changing texture is one of the operations that has to split the batch. In C3, or after export, textures are merged into spritesheets and the batching works better since there are fewer texture swaps.
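A rough way to picture why a texture change splits the batch: the sketch below is a simplified model, not the actual C2 batching code. The function name countBatches and the texture names are hypothetical, and each sprite is reduced to just the texture it samples, listed in draw order.

```typescript
// Simplified model (an assumption, not engine code): a new draw call is
// needed whenever the next sprite in draw order uses a different texture
// from the one currently bound.
function countBatches(texturesInDrawOrder: string[]): number {
  let batches = 0;
  let current: string | null = null;
  for (const tex of texturesInDrawOrder) {
    if (tex !== current) {
      batches++;
      current = tex;
    }
  }
  return batches;
}

// Six sprites that alternate between two separate textures in z order.
const separateTextures = ["tree", "rock", "tree", "rock", "tree", "rock"];
// The same six sprites after their images are merged into one spritesheet.
const mergedSheet = separateTextures.map(() => "sheet");

console.log(countBatches(separateTextures)); // 6
console.log(countBatches(mergedSheet)); // 1
```

Under this model the number of draw calls depends on how often the texture changes along the draw order, not on the number of sprites as such.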

So we're already doing everything you've asked for.
Scirra Founder

Post » Thu May 18, 2017 10:20 am

To prove the point, try running the WebGL inspector on https://www.scirra.com/demos/c2/quadissueperf/. It can draw over 10,000 sprites in ~50 draw calls, most of which is just overhead to render the layout and text. In fact there's just one draw call that renders most of the sprites:

37 drawElements(TRIANGLES, 11994, UNSIGNED_SHORT, 0)


This is the batching engine working exactly as intended.
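To read an inspector entry like the one above, it can help to convert the index count into a quad count. The sketch below assumes each sprite is one quad drawn as two triangles with six indices, which is a common layout but is not stated in this thread. The names quadsInDrawCall and buildQuadIndices are hypothetical.

```typescript
// Assumption: one sprite = one quad = two triangles = six indices.
const INDICES_PER_QUAD = 6;

// Number of quads covered by a drawElements(TRIANGLES, indexCount, ...) call.
function quadsInDrawCall(indexCount: number): number {
  if (indexCount % INDICES_PER_QUAD !== 0) {
    throw new RangeError("index count is not a whole number of quads");
  }
  return indexCount / INDICES_PER_QUAD;
}

// Index list for `quadCount` quads, where quad q owns vertices 4q..4q+3.
function buildQuadIndices(quadCount: number): Uint16Array {
  const indices = new Uint16Array(quadCount * INDICES_PER_QUAD);
  for (let q = 0; q < quadCount; q++) {
    const v = q * 4;
    // Two triangles sharing the diagonal v..v+2.
    indices.set([v, v + 1, v + 2, v, v + 2, v + 3], q * INDICES_PER_QUAD);
  }
  return indices;
}

console.log(quadsInDrawCall(11994)); // 1999
console.log(buildQuadIndices(1999).length); // 11994
```

Under that assumption, the call with 11994 indices covers 1999 quads in a single draw.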

This is a complex technical part of the engine. Please don't jump to conclusions or make assumptions about what the engine is or isn't doing.

Post » Thu May 18, 2017 10:36 am

Ashley wrote: Your screenshot shows FPS < 60 and CPU well under 100%, which is typically indicative of the GPU hardware being the bottleneck. So there's no evidence draw calls are the limitation there.

"Fewer, larger draw operations will improve performance. If you have 1000 sprites to paint, try to do it as a single drawArrays() or drawElements() call. You can draw degenerate (flat) triangles if you need to draw discontinuous objects as a single drawArrays() call."

That's exactly what bunnymark is doing when I check it with the WebGL inspector.

Using a WebGL inspector I can clearly see you're not doing that! As I said, you may not notice it for small games on powerful devices, but you will notice it for LARGE games and mobile games.

The engine does already do that, with a sophisticated batching engine. But changing texture is one of the operations that has to split the batch. In C3, or after export, textures are merged into spritesheets and the batching works better since there are fewer texture swaps.

So we're already doing everything you've asked for.


No it doesn't! Use a WebGL inspector and check for yourself! The aim should be 1 draw per frame, that's it! Yes, and by splitting the batch you're creating hundreds of draws where you could be doing a single one, with all the sprites in one go!

Stepping through the C2 draws, I can see what you're explaining: some things are batched together, but it's drawing layer upon layer 100 times per frame, where you SHOULD be drawing once per frame as the bunnymark example does. All the sprites in one go! The implementation is sloppy. It's doing it completely wrong, with loads of unnecessary overhead.

There IS an overhead issue, and it scales directly with the number of sprites (draws), as you're rendering layer upon layer of drawElements calls where all of it could be drawn in one go.

Image
I'm getting lots of draws per frame, layer upon layer upon layer, and I can step through them one by one to see how it's layered.


Bunnymark uses 1 draw per frame, as you SHOULD be aiming for. No matter how many bunnies are on screen, it's always 1 draw per frame.


Image

I don't even know why I have to point out the obvious?

Do I have your permission to modify c2runtime.js and do it the right way?

Post » Thu May 18, 2017 10:48 am

tunepunk wrote: No it doesn't! Use a WebGL inspector and check for yourself! The aim should be 1 draw per frame, that's it! Yes, and by splitting the batch you're creating hundreds of draws where you could be doing a single one, with all the sprites in one go!

It already does. The entry I showed you draws all the sprites in one go. The other calls are to draw things like the text and the spinner in the corner.

Sorry, but I don't think you actually understand how WebGL rendering works.

Post » Thu May 18, 2017 10:59 am

https://www.scirra.com/demos/c2/quadissueperf/

I tried this test on my phone and got 7600 sprites on screen at 30 fps.

Image


So how do you explain why I can't even have a few hundred static sprites on the game screen without dropping to 30 fps on the same phone?

Image

What is causing the slowdown then, if it's not the draw calls and not the rendering? There must be something causing it! And I can't find anything else to improve in what I'm doing at the moment. Your stress test handles 7000 sprites on screen with no problem; my project can't handle even a few hundred.

I want to know why. My only explanation is some kind of overhead.

Post » Thu May 18, 2017 11:12 am

I've experienced the same thing; too many sprite objects in a layout and the performance gets bogged down. I had to combine a lot of sprites I'd used as tiles, background decoration and whatnot, to get performance back up. Whatever the reason, it *is* an issue on my end at least.

Post » Thu May 18, 2017 11:39 am

I already answered that:

Ashley wrote: Your screenshot shows FPS < 60 and CPU well under 100%, which is typically indicative of the GPU hardware being the bottleneck.

Post » Thu May 18, 2017 12:13 pm

Ashley wrote: Your screenshot shows FPS < 60 and CPU well under 100%, which is typically indicative of the GPU hardware being the bottleneck.


Ashley wrote: Sorry, but I don't think you actually understand how WebGL rendering works.

You're absolutely right, I don't. That's why I'm doing everything in my power to investigate why I'm getting bad performance on mobile.

I made a gif to try to show you what I mean.
Image

This doesn't look very efficient to me. I'm not a WebGL guru, but from what I've read, you should keep draws to a minimum in WebGL. This doesn't look like 1 draw per frame from one array. This looks like several drawElements calls, layer upon layer. You can see the blue dots building up on the right.

But anyway, your example is rendering 7500 sprites but my game only a few hundred, on the same phone!? I doubt it's a GPU bottleneck. Other WebGL examples and games are not bottlenecked, so why only C2 games with lots of different sprites?

I'm only guessing it has something to do with how the rendering is done.

As I said, merging most of my artwork into the same sprite, using animations and frames, seems to have a positive effect. So is the only way around the bottleneck to merge all sprites into one huge spritesheet? I want to get to the bottom of this. I shouldn't be getting 30 fps with a couple of static sprites on screen.

Anyway.... I'm going to set up a few different capx tests to further test this.

Maybe that will help in finding out why performance drops significantly when using a lot of sprites from different spritesheets, but not when using 1 sprite or 1 spritesheet (texture).

Post » Thu May 18, 2017 12:28 pm

So that particular case touches on a pretty obscure part of the engine. Another piece of WebGL performance advice is not to submit huge buffers in one go, but to actually submit them in chunks. This also helps keep the memory usage down and reduce latency to issuing work to the GPU. So the engine issues chunks of several thousand quads at a time. In the quadissue case, it reaches extreme levels of sprite batching so you are seeing lots of chunks.

There is nothing to gain by improving this. It looks like it's submitting about 2500 sprites at a time, which means the draw call overhead is about 0.04% of the naive case of one call per sprite. If we increased this to say 5000, it would make such a tiny difference it is totally irrelevant (0.02%), while increasing memory usage and latency. So like most engineering tasks there's a tradeoff here, and we've aimed at a good sweet spot.
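The arithmetic above can be checked with a short sketch. The function names chunkSizes and drawCallsVsNaivePercent are hypothetical, and the chunk size of 2500 is the approximate figure quoted in this post rather than a value read from the engine.

```typescript
// Splits `totalQuads` into consecutive chunks of at most `quadsPerChunk`,
// one draw call per chunk.
function chunkSizes(totalQuads: number, quadsPerChunk: number): number[] {
  if (quadsPerChunk <= 0) {
    throw new RangeError("quadsPerChunk must be positive");
  }
  const sizes: number[] = [];
  for (let remaining = totalQuads; remaining > 0; remaining -= quadsPerChunk) {
    sizes.push(Math.min(remaining, quadsPerChunk));
  }
  return sizes;
}

// Draw calls issued with full chunks, as a percentage of the naive approach
// of one draw call per sprite.
function drawCallsVsNaivePercent(quadsPerChunk: number): number {
  return 100 / quadsPerChunk;
}

console.log(chunkSizes(7500, 2500)); // [ 2500, 2500, 2500 ]
console.log(drawCallsVsNaivePercent(2500)); // 0.04
console.log(drawCallsVsNaivePercent(5000)); // 0.02
```

Doubling the chunk size only moves the figure from 0.04% to 0.02% of the naive call count, which is the tradeoff described above.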

So you are in fact looking at the batching engine working in ideal circumstances, and accusing it of bad performance. You should not jump to conclusions about parts of the engine you don't understand.

GPU fillrate is the bottleneck that most people run into, so that is probably the limit in your game too.
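As a rough illustration of why fillrate depends on sprite size rather than sprite count, the sketch below estimates how many times the screen area is drawn per frame. The sprite sizes, the 1280x720 screen and the name overdrawFactor are made-up assumptions, not measurements from either project in this thread. The estimate also assumes every sprite is fully on screen and ignores transparency and clipping.

```typescript
// Hypothetical sprite description: only the on-screen size matters here.
interface SpriteRect {
  width: number;
  height: number;
}

// Total sprite pixels drawn per frame, divided by the screen's pixel count.
// A value of 8 means the GPU writes roughly eight screens' worth of pixels.
function overdrawFactor(
  sprites: SpriteRect[],
  screenW: number,
  screenH: number
): number {
  const drawnPixels = sprites.reduce((sum, s) => sum + s.width * s.height, 0);
  return drawnPixels / (screenW * screenH);
}

// Made-up scenes: many tiny sprites versus a few hundred large ones.
const manySmall: SpriteRect[] = Array.from({ length: 7500 }, () => ({
  width: 32,
  height: 32,
}));
const fewLarge: SpriteRect[] = Array.from({ length: 300 }, () => ({
  width: 512,
  height: 512,
}));

console.log(overdrawFactor(manySmall, 1280, 720).toFixed(1)); // 8.3
console.log(overdrawFactor(fewLarge, 1280, 720).toFixed(1)); // 85.3
```

In this made-up comparison the 300 large sprites write about ten times as many pixels as the 7500 small ones, even though they need far fewer quads.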
