Understanding Draw Calls.


Post » Tue May 16, 2017 4:04 pm

When I check the debugger in my own project, I always see draw calls using the most CPU, more than all my game logic combined, even though I'm not using any blend modes, WebGL effects, particles, etc. I want to understand why it is so high and what I can do to reduce it.

The only way I have found to reduce it is to reduce the number of sprites on screen, sometimes even merging graphics into one big sprite to reduce draw calls (at the cost of more memory). And as draw calls go up, I notice the frame rate dropping.

My question goes out to @Ashley and the devs: is there anything more they can do on their end to optimize this further, and can they enlighten us a bit more about how it works and why it's using that much CPU? This seems to be a huge CPU hog even for simple games, especially for me as I'm designing for mobile.
Follow my progress on Twitter
or in this thread Archer Devlog

Post » Tue May 16, 2017 4:16 pm

If you can provide a minimal .capx that shows high draw call usage, I'd be happy to investigate optimising the engine. Without that the most I can do is speculate.
Scirra Founder

Post » Tue May 16, 2017 4:36 pm

@Ashley No problem, I can send over the actual project. Where do I send it, as I don't want to share it publicly?


Post » Wed May 17, 2017 9:19 am

@Ashley I sent you a project file by email.

Ashley wrote: Draw calls is another matter really, and happens on the CPU side. It's probably best to split that topic off to a new thread. We have OpenGL ES 3 equivalent capabilities with WebGL 2 though, so if at any point draw calls prove to be a bottleneck, it's something we can potentially optimise in exactly the same way a native app would adjust their draw calls to be more efficient. Most 3D APIs, WebGL included, are specifically designed to allow as much drawing as possible with the fewest draw calls, to as far as possible eliminate the CPU overhead.


I'm not a programmer, but I feel that drawing, and draw calls in particular, are not very optimized currently. It looks like it's drawing every single sprite separately, multiple times per frame, instead of drawing lots of things at once from a buffer.

It feels like there's a lot of overhead currently, and a lot of room for improvement, especially when it comes to draw calls and rendering.
Image

- Next I tested drawing with ANGLE_instanced_arrays: object positions are computed on the CPU, written to a (double-buffered) dynamic vertex buffer, and then rendered with a single draw call. In Chrome on Windows with NVIDIA I can get 450k instances before the performance drops below 60fps (so 450k particle position updates per frame in JS, and no sweat!). Performance in a native app isn't better here; my suspicion is that the vertex buffer update is the limiter (500k instances means 8MByte of dynamic vertex data shuffled to the GPU each frame). On my OSX MBP I can go up to about 180k instances (again very likely vertex throughput limited). However, in this case the way the dynamic vertex buffer works is also important: it looks like vertex buffer orphaning is useless in WebGL (see discussion here: https://groups.google.com/forum/#!topic ... MNXSNRAg8M), so I switched to double-buffering.
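
The 8 MByte figure in the quote works out to 16 bytes per instance, which would match, for example, four 32-bit floats per instance. That layout is an assumption; the quote only gives the totals. A rough sketch of the arithmetic, with hypothetical names:

```javascript
// Back-of-envelope check of the upload volume mentioned in the quote.
// ASSUMPTION: 16 bytes per instance (e.g. four 32-bit floats). The quote only
// states the totals: 500k instances and 8 MByte of vertex data per frame.
const BYTES_PER_FLOAT = 4;
const FLOATS_PER_INSTANCE = 4; // hypothetical layout, e.g. x, y, scale, rotation

// Bytes of dynamic vertex data written for one frame.
function bytesPerFrame(instanceCount) {
  return instanceCount * FLOATS_PER_INSTANCE * BYTES_PER_FLOAT;
}

// Bytes of dynamic vertex data written per second at a given frame rate.
function bytesPerSecond(instanceCount, framesPerSecond) {
  return bytesPerFrame(instanceCount) * framesPerSecond;
}

const perFrame = bytesPerFrame(500000);       // 8,000,000 bytes
const perSecond = bytesPerSecond(500000, 60); // 480,000,000 bytes
console.log(perFrame / 1e6 + " MB per frame, " + perSecond / 1e6 + " MB per second at 60fps");
```

Under that assumption, 500k instances at 60fps means roughly 480 MB of vertex data per second, which fits the quote's suspicion that buffer updates, not draw calls, are the limit in that test.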


Reading that quote, it seems some people are getting way more performance out of WebGL than we currently can in C2/C3, which I believe is due to overhead, maybe both from draw calls and from the way things are rendered. Is there any possibility there's something to this?

I'm not an engine programmer, I'm a designer, but it just seems C2/C3 could perform a lot better than it currently does by minimizing overhead.

Post » Wed May 17, 2017 9:55 am

I also noticed that I got a good performance boost by merging most of my assets into as few sprites as possible, adding the assets as different frames and animations so that they end up in the same spritesheet, since everything is rendered "per texture". If I wasn't doing that I wouldn't be getting as good performance as I currently am.

So, my conclusion: use as few sprites as possible. Adding all assets to the same sprite increases performance, since they will then be on the same "TEXTURE" (spritesheet), which results in fewer draw calls, less overhead, and less drawing per frame.
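
One way to picture why sharing a texture helps: a batching renderer can typically only merge consecutive sprites into one draw call while the bound texture stays the same. The sketch below counts batches under that assumption. It is not C2's actual renderer; `countBatches` and the texture names are hypothetical.

```javascript
// Counts how many draw batches a simple batching renderer would need.
// ASSUMPTION: sprites are drawn in z-order, and a new batch starts whenever
// the texture differs from the previous sprite's texture.
function countBatches(spritesInDrawOrder) {
  let batches = 0;
  let currentTexture = null;
  for (const sprite of spritesInDrawOrder) {
    if (sprite.texture !== currentTexture) {
      batches += 1;
      currentTexture = sprite.texture;
    }
  }
  return batches;
}

// Six sprites alternating between two separate textures:
// every sprite forces a texture change, so every sprite is its own batch.
const separateTextures = ["tree", "rock", "tree", "rock", "tree", "rock"]
  .map((texture) => ({ texture }));

// The same six sprites, all stored as frames on one shared spritesheet.
const sharedSheet = separateTextures.map(() => ({ texture: "sheet" }));

console.log(countBatches(separateTextures)); // 6
console.log(countBatches(sharedSheet));      // 1
```

Note that in this model the draw order matters as much as the texture count: two textures that alternate in z-order cost as many batches as six different textures would.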

I was looking through the whole WebGL section of c2runtime.js.

Are we allowed to modify c2runtime.js? I would like to run some tests to see if I could make some improvements there.

Post » Wed May 17, 2017 12:50 pm

Image

Testing Bunnymark vs. my Construct project's rendering: there are way fewer calls here, and far more bang for the buck. Looking at the WebGL inspector, they are rendering things differently than C2 does; it seems to be using buffers.

It seems the way C2 renders stuff has a LOT more overhead...

http://www.goodboydigital.com/pixijs/bunnymark/
Here's the link to Bunnymark if anyone wants to try it on their phone to test performance.

I can have 1500 bunnies jumping around on a midrange phone (Nokia Lumia 830) before the framerate goes below a full 60fps.

My Construct project is struggling on the same phone with 50 static objects on screen. No animations, nothing moving.


Here's a screenshot from my game, at an area with very few objects. CPU is pretty high, mostly due to draw calls. The framerate is getting low, about 50ish, with just a few static objects on the map.

Image

Here's a screenshot of Bunnymark with lots of objects jumping around, at a similar framerate (60fps).

Image

I'm pretty confident that the near-native performance @Ashley claims is possible with WebGL, but not with the current implementation, as it's REALLY inefficient.


Please take a look at this... It's not only me experiencing bad performance, and I think Construct can do better. It's just sloppy implementation and bad optimization.

And I think this should be a first priority, as people are choosing other engines due to performance issues.

Post » Wed May 17, 2017 1:43 pm

This thread should have enough solid proof now that the way C2 does the rendering is not very efficient at all, considering it's WebGL, and what it should be capable of.

Ashley wrote: If you can provide a minimal .capx that shows high draw call usage, I'd be happy to investigate optimising the engine. Without that the most I can do is speculate.


So get on with it ;)

I'd be happy to play with the new superfast C2, C3, once the optimizations are in ;)

Post » Wed May 17, 2017 2:19 pm

You're worrying over nothing. There is nothing here to suggest any performance problems.

The screenshot you posted shows it swapping texture between draw calls, which is normal. After you export - or in C3's preview mode, which uses in-editor spritesheeting - a great deal of those texture calls will disappear, as it combines most of the images on to just a few spritesheet textures. So that particular case is unique to preview mode in C2, and does not happen after export or in C3 at all.
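
As a rough illustration of what spritesheeting does, here is a minimal greedy "shelf" packer that places separate images onto shared fixed-size sheets. This is a generic sketch under stated assumptions, not the algorithm the exporter or C3's preview actually uses; `packIntoSheets`, the image sizes, and the sheet size are all hypothetical.

```javascript
// Greedy "shelf" packer: places images left to right in rows on fixed-size
// square sheets, starting a new row or a new sheet when space runs out.
// ASSUMPTION: generic illustration only, not Construct's real spritesheeting.
function packIntoSheets(images, sheetSize) {
  const placements = [];
  let sheet = 0;
  let x = 0;
  let y = 0;
  let rowHeight = 0;
  for (const img of images) {
    if (img.width > sheetSize || img.height > sheetSize) {
      throw new Error("image larger than sheet: " + img.name);
    }
    if (x + img.width > sheetSize) {
      // Current row is full: move down to a new row.
      x = 0;
      y += rowHeight;
      rowHeight = 0;
    }
    if (y + img.height > sheetSize) {
      // Current sheet is full: start a new sheet.
      sheet += 1;
      x = 0;
      y = 0;
      rowHeight = 0;
    }
    placements.push({ name: img.name, sheet, x, y });
    x += img.width;
    rowHeight = Math.max(rowHeight, img.height);
  }
  return { placements, sheetCount: images.length === 0 ? 0 : sheet + 1 };
}

// Twenty separate 64x64 images packed onto 256x256 sheets (16 fit per sheet).
const looseImages = Array.from({ length: 20 }, (_, i) => ({
  name: "image" + i,
  width: 64,
  height: 64,
}));
const packed = packIntoSheets(looseImages, 256);
console.log(packed.sheetCount); // 2
```

In this example twenty individual textures collapse to two, so a renderer that changes texture per image would have far fewer texture changes to make after packing.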

Based on the fact very few people have ever noticed a performance difference between preview and export in C2, I'd say the overhead doesn't matter anyway.
Scirra Founder

Post » Wed May 17, 2017 5:58 pm

I've noticed. That's why I came to the same conclusion and put many objects into one spritesheet so they exist on only one texture. This is why both Unity and GameMaker have ways to intelligently merge sprites onto one texture. This is why I've requested an optimization feature for C3 (that got denied).

Maybe only the more advanced users come to this conclusion and that's why you don't hear it often.
