First (truly shocking!) optimization benchmarks
Posted: Tue Jun 09, 2015 6:25 pm
I've just played the tutorial.
The camera movement showed some serious lag, though it wasn't too annoying.
That scenario just had a few units on the ground, though.
Overall, I enjoyed it.
From the past few days' discussions on this forum, I had assumed this game was CPU bound (AI, LOS, animation updates, etc.) rather than GPU bound.
Hence profiling it with MS DirectX PIX sounded like a formality. I did it anyway...
So I picked "WL01 - The Emperor's Plan" as the benchmark, looking at the left side of the scene from the French starting position.
The results were by far the most shocking I have ever run across with such a tool:
- 5000 DPUPs (DrawPrimitiveUP) per frame, i.e. one of DX9's most discouraged draw calls ("data specified by a user memory pointer") issued per sprite instance (plus all the terrain stuff)!
This can be very easily converted to 1 x DIP (DrawIndexedPrimitive) per sprite type (not instance!!!) with Hardware Instancing (SetStreamSourceFreq).
My guess: from 5000 DPUPs down to about 50 DIPs?
- 5500 SetRenderStates per frame, same as above (one per sprite instance, plus terrains).
This can be very easily converted to 5 x SetRenderState for the whole sprite renderer.
Roughly 5000 SetRenderState calls (the sprite share) vs 5?
- 5000 SetVertexShader + 5000 SetPixelShader per frame, again the same problem.
This can be very easily converted to 1 x SetVertexShader + 1 x SetPixelShader for the whole sprite renderer.
5000+5000 vs 1+1?
- 22000 SetTexture calls, which probably come from the fact that each sprite instance needs more than one texture switch (4 on average?) because of composition (trousers, horses, hats, etc.).
Some terrain stuff is also included in that figure.
Still, incredible. You should batch the draw calls that share a material.
From about 20000 SetTexture calls down to 250?
- 6600 x SetVertexShaderConstant per frame;
- 6600 x SetPixelShaderConstant per frame;
- The total number of DX9 API calls is consistently above 200,000 per frame.
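To put rough numbers on the bullets above, here is a small Python sketch that only counts API calls for a naive per-instance renderer versus a batched one. It is not the game's code and it makes no real Direct3D calls. `Sprite`, `make_scene`, `naive_calls` and `batched_calls` are hypothetical names, and the scene is an assumption: 5000 sprites of 50 types, 4 texture layers each, every instance of a type sharing the same layers, and 5 render states for the whole sprite pass.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sprite:
    sprite_type: str   # hypothetical type name, e.g. "type_07"
    layers: tuple      # hypothetical texture layers (trousers, hat, ...)


def make_scene(instances=5000, types=50, layers_per_sprite=4):
    """Build an assumed scene: every instance of a type uses the same layers."""
    return [
        Sprite(
            sprite_type=f"type_{i % types:02d}",
            layers=tuple(f"type_{i % types:02d}_layer_{k}"
                         for k in range(layers_per_sprite)),
        )
        for i in range(instances)
    ]


def naive_calls(sprites):
    """Per-instance path: everything is re-set and drawn once per sprite."""
    calls = Counter()
    for s in sprites:
        calls["SetVertexShader"] += 1
        calls["SetPixelShader"] += 1
        calls["SetRenderState"] += 1
        calls["SetTexture"] += len(s.layers)
        calls["DrawPrimitiveUP"] += 1
    return calls


def batched_calls(sprites, render_states=5):
    """Batched path: shared state once per pass, one instanced draw per type."""
    calls = Counter()
    calls["SetVertexShader"] = 1
    calls["SetPixelShader"] = 1
    calls["SetRenderState"] = render_states

    # Group instances by type so that each group becomes a single draw.
    by_type = defaultdict(list)
    for s in sprites:
        by_type[s.sprite_type].append(s)

    for group in by_type.values():
        # Textures are bound once per group, not once per instance.
        calls["SetTexture"] += len(group[0].layers)
        # Instancing setup: stream 0 (geometry) and stream 1 (per-instance data).
        calls["SetStreamSourceFreq"] += 2
        calls["DrawIndexedPrimitive"] += 1
    return calls


if __name__ == "__main__":
    scene = make_scene()
    for name, counter in (("naive", naive_calls(scene)),
                          ("batched", batched_calls(scene))):
        print(name, dict(counter), "total:", sum(counter.values()))
```

Under those assumptions the naive path issues 40,000 of the counted calls and the batched one 357. The count is not exhaustive (vertex buffer binding and shader constants are left out), and the real gain depends on how many sprite types and distinct materials are actually on screen.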
DirectX 9 API calls carry quite a lot of overhead on their own; abusing them in this way is a performance killer to the nth power.
This cannot be ignored, so please fix it as soon as possible.
It could really make all the difference in the world for a limited effort.
Back to the fun now.