GZDoom performance

Hello, fine folks.

I'm currently working on a map that has a high number of linedefs in sight at almost any given time.

My PC runs Windows XP 32-bit, with 4 GB RAM, a 3 GHz dual-core processor, and a GTX 260 video card.

When running the map in GZDoom and viewing the whole landscape, with about 9500 linedefs in sight, the frame rate drops to about 40 FPS.

I'm not finished with the map yet, and I expect to add a couple thousand more linedefs.

My question is: what needs an upgrade to improve performance? I suspect it's the processor, but I'd love to hear from people more knowledgeable about how GZDoom's renderer works and what it takes to run the map at a minimum of 60 FPS.

I haven't tested it in multiplayer yet, and I haven't tested with monsters recently, so I'm not sure how performance will be affected with those factors included.

I await your wisdom, Graf Zahl... ;)

On second thought I could have sent this as a PM, but what the heck...


You're rendering a lot of linedefs there, so I'd assume it is related to your CPU.


When running big maps with the most recent version and build of GZDoom, I've found that even with my triple-core processor, Radeon card, and 4 GB RAM, the lighting effects cause a big framerate loss. If I run without the lights PK3 loaded, it speeds right up.


Big open areas with lots of linedefs are always going to be a killer in GZDoom (and many other ports too, of course). I believe that is the main reason for the poor performance some people experience on some of my maps, such as Overlord. If you want your map to be played by as many people as possible, the best solution is obviously to reduce the problem from the map side rather than trying to bump up the performance of the computer. Of course, if you don't want to compromise your plans for your map...


It's not easy to quantify what happens just from the number of linedefs: linedefs form sectors, which are rendered as polygons, and they also form walls (more polygons), so with thousands of visible linedefs things can get pretty ugly very quickly, and I presume the slowdown isn't exactly linear. Just thinking about how many millions of texture-mapped polygons per second those thousands of linedefs translate into boggles my mind. Such architecture is overly complex even by Far Cry standards, even if it's not as visually stunning. 40 FPS? You should be glad!

So unless you have a super-tuned OpenGL (or other 3D API) engine using GPGPU or aggressive optimizations, that's about as good as it gets with an engine designed to be quite flexible (within the domain of Doom derivatives).
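For a rough sense of the numbers involved, here is a back-of-envelope estimate; the per-linedef triangle counts are pure guesses on my part, not figures from the engine:

```python
# Back-of-envelope polygon throughput. The per-surface triangle counts
# below are rough assumptions, not measurements from GZDoom.
linedefs_visible = 9500
wall_tris_per_linedef = 2   # assume one textured quad (2 triangles) per wall
flat_tris_per_linedef = 4   # assume 4 floor/ceiling triangles per linedef
fps = 40

tris_per_frame = linedefs_visible * (wall_tris_per_linedef + flat_tris_per_linedef)
tris_per_second = tris_per_frame * fps

print(tris_per_frame)    # -> 57000 triangles per frame
print(tris_per_second)   # -> 2280000 triangles per second
```

So under these guessed counts it's on the order of a couple of million textured triangles per second.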


I think it's primarily an issue with the engine and its relative lack of optimization, and not just GZDoom but Doom and all its derivatives in general.

However, I'm sure a processor upgrade would be a great help when rendering so many linedefs. I don't think any Doom-based source ports take much advantage of the GPU, do they?


What do you mean, "lack of optimization"?


I mean that the Doom engine and its source ports aren't optimized for large numbers of polygons the way modern engines are.


I'd say either could be the bottleneck in GZDoom. You could try slightly overclocking or underclocking your CPU and GPU (be careful, though) to see which gives the bigger performance boost or penalty, and go from there.


Indeed I could! Hm...

Supposedly it's very easy to overclock with Asus - YES!!

I might give it a shot.

Do you have any input, Graf Zahl?


I don't mean keeping it overclocked all the time. I mean: if you overclock your GPU by 10% and your performance goes up from 40 to 42 FPS, then you underclock it by 20% and it falls to 35 FPS, a GPU upgrade might help. If, on the other hand, nothing changes, the CPU is probably the bottleneck. The same goes for CPU overclocking/underclocking.

Honestly, I'm a bit worried about who will be able to play this map, though. :P
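That inference could be sketched as a small helper; the 5% threshold below is an arbitrary number picked for illustration, not anything official:

```python
def likely_bottleneck(fps_base, fps_gpu_clocked, fps_cpu_clocked, threshold=0.05):
    """Guess the bottleneck from how the frame rate responds to clock changes.

    fps_gpu_clocked / fps_cpu_clocked are the frame rates measured after
    changing only the GPU clock or only the CPU clock. Whichever clock
    change moves the frame rate more (by at least `threshold` as a
    fraction of the baseline) is the likely limiter.
    """
    gpu_delta = abs(fps_gpu_clocked - fps_base) / fps_base
    cpu_delta = abs(fps_cpu_clocked - fps_base) / fps_base
    if max(gpu_delta, cpu_delta) < threshold:
        return "inconclusive"
    return "GPU" if gpu_delta > cpu_delta else "CPU"

# Underclocking the GPU 20% dropped 40 fps to 35, CPU changes did little:
print(likely_bottleneck(40, 35, 39.5))  # -> GPU
```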


All the good kids who worked hard to afford beast rigs in their adulthood!

Good idea, BTW. I will do that.

But where the heck is Graf Zahl?

That name just rolls off the tongue... ;)


Is your map GZDoom-specific? If not, give it a try with GLBoom; I heard it has great performance.


Your own computer's performance on your own map is moot - what's more important is how well the average user can run it. Even if you trade in your PC for some massive beast, that doesn't help those of us who still need to squeeze a few more years out of our bargain-basement rigs. I know it sucks to make compromises with your artistic vision, but the most sensible thing to do here is to scale back your linedef count, or restructure your map so that not so many lines are visible at one time.

Anyway, to answer your technical question: the performance bottleneck here is your video card, but that doesn't mean it's worth buying a newer one. As I understand it, modern GPUs are optimized for rendering vertices in batches of thousands, because that is what newer games demand of them; but because of how the Doom engine is set up, every sector and wall has to be sent to the renderer as its own "batch" of just a few vertices. Basically, there's a limit to how fast a Doom-engine game can be rendered on a modern video card: how fast the card can accept and process new batches. That limit isn't about to go up any time soon, because sixteen-year-old game engines are not the force that drives the graphics card industry. Graf would be able to give you a more accurate explanation.

The good news is that, because the bottleneck is your GPU's ability to render walls and floors, the map should run every bit as fast as it does now once you add monsters.
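The batching argument can be illustrated with a toy cost model; the constants below are invented for illustration and are not measurements of any real driver. If each draw call carries a fixed CPU/driver overhead, thousands of four-vertex batches cost far more than the same vertices submitted in a few large batches:

```python
# Toy model of draw-call overhead; constants are made up for illustration
# and are not measurements of any real driver or GPU.
PER_CALL_US = 10.0      # assumed fixed cost per draw call, in microseconds
PER_VERTEX_US = 0.002   # assumed cost per vertex submitted

def frame_time_us(num_batches, verts_per_batch):
    """CPU-side frame time spent just submitting geometry."""
    return num_batches * (PER_CALL_US + verts_per_batch * PER_VERTEX_US)

# Doom-style: one tiny batch (a 4-vertex quad) per visible wall.
doom_style = frame_time_us(9500, 4)
# Hypothetically batched: the same 38,000 vertices in 10 big uploads.
batched = frame_time_us(10, 3800)

print(doom_style)  # ~95,076 us: dominated by per-call overhead
print(batched)     # ~176 us: per-call overhead is negligible
```

Under this model the Doom-style submission spends almost all its time in per-call overhead rather than on the vertices themselves.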


Or convert the GZDoom-only stuff to Boom equivalents and run it in GLBoom like a champ.

VinceDSS said:

Or convert the GZDoom-only stuff to Boom equivalents and run it in GLBoom like a champ.

GLBoom will be slower than GZDoom on large maps. On sunder.wad MAP10 I get 37 fps in GLBoom, 49 in GZDoom, and 114 in GLBoom+ (with a config comparable to GLBoom's capabilities).

Creaphis said:

because of how the Doom engine is set up, every sector and wall has to be sent to the renderer as its own "batch" of just a few vertices.

Out of curiosity, how difficult or feasible would it be to rework it so that it sends a lot of vertices at a time?


Based on the horrendously long debate the various coders have been having over that issue across several forums, not so easy. There's discussion of it in several threads on those forums.


It's certainly not an easy problem, but it's most definitely doable. Feasibility, of course, is another issue entirely, and one which must be considered individually by each port.

For years Doomsday would batch up all vertices visible at a given time and upload them all at once every frame using a vertex array. However, support for this ultimately died off in the "big two" GL APIs, and it became more of a performance bottleneck than simply using immediate mode, so it was dropped.

Our plans for Doomsday 2.0 include a fundamental redesign of how we represent map geometry so as to allow us to fully leverage the GPU. As yet this work has been limited to private research projects but we intend to start putting this into practice soon.

Spleen said:

Out of curiosity, how difficult or feasible would it be to rework it so that it sends a lot of vertices at a time?



The big problem is not vertex upload performance but the batching of rendering primitives. The way Doom works makes this very hard to optimize, and unless you stay close to the way Doom processes a level for rendering, it's not really solvable.

The tests I made with vertex buffers showed no measurable improvement compared to immediate mode. The main bottleneck lies elsewhere.


Thanks for the info!

DaniJ said:

Our plans for Doomsday 2.0 include a fundamental redesign of how we represent map geometry so as to allow us to fully leverage the GPU. As yet this work has been limited to private research projects but we intend to start putting this into practice soon.

Interesting, I hope that goes well.

Graf Zahl said:

unless you stay close to the way Doom processes a level

Well, I'm assuming that how Doom processes a level can be changed too, but that would be even more work, and it might come at the expense of software-renderer performance.

Graf Zahl said:

The main bottleneck lies elsewhere.

Hmm, any idea what the bottleneck on Sunder maps 9/10 is? :P

Spleen said:

Well, I'm assuming that how Doom processes a level can be changed too, but that will be even more work and it may come at the expense of software renderer performance.


I would have no problem with that :)

Isn't that the whole point of GZDoom?

Spleen said:

Hmm, any idea what the bottleneck on Sunder maps 9/10 is? :P



No. To see where GLBoom+ is so much faster, I'd need more profiling info from it. A plain FPS counter doesn't help much.

Graf Zahl said:

No. To see where GlBoom+ is so much faster I'd need more profiling info from it.

Don't worry about sunder.wad; GZDoom is fine there, at least with non-crappy hardware. GLBoom-Plus 2.5.0.6.release is faster mostly because of the Intel compiler, heh. With the current unoptimized 2.5.0.7.beta and my real config I get 51-52 fps in GLBoom-Plus and 47-49 in GZDoom. I have an 8800 GTS.

2.5.0.7.beta.msvc + my cfg - 51 fps
2.5.0.7.beta.msvc + testp.cfg - 82 fps

2.5.0.6.Intel + my cfg - 72 fps
2.5.0.6.Intel + testp.cfg - 118 fps

testp.cfg is made for comparing speed between GLBoom and GLBoom+; all new features are disabled there. My cfg gives me the best graphics GLBoom-Plus can do (8x AA, 16x aniso, 110 FOV, nicest fog, blended animations, quality renderer, etc.).

The most important differences between testp.cfg and my cfg are the quality renderer (t-junction fixing) (-6 fps) and fog-based lighting (-10 fps). 8x AA and 16x aniso (GLBoom can only do 2x) don't matter at all.

The current 2.5.0.7.Intel (/O3 + /Qipo) will probably be even faster than 2.5.0.6, because it can use precompiled draw lists for flats instead of separate drawarrays.

Graf Zahl said:

No. To see where GlBoom+ is so much faster I'd need more profiling info from it.

Did you try to find out why ZDoom is so slow on nuts.wad? You know, it's not a GZDoom issue, it's a ZDoom issue.

entryway said:

Do not worry about sunder.wad. GZDoom is fine there. At least with non crappy hardware. GLBoom-Plus 2.5.0.6.release is faster mostly because of Intel Compiler, heh. With current unoptimized 2.5.0.7.beta with my real config I have 51-52 fps in glboom-plus and 47-49 in gzdoom. I have 8800 GTS.


In other words, GLBoom+ and GZDoom suffer from the same bottleneck to about the same degree. If you look at Sunder maps 09/10 in terms of pure polygons, there aren't that many, so it could be a lot faster, correct? Like DaniJ said, it's possible to represent the map geometry in a different way to utilize more of the GPU, right?

