DuckReconMajor

Software vs Hardware timedemos

So the other day I decided to run some timedemos to see whether prboom-plus or glboom-plus was running faster on my ThinkPad T61P. I was surprised at the results.

prboom-plus on doom.wad DEMO1: 200.2
glboom-plus on doom.wad DEMO1: 63.2

prboom-plus on nuts.wad: 62.1
glboom-plus on nuts.wad: 44.8
I'm guessing it's because I've got a mobile GPU, but I never realized the difference would be that much.

Post your hardware vs software timedemos here. It'd be interesting to see what runs faster on what.

P.S. I tried disabling all the filtering on glboom-plus and it made no difference in framerate. Anything else I should try turning off?

edit: system specs: Core2Duo T7700 @2.4 and NVIDIA Quadro FX 570M (256.0 MB)


glboom-plus.exe -geom 640x480w -nosound -timedemo demo1 -config testp.cfg
2636 fps

prboom-plus.exe -geom 640x480w -nosound -timedemo demo1 -config testp_soft.cfg
1172 fps

glboom-plus.exe nuts.wad -geom 640x480w -nosound -timedemo nuts -config testp.cfg
164 fps

prboom-plus.exe nuts.wad -geom 640x480w -nosound -timedemo nuts -config testp_soft.cfg
132 fps

demos and configs: http://prboom-plus.sourceforge.net/test_demos.zip
system specs: Core2Duo E6750 2.66 @ 3.0, NVidia 8800GTS (320mb), prboom-plus 2.5.0.6 release

p.s. configs are made for being comparable with prboom/glboom renderer
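For anyone who wants to repeat the four runs above without retyping them, here is a minimal sketch that only builds the command lines. The flags, executable names, and config names are copied from this post; the helper `timedemo_command` and the `RUNS` list are made-up names for illustration, and nothing here launches the game or parses its output.

```python
from typing import List, Optional


def timedemo_command(exe: str, demo: str, config: str,
                     pwad: Optional[str] = None) -> List[str]:
    """Build one timedemo command line in the argument order used in the post."""
    cmd = [exe]
    if pwad is not None:
        # The post passes the PWAD as a bare argument right after the executable.
        cmd.append(pwad)
    cmd += ["-geom", "640x480w", "-nosound", "-timedemo", demo, "-config", config]
    return cmd


# The four runs from the post: GL and software, DEMO1 and nuts.wad.
RUNS = [
    timedemo_command("glboom-plus.exe", "demo1", "testp.cfg"),
    timedemo_command("prboom-plus.exe", "demo1", "testp_soft.cfg"),
    timedemo_command("glboom-plus.exe", "nuts", "testp.cfg", pwad="nuts.wad"),
    timedemo_command("prboom-plus.exe", "nuts", "testp_soft.cfg", pwad="nuts.wad"),
]

if __name__ == "__main__":
    for cmd in RUNS:
        print(" ".join(cmd))
```

Each list could be handed to `subprocess.run` from the directory that holds the executables, demos, and configs.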


revised specs:

glboom-plus.exe -geom 640x480w -nosound -timedemo demo1 -config testp.cfg
540 fps

prboom-plus.exe -geom 640x480w -nosound -timedemo demo1 -config testp_soft.cfg
458 fps

glboom-plus.exe nuts.wad -geom 640x480w -nosound -timedemo nuts -config testp.cfg
78 fps

prboom-plus.exe nuts.wad -geom 640x480w -nosound -timedemo nuts -config testp_soft.cfg
90 fps

system specs: Core2Duo T7700 @2.4 and NVIDIA Quadro FX 570M (256.0 MB)

weird
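To put the two sets of numbers side by side, a small sketch that turns the fps figures posted so far into GL-over-software ratios. The fps values are taken from the posts above; the dictionary layout and function names are just one way to arrange them.

```python
# (GL fps, software fps) per demo, copied from the two posts above.
RESULTS = {
    "E6750 @ 3.0 / 8800GTS": {"demo1": (2636, 1172), "nuts": (164, 132)},
    "T7700 @ 2.4 / Quadro FX 570M": {"demo1": (540, 458), "nuts": (78, 90)},
}


def gl_over_soft(gl_fps: float, soft_fps: float) -> float:
    """Ratio above 1.0 means the GL renderer was faster in that run."""
    return gl_fps / soft_fps


def ratio_table(results):
    """Return {system: {demo: ratio rounded to 2 decimals}}."""
    return {
        system: {demo: round(gl_over_soft(gl, soft), 2)
                 for demo, (gl, soft) in demos.items()}
        for system, demos in results.items()
    }


if __name__ == "__main__":
    for system, demos in ratio_table(RESULTS).items():
        for demo, ratio in demos.items():
            print(f"{system:30s} {demo:6s} GL/soft = {ratio}")
```

On these figures GL leads by about 2.25x on the desktop for demo1, while on the laptop it leads by only about 1.18x on demo1 and falls behind on nuts (about 0.87x).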

Never_Again said:

glboom-plus.exe nuts.wad -geom 640x480w -nosound -timedemo nuts -config testp.cfg
185.6 fps

system specs: Core2Duo 3.0, NVidia 9600GT (512mb), prboom-plus 2.5.0.7-test (today's build)

Hmm, I thought the 9600 should be slower than the 8800 GTS. 200 fps with the 2.5.0.6 release? I get only 152 with the beta and 164 with the release.

Never_Again said:

Oh, and the DEMO1 is from DOOM.WAD, not DOOM2, right?

My results are for doom2.wad. For demo1 @ doom.wad I get 3150 (GL) and 1173 (soft). After tweaking the cfg (no filtering, no aniso, no statusbar/hud, no messages) I got 3990 fps for demo1 @ doom.wad, heh


My eeePC Netbook system specs: Intel Atom N270 @ 1.6GHz, Intel GMA 950 (224mb), prboom-plus 2.5.0.6 release in XP 32bit

DOOM II

demo1-GL	113
demo1-PR	130
nuts-GL		27
nuts-PR		27
DV04-GL		79
DV04-PR		72

My ACER Notebook system specs: AMD Athlon64 X2 TK-57 @ 1.9GHz, ATI X1250 (256mb), prboom-plus 2.5.0.6 release in Vista 32bit

DOOM II

demo1-GL	89
demo1-PR	194
nuts-GL		31
nuts-PR		45
DV04-GL		71
DV04-PR		118

My friend's HP desktop-replacement laptop system specs: AMD Turion II M600 @ 2.4GHz, ATI HD 4200 (320mb), prboom-plus 2.5.0.6 release in Win7 64bit

DOOM II

demo1-GL	383
demo1-PR	409
nuts-GL		66
nuts-PR		76
DV04-GL		185
DV04-PR		213

mmm, I want a fast Core i7 or at least an i5 with a sweet NVIDIA when I can afford it.


It's probably worth pointing out that you also need to be using the same IWAD version as each other, and I'm pretty sure the demos differ between Shareware and Registered.

The old Doombench website tests were made with Shareware v1.9 DEMO3. The author of that page wrote somewhere that he believed the other DEMOs produced inconsistent or artificially inflated benchmark figures.

entryway said:

3990 fps for demo1 @ doom.wad

Holy shit! :D


AMD 64 X2 4200+ (939, Toledo), 2x1GB (Mushkin, Corsair) DDR 400 (Cas 3), eVGA GeForce 8800GT

glboom-plus.exe nuts.wad -geom 640x480w -nosound -timedemo nuts -config testp.cfg
76.6 fps

prboom-plus.exe nuts.wad -geom 640x480w -nosound -timedemo nuts -config testp_soft.cfg
62.8 fps


I need to get off socket 939... ugh.

Mike.Reiner said:

AMD 64 X2 4200+ (939, Toledo)

Does AMD have similar problems with cache as Pentium4 has?

What do you have in stdout.txt for prboom-plus -geom 1024x768w ?

Core2:
test case for pitch=1024 is processed 28294 times for 100 msec
test case for pitch=1056 is processed 28896 times for 100 msec
optimized screen pitch is 1056

Pentium4:
test case for pitch=1024 is processed 1130 times for 100 msec
test case for pitch=1056 is processed 18550 times for 100 msec
optimized screen pitch is 1056

As you can see, the Pentium 4 is 16x slower at pitch 1024 than at 1056, while there is almost no difference on Core2. I think that is one reason why PrBoom is ~4x faster on a Core2 than on a P4 at the same frequency.
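One plausible way to picture why a power-of-two pitch can hurt is cache-set aliasing: Doom's software renderer draws walls as vertical columns, so consecutive writes are one pitch apart. The sketch below is an illustration under assumed numbers, not a description of any CPU in this thread: it assumes 1 byte per pixel, 64-byte cache lines, and a hypothetical cache with 32 sets (the shape of an 8 KB, 4-way cache), and counts how many distinct sets a 200-row column touches.

```python
def sets_touched(pitch: int, rows: int = 200, x: int = 0,
                 line_bytes: int = 64, num_sets: int = 32) -> set:
    """Cache-set indices hit when writing one pixel per row down column x.

    Assumes 8-bit pixels, so the byte address of (x, y) is y * pitch + x.
    line_bytes and num_sets describe a hypothetical cache, not a measured one.
    """
    return {((y * pitch + x) // line_bytes) % num_sets for y in range(rows)}


if __name__ == "__main__":
    for pitch in (1024, 1056):
        used = sets_touched(pitch)
        print(f"pitch={pitch}: column touches {len(used)} of 32 sets")
```

With these assumed parameters the 1024 pitch funnels the whole column into 2 sets, while 1056 spreads it across all 32, which is the kind of difference a low-associativity cache would punish.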


test case for pitch=1024 is processed 1618 times for 100 msec
test case for pitch=1056 is processed 8539 times for 100 msec
optimized screen pitch is 1056

Mike.Reiner said:

test case for pitch=1024 is processed 1618 times for 100 msec
test case for pitch=1056 is processed 8539 times for 100 msec

hence AMD 64 X2 4200 is also shit, although it is better than p4 - difference is only 5x instead of 16x. core2 for the win!
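For reference, the ratios quoted here can be recomputed from the stdout.txt lines posted above. This is a small sketch: the regular expression matches the line format exactly as it appears in this thread, and the function and variable names are invented for the example.

```python
import re

# Matches lines such as:
#   test case for pitch=1024 is processed 1130 times for 100 msec
PITCH_LINE = re.compile(
    r"test case for pitch=(\d+) is processed (\d+) times for (\d+) msec"
)


def parse_pitch_test(text: str) -> dict:
    """Return {pitch: iteration count} for every pitch test line in text."""
    return {int(p): int(n) for p, n, _msec in PITCH_LINE.findall(text)}


def speedup(text: str, slow_pitch: int = 1024, fast_pitch: int = 1056) -> float:
    """How many times more iterations the padded pitch managed."""
    counts = parse_pitch_test(text)
    return counts[fast_pitch] / counts[slow_pitch]


# Log excerpts copied from the posts above.
LOGS = {
    "Core2": "test case for pitch=1024 is processed 28294 times for 100 msec\n"
             "test case for pitch=1056 is processed 28896 times for 100 msec",
    "Pentium4": "test case for pitch=1024 is processed 1130 times for 100 msec\n"
                "test case for pitch=1056 is processed 18550 times for 100 msec",
    "AMD 64 X2 4200+": "test case for pitch=1024 is processed 1618 times for 100 msec\n"
                       "test case for pitch=1056 is processed 8539 times for 100 msec",
}

if __name__ == "__main__":
    for cpu, log in LOGS.items():
        print(f"{cpu}: {speedup(log):.2f}x")
```

That gives roughly 1.02x for the Core2, 16.4x for the Pentium 4, and 5.3x for the X2 4200+.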


Probably there's a rendering limit in the software engine, which doesn't actually attempt to render all 4000+ fps, but is always in sync with the screen and thus will never draw more than 85 or 100 frames, picking only a fixed amount per second.

This puts it at an advantage compared to the OpenGL engine, which actually tries to fully render every frame and thus introduces an additional bottleneck. Just my opinion though.

Maes said:

Probably there's a rendering limit in the software engine, which doesn't actually attempt to render all 4000+ fps, but is always in sync with the screen and thus will never draw more than 85 or 100 frames, picking only a fixed amount per second.

vsync is not implemented for the "windib" SDL video driver, and windib is the default on non-9x platforms.


Probably there's a rendering limit in the software engine, which doesn't actually attempt to render all 4000+ fps, but is always in sync with the screen and thus will never draw more than 85 or 100 frames, picking only a fixed amount per second.

Either that, or the rendering overhead is so trivial in some cases (like in Doom.wad timedemos, which have very simple levels and few monsters) that using OpenGL actually makes things worse in terms of overhead (more probable). In the case of nuts.wad the difference is diminished, but is still dramatic. Seems that OpenGL doesn't do much good when only billboard sprites are involved (perhaps trying something with ultra-complex architecture and BSP trees would make the scale tip?).

In any case, this proves that there are situations where a software renderer may have the edge over an accelerated one: if there's so little visual processing that the OpenGL overhead is not "paid back" by the acceleration.

Maes said:

In any case, this proves that there are situations where a software renderer may have the edge over an accelerated one: if there's so little visual processing that the OpenGL overhead is not "paid back" by the acceleration.

Or when your GPU sux. :P One proof that OpenGL is just better is that recording videos with glboom+ gets better frame rates than recording with prboom+, given a good enough gpu. If you don't believe it, you can try recording a Long Days demo with both exes using any screen recorder of your choice.


GL has an advantage over software at high resolutions. On my work computer (Dual Core E5200 @ 2.5, GeForce 9500 GT) GL is only 1.2x slower at 1600x1200 than at 640x480, while software is 2x slower. Most people now have LCDs and are obliged to use high resolutions.
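One rough way to read those two slowdown figures is a toy model in which frame time is a resolution-independent part plus a part proportional to pixel count. The linear model is an assumption made for this sketch, not something measured here; only the 1.2x and 2x slowdowns and the two resolutions come from the post.

```python
def fixed_fraction(slowdown: float, low_res=(640, 480), high_res=(1600, 1200)) -> float:
    """Share of low-resolution frame time that does not scale with pixels.

    Assumed model: t(res) = fixed + per_pixel * pixels.
    With R = pixels_high / pixels_low and f = fixed / t(low_res):
        slowdown = f + R * (1 - f)   =>   f = (R - slowdown) / (R - 1)
    """
    pixel_ratio = (high_res[0] * high_res[1]) / (low_res[0] * low_res[1])
    return (pixel_ratio - slowdown) / (pixel_ratio - 1)


if __name__ == "__main__":
    for name, slowdown in (("GL", 1.2), ("software", 2.0)):
        print(f"{name}: {fixed_fraction(slowdown):.0%} of 640x480 frame time is fixed cost")
```

Under this model about 96% of the GL frame time at 640x480 is independent of resolution, against about 81% for software. The pixel count grows 6.25x between the two modes, so both renderers spend most of a low-resolution frame on work that does not depend on the number of pixels.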

Jodwin said:

Or when your GPU sux.


If the GPU is slower in rendering a sufficiently complex scene than you can do in pure software with the same quality (as was the case with the S3 "accelerators" back in 1996-1997) then yeah, you can safely say the GPU sucks.

But that's hardly the case today, if you use anything with 1 or more dedicated pixel and vertex pipelines (aka anything better than an S3 Savage and pre-4000 Intel GMA).

What doesn't change is that issuing an OpenGL or Direct3D directive has some software overhead, that's a hard and fast fact.

As with parallel processing, if the problem at hand is simple/small enough, a non-threaded non-parallel version may outperform the multithreaded one, because the overhead of using threads will not be paid back in extreme cases, unless you have a particularly optimized OS with VERY low latency for this sort of stuff.

This explains perfectly why Doom.wad timedemos often beat their GL counterparts: there's so little visual processing that just invoking anything OpenGL is slower than actually rendering the scene in software, before a single thing is even rendered in hardware.
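The "paid back" argument can be written down as a break-even calculation. This is only a sketch with invented numbers: `gl_overhead`, `soft_cost`, and `gl_cost` are hypothetical costs in arbitrary time units, chosen to show the shape of the trade-off rather than to model any real driver or GPU.

```python
def soft_time(work: float, soft_cost: float = 10.0) -> float:
    """Software renderer: no fixed overhead, higher cost per unit of work."""
    return work * soft_cost


def gl_time(work: float, gl_overhead: float = 100.0, gl_cost: float = 1.0) -> float:
    """Accelerated renderer: fixed call overhead, lower cost per unit of work."""
    return gl_overhead + work * gl_cost


def break_even(gl_overhead: float = 100.0, soft_cost: float = 10.0,
               gl_cost: float = 1.0) -> float:
    """Amount of work per frame above which the accelerated path wins."""
    return gl_overhead / (soft_cost - gl_cost)


if __name__ == "__main__":
    print(f"break-even at {break_even():.1f} units of work per frame")
    for work in (5, 50):
        winner = "software" if soft_time(work) < gl_time(work) else "GL"
        print(f"work={work}: {winner} is faster")
```

With these made-up costs a simple frame (5 units) is faster in software, while a busier one (50 units) is faster in GL, which is the pattern described above for simple timedemos versus heavier scenes.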

entryway said:

GL has an advantage over software at high resolutions. On my work computer (Dual Core E5200 @ 2.5, GeForce 9500 GT) GL is only 1.2x slower at 1600x1200 than at 640x480, while software is 2x slower. Most people now have LCDs and are obliged to use high resolutions.


Exactly.

The overhead of calling OpenGL is pretty much constant whether you render nothing, a single pixel, or a Doom 3 scene, and largely independent of the screen resolution. Of course, it's better to have enough stuff for the GPU to do, rather than invoking it needlessly/for little reason. So more complex scenes and higher resolutions will have higher efficiencies over lower ones, which will instead waste most of their time in calling overheads.

This is more of an OS/driver issue: if the drivers and OS are particularly well optimized for dealing with a bazillion tiny OpenGL calls, then maaaaybe there will be little difference, and OpenGL may even slowly pull ahead (still, not impressively so). Early Direct3D suffered very badly from this, and needed large execution buffers to be assembled in order to make it actually worth using.

I guess that Linux and nVidia should perform better under these circumstances, while Windows and ATI should be at a disadvantage.

Jodwin said:

One proof that OpenGL is just better is that recording videos with glboom+ gets better frame rates than recording with prboom+, given a good enough gpu. If you don't believe it, you can try recording a Long Days demo with both exes using any screen recorder of your choice.


Could be, but the circumstances are entirely different: in this case OpenGL is NOT required to needlessly fire off a bazillion frames per second, and the better decoupling/parallelism between game engine and rendering allows for smoother game engine updates and input recording, which is harder to achieve in a single-threaded software loop that updates the engine state and renders in the same cycle. Plus, you don't record demos at an accelerated pace ;-)


Super Jamie said:
The old Doombench website tests were made with Shareware v1.9 DEMO3. The author of that page wrote somewhere that he believed the other DEMOs produced inconsistent or artificially inflated benchmark figures.

The only difference, from the other demos in the shareware, must be that E1M7 is a bit more complex, so it may tax some aspects more than the other demos. This doesn't mean the effects of the others are artificial. All benchmarks depend on contexts because some systems are better at certain things.

I noticed -nosound lowers realtics a little bit, and I get slightly lower realtics under full DOS than in Windows 98.

Jodwin said:
Or when your GPU sux. :P

And when it sucks, there's an additional reason why OpenGL is not beneficial: visual glitches. In this sense, from personal experience, the only engine with OpenGL that works well on older hardware is Risen3D... on my graphics card, at least.

entryway said:
Most people now have LCDs and are obliged to use high resolutions.

They are?
