Here's an old post I made on the subject:
If the GPU renders a sufficiently complex scene slower than pure software can manage at the same quality (as was the case with the S3 "accelerators" back in 1996-1997), then yeah, you can safely say the GPU sucks.
Or when your GPU sux.
But that's hardly the case today, if you use anything with one or more dedicated pixel and vertex pipelines (aka anything better than an S3 Savage or a pre-4000-series Intel GMA).
What doesn't change is that issuing an OpenGL or Direct3D command carries some software overhead; that's a hard and fast fact.
It's the same as with parallel processing: if the problem at hand is simple/small enough, a non-threaded, non-parallel version may outperform the multithreaded one, because the overhead of using threads will never be paid back in such extreme cases, unless you have a particularly optimized OS with VERY low latency for this sort of stuff.
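To make the threading analogy concrete, here's a quick standalone C sketch (my own toy example, nothing to do with any port's source): it sums a deliberately tiny array once in a plain loop and once split across four pthreads, and on a workload this small the thread creation/join overhead alone usually dwarfs the actual work.

```c
/* Toy demo: for a tiny workload, the fixed cost of spawning threads can
 * exceed the work itself, so the single-threaded version wins.
 * Build: cc -O2 demo.c -lpthread -o demo */
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define N 1024          /* deliberately tiny workload */
#define NTHREADS 4

static int data[N];

struct slice { int start, end; long sum; };

static void *partial_sum(void *arg)
{
    struct slice *s = arg;
    s->sum = 0;
    for (int i = s->start; i < s->end; i++)
        s->sum += data[i];
    return NULL;
}

static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

int main(void)
{
    for (int i = 0; i < N; i++)
        data[i] = i;

    /* single-threaded: just loop over the array */
    double t0 = now_ms();
    long serial = 0;
    for (int i = 0; i < N; i++)
        serial += data[i];
    double t1 = now_ms();

    /* multithreaded: pay thread creation/join overhead up front */
    pthread_t tid[NTHREADS];
    struct slice sl[NTHREADS];
    double t2 = now_ms();
    long parallel = 0;
    for (int t = 0; t < NTHREADS; t++) {
        sl[t].start = t * (N / NTHREADS);
        sl[t].end   = (t + 1) * (N / NTHREADS);
        pthread_create(&tid[t], NULL, partial_sum, &sl[t]);
    }
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);
        parallel += sl[t].sum;
    }
    double t3 = now_ms();

    printf("serial:   sum=%ld in %.3f ms\n", serial, t1 - t0);
    printf("threaded: sum=%ld in %.3f ms\n", parallel, t3 - t2);
    return 0;
}
```

The exact numbers depend on how low-latency the OS's threading is, which is precisely the point.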
This explains perfectly why Doom.wad timedemos often beat their GL counterparts: there's so little visual processing that merely issuing the OpenGL calls takes longer than drawing the whole scene in software, before a single thing has even been rendered in hardware.
GL has an advantage over software at high resolutions. On my work computer (dual-core E5200 @ 2.5 GHz, GeForce 9500 GT) I see only a 1.2x drop in speed going from 640x480 to 1600x1200 with GL, versus 2x with software. Most people now have LCDs and are obliged to run at high resolutions.
The overhead of calling OpenGL is pretty much constant whether you render nothing, a single pixel, or a Doom 3 scene, and it's largely independent of screen resolution. Of course, it's better to give the GPU enough work to do, rather than invoking it needlessly/for little reason. So more complex scenes and higher resolutions amortize the call overhead better, while simpler, lower-resolution ones waste most of their time in it.
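A toy model with made-up numbers (mine, not measurements from the post above) shows how a fixed per-frame call overhead produces exactly that kind of scaling: the overhead is a big slice of a 640x480 frame and a small slice of a 1600x1200 one.

```c
/* Toy model (assumed numbers, not measurements): frame time = fixed
 * API-call overhead + per-pixel fill cost. A fixed overhead shrinks as
 * a fraction of the frame as resolution grows, which is why GL loses
 * proportionally less than software at 1600x1200. */
#include <stdio.h>

int main(void)
{
    const double overhead_ms = 2.0;  /* assumed fixed per-frame call cost */
    const double ns_per_px   = 1.0;  /* assumed fill cost: 1 ns per pixel */
    const long res[2][2] = { {640, 480}, {1600, 1200} };

    for (int i = 0; i < 2; i++) {
        long px = res[i][0] * res[i][1];
        double frame_ms = overhead_ms + px * ns_per_px / 1e6;
        printf("%ldx%ld: %.2f ms/frame (%.0f%% of it is call overhead)\n",
               res[i][0], res[i][1], frame_ms,
               100.0 * overhead_ms / frame_ms);
    }
    return 0;
}
```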
This is more of an OS/driver issue: if the drivers and OS are particularly optimized for dealing with a bazillion micro-OpenGL calls, then maaaaybe there will be little difference, and OpenGL may even slowly catch up (still, not impressively so). Early Direct3D suffered very badly from this, and needed large execution buffers to be assembled in order to make it actually worth using.
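For illustration, here's a sketch of what that batching looks like in classic fixed-function OpenGL (the flavor these old ports used). It's my own toy freeglut example, not code from any port or driver: flip the `batched` flag to compare thousands of tiny glBegin/glEnd round-trips against a single glDrawArrays call over a prebuilt vertex array.

```c
/* Batching sketch. Build: cc demo.c -lglut -lGLU -lGL */
#include <GL/glut.h>
#include <stdlib.h>

#define NQUADS 5000
static GLfloat verts[NQUADS * 4 * 2];   /* 4 corners per quad, x/y each */
static int batched = 1;                 /* 0 = per-quad calls, 1 = one batch */

static void build_batch(void)
{
    for (int i = 0; i < NQUADS; i++) {
        GLfloat x = (rand() % 100) / 100.0f;
        GLfloat y = (rand() % 100) / 100.0f;
        GLfloat *q = &verts[i * 8];
        q[0] = x;         q[1] = y;
        q[2] = x + 0.01f; q[3] = y;
        q[4] = x + 0.01f; q[5] = y + 0.01f;
        q[6] = x;         q[7] = y + 0.01f;
    }
}

static void display(void)
{
    glClear(GL_COLOR_BUFFER_BIT);
    if (batched) {
        /* one API round-trip for the whole batch */
        glEnableClientState(GL_VERTEX_ARRAY);
        glVertexPointer(2, GL_FLOAT, 0, verts);
        glDrawArrays(GL_QUADS, 0, NQUADS * 4);
        glDisableClientState(GL_VERTEX_ARRAY);
    } else {
        /* one API round-trip (and then some) per quad */
        for (int i = 0; i < NQUADS; i++) {
            glBegin(GL_QUADS);
            for (int c = 0; c < 4; c++)
                glVertex2fv(&verts[i * 8 + c * 2]);
            glEnd();
        }
    }
    glutSwapBuffers();
    glutPostRedisplay();   /* redraw continuously so fps can be observed */
}

int main(int argc, char **argv)
{
    glutInit(&argc, argv);
    glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB);
    glutCreateWindow("batching sketch");
    gluOrtho2D(0.0, 1.0, 0.0, 1.0);
    build_batch();
    glutDisplayFunc(display);
    glutMainLoop();
    return 0;
}
```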
I guess that Linux and nVidia should perform better under these circumstances, while Windows and ATI should be at a disadvantage.
Could be, but the circumstances are entirely different: in this case OpenGL is NOT required to fire away a bazillion frames per second needlessly, and the better decoupling/parallelism between the game engine and the rendering allows for smoother game engine/input recording, which is harder to do when a single-threaded software renderer shares one play-loop cycle with the engine's status updates. Plus, you don't record demos at an accelerated pace ;-)
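Here's a minimal sketch (my own, not actual port code) of the kind of decoupling I mean: one thread advances the game at Doom's fixed 35 Hz tic rate, while the main thread "renders" whatever the latest completed tic is, as often as it can.

```c
/* Decoupled tic/render sketch. Build: cc -O2 demo.c -lpthread */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long game_tic = 0;   /* latest completed game tic (shared) */
static int running = 1;     /* tick thread clears this when done */

static void *tick_thread(void *arg)
{
    (void)arg;
    for (int i = 0; i < 35 * 3; i++) {   /* ~3 seconds of game time */
        usleep(1000000 / 35);            /* fixed 35 Hz tic rate */
        pthread_mutex_lock(&lock);
        game_tic++;                      /* advance the game state */
        pthread_mutex_unlock(&lock);
    }
    pthread_mutex_lock(&lock);
    running = 0;
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, tick_thread, NULL);

    long frames = 0;
    for (;;) {
        pthread_mutex_lock(&lock);
        int alive = running;
        long tic = game_tic;     /* snapshot the latest game state */
        pthread_mutex_unlock(&lock);
        if (!alive)
            break;
        (void)tic;               /* a real port would render `tic` here */
        frames++;
        usleep(1000);            /* pretend a frame takes ~1 ms */
    }
    pthread_join(t, NULL);
    printf("rendered %ld frames across %ld game tics\n", frames, game_tic);
    return 0;
}
```

The renderer never blocks the tic rate and vice versa, which is what makes the input/recording smoothing easier than in a single-threaded loop.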
One proof that OpenGL is just better is that recording videos with glboom+ gets better frame rates than recording with prboom+, given a good enough GPU. If you don't believe it, you can try recording a Long Days demo with both exes using any screen recorder of your choice.