Redneckerz

FastDoom: DOS Vanilla Doom optimized for 386/486 processors


On 6/20/2024 at 12:10 PM, MrFlibble said:

Can you give a rough estimate of how much one would need to sacrifice in terms of level geometry's complexity (and/or internal structure of the data) for a speed improvement? Or would it be just "easier" to write a completely new rendering engine from scratch that'd be optimized for the 386 architecture?

 

If we just put Doom aside, do you think that it is possible to create an engine that would be more advanced than that of Wolf3D and replicate at least some of Doom's features, but run better on a 386DX at least?

 

BTW, have you considered rendering distance as a detail level variable? For example, both Bethesda's Arena and Daggerfall have this "detail" slider that actually controls the viewing distance, which affects the frame rate quite notably.

A viewing distance limit would be hard in Doom, since it operates very differently from a hardware-based renderer. In Doom you have to draw every pixel in order to avoid garbage from the previous frame. In a hardware renderer you blank the screen and most likely fill whatever is left after rendering the near geometry with a skybox. That is why distant mountains pop up when you get closer.

Doom spends a lot of time just writing the pixels to memory so there is very little to gain. You have to write something there anyway.

I suppose you could have normal pixels in the central area and potato pixels near the edges to reduce the amount of writes but make the main play area / aim area reasonably good looking. Doubt there would be much to gain, since it complicates the rendering algorithms a bit.

There could perhaps be some minor gains from hand-writing assembly for that specific cpu, if you are an asm expert.

There are probably some gains to be had in improving the data structures used for rendering. If I'm not mistaken, the order in which the BSP tree is stored is very cache-unfriendly. Maybe fixing that at load time could improve rendering speed slightly.
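To make the load-time idea concrete, here is a minimal sketch that renumbers BSP nodes in pre-order, so a parent and its first child end up adjacent in memory. It is only an illustration: `mini_node_t`, `copy_preorder` and `reorder_nodes` are hypothetical names, the struct is a trimmed-down stand-in for the real node type (no bounding boxes), and it assumes the vanilla conventions that the root is the last node and that bit 0x8000 marks a subsector child. Whether this helps at all on a given 386/486 is untested.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NF_SUBSECTOR 0x8000u  /* vanilla flag: child index refers to a subsector */

/* Hypothetical trimmed-down node; the real node_t also carries bounding boxes. */
typedef struct {
    int16_t  x, y, dx, dy;   /* partition line */
    uint16_t children[2];    /* [0] = right/front, [1] = left/back */
} mini_node_t;

/* Copy the subtree rooted at old_index into dst in pre-order and
   return the index that the subtree root received in dst. */
static uint16_t copy_preorder(const mini_node_t *src, mini_node_t *dst,
                              uint16_t count, uint16_t old_index,
                              uint16_t *next)
{
    assert(old_index < count && *next < count);  /* rejects cycles and bad indices */
    uint16_t new_index = (*next)++;
    dst[new_index] = src[old_index];
    for (int side = 0; side < 2; side++) {
        uint16_t child = src[old_index].children[side];
        if (!(child & NF_SUBSECTOR))             /* subsector children keep their index */
            dst[new_index].children[side] =
                copy_preorder(src, dst, count, child, next);
    }
    return new_index;
}

/* Reorder nodes in place. Returns the index of the root afterwards
   (0 on success, numnodes - 1 if the array was left untouched because
   some nodes were unreachable), or -1 if memory could not be allocated. */
int reorder_nodes(mini_node_t *nodes, uint16_t numnodes)
{
    assert(nodes != NULL && numnodes > 0 && numnodes < NF_SUBSECTOR);
    mini_node_t *tmp = malloc((size_t)numnodes * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    uint16_t next = 0;
    copy_preorder(nodes, tmp, numnodes, (uint16_t)(numnodes - 1), &next);
    int root = numnodes - 1;
    if (next == numnodes) {                      /* every node was reached */
        memcpy(nodes, tmp, (size_t)numnodes * sizeof *tmp);
        root = 0;
    }
    free(tmp);
    return root;
}
```

The traversal then has to start from the returned root index instead of `numnodes - 1`. Anything else that stores node indices would need the same remapping.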


So I tried the latest FastDoom VESA version, it's much better than the previous releases, good job, @viti95

A minor disadvantage is that the actors still look too bright, as if their rendering path still has that brightness error.


New dev updates. I've added untextured column rendering. It's faster, but it cripples gameplay a lot:

 

Untextured walls:

 

dos4gw_000.png.33645941b394482d1691f7020f855e67.png

 

Untextured walls + no diminished lighting

 

dos4gw_001.png.a8cc5f5a39a6bbb122569f86d0885803.png

 

As a side effect, sprites can also be rendered untextured:

 

dos4gw_002.png.6d723f0684a3b572202ce4f0a109f336.png

dos4gw_003.png.93740f51c1fa55f16a4380c0c2dbef27.png

 

This will be available on the next release
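For readers wondering where the saving comes from, here is a rough sketch of the two inner loops. This is not FastDoom's actual code: `draw_column_textured` follows the general shape of the vanilla column drawer (16.16 fixed-point texture coordinate, 128-texel wrap, colormap lookup), and `draw_column_flat` is a hypothetical single-color variant. All names and parameters are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Textured column: per pixel, one texture fetch, one colormap lookup,
   one fixed-point add and one store. */
void draw_column_textured(uint8_t *dest, int pitch, int count,
                          const uint8_t *source, const uint8_t *colormap,
                          uint32_t frac, uint32_t fracstep)
{
    assert(dest != 0 && source != 0 && colormap != 0 && pitch > 0);
    for (int i = 0; i < count; i++) {
        dest[i * pitch] = colormap[source[(frac >> 16) & 127]];
        frac += fracstep;
    }
}

/* Untextured column: the color is picked once by the caller,
   so the loop is just a store per pixel. */
void draw_column_flat(uint8_t *dest, int pitch, int count, uint8_t color)
{
    assert(dest != 0 && pitch > 0);
    for (int i = 0; i < count; i++)
        dest[i * pitch] = color;
}
```

With diminished lighting kept, the caller would presumably still run the chosen color through the light level's colormap once per column. With it disabled, even that lookup goes away.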


SNES Doom mode, but more than the flats are, well, flat colors :)


That way it's going to work fast on a 386sx-16 finally. I guess this trick could be utilized for drawing distant geometry and sprites too.

10 hours ago, viti95 said:

New dev updates. I've added untextured column rendering. It's faster, but it cripples gameplay a lot:

 

Untextured walls:

 

dos4gw_000.png.33645941b394482d1691f7020f855e67.png

 

Untextured walls + no diminished lighting

 

dos4gw_001.png.a8cc5f5a39a6bbb122569f86d0885803.png

 

As a side effect, sprites can also be rendered untextured:

 

dos4gw_002.png.6d723f0684a3b572202ce4f0a109f336.png

dos4gw_003.png.93740f51c1fa55f16a4380c0c2dbef27.png

 

This will be available on the next release

 

DOOM 64Kb


The truth is that miniwad is actually very playable and looks pretty neat too.

8 hours ago, viti95 said:

 

https://x.com/viti95/status/1812065740133920798?t=Q8s0Rihq8t0skCN2U-Em9Q&s=19

 

This is a test on my 386sx-33, still far from being playable with decent graphics lol

I don't care about that CPU tbh. The thing is that it doubles the performance without sacrificing image quality, and triples it on real modern hardware under DOS. This is amazing.

Any 486 from 33 MHz up, and all Pentiums, really favor your fdoom13h.exe. The higher-resolution versions are also great, especially 400r.

 

The only disadvantage compared to the original exes is the OPL music, which I wrote about before. It sounds considerably worse. If only you could at least get the Chocolate Doom code in there to make it better.


New release! FastDoom 0.9.9f

 

Changelog:

 

* Dreamblaster S2P support (MIDI through LPT)
* New rendering options, now it's possible to draw walls and sprites untextured (single color)
* Updated display options menu: scrolling is now available, so it's possible to add as many options as wanted
* Optimize flat sky rendering
* Two new command line options, "-freeram" and "-limitram". The first one limits the amount of RAM available for zone memory (in KB), and the second limits the amount of memory left free after allocating zone memory (also in KB). Reverted the default free memory to 128 KB (fixes issues on some setups) @jsmolina
* Added file size verification for supported IWADs (in order to reduce unsupported IWAD version issues)

 

https://github.com/viti95/FastDoom/releases/tag/0.9.9f


Good news!

 

Just a few suggestions:

- Ultimate Doom, Doom Registered and Doom Shareware IWAD identification based on the presence of E4M1 and E2M1 or their music files, with an autodetect option configurable in fdsetup and fdoom.cfg;

- Sigil and Sigil2? support. This one is really CPU hungry, and the only way to play it smoothly enough with a vanilla exe is on a Pentium 133 system or higher. It would be really cool if FastDoom supported it;

 

... and a bug report:
- performance of the whole source port is totally unplayable on a 386sx since the version that introduced a special font in PCem, regardless of system specs like RAM and video, and regardless of the FastDoom CPU rendering path being set to 386sx. The issue doesn't occur with a 486sx.


So I was bughunting in realdoom all day, trying to fix this bug, similar to a slime trail bug, visible below and to the left of the blue key.

 

(If you wish to trigger this exact position, what I do is toss the following into the automap drawer function after the automap active check, then toggle the map on and off in game to trigger the position change:

players[0].mo->x = 0xff809e2b;
players[0].mo->y = 0xfdd065a0;
players[0].mo->z = 0x00600000;
players[0].mo->angle = 0xbe400000;
)

 

Screenshot_2024-08-16_at_11.09.31.png?ex

 

So the bug doesn't show up in this position in vanilla. After lots and lots of digging, comparing params to RenderSegRange, etc., I found that this actually happens as a quirk of the non-recursive BSP traversal algorithm that I pulled from FastDoom. It's just that R_Subsector/AddLine gets called in a slightly different node order, which leads to solidsegs being constructed slightly differently, causing different buggy single-pixel-wide calls to R_StoreWallRange. (It's very possible that the bug similarly appears in some spots with the vanilla BSP algorithm and doesn't appear in the same spots with the non-recursive version.) I confirmed it also happens in FastDoom and not in pcdoomv2, for example. I replaced my BSP code with the old vanilla version and it went away.

 

I don't think this graphical glitch is necessarily a big deal. I actually assumed it was a bug I caused in my code with some lack of precision, which is why I worked so hard to find it. Anyway, I did some benchmarking and measured something like a 0.02% fps difference between the vanilla (recursive) and non-recursive algorithms. I've made changes for even less FPS gain than that, so I get it, but I figured I'd ask if you ran into any other issues that caused you to pick the non-recursive one. I honestly only noticed the problems it was giving me just now, after including it for almost a year.

I may poke around at the algorithm and see if I can figure out a way to force the same R_Subsector call order to get the best of both worlds... not immediately sure if it's possible.

 

EDIT: Okay. I worked on this a few more hours today. So it's actually the tiebreak (the equals case) in the left/right comparison in the BSP traversal that causes this different behavior. To be 'extra similar' to vanilla, I've just changed the equals symbol in the comparator for the first side traversed.

 

I meanwhile tried a few different approaches to the bsp node traversal. I found one that is a bit faster on pentium, but it is slower on 386. I couldn't get anything faster on older processors - the best option is probably still the one without recursion that fastdoom uses. I will investigate further when rewriting in asm later.

 

 

 

Edited by sqpat


Mmmm interesting, didn't notice this issue. Now I wonder if that can cause issues on demo compatibility. Maybe I'll revert this on FastDoom, as RAM optimization is not as important as on 16-bit systems

8 hours ago, viti95 said:

Mmmm interesting, didn't notice this issue. Now I wonder if that can cause issues on demo compatibility. Maybe I'll revert this on FastDoom, as RAM optimization is not as important as on 16-bit systems


To be clear, after further investigation I think that if you just change the "right >= left" in FastDoom to "right > left", it goes back to vanilla rendering order and you can keep the non-recursive algorithm. It shouldn't affect timedemo compatibility at all. The renderer basically has a few bugs somewhat related to single-pixel column draws. This tiebreak change on right vs. left leads to some rare cases where BSP traversal is slightly different, which leads to slightly different instances of the bug. I assumed it was a bug of mine due to using 16 bits of precision when I needed 32 bits, which is why I worked so hard to find it...
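A small sketch of why the tiebreak matters, for anyone following along. This is neither the FastDoom nor the vanilla code: `bsp_node_t`, `point_on_side` and `walk_bsp` are hypothetical, the side test uses a plain integer cross product instead of fixed-point math, and the bounding-box check on the far side is left out. It walks the tree with an explicit stack and records the order in which subsectors are visited. The `tie_side` parameter decides which child counts as "near" when the viewer sits exactly on a partition line.

```c
#include <assert.h>
#include <stdint.h>

#define NF_SUBSECTOR  0x8000u  /* child index refers to a subsector */
#define MAX_BSP_DEPTH 64       /* assumed upper bound on tree depth */

typedef struct {
    int32_t  x, y, dx, dy;     /* partition line */
    uint16_t children[2];      /* [0] = front/right, [1] = back/left */
} bsp_node_t;

/* 0 = front, 1 = back. tie_side is returned when the point lies
   exactly on the partition line (right == left). */
static int point_on_side(int32_t px, int32_t py, const bsp_node_t *n,
                         int tie_side)
{
    int64_t left  = (int64_t)n->dy * (px - n->x);
    int64_t right = (int64_t)(py - n->y) * n->dx;
    if (right == left)
        return tie_side;
    return right < left ? 0 : 1;
}

/* Non-recursive near-to-far walk. Writes up to out_cap subsector
   numbers into out and returns how many subsectors were visited. */
int walk_bsp(const bsp_node_t *nodes, uint16_t root, int32_t px, int32_t py,
             int tie_side, uint16_t *out, int out_cap)
{
    uint16_t stack[MAX_BSP_DEPTH];
    int sp = 0, count = 0;
    uint16_t cur = root;

    for (;;) {
        while (!(cur & NF_SUBSECTOR)) {
            const bsp_node_t *n = &nodes[cur];
            int side = point_on_side(px, py, n, tie_side);
            assert(sp < MAX_BSP_DEPTH);
            stack[sp++] = n->children[side ^ 1];  /* far side, visited later */
            cur = n->children[side];              /* descend into near side */
        }
        if (count < out_cap)
            out[count] = (uint16_t)(cur & ~NF_SUBSECTOR);
        count++;
        if (sp == 0)
            break;
        cur = stack[--sp];
    }
    return count;
}
```

With a single vertical partition through the origin and the viewer standing exactly on it, `tie_side = 1` visits subsector 1 before 0 and `tie_side = 0` visits 0 before 1. That reordering is the kind of difference that changes how solidsegs get built.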

 

This reminds me, sometimes I have thought it would be neat for testing purposes to take per-frame checksums of all the pixels on screen during a timedemo, and compare them against a known good "vanilla" set of checksums, to be able to compare timedemos programmatically (rather than watching them) and determine at which tics they go bad. But innocuous bugs like this would trip the comparator. The other idea was a dump of all thing positions and the prnd index, but that feels like a lot of work too. I'll probably never get around to such a thing, but it sounds possible in theory.
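The checksum idea could look roughly like the following sketch. The function names are made up, and the choice of 32-bit FNV-1a is only an assumption picked because it is tiny and needs no tables. One checksum per rendered frame is stored, and a second pass reports the first tic where a run differs from the reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over the raw 8-bit framebuffer contents. */
uint32_t frame_checksum(const uint8_t *pixels, size_t len)
{
    assert(pixels != NULL || len == 0);
    uint32_t h = 2166136261u;          /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= pixels[i];
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* Index of the first tic whose checksum differs from the reference
   run, or -1 if all of them match. */
long first_divergent_tic(const uint32_t *expected, const uint32_t *actual,
                         size_t tics)
{
    for (size_t t = 0; t < tics; t++)
        if (expected[t] != actual[t])
            return (long)t;
    return -1;
}
```

As noted above, harmless one-pixel rendering differences would still trip this. A checksum over thing positions and the prnd index would only catch real gameplay desyncs.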

On 8/18/2024 at 3:46 PM, Frenkel said:

That sounds a bit like Headless Doom.

 

Oh wow, cool. Did not know that existed.


Exciting News! Doug has been working on frame interpolation and it's very close to being 100% finished. With this new option (options -> display -> uncapped fps ON), visuals can now update faster than 35 fps. This feature works well on faster 486 systems or 5th generation CPUs (K5/K6/Pentium/Cyrix 6x86/...)

 

https://github.com/viti95/FastDoom/releases/tag/0.9.9g_test1

 

Please note that there are still a few known issues, such as the FPS counter and some quirks with application startup. We need user feedback as this is a significant change. If you encounter any bugs or unexpected behavior, please let us know!


Anyone else notice that benchmarks can freeze on some systems?

They run fine on my 386DX.

I tried on an AMD 5x86 133 MHz and it freezes completely, and I have to reboot.

Recently I also got one of those Pocket386 handhelds. I tried the benchmark there and it freezes after a few frames. However, the game itself plays well enough without freezes (for a slow 386SX, that is).

It's only the benchmarks that are affected on certain systems. What could be the reason, and how can it be avoided?


Would it be possible to add code to have the game detect slowdowns and speedups to automatically adjust the optimizations used in order to keep game play within certain frame rates? If the fps drops below 20fps (a frame takes too long to generate) the next frame is rendered with slightly less details/effects etc.

This way we could have a more constant frame rate without having to resort to low quality visuals all the time. Make it user-specified. One could set it to try to keep frame rate above 20fps, and if we go above 25fps, the engine turns on more details again. Another user might set the limits to 35 as lower bound, to ensure that a very high frame rate is always achieved on their system. Ideally you should be able to specify which options you favor more.

This way we get decent fps in large areas with many monsters, but we would also see better quality visuals if we get close to a wall etc or play in a non-complex area.

Care should be taken not to make the transitions too jarring, or to flip options on and off too often.
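A minimal sketch of such a controller, assuming detail is expressed as a single integer level (0 = full detail, higher = cheaper). Nothing like this exists in FastDoom as far as this thread shows, and `detail_ctl_t` and `detail_update` are hypothetical names. The gap between the two thresholds plus a cooldown after each change is what keeps it from flipping options every frame.

```c
#include <assert.h>

typedef struct {
    int low_fps;      /* below this, drop detail (e.g. 20) */
    int high_fps;     /* above this, restore detail (e.g. 25) */
    int level;        /* current level: 0 = full detail */
    int max_level;    /* cheapest level available */
    int hold_frames;  /* frames to wait after a change */
    int cooldown;     /* frames left before the next change is allowed */
} detail_ctl_t;

/* Feed the measured frame rate once per frame; returns the detail
   level to use for the next frame. */
int detail_update(detail_ctl_t *c, int fps)
{
    assert(c != 0 && c->low_fps <= c->high_fps && c->max_level >= 0);
    if (c->cooldown > 0) {            /* recently changed: stay put */
        c->cooldown--;
        return c->level;
    }
    if (fps < c->low_fps && c->level < c->max_level) {
        c->level++;                   /* too slow: cheaper rendering */
        c->cooldown = c->hold_frames;
    } else if (fps > c->high_fps && c->level > 0) {
        c->level--;                   /* headroom: bring detail back */
        c->cooldown = c->hold_frames;
    }
    return c->level;                  /* inside the band: no change */
}
```

Mapping a level to concrete options (low detail, untextured flats, and so on) in the order the user prefers would sit on top of this.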


Modern games have this feature. It's called "Dynamic Resolution" and it kinda sucks. Thank God it can be disabled. It's just as bad as the difference between low and high fps.

I'm wondering if it would be possible to port code from the console versions: pick the fastest for walls, the fastest for flats, the fastest for sprites, and put it all in here?


I was thinking of an option to reduce the number of sprites loaded. Maybe FastDoom could allow using only 4, 2, or 1 angles for monster sprites instead of the usual 8. Given that most of the time the monsters face the player, it would often go unnoticed, and it would take up to 8x less memory for sprites. Also, maybe it could render only every other animation frame, further reducing memory pressure. This would be another option.

 

I think SNES DOOM only used one face for the sprites (like wolf3D's bosses).
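The rotation part of that idea boils down to snapping the usual 0..7 rotation index to the nearest rotation that is still loaded. A sketch, with `reduce_rotation` as a hypothetical helper that assumes the kept rotations are evenly spaced starting at rotation 0 (the front view):

```c
#include <assert.h>

/* Snap a rotation index (0..7) to the nearest kept rotation.
   keep must be 8, 4, 2 or 1. With keep = 4 the kept rotations are
   0, 2, 4, 6; with keep = 2 they are 0 and 4; with keep = 1 only 0. */
int reduce_rotation(int rot, int keep)
{
    assert(rot >= 0 && rot < 8);
    assert(keep == 8 || keep == 4 || keep == 2 || keep == 1);
    int step = 8 / keep;
    /* Round to the nearest multiple of step, wrapping 8 back to 0. */
    return ((rot + step / 2) / step * step) & 7;
}
```

The actual memory saved depends on the sprite set, since many sprites already share one lump between mirrored rotations.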

On 9/4/2024 at 11:45 AM, Optimus said:

Anyone else notice that benchmarks can freeze on some systems?

They run fine on my 386DX.

I tried on an AMD 5x86 133 MHz and it freezes completely, and I have to reboot.

Recently I also got one of those Pocket386 handhelds. I tried the benchmark there and it freezes after a few frames. However, the game itself plays well enough without freezes (for a slow 386SX, that is).

It's only the benchmarks that are affected on certain systems. What could be the reason, and how can it be avoided?

 

My guess is that there might be a memory leak somewhere in the code. Not having a good debugger makes everything more difficult. I'm thinking of adding minimal support for modern OS, so we can have better tools to detect these leaks.

 


FWIW, I recently saw this video about the nearly-unobtainium UMC Green 486 CPUs, which outperformed contemporary Intel, Cyrix and AMD chips (yes, there are Doom and FastDoom benchmarks in a sequel, in this other post).

 

 

What surprised me was the difference in performance (for the worse) between the Cyrix CPUs and pretty much all the rest. They were easily 20-25% down at frequency parity compared to everything else. Now I feel QUITE cheated about my experience with Doom (had a Cx486 DX/40...) and, more generally, PC gaming back in the day :-/


Cyrix gave you more bang for the buck, and their chips were geared towards typical office applications. Games tended to be optimized for Intel CPUs. Had they instead been optimized for Cyrix, the picture might have been a bit different. Quake is a good example of a game tuned for Pentium processors, and it is very slow on 486/Cyrix/AMD, even if the MHz would indicate otherwise. Abrash has written a lot about various architectures and optimizations in his Black Book (free online).

Intel was the safest bet for game devs, since it was the dominant player and what game reviewers most likely had. Enthusiast users tended to favor Intel.

A cyrix 686 is phenomenal at Doom, but plays a shit Quake.

Ironically, Intel ended up with a similar problem with their Pentium 4 line. The high MHz rating and the Pentium brand suggested high performance, but it turned out to be rather lackluster. The 1.4GHz launch model was generally slower than a P3 900MHz. Intel never managed to perfect the P4 architecture, and went back to something much closer to their P3 architecture after a while. They screwed up on 64-bit as well, and these days we all use AMD's instruction set for 64-bit CPUs.

An office user would probably prefer a cyrix cpu and more ram, over a slightly faster performing intel cpu.


I reckon that their desire to push the technology forward didn't allow them to optimize Quake for non-586 systems. Computers then were so slow and expensive that the guys at id wanted to drive the market forward, so that their games would finally become playable on consumer systems later. People today laugh at games like Quake, calling them "scrap metal games". Who knows how much longer the business of selling 486 clones would have gone on if not for games like Quake.

Spoiler

Quake was slow on CPUs other than the Pentium because of the perspective-correction process, which ran the integer and floating-point units in parallel. That was impossible on other CPUs, where each unit had to wait for the other to finish its work, especially those FDIV instructions. That was the texture mapper.

 

I'm pretty sure that Quake could be optimized for the 486, though, by adding a low detail mode, flat surfaces like in FastDoom, and a different texture mapper that uses only integer instructions, similar to the Descent or PS1 approach.

 

An affine texture mapper was implemented in the original Quake to draw polygonal entities like monsters, decorations and other stuff. The texture warping/wobbling effect wasn't very noticeable due to the small size of the polygons. Maybe one could take that part of the code, divide the closer geometry into a larger number of triangles like it's done in Quake 2 PSX, and call it a day?
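To illustrate the trade-off between the two mappers, here is a sketch of one texture coordinate along a horizontal span. It is not Quake's code: `span_u` and its parameters are made up, and it uses floats for readability where a 486-friendly version would use fixed point. The exact coordinate u = (u/z) / (1/z) is only computed at chunk boundaries and interpolated linearly in between. With `step = 1` this is fully perspective-correct, and with `step >= len` it degenerates to a fully affine span. As far as I know, Quake's span drawer sits in between with 16-pixel chunks, but treat that as an assumption.

```c
#include <assert.h>

/* uoz, ooz:   u/z and 1/z at the first pixel of the span
   duoz, dooz: their per-pixel screen-space gradients
   len:        pixels in the span
   step:       pixels between exact perspective divides
   out:        receives len texture coordinates */
void span_u(float uoz, float duoz, float ooz, float dooz,
            int len, int step, float *out)
{
    assert(len > 0 && step > 0 && out != 0);
    float u0 = uoz / ooz;                       /* exact u at span start */
    for (int x = 0; x < len; x += step) {
        int n = (len - x < step) ? len - x : step;
        /* One perspective divide per chunk, at the end of the chunk. */
        float u1 = (uoz + duoz * (float)(x + n)) /
                   (ooz + dooz * (float)(x + n));
        float du = (u1 - u0) / (float)n;        /* affine step inside the chunk */
        for (int i = 0; i < n; i++)
            out[x + i] = u0 + du * (float)i;
        u0 = u1;
    }
}
```

On a surface at constant depth both extremes give the same result. The affine error grows with the depth change across the span, which is why splitting nearby geometry into smaller polygons hides it.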

 

I was experimenting with the Duke Nukem 3D source code, as it compiles really easily in DOS with a Watcom compiler, and replaced the slope rendering routines that use the FPU with the coded-but-unused integer ones (replacing slopevlin_ with slopevlin2_ in a text editor is all you need to do in game.c?). I tested it in PCem with an am486-SX2-66: the integer routines rendered 1 frame less (about 8% relatively), and I read somewhere on VOGONS that when an FPU is absent it relies on Watcom's FPU emulator. As soon as a CPU has an FPU the difference really goes up, so it might explain why Ken decided to use the float routines, as they were faster on SX processors anyway and even faster on DX and Pentiums.

 

Unfortunately, I'm too stupid for this at the moment.


Id specifically hired Abrash to optimize for the Pentium. It was custom written assembly for that exact architecture. They wanted to produce a good looking and technologically innovative game. A lot of Quake stuff went really wrong, and the end product is a mixed bag of brilliance and uninspired work.

Their approach with a client server model, great networking options, mod programming language support and general level of fidelity in the rendered maps was amazing.

They failed on the game design bit and the game lacks polish and artistic flair. It's an extremely bare-bones approach to rendering 37 mostly good looking maps. The quality is much more even, it doesn't have the lows that Doom (2) had. It has that tongue in cheek dark humor that was gone from Quake 2. Quake 2 had excellent polish, but lacked the fun element that made Quake a classic.

That's just my take though. I remember being disappointed after installing the registered version on pretty much launch day and starting it up, and it looked exactly like the shareware. They hadn't even updated the demos. Still no title pic or other additional artwork. We got a few more enemies and a lot of maps and a tree boss...

Seeing e1m8 low gravity as a secret map, I was expecting great things. Turns out it was the best secret map in the game, and all of the iconic / hard monsters are already in the shareware.


I was 6 when I first played Quake, and after Doom and Duke it seemed like a huge downgrade to me. I had an AMD K5-PR133, so Quake was very much playable for me. The issue was not the performance, it was the overall quality. I quickly went back to playing Duke Nukem 3D, as it appealed to me so much more, being reminiscent of real life, instead of a poorly shaved doomguy who put on an ancient "helmet" and went back to ancient times to shoot primitive guns. That wasn't fun at all. It's only relatively recently that I came to like it more.

Quake 2 I first played a year ago, both the old Windows and the new DOS versions, and it seemed like a huge step back from Quake: the tech is better, but the gameplay and everything else looks like a toy. I played through some levels and found them boring, like a Doom 3 of 1997 but much, much worse. Oh yeah, the Quake 2 system requirements are much higher than Quake's: if the latter required a 486DX4-100 as a minimum, the former required at least a Pentium 90...

Half-Life was the first game that squeezed the best out of the Quake engine and that looked and played serious.

Upd: you can see colored dynamic lighting and environment mapping on security helmets in software-mode Half-Life! Something GZDoom can only do with GL3+ cards.

Edited by Darkcrafter07 : Add

