Redneckerz

FastDoom: DOS Vanilla Doom optimized for 386/486 processors


New release! FastDoom 0.8.14

 

Changelog:

 

- Performance uplift, avoid AGI stalls on column and visplane rendering code (486). Also small optimizations in sound code.

- Dual screen support. It's now possible to use a Hercules card/monitor to render the automap. Use the command line parameter '-hercmap'.

- Better performance on 8-bit ISA cards and special backbuffered modes (Hercules, CGA, Plantronics ColorPlus, EGA, ATI 640x200).

- Removed Hercules 640x200 mode. It only worked on emulators and didn't work at all on real hardware. It can even cause serious problems on B&W monitors.

- Added transparent automap support. Press 'T' while on automap mode. Only supported while in 'fullscreen' mode.

 

Grab it here: https://github.com/viti95/FastDoom/releases/tag/0.8.14

 

 

Edited by viti95 : Add download link


Cool. I was reading about the AGI stalls somewhere; I didn't know about them or pay much attention to them before, even though I try to optimize for oldschool PCs.

 

But what is VBD mode? I can't find anything about it. Or maybe it's something I'm aware of but just don't know by that name?


AGIs (Address Generation Interlocks) are more important than I thought, especially in critical code (such as the column and visplane renderer code, or the sound code). There are some manuals that explain what has to be avoided. The AMD 486 software development manual (https://www.amd.com/system/files/TechDocs/18497.pdf) explains what to avoid really well. There are also code optimization manuals for the Pentium, AMD K5 and Cyrix 6x86, and some of the AGIs are shared between those. So by avoiding those AGIs you're optimizing for pretty much all of those architectures at the same time.

 

Regarding the VBD mode, it's the VESA 2.0 Linear FrameBuffer "direct" executable (it's just a really bad acronym). It's by far the fastest when the video card supports it. It's a mix between Mode Y and the backbuffered modes: it uses the backbuffer rendering routines, but modified to write directly to VRAM. This avoids the copy between the backbuffer and VRAM, and also avoids all the OUT instructions that Mode Y requires to change planes.
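To make the difference concrete, here is a rough sketch in C of how a pixel address is formed in each layout. It assumes the usual unchained 320x200 VGA layout for Mode Y (four planes, 80 bytes per row) and a caller-supplied pitch for the linear framebuffer. The names `planar_addr_t`, `mode_y_addr` and `linear_addr` are made up for illustration and are not taken from the FastDoom source.

```c
#include <stdint.h>

#define SCREEN_W 320

/* Mode Y (planar): pixel x lives in plane (x & 3), at byte y*80 + x/4
   inside that plane. Switching planes costs an OUT to the VGA sequencer. */
typedef struct {
    uint8_t  plane;   /* 0..3, selected through the Map Mask register */
    uint16_t offset;  /* byte offset inside the selected plane */
} planar_addr_t;

static planar_addr_t mode_y_addr(unsigned x, unsigned y)
{
    planar_addr_t a;
    a.plane  = (uint8_t)(x & 3u);
    a.offset = (uint16_t)(y * (SCREEN_W / 4) + (x >> 2));
    return a;
}

/* Linear framebuffer: one byte per pixel, no plane switching at all. */
static uint32_t linear_addr(unsigned x, unsigned y, unsigned pitch)
{
    return (uint32_t)y * pitch + x;
}
```

In the planar case every change of `plane` between two writes means another port write, which is the overhead the linear framebuffer path gets rid of.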

 

I'm thinking of adapting those routines for the potato detail, since it works pretty much like a linear buffer.


It's kinda easy with the backbuffered modes, but the whole game and its assets are designed for a 320x200 resolution, and that won't fit properly in a 256x256 resolution. The main question is, why support Mode Q? Does it have any advantage over the other video modes?


The main advantage of mode Q is how easy it is to compute screen coordinates. You use a 16bit int and put the Y coord in the upper 8 bits and X in the lower 8 bits. No multiplication needed. Not sure how useful that is for Doom.
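A minimal sketch of that trick in C, next to the equivalent calculation for a 320-pixel-wide linear screen. The function names are hypothetical, and the 320x200 version only serves as a comparison point.

```c
#include <stdint.h>

/* Mode Q (256x256): Y goes in the upper 8 bits, X in the lower 8 bits.
   The offset is just the two coordinates glued together. */
static uint16_t mode_q_offset(uint8_t x, uint8_t y)
{
    return (uint16_t)(((unsigned)y << 8) | x);
}

/* 320-wide linear layout: needs a multiply (or a pair of shifts and an add). */
static uint16_t linear_320_offset(unsigned x, unsigned y)
{
    return (uint16_t)(y * 320u + x);
}
```

For example, pixel (x = 0x34, y = 0x12) sits at offset 0x1234 in Mode Q. On a 16-bit x86 register pair this amounts to loading Y into the high byte and X into the low byte.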


If you were to use it, I would ignore aspect ratio problems and render the status bar without the face graphics and cut off the rest of the pixels on each side. Make it so that the status bar cannot be removed even at max viewing size and render just a bit more screen height than you do at the normal no status bar view.

The status bar is like 32 pixels tall, so that's 224 pixels of world view area, just 24 pixels more than fullscreen. The mug shot is 32 pixels wide or so, leaving you with having to cut off 16 pixels at each side. You could probably render the armor numbers slightly more to the center (2-3 pixels) without ruining too many custom status bars.

On an old CRT or DOSBox you can stretch and tweak the aspect ratio to make it look fairly good. 1280x1536 (5:6 ratio) is easily doable on many widescreen monitors, especially if you can rotate the monitor to portrait. Exactly why one would do this, I don't know, but it would look 'correct'!


I've been analyzing Mode Q, and it's too much work for the benefit it could provide. It could have faster address calculations for the column render functions and visplane render functions, but the speed-up won't be that much, as the address is only calculated once per function call.
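To illustrate why the gain is small, here is a simplified, hypothetical column drawer in C (not the actual FastDoom routine). The start address is computed once at the top, and the per-pixel loop only adds the row pitch to a pointer, so a cheaper address formula saves a handful of cycles per column rather than per pixel.

```c
#include <stdint.h>
#include <stddef.h>

/* Fills rows y_top..y_bottom (inclusive) of column x with one colour.
   Assumes y_bottom >= y_top and that the column fits inside the buffer. */
static void draw_column(uint8_t *screen, size_t pitch,
                        unsigned x, unsigned y_top, unsigned y_bottom,
                        uint8_t color)
{
    /* Address calculation: done once per call. */
    uint8_t *dest = screen + (size_t)y_top * pitch + x;
    unsigned count = y_bottom - y_top + 1;

    /* Inner loop: one store and one pointer add per pixel. */
    while (count--) {
        *dest = color;
        dest += pitch;
    }
}
```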

 

For now I prefer spending time optimizing the sound code, and adding support for new sound cards. I've finished Adlib OPL2 sound support (yeah you heard it right), the quality is nowhere near as good as a regular Sound Blaster, but at least it's something.

1 hour ago, viti95 said:

I've been analyzing Mode Q, and it's too much work for the benefit it could provide. It could have faster address calculations for the column render functions and visplane render functions, but the speed-up won't be that much, as the address is only calculated once per function call.

 

For now I prefer spending time optimizing the sound code, and adding support for new sound cards. I've finished Adlib OPL2 sound support (yeah you heard it right), the quality is nowhere near as good as a regular Sound Blaster, but at least it's something.

The sound card on one of my machines has a dsp based adlib emulation that is less than perfect. I wonder what that one will make of this...

On 7/5/2022 at 11:23 AM, viti95 said:

I should try on an Ensoniq AudioPCI, OPL2 emulation on those cards is really awful:

Would be hilarious to hear what it makes of it.  Although I don't know if I would say the emulation is awful since it's not emulating an OPL2 chip but rather basically trying to convert AdLib music to MIDI.  In a couple games it produces decent results (I assume the ones they designed it for).  Not accurate of course, but I don't think that's what they were going for.

 

Honestly the sample you linked isn't particularly terrible.  Check out what it does to Super 3-D Noah's Ark: http://maniacsvault.net/loosefiles/AudioPCI/NoahWTF.opus  Guessing they didn't do much tuning for rhythm mode since most of the notes are completely missing.


New release! FastDoom 0.9

 

This is a major release since lots of fixes were done. Here is the changelog:

  • Fixed lots of crashing bugs. There may be a performance impact, but it's better than crashing the whole game. Huge thanks to @RamonUnch for discovering the issues, fixing them, and providing extensive testing in lots of maps and demos.

  • Added commandline parameter "-palette1" to enable the black-cyan-magenta-white palette on CGA 320x200 mode.

  • Added commandline parameter "-snow" to enable a fix for the snow bug on IBM CGA cards in the 160x100 16-color and 80x100 136-color modes.

  • Added basic compatibility levels support. Now demos won't desync. Again, huge thanks to @RamonUnch. And thanks to Decino for the marvelous video explaining the compatibility levels.

    • Added commandline parameter "-complevel X" to select any compatibility level

    • Supported compatibility levels are:

      • 2 -> Doom 1.9 (also Doom II)

      • 3 -> Ultimate Doom

      • 4 -> Final Doom

  • Added OPL2LPT and OPL3LPT support by Jordi Sesmero (@jsmolina)

  • Added new video mode FDOOME80.EXE => FastDoom EGA 80x200 16-color. This video mode is the 640x200 EGA mode with write mode 2 enabled, so we can write 8 pixels at the same time with a single 8-bit write (that's 32 bits written at once, while avoiding the chunky2planar process). Should perform well on 8-bit ISA EGA cards.

  • Updated FastDoom Setup program with the new music devices. Also small visual fixes.

  • As always small optimizations here and there.
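
As a rough illustration of the EGA write mode 2 idea behind FDOOME80.EXE, here is a small software model in C. It is a simplified simulation and not FastDoom code: `ega_cell_t`, `ega_write_mode2` and `ega_pixel` are invented names, the bit mask is passed as a parameter instead of being programmed through the graphics controller, and masked-out bits are taken from the current memory contents rather than from the hardware latches.

```c
#include <stdint.h>

/* One byte address in EGA memory: four bit planes, 8 horizontal pixels. */
typedef struct {
    uint8_t plane[4];
} ega_cell_t;

/* Write mode 2, simplified: bit n of the CPU byte's low nibble is expanded
   to all 8 bits of plane n, then limited by the bit mask. */
static void ega_write_mode2(ega_cell_t *cell, uint8_t color, uint8_t bit_mask)
{
    int p;
    for (p = 0; p < 4; p++) {
        uint8_t expanded = ((color >> p) & 1u) ? 0xFFu : 0x00u;
        cell->plane[p] = (uint8_t)((expanded & bit_mask) |
                                   (cell->plane[p] & (uint8_t)~bit_mask));
    }
}

/* Reads back the 4-bit colour of pixel 0..7 (0 = leftmost) in the cell. */
static uint8_t ega_pixel(const ega_cell_t *cell, int pixel)
{
    uint8_t color = 0;
    int p;
    for (p = 0; p < 4; p++) {
        if (cell->plane[p] & (0x80u >> pixel))
            color |= (uint8_t)(1u << p);
    }
    return color;
}
```

With a bit mask of 0xFF, a single write of colour 0x0A sets all 8 pixels of the cell to colour 10, touching 4 x 8 = 32 bits of video memory. Since each byte covers 8 of the 640 horizontal pixels, the effective width is 80 "fat" pixels, which matches the 80x200 name.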

FastDoom 0.9


Time for a new release! FastDoom 0.9.1:

 

* Creative Music System (Game Blaster) music support by @Tronix286

* New EGA 160x200 (16 color) mode FDOOMEW1.EXE, fast on 8-bit ISA cards.
* New EGA 320x200 (14 color) mode FDOOME14.EXE, much faster on 8-bit ISA cards compared to FDOOMEGA.EXE
* Updated FastDoom Setup program with new music devices. Support for MDA / Hercules video cards using parameter "-mono"
* New CGA ANSI from Hell mode, based on the same idea of Area5150. Base resolution of 320x100 and 16 colors. Not perfect but awesome for CGA cards.
* Add SBK soundfont loading support for Creative AWE32/AWE64 sound cards by @Tronix286. Use parameter "-sbk".
* Updated startup menu, now it only shows available WADs. Also experimental support for FreeDoom Phase 1+2.
* New command line parameter "-iwad" to load any IWAD directly
* Fixed Plutonia MAP12 crash by @Tronix286 and @RamonUnch
* Small bugfixes and optimizations

 

Download: https://github.com/viti95/FastDoom/releases/tag/0.9.1

 

FastDoom CGA ANSI from Hell:

 

FastDoom Creative Music System music:

 

Edited by viti95


FastDoom 0.9.2 is released!

 

Changelog:

- New video mode, IBM CGA 80x100 "512 color" mode with composite output (available for old and new IBM CGA cards)
- Optimized R_DrawColumnPotato and R_DrawSpanPotato for Potato mode in FDOOM.EXE, up to 22% more FPS
- Optimized R_DrawColumnLow for Low detail in FDOOM.EXE, up to 10% more FPS
- Optimized R_DrawColumn for High detail in FDOOM.EXE, up to 10% more FPS
- FDOOMEGA.EXE now uses the same technique as FDOOME14.EXE but with the full 16 colors, so FDOOME14.EXE has been removed as it is not needed anymore (and now FDOOMEGA.EXE is much faster)
- Huge video code refactor, much easier to create and maintain new video modes (no changes in terms of performance)
- Removed VGA 80x100 and EGA 80x86 text modes. Too slow on real hardware, as they required lots of VRAM reads, which are very slow
- Removed EGA 160x100 text mode. Other EGA video modes offer much better quality, and have better aspect ratio
- Removed 136 pseudo-color modes, as the CGA modes have been superseded by better ones, same for EGA/VGA
- Minor optimizations here and there

 

https://github.com/viti95/FastDoom/releases/tag/0.9.2

On 7/30/2020 at 5:59 AM, Redneckerz said:

It does include several optimizations beyond limiting visuals, resulting in a smaller executable. However, some of the optimizations are done in C instead of asm, sometimes intentional, sometimes because the author mentions not liking asm in general.

So it definitely does do some under-the-hood optimizations as well :)

Also crossposting from the Doom Pictures thread, Here is Back To Saturn X E1 running on FastDoom, 486 binary, -potato mode with -flatsurfaces and -flatsky on:

TUtWf4t.png

 

Totally playable, sir. Jokes aside, I find the idea behind this port very cool.


Could you add some really simple shading to the flat sky and flats to give them more depth and life? Lines 64 and 63 are this shade, 62 and 61 slightly darker, etc. A sky with more colors done in the same manner would be on par with many actual sky textures.
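One possible reading of that suggestion, sketched in C: every two lines share a shade, and each pair going up from line 64 is one step darker. The `ramp` array is a hypothetical list of palette indices ordered from the base shade to the darkest one; nothing here comes from the FastDoom source.

```c
#include <stdint.h>

/* Lines 64 and 63 -> step 0, lines 62 and 61 -> step 1, and so on.
   Expects line in the range 1..64. */
static int sky_shade_step(int line)
{
    return (64 - line) / 2;
}

/* Picks a palette index for a sky line, clamping to the darkest ramp entry. */
static uint8_t sky_color(const uint8_t *ramp, int ramp_len, int line)
{
    int step = sky_shade_step(line);
    if (step >= ramp_len)
        step = ramp_len - 1;
    return ramp[step];
}
```

The lookup is one shift and one table read per line, so it could be precomputed into a 64-entry table if even that is too much for the flat sky path.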


Time for a new release, not a big changelog but includes a feature I really love, AudioCD support.

 

FastDoom 0.9.3, changelog:

 

* AudioCD music support. Custom AudioCD mappings for (Ultimate) Doom, Doom II, TNT and Plutonia. (thanks @theelf for testing and the idea!)
* Fixed bugs on Hercules automap mode (thanks @darmok for testing)
* Small optimizations in rendering code

 

https://github.com/viti95/FastDoom/releases/download/0.9.3/FastDoom_0.9.3.zip

 


Another fairly cool addition would be to add support for MT32 into the port. You'd need to send a few system messages to reprogram the synth to be as close to the GM standard as you can. If you're really adventurous you could upload 1-2 instruments/samples to replace some of the ones that sound the least like the SC55 that Doom was made for.

If you also remap instruments Doom does not use, you could improve the pwad support. Not sure how much of the GM set that Doom uses that has an equivalent MT32 instrument, but the vast majority of instruments should work fine. MT32 has the same amount of instruments as the GM set, and they are more or less made by the same people, so a lot of overlap is to be expected. This could be tested by running FastDoom inside dosbox and mapping the midi to MUNT.
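
For reference, a sketch in C of how such a system message could be framed. It assumes the commonly documented Roland layout for the MT-32 (manufacturer ID 0x41, default device ID 0x10, model ID 0x16, command 0x12 for a data set, a 3-byte address, then data, checksum and 0xF7); please double-check these against the MT-32 manual or the MUNT source before relying on them. The function names are made up, and the addresses of the timbre and patch areas needed for an actual remap are not given here.

```c
#include <stdint.h>
#include <stddef.h>

/* Builds F0 41 10 16 12 <addr[3]> <data...> <checksum> F7 into out.
   The checksum makes address + data + checksum sum to 0 modulo 128.
   Returns the message length, or 0 if out is too small. */
static size_t mt32_sysex(uint8_t *out, size_t out_size,
                         const uint8_t addr[3],
                         const uint8_t *data, size_t data_len)
{
    size_t n = 0, i;
    unsigned sum = 0;

    if (out_size < data_len + 10)
        return 0;

    out[n++] = 0xF0;  /* start of SysEx */
    out[n++] = 0x41;  /* Roland (assumed, see note above) */
    out[n++] = 0x10;  /* device ID, assumed default */
    out[n++] = 0x16;  /* MT-32 model ID (assumed) */
    out[n++] = 0x12;  /* data set command (assumed) */

    for (i = 0; i < 3; i++) {
        out[n++] = addr[i];
        sum += addr[i];
    }
    for (i = 0; i < data_len; i++) {
        out[n++] = data[i];
        sum += data[i];
    }

    out[n++] = (uint8_t)((128u - (sum & 0x7Fu)) & 0x7Fu);
    out[n++] = 0xF7;  /* end of SysEx */
    return n;
}
```

A quick way to test this in DOSBox with MUNT would be to send a short text to the display area, which is usually documented at address 20 00 00 (again an assumption to verify), before attempting any instrument changes.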

I couldn't find a list of instrument mappings, but if you seriously want to add this, I could probably make one. The messages you need to send to remap can probably be found in documentation or by looking at the MUNT source code. For an excellent example of configuring and using the MT32, try the soundtrack to Dune 2. It adds several synth lead instruments and those are used throughout most of the soundtrack giving it a distinct Dune 2 feel while still using a ton of high quality sounds from the rest of the instrument set. Here's a link to a decent example of the soundtrack and custom instruments:

 


Yeah yeah, the ideas guy here again. Maybe this has been mentioned here already, but I'm just wondering if the game could be sped up even further by reducing the size of all textures, patches and sprites to a half or a quarter, reducing the palette size, etc. I'm definitely not a fan of Doom looking like that tbh, but it would be interesting to see how far it goes.

6 hours ago, Darkcrafter07 said:

Yeah yeah, the ideas guys here again, maybe it was told about it here, just wondering if the game could be sped up even further by reducing size of all textures, patches and sprites by half or four, reducing palette size etc. Definitely not a fan of such looking doom tbh but it's just being so interesting to see how far it goes.

You could perhaps speed scene rendering up by cutting down on the textures used in a map if the textures are essentially the same in the color space used. Having fewer textures to load could mean they are more often found in cache...

6 hours ago, Darkcrafter07 said:

Yeah yeah, the ideas guys here again, maybe it was told about it here, just wondering if the game could be sped up even further by reducing size of all textures, patches and sprites by half or four, reducing palette size etc. Definitely not a fan of such looking doom tbh but it's just being so interesting to see how far it goes.

You could perhaps speed scene rendering up by cutting down on the textures used in a map if the textures are essentially the same in the color space used. Having fewer textures to load could mean they are more often found in cache...

I doubt it would matter much as systems with cache are probably more than fast enough for Doom. It would also make it easier to run bigger maps with less ram, which I think is one of the sub goals of this project.

What I would love to see would be higher-res textures on lower color. Use dithering to better approximate the source color. Especially on grey scale, this could be quite effective.
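
A minimal sketch of what that could look like for the simplest case, grey to 1-bit, using ordered dithering with a 2x2 Bayer matrix in C. This is only one possible technique and not something FastDoom is known to do; the names and the choice of a 2x2 matrix are assumptions for the example.

```c
#include <stdint.h>

/* 2x2 Bayer matrix, scaled to thresholds in the 0..255 range. */
static const uint8_t bayer2[2][2] = {
    {  32, 160 },
    { 224,  96 }
};

/* Returns 1 (white) or 0 (black) for a grey value at screen position x,y. */
static int dither_1bit(uint8_t grey, unsigned x, unsigned y)
{
    return grey > bayer2[y & 1][x & 1];
}
```

A mid grey of 128 lights two of the four pixels in each 2x2 cell, so at a distance it reads as 50% grey even though every pixel is pure black or white. The same idea extends to more output shades by dithering between the two nearest available ones.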

PS: The board software had a seizure and posted empty posts. At first it looked like duplicates, but they are empty now. Feel free to wipe them from the thread :)

Edited by zokum


I went googling for some MT-32 info and I found a promising page: https://www.midimusicadventures.com/queststudios/mt32-resource/utilities/ The link and description that really caught my eye was this one:
 

MT-32 to GM

A utility from Roland Corporation which remaps the MT-32 to General MIDI. Includes 64 new sounds, plus additional drum sets.

There might be copyright issues here, but it looks like this might be a really good instrument remapping.


I would have preferred to have made a PR on GitHub, but seeing as I'm currently completely locked out of my account I'll post it here instead: (also a ping for @viti95)

 

I've been bothered for a while with the fact that the palette WADs for the non-VGA modes have perma-lightamp because of the COLORMAP lumps in those WADs. I don't really see the reason for this as it's effectively a cheat and makes maps like Monster Condo trivial while stripping out the atmosphere. Of course, reducing from a 256-colour palette to <=16 colours without modifying the graphics definitely requires a bit of interpretation on how to present Doom while making a best effort to preserve the spirit of the IWAD maps.

 

The biggest problem I've seen so far is that both the Hercules and CGA B&W modes have RGB translation values based on the EGA palette WAD, which I don't believe is a good way of making calculations on how shades of luminosity are generated - it should instead be balanced relative to the vanilla PLAYPAL+COLORMAP as this allows greater compatibility with PWADs that include custom PLAYPAL+COLORMAP lumps. I've spent a bit of time on this, and so far I've come up with the following patches for i_herc.c and i_cga_bw.c:

--- i_cga_bw.c	2023-02-11 16:31:20.926982519 +1100
+++ i_cga_bw.c	2023-02-11 16:32:46.942040820 +1100
@@ -36,8 +36,8 @@
 
         sum = r + g + b;
 
-        lutcolors[i] = sum > 32 ? 0xFF : 0x00;
-        lutcolors[i + 1] = sum > 64 ? 0xFF : 0x00;
+        lutcolors[i] = sum > 19 ? 0xFF : 0x00;
+        lutcolors[i + 1] = sum > 59 ? 0xFF : 0x00;
     }
 }
--- i_herc.c	2023-02-11 16:31:20.927982508 +1100
+++ i_herc.c	2023-02-11 16:31:56.582592209 +1100
@@ -36,10 +36,10 @@
 
         sum = r + g + b;
 
-        lutcolors[i] = sum > 38 ? 0xFF : 0x00;
-        lutcolors[i + 1] = sum > 115 ? 0xFF : 0x00;
-        lutcolors[i + 2] = sum > 155 ? 0xFF : 0x00;
-        lutcolors[i + 3] = sum > 77 ? 0xFF : 0x00;
+        lutcolors[i] = sum > 19 ? 0xFF : 0x00;
+        lutcolors[i + 1] = sum > 59 ? 0xFF : 0x00;
+        lutcolors[i + 2] = sum > 79 ? 0xFF : 0x00;
+        lutcolors[i + 3] = sum > 39 ? 0xFF : 0x00;
     }
 }

While it will cause maps to be darker overall, I feel it strikes a decent balance between trying to keep the general atmosphere of the maps and making sure that things are somewhat distinguishable, given the limitations of only having 3 shades of luminosity with FDOOMBWC and 5 shades of luminosity with FDOOMHGC. The replacement MODEBW.WAD only requires a custom COLORMAP lump, and only for the invul colormap, as reducing the brightness of that by 40% and increasing the contrast by 10% makes it possible to actually see what you're doing instead of seeing an almost pure white. I can see a need for a custom COLORMAP with further manual tweaking to improve the output of FDOOMBWC, and also a custom PLAYPAL to reduce the intensity of item pickups a little bit (most likely by duplicating the third pickup palette to replace the fourth), but this is a fairly major first step towards that.
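
A rough way to see where the 3 and 5 shades come from: each `lutcolors` entry in the patches above is either fully on or fully off depending on one threshold, so the number of thresholds that `sum` exceeds acts as the luminosity level. The C sketch below just counts them, using the new threshold values from the diff. The helper names are invented, and how the entries are laid out on screen is not shown in the patch, so that part is left out.

```c
/* Counts how many of the given thresholds are exceeded by sum. */
static int count_exceeded(int sum, const int *thresholds, int n)
{
    int i, level = 0;
    for (i = 0; i < n; i++) {
        if (sum > thresholds[i])
            level++;
    }
    return level;
}

/* Hercules (FDOOMHGC): four thresholds -> levels 0..4, i.e. 5 shades. */
static int herc_level(int sum)
{
    static const int thresholds[4] = { 19, 59, 79, 39 };
    return count_exceeded(sum, thresholds, 4);
}

/* CGA B&W (FDOOMBWC): two thresholds -> levels 0..2, i.e. 3 shades. */
static int cga_bw_level(int sum)
{
    static const int thresholds[2] = { 19, 59 };
    return count_exceeded(sum, thresholds, 2);
}
```

With the old Hercules values (38, 115, 155, 77) a colour needed a much higher `r + g + b` sum to reach each level, so lowering the thresholds changes which palette entries end up at which level.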

 

I recorded the attract mode of all IWADs on my IBM old-style CGA card via composite output for FDOOMBWC, for reference on how it looks on that hardware:

 

I've also attached a pre-compiled FDOOMBWC and FDOOMHGC for anyone who wants to test.

FastDoom_bwc_hgc.zip

MODEBW.ZIP

 

Finally, it would be nice if the source code was 8.3 filename compatible so that I don't have to modify the build process when using OpenWatcom in DOS.

Edited by deathz0r


Also I'm double posting here, but I've also spent a couple hours each on improving MODE16 and MODECVBS - they're still heavily WIP and not ready for PR as MODECVBS is definitely too dark when using my CGA card, but should be an improvement towards proper lighting regardless.

modecvbs_mode16.zip

 

EDIT: I'm pretty sure MODECVBS isn't even required - I decided to test MODE16 with FDOOMCVB on my CGA card and it definitely looks much better than I expected.

Edited by deathz0r


Great work @deathz0r, B&W modes indeed look much better now (especially the invul is very useable now). If you cannot create a PR I'll add the changes.

 

On 2/11/2023 at 10:23 AM, deathz0r said:

Finally, it would be nice if the source code was 8.3 filename compatible so that I don't have to modify the build process when using OpenWatcom in DOS. 

 

When I did the refactor of the source code for the video modes I decided to use non-8.3 filenames, as I had stopped using OpenWatcom for DOS, since cross-compiling is much faster and less limited. I'll take a look to see if I can rename all the files to 8.3 format again.

 

BTW can you try fdoom512.exe (the "512 color" composite mode for IBM cards)?? I still have to repair the composite output of my CGA card, all testing I've done is emulation based


Thanks, I would appreciate it if you can add the changes!

 

I did some more work on MODE16 last night with further PLAYPAL tweaking and various COLORMAP improvements. I've added transitioning from dark shades of colour to dark gray (#555555) and then black. Curiously, the colour burst for #555555 through composite is noticeably brighter than dark red (#AA0000), so there is now a justification for having a separate MODE16 and MODECVBS by removing that specific transition from MODECVBS. Once I've done some final PLAYPAL tweaks and done a playthrough of the IWADs to make sure everything looks reasonable, I'll port the COLORMAP to only use 0-15 for MODETXT and post the WADs. MODE4 definitely needs its own PLAYPAL+COLORMAP for something more optimal, though the least I can do for now is the same invul colormap fix that I did for MODEBW.

 

17 hours ago, viti95 said:

BTW can you try fdoom512.exe (the "512 color" composite mode for IBM cards)?? I still have to repair the composite output of my CGA card, all testing I've done is emulation based

I can't get a direct capture because the Koryuu Transcoder doesn't pick up the colour burst from 80x25 text mode, so a "point camera at screen" video will have to do:

 

 

I don't see any issues with the colour at all! Framerate takes a fairly big nosedive because of the snow fix (20-25 FPS on a 233 MHz Pentium MMX w/66 MHz FSB*3.5). Reducing the screen size by two does make movement smooth, but any pickup/pain flashes will still cause noticeable stutter.

Edited by deathz0r


I'm really surprised the CGA "512 color" mode works so well, even though the resolution is too low (80x100). Yeah, the snow fix is a big bottleneck. If I eliminate the snow completely it goes down to 5-6 fps, and if I remove the fix, the framerate goes to 30+ fps but the screen becomes a complete mess, so I had to settle on a middle-ground solution. Also, changing the palette usually requires rewriting the whole 16Kb of video memory, which is very slow on IBM CGA cards (~500Kb/s CPU-to-VRAM speed). IBM CGA cards are pretty much at the limit of what they can do in this case.
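
As a back-of-the-envelope check in C, assuming the figures above mean 16 kilobytes of video memory and roughly 500 kilobytes per second of CPU-to-VRAM throughput (the unit is an assumption, and the helper name is made up):

```c
/* Time in milliseconds to rewrite size_kb kilobytes at rate_kb_per_s. */
static unsigned rewrite_time_ms(unsigned size_kb, unsigned rate_kb_per_s)
{
    return (size_kb * 1000u) / rate_kb_per_s;
}
```

`rewrite_time_ms(16, 500)` comes out at 32 ms, which is about the length of a whole frame at 30 fps. Under those assumptions, a single palette change costs roughly one frame of time, which fits with pickup and pain flashes causing visible stutter.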

 

 

 

 
