Redneckerz

FastDoom: DOS Vanilla Doom optimized for 386/486 processors


9 hours ago, Lila Feuer said:

You've no idea how happy this makes me.

-flatsurfaces -flatsky -flatshadow -potato -mono -lowsound, pair with adlib music playback and the default 3 sound channels for extra SIZZLE

unknown2.png

1530562955862.png

Still unmistakably Doom but also very crunchy looking. I dunno, it's both a disgusting look and a very appealing one. I like.


Very cool. Back in the 90s I had a 486sx laptop that only barely seemed to run Doom and this would have been a godsend back then.


Hi everyone!

 

Thanks for making this post @Redneckerz!! I didn't know this post existed, and wow, it makes me really happy! Yes, I'm FastDoom's main developer, and I'm glad to answer all your questions :D

 

@Graf Zahl is right about abusing VGA Mode X. Potato mode uses the same idea as the low-resolution mode in vanilla Doom, but renders 4 pixels while writing only one byte to video memory (that's why the final resolution is 80x200). This helps a lot on video cards with low bandwidth (older ISA VGA cards), as it halves the number of writes to the card compared with low detail. It's even possible to make it faster, as this is a quick'n'dirty implementation: it uses the same code as the low-resolution mode but only renders even columns; odd columns are omitted but still calculated. Rendering columns and visplanes is the major bottleneck in vanilla Doom.
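To make the "4 pixels per write" idea concrete, here is a minimal sketch that simulates unchained (Mode X style) video memory on the host instead of touching real VGA ports. Everything in it (FakeVga, vga_write, draw_column) is a made-up name for illustration, not FastDoom's actual code; the map_mask field only stands in for the VGA Sequencer Map Mask register, which decides which of the four planes receive a written byte.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PLANES        4
#define SCREEN_W      320
#define SCREEN_H      200
#define BYTES_PER_ROW (SCREEN_W / PLANES)   /* 80 addresses per scanline */

/* Simulated unchained VGA memory (hypothetical type):
   pixel x lives in plane (x & 3) at offset y * 80 + x / 4. */
typedef struct {
    uint8_t plane[PLANES][BYTES_PER_ROW * SCREEN_H];
    uint8_t map_mask;       /* stands in for the Sequencer Map Mask register */
    unsigned long writes;   /* number of simulated CPU writes to video memory */
} FakeVga;

static void vga_init(FakeVga *vga)
{
    memset(vga, 0, sizeof *vga);
}

/* One CPU write: the byte lands in every plane enabled by map_mask. */
static void vga_write(FakeVga *vga, int offset, uint8_t color)
{
    assert(offset >= 0 && offset < BYTES_PER_ROW * SCREEN_H);
    for (int p = 0; p < PLANES; p++)
        if (vga->map_mask & (1u << p))
            vga->plane[p][offset] = color;
    vga->writes++;
}

/* Draw a solid column segment. `step` is the column width in pixels:
   1 = high detail, 2 = low detail, 4 = potato. */
static void draw_column(FakeVga *vga, int col, int step,
                        int y0, int y1, uint8_t color)
{
    assert(step == 1 || step == 2 || step == 4);
    int x = col * step;
    /* Enable `step` adjacent planes starting at the plane of pixel x. */
    vga->map_mask = (uint8_t)(((1u << step) - 1u) << (x & 3));
    for (int y = y0; y <= y1; y++)
        vga_write(vga, y * BYTES_PER_ROW + (x >> 2), color);
}

/* Read back a single pixel from the simulated planes. */
static uint8_t vga_pixel(const FakeVga *vga, int x, int y)
{
    return vga->plane[x & 3][y * BYTES_PER_ROW + (x >> 2)];
}
```

With step 4 a full-height column costs 200 writes and covers 4 pixels per row, so a whole screen takes 80 * 200 = 16,000 writes, against 160 * 200 = 32,000 with step 2. That is the halving described above.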

 

Joystick support was removed because not many people play Doom with a joystick/gamepad, and polling the gameport is a slow process that slows the game down. Network support was also removed to make the game more responsive on slower CPUs. The less code the better, as 386/486 PCs had very low memory bandwidth, and having less code makes the caches behave better.

 

For the next version, gamma correction support will be back, as suggested by various members of Vogons and DoomWorld. Disabling it made the palette changes faster.

 

@Wagi I did some testing with flat visplanes and depth lighting, but it didn't look as good as it should.

@Redneckerz I didn't know GBADoom existed. That's an impressive port, and the idea of running Doom in 256 KB of RAM just blows my mind. Maybe I can get some ideas from that port.

@Optimus (FAN MODE ON) I love the progress you're making with OptiDoom. I was the one who gave you the idea of streaming your coding sessions :D. I'll be adding menus to enable/disable options in-game, as you've done in OptiDoom, instead of making those options command-line parameters.

 

There are lots of parts of the code that can be optimized even more, but I don't have as much time as I'd like to develop this port. The main goal is to make the game playable on 386SX processors, a nearly impossible mission :D.

 

2 hours ago, viti95 said:

>It's even possible to make it faster, as this is a quick'n'dirty implementation. It uses the same code as the low resolution mode but only renders even columns, odd columns are omitted but still calculated. Rendering columns and visplanes is the major bottleneck in vanilla Doom.

 

@Optimus (FAN MODE ON) I love the progress you're making with OptiDoom. I was the one who gave you the idea of streaming your coding sessions :D. I'll be adding menus to enable/disable options in-game, as you've done in OptiDoom, instead of making those options command-line parameters.

 

 

Rendering only half the columns while still having to loop through all columns in the calculations was my problem on the 3DO too. I could skip half the columns and double them horizontally with the hardware, but since another pass through the columns is necessary for the visplane calculation and sprite clipping, I couldn't skip half the columns at every point because it introduced visplane spills (even when I tried to interpolate things in between, I still got them). I ended up with a new solution using the scalers: render at half width/height (so everything, including visplanes, is really calculated in a tiny offscreen window), then scale it back up to fit. It does give a bit better speed than the old scalers in some cases, but it's unpredictable when (it depends on the slope of the sectors on screen). Funnily enough, I also left in the old code that physically renders half the columns doubled, so in combination with the 2x1 scaler, it's really emulating your -potato mode :)

 

Thanks, btw. It was a good idea you gave me to start the videos (I'd thought many times in the past about doing this, but always abandoned it because of how boring and slow I might sound) :)

 

 


@viti95 great to hear! I'll be addressing your comment in full tomorrow.

Spoiler

That said, I am putting the finishing touches on something something... ;) I am sure you will appreciate it!

Lastly, thank you for creating FastDoom. :)


I'm impressed and definitely see the utility, but at the same time, those flat visplanes being incredibly bright against dark walls in dark sectors are kind of an eyesore.

 

I do expect darkening the visplanes as they go into the distance would negate a lot of the speed gain - you'd probably still gain some speed, due to only having to poll one palette entry instead of calculating what pixel of the flat needs to be rendered and then polling its palette entry, but I imagine the removal of the darkening is where a lot of the speed gain actually lies here. Maybe it'd be a better compromise to render the entire visplane at a single color regardless of distance as it currently is, but change what color that is based on the light value of the sector it belongs to (based on what color it'd have been an arbitrary fixed distance from the camera, set to whatever looks most aesthetically-pleasing)? That said, if I'm wrong and reintroducing the visplanes getting darker into the distance isn't a huge performance-killer, it would certainly complete the SNES Doom look (since that did have that feature - with dithering, even, though I wouldn't expect that myself).
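The compromise suggested above (one color per visplane, picked from the sector's light value) could look roughly like the sketch below. It is only an illustration under assumptions: the function names are hypothetical, the light-to-row mapping is a simple linear one rather than the engine's own distance-based light tables, and colormaps is assumed to be laid out as 32 rows of 256 palette indices ordered from brightest to darkest.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_COLORMAPS 32    /* light levels assumed in the colormap table */
#define PALETTE_SIZE  256

/* Map a sector light level (0..255) to a colormap row:
   0 = brightest row, 31 = darkest row. Simplified linear mapping. */
static int colormap_row_for_light(int lightlevel)
{
    assert(lightlevel >= 0 && lightlevel <= 255);
    return ((255 - lightlevel) * NUM_COLORMAPS) / 256;
}

/* Pick the single palette index used to fill a whole flat visplane.
   base_color is the flat's representative color at full brightness. */
static uint8_t flat_color_for_sector(const uint8_t *colormaps,
                                     uint8_t base_color, int lightlevel)
{
    int row = colormap_row_for_light(lightlevel);
    return colormaps[row * PALETTE_SIZE + base_color];
}
```

The lookup happens once per visplane rather than once per pixel or per span, so the cost of the flat-color mode should stay close to what it is now.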

15 hours ago, fraggle said:

Very cool. Back in the 90s I had a 486sx laptop that only barely seemed to run Doom and this would have been a godsend back then.

 

That's precisely the point where I have my doubts: maps that were unplayable/unbearable with the rendering on, even at the smallest window size, would often be super-slow even with just the automap on, so the bottleneck wasn't always to be found in the renderer.

 

In fact, to be perfectly frank, I don't think most IWAD and PWAD maps of the era had the kind of detail or on-screen action that would tax the renderer so much that it would dominate CPU time: most maps weren't too far off E1M1/MAP01 in terms of detail, and that one handled just fine (well, it'd better do, otherwise why would anyone continue on with the rest of the game). What really killed performance however was using a large number of textures, flats, a large portion of the bestiary and swapping this stuff in/out of RAM. I still remember how a visually pretty average map (Castle Of The Renegades II) was unplayable on 4 MB (and you really felt that from the very first area, looking at maybe 2-3 different textures), and perfectly playable (at least as much as any IWAD map) on 8 MB.


I just tried it yesterday on my pumped-up 386DX 40 MHz with 8 MB and a Tseng Labs ET4000 (fastest ISA card, IIRC). BIOS settings are tweaked for best output (filling 320*200*8bpp with rep stosd gets 85 fps).
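As a rough sanity check on that fill test, the raw write bandwidth can be worked out from the numbers in the post. The helper name is made up for illustration, and the calculation assumes each of the 320*200 bytes is written exactly once per frame.

```c
#include <assert.h>

/* Bytes per second needed to fill a width x height, 8bpp screen
   `fps` times per second (one byte per pixel). */
static long fill_bytes_per_second(int width, int height, int fps)
{
    assert(width > 0 && height > 0 && fps >= 0);
    return (long)width * height * fps;
}
```

fill_bytes_per_second(320, 200, 85) comes to 5,440,000, so about 5.4 MB/s is the ceiling for plain screen fills on this setup, before the game has rendered anything.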

 

The good thing is there is finally a frame counter (and not just timedemo), so I can check at the start of E1M1, or move to specific player views that I know are slow, and compare on the spot how the framerate changes with high or low res or different window sizes. Out of curiosity I compared how this PC fares against 3DO Doom. First, I don't have a similar FPS counter on original Doom, but it feels like high res is obviously slow, while with low res and the window one or two sizes down it becomes pretty playable (and feels smoother than the same settings in original Doom), at least in the first few levels (don't try the start of E4M2, haha).

 

It's also interesting to compare with the 3DO. I noticed that certain player views that I know are killed by more visplanes are somewhat better (maybe 10-30%) on this 386 than on the 3DO. Take the start of E1M1: I think I got 9 fps on the 3DO at the biggest size (which is 280*160, smaller than the big size on PC with the health bar on), and I get the same on FastDoom in high res, fullscreen with the bar. Lowering it one or two sizes to match the 3DO max size, I think I get 10-11 fps. This gets much better (maybe 16 fps? I don't remember) with F5 half pixel (and with potato plus flatsurfaces it goes over 20 in this view). In other views that also suffer from too many visplanes rather than from rendering, I notice the 386 doing slightly better than the 3DO. Interestingly, if you stick the view to a wall, in high res you get something like 10 fps IIRC, but with F5 you get quite a bit more. On the 3DO, if you stuck your face in a wall you'd get 30-40 fps, I think. Roughly; I don't remember exact numbers, but it would show. So it's true that Doom and FastDoom on a 386 suffer a lot from the rendering. It's software rendered anyway, unlike the 3DO, which at least for walls uses the hardware CEL engine to render, scale and shade individual columns, and only does software rendering for the floor/ceiling (and even then it soft-renders into the texture buffer of horizontal CEL spans, so the shading is again done in hardware). Here the 3DO wins because of its CEL rendering hardware. But interestingly, this 386 comes out a bit better than the 3DO CPU (it's an ARM but only 12.5 MHz; I wish it were double that).


Since you got it up and running, try running a timedemo of a well-known map that puts Doom through its paces and offers a variety of situations, not just a fixed view or facing a wall but actual, ever-changing action, and see if the average elapsed realtics do decrease. That's the only objective metric of how effective FastDoom is overall.
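For anyone turning timedemo output into a framerate: vanilla Doom reports "timed N gametics in M realtics", and a tic is 1/35 of a second, so the average fps follows directly. A minimal sketch (the function name is made up for illustration):

```c
#include <assert.h>

#define TICRATE 35   /* Doom game tics per second */

/* Average frames per second of a timedemo run: one frame is rendered
   per gametic, and the run took realtics / 35 seconds of wall time. */
static double timedemo_fps(int gametics, int realtics)
{
    assert(gametics >= 0 && realtics > 0);
    return (double)gametics * TICRATE / realtics;
}
```

For example, 1000 gametics in 3500 realtics works out to 10 fps. Fewer realtics for the same demo means a faster engine.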

 

Of course there are different kinds of pressures a Doom map may present to the engine (e.g. too many textures, high RAM usage => cache/swap killer, too complex architecture -> BSP recursion Hell, too many sprites on screen, too many active monsters on the map etc.) so it would be interesting to produce timedemos of maps that have a predominant type of pressure, and see how effective FastDoom is at each one.

 

Because, in the end, what I would have liked back in 1994 was indeed a faster, more optimized Doom, that would run everything better on the same hardware. Somehow, as optimized as Doom was compared to other FPS, I was never convinced it was the "best of the possible worlds" in that aspect.

Edited by Maes

16 hours ago, viti95 said:

Hi everyone!

 

Thanks for making this post @Redneckerz!! I didn't know this post existed, and wow, it makes me really happy! Yes, I'm FastDoom's main developer, and I'm glad to answer all your questions :D

I am glad that you are here, either way! FastDoom is a very interesting project and I am also glad to hear that you are planning to convert the parameters to in-game menu toggles.

 

Regarding flat visplanes and depth lighting (to achieve that more authentic SNES look), as several users including Shadow Hog have been suggesting, I'd love to see it return in some way, shape or form.

 

58 minutes ago, Optimus said:

I just tried it yesterday on my pumped-up 386DX 40 MHz with 8 MB and a Tseng Labs ET4000 (fastest ISA card, IIRC). BIOS settings are tweaked for best output (filling 320*200*8bpp with rep stosd gets 85 fps).

 

The good thing is there is finally a frame counter (and not just timedemo), so I can check at the start of E1M1, or move to specific player views that I know are slow, and compare on the spot how the framerate changes with high or low res or different window sizes. Out of curiosity I compared how this PC fares against 3DO Doom. First, I don't have a similar FPS counter on original Doom, but it feels like high res is obviously slow, while with low res and the window one or two sizes down it becomes pretty playable (and feels smoother than the same settings in original Doom), at least in the first few levels (don't try the start of E4M2, haha).

It should be noted that DeHackEd support is in FastDoom, but only partially. Back To Saturn X runs properly, but @MrFlibble attempted to run REKKR and didn't get a good result. It has to do with the different offsets and the trimmed-down FastDoom executable. Certain DeHackEd patches may funk out due to the differences, so it might be interesting to test different vanilla-heavy WADs or TCs and see if they work correctly.

Something like Batman Doom (with the Vanilla fix) and its unusual behavior might be a good test for that.

58 minutes ago, Optimus said:

It's software rendered anyway, unlike the 3DO, which at least for walls uses the hardware CEL engine to render, scale and shade individual columns, and only does software rendering for the floor/ceiling (and even then it soft-renders into the texture buffer of horizontal CEL spans, so the shading is again done in hardware). Here the 3DO wins because of its CEL rendering hardware. But interestingly, this 386 comes out a bit better than the 3DO CPU (it's an ARM but only 12.5 MHz; I wish it were double that).

Given the final quality of the 3DO port, it's all the more impressive that Rebecca Heineman was able to move certain rendering over to hardware. Imagine the port without these functions :P


I wonder if there would be any benefit to reordering the NODES of a map for a 386? Linguica's comments suggest there won't be any noticeable improvement at all, but his low-end CPU test was still a Pentium M.


I don't think Doom II was playable on a 386; try MAP15, for instance. I've tried vanilla 1.9 again with 86Box and a 386DX and it's a slideshow.

I could upload a virtual machine with DOS and drivers installed, but is MS-DOS 5.0 still considered warez?

Then you may inject some games using some software to manipulate disk images. OSFMount was free but did it work?

Edit: I've seen that my installed DOS is in Spanish.

Edited by drfrag

1 hour ago, drfrag said:

I don't think Doom II was playable on a 386; try MAP15, for instance. I've tried vanilla 1.9 again with 86Box and a 386DX and it's a slideshow.

I could upload a virtual machine with DOS and drivers installed, but is MS-DOS 5.0 still considered warez?

Then you may inject some games using some software to manipulate disk images. OSFMount was free but did it work?

Edit: I've seen that my installed DOS is in Spanish.

The unofficial MBF 2.04 update by Gerwin also had a test build for 386 processors, called MBF386. Download can be found here.

 

I haven't tested it however, and according to Gerwin, it really was just an experiment.

40 minutes ago, Lila Feuer said:

Is -devparm not working? I tried to hit F1 to take screenshots but it only brought up the help screen.

I believe so. I tried yesterday to take super secret screenshots but F1 did zilch. It was apparently removed, I suppose?

1 hour ago, drfrag said:

I don't think Doom II was playable on a 386; try MAP15, for instance. I've tried vanilla 1.9 again with 86Box and a 386DX and it's a slideshow.

 

To be fair, 86Box is hardly the best emulator for games and in general all those VMs/HyperVisors suck for real-time stuff like games, despite having -in theory- less CPU overhead than a full-on emulator like DOSBox. They simply don't get the timings right, so most DOS games just won't run properly and you won't get an objective metric using them.


86box and PCem I believe can run Impulse Tracker better than DOSBox though, and both feature better SB16 emulation than the latter.


Audio playback may be a special case, as long as the emulated program can fill an audio buffer to the host machine fast enough and the output sample rate has been negotiated correctly between emulated program/emulation layer. But I wouldn't expect even video playback to be smooth or consistent in speed. Even with music players, skipping ahead/changing tracks may not feel smooth.


For what it's worth, as far as I know PCem is a full-on emulator, and one aiming for accuracy moreso than DOSBox does (down to needing BIOS ROMs and having to partition virtual hard drives to install the OS), but at the same time I wouldn't necessarily call it accurate enough to produce useful results for testing the real-world applicability of this source port. (It would be pretty close, though.)


I wonder if it is possible to add support for the 387 co-processor, since it's dedicated to doing floating-point math, it could help with some extra performance.

 

Just speculating.

3 minutes ago, AlektorophobiA said:

I wonder if it is possible to add support for the 387 co-processor, since it's dedicated to doing floating-point math, it could help with some extra performance.

 

Just speculating.

Doom uses fixed-point math, not floating-point.

9 hours ago, Maes said:

Since you got it up and running, try running a timedemo of a well-known map that puts Doom through its paces and offers a variety of situations, not just a fixed view or facing a wall but actual, ever-changing action, and see if the average elapsed realtics do decrease. That's the only objective metric of how effective FastDoom is overall.

 

Yes, sorry about the incomplete tests (yesterday I was curious to compare it with the 3DO, which doesn't have a timedemo option yet, so I ended up comparing views that I know are pretty slow on the 3DO).

 

But today I ran full timedemo tests on both the original Doom executable and FastDoom.

FastDoom, even in high quality without the new low rendering options, always scores a bit higher, which is a good improvement for a start (I don't know if it was the switch from the DOS/4GW to the DOS32 extender (seems strange to me), or simply the fact that it's recompiled with a more modern optimizing compiler or 386-specific options).

 

Just to mention the specs: 386DX 40 MHz, 8 MB RAM, ISA graphics card Tseng Labs ET4000. Screen size at max with the health bar on.

 

I tried the four demos in Ultimate Doom.

HQ=High quality

LQ=Low quality (F5)

PQ=Potato quality

UQ=Potato + flatsurfaces + flatskies + flatshadows + low sound + mono

 

Demo 1: E1M5

Doom: HQ (6.9fps) LQ (11fps)

FastDoom: HQ (7.8fps) LQ (12.9fps) PQ (15.4fps) UQ (19.0fps)

 

Demo 2: E2M2

Doom: HQ (7.4fps) LQ (12fps)

FastDoom: HQ (8.3fps) LQ (14.1fps) PQ (17.7fps) UQ (21.4fps)

 

Demo 3: E3M5

Doom: HQ (7.7fps) LQ (12.8fps)

FastDoom: HQ (8.7fps) LQ (15.2fps) PQ (19.6fps) UQ (23.5fps)

 

Demo 4: E4M2

Doom: HQ (5fps) LQ (7.7fps)

FastDoom: HQ (5.6fps) LQ (9fps) PQ (10.8fps) UQ (11.8fps)

 

Definitely a little better. I love this port; it plays quite a bit better, and especially smoother, in low quality fullscreen (with the bar) on my 386. I hope to see more in the future.
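To put the numbers above in relative terms, the gain of one reading over another can be computed as a percentage. This is just arithmetic on the figures already posted; the helper name is made up for illustration.

```c
#include <assert.h>

/* Relative gain of `after` over `before`, in percent. */
static double gain_percent(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

For Demo 1, gain_percent(6.9, 7.8) is roughly 13% in HQ and gain_percent(11.0, 12.9) is roughly 17% in LQ. For Demo 4, gain_percent(5.0, 5.6) is about 12% in HQ.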

Edited by Optimus

3 hours ago, Cacodemon345 said:

Doom uses fixed-point math, not floating-point.

Indeed. Some of the code that has been hauled over from Russian Doom (The FPS viewer) has been rewritten in fixed point specifically for FastDoom.

 

2 hours ago, Optimus said:

 

Yes, sorry about the incomplete tests (yesterday I was curious to compare it with the 3DO, which doesn't have a timedemo option yet, so I ended up comparing views that I know are pretty slow on the 3DO).

 

But today I ran full timedemo tests on both the original Doom executable and FastDoom.

Impressive readings, really. They bring home the point that FastDoom offers a rather noticeable performance benefit.

 

PS: It has been done. FastDoom is now on DoomWiki.

13 hours ago, plums said:

I wonder if there would be any benefit to reordering the NODES of a map for a 386? Linguica's comments suggest there won't be any noticeable improvement at all, but his low-end CPU test was still a Pentium M.

 

This is super interesting! The main performance difference between 386 and 486 processors comes from the L1 cache, so better use of that cache should make the game perform much better. I'll try to recreate the script that reorders the nodes and see what happens.
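One plausible way to reorder nodes for locality is to copy them in depth-first order, so that a node and its first child end up next to each other in memory. The sketch below is an assumption about what such a script might do, not a reconstruction of the one mentioned above: the Node struct is stripped down to the two child links, copy_depth_first is a made-up name, and the root ends up at index 0, whereas vanilla Doom expects the root to be the last node, so a real tool would have to account for that.

```c
#include <assert.h>
#include <stdint.h>

#define NF_SUBSECTOR 0x8000   /* child high bit set = subsector leaf */

/* Stripped-down BSP node: partition line and bounding boxes omitted. */
typedef struct {
    uint16_t children[2];
} Node;

/* Copy the subtree rooted at in[idx] into `out` in depth-first
   (pre-order) layout, remapping child links to the new indices.
   *next is the next free slot in `out`. Returns the new index. */
static uint16_t copy_depth_first(const Node *in, Node *out,
                                 uint16_t idx, uint16_t *next)
{
    uint16_t self = (*next)++;
    for (int side = 0; side < 2; side++) {
        uint16_t child = in[idx].children[side];
        if (child & NF_SUBSECTOR)
            out[self].children[side] = child;   /* leaves keep their number */
        else
            out[self].children[side] = copy_depth_first(in, out, child, next);
    }
    return self;
}
```

Calling it with the original root index and a counter starting at 0 fills `out` with the same tree in the new order. Whether that helps at all on a 386 without any cache is exactly what would need measuring.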

 

14 hours ago, Maes said:

Since you got it up and running, try running a timedemo of a well-known map that puts Doom through its paces and offers a variety of situations, not just a fixed view or facing a wall but actual, ever-changing action, and see if the average elapsed realtics do decrease. That's the only objective metric of how effective FastDoom is overall.

 

Of course there are different kinds of pressures a Doom map may present to the engine (e.g. too many textures, high RAM usage => cache/swap killer, too complex architecture -> BSP recursion Hell, too many sprites on screen, too many active monsters on the map etc.) so it would be interesting to produce timedemos of maps that have a predominant type of pressure, and see how effective FastDoom is at each one.

 

Because, in the end, what I would have liked back in 1994 was indeed a faster, more optimized Doom, that would run everything better on the same hardware. Somehow, as optimized as Doom was compared to other FPS, I was never convinced it was the "best of the possible worlds" in that aspect.

 

You're right about the main pressures on the Doom engine. I did some profiling (enabled with "/et" in the CCOPTS) and found that BSP processing and column/visplane rendering are the major bottlenecks in the game. The BSP tree is also used for monster sight checks, which is why the game runs slower when there are lots of enemies. Any optimization of those makes the game run much faster.

 

fastdoom_profiling.png

 

14 hours ago, Optimus said:

I just tried it yesterday on my pumped-up 386DX 40 MHz with 8 MB and a Tseng Labs ET4000 (fastest ISA card, IIRC). BIOS settings are tweaked for best output (filling 320*200*8bpp with rep stosd gets 85 fps).

 

The good thing is there is finally a frame counter (and not just timedemo), so I can check at the start of E1M1, or move to specific player views that I know are slow, and compare on the spot how the framerate changes with high or low res or different window sizes. Out of curiosity I compared how this PC fares against 3DO Doom. First, I don't have a similar FPS counter on original Doom, but it feels like high res is obviously slow, while with low res and the window one or two sizes down it becomes pretty playable (and feels smoother than the same settings in original Doom), at least in the first few levels (don't try the start of E4M2, haha).

 

It's also interesting to compare with the 3DO. I noticed that certain player views that I know are killed by more visplanes are somewhat better (maybe 10-30%) on this 386 than on the 3DO. Take the start of E1M1: I think I got 9 fps on the 3DO at the biggest size (which is 280*160, smaller than the big size on PC with the health bar on), and I get the same on FastDoom in high res, fullscreen with the bar. Lowering it one or two sizes to match the 3DO max size, I think I get 10-11 fps. This gets much better (maybe 16 fps? I don't remember) with F5 half pixel (and with potato plus flatsurfaces it goes over 20 in this view). In other views that also suffer from too many visplanes rather than from rendering, I notice the 386 doing slightly better than the 3DO. Interestingly, if you stick the view to a wall, in high res you get something like 10 fps IIRC, but with F5 you get quite a bit more. On the 3DO, if you stuck your face in a wall you'd get 30-40 fps, I think. Roughly; I don't remember exact numbers, but it would show. So it's true that Doom and FastDoom on a 386 suffer a lot from the rendering. It's software rendered anyway, unlike the 3DO, which at least for walls uses the hardware CEL engine to render, scale and shade individual columns, and only does software rendering for the floor/ceiling (and even then it soft-renders into the texture buffer of horizontal CEL spans, so the shading is again done in hardware). Here the 3DO wins because of its CEL rendering hardware. But interestingly, this 386 comes out a bit better than the 3DO CPU (it's an ARM but only 12.5 MHz; I wish it were double that).

 

The main problem with 386 CPUs is the lack of L1 cache. They're also very limited by the ISA bus, which is why even low-frequency 486 CPUs with VESA Local Bus video cards are faster than any 386 at 40 MHz. Does your 386 have any external cache? That should make the game run faster. That aside, the cool thing about the 3DO is that the CEL engine helps the CPU a lot; the ARM60 processor at 12.5 MHz, without any cache and with low RAM bandwidth, shouldn't be able to soft-render the game at more than 5 fps. Another question: would it be possible to render the walls and floors/ceilings with the CEL engine as normal quads?

 

11 hours ago, Lila Feuer said:

Is -devparm not working? I tried to hit F1 to take screenshots but it only brought up the help screen.

 

I removed all -devparm functionality, forgot to mention it ^^

 

6 hours ago, AlektorophobiA said:

I wonder if it is possible to add support for the 387 co-processor, since it's dedicated to doing floating-point math, it could help with some extra performance.

 

Just speculating.

 

The Doom engine uses fixed-point math, which doesn't require an FPU and is faster than using 387/487 floating-point instructions. Check the Game Engine Black Book: DOOM, section 2.1.7; it explains this point really well.
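For readers who haven't seen it, Doom's fixed-point format is 16.16: a 32-bit integer whose upper 16 bits hold the integer part and lower 16 bits the fraction. The sketch below shows the idea using a 64-bit intermediate. The lowercase function names are made up here (the engine's own routines are FixedMul and FixedDiv), and the division omits the overflow clamp the original has.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_t;

#define FRACBITS 16
#define FRACUNIT (1 << FRACBITS)   /* 1.0 in 16.16 fixed point */

/* Multiply two 16.16 values: the 64-bit product has 32 fractional
   bits, so shift 16 of them away. (Right-shifting a negative value
   is implementation-defined in C; it is arithmetic on common
   compilers.) */
static fixed_t fixed_mul(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * b) >> FRACBITS);
}

/* Divide two 16.16 values: pre-scale the dividend by 2^16 so the
   quotient keeps 16 fractional bits. No overflow clamping here. */
static fixed_t fixed_div(fixed_t a, fixed_t b)
{
    assert(b != 0);
    return (fixed_t)(((int64_t)a * FRACUNIT) / b);
}
```

For example, fixed_mul(3 * FRACUNIT / 2, 2 * FRACUNIT) gives 3 * FRACUNIT (1.5 * 2 = 3), all with plain integer instructions.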

 

7 hours ago, Shadow Hog said:

For what it's worth, as far as I know PCem is a full-on emulator, and one aiming for accuracy moreso than DOSBox does (down to needing BIOS ROMs and having to partition virtual hard drives to install the OS), but at the same time I wouldn't necessarily call it accurate enough to produce useful results for testing the real-world applicability of this source port. (It would be pretty close, though.)

 

I use DOSBox to compile the game, as it's the fastest solution, and to check that the game runs OK with new modifications. The main problem with DOSBox is that the sound emulation is really bad: many things that I tested made the sound stop working on real hardware, but not in DOSBox. PCem behaves a lot better than DOSBox but, as you say, you can't trust benchmark results run in PCem.

11 hours ago, viti95 said:

That aside, the cool thing about the 3DO is that the CEL engine helps the CPU a lot; the ARM60 processor at 12.5 MHz, without any cache and with low RAM bandwidth, shouldn't be able to soft-render the game at more than 5 fps. Another question: would it be possible to render the walls and floors/ceilings with the CEL engine as normal quads?

 

Yes, I think Rebecca said in the past that when she first ported Doom with the original full software renderer, maybe 5 fps or even a bit less was the best she could get.

 

I already have an option to render walls as polygons (you can even see a bit of the lack of perspective correction if you get close, e.g. on the doors). The thing is, I didn't gain speed with that except in special cases, as I realized I had to subdivide the walls a lot (and a lot of the wall column loops remain, for the second pass needed for the visplanes). A typical Startan texture, for example, is 128*128 on PC but 32*128 on the 3DO. Initially I also thought of a CEL trick to avoid having to subdivide vertically (if a wall is much taller than 128), something Rebecca also avoids when rendering a column: a wall column taller than 128 tricks the CEL into thinking the texture height is way over 128 (really the width, as the CEL vectors, like the textures and sprites, are rotated 90 degrees from what you would physically expect, so it's a horizontal linear bitmap rendered vertically; I know this is done on PC too, because of the cache). The texture read then bleeds over into the neighboring texture column, but you won't see it most of the time. It looks as if the entire wall texture repeats vertically (something the CEL doesn't really support; it has neither texture coordinates nor tiling), but it only works on texture columns (with any width, but height 1). I tried to replicate the trick on 2D textures for my quad walls; it worked like a charm on emulators, but gave me black textures and 1 fps (the CEL engine locking up and then cancelling the render job for a second) on the real 3DO. So now I suddenly have to go back and also subdivide vertically (so a wall that is 320*384, for example, with the Startan texture, will be broken into 10*3 quads). Far-away wall quads in outside areas are not many columns in size, so I had a heuristic check: if the projected wall segments are too small, switch back to column rendering. A mess, and it only gains me 1-2 fps when looking closely at a staircase, for example.
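The 10*3 figure for the 320*384 wall follows from dividing the wall by the 32*128 texture size in each direction and rounding up. A minimal sketch of that count (function names are made up for illustration; this is not the 3DO port's code):

```c
#include <assert.h>

/* How many tiles of size `tile` are needed to cover `extent`,
   rounding up so a partial tile still counts. */
static int tiles_needed(int extent, int tile)
{
    assert(extent > 0 && tile > 0);
    return (extent + tile - 1) / tile;
}

/* Number of quads a wall breaks into when each quad can show
   at most one tex_w x tex_h texture. */
static int wall_quads(int wall_w, int wall_h, int tex_w, int tex_h)
{
    return tiles_needed(wall_w, tex_w) * tiles_needed(wall_h, tex_h);
}
```

wall_quads(320, 384, 32, 128) is 10 * 3 = 30 quads for a single wall, which shows why the subdivision eats most of the expected gain.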

 

Rendering the floors/ceilings as quad polygons is something I am considering for a future version (0.3; right now I'm working on 0.2b). It will take more work and it has problems, but for one thing it would allow me to get rid of visplanes. The idea is that during level loading I'd store additional information per sector, subdividing it into quads and triangles (degenerate quads). Later, when I have to render a sector's floor/ceiling, I'd transform the geometry and do CEL rendering of the full quads. I don't even know if it will work, if there will be gaps, or if it will give me any improvement, but the idea is that if I do this I may be able to get rid of the visplanes, which eat a lot of time. It's not only the visplane calculation from the walls, but also a later function that converts the visplane info, which is vertical edge data, into horizontal edge data by climbing the slopes and doing comparisons. Anyway, the problem on the 3DO is that there are no texture coordinates for polygons. Maybe pure flat untextured ones will work OK. Fun fact: reading an article by Fabien (https://fabiensanglard.net/doom_psx/index.html), I realized that the PS1 version may have gotten rid of visplanes and done something similar, transforming/projecting the sector triangles, BUT it ends up splitting them again into horizontal spans (needed for the shading and perspective through Z, which doesn't change along a horizontal line) using thin triangles. But they can have texture coordinates on those horizontal spans, something the 3DO again lacks. With the floor, the way it rotates, the texture coordinates are all over the place, while on wall columns it's easier to map a linear row of a bitmap. Even the PS1 does the wall column/floor horizontal span simulation with its GPU rather than full-blown polygons, for the obvious reasons. This can be seen in a PS1 emulator with wireframe display on. But it does it very fast.

 

So many problems. I will try it in a future version, but I'm not very optimistic about it. Another problem: visplane edges are also used in the 3DO port for sprite clipping (when a sprite is half hidden by an elevated wall in front of it, for example), and the sprites are rendered last and clipped against them. So in order to get rid of visplanes, I would also need to somehow sort sprites in between the wall rendering. There is so much more work involved to barely do this right.


@Optimus: thank you for the benchmarks. The gains in standard high/low detail mode were in the ranges I had expected, aka 1 ~ 1.5 fps. TBQH I was surprised that significant gains were still to be had in the ultra low-detail/potato modes, but I still cannot shake off my experience with large/complex maps back in the day: those played like molasses even with just the automap on, so I doubt that even potato mode would have helped those.

 

Speaking of the automap, that could be an interesting test, provided the fps counters still work and it's possible to switch to the automap during timedemos: just what is the lower limit if you reduce screen output to almost nothing (the BSP calculations will still have to run, however)?

 

Now, the IWAD maps are pretty tame in terms of complexity (except maybe the Final Doom ones), so maybe a better timedemo test would be needed, using a known "heavy" PWAD (still vanilla compatible, ofc). Or, a simple test: is Final Doom playable at all on a 386 with all of the FastDoom cuts applied?

14 hours ago, viti95 said:

The BSP tree is also used for monster sight checks, so that's why the game runs slower if there are lots of enemies. Any optimization on those makes the game run much faster.

Originally they used the blockmap for this and it had better performance, but there were a few bugs with their implementation, so they switched to BSP. The blockmap-based implementation can be found in the Heretic/Hexen source code. PrBoom+ probably also has it as part of its "old Doom" complevels.

 

The problem of course, besides the bugs that need to be fixed, is that changing the way AI sight checks are done runs a very high risk of demo desync.

2 hours ago, Maes said:

Speaking of the automap, that could be an interesting test, provided the fps counters still work and it's possible to switch to the automap during timedemos: just what is the lower limit if you reduce screen output to almost nothing (the BSP calculations will still have to run, however)?

 

Nah, I tried pressing TAB to bring up the automap while the demo is running, but that just brings up the menu. In fact, every key does.

 

But I just tried something else at least: reducing the window size to its lowest and enabling low detail (without potato).

That means spending much less time on rendering and more time on BSP and other things.

It's interesting, because in this case, on some levels, the difference between FastDoom and Doom gets quite a bit bigger.

 

Demo 3 (E3M5) was the lightest of them. With this configuration (smallest window size possible and F5 for low detail) FastDoom hits 49fps. The original Doom hits 31fps. That's a bigger gap. I was quite careful to make it a fair comparison with the original Doom: this is double-pixel lowres (no potato or flat surfaces or anything).

Demo 4 (E4M2) is also interesting, because the initial view when the level starts runs really, really slowly (you can see how slow it is even in the tiny window), since so many far-away sectors are visible in that open space. It may well be the slowest Ultimate Doom map. There, the FastDoom average is 21fps, while Doom is 16fps. Again, smallest window possible and F5 double-pixel lowres.
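To put numbers on those gaps, here is a small sketch. It assumes the usual vanilla timedemo arithmetic (fps = gametics × 35 / realtics, with 35 being Doom's tic rate); the function names are made up for this example.

```c
#include <assert.h>

#define TICRATE 35   /* Doom game tics per second */

/* Average fps of a timedemo: game tics played back versus real tics elapsed. */
double timedemo_fps(int gametics, int realtics)
{
    assert(realtics > 0);
    return (double)gametics * TICRATE / realtics;
}

/* Relative gain of new_fps over baseline_fps, in percent. */
double speedup_percent(double baseline_fps, double new_fps)
{
    assert(baseline_fps > 0.0);
    return (new_fps / baseline_fps - 1.0) * 100.0;
}
```

With the figures above, `speedup_percent(31, 49)` comes out at about 58% for E3M5 and `speedup_percent(16, 21)` at 31.25% for E4M2, both measured in the smallest window with low detail.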

 

I'd need to test a vanilla-compatible map later. I don't know which one hits the limit of complexity without visplane crashes. Could it be Suspended in Dusk? (Was this vanilla compatible? I don't remember.) I could also try the Doom 2 demos, especially the open areas. I could try recording my own demo instead of using the ones provided.

12 minutes ago, Gez said:

Originally they used the blockmap for this and it had better performance, but there were a few bugs with their implementation, so they switched to BSP. The blockmap-based implementation can be found in the Heretic/Hexen source code. PrBoom+ probably also has it as part of its "old Doom" complevels.

 

The problem of course, besides the bugs that need to be fixed, is that changing the way AI sight checks are done runs a very high risk of demo desync.

 

I recall that Doom also uses the blockmap for those checks. At least it tries to, but it will revert to other methods if anything goes wrong or is inconclusive. In fact, a badly constructed blockmap can and will lead to weird bugs, like the inability to hit monsters, or even projectiles being unable to travel past a certain map coordinate.
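As a rough illustration of why a bad blockmap can have that kind of effect, here is a minimal sketch of a blockmap cell lookup. It assumes the standard 128x128 map unit cells and works in plain integer map units rather than the engine's fixed-point values; the `Blockmap` struct and `blockmap_cell` function are hypothetical names, not the engine's own.

```c
#include <assert.h>

#define BLOCK_SIZE 128   /* map units per blockmap cell side */

typedef struct {
    int origin_x, origin_y;   /* map coordinates of the grid's lower-left corner */
    int width, height;        /* grid size in cells */
} Blockmap;

/* Return the index of the cell containing map point (x, y),
   or -1 if the point lies outside the grid. */
int blockmap_cell(const Blockmap *bm, int x, int y)
{
    assert(bm->width > 0 && bm->height > 0);

    int dx = x - bm->origin_x;
    int dy = y - bm->origin_y;
    if (dx < 0 || dy < 0)
        return -1;

    int bx = dx / BLOCK_SIZE;
    int by = dy / BLOCK_SIZE;
    if (bx >= bm->width || by >= bm->height)
        return -1;

    return by * bm->width + bx;   /* cells are stored row by row */
}
```

Anything that relies on this grid only sees the lines and things listed for the cells it looks up, so if the grid's origin or size does not match the actual map, points past its edge simply have no cell to check.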


While looking into how the omgifol library works, I discovered that the ZenNode tool is able to optimize the BSP tree and the nodes ( https://github.com/Doom-Utils/zennode ). The speedup is quite noticeable; here is a video of the current dev build (which also comes with optimizations in flat visplane rendering) running the optimized IWAD.

 

 

@Optimus thanks for the extensive testing with your 386! I think it's possible to get even more FPS on 386 processors and make Doom somewhat more playable. The optimizations you're making with Optidoom really amaze me; I'm pretty sure you will be able to get faster and steadier frame rates. E4M2 is the map I usually use to test whether a BSP optimization is effective, as it is one of the most complex maps. Maybe BSP optimizations could help reduce the pressure on the 3DO CPU.

 

EDIT:

 

@Dark Pulse I've tested ZokumBSP, but the WAD it generates causes the demos to desync. I haven't tried all the options, but I'm pretty sure the desync comes from changes in the blockmap.

 

EDIT 2:

 

Same DEV build but running on a 386SX

 

 

 

Edited by viti95

