Jump to content
Search In
  • More options...
Find results that contain...
Find results in...
Sign in to follow this  
Quasar

Surprising strict limit on PWAD size in Vanilla Doom

Recommended Posts

esselfortium's Saturn X project suddenly suffered a bizarre setback in the form of a messageless lockup during W_Init.

The fact that there's no message and a lockup is the fault of I_Error, which assumes certain side effects of I_Init have occurred, and I_Init is not called until well after W_InitMultipleFiles (In particular it calls DMX's TSM_DelService with an uninitialized tsm_ID variable).

However, the cause of the crash in the first place is this code, from vanilla's W_AddFile:

      v10 = 16 * v41; // v41 == numlumps, read from the directory
      v26 = 16 * v41 + 3;
      LOWORD(v26) = v26 & 0xFFFC;
      if ( v26 >= (unsigned int)alloca_imp_() )
        v11 = 0;
      else
        v11 = &a1;
      lseek(v35, v42, 0);
      v27 = v11;
      v28 = v34;
      read(v35, v8, v10);
      if ( !v28 )
        numlumps += v41;
The bolded portion of the code is an inlined expansion of Watcom's alloca macro.

alloca is a non-standard but widely implemented C function which allocates memory from the temporary variable store, typically implemented on the stack. the "a" in alloca means "automatic", because such allocations have the same lifetime as local or "auto" variables.

However, alloca is limited in Watcom's implementation to the size of the stack segment as determined at compile time. In vanilla Doom, this is a 64 KB segment. This means that, subtracting off the stack space already used when calling W_AddFile, there is only room left for approximately 4046 lumps.

Any wad larger than this will cause memory corruption and/or a lockup, chiefly because the return value of alloca is not checked by the code (note where it stores 0, or NULL, into "v11" if the size being allocated is greater than the return value of alloca_imp, which is a Watcom internal function that returns the amount of stack space available). But, this return value is passed blindly to the read function later.

This means it is reading a file into low DOS memory, starting at offset 0 and continuing for the size of the wad directory. This would trash the entire interrupt table and a considerable portion of whatever follows - right in the middle of a file operation. This is extremely dangerous.

Testing by CSonicGo on a real DOS machine showed bizarre unexplainable behaviors such as randomly changing text displaying in awful colors.

Share this post


Link to post

Egad thats nasty. This is precisely why alloca should never be used for unbounded allocations (personally I've yet to find a use but I can see how it might be useful in some niche applications).

So, whats the plan? How does the Saturn X team intend to get around the problem?

Share this post


Link to post

xttl has provided me a hacked EXE that changes the stack to 512K, enabling SaturnX's resources load properly. Testing this hacked EXE on a real DOS machine was successful, and will now allow me to properly test the project.

Share this post


Link to post
DaniJ said:

Egad thats nasty. This is precisely why alloca should never be used for unbounded allocations (personally I've yet to find a use but I can see how it might be useful in some niche applications).

So, whats the plan? How does the Saturn X team intend to get around the problem?

There are a few possibilities, one of which is just to separate the release into a few wads to be loaded together. BTSXMAPS.WAD, BTSXFLAT.WAD, BTSXGFX.WAD, that sort of thing.

Share this post


Link to post

Here's the quick'n'dirty 2-byte fix for regular v1.9 in case anyone is interested (not Ultimate or either Final Doom version, but easily adaptable):

offset    old byte   new byte
00027AF2: 08         10
00027BC2: 08         10
That will extend the data section size in memory by 512kB and moves the initial stack pointer value up correspondingly. 512kB is sizeof(filelump_t)*32768 so you can now have up to 32768 lumps + whatever amount worked with originally. Dunno if it'll have any unwanted side effects (besides increasing memory requirements), but it does seem to work pretty well with esselfortium's WAD and a junk test WAD with 32768 lumps, both in DOSBox and on a real DOS PC. A better solution would be to patch the function to use malloc/free or Z_Malloc/Z_Free I guess, but that'd also require a bit more effort. :-)

Share this post


Link to post
esselfortium said:

There are a few possibilities, one of which is just to separate the release into a few wads to be loaded together. BTSXMAPS.WAD, BTSXFLAT.WAD, BTSXGFX.WAD, that sort of thing.

Yeah, that would be the obvious solution. However I know that some users have an (irrational?) dislike for multi-wad mods...

How many lumps are we talking? Is it significantly more than _safe_number_ or might you be able to squeeze it in range?

Share this post


Link to post
DaniJ said:

Yeah, that would be the obvious solution. However I know that some users have an (irrational?) dislike for multi-wad mods...

How many lumps are we talking? Is it significantly more than _safe_number_ or might you be able to squeeze it in range?

He's almost 100 over just with the offending texture wad, and I imagine the original plan had been to merge that with the levels wad for the release. That plan's definitely out, unless they want to require the hacked EXE ;)

Share this post


Link to post

I discussed with essel about using the Hacked EXE, and before the haters scream "WELL IT ISN'T VANILLA NOW IS IT???!" May I remind them that any hacking done to Doom doesn't remove its vanilla-ness.

Share this post


Link to post

I have to admit that I do think bundling the hacked EXE would rather undermine the rigorous testing effort for true-vanilla compatibility. I know the point you are making but keeping within static limits are part of that self-imposed compatibility requirement. Moving the goal posts now kind of makes that effort meaningless in a sense. See what I mean?

Share this post


Link to post

The two-byte fix should probably be added to the Doom+ patch. I guess LinuxDoom (which also counts as "vanilla") and anything derived from it probably don't have this bug.

Share this post


Link to post
Csonicgo said:

I discussed with essel about using the Hacked EXE, and before the haters scream "WELL IT ISN'T VANILLA NOW IS IT???!" May I remind them that any hacking done to Doom doesn't remove its vanilla-ness.

Yes, but it makes no sense to just remove 1 limit and keep the others. Might want to change the target engine to Doom+ as well.

Share this post


Link to post
hex11 said:

I guess LinuxDoom (which also counts as "vanilla") and anything derived from it probably don't have this bug.

The official Linux binaries use alloca just like the DOS version, as does anything based on the released linuxdoom source unless the port authors have changed W_AddFile themselves. It's just that you probably have more space reserved for your stack then.

Share this post


Link to post

xttl said:
Dunno if it'll have any unwanted side effects (besides increasing memory requirements), but it does seem to work pretty well with esselfortium's WAD and a junk test WAD with 32768 lumps, both in DOSBox and on a real DOS PC.

Doom+ also changes 00027BC2, but to another value:

Header                        : It's always necessary
DOOM.EXE
00027BC0: 10 00 ;new size of segment
00027BC1: 5E F0
00027BC2: 08 1A
I changed 00027AF2 as per your hack, leaving 00027BC2 alone, and it works, but I'm not sure if there could be a problem there...

Share this post


Link to post
myk said:

Doom+ also changes 00027BC2, but to another value:

Header                        : It's always necessary
DOOM.EXE
00027BC0: 10 00 ;new size of segment
00027BC1: 5E F0
00027BC2: 08 1A
I changed 00027AF2 as per your hack, leaving 00027BC2 alone, and it works, but I'm not sure if there could be a problem there...

Yeah, Doom+ already extends the size of the data section a lot for the new, bigger arrays for visplanes, drawsegs, etc. So what you did was that you left the DS size (0x1AF000) alone but moved the initial stack pointer up to somewhere in the middle of that new DS (0x105E10) which is bad because now writes (pushes) to the stack can overwrite whatever Doom+ is storing at those addresses (or the other way around, something like a bunch of visplanes could overwrite the stack contents). It might not immediately break anything, but probably will sooner or later.

Here's an equivalent patch for Doom+ that will extend DS even more by 512kB and will move the stack pointer to the top of DS:

offset    old byte   new byte
00027AF0: 10         00
00027AF1: 5E         F0
00027AF2: 08         22
00027BC2: 1A         22

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  
×