Questions about parsing classic wads into a file hierarchy.

I want to do some batch processing on my collection of classic wads (which is in the thousands), and I would like to clarify two points regarding the wad format.

+ Where do maps (ExMy/MAPxx) end?

When I process levels, I generally treat level markers like a folder (as I think is the standard). So if you extract the resources, you end up with a folder with the name of the marker entry, and all of the lumps underneath (things, etc) within that.

As there is no 'end' marker for level data, where does the 'folder' end? The Doom Specs were never clear on it, and I haven't been able to find a definitive answer.

My approach (as demonstrated below) is to treat all recognized level lumps as part of the level folder, and consider the folder ended when a non-level lump is encountered.

name     aname    date        time  size      crc      type     offset
-------- -------- ----------- ----- --------- -------- -------- ---------
<previous level>
platform,        ,24-feb-1995,11:53,     3562,c14ff6c6,unknown ,  2465882
-- UP --
e3m7    ,mudmad14,24-feb-1995,11:53,    72314,6268b650,folder  ,  2469444
- DOWN -
things  ,        ,24-feb-1995,11:53,     4620,ff573c14,binary  ,  2469444
linedefs,        ,24-feb-1995,11:53,     8596,cbe338be,binary  ,  2474064
sidedefs,        ,24-feb-1995,11:53,    23970,0239668a,binary  ,  2482660
vertexes,        ,24-feb-1995,11:53,     2544,880ee47e,binary  ,  2506630
segs    ,        ,24-feb-1995,11:53,    10920,c8054ee6,binary  ,  2509174
ssectors,        ,24-feb-1995,11:53,     1280,3e1bdecf,binary  ,  2520094
nodes   ,        ,24-feb-1995,11:53,     8932,03170841,binary  ,  2521374
sectors ,        ,24-feb-1995,11:53,     2210,1d98abe3,binary  ,  2530306
reject  ,        ,24-feb-1995,11:53,      904,2c9438cc,binary  ,  2532516
blockmap,        ,24-feb-1995,11:53,     6776,28ffe235,binary  ,  2533420
platform,        ,24-feb-1995,11:53,     1370,5957e729,unknown ,  2540196
-- UP --
demo1   ,        ,24-feb-1995,11:53,    34624,9e2d02e3,unknown ,  2541566
demo2   ,        ,24-feb-1995,11:53,    29540,51d1ef79,unknown ,  2576190
demo3   ,        ,24-feb-1995,11:53,    33944,f77efa22,unknown ,  2605730
e3m8    ,mudman17,24-feb-1995,11:53,    75551,97623788,folder  ,  2639674
- DOWN -
things  ,        ,24-feb-1995,11:53,     2100,0a7f8537,binary  ,  2639674
<next level>

The problem with this approach is that there are a number of 'custom' lumps around which are attached to levels (like 'platform' in the above). So whatever is reading in the directory needs to be aware of these.

Another possible approach is to end the level when you hit the next level marker (or other marker entry) since they tend to be clustered together. But as the above shows, this would then 'incorrectly' bundle the demo entries under the level.

So what's the consensus from the Doom community? At what point do you consider a lump to not be part of a level?

+ What's the character set for WAD entry names?

It's a given that entry names can contain any alphanumeric characters, up to 8 characters in length, as well as the underscore, as these are all used in the original game.

Further, you can add the '[', '\', and ']' characters to the list, as those are used for sprite frames of index 26-28.

What other characters are acceptable in entry names? And if the answer to that is any characters, then what is the general standard?

Also, are entry names case-insensitive? I've always treated them as such, and if it is indeed the case it puts a theoretical limit on the number of frames a sprite can have.

Thanks in advance.

markj said:

+ Where do maps (ExMy/MAPxx) end?


In UDMF, there's an ENDMAP marker lump which is convenient for that. Everything between a TEXTMAP and the next ENDMAP belongs to the map.

In the older binary formats there is no end marker, so you have to go by the known lump names:

THINGS
LINEDEFS
SIDEDEFS
VERTEXES
SEGS
SSECTORS
NODES
SECTORS
REJECT
BLOCKMAP
BEHAVIOR (Hexen format only)
SCRIPTS (Hexen format only, optional)
PLATFORM (added by some old editors)
GL_VERT (GL nodes compiled by glBSP, optional)
GL_SEGS (GL nodes compiled by glBSP, optional)
GL_SSECT (GL nodes compiled by glBSP, optional)
GL_NODES (GL nodes compiled by glBSP, optional)
GL_PVS (potentially visible set compiled by glVIS, optional addon to GL nodes)
LEAFS (PSX Doom and Doom 64 only)
LIGHTS (Doom 64 only)
MACROS (Doom 64 only)


I think that's all.
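
To make that concrete, here is a minimal sketch of grouping a flat directory into map 'folders' using the list above. The function name `group_maps`, the marker pattern (only ExMy and MAPxx), and the rule that a TEXTMAP directly after a marker switches to "everything up to ENDMAP" are assumptions made for illustration, not something any particular tool is known to do.

```python
import re

# Classic map markers only (assumption: ExMy with single digits, or MAPxx).
MAP_MARKER = re.compile(r"E\dM\dM?|MAP\d\d".replace("M?", ""))

# Lump names that may follow a map marker, taken from the list above.
MAP_LUMPS = {
    "THINGS", "LINEDEFS", "SIDEDEFS", "VERTEXES", "SEGS", "SSECTORS",
    "NODES", "SECTORS", "REJECT", "BLOCKMAP", "BEHAVIOR", "SCRIPTS",
    "PLATFORM", "GL_VERT", "GL_SEGS", "GL_SSECT", "GL_NODES", "GL_PVS",
    "LEAFS", "LIGHTS", "MACROS",
}


def group_maps(names):
    """Split a flat list of lump names into (maps, loose).

    maps  is a list of (marker_name, [child lump names]) tuples.
    loose is a list of lump names that belong to no map.
    """
    maps, loose = [], []
    current = None    # child list of the map being filled, if any
    in_udmf = False   # True between TEXTMAP and ENDMAP
    for name in names:
        key = name.upper()
        if in_udmf:
            # UDMF: everything up to and including ENDMAP belongs to the map.
            current.append(name)
            if key == "ENDMAP":
                current, in_udmf = None, False
        elif current is not None and key == "TEXTMAP":
            current.append(name)
            in_udmf = True
        elif current is not None and key in MAP_LUMPS:
            current.append(name)
        elif MAP_MARKER.fullmatch(key):
            current = []
            maps.append((name, current))
        else:
            # First unrecognized lump closes the current 'folder'.
            current = None
            loose.append(name)
    return maps, loose
```

With the directory from the first post, this keeps the trailing PLATFORM lump inside E3M7 and leaves DEMO1 to DEMO3 outside it. Any custom per-level lump that is not in `MAP_LUMPS` would still end the folder early, so the set needs extending as new names turn up.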

markj said:

+ What's the character set for WAD entry names?

It's a given that entry names can contain any alphanumeric characters, up to 8 characters in length, as well as the underscore, as these are all used in the original game.

Further, you can add the '[', '\', and ']' characters to the list, as those are used for sprite frames of index 26-28.

What other characters are acceptable in entry names? And if the answer to that is any characters, then what is the general standard?

Also, are entry names case-insensitive? I've always treated them as such, and if it is indeed the case it puts a theoretical limit on the number of frames a sprite can have.

Thanks in advance.

Sprite frames are referred to internally by the engine by number: not "frame A" but "frame 0". The lump names use ASCII characters in sequence starting from 'A', with a hardcoded limit corresponding to "frame ]".
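
As a small illustration of that mapping, the sketch below converts between frame numbers and frame characters. The names `frame_char`, `frame_index`, and `MAX_FRAMES` are made up for this example; the limit of 29 follows from 'A' (frame 0) through ']' (frame 28) in ASCII order.

```python
MAX_FRAMES = 29  # 'A' is frame 0, ']' is frame 28


def frame_char(index):
    """Return the sprite-name character for a frame number."""
    if not 0 <= index < MAX_FRAMES:
        raise ValueError(f"frame index out of range: {index}")
    return chr(ord("A") + index)


def frame_index(char):
    """Return the frame number for a sprite-name character."""
    index = ord(char) - ord("A")
    if not 0 <= index < MAX_FRAMES:
        raise ValueError(f"not a frame character: {char!r}")
    return index
```

Note that `frame_index` does not uppercase its input, so a lowercase letter is rejected rather than silently folded onto an existing frame.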

Wad entries, however, do not have such a limit. Underscores and dashes _- are often used. Underscores especially, since they are part of the namespace marker lumps (S_START/S_END for example).

Technically, nothing prevents lumps from having names using lower case characters. It's just that it's not usually done.

Gez said:

Technically, nothing prevents lumps from having names using lower case characters. It's just that it's not usually done.

Probably important to note that name resolution within wad files by the game engine is done with case insensitivity. So FOOBAR, foobar, and FooBar are all the same lump as far as it's concerned.

Also, I've said this before: a wad lump name could theoretically contain any ASCII code other than 0. The engine isn't going to be looking for such lumps, so there's little reason to use strange characters, but tool programmers would be wise to account for the possibility and not write tools that crash on an unexpected string.
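
For tool authors, a tolerant reader might look something like the sketch below. It assumes the usual WAD layout (a 12-byte header with a 4-byte identifier, lump count, and directory offset, followed by 16-byte directory entries of file position, size, and an 8-byte name). The function names are hypothetical, and decoding as Latin-1 is just one way to make sure no byte value can raise a decoding error.

```python
import struct


def decode_lump_name(raw):
    """Decode an 8-byte directory name without ever raising.

    The name ends at the first NUL byte (or after 8 bytes); anything
    after the NUL is ignored. Latin-1 maps every byte to a character.
    """
    return raw[:8].split(b"\0", 1)[0].decode("latin-1")


def read_directory(data):
    """Return a list of (name, filepos, size) from raw WAD bytes."""
    if len(data) < 12:
        raise ValueError("too short to be a WAD")
    magic, count, offset = struct.unpack_from("<4sii", data, 0)
    if magic not in (b"IWAD", b"PWAD"):
        raise ValueError("not a WAD file")
    # Reject directories that point outside the file instead of crashing.
    if count < 0 or offset < 12 or offset + 16 * count > len(data):
        raise ValueError("corrupt WAD directory")
    entries = []
    for i in range(count):
        filepos, size, raw = struct.unpack_from("<ii8s", data, offset + 16 * i)
        entries.append((decode_lump_name(raw), filepos, size))
    return entries
```

The names come back exactly as stored, odd characters included, so whatever writes them out to a file hierarchy still has to decide how to escape characters the filesystem does not accept.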

Quasar said:

Probably important to note that name resolution within wad files by the game engine is done with case insensitivity. So FOOBAR, foobar, and FooBar are all the same lump as far as it's concerned.

The Legacy fork used for SRB2 has been made case-sensitive. One of the people involved with that game requested that SLADE 3 get an option to turn off the automatic uppercasing because of that.


Yeah, characters in that game use up so many frames that it goes into the lower cases. Archvile, eat your heart out.

Gez said:

The Legacy fork used for SRB2 has been made case-sensitive. One of the people involved with that game requested that SLADE 3 get an option to turn off the automatic uppercasing because of that.

Yes. It just shouldn't be assumed that lumps differing in name by case will be treated distinctly. I do *not* therefore think that tools should necessarily normalize names that happen to be lowercase, since as in this instance, there's a good reason for it :>
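
One way a tool could honor both points is to compare names case-insensitively while leaving the stored names untouched. The sketch below does that; `find_lump` is a hypothetical helper, and returning the last match mirrors the engine's habit of searching the directory backwards so that later lumps take precedence.

```python
def find_lump(names, wanted):
    """Return the index of the last lump matching `wanted`, ignoring case.

    Returns -1 if there is no match. The list itself is never modified,
    so lowercase names survive a load and save round trip.
    """
    key = wanted.upper()
    for index in range(len(names) - 1, -1, -1):
        if names[index].upper() == key:
            return index
    return -1
```

A tool targeting a case-sensitive port such as the SRB2 one mentioned above would need a switch to compare names exactly instead.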
