Is there a way to do a diff between WADs?

I've noticed that my SVN copy of SRB2's "srb2.wad" is marked as modified. Thing is, I don't recall what I modified to cause that to happen, and the WAD's got a good 6233 lumps to go through to figure out just what the hell it is I changed. That's a lot of room for error.

So, I want to know if there's any way I can automatically detect what's different between two copies of a file. Like, even something as basic as dumping out a list of lumps with the contents' checksums would probably suffice; I could use Notepad++'s Compare plugin for a kind-of-sort-of diff. Just doing a diff between the two binaries isn't likely to produce anything meaningful, though (I don't think? I forget if diff can do binaries, since interpreting it as text would just yield a bunch of nearly indecipherable garbage data).

Has to be case-sensitive with regards to lump names, though. I know that most Doom WADs never need lower-case letters, but we have sprites in here with so many frames, it makes the Archvile weep - and that number creeps into the lower-case letter part of the ASCII table.
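
The "list of lumps with checksums" idea could be sketched roughly as below. This is only a minimal sketch, assuming the standard WAD layout (a 12-byte header followed somewhere by a directory of 16-byte entries); the function names `list_lumps` and `format_listing` are made up for illustration, and CRC-32 is just one possible choice of checksum. Names are cut at the first NUL byte but never upper-cased, so lower-case sprite frames stay distinct.

```python
import struct
import zlib

def list_lumps(data: bytes):
    """Return (index, name, size, crc32) for every directory entry of a WAD image."""
    magic, numlumps, diroffset = struct.unpack_from("<4sii", data, 0)
    if magic not in (b"IWAD", b"PWAD"):
        raise ValueError("not a WAD file")
    rows = []
    for i in range(numlumps):
        filepos, size, raw_name = struct.unpack_from("<ii8s", data, diroffset + 16 * i)
        # Cut at the first NUL but do NOT upper-case: the name's case is preserved.
        name = raw_name.split(b"\0", 1)[0].decode("ascii", "replace")
        rows.append((i, name, size, zlib.crc32(data[filepos:filepos + size])))
    return rows

def format_listing(path: str) -> str:
    """Hypothetical helper: one text line per lump, suitable for a text diff tool."""
    with open(path, "rb") as f:
        rows = list_lumps(f.read())
    return "\n".join(f"{i:5d} {name:<8} {size:9d} {crc:08x}" for i, name, size, crc in rows)
```

Writing the output of `format_listing` for each copy of the WAD to a text file would give two listings that any text comparison tool can handle.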

(Wouldn't surprise me if SLADE is hiding this feature somewhere and I just missed it...)


If you can dump ALL the lumps of each WAD into two different directories DIR_A and DIR_B (this can be done easily with DEU), then you can just use

fc /b DIR_A\*.* DIR_B\*.*


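on Windows, or cmp under Linux (cmp compares one pair of files at a time, so you would loop over the files or use diff -rq on the two directories). You may want to redirect the standard output to a text file, though, as 6000+ files will create a lot of clutter even if there are no differences encountered.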
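
For a platform-neutral version of the same directory comparison, something like the following might do. It is a sketch only, assuming both WADs have already been dumped to two directories with one file per lump; `compare_dumps` is a made-up name, while `filecmp.cmpfiles` is part of Python's standard library.

```python
import filecmp
import os

def compare_dumps(dir_a, dir_b):
    """Compare two directories of dumped lumps, file by file."""
    files_a, files_b = set(os.listdir(dir_a)), set(os.listdir(dir_b))
    common = sorted(files_a & files_b)
    # shallow=False forces a byte-by-byte comparison instead of trusting os.stat().
    same, different, errors = filecmp.cmpfiles(dir_a, dir_b, common, shallow=False)
    return {
        "different": sorted(different),
        "only_in_a": sorted(files_a - files_b),
        "only_in_b": sorted(files_b - files_a),
        "errors": sorted(errors),
    }
```

Unlike the raw fc output, this only reports the files that differ or are missing on one side, so there is nothing to wade through when the dumps are identical.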

If there is no difference in the lumps themselves, then the difference might be only in the WAD's "slack space" (space created by deletion/moving of lumps), different lump order or in different memory garbage data (e.g. bytes present after a string's terminating zero, for fixed-width string fields).
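
To check for those non-content differences directly, a rough sketch could look like this. It assumes the standard WAD header and directory layout; `layout_report` and the keys of the dictionary it returns are invented for illustration. It counts bytes that neither the header, the directory nor any lump refers to, flags name fields with non-zero bytes after the terminating zero, and returns the lump order so that two WADs can be compared on that as well.

```python
import struct

def layout_report(data: bytes):
    """Summarize slack space, dirty name fields and lump order of a WAD image."""
    magic, numlumps, diroffset = struct.unpack_from("<4sii", data, 0)
    # Byte ranges that are accounted for: header, directory, and each lump's data.
    used = [(0, 12), (diroffset, diroffset + 16 * numlumps)]
    dirty_names, order = [], []
    for i in range(numlumps):
        filepos, size, raw = struct.unpack_from("<ii8s", data, diroffset + 16 * i)
        if size > 0:
            used.append((filepos, filepos + size))
        name, _, tail = raw.partition(b"\0")
        if tail.strip(b"\0"):
            # Something other than zero padding follows the terminator.
            dirty_names.append(i)
        order.append(name)
    # Merge the (possibly overlapping) ranges and count the bytes they cover.
    covered, end = 0, 0
    for start, stop in sorted(used):
        start = max(start, end)
        if stop > start:
            covered += stop - start
            end = stop
    return {"slack_bytes": len(data) - covered, "dirty_names": dirty_names, "order": order}
```

If two WADs have identical lump contents but different reports here, the "modified" status most likely comes from layout rather than from any actual edit.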

Just one catch though: under Windows you can't have case-sensitive file comparison, since file names there are case-insensitive by default and dumped lumps whose names differ only by case would collide. Frankly, relying on case is bad practice for Doom source ports anyway, as it's not guaranteed to work across source ports and editors, which implicitly assume that lump names are case-insensitive and convert them to uppercase at some point. What source port are you creating this PWAD for?


SRB2 uses its own heavily-modified branch of an old Doom Legacy source port (chosen back around 1998/1999 or so, at a point when Legacy was the only port with 3D floors and when ZDoom was, like, a fraction of what it ended up being today), so I'm not terribly concerned about other source ports; they're kind of out-of-scope.

I suppose I can dump the contents of each WAD to separate folders via SLADE and do a cmp, then (it's Windows, admittedly, but I downloaded a bunch of recompiled Linux tools that I use now and then; generally more useful than the Windows ones, anyway). Thanks for the tip.


Personally I'd love to see a "waddiff" tool and matching "wadpatch" to go with it.

fraggle said:

Personally I'd love to see a "waddiff" tool and matching "wadpatch" to go with it.


Well, at least the "waddiff" part should be trivial for anyone having access to working wad reading code, though providing refined/meaningful output might be harder. Is it enough to just say e.g. "lumps #3456 from PWAD A and #1234 from PWAD B have the same name but differ in content/size"?
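
A bare-bones version of that kind of report might look like the sketch below. It assumes the standard WAD directory layout, compares lumps by size and CRC-32, and pairs up duplicate names in order of occurrence (the k-th lump named X in A against the k-th in B), which is only one of several possible pairing rules. `read_directory` and `waddiff_report` are made-up names.

```python
import struct
import zlib
from collections import defaultdict

def read_directory(data):
    """Return (index, name, size, crc32) per lump; names keep their case."""
    magic, numlumps, diroffset = struct.unpack_from("<4sii", data, 0)
    entries = []
    for i in range(numlumps):
        filepos, size, raw = struct.unpack_from("<ii8s", data, diroffset + 16 * i)
        name = raw.split(b"\0", 1)[0].decode("ascii", "replace")
        entries.append((i, name, size, zlib.crc32(data[filepos:filepos + size])))
    return entries

def waddiff_report(data_a, data_b):
    """List lumps that changed, or that exist in only one of the two WADs."""
    by_name_a, by_name_b = defaultdict(list), defaultdict(list)
    for entry in read_directory(data_a):
        by_name_a[entry[1]].append(entry)
    for entry in read_directory(data_b):
        by_name_b[entry[1]].append(entry)
    lines = []
    for name in sorted(set(by_name_a) | set(by_name_b)):
        occ_a, occ_b = by_name_a.get(name, []), by_name_b.get(name, [])
        # Pair the k-th occurrence in A with the k-th occurrence in B.
        for (ia, _, sa, ca), (ib, _, sb, cb) in zip(occ_a, occ_b):
            if (sa, ca) != (sb, cb):
                lines.append(f"{name}: lump #{ia} in A and #{ib} in B differ ({sa} vs {sb} bytes)")
        for ia, *_ in occ_a[len(occ_b):]:
            lines.append(f"{name}: lump #{ia} only in A")
        for ib, *_ in occ_b[len(occ_a):]:
            lines.append(f"{name}: lump #{ib} only in B")
    return lines
```

Lumps that merely moved to a different index are not reported, since only name, size and checksum are compared.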

As for a wadpatch utility, I recall creating a utility for a strategy game (Warlords Battlecry III, in 2005) which could apply patches to its equivalent of "PWAD" files (XCR files). That was a considerably more complex format to handle, because the index was at the FRONT of the file (so ADDING entries required shifting "lumps" around), some entries had to be encrypted for the game to use them, and of course you had to account for slack space/defragmentation. But it made mods for the game practically possible ;-)

The cool thing about that was that the "distributable patch" format I had adopted was identical to the "XCR" format targeted by the utility: imagine something like a PWAD with the lumps to change, plus a lump containing an optional script telling the patcher utility what to target specifically within an XCR file. This enabled people to create patch files using existing editors for the format.

I still have the source code for that around... it could be adapted for handling WADs, I guess.

Shadow Hog said:

I've noticed that my SVN copy of SRB2's "srb2.wad" is marked as modified. Thing is, I don't recall what I modified to cause that to happen, and the WAD's got a good 6233 lumps to go through to figure out just what the hell it is I changed. That's a lot of room for error.
[...]
(Wouldn't surprise me if SLADE is hiding this feature somewhere and I just missed it...)


Here's something you can do. Select the old srb2.wad as the base resource archive, and open the new one. Then run Archive->Maintenance->Remove entries duplicated from IWAD. This should delete all the non-map, non-marker lumps that have the same name and the same CRC-32 as something from the base resource archive, leaving you with only the modified lumps in the new one.

(Note: maps are deliberately ignored by this feature because a map is more than a single lump. Marker lumps are also ignored, both because an empty lump hardly matters for copyright purposes and because they are necessary for namespacing.)

Maes said:

Well, at least the "waddiff" part should be trivial for anyone having access to working wad reading code, though providing refined/meaningful output might be harder. Is it enough to just say e.g. "lumps #3456 from PWAD A and #1234 from PWAD B have the same name but differ in content/size"?

Another way of looking at it is that PWADs are themselves *patch* WADs. So you could make a waddiff tool such that:

waddiff old.wad new.wad -o diff.wad
would generate diff.wad such that:
doom -file old.wad diff.wad
would be the same as:
doom -file new.wad
Then you could have a wadpatch tool that would just merge two WADs together. Ideally you'd want it to be able to do three-way merges as well, to combine two separate sets of changes.
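
As a very rough sketch of the diff.wad idea, the following builds a PWAD holding every lump of the new WAD whose effective contents differ from the old one. It assumes the standard WAD layout and plain by-name lookup in which the last lump with a given name wins; the function names are invented here. It deliberately ignores the hard parts: lumps deleted in new.wad cannot be expressed, and maps or marker-delimited groups get no special treatment.

```python
import struct

def read_lumps(data):
    """Return [(raw_name, payload)] in directory order."""
    _, numlumps, diroffset = struct.unpack_from("<4sii", data, 0)
    lumps = []
    for i in range(numlumps):
        filepos, size, raw = struct.unpack_from("<ii8s", data, diroffset + 16 * i)
        lumps.append((raw.split(b"\0", 1)[0], data[filepos:filepos + size]))
    return lumps

def write_wad(lumps):
    """Serialize lumps as a PWAD: header, lump data, then the directory."""
    body, directory, pos = b"", b"", 12
    for name, payload in lumps:
        directory += struct.pack("<ii8s", pos, len(payload), name)
        body += payload
        pos += len(payload)
    return struct.pack("<4sii", b"PWAD", len(lumps), pos) + body + directory

def make_diff(old_data, new_data):
    """Build a PWAD of the lumps that are new or changed in new_data."""
    # dict() keeps the last payload per name, mirroring "last lump wins" lookups.
    old = dict(read_lumps(old_data))
    new_lumps = read_lumps(new_data)
    final = dict(new_lumps)
    diff, seen = [], set()
    for name, _ in new_lumps:
        if name in seen:
            continue
        seen.add(name)
        if old.get(name) != final[name]:
            diff.append((name, final[name]))
    return write_wad(diff)
```

Under those assumptions, looking up any name in old.wad followed by the generated diff gives the same payload as looking it up in new.wad, as long as nothing was removed.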

The above doesn't work if you want to be able to merge levels, though - something that would be really nice if you wanted to collaboratively work on levels with others. With that you've got an entirely different set of challenges.

It all depends on the use case though. I'd also like to be able to see a simple text listing of "lump named X changed" as well, sometimes.


Well, levels are notoriously treated a bit differently from normal lumps by Doom itself: essentially a level is a marker plus the ten (?) lumps immediately following it, which must also have specific names. Merging PWADs with levels should account for that and always include all of a level's lumps in a diff, even if not all of them actually changed.
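
Treating a level as one unit could be sketched like this. The set of lump names is the one used by Doom-format maps, and a map header is recognized here simply as any lump directly followed by THINGS; both are assumptions that may not hold for every port or map format. `group_map_lumps` and `changed_groups` are made-up names operating on a plain list of lump names.

```python
MAP_LUMPS = {"THINGS", "LINEDEFS", "SIDEDEFS", "VERTEXES", "SEGS",
             "SSECTORS", "NODES", "SECTORS", "REJECT", "BLOCKMAP"}

def group_map_lumps(names):
    """Split lump indices into groups: one group per map, one per other lump."""
    groups, i = [], 0
    while i < len(names):
        # Assumption: a map header is any lump directly followed by THINGS.
        if i + 1 < len(names) and names[i + 1] == "THINGS":
            j = i + 1
            while j < len(names) and names[j] in MAP_LUMPS:
                j += 1
            groups.append(list(range(i, j)))
            i = j
        else:
            groups.append([i])
            i += 1
    return groups

def changed_groups(groups, changed):
    """Expand a set of changed lump indices to the whole groups containing them."""
    return [g for g in groups if any(i in changed for i in g)]
```

With this grouping, a change to a single SIDEDEFS lump pulls the marker and all of that level's lumps into the diff together.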

If you think that this would be inefficient space-wise, you can always include e.g. a special OVERRIDE lump with specific instructions, but that'd require using a special patching tool, not quite the direct -merge usage you described.

Other special groups of lumps I can think of off the top of my head right now are sprites, flats, etc., all of which would require special treatment for a "patch" PWAD to work as described, essentially making it a pre-merged PWAD that provides ersatz "-merge" functionality even to plain vanilla Doom. Does it really need to be that complicated, or is a binary executable patcher enough?

fraggle said:

The above doesn't work if you want to be able to merge levels, though - something that would be really nice if you wanted to collaboratively work on levels with others. With that you've got an entirely different set of challenges.


I think it was talked about while UDMF was elaborated. Unfortunately, though, this is not possible with the way the Doom engine reads level data (regardless of their container format). You'd need to give the various elements a unique identifier (probably some sort of random hash-like value that has a guarantee not to be reused ever even if the element is deleted and then re-created). You'd therefore get a "work format", and a "play format".

Gez said:

I think it was talked about while UDMF was elaborated. Unfortunately, though, this is not possible with the way the Doom engine reads level data (regardless of their container format). You'd need to give the various elements a unique identifier (probably some sort of random hash-like value that has a guarantee not to be reused ever even if the element is deleted and then re-created). You'd therefore get a "work format", and a "play format".

You'd certainly need a unique ID to do it "precisely", but I bet you could achieve pretty good results just by doing it heuristically.
