EXE hacking

kb1 · December 7, 2018

Ok, this has got to be the coolest, most bad-ass post I've seen in a long time! I'm dying to check out what your patcher is doing, and how you're getting everything to work, despite the awful run-time relocation stuff to deal with.

Great stuff - I'm massively impressed! Cacoaward!!

xttl · December 7, 2018

The call/pop trick was actually originally mentioned by budko (the prboom+ coder, also the doom+ series creator) on #doom-tech or somewhere else on IRC I was lurking years ago, though I think at some point the realization would have dawned upon me regardless*, and you could alternatively also modify the fixup table to make the program loader write the relocated addresses of the start of CS and DS somewhere your code can grab them. (granted this isn't as easy as it could be because the LE format seems to be overall really shitty and stupidly complex, just look at it and compare to eg. PE which has also always had relocation support for DLLs and nowadays ASLR enabled executables, but at least I managed to write that tool earlier to help a bit with this)

^{*(even if I'm not a particularly expert or experienced programmer or even good or a professional in this field, really I just hack on old games as a hobby sometimes, maybe a bit like some people do puzzles and riddles and stuff like that)}

When you know where the code and data sections of the game binary are in memory at runtime, it's already easy to start doing small changes with some memory reads and writes. You can try this yourself with that newdhack.exe file I just posted inside that zip, just use NASM or something to assemble a stage2.bin which does some movs. When the code in stage2.bin starts, EDX contains the relocation offset for game code and EBX contains it for game data, and in EAX you have the address where the code in stage2.bin was loaded. At the end of the file you just put a RET to go back to the 1st stage loader (which is in the EXE: I overwrote an unused function with it and then overwrote part of the date + password check code on early startup with a call to there). You don't really even need to save/restore any registers yourself because the loader I patched into the game exe does pusha/popa already and the call to stage2.bin is the last instruction before the popa and ret back to original game code.

So for example, if you wanted to change plat speed to match the release version (0x40000 vs. 0x10000) you could write this:

BITS 32

org 0

mov byte [edx + 0x2eb60], 0x04 ; third byte of the imm32: 0x10000 -> 0x40000
ret                            ; back to the stage1 loader

and assemble it with NASM (raw output) to a file you name stage2.bin and put in the same directory with newdhack.exe before you run it in DOSBox.

If you look at newdoom.exe in any disassembler that shows you addresses as they would appear in memory without relocation*, you see that this overwrites a byte from the immediate value included in this instruction inside EV_DoPlat:

cseg01:0002EB5B                 mov     dword ptr [eax+10h], 10000h

^{*(note that the preferred load address without relocation specified in the LE headers for the code section is 0x10000 and not 0, and for the data section it's 0x50000, this is why I already renamed the vars in the C code I posted on last page to csrelo/dsrelo from csbase/dsbase because what stage1 loader really gives you is the relocation offsets = difference vs preferred address, not relocated start of the section, that would be the difference+0x10000 for code or difference+0x50000 for data)}
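To make the byte arithmetic concrete, here is a small Python sketch (not part of the tooling in this thread). It assumes the standard encoding of `mov dword ptr [eax+10h], imm32`, which is `C7 40 10` followed by the four immediate bytes in little-endian order. The relocation offset value is made up and only stands in for whatever the stage1 loader leaves in EDX.

```python
import struct

PREFERRED_CS_BASE = 0x10000   # preferred load address of the code section (from the post)
INSN_ADDR = 0x2EB5B           # address of the mov as shown in the disassembler
PATCH_ADDR = 0x2EB60          # address written by the stage2.bin example

# mov dword ptr [eax+10h], 10000h  ->  C7 40 10 <imm32, little-endian>
insn = bytearray(b"\xC7\x40\x10" + struct.pack("<I", 0x10000))

# The patch hits byte 5 of the instruction, i.e. the third byte of the immediate.
insn[PATCH_ADDR - INSN_ADDR] = 0x04
new_speed = struct.unpack_from("<I", insn, 3)[0]

# Hypothetical relocation offset (assumption), standing in for the value in EDX.
csrelo = 0x00230000
runtime_patch_addr = csrelo + PATCH_ADDR               # what [edx + 0x2eb60] addresses
runtime_section_start = csrelo + PREFERRED_CS_BASE     # relocated start of the code section

print(hex(new_speed), hex(runtime_patch_addr), hex(runtime_section_start))
```

The point of the last two lines is the distinction made in the footnote: the offset is added to the address exactly as the disassembler shows it, and the relocated section start is the offset plus the preferred base.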

That's not a change that would be hindered by DOS4GW rewriting all non-relative address references on startup though, because it's just an immediate value inside an instruction (and the memory write it does is to an address relative to a register but even if it wasn't the relocation record would still hit just the address part of the instruction here) so let's try something else next:

BITS 32

org 0

lea ecx, [edx + 0x248d4]  ; relocated address of the options menu function
mov [ebx + 0x5221c], ecx  ; overwrite the function pointer in the menu definition
ret

This changes a function pointer inside the menu definitions in the data section so that the first selection in the main menu ("Demo Map 1") now activates the options menu, which is disabled but still present in the code. This is something you'd normally need to change in the relocation table instead, because the function pointer inside the menu defs obviously needs to be rewritten when the game is relocated in memory (it's not a relative reference like jmps/calls back and forth inside the code section).

For anybody not even superficially familiar with x86 assembly: note that the lea instruction does not do any real memory access despite the square brackets, it's just used for calculations (it's intended for memory address calculations but can be used for any calculations really, and sometimes it's more optimal to do calculations that way, ^{though here you could just do add edx, 0x248d4 and use edx as the source in the next instruction if you didn't care about losing the value in edx which is the CS relocation offset}).

^{(beware the screen size control in the options menu btw., you can play around with it but if you expand the game view to fullscreen then something bad happens when you quit the game, it's probably overwriting something in memory just before the screen buffer because the view is drawn a bit too high with a few pixels of the status bar still showing at the bottom, I needed to go to a virtual console with ctrl+alt+f2 to kill dosbox because after quitting the game it went into a state where it made X totally unresponsive. maybe it doesn't break on Windows like that but I don't really use Windows anymore because Win10 is such a disaster, so it's difficult for me to say anything for sure about that)}

None of this really helps with making more significant changes to the game code any less tedious work (where you have to keep track of the byte position of all memory address references in your code all the time) though, unless you can somehow automate the process of always applying address fixups to the correct locations in the replacement code. Because the LE format is annoying to work with I think it's easier to accomplish this via a runtime loader approach than something which would generate fixup records for LE headers automatically.
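As a rough illustration of what automating the fixups could look like, here is a Python sketch. The chunk format (raw code plus a list of offsets where 32-bit absolute addresses sit), the function name and the relocation offsets are all made up for the example; it is not the loader from this thread.

```python
import struct

def apply_fixups(chunk, fixups, csrelo, dsrelo):
    """Add the matching relocation offset to every absolute 32-bit address in chunk.

    fixups is a list of (offset_in_chunk, section) pairs, section being "cs" or "ds".
    """
    delta = {"cs": csrelo, "ds": dsrelo}
    out = bytearray(chunk)
    for offset, section in fixups:
        (addr,) = struct.unpack_from("<I", out, offset)
        struct.pack_into("<I", out, offset, (addr + delta[section]) & 0xFFFFFFFF)
    return bytes(out)

# mov eax, 0x583ec (B8 imm32) ; mov ecx, 0x42d40 (B9 imm32) ; ret (C3)
# Both addresses are written as a disassembler would show them, unrelocated.
chunk = b"\xB8" + struct.pack("<I", 0x583EC) + b"\xB9" + struct.pack("<I", 0x42D40) + b"\xC3"
fixups = [(1, "ds"), (6, "cs")]   # byte positions of the two address operands

# Hypothetical relocation offsets (assumption).
patched = apply_fixups(chunk, fixups, csrelo=0x230000, dsrelo=0x310000)
print(patched.hex())
```

The tedious part described above is producing the `fixups` list by hand. A linker that emits relocation info (or a format like PE that carries it) is what would make that list come for free.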

Now I'm really not some secret closet master wizard programmer, as unfortunate as that is (I've completed the first page of puzzles in TIS-100 once but that's it, and even those were often very unoptimal and sometimes silly solutions though they worked), so I gave up on programming a rebasing loader for code chunks of any kind (even for a real simple format I'd "design" myself on the fly) in assembly almost right away and started looking at the possibility to use C for this. Using compiled code for patching Watcom-made game binaries is something I had really wanted to make work earlier already but I gave up on it because I had only been looking at compiler and not linker output, and that format looked just disgusting.

Well, now I finally looked at the linker output instead and in the end it turned out it isn't difficult at all to make this work at least on the level seen in the previous post. It's really fortunate that Open Watcom can make PE DLLs if you select the win95 or nt target, it's such a nice and easy format to parse compared to OMF32 or LE/LX, and then I just had to figure out ways to work around the fact that I don't have any kind of runtime dynamic rebasing loader yet in the C code. I cannot really replace ingame functions from compiled code yet though, because how would I pass the pointers to the vars/funcs structs to some code that isn't called from the start function without resorting to doing something silly, like storing them to some fixed address in low DOS memory, video memory or elsewhere where I could maybe get away with it, for later access?

edit: well, a few mins after posting this I realized that I could of course go into the stack in my new function when it's called by the game to get some pointers to known locations, and proceed from there... lol, maybe I'll go code that next

What I have already is enough for writing a rebasing loader much easier than using pure asm though, maybe for the simple format first and DLLs later, or maybe I'll just attempt DLLs right away.

Edited December 7, 2018 by xttl

Linguica · December 7, 2018

I'm not sure if I ever posted it, but several months ago I was playing around with trying to disassemble / poke at the original EXE. Here's what I have from as far as I got.

Quote

address=0x3D830; length=84; funcname="P_GunShot"; exe=~/Desktop/doom2/DOOM2.EXE; printf -v offset "%d" $((address + 0x32014)); xxd -s $offset -o-0x32014 -l $length -g 1 -u $exe; dd bs=1 skip=$offset count=$length if=$exe > ${funcname}.bin; ndisasm -u -o $address ${funcname}.bin > ${funcname}.asm

address=0x3D830; length=84; funcname="P_GunShot"; exe=~/Desktop/doom2/DOOM2.EXE; printf -v offset "%d" $((address + 0x32014)); printf "USE32\nCPU 386\n" > ${funcname}.asm; ndisasm -u -o $address ${funcname}.bin | tr A-Z a-z | gsed -r 's/^[0-9a-f]{3}([0-9a-f]{5}). ([^ ]*) *(.*)/l_0x\1:\t\3\t; \2/' | gsed -r 's/j[a-z]{1,2} /&l_/' | gsed -r 's/(^.*call).*; \w{2}(\w{2})(\w{2})(\w{2})(\w{2}).*/\1 \$+5+0x\5\4\3\2/' >> ${funcname}.asm; nasm ${funcname}.asm -w-number-overflow -O3 -o tempfile; if [ "$(md5 -q ${funcname}.bin)" == "$(md5 -q tempfile)" ]; then echo "Recompiled binary is identical to the original"; else echo "Recompiled binary is not the same"; fi; rm tempfile;

address=0x3D830; length=84; funcname="P_GunShot"; exe=~/DOOMS/DOOM2.EXE; cp $exe ~/DOOMS/; nasm ${funcname}.asm -O3 -o ${funcname}.bin; dd if=${funcname}.bin of=$exe bs=1 count=$length seek=$offset conv=notrunc

kb1 · December 8, 2018

@xttl Let me see if I can understand your difficulties and your system:

Pre-patching the exe is difficult, except for tiny patches: Since you don't know where the OS will relocate the code, you cannot use long jumps - all jumps must be relative, and therefore, nearby, making it impossible to add large functions, or subroutines called absolutely.
Patching at runtime via Stage2.bin gives you absolute addresses, but you still must figure out where the loader put the Doom code. If you can get the Doom code to call your patch, you can pop the return address off the stack, and determine where the caller is, but, to get the caller to call your patch, you need to know where the caller is. Catch-22!

You said that when stage2.bin starts, it has the location of the Doom code - how do you accomplish this? You also said that upon stage2.bin running,

EDX contains the relocation offset for game code

EBX contains it for game data

Can all game code be directly referenced as an offset of EDX? I thought there were lots of relocation entries, perhaps one for every Doom module. Can EDX be used to find all game code? If so, that's beautiful, and it makes much more sense to patch post-relocation.

So, basically, you patched the exe to get it to load stage2.bin, then you pop (and then re-push) the return address off the stack and place the address in EDX, and, finally you call stage2.bin which returns to the Doom code you patched. Is this what you're doing? If so, sounds like a nice system, indeed!

xttl · December 8, 2018

Relative call/jmp works with addresses up to +/- 2GB, so regular function calls are not a problem at all. Function pointers need to use absolute addressing though, as do all references to the data section.
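For reference, the arithmetic behind a relative call can be sketched in Python like this. Opcode E8 takes a signed 32-bit displacement measured from the end of the 5-byte instruction, so the encoded bytes stay the same when the caller and the target move together. The call site and the relocation offset are arbitrary example values; the target is the printf_ address as a disassembler lists it.

```python
import struct

def encode_call_rel32(call_addr, target_addr):
    """Encode `call target` (opcode E8) for an instruction located at call_addr."""
    rel = target_addr - (call_addr + 5)      # displacement is relative to the next instruction
    if not -2**31 <= rel < 2**31:
        raise ValueError("target out of rel32 range")
    return b"\xE8" + struct.pack("<i", rel)

CALL_SITE = 0x20000    # hypothetical location of the call inside the code section
PRINTF = 0x42D40       # printf_ as listed in the disassembly
CSRELO = 0x230000      # hypothetical relocation offset

unrelocated = encode_call_rel32(CALL_SITE, PRINTF)
relocated = encode_call_rel32(CALL_SITE + CSRELO, PRINTF + CSRELO)
print(unrelocated.hex(), unrelocated == relocated)
```

This is also why code loaded into a malloced buffer can't reuse such a call as-is: the buffer doesn't move together with the code section, so the displacement has to be computed from the actual runtime addresses.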

The sources posted on the previous page should explain everything pretty well, but here's a summary:

exe is prepatched with some new code (138 bytes, asm source is on previous page) and a call to there on startup, the call replaces part of the press beta's date&password check code and the rest I just skip over, so no need for fakedate.com and those batch files anymore.
this code overwrites an unused function in the beta exe (D_TimingLoop which does a thing that's very similar to the timerefresh console command in quake if you know it, rotates the view 360 degrees once and then drops to dos with some timing info displayed)
I modified the LE fixup table to remove* any fixup records (3 total) which would otherwise cause dos4gw to overwrite parts of this code, because D_TimingLoop references some global vars and a string from the data section
prepatched new code uses open/read/close/filelength/malloc code already present in the original exe, no problems here because the relative call instruction works with addresses up to +/- 2GB like mentioned previously. loads a file called "stage2.bin" from the same directory on disk to a malloced buffer, and does an absolute call to the buffer's address
call $+5 calls the instruction right after it (wherever it happens to be in memory), so if you write pop <register> after the call you can get the current EIP value in that register no matter where the code is currently located in memory, no need to peek into the stack deeper than that (nothing I posted on previous page peeks into the stack to get pointers)
then you can subtract from that value the address which the pop instruction following the call would have been at if it weren't relocated, and now you have the code section relocation offset, you can use this knowledge to address anything located in the code section
read a known data pointer value from somewhere in the code section, subtract from it the expected unrelocated memory address of that data, and now you also know the data section's relocation offset
you know what address malloc just gave you for the buffer you loaded stage2.bin into, that's the base address for all code and data loaded from that external file.
then just put or keep these values in the registers before directing execution flow into the buffer (btw. I originally chose ESI = cs offset, EDI = ds offset, EDX = stage 2 base, but changed these after I started experimenting with compiled code because EAX, EDX and EBX are the registers watcom's calling convention uses)
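The two subtractions in the steps above can be sketched in Python as follows. All concrete addresses except the two taken from the disassembly (the debug string at 0x583EC and printf_ at 0x42D40) are invented for the example, including where the pop instruction sits.

```python
def relocation_offsets(popped_eip, unrelocated_pop_addr,
                       pointer_from_code, unrelocated_data_addr):
    """Derive the code and data relocation offsets the way the stage1 loader is described to."""
    csrelo = (popped_eip - unrelocated_pop_addr) & 0xFFFFFFFF
    dsrelo = (pointer_from_code - unrelocated_data_addr) & 0xFFFFFFFF
    return csrelo, dsrelo

UNRELOCATED_POP = 0x1F105      # hypothetical: where the pop would be without relocation
UNRELOCATED_STRING = 0x583EC   # a known data address as shown in the disassembler

# Values the loader would see at runtime (hypothetical):
popped_eip = 0x0024F105         # result of call $+5 / pop
pointer_from_code = 0x003683EC  # the same string pointer after DOS4GW fixed it up

csrelo, dsrelo = relocation_offsets(popped_eip, UNRELOCATED_POP,
                                    pointer_from_code, UNRELOCATED_STRING)
printf_addr = csrelo + 0x42D40  # any disassembler address plus the offset
print(hex(csrelo), hex(dsrelo), hex(printf_addr))
```

The data offset only works out because the loader has already applied its fixups by the time this code runs, so the pointer embedded in the code section is the relocated one.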

"Can all game code be directly referenced as an offset of EDX?" <- yes, just add the address you see in IDA or some other disassembler or listing which shows addresses the same way. For example, if the first byte of the code section was at 0x10000 (its preferred load address), then exit and printf from libc would be at:

cseg01:000436CE exit_
cseg01:00042D40 printf_

If the first byte of the data section was at 0x50000 (preferred), then this sound debug string from the original game code would be at:

dseg02:000583EC aSomethingSFuck db 'Something',27h,'s fucked.  Distance to sound < 0.',0

So you could write a stage2.bin which uses printf from the game's libc to print that string from the game, a string from stage2.bin itself and then the libc exit function like this:

BITS 32
org 0

lea ecx, [edx + 0x42d40] ; printf
lea esi, [eax + text]    ; text from this binary
push esi                 ; string pointer for printf
call ecx                 ;
add esp, 4               ;

lea esi, [ebx + 0x583ec] ; "Something's fucked. Distance to sound < 0."
push esi                 ;
call ecx                 ; still points to printf
add esp, 4               ;

lea ecx, [edx + 0x436ce]          ; exit
mov eax, -1                       ; program exit code
jmp ecx                           ; never returns

text: db "hello from stage2",0x0a,0

(why exit? well I_Quit / I_Error try to do some deinitialization IIRC which is not a good idea at this point in startup)

Relocation is done per-section, (not per-page or anything crazy like that luckily, ~~that wouldn't work very well with code anyway because it's better to use relative jumps/calls whenever possible~~ ^{of course you can always rewrite relative pointers too, I was already doing that to patch in calls in the loader when I wrote that but I still keep forgetting it... see, this is why I don't do this as a job :D}) so because there are only 2 relevant sections in the executable you really just need 2 offsets to address everything.

* really I just made them all clones of another fixup record which fixes up an address nearby, I didn't want to use the tool I made which always regenerates the whole fixup table because that would cause a huge bindiff, while manually hex edited this way it's only a 9 byte difference to the original file

I am too tired right now, but I'll write more tomorrow. I almost just got replacing a function in the game with compiled C code from the loader working, but there's some strange problems with it and I can't figure out why it's not working at least until I sleep.

Edited December 8, 2018 by xttl : a couple of small fixes/additions

kb1 · December 8, 2018

@xttl: Thanks! For some reason, your last post really explains it in a way I can more easily understand. I know writing that post (and the posts before it) was a lot of work, and I appreciate it.

This is a really neat system. I really want to write a program that breaks down the structure of these types of EXEs, in a way that easily explains all of the runtime relocations, and maybe provides a way to streamline hacks like the work that you're doing. Your process provides a method to bypass the need for most of that, but, as you said, the EXE format is over-complicated, and I'd really want a better understanding of it.

Speaking of which, I am going to need to study what you've written and done a lot more, before I can claim to really understand it on a deep level, but, from what I've read so far, this is very slick, man. Who says you aren't a Hacking Wizard? :)

@xttl and @Quasar :

Your posts are always interesting, and you're doing very cool stuff! Quasar deserves some credit here too. The beta is a really neat piece of history, and the work you guys have done is really turning it into a new, fully playable game, which is fascinating! Thank you both.

xttl · June 16, 2020

Here's a video clip.

I don't think there's a way I can get video preview working without making an account for this at Youtube/Vimeo? It's only about 49-50 seconds (23-24MB) in its entirety anyway.

What is happening:

I wrote a program which allocates a buffer in memory for each section of a LE file, copies the sections to the buffers, uses the LE fixup table to fixup all non-relative pointers (since the buffers can be at any address, the OS decides), and then just jumps into the LE code section with a custom exception handler installed and stack pointer set properly.

Idea here is: most of the code from 32-bit protected mode DOS games can be executed as-is on any modern x86-64 OS which still supports 32-bit programs. Only the parts which access DOS, DPMI, BIOS, etc. APIs or hardware directly do not work.

I actually jump directly into the C main function, not the actual LE entrypoint which goes into Watcom C startup code, because I'm not really interested in emulating or faking all the oldschool APIs and hardware required. I've also patched some functions from the LE to "escape" into the loader's exception handler using bytes CD FF ## ## C3 (int 0xFF, db ##, db ##, ret). This way I can handle some of the libc functions used in the game (which don't work anymore due to missing old APIs and startup code being skipped), and graphics and timing (and later: sound, input, etc.) inside my loader instead.
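A Python sketch of how such an escape stub could be generated and recognised. It assumes the two ## bytes form a little-endian 16-bit function id, which is a guess, and the handler table is invented; the real loader does this in its exception handler with the faulting EIP taken from the exception context.

```python
import struct

def make_escape_stub(func_id):
    """int 0xFF ; dw func_id ; ret  ->  CD FF lo hi C3"""
    return b"\xCD\xFF" + struct.pack("<H", func_id) + b"\xC3"

def decode_escape(code, eip):
    """Return (func_id, resume_eip) if eip points at an escape stub, else None."""
    if bytes(code[eip:eip + 2]) != b"\xCD\xFF":
        return None
    (func_id,) = struct.unpack_from("<H", code, eip + 2)
    return func_id, eip + 4      # resume at the ret that follows the two id bytes

# Hypothetical ids (assumption); the real table is not shown in this thread.
HANDLERS = {0x0001: "malloc", 0x0002: "I_SetPalette"}

code = bytearray(b"\x90" * 16)         # toy code section filled with nops
code[8:13] = make_escape_stub(0x0002)  # overwrite the start of a "function"
func_id, resume = decode_escape(code, 8)
print(HANDLERS[func_id], resume, hex(code[resume]))
```

Resuming at the trailing ret means the patched function returns to its caller normally once the handler has done the work on the loader side.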

Currently, the patches need to be done by hand (hence "hack.exe" in the video) but I plan to make it patch the code automatically in memory.

It didn't even take that long to get Heretic v1.3 running and playing demos like this (of course with no sound and no way to enter input). I didn't start with Doom because it uses unchained VGA mode (needs a little bit more work to patch to the point where graphics starts working) and because there's that handy TIC.MAP included with the Raven source release.

Not sure how much more I'll work on this and on what schedule, and which games I will support, but just thought I'd share a video clip of it running Heretic's DEMO2. (I've watched a couple of longer demos in their entirety from Heretic-N and they work too, of course no reason why they wouldn't really). The reason I even started tinkering with this is to possibly be able to play DOS FPS games without the slight mouse lag DOSBox always has. (otherwise it works well enough even on C2D hardware that's over 10 years old at this point...)

If I work on it more I'll make it also run in Linux at least, maybe FreeBSD too (as for MacOS it seems even they kept some kind of support for running 32-bit code for Wine/Crossover specifically, but I don't even own any Macs, and seems they require code signing for using that capability without disabling SIP...).

The game looks more choppy in the video than it really runs because my old laptop can't handle capturing video at 1920x1080 properly, or because I tried to capture at 35fps while my monitor refresh rate is 60Hz (or both).

tl;dr: it's old heretic.exe v1.3 running directly on 64-bit Windows Server 2019 (in a 32-bit process)

^{and yeah I need to change this to not abuse exceptions like this at some point, I initially went for this approach because it didn't need any inline assembly (I don't think it's possible to have custom calling conventions without a customized C compiler otherwise?)}

Edited June 17, 2020 by xttl

axdoomer · June 17, 2020

This is an awesome project. I wanted to do the same, but I don't have a full grasp of the DOS LE format or how I would run it on a 64-bit system.

At some point, after doing such a loader, I would have moved forward and done one for Xbox executable files (XBE). There's a separate "doom.xbe" executable on the Doom3:RoE disc and there's also this version that I would use for testing.

Quote

most of the code from 32-bit protected mode DOS games can be executed as-is on any modern x86-64 OS which still supports 32-bit programs.

What limits Heretic.exe from running inside a 64-bit program?

Is the limitation only because Heretic calls libc functions using the Watcom convention (eax, edx, ebx, ecx), whereas a 64-bit loader would expect rax, rdx, rbx, rcx? I'm not sure how extra arguments which would be put onto the stack would be handled. As for e?x vs r?x, I'm sure the code can be easily made to ignore the upper 32 bits of the registers. I don't think this would cause problems, would it?

xttl · June 17, 2020

Yeah, the LE format seems kinda overcomplicated and annoying compared to eg. PEs. :-) I assume you've already seen lxexe.doc or lxomf.pdf?

The first and only time I had to make a little modification in a relocatable PE (DLL from an old Windows game) I remember thinking: "this is so much simpler and saner, I already got it" pretty soon after looking at the specs from MSDN and the output from dumpbin /relocations, though right now I don't remember the details anymore.

If you only want to read the section contents and assume the sections in the file are stored without holes until the last initialized page of the last section (in all DOS4GW games I've seen so far they are), it's possible to take shortcuts and ignore the page table crap, but unless you can somehow always allocate buffers at the preferred section base addresses from the header you'll still need to handle the fixup page and record tables correctly. (not all record types though, for most games types 5 and 7 suffice, though I've just now run into one Watcom LE which uses type 8)
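A Python sketch of what applying two of those record types to section buffers could look like. The source type numbers follow my reading of the LX documentation (7 = 32-bit offset, 8 = 32-bit self-relative offset); parsing of the actual fixup page and record tables is left out, type 5 is not handled, and the function name, buffer sizes and base addresses are made up.

```python
import struct

SRC_OFFSET32 = 7     # 32-bit offset fixup: store the target's absolute address
SRC_SELFREL32 = 8    # 32-bit self-relative fixup: store target minus end of the field

def apply_fixup(buffers, bases, src_type, src_obj, src_off, tgt_obj, tgt_off):
    """Patch one already-decoded fixup into the buffer of the source object."""
    target = bases[tgt_obj] + tgt_off
    if src_type == SRC_OFFSET32:
        value = target
    elif src_type == SRC_SELFREL32:
        value = target - (bases[src_obj] + src_off + 4)
    else:
        raise NotImplementedError(f"source type {src_type}")
    struct.pack_into("<I", buffers[src_obj], src_off, value & 0xFFFFFFFF)

# Two toy sections; object numbers are 1-based as in LE files.
buffers = {1: bytearray(16), 2: bytearray(16)}
bases = {1: 0x00240000, 2: 0x00360000}   # hypothetical addresses the OS handed out

apply_fixup(buffers, bases, SRC_OFFSET32, 1, 4, 2, 0x83EC)   # code refers to data
apply_fixup(buffers, bases, SRC_SELFREL32, 1, 9, 1, 0)       # relative reference inside code
print(buffers[1].hex())
```

With buffers at arbitrary addresses every type 7 record changes, which is the reason the fixup tables can't be skipped unless the sections land exactly at their preferred bases.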

The main reason I didn't attempt to run the game inside a 64-bit process was that I'd need the virtual addresses of the buffers I allocate to be somehow guaranteed to be always below 2GB/4GB.

If I understood the explanation of VirtualAlloc on MSDN correctly, it seems I can request specific virtual addresses, I have to try that... I'd actually prefer to just get random addresses at anywhere under 2GB since I already spent the effort to handle the fixup tables. :-) Seems man mmap on both Linux and FreeBSD says they support a MAP_32BIT flag so there's that solved.

update: Well, I just found out that MSVC apparently does not support inline assembly at all in 64-bit programs. That sucks. It's onto either MinGW or using separate .asm files then. And MS Defender thinks that every exe file compiled by MinGW inside WSL is a trojan by default, that's just awesome. Thankfully they still at least allow disabling that without jumping through a million hoops.

Also, now that I thought to actually check it, it seems that eg. push rax and push eax have the same opcode (0x50) in 64-bit and 32-bit modes respectively (and push eax simply does not exist in 64-bit mode at all). 64-bit pushes obviously take more space from the stack and move the stack pointer further... and the games like to do things like mov eax, [esp+###] at times, and also after every variadic function call the caller will have to adjust ESP.
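A toy Python model of why the wider pushes break such code. Nothing here executes x86; the stack is a dictionary, and the three pushed values and the hardcoded displacement are invented for illustration.

```python
def read_first_push(slot_size):
    """Push three values, then read back with a displacement compiled for 4-byte slots."""
    stack = {}
    sp = 0x1000
    for value in (0x11, 0x22, 0x33):   # push eax / push ecx / push edx (opcodes 50 51 52)
        sp -= slot_size                # 4 bytes per push in 32-bit mode, 8 in 64-bit mode
        stack[sp] = value
    return stack.get(sp + 8)           # mov eax, [esp+8] with the 8 baked into the code

print(hex(read_first_push(4)))   # 32-bit mode: finds the first pushed value
print(hex(read_first_push(8)))   # 64-bit mode: same bytes now land on the second one
```

The same mismatch applies to `add esp, N` after variadic calls, since N was computed by the compiler for 4-byte arguments.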

So I'll give up on 64-bit for the moment.

At least on Linux there's modify_ldt and perhaps something could be cooked up using that, but it's getting into territory too advanced for me. :)

Edited June 18, 2020 by xttl

Redneckerz · June 18, 2020

@xttl Do you happen to have F2DoomPP from this March 1, 2016 post laying around still? The original unfortunately is defunct.

esselfortium · June 18, 2020

On 6/16/2020 at 12:52 PM, xttl said:

Here's a video clip.

Is it just me or is no video appearing?

xttl · June 18, 2020

10 hours ago, esselfortium said:

Is it just me or is no video appearing?

There was a link to a MP4 file there but it went away at some point. I thought a mod might have removed it for some reason (I guessed perhaps because video players have had exploitable holes in the past so in theory it could have contained an exploit!*), but it's always possible I just screwed up myself while updating the post.

^{* I'm really not even near the skill level to develop my own useful 0days for chrome/firefox/vlc, etc., if I were I'd probably have a lot more of money and success in life right now and wouldn't be here wasting my time with silly stuff like this ;-) OTOH, people sometimes keep using massively outdated software...}

axdoomer · June 19, 2020

I uploaded a copy on Youtube. "Luckily", I happened to have downloaded it because of Internet issues. I don't think Youtube would have any vulnerabilities or remote code executions on people's PCs. (If anyone has a problem with the video being online, please DM me)

@xttl Will you be releasing your loader's source code at some point?

I would upload a backup copy somewhere too, because sadly most of your hacks (like F2DoomPP) went missing after the links died. @Redneckerz and I are keeping some of your previous hacks safe on GitHub and an FTP server.

xttl · June 19, 2020

Yeah I'll probably release the sources at some point. I'm currently in the process of making it look a bit nicer and converting to C++ (this is the first time I'll be actually using classes and stuff because it's not like I'm a professional programmer or anything, haha, just felt I could probably make it a lot neater and easier to add eg. support for more EXE formats at some point; DJGPP games used COFF and I wouldn't be surprised if there are some DOS games in PE executables out there somewhere...)

I'll try to have presentable C++ sources which can run at least Heretic v1.3 and Hexen v1.1 in a few weeks-months (gotta give myself time to procrastinate ;) and let's say I just have some long-running issues in life) but I'll just dump the big C mess which has lots of hardcoding for Heretic v1.3 + prepatched Heretic v1.3 binary I have right now if I lose interest. Currently keyboard+mouse work and I also changed the screen buffer upscaler/drawer to use SDL2 renderer+textures so it's a lot faster now (and aspect ratio is correct). The game just hangs (no crash) for some reason if I try to start a game from the menu, complete a map or use the engage## cheat. Only -warp from command line is ok for getting into gameplay. (in fact I could consider dumping these for you privately right now if you absolutely insist)

Btw. I found out today that the Japanese version of Wolfenstein 3D for PC-98 machines is actually a 32-bit pmode program unlike the releases we got. It uses the same (Japan-only) DOS extender PC-98 Dooms have which is called DX386. It's even compiled with Watcom C, but the EXE format is not LE. The header magic bytes are "P3" which seems to belong to a format originally invented for Phar Lap's DOS extenders (according to info from the web). I found some specs for it, but unfortunately it seems they built the binary without relocation support and the EXE is supposed to be loaded with starting address 0. :( The lowest address I get a buffer from VirtualAlloc() on Windows is 0x10000, and even that allocation does not reliably succeed every time. I could try to run it with some bytes cut off from the beginning, but 0x10000 is too much.

Perhaps I'll try to figure out a way to check if there's truly no relocation happening (there's something that looks like it could be a table of some sort just before the P3 header, inside the extender's MZ, and when I run the game in a PC98 emulator the extender actually says it loaded the program starting at 0x100000), but it's not helping that the only version of Neko Project with debugging features I found crashes whenever I start Wolf98. (it works in the latest non-debug Neko Project 21/W)

xttl · June 22, 2020

I got Doom Press Beta running on the Loader now, though sometimes it hangs on startup (gotta investigate more).

Also, VGA palette and pixel writes are handled the proper way (watching for VGA port writes and writes to memory 0xA0000... in the exception handling code). No more need for I_SetPalette / I_InitGraphics hooks.
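A rough sketch of what the "watch the ports and the 0xA0000 window" part could look like once the exception handler has decoded a trapped access. This is not the Loader's actual code: the struct and function names are made up for illustration, and only the standard VGA DAC behaviour is assumed (index written to port 0x3C8, then red/green/blue 6-bit values written to port 0x3C9 with auto-increment).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical emulated VGA DAC state; all names here are illustrative. */
typedef struct {
    uint8_t rgb[256][3];  /* 6-bit components exactly as the game wrote them */
    uint8_t write_index;  /* palette entry currently being written */
    uint8_t component;    /* 0 = red, 1 = green, 2 = blue */
} vga_dac;

/* Handle one trapped byte-sized OUT instruction.
   Returns 1 if the port is a DAC port emulated here, 0 otherwise. */
int vga_dac_out(vga_dac *dac, uint16_t port, uint8_t value)
{
    assert(dac != NULL);
    switch (port) {
    case 0x3C8:                       /* DAC write index register */
        dac->write_index = value;
        dac->component = 0;
        return 1;
    case 0x3C9:                       /* DAC data register */
        dac->rgb[dac->write_index][dac->component] = value & 0x3F;
        if (++dac->component == 3) {  /* after blue, move to the next entry */
            dac->component = 0;
            dac->write_index++;       /* uint8_t wraps from 255 back to 0 */
        }
        return 1;
    default:
        return 0;
    }
}

/* Expand a 6-bit DAC component to 8 bits for a host-side texture. */
uint8_t vga_expand6(uint8_t c)
{
    return (uint8_t)((c << 2) | (c >> 4));
}

/* Map a faulting linear address to an offset into a 64 KiB shadow
   framebuffer, or return -1 if it is outside the 0xA0000 window. */
long vga_mem_offset(uint32_t addr)
{
    return (addr >= 0xA0000u && addr < 0xB0000u) ? (long)(addr - 0xA0000u) : -1;
}
```

The exception handler would call vga_dac_out() for trapped port writes and vga_mem_offset() for trapped memory writes, then skip over the faulting instruction. That round trip through the handler is where the extra cost mentioned in the post comes from.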

This is slower (causes lots of context switching due to exceptions triggering), so I might later put some game-specific hooks back and keep these as a fallback, though I don't notice it at all even on this old slow laptop with power saving settings maxed.

Next: unchained mode / pageflipping support (Doom release ver.)

Redneckerz · June 22, 2020

1 hour ago, xttl said:

I got Doom Press Beta running on the Loader now, though sometimes it hangs on startup (gotta investigate more).

Also, VGA palette and pixel writes are handled the proper way (watching for VGA port writes and writes to memory 0xA0000... in the exception handling code). No more need for I_SetPalette / I_InitGraphics hooks.

This is slower (causes lots of context switching due to exceptions triggering), so I might later put some game specific hooks back, and keep these as a fallback though I don't notice it at all even on this old slow laptop with power saving settings maxed.

Next: unchained mode / pageflipping support (Doom release ver.)

... Jesus christ. Together with the Hexen/Heretic stuff, this is something else. Seeing that early code run on a modern system is quite the badass coding.

What's next? Press Beta as a moddable source port? I haven't forgotten your Newdhack work, and i am so glad to see you return to early Doom stuff.

xttl · June 22, 2020

10 hours ago, Redneckerz said:

... Jesus christ. Together with the Hexen/Heretic stuff, this is something else. Seeing that early code run on a modern system is quite the badass coding.

What's next? Press Beta as a moddable source port? I haven't forgotten your Newdhack work, and i am so glad to see you return to early Doom stuff.

Well, it was really easy after getting Heretic/Hexen running because all the offsets are known already (there's both the mapfile Quasar posted a long time ago, and the exe itself which contains debug symbols) and because it doesn't use unchained mode.

Press beta is kinda moddable already, it supports -file. Even all map slots work if you create a wad with a map which does not exist in doompres.wad.

You just need to delete REJECT because it doesn't know about it and it'll mess up loading BLOCKMAP (because everything in a map is loaded by getting the index # of the ExMy/MAPxx lump and adding something to it). Texture/patch format is slightly different.
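To illustrate why a stray REJECT lump breaks things, here is a small sketch of "marker index plus fixed offset" lookup. The release-Doom lump order in the enum is the standard one; the press beta offset (BLOCKMAP directly after SECTORS) is an assumption inferred from the description above, and the helper names are hypothetical. Real WAD directory names are 8-byte fields rather than C strings; plain strings are used here to keep it short.

```c
#include <assert.h>
#include <string.h>

/* Lump offsets relative to the ExMy/MAPxx marker in release Doom WADs. */
enum { ML_LABEL, ML_THINGS, ML_LINEDEFS, ML_SIDEDEFS, ML_VERTEXES, ML_SEGS,
       ML_SSECTORS, ML_NODES, ML_SECTORS, ML_REJECT, ML_BLOCKMAP };

/* Assumption: a loader that does not know REJECT expects BLOCKMAP
   right after SECTORS. */
enum { BETA_ML_BLOCKMAP = ML_SECTORS + 1 };

/* Return the directory index of the first lump called `name`, or -1. */
int find_lump(const char *const *dir, int count, const char *name)
{
    assert(dir != NULL && name != NULL);
    for (int i = 0; i < count; i++)
        if (strcmp(dir[i], name) == 0)
            return i;
    return -1;
}

/* Return the name of the lump `offset` entries after the map marker,
   or NULL if the marker is missing or the offset runs off the end. */
const char *lump_at(const char *const *dir, int count,
                    const char *marker, int offset)
{
    int base = find_lump(dir, count, marker);
    if (base < 0 || base + offset >= count)
        return NULL;
    return dir[base + offset];
}
```

With a release-style map, lump_at(dir, n, "E1M1", BETA_ML_BLOCKMAP) lands on REJECT, so the loader would parse reject data as a blockmap. After deleting REJECT, the same offset lands on BLOCKMAP.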

The endgame is to run some executables not related to Doom at all, but compiled with the same compiler and in the same exe format. Like Blood and Dark Forces.

Redneckerz · June 22, 2020

16 minutes ago, xttl said:

Well, it was really easy after getting Heretic/Hexen running because all the offsets are known already (there's both the mapfile Quasar posted a long time ago, and the exe itself which contains debug symbols) and because it doesn't use unchained mode.

Press beta is kinda moddable already, it supports -file. Even all map slots work if you create a wad with a map which does not exist in doompres.wad.

You just need to delete REJECT because it doesn't know about it and it'll mess up loading BLOCKMAP (because everything in a map is loaded by getting the index # of the ExMy/MAPxx lump and adding something to it). Texture/patch format is slightly different.

The endgame is to run some executables not related to Doom at all, but compiled with the same compiler and in the same exe format. Like Blood and Dark Forces.

I am sure you are aware of the Force Engine regarding Dark Forces?

Regarding Press Beta - I do believe it's really useful to have this running in a state where it can actually be playable outside its DOS and development constraints - and I reckon Quasar or Fraggle may be interested in this stuff as well.

Edited June 22, 2020 by Redneckerz

xttl · June 22, 2020

10 hours ago, Redneckerz said:

I am sure you are aware of the Force Engine regarding Dark Forces?

Regarding Press Beta - I do believe it's really useful to have this running in a state where it can actually be playable outside its DOS and development constraints - and I reckon Quasar or Fraggle may be interested in this stuff as well.

Yeah, I just found out about that (Force Engine) earlier today. Cool project, I'll need to try it later. Makes it much less useful to have Dark Forces supported, but I might still try to do it just for fun. Even for Blood there's NBlood (based on EDuke32) nowadays which I've also not tried, but probably isn't too bad at all. (I know I don't mind playing Duke3D using EDuke)

Redneckerz · June 22, 2020

1 hour ago, xttl said:

Yeah, I just found out about that (Force Engine) earlier today. Cool project, I'll need to try it later. Makes it much less useful to have Dark Forces supported, but I might still try to do it just for fun. Even for Blood there's NBlood (based on EDuke32) nowadays which I've also not tried, but probably isn't too bad at all. (I know I don't mind playing Duke3D using EDuke)

Raze also supports NBlood and Shadow Warrior (through VoidSW) :)

axdoomer · June 26, 2020

Hi everyone. I've been porting LE Loader to Linux. Here's a screenshot of it running Hexen on Debian.

Screenshot_2020-06-25_20-39-18.png

Redneckerz · June 26, 2020

17 hours ago, axdoomer said:

Hi everyone. I've been porting LE Loader to Linux. Here's a screenshot of it running Hexen on Debian.

.... I am speechless.

LE Loader looks to be an amazing project to get DOS Heretic/Hexen and Doom Press Beta running natively on W10 and now Linux*

*I assume that DOS Heretic/Hexen are meant to run here.

BTW, and this goes for @xttl: if any of this stuff needs documentation or the like, please let me know. This is some proper haxxor level stuff.

xttl · July 4, 2020

Want to use the official free version of IDA instead of one of the various unofficial releases to look at LE files? (assuming you could even get a good, recent, fully functional unofficial version for your OS of choice...)

Or what about Ghidra, even if nobody has compiled the 3rd party LE loader plugin for the latest version, and you don't want to bother with setting up the dev environment required for compiling Ghidra plugins? (I know I don't :-) It seems editing the version number in extension.properties inside the zip is enough to make it work in 9.1.2.

Solution: use the first thing attached to this post (le2bin). Should compile & run ok on Windows & Linux at least.

It converts a LE to a single "flat" file, which can be imported as a raw binary into IDA or Ghidra, no plugins required. Fixup records are applied so the disassembler can find xrefs properly. It's even ok to use the newest freeware IDA versions (which only include ida64), because you can still import the raw binary file as 32-bit! (It creates a 32-bit code segment/section for it and everything seems to work just as it should. It has to support this because some real world binaries can mix 32/64 bit code; apparently some malware has even used this as an obfuscation technique in the past.)

By default, le2bin uses the preferred memory base addresses from the LE section table to decide where to place sections inside the generated flat binary, but you can add parameter /sb on command line to make it use the original exe offsets instead. (basically you can choose whether you want to see the same addresses in your disassembly listing as someone who uses a proper LE loader plugin would see, or if you want to see addresses which you can directly go try and patch in the original exe with a hex editor)
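As a sketch of the idea (not le2bin's actual source), placing objects and applying a 32-bit offset fixup could look roughly like this. The structs are hypothetical and heavily simplified: real LE fixup records are stored per page, come in several source/target types, and use 1-based object numbers. It is also an assumption that in the /sb-style layout the patched pointer values follow the file-offset placement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one LE object (section). */
typedef struct {
    uint32_t base;      /* preferred memory base from the object table */
    uint32_t file_off;  /* where the object's data starts in the exe */
    uint32_t size;      /* size of the object in bytes */
} le_object;

/* Hypothetical, simplified 32-bit offset fixup (0-based object indices). */
typedef struct {
    uint32_t src_obj, src_off;  /* location that gets patched */
    uint32_t dst_obj, dst_off;  /* location the pointer refers to */
} le_fixup;

/* Position of an object inside the flat image: its preferred base by
   default, or its original exe offset when use_file_offsets is set. */
uint32_t flat_pos(const le_object *o, int use_file_offsets)
{
    return use_file_offsets ? o->file_off : o->base;
}

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Write the absolute address of the fixup target at the fixup source.
   Returns 0 on success, -1 if an index or the patch site is out of range. */
int apply_fixup(uint8_t *image, size_t image_size,
                const le_object *objs, size_t nobjs,
                const le_fixup *f, int use_file_offsets)
{
    assert(image != NULL && objs != NULL && f != NULL);
    if (f->src_obj >= nobjs || f->dst_obj >= nobjs)
        return -1;
    uint32_t where = flat_pos(&objs[f->src_obj], use_file_offsets) + f->src_off;
    uint32_t value = flat_pos(&objs[f->dst_obj], use_file_offsets) + f->dst_off;
    if ((size_t)where + 4 > image_size)
        return -1;
    put_le32(image + where, value);
    return 0;
}
```

Because the pointers stored in the image then agree with where the objects actually sit in the flat file, a disassembler that loads it as a raw binary at address 0 can resolve xrefs without knowing anything about LE.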

Second attachment includes symbols for heretic.exe v1.3 and press beta newdoom.exe in IDA IDC script and Ghidra ImportSymbolsScript.py text file formats (both in mem and on disk variations)

The fixup "parser" still only supports very few record types & flags, so many (especially newer) Watcom binaries can't be converted yet, but at least all Doom-related binaries should work!

^{(except for alpha 0.2 and 0.3 because they use a completely different extender and executable format, and haven't even been compiled using Watcom C)}

left: Ghidra with sections at original exe file offsets

right: IDA Freeware with sections at preferred load addresses

disasm1.png disasm2.png

le2bin.zip

symbols_pack.zip

Also, you will need to use this to add support for the Watcom calling convention into Ghidra (again, editing extension.properties makes it work in 9.1.2 at least). Then you'll need to manually edit the signatures of all variadic functions (printf, I_Error, etc.) to change the calling convention to _cdecl and check varargs.

Edited July 4, 2020 by xttl

axdoomer · July 5, 2020

11 hours ago, xttl said:

It seems editing the version number in extension.properties inside the zip is enough to make it work in 9.1.2.

Also, you will need to use this to add support for the Watcom calling convention into Ghidra (again, editing extension.properties makes it work in 9.1.2 at least). Then you'll need to manually edit the signatures of all variadic functions (printf, I_Error, etc.) to change the calling convention to _cdecl and check varargs.

Very interesting, I'll check this out. Open Watcom is still used today to compile binaries (I even believe it can compile Chocolate-Doom), so if it breaks Ghidra's decompiler, I may do a pull request to add support for this.

xttl · July 5, 2020

Btw., when using the loader plugin you always need to uncheck "show only recommended language/compiler specs" when importing exe because otherwise it only allows selecting gcc as compiler. You really want watcall as the global default calling convention, then change the few variadic functions manually to use cdecl.

I think this can't be changed without recompiling. :( LXLoader.java contains a line like this:

return List.of(new LoadSpec(this, 0, new LanguageCompilerSpecPair("x86:LE:32:default", "gcc"), true));

edit/update: heh, found one little thing le2bin was actually useful for even with that LX loader ghidra plugin available:

the plugin just doesn't recognize doom v0.99 as a LE (who knows why)

Edited July 5, 2020 by xttl

xttl · July 6, 2020

Is this cool enough that it's ok to double post? :-)

beta_sbar.png

xttl · July 10, 2020

The ULTIMATE DOOM HACK wip (done on top of Final Doom anthology exe because it's the most complete/bugfree ver):

done and works:

-maxzone parm (default: 8MB, max: 32MB)
-maxplanes parm (default: 128, max: 4096)
-maxdsegs parm (default: 256, max: 4096)
-maxsprites parm (default: 128, max: 4096)
can put all IWADs into same directory (use -plutonia, -tnt or -doom1 to play something else than doom2.wad)
-telebug parm (broken teleports like finaldoom rev1 exe)
-medkitfix parm (picked up a medikit that you REALLY need!)
-ouchfix parm (unfinished, only self damage really works due to priority problems)
no rest for the living mode (-nerve)
-nogibr parm (disable private chat keys in netplay with >2 players)
512kB stack always enabled (can load unsplit BTSX)
idclip works in doom1 mode
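For the -max* parms above, the general pattern is "look for the parm, read the number after it, fall back to the default, clamp to the maximum". A minimal sketch in C follows; the helper name is hypothetical, the hack itself is patched into the exe rather than written like this, and treating zero or non-numeric values as "use the default" is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the number following `name` on the command
   line, clamped to `max`, or `def` if the parm is absent or unusable. */
long limit_parm(int argc, char **argv, const char *name, long def, long max)
{
    assert(name != NULL);
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], name) == 0) {
            long v = strtol(argv[i + 1], NULL, 10);
            if (v <= 0)
                return def;          /* assumption: bad value means default */
            return v > max ? max : v;
        }
    }
    return def;
}
```

For example, limit_parm(argc, argv, "-maxplanes", 128, 4096) would give 128 when the parm is missing and 4096 when the user asks for more than that.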

still needs to be done:

all the rest of doom+ (limit expand) stuff
-skullbug parm (broken lost soul floor bounce like all pre-ultimate exes)
-longtics parm
-fakenet parm
forced OPL3 mode?
customizing private chat keys?
general midi reset fixes? (for external synth users and wads like reqmus which have songs that like to enable midi reverb)
automatically add nerve.wad & set expanded limits with -nerve
sigil mode (it partially works already, try -sigil -warp 5 1 w/ udoom iwad & sigil non-compat pwad & check automap)
edit: ah heh, forgot nerve par times even existed
anything else?

update: fixed hang under win95 (that's what I get for only testing with dosbox)

Edited July 13, 2020 by xttl

Redneckerz · July 10, 2020

3 hours ago, xttl said:

The ULTIMATE DOOM HACK wip (done on top of Final Doom anthology exe because it's the most complete/bugfree ver):

udoomhack.zip

update: fixed hang under win95 (that's what I get for only testing with dosbox)

... okay, i certainly didn't expect that to happen! That is huge. You weren't lying! Many thanks, when this is finished it would dwarf Doom32 even and be the definitive executable hack in many ways.

Will the executable remain UDoom or will you give it a name like the previously mentioned F2DoomPP?

BTW, perhaps this has any use, but AXDoomer crafted an executable of his Doom-Patcher program, providing a rather easy way to apply several hacks to a stock executable. Find the page here.

PS: Here is a thread with all the ID Anthology patches if there is any need for it. If you are using the Anthology version, you are pretty much using the very latest id-made Doom1/2 build anyway (November 1996)

Edited July 10, 2020 by Redneckerz

esselfortium · July 10, 2020

3 hours ago, xttl said:

The ULTIMATE DOOM HACK wip (done on top of Final Doom anthology exe because it's the most complete/bugfree ver):

udoomhack.zip

update: fixed hang under win95 (that's what I get for only testing with dosbox)

My mind is blown, incredible work!

xttl · July 11, 2020

22 hours ago, Redneckerz said:

PS: Here is a thread with all the ID Anthology patches if there is any need for it. If you are using the Anthology version, you are pretty much using the very latest id-made Doom1/2 build anyway (November 1996)

Not needed because I put the complete exe in that zip.

I just got xdelta3 to compile for DOS using Open Watcom, so in the future I could just put out a zip which has xdelta3 patches from all the common exe versions to my hacked exe + some kind of bat or frontend program which automatically chooses the right patch to use.

(edit: no idea how slow it will be on period-correct hardware though, but I remember RTPatch, which many commercial game patches used back then, not being that fast on my 486DX4 either :D at least when patching large files like the WADs. edit2: seems plenty fast for patching the exe with dosbox limited to 3000 cycles or even just 1000 cycles. On a real machine disk i/o, memory, etc. would be slower, but I don't know how much effect that would have.)
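One way the "automatically choose the right patch" frontend could work is to checksum the user's exe and look the result up in a table. The sketch below uses a plain bitwise CRC-32 (the common reflected polynomial 0xEDB88320); the table type, function names and patch file names are hypothetical, and identifying versions by CRC rather than by size or some other hash is just one possible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected, polynomial 0xEDB88320) over a buffer. */
uint32_t crc32_bytes(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical table entry: checksum of a known exe version and the
   patch file that turns that version into the hacked exe. */
typedef struct {
    uint32_t crc;
    const char *patch_file;
} patch_entry;

/* Return the patch file for the given exe checksum, or NULL if the exe
   is not one of the known versions. */
const char *choose_patch(const patch_entry *table, size_t n, uint32_t exe_crc)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].crc == exe_crc)
            return table[i].patch_file;
    return NULL;
}
```

The frontend would read the exe into memory, call crc32_bytes() on it, and hand the matching patch file to xdelta3, refusing to continue when choose_patch() returns NULL.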

Edited July 11, 2020 by xttl
