Jump to content
Search In
  • More options...
Find results that contain...
Find results in...
Maes

Exactly what approaches are used for "limit removal" in ports?

Recommended Posts

Some of Doom's static limits were kept low also because they were taxing in terms of memory and were implemented by means of fixed-size lists -no dynamic allocation, except for mapobjects-.

E.g. each visplane means statically allocating enough memory to hold two screen scanlines plus padding, so increasing this number to an arbitrarily "high enough" value like 65535 would waste a lot of memory, even today. Other stuff may be more lightweight, but still...

So, how do the various source ports implement "limit removal" ?

  • Just increasing static limits (if so, to what numbers)
  • Fully dynamic allocation/deallocation based on linked lists rather than arrays.
  • Dynamic arrays system (e.g. doubling arrays size/halving if not needed etc.)
  • Circular reuse etc. etc.

Share this post


Link to post
Maes said:

E.g. each visplane means statically allocating enough memory to hold two screen scanlines plus padding, so increasing this number to an arbitrarily "high enough" value like 65535 would waste a lot of memory, even today. Other stuff may be more lightweight, but still..

Revision: 2850
Author: entryway
Date: 16:40:38, 6 жовтня 2008 р.
Message:
Resolution limitation is removed. Memory usage has decreased as a bonus. 7 mb instead of 8 on doom2.wad/map01 and 14 mb instead of 18 on epic.wad/map05 at 640x480

http://pastebin.com/P8g0Amdt

Share this post


Link to post

Both increasing static limits and circular reuse aren't really limit removing. Linked list allocation for each renderer struct is very inefficient, especially in something like Java. Dynamic arrays I believe is the best option, just be sure to replace the pointer system with indices.

Share this post


Link to post

Lee Killough's Littlest Guide to Limit Removal

(Excerpted from LOG_LEE.TXT)

BTW, despite the "array doubling" coming up in our discussions before
the source was released, it has only been a good solution for around
half of the limit problems so far.

I've used these data structures/algorithms/optimizations so far:

1.  Array doubling (dynamic array that's realloced everytime it gets too small)
2.  Linear probing hash table with number of elements known a priori
3.  Chained hash table (array of pointers, each to a dynamic linked list)
4.  Remove the array and the requirement for it completely (wall scroll effect)
5.  Singly-linked list
6.  Doubly-linked list
All of these algorithms, but particularly hashing and dynamic array expansion, have proven appropriate to solve additional problems and limits in Eternity, each in different situations.

For visplanes, Eternity allocates the screen clipping buffers dynamically, separate from the visplane hash table. When a resolution change occurs, any existing freelisted visplanes with too-small buffers get new buffers for the bigger screen size.

Share this post


Link to post

First, check out the Boom source and compare to the original, and search for "removed" and "limit" (seriously). Lee and others (but mainly Lee) wrote some excellent improvements that are clear enough to understand.

Basically, most ports nowadays use and improve on those. Check out PRBoom+ and Eternity.

A related thing is in-game resolution switching, which, of course 'removes the limit' of 320x200 resolution, but, at the same time, highlights a bunch (but not all) of the structures that are concerned with limit removal.

These methods should translate very well to other languages, because, they're sound algorithms.

Share this post


Link to post

Well, it's a bit early to concern my self with that, but there are a lot of things that are hardcoded to the default screenwidth/screenheight at compile time (and some are even hardcoded with numerical constants).

Share this post


Link to post

Perhaps it would be best to get the port functional and then worry about how to add post-vanilla features?

Share this post


Link to post

Yup :-p

Besides, I'll have a lot of "fun" ironing out numerous bugs -however as of now it's functional enough to load IWAD and PWADs and allowing to start a game from the menu. E1M1 gets displayed, the player can turn around...but not move, until I implement proper network/messaging. However, I can switch to automap mode, pan, zoom etc. and even activate cheats. Even head bobbing works, but there's still a lot to do to call it "playable".

Turns out that player input goes through the networking system ANYWAY, even in single player mode. That IS true client/server architecture.

Share this post


Link to post
Maes said:

Turns out that player input goes through the networking system ANYWAY, even in single player mode. That IS true client/server architecture.

Isn't that only in one direction? Sure, it's easy for a client to send commands to a server.

But, I think it's quite difficult to get the server->client gamestate updates working efficiently in a C/S source port, while allowing the client to accurately predict the gamestate in between updates.

Share this post


Link to post
Spleen said:

allowing the client to accurately predict the gamestate in between updates.


Is there any port that actually does that? I can only imagine multiplayer-specific ports bothering to keep a semblance of consistency. My experience with ZDoom tells me this ain't so, however. It's a lag and dropped packet abuse fragfest if your connection isn't single-digit ping ;-)

Share this post


Link to post

It depends what you mean by "gamestate prediction". Doomsday for example uses various methods of clientside movement prediction in combination with timestamped, location event reconciliation for important events (e.g., opening doors or launching projectiles (not line attacks)).

Share this post


Link to post
Maes said:

Is there any port that actually does that? I can only imagine multiplayer-specific ports bothering to keep a semblance of consistency.

Yeah, multiplayer-specific ports do it. For example, they usually let monsters continue to perform a limited set of actions until the next update from the server. In Skulltag, for instance, the client predicts some monster actions that don't require a random number, since the pseudorandom number generation isn't in sync in between client and server. Updating monster information every tic would be way too bandwidth-intensive, and updating on important monster actions is much more smooth and efficient than updating every N tics.

Share this post


Link to post
Maes said:

Yup :-p

Besides, I'll have a lot of "fun" ironing out numerous bugs -however as of now it's functional enough to load IWAD and PWADs and allowing to start a game from the menu.

Just don't get burned by buffer overruns! as you know Carmack wasn't too concerned here - I guess when he ran into a problem he just increased the constants a bit. I can justify that for a developer trying to get his game out, and get it running on crappy low memory machines, but, of course, a better approach is needed for new maps.
How does your environment/code behave during a overrun? Does it track out-of-bounds reads/writes, or just crash? Just curious.

In original Doom, a few array bounds are checked, like visplanes (long after the overrun occurs...). This is typically what limits are being removed - all the fixed-sized arrays.

Gook luck with the project - a brave venture indeed!

Share this post


Link to post

@kb1 as you said, some things like e.g. visplanes seem to work sanely enough, while others are well known to overflow (e.g. drawing some masked textures on single-sided linedefs).

Java does ALWAYS strict bound checking, no way to disable that. For some situations where overflows do occur, I can choose to decrease range,increase array sizes or catch the overflow exceptions and carry on -I usually do the latter to speed up development if something is particularly vexing.

If I'm in a good mood I might improve on the problem spot and think of a clever way to get around it e.g.

As of now the greatest problems with the renderer seem to be visplane distortion (floors twist and turn in a weird manner) and the presence of "blind angles", where certain sidedefs are rendered magnified/distorted -those I know are because of some bugs in my emulation of doom's BAM trigonometry -it mostly works but it's tricky to use longs pretending they are 32-bit unsigned numbers with overflows, I had to write BAM arithmetic all over again ;-)

Share this post


Link to post
Maes said:

Java does ALWAYS strict bound checking, no way to disable that.

Hmmm...good for finding bugs, not so good for performance, but, maybe it'll be ok.

Maes said:

...bugs in my emulation of doom's BAM trigonometry

Not cheating with floating point? Might make sense for the renderer. Just a thought.

Share this post


Link to post
kb1 said:

Hmmm...good for finding bugs, not so good for performance, but, maybe it'll be ok.


Yup. Old and known issue. There were a few spots where a pointer was just left "dangling" after iterating through vissprites and then read -forgivable in C, but not in Java.

AS far as performance goes, at vanilla resolution a Pentium IV under other loads is able to render the player's view at about 500-600 fps, a Pentium III about 110, a Core 2 laptop or an iMac can easily go beyond 1500. It makes a great difference if the JVM used is able to map transparently to DirectDraw surfaces or the equivalent in other OSes, or not.

It doesn't sound like much compared to native ports, and depends on a lot of factors, but its tons faster than e.g. Flash Doom or that MS-DOS emulator written in Java that runs Doom, so that it will be actually playable at reasonable speeds without question.

One important advantage is that it worked on all the tested OSes and systems with no changes ;-)

kb1 said:

Not cheating with floating point? Might make sense for the renderer. Just a thought.


The problem is that those damn "angle_t"'s are everywhere, are added and manipulated as 32-bit unsigned integers representing 0-360 degrees with a ridiculously high accuracy, and then they are truncated to a 13-bit number anyway, which is used to look up in sine/cosine tables. They could have just mapped them to 16-bits IMO, but hey, who am I to judge.

To emulate them, I store them as longs, and after doing ANY manipulation on them I AND them with a 32-bit mask. It IS possible to avoid this and store them as 32-bit ints directly as long as you know pretty damn well what is going on e.g. as long as they are just added/subtracted between them, but comparisons won't work directly (ANG270 will appear to be smaller than ANG180, which is not true in terms of unsigned numbers).

So in theory it should be possible to change them entirely to simple 32-bit floats, it would probably work faster AND more accurately too, if one does away with the lookup tables as well and uses FPU-accelerated Math library functions.

The fixed point arithmetic used for coordinates, heights etc. is another weird beast, but works with much less problems than BAM angles because it's a signed types, and standard comparison and addition/subtraction operations work on them.

Share this post


Link to post
Maes said:

One important advantage is that it worked on all the tested OSes and systems with no changes ;-)

That's a serious advantage, above and beyond native ports!

Maes said:

The problem is that those damn "angle_t"'s are everywhere...

Yep. Are you going for demo compatibility?? That would be a very brave move! But, it just may be possible. If you're wrapping everything up to the point where you're actually emulating the math, it could be possible. That's an area where the BAMs are critical. Even with the 'ridiculous precision', one least-significant bit of difference in the angle, or the resultant trig CAN and WILL put it out of sync quickly.

Here's a helpful approach:
Take a port known to have good sync, and add code to XYMovement that writes ActorType X Y Z Angle info along with tic count to a file. Then run a standard demo.

Do the same with your port, and compare the files. Should reveal 99% of any low-level math discrepancies, since it hits angles, mul, div, add, etc.

Be sure to use the same resolution for both tests (yep, that can change things too).

Even if you're not going for demo sync, this test should make any oddities surface quickly.

Share this post


Link to post

I actually use test suites and stress-tests to compare accuracy -or rather, reproducibility- of results when using ints and longs as a storage for angle_t's and fixed_t's. I mostly got the theory down, but some glitches show up here and there e.g. I get "ultra magnified" columns when looking in certain directions.

In general, if all I do is add "BAMs" between them, even if stored as longs, it will work OK as long as I strip the final result of any high bits beyond the 32 mark. I must be very careful with multiplications/divisions though, and with the result of subtractions -there are supposed to be no negative BAMs.

Fixed point arithmetic works just the same because it's just signed ints underneath, the only difference can be due to the use of float casts in FixedDiv (those are used by the C code too, and the actual results may behave differently from machine to machine, from CPU to CPU and from compiler to compiler anyway, unless you can guarantee that casts are performed by every FPU in the same way).

As I said, I'm aiming for "vanilla faithfulness" to the degree necessary to load and play PWADs and even DEH files (it's possible to reassign action "pointers" too, since I implement them with a dispatcher method, and even add more in the future, leaving a window open for Boom-like features). I believe the easiest way to accomplish that is to try and reuse as much of the original code as possible, rather than writing everything from scratch, even if that means facing some unpleasantness.

For now, I'm not changing the savegame or demofile formats either (I will convert the C code faithfully, so it's just possible that demos will play, at least for a while). If they do, it'll be just an added bonus -the most important goal of this port is to free Doom from the C/C++ codebase once and for all. Who knows where this can lead in the future ;-)

Share this post


Link to post

Just committed code to remove sprite limits in DoomLegacy.
Because it gets indexed and searched through, it was easier to keep the vissprite table as one monolithic table.

How it is done is as follows:
1. Get a value of how many sprites the user can tolerate before the machine bogs down. We have a user setting of 128 to 16000 sprites.
Even though you may want to display every sprite, the machine cannot handle it. Longday.wad must have over a thousand sprites in one yard (I did not count them) which gave a frame rate of 1 frame / 3 seconds.
2. Keep a count of the number of needed sprites (previous frame).
3. Figure how large the sprite table will be.
Have a minimum size.
Limit the size according to the user setting.
Round the estimate to something reasonable, like 64.
4. Figure a way to prevent sprite table thrashing. Then compare to current value and if the change is not significant, then do not change the table.
There is no use reducing the sprite table size unless it is really huge.
5. Reallocate the new table between frames. It is not in use then.
6. MergeSort the sprites into the table by distance.
You can get rid of the vissprite sort function that was used to sort them later.
7. Make sure that sprites close to the player always get into the table.
8. When the table size is exceeded, figure out what to do about the distant sprites.
Create a way for distant sprites to get a transient viewing (so the player at least knows they are there), randomly, or every x frames.
There are always some wads, like longday.wad, that will exceed what
you can handle. Gracefully degrade from the ideal.
Do not use a random number generator as it is too slow, just need to have a non-repeating pattern of loser sprites that got lucky this frame.
Only do this in the distant half of the table.
9. Sprites that do not get into the table are invisible.
You might want to try to favor live sprites over dead ones, but that
is difficult. Getting hit by invisible missiles will be bad.

10. Some wads, like Europe.wad and longday.wad, exceed the distance that can be handled too. Overflow errors make some distant walls foul the nearby display.

11. Europe.wad exceeds the blockmap size by 3 times too (most significant bits of blockmap index are chopped off in the wad).
For that I watched for the index going from large to small, and added
back in the missing high bits. Other ports ignore the wad blockmap and make a new one.

12. Europe.wad also exceeds the map size that the map viewer can handle. Must examine map viewing code for overflows.

Fixes used by DoomLegacy are in the recent patches in the
SVN repository, SourceForge, DoomLegacy project.

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×