Procedurally generating trigonometric LUTs

Maes · November 12, 2010

While continuing development of MochaDoom, the "dead" R_InitTables() code in r_main.c caught my attention.

Insofar, I had to come up with a very hackish solution for allowing Java to reuse the original trigonometric LUTs: the static data limit per source code file was 64K, which proved easily exceedable unless at least finesine was in its own file, and then "imported" inside Tables.java via means of interface implementation. Fugly as hell.

Similarly, finecosine has to exist in its own private copy, which was generated by copying finesine over it (since aliased array pointers are not possible in Java).

In any case, I first tried the R_InitTables() code with gcc and it gave exactly the same values present in the table, for both finesine and finetangent. The same thing happened when porting the code in Java (at least on Intel).

Furthermore, I managed to recreate the code that generates tantoangle:

    for (i=0 ; i<2049 ; i++)
    {
	
	a=(float)((i<<5)/65536.0);
	t=(int)((float)(2*Math.atan(a)/PI)*0x40000000);	
	tantoangle[i] = t;
    }

which also proved bit-exact and generated the original tantoangle LUT.

I just thought that this information can be used by anyone who's facing similar limitations or wants to keep hardcoded data to a minimum for whatever reason. I can only see a minor problem with this approach on non-Intel platforms or on different FPUs, but then again, there would be similar problems caused by FixedDiv too.

Quasar · November 12, 2010

R_InitTables was actually used back in the prebeta, IIRC, instead of having the tables precalculated already - so that explains why it exists if you were curious.

Maes · November 12, 2010

Well, good thing it wasn't wiped entirely, as it came pretty handy now. I really wanted a cleaner solution for the LUTs, as splitting them across several classes was really shitty. Then again that static limit is absurd, but that's another story.

It's always interesting when "dead" code is brought back to use or even reverse-engineered. The automap strobing is another similar feature, and there probably are others.

Another thing that caught my attention was a "defined-out" code snippet claiming to fix the "stretched line bug" in the renderer code.

Which is interesting, because I hit upon a similar bug myself (actually it was a selective column magnification under certain view angles, which proved to be because of intermixing "unsigned" arithmetic with signed math functions) near the same spot. In my case it was caused because Math.abs() was called upon longs that were meant to represent angle_t's (I keep them trimmed to 32-bits all the time after arithmetic operations, so that they always represent positive 32-bit integers).

However JUST in that place, they were meant to be interpreted as signed integers (and the same probably happens in the original C code under some compilers).

LexiMax · November 19, 2010

Maes said:
However JUST in that place, they were meant to be interpreted as signed integers (and the same probably happens in the original C code under some compilers).

This kind of tomfoolery is why I'm such a fan of your project. It's really nice of you that you're going through and preserving doom in a more...how should I put it..."defined" language.

RjY · November 20, 2010

What do you know, it does work!

rjy@vile:~/src/rboom/data$ /tmp/tables s | cmp -lb - lumps/sinetabl.lmp 
rjy@vile:~/src/rboom/data$ /tmp/tables t | cmp -lb - lumps/tangtabl.lmp
rjy@vile:~/src/rboom/data$ /tmp/tables a | cmp -lb - lumps/tantoang.lmp

No output means no differences. tables.c test program source is here if anyone wants to play with it. Incidentally

Maes:
Furthermore, I managed to recreate the code that generates tantoangle:

You didn't really need to do this, there's another function R_InitPointToAngle in r_main.c with the commented-out tantoang generation in it (I only just discovered this)

Ladna · November 20, 2010

AlexMax said:
This kind of tomfoolery is why I'm such a fan of your project. It's really nice of you that you're going through and preserving doom in a more...how should I put it..."defined" language.

It's funny, I read something about C from Brian Kernighan and he consistently referred to C as weakly-typed. You'd never hear that in a modern programming discussion, but what he was referring to was that in C, everything's either a float, a struct, or an integer, and that punning really mixes it all up. In a lot of ways this is extremely useful, lots of things are super fast and easy in C because almost everything is an integer (array indices, addresses, etc.) and C doesn't care whether you call a block of memory a thinker_t or a mobj_t. I mean when you think about it it makes sense, and especially since there are languages that are much more strict about type than C is (Java and C#). But I'd just never thought of C as "weakly-typed" before.

Quasar · November 21, 2010

Til C99 threw it all out the window.

Ladna · November 21, 2010

Yeah WTF.

Maes · November 21, 2010

Well, that's probably one of the things that threw off many earlier attempts at rewriting Doom in a non-C language. However, it can be rationalized if you accept the overhead of object casting and inheritance as a necessary evil e.g. instead of treating mobj_t as an implicit union with thinker_t (as the original code does, if you dig into it), I just make it extend thinker_t (the same goes for sector specials, and it appears to be working just fine), which allowed me to pass around mobj_t's as thinker_t's just as you would do in C/C++, at least at the SC level.

At least so far I've thoroughly debunked the idea that Doom is totally dependant on C's memory model and way of doing things: it can be decoupled to a large degree and even gain a few perks along the way: e.g. I obviously don't need the Zone memory system anymore, yet I can use the same file-memory caching mechanism. I can unclutter the syntax by removing *, & and -> operators resulting in way more readable code, and I can simulate "weak typing" by having around enough interfaces e.g. strobe_t and other such sector events extend a "SectorAction" abstract class which provides the containing sector_t object, and in turn it extends the thinker_t abstract class which provides the nodes to the thinker list and the "pointer" to the function (how I implemented that, is another story).

Graf Zahl · November 22, 2010

Maes said:
and in turn it extends the thinker_t abstract class which provides the nodes to the thinker list and the "pointer" to the function (how I implemented that, is another story).

I don't understand why you had to hack around that (at least that's how I understand this sentence.)

In ZDoom the 'Thinker' class defines a method called 'Tick' and all thinker subclasses override it so it's all well defined within the class hierarchy.

Maes · November 22, 2010

Graf Zahl said:
I don't understand why you had to hack around that (at least that's how I understand this sentence.)

Ok, back to the roots:

mobj_t (one of the many implicit "subclasses" of thinker_t) contains a thinker_t struct at its beginning, not a pointer. This means that, in C and C++ you can pass either type around and treat it as an union of sorts, even if you DON'T do an explicit mobj_t->thinker access. Aka if you pass a mobj_t where a thinker_t is expected it won't complain, and will work as intended.

In code terms:

public class thinker_t {
   
   public thinker_t prev;
   public thinker_t next;
   public think_t     function;

}

public class mobj_t extends thinker_t implements Interceptable   {

    ...(snip)...  

    /* List: thinker links. */
    //public thinker_t       thinker;

   ...(snip)...
}

Focus on that commented out thinker_t: if I leave it in and DON'T extend thinker_t instead, I won't be able to pass and cast mobj_t into thinker_t, and I surely won't be able to access any of its fields. Yes, I admit to not using a pure OO approach here and a lot of stuff is accessed directly through fields, at least at first.

Similarly:

public abstract class SectorAction extends thinker_t {
  
    public sector_t sector;
}

public class fireflicker_t extends SectorAction{
	
     public int     count;
     public int     maxlight;
     public int     minlight;
     
 }

Focusing on that fireflicker_t, I noticed that in ZDoom it has become DFireFlicker and essentially contains the sector actions "inside it", rather than leaving it in the equivalent of r_specs, without losing context. In Java, depending on how much global crap it needs to access, I'd have to make it internal to the class containing the various "global" variables, or else devise (yet another) way of passing context around. I prefer keeping it simple, for now ;-)

If I want to pass mobj_t's or "sector actions" as thinker_t's around, they must either implement an interface (but then, in Java, interfaces cannot define member fields or implement functions except class static ones) so I'd have to hack around that anyway. Abstract classes can do both, but like interfaces, they cannot be instantiated (maybe C++'s meaning is different?), so I must extend an abstract class with those fields.

Maybe this "daisy chain inheritance" appears weird, but remember than in Java you don't have true multiple inheritance.

I checked out ZDoom's code to see what you mean though, and I saw you took it to a whole new level: you're using a much more OO approach to Thinkers (including iterator patterns, which are nowhere as efficient or automatic in Java as they are in C++) so we're really comparing apples to oranges here, as I'm not changing the internal mechanics to that degree (yet): it's weird that you call my approach a "hack" though, when you are dealing with a much more complex and layered system in ZDoom (than again ZDoom has a much broader scope, so whatever works best I guess). If someone else wants to make a "Java ZDoom" however, hey, be my guest :-p

Wagi · November 30, 2010

Even without the limitations that you describe, this is still pretty useful. Using similar methods on the other LUTs (like finesine[]? :o ), could potentially cut down on executable size, if you're the kind of person who cares about that stuff.

DaniJ · November 30, 2010

Some ports already do just that Wagi.

Quasar · November 30, 2010

Due to the fact the C code treated thinker.function as RTTI, in the process of EE's C++ conversion I've had to also create something of a "hack" to get around this implication of a dynamic type - thinker.function is reassigned to P_RemoveThinkerDeferred in BOOM so that thinkers can remain in the list until they're freed - because of demo compatibility concerns, I can't just change that.

So instead I've created a thinker_cast template function that only performs the dynamic_cast to a target subclass when the thinker is not removed. Looks like this:

template<typename T> inline T thinker_cast(CThinker *th)
{
  CThinker *ret = NULL;

  if(th && !th->isRemoved())
     ret = dynamic_cast<T>(th);

  return ret;
}

I don't really LIKE having to do it that way, but it's the only way to avoid checks for this everywhere in the code, now that what used to be thinker.function is a virtual Think() method like it ought to be for proper OO design. Should this prove to be an efficiency problem somehow, I guess it has to go back to being a function pointer >_>

DaniJ · November 30, 2010

I've not been keeping up on Eternity's changelogs of late Quasar but why have you opted not to use real RTTI?

Graf Zahl · November 30, 2010

He does. But the C code abused the function pointer as RTTI identifier. But this can't be done in C++ so this particular exception case had to be handled separately.

ZDoom is doing something similar for pointers to deleted thinkers but it's a bit more complex due to the garbage collector that it uses.

DaniJ · November 30, 2010

In Doomsday we are making a copy of the mobj in a "deleted thinker" list that is automatically garbage collected at the end of the tic. Any reads to the mobj post deletion are routed to the copy and the original mobj is put into stasis.

I don't see why you need identification on thinker function ptr or why that is a problem. You can use a set of numeric constant type idents if you want. You can have just one Mobj_Thinker and derive from it in an inheritance model and maintaining databases for name'd instances either by their function or present use.

There is actually no reason why mobj thinkers can't be folded until there is one. Yeah it involves a couple of high-level logic branches in Mobj_Thinker but that can be dealt with (which originate mostly from Hexen).

Graf Zahl · November 30, 2010

If I understand this correctly, Eternity can't do that due to demo compatibility issues

ZDoom is even more radical, btw. Most pointers are actually wrapped by a small class and each read of the pointer checks if it's still valid and immediately NULLs it if it isn't. So for most cases destroyed objects can not be accessed at all.

DaniJ · November 30, 2010

We like to keep the copy around. It makes for more algorithmic performance options in the world renderers (both video and audio).

Graf Zahl · November 30, 2010

Why that?

These subsystems should never need to access a destroyed object.

DaniJ · November 30, 2010

Logically, perhaps but the option is there. That is what I'm saying.

Quasar · December 1, 2010

See I cannot modify the order or contents of the thinker list, because demo sync is highly sensitive to it. This means that BOOM's deferred thinker removal semantics must be preserved through the conversion of the thinker_t hierarchy into a proper set of inheriting classes.

It seems really simple at first until you run into the fact that the code is testing thinker->function everywhere, and BOOM changes thinker->function to P_RemoveThinkerDeferred when the object passes through P_RemoveThinker - this is so that ref counting can take over and the object will stay around as long as something seems to still be using it. However while it's in this state, none of those loops testing thinker->function detected it as being of the type they're interested in (which is mobj_t in 99% of the cases).

I was tempted to create a new list of mobj_t's only and iterate over it instead of iterating on the thinker list. Now let me show you why it doesn't work. Suppose two consecutive objects are removed during the same gametic inside an iteration on this list of mobj_t's, while the iteration points to the first mobj_t. The next mobj_t to step to would be the second one that was removed. But it's still going to look like a valid mobj_t, short of doing explicit checks on mobj_t::removed everywhere, which I don't want cluttering up the code, so the loop will end up running on it, whereas in BOOM it would have been rejected due to having its thinker.function altered.

Instead it's easier and safer to just keep the loops working at the thinker level and let a selectively performed dynamic_cast in a global routine handle this uniformly throughout the engine. This replaces the original:

  if(th->function == P_MobjThinker) 
  {
     mobj_t *mo = (mobj_t *)th;
  }

with a clean and simple and obviously correspondent

  mobj_t *mo; 
  ...
  if((mo = thinker_cast<mobj_t *>(th)))
  {
    ...
  }

No additional branches or nagging need to remember checking for removed status everywhere. It's the kind of thing that the rest of the game engine isn't supposed to have to worry about - deferred removal is only supposed to be the burden of the thinker list.

Going to a custom RTTI system that changed the effective type of thinkers was also an option but with obvious downsides. When using custom RTTI dynamic_cast becomes redundant and not worth its ordinary overhead any longer. But static_cast has no safety, and IMO is barely an improvement over the tricks we were doing in C, things I want to move away from. I doubt any RTTI system I cooked up on my own would really have less overhead than those built into modern compilers, which means I'd just be sacrificing a language feature for an inferior improvisation.

Graf Zahl · December 1, 2010

Quasar said:
I doubt any RTTI system I cooked up on my own would really have less overhead than those built into modern compilers, which means I'd just be sacrificing a language feature for an inferior improvisation.

Depends on what you need. ZDoom implements its own RTTI system but for good reasons. Since the compiler provided system cannot handle types generated at run time it's practically useless in a setup where you need to be able to create your own types (for DECORATE defined actors.)

Another issue is that to my knowledge the compiler provided RTTI still does not offer a means to create an object based on a runtime type - also a feature ZDoom heavily makes use of.

rpeter · December 1, 2010

Excellent thread, not that I understand every word of it.

I feel justified that I removed the LUTs years ago in my pet project. I have no idea why some ports still insist on them, probably some demo compatibility consideration?

As for the thinker/mobj stuff I separated some specials out of the thinker list to see what happens. Each special type has its own list and deferred deleting mechanism, so far no problem.

Maes · December 1, 2010

rpeter said:
I feel justified that I removed the LUTs years ago in my pet project. I have no idea why some ports still insist on them, probably some demo compatibility consideration?

Given that they can be generated with bit-exactness, at least on Intel, I'd say it's more a mix of convenience and overlooking. Why rewrite something that works, and is found at such a basic level? With C/C++ there should be no problem pulling in the original .c and .h files for most stuff anyway, it's when you change language as I did that you may start having a problem (similarly, I had to rewrite most "static initializers" and semantically-valued enums present throughout the code).

RestlessRodent · December 1, 2010

Maes said:
While continuing development of MochaDoom, the "dead" R_InitTables() code in r_main.c caught my attention.

Insofar, I had to come up with a very hackish solution for allowing Java to reuse the original trigonometric LUTs: the static data limit per source code file was 64K, which proved easily exceedable unless at least finesine was in its own file, and then "imported" inside Tables.java via means of interface implementation. Fugly as hell.

Similarly, finecosine has to exist in its own private copy, which was generated by copying finesine over it (since aliased array pointers are not possible in Java).

In any case, I first tried the R_InitTables() code with gcc and it gave exactly the same values present in the table, for both finesine and finetangent. The same thing happened when porting the code in Java (at least on Intel).

Furthermore, I managed to recreate the code that generates tantoangle:
    for (i=0 ; i<2049 ; i++)
    {
	
	a=(float)((i<<5)/65536.0);
	t=(int)((float)(2*Math.atan(a)/PI)*0x40000000);	
	tantoangle[i] = t;
    }
which also proved bit-exact and generated the original tantoangle LUT.

I just thought that this information can be used by anyone who's facing similar limitations or wants to keep hardcoded data to a minimum for whatever reason. I can only see a minor problem with this approach on non-Intel platforms or on different FPUs, but then again, there would be similar problems caused by FixedDiv too.

Depending on the processor, floating point operations could change precision. For example a float could be a 32-bit precision float, but then passed to the CPU it could turn into an 80-bit precision float. A slight precision change could change the value especially when it's converted back and how the CPU/Assembly code handles it. FPUs could also use different means to calculate functions and what not.

Floating point is the most variable and could be implemented a bunch of ways while on the other hand integer division is pretty much not really a problem these days since it's pretty static (although the algorithm is different it could produce the same results anyway).

Also, if you do not want dependence on floating point operations and want to operate entirely on fixed point, removing the LUTs is not an option. Integer operations are still faster than floating point operations.

RjY · December 1, 2010

Oh yeah, this. Thanks Maes, this thread was a really nice bit of research. This is an extract from tables.c in rboom. Out of paranoia I put in a CRC check so it sprays out a big warning if the code appears to get the answers wrong. It actually works, in that I haven't seen any demo desyncs since I did it, but I've only tested it on the one machine in front of me. It could probably go upstream into PrBoom easily enough but I wouldn't trust it without being able to test on all the different platforms PrBoom is meant to run on :-)

Maes · December 1, 2010

It could be made platform-independent if Intel FPU emulation was thrown in just for this part. Since it would be a one-off operation, that wouldn't matter much.

I'm surprised how nobody mentions that the FixedDiv code is also dependant on the FPU (or FPU emulation), since it actually has a cast to floating point. At least linuxdoom is a bit murky at this point, since I recall there's a version using a cast to float and one that doesn't (?), however only a disassembly can tell us what really happens in vanilla Doom. Is it pure floating point? What do actual source ports do?

Linuxdoom version (pure integer part is dead code).

fixed_t
FixedDiv2
( fixed_t	a,
  fixed_t	b )
{
#if 0
    long long c;
    c = ((long long)a<<16) / ((long long)b);
    return (fixed_t) c;
#endif

    double c;

    c = ((double)a) / ((double)b) * FRACUNIT;

    if (c >= 2147483648.0 || c < -2147483648.0)
	I_Error("FixedDiv: divide by zero");
    return (fixed_t) c;
}

However chocolate doom apparently uses a variation of the "integer" version:

fixed_t FixedDiv(fixed_t a, fixed_t b)
{
    if ((abs(a) >> 14) >= abs(b))
    {
	return (a^b) < 0 ? INT_MIN : INT_MAX;
    }
    else
    {
	int64_t result;

	result = ((int64_t) a << 16) / b;

	return (fixed_t) result;
    }
}

as does ZDoom.

Quasar · December 1, 2010

Maes said:
I'm surprised how nobody mentions that the FixedDiv code is also dependant on the FPU (or FPU emulation), since it actually has a cast to floating point.

IIRC this is one of the "improvements" that Bernd Kreimeier made. Either that or it was a change in the Linux version. Either way, IIRC, this will desync demos if left in place. I can definitely tell you that error message and floating point ops aren't in any version of the DOOM engine I've personally disassembled.

RestlessRodent · December 2, 2010

From the tests I have done, the 64-bit shift and divide for FixedDiv matches and is much faster than doing it in floating point, however it is still slow as hell.

Sign In

Procedurally generating trigonometric LUTs

Recommended Posts

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post

Share this post

Link to post