Doomworld Forums > Classic Doom > Source Ports > Which are the ports with most optimized renderers
Maes
I like big butts!


Posts: 12371
Registered: 07-06



kb1 said:
Just curious, are you now maintaining original demo sync with your port? That would be incredible! Is that something you're considering?


If you mean compatibility with vanilla demos, it's partial: some play back just fine, others (most of them) desync. Some "known good" ones are DEMO2 and DEMO3 (and maybe DEMO4?) from Ultimate Doom. I think a couple from Final Doom also play back correctly.

However, unless I figure out precisely what's causing the desyncs at certain points (the actions of certain monsters? the movement of certain monsters? One call to P_Random too many? A small error in my integer-wishing-it-was-fixed_t arithmetic?) I can't do much. Who knows... if I could make a proper test map that isolated certain kinds of causes, then maybe.

The only clue I have so far is that, with a -nomonsters map with the player just navigating it (recorded in vanilla Doom), I couldn't induce a desync. With monsters present it's usually just a matter of time, but not always. I can't positively trace it back to one kind of monster or action yet.

Old Post 10-25-12 01:25 #
Graf Zahl
Why don't I have a custom title by now?!


Posts: 7708
Registered: 01-03


I tested REJECT mostly with MAP09 from Caverns of Darkness.

Back in the day it was the largest thing I could find. It was also quite detailed, so traversing the BSP took a lot of time.
I woke up all the monsters in there by activating god mode and running through all the monster stashes. Then I measured the time spent in P_CheckSight and compared with the entire time spent in the play simulation and the entire time spent processing a frame.

Yes, P_CheckSight was markedly improved, no doubt about it. But here's the thing: The play simulation spends much more time in P_CheckPosition than in P_CheckSight! And the game spends orders of magnitude more time in the renderer, especially when using software rendering.

The results I got were that the play simulation was improved by 20-30%. Sounds great at first sight. But once you consider that the play simulation took less - often significantly less - than 10% of overall execution time, things suddenly look a lot different.

This is a spot that looks very appealing for optimization because you can drastically improve its performance by at least 100%. But since only 1, maybe 2 or 3 percent of overall time is spent there, the maximum performance increase is a fraction of those measly few percent.
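The arithmetic behind this argument is Amdahl's law. A minimal sketch in C follows; the function name overall_speedup is made up for illustration, and the numbers in the note after it are only picked from the ranges mentioned above, not measured.

```c
#include <assert.h>

/* Amdahl's law: overall speedup when a fraction `part` (0..1) of the
   total run time is made `local_speedup` times faster.
   Hypothetical helper, not taken from any port. */
double overall_speedup(double part, double local_speedup)
{
    assert(part >= 0.0 && part <= 1.0);
    assert(local_speedup > 0.0);
    return 1.0 / ((1.0 - part) + part / local_speedup);
}
```

With the play simulation at 10% of the frame and made 1.3 times faster, overall_speedup(0.10, 1.30) comes out at roughly 1.024, i.e. about 2.4% overall. With sight checks at 2% of the frame and made twice as fast, overall_speedup(0.02, 2.0) is roughly 1.010.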

I have to admit one thing though: When I tested this I didn't have an engine with uncapped frame rate yet. But honestly, when it comes to comparing 100 fps to 110 fps I don't see much of a point.

Now, that was with the maps I tested with. If you can show me a map where things look different I'll happily make another profiling session and post the results.




The problem is that reject building, especially perfect reject building, takes time. Lots of time. And the only suitable tool is still RMB, which, if I'm not mistaken, is showing compatibility issues with modern OSes. Unfortunately its source code was never released, so there's very little chance of getting it improved.

Another issue with REJECT is that with large maps the lump gets huge which means it can become a hassle for many projects.
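To put numbers on the size issue: REJECT stores one bit per ordered pair of sectors, so the lump grows with the square of the sector count. A small sketch, with hypothetical helper names; the bit layout is the one vanilla P_CheckSight uses (pair index s1*numsectors+s2, low bit first within each byte).

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of a REJECT lump: one bit per ordered sector pair,
   rounded up to a whole byte. Hypothetical helper name. */
size_t reject_lump_size(size_t numsectors)
{
    return (numsectors * numsectors + 7) / 8;
}

/* Returns 1 if the table says sight from sector s1 to sector s2 is
   impossible, so the expensive BSP sight check can be skipped. */
int reject_blocks_sight(const unsigned char *reject, size_t numsectors,
                        size_t s1, size_t s2)
{
    size_t pnum;

    assert(s1 < numsectors && s2 < numsectors);
    pnum = s1 * numsectors + s2;              /* index of the pair */
    return (reject[pnum >> 3] >> (pnum & 7)) & 1;
}
```

A map with 1,000 sectors needs 125,000 bytes; a map with 10,000 sectors needs 12,500,000 bytes, about 12.5 MB.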

Old Post 10-25-12 08:02 #
Maes
I like big butts!


Posts: 12371
Registered: 07-06



Graf Zahl said:
Another issue with REJECT is that with large maps the lump gets huge which means it can become a hassle for many projects.


For fun, I once experimented with REJECT in order to see if a "sparse" representation format could help save some space, e.g. by RLE encoding sequences of "1"s and "0"s. However, the REJECT tables of stock Doom & Doom II maps are, on average, 65% dense (meaning that there aren't many all-zero rows or columns), and so I didn't pursue it further.

A lower density would mean that the tables are mostly zeros, and it could be worthwhile to keep them in a compressed format, assuming that lookup still took constant or at most logarithmic time.
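For anyone who wants to repeat the measurement, here is a rough sketch of the two figures that matter: the fraction of set bits, and the number of runs a naive RLE would have to store. The function names are made up, and the run count is only a crude estimate of RLE size, not the exact scheme I tried.

```c
#include <assert.h>
#include <stddef.h>

/* Fraction of set bits among the first nbits bits of a bit table
   (low bit first within each byte, as in REJECT). */
double bit_density(const unsigned char *table, size_t nbits)
{
    size_t i, ones = 0;

    assert(nbits > 0);
    for (i = 0; i < nbits; i++)
        ones += (table[i >> 3] >> (i & 7)) & 1;
    return (double)ones / (double)nbits;
}

/* Number of runs of equal bits. A simple RLE stores one count per run,
   so few runs means the table compresses well. */
size_t bit_runs(const unsigned char *table, size_t nbits)
{
    size_t i, runs = 1;
    int prev, cur;

    assert(nbits > 0);
    prev = table[0] & 1;
    for (i = 1; i < nbits; i++) {
        cur = (table[i >> 3] >> (i & 7)) & 1;
        if (cur != prev) {
            runs++;
            prev = cur;
        }
    }
    return runs;
}
```

For a map with numsectors sectors, nbits is numsectors*numsectors. A table with a density near 0 or 1 and few runs is a good RLE candidate; one with many short runs is not.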

Old Post 10-25-12 13:22 #
kb1
Member


Posts: 337
Registered: 11-06



Graf Zahl said:
...Caverns of Darkness...
If you can maintain 100 fps, that's great. I suppose I should pull out my Visual Basic reject builder code. It's reasonably fast considering - naw, it's slow. The first version used raycasting, and was sectors * sectors * lines * lines * traces. The latest version uses plane-intersection code, eliminating the traces, but requiring slower math. Problem is, it's married to another project at the moment.


Maes said:
If you mean compatibility with vanilla demos, it's partial: some play back just fine, others (most of them) desync. Some "known good" ones are DEMO2 and DEMO3 (and maybe DEMO4?) from Ultimate Doom. I think a couple from Final Doom also play back correctly.

However, unless I figure out precisely what's causing the desyncs at certain points (the actions of certain monsters? the movement of certain monsters? One call to P_Random too many? A small error in my integer-wishing-it-was-fixed_t arithmetic?) I can't do much. Who knows... if I could make a proper test map that isolated certain kinds of causes, then maybe.

The only clue I have so far is that, with a -nomonsters map with the player just navigating it (recorded in vanilla Doom), I couldn't induce a desync. With monsters present it's usually just a matter of time, but not always. I can't positively trace it back to one kind of monster or action yet.

Awesome! Here's how to find the remaining desyncs:

1. Open PrBoom+ source
2. Find every instance of P_Random(). Add a unique number to every instance: P_Random(22).

3. Modify P_Random to accept the number, and append it, the random number, and the random index to a file: "PR 22 1 252".

4. Modify TryRunTics to write gametic to the same file.
5. Modify XYMovement to write doomednum, x, y, z, momx, momy, momz to the same file.

6. Compile. Run it against failing demo. Save file.
7. Make the *exact* same changes in your port, so that it produces a file in the identical format. EDIT: (be sure to use corresponding integers for your P_Random calls!) Run the failing demo on your port. Compare the two files.

This will give you at least some of the following information (9 times out of 10):

. The actual function that's incompatible
. The tic that deviates
. The type of monster
. Possibly exactly what's wrong

I used that to discover and fix a great many desyncs from a project that I had been recklessly modifying for over a year with disregard for demo sync. It now plays most vanilla demos. Good luck!
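Steps 2 to 5 can be sketched roughly like this. Treat it as an illustration only: the SyncLog_* names and the "TIC" and "XY" line prefixes are invented for this example, the table below holds placeholder values rather than a port's real 256-entry rndtable, and the real hooks would live inside PrBoom+'s own P_Random, TryRunTics and XYMovement code.

```c
#include <stdio.h>

/* Placeholder values only: a real port uses its full 256-entry table. */
static const unsigned char rndtable[256] = { 0, 8, 109, 220 };
static int prndindex = 0;
static FILE *synclog = NULL;   /* hypothetical handle for the log file */

void SyncLog_Open(FILE *f)
{
    synclog = f;
}

/* Step 4: called once per tic from the tic loop. */
void SyncLog_Tic(int gametic)
{
    if (synclog)
        fprintf(synclog, "TIC %d\n", gametic);
}

/* Step 5: called from the movement code for each moving thing. */
void SyncLog_Mobj(int doomednum, int x, int y, int z,
                  int momx, int momy, int momz)
{
    if (synclog)
        fprintf(synclog, "XY %d %d %d %d %d %d %d\n",
                doomednum, x, y, z, momx, momy, momz);
}

/* Steps 2-3: every call site passes its own unique number, which is
   logged together with the random number and the random index. */
int P_Random(int callsite)
{
    int value;

    prndindex = (prndindex + 1) & 0xff;
    value = rndtable[prndindex];
    if (synclog)
        fprintf(synclog, "PR %d %d %d\n", callsite, value, prndindex);
    return value;
}
```

Once both ports write the same format, any text diff tool will point at the first differing line, and the TIC line above it tells you which tic deviated.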

EDIT: If you're interested, I could probably post my "syncdebug.txt" file from my port, running against a demo of your choice. Of course, you'd have to pattern your files and P_Random integers exactly as I did. If so, please create a new thread, and let me know.

Last edited by kb1 on 10-25-12 at 21:34

Old Post 10-25-12 19:01 #
wesleyjohnson
Senior Member


Posts: 1000
Registered: 04-09


The memory system needs to be tested at the minimums and extremes because that best simulates the conditions that cause crashes or bogging.
Talking about getting max framerate on a big memory computer is really a different discussion and should start a separate thread.

Any memory system is highly tuned to the port environment, as that determines the unfortunate situations that cause it to thrash. You can fix the memory system to tolerate them, or fix the port to not put the memory system in that situation.

I have an improved version of Zone Memory with many additional tag types.
This protects some things against being purged in specialized ways. Textures can use a higher priority purge tag, so they are not purged until the more desperate second search pass. This Zone Memory grows itself as needed.
The Zone Memory I inherited had two bugs that could corrupt aligned allocations.

Still, I think that free lists are the way to go for same sized allocations that have high turnover. You can probably find some wad that can make any allocation have high turnover, but that will also create bogging by some other draw function too, so some kind of early pruning will likely always have the best payoff.

That peak mobj usage drives the size of the free list is a valid argument. If that free list never gets purged between levels, then it is also valid that it can be a memory hog itself.
Allocating all (or even many) in one malloc call is going to hinder freeing anything on that list.

It would not hurt to release memory from the free list when memory is needed, but that requires a depth to memory exception handling that most ports do not support. Excess freelist size would have to be checked on a periodic basis, such as every tenth frame, or on some other periodic event. This depends upon single object allocations, because checking that all objects in the allocation are on the free list would be so painful. I have done this on another project, and it works but there are usually so many partially filled allocations, that a compactor is needed too.
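A minimal sketch of that kind of free list, assuming single-object malloc allocations so that any object on the list can be handed back on its own. The FL_* names are invented for this example and this is not the code from my port; FL_Trim is the part that would run on the periodic check.

```c
#include <assert.h>
#include <stdlib.h>

/* A freed object is reused as a list node, so it must be at least
   as large as one pointer. */
typedef struct freenode_s { struct freenode_s *next; } freenode_t;

typedef struct {
    freenode_t *head;
    size_t objsize;   /* size of every object served by this list */
    size_t count;     /* objects currently parked on the list */
} freelist_t;

void FL_Init(freelist_t *fl, size_t objsize)
{
    assert(objsize >= sizeof(freenode_t));
    fl->head = NULL;
    fl->objsize = objsize;
    fl->count = 0;
}

/* Reuse a parked object if there is one, otherwise fall back to malloc. */
void *FL_Alloc(freelist_t *fl)
{
    freenode_t *n = fl->head;

    if (n) {
        fl->head = n->next;
        fl->count--;
        return n;
    }
    return malloc(fl->objsize);
}

/* Park the object on the list instead of freeing it. */
void FL_Free(freelist_t *fl, void *p)
{
    freenode_t *n = p;

    n->next = fl->head;
    fl->head = n;
    fl->count++;
}

/* Periodic check: hand everything beyond `keep` objects back to the
   system. Returns the number of objects released. */
size_t FL_Trim(freelist_t *fl, size_t keep)
{
    size_t released = 0;

    while (fl->count > keep) {
        freenode_t *n = fl->head;
        fl->head = n->next;
        fl->count--;
        free(n);
        released++;
    }
    return released;
}
```

Calling FL_Trim with keep set to 0 between levels empties the list completely. The trim only works this simply because each object is its own malloc; with block allocations you would first have to prove the whole block is free.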

All of this is so highly tuned to the rest of the port that comparisons between ports of the best way to go are almost laughable.
You can make any memory system look bad by using it differently than for what it was designed.

Last edited by wesleyjohnson on 10-25-12 at 22:52

Old Post 10-25-12 22:40 #
kb1
Member


Posts: 337
Registered: 11-06


I can't imagine that by the time you hit the "peak mobj_t" point you haven't already loaded the bulk of the textures, sprites, sounds, etc. I'm not sure memory reduction during a single level would be of much benefit. If you need 10K mobj_t's at any point during the level, you need em! Of course, when you switch levels, you can purge that memory, and I do.

My port's memory usage goes up to about 22 MB from launch to playing demo1. After cycling thru all Doom II demos 3 to 4 times, it hovers at about 25 MB. That's running 1920x1080x32 resolution.

I allocate in chunks of 4k mobj_ts at a time. Last time I checked, I think they are < 300 bytes each, so that's about 1.2 MB per block, which means that at any given time I'm wasting at most 1.2 MB on unused slots. I can live with that, knowing that I will *never* have to malloc/free an individual mobj_t :) It's not like I'm "multitasking" when playing full-screen Doom!
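Roughly, the scheme looks like the sketch below. This is a simplified illustration and not my port's actual code: the Pool_* names, the 16-byte alignment and the struct layout are all assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

#define POOL_BLOCK_OBJS 4096   /* objects per block ("4k at a time") */
#define POOL_ALIGN      16     /* assumed alignment, for illustration */

typedef struct poolblock_s { struct poolblock_s *next; } poolblock_t;
typedef struct poolnode_s  { struct poolnode_s *next; }  poolnode_t;

typedef struct {
    poolblock_t *blocks;    /* every block obtained from malloc */
    poolnode_t  *freehead;  /* unused object slots */
    size_t objsize;         /* object size rounded up to POOL_ALIGN */
    size_t numblocks;
} pool_t;

static size_t pool_round_up(size_t n)
{
    return (n + POOL_ALIGN - 1) & ~(size_t)(POOL_ALIGN - 1);
}

void Pool_Init(pool_t *p, size_t objsize)
{
    assert(objsize > 0);
    p->blocks = NULL;
    p->freehead = NULL;
    p->objsize = pool_round_up(objsize);
    p->numblocks = 0;
}

/* One malloc per block; every slot in it goes onto the free list. */
static int Pool_Grow(pool_t *p)
{
    size_t hdr = pool_round_up(sizeof(poolblock_t));
    unsigned char *raw = malloc(hdr + p->objsize * POOL_BLOCK_OBJS);
    size_t i;

    if (!raw)
        return 0;
    ((poolblock_t *)raw)->next = p->blocks;
    p->blocks = (poolblock_t *)raw;
    p->numblocks++;
    for (i = 0; i < POOL_BLOCK_OBJS; i++) {
        poolnode_t *n = (poolnode_t *)(raw + hdr + i * p->objsize);
        n->next = p->freehead;
        p->freehead = n;
    }
    return 1;
}

void *Pool_Alloc(pool_t *p)
{
    poolnode_t *n;

    if (!p->freehead && !Pool_Grow(p))
        return NULL;
    n = p->freehead;
    p->freehead = n->next;
    return n;
}

/* Returning an object never calls free(); the slot is just recycled. */
void Pool_Free(pool_t *p, void *obj)
{
    poolnode_t *n = obj;

    n->next = p->freehead;
    p->freehead = n;
}

/* On level change: release every block in one sweep. */
void Pool_Purge(pool_t *p)
{
    while (p->blocks) {
        poolblock_t *b = p->blocks;
        p->blocks = b->next;
        free(b);
    }
    p->freehead = NULL;
    p->numblocks = 0;
}
```

At 300 bytes per object a block is 4096 * 300 = 1,228,800 bytes, which is where the 1.2 MB figure comes from (slightly more here because of the rounding to 16 bytes).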

Once again, my concerns are with performance. I'm not suggesting that anyone use my port on a Win95 32 MB machine. In fact, I'm not suggesting it at all at this time, since I have not yet released it, or even named it :) KBDoom sounds silly...

Old Post 10-26-12 00:26 #