My question is: has anyone considered using this approach for their ports? Would there be any trade-offs, advantages or disadvantages? I've been considering doing this for the next version of Doom64EX and thought I'd ask what everyone else thinks of this method.
I have probably experimented with multithreaded rendering in Doom for longer than anyone, and I have considered several different models. However, in all of them, the gameplay "thread", so to speak, is the main thread, and at the end of a gameplay tic and ONLY then, the multi-threaded rendering starts, with all the possible variations (e.g. parallel walls + floors, 2 threads for walls, 1 for floors) and workload strategies (e.g. split walls by columns or segs, split floors by columns or visplanes etc.) that may apply. The problem with all of these approaches is that rendering is totally synchronous with gameplay code, as there is a serial dependence: a frame cannot be rendered until the tic that produces it has finished.
The model I used was a serial-fork-join one: Gameplay code runs first, ensuring that everything is in its proper place on the map. Then rendering code forks into as many rendering threads as it has been set to, those threads join back into the main thread, and a new gameplay tic cycle starts. There's no concurrency between gameplay and rendering, and since it's required to know where everything is before rendering a frame, there seems to be no escaping that...or is there?
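To make the serial-fork-join cycle concrete, here is a minimal sketch in Python. It is not code from any actual port: `run_gameplay_tic`, `render_columns`, the `world` dictionary and the column-splitting strategy are all made-up stand-ins, and the "rendering" is just arithmetic. What matters is the shape of the loop: gameplay first, then fork, then join, then the next tic.

```python
from concurrent.futures import ThreadPoolExecutor

SCREEN_WIDTH = 320          # assumed screen width, in columns
NUM_RENDER_THREADS = 4      # assumed number of rendering threads


def run_gameplay_tic(world):
    """Hypothetical stand-in for one tic of game logic."""
    world["tic"] += 1
    world["player_x"] += 1


def render_columns(world, first, last):
    """Hypothetical stand-in for drawing screen columns [first, last)."""
    return [(x + world["player_x"]) % 256 for x in range(first, last)]


def render_frame(world, pool):
    """Fork the frame into column ranges, then join the results in order."""
    step = -(-SCREEN_WIDTH // NUM_RENDER_THREADS)  # ceiling division
    ranges = [(x, min(x + step, SCREEN_WIDTH))
              for x in range(0, SCREEN_WIDTH, step)]
    # Fork: one task per column range. The world is only read here.
    futures = [pool.submit(render_columns, world, a, b) for a, b in ranges]
    frame = []
    for future in futures:
        # Join: result() blocks until that range has been rendered.
        frame.extend(future.result())
    return frame


def run(tics):
    world = {"tic": 0, "player_x": 0}
    frames = []
    with ThreadPoolExecutor(max_workers=NUM_RENDER_THREADS) as pool:
        for _ in range(tics):
            run_gameplay_tic(world)                   # serial part
            frames.append(render_frame(world, pool))  # fork + join
    return frames
```

Note that `run_gameplay_tic` never executes while any `render_columns` task is in flight, which is exactly the lack of gameplay/rendering concurrency described above.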
I recently hypothesized in another thread that a different "asynchronous tic/rendering" strategy might be possible: at the end of each gameplay tic, you copy ALL information required for rendering a frame starting from the map status to a temporary place, and start running a NEW gameplay tic IMMEDIATELY. Kind of like a "double buffering" of the map's objects' status. This copying strategy is already used in ports that do inter-frame interpolation, and it has the disadvantage that you need double the memory for map objects and their status. In large maps, this can be quite significant.
While this happens, you try rendering the screen based on the map info you just stored, on a separate thread. This allows actual gameplay-rendering concurrency.
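A rough sketch of the copy-then-continue idea, again in Python and again with invented names (`world`, `mobjs`, `render_frame`). It assumes a deep copy of the whole map state is good enough as the "temporary place", and it uses the simplest possible synchronization: wait for the previous frame before handing over the next snapshot.

```python
import copy
import threading


def run_gameplay_tic(world):
    """Hypothetical stand-in for one tic of game logic."""
    world["tic"] += 1
    for mobj in world["mobjs"]:
        mobj["x"] += mobj["momx"]


def render_frame(snapshot, frames):
    """Hypothetical renderer: reads ONLY the snapshot, never the live world."""
    frames.append((snapshot["tic"], [m["x"] for m in snapshot["mobjs"]]))


def run(tics):
    world = {"tic": 0,
             "mobjs": [{"x": 0, "momx": 2}, {"x": 10, "momx": -1}]}
    frames = []
    renderer = None
    for _ in range(tics):
        # While this tic runs, the renderer thread started at the end of
        # the previous iteration is still drawing the previous snapshot.
        run_gameplay_tic(world)
        if renderer is not None:
            renderer.join()               # previous frame must be finished
        snapshot = copy.deepcopy(world)   # the "second buffer" of map state
        renderer = threading.Thread(target=render_frame,
                                    args=(snapshot, frames))
        renderer.start()
    if renderer is not None:
        renderer.join()
    return frames
```

The deep copy is where the doubled memory cost comes from. A real port would presumably copy only the fields the renderer reads, rather than everything.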
How well might this work? If rendering and gameplay take more or less the same time, you might see a speedup of up to 2x (in theory). If one of them is significantly slower though, it will dominate total tic time, and improvements will be much lower. Worst case: no difference from serial, or a bit slower, because running two very different things at once hurts cache locality. Best case: a theoretical maximum speedup of 2x, if rendering and logic are equal in complexity and there are no cache effects.
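The bounds above follow from a back-of-the-envelope model: serial time per tic is gameplay plus rendering, while the overlapped time is whichever of the two is longer. The function below is only that model. The `copy_ms` term is my own assumption (the snapshot copy is charged as extra serial time), and cache effects are ignored completely.

```python
def async_speedup(gameplay_ms, render_ms, copy_ms=0.0):
    """Idealized speedup of overlapped tic/rendering over the serial model.

    Assumes perfect overlap and charges the snapshot copy as serial time.
    """
    serial = gameplay_ms + render_ms
    overlapped = max(gameplay_ms, render_ms) + copy_ms
    return serial / overlapped
```

For example, `async_speedup(10, 10)` gives 2.0, the best case. With a 1 ms tic and a 19 ms frame it gives 20/19, barely any gain, because rendering dominates. And if the copy costs as much as the shorter of the two phases, as in `async_speedup(10, 10, 10)`, the gain disappears entirely.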
You could STILL have multithreaded rendering in combination with asynchronous tic/rendering, which would be "asynchronous tic/multithreaded rendering", possibly making things a bit faster if rendering is the actual bottleneck.
Using such a model also changes Doom's speed throttling strategy dramatically (IMO it simplifies it), and you still need to arrange a way for tic & rendering to sync up at the end of a tic. A simple thread barrier would force EVERY tic to be rendered and destroy speed throttling. A semaphore would have to be used instead, so that e.g. if rendering takes too long, gameplay is allowed to 'run ahead' for an extra tic, and rendering only receives new data when it has finished rendering the previous content.
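One way the semaphore handoff could look, sketched in Python with two counting semaphores. All names are invented, and `time.sleep` stands in for the real work. Unlike the "one extra tic" described above, this version lets gameplay run ahead by as many tics as rendering needs, which is a simplification on my part.

```python
import copy
import threading
import time


def run(tics, tic_seconds=0.001, render_seconds=0.005):
    world = {"tic": 0}
    mailbox = {"snapshot": None}             # single handoff slot
    renderer_idle = threading.Semaphore(1)   # 1 = renderer can take a frame
    frame_ready = threading.Semaphore(0)     # 1 = a snapshot is waiting
    rendered = []

    def render_loop():
        while True:
            frame_ready.acquire()            # sleep until there is work
            snapshot = mailbox["snapshot"]
            if snapshot is None:             # None is the shutdown signal
                return
            time.sleep(render_seconds)       # stand-in for drawing
            rendered.append(snapshot["tic"])
            renderer_idle.release()          # ready for the next snapshot

    renderer = threading.Thread(target=render_loop)
    renderer.start()

    for _ in range(tics):
        world["tic"] += 1
        time.sleep(tic_seconds)              # stand-in for game logic
        # Non-blocking acquire: if the renderer is still busy, skip the
        # handoff and let gameplay run ahead instead of waiting for it.
        if renderer_idle.acquire(blocking=False):
            mailbox["snapshot"] = copy.deepcopy(world)
            frame_ready.release()

    renderer_idle.acquire()                  # wait for any frame in flight
    mailbox["snapshot"] = None
    frame_ready.release()                    # wake the renderer to shut down
    renderer.join()
    return rendered
```

`run(20)` returns the tic numbers that actually got rendered. With the default timings rendering is slower than a tic, so some tics are typically skipped, but the exact list depends on thread scheduling. Replacing the non-blocking acquire with a blocking one turns this back into the barrier behaviour, where every tic is rendered.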
Last edited by Maes on 08-07-12 at 22:36