Search In
• More options...
Find results that contain...
Find results in...

ZokumBSP development

Recommended Posts

Figured I should post/blog a bit about my ongoing development of ZokumBSP and the proces around it. I'll mostly cover thing that are specific to ZokumBSP, but there will be the occasional post about code and algorithms that can be used in other nodebuilders.

The current release version is 1.0.9. There's a beta version 1.0.10 bundled with Doombuilder X with a few more features.

Node building: An efficient way to find the boundaries of node entries.

The code inherited from ZenNode goes something like this:
Create a new node.
Find a partition line.
Go through all the segs on each side, figure out the maxX, maxY, minX and minY for each side.
Store that into the current node, process child nodes/sub sectors.

The method being run is FindBounds in ZenNode and is mostly called from two spots in CreateNode. In BSP the corresponding method is FindLimits and is also called from a method called CreateNode.

The problem with this approach is that it will have to go through every seg in the map on the first node. Then split it into two parts, and go over all those segs as well one more time. If there's an average depth of 8 in the tree, the initial segs in the map will have to be gone through 8 times.

So I figured I could do it faster by exploiting the fact that the node tree is a tree. One side of a node's boundary box would always be the same as the span of both of that sides of a node's children. It's easier to grasp from a bottoms up approach. If a node has two subsectors as children.
Child one spanning from 0 to 128 in X and from 0 to 128 in Y.
Child two spanning from 128 to 256 in X and from 0 to 384 in Y.

The parent of this node would then have that node span from 0 to 256 in X and 0 to 384 in Y.

So I decided to build a recursive method to iterate through the node tree and set all the boundaries in the complete tree. The Only time I would have to go through a list of segs would be for the sub sectors, since those are the leaves of the tree.

The code I came up with isn't exactly pretty, but it works, and reuses the old FindBounds method. It gets called just after CreateNode is called to make the root node of the node tree. Here's the code:

```void PostFindBounds( wNode *node) {

// If it's a subSector, calculate from segs!
// If not, recursively figure it out.

for (int nodeChild = 0; nodeChild != 2; nodeChild++) {
// printf("Checking side %d\n", nodeChild);

// If a child is a subsector, we can use the segs to get bounds
if (node->child[nodeChild] & 0x8000) {

int subSector = node->child[nodeChild] - 0x8000;

// printf(" FindBounds: child: %d, subsector: %d, first: %d, num %d\n", &node->side[nodeChild], subSector, ssectorPool[subSector].first, ssectorPool[subSector].num);

FindBounds( &node->side[nodeChild], &segStart [ssectorPool[subSector].first], ssectorPool[subSector].num);

} else { // The child is another node, we need to figure out it's bounds and then use the bounds

// printf(" PostFind recursive on node %d\n", node->child[nodeChild]);
PostFindBounds( &nodePool[node->child[nodeChild]]); // recursion

// printf("\nBefore: %d, %d, %d, %d\n", node->side[nodeChild].minx, node->side[nodeChild].miny, node->side[nodeChild].maxx, node->side[nodeChild].maxy );

// Check the children's two sides on this side to get this node's side (boundary box)
if (nodePool[node->child[nodeChild]].side[0].minx < nodePool[node->child[nodeChild]].side[1].minx) {
node->side[nodeChild].minx = nodePool[node->child[nodeChild]].side[0].minx;
} else {
node->side[nodeChild].minx = nodePool[node->child[nodeChild]].side[1].minx;
}

if (nodePool[node->child[nodeChild]].side[0].miny < nodePool[node->child[nodeChild]].side[1].miny) {
node->side[nodeChild].miny = nodePool[node->child[nodeChild]].side[0].miny;
} else {
node->side[nodeChild].miny = nodePool[node->child[nodeChild]].side[1].miny;
}

if (nodePool[node->child[nodeChild]].side[0].maxx > nodePool[node->child[nodeChild]].side[1].maxx) {
node->side[nodeChild].maxx = nodePool[node->child[nodeChild]].side[0].maxx;
} else {
node->side[nodeChild].maxx = nodePool[node->child[nodeChild]].side[1].maxx;
}

if (nodePool[node->child[nodeChild]].side[0].maxy > nodePool[node->child[nodeChild]].side[1].maxy) {
node->side[nodeChild].maxy = nodePool[node->child[nodeChild]].side[0].maxy;
} else {
node->side[nodeChild].maxy = nodePool[node->child[nodeChild]].side[1].maxy;
}

//printf("After:  %d, %d, %d, %d\n", node->side[nodeChild].minx, node->side[nodeChild].miny, node->side[nodeChild].maxx, node->side[nodeChild].maxy );

}
}
}```

Performance wise perf gave me the following data:

Conventional FindBounds approach:
0.12%  zokumbsp  zokumbsp           [.] FindBounds
PostFindBounds:
0.03%  zokumbsp  zokumbsp             [.] PostFindBounds

Roughly 4 times faster. It's not a very heavy part of the node builder process, but there's some nice benefits. This moved complexity out of CreateNode and into a separate post-step, tightening the inner core loop slightly. This is much more important when working on a multi-tree approach. The idea is there to compute the sub trees of two or more partition lines and compare them and pick the best. If boundaries are done inside CreateNode, a large majority of boundary boxes computed would be thrown away. Doing one final pass on the finished tree is a much more elegant solution.

The approach here could probably be used in other nodebuilders, at least the ones that derive from ZenNode and BSP. The reason the printf debugs have before and after lines is that I ran both the in-createnode approach and the post-approach to ensure the data was the same while bug fixing and tweaking.

I haven't updated here in a while, but I haven't been idle. There's a nice thread about an error with bad seg angles causing rendering errors. The best known being the star structure near the exit on map09. I am sure there are other similar errors on other maps. I made a post asking for other places things got screwed up without much of a reply apart from the map09, e1m1 ones and a vague reference to one on map30.

I've been working on a tool to tweak partition lines. If i can get it work manually, I'll eventually include the code into ZokumBSP. As always it will be optional to use it or not. I've named it partition tuning. The goal is to fine tune the partition line to reduce the amount of errors. Reducing errors should result in fewer/or smaller slime trails and other oddities.

The tool can't remove all slime trails, but it _might_ remove some. It should also hopefully allow for a bit of manual tweaking for the perfectionists. A given partition line might have slime trails no matter what, but with proper tuning, we might be able to change where they occur. Having a slime trail in the middle of a nice and important vantage point is a lot more annoying than having one inside a monster closet, dark area or one with similar textures leaking into eachother.

Similar textures would be rendered different if they are at different heights, but it would probably be a lot less jarring.

At the moment the tool can read a bunch of data from command line and compute how wrong the points are compared to the partition line. There might be some clever hacks that could reduce the chance of a slime trail in favor of slightly funky wobbly lines :D I'll see what I come up with. Since it's command line, it could be used manually on any existing map and with any existing node builder.

This is not a very user friendly tool. Don't expect guis or wizards or hand holding.

```./partitiontuning 3136 -3072 3990 -5370 3520 -3584 3408 -3432 3136 -3072
Partition data: 3136,-3072 | 3990,-5370
Line from: 3136,-3072 to 7126,-8442
Distance from point #1  ( 3520,-3584) to line: 2.86992630 ! Bad rounding, should have inserted a seg?
Distance from point #2  ( 3408,-3432) to line: 3.62328196 ! Bad rounding, should have inserted a seg?
Distance from point #3  ( 3136,-3072) to line: 0.00000000 - All good!```

Thi is just an early version, any automated tuning hasn't been applied.

I've been working on the partition tuning algorithm and here's some test data for e1m1 to try out.

The general idea is to add some of this code to be optionally turned on in zokumbsp. Some authors might prefer their maps to be built in this manner than the standard way. I am a firm believer in giving tools to mappers and let them decide what to should be done with their creations.

This is an early version and fairly hackish, and there might still be improvements to be had. It is incredibly slow due to a fair amount of brute force involved.

Use slade 3 or something similar for this. I chose e1m1 since we all know it and the slime trail near the zig-zag is easy to find. It's not perfect, but it's not half bad I think. Modify the following lumps in doom.wad e1m1.wad (save it as e1m1hax.wad).

NODES:
162: Partition X: 3272, Partition Y: -3352, Partition X Diff: 4020, Partition Y Diff: -5318.

(non purist tweak below)
VERTEXES

437: X: 3455, Y:-3494

SEGS:

438: Angle: -19128
473: Angle: -19176

Note that the angle already stored in those segs are probably a bit off due to the angle bug in DoomBSP. These are merely corrections to make it fit better with the slightly modified inserted vertex. If you're using a different editor than slade and it wants a proper unsigned BAM, just compute 65536 - angle to get the proper numbers. I think the value should probably be recomputed anyway to fix the bug.

This modification doesn't HAVE to have the vertex modified. If you want to be a purist, you should probably go with these values instead:

NODES:
162: 3277,-3259, 4009,-5304

Edited by zokum

I am working on optimizing a few things in ZokumBSP for the multi-tree algorithm.

The nodebuilder subsector creator algorithm has a sorting of the segs for in the order of the lowest linedef first, as some special effects require this. I am not sure which ones, if anyone knows, do tell me more about this. However, this is normally done for every subsector, and if this subsector is discarded, the sorting was pointless.

It shouldn't be hard to do the sorting afterwards. I hacked up a quick test where I computed map11 of doom 2 WITH the current sorting:

```MAP11*   Lump          Old         New     %Change      %Limit      Time Elapsed
Segs         1276  =>    1273      99.76%       3.88%
Subsecs       403  =>     366      90.82%       1.12%      335.310 secs```

And here is a build where i simply commented out the call to qsort:

```MAP11*   Lump          Old         New     %Change      %Limit      Time Elapsed
Segs         1276  =>    1273      99.76%       3.88%
Subsecs       403  =>     366      90.82%       1.12%      326.254 secs```

Currently ZokumBSP does a lot of stuff in-place when making the nodes structure. By delaying the computations till the main nodes structure is done, we can do those doing operations only once, instead of thousands of times. In most of the cases, the result will be discarded, due to another tree being better. This one is only ~2.5% faster, but if we can find a few more such optimizations, it starts to get very noticable.

In case anyone is wondering. Doing a complete nodebuild without width-algorithm of map11 takes 0.02 seconds with sorting in place. The sorting algorithm is only a tiny fraction of the time spent building the map.

There are a few more of these optimizations that are possible. These kinds of optimizations move work out of the main nodes-loop and into separate work loads that run sequentially afterwards. I reckon the next one, a possible vertex-optimization will reduce memory consumption and speed things up somewhat more noticably. They also reduce code complexity, so they're a win-win in more than one regard.

Edited by zokum

I recently added truecolor support which uses 'Romero Blue', similar to the color scheme in the original setup.exe. Using this also enables a gradient style progress bar. This will most likely only work on Linux.

So, I don‘t understand much of what you are writing, but I just quickly wanted to express my gratitude of the work you are doing. Even though I am not sure I can benefit from it, if I map in UDMF, I really enjoy people still trying to improve Doom things, after all these years.

Your point is actually pretty important. The nodebuilder process is mostly voodoo to the average mapper. There are pieces in my code with ZenNode origins I still don't want to touch. There hasn't been much research done on making better node builders. I've been working on and off for some time on improving the algorithms.

I have also tried to document my research in the manual and on the web site. It's not exactly brilliant, but people like @Vorpal have made guides and helped me with the web site etc. That stuff helps a lot!

If there's any documentation you feel is lacking, or concepts that could use some clarification, I might be able to do something with that :)

A lot of the color and output stuff is done to make it easier to understand the output. If you're working on making cutting-edge large doom maps, the %Limit-column will tell you exactly how close you are to the maximum size. I have also updated the manual with a bit of information. There will also be a wad showing a few special effects included.

I just finished a nice optimization. Map11 is now down to 291 seconds, from 326 seconds last week. There are still a few nice optimizations that can be done which will lower this even more. After that though, things will be harder to optimize.

I have one idea for an optimization I want to try out. If a seg is an edge (all the other segs in the map are on one side), it's unusable as a partition line. No matter how many subdivions we do further down, it will always be unusuable as a partition line. By flagging such lines, we could perhaps cut down on a lot of partition testing.

How the progress bar works for BSP generation (nodes).

Single-tree algorithms:
I have added a new way to calculate progress for the BSP calculations. This only applies for single tree computations. The new mode will take into account how many segs have been assigned to a subsector compared to the total amount of segs in the map. There is a short delay before it gets to the bottom of the tree and creates the first subsector. The good thing about this approach is that unbalanced trees do not make the progress bar progress unevenly. Segs splits will create more segs. To avoid a progress bar going backwards, progress is therefor only updated when the fraction of work done is higher. This results in a very smooth progress bar that accuratly reflects the progress with a fairly even speed.

Multi-tree algorithm.
The second method is used when the multi-tree algorithm is used and width is 2 or higher. Here it will look at how much of the tree has been computed, down to 7 levels deep. This method can give a somewhat erratic progress bar when the tree isn't balanced. A bigger unbalance leads to a more uneven progress. The progress bar in this case represents how much of the actual tree is complete, without considering the amount of children a node has, or the amount of segs. This is needed to avoid the progress bar going up and down when new subtrees are being tested, compared and possibly rejected or used as a replacement for the old best subtree.

Common for them both is support for colors invoked with -c. If one uses -cc instead, 24bit color output will be produced. In the current beta this is what 24bit color looks like:

Hopefully this makes it a bit more clear what exactly is going on and why the progress is a bit wonky when using the multi-tree algorithm.

24bit color will make it easier to color code the limit and changes data without having to resort to a few threshhold values. The reason that the reject data is green here is due to the value being slightly lower, but being rounded off to 100.00%. The real value is probably something like 99.99999596%. This is also something I will look into. If a value is lower, but rounded to 100.00% i could just display it as 99.99%.

On 9/24/2018 at 7:09 AM, zokum said:

Common for them both is support for colors invoked with -c. If one uses -cc instead, 24bit color output will be produced.

Any plans to add -ccc with HDR output?

It's a little thing, but that progress bar is beautiful.

@Zokum: Your developments and insights into node building are very much appreciated. One word of caution: Careful with those optimizations - they are, after all, the root of all computer evil! Some of the map tricks are very delicate, and very dependent on the relationship between the node builder, and the person testing their built trick maps. Some tricks might even depend on various node builder bugs. (I don't know of any off the top of my head, but I could imagine a scenario where a mapper was satisfied with how their map plays, only because a node builder bug allowed their special trick to work - heh).

You know what you're doing. It's just a friendly note from someone who likes to turn slow, nice code into blazingly-fast, unmaintainable nightmares :)

@kb1 I don't know of any such tricks either. I have however added a few. It's now really easy to speed up the speed on scrolling walls and make scolling walls have a type. You can now have scrolling walls that are switches, doors, etc easily.

I've also added some bam trickery, neat for secrets or the infinite sky trick. The documentation explains it all. There's more to come later :) I have some ideas, hehe.

Edited by zokum

Nice! I can't wait until I have time to check it out in depth.

I have a few broken things I need to fix for 1.0.10-rc1.

* Something breaks on 32bit systems or newer gcc, producing very odd nodes.
* Some minor errors here and there caught by linters that need fixing.
* Screwed up progress bar on non-vt220-style terminals. (Win32 etc).

Once these are handled, I think I will release 1.0.10-rc1.

I have already started working on a new case study, compiling data for multi-tree builds of most Doom 2 maps. There are some pretty interesting numbers.

I found he 32bit error. It now produces the same output as a 64bit executable does.

Added a new default tree assessment mode. It will now value 2 seg splits as equal to one extra subsector. A tree with 3 extra splits but one less subsector will therefore be considered a worse tree. This is now the default mode, not subsector reduction. This only applies for the multi-tree mode.

Minor tweaks to the output code made it slightly more readable on win32, but still an unromeroish mess. Still loads of debugging left to do.

The case study is chugging along. With the new 32bit fix, I have set up a 15 year old Pentium 4 machine to work on doom.wad. It's chugging along nicely. I am currently at a computer party, but It had been computing for about a day when I left and had already finished e1m5. E1m3 took over 13 hours. I was thinking this wouldn' take so long.....

Currently at the computer party and just logged in to my P4! E1m6 seems to go really slow. It's been at it for over 60 hours and is still only finished with 1.27% of the total bsp trees it needs to calculate. This looks like a long haul, maybe worse than the 6 days i spent on Doom 2 map21. Anyone got a spare super computer?

Just got back from the computer party and got my computer set up. Computations are now at 1.46%. Any wagers on how long this will take? :)

Edited by zokum

Balanced mode has had the minor speed problem fixed. Here are numbers from Doom 2 map11 with different metrics.

```m=b - Balanced:   Trees are better IF they score lower with the formula: SubsectorCount * 2 + SegCount.
m=s - Splits:     Trees are better IF they score lower with the formula: SegCount.
m=u - SubSectors: Trees are better IF they score lower With the formula: SubSectorCount.

./zokumbsp -b- -r- -na=mw=2m=b doom2.wad map11
MAP11*   Entry         Old         New     %Change      %Limit     Time Elapsed
Segs         1309  =>    1267      96.79%       3.87%
Subsecs       430  =>     368      85.58%       1.12%     2m 36s 898ms

./zokumbsp -b- -r- -na=mw=2m=s doom2.wad map11
MAP11*   Entry         Old         New     %Change      %Limit     Time Elapsed
Segs         1309  =>    1260      96.26%       3.85%
Subsecs       430  =>     375      87.21%       1.14%     2m 15s 707ms

./zokumbsp -b- -r- -na=mw=2m=u doom2.wad map11
MAP11*   Entry         Old         New     %Change      %Limit     Time Elapsed
Segs         1309  =>    1273      97.25%       3.88%
Subsecs       430  =>     366      85.12%       1.12%     2m 37s 207ms```

Depending on what you look for, different algorithms will give slightly different results. The old one is the doom2.wad data. Splits go slightly faster as it lends itself towards less data due to fewer lines to handle.

I've been busy coding, I think there will be a new release soon. Nothing exciting, but some important stuff and quality of life fixes.

*Sped up the bsp multi-algorithm a fair bit, 5-6 times at least.
*Big speedup of the default bsp algorithm, especially on the bigger maps.
*I have decided to turn on -nu as a default option. This fixes a lot of problems with invisible floors.
*There's been output fixes to make it easier to see how close many of the entries are near the limits.
* Many minor fixes and tweaks to the output. Hopefull this will work fine on both Linux and Windows.
*Better summary when using -sm (m=multiple).
*Switch to show all entry stats except totals, -sa (a=all).
*Minor color tweaks when using -c on Linux.

Here's what it looks like in 16 color mode:

As for speed: ./zokumbsp -b- -r- -na=mw=2m=b doom2.wad map11

This build previously took 2 minutes and 36 seconds for the BSP part, and now takes less than 23 seconds.

I've been ill for a week so development has crawled to a halt. Any spare brain cycles I have had has been used to watch the chess world championship. Here's what I plan to do for the next version:

* Dummy sector support.
* Fix 32bit blockmap support and use it internally for reject without optimizations to avoid problems.
* Mystery features that are fairly pointless.
* A few 24bit color tweaks to better colorize the output.
* Better "quiet" output settings for use from tools that do not display the text output anyway, like Doombuilder etc.

Surprise 25 year anniversary release. http://doom2.net/zokum/zokumbsp/fetch

Mostly speed optimizations and better debugging output.