Random Visual Glitch? in PrBoom+

I started noticing some weird visuals in PrBoom+: if I stand parallel-ish to a sector with a different floor or ceiling flat than the surrounding sectors, it creates this weird convex HOM-esque effect, and if I look at it from about a 45-degree angle it becomes concave. I already tried getting a new copy of PrBoom+ and that didn't fix it, and it's present in every WAD.

Pictured.


If you are referring to the way the ceiling texture doesn't match up with the physical scale of the upper walls -- welcome to the DOOM engine :P

Only two ports address this, and in different ways: Eternity and ZDoom. Eternity does it using the Cardboard floating-point renderer, by SoM.

Quasar said:

Only two ports address this, and in different ways: Eternity and ZDoom. Eternity does it using the Cardboard floating-point renderer, by SoM.

Because there are no more software ports! So it's better to say: all ports except software PrBoom are free of this issue :)

glboom(-plus), gzdoom, vavoom, edge, and doomsday are not affected, thanks to their hardware renderers (at least).


Right, but I didn't name those because (a) he obviously wasn't using a hardware port, and (b) the "fix" for it in hardware isn't explicit - it's just a consequence of using actual proper graphics primitives to draw floors and ceilings, instead of a system of hacked-up xtoviewangle and distance-for-viewangle lookups/functions with low-stability fixed-point math to scale and clip linedefs and textures and determine the edges of visplanes.
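For context, the fixed-point format everything above hinges on is Doom's fixed_t; here's a minimal sketch of the idiom (the 64-bit intermediates are how modern ports typically do it - vanilla used 32-bit assembly, and its FixedDiv also clamps on overflow, omitted here for brevity):

    #include <stdint.h>

    /* Doom's fixed_t: 16.16 fixed point (16 integer bits, 16 fraction bits). */
    typedef int32_t fixed_t;
    #define FRACBITS 16
    #define FRACUNIT (1 << FRACBITS)    /* 1.0 in fixed point */

    /* Multiply two 16.16 values; renormalize the 32.32 product back to 16.16. */
    static fixed_t FixedMul(fixed_t a, fixed_t b)
    {
        return (fixed_t)(((int64_t)a * b) >> FRACBITS);
    }

    /* Divide in 16.16; the quotient wraps around once |a/b| >= 32768,
       one of the "low-stability" failure modes described above. */
    static fixed_t FixedDiv(fixed_t a, fixed_t b)
    {
        return (fixed_t)(((int64_t)a << FRACBITS) / b);
    }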

entryway said:

Because there are no more software ports! So it's better to say: all ports except software PrBoom are free of this issue :)


Choco, Mocha... Maybe DelphiDoom? I haven't tried that one.

Gez said:

Choco, Mocha...

I meant "Ports", not modern analogues of vanilla which even do not touch resolution limitation

Yeah, DelphiDoom (software renderer) is affected by this issue too. Probably Legacy also.


Mocha supports high resolutions too, and yeah, since it's the same engine as vanilla (well, more like a bastardization of Linux Doom, Boom and even prBoom+ ;-), it exhibits the same types of visual bugs.



An interesting side-discovery was that the vanilla renderer alone is perfectly capable of scaling up to "regular" high resolutions without exhibiting any of the anticipated "mini HOM mess" problems I had been warned about, and without having to take any corrective measures.

This holds true at least up to vertical resolutions of 1200 lines. There must surely be a limit for a renderer using 16.16 fixed-point arithmetic, but it's well beyond what's displayable on average monitors or what most players use. What DOES look fugly already at vanilla resolutions is the vanilla Doom span rendering function: it does a half-assed "optimization" by crippling 16.16 fixed-point numbers down to 6.10 in order to do two operations in one go, and that distorts visplanes even in vanilla Doom.
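To make the 6.10 trick concrete: the idea is to truncate both flat-texture coordinates to 6.10 each (6 integer bits are enough for a 64x64 flat) and pack them into a single 32-bit word, so one addition steps both at once. A hypothetical demonstration of the drift that truncation causes along just one coordinate (not the actual span-function code):

    #include <stdint.h>
    #include <stdio.h>

    typedef int32_t fixed_t;
    #define FRACBITS 16

    int main(void)
    {
        fixed_t xfrac  = 0, xstep = 0x0000cfff;  /* ~0.81 texels/pixel in 16.16 */
        int32_t packed = 0, pstep = xstep >> 6;  /* the same step, crippled to 6.10 */

        for (int px = 0; px < 320; px++) {
            int full = (xfrac  >> FRACBITS) & 63;  /* texel chosen at 16.16 */
            int crip = (packed >> 10)       & 63;  /* texel chosen at 6.10  */
            if (full != crip)
                printf("pixel %3d: 16.16 says texel %2d, 6.10 says %2d\n",
                       px, full, crip);
            xfrac  += xstep;
            packed += pstep;
        }
        return 0;
    }

Already within one 320-pixel span the truncated coordinate drifts about a third of a texel behind the full-precision one, which is exactly the kind of distortion that gets worse at higher resolutions.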

entryway said:

doubling?


No, it has actual resolution scaling with a program-wide VideoScale setting that affects the initialization and size of e.g. visplanes, line buffers, and other resolution-dependent LUTs or buffer structures that normally depend on the hardcoded SCREENWIDTH and SCREENHEIGHT. It's been like that since, uhm... 1.4?

The SCREENWIDTH and SCREENHEIGHT "constants" themselves are not actually constants, but can be set independently for each module that can potentially be affected by resolution scaling (these implement the IVideoScaleAware interface).

The only limitation is that it's still tied to the original 320:200 aspect ratio and can only accommodate integer multiples of that resolution, but I'm working towards allowing arbitrary ratios as well. I just ain't sure what approach I should use for non-integer scaling of static graphics: ratio-preserving letterboxing/cropping, or ZDoom-like arbitrary scaling with distortion (e.g. think of the status bar or title screen)?

Actually, I was amazed when I discovered that it's VERY easy to implement high resolutions just by giving SCREENWIDTH and SCREENHEIGHT different constant values. Anything drawn by the RenderPlayerView function, as well as the automap, works smoothly with practically no further changes. That ain't the hard part at all.

The really hard part is that static graphics like menus, the status bar, HUD, fonts, etc. require a totally different system, as they are tied to hardcoded positions on a 320x200 screen and are not drawn by the renderer's colfuncs but by the fixed-scale ones in v_. Thus, you need to change each and every place in the code where they are drawn to work with an arbitrary scale system. Currently, I'm using a DrawScaledPatch function derived from the Eternity Engine, by _D_.
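The heart of such a function is just mapping virtual 320x200 coordinates onto the real framebuffer. A minimal sketch of the idea, with hypothetical names and a flat pixel array (the real patch_t column/post format and transparent pixels are omitted):

    #include <stdint.h>

    /* Draw a gw x gh graphic, anchored at virtual 320x200 coordinates
       (vx, vy), onto a width x height framebuffer with nearest-neighbor
       scaling. Not Eternity's actual DrawScaledPatch. */
    void DrawScaledGraphic(uint8_t *screen, int width, int height,
                           const uint8_t *gfx, int gw, int gh,
                           int vx, int vy)
    {
        int x0 = vx * width  / 320;   /* scale the 320x200 anchor... */
        int y0 = vy * height / 200;
        int dw = gw * width  / 320;   /* ...and the graphic's size */
        int dh = gh * height / 200;

        for (int y = 0; y < dh; y++)
            for (int x = 0; x < dw; x++)
                screen[(y0 + y) * width + x0 + x] =
                    gfx[(y * gh / dh) * gw + (x * gw / dw)];
    }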

I suppose that's the reason why e.g. fraggle chose not to implement true scaling in Chocolate Doom: yeah, he could pre-multiply SCREENWIDTH and SCREENHEIGHT and get the renderer and automap going in literally under a minute, but then he'd have to rewrite every place where static patches are drawn...and it's a bit hard to guarantee that this would not introduce side effects. There are forks (?) of Choco that do just that, though.

Here's a full resolution version of the above image.



The visplane rendering on top looks a bit like ass because it's using the 6.10-precision vanilla span function, whose accumulated error goes through the roof after drawing about halfway across the screen, but it's possible to use anything you like ;-) When using multi-threaded visplane rendering with strict startx/endx bounding, though (each thread renders only a set horizontal range of every visplane), this problem goes away, because the function "starts clean" within each range and even long spans get broken across different threads.
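A rough sketch of that bounding scheme, in hypothetical C with pthreads for illustration (Mocha Doom itself is Java; the span list and the ranged span drawer are assumed):

    #include <pthread.h>

    #define NUM_THREADS 4
    #define SCREENWIDTH 960

    /* Assumed externals: a list of spans (row y, columns x1..x2) and a
       drawer that recomputes its texture coordinates from x1. */
    extern int num_spans;
    extern int spans_y[], spans_x1[], spans_x2[];
    extern void DrawSpanRange(int y, int x1, int x2);

    typedef struct { int startx, endx; } band_t;

    /* Each worker owns a fixed horizontal band and draws only the part of
       every span inside it, so the span function "starts clean" at its own
       startx and error can't accumulate across the full screen width. */
    static void *BandWorker(void *arg)
    {
        band_t *b = (band_t *)arg;
        for (int i = 0; i < num_spans; i++) {
            int x1 = spans_x1[i] > b->startx   ? spans_x1[i] : b->startx;
            int x2 = spans_x2[i] < b->endx - 1 ? spans_x2[i] : b->endx - 1;
            if (x1 <= x2)   /* this span overlaps our band */
                DrawSpanRange(spans_y[i], x1, x2);
        }
        return NULL;
    }

    void RenderPlanesThreaded(void)
    {
        pthread_t th[NUM_THREADS];
        band_t bands[NUM_THREADS];
        for (int i = 0; i < NUM_THREADS; i++) {
            bands[i].startx = i * SCREENWIDTH / NUM_THREADS;
            bands[i].endx   = (i + 1) * SCREENWIDTH / NUM_THREADS;
            pthread_create(&th[i], NULL, BandWorker, &bands[i]);
        }
        for (int i = 0; i < NUM_THREADS; i++)
            pthread_join(th[i], NULL);
    }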


I wonder if there's some quick, hacky fix one could do for this in an out-of-the-box vanilla renderer. By the way, the term for this visual glitch is the "Longwall" error, even if it isn't happening on a long wall.


That's an interesting note, although the problem here would be more of a "shortwall" error. Perhaps the only thing they have in common is the loss of precision, but for different reasons and at different points altogether.

Obviously the renderer thinks it has to render more "spans" than necessary in a given screen zone before it even calls the actual drawing functions, so I doubt the solution is on the spanfuncs end. A small test I did by increasing the internal span fracx/fracy coordinates to 32.32 accuracy confirmed this.

The problem doesn't seem to be in the final rendering but, rather, in the door's linedefs themselves "jumping around" due to their being part of such a thin sector (it gets worse in ports with freelook, because it also extends to the similarly thin ceiling border in the big room). So what you actually see is the door's shape being distorted: sometimes thinner, sometimes flanged.

My guess is that any increase in accuracy should go into the R_MapPlane function, or even much earlier: into the internal accuracy of the linedefs or even the BSP nodes themselves. Using a slime trail remover may fix some of these occurrences, but not all.
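For reference, here's roughly where that math lives - a simplified paraphrase of vanilla's R_MapPlane from r_plane.c, with its row caching stripped (fixed_t and FixedMul as sketched earlier in the thread; every FixedMul is a spot where 16.16 precision can bite, and distance blows up for rows near the horizon):

    #define ANGLETOFINESHIFT 19

    /* Globals/tables as in vanilla: yslope[] and distscale[] are
       precomputed projection tables, and the ds_* variables feed the
       span-drawing function. */
    extern fixed_t planeheight, basexscale, baseyscale, viewx, viewy;
    extern fixed_t yslope[], distscale[], finecosine[], finesine[];
    extern fixed_t ds_xfrac, ds_yfrac, ds_xstep, ds_ystep;
    extern int ds_y, ds_x1, ds_x2;
    extern unsigned viewangle, xtoviewangle[];
    extern void (*spanfunc)(void);

    void R_MapPlane(int y, int x1, int x2)
    {
        fixed_t  distance = FixedMul(planeheight, yslope[y]);
        fixed_t  length   = FixedMul(distance, distscale[x1]);
        unsigned angle    = (viewangle + xtoviewangle[x1]) >> ANGLETOFINESHIFT;

        /* World-space texture origin of the span's first pixel. */
        ds_xfrac =  viewx + FixedMul(finecosine[angle], length);
        ds_yfrac = -viewy - FixedMul(finesine[angle], length);

        /* Texture-space step per screen pixel along the row. */
        ds_xstep = FixedMul(distance, basexscale);
        ds_ystep = FixedMul(distance, baseyscale);

        ds_y = y; ds_x1 = x1; ds_x2 = x2;
        spanfunc();
    }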


It is not precisely the same as LWE. When a vertex is off-screen behind the player, the math used to project the linedef becomes extremely unstable. The angular clipping system maps the vertex off the edge of the screen, and the tantoangle table is basically crap. So texture mapping coordinates behave in a mathematically unstable manner with respect to motion and rotation of the viewpoint while one vertex is behind the viewpoint, even on relatively short lines.
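The lookup in question, paraphrased from vanilla's tables.c/r_main.c (tantoangle[] has only 2049 entries, and the division below throws away eight bits of the denominator, so nearby deltas can snap to visibly different angles):

    #include <stdint.h>

    #define SLOPERANGE 2048
    typedef uint32_t angle_t;                   /* Doom's binary angle unit */
    typedef int32_t  fixed_t;

    extern angle_t tantoangle[SLOPERANGE + 1];  /* arctangent LUT, 0-45 degrees */

    static int SlopeDiv(unsigned num, unsigned den)
    {
        if (den < 512)                           /* near-zero denominator: clamp */
            return SLOPERANGE;
        unsigned ans = (num << 3) / (den >> 8);  /* coarse 11-bit quotient */
        return ans <= SLOPERANGE ? (int)ans : SLOPERANGE;
    }

    /* First octant of R_PointToAngle: dx/dy are deltas from the view point;
       the real function mirrors this into the other seven octants. */
    angle_t PointToAngleOctant0(fixed_t dx, fixed_t dy)
    {
        return tantoangle[SlopeDiv((unsigned)dy, (unsigned)dx)];
    }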

LWE, on the other hand, is genuine numeric overflow: a combination of high scales/long lengths and the limitations of fixed-point math. The end results are rather similar, however, so you get partial credit for the observation.

It is not possible to "bandaid" this problem out of the engine; you must change over to floating-point math and use a sane method of projection which makes mathematical sense when dealing with vertices behind the projection plane.


Perhaps project vertices behind the viewer onto the near view plane, along the vector of the seg/BSP partition which intersects said plane, and use those new coordinates instead?
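In floating point, that amounts to the standard near-plane clip. A minimal sketch of the idea in top-down view space (hypothetical helper, not taken from any of the ports mentioned):

    /* Clip a seg (p0 -> p1), already transformed into view space (x right,
       z forward), against the near plane z = ZNEAR, sliding the
       behind-the-viewer endpoint along the seg onto the plane. */
    typedef struct { float x, z; } vec2;

    #define ZNEAR 0.01f

    /* Returns 0 if the whole seg lies behind the near plane. */
    int ClipSegToNearPlane(vec2 *p0, vec2 *p1)
    {
        if (p0->z < ZNEAR && p1->z < ZNEAR)
            return 0;                             /* fully behind: reject */
        if (p0->z < ZNEAR) {                      /* p0 behind: move it up the seg */
            float t = (ZNEAR - p0->z) / (p1->z - p0->z);
            p0->x += t * (p1->x - p0->x);
            p0->z  = ZNEAR;
        } else if (p1->z < ZNEAR) {               /* symmetric case */
            float t = (ZNEAR - p1->z) / (p0->z - p1->z);
            p1->x += t * (p0->x - p1->x);
            p1->z  = ZNEAR;
        }
        return 1;    /* both endpoints now have z >= ZNEAR: sx = x / z is safe */
    }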

DaniJ said:

Perhaps project vertices behind the viewer onto the near view plane, along the vector of the seg/BSP partition which intersects said plane, and use those new coordinates instead?


That would fall neatly into the "sane projection method" category Quasar mentioned, although I'm not sure that alone would be enough to save the day without also using floating-point math.

For the latter in particular, don't expect any ports focused on performance optimization, or too close to vanilla/Boom, to adopt it.


I don't think that floating point performance is still an issue - unless you are dead set on keeping your program operable on 15+ year old dinosaur systems.

Graf Zahl said:

I don't think that floating point performance is still an issue - unless you are dead set on keeping your program operable on 15+ year old dinosaur systems.


Sounds too much like an older, somewhat fallacious argument that "making the renderer faster is meaningless, since you can only see/display X FPS", implying that those X FPS are somehow fixed/guaranteed no matter what.

Doing integer ALU operations is still faster than FP, op by op (an integer addition will always be faster than an FP addition, etc.), on almost any extant CPU architecture. That's why I mentioned "ports focused on performance" or "closer to the vanilla/Boom renderer". It's the same reasoning by which software bloat becomes acceptable, without that necessarily making it a good thing: eventually advances in hardware catch up with bloated software, and so one might miss the point of optimizing.
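A crude way to probe the op-by-op claim (a hypothetical microbenchmark; volatile keeps the compiler from folding the loops away, the dependent chains mean you're mostly measuring latency, and results vary a lot by CPU and compiler flags):

    #include <stdio.h>
    #include <time.h>

    #define N 100000000L

    int main(void)
    {
        volatile int    vi = 0;
        volatile double vd = 0.0;

        clock_t t0 = clock();
        for (long i = 0; i < N; i++) vi += 3;    /* dependent integer adds */
        clock_t t1 = clock();
        for (long i = 0; i < N; i++) vd += 3.0;  /* dependent double adds */
        clock_t t2 = clock();

        printf("int add:    %.2fs\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
        printf("double add: %.2fs\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
        return 0;
    }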

In the case of Doom in particular, there's an added problem: you need to spend time to fix something that ain't broken (well, OK, the vanilla Doom engine can hardly be called non-broken but the core functionality is still pretty valid, once you remove some pesky limits).


There are evidently some compatibility issues inherent in a switch to floating point, too: for Eternity's high-precision Cardboard renderer, SoM had to separate things out so that the game simulation uses the original numbers to retain compatible behavior, while the renderer uses the more precise floating-point ones to look better and avoid these sorts of glitches.

Maes said:

Doing integer ALU operations is still faster than FP, op by op (an integer addition will always be faster than an FP addition, etc.), on almost any extant CPU architecture.



But that will only help you if the integer-based code is so good that all the workarounds compensating for the missing floats don't drag it down. It also needs to make up a significant percentage of execution time.


If you end up saving 10% of time in code that runs 1% of the time, you've gained nothing, except making your code harder to maintain, which may ultimately hurt it by not allowing you to optimize it easily later.


At the risk of appearing self-contradicting, I largely agree: some aspects of the rendering code, taken in isolation, do indeed weigh much less than expected on the final rendering time, and are a bitch to parallelize or improve significantly in other ways. Amdahl's law, diminishing returns and all that.

E.g. running the Mochadoom v1.6 nuts.wad timedemo from CVS at 960x600: 60 fps when drawing everything, 90 fps when not calling the Things drawing code at all, and 70 fps when doing everything else (clipping etc.) but without calling the final colfuncs for Things. Parallelizing the Things code by a factor of two to four, quite predictably, led only to a 65-66 fps figure, with the absolute top performance cap being of course 70 fps (if drawing Things had no cost) and 90 fps (if computing, clipping, bounding and drawing Things had no cost).
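Those figures are consistent with Amdahl's law; a quick sanity check, plugging in the frame times derived from the fps numbers above:

    #include <stdio.h>

    int main(void)
    {
        /* 70 fps = everything except the Thing colfuncs (the serial part);
           the colfuncs themselves are the only part being parallelized. */
        double serial   = 1.0 / 70.0;
        double parallel = 1.0 / 60.0 - serial;

        for (int threads = 1; threads <= 4; threads *= 2)
            printf("%d thread(s): %.1f fps\n",
                   threads, 1.0 / (serial + parallel / threads));
        /* Prints 60.0, 64.6 and 67.2 fps - right in line with the observed
           65-66 fps and the 70 fps ceiling quoted above. */
        return 0;
    }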

So OK, even major improvements to a minor subsystem won't necessarily affect overall performance significantly (or even positively), but that still doesn't mean each part shouldn't be as fast as possible: while the overall gain from improving a single part is upper-bounded, there's unfortunately no such bound on the overall damage from degrading even a single part: it can drag EVERYTHING down, with no lower limit.


It ultimately comes down to what you are trying to build. If you are aiming for an open-ended, flexible renderer that will see you through years of development, with scope for as-yet-unimagined features, I would say that optimizing every last loop and process is completely counterproductive.

If however your feature set is set in stone (e.g., emulating the capabilities of vanilla DOOM's software renderer) then yeah, I agree that there is no reason not to optimize to the degree you suggest, Maes.


Not all integer ALU ops are faster than FPU ones any more. I have heard that integer divide units are no longer faster than fdiv. Additionally, the advent of compiler vectorizers has enabled some floating-point code to be automatically translated into streaming SIMD.

Eternity is compiled with SSE2 codegen enabled when it is built with Visual Studio 2008. I don't have benchmarks, but I can tell you there was an appreciable increase in speed when these optimizations were enabled - IIRC there was an average gain of 12 FPS in the final room of Sunder MAP14 after enabling SSE2, link-time code generation with full program optimization, favoring fast floating-point ops over accuracy, and setting inlining to compiler initiative (i.e., wherever the compiler believes it's faster, even if the inline keyword is not specified).
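For reference, that combination likely maps to the following MSVC 2008 switches (my reconstruction, not Eternity's actual project settings):

    cl /O2 /arch:SSE2 /fp:fast /Ob2 /GL ...
        (SSE2 codegen, fast floating point, inline any suitable,
        whole-program optimization at compile time)
    link /LTCG ...
        (link-time code generation)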

Quasar said:

Not all integer ALU ops are faster than FPU ones any more. I have heard that integer divide units are no longer faster than fdiv.


I would like to see some source for that. Not long ago I stumbled on an Intel spec for the FPU unit used in (Pentiums? Pentium IVs? I can't remember) which was a bitch to find, and which now I can't find anymore -_-

Edit: Here are the results I had extracted from it, in an older thread. Also quite the trip down memory lane, as I was just getting started tackling fixed_t in Mocha Doom, and someone mentioned that Eternity was just transitioning to floating point. Ahh, the memories ;-)

Some particularly heavy ones, like the trigonometric instructions, could take a hundred cycles or more, and FDIV/FMUL were nowhere near as efficient as 1 CPI (1-7 CPI for FMUL, more than 30 for FDIV, depending on the arguments).

I suppose they could be made completable in 1 CPI just like fixed-point ones if enough dedicated hardware is thrown at them (DSPs are particularly good at that), but faster than FP? Not unless internal parallelization like SIMD or somesuch helps speed things up, and even then it's very debatable whether it could match FP for similar calculations on similar hardware. For sporadic and irregularly paced instructions (e.g. two additions, then a multiplication, then a division, etc.) SIMD won't help much, either.

Of course, those specs and considerations were about the classic P5/x87 architecture, and at least for AMD-64, they may be no longer valid (AMD-64 even deprecates with x87, at least in 64-bit mode).

Maes said:

but faster than FP?


You know, when comparing FP vs. FP, so as to know whether FP is faster than FP or to the contrary it's FP that's faster than FP, it might be advantageous to lay off the alphabet soup for once and write "fixed point" and "floating point" in a non-ambiguous fashion. (FiP and FlP are acceptable if you're really lazy.)

Quasar said:

Eternity is compiled with SSE2 codegen enabled when it is built with Visual Studio 2008. I don't have benchmarks, but I can tell you there was an appreciable increase in speed when these optimizations were enabled - IIRC there was an average gain of 12 FPS in the final room of Sunder MAP14 after enabling SSE2

Do you see any speedup on Intel processors after enabling SSE?

Btw, what does a "gain of 12 FPS" mean? 4000+12 or 3+12?

Quasar said:

In this case the gain of 12 was the difference between irritating lag and playability.

http://prboom-plus.sf.net/ee_sunder_test.zip
eternity -file Sunder.wad -loadgame 0 -timedemo sunder

The latest release of EE, Core2Duo, default cfg, 1024x768f, no wipe

"Win32 Binaries" - 28.3 fps
"Win32 "Plus" Binaries" - 31.8 fps

I think this is just a difference between the VC6 and VC2008 compilers.

What difference do you get? What kind of processor? Can you compile two versions (with and without SSE) with VC2008? I ask because I think I will see no difference between the SSE and non-SSE versions on my Intel processor.

Btw, at the same position I get ~80 fps with prboom-plus. OpenGL is about 1.5-2x slower there, heh.


The code where I used fixed point instead of float (an rgb-hsv-rgb conversion) was about 2-8 times faster on both A64 and C2D.
The thing that shocked me was that short if/else statements are faster than "always calculate" operations, even simple ones.
E.g. min/max using the ?: crap beat arithmetic-voodoo min/max by a factor of nearly 3. Branch prediction algorithms seem to be good nowadays.
As far as I "know", using float for Quake was only fast because it was based on simultaneous int/FP execution and abusing the fload/fstore instructions - all masterfully hand-assembled by the mighty Abrash... beautiful times.
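For the curious, the two min/max idioms being compared are presumably along these lines (my reconstruction, not _bruce_'s actual code):

    #include <stdint.h>

    /* Branchy version: a compare plus a conditional move/jump; cheap
       whenever the branch predictor guesses right. */
    static int32_t min_ternary(int32_t a, int32_t b)
    {
        return a < b ? a : b;
    }

    /* "Arithmetic voodoo" version: branch-free bit trick. (a < b) is 0 or 1;
       negating it yields an all-zeros or all-ones mask, which selects
       between a and b without any jump. */
    static int32_t min_bits(int32_t a, int32_t b)
    {
        return b ^ ((a ^ b) & -(int32_t)(a < b));
    }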

_bruce_ said:

The code where I used fixed point instead of float (an rgb-hsv-rgb conversion) was about 2-8 times faster on both A64 and C2D

Is fast math enabled?

