dpJudas

  1. You have to generate a new BSP tree where all subsectors are convex. ZDoom includes its own node builder that it uses to generate such a tree - the BSP in the WAD cannot be used. From that step forward it is a matter of drawing each subsector floor/ceiling as a triangle fan (see the triangle-fan sketch after this list).
  2. I guess I'm not going to convince you. All I can say is that I've never seen a performance-critical codebase use 3 bytes for truecolor. Keep in mind you can't just store or load 3 bytes - you either have to do three 1-byte stores, or one short + one byte, or start shifting things around and store half-processed pixels (see the store sketch after this list). The cache miss is rare, while the cost of the awkward addressing has to be paid for every single pixel processed.
  3. The sprites also use 32-bit textures. The only exception to this rule is translated sprites, where the translation texture is still 8-bit. Light shading is applied per pixel in RGB space: out.rgb = (texture.rgb * lightshade + 127) >> 8. It does this using SSE instructions, where it can do 8 word multiplications (two pixels) in one instruction (see the SSE sketch after this list). Much more expensive than palette mode, but the only alternative would be using a huge lookup table, and given the size required I'm not sure it would be faster. Interesting idea doing precalculated copies of the same texture - I hadn't even considered that option. Unfortunately it would require 32 copies just to get the same number of shades as the palette renderer uses. It would probably be faster, but it would use a lot more memory and cause a quality loss (32 shades vs the current 256).
  4. It always uses 32-bit truecolor textures (the 8-bit ones are converted to 32-bit at load, including generating mipmaps). It can sample from them using either nearest or linear filtering.
  5. Just to be clear: my comments were meant for a software renderer implementation, such as the truecolor one in GZDoom. I only brought GPUs into the discussion as an illustration that 24-bit truecolor textures are always stored as 32-bit, to the degree that modern APIs do not even offer a 24-bit backing anymore. On the subject of caches and GPUs, they do have caches and they do use a NUMA architecture. This PDF has some nice images of the various caches and such.
  6. As far as I know, yes all and yes always. Memory addressing gets a lot simpler when you are dword aligned, as you can use shifting to calculate the address, i.e. (fracpos >> (FRACBITS - 2)) & ~3. With 3 bytes you'd have to use a multiply instruction (see the addressing sketch after this list). It also unlocks the opportunity to use vector instructions. On today's GPUs you can't even pick a texture/framebuffer format that is only 3 bytes - your choices are always 1, 2 or 4 bytes. I'm sure some hardware engineers somewhere ran the numbers and came to the conclusion that the addressing advantages of 4 bytes outweighed the cache cost. There must also be a reason why all compilers align their reads and writes. The interesting thing about mipmaps is that from a theoretical point of view they are both better for the cache and provide a higher quality result (less aliasing). However, the transition jump you describe makes them look pretty bad unless they are paired with linear mipmap sampling. I'm applying a slight texture bias in the GZDoom mipmap implementation to try to counter it, but of course if you look for it you will see the jumps. Overall though, I prefer the lower aliasing in a scene over not having the transition jumps.
  7. Truecolor renderers generally always use 4 bytes per pixel to keep things dword aligned. That gives you 16 pixels per 64-byte cache line rather than 21. As for drawing things far away, mipmaps were invented to fix exactly that cache problem (see the mip selection sketch after this list). Doom doesn't support them out of the box, but they are fairly easy to add (though as you said yourself, once a sprite is far away, the speed gains get a lot smaller, especially compared to the sprite setup and sorting costs).
  8. Well, keep in mind that the shadowmaps I added are applied to all dynamic lights in a scene. If the shadow casters were limited to only a few lights chosen by the mapper, the performance costs would be different. The cost of a spot light casting a shadow is roughly the same as having a camtexture in view. Omnidirectional lights are a bit more expensive, somewhere between 2-6 camtextures depending on what strategy is used to build the shadowmap. If the light source is stationary and actors are excluded, the shadowmaps can also be cached for sectors that aren't actively moving.
  9. The Software light mode is what the ZDoom software renderer uses and is probably the one closest to vanilla Doom. I personally prefer Software + Radial. The radial feature makes the diminishing effect circular around the player rather than strictly based on the depth in the scene (see the radial sketch after this list) - it is a subtle change, but for the better I think. As for the other light modes, from my point of view they are all just various versions of "wrong Doom light". Only if a map has been explicitly designed for them would I use them. I don't know the historical background for them, but none of them give the light as Doom intended it.
  10. Hey, that looks pretty cool actually. :)
  11. How does that effect work?
  12. Oldest personal files on my computer would be this: Too bad that I lost the source code...
  13. If the dynamic light source is of the attenuated type and the light source is behind the wall, then no light will be applied (see the plane-side sketch after this list). I'm not entirely sure if this also applies to light types based purely on distance - it might be the case.
  14. It is a good question. A drawer that doesn't use constant-Z also has to calculate the diminishing light for every pixel, though. Calculating the shade is a linear interpolation, I think, but there are some clamps in place to prevent it from going out of bounds (see the shade sketch after this list).
  15. The visplanes are stored initially as a list of columns to draw, but when drawing them it finds the row spans covered by those columns. It is done this way because everything in Doom is drawn in directions where the depth (Z) is constant: for walls that is columns, and for floors/ceilings that is rows. The constant depth is important for performance, as you'd otherwise have to do a division for every pixel drawn. As long as the depth is constant, you can calculate the UV coordinates at each end of the span and then do a linear interpolation between them (see the span sketch after this list).
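
A few sketches follow for the technical points above. First, the triangle-fan drawing from item 1. This is a minimal illustration in C++, not ZDoom's actual code; the Vertex type and DrawTriangle are placeholders. A convex polygon with n vertices always splits into n-2 triangles sharing the first vertex:

    struct Vertex { float x, y, z; };

    // Placeholder for whatever the rasterizer actually consumes.
    void DrawTriangle(const Vertex &a, const Vertex &b, const Vertex &c);

    // A convex subsector plane can be drawn as a fan around its first vertex:
    // (v0,v1,v2), (v0,v2,v3), ..., (v0,v[n-2],v[n-1]). Convexity guarantees
    // none of the fan triangles overlap or fold back on themselves.
    void DrawSubsectorPlane(const Vertex *verts, int count)
    {
        for (int i = 1; i + 1 < count; i++)
            DrawTriangle(verts[0], verts[i], verts[i + 1]);
    }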
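
The 3-byte store problem from item 2, spelled out (a sketch with invented names): a 4-byte BGRA pixel is a single aligned 32-bit store, while a packed 24-bit pixel has to be split into a 16-bit store plus an 8-bit store, or three 8-bit stores:

    #include <cstdint>
    #include <cstring>

    // 32-bit truecolor: one aligned store per pixel.
    inline void Store32(uint32_t *dest, uint32_t bgra)
    {
        *dest = bgra;
    }

    // Packed 24-bit truecolor: no instruction writes exactly 3 bytes, so it
    // becomes a short store plus a byte store. Two out of every four pixels
    // also straddle a dword boundary.
    inline void Store24(uint8_t *dest, uint32_t bgra)
    {
        uint16_t bg = static_cast<uint16_t>(bgra);  // B and G
        std::memcpy(dest, &bg, 2);                  // avoids an unaligned cast
        dest[2] = static_cast<uint8_t>(bgra >> 16); // R
    }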
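
The per-pixel shade from item 3 as SSE2 intrinsics. This is my own reconstruction of the formula out.rgb = (texture.rgb * lightshade + 127) >> 8, not GZDoom's actual drawer; two BGRA pixels are widened to eight 16-bit lanes so a single multiply covers both:

    #include <emmintrin.h> // SSE2
    #include <cstdint>

    // lightshade is assumed to be 0..256, with 256 meaning full brightness.
    // The alpha channel gets shaded along with the colors in this sketch.
    inline void ShadeTwoPixels(uint32_t *dest, const uint32_t *texel, uint16_t lightshade)
    {
        __m128i shade = _mm_set1_epi16(lightshade);
        __m128i fg = _mm_loadl_epi64(reinterpret_cast<const __m128i *>(texel));
        fg = _mm_unpacklo_epi8(fg, _mm_setzero_si128());       // 8 bytes -> 8 words
        fg = _mm_mullo_epi16(fg, shade);                       // 8 multiplies, 2 pixels
        fg = _mm_srli_epi16(_mm_add_epi16(fg, _mm_set1_epi16(127)), 8);
        fg = _mm_packus_epi16(fg, _mm_setzero_si128());        // 8 words -> 8 bytes
        _mm_storel_epi64(reinterpret_cast<__m128i *>(dest), fg);
    }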
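
The addressing difference from item 6 (FRACBITS is Doom's 16-bit fixed-point shift; the function names are mine): with 4 bytes per pixel the byte offset falls out of a shift and a mask, while 3 bytes per pixel needs a multiply and produces mostly unaligned addresses:

    #include <cstdint>

    enum { FRACBITS = 16 }; // Doom's fixed-point fraction bits

    // 4 bytes per pixel: (fracpos >> (FRACBITS - 2)) & ~3 shifts the texel
    // index left by 2 (times 4) and masks off the two leftover fraction bits
    // in the same operation - equivalent to (fracpos >> FRACBITS) * 4.
    inline const uint8_t *TexelOffset32(const uint8_t *texture, uint32_t fracpos)
    {
        return texture + ((fracpos >> (FRACBITS - 2)) & ~3u);
    }

    // 3 bytes per pixel: the offset is index * 3, a real multiply
    // (or shift plus add), and the result is rarely dword aligned.
    inline const uint8_t *TexelOffset24(const uint8_t *texture, uint32_t fracpos)
    {
        return texture + (fracpos >> FRACBITS) * 3;
    }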
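
On item 7's numbers: a 64-byte cache line holds 64 / 4 = 16 four-byte pixels versus floor(64 / 3) = 21 three-byte ones. A sketch of the mipmap side of the argument - selecting a smaller level once the sampling step exceeds a texel, with a bias to nudge the switch point as mentioned in item 6 (the names and rounding here are my assumptions):

    #include <algorithm>
    #include <cmath>

    // Picks a mip level from the texel step per screen pixel:
    // 1 texel/pixel -> level 0, 2 texels/pixel -> level 1, and so on.
    // The bias nudges where the visible level transition happens.
    inline int SelectMipLevel(float texelsPerPixel, int levelCount, float bias)
    {
        float level = std::log2(std::max(texelsPerPixel, 1.0f)) + bias;
        return std::clamp(static_cast<int>(level + 0.5f), 0, levelCount - 1);
    }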
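
Item 9's Software versus Software + Radial difference boils down to which distance drives the diminishing-light curve (a sketch only; the actual curve lives in the renderer's shade math):

    #include <cmath>

    struct ViewPos { float x, y, z; }; // view space, z = depth into the scene

    // Plain Software mode: light diminishes with scene depth, so the
    // falloff forms planes of constant Z in front of the player.
    inline float LightDistanceDepth(const ViewPos &p) { return p.z; }

    // Radial mode: light diminishes with true distance from the player,
    // so the falloff forms circles around the viewpoint instead.
    inline float LightDistanceRadial(const ViewPos &p)
    {
        return std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z);
    }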
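
The behind-the-wall case from item 13, as a plane-side test (a sketch; the names are mine, not the engine's):

    struct Vec3 { float x, y, z; };

    inline float Dot(const Vec3 &a, const Vec3 &b)
    {
        return a.x * b.x + a.y * b.y + a.z * b.z;
    }

    // True when the light sits on the back side of the surface plane,
    // in which case an attenuated light contributes nothing.
    inline bool LightBehindSurface(const Vec3 &lightPos, const Vec3 &point,
                                   const Vec3 &normal)
    {
        Vec3 toLight = { lightPos.x - point.x,
                         lightPos.y - point.y,
                         lightPos.z - point.z };
        return Dot(toLight, normal) <= 0.0f;
    }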
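
Item 14's clamped linear shade, in the same spirit (the parameters are placeholders, not the renderer's real values):

    #include <algorithm>

    // Shade as a clamped linear function of distance: a linear
    // interpolation from a base value, never allowed outside
    // [0, maxShade] so the lookup stays in bounds.
    inline int DiminishedShade(int baseShade, float distance, float slope, int maxShade)
    {
        int shade = baseShade + static_cast<int>(distance * slope);
        return std::clamp(shade, 0, maxShade);
    }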
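
Finally, the constant-depth span drawing from item 15 (function and variable names are mine). Because Z is fixed across a row span, the UVs at both ends are perspective-correct on their own, and the inner loop is pure linear interpolation with no per-pixel divide:

    #include <cstdint>

    // Draws one horizontal floor/ceiling span at constant depth.
    // (u0,v0) and (u1,v1) are the texture coordinates at x0 and x1.
    void DrawSpan(uint32_t *dest, int x0, int x1,
                  float u0, float v0, float u1, float v1,
                  const uint32_t *texture, int texWidth, int texHeight)
    {
        int count = x1 - x0;
        if (count <= 0)
            return;
        float ustep = (u1 - u0) / count;
        float vstep = (v1 - v0) / count;
        float u = u0, v = v0;
        for (int x = x0; x < x1; x++)
        {
            int tx = static_cast<int>(u) & (texWidth - 1);  // power-of-two wrap
            int ty = static_cast<int>(v) & (texHeight - 1);
            dest[x] = texture[ty * texWidth + tx];          // no divide per pixel
            u += ustep;
            v += vstep;
        }
    }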