Phml

GlBoomPlus - performance issues while monsters are asleep


So, I've got this map with a relatively large area not connected to the main level (roughly a 5000x5000 square, with nothing to break line of sight) and about a thousand monsters in it. As long as these monsters are dormant, the game is very choppy; as soon as they wake up, everything is smooth.

The thing that baffles me, and what I'm wondering here, is the following: when I monitor framerate with FRAPS, it doesn't show anything wrong. I'm getting at least 500 FPS all the time, and despite that everything is jerky. Is FRAPS not working properly with GlBoom+? Is that kind of slowdown some particularity that doesn't register on the framerate itself?

Here's the map, if it helps any.


It's very possible that the method FRAPS uses to count FPS is not accurate... Doesn't GLBoom+ feature its own FPS counter? I'd use that in addition to FRAPS's one and compare the results...


Sounds like a problem that a properly built REJECT map might help with. Since the monsters are idle, they are all doing lots of line-of-sight checks, which could be avoided with a valid REJECT map, provided the large disconnected area cannot be seen from the start area.
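
For anyone wondering why a REJECT map saves so much work, here is a minimal sketch of the lookup, modeled on the way the original Doom source tests the table before tracing a sight line. The function name and the plain byte array are assumptions made for illustration; this is not the exact prboom-plus code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the REJECT table says that nothing in sector `from`
 * can ever see anything in sector `to`, so the expensive line-of-sight
 * trace can be skipped entirely.
 * `reject` is the raw lump: one bit per (from, to) sector pair,
 * packed least-significant bit first. */
static bool reject_blocks_sight(const uint8_t *reject, int numsectors,
                                int from, int to)
{
    int pnum = from * numsectors + to;  /* index of the sector pair */
    int bytenum = pnum >> 3;            /* byte that holds the bit */
    int bitnum = 1 << (pnum & 7);       /* mask for the bit in that byte */

    return (reject[bytenum] & bitnum) != 0;
}
```

With an empty (all zero) table this never returns true, so every dormant monster falls through to the full sight check on every look.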


Fraps works fine with glboom-plus.

There is the IDRATE cheat for the internal FPS counter.

The game is probably choppy because of issues in the interpolation code. I have no idea how to fix it.


Out of curiosity, how fast is your computer? The map is silky smooth for me on a 2.6ghz P4, both before and after those monsters are woken.

natt said:

Out of curiosity, how fast is your computer? The map is silky smooth for me on a 2.6ghz P4, both before and after those monsters are woken.

There is a problem with interpolation in prboom-plus. Sometimes it is very noticeable. Compare gzdoom and glboom-plus on sunder.wad map10: GZDoom at 50 fps is smoother than glboom-plus at 90. You can also try sunder.wad -complevel 9 -warp 11. The fps is 70-80, but it feels like 30.

Some time ago I tried to understand why it happens. I logged interpolated view angles and timefrac values, but without positive result.


Thanks for the insight, everyone. For this particular map building a REJECT lump seems to do the trick, at first glance. I keep forgetting ZDBSP doesn't do that by default.

entryway said:

Some time ago I tried to understand why it happens. I logged interpolated view angles and timefrac values, but without positive result.


For sunder.wad map11:

lprintf("%d: %d\n", gametic, tic_vars.frac);

10: 42130
10: 46811
10: 51492
10: 60854
10: 63195
10: 65536

11: 56173
11: 60854
11: 65536

12: 44470
12: 49152
12: 49152
12: 56173
12: 60854
12: 65536
12: 65536
The first frame of every tic takes 80% of the time, and I think that is why prboom-plus isn't smooth even at 100+ fps on such levels. Is something wrong in prboom-plus? IIRC gzdoom has similar values, but gzdoom is still smooth.
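
To make the unevenness in that log easier to see, here is a small sketch that measures it. The helper names are hypothetical, and FRACUNIT is assumed to be 65536 (one full tic), matching the last value logged for each tic.

```c
#include <stddef.h>

#define FRACUNIT 65536  /* assumed: one full game tic in fixed-point units */

/* Percentage of the tic that had already elapsed when the first
 * frame of that tic was logged. */
static int first_frame_percent(int first_frac)
{
    return (int)((long long)first_frac * 100 / FRACUNIT);
}

/* Largest jump in frac between consecutive frames of one tic,
 * counting the jump from the tic start (frac 0) to the first frame. */
static int largest_gap(const int *fracs, size_t count)
{
    int prev = 0;
    int worst = 0;

    for (size_t i = 0; i < count; i++) {
        int gap = fracs[i] - prev;
        if (gap > worst)
            worst = gap;
        prev = fracs[i];
    }
    return worst;
}
```

For the tic 10 values, the first frame lands at 64% of the tic (42130 out of 65536), while the gaps between the remaining frames range from 2341 to 9362.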


What do you use for timing? GZDoom, like ZDoom, uses Windows' native timer callback functions.

I know that at one point ZDoom tried to replace it with a polling method which made things horribly worse.

entryway said:

...First frame of every tic takes 80% of time and that's because prboom-plus isn't smooth...

Hmmm. 40000 for the first frame and somewhere between 4000 and 8000 or so for subsequent frames... If you halve the number of monsters, does the first frame (~40000) get shorter, with the subsequent frames still taking about the same (4000 to 8000) amount of time?

EDIT: By your numbers, it looks to me like your AI update plus one frame of rendering is taking more than 1/2 a tic to complete.

Doesn't that mean that the best you can get without choppiness is 1 frame/tic (35 fps)?

I've never tried to implement any interpolation, but it seems to me that, for it to be smooth, it needs to run at a multiple of the AI's speed (35 fps, 70 fps, 105 fps, etc).

Maybe this could be implemented:

frames_per_tic = (int)(65536/first_frame_frac)
So, if first_frame_frac = 8192, you can render 8 frames that tic.
Also:
next_interpolation_frac = currentfrac + 65536/frames_per_tic
Something like that would spread out the frames evenly across the tic.
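
Spelled out in C, the idea above could look like the sketch below. It is only an illustration of the two formulas, not tested inside prboom-plus; the clamping and the guard against bad input are additions that the pseudocode does not have, and FRACUNIT is assumed to be 65536.

```c
#define FRACUNIT 65536  /* assumed: one full game tic in fixed-point units */

/* How many evenly spaced frames fit into one tic if every frame is
 * budgeted as much time as the first (slowest) frame needed. */
static int frames_per_tic(int first_frame_frac)
{
    /* Guard (not in the pseudocode): a first frame that is out of
     * range leaves room for exactly one frame. */
    if (first_frame_frac <= 0 || first_frame_frac > FRACUNIT)
        return 1;
    return FRACUNIT / first_frame_frac;
}

/* Interpolation position of the next frame, one even step later,
 * clamped so that it never runs past the end of the tic. */
static int next_interpolation_frac(int current_frac, int frames)
{
    int next = current_frac + FRACUNIT / frames;

    return next > FRACUNIT ? FRACUNIT : next;
}
```

With first_frame_frac = 8192 this gives 8 frames per tic, spaced 8192 apart. With the 42130 from the log it gives a single frame per tic, which is plain 35 fps.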

EDIT 2: Come to think of it, this gets more tricky when you also consider tearing (vertical sync). For example, maybe you don't render the first frame of a tic at all, but rather, render the interpolated frac nearest to a vsync. In other words, it makes no sense to render more frames than your monitor's refresh rate, but, the frames you do render should be as evenly spaced as possible.

It is reasonable (I think) to assume that anyone's monitor will refresh at 35hz or more. Anticipating when a vertical sync will occur could help you decide which frame to render (and more importantly which frames not to render). And, the frame you render might not be the first frame in a tic, it might be the interpolated one.

I think the goal is to get all those timings you posted to be evenly spaced (even at the expense of dropping frames).

Let's consider a 60hz monitor. Assuming a vertical refresh occurs at tic 0, frac 0, the next one will occur at tic 0, frac 38229, which is (35/60)*65536. That number can be considered a constant goal to strive for (in this Doom session, for this monitor).

If 1 unit of AI time + render time + interpolation time < 38229, you can smoothly render at your monitor's refresh rate.

However, the posted example takes 42130 fracs to run (mostly AI time?), so you'll miss a frame, there's no way around that, unless you could somehow interpolate during an AI update (ouch - I wouldn't try it). About the only sensible thing to do at that point is decide to refresh every other frame, which, for a 60hz monitor, puts you back to 35hz. Lame? yes, but it would be smooth.
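
The budget check described above can be sketched like this. The function names are made up for the example, and it assumes a 35 Hz tic rate with FRACUNIT = 65536.

```c
#include <stdbool.h>
#include <stdint.h>

#define FRACUNIT 65536  /* assumed: one full game tic in fixed-point units */
#define TICRATE  35     /* game tics per second */

/* Length of one monitor refresh interval, expressed in tic fracs. */
static int refresh_interval_frac(int refresh_hz)
{
    return (int)((int64_t)TICRATE * FRACUNIT / refresh_hz);
}

/* True when the AI update plus the first frame of a tic finish
 * before the next vertical refresh. */
static bool fits_refresh(int first_frame_frac, int refresh_hz)
{
    return first_frame_frac < refresh_interval_frac(refresh_hz);
}
```

At 60hz the interval comes out as 38229 fracs, so a first frame of 8192 fits, while the 42130 from the log does not.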


Yuck!

I doubt that this will work.

The interpolation code is meant to handle such variances. Otherwise it'd be useless. And since it works fine in ZDoom there must be something different in PrBoom+ - and this difference needs to be found to fix it.

Graf Zahl said:

What do you use for timing? GZDoom, like ZDoom, uses Windows' native timer callback functions.

SDL_GetTicks(). Internally it uses timeGetTime() for Windows (with timeBeginPeriod(1) of course).

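kb1, some time ago I implemented an alternative method (test_interpolation_method = 1):

    if (interpolation_method == 0)
      /* wall-clock method: fraction of the current tic elapsed since it started */
      frac = (fixed_t)((now - tic_vars.start) * FRACUNIT / tic_vars.step);
    else
      /* fixed-step method: position derived from the subframe index and the target fps */
      frac = (unsigned int)((float)FRACUNIT * TICRATE * subframe / renderer_fps);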
It is smoother in some situations.

Graf Zahl said:

Yuck!

I doubt that this will work.

The interpolation code is meant to handle such variances. Otherwise it'd be useless. And since it works fine in ZDoom there must be something different in PrBoom+ - and this difference needs to be found to fix it.

Oh, yeah, definitely yuck. I guess I'll have to give it a try. And, certainly, if your AI time is longer than the time a physical frame stays on screen, you will not be able to render at that frame rate. But if you want a constant frame rate, you'll have to render at some constant frame rate...
My theory suggests that even rendering a constant 35fps may not look smooth, because of the monitor frame rate. Monitor refresh rates are typically not divisible by 35.
At 60hz, you can display the first frame, but you'll need to show the second frame at 60hz, not 70. So, you could display a 60/70th interpolated version of the first frame.

There are a few cases to consider:
1. Your AI is faster than the monitor refresh rate.
2. Your AI is faster than 35hz, but slower than the monitor refresh rate.
3. Your AI is slower than 35hz.

Case 1 is easy, just render your interpolated frame at the monitor refresh rate.
Case 3 is easy, ignore the interpolation (to save time), and just render when the frame's ready.
Entryway's case falls into the tricky case 2. The question to consider is: Does it look bad to render one slow frame, followed by a bunch of faster frames, or should we strive to always render at the same speed? Or a third option, should we introduce a small delay after the first slow frame, to attempt to arrive somewhere in the middle?

Although it is more complicated than one general algorithm, it may be worthwhile to test dynamically which case you're currently in and use the appropriate rendering method for each. I am of the opinion that anything you try to do to improve case 2 should be kept separate from cases 1 and 3, otherwise it may have a negative effect on those simple cases.
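
A rough sketch of such a dynamic test, following the three cases listed above. Everything here is hypothetical (the names, the enum, and the use of the AI time measured in tic fracs as the input), with a 35 Hz tic rate and FRACUNIT = 65536 assumed.

```c
#include <stdint.h>

#define FRACUNIT 65536  /* assumed: one full game tic in fixed-point units */
#define TICRATE  35     /* game tics per second */

enum ai_case {
    AI_FASTER_THAN_REFRESH = 1,  /* case 1: render at the monitor refresh rate */
    AI_BETWEEN_RATES       = 2,  /* case 2: the tricky one */
    AI_SLOWER_THAN_TIC     = 3   /* case 3: skip interpolation */
};

/* Classify one tic from the time its AI update took, in tic fracs. */
static enum ai_case classify_tic(int ai_frac, int refresh_hz)
{
    /* One monitor refresh interval expressed in tic fracs. */
    int refresh_frac = (int)((int64_t)TICRATE * FRACUNIT / refresh_hz);

    if (ai_frac > FRACUNIT)
        return AI_SLOWER_THAN_TIC;      /* AI cannot even keep up with 35hz */
    if (ai_frac < refresh_frac)
        return AI_FASTER_THAN_REFRESH;  /* AI fits inside one refresh */
    return AI_BETWEEN_RATES;
}
```

On a 60hz monitor, the 42130 from entryway's log lands in case 2, while something like 8192 would be case 1.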

Graf, it's interesting to note that GZDoom seems superior in this case. Do you render as quick as possible, or do you limit render speed based on timing? (Which I think may look good.)

I started writing some pseudocode, but, this is something that you have to experiment with - and experiment with maps that cover all three cases. Good luck!
