Doomworld Forums > Classic Doom > Source Ports > PrBoom-Plus running slow
All times are GMT.
Vordakk
Banned


Posts: 975
Registered: 07-10


Sorry if this has been addressed previously; I tried to do some searches but couldn't find anything.

I'm using PrBoom-Plus a lot these days, and although I really like it, I've noticed that it runs a great deal slower than ZDoom. I'm using the latest version, btw. I have it configured to NOT be graphics intensive (no glBoom) and to emulate original DOOM graphics and behaviors, with "uncapped framerate" selected. However, no matter what wad I play using PrBoom-Plus, including any of the iwads, it is quite a bit slower and laggier than ZDoom. I seriously doubt that it's my system: I have a fast system for starters, and secondly ZDoom is never slow, even with very large wads. Is there anything I can do to speed up PrBoom-Plus, or might there be some odd option that I could disable to make it run quicker?

Old Post 07-26-12 17:53 #
Quasar
Moderator


Posts: 6124
Registered: 08-00


ZDoom has magical 999 FPS. It can be running a level with 50000 monsters all infighting and doesn't even blip. Nobody can scientifically explain it :P

Old Post 07-26-12 18:56 #
Grazza
Let's try Caesium


Posts: 12496
Registered: 07-02



Vordakk said:
I have it configured to NOT be graphics intensive (no glBoom)
Actually gl mode should be much faster on most systems.

But if you want software rendering, then perhaps you could post your cfg (or at least the business end of it), and it might be possible to offer more specific help. You might want to check if you have enhanced monster AI enabled, and things like that which slow things down rather needlessly. If you are using a vanilla or Boom complevel (recommended for maps that don't use MBF features), then these will be automatically disabled anyway.

Old Post 07-26-12 19:02 #
Vordakk
Banned


Posts: 975
Registered: 07-10


I use -complevel 9 with 1680x1050 resolution. I mostly use it to test wads I'm making or to play old stuff like AV or Eternal DOOM.

Old Post 07-26-12 19:17 #
Archy
Forum Regular


Posts: 672
Registered: 11-09



Quasar said:
ZDoom has magical 999 FPS. It can be running a level with 50000 monsters all infighting and doesn't even blip. Nobody can scientifically explain it :P


Strange, PrBoom-Plus lags much less than ZDoom for me when playing PWADs like Nuts.wad.

Old Post 07-26-12 19:44 #
Sodaholic
I feel justified yet disgusted with myself at the same time


Posts: 2932
Registered: 04-07


IIRC, ZDoom has a garbage collector, which somehow speeds things up significantly by removing redundant data in the RAM, or something along those lines, whereas other ports don't have anything like that. You'd have to ask either Graf, Randy, Gez, Blzut3, or any people like that who understand it better than I to explain it to you in more detail/accuracy.

Old Post 07-26-12 21:17 #
Gez
Why don't I have a custom title by now?!


Posts: 11385
Registered: 07-07


Actor processing is more intensive in ZDoom since there are a bazillion features that ZDoom has to take into account that PrBoom+ doesn't.

However, there are other things that are much faster in ZDoom. For example, running traces uses a quick blockmap algorithm instead of traversing the BSP tree. PrBoom+ cannot use that faster method because it'd desync demos. Try complevel 0 to force use of that method and see if you get some speedup. If so, that's probably it.

Old Post 07-26-12 21:25 #
Vordakk
Banned


Posts: 975
Registered: 07-10


Wow, thanks! I had no idea there was so much behind-the-scenes disparity between the two source ports. This explains a lot.

Old Post 07-26-12 22:30 #
entryway
Forum Staple


Posts: 2739
Registered: 01-04



Vordakk said:
I use -complevel 9 with 1680x1050 resolution. I mostly use it to test wads I'm making or to play old stuff like AV or Eternal DOOM.

I think zdoom is the fastest port at high resolutions on levels that are not stress tests like nuts.wad, sunder.wad, etc. I've tested on av.wad map01 and dv.wad map03 at 1920x1080 and zdoom is ~15% faster than prboom-plus on both. At 640x480 prboom-plus is faster.

BTW, for some reason zdoom does not show my native 1920x1080 resolution in its resolution list, so I forced it manually through the cfg.

Last edited by entryway on 07-27-12 at 09:35

Old Post 07-27-12 09:28 #
Gez
Why don't I have a custom title by now?!


Posts: 11385
Registered: 07-07


If the speedup is in rendering (pure game logic has no reason to be affected by screen resolution changes), then it must be thanks to Randy's hyper-optimized assembly code.

Old Post 07-27-12 13:27 #
Ladna
Member


Posts: 309
Registered: 04-10


I say we blame SDL.

Old Post 07-30-12 20:00 #
Quasar
Moderator


Posts: 6124
Registered: 08-00


SDL_Flip can be slow as hell, that's for sure. Nothing like seeing 80% of program execution time being spent in a library call.

The difference when running with the GL backend with EE is pretty amazing. When the ARB PBO extension is enabled, it's even possible to have asynchronous screen updates, so the call returns immediately and some of the work of pushing it down to the card and out to the screen at the next refresh happens on some system thread I don't have to be concerned with.
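For anyone who wants to check a figure like that 80% on their own port, here is a minimal sketch of timing the draw step and the flip step separately. The `render` and `present` callbacks and the `frame_profile` struct are hypothetical names, not taken from any port. Note that `clock()` measures CPU time, so a flip that sleeps while waiting for vsync would be undercounted; a wall-clock timer would be needed for that case.

```c
#include <assert.h>
#include <time.h>

/* Accumulated CPU time, in clock() ticks, for the two halves of a frame. */
typedef struct {
    clock_t render;   /* time spent drawing into the software buffer */
    clock_t present;  /* time spent in the flip/blit call */
} frame_profile;

/* Runs one frame and adds its timings to *p.
   render and present are hypothetical callbacks supplied by the port. */
static void profile_frame(frame_profile *p,
                          void (*render)(void), void (*present)(void))
{
    assert(p && render && present);
    clock_t t0 = clock();
    render();
    clock_t t1 = clock();
    present();
    clock_t t2 = clock();
    p->render  += t1 - t0;
    p->present += t2 - t1;
}

/* Fraction of the measured time spent presenting, in the range 0..1.
   Returns 0 when nothing has been measured yet. */
static double present_share(const frame_profile *p)
{
    double total = (double)p->render + (double)p->present;
    return total > 0.0 ? (double)p->present / total : 0.0;
}
```

Calling `profile_frame` once per frame for a few hundred frames and then reading `present_share` gives a rough ratio; a real profiler will be more precise.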

Old Post 07-31-12 18:56 #
Csonicgo


Posts: 4458
Registered: 03-04



Ladna said:
I say we blame SDL.


Seconded. SDL may seem OK on a normal machine, but on an older machine (or a lower-speed CPU with lower IPC) SDL is the biggest bottleneck ever.

Old Post 08-01-12 01:01 #
Chu
Forum Regular


Posts: 746
Registered: 10-02



Quasar said:
The difference when running with the GL backend with EE is pretty amazing. When the ARB PBO extension is enabled, it's even possible to have asynchronous screen updates, ...


Ah, you use a pixel buffer, interesting. I'm assuming that speeds things up quite a bit. Something I'd be interested in doing for other ports.

__________________
3DGE source port

Old Post 08-01-12 04:35 #
_bruce_
Senior Member


Posts: 1312
Registered: 11-07


So the 'flip' stuff is where all the time gets lost. Before I measured the cycles of functions I always thought the upscaling of the 320x200 screen (as used in choco) was slow, and that the speedup via SSE2 in the even cases (x2, x4, x6, x8) was negligible.
But after some checking I discovered to my dismay that SDL has a serious handbrake somewhere in the API.
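For reference, a minimal sketch of the kind of integer upscaling being discussed: nearest-neighbour scaling of an 8-bit buffer in plain C, with no SSE2. This is not Chocolate Doom's actual code; the function name and parameters are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nearest-neighbour upscale of an 8-bit src_w x src_h image by an
   integer factor. dst must hold (src_w * factor) * (src_h * factor)
   bytes, stored row by row with no padding. */
static void upscale_nearest(const uint8_t *src, int src_w, int src_h,
                            uint8_t *dst, int factor)
{
    assert(src && dst && factor >= 1);
    const size_t dst_w = (size_t)src_w * (size_t)factor;

    for (int y = 0; y < src_h; y++) {
        const uint8_t *src_row = src + (size_t)y * (size_t)src_w;
        uint8_t *first = dst + (size_t)y * (size_t)factor * dst_w;

        /* Widen one source row: each pixel becomes `factor` pixels. */
        for (int x = 0; x < src_w; x++)
            memset(first + (size_t)x * (size_t)factor, src_row[x],
                   (size_t)factor);

        /* Copy the widened row to the remaining factor-1 output rows. */
        for (int r = 1; r < factor; r++)
            memcpy(first + (size_t)r * dst_w, first, dst_w);
    }
}
```

Scaling 320x200 by 4 this way writes 1280x800 bytes per frame, which is small next to what the flip has to push, so it fits the observation that the scaler is not where the time goes.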

Old Post 08-01-12 11:46 #
entryway
Forum Staple


Posts: 2739
Registered: 01-04



Quasar said:
SDL_Flip can be slow as hell, that's for sure. Nothing like seeing 80% of program execution time being spent in a library call

This is what SDL does for SDL_Flip():

code:
HDC hdc, mdc;
int i;

hdc = GetDC(SDL_Window);
if ( screen_pal ) {
    SelectPalette(hdc, screen_pal, FALSE);
}
mdc = CreateCompatibleDC(hdc);
SelectObject(mdc, screen_bmp);
for ( i=0; i<numrects; ++i ) {
    BitBlt(hdc, rects[i].x, rects[i].y, rects[i].w, rects[i].h,
           mdc, rects[i].x, rects[i].y, SRCCOPY);
}
DeleteDC(mdc);
ReleaseDC(SDL_Window, hdc);


I have these values for software 1600x1200 without status bar on map01

code:
         software    GL backend
8bit     140 fps     85 fps
32bit    54 fps      72 fps


Just tested zdoom on my home computer: 120 fps. On my work computer zdoom was faster at 1920x1080.

Last edited by entryway on 08-02-12 at 19:41

Old Post 08-02-12 18:56 #
tempun
Member


Posts: 597
Registered: 08-09


I wonder if using a shader for palette conversion could make the GL backend faster.

Old Post 08-04-12 16:47 #
entryway
Forum Staple


Posts: 2739
Registered: 01-04



tempun said:
I wonder if using a shader for palette conversion can make GL backend faster.


I replaced the correct filling of the w*h*4 buffer for GL with memcpy(buffer, pixels, w*h) and there is no fps improvement at 1600x1200 or 640x480. That is without any shaders at all.

code:
void UpdatePixels(unsigned char* dst)
{
  int x, y;
  unsigned int *pal = (unsigned int*)(vid_8ingl.colours + 256 * vid_8ingl.palette * 4);

  if (V_GetMode() == VID_MODE8)
  {
#if 1
    memcpy(dst, (byte*)vid_8ingl.screen->pixels, vid_8ingl.screen->pitch * REAL_SCREENHEIGHT);
#else
    for (y = 0; y < REAL_SCREENHEIGHT; y++)
    {
      byte *px = (((byte*)vid_8ingl.screen->pixels) + y * vid_8ingl.screen->pitch);
      int *py = ((int*)dst) + y * vid_8ingl.width;
      for (x = 0; x < REAL_SCREENWIDTH; x++)
      {
        *(int*)py = pal[*(byte*)px];
        px += 1;
        py += 1;
      }
    }
#endif
  }
  else if (V_GetMode() == VID_MODE15 || V_GetMode() == VID_MODE16)
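For readers who don't have the PrBoom-Plus source at hand, the disabled `#else` branch above is a palette expansion: every 8-bit index is looked up in a 256-entry table and written out as a 32-bit pixel. A standalone sketch of that step, with made-up names and explicit row strides instead of the `vid_8ingl` globals, might look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expands an 8-bit indexed image to 32-bit pixels through a palette.
   src_pitch is the source row stride in bytes, dst_width the destination
   row stride in pixels; both may be larger than the visible width. */
static void expand_indexed(const uint8_t *src, size_t src_pitch,
                           uint32_t *dst, size_t dst_width,
                           const uint32_t palette[256],
                           int width, int height)
{
    assert(src && dst && palette);
    assert(src_pitch >= (size_t)width && dst_width >= (size_t)width);

    for (int y = 0; y < height; y++) {
        const uint8_t *in = src + (size_t)y * src_pitch;
        uint32_t *out = dst + (size_t)y * dst_width;
        for (int x = 0; x < width; x++)
            out[x] = palette[in[x]];   /* one table lookup per pixel */
    }
}
```

The loop does one lookup and one store per pixel, which is cheap compared with moving the finished 4-bytes-per-pixel buffer to the card. That is consistent with the memcpy experiment showing no gain.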

Old Post 08-04-12 17:28 #
Ladna
Member


Posts: 309
Registered: 04-10


I wish there were a free software, professional-grade abstraction layer that does what SDL claims to do.

Video:
- OpenGL
- Linux: DGA/X11
- OS X: Quartz2D
- Windows: DirectDraw

Audio:
- PortAudio/PortMidi

Networking:
- TCP: Steal SDL_Net
- UDP: ENet

Threading:
- Steal SDL_Thread

Input (keyboard/mouse/joystick):
- Linux: XInput2
- OS X: Cocoa
- Windows: Message Loop/XInput

Filesystem:
- Simple wrappers

Stupid C/C++ API differences:
- strcasecmp vs. stricmp, etc.

It would be a fair amount of work but Jesus, don't you want it so badly? I swear to God if I have to :%s/stricmp/strcasecmp/g one more time I'll probably just explode.

Last edited by Ladna on 08-04-12 at 23:40

Old Post 08-04-12 23:32 #
Quasar
Moderator


Posts: 6124
Registered: 08-00



Ladna said:
I swear to God if I have to :%s/stricmp/strcasecmp/g one more time I'll probably just explode.

I do agree, however for this particular problem I prefer a different solution ;)
code:
#ifdef STUPID_PLATFORM
#define strcasecmp _stricmp
#define strncasecmp _strnicmp
#endif
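A slightly more concrete sketch of the same shim, assuming the platform in question is detected with the compiler-defined `_MSC_VER` macro (that choice, and the `names_match` helper, are illustrative and not from any port):

```c
#include <assert.h>
#include <string.h>

#if defined(_MSC_VER)
/* MSVC spells the case-insensitive compares differently. */
#define strcasecmp  _stricmp
#define strncasecmp _strnicmp
#else
#include <strings.h>   /* POSIX header that declares strcasecmp */
#endif

/* Returns nonzero when two names are equal ignoring ASCII case.
   Hypothetical helper showing the shim in use. */
static int names_match(const char *a, const char *b)
{
    assert(a && b);
    return strcasecmp(a, b) == 0;
}
```

With this in one shared header, the rest of the code can call `strcasecmp` everywhere and the search-and-replace is no longer needed.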

Old Post 08-05-12 01:13 #
Gez
Why don't I have a custom title by now?!


Posts: 11385
Registered: 07-07


stricmp makes more sense than strcasecmp. The i stands for insensitive, while the case seems to imply it's the case-sensitive version and that the "normal" version strcmp is case-insensitive.

Old Post 08-05-12 11:28 #
entryway
Forum Staple


Posts: 2739
Registered: 01-04



entryway said:
I replaced the correct filling of the w*h*4 buffer for GL with memcpy(buffer, pixels, w*h) and there is no fps improvement at 1600x1200 or 640x480.

Replacing GL_BGRA with GL_LUMINANCE (so the mapped buffer is 4x smaller) roughly doubles the FPS (85->160), which makes it faster than plain software (140 fps).
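To put numbers on "4x smaller": assuming 8-bit components, a BGRA pixel takes 4 bytes and a luminance pixel takes 1. A small sketch of the arithmetic at the tested resolution (the helper name is made up):

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of one tightly packed frame. */
static size_t frame_bytes(int width, int height, int bytes_per_pixel)
{
    assert(width > 0 && height > 0 && bytes_per_pixel > 0);
    return (size_t)width * (size_t)height * (size_t)bytes_per_pixel;
}

/* At 1600x1200:
   frame_bytes(1600, 1200, 4) is 7680000 bytes per frame (BGRA)
   frame_bytes(1600, 1200, 1) is 1920000 bytes per frame (luminance) */
```

At 85 fps the BGRA path moves roughly 650 MB per second to the card, so cutting the upload to a quarter is a plausible reason for the jump to 160 fps.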

Old Post 08-05-12 13:01 #