Porsche Monty

Chocolate Doom's sound upsampling sucks :-(

I know what causes it. It would probably be possible to put a hack fix together, but I haven't bothered, as the best solution is for me to replace the OPL emulator code, which is something I'm planning to do anyway.

Any reason you can't just up the sample rate?


Alright, here's the problem. Pretty much every sound in Doom was sampled at 11khz, and plain unnecessary upsampling introduces some really nasty aliasing (the higher you go, the more aliasing is introduced, and it's the industry standard for modern sound cards to have a native sample rate no lower than 44khz). Normally this is dealt with using a filter and several other processes which cannot be applied on the fly (though faster, sub-optimal processing can still be useful for Doom ports), but even the most basic processing seems to be missing in Chocolate Doom.

To circumvent this, I usually just set up Doom to output at 11khz and let my sound card carry out the resampling process for substantially superior results. Unfortunately, without CMI8738 OPL passthrough (hopefully it will happen at some point, and I'm just assuming it wouldn't trigger the same problem as the emulated alternative) and with the OPL+11khz lockup, I have no choice but to tell Chocolate Doom to output at 22khz and use an external post-DAC low-pass to filter out the aliasing, which can be somewhat inconvenient.


Chocolate Doom applies a low-pass filter when it resamples, so there is some basic compensation in there for aliasing effects.

You can also try using libsamplerate, which greatly improves the resampling. However, the Windows releases don't have libsamplerate support compiled in, so you'd probably have to compile your own version.


1. CMI8738 OPL passthrough = nothing going on it seems.
2. Quick hack fix = very unlikely.
3. OPL emulator replacement = "when it's done"
4. Custom libsamplerate-enabled Win32 binaries = not a chance in hell.

Guess you can't always have it your way. I'll just wait and see where this goes.


Buy a YMF724 :-)

I've added a bug for this issue. I might see if I can write a better filter for the upsampling code. The main thing is that I want something simple and not too CPU intensive. Also, the maths might be a bit fiddly and I was never particularly good at DSP back when I studied it :-)

I did some preliminary investigations and found that there was actually a bug in the way the filter was being applied, so it may be worth re-testing with the latest SVN build.

Otherwise, it might just be a matter of compiling the Windows binaries for future releases with libsamplerate support enabled.


Eh, I kinda like it without much filtering. Filtered 11khz sounds are mushy and dull, and at least it gives the illusion of crispness here. :P

Of course, I used Impulse Tracker for years without any playback filtering and got used to the sound of it, so my ears are very biased. It doesn't really matter that much either way, but I never would've realized there was anything 'wrong' with the way Chocolate handles sound.

fraggle said:

I might see if I can write a better filter for the upsampling code.

EE gets away with simple linear interpolation, which to my ear sounded better at the time than Choco with only a low-pass filter. EE however did not have a low-pass filter and so it was rather harsh. To solve that I recently added a realtime 3-band EQ which can be user-configured to allow for effects like cutting of high frequencies and bass boost. It does require some CPU time but isn't even close to being something that modern CPUs cannot stomach.
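For readers unfamiliar with the technique, here is a minimal sketch of linear-interpolation resampling. It is not EE's actual code (which is not shown in this thread); the function name `upsample_linear` and its parameters are hypothetical, and it assumes a mono list of numeric samples.

```python
def upsample_linear(samples, src_rate, dst_rate):
    """Resample by drawing straight lines between neighbouring input samples.

    Hypothetical helper for illustration only; not taken from any source port.
    """
    if not samples:
        return []
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for i in range(n_out):
        # Position of output sample i on the input time axis, split into an
        # integer index and a fractional part using exact integer arithmetic.
        idx, rem = divmod(i * src_rate, dst_rate)
        frac = rem / dst_rate
        a = samples[idx]
        # Hold the last sample when there is no right-hand neighbour.
        b = samples[idx + 1] if idx + 1 < len(samples) else a
        out.append(a + (b - a) * frac)
    return out


# 11025 Hz -> 44100 Hz is an exact 4x ratio, so three new points appear
# between each pair of original samples.
print(upsample_linear([0, 4, 8], 11025, 44100))
```

Linear interpolation is cheap, but it does not remove the spectral images created by upsampling, which would be consistent with the harshness described above when no low-pass stage follows it.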

Porsche Monty said:

2. Quick hack fix = very unlikely.
3. OPL emulator replacement = "when it's done"

Well, this is done now - trunk is using DOSBox's OPL emulator (DBOPL), which is a big improvement. So that's two down at least.

fraggle said:

Well, this is done now - trunk is using DOSBox's OPL emulator (DBOPL), which is a big improvement.

Cool beans.


Wowie kazowie! Thank you very much. I can definitely hear a small but welcome improvement, plus the OPL+11khz freezing no longer occurs. Thanks to the resampling fix, I can finally output at 49716hz (needed for proper OPL playback) and care somewhat less about screwing up 11khz sounds in the process. Ideally sound and OPL should have individual sample rate controls, but I'm not getting my hopes up for that one.

Have you checked out the SINC filter thing I suggested on the bug tracker?

Porsche Monty said:

Ideally sound and OPL should have individual sample rate controls, but I'm not getting my hopes up for that one.



Good luck finding sound hardware that can do that.

Graf Zahl said:

Good luck finding sound hardware that can do that.


This should be done at software level, should it be possible or convenient to begin with. Modern sound cards have somewhat limited native sample rates and won't output to more than one sample rate at any given time. Picture this like 2 media players outputting audio at 2 different sample rates; you'll hear both playing but these sounds will be resampled to the default/native sample rate. The only reason for such a feature is the quality of the resampling technique used by the sound card which may be superior to that of the source port.

Porsche Monty said:

Have you checked out the SINC filter thing I suggested on the bug tracker?


Sinc? isn't that a little extreme? linear is just fine in this case.


If I understood correctly, the problem is mixing the 11 KHz SFX (assuming no pitch variations and perfect sample alignment between multiple SFX being played) and the 49716 Hz needed for sample-precise OPL output.

Now, according to signal theory, you can mix and output at any frequency you like, as long as:

  • The final sampling rate accommodates both (so you can choose just the higher of the two)
  • They can be mixed at an intermediate oversampling rate which accommodates both lower ones by being an integer multiple of both (or, in math terms, be their least common multiple or LCM).
Unfortunately, 11025 and 49716 are pretty awkward numbers, and have a ridiculously high LCM of 60902100, aka you'd have to mix them at 60.902 MHz if you want perfect downmixing into a 49716 Hz signal later. The OPL signal would have to be oversampled by a factor of 1225, which is clearly ridiculous for the task at hand, and the SFX would have to be oversampled by a factor of 5524, which is also super-ridiculous, considering we're talking about 8-bit, 11 KHz samples.
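The figures above are easy to check. The following is a small sketch of the arithmetic only; `common_mix_rate` is a hypothetical helper name, not part of any source port.

```python
from math import gcd


def common_mix_rate(rate_a, rate_b):
    """Return (lcm, factor_a, factor_b): the lowest rate that is an integer
    multiple of both inputs, and the oversampling factor each input needs."""
    lcm = rate_a * rate_b // gcd(rate_a, rate_b)
    return lcm, lcm // rate_a, lcm // rate_b


mix_rate, sfx_factor, opl_factor = common_mix_rate(11025, 49716)
print(mix_rate, sfx_factor, opl_factor)  # 60902100 5524 1225
```

The two rates only share a common factor of 9 (11025 = 3² · 5² · 7², 49716 = 2² · 3² · 1381), which is why the LCM ends up so large.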

A so-so compromise would be to force the OPL output at 44.1 KHz sharp (NOT by proper oversampling, that would be just as intensive as the 11 KHz one, divided by a factor of four), and then use a much more reasonable 4x oversampling factor for the SFX, and mix at 44.1 KHz.

A cruder way would be to raise the sampling rate of the OPL to 50 KHz precise, and play the SFX at 11 KHz (11000 Hz, not 11025). This way, you can get away with an oversampling factor of 5, at the cost of slightly increased OPL pitch (by 0.571%), and direct mixing without chopping off anything from the OPL output (you'd have to put up with the higher pitch). With this method you'd only need to oversample the SFX to the OPL's rate and output without any further filtering.

The advantages of up-pitching or chopping down the OPL output to a rounder number is that you can afford much lower oversampling LCM buffers, no oversampling for the music and then output directly at the final OPL rate (44.1 or 50 KHz).

Obviously, other combinations such as 12000 Hz SFX and 48 KHz OPL are possible (either by chopping or by forcing lower pitch), but in any case you should use sinc filters for the SFX (linear filters would cause audible aliasing and distortion, with such a low starting sampling rate for SFX). Avoiding oversampling the OPL chip will result in less computationally expensive filtering.

IMHO, the most elegant solution would be to oversample the OPL by a factor of 2 (99432 Hz), and the SFX by a factor of 9, but by playing them at a sample rate of 11048 Hz. That's just a 0.2086% increase in pitch, far more tolerable in the SFX than in the music, and you can keep oversampling to a VERY reasonable level without affecting the music's pitch at all. The final output can be at 49716 Hz, which can be trivially achieved even by a decimation filter if you used sinc filters during oversampling.
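As a rough check of the pitch figures quoted in this post, here is a sketch of the arithmetic. The helper `pitch_shift` is a hypothetical name, and the conversion to cents is an addition for illustration, not something stated in the post.

```python
import math


def pitch_shift(nominal_hz, actual_hz):
    """Return (percent, cents) by which material recorded at nominal_hz
    rises in pitch when played back at actual_hz."""
    ratio = actual_hz / nominal_hz
    return (ratio - 1.0) * 100.0, 1200.0 * math.log2(ratio)


OPL_RATE = 49716
SFX_RATE = 11025

# 2x oversampled OPL, then see which SFX rate divides that evenly by 9.
mix_rate = 2 * OPL_RATE                     # 99432
sfx_rate, remainder = divmod(mix_rate, 9)   # 11048, 0

sfx_percent, sfx_cents = pitch_shift(SFX_RATE, sfx_rate)
opl_percent, opl_cents = pitch_shift(OPL_RATE, 50000)

print(mix_rate, sfx_rate, remainder)
print(round(sfx_percent, 4))  # 0.2086 (SFX played at 11048 Hz)
print(round(opl_percent, 3))  # 0.571  (OPL forced to 50 KHz)
```

In musical terms the 11048 Hz option shifts the sound effects by only a few cents, while the music keeps its exact pitch.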

Maes said:

The final output can be at 49716 Hz, which can be trivially achieved even by a decimation filter if you used sinc filters during oversampling.



Is there any hardware that can output such a non-standard frequency?

On my hardware I can do 44100, 48000 and 96000 only - and if I use Vista WASAPI I'm limited to 48000. That limitation lets me assume that using DirectSound only hides the true output frequency and internally converts to 48000, too, since on Vista it only wraps around WASAPI.

Graf Zahl said:

Is there any hardware that can output such a non-standard frequency?


If you tinker with the hardware directly and bypass the OS's sound API, I think all of them can do it.

At least under DOS I recall that you could program older soundcards to work at any sampling rate between a minimum such as 4KHz and a maximum of 48 KHz or even beyond that, if you tinkered with low-level programming (not with API/convenience "development kit" type drivers). Under Win32 and other modern OSes and their sound APIs, I think you can force a specific sample rate and then it's up to the driver and system architecture to decide whether it can output "freely" or not.

As you said, because the mixing from multiple streams is delegated to one centralized sound system, the output sample rate is fixed to some common value, and as for proper oversampling...dream on. It will either be super CPU-intensive or non-existent, or, if you're using some super-expensive sound hardware like a Creative X-Fi, it will be done in hardware (hopefully).

Now, I think that you can "fix" the final output frequency SYSTEM WIDE, e.g. 49716 Hz or 50 KHz FOR EVERYTHING. This means that anything outputting natively at such a rate will play directly, and EVERYTHING ELSE will have to be downmixed/adapted. This would be a good compromise while chocodoom or ZDoom is running.

In any case, nothing can beat the analog output of a separate OPL chip getting mixed with the sound effects asynchronously and without sampling-rate locksteps on a high-quality analog mixer, unless you mix 9x oversampled 11048 Hz effects with a 2x oversampled OPL digital output, which is the best compromise between quality, processing power and pitch accuracy you can hope to get (assuming that you can force a 49716 or at least 50 KHz global sampling rate).


Sooo... When somebody complains about an OPL emulator not sounding like the real thing, they may very well be right because of all these sample rate shenanigans? :p

Gez said:

Sooo... When somebody complains about an OPL emulator not sounding like the real thing, they may very well be right because of all these sample rate shenanigans? :p


It's surely an important factor.

"Complaints" about OPL output "not sounding quite right" began way back on the AWE32, which actually subsampled it internally to 48 KHz with its EMU8000 DS, instead of having an analog mixer (dunno if other models such as the SB16 Vibra did this too, in order to save on analog mixing).

In any case, 49716 is a fuckin' awkward sampling rate and the only way to successfully downmix it digitally with more industry standard ones would be to use dedicated hardware or waste A LOT of processing power.

exp(x) said:

Ugh, fuck signals and systems.

Preach it, brother.

Csonicgo said:

Sinc? isn't that a little extreme? linear is just fine in this case.

Not at all. A sinc filter is how to "properly" resample a sound (ie. no high frequency aliasing).
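To make the idea concrete, here is a minimal pure-Python sketch of integer-factor upsampling with a Hann-windowed sinc kernel. It is not Chocolate Doom's or libsamplerate's implementation; the function names, the choice of a Hann window and the `half_width` parameter are all assumptions made for illustration.

```python
import math


def sinc_lowpass_taps(factor, half_width=8):
    """Hann-windowed sinc kernel whose cutoff sits at the ORIGINAL Nyquist
    frequency, expressed on the upsampled time grid (hypothetical helper)."""
    n = half_width * factor  # number of taps on each side of the centre
    taps = []
    for k in range(-n, n + 1):
        x = k / factor
        sinc = 1.0 if k == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.5 * (1.0 + math.cos(math.pi * k / n))
        taps.append(sinc * window)
    return taps


def upsample_sinc(samples, factor, half_width=8):
    """Upsample by an integer factor: conceptually insert zeros between the
    input samples, then low-pass with the windowed sinc kernel."""
    taps = sinc_lowpass_taps(factor, half_width)
    n = half_width * factor
    out = []
    for i in range(len(samples) * factor):
        # Input sample j sits at output position j * factor; only samples
        # within n output positions of i fall under the kernel.
        first = max(0, (i - n + factor - 1) // factor)  # ceil((i - n) / factor)
        last = min(len(samples) - 1, (i + n) // factor)
        acc = 0.0
        for j in range(first, last + 1):
            acc += samples[j] * taps[i - j * factor + n]
        out.append(acc)
    return out


tone = [0.0, 1.0, 0.0, -1.0] * 8      # a tone at a quarter of the input rate
smooth = upsample_sinc(tone, 4)       # e.g. 11025 Hz -> 44100 Hz
print(len(smooth))                    # 128
```

Because the kernel is zero at every non-zero multiple of `factor`, the original samples pass through unchanged and only the in-between points are synthesised. The cost is one multiply-add per kernel tap that lands on an input sample (about `2 * half_width` per output sample), which is the CPU trade-off discussed earlier in the thread.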


You get into a similar situation with SPC music, because (IIRC) the SPC700 outputs a sample rate of 32kHz. To feed this into EE's 44.1kHz stream I have to interpolate on the fly.
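A sketch of what "interpolating on the fly" can look like for a non-integer ratio such as 32000 to 44100 follows. This is not EE's code; `StreamResampler` is a hypothetical class, and it assumes linear interpolation and a mono stream delivered in chunks of arbitrary size.

```python
from math import gcd


class StreamResampler:
    """Chunk-by-chunk linear-interpolation resampler (illustrative only).

    The read position is tracked with integers, so no rounding drift builds
    up over a long stream.
    """

    def __init__(self, src_rate, dst_rate):
        g = gcd(src_rate, dst_rate)
        self.step = src_rate // g    # phase advance per output sample
        self.period = dst_rate // g  # phase units per input sample
        self.phase = 0               # position between prev and cur sample
        self.prev = 0.0              # last input sample seen (silence at start)

    def process(self, chunk):
        out = []
        for cur in chunk:
            # Emit every output sample that falls between prev and cur.
            while self.phase < self.period:
                frac = self.phase / self.period
                out.append(self.prev + (cur - self.prev) * frac)
                self.phase += self.step
            self.phase -= self.period
            self.prev = cur
        return out


r = StreamResampler(32000, 44100)
print(r.step, r.period)                  # 320 441
print(len(r.process([0.0] * 32000)))     # 44100
```

Keeping `prev` and `phase` between calls is what lets the caller feed buffers of any size; the price is one input sample of latency.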

fraggle said:

Preach it, brother.

Not at all. A sinc filter is how to "properly" resample a sound (ie. no high frequency aliasing).


Thank god Doom isn't a game where resampling hurts the sound quality as much as it does in Duke3D. Playing Duke3D with resampled sound sounds like placing your speakers underwater.


Duke3D samples have to be the worst pile of garbage ever produced in the history of video games. How the person responsible for this disaster got away with it is beyond me, though disposable-cheap creative multimedia speaker sets sound like a potential explanation. He literally had absolutely no idea how to handle an audio signal, and the result is a screeching mess of aliasing, clipping and various other artifacts typical of sampling/resampling procedures that didn't follow any standards.

Anyways, back on Chocolate Doom's resampling problem...well, let's keep a couple of things in mind: in this particular case, resampling is inevitable and virtually impossible to tweak to perfection, so there will definitely be a compromise regardless of the approach.

What to do? If I had the power, I'd get rid of that SDL junk (at least for the Win32 port), replace it with fmod (more specifically ZDoom's) and throw in a SINC filter ala WinUAE, which is fast and probably the most accurate and realistic implementation I can think of right now. I can vouch for its accuracy since I have an A500 standing next to me, and daqarta is there to back this up.

Porsche Monty said:

Duke3D samples have to be the worst pile of garbage ever produced in the history of video games. How the person responsible for this disaster got away with it is beyond me, though disposable-cheap creative multimedia speaker sets sound like a potential explanation. He literally had absolutely no idea how to handle an audio signal, and the result is a screeching mess of aliasing, clipping and various other artifacts typical of sampling/resampling procedures that didn't follow any standards.

Hmm? Duke3D's low sound quality is just because they squashed the sample rate way down, likely to save space or some similar reason. Any sane sound engineer would've produced them at a much higher quality, only downsampling them in the post-production phase.

Also, fmod's not an option, since it's not GPL and Chocolate is.

Xaser said:

Hmm? Duke3D's low sound quality is just because they squashed the sample rate way down, likely to save space or some similar reason


Sampling and downsampling (one of the many steps in decimation) are two slightly different animals and neither was carried out anywhere near right. You should sample in conformance with the Nyquist–Shannon theorem while avoiding oversampling unless strictly necessary. That means, you don't really need 44100hz or even the old, nearly-standard 11025hz when sampling, say, a 2000hz sound, you can very well do with 4000hz, but then you'd need to upsample that to the target native sample rate of the sound card, so that's where oversampling comes in handy. Unfortunately DN3D's sounds are a mixed bag and the resampling technique used by the engine was too primitive to lessen any of the side-effects.
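The 2000hz example above can be put into numbers. This is a small sketch under the usual textbook assumptions (pure tones, no anti-alias filter); both function names are hypothetical.

```python
def min_sample_rate(max_freq_hz):
    """Lower bound from the Nyquist-Shannon theorem. Strictly, the sample
    rate must EXCEED this value for the tone to be captured reliably."""
    return 2 * max_freq_hz


def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a pure tone shows up after being sampled at
    sample_rate_hz without any anti-alias filtering."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(min_sample_rate(2000))          # 4000
print(alias_frequency(1500, 4000))    # 1500: below Nyquist, unchanged
print(alias_frequency(3000, 4000))    # 1000: folds back into the audible band
print(alias_frequency(7000, 11025))   # 4025
```

Anything above half the sample rate does not disappear; it folds back to a lower frequency, which is why material has to be low-pass filtered before its sample rate is reduced.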

Porsche Monty said:

Sampling and downsampling (one of the many steps in decimation) are two slightly different animals and neither was carried out anywhere near right. You should sample in conformance with the Nyquist–Shannon theorem while avoiding oversampling unless strictly necessary. That means, you don't really need 44100hz or even the old, nearly-standard 11025hz when sampling, say, a 2000hz sound, you can very well do with 4000hz, but then you'd need to upsample that to the target native sample rate of the sound card, so that's where oversampling comes in handy. Unfortunately DN3D's sounds are a mixed bag and the resampling technique used by the engine was too primitive to lessen any of the side-effects.

Using low hz sounds saved memory. Duke Nukem 3D was for DOS computers.

Porsche Monty said:

You don't even get the point.

The point of you needlessly throwing around your audiophile technobabble?

These old DOS games were developed by a couple guys cooped up in a garage or apartment building. They didn't have access to Wikipedia or high end sound equipment, they just wanted their game's shareware version to fit on a floppy disk and be easily downloaded from a BBS in the shortest amount of time possible. It's primitive, it's flawed, but it's all they had, so shut up you big whiner.
