Jump to content
Search In
  • More options...
Find results that contain...
Find results in...
Sign in to follow this  
Quasar

SDL_mixer not to blame for some multicore problems after all?

Recommended Posts

I've implemented a new thread-safe sound update loop in Eternity, and preliminary testing by a single user seems to indicate that the SDL_mixer crashes have vanished without having to set the process affinity mask to restrict execution to a single core.

Since our sound code is based directly on PrBoom's, it would be possible for that port to do something similar as well. But before anybody else uses it I'd really like for some more testing to be done. Anybody got a multicore or even multiple-CPU machine that wouldn't mind running this for a while and playing it in various different circumstances?

Even if it doesn't stop *all* of the SDL_mixer crashes, it is at least necessary. The fact that SDL dispatches the audiospec callback from a separate thread leads to massive race conditions for anything trying to maintain sound data - the callback could interrupt execution of the main thread at any point, so it becomes necessary to use SDL semaphores to protect the channel data.

I understand that Chocolate Doom, which uses a totally different approach, also has these crashes, however. It may be the case that SDL_mixer's channel data is unprotected in a similar manner.

Share this post


Link to post
Quasar said:

preliminary testing by a single user

Hehe. This included, among other things, running Doom2 map30 on nightmare with godmode on, for 3 hours and 40 minutes. The dead monster count was over 50,000 by the time I quit. I've still yet to make it crash.

Share this post


Link to post

damnit quas, I'll do it.


Seriously, I'll run it on a duel and quad core, but not for hours-- just as long as I can. :(

Share this post


Link to post

E8400 here.

I can't sit around and watch the thing for hours, but I can leave it running while I go to work or class and see how it held up once I get back.

Share this post


Link to post

http://sl4.poned.com/wasd/eternity-starter-kit.zip

OK. Everybody who wants to help test please download this package. It should have everything you need in it. Please restrict testing to the standard DOOM / DOOM II / Final Doom / Heretic IWADs. I don't want any other undiscovered issues creeping up and giving false positives. The main point is to test out the sound engine, so try to do things that will make a lot of sounds :)

Share this post


Link to post
Quasar said:

The fact that SDL dispatches the audiospec callback from a separate thread leads to massive race conditions for anything trying to maintain sound data - the callback could interrupt execution of the main thread at any point, so it becomes necessary to use SDL semaphores to protect the channel data.

You should not need anything more than SDL_LockAudio() and SDL_UnlockAudio().

EDGE has a little wrapper for them:

void I_LockAudio(void)
{
        if (audio_is_locked)
        {
                I_UnlockAudio();
                I_Error("I_LockAudio: called twice without unlock!\n");
        }

        SDL_LockAudio();
        audio_is_locked = true;
}

void I_UnlockAudio(void)
{
        if (audio_is_locked)
        {
                SDL_UnlockAudio();
                audio_is_locked = false;
        }
}

Share this post


Link to post

I think Quasar's work affects callback function. You can't use SDL_LockAudio in callback to avoid recursions and maybe deadlocks, but you must protect some Doom data if you change it in callback and in the main thread. Not using of Z_ChangeTag in callback maybe is an easier way. In any way, I did not have any sound related crashes in prboom+. (anybody had?) Only MIDIes have problems and I need to force prboom to use only one CPU and it solves the problem for me.

Share this post


Link to post
entryway said:

I think Quasar's work affects callback function. You can't use SDL_LockAudio in callback to avoid recursions and maybe deadlocks, but you must protect some Doom data if you change it in callback and in the main thread. Not using of Z_ChangeTag in callback maybe is an easier way.

If you are using the callback of SDL_OpenAudio() directly, then I think it is obvious that your callback should not do anything heavy (especially allocating memory or reading files). The buffers should already be there (allocated or loaded outside of your mixing code). Since the mixing code is run from signal handler or another thread, is must be as light as possible.

However it is different if you are using SDL_mixer, it has it's own mixing callback function which it gives to SDL_OpenAudio() and all of the SDL_mixer API functions should protect itself. I'm confused now as exactly what Quasar has fixed, bugs in SDL_mixer??

Share this post


Link to post

Sounds to me like Quasar has implemented a thread-safe layer on top of the SDL_mixer interface to guard against possible race and deadlock issues by only allowing SDL_mixer interaction to only ever happen when and where it is appropriate given the aforementioned issues in SDL_mixer itself. Basically its an abstraction layer for the purposes of safety?

Share this post


Link to post

EE (and PrBoom) do not use SDL_mixer's own digital sound mixing - the library is present primarily for playing music, which it seems to do not that well after all.

The problem is this:
When you call Mix_SetPostMix, the routine you register is called from SDL_mixer's mixing routine (at the very end of it). This routine is in turn executed from SDL_UpdateAudio, in SDL_audio.c, which is the handler routine for an SDL thread.

Under Win32, at least, SDL uses real threads. This means that the I_UpdateSound and I_SDLUpdateSound routines in PrBoom and EE are being executed from a real, separate thread, which on multicore processors can and almost always will be assigned to a different core than the main thread.

Consider for one moment what would happen if this audio update thread were to interrupt the main thread half way through a call to initialize an audio channel, or half-way through the call to decommission a used channel. Or consider if, like the original code, the I_(SDL)UpdateSound routine is allowed to modify the zone heap, and does so while the main thread is in the middle of a call to Z_Malloc - the zone heap is completely non-reentrant just to be clear about that ;)

What my modification being discussed here does is protect the channelinfo structure from concurrent modifications that result in race conditions between the main and audio threads which are inherent in using the postmix callback to do custom mixing like we do.

This has nothing to do with protection of the audio stream buffer, for which SDL_mixer attempts to take responsibility. SDL_mixer says to *never* call SDL_LockAudio, especially not from the callback routines, because it is already doing this itself. However, it is already known that SDL_mixer itself may have multithreading issues. For one, the call to SDL_mutexP that is used to acquire the audio buffer before calling the audiospec callback does not check for any error conditions. It simply assumes that the mutex has been acquired if the function returns. I have already reported this as a bug to the SDL team.

However entryway now seems to suggest that the MIDI music may be the problem. IIRC PrBoom's own proff wrote the code that SDL_mixer uses to play native MIDI under Win32. Maybe I need to take another look at that code (I have played with it extensively in the past).

Share this post


Link to post

As I mentioned to Quasar, I was able to get one SDL crash last night, about 5 seconds into a game, opening a door. But whether it's just extremely good luck or not, I haven't gotten any other crashes in the past few days, and I've been using EE very frequently in that time.

Share this post


Link to post

I believe the revision has improved stability, but is not a total solution. So whether or not PrBoom+ should adopt this is up to entryway entirely :)

It's pretty clear SDL_mixer itself still has some issues that won't be addressed via protection of our own data. But being thoroughly correct never hurts either :) I'm going to continue my investigations of the SDL_mixer and SDL audio sources and see if I can't get closer to some ultimate answers.

I will remain interested in the results of further testing, though, so if you're still playtesting, please let me know of your results :)

Share this post


Link to post
Quasar said:

Consider for one moment what would happen if this audio update thread were to interrupt the main thread half way through a call to initialize an audio channel, or half-way through the call to decommission a used channel.

Woah woah woah, that's exactly the stuff that must be protected with SDL_LockAudio() and SDL_UnlockAudio(), which prevents the audio thread (and hence the callback) from running during that time.

Or consider if, like the original code, the I_(SDL)UpdateSound routine is allowed to modify the zone heap

Like I said before, your mixing callback should be extremely light, nothing but actual mixing (between buffers already allocated by the main thread).

SDL_mixer says to *never* call SDL_LockAudio, especially not from the callback routines

Given there are no docs with SDL_mixer 1.2.8, there is only SDL_mixer.h to go by, and all it says is reiterate what the normal SDL docs say: SDL_LockAudio should never be called by the callback function.

I am not an expert with using SDL_mixer, I dislike it a lot and don't use it in any projects. But I think you are fixing this problem in a much more complicated way than is necessary.

EDIT: here is an example of protected sound function in EDGE:

void S_StopFX(position_c *pos)
{
        if (nosound) return;

        I_LockAudio();
        {
                for (int i = 0; i < num_chan; i++)
                {
                        mix_channel_c *chan = mix_chan[i];

                        if (chan->state == CHAN_Playing && chan->pos == pos)
                        {
                                S_KillChannel(i);
                        }
                }
        }
        I_UnlockAudio();
}

Share this post


Link to post

http://jcatki.no-ip.org:8080/SDL_mixer/SDL_mixer.html#SEC5

The relevant parts:
"When using SDL_mixer functions you need to avoid the following functions from SDL:
...
SDL_LockAudio
This is just not needed since SDL_mixer handles this for you.
Using it may cause problems as well.
SDL_UnlockAudio
This is just not needed since SDL_mixer handles this for you.
Using it may cause problems as well."

Maybe this is an oversimplification of the actual situation. But, I don't think my solution is that complicated at all. The SDL_sem API is as simple as they get, and the calls to semaphore acquisition and release are in exactly the same places that you would suggest the lock and unlock audio calls - around the critical code sections that actually manipulate the channelinfo structure.

In fact doing it the way I am doing should be some degree faster. First off, the callback is coded so that it never blocks to wait on the main thread - it just skips the channel in question (and the main thread never has more than one channel locked at a time). SDL_LockAudio gives no such guarantee. Second, since I only lock the channel actually in question, there's no chance of the audio skipping on all the channels just because the callback timing was bad.

Share this post


Link to post
Quasar said:

SDL_LockAudio
This is just not needed since SDL_mixer handles this for you.
Using it may cause problems as well.
SDL_UnlockAudio
This is just not needed since SDL_mixer handles this for you.
Using it may cause problems as well."

Then the SDL_mixer API is horribly broken, as it provides the Mix_SetPostMix() function but no equivalent to SDL_LockAudio() to prevent your callback from being called at critical times.

EDIT: rewrote the next part to try and be more helpful

I think that, since SDL_mixer is lacking locking functions and you are using your own post mixer, using SDL_LockAudio around critical sections (in the sound code) is not only possible but necessary. The only real caveat is that you don't call any Mix_XXX functions in those sections.

Your approach of doing real multi-threaded programming (using mutexes in both the main thread and audio thread) will work too, and is probably not so complicated. I think that is going against the grain of how SDL's sound API is designed and generally used, but that is your choice.

Real multi-threaded programming is hard. The Linux kernel used to have "The Big Lock", i.e. processes could run on multiple cores but the kernel itself was single-threaded. They eventually removed the big lock, but it was no small feat. When I have written multi-threaded code, it has often been a grrrrrrrrr experience trying to track down weird bugs. Hence nowadays I prefer the simplest approach, such as locking the whole audio thread with SDL_LockAudio.

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  
×