Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754732Ab1CYSYe (ORCPT ); Fri, 25 Mar 2011 14:24:34 -0400 Received: from cantor2.suse.de ([195.135.220.15]:52791 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752343Ab1CYSYd (ORCPT ); Fri, 25 Mar 2011 14:24:33 -0400 Date: Fri, 25 Mar 2011 19:24:32 +0100 Message-ID: From: Takashi Iwai To: Russ Dill Cc: linux-kernel@vger.kernel.org Subject: Re: snd_card_disconnect race (sound/core/init.c) In-Reply-To: References: User-Agent: Wanderlust/2.15.6 (Almost Unreal) SEMI/1.14.6 (Maruoka) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.7 Emacs/23.2 (x86_64-suse-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI 1.14.6 - "Maruoka") Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6263 Lines: 136 At Fri, 25 Mar 2011 10:40:50 -0700, Russ Dill wrote: > > On Thu, Mar 24, 2011 at 10:23 AM, Russ Dill wrote: > > On Thu, Mar 24, 2011 at 1:58 AM, Takashi Iwai wrote: > >> At Thu, 24 Mar 2011 01:19:43 -0700, > >> Russ Dill wrote: > >>> > >>> On Wed, Mar 23, 2011 at 11:43 PM, Takashi Iwai wrote: > >>> > At Wed, 23 Mar 2011 18:40:58 -0700, > >>> > Russ Dill wrote: > >>> >> > >>> >> With slub poisoning on, I'm seeing an oops in snd_disconnect_release > >>> >> with slub poison (6b6b6b6b). Investigating, snd_card_disconnect looks > >>> >> to be the possible culprit: > >>> >> > >>> >> 333         spin_lock(&card->files_lock); > >>> >> 334         mfile = card->files; > >>> >> 335         while (mfile) { > >>> >> 336                 file = mfile->file; > >>> >> 337 > >>> >> 338                 /* it's critical part, use endless loop */ > >>> >> 339                 /* we have no room to fail */ > >>> >> 340                 mfile->disconnected_f_op = mfile->file->f_op; > >>> >> 341 > >>> >> 342                 spin_lock(&shutdown_lock); > >>> >> 343                 list_add(&mfile->shutdown_list, &shutdown_files); > >>> >> 344                 spin_unlock(&shutdown_lock); > >>> >> 345 > >>> >> 346                 mfile->file->f_op = &snd_shutdown_f_ops; > >>> >> 347                 fops_get(mfile->file->f_op); > >>> >> 348 > >>> >> 349                 mfile = mfile->next; > >>> >> 350         } > >>> >> 351         spin_unlock(&card->files_lock); > >>> >> > >>> >> I'm not aware of any locking that would prevent the original release > >>> >> function running concurrently if it were started before line 346 was > >>> >> executed. The original release functions (such as snd_hwdep_release) > >>> >> call snd_card_remove_file which frees the mfile object which would > >>> >> have been just added to the shutdown_files list leading to a use of > >>> >> freed (or in my case poisoned) memory later on down the line when the > >>> >> shutdown_files gets walked again. > >>> >> > >>> >> It seems that 118dd6bf (ALSA: Clean up snd_monitor_file management, > >>> >> v2.6.30) may have either introduced this problem or made it worse > >>> >> since snd_card_file_remove previously removed items from the > >>> >> shutdown_files list before freeing them. > >>> >> > >>> >> I'm currently experiencing these crashes on 2.6.32.26, but I can't > >>> >> find any changes that would cause them not to occur on later kernel > >>> >> versions. > >>> > > >>> > Which device are you using?  USB-audio had a known problem at > >>> > disconnection, for example, and it was fixed recently. > >>> > > >>> > >>> I'm using USB-audio. I tried back-porting a couple of patches, to no avail: > >>> > >>> ALSA: usb-audio: fix oops due to cleanup race when disconnecting > >>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=382225e6 > >>> > >>> ALSA: usb - Release capture substream URBs properly > >>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=76195fb0 > >>> > >>> Backing out 118dd6bf fixes the problem. The app that is running when > >>> this crash occurs does full duplex streaming and segfaults while the > >>> USB-audio device is going away. I'd have to check, but I think its > >>> actually segfaulting it response to a disconnect of another piece of > >>> the composite device the USB-audio device is part of, so the release > >>> may actually be happening at the time the snd_card_disconnect is > >>> occurring. I don't see any locking that would prevent bad things from > >>> happening in this case, and the shutdown_files list is definitely > >>> poisoned. > >> > >> OK, what about the patch below? > >> > >> > >> Takashi > >> > >> --- > >> From: Takashi Iwai > >> Subject: [PATCH] ALSA: Fix yet another race in disconnection > >> > >> This patch fixes a race between snd_card_file_remove() and > >> snd_card_disconnect().  When the card is added to shutdown_files list > >> in snd_card_disconnect(), but it's freed in snd_card_file_remove() at > >> the same time, the shutdown_files list gets corrupted.  The list member > >> must be freed in snd_card_file_remove() as well. > >> > >> Reported-by: Russ Dill > >> Signed-off-by: Takashi Iwai > >> --- > >>  sound/core/init.c |    4 ++++ > >>  1 files changed, 4 insertions(+), 0 deletions(-) > >> > >> diff --git a/sound/core/init.c b/sound/core/init.c > >> index 3e65da2..a0080aa 100644 > >> --- a/sound/core/init.c > >> +++ b/sound/core/init.c > >> @@ -848,6 +848,7 @@ int snd_card_file_add(struct snd_card *card, struct file *file) > >>                return -ENOMEM; > >>        mfile->file = file; > >>        mfile->disconnected_f_op = NULL; > >> +       INIT_LIST_HEAD(&mfile->shutdown_list); > >>        spin_lock(&card->files_lock); > >>        if (card->shutdown) { > >>                spin_unlock(&card->files_lock); > >> @@ -883,6 +884,9 @@ int snd_card_file_remove(struct snd_card *card, struct file *file) > >>        list_for_each_entry(mfile, &card->files_list, list) { > >>                if (mfile->file == file) { > >>                        list_del(&mfile->list); > >> +                       spin_lock(&shutdown_lock); > >> +                       list_del(&mfile->shutdown_list); > >> +                       spin_unlock(&shutdown_lock); > >>                        if (mfile->disconnected_f_op) > >>                                fops_put(mfile->disconnected_f_op); > >>                        found = mfile; > >> -- > >> 1.7.4.1 > > > > It looks sane, I'll run it through my torture (about 24hr) testing and > > let you know. > > > > Passes testing and looks good! Thanks. Thanks for testing. I merged the patch now. Takashi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/