Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754157Ab1CYRky (ORCPT ); Fri, 25 Mar 2011 13:40:54 -0400 Received: from mail-iw0-f174.google.com ([209.85.214.174]:38582 "EHLO mail-iw0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753997Ab1CYRkv convert rfc822-to-8bit (ORCPT ); Fri, 25 Mar 2011 13:40:51 -0400 DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=kMQQ5lIJ+Rs+wPsuBImL948kmgc9nen6ne73IsPbCk/E5jIW7ybJu+YDeUaQu8JspW XzD0f6roHX1XgBQk39DGH+pIYf+kV58FN8/nLnQ4yt+35RbORNkFjWokQJ7ODrWpfENI Nw2VUnwr5EuWgGlVuTkbovzJcRMFz7p8RLyH4= MIME-Version: 1.0 In-Reply-To: References: Date: Fri, 25 Mar 2011 10:40:50 -0700 Message-ID: Subject: Re: snd_card_disconnect race (sound/core/init.c) From: Russ Dill To: Takashi Iwai Cc: linux-kernel@vger.kernel.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8BIT Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5905 Lines: 128 On Thu, Mar 24, 2011 at 10:23 AM, Russ Dill wrote: > On Thu, Mar 24, 2011 at 1:58 AM, Takashi Iwai wrote: >> At Thu, 24 Mar 2011 01:19:43 -0700, >> Russ Dill wrote: >>> >>> On Wed, Mar 23, 2011 at 11:43 PM, Takashi Iwai wrote: >>> > At Wed, 23 Mar 2011 18:40:58 -0700, >>> > Russ Dill wrote: >>> >> >>> >> With slub poisoning on, I'm seeing an oops in snd_disconnect_release >>> >> with slub poison (6b6b6b6b). Investigating, snd_card_disconnect looks >>> >> to be the possible culprit: >>> >> >>> >> 333         spin_lock(&card->files_lock); >>> >> 334         mfile = card->files; >>> >> 335         while (mfile) { >>> >> 336                 file = mfile->file; >>> >> 337 >>> >> 338                 /* it's critical part, use endless loop */ >>> >> 339                 /* we have no room to fail */ >>> >> 340                 mfile->disconnected_f_op = mfile->file->f_op; >>> >> 341 >>> >> 342                 spin_lock(&shutdown_lock); >>> >> 343                 list_add(&mfile->shutdown_list, &shutdown_files); >>> >> 344                 spin_unlock(&shutdown_lock); >>> >> 345 >>> >> 346                 mfile->file->f_op = &snd_shutdown_f_ops; >>> >> 347                 fops_get(mfile->file->f_op); >>> >> 348 >>> >> 349                 mfile = mfile->next; >>> >> 350         } >>> >> 351         spin_unlock(&card->files_lock); >>> >> >>> >> I'm not aware of any locking that would prevent the original release >>> >> function running concurrently if it were started before line 346 was >>> >> executed. The original release functions (such as snd_hwdep_release) >>> >> call snd_card_remove_file which frees the mfile object which would >>> >> have been just added to the shutdown_files list leading to a use of >>> >> freed (or in my case poisoned) memory later on down the line when the >>> >> shutdown_files gets walked again. >>> >> >>> >> It seems that 118dd6bf (ALSA: Clean up snd_monitor_file management, >>> >> v2.6.30) may have either introduced this problem or made it worse >>> >> since snd_card_file_remove previously removed items from the >>> >> shutdown_files list before freeing them. >>> >> >>> >> I'm currently experiencing these crashes on 2.6.32.26, but I can't >>> >> find any changes that would cause them not to occur on later kernel >>> >> versions. >>> > >>> > Which device are you using?  USB-audio had a known problem at >>> > disconnection, for example, and it was fixed recently. >>> > >>> >>> I'm using USB-audio. I tried back-porting a couple of patches, to no avail: >>> >>> ALSA: usb-audio: fix oops due to cleanup race when disconnecting >>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=382225e6 >>> >>> ALSA: usb - Release capture substream URBs properly >>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=76195fb0 >>> >>> Backing out 118dd6bf fixes the problem. The app that is running when >>> this crash occurs does full duplex streaming and segfaults while the >>> USB-audio device is going away. I'd have to check, but I think its >>> actually segfaulting it response to a disconnect of another piece of >>> the composite device the USB-audio device is part of, so the release >>> may actually be happening at the time the snd_card_disconnect is >>> occurring. I don't see any locking that would prevent bad things from >>> happening in this case, and the shutdown_files list is definitely >>> poisoned. >> >> OK, what about the patch below? >> >> >> Takashi >> >> --- >> From: Takashi Iwai >> Subject: [PATCH] ALSA: Fix yet another race in disconnection >> >> This patch fixes a race between snd_card_file_remove() and >> snd_card_disconnect().  When the card is added to shutdown_files list >> in snd_card_disconnect(), but it's freed in snd_card_file_remove() at >> the same time, the shutdown_files list gets corrupted.  The list member >> must be freed in snd_card_file_remove() as well. >> >> Reported-by: Russ Dill >> Signed-off-by: Takashi Iwai >> --- >>  sound/core/init.c |    4 ++++ >>  1 files changed, 4 insertions(+), 0 deletions(-) >> >> diff --git a/sound/core/init.c b/sound/core/init.c >> index 3e65da2..a0080aa 100644 >> --- a/sound/core/init.c >> +++ b/sound/core/init.c >> @@ -848,6 +848,7 @@ int snd_card_file_add(struct snd_card *card, struct file *file) >>                return -ENOMEM; >>        mfile->file = file; >>        mfile->disconnected_f_op = NULL; >> +       INIT_LIST_HEAD(&mfile->shutdown_list); >>        spin_lock(&card->files_lock); >>        if (card->shutdown) { >>                spin_unlock(&card->files_lock); >> @@ -883,6 +884,9 @@ int snd_card_file_remove(struct snd_card *card, struct file *file) >>        list_for_each_entry(mfile, &card->files_list, list) { >>                if (mfile->file == file) { >>                        list_del(&mfile->list); >> +                       spin_lock(&shutdown_lock); >> +                       list_del(&mfile->shutdown_list); >> +                       spin_unlock(&shutdown_lock); >>                        if (mfile->disconnected_f_op) >>                                fops_put(mfile->disconnected_f_op); >>                        found = mfile; >> -- >> 1.7.4.1 > > It looks sane, I'll run it through my torture (about 24hr) testing and > let you know. > Passes testing and looks good! Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/