Received: by 2002:ac0:bc90:0:0:0:0:0 with SMTP id a16csp5343303img; Wed, 27 Mar 2019 06:52:14 -0700 (PDT) X-Google-Smtp-Source: APXvYqzAJ49xPkZYgUN+ie/yuN24CVCXI1nCh7VBUWBprDV4x7QgJkxh05HkQj3y2PPEQx5rzxdg X-Received: by 2002:a63:3f8b:: with SMTP id m133mr33948333pga.91.1553694734742; Wed, 27 Mar 2019 06:52:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1553694734; cv=none; d=google.com; s=arc-20160816; b=fITRjh6yVsjcv3cADhgysQE8q4s1CkkycwLJ/ox65Gy8wEnqLHSJLTPe15+/LVcFYV aPKkieYlsfnXJ/cP6A30tfpPU124t4lDCDcRv8VLkwZIaXReb0sirVcb631L9hkxXMX5 xmiAdjEmjBSzfuyMau5VAASsV9Pt4qm6POxm4lwhadwRKYCwBFjt7ewneo8te3m/yE9m 8A2DBwd2fWcI+XRxTuVKFnaFu2MnxNykYBRbboZt6DNtHqdbyC2z3dSTHwc0c7Ieukdc NNz0oybaAYyHCrrK2YDRg4aO7ZGp7TvxXDzNC7O641vlpuY+d6Yy4EW1awh0Eemof8FY 9DWg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=uObGdnemS16MJtFzZgKm9+o99p35JcWEwbPAWwAUDPs=; b=aI1qqyHUPd1b6bseUPl1xojipxjmJ5tvemyeeRbIjGwCBxKlNohb+WETOBFBNRG1lo 15rP8X5cEuSfdEgEGfTnX++HDTijLjwdc/MVTrei4+NPPmNJt9cX7ZnKU522nPINzblZ cVBW2kFtgYmEZzQ7CuxeC95K3Hgk1HW+qNuwLm79P+TY04mZ2Qkb/J/w46E+t+mm1KWw G7M8ABthTmSeO2TS4hFtrI54szVerXQChT9XEqUe+w7FkXSW7cyxvDotuwZGDS9/peW6 YbrpLRz/sj3dQffXW1DIl3FxpRpQT0rrIWqsgTdk4VkErjaWXH2EJN9gt37+vr0H+8k3 lJ8Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id w65si18328250pfb.86.2019.03.27.06.51.59; Wed, 27 Mar 2019 06:52:14 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728837AbfC0NvM (ORCPT + 99 others); Wed, 27 Mar 2019 09:51:12 -0400 Received: from mx1.redhat.com ([209.132.183.28]:42582 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725797AbfC0NvM (ORCPT ); Wed, 27 Mar 2019 09:51:12 -0400 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 93C986DDC9; Wed, 27 Mar 2019 13:51:11 +0000 (UTC) Received: from [10.33.36.91] (unknown [10.33.36.91]) by smtp.corp.redhat.com (Postfix) with ESMTP id D850A804B5; Wed, 27 Mar 2019 13:51:09 +0000 (UTC) Subject: Re: [RFC PATCH 1/4] vfs: Create fs_context-aware mount_bdev() replacement To: David Howells Cc: miklos@szeredi.hu, viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org References: <155301260319.7556.1326405089184672936.stgit@warthog.procyon.org.uk> <155301261082.7556.2558480789011010142.stgit@warthog.procyon.org.uk> <14987.1553685803@warthog.procyon.org.uk> From: Andrew Price Message-ID: <601b8d3d-4de2-7dda-d839-5960841529f8@redhat.com> Date: Wed, 27 Mar 2019 13:51:08 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.0 MIME-Version: 1.0 In-Reply-To: <14987.1553685803@warthog.procyon.org.uk> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Wed, 27 Mar 2019 13:51:11 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 27/03/2019 11:23, David Howells wrote: > Andrew Price wrote: > >>> + up_write(&s->s_umount); >>> + blkdev_put(bdev, fc->bdev_mode); >>> + down_write(&s->s_umount); >> >> fc->bdev should be NULLed here (or, on the way out of sget_fc() might be more >> appropriate) otherwise we get a double-blkdev_put() leading to NULL pointer >> derefs later. This happens when I mount a device twice and then unmount them, >> or mount it 3 times. > > How about the attached changes? Note that they conflict a bit with the mtd > changes from where I took the destructor piece. > > David It fixes my bug and I see nothing unexpected from an xfstests -g quick run with my gfs2 patch on top. Looks good. Thanks, Andy > --- > commit 161602f963ae5a05bdb834461f7a4896286fbde0 > Author: David Howells > Date: Wed Mar 27 11:12:57 2019 +0000 > > bdev handling fix > > diff --git a/fs/fs_context.c b/fs/fs_context.c > index 9e22fe6aafe7..fc8fda4b9fe9 100644 > --- a/fs/fs_context.c > +++ b/fs/fs_context.c > @@ -425,10 +425,8 @@ void put_fs_context(struct fs_context *fc) > > if (fc->need_free && fc->ops && fc->ops->free) > fc->ops->free(fc); > -#ifdef CONFIG_BLOCK > - if (fc->bdev) > - blkdev_put(fc->bdev, fc->bdev_mode); > -#endif > + if (fc->dev_destructor) > + fc->dev_destructor(fc); > > security_free_mnt_opts(&fc->security); > put_net(fc->net_ns); > diff --git a/fs/super.c b/fs/super.c > index 85851adb0f19..e9e678d2c37c 100644 > --- a/fs/super.c > +++ b/fs/super.c > @@ -1211,8 +1211,17 @@ int vfs_get_super(struct fs_context *fc, > EXPORT_SYMBOL(vfs_get_super); > > #ifdef CONFIG_BLOCK > +static void fc_bdev_destructor(struct fs_context *fc) > +{ > + if (fc->bdev) { > + blkdev_put(fc->bdev, fc->bdev_mode); > + fc->bdev = NULL; > + } > +} > + > static int set_bdev_super_fc(struct super_block *s, struct fs_context *fc) > { > + s->s_mode = fc->bdev_mode; > s->s_bdev = fc->bdev; > s->s_dev = s->s_bdev->bd_dev; > s->s_bdi = bdi_get(s->s_bdev->bd_bdi); > @@ -1252,6 +1261,9 @@ int vfs_get_block_super(struct fs_context *fc, > return PTR_ERR(bdev); > } > > + fc->dev_destructor = fc_bdev_destructor; > + fc->bdev = bdev; > + > /* Once the superblock is inserted into the list by sget_fc(), s_umount > * will protect the lockfs code from trying to start a snapshot while > * we are mounting > @@ -1260,18 +1272,14 @@ int vfs_get_block_super(struct fs_context *fc, > if (bdev->bd_fsfreeze_count > 0) { > mutex_unlock(&bdev->bd_fsfreeze_mutex); > warnf(fc, "%pg: Can't mount, blockdev is frozen", bdev); > - error = -EBUSY; > - goto error_bdev; > + return -EBUSY; > } > > - fc->bdev = bdev; > fc->sb_flags |= SB_NOSEC; > s = sget_fc(fc, test_bdev_super_fc, set_bdev_super_fc); > mutex_unlock(&bdev->bd_fsfreeze_mutex); > - if (IS_ERR(s)) { > - error = PTR_ERR(s); > - goto error_bdev; > - } > + if (IS_ERR(s)) > + return PTR_ERR(s); > > if (s->s_root) { > /* Don't summarily change the RO/RW state. */ > @@ -1281,16 +1289,10 @@ int vfs_get_block_super(struct fs_context *fc, > goto error_sb; > } > > - /* s_umount nests inside bd_mutex during __invalidate_device(). > - * blkdev_put() acquires bd_mutex and can't be called under > - * s_umount. Drop s_umount temporarily. This is safe as we're > - * holding an active reference. > + /* Leave fc->bdev to fc_bdev_destructor() to clean up to avoid > + * locking conflicts. > */ > - up_write(&s->s_umount); > - blkdev_put(bdev, fc->bdev_mode); > - down_write(&s->s_umount); > } else { > - s->s_mode = fc->bdev_mode; > snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev); > sb_set_blocksize(s, block_size(bdev)); > error = fill_super(s, fc); > @@ -1307,11 +1309,7 @@ int vfs_get_block_super(struct fs_context *fc, > > error_sb: > deactivate_locked_super(s); > -error_bdev: > - if (fc->bdev) { > - blkdev_put(fc->bdev, fc->bdev_mode); > - fc->bdev = NULL; > - } > + /* Leave fc->bdev to fc_bdev_destructor() to clean up */ > return error; > } > EXPORT_SYMBOL(vfs_get_block_super); > @@ -1521,8 +1519,13 @@ int vfs_get_tree(struct fs_context *fc) > * on the superblock. > */ > error = fc->ops->get_tree(fc); > - if (error < 0) > + if (error < 0) { > + if (fc->dev_destructor) { > + fc->dev_destructor(fc); > + fc->dev_destructor = NULL; > + } > return error; > + } > > if (!fc->root) { > pr_err("Filesystem %s get_tree() didn't set fc->root\n", > diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h > index cb49b92f02af..9af788a3ef8e 100644 > --- a/include/linux/fs_context.h > +++ b/include/linux/fs_context.h > @@ -76,7 +76,9 @@ struct fs_context { > const struct fs_context_operations *ops; > struct file_system_type *fs_type; > void *fs_private; /* The filesystem's context */ > - struct block_device *bdev; /* The backing blockdev (if applicable) */ > + union { > + struct block_device *bdev; /* The backing blockdev (if applicable) */ > + }; > struct dentry *root; /* The root and superblock */ > struct user_namespace *user_ns; /* The user namespace for this mount */ > struct net *net_ns; /* The network namespace for this mount */ > @@ -93,6 +95,7 @@ struct fs_context { > enum fs_context_purpose purpose:8; > bool need_free:1; /* Need to call ops->free() */ > bool global:1; /* Goes into &init_user_ns */ > + void (*dev_destructor)(struct fs_context *fc); /* For block or mtd */ > }; > > struct fs_context_operations { >