2023-08-02 16:34:03

by Christoph Hellwig

Subject: [PATCH 11/12] xfs: drop s_umount over opening the log and RT devices

Just like get_tree_bdev needs to drop s_umount when opening the main
device, we need to do the same for the xfs log and RT devices to avoid a
potential lock order reversal with s_umount for the mark_dead path.

It might be preferable to just drop s_umount over ->fill_super entirely,
but that will require a fairly massive audit first, so we'll do the easy
version here first.

Signed-off-by: Christoph Hellwig <[email protected]>
---
fs/xfs/xfs_super.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 8185102431301d..d5042419ed9997 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -448,17 +448,21 @@ STATIC int
xfs_open_devices(
struct xfs_mount *mp)
{
- struct block_device *ddev = mp->m_super->s_bdev;
+ struct super_block *sb = mp->m_super;
+ struct block_device *ddev = sb->s_bdev;
struct block_device *logdev = NULL, *rtdev = NULL;
int error;

+ /* see get_tree_bdev why this is needed and safe */
+ up_write(&sb->s_umount);
+
/*
* Open real time and log devices - order is important.
*/
if (mp->m_logname) {
error = xfs_blkdev_get(mp, mp->m_logname, &logdev);
if (error)
- return error;
+ goto out_unlock;
}

if (mp->m_rtname) {
@@ -496,7 +500,10 @@ xfs_open_devices(
mp->m_logdev_targp = mp->m_ddev_targp;
}

- return 0;
+ error = 0;
+out_unlock:
+ down_write(&sb->s_umount);
+ return error;

out_free_rtdev_targ:
if (mp->m_rtdev_targp)
@@ -508,7 +515,7 @@ xfs_open_devices(
out_close_logdev:
if (logdev && logdev != ddev)
xfs_blkdev_put(mp, logdev);
- return error;
+ goto out_unlock;
}

/*
--
2.39.2



2023-08-02 17:00:07

by Darrick J. Wong

Subject: Re: [PATCH 11/12] xfs: drop s_umount over opening the log and RT devices

On Wed, Aug 02, 2023 at 05:41:30PM +0200, Christoph Hellwig wrote:
> Just like get_tree_bdev needs to drop s_umount when opening the main
> device, we need to do the same for the xfs log and RT devices to avoid a
> potential lock order reversal with s_umount for the mark_dead path.
>
> It might be preferable to just drop s_umount over ->fill_super entirely,
> but that will require a fairly massive audit first, so we'll do the easy
> version here first.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> fs/xfs/xfs_super.c | 15 +++++++++++----
> 1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 8185102431301d..d5042419ed9997 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -448,17 +448,21 @@ STATIC int
> xfs_open_devices(
> struct xfs_mount *mp)
> {
> - struct block_device *ddev = mp->m_super->s_bdev;
> + struct super_block *sb = mp->m_super;
> + struct block_device *ddev = sb->s_bdev;
> struct block_device *logdev = NULL, *rtdev = NULL;
> int error;
>
> + /* see get_tree_bdev why this is needed and safe */

Which part of get_tree_bdev? Is it this?

/*
* s_umount nests inside open_mutex during
* __invalidate_device(). blkdev_put() acquires
* open_mutex and can't be called under s_umount. Drop
* s_umount temporarily. This is safe as we're
* holding an active reference.
*/
up_write(&s->s_umount);
blkdev_put(bdev, fc->fs_type);
down_write(&s->s_umount);

<confused>

> + up_write(&sb->s_umount);
> +
> /*
> * Open real time and log devices - order is important.
> */
> if (mp->m_logname) {
> error = xfs_blkdev_get(mp, mp->m_logname, &logdev);
> if (error)
> - return error;
> + goto out_unlock;
> }
>
> if (mp->m_rtname) {
> @@ -496,7 +500,10 @@ xfs_open_devices(
> mp->m_logdev_targp = mp->m_ddev_targp;
> }
>
> - return 0;
> + error = 0;
> +out_unlock:
> + down_write(&sb->s_umount);

Isn't down_write taking s_umount? I think the label should be
out_relock or something less misleading.

--D

> + return error;
>
> out_free_rtdev_targ:
> if (mp->m_rtdev_targp)
> @@ -508,7 +515,7 @@ xfs_open_devices(
> out_close_logdev:
> if (logdev && logdev != ddev)
> xfs_blkdev_put(mp, logdev);
> - return error;
> + goto out_unlock;
> }
>
> /*
> --
> 2.39.2
>

2023-08-05 08:42:18

by Christoph Hellwig

Subject: Re: [PATCH 11/12] xfs: drop s_umount over opening the log and RT devices

On Wed, Aug 02, 2023 at 09:32:19AM -0700, Darrick J. Wong wrote:
> > + /* see get_tree_bdev why this is needed and safe */
>
> Which part of get_tree_bdev? Is it this?
>
> /*
> * s_umount nests inside open_mutex during
> * __invalidate_device(). blkdev_put() acquires
> * open_mutex and can't be called under s_umount. Drop
> * s_umount temporarily. This is safe as we're
> * holding an active reference.
> */
> up_write(&s->s_umount);
> blkdev_put(bdev, fc->fs_type);
> down_write(&s->s_umount);

Yes. With the refactoring earlier in the series get_tree_bdev should
be trivial enough to not need a more specific reference. If you
think there's a better way to refer to it I can update the comment,
though.

> > mp->m_logdev_targp = mp->m_ddev_targp;
> > }
> >
> > - return 0;
> > + error = 0;
> > +out_unlock:
> > + down_write(&sb->s_umount);
>
> Isn't down_write taking s_umount? I think the label should be
> out_relock or something less misleading.

Agreed. Christian, can you just change this in your branch, or should
I send an incremental patch?


2023-08-05 10:47:12

by Christian Brauner

Subject: Re: [PATCH 11/12] xfs: drop s_umount over opening the log and RT devices

On Sat, Aug 05, 2023 at 10:32:39AM +0200, Christoph Hellwig wrote:
> On Wed, Aug 02, 2023 at 09:32:19AM -0700, Darrick J. Wong wrote:
> > > + /* see get_tree_bdev why this is needed and safe */
> >
> > Which part of get_tree_bdev? Is it this?
> >
> > /*
> > * s_umount nests inside open_mutex during
> > * __invalidate_device(). blkdev_put() acquires
> > * open_mutex and can't be called under s_umount. Drop
> > * s_umount temporarily. This is safe as we're
> > * holding an active reference.
> > */
> > up_write(&s->s_umount);
> > blkdev_put(bdev, fc->fs_type);
> > down_write(&s->s_umount);
>
> Yes. With the refactoring earlier in the series get_tree_bdev should
> be trivial enough to not need a more specific reference. If you
> think there's a better way to refer to it I can update the comment,
> though.
>
> > > mp->m_logdev_targp = mp->m_ddev_targp;
> > > }
> > >
> > > - return 0;
> > > + error = 0;
> > > +out_unlock:
> > > + down_write(&sb->s_umount);
> >
> > Isn't down_write taking s_umount? I think the label should be
> > out_relock or something less misleading.
>
> Agreed. Christian, can you just change this in your branch, or should
> I send an incremental patch?

No need to send an incremental patch. I just s/out_unlock/out_relock/g
in-tree. Thanks!

2023-08-05 16:26:57

by Darrick J. Wong

Subject: Re: [PATCH 11/12] xfs: drop s_umount over opening the log and RT devices

On Sat, Aug 05, 2023 at 10:32:39AM +0200, Christoph Hellwig wrote:
> On Wed, Aug 02, 2023 at 09:32:19AM -0700, Darrick J. Wong wrote:
> > > + /* see get_tree_bdev why this is needed and safe */
> >
> > Which part of get_tree_bdev? Is it this?
> >
> > /*
> > * s_umount nests inside open_mutex during
> > * __invalidate_device(). blkdev_put() acquires
> > * open_mutex and can't be called under s_umount. Drop
> > * s_umount temporarily. This is safe as we're
> > * holding an active reference.
> > */
> > up_write(&s->s_umount);
> > blkdev_put(bdev, fc->fs_type);
> > down_write(&s->s_umount);
>
> Yes. With the refactoring earlier in the series get_tree_bdev should
> be trivial enough to not need a more specific reference. If you
> think there's a better way to refer to it I can update the comment,
> though.

How about:

/*
* blkdev_put can't be called under s_umount, see the comment in
* get_tree_bdev for more details
*/

with that and the label name change,
Reviewed-by: Darrick J. Wong <[email protected]>

--D


> > > mp->m_logdev_targp = mp->m_ddev_targp;
> > > }
> > >
> > > - return 0;
> > > + error = 0;
> > > +out_unlock:
> > > + down_write(&sb->s_umount);
> >
> > Isn't down_write taking s_umount? I think the label should be
> > out_relock or something less misleading.
>
> Agreed. Christian, can you just change this in your branch, or should
> I send an incremental patch?
>

2023-08-05 17:18:23

by Christian Brauner

Subject: Re: [PATCH 11/12] xfs: drop s_umount over opening the log and RT devices

On Sat, Aug 05, 2023 at 09:19:04AM -0700, Darrick J. Wong wrote:
> On Sat, Aug 05, 2023 at 10:32:39AM +0200, Christoph Hellwig wrote:
> > On Wed, Aug 02, 2023 at 09:32:19AM -0700, Darrick J. Wong wrote:
> > > > + /* see get_tree_bdev why this is needed and safe */
> > >
> > > Which part of get_tree_bdev? Is it this?
> > >
> > > /*
> > > * s_umount nests inside open_mutex during
> > > * __invalidate_device(). blkdev_put() acquires
> > > * open_mutex and can't be called under s_umount. Drop
> > > * s_umount temporarily. This is safe as we're
> > > * holding an active reference.
> > > */
> > > up_write(&s->s_umount);
> > > blkdev_put(bdev, fc->fs_type);
> > > down_write(&s->s_umount);
> >
> > Yes. With the refactoring earlier in the series get_tree_bdev should
> > be trivial enough to not need a more specific reference. If you
> > think there's a better way to refer to it I can update the comment,
> > though.
>
> How about:
>
> /*
> * blkdev_put can't be called under s_umount, see the comment in
> * get_tree_bdev for more details
> */
>
> with that and the label name change,
> Reviewed-by: Darrick J. Wong <[email protected]>

Added that comment and your rvb in-tree.
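
For readers following along, here is a minimal userspace sketch of the pattern
the thread converged on: the caller enters with the lock held, the function
drops it around the device opens, and every exit path funnels through a single
out_relock label so the lock is held again on return. This is not the kernel
code. fake_sb, fake_blkdev_get, fake_open_devices and fake_sb_init_locked are
hypothetical names, a pthread mutex stands in for the s_umount rw_semaphore
held for write, and the write_held flag exists only so the assertions can
check the invariant.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct super_block; only the lock matters here. */
struct fake_sb {
	pthread_mutex_t s_umount;	/* approximates s_umount held for write */
	bool write_held;		/* bookkeeping for the assertions below */
};

/* Initialise the fake superblock and take the lock, as the caller would. */
static void fake_sb_init_locked(struct fake_sb *sb)
{
	pthread_mutex_init(&sb->s_umount, NULL);
	pthread_mutex_lock(&sb->s_umount);
	sb->write_held = true;
}

/* Hypothetical stand-in for xfs_blkdev_get(): fails for an empty name. */
static int fake_blkdev_get(struct fake_sb *sb, const char *name)
{
	/* The open path must never run with s_umount held. */
	assert(!sb->write_held);
	if (name[0] == '\0')
		return -ENOENT;
	return 0;
}

/* Caller holds s_umount on entry and on return, whether or not we fail. */
static int fake_open_devices(struct fake_sb *sb, const char *logname,
			     const char *rtname)
{
	int error;

	/* Drop the lock before opening any device. */
	assert(sb->write_held);
	sb->write_held = false;
	pthread_mutex_unlock(&sb->s_umount);

	if (logname) {
		error = fake_blkdev_get(sb, logname);
		if (error)
			goto out_relock;
	}

	if (rtname) {
		error = fake_blkdev_get(sb, rtname);
		if (error)
			goto out_relock;
	}

	error = 0;
out_relock:
	/* Single exit: retake the lock on success and on every error path. */
	pthread_mutex_lock(&sb->s_umount);
	sb->write_held = true;
	return error;
}
```

The label name says what the code under it does (it takes the lock again),
which is the point of the out_unlock to out_relock rename above.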