2011-09-13 02:53:05

by Valerie Aurora

Subject: [RFC PATCH 0/3] VFS: Fix s_umount thaw/write deadlock

I've been working on some patches to fix the following problem:

File system freeze/thaw require the superblock's s_umount lock.
However, we write to the file system while holding the s_umount lock
in several places (e.g., writeback and quotas). Any of these can
block on a frozen file system while holding s_umount, preventing any
later thaw from occurring, since thaw requires s_umount. The solution
is to add a check, vfs_is_frozen(), to all code paths that write while
holding sb->s_umount and return without writing if it is true.
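
As a rough illustration of the problem and the proposed check, here is a
single-threaded userspace toy model. It is not kernel code: struct toy_sb
and every toy_* function are hypothetical names invented for this sketch,
the lock is modeled as a plain flag, and a writer that would block forever
is modeled by returning -1 with the lock still held. Only the idea of a
vfs_is_frozen()-style check comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a superblock; NOT the kernel's struct super_block. */
struct toy_sb {
	bool umount_held;	/* models sb->s_umount being held */
	bool frozen;		/* models the frozen state */
	int dirty_pages;	/* pending writes */
};

/* Hypothetical analogue of the proposed vfs_is_frozen() check. */
static bool toy_is_frozen(const struct toy_sb *sb)
{
	return sb->frozen;
}

/* Take the modeled s_umount; fails instead of sleeping if it is held. */
static bool toy_trylock_umount(struct toy_sb *sb)
{
	if (sb->umount_held)
		return false;
	sb->umount_held = true;
	return true;
}

static void toy_unlock_umount(struct toy_sb *sb)
{
	assert(sb->umount_held);
	sb->umount_held = false;
}

/* Freeze (true) or thaw (false); both need the modeled s_umount. */
static bool toy_set_frozen(struct toy_sb *sb, bool frozen)
{
	if (!toy_trylock_umount(sb))
		return false;
	sb->frozen = frozen;
	toy_unlock_umount(sb);
	return true;
}

/*
 * Writer without the check: on a frozen file system it "blocks", which
 * this model represents by returning -1 while still holding the lock.
 */
static int toy_writeback_unchecked(struct toy_sb *sb)
{
	int written;

	if (!toy_trylock_umount(sb))
		return 0;
	if (sb->frozen)
		return -1;	/* stuck: lock is never released */
	written = sb->dirty_pages;
	sb->dirty_pages = 0;
	toy_unlock_umount(sb);
	return written;
}

/* Writer with the check: skips the write and always drops the lock. */
static int toy_writeback_checked(struct toy_sb *sb)
{
	int written = 0;

	if (!toy_trylock_umount(sb))
		return 0;
	if (!toy_is_frozen(sb)) {
		written = sb->dirty_pages;
		sb->dirty_pages = 0;
	}
	toy_unlock_umount(sb);
	return written;
}
```

In this model, calling toy_writeback_unchecked() on a frozen toy_sb leaves
umount_held set, so a later toy_set_frozen(sb, false) fails, which mirrors
the thaw that can never happen. With toy_writeback_checked() the write is
skipped, the lock is released, and the thaw succeeds.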

Attached is an audit of each use of ->s_umount and whether it could
result in a deadlock with freeze/thaw. Some of this has gone into the
code comments. Patches against some random post 3.0.0 pull to follow.

A patch to xfstests is coming. Basically, we need to test for mmapped
writes concurrent with file system freeze. Probably test 068 can be
extended. (If anyone is interested in doing this, it is relatively
simple work.)

Thanks to Jan Kara and Dave Chinner for their kind suggestions for fixes
and for reviewing the patches offline.

-VAL


Attachments:
audit.txt (5.18 kB)

2011-09-14 13:05:26

by Jan Kara

Subject: Re: [RFC PATCH 0/3] VFS: Fix s_umount thaw/write deadlock

> fs/reiserfs/procfs.c
> - dropped after get_super() call in /proc operation
> XXX don't know, need a reiser expert
Where exactly do we hold s_umount? Anyway, nothing in reiserfs/procfs.c
seems to involve writes.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2011-09-14 23:22:30

by Valerie Aurora

Subject: Re: [RFC PATCH 0/3] VFS: Fix s_umount thaw/write deadlock

On Wed, Sep 14, 2011 at 6:05 AM, Jan Kara <[email protected]> wrote:
>> fs/reiserfs/procfs.c
>> - dropped after get_super() call in /proc operation
>>   XXX don't know, need a reiser expert
> Where exactly do we hold s_umount? Anyway, nothing in reiserfs/procfs.c
> seems to involve writes.

Sorry, this actually calls sget(), not get_super(). And I agree,
there aren't any writes. Audit updated accordingly.

Thanks!

-VAL

2011-10-27 21:39:41

by Christoph Hellwig

Subject: Re: [RFC PATCH 0/3] VFS: Fix s_umount thaw/write deadlock

Any plans to resubmit these with the updates and in proper format?


2011-10-27 22:08:55

by Valerie Aurora

Subject: Re: [RFC PATCH 0/3] VFS: Fix s_umount thaw/write deadlock

On Thu, Oct 27, 2011 at 2:39 PM, Christoph Hellwig <[email protected]> wrote:
> Any plans to resubmit these with the updates and in proper format?

Yes, coming tomorrow. We found more bugs during testing which
fortunately Surbhi Palande already fixed and submitted patches for,
but got lost in the shuffle.

-VAL

2011-10-30 00:59:42

by Valerie Aurora

Subject: Re: [RFC PATCH 0/3] VFS: Fix s_umount thaw/write deadlock

On Thu, Oct 27, 2011 at 3:08 PM, Valerie Aurora <[email protected]> wrote:
> On Thu, Oct 27, 2011 at 2:39 PM, Christoph Hellwig <[email protected]> wrote:
>> Any plans to resubmit these with the updates and in proper format?
>
> Yes, coming tomorrow. We found more bugs during testing which
> fortunately Surbhi Palande already fixed and submitted patches for,
> but got lost in the shuffle.

Ironically, this has been held up by a disk failure. :) A few more days.

-VAL

2011-11-28 20:58:42

by Valerie Aurora

Subject: Re: [RFC PATCH 0/3] VFS: Fix s_umount thaw/write deadlock

On Sat, Oct 29, 2011 at 5:59 PM, Valerie Aurora <[email protected]> wrote:
> On Thu, Oct 27, 2011 at 3:08 PM, Valerie Aurora <[email protected]> wrote:
>> On Thu, Oct 27, 2011 at 2:39 PM, Christoph Hellwig <[email protected]> wrote:
>>> Any plans to resubmit these with the updates and in proper format?
>>
>> Yes, coming tomorrow. We found more bugs during testing which
>> fortunately Surbhi Palande already fixed and submitted patches for,
>> but got lost in the shuffle.
>
> Ironically, this has been held up by a disk failure. :) A few more days.

I no longer have time to consult in addition to my full-time job, so
Canonical is taking the lead on this again.

-VAL