2011-09-07 03:28:39

by Masayoshi MIZUMA

Subject: [BUG] ext3: cannot unfreeze a filesystem due to a deadlock

Hi,

While testing the freeze feature on an ext3 filesystem with the fsfreeze
command on 3.1.0-rc4, I hit what I believe is the following deadlock.

How to reproduce:
# mkfs -t ext3 /dev/sdd1
# mount /dev/sdd1 /MNT
# ./fsstress -d /MNT/tmp -n 10 -p 1000 > /dev/null 2>&1 &
# fsfreeze -f /MNT
# fsfreeze -u /MNT

If the deadlock reproduces, "fsfreeze -u /MNT" never returns.
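
A minimal sketch of how the "does not return" symptom could be detected from a script. The helper name finishes_within and the 30-second deadline are assumptions for illustration and are not part of the recipe above. It polls rather than relying on timeout(1), on the assumption that a task blocked in uninterruptible sleep (state D, as in the traces below) would not react to the signal.

```shell
#!/bin/sh
# Hypothetical helper: run a command in the background and poll for its
# completion. Returns the command's exit status, or 124 if the command
# is still running when the deadline (in seconds) expires.
finishes_within() {
    fw_deadline=$1
    shift
    fw_marker=$(mktemp) || return 1
    # The subshell records the exit status once the command returns.
    ( "$@"; echo "$?" > "$fw_marker" ) &
    fw_waited=0
    while [ "$fw_waited" -lt "$fw_deadline" ]; do
        # A non-empty marker means the command has finished.
        if [ -s "$fw_marker" ]; then
            break
        fi
        sleep 1
        fw_waited=$((fw_waited + 1))
    done
    if [ -s "$fw_marker" ]; then
        fw_status=$(cat "$fw_marker")
        rm -f "$fw_marker"
        return "$fw_status"
    fi
    # Still running: leave it alone and report the timeout.
    return 124
}

# Possible use with the recipe above (needs root and a scratch device):
#   fsfreeze -f /MNT
#   finishes_within 30 fsfreeze -u /MNT || echo "thaw did not return"
```

The helper never waits on the background job itself, so the calling script stays responsive even if the thaw is stuck.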

Details of the deadlock:
o [flush-8:16:1523]
  wb_do_writeback
   wb_writeback
   ...
     ext3_journalled_writepage
      journal_start
       start_this_handle
       # waiting until journal->j_barrier_count turns 0...
       # j_barrier_count was incremented by journal_lock_updates()
       # via ext3_freeze().

o [fsstress:2673]
  sys_sync
   sync_filesystems
    iterate_supers
     down_read(sb->s_umount)
     sync_one_sb
      __sync_filesystem
       writeback_inodes_sb
        writeback_inodes_sb_nr
         wait_for_completion
          wait_for_common
          # waiting for completion of [flush-8:16:1523]...

o [fsfreeze:2749]
  sys_ioctl
   do_vfs_ioctl
    thaw_super
    # waiting for down_write(sb->s_umount)...
    # [fsstress:2673] did down_read(sb->s_umount).
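
The three call chains above form a wait-for cycle, which can be sketched as a tiny model. This is only an illustration: the task labels and the functions waits_on and has_cycle are hypothetical, and the edge from the flusher to the thaw assumes that j_barrier_count is only dropped once the thaw path gets to run.

```shell
#!/bin/sh
# Wait-for edges read off the three call chains in the report.
# Each task maps to the task that must make progress before it can.
waits_on() {
    case "$1" in
        flush)    echo thaw ;;      # j_barrier_count raised by the freeze (assumed dropped only on thaw)
        fsstress) echo flush ;;     # sys_sync waits for writeback to complete
        thaw)     echo fsstress ;;  # down_write(s_umount) blocked by the reader
        *)        ;;                # unknown tasks wait on nothing
    esac
}

# Follow the chain from a starting task; succeed (return 0) if the walk
# comes back to a task it has already visited.
has_cycle() {
    hc_seen=" "
    hc_cur=$1
    while [ -n "$hc_cur" ]; do
        case "$hc_seen" in
            *" $hc_cur "*) return 0 ;;
        esac
        hc_seen="$hc_seen$hc_cur "
        hc_cur=$(waits_on "$hc_cur")
    done
    return 1
}

if has_cycle thaw; then
    echo "thaw -> fsstress -> flush -> thaw: nobody can make progress"
fi
```

Removing any one edge, for example if the flusher did not block on the frozen journal, would break the cycle in this model.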

The kernel logged the following hung-task messages:
---------------------------------------------------------------------
INFO: task flush-8:16:1523 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
flush-8:16 D e9ab3d14 0 1523 2 0x00000000
f0b2b030 00000046 00000002 e9ab3d14 00000001 00000000 f2c8f530 f3287600
c0ae8600 00000000 00000000 c0ae8600 c0ae8600 c0ae8600 c0ae8600 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Call Trace:
[<c0511dda>] ? kmem_cache_alloc_trace+0x10a/0x110
[<c047363b>] ? prepare_to_wait+0x6b/0x70
[<f8418b5d>] ? start_this_handle+0x21d/0x340 [jbd]
[<c044ffd6>] ? dequeue_task_fair+0x36/0xc0
[<c0473360>] ? wake_up_bit+0x30/0x30
[<f8418dfa>] ? journal_start+0x9a/0xd0 [jbd]
[<f872ceee>] ? ext3_journalled_writepage+0x8e/0x210 [ext3]
[<c04dffe8>] ? __writepage+0x8/0x30
[<c04e12a5>] ? write_cache_pages+0x1a5/0x3b0
[<c04dffe0>] ? set_page_dirty+0x60/0x60
[<c04e14ee>] ? generic_writepages+0x3e/0x60
[<c053e9af>] ? writeback_single_inode+0xff/0x300
[<c053eed1>] ? writeback_sb_inodes+0x171/0x200
[<c053f128>] ? wb_writeback+0xa8/0x280
[<c0462dd7>] ? lock_timer_base+0x27/0x50
[<c04637d1>] ? del_timer_sync+0x21/0x40
[<c053f37b>] ? wb_do_writeback+0x7b/0x200
[<c0462e00>] ? lock_timer_base+0x50/0x50
[<c053f58a>] ? bdi_writeback_thread+0x8a/0x1f0
[<c053f500>] ? wb_do_writeback+0x200/0x200
[<c0472fb4>] ? kthread+0x74/0x80
[<c0472f40>] ? kthread_worker_fn+0x150/0x150
[<c083803e>] ? kernel_thread_helper+0x6/0x10

INFO: task fsstress:2673 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsstress D 00000000 0 2673 2645 0x00000000
e240d030 00000082 00000000 00000000 00000000 00000000 f7353a30 f32c7600
c0ae8600 bb9fa3df 00000168 c0ae8600 c0ae8600 c0ae8600 c0ae8600 00000001
00000000 00000001 c0ae8600 e240d2bc e240ff3c f2cc8e80 c044db1f 00000002
Call Trace:
[<c044db1f>] ? load_balance+0x7f/0x3e0
[<c082f65d>] ? schedule_timeout+0x19d/0x270
[<c082ebce>] ? schedule+0x37e/0x820
[<c082f38d>] ? wait_for_common+0xdd/0x140
[<c044e4c0>] ? try_to_wake_up+0x220/0x220
[<c053e33f>] ? writeback_inodes_sb_nr+0x6f/0x90
[<c0560510>] ? drop_dquot_ref+0x110/0x110
[<c05432c2>] ? __sync_filesystem+0x42/0xa0
[<c05200f8>] ? iterate_supers+0x58/0xa0
[<c0543320>] ? __sync_filesystem+0xa0/0xa0
[<c054334e>] ? sys_sync+0x1e/0x50
[<c0837a9f>] ? sysenter_do_call+0x12/0x28

INFO: task fsfreeze:2749 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsfreeze D e952ff14 0 2749 2640 0x00000000
e9aa3a30 00000086 00000002 e952ff14 ffffff9c 00000000 f2c7f030 f3247600
c0ae8600 fff90830 000000c0 c0ae8600 c0ae8600 c0ae8600 c0ae8600 ee226218
00000000 c05ddfd3 e952ff60 bfab77c0 00000008 c05219ee 0000081b 00000000
Call Trace:
[<c05ddfd3>] ? copy_to_user+0x33/0x110
[<c05219ee>] ? cp_new_stat64+0xee/0x100
[<c08308c5>] ? rwsem_down_failed_common+0x85/0xe0
[<c05dd612>] ? call_rwsem_down_write_failed+0x6/0x8
[<c083015c>] ? down_write+0x1c/0x20
[<c052056b>] ? thaw_super+0x1b/0xb0
[<c045fe1d>] ? ns_capable+0x1d/0x50
[<c052d407>] ? do_vfs_ioctl+0x257/0x2b0
[<c052d4ee>] ? sys_ioctl+0x8e/0xa0
[<c0837a9f>] ? sysenter_do_call+0x12/0x28
---------------------------------------------------------------------

Thanks,
Masayoshi Mizuma




2011-09-07 06:40:20

by Christoph Hellwig

Subject: Re: [BUG] ext3: cannot unfreeze a filesystem due to a deadlock

On Wed, Sep 07, 2011 at 12:29:30PM +0900, Masayoshi MIZUMA wrote:
> Hi,
>
> When I checked the freeze feature for ext3 filesystem using fsfreeze
> command at 3.1.0-rc4, I think the following deadlock problem happened.
>
> How to reproduce:
> # mkfs -t ext3 /dev/sdd1
> # mount /dev/sdd1 /MNT
> # ./fsstress -d /MNT/tmp -n 10 -p 1000 > /dev/null 2>&1 &
> # fsfreeze -f /MNT
> # fsfreeze -u /MNT

Can you add this testcase to xfstests?


2011-09-07 16:50:49

by Christoph Hellwig

Subject: Re: [BUG] ext3: cannot unfreeze a filesystem due to a deadlock

On Wed, Sep 07, 2011 at 12:45:34PM -0400, Greg Freemyer wrote:
> > Can you add this testcase to xfstests?
>
> Christoph,
>
> Isn't that just a matter of extending test 068 to ext4

If you have recent enough xfsprogs that allow the freeze command
for foreign filesystems that might work.


2011-09-07 16:45:34

by Greg Freemyer

Subject: Re: [BUG] ext3: cannot unfreeze a filesystem due to a deadlock

On Wed, Sep 7, 2011 at 2:40 AM, Christoph Hellwig <[email protected]> wrote:
> On Wed, Sep 07, 2011 at 12:29:30PM +0900, Masayoshi MIZUMA wrote:
>> Hi,
>>
>> When I checked the freeze feature for ext3 filesystem using fsfreeze
>> command at 3.1.0-rc4, I think the following deadlock problem happened.
>>
>> How to reproduce:
>> # mkfs -t ext3 /dev/sdd1
>> # mount /dev/sdd1 /MNT
>> # ./fsstress -d /MNT/tmp -n 10 -p 1000 > /dev/null 2>&1 &
>> # fsfreeze -f /MNT
>> # fsfreeze -u /MNT
>
> Can you add this testcase to xfstests?

Christoph,

Isn't that just a matter of extending test 068 to ext3 and ext4?

====
--- 068 2011-06-30 18:41:17.000000000 -0400
+++ 068.new 2011-09-07 12:41:35.000000000 -0400
@@ -51,7 +51,7 @@
 . ./common.filter

 # real QA test starts here
-_supported_fs xfs
+_supported_fs xfs ext3 ext4
 _supported_os Linux IRIX

 _require_scratch
====

That's a totally untested patch if someone wants to try it.

Greg

2011-09-07 17:10:07

by Eric Sandeen

Subject: Re: [BUG] ext3: cannot unfreeze a filesystem due to a deadlock

On 9/7/11 11:50 AM, Christoph Hellwig wrote:
> On Wed, Sep 07, 2011 at 12:45:34PM -0400, Greg Freemyer wrote:
>>> Can you add this testcase to xfstests?
>>
>> Christoph,
>>
>> Isn't that just a matter of extending test 068 to ext4
>
> If you have recent enough xfsprogs that allow the freeze command
> for foreign filesystems that might work.

where "recent enough" is since Tue Feb 10 14:41:51 2009 -0600

I say go for it :)

Could always add a quick helper to make sure the xfs_io command
doesn't fail on freeze, and _notrun if it does.
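
A rough sketch of what such a helper might look like. The name _require_freeze_works and the FREEZE_CMD/THAW_CMD variables are made up for illustration; they default to the fsfreeze invocations from the original report and could be pointed at an xfs_io invocation instead. The _notrun below is only a stand-in so the sketch runs on its own; inside xfstests the real _notrun would be used.

```shell
#!/bin/sh
# Stand-in for the xfstests _notrun helper, defined here only so the
# sketch is self-contained: print the reason and stop the test.
_notrun() {
    echo "[not run] $*"
    exit 0
}

# Hypothetical knobs: commands used to freeze and thaw a mount point.
: "${FREEZE_CMD:=fsfreeze -f}"
: "${THAW_CMD:=fsfreeze -u}"

# Hypothetical helper: skip the test when the filesystem cannot be
# frozen; report an error if a successful freeze cannot be undone.
_require_freeze_works() {
    rf_mnt=$1
    # Word splitting of the command variables is intentional.
    if ! $FREEZE_CMD "$rf_mnt" >/dev/null 2>&1; then
        _notrun "freeze failed on $rf_mnt"
    fi
    if ! $THAW_CMD "$rf_mnt" >/dev/null 2>&1; then
        echo "thaw failed on $rf_mnt" >&2
        return 1
    fi
    return 0
}

# Possible use near the top of a test (path is a placeholder):
#   _require_freeze_works "$SCRATCH_MNT"
```

A failed thaw is treated as an error rather than a skip, since at that point the filesystem would be left frozen.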

-Eric


2011-09-07 17:17:40

by Christoph Hellwig

Subject: Re: [BUG] ext3: cannot unfreeze a filesystem due to a deadlock

On Wed, Sep 07, 2011 at 12:10:07PM -0500, Eric Sandeen wrote:
> > If you have recent enough xfsprogs that allow the freeze command
> > for foreign filesystems that might work.
>
> where "recent enough" is since Tue Feb 10 14:41:51 2009 -0600
>
> I say go for it :)
>
> Could always add a quick helper to make sure the xfs_io command
> doesn't fail on freeze, and _notrun if it does.

With the _notrun trick it could probably even claim generic fs
support.


2011-09-07 17:21:48

by Masayoshi MIZUMA

Subject: Re: [BUG] ext3: cannot unfreeze a filesystem due to a deadlock


(2011/09/07 15:40), Christoph Hellwig wrote:

> On Wed, Sep 07, 2011 at 12:29:30PM +0900, Masayoshi MIZUMA wrote:
> > Hi,
> >
> > When I checked the freeze feature for ext3 filesystem using fsfreeze
> > command at 3.1.0-rc4, I think the following deadlock problem happened.
> >
> > How to reproduce:
> > # mkfs -t ext3 /dev/sdd1
> > # mount /dev/sdd1 /MNT
> > # ./fsstress -d /MNT/tmp -n 10 -p 1000 > /dev/null 2>&1 &
> > # fsfreeze -f /MNT
> > # fsfreeze -u /MNT
>
> Can you add this testcase to xfstests?

I'm not very familiar with adding test cases to xfstests, but I will
try to create one.
Please give me a little time...

Thanks,
Masayoshi



2011-09-07 17:34:44

by Jan Kara

Subject: Re: [BUG] ext3: cannot unfreeze a filesystem due to a deadlock

Hello,

Thanks for the report!

On Wed 07-09-11 12:29:30, Masayoshi MIZUMA wrote:
> When I checked the freeze feature for ext3 filesystem using fsfreeze
> command at 3.1.0-rc4, I think the following deadlock problem happened.
>
> How to reproduce:
> # mkfs -t ext3 /dev/sdd1
> # mount /dev/sdd1 /MNT
> # ./fsstress -d /MNT/tmp -n 10 -p 1000 > /dev/null 2>&1 &
> # fsfreeze -f /MNT
> # fsfreeze -u /MNT
>
> If this deadlock is reproduced, "fsfreeze -u /MNT" does not return.
>
> The detail of deadlock:
> o [flush-8:16:1523]
> wb_do_writeback
> wb_writeback
> ...
> ext3_journalled_writepage
> journal_start
> start_this_handle
> # waiting until journal->j_barrier_count turns 0...
> # j_barrier_count was incremented by journal_lock_updates()
> # via ext3_freeze().
>
> o [fsstress:2673]
> sys_sync
> sync_filesystems
> iterate_supers
> down_read(sb->s_umount)
> sync_one_sb
> __sync_filesystem
> writeback_inodes_sb
> writeback_inodes_sb_nr
> wait_for_completion
> wait_for_common
> # waiting for completion of [flush-8:16:1523]...
>
> o [fsfreeze:2749]
> sys_ioctl
> do_vfs_ioctl
> thaw_super
> # waiting for down_write(sb->s_umount)...
> # [fsfreeze:2673] did down_read(sb->s_umount).
Yes, this is a classic deadlock that can happen on any filesystem. The
problem is that the flusher thread holds the s_umount semaphore (either
directly or, as in your case, indirectly via a blocked sync) and tries to do
some IO, which blocks on the frozen filesystem. It's particularly easy to hit
on ext3 because it doesn't do vfs_check_frozen() checks, but all other
filesystems have the race window as well. Val Henson is working on fixing the
problem; I believe she even has a first version of the patches.

Honza

--
Jan Kara <[email protected]>
SUSE Labs, CR

2011-09-07 17:56:38

by Greg Freemyer

Subject: Re: [BUG] ext3: cannot unfreeze a filesystem due to a deadlock

On Wed, Sep 7, 2011 at 1:34 PM, Jan Kara <[email protected]> wrote:
> Hello,
>
> Thanks for report!
>
> Yes, this is a classical deadlock that can happen for any filesystem. The
> problem is flusher thread holds s_umount semaphore (either directly, or as
> in your case, indirectly via blocked sync) and tries to do some IO which
> blocks on frozen filesystem. It's particularly easy to hit for ext3 because
> it doesn't do vfs_check_frozen() checks but all other filesystems have the
> race window as well. Val Henson is working on fixing the problem - she even
> has some first version of patches I believe.
>
> Honza

xfstests test 068 has been around since kernel 2.4 days and should
have caught it if xfs is impacted.

I know I ran the 2002 version many times to prove to myself that
fsfreeze for xfs was stable when teamed with LVM. (It wasn't when I
first wrote 068 way back then).

068 has been greatly simplified since 2002, but it still looks like it
should do a good job.

Is there a problem with 068? Does it need extra test coverage even for xfs?

Greg

2011-09-07 22:32:11

by Jan Kara

Subject: Re: [BUG] ext3: cannot unfreeze a filesystem due to a deadlock

On Wed 07-09-11 13:56:08, Greg Freemyer wrote:
> xfstests test 068 has been around since kernel 2.4 days and should
> have caught it if xfs is impacted.
>
> I know I ran the 2002 version many times to prove to myself that
> fsfreeze for xfs was stable when teamed with LVM. (It wasn't when I
> first wrote 068 way back then).
>
> 068 has been greatly simplified since 2002, but it still looks like it
> should do a good job.
>
> Is there a problem with 068? Does it need extra test coverage even for xfs?
I believe at least mmapped writes can trigger the deadlock even for xfs
and fsstress (slightly surprisingly) does not test that. It's a narrow race
window but it is there and it has been triggered in practice (for ext4 but
it's a race in VFS code used by both XFS and ext4). So maybe extending
fsstress would be a way to go?

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2011-09-09 03:06:25

by Greg Freemyer

Subject: Re: [BUG] ext3: cannot unfreeze a filesystem due to a deadlock

On Wed, Sep 7, 2011 at 6:32 PM, Jan Kara <[email protected]> wrote:
>
> On Wed 07-09-11 13:56:08, Greg Freemyer wrote:
> > xfstests test 068 has been around since kernel 2.4 days and should
> > have caught it if xfs is impacted.
> >
> > I know I ran the 2002 version many times to prove to myself that
> > fsfreeze for xfs was stable when teamed with LVM. (It wasn't when I
> > first wrote 068 way back then).
> >
> > 068 has been greatly simplified since 2002, but it still looks like it
> > should do a good job.
> >
> > Is there a problem with 068? Does it need extra test coverage even for xfs?
> I believe at least mmapped writes can trigger the deadlock even for xfs
> and fsstress (slightly surprisingly) does not test that. It's a narrow race
> window but it is there and it has been triggered in practice (for ext4 but
> it's a race in VFS code used by both XFS and ext4). So maybe extending
> fsstress would be a way to go?
>
> Honza

That's a surprisingly large hole in xfstests.

That sounds like a pretty core and significant change. I'll have to
leave that to one of the main developers.

Greg

2011-09-13 03:00:45

by Valerie Aurora

Subject: Re: [BUG] ext3: cannot unfreeze a filesystem due to a deadlock

On Wed, Sep 7, 2011 at 10:34 AM, Jan Kara <[email protected]> wrote:
> Hello,
>
> Thanks for report!
>
> Yes, this is a classical deadlock that can happen for any filesystem. The
> problem is flusher thread holds s_umount semaphore (either directly, or as
> in your case, indirectly via blocked sync) and tries to do some IO which
> blocks on frozen filesystem. It's particularly easy to hit for ext3 because
> it doesn't do vfs_check_frozen() checks but all other filesystems have the
> race window as well. Val Henson is working on fixing the problem - she even
> has some first version of patches I believe.

Yes, if the bug reporter could test the patches I just sent out, that
would be great. I'm happy to resend privately. Thanks!

-VAL

2011-09-14 06:24:54

by Masayoshi MIZUMA

Subject: Re: [BUG] ext3: cannot unfreeze a filesystem due to a deadlock


(2011/09/13 12:00), Valerie Aurora wrote:

> On Wed, Sep 7, 2011 at 10:34 AM, Jan Kara <[email protected]> wrote:
> > Hello,
> >
> > Thanks for report!
> >
> > On Wed 07-09-11 12:29:30, Masayoshi MIZUMA wrote:
> >> When I checked the freeze feature for ext3 filesystem using fsfreeze
> >> command at 3.1.0-rc4, I think the following deadlock problem happened.
> >>
> >> How to reproduce:
> >> # mkfs -t ext3 /dev/sdd1
> >> # mount /dev/sdd1 /MNT
> >> # ./fsstress -d /MNT/tmp -n 10 -p 1000 > /dev/null 2>&1 &
> >> # fsfreeze -f /MNT
> >> # fsfreeze -u /MNT
> >>
> >> If this deadlock is reproduced, "fsfreeze -u /MNT" does not return.
> >>
> >> The detail of deadlock:
> >> o [flush-8:16:1523]
> >> wb_do_writeback
> >> wb_writeback
> >> ...
> >> ext3_journalled_writepage
> >> journal_start
> >> start_this_handle
> >> # waiting until journal->j_barrier_count turns 0...
> >> # j_barrier_count was incremented by journal_lock_updates()
> >> # via ext3_freeze().
> >>
> >> o [fsstress:2673]
> >> sys_sync
> >> sync_filesystems
> >> iterate_supers
> >> down_read(sb->s_umount)
> >> sync_one_sb
> >> __sync_filesystem
> >> writeback_inodes_sb
> >> writeback_inodes_sb_nr
> >> wait_for_completion
> >> wait_for_common
> >> # waiting for completion of [flush-8:16:1523]...
> >>
> >> o [fsfreeze:2749]
> >> sys_ioctl
> >> do_vfs_ioctl
> >> thaw_super
> >> # waiting for down_write(sb->s_umount)...
> >> # [fsfreeze:2673] did down_read(sb->s_umount).
> > Yes, this is a classical deadlock that can happen for any filesystem. The
> > problem is flusher thread holds s_umount semaphore (either directly, or as
> > in your case, indirectly via blocked sync) and tries to do some IO which
> > blocks on frozen filesystem. It's particularly easy to hit for ext3 because
> > it doesn't do vfs_check_frozen() checks but all other filesystems have the
> > race window as well. Val Henson is working on fixing the problem - she even
> > has some first version of patches I believe.
>
> Yes, if the bug reporter could test the patches I just sent out, that
> would be great. I'm happy to resend privately. Thanks!

I applied your patches to 3.1.0-rc4 and tested them. The deadlock no
longer reproduces, so your patches work fine. Thank you!

Masayoshi

>
> -VAL