2009-01-30 05:04:17

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] New: ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579

Summary: ext4 filesystem hang
Product: File System
Version: 2.5
KernelVersion: 2.6.29-rc2-git1
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: ext4
AssignedTo: [email protected]
ReportedBy: [email protected]


Latest working kernel version:
Earliest failing kernel version: 2.6.29 rc2 git1
Distribution: Rawhide
Hardware Environment: x86_64
Software Environment: Rawhide
Problem Description:

Deadlock/hang:

SysRq : Show Blocked State
task PC stack pid father
livecd-creato D ffff880001869f30 3096 11953 3276
ffff880064cc1b18 0000000000000046 0000000000000001 0000000000000002
ffffffff818fe700 ffffffff818fe700 ffff8800836cf000 ffff880010f28000
ffff88006389a390 ffff880010f283b8 0000000000000000 0000000000000046
Call Trace:
[<ffffffff81016ec0>] ? read_tsc+0xe/0x24
[<ffffffff81066749>] ? getnstimeofday+0x58/0xb4
[<ffffffff81380f70>] io_schedule+0x63/0xa5
[<ffffffff810a7029>] sync_page+0x49/0x4d
[<ffffffff8138136a>] __wait_on_bit_lock+0x45/0x79
[<ffffffff810a6fe0>] ? sync_page+0x0/0x4d
[<ffffffff810a6f93>] __lock_page+0x63/0x6a
[<ffffffff8105f0c8>] ? wake_bit_function+0x0/0x2a
[<ffffffff810ae25b>] write_cache_pages+0x1fc/0x3a4
[<ffffffffa0020231>] ? __mpage_da_writepage+0x0/0x163 [ext4]
[<ffffffffa0005329>] ? jbd2_journal_start+0xef/0x119 [jbd2]
[<ffffffffa0005329>] ? jbd2_journal_start+0xef/0x119 [jbd2]
[<ffffffffa001ffe1>] ext4_da_writepages+0x2c0/0x457 [ext4]
[<ffffffff8138164c>] ? mutex_unlock+0x9/0xb
[<ffffffff810a8b01>] ? generic_file_aio_write+0x74/0xc5
[<ffffffff8106ebcb>] ? print_lock_contention_bug+0x1e/0x110
[<ffffffffa0021310>] ? ext4_da_get_block_write+0x0/0x130 [ext4]
[<ffffffff810ae456>] do_writepages+0x2b/0x3b
[<ffffffff810a7b3f>] __filemap_fdatawrite_range+0x51/0x59
[<ffffffff810a889a>] filemap_fdatawrite+0x1a/0x1c
[<ffffffff810fbbe5>] vfs_fsync+0x53/0xac
[<ffffffff810fbc70>] do_fsync+0x32/0x48
[<ffffffff810fbc94>] sys_fdatasync+0xe/0x12
[<ffffffff810112ba>] system_call_fastpath+0x16/0x1b
kjournald2 D ffff88000c475e28 5040 11969 2
ffff88000c475d40 0000000000000046 0000000000000000 0000000000000046
ffffffff818fe700 ffffffff818fe700 0000000000000000 ffff880014882390
ffff88007eb88000 ffff880014882748 000000000c475ce0 0000000000000046
Call Trace:
[<ffffffff8105f215>] ? prepare_to_wait+0x1c/0x5e
[<ffffffff81383055>] ? _spin_unlock_irqrestore+0x47/0x57
[<ffffffff8106d609>] ? trace_hardirqs_on_caller+0x1f/0x153
[<ffffffff8119a55f>] ? _raw_spin_unlock+0x8e/0x93
[<ffffffffa00057e8>] jbd2_journal_commit_transaction+0x25c/0x159c [jbd2]
[<ffffffff81054bf5>] ? lock_timer_base+0x26/0x4a
[<ffffffff81383055>] ? _spin_unlock_irqrestore+0x47/0x57
[<ffffffff8106d609>] ? trace_hardirqs_on_caller+0x1f/0x153
[<ffffffff8105f090>] ? autoremove_wake_function+0x0/0x38
[<ffffffff81054c67>] ? try_to_del_timer_sync+0x4e/0x5c
[<ffffffffa000a68c>] kjournald2+0x172/0x3a8 [jbd2]
[<ffffffff8105f090>] ? autoremove_wake_function+0x0/0x38
[<ffffffffa000a51a>] ? kjournald2+0x0/0x3a8 [jbd2]
[<ffffffff8105ed18>] kthread+0x49/0x76
[<ffffffff8101262a>] child_rip+0xa/0x20
[<ffffffff81383002>] ? _spin_unlock_irq+0x2b/0x37
[<ffffffff8106d74a>] ? trace_hardirqs_on+0xd/0xf
[<ffffffff81011f3e>] ? restore_args+0x0/0x30
[<ffffffff8105ecaa>] ? kthreadd+0x176/0x19b
[<ffffffff8105eccf>] ? kthread+0x0/0x76
[<ffffffff81012620>] ? child_rip+0x0/0x20
pdflush D ffff880001878350 992 12227 2
ffff880050975960 0000000000000046 0000000000000000 ffff8800724d8ea0
ffffffff818fe700 ffffffff818fe700 ffff8800509758e0 ffff880076c48000
ffffffff815334f0 ffff880076c483b8 0000000000000046 0000000000000046
Call Trace:
[<ffffffff8105f173>] ? prepare_to_wait_exclusive+0x1c/0x5e
[<ffffffff81016ec0>] ? read_tsc+0xe/0x24
[<ffffffff81066749>] ? getnstimeofday+0x58/0xb4
[<ffffffff81380f70>] io_schedule+0x63/0xa5
[<ffffffff810a7029>] sync_page+0x49/0x4d
[<ffffffff8138136a>] __wait_on_bit_lock+0x45/0x79
[<ffffffff810a6fe0>] ? sync_page+0x0/0x4d
[<ffffffff810a6f93>] __lock_page+0x63/0x6a
[<ffffffff8105f0c8>] ? wake_bit_function+0x0/0x2a
[<ffffffff810ae25b>] write_cache_pages+0x1fc/0x3a4
[<ffffffffa0020231>] ? __mpage_da_writepage+0x0/0x163 [ext4]
[<ffffffffa0005329>] ? jbd2_journal_start+0xef/0x119 [jbd2]
[<ffffffffa001ffe1>] ext4_da_writepages+0x2c0/0x457 [ext4]
[<ffffffff810ae375>] ? write_cache_pages+0x316/0x3a4
[<ffffffff8106ebcb>] ? print_lock_contention_bug+0x1e/0x110
[<ffffffff8103b23a>] ? account_group_exec_runtime+0x2c/0x44
[<ffffffff8106ebcb>] ? print_lock_contention_bug+0x1e/0x110
[<ffffffffa0021310>] ? ext4_da_get_block_write+0x0/0x130 [ext4]
[<ffffffff810ae456>] do_writepages+0x2b/0x3b
[<ffffffff810f893e>] __writeback_single_inode+0x187/0x2fc
[<ffffffff810f8ee0>] ? generic_sync_sb_inodes+0x275/0x362
[<ffffffff810f8e86>] generic_sync_sb_inodes+0x21b/0x362
[<ffffffff810f91c3>] writeback_inodes+0xa2/0xfa
[<ffffffff810aeaaf>] background_writeout+0x92/0xcb
[<ffffffff810af0eb>] pdflush+0x14e/0x201
[<ffffffff810aea1d>] ? background_writeout+0x0/0xcb
[<ffffffff810aef9d>] ? pdflush+0x0/0x201
[<ffffffff8105ed18>] kthread+0x49/0x76
[<ffffffff8101262a>] child_rip+0xa/0x20
[<ffffffff81383002>] ? _spin_unlock_irq+0x2b/0x37
[<ffffffff8106d74a>] ? trace_hardirqs_on+0xd/0xf
[<ffffffff81011f3e>] ? restore_args+0x0/0x30
[<ffffffff8105ecaa>] ? kthreadd+0x176/0x19b
[<ffffffff8105eccf>] ? kthread+0x0/0x76
[<ffffffff81012620>] ? child_rip+0x0/0x20
Sched Debug Version: v0.08, 2.6.29-0.53.rc2.git1.fc11.x86_64 #1
now at 8999693.829504 msecs
.sysctl_sched_latency : 40.000000
.sysctl_sched_min_granularity : 8.000000
.sysctl_sched_wakeup_granularity : 10.000000
.sysctl_sched_child_runs_first : 0.000001
.sysctl_sched_features : 24191

cpu#0, 2666.818 MHz
.nr_running : 0
.load : 0
.nr_switches : 5337240
.nr_load_updates : 3406197
.nr_uninterruptible : -5686
.jiffies : 4303666545
.next_balance : 4303.666549
.curr->pid : 0
.clock : 8999757.002901
.cpu_load[0] : 0
.cpu_load[1] : 0
.cpu_load[2] : 0
.cpu_load[3] : 0
.cpu_load[4] : 0
.yld_exp_empty : 0
.yld_act_empty : 0
.yld_both_empty : 0
.yld_count : 693
.sched_switch : 0
.sched_count : 5466466
.sched_goidle : 1234594
.ttwu_count : 3148627
.ttwu_local : 2728322
.bkl_count : 50886

cfs_rq[0]:/
.exec_clock : 2295440.884259
.MIN_vruntime : 0.000001
.min_vruntime : 42668359.480395
.max_vruntime : 0.000001
.spread : 0.000000
.spread0 : 0.000000
.nr_running : 0
.load : 0
.nr_spread_over : 336023
.shares : 0

rt_rq[0]:/
.rt_nr_running : 0
.rt_throttled : 0
.rt_time : 0.000000
.rt_runtime : 950.000000

runnable tasks:
task PID tree-key switches prio exec-runtime
sum-exec sum-sleep
----------------------------------------------------------------------------------------------------------

cpu#1, 2666.818 MHz
.nr_running : 1
.load : 1024
.nr_switches : 4843840
.nr_load_updates : 3459888
.nr_uninterruptible : 5693
.jiffies : 4303666545
.next_balance : 4303.666801
.curr->pid : 13252
.clock : 8999249.133201
.cpu_load[0] : 1024
.cpu_load[1] : 561
.cpu_load[2] : 448
.cpu_load[3] : 422
.cpu_load[4] : 476
.yld_exp_empty : 0
.yld_act_empty : 0
.yld_both_empty : 0
.yld_count : 1603
.sched_switch : 0
.sched_count : 5039690
.sched_goidle : 1429690
.ttwu_count : 2758073
.ttwu_local : 2199222
.bkl_count : 38801

cfs_rq[1]:/
.exec_clock : 1717848.884015
.MIN_vruntime : 0.000001
.min_vruntime : 50983710.080737
.max_vruntime : 0.000001
.spread : 0.000000
.spread0 : 8315350.600342
.nr_running : 1
.load : 1024
.nr_spread_over : 385373
.shares : 0

rt_rq[1]:/
.rt_nr_running : 0
.rt_throttled : 0
.rt_time : 0.000000
.rt_runtime : 950.000000

runnable tasks:
task PID tree-key switches prio exec-runtime
sum-exec sum-sleep
----------------------------------------------------------------------------------------------------------
R bash 13252 50983710.080737 245 120 50983710.080737
868.862431 66420.517493 /



Steps to reproduce:

Run livecd-creator from rawhide to create a desktop spin:

livecd-creator --config
/usr/share/spin-kickstarts/fedora-livecd-desktop-en_US.ks --cache
/mnt/test/livecd/cache --tmpdir /mnt/test/livecd/tmp

kickstart file modified to create image on ext4

haven't dug into this yet but wanted to capture it.


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.


2009-02-12 15:24:38

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579


[email protected] changed:

What |Removed |Added
----------------------------------------------------------------------------
Regression|0 |1




------- Comment #1 from [email protected] 2009-02-12 07:24 -------
Finally had some time, and I've tracked this down to the page-writeback.c
changes since .28 - backing them all out the testcase runs overnight. Still
working out which one and why. I suppose it's still possible it's an ext4 bug
but if so the page-writeback.c changes exposed it.

It's a classic deadlock; I usually have 2 threads stuck, pdflush vs. the livecd
creator doing an fsync. Each is waiting for a page the other has locked.

-Eric


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-12 16:36:24

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [Bug 12579] ext4 filesystem hang

Two commits which are already in the mainline which fixed some changes
after .28 are

89e1219004b3657cc014521663eeef0744f1c99d
dcf6a79dda5cc2a2bec183e50d829030c0972aaa


Both changes have implication wrt ext4. I would suggest try with the
above two commits and see if the problem still exist.

-aneesh

2009-02-12 16:37:46

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579





------- Comment #2 from [email protected] 2009-02-12 08:37 -------
Two commits which are already in the mainline which fixed some changes
after .28 are

89e1219004b3657cc014521663eeef0744f1c99d
dcf6a79dda5cc2a2bec183e50d829030c0972aaa


Both changes have implication wrt ext4. I would suggest try with the
above two commits and see if the problem still exist.

-aneesh


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-12 16:45:30

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579


[email protected] changed:

What |Removed |Added
----------------------------------------------------------------------------
KernelVersion|2.6.29-rc2-git1 |2.6.29-rc4




------- Comment #3 from [email protected] 2009-02-12 08:44 -------
Aneesh, thanks. I tried both of those last night, and the problem persists.


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-12 18:48:50

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [Bug 12579] ext4 filesystem hang


>> Finally had some time, and I've tracked this down to the
>> page-writeback.c changes since .28 - backing them all out the
>> testcase runs overnight. Still working out which one and why. I
>> suppose it's still possible it's an ext4 bug but if so the
>> page-writeback.c changes exposed it.
>>
>> It's a classic deadlock; I usually have 2 threads stuck, pdflush
>> vs. the livecd creator doing an fsync. Each is waiting for a page
>> the other has locked.
>
> Two commits which are already in the mainline which fixed some changes
> after .28 are
>
> 89e1219004b3657cc014521663eeef0744f1c99d
> dcf6a79dda5cc2a2bec183e50d829030c0972aaa
>

I read Eric's email as saying he tracked it down to the
page-writeback.c changes *after* 2.6.28. Eric, is that what you
meant?

- Ted

2009-02-12 18:48:55

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579





------- Comment #4 from [email protected] 2009-02-12 10:48 -------

>> Finally had some time, and I've tracked this down to the
>> page-writeback.c changes since .28 - backing them all out the
>> testcase runs overnight. Still working out which one and why. I
>> suppose it's still possible it's an ext4 bug but if so the
>> page-writeback.c changes exposed it.
>>
>> It's a classic deadlock; I usually have 2 threads stuck, pdflush
>> vs. the livecd creator doing an fsync. Each is waiting for a page
>> the other has locked.
>
> Two commits which are already in the mainline which fixed some changes
> after .28 are
>
> 89e1219004b3657cc014521663eeef0744f1c99d
> dcf6a79dda5cc2a2bec183e50d829030c0972aaa
>

I read Eric's email as saying he tracked it down to the
page-writeback.c changes *after* 2.6.28. Eric, is that what you
meant?

- Ted


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-12 18:56:18

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579





------- Comment #5 from [email protected] 2009-02-12 10:55 -------
Yep. the 2 patches Aneesh refers to are just hours hold, so they were fixes
after I originally saw the hang and worth testing, but they don't seem to help.

I'm still trying some bisection of the changes (unfortunately many intermediate
points seem nonfunctional) and may have finally found a spot; if this doesn't
yield results though it's time for another crash-course in writeback. ;)

-Eric


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-13 00:43:23

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579





------- Comment #6 from [email protected] 2009-02-12 16:42 -------
dJust for the record; here are the paths we're stuck on:

livecd:

sys_fdatasync
do_fsync
vfs_fsync
filemap_fdatawrite
__filemap_fdatawrite_range
do_writepage
ext4_da_writepages
write_cache_pages
lock page
(currently locked by pdflush)

pdflush:

pdflush
background_writeout
writeback_inodes
generic_sync_sb_inodes
__writeback_single_inode
do_writepages
ext4_da_writepages
write_cache_pages
lock page
(currently loscked by livecd)

the pdflush wbc looks like:

[1]kdb> wbc 0xffff88007d4abe10
struct writeback_control at 0xffff88007d4abe10
bdi = (null) sync_mode 0 (NONE)
nr_to_write = 992 pages_skipped = 0
range_start 0, range_end 0
nonblocking 1 congestion 0 kupdate 0 reclaim 0
writepages 1 range_cyclic 1 more_io 0 no_update 1

and the livecd-creator fsync wbc looks like:

[1]kdb> wbc 0xffff880012821ea8
struct writeback_control at 0xffff880012821ea8
bdi = (null) sync_mode 1 (ALL)
nr_to_write = 9223372036854775689 pages_skipped = 0
range_start 0, range_end 9223372036854775807
nonblocking 0 congestion 0 kupdate 0 reclaim 0
writepages 1 range_cyclic 0 more_io 0 no_update 1


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-13 11:50:09

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [Bug 12579] ext4 filesystem hang

How about the below patch ?

From: Aneesh Kumar K.V <[email protected]>
Subject: [PATCH] ext4: Don't use the range_cylic mode implemented by write_cache_pages

With delayed allocation we lock the page in write_cache_pages and try to build
an in memory extent of contiguous blocks. This is needed so that we can get
large contiguous blocks request. Now with range_cyclic mode in write_cache_pages
if we have not done an I/O we loop back to 0 index and try to write the page.
That would imply we will attempt to take page lock of lower index page holding
the page lock of higher index page. This can cause a dead lock with other writeback
thread.

Signed-off-by: Aneesh Kumar K.V <[email protected]>

---
fs/ext4/inode.c | 20 ++++++++++++++++++--
1 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 61e8fc0..f743524 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2437,6 +2437,7 @@ static int ext4_da_writepages(struct address_space *mapping,
int no_nrwrite_index_update;
int pages_written = 0;
long pages_skipped;
+ int range_cyclic = 0, cycled = 1, io_done = 0;
int needed_blocks, ret = 0, nr_to_writebump = 0;
struct ext4_sb_info *sbi = EXT4_SB(mapping->host->i_sb);

@@ -2488,9 +2489,14 @@ static int ext4_da_writepages(struct address_space *mapping,
if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
range_whole = 1;

- if (wbc->range_cyclic)
+ if (wbc->range_cyclic) {
index = mapping->writeback_index;
- else
+ wbc->range_start = index << PAGE_CACHE_SHIFT;
+ wbc->range_end = LLONG_MAX;
+ wbc->range_cyclic = 0;
+ range_cyclic = 1;
+ cycled = 0;
+ } else
index = wbc->range_start >> PAGE_CACHE_SHIFT;

mpd.wbc = wbc;
@@ -2504,6 +2510,7 @@ static int ext4_da_writepages(struct address_space *mapping,
wbc->no_nrwrite_index_update = 1;
pages_skipped = wbc->pages_skipped;

+retry:
while (!ret && wbc->nr_to_write > 0) {

/*
@@ -2546,6 +2553,7 @@ static int ext4_da_writepages(struct address_space *mapping,
pages_written += mpd.pages_written;
wbc->pages_skipped = pages_skipped;
ret = 0;
+ io_done = 1;
} else if (wbc->nr_to_write)
/*
* There is no more writeout needed
@@ -2554,6 +2562,13 @@ static int ext4_da_writepages(struct address_space *mapping,
*/
break;
}
+ if (!io_done && !cycled) {
+ cycled = 1;
+ index = 0;
+ wbc->range_start = index << PAGE_CACHE_SHIFT;
+ wbc->range_end = mapping->writeback_index - 1;
+ goto retry;
+ }
if (pages_skipped != wbc->pages_skipped)
printk(KERN_EMERG "This should not happen leaving %s "
"with nr_to_write = %ld ret = %d\n",
@@ -2561,6 +2576,7 @@ static int ext4_da_writepages(struct address_space *mapping,

/* Update index */
index += pages_written;
+ wbc->range_cyclic = range_cyclic;
if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
/*
* set the writeback_index so that range_cyclic
--
tg: (6ebb071..) range_cyclic_fix (depends on: fix_list_corruption)

2009-02-13 11:51:19

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579





------- Comment #7 from [email protected] 2009-02-13 03:50 -------
How about the below patch ?

From: Aneesh Kumar K.V <[email protected]>
Subject: [PATCH] ext4: Don't use the range_cylic mode implemented by
write_cache_pages

With delayed allocation we lock the page in write_cache_pages and try to build
an in memory extent of contiguous blocks. This is needed so that we can get
large contiguous blocks request. Now with range_cyclic mode in
write_cache_pages
if we have not done an I/O we loop back to 0 index and try to write the page.
That would imply we will attempt to take page lock of lower index page holding
the page lock of higher index page. This can cause a dead lock with other
writeback
thread.

Signed-off-by: Aneesh Kumar K.V <[email protected]>

---
fs/ext4/inode.c | 20 ++++++++++++++++++--
1 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 61e8fc0..f743524 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2437,6 +2437,7 @@ static int ext4_da_writepages(struct address_space
*mapping,
int no_nrwrite_index_update;
int pages_written = 0;
long pages_skipped;
+ int range_cyclic = 0, cycled = 1, io_done = 0;
int needed_blocks, ret = 0, nr_to_writebump = 0;
struct ext4_sb_info *sbi = EXT4_SB(mapping->host->i_sb);

@@ -2488,9 +2489,14 @@ static int ext4_da_writepages(struct address_space
*mapping,
if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
range_whole = 1;

- if (wbc->range_cyclic)
+ if (wbc->range_cyclic) {
index = mapping->writeback_index;
- else
+ wbc->range_start = index << PAGE_CACHE_SHIFT;
+ wbc->range_end = LLONG_MAX;
+ wbc->range_cyclic = 0;
+ range_cyclic = 1;
+ cycled = 0;
+ } else
index = wbc->range_start >> PAGE_CACHE_SHIFT;

mpd.wbc = wbc;
@@ -2504,6 +2510,7 @@ static int ext4_da_writepages(struct address_space
*mapping,
wbc->no_nrwrite_index_update = 1;
pages_skipped = wbc->pages_skipped;

+retry:
while (!ret && wbc->nr_to_write > 0) {

/*
@@ -2546,6 +2553,7 @@ static int ext4_da_writepages(struct address_space
*mapping,
pages_written += mpd.pages_written;
wbc->pages_skipped = pages_skipped;
ret = 0;
+ io_done = 1;
} else if (wbc->nr_to_write)
/*
* There is no more writeout needed
@@ -2554,6 +2562,13 @@ static int ext4_da_writepages(struct address_space
*mapping,
*/
break;
}
+ if (!io_done && !cycled) {
+ cycled = 1;
+ index = 0;
+ wbc->range_start = index << PAGE_CACHE_SHIFT;
+ wbc->range_end = mapping->writeback_index - 1;
+ goto retry;
+ }
if (pages_skipped != wbc->pages_skipped)
printk(KERN_EMERG "This should not happen leaving %s "
"with nr_to_write = %ld ret = %d\n",
@@ -2561,6 +2576,7 @@ static int ext4_da_writepages(struct address_space
*mapping,

/* Update index */
index += pages_written;
+ wbc->range_cyclic = range_cyclic;
if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
/*
* set the writeback_index so that range_cyclic


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-13 22:06:37

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579





------- Comment #8 from [email protected] 2009-02-13 14:06 -------
Created an attachment (id=20238)
--> (http://bugzilla.kernel.org/attachment.cgi?id=20238&action=view)
Tested patch

Patch from Aneesh, un-whitespace-mangled.

Ted, can you push this out? Works great. :) We might want to ask the other
reporter of something similar (next-20090206: deadlock on ext4) to test it too.
I'll ping him.


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-13 22:23:05

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579


[email protected] changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #20238|0 |1
is obsolete| |




------- Comment #9 from [email protected] 2009-02-13 14:22 -------
Created an attachment (id=20239)
--> (http://bugzilla.kernel.org/attachment.cgi?id=20239&action=view)
updated patch w/ my signoffs

Updated the patch w/ my tested/reviewed by.

I'd like to think about whether we can make write_cache_pages smarter and not
need to sort of recreate bits of it in the ext4 code, but for now this solves
my problem, at least.


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-14 02:07:26

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [Bug 12579] ext4 filesystem hang

> Patch from Aneesh, un-whitespace-mangled.
>
> Ted, can you push this out? Works great. :) We might want to ask
> the other reporter of something similar (next-20090206: deadlock on
> ext4) to test it too. I'll ping him.

Do we completely understand the root cause, in terms of which commit
broken the mm/page-writeback.c code we were depending on? And if so,
what of the code in mm/page-writeback.c? Does anyone else use it?
Can anyone sanely use it?

And am I right in assuming that this only applies to 2.6.29-rcX
kernels, and is not needed for 2.6.28 or earlier kernels?

I hadn't yet pushed it out because I needed time to understand all of
these issues, hence these questions....

- Ted


2009-02-14 02:07:31

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579





------- Comment #10 from [email protected] 2009-02-13 18:07 -------
> Patch from Aneesh, un-whitespace-mangled.
>
> Ted, can you push this out? Works great. :) We might want to ask
> the other reporter of something similar (next-20090206: deadlock on
> ext4) to test it too. I'll ping him.

Do we completely understand the root cause, in terms of which commit
broken the mm/page-writeback.c code we were depending on? And if so,
what of the code in mm/page-writeback.c? Does anyone else use it?
Can anyone sanely use it?

And am I right in assuming that this only applies to 2.6.29-rcX
kernels, and is not needed for 2.6.28 or earlier kernels?

I hadn't yet pushed it out because I needed time to understand all of
these issues, hence these questions....

- Ted


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-14 03:52:44

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579





------- Comment #11 from [email protected] 2009-02-13 19:52 -------
Ted, all good questions. :) If you think we need to digest this a bit that's
fine, though I'd rather this go in than not at all, before 2.6.29 gets out the
door. But I think we still have some time.

The problem has not been seen on 2.6.28 kernels; I can't right now point for
sure to one of Nick's patch series that changed the behavior; in fact many
points in the series seemed to not work at all; even building kernels on other
filesystems got me hung up. Didn't inspire tons of confidence. :)

You're probably right though, and echoing my concern in the previous comment,
that we might need to look at this holistically and decide if this exact change
to ext4 is the right path or if the page-writeback infrastructure needs a
tweak...

-Eric


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-14 08:06:49

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [Bug 12579] ext4 filesystem hang

On Fri, Feb 13, 2009 at 02:06:06PM -0800, [email protected] wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=12579
>
>
>
>
>
> ------- Comment #8 from [email protected] 2009-02-13 14:06 -------
> Created an attachment (id=20238)
> --> (http://bugzilla.kernel.org/attachment.cgi?id=20238&action=view)
> Tested patch
>
> Patch from Aneesh, un-whitespace-mangled.
>
> Ted, can you push this out? Works great. :) We might want to ask the other
> reporter of something similar (next-20090206: deadlock on ext4) to test it too.
> I'll ping him.
>

I think this small changes is also needed.

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index f743524..c80e038 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2495,7 +2495,10 @@ static int ext4_da_writepages(struct address_space *mapping,
wbc->range_end = LLONG_MAX;
wbc->range_cyclic = 0;
range_cyclic = 1;
- cycled = 0;
+ if (index == 0)
+ cycled = 1;
+ else
+ cycled = 0;
} else
index = wbc->range_start >> PAGE_CACHE_SHIFT;


2009-02-14 08:08:07

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579





------- Comment #12 from [email protected] 2009-02-14 00:07 -------
On Fri, Feb 13, 2009 at 02:06:06PM -0800, [email protected]
wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=12579
>
>
>
>
>
> ------- Comment #8 from [email protected] 2009-02-13 14:06 -------
> Created an attachment (id=20238)
--> (http://bugzilla.kernel.org/attachment.cgi?id=20238&action=view)
> --> (http://bugzilla.kernel.org/attachment.cgi?id=20238&action=view)
> Tested patch
>
> Patch from Aneesh, un-whitespace-mangled.
>
> Ted, can you push this out? Works great. :) We might want to ask the other
> reporter of something similar (next-20090206: deadlock on ext4) to test it too.
> I'll ping him.
>

I think this small changes is also needed.

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index f743524..c80e038 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2495,7 +2495,10 @@ static int ext4_da_writepages(struct address_space
*mapping,
wbc->range_end = LLONG_MAX;
wbc->range_cyclic = 0;
range_cyclic = 1;
- cycled = 0;
+ if (index == 0)
+ cycled = 1;
+ else
+ cycled = 0;
} else
index = wbc->range_start >> PAGE_CACHE_SHIFT;



--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-14 08:40:20

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [Bug 12579] ext4 filesystem hang

On Fri, Feb 13, 2009 at 08:50:18PM -0500, Theodore Tso wrote:
> > Patch from Aneesh, un-whitespace-mangled.
> >
> > Ted, can you push this out? Works great. :) We might want to ask
> > the other reporter of something similar (next-20090206: deadlock on
> > ext4) to test it too. I'll ping him.
>
> Do we completely understand the root cause, in terms of which commit
> broken the mm/page-writeback.c code we were depending on? And if so,
> what of the code in mm/page-writeback.c? Does anyone else use it?
> Can anyone sanely use it?

AFAIU we need the changes even for older kernels. The
reasoning is, with delayed allocation we cannot allow to retry with lower
page index in write_cache_pages. We do retry even in older version of
kernel. What made it so easy to reproduce it on later kernels is that
we were doing a retry even if nr_to_write was zero. This got fixed on
mainline by 3a4c6800f31ea8395628af5e7e490270ee5d0585. So with that
change we are logically back to 2.6.28 state, But still the possibility
of deadlock remain.

>
> And am I right in assuming that this only applies to 2.6.29-rcX
> kernels, and is not needed for 2.6.28 or earlier kernels?

I guess the hang can happen on 2.6.28 or earlier kernels.

>
> I hadn't yet pushed it out because I needed time to understand all of
> these issues, hence these questions....

-aneesh

2009-02-14 08:40:32

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579





------- Comment #13 from [email protected] 2009-02-14 00:40 -------
On Fri, Feb 13, 2009 at 08:50:18PM -0500, Theodore Tso wrote:
> > Patch from Aneesh, un-whitespace-mangled.
> >
> > Ted, can you push this out? Works great. :) We might want to ask
> > the other reporter of something similar (next-20090206: deadlock on
> > ext4) to test it too. I'll ping him.
>
> Do we completely understand the root cause, in terms of which commit
> broken the mm/page-writeback.c code we were depending on? And if so,
> what of the code in mm/page-writeback.c? Does anyone else use it?
> Can anyone sanely use it?

AFAIU we need the changes even for older kernels. The
reasoning is, with delayed allocation we cannot allow to retry with lower
page index in write_cache_pages. We do retry even in older version of
kernel. What made it so easy to reproduce it on later kernels is that
we were doing a retry even if nr_to_write was zero. This got fixed on
mainline by 3a4c6800f31ea8395628af5e7e490270ee5d0585. So with that
change we are logically back to 2.6.28 state, But still the possibility
of deadlock remain.

>
> And am I right in assuming that this only applies to 2.6.29-rcX
> kernels, and is not needed for 2.6.28 or earlier kernels?

I guess the hang can happen on 2.6.28 or earlier kernels.

>
> I hadn't yet pushed it out because I needed time to understand all of
> these issues, hence these questions....

-aneesh


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-14 09:41:31

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [Bug 12579] ext4 filesystem hang

On Sat, Feb 14, 2009 at 02:10:04PM +0530, Aneesh Kumar K.V wrote:
> On Fri, Feb 13, 2009 at 08:50:18PM -0500, Theodore Tso wrote:
> > > Patch from Aneesh, un-whitespace-mangled.
> > >
> > > Ted, can you push this out? Works great. :) We might want to ask
> > > the other reporter of something similar (next-20090206: deadlock on
> > > ext4) to test it too. I'll ping him.
> >
> > Do we completely understand the root cause, in terms of which commit
> > broken the mm/page-writeback.c code we were depending on? And if so,
> > what of the code in mm/page-writeback.c? Does anyone else use it?
> > Can anyone sanely use it?
>
> AFAIU we need the changes even for older kernels. The
> reasoning is, with delayed allocation we cannot allow to retry with lower
> page index in write_cache_pages. We do retry even in older version of
> kernel. What made it so easy to reproduce it on later kernels is that
> we were doing a retry even if nr_to_write was zero. This got fixed on
> mainline by 3a4c6800f31ea8395628af5e7e490270ee5d0585. So with that
> change we are logically back to 2.6.28 state, But still the possibility
> of deadlock remain.
>

I found commit 31a12666d8f0c22235297e1c1575f82061480029 to be the root
cause. The commit is correct in what it does. Ext4 was dependent on the
wrong behaviour. The relevant change is

@@ -897,7 +903,6 @@ retry:
min(end - index, (pgoff_t)PAGEVEC_SIZE-1) + 1))) {
unsigned i;

- scanned = 1;
for (i = 0; i < nr_pages; i++) {


I think that caused us the retry. That would imply we may not need the
patch I did for 2.6.28. But given that Ext4 was dependent on the wrong
behaviour of write_cache_pages i would suggest we still push the patch
to 2.6.28

-aneesh

2009-02-14 09:42:41

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579





------- Comment #14 from [email protected] 2009-02-14 01:42 -------
On Sat, Feb 14, 2009 at 02:10:04PM +0530, Aneesh Kumar K.V wrote:
> On Fri, Feb 13, 2009 at 08:50:18PM -0500, Theodore Tso wrote:
> > > Patch from Aneesh, un-whitespace-mangled.
> > >
> > > Ted, can you push this out? Works great. :) We might want to ask
> > > the other reporter of something similar (next-20090206: deadlock on
> > > ext4) to test it too. I'll ping him.
> >
> > Do we completely understand the root cause, in terms of which commit
> > broken the mm/page-writeback.c code we were depending on? And if so,
> > what of the code in mm/page-writeback.c? Does anyone else use it?
> > Can anyone sanely use it?
>
> AFAIU we need the changes even for older kernels. The
> reasoning is, with delayed allocation we cannot allow to retry with lower
> page index in write_cache_pages. We do retry even in older version of
> kernel. What made it so easy to reproduce it on later kernels is that
> we were doing a retry even if nr_to_write was zero. This got fixed on
> mainline by 3a4c6800f31ea8395628af5e7e490270ee5d0585. So with that
> change we are logically back to 2.6.28 state, But still the possibility
> of deadlock remain.
>

I found commit 31a12666d8f0c22235297e1c1575f82061480029 to be the root
cause. The commit is correct in what it does. Ext4 was dependent on the
wrong behaviour. The relevant change is

@@ -897,7 +903,6 @@ retry:
min(end - index,
(pgoff_t)PAGEVEC_SIZE-1) + 1))) {
unsigned i;

- scanned = 1;
for (i = 0; i < nr_pages; i++) {


I think that caused us the retry. That would imply we may not need the
patch I did for 2.6.28. But given that Ext4 was dependent on the wrong
behaviour of write_cache_pages i would suggest we still push the patch
to 2.6.28

-aneesh


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-20 02:31:54

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579





------- Comment #15 from [email protected] 2009-02-19 18:31 -------
There is another bug reported about the deadlock (
http://bugzilla.kernel.org/show_bug.cgi?id=12610 ) happening on 2.6.28.2 and
later.


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2009-02-21 17:35:48

by bugme-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579


[email protected] changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |CLOSED
Resolution| |PATCH_ALREADY_AVAILABLE




------- Comment #16 from [email protected] 2009-02-21 09:35 -------
This patch has been merged into mainline. I will work on getting this pushed
into a stable kernel.


--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

2010-02-27 18:29:40

by bugzilla-daemon

[permalink] [raw]
Subject: [Bug 12579] ext4 filesystem hang

http://bugzilla.kernel.org/show_bug.cgi?id=12579


Jayson R. King <[email protected]> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |[email protected]




--- Comment #17 from Jayson R. King <[email protected]> 2010-02-27 18:28:56 ---
I sent mail a few hours ago about this bug occurring on a 2.6.27.y kernel. The
patch which fixed the bug on mainline also fixes 2.6.27.y, but it depends on
several other patches being merged in before it (mostly other ext4 fixes).

--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.