On Tue, Nov 20, 2012 at 12:53 AM, Dave Chinner <[email protected]> wrote:
> [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
> [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
> [<ffffffff810e9dd5>] kmem_cache_alloc+0x35/0xe0
> [<ffffffff810dba31>] vm_map_ram+0x271/0x770
> [<ffffffff811e1316>] _xfs_buf_map_pages+0x46/0xe0
> [<ffffffff811e222a>] xfs_buf_get_map+0x8a/0x130
> [<ffffffff81233ab9>] xfs_trans_get_buf_map+0xa9/0xd0
> [<ffffffff8121bced>] xfs_ialloc_inode_init+0xcd/0x1d0
>
> We shouldn't be mapping buffers there, there's a patch below to fix
> this. It's probably the source of this report, even though I cannot
> lockdep seems to be off with the fairies...

That patch seems to break my system.
After it started to swap, because I was compiling seamonkey (firefox
turned into the full navigator suite) on a tmpfs, several processes
got stuck and triggered the hung-task-check.
As a kswapd, xfsaild/md4 and flush-9:4 also got stuck, not even a
shutdown worked.

The attached log first contains the hung-task-notices, then the output
from SysRq+W.

After the shutdown got stuck trying to turn of swap, I first tries
SysRq+S, but did not get a 'Done' and on SysRq+U lockdep complained
about an lock imbalance wrt. sb_writer. SysRq+O also did no longer
work, only SysRq+B.

I don't know which one got stuck first, but I'm somewhat suspicious of
the plasma-desktop and the sshd that SysRq+W reported stuck in xfs
reclaim, even if these processes did never trigger the hung task
check.

Torsten

> Cheers,
>
> Dave.
> --
> Dave Chinner
> [email protected]
>
> xfs: inode allocation should use unmapped buffers.
>
> From: Dave Chinner <[email protected]>
>
> Inode buffers do not need to be mapped as inodes are read or written
> directly from/to the pages underlying the buffer. This fixes a
> regression introduced by commit 611c994 ("xfs: make XBF_MAPPED the
> default behaviour").
>
> Signed-off-by: Dave Chinner <[email protected]>
> ---
> fs/xfs/xfs_ialloc.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_ialloc.c b/fs/xfs/xfs_ialloc.c
> index 2d6495e..a815412 100644
> --- a/fs/xfs/xfs_ialloc.c
> +++ b/fs/xfs/xfs_ialloc.c
> @@ -200,7 +200,8 @@ xfs_ialloc_inode_init(
> */
> d = XFS_AGB_TO_DADDR(mp, agno, agbno + (j * blks_per_cluster));
> fbuf = xfs_trans_get_buf(tp, mp->m_ddev_targp, d,
> - mp->m_bsize * blks_per_cluster, 0);
> + mp->m_bsize * blks_per_cluster,
> + XBF_UNMAPPED);
> if (!fbuf)
> return ENOMEM;
> /*

Attachments:

xfs-reclaim-hang-messages.txt (77.93 kB)

2012-11-20 20:27:49

by Dave Chinner

[permalink] [raw]

Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

On Tue, Nov 20, 2012 at 08:45:03PM +0100, Torsten Kaiser wrote:
> On Tue, Nov 20, 2012 at 12:53 AM, Dave Chinner <[email protected]> wrote:
> > [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
> > [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
> > [<ffffffff810e9dd5>] kmem_cache_alloc+0x35/0xe0
> > [<ffffffff810dba31>] vm_map_ram+0x271/0x770
> > [<ffffffff811e1316>] _xfs_buf_map_pages+0x46/0xe0
> > [<ffffffff811e222a>] xfs_buf_get_map+0x8a/0x130
> > [<ffffffff81233ab9>] xfs_trans_get_buf_map+0xa9/0xd0
> > [<ffffffff8121bced>] xfs_ialloc_inode_init+0xcd/0x1d0
> >
> > We shouldn't be mapping buffers there, there's a patch below to fix
> > this. It's probably the source of this report, even though I cannot
> > lockdep seems to be off with the fairies...
>
> That patch seems to break my system.

You've got an IO problem, not an XFS problem. Everything is hung up
on MD.

INFO: task kswapd0:725 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kswapd0 D 0000000000000001 0 725 2 0x00000000
ffff8803280d13f8 0000000000000046 ffff880329a0ab80 ffff8803280d1fd8
ffff8803280d1fd8 ffff8803280d1fd8 ffff880046b7c880 ffff880329a0ab80
ffff8803280d1408 ffff8803278dbbd0 ffff8803278db800 00000000ffffffff
Call Trace:
[<ffffffff816b1224>] schedule+0x24/0x60
[<ffffffff814f9dad>] md_super_wait+0x4d/0x80
[<ffffffff81500753>] bitmap_unplug+0x173/0x180
[<ffffffff814e8eb8>] raid1_unplug+0x98/0x110
[<ffffffff81278a6d>] blk_flush_plug_list+0xad/0x240
[<ffffffff816b15c3>] io_schedule_timeout+0x83/0xf0
[<ffffffff810b0e1d>] mempool_alloc+0x12d/0x160
[<ffffffff811263da>] bvec_alloc_bs+0xda/0x100
[<ffffffff811264ea>] bio_alloc_bioset+0xea/0x110
[<ffffffff81126656>] bio_clone_bioset+0x16/0x40
[<ffffffff814f471a>] bio_clone_mddev+0x1a/0x30
[<ffffffff814edbb1>] make_request+0x551/0xde0
[<ffffffff814f80bb>] md_make_request+0x21b/0x4d0
[<ffffffff81276e52>] generic_make_request+0xc2/0x100
[<ffffffff81276ef5>] submit_bio+0x65/0x110
[<ffffffff811e07bf>] xfs_submit_ioend_bio.isra.21+0x2f/0x40
[<ffffffff811e088e>] xfs_submit_ioend+0xbe/0x110
[<ffffffff811e0c91>] xfs_vm_writepage+0x3b1/0x540
[<ffffffff810bcd84>] shrink_page_list+0x564/0x890
[<ffffffff810bd637>] shrink_inactive_list+0x1d7/0x310
[<ffffffff810bdb9d>] shrink_lruvec+0x42d/0x530
[<ffffffff810be323>] kswapd+0x683/0xa20
[<ffffffff8105c246>] kthread+0xd6/0xe0
[<ffffffff816b31ac>] ret_from_fork+0x7c/0xb0
no locks held by kswapd0/725.

So kswapd is trying to clean pages, but it's blocked in an unplug
during IO submission. Probably one to report to the linux-raid list.

INFO: task xfsaild/md4:1742 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
xfsaild/md4 D 0000000000000003 0 1742 2 0x00000000
ffff88032438bb68 0000000000000046 ffff880329965700 ffff88032438bfd8
ffff88032438bfd8 ffff88032438bfd8 ffff88032827e580 ffff880329965700
ffff88032438bb78 ffff8803278dbbd0 ffff8803278db800 00000000ffffffff
Call Trace:
[<ffffffff816b1224>] schedule+0x24/0x60
[<ffffffff814f9dad>] md_super_wait+0x4d/0x80
[<ffffffff8105ca30>] ? __init_waitqueue_head+0x60/0x60
[<ffffffff81500753>] bitmap_unplug+0x173/0x180
[<ffffffff81278c13>] ? blk_finish_plug+0x13/0x50
[<ffffffff814e8eb8>] raid1_unplug+0x98/0x110
[<ffffffff81278a6d>] blk_flush_plug_list+0xad/0x240
[<ffffffff81278c13>] blk_finish_plug+0x13/0x50
[<ffffffff811e296a>] __xfs_buf_delwri_submit+0x1ca/0x1e0
[<ffffffff811e2ffb>] xfs_buf_delwri_submit_nowait+0x1b/0x20
[<ffffffff81233066>] xfsaild+0x226/0x4c0
[<ffffffff81065dfa>] ? finish_task_switch+0x3a/0x100
[<ffffffff81232e40>] ? xfs_trans_ail_cursor_first+0xa0/0xa0
[<ffffffff8105c246>] kthread+0xd6/0xe0
[<ffffffff816b246b>] ? _raw_spin_unlock_irq+0x2b/0x50
[<ffffffff8105c170>] ? flush_kthread_worker+0xe0/0xe0
[<ffffffff816b31ac>] ret_from_fork+0x7c/0xb0
[<ffffffff8105c170>] ? flush_kthread_worker+0xe0/0xe0
no locks held by xfsaild/md4/1742.

Same here - metadata writes are backed up waiting for MD to submit
IO. Everything else is stuck on thesei or MD, too...

Cheers,

Dave.
--
Dave Chinner
[email protected]