2008-10-17 12:43:35

by Alexander Beregalov

[permalink] [raw]
Subject: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

It is SMP x86_64, rootfs is on XFS

[ 4.419571] BUG: sleeping function called from invalid context at
kernel/rwsem.c:131
[ 4.419576] in_atomic(): 1, irqs_disabled(): 0, pid: 1396, name: mkdir
[ 4.419579] 2 locks held by mkdir/1396:
[ 4.419581] #0: (&type->i_mutex_dir_key#2/1){--..}, at:
[<ffffffff802a895f>] lookup_create+0x2b/0x99
[ 4.419594] #1: (&(&ip->i_lock)->mr_lock/1){--..}, at:
[<ffffffff8037d30f>] xfs_ilock+0x60/0x7e
[ 4.419605] Pid: 1396, comm: mkdir Not tainted 2.6.27-next-20081017 #2
[ 4.419608] Call Trace:
[ 4.419614] [<ffffffff80257a61>] ? __debug_show_held_locks+0x22/0x24
[ 4.419620] [<ffffffff802303ee>] __might_sleep+0x120/0x122
[ 4.419625] [<ffffffff8024e41d>] down_write_nested+0x2c/0x7b
[ 4.419629] [<ffffffff8037d30f>] xfs_ilock+0x60/0x7e
[ 4.419633] [<ffffffff8037d9b4>] xfs_iget_core+0x3cc/0x4f1
[ 4.419637] [<ffffffff802b2393>] ? iget_locked+0xf0/0x125
[ 4.419641] [<ffffffff8037dbc4>] xfs_iget+0xeb/0x197
[ 4.419646] [<ffffffff803933c6>] xfs_trans_iget+0xb4/0x126
[ 4.419651] [<ffffffff8038164b>] xfs_ialloc+0x9e/0x51e
[ 4.419655] [<ffffffff802304be>] ? get_parent_ip+0x11/0x41
[ 4.419659] [<ffffffff80393d5a>] xfs_dir_ialloc+0x78/0x26a
[ 4.419663] [<ffffffff8024e464>] ? down_write_nested+0x73/0x7b
[ 4.419667] [<ffffffff80395fbf>] xfs_mkdir+0x236/0x438
[ 4.419673] [<ffffffff8039fe54>] xfs_vn_mknod+0xe7/0x13b
[ 4.419677] [<ffffffff8039febb>] xfs_vn_mkdir+0x13/0x15
[ 4.419681] [<ffffffff802a9e1d>] vfs_mkdir+0x4b/0x8f
[ 4.419685] [<ffffffff802ab574>] sys_mkdirat+0xa7/0x101
[ 4.419691] [<ffffffff8054052c>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 4.419695] [<ffffffff802587b0>] ? trace_hardirqs_on_caller+0x12c/0x157
[ 4.419700] [<ffffffff80225010>] ? do_page_fault+0x394/0x884
[ 4.419704] [<ffffffff802ab5e6>] sys_mkdir+0x18/0x1a
[ 4.419709] [<ffffffff8020b8bb>] system_call_fastpath+0x16/0x1b
[ 4.743247] dmesg used greatest stack depth: 2848 bytes left
[ 5.930376] bnx2: eth0: using MSI
[ 6.056409] BUG: sleeping function called from invalid context at
kernel/rwsem.c:131
[ 6.056413] in_atomic(): 1, irqs_disabled(): 0, pid: 1782, name: dhcpcd
[ 6.056416] 2 locks held by dhcpcd/1782:
[ 6.056418] #0: (&type->i_mutex_dir_key#2){--..}, at:
[<ffffffff802aa763>] do_filp_open+0x178/0x7d5
[ 6.056431] #1: (&(&ip->i_lock)->mr_lock/1){--..}, at:
[<ffffffff8037d30f>] xfs_ilock+0x60/0x7e
[ 6.056443] Pid: 1782, comm: dhcpcd Not tainted 2.6.27-next-20081017 #2
[ 6.056445] Call Trace:
[ 6.056452] [<ffffffff80257a61>] ? __debug_show_held_locks+0x22/0x24
[ 6.056458] [<ffffffff802303ee>] __might_sleep+0x120/0x122
[ 6.056464] [<ffffffff8024e41d>] down_write_nested+0x2c/0x7b
[ 6.056468] [<ffffffff8037d30f>] xfs_ilock+0x60/0x7e
[ 6.056472] [<ffffffff8037d9b4>] xfs_iget_core+0x3cc/0x4f1
[ 6.056476] [<ffffffff802b2393>] ? iget_locked+0xf0/0x125
[ 6.056481] [<ffffffff8037dbc4>] xfs_iget+0xeb/0x197
[ 6.056486] [<ffffffff803933c6>] xfs_trans_iget+0xb4/0x126
[ 6.056491] [<ffffffff8038164b>] xfs_ialloc+0x9e/0x51e
[ 6.056495] [<ffffffff802304be>] ? get_parent_ip+0x11/0x41
[ 6.056499] [<ffffffff80393d5a>] xfs_dir_ialloc+0x78/0x26a
[ 6.056503] [<ffffffff8024e464>] ? down_write_nested+0x73/0x7b
[ 6.056508] [<ffffffff80395b8a>] xfs_create+0x239/0x438
[ 6.056512] [<ffffffff80256ccd>] ? get_lock_stats+0x2d/0x5c
[ 6.056518] [<ffffffff8039fe3b>] xfs_vn_mknod+0xce/0x13b
[ 6.056522] [<ffffffff8039fecd>] xfs_vn_create+0x10/0x12
[ 6.056526] [<ffffffff802a9dab>] vfs_create+0x5d/0x84
[ 6.056530] [<ffffffff802aa839>] do_filp_open+0x24e/0x7d5
[ 6.056534] [<ffffffff802304be>] ? get_parent_ip+0x11/0x41
[ 6.056539] [<ffffffff80231ee1>] ? sub_preempt_count+0xb0/0xc4
[ 6.056544] [<ffffffff802b40a4>] ? alloc_fd+0x122/0x133
[ 6.056549] [<ffffffff8029ef42>] do_sys_open+0x58/0xdf
[ 6.056553] [<ffffffff8029effc>] sys_open+0x20/0x22
[ 6.056561] [<ffffffff8020b8bb>] system_call_fastpath+0x16/0x1b
[ 9.098603] bnx2: eth0 NIC Copper Link is Up, 1000 Mbps full
duplex, receive & transmit flow control ON
[ 17.899872] BUG: sleeping function called from invalid context at
kernel/rwsem.c:131
[ 17.899876] in_atomic(): 1, irqs_disabled(): 0, pid: 1791, name:
dhcpcd-run-hook
[ 17.899879] 2 locks held by dhcpcd-run-hook/1791:
[ 17.899881] #0: (&type->i_mutex_dir_key#2){--..}, at:
[<ffffffff802aa763>] do_filp_open+0x178/0x7d5
[ 17.899893] #1: (&(&ip->i_lock)->mr_lock/1){--..}, at:
[<ffffffff8037d30f>] xfs_ilock+0x60/0x7e
[ 17.899905] Pid: 1791, comm: dhcpcd-run-hook Not tainted
2.6.27-next-20081017 #2
[ 17.899907] Call Trace:
[ 17.899914] [<ffffffff80257a61>] ? __debug_show_held_locks+0x22/0x24
[ 17.899920] [<ffffffff802303ee>] __might_sleep+0x120/0x122
[ 17.899926] [<ffffffff8024e41d>] down_write_nested+0x2c/0x7b
[ 17.899930] [<ffffffff8037d30f>] xfs_ilock+0x60/0x7e
[ 17.899934] [<ffffffff8037d9b4>] xfs_iget_core+0x3cc/0x4f1
[ 17.899939] [<ffffffff802b2393>] ? iget_locked+0xf0/0x125
[ 17.899943] [<ffffffff8037dbc4>] xfs_iget+0xeb/0x197
[ 17.899953] [<ffffffff803933c6>] xfs_trans_iget+0xb4/0x126
[ 17.899956] [<ffffffff8038164b>] xfs_ialloc+0x9e/0x51e
[ 17.899959] [<ffffffff802304be>] ? get_parent_ip+0x11/0x41
[ 17.899962] [<ffffffff80393d5a>] xfs_dir_ialloc+0x78/0x26a
[ 17.899965] [<ffffffff8024e464>] ? down_write_nested+0x73/0x7b
[ 17.899968] [<ffffffff80395b8a>] xfs_create+0x239/0x438
[ 17.899971] [<ffffffff80256ccd>] ? get_lock_stats+0x2d/0x5c
[ 17.899975] [<ffffffff8039fe3b>] xfs_vn_mknod+0xce/0x13b
[ 17.899978] [<ffffffff8039fecd>] xfs_vn_create+0x10/0x12
[ 17.899981] [<ffffffff802a9dab>] vfs_create+0x5d/0x84
[ 17.899984] [<ffffffff802aa839>] do_filp_open+0x24e/0x7d5
[ 17.899987] [<ffffffff802304be>] ? get_parent_ip+0x11/0x41
[ 17.899990] [<ffffffff80231ee1>] ? sub_preempt_count+0xb0/0xc4
[ 17.899993] [<ffffffff802b40a4>] ? alloc_fd+0x122/0x133
[ 17.899997] [<ffffffff8029ef42>] do_sys_open+0x58/0xdf
[ 17.900002] [<ffffffff80241376>] ? sigprocmask+0x37/0xd7
[ 17.900004] [<ffffffff8029effc>] sys_open+0x20/0x22
[ 17.900008] [<ffffffff8020b8bb>] system_call_fastpath+0x16/0x1b
[ 18.190958] Adding 3911788k swap on /dev/sda1. Priority:-1
extents:1 across:3911788k
[ 30.773375] CPA self-test:
[ 30.780146] 4k 523088 large 0 gb 0 x
262144[ffff880000000000-ffff88003ffff000] miss 0
[ 30.792087] 4k 523088 large 0 gb 0 x
262144[ffff880000000000-ffff88003ffff000] miss 0
[ 30.803460] 4k 523088 large 0 gb 0 x
262144[ffff880000000000-ffff88003ffff000] miss 0
[ 30.803463] ok.
[ 3846.588515] BUG: sleeping function called from invalid context at
kernel/rwsem.c:131
[ 3846.588519] in_atomic(): 1, irqs_disabled(): 0, pid: 2005, name: screen
[ 3846.588523] 2 locks held by screen/2005:
[ 3846.588524] #0: (&type->i_mutex_dir_key#2/1){--..}, at:
[<ffffffff802a895f>] lookup_create+0x2b/0x99
[ 3846.588538] #1: (&(&ip->i_lock)->mr_lock/1){--..}, at:
[<ffffffff8037d30f>] xfs_ilock+0x60/0x7e
[ 3846.588549] Pid: 2005, comm: screen Not tainted 2.6.27-next-20081017 #2
[ 3846.588552] Call Trace:
[ 3846.588559] [<ffffffff80257a61>] ? __debug_show_held_locks+0x22/0x24
[ 3846.588565] [<ffffffff802303ee>] __might_sleep+0x120/0x122
[ 3846.588570] [<ffffffff8024e41d>] down_write_nested+0x2c/0x7b
[ 3846.588574] [<ffffffff8037d30f>] xfs_ilock+0x60/0x7e
[ 3846.588578] [<ffffffff8037d9b4>] xfs_iget_core+0x3cc/0x4f1
[ 3846.588582] [<ffffffff802b2393>] ? iget_locked+0xf0/0x125
[ 3846.588586] [<ffffffff8037dbc4>] xfs_iget+0xeb/0x197
[ 3846.588591] [<ffffffff803933c6>] xfs_trans_iget+0xb4/0x126
[ 3846.588596] [<ffffffff8038164b>] xfs_ialloc+0x9e/0x51e
[ 3846.588599] [<ffffffff802304be>] ? get_parent_ip+0x11/0x41
[ 3846.588604] [<ffffffff80393d5a>] xfs_dir_ialloc+0x78/0x26a
[ 3846.588607] [<ffffffff8024e464>] ? down_write_nested+0x73/0x7b
[ 3846.588612] [<ffffffff80395b8a>] xfs_create+0x239/0x438
[ 3846.588617] [<ffffffff8039fe3b>] xfs_vn_mknod+0xce/0x13b
[ 3846.588621] [<ffffffff802a9d29>] vfs_mknod+0x75/0x9a
[ 3846.588625] [<ffffffff802ab76e>] sys_mknodat+0x186/0x1e5
[ 3846.588632] [<ffffffff8054052c>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 3846.588636] [<ffffffff802587b0>] ? trace_hardirqs_on_caller+0x12c/0x157
[ 3846.588640] [<ffffffff802ab7e7>] sys_mknod+0x1a/0x1c
[ 3846.588644] [<ffffffff8020b8bb>] system_call_fastpath+0x16/0x1b


2008-10-17 16:41:39

by Christoph Hellwig

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

On Fri, Oct 17, 2008 at 04:43:07PM +0400, Alexander Beregalov wrote:
> It is SMP x86_64, rootfs is on XFS

The message comes from code in XFS, but I really wonder where the atomic
context comes from. xfs_iget_core has a few changes in the tree pulled
into todays Linux-next, but it's still missing the recent updates
redoing this area completely. In some ways it looks like this process
has the the atomic context flag left pending somewhere, but that should
also trigger warnings on the VFS locks like i_mutex.

Very misterious.

IS there a specific workload you can reproduce this with?

2008-10-17 16:54:49

by Alexander Beregalov

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

2008/10/17 Christoph Hellwig <[email protected]>:
> On Fri, Oct 17, 2008 at 04:43:07PM +0400, Alexander Beregalov wrote:
>> It is SMP x86_64, rootfs is on XFS
>
> The message comes from code in XFS, but I really wonder where the atomic
> context comes from. xfs_iget_core has a few changes in the tree pulled
> into todays Linux-next, but it's still missing the recent updates
> redoing this area completely. In some ways it looks like this process
> has the the atomic context flag left pending somewhere, but that should
> also trigger warnings on the VFS locks like i_mutex.
>
> Very misterious.
>
> IS there a specific workload you can reproduce this with?
>
Workload is not specific, many i/o commands trigger the message.

dmesg|grep comm:
[ 4.419605] Pid: 1396, comm: mkdir Not tainted 2.6.27-next-20081017 #2
[ 6.056443] Pid: 1782, comm: dhcpcd Not tainted 2.6.27-next-20081017 #2
[ 17.899905] Pid: 1791, comm: dhcpcd-run-hook Not tainted
2.6.27-next-20081017 #2
[ 3846.588549] Pid: 2005, comm: screen Not tainted 2.6.27-next-20081017 #2
[ 4416.242723] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
[ 4430.618037] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
[ 4431.641578] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
[ 4432.944265] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
[ 4590.330706] Pid: 2026, comm: git-merge Not tainted 2.6.27-next-20081017 #2
[14882.802479] Pid: 2058, comm: git-merge Not tainted 2.6.27-next-20081017 #2
[16955.026436] Pid: 2091, comm: git-merge Not tainted 2.6.27-next-20081017 #2
[18466.645935] Pid: 2119, comm: git-merge Not tainted 2.6.27-next-20081017 #2
[19165.813049] Pid: 2150, comm: git-merge Not tainted 2.6.27-next-20081017 #2

2008-10-17 16:57:49

by Christoph Hellwig

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

On Fri, Oct 17, 2008 at 08:54:35PM +0400, Alexander Beregalov wrote:
> Workload is not specific, many i/o commands trigger the message.
>
> dmesg|grep comm:
> [ 4.419605] Pid: 1396, comm: mkdir Not tainted 2.6.27-next-20081017 #2
> [ 6.056443] Pid: 1782, comm: dhcpcd Not tainted 2.6.27-next-20081017 #2
> [ 17.899905] Pid: 1791, comm: dhcpcd-run-hook Not tainted
> 2.6.27-next-20081017 #2
> [ 3846.588549] Pid: 2005, comm: screen Not tainted 2.6.27-next-20081017 #2
> [ 4416.242723] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
> [ 4430.618037] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
> [ 4431.641578] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
> [ 4432.944265] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
> [ 4590.330706] Pid: 2026, comm: git-merge Not tainted 2.6.27-next-20081017 #2
> [14882.802479] Pid: 2058, comm: git-merge Not tainted 2.6.27-next-20081017 #2
> [16955.026436] Pid: 2091, comm: git-merge Not tainted 2.6.27-next-20081017 #2
> [18466.645935] Pid: 2119, comm: git-merge Not tainted 2.6.27-next-20081017 #2
> [19165.813049] Pid: 2150, comm: git-merge Not tainted 2.6.27-next-20081017 #2

Very strange. Can you retry with tomorrow Linux-next which should have
a large XFS update? If it still happens I'll investigate it in more
detail.

2008-10-17 17:13:19

by Alexander Beregalov

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

2008/10/17 Christoph Hellwig <[email protected]>:
> Very strange. Can you retry with tomorrow Linux-next which should have
> a large XFS update? If it still happens I'll investigate it in more
> detail.

I have tried next-20081016 with the same result.
I am afraid I wont be able to test it tomorrow, it will likely be
possible on Monday.

2008-10-17 20:37:27

by Christoph Hellwig

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

I just ran the xfs testsuite over linux-next on qemu (i386), and I can't
find anything at all. Really strange. Let's see if this still there
with mondays linux-next, and if yes can you just try the xfs patch from
the splitout linux-next patches and see if that alone causes it?

In fact that might be useful for todays linux-next, too.

2008-10-17 20:55:37

by Arjan van de Ven

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

On Fri, 17 Oct 2008 16:37:10 -0400
Christoph Hellwig <[email protected]> wrote:

> I just ran the xfs testsuite over linux-next on qemu (i386), and I
> can't find anything at all. Really strange. Let's see if this still
> there with mondays linux-next, and if yes can you just try the xfs
> patch from the splitout linux-next patches and see if that alone
> causes it?
>
> In fact that might be useful for todays linux-next, too.
> --

also if you enable lockdep, it tracks where irq's got turned off (with
a stacktrace I think) so it would provide some good clues.


--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2008-10-20 14:58:20

by Alexander Beregalov

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

2008/10/18 Arjan van de Ven <[email protected]>:
> On Fri, 17 Oct 2008 16:37:10 -0400
> Christoph Hellwig <[email protected]> wrote:
>
>> I just ran the xfs testsuite over linux-next on qemu (i386), and I
>> can't find anything at all. Really strange. Let's see if this still
>> there with mondays linux-next, and if yes can you just try the xfs
>> patch from the splitout linux-next patches and see if that alone
>> causes it?

Hi Christoph

I have the same result with next-1020 and today's xfs-2.6.git/master
(
commit bfd2bd10da76378dc4afd87d7d204a1d3d70b347
Author: David Chinner <[email protected]>
Date: Fri Oct 17 15:36:23 2008 +1000
Inode: Allow external list initialisation
)

Should I start bisecting?
>>
>> In fact that might be useful for todays linux-next, too.
>> --
>
> also if you enable lockdep, it tracks where irq's got turned off (with
> a stacktrace I think) so it would provide some good clues.

Yes, lockdep was enabled, but there is no more information than I
already posted.

2008-10-20 16:33:45

by Christoph Hellwig

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

On Mon, Oct 20, 2008 at 06:58:09PM +0400, Alexander Beregalov wrote:
> Hi Christoph
>
> I have the same result with next-1020 and today's xfs-2.6.git/master
> (
> commit bfd2bd10da76378dc4afd87d7d204a1d3d70b347
> Author: David Chinner <[email protected]>
> Date: Fri Oct 17 15:36:23 2008 +1000
> Inode: Allow external list initialisation
> )
>
> Should I start bisecting?

That would be extremely helpful!


And as mentioned in the other mail I still can't reproduce it, but I can
currently only test on x86-32, not x86-64.

2008-10-20 17:22:50

by Alexander Beregalov

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

2008/10/20 Christoph Hellwig <[email protected]>:
> On Mon, Oct 20, 2008 at 06:58:09PM +0400, Alexander Beregalov wrote:
>> Hi Christoph
>>
>> I have the same result with next-1020 and today's xfs-2.6.git/master
>> (
>> commit bfd2bd10da76378dc4afd87d7d204a1d3d70b347
>> Author: David Chinner <[email protected]>
>> Date: Fri Oct 17 15:36:23 2008 +1000
>> Inode: Allow external list initialisation
>> )
Ha, that kernel (xfs/master) has made my system dead locked.
SysRQ-d:
Show all locks held in the system
1 lock held by pdflush
(&type->s_umount_key#18{----}, at writeback_inodes
1 lock held by login
(&(&ip->i_iolock)->mr_lock){----} at xfs_ilock
and so on ( many locks at xfs_ilock)

>>
>> Should I start bisecting?
>
> That would be extremely helpful!
>
>
> And as mentioned in the other mail I still can't reproduce it, but I can
> currently only test on x86-32, not x86-64.
>

2008-10-20 22:36:19

by Dave Chinner

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

On Mon, Oct 20, 2008 at 09:13:05PM +0400, Alexander Beregalov wrote:
> 2008/10/20 Christoph Hellwig <[email protected]>:
> > On Mon, Oct 20, 2008 at 06:58:09PM +0400, Alexander Beregalov wrote:
> >> Hi Christoph
> >>
> >> I have the same result with next-1020 and today's xfs-2.6.git/master
> >> (
> >> commit bfd2bd10da76378dc4afd87d7d204a1d3d70b347
> >> Author: David Chinner <[email protected]>
> >> Date: Fri Oct 17 15:36:23 2008 +1000
> >> Inode: Allow external list initialisation
> >> )
> Ha, that kernel (xfs/master) has made my system dead locked.
> SysRQ-d:
> Show all locks held in the system
> 1 lock held by pdflush
> (&type->s_umount_key#18{----}, at writeback_inodes
> 1 lock held by login
> (&(&ip->i_iolock)->mr_lock){----} at xfs_ilock
> and so on ( many locks at xfs_ilock)

Curious. Can you post the full stack traces?

Cheers,

Dave.
--
Dave Chinner
[email protected]

2008-10-21 11:42:48

by Alexander Beregalov

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

2008/10/21 Dave Chinner <[email protected]>:
> On Mon, Oct 20, 2008 at 09:13:05PM +0400, Alexander Beregalov wrote:
>> 2008/10/20 Christoph Hellwig <[email protected]>:
>> > On Mon, Oct 20, 2008 at 06:58:09PM +0400, Alexander Beregalov wrote:
>> >> Hi Christoph
>> >>
>> >> I have the same result with next-1020 and today's xfs-2.6.git/master
>> >> (
>> >> commit bfd2bd10da76378dc4afd87d7d204a1d3d70b347
>> >> Author: David Chinner <[email protected]>
>> >> Date: Fri Oct 17 15:36:23 2008 +1000
>> >> Inode: Allow external list initialisation
>> >> )
>> Ha, that kernel (xfs/master) has made my system dead locked.
>> SysRQ-d:
>> Show all locks held in the system
>> 1 lock held by pdflush
>> (&type->s_umount_key#18{----}, at writeback_inodes
>> 1 lock held by login
>> (&(&ip->i_iolock)->mr_lock){----} at xfs_ilock
>> and so on ( many locks at xfs_ilock)
>
> Curious. Can you post the full stack traces?
I can not reproduce it, yet.

Bisected to:
dd509097cb0b76d3836385f80d6b2d6fd3b97757 is first bad commit
commit dd509097cb0b76d3836385f80d6b2d6fd3b97757
Author: Lachlan McIlroy <[email protected]>
Date: Mon Sep 29 14:56:40 2008 +1000

[XFS] Unlock inode before calling xfs_idestroy()

Lock debugging reported the ilock was being destroyed without being
unlocked. We don't need to lock the inode until we are going to insert it
into the radix tree.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32159a

Signed-off-by: Lachlan McIlroy <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>

:040000 040000 b56d4b74bf0cbdd66d0e61c537b4bb2f7461084f
40773db6102a1e348a0a21b0e960cb33ffd0d419 M fs

git bisect start 'fs/xfs'
git bisect bad bfd2bd10da76378dc4afd87d7d204a1d3d70b347 Inode: Allow
external list initialisation
git bisect good 3fa8749e584b55f1180411ab1b51117190bac1e5 Linux 2.6.27
git bisect skip a55c8e45381bcc5588a544ba73580719887372eb
git bisect skip 35afc673a41114eb650c6ae766010160dd982a7b
git bisect skip 7a9ba9bb899933293604a2b3c5ca4f40ad5a92a8
git bisect skip c02afc5fa7433fbbc0a045afbb472533de0758de
git bisect skip c0f47794ba870f0cd1dbe5c8fdbf56b86f5f2afa
git bisect skip 7197197719eb94b527c6a422bf3fb682cdf89954
git bisect good e0fe783155e4f1c7106f3579c258b9f995330c19
git bisect bad 2472c6b938d2b3cb1698abe39cc90a3c1d7625b9
git bisect good 20c1bd2cc6d9311fb7d7d0eb91b46dc4a42b5d11
git bisect good 8e47c0b2427f0ea35984f02648163cc7a35d3592
git bisect good a1853934a9bef78aeb1aa7539c629cdb755edab2
git bisect bad 05de34dbbe744a4d235279f1493a8f05b903a4bb
git bisect bad c4fa37724d18a3444bb4b4f77c4580b9dd525ed9
git bisect good 4b4577db477462ff6f41babcc3b3e7036a1ba27d
git bisect bad dd509097cb0b76d3836385f80d6b2d6fd3b97757

2008-10-22 07:58:55

by Dave Chinner

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

On Tue, Oct 21, 2008 at 03:42:16PM +0400, Alexander Beregalov wrote:
> 2008/10/21 Dave Chinner <[email protected]>:
> > On Mon, Oct 20, 2008 at 09:13:05PM +0400, Alexander Beregalov wrote:
> >> 2008/10/20 Christoph Hellwig <[email protected]>:
> >> > On Mon, Oct 20, 2008 at 06:58:09PM +0400, Alexander Beregalov wrote:
> >> >> Hi Christoph
> >> >>
> >> >> I have the same result with next-1020 and today's xfs-2.6.git/master
> >> >> (
> >> >> commit bfd2bd10da76378dc4afd87d7d204a1d3d70b347
> >> >> Author: David Chinner <[email protected]>
> >> >> Date: Fri Oct 17 15:36:23 2008 +1000
> >> >> Inode: Allow external list initialisation
> >> >> )
> >> Ha, that kernel (xfs/master) has made my system dead locked.
> >> SysRQ-d:
> >> Show all locks held in the system
> >> 1 lock held by pdflush
> >> (&type->s_umount_key#18{----}, at writeback_inodes
> >> 1 lock held by login
> >> (&(&ip->i_iolock)->mr_lock){----} at xfs_ilock
> >> and so on ( many locks at xfs_ilock)
> >
> > Curious. Can you post the full stack traces?
> I can not reproduce it, yet.
>
> Bisected to:
> dd509097cb0b76d3836385f80d6b2d6fd3b97757 is first bad commit
> commit dd509097cb0b76d3836385f80d6b2d6fd3b97757
> Author: Lachlan McIlroy <[email protected]>
> Date: Mon Sep 29 14:56:40 2008 +1000
>
> [XFS] Unlock inode before calling xfs_idestroy()
>
> Lock debugging reported the ilock was being destroyed without being
> unlocked. We don't need to lock the inode until we are going to insert it
> into the radix tree.

Ah, OK, I see the problem, though I don't understand why I'm not
seeing the might_sleep() triggering all the time given that I always
build with:

$ grep SLEEP .config
CONFIG_DEBUG_SPINLOCK_SLEEP=y

Basically the above commit moved xfs_ilock() inside
radix_tree_preload()/radix_tree_preload_end(), which means we are
taking a rwsem() while we have an elevated preempt count. I'll
get a patch out to fix it.

The question is why isn't this triggering for me during QA runs?

Cheers,

Dave.
--
Dave Chinner
[email protected]

2008-10-22 08:21:43

by Alexander Beregalov

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

> Ah, OK, I see the problem, though I don't understand why I'm not
> seeing the might_sleep() triggering all the time given that I always
> build with:
>
> $ grep SLEEP .config
> CONFIG_DEBUG_SPINLOCK_SLEEP=y
>
> Basically the above commit moved xfs_ilock() inside
> radix_tree_preload()/radix_tree_preload_end(), which means we are
> taking a rwsem() while we have an elevated preempt count. I'll
> get a patch out to fix it.
Could it cause the I/O dead lock or should I continue trying to reproduce it?

2008-10-22 08:26:12

by Dave Chinner

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

On Wed, Oct 22, 2008 at 06:58:38PM +1100, Dave Chinner wrote:
> On Tue, Oct 21, 2008 at 03:42:16PM +0400, Alexander Beregalov wrote:
> > Bisected to:
> > dd509097cb0b76d3836385f80d6b2d6fd3b97757 is first bad commit
> > commit dd509097cb0b76d3836385f80d6b2d6fd3b97757
> > Author: Lachlan McIlroy <[email protected]>
> > Date: Mon Sep 29 14:56:40 2008 +1000
> >
> > [XFS] Unlock inode before calling xfs_idestroy()
> >
> > Lock debugging reported the ilock was being destroyed without being
> > unlocked. We don't need to lock the inode until we are going to insert it
> > into the radix tree.
>
> Ah, OK, I see the problem, though I don't understand why I'm not
> seeing the might_sleep() triggering all the time given that I always
> build with:
>
> $ grep SLEEP .config
> CONFIG_DEBUG_SPINLOCK_SLEEP=y
>
> Basically the above commit moved xfs_ilock() inside
> radix_tree_preload()/radix_tree_preload_end(), which means we are
> taking a rwsem() while we have an elevated preempt count. I'll
> get a patch out to fix it.

Patch below (against the xfs master/linux-next branch) should fix the
regression. I've just started QA on it. Can you please check that
it works for you, Alexander?

Cheers,

Dave.
--
Dave Chinner
[email protected]

XFS: Can't lock inodes in radix tree preload region

When we are inside a radix tree preload region, we cannot
sleep. Recently we moved the inode locking inside the
preload region for the inode radix tree. Fix that,
and fix a missed unlock in another error path in the
same code at the same time.

Signed-off-by: Dave Chinner <[email protected]>
---
fs/xfs/xfs_iget.c | 18 +++++++++++-------
1 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
index fbc6088..837cae7 100644
--- a/fs/xfs/xfs_iget.c
+++ b/fs/xfs/xfs_iget.c
@@ -159,18 +159,19 @@ xfs_iget_cache_miss(
goto out_destroy;
}

+ if (lock_flags)
+ xfs_ilock(ip, lock_flags);
+
/*
* Preload the radix tree so we can insert safely under the
- * write spinlock.
+ * write spinlock. Note that we cannot sleep inside the preload
+ * region.
*/
if (radix_tree_preload(GFP_KERNEL)) {
error = EAGAIN;
- goto out_destroy;
+ goto out_unlock;
}

- if (lock_flags)
- xfs_ilock(ip, lock_flags);
-
mask = ~(((XFS_INODE_CLUSTER_SIZE(mp) >> mp->m_sb.sb_inodelog)) - 1);
first_index = agino & mask;
write_lock(&pag->pag_ici_lock);
@@ -181,7 +182,7 @@ xfs_iget_cache_miss(
WARN_ON(error != -EEXIST);
XFS_STATS_INC(xs_ig_dup);
error = EAGAIN;
- goto out_unlock;
+ goto out_preload_end;
}

/* These values _must_ be set before releasing the radix tree lock! */
@@ -193,9 +194,12 @@ xfs_iget_cache_miss(
*ipp = ip;
return 0;

-out_unlock:
+out_preload_end:
write_unlock(&pag->pag_ici_lock);
radix_tree_preload_end();
+out_unlock:
+ if (lock_flags)
+ xfs_iunlock(ip, lock_flags);
out_destroy:
xfs_destroy_inode(ip);
return error;

2008-10-22 08:28:55

by Dave Chinner

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

On Wed, Oct 22, 2008 at 12:21:23PM +0400, Alexander Beregalov wrote:
> > Ah, OK, I see the problem, though I don't understand why I'm not
> > seeing the might_sleep() triggering all the time given that I always
> > build with:
> >
> > $ grep SLEEP .config
> > CONFIG_DEBUG_SPINLOCK_SLEEP=y
> >
> > Basically the above commit moved xfs_ilock() inside
> > radix_tree_preload()/radix_tree_preload_end(), which means we are
> > taking a rwsem() while we have an elevated preempt count. I'll
> > get a patch out to fix it.
> Could it cause the I/O dead lock or should I continue trying to reproduce it?

The deadlock wouldn't be produced by the same thing that
produced the sleeping-in-atomic warning. The missed unlock that
I also fixed in the patch I just sent could possibly have caused
that, but I'm just speculating on that...

Cheers,

Dave.
--
Dave Chinner
[email protected]

2008-10-22 09:12:33

by Alexander Beregalov

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

2008/10/22 Dave Chinner <[email protected]>:
> On Wed, Oct 22, 2008 at 06:58:38PM +1100, Dave Chinner wrote:
>> On Tue, Oct 21, 2008 at 03:42:16PM +0400, Alexander Beregalov wrote:
>> > Bisected to:
>> > dd509097cb0b76d3836385f80d6b2d6fd3b97757 is first bad commit
>> > commit dd509097cb0b76d3836385f80d6b2d6fd3b97757
>> > Author: Lachlan McIlroy <[email protected]>
>> > Date: Mon Sep 29 14:56:40 2008 +1000
>> >
>> > [XFS] Unlock inode before calling xfs_idestroy()
>> >
>> > Lock debugging reported the ilock was being destroyed without being
>> > unlocked. We don't need to lock the inode until we are going to insert it
>> > into the radix tree.
>>
>> Ah, OK, I see the problem, though I don't understand why I'm not
>> seeing the might_sleep() triggering all the time given that I always
>> build with:
>>
>> $ grep SLEEP .config
>> CONFIG_DEBUG_SPINLOCK_SLEEP=y
>>
>> Basically the above commit moved xfs_ilock() inside
>> radix_tree_preload()/radix_tree_preload_end(), which means we are
>> taking a rwsem() while we have an elevated preempt count. I'll
>> get a patch out to fix it.
>
> Patch below (against the xfs master/linux-next branch) should fix the
> regression. I've just started QA on it. Can you please check that
> it works for you, Alexander?
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> [email protected]
>
> XFS: Can't lock inodes in radix tree preload region
>
> When we are inside a radix tree preload region, we cannot
> sleep. Recently we moved the inode locking inside the
> preload region for the inode radix tree. Fix that,
> and fix a missed unlock in another error path in the
> same code at the same time.
>
> Signed-off-by: Dave Chinner <[email protected]>
Tested-by: Alexander Beregalov <[email protected]>

Yes, it fixes the issue. Thanks

2008-10-22 10:14:07

by Christoph Hellwig

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

On Wed, Oct 22, 2008 at 07:25:50PM +1100, Dave Chinner wrote:
> > Basically the above commit moved xfs_ilock() inside
> > radix_tree_preload()/radix_tree_preload_end(), which means we are
> > taking a rwsem() while we have an elevated preempt count. I'll
> > get a patch out to fix it.

This really needs a warning. Then again I don't really understand this
as the point of radix_tree_preload was that we can do the actual
radix-tree under a lock, or not?

Anyway, the patch looks correct to me, including the slightly unrelated
hunk to fix the missing unlock and preload leak.

2008-10-22 15:35:29

by Johannes Weiner

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS?

Dave Chinner <[email protected]> writes:

> On Wed, Oct 22, 2008 at 06:58:38PM +1100, Dave Chinner wrote:
>> On Tue, Oct 21, 2008 at 03:42:16PM +0400, Alexander Beregalov wrote:
>> > Bisected to:
>> > dd509097cb0b76d3836385f80d6b2d6fd3b97757 is first bad commit
>> > commit dd509097cb0b76d3836385f80d6b2d6fd3b97757
>> > Author: Lachlan McIlroy <[email protected]>
>> > Date: Mon Sep 29 14:56:40 2008 +1000
>> >
>> > [XFS] Unlock inode before calling xfs_idestroy()
>> >
>> > Lock debugging reported the ilock was being destroyed without being
>> > unlocked. We don't need to lock the inode until we are going to insert it
>> > into the radix tree.
>>
>> Ah, OK, I see the problem, though I don't understand why I'm not
>> seeing the might_sleep() triggering all the time given that I always
>> build with:
>>
>> $ grep SLEEP .config
>> CONFIG_DEBUG_SPINLOCK_SLEEP=y
>>
>> Basically the above commit moved xfs_ilock() inside
>> radix_tree_preload()/radix_tree_preload_end(), which means we are
>> taking a rwsem() while we have an elevated preempt count. I'll
>> get a patch out to fix it.
>
> Patch below (against the xfs master/linux-next branch) should fix the
> regression. I've just started QA on it. Can you please check that
> it works for you, Alexander?

Ran into the same problem, the fix worked for me.

Thank you.

Hannes

2008-10-22 21:10:45

by Dave Chinner

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

On Wed, Oct 22, 2008 at 06:13:51AM -0400, Christoph Hellwig wrote:
> On Wed, Oct 22, 2008 at 07:25:50PM +1100, Dave Chinner wrote:
> > > Basically the above commit moved xfs_ilock() inside
> > > radix_tree_preload()/radix_tree_preload_end(), which means we are
> > > taking a rwsem() while we have an elevated preempt count. I'll
> > > get a patch out to fix it.
>
> This really needs a warning. Then again I don't really understand this
> as the point of radix_tree_preload was that we can do the actual
> radix-tree under a lock, or not?

Right - the preload allows us to do GFP_KERNEL allocations for radix
tree nodes and use a spinlock for inserts into the tree. We could
drop the preload stuff if we initialised the radix tree to use
GFP_ATOMIC allocations for radix tree nodes, but that is more likely
to lead to insert failures under low memory conditions compared to
the preload method.

Cheers,

Dave.
--
Dave Chinner
[email protected]