2009-04-10 16:35:29

by Alessio Igor Bogani

Subject: [PATCH] remove the BKL: remove "BKL auto-drop" assumption from nfs3_rpc_wrapper()

Fix nfs3_rpc_wrapper()'s assumption that schedule() drops the BKL
automatically. When schedule_timeout_killable() does not drop the BKL,
the retry loop can lock up.

Signed-off-by: Alessio Igor Bogani <[email protected]>
---
fs/nfs/nfs3proc.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
index d0cc5ce..d91047c 100644
--- a/fs/nfs/nfs3proc.c
+++ b/fs/nfs/nfs3proc.c
@@ -17,6 +17,7 @@
 #include <linux/nfs_page.h>
 #include <linux/lockd/bind.h>
 #include <linux/nfs_mount.h>
+#include <linux/smp_lock.h>
 
 #include "iostat.h"
 #include "internal.h"
@@ -28,11 +29,17 @@ static int
 nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message *msg, int flags)
 {
 	int res;
+	int bkl = kernel_locked();
+
 	do {
 		res = rpc_call_sync(clnt, msg, flags);
 		if (res != -EJUKEBOX)
 			break;
+		if (bkl)
+			unlock_kernel();
 		schedule_timeout_killable(NFS_JUKEBOX_RETRY_TIME);
+		if (bkl)
+			lock_kernel();
 		res = -ERESTARTSYS;
 	} while (!fatal_signal_pending(current));
 	return res;
--
1.6.0.4


2009-04-10 18:29:52

by Frederic Weisbecker

Subject: Re: [PATCH] remove the BKL: remove "BKL auto-drop" assumption from nfs3_rpc_wrapper()

On Fri, Apr 10, 2009 at 06:34:41PM +0200, Alessio Igor Bogani wrote:
> Fix nfs3_rpc_wrapper()'s "schedule() drops the BKL automatically" assumption,
> when schedule_timeout_killable() does not do that it can lock up.
>
> Signed-off-by: Alessio Igor Bogani <[email protected]>


Hi Alessio,

Btw, does it fix the lockdep message you've seen while mounting
an nfs point?

Thanks,
Frederic.


> ---
> fs/nfs/nfs3proc.c | 7 +++++++
> 1 files changed, 7 insertions(+), 0 deletions(-)
>
> diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
> index d0cc5ce..d91047c 100644
> --- a/fs/nfs/nfs3proc.c
> +++ b/fs/nfs/nfs3proc.c
> @@ -17,6 +17,7 @@
> #include <linux/nfs_page.h>
> #include <linux/lockd/bind.h>
> #include <linux/nfs_mount.h>
> +#include <linux/smp_lock.h>
>
> #include "iostat.h"
> #include "internal.h"
> @@ -28,11 +29,17 @@ static int
> nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message *msg, int flags)
> {
> int res;
> + int bkl = kernel_locked();
> +
> do {
> res = rpc_call_sync(clnt, msg, flags);
> if (res != -EJUKEBOX)
> break;
> + if (bkl)
> + unlock_kernel();
> schedule_timeout_killable(NFS_JUKEBOX_RETRY_TIME);
> + if (bkl)
> + lock_kernel();
> res = -ERESTARTSYS;
> } while (!fatal_signal_pending(current));
> return res;
> --
> 1.6.0.4
>

2009-04-10 20:50:28

by Alessio Igor Bogani

Subject: Re: [PATCH] remove the BKL: remove "BKL auto-drop" assumption from nfs3_rpc_wrapper()

Hi,

2009/4/10 Frederic Weisbecker <[email protected]>:
[...]
>> Fix nfs3_rpc_wrapper()'s "schedule() drops the BKL automatically" assumption,
>> when schedule_timeout_killable() does not do that it can lock up.
[...]
> Btw, does it fix the lockdep message you've seen while mounting
> an nfs point?

Unfortunately no. That lockdep message still happens when I unmount rpc_pipefs.
I'll investigate further.

Ciao,
Alessio

2009-04-12 13:59:38

by Ingo Molnar

Subject: Re: [PATCH] remove the BKL: remove "BKL auto-drop" assumption from nfs3_rpc_wrapper()


* Alessio Igor Bogani <[email protected]> wrote:

> Hi,
>
> 2009/4/10 Frederic Weisbecker <[email protected]>:
> [...]
> >> Fix nfs3_rpc_wrapper()'s "schedule() drops the BKL automatically" assumption,
> >> when schedule_timeout_killable() does not do that it can lock up.
> [...]
> > Btw, does it fix the lockdep message you've seen while mounting
> > an nfs point?
>
> Unfortunately no. That lockdep message still happens when I
> unmount rpc_pipefs. I'll investigate further.

might make sense to post that message here out in the open - maybe
someone with a strong NFSd-fu will comment on it.

Ingo

2009-04-12 20:34:40

by Alessio Igor Bogani

Subject: Re: [PATCH] remove the BKL: remove "BKL auto-drop" assumption from nfs3_rpc_wrapper()

Dear Sir Molnar,

2009/4/12 Ingo Molnar <[email protected]>:
[...]
>> Unfortunately no. That lockdep message still happens when I
>> unmount rpc_pipefs. I'll investigate further.
>
> might make sense to post that message here out in the open - maybe
> someone with a strong NFSd-fu will comment on it.

This message appears when I unmount rpc_pipefs (/var/lib/nfs/rpc_pipefs)
or nfsd (/proc/fs/nfsd):

[ 130.094907] =======================================================
[ 130.096071] [ INFO: possible circular locking dependency detected ]
[ 130.096071] 2.6.30-rc1-nobkl #39
[ 130.096071] -------------------------------------------------------
[ 130.096071] umount/2883 is trying to acquire lock:
[ 130.096071] (kernel_mutex){+.+.+.}, at: [<ffffffff80748074>] lock_kernel+0x34/0x43
[ 130.096071]
[ 130.096071] but task is already holding lock:
[ 130.096071] (&type->s_lock_key#8){+.+...}, at: [<ffffffff803196ce>] lock_super+0x2e/0x30
[ 130.096071]
[ 130.096071] which lock already depends on the new lock.
[ 130.096071]
[ 130.096071]
[ 130.096071] the existing dependency chain (in reverse order) is:
[ 130.096071]
[ 130.096071] -> #2 (&type->s_lock_key#8){+.+...}:
[ 130.096071] [<ffffffff802891cc>] __lock_acquire+0xf9c/0x13e0
[ 130.096071] [<ffffffff8028972f>] lock_acquire+0x11f/0x170
[ 130.096071] [<ffffffff8074534e>] __mutex_lock_common+0x5e/0x510
[ 130.096071] [<ffffffff807458df>] mutex_lock_nested+0x3f/0x50
[ 130.096071] [<ffffffff803196ce>] lock_super+0x2e/0x30
[ 130.096071] [<ffffffff80319b8d>] __fsync_super+0x2d/0x90
[ 130.096071] [<ffffffff80319c06>] fsync_super+0x16/0x30
[ 130.096071] [<ffffffff80319c61>] do_remount_sb+0x41/0x280
[ 130.096071] [<ffffffff8031ad1b>] get_sb_single+0x6b/0xe0
[ 130.096071] [<ffffffffa00c3bdb>] nfsd_get_sb+0x1b/0x20 [nfsd]
[ 130.096071] [<ffffffff8031a521>] vfs_kern_mount+0x81/0x180
[ 130.096071] [<ffffffff8031a693>] do_kern_mount+0x53/0x110
[ 130.096071] [<ffffffff8033504a>] do_mount+0x6ba/0x910
[ 130.096071] [<ffffffff80335360>] sys_mount+0xc0/0xf0
[ 130.096071] [<ffffffff80213232>] system_call_fastpath+0x16/0x1b
[ 130.096071] [<ffffffffffffffff>] 0xffffffffffffffff
[ 130.096071]
[ 130.096071] -> #1 (&type->s_umount_key#34/1){+.+.+.}:
[ 130.096071] [<ffffffff802891cc>] __lock_acquire+0xf9c/0x13e0
[ 130.096071] [<ffffffff8028972f>] lock_acquire+0x11f/0x170
[ 130.096071] [<ffffffff80277702>] down_write_nested+0x52/0x90
[ 130.096071] [<ffffffff8031a99b>] sget+0x24b/0x560
[ 130.096071] [<ffffffff8031acf3>] get_sb_single+0x43/0xe0
[ 130.096071] [<ffffffffa00c3bdb>] nfsd_get_sb+0x1b/0x20 [nfsd]
[ 130.096071] [<ffffffff8031a521>] vfs_kern_mount+0x81/0x180
[ 130.096071] [<ffffffff8031a693>] do_kern_mount+0x53/0x110
[ 130.096071] [<ffffffff8033504a>] do_mount+0x6ba/0x910
[ 130.096071] [<ffffffff80335360>] sys_mount+0xc0/0xf0
[ 130.096071] [<ffffffff80213232>] system_call_fastpath+0x16/0x1b
[ 130.096071] [<ffffffffffffffff>] 0xffffffffffffffff
[ 130.096071]
[ 130.096071] -> #0 (kernel_mutex){+.+.+.}:
[ 130.096071] [<ffffffff802892ad>] __lock_acquire+0x107d/0x13e0
[ 130.096071] [<ffffffff8028972f>] lock_acquire+0x11f/0x170
[ 130.096071] [<ffffffff8074534e>] __mutex_lock_common+0x5e/0x510
[ 130.096071] [<ffffffff807458df>] mutex_lock_nested+0x3f/0x50
[ 130.096071] [<ffffffff80748074>] lock_kernel+0x34/0x43
[ 130.096071] [<ffffffff80319ef4>] generic_shutdown_super+0x54/0x140
[ 130.096071] [<ffffffff8031a046>] kill_anon_super+0x16/0x50
[ 130.096071] [<ffffffff8031a0a7>] kill_litter_super+0x27/0x30
[ 130.096071] [<ffffffff8031a485>] deactivate_super+0x85/0xa0
[ 130.096071] [<ffffffff8033301a>] mntput_no_expire+0x11a/0x160
[ 130.096071] [<ffffffff803333d4>] sys_umount+0x64/0x3c0
[ 130.096071] [<ffffffff80213232>] system_call_fastpath+0x16/0x1b
[ 130.096071] [<ffffffffffffffff>] 0xffffffffffffffff
[ 130.096071]
[ 130.096071] other info that might help us debug this:
[ 130.096071]
[ 130.096071] 2 locks held by umount/2883:
[ 130.096071] #0: (&type->s_umount_key#35){+.+...}, at: [<ffffffff8031a47d>] deactivate_super+0x7d/0xa0
[ 130.096071] #1: (&type->s_lock_key#8){+.+...}, at: [<ffffffff803196ce>] lock_super+0x2e/0x30
[ 130.096071]
[ 130.096071] stack backtrace:
[ 130.096071] Pid: 2883, comm: umount Not tainted 2.6.30-rc1-nobkl #39
[ 130.096071] Call Trace:
[ 130.096071] [<ffffffff80286c96>] print_circular_bug_tail+0xa6/0x100
[ 130.096071] [<ffffffff802892ad>] __lock_acquire+0x107d/0x13e0
[ 130.096071] [<ffffffff8028972f>] lock_acquire+0x11f/0x170
[ 130.096071] [<ffffffff80748074>] ? lock_kernel+0x34/0x43
[ 130.096071] [<ffffffff8074534e>] __mutex_lock_common+0x5e/0x510
[ 130.096071] [<ffffffff80748074>] ? lock_kernel+0x34/0x43
[ 130.096071] [<ffffffff80287685>] ? trace_hardirqs_on_caller+0x165/0x1c0
[ 130.096071] [<ffffffff80748074>] ? lock_kernel+0x34/0x43
[ 130.096071] [<ffffffff807458df>] mutex_lock_nested+0x3f/0x50
[ 130.096071] [<ffffffff80748074>] lock_kernel+0x34/0x43
[ 130.096071] [<ffffffff80319ef4>] generic_shutdown_super+0x54/0x140
[ 130.096071] [<ffffffff8031a046>] kill_anon_super+0x16/0x50
[ 130.096071] [<ffffffff8031a0a7>] kill_litter_super+0x27/0x30
[ 130.096071] [<ffffffff8031a485>] deactivate_super+0x85/0xa0
[ 130.096071] [<ffffffff8033301a>] mntput_no_expire+0x11a/0x160
[ 130.096071] [<ffffffff803333d4>] sys_umount+0x64/0x3c0
[ 130.096071] [<ffffffff80213232>] system_call_fastpath+0x16/0x1b

Please note that removing lock_kernel()/unlock_kernel() from
generic_shutdown_super() makes this warning disappear, but I'm not sure
that it is the _real_ fix.

Ciao,
Alessio

2009-04-13 05:54:23

by Frederic Weisbecker

Subject: Re: [PATCH] remove the BKL: remove "BKL auto-drop" assumption from nfs3_rpc_wrapper()

On Sun, Apr 12, 2009 at 10:34:28PM +0200, Alessio Igor Bogani wrote:
> Dear Sir Molnar,
>
> 2009/4/12 Ingo Molnar <[email protected]>:
> [...]
> >> Unfortunately no. That lockdep message still happens when I
> >> unmount rpc_pipefs. I'll investigate further.
> >
> > might make sense to post that message here out in the open - maybe
> > someone with a strong NFSd-fu will comment on it.
>
> This message appear when I unmount rpc_pipefs(/var/lib/nfs/rpc_pipefs)
> or nfsd (/proc/fs/nfsd):
>
> [ 130.094907] =======================================================
> [ 130.096071] [ INFO: possible circular locking dependency detected ]
> [ 130.096071] 2.6.30-rc1-nobkl #39
> [ 130.096071] -------------------------------------------------------
> [ 130.096071] umount/2883 is trying to acquire lock:
> [ 130.096071] (kernel_mutex){+.+.+.}, at: [<ffffffff80748074>] lock_kernel+0x34/0x43
> [ 130.096071]
> [ 130.096071] but task is already holding lock:
> [ 130.096071] (&type->s_lock_key#8){+.+...}, at: [<ffffffff803196ce>] lock_super+0x2e/0x30
> [ 130.096071]
> [ 130.096071] which lock already depends on the new lock.
> [ 130.096071]


I have a very similar locking dependency problem while unmounting reiserfs.
I haven't dug into it yet because of a nasty hang with reiserfs I have to fix
first.

But I suspect this locking dependency is something to fix in the fs layer itself.

Frederic.


> [ 130.096071]
> [ 130.096071] the existing dependency chain (in reverse order) is:
> [ 130.096071]
> [ 130.096071] -> #2 (&type->s_lock_key#8){+.+...}:
> [ 130.096071] [<ffffffff802891cc>] __lock_acquire+0xf9c/0x13e0
> [ 130.096071] [<ffffffff8028972f>] lock_acquire+0x11f/0x170
> [ 130.096071] [<ffffffff8074534e>] __mutex_lock_common+0x5e/0x510
> [ 130.096071] [<ffffffff807458df>] mutex_lock_nested+0x3f/0x50
> [ 130.096071] [<ffffffff803196ce>] lock_super+0x2e/0x30
> [ 130.096071] [<ffffffff80319b8d>] __fsync_super+0x2d/0x90
> [ 130.096071] [<ffffffff80319c06>] fsync_super+0x16/0x30
> [ 130.096071] [<ffffffff80319c61>] do_remount_sb+0x41/0x280
> [ 130.096071] [<ffffffff8031ad1b>] get_sb_single+0x6b/0xe0
> [ 130.096071] [<ffffffffa00c3bdb>] nfsd_get_sb+0x1b/0x20 [nfsd]
> [ 130.096071] [<ffffffff8031a521>] vfs_kern_mount+0x81/0x180
> [ 130.096071] [<ffffffff8031a693>] do_kern_mount+0x53/0x110
> [ 130.096071] [<ffffffff8033504a>] do_mount+0x6ba/0x910
> [ 130.096071] [<ffffffff80335360>] sys_mount+0xc0/0xf0
> [ 130.096071] [<ffffffff80213232>] system_call_fastpath+0x16/0x1b
> [ 130.096071] [<ffffffffffffffff>] 0xffffffffffffffff
> [ 130.096071]
> [ 130.096071] -> #1 (&type->s_umount_key#34/1){+.+.+.}:
> [ 130.096071] [<ffffffff802891cc>] __lock_acquire+0xf9c/0x13e0
> [ 130.096071] [<ffffffff8028972f>] lock_acquire+0x11f/0x170
> [ 130.096071] [<ffffffff80277702>] down_write_nested+0x52/0x90
> [ 130.096071] [<ffffffff8031a99b>] sget+0x24b/0x560
> [ 130.096071] [<ffffffff8031acf3>] get_sb_single+0x43/0xe0
> [ 130.096071] [<ffffffffa00c3bdb>] nfsd_get_sb+0x1b/0x20 [nfsd]
> [ 130.096071] [<ffffffff8031a521>] vfs_kern_mount+0x81/0x180
> [ 130.096071] [<ffffffff8031a693>] do_kern_mount+0x53/0x110
> [ 130.096071] [<ffffffff8033504a>] do_mount+0x6ba/0x910
> [ 130.096071] [<ffffffff80335360>] sys_mount+0xc0/0xf0
> [ 130.096071] [<ffffffff80213232>] system_call_fastpath+0x16/0x1b
> [ 130.096071] [<ffffffffffffffff>] 0xffffffffffffffff
> [ 130.096071]
> [ 130.096071] -> #0 (kernel_mutex){+.+.+.}:
> [ 130.096071] [<ffffffff802892ad>] __lock_acquire+0x107d/0x13e0
> [ 130.096071] [<ffffffff8028972f>] lock_acquire+0x11f/0x170
> [ 130.096071] [<ffffffff8074534e>] __mutex_lock_common+0x5e/0x510
> [ 130.096071] [<ffffffff807458df>] mutex_lock_nested+0x3f/0x50
> [ 130.096071] [<ffffffff80748074>] lock_kernel+0x34/0x43
> [ 130.096071] [<ffffffff80319ef4>] generic_shutdown_super+0x54/0x140
> [ 130.096071] [<ffffffff8031a046>] kill_anon_super+0x16/0x50
> [ 130.096071] [<ffffffff8031a0a7>] kill_litter_super+0x27/0x30
> [ 130.096071] [<ffffffff8031a485>] deactivate_super+0x85/0xa0
> [ 130.096071] [<ffffffff8033301a>] mntput_no_expire+0x11a/0x160
> [ 130.096071] [<ffffffff803333d4>] sys_umount+0x64/0x3c0
> [ 130.096071] [<ffffffff80213232>] system_call_fastpath+0x16/0x1b
> [ 130.096071] [<ffffffffffffffff>] 0xffffffffffffffff
> [ 130.096071]
> [ 130.096071] other info that might help us debug this:
> [ 130.096071]
> [ 130.096071] 2 locks held by umount/2883:
> [ 130.096071] #0: (&type->s_umount_key#35){+.+...}, at: [<ffffffff8031a47d>] deactivate_super+0x7d/0xa0
> [ 130.096071] #1: (&type->s_lock_key#8){+.+...}, at: [<ffffffff803196ce>] lock_super+0x2e/0x30
> [ 130.096071]
> [ 130.096071] stack backtrace:
> [ 130.096071] Pid: 2883, comm: umount Not tainted 2.6.30-rc1-nobkl #39
> [ 130.096071] Call Trace:
> [ 130.096071] [<ffffffff80286c96>] print_circular_bug_tail+0xa6/0x100
> [ 130.096071] [<ffffffff802892ad>] __lock_acquire+0x107d/0x13e0
> [ 130.096071] [<ffffffff8028972f>] lock_acquire+0x11f/0x170
> [ 130.096071] [<ffffffff80748074>] ? lock_kernel+0x34/0x43
> [ 130.096071] [<ffffffff8074534e>] __mutex_lock_common+0x5e/0x510
> [ 130.096071] [<ffffffff80748074>] ? lock_kernel+0x34/0x43
> [ 130.096071] [<ffffffff80287685>] ? trace_hardirqs_on_caller+0x165/0x1c0
> [ 130.096071] [<ffffffff80748074>] ? lock_kernel+0x34/0x43
> [ 130.096071] [<ffffffff807458df>] mutex_lock_nested+0x3f/0x50
> [ 130.096071] [<ffffffff80748074>] lock_kernel+0x34/0x43
> [ 130.096071] [<ffffffff80319ef4>] generic_shutdown_super+0x54/0x140
> [ 130.096071] [<ffffffff8031a046>] kill_anon_super+0x16/0x50
> [ 130.096071] [<ffffffff8031a0a7>] kill_litter_super+0x27/0x30
> [ 130.096071] [<ffffffff8031a485>] deactivate_super+0x85/0xa0
> [ 130.096071] [<ffffffff8033301a>] mntput_no_expire+0x11a/0x160
> [ 130.096071] [<ffffffff803333d4>] sys_umount+0x64/0x3c0
> [ 130.096071] [<ffffffff80213232>] system_call_fastpath+0x16/0x1b
>
> Please notice that removing lock_kernel()/unlock_kernel() from
> generic_shutdown_super() make this warning disappear but I'm not sure
> that is it the _real_ fix.
>
> Ciao,
> Alessio