2014-10-27 14:05:34

by David Howells

Subject: Locking problem in overlayfs

Using my testsuite, I see the attached moan from lockdep. Unfortunately, it
doesn't cause the testsuite to actually fail, so I'm going to have to manually
try and isolate the failing test.

David

=============================================
[ INFO: possible recursive locking detected ]
3.18.0-rc2-fsdevel+ #910 Tainted: G W
---------------------------------------------
run/2642 is trying to acquire lock:
(&sb->s_type->i_mutex_key#10/1){+.+.+.}, at: [<ffffffff81203d81>] ovl_cleanup_whiteouts+0x29/0xb4

but task is already holding lock:
(&sb->s_type->i_mutex_key#10/1){+.+.+.}, at: [<ffffffff8113abff>] lock_rename+0xb7/0xd7

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&sb->s_type->i_mutex_key#10/1);
lock(&sb->s_type->i_mutex_key#10/1);

*** DEADLOCK ***

May be due to missing lock nesting notation

7 locks held by run/2642:
#0: (sb_writers#15){.+.+.+}, at: [<ffffffff8114a1e2>] mnt_want_write+0x1f/0x46
#1: (&sb->s_type->i_mutex_key#17/1){+.+.+.}, at: [<ffffffff8113c737>] do_rmdir+0xa9/0x165
#2: (&sb->s_type->i_mutex_key#17){+.+.+.}, at: [<ffffffff8113c0ad>] vfs_rmdir+0x5a/0x115
#3: (sb_writers#8){.+.+.+}, at: [<ffffffff8114a1e2>] mnt_want_write+0x1f/0x46
#4: (&type->s_vfs_rename_key){+.+.+.}, at: [<ffffffff8113ab88>] lock_rename+0x40/0xd7
#5: (&sb->s_type->i_mutex_key#10/1){+.+.+.}, at: [<ffffffff8113abff>] lock_rename+0xb7/0xd7
#6: (&sb->s_type->i_mutex_key#10/2){+.+.+.}, at: [<ffffffff8113ac15>] lock_rename+0xcd/0xd7

stack backtrace:
CPU: 0 PID: 2642 Comm: run Tainted: G W 3.18.0-rc2-fsdevel+ #910
Hardware name: /DG965RY, BIOS MQ96510J.86A.0816.2006.0716.2308 07/16/2006
ffffffff823989e0 ffff880038d8fa68 ffffffff815222f2 0000000000000006
ffffffff823989e0 ffff880038d8fb38 ffffffff810738f4 000000000000000b
ffff880037ebc710 ffff880038d8fb00 ffff880037ebcf58 0000000000000005
Call Trace:
[<ffffffff815222f2>] dump_stack+0x4e/0x68
[<ffffffff810738f4>] __lock_acquire+0x7b5/0x1a17
[<ffffffff8107522d>] lock_acquire+0xa3/0x11d
[<ffffffff81203d81>] ? ovl_cleanup_whiteouts+0x29/0xb4
[<ffffffff8111ff14>] ? kfree+0x17e/0x1ca
[<ffffffff8152580a>] mutex_lock_nested+0x5a/0x304
[<ffffffff81203d81>] ? ovl_cleanup_whiteouts+0x29/0xb4
[<ffffffff81139123>] ? vfs_rename+0x602/0x689
[<ffffffff81203d81>] ovl_cleanup_whiteouts+0x29/0xb4
[<ffffffff81202252>] ovl_clear_empty+0x195/0x216
[<ffffffff81202315>] ovl_check_empty_and_clear+0x42/0x5d
[<ffffffff81056944>] ? creds_are_invalid+0x17/0x4a
[<ffffffff81202a19>] ovl_do_remove+0x189/0x36a
[<ffffffff81202c0b>] ovl_rmdir+0x11/0x13
[<ffffffff8113c0f2>] vfs_rmdir+0x9f/0x115
[<ffffffff8113c784>] do_rmdir+0xf6/0x165
[<ffffffff8100d224>] ? do_audit_syscall_entry+0x4a/0x4c
[<ffffffff8100e43f>] ? syscall_trace_enter_phase2+0x178/0x1c1
[<ffffffff810e5aae>] ? context_tracking_user_exit+0x54/0xce
[<ffffffff8113d007>] SyS_rmdir+0x11/0x13
[<ffffffff81528a09>] tracesys_phase2+0xd4/0xd9


2014-10-27 14:19:43

by Miklos Szeredi

Subject: Re: Locking problem in overlayfs

On Mon, Oct 27, 2014 at 02:05:18PM +0000, David Howells wrote:
> Using my testsuite, I see the attached moan from lockdep. Unfortunately, it
> doesn't cause the testsuite to actually fail, so I'm going to have to manually
> try and isolate the failing test.
>
> David
>
> =============================================
> [ INFO: possible recursive locking detected ]
> 3.18.0-rc2-fsdevel+ #910 Tainted: G W
> ---------------------------------------------
> run/2642 is trying to acquire lock:
> (&sb->s_type->i_mutex_key#10/1){+.+.+.}, at: [<ffffffff81203d81>] ovl_cleanup_whiteouts+0x29/0xb4
>
> but task is already holding lock:
> (&sb->s_type->i_mutex_key#10/1){+.+.+.}, at: [<ffffffff8113abff>] lock_rename+0xb7/0xd7

Uh-oh. We changed nesting late in the cycle and I didn't retest with lockdep.

And it's actually harmless, but AFAICS needs another level of nesting between
I_MUTEX_CHILD and I_MUTEX_NORMAL.

Will do a patch.

Thanks,
Miklos

2014-10-27 14:39:37

by David Howells

Subject: Re: Locking problem in overlayfs

Miklos Szeredi wrote:

> Uh-oh. We changed nesting late in the cycle and I didn't retest with lockdep.
>
> And it's actually harmless, but AFAICS needs another level of nesting between
> I_MUTEX_CHILD and I_MUTEX_NORMAL.

In an overlay directory that shadows an empty lower directory, say
/mnt/a/empty102, do:

touch /mnt/a/empty102/x
unlink /mnt/a/empty102/x
rmdir /mnt/a/empty102

David

2014-10-27 14:42:10

by Miklos Szeredi

Subject: Re: Locking problem in overlayfs

On Mon, Oct 27, 2014 at 02:39:21PM +0000, David Howells wrote:
> Miklos Szeredi wrote:
>
> > Uh-oh. We changed nesting late in the cycle and I didn't retest with lockdep.
> >
> > And it's actually harmless, but AFAICS needs another level of nesting between
> > I_MUTEX_CHILD and I_MUTEX_NORMAL.
>
> In an overlay directory that shadows an empty lower directory, say
> /mnt/a/empty102, do:
>
> touch /mnt/a/empty102/x
> unlink /mnt/a/empty102/x
> rmdir /mnt/a/empty102


Yes, the following (untested) patch should fix it:

Thanks,
Miklos

diff --git a/fs/namei.c b/fs/namei.c
index 42df664e95e5..922f27068c4c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2497,7 +2497,7 @@ struct dentry *lock_rename(struct dentry *p1, struct dentry *p2)
 	}

 	mutex_lock_nested(&p1->d_inode->i_mutex, I_MUTEX_PARENT);
-	mutex_lock_nested(&p2->d_inode->i_mutex, I_MUTEX_CHILD);
+	mutex_lock_nested(&p2->d_inode->i_mutex, I_MUTEX_PARENT2);
 	return NULL;
 }
 EXPORT_SYMBOL(lock_rename);
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 910553f37aca..de77b5c62d72 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -569,7 +569,7 @@ void ovl_cleanup_whiteouts(struct dentry *upper, struct list_head *list)
 {
 	struct ovl_cache_entry *p;

-	mutex_lock_nested(&upper->d_inode->i_mutex, I_MUTEX_PARENT);
+	mutex_lock_nested(&upper->d_inode->i_mutex, I_MUTEX_CHILD);
 	list_for_each_entry(p, list, l_node) {
 		struct dentry *dentry;

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4e41a4a331bb..01036262095f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -639,11 +639,13 @@ static inline int inode_unhashed(struct inode *inode)
  * 2: child/target
  * 3: xattr
  * 4: second non-directory
- * The last is for certain operations (such as rename) which lock two
+ * 5: second parent (when locking independent directories in rename)
+ *
+ * I_MUTEX_NONDIR2 is for certain operations (such as rename) which lock two
  * non-directories at once.
  *
  * The locking order between these classes is
- * parent -> child -> normal -> xattr -> second non-directory
+ * parent[2] -> child -> grandchild -> normal -> xattr -> second non-directory
  */
 enum inode_i_mutex_lock_class
 {
@@ -651,7 +653,8 @@ enum inode_i_mutex_lock_class
 	I_MUTEX_PARENT,
 	I_MUTEX_CHILD,
 	I_MUTEX_XATTR,
-	I_MUTEX_NONDIR2
+	I_MUTEX_NONDIR2,
+	I_MUTEX_PARENT2,
 };

void lock_two_nondirectories(struct inode *, struct inode*);
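
Why relabelling the subclasses silences the report can be sketched in userspace. The model below is a deliberate simplification and an assumption about the relevant part of lockdep only: it treats each (key, subclass) pair as one lock class and reports recursion when the class being acquired is already held. It ignores lockdep's dependency-chain checks, read locks and nest_lock handling. The struct and function names are hypothetical; the enum values follow the patched include/linux/fs.h, and key 10 stands for i_mutex_key#10 from the trace.

```c
#include <stdbool.h>
#include <stddef.h>

/* Subclass values as in the patched include/linux/fs.h. */
enum inode_i_mutex_lock_class {
	I_MUTEX_NORMAL,
	I_MUTEX_PARENT,
	I_MUTEX_CHILD,
	I_MUTEX_XATTR,
	I_MUTEX_NONDIR2,
	I_MUTEX_PARENT2,
};

/* Hypothetical model of a lock class: lock key plus nesting subclass. */
struct lock_class_model {
	int key;      /* e.g. 10 for i_mutex_key#10 */
	int subclass; /* one of the I_MUTEX_* values */
};

/* Simplified check: is the class we want already held by this task? */
static bool reports_recursion(const struct lock_class_model *held, size_t n,
			      struct lock_class_model next)
{
	for (size_t i = 0; i < n; i++)
		if (held[i].key == next.key && held[i].subclass == next.subclass)
			return true;
	return false;
}

/* Before the patch: lock_rename() holds PARENT and CHILD, and
 * ovl_cleanup_whiteouts() then asks for PARENT on the same key. */
bool report_before_patch(void)
{
	const struct lock_class_model held[] = {
		{ 10, I_MUTEX_PARENT },
		{ 10, I_MUTEX_CHILD },
	};
	const struct lock_class_model next = { 10, I_MUTEX_PARENT };

	return reports_recursion(held, 2, next);
}

/* After the patch: lock_rename() holds PARENT and PARENT2, and
 * ovl_cleanup_whiteouts() asks for CHILD, which nobody holds. */
bool report_after_patch(void)
{
	const struct lock_class_model held[] = {
		{ 10, I_MUTEX_PARENT },
		{ 10, I_MUTEX_PARENT2 },
	};
	const struct lock_class_model next = { 10, I_MUTEX_CHILD };

	return reports_recursion(held, 2, next);
}
```

In this model report_before_patch() returns true, matching held locks #5 and #6 in the trace against the /1 acquisition in ovl_cleanup_whiteouts(), and report_after_patch() returns false because I_MUTEX_CHILD is no longer taken by lock_rename() in the independent-directories case.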

2014-10-27 15:45:09

by David Howells

Subject: Re: Locking problem in overlayfs

Miklos Szeredi wrote:

> Yes, the following (untested) patch should fix it:

Tested-by: David Howells <[email protected]>