2011-09-12 22:17:02

by Simon Kirby

[permalink] [raw]
Subject: Re: [3.1-rc4] vfs_rmdir() -> mutex_unlock() Oops

On Thu, Sep 08, 2011 at 03:24:20PM -0700, Simon Kirby wrote:

> This box primarily does most of its VFS stuff over lots of NFS mounts,
> but has some local EXT3 filesystems. This has happened a couple of times:
>
> BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
> IP: [<ffffffff816a6641>] mutex_unlock+0x1/0x20
> PGD 12e07e067 PUD 102dfa067 PMD 0
> Oops: 0002 [#1] SMP
> CPU 3
> Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler xt_recent nf_conntrack_ftp xt_state xt_owner nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 bnx2
> [445464.657194]
> Pid: 13796, comm: php Tainted: G W 3.1.0-rc4-hw+ #27 Dell Inc. PowerEdge 1950/0UR033
> RIP: 0010:[<ffffffff816a6641>] [<ffffffff816a6641>] mutex_unlock+0x1/0x20
> RSP: 0018:ffff8801381c7e58 EFLAGS: 00010206
> RAX: 00000000fffffffe RBX: ffff88002e95aca8 RCX: ffff88010712dc70
> RDX: 0000000040000200 RSI: ffff880126d21cc0 RDI: 00000000000000a0
> RBP: ffff8801381c7e88 R08: 0000000026361000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000002 R12: 00000000fffffffe
> R13: ffff880126d21cc0 R14: 00007f0a6e0ba9d8 R15: 00007f0a6e0bd258
> FS: 00007f0a6e1c1720(0000) GS:ffff88022fcc0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000000000b8 CR3: 000000012692a000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process php (pid: 13796, threadinfo ffff8801381c6000, task ffff88003ed4d900)
> Stack:
> ffff8801381c7e88 ffffffff8110e680 ffff8801381c7e98 ffff880126d21cc0
> ffff8801381c7e98 0000000000000000 ffff8801381c7f68 ffffffff8110e7a6
> ffff880225eed800 ffff8800bea6d780 00000004018245d5 ffff88003ed0b03b
> Call Trace:
> [<ffffffff8110e680>] ? vfs_rmdir+0xb0/0x100
> [<ffffffff8110e7a6>] do_rmdir+0xd6/0x130
> [<ffffffff811049c1>] ? sys_newlstat+0x31/0x50
> [<ffffffff8110e851>] sys_rmdir+0x11/0x20
> [<ffffffff816af57b>] system_call_fastpath+0x16/0x1b
> Code: 0f 79 05 e8 d2 00 00 00 65 48 8b 04 25 c0 b5 00 00 48 8b 1c 24 49 89 44 24 18 4c 8b 64 24 08 c9 c3 66 0f 1f 84 00 00 00 00 00 55
> c7 47 18 00 00 00 00 48 89 e5 f0 ff 07 7f 05 e8 4a 00 00 00
> RIP [<ffffffff816a6641>] mutex_unlock+0x1/0x20
> RSP <ffff8801381c7e58>
> CR2: 00000000000000b8
> ---[ end trace 3ad1ff5004f78cbb ]---
>
> I don't see any relevant changes in Linus HEAD yet. This build was of
> 9e79e3e9dd9672b37ac9412e9a926714306551fe (slightly past 3.1-rc4).

Got a few more identical Oopses on another box running slightly past
3.1-rc5 (79016f648872549392d232cd648bd02298c2d2bb). It seems to be
do_rmdir()'s mutex_unlock() call.

I'm building -rc6 with CONFIG_DEBUG_MUTEXES now.

Simon-


2011-09-14 15:39:30

by Fabio Coatti

[permalink] [raw]
Subject: Re: [3.1-rc4] vfs_rmdir() -> mutex_unlock() Oops

2011/9/14 Lin Ming <[email protected]>:
> On Wed, Sep 14, 2011 at 12:43 AM, Simon Kirby <[email protected]> wrote:
>> On Mon, Sep 12, 2011 at 03:17:00PM -0700, Simon Kirby wrote:
>>
>>> On Thu, Sep 08, 2011 at 03:24:20PM -0700, Simon Kirby wrote:
>>>
>>> > This box primarily does most of its VFS stuff over lots of NFS mounts,
>>> > but has some local EXT3 filesystems. This has happened a couple of times:
>>> >
>>> > BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
>>>...
>>> Got a few more identical Oopses on another box running slightly past
>>> 3.1-rc5 (79016f648872549392d232cd648bd02298c2d2bb). It seems to be
>>> do_rmdir()'s mutex_unlock() call.
>>>
>>> I'm building -rc6 with CONFIG_DEBUG_MUTEXES now.
>>
>> ...and not much more help with CONFIG_DEBUG_MUTEXES, from 3.1-rc6:
>>
>> BUG: unable to handle kernel NULL pointer dereference at 00000000000000a4
>> IP: [<ffffffff816adc93>] __mutex_unlock_slowpath+0x53/0x140
>> PGD 13c6c4067 PUD 2256fc067 PMD 0
>> Oops: 0002 [#1] SMP
>> CPU 2
>> Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler xt_recent nf_conntrack_ftp xt_state xt_owner nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 bnx2
>>
>> Pid: 27658, comm: php Not tainted 3.1.0-rc6-hw-mudbg+ #32 Dell Inc. PowerEdge 1950/0TT740
>> RIP: 0010:[<ffffffff816adc93>] ?[<ffffffff816adc93>] __mutex_unlock_slowpath+0x53/0x140
>> RSP: 0018:ffff8800b65e1e28 ?EFLAGS: 00010046
>> RAX: 0000000000000100 RBX: ffff88001bcece48 RCX: ffff88001bd05348
>> RDX: 0000000040000200 RSI: ffff8800916cf6c0 RDI: 00000000000000a0
>> RBP: ffff8800b65e1e48 R08: 00000000043205bc R09: ffffea00011340c0
>> R10: 0000000000000000 R11: 0000000000000002 R12: 00000000000000a0
>> R13: 00000000000000a4 R14: 0000000000000246 R15: 00007f7fe2d15680
>> FS: ?00007f7fe2e1f720(0000) GS:ffff88022fc80000(0000) knlGS:0000000000000000
>> CS: ?0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00000000000000a4 CR3: 0000000215c16000 CR4: 00000000000006e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process php (pid: 27658, threadinfo ffff8800b65e0000, task ffff8800175d4320)
>> Stack:
>> ?ffff88001bcece48 00000000fffffffe ffff8800916cf6c0 00007f7fe2d146a8
>> ?ffff8800b65e1e58 ffffffff816add89 ffff8800b65e1e88 ffffffff8110ec70
>> ?ffff8800b65e1e98 ffff8800916cf6c0 ffff8800b65e1e98 0000000000000000
>> Call Trace:
>> ?[<ffffffff816add89>] mutex_unlock+0x9/0x10
>> ?[<ffffffff8110ec70>] vfs_rmdir+0xb0/0x100
>> ?[<ffffffff8110ed96>] do_rmdir+0xd6/0x130
>> ?[<ffffffff81102063>] ? fput+0x1c3/0x260
>> ?[<ffffffff810fe648>] ? filp_close+0x68/0xa0
>> ?[<ffffffff8110ee41>] sys_rmdir+0x11/0x20
>> ?[<ffffffff816b6f3b>] system_call_fastpath+0x16/0x1b
>> Code: 75 1b 65 48 8b 04 25 c8 b5 00 00 48 63 80 44 e0 ff ff a9 00 ff ff 07 0f 85 bb 00 00 00 9c 41 5e fa b8 00 01 00 00 4d 8d 6c 24 04 <f0> 66 41 0f c1 45 00 38 e0 74 08 f3 90 41 8a 45 00 eb f4 44 8b
>> RIP ?[<ffffffff816adc93>] __mutex_unlock_slowpath+0x53/0x140
>> ?RSP <ffff8800b65e1e28>
>> CR2: 00000000000000a4
>> ---[ end trace f515ec8376bdb799 ]---
>>
>> How can I further debug this? At this point, it seems to be happening several times daily.
>
> Fabio reported a similar bug,
> 3.0.3 [BUG] unable to handle kernel NULL pointer dereference
> http://marc.info/?t=131416920900001&r=1&w=2
>
> Do you have a test case to trigger this bug reliably?

Unfortunatey not, so far. This problem happens on a quite loaded
server with apache on it and it is difficult to find root cause or at
least a situation able to trigger it. Still digging, but I wouldn't
expect to find something useful in short time.



--
Fabio

2011-09-14 20:34:46

by Simon Kirby

[permalink] [raw]
Subject: Re: [3.1-rc4] vfs_rmdir() -> mutex_unlock() Oops

On Wed, Sep 14, 2011 at 09:48:09PM +0800, Lin Ming wrote:

> > ---[ end trace f515ec8376bdb799 ]---
> >
> > How can I further debug this? At this point, it seems to be happening several times daily.
>
> Fabio reported a similar bug,
> 3.0.3 [BUG] unable to handle kernel NULL pointer dereference
> http://marc.info/?t=131416920900001&r=1&w=2
>
> Do you have a test case to trigger this bug reliably?

Hello!

Sorry, I didn't realize I took this off list after replying to Linus'
resent mail, but the problem has been tracked down and fixed, and should
show up in Linus' next push.

The problem was introduced by 64252c75a2196a0cf1e0d3777143ecfe0e3ae650,
and Al Viro's patch is reproduced below.

I was able to reproduce the issue reliably with:

ssh <nfs_client> 'cd /mnt/test; mkdir foo; sleep 1; mkdir foo' ; \
ssh <nfs_server> rmdir /export/test/foo ; \
ssh <nfs_client> rmdir /mnt/test/foo

Simon-

>From Al:

[PATCH] restore pinning the victim dentry in vfs_rmdir()/vfs_rename_dir()

We used to get the victim pinned by dentry_unhash() prior to commit
0cf1e0d3777143ecfe0e3ae650 and ->rmdir()/->rename() instances relied on
that; most of them don't care, but ones that used d_delete() themselves
do. As the result, we are getting rmdir() oopses on NFS now.

Just grab the reference before locking the victim and drop it explicitly
after unlocking, same as vfs_rename_other() does.

Signed-off-by: Al Viro <[email protected]>
---
diff --git a/fs/namei.c b/fs/namei.c
index 5422080..22c2d5d 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2619,6 +2619,7 @@ int vfs_rmdir(struct inode *dir, struct dentry *dentry)
if (!dir->i_op->rmdir)
return -EPERM;

+ dget(dentry);
mutex_lock(&dentry->d_inode->i_mutex);

error = -EBUSY;
@@ -2639,6 +2640,7 @@ int vfs_rmdir(struct inode *dir, struct dentry *dentry)

out:
mutex_unlock(&dentry->d_inode->i_mutex);
+ dput(dentry);
if (!error)
d_delete(dentry);
return error;
@@ -3028,6 +3030,7 @@ static int vfs_rename_dir(struct inode *old_dir, struct dentry *old_dentry,
if (error)
return error;

+ dget(new_dentry);
if (target)
mutex_lock(&target->i_mutex);

@@ -3048,6 +3051,7 @@ static int vfs_rename_dir(struct inode *old_dir, struct dentry *old_dentry,
out:
if (target)
mutex_unlock(&target->i_mutex);
+ dput(new_dentry);
if (!error)
if (!(old_dir->i_sb->s_type->fs_flags & FS_RENAME_DOES_D_MOVE))
d_move(old_dentry,new_dentry);

2011-09-14 13:53:55

by Lin Ming

[permalink] [raw]
Subject: Re: [3.1-rc4] vfs_rmdir() -> mutex_unlock() Oops

On Wed, Sep 14, 2011 at 12:43 AM, Simon Kirby <[email protected]> wrote:
> On Mon, Sep 12, 2011 at 03:17:00PM -0700, Simon Kirby wrote:
>
>> On Thu, Sep 08, 2011 at 03:24:20PM -0700, Simon Kirby wrote:
>>
>> > This box primarily does most of its VFS stuff over lots of NFS mounts,
>> > but has some local EXT3 filesystems. This has happened a couple of times:
>> >
>> > BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
>>...
>> Got a few more identical Oopses on another box running slightly past
>> 3.1-rc5 (79016f648872549392d232cd648bd02298c2d2bb). It seems to be
>> do_rmdir()'s mutex_unlock() call.
>>
>> I'm building -rc6 with CONFIG_DEBUG_MUTEXES now.
>
> ...and not much more help with CONFIG_DEBUG_MUTEXES, from 3.1-rc6:
>
> BUG: unable to handle kernel NULL pointer dereference at 00000000000000a4
> IP: [<ffffffff816adc93>] __mutex_unlock_slowpath+0x53/0x140
> PGD 13c6c4067 PUD 2256fc067 PMD 0
> Oops: 0002 [#1] SMP
> CPU 2
> Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler xt_recent nf_conntrack_ftp xt_state xt_owner nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 bnx2
>
> Pid: 27658, comm: php Not tainted 3.1.0-rc6-hw-mudbg+ #32 Dell Inc. PowerEdge 1950/0TT740
> RIP: 0010:[<ffffffff816adc93>] ?[<ffffffff816adc93>] __mutex_unlock_slowpath+0x53/0x140
> RSP: 0018:ffff8800b65e1e28 ?EFLAGS: 00010046
> RAX: 0000000000000100 RBX: ffff88001bcece48 RCX: ffff88001bd05348
> RDX: 0000000040000200 RSI: ffff8800916cf6c0 RDI: 00000000000000a0
> RBP: ffff8800b65e1e48 R08: 00000000043205bc R09: ffffea00011340c0
> R10: 0000000000000000 R11: 0000000000000002 R12: 00000000000000a0
> R13: 00000000000000a4 R14: 0000000000000246 R15: 00007f7fe2d15680
> FS: ?00007f7fe2e1f720(0000) GS:ffff88022fc80000(0000) knlGS:0000000000000000
> CS: ?0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000000000a4 CR3: 0000000215c16000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process php (pid: 27658, threadinfo ffff8800b65e0000, task ffff8800175d4320)
> Stack:
> ?ffff88001bcece48 00000000fffffffe ffff8800916cf6c0 00007f7fe2d146a8
> ?ffff8800b65e1e58 ffffffff816add89 ffff8800b65e1e88 ffffffff8110ec70
> ?ffff8800b65e1e98 ffff8800916cf6c0 ffff8800b65e1e98 0000000000000000
> Call Trace:
> ?[<ffffffff816add89>] mutex_unlock+0x9/0x10
> ?[<ffffffff8110ec70>] vfs_rmdir+0xb0/0x100
> ?[<ffffffff8110ed96>] do_rmdir+0xd6/0x130
> ?[<ffffffff81102063>] ? fput+0x1c3/0x260
> ?[<ffffffff810fe648>] ? filp_close+0x68/0xa0
> ?[<ffffffff8110ee41>] sys_rmdir+0x11/0x20
> ?[<ffffffff816b6f3b>] system_call_fastpath+0x16/0x1b
> Code: 75 1b 65 48 8b 04 25 c8 b5 00 00 48 63 80 44 e0 ff ff a9 00 ff ff 07 0f 85 bb 00 00 00 9c 41 5e fa b8 00 01 00 00 4d 8d 6c 24 04 <f0> 66 41 0f c1 45 00 38 e0 74 08 f3 90 41 8a 45 00 eb f4 44 8b
> RIP ?[<ffffffff816adc93>] __mutex_unlock_slowpath+0x53/0x140
> ?RSP <ffff8800b65e1e28>
> CR2: 00000000000000a4
> ---[ end trace f515ec8376bdb799 ]---
>
> How can I further debug this? At this point, it seems to be happening several times daily.

Fabio reported a similar bug,
3.0.3 [BUG] unable to handle kernel NULL pointer dereference
http://marc.info/?t=131416920900001&r=1&w=2

Do you have a test case to trigger this bug reliably?

Lin Ming

2011-09-13 16:43:43

by Simon Kirby

[permalink] [raw]
Subject: Re: [3.1-rc4] vfs_rmdir() -> mutex_unlock() Oops

On Mon, Sep 12, 2011 at 03:17:00PM -0700, Simon Kirby wrote:

> On Thu, Sep 08, 2011 at 03:24:20PM -0700, Simon Kirby wrote:
>
> > This box primarily does most of its VFS stuff over lots of NFS mounts,
> > but has some local EXT3 filesystems. This has happened a couple of times:
> >
> > BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
>...
> Got a few more identical Oopses on another box running slightly past
> 3.1-rc5 (79016f648872549392d232cd648bd02298c2d2bb). It seems to be
> do_rmdir()'s mutex_unlock() call.
>
> I'm building -rc6 with CONFIG_DEBUG_MUTEXES now.

...and not much more help with CONFIG_DEBUG_MUTEXES, from 3.1-rc6:

BUG: unable to handle kernel NULL pointer dereference at 00000000000000a4
IP: [<ffffffff816adc93>] __mutex_unlock_slowpath+0x53/0x140
PGD 13c6c4067 PUD 2256fc067 PMD 0
Oops: 0002 [#1] SMP
CPU 2
Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler xt_recent nf_conntrack_ftp xt_state xt_owner nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 bnx2

Pid: 27658, comm: php Not tainted 3.1.0-rc6-hw-mudbg+ #32 Dell Inc. PowerEdge 1950/0TT740
RIP: 0010:[<ffffffff816adc93>] [<ffffffff816adc93>] __mutex_unlock_slowpath+0x53/0x140
RSP: 0018:ffff8800b65e1e28 EFLAGS: 00010046
RAX: 0000000000000100 RBX: ffff88001bcece48 RCX: ffff88001bd05348
RDX: 0000000040000200 RSI: ffff8800916cf6c0 RDI: 00000000000000a0
RBP: ffff8800b65e1e48 R08: 00000000043205bc R09: ffffea00011340c0
R10: 0000000000000000 R11: 0000000000000002 R12: 00000000000000a0
R13: 00000000000000a4 R14: 0000000000000246 R15: 00007f7fe2d15680
FS: 00007f7fe2e1f720(0000) GS:ffff88022fc80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a4 CR3: 0000000215c16000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process php (pid: 27658, threadinfo ffff8800b65e0000, task ffff8800175d4320)
Stack:
ffff88001bcece48 00000000fffffffe ffff8800916cf6c0 00007f7fe2d146a8
ffff8800b65e1e58 ffffffff816add89 ffff8800b65e1e88 ffffffff8110ec70
ffff8800b65e1e98 ffff8800916cf6c0 ffff8800b65e1e98 0000000000000000
Call Trace:
[<ffffffff816add89>] mutex_unlock+0x9/0x10
[<ffffffff8110ec70>] vfs_rmdir+0xb0/0x100
[<ffffffff8110ed96>] do_rmdir+0xd6/0x130
[<ffffffff81102063>] ? fput+0x1c3/0x260
[<ffffffff810fe648>] ? filp_close+0x68/0xa0
[<ffffffff8110ee41>] sys_rmdir+0x11/0x20
[<ffffffff816b6f3b>] system_call_fastpath+0x16/0x1b
Code: 75 1b 65 48 8b 04 25 c8 b5 00 00 48 63 80 44 e0 ff ff a9 00 ff ff 07 0f 85 bb 00 00 00 9c 41 5e fa b8 00 01 00 00 4d 8d 6c 24 04 <f0> 66 41 0f c1 45 00 38 e0 74 08 f3 90 41 8a 45 00 eb f4 44 8b
RIP [<ffffffff816adc93>] __mutex_unlock_slowpath+0x53/0x140
RSP <ffff8800b65e1e28>
CR2: 00000000000000a4
---[ end trace f515ec8376bdb799 ]---

How can I further debug this? At this point, it seems to be happening several times daily.

Simon-