2012-05-22 01:12:00

by Dave Jones

[permalink] [raw]
Subject: 3.4+ dcache BUG.

Just hit this. Probably related to todays dcache changes ?

kernel BUG at fs/dcache.c:229!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU 2
Modules linked in: ipt_ULOG nfnetlink x25 phonet llc2 can rose af_key caif_socket caif netrom rds binfmt_misc bluetooth rfkill pppoe pppox ppp_generic slhc decnet ipx p8023 p8022 irda crc_ccitt atm appletalk psnap llc ax25 ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables crc32c_intel ghash_clmulni_intel microcode pcspkr serio_raw i2c_i801 usb_debug e1000e iTCO_wdt iTCO_vendor_support nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan]

Pid: 15394, comm: cc1 Not tainted 3.4.0+ #17
RIP: 0010:[<ffffffff811bb7ac>] [<ffffffff811bb7ac>] d_free+0x6c/0x70
RSP: 0018:ffff88013230fe48 EFLAGS: 00010202
RAX: ffff880077134d40 RBX: ffff880111e45e10 RCX: 0000000000000000
RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffff880111e45e10
RBP: ffff88013230fe58 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff880111c89c10
R13: ffff8801474139d0 R14: 0000000000000001 R15: ffff8801462bd1a0
FS: 00002b935ace1ec0(0000) GS:ffff880148000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00002b9361c49000 CR3: 00000001322d5000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process cc1 (pid: 15394, threadinfo ffff88013230e000, task ffff880077134d40)
Stack:
ffff8801474139d0 ffff880111e45e10 ffff88013230fe88 ffffffff811bc987
ffff880111e45e10 ffff8801474139d0 ffff880111c89c10 ffff880111e45e70
ffff88013230feb8 ffffffff811bcad2 ffff88007c47aec0 0000000000000010
Call Trace:
[<ffffffff811bc987>] d_kill+0xb7/0x120
[<ffffffff811bcad2>] dput+0xe2/0x1d0
[<ffffffff811a6a0f>] fput+0x1cf/0x300
[<ffffffff811a16b9>] filp_close+0x69/0xa0
[<ffffffff811a223d>] sys_close+0xad/0x1a0
[<ffffffff8165a612>] system_call_fastpath+0x16/0x1b
Code: 00 00 74 1b 48 c7 c6 d0 b6 1b 81 e8 6f fd f3 ff 48 83 c4 08 5b 5d c3 0f 1f 84 00 00 00 00 00 e8 2b ff ff ff 48 83 c4 08 5b 5d c3 <0f> 0b 66 90 55 48 89 e5 41 55 41 54 53 48 83 ec 28 66 66 66 66


I'm not sure why, but the dcache.c line numbers don't match up..
This kernel was v3.3-rc7-14528-g29db10d which looked like..

229 */
230 static void d_free(struct dentry *dentry)
231 {
232 BUG_ON(dentry->d_count);
233 this_cpu_dec(nr_dentry);


Hopefully that's close enough that that BUG_ON is the one that got hit.

Dave


2012-05-22 01:39:57

by Dave Jones

[permalink] [raw]
Subject: Re: 3.4+ dcache BUG.

On Mon, May 21, 2012 at 09:11:52PM -0400, Dave Jones wrote:
> Just hit this. Probably related to todays dcache changes ?
>
> kernel BUG at fs/dcache.c:229!
> invalid opcode: 0000 [#1] PREEMPT SMP
> CPU 2
> Modules linked in: ipt_ULOG nfnetlink x25 phonet llc2 can rose af_key caif_socket caif netrom rds binfmt_misc bluetooth rfkill pppoe pppox ppp_generic slhc decnet ipx p8023 p8022 irda crc_ccitt atm appletalk psnap llc ax25 ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables crc32c_intel ghash_clmulni_intel microcode pcspkr serio_raw i2c_i801 usb_debug e1000e iTCO_wdt iTCO_vendor_support nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan]
>
> Pid: 15394, comm: cc1 Not tainted 3.4.0+ #17

Rebuilt a fresh kernel. This time I got this instead..


kernel BUG at fs/namei.c:516!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU 5
Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables crc32c_intel serio_raw ghash_clmulni_intel microcode pcspkr usb_debug i2c_i801 iTCO_wdt iTCO_vendor_support e1000e nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan]

Pid: 2822, comm: sh Not tainted 3.4.0+ #19
RIP: 0010:[<ffffffff811b22d4>] [<ffffffff811b22d4>] complete_walk+0x174/0x180
RSP: 0018:ffff88002b459d08 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88002b459e48 RCX: 0000000000000004
RDX: 0000000000000007 RSI: 0000000000000001 RDI: 0000000000000001
RBP: ffff88002b459d38 R08: 0000000000000001 R09: 0000000000000001
R10: ffff88013e46a6a0 R11: ffff88013e46aff0 R12: ffff88000a3e4de0
R13: ffff88000a3e4e40 R14: ffff880112f3b480 R15: ffff88010430bd70
FS: 00007fced19e8740(0000) GS:ffff880148600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000003768278630 CR3: 00000000090b1000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sh (pid: 2822, threadinfo ffff88002b458000, task ffff88013e46a6a0)
Stack:
0000000000000040 ffff88002b459e48 0000000000088000 ffff88002b459e48
0000000000088000 0000000000000024 ffff88002b459da8 ffffffff811b4a98
ffff8801462bd1a0 ffff880100000001 ffff88002b459de8 ffff880112f3b486
Call Trace:
[<ffffffff811b4a98>] do_last+0x138/0x900
[<ffffffff811b5372>] path_openat+0xd2/0x3f0
[<ffffffff81009e86>] ? native_sched_clock+0x26/0x90
[<ffffffff811c3f51>] ? alloc_fd+0x171/0x220
[<ffffffff811b57b1>] do_filp_open+0x41/0xa0
[<ffffffff81653595>] ? _raw_spin_unlock+0x35/0x60
[<ffffffff811c3f51>] ? alloc_fd+0x171/0x220
[<ffffffff811a315d>] do_sys_open+0xed/0x1c0
[<ffffffff811a3251>] sys_open+0x21/0x30
[<ffffffff8165ae52>] system_call_fastpath+0x16/0x1b
Code: ff ff 8b 45 d8 e9 7e ff ff ff e8 2b 7d 49 00 4c 89 ef e8 a0 12 4a 00 e8 9b e3 ff ff e8 66 26 01 00 b8 f6 ff ff ff e9 5d ff ff ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 41 54 53
RIP [<ffffffff811b22d4>] complete_walk+0x174/0x180

That's...

BUG_ON(nd->inode != dentry->d_inode);



followed by a slew of CPU stuck for 22s! traces.

Dave

2012-05-22 01:52:19

by Linus Torvalds

[permalink] [raw]
Subject: Re: 3.4+ dcache BUG.

On Mon, May 21, 2012 at 6:11 PM, Dave Jones <[email protected]> wrote:
> Just hit this. Probably related to todays dcache changes ?

Almost certainly. Except:

> I'm not sure why, but the dcache.c line numbers don't match up..
> This kernel was v3.3-rc7-14528-g29db10d which looked like..

You seem to not have fetched any tags lately (so it says "3.3-rc7 +
14528 commits" instead of something more relevant), and I can't make
sense of that SHA1 either (29db10d) either.

You probably have other changes in your tree as well, explaining the
SHA1 that I don't recognize?

But that line number does match the BUG_ON() in d_free() of the
pre-careful name lookup dcache.c, so it's all sane apart from the odd
versions you have.

What was the load you used, btw? Considering that this hits the
d_free() BUG_ON(), I have a good guess about what is going on, and I
suspect that we *used* to be protected by the pointless d_unhashed()
check in fs/dcache.c.

I say "pointless", because it *should* be pointless. But your
backtrace is intriguing, since it says:

sys_close -> filp_close -> fput -> dput -> d_kill -> d_free

and the only way you get from d_put to d_kill is through an unhashed dentry.

But the people who unhash the dentries *should* have either

(a) happened after umount, when nobody can possibly actually match on
that dentry

OR

(b) done the proper dentry sequence number dance to make sure we never use it.

that's why the d_unhashed() check got removed as "unnecessary". But
clearly I screwed it up.

What was the load that triggered this? Just a regular kernel compile?
I see the "comm: cc1" there, and I'm a bit surprised, since I ran
those patches here locally a *lot*. Is this perhaps some low-memory
scenario?

Anyway, thinking more about it, I'm starting to see why my thinking
about sequence counts was buggy. I think that happens is:

- RCU lookup races with __d_drop

- __d_drop unhashes the dentry, and does a "write_seqcount_barrier()"

- the RCU lookup saw the old dentry pointer (that we unhashed), but
by the time it loaded the sequence number off it, it's the new
sequence number after the barrier.

- so now all the sequence numbers check out ok, but we have a unhashed dentry

and I was just wrong about the d_unhashed() check being unnecessary
due to the sequence numbers.

I'll revert commit 8c01a529b861.

Linus

2012-05-22 01:58:28

by Linus Torvalds

[permalink] [raw]
Subject: Re: 3.4+ dcache BUG.

On Mon, May 21, 2012 at 6:39 PM, Dave Jones <[email protected]> wrote:
>
> ? ? ? ?BUG_ON(nd->inode != dentry->d_inode);

I'm going to assume it's the exact same thing (and thus hopefully
fixed by the revert that I just pushed out), but now I'm really
curious what your load is, since you seem to recreate this so easily.
I ran with that patch for weeks.

But I suspect my issue is that while I tested lots of concurrent
lookups, I don't run into low-memory situations.

Linus

2012-05-22 02:08:31

by Dave Jones

[permalink] [raw]
Subject: Re: 3.4+ dcache BUG.

On Mon, May 21, 2012 at 06:51:51PM -0700, Linus Torvalds wrote:
> On Mon, May 21, 2012 at 6:11 PM, Dave Jones <[email protected]> wrote:
> > Just hit this. Probably related to todays dcache changes ?
>
> Almost certainly. Except:
>
> > I'm not sure why, but the dcache.c line numbers don't match up..
> > This kernel was v3.3-rc7-14528-g29db10d which looked like..
>
> You seem to not have fetched any tags lately (so it says "3.3-rc7 +
> 14528 commits" instead of something more relevant), and I can't make
> sense of that SHA1 either (29db10d) either.
>
> You probably have other changes in your tree as well, explaining the
> SHA1 that I don't recognize?

ah, yeah. Mostly just stuff like removing noisy printk's that I get
tired of seeing when I run trinity.

> What was the load that triggered this? Just a regular kernel compile?

kernel build doing make -j 64 on a 8 way machine. (quad core with ht)
(It has a really fast ssd, so it can really churn through builds quickly).

> I see the "comm: cc1" there, and I'm a bit surprised, since I ran
> those patches here locally a *lot*. Is this perhaps some low-memory
> scenario?

hmm, that's a possibility. Machine has 4GB, but with all those gcc/ld's flying around,
maybe when it gets to a big link section..

Dave

2012-05-22 02:14:55

by Linus Torvalds

[permalink] [raw]
Subject: Re: 3.4+ dcache BUG.

On Mon, May 21, 2012 at 6:58 PM, Linus Torvalds
<[email protected]> wrote:
>
> I'm going to assume it's the exact same thing (and thus hopefully
> fixed by the revert that I just pushed out)

Oops. Except I hadn't actually pushed after committing.

NOW it's pushed out.

And please do a "git fetch --tags" too to make sure you have the tags,
and your "git describe" gives better output than
v3.3-rc7-<some-huge-number> ;)

Linus

2012-05-22 02:36:18

by Dave Jones

[permalink] [raw]
Subject: Re: 3.4+ dcache BUG.

On Mon, May 21, 2012 at 07:14:32PM -0700, Linus Torvalds wrote:
> On Mon, May 21, 2012 at 6:58 PM, Linus Torvalds
> <[email protected]> wrote:
> >
> > I'm going to assume it's the exact same thing (and thus hopefully
> > fixed by the revert that I just pushed out)
>
> Oops. Except I hadn't actually pushed after committing.
>
> NOW it's pushed out.

Yeah, that looks much better.

> And please do a "git fetch --tags" too to make sure you have the tags,
> and your "git describe" gives better output than
> v3.3-rc7-<some-huge-number> ;)

Done, thanks.

Dave