2011-03-28 20:23:11

by Wolfgang Walter

Subject: 2.6.38.2: regression from 2.6.38: kernel BUG at fs/nfsd/nfs4state.c:380!

After upgrading from 2.6.38 to 2.6.38.2, I immediately got:

Mar 28 21:35:08 au kernel: [ 312.778443] ------------[ cut here ]------------
Mar 28 21:35:08 au kernel: [ 312.778629] kernel BUG at fs/nfsd/nfs4state.c:380!
Mar 28 21:35:08 au kernel: [ 312.778746] invalid opcode: 0000 [#1] SMP
Mar 28 21:35:08 au kernel: [ 312.778949] last sysfs file: /sys/devices/virtual/vc/vcsa6/uevent
Mar 28 21:35:08 au kernel: [ 312.779068] CPU 3
Mar 28 21:35:08 au kernel: [ 312.779115] Modules linked in: i2c_i801 i5k_amb
Mar 28 21:35:08 au kernel: [ 312.779469]
Mar 28 21:35:08 au kernel: [ 312.779581] Pid: 12850, comm: nfsd Not tainted 2.6.38.2-bigintel64a+1.17 #1 Supermicro X7DB8/X7DB8
Mar 28 21:35:08 au kernel: [ 312.779970] RIP: 0010:[<ffffffff81281c8a>] [<ffffffff81281c8a>] free_generic_stateid+0x3a/0xf0
Mar 28 21:35:08 au kernel: [ 312.779970] RSP: 0018:ffff8803bd563b50 EFLAGS: 00010297
Mar 28 21:35:08 au kernel: [ 312.779970] RAX: 00000000ffffffff RBX: ffff88040e726d58 RCX: ffff88040e726d78
Mar 28 21:35:08 au kernel: [ 312.779970] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffff8803bd563b5c
Mar 28 21:35:08 au kernel: [ 312.779970] RBP: ffff8803bd563b70 R08: dead000000200200 R09: dead000000100100
Mar 28 21:35:08 au kernel: [ 312.779970] R10: dead000000200200 R11: dead000000100100 R12: ffff88040e726d58
Mar 28 21:35:08 au kernel: [ 312.779970] R13: ffff88040b588dd0 R14: ffff88040b588d98 R15: ffff8803bd4421a0
Mar 28 21:35:08 au kernel: [ 312.779970] FS: 0000000000000000(0000) GS:ffff8800cfd80000(0000) knlGS:0000000000000000
Mar 28 21:35:08 au kernel: [ 312.779970] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar 28 21:35:08 au kernel: [ 312.779970] CR2: 00007ffe6063a530 CR3: 000000040dae3000 CR4: 00000000000006e0
Mar 28 21:35:08 au kernel: [ 312.779970] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 28 21:35:08 au kernel: [ 312.779970] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar 28 21:35:08 au kernel: [ 312.779970] Process nfsd (pid: 12850, threadinfo ffff8803bd562000, task ffff8803bd447290)
Mar 28 21:35:08 au kernel: [ 312.779970] Stack:
Mar 28 21:35:08 au kernel: [ 312.779970] 00000000000000d0 0000000000000202 ffff88040e726d58 ffff88040b588d98
Mar 28 21:35:08 au kernel: [ 312.779970] ffff8803bd563ba0 ffffffff81281df9 0000000000000011 0000000000000011
Mar 28 21:35:08 au kernel: [ 312.779970] 0000000000000001 000000001d270000 ffff8803bd563dc0 ffffffff81286b4c
Mar 28 21:35:08 au kernel: [ 312.779970] Call Trace:
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff81281df9>] release_lockowner+0xb9/0x1a0
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff81286b4c>] nfsd4_lock+0x50c/0x8d0
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff8126fd07>] ? nfsd_setuser+0x137/0x300
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff81268522>] ? nfsd_setuser_and_check_port+0x72/0x80
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff812686a5>] ? fh_verify+0x175/0x6d0
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff8174cb51>] ? unix_gid_lookup+0x61/0x70
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff8127737d>] nfsd4_proc_compound+0x33d/0x4a0
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff81264bfb>] nfsd_dispatch+0xbb/0x260
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff817489a2>] svc_process+0x4b2/0x840
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff812652a0>] ? nfsd+0x0/0x160
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff81265375>] nfsd+0xd5/0x160
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff81093b96>] kthread+0x96/0xb0
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff81032a54>] kernel_thread_helper+0x4/0x10
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff81093b00>] ? kthread+0x0/0xb0
Mar 28 21:35:08 au kernel: [ 312.779970] [<ffffffff81032a50>] ? kernel_thread_helper+0x0/0x10
Mar 28 21:35:08 au kernel: [ 312.779970] Code: 8b 77 60 48 8d 7d ec e8 d5 fc ff ff 8b 45 ec 83 e0 03 83 f8 02 0f 84 b6 00 00 00 83 f8 03 0f 84 9d 00 00 00 ff c8 0f 1f 00 74 0e <0f> 0b 0f 1f 40 00 eb fa 66 0f 1f 44 00 00 31 f6 49 8b 7c 24 48
Mar 28 21:35:08 au kernel: [ 312.779970] RIP [<ffffffff81281c8a>] free_generic_stateid+0x3a/0xf0
Mar 28 21:35:08 au kernel: [ 312.779970] RSP <ffff8803bd563b50>
Mar 28 21:35:08 au kernel: [ 312.788300] ---[ end trace 0eb789063a9e575d ]---


Reverting

nfsd4: fix struct file leak

seems to "fix" it.

Regards,
--
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts


2011-03-28 20:33:21

by Greg KH

Subject: Re: 2.6.38.2: regression from 2.6.38: kernel BUG at fs/nfsd/nfs4state.c:380!

On Mon, Mar 28, 2011 at 10:23:05PM +0200, Wolfgang Walter wrote:
> Upgraded from 2.6.38 to 2.6.38.2: Got immediately:
>
> [...]
>
>
> Reverting
>
> nfsd4: fix struct file leak
>
> seems to "fix" it.

Ick.

Bruce, should I revert this, or is something else needed to be able to
have this patch applied?

thanks,

greg k-h

2011-03-28 21:01:01

by Wolfgang Walter

Subject: Re: 2.6.38.2: regression from 2.6.38: kernel BUG at fs/nfsd/nfs4state.c:380!

On Monday 28 March 2011, Greg KH wrote:
> On Mon, Mar 28, 2011 at 10:23:05PM +0200, Wolfgang Walter wrote:
> > Upgraded from 2.6.38 to 2.6.38.2: Got immediately:
> >
> > [...]
> >
> >
> > Reverting
> >
> > nfsd4: fix struct file leak
> >
> > seems to "fix" it.
>
> Ick.
>
> Bruce, should I revert this, or is something else needed to be able to
> have this patch applied?
>
> thanks,
>
> greg k-h
>
>


I just searched linux-nfs and found this:

http://marc.info/?l=linux-nfs&m=130129644016061&w=4

He even seems to have a real fix though I don't want to test it
without blessing.

Regards,
--
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts

2011-03-28 21:41:34

by Greg KH

Subject: Re: 2.6.38.2: regression from 2.6.38: kernel BUG at fs/nfsd/nfs4state.c:380!

On Mon, Mar 28, 2011 at 11:00:57PM +0200, Wolfgang Walter wrote:
> On Monday 28 March 2011, Greg KH wrote:
> > On Mon, Mar 28, 2011 at 10:23:05PM +0200, Wolfgang Walter wrote:
> > > Upgraded from 2.6.38 to 2.6.38.2: Got immediately:
> > >
> > > [...]
> > >
> > >
> > > Reverting
> > >
> > > nfsd4: fix struct file leak
> > >
> > > seems to "fix" it.
> >
> > Ick.
> >
> > Bruce, should I revert this, or is something else needed to be able to
> > have this patch applied?
> >
> > thanks,
> >
> > greg k-h
> >
> >
>
>
> I just searched linux-nfs and found this:
>
> http://marc.info/?l=linux-nfs&m=130129644016061&w=4
>
> He even seems to have a real fix though I don't want to test it
> without blessing.

Well, you aren't going to get that from me. Bruce?

thanks,

greg k-h

2011-03-29 03:13:54

by J. Bruce Fields

Subject: Re: 2.6.38.2: regression from 2.6.38: kernel BUG at fs/nfsd/nfs4state.c:380!

Bah, apologies about this, I don't know how it got through my own
testing.

On Mon, Mar 28, 2011 at 02:41:22PM -0700, Greg KH wrote:
> On Mon, Mar 28, 2011 at 11:00:57PM +0200, Wolfgang Walter wrote:
> > I just searched linux-nfs and found this:
> >
> > http://marc.info/?l=linux-nfs&m=130129644016061&w=4
> >
> > He even seems to have a real fix though I don't want to test it
> > without blessing.
>
> Well, you aren't going to get that from me, Bruce?

Good analysis from Mi Jinlong, but I'm not convinced by the patch....

I'll look closer.

(Disclaimer: between the sleep deprivation and the UM hospital's dubious
guest wifi (hey, how's a new dad supposed to keep up with his kernel
hacking if they block git?), I may be a little slow.)

--b.

2011-03-29 03:18:40

by Greg KH

Subject: Re: 2.6.38.2: regression from 2.6.38: kernel BUG at fs/nfsd/nfs4state.c:380!

On Mon, Mar 28, 2011 at 11:13:47PM -0400, J. Bruce Fields wrote:
> Bah, apologies about this, I don't know how it got through my own
> testing.
>
> On Mon, Mar 28, 2011 at 02:41:22PM -0700, Greg KH wrote:
> > On Mon, Mar 28, 2011 at 11:00:57PM +0200, Wolfgang Walter wrote:
> > > I just searched linux-nfs and found this:
> > >
> > > http://marc.info/?l=linux-nfs&m=130129644016061&w=4
> > >
> > > He even seems to have a real fix though I don't want to test it
> > > without blessing.
> >
> > Well, you aren't going to get that from me, Bruce?
>
> Good analysis from Mi Jinlong, but I'm not convinced by the patch....
>
> I'll look closer.
>
> (Disclaimer: between the sleep deprivation and the UM hospital's dubious
> guest wifi (hey, how's a new dad supposed to keep up with his kernel
> hacking if they block git?), I may be a little slow.)

Hey, no rush, you have more important things to deal with, I can always
revert the patch for now :)

thanks,

greg k-h

2011-04-11 08:17:34

by J. Bruce Fields

Subject: Re: 2.6.38.2: regression from 2.6.38: kernel BUG at fs/nfsd/nfs4state.c:380!

On Mon, Mar 28, 2011 at 08:17:25PM -0700, Greg KH wrote:
> On Mon, Mar 28, 2011 at 11:13:47PM -0400, J. Bruce Fields wrote:
> > Bah, apologies about this, I don't know how it got through my own
> > testing.
> >
> > On Mon, Mar 28, 2011 at 02:41:22PM -0700, Greg KH wrote:
> > > On Mon, Mar 28, 2011 at 11:00:57PM +0200, Wolfgang Walter wrote:
> > > > I just searched linux-nfs and found this:
> > > >
> > > > http://marc.info/?l=linux-nfs&m=130129644016061&w=4
> > > >
> > > > He even seems to have a real fix though I don't want to test it
> > > > without blessing.
> > >
> > > Well, you aren't going to get that from me, Bruce?
> >
> > Good analysis from Mi Jinlong, but I'm not convinced by the patch....
> >
> > I'll look closer.
> >
> > (Disclaimer: between the sleep deprivation and the UM hospital's dubious
> > guest wifi (hey, how's a new dad supposed to keep up with his kernel
> > hacking if they block git?), I may be a little slow.)
>
> Hey, no rush, you have more important things to deal with, I can always
> revert the patch for now :)

Argh, I forgot to stick the "Cc: stable" on this one. Anyway, should be
in Linus's tree soon:

--b.

commit 23fcf2ec93fb8573a653408316af599939ff9a8e
Author: J. Bruce Fields <[email protected]>
Date: Mon Mar 28 15:15:09 2011 +0800

nfsd4: fix oops on lock failure

Lock stateid's can have access_bmap 0 if they were only partially
initialized (due to a failed lock request); handle that case in
free_generic_stateid.

------------[ cut here ]------------
kernel BUG at fs/nfsd/nfs4state.c:380!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/kernel/mm/ksm/run
Modules linked in: nfs fscache md4 nls_utf8 cifs ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat bridge stp llc nfsd lockd nfs_acl auth_rpcgss sunrpc ipv6 ppdev parport_pc parport pcnet32 mii pcspkr microcode i2c_piix4 BusLogic floppy [last unloaded: mperf]

Pid: 1468, comm: nfsd Not tainted 2.6.38+ #120 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
EIP: 0060:[<e24f180d>] EFLAGS: 00010297 CPU: 0
EIP is at nfs4_access_to_omode+0x1c/0x29 [nfsd]
EAX: ffffffff EBX: dd758120 ECX: 00000000 EDX: 00000004
ESI: dd758120 EDI: ddfe657c EBP: dd54dde0 ESP: dd54dde0
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process nfsd (pid: 1468, ti=dd54c000 task=ddc92580 task.ti=dd54c000)
Stack:
dd54ddf0 e24f19ca 00000000 ddfe6560 dd54de08 e24f1a5d dd758130 deee3a20
ddfe6560 31270000 dd54df1c e24f52fd 0000000f dd758090 e2505dd0 0be304cf
dbb51d68 0000000e ddfe657c ddcd8020 dd758130 dd758128 dd7580d8 dd54de68
Call Trace:
[<e24f19ca>] free_generic_stateid+0x1c/0x3e [nfsd]
[<e24f1a5d>] release_lockowner+0x71/0x8a [nfsd]
[<e24f52fd>] nfsd4_lock+0x617/0x66c [nfsd]
[<e24e57b6>] ? nfsd_setuser+0x199/0x1bb [nfsd]
[<e24e056c>] ? nfsd_setuser_and_check_port+0x65/0x81 [nfsd]
[<c07a0052>] ? _cond_resched+0x8/0x1c
[<c04ca61f>] ? slab_pre_alloc_hook.clone.33+0x23/0x27
[<c04cac01>] ? kmem_cache_alloc+0x1a/0xd2
[<c04835a0>] ? __call_rcu+0xd7/0xdd
[<e24e0dfb>] ? fh_verify+0x401/0x452 [nfsd]
[<e24f0b61>] ? nfsd4_encode_operation+0x52/0x117 [nfsd]
[<e24ea0d7>] ? nfsd4_putfh+0x33/0x3b [nfsd]
[<e24f4ce6>] ? nfsd4_delegreturn+0xd4/0xd4 [nfsd]
[<e24ea2c9>] nfsd4_proc_compound+0x1ea/0x33e [nfsd]
[<e24de6ee>] nfsd_dispatch+0xd1/0x1a5 [nfsd]
[<e1d6e1c7>] svc_process_common+0x282/0x46f [sunrpc]
[<e1d6e578>] svc_process+0xdc/0xfa [sunrpc]
[<e24de0fa>] nfsd+0xd6/0x115 [nfsd]
[<e24de024>] ? nfsd_shutdown+0x24/0x24 [nfsd]
[<c0454322>] kthread+0x62/0x67
[<c04542c0>] ? kthread_worker_fn+0x114/0x114
[<c07a6ebe>] kernel_thread_helper+0x6/0x10
Code: eb 05 b8 00 00 27 4f 8d 65 f4 5b 5e 5f 5d c3 83 e0 03 55 83 f8 02 89 e5 74 17 83 f8 03 74 05 48 75 09 eb 09 b8 02 00 00 00 eb 0b <0f> 0b 31 c0 eb 05 b8 01 00 00 00 5d c3 55 89 e5 57 56 89 d6 8d
EIP: [<e24f180d>] nfs4_access_to_omode+0x1c/0x29 [nfsd] SS:ESP 0068:dd54dde0
---[ end trace 2b0bf6c6557cb284 ]---

The trace route is:

-> nfsd4_lock()
  -> if (lock->lk_is_new) {
    -> alloc_init_lock_stateid()

       3739: stp->st_access_bmap = 0;

  -> if (status && lock->lk_is_new && lock_sop)
    -> release_lockowner()
      -> free_generic_stateid()
        -> nfs4_access_bmap_to_omode()
          -> nfs4_access_to_omode()

             380: BUG(); *****
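
The last two steps of this route can be modelled in user space. The following is only a minimal sketch, not the kernel code: it assumes that bit i of the access bitmap is set when share-access value i (1 = read, 2 = write, 3 = both) was requested, and that the kernel helper falls through to BUG() when no value matches. Every model_* and MODEL_* name is hypothetical, and the model returns -1 at the point where the kernel would hit BUG().

```c
#include <fcntl.h>

/* Hypothetical user-space stand-ins for the NFSv4 share-access values. */
enum {
	MODEL_ACCESS_READ  = 1,
	MODEL_ACCESS_WRITE = 2,
	MODEL_ACCESS_BOTH  = 3
};

/*
 * Collapse an access bitmap into a single access value.
 * Assumption: bit i set means access value i was requested.
 */
static unsigned int model_bmap_to_access(unsigned long bmap)
{
	unsigned int access = 0;
	unsigned int i;

	for (i = 1; i < 4; i++)
		if (bmap & (1UL << i))
			access |= i;
	return access;
}

/*
 * Map an access value to an open mode.
 * Returns -1 where the kernel helper would reach BUG().
 */
static int model_access_to_omode(unsigned int access)
{
	switch (access & MODEL_ACCESS_BOTH) {
	case MODEL_ACCESS_READ:
		return O_RDONLY;
	case MODEL_ACCESS_WRITE:
		return O_WRONLY;
	case MODEL_ACCESS_BOTH:
		return O_RDWR;
	}
	return -1; /* access == 0: no case matches */
}
```

With a bitmap of 0, as left behind by a failed lock request, model_bmap_to_access() yields 0 and model_access_to_omode() falls out of the switch, which under the stated assumptions corresponds to the BUG() at line 380.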

This problem was introduced by 0997b173609b9229ece28941c118a2a9b278796e.

Reported-by: Mi Jinlong <[email protected]>
Tested-by: Mi Jinlong <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index fbde6f7..8e3c407 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -397,10 +397,13 @@ static void unhash_generic_stateid(struct nfs4_stateid *stp)

 static void free_generic_stateid(struct nfs4_stateid *stp)
 {
-	int oflag = nfs4_access_bmap_to_omode(stp);
+	int oflag;

-	nfs4_file_put_access(stp->st_file, oflag);
-	put_nfs4_file(stp->st_file);
+	if (stp->st_access_bmap) {
+		oflag = nfs4_access_bmap_to_omode(stp);
+		nfs4_file_put_access(stp->st_file, oflag);
+		put_nfs4_file(stp->st_file);
+	}
 	kmem_cache_free(stateid_slab, stp);
 }
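
For readers who want to try the control flow of the patch outside the kernel, here is a small user-space sketch in C. It is an illustration under assumptions, not the nfsd code: struct model_file, struct model_stateid and the model_* functions are hypothetical, a plain counter stands in for the file references, and the model assumes as a simplification that no reference is held while the bitmap is still 0 (this thread does not establish whether that holds in the kernel).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-file state: only a reference count. */
struct model_file {
	int refs;
};

/* Hypothetical stand-in for a lock stateid. */
struct model_stateid {
	unsigned long access_bmap; /* stays 0 if the lock request failed early */
	struct model_file *file;
};

/* Drop one reference; stands in for the release calls in the patch. */
static void model_file_put(struct model_file *fp)
{
	assert(fp != NULL && fp->refs > 0);
	fp->refs--;
}

/*
 * Mirrors the patched control flow: only touch the file when some
 * access was recorded, then free the stateid unconditionally.
 */
static void model_free_stateid(struct model_stateid *stp)
{
	if (stp->access_bmap)
		model_file_put(stp->file);
	free(stp);
}
```

Freeing a stateid whose access_bmap is 0 leaves the reference count untouched in this model, while a stateid with a non-zero bitmap drops exactly one reference before being freed.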

2011-04-12 00:09:27

by Greg KH

Subject: Re: 2.6.38.2: regression from 2.6.38: kernel BUG at fs/nfsd/nfs4state.c:380!

On Mon, Apr 11, 2011 at 04:17:26AM -0400, J. Bruce Fields wrote:
> On Mon, Mar 28, 2011 at 08:17:25PM -0700, Greg KH wrote:
> > On Mon, Mar 28, 2011 at 11:13:47PM -0400, J. Bruce Fields wrote:
> > > Bah, apologies about this, I don't know how it got through my own
> > > testing.
> > >
> > > On Mon, Mar 28, 2011 at 02:41:22PM -0700, Greg KH wrote:
> > > > On Mon, Mar 28, 2011 at 11:00:57PM +0200, Wolfgang Walter wrote:
> > > > > I just searched linux-nfs and found this:
> > > > >
> > > > > http://marc.info/?l=linux-nfs&m=130129644016061&w=4
> > > > >
> > > > > He even seems to have a real fix though I don't want to test it
> > > > > without blessing.
> > > >
> > > > Well, you aren't going to get that from me, Bruce?
> > >
> > > Good analysis from Mi Jinlong, but I'm not convinced by the patch....
> > >
> > > I'll look closer.
> > >
> > > (Disclaimer: between the sleep deprivation and the UM hospital's dubious
> > > guest wifi (hey, how's a new dad supposed to keep up with his kernel
> > > hacking if they block git?), I may be a little slow.)
> >
> > Hey, no rush, you have more important things to deal with, I can always
> > revert the patch for now :)
>
> Argh, I forgot to stick the "Cc: stable" on this one. Anyway, should be
> in Linus's tree soon:
>
> --b.
>
> commit 23fcf2ec93fb8573a653408316af599939ff9a8e

Now picked up.

In the future, at least cc: this email to [email protected] so I don't
lose it in the wilds that are my suse.de email account...

thanks,

greg k-h

2011-04-16 00:41:13

by J. Bruce Fields

Subject: Re: 2.6.38.2: regression from 2.6.38: kernel BUG at fs/nfsd/nfs4state.c:380!

On Mon, Apr 11, 2011 at 05:08:09PM -0700, Greg KH wrote:
> On Mon, Apr 11, 2011 at 04:17:26AM -0400, J. Bruce Fields wrote:
> > Argh, I forgot to stick the "Cc: stable" on this one. Anyway, should be
> > in Linus's tree soon:
> >
> > --b.
> >
> > commit 23fcf2ec93fb8573a653408316af599939ff9a8e
>
> Now picked up.
>
> In the future, at least cc: this email to [email protected] so I don't
> lose it in the wilds that are my suse.de email account...

Apologies, will do.

--b.