2012-08-07 13:42:10

by Joerg Roedel

Subject: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

Hi,

starting with Linux 3.6-rc1 I experience this BUG on one of my test
machines. Please let me know if you need any additional information.

[ 20.271810] ------------[ cut here ]------------
[ 20.276869] kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!
[ 20.284306] invalid opcode: 0000 [#1] SMP
[ 20.288806] Modules linked in: nfs4 auth_rpcgss nfs fscache lockd sunrpc kvm_intel radeon kvm ttm drm_kms_helper i7core_edac drm edac_core joydev hpilo psmouse hid_generic i2c_algo_bit serio_raw usbhid hid bnx2
[ 20.309466] CPU 2
[ 20.311476] Pid: 1073, comm: mount.nfs Not tainted 3.6.0-rc1 #24 HP ProLiant DL360 G7
[ 20.320264] RIP: 0010:[<ffffffffa029108b>] [<ffffffffa029108b>] nfs_idmap_legacy_upcall+0x34b/0x350 [nfs4]
[ 20.320266] RSP: 0018:ffff880214e333e8 EFLAGS: 00010286
[ 20.320267] RAX: 0000000000000010 RBX: ffff880211f22540 RCX: 0000000000000000
[ 20.320268] RDX: 0000000000000000 RSI: ffff880216ba6ef4 RDI: ffff880211f22552
[ 20.320269] RBP: ffff880214e33428 R08: 000000000000003a R09: ffff880214e33380
[ 20.320270] R10: 000000002e8cc855 R11: 000000001fce9892 R12: ffff8802130c1bc0
[ 20.320271] R13: ffff880216baa0c0 R14: ffff880415b78280 R15: 0000000000000010
[ 20.320273] FS: 00007fe6d4208720(0000) GS:ffff880217c20000(0000) knlGS:0000000000000000
[ 20.320274] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 20.320275] CR2: 00007fa82360f6c0 CR3: 0000000216399000 CR4: 00000000000007e0
[ 20.320276] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 20.320277] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 20.320279] Process mount.nfs (pid: 1073, threadinfo ffff880214e32000, task ffff880213198000)
[ 20.320280] Stack:
[ 20.320282] ffff880216ba6ee4 ffff880216ba6ef4 ffff880214e33428 ffff880216baa0c0
[ 20.320284] ffff880211f22b40 ffff880211f226c0 0000000000000000 ffffffffa0296ddc
[ 20.320286] ffff880214e334c8 ffffffff812a35b7 0000000000000000 ffff880415517080
[ 20.320287] Call Trace:
[ 20.320294] [<ffffffff812a35b7>] request_key_and_link+0x2e7/0x410
[ 20.320297] [<ffffffff812a374e>] request_key_with_auxdata+0x1e/0x70
[ 20.320304] [<ffffffffa0291166>] nfs_idmap_request_key+0xd6/0x1b0 [nfs4]
[ 20.320310] [<ffffffffa029143b>] nfs_idmap_lookup_id+0xeb/0x110 [nfs4]
[ 20.320316] [<ffffffffa029166e>] ? nfs_map_string_to_numeric+0x3e/0xb0 [nfs4]
[ 20.320322] [<ffffffffa0291e5c>] nfs_map_group_to_gid+0x5c/0x80 [nfs4]
[ 20.320327] [<ffffffffa028bca1>] decode_getfattr_attrs+0xb71/0xb90 [nfs4]
[ 20.320331] [<ffffffff810135ca>] ? __switch_to+0x17a/0x410
[ 20.320336] [<ffffffffa028bd41>] decode_getfattr_generic.constprop.71+0x81/0xb0 [nfs4]
[ 20.320341] [<ffffffffa028bfc0>] ? nfs4_xdr_dec_link+0xd0/0xd0 [nfs4]
[ 20.320346] [<ffffffffa028bd83>] decode_getfattr+0x13/0x20 [nfs4]
[ 20.320350] [<ffffffffa028c02b>] nfs4_xdr_dec_lookup_root+0x6b/0x70 [nfs4]
[ 20.320355] [<ffffffffa028bfc0>] ? nfs4_xdr_dec_link+0xd0/0xd0 [nfs4]
[ 20.320370] [<ffffffffa020e195>] rpcauth_unwrap_resp+0x65/0x70 [sunrpc]
[ 20.320377] [<ffffffffa0203918>] call_decode+0x308/0x400 [sunrpc]
[ 20.320381] [<ffffffff81079a70>] ? autoremove_wake_function+0x40/0x40
[ 20.320391] [<ffffffffa020c6a0>] __rpc_execute+0x70/0x2c0 [sunrpc]
[ 20.320397] [<ffffffffa0203610>] ? call_transmit_status+0xd0/0xd0 [sunrpc]
[ 20.320404] [<ffffffffa0203610>] ? call_transmit_status+0xd0/0xd0 [sunrpc]
[ 20.320413] [<ffffffffa020cf3f>] rpc_execute+0x4f/0xb0 [sunrpc]
[ 20.320420] [<ffffffffa0205215>] rpc_run_task+0x75/0x90 [sunrpc]
[ 20.320427] [<ffffffffa0205333>] rpc_call_sync+0x43/0x70 [sunrpc]
[ 20.320431] [<ffffffffa027f393>] _nfs4_call_sync+0x13/0x20 [nfs4]
[ 20.320435] [<ffffffffa028230c>] _nfs4_lookup_root.isra.37+0xac/0xc0 [nfs4]
[ 20.320440] [<ffffffffa028236f>] nfs4_lookup_root+0x4f/0x90 [nfs4]
[ 20.320444] [<ffffffffa0285c65>] nfs4_proc_get_rootfh+0x35/0xd0 [nfs4]
[ 20.320450] [<ffffffffa0293a33>] nfs4_get_rootfh+0x33/0xd0 [nfs4]
[ 20.320459] [<ffffffffa025cee4>] ? nfs_alloc_fattr+0x24/0x80 [nfs]
[ 20.320465] [<ffffffffa0293b36>] nfs4_server_common_setup+0x66/0xf0 [nfs4]
[ 20.320471] [<ffffffffa029432d>] nfs4_create_server+0x1bd/0x2e0 [nfs4]
[ 20.320474] [<ffffffff8116c81d>] ? __kmalloc_track_caller+0x13d/0x190
[ 20.320480] [<ffffffffa028f808>] nfs4_remote_mount+0x38/0x70 [nfs4]
[ 20.320483] [<ffffffff8117abe3>] mount_fs+0x43/0x1b0
[ 20.320485] [<ffffffff81195156>] vfs_kern_mount+0x76/0x120
[ 20.320491] [<ffffffffa028f785>] nfs_do_root_mount+0x95/0xe0 [nfs4]
[ 20.320497] [<ffffffffa028fac0>] nfs4_try_mount+0x40/0x60 [nfs4]
[ 20.320504] [<ffffffffa0262077>] nfs_fs_mount+0x487/0xa40 [nfs]
[ 20.320512] [<ffffffffa02619e0>] ? nfs_clone_super+0x140/0x140 [nfs]
[ 20.320519] [<ffffffffa0260270>] ? nfs_clone_sb_security+0x60/0x60 [nfs]
[ 20.320521] [<ffffffff8117abe3>] mount_fs+0x43/0x1b0
[ 20.320523] [<ffffffff811942c3>] ? find_filesystem+0x63/0x80
[ 20.320526] [<ffffffff81195156>] vfs_kern_mount+0x76/0x120
[ 20.320528] [<ffffffff811959c4>] do_kern_mount+0x54/0x110
[ 20.320530] [<ffffffff81197424>] do_mount+0x1a4/0x8a0
[ 20.320533] [<ffffffff811970fa>] ? copy_mount_options+0x3a/0x170
[ 20.320535] [<ffffffff81197bb0>] sys_mount+0x90/0xe0
[ 20.320538] [<ffffffff81620ba9>] system_call_fastpath+0x16/0x1b
[ 20.320559] Code: ff 0f 1f 80 00 00 00 00 66 c7 07 00 00 83 ea 02 48 83 c7 02 e9 5e fd ff ff 0f 1f 80 00 00 00 00 41 bf f4 ff ff ff e9 0b fe ff ff <0f> 0b 0f 1f 00 55 48 89 e5 48 83 ec 60 48 89 5d d8 4c 89 65 e0
[ 20.320565] RIP [<ffffffffa029108b>] nfs_idmap_legacy_upcall+0x34b/0x350 [nfs4]
[ 20.320566] RSP <ffff880214e333e8>
[ 20.320635] ---[ end trace 883f5b90b0291611 ]---

--
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632



2012-08-07 14:36:50

by Anna Schumaker

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On 08/07/2012 10:27 AM, Joerg Roedel wrote:
> On Tue, Aug 07, 2012 at 10:17:33AM -0400, Bryan Schumaker wrote:
>> On 08/07/2012 10:15 AM, Joerg Roedel wrote:
>>> Yes, it reproduces pretty reliable here with Ubuntu 11.10 Server on an
>>> Intel box with an NFSv3 directory mounted at boot. This is the only box
>>> I have seen this so far, probably it depends on the config. I attach the
>>> config of the failing box.
>>
>> Interesting. Are you mounting v4, too? This code shouldn't be
>> running for v3... maybe that's why I haven't been able to hit it.
>
> No, I am not using NFSv4 on the box where the BUG happens. I have
> another box mounting the same directory where the BUG does not trigger
> with v3.6-rc1. A difference I spotted between the kernels is, that on
> the failing box NFS is compiled as a module whereas it is compiled into
> the kernel on the box that works fine. Not sure if that has anything to
> do with the problem...
>

Your stack trace is showing v4 calls on the failing box; those definitely shouldn't be happening if you're using v3. Can you double-check /etc/fstab and /proc/mounts on a working kernel to be sure?

My VM has nfs as a module, so I don't think that's the issue... I just started compiling your config to test on my own.
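
One way to check which protocol version a mount actually ended up with is to look at the filesystem type and the vers= option in each /proc/mounts entry. The following is a minimal sketch under that assumption; the helper name nfs_mount_version() is hypothetical and is not part of nfs-utils or the kernel.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: report the NFS protocol version for one
 * /proc/mounts entry, given its filesystem type and option string.
 * Returns 0 and fills `out` on success, -1 if the entry is not NFS
 * or no version can be determined from it.
 */
static int nfs_mount_version(const char *fstype, const char *opts,
                             char *out, size_t outlen)
{
    const char *p = opts;

    if (strcmp(fstype, "nfs") != 0 && strcmp(fstype, "nfs4") != 0)
        return -1;

    /* Walk the comma-separated options looking for vers= or nfsvers=. */
    while (p && *p) {
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        const char *val = NULL;

        if (len > 5 && strncmp(p, "vers=", 5) == 0)
            val = p + 5;
        else if (len > 8 && strncmp(p, "nfsvers=", 8) == 0)
            val = p + 8;

        if (val) {
            size_t vlen = len - (size_t)(val - p);

            if (vlen >= outlen)
                return -1;
            memcpy(out, val, vlen);
            out[vlen] = '\0';
            return 0;
        }
        p = end ? end + 1 : NULL;
    }

    /* No explicit option: an "nfs4" type still implies version 4. */
    if (strcmp(fstype, "nfs4") == 0 && outlen > 1) {
        snprintf(out, outlen, "4");
        return 0;
    }
    return -1;
}
```

Feeding it the third and fourth fields of each /proc/mounts line should show whether a mount that was expected to be v3 was in fact negotiated as v4.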

>
> Joerg
>
>


2012-08-07 14:50:22

by Joerg Roedel

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On Tue, Aug 07, 2012 at 10:36:31AM -0400, Bryan Schumaker wrote:
> Your stack trace is showing v4 calls on the failing box, those
> definitely shouldn't be happening if you're using v3. Can you double
> check /etc/fstab and /proc/mounts on a working kernel to be sure?

So the bug is probably (for whatever reason) that the nfs4 path is
called for an nfs3 mount :)
Anyway, I attach /proc/mounts and /etc/fstab from that box running a
v3.5-rc5 kernel (where it works).


Joerg


Attachments:
(No filename) (479.00 B)
proc-mounts-3.5 (976.00 B)
etc-fstab (856.00 B)

2012-08-07 15:19:03

by Myklebust, Trond

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On Tue, 2012-08-07 at 16:50 +0200, Joerg Roedel wrote:
> On Tue, Aug 07, 2012 at 10:36:31AM -0400, Bryan Schumaker wrote:
> > Your stack trace is showing v4 calls on the failing box, those
> > definitely shouldn't be happening if you're using v3. Can you double
> > check /etc/fstab and /proc/mounts on a working kernel to be sure?
>
> So the bug is probably (for whatever reason) that the nfs4 path is
> called for an nfs3 mount :)

If your /etc/nfsmount.conf doesn't contain a line of the form

Defaultvers=4

then the mount utility will try NFSv4 by default. That is not a bug, it
is a deliberate feature of recent versions of nfs-utils.

> Anyway, I attach /proc/mounts and /etc/fstab from that box running a
> v3.5-rc5 kernel (where it works).

I'm guessing that the fact you are not running idmapper is causing the
NFSv4 mount to fail on the older kernels, and so the mount program is
falling back to an NFSv3 mount.

We need to fix v3.6 so that it does the same.

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
www.netapp.com

2012-08-07 14:25:10

by Joerg Roedel

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On Tue, Aug 07, 2012 at 09:55:14AM -0400, Bryan Schumaker wrote:
> On 08/07/2012 09:41 AM, Joerg Roedel wrote:
> > starting with Linux 3.6-rc1 I experience this BUG on one of my test
> > machines. Please let me know if you need any additional information.
>
> I think this is the same bug that William Dauchy has been hitting. Do
> you have a reproducer for this? I haven't been able to trigger it on
> my own :(.

Yes, it reproduces pretty reliably here with Ubuntu 11.10 Server on an
Intel box with an NFSv3 directory mounted at boot. This is the only box
where I have seen this so far, so it probably depends on the config. I
attach the config of the failing box.

HTH,

Joerg


Attachments:
(No filename) (676.00 B)
config-buenos (86.67 kB)

2012-08-07 14:17:37

by Anna Schumaker

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On 08/07/2012 10:15 AM, Joerg Roedel wrote:
> On Tue, Aug 07, 2012 at 09:55:14AM -0400, Bryan Schumaker wrote:
>> On 08/07/2012 09:41 AM, Joerg Roedel wrote:
>>> starting with Linux 3.6-rc1 I experience this BUG on one of my test
>>> machines. Please let me know if you need any additional information.
>>
>> I think this is the same bug that William Dauchy has been hitting. Do
>> you have a reproducer for this? I haven't been able to trigger it on
>> my own :(.
>
> Yes, it reproduces pretty reliable here with Ubuntu 11.10 Server on an
> Intel box with an NFSv3 directory mounted at boot. This is the only box
> I have seen this so far, probably it depends on the config. I attach the
> config of the failing box.

Interesting. Are you mounting v4, too? This code shouldn't be running for v3... maybe that's why I haven't been able to hit it.

- Bryan

>
> HTH,
>
> Joerg
>


2012-08-07 15:12:57

by Anna Schumaker

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On 08/07/2012 10:50 AM, Joerg Roedel wrote:
> On Tue, Aug 07, 2012 at 10:36:31AM -0400, Bryan Schumaker wrote:
>> Your stack trace is showing v4 calls on the failing box, those
>> definitely shouldn't be happening if you're using v3. Can you double
>> check /etc/fstab and /proc/mounts on a working kernel to be sure?
>
> So the bug is probably (for whatever reason) that the nfs4 path is
> called for an nfs3 mount :)

That's what I'm thinking. If you're not using v4, what happens if you turn CONFIG_NFS_V4 off in your .config for the broken machine? I still haven't been able to trigger the bug, even with your .config, so I'm going to keep poking at it.

- Bryan

> Anyway, I attach /proc/mounts and /etc/fstab from that box running a
> v3.5-rc5 kernel (where it works).

Thanks. I don't see anything weird in there. You're using the same settings for the other machine, too?

>
>
> Joerg
>


2012-08-07 15:18:26

by Anna Schumaker

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On 08/07/2012 11:14 AM, Myklebust, Trond wrote:
> On Tue, 2012-08-07 at 10:36 -0400, Bryan Schumaker wrote:
>> On 08/07/2012 10:27 AM, Joerg Roedel wrote:
>>> On Tue, Aug 07, 2012 at 10:17:33AM -0400, Bryan Schumaker wrote:
>>>> On 08/07/2012 10:15 AM, Joerg Roedel wrote:
>>>>> Yes, it reproduces pretty reliable here with Ubuntu 11.10 Server on an
>>>>> Intel box with an NFSv3 directory mounted at boot. This is the only box
>>>>> I have seen this so far, probably it depends on the config. I attach the
>>>>> config of the failing box.
>>>>
>>>> Interesting. Are you mounting v4, too? This code shouldn't be
>>>> running for v3... maybe that's why I haven't been able to hit it.
>>>
>>> No, I am not using NFSv4 on the box where the BUG happens. I have
>>> another box mounting the same directory where the BUG does not trigger
>>> with v3.6-rc1. A difference I spotted between the kernels is, that on
>>> the failing box NFS is compiled as a module whereas it is compiled into
>>> the kernel on the box that works fine. Not sure if that has anything to
>>> do with the problem...
>>>
>>
>> Your stack trace is showing v4 calls on the failing box, those definitely shouldn't be happening if you're using v3. Can you double check /etc/fstab and /proc/mounts on a working kernel to be sure?
>>
>> My VM has nfs as a module, so I don't think that's the issue... I just started compiling your config to test on my own.
>
> Joerg,
>
> The stack trace definitely shows that the NFS client is attempting an
> NFSv4 mount. Are you supplying a 'vers=3' mount option? If not, then
> note that recent versions of nfs-utils can be configured to try NFSv4 as
> the default mount option, so I'd guess this is why you are hitting an
> NFSv4 path.
>
> Bryan,
>
> That said, when looking at the legacy upcall, it seems that if
> rpc_queue_upcall fails, then we don't do anything to clear
> idmap->idmap_key_cons. Ditto if the call times out, or if the pipe is
> closed before the downcall.
>

Ah! I didn't think about the upcall failing, thanks Trond! I'll work on a patch.

- Bryan
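
The failure mode Trond describes (in-flight state left behind when the upcall cannot be queued) can be modelled in a few lines of userspace C. This is only a sketch of the idea, not the eventual kernel patch: struct idmap_model, queue_upcall_model() and the other names are hypothetical stand-ins, and the `fail` flag simply simulates the queueing step failing.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real kernel structures. */
struct key_cons {
    int id;
};

struct idmap_model {
    struct key_cons *key_cons;   /* in-flight construction data, or NULL */
};

/* Hypothetical queueing step; `fail` simulates an unavailable pipe. */
static int queue_upcall_model(int fail)
{
    return fail ? -EPIPE : 0;
}

/*
 * Start an upcall. Returns -EBUSY if an earlier upcall left state
 * behind; otherwise returns the queueing result. On failure the
 * in-flight pointer is cleared so the next caller starts clean.
 */
static int legacy_upcall_model(struct idmap_model *idmap,
                               struct key_cons *cons, int fail)
{
    int ret;

    if (idmap->key_cons != NULL)
        return -EBUSY;

    idmap->key_cons = cons;
    ret = queue_upcall_model(fail);
    if (ret < 0)
        idmap->key_cons = NULL;   /* cleanup on the failure path */
    return ret;
}

/* Model of the downcall completing: consume and clear the state. */
static struct key_cons *complete_upcall_model(struct idmap_model *idmap)
{
    struct key_cons *cons = idmap->key_cons;

    idmap->key_cons = NULL;
    return cons;
}
```

Without the assignment on the failure path, the second call in this model would always see stale state, which is the condition the check at idmap.c:681 trips over.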


2012-08-07 14:27:10

by Joerg Roedel

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On Tue, Aug 07, 2012 at 10:17:33AM -0400, Bryan Schumaker wrote:
> On 08/07/2012 10:15 AM, Joerg Roedel wrote:
> > Yes, it reproduces pretty reliable here with Ubuntu 11.10 Server on an
> > Intel box with an NFSv3 directory mounted at boot. This is the only box
> > I have seen this so far, probably it depends on the config. I attach the
> > config of the failing box.
>
> Interesting. Are you mounting v4, too? This code shouldn't be
> running for v3... maybe that's why I haven't been able to hit it.

No, I am not using NFSv4 on the box where the BUG happens. I have
another box mounting the same directory where the BUG does not trigger
with v3.6-rc1. A difference I spotted between the kernels is that on
the failing box NFS is compiled as a module whereas it is compiled into
the kernel on the box that works fine. Not sure if that has anything to
do with the problem...


Joerg



2012-08-07 15:14:24

by Myklebust, Trond

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On Tue, 2012-08-07 at 10:36 -0400, Bryan Schumaker wrote:
> On 08/07/2012 10:27 AM, Joerg Roedel wrote:
> > On Tue, Aug 07, 2012 at 10:17:33AM -0400, Bryan Schumaker wrote:
> >> On 08/07/2012 10:15 AM, Joerg Roedel wrote:
> >>> Yes, it reproduces pretty reliable here with Ubuntu 11.10 Server on an
> >>> Intel box with an NFSv3 directory mounted at boot. This is the only box
> >>> I have seen this so far, probably it depends on the config. I attach the
> >>> config of the failing box.
> >>
> >> Interesting. Are you mounting v4, too? This code shouldn't be
> >> running for v3... maybe that's why I haven't been able to hit it.
> >
> > No, I am not using NFSv4 on the box where the BUG happens. I have
> > another box mounting the same directory where the BUG does not trigger
> > with v3.6-rc1. A difference I spotted between the kernels is, that on
> > the failing box NFS is compiled as a module whereas it is compiled into
> > the kernel on the box that works fine. Not sure if that has anything to
> > do with the problem...
> >
>
> Your stack trace is showing v4 calls on the failing box, those definitely shouldn't be happening if you're using v3. Can you double check /etc/fstab and /proc/mounts on a working kernel to be sure?
>
> My VM has nfs as a module, so I don't think that's the issue... I just started compiling your config to test on my own.

Joerg,

The stack trace definitely shows that the NFS client is attempting an
NFSv4 mount. Are you supplying a 'vers=3' mount option? If not, then
note that recent versions of nfs-utils can be configured to try NFSv4 as
the default mount option, so I'd guess this is why you are hitting an
NFSv4 path.

Bryan,

That said, when looking at the legacy upcall, it seems that if
rpc_queue_upcall fails, then we don't do anything to clear
idmap->idmap_key_cons. Ditto if the call times out, or if the pipe is
closed before the downcall.

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
www.netapp.com

2012-08-07 13:55:18

by Anna Schumaker

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On 08/07/2012 09:41 AM, Joerg Roedel wrote:
> Hi,
>
> starting with Linux 3.6-rc1 I experience this BUG on one of my test
> machines. Please let me know if you need any additional information.

I think this is the same bug that William Dauchy has been hitting. Do you have a reproducer for this? I haven't been able to trigger it on my own :(.

- Bryan

>
> [ 20.271810] ------------[ cut here ]------------
> [ 20.276869] kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!
> [ 20.284306] invalid opcode: 0000 [#1] SMP
> [ 20.288806] Modules linked in: nfs4 auth_rpcgss nfs fscache lockd sunrpc kvm_intel radeon kvm ttm drm_kms_helper i7core_edac drm edac_core joydev hpilo psmouse hid_generic i2c_algo_bit serio_raw usbhid hid bnx2
> [ 20.309466] CPU 2
> [ 20.311476] Pid: 1073, comm: mount.nfs Not tainted 3.6.0-rc1 #24 HP ProLiant DL360 G7
> [ 20.320264] RIP: 0010:[<ffffffffa029108b>] [<ffffffffa029108b>] nfs_idmap_legacy_upcall+0x34b/0x350 [nfs4]
> [ 20.320266] RSP: 0018:ffff880214e333e8 EFLAGS: 00010286
> [ 20.320267] RAX: 0000000000000010 RBX: ffff880211f22540 RCX: 0000000000000000
> [ 20.320268] RDX: 0000000000000000 RSI: ffff880216ba6ef4 RDI: ffff880211f22552
> [ 20.320269] RBP: ffff880214e33428 R08: 000000000000003a R09: ffff880214e33380
> [ 20.320270] R10: 000000002e8cc855 R11: 000000001fce9892 R12: ffff8802130c1bc0
> [ 20.320271] R13: ffff880216baa0c0 R14: ffff880415b78280 R15: 0000000000000010
> [ 20.320273] FS: 00007fe6d4208720(0000) GS:ffff880217c20000(0000) knlGS:0000000000000000
> [ 20.320274] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 20.320275] CR2: 00007fa82360f6c0 CR3: 0000000216399000 CR4: 00000000000007e0
> [ 20.320276] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 20.320277] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 20.320279] Process mount.nfs (pid: 1073, threadinfo ffff880214e32000, task ffff880213198000)
> [ 20.320280] Stack:
> [ 20.320282] ffff880216ba6ee4 ffff880216ba6ef4 ffff880214e33428 ffff880216baa0c0
> [ 20.320284] ffff880211f22b40 ffff880211f226c0 0000000000000000 ffffffffa0296ddc
> [ 20.320286] ffff880214e334c8 ffffffff812a35b7 0000000000000000 ffff880415517080
> [ 20.320287] Call Trace:
> [ 20.320294] [<ffffffff812a35b7>] request_key_and_link+0x2e7/0x410
> [ 20.320297] [<ffffffff812a374e>] request_key_with_auxdata+0x1e/0x70
> [ 20.320304] [<ffffffffa0291166>] nfs_idmap_request_key+0xd6/0x1b0 [nfs4]
> [ 20.320310] [<ffffffffa029143b>] nfs_idmap_lookup_id+0xeb/0x110 [nfs4]
> [ 20.320316] [<ffffffffa029166e>] ? nfs_map_string_to_numeric+0x3e/0xb0 [nfs4]
> [ 20.320322] [<ffffffffa0291e5c>] nfs_map_group_to_gid+0x5c/0x80 [nfs4]
> [ 20.320327] [<ffffffffa028bca1>] decode_getfattr_attrs+0xb71/0xb90 [nfs4]
> [ 20.320331] [<ffffffff810135ca>] ? __switch_to+0x17a/0x410
> [ 20.320336] [<ffffffffa028bd41>] decode_getfattr_generic.constprop.71+0x81/0xb0 [nfs4]
> [ 20.320341] [<ffffffffa028bfc0>] ? nfs4_xdr_dec_link+0xd0/0xd0 [nfs4]
> [ 20.320346] [<ffffffffa028bd83>] decode_getfattr+0x13/0x20 [nfs4]
> [ 20.320350] [<ffffffffa028c02b>] nfs4_xdr_dec_lookup_root+0x6b/0x70 [nfs4]
> [ 20.320355] [<ffffffffa028bfc0>] ? nfs4_xdr_dec_link+0xd0/0xd0 [nfs4]
> [ 20.320370] [<ffffffffa020e195>] rpcauth_unwrap_resp+0x65/0x70 [sunrpc]
> [ 20.320377] [<ffffffffa0203918>] call_decode+0x308/0x400 [sunrpc]
> [ 20.320381] [<ffffffff81079a70>] ? autoremove_wake_function+0x40/0x40
> [ 20.320391] [<ffffffffa020c6a0>] __rpc_execute+0x70/0x2c0 [sunrpc]
> [ 20.320397] [<ffffffffa0203610>] ? call_transmit_status+0xd0/0xd0 [sunrpc]
> [ 20.320404] [<ffffffffa0203610>] ? call_transmit_status+0xd0/0xd0 [sunrpc]
> [ 20.320413] [<ffffffffa020cf3f>] rpc_execute+0x4f/0xb0 [sunrpc]
> [ 20.320420] [<ffffffffa0205215>] rpc_run_task+0x75/0x90 [sunrpc]
> [ 20.320427] [<ffffffffa0205333>] rpc_call_sync+0x43/0x70 [sunrpc]
> [ 20.320431] [<ffffffffa027f393>] _nfs4_call_sync+0x13/0x20 [nfs4]
> [ 20.320435] [<ffffffffa028230c>] _nfs4_lookup_root.isra.37+0xac/0xc0 [nfs4]
> [ 20.320440] [<ffffffffa028236f>] nfs4_lookup_root+0x4f/0x90 [nfs4]
> [ 20.320444] [<ffffffffa0285c65>] nfs4_proc_get_rootfh+0x35/0xd0 [nfs4]
> [ 20.320450] [<ffffffffa0293a33>] nfs4_get_rootfh+0x33/0xd0 [nfs4]
> [ 20.320459] [<ffffffffa025cee4>] ? nfs_alloc_fattr+0x24/0x80 [nfs]
> [ 20.320465] [<ffffffffa0293b36>] nfs4_server_common_setup+0x66/0xf0 [nfs4]
> [ 20.320471] [<ffffffffa029432d>] nfs4_create_server+0x1bd/0x2e0 [nfs4]
> [ 20.320474] [<ffffffff8116c81d>] ? __kmalloc_track_caller+0x13d/0x190
> [ 20.320480] [<ffffffffa028f808>] nfs4_remote_mount+0x38/0x70 [nfs4]
> [ 20.320483] [<ffffffff8117abe3>] mount_fs+0x43/0x1b0
> [ 20.320485] [<ffffffff81195156>] vfs_kern_mount+0x76/0x120
> [ 20.320491] [<ffffffffa028f785>] nfs_do_root_mount+0x95/0xe0 [nfs4]
> [ 20.320497] [<ffffffffa028fac0>] nfs4_try_mount+0x40/0x60 [nfs4]
> [ 20.320504] [<ffffffffa0262077>] nfs_fs_mount+0x487/0xa40 [nfs]
> [ 20.320512] [<ffffffffa02619e0>] ? nfs_clone_super+0x140/0x140 [nfs]
> [ 20.320519] [<ffffffffa0260270>] ? nfs_clone_sb_security+0x60/0x60 [nfs]
> [ 20.320521] [<ffffffff8117abe3>] mount_fs+0x43/0x1b0
> [ 20.320523] [<ffffffff811942c3>] ? find_filesystem+0x63/0x80
> [ 20.320526] [<ffffffff81195156>] vfs_kern_mount+0x76/0x120
> [ 20.320528] [<ffffffff811959c4>] do_kern_mount+0x54/0x110
> [ 20.320530] [<ffffffff81197424>] do_mount+0x1a4/0x8a0
> [ 20.320533] [<ffffffff811970fa>] ? copy_mount_options+0x3a/0x170
> [ 20.320535] [<ffffffff81197bb0>] sys_mount+0x90/0xe0
> [ 20.320538] [<ffffffff81620ba9>] system_call_fastpath+0x16/0x1b
> [ 20.320559] Code: ff 0f 1f 80 00 00 00 00 66 c7 07 00 00 83 ea 02 48 83 c7 02 e9 5e fd ff ff 0f 1f 80 00 00 00 00 41 bf f4 ff ff ff e9 0b fe ff ff <0f> 0b 0f 1f 00 55 48 89 e5 48 83 ec 60 48 89 5d d8 4c 89 65 e0
> [ 20.320565] RIP [<ffffffffa029108b>] nfs_idmap_legacy_upcall+0x34b/0x350 [nfs4]
> [ 20.320566] RSP <ffff880214e333e8>
> [ 20.320635] ---[ end trace 883f5b90b0291611 ]---
>


2012-09-27 16:16:29

by Myklebust, Trond

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On Thu, 2012-09-27 at 17:39 +0200, Joerg Roedel wrote:
> On Thu, Sep 27, 2012 at 03:32:02PM +0000, Myklebust, Trond wrote:
>
> > Does your checked out copy of 3.6-rc7 contain commit c50669 (NFS:
> > Clear key construction data if the idmap upcall fails)? The latter was
> > merged in 3.6-rc3, and is reported to fix the problem for the other
> > testers.
>
> Yes, it contains that commit. I was about to test plain v3.6-rc7 without
> my patches (not nfs related, of cource) on-top, but unfortunatly the
> disk with the root-fs died :-/
> I am about to set up the box again and test plain -rc7.

Please do.

I cannot see how that BUG_ON can be triggered in the current code, given
that the only place where idmap->idmap_key_cons is set to a non-NULL
value is covered by a mutex, and that it is always cleared before we
release said mutex.

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
www.netapp.com

2012-09-27 15:32:06

by Myklebust, Trond

Subject: RE: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

> -----Original Message-----
> From: Joerg Roedel [mailto:[email protected]]
> Sent: Thursday, September 27, 2012 10:52 AM
> To: Myklebust, Trond
> Cc: Joerg Roedel; [email protected]; [email protected];
> [email protected]
> Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!
>
> On Tue, Aug 07, 2012 at 03:41:56PM +0200, Joerg Roedel wrote:
> > starting with Linux 3.6-rc1 I experience this BUG on one of my test
> > machines. Please let me know if you need any additional information.
> >
> > [ 20.271810] ------------[ cut here ]------------
> > [ 20.276869] kernel BUG at
> /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!
> > [ 20.284306] invalid opcode: 0000 [#1] SMP
> > [ 20.288806] Modules linked in: nfs4 auth_rpcgss nfs fscache lockd sunrpc
> kvm_intel radeon kvm ttm drm_kms_helper i7core_edac drm edac_core
> joydev hpilo psmouse hid_generic i2c_algo_bit serio_raw usbhid hid bnx2
>
> I still see this BUG with 3.6-rc7. Any fix in sight?

Does your checked out copy of 3.6-rc7 contain commit c50669 (NFS: Clear key construction data if the idmap upcall fails)? The latter was merged in 3.6-rc3, and is reported to fix the problem for the other testers.

Trond

2012-09-27 18:11:40

by Joerg Roedel

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On Thu, Sep 27, 2012 at 04:16:25PM +0000, Myklebust, Trond wrote:
> > Yes, it contains that commit. I was about to test plain v3.6-rc7 without
> > my patches (not nfs related, of cource) on-top, but unfortunatly the
> > disk with the root-fs died :-/
> > I am about to set up the box again and test plain -rc7.
>
> Please do.

Okay, the box is running again and the bug does not reproduce anymore
with the new installation. Unfortunately the old kernel configuration is
gone together with the previous version of the nfs-utils :-/ I'll try to
play around with the kernel config a little bit more and see if it
reproduces again.


Joerg



2012-09-28 13:21:30

by Anna Schumaker

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On 09/28/2012 08:17 AM, Joerg Roedel wrote:
> On Thu, Sep 27, 2012 at 02:15:21PM -0400, Bryan Schumaker wrote:
>
>> Double check that you're using the legacy idmapper, and not the
>> keyring based one (/etc/request-key.conf shouldn't have the "create
>> id_resolver * * /usr/bin/nfsidmap %k %d" line).
>
> That should only be relevant for NFSv4, no? I am using v3 only (without
> explicitly setting the version as mount option).

Any idmapper problem is only relevant to v4, so if you're seeing a problem that means you're not using NFS v3.

>
> Anyway, I had no luck on reproducing the failure again after
> re-installation of the box, either with of with-out my patches on-top. I
> tried a few different config settings for NFS too, but the BUG didn't
> trigger again :-/

Maybe it's mounting over v3 now? Can you retry using "vers=4" in your mount options?

- Bryan

>
>
> Joerg
>
>


2012-09-27 15:01:36

by Joerg Roedel

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On Tue, Aug 07, 2012 at 03:41:56PM +0200, Joerg Roedel wrote:
> starting with Linux 3.6-rc1 I experience this BUG on one of my test
> machines. Please let me know if you need any additional information.
>
> [ 20.271810] ------------[ cut here ]------------
> [ 20.276869] kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!
> [ 20.284306] invalid opcode: 0000 [#1] SMP
> [ 20.288806] Modules linked in: nfs4 auth_rpcgss nfs fscache lockd sunrpc kvm_intel radeon kvm ttm drm_kms_helper i7core_edac drm edac_core joydev hpilo psmouse hid_generic i2c_algo_bit serio_raw usbhid hid bnx2

I still see this BUG with 3.6-rc7. Any fix in sight?


Joerg



2012-09-28 13:34:35

by Joerg Roedel

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On Fri, Sep 28, 2012 at 09:21:26AM -0400, Bryan Schumaker wrote:
> On 09/28/2012 08:17 AM, Joerg Roedel wrote:

> Maybe it's mounting over v3 now? Can you retry using "vers=4" in your mount options?

Tried that, mounting fails with 'No such file or directory'.


Joerg



2012-09-28 12:17:37

by Joerg Roedel

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On Thu, Sep 27, 2012 at 02:15:21PM -0400, Bryan Schumaker wrote:

> Double check that you're using the legacy idmapper, and not the
> keyring based one (/etc/request-key.conf shouldn't have the "create
> id_resolver * * /usr/bin/nfsidmap %k %d" line).

That should only be relevant for NFSv4, no? I am using v3 only (without
explicitly setting the version as mount option).

Anyway, I had no luck reproducing the failure again after
re-installation of the box, either with or without my patches on top. I
tried a few different config settings for NFS too, but the BUG didn't
trigger again :-/


Joerg



2012-09-27 16:59:35

by Linus Torvalds

Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On Thu, Sep 27, 2012 at 9:16 AM, Myklebust, Trond
<[email protected]> wrote:
>
> I cannot see how that BUG_ON can be triggered in the current code, given
> that the only place where idmap->idmap_key_cons is set to a non-NULL
> value is covered by a mutex, and that it is always cleared before we
> release said mutex.

Quite frankly, the "I cannot see" thing is *never* an excuse for a BUG_ON().

We don't do kernel-killing asserts in Linux. Never.

The only excuse for a BUG_ON() is "I cannot possibly continue, I don't
even have an error path I can take".

If it's a fundamentally impossible situation, the BUG_ON() should
never have been there in the first place!

And if it's a "I don't see how it could happen", then it should have
been something like

    if (WARN_ON_ONCE(condition))
        goto cleanup;

rather than a BUG_ON().

We have too many f*cking BUG_ON's in the kernel, and the fact that one
triggers and it has taken a month and a half without it even being
resolved is a problem.

Get rid of the thing, already, dammit. If you cannot figure out how it
can happen, then the *last* thing you want to do is then kill the
machine so that it's impossible to debug it sanely.

Besides, as far as I can tell, idmap_key_cons locking is suspect
anyway. Stuff like this:

                cons = ACCESS_ONCE(idmap->idmap_key_cons);
                idmap->idmap_key_cons = NULL;

is an almost certain example of "the code is racy, and we did it
wrong". The above is basically *never* correct.

If the access is properly locked, then the ACCESS_ONCE() is a bug.

And if the access *isn't* properly locked, then setting things to NULL
afterwards is in no way safe.

IOW, either way, it's broken. And there's at least two of those
clearly buggy code-sequences involving that field.

So get rid of the BUG_ON() (possibly replacing it with the
WARN_ON_ONCE), and please look at those ACCESS_ONCE() sequences and
fix them. Either they happen under a lock, or they don't. None of this
crazy racy crap, please.

Linus
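
The warn-and-bail pattern suggested in the message above can be sketched in
userspace C. This is only an illustration under stated assumptions, not the
actual fs/nfs/idmap.c code: warn_on_once(), struct idmap_sketch and
start_upcall() are hypothetical names, and warn_on_once() merely imitates the
kernel's WARN_ON_ONCE(), which also dumps a stack trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace stand-in for the kernel's WARN_ON_ONCE(): prints a warning the
 * first time the condition is true for the given flag and returns whether
 * the condition was true.
 */
static int warn_on_once(int cond, int *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = 1;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond != 0;
}

/* Hypothetical, simplified stand-in for the idmap state in this thread. */
struct idmap_sketch {
	void *key_cons;	/* non-NULL while an upcall is in flight */
};

/*
 * Begin an upcall. If one is already in flight (the "cannot happen" state),
 * warn once and fail the request instead of halting the machine.
 * Returns 0 on success, -1 on failure.
 */
static int start_upcall(struct idmap_sketch *idmap, void *cons)
{
	static int warned;	/* one flag for this call site */
	int ret = -1;

	if (warn_on_once(idmap->key_cons != NULL, &warned,
			 "key_cons already set"))
		goto out;

	idmap->key_cons = cons;
	ret = 0;
out:
	return ret;
}
```

With this shape, a second start_upcall() on a busy idmap fails with -1 and
leaves the existing key_cons untouched, so the machine stays up and the
unexpected state can still be inspected.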

2012-09-27 15:39:31

by Joerg Roedel

[permalink] [raw]
Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On Thu, Sep 27, 2012 at 03:32:02PM +0000, Myklebust, Trond wrote:

> Does your checked out copy of 3.6-rc7 contain commit c50669 (NFS:
> Clear key construction data if the idmap upcall fails)? The latter was
> merged in 3.6-rc3, and is reported to fix the problem for the other
> testers.

Yes, it contains that commit. I was about to test plain v3.6-rc7 without
my patches (not NFS related, of course) on top, but unfortunately the
disk with the root-fs died :-/
I am about to set up the box again and test plain -rc7.


Joerg



2012-09-27 18:15:23

by Anna Schumaker

[permalink] [raw]
Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On 09/27/2012 01:56 PM, Joerg Roedel wrote:
> On Thu, Sep 27, 2012 at 04:16:25PM +0000, Myklebust, Trond wrote:
>>> Yes, it contains that commit. I was about to test plain v3.6-rc7 without
>>> my patches (not NFS related, of course) on top, but unfortunately the
>>> disk with the root-fs died :-/
>>> I am about to set up the box again and test plain -rc7.
>>
>> Please do.
>
> Okay, the box is running again and the bug does not reproduce anymore
> with the new installation. Unfortunately the old kernel configuration is
> gone together with the previous version of the nfs-utils :-/ I'll try to
> play around with the kernel config a little bit more and see if it
> reproduces again.

Double check that you're using the legacy idmapper, and not the keyring based one (/etc/request-key.conf shouldn't have the "create id_resolver * * /usr/bin/nfsidmap %k %d" line).

- Bryan

>
>
> Joerg
>


2012-09-27 21:22:01

by Myklebust, Trond

[permalink] [raw]
Subject: Re: kernel BUG at /data/lemmy/linux.trees.git/fs/nfs/idmap.c:681!

On Thu, 2012-09-27 at 09:59 -0700, Linus Torvalds wrote:
> On Thu, Sep 27, 2012 at 9:16 AM, Myklebust, Trond
> <[email protected]> wrote:
> >
> > I cannot see how that BUG_ON can be triggered in the current code, given
> > that the only place where idmap->idmap_key_cons is set to a non-NULL
> > value is covered by a mutex, and that it is always cleared before we
> > release said mutex.
>
> Quite frankly, the "I cannot see" thing is *never* an excuse for a BUG_ON().
>
> We don't do kernel-killing asserts in Linux. Never.
>
> The only excuse for a BUG_ON() is "I cannot possibly continue, I don't
> even have an error path I can take".
>
> If it's a fundamentally impossible situation, the BUG_ON() should
> never have been there in the first place!
>
> And if it's a "I don't see how it could happen", then it should have
> been something like
>
>     if (WARN_ON_ONCE(condition))
>         goto cleanup;
>
> rather than a BUG_ON().
>
> We have too many f*cking BUG_ON's in the kernel, and the fact that one
> triggers and it has taken a month and a half without it even being
> resolved is a problem.
>
> Get rid of the thing, already, dammit. If you cannot figure out how it
> can happen, then the *last* thing you want to do is then kill the
> machine so that it's impossible to debug it sanely.
>
> Besides, as far as I can tell, idmap_key_cons locking is suspect
> anyway. Stuff like this:
>
>                 cons = ACCESS_ONCE(idmap->idmap_key_cons);
>                 idmap->idmap_key_cons = NULL;
>
> is an almost certain example of "the code is racy, and we did it
> wrong". The above is basically *never* correct.
>
> If the access is properly locked, then the ACCESS_ONCE() is a bug.
>
> And if the access *isn't* properly locked, then setting things to NULL
> afterwards is in no way safe.
>
> IOW, either way, it's broken. And there's at least two of those
> clearly buggy code-sequences involving that field.
>
> So get rid of the BUG_ON() (possibly replacing it with the
> WARN_ON_ONCE), and please look at those ACCESS_ONCE() sequences and
> fix them. Either they happen under a lock, or they don't. None of this
> crazy racy crap, please.

The ACCESS_ONCE is a bug. It isn't needed, since the whole upcall is
covered by the idmap_mutex. I'll remove it and the BUG_ON() come the
merge window (or I can send a patch sooner if you care).

The downcall is ordered w.r.t. idmap_pipe_destroy_msg() (which clears
idmap->idmap_key_cons only on upcall failure). The downcall is also
protected against collisions with idmap_release_pipe() by means of the
inode->i_mutex. Finally, the call to request_key() (during which the
idmap_mutex is held) will always do an uninterruptible wait for one of
those 3 functions to complete.
IOW: all the functions that manipulate idmap->idmap_key_cons are
theoretically ordered w.r.t. each other. That's what I mean when I say
that I really don't understand why this is happening.

So the first thing to do is to try a _vanilla_ 3.6-rc7, and see if the
problem is reproducible without Joerg's extra patches.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
www.netapp.com
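
The "read the pointer, then set it to NULL" sequence discussed in this thread
is only safe when every access to the field happens under the same lock, and
in that case a plain load is enough. A minimal userspace sketch of that
arrangement follows, assuming a pthread mutex in place of the kernel mutex.
struct idmap_sketch, set_key_cons() and take_key_cons() are hypothetical
names and do not reproduce the real fs/nfs/idmap.c functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the idmap state in this thread. */
struct idmap_sketch {
	pthread_mutex_t lock;	/* protects key_cons */
	void *key_cons;		/* non-NULL while an upcall is in flight */
};

/*
 * Publish construction data for an upcall.
 * Returns 0 on success, -1 if another upcall is already in flight.
 */
static int set_key_cons(struct idmap_sketch *idmap, void *cons)
{
	int ret = -1;

	pthread_mutex_lock(&idmap->lock);
	if (idmap->key_cons == NULL) {
		idmap->key_cons = cons;
		ret = 0;
	}
	pthread_mutex_unlock(&idmap->lock);
	return ret;
}

/*
 * Take ownership of the construction data and clear the field.
 * Both steps run under the lock, so two callers can never obtain the
 * same pointer and no ACCESS_ONCE()-style annotation is needed.
 */
static void *take_key_cons(struct idmap_sketch *idmap)
{
	void *cons;

	pthread_mutex_lock(&idmap->lock);
	cons = idmap->key_cons;		/* plain load: the lock orders it */
	idmap->key_cons = NULL;
	pthread_mutex_unlock(&idmap->lock);
	return cons;
}
```

Because the load and the clear form one critical section, a second
take_key_cons() call returns NULL rather than a stale pointer.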