Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752233AbYCLQAZ (ORCPT ); Wed, 12 Mar 2008 12:00:25 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751145AbYCLQAM (ORCPT ); Wed, 12 Mar 2008 12:00:12 -0400 Received: from mail.fieldses.org ([66.93.2.214]:34625 "EHLO fieldses.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751057AbYCLQAK (ORCPT ); Wed, 12 Mar 2008 12:00:10 -0400 Date: Wed, 12 Mar 2008 12:00:07 -0400 To: Lukas Hejtmanek Cc: nfsv4@linux-nfs.org, linux-kernel@vger.kernel.org, Neil Brown Subject: Re: Oops in NFSv4 server in 2.6.23.17 Message-ID: <20080312160007.GA10015@fieldses.org> References: <20080312122550.GB8141@ics.muni.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20080312122550.GB8141@ics.muni.cz> User-Agent: Mutt/1.5.17+20080114 (2008-01-14) From: "J. Bruce Fields" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 9520 Lines: 206 On Wed, Mar 12, 2008 at 01:25:50PM +0100, Lukas Hejtmanek wrote: > Hello, > > I'm running NFSv4 server (nfs-utils 1.1.1) with vanila kernel 2.6.23.17. > It works while I use the server from standard ethernet lines. > > However, I tried to mount the volume over GPRS network. For the first time, it > replies premission denied and the second try results in the following oops. > > Anything I should try? > > I guess that the last oops is induced from the first one due to some memory > corruption. > > WARNING: at lib/kref.c:33 kref_get() Hm. So looks like that means kref_get() was called on something with a zero reference count. > > Call Trace: > [] kref_get+0x35/0x40 > [] :sunrpc:sunrpc_cache_lookup+0x64/0x160 So sunrpc_cache_lookup found something with a zero reference count, in... > [] :nfsd:svc_export_lookup+0xc0/0xd0 ... the has table that keeps the kernel's cached view of the export table. > [] :nfsd:exp_get_by_name+0x2f/0x80 > [] :sunrpc:sunrpc_cache_lookup+0x64/0x160 > [] :sunrpc:cache_check+0x49/0x490 > [] :nfsd:exp_find_key+0x57/0xc0 > [] :nfsd:exp_find+0x44/0xa0 > [] :nfsd:rqst_exp_find+0x9e/0xf0 > [] :nfsd:fh_verify+0x425/0x540 > [] :nfsd:nfsd4_proc_compound+0x1de/0x330 > [] :nfsd:nfsd_dispatch+0xb1/0x250 > [] :sunrpc:svc_process+0x491/0x7d0 > [] __down_read+0x12/0xb0 > [] :nfsd:nfsd+0x0/0x2f0 > [] :nfsd:nfsd+0x1a0/0x2f0 > [] child_rip+0xa/0x12 > [] :nfsd:nfsd+0x0/0x2f0 > [] :nfsd:nfsd+0x0/0x2f0 > [] child_rip+0x0/0x12 > > WARNING: at lib/kref.c:33 kref_get() > > Call Trace: > [] kref_get+0x35/0x40 > [] :sunrpc:sunrpc_cache_lookup+0x64/0x160 > [] :nfsd:svc_export_lookup+0xc0/0xd0 > [] :nfsd:exp_get_by_name+0x2f/0x80 > [] :sunrpc:sunrpc_cache_lookup+0x64/0x160 > [] :sunrpc:cache_check+0x49/0x490 > [] :nfsd:exp_find_key+0x57/0xc0 > [] :nfsd:exp_find+0x44/0xa0 > [] :nfsd:rqst_exp_find+0x9e/0xf0 > [] :nfsd:fh_verify+0x425/0x540 > [] :nfsd:nfsd4_proc_compound+0x1de/0x330 > [] :nfsd:nfsd_dispatch+0xb1/0x250 > [] :sunrpc:svc_process+0x491/0x7d0 > [] __down_read+0x12/0xb0 > [] :nfsd:nfsd+0x0/0x2f0 > [] :nfsd:nfsd+0x1a0/0x2f0 > [] child_rip+0xa/0x12 > [] :nfsd:nfsd+0x0/0x2f0 > [] :nfsd:nfsd+0x0/0x2f0 > [] child_rip+0x0/0x12 > > WARNING: at lib/kref.c:33 kref_get() > > Call Trace: > [] kref_get+0x35/0x40 > [] :sunrpc:sunrpc_cache_lookup+0x64/0x160 > [] :nfsd:svc_export_lookup+0xc0/0xd0 > [] :nfsd:exp_get_by_name+0x2f/0x80 > [] :sunrpc:sunrpc_cache_lookup+0x64/0x160 > [] :sunrpc:cache_check+0x49/0x490 > [] :nfsd:exp_find_key+0x57/0xc0 > [] :nfsd:exp_find+0x44/0xa0 > [] :nfsd:rqst_exp_find+0x9e/0xf0 > [] :nfsd:fh_verify+0x425/0x540 > [] :nfsd:nfsd4_proc_compound+0x1de/0x330 > [] :nfsd:nfsd_dispatch+0xb1/0x250 > [] :sunrpc:svc_process+0x491/0x7d0 > [] __down_read+0x12/0xb0 > [] :nfsd:nfsd+0x0/0x2f0 > [] :nfsd:nfsd+0x1a0/0x2f0 > [] child_rip+0xa/0x12 > [] :nfsd:nfsd+0x0/0x2f0 > [] :nfsd:nfsd+0x0/0x2f0 > [] child_rip+0x0/0x12 > > general protection fault: 0000 [1] SMP > CPU 0 > Modules linked in: des cbc blkcipher nfsd exportfs rpcsec_gss_krb5 auth_rpcgss nfs lockd nfs_acl sunrpc ipv6 dm_snapshot dm_mirror loop dm_round_robin dm_multipath dm_mod ata_piix ata_generic thermal mptfc libata processor scsi_transport_fc ehci_hcd e1000 evdev serio_raw psmouse uhci_hcd i2c_i801 i2c_core > Pid: 2863, comm: nfsd Not tainted 2.6.23.17 #1 > RIP: 0010:[] [] :auth_rpcgss:gss_get_mic+0x3/0x10 > RSP: 0018:ffff81025c1d1c78 EFLAGS: 00010286 > RAX: 762aba3d00000004 RBX: ffff81025b19f240 RCX: ffff81025c05c000 > RDX: ffff81025c1d1cd0 RSI: ffff81025c1d1c80 RDI: ffff81025effa600 > RBP: ffff81025c05c014 R08: 0000000000000000 R09: 0000000000000004 > R10: 0000000000000000 R11: 0000000000000000 R12: ffff81025effa600 > R13: ffff81025b596000 R14: ffff81025c05c018 R15: ffff81025b596130 > FS: 00002b540c07b6d0(0000) GS:ffffffff805fe000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > CR2: 00007fffa57e00e8 CR3: 000000025bb40000 CR4: 00000000000006e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process nfsd (pid: 2863, threadinfo ffff81025c1d0000, task ffff81025fe0c0c0) > Stack: ffffffff8819c175 ffff81025c1d1cec 0000000000000004 0000000000000000 > 0000000000000000 0000000000000000 0000000000000000 0000000400000004 > 0000000000000000 ffff81025c1d1cec 0000000000000004 ffff81025effa600 > Call Trace: > [] :auth_rpcgss:gss_write_verf+0xa5/0x130 In this case it looks like something in the gss context cache is bad. I'm not sure what's going on. >From a quick gitk v2.6.23..origin fs/nfsd net/sunrpc/cache.c include/linux/sunrpc/cache.h I don't see any recent commits that would address a problem like this. But it might be worth trying to reproduce this in v2.6.25-rc5. It'd also be interesting to know whether this is reproduceable with auth_unix (as opposed to auth_gss) access, and/or with nfsv3 as opposed to nfsv4. --b. > [] :auth_rpcgss:svcauth_gss_accept+0x8b8/0xc30 > [] :sunrpc:svc_sock_enqueue+0x19f/0x340 > [] :sunrpc:svc_tcp_recvfrom+0x81/0x980 > [] del_timer_sync+0x10/0x20 > [] schedule_timeout+0x67/0xd0 > [] process_timeout+0x0/0x10 > [] schedule_timeout+0x5a/0xd0 > [] :sunrpc:svc_process+0x378/0x7d0 > [] default_wake_function+0x0/0x10 > [] __down_read+0x12/0xb0 > [] :nfsd:nfsd+0x0/0x2f0 > [] :nfsd:nfsd+0x1a0/0x2f0 > [] child_rip+0xa/0x12 > [] :nfsd:nfsd+0x0/0x2f0 > [] :nfsd:nfsd+0x0/0x2f0 > [] child_rip+0x0/0x12 > > > Code: 48 8b 40 30 4c 8b 58 08 41 ff e3 66 90 48 8b 07 48 8b 40 30 > RIP [] :auth_rpcgss:gss_get_mic+0x3/0x10 > RSP > > > kernel: invalid opcode: 0000 [2] SMP > CPU 1 > Modules linked in: des cbc blkcipher nfsd exportfs rpcsec_gss_krb5 auth_rpcgss nfs lockd nfs_acl sunrpc ipv6 dm_snapshot dm_mirror loop dm_round_robin dm_multipath dm_mod ata_piix serio_raw mptfc psmouse scsi_transport_fc evdev ata_generic libata i2c_i801 e1000 i2c_core ehci_hcd uhci_hcd thermal processor > Pid: 2748, comm: multipathd Tainted: G D 2.6.23.17 #1 > RIP: 0010:[] [] cache_alloc_refill+0x1ad/0x540 > RSP: 0018:ffff810259aafe68 EFLAGS: 00010046 > RAX: ffff81025e494040 RBX: 0000000000000020 RCX: 000000000000001c > RDX: ffff81025e494040 RSI: 000000000000000f RDI: ffff81025e810040 > RBP: ffff81025fc6ec00 R08: ffff81025e810070 R09: 0000000000000000 > R10: 0000000000159849 R11: 0000000000000000 R12: ffff81025fc00680 > R13: ffff81025fc023c0 R14: 0000000000000000 R15: ffff81025fc0d000 > FS: 0000000040061960(0063) GS:ffff81025fc6ce40(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > CR2: 00002ba20d8f4e30 CR3: 0000000259b10000 CR4: 00000000000006e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process multipathd (pid: 2748, threadinfo ffff810259aae000, task ffff81025fd06040) > Stack: 000000d000000010 ffffffff80291d6a 000000d0000080d0 ffff810259b19000 > 0000000000000296 00000000000080d0 ffff81025fc00680 00007fff9d909298 > 0000000040060e10 ffffffff80291cd2 0000000040062000 ffff810259b19000 > Call Trace: > [] cache_alloc_refill+0x7a/0x540 > [] kmem_cache_alloc+0x82/0xa0 > [] do_execve+0x46/0x230 > [] sys_execve+0x44/0xb0 > [] stub_execve+0x67/0xb0 > > > Code: 0f 0b eb fe 49 8b 45 10 49 8d 55 10 48 89 78 08 48 89 07 48 > RSP > > > -- > Lukáš Hejtmánek > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/