From: Aioanei Rares
Subject: Re: BUG: oops in gss_validate on 2.6.31
Date: Wed, 16 Sep 2009 12:39:23 +0000 (UTC)
Message-ID: <4AB22E13.6020501__15933.7515513927$1253104763$gmane$org@gmail.com>
References: <20090916102953.GA18674@wavehammer.waldi.eu.org> <1253104249.5154.3.camel@heimdal.trondhjem.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII; format=flowed
Cc: Bastian Blank, linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org
To: Trond Myklebust
Return-path:
Date: Thu, 17 Sep 2009 15:39:47 +0300
In-Reply-To: <1253104249.5154.3.camel@heimdal.trondhjem.org>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:

Trond Myklebust wrote:
> On Wed, 2009-09-16 at 12:29 +0200, Bastian Blank wrote:
>
>> Hi
>>
>> Since 2.6.31 my gssapi authenticated nfs oopses.
>>
>> BUG: unable to handle kernel NULL pointer dereference at 00000010
>> IP: [] gss_validate+0xad/0x175 [auth_rpcgss]
>> *pdpt = 0000000001473001 *pde = 0000000000000000
>> Oops: 0000 [#1] SMP
>> last sysfs file: /sys/devices/virtual/block/dm-13/range
>> Modules linked in: kvm_intel kvm ext4 jbd2 crc16 usb_storage usbhid hid i915 drm i2c_algo_bit sco bridge stp bnep rfcomm l2cap xt_mac ipt_REJECT xt_tcpudp xt_conntrack iptable_filter ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables tun nfsd exportfs nfs lockd fscache nfs_acl deflate zlib_deflate ctr twofish twofish_common camellia serpent blowfish cast5 des_generic xcbc rmd160 sha1_generic hmac crypto_null af_key fuse rpcsec_gss_krb5 auth_rpcgss sunrpc loop acpi_cpufreq arc4 snd_hda_codec_analog ecb snd_hda_intel snd_hda_codec iwl3945 snd_hwdep iwlcore snd_pcm snd_seq snd_timer thinkpad_acpi snd_seq_device nsc_ircc i2c_i801 btusb mac80211 i2c_core serio_raw snd soundcore battery button psmouse processor rng_core snd_page_alloc evdev nvram ac cfg80211 bluetooth irda rfkill crc_ccitt ext3 jbd mbcache sha256_generic aes_i586 aes_generic cbc dm_crypt dm_mod sd_mod crc_t10dif ata_generic
ide_pci_generic ahci libata scsi_mod sdhci_pci piix sdhci firewire_ohci firewire_core crc_itu_t ide_core mmc_core led_class uhci_hcd ehci_hcd usbcore nls_base e1000e intel_agp agpgart video output thermal fan thermal_sys [last unloaded: kvm]
>>
>> Pid: 2025, comm: rpciod/0 Not tainted (2.6.31-trunk-686-bigmem #1) 170255G
>> EIP: 0060:[] EFLAGS: 00010246 CPU: 0
>> EIP is at gss_validate+0xad/0x175 [auth_rpcgss]
>> EAX: d5d7e830 EBX: f60f5ef8 ECX: f60f5ee4 EDX: f60f5ef8
>> ESI: 00000025 EDI: 00000000 EBP: cdc30bc0 ESP: f60f5edc
>> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
>> Process rpciod/0 (pid: 2025, ti=f60f4000 task=f6685ae0 task.ti=f60f4000)
>> Stack:
>> f5c512c0 d5d7e830 00000025 d5d7e830 f60f5ef4 00000004 9cd00000 f60f5ef4
>> <0> 00000004 00000000 00000000 00000000 00000001 00000000 f847888c 00000004
>> <0> 00000004 be91f5c4 cdc30bc0 f5c512c0 d5d7e828 f3c807f8 f8d9de34 be91f5c4
>> Call Trace:
>> [] ? rpcauth_checkverf+0x4a/0x60 [sunrpc]
>> [] ? call_decode+0x30f/0x5de [sunrpc]
>> [] ? rpcproc_decode_null+0x0/0x21 [sunrpc]
>> [] ? __rpc_execute+0x76/0x21e [sunrpc]
>> [] ? worker_thread+0x146/0x1d9
>> [] ? rpc_async_schedule+0x0/0x29 [sunrpc]
>> [] ? autoremove_wake_function+0x0/0x4f
>> [] ? worker_thread+0x0/0x1d9
>> [] ? kthread+0x7a/0x7f
>> [] ? kthread+0x0/0x7f
>> [] ? kernel_thread_helper+0x7/0x10
>> Code: 24 18 89 da 89 44 24 10 8d 44 24 10 c7 44 24 14 04 00 00 00 e8 a4 f0 fc ff 89 da 8b 44 24 04 89 74 24 08 8d 4c 24 08 89 44 24 0c <8b> 47 10 e8 3c 16 00 00 3d 00 00 0c 00 89 c2 75 0a 8d 45 28 f0
>> EIP: [] gss_validate+0xad/0x175 [auth_rpcgss] SS:ESP 0068:f60f5edc
>> CR2: 0000000000000010
>> ---[ end trace 92895856d62132dd ]---
>>
>> I saw this two times in the last days. Always under load. I've never
>> seen this with 2.6.30. The server is a 2.6.30 machine.
>>
>
> Hmm... I don't see any obvious candidates in the changelog. My only
> guess is that something is amiss after the merge of the nfsv4.1
> backchannel code.
>
> Would you be able to do a git bisect in order to finger the culprit?
>
> Cheers
>   Trond
>

Guess I'll have to look into some manuals, since my git-fu is weak, and I'll get back to you. Meanwhile I'll test my .config with a release kernel.

Thanks,