From: Chuck Lever
To: Dave Jones
Cc: Trond Myklebust, chucklever@gmail.com, Linux Kernel
Subject: Re: NFS oops in 2.6.26rc4
Date: Wed, 4 Jun 2008 14:13:08 -0400

On Jun 4, 2008, at 10:19 AM, Dave Jones wrote:
> On Fri, May 30, 2008 at 03:37:01PM -0400, Chuck Lever wrote:
>
>>> Something else of note which I hadn't seen before, usually things lock
>>> up just after that first oops. For some reason, today it survived
>>> a little longer, but things really went downhill fast.
>>> It survived a 'dmesg ; scp dmesg davej@gelk', and then wedged solid.
>>> So as well as the oops, it seems we're corrupting memory too.
>>> For reference, this kernel has both SLUB_DEBUG and PAGEALLOC_DEBUG
>>> enabled.
>>
>> I haven't seen this kind of problem here with .26, but yes, it does
>> look like something is clobbering memory during an NFS mount.
>>
>> I introduced some NFS mount parsing changes in this commit range:
>>
>> 2d767432..82d101d5
>>
>> A quick bisect should show which, if any of these, is the guilty
>> party. If any of these are the problem, I suspect it's 3f8400d1.
>
> I didn't get time to try this out yet (hopefully tomorrow).
> In the meantime, we've just gotten word of another user seeing memory
> corruption with nfs - https://bugzilla.redhat.com/show_bug.cgi?id=449958

449958 could very well be the same problem. The stack traceback is a
lot cleaner than the one you originally sent, but there are a lot of
similarities. (I doubt this is related to symlinks, as the comment
suggests).

Is commit 86d61d863 applied to the current rawhide kernel?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com