Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1761339AbYFDS1R (ORCPT );
	Wed, 4 Jun 2008 14:27:17 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1756689AbYFDS1E (ORCPT );
	Wed, 4 Jun 2008 14:27:04 -0400
Received: from mx1.redhat.com ([66.187.233.31]:44498 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755428AbYFDS1C (ORCPT );
	Wed, 4 Jun 2008 14:27:02 -0400
Date: Wed, 4 Jun 2008 14:20:15 -0400
From: Dave Jones
To: Chuck Lever
Cc: Trond Myklebust, chucklever@gmail.com, Linux Kernel
Subject: Re: NFS oops in 2.6.26rc4
Message-ID: <20080604182015.GA20074@redhat.com>
Mail-Followup-To: Dave Jones, Chuck Lever, Trond Myklebust,
	chucklever@gmail.com, Linux Kernel
References: <20080527190419.GA14577@redhat.com>
	<76bd70e30805301059v461e8f0eocc38b6fe36dd3b21@mail.gmail.com>
	<20080530182126.GB32480@redhat.com>
	<1212172308.8579.20.camel@localhost>
	<20080530190314.GC32480@redhat.com>
	<20080604141958.GA14148@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2021
Lines: 48

On Wed, Jun 04, 2008 at 02:13:08PM -0400, Chuck Lever wrote:
>
> On Jun 4, 2008, at 10:19 AM, Dave Jones wrote:
>
> > On Fri, May 30, 2008 at 03:37:01PM -0400, Chuck Lever wrote:
> >
> >>> Something else of note which I hadn't seen before, usually things
> >>> lock
> >>> up just after that first oops. For some reason, today it survived
> >>> a little longer, but things really went downhill fast.
> >>> It survived a 'dmesg ; scp dmesg davej@gelk', and then wedged solid.
> >>> So as well as the oops, it seems we're corrupting memory too.
> >>> For reference, this kernel has both SLUB_DEBUG and PAGEALLOC_DEBUG
> >>> enabled.
> >>
> >> I haven't seen this kind of problem here with .26, but yes, it does
> >> look like something is clobbering memory during an NFS mount.
> >>
> >> I introduced some NFS mount parsing changes in this commit range:
> >>
> >> 2d767432..82d101d5
> >>
> >> A quick bisect should show which, if any of these, is the guilty
> >> party. If any of these are the problem, I suspect it's 3f8400d1.
> >
> > I didn't get time to try this out yet (hopefully tomorrow).
> > In the meantime, we've just gotten word of another user seeing memory
> > corruption with nfs - https://bugzilla.redhat.com/show_bug.cgi?id=449958
>
> 449958 could very well be the same problem. The stack traceback is a
> lot cleaner than the one you originally sent, but there are a lot of
> similarities. (I doubt this is related to symlinks, as the comment
> suggests).
>
> Is commit 86d61d863 applied to the current rawhide kernel?

That kernel was .26rc4.git2, so unless it's only gone in in the last day or two, yes.
(Bandwidth impaired right now, and no local git repo to check)

	Dave

--
http://www.codemonkey.org.uk