To: linux-kernel@vger.kernel.org
From: Zdenek Kaspar
Cc: "J. Bruce Fields", Justin Piszcz
Subject: Re: 2.6.29.1: nfsd: page allocation failure - nfsd or kernel problem?
Date: Thu, 18 Jun 2009 02:14:17 +0200
Message-ID: <4A3986D9.5020204@gmail.com>
In-Reply-To: <4A395119.5060108@msgid.tls.msk.ru>
References: <4A37FE48.6070306@msgid.tls.msk.ru>
 <4A38ACC0.3060501@msgid.tls.msk.ru>
 <4A38C7CA.7040005@msgid.tls.msk.ru>
 <20090617185139.GF24040@fieldses.org>
 <4A395119.5060108@msgid.tls.msk.ru>

Michael Tokarev wrote:
> J. Bruce Fields wrote:
>> On Wed, Jun 17, 2009 at 02:39:06PM +0400, Michael Tokarev wrote:
>>> Justin Piszcz wrote:
>>>>
>>>> On Wed, 17 Jun 2009, Michael Tokarev wrote:
>>>>
>>>>> Michael Tokarev wrote:
>>>>>> Justin Piszcz wrote:
>>>>> ...
>>>>>
>>>>> Justin, by the way, what's the underlying filesystem on the server?
>>>>>
>>>>> I've seen this error on 2 machines already (both running 2.6.29.x
>>>>> x86-64),
>>>>> and in both cases the filesystem on the server was xfs. May this be
>>>>> related somehow to http://bugzilla.kernel.org/show_bug.cgi?id=13375 ?
>>>>> That one is different, but also about xfs and nfs.
>>>>> I'm trying to
>>>>> reproduce the problem on different filesystem...
>>>> Hello, I am also running XFS on 2.6.29.x x86-64.
>>>>
>>>> For me, the error happened when I was running an XFSDUMP from a
>>>> client (and dumping) the stream over NFS to the XFS
>>>> server/filesystem. This is typically when the error occurs or
>>>> during heavy I/O.
>>> Very similar load was here -- not xfsdump but tar and dump of an ext3
>>> filesystems.
>>>
>>> And no, it's NOT xfs-related: I can trigger the same issue easily on
>
> Note the NOT, in upper case ;)
>
>>> ext4 as well. About 20 minutes of running 'dump' of another fs
>>> to the nfs mount and voila, nfs server reports the same page allocation
>>> failure. Note that all file operations are still working, i.e. it
>>> produces good (not corrupted) files on the server.
>>
>> There's a possibly related report for 2.6.30 here:
>>
>> http://bugzilla.kernel.org/show_bug.cgi?id=13518
>
> Does not look similar.
>
> I repeated the issue here. The slab which is growing here is buffer_head.
> It's growing slowly -- right now, after ~5 minutes of constant writes over
> nfs, its size is 428423 objects, growing at about 5000 objects/minute rate.
> When stopping writing, the cache shrinks slowly back to an acceptable
> size, probably when the data gets actually written to disk.
>
> It looks like we need a bug entry for this :)
>
> I'll re-try 2.6.30 hopefully tomorrow.
>
> /mjt

Can you check whether increasing vm.min_free_kbytes helps?

I "temp-fixed" heavy I/O problems with vm.min_free_kbytes=32768 on a
machine with 4G of memory.

Z.
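
For anyone who wants to watch the buffer_head slab while reproducing this, here is a minimal sketch in Python. It is not from the thread: the function names are made up for illustration, and it assumes the usual /proc/slabinfo layout in which the second column of each cache line is the active object count. Reading /proc/slabinfo normally needs root.

```python
def active_objects(slabinfo_text, cache_name="buffer_head"):
    """Return the active object count for one slab cache, or None if absent.

    Assumes the common /proc/slabinfo layout:
        name <active_objs> <num_objs> <objsize> ...
    """
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Header lines start with "slabinfo" or "#", so they never match.
        if len(fields) >= 2 and fields[0] == cache_name:
            return int(fields[1])
    return None


def growth_per_minute(count_before, count_after, seconds_elapsed):
    """Convert two object counts taken seconds_elapsed apart to objects/minute."""
    if seconds_elapsed <= 0:
        raise ValueError("seconds_elapsed must be positive")
    return (count_after - count_before) * 60.0 / seconds_elapsed


def read_active_objects(path="/proc/slabinfo", cache_name="buffer_head"):
    """Take one sample from the live system (usually requires root)."""
    with open(path) as slabinfo:
        return active_objects(slabinfo.read(), cache_name)
```

Calling read_active_objects() twice, a known number of seconds apart, and passing both counts to growth_per_minute() gives a figure comparable to the "about 5000 objects/minute" quoted above. The setting suggested in the reply can be tried at runtime with `sysctl -w vm.min_free_kbytes=32768`; whether that value suits a given machine is something to test, not a recommendation.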