From: Bernd Schubert
To: "J. Bruce Fields"
Cc: Neil Brown, Michael Shuey, Shehjar Tikoo, linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org, rees@citi.umich.edu, aglo@citi.umich.edu
Subject: Re: high latency NFS
Date: Mon, 4 Aug 2008 11:18:18 +0200
Message-Id: <200808041118.19743.bs@q-leap.de>
In-Reply-To: <20080804011158.GA8066@fieldses.org>

On Monday 04 August 2008 03:11:58 J. Bruce Fields wrote:
> On Mon, Aug 04, 2008 at 10:32:06AM +1000, Dave Chinner wrote:
> > On Fri, Aug 01, 2008 at 03:15:59PM -0400, J. Bruce Fields wrote:
> > > On Fri, Aug 01, 2008 at 05:23:20PM +1000, Dave Chinner wrote:
> > > > On Thu, Jul 31, 2008 at 05:03:05PM +1000, Neil Brown wrote:
> > > > > You might want to track the max length of the request queue too and
> > > > > start more threads if the queue is long, to allow a quick ramp-up.
> > > >
> > > > Right, but even request queue depth is not a good indicator. You
> > > > need to keep track of how many NFSDs are actually doing useful
> > > > work.
> > > > That is, if you've got an NFSD on the CPU that is hitting
> > > > the cache and not blocking, you don't need more NFSDs to handle
> > > > that load because they can't do any more work than the NFSD
> > > > that is currently running is.
> > > >
> > > > i.e. take the solution that Greg Banks used for the CPU scheduler
> > > > overload issue (limiting the number of nfsds woken but not yet on
> > > > the CPU),
> > >
> > > I don't remember that, or wasn't watching when it happened.... Do you
> > > have a pointer?
> >
> > Ah, I thought that had been sent to mainline because it was
> > mentioned in his LCA talk at the start of the year. Slides
> > 65-67 here:
> >
> > http://mirror.linux.org.au/pub/linux.conf.au/2007/video/talks/41.pdf
>
> OK, so to summarize: when the rate of incoming rpc's is very high (and,
> I guess, when we're serving everything out of cache and don't have IO
> wait), all the nfsd threads will stay runnable all the time. That keeps
> userspace processes from running (possibly for "minutes"). And that's a
> problem even on a server dedicated only to nfs, since it affects portmap
> and rpc.mountd.

Even worse, it affects user-space HA software such as heartbeat, and
everyone with reasonable timeouts will see spurious 'failures'.

--
Bernd Schubert
Q-Leap Networks GmbH
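
P.S. For anyone following the thread, here is a rough toy model of the idea quoted above (limiting the number of nfsds woken but not yet on the CPU). It is only a sketch in Python, not the kernel code and not Greg Banks' patch: the class name WakeThrottle, the max_woken tunable and all method names are made up for illustration.

```python
from collections import deque


class WakeThrottle:
    """Toy model: cap the number of worker threads woken but not yet on a CPU."""

    def __init__(self, idle_threads, max_woken):
        self.idle = idle_threads    # threads asleep, waiting for work
        self.woken = 0              # woken, but not yet scheduled on a CPU
        self.running = 0            # on a CPU, handling a request
        self.max_woken = max_woken  # hypothetical tunable, not a real nfsd knob
        self.queue = deque()        # requests not yet picked up by a thread

    def _maybe_wake(self):
        # Wake at most one more thread, and only while the number of
        # woken-but-not-running threads is below both the cap and the
        # number of queued requests.
        limit = min(self.max_woken, len(self.queue))
        if self.idle > 0 and self.woken < limit:
            self.idle -= 1
            self.woken += 1

    def enqueue(self, request):
        # A new request arrives; it may or may not trigger a wakeup.
        self.queue.append(request)
        self._maybe_wake()

    def thread_scheduled(self):
        # A woken thread finally gets the CPU and takes the oldest request.
        if self.woken == 0:
            return None
        self.woken -= 1
        self.running += 1
        request = self.queue.popleft()
        self._maybe_wake()
        return request

    def thread_done(self):
        # A running thread finishes its request and goes back to sleep.
        self.running -= 1
        self.idle += 1
        self._maybe_wake()


throttle = WakeThrottle(idle_threads=8, max_woken=2)
for n in range(5):
    throttle.enqueue(n)
first = throttle.thread_scheduled()
```

With the cap at 2, a burst of five requests wakes only two threads; the rest of the pool stays asleep until a woken thread actually reaches the CPU, so the run queue is not flooded with runnable nfsds.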