Subject: Re: NFS client latencies
From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: Ingo Molnar <mingo@elte.hu>
Cc: Lee Revell <rlrevell@joe-job.com>,
       linux-kernel <linux-kernel@vger.kernel.org>
In-Reply-To: <20050330080224.GB19683@elte.hu>
References: <1112137487.5386.33.camel@mindpipe>
	 <1112138283.11346.2.camel@lade.trondhjem.org>
	 <1112139155.5386.35.camel@mindpipe>
	 <1112139263.11892.0.camel@lade.trondhjem.org>
	 <20050330080224.GB19683@elte.hu>
Content-Type: text/plain
Date: Wed, 30 Mar 2005 09:11:00 -0500
Message-Id: <1112191860.10634.29.camel@lade.trondhjem.org>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1805
Lines: 39

On Wed, 30.03.2005 at 10:02 (+0200), Ingo Molnar wrote:

> the comment suggests that this is optimized for append writes (which is 
> quite common, but by far not the only write workload) - but the 
> worst-case behavior of this code is very bad. How about disabling this 
> sorting altogether and benchmarking the result? Maybe it would get 
> comparable coalescing (higher levels do coalesce after all), but vastly 
> improved CPU utilization on the client side. (Note that the server 
> itself will do sorting of any write IO anyway, if this is to hit any 
> persistent storage - and if not then sorting so aggressively on the 
> client side makes little sense.)

No. Coalescing on the client makes tons of sense. The overhead of
sending 8 RPC requests for 4k writes instead of sending 1 RPC request
for a single 32k write is huge: among other things, you end up tying up
8 RPC slots on the client + 8 nfsd threads on the server instead of just
one of each.
You also end up allocating 8 times as much memory for supporting
structures such as rpc_tasks. That sucks when you're in a low memory
situation and are trying to push dirty pages to the server as fast as
possible...
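
To put rough numbers on that, here is a minimal userspace sketch (not
kernel code). It assumes 4k pages, a sorted array of dirty page indices,
and that only contiguous pages may share one write request. The names
count_coalesced_requests and PAGE_SIZE_BYTES are made up for this
illustration:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096UL	/* assumed page size for this sketch */

/*
 * Count the write requests needed for n dirty pages whose indices are
 * given in ascending order. Contiguous pages are merged into one
 * request until the request reaches wsize bytes.
 */
static size_t count_coalesced_requests(const unsigned long *idx, size_t n,
				       unsigned long wsize)
{
	size_t max_pages, requests = 0, in_req = 0;

	/* A request must be able to carry at least one page. */
	assert(wsize >= PAGE_SIZE_BYTES);
	max_pages = wsize / PAGE_SIZE_BYTES;

	for (size_t i = 0; i < n; i++) {
		int contiguous = i > 0 && idx[i] == idx[i - 1] + 1;

		/* Start a new request on a gap or when the current one is full. */
		if (in_req == 0 || !contiguous || in_req == max_pages) {
			requests++;
			in_req = 0;
		}
		in_req++;
	}
	return requests;
}
```

With pages 0..7 dirty, a 32k wsize gives one request and a 4k wsize
gives eight. Each of those requests would need its own RPC slot, nfsd
thread and rpc_task, as described above.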

What we can do instead is to do the same thing that the VM does: use a
radix tree with tags to do the sorting of the nfs_pages when we're
actually building up the list of dirty pages to send.
We already have the radix tree to do that, all we need to do is add the
tags and modify the scanning function to use them.
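
As a rough illustration of the idea, here is a self-contained userspace
sketch of a tagged lookup. It is not the kernel's radix tree: it is a
fixed two-level tree with one "dirty" bitmap per node, and struct tree,
tree_insert, tree_tag_dirty and tree_lookup_dirty are hypothetical names
invented for this example. The point is that walking the tags in index
order hands back the dirty entries already sorted, so no separate sort
pass is needed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOT_BITS 6
#define SLOTS     (1UL << SLOT_BITS)
#define MAX_INDEX (SLOTS * SLOTS)	/* two levels: indices 0..4095 */

struct leaf {
	void *item[SLOTS];
	unsigned long long dirty;	/* bit i set: item[i] is tagged dirty */
};

struct tree {
	struct leaf *child[SLOTS];
	unsigned long long dirty;	/* bit i set: child[i] holds a dirty item */
};

/* Insert item at index; returns 0 on success, -1 on bad index or OOM. */
static int tree_insert(struct tree *t, unsigned long index, void *item)
{
	unsigned long hi = index >> SLOT_BITS, lo = index & (SLOTS - 1);

	if (index >= MAX_INDEX)
		return -1;
	if (!t->child[hi]) {
		t->child[hi] = calloc(1, sizeof(struct leaf));
		if (!t->child[hi])
			return -1;
	}
	t->child[hi]->item[lo] = item;
	return 0;
}

/* Tag an existing item dirty and propagate the tag up to the root. */
static int tree_tag_dirty(struct tree *t, unsigned long index)
{
	unsigned long hi = index >> SLOT_BITS, lo = index & (SLOTS - 1);

	if (index >= MAX_INDEX || !t->child[hi] || !t->child[hi]->item[lo])
		return -1;
	t->child[hi]->dirty |= 1ULL << lo;
	t->dirty |= 1ULL << hi;
	return 0;
}

/*
 * Collect up to max dirty indices >= start, in ascending order.
 * Subtrees without the dirty tag are skipped without descending.
 */
static size_t tree_lookup_dirty(const struct tree *t, unsigned long start,
				unsigned long *out, size_t max)
{
	unsigned long first = start >> SLOT_BITS;
	size_t n = 0;

	for (unsigned long hi = first; hi < SLOTS && n < max; hi++) {
		unsigned long lo = (hi == first) ? (start & (SLOTS - 1)) : 0;

		if (!(t->dirty & (1ULL << hi)))
			continue;
		assert(t->child[hi]);	/* a tagged slot always has a leaf */
		for (; lo < SLOTS && n < max; lo++)
			if (t->child[hi]->dirty & (1ULL << lo))
				out[n++] = (hi << SLOT_BITS) | lo;
	}
	return n;
}
```

Entries can be inserted and tagged in any order; tree_lookup_dirty()
still returns them by ascending index, and the cost of the scan depends
on the tagged subtrees rather than on the length of a list.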

Cheers,
 Trond

-- 
Trond Myklebust <trond.myklebust@fys.uio.no>
