From: Neil Brown
To: Jens Axboe
Cc: David Chinner, Andi Kleen, linux-kernel@vger.kernel.org, akpm@osdl.org
Date: Fri, 25 Aug 2006 14:36:28 +1000
Subject: Re: RFC - how to balance Dirty+Writeback in the face of slow writeback.
Message-ID: <17646.32332.572865.919526@cse.unsw.edu.au>
In-Reply-To: message from Jens Axboe on Monday August 21

> But these numbers are in no way tied to the hardware. It may be totally
> reasonable to have 3GiB of dirty data on one system, and it may be
> totally unreasonable to have 96MiB of dirty data on another. I've always
> thought that assuming any kind of reliable throttling at the queue level
> is broken and that the vm should handle this completely.

I keep changing my mind about this.  Sometimes I see it that way,
sometimes it seems very sensible for throttling to happen at the
device queue.

Can I ask a question: why do we have a 'nr_requests' maximum?  Why not
just allocate request structures whenever a request is made?  Is there
some reason relating to making the block layer work more efficiently,
or is it just because the VM requires it?

I'm beginning to think that the current scheme really works very well
- except for a few 'bugs'(*).

The one change that might make sense would be for the VM to be able to
tune the queue size of each backing dev.  Exactly how that would work
I'm not sure, but the goal would be to get the sum of the active queue
sizes to about one half of dirty_threshold.

(*) The 'bugs' I am currently aware of are:
 - nfs doesn't put a limit on the request queue.
 - the ext3 journal often writes out dirty data without clearing the
   Dirty flag on the page, so the nr_dirty count ends up wrong.
   ext3 writes the buffers out and marks them clean, so when the VM
   tries to flush a page it finds all the buffers are clean and marks
   the page clean too.  The nr_dirty count eventually becomes correct
   again, but I think this can cause write throttling to be very
   unfair at times.

I think we need a queue limit on NFS requests.....

NeilBrown
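
[Editor's sketch] To make the per-backing-dev tuning idea above concrete, here is a
minimal, purely illustrative sketch in C: it scales each active device's queue depth
so the sum of the active queue sizes comes to about one half of dirty_threshold.
The structure, field, and constant names (bdi_sketch, PAGES_PER_REQUEST,
tune_queue_sizes) are hypothetical, not real block-layer or writeback interfaces.

/*
 * Illustrative sketch only: divide a budget of dirty_threshold/2 pages
 * evenly among the backing devices that currently have dirty data, and
 * set each one's request-queue depth accordingly.  Everything here is
 * hypothetical; the real code would live in the VM/writeback path.
 */
struct bdi_sketch {
	int		active;		/* device currently has dirty pages queued */
	unsigned long	nr_requests;	/* queue depth we will allow, in requests */
};

#define PAGES_PER_REQUEST	32	/* assumed average pages per request */

static void tune_queue_sizes(struct bdi_sketch *bdis, int nr_bdis,
			     unsigned long dirty_threshold)
{
	unsigned long budget = dirty_threshold / 2;	/* target: half of dirty_threshold */
	int active = 0;
	int i;

	for (i = 0; i < nr_bdis; i++)
		if (bdis[i].active)
			active++;
	if (!active)
		return;

	/* Split the page budget evenly across active devices. */
	for (i = 0; i < nr_bdis; i++) {
		if (!bdis[i].active)
			continue;
		bdis[i].nr_requests = (budget / active) / PAGES_PER_REQUEST;
		if (bdis[i].nr_requests < 4)
			bdis[i].nr_requests = 4;	/* keep a sane minimum depth */
	}
}

An even split is only one policy; the VM could just as well weight the budget by
each device's recent writeback throughput.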
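[Editor's sketch] On the NFS point: a queue limit there could be as simple as a
counting semaphore bounding in-flight write requests, analogous to a block queue's
nr_requests.  The sketch below uses real kernel primitives of that era
(sema_init/down/up), but the function names and the limit of 128 are made up and
this is not the actual NFS client code.

/*
 * Hypothetical sketch: cap the number of in-flight NFS write requests
 * per mount so that a slow or unreachable server cannot absorb an
 * unbounded amount of dirty memory.
 */
#include <asm/semaphore.h>

static struct semaphore nfs_requests;

static void nfs_throttle_init(void)
{
	sema_init(&nfs_requests, 128);	/* assumed per-mount limit */
}

static void nfs_write_begin_throttled(void)
{
	down(&nfs_requests);	/* sleep while 128 requests are already in flight */
}

static void nfs_write_done(void)
{
	up(&nfs_requests);	/* a completed request makes room for another */
}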