Date: Thu, 22 Feb 2001 10:59:20 -0800 (PST)
From: Linus Torvalds
To: Jens Axboe
cc: Marcelo Tosatti, lkml
Subject: Re: ll_rw_block/submit_bh and request limits
In-Reply-To: <20010222145642.D17276@suse.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 22 Feb 2001, Jens Axboe wrote:
> On Thu, Feb 22 2001, Marcelo Tosatti wrote:
> > The following piece of code in ll_rw_block() aims to limit the number of
> > locked buffers by making processes throttle on IO if the number of
> > in-flight requests is bigger than a high watermark. IO will only start
> > again if we're under a low watermark.
> >
> > 	if (atomic_read(&queued_sectors) >= high_queued_sectors) {
> > 		run_task_queue(&tq_disk);
> > 		wait_event(blk_buffers_wait,
> > 			atomic_read(&queued_sectors) < low_queued_sectors);
> > 	}
> >
> > However, if submit_bh() is used to queue IO (which is used by ->readpage()
> > for ext2, for example), no throttling happens.
> >
> > It looks like ll_rw_block() users (writes, metadata reads) can be starved
> > by submit_bh() (data reads).
> >
> > If I'm not missing something, the watermark check should be moved to
> > submit_bh().
>
> We might as well put it there, the idea was to not lock this one
> buffer either but I doubt this would make any difference in reality :-)

I'd prefer for this check to be a per-queue one.

Right now a slow device (like a floppy) would artificially throttle a fast
one, if I read the above right.
So instead of moving it down the call-chain, I'd rather remove the check
completely as it looks wrong to me.

Now, if people want throttling, I'd much rather see that done per-queue.

(There's another level of throttling that might make sense: right now the
swap-out code has this "nr_async_pages" throttling which is very different
from the queue throttling. It might make sense to move that _VM_-level
throttling to writepage too - so that syncing of dirty mmap's will not
cause an overload of pages in flight. This was one of the reasons I
changed the semantics of write-page - so that shared mappings could do
that kind of smoothing too).

		Linus
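
For illustration only, here is a minimal user-space sketch of what per-queue
watermark throttling could look like, as opposed to the single global
queued_sectors counter quoted above. Everything in it is an assumption made
for the example: the struct queue_throttle type and the queue_throttle_init(),
queue_try_submit() and queue_complete() helpers are hypothetical names, not
kernel APIs, and a real implementation would sleep on a per-queue wait queue
instead of returning false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-queue throttle state (illustrative, not kernel code). */
struct queue_throttle {
	int queued_sectors;	/* sectors currently in flight on this queue */
	int high_watermark;	/* start throttling at or above this value */
	int low_watermark;	/* stop throttling once below this value */
	bool throttled;		/* submitters would have to wait while set */
};

static void queue_throttle_init(struct queue_throttle *q, int low, int high)
{
	assert(low < high);
	q->queued_sectors = 0;
	q->low_watermark = low;
	q->high_watermark = high;
	q->throttled = false;
}

/*
 * Try to queue 'sectors' on this queue only.
 * Returns true if the IO was accepted, false if the caller would have
 * to wait until this queue drains below its low watermark.
 */
static bool queue_try_submit(struct queue_throttle *q, int sectors)
{
	if (q->queued_sectors >= q->high_watermark)
		q->throttled = true;
	if (q->throttled)
		return false;
	q->queued_sectors += sectors;
	return true;
}

/* Account for completed IO and lift the throttle below the low watermark. */
static void queue_complete(struct queue_throttle *q, int sectors)
{
	q->queued_sectors -= sectors;
	if (q->throttled && q->queued_sectors < q->low_watermark)
		q->throttled = false;
}
```

Because each queue carries its own counter and watermarks, a saturated slow
queue (say, one standing in for a floppy) refuses new IO without affecting
submissions to a separate fast queue.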