From: Arjan van de Ven
Subject: Re: [GIT PULL] Ext3 latency fixes
Date: Sun, 5 Apr 2009 13:06:48 -0700
Message-ID: <20090405130648.3266a468@infradead.org>
References: <20090404135719.GA9812@mit.edu>
	<20090404151649.GE5178@kernel.dk>
	<20090404173412.GF5178@kernel.dk>
	<20090404180108.GH5178@kernel.dk>
	<20090404232222.GA7480@mit.edu>
	<20090404163349.20df1208@infradead.org>
	<20090405001005.GA7553@mit.edu>
	<20090405115629.521057fc@infradead.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: Theodore Tso, Jens Axboe, Linux Kernel Developers List,
	Ext4 Developers List
To: Linus Torvalds
Return-path:
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-ext4.vger.kernel.org

On Sun, 5 Apr 2009 12:34:32 -0700 (PDT)
Linus Torvalds wrote:

> On Sun, 5 Apr 2009, Arjan van de Ven wrote:
> >
> > > See get_request():
> >
> > our default number of requests is so low that we very regularly hit
> > the limit. In addition to setting kjournald to higher priority, I
> > tend to set the number of requests to 4096 or so to improve
> > interactive performance on my own systems. That way at least the
> > elevator has a chance to see the requests ;-)
>
> That's insane. 4096 is an absolutely insane value that hides some of
> the problem. Long queues make the problem harder to hit, yes. But it
> also tends to make the problem a million times worse when you _do_
> hit it.

There is a dilemma though. By keeping the IO needs out of the queue
they haven't, to some degree, gone away; they have just become
invisible.

Now there is also a throttling value in having these limits, to slow
down "regular" processes that would cause too much IO. Except that we
have the dirty limit in the VM for that, and except that most actual
IO is done by pdflush and other kernel threads, with the dirtying of
the data happening asynchronously to that.

I would contend that for most common cases, not giving callers a
request immediately does not change or throttle the actual IO that is
waiting to be sent to the device. All it does is reduce the visibility
of the IO need, so the elevator gets to do less merging of adjacent
requests and less prioritization.

> I would suggest looking instead at trying to have separate allocation
> pools for bulk and "sync" IO. Instead of having just one rq->rq_pool,
> we could easily have a rq->rq_bulk_pool and rq->rq_sync_pool.

Well, that, or have pools for a few buckets of priority level. The
risk of this is that someone like pdflush might get stuck waiting on a
low-priority pool, and thus cannot submit the IO it might have wanted
to send to a higher-priority queue.

I fear that any such limits will in general punish the wrong guy;
after all, request number 129 is the one that gets punished, not the
guy who put numbers 1 to 128 into the queue.

I wonder if it wouldn't be a better solution to give pdflush insight
into the queue length currently in use, and have pdflush decide what
kind of IO to submit based on that length, rather than having it just
block.

Just think of the sync() or fsync() cases. The total amount of IO that
those calls will cause is pretty much fixed: the data that is
"relevantly dirty" at the time of the call. Holding things back at the
request allocation level does not change that; all it changes is that
we delay merging adjacent requests, sorting on priority, etc.

-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
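
To make the split-pool idea a bit more concrete, here is a minimal
sketch. It is not from any real patch: struct rq_pools, init_rq_pools
and alloc_request are made-up names for illustration only, and today
the allocation actually happens in get_request() against a single
rq_pool. The only point it shows is that a sync caller draws from a
reserve that bulk writeback cannot exhaust:

#include <linux/blkdev.h>
#include <linux/mempool.h>
#include <linux/slab.h>

/*
 * Illustrative only: the real block layer keeps one rq_pool in
 * struct request_list; the names below are invented for this sketch.
 */
struct rq_pools {
	mempool_t *rq_sync_pool;	/* reads, O_SYNC/fsync writes */
	mempool_t *rq_bulk_pool;	/* background/bulk writeback  */
};

static int init_rq_pools(struct rq_pools *p, struct kmem_cache *request_cachep)
{
	/* each class gets its own guaranteed minimum of requests */
	p->rq_sync_pool = mempool_create_slab_pool(BLKDEV_MIN_RQ,
						   request_cachep);
	p->rq_bulk_pool = mempool_create_slab_pool(BLKDEV_MIN_RQ,
						   request_cachep);
	if (!p->rq_sync_pool || !p->rq_bulk_pool)
		return -ENOMEM;
	return 0;
}

static struct request *alloc_request(struct rq_pools *p, bool is_sync,
				     gfp_t gfp_mask)
{
	mempool_t *pool = is_sync ? p->rq_sync_pool : p->rq_bulk_pool;

	/* bulk writeback exhausting its own pool can no longer block
	 * the occasional synchronous read or fsync, and vice versa   */
	return mempool_alloc(pool, gfp_mask);
}

The same shape would work for a handful of priority buckets instead of
just two pools; the open question stays the same either way, namely who
gets punished when a given bucket runs dry.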