Date: Fri, 2 Oct 2009 20:19:03 +0200
From: Jens Axboe
To: Mike Galbraith
Cc: Ingo Molnar, Linus Torvalds, Vivek Goyal, Ulrich Lukas,
	linux-kernel@vger.kernel.org, containers@lists.linux-foundation.org,
	dm-devel@redhat.com, nauman@google.com, dpshah@google.com,
	lizf@cn.fujitsu.com, mikew@google.com, fchecconi@gmail.com,
	paolo.valente@unimore.it, ryov@valinux.co.jp, fernando@oss.ntt.co.jp,
	jmoyer@redhat.com, dhaval@linux.vnet.ibm.com, balbir@linux.vnet.ibm.com,
	righi.andrea@gmail.com, m-ikeda@ds.jp.nec.com, agk@redhat.com,
	akpm@linux-foundation.org, peterz@infradead.org, jmarchan@redhat.com,
	riel@redhat.com
Subject: Re: IO scheduler based IO controller V10
Message-ID: <20091002181903.GN31616@kernel.dk>
References: <20091002092839.GA26962@kernel.dk> <20091002145610.GD31616@kernel.dk>
	<20091002171129.GG31616@kernel.dk> <20091002172046.GA2376@elte.hu>
	<20091002172554.GJ31616@kernel.dk> <20091002172842.GA4884@elte.hu>
	<20091002173732.GK31616@kernel.dk> <1254507215.8667.7.camel@marge.simson.net>
In-Reply-To: <1254507215.8667.7.camel@marge.simson.net>

On Fri, Oct 02 2009, Mike Galbraith wrote:
> On Fri, 2009-10-02 at 19:37 +0200, Jens Axboe wrote:
> > On Fri, Oct 02 2009, Ingo Molnar wrote:
> > >
> > > * Jens Axboe wrote:
> > >
> > > > On Fri, Oct 02 2009, Ingo Molnar wrote:
> > > > >
> > > > > * Jens Axboe wrote:
> > > > >
> > > > > > It's not _that_ easy, it depends a lot on the access patterns. A
> > > > > > good example of that is actually the idling that we already do.
> > > > > > Say you have two applications, each starting up. If you start them
> > > > > > both at the same time and just care for the dumb low latency, then
> > > > > > you'll do one IO from each of them in turn. Latency will be good,
> > > > > > but throughput will be awful. And this means that in 20s they are
> > > > > > both started, while with the slice idling and priority disk access
> > > > > > that CFQ does, you'd hopefully have both up and running in 2s.
> > > > > >
> > > > > > So latency is good, definitely, but sometimes you have to worry
> > > > > > about the bigger picture too. Latency is more than single IOs,
> > > > > > it's often for a complete operation which may involve lots of IOs.
> > > > > > Single IO latency is a benchmark thing, it's not a real life
> > > > > > issue. And that's where it becomes complex and not so black and
> > > > > > white. Mike's test is a really good example of that.
> > > > >
> > > > > To the extent of you arguing that Mike's test is artificial (i'm not
> > > > > sure you are arguing that) - Mike certainly did not do an artificial
> > > > > test - he tested 'konsole' cache-cold startup latency, such as:
> > > >
> > > > [snip]
> > > >
> > > > I was saying the exact opposite, that Mike's test is a good example of
> > > > a valid test. It's not measuring single IO latencies, it's doing a
> > > > sequence of valid events and looking at the latency for those. It's
> > > > benchmarking the bigger picture, not a microbenchmark.
> > >
> > > Good, so we are in violent agreement :-)
> >
> > Yes, perhaps that last sentence didn't provide enough evidence of which
> > category I put Mike's test into :-)
> >
> > So to kick things off, I added an 'interactive' knob to CFQ and
> > defaulted it to on, along with re-enabling slice idling for hardware
> > that does tagged command queuing. This is almost completely identical
> > to what Vivek Goyal originally posted, it's just combined into one and
> > uses the term 'interactive' instead of 'fairness'. I think the former
> > is a better umbrella under which to add further tweaks that may
> > sacrifice throughput slightly, in the quest for better latency.
> >
> > It's queued up in the for-linus branch.
>
> FWIW, I did a matrix of Vivek's patch combined with my hack. Seems we
> do lose a bit of dd throughput over stock with either or both.
>
> fairness=1 overload_delay=1:
>     dd pre      65.1    65.4    67.5    64.8    65.1    (avg 65.5)
>     perf stat    1.70    1.94    1.32    1.89    1.87   (avg  1.7)
>     dd post     69.4    62.3    69.7    70.3    69.6    (avg 68.2)
>
> fairness=1 overload_delay=0:
>     dd pre      67.0    67.8    64.7    64.7    64.9    (avg 65.8)
>     perf stat    4.89    3.13    2.98    2.71    2.17   (avg  3.1)
>     dd post     67.2    63.3    62.6    62.8    63.1    (avg 63.8)
>
> fairness=0 overload_delay=1:
>     dd pre      65.0    66.0    66.9    64.6    67.0    (avg 65.9)
>     perf stat    4.66    3.81    4.23    2.98    4.23   (avg  3.9)
>     dd post     62.0    60.8    62.4    61.4    62.2    (avg 61.7)
>
> fairness=0 overload_delay=0:
>     dd pre      65.3    65.6    64.9    69.5    65.8    (avg 66.2)
>     perf stat   14.79    9.11   14.16    8.44   13.67   (avg 12.0)
>     dd post     64.1    66.5    64.0    66.5    64.4    (avg 65.1)
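(For reference, rows like "dd pre", "perf stat", and "dd post" presumably
come from a loop of roughly the following shape. This is only a sketch; the
file path, run count, and exact konsole invocation are assumptions, not
Mike's actual script.)

    #!/bin/sh
    # Sketch of the presumed procedure behind the table: record dd
    # streaming throughput before and after timing a cache-cold konsole
    # startup that competes with the dd stream.

    FILE=/mnt/test/bigfile    # assumed pre-made large file on the test disk

    drop_caches() {
        sync
        echo 3 > /proc/sys/vm/drop_caches   # needs root
    }

    for run in 1 2 3 4 5; do
        # "dd pre": cache-cold sequential read throughput, no competition
        drop_caches
        dd if="$FILE" of=/dev/null bs=1M 2>&1 | tail -n1

        # "perf stat": elapsed time of a cache-cold konsole startup while
        # a dd reader hammers the disk in the background
        drop_caches
        dd if="$FILE" of=/dev/null bs=1M >/dev/null 2>&1 &
        dd_pid=$!
        perf stat -- konsole -e /bin/true
        kill "$dd_pid" 2>/dev/null
        wait

        # "dd post": throughput again once the interactive task is done
        drop_caches
        dd if="$FILE" of=/dev/null bs=1M 2>&1 | tail -n1
    done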
I'm not too worried about the "single IO producer" scenarios, and from a
quick look it seems that most of your numbers are within expected noise
levels. It's the more complex mixes that are likely to cause a bit of a
stink, but let's worry about that later. One quick thing would be to read
e.g. two or more files sequentially from disk and see how that performs.

If you could do a cleaned up version of your overload patch based on this:

http://git.kernel.dk/?p=linux-2.6-block.git;a=commit;h=1d2235152dc745c6d94bedb550fea84cffdbf768

then let's take it from there.

-- 
Jens Axboe
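(A possible concrete shape for the quick test suggested above, reading two
files sequentially and in parallel. The file paths are placeholders, and
the tunable names under iosched/ are assumptions about the patched kernel,
so check what that kernel actually exposes there.)

    #!/bin/sh
    # Two competing sequential readers on one disk. With slice idling
    # doing its job, each stream should still be served in large
    # contiguous chunks rather than the head seeking between the two
    # files on every request.

    DEV=sda
    ls /sys/block/$DEV/queue/iosched/   # see which CFQ knobs this kernel has

    sync
    echo 3 > /proc/sys/vm/drop_caches   # start cache-cold (needs root)

    for f in /mnt/test/file1 /mnt/test/file2; do
        dd if="$f" of=/dev/null bs=1M 2>&1 | tail -n1 &
    done
    wait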