Subject: Re: IO scheduler based IO controller V10
From: Mike Galbraith
To: Jens Axboe
Cc: Vivek Goyal, Ulrich Lukas, linux-kernel@vger.kernel.org,
    containers@lists.linux-foundation.org, dm-devel@redhat.com,
    nauman@google.com, dpshah@google.com, lizf@cn.fujitsu.com,
    mikew@google.com, fchecconi@gmail.com, paolo.valente@unimore.it,
    ryov@valinux.co.jp, fernando@oss.ntt.co.jp, jmoyer@redhat.com,
    dhaval@linux.vnet.ibm.com, balbir@linux.vnet.ibm.com,
    righi.andrea@gmail.com, m-ikeda@ds.jp.nec.com, agk@redhat.com,
    akpm@linux-foundation.org, peterz@infradead.org, jmarchan@redhat.com,
    torvalds@linux-foundation.org, mingo@elte.hu, riel@redhat.com
Date: Sun, 27 Sep 2009 20:15:59 +0200
Message-Id: <1254075359.7354.66.camel@marge.simson.net>
In-Reply-To: <20090927164235.GA23126@kernel.dk>

On Sun, 2009-09-27 at 18:42 +0200, Jens Axboe wrote:
> On Sun, Sep 27 2009, Mike Galbraith wrote:
> > My dd vs load
> > non-cached binary woes seem to be coming from backmerge.
> >
> > #if 0 /*MIKEDIDIT sand in gearbox?*/
> > 	/*
> > 	 * See if our hash lookup can find a potential backmerge.
> > 	 */
> > 	__rq = elv_rqhash_find(q, bio->bi_sector);
> > 	if (__rq && elv_rq_merge_ok(__rq, bio)) {
> > 		*req = __rq;
> > 		return ELEVATOR_BACK_MERGE;
> > 	}
> > #endif
>
> It's a given that not merging will provide better latency.

Yeah, absolutely everything I've diddled that reduces the size of queued
data improves the situation, which makes perfect sense.  This one was a
bit unexpected, though: front merges didn't hurt at all, but back merges
did, and a lot.  After diddling the code a bit, I had the "well _duh_"
moment.

> We can't disable that or performance will suffer A LOT on some
> systems.  There are ways to make it better, though.  One would be to
> make the max request size smaller, but that would also hurt for
> streamed workloads.  Can you try whether the below patch makes a
> difference?  It will basically disallow merges to a request that isn't
> the last one.

That's what all the looking I've done ends up at.  Either you let the
disk be all it can be, and you pay in latency, or you don't, and you pay
in throughput.

> below won't work well for two (or more) streamed cases.  I'll think a
> bit about that.

Cool, think away.  I've been eyeballing and pondering how to know when
latency is going to become paramount, but absolutely nothing is
happening, even for "it's my root".

> Note this is totally untested!

I'll give it a shot first thing in the A.M.

Note: I tested my stable of kernels today (22->), and we are better off
dd vs read today than ever in this time span at least.  (I can't recall
ever seeing a system where beating snot outta root didn't hurt really
bad...
would be very nice though. ;)

> diff --git a/block/elevator.c b/block/elevator.c
> index 1975b61..d00a72b 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -497,9 +497,17 @@ int elv_merge(struct request_queue *q, struct request **req, struct bio *bio)
>  	 * See if our hash lookup can find a potential backmerge.
>  	 */
>  	__rq = elv_rqhash_find(q, bio->bi_sector);
> -	if (__rq && elv_rq_merge_ok(__rq, bio)) {
> -		*req = __rq;
> -		return ELEVATOR_BACK_MERGE;
> +	if (__rq) {
> +		/*
> +		 * If requests are queued behind this one, disallow merge. This
> +		 * prevents streaming IO from continually passing new IO.
> +		 */
> +		if (elv_latter_request(q, __rq))
> +			return ELEVATOR_NO_MERGE;
> +		if (elv_rq_merge_ok(__rq, bio)) {
> +			*req = __rq;
> +			return ELEVATOR_BACK_MERGE;
> +		}
>  	}
>
>  	if (e->ops->elevator_merge_fn)