Date: Wed, 19 Nov 2008 11:17:01 +0100
From: Fabio Checconi
To: Aaron Carroll
Cc: Jens Axboe, Vivek Goyal, Nauman Rafique, Li Zefan, Divyesh Shah,
	Ryo Tsuruta, linux-kernel@vger.kernel.org,
	containers@lists.linux-foundation.org,
	virtualization@lists.linux-foundation.org, taka@valinux.co.jp,
	righi.andrea@gmail.com, s-uchida@ap.jp.nec.com,
	fernando@oss.ntt.co.jp, balbir@linux.vnet.ibm.com,
	akpm@linux-foundation.org, menage@google.com, ngupta@google.com,
	riel@redhat.com, jmoyer@redhat.com, peterz@infradead.org,
	paolo.valente@unimore.it
Subject: Re: [patch 0/4] [RFC] Another proportional weight IO controller

> From: Aaron Carroll
> Date: Wed, Nov 19, 2008 12:52:42PM +1100
>
> Fabio Checconi wrote:
> >   - To detect hw tagging in BFQ we consider a sample valid iff the
> >     number of requests that the scheduler could have dispatched (given
> >     by cfqd->rb_queued + cfqd->rq_in_driver, i.e., the ones still in
> >     the scheduler plus the ones in the driver) is higher than the
> >     CFQ_HW_QUEUE_MIN threshold.  This obviously caused no problems
> >     during testing, but the way CFQ uses it now seems a little
> >     strange.
>
> BFQ's tag detection logic is broken in the same way that CFQ's used to
> be.  Explanation is in this patch:
>

If you look at bfq_update_hw_tag(), the logic introduced by the patch
you mention is still there; BFQ starts with ->hw_tag = 1, and updates
it every 32 valid samples.  What changed WRT your patch, apart from the
number of samples, is the condition for a sample to be valid:

    bfqd->rq_in_driver + bfqd->queued >= 5

while in your patch it is:

    cfqd->rq_queued > 5 || cfqd->rq_in_driver > 5

We preferred the first one because that sum better reflects the number
of requests that could have been dispatched, and I don't think this is
wrong.

There is a problem, but it is not in the tag detection logic itself.
From some quick experiments, what happens is that when a process
starts, CFQ considers it seeky (*), while BFQ does not.  As a side
effect, BFQ does not always dispatch enough requests to correctly
detect tagging.

At the first seek you cannot tell whether the process is going to be
seeky or not, and we chose to consider it sequential because that
improved fairness in some sequential workloads (the CIC_SEEKY heuristic
is also used to determine the idle_window length in
[bc]fq_arm_slice_timer()).
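Just to make the detection logic above concrete, this is roughly what I
mean, written out as a sketch; it is not the actual bfq_update_hw_tag(),
and the struct/field names (max_rq_in_driver, hw_tag_samples, ...) are
only placeholders for illustration:

    #define HW_QUEUE_MIN        5   /* valid-sample threshold */
    #define HW_QUEUE_SAMPLES    32  /* valid samples per decision */

    /* Illustrative subset of the scheduler state (placeholder names). */
    struct bfq_data_sketch {
        int rq_in_driver;       /* requests dispatched to the driver */
        int queued;             /* requests still inside the scheduler */
        int max_rq_in_driver;   /* peak seen in the current window */
        int hw_tag_samples;     /* valid samples collected so far */
        int hw_tag;             /* starts at 1, re-evaluated per window */
    };

    static void sketch_update_hw_tag(struct bfq_data_sketch *bfqd)
    {
        if (bfqd->rq_in_driver > bfqd->max_rq_in_driver)
            bfqd->max_rq_in_driver = bfqd->rq_in_driver;

        /*
         * A sample counts only if the requests the scheduler could
         * have dispatched (queued plus already in the driver) reach
         * the threshold; otherwise it is ignored.
         */
        if (bfqd->rq_in_driver + bfqd->queued < HW_QUEUE_MIN)
            return;

        if (++bfqd->hw_tag_samples < HW_QUEUE_SAMPLES)
            return;

        /* One decision every 32 valid samples. */
        bfqd->hw_tag = bfqd->max_rq_in_driver >= HW_QUEUE_MIN;
        bfqd->hw_tag_samples = 0;
        bfqd->max_rq_in_driver = 0;
    }

The point is only that the sum, rather than either term alone, gates
whether a sample counts at all.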
Anyway, we're dealing with heuristics, and they tend to favor some
workloads over others.  If recovering this throughput loss is more
important than the transient unfairness caused by the short idling
windows assigned to sequential processes when they start, I have no
problem switching the CIC_SEEKY logic to consider a process seeky when
it starts.

Thank you for testing and for pointing out this issue; we missed it in
our testing.

(*) To be precise, the initial classification depends on the position
of the first accessed sector.
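To expand a bit on the (*) note with a sketch (again, this is not the
actual [bc]fq code, and the threshold and names below are placeholders):
on the very first request there is no previous position to compare
against, so the computed "seek distance" degenerates into the absolute
position of that first access, and the initial classification follows
from wherever on the disk it happens to land.

    #define SEEK_THR_SECTORS    8192    /* placeholder threshold */

    struct ic_sketch {
        unsigned long long last_pos;    /* 0 until the first request */
        int seeky;                      /* current classification */
    };

    static void sketch_update_seekiness(struct ic_sketch *ic,
                                        unsigned long long pos)
    {
        unsigned long long sdist;

        if (!ic->last_pos)
            /* First request: no history, the "distance" is just
             * the absolute sector of the access. */
            sdist = pos;
        else if (pos > ic->last_pos)
            sdist = pos - ic->last_pos;
        else
            sdist = ic->last_pos - pos;

        ic->seeky = sdist > SEEK_THR_SECTORS;
        ic->last_pos = pos;
    }

Forcing the classification to seeky in that first-request case is what
I mean by switching the CIC_SEEKY logic; the cost would be the shorter
idle window assigned to processes that later turn out to be sequential.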