Date: Mon, 9 May 2011 21:50:46 +0800
From: Shaohua Li
To: Vivek Goyal
Cc: Tejun Heo, linux-kernel@vger.kernel.org, linux-ide@vger.kernel.org,
	jaxboe@fusionio.com, hch@infradead.org, jgarzik@pobox.com,
	djwong@us.ibm.com, sshtylyov@mvista.com, James Bottomley,
	linux-scsi@vger.kernel.org, ricwheeler@gmail.com
Subject: Re: [patch v3 2/3] block: hold queue if flush is running for non-queueable flush drive
Message-ID: <20110509135046.GB29753@sli10-conroe.sh.intel.com>
In-Reply-To: <20110509130316.GB5975@redhat.com>

On Mon, May 09, 2011 at 09:03:16PM +0800, Vivek Goyal wrote:
> On Thu, May 05, 2011 at 10:38:53AM +0200, Tejun Heo wrote:
> > [..]
> > Similarly, I'd like to suggest something like the following.
> >
> > /*
> >  * Hold dispatching of regular requests if a non-queueable
> >  * flush is in progress; otherwise, the low-level driver
> >  * would keep dispatching IO requests just to requeue them
> >  * until the flush finishes, which not only adds
> >  * dispatching / requeueing overhead but may also
> >  * significantly affect throughput when multiple flushes
> >  * are issued back-to-back.  Please consider the following
> >  * scenario.
> >  *
> >  * - flush1 is dispatched with write1 in the elevator.
> >  *
> >  * - The driver dispatches write1 and requeues it.
> >  *
> >  * - flush2 is issued and appended to the dispatch queue after
> >  *   the requeued write1.  As write1 has been requeued,
> >  *   flush2 can't be put in front of it.
> >  *
> >  * - When flush1 finishes, the driver has to process write1
> >  *   before flush2 even though there's no fundamental
> >  *   reason flush2 can't be processed first and, when two
> >  *   flushes are issued back-to-back without intervening
> >  *   writes, the second one essentially becomes a noop.
> >  *
> >  * This phenomenon becomes quite visible under a heavy
> >  * concurrent fsync workload, and holding the queue while a
> >  * flush is in progress leads to a significant throughput
> >  * gain.
> >  */
>
> Tejun,
>
> I am assuming that these back-to-back flushes are independent of each
> other; otherwise a write request will anyway get between the two flushes.

Hi, yes, the flushes are independent.

> If that's the case, then should we solve the problem by improving the
> flush merge logic a bit? (Say, idle a bit before issuing a flush, but
> only if the request queue is not empty.)

I tried some ways to improve the flush merge logic. The problem I observed
is something like this: say we have 10 flushes. Originally we dispatch
4 flushes, a write, then 6 flushes; with more merging we dispatch 6 flushes,
a write, then 4 flushes. The number of flush requests sent to the drive
isn't reduced.
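
(As a side note, the queue-holding behaviour Tejun's comment describes boils
down to something like the sketch below. This is only an illustration of the
idea, not the actual patch or blk-flush.c code; the field and helper names
used here (flush_not_queueable, flush_running, __next_request) are made up.)

	/*
	 * Sketch only: if the drive cannot queue flushes and a flush is
	 * already running, report the queue as empty so normal requests
	 * are held back instead of being dispatched and requeued, which
	 * lets a later flush be merged ahead of them.
	 */
	static inline bool hold_queue_for_flush(struct request_queue *q)
	{
		return q->flush_not_queueable && q->flush_running;
	}

	struct request *fetch_next_request(struct request_queue *q)
	{
		if (hold_queue_for_flush(q))
			return NULL;	/* hold everything until the flush completes */

		return __next_request(q);	/* hypothetical "pick next request" helper */
	}
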
Another reason I didn't see an improvement with better back-to-back merging
might be that the drive already optimizes the case of two adjacent flushes
well.

Thanks,
Shaohua