Date: Mon, 23 Mar 2020 11:44:32 +0800
From: Ming Lei
To: Baolin Wang
Cc: axboe@kernel.dk, Paolo Valente, Ulf Hansson, Adrian Hunter, Arnd Bergmann, Linus Walleij, Orson Zhai, Chunyan Zhang, linux-mmc, linux-block, LKML
Subject: Re: [RESEND RFC PATCH 2/8] block: Allow sending a batch of requests from the scheduler to hardware
Message-ID: <20200323034432.GA27507@ming.t460p>
References: <20200318100123.GA27531@ming.t460p>

On Fri, Mar 20, 2020 at 06:27:41PM +0800, Baolin Wang wrote:
> Hi Ming,
>
> On Wed, Mar 18, 2020 at 6:26 PM Baolin Wang wrote:
> >
> > Hi Ming,
> >
> > On Wed, Mar 18, 2020 at 6:01 PM Ming Lei wrote:
> > >
> > > On Mon, Mar 16, 2020 at 06:01:19PM +0800, Baolin Wang wrote:
> > > > As we know, some SD/MMC host controllers can support packed request,
> > > > that means we can package several requests to host controller at one
> > > > time to improve performance. So the hardware driver expects the blk-mq
> > > > can dispatch a batch of requests at one time, and driver can use bd.last
> > > > to indicate if it is the last request in the batch to help to combine
> > > > requests as much as possible.
> > > >
> > > > Thus we should add batch requests setting from the block driver to tell
> > > > the scheduler how many requests can be dispatched in a batch, as well
> > > > as changing the scheduler to dispatch more than one request if setting
> > > > the maximum batch requests number.
> > > >
> > >
> > > I feel this batch dispatch style is more complicated, and some other
> > > drivers (virtio blk/scsi) may still benefit if we can pass the real 'last' flag in
> > > .queue_rq().
> > >
> > > So what about the following way by extending .commit_rqs() to this usage?
> > > And you can do whatever batch processing in .commit_rqs() which will be
> > > guaranteed to be called if BLK_MQ_F_FORCE_COMMIT_RQS is set by driver.
> >
> > I'm very appreciated for your good suggestion, which is much simpler than mine.
> > It seems to solve my problem, and I will try it on my platform to see
> > if it can work and give you the feedback. Thanks again.
>
> I tried your approach on my platform, but met some problems, see below.
>
> >
> > > diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> > > index 856356b1619e..cd2bbe56f83f 100644
> > > --- a/block/blk-mq-sched.c
> > > +++ b/block/blk-mq-sched.c
> > > @@ -85,11 +85,12 @@ void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx)
> > >   * its queue by itself in its completion handler, so we don't need to
> > >   * restart queue if .get_budget() returns BLK_STS_NO_RESOURCE.
> > >   */
> > > -static void blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
> > > +static bool blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
> > >  {
> > >  	struct request_queue *q = hctx->queue;
> > >  	struct elevator_queue *e = q->elevator;
> > >  	LIST_HEAD(rq_list);
> > > +	bool ret = false;
> > >
> > >  	do {
> > >  		struct request *rq;
> > > @@ -112,7 +113,10 @@ static void blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
> > >  		 * in blk_mq_dispatch_rq_list().
> > >  		 */
> > >  		list_add(&rq->queuelist, &rq_list);
> > > -	} while (blk_mq_dispatch_rq_list(q, &rq_list, true));
> > > +		ret = blk_mq_dispatch_rq_list(q, &rq_list, true);
> > > +	} while (ret);
> > > +
> > > +	return ret;
> > >  }
> > >
> > >  static struct blk_mq_ctx *blk_mq_next_ctx(struct blk_mq_hw_ctx *hctx,
> > > @@ -131,11 +135,12 @@ static struct blk_mq_ctx *blk_mq_next_ctx(struct blk_mq_hw_ctx *hctx,
> > >   * its queue by itself in its completion handler, so we don't need to
> > >   * restart queue if .get_budget() returns BLK_STS_NO_RESOURCE.
> > >   */
> > > -static void blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
> > > +static bool blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
> > >  {
> > >  	struct request_queue *q = hctx->queue;
> > >  	LIST_HEAD(rq_list);
> > >  	struct blk_mq_ctx *ctx = READ_ONCE(hctx->dispatch_from);
> > > +	bool ret = false;
> > >
> > >  	do {
> > >  		struct request *rq;
> > > @@ -161,10 +166,12 @@ static void blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
> > >
> > >  		/* round robin for fair dispatch */
> > >  		ctx = blk_mq_next_ctx(hctx, rq->mq_ctx);
> > > -
> > > -	} while (blk_mq_dispatch_rq_list(q, &rq_list, true));
> > > +		ret = blk_mq_dispatch_rq_list(q, &rq_list, true);
> > > +	} while (ret);
> > >
> > >  	WRITE_ONCE(hctx->dispatch_from, ctx);
> > > +
> > > +	return ret;
> > >  }
> > >
> > >  void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
> > > @@ -173,6 +180,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
> > >  	struct elevator_queue *e = q->elevator;
> > >  	const bool has_sched_dispatch = e && e->type->ops.dispatch_request;
> > >  	LIST_HEAD(rq_list);
> > > +	bool dispatch_ret;
> > >
> > >  	/* RCU or SRCU read lock is needed before checking quiesced flag */
> > >  	if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)))
> > > @@ -206,20 +214,26 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
> > >  	 */
> > >  	if (!list_empty(&rq_list)) {
> > >  		blk_mq_sched_mark_restart_hctx(hctx);
> > > -		if (blk_mq_dispatch_rq_list(q, &rq_list, false)) {
> > > +		dispatch_ret = blk_mq_dispatch_rq_list(q, &rq_list, false);
> > > +		if (dispatch_ret) {
> > >  			if (has_sched_dispatch)
> > > -				blk_mq_do_dispatch_sched(hctx);
> > > +				dispatch_ret = blk_mq_do_dispatch_sched(hctx);
>
> If we dispatched a request successfully by blk_mq_dispatch_rq_list(),
> and got dispatch_ret = true now. Then we will try to dispatch more
> requests from scheduler by blk_mq_do_dispatch_sched(), but if now no
> more requests in scheduler, then we will get dispatch_ret = false. In

'dispatch_ret' always holds the result of the last
blk_mq_do_dispatch_sched(). When any one request has been dispatched
successfully, 'dispatch_ret' is true. A new request is always added to
the list before calling blk_mq_do_dispatch_sched(), so once
blk_mq_do_dispatch_sched() returns false, it means that .commit_rqs()
has been called.

> this case, we will not issue commit_rqs() to tell the hardware to
> handle previous request dispatched from &rq_list.
>
> So I think we should not overlap the 'dispatch_ret'? Or do you have
> any other thoughts to fix?
>
> > >  			else
> > > -				blk_mq_do_dispatch_ctx(hctx);
> > > +				dispatch_ret = blk_mq_do_dispatch_ctx(hctx);
> > >  		}
> > >  	} else if (has_sched_dispatch) {
> > > -		blk_mq_do_dispatch_sched(hctx);
> > > +		dispatch_ret = blk_mq_do_dispatch_sched(hctx);
> > >  	} else if (hctx->dispatch_busy) {
> > >  		/* dequeue request one by one from sw queue if queue is busy */
> > > -		blk_mq_do_dispatch_ctx(hctx);
> > > +		dispatch_ret = blk_mq_do_dispatch_ctx(hctx);
> > >  	} else {
> > >  		blk_mq_flush_busy_ctxs(hctx, &rq_list);
> > > -		blk_mq_dispatch_rq_list(q, &rq_list, false);
> > > +		dispatch_ret = blk_mq_dispatch_rq_list(q, &rq_list, false);
> > > +	}
> > > +
> > > +	if (dispatch_ret) {
> > > +		if (hctx->flags & BLK_MQ_F_FORCE_COMMIT_RQS)
> > > +			hctx->queue->mq_ops->commit_rqs(hctx);
> > >  	}
> > >  }
> > >
> > > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > > index 87c6699f35ae..9b46f5d6c7fd 100644
> > > --- a/block/blk-mq.c
> > > +++ b/block/blk-mq.c
> > > @@ -1238,11 +1238,15 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
> > >  		 * Flag last if we have no more requests, or if we have more
> > >  		 * but can't assign a driver tag to it.
> > >  		 */
> > > -		if (list_empty(list))
> > > -			bd.last = true;
> > > -		else {
> > > -			nxt = list_first_entry(list, struct request, queuelist);
> > > -			bd.last = !blk_mq_get_driver_tag(nxt);
> > > +		if (!(hctx->flags & BLK_MQ_F_FORCE_COMMIT_RQS)) {
> > > +			if (list_empty(list))
> > > +				bd.last = true;
> > > +			else {
> > > +				nxt = list_first_entry(list, struct request, queuelist);
> > > +				bd.last = !blk_mq_get_driver_tag(nxt);
> > > +			}
> > > +		} else {
> > > +			bd.last = false;
>
> If we enabled BLK_MQ_F_FORCE_COMMIT_RQS flag, we will always get
> bd.last = false even for the real last request in the IO scheduler. I
> know you already use commit_rqs() to help to kick driver. But I
> worried if it is reasonable that drivers always get bd.last = false.
>

BLK_MQ_F_FORCE_COMMIT_RQS means the .last flag is ignored, and we can
document this usage.

Thanks,
Ming
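
To make the ".last is ignored" point concrete, here is a rough userspace
sketch (not kernel code) of how a batching driver could behave under both
modes. Every toy_* name and TOY_F_FORCE_COMMIT_RQS is hypothetical and only
stands in for the flag proposed in the patch above; the real bd.last logic
also depends on driver tag availability, which this model leaves out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the proposed BLK_MQ_F_FORCE_COMMIT_RQS flag. */
#define TOY_F_FORCE_COMMIT_RQS (1u << 0)

/* Toy model of a driver that packs requests before kicking the hardware. */
struct toy_driver {
	unsigned int flags;
	int pending;   /* requests queued but not yet handed to hardware */
	int sent;      /* requests handed to hardware so far */
	int kicks;     /* number of times the hardware was kicked */
	int last_seen; /* how often the driver saw last == true */
};

/* Hand all pending requests to the (imaginary) hardware in one batch. */
static void toy_kick(struct toy_driver *d)
{
	if (d->pending == 0)
		return;
	d->sent += d->pending;
	d->pending = 0;
	d->kicks++;
}

/* Models .queue_rq(): accumulate the request, flush only on 'last'. */
static void toy_queue_rq(struct toy_driver *d, bool last)
{
	d->pending++;
	if (last) {
		d->last_seen++;
		toy_kick(d);
	}
}

/* Models .commit_rqs(): flush whatever has been accumulated. */
static void toy_commit_rqs(struct toy_driver *d)
{
	toy_kick(d);
}

/*
 * Models one dispatch run over n requests. With the force-commit flag,
 * 'last' is always false and a single commit follows the run; without
 * it, 'last' is set on the final request (simplified: no tag checks).
 */
static void toy_dispatch(struct toy_driver *d, int n)
{
	bool force = (d->flags & TOY_F_FORCE_COMMIT_RQS) != 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++)
		toy_queue_rq(d, force ? false : (i == n - 1));

	if (force && n > 0)
		toy_commit_rqs(d);
}
```

In this model both modes kick the hardware once per run; the only
difference is whether the flush is triggered from toy_queue_rq() via
'last' or from toy_commit_rqs(). A driver written this way would not need
to look at 'last' at all when the flag is set.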