Date: Mon, 30 Aug 2021 18:11:12 +0800
From: Ming Lei
To: Niklas Cassel
Cc: Jens Axboe, Bart Van Assche, Damien Le Moal, Paolo Valente, "linux-block@vger.kernel.org", "linux-kernel@vger.kernel.org"
Subject: Re: [RFC PATCH 1/2] blk-mq: don't call callbacks for requests that bypassed the scheduler
References: <20210827124100.98112-1-Niklas.Cassel@wdc.com> <20210827124100.98112-2-Niklas.Cassel@wdc.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Aug 30, 2021 at 09:48:06AM +0000, Niklas Cassel wrote:
> On Fri, Aug 27, 2021 at 09:28:07PM +0800, Ming Lei wrote:
> > On Fri, Aug 27, 2021 at 12:41:31PM +0000, Niklas Cassel wrote:
> > > From: Niklas Cassel
> > >
> > > Currently, __blk_mq_alloc_request() calls ops.prepare_request and sets
> > > RQF_ELVPRIV.
> > >
> > > Therefore, (if the request is not a flush) the RQF_ELVPRIV flag will be
> > > set for the request in blk_mq_submit_bio(), regardless if the request
> > > was submitted to a scheduler, or bypassed the scheduler.
> > >
> > > Later, blk_mq_free_request() checks if the RQF_ELVPRIV flag is set,
> > > if it is, the ops.finish_request callback will be called.
> > >
> > > The problem with this is that the finish_request scheduler callback
> > > will be called for requests that bypassed the scheduler.
> > >
> > > Fix this by calling the scheduler ops.prepare_request callback, and
> > > set the RQF_ELVPRIV flag only immediately before calling the insert
> > > callback.
> >
> > One request could be inserted more than one times, such as requeue,
> > however __blk_mq_alloc_request() is just run once, so is it fine to
> > call ->prepare_request more than one time for same request?
>
> Calling ->prepare_request multiple times is fine.
> All the different I/O schedulers (BFQ, mq-deadline, kyber)
> simply use .prepare_request to clear/set elv->priv to a fixed value.
> >
> > Or I am wondering why not call ->prepare_request when the following
> > check is true?
> >
> > 	if (e && e->type->ops.prepare_request && !op_is_flush(data->cmd_flags) &&
> > 		!blk_op_is_passthrough(data->cmd_flags))
> > 		e->type->ops.prepare_request()
> >
>
>
> That might work, and might be a nicer solution indeed.
>
> If a request got plugged, it will be inserted to the scheduler through
> blk_flush_plug_list() -> blk_mq_flush_plug_list() -> blk_mq_sched_insert_requests()
> which will insert them unconditionally.
> In this case. we know that !op_is_flush() (because if it was, blk_mq_submit_bio()
> would have inserted directly.)
>
>
> If we didn't plug, we do blk_mq_sched_insert_request(), which will add it if
> blk_mq_sched_bypass_insert() returns false:
>
> blk_mq_sched_bypass_insert() is defined as:
>
> 	if ((rq->rq_flags & RQF_FLUSH_SEQ) || blk_rq_is_passthrough(rq))
> 		return true;
> Also in this case. we know that !op_is_flush() (blk_mq_submit_bio() would have
> inserted directly.)
>
>
> So, we could easily add && !blk_op_is_passthrough(data->cmd_flags) to the
> ->prepare_request condition in blk_mq_rq_ctx_init() like you suggested,
> but since the bypass condition also seems to look at RQF_FLUSH_SEQ, wouldn't
> we need to add RQF_FLUSH_SEQ to the condition in blk_mq_rq_ctx_init() as well?
>
> This flag is set after blk_mq_rq_ctx_init(). Are we sure that RQF_FLUSH_SEQ
> flag will only be set for a request which op_is_flush() returned true?
>
> (If so, then only adding && !blk_op_is_passthrough(data->cmd_flags) should
> be fine.)
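
The RQF_FLUSH_SEQ question above can be made concrete with a small
userspace model. This is only a sketch and not kernel code: every model_*
name is hypothetical, and the two predicates are simplified stand-ins for
the alloc-time check suggested in this thread and for the quoted
blk_mq_sched_bypass_insert() condition. It counts requests that would be
marked with scheduler private data at allocation time and still bypass the
scheduler at insert time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for how the block layer classifies a request. */
enum model_op { MODEL_OP_NORMAL, MODEL_OP_FLUSH, MODEL_OP_PASSTHROUGH };

struct model_rq {
	enum model_op op;
	bool flush_seq;	/* models RQF_FLUSH_SEQ */
	bool elvpriv;	/* models RQF_ELVPRIV */
};

/* Alloc-time check suggested in the thread: skip flush and passthrough. */
bool model_wants_prepare(enum model_op op)
{
	return op != MODEL_OP_FLUSH && op != MODEL_OP_PASSTHROUGH;
}

/* Insert-time check, modelled on the quoted blk_mq_sched_bypass_insert(). */
bool model_bypass_insert(const struct model_rq *rq)
{
	return rq->flush_seq || rq->op == MODEL_OP_PASSTHROUGH;
}

/*
 * Enumerate every (op, flush_seq) combination and count the requests that
 * were prepared (elvpriv set) but would bypass the scheduler anyway.
 * If flush_seq_only_on_flush is true, combinations that set flush_seq on a
 * non-flush request are treated as impossible and skipped.
 */
int model_count_mismatches(bool flush_seq_only_on_flush)
{
	int mismatches = 0;

	for (int op = MODEL_OP_NORMAL; op <= MODEL_OP_PASSTHROUGH; op++) {
		for (int fs = 0; fs <= 1; fs++) {
			struct model_rq rq = { .op = op, .flush_seq = fs };

			if (flush_seq_only_on_flush && fs && op != MODEL_OP_FLUSH)
				continue;

			rq.elvpriv = model_wants_prepare(rq.op);
			/* Passthrough requests never get private data here. */
			assert(!(rq.op == MODEL_OP_PASSTHROUGH && rq.elvpriv));

			if (rq.elvpriv && model_bypass_insert(&rq))
				mismatches++;
		}
	}
	return mismatches;
}
```

In this model, model_count_mismatches(true) finds no mismatch, while
model_count_mismatches(false) finds exactly one: a normal request that
later gets flush_seq set. So, within the model, the alloc-time check is
sufficient only if RQF_FLUSH_SEQ is never set on a request for which
op_is_flush() was false. Whether that holds in the kernel is exactly the
open question and is not something this sketch can answer.
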

BTW, what I meant is the following change, is it fine?


diff --git a/block/blk-mq.c b/block/blk-mq.c
index 0a33d16a7298..f98f8cc05644 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -327,20 +327,6 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 
 	data->ctx->rq_dispatched[op_is_sync(data->cmd_flags)]++;
 	refcount_set(&rq->ref, 1);
-
-	if (!op_is_flush(data->cmd_flags)) {
-		struct elevator_queue *e = data->q->elevator;
-
-		rq->elv.icq = NULL;
-		if (e && e->type->ops.prepare_request) {
-			if (e->type->icq_cache)
-				blk_mq_sched_assign_ioc(rq);
-
-			e->type->ops.prepare_request(rq);
-			rq->rq_flags |= RQF_ELVPRIV;
-		}
-	}
-
 	data->hctx->queued++;
 	return rq;
 }
@@ -359,17 +345,25 @@ static struct request *__blk_mq_alloc_request(struct blk_mq_alloc_data *data)
 	if (data->cmd_flags & REQ_NOWAIT)
 		data->flags |= BLK_MQ_REQ_NOWAIT;
 
-	if (e) {
+	if (e && !op_is_flush(data->cmd_flags) &&
+			!blk_op_is_passthrough(data->cmd_flags)) {
 		/*
 		 * Flush/passthrough requests are special and go directly to the
 		 * dispatch list. Don't include reserved tags in the
 		 * limiting, as it isn't useful.
 		 */
-		if (!op_is_flush(data->cmd_flags) &&
-		    !blk_op_is_passthrough(data->cmd_flags) &&
-		    e->type->ops.limit_depth &&
-		    !(data->flags & BLK_MQ_REQ_RESERVED))
+		if (e->type->ops.limit_depth &&
+		    !(data->flags & BLK_MQ_REQ_RESERVED))
 			e->type->ops.limit_depth(data->cmd_flags, data);
+
+		rq->elv.icq = NULL;
+		if (e->type->ops.prepare_request) {
+			if (e->type->icq_cache)
+				blk_mq_sched_assign_ioc(rq);
+
+			e->type->ops.prepare_request(rq);
+			rq->rq_flags |= RQF_ELVPRIV;
+		}
 	}
 
 retry:


Thanks,
Ming