Received: by 10.223.176.5 with SMTP id f5csp1112218wra; Wed, 7 Feb 2018 12:55:56 -0800 (PST) X-Google-Smtp-Source: AH8x224sqtkPYKKaaJpLCg2ARfKm2mmv2jOVlBBTKbodJQnMlJsrzT469hm2zFExuWye/k+lI2OM X-Received: by 2002:a17:902:5582:: with SMTP id g2-v6mr7052739pli.4.1518036956784; Wed, 07 Feb 2018 12:55:56 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518036956; cv=none; d=google.com; s=arc-20160816; b=cF53RcJl4aJx/yjtm6uCzqhORKPSyNbQIhsmUbJbttv/5iEPxDMJhiA3CWMqDHaDVU xRButJ75qTSO8+UIhvr1/H5/Qa/tbiRf2x2Ox2O0LAL1XxZzkxlZO5TQjrMXvWFs6wRt HFI6W7pIOHjpR/+a2HkXvTGCQToEY8c0rNRV36v03oNZHf7tZL8A+96Nq5jyaoQPrMHg OuYGrKUsSDgr/o/Ko7WS4hljuaMD4lSbK0zx29ZfJSe0yw0PfNKcETqHvKBiuWYszGCX 9Vx0QkvYNwhEQbAEqFNCcjjRcJzLB/H6k7Vpdkljrw5npc14eQ85DUIP6pp/zzTWquAF p7Dw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:to:references:message-id :content-transfer-encoding:cc:date:in-reply-to:from:subject :mime-version:dkim-signature:arc-authentication-results; bh=hfRonWlK4HLg7ORTvaCVf2rtIYoqWM8kAwnEApZNcIM=; b=BvXuH0oek8V/tK/PvI1cPltZjgqxPlNU8V76ueNu4L61w9XwCbr9vslxCZ1yGIYIOv Ui2/tI4pDskfP7S37dGxcG5PMOq77bVTR4e0/Up0ngHPS4geFKMAS/uGBQbiWEUwPzxZ yMlnjTDGkRGDMmC1POcYHOiYZavkmieMAOkZbuDPjx7qt37KiOCO4nRVfy6mgcutwwhI so+BzsNy0nkmthYnK95kbU4oR0exVzlarHHigW4YAZ5W5Kc1i3ZVl4Ks1ntFM8q9Mzt2 I73PC1PUtQAYIdy1VJdp26RvP3q4Ymanm8xpoZiCpBw5wYgfacBNragi3Wit3ED3OVVE ixHA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=Dn1KAe2V; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i1si1043521pgr.255.2018.02.07.12.55.42; Wed, 07 Feb 2018 12:55:56 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=Dn1KAe2V; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932067AbeBGUxm (ORCPT + 99 others); Wed, 7 Feb 2018 15:53:42 -0500 Received: from mail-wm0-f65.google.com ([74.125.82.65]:52854 "EHLO mail-wm0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754327AbeBGUxj (ORCPT ); Wed, 7 Feb 2018 15:53:39 -0500 Received: by mail-wm0-f65.google.com with SMTP id g1so5620903wmg.2 for ; Wed, 07 Feb 2018 12:53:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=hfRonWlK4HLg7ORTvaCVf2rtIYoqWM8kAwnEApZNcIM=; b=Dn1KAe2VcJXofdoS63sBLZgqO++8pQpAO/gd4Dm9scjPRGNpcyMwKaCC4YwzkCphtT aL81zuIy1sWKBi68V6AU3vJs9wpx/JIlRjwBWAqNRkUsBqcDE9YDWmkdjV2yQbezQx6L /VC3weK2FqRwd3ZrCa4uUCBh4K9r0qcdUkx3w= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=hfRonWlK4HLg7ORTvaCVf2rtIYoqWM8kAwnEApZNcIM=; b=c8Fz9Gj5WZrnFfPinnKxLodDG2GrRZVuX7o6yA7O5bPf4RdijXrjjXXaitKscRoWiD jaDz4/N0KYhAV8EVyH1OWUt9EpquDBmOtwYrsBMeGvAq/ai++cEmEM/5d8fYkknTarz6 Xoub/Z/jt6+sQmqgvkoQ4xDThcyooTJVvzyxk5/B2VNQprRC8uTXLuOqLISmk4EAmG6e muj8pMQJmDcKhnFP4G1gipZHo/4TH7ra/oUvxZmwVwCRjcC5iWFaH/1NOKAua0j0FI59 JxOJHzbYtsEgZ8FYe8U7nMjj5OgBVUtl3H9RY19B1GGCg6RBAUsxzFVi6EYaHZSRNDvF Ggsw== X-Gm-Message-State: APf1xPAHHxXZIgAvlFLVpuK3I9O+AC8KgMAVb6dqkwH3eXFceSWgIlw1 uPqmozDWyYaLPs/8NdK6xSmCkA== X-Received: by 10.28.107.134 with SMTP id a6mr5871619wmi.136.1518036818079; Wed, 07 Feb 2018 12:53:38 -0800 (PST) Received: from [192.168.0.102] (146-241-58-49.dyn.eolo.it. [146.241.58.49]) by smtp.gmail.com with ESMTPSA id w133sm2319013wmg.5.2018.02.07.12.53.36 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Feb 2018 12:53:36 -0800 (PST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 11.2 \(3445.5.20\)) Subject: Re: [PATCH BUGFIX V2 1/1] block, bfq: add requeue-request hook From: Paolo Valente In-Reply-To: <4815ec99-8c92-4c3b-c6c0-33425a6f2bc6@kernel.dk> Date: Wed, 7 Feb 2018 21:53:35 +0100 Cc: linux-block , Linux Kernel Mailing List , Ulf Hansson , Mark Brown , Linus Walleij , 'Paolo Valente' via bfq-iosched , Oleksandr Natalenko , Alban Browaeys , Ming Lei , ivan@ludios.org, 169364@studenti.unimore.it, holger@applied-asynchrony.com, efault@gmx.de, Serena Ziviani Content-Transfer-Encoding: quoted-printable Message-Id: <616D9096-8E3B-4C1E-BBDA-182A854197FB@linaro.org> References: <20180207180050.5639-1-paolo.valente@linaro.org> <20180207180050.5639-2-paolo.valente@linaro.org> <4815ec99-8c92-4c3b-c6c0-33425a6f2bc6@kernel.dk> To: Jens Axboe X-Mailer: Apple Mail (2.3445.5.20) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > Il giorno 07 feb 2018, alle ore 19:06, Jens Axboe ha = scritto: >=20 > On 2/7/18 11:00 AM, Paolo Valente wrote: >> Commit 'a6a252e64914 ("blk-mq-sched: decide how to handle flush rq = via >> RQF_FLUSH_SEQ")' makes all non-flush re-prepared requests for a = device >> be re-inserted into the active I/O scheduler for that device. As a >> consequence, I/O schedulers may get the same request inserted again, >> even several times, without a finish_request invoked on that request >> before each re-insertion. >>=20 >> This fact is the cause of the failure reported in [1]. For an I/O >> scheduler, every re-insertion of the same re-prepared request is >> equivalent to the insertion of a new request. For schedulers like >> mq-deadline or kyber, this fact causes no harm. In contrast, it >> confuses a stateful scheduler like BFQ, which keeps state for an I/O >> request, until the finish_request hook is invoked on the request. In >> particular, BFQ may get stuck, waiting forever for the number of >> request dispatches, of the same request, to be balanced by an equal >> number of request completions (while there will be one completion for >> that request). In this state, BFQ may refuse to serve I/O requests >> from other bfq_queues. The hang reported in [1] then follows. >>=20 >> However, the above re-prepared requests undergo a requeue, thus the >> requeue_request hook of the active elevator is invoked for these >> requests, if set. This commit then addresses the above issue by >> properly implementing the hook requeue_request in BFQ. >>=20 >> [1] https://marc.info/?l=3Dlinux-block&m=3D151211117608676 >>=20 >> Reported-by: Ivan Kozik >> Reported-by: Alban Browaeys >> Tested-by: Mike Galbraith >> Signed-off-by: Paolo Valente >> Signed-off-by: Serena Ziviani >> --- >> block/bfq-iosched.c | 109 = ++++++++++++++++++++++++++++++++++++++++------------ >> 1 file changed, 84 insertions(+), 25 deletions(-) >>=20 >> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c >> index 47e6ec7427c4..21e6b9e45638 100644 >> --- a/block/bfq-iosched.c >> +++ b/block/bfq-iosched.c >> @@ -3823,24 +3823,26 @@ static struct request = *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx) >> } >>=20 >> /* >> - * We exploit the bfq_finish_request hook to decrement >> - * rq_in_driver, but bfq_finish_request will not be >> - * invoked on this request. So, to avoid unbalance, >> - * just start this request, without incrementing >> - * rq_in_driver. As a negative consequence, >> - * rq_in_driver is deceptively lower than it should be >> - * while this request is in service. This may cause >> - * bfq_schedule_dispatch to be invoked uselessly. >> + * We exploit the bfq_finish_requeue_request hook to >> + * decrement rq_in_driver, but >> + * bfq_finish_requeue_request will not be invoked on >> + * this request. So, to avoid unbalance, just start >> + * this request, without incrementing rq_in_driver. As >> + * a negative consequence, rq_in_driver is deceptively >> + * lower than it should be while this request is in >> + * service. This may cause bfq_schedule_dispatch to be >> + * invoked uselessly. >> * >> * As for implementing an exact solution, the >> - * bfq_finish_request hook, if defined, is probably >> - * invoked also on this request. So, by exploiting >> - * this hook, we could 1) increment rq_in_driver here, >> - * and 2) decrement it in bfq_finish_request. Such a >> - * solution would let the value of the counter be >> - * always accurate, but it would entail using an extra >> - * interface function. This cost seems higher than the >> - * benefit, being the frequency of non-elevator-private >> + * bfq_finish_requeue_request hook, if defined, is >> + * probably invoked also on this request. So, by >> + * exploiting this hook, we could 1) increment >> + * rq_in_driver here, and 2) decrement it in >> + * bfq_finish_requeue_request. Such a solution would >> + * let the value of the counter be always accurate, >> + * but it would entail using an extra interface >> + * function. This cost seems higher than the benefit, >> + * being the frequency of non-elevator-private >> * requests very low. >> */ >> goto start_rq; >> @@ -4515,6 +4517,8 @@ static inline void = bfq_update_insert_stats(struct request_queue *q, >> unsigned int cmd_flags) {} >> #endif >>=20 >> +static void bfq_prepare_request(struct request *rq, struct bio = *bio); >> + >> static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct = request *rq, >> bool at_head) >> { >> @@ -4541,6 +4545,20 @@ static void bfq_insert_request(struct = blk_mq_hw_ctx *hctx, struct request *rq, >> else >> list_add_tail(&rq->queuelist, &bfqd->dispatch); >> } else { >> + if (!bfqq) { >> + /* >> + * This should never happen. Most likely rq is >> + * a requeued regular request, being >> + * re-inserted without being first >> + * re-prepared. Do a prepare, to avoid >> + * failure. >> + */ >> + pr_warn("Regular request associated with no = queue"); >> + WARN_ON(1); >> + bfq_prepare_request(rq, rq->bio); >> + bfqq =3D RQ_BFQQ(rq); >=20 > This reads kind of strange. "Regular request not associated with a > queue" would be cleaner. Do we really need the message? Why not just > make the above: >=20 > if (WARN_ON_ONCE(!bfqq)) { > bfq_prepare_request(rq, rq->bio); > bfqq =3D RQ_BFQQ(rq); > } >=20 > which is much simpler, kills the useless message, and avoids constant > spew in case it does trigger. >=20 I added that message because I thought that just a warning on a !bfqq would have told nothing to a user. But probably that message is about as enigmatic and useless. And I went for a WARN_ON, because I expect this anomaly to never happen, so the number of warning would have provided information too. But, also in this case, I guess cons would be more than pros. Anyway, ok to your recommendation. Thanks, Paolo > --=20 > Jens Axboe