Subject: Re: [PATCH BUGFIX V2 1/1] block, bfq: add requeue-request hook
To: Paolo Valente
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 ulf.hansson@linaro.org, broonie@kernel.org, linus.walleij@linaro.org,
 bfq-iosched@googlegroups.com, oleksandr@natalenko.name,
 alban.browaeys@gmail.com, ming.lei@redhat.com, ivan@ludios.org,
 169364@studenti.unimore.it, holger@applied-asynchrony.com, efault@gmx.de,
 Serena Ziviani
References: <20180207180050.5639-1-paolo.valente@linaro.org>
 <20180207180050.5639-2-paolo.valente@linaro.org>
From: Jens Axboe
Message-ID: <4815ec99-8c92-4c3b-c6c0-33425a6f2bc6@kernel.dk>
Date: Wed, 7 Feb 2018 11:06:18 -0700
In-Reply-To: <20180207180050.5639-2-paolo.valente@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2/7/18 11:00 AM, Paolo Valente wrote:
> Commit 'a6a252e64914 ("blk-mq-sched: decide how to handle flush rq via
> RQF_FLUSH_SEQ")' makes all non-flush re-prepared requests for a device
> be re-inserted into the active I/O scheduler for that device. As a
> consequence, I/O schedulers may get the same request inserted again,
> even several times, without a finish_request invoked on that request
> before each re-insertion.
>
> This fact is the cause of the failure reported in [1]. For an I/O
> scheduler, every re-insertion of the same re-prepared request is
> equivalent to the insertion of a new request. For schedulers like
> mq-deadline or kyber, this fact causes no harm. In contrast, it
> confuses a stateful scheduler like BFQ, which keeps state for an I/O
> request, until the finish_request hook is invoked on the request. In
> particular, BFQ may get stuck, waiting forever for the number of
> request dispatches, of the same request, to be balanced by an equal
> number of request completions (while there will be one completion for
> that request). In this state, BFQ may refuse to serve I/O requests
> from other bfq_queues. The hang reported in [1] then follows.
>
> However, the above re-prepared requests undergo a requeue, thus the
> requeue_request hook of the active elevator is invoked for these
> requests, if set. This commit then addresses the above issue by
> properly implementing the hook requeue_request in BFQ.
>
> [1] https://marc.info/?l=linux-block&m=151211117608676
>
> Reported-by: Ivan Kozik
> Reported-by: Alban Browaeys
> Tested-by: Mike Galbraith
> Signed-off-by: Paolo Valente
> Signed-off-by: Serena Ziviani
> ---
>  block/bfq-iosched.c | 109 ++++++++++++++++++++++++++++++++++++++++------------
>  1 file changed, 84 insertions(+), 25 deletions(-)
>
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 47e6ec7427c4..21e6b9e45638 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -3823,24 +3823,26 @@ static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
>  		}
>
>  		/*
> -		 * We exploit the bfq_finish_request hook to decrement
> -		 * rq_in_driver, but bfq_finish_request will not be
> -		 * invoked on this request. So, to avoid unbalance,
> -		 * just start this request, without incrementing
> -		 * rq_in_driver. As a negative consequence,
> -		 * rq_in_driver is deceptively lower than it should be
> -		 * while this request is in service. This may cause
> -		 * bfq_schedule_dispatch to be invoked uselessly.
> +		 * We exploit the bfq_finish_requeue_request hook to
> +		 * decrement rq_in_driver, but
> +		 * bfq_finish_requeue_request will not be invoked on
> +		 * this request. So, to avoid unbalance, just start
> +		 * this request, without incrementing rq_in_driver. As
> +		 * a negative consequence, rq_in_driver is deceptively
> +		 * lower than it should be while this request is in
> +		 * service. This may cause bfq_schedule_dispatch to be
> +		 * invoked uselessly.
>  		 *
>  		 * As for implementing an exact solution, the
> -		 * bfq_finish_request hook, if defined, is probably
> -		 * invoked also on this request. So, by exploiting
> -		 * this hook, we could 1) increment rq_in_driver here,
> -		 * and 2) decrement it in bfq_finish_request. Such a
> -		 * solution would let the value of the counter be
> -		 * always accurate, but it would entail using an extra
> -		 * interface function. This cost seems higher than the
> -		 * benefit, being the frequency of non-elevator-private
> +		 * bfq_finish_requeue_request hook, if defined, is
> +		 * probably invoked also on this request. So, by
> +		 * exploiting this hook, we could 1) increment
> +		 * rq_in_driver here, and 2) decrement it in
> +		 * bfq_finish_requeue_request. Such a solution would
> +		 * let the value of the counter be always accurate,
> +		 * but it would entail using an extra interface
> +		 * function. This cost seems higher than the benefit,
> +		 * being the frequency of non-elevator-private
>  		 * requests very low.
>  		 */
>  		goto start_rq;
> @@ -4515,6 +4517,8 @@ static inline void bfq_update_insert_stats(struct request_queue *q,
>  					   unsigned int cmd_flags) {}
>  #endif
>
> +static void bfq_prepare_request(struct request *rq, struct bio *bio);
> +
>  static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>  			       bool at_head)
>  {
> @@ -4541,6 +4545,20 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>  		else
>  			list_add_tail(&rq->queuelist, &bfqd->dispatch);
>  	} else {
> +		if (!bfqq) {
> +			/*
> +			 * This should never happen. Most likely rq is
> +			 * a requeued regular request, being
> +			 * re-inserted without being first
> +			 * re-prepared. Do a prepare, to avoid
> +			 * failure.
> +			 */
> +			pr_warn("Regular request associated with no queue");
> +			WARN_ON(1);
> +			bfq_prepare_request(rq, rq->bio);
> +			bfqq = RQ_BFQQ(rq);

This reads kind of strange. "Regular request not associated with a queue"
would be cleaner. Do we really need the message? Why not just make the
above:

	if (WARN_ON_ONCE(!bfqq)) {
		bfq_prepare_request(rq, rq->bio);
		bfqq = RQ_BFQQ(rq);
	}

which is much simpler, kills the useless message, and avoids constant
spew in case it does trigger.

-- 
Jens Axboe
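
For readers unfamiliar with the idiom, the WARN_ON_ONCE() suggestion above can be
approximated in userspace roughly as follows. This is only a sketch under stated
assumptions, not the kernel macro or the BFQ code: warn_on_once_sketch(),
warn_count, struct fake_rq, fake_prepare() and fake_insert() are all
hypothetical names, and the once-only flag is passed in explicitly, whereas the
kernel macro manages that state itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counter: how many warnings were actually printed. */
static int warn_count;

/*
 * Report the condition at most once per flag, but always return whether
 * the condition held, so the caller can still take its fallback path.
 */
static int warn_on_once_sketch(int cond, int *warned, const char *where)
{
	if (cond && !*warned) {
		*warned = 1;
		warn_count++;
		fprintf(stderr, "WARNING: unexpected state in %s\n", where);
	}
	return cond != 0;
}

/* Hypothetical stand-in for a request and its per-request queue pointer. */
struct fake_rq {
	void *queue;
};

static int dummy_queue;

/* Hypothetical stand-in for the prepare step: attach a queue. */
static void fake_prepare(struct fake_rq *rq)
{
	assert(rq != NULL);
	rq->queue = &dummy_queue;
}

/* Returns 1 if the fallback (re-prepare) path was taken, 0 otherwise. */
static int fake_insert(struct fake_rq *rq)
{
	static int warned; /* one flag for this call site */

	if (warn_on_once_sketch(rq->queue == NULL, &warned, __func__)) {
		fake_prepare(rq);
		return 1;
	}
	return 0;
}
```

Calling fake_insert() on two unprepared requests repairs both of them but
prints only one warning, which is the "avoids constant spew" property being
asked for.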