From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Dennis Zhou, Jens Axboe, Sasha Levin, Tejun Heo, Josef Bacik
Subject: [PATCH 5.2 246/413] blk-iolatency: fix STS_AGAIN handling
Date: Wed, 24 Jul 2019 21:18:57 +0200
Message-Id: <20190724191753.146674620@linuxfoundation.org>
In-Reply-To: <20190724191735.096702571@linuxfoundation.org>
References: <20190724191735.096702571@linuxfoundation.org>

[ Upstream commit c9b3007feca018d3f7061f5d5a14cb00766ffe9b ]

The iolatency controller is based on rq_qos. It increments on
rq_qos_throttle() and decrements on either rq_qos_cleanup() or
rq_qos_done_bio(). a3fb01ba5af0 fixes the double accounting issue where
blk_mq_make_request() may call both rq_qos_cleanup() and
rq_qos_done_bio() on REQ_NO_WAIT. So checking STS_AGAIN prevents the
double decrement.

The above works upstream as the only way we can get STS_AGAIN is from
blk_mq_get_request() failing. The STS_AGAIN handling isn't a real
problem as bio_endio() skipping only happens on reserved tag allocation
failures, which can only be caused by driver bugs and already trigger
WARN.

However, the fix creates a not-so-great dependency on how STS_AGAIN can
be propagated. Internally, we (Facebook) carry a patch that kills read
ahead if a cgroup is io congested or a fatal signal is pending. This,
combined with chained bios propagating their bi_status to the parent if
it is not already set, can cause the parent bio to not clean up properly
even though it was successful. This consequently leaks the inflight
counter and can hang all IOs under that blkg.

To nip the adverse interaction early, this removes the rq_qos_cleanup()
callback in iolatency in favor of always cleaning up on the
rq_qos_done_bio() path.

Fixes: a3fb01ba5af0 ("blk-iolatency: only account submitted bios")
Debugged-by: Tejun Heo
Debugged-by: Josef Bacik
Signed-off-by: Dennis Zhou
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin
---
 block/blk-iolatency.c | 51 ++++++++++++-------------------------------
 1 file changed, 14 insertions(+), 37 deletions(-)

diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c
index c91b84bb9d0a..a1eb5e9ac904 100644
--- a/block/blk-iolatency.c
+++ b/block/blk-iolatency.c
@@ -600,10 +600,6 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio)
 	if (!blkg || !bio_flagged(bio, BIO_TRACKED))
 		return;
 
-	/* We didn't actually submit this bio, don't account it. */
-	if (bio->bi_status == BLK_STS_AGAIN)
-		return;
-
 	iolat = blkg_to_lat(bio->bi_blkg);
 	if (!iolat)
 		return;
@@ -622,40 +618,22 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio)
 
 		inflight = atomic_dec_return(&rqw->inflight);
 		WARN_ON_ONCE(inflight < 0);
-		if (iolat->min_lat_nsec == 0)
-			goto next;
-		iolatency_record_time(iolat, &bio->bi_issue, now,
-				      issue_as_root);
-		window_start = atomic64_read(&iolat->window_start);
-		if (now > window_start &&
-		    (now - window_start) >= iolat->cur_win_nsec) {
-			if (atomic64_cmpxchg(&iolat->window_start,
-					window_start, now) == window_start)
-				iolatency_check_latencies(iolat, now);
+		/*
+		 * If bi_status is BLK_STS_AGAIN, the bio wasn't actually
+		 * submitted, so do not account for it.
+		 */
+		if (iolat->min_lat_nsec && bio->bi_status != BLK_STS_AGAIN) {
+			iolatency_record_time(iolat, &bio->bi_issue, now,
+					      issue_as_root);
+			window_start = atomic64_read(&iolat->window_start);
+			if (now > window_start &&
+			    (now - window_start) >= iolat->cur_win_nsec) {
+				if (atomic64_cmpxchg(&iolat->window_start,
+					     window_start, now) == window_start)
+					iolatency_check_latencies(iolat, now);
+			}
 		}
-next:
-		wake_up(&rqw->wait);
-		blkg = blkg->parent;
-	}
-}
-
-static void blkcg_iolatency_cleanup(struct rq_qos *rqos, struct bio *bio)
-{
-	struct blkcg_gq *blkg;
-
-	blkg = bio->bi_blkg;
-	while (blkg && blkg->parent) {
-		struct rq_wait *rqw;
-		struct iolatency_grp *iolat;
-
-		iolat = blkg_to_lat(blkg);
-		if (!iolat)
-			goto next;
-
-		rqw = &iolat->rq_wait;
-		atomic_dec(&rqw->inflight);
 		wake_up(&rqw->wait);
-next:
 		blkg = blkg->parent;
 	}
 }
@@ -671,7 +649,6 @@ static void blkcg_iolatency_exit(struct rq_qos *rqos)
 
 static struct rq_qos_ops blkcg_iolatency_ops = {
 	.throttle = blkcg_iolatency_throttle,
-	.cleanup = blkcg_iolatency_cleanup,
 	.done_bio = blkcg_iolatency_done_bio,
 	.exit = blkcg_iolatency_exit,
 };
-- 
2.20.1
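
The inflight leak described in the commit message above can be modeled in
userspace. The following is only a simplified sketch, not kernel code: every
name prefixed with model_ is hypothetical and invented for illustration, the
blkg hierarchy walk and latency recording are omitted, and the scenario
assumes (as the commit message describes) that a successfully submitted
parent bio inherits STS_AGAIN from a chained child while no cleanup callback
runs for it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of these are kernel APIs. */
enum model_status { MODEL_STS_OK, MODEL_STS_AGAIN };

struct model_bio {
	enum model_status status;
	bool tracked;		/* set once the throttle path counted this bio */
};

struct model_group {
	int inflight;		/* plays the role of rqw->inflight */
};

/* Submission path: every tracked bio bumps the counter exactly once. */
static void model_throttle(struct model_group *grp, struct model_bio *bio)
{
	grp->inflight++;
	bio->tracked = true;
}

/*
 * Completion path before the patch: return early on STS_AGAIN and rely on
 * a separate cleanup callback to perform the decrement.
 */
static void model_done_bio_old(struct model_group *grp, struct model_bio *bio)
{
	if (!bio->tracked)
		return;
	if (bio->status == MODEL_STS_AGAIN)
		return;
	grp->inflight--;
}

/*
 * Completion path after the patch: always decrement. The status would only
 * gate latency recording, which this model leaves out.
 */
static void model_done_bio_new(struct model_group *grp, struct model_bio *bio)
{
	if (!bio->tracked)
		return;
	assert(grp->inflight > 0);	/* loosely mirrors WARN_ON_ONCE(inflight < 0) */
	grp->inflight--;
}

/*
 * A parent bio is submitted, later inherits STS_AGAIN from a chained child,
 * and completes without any cleanup callback being invoked.
 * Returns the inflight count left behind.
 */
static int model_leak_after_propagated_again(bool use_new)
{
	struct model_group grp = { .inflight = 0 };
	struct model_bio parent = { .status = MODEL_STS_OK, .tracked = false };

	model_throttle(&grp, &parent);
	parent.status = MODEL_STS_AGAIN;	/* propagated from the child */

	if (use_new)
		model_done_bio_new(&grp, &parent);
	else
		model_done_bio_old(&grp, &parent);

	return grp.inflight;
}
```

In this model, model_leak_after_propagated_again(false) leaves the counter
at 1 (the leak), while model_leak_after_propagated_again(true) brings it
back to 0, which is the effect of always cleaning up on the done_bio path.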