Received: by 10.223.164.202 with SMTP id h10csp2836730wrb; Sun, 12 Nov 2017 20:08:10 -0800 (PST) X-Google-Smtp-Source: AGs4zMbCZqKabVaH9xBlMx7W1p8SjQbP3+vq3eVbedgMh530aq8UtgtDFoPG8nWB1b3TwJQvGAf3 X-Received: by 10.99.127.14 with SMTP id a14mr3342346pgd.315.1510546090193; Sun, 12 Nov 2017 20:08:10 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1510546090; cv=none; d=google.com; s=arc-20160816; b=VWNvYXgB6e/4Eg41o47tD9JejWa2fRD4QdMt39Q2pefQc8l48hqUWfI+P5TVRSkG49 R4F0WTIaUktfQwT3Ol5Ej1g1hsqby2mJYBUVSThxJj6L5u/ms/0Gzli4iBjL5xlVLTHa 88l4QQC9NHuCXMFE5ZYvx1fyo8NZUEPSII85miDxNDpXuJAdJJx/Sij8k14V6K7iILtT AjDQ90lNy0Q8Jgc+SVf4sh2c1LZkAJekN68AY4+J1EUMpqzoghYDKcnTgRVzDU9h7PVJ EqKAAUr3vTJbjokvmaRyLC+zRPEfbmEhsPspPTPfrPptcuS256kNTxMuqOIOp5LW4S4y XrzA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dmarc-filter:arc-authentication-results; bh=qkSyNYndAPb0ewwD/XsaY51p1Dx0cwtTHtBSnIbbVYw=; b=Hq9SC/twwLcodBsXWqYNDubVvyGVa/L1LIoBrAogFFXmyYdM7zyjbye5sx2ba5Pl/T XsVVoXREP+kA5eZhk8Pl0651Yvb4T2gSszF+UKNAW9xMDsVi8A86y44E15dfYIMT+t+S 2nyK7TADJmZb84sBsKWmPB0f2Hq49aw9McjYSS24J7zVm/0uh5yfkV6lt7rG5WPusNkO z6HevKnWsdrlAuuDQd0z/+aE7vVhGGqVomL+0TeZOt+liZNdeS+hDll7AAHp9bTuklwF 98eLx3pnoRhgvXxUhIvzhEbZnA+Nl6qponK+Z78y17gs1I3m+Hss3GY94mycMi9/gocC QhQw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id n9si10863239pgu.602.2017.11.12.20.07.57; Sun, 12 Nov 2017 20:08:10 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751751AbdKMEHV (ORCPT + 87 others); Sun, 12 Nov 2017 23:07:21 -0500 Received: from mail.kernel.org ([198.145.29.99]:46882 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751404AbdKMEHT (ORCPT ); Sun, 12 Nov 2017 23:07:19 -0500 Received: from kernel.org (unknown [199.201.64.4]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 955C821904; Mon, 13 Nov 2017 04:07:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 955C821904 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=shli@kernel.org Date: Sun, 12 Nov 2017 20:07:16 -0800 From: Shaohua Li To: Tejun Heo Cc: axboe@kernel.dk, linux-kernel@vger.kernel.org, kernel-team@fb.com, lizefan@huawei.com, hannes@cmpxchg.org, cgroups@vger.kernel.org, guro@fb.com Subject: Re: [PATCH 7/7] blk-throtl: don't throttle the same IO multiple times Message-ID: <20171113040716.kaheegc4qub42n6z@kernel.org> References: <20171112222613.3613362-1-tj@kernel.org> <20171112222613.3613362-8-tj@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20171112222613.3613362-8-tj@kernel.org> User-Agent: NeoMutt/20170609 (1.8.3) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, Nov 12, 2017 at 02:26:13PM -0800, Tejun Heo wrote: > BIO_THROTTLED is used to mark already throttled bios so that a bio > doesn't get throttled multiple times. The flag gets set when the bio > starts getting dispatched from blk-throtl and cleared when it leaves > blk-throtl. > > Unfortunately, this doesn't work when the request_queue decides to > split or requeue the bio and ends up throttling the same IO multiple > times. This condition gets easily triggered and often leads to > multiple times lower bandwidth limit being enforced than configured. > > Fix it by always setting BIO_THROTTLED for bios recursing to the same > request_queue and clearing only when a bio leaves the current level. > > Signed-off-by: Tejun Heo > --- > block/blk-core.c | 10 +++++++--- > block/blk-throttle.c | 8 -------- > include/linux/blk-cgroup.h | 20 ++++++++++++++++++++ > 3 files changed, 27 insertions(+), 11 deletions(-) > > diff --git a/block/blk-core.c b/block/blk-core.c > index ad23b96..f0e3157 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -2216,11 +2216,15 @@ blk_qc_t generic_make_request(struct bio *bio) > */ > bio_list_init(&lower); > bio_list_init(&same); > - while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL) > - if (q == bio->bi_disk->queue) > + while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL) { > + if (q == bio->bi_disk->queue) { > + blkcg_bio_repeat_q_level(bio); > bio_list_add(&same, bio); > - else > + } else { > + blkcg_bio_leave_q_level(bio); > bio_list_add(&lower, bio); > + } > + } Hi Tejun, Thanks for looking into this while I was absence. I don't understand how this works. Assume a bio will be splitted into 2 small bios. In generic_make_request, we charge the whole bio. 'q->make_request_fn' will dispatch the first small bio, and call generic_make_request for the second small bio. Then generic_make_request charge the second small bio and we add the second small bio to current->bio_list[0] (please check the order). In above code the patch changed, we pop the second small bio and set BIO_THROTTLED for it. But this is already too late, because generic_make_request already charged the second small bio. Did you look at my original patch (https://marc.info/?l=linux-block&m=150791825327628&w=2), anything wrong? Thanks, Shaohua > /* now assemble so we handle the lowest level first */ > bio_list_merge(&bio_list_on_stack[0], &lower); > bio_list_merge(&bio_list_on_stack[0], &same); > diff --git a/block/blk-throttle.c b/block/blk-throttle.c > index 1e6916b..76579b2 100644 > --- a/block/blk-throttle.c > +++ b/block/blk-throttle.c > @@ -2223,14 +2223,6 @@ bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg, > out_unlock: > spin_unlock_irq(q->queue_lock); > out: > - /* > - * As multiple blk-throtls may stack in the same issue path, we > - * don't want bios to leave with the flag set. Clear the flag if > - * being issued. > - */ > - if (!throttled) > - bio_clear_flag(bio, BIO_THROTTLED); > - > #ifdef CONFIG_BLK_DEV_THROTTLING_LOW > if (throttled || !td->track_bio_latency) > bio->bi_issue_stat.stat |= SKIP_LATENCY; > diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h > index f2f9691..bed0416 100644 > --- a/include/linux/blk-cgroup.h > +++ b/include/linux/blk-cgroup.h > @@ -675,9 +675,29 @@ static inline void blkg_rwstat_add_aux(struct blkg_rwstat *to, > #ifdef CONFIG_BLK_DEV_THROTTLING > extern bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg, > struct bio *bio); > + > +static inline void blkcg_bio_repeat_q_level(struct bio *bio) > +{ > + /* > + * @bio is queued while processing a previous bio which was already > + * throttled. Don't throttle it again. > + */ > + bio_set_flag(bio, BIO_THROTTLED); > +} > + > +static inline void blkcg_bio_leave_q_level(struct bio *bio) > +{ > + /* > + * @bio may get throttled at multiple q levels, clear THROTTLED > + * when leaving the current one. > + */ > + bio_clear_flag(bio, BIO_THROTTLED); > +} > #else > static inline bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg, > struct bio *bio) { return false; } > +static inline void blkcg_bio_repeat_q_level(struct bio *bio) { } > +static inline void biocg_bio_leave_q_level(struct bio *bio) { } > #endif > > static inline struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg, > -- > 2.9.5 > From 1583901042028495922@xxx Sun Nov 12 22:29:03 +0000 2017 X-GM-THRID: 1583901042028495922 X-Gmail-Labels: Inbox,Category Forums,HistoricalUnread