Subject: Re: [PATCH -next v5 4/8] blk-throttle: fix io hung due to config updates
To: Michal Koutný
References: <20220528064330.3471000-1-yukuai3@huawei.com>
 <20220528064330.3471000-5-yukuai3@huawei.com>
 <20220622172621.GA28246@blackbody.suse.cz>
From: Yu Kuai
Date: Thu, 23 Jun 2022 20:27:11 +0800
In-Reply-To: <20220622172621.GA28246@blackbody.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 2022/06/23 1:26, Michal Koutný wrote:
> (Apologies for taking so long before answering.)
>
> On Sat, May 28, 2022 at 02:43:26PM +0800, Yu Kuai wrote:
>> Some simple test:
>> 1)
>> cd /sys/fs/cgroup/blkio/
>> echo $$ > cgroup.procs
>> echo "8:0 2048" > blkio.throttle.write_bps_device
>> {
>>         sleep 2
>>         echo "8:0 1024" > blkio.throttle.write_bps_device
>> } &
>> dd if=/dev/zero of=/dev/sda bs=8k count=1 oflag=direct
>>
>> 2)
>> cd /sys/fs/cgroup/blkio/
>> echo $$ > cgroup.procs
>> echo "8:0 1024" > blkio.throttle.write_bps_device
>> {
>>         sleep 4
>>         echo "8:0 2048" > blkio.throttle.write_bps_device
>> } &
>> dd if=/dev/zero of=/dev/sda bs=8k count=1 oflag=direct
>>
>> test results: io finish time
>>      before this patch    with this patch
>> 1)   10s                  6s
>> 2)   8s                   6s
>
> I agree these are consistent and correct times.
>
> And the new implementation won't make it worse (in terms of delaying a
> bio) than configuring minimal limits from the beginning, AFAICT.
>
>> @@ -801,7 +836,8 @@ static bool tg_with_in_iops_limit(struct throtl_grp *tg, struct bio *bio,
>>
>>          /* Round up to the next throttle slice, wait time must be nonzero */
>>          jiffy_elapsed_rnd = roundup(jiffy_elapsed + 1, tg->td->throtl_slice);
>> -        io_allowed = calculate_io_allowed(iops_limit, jiffy_elapsed_rnd);
>> +        io_allowed = calculate_io_allowed(iops_limit, jiffy_elapsed_rnd) +
>> +                     tg->io_skipped[rw];
>>          if (tg->io_disp[rw] + 1 <= io_allowed) {
>>                  if (wait)
>>                          *wait = 0;
>> @@ -838,7 +874,8 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg, struct bio *bio,
>>                  jiffy_elapsed_rnd = tg->td->throtl_slice;
>>
>>          jiffy_elapsed_rnd = roundup(jiffy_elapsed_rnd, tg->td->throtl_slice);
>> -        bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd);
>> +        bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd) +
>> +                        tg->bytes_skipped[rw];
>>          if (tg->bytes_disp[rw] + bio_size <= bytes_allowed) {
>>                  if (wait)
>>                          *wait = 0;
>>
>
> Here we may allow to dispatch a bio above current slice's
> calculate_bytes_allowed() if bytes_skipped is already >0.

Hi, I don't expect that to happen. For example, if a bio is still
throttled, the old slice is kept with a proper 'bytes_skipped', and the
new wait time is calculated based on (bio_size - bytes_skipped).

After the bio is dispatched (I assume that other bios can't preempt it),
if a new slice is started, 'bytes_skipped' is cleared and there should be
no problem. If the old slice is extended, note that we only wait for
'bio_size - bytes_skipped' bytes, while 'bio_size' bytes are added to
'tg->bytes_disp'. I think this makes sure a new bio won't be dispatched
above the slice.

What do you think?
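
To make the arithmetic concrete, here is a rough sketch of an idealized
model, not the kernel code: a single bio, no slice rounding, and the limit
changed once while the bio is still throttled. The helper
finish_time_secs() and its parameters are made up for illustration; the
inputs are the ones from the two tests quoted earlier in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model (hypothetical helper, not kernel code): one bio of
 * bio_size bytes is throttled at old_bps, and the limit changes to new_bps
 * after change_after seconds. Slice rounding is ignored.
 *
 * carry_skipped == 0: the slice restarts on update, so the whole bio waits
 *                     again at the new limit.
 * carry_skipped != 0: the bytes already waited for are kept as a credit.
 */
uint64_t finish_time_secs(uint64_t bio_size, uint64_t old_bps,
                          uint64_t new_bps, uint64_t change_after,
                          int carry_skipped)
{
    uint64_t skipped = carry_skipped ? old_bps * change_after : 0;

    /* The model assumes the bio is still throttled at the change. */
    if (skipped >= bio_size)
        return change_after;

    return change_after + (bio_size - skipped) / new_bps;
}

/* Checks the model against the finish times reported for tests 1) and 2). */
void check_finish_times(void)
{
    assert(finish_time_secs(8192, 2048, 1024, 2, 0) == 10); /* 1) before */
    assert(finish_time_secs(8192, 2048, 1024, 2, 1) == 6);  /* 1) with   */
    assert(finish_time_secs(8192, 1024, 2048, 4, 0) == 8);  /* 2) before */
    assert(finish_time_secs(8192, 1024, 2048, 4, 1) == 6);  /* 2) with   */
}
```

In both tests the credit at the time of the update is 4096 bytes, so only
the remaining 4096 bytes are waited for at the new limit.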
>
>   bytes_disp + bio_size <= calculate_bytes_allowed() + bytes_skipped
>
> Then on the next update
>
>> [shuffle]
>> +static void __tg_update_skipped(struct throtl_grp *tg, bool rw)
>> +{
>> +        unsigned long jiffy_elapsed = jiffies - tg->slice_start[rw];
>> +        u64 bps_limit = tg_bps_limit(tg, rw);
>> +        u32 iops_limit = tg_iops_limit(tg, rw);
>> +
>> +        if (bps_limit != U64_MAX)
>> +                tg->bytes_skipped[rw] +=
>> +                        calculate_bytes_allowed(bps_limit, jiffy_elapsed) -
>> +                        tg->bytes_disp[rw];
>> +        if (iops_limit != UINT_MAX)
>> +                tg->io_skipped[rw] +=
>> +                        calculate_io_allowed(iops_limit, jiffy_elapsed) -
>> +                        tg->io_disp[rw];
>> +}
>
> the difference(s) here could be negative. bytes_skipped should be
> reduced to account for the additionally dispatched bio.
> This is all unsigned so negative numbers underflow, however, we add them
> again to the unsigned, so thanks to modular arithmetic the result is
> correctly updated bytes_skipped.
>
> Maybe add a comment about this (unsigned) intention?

Of course I can do that.

>
> (But can this happen? The discussed bio would have to outrun another bio
> (the one which defined the current slice_end) but since blk-throttle
> uses queues (FIFO) everywhere this shouldn't really happen. But it's
> good to know this works as intended.)

I can also mention that in a comment.

>
> This patch can have
> Reviewed-by: Michal Koutný
>

Thanks for the review!
Kuai
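
P.S. A small standalone sketch of the unsigned wrap-around discussed above,
in case it helps when wording the comment. It assumes 64-bit unsigned
counters; update_skipped() is a hypothetical stand-in for the bps part of
__tg_update_skipped(), with plain arguments instead of the throtl_grp
fields.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the bytes_skipped update: 'allowed' plays the
 * role of calculate_bytes_allowed() and 'dispatched' of tg->bytes_disp[rw].
 * All values are unsigned, so (allowed - dispatched) wraps modulo 2^64 when
 * dispatched > allowed, and adding it to 'skipped' wraps back again.
 */
uint64_t update_skipped(uint64_t skipped, uint64_t allowed,
                        uint64_t dispatched)
{
    return skipped + (allowed - dispatched);
}

void check_update_skipped(void)
{
    /* Under-dispatched by 1024 bytes: the credit grows to 5120. */
    assert(update_skipped(4096, 2048, 1024) == 5120);
    /* Over-dispatched by 1024 bytes: the credit shrinks to 3072. */
    assert(update_skipped(4096, 2048, 3072) == 3072);
}
```

The second case is the "negative difference" one: the intermediate value
is huge, but the final sum is the expected reduced credit.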