Subject: Re: [PATCH -next v7 2/3] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'
From: Yu Kuai
To: Paolo Valente
CC: Jan Kara, Jens Axboe, Tejun Heo, linux-block, LKML
Date: Tue, 31 May 2022 17:24:05 +0800
Message-ID: <8ffa050c-1254-0974-1457-4ce4cb39dcb4@huawei.com>
In-Reply-To: <81214347-3806-4F54-B60F-3E5A1A5EC84D@unimore.it>

On 2022/05/31 17:19, Paolo Valente wrote:
>
>
>> On 31 May 2022, at 11:06, Yu Kuai wrote:
>>
>> On 2022/05/31 16:36, Paolo VALENTE wrote:
>>>> On 30 May 2022, at 10:40, Yu Kuai wrote:
>>>>
>>>> On 2022/05/30 16:34, Yu Kuai wrote:
>>>>> On 2022/05/30 16:10, Paolo Valente wrote:
>>>>>>
>>>>>>
>>>>>>> On 28 May 2022, at 11:50, Yu Kuai wrote:
>>>>>>>
>>>>>>> Currently, bfq can't handle sync io concurrently as long as they
>>>>>>> are not issued from root group.
>>>>>>> This is because
>>>>>>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>>>>>>> bfq_asymmetric_scenario().
>>>>>>>
>>>>>>> The way that bfqg is counted into 'num_groups_with_pending_reqs':
>>>>>>>
>>>>>>> Before this patch:
>>>>>>> 1) root group will never be counted.
>>>>>>> 2) Count if bfqg or its child bfqgs have pending requests.
>>>>>>> 3) Don't count if bfqg and its child bfqgs complete all the requests.
>>>>>>>
>>>>>>> After this patch:
>>>>>>> 1) root group is counted.
>>>>>>> 2) Count if bfqg has at least one bfqq that is marked busy.
>>>>>>> 3) Don't count if bfqg doesn't have any busy bfqqs.
>>>>>>
>>>>>> Unfortunately, I see a last problem here. I see a double change:
>>>>>> (1) a bfqg is now counted only as a function of the state of its child
>>>>>> queues, and not also of its child bfqgs
>>>>>> (2) the state considered for counting a bfqg moves from having pending
>>>>>> requests to having busy queues
>>>>>>
>>>>>> I'm ok with (1), which is a good catch (you already explained
>>>>>> the idea to me some time ago IIRC).
>>>>>>
>>>>>> Yet I fear that (2) is not ok. A bfqq can become non busy even if it
>>>>>> still has in-flight I/O, i.e. I/O being served in the drive. The
>>>>>> weight of such a bfqq must still be considered in the weights_tree,
>>>>>> and the group containing such a queue must still be counted when
>>>>>> checking whether the scenario is asymmetric. Otherwise service
>>>>>> guarantees are broken. The reason is that, if a scenario is deemed
>>>>>> symmetric because in-flight I/O is not taken into account, then idling
>>>>>> will not be performed to protect some bfqq, and in-flight I/O may
>>>>>> steal bandwidth from that bfqq in an uncontrolled way.
>>>>> Hi, Paolo
>>>>> Thanks for your explanation.
>>>>> My original thought was to use weights_tree insertion/removal; however,
>>>>> Jan convinced me that using bfq_add/del_bfqq_busy() is ok.
>>>>> From what I see, when bfqq dispatches the last request,
>>>>> bfq_del_bfqq_busy() will not be called from __bfq_bfqq_expire() if
>>>>> idling is needed, and it will be delayed until such bfqq gets scheduled
>>>>> as in-service queue again. Which means the weight of such bfqq should
>>>>> still be considered in the weights_tree.
>>>>> I also ran some tests on null_blk with "irqmode=2
>>>>> completion_nsec=100000000(100ms) hw_queue_depth=1", and tests show
>>>>> that service guarantees are still preserved on a slow device.
>>>>> Do you think this is strong enough to cover your concern?
>>> Unfortunately it is not. Your very argument is what made me believe
>>> that considering busy queues was enough, in the first place. But, as
>>> I found out, the problem is caused by the queues that do not enjoy
>>> idling. With your patch (as well as in my initial version) they are
>>> not counted when they remain without requests queued. And this makes
>>> asymmetric scenarios be considered erroneously as symmetric. The
>>> consequence is that idling gets switched off when it had to be kept
>>> on, and control over bandwidth is lost for the victim in-service queues.
>>
>> Hi, Paolo
>>
>> Thanks for your explanation, are you thinking that if bfqq doesn't enjoy
>> idling, then such bfqq will clear busy after dispatching the last
>> request?
>>
>> Please kindly correct me if I'm wrong in the following process:
>>
>> If more than one bfqg is activated, then bfqqs that are
>> not enjoying idle are still left busy after dispatching the last
>> request.
>>
>> details in __bfq_bfqq_expire:
>>
>>     if (RB_EMPTY_ROOT(&bfqq->sort_list) &&
>>         !(reason == BFQQE_PREEMPTED &&
>>           idling_needed_for_service_guarantees(bfqd, bfqq))) {
>> -> idling_needed_for_service_guarantees will always return true,
>
> It returns true only if the scenario is asymmetric. Not counting bfqqs
> with in-flight requests makes an asymmetric scenario be considered
> wrongly symmetric.
> See function bfq_asymmetric_scenario().

Hi,

Yes, with this patchset, if more than one bfqg is activated (contains a
busy bfqq), bfq_asymmetric_scenario() will return true:

bfq_asymmetric_scenario()
	return varied_queue_weights || multiple_classes_busy
#ifdef CONFIG_BFQ_GROUP_IOSCHED
	|| bfqd->num_groups_with_busy_queues > 1
#endif

From what I see, bfqd->num_groups_with_busy_queues > 1 is always true...
>
> Paolo
>
>> bfqq (whether or not it enjoys idling) will stay busy.
>>     if (bfqq->dispatched == 0)
>>         /*
>>          * Overloading budget_timeout field to store
>>          * the time at which the queue remains with no
>>          * backlog and no outstanding request; used by
>>          * the weight-raising mechanism.
>>          */
>>         bfqq->budget_timeout = jiffies;
>>
>>     bfq_del_bfqq_busy(bfqd, bfqq, true);
>>
>> Thanks,
>> Kuai
>
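
To make the two counting criteria discussed in this thread concrete, here is a minimal userspace sketch, not kernel code. The struct group_model and its fields (busy_queues, in_flight) and the two count_* helpers are hypothetical stand-ins invented for illustration; only the shape of asymmetric_scenario() follows the bfq_asymmetric_scenario() snippet quoted above, with CONFIG_BFQ_GROUP_IOSCHED assumed enabled. It only shows how the criteria would diverge if a queue could go non-busy while still having in-flight I/O, which is exactly the point under debate, so it does not settle which behaviour the real code has.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a group; NOT the kernel's struct bfq_group. */
struct group_model {
	unsigned int busy_queues; /* queues currently marked busy */
	unsigned int in_flight;   /* requests dispatched but not yet completed */
};

/* Criterion proposed in the patch: count groups with at least one busy queue. */
unsigned int count_groups_with_busy_queues(const struct group_model *g, size_t n)
{
	unsigned int count = 0;

	for (size_t i = 0; i < n; i++)
		if (g[i].busy_queues > 0)
			count++;
	return count;
}

/* Alternative criterion: also count groups that only have in-flight I/O left. */
unsigned int count_groups_with_pending_io(const struct group_model *g, size_t n)
{
	unsigned int count = 0;

	for (size_t i = 0; i < n; i++)
		if (g[i].busy_queues > 0 || g[i].in_flight > 0)
			count++;
	return count;
}

/* Shape of the predicate quoted in the thread (group scheduling assumed on). */
bool asymmetric_scenario(bool varied_queue_weights, bool multiple_classes_busy,
			 unsigned int num_groups)
{
	return varied_queue_weights || multiple_classes_busy || num_groups > 1;
}
```

For example, with two groups where one has a single busy queue and the other has no busy queue but three requests in flight, the first helper returns 1 and the second returns 2, so the sketched predicate (with both weight and class flags false) flips from false to true depending on which count is passed in.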