From: Vincent Guittot
Date: Tue, 19 Apr 2022 10:07:26 +0200
Subject: Re: [PATCH] sched/fair: Fix the scene where group's imbalance is set incorrectly
To: zgpeng
Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, linux-kernel@vger.kernel.org
In-Reply-To: <1649566716-24687-1-git-send-email-zgpeng@tencent.com>

On Sun, 10 Apr 2022 at 06:58, zgpeng wrote:
>
> In the load_balance function, if the balance fails due to
> affinity, then parent group's imbalance will be set to 1.
> However, there will be a scene where balance is achieved,
> but the imbalance flag is still set to 1, which needs to
> be fixed.
>
> The specific trigger scenarios are as follows. In the
> load_balance function, the first loop picks out the busiest
> cpu. During the process of pulling the process from the
> busiest cpu, it is found that all tasks cannot be run on the
> DST cpu. At this time, both LBF_SOME_PINNED and LBF_ALL_PINNED
> of env.flags are set. Because balance fails and LBF_SOME_PINNED

Shouldn't LBF_DST_PINNED and dst_cpu have been set? And shouldn't
the goto more_balance clear env.imbalance before we set the group's
imbalance?

> is set, the parent group's imbalance flag will be set to 1. At
> the same time, because LBF_ALL_PINNED is set, it will re-enter
> the second cycle to select another busiest cpu to pull the process.
> Because the busiest CPU has changed, the process can be pulled to
> DST cpu, so it is possible to reach a balance state.

The new load_balance will be done without the previous busiest CPU.

>
> But at this time, the parent group's imbalance flag is not set
> correctly again. As a result, the load balance is successfully
> achieved, but the parent group's imbalance flag is incorrectly
> set to 1. In this case, the upper-layer load balance will
> erroneously select this domain as the busiest group, thereby
> breaking the balance.
>
> Signed-off-by: zgpeng
> ---
>  kernel/sched/fair.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d4bd299..e137917 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10019,13 +10019,13 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  		}
>
>  		/*
> -		 * We failed to reach balance because of affinity.
> +		 * According to balance status, set group_imbalance correctly.
>  		 */
>  		if (sd_parent) {
>  			int *group_imbalance = &sd_parent->groups->sgc->imbalance;
>
> -			if ((env.flags & LBF_SOME_PINNED) && env.imbalance > 0)
> -				*group_imbalance = 1;
> +			if (env.flags & LBF_SOME_PINNED)
> +				*group_imbalance = env.imbalance > 0 ? 1 : 0;
>  		}
>
>  		/* All tasks on this runqueue were pinned by CPU affinity */
> --
> 2.9.5
>
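
For reference, here is a minimal user-space sketch of the two versions of
the check in the quoted hunk. It is not kernel code: update_old() and
update_new() are hypothetical helper names, the flag value is illustrative
only, and it assumes (as the quoted commit message describes) that the check
runs once per pass and that LBF_SOME_PINNED is still set on the second pass.

```c
/* Illustrative flag value only; the kernel defines its own in kernel/sched/fair.c. */
#define LBF_SOME_PINNED 0x08u

/*
 * Check before the patch (hypothetical helper name): it can set the
 * parent group's imbalance flag but never clears it.
 */
static void update_old(int *group_imbalance, unsigned int flags, long imbalance)
{
	if ((flags & LBF_SOME_PINNED) && imbalance > 0)
		*group_imbalance = 1;
}

/*
 * Check proposed by the patch (hypothetical helper name): when
 * LBF_SOME_PINNED is set, the flag follows the remaining imbalance,
 * so it is cleared once the imbalance drops to zero.
 */
static void update_new(int *group_imbalance, unsigned int flags, long imbalance)
{
	if (flags & LBF_SOME_PINNED)
		*group_imbalance = imbalance > 0 ? 1 : 0;
}
```

Calling each helper first with (LBF_SOME_PINNED, 100) and then with
(LBF_SOME_PINNED, 0) on a flag that starts at 0 leaves the old check's flag
at 1 and the new check's flag at 0. Whether the second pass really reaches
this point with those values is exactly what the questions above are about.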