Date: Mon, 11 Jul 2022 17:03:04 +0100
From: Qais Yousef
To: Vincent Guittot
Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, linux-kernel@vger.kernel.org, david.chen@nutanix.com, zhangqiao22@huawei.com
Subject: Re: [PATCH v2] sched/fair: fix case with reduced capacity CPU
Message-ID: <20220711160304.njkd3ml7nqpokiim@wubuntu>
References: <20220708154401.21411-1-vincent.guittot@linaro.org>
In-Reply-To: <20220708154401.21411-1-vincent.guittot@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Vincent

On 07/08/22 17:44, Vincent Guittot wrote:
> The capacity of the CPU available for CFS tasks can be reduced because of
> other activities running on the latter. In such case, it's worth trying to
> move CFS tasks on a CPU with more available capacity.
>
> The rework of the load balance has filtered the case when the CPU is
> classified to be fully busy but its capacity is reduced.
>
> Check if CPU's capacity is reduced while gathering load balance statistic
> and classify it group_misfit_task instead of group_fully_busy so we can
> try to move the load on another CPU.
>
> Reported-by: David Chen
> Reported-by: Zhang Qiao
> Signed-off-by: Vincent Guittot
> Tested-by: David Chen
> Tested-by: Zhang Qiao
> ---

[...]

> @@ -8820,8 +8833,9 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>
>  	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
>  		struct rq *rq = cpu_rq(i);
> +		unsigned long load = cpu_load(rq);
>
> -		sgs->group_load += cpu_load(rq);
> +		sgs->group_load += load;
>  		sgs->group_util += cpu_util_cfs(i);
>  		sgs->group_runnable += cpu_runnable(rq);
>  		sgs->sum_h_nr_running += rq->cfs.h_nr_running;
> @@ -8851,11 +8865,17 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>  		if (local_group)
>  			continue;
>
> -		/* Check for a misfit task on the cpu */
> -		if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
> -		    sgs->group_misfit_task_load < rq->misfit_task_load) {
> -			sgs->group_misfit_task_load = rq->misfit_task_load;
> -			*sg_status |= SG_OVERLOAD;
> +		if (env->sd->flags & SD_ASYM_CPUCAPACITY) {
> +			/* Check for a misfit task on the cpu */
> +			if (sgs->group_misfit_task_load < rq->misfit_task_load) {
> +				sgs->group_misfit_task_load = rq->misfit_task_load;
> +				*sg_status |= SG_OVERLOAD;
> +			}
> +		} else if ((env->idle != CPU_NOT_IDLE) &&
> +			   sched_reduced_capacity(rq, env->sd)) {
> +			/* Check for a task running on a CPU with reduced capacity */
> +			if (sgs->group_misfit_task_load < load)
> +				sgs->group_misfit_task_load = load;
>  		}
>  	}

Small questions, mostly for my own education.

The new condition only applies to SMP systems. Is the reason asym systems
don't care that the misfit check already considers capacity pressure when
checking that the task fits_capacity()?

It **seems** to me that the migration margin in fits_capacity() acts like
sd->imbalance_pct does when check_cpu_capacity() is called by
sched_reduced_capacity(). Did I get it right?

If I got it right, and the migration margin is ever tweaked, could we
potentially start seeing this kind of reported issue on asym systems too?
I guess not. It just seems to me that, for asym systems, tweaking the
migration margin is similar to tweaking imbalance_pct for SMP ones. But the
subtlety is greater, as imbalance_pct is still used on asym systems.

Thanks

--
Qais Yousef
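
To make the comparison in the question concrete, here is a minimal userspace sketch of the two thresholds being compared. It is not taken from the patch: the bodies below are my recollection of fits_capacity() and check_cpu_capacity() in kernel/sched/fair.c, rewritten as plain functions, and the imbalance_pct value of 117 used in the note afterwards is an assumed typical default, so check both against the tree you are reading.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch of the migration margin (assumed to match the kernel's
 * fits_capacity() macro): utilization fits only if it stays below
 * 1024/1280 = 80% of the capacity it is compared against.
 */
static bool fits_capacity(unsigned long util, unsigned long capacity)
{
	return util * 1280 < capacity * 1024;
}

/*
 * Sketch of check_cpu_capacity() (assumed form): true when the capacity
 * left for CFS has dropped below cpu_capacity_orig * 100 / imbalance_pct.
 * The rq and sd fields are passed as plain arguments here.
 */
static bool check_cpu_capacity(unsigned long cpu_capacity,
			       unsigned long cpu_capacity_orig,
			       unsigned int imbalance_pct)
{
	/* In this sketch imbalance_pct is a percentage of at least 100. */
	assert(imbalance_pct >= 100);
	return cpu_capacity * imbalance_pct < cpu_capacity_orig * 100;
}
```

With these assumed values, a CPU whose original capacity is 1024 counts as reduced once its CFS capacity falls to 875 or below (about 85.5%), while a task stops fitting a capacity of 1024 once its utilization reaches 820 (about 80%). Both are fixed-ratio margins, but they are applied to different quantities: one to the capacity remaining on the CPU, the other to the task's utilization.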