From: Valentin Schneider
To: Vincent Guittot, Qais Yousef
Cc: "Peter Zijlstra \(Intel\)", Dietmar Eggemann, linux-kernel, Morten Rasmussen
Subject: Re: [PATCH] sched/eas: Don't update misfit status if the task is pinned
References: <20210119120755.2425264-1-qais.yousef@arm.com>
User-Agent: Notmuch/0.21 (http://notmuchmail.org) Emacs/26.3 (x86_64-pc-linux-gnu)
Date: Tue, 19 Jan 2021 13:54:30 +0000
X-Mailing-List: linux-kernel@vger.kernel.org

On 19/01/21 14:34, Vincent Guittot wrote:
> On Tue, 19 Jan 2021 at 13:08, Qais Yousef wrote:
>>
>> If the task is pinned to a cpu, setting the misfit status means that
>> we'll unnecessarily continuously attempt to migrate the task
but fail.
>>
>> This continuous failure will cause the balance_interval to increase to
>> a high value, and eventually cause unnecessary significant delays in
>> balancing the system when real imbalance happens.
>>
>> Caught while testing uclamp where rt-app calibration loop was pinned to
>> cpu 0, shortly after which we spawn another task with high util_clamp
>> value. The task was failing to migrate after over 40ms of runtime due to
>> balance_interval unnecessary expanded to a very high value from the
>> calibration loop.
>>
>> Not done here, but it could be useful to extend the check for pinning to
>> verify that the affinity of the task has a cpu that fits. We could end
>> up in a similar situation otherwise.
>>
>> Fixes: 3b1baa6496e6 ("sched/fair: Add 'group_misfit_task' load-balance type")
>> Signed-off-by: Qais Yousef
>> ---
>>  kernel/sched/fair.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 197a51473e0c..9379a481dd8c 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -4060,7 +4060,7 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
>>         if (!static_branch_unlikely(&sched_asym_cpucapacity))
>>                 return;
>>
>> -       if (!p) {
>> +       if (!p || p->nr_cpus_allowed == 1) {
>
> Side question: What happens if there is 2 misfit tasks and the current
> one is pinned but not the other waiting one
>

update_misfit_status() is called either on the current task (at tick) or on
the task picked by pick_next_task_fair() - i.e. CFS current or
about-to-be-current.

So if you have 2 CPU hogs enqueued on a single LITTLE, and one of them is
pinned, the other one will be moved away either via regular load balance, or
via misfit balance sometime after it's picked as the next task to run.

Admittedly that second case suffers from unfortunate timing mostly related
to the load balance interval.
There was an old patch in the Android stack that would reduce the balance
interval upon detecting a misfit task to "accelerate" its upmigration; this
might need to be revisited...

>>                 rq->misfit_task_load = 0;
>>                 return;
>>         }
>> --
>> 2.25.1
>>
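
To make the two-hogs scenario above easier to follow, here is a minimal
userspace sketch of the patched check. It is not the kernel code: the
struct names, the update_misfit_status_model() helper and the fit test are
hypothetical stand-ins, the sched_asym_cpucapacity static branch is left
out, and the roughly 20% headroom in task_fits() is an assumption made for
this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct task_struct. */
struct task_model {
	unsigned long util;   /* estimated utilization, 0..1024 scale */
	unsigned long h_load; /* stand-in for the task's hierarchical load */
	int nr_cpus_allowed;  /* number of CPUs in the affinity mask */
};

/* Hypothetical, trimmed-down stand-in for struct rq. */
struct rq_model {
	unsigned long cpu_capacity;     /* capacity of the CPU owning this rq */
	unsigned long misfit_task_load; /* non-zero: a misfit task is flagged */
};

/* Assumed fit test: the task fits if it leaves about 20% headroom. */
static bool task_fits(const struct task_model *p, unsigned long capacity)
{
	return p->util * 1280 < capacity * 1024;
}

/*
 * Model of the patched logic. 'p' is the current (or about-to-be-current)
 * task of 'rq'. A pinned task never flags the rq as misfit, since it
 * cannot be migrated anyway.
 */
static void update_misfit_status_model(const struct task_model *p,
				       struct rq_model *rq)
{
	if (!p || p->nr_cpus_allowed == 1) {
		rq->misfit_task_load = 0;
		return;
	}

	if (task_fits(p, rq->cpu_capacity)) {
		rq->misfit_task_load = 0;
		return;
	}

	rq->misfit_task_load = p->h_load;
}
```

With a LITTLE rq of (made-up) capacity 446 and two hogs of util 800, the
model clears misfit_task_load while the pinned hog is current and sets it
once the unpinned hog is picked, which is the window in which misfit
balance can move it away.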