References: <20210119120755.2425264-1-qais.yousef@arm.com>
From: Vincent Guittot
Date: Tue, 19 Jan 2021 15:19:30 +0100
Subject: Re: [PATCH] sched/eas: Don't update misfit status if the task is pinned
To: Valentin Schneider
Cc: Qais Yousef, "Peter Zijlstra (Intel)", Dietmar Eggemann, linux-kernel, Morten Rasmussen
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 19 Jan 2021 at 14:54, Valentin Schneider wrote:
>
> On 19/01/21 14:34, Vincent Guittot wrote:
> > On Tue, 19 Jan 2021 at 13:08, Qais Yousef wrote:
> >>
> >> If the task is pinned to a cpu, setting the misfit status means that
> >> we'll unnecessarily continuously attempt to migrate the task but fail.
> >>
> >> This continuous failure will cause the balance_interval to increase to
> >> a high value, and eventually cause unnecessary significant delays in
> >> balancing the system when real imbalance happens.
> >>
> >> Caught while testing uclamp where rt-app calibration loop was pinned to
> >> cpu 0, shortly after which we spawn another task with high util_clamp
> >> value. The task was failing to migrate after over 40ms of runtime due to
> >> balance_interval unnecessary expanded to a very high value from the
> >> calibration loop.
> >>
> >> Not done here, but it could be useful to extend the check for pinning to
> >> verify that the affinity of the task has a cpu that fits. We could end
> >> up in a similar situation otherwise.
> >>
> >> Fixes: 3b1baa6496e6 ("sched/fair: Add 'group_misfit_task' load-balance type")
> >> Signed-off-by: Qais Yousef
> >> ---
> >>  kernel/sched/fair.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >> index 197a51473e0c..9379a481dd8c 100644
> >> --- a/kernel/sched/fair.c
> >> +++ b/kernel/sched/fair.c
> >> @@ -4060,7 +4060,7 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> >>         if (!static_branch_unlikely(&sched_asym_cpucapacity))
> >>                 return;
> >>
> >> -       if (!p) {
> >> +       if (!p || p->nr_cpus_allowed == 1) {
> >
> > Side question: What happens if there is 2 misfit tasks and the current
> > one is pinned but not the other waiting one
> >
>
> update_misfit_status() is called either on the current task (at tick) or
> on the task picked by pick_next_task_fair() - i.e. CFS current or
> about-to-be-current.
>
> So if you have 2 CPU hogs enqueued on a single LITTLE, and one of them
> is pinned, the other one will be moved away either via regular load

This doesn't seem reliable, because regular load balance uses load or
nr_running.

> balance, or via misfit balance sometime after it's picked as the next
> task to run.
>
> Admittedly that second case suffers from unfortunate timing mostly
> related to the load balance interval. There was an old patch in the
> Android stack that would reduce the balance interval upon detecting a

Shouldn't we keep track of enqueued misfit tasks instead?

> misfit task to "accelerate" its upmigration; this might need to be
> revisited...
>
> >>                 rq->misfit_task_load = 0;
> >>                 return;
> >>         }
> >> --
> >> 2.25.1
> >>
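
To make the scenario in the side question concrete, here is a minimal
userspace sketch of the patched check. It is not kernel code: struct
task_model, struct rq_model, task_fits() and update_misfit_status_model()
are hypothetical stand-ins, the sched_asym_cpucapacity static branch is left
out, and the 1280/1024 headroom in task_fits() is an assumption made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct task_struct. */
struct task_model {
	int nr_cpus_allowed;	/* number of CPUs in the affinity mask */
	unsigned long util;	/* estimated utilization of the task */
	unsigned long load;	/* load recorded when the task is misfit */
};

/* Hypothetical, simplified stand-in for struct rq. */
struct rq_model {
	unsigned long cpu_capacity;	/* capacity of the CPU owning this rq */
	unsigned long misfit_task_load;	/* non-zero means "misfit task here" */
};

/* Assumed fit test: the task fits if it leaves roughly 20% headroom. */
static bool task_fits(const struct task_model *p, unsigned long capacity)
{
	return p->util * 1280 < capacity * 1024;
}

/*
 * Model of the patched logic: a missing task or a task pinned to a single
 * CPU never marks the runqueue as misfit, since migrating it cannot succeed.
 */
static void update_misfit_status_model(const struct task_model *p,
				       struct rq_model *rq)
{
	assert(rq != NULL);

	if (!p || p->nr_cpus_allowed == 1) {
		rq->misfit_task_load = 0;
		return;
	}

	if (task_fits(p, rq->cpu_capacity)) {
		rq->misfit_task_load = 0;
		return;
	}

	/* Keep the value non-zero so the misfit state stays detectable. */
	rq->misfit_task_load = p->load > 1 ? p->load : 1;
}
```

In this model the runqueue only reflects the task passed in last. While the
pinned hog is current, misfit_task_load is 0 even if an unpinned hog is
waiting on the same runqueue, which is the window the side question is about.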