From: Vincent Guittot
Date: Fri, 5 Feb 2021 14:51:51 +0100
Subject: Re: [PATCH 1/8] sched/fair: Clean up active balance nr_balance_failed trickery
To: Valentin Schneider
Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Dietmar Eggemann, Morten Rasmussen, Qais Yousef, Quentin Perret, Pavan Kondeti, Rik van Riel
In-Reply-To: <20210128183141.28097-2-valentin.schneider@arm.com>

On Thu, 28 Jan 2021 at 19:32, Valentin Schneider wrote:
>
> When triggering an active load balance, sd->nr_balance_failed is set to
> such a value that any further can_migrate_task() using said sd will ignore
> the output of task_hot().
>
> This behaviour makes sense, as active load balance intentionally preempts a
> rq's running task to migrate it right away, but this asynchronous write is
> a bit shoddy, as the stopper thread might run active_load_balance_cpu_stop
> before the sd->nr_balance_failed write either becomes visible to the
> stopper's CPU or even happens on the CPU that appended the stopper work.
>
> Add a struct lb_env flag to denote active balancing, and use it in
> can_migrate_task(). Remove the sd->nr_balance_failed write that served the
> same purpose.
>
> Signed-off-by: Valentin Schneider
> ---
>  kernel/sched/fair.c | 17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 197a51473e0c..0f6a4e58ce3c 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7423,6 +7423,7 @@ enum migration_type {
>  #define LBF_SOME_PINNED	0x08
>  #define LBF_NOHZ_STATS	0x10
>  #define LBF_NOHZ_AGAIN	0x20
> +#define LBF_ACTIVE_LB	0x40
>
>  struct lb_env {
>  	struct sched_domain	*sd;
> @@ -7608,10 +7609,14 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>
>  	/*
>  	 * Aggressive migration if:
> -	 * 1) destination numa is preferred
> -	 * 2) task is cache cold, or
> -	 * 3) too many balance attempts have failed.
> +	 * 1) active balance
> +	 * 2) destination numa is preferred
> +	 * 3) task is cache cold, or
> +	 * 4) too many balance attempts have failed.
>  	 */
> +	if (env->flags & LBF_ACTIVE_LB)
> +		return 1;
> +

This changes the behavior on NUMA systems because it skips
migrate_degrades_locality(), which can return 1 and prevent active
migration regardless of nr_balance_failed.

Is that intentional?

>  	tsk_cache_hot = migrate_degrades_locality(p, env);
>  	if (tsk_cache_hot == -1)
>  		tsk_cache_hot = task_hot(p, env);
> @@ -9805,9 +9810,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  					active_load_balance_cpu_stop, busiest,
>  					&busiest->active_balance_work);
>  			}
> -
> -			/* We've kicked active balancing, force task migration. */
> -			sd->nr_balance_failed = sd->cache_nice_tries+1;
>  		}
>  	} else {
>  		sd->nr_balance_failed = 0;
> @@ -9963,7 +9965,8 @@ static int active_load_balance_cpu_stop(void *data)
>  			 * @dst_grpmask we need to make that test go away with lying
>  			 * about DST_PINNED.
>  			 */
> -			.flags		= LBF_DST_PINNED,
> +			.flags		= LBF_DST_PINNED |
> +					  LBF_ACTIVE_LB,
>  		};
>
>  		schedstat_inc(sd->alb_count);
> --
> 2.27.0
>
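
To make the ordering question concrete, here is a minimal user-space sketch of the part of can_migrate_task() that the hunk touches. It is not kernel code. The model_* names, the struct layouts and the stubbed locality result are all hypothetical, and the final nr_balance_failed test is recalled from the surrounding function rather than taken from the quoted hunk, so treat it as an assumption. The only point it illustrates is that the new early return is reached before the locality check is ever consulted.

```c
#define LBF_ACTIVE_LB 0x40 /* value taken from the quoted hunk */

/* Hypothetical, pared-down stand-ins for sched_domain and lb_env. */
struct model_sd {
	unsigned int nr_balance_failed;
	unsigned int cache_nice_tries;
};

struct model_env {
	struct model_sd *sd;
	unsigned int flags;
	int locality;        /* stub result: -1 no opinion, 0 improves, 1 degrades */
	int cache_hot;       /* stub result standing in for task_hot() */
	int locality_checks; /* how many times the locality stub was consulted */
};

/* Stub standing in for migrate_degrades_locality(); records each call. */
static int model_degrades_locality(struct model_env *env)
{
	env->locality_checks++;
	return env->locality;
}

/* Models only the tail of can_migrate_task() shown in the patch. */
static int model_can_migrate(struct model_env *env)
{
	int tsk_cache_hot;

	if (env->flags & LBF_ACTIVE_LB)
		return 1; /* new early return: nothing below runs */

	tsk_cache_hot = model_degrades_locality(env);
	if (tsk_cache_hot == -1)
		tsk_cache_hot = env->cache_hot;

	/* Assumed final test, not part of the quoted hunk. */
	if (tsk_cache_hot <= 0 ||
	    env->sd->nr_balance_failed > env->sd->cache_nice_tries)
		return 1;

	return 0;
}
```

With nr_balance_failed = 0, cache_nice_tries = 1 and a locality stub that reports degradation, the model refuses the migration when flags is 0 and allows it once LBF_ACTIVE_LB is set, in which case the locality stub is not called at all.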