From: Valentin Schneider
To: Lingutla Chandrasekhar, linux-kernel@vger.kernel.org
Cc: juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, pkondeti@codeaurora.org, peterz@infradead.org, mingo@kernel.org, Lingutla Chandrasekhar
Subject: Re: [PATCH] sched/fair: Ignore percpu threads for imbalance pulls
In-Reply-To: <20210217120854.1280-1-clingutla@codeaurora.org>
References: <20210217120854.1280-1-clingutla@codeaurora.org>
Date: Wed, 17 Feb 2021 14:50:23 +0000
On 17/02/21 17:38, Lingutla Chandrasekhar wrote:
> In load balancing, when the balancing group is unable to pull a task
> from the busy group due to ->cpus_ptr constraints, it sets
> LBF_SOME_PINNED in the lb env flags; as a consequence, sgc->imbalance
> is set for the parent domain level, which gets the group classified as
> imbalanced so it can receive help from another balancing CPU.
>
> Consider a 4-CPU big.LITTLE system with CPUs 0-1 as LITTLEs and
> CPUs 2-3 as Bigs, with the below scenario:
> - CPU0 doing newly_idle balancing
> - CPU1 running a percpu kworker and an RT task (small tasks)
> - CPU2 running 2 big tasks
> - CPU3 running 1 medium task
>
> While CPU0 is doing newly_idle load balance at MC level, it fails to
> pull the percpu kworker from CPU1, sets LBF_SOME_PINNED in the lb env
> flags and sets sgc->imbalance at the DIE level domain. As LBF_ALL_PINNED
> is not cleared, it tries to redo the balancing with CPU1 cleared from the
> env cpus, but it doesn't find another busiest_group, so CPU0 stops
> balancing at MC level without clearing 'sgc->imbalance' and restarts
> the load balancing at DIE level.
>
> CPU0 (the balancing CPU) then finds the LITTLEs' group as busiest_group
> with group type imbalanced; the Bigs' group, classified with a lower
> group type than imbalanced, is ignored as a busiest candidate, and the
> balancing is aborted without pulling any tasks (by that time, CPU1 might
> not have any running tasks).
>
> Classifying the group as imbalanced because of percpu threads is a
> suboptimal decision, so don't set LBF_SOME_PINNED for per-CPU threads.
>

Sounds like you've stumbled on the same thing I'm trying to fix in

  http://lore.kernel.org/r/20210128183141.28097-8-valentin.schneider@arm.com

(I'm currently working on a v2)

Now, I'd tend to agree that if we could prevent pcpu kworkers from
interfering with load-balance altogether, that would indeed be much better
than trying to deal with the group_imbalanced faff further down the line
(which is what I've been doing).

> Signed-off-by: Lingutla Chandrasekhar
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 04a3ce20da67..44a05ad8c96b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7560,7 +7560,9 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>
>  		schedstat_inc(p->se.statistics.nr_failed_migrations_affine);
>
> -		env->flags |= LBF_SOME_PINNED;
> +		/* Ignore percpu threads for imbalance pulls. */
> +		if (p->nr_cpus_allowed > 1)
> +			env->flags |= LBF_SOME_PINNED;
>
>  		/*
>  		 * Remember if this task can be migrated to any other CPU in

Unlike user tasks, pcpu kworkers have a stable affinity (with some hotplug
quirks), so perhaps we could do this instead:

---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8a8bd7b13634..84fca350b9ae 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7539,6 +7539,9 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 	if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
 		return 0;

+	if (kthread_is_per_cpu(p))
+		return 0;
+
 	if (!cpumask_test_cpu(env->dst_cpu, p->cpus_ptr)) {
 		int cpu;
---

> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> of Code Aurora Forum, a Linux Foundation Collaborative Project.
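To make the behavioural difference concrete, here's a small standalone
sketch: plain userspace C, nothing here is actual kernel code; the fake_*
types, the hard-coded flag bit and the boolean stand-in for
kthread_is_per_cpu() are all made up for illustration. It compares how the
two checks treat a percpu kworker versus a user task that has been affined
to a single CPU via sched_setaffinity():

/*
 * Standalone sketch (not kernel code): compares the two proposed ways of
 * handling affinity-restricted tasks in can_migrate_task(). All types and
 * helpers are simplified stand-ins for the real kernel structures.
 */
#include <stdbool.h>
#include <stdio.h>

#define LBF_SOME_PINNED	0x08	/* arbitrary flag bit for this sketch */

struct fake_task {
	const char *comm;
	unsigned int nr_cpus_allowed;	/* weight of the task's affinity mask */
	bool is_per_cpu_kthread;	/* stand-in for kthread_is_per_cpu(p) */
	unsigned long cpus_mask;	/* bit i set => task may run on CPU i */
};

struct fake_lb_env {
	int dst_cpu;
	unsigned int flags;
};

/* Approach A (the patch): only raise LBF_SOME_PINNED for tasks that could
 * in principle run on more than one CPU (nr_cpus_allowed > 1). */
static bool can_migrate_patch(const struct fake_task *p, struct fake_lb_env *env)
{
	if (!(p->cpus_mask & (1UL << env->dst_cpu))) {
		if (p->nr_cpus_allowed > 1)
			env->flags |= LBF_SOME_PINNED;
		return false;
	}
	return true;
}

/* Approach B (the reply): bail out early for per-CPU kthreads only, before
 * any affinity bookkeeping, so they never feed into sgc->imbalance; other
 * pinned tasks keep their current behaviour. */
static bool can_migrate_reply(const struct fake_task *p, struct fake_lb_env *env)
{
	if (p->is_per_cpu_kthread)
		return false;

	if (!(p->cpus_mask & (1UL << env->dst_cpu))) {
		env->flags |= LBF_SOME_PINNED;
		return false;
	}
	return true;
}

int main(void)
{
	/* CPU0 is newly idle and tries to pull tasks running on CPU1. */
	struct fake_task kworker = {
		.comm = "kworker/1:0", .nr_cpus_allowed = 1,
		.is_per_cpu_kthread = true, .cpus_mask = 1UL << 1,
	};
	struct fake_task pinned_user = {
		.comm = "user task on CPU1", .nr_cpus_allowed = 1,
		.is_per_cpu_kthread = false, .cpus_mask = 1UL << 1,
	};
	const struct fake_task *tasks[] = { &kworker, &pinned_user };

	for (int i = 0; i < 2; i++) {
		struct fake_lb_env env_a = { .dst_cpu = 0, .flags = 0 };
		struct fake_lb_env env_b = { .dst_cpu = 0, .flags = 0 };

		can_migrate_patch(tasks[i], &env_a);
		can_migrate_reply(tasks[i], &env_b);
		printf("%-18s  patch: SOME_PINNED=%d  reply: SOME_PINNED=%d\n",
		       tasks[i]->comm,
		       !!(env_a.flags & LBF_SOME_PINNED),
		       !!(env_b.flags & LBF_SOME_PINNED));
	}
	return 0;
}

The point the sketch tries to show: the nr_cpus_allowed > 1 test treats a
user task affined to a single CPU the same way as a percpu kworker, whereas
the kthread_is_per_cpu() test only exempts the kworker and leaves pinned
user tasks setting LBF_SOME_PINNED as before.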