From: Mel Gorman
To: LKML
Cc: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Valentin Schneider, Aubrey Li, Mel Gorman
Subject: [PATCH 7/9] sched/fair: Enforce proportional scan limits when scanning for an idle core
Date: Mon, 26 Jul 2021 11:22:45 +0100
Message-Id: <20210726102247.21437-8-mgorman@techsingularity.net>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20210726102247.21437-1-mgorman@techsingularity.net>
References: <20210726102247.21437-1-mgorman@techsingularity.net>
X-Mailing-List: linux-kernel@vger.kernel.org

When scanning for a single CPU, the scan is limited based on the estimated
average idle time for a domain to reduce the risk that
more time is spent scanning for idle CPUs than we are idle for.

With SMT, if an idle core is expected to exist, there is no scan depth
limit, so the scan depth may or may not be related to average idle time.
Unfortunately, has_idle_cores can be very inaccurate when workloads are
rapidly entering/exiting idle (e.g. hackbench).

As the scan depth is now proportional to cores and not CPUs, enforce
SIS_PROP for idle core scans.

The performance impact of this is variable and is neither a universal gain
nor loss. In some cases, has_idle_cores will be cleared prematurely because
the whole domain was not scanned, but has_idle_cores is already known to be
an inaccurate heuristic. There is also additional cost because time
calculations are made even for an idle core scan and the delta is
calculated for both scan successes and failures. Finally, SMT siblings may
be used prematurely due to scan depth limitations.

On the flip side, scan depth is now consistent for both core and SMT scans.
The reduction in scan depth improves performance in some cases and wakeup
latency is reduced in some cases.

There were few changes identified in the SIS statistics but notably,
"SIS Core Hit" was slightly reduced in tbench as thread counts increased,
presumably due to the core search depth being throttled.

Signed-off-by: Mel Gorman
---
 kernel/sched/fair.c | 33 +++++++++++++++++++--------------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 20b9255ebf97..b180205e6b25 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6232,7 +6232,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
 
 	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
 
-	if (sched_feat(SIS_PROP) && !has_idle_core) {
+	if (sched_feat(SIS_PROP)) {
 		u64 avg_cost, avg_idle, span_avg;
 		unsigned long now = jiffies;
 
@@ -6265,30 +6265,35 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
 		if (has_idle_core) {
 			i = select_idle_core(p, cpu, cpus, &idle_cpu);
 			if ((unsigned int)i < nr_cpumask_bits)
-				return i;
+				break;
 
+			nr -= sched_smt_weight;
 		} else {
-			if (!--nr)
-				return -1;
 			idle_cpu = __select_idle_cpu(cpu, p);
 			if ((unsigned int)idle_cpu < nr_cpumask_bits)
 				break;
+			nr--;
 		}
+
+		if (nr < 0)
+			break;
 	}
 
-	if (has_idle_core)
-		set_idle_cores(target, false);
+	if ((unsigned int)idle_cpu < nr_cpumask_bits) {
+		if (has_idle_core)
+			set_idle_cores(target, false);
 
-	if (sched_feat(SIS_PROP) && !has_idle_core) {
-		time = cpu_clock(this) - time;
+		if (sched_feat(SIS_PROP)) {
+			time = cpu_clock(this) - time;
 
-		/*
-		 * Account for the scan cost of wakeups against the average
-		 * idle time.
-		 */
-		this_rq->wake_avg_idle -= min(this_rq->wake_avg_idle, time);
+			/*
+			 * Account for the scan cost of wakeups against the average
+			 * idle time.
+			 */
+			this_rq->wake_avg_idle -= min(this_rq->wake_avg_idle, time);
 
-		update_avg(&this_sd->avg_scan_cost, time);
+			update_avg(&this_sd->avg_scan_cost, time);
+		}
 	}
 
 	return idle_cpu;
-- 
2.26.2
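
For readers who want to experiment with the budget accounting described in
the changelog, the following is a rough userspace sketch, not kernel code.
It only models how the scan budget (nr) is charged: one unit per CPU in a
plain CPU scan, and one core's worth of siblings per core in an idle core
scan. Everything in it is an assumption made for illustration: the names
scan_budgeted(), core_is_idle() and MODEL_MAX_CPUS are hypothetical, cores
are assumed to be runs of smt_weight consecutive CPU numbers, and the return
value handling is simplified compared with select_idle_cpu().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical upper bound for this model only; not a kernel constant. */
#define MODEL_MAX_CPUS 64

/* Return true when every SMT sibling of the core containing cpu is idle. */
static bool core_is_idle(const bool *idle, int cpu, int smt_weight)
{
	int first = cpu - (cpu % smt_weight);

	for (int i = first; i < first + smt_weight; i++)
		if (!idle[i])
			return false;
	return true;
}

/*
 * Model of a budgeted scan. Starting at 'start' and wrapping, look for an
 * idle core (has_idle_core) or an idle CPU. Each core examined costs
 * smt_weight units of budget, each CPU examined costs one unit, and the
 * scan stops once the budget 'nr' drops below zero. Returns the CPU found,
 * an idle sibling seen along the way as a fallback, or -1.
 */
static int scan_budgeted(const bool *idle, int nr_cpus, int smt_weight,
			 int start, int nr, bool has_idle_core)
{
	bool visited[MODEL_MAX_CPUS] = { false };
	int idle_cpu = -1;

	assert(nr_cpus <= MODEL_MAX_CPUS && nr_cpus % smt_weight == 0);

	for (int n = 0; n < nr_cpus; n++) {
		int cpu = (start + n) % nr_cpus;

		if (has_idle_core) {
			int first = cpu - (cpu % smt_weight);

			/* Each core is examined (and charged) only once. */
			if (visited[cpu])
				continue;
			for (int i = first; i < first + smt_weight; i++) {
				visited[i] = true;
				/* Remember the first idle sibling seen. */
				if (idle[i] && idle_cpu < 0)
					idle_cpu = i;
			}
			if (core_is_idle(idle, cpu, smt_weight))
				return cpu;
			nr -= smt_weight;
		} else {
			if (idle[cpu])
				return cpu;
			nr--;
		}

		if (nr < 0)
			break;
	}

	return idle_cpu;
}
```

With 8 CPUs arranged as 4 cores of 2 siblings, a small budget makes the core
scan give up before reaching a fully idle core further along and fall back to
an idle sibling it saw earlier, which corresponds to the "SMT siblings may be
used prematurely" trade-off noted in the changelog.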