Received: by 2002:ac0:a5a6:0:0:0:0:0 with SMTP id m35-v6csp2565755imm; Mon, 10 Sep 2018 03:12:53 -0700 (PDT) X-Google-Smtp-Source: ANB0VdawGSTnrdQg8XTk3Fp/4x89+j8tGg718NQCwUkJw7APatmftPotMqg1cyfxU6ij5FdxDNxq X-Received: by 2002:a17:902:4601:: with SMTP id o1-v6mr21260505pld.202.1536574373928; Mon, 10 Sep 2018 03:12:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1536574373; cv=none; d=google.com; s=arc-20160816; b=Vry5pfq+9qPHFjfiGQtqueTMTzt2qAPHx1GJNVeQaXS3QeEYZSDs1LDvjg6lUSoxVC Y0xlj1aXebnTmARiaCJQi65lOTIscQ6iZJCrjx42Kna1Fe3DV2POnPi6joSPw3nUzySG wyHeA+GcG4Ks+9Q2vks1JjRgYw2WRlu5/lWFsDM7eb+n5snb8oW3SSG6JZ6YMv6txnfp qXSa0wVa63Kk4GzR8Nr2q1C/NrsEx402oMpQmjuWtg94RGZaZ3LoSn+DDeDLhtaO5i+o oFJ9hZCMGqmGPepqdmuc6cVefTJuLlcRCoKxLcb5XuL/95My97T4KLtXV7uL6WDCifJk PQIQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-disposition :content-transfer-encoding:mime-version:robot-unsubscribe:robot-id :git-commit-id:subject:to:references:in-reply-to:reply-to:cc :message-id:from:date; bh=rssp4xTwPiR+bisdl9BGVoqM2fwqyhhRIK1RfGGDNBM=; b=HHP+iajNhyuMOE85gWhktcY8Um6cUu9ThhtkTMJWmiuFMnPHzCqyYHdZfP0GYb6JPx sdFmi0gaACcLWnzvW+LTCDfjVsYdo+y1YUzeDsFsA/DbYne9kqLSEHeVTU1DwIIvPRlh WFwBjgmsKNQqfv2urDTRayYV69Z6ONoMM2tR4TrOqa/J8pn28h+6rB/bZatV9tnG6l8t i06oL3yHxFoLxqU8HDrU9tu9S8lW+/arNHVVvMH14qsYzoyEm/R0NXXjxw4FOYrG5biF HRjVSddvBqIgEuxP2uED5Z/Q1g7rZ/i7tZv60rEkzYid6qrsFcCrbVOnO1WVaNn267tH 8W7Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b23-v6si16984628pgj.571.2018.09.10.03.12.38; Mon, 10 Sep 2018 03:12:53 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728086AbeIJPEr (ORCPT + 99 others); Mon, 10 Sep 2018 11:04:47 -0400 Received: from terminus.zytor.com ([198.137.202.136]:50185 "EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727614AbeIJPEr (ORCPT ); Mon, 10 Sep 2018 11:04:47 -0400 Received: from terminus.zytor.com (localhost [127.0.0.1]) by terminus.zytor.com (8.15.2/8.15.2) with ESMTPS id w8AAB7aF1807463 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Mon, 10 Sep 2018 03:11:07 -0700 Received: (from tipbot@localhost) by terminus.zytor.com (8.15.2/8.15.2/Submit) id w8AAB7jO1807460; Mon, 10 Sep 2018 03:11:07 -0700 Date: Mon, 10 Sep 2018 03:11:07 -0700 X-Authentication-Warning: terminus.zytor.com: tipbot set sender to tipbot@zytor.com using -f From: tip-bot for Morten Rasmussen Message-ID: Cc: tglx@linutronix.de, linux-kernel@vger.kernel.org, morten.rasmussen@arm.com, hpa@zytor.com, torvalds@linux-foundation.org, peterz@infradead.org, mingo@kernel.org Reply-To: mingo@kernel.org, hpa@zytor.com, peterz@infradead.org, torvalds@linux-foundation.org, tglx@linutronix.de, morten.rasmussen@arm.com, linux-kernel@vger.kernel.org In-Reply-To: <1532093554-30504-2-git-send-email-morten.rasmussen@arm.com> References: <1532093554-30504-2-git-send-email-morten.rasmussen@arm.com> To: linux-tip-commits@vger.kernel.org Subject: [tip:sched/core] sched/topology: Add SD_ASYM_CPUCAPACITY flag detection Git-Commit-ID: 05484e0984487d42e97c417cbb0697fa9d16e7e9 X-Mailer: tip-git-log-daemon Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00, T_DATE_IN_FUTURE_96_Q autolearn=ham autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on terminus.zytor.com Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Commit-ID: 05484e0984487d42e97c417cbb0697fa9d16e7e9 Gitweb: https://git.kernel.org/tip/05484e0984487d42e97c417cbb0697fa9d16e7e9 Author: Morten Rasmussen AuthorDate: Fri, 20 Jul 2018 14:32:31 +0100 Committer: Ingo Molnar CommitDate: Mon, 10 Sep 2018 11:05:45 +0200 sched/topology: Add SD_ASYM_CPUCAPACITY flag detection The SD_ASYM_CPUCAPACITY sched_domain flag is supposed to mark the sched_domain in the hierarchy where all CPU capacities are visible for any CPU's point of view on asymmetric CPU capacity systems. The scheduler can then take to take capacity asymmetry into account when balancing at this level. It also serves as an indicator for how wide task placement heuristics have to search to consider all available CPU capacities as asymmetric systems might often appear symmetric at smallest level(s) of the sched_domain hierarchy. The flag has been around for while but so far only been set by out-of-tree code in Android kernels. One solution is to let each architecture provide the flag through a custom sched_domain topology array and associated mask and flag functions. However, SD_ASYM_CPUCAPACITY is special in the sense that it depends on the capacity and presence of all CPUs in the system, i.e. when hotplugging all CPUs out except those with one particular CPU capacity the flag should disappear even if the sched_domains don't collapse. Similarly, the flag is affected by cpusets where load-balancing is turned off. Detecting when the flags should be set therefore depends not only on topology information but also the cpuset configuration and hotplug state. The arch code doesn't have easy access to the cpuset configuration. Instead, this patch implements the flag detection in generic code where cpusets and hotplug state is already taken care of. All the arch is responsible for is to implement arch_scale_cpu_capacity() and force a full rebuild of the sched_domain hierarchy if capacities are updated, e.g. later in the boot process when cpufreq has initialized. Signed-off-by: Morten Rasmussen Signed-off-by: Peter Zijlstra (Intel) Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: dietmar.eggemann@arm.com Cc: valentin.schneider@arm.com Cc: vincent.guittot@linaro.org Link: http://lkml.kernel.org/r/1532093554-30504-2-git-send-email-morten.rasmussen@arm.com [ Fixed 'CPU' capitalization. ] Signed-off-by: Ingo Molnar --- include/linux/sched/topology.h | 6 ++-- kernel/sched/topology.c | 81 ++++++++++++++++++++++++++++++++++++++---- 2 files changed, 78 insertions(+), 9 deletions(-) diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h index 26347741ba50..6b9976180c1e 100644 --- a/include/linux/sched/topology.h +++ b/include/linux/sched/topology.h @@ -23,10 +23,10 @@ #define SD_BALANCE_FORK 0x0008 /* Balance on fork, clone */ #define SD_BALANCE_WAKE 0x0010 /* Balance on wakeup */ #define SD_WAKE_AFFINE 0x0020 /* Wake task to waking CPU */ -#define SD_ASYM_CPUCAPACITY 0x0040 /* Groups have different max cpu capacities */ -#define SD_SHARE_CPUCAPACITY 0x0080 /* Domain members share cpu capacity */ +#define SD_ASYM_CPUCAPACITY 0x0040 /* Domain members have different CPU capacities */ +#define SD_SHARE_CPUCAPACITY 0x0080 /* Domain members share CPU capacity */ #define SD_SHARE_POWERDOMAIN 0x0100 /* Domain members share power domain */ -#define SD_SHARE_PKG_RESOURCES 0x0200 /* Domain members share cpu pkg resources */ +#define SD_SHARE_PKG_RESOURCES 0x0200 /* Domain members share CPU pkg resources */ #define SD_SERIALIZE 0x0400 /* Only a single load balancing instance */ #define SD_ASYM_PACKING 0x0800 /* Place busy groups earlier in the domain */ #define SD_PREFER_SIBLING 0x1000 /* Prefer to place tasks in a sibling domain */ diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c index 505a41c42b96..5c4d583d53ee 100644 --- a/kernel/sched/topology.c +++ b/kernel/sched/topology.c @@ -1061,7 +1061,6 @@ static struct cpumask ***sched_domains_numa_masks; * SD_SHARE_PKG_RESOURCES - describes shared caches * SD_NUMA - describes NUMA topologies * SD_SHARE_POWERDOMAIN - describes shared power domain - * SD_ASYM_CPUCAPACITY - describes mixed capacity topologies * * Odd one out, which beside describing the topology has a quirk also * prescribes the desired behaviour that goes along with it: @@ -1073,13 +1072,12 @@ static struct cpumask ***sched_domains_numa_masks; SD_SHARE_PKG_RESOURCES | \ SD_NUMA | \ SD_ASYM_PACKING | \ - SD_ASYM_CPUCAPACITY | \ SD_SHARE_POWERDOMAIN) static struct sched_domain * sd_init(struct sched_domain_topology_level *tl, const struct cpumask *cpu_map, - struct sched_domain *child, int cpu) + struct sched_domain *child, int dflags, int cpu) { struct sd_data *sdd = &tl->data; struct sched_domain *sd = *per_cpu_ptr(sdd->sd, cpu); @@ -1100,6 +1098,9 @@ sd_init(struct sched_domain_topology_level *tl, "wrong sd_flags in topology description\n")) sd_flags &= ~TOPOLOGY_SD_FLAGS; + /* Apply detected topology flags */ + sd_flags |= dflags; + *sd = (struct sched_domain){ .min_interval = sd_weight, .max_interval = 2*sd_weight, @@ -1604,9 +1605,9 @@ static void __sdt_free(const struct cpumask *cpu_map) static struct sched_domain *build_sched_domain(struct sched_domain_topology_level *tl, const struct cpumask *cpu_map, struct sched_domain_attr *attr, - struct sched_domain *child, int cpu) + struct sched_domain *child, int dflags, int cpu) { - struct sched_domain *sd = sd_init(tl, cpu_map, child, cpu); + struct sched_domain *sd = sd_init(tl, cpu_map, child, dflags, cpu); if (child) { sd->level = child->level + 1; @@ -1632,6 +1633,65 @@ static struct sched_domain *build_sched_domain(struct sched_domain_topology_leve return sd; } +/* + * Find the sched_domain_topology_level where all CPU capacities are visible + * for all CPUs. + */ +static struct sched_domain_topology_level +*asym_cpu_capacity_level(const struct cpumask *cpu_map) +{ + int i, j, asym_level = 0; + bool asym = false; + struct sched_domain_topology_level *tl, *asym_tl = NULL; + unsigned long cap; + + /* Is there any asymmetry? */ + cap = arch_scale_cpu_capacity(NULL, cpumask_first(cpu_map)); + + for_each_cpu(i, cpu_map) { + if (arch_scale_cpu_capacity(NULL, i) != cap) { + asym = true; + break; + } + } + + if (!asym) + return NULL; + + /* + * Examine topology from all CPU's point of views to detect the lowest + * sched_domain_topology_level where a highest capacity CPU is visible + * to everyone. + */ + for_each_cpu(i, cpu_map) { + unsigned long max_capacity = arch_scale_cpu_capacity(NULL, i); + int tl_id = 0; + + for_each_sd_topology(tl) { + if (tl_id < asym_level) + goto next_level; + + for_each_cpu_and(j, tl->mask(i), cpu_map) { + unsigned long capacity; + + capacity = arch_scale_cpu_capacity(NULL, j); + + if (capacity <= max_capacity) + continue; + + max_capacity = capacity; + asym_level = tl_id; + asym_tl = tl; + } +next_level: + tl_id++; + } + } + + return asym_tl; +} + + /* * Build sched domains for a given set of CPUs and attach the sched domains * to the individual CPUs @@ -1644,18 +1704,27 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att struct s_data d; struct rq *rq = NULL; int i, ret = -ENOMEM; + struct sched_domain_topology_level *tl_asym; alloc_state = __visit_domain_allocation_hell(&d, cpu_map); if (alloc_state != sa_rootdomain) goto error; + tl_asym = asym_cpu_capacity_level(cpu_map); + /* Set up domains for CPUs specified by the cpu_map: */ for_each_cpu(i, cpu_map) { struct sched_domain_topology_level *tl; sd = NULL; for_each_sd_topology(tl) { - sd = build_sched_domain(tl, cpu_map, attr, sd, i); + int dflags = 0; + + if (tl == tl_asym) + dflags |= SD_ASYM_CPUCAPACITY; + + sd = build_sched_domain(tl, cpu_map, attr, sd, dflags, i); + if (tl == sched_domain_topology) *per_cpu_ptr(d.sd, i) = sd; if (tl->flags & SDTL_OVERLAP)