2019-02-26 06:22:52

by Len Brown

Subject: [PATCH 0/14] v2 multi-die/package topology support

This patch series does 4 things.

1. Parse the new CPUID.1F leaf to discover multi-die/package topology

2. Export multi-die topology inside the kernel

3. Update 3 places (coretemp, pkgtemp, rapl) that need to know
the difference between die-scope and package-scope MSRs.

(Note: Kan Liang has a patch series on top of this one to similarly
make the uncore perf code multi-die/package aware.)

4. Export multi-die topology to user-space via sysfs

These changes should have 0 impact on cache topology,
NUMA topology, Linux scheduler, or system performance.

These topology changes primarily impact parts of the kernel
and some applications that care about package MSR scope.
Also, some software is licensed per package, and other tools,
such as benchmark reporting software, sometimes care about packages.

Changes since v1:

Responded to all syntax and style feedback.

Fixed a bug in the CPUID.1F parsing code:
we were parsing the leaf properly, but in some configs
we were not updating the maps correctly.

topology_logical_die_id() replaces topology_unique_die_id()

Suggested by Kan, whose uncore code uses
topology_logical_package_id().

Restored sysfs core_siblings, core_siblings_list

v1 proposed re-defining this existing attribute to
be the threads in a die, rather than in a package.

For compatibility, we decided instead to keep this
attribute unchanged for now, even though
its name makes little sense, and it makes
no sense in a multi-die system.

Added sysfs package_threads, package_threads_list

Added this attribute to show thread siblings in a package.
Exactly the same as "core_siblings" above, a name now deprecated.
This attribute name and definition are immune to future
topology changes.

Suggested by Brice.

Added sysfs die_threads, die_threads_list

Added this attribute to show thread siblings in a die.
V1 had proposed putting this info into "core_siblings", but we
decided to leave that legacy attribute alone.
This attribute name and definition are immune to future
topology changes.

On a system with a single die per package, this attribute has the
same contents as "package_threads".

Suggested by Brice.

Added sysfs core_threads, core_threads_list

Added this attribute to show thread siblings in a core.
Exactly the same as "thread_siblings", a name now deprecated.
This attribute name and definition are immune to future
topology changes.

Suggested by Brice.


For compatibility, sysfs cpuX/topology core_siblings
and core_siblings_list are unchanged. They retain
their legacy definition of listing which CPUs share
the same package.
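
A user-space consumer of the "_list" attributes described above has to
parse the kernel's cpulist format (for example "0-3,8-11"). The following
is a minimal sketch of that parsing, not code from this series; the helper
name cpulist_weight() is hypothetical, and reading the string from
/sys/devices/system/cpu/cpuX/topology/die_threads_list is left to the
caller.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Count the CPUs named in a sysfs cpulist string such as "0-3,8-11".
 * A trailing newline, as sysfs emits, is accepted.
 * Returns -1 on malformed input.
 */
static int cpulist_weight(const char *s)
{
	int count = 0;

	assert(s != NULL);
	while (*s && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;

		if (end == s || lo < 0)
			return -1;	/* no CPU number where one was expected */
		if (*end == '-') {
			const char *p = end + 1;

			hi = strtol(p, &end, 10);
			if (end == p || hi < lo)
				return -1;	/* bad upper bound of a range */
		}
		count += (int)(hi - lo + 1);
		if (*end == ',')
			end++;
		else if (*end && *end != '\n')
			return -1;	/* unexpected separator */
		s = end;
	}
	return count;
}
```

On a system with a single die per package, the counts obtained from
die_threads_list and package_threads_list would be expected to match,
since the two attributes then have the same contents.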

Patch Summary:

Unchanged:

[PATCH 01/14] x86 topology: Fix doc typo
[PATCH 02/14] topology: Simplify cputopology.txt formatting and
[PATCH 03/14] x86 smpboot: Rename match_die() to match_pkg()
[PATCH 05/14] cpu topology: Export die_id
[PATCH 07/14] powercap/intel_rapl: Simplify rapl_find_package()
[PATCH 10/14] powercap/intel_rapl: update rapl domain name and debug

Bug Fixed:

[PATCH 04/14] x86 topology: Add CPUID.1F multi-die/package support

New since v1:

[PATCH 06/14] x86 topology: Define topology_logical_die_id()
[PATCH 12/14] topology: Create package_threads sysfs attribute
[PATCH 13/14] topology: Create core_threads sysfs attribute
[PATCH 14/14] topology: Create die_threads sysfs attribute

Updated (to use topology_logical_die_id()):

[PATCH 08/14] powercap/intel_rapl: Support multi-die/package
[PATCH 09/14] thermal/x86_pkg_temp_thermal: Support multi-die/package
[PATCH 11/14] hwmon/coretemp: Support multi-die/package



Documentation/cputopology.txt | 72 ++++++++++++++---------
Documentation/x86/topology.txt | 6 +-
arch/x86/include/asm/processor.h | 5 +-
arch/x86/include/asm/smp.h | 1 +
arch/x86/include/asm/topology.h | 5 ++
arch/x86/kernel/cpu/topology.c | 85 +++++++++++++++++++++-------
arch/x86/kernel/smpboot.c | 73 +++++++++++++++++++++++-
arch/x86/xen/smp_pv.c | 1 +
drivers/base/topology.c | 22 +++++++
drivers/hwmon/coretemp.c | 9 +--
drivers/powercap/intel_rapl.c | 75 +++++++++++++-----------
drivers/thermal/intel/x86_pkg_temp_thermal.c | 9 +--
include/linux/topology.h | 6 ++
13 files changed, 276 insertions(+), 93 deletions(-)

These patches are also available in the Git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux.git x86




2019-02-26 06:21:06

by Len Brown

Subject: [PATCH 01/14] x86 topology: Fix doc typo

From: Len Brown <[email protected]>

Syntax only, no functional or semantic change.

Reflect the actual cpuinfo_x86 field name:

s/logical_id/logical_proc_id/

Signed-off-by: Len Brown <[email protected]>
Cc: [email protected]
---
Documentation/x86/topology.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/x86/topology.txt b/Documentation/x86/topology.txt
index 2953e3ec9a02..06b3cdbc4048 100644
--- a/Documentation/x86/topology.txt
+++ b/Documentation/x86/topology.txt
@@ -51,7 +51,7 @@ The topology of a system is described in the units of:
The physical ID of the package. This information is retrieved via CPUID
and deduced from the APIC IDs of the cores in the package.

- - cpuinfo_x86.logical_id:
+ - cpuinfo_x86.logical_proc_id:

The logical ID of the package. As we do not trust BIOSes to enumerate the
packages in a consistent way, we introduced the concept of logical package
--
2.18.0-rc0


2019-02-26 06:21:22

by Len Brown

Subject: [PATCH 05/14] cpu topology: Export die_id

From: Len Brown <[email protected]>

Export die_id in cpu topology, for the benefit of hardware that
has multiple dies per package.

Signed-off-by: Len Brown <[email protected]>
Cc: [email protected]
---
Documentation/cputopology.txt | 6 ++++++
arch/x86/include/asm/topology.h | 1 +
drivers/base/topology.c | 4 ++++
include/linux/topology.h | 3 +++
4 files changed, 14 insertions(+)

diff --git a/Documentation/cputopology.txt b/Documentation/cputopology.txt
index cb61277e2308..4e6be7f68fd8 100644
--- a/Documentation/cputopology.txt
+++ b/Documentation/cputopology.txt
@@ -12,6 +12,12 @@ physical_package_id:
socket number, but the actual value is architecture and platform
dependent.

+die_id:
+
+ the CPU die ID of cpuX. Typically it is the hardware platform's
+ identifier (rather than the kernel's). The actual value is
+ architecture and platform dependent.
+
core_id:

the CPU core ID of cpuX. Typically it is the hardware platform's
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 453cf38a1c33..281be6bbc80d 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -106,6 +106,7 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);

#define topology_logical_package_id(cpu) (cpu_data(cpu).logical_proc_id)
#define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id)
+#define topology_die_id(cpu) (cpu_data(cpu).cpu_die_id)
#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)

#ifdef CONFIG_SMP
diff --git a/drivers/base/topology.c b/drivers/base/topology.c
index 5fd9f167ecc1..50352cf96f85 100644
--- a/drivers/base/topology.c
+++ b/drivers/base/topology.c
@@ -43,6 +43,9 @@ static ssize_t name##_list_show(struct device *dev, \
define_id_show_func(physical_package_id);
static DEVICE_ATTR_RO(physical_package_id);

+define_id_show_func(die_id);
+static DEVICE_ATTR_RO(die_id);
+
define_id_show_func(core_id);
static DEVICE_ATTR_RO(core_id);

@@ -72,6 +75,7 @@ static DEVICE_ATTR_RO(drawer_siblings_list);

static struct attribute *default_attrs[] = {
&dev_attr_physical_package_id.attr,
+ &dev_attr_die_id.attr,
&dev_attr_core_id.attr,
&dev_attr_thread_siblings.attr,
&dev_attr_thread_siblings_list.attr,
diff --git a/include/linux/topology.h b/include/linux/topology.h
index cb0775e1ee4b..5cc8595dd0e4 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -184,6 +184,9 @@ static inline int cpu_to_mem(int cpu)
#ifndef topology_physical_package_id
#define topology_physical_package_id(cpu) ((void)(cpu), -1)
#endif
+#ifndef topology_die_id
+#define topology_die_id(cpu) ((void)(cpu), -1)
+#endif
#ifndef topology_core_id
#define topology_core_id(cpu) ((void)(cpu), 0)
#endif
--
2.18.0-rc0
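
The include/linux/topology.h hunk above relies on a common kernel idiom:
an architecture may define topology_die_id(), and otherwise a generic
default evaluates to -1. A small stand-alone sketch of that idiom follows;
cpu_has_die_info() is a hypothetical caller added only for illustration.

```c
#include <assert.h>

/*
 * Generic fallback, as in the patch: (void)(cpu) references the argument
 * so callers get no "unused variable" warning, and the comma operator
 * makes the whole expression evaluate to -1.
 */
#ifndef topology_die_id
#define topology_die_id(cpu)	((void)(cpu), -1)
#endif

/* Hypothetical caller: treat a negative ID as "no die information". */
static int cpu_has_die_info(int cpu)
{
	assert(cpu >= 0);
	return topology_die_id(cpu) >= 0;
}
```

With only the fallback in effect, cpu_has_die_info() reports no die
information for every CPU; an architecture header that defines the macro
first changes the result without touching the caller.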


2019-02-26 06:21:26

by Len Brown

Subject: [PATCH 04/14] x86 topology: Add CPUID.1F multi-die/package support

From: Len Brown <[email protected]>

Some new systems have multiple software-visible die within each package.

Update Linux parsing of the Intel CPUID "Extended Topology Leaf"
to handle either CPUID.B, or the new CPUID.1F.

Add cpuinfo_x86.cpu_die_id and cpuinfo_x86.x86_max_dies to store the result.

cpu_die_id will be non-zero only for multi-die/package systems.

Signed-off-by: Len Brown <[email protected]>
Cc: [email protected]
---
Documentation/x86/topology.txt | 4 ++
arch/x86/include/asm/processor.h | 4 +-
arch/x86/kernel/cpu/topology.c | 85 +++++++++++++++++++++++++-------
arch/x86/kernel/smpboot.c | 2 +
4 files changed, 75 insertions(+), 20 deletions(-)

diff --git a/Documentation/x86/topology.txt b/Documentation/x86/topology.txt
index 06b3cdbc4048..8107b6cfc9ea 100644
--- a/Documentation/x86/topology.txt
+++ b/Documentation/x86/topology.txt
@@ -46,6 +46,10 @@ The topology of a system is described in the units of:

The number of cores in a package. This information is retrieved via CPUID.

+ - cpuinfo_x86.x86_max_dies:
+
+ The number of dies in a package. This information is retrieved via CPUID.
+
- cpuinfo_x86.phys_proc_id:

The physical ID of the package. This information is retrieved via CPUID
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 33051436c864..f2856fe03715 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -105,7 +105,8 @@ struct cpuinfo_x86 {
int x86_power;
unsigned long loops_per_jiffy;
/* cpuid returned max cores value: */
- u16 x86_max_cores;
+ u16 x86_max_cores;
+ u16 x86_max_dies;
u16 apicid;
u16 initial_apicid;
u16 x86_clflush_size;
@@ -117,6 +118,7 @@ struct cpuinfo_x86 {
u16 logical_proc_id;
/* Core id: */
u16 cpu_core_id;
+ u16 cpu_die_id;
/* Index into per_cpu list: */
u16 cpu_index;
u32 microcode;
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 8f6c784141d1..4d17e699657d 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -15,33 +15,63 @@
/* leaf 0xb SMT level */
#define SMT_LEVEL 0

-/* leaf 0xb sub-leaf types */
+/* extended topology sub-leaf types */
#define INVALID_TYPE 0
#define SMT_TYPE 1
#define CORE_TYPE 2
+#define DIE_TYPE 5

#define LEAFB_SUBTYPE(ecx) (((ecx) >> 8) & 0xff)
#define BITS_SHIFT_NEXT_LEVEL(eax) ((eax) & 0x1f)
#define LEVEL_MAX_SIBLINGS(ebx) ((ebx) & 0xffff)

-int detect_extended_topology_early(struct cpuinfo_x86 *c)
-{
#ifdef CONFIG_SMP
+/*
+ * Check if given CPUID extended topology "leaf" is implemented
+ */
+static int check_extended_topology_leaf(int leaf)
+{
unsigned int eax, ebx, ecx, edx;

- if (c->cpuid_level < 0xb)
+ cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
+
+ if (ebx == 0 || (LEAFB_SUBTYPE(ecx) != SMT_TYPE))
return -1;

- cpuid_count(0xb, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
+ return 0;
+}
+/*
+ * Return best CPUID Extended Topology Leaf supported
+ */
+static int detect_extended_topology_leaf(struct cpuinfo_x86 *c)
+{
+ if (c->cpuid_level >= 0x1f) {
+ if (check_extended_topology_leaf(0x1f) == 0)
+ return 0x1f;
+ }

- /*
- * check if the cpuid leaf 0xb is actually implemented.
- */
- if (ebx == 0 || (LEAFB_SUBTYPE(ecx) != SMT_TYPE))
+ if (c->cpuid_level >= 0xb) {
+ if (check_extended_topology_leaf(0xb) == 0)
+ return 0xb;
+ }
+
+ return -1;
+}
+#endif
+
+int detect_extended_topology_early(struct cpuinfo_x86 *c)
+{
+#ifdef CONFIG_SMP
+ unsigned int eax, ebx, ecx, edx;
+ int leaf;
+
+ leaf = detect_extended_topology_leaf(c);
+ if (leaf < 0)
return -1;

set_cpu_cap(c, X86_FEATURE_XTOPOLOGY);

+ cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
/*
* initial apic id, which also represents 32-bit extended x2apic id.
*/
@@ -52,7 +82,7 @@ int detect_extended_topology_early(struct cpuinfo_x86 *c)
}

/*
- * Check for extended topology enumeration cpuid leaf 0xb and if it
+ * Check for extended topology enumeration cpuid leaf, and if it
* exists, use it for populating initial_apicid and cpu topology
* detection.
*/
@@ -60,22 +90,28 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
{
#ifdef CONFIG_SMP
unsigned int eax, ebx, ecx, edx, sub_index;
- unsigned int ht_mask_width, core_plus_mask_width;
+ unsigned int ht_mask_width, core_plus_mask_width, die_plus_mask_width;
unsigned int core_select_mask, core_level_siblings;
+ unsigned int die_select_mask, die_level_siblings;
+ int leaf;

- if (detect_extended_topology_early(c) < 0)
+ leaf = detect_extended_topology_leaf(c);
+ if (leaf < 0)
return -1;

/*
* Populate HT related information from sub-leaf level 0.
*/
- cpuid_count(0xb, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
+ cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
+ c->initial_apicid = edx;
core_level_siblings = smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
+ die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);

sub_index = 1;
do {
- cpuid_count(0xb, sub_index, &eax, &ebx, &ecx, &edx);
+ cpuid_count(leaf, sub_index, &eax, &ebx, &ecx, &edx);

/*
* Check for the Core type in the implemented sub leaves.
@@ -83,23 +119,34 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
if (LEAFB_SUBTYPE(ecx) == CORE_TYPE) {
core_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
core_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
- break;
+ die_level_siblings = core_level_siblings;
+ die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
+ }
+ if (LEAFB_SUBTYPE(ecx) == DIE_TYPE) {
+ die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
}

sub_index++;
} while (LEAFB_SUBTYPE(ecx) != INVALID_TYPE);

core_select_mask = (~(-1 << core_plus_mask_width)) >> ht_mask_width;
-
- c->cpu_core_id = apic->phys_pkg_id(c->initial_apicid, ht_mask_width)
- & core_select_mask;
- c->phys_proc_id = apic->phys_pkg_id(c->initial_apicid, core_plus_mask_width);
+ die_select_mask = (~(-1 << die_plus_mask_width)) >>
+ core_plus_mask_width;
+
+ c->cpu_core_id = apic->phys_pkg_id(c->initial_apicid,
+ ht_mask_width) & core_select_mask;
+ c->cpu_die_id = apic->phys_pkg_id(c->initial_apicid,
+ core_plus_mask_width) & die_select_mask;
+ c->phys_proc_id = apic->phys_pkg_id(c->initial_apicid,
+ die_plus_mask_width);
/*
* Reinit the apicid, now that we have extended initial_apicid.
*/
c->apicid = apic->phys_pkg_id(c->initial_apicid, 0);

c->x86_max_cores = (core_level_siblings / smp_num_siblings);
+ c->x86_max_dies = (die_level_siblings / core_level_siblings);
#endif
return 0;
}
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 19a963890bbe..c70e547b18c2 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -393,6 +393,7 @@ static bool match_smt(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
int cpu1 = c->cpu_index, cpu2 = o->cpu_index;

if (c->phys_proc_id == o->phys_proc_id &&
+ c->cpu_die_id == o->cpu_die_id &&
per_cpu(cpu_llc_id, cpu1) == per_cpu(cpu_llc_id, cpu2)) {
if (c->cpu_core_id == o->cpu_core_id)
return topology_sane(c, o, "smt");
@@ -404,6 +405,7 @@ static bool match_smt(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
}

} else if (c->phys_proc_id == o->phys_proc_id &&
+ c->cpu_die_id == o->cpu_die_id &&
c->cpu_core_id == o->cpu_core_id) {
return topology_sane(c, o, "smt");
}
--
2.18.0-rc0
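
The mask arithmetic in detect_extended_topology() above can be tried out
in user space. The sketch below is an illustration, not the kernel code:
it assumes that apic->phys_pkg_id(id, shift) behaves like a plain right
shift (which depends on the APIC driver), and the names struct topo_ids,
low_bits() and decode_apicid() are hypothetical.

```c
#include <assert.h>

struct topo_ids {
	unsigned int core_id;
	unsigned int die_id;
	unsigned int pkg_id;
};

/* Low "width" bits set; width must be below 32 to keep the shift defined. */
static unsigned int low_bits(unsigned int width)
{
	assert(width < 32);
	return ~(~0u << width);
}

/*
 * Split an APIC ID into core, die and package IDs using the cumulative
 * bit widths reported by the SMT, core and die sub-leaves.
 */
static struct topo_ids decode_apicid(unsigned int apicid,
				     unsigned int ht_width,
				     unsigned int core_plus_width,
				     unsigned int die_plus_width)
{
	struct topo_ids t;

	assert(ht_width <= core_plus_width);
	assert(core_plus_width <= die_plus_width);

	/* Assumption: phys_pkg_id(id, shift) is modelled as id >> shift. */
	t.core_id = (apicid >> ht_width) &
		    (low_bits(core_plus_width) >> ht_width);
	t.die_id = (apicid >> core_plus_width) &
		   (low_bits(die_plus_width) >> core_plus_width);
	t.pkg_id = apicid >> die_plus_width;
	return t;
}
```

For example, with widths 1, 4 and 5, APIC ID 55 (binary 110111) decodes to
core 3, die 1, package 1. When no die level is reported, die_plus_width
equals core_plus_width, the die mask is empty, and die_id is always 0,
which matches the commit message's statement about single-die systems.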


2019-02-26 06:21:38

by Len Brown

Subject: [PATCH 13/14] topology: Create core_threads sysfs attribute

From: Len Brown <[email protected]>

Create CPU topology sysfs attributes:
"core_threads" and "core_threads_list"

These attributes represent all of the logical CPU threads that share the
same core.

These attributes are synonymous with the existing "thread_siblings" and
"thread_siblings_list" attributes, which will be deprecated.

Signed-off-by: Len Brown <[email protected]>
Suggested-by: Brice Goglin <[email protected]>
---
Documentation/cputopology.txt | 8 ++++----
drivers/base/topology.c | 6 ++++++
2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/Documentation/cputopology.txt b/Documentation/cputopology.txt
index 2794dbe8e559..e67915a8a512 100644
--- a/Documentation/cputopology.txt
+++ b/Documentation/cputopology.txt
@@ -36,15 +36,15 @@ drawer_id:
identifier (rather than the kernel's). The actual value is
architecture and platform dependent.

-thread_siblings:
+core_threads:

internal kernel map of cpuX's hardware threads within the same
- core as cpuX.
+ core as cpuX. (deprecated name: "thread_siblings")

-thread_siblings_list:
+core_threads_list:

human-readable list of cpuX's hardware threads within the same
- core as cpuX.
+ core as cpuX. (deprecated name: "thread_siblings_list")

package_threads:

diff --git a/drivers/base/topology.c b/drivers/base/topology.c
index 5f4405a08c6e..73efadf5e6d4 100644
--- a/drivers/base/topology.c
+++ b/drivers/base/topology.c
@@ -53,6 +53,10 @@ define_siblings_show_func(thread_siblings, sibling_cpumask);
static DEVICE_ATTR_RO(thread_siblings);
static DEVICE_ATTR_RO(thread_siblings_list);

+define_siblings_show_func(core_threads, sibling_cpumask);
+static DEVICE_ATTR_RO(core_threads);
+static DEVICE_ATTR_RO(core_threads_list);
+
define_siblings_show_func(core_siblings, core_cpumask);
static DEVICE_ATTR_RO(core_siblings);
static DEVICE_ATTR_RO(core_siblings_list);
@@ -83,6 +87,8 @@ static struct attribute *default_attrs[] = {
&dev_attr_core_id.attr,
&dev_attr_thread_siblings.attr,
&dev_attr_thread_siblings_list.attr,
+ &dev_attr_core_threads.attr,
+ &dev_attr_core_threads_list.attr,
&dev_attr_core_siblings.attr,
&dev_attr_core_siblings_list.attr,
&dev_attr_package_threads.attr,
--
2.18.0-rc0


2019-02-26 06:21:39

by Len Brown

Subject: [PATCH 12/14] topology: Create package_threads sysfs attribute

From: Len Brown <[email protected]>

The sysfs cpu/topology/core_siblings (and core_siblings_list)
attributes are documented, implemented, and used by programs to
represent the set of logical CPU threads sharing the same package.

This makes sense if the next topology level above a core
is always a package. But on systems where there is a die
topology level between a core and a package, the name
no longer makes sense.

So without changing its function, add a name for this map
that describes what it actually is -- package threads --
the set of logical CPU threads that share the same package.

This new name will be immune to changes in topology, since
it describes threads at the current level, not siblings
at a contained level.

Signed-off-by: Len Brown <[email protected]>
Suggested-by: Brice Goglin <[email protected]>
---
Documentation/cputopology.txt | 8 ++++----
drivers/base/topology.c | 6 ++++++
2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/Documentation/cputopology.txt b/Documentation/cputopology.txt
index 4e6be7f68fd8..2794dbe8e559 100644
--- a/Documentation/cputopology.txt
+++ b/Documentation/cputopology.txt
@@ -46,15 +46,15 @@ thread_siblings_list:
human-readable list of cpuX's hardware threads within the same
core as cpuX.

-core_siblings:
+package_threads:

internal kernel map of cpuX's hardware threads within the same
- physical_package_id.
+ physical_package_id. (deprecated name: "core_siblings")

-core_siblings_list:
+package_threads_list:

human-readable list of cpuX's hardware threads within the same
- physical_package_id.
+ physical_package_id. (deprecated name: "core_siblings_list")

book_siblings:

diff --git a/drivers/base/topology.c b/drivers/base/topology.c
index 50352cf96f85..5f4405a08c6e 100644
--- a/drivers/base/topology.c
+++ b/drivers/base/topology.c
@@ -57,6 +57,10 @@ define_siblings_show_func(core_siblings, core_cpumask);
static DEVICE_ATTR_RO(core_siblings);
static DEVICE_ATTR_RO(core_siblings_list);

+define_siblings_show_func(package_threads, core_cpumask);
+static DEVICE_ATTR_RO(package_threads);
+static DEVICE_ATTR_RO(package_threads_list);
+
#ifdef CONFIG_SCHED_BOOK
define_id_show_func(book_id);
static DEVICE_ATTR_RO(book_id);
@@ -81,6 +85,8 @@ static struct attribute *default_attrs[] = {
&dev_attr_thread_siblings_list.attr,
&dev_attr_core_siblings.attr,
&dev_attr_core_siblings_list.attr,
+ &dev_attr_package_threads.attr,
+ &dev_attr_package_threads_list.attr,
#ifdef CONFIG_SCHED_BOOK
&dev_attr_book_id.attr,
&dev_attr_book_siblings.attr,
--
2.18.0-rc0
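
The drivers/base/topology.c hunk above gives one mask two attribute
names by invoking the show-function generator twice with the same mask
accessor. The user-space imitation below is only a sketch of that idea:
define_show_func_sketch() and the stand-in core_cpumask() (an unsigned
long bitmap with made-up contents) are hypothetical simplifications, not
the kernel's macro or types.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel mask: one bit per CPU. */
static unsigned long core_cpumask(int cpu)
{
	(void)cpu;
	return 0x0fUL;	/* made-up data: CPUs 0-3 share the package */
}

/* Generate name##_show(); several names may share one mask accessor. */
#define define_show_func_sketch(name, mask)				\
static int name##_show(int cpu, char *buf, size_t len)			\
{									\
	assert(buf != NULL && len > 0);					\
	return snprintf(buf, len, "%lx\n", mask(cpu));			\
}

/* Legacy name and new name, both backed by the same mask. */
define_show_func_sketch(core_siblings, core_cpumask)
define_show_func_sketch(package_threads, core_cpumask)
```

Because both generated functions call the same accessor, the legacy and
the new attribute cannot drift apart, which is the property the commit
message relies on when it says the function is unchanged.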


2019-02-26 06:21:52

by Len Brown

Subject: [PATCH 14/14] topology: Create die_threads sysfs attribute

From: Len Brown <[email protected]>

The die_threads attribute shows all the logical CPUs that share the same die_id.

Signed-off-by: Len Brown <[email protected]>
Suggested-by: Brice Goglin <[email protected]>
---
Documentation/cputopology.txt | 12 ++++++++++++
arch/x86/include/asm/smp.h | 1 +
arch/x86/include/asm/topology.h | 1 +
arch/x86/kernel/smpboot.c | 22 ++++++++++++++++++++++
arch/x86/xen/smp_pv.c | 1 +
drivers/base/topology.c | 6 ++++++
include/linux/topology.h | 3 +++
7 files changed, 46 insertions(+)

diff --git a/Documentation/cputopology.txt b/Documentation/cputopology.txt
index e67915a8a512..6c25ce682c90 100644
--- a/Documentation/cputopology.txt
+++ b/Documentation/cputopology.txt
@@ -56,6 +56,16 @@ package_threads_list:
human-readable list of cpuX's hardware threads within the same
physical_package_id. (deprecated name: "core_siblings_list")

+die_threads:
+
+ internal kernel map of cpuX's hardware threads within the same
+ die_id.
+
+die_threads_list:
+
+ human-readable list of cpuX's hardware threads within the same
+ die_id.
+
book_siblings:

internal kernel map of cpuX's hardware threads within the same
@@ -92,6 +102,7 @@ these macros in include/asm-XXX/topology.h::
#define topology_drawer_id(cpu)
#define topology_sibling_cpumask(cpu)
#define topology_core_cpumask(cpu)
+ #define topology_die_cpumask(cpu)
#define topology_book_cpumask(cpu)
#define topology_drawer_cpumask(cpu)

@@ -108,6 +119,7 @@ not defined by include/asm-XXX/topology.h:
2) topology_core_id: 0
3) topology_sibling_cpumask: just the given CPU
4) topology_core_cpumask: just the given CPU
+5) topology_die_cpumask: just the given CPU

For architectures that don't support books (CONFIG_SCHED_BOOK) there are no
default definitions for topology_book_id() and topology_book_cpumask().
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 2e95b6c1bca3..39266d193597 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -23,6 +23,7 @@ extern unsigned int num_processors;

DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_map);
+DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_die_map);
/* cpus sharing the last level cache: */
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_llc_shared_map);
DECLARE_PER_CPU_READ_MOSTLY(u16, cpu_llc_id);
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 88578f10ae22..c573b0a26e16 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -111,6 +111,7 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);
#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)

#ifdef CONFIG_SMP
+#define topology_die_cpumask(cpu) (per_cpu(cpu_die_map, cpu))
#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
#define topology_sibling_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index e332d5e59652..d30fd42a3285 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -90,6 +90,10 @@ EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_map);
EXPORT_PER_CPU_SYMBOL(cpu_core_map);

+/* representing HT, core, and die siblings of each logical CPU */
+DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_die_map);
+EXPORT_PER_CPU_SYMBOL(cpu_die_map);
+
DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_llc_shared_map);

/* Per CPU bogomips and other parameters */
@@ -511,6 +515,15 @@ static bool match_pkg(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
return false;
}

+static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
+{
+ if ((c->phys_proc_id == o->phys_proc_id) &&
+ (c->cpu_die_id == o->cpu_die_id))
+ return true;
+ return false;
+}
+
+
#if defined(CONFIG_SCHED_SMT) || defined(CONFIG_SCHED_MC)
static inline int x86_sched_itmt_flags(void)
{
@@ -573,6 +586,7 @@ void set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, topology_sibling_cpumask(cpu));
cpumask_set_cpu(cpu, cpu_llc_shared_mask(cpu));
cpumask_set_cpu(cpu, topology_core_cpumask(cpu));
+ cpumask_set_cpu(cpu, topology_die_cpumask(cpu));
c->booted_cores = 1;
return;
}
@@ -621,6 +635,9 @@ void set_cpu_sibling_map(int cpu)
}
if (match_pkg(c, o) && !topology_same_node(c, o))
x86_has_numa_in_package = true;
+
+ if ((i == cpu) || (has_mp && match_die(c, o)))
+ link_mask(topology_die_cpumask, cpu, i);
}

threads = cpumask_weight(topology_sibling_cpumask(cpu));
@@ -1216,6 +1233,7 @@ static __init void disable_smp(void)
physid_set_mask_of_physid(0, &phys_cpu_present_map);
cpumask_set_cpu(0, topology_sibling_cpumask(0));
cpumask_set_cpu(0, topology_core_cpumask(0));
+ cpumask_set_cpu(0, topology_die_cpumask(0));
}

/*
@@ -1311,6 +1329,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
+ zalloc_cpumask_var(&per_cpu(cpu_die_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}

@@ -1531,6 +1550,8 @@ static void remove_siblinginfo(int cpu)
cpu_data(sibling).booted_cores--;
}

+ for_each_cpu(sibling, topology_die_cpumask(cpu))
+ cpumask_clear_cpu(cpu, topology_die_cpumask(sibling));
for_each_cpu(sibling, topology_sibling_cpumask(cpu))
cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
for_each_cpu(sibling, cpu_llc_shared_mask(cpu))
@@ -1538,6 +1559,7 @@ static void remove_siblinginfo(int cpu)
cpumask_clear(cpu_llc_shared_mask(cpu));
cpumask_clear(topology_sibling_cpumask(cpu));
cpumask_clear(topology_core_cpumask(cpu));
+ cpumask_clear(topology_die_cpumask(cpu));
c->cpu_core_id = 0;
c->booted_cores = 0;
cpumask_clear_cpu(cpu, cpu_sibling_setup_mask);
diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 145506f9fdbe..ac13b0be8448 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -251,6 +251,7 @@ static void __init xen_pv_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
+ zalloc_cpumask_var(&per_cpu(cpu_die_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);
diff --git a/drivers/base/topology.c b/drivers/base/topology.c
index 73efadf5e6d4..b6d1fec9b6a3 100644
--- a/drivers/base/topology.c
+++ b/drivers/base/topology.c
@@ -61,6 +61,10 @@ define_siblings_show_func(core_siblings, core_cpumask);
static DEVICE_ATTR_RO(core_siblings);
static DEVICE_ATTR_RO(core_siblings_list);

+define_siblings_show_func(die_threads, die_cpumask);
+static DEVICE_ATTR_RO(die_threads);
+static DEVICE_ATTR_RO(die_threads_list);
+
define_siblings_show_func(package_threads, core_cpumask);
static DEVICE_ATTR_RO(package_threads);
static DEVICE_ATTR_RO(package_threads_list);
@@ -91,6 +95,8 @@ static struct attribute *default_attrs[] = {
&dev_attr_core_threads_list.attr,
&dev_attr_core_siblings.attr,
&dev_attr_core_siblings_list.attr,
+ &dev_attr_die_threads.attr,
+ &dev_attr_die_threads_list.attr,
&dev_attr_package_threads.attr,
&dev_attr_package_threads_list.attr,
#ifdef CONFIG_SCHED_BOOK
diff --git a/include/linux/topology.h b/include/linux/topology.h
index 5cc8595dd0e4..47a3e3c08036 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -196,6 +196,9 @@ static inline int cpu_to_mem(int cpu)
#ifndef topology_core_cpumask
#define topology_core_cpumask(cpu) cpumask_of(cpu)
#endif
+#ifndef topology_die_cpumask
+#define topology_die_cpumask(cpu) cpumask_of(cpu)
+#endif

#ifdef CONFIG_SCHED_SMT
static inline const struct cpumask *cpu_smt_mask(int cpu)
--
2.18.0-rc0
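
The way set_cpu_sibling_map() fills the die map in the patch above can be
modelled in user space. This is a simplified sketch under stated
assumptions: at most 64 CPUs held in a uint64_t bitmask instead of
cpumask_var_t, all CPUs already known, and hypothetical names struct
cpu_ids and build_die_masks(). Only the match_die() test mirrors the
patch.

```c
#include <assert.h>
#include <stdint.h>

struct cpu_ids {
	unsigned int phys_proc_id;	/* package */
	unsigned int cpu_die_id;	/* die within the package */
};

/* Same test as the patch's match_die(): package and die must both match. */
static int match_die(const struct cpu_ids *c, const struct cpu_ids *o)
{
	return c->phys_proc_id == o->phys_proc_id &&
	       c->cpu_die_id == o->cpu_die_id;
}

/* For each CPU, set one bit for every CPU that sits on the same die. */
static void build_die_masks(const struct cpu_ids *cpus, int n,
			    uint64_t *masks)
{
	assert(n > 0 && n <= 64);
	for (int i = 0; i < n; i++) {
		masks[i] = 0;
		for (int j = 0; j < n; j++)
			if (match_die(&cpus[i], &cpus[j]))
				masks[i] |= UINT64_C(1) << j;
	}
}
```

Comparing phys_proc_id as well as cpu_die_id matters because die IDs are
only unique within a package: die 0 of package 0 and die 0 of package 1
must not end up in the same mask.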


2019-02-26 06:22:20

by Len Brown

Subject: [PATCH 09/14] thermal/x86_pkg_temp_thermal: Support multi-die/package

From: Zhang Rui <[email protected]>

On the new dual-die/package systems, the package temperature MSR becomes
die-scope. Thus instead of one thermal zone device per physical package,
now there should be one thermal_zone for each die on these systems.

This patch introduces x86_pkg_temp_thermal support for new
dual-die/package systems.

On hardware that does not have multiple dies per package, topology_logical_die_id()
equals topology_physical_package_id(), thus there is no functional change.

Signed-off-by: Zhang Rui <[email protected]>
Signed-off-by: Len Brown <[email protected]>
---
drivers/thermal/intel/x86_pkg_temp_thermal.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/thermal/intel/x86_pkg_temp_thermal.c b/drivers/thermal/intel/x86_pkg_temp_thermal.c
index 1ef937d799e4..1b03ab3ee20c 100644
--- a/drivers/thermal/intel/x86_pkg_temp_thermal.c
+++ b/drivers/thermal/intel/x86_pkg_temp_thermal.c
@@ -122,7 +122,7 @@ static int pkg_temp_debugfs_init(void)
*/
static struct pkg_device *pkg_temp_thermal_get_dev(unsigned int cpu)
{
- int pkgid = topology_logical_package_id(cpu);
+ int pkgid = topology_logical_die_id(cpu);

if (pkgid >= 0 && pkgid < max_packages)
return packages[pkgid];
@@ -353,7 +353,7 @@ static int pkg_thermal_notify(u64 msr_val)

static int pkg_temp_thermal_device_add(unsigned int cpu)
{
- int pkgid = topology_logical_package_id(cpu);
+ int pkgid = topology_logical_die_id(cpu);
u32 tj_max, eax, ebx, ecx, edx;
struct pkg_device *pkgdev;
int thres_count, err;
@@ -449,7 +449,7 @@ static int pkg_thermal_cpu_offline(unsigned int cpu)
* worker will see the package anymore.
*/
if (lastcpu) {
- packages[topology_logical_package_id(cpu)] = NULL;
+ packages[topology_logical_die_id(cpu)] = NULL;
/* After this point nothing touches the MSR anymore. */
wrmsr(MSR_IA32_PACKAGE_THERM_INTERRUPT,
pkgdev->msr_pkg_therm_low, pkgdev->msr_pkg_therm_high);
@@ -511,11 +511,12 @@ MODULE_DEVICE_TABLE(x86cpu, pkg_temp_thermal_ids);
static int __init pkg_temp_thermal_init(void)
{
int ret;
+ struct cpuinfo_x86 *c = &cpu_data(0);

if (!x86_match_cpu(pkg_temp_thermal_ids))
return -ENODEV;

- max_packages = topology_max_packages();
+ max_packages = topology_max_packages() * c->x86_max_dies;
packages = kcalloc(max_packages, sizeof(struct pkg_device *),
GFP_KERNEL);
if (!packages)
--
2.18.0-rc0
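
The sizing change in pkg_temp_thermal_init() above amounts to allocating
one slot per possible die rather than one per package, and keeping the
lookup bounds-checked. A stand-alone sketch of that pattern follows; the
names struct die_table, die_table_init() and die_table_get() are
hypothetical, and how a logical die index is derived is deliberately left
out because that patch is not shown here.

```c
#include <assert.h>
#include <stdlib.h>

struct die_table {
	void **slots;
	int nslots;
};

/* One slot per possible die: packages times dies per package. */
static int die_table_init(struct die_table *t, int max_packages,
			  int max_dies)
{
	assert(max_packages > 0 && max_dies > 0);
	t->nslots = max_packages * max_dies;
	t->slots = calloc((size_t)t->nslots, sizeof(*t->slots));
	return t->slots ? 0 : -1;
}

/* Bounds-checked lookup, in the spirit of pkg_temp_thermal_get_dev(). */
static void *die_table_get(const struct die_table *t, int die_idx)
{
	if (die_idx >= 0 && die_idx < t->nslots)
		return t->slots[die_idx];
	return NULL;
}
```

With max_dies equal to 1 the table has exactly one slot per package, so
the sketch degenerates to the pre-patch layout on single-die systems.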


2019-02-26 06:22:20

by Len Brown

Subject: [PATCH 03/14] x86 smpboot: Rename match_die() to match_pkg()

From: Len Brown <[email protected]>

Syntax only, no functional or semantic change.

This routine matches packages, not die, so name it thus.

Signed-off-by: Len Brown <[email protected]>
---
arch/x86/kernel/smpboot.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index ccd1f2a8e557..19a963890bbe 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -459,7 +459,7 @@ static bool match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
* multicore group inside a NUMA node. If this happens, we will
* discard the MC level of the topology later.
*/
-static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
+static bool match_pkg(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
{
if (c->phys_proc_id == o->phys_proc_id)
return true;
@@ -550,7 +550,7 @@ void set_cpu_sibling_map(int cpu)
for_each_cpu(i, cpu_sibling_setup_mask) {
o = &cpu_data(i);

- if ((i == cpu) || (has_mp && match_die(c, o))) {
+ if ((i == cpu) || (has_mp && match_pkg(c, o))) {
link_mask(topology_core_cpumask, cpu, i);

/*
@@ -574,7 +574,7 @@ void set_cpu_sibling_map(int cpu)
} else if (i != cpu && !c->booted_cores)
c->booted_cores = cpu_data(i).booted_cores;
}
- if (match_die(c, o) && !topology_same_node(c, o))
+ if (match_pkg(c, o) && !topology_same_node(c, o))
x86_has_numa_in_package = true;
}

--
2.18.0-rc0


2019-02-26 06:22:30

by Len Brown

Subject: [PATCH 08/14] powercap/intel_rapl: Support multi-die/package

From: Zhang Rui <[email protected]>

On the new dual-die/package systems, the RAPL MSR becomes die-scope.
Thus instead of one powercap device per physical package, now there
should be one powercap device for each unique die on these systems.

This patch introduces intel_rapl driver support for new
dual-die/package systems.

On hardware that does not have multi-die, topology_logical_die_id()
equals topology_physical_package_id(), thus there is no functional change.

Signed-off-by: Zhang Rui <[email protected]>
Signed-off-by: Len Brown <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Cc: [email protected]
---
drivers/powercap/intel_rapl.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c
index 6057d9695fed..8723e9ae7436 100644
--- a/drivers/powercap/intel_rapl.c
+++ b/drivers/powercap/intel_rapl.c
@@ -266,7 +266,7 @@ static struct rapl_domain *platform_rapl_domain; /* Platform (PSys) domain */
/* caller to ensure CPU hotplug lock is held */
static struct rapl_package *rapl_find_package(int cpu)
{
- int id = topology_physical_package_id(cpu);
+ int id = topology_logical_die_id(cpu);
struct rapl_package *rp;

list_for_each_entry(rp, &rapl_packages, plist) {
@@ -1457,7 +1457,7 @@ static void rapl_remove_package(struct rapl_package *rp)
/* called from CPU hotplug notifier, hotplug lock held */
static struct rapl_package *rapl_add_package(int cpu)
{
- int id = topology_physical_package_id(cpu);
+ int id = topology_logical_die_id(cpu);
struct rapl_package *rp;
int ret;

--
2.18.0-rc0


2019-02-26 06:22:50

by Len Brown

Subject: [PATCH 07/14] powercap/intel_rapl: Simplify rapl_find_package()

From: Zhang Rui <[email protected]>

Syntax only, no functional or semantic change.

Simplify how the code that looks up a package is called.
Rename find_package_by_id() to rapl_find_package().

Signed-off-by: Zhang Rui <[email protected]>
Signed-off-by: Len Brown <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Cc: [email protected]
---
drivers/powercap/intel_rapl.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c
index 6cdb2c14eee4..6057d9695fed 100644
--- a/drivers/powercap/intel_rapl.c
+++ b/drivers/powercap/intel_rapl.c
@@ -264,8 +264,9 @@ static struct powercap_control_type *control_type; /* PowerCap Controller */
static struct rapl_domain *platform_rapl_domain; /* Platform (PSys) domain */

/* caller to ensure CPU hotplug lock is held */
-static struct rapl_package *find_package_by_id(int id)
+static struct rapl_package *rapl_find_package(int cpu)
{
+ int id = topology_physical_package_id(cpu);
struct rapl_package *rp;

list_for_each_entry(rp, &rapl_packages, plist) {
@@ -1298,7 +1299,7 @@ static int __init rapl_register_psys(void)
rd->rpl[0].name = pl1_name;
rd->rpl[1].prim_id = PL2_ENABLE;
rd->rpl[1].name = pl2_name;
- rd->rp = find_package_by_id(0);
+ rd->rp = rapl_find_package(0);

power_zone = powercap_register_zone(&rd->power_zone, control_type,
"psys", NULL,
@@ -1454,8 +1455,9 @@ static void rapl_remove_package(struct rapl_package *rp)
}

/* called from CPU hotplug notifier, hotplug lock held */
-static struct rapl_package *rapl_add_package(int cpu, int pkgid)
+static struct rapl_package *rapl_add_package(int cpu)
{
+ int id = topology_physical_package_id(cpu);
struct rapl_package *rp;
int ret;

@@ -1464,7 +1466,7 @@ static struct rapl_package *rapl_add_package(int cpu, int pkgid)
return ERR_PTR(-ENOMEM);

/* add the new package to the list */
- rp->id = pkgid;
+ rp->id = id;
rp->lead_cpu = cpu;

/* check if the package contains valid domains */
@@ -1495,12 +1497,11 @@ static struct rapl_package *rapl_add_package(int cpu, int pkgid)
*/
static int rapl_cpu_online(unsigned int cpu)
{
- int pkgid = topology_physical_package_id(cpu);
struct rapl_package *rp;

- rp = find_package_by_id(pkgid);
+ rp = rapl_find_package(cpu);
if (!rp) {
- rp = rapl_add_package(cpu, pkgid);
+ rp = rapl_add_package(cpu);
if (IS_ERR(rp))
return PTR_ERR(rp);
}
@@ -1510,11 +1511,10 @@ static int rapl_cpu_online(unsigned int cpu)

static int rapl_cpu_down_prep(unsigned int cpu)
{
- int pkgid = topology_physical_package_id(cpu);
struct rapl_package *rp;
int lead_cpu;

- rp = find_package_by_id(pkgid);
+ rp = rapl_find_package(cpu);
if (!rp)
return 0;

--
2.18.0-rc0
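
The two RAPL patches above keep the same find-then-add pattern in rapl_cpu_online() and only change the key used for the lookup, from the physical package id to the logical die id. A minimal user-space sketch of that pattern follows. It is an illustration only, not the driver's code: struct pkg_node, pkg_find() and pkg_find_or_add() are hypothetical names, and a plain singly linked list stands in for the kernel's list_head.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct rapl_package. */
struct pkg_node {
	int id;			/* lookup key: the logical die id in this sketch */
	int lead_cpu;		/* first CPU seen for this key */
	struct pkg_node *next;
};

/* Walk the list and return the node whose key matches @id, or NULL. */
static struct pkg_node *pkg_find(struct pkg_node *head, int id)
{
	for (; head; head = head->next)
		if (head->id == id)
			return head;
	return NULL;
}

/* Return the existing node for @id, or allocate and link a new one. */
static struct pkg_node *pkg_find_or_add(struct pkg_node **head, int id, int cpu)
{
	struct pkg_node *n = pkg_find(*head, id);

	if (n)
		return n;

	n = calloc(1, sizeof(*n));
	if (!n)
		return NULL;

	n->id = id;
	n->lead_cpu = cpu;
	n->next = *head;
	*head = n;
	return n;
}
```

Because the key is computed in one place, switching the scope from package to die only touches the code that derives the id, which is what patch 08 does with two one-line changes.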


2019-02-26 06:22:55

by Len Brown

Subject: [PATCH 11/14] hwmon/coretemp: Support multi-die/package

From: Zhang Rui <[email protected]>

This patch introduces coretemp driver support
for new dual-die/package systems.

On the new dual-die/package systems, the package temperature MSRs become
die-scope. Thus instead of one hwmon device per physical package, there
should now be one hwmon device for each die on these systems.

On hardware that does not have multi-die support,
topology_logical_die_id() equals topology_physical_package_id(), thus the
only difference is that the physical package id is used as the coretemp
platform device id, instead of the logical package id, on these systems.

Signed-off-by: Zhang Rui <[email protected]>
Signed-off-by: Len Brown <[email protected]>
Acked-by: Guenter Roeck <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
drivers/hwmon/coretemp.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 5d34f7271e67..57f348d43819 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -435,7 +435,7 @@ static int chk_ucode_version(unsigned int cpu)

static struct platform_device *coretemp_get_pdev(unsigned int cpu)
{
- int pkgid = topology_logical_package_id(cpu);
+ int pkgid = topology_logical_die_id(cpu);

if (pkgid >= 0 && pkgid < max_packages)
return pkg_devices[pkgid];
@@ -579,7 +579,7 @@ static struct platform_driver coretemp_driver = {

static struct platform_device *coretemp_device_add(unsigned int cpu)
{
- int err, pkgid = topology_logical_package_id(cpu);
+ int err, pkgid = topology_logical_die_id(cpu);
struct platform_device *pdev;

if (pkgid < 0)
@@ -703,7 +703,7 @@ static int coretemp_cpu_offline(unsigned int cpu)
* the rest.
*/
if (cpumask_empty(&pd->cpumask)) {
- pkg_devices[topology_logical_package_id(cpu)] = NULL;
+ pkg_devices[topology_logical_die_id(cpu)] = NULL;
platform_device_unregister(pdev);
return 0;
}
@@ -732,6 +732,7 @@ static enum cpuhp_state coretemp_hp_online;
static int __init coretemp_init(void)
{
int err;
+ struct cpuinfo_x86 *c = &cpu_data(0);

/*
* CPUID.06H.EAX[0] indicates whether the CPU has thermal
@@ -741,7 +742,7 @@ static int __init coretemp_init(void)
if (!x86_match_cpu(coretemp_ids))
return -ENODEV;

- max_packages = topology_max_packages();
+ max_packages = topology_max_packages() * c->x86_max_dies;
pkg_devices = kcalloc(max_packages, sizeof(struct platform_device *),
GFP_KERNEL);
if (!pkg_devices)
--
2.18.0-rc0
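
The coretemp change above sizes the device table as packages times dies per package and then indexes it by logical die id, with a range check before every lookup. The sketch below shows the same idea in plain C. It is a hedged illustration under assumptions: struct die_device and the die_table_*() helpers are hypothetical names, and calloc() stands in for kcalloc().

```c
#include <stdlib.h>

/* Hypothetical stand-in for the driver's per-die device state. */
struct die_device {
	int logical_die_id;
};

/* Table of per-die slots; a NULL slot means no device is registered. */
struct die_table {
	struct die_device **slots;
	int nr_slots;
};

/* Size the table for the worst case: every package fully populated with dies. */
static int die_table_init(struct die_table *t, int max_packages, int max_dies)
{
	t->nr_slots = max_packages * max_dies;
	t->slots = calloc((size_t)t->nr_slots, sizeof(*t->slots));
	return t->slots ? 0 : -1;
}

/* Bounds-checked lookup, in the spirit of the check in coretemp_get_pdev(). */
static struct die_device *die_table_get(const struct die_table *t, int die_id)
{
	if (die_id < 0 || die_id >= t->nr_slots)
		return NULL;
	return t->slots[die_id];
}

/* Bounds-checked store; passing NULL clears the slot when a die goes offline. */
static int die_table_set(struct die_table *t, int die_id, struct die_device *dev)
{
	if (die_id < 0 || die_id >= t->nr_slots)
		return -1;
	t->slots[die_id] = dev;
	return 0;
}
```

On a single-die system max_dies is 1, so the table has the same size as before the patch.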


2019-02-26 06:23:17

by Len Brown

Subject: [PATCH 10/14] powercap/intel_rapl: update rapl domain name and debug messages

From: Zhang Rui <[email protected]>

The RAPL domain "name" attribute contains "Package-N",
which is ambiguous on multi-die per-package systems.

Update the name to "package-X-die-Y" on those systems.

No change on systems without multi-die.

Driver debug messages are also updated.

Signed-off-by: Zhang Rui <[email protected]>
Signed-off-by: Len Brown <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Cc: [email protected]
---
drivers/powercap/intel_rapl.c | 57 ++++++++++++++++++++---------------
1 file changed, 32 insertions(+), 25 deletions(-)

diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c
index 8723e9ae7436..47719c995f61 100644
--- a/drivers/powercap/intel_rapl.c
+++ b/drivers/powercap/intel_rapl.c
@@ -178,12 +178,15 @@ struct rapl_domain {
#define power_zone_to_rapl_domain(_zone) \
container_of(_zone, struct rapl_domain, power_zone)

+/* maximum rapl package domain name: package-%d-die-%d */
+#define PACKAGE_DOMAIN_NAME_LENGTH 30

-/* Each physical package contains multiple domains, these are the common
+
+/* Each rapl package contains multiple domains, these are the common
* data across RAPL domains within a package.
*/
struct rapl_package {
- unsigned int id; /* physical package/socket id */
+ unsigned int id; /* logical die id, equals physical 1-die systems */
unsigned int nr_domains;
unsigned long domain_map; /* bit map of active domains */
unsigned int power_unit;
@@ -198,6 +201,7 @@ struct rapl_package {
int lead_cpu; /* one active cpu per package for access */
/* Track active cpus */
struct cpumask cpumask;
+ char name[PACKAGE_DOMAIN_NAME_LENGTH];
};

struct rapl_defaults {
@@ -926,8 +930,8 @@ static int rapl_check_unit_core(struct rapl_package *rp, int cpu)
value = (msr_val & TIME_UNIT_MASK) >> TIME_UNIT_OFFSET;
rp->time_unit = 1000000 / (1 << value);

- pr_debug("Core CPU package %d energy=%dpJ, time=%dus, power=%duW\n",
- rp->id, rp->energy_unit, rp->time_unit, rp->power_unit);
+ pr_debug("Core CPU %s energy=%dpJ, time=%dus, power=%duW\n",
+ rp->name, rp->energy_unit, rp->time_unit, rp->power_unit);

return 0;
}
@@ -951,8 +955,8 @@ static int rapl_check_unit_atom(struct rapl_package *rp, int cpu)
value = (msr_val & TIME_UNIT_MASK) >> TIME_UNIT_OFFSET;
rp->time_unit = 1000000 / (1 << value);

- pr_debug("Atom package %d energy=%dpJ, time=%dus, power=%duW\n",
- rp->id, rp->energy_unit, rp->time_unit, rp->power_unit);
+ pr_debug("Atom %s energy=%dpJ, time=%dus, power=%duW\n",
+ rp->name, rp->energy_unit, rp->time_unit, rp->power_unit);

return 0;
}
@@ -1179,7 +1183,7 @@ static void rapl_update_domain_data(struct rapl_package *rp)
u64 val;

for (dmn = 0; dmn < rp->nr_domains; dmn++) {
- pr_debug("update package %d domain %s data\n", rp->id,
+ pr_debug("update %s domain %s data\n", rp->name,
rp->domains[dmn].name);
/* exclude non-raw primitives */
for (prim = 0; prim < NR_RAW_PRIMITIVES; prim++) {
@@ -1204,7 +1208,6 @@ static void rapl_unregister_powercap(void)
static int rapl_package_register_powercap(struct rapl_package *rp)
{
struct rapl_domain *rd;
- char dev_name[17]; /* max domain name = 7 + 1 + 8 for int + 1 for null*/
struct powercap_zone *power_zone = NULL;
int nr_pl, ret;

@@ -1215,20 +1218,16 @@ static int rapl_package_register_powercap(struct rapl_package *rp)
for (rd = rp->domains; rd < rp->domains + rp->nr_domains; rd++) {
if (rd->id == RAPL_DOMAIN_PACKAGE) {
nr_pl = find_nr_power_limit(rd);
- pr_debug("register socket %d package domain %s\n",
- rp->id, rd->name);
- memset(dev_name, 0, sizeof(dev_name));
- snprintf(dev_name, sizeof(dev_name), "%s-%d",
- rd->name, rp->id);
+ pr_debug("register package domain %s\n", rp->name);
power_zone = powercap_register_zone(&rd->power_zone,
control_type,
- dev_name, NULL,
+ rp->name, NULL,
&zone_ops[rd->id],
nr_pl,
&constraint_ops);
if (IS_ERR(power_zone)) {
- pr_debug("failed to register package, %d\n",
- rp->id);
+ pr_debug("failed to register power zone %s\n",
+ rp->name);
return PTR_ERR(power_zone);
}
/* track parent zone in per package/socket data */
@@ -1254,8 +1253,8 @@ static int rapl_package_register_powercap(struct rapl_package *rp)
&constraint_ops);

if (IS_ERR(power_zone)) {
- pr_debug("failed to register power_zone, %d:%s:%s\n",
- rp->id, rd->name, dev_name);
+ pr_debug("failed to register power_zone, %s:%s\n",
+ rp->name, rd->name);
ret = PTR_ERR(power_zone);
goto err_cleanup;
}
@@ -1268,7 +1267,7 @@ static int rapl_package_register_powercap(struct rapl_package *rp)
* failed after the first domain setup.
*/
while (--rd >= rp->domains) {
- pr_debug("unregister package %d domain %s\n", rp->id, rd->name);
+ pr_debug("unregister %s domain %s\n", rp->name, rd->name);
powercap_unregister_zone(control_type, &rd->power_zone);
}

@@ -1378,8 +1377,8 @@ static void rapl_detect_powerlimit(struct rapl_domain *rd)
/* check if the domain is locked by BIOS, ignore if MSR doesn't exist */
if (!rapl_read_data_raw(rd, FW_LOCK, false, &val64)) {
if (val64) {
- pr_info("RAPL package %d domain %s locked by BIOS\n",
- rd->rp->id, rd->name);
+ pr_info("RAPL %s domain %s locked by BIOS\n",
+ rd->rp->name, rd->name);
rd->state |= DOMAIN_STATE_BIOS_LOCKED;
}
}
@@ -1408,10 +1407,10 @@ static int rapl_detect_domains(struct rapl_package *rp, int cpu)
}
rp->nr_domains = bitmap_weight(&rp->domain_map, RAPL_DOMAIN_MAX);
if (!rp->nr_domains) {
- pr_debug("no valid rapl domains found in package %d\n", rp->id);
+ pr_debug("no valid rapl domains found in %s\n", rp->name);
return -ENODEV;
}
- pr_debug("found %d domains on package %d\n", rp->nr_domains, rp->id);
+ pr_debug("found %d domains on %s\n", rp->nr_domains, rp->name);

rp->domains = kcalloc(rp->nr_domains + 1, sizeof(struct rapl_domain),
GFP_KERNEL);
@@ -1444,8 +1443,8 @@ static void rapl_remove_package(struct rapl_package *rp)
rd_package = rd;
continue;
}
- pr_debug("remove package, undo power limit on %d: %s\n",
- rp->id, rd->name);
+ pr_debug("remove package, undo power limit on %s: %s\n",
+ rp->name, rd->name);
powercap_unregister_zone(control_type, &rd->power_zone);
}
/* do parent zone last */
@@ -1459,6 +1458,7 @@ static struct rapl_package *rapl_add_package(int cpu)
{
int id = topology_logical_die_id(cpu);
struct rapl_package *rp;
+ struct cpuinfo_x86 *c = &cpu_data(cpu);
int ret;

rp = kzalloc(sizeof(struct rapl_package), GFP_KERNEL);
@@ -1469,6 +1469,13 @@ static struct rapl_package *rapl_add_package(int cpu)
rp->id = id;
rp->lead_cpu = cpu;

+ if (c->x86_max_dies > 1)
+ snprintf(rp->name, PACKAGE_DOMAIN_NAME_LENGTH,
+ "package-%d-die-%d", c->phys_proc_id, c->cpu_die_id);
+ else
+ snprintf(rp->name, PACKAGE_DOMAIN_NAME_LENGTH, "package-%d",
+ c->phys_proc_id);
+
/* check if the package contains valid domains */
if (rapl_detect_domains(rp, cpu) ||
rapl_defaults->check_unit(rp, cpu)) {
--
2.18.0-rc0
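
The naming rule added to rapl_add_package() above can be tried in isolation. The following is a small sketch, not the driver's code: format_domain_name() is a hypothetical helper, and max_dies, pkg and die are passed in as plain integers, whereas the patch reads them from struct cpuinfo_x86.

```c
#include <stdio.h>

#define DOMAIN_NAME_LEN 30	/* same size as PACKAGE_DOMAIN_NAME_LENGTH in the patch */

/*
 * Format a power zone name: "package-X-die-Y" when the system has more
 * than one die per package, plain "package-X" otherwise.
 */
static void format_domain_name(char *buf, size_t len,
			       int max_dies, int pkg, int die)
{
	if (max_dies > 1)
		snprintf(buf, len, "package-%d-die-%d", pkg, die);
	else
		snprintf(buf, len, "package-%d", pkg);
}
```

Since snprintf() never writes more than len bytes, an unexpectedly large id would truncate the name rather than overflow the buffer.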


2019-02-26 06:23:46

by Len Brown

Subject: [PATCH 06/14] x86 topology: Define topology_logical_die_id()

From: Len Brown <[email protected]>

Define topology_logical_die_id(), modeled on the
existing topology_logical_package_id().

Signed-off-by: Len Brown <[email protected]>
---
arch/x86/include/asm/processor.h | 1 +
arch/x86/include/asm/topology.h | 3 +++
arch/x86/kernel/smpboot.c | 43 ++++++++++++++++++++++++++++++++
3 files changed, 47 insertions(+)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index f2856fe03715..ee34ff34889d 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -119,6 +119,7 @@ struct cpuinfo_x86 {
/* Core id: */
u16 cpu_core_id;
u16 cpu_die_id;
+ u16 logical_die_id;
/* Index into per_cpu list: */
u16 cpu_index;
u32 microcode;
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 281be6bbc80d..88578f10ae22 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -106,6 +106,7 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);

#define topology_logical_package_id(cpu) (cpu_data(cpu).logical_proc_id)
#define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id)
+#define topology_logical_die_id(cpu) (cpu_data(cpu).logical_die_id)
#define topology_die_id(cpu) (cpu_data(cpu).cpu_die_id)
#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)

@@ -125,6 +126,7 @@ static inline int topology_max_smt_threads(void)

int topology_update_package_map(unsigned int apicid, unsigned int cpu);
int topology_phys_to_logical_pkg(unsigned int pkg);
+int topology_phys_to_logical_die(unsigned int die);
bool topology_is_primary_thread(unsigned int cpu);
bool topology_smt_supported(void);
#else
@@ -132,6 +134,7 @@ bool topology_smt_supported(void);
static inline int
topology_update_package_map(unsigned int apicid, unsigned int cpu) { return 0; }
static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
+static inline int topology_phys_to_logical_die(unsigned int die) { return 0; }
static inline int topology_max_smt_threads(void) { return 1; }
static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
static inline bool topology_smt_supported(void) { return false; }
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c70e547b18c2..e332d5e59652 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -100,6 +100,7 @@ EXPORT_PER_CPU_SYMBOL(cpu_info);
unsigned int __max_logical_packages __read_mostly;
EXPORT_SYMBOL(__max_logical_packages);
static unsigned int logical_packages __read_mostly;
+static unsigned int logical_die __read_mostly;

/* Maximum number of SMT threads on any online core */
int __read_mostly __max_smt_threads = 1;
@@ -306,6 +307,24 @@ int topology_phys_to_logical_pkg(unsigned int phys_pkg)
return -1;
}
EXPORT_SYMBOL(topology_phys_to_logical_pkg);
+/**
+ * topology_phys_to_logical_die - Map a physical die id to logical
+ *
+ * Returns logical die id or -1 if not found
+ */
+int topology_phys_to_logical_die(unsigned int die_id)
+{
+ int cpu;
+
+ for_each_possible_cpu(cpu) {
+ struct cpuinfo_x86 *c = &cpu_data(cpu);
+
+ if (c->initialized && c->cpu_die_id == die_id)
+ return c->logical_die_id;
+ }
+ return -1;
+}
+EXPORT_SYMBOL(topology_phys_to_logical_die);

/**
* topology_update_package_map - Update the physical to logical package map
@@ -330,6 +349,29 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
cpu_data(cpu).logical_proc_id = new;
return 0;
}
+/**
+ * topology_update_die_map - Update the physical to logical die map
+ * @die: The die id as retrieved via CPUID
+ * @cpu: The cpu for which this is updated
+ */
+int topology_update_die_map(unsigned int die, unsigned int cpu)
+{
+ int new;
+
+ /* Already available somewhere? */
+ new = topology_phys_to_logical_die(die);
+ if (new >= 0)
+ goto found;
+
+ new = logical_die++;
+ if (new != die) {
+ pr_info("CPU %u Converting physical %u to logical die %u\n",
+ cpu, die, new);
+ }
+found:
+ cpu_data(cpu).logical_die_id = new;
+ return 0;
+}

void __init smp_store_boot_cpu_info(void)
{
@@ -339,6 +381,7 @@ void __init smp_store_boot_cpu_info(void)
*c = boot_cpu_data;
c->cpu_index = id;
topology_update_package_map(c->phys_proc_id, id);
+ topology_update_die_map(c->cpu_die_id, id);
c->initialized = true;
}

--
2.18.0-rc0
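
The idea behind the physical-to-logical die map is to hand out dense logical ids in first-seen order and to reuse an id when another CPU in the same die comes up. A stand-alone sketch of that idea follows. It rests on assumptions that are not in the patch: the lookup is keyed on the (physical package id, die id) pair, on the assumption that die ids repeat across packages, and struct cpu_topo, phys_to_logical_die() and update_die_map() are hypothetical names.

```c
#define MAX_CPUS 64

/* Hypothetical per-CPU record; field names loosely follow struct cpuinfo_x86. */
struct cpu_topo {
	int initialized;
	unsigned int phys_pkg_id;
	unsigned int die_id;
	int logical_die_id;
};

static struct cpu_topo cpus[MAX_CPUS];
static int nr_logical_dies;

/* Return the logical id already given to (pkg, die), or -1 if none yet. */
static int phys_to_logical_die(unsigned int pkg, unsigned int die)
{
	int cpu;

	for (cpu = 0; cpu < MAX_CPUS; cpu++) {
		const struct cpu_topo *c = &cpus[cpu];

		if (c->initialized && c->phys_pkg_id == pkg && c->die_id == die)
			return c->logical_die_id;
	}
	return -1;
}

/* Record @cpu's topology, reusing a logical id or handing out the next one. */
static int update_die_map(int cpu, unsigned int pkg, unsigned int die)
{
	int id;

	if (cpu < 0 || cpu >= MAX_CPUS)
		return -1;

	id = phys_to_logical_die(pkg, die);
	if (id < 0)
		id = nr_logical_dies++;

	cpus[cpu].phys_pkg_id = pkg;
	cpus[cpu].die_id = die;
	cpus[cpu].logical_die_id = id;
	cpus[cpu].initialized = 1;
	return id;
}
```

With two packages of two dies each, the four dies receive logical ids 0 to 3 regardless of how sparse the physical ids are, which is what lets drivers use the logical id as an array index.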


2019-02-26 06:23:56

by Len Brown

Subject: [PATCH 02/14] topology: Simplify cputopology.txt formatting and wording

From: Len Brown <[email protected]>

Syntax only, no functional or semantic change.

Signed-off-by: Len Brown <[email protected]>
Cc: [email protected]
---
Documentation/cputopology.txt | 46 +++++++++++++++++------------------
1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/Documentation/cputopology.txt b/Documentation/cputopology.txt
index c6e7e9196a8b..cb61277e2308 100644
--- a/Documentation/cputopology.txt
+++ b/Documentation/cputopology.txt
@@ -3,79 +3,79 @@ How CPU topology info is exported via sysfs
===========================================

Export CPU topology info via sysfs. Items (attributes) are similar
-to /proc/cpuinfo output of some architectures:
+to /proc/cpuinfo output of some architectures. They reside in
+/sys/devices/system/cpu/cpuX/topology/:

-1) /sys/devices/system/cpu/cpuX/topology/physical_package_id:
+physical_package_id:

physical package id of cpuX. Typically corresponds to a physical
socket number, but the actual value is architecture and platform
dependent.

-2) /sys/devices/system/cpu/cpuX/topology/core_id:
+core_id:

the CPU core ID of cpuX. Typically it is the hardware platform's
identifier (rather than the kernel's). The actual value is
architecture and platform dependent.

-3) /sys/devices/system/cpu/cpuX/topology/book_id:
+book_id:

the book ID of cpuX. Typically it is the hardware platform's
identifier (rather than the kernel's). The actual value is
architecture and platform dependent.

-4) /sys/devices/system/cpu/cpuX/topology/drawer_id:
+drawer_id:

the drawer ID of cpuX. Typically it is the hardware platform's
identifier (rather than the kernel's). The actual value is
architecture and platform dependent.

-5) /sys/devices/system/cpu/cpuX/topology/thread_siblings:
+thread_siblings:

internal kernel map of cpuX's hardware threads within the same
core as cpuX.

-6) /sys/devices/system/cpu/cpuX/topology/thread_siblings_list:
+thread_siblings_list:

human-readable list of cpuX's hardware threads within the same
core as cpuX.

-7) /sys/devices/system/cpu/cpuX/topology/core_siblings:
+core_siblings:

internal kernel map of cpuX's hardware threads within the same
physical_package_id.

-8) /sys/devices/system/cpu/cpuX/topology/core_siblings_list:
+core_siblings_list:

human-readable list of cpuX's hardware threads within the same
physical_package_id.

-9) /sys/devices/system/cpu/cpuX/topology/book_siblings:
+book_siblings:

internal kernel map of cpuX's hardware threads within the same
book_id.

-10) /sys/devices/system/cpu/cpuX/topology/book_siblings_list:
+book_siblings_list:

human-readable list of cpuX's hardware threads within the same
book_id.

-11) /sys/devices/system/cpu/cpuX/topology/drawer_siblings:
+drawer_siblings:

internal kernel map of cpuX's hardware threads within the same
drawer_id.

-12) /sys/devices/system/cpu/cpuX/topology/drawer_siblings_list:
+drawer_siblings_list:

human-readable list of cpuX's hardware threads within the same
drawer_id.

-To implement it in an architecture-neutral way, a new source file,
-drivers/base/topology.c, is to export the 6 to 12 attributes. The book
-and drawer related sysfs files will only be created if CONFIG_SCHED_BOOK
-and CONFIG_SCHED_DRAWER are selected.
+Architecture-neutral, drivers/base/topology.c, exports these attributes.
+However, the book and drawer related sysfs files will only be created if
+CONFIG_SCHED_BOOK and CONFIG_SCHED_DRAWER are selected, respectively.

-CONFIG_SCHED_BOOK and CONFIG_DRAWER are currently only used on s390, where
-they reflect the cpu and cache hierarchy.
+CONFIG_SCHED_BOOK and CONFIG_SCHED_DRAWER are currently only used on s390,
+where they reflect the cpu and cache hierarchy.

For an architecture to support this feature, it must define some of
these macros in include/asm-XXX/topology.h::
@@ -98,10 +98,10 @@ To be consistent on all architectures, include/linux/topology.h
provides default definitions for any of the above macros that are
not defined by include/asm-XXX/topology.h:

-1) physical_package_id: -1
-2) core_id: 0
-3) sibling_cpumask: just the given CPU
-4) core_cpumask: just the given CPU
+1) topology_physical_package_id: -1
+2) topology_core_id: 0
+3) topology_sibling_cpumask: just the given CPU
+4) topology_core_cpumask: just the given CPU

For architectures that don't support books (CONFIG_SCHED_BOOK) there are no
default definitions for topology_book_id() and topology_book_cpumask().
--
2.18.0-rc0
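
The *_list attributes documented above are meant for humans and scripts. As a hedged illustration of how a user-space tool might consume one, the sketch below parses a string in the kernel's usual cpulist form (comma-separated CPU numbers and ranges, for example "0-3,8") into a bitmask. The function name parse_cpulist() and the 64-CPU limit are assumptions made to keep the example short.

```c
#include <stdlib.h>

/*
 * Parse a cpulist string such as "0-3,8,10-11" into a bitmask.
 * Limited to CPUs 0..63 to keep the sketch small.
 * Returns 0 on success, -1 on malformed input or out-of-range CPU numbers.
 */
static int parse_cpulist(const char *s, unsigned long long *mask)
{
	*mask = 0;
	while (*s && *s != '\n') {
		char *end;
		long lo, hi;

		lo = strtol(s, &end, 10);
		if (end == s)
			return -1;	/* no digits where a CPU number was expected */
		hi = lo;
		if (*end == '-') {	/* a range "lo-hi" */
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s)
				return -1;
		}
		if (lo < 0 || hi < lo || hi > 63)
			return -1;
		for (; lo <= hi; lo++)
			*mask |= 1ULL << lo;

		if (*end == ',')
			end++;		/* step over the separator */
		else if (*end && *end != '\n')
			return -1;	/* unexpected trailing character */
		s = end;
	}
	return 0;
}
```

A tool would read the attribute from /sys/devices/system/cpu/cpuX/topology/ into a buffer and pass it to this function; the trailing newline that sysfs appends is tolerated.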


2019-02-26 18:52:18

by Peter Zijlstra

Subject: Re: [PATCH 0/14] v2 multi-die/package topology support

On Tue, Feb 26, 2019 at 01:19:58AM -0500, Len Brown wrote:
> Restored sysfs core_siblings, core_siblings_list
>
> v1 proposed re-defining this existing attribute to
> be the threads in a die, rather than in a package.
>
> For compatibility, decided rather to keep this
> attribute unchanged, for now, even though
> its name makes little sense, and it makes
> no sense in a multi-die system.

So why do things that make no sense? What breaks?

2019-02-26 18:54:44

by Peter Zijlstra

Subject: Re: [PATCH 0/14] v2 multi-die/package topology support

On Tue, Feb 26, 2019 at 01:19:58AM -0500, Len Brown wrote:
> Added sysfs package_threads, package_threads_list
>
> Added this attribute to show threads siblings in a package.
> Exactly same as "core_siblings above", a name now deprecated.
> This attribute name and definition is immune to future
> topology changes.
>
> Suggested by Brice.
>
> Added sysfs die_threads, die_threads_list
>
> Added this attribute to show which threads siblings in a die.
> V1 had proposed putting this info into "core_siblings", but we
> decided to leave that legacy attribute alone.
> This attribute name and definition is immune to future
> topology changes.
>
> On a single die-package system this attribute has same contents
> as "package_threads".
>
> Suggested by Brice.
>
> Added sysfs core_threads, core_threads_list
>
> Added this attribute to show which threads siblings in a core.
> Exactly same as "thread_siblings", a name now deprecated.
> This attribute name and definition is immune to future
> topology changes.
>
> Suggested by Brice.

I think I prefer 's/threads/cpus/g' on that. Threads makes me think SMT,
and I don't think there's any guarantee the part in question will have
SMT on.

2019-02-26 19:06:17

by Peter Zijlstra

Subject: Re: [PATCH 0/14] v2 multi-die/package topology support

On Tue, Feb 26, 2019 at 01:19:58AM -0500, Len Brown wrote:
> Documentation/cputopology.txt | 72 ++++++++++++++---------
> Documentation/x86/topology.txt | 6 +-
> arch/x86/include/asm/processor.h | 5 +-
> arch/x86/include/asm/smp.h | 1 +
> arch/x86/include/asm/topology.h | 5 ++
> arch/x86/kernel/cpu/topology.c | 85 +++++++++++++++++++++-------
> arch/x86/kernel/smpboot.c | 73 +++++++++++++++++++++++-
> arch/x86/xen/smp_pv.c | 1 +
> drivers/base/topology.c | 22 +++++++
> drivers/hwmon/coretemp.c | 9 +--
> drivers/powercap/intel_rapl.c | 75 +++++++++++++-----------
> drivers/thermal/intel/x86_pkg_temp_thermal.c | 9 +--
> include/linux/topology.h | 6 ++
> 13 files changed, 276 insertions(+), 93 deletions(-)

Should we not also have changes to
arch/x86/kernel/cpu/proc.c:show_cpuinfo_cores() ?

2019-02-26 19:08:12

by Peter Zijlstra

Subject: Re: [PATCH 07/14] powercap/intel_rapl: Simplify rapl_find_package()

On Tue, Feb 26, 2019 at 01:20:05AM -0500, Len Brown wrote:
> -static struct rapl_package *find_package_by_id(int id)
> +static struct rapl_package *rapl_find_package(int cpu)
> {
> + int id = topology_physical_package_id(cpu);
> struct rapl_package *rp;

Which you'll change to topology_physical_die_id() in the next patch.

If you respin the series again, could we pick a better name?

2019-02-26 19:46:58

by Dave Hansen

Subject: Re: [PATCH 04/14] x86 topology: Add CPUID.1F multi-die/package support

On 2/25/19 10:20 PM, Len Brown wrote:
> -/* leaf 0xb sub-leaf types */
> +/* extended topology sub-leaf types */
> #define INVALID_TYPE 0
> #define SMT_TYPE 1
> #define CORE_TYPE 2
> +#define DIE_TYPE 5

Looking in the SDM, Vol. 3A "8.9.1 Hierarchical Mapping of Shared
Resources", there are a _couple_ of new levels: Die, Tile and Module.
But, this patch only covers Dies.

Was there a reason for that?

I wonder if we'll end up with different (better) infrastructure if we do
these all at once instead of hacking them in one at a time.
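
For reference, the level types under discussion come from the extended topology sub-leaves, where each sub-leaf reports a level type in ECX[15:8] and the shift to the next level in EAX[4:0]. The sketch below decodes those two fields from raw register values. It is an illustration only: the enum and helper names are hypothetical, and the Module and Tile values (3 and 4) are taken from the SDM's description of the leaf rather than from the patch, which defines only the invalid, SMT, core and die types.

```c
#include <stdint.h>

enum topo_level_type {
	TOPO_INVALID = 0,
	TOPO_SMT     = 1,
	TOPO_CORE    = 2,
	TOPO_MODULE  = 3,	/* assumption: value from the SDM, not the patch */
	TOPO_TILE    = 4,	/* assumption: value from the SDM, not the patch */
	TOPO_DIE     = 5,
};

/* Level type is reported in ECX[15:8] of each sub-leaf. */
static unsigned int topo_level_type(uint32_t ecx)
{
	return (ecx >> 8) & 0xff;
}

/* Number of bits to shift the x2APIC id to reach the next level: EAX[4:0]. */
static unsigned int topo_level_shift(uint32_t eax)
{
	return eax & 0x1f;
}

/* Human-readable name for a level type, "unknown" for anything unlisted. */
static const char *topo_level_name(unsigned int type)
{
	switch (type) {
	case TOPO_INVALID: return "invalid";
	case TOPO_SMT:     return "smt";
	case TOPO_CORE:    return "core";
	case TOPO_MODULE:  return "module";
	case TOPO_TILE:    return "tile";
	case TOPO_DIE:     return "die";
	default:           return "unknown";
	}
}
```

A parser that only cares about dies can walk the sub-leaves and skip any type it does not recognize, which is one way to tolerate module and tile levels without acting on them.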

2019-02-28 17:17:52

by Len Brown

Subject: Re: [PATCH 04/14] x86 topology: Add CPUID.1F multi-die/package support

On Tue, Feb 26, 2019 at 2:46 PM Dave Hansen <[email protected]> wrote:
>
> On 2/25/19 10:20 PM, Len Brown wrote:
> > -/* leaf 0xb sub-leaf types */
> > +/* extended topology sub-leaf types */
> > #define INVALID_TYPE 0
> > #define SMT_TYPE 1
> > #define CORE_TYPE 2
> > +#define DIE_TYPE 5
>
> Looking in the SDM, Vol. 3A "8.9.1 Hierarchical Mapping of Shared
> Resources", there are a _couple_ of new levels: Die, Tile and Module.
> But, this patch only covers Dies.
>
> Was there a reason for that?

We have software visible modules and tiles in past products,
and those were discovered by family/model checks, rather than
enumerated. I'm content to let that sleeping dog lie.
We don't have software visible modules or tiles enumerated with this
mechanism in any current or about to be current products.
When (and if) such products do come along, only then will we know
why software *cares* about modules and tiles -- and that will be the time
to add that support. Per below, I think this series cleanly prepares us
for that time.

> I wonder if we'll end up with different (better) infrastructure if we do
> these all at once instead of hacking them in one at a time.

Assuming that "hacking in" is a derogatory term, let me make the case
that this patch series cleanly sets the stage for the future.

old:

thread_siblings: the threads that are in the same core
core_siblings: the threads that are in the same package

This naming scheme assumed that there would never be a topology level
between a core and a package. Though we leave "core_siblings" intact
for legacy SW that depends on it, we mark it deprecated.

new:

core_threads: the threads in the same core
die_threads: the threads in the same die
package_threads: the threads in the same package

So in the future we could always add...

dave_threads: the threads in the same dave

So I think we are ready for whatever the future may throw at us,
while remaining compatible, consistent, and no "hacking in" required.

thanks,
Len Brown, Intel Open Source Technology Center
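
The three proposed attributes nest: a CPU's core_threads are a subset of its die_threads, which are a subset of its package_threads. The sketch below computes the three masks from a small table of ids to make that relationship concrete. It is a hypothetical user-space model, not kernel code: struct cpu_ids, enum scope and sibling_mask() are illustrative names, and the table is limited to 32 CPUs so that one uint32_t holds the mask.

```c
#include <stdint.h>

#define NR_CPUS 8

/* Hypothetical topology table: one row per logical CPU. */
struct cpu_ids {
	int pkg;
	int die;
	int core;
};

enum scope { SCOPE_CORE, SCOPE_DIE, SCOPE_PACKAGE };

/* Bitmask of the CPUs sharing @cpu's core, die, or package, @cpu included. */
static uint32_t sibling_mask(const struct cpu_ids *t, int n, int cpu, enum scope s)
{
	uint32_t mask = 0;
	int i;

	for (i = 0; i < n; i++) {
		if (t[i].pkg != t[cpu].pkg)
			continue;	/* every scope requires the same package */
		if (s <= SCOPE_DIE && t[i].die != t[cpu].die)
			continue;	/* core and die scope require the same die */
		if (s == SCOPE_CORE && t[i].core != t[cpu].core)
			continue;	/* core scope also requires the same core */
		mask |= 1u << i;
	}
	return mask;
}
```

Adding a further level between core and package would mean one more id column and one more comparison, without changing the meaning of the existing masks.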

2019-03-07 14:47:17

by Morten Rasmussen

Subject: Re: [PATCH 05/14] cpu topology: Export die_id

Hi Len,

On Tue, Feb 26, 2019 at 01:20:03AM -0500, Len Brown wrote:
> From: Len Brown <[email protected]>
>
> Export die_id in cpu topology, for the benefit of hardware that
> has multiple-die/package.
>
> Signed-off-by: Len Brown <[email protected]>
> Cc: [email protected]
> ---
> Documentation/cputopology.txt | 6 ++++++
> arch/x86/include/asm/topology.h | 1 +
> drivers/base/topology.c | 4 ++++
> include/linux/topology.h | 3 +++
> 4 files changed, 14 insertions(+)
>
> diff --git a/Documentation/cputopology.txt b/Documentation/cputopology.txt
> index cb61277e2308..4e6be7f68fd8 100644
> --- a/Documentation/cputopology.txt
> +++ b/Documentation/cputopology.txt
> @@ -12,6 +12,12 @@ physical_package_id:
> socket number, but the actual value is architecture and platform
> dependent.
>
> +die_id:
> +
> + the CPU die ID of cpuX. Typically it is the hardware platform's
> + identifier (rather than the kernel's). The actual value is
> + architecture and platform dependent.
> +
> core_id:

Can we add the details about die_id further down in cputopology.txt as
well?

diff --git a/Documentation/cputopology.txt b/Documentation/cputopology.txt
index 6c25ce682c90..77b65583081e 100644
--- a/Documentation/cputopology.txt
+++ b/Documentation/cputopology.txt
@@ -97,6 +97,7 @@ For an architecture to support this feature, it must define some of
these macros in include/asm-XXX/topology.h::

#define topology_physical_package_id(cpu)
+ #define topology_die_id(cpu)
#define topology_core_id(cpu)
#define topology_book_id(cpu)
#define topology_drawer_id(cpu)
@@ -116,10 +117,11 @@ provides default definitions for any of the above macros that are
not defined by include/asm-XXX/topology.h:

1) topology_physical_package_id: -1
-2) topology_core_id: 0
-3) topology_sibling_cpumask: just the given CPU
-4) topology_core_cpumask: just the given CPU
-5) topology_die_cpumask: just the given CPU
+2) topology_die_id: -1
+3) topology_core_id: 0
+4) topology_sibling_cpumask: just the given CPU
+5) topology_core_cpumask: just the given CPU
+6) topology_die_cpumask: just the given CPU

>
> the CPU core ID of cpuX. Typically it is the hardware platform's
> diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
> index 453cf38a1c33..281be6bbc80d 100644
> --- a/arch/x86/include/asm/topology.h
> +++ b/arch/x86/include/asm/topology.h
> @@ -106,6 +106,7 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);
>
> #define topology_logical_package_id(cpu) (cpu_data(cpu).logical_proc_id)
> #define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id)
> +#define topology_die_id(cpu) (cpu_data(cpu).cpu_die_id)
> #define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)
>
> #ifdef CONFIG_SMP

The above is x86 specific and seems to fit better with the next patch in
the series.

Morten
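
The fallback scheme documented in the cputopology.txt hunk above can be
sketched in plain C. This is a user-space illustration only, not the kernel
header: the "arch" definitions at the top are made up (4 CPUs per package),
and the fallbacks simply reproduce the defaults listed in the documentation
(-1 for topology_physical_package_id and topology_die_id, 0 for
topology_core_id).

```c
/*
 * Hypothetical stand-in for include/asm-XXX/topology.h: this pretend
 * architecture knows about packages and cores, but says nothing about dies.
 */
#define topology_physical_package_id(cpu)	((cpu) / 4)
#define topology_core_id(cpu)			((cpu) % 4)

/*
 * Generic fallbacks, used only for macros the architecture left undefined.
 * The values follow the defaults listed in cputopology.txt.
 */
#ifndef topology_physical_package_id
#define topology_physical_package_id(cpu)	((void)(cpu), -1)
#endif
#ifndef topology_die_id
#define topology_die_id(cpu)			((void)(cpu), -1)
#endif
#ifndef topology_core_id
#define topology_core_id(cpu)			((void)(cpu), 0)
#endif
```

With these definitions, topology_die_id() falls back to -1 for every CPU
because the pretend architecture did not define it, while the package and
core macros keep their architecture-specific values.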

2019-03-07 15:01:34

by Morten Rasmussen

Subject: Re: [PATCH 0/14] v2 multi-die/package topology support

On Tue, Feb 26, 2019 at 07:53:58PM +0100, Peter Zijlstra wrote:
> On Tue, Feb 26, 2019 at 01:19:58AM -0500, Len Brown wrote:
> > Added sysfs package_threads, package_threads_list
> >
> > Added this attribute to show threads siblings in a package.
> > Exactly same as "core_siblings above", a name now deprecated.
> > This attribute name and definition is immune to future
> > topology changes.
> >
> > Suggested by Brice.
> >
> > Added sysfs die_threads, die_threads_list
> >
> > Added this attribute to show which threads siblings in a die.
> > V1 had proposed putting this info into "core_siblings", but we
> > decided to leave that legacy attribute alone.
> > This attribute name and definition is immune to future
> > topology changes.
> >
> > On a single die-package system this attribute has same contents
> > as "package_threads".
> >
> > Suggested by Brice.
> >
> > Added sysfs core_threads, core_threads_list
> >
> > Added this attribute to show which threads siblings in a core.
> > Exactly same as "thread_siblings", a name now deprecated.
> > This attribute name and definition is immune to future
> > topology changes.
> >
> > Suggested by Brice.
>
> I think I prefer 's/threads/cpus/g' on that. Threads makes me think SMT,
> and I don't think there's any guarantee the part in question will have
> SMT on.

I think 'threads' is a bit confusing as well. We seem to be using 'cpu'
everywhere for something we can schedule tasks on, including the sysfs
/sys/devices/system/cpu/ subdirs for each SMT thread on SMT systems.

Another thing that I find confusing is that with this series we add a new
die id/mask which is totally unrelated to the DIE level in the
sched_domain hierarchy. We should rename DIE level to something that
reflects what it really is. If we can agree on that ;-)

NODE level?

Morten

2019-03-26 18:19:45

by Len Brown

Subject: Re: [PATCH 05/14] cpu topology: Export die_id

On Thu, Mar 7, 2019 at 9:45 AM Morten Rasmussen
<[email protected]> wrote:

> Can we add the details about die_id further down in cputopology.txt as
> well?

Done.

> > diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
> > --- a/arch/x86/include/asm/topology.h
> > +++ b/arch/x86/include/asm/topology.h
...
> > +#define topology_die_id(cpu) (cpu_data(cpu).cpu_die_id)
> > #define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)

> The above is x86 specific and seems to fit better with the next patch in
> the series.

No problem, I split that x86 line into a separate x86-specific patch.

thanks Morten,

Len Brown, Intel Open Source Technology Center

2019-03-26 18:28:45

by Len Brown

Subject: Re: [PATCH 07/14] powercap/intel_rapl: Simplify rapl_find_package()

On Tue, Feb 26, 2019 at 2:07 PM Peter Zijlstra <[email protected]> wrote:
>
> On Tue, Feb 26, 2019 at 01:20:05AM -0500, Len Brown wrote:
> > -static struct rapl_package *find_package_by_id(int id)
> > +static struct rapl_package *rapl_find_package(int cpu)
> > {
> > + int id = topology_physical_package_id(cpu);
> > struct rapl_package *rp;
>
> Which you'll change to topology_physical_die_id() in the next patch.
>
> If you respin the series again, could we pick a better name?

Done.
I called this routine "rapl_find_package_domain()" -- since that is
what it does --
it finds what RAPL calls a "Package Domain", no matter if that domain
is implemented in the topology by a die or a physical package.

thanks,
Len Brown, Intel Open Source Technology Center

2019-04-05 18:17:09

by Thomas Gleixner

Subject: Re: [PATCH 04/14] x86 topology: Add CPUID.1F multi-die/package support

On Thu, 28 Feb 2019, Len Brown wrote:
> On Tue, Feb 26, 2019 at 2:46 PM Dave Hansen <[email protected]> wrote:
> > I wonder if we'll end up with different (better) infrastructure if we do
> > these all at once instead of hacking them in one at a time.
>
> Assuming that "hacking in" is a derogatory term, let me make the case
> that this patch series cleanly sets the stage for the future.
>
> old:
>
> thread_siblings: the threads that are in the same core
> core_siblings: the threads that are in the same package
>
> This naming scheme assumed that there would never be a topology level
> between a core and a package. Though we leave "core_siblings" intact
> for legacy SW that depends on it, we mark it deprecated.
>
> new:
>
> core_threads: the threads in the same core
> die_threads: the threads in the same die
> package_threads: the threads in the same package
>
> So in the future we could always add...
>
> dave_threads: the threads in the same dave
>
> So I think we are ready for whatever the future may throw at us,
> while remaining compatible, consistent, and no "hacking in" required.

Makes sense.

tglx

2019-04-05 18:28:06

by Thomas Gleixner

Subject: Re: [PATCH 08/14] powercap/intel_rapl: Support multi-die/package

On Tue, 26 Feb 2019, Len Brown wrote:

> From: Zhang Rui <[email protected]>
>
> On the new dual-die/package systems, the RAPL MSR becomes die-scope.
> Thus instead of one powercap device per physical package, now there
> should be one powercap device for each unique die on these systems.
>
> This patch introduces intel_rapl driver support for new
> dual-die/package systems.

This patch .... See Documentation/process/

and this sentence is not really helpful either.

> On the hardwares that do not have multi-die, topology_logical_die_id()
> equals topology_physical_package_id(), thus there is no functional change.

Something like this:

To support this the RAPL package domain has to be identified by the die id
instead of the package id. On single die CPUs the die id is the same as the
physical package id.

Hmm?

> Signed-off-by: Zhang Rui <[email protected]>
> Signed-off-by: Len Brown <[email protected]>
> Acked-by: Rafael J. Wysocki <[email protected]>
> Cc: [email protected]
> ---
> drivers/powercap/intel_rapl.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c
> index 6057d9695fed..8723e9ae7436 100644
> --- a/drivers/powercap/intel_rapl.c
> +++ b/drivers/powercap/intel_rapl.c
> @@ -266,7 +266,7 @@ static struct rapl_domain *platform_rapl_domain; /* Platform (PSys) domain */
> /* caller to ensure CPU hotplug lock is held */
> static struct rapl_package *rapl_find_package(int cpu)
> {
> - int id = topology_physical_package_id(cpu);
> + int id = topology_logical_die_id(cpu);
> struct rapl_package *rp;
>
> list_for_each_entry(rp, &rapl_packages, plist) {
> @@ -1457,7 +1457,7 @@ static void rapl_remove_package(struct rapl_package *rp)
> /* called from CPU hotplug notifier, hotplug lock held */
> static struct rapl_package *rapl_add_package(int cpu)
> {
> - int id = topology_physical_package_id(cpu);
> + int id = topology_logical_die_id(cpu);
> struct rapl_package *rp;
> int ret;
>
> --
> 2.18.0-rc0
>
>

2019-04-05 18:31:03

by Thomas Gleixner

Subject: Re: [PATCH 09/14] thermal/x86_pkg_temp_thermal: Support multi-die/package

On Tue, 26 Feb 2019, Len Brown wrote:
> static int __init pkg_temp_thermal_init(void)
> {
> int ret;
> + struct cpuinfo_x86 *c = &cpu_data(0);
>
> if (!x86_match_cpu(pkg_temp_thermal_ids))
> return -ENODEV;
>
> - max_packages = topology_max_packages();
> + max_packages = topology_max_packages() * c->x86_max_dies;

This is really a sloppy hack. Just because cpuinfo is accessible, it's not a
good idea to fiddle with it in a driver. We went to great lengths to abstract
that stuff. So please add a new helper function topology_max_dies() and
retrieve it from that.

Thanks,

tglx
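
The accessor requested above could look roughly like the following
user-space sketch. Everything except the names topology_max_packages() and
topology_max_dies() is hypothetical: the struct, the stub values and
pkg_temp_max_zones() are invented for illustration, and the assumption that
topology_max_dies() returns the number of dies per package is taken from how
x86_max_dies is used in the quoted hunk.

```c
/* Hypothetical stand-in for the boot CPU's topology data. */
struct cpuinfo_sketch {
	unsigned int max_dies;	/* dies per package, like x86_max_dies */
};

/* Stub values: a two-package system with two dies in each package. */
static struct cpuinfo_sketch boot_cpu_sketch = { .max_dies = 2 };
static unsigned int logical_packages_sketch = 2;

/* Stub for the existing helper that reports the number of packages. */
static inline unsigned int topology_max_packages(void)
{
	return logical_packages_sketch;
}

/* The new accessor: drivers call this instead of touching cpuinfo. */
static inline unsigned int topology_max_dies(void)
{
	return boot_cpu_sketch.max_dies;
}

/* Driver side: size per-die state without looking inside cpuinfo. */
static unsigned int pkg_temp_max_zones(void)
{
	return topology_max_packages() * topology_max_dies();
}
```

The point of the indirection is that the driver only sees two topology
helpers; where the die count is stored can then change without touching the
driver.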

2019-04-05 18:34:08

by Thomas Gleixner

Subject: Re: [PATCH 0/14] v2 multi-die/package topology support

On Tue, 26 Feb 2019, Len Brown wrote:

> This patch series does 4 things.
>
> 1. Parses the new CPUID.1F leaf to discover multi-die/package topology
>
> 2. Export multi-die topology inside the kernel
>
> 3. Update 3 places (coretemp, pkgtemp, rapl) that need to know
> the difference between die and package-scope MSR.
>
> (Note: Kan Liang has a patch series on top of this one to similarly
> make the uncore perf code multi-die/package aware.)
>
> 4. Export multi-die topology to user-space via sysfs

Aside from the few nitpicks (which apply to several patches), this looks very
nice.

Thanks,

tglx

2019-04-05 18:34:47

by Thomas Gleixner

Subject: Re: [PATCH 0/14] v2 multi-die/package topology support

On Thu, 7 Mar 2019, Morten Rasmussen wrote:
> On Tue, Feb 26, 2019 at 07:53:58PM +0100, Peter Zijlstra wrote:
> > On Tue, Feb 26, 2019 at 01:19:58AM -0500, Len Brown wrote:
> > > Added sysfs core_threads, core_threads_list
> > >
> > > Added this attribute to show which threads siblings in a core.
> > > Exactly same as "thread_siblings", a name now deprecated.
> > > This attribute name and definition is immune to future
> > > topology changes.
> > >
> > > Suggested by Brice.
> >
> > I think I prefer 's/threads/cpus/g' on that. Threads makes me think SMT,
> > and I don't think there's any guarantee the part in question will have
> > SMT on.
>
> I think 'threads' is a bit confusing as well. We seem to be using 'cpu'
> everywhere for something we can schedule tasks on, including the sysfs
> /sys/devices/system/cpu/ subdirs for each SMT thread on SMT systems.
>
> Another thing that I find confusing is that with this series we add a new
> die id/mask which is totally unrelated to the DIE level in the
> sched_domain hierarchy. We should rename DIE level to something that
> reflects what it really is. If we can agree on that ;-)
>
> NODE level?

Any conclusions here?

2019-04-05 18:41:31

by Brown, Len

Subject: RE: [PATCH 09/14] thermal/x86_pkg_temp_thermal: Support multi-die/package


> please add a new helper function topology_max_dies() and retrieve it from that.

I'll do this for Rui -- since his patch is in my x86 branch,
and it is already (very early) Saturday morning for him:-)

FYI, there were a couple of other small changes I made to that branch
in response to list feedback that I pushed, but didn't re-email out the series.

Thomas,
Let me know if you prefer me to re-email the series, or just send a git pull request.

Thanks,
-Len

2019-04-05 18:52:17

by Thomas Gleixner

Subject: RE: [PATCH 09/14] thermal/x86_pkg_temp_thermal: Support multi-die/package

Len,

On Fri, 5 Apr 2019, Brown, Len wrote:
> > please add a new helper function topology_max_dies() and retrieve it from that.
>
> I'll do this for Rui -- since his patch is in my x86 branch,
> and it is already (very early) Saturday morning for him:-)
>
> FYI, there were a couple of other small changes I made to that branch
> in response to list feedback that I pushed, but didn't re-email out the series.
>
> Thomas,
> Let me know if you prefer me to re-email the series, or just send a git pull request.

Mail is fine.

Thanks,

tglx

2019-04-08 01:36:43

by Zhang, Rui

Subject: Re: [PATCH 08/14] powercap/intel_rapl: Support multi-die/package

On Fri, 2019-04-05 at 20:27 +0200, Thomas Gleixner wrote:
> On Tue, 26 Feb 2019, Len Brown wrote:
>
> >
> > From: Zhang Rui <[email protected]>
> >
> > On the new dual-die/package systems, the RAPL MSR becomes die-scope.
> > Thus instead of one powercap device per physical package, now there
> > should be one powercap device for each unique die on these systems.
> >
> > This patch introduces intel_rapl driver support for new
> > dual-die/package systems.
> This patch .... See Documentation/process/
>
> and this sentence is not really helpful either.
>
> >
> > On the hardwares that do not have multi-die,
> > topology_logical_die_id()
> > equals topology_physical_package_id(), thus there is no functional
> > change.
> Something like this:
>
> To support this the RAPL package domain has to be identified by the
> die id
> instead of the package id. On single die CPUs the die id is the same
> as the
> physical package id.
>
> Hmm?

Yeah, sounds much better.
Len, will you help me update the changelog or do you want me to send an
updated version to you?

thanks,
rui

>
> >
> > Signed-off-by: Zhang Rui <[email protected]>
> > Signed-off-by: Len Brown <[email protected]>
> > Acked-by: Rafael J. Wysocki <[email protected]>
> > Cc: [email protected]
> > ---
> >  drivers/powercap/intel_rapl.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/powercap/intel_rapl.c
> > b/drivers/powercap/intel_rapl.c
> > index 6057d9695fed..8723e9ae7436 100644
> > --- a/drivers/powercap/intel_rapl.c
> > +++ b/drivers/powercap/intel_rapl.c
> > @@ -266,7 +266,7 @@ static struct rapl_domain
> > *platform_rapl_domain; /* Platform (PSys) domain */
> >  /* caller to ensure CPU hotplug lock is held */
> >  static struct rapl_package *rapl_find_package(int cpu)
> >  {
> > - int id = topology_physical_package_id(cpu);
> > + int id = topology_logical_die_id(cpu);
> >   struct rapl_package *rp;
> >  
> >   list_for_each_entry(rp, &rapl_packages, plist) {
> > @@ -1457,7 +1457,7 @@ static void rapl_remove_package(struct
> > rapl_package *rp)
> >  /* called from CPU hotplug notifier, hotplug lock held */
> >  static struct rapl_package *rapl_add_package(int cpu)
> >  {
> > - int id = topology_physical_package_id(cpu);
> > + int id = topology_logical_die_id(cpu);
> >   struct rapl_package *rp;
> >   int ret;
> >  

2019-04-12 19:34:05

by Len Brown

Subject: Re: [PATCH 0/14] v2 multi-die/package topology support

On Tue, Feb 26, 2019 at 1:51 PM Peter Zijlstra <[email protected]> wrote:
>
> On Tue, Feb 26, 2019 at 01:19:58AM -0500, Len Brown wrote:
> > Restored sysfs core_siblings, core_siblings_list
> >
> > v1 proposed re-defining this existing attribute to
> > be the threads in a die, rather than in a package.
> >
> > For compatibility, decided rather to keep this
> > attribute unchanged, for now, even though
> > its name makes little sense, and it makes
> > no sense in a multi-die system.
>
> So why do things that make no sense?

>>> 7) /sys/devices/system/cpu/cpuX/topology/core_siblings:
>>>
>>> internal kernel map of cpuX's hardware threads within the same
>>> physical_package_id.

This definition tells you what cpus are in what package.
That is fine, it is useful, and it is in use.

What doesn't make sense is that it is called "core_siblings".
Who is to say that the map of CPUs inside a package has anything to do
with "cores"?
Sometimes it does, sometimes it doesn't...

> What breaks?

User space applications, such as lscpu and hwloc are using this attribute
per its definition, to figure out what cpus are in what packages.
If we change the definition to match the attribute's name, they break.
If we change the attribute name to match the definition, they break.

So the plan is to simply leave this attribute and its definition in place,
deprecate it, and move to the new attribute names that don't have
the word "siblings" in them -- which implies a known fixed topology.

We can schedule this attribute to be deleted some day,
but changing it and hoping you've updated all of user-space
would be an unnecessary pain.

Len Brown, Intel Open Source Technology Center
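
To make the compatibility concern concrete, here is a minimal sketch of how a
user-space tool might turn the contents of core_siblings_list (for example
"0-3,8-11") into a CPU mask. It is not taken from lscpu or hwloc;
parse_cpulist() is a hypothetical helper and, to stay short, it only handles
CPUs 0..63.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a sysfs-style CPU list such as "0-3,8-11\n" into a 64-bit mask.
 * Returns 0 on success and -1 on malformed input or CPUs above 63.
 */
static int parse_cpulist(const char *list, uint64_t *mask)
{
	*mask = 0;

	while (*list && *list != '\n') {
		char *end;
		long lo = strtol(list, &end, 10);
		long hi = lo;

		if (end == list)
			return -1;		/* no number where one is expected */

		if (*end == '-') {		/* a range "lo-hi" */
			const char *p = end + 1;

			hi = strtol(p, &end, 10);
			if (end == p)
				return -1;
		}

		if (lo < 0 || hi < lo || hi > 63)
			return -1;

		for (long cpu = lo; cpu <= hi; cpu++)
			*mask |= (uint64_t)1 << cpu;

		if (*end == ',')
			end++;			/* move on to the next entry */
		else if (*end != '\0' && *end != '\n')
			return -1;

		list = end;
	}

	return 0;
}
```

A tool that groups CPUs by the mask read from this attribute depends on the
attribute meaning "CPUs in the same package", which is why redefining it
under the same name would change that tool's output.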

2019-04-12 19:55:23

by Len Brown

Subject: Re: [PATCH 0/14] v2 multi-die/package topology support

> > > I think I prefer 's/threads/cpus/g' on that. Threads makes me think SMT,
> > > and I don't think there's any guarantee the part in question will have
> > > SMT on.
> >
> > I think 'threads' is a bit confusing as well. We seem to be using 'cpu'
> > everywhere for something we can schedule tasks on, including the sysfs
> > /sys/devices/system/cpu/ subdirs for each SMT thread on SMT systems.

I agree with Peter and Morten.
"cpu" is more clear and consistent than "thread" here.
I'll spin the series with that string changed.

> > Another thing that I find confusing is that with this series we add a new
> > die id/mask which is totally unrelated to the DIE level in the
> > sched_domain hierarchy. We should rename DIE level to something that
> > reflects what it really is. If we can agree on that ;-)
> >
> > NODE level?

Cache topology and node interconnect topology impact performance, and so
they are what we look at when we decide to run something on this CPU or that one.

That logical topology lives within the physical package and die topology,
but doesn't necessarily match it. For example, caches can be shared
or split into pieces inside a package or die. Logical nodes may match
die boundaries, or there may be multiple logical nodes within a single
physical package or die.

We have mechanisms for explicitly enumerating the caches,
and for nodes. This patch series does not touch those mechanisms.

The reason we need to know about physical packages and dies is that
there are other things associated with them,
e.g. power and temperature domains, and certain system registers
follow these physical boundaries. Code that talks to those items
needs to be able to understand these physical boundaries.

I hope that helps.

thanks,
Len Brown, Intel Open Source Technology Center

2019-04-12 20:42:35

by Brice Goglin

Subject: Re: [PATCH 0/14] v2 multi-die/package topology support

On 12/04/2019 at 21:52, Len Brown wrote:
>>>> I think I prefer 's/threads/cpus/g' on that. Threads makes me think SMT,
>>>> and I don't think there's any guarantee the part in question will have
>>>> SMT on.
>>> I think 'threads' is a bit confusing as well. We seem to be using 'cpu'
>>> everywhere for something we can schedule tasks on, including the sysfs
>>> /sys/devices/system/cpu/ subdirs for each SMT thread on SMT systems.
> I agree with Peter and Morten.
> "cpu" is more clear and consistent than "thread" here.
> I'll spin the series with that string changed.


Agreed, I should have used that suffix from the beginning.

Brice


2019-04-30 06:52:39

by Len Brown

Subject: Re: [PATCH 0/14] v2 multi-die/package topology support

On Tue, Feb 26, 2019 at 2:05 PM Peter Zijlstra <[email protected]> wrote:
>
> On Tue, Feb 26, 2019 at 01:19:58AM -0500, Len Brown wrote:
> > Documentation/cputopology.txt | 72 ++++++++++++++---------
> > Documentation/x86/topology.txt | 6 +-
> > arch/x86/include/asm/processor.h | 5 +-
> > arch/x86/include/asm/smp.h | 1 +
> > arch/x86/include/asm/topology.h | 5 ++
> > arch/x86/kernel/cpu/topology.c | 85 +++++++++++++++++++++-------
> > arch/x86/kernel/smpboot.c | 73 +++++++++++++++++++++++-
> > arch/x86/xen/smp_pv.c | 1 +
> > drivers/base/topology.c | 22 +++++++
> > drivers/hwmon/coretemp.c | 9 +--
> > drivers/powercap/intel_rapl.c | 75 +++++++++++++-----------
> > drivers/thermal/intel/x86_pkg_temp_thermal.c | 9 +--
> > include/linux/topology.h | 6 ++
> > 13 files changed, 276 insertions(+), 93 deletions(-)
>
> Should we not also have changes to
> arch/x86/kernel/cpu/proc.c:show_cpuinfo_cores() ?

Good question.
I was thinking that /proc/cpuinfo was sort of the legacy API, and that
adding a field might break something, while adding an attribute to the
sysfs topology directory was the compatible/safe way to make additions.

/proc/cpuinfo has these fields today:

physical id : 0
this is the physical package id
siblings : 8
this is the count of cpus in the same package
core id : 3
this is cpu_core_id
cpu cores : 4
this is booted_cores

If one were to make a change here, I'd consider adding the (physical) die_id,
though it is already in sysfs topology as an attribute.

Not sure if it would then make sense to print the count of cpus in the die.
Not sure what I'd name it, and this info is already in sysfs as a map and list.


Len Brown, Intel Open Source Technology Center
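
For reference, the "might break something" worry is about parsers like the
following sketch, which pick integer fields out of /proc/cpuinfo lines by
name. cpuinfo_field() is a hypothetical helper written for this example, not
code from any existing tool; it assumes the usual "key<whitespace>: value"
layout of the fields listed above.

```c
#include <stdio.h>
#include <string.h>

/*
 * If 'line' is a /proc/cpuinfo line for the field 'key', store its integer
 * value in *value and return 1. Return 0 for any other line.
 */
static int cpuinfo_field(const char *line, const char *key, int *value)
{
	size_t klen = strlen(key);
	const char *p;

	if (strncmp(line, key, klen) != 0)
		return 0;

	/* Skip the padding between the key and the colon. */
	p = line + klen;
	while (*p == ' ' || *p == '\t')
		p++;

	/* Rejects longer keys that merely share a prefix, e.g. "cpu cores". */
	if (*p != ':')
		return 0;

	return sscanf(p + 1, "%d", value) == 1;
}
```

A parser written this way ignores lines it does not recognize, so it would
tolerate an extra field; a parser that relies on field order or line counts
would not.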

2019-04-30 09:35:35

by Borislav Petkov

Subject: Re: [PATCH 0/14] v2 multi-die/package topology support

On Tue, Apr 30, 2019 at 02:50:58AM -0400, Len Brown wrote:
> If one were to make a change here, I'd consider adding the (physical) die_id,
> though it is already in sysfs topology as an attribute.

From: Documentation/x86/topology.txt

"The kernel does not care about the concept of physical sockets because
a socket has no relevance to software. It's an electromechanical
component. In the past a socket always contained a single package
(see below), but with the advent of Multi Chip Modules (MCM) a socket
can hold more than one package. So there might be still references to
sockets in the code, but they are of historical nature and should be
cleaned up."

So that die thing has only small relevance to some software, as you say:

"These topology changes primarily impact parts of the kernel and some
applications that care about package MSR scope."

So if there's no real need to add it there, then it probably shouldn't
be added. The topology is already too complex - so much so, that tools
are even generating PDFs from it :)

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2019-04-30 18:47:45

by Len Brown

Subject: Re: [PATCH 0/14] v2 multi-die/package topology support

On Tue, Apr 30, 2019 at 5:33 AM Borislav Petkov <[email protected]> wrote:

> So that die thing has only small relevance to some software, as you say:
>
> "These topology changes primarily impact parts of the kernel and some
> applications that care about package MSR scope."
>
> So if there's no real need to add it there, then it probably shouldn't
> be added. The topology is already too complex - so much so, that tools
> are even generating PDFs from it :)

Agreed.
Let's keep /proc/cpuinfo simple.
If a case emerges where it makes sense to add more detail there, it is
trivial to add later.

thanks,
Len Brown, Intel Open Source Technology Center