2023-04-13 17:32:16

by K Prateek Nayak

[permalink] [raw]
Subject: [PATCH v2 0/2] arch/x86: Set L2 Cache ID on AMD and Hygon processors

commit 66558b730f253 ("sched: Add cluster scheduler level for x86")
defined cluster on x86 as the set of threads sharing the same L2 cache.
cluster_id on x86, maps to the l2c_id which currently only Intel
processors set.

This series sets the l2c_id on AMD and Hygon processors with
X86_FEATURE_TOPOEXT, using the extended APIC ID and the
"Cache Properties (L2)" CPUID (0x8000001D EAX). On AMD and Hygon
processors without X86_FEATURE_TOPOEXT, current behavior will continue.

Following are the changes in value reported by
"/sys/devices/system/cpu/cpuX/topology/cluster_id" on a 2P Milan system
(2 x 64C/128T) where L2 is per-core and SMT sibling of CPU (X) is
CPU ((X + 128) % 256).

- tip:x86/core

$ for i in {0..255}; do\
echo -n "CPU$i cluster_id: ";\
cat /sys/devices/system/cpu/cpu$i/topology/cluster_id;\
done;

CPU0 cluster_id: 65535
CPU1 cluster_id: 65535
CPU2 cluster_id: 65535
CPU3 cluster_id: 65535
CPU4 cluster_id: 65535
...
CPU254 cluster_id: 65535
CPU255 cluster_id: 65535

- tip:x86/core + this series

$ for i in {0..255}; do\
echo -n "CPU$i cluster_id: ";\
cat /sys/devices/system/cpu/cpu$i/topology/cluster_id;\
done;

CPU0 cluster_id: 0
CPU1 cluster_id: 1
CPU2 cluster_id: 2
CPU3 cluster_id: 3
CPU4 cluster_id: 4
CPU5 cluster_id: 5
CPU6 cluster_id: 6
CPU7 cluster_id: 7
CPU8 cluster_id: 8
...
CPU126 cluster_id: 126
CPU127 cluster_id: 127
CPU128 cluster_id: 0
CPU129 cluster_id: 1
CPU130 cluster_id: 2
CPU131 cluster_id: 3
CPU132 cluster_id: 4
CPU133 cluster_id: 5
CPU134 cluster_id: 6
CPU135 cluster_id: 7
CPU136 cluster_id: 8
...
CPU254 cluster_id: 126
CPU255 cluster_id: 127

Note: Hygon maintainer Pu Wen <[email protected]> has confirmed that the
same strategy of setting l2c_id works for Hygon as well without
requiring any change to the function
(https://lore.kernel.org/lkml/20230411122348.GAZDVRVNvbbS8F5NUB@fat_crate.local/)
Call to the same function has been added in the Hygon topology init path
too. Pu Wen, can you please test this version too and let me know if the
behavior is still as expected.

The series also adds documentation for clusters on x86 platforms and
applies cleanly on top of tip:x86/core at commit ce3ba2af9695
("x86: Suppress KMSAN reports in arch_within_stack_frames()")

---
o v1->v2
- Improved documentation of cluster based on Peter's suggestion.
- Renamed cacheinfo_amd_init_l2c_id() to
cacheinfo_topoext_init_l2c_id() and added the call to same in
Hygon's topology init path.
- Collected tags for Patch 1.
---
K Prateek Nayak (2):
arch/x86: Set L2 Cache ID on AMD and Hygon processors
x86/Documentation: Add documentation about cluster

Documentation/x86/topology.rst | 27 ++++++++++++++++++++++++
arch/x86/include/asm/cacheinfo.h | 1 +
arch/x86/kernel/cpu/amd.c | 1 +
arch/x86/kernel/cpu/cacheinfo.c | 36 ++++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/hygon.c | 1 +
5 files changed, 66 insertions(+)

--
2.34.1


2023-04-13 17:32:19

by K Prateek Nayak

[permalink] [raw]
Subject: [PATCH v2 1/2] arch/x86: Set L2 Cache ID on AMD and Hygon processors

On AMD and Hygon processors supporting X86_FEATURE_TOPOEXT set the l2c_id
using the Extended APIC ID and the Cache Properties CPUID.

Tested-by: Oleksandr Natalenko <[email protected]>
Signed-off-by: K Prateek Nayak <[email protected]>
---
o v1->v2
- No functional changes since v1 for AMD processors.
- Renamed cacheinfo_amd_init_l2c_id() to
cacheinfo_topoext_init_l2c_id() and added the call to same in
Hygon's topology init path.
- Collected tags from v1.
---
arch/x86/include/asm/cacheinfo.h | 1 +
arch/x86/kernel/cpu/amd.c | 1 +
arch/x86/kernel/cpu/cacheinfo.c | 36 ++++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/hygon.c | 1 +
4 files changed, 39 insertions(+)

diff --git a/arch/x86/include/asm/cacheinfo.h b/arch/x86/include/asm/cacheinfo.h
index ce9685fc78d8..2034cd556c07 100644
--- a/arch/x86/include/asm/cacheinfo.h
+++ b/arch/x86/include/asm/cacheinfo.h
@@ -7,6 +7,7 @@ extern unsigned int memory_caching_control;
#define CACHE_MTRR 0x01
#define CACHE_PAT 0x02

+void cacheinfo_topoext_init_l2c_id(struct cpuinfo_x86 *c, int cpu);
void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu);
void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu);

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index f769d6d08b43..500401b9e645 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -358,6 +358,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
if (!err)
c->x86_coreid_bits = get_count_order(c->x86_max_cores);

+ cacheinfo_topoext_init_l2c_id(c, cpu);
cacheinfo_amd_init_llc_id(c, cpu);

} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index f4e5aa27eec6..bed7b9633451 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -659,6 +659,42 @@ static int find_num_cache_leaves(struct cpuinfo_x86 *c)
return i;
}

+void cacheinfo_topoext_init_l2c_id(struct cpuinfo_x86 *c, int cpu)
+{
+ u32 eax, ebx, ecx, edx, num_sharing_cache;
+ int i = 0, bits;
+
+ /* Check if L2 cache identifiers exists. */
+ if (!cpuid_ecx(0x80000006))
+ return;
+
+ while (true) {
+ u32 level;
+
+ cpuid_count(0x8000001d, i, &eax, &ebx, &ecx, &edx);
+ if (!eax)
+ return;
+
+ /*
+ * Check if the current leaf is for L2 cache using
+ * eax[7:5] used to describe the cache level.
+ */
+ level = (eax >> 5) & 0x7;
+ if (level == 2)
+ break;
+
+ ++i;
+ }
+
+ /*
+ * L2 ID is calculated from the number of threads
+ * sharing the L2 cache.
+ */
+ num_sharing_cache = ((eax >> 14) & 0xfff) + 1;
+ bits = get_count_order(num_sharing_cache);
+ per_cpu(cpu_l2c_id, cpu) = c->apicid >> bits;
+}
+
void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu)
{
/*
diff --git a/arch/x86/kernel/cpu/hygon.c b/arch/x86/kernel/cpu/hygon.c
index 5a2962c492d3..cb0025b4a2fd 100644
--- a/arch/x86/kernel/cpu/hygon.c
+++ b/arch/x86/kernel/cpu/hygon.c
@@ -89,6 +89,7 @@ static void hygon_get_topology(struct cpuinfo_x86 *c)
/* Socket ID is ApicId[6] for these processors. */
c->phys_proc_id = c->apicid >> APICID_SOCKET_ID_BIT;

+ cacheinfo_topoext_init_l2c_id(c, cpu);
cacheinfo_hygon_init_llc_id(c, cpu);
} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
u64 value;
--
2.34.1

2023-04-13 17:32:22

by K Prateek Nayak

[permalink] [raw]
Subject: [PATCH v2 2/2] x86/Documentation: Add documentation about cluster

x86 processors map cluster to the L2 cache. Add documentation stating
the same, and provide more information on the values and API related to
CPU clusters exposed by the kernel.

Signed-off-by: K Prateek Nayak <[email protected]>
---
o v1->v2
- Reworded the definition of cluster on x86 based on Peter's
suggestion.
- Fixed double spacing before and after the cluster section.
---
Documentation/x86/topology.rst | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/Documentation/x86/topology.rst b/Documentation/x86/topology.rst
index 7f58010ea86a..5dae8a0327d1 100644
--- a/Documentation/x86/topology.rst
+++ b/Documentation/x86/topology.rst
@@ -33,6 +33,7 @@ historical nature and should be cleaned up.
The topology of a system is described in the units of:

- packages
+ - cluster
- cores
- threads

@@ -90,6 +91,23 @@ Package-related topology information in the kernel:
Cache. In general, it is a number identifying an LLC uniquely on the
system.

+Clusters
+========
+A cluster consists of threads of one or more cores sharing the same L2 cache.
+
+Cluster-related topology information in the kernel:
+
+ - cluster_id:
+
+ A per-CPU variable containing:
+
+ - On Intel, the common upper bits of APIC ID of the list of CPUs sharing
+ the L2 Cache with lower bits set to 0.
+
+ - On AMD and Hygon, with Topology Extension, the common upper bits of the
+ Extended APIC ID of the list of CPUs sharing the L2 Cache, left shifted
+ to remove trailing 0s.
+
Cores
=====
A core consists of 1 or more threads. It does not matter whether the threads
@@ -125,6 +143,11 @@ Thread-related topology information in the kernel:

The number of online threads is also printed in /proc/cpuinfo "siblings."

+ - topology_cluster_cpumask():
+
+ The cpumask contains all online threads in the cluster to which a thread
+ belongs.
+
- topology_sibling_cpumask():

The cpumask contains all online threads in the core to which a thread
@@ -138,6 +161,10 @@ Thread-related topology information in the kernel:

The physical package ID to which a thread belongs.

+ - topology_cluster_id();
+
+ The ID of the cluster to which a thread belongs.
+
- topology_core_id();

The ID of the core to which a thread belongs. It is also printed in /proc/cpuinfo
--
2.34.1

2023-04-13 18:08:49

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] x86/Documentation: Add documentation about cluster

On 4/13/23 10:29, K Prateek Nayak wrote:
> + - cluster_id:
> +
> + A per-CPU variable containing:
> +
> + - On Intel, the common upper bits of APIC ID of the list of CPUs sharing
> + the L2 Cache with lower bits set to 0.
> +
> + - On AMD and Hygon, with Topology Extension, the common upper bits of the
> + Extended APIC ID of the list of CPUs sharing the L2 Cache, left shifted
> + to remove trailing 0s.

I think this is too much detail for Documentation. We have the code if
anyone cares _this_ much.

Also, I'm perplexed by the "left shifted" comment. I don't see a lot of
left shifting in the patch. Am I just missing it?

Further, this makes it sound like all Intel CPUs have the cluster_id
populated. I'm also not sure that folks reading this will have any
worldly idea what "Topology Extension" is.

Why don't we just say that some CPUs don't have this info? That way we
don't need to spell out AMD vs. Intel or expect our users to go figuring
out of their CPU has "Topology Extension" or leaf 3 or wherever this
info is on Intel.

How about:

A per-CPU variable containing:

- Some upper bits extracted from the APIC ID. CPUs which have the
same value in these bits share an L2 and have the same cluster_id.

CPUs for which L2 cache information is unavailable will show 65535
as the cluster_id.

2023-04-14 02:43:25

by K Prateek Nayak

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] x86/Documentation: Add documentation about cluster

Hello Dave,

Thank you for taking a look at the series.

On 4/13/2023 11:27 PM, Dave Hansen wrote:
> On 4/13/23 10:29, K Prateek Nayak wrote:
>> + - cluster_id:
>> +
>> + A per-CPU variable containing:
>> +
>> + - On Intel, the common upper bits of APIC ID of the list of CPUs sharing
>> + the L2 Cache with lower bits set to 0.
>> +
>> + - On AMD and Hygon, with Topology Extension, the common upper bits of the
>> + Extended APIC ID of the list of CPUs sharing the L2 Cache, left shifted
>> + to remove trailing 0s.
>
> I think this is too much detail for Documentation. We have the code if
> anyone cares _this_ much.

Yes, I agree. I'll reword this as you suggested.

>
> Also, I'm perplexed by the "left shifted" comment. I don't see a lot of
> left shifting in the patch. Am I just missing it?

In Patch1, cacheinfo_topoext_init_l2c_id() sets l2c_id as follows for AMD
and Hygon processors:

bits = get_count_order(num_sharing_cache);
per_cpu(cpu_l2c_id, cpu) = c->apicid >> bits;

For Intel, in init_intel_cacheinfo(), l2c_id is set as follows:

index_msb = get_count_order(num_threads_sharing);
l2_id = c->apicid & ~((1 << index_msb) - 1);
...
per_cpu(cpu_l2c_id, cpu) = l2_id;

In the former, only the upper bits that are same for all the threads in a
cluster are retained, shifting out the lower bits, whereas in the latter
the lower bits are set to 0s keeping the upper bits, common to all the
threads on the cluster, as is. Let me know if I'm missing something.

>
> Further, this makes it sound like all Intel CPUs have the cluster_id
> populated. I'm also not sure that folks reading this will have any
> worldly idea what "Topology Extension" is.

I agree, it becomes too technical.

>
> Why don't we just say that some CPUs don't have this info? That way we
> don't need to spell out AMD vs. Intel or expect our users to go figuring
> out of their CPU has "Topology Extension" or leaf 3 or wherever this
> info is on Intel.
>
> How about:
>
> A per-CPU variable containing:
>
> - Some upper bits extracted from the APIC ID. CPUs which have the
> same value in these bits share an L2 and have the same cluster_id.
>
> CPUs for which L2 cache information is unavailable will show 65535
> as the cluster_id.

I'll reword the description based on your suggestion in the next version.

--
Thanks and Regards,
Prateek

2023-04-14 03:19:51

by K Prateek Nayak

[permalink] [raw]
Subject: [PATCH v2.1 2/2] x86/Documentation: Add documentation about cluster

x86 processors map cluster to the L2 cache. Add documentation stating
the same, and provide more information on the values and API related to
CPU clusters exposed by the kernel.

Suggested-by: Dave Hansen <[email protected]> # cluster_id description
Signed-off-by: K Prateek Nayak <[email protected]>
---
o v2->v2.1
- Reword the cluster_id description based on Dave's suggestions.
o v1->v2
- Reworded the definition of cluster on x86 based on Peter's
suggestion.
- Fixed double spacing before and after the cluster section.
---
Documentation/x86/topology.rst | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)

diff --git a/Documentation/x86/topology.rst b/Documentation/x86/topology.rst
index 7f58010ea86a..9de14f3f7783 100644
--- a/Documentation/x86/topology.rst
+++ b/Documentation/x86/topology.rst
@@ -33,6 +33,7 @@ historical nature and should be cleaned up.
The topology of a system is described in the units of:

- packages
+ - cluster
- cores
- threads

@@ -90,6 +91,22 @@ Package-related topology information in the kernel:
Cache. In general, it is a number identifying an LLC uniquely on the
system.

+Clusters
+========
+A cluster consists of threads of one or more cores sharing the same L2 cache.
+
+Cluster-related topology information in the kernel:
+
+ - cluster_id:
+
+ A per-CPU variable containing:
+
+ - Upper bits extracted from the APIC ID. CPUs which have the same value
+ in these bits share an L2 and have the same cluster_id.
+
+ CPUs for which cluster information is unavailable will show 65535
+ (BAD_APICID) as the cluster_id.
+
Cores
=====
A core consists of 1 or more threads. It does not matter whether the threads
@@ -125,6 +142,11 @@ Thread-related topology information in the kernel:

The number of online threads is also printed in /proc/cpuinfo "siblings."

+ - topology_cluster_cpumask():
+
+ The cpumask contains all online threads in the cluster to which a thread
+ belongs.
+
- topology_sibling_cpumask():

The cpumask contains all online threads in the core to which a thread
@@ -138,6 +160,10 @@ Thread-related topology information in the kernel:

The physical package ID to which a thread belongs.

+ - topology_cluster_id();
+
+ The ID of the cluster to which a thread belongs.
+
- topology_core_id();

The ID of the core to which a thread belongs. It is also printed in /proc/cpuinfo
--
2.34.1

2023-04-15 02:37:00

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: [PATCH v2.1 2/2] x86/Documentation: Add documentation about cluster

On 4/14/23 10:17, K Prateek Nayak wrote:
> + - cluster_id:
> +
> + A per-CPU variable containing:
> +
> + - Upper bits extracted from the APIC ID. CPUs which have the same value
> + in these bits share an L2 and have the same cluster_id.
> +
> + CPUs for which cluster information is unavailable will show 65535
> + (BAD_APICID) as the cluster_id.

"... return cluster_id of 65535 (BAD_APICID)."

> The number of online threads is also printed in /proc/cpuinfo "siblings."
>
> + - topology_cluster_cpumask():
> +
> + The cpumask contains all online threads in the cluster to which a thread
> + belongs.
> +

Looks OK.

> The physical package ID to which a thread belongs.
>
> + - topology_cluster_id();
> +
> + The ID of the cluster to which a thread belongs.
> +

Looks OK.

Thanks.

--
An old man doll... just what I always wanted! - Clara

2023-04-17 01:17:43

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v2.1 2/2] x86/Documentation: Add documentation about cluster

On 4/14/23 19:24, Bagas Sanjaya wrote:
> On 4/14/23 10:17, K Prateek Nayak wrote:
>> + - cluster_id:
>> +
>> + A per-CPU variable containing:
>> +
>> + - Upper bits extracted from the APIC ID. CPUs which have the same value
>> + in these bits share an L2 and have the same cluster_id.
>> +
>> + CPUs for which cluster information is unavailable will show 65535
>> + (BAD_APICID) as the cluster_id.
> "... return cluster_id of 65535 (BAD_APICID)."

Bagas, this is talking about a per-cpu variable. Variables don't
"return" things, functions do.

I also have a request: I'd really appreciate if you could avoid
reviewing x86-related documentation. The review comments that I've seen
coming from you have not helped x86 documentation. They've hurt the
patches more than they have helped.

2023-04-18 02:22:16

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: [PATCH v2.1 2/2] x86/Documentation: Add documentation about cluster

On 4/17/23 07:23, Dave Hansen wrote:
> On 4/14/23 19:24, Bagas Sanjaya wrote:
>> On 4/14/23 10:17, K Prateek Nayak wrote:
>>> + - cluster_id:
>>> +
>>> + A per-CPU variable containing:
>>> +
>>> + - Upper bits extracted from the APIC ID. CPUs which have the same value
>>> + in these bits share an L2 and have the same cluster_id.
>>> +
>>> + CPUs for which cluster information is unavailable will show 65535
>>> + (BAD_APICID) as the cluster_id.
>> "... return cluster_id of 65535 (BAD_APICID)."
>
> Bagas, this is talking about a per-cpu variable. Variables don't
> "return" things, functions do.
>

Oops, I don't see that!

> I also have a request: I'd really appreciate if you could avoid
> reviewing x86-related documentation. The review comments that I've seen
> coming from you have not helped x86 documentation. They've hurt the
> patches more than they have helped.
>

OK, thanks!

--
An old man doll... just what I always wanted! - Clara

2023-05-18 02:44:44

by K Prateek Nayak

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] arch/x86: Set L2 Cache ID on AMD and Hygon processors

Gentle ping :)

On 4/13/2023 10:59 PM, K Prateek Nayak wrote:
> commit 66558b730f253 ("sched: Add cluster scheduler level for x86")
> defined cluster on x86 as the set of threads sharing the same L2 cache.
> cluster_id on x86, maps to the l2c_id which currently only Intel
> processors set.
>
> This series sets the l2c_id on AMD and Hygon processors with
> X86_FEATURE_TOPOEXT, using the extended APIC ID and the
> "Cache Properties (L2)" CPUID (0x8000001D EAX). On AMD and Hygon
> processors without X86_FEATURE_TOPOEXT, current behavior will continue.
>
> Following are the changes in value reported by
> "/sys/devices/system/cpu/cpuX/topology/cluster_id" on a 2P Milan system
> (2 x 64C/128T) where L2 is per-core and SMT sibling of CPU (X) is
> CPU ((X + 128) % 256).
>
> - tip:x86/core
>
> $ for i in {0..255}; do\
> echo -n "CPU$i cluster_id: ";\
> cat /sys/devices/system/cpu/cpu$i/topology/cluster_id;\
> done;
>
> CPU0 cluster_id: 65535
> CPU1 cluster_id: 65535
> CPU2 cluster_id: 65535
> CPU3 cluster_id: 65535
> CPU4 cluster_id: 65535
> ...
> CPU254 cluster_id: 65535
> CPU255 cluster_id: 65535
>
> - tip:x86/core + this series
>
> $ for i in {0..255}; do\
> echo -n "CPU$i cluster_id: ";\
> cat /sys/devices/system/cpu/cpu$i/topology/cluster_id;\
> done;
>
> CPU0 cluster_id: 0
> CPU1 cluster_id: 1
> CPU2 cluster_id: 2
> CPU3 cluster_id: 3
> CPU4 cluster_id: 4
> CPU5 cluster_id: 5
> CPU6 cluster_id: 6
> CPU7 cluster_id: 7
> CPU8 cluster_id: 8
> ...
> CPU126 cluster_id: 126
> CPU127 cluster_id: 127
> CPU128 cluster_id: 0
> CPU129 cluster_id: 1
> CPU130 cluster_id: 2
> CPU131 cluster_id: 3
> CPU132 cluster_id: 4
> CPU133 cluster_id: 5
> CPU134 cluster_id: 6
> CPU135 cluster_id: 7
> CPU136 cluster_id: 8
> ...
> CPU254 cluster_id: 126
> CPU255 cluster_id: 127
>
> Note: Hygon maintainer Pu Wen <[email protected]> has confirmed that the
> same strategy of setting l2c_id works for Hygon as well without
> requiring any change to the function
> (https://lore.kernel.org/lkml/20230411122348.GAZDVRVNvbbS8F5NUB@fat_crate.local/)
> Call to the same function has been added in the Hygon topology init path
> too. Pu Wen, can you please test this version too and let me know if the
> behavior is still as expected.
>
> The series also adds documentation for clusters on x86 platforms and
> applies cleanly on top of tip:x86/core at commit ce3ba2af9695
> ("x86: Suppress KMSAN reports in arch_within_stack_frames()")
>
> ---
> o v1->v2
> - Improved documentation of cluster based on Peter's suggestion.
> - Renamed cacheinfo_amd_init_l2c_id() to
> cacheinfo_topoext_init_l2c_id() and added the call to same in
> Hygon's topology init path.
> - Collected tags for Patch 1.
> ---
> K Prateek Nayak (2):
> arch/x86: Set L2 Cache ID on AMD and Hygon processors
> x86/Documentation: Add documentation about cluster
>
> Documentation/x86/topology.rst | 27 ++++++++++++++++++++++++
> arch/x86/include/asm/cacheinfo.h | 1 +
> arch/x86/kernel/cpu/amd.c | 1 +
> arch/x86/kernel/cpu/cacheinfo.c | 36 ++++++++++++++++++++++++++++++++
> arch/x86/kernel/cpu/hygon.c | 1 +
> 5 files changed, 66 insertions(+)
>

--
Thanks and Regards,
Prateek