2021-02-16 20:01:59

by Alison Schofield

Subject: [PATCH v2] x86,sched: Update the Intel SNC CPU list that allows shared LLCs

Commit 1340ccfa9a9a ("x86,sched: Allow topologies where NUMA nodes
share an LLC") added a vendor and model specific check to never
call topology_sane() for systems where NUMA nodes share an LLC.

Intel's Ice Lake and Sapphire Rapids CPUs exhibit this same topology.
They enumerate an LLC that is shared by multiple NUMA nodes. The
LLC on these CPUs is shared for off-package data access but private
to the NUMA node for on-package access. Since CPUID can only
enumerate the cache as shared or unshared, add these CPUs to the
list of allowable topologies (snc_cpu[]).

In SNC mode, Ice Lake and Sapphire Rapids servers will no longer emit
this warning:

sched: CPU #3's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.

Acked-by: Dave Hansen <[email protected]>
Signed-off-by: Alison Schofield <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Peter Zijlstra (Intel) <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Igor Mammedov <[email protected]>
Cc: Prarit Bhargava <[email protected]>
Cc: [email protected]

---
Changes v1->v2:

- Implemented the minimal required change of adding the new models to
the existing vendor and model specific check.

- Side effect of going minimalist: no longer labelled an X86_BUG (TonyL)

- Considered PeterZ's suggestion of checking for COD CPUs rather than
SNC CPUs. That would make this snc_cpu list go away, so it would never
need updating. But it ups the stakes for this patch in terms of lines
changed and testing needed, which drove me back to this simplest solution.

- Considered DaveH's suggestion to remove the check altogether and recognize
these topologies as sane. Not running with that further here. This patch
is what is needed now. The broader discussion of sane topologies can
carry on independent of this update.

- Updated commit message and log.

arch/x86/kernel/smpboot.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 02813a7f3a7c..de8c598dc3b9 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -475,6 +475,8 @@ static bool match_smt(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)

 static const struct x86_cpu_id snc_cpu[] = {
 	X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, NULL),
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, NULL),
+	X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, NULL),
 	{}
 };

--
2.20.1
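
For readers following along, here is a rough userspace sketch of the decision the commit message describes: CPUs that report the same LLC but sit on different NUMA nodes are quietly treated as not sharing the LLC when the model is on the SNC list, and trigger the warning otherwise. This is not the kernel code. The struct, the enum, and the names snc_models, is_snc_model() and llc_match() are hypothetical, and the model numbers are assumed to mirror the kernel's INTEL_FAM6_* definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed to mirror INTEL_FAM6_SKYLAKE_X, _ICELAKE_X, _SAPPHIRERAPIDS_X. */
enum {
	MODEL_SKYLAKE_X        = 0x55,
	MODEL_ICELAKE_X        = 0x6A,
	MODEL_SAPPHIRERAPIDS_X = 0x8F,
};

/* Hypothetical, simplified view of one logical CPU. */
struct cpu {
	int model;	/* CPU model number */
	int node;	/* NUMA node the CPU belongs to */
	int llc_id;	/* last-level cache id as enumerated by CPUID */
};

/* Possible outcomes of comparing two CPUs' LLC relationship. */
enum llc_result {
	LLC_SHARED,		/* same LLC, same node */
	LLC_NOT_SHARED_QUIET,	/* not shared, no warning emitted */
	LLC_NOT_SHARED_WARN,	/* not shared, "not on the same node" warning */
};

/* Stand-in for snc_cpu[] with the two models this patch adds. */
static const int snc_models[] = {
	MODEL_SKYLAKE_X,
	MODEL_ICELAKE_X,
	MODEL_SAPPHIRERAPIDS_X,
};

/* Return true if the model is on the SNC allow list. */
static bool is_snc_model(int model)
{
	for (size_t i = 0; i < sizeof(snc_models) / sizeof(snc_models[0]); i++)
		if (snc_models[i] == model)
			return true;
	return false;
}

/* Sketch of the LLC matching decision for CPU 'c' against CPU 'o'. */
static enum llc_result llc_match(const struct cpu *c, const struct cpu *o)
{
	/* Different LLC ids: nothing to reconcile. */
	if (c->llc_id != o->llc_id)
		return LLC_NOT_SHARED_QUIET;

	/* Same LLC and same node: the ordinary case. */
	if (c->node == o->node)
		return LLC_SHARED;

	/* Same LLC across nodes: allowed silently on listed SNC models. */
	if (is_snc_model(c->model))
		return LLC_NOT_SHARED_QUIET;

	/* Otherwise this is the case that produces the warning. */
	return LLC_NOT_SHARED_WARN;
}
```

With this sketch, two Ice Lake CPUs with the same llc_id on nodes 0 and 1 yield LLC_NOT_SHARED_QUIET, while the same pair with a model missing from the list yields LLC_NOT_SHARED_WARN, which corresponds to the boot message quoted in the commit log.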


2021-02-25 01:13:37

by Dave Hansen

Subject: Re: [PATCH v2] x86,sched: Update the Intel SNC CPU list that allows shared LLCs

On 2/16/21 11:58 AM, Alison Schofield wrote:
> arch/x86/kernel/smpboot.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 02813a7f3a7c..de8c598dc3b9 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -475,6 +475,8 @@ static bool match_smt(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
>
>  static const struct x86_cpu_id snc_cpu[] = {
>  	X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, NULL),
> +	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, NULL),
> +	X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, NULL),
>  	{}
>  };

Oh, and if this version gets picked up (or we go to a v3), it probably
also needs a:

Cc: [email protected]

This does cause scary warnings, and it would be nice to suppress those
on stable kernels too.