2023-07-28 13:57:53

by Thomas Gleixner

Subject: [patch v2 00/38] x86/cpu: Rework the topology evaluation

Hi!

This is the follow up to V1:

https://lore.kernel.org/lkml/[email protected]

which addresses the review feedback and some minor fallout I observed in my
testing of the work based on top.

TLDR:

This reworks how topology information is evaluated via CPUID
in preparation for a larger topology management overhaul to address
shortcomings of the current code vs. hybrid systems and systems which make
use of the extended topology domains in leaf 0x1f. Aside from that it's an
overdue spring cleaning to get rid of accumulated layers of duct tape and
haywire.
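
For readers not familiar with leaf 0x1f, here is a rough sketch of how its
sub-leaves describe the topology domains. It is not code from the series. It
assumes the documented register layout (EAX[4:0] is the shift to the next
domain, ECX[15:8] is the domain type, type 0 ends the enumeration). The
helper names and the register values in SUBLEAVES are made up for
illustration.

```python
# Domain type codes as reported in ECX[15:8] of CPUID leaf 0xb / 0x1f.
DOMAIN_TYPES = {1: "smt", 2: "core", 3: "module", 4: "tile", 5: "die", 6: "diegrp"}


def parse_subleaves(subleaves):
    """Turn raw (eax, ebx, ecx, edx) tuples, one per sub-leaf, into
    a list of (domain_name, shift). Stops at the first invalid domain."""
    levels = []
    for eax, _ebx, ecx, _edx in subleaves:
        dom_type = (ecx >> 8) & 0xFF
        if dom_type == 0:          # type 0 marks the end of the enumeration
            break
        shift = eax & 0x1F         # bits to shift the APIC ID to reach the next domain
        levels.append((DOMAIN_TYPES.get(dom_type, f"type{dom_type}"), shift))
    return levels


def split_apic_id(apic_id, levels):
    """Cut an APIC ID into per-domain IDs using the cumulative shifts."""
    ids = {}
    prev_shift = 0
    for name, shift in levels:
        width = shift - prev_shift
        ids[name] = (apic_id >> prev_shift) & ((1 << width) - 1)
        prev_shift = shift
    ids["package"] = apic_id >> prev_shift   # everything above the last domain
    return ids


# Hypothetical register dump: SMT domain with shift 1, core domain with shift 7.
SUBLEAVES = [
    (0x1, 2, 0x0100, 0xFF7),
    (0x7, 120, 0x0201, 0xFF7),
    (0x0, 0, 0x0002, 0xFF7),
]

levels = parse_subleaves(SUBLEAVES)
print(split_apic_id(0xFF7, levels))
```

Hybrid parts and systems with module/tile/die domains simply report more
sub-leaves, which is why a table driven split like this scales better than
hardcoded per-vendor shifts.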

What changed vs. V1:

- Fixed an issue vs. the logical die/pkg management as the current
  code (ab)uses cpuinfo for persistent storage.

- Consolidated APIC ID usage on u32 and ditched the u16 limitation.

- Addressed the review feedback from Peter and Arjan.

- Added a new patch which gets rid of XENPV fiddling in the cpuinfo
  state. That needs some testing on XENPV obviously. The relevant
  patches are #22 and #37.

I did not pick up any of the Tested-by tags yet. I hope people can run it
once more. Neither did I add the Ack from Peter.

The series is based on the APIC cleanup series:

https://lore.kernel.org/lkml/[email protected]

and also available on top of that from git:

git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cpuid-v2

Thanks,

tglx
---
Documentation/arch/x86/topology.rst | 12 -
a/arch/x86/kernel/cpu/topology.c | 168 ---------------------
arch/x86/events/amd/core.c | 2
arch/x86/events/amd/uncore.c | 2
arch/x86/events/intel/uncore.c | 2
arch/x86/hyperv/hv_vtl.c | 2
arch/x86/include/asm/apic.h | 32 +---
arch/x86/include/asm/cacheinfo.h | 3
arch/x86/include/asm/cpuid.h | 32 ++++
arch/x86/include/asm/mpspec.h | 2
arch/x86/include/asm/processor.h | 60 +++++--
arch/x86/include/asm/smp.h | 4
arch/x86/include/asm/topology.h | 51 +++++-
arch/x86/include/asm/x86_init.h | 2
arch/x86/kernel/acpi/boot.c | 4
arch/x86/kernel/amd_nb.c | 8 -
arch/x86/kernel/apic/apic.c | 14 -
arch/x86/kernel/apic/apic_common.c | 4
arch/x86/kernel/apic/apic_flat_64.c | 13 -
arch/x86/kernel/apic/apic_noop.c | 9 -
arch/x86/kernel/apic/apic_numachip.c | 21 --
arch/x86/kernel/apic/bigsmp_32.c | 10 -
arch/x86/kernel/apic/local.h | 6
arch/x86/kernel/apic/probe_32.c | 10 -
arch/x86/kernel/apic/x2apic_cluster.c | 1
arch/x86/kernel/apic/x2apic_phys.c | 10 -
arch/x86/kernel/apic/x2apic_uv_x.c | 67 +-------
arch/x86/kernel/cpu/Makefile | 5
arch/x86/kernel/cpu/amd.c | 156 --------------------
arch/x86/kernel/cpu/cacheinfo.c | 51 ++----
arch/x86/kernel/cpu/centaur.c | 4
arch/x86/kernel/cpu/common.c | 111 +-------------
arch/x86/kernel/cpu/cpu.h | 14 +
arch/x86/kernel/cpu/hygon.c | 133 -----------------
arch/x86/kernel/cpu/intel.c | 38 ----
arch/x86/kernel/cpu/mce/amd.c | 4
arch/x86/kernel/cpu/mce/apei.c | 4
arch/x86/kernel/cpu/mce/core.c | 4
arch/x86/kernel/cpu/mce/inject.c | 7
arch/x86/kernel/cpu/proc.c | 8 -
arch/x86/kernel/cpu/zhaoxin.c | 18 --
arch/x86/kernel/kvm.c | 6
arch/x86/kernel/sev.c | 2
arch/x86/kernel/smpboot.c | 97 +++++++-----
arch/x86/kernel/vsmp_64.c | 13 -
arch/x86/mm/amdtopology.c | 35 ++--
arch/x86/mm/numa.c | 4
arch/x86/xen/apic.c | 14 -
arch/x86/xen/smp_pv.c | 3
b/arch/x86/kernel/cpu/debugfs.c | 97 ++++++++++++
b/arch/x86/kernel/cpu/topology.h | 51 ++++++
b/arch/x86/kernel/cpu/topology_amd.c | 179 +++++++++++++++++++++++
b/arch/x86/kernel/cpu/topology_common.c | 233 ++++++++++++++++++++++++++++++
b/arch/x86/kernel/cpu/topology_ext.c | 136 +++++++++++++++++
drivers/edac/amd64_edac.c | 4
drivers/edac/mce_amd.c | 4
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2
drivers/hwmon/fam15h_power.c | 7
drivers/scsi/lpfc/lpfc_init.c | 8 -
drivers/virt/acrn/hsm.c | 2
60 files changed, 1049 insertions(+), 956 deletions(-)



2023-07-28 16:35:56

by Dimitri Sivanich

Subject: Re: [patch v2 00/38] x86/cpu: Rework the topology evaluation

I successfully booted the same 32-socket, 3840-CPU HPE Sapphire Rapids system
with the V2 update found here:
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cpuid-v2

# cat /sys/kernel/debug/x86/topo/cpus/3839
initial_apicid: ff7
apicid: ff7
pkg_id: 31
die_id: 31
cu_id: 255
core_id: 59
logical_pkg_id: 31
logical_die_id: 31
llc_id: 3968
l2c_id: 4086
amd_node_id: 0
amd_nodes_per_pkg: 0
max_cores: 60
max_die_per_pkg: 1
smp_num_siblings: 2
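
A quick way to sanity check a dump like the one above is to parse it and
recompute core_id and pkg_id from the APIC ID. The sketch below is only an
illustration. It assumes the two APIC ID fields are printed in hex and the
rest in decimal, and it assumes the SMT and core fields are exactly as wide
as smp_num_siblings and max_cores require, which happens to fit this dump
but is not guaranteed on other hardware. The function names are made up.

```python
SAMPLE = """\
initial_apicid: ff7
apicid: ff7
pkg_id: 31
die_id: 31
cu_id: 255
core_id: 59
logical_pkg_id: 31
logical_die_id: 31
llc_id: 3968
l2c_id: 4086
amd_node_id: 0
amd_nodes_per_pkg: 0
max_cores: 60
max_die_per_pkg: 1
smp_num_siblings: 2
"""

# Assumption: only the APIC ID fields are hexadecimal in this dump.
HEX_FIELDS = {"initial_apicid", "apicid"}


def parse_topo_dump(text):
    """Parse 'key: value' lines into a dict of integers."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        info[key] = int(value.strip(), 16 if key in HEX_FIELDS else 10)
    return info


def field_bits(count):
    """Smallest number of bits that can hold 'count' distinct IDs."""
    return (count - 1).bit_length()


def derive_ids(info):
    """Recompute core_id and pkg_id from the APIC ID (assumed field widths)."""
    smt_bits = field_bits(info["smp_num_siblings"])
    core_bits = field_bits(info["max_cores"])
    apicid = info["apicid"]
    return {
        "core_id": (apicid >> smt_bits) & ((1 << core_bits) - 1),
        "pkg_id": apicid >> (smt_bits + core_bits),
    }


info = parse_topo_dump(SAMPLE)
derived = derive_ids(info)
print(derived, derived["core_id"] == info["core_id"], derived["pkg_id"] == info["pkg_id"])
```

With a 1-bit SMT field and a 6-bit core field, APIC ID 0xff7 splits into
core 59 and package 31, which matches the values reported above.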


2023-07-28 20:24:50

by Sohil Mehta

Subject: Re: [patch v2 00/38] x86/cpu: Rework the topology evaluation

On 7/28/2023 5:12 AM, Thomas Gleixner wrote:
> I did not pick up any of the tested by tags yet. I hope people can run it
> once more. Neither did I add the Ack from Peter.
>
> The series is based on the APIC cleanup series:
>
> https://lore.kernel.org/lkml/[email protected]
>
> and also available on top of that from git:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cpuid-v2
>

The series boots fine on a 2S Sandy Bridge system. I didn't see any
issues with CPU hotplug either.
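
For anyone who wants to repeat a basic hotplug check, the following is a
minimal sketch that cycles one CPU through the standard sysfs "online" file.
It is not necessarily how the test above was run. The helper names are made
up, it needs root on a real system, and the boot CPU often has no "online"
file at all.

```python
from pathlib import Path

SYSFS_CPU_ROOT = "/sys/devices/system/cpu"


def set_cpu_online(cpu, online, sysfs_root=SYSFS_CPU_ROOT):
    """Write 1 or 0 to cpuN/online to bring the CPU up or take it down."""
    path = Path(sysfs_root) / f"cpu{cpu}" / "online"
    path.write_text("1" if online else "0")


def is_cpu_online(cpu, sysfs_root=SYSFS_CPU_ROOT):
    """Read back the current state from cpuN/online."""
    path = Path(sysfs_root) / f"cpu{cpu}" / "online"
    return path.read_text().strip() == "1"


def cycle_cpu(cpu, sysfs_root=SYSFS_CPU_ROOT):
    """Offline then online one CPU; True if both transitions were observed."""
    set_cpu_online(cpu, False, sysfs_root)
    went_offline = not is_cpu_online(cpu, sysfs_root)
    set_cpu_online(cpu, True, sysfs_root)
    return went_offline and is_cpu_online(cpu, sysfs_root)
```

Calling cycle_cpu(1) as root and then checking dmesg and the debugfs topology
files for that CPU is a cheap way to see whether the topology information
survives an offline/online cycle.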

Tested-by: Sohil Mehta <[email protected]>