2023-08-02 10:43:24

by Thomas Gleixner

Subject: [patch V3 00/40] x86/cpu: Rework the topology evaluation

Hi!

This is the follow up to V2:

https://lore.kernel.org/lkml/[email protected]

which addresses the review feedback and some fallout reported on and
off-list.

TLDR:

This reworks the way topology information is evaluated via CPUID
in preparation for a larger topology management overhaul to address
shortcomings of the current code vs. hybrid systems and systems which make
use of the extended topology domains in leaf 0x1f. Aside from that, it's an
overdue spring cleaning to get rid of accumulated layers of duct tape and
haywire.
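
For readers unfamiliar with the extended topology leaves, the following is a
minimal user-space sketch of how one sub-leaf of CPUID leaf 0xb/0x1f decodes
into a topology domain. It is not code from this series. The struct and
function names (cpuid_regs, topo_level, topo_parse_subleaf, topo_id_above)
are hypothetical and chosen for illustration only; the bit layout follows
the documented CPUID register format (EAX[4:0] shift, EBX[15:0] logical CPU
count, ECX[15:8] domain type, EDX x2APIC ID).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Domain types as encoded in ECX[15:8]; types 3..6 only exist in leaf 0x1f. */
enum topo_domain_type {
	TOPO_INVALID = 0,
	TOPO_SMT     = 1,
	TOPO_CORE    = 2,
	TOPO_MODULE  = 3,
	TOPO_TILE    = 4,
	TOPO_DIE     = 5,
	TOPO_DIEGRP  = 6,
};

/* Raw register values returned for one sub-leaf (hypothetical helper type). */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* Decoded view of one sub-leaf (hypothetical helper type). */
struct topo_level {
	unsigned int type;	/* ECX[15:8]: domain type */
	unsigned int shift;	/* EAX[4:0]: APIC ID bits to strip for the next domain */
	unsigned int num_lcpus;	/* EBX[15:0]: logical CPUs at this domain level */
	uint32_t x2apic_id;	/* EDX: x2APIC ID of the executing CPU */
};

/*
 * Decode one sub-leaf. Returns false when the domain type is invalid,
 * which marks the end of the sub-leaf enumeration.
 */
static bool topo_parse_subleaf(const struct cpuid_regs *r, struct topo_level *out)
{
	assert(r && out);

	out->type      = (r->ecx >> 8) & 0xff;
	out->shift     = r->eax & 0x1f;
	out->num_lcpus = r->ebx & 0xffff;
	out->x2apic_id = r->edx;

	return out->type != TOPO_INVALID;
}

/* ID of the next higher domain: drop the APIC ID bits below @shift. */
static uint32_t topo_id_above(uint32_t apic_id, unsigned int shift)
{
	return apic_id >> shift;
}
```

A caller would feed in the registers of sub-leaf 0, 1, 2, ... until
topo_parse_subleaf() returns false, and use the shift of each valid level to
split the x2APIC ID into per-domain IDs.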

What changed vs. V2:

- Decoded and fixed the fallout vs. XEN/PV reported by Juergen. Thanks to
Juergen for the remote hand debugging sessions!

That's addressed in the first two new patches in this series. Summary:
XEN/PV booted by pure chance since the addition of SMT control 5 years
ago.

- Fixed the off-by-one in the AMD parser which was debugged by Michael

- Addressed review comments from various people

As discussed in:

https://lore.kernel.org/lkml/BYAPR21MB16889FD224344B1B28BE22A1D705A@BYAPR21MB1688.namprd21.prod.outlook.com
....
https://lore.kernel.org/lkml/87r0omjt8c.ffs@tglx

this series unfortunately brings the Hyper-V BIOS inconsistency into
effect, which results in a slight performance impact. The L3 association
which "worked" so far by exploiting the inconsistency of the Linux topology
code is no longer supportable as we really need to get the actual
shortcomings of our topology management addressed in a consistent way.

The series is based on V3 of the APIC cleanup series:

https://lore.kernel.org/lkml/[email protected]

and also available on top of that from git:

git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cpuid-v3

Thanks,

tglx
---
arch/x86/kernel/cpu/topology.c | 168 -------------------
b/Documentation/arch/x86/topology.rst | 12 -
b/arch/x86/events/amd/core.c | 2
b/arch/x86/events/amd/uncore.c | 2
b/arch/x86/events/intel/uncore.c | 2
b/arch/x86/hyperv/hv_vtl.c | 2
b/arch/x86/include/asm/apic.h | 32 +--
b/arch/x86/include/asm/cacheinfo.h | 3
b/arch/x86/include/asm/cpuid.h | 36 ++++
b/arch/x86/include/asm/mpspec.h | 2
b/arch/x86/include/asm/processor.h | 60 ++++---
b/arch/x86/include/asm/smp.h | 4
b/arch/x86/include/asm/topology.h | 51 +++++
b/arch/x86/include/asm/x86_init.h | 2
b/arch/x86/kernel/acpi/boot.c | 4
b/arch/x86/kernel/amd_nb.c | 8
b/arch/x86/kernel/apic/apic.c | 25 +-
b/arch/x86/kernel/apic/apic_common.c | 4
b/arch/x86/kernel/apic/apic_flat_64.c | 13 -
b/arch/x86/kernel/apic/apic_noop.c | 9 -
b/arch/x86/kernel/apic/apic_numachip.c | 21 --
b/arch/x86/kernel/apic/bigsmp_32.c | 10 -
b/arch/x86/kernel/apic/local.h | 6
b/arch/x86/kernel/apic/probe_32.c | 10 -
b/arch/x86/kernel/apic/x2apic_cluster.c | 1
b/arch/x86/kernel/apic/x2apic_phys.c | 10 -
b/arch/x86/kernel/apic/x2apic_uv_x.c | 67 +------
b/arch/x86/kernel/cpu/Makefile | 5
b/arch/x86/kernel/cpu/amd.c | 156 ------------------
b/arch/x86/kernel/cpu/cacheinfo.c | 51 ++---
b/arch/x86/kernel/cpu/centaur.c | 4
b/arch/x86/kernel/cpu/common.c | 111 +-----------
b/arch/x86/kernel/cpu/cpu.h | 14 +
b/arch/x86/kernel/cpu/debugfs.c | 97 +++++++++++
b/arch/x86/kernel/cpu/hygon.c | 133 ---------------
b/arch/x86/kernel/cpu/intel.c | 38 ----
b/arch/x86/kernel/cpu/mce/amd.c | 4
b/arch/x86/kernel/cpu/mce/apei.c | 4
b/arch/x86/kernel/cpu/mce/core.c | 4
b/arch/x86/kernel/cpu/mce/inject.c | 7
b/arch/x86/kernel/cpu/proc.c | 8
b/arch/x86/kernel/cpu/topology.h | 51 +++++
b/arch/x86/kernel/cpu/topology_amd.c | 179 ++++++++++++++++++++
b/arch/x86/kernel/cpu/topology_common.c | 240 ++++++++++++++++++++++++++++
b/arch/x86/kernel/cpu/topology_ext.c | 136 +++++++++++++++
b/arch/x86/kernel/cpu/zhaoxin.c | 18 --
b/arch/x86/kernel/kvm.c | 6
b/arch/x86/kernel/sev.c | 2
b/arch/x86/kernel/smpboot.c | 97 ++++++-----
b/arch/x86/kernel/vsmp_64.c | 13 -
b/arch/x86/mm/amdtopology.c | 35 +---
b/arch/x86/mm/numa.c | 4
b/arch/x86/xen/apic.c | 14 -
b/arch/x86/xen/smp_pv.c | 3
b/drivers/edac/amd64_edac.c | 4
b/drivers/edac/mce_amd.c | 4
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2
b/drivers/hwmon/fam15h_power.c | 7
b/drivers/scsi/lpfc/lpfc_init.c | 8
b/drivers/virt/acrn/hsm.c | 2
b/kernel/cpu.c | 6
61 files changed, 1077 insertions(+), 956 deletions(-)




2023-08-02 12:53:07

by Juergen Gross

Subject: Re: [patch V3 00/40] x86/cpu: Rework the topology evaluation

On 02.08.23 12:20, Thomas Gleixner wrote:
> Hi!
>
> This is the follow up to V2:
>
> https://lore.kernel.org/lkml/[email protected]
>
> which addresses the review feedback and some fallout reported on and
> off-list.
>
> TLDR:
>
> This reworks the way how topology information is evaluated via CPUID
> in preparation for a larger topology management overhaul to address
> shortcomings of the current code vs. hybrid systems and systems which make
> use of the extended topology domains in leaf 0x1f. Aside of that it's an
> overdue spring cleaning to get rid of accumulated layers of duct tape and
> haywire.
>
> What changed vs. V2:
>
> - Decoded and fixed the fallout vs. XEN/PV reported by Juergen. Thanks to
> Juergen for the remote hand debugging sessions!
>
> That's addressed in the first two new patches in this series. Summary:
> XEN/PV booted by pure chance since the addition of SMT control 5 years
> ago.
>
> - Fixed the off by one in the AMD parser which was debugged by Michael
>
> - Addressed review comments from various people
>
> As discussed in:
>
> https://lore.kernel.org/lkml/BYAPR21MB16889FD224344B1B28BE22A1D705A@BYAPR21MB1688.namprd21.prod.outlook.com
> ....
> https://lore.kernel.org/lkml/87r0omjt8c.ffs@tglx
>
> this series unfortunately brings the Hyper-V BIOS inconsistency into
> effect, which results in a slight performance impact. The L3 association
> which "worked" so far by exploiting the inconsistency of the Linux topology
> code is not longer supportable as we really need to get the actual short
> comings of our topology management addressed in a consistent way.
>
> The series is based on V3 of the APIC cleanup series:
>
> https://lore.kernel.org/lkml/[email protected]
>
> and also available on top of that from git:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cpuid-v3
>
> Thanks,
>
> tglx

For Xen PV (dom0 and unprivileged guest):

Tested-by: Juergen Gross <[email protected]>


Juergen



2023-08-03 00:15:13

by Sohil Mehta

Subject: Re: [patch V3 00/40] x86/cpu: Rework the topology evaluation

On 8/2/2023 3:20 AM, Thomas Gleixner wrote:
> The series is based on V3 of the APIC cleanup series:
>
> https://lore.kernel.org/lkml/[email protected]
>
> and also available on top of that from git:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cpuid-v3
>

Tested-by: Sohil Mehta <[email protected]>

2023-08-03 04:28:12

by Michael Kelley (LINUX)

Subject: RE: [patch V3 00/40] x86/cpu: Rework the topology evaluation

From: Thomas Gleixner <[email protected]> Sent: Wednesday, August 2, 2023 3:21 AM

>
> Hi!
>
> This is the follow up to V2:
>
>
> https://lore.kernel.org/lkml/[email protected]/
>
> which addresses the review feedback and some fallout reported on and
> off-list.
>
> TLDR:
>
> This reworks the way how topology information is evaluated via CPUID
> in preparation for a larger topology management overhaul to address
> shortcomings of the current code vs. hybrid systems and systems which make
> use of the extended topology domains in leaf 0x1f. Aside of that it's an
> overdue spring cleaning to get rid of accumulated layers of duct tape and
> haywire.
>
> What changed vs. V2:
>
> - Decoded and fixed the fallout vs. XEN/PV reported by Juergen. Thanks to
> Juergen for the remote hand debugging sessions!
>
> That's addressed in the first two new patches in this series. Summary:
> XEN/PV booted by pure chance since the addition of SMT control 5 years
> ago.
>
> - Fixed the off by one in the AMD parser which was debugged by Michael
>
> - Addressed review comments from various people
>
> As discussed in:
>
>
> https://lore.kernel.org/lkml/BYAPR21MB16889FD224344B1B28BE22A1D705A@BYAPR21MB1688.namprd21.prod.outlook.com/
> ....
>
> https://lore.kernel.org/lkml/87r0omjt8c.ffs@tglx/
>
> this series unfortunately brings the Hyper-V BIOS inconsistency into
> effect, which results in a slight performance impact. The L3 association
> which "worked" so far by exploiting the inconsistency of the Linux topology
> code is not longer supportable as we really need to get the actual short
> comings of our topology management addressed in a consistent way.
>
> The series is based on V3 of the APIC cleanup series:
>
>
> https://lore.kernel.org/lkml/[email protected]/
>
> and also available on top of that from git:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cpuid-v3
>
> Thanks,
>
> tglx
> ---
> arch/x86/kernel/cpu/topology.c | 168 -------------------
> b/Documentation/arch/x86/topology.rst | 12 -
> b/arch/x86/events/amd/core.c | 2
> b/arch/x86/events/amd/uncore.c | 2
> b/arch/x86/events/intel/uncore.c | 2
> b/arch/x86/hyperv/hv_vtl.c | 2
> b/arch/x86/include/asm/apic.h | 32 +--
> b/arch/x86/include/asm/cacheinfo.h | 3
> b/arch/x86/include/asm/cpuid.h | 36 ++++
> b/arch/x86/include/asm/mpspec.h | 2
> b/arch/x86/include/asm/processor.h | 60 ++++---
> b/arch/x86/include/asm/smp.h | 4
> b/arch/x86/include/asm/topology.h | 51 +++++
> b/arch/x86/include/asm/x86_init.h | 2
> b/arch/x86/kernel/acpi/boot.c | 4
> b/arch/x86/kernel/amd_nb.c | 8
> b/arch/x86/kernel/apic/apic.c | 25 +-
> b/arch/x86/kernel/apic/apic_common.c | 4
> b/arch/x86/kernel/apic/apic_flat_64.c | 13 -
> b/arch/x86/kernel/apic/apic_noop.c | 9 -
> b/arch/x86/kernel/apic/apic_numachip.c | 21 --
> b/arch/x86/kernel/apic/bigsmp_32.c | 10 -
> b/arch/x86/kernel/apic/local.h | 6
> b/arch/x86/kernel/apic/probe_32.c | 10 -
> b/arch/x86/kernel/apic/x2apic_cluster.c | 1
> b/arch/x86/kernel/apic/x2apic_phys.c | 10 -
> b/arch/x86/kernel/apic/x2apic_uv_x.c | 67 +------
> b/arch/x86/kernel/cpu/Makefile | 5
> b/arch/x86/kernel/cpu/amd.c | 156 ------------------
> b/arch/x86/kernel/cpu/cacheinfo.c | 51 ++---
> b/arch/x86/kernel/cpu/centaur.c | 4
> b/arch/x86/kernel/cpu/common.c | 111 +-----------
> b/arch/x86/kernel/cpu/cpu.h | 14 +
> b/arch/x86/kernel/cpu/debugfs.c | 97 +++++++++++
> b/arch/x86/kernel/cpu/hygon.c | 133 ---------------
> b/arch/x86/kernel/cpu/intel.c | 38 ----
> b/arch/x86/kernel/cpu/mce/amd.c | 4
> b/arch/x86/kernel/cpu/mce/apei.c | 4
> b/arch/x86/kernel/cpu/mce/core.c | 4
> b/arch/x86/kernel/cpu/mce/inject.c | 7
> b/arch/x86/kernel/cpu/proc.c | 8
> b/arch/x86/kernel/cpu/topology.h | 51 +++++
> b/arch/x86/kernel/cpu/topology_amd.c | 179 ++++++++++++++++++++
> b/arch/x86/kernel/cpu/topology_common.c | 240 ++++++++++++++++++++++++++++
> b/arch/x86/kernel/cpu/topology_ext.c | 136 +++++++++++++++
> b/arch/x86/kernel/cpu/zhaoxin.c | 18 --
> b/arch/x86/kernel/kvm.c | 6
> b/arch/x86/kernel/sev.c | 2
> b/arch/x86/kernel/smpboot.c | 97 ++++++-----
> b/arch/x86/kernel/vsmp_64.c | 13 -
> b/arch/x86/mm/amdtopology.c | 35 +---
> b/arch/x86/mm/numa.c | 4
> b/arch/x86/xen/apic.c | 14 -
> b/arch/x86/xen/smp_pv.c | 3
> b/drivers/edac/amd64_edac.c | 4
> b/drivers/edac/mce_amd.c | 4
> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2
> b/drivers/hwmon/fam15h_power.c | 7
> b/drivers/scsi/lpfc/lpfc_init.c | 8
> b/drivers/virt/acrn/hsm.c | 2
> b/kernel/cpu.c | 6
> 61 files changed, 1077 insertions(+), 956 deletions(-)
>

Tested a variety of Hyper-V guest sizes running on Intel and AMD
processors of various generations. Tested guests configured with
hyper-threading and with no hyper-threading, single NUMA node
and multi-NUMA node, etc. Also tested a hyper-threaded VM with
the 'nosmt' option.

All topologies look good, modulo the identified Hyper-V issue
with mismatched APIC IDs that must be fixed by Hyper-V.

Tested-by: Michael Kelley <[email protected]>