2022-07-09 15:57:58

by Conor Dooley

[permalink] [raw]
Subject: [PATCH v3 0/2] Fix RISC-V's arch-topology reporting

From: Conor Dooley <[email protected]>

Hey all,
It's my first time messing around with arch/ code at all, let alone
more than one arch, so forgive me if I have screwed up how to do a
migration like this.

The goal here is the fix the incorrectly reported arch topology on
RISC-V which seems to have been broken since it was added.
cpu, package and thread IDs are all currently reported as -1, so tools
like lstopo think systems have multiple threads on the same core when
this is not true:
https://github.com/open-mpi/hwloc/issues/536

arm64's topology code basically applies to RISC-V too, so it has been
made generic along with the removal of MPIDR related code, which
appears to be redudant code since '3102bc0e6ac7 ("arm64: topology: Stop
using MPIDR for topology information")' replaced the code that actually
interacted with MPIDR with default values.

I only built tested for arm{,64} , so hopefully it is not broken when
used. Testing on both arm64 & !SMP RISC-V would really be appreciated!

For V2, I dropped the idea of doing a RISC-V specific implementation
followed by a move to the generic code & just went for the more straight
forward method of moving to the shared version first. I also dropped the
RFC.

V3 moves store_cpu_topology() down inside the arch check alongside the
init function so that boot on 32bit arm is not broken.

Thanks,
Conor

Conor Dooley (2):
arm64: topology: move store_cpu_topology() to shared code
riscv: topology: fix default topology reporting

arch/arm64/kernel/topology.c | 40 ------------------------------------
arch/riscv/Kconfig | 2 +-
arch/riscv/kernel/smpboot.c | 4 +++-
drivers/base/arch_topology.c | 19 +++++++++++++++++
4 files changed, 23 insertions(+), 42 deletions(-)


base-commit: b6f1f2fa2bddd69ff46a190b8120bd440fd50563
--
2.37.0


2022-07-09 16:00:04

by Conor Dooley

[permalink] [raw]
Subject: [PATCH v3 1/2] arm64: topology: move store_cpu_topology() to shared code

From: Conor Dooley <[email protected]>

arm64's method of defining a default cpu topology requires only minimal
changes to apply to RISC-V also. The current arm64 implementation exits
early in a uniprocessor configuration by reading MPIDR & claiming that
uniprocessor can rely on the default values.

This is appears to be a hangover from prior to '3102bc0e6ac7 ("arm64:
topology: Stop using MPIDR for topology information")', because the
current code just assigns default values for multiprocessor systems.

With the MPIDR references removed, store_cpu_topolgy() can be moved to
the common arch_topology code.

CC: [email protected]
Signed-off-by: Conor Dooley <[email protected]>
---
arch/arm64/kernel/topology.c | 40 ------------------------------------
drivers/base/arch_topology.c | 19 +++++++++++++++++
2 files changed, 19 insertions(+), 40 deletions(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 869ffc4d4484..7889a00f5487 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -22,46 +22,6 @@
#include <asm/cputype.h>
#include <asm/topology.h>

-void store_cpu_topology(unsigned int cpuid)
-{
- struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
- u64 mpidr;
-
- if (cpuid_topo->package_id != -1)
- goto topology_populated;
-
- mpidr = read_cpuid_mpidr();
-
- /* Uniprocessor systems can rely on default topology values */
- if (mpidr & MPIDR_UP_BITMASK)
- return;
-
- /*
- * This would be the place to create cpu topology based on MPIDR.
- *
- * However, it cannot be trusted to depict the actual topology; some
- * pieces of the architecture enforce an artificial cap on Aff0 values
- * (e.g. GICv3's ICC_SGI1R_EL1 limits it to 15), leading to an
- * artificial cycling of Aff1, Aff2 and Aff3 values. IOW, these end up
- * having absolutely no relationship to the actual underlying system
- * topology, and cannot be reasonably used as core / package ID.
- *
- * If the MT bit is set, Aff0 *could* be used to define a thread ID, but
- * we still wouldn't be able to obtain a sane core ID. This means we
- * need to entirely ignore MPIDR for any topology deduction.
- */
- cpuid_topo->thread_id = -1;
- cpuid_topo->core_id = cpuid;
- cpuid_topo->package_id = cpu_to_node(cpuid);
-
- pr_debug("CPU%u: cluster %d core %d thread %d mpidr %#016llx\n",
- cpuid, cpuid_topo->package_id, cpuid_topo->core_id,
- cpuid_topo->thread_id, mpidr);
-
-topology_populated:
- update_siblings_masks(cpuid);
-}
-
#ifdef CONFIG_ACPI
static bool __init acpi_cpu_is_threaded(int cpu)
{
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 441e14ac33a4..b7633bacbd31 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -845,4 +845,23 @@ void __init init_cpu_topology(void)
}
}
}
+
+void store_cpu_topology(unsigned int cpuid)
+{
+ struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
+
+ if (cpuid_topo->package_id != -1)
+ goto topology_populated;
+
+ cpuid_topo->thread_id = -1;
+ cpuid_topo->core_id = cpuid;
+ cpuid_topo->package_id = cpu_to_node(cpuid);
+
+ pr_debug("CPU%u: package %d core %d thread %d\n",
+ cpuid, cpuid_topo->package_id, cpuid_topo->core_id,
+ cpuid_topo->thread_id);
+
+topology_populated:
+ update_siblings_masks(cpuid);
+}
#endif
--
2.37.0

2022-07-11 15:00:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] arm64: topology: move store_cpu_topology() to shared code

On Mon, Jul 11, 2022 at 03:35:42PM +0100, Sudeep Holla wrote:
> On Sat, Jul 09, 2022 at 04:23:54PM +0100, Conor Dooley wrote:
> > From: Conor Dooley <[email protected]>
> >
> > arm64's method of defining a default cpu topology requires only minimal
> > changes to apply to RISC-V also. The current arm64 implementation exits
> > early in a uniprocessor configuration by reading MPIDR & claiming that
> > uniprocessor can rely on the default values.
> >
> > This is appears to be a hangover from prior to '3102bc0e6ac7 ("arm64:
> > topology: Stop using MPIDR for topology information")', because the
> > current code just assigns default values for multiprocessor systems.
> >
> > With the MPIDR references removed, store_cpu_topolgy() can be moved to
> > the common arch_topology code.
> >
>
> Looks good. FWIW,
>
> Reviewed-by: Sudeep Holla <[email protected]>
>
> > CC: [email protected]
>
> However, while I understand the reason why this is needed in stable trees
> for RISC-V, I am not sure if we want this for stable tree at-least on arm64.
> I leave that part to Greg and Will.

Why would it be good for one arch but bad for another?

2022-07-11 15:24:57

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] arm64: topology: move store_cpu_topology() to shared code

On Sat, Jul 09, 2022 at 04:23:54PM +0100, Conor Dooley wrote:
> From: Conor Dooley <[email protected]>
>
> arm64's method of defining a default cpu topology requires only minimal
> changes to apply to RISC-V also. The current arm64 implementation exits
> early in a uniprocessor configuration by reading MPIDR & claiming that
> uniprocessor can rely on the default values.
>
> This is appears to be a hangover from prior to '3102bc0e6ac7 ("arm64:
> topology: Stop using MPIDR for topology information")', because the
> current code just assigns default values for multiprocessor systems.
>
> With the MPIDR references removed, store_cpu_topolgy() can be moved to
> the common arch_topology code.
>

Looks good. FWIW,

Reviewed-by: Sudeep Holla <[email protected]>

> CC: [email protected]

However, while I understand the reason why this is needed in stable trees
for RISC-V, I am not sure if we want this for stable tree at-least on arm64.
I leave that part to Greg and Will.

--
Regards,
Sudeep

2022-07-11 15:57:57

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] arm64: topology: move store_cpu_topology() to shared code

On Mon, Jul 11, 2022 at 04:50:38PM +0200, Greg Kroah-Hartman wrote:
> On Mon, Jul 11, 2022 at 03:35:42PM +0100, Sudeep Holla wrote:
> > On Sat, Jul 09, 2022 at 04:23:54PM +0100, Conor Dooley wrote:
> > > From: Conor Dooley <[email protected]>
> > >
> > > arm64's method of defining a default cpu topology requires only minimal
> > > changes to apply to RISC-V also. The current arm64 implementation exits
> > > early in a uniprocessor configuration by reading MPIDR & claiming that
> > > uniprocessor can rely on the default values.
> > >
> > > This is appears to be a hangover from prior to '3102bc0e6ac7 ("arm64:
> > > topology: Stop using MPIDR for topology information")', because the
> > > current code just assigns default values for multiprocessor systems.
> > >
> > > With the MPIDR references removed, store_cpu_topolgy() can be moved to
> > > the common arch_topology code.
> > >
> >
> > Looks good. FWIW,
> >
> > Reviewed-by: Sudeep Holla <[email protected]>
> >
> > > CC: [email protected]
> >
> > However, while I understand the reason why this is needed in stable trees
> > for RISC-V, I am not sure if we want this for stable tree at-least on arm64.
> > I leave that part to Greg and Will.
>
> Why would it be good for one arch but bad for another?

Not really bad as such. Just needs testing and must not change much ideally,
but it really depends on which stable trees we will target and what is the
original state there. As mentioned in the commit, this changed a bit around
v5.8/9 on arm64 and not sure what kernels RISC-V needs this. There could
be some surprises on some Andriod platforms but that is something we can
look at when if and when there are complaints.

I am in general not sure what is the -stable tree rules is such situation and
hence made the noise so that we are aware that we may need more work than just
backporting this patch. Also this is just my opinion. If we decide to backport
esp. to kernels older than the one containing 3102bc0e6ac7, then arm64 may need
more changes or probably we can pull that commit if that makes it easier. Based
on what is decided and what are the targeted -stable trees, we can dig deeper.

--
Regards,
Sudeep

2022-07-11 17:24:09

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] arm64: topology: move store_cpu_topology() to shared code

On 11/07/2022 16:24, Sudeep Holla wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>
> On Mon, Jul 11, 2022 at 04:50:38PM +0200, Greg Kroah-Hartman wrote:
>> On Mon, Jul 11, 2022 at 03:35:42PM +0100, Sudeep Holla wrote:
>>> On Sat, Jul 09, 2022 at 04:23:54PM +0100, Conor Dooley wrote:
>>>> From: Conor Dooley <[email protected]>
>>>>
>>>> arm64's method of defining a default cpu topology requires only minimal
>>>> changes to apply to RISC-V also. The current arm64 implementation exits
>>>> early in a uniprocessor configuration by reading MPIDR & claiming that
>>>> uniprocessor can rely on the default values.
>>>>
>>>> This is appears to be a hangover from prior to '3102bc0e6ac7 ("arm64:
>>>> topology: Stop using MPIDR for topology information")', because the
>>>> current code just assigns default values for multiprocessor systems.
>>>>
>>>> With the MPIDR references removed, store_cpu_topolgy() can be moved to
>>>> the common arch_topology code.
>>>>
>>>
>>> Looks good. FWIW,
>>>
>>> Reviewed-by: Sudeep Holla <[email protected]>
>>>
>>>> CC: [email protected]
>>>
>>> However, while I understand the reason why this is needed in stable trees
>>> for RISC-V, I am not sure if we want this for stable tree at-least on arm64.
>>> I leave that part to Greg and Will.
>>
>> Why would it be good for one arch but bad for another?
>
> Not really bad as such. Just needs testing and must not change much ideally,
> but it really depends on which stable trees we will target and what is the
> original state there. As mentioned in the commit, this changed a bit around
> v5.8/9 on arm64 and not sure what kernels RISC-V needs this. There could
> be some surprises on some Andriod platforms but that is something we can
> look at when if and when there are complaints.
>
> I am in general not sure what is the -stable tree rules is such situation and
> hence made the noise so that we are aware that we may need more work than just
> backporting this patch. Also this is just my opinion. If we decide to backport
> esp. to kernels older than the one containing 3102bc0e6ac7, then arm64 may need
> more changes or probably we can pull that commit if that makes it easier. Based
> on what is decided and what are the targeted -stable trees, we can dig deeper.

There's always the option of, for the older kernels, not migrating arm64 at all
and just wrap store_cpu_topo with "if RISCV" rather than "if RISCV || ARM64".