2022-05-25 19:09:40

by Sudeep Holla

Subject: [PATCH v3 00/16] arch_topology: Updates to add socket support and fix cluster ids

Hi All,

This version updates cacheinfo to populate and use the information from
there for all the cache topology. Sorry for posting in the middle of
merge window but better to get this tested earlier so that it is ready
for next merge window.

This series intends to fix some discrepancies we have in the CPU topology
parsing from the device tree /cpu-map node. The current parsing also diverges
from the behaviour on an ACPI enabled platform. The expectation is that both DT
and ACPI enabled systems must present a consistent view of the CPU topology.

Currently we assign the generated cluster count as the physical package
identifier for each CPU, which is wrong. The device tree bindings for CPU
topology support sockets to infer the socket or physical package identifier
for a given CPU. Also, we don't check if all the cores/threads belong to the
same cluster before updating their sibling masks, which is fine as we don't
set the cluster id yet.

These changes also assign the cluster identifier as parsed from the device tree
cluster nodes within /cpu-map, without support for nesting of the clusters.
Finally, they also add support for socket nodes in /cpu-map. With this, the
parsing of the exact same information from ACPI PPTT and the /cpu-map DT node
aligns well.

The only exception is that the last level cache id information can be
inferred from the same ACPI PPTT while we need to parse CPU cache nodes
in the device tree.

P.S: I have not cc-ed Greg and Rafael so that all the users of arch_topology
agree with the changes first before we include them.

v2[2]->v3:
- Dropped support to get the device node for the CPU's LLC
- Updated cacheinfo to support calling of detect_cache_attributes
early in smp_prepare_cpus stage
- Added support to check if LLC is valid and shared in the cacheinfo
- Used the same in arch_topology

v1[1]->v2[2]:
- Updated ID validity check to include all non-negative values
- Added support to get the device node for the CPU's last level cache
- Added support to build llc_sibling on DT platforms

[1] https://lore.kernel.org/lkml/[email protected]
[2] https://lore.kernel.org/lkml/[email protected]

Sudeep Holla (16):
cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
cacheinfo: Add helper to access any cache index for a given CPU
cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
cacheinfo: Add support to check if last level cache(LLC) is valid or shared
cacheinfo: Allow early detection and population of cache attributes
arch_topology: Add support to parse and detect cache attributes
arch_topology: Use the last level cache information from the cacheinfo
arm64: topology: Remove redundant setting of llc_id in CPU topology
arch_topology: Drop LLC identifier stash from the CPU topology
arch_topology: Set thread sibling cpumask only within the cluster
arch_topology: Check for non-negative value rather than -1 for IDs validity
arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
arch_topology: Don't set cluster identifier as physical package identifier
arch_topology: Drop unnecessary check for uninitialised package_id
arch_topology: Set cluster identifier in each core/thread from /cpu-map
arch_topology: Add support for parsing sockets in /cpu-map

arch/arm64/kernel/topology.c | 14 -----
drivers/base/arch_topology.c | 92 +++++++++++++++++----------
drivers/base/cacheinfo.c | 114 +++++++++++++++++++++-------------
include/linux/arch_topology.h | 1 -
include/linux/cacheinfo.h | 3 +
5 files changed, 135 insertions(+), 89 deletions(-)

--
2.36.1
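
The identifier change described in the cover letter can be illustrated with a small user-space model. This is only a sketch, not the kernel code from the series: the `model_*` names, the fixed sockets/clusters/cores layout, and the choice to number clusters within their socket are all assumptions made for illustration. It contrasts storing the running cluster count as the package id (the behaviour the cover letter calls wrong) with taking the package id from the socket and the cluster id from the cluster node.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only; the field names mirror the ids discussed above. */
struct model_cpu_topology {
	int package_id;
	int cluster_id;
	int core_id;
};

/* Hypothetical regular /cpu-map layout: sockets -> clusters -> cores. */
struct model_cpu_map {
	int sockets;
	int clusters_per_socket;
	int cores_per_cluster;
};

static size_t model_nr_cpus(const struct model_cpu_map *map)
{
	assert(map);
	return (size_t)map->sockets * map->clusters_per_socket *
	       map->cores_per_cluster;
}

/* Old behaviour: the running cluster count ends up as the package id. */
static void model_assign_old(const struct model_cpu_map *map,
			     struct model_cpu_topology *topo)
{
	size_t cpu = 0;
	int cluster_count = 0;

	assert(map && topo);
	for (int s = 0; s < map->sockets; s++)
		for (int c = 0; c < map->clusters_per_socket; c++, cluster_count++)
			for (int core = 0; core < map->cores_per_cluster; core++, cpu++) {
				topo[cpu].package_id = cluster_count;
				topo[cpu].cluster_id = -1; /* not set yet */
				topo[cpu].core_id = core;
			}
}

/* Intended behaviour: package id from the socket, cluster id from the cluster. */
static void model_assign_new(const struct model_cpu_map *map,
			     struct model_cpu_topology *topo)
{
	size_t cpu = 0;

	assert(map && topo);
	for (int s = 0; s < map->sockets; s++)
		for (int c = 0; c < map->clusters_per_socket; c++)
			for (int core = 0; core < map->cores_per_cluster; core++, cpu++) {
				topo[cpu].package_id = s;
				topo[cpu].cluster_id = c; /* assumed: index within the socket */
				topo[cpu].core_id = core;
			}
}
```

With two sockets of two clusters each, the old scheme in this model reports four distinct package ids, while the new one reports two, which is the view an equivalent ACPI PPTT description would be expected to give.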



2022-05-26 00:13:14

by Sudeep Holla

Subject: [PATCH v3 01/16] cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node

of_cpu_device_node_get takes care of fetching the CPU's device node
either from the cached cpu_dev->of_node if cpu_dev is initialised, or by using
of_get_cpu_node to parse and fetch the node if cpu_dev isn't available yet.

Just use of_cpu_device_node_get instead of getting the cpu device first
and then using cpu_dev->of_node, for two reasons:
1. There is no other use of cpu_dev, so the code can be simplified
2. It enables the use of detect_cache_attributes, and hence
cache_setup_of_node, much earlier, before the CPUs are registered as devices.

Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/cacheinfo.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index dad296229161..b0bde272e2ae 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -14,7 +14,7 @@
#include <linux/cpu.h>
#include <linux/device.h>
#include <linux/init.h>
-#include <linux/of.h>
+#include <linux/of_device.h>
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/smp.h>
@@ -157,7 +157,6 @@ static int cache_setup_of_node(unsigned int cpu)
{
struct device_node *np;
struct cacheinfo *this_leaf;
- struct device *cpu_dev = get_cpu_device(cpu);
struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
unsigned int index = 0;

@@ -166,11 +165,7 @@ static int cache_setup_of_node(unsigned int cpu)
return 0;
}

- if (!cpu_dev) {
- pr_err("No cpu device for CPU %d\n", cpu);
- return -ENODEV;
- }
- np = cpu_dev->of_node;
+ np = of_cpu_device_node_get(cpu);
if (!np) {
pr_err("Failed to find cpu%d device node\n", cpu);
return -ENOENT;
--
2.36.1
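
The fallback described in the commit message above can be modelled in user space as follows. This is a minimal sketch under stated assumptions, not the kernel helper itself: every `model_*` type, table, and function here is hypothetical and merely stands in for the CPU device, its cached of_node, and the device tree walk. It shows why the lookup still succeeds before the CPU devices are registered.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4

/* Stand-ins for a device tree node and a registered CPU device. */
struct model_device_node { int cpu; };
struct model_device { struct model_device_node *of_node; };

/* Pretend device tree: one node per CPU. */
static struct model_device_node model_dt_cpu_nodes[MODEL_NR_CPUS] = {
	{ 0 }, { 1 }, { 2 }, { 3 },
};

/* NULL until the CPU is "registered" as a device. */
static struct model_device *model_cpu_devices[MODEL_NR_CPUS];

/* Counts how often the slow path (tree walk) is taken. */
static int model_dt_walks;

/* Slow path: models parsing the tree to find the CPU node. */
static struct model_device_node *model_parse_dt_for_cpu(int cpu)
{
	model_dt_walks++;
	return &model_dt_cpu_nodes[cpu];
}

/* Cached node if the device exists, otherwise fall back to parsing. */
static struct model_device_node *model_cpu_device_node_get(int cpu)
{
	struct model_device *cpu_dev;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	cpu_dev = model_cpu_devices[cpu];
	if (!cpu_dev)
		return model_parse_dt_for_cpu(cpu);
	return cpu_dev->of_node;
}
```

In this model a caller running early, before any entry in `model_cpu_devices` is populated, still gets a node through the slow path, whereas a caller that dereferences the device directly would have to bail out, as the removed `-ENODEV` branch did.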


2022-06-01 19:22:16

by Gavin Shan

Subject: Re: [PATCH v3 00/16] arch_topology: Updates to add socket support and fix cluster ids

Hi Sudeep,

I tried this series on a virtual machine where ACPI is enabled and it looks
good, especially PATCH[10], which resolves the issue I have, so I provided my
Tested-by tag for it. Besides, I checked the changes related to the ACPI part
and they look good to me too once the mentioned nits are fixed. I leave the
changes related to device-tree to be reviewed by the experts :)

Thanks,
Gavin


2022-06-01 21:13:25

by Sudeep Holla

Subject: Re: [PATCH v3 00/16] arch_topology: Updates to add socket support and fix cluster ids

On Wed, Jun 01, 2022 at 11:49:07AM +0800, Gavin Shan wrote:
> Hi Sudeep,
>
> I tried this series on virtual machine where ACPI is enabled and looks good.
> Especially for PATCH[10], resolving the issue I have. So I provided my tested-by
> tag for it. Besides, I checked the changes related to ACPI part and looks to
> me either after the mentioned nits fixed. I leave the changes related to device-tree
> to be reviewed by the experts :)
>

Thanks for the review and testing, much appreciated!
Yes, the changes for ACPI are very minimal in this series; except for the bug
fix you are interested in, nothing changes in the system behaviour. The main
aim is to get the same behaviour on a similar DT based system.

--
Regards,
Sudeep

2022-06-01 21:19:34

by Gavin Shan

Subject: Re: [PATCH v3 01/16] cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> The of_cpu_device_node_get takes care of fetching the CPU'd device node
> either from cached cpu_dev->of_node if cpu_dev is initialised or uses
> of_get_cpu_node to parse and fetch node if cpu_dev isn't available yet.
>
> Just use of_cpu_device_node_get instead of getting the cpu device first
> and then using cpu_dev->of_node for two reasons:
> 1. There is no other use of cpu_dev and can be simplified
> 2. It enabled the use detect_cache_attributes and hence cache_setup_of_node
> much earlier before the CPUs are registered as devices.
>
> Signed-off-by: Sudeep Holla <[email protected]>
> ---
> drivers/base/cacheinfo.c | 9 ++-------
> 1 file changed, 2 insertions(+), 7 deletions(-)
>

Reviewed-by: Gavin Shan <[email protected]>

> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index dad296229161..b0bde272e2ae 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -14,7 +14,7 @@
> #include <linux/cpu.h>
> #include <linux/device.h>
> #include <linux/init.h>
> -#include <linux/of.h>
> +#include <linux/of_device.h>
> #include <linux/sched.h>
> #include <linux/slab.h>
> #include <linux/smp.h>
> @@ -157,7 +157,6 @@ static int cache_setup_of_node(unsigned int cpu)
> {
> struct device_node *np;
> struct cacheinfo *this_leaf;
> - struct device *cpu_dev = get_cpu_device(cpu);
> struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> unsigned int index = 0;
>
> @@ -166,11 +165,7 @@ static int cache_setup_of_node(unsigned int cpu)
> return 0;
> }
>
> - if (!cpu_dev) {
> - pr_err("No cpu device for CPU %d\n", cpu);
> - return -ENODEV;
> - }
> - np = cpu_dev->of_node;
> + np = of_cpu_device_node_get(cpu);
> if (!np) {
> pr_err("Failed to find cpu%d device node\n", cpu);
> return -ENOENT;
>