2021-05-26 22:05:39

by Liang, Kan

[permalink] [raw]
Subject: [PATCH] perf/x86/intel/uncore: Fix a kernel WARNING triggered by maxcpus=1

From: Kan Liang <[email protected]>

A kernel WARNING may be triggered when setting maxcpus=1.

The uncore counters are Die-scope. When probing a PCI device, only the
BUS information can be retrieved. The uncore driver has to maintain a
mapping table used to calculate the logical Die ID from a given BUS#.

Before the patch ba9506be4e40, the mapping table stores the mapping
information from the BUS# -> a Physical Socket ID. To calculate the
logical die ID, perf does,
- In snbep_pci2phy_map_init(), retrieve the BUS# -> a Physical Socket ID
from the UBOX PCI configure space.
- Calculate the mapping information (a BUS# -> a Physical Socket ID) for
the other PCI BUS.
- In the uncore_pci_probe(), get the physical Socket ID from a given BUS
and the mapping table.
- Calculate the logical Die ID

Since only the logical Die ID is required, with the patch ba9506be4e40,
the mapping table stores the mapping information from the BUS# -> a
logical Die ID. Now perf does,
- In snbep_pci2phy_map_init(), retrieve the BUS# -> a Physical Socket ID
from the UBOX PCI configure space.
- Calculate the logical Die ID
- Calculate the mapping information (a BUS# -> a logical Die ID) for the
other PCI BUS.
- In the uncore_pci_probe(), get the logical die ID from a given BUS and
the mapping table.

When calculating the logical Die ID, -1 may be returned, especially when
maxcpus=1. Here, -1 means the logical Die ID is not found. But when
calculating the mapping information for the other PCI BUS, -1 indicates
that it's the other PCI BUS that requires the calculation of the
mapping. The driver will mistakenly do the calculation.

Uses the -ENODEV to indicate the case which the logical Die ID is not
found. The driver will not mess up the mapping table anymore.

Fixes: ba9506be4e40 ("perf/x86/intel/uncore: Store the logical die id
instead of the physical die id.")
Reported-by: John Donnelly <[email protected]>
Tested-by: John Donnelly <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/uncore_snbep.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index f57d9cb..035787c 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -1413,6 +1413,8 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
die_id = i;
else
die_id = topology_phys_to_logical_pkg(i);
+ if (die_id < 0)
+ die_id = -ENODEV;
map->pbus_to_dieid[bus] = die_id;
break;
}
@@ -1459,14 +1461,14 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
i = -1;
if (reverse) {
for (bus = 255; bus >= 0; bus--) {
- if (map->pbus_to_dieid[bus] >= 0)
+ if (map->pbus_to_dieid[bus] != -1)
i = map->pbus_to_dieid[bus];
else
map->pbus_to_dieid[bus] = i;
}
} else {
for (bus = 0; bus <= 255; bus++) {
- if (map->pbus_to_dieid[bus] >= 0)
+ if (map->pbus_to_dieid[bus] != -1)
i = map->pbus_to_dieid[bus];
else
map->pbus_to_dieid[bus] = i;
--
2.7.4


2021-05-27 20:17:54

by John Donnelly

[permalink] [raw]
Subject: Re: [PATCH] perf/x86/intel/uncore: Fix a kernel WARNING triggered by maxcpus=1

On 5/26/21 8:58 AM, [email protected] wrote:
> From: Kan Liang <[email protected]>
>
> A kernel WARNING may be triggered when setting maxcpus=1.
>
> The uncore counters are Die-scope. When probing a PCI device, only the
> BUS information can be retrieved. The uncore driver has to maintain a
> mapping table used to calculate the logical Die ID from a given BUS#.
>
> Before the patch ba9506be4e40, the mapping table stores the mapping
> information from the BUS# -> a Physical Socket ID. To calculate the
> logical die ID, perf does,
> - In snbep_pci2phy_map_init(), retrieve the BUS# -> a Physical Socket ID
> from the UBOX PCI configure space.
> - Calculate the mapping information (a BUS# -> a Physical Socket ID) for
> the other PCI BUS.
> - In the uncore_pci_probe(), get the physical Socket ID from a given BUS
> and the mapping table.
> - Calculate the logical Die ID
>
> Since only the logical Die ID is required, with the patch ba9506be4e40,
> the mapping table stores the mapping information from the BUS# -> a
> logical Die ID. Now perf does,
> - In snbep_pci2phy_map_init(), retrieve the BUS# -> a Physical Socket ID
> from the UBOX PCI configure space.
> - Calculate the logical Die ID
> - Calculate the mapping information (a BUS# -> a logical Die ID) for the
> other PCI BUS.
> - In the uncore_pci_probe(), get the logical die ID from a given BUS and
> the mapping table.
>
> When calculating the logical Die ID, -1 may be returned, especially when
> maxcpus=1. Here, -1 means the logical Die ID is not found. But when
> calculating the mapping information for the other PCI BUS, -1 indicates
> that it's the other PCI BUS that requires the calculation of the
> mapping. The driver will mistakenly do the calculation.
>
> Uses the -ENODEV to indicate the case which the logical Die ID is not
> found. The driver will not mess up the mapping table anymore.
>
> Fixes: ba9506be4e40 ("perf/x86/intel/uncore: Store the logical die id
> instead of the physical die id.")
> Reported-by: John Donnelly <[email protected]>
> Tested-by: John Donnelly <[email protected]>
> Signed-off-by: Kan Liang <[email protected]>

Acked-by: John Donnelly <[email protected]>
> ---
> arch/x86/events/intel/uncore_snbep.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index f57d9cb..035787c 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -1413,6 +1413,8 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
> die_id = i;
> else
> die_id = topology_phys_to_logical_pkg(i);
> + if (die_id < 0)
> + die_id = -ENODEV;
> map->pbus_to_dieid[bus] = die_id;
> break;
> }
> @@ -1459,14 +1461,14 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
> i = -1;
> if (reverse) {
> for (bus = 255; bus >= 0; bus--) {
> - if (map->pbus_to_dieid[bus] >= 0)
> + if (map->pbus_to_dieid[bus] != -1)
> i = map->pbus_to_dieid[bus];
> else
> map->pbus_to_dieid[bus] = i;
> }
> } else {
> for (bus = 0; bus <= 255; bus++) {
> - if (map->pbus_to_dieid[bus] >= 0)
> + if (map->pbus_to_dieid[bus] != -1)
> i = map->pbus_to_dieid[bus];
> else
> map->pbus_to_dieid[bus] = i;
>