From: Alexander Antonov <[email protected]>

The NULL dereference happens inside upi_fill_topology() procedure in
case of disabling one of the sockets on the system.

For example, if you disable the 2nd socket on a 4-socket system then
uncore_max_dies() returns 3 and inside pmu_alloc_topology() memory will
be allocated only for 3 sockets and stored in type->topology.
In discover_upi_topology() memory is accessed by socket id from CPUNODEID
registers which contain physical ids (from 0 to 3) and on the line:

	upi = &type->topology[nid][idx];

out-of-bound access will happen and the 'upi' pointer will be passed to
upi_fill_topology() where it will be dereferenced.

To avoid this issue update the code to convert physical socket id to
logical socket id in discover_upi_topology() before accessing memory.

Fixes: f680b6e6062e ("perf/x86/intel/uncore: Enable UPI topology discovery for Icelake Server")
Reported-by: Kyle Meyer <[email protected]>
Tested-by: Kyle Meyer <[email protected]>
Signed-off-by: Alexander Antonov <[email protected]>
---
arch/x86/events/intel/uncore_snbep.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 8250f0f59c2b..49bc27ab26ad 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -5596,7 +5596,7 @@ static int discover_upi_topology(struct intel_uncore_type *type, int ubox_did, i
 	struct pci_dev *ubox = NULL;
 	struct pci_dev *dev = NULL;
 	u32 nid, gid;
-	int i, idx, ret = -EPERM;
+	int i, idx, lgc_pkg, ret = -EPERM;
 	struct intel_uncore_topology *upi;
 	unsigned int devfn;

@@ -5614,8 +5614,13 @@ static int discover_upi_topology(struct intel_uncore_type *type, int ubox_did, i
 		for (i = 0; i < 8; i++) {
 			if (nid != GIDNIDMAP(gid, i))
 				continue;
+			lgc_pkg = topology_phys_to_logical_pkg(i);
+			if (lgc_pkg < 0) {
+				ret = -EPERM;
+				goto err;
+			}
 			for (idx = 0; idx < type->num_boxes; idx++) {
-				upi = &type->topology[nid][idx];
+				upi = &type->topology[lgc_pkg][idx];
 				devfn = PCI_DEVFN(dev_link0 + idx, ICX_UPI_REGS_ADDR_FUNCTION);
 				dev = pci_get_domain_bus_and_slot(pci_domain_nr(ubox->bus),
 								  ubox->bus->number,
@@ -5626,6 +5631,7 @@ static int discover_upi_topology(struct intel_uncore_type *type, int ubox_did, i
 						goto err;
 				}
 			}
+			break;
 		}
 	}
 err:
base-commit: 9bacdd8996c77c42ca004440be610692275ff9d0
--
2.25.1
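
The failure mode described in the commit message can be modeled in
userspace. The sketch below is an illustration only, not kernel code:
phys_to_logical_pkg() is a hypothetical stand-in for
topology_phys_to_logical_pkg(), and the table of online sockets is an
assumed 4-socket system with physical socket 1 disabled.

```c
#include <stddef.h>

/* Assumed example system: physical sockets 0, 2 and 3 are online. */
static const int present_phys_pkgs[] = { 0, 2, 3 };
#define NUM_LOGICAL_PKGS \
	(sizeof(present_phys_pkgs) / sizeof(present_phys_pkgs[0]))

/*
 * Hypothetical stand-in for topology_phys_to_logical_pkg(): returns the
 * dense logical index of a physical socket, or -1 if it is not online.
 */
static int phys_to_logical_pkg(int phys_pkg)
{
	size_t i;

	for (i = 0; i < NUM_LOGICAL_PKGS; i++) {
		if (present_phys_pkgs[i] == phys_pkg)
			return (int)i;
	}
	return -1;
}

/* Returns 1 if idx is a valid index into an array with one slot per logical package. */
static int index_in_bounds(int idx)
{
	return idx >= 0 && (size_t)idx < NUM_LOGICAL_PKGS;
}
```

In this model, physical id 3 is not a valid index into a 3-entry array,
while its logical id is. A disabled socket yields a negative value,
which is the case the patch turns into an error return.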
On 2023-11-15 10:13 a.m., [email protected] wrote:
> @@ -5614,8 +5614,13 @@ static int discover_upi_topology(struct intel_uncore_type *type, int ubox_did, i
> for (i = 0; i < 8; i++) {
> if (nid != GIDNIDMAP(gid, i))
> continue;
> + lgc_pkg = topology_phys_to_logical_pkg(i);
> + if (lgc_pkg < 0) {
> + ret = -EPERM;
> + goto err;
> + }
In snbep_pci2phy_map_init(), there is similar code that finds the
logical die id. Can we factor out a common function for both of them?

Thanks,
Kan
On 11/15/2023 8:00 PM, Liang, Kan wrote:
>
> On 2023-11-15 10:13 a.m., [email protected] wrote:
>> @@ -5614,8 +5614,13 @@ static int discover_upi_topology(struct intel_uncore_type *type, int ubox_did, i
>> for (i = 0; i < 8; i++) {
>> if (nid != GIDNIDMAP(gid, i))
>> continue;
>> + lgc_pkg = topology_phys_to_logical_pkg(i);
>> + if (lgc_pkg < 0) {
>> + ret = -EPERM;
>> + goto err;
>> + }
> In snbep_pci2phy_map_init(), there is similar code that finds the
> logical die id. Can we factor out a common function for both of them?
>
> Thanks,
> Kan
Hi Kan,

Thank you for your comment.
Yes, I think we can factor out the common loop where GIDNIDMAP is
checked.
But snbep_pci2phy_map_init() uses a slightly different procedure, which
also does the following:

	if (topology_max_die_per_package() > 1)
		die_id = i;

I think that keeping this code, at least in our case, could lead to the
same issue that we are trying to fix. But of course we could
parameterize this check.
What do you think?

Thanks,
Alexander
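
For illustration, the parameterization discussed above could look
roughly like the userspace sketch below. It is not a proposed kernel
interface: nid_to_logical_id(), its use_raw_slot flag, the lookup
callback, example_lookup() and EXAMPLE_GID are all hypothetical names,
and the GIDNIDMAP() macro is restated here on the assumption that each
slot is a 3-bit field.

```c
#include <stdbool.h>

/* Assumed layout: eight 3-bit node-id fields packed into one register value. */
#define GIDNIDMAP(config, id)	(((config) >> (3 * (id))) & 0x7)

/* Hypothetical callback: physical package id -> logical id, negative if absent. */
typedef int (*phys_to_logical_fn)(int phys_pkg);

/*
 * Hypothetical common helper: find the GIDNIDMAP slot whose field equals
 * nid and translate the slot index to a logical id. When use_raw_slot is
 * true the slot index is returned unchanged, which models the
 * topology_max_die_per_package() > 1 branch quoted above.
 */
static int nid_to_logical_id(unsigned int nid, unsigned int gid,
			     bool use_raw_slot, phys_to_logical_fn lookup)
{
	int i;

	for (i = 0; i < 8; i++) {
		if (nid != GIDNIDMAP(gid, i))
			continue;
		return use_raw_slot ? i : lookup(i);
	}
	return -1;	/* no slot matches nid */
}

/* Assumed example system: physical sockets 0, 2 and 3 are online. */
static int example_lookup(int phys_pkg)
{
	switch (phys_pkg) {
	case 0: return 0;
	case 2: return 1;
	case 3: return 2;
	default: return -1;
	}
}

/* Assumed register value: slot i holds node id i for i = 0..3. */
#define EXAMPLE_GID	((1u << 3) | (2u << 6) | (3u << 9))
```

With these assumptions, node id 3 resolves to logical id 2 when the
lookup is used, but to 3 when the raw slot index is taken, which is the
out-of-range value the fix is meant to avoid.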
On 2023-11-20 2:49 p.m., Alexander Antonov wrote:
> Hi Kan,
>
> Thank you for your comment.
> Yes, I think we can factor out the common loop where GIDNIDMAP is
> checked.
> But snbep_pci2phy_map_init() uses a slightly different procedure, which
> also does the following:
>
>     if (topology_max_die_per_package() > 1)
>         die_id = i;
>
> I think that keeping this code, at least in our case, could lead to the
> same issue that we are trying to fix. But of course we could
> parameterize this check.
topology_max_die_per_package() > 1 means there is more than one die
in a socket. AFAIK, that only happens on Cascade Lake AP.
Did you observe it on ICX?

Thanks,
Kan
On 11/20/2023 10:21 PM, Liang, Kan wrote:
> topology_max_die_per_package() > 1 means there is more than one die
> in a socket. AFAIK, that only happens on Cascade Lake AP.
>
> Did you observe it on ICX?
>
> Thanks,
> Kan
No, I didn't observe it on ICX. It seems that, for now, it only occurs
on CLX-AP.

Thanks,
Alexander