2022-04-07 07:36:36

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.17 18/31] s390/pci: improve zpci_dev reference counting

From: Niklas Schnelle <[email protected]>

[ Upstream commit c122383d221dfa2f41cfe5e672540595de986fde ]

Currently zpci_dev uses kref based reference counting but only accounts
for one original reference plus one reference from an added pci_dev to
its underlying zpci_dev. Counting just the original reference worked
until the pci_dev reference was added in commit 2a671f77ee49 ("s390/pci:
fix use after free of zpci_dev") because once a zpci_dev goes away, i.e.
enters the reserved state, it would immediately get released. However
with the pci_dev reference this is no longer the case and the zpci_dev
may still appear in multiple availability events indicating that it was
reserved. This was solved by detecting when the zpci_dev is already on
its way out but still hanging around. This has however shown some light
on how unusual our zpci_dev reference counting is.

Improve upon this by modelling zpci_dev reference counting on pci_dev.
Analogous to pci_get_slot() increment the reference count in
get_zdev_by_fid(). Thus all users of get_zdev_by_fid() must drop the
reference once they are done with the zpci_dev.

Similar to pci_scan_single_device(), zpci_create_device() returns the
device with an initial count of 1 and the device added to the zpci_list
(analogous to the PCI bus' device_list). In turn users of
zpci_create_device() must only drop the reference once the device is
gone from the point of view of the zPCI subsystem, it might still be
referenced by the common PCI subsystem though.

Reviewed-by: Matthew Rosato <[email protected]>
Signed-off-by: Niklas Schnelle <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/s390/pci/pci.c | 1 +
arch/s390/pci/pci_bus.h | 3 ++-
arch/s390/pci/pci_clp.c | 9 +++++++--
arch/s390/pci/pci_event.c | 7 ++++++-
4 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 792f8e0f2178..5bcd9228db5f 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -69,6 +69,7 @@ struct zpci_dev *get_zdev_by_fid(u32 fid)
list_for_each_entry(tmp, &zpci_list, entry) {
if (tmp->fid == fid) {
zdev = tmp;
+ zpci_zdev_get(zdev);
break;
}
}
diff --git a/arch/s390/pci/pci_bus.h b/arch/s390/pci/pci_bus.h
index e359d2686178..ecef3a9e16c0 100644
--- a/arch/s390/pci/pci_bus.h
+++ b/arch/s390/pci/pci_bus.h
@@ -19,7 +19,8 @@ void zpci_bus_remove_device(struct zpci_dev *zdev, bool set_error);
void zpci_release_device(struct kref *kref);
static inline void zpci_zdev_put(struct zpci_dev *zdev)
{
- kref_put(&zdev->kref, zpci_release_device);
+ if (zdev)
+ kref_put(&zdev->kref, zpci_release_device);
}

static inline void zpci_zdev_get(struct zpci_dev *zdev)
diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
index be077b39da33..5011d27461fd 100644
--- a/arch/s390/pci/pci_clp.c
+++ b/arch/s390/pci/pci_clp.c
@@ -22,6 +22,8 @@
#include <asm/clp.h>
#include <uapi/asm/clp.h>

+#include "pci_bus.h"
+
bool zpci_unique_uid;

void update_uid_checking(bool new)
@@ -403,8 +405,11 @@ static void __clp_add(struct clp_fh_list_entry *entry, void *data)
return;

zdev = get_zdev_by_fid(entry->fid);
- if (!zdev)
- zpci_create_device(entry->fid, entry->fh, entry->config_state);
+ if (zdev) {
+ zpci_zdev_put(zdev);
+ return;
+ }
+ zpci_create_device(entry->fid, entry->fh, entry->config_state);
}

int clp_scan_pci_devices(void)
diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c
index 2e3e5b278925..ea9db5cea64e 100644
--- a/arch/s390/pci/pci_event.c
+++ b/arch/s390/pci/pci_event.c
@@ -269,7 +269,7 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf)
pdev ? pci_name(pdev) : "n/a", ccdf->pec, ccdf->fid);

if (!pdev)
- return;
+ goto no_pdev;

switch (ccdf->pec) {
case 0x003a: /* Service Action or Error Recovery Successful */
@@ -286,6 +286,8 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf)
break;
}
pci_dev_put(pdev);
+no_pdev:
+ zpci_zdev_put(zdev);
}

void zpci_event_error(void *data)
@@ -314,6 +316,7 @@ static void zpci_event_hard_deconfigured(struct zpci_dev *zdev, u32 fh)
static void __zpci_event_availability(struct zpci_ccdf_avail *ccdf)
{
struct zpci_dev *zdev = get_zdev_by_fid(ccdf->fid);
+ bool existing_zdev = !!zdev;
enum zpci_state state;

zpci_dbg(3, "avl fid:%x, fh:%x, pec:%x\n",
@@ -378,6 +381,8 @@ static void __zpci_event_availability(struct zpci_ccdf_avail *ccdf)
default:
break;
}
+ if (existing_zdev)
+ zpci_zdev_put(zdev);
}

void zpci_event_availability(void *data)
--
2.35.1


2022-04-07 17:18:03

by Niklas Schnelle

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.17 18/31] s390/pci: improve zpci_dev reference counting

On Wed, 2022-04-06 at 21:10 -0400, Sasha Levin wrote:
> From: Niklas Schnelle <[email protected]>
>
> [ Upstream commit c122383d221dfa2f41cfe5e672540595de986fde ]
>
> Currently zpci_dev uses kref based reference counting but only accounts
> for one original reference plus one reference from an added pci_dev to
> its underlying zpci_dev. Counting just the original reference worked
> until the pci_dev reference was added in commit 2a671f77ee49 ("s390/pci:
> fix use after free of zpci_dev") because once a zpci_dev goes away, i.e.
> enters the reserved state, it would immediately get released. However
> with the pci_dev reference this is no longer the case and the zpci_dev
> may still appear in multiple availability events indicating that it was
> reserved. This was solved by detecting when the zpci_dev is already on
> its way out but still hanging around. This has however shown some light
> on how unusual our zpci_dev reference counting is.
>
> Improve upon this by modelling zpci_dev reference counting on pci_dev.
> Analogous to pci_get_slot() increment the reference count in
> get_zdev_by_fid(). Thus all users of get_zdev_by_fid() must drop the
> reference once they are done with the zpci_dev.
>
> Similar to pci_scan_single_device(), zpci_create_device() returns the
> device with an initial count of 1 and the device added to the zpci_list
> (analogous to the PCI bus' device_list). In turn users of
> zpci_create_device() must only drop the reference once the device is
> gone from the point of view of the zPCI subsystem, it might still be
> referenced by the common PCI subsystem though.
>
> Reviewed-by: Matthew Rosato <[email protected]>
> Signed-off-by: Niklas Schnelle <[email protected]>
> Signed-off-by: Vasily Gorbik <[email protected]>
> Signed-off-by: Sasha Levin <[email protected]>
> ---

This isn't really a bug fix, as far as I'm aware the existing code
works correctly. It is just about making things more like PCI bus
reference counting and less weird. I also see some potential of the
state of things with just this commit added being confusing. That's why
there is a follow up commit 7dcfe50f58d2 ("s390/pci: rename
get_zdev_by_bus() to zdev_from_bus()") to make it more obvious when
zpci_zdev_put() is needed.

In short I'd propose to drop this patch from the stable queues.