2014-01-10 14:16:30

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 0/9] PCI: Eliminate race conditions between hotplug and sysfs rescan/remove (Was: Re: [PATCH v2 04/10] PCI: Destroy pci dev only once)

[Cc: adding linux-scsi for the MPT changes, Ben for powerpc, Matthew for
platform/x86 and Konrad for Xen]

On Friday, December 06, 2013 02:21:50 AM Rafael J. Wysocki wrote:

[...]

>
> OK
>
> To be a bit more constructive, as the next step I'd try to use
> pci_remove_rescan_mutex to serialize all PCI hotplug operations (as I said
> above) without making the other changes made by my patch. Does that sound
> reasonable?

Well, no answer here, so as a followup, a series implementing that idea
follows.

I *hope* I found all of the places that need to be synchronized vs the bus
rescan and device removal that can be triggered via sysfs, but I might overlook
something. Also in some cases I wasn't quite sure how much stuff to put under
the lock, because said stuff is not exactly straightforward.

Enjoy!

Rafael


2014-01-10 14:16:26

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 8/9] powerpc / eeh_driver: Use global PCI rescan-remove locking

From: Rafael J. Wysocki <[email protected]>

Race conditions are theoretically possible between the PCI device
addition and removal in the PPC64 PCI error recovery driver and
the generic PCI bus rescan and device removal that can be triggered
via sysfs.

To avoid those race conditions make PPC64 PCI error recovery driver
use global PCI rescan-remove locking around PCI device addition and
removal.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
arch/powerpc/kernel/eeh_driver.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)

Index: linux-pm/arch/powerpc/kernel/eeh_driver.c
===================================================================
--- linux-pm.orig/arch/powerpc/kernel/eeh_driver.c
+++ linux-pm/arch/powerpc/kernel/eeh_driver.c
@@ -369,7 +369,9 @@ static void *eeh_rmv_device(void *data,
edev->mode |= EEH_DEV_DISCONNECTED;
(*removed)++;

+ pci_lock_rescan_remove();
pci_stop_and_remove_bus_device(dev);
+ pci_unlock_rescan_remove();

return NULL;
}
@@ -416,10 +418,13 @@ static int eeh_reset_device(struct eeh_p
* into pcibios_add_pci_devices().
*/
eeh_pe_state_mark(pe, EEH_PE_KEEP);
- if (bus)
+ if (bus) {
+ pci_lock_rescan_remove();
pcibios_remove_pci_devices(bus);
- else if (frozen_bus)
+ pci_unlock_rescan_remove();
+ } else if (frozen_bus) {
eeh_pe_dev_traverse(pe, eeh_rmv_device, &removed);
+ }

/* Reset the pci controller. (Asserts RST#; resets config space).
* Reconfigure bridges and devices. Don't try to bring the system
@@ -429,6 +434,8 @@ static int eeh_reset_device(struct eeh_p
if (rc)
return rc;

+ pci_lock_rescan_remove();
+
/* Restore PE */
eeh_ops->configure_bridge(pe);
eeh_pe_restore_bars(pe);
@@ -462,6 +469,7 @@ static int eeh_reset_device(struct eeh_p
pe->tstamp = tstamp;
pe->freeze_count = cnt;

+ pci_unlock_rescan_remove();
return 0;
}

@@ -618,8 +626,11 @@ perm_error:
eeh_pe_dev_traverse(pe, eeh_report_failure, NULL);

/* Shut down the device drivers for good. */
- if (frozen_bus)
+ if (frozen_bus) {
+ pci_lock_rescan_remove();
pcibios_remove_pci_devices(frozen_bus);
+ pci_unlock_rescan_remove();
+ }
}

static void eeh_handle_special_event(void)
@@ -692,6 +703,7 @@ static void eeh_handle_special_event(voi
if (rc == 2 || rc == 1)
eeh_handle_normal_event(pe);
else {
+ lock_pci_remove_rescan();
list_for_each_entry_safe(hose, tmp,
&hose_list, list_node) {
phb_pe = eeh_phb_pe_get(hose);
@@ -703,6 +715,7 @@ static void eeh_handle_special_event(voi
eeh_pe_dev_traverse(pe, eeh_report_failure, NULL);
pcibios_remove_pci_devices(bus);
}
+ unlock_pci_remove_rescan();
}
}

2014-01-10 14:16:28

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 7/9] MPT / PCI: Use pci_stop_and_remove_bus_device_locked()

From: Rafael J. Wysocki <[email protected]>

Race conditions are theoretically possible between the MPT PCI
device removal and the generic PCI bus rescan and device removal
that can be triggered via sysfs.

To avoid those race conditions make the MPT PCI code use
pci_stop_and_remove_bus_device_locked().

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/message/fusion/mptbase.c | 2 +-
drivers/scsi/mpt2sas/mpt2sas_base.c | 2 +-
drivers/scsi/mpt3sas/mpt3sas_base.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)

Index: linux-pm/drivers/scsi/mpt3sas/mpt3sas_base.c
===================================================================
--- linux-pm.orig/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ linux-pm/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -131,7 +131,7 @@ static int mpt3sas_remove_dead_ioc_func(
pdev = ioc->pdev;
if ((pdev == NULL))
return -1;
- pci_stop_and_remove_bus_device(pdev);
+ pci_stop_and_remove_bus_device_locked(pdev);
return 0;
}

Index: linux-pm/drivers/scsi/mpt2sas/mpt2sas_base.c
===================================================================
--- linux-pm.orig/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ linux-pm/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -128,7 +128,7 @@ static int mpt2sas_remove_dead_ioc_func(
pdev = ioc->pdev;
if ((pdev == NULL))
return -1;
- pci_stop_and_remove_bus_device(pdev);
+ pci_stop_and_remove_bus_device_locked(pdev);
return 0;
}

Index: linux-pm/drivers/message/fusion/mptbase.c
===================================================================
--- linux-pm.orig/drivers/message/fusion/mptbase.c
+++ linux-pm/drivers/message/fusion/mptbase.c
@@ -346,7 +346,7 @@ static int mpt_remove_dead_ioc_func(void
if ((pdev == NULL))
return -1;

- pci_stop_and_remove_bus_device(pdev);
+ pci_stop_and_remove_bus_device_locked(pdev);
return 0;
}

2014-01-10 14:16:22

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 9/9] Xen / PCI: Use global PCI rescan-remove locking

From: Rafael J. Wysocki <[email protected]>

Multiple race conditions are possible between the Xen pcifront
device addition and removal and the generic PCI device addition
and removal that can be triggered via sysfs.

To avoid those race conditions make the Xen pcifront code use global
PCI rescan-remove locking.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/xen-pcifront.c | 8 ++++++++
1 file changed, 8 insertions(+)

Index: linux-pm/drivers/pci/xen-pcifront.c
===================================================================
--- linux-pm.orig/drivers/pci/xen-pcifront.c
+++ linux-pm/drivers/pci/xen-pcifront.c
@@ -471,12 +471,15 @@ static int pcifront_scan_root(struct pci
}
pcifront_init_sd(sd, domain, bus, pdev);

+ pci_lock_rescan_remove();
+
b = pci_scan_bus_parented(&pdev->xdev->dev, bus,
&pcifront_bus_ops, sd);
if (!b) {
dev_err(&pdev->xdev->dev,
"Error creating PCI Frontend Bus!\n");
err = -ENOMEM;
+ pci_unlock_rescan_remove();
goto err_out;
}

@@ -494,6 +497,7 @@ static int pcifront_scan_root(struct pci
/* Create SysFS and notify udev of the devices. Aka: "going live" */
pci_bus_add_devices(b);

+ pci_unlock_rescan_remove();
return err;

err_out:
@@ -556,6 +560,7 @@ static void pcifront_free_roots(struct p

dev_dbg(&pdev->xdev->dev, "cleaning up root buses\n");

+ pci_lock_rescan_remove();
list_for_each_entry_safe(bus_entry, t, &pdev->root_buses, list) {
list_del(&bus_entry->list);

@@ -568,6 +573,7 @@ static void pcifront_free_roots(struct p

kfree(bus_entry);
}
+ pci_unlock_rescan_remove();
}

static pci_ers_result_t pcifront_common_process(int cmd,
@@ -1043,8 +1049,10 @@ static int pcifront_detach_devices(struc
domain, bus, slot, func);
continue;
}
+ pci_lock_rescan_remove();
pci_stop_and_remove_bus_device(pci_dev);
pci_dev_put(pci_dev);
+ pci_unlock_rescan_remove();

dev_dbg(&pdev->xdev->dev,
"PCI device %04x:%02x:%02x.%d removed.\n",

2014-01-10 14:17:17

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 1/9] PCI: Global rescan-remove lock

From: Rafael J. Wysocki <[email protected]>

There are multiple PCI device addition and removal code paths that
may be run concurrently with the generic PCI bus rescan and device
removal that can be triggered via sysfs. If that happens, it may
lead to multiple different, potentially dangerous race conditions.

The most straightforward way to address those problems is to run
the code in question under the same lock that is used by the
generic rescan/remove code in pci-sysfs.c. To prepare for those
changes, move the definition of the global PCI remove/rescan lock
to probe.c and provide global wrappers, pci_lock_rescan_remove()
and pci_unlock_rescan_remove(), allowing drivers to manipulate
that lock. Also provide pci_stop_and_remove_bus_device_locked()
for the callers of pci_stop_and_remove_bus_device() who only need
to hold the rescan/remove lock around it.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/pci-sysfs.c | 19 +++++++------------
drivers/pci/probe.c | 18 ++++++++++++++++++
drivers/pci/remove.c | 8 ++++++++
include/linux/pci.h | 3 +++
4 files changed, 36 insertions(+), 12 deletions(-)

Index: linux-pm/drivers/pci/pci-sysfs.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-sysfs.c
+++ linux-pm/drivers/pci/pci-sysfs.c
@@ -297,7 +297,6 @@ msi_bus_store(struct device *dev, struct
}
static DEVICE_ATTR_RW(msi_bus);

-static DEFINE_MUTEX(pci_remove_rescan_mutex);
static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
size_t count)
{
@@ -308,10 +307,10 @@ static ssize_t bus_rescan_store(struct b
return -EINVAL;

if (val) {
- mutex_lock(&pci_remove_rescan_mutex);
+ pci_lock_rescan_remove();
while ((b = pci_find_next_bus(b)) != NULL)
pci_rescan_bus(b);
- mutex_unlock(&pci_remove_rescan_mutex);
+ pci_unlock_rescan_remove();
}
return count;
}
@@ -342,9 +341,9 @@ dev_rescan_store(struct device *dev, str
return -EINVAL;

if (val) {
- mutex_lock(&pci_remove_rescan_mutex);
+ pci_lock_rescan_remove();
pci_rescan_bus(pdev->bus);
- mutex_unlock(&pci_remove_rescan_mutex);
+ pci_unlock_rescan_remove();
}
return count;
}
@@ -354,11 +353,7 @@ static struct device_attribute dev_resca

static void remove_callback(struct device *dev)
{
- struct pci_dev *pdev = to_pci_dev(dev);
-
- mutex_lock(&pci_remove_rescan_mutex);
- pci_stop_and_remove_bus_device(pdev);
- mutex_unlock(&pci_remove_rescan_mutex);
+ pci_stop_and_remove_bus_device_locked(to_pci_dev(dev));
}

static ssize_t
@@ -395,12 +390,12 @@ dev_bus_rescan_store(struct device *dev,
return -EINVAL;

if (val) {
- mutex_lock(&pci_remove_rescan_mutex);
+ pci_lock_rescan_remove();
if (!pci_is_root_bus(bus) && list_empty(&bus->devices))
pci_rescan_bus_bridge_resize(bus->self);
else
pci_rescan_bus(bus);
- mutex_unlock(&pci_remove_rescan_mutex);
+ pci_unlock_rescan_remove();
}
return count;
}
Index: linux-pm/drivers/pci/probe.c
===================================================================
--- linux-pm.orig/drivers/pci/probe.c
+++ linux-pm/drivers/pci/probe.c
@@ -2014,6 +2014,24 @@ EXPORT_SYMBOL(pci_scan_slot);
EXPORT_SYMBOL(pci_scan_bridge);
EXPORT_SYMBOL_GPL(pci_scan_child_bus);

+/*
+ * pci_rescan_bus(), pci_rescan_bus_bridge_resize() and PCI device removal
+ * routines should always be executed under this mutex.
+ */
+static DEFINE_MUTEX(pci_rescan_remove_lock);
+
+void pci_lock_rescan_remove(void)
+{
+ mutex_lock(&pci_rescan_remove_lock);
+}
+EXPORT_SYMBOL_GPL(pci_lock_rescan_remove);
+
+void pci_unlock_rescan_remove(void)
+{
+ mutex_unlock(&pci_rescan_remove_lock);
+}
+EXPORT_SYMBOL_GPL(pci_unlock_rescan_remove);
+
static int __init pci_sort_bf_cmp(const struct device *d_a, const struct device *d_b)
{
const struct pci_dev *a = to_pci_dev(d_a);
Index: linux-pm/include/linux/pci.h
===================================================================
--- linux-pm.orig/include/linux/pci.h
+++ linux-pm/include/linux/pci.h
@@ -779,6 +779,7 @@ struct pci_dev *pci_dev_get(struct pci_d
void pci_dev_put(struct pci_dev *dev);
void pci_remove_bus(struct pci_bus *b);
void pci_stop_and_remove_bus_device(struct pci_dev *dev);
+void pci_stop_and_remove_bus_device_locked(struct pci_dev *dev);
void pci_stop_root_bus(struct pci_bus *bus);
void pci_remove_root_bus(struct pci_bus *bus);
void pci_setup_cardbus(struct pci_bus *bus);
@@ -1022,6 +1023,8 @@ void set_pcie_hotplug_bridge(struct pci_
int pci_bus_find_capability(struct pci_bus *bus, unsigned int devfn, int cap);
unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge);
unsigned int pci_rescan_bus(struct pci_bus *bus);
+void pci_lock_rescan_remove(void);
+void pci_unlock_rescan_remove(void);

/* Vital product data routines */
ssize_t pci_read_vpd(struct pci_dev *dev, loff_t pos, size_t count, void *buf);
Index: linux-pm/drivers/pci/remove.c
===================================================================
--- linux-pm.orig/drivers/pci/remove.c
+++ linux-pm/drivers/pci/remove.c
@@ -114,6 +114,14 @@ void pci_stop_and_remove_bus_device(stru
}
EXPORT_SYMBOL(pci_stop_and_remove_bus_device);

+void pci_stop_and_remove_bus_device_locked(struct pci_dev *dev)
+{
+ pci_lock_rescan_remove();
+ pci_stop_and_remove_bus_device(dev);
+ pci_unlock_rescan_remove();
+}
+EXPORT_SYMBOL_GPL(pci_stop_and_remove_bus_device_locked);
+
void pci_stop_root_bus(struct pci_bus *bus)
{
struct pci_dev *child, *tmp;

2014-01-10 14:17:42

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 3/9] ACPI / hotplug / PCI: Use global PCI rescan-remove locking

From: Rafael J. Wysocki <[email protected]>

Multiple race conditions are possible between the ACPI-based PCI
hotplug (ACPIPHP) and the generic PCI bus rescan and device removal
that can be triggered via sysfs.

To avoid those race conditions make the ACPIPHP code use global PCI
rescan-remove locking.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/hotplug/acpiphp.h | 5 +++-
drivers/pci/hotplug/acpiphp_core.c | 2 -
drivers/pci/hotplug/acpiphp_glue.c | 43 ++++++++++++++++++++++++++++++++-----
3 files changed, 43 insertions(+), 7 deletions(-)

Index: linux-pm/drivers/pci/hotplug/acpiphp.h
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/acpiphp.h
+++ linux-pm/drivers/pci/hotplug/acpiphp.h
@@ -77,6 +77,8 @@ struct acpiphp_bridge {

/* PCI-to-PCI bridge device */
struct pci_dev *pci_dev;
+
+ bool is_going_away;
};


@@ -150,6 +152,7 @@ struct acpiphp_attention_info
/* slot flags */

#define SLOT_ENABLED (0x00000001)
+#define SLOT_IS_GOING_AWAY (0x00000002)

/* function flags */

@@ -169,7 +172,7 @@ void acpiphp_unregister_hotplug_slot(str
typedef int (*acpiphp_callback)(struct acpiphp_slot *slot, void *data);

int acpiphp_enable_slot(struct acpiphp_slot *slot);
-int acpiphp_disable_and_eject_slot(struct acpiphp_slot *slot);
+int acpiphp_disable_slot(struct acpiphp_slot *slot);
u8 acpiphp_get_power_status(struct acpiphp_slot *slot);
u8 acpiphp_get_attention_status(struct acpiphp_slot *slot);
u8 acpiphp_get_latch_status(struct acpiphp_slot *slot);
Index: linux-pm/drivers/pci/hotplug/acpiphp_glue.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/acpiphp_glue.c
+++ linux-pm/drivers/pci/hotplug/acpiphp_glue.c
@@ -432,6 +432,7 @@ static void cleanup_bridge(struct acpiph
pr_err("failed to remove notify handler\n");
}
}
+ slot->flags |= SLOT_IS_GOING_AWAY;
if (slot->slot)
acpiphp_unregister_hotplug_slot(slot);
}
@@ -439,6 +440,8 @@ static void cleanup_bridge(struct acpiph
mutex_lock(&bridge_mutex);
list_del(&bridge->list);
mutex_unlock(&bridge_mutex);
+
+ bridge->is_going_away = true;
}

/**
@@ -757,6 +760,10 @@ static void acpiphp_check_bridge(struct
{
struct acpiphp_slot *slot;

+ /* Bail out if the bridge is going away. */
+ if (bridge->is_going_away)
+ return;
+
list_for_each_entry(slot, &bridge->slots, node) {
struct pci_bus *bus = slot->bus;
struct pci_dev *dev, *tmp;
@@ -827,6 +834,8 @@ void acpiphp_check_host_bridge(acpi_hand
}
}

+static int acpiphp_disable_and_eject_slot(struct acpiphp_slot *slot);
+
static void hotplug_event(acpi_handle handle, u32 type, void *data)
{
struct acpiphp_context *context = data;
@@ -856,6 +865,9 @@ static void hotplug_event(acpi_handle ha
} else {
struct acpiphp_slot *slot = func->slot;

+ if (slot->flags & SLOT_IS_GOING_AWAY)
+ break;
+
mutex_lock(&slot->crit_sect);
enable_slot(slot);
mutex_unlock(&slot->crit_sect);
@@ -871,6 +883,9 @@ static void hotplug_event(acpi_handle ha
struct acpiphp_slot *slot = func->slot;
int ret;

+ if (slot->flags & SLOT_IS_GOING_AWAY)
+ break;
+
/*
* Check if anything has changed in the slot and rescan
* from the parent if that's the case.
@@ -900,9 +915,11 @@ static void hotplug_event_work(void *dat
acpi_handle handle = context->handle;

acpi_scan_lock_acquire();
+ pci_lock_rescan_remove();

hotplug_event(handle, type, context);

+ pci_unlock_rescan_remove();
acpi_scan_lock_release();
acpi_evaluate_hotplug_ost(handle, type, ACPI_OST_SC_SUCCESS, NULL);
put_bridge(context->func.parent);
@@ -1070,12 +1087,19 @@ void acpiphp_remove_slots(struct pci_bus
*/
int acpiphp_enable_slot(struct acpiphp_slot *slot)
{
+ pci_lock_rescan_remove();
+
+ if (slot->flags & SLOT_IS_GOING_AWAY)
+ return -ENODEV;
+
mutex_lock(&slot->crit_sect);
/* configure all functions */
if (!(slot->flags & SLOT_ENABLED))
enable_slot(slot);

mutex_unlock(&slot->crit_sect);
+
+ pci_unlock_rescan_remove();
return 0;
}

@@ -1083,10 +1107,12 @@ int acpiphp_enable_slot(struct acpiphp_s
* acpiphp_disable_and_eject_slot - power off and eject slot
* @slot: ACPI PHP slot
*/
-int acpiphp_disable_and_eject_slot(struct acpiphp_slot *slot)
+static int acpiphp_disable_and_eject_slot(struct acpiphp_slot *slot)
{
struct acpiphp_func *func;
- int retval = 0;
+
+ if (slot->flags & SLOT_IS_GOING_AWAY)
+ return -ENODEV;

mutex_lock(&slot->crit_sect);

@@ -1104,9 +1130,18 @@ int acpiphp_disable_and_eject_slot(struc
}

mutex_unlock(&slot->crit_sect);
- return retval;
+ return 0;
}

+int acpiphp_disable_slot(struct acpiphp_slot *slot)
+{
+ int ret;
+
+ pci_lock_rescan_remove();
+ ret = acpiphp_disable_and_eject_slot(slot);
+ pci_unlock_rescan_remove();
+ return ret;
+}

/*
* slot enabled: 1
@@ -1117,7 +1152,6 @@ u8 acpiphp_get_power_status(struct acpip
return (slot->flags & SLOT_ENABLED);
}

-
/*
* latch open: 1
* latch closed: 0
@@ -1127,7 +1161,6 @@ u8 acpiphp_get_latch_status(struct acpip
return !(get_slot_status(slot) & ACPI_STA_DEVICE_UI);
}

-
/*
* adapter presence : 1
* absence : 0
Index: linux-pm/drivers/pci/hotplug/acpiphp_core.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/acpiphp_core.c
+++ linux-pm/drivers/pci/hotplug/acpiphp_core.c
@@ -156,7 +156,7 @@ static int disable_slot(struct hotplug_s
pr_debug("%s - physical_slot = %s\n", __func__, slot_name(slot));

/* disable the specified slot */
- return acpiphp_disable_and_eject_slot(slot->acpi_slot);
+ return acpiphp_disable_slot(slot->acpi_slot);
}


2014-01-10 14:17:41

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 2/9] ACPI / PCI: Use global PCI rescan-remove locking in PCI root hotplug

From: Rafael J. Wysocki <[email protected]>

Multiple race conditions are possible between the addition and
removal of PCI devices during ACPI PCI host bridge hotplug and the
generic PCI bus rescan and device removal that can be triggered via
sysfs.

To avoid those race conditions make the ACPI PCI host bridge addition
and removal code use global PCI rescan-remove locking.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/acpi/pci_root.c | 6 ++++++
1 file changed, 6 insertions(+)

Index: linux-pm/drivers/acpi/pci_root.c
===================================================================
--- linux-pm.orig/drivers/acpi/pci_root.c
+++ linux-pm/drivers/acpi/pci_root.c
@@ -604,7 +604,9 @@ static int acpi_pci_root_add(struct acpi
pci_assign_unassigned_root_bus_resources(root->bus);
}

+ pci_lock_rescan_remove();
pci_bus_add_devices(root->bus);
+ pci_unlock_rescan_remove();
return 1;

end:
@@ -616,6 +618,8 @@ static void acpi_pci_root_remove(struct
{
struct acpi_pci_root *root = acpi_driver_data(device);

+ pci_lock_rescan_remove();
+
pci_stop_root_bus(root->bus);

device_set_run_wake(root->bus->bridge, false);
@@ -623,6 +627,8 @@ static void acpi_pci_root_remove(struct

pci_remove_root_bus(root->bus);

+ pci_unlock_rescan_remove();
+
kfree(root);
}

2014-01-10 14:18:26

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 5/9] PCI / hotplug: Use global PCI rescan-remove locking

From: Rafael J. Wysocki <[email protected]>

Multiple race conditions are possible between PCI hotplug and the
generic PCI bus rescan and device removal that can be triggered via
sysfs.

To avoid those race conditions make PCI hotplug use global PCI
rescan-remove locking.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/hotplug/cpci_hotplug_pci.c | 14 ++++++++++++--
drivers/pci/hotplug/cpqphp_pci.c | 8 +++++++-
drivers/pci/hotplug/ibmphp_core.c | 13 +++++++++++--
drivers/pci/hotplug/pciehp_pci.c | 17 +++++++++++++----
drivers/pci/hotplug/rpadlpar_core.c | 19 ++++++++++++++-----
drivers/pci/hotplug/rpaphp_core.c | 4 ++++
drivers/pci/hotplug/s390_pci_hpc.c | 4 +++-
drivers/pci/hotplug/sgi_hotplug.c | 5 +++++
drivers/pci/hotplug/shpchp_pci.c | 18 ++++++++++++++----
9 files changed, 83 insertions(+), 19 deletions(-)

Index: linux-pm/drivers/pci/hotplug/rpadlpar_core.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/rpadlpar_core.c
+++ linux-pm/drivers/pci/hotplug/rpadlpar_core.c
@@ -354,10 +354,15 @@ int dlpar_remove_pci_slot(char *drc_name
{
struct pci_bus *bus;
struct slot *slot;
+ int ret = 0;
+
+ pci_lock_rescan_remove();

bus = pcibios_find_pci_bus(dn);
- if (!bus)
- return -EINVAL;
+ if (!bus) {
+ ret = -EINVAL;
+ goto out;
+ }

pr_debug("PCI: Removing PCI slot below EADS bridge %s\n",
bus->self ? pci_name(bus->self) : "<!PHB!>");
@@ -371,7 +376,8 @@ int dlpar_remove_pci_slot(char *drc_name
printk(KERN_ERR
"%s: unable to remove hotplug slot %s\n",
__func__, drc_name);
- return -EIO;
+ ret = -EIO;
+ goto out;
}
}

@@ -382,7 +388,8 @@ int dlpar_remove_pci_slot(char *drc_name
if (pcibios_unmap_io_space(bus)) {
printk(KERN_ERR "%s: failed to unmap bus range\n",
__func__);
- return -ERANGE;
+ ret = -ERANGE;
+ goto out;
}

/* Remove the EADS bridge device itself */
@@ -390,7 +397,9 @@ int dlpar_remove_pci_slot(char *drc_name
pr_debug("PCI: Now removing bridge device %s\n", pci_name(bus->self));
pci_stop_and_remove_bus_device(bus->self);

- return 0;
+ out:
+ pci_unlock_rescan_remove();
+ return ret;
}

/**
Index: linux-pm/drivers/pci/hotplug/cpci_hotplug_pci.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/cpci_hotplug_pci.c
+++ linux-pm/drivers/pci/hotplug/cpci_hotplug_pci.c
@@ -254,9 +254,12 @@ int __ref cpci_configure_slot(struct slo
{
struct pci_dev *dev;
struct pci_bus *parent;
+ int ret = 0;

dbg("%s - enter", __func__);

+ pci_lock_rescan_remove();
+
if (slot->dev == NULL) {
dbg("pci_dev null, finding %02x:%02x:%x",
slot->bus->number, PCI_SLOT(slot->devfn), PCI_FUNC(slot->devfn));
@@ -277,7 +280,8 @@ int __ref cpci_configure_slot(struct slo
slot->dev = pci_get_slot(slot->bus, slot->devfn);
if (slot->dev == NULL) {
err("Could not find PCI device for slot %02x", slot->number);
- return -ENODEV;
+ ret = -ENODEV;
+ goto out;
}
}
parent = slot->dev->bus;
@@ -294,8 +298,10 @@ int __ref cpci_configure_slot(struct slo

pci_bus_add_devices(parent);

+ out:
+ pci_unlock_rescan_remove();
dbg("%s - exit", __func__);
- return 0;
+ return ret;
}

int cpci_unconfigure_slot(struct slot* slot)
@@ -308,6 +314,8 @@ int cpci_unconfigure_slot(struct slot* s
return -ENODEV;
}

+ pci_lock_rescan_remove();
+
list_for_each_entry_safe(dev, temp, &slot->bus->devices, bus_list) {
if (PCI_SLOT(dev->devfn) != PCI_SLOT(slot->devfn))
continue;
@@ -318,6 +326,8 @@ int cpci_unconfigure_slot(struct slot* s
pci_dev_put(slot->dev);
slot->dev = NULL;

+ pci_unlock_rescan_remove();
+
dbg("%s - exit", __func__);
return 0;
}
Index: linux-pm/drivers/pci/hotplug/pciehp_pci.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/pciehp_pci.c
+++ linux-pm/drivers/pci/hotplug/pciehp_pci.c
@@ -39,22 +39,26 @@ int pciehp_configure_device(struct slot
struct pci_dev *dev;
struct pci_dev *bridge = p_slot->ctrl->pcie->port;
struct pci_bus *parent = bridge->subordinate;
- int num;
+ int num, ret = 0;
struct controller *ctrl = p_slot->ctrl;

+ pci_lock_rescan_remove();
+
dev = pci_get_slot(parent, PCI_DEVFN(0, 0));
if (dev) {
ctrl_err(ctrl, "Device %s already exists "
"at %04x:%02x:00, cannot hot-add\n", pci_name(dev),
pci_domain_nr(parent), parent->number);
pci_dev_put(dev);
- return -EINVAL;
+ ret = -EINVAL;
+ goto out;
}

num = pci_scan_slot(parent, PCI_DEVFN(0, 0));
if (num == 0) {
ctrl_err(ctrl, "No new device found\n");
- return -ENODEV;
+ ret = -ENODEV;
+ goto out;
}

list_for_each_entry(dev, &parent->devices, bus_list)
@@ -73,7 +77,9 @@ int pciehp_configure_device(struct slot

pci_bus_add_devices(parent);

- return 0;
+ out:
+ pci_unlock_rescan_remove();
+ return ret;
}

int pciehp_unconfigure_device(struct slot *p_slot)
@@ -92,6 +98,8 @@ int pciehp_unconfigure_device(struct slo
if (ret)
presence = 0;

+ pci_lock_rescan_remove();
+
/*
* Stopping an SR-IOV PF device removes all the associated VFs,
* which will update the bus->devices list and confuse the
@@ -126,5 +134,6 @@ int pciehp_unconfigure_device(struct slo
pci_dev_put(dev);
}

+ pci_unlock_rescan_remove();
return rc;
}
Index: linux-pm/drivers/pci/hotplug/cpqphp_pci.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/cpqphp_pci.c
+++ linux-pm/drivers/pci/hotplug/cpqphp_pci.c
@@ -86,6 +86,8 @@ int cpqhp_configure_device (struct contr
struct pci_bus *child;
int num;

+ pci_lock_rescan_remove();
+
if (func->pci_dev == NULL)
func->pci_dev = pci_get_bus_and_slot(func->bus,PCI_DEVFN(func->device, func->function));

@@ -100,7 +102,7 @@ int cpqhp_configure_device (struct contr
func->pci_dev = pci_get_bus_and_slot(func->bus, PCI_DEVFN(func->device, func->function));
if (func->pci_dev == NULL) {
dbg("ERROR: pci_dev still null\n");
- return 0;
+ goto out;
}
}

@@ -113,6 +115,8 @@ int cpqhp_configure_device (struct contr

pci_dev_put(func->pci_dev);

+ out:
+ pci_unlock_rescan_remove();
return 0;
}

@@ -123,6 +127,7 @@ int cpqhp_unconfigure_device(struct pci_

dbg("%s: bus/dev/func = %x/%x/%x\n", __func__, func->bus, func->device, func->function);

+ pci_lock_rescan_remove();
for (j=0; j<8 ; j++) {
struct pci_dev* temp = pci_get_bus_and_slot(func->bus, PCI_DEVFN(func->device, j));
if (temp) {
@@ -130,6 +135,7 @@ int cpqhp_unconfigure_device(struct pci_
pci_stop_and_remove_bus_device(temp);
}
}
+ pci_unlock_rescan_remove();
return 0;
}

Index: linux-pm/drivers/pci/hotplug/ibmphp_core.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/ibmphp_core.c
+++ linux-pm/drivers/pci/hotplug/ibmphp_core.c
@@ -718,6 +718,8 @@ static void ibm_unconfigure_device(struc
func->device, func->function);
debug("func->device << 3 | 0x0 = %x\n", func->device << 3 | 0x0);

+ pci_lock_rescan_remove();
+
for (j = 0; j < 0x08; j++) {
temp = pci_get_bus_and_slot(func->busno, (func->device << 3) | j);
if (temp) {
@@ -725,7 +727,10 @@ static void ibm_unconfigure_device(struc
pci_dev_put(temp);
}
}
+
pci_dev_put(func->dev);
+
+ pci_unlock_rescan_remove();
}

/*
@@ -780,6 +785,8 @@ static int ibm_configure_device(struct p
int flag = 0; /* this is to make sure we don't double scan the bus,
for bridged devices primarily */

+ pci_lock_rescan_remove();
+
if (!(bus_structure_fixup(func->busno)))
flag = 1;
if (func->dev == NULL)
@@ -789,7 +796,7 @@ static int ibm_configure_device(struct p
if (func->dev == NULL) {
struct pci_bus *bus = pci_find_bus(0, func->busno);
if (!bus)
- return 0;
+ goto out;

num = pci_scan_slot(bus,
PCI_DEVFN(func->device, func->function));
@@ -800,7 +807,7 @@ static int ibm_configure_device(struct p
PCI_DEVFN(func->device, func->function));
if (func->dev == NULL) {
err("ERROR... : pci_dev still NULL\n");
- return 0;
+ goto out;
}
}
if (!(flag) && (func->dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)) {
@@ -810,6 +817,8 @@ static int ibm_configure_device(struct p
pci_bus_add_devices(child);
}

+ out:
+ pci_unlock_rescan_remove();
return 0;
}

Index: linux-pm/drivers/pci/hotplug/s390_pci_hpc.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/s390_pci_hpc.c
+++ linux-pm/drivers/pci/hotplug/s390_pci_hpc.c
@@ -80,7 +80,9 @@ static int enable_slot(struct hotplug_sl
goto out_deconfigure;

pci_scan_slot(slot->zdev->bus, ZPCI_DEVFN);
+ pci_lock_rescan_remove();
pci_bus_add_devices(slot->zdev->bus);
+ pci_unlock_rescan_remove();

return rc;

@@ -98,7 +100,7 @@ static int disable_slot(struct hotplug_s
return -EIO;

if (slot->zdev->pdev)
- pci_stop_and_remove_bus_device(slot->zdev->pdev);
+ pci_stop_and_remove_bus_device_locked(slot->zdev->pdev);

rc = zpci_disable_device(slot->zdev);
if (rc)
Index: linux-pm/drivers/pci/hotplug/shpchp_pci.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/shpchp_pci.c
+++ linux-pm/drivers/pci/hotplug/shpchp_pci.c
@@ -40,7 +40,9 @@ int __ref shpchp_configure_device(struct
struct controller *ctrl = p_slot->ctrl;
struct pci_dev *bridge = ctrl->pci_dev;
struct pci_bus *parent = bridge->subordinate;
- int num;
+ int num, ret = 0;
+
+ pci_lock_rescan_remove();

dev = pci_get_slot(parent, PCI_DEVFN(p_slot->device, 0));
if (dev) {
@@ -48,13 +50,15 @@ int __ref shpchp_configure_device(struct
"at %04x:%02x:%02x, cannot hot-add\n", pci_name(dev),
pci_domain_nr(parent), p_slot->bus, p_slot->device);
pci_dev_put(dev);
- return -EINVAL;
+ ret = -EINVAL;
+ goto out;
}

num = pci_scan_slot(parent, PCI_DEVFN(p_slot->device, 0));
if (num == 0) {
ctrl_err(ctrl, "No new device found\n");
- return -ENODEV;
+ ret = -ENODEV;
+ goto out;
}

list_for_each_entry(dev, &parent->devices, bus_list) {
@@ -75,7 +79,9 @@ int __ref shpchp_configure_device(struct

pci_bus_add_devices(parent);

- return 0;
+ out:
+ pci_unlock_rescan_remove();
+ return ret;
}

int shpchp_unconfigure_device(struct slot *p_slot)
@@ -89,6 +95,8 @@ int shpchp_unconfigure_device(struct slo
ctrl_dbg(ctrl, "%s: domain:bus:dev = %04x:%02x:%02x\n",
__func__, pci_domain_nr(parent), p_slot->bus, p_slot->device);

+ pci_lock_rescan_remove();
+
list_for_each_entry_safe(dev, temp, &parent->devices, bus_list) {
if (PCI_SLOT(dev->devfn) != p_slot->device)
continue;
@@ -108,6 +116,8 @@ int shpchp_unconfigure_device(struct slo
pci_stop_and_remove_bus_device(dev);
pci_dev_put(dev);
}
+
+ pci_unlock_rescan_remove();
return rc;
}

Index: linux-pm/drivers/pci/hotplug/sgi_hotplug.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/sgi_hotplug.c
+++ linux-pm/drivers/pci/hotplug/sgi_hotplug.c
@@ -459,12 +459,15 @@ static int enable_slot(struct hotplug_sl
acpi_scan_lock_release();
}

+ pci_lock_rescan_remove();
+
/* Call the driver for the new device */
pci_bus_add_devices(slot->pci_bus);
/* Call the drivers for the new devices subordinate to PPB */
if (new_ppb)
pci_bus_add_devices(new_bus);

+ pci_unlock_rescan_remove();
mutex_unlock(&sn_hotplug_mutex);

if (rc == 0)
@@ -540,6 +543,7 @@ static int disable_slot(struct hotplug_s
acpi_scan_lock_release();
}

+ pci_lock_rescan_remove();
/* Free the SN resources assigned to the Linux device.*/
list_for_each_entry_safe(dev, temp, &slot->pci_bus->devices, bus_list) {
if (PCI_SLOT(dev->devfn) != slot->device_num + 1)
@@ -550,6 +554,7 @@ static int disable_slot(struct hotplug_s
pci_stop_and_remove_bus_device(dev);
pci_dev_put(dev);
}
+ pci_unlock_rescan_remove();

/* Remove the SSDT for the slot from the ACPI namespace */
if (SN_ACPI_BASE_SUPPORT() && ssdt_id) {
Index: linux-pm/drivers/pci/hotplug/rpaphp_core.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/rpaphp_core.c
+++ linux-pm/drivers/pci/hotplug/rpaphp_core.c
@@ -398,7 +398,9 @@ static int enable_slot(struct hotplug_sl
return retval;

if (state == PRESENT) {
+ pci_lock_rescan_remove();
pcibios_add_pci_devices(slot->bus);
+ pci_unlock_rescan_remove();
slot->state = CONFIGURED;
} else if (state == EMPTY) {
slot->state = EMPTY;
@@ -418,7 +420,9 @@ static int disable_slot(struct hotplug_s
if (slot->state == NOT_CONFIGURED)
return -EINVAL;

+ pci_lock_rescan_remove();
pcibios_remove_pci_devices(slot->bus);
+ pci_unlock_rescan_remove();
vm_unmap_aliases();

slot->state = NOT_CONFIGURED;

2014-01-10 14:18:44

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 4/9] PCMCIA / cardbus: Use global PCI rescan-remove locking

From: Rafael J. Wysocki <[email protected]>

Multiple race conditions are possible between the cardbus PCI
device addition and removal and the generic PCI bus rescan and device
removal that can be triggered via sysfs.

To avoid those race conditions make the cardbus code use global
PCI rescan-remove locking.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pcmcia/cardbus.c | 7 +++++++
1 file changed, 7 insertions(+)

Index: linux-pm/drivers/pcmcia/cardbus.c
===================================================================
--- linux-pm.orig/drivers/pcmcia/cardbus.c
+++ linux-pm/drivers/pcmcia/cardbus.c
@@ -70,6 +70,8 @@ int __ref cb_alloc(struct pcmcia_socket
struct pci_dev *dev;
unsigned int max, pass;

+ pci_lock_rescan_remove();
+
s->functions = pci_scan_slot(bus, PCI_DEVFN(0, 0));
pci_fixup_cardbus(bus);

@@ -93,6 +95,7 @@ int __ref cb_alloc(struct pcmcia_socket

pci_bus_add_devices(bus);

+ pci_unlock_rescan_remove();
return 0;
}

@@ -115,6 +118,10 @@ void cb_free(struct pcmcia_socket *s)
if (!bus)
return;

+ pci_lock_rescan_remove();
+
list_for_each_entry_safe(dev, tmp, &bus->devices, bus_list)
pci_stop_and_remove_bus_device(dev);
+
+ pci_unlock_rescan_remove();
}

2014-01-10 14:19:05

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 6/9] platform / x86: Use global PCI rescan-remove locking

From: Rafael J. Wysocki <[email protected]>

Multiple race conditions are possible between the rfkill hotplug in
the asus-wmi and eeepc-laptop drivers and the generic PCI bus rescan
and device removal that can be triggered via sysfs.

To avoid those race conditions make asus-wmi and eeepc-laptop use
global PCI rescan-remove locking around the rfkill hotplug.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/platform/x86/asus-wmi.c | 2 ++
drivers/platform/x86/eeepc-laptop.c | 2 ++
2 files changed, 4 insertions(+)

Index: linux-pm/drivers/platform/x86/asus-wmi.c
===================================================================
--- linux-pm.orig/drivers/platform/x86/asus-wmi.c
+++ linux-pm/drivers/platform/x86/asus-wmi.c
@@ -605,6 +605,7 @@ static void asus_rfkill_hotplug(struct a
mutex_unlock(&asus->wmi_lock);

mutex_lock(&asus->hotplug_lock);
+ pci_lock_rescan_remove();

if (asus->wlan.rfkill)
rfkill_set_sw_state(asus->wlan.rfkill, blocked);
@@ -655,6 +656,7 @@ static void asus_rfkill_hotplug(struct a
}

out_unlock:
+ pci_unlock_rescan_remove();
mutex_unlock(&asus->hotplug_lock);
}

Index: linux-pm/drivers/platform/x86/eeepc-laptop.c
===================================================================
--- linux-pm.orig/drivers/platform/x86/eeepc-laptop.c
+++ linux-pm/drivers/platform/x86/eeepc-laptop.c
@@ -591,6 +591,7 @@ static void eeepc_rfkill_hotplug(struct
rfkill_set_sw_state(eeepc->wlan_rfkill, blocked);

mutex_lock(&eeepc->hotplug_lock);
+ pci_lock_rescan_remove();

if (eeepc->hotplug_slot) {
port = acpi_get_pci_dev(handle);
@@ -648,6 +649,7 @@ out_put_dev:
}

out_unlock:
+ pci_unlock_rescan_remove();
mutex_unlock(&eeepc->hotplug_lock);
}

2014-01-15 13:22:40

by Rafael J. Wysocki

[permalink] [raw]
Subject: [Update][PATCH 8/9] powerpc / eeh_driver: Use global PCI rescan-remove locking

From: Rafael J. Wysocki <[email protected]>
Subject: powerpc / eeh_driver: Use global PCI rescan-remove locking

Race conditions are theoretically possible between the PCI device
addition and removal in the PPC64 PCI error recovery driver and
the generic PCI bus rescan and device removal that can be triggered
via sysfs.

To avoid those race conditions make PPC64 PCI error recovery driver
use global PCI rescan-remove locking around PCI device addition and
removal.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---

The previous version had wrong function names in the last hunk, sorry about
that.

Rafael

---
arch/powerpc/kernel/eeh_driver.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)

Index: linux-pm/arch/powerpc/kernel/eeh_driver.c
===================================================================
--- linux-pm.orig/arch/powerpc/kernel/eeh_driver.c
+++ linux-pm/arch/powerpc/kernel/eeh_driver.c
@@ -369,7 +369,9 @@ static void *eeh_rmv_device(void *data,
edev->mode |= EEH_DEV_DISCONNECTED;
(*removed)++;

+ pci_lock_rescan_remove();
pci_stop_and_remove_bus_device(dev);
+ pci_unlock_rescan_remove();

return NULL;
}
@@ -416,10 +418,13 @@ static int eeh_reset_device(struct eeh_p
* into pcibios_add_pci_devices().
*/
eeh_pe_state_mark(pe, EEH_PE_KEEP);
- if (bus)
+ if (bus) {
+ pci_lock_rescan_remove();
pcibios_remove_pci_devices(bus);
- else if (frozen_bus)
+ pci_unlock_rescan_remove();
+ } else if (frozen_bus) {
eeh_pe_dev_traverse(pe, eeh_rmv_device, &removed);
+ }

/* Reset the pci controller. (Asserts RST#; resets config space).
* Reconfigure bridges and devices. Don't try to bring the system
@@ -429,6 +434,8 @@ static int eeh_reset_device(struct eeh_p
if (rc)
return rc;

+ pci_lock_rescan_remove();
+
/* Restore PE */
eeh_ops->configure_bridge(pe);
eeh_pe_restore_bars(pe);
@@ -462,6 +469,7 @@ static int eeh_reset_device(struct eeh_p
pe->tstamp = tstamp;
pe->freeze_count = cnt;

+ pci_unlock_rescan_remove();
return 0;
}

@@ -618,8 +626,11 @@ perm_error:
eeh_pe_dev_traverse(pe, eeh_report_failure, NULL);

/* Shut down the device drivers for good. */
- if (frozen_bus)
+ if (frozen_bus) {
+ pci_lock_rescan_remove();
pcibios_remove_pci_devices(frozen_bus);
+ pci_unlock_rescan_remove();
+ }
}

static void eeh_handle_special_event(void)
@@ -692,6 +703,7 @@ static void eeh_handle_special_event(voi
if (rc == 2 || rc == 1)
eeh_handle_normal_event(pe);
else {
+ pci_lock_rescan_remove();
list_for_each_entry_safe(hose, tmp,
&hose_list, list_node) {
phb_pe = eeh_phb_pe_get(hose);
@@ -703,6 +715,7 @@ static void eeh_handle_special_event(voi
eeh_pe_dev_traverse(pe, eeh_report_failure, NULL);
pcibios_remove_pci_devices(bus);
}
+ pci_unlock_rescan_remove();
}
}

2014-01-15 17:38:31

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [Update][PATCH 8/9] powerpc / eeh_driver: Use global PCI rescan-remove locking

On Wed, Jan 15, 2014 at 02:36:36PM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
> Subject: powerpc / eeh_driver: Use global PCI rescan-remove locking
>
> Race conditions are theoretically possible between the PCI device
> addition and removal in the PPC64 PCI error recovery driver and
> the generic PCI bus rescan and device removal that can be triggered
> via sysfs.
>
> To avoid those race conditions make PPC64 PCI error recovery driver
> use global PCI rescan-remove locking around PCI device addition and
> removal.
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
>
> The previous version had wrong function names in the last hunk, sorry about
> that.

I replaced the previous version and re-pushed the pci/locking branch,
thanks!

>
> Rafael
>
> ---
> arch/powerpc/kernel/eeh_driver.c | 19 ++++++++++++++++---
> 1 file changed, 16 insertions(+), 3 deletions(-)
>
> Index: linux-pm/arch/powerpc/kernel/eeh_driver.c
> ===================================================================
> --- linux-pm.orig/arch/powerpc/kernel/eeh_driver.c
> +++ linux-pm/arch/powerpc/kernel/eeh_driver.c
> @@ -369,7 +369,9 @@ static void *eeh_rmv_device(void *data,
> edev->mode |= EEH_DEV_DISCONNECTED;
> (*removed)++;
>
> + pci_lock_rescan_remove();
> pci_stop_and_remove_bus_device(dev);
> + pci_unlock_rescan_remove();
>
> return NULL;
> }
> @@ -416,10 +418,13 @@ static int eeh_reset_device(struct eeh_p
> * into pcibios_add_pci_devices().
> */
> eeh_pe_state_mark(pe, EEH_PE_KEEP);
> - if (bus)
> + if (bus) {
> + pci_lock_rescan_remove();
> pcibios_remove_pci_devices(bus);
> - else if (frozen_bus)
> + pci_unlock_rescan_remove();
> + } else if (frozen_bus) {
> eeh_pe_dev_traverse(pe, eeh_rmv_device, &removed);
> + }
>
> /* Reset the pci controller. (Asserts RST#; resets config space).
> * Reconfigure bridges and devices. Don't try to bring the system
> @@ -429,6 +434,8 @@ static int eeh_reset_device(struct eeh_p
> if (rc)
> return rc;
>
> + pci_lock_rescan_remove();
> +
> /* Restore PE */
> eeh_ops->configure_bridge(pe);
> eeh_pe_restore_bars(pe);
> @@ -462,6 +469,7 @@ static int eeh_reset_device(struct eeh_p
> pe->tstamp = tstamp;
> pe->freeze_count = cnt;
>
> + pci_unlock_rescan_remove();
> return 0;
> }
>
> @@ -618,8 +626,11 @@ perm_error:
> eeh_pe_dev_traverse(pe, eeh_report_failure, NULL);
>
> /* Shut down the device drivers for good. */
> - if (frozen_bus)
> + if (frozen_bus) {
> + pci_lock_rescan_remove();
> pcibios_remove_pci_devices(frozen_bus);
> + pci_unlock_rescan_remove();
> + }
> }
>
> static void eeh_handle_special_event(void)
> @@ -692,6 +703,7 @@ static void eeh_handle_special_event(voi
> if (rc == 2 || rc == 1)
> eeh_handle_normal_event(pe);
> else {
> + pci_lock_rescan_remove();
> list_for_each_entry_safe(hose, tmp,
> &hose_list, list_node) {
> phb_pe = eeh_phb_pe_get(hose);
> @@ -703,6 +715,7 @@ static void eeh_handle_special_event(voi
> eeh_pe_dev_traverse(pe, eeh_report_failure, NULL);
> pcibios_remove_pci_devices(bus);
> }
> + pci_unlock_rescan_remove();
> }
> }
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2014-01-15 18:02:23

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 0/9] PCI: Eliminate race conditions between hotplug and sysfs rescan/remove (Was: Re: [PATCH v2 04/10] PCI: Destroy pci dev only once)

On Fri, Jan 10, 2014 at 03:20:44PM +0100, Rafael J. Wysocki wrote:
> [Cc: adding linux-scsi for the MPT changes, Ben for powerpc, Matthew for
> platform/x86 and Konrad for Xen]
>
> On Friday, December 06, 2013 02:21:50 AM Rafael J. Wysocki wrote:
>
> [...]
>
> >
> > OK
> >
> > To be a bit more constructive, as the next step I'd try to use
> > pci_remove_rescan_mutex to serialize all PCI hotplug operations (as I said
> > above) without making the other changes made by my patch. Does that sound
> > reasonable?
>
> Well, no answer here, so as a followup, a series implementing that idea
> follows.
>
> I *hope* I found all of the places that need to be synchronized vs the bus
> rescan and device removal that can be triggered via sysfs, but I might overlook
> something. Also in some cases I wasn't quite sure how much stuff to put under
> the lock, because said stuff is not exactly straightforward.

I applied this series to my pci/locking branch for v3.14. It should appear
in -next tomorrow.

Note that this touches some areas that are not strictly PCI, so speak
up if I'm treading on your toes:

arch/powerpc/kernel/eeh_driver.c | 19 ++++++++++++--
drivers/acpi/pci_root.c | 6 ++++
drivers/message/fusion/mptbase.c | 2 -
drivers/pci/hotplug/acpiphp.h | 5 +++
drivers/pci/hotplug/acpiphp_core.c | 2 -
drivers/pci/hotplug/acpiphp_glue.c | 43 +++++++++++++++++++++++++++++----
drivers/pci/hotplug/cpci_hotplug_pci.c | 14 +++++++++-
drivers/pci/hotplug/cpqphp_pci.c | 8 +++++-
drivers/pci/hotplug/ibmphp_core.c | 13 ++++++++-
drivers/pci/hotplug/pciehp_pci.c | 17 +++++++++----
drivers/pci/hotplug/rpadlpar_core.c | 19 ++++++++++----
drivers/pci/hotplug/rpaphp_core.c | 4 +++
drivers/pci/hotplug/s390_pci_hpc.c | 4 ++-
drivers/pci/hotplug/sgi_hotplug.c | 5 +++
drivers/pci/hotplug/shpchp_pci.c | 18 ++++++++++---
drivers/pci/pci-sysfs.c | 19 +++++---------
drivers/pci/probe.c | 18 +++++++++++++
drivers/pci/remove.c | 11 ++++++++
drivers/pci/xen-pcifront.c | 8 ++++++
drivers/pcmcia/cardbus.c | 7 +++++
drivers/platform/x86/asus-wmi.c | 2 +
drivers/platform/x86/eeepc-laptop.c | 2 +
drivers/scsi/mpt2sas/mpt2sas_base.c | 2 -
drivers/scsi/mpt3sas/mpt3sas_base.c | 2 -
include/linux/pci.h | 3 ++