2009-06-30 16:00:33

by Alan Jenkins

Subject: pciehp resume handler - racy?

Hi,

I've been hacking on the PCI hotplug driver otherwise known as
eeepc-laptop. At one point I reliably triggered a race on resume. In
the hot-unplug case, it seemed that the resume handler would try to
remove the PCI device at the same time as an ACPI notification (run in a
workqueue) tried to do the same thing. The result was an oops. My
conclusion is that the PCI hotplug core does not protect against
multiple simultaneous removals of the same device.
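
The double removal described above can be modelled in userspace. The following is only a sketch under stated assumptions: `model_slot`, `remove_device()` and `race_two_removers()` are hypothetical names, pthreads stand in for the resume handler and the workqueue item, and nothing here is taken from eeepc-laptop or the PCI hotplug core. It shows one common guard: test and clear a "present" flag under a lock, so that whichever path loses the race finds nothing left to remove.

```c
/* Userspace model only: all names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct model_slot {
	pthread_mutex_t lock;   /* stands in for a per-slot mutex */
	bool device_present;    /* true while the device is attached */
	int removals;           /* how many times teardown actually ran */
};

/* Remove the device at most once, however many paths race to do it. */
static void *remove_device(void *arg)
{
	struct model_slot *slot = arg;

	pthread_mutex_lock(&slot->lock);
	if (slot->device_present) {
		slot->device_present = false;
		slot->removals++;   /* the real teardown would happen here */
	}
	pthread_mutex_unlock(&slot->lock);
	return NULL;
}

/* Run a "resume handler" and a "notification worker" concurrently. */
static int race_two_removers(void)
{
	struct model_slot slot = { .device_present = true, .removals = 0 };
	pthread_t resume_path, notify_path;

	pthread_mutex_init(&slot.lock, NULL);
	if (pthread_create(&resume_path, NULL, remove_device, &slot) != 0)
		return -1;
	if (pthread_create(&notify_path, NULL, remove_device, &slot) != 0) {
		pthread_join(resume_path, NULL);
		return -1;
	}
	pthread_join(resume_path, NULL);
	pthread_join(notify_path, NULL);
	pthread_mutex_destroy(&slot.lock);

	assert(!slot.device_present);
	return slot.removals;   /* number of teardowns that really ran */
}
```

Calling `race_two_removers()` returns 1 whichever thread wins. Without the lock and the flag check, both paths could run the teardown, which is the kind of double removal reported above.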

pciehp appears to have an analogous problem. Assuming pciehp_force is
set, the resume handler can hot-unplug the device. The interrupt
handler could try to hot-unplug the device at the same time. Should the
resume handler take the slot mutex to avoid this problem?

diff --faked-up a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -382,15 +382,18 @@ static int pciehp_resume (struct pcie_device *dev)
 		/* reinitialize the chipset's event detection logic */
 		pcie_enable_notification(ctrl);
 
 		t_slot = pciehp_find_slot(ctrl, ctrl->slot_device_offset);
+		mutex_lock(&t_slot->lock);
 
 		/* Check if slot is occupied */
 		t_slot->hpc_ops->get_adapter_status(t_slot, &status);
 		if (status)
 			pciehp_enable_slot(t_slot);
 		else
 			pciehp_disable_slot(t_slot);
+
+		mutex_unlock(&t_slot->lock);
 	}
 	return 0;
 }
 #endif /* PM */


Regards
Alan


2009-07-01 04:34:44

by Kenji Kaneshige

Subject: Re: pciehp resume handler - racy?

Alan Jenkins wrote:
> Hi,
>
> I've been hacking on the PCI hotplug driver otherwise known as
> eeepc-laptop. At one point I reliably triggered a race on resume. In
> the hot-unplug case, it seemed that the resume handler would try to
> remove the PCI device at the same time as an ACPI notification (run in a
> workqueue) tried to do the same thing. The result was an oops. My
> conclusion is that the PCI hotplug core does not protect against
> multiple simultaneous removals of the same device.

Though I don't know the hotplug driver for eeepc-laptop at all, I guess
it doesn't use pci_hp_register(), which registers a hotplug slot with
the PCI hotplug core. pci_hp_register() prevents a hotplug slot from
being managed by multiple hotplug controller drivers at the same time.
Using pci_hp_register() would be one solution for the eeepc-laptop
hotplug driver.
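
The exclusivity described here can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the implementation of pci_hp_register(): `claim_slot()`, the `claims` table, the slot-name strings and the choice of `-EBUSY` as the error code are all hypothetical. It shows only the idea that a second driver asking for an already-claimed slot is refused.

```c
/* Userspace model only: this is not the kernel's pci_hp_register(). */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_SLOTS 8

struct slot_claim {
	const char *slot_name;  /* hypothetical slot identifier */
	const char *owner;      /* driver that claimed the slot */
};

static struct slot_claim claims[MAX_SLOTS];
static size_t nr_claims;

/* Claim a slot for one driver; refuse if any driver already owns it. */
static int claim_slot(const char *slot_name, const char *owner)
{
	assert(slot_name != NULL && owner != NULL);

	for (size_t i = 0; i < nr_claims; i++)
		if (strcmp(claims[i].slot_name, slot_name) == 0)
			return -EBUSY;  /* assumed error code for "taken" */

	if (nr_claims == MAX_SLOTS)
		return -ENOMEM;

	claims[nr_claims].slot_name = slot_name;
	claims[nr_claims].owner = owner;
	nr_claims++;
	return 0;
}
```

In this model, `claim_slot("0000:01:00.0", "pciehp")` succeeds and a later `claim_slot("0000:01:00.0", "eeepc-laptop")` returns `-EBUSY`. The table is not thread-safe; a real registry would also need its own lock.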

>
> pciehp appears to have an analogous problem. Assuming pciehp_force is
> set, the resume handler can hot-unplug the device. The interrupt
> handler could try to hot-unplug the device at the same time. Should the
> resume handler take the slot mutex to avoid this problem?
>

Hot-plug operations for the same hotplug slot are serialized by the
crit_sect mutex of struct controller, which is taken inside
pciehp_enable_slot() and pciehp_disable_slot(). So I don't think
multiple hot-plug operations for the same slot can execute at the
same time.
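
A userspace sketch of this arrangement, where the lock is taken inside the operation rather than by its callers. The names `model_ctrl`, `model_slot_op()`, `resume_path()`, `event_path()` and `count_overlaps()` are hypothetical and pthreads stand in for the resume handler and the event handler; only the field name `crit_sect` is borrowed from the message above. The model counts how often two operations are observed inside the critical section together.

```c
/* Userspace model only: names echo the thread but are not kernel code. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define ROUNDS 10000

struct model_ctrl {
	pthread_mutex_t crit_sect;  /* serializes all slot operations */
	int in_flight;              /* operations inside the critical section */
	int overlaps;               /* times two operations ran together */
	bool enabled;               /* current slot state */
};

/* The lock is taken here, so callers do not need to hold one. */
static void model_slot_op(struct model_ctrl *ctrl, bool enable)
{
	pthread_mutex_lock(&ctrl->crit_sect);
	if (++ctrl->in_flight > 1)
		ctrl->overlaps++;
	ctrl->enabled = enable;
	ctrl->in_flight--;
	pthread_mutex_unlock(&ctrl->crit_sect);
}

/* Stand-in for the resume handler: alternately enable and disable. */
static void *resume_path(void *arg)
{
	for (int i = 0; i < ROUNDS; i++)
		model_slot_op(arg, i % 2 == 0);
	return NULL;
}

/* Stand-in for the event handler: always disable. */
static void *event_path(void *arg)
{
	for (int i = 0; i < ROUNDS; i++)
		model_slot_op(arg, false);
	return NULL;
}

static int count_overlaps(void)
{
	struct model_ctrl ctrl = { .in_flight = 0, .overlaps = 0, .enabled = false };
	pthread_t resume, event;

	pthread_mutex_init(&ctrl.crit_sect, NULL);
	if (pthread_create(&resume, NULL, resume_path, &ctrl) != 0)
		return -1;
	if (pthread_create(&event, NULL, event_path, &ctrl) != 0) {
		pthread_join(resume, NULL);
		return -1;
	}
	pthread_join(resume, NULL);
	pthread_join(event, NULL);
	pthread_mutex_destroy(&ctrl.crit_sect);

	assert(ctrl.in_flight == 0);
	return ctrl.overlaps;
}
```

`count_overlaps()` returns 0 because every operation runs entirely under `crit_sect`. Note that the model only covers the operations themselves; a status check made by a caller before invoking the operation is outside the lock.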

Thanks,
Kenji Kaneshige


> diff --faked-up a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
> --- a/drivers/pci/hotplug/pciehp_core.c
> +++ b/drivers/pci/hotplug/pciehp_core.c
> @@ -382,15 +382,18 @@ static int pciehp_resume (struct pcie_device *dev)
>  		/* reinitialize the chipset's event detection logic */
>  		pcie_enable_notification(ctrl);
>
>  		t_slot = pciehp_find_slot(ctrl, ctrl->slot_device_offset);
> +		mutex_lock(&t_slot->lock);
>
>  		/* Check if slot is occupied */
>  		t_slot->hpc_ops->get_adapter_status(t_slot, &status);
>  		if (status)
>  			pciehp_enable_slot(t_slot);
>  		else
>  			pciehp_disable_slot(t_slot);
> +
> +		mutex_unlock(&t_slot->lock);
>  	}
>  	return 0;
>  }
>  #endif /* PM */
>
>
> Regards
> Alan
>