2019-02-07 20:37:30

by Dexuan Cui

Subject: [PATCH] PCI: hv: Add hv_pci_remove_slots() when we unload the driver


When we unload pci-hyperv, the host doesn't send us a PCI_EJECT message.
In this case we also need to make sure the sysfs pci slot directory
is removed, otherwise "cat /sys/bus/pci/slots/2/address" will trigger
"BUG: unable to handle kernel paging request". And, if we unload/reload
the driver several times, we'll have multiple pci slot directories in
/sys/bus/pci/slots/ like this:

root@localhost:~# ls -rtl /sys/bus/pci/slots/
total 0
drwxr-xr-x 2 root root 0 Feb 7 10:49 2
drwxr-xr-x 2 root root 0 Feb 7 10:49 2-1
drwxr-xr-x 2 root root 0 Feb 7 10:51 2-2

The patch adds the missing code, and in hv_eject_device_work() it also
moves pci_destroy_slot() to an earlier place where we hold the pci lock.

Fixes: a15f2c08c708 ("PCI: hv: support reporting serial number as slot information")
Signed-off-by: Dexuan Cui <[email protected]>
Cc: [email protected]
Cc: Stephen Hemminger <[email protected]>
---
drivers/pci/controller/pci-hyperv.c | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 9ba4d12c179c..6b4773727525 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1491,6 +1491,21 @@ static void hv_pci_assign_slots(struct hv_pcibus_device *hbus)
 	}
 }

+/*
+ * Remove entries in sysfs pci slot directory.
+ */
+static void hv_pci_remove_slots(struct hv_pcibus_device *hbus)
+{
+	struct hv_pci_dev *hpdev;
+
+	list_for_each_entry(hpdev, &hbus->children, list_entry) {
+		if (!hpdev->pci_slot)
+			continue;
+		pci_destroy_slot(hpdev->pci_slot);
+		hpdev->pci_slot = NULL;
+	}
+}
+
 /**
  * create_root_hv_pci_bus() - Expose a new root PCI bus
  * @hbus: Root PCI bus, as understood by this driver
@@ -1887,6 +1902,10 @@ static void hv_eject_device_work(struct work_struct *work)
 		pci_lock_rescan_remove();
 		pci_stop_and_remove_bus_device(pdev);
 		pci_dev_put(pdev);
+		if (hpdev->pci_slot) {
+			pci_destroy_slot(hpdev->pci_slot);
+			hpdev->pci_slot = NULL;
+		}
 		pci_unlock_rescan_remove();
 	}

@@ -1894,9 +1913,6 @@ static void hv_eject_device_work(struct work_struct *work)
 	list_del(&hpdev->list_entry);
 	spin_unlock_irqrestore(&hpdev->hbus->device_list_lock, flags);

-	if (hpdev->pci_slot)
-		pci_destroy_slot(hpdev->pci_slot);
-
 	memset(&ctxt, 0, sizeof(ctxt));
 	ejct_pkt = (struct pci_eject_response *)&ctxt.pkt.message;
 	ejct_pkt->message_type.type = PCI_EJECTION_COMPLETE;
@@ -2682,6 +2698,7 @@ static int hv_pci_remove(struct hv_device *hdev)
 		pci_lock_rescan_remove();
 		pci_stop_root_bus(hbus->pci_bus);
 		pci_remove_root_bus(hbus->pci_bus);
+		hv_pci_remove_slots(hbus);
 		pci_unlock_rescan_remove();
 		hbus->state = hv_pcibus_removed;
 	}
--
2.19.1



2019-02-08 00:45:18

by Stephen Hemminger

Subject: Re: [PATCH] PCI: hv: Add hv_pci_remove_slots() when we unload the driver

On Thu, 7 Feb 2019 20:36:32 +0000
Dexuan Cui <[email protected]> wrote:

> When we unload pci-hyperv, the host doesn't send us a PCI_EJECT message.
> In this case we also need to make sure the sysfs pci slot directory
> is removed, otherwise "cat /sys/bus/pci/slots/2/address" will trigger
> "BUG: unable to handle kernel paging request". And, if we unload/reload
> the driver several times, we'll have multiple pci slot directories in
> /sys/bus/pci/slots/ like this:
>
> root@localhost:~# ls -rtl /sys/bus/pci/slots/
> total 0
> drwxr-xr-x 2 root root 0 Feb 7 10:49 2
> drwxr-xr-x 2 root root 0 Feb 7 10:49 2-1
> drwxr-xr-x 2 root root 0 Feb 7 10:51 2-2
>
> The patch adds the missing code, and in hv_eject_device_work() it also
> moves pci_destroy_slot() to an earlier place where we hold the pci lock.
>
> Fixes: a15f2c08c708 ("PCI: hv: support reporting serial number as slot information")
> Signed-off-by: Dexuan Cui <[email protected]>
> Cc: [email protected]
> Cc: Stephen Hemminger <[email protected]>

Acked-by: Stephen Hemminger <[email protected]>

2019-02-11 20:37:31

by Dexuan Cui

Subject: RE: [PATCH] PCI: hv: Add hv_pci_remove_slots() when we unload the driver

> From: Sasha Levin <[email protected]>
> Sent: Monday, February 11, 2019 9:26 AM
> To: Sasha Levin <[email protected]>; Dexuan Cui <[email protected]>;
> Lorenzo Pieralisi <[email protected]>
> Cc: [email protected]; [email protected]; Stephen Hemminger
> <[email protected]>; [email protected]
> Subject: Re: [PATCH] PCI: hv: Add hv_pci_remove_slots() when we unload the
> driver
>
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a "Fixes:" tag,
> fixing commit: a15f2c08c708 PCI: hv: support reporting serial number as slot
> information.
>
> The bot has tested the following trees: v4.20.7, v4.19.20, v4.14.98.
>
> v4.20.7: Build OK!
> v4.19.20: Build OK!
> v4.14.98: Failed to apply! Possible dependencies:
> Unable to calculate
> How should we proceed with this patch?
> Sasha

In v4.14.y, the file is drivers/pci/host/pci-hyperv.c rather than
drivers/pci/controller/pci-hyperv.c. After I changed the directory in the
patch's file paths, "git am" applied it cleanly on v4.14.98.

Thanks,
-- Dexuan

2019-02-12 12:15:11

by Lorenzo Pieralisi

Subject: Re: [PATCH] PCI: hv: Add hv_pci_remove_slots() when we unload the driver

On Thu, Feb 07, 2019 at 08:36:32PM +0000, Dexuan Cui wrote:
>
> When we unload pci-hyperv, the host doesn't send us a PCI_EJECT message.
> In this case we also need to make sure the sysfs pci slot directory
> is removed, otherwise "cat /sys/bus/pci/slots/2/address" will trigger
> "BUG: unable to handle kernel paging request". And, if we unload/reload
> the driver several times, we'll have multiple pci slot directories in
> /sys/bus/pci/slots/ like this:
>
> root@localhost:~# ls -rtl /sys/bus/pci/slots/
> total 0
> drwxr-xr-x 2 root root 0 Feb 7 10:49 2
> drwxr-xr-x 2 root root 0 Feb 7 10:49 2-1
> drwxr-xr-x 2 root root 0 Feb 7 10:51 2-2
>
> The patch adds the missing code, and in hv_eject_device_work() it also
> moves pci_destroy_slot() to an earlier place where we hold the pci lock.

This patch fixes three bugs:

1) set hpdev->pci_slot to NULL
2) move code destroying the slot inside a locked region in
hv_eject_device_work()
3) Add missing slots removal code in hv_pci_remove()

We need three patches, not one.

I am not entirely sure we want (1) and (2) in stable kernels, since they
address only potential bugs; I am waiting for your input.

Lorenzo

> Fixes: a15f2c08c708 ("PCI: hv: support reporting serial number as slot information")
> Signed-off-by: Dexuan Cui <[email protected]>
> Cc: [email protected]
> Cc: Stephen Hemminger <[email protected]>
> ---
> drivers/pci/controller/pci-hyperv.c | 23 ++++++++++++++++++++---
> 1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 9ba4d12c179c..6b4773727525 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -1491,6 +1491,21 @@ static void hv_pci_assign_slots(struct hv_pcibus_device *hbus)
>  	}
>  }
>
> +/*
> + * Remove entries in sysfs pci slot directory.
> + */
> +static void hv_pci_remove_slots(struct hv_pcibus_device *hbus)
> +{
> +	struct hv_pci_dev *hpdev;
> +
> +	list_for_each_entry(hpdev, &hbus->children, list_entry) {
> +		if (!hpdev->pci_slot)
> +			continue;
> +		pci_destroy_slot(hpdev->pci_slot);
> +		hpdev->pci_slot = NULL;
> +	}
> +}
> +
>  /**
>   * create_root_hv_pci_bus() - Expose a new root PCI bus
>   * @hbus: Root PCI bus, as understood by this driver
> @@ -1887,6 +1902,10 @@ static void hv_eject_device_work(struct work_struct *work)
>  		pci_lock_rescan_remove();
>  		pci_stop_and_remove_bus_device(pdev);
>  		pci_dev_put(pdev);
> +		if (hpdev->pci_slot) {
> +			pci_destroy_slot(hpdev->pci_slot);
> +			hpdev->pci_slot = NULL;
> +		}
>  		pci_unlock_rescan_remove();
>  	}
>
> @@ -1894,9 +1913,6 @@ static void hv_eject_device_work(struct work_struct *work)
>  	list_del(&hpdev->list_entry);
>  	spin_unlock_irqrestore(&hpdev->hbus->device_list_lock, flags);
>
> -	if (hpdev->pci_slot)
> -		pci_destroy_slot(hpdev->pci_slot);
> -
>  	memset(&ctxt, 0, sizeof(ctxt));
>  	ejct_pkt = (struct pci_eject_response *)&ctxt.pkt.message;
>  	ejct_pkt->message_type.type = PCI_EJECTION_COMPLETE;
> @@ -2682,6 +2698,7 @@ static int hv_pci_remove(struct hv_device *hdev)
>  		pci_lock_rescan_remove();
>  		pci_stop_root_bus(hbus->pci_bus);
>  		pci_remove_root_bus(hbus->pci_bus);
> +		hv_pci_remove_slots(hbus);
>  		pci_unlock_rescan_remove();
>  		hbus->state = hv_pcibus_removed;
>  	}
> --
> 2.19.1
>

2019-02-13 02:36:18

by Dexuan Cui

Subject: RE: [PATCH] PCI: hv: Add hv_pci_remove_slots() when we unload the driver

> From: Lorenzo Pieralisi <[email protected]>
> Sent: Tuesday, February 12, 2019 4:13 AM
> ...
> This patch fixes three bugs:
>
> 1) set hpdev->pci_slot to NULL
> 2) move code destroying the slot inside a locked region in
> hv_eject_device_work()
> 3) Add missing slots removal code in hv_pci_remove()
>
> We need three patches, not one.
>
> I am not entirely sure we want (1) and (2) in stable kernels, since they
> address only potential bugs; I am waiting for your input.
>
> Lorenzo

(1) should actually be unnecessary, because hpdev is supposed to be freed later
in the same function: hv_eject_device_work() -> put_pcichild() -> kfree(hpdev).
However, today I think I found a refcount bug in the hot-remove case: kfree(hpdev)
is never called there. I'll dig into this further and send some extra patches.

As for (2), it is a race condition that can occur when the device is being hot-removed
while we are unloading the pci-hyperv driver. This is not a normal use case,
so I agree it doesn't really need to go into the stable kernels.

(3) should go into the stable kernels.

I'll make 3 separate patches, plus extra patches for the refcount issue and
possibly other minor issues.

Thanks,
-- Dexuan