2021-08-24 07:21:23

by Long Li

[permalink] [raw]
Subject: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus

From: Long Li <[email protected]>

In hv_pci_bus_exit, the code is holding a spinlock while calling
pci_destroy_slot(), which takes a mutex.

This is not safe for spinlock. Fix this by moving the children to be
deleted to a list on the stack, and removing them after spinlock is
released.

Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the device")

Cc: "K. Y. Srinivasan" <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Wei Liu <[email protected]>
Cc: Dexuan Cui <[email protected]>
Cc: Lorenzo Pieralisi <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: "Krzysztof Wilczyński" <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Michael Kelley <[email protected]>
Cc: Dan Carpenter <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Long Li <[email protected]>
---
drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index a53bd8728d0d..d4f3cce18957 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs)
struct hv_pci_dev *hpdev, *tmp;
unsigned long flags;
int ret;
+ struct list_head removed;

/*
* After the host sends the RESCIND_CHANNEL message, it doesn't
@@ -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs)
return 0;

if (!keep_devs) {
- /* Delete any children which might still exist. */
+ INIT_LIST_HEAD(&removed);
+
+ /* Move all present children to the list on stack */
spin_lock_irqsave(&hbus->device_list_lock, flags);
- list_for_each_entry_safe(hpdev, tmp, &hbus->children, list_entry) {
+ list_for_each_entry_safe(hpdev, tmp, &hbus->children, list_entry)
+ list_move_tail(&hpdev->list_entry, &removed);
+ spin_unlock_irqrestore(&hbus->device_list_lock, flags);
+
+ /* Remove all children in the list */
+ while (!list_empty(&removed)) {
+ hpdev = list_first_entry(&removed, struct hv_pci_dev,
+ list_entry);
list_del(&hpdev->list_entry);
if (hpdev->pci_slot)
pci_destroy_slot(hpdev->pci_slot);
@@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs)
put_pcichild(hpdev);
put_pcichild(hpdev);
}
- spin_unlock_irqrestore(&hbus->device_list_lock, flags);
}

ret = hv_send_resources_released(hdev);
--
2.25.1


2021-08-24 11:04:37

by Wei Liu

[permalink] [raw]
Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus

On Tue, Aug 24, 2021 at 12:20:20AM -0700, [email protected] wrote:
> From: Long Li <[email protected]>
>
> In hv_pci_bus_exit, the code is holding a spinlock while calling
> pci_destroy_slot(), which takes a mutex.
>
> This is not safe for spinlock. Fix this by moving the children to be
> deleted to a list on the stack, and removing them after spinlock is
> released.
>
> Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the device")
>
> Cc: "K. Y. Srinivasan" <[email protected]>
> Cc: Haiyang Zhang <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Wei Liu <[email protected]>
> Cc: Dexuan Cui <[email protected]>
> Cc: Lorenzo Pieralisi <[email protected]>
> Cc: Rob Herring <[email protected]>
> Cc: "Krzysztof Wilczyński" <[email protected]>
> Cc: Bjorn Helgaas <[email protected]>
> Cc: Michael Kelley <[email protected]>
> Cc: Dan Carpenter <[email protected]>
> Reported-by: Dan Carpenter <[email protected]>
> Signed-off-by: Long Li <[email protected]>
> ---
> drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index a53bd8728d0d..d4f3cce18957 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs)
> struct hv_pci_dev *hpdev, *tmp;
> unsigned long flags;
> int ret;
> + struct list_head removed;

This can be moved to where it is needed -- the if(!keep_dev) branch --
to limit its scope.

>
> /*
> * After the host sends the RESCIND_CHANNEL message, it doesn't
> @@ -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs)
> return 0;
>
> if (!keep_devs) {
> - /* Delete any children which might still exist. */
> + INIT_LIST_HEAD(&removed);
> +
> + /* Move all present children to the list on stack */
> spin_lock_irqsave(&hbus->device_list_lock, flags);
> - list_for_each_entry_safe(hpdev, tmp, &hbus->children, list_entry) {
> + list_for_each_entry_safe(hpdev, tmp, &hbus->children, list_entry)
> + list_move_tail(&hpdev->list_entry, &removed);
> + spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> +
> + /* Remove all children in the list */
> + while (!list_empty(&removed)) {
> + hpdev = list_first_entry(&removed, struct hv_pci_dev,
> + list_entry);

list_for_each_entry_safe can also be used here, right?

Wei.

> list_del(&hpdev->list_entry);
> if (hpdev->pci_slot)
> pci_destroy_slot(hpdev->pci_slot);
> @@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs)
> put_pcichild(hpdev);
> put_pcichild(hpdev);
> }
> - spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> }
>
> ret = hv_send_resources_released(hdev);
> --
> 2.25.1
>

2021-08-24 12:26:22

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus

"Fix a bug ..." is not a very useful subject line. It doesn't say
anything about what the patch *does*. It doesn't hint at a locking
change.

On Tue, Aug 24, 2021 at 12:20:20AM -0700, [email protected] wrote:
> From: Long Li <[email protected]>
>
> In hv_pci_bus_exit, the code is holding a spinlock while calling
> pci_destroy_slot(), which takes a mutex.

It's unfortunate that slots are not better integrated into the PCI
core. I'm sorry your driver even has to worry about this.
>
> This is not safe for spinlock. Fix this by moving the children to be
> deleted to a list on the stack, and removing them after spinlock is
> released.
>
> Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the device")
>
> Cc: "K. Y. Srinivasan" <[email protected]>
> Cc: Haiyang Zhang <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Wei Liu <[email protected]>
> Cc: Dexuan Cui <[email protected]>
> Cc: Lorenzo Pieralisi <[email protected]>
> Cc: Rob Herring <[email protected]>
> Cc: "Krzysztof Wilczyński" <[email protected]>
> Cc: Bjorn Helgaas <[email protected]>
> Cc: Michael Kelley <[email protected]>
> Cc: Dan Carpenter <[email protected]>
> Reported-by: Dan Carpenter <[email protected]>

A lore link to Dan's report would be useful here.

> Signed-off-by: Long Li <[email protected]>
> ---
> drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index a53bd8728d0d..d4f3cce18957 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs)
> struct hv_pci_dev *hpdev, *tmp;
> unsigned long flags;
> int ret;
> + struct list_head removed;
>
> /*
> * After the host sends the RESCIND_CHANNEL message, it doesn't
> @@ -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs)
> return 0;
>
> if (!keep_devs) {
> - /* Delete any children which might still exist. */
> + INIT_LIST_HEAD(&removed);
> +
> + /* Move all present children to the list on stack */
> spin_lock_irqsave(&hbus->device_list_lock, flags);
> - list_for_each_entry_safe(hpdev, tmp, &hbus->children, list_entry) {
> + list_for_each_entry_safe(hpdev, tmp, &hbus->children, list_entry)
> + list_move_tail(&hpdev->list_entry, &removed);
> + spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> +
> + /* Remove all children in the list */
> + while (!list_empty(&removed)) {
> + hpdev = list_first_entry(&removed, struct hv_pci_dev,
> + list_entry);
> list_del(&hpdev->list_entry);
> if (hpdev->pci_slot)
> pci_destroy_slot(hpdev->pci_slot);
> @@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs)
> put_pcichild(hpdev);
> put_pcichild(hpdev);
> }
> - spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> }
>
> ret = hv_send_resources_released(hdev);
> --
> 2.25.1
>

2021-08-24 17:32:42

by Long Li

[permalink] [raw]
Subject: RE: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus

> Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus
>
> On Tue, Aug 24, 2021 at 12:20:20AM -0700, [email protected] wrote:
> > From: Long Li <[email protected]>
> >
> > In hv_pci_bus_exit, the code is holding a spinlock while calling
> > pci_destroy_slot(), which takes a mutex.
> >
> > This is not safe for spinlock. Fix this by moving the children to be
> > deleted to a list on the stack, and removing them after spinlock is
> > released.
> >
> > Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the
> > device")
> >
> > Cc: "K. Y. Srinivasan" <[email protected]>
> > Cc: Haiyang Zhang <[email protected]>
> > Cc: Stephen Hemminger <[email protected]>
> > Cc: Wei Liu <[email protected]>
> > Cc: Dexuan Cui <[email protected]>
> > Cc: Lorenzo Pieralisi <[email protected]>
> > Cc: Rob Herring <[email protected]>
> > Cc: "Krzysztof Wilczyński" <[email protected]>
> > Cc: Bjorn Helgaas <[email protected]>
> > Cc: Michael Kelley <[email protected]>
> > Cc: Dan Carpenter <[email protected]>
> > Reported-by: Dan Carpenter <[email protected]>
> > Signed-off-by: Long Li <[email protected]>
> > ---
> > drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++---
> > 1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pci-hyperv.c
> > b/drivers/pci/controller/pci-hyperv.c
> > index a53bd8728d0d..d4f3cce18957 100644
> > --- a/drivers/pci/controller/pci-hyperv.c
> > +++ b/drivers/pci/controller/pci-hyperv.c
> > @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev,
> bool keep_devs)
> > struct hv_pci_dev *hpdev, *tmp;
> > unsigned long flags;
> > int ret;
> > + struct list_head removed;
>
> This can be moved to where it is needed -- the if(!keep_dev) branch -- to limit its
> scope.
>
> >
> > /*
> > * After the host sends the RESCIND_CHANNEL message, it doesn't @@
> > -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool
> keep_devs)
> > return 0;
> >
> > if (!keep_devs) {
> > - /* Delete any children which might still exist. */
> > + INIT_LIST_HEAD(&removed);
> > +
> > + /* Move all present children to the list on stack */
> > spin_lock_irqsave(&hbus->device_list_lock, flags);
> > - list_for_each_entry_safe(hpdev, tmp, &hbus->children,
> list_entry) {
> > + list_for_each_entry_safe(hpdev, tmp, &hbus->children,
> list_entry)
> > + list_move_tail(&hpdev->list_entry, &removed);
> > + spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> > +
> > + /* Remove all children in the list */
> > + while (!list_empty(&removed)) {
> > + hpdev = list_first_entry(&removed, struct hv_pci_dev,
> > + list_entry);
>
> list_for_each_entry_safe can also be used here, right?
>
> Wei.

I will address your comments.

Long

>
> > list_del(&hpdev->list_entry);
> > if (hpdev->pci_slot)
> > pci_destroy_slot(hpdev->pci_slot);
> > @@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device *hdev,
> bool keep_devs)
> > put_pcichild(hpdev);
> > put_pcichild(hpdev);
> > }
> > - spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> > }
> >
> > ret = hv_send_resources_released(hdev);
> > --
> > 2.25.1
> >

2021-08-24 17:36:00

by Long Li

[permalink] [raw]
Subject: RE: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus

> Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus
>
> "Fix a bug ..." is not a very useful subject line. It doesn't say anything about what
> the patch *does*. It doesn't hint at a locking change.
>
> On Tue, Aug 24, 2021 at 12:20:20AM -0700, [email protected] wrote:
> > From: Long Li <[email protected]>
> >
> > In hv_pci_bus_exit, the code is holding a spinlock while calling
> > pci_destroy_slot(), which takes a mutex.
>
> It's unfortunate that slots are not better integrated into the PCI core. I'm sorry
> your driver even has to worry about this.
> >
> > This is not safe for spinlock. Fix this by moving the children to be
> > deleted to a list on the stack, and removing them after spinlock is
> > released.
> >
> > Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the
> > device")
> >
> > Cc: "K. Y. Srinivasan" <[email protected]>
> > Cc: Haiyang Zhang <[email protected]>
> > Cc: Stephen Hemminger <[email protected]>
> > Cc: Wei Liu <[email protected]>
> > Cc: Dexuan Cui <[email protected]>
> > Cc: Lorenzo Pieralisi <[email protected]>
> > Cc: Rob Herring <[email protected]>
> > Cc: "Krzysztof Wilczyński" <[email protected]>
> > Cc: Bjorn Helgaas <[email protected]>
> > Cc: Michael Kelley <[email protected]>
> > Cc: Dan Carpenter <[email protected]>
> > Reported-by: Dan Carpenter <[email protected]>
>
> A lore link to Dan's report would be useful here.

I will address your comments and send v2.

Long

>
> > Signed-off-by: Long Li <[email protected]>
> > ---
> > drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++---
> > 1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pci-hyperv.c
> > b/drivers/pci/controller/pci-hyperv.c
> > index a53bd8728d0d..d4f3cce18957 100644
> > --- a/drivers/pci/controller/pci-hyperv.c
> > +++ b/drivers/pci/controller/pci-hyperv.c
> > @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev,
> bool keep_devs)
> > struct hv_pci_dev *hpdev, *tmp;
> > unsigned long flags;
> > int ret;
> > + struct list_head removed;
> >
> > /*
> > * After the host sends the RESCIND_CHANNEL message, it doesn't @@
> > -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool
> keep_devs)
> > return 0;
> >
> > if (!keep_devs) {
> > - /* Delete any children which might still exist. */
> > + INIT_LIST_HEAD(&removed);
> > +
> > + /* Move all present children to the list on stack */
> > spin_lock_irqsave(&hbus->device_list_lock, flags);
> > - list_for_each_entry_safe(hpdev, tmp, &hbus->children,
> list_entry) {
> > + list_for_each_entry_safe(hpdev, tmp, &hbus->children,
> list_entry)
> > + list_move_tail(&hpdev->list_entry, &removed);
> > + spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> > +
> > + /* Remove all children in the list */
> > + while (!list_empty(&removed)) {
> > + hpdev = list_first_entry(&removed, struct hv_pci_dev,
> > + list_entry);
> > list_del(&hpdev->list_entry);
> > if (hpdev->pci_slot)
> > pci_destroy_slot(hpdev->pci_slot);
> > @@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device *hdev,
> bool keep_devs)
> > put_pcichild(hpdev);
> > put_pcichild(hpdev);
> > }
> > - spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> > }
> >
> > ret = hv_send_resources_released(hdev);
> > --
> > 2.25.1
> >

2021-08-25 21:08:49

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus

From: Long Li <[email protected]> Sent: Tuesday, August 24, 2021 10:28 AM
>
> > Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus
> >
> > On Tue, Aug 24, 2021 at 12:20:20AM -0700, [email protected] wrote:
> > > From: Long Li <[email protected]>
> > >
> > > In hv_pci_bus_exit, the code is holding a spinlock while calling
> > > pci_destroy_slot(), which takes a mutex.
> > >
> > > This is not safe for spinlock. Fix this by moving the children to be
> > > deleted to a list on the stack, and removing them after spinlock is
> > > released.
> > >
> > > Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the
> > > device")
> > >
> > > Cc: "K. Y. Srinivasan" <[email protected]>
> > > Cc: Haiyang Zhang <[email protected]>
> > > Cc: Stephen Hemminger <[email protected]>
> > > Cc: Wei Liu <[email protected]>
> > > Cc: Dexuan Cui <[email protected]>
> > > Cc: Lorenzo Pieralisi <[email protected]>
> > > Cc: Rob Herring <[email protected]>
> > > Cc: "Krzysztof Wilczyński" <[email protected]>
> > > Cc: Bjorn Helgaas <[email protected]>
> > > Cc: Michael Kelley <[email protected]>
> > > Cc: Dan Carpenter <[email protected]>
> > > Reported-by: Dan Carpenter <[email protected]>
> > > Signed-off-by: Long Li <[email protected]>
> > > ---
> > > drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++---
> > > 1 file changed, 12 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/pci/controller/pci-hyperv.c
> > > b/drivers/pci/controller/pci-hyperv.c
> > > index a53bd8728d0d..d4f3cce18957 100644
> > > --- a/drivers/pci/controller/pci-hyperv.c
> > > +++ b/drivers/pci/controller/pci-hyperv.c
> > > @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev,
> > bool keep_devs)
> > > struct hv_pci_dev *hpdev, *tmp;
> > > unsigned long flags;
> > > int ret;
> > > + struct list_head removed;
> >
> > This can be moved to where it is needed -- the if(!keep_dev) branch -- to limit its
> > scope.
> >
> > >
> > > /*
> > > * After the host sends the RESCIND_CHANNEL message, it doesn't @@
> > > -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool
> > keep_devs)
> > > return 0;
> > >
> > > if (!keep_devs) {
> > > - /* Delete any children which might still exist. */
> > > + INIT_LIST_HEAD(&removed);
> > > +
> > > + /* Move all present children to the list on stack */
> > > spin_lock_irqsave(&hbus->device_list_lock, flags);
> > > - list_for_each_entry_safe(hpdev, tmp, &hbus->children,
> > list_entry) {
> > > + list_for_each_entry_safe(hpdev, tmp, &hbus->children,
> > list_entry)
> > > + list_move_tail(&hpdev->list_entry, &removed);
> > > + spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> > > +
> > > + /* Remove all children in the list */
> > > + while (!list_empty(&removed)) {
> > > + hpdev = list_first_entry(&removed, struct hv_pci_dev,
> > > + list_entry);
> >
> > list_for_each_entry_safe can also be used here, right?
> >
> > Wei.
>
> I will address your comments.
>
> Long

I thought list_for_each_entry_safe() is for use when list manipulation
is *not* protected by a lock and you want to safely walk the list
even if an entry gets removed. If the list is protected by a lock or
not subject to contention (as is the case here), then
list_for_each_entry() is the simpler implementation. The original
implementation didn't need to use the _safe version because of
the spin lock.

Or do I have it backwards?

Michael

>
> >
> > > list_del(&hpdev->list_entry);
> > > if (hpdev->pci_slot)
> > > pci_destroy_slot(hpdev->pci_slot);
> > > @@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device *hdev,
> > bool keep_devs)
> > > put_pcichild(hpdev);
> > > put_pcichild(hpdev);
> > > }
> > > - spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> > > }
> > >
> > > ret = hv_send_resources_released(hdev);
> > > --
> > > 2.25.1
> > >

2021-08-25 21:12:34

by Long Li

[permalink] [raw]
Subject: RE: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus

> Subject: RE: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus
>
> From: Long Li <[email protected]> Sent: Tuesday, August 24, 2021 10:28
> AM
> >
> > > Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on
> > > the bus
> > >
> > > On Tue, Aug 24, 2021 at 12:20:20AM -0700, [email protected]
> wrote:
> > > > From: Long Li <[email protected]>
> > > >
> > > > In hv_pci_bus_exit, the code is holding a spinlock while calling
> > > > pci_destroy_slot(), which takes a mutex.
> > > >
> > > > This is not safe for spinlock. Fix this by moving the children to
> > > > be deleted to a list on the stack, and removing them after
> > > > spinlock is released.
> > > >
> > > > Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing
> > > > the
> > > > device")
> > > >
> > > > Cc: "K. Y. Srinivasan" <[email protected]>
> > > > Cc: Haiyang Zhang <[email protected]>
> > > > Cc: Stephen Hemminger <[email protected]>
> > > > Cc: Wei Liu <[email protected]>
> > > > Cc: Dexuan Cui <[email protected]>
> > > > Cc: Lorenzo Pieralisi <[email protected]>
> > > > Cc: Rob Herring <[email protected]>
> > > > Cc: "Krzysztof Wilczyński" <[email protected]>
> > > > Cc: Bjorn Helgaas <[email protected]>
> > > > Cc: Michael Kelley <[email protected]>
> > > > Cc: Dan Carpenter <[email protected]>
> > > > Reported-by: Dan Carpenter <[email protected]>
> > > > Signed-off-by: Long Li <[email protected]>
> > > > ---
> > > > drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++---
> > > > 1 file changed, 12 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/pci/controller/pci-hyperv.c
> > > > b/drivers/pci/controller/pci-hyperv.c
> > > > index a53bd8728d0d..d4f3cce18957 100644
> > > > --- a/drivers/pci/controller/pci-hyperv.c
> > > > +++ b/drivers/pci/controller/pci-hyperv.c
> > > > @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device
> > > > *hdev,
> > > bool keep_devs)
> > > > struct hv_pci_dev *hpdev, *tmp;
> > > > unsigned long flags;
> > > > int ret;
> > > > + struct list_head removed;
> > >
> > > This can be moved to where it is needed -- the if(!keep_dev) branch
> > > -- to limit its scope.
> > >
> > > >
> > > > /*
> > > > * After the host sends the RESCIND_CHANNEL message, it doesn't
> > > > @@
> > > > -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device
> > > > *hdev, bool
> > > keep_devs)
> > > > return 0;
> > > >
> > > > if (!keep_devs) {
> > > > - /* Delete any children which might still exist. */
> > > > + INIT_LIST_HEAD(&removed);
> > > > +
> > > > + /* Move all present children to the list on stack */
> > > > spin_lock_irqsave(&hbus->device_list_lock, flags);
> > > > - list_for_each_entry_safe(hpdev, tmp, &hbus->children,
> > > list_entry) {
> > > > + list_for_each_entry_safe(hpdev, tmp, &hbus->children,
> > > list_entry)
> > > > + list_move_tail(&hpdev->list_entry, &removed);
> > > > + spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> > > > +
> > > > + /* Remove all children in the list */
> > > > + while (!list_empty(&removed)) {
> > > > + hpdev = list_first_entry(&removed, struct hv_pci_dev,
> > > > + list_entry);
> > >
> > > list_for_each_entry_safe can also be used here, right?
> > >
> > > Wei.
> >
> > I will address your comments.
> >
> > Long
>
> I thought list_for_each_entry_safe() is for use when list manipulation is *not*
> protected by a lock and you want to safely walk the list even if an entry gets
> removed. If the list is protected by a lock or not subject to contention (as is the
> case here), then
> list_for_each_entry() is the simpler implementation. The original
> implementation didn't need to use the _safe version because of the spin lock.
>
> Or do I have it backwards?
>
> Michael

I think we need list_for_each_entry_safe() because we delete the list elements while going through them:

Here is the comment on list_for_each_entry_safe():
/**
* Loop through the list, keeping a backup pointer to the element. This
* macro allows for the deletion of a list element while looping through the
* list.
*
* See list_for_each_entry for more details.
*/

>
> >
> > >
> > > > list_del(&hpdev->list_entry);
> > > > if (hpdev->pci_slot)
> > > > pci_destroy_slot(hpdev->pci_slot);
> > > > @@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device
> > > > *hdev,
> > > bool keep_devs)
> > > > put_pcichild(hpdev);
> > > > put_pcichild(hpdev);
> > > > }
> > > > - spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> > > > }
> > > >
> > > > ret = hv_send_resources_released(hdev);
> > > > --
> > > > 2.25.1
> > > >

2021-08-25 21:13:14

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus

On Wed, Aug 25, 2021 at 2:11 PM Michael Kelley <[email protected]> wrote:
>
> From: Long Li <[email protected]> Sent: Tuesday, August 24, 2021 10:28 AM
> >
> > > Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus
> > >
> > > On Tue, Aug 24, 2021 at 12:20:20AM -0700, [email protected] wrote:
> > > > From: Long Li <[email protected]>
> > > >
> > > > In hv_pci_bus_exit, the code is holding a spinlock while calling
> > > > pci_destroy_slot(), which takes a mutex.
> > > >
> > > > This is not safe for spinlock. Fix this by moving the children to be
> > > > deleted to a list on the stack, and removing them after spinlock is
> > > > released.
> > > >
> > > > Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the
> > > > device")
> > > >
> > > > Cc: "K. Y. Srinivasan" <[email protected]>
> > > > Cc: Haiyang Zhang <[email protected]>
> > > > Cc: Stephen Hemminger <[email protected]>
> > > > Cc: Wei Liu <[email protected]>
> > > > Cc: Dexuan Cui <[email protected]>
> > > > Cc: Lorenzo Pieralisi <[email protected]>
> > > > Cc: Rob Herring <[email protected]>
> > > > Cc: "Krzysztof Wilczyński" <[email protected]>
> > > > Cc: Bjorn Helgaas <[email protected]>
> > > > Cc: Michael Kelley <[email protected]>
> > > > Cc: Dan Carpenter <[email protected]>
> > > > Reported-by: Dan Carpenter <[email protected]>
> > > > Signed-off-by: Long Li <[email protected]>
> > > > ---
> > > > drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++---
> > > > 1 file changed, 12 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/pci/controller/pci-hyperv.c
> > > > b/drivers/pci/controller/pci-hyperv.c
> > > > index a53bd8728d0d..d4f3cce18957 100644
> > > > --- a/drivers/pci/controller/pci-hyperv.c
> > > > +++ b/drivers/pci/controller/pci-hyperv.c
> > > > @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev,
> > > bool keep_devs)
> > > > struct hv_pci_dev *hpdev, *tmp;
> > > > unsigned long flags;
> > > > int ret;
> > > > + struct list_head removed;
> > >
> > > This can be moved to where it is needed -- the if(!keep_dev) branch -- to limit its
> > > scope.
> > >
> > > >
> > > > /*
> > > > * After the host sends the RESCIND_CHANNEL message, it doesn't @@
> > > > -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool
> > > keep_devs)
> > > > return 0;
> > > >
> > > > if (!keep_devs) {
> > > > - /* Delete any children which might still exist. */
> > > > + INIT_LIST_HEAD(&removed);
> > > > +
> > > > + /* Move all present children to the list on stack */
> > > > spin_lock_irqsave(&hbus->device_list_lock, flags);
> > > > - list_for_each_entry_safe(hpdev, tmp, &hbus->children,
> > > list_entry) {
> > > > + list_for_each_entry_safe(hpdev, tmp, &hbus->children,
> > > list_entry)
> > > > + list_move_tail(&hpdev->list_entry, &removed);
> > > > + spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> > > > +
> > > > + /* Remove all children in the list */
> > > > + while (!list_empty(&removed)) {
> > > > + hpdev = list_first_entry(&removed, struct hv_pci_dev,
> > > > + list_entry);
> > >
> > > list_for_each_entry_safe can also be used here, right?
> > >
> > > Wei.
> >
> > I will address your comments.
> >
> > Long
>
> I thought list_for_each_entry_safe() is for use when list manipulation
> is *not* protected by a lock and you want to safely walk the list
> even if an entry gets removed. If the list is protected by a lock or
> not subject to contention (as is the case here), then
> list_for_each_entry() is the simpler implementation. The original
> implementation didn't need to use the _safe version because of
> the spin lock.
>
> Or do I have it backwards?

"_safe" only means "safe against removal of list entry" as the
kerneldoc says. But that means removal within the loop iteration, not
any writer. A lock is needed in either case if there's another writer.

Don't ask me about the RCU variant though...

Rob

2021-08-26 16:53:49

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus

From: Long Li <[email protected]> Sent: Wednesday, August 25, 2021 1:25 PM

> >
> > I thought list_for_each_entry_safe() is for use when list manipulation is *not*
> > protected by a lock and you want to safely walk the list even if an entry gets
> > removed. If the list is protected by a lock or not subject to contention (as is the
> > case here), then
> > list_for_each_entry() is the simpler implementation. The original
> > implementation didn't need to use the _safe version because of the spin lock.
> >
> > Or do I have it backwards?
> >
> > Michael
>
> I think we need list_for_each_entry_safe() because we delete the list elements while going through them:
>
> Here is the comment on list_for_each_entry_safe():
> /**
> * Loop through the list, keeping a backup pointer to the element. This
> * macro allows for the deletion of a list element while looping through the
> * list.
> *
> * See list_for_each_entry for more details.
> */
>

Got it. Thanks (and to Rob Herring). I read that comment but
with the wrong assumptions and didn't understand it correctly.

Interestingly, pci-hyperv.c has another case of looping through
this list and removing items where the _safe version is not used.
See pci_devices_present_work() where the missing children are
moved to a list on the stack.

Michael

2021-08-26 19:43:58

by Wei Liu

[permalink] [raw]
Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus

On Thu, Aug 26, 2021 at 04:50:28PM +0000, Michael Kelley wrote:
> From: Long Li <[email protected]> Sent: Wednesday, August 25, 2021 1:25 PM
>
> > >
> > > I thought list_for_each_entry_safe() is for use when list manipulation is *not*
> > > protected by a lock and you want to safely walk the list even if an entry gets
> > > removed. If the list is protected by a lock or not subject to contention (as is the
> > > case here), then
> > > list_for_each_entry() is the simpler implementation. The original
> > > implementation didn't need to use the _safe version because of the spin lock.
> > >
> > > Or do I have it backwards?
> > >
> > > Michael
> >
> > I think we need list_for_each_entry_safe() because we delete the list elements while going through them:
> >
> > Here is the comment on list_for_each_entry_safe():
> > /**
> > * Loop through the list, keeping a backup pointer to the element. This
> > * macro allows for the deletion of a list element while looping through the
> > * list.
> > *
> > * See list_for_each_entry for more details.
> > */
> >
>
> Got it. Thanks (and to Rob Herring). I read that comment but
> with the wrong assumptions and didn't understand it correctly.
>
> Interestingly, pci-hyperv.c has another case of looping through
> this list and removing items where the _safe version is not used.
> See pci_devices_present_work() where the missing children are
> moved to a list on the stack.

That can be converted too, I think.

The original code is not wrong per-se. It is just not as concise as
using list_for_each_entry_safe.

Wei.

>
> Michael

2021-08-26 20:13:22

by Long Li

[permalink] [raw]
Subject: RE: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus

> Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus
>
> On Thu, Aug 26, 2021 at 04:50:28PM +0000, Michael Kelley wrote:
> > From: Long Li <[email protected]> Sent: Wednesday, August 25, 2021
> > 1:25 PM
> >
> > > >
> > > > I thought list_for_each_entry_safe() is for use when list
> > > > manipulation is *not* protected by a lock and you want to safely
> > > > walk the list even if an entry gets removed. If the list is
> > > > protected by a lock or not subject to contention (as is the case
> > > > here), then
> > > > list_for_each_entry() is the simpler implementation. The original
> > > > implementation didn't need to use the _safe version because of the spin
> lock.
> > > >
> > > > Or do I have it backwards?
> > > >
> > > > Michael
> > >
> > > I think we need list_for_each_entry_safe() because we delete the list
> elements while going through them:
> > >
> > > Here is the comment on list_for_each_entry_safe():
> > > /**
> > > * Loop through the list, keeping a backup pointer to the element.
> > > This
> > > * macro allows for the deletion of a list element while looping
> > > through the
> > > * list.
> > > *
> > > * See list_for_each_entry for more details.
> > > */
> > >
> >
> > Got it. Thanks (and to Rob Herring). I read that comment but
> > with the wrong assumptions and didn't understand it correctly.
> >
> > Interestingly, pci-hyperv.c has another case of looping through this
> > list and removing items where the _safe version is not used.
> > See pci_devices_present_work() where the missing children are moved to
> > a list on the stack.
>
> That can be converted too, I think.
>
> The original code is not wrong per-se. It is just not as concise as using
> list_for_each_entry_safe.
>
> Wei.

I assume we are talking about the following code in pci_devices_present_work():

list_for_each_entry(hpdev, &hbus->children, list_entry) {
if (hpdev->reported_missing) {
found = true;
put_pcichild(hpdev);
list_move_tail(&hpdev->list_entry, &removed);
break;
}
}

This code is correct as there is a "break" after a list entry is removed from the list. So there is no need to use the _safe version. This code can be converted to use the _safe version.

2021-08-26 20:17:00

by Wei Liu

[permalink] [raw]
Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus

On Thu, Aug 26, 2021 at 08:09:19PM +0000, Long Li wrote:
> > Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus
> >
> > On Thu, Aug 26, 2021 at 04:50:28PM +0000, Michael Kelley wrote:
> > > From: Long Li <[email protected]> Sent: Wednesday, August 25, 2021
> > > 1:25 PM
> > >
> > > > >
> > > > > I thought list_for_each_entry_safe() is for use when list
> > > > > manipulation is *not* protected by a lock and you want to safely
> > > > > walk the list even if an entry gets removed. If the list is
> > > > > protected by a lock or not subject to contention (as is the case
> > > > > here), then
> > > > > list_for_each_entry() is the simpler implementation. The original
> > > > > implementation didn't need to use the _safe version because of the spin
> > lock.
> > > > >
> > > > > Or do I have it backwards?
> > > > >
> > > > > Michael
> > > >
> > > > I think we need list_for_each_entry_safe() because we delete the list
> > elements while going through them:
> > > >
> > > > Here is the comment on list_for_each_entry_safe():
> > > > /**
> > > > * Loop through the list, keeping a backup pointer to the element.
> > > > This
> > > > * macro allows for the deletion of a list element while looping
> > > > through the
> > > > * list.
> > > > *
> > > > * See list_for_each_entry for more details.
> > > > */
> > > >
> > >
> > > Got it. Thanks (and to Rob Herring). I read that comment but
> > > with the wrong assumptions and didn't understand it correctly.
> > >
> > > Interestingly, pci-hyperv.c has another case of looping through this
> > > list and removing items where the _safe version is not used.
> > > See pci_devices_present_work() where the missing children are moved to
> > > a list on the stack.
> >
> > That can be converted too, I think.
> >
> > The original code is not wrong per-se. It is just not as concise as using
> > list_for_each_entry_safe.
> >
> > Wei.
>
> I assume we are talking about the following code in pci_devices_present_work():
>
> list_for_each_entry(hpdev, &hbus->children, list_entry) {
> if (hpdev->reported_missing) {
> found = true;
> put_pcichild(hpdev);
> list_move_tail(&hpdev->list_entry, &removed);
> break;
> }
> }
>
> This code is correct as there is a "break" after a list entry is
> removed from the list. So there is no need to use the _safe version.
> This code can be converted to use the _safe version.

After this block there is another block like

while (!list_empty(removed)) {
...
list_del(...)

}

I assumed Michael was referring to that block. :-)

Wei.

2021-08-26 20:21:56

by Long Li

[permalink] [raw]
Subject: RE: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus

> Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus
>
> On Thu, Aug 26, 2021 at 08:09:19PM +0000, Long Li wrote:
> > > Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on
> > > the bus
> > >
> > > On Thu, Aug 26, 2021 at 04:50:28PM +0000, Michael Kelley wrote:
> > > > From: Long Li <[email protected]> Sent: Wednesday, August 25,
> > > > 2021
> > > > 1:25 PM
> > > >
> > > > > >
> > > > > > I thought list_for_each_entry_safe() is for use when list
> > > > > > manipulation is *not* protected by a lock and you want to
> > > > > > safely walk the list even if an entry gets removed. If the
> > > > > > list is protected by a lock or not subject to contention (as
> > > > > > is the case here), then
> > > > > > list_for_each_entry() is the simpler implementation. The
> > > > > > original implementation didn't need to use the _safe version
> > > > > > because of the spin
> > > lock.
> > > > > >
> > > > > > Or do I have it backwards?
> > > > > >
> > > > > > Michael
> > > > >
> > > > > I think we need list_for_each_entry_safe() because we delete the
> > > > > list
> > > elements while going through them:
> > > > >
> > > > > Here is the comment on list_for_each_entry_safe():
> > > > > /**
> > > > > * Loop through the list, keeping a backup pointer to the element.
> > > > > This
> > > > > * macro allows for the deletion of a list element while looping
> > > > > through the
> > > > > * list.
> > > > > *
> > > > > * See list_for_each_entry for more details.
> > > > > */
> > > > >
> > > >
> > > > Got it. Thanks (and to Rob Herring). I read that comment but
> > > > with the wrong assumptions and didn't understand it correctly.
> > > >
> > > > Interestingly, pci-hyperv.c has another case of looping through
> > > > this list and removing items where the _safe version is not used.
> > > > See pci_devices_present_work() where the missing children are
> > > > moved to a list on the stack.
> > >
> > > That can be converted too, I think.
> > >
> > > The original code is not wrong per-se. It is just not as concise as
> > > using list_for_each_entry_safe.
> > >
> > > Wei.
> >
> > I assume we are talking about the following code in
> pci_devices_present_work():
> >
> > list_for_each_entry(hpdev, &hbus->children, list_entry) {
> > if (hpdev->reported_missing) {
> > found = true;
> > put_pcichild(hpdev);
> > list_move_tail(&hpdev->list_entry, &removed);
> > break;
> > }
> > }
> >
> > This code is correct as there is a "break" after a list entry is
> > removed from the list. So there is no need to use the _safe version.
> > This code can be converted to use the _safe version.
>
> After this block there is another block like
>
> while (!list_empty(removed)) {
> ...
> list_del(...)
>
> }
>
> I assumed Michael was referring to that block. :-)
>
> Wei.

This block is also correct. We don't have a bug here but there is a better way to code it.

Long