2024-06-05 07:54:35

by Hongchen Zhang

Subject: [PATCH v2] PCI: use local_pci_probe when best selected cpu is offline

When the best selected CPU is offline, work_on_cpu() will be stuck forever.
This can happen if a node is online while all its CPUs are offline
(we can use "maxcpus=1" without "nr_cpus=1" to reproduce it). Therefore,
in this case, we should call local_pci_probe() instead of work_on_cpu().

Cc: <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Hongchen Zhang <[email protected]>
---
v1 -> v2: Added the method to reproduce this issue
---
drivers/pci/pci-driver.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index af2996d0d17f..32a99828e6a3 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -386,7 +386,7 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 		free_cpumask_var(wq_domain_mask);
 	}

-	if (cpu < nr_cpu_ids)
+	if ((cpu < nr_cpu_ids) && cpu_online(cpu))
 		error = work_on_cpu(cpu, local_pci_probe, &ddi);
 	else
 		error = local_pci_probe(&ddi);
--
2.33.0
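
For illustration only, the one-line change above can be modelled in userspace C. This is a minimal sketch, not kernel code: NR_CPU_IDS, cpu_online_map, model_cpu_online() and both pick_*() helpers are hypothetical names, and the online map assumes a "maxcpus=1" boot where four CPU ids are possible but only CPU 0 has been brought up.

```c
#include <stdbool.h>

/* Hypothetical stand-in for nr_cpu_ids: four possible CPU ids. */
#define NR_CPU_IDS 4u

/* Assumed "maxcpus=1" state: only the boot CPU is online. */
static const bool cpu_online_map[NR_CPU_IDS] = { true, false, false, false };

/* Userspace stand-in for cpu_online(); out-of-range ids count as offline. */
static bool model_cpu_online(unsigned int cpu)
{
	return cpu < NR_CPU_IDS && cpu_online_map[cpu];
}

enum probe_path {
	PROBE_LOCAL,	/* corresponds to calling local_pci_probe() directly */
	PROBE_ON_CPU	/* corresponds to work_on_cpu(cpu, local_pci_probe, ...) */
};

/* Condition before the patch: only the range of the CPU id is checked. */
static enum probe_path pick_path_before_patch(unsigned int cpu)
{
	if (cpu < NR_CPU_IDS)
		return PROBE_ON_CPU;
	return PROBE_LOCAL;
}

/* Condition after the patch: the CPU must also be online. */
static enum probe_path pick_path_after_patch(unsigned int cpu)
{
	if ((cpu < NR_CPU_IDS) && model_cpu_online(cpu))
		return PROBE_ON_CPU;
	return PROBE_LOCAL;
}
```

In this model, pick_path_before_patch(1) selects PROBE_ON_CPU for a CPU that is not online, which corresponds to the case the commit message describes as getting stuck, while pick_path_after_patch(1) falls back to PROBE_LOCAL.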



2024-06-12 04:52:20

by Huacai Chen

Subject: Re: [PATCH v2] PCI: use local_pci_probe when best selected cpu is offline

Hi, Hongchen,

It seems you forgot to update the title, as I pointed out earlier. :)

And Bjorn,

Could you please take some time to review this patch? Thank you.

Huacai

On Wed, Jun 5, 2024 at 3:54 PM Hongchen Zhang <[email protected]> wrote:
>
> When the best selected CPU is offline, work_on_cpu() will be stuck forever.
> This can happen if a node is online while all its CPUs are offline
> (we can use "maxcpus=1" without "nr_cpus=1" to reproduce it). Therefore,
> in this case, we should call local_pci_probe() instead of work_on_cpu().
>
> Cc: <[email protected]>
> Signed-off-by: Huacai Chen <[email protected]>
> Signed-off-by: Hongchen Zhang <[email protected]>
> ---
> v1 -> v2: Added the method to reproduce this issue
> ---
> drivers/pci/pci-driver.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index af2996d0d17f..32a99828e6a3 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -386,7 +386,7 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
>  		free_cpumask_var(wq_domain_mask);
>  	}
>
> -	if (cpu < nr_cpu_ids)
> +	if ((cpu < nr_cpu_ids) && cpu_online(cpu))
>  		error = work_on_cpu(cpu, local_pci_probe, &ddi);
>  	else
>  		error = local_pci_probe(&ddi);
> --
> 2.33.0
>
>

2024-06-12 18:08:39

by Markus Elfring

Subject: Re: [PATCH v2] PCI: use local_pci_probe when best selected cpu is offline


> This can happen if a node is online while all its CPUs are offline
> (we can use "maxcpus=1" without "nr_cpus=1" to reproduce it). Therefore,
> in this case, we should call local_pci_probe() instead of work_on_cpu().

* Please pay more attention to text layout, in particular the use of
paragraphs.
https://elixir.bootlin.com/linux/v6.10-rc3/source/Documentation/process/maintainer-tip.rst#L128

* Please reword the change description in the imperative mood.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.10-rc3#n94

* Would you like to add the tag “Fixes” accordingly?

* What do you think about specifying the name of the affected function
in the summary phrase?


Regards,
Markus

2024-06-13 02:20:52

by Hongchen Zhang

Subject: Re: [PATCH v2] PCI: use local_pci_probe when best selected cpu is offline

Hi Markus,
Thanks for your review.

On 2024/6/13 2:08 AM, Markus Elfring wrote:
> …
>> This can happen if a node is online while all its CPUs are offline
>> (we can use "maxcpus=1" without "nr_cpus=1" to reproduce it). Therefore,
>> in this case, we should call local_pci_probe() instead of work_on_cpu().
>
> * Please pay more attention to text layout, in particular the use of
> paragraphs.
> https://elixir.bootlin.com/linux/v6.10-rc3/source/Documentation/process/maintainer-tip.rst#L128
OK, let me rewrite the commit message.
> * Please reword the change description in the imperative mood.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.10-rc3#n94
OK, let me use imperative wording.
> * Would you like to add the tag “Fixes” accordingly?
OK, let me add a Fixes tag.
> * What do you think about specifying the name of the affected function
> in the summary phrase?
OK, let me add the affected function to the summary phrase.

>
> Regards,
> Markus
>


--
Best Regards
Hongchen Zhang