2020-05-01 22:42:59

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v2 0/2] PCI: Check for platform_get_irq() failure consistently

From: Bjorn Helgaas <[email protected]>

All callers of platform_get_irq() and related functions interpret a
negative return value as an error. A few also interpret zero as an error.

platform_get_irq() should return either a negative error number or a valid
non-zero IRQ, so there's no need to check for zero.

This series:

- Extends the platform_get_irq() function comment to say it returns a
non-zero IRQ number or a negative error number.

- Adds a WARN() if platform_get_irq() ever *does* return zero (this would
be a bug in the underlying arch code, and most callers are not prepared
for this).

- Updates drivers/pci/ to check consistently using "irq < 0".

This is based on Aman's series [1]. I propose to merge this via the PCI
tree, given acks from Greg and Thomas.

[1] https://lore.kernel.org/r/[email protected]

Aman Sharma (1):
PCI: Check for platform_get_irq() failure consistently

Bjorn Helgaas (1):
driver core: platform: Clarify that IRQ 0 is invalid

drivers/base/platform.c | 40 ++++++++++++-------
drivers/pci/controller/dwc/pci-imx6.c | 4 +-
drivers/pci/controller/dwc/pcie-tegra194.c | 4 +-
.../controller/mobiveil/pcie-mobiveil-host.c | 4 +-
drivers/pci/controller/pci-aardvark.c | 3 ++
drivers/pci/controller/pci-v3-semi.c | 4 +-
drivers/pci/controller/pcie-mediatek.c | 3 ++
drivers/pci/controller/pcie-tango.c | 4 +-
8 files changed, 41 insertions(+), 25 deletions(-)


base-commit: 8f3d9f354286745c751374f5f1fcafee6b3f3136
--
2.25.1


2020-05-01 22:43:18

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v2 1/2] driver core: platform: Clarify that IRQ 0 is invalid

From: Bjorn Helgaas <[email protected]>

These interfaces return a negative error number or an IRQ:

platform_get_irq()
platform_get_irq_optional()
platform_get_irq_byname()
platform_get_irq_byname_optional()

The function comments suggest checking for error like this:

irq = platform_get_irq(...);
if (irq < 0)
return irq;

which is what most callers (~900 of 1400) do, so it's implicit that IRQ 0
is invalid. But some callers check for "irq <= 0", and it's not obvious
from the source that we never return an IRQ 0.

Make this more explicit by updating the comments to say that an IRQ number
is always non-zero and adding a WARN() if we ever do return zero. If we do
return IRQ 0, it likely indicates a bug in the arch-specific parts of
platform_get_irq().

Relevant prior discussion at [1, 2].

[1] https://lore.kernel.org/r/[email protected]/
[2] https://lore.kernel.org/r/[email protected]/
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/base/platform.c | 40 +++++++++++++++++++++++++---------------
1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 5255550b7c34..084cf1d23d3f 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -152,23 +152,24 @@ EXPORT_SYMBOL_GPL(devm_platform_ioremap_resource_byname);
* if (irq < 0)
* return irq;
*
- * Return: IRQ number on success, negative error number on failure.
+ * Return: non-zero IRQ number on success, negative error number on failure.
*/
int platform_get_irq_optional(struct platform_device *dev, unsigned int num)
{
+ int ret;
#ifdef CONFIG_SPARC
/* sparc does not have irqs represented as IORESOURCE_IRQ resources */
if (!dev || num >= dev->archdata.num_irqs)
return -ENXIO;
- return dev->archdata.irqs[num];
+ ret = dev->archdata.irqs[num];
+ goto out;
#else
struct resource *r;
- int ret;

if (IS_ENABLED(CONFIG_OF_IRQ) && dev->dev.of_node) {
ret = of_irq_get(dev->dev.of_node, num);
if (ret > 0 || ret == -EPROBE_DEFER)
- return ret;
+ goto out;
}

r = platform_get_resource(dev, IORESOURCE_IRQ, num);
@@ -176,7 +177,7 @@ int platform_get_irq_optional(struct platform_device *dev, unsigned int num)
if (r && r->flags & IORESOURCE_DISABLED) {
ret = acpi_irq_get(ACPI_HANDLE(&dev->dev), num, r);
if (ret)
- return ret;
+ goto out;
}
}

@@ -190,13 +191,17 @@ int platform_get_irq_optional(struct platform_device *dev, unsigned int num)
struct irq_data *irqd;

irqd = irq_get_irq_data(r->start);
- if (!irqd)
- return -ENXIO;
+ if (!irqd) {
+ ret = -ENXIO;
+ goto out;
+ }
irqd_set_trigger_type(irqd, r->flags & IORESOURCE_BITS);
}

- if (r)
- return r->start;
+ if (r) {
+ ret = r->start;
+ goto out;
+ }

/*
* For the index 0 interrupt, allow falling back to GpioInt
@@ -209,11 +214,14 @@ int platform_get_irq_optional(struct platform_device *dev, unsigned int num)
ret = acpi_dev_gpio_irq_get(ACPI_COMPANION(&dev->dev), num);
/* Our callers expect -ENXIO for missing IRQs. */
if (ret >= 0 || ret == -EPROBE_DEFER)
- return ret;
+ goto out;
}

- return -ENXIO;
+ ret = -ENXIO;
#endif
+out:
+ WARN(ret == 0, "0 is an invalid IRQ number\n");
+ return ret;
}
EXPORT_SYMBOL_GPL(platform_get_irq_optional);

@@ -231,7 +239,7 @@ EXPORT_SYMBOL_GPL(platform_get_irq_optional);
* if (irq < 0)
* return irq;
*
- * Return: IRQ number on success, negative error number on failure.
+ * Return: non-zero IRQ number on success, negative error number on failure.
*/
int platform_get_irq(struct platform_device *dev, unsigned int num)
{
@@ -303,8 +311,10 @@ static int __platform_get_irq_byname(struct platform_device *dev,
}

r = platform_get_resource_byname(dev, IORESOURCE_IRQ, name);
- if (r)
+ if (r) {
+ WARN(r->start == 0, "0 is an invalid IRQ number\n");
return r->start;
+ }

return -ENXIO;
}
@@ -316,7 +326,7 @@ static int __platform_get_irq_byname(struct platform_device *dev,
*
* Get an IRQ like platform_get_irq(), but then by name rather then by index.
*
- * Return: IRQ number on success, negative error number on failure.
+ * Return: non-zero IRQ number on success, negative error number on failure.
*/
int platform_get_irq_byname(struct platform_device *dev, const char *name)
{
@@ -338,7 +348,7 @@ EXPORT_SYMBOL_GPL(platform_get_irq_byname);
* Get an optional IRQ by name like platform_get_irq_byname(). Except that it
* does not print an error message if an IRQ can not be obtained.
*
- * Return: IRQ number on success, negative error number on failure.
+ * Return: non-zero IRQ number on success, negative error number on failure.
*/
int platform_get_irq_byname_optional(struct platform_device *dev,
const char *name)
--
2.25.1

2020-05-01 22:44:02

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v2 2/2] PCI: Check for platform_get_irq() failure consistently

From: Aman Sharma <[email protected]>

The platform_get_irq*() interfaces return either a negative error number or
a valid IRQ. 0 is not a valid return value, so check for "< 0" to detect
failure as recommended by the function documentation.

On failure, return the error number from platform_get_irq*() instead of
making up a new one.

Link: https://lore.kernel.org/r/[email protected]
[bhelgaas: commit log, squash into one patch]
Signed-off-by: Aman Sharma <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Cc: Richard Zhu <[email protected]>
Cc: Lucas Stach <[email protected]>
Cc: Thierry Reding <[email protected]>
Cc: Karthikeyan Mitran <[email protected]>
Cc: Hou Zhiqiang <[email protected]>
Cc: Thomas Petazzoni <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: Ryder Lee <[email protected]>
Cc: Marc Gonzalez <[email protected]>
---
drivers/pci/controller/dwc/pci-imx6.c | 4 ++--
drivers/pci/controller/dwc/pcie-tegra194.c | 4 ++--
drivers/pci/controller/mobiveil/pcie-mobiveil-host.c | 4 ++--
drivers/pci/controller/pci-aardvark.c | 3 +++
drivers/pci/controller/pci-v3-semi.c | 4 ++--
drivers/pci/controller/pcie-mediatek.c | 3 +++
drivers/pci/controller/pcie-tango.c | 4 ++--
7 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
index acfbd34032a8..8f08ae53f53e 100644
--- a/drivers/pci/controller/dwc/pci-imx6.c
+++ b/drivers/pci/controller/dwc/pci-imx6.c
@@ -868,9 +868,9 @@ static int imx6_add_pcie_port(struct imx6_pcie *imx6_pcie,

if (IS_ENABLED(CONFIG_PCI_MSI)) {
pp->msi_irq = platform_get_irq_byname(pdev, "msi");
- if (pp->msi_irq <= 0) {
+ if (pp->msi_irq < 0) {
dev_err(dev, "failed to get MSI irq\n");
- return -ENODEV;
+ return pp->msi_irq;
}
}

diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
index ae30a2fd3716..f1f945cc7bcb 100644
--- a/drivers/pci/controller/dwc/pcie-tegra194.c
+++ b/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -2190,9 +2190,9 @@ static int tegra_pcie_dw_probe(struct platform_device *pdev)
}

pp->irq = platform_get_irq_byname(pdev, "intr");
- if (!pp->irq) {
+ if (pp->irq < 0) {
dev_err(dev, "Failed to get \"intr\" interrupt\n");
- return -ENODEV;
+ return pp->irq;
}

pcie->bpmp = tegra_bpmp_get(dev);
diff --git a/drivers/pci/controller/mobiveil/pcie-mobiveil-host.c b/drivers/pci/controller/mobiveil/pcie-mobiveil-host.c
index a94be264240f..5907baa9b1f2 100644
--- a/drivers/pci/controller/mobiveil/pcie-mobiveil-host.c
+++ b/drivers/pci/controller/mobiveil/pcie-mobiveil-host.c
@@ -522,9 +522,9 @@ static int mobiveil_pcie_integrated_interrupt_init(struct mobiveil_pcie *pcie)
mobiveil_pcie_enable_msi(pcie);

rp->irq = platform_get_irq(pdev, 0);
- if (rp->irq <= 0) {
+ if (rp->irq < 0) {
dev_err(dev, "failed to map IRQ: %d\n", rp->irq);
- return -ENODEV;
+ return rp->irq;
}

/* initialize the IRQ domains */
diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 2a20b649f40c..40a4257f0df1 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -973,6 +973,9 @@ static int advk_pcie_probe(struct platform_device *pdev)
return PTR_ERR(pcie->base);

irq = platform_get_irq(pdev, 0);
+ if (irq < 0)
+ return irq;
+
ret = devm_request_irq(dev, irq, advk_pcie_irq_handler,
IRQF_SHARED | IRQF_NO_THREAD, "advk-pcie",
pcie);
diff --git a/drivers/pci/controller/pci-v3-semi.c b/drivers/pci/controller/pci-v3-semi.c
index bd05221f5a22..a5bf945d2eda 100644
--- a/drivers/pci/controller/pci-v3-semi.c
+++ b/drivers/pci/controller/pci-v3-semi.c
@@ -777,9 +777,9 @@ static int v3_pci_probe(struct platform_device *pdev)

/* Get and request error IRQ resource */
irq = platform_get_irq(pdev, 0);
- if (irq <= 0) {
+ if (irq < 0) {
dev_err(dev, "unable to obtain PCIv3 error IRQ\n");
- return -ENODEV;
+ return irq;
}
ret = devm_request_irq(dev, irq, v3_irq, 0,
"PCIv3 error", v3);
diff --git a/drivers/pci/controller/pcie-mediatek.c b/drivers/pci/controller/pcie-mediatek.c
index cb982891b22b..ebfa7d5a4e2d 100644
--- a/drivers/pci/controller/pcie-mediatek.c
+++ b/drivers/pci/controller/pcie-mediatek.c
@@ -651,6 +651,9 @@ static int mtk_pcie_setup_irq(struct mtk_pcie_port *port,
}

port->irq = platform_get_irq(pdev, port->slot);
+ if (port->irq < 0)
+ return port->irq;
+
irq_set_chained_handler_and_data(port->irq,
mtk_pcie_intr_handler, port);

diff --git a/drivers/pci/controller/pcie-tango.c b/drivers/pci/controller/pcie-tango.c
index 21a208da3f59..18c2c4313eb5 100644
--- a/drivers/pci/controller/pcie-tango.c
+++ b/drivers/pci/controller/pcie-tango.c
@@ -273,9 +273,9 @@ static int tango_pcie_probe(struct platform_device *pdev)
writel_relaxed(0, pcie->base + SMP8759_ENABLE + offset);

virq = platform_get_irq(pdev, 1);
- if (virq <= 0) {
+ if (virq < 0) {
dev_err(dev, "Failed to map IRQ\n");
- return -ENXIO;
+ return virq;
}

irq_dom = irq_domain_create_linear(fwnode, MSI_MAX, &dom_ops, pcie);
--
2.25.1

2020-05-02 06:19:24

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] driver core: platform: Clarify that IRQ 0 is invalid

On Fri, May 01, 2020 at 05:40:41PM -0500, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <[email protected]>
>
> These interfaces return a negative error number or an IRQ:
>
> platform_get_irq()
> platform_get_irq_optional()
> platform_get_irq_byname()
> platform_get_irq_byname_optional()
>
> The function comments suggest checking for error like this:
>
> irq = platform_get_irq(...);
> if (irq < 0)
> return irq;
>
> which is what most callers (~900 of 1400) do, so it's implicit that IRQ 0
> is invalid. But some callers check for "irq <= 0", and it's not obvious
> from the source that we never return an IRQ 0.
>
> Make this more explicit by updating the comments to say that an IRQ number
> is always non-zero and adding a WARN() if we ever do return zero. If we do
> return IRQ 0, it likely indicates a bug in the arch-specific parts of
> platform_get_irq().

I worry about adding WARN() as there are systems that do panic_on_warn()
and syzbot trips over this as well. I don't think that for this issue
it would be a problem, but what really is this warning about that
someone could do anything with?

Other than that minor thing, this looks good to me, thanks for finally
clearing this up.

greg k-h

2020-05-04 18:13:16

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] driver core: platform: Clarify that IRQ 0 is invalid

On Sat, May 02, 2020 at 08:15:37AM +0200, Greg Kroah-Hartman wrote:
> On Fri, May 01, 2020 at 05:40:41PM -0500, Bjorn Helgaas wrote:
> > From: Bjorn Helgaas <[email protected]>
> >
> > These interfaces return a negative error number or an IRQ:
> >
> > platform_get_irq()
> > platform_get_irq_optional()
> > platform_get_irq_byname()
> > platform_get_irq_byname_optional()
> >
> > The function comments suggest checking for error like this:
> >
> > irq = platform_get_irq(...);
> > if (irq < 0)
> > return irq;
> >
> > which is what most callers (~900 of 1400) do, so it's implicit that IRQ 0
> > is invalid. But some callers check for "irq <= 0", and it's not obvious
> > from the source that we never return an IRQ 0.
> >
> > Make this more explicit by updating the comments to say that an IRQ number
> > is always non-zero and adding a WARN() if we ever do return zero. If we do
> > return IRQ 0, it likely indicates a bug in the arch-specific parts of
> > platform_get_irq().
>
> I worry about adding WARN() as there are systems that do panic_on_warn()
> and syzbot trips over this as well. I don't think that for this issue
> it would be a problem, but what really is this warning about that
> someone could do anything with?
>
> Other than that minor thing, this looks good to me, thanks for finally
> clearing this up.

What I'm concerned about is an arch that returns 0. Most drivers
don't check for 0 so they'll just try to use it, and things will fail
in some obscure way. My assumption is that if there really is no IRQ,
we should return -ENOENT or similar instead of 0.

I could be convinced that it's not worth warning about at all, or we
could do something like the following:

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 084cf1d23d3f..4afa5875e14d 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -220,7 +220,11 @@ int platform_get_irq_optional(struct platform_device *dev, unsigned int num)
ret = -ENXIO;
#endif
out:
- WARN(ret == 0, "0 is an invalid IRQ number\n");
+ /* Returning zero here is likely a bug in the arch IRQ code */
+ if (ret == 0) {
+ pr_warn("0 is an invalid IRQ number\n");
+ dump_stack();
+ }
return ret;
}
EXPORT_SYMBOL_GPL(platform_get_irq_optional);
@@ -312,7 +316,11 @@ static int __platform_get_irq_byname(struct platform_device *dev,

r = platform_get_resource_byname(dev, IORESOURCE_IRQ, name);
if (r) {
- WARN(r->start == 0, "0 is an invalid IRQ number\n");
+ /* Returning zero here is likely a bug in the arch IRQ code */
+ if (r->start == 0) {
+ pr_warn("0 is an invalid IRQ number\n");
+ dump_stack();
+ }
return r->start;
}

2020-05-04 19:09:24

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] driver core: platform: Clarify that IRQ 0 is invalid

On Mon, May 04, 2020 at 01:08:22PM -0500, Bjorn Helgaas wrote:
> On Sat, May 02, 2020 at 08:15:37AM +0200, Greg Kroah-Hartman wrote:
> > On Fri, May 01, 2020 at 05:40:41PM -0500, Bjorn Helgaas wrote:
> > > From: Bjorn Helgaas <[email protected]>
> > >
> > > These interfaces return a negative error number or an IRQ:
> > >
> > > platform_get_irq()
> > > platform_get_irq_optional()
> > > platform_get_irq_byname()
> > > platform_get_irq_byname_optional()
> > >
> > > The function comments suggest checking for error like this:
> > >
> > > irq = platform_get_irq(...);
> > > if (irq < 0)
> > > return irq;
> > >
> > > which is what most callers (~900 of 1400) do, so it's implicit that IRQ 0
> > > is invalid. But some callers check for "irq <= 0", and it's not obvious
> > > from the source that we never return an IRQ 0.
> > >
> > > Make this more explicit by updating the comments to say that an IRQ number
> > > is always non-zero and adding a WARN() if we ever do return zero. If we do
> > > return IRQ 0, it likely indicates a bug in the arch-specific parts of
> > > platform_get_irq().
> >
> > I worry about adding WARN() as there are systems that do panic_on_warn()
> > and syzbot trips over this as well. I don't think that for this issue
> > it would be a problem, but what really is this warning about that
> > someone could do anything with?
> >
> > Other than that minor thing, this looks good to me, thanks for finally
> > clearing this up.
>
> What I'm concerned about is an arch that returns 0. Most drivers
> don't check for 0 so they'll just try to use it, and things will fail
> in some obscure way. My assumption is that if there really is no IRQ,
> we should return -ENOENT or similar instead of 0.
>
> I could be convinced that it's not worth warning about at all, or we
> could do something like the following:
>
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 084cf1d23d3f..4afa5875e14d 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -220,7 +220,11 @@ int platform_get_irq_optional(struct platform_device *dev, unsigned int num)
> ret = -ENXIO;
> #endif
> out:
> - WARN(ret == 0, "0 is an invalid IRQ number\n");
> + /* Returning zero here is likely a bug in the arch IRQ code */
> + if (ret == 0) {
> + pr_warn("0 is an invalid IRQ number\n");
> + dump_stack();
> + }
> return ret;
> }
> EXPORT_SYMBOL_GPL(platform_get_irq_optional);
> @@ -312,7 +316,11 @@ static int __platform_get_irq_byname(struct platform_device *dev,
>
> r = platform_get_resource_byname(dev, IORESOURCE_IRQ, name);
> if (r) {
> - WARN(r->start == 0, "0 is an invalid IRQ number\n");
> + /* Returning zero here is likely a bug in the arch IRQ code */
> + if (r->start == 0) {
> + pr_warn("0 is an invalid IRQ number\n");
> + dump_stack();
> + }
> return r->start;
> }
>

I like that, but you said this is something that the platform people
should only see when bringing up a new system, so maybe the WARN() is
fine. It's not user-triggerable, so your original is ok.

sorry for the noise,

greg k-h

2020-05-04 22:31:13

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] driver core: platform: Clarify that IRQ 0 is invalid

On Mon, May 04, 2020 at 09:07:21PM +0200, Greg Kroah-Hartman wrote:
> On Mon, May 04, 2020 at 01:08:22PM -0500, Bjorn Helgaas wrote:
> > On Sat, May 02, 2020 at 08:15:37AM +0200, Greg Kroah-Hartman wrote:
> > > On Fri, May 01, 2020 at 05:40:41PM -0500, Bjorn Helgaas wrote:
> > > > From: Bjorn Helgaas <[email protected]>
> > > >
> > > > These interfaces return a negative error number or an IRQ:
> > > >
> > > > platform_get_irq()
> > > > platform_get_irq_optional()
> > > > platform_get_irq_byname()
> > > > platform_get_irq_byname_optional()
> > > >
> > > > The function comments suggest checking for error like this:
> > > >
> > > > irq = platform_get_irq(...);
> > > > if (irq < 0)
> > > > return irq;
> > > >
> > > > which is what most callers (~900 of 1400) do, so it's implicit
> > > > that IRQ 0 is invalid. But some callers check for "irq <= 0",
> > > > and it's not obvious from the source that we never return an
> > > > IRQ 0.
> > > >
> > > > Make this more explicit by updating the comments to say that
> > > > an IRQ number is always non-zero and adding a WARN() if we
> > > > ever do return zero. If we do return IRQ 0, it likely
> > > > indicates a bug in the arch-specific parts of
> > > > platform_get_irq().
> > >
> > > I worry about adding WARN() as there are systems that do
> > > panic_on_warn() and syzbot trips over this as well. I don't
> > > think that for this issue it would be a problem, but what really
> > > is this warning about that someone could do anything with?
> > >
> > > Other than that minor thing, this looks good to me, thanks for
> > > finally clearing this up.
> >
> > What I'm concerned about is an arch that returns 0. Most drivers
> > don't check for 0 so they'll just try to use it, and things will
> > fail in some obscure way. My assumption is that if there really
> > is no IRQ, we should return -ENOENT or similar instead of 0.
> >
> > I could be convinced that it's not worth warning about at all, or
> > we could do something like the following:
> >
> > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > index 084cf1d23d3f..4afa5875e14d 100644
> > --- a/drivers/base/platform.c
> > +++ b/drivers/base/platform.c
> > @@ -220,7 +220,11 @@ int platform_get_irq_optional(struct platform_device *dev, unsigned int num)
> > ret = -ENXIO;
> > #endif
> > out:
> > - WARN(ret == 0, "0 is an invalid IRQ number\n");
> > + /* Returning zero here is likely a bug in the arch IRQ code */
> > + if (ret == 0) {
> > + pr_warn("0 is an invalid IRQ number\n");
> > + dump_stack();
> > + }
> > return ret;
> > }
> > ...

> I like that, but you said this is something that the platform people
> should only see when bringing up a new system, so maybe the WARN() is
> fine. It's not user-triggerable, so your original is ok.

Is that an ack? Thomas, any thoughts?

I suspect we could see this given a broken DT, too, so I'm not sure
it's strictly a bringup problem.

I would probably argue that even this case would be an arch defect:
the kernel should validate data from a DT at least enough to avoid
giving a bogus, useless IRQ to a driver.

Bjorn

2020-05-05 13:37:51

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] driver core: platform: Clarify that IRQ 0 is invalid

On Mon, May 04, 2020 at 05:26:59PM -0500, Bjorn Helgaas wrote:
> On Mon, May 04, 2020 at 09:07:21PM +0200, Greg Kroah-Hartman wrote:
> > On Mon, May 04, 2020 at 01:08:22PM -0500, Bjorn Helgaas wrote:
> > > On Sat, May 02, 2020 at 08:15:37AM +0200, Greg Kroah-Hartman wrote:
> > > > On Fri, May 01, 2020 at 05:40:41PM -0500, Bjorn Helgaas wrote:
> > > > > From: Bjorn Helgaas <[email protected]>
> > > > >
> > > > > These interfaces return a negative error number or an IRQ:
> > > > >
> > > > > platform_get_irq()
> > > > > platform_get_irq_optional()
> > > > > platform_get_irq_byname()
> > > > > platform_get_irq_byname_optional()
> > > > >
> > > > > The function comments suggest checking for error like this:
> > > > >
> > > > > irq = platform_get_irq(...);
> > > > > if (irq < 0)
> > > > > return irq;
> > > > >
> > > > > which is what most callers (~900 of 1400) do, so it's implicit
> > > > > that IRQ 0 is invalid. But some callers check for "irq <= 0",
> > > > > and it's not obvious from the source that we never return an
> > > > > IRQ 0.
> > > > >
> > > > > Make this more explicit by updating the comments to say that
> > > > > an IRQ number is always non-zero and adding a WARN() if we
> > > > > ever do return zero. If we do return IRQ 0, it likely
> > > > > indicates a bug in the arch-specific parts of
> > > > > platform_get_irq().
> > > >
> > > > I worry about adding WARN() as there are systems that do
> > > > panic_on_warn() and syzbot trips over this as well. I don't
> > > > think that for this issue it would be a problem, but what really
> > > > is this warning about that someone could do anything with?
> > > >
> > > > Other than that minor thing, this looks good to me, thanks for
> > > > finally clearing this up.
> > >
> > > What I'm concerned about is an arch that returns 0. Most drivers
> > > don't check for 0 so they'll just try to use it, and things will
> > > fail in some obscure way. My assumption is that if there really
> > > is no IRQ, we should return -ENOENT or similar instead of 0.
> > >
> > > I could be convinced that it's not worth warning about at all, or
> > > we could do something like the following:
> > >
> > > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > > index 084cf1d23d3f..4afa5875e14d 100644
> > > --- a/drivers/base/platform.c
> > > +++ b/drivers/base/platform.c
> > > @@ -220,7 +220,11 @@ int platform_get_irq_optional(struct platform_device *dev, unsigned int num)
> > > ret = -ENXIO;
> > > #endif
> > > out:
> > > - WARN(ret == 0, "0 is an invalid IRQ number\n");
> > > + /* Returning zero here is likely a bug in the arch IRQ code */
> > > + if (ret == 0) {
> > > + pr_warn("0 is an invalid IRQ number\n");
> > > + dump_stack();
> > > + }
> > > return ret;
> > > }
> > > ...
>
> > I like that, but you said this is something that the platform people
> > should only see when bringing up a new system, so maybe the WARN() is
> > fine. It's not user-triggerable, so your original is ok.
>
> Is that an ack? Thomas, any thoughts?

Sorry, yes:

Acked-by: Greg Kroah-Hartman <[email protected]>

2020-05-12 12:37:47

by Linus Walleij

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] PCI: Check for platform_get_irq() failure consistently

On Sat, May 2, 2020 at 12:41 AM Bjorn Helgaas <[email protected]> wrote:

> From: Bjorn Helgaas <[email protected]>
>
> All callers of platform_get_irq() and related functions interpret a
> negative return value as an error. A few also interpret zero as an error.
>
> platform_get_irq() should return either a negative error number or a valid
> non-zero IRQ, so there's no need to check for zero.
>
> This series:
>
> - Extends the platform_get_irq() function comment to say it returns a
> non-zero IRQ number or a negative error number.
>
> - Adds a WARN() if platform_get_irq() ever *does* return zero (this would
> be a bug in the underlying arch code, and most callers are not prepared
> for this).
>
> - Updates drivers/pci/ to check consistently using "irq < 0".
>
> This is based on Aman's series [1]. I propose to merge this via the PCI
> tree, given acks from Greg and Thomas.
>
> [1] https://lore.kernel.org/r/[email protected]

Makes sense to me.
Acked-by: Linus Walleij <[email protected]>

for the series.

Yours,
Linus Walleij

2020-05-13 17:46:39

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] PCI: Check for platform_get_irq() failure consistently

On Fri, May 01, 2020 at 05:40:40PM -0500, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <[email protected]>
>
> All callers of platform_get_irq() and related functions interpret a
> negative return value as an error. A few also interpret zero as an error.
>
> platform_get_irq() should return either a negative error number or a valid
> non-zero IRQ, so there's no need to check for zero.
>
> This series:
>
> - Extends the platform_get_irq() function comment to say it returns a
> non-zero IRQ number or a negative error number.
>
> - Adds a WARN() if platform_get_irq() ever *does* return zero (this would
> be a bug in the underlying arch code, and most callers are not prepared
> for this).
>
> - Updates drivers/pci/ to check consistently using "irq < 0".
>
> This is based on Aman's series [1]. I propose to merge this via the PCI
> tree, given acks from Greg and Thomas.
>
> [1] https://lore.kernel.org/r/[email protected]
>
> Aman Sharma (1):
> PCI: Check for platform_get_irq() failure consistently
>
> Bjorn Helgaas (1):
> driver core: platform: Clarify that IRQ 0 is invalid
>
> drivers/base/platform.c | 40 ++++++++++++-------
> drivers/pci/controller/dwc/pci-imx6.c | 4 +-
> drivers/pci/controller/dwc/pcie-tegra194.c | 4 +-
> .../controller/mobiveil/pcie-mobiveil-host.c | 4 +-
> drivers/pci/controller/pci-aardvark.c | 3 ++
> drivers/pci/controller/pci-v3-semi.c | 4 +-
> drivers/pci/controller/pcie-mediatek.c | 3 ++
> drivers/pci/controller/pcie-tango.c | 4 +-
> 8 files changed, 41 insertions(+), 25 deletions(-)

Applied with acks from Greg and Linus W to pci/misc for v5.8.