2021-11-05 09:44:42

by Ben Dooks

Subject: [PATCH] irqdomain: check irq mapping against domain size

The irq translate code does not check the irq number against
the maximum a domain can handle. This can cause an OOPS if
the firmware data has been damaged in any way. Check the intspec
or fwdata against the irqdomain and return -EINVAL if over.

This is the result of a bug somewhere in the boot of a SiFive Unmatched
board where the 5th argument of the pcie node is being damaged, which
causes an OOPS in the startup code.

Signed-off-by: Ben Dooks <[email protected]>
---
kernel/irq/irqdomain.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 6284443b87ec..e61397420723 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -906,6 +906,8 @@ int irq_domain_xlate_onecell(struct irq_domain *d, struct device_node *ctrlr,
 {
 	if (WARN_ON(intsize < 1))
 		return -EINVAL;
+	if (WARN_ON(intspec[0] > d->hwirq_max))
+		return -EINVAL;
 	*out_hwirq = intspec[0];
 	*out_type = IRQ_TYPE_NONE;
 	return 0;
@@ -948,6 +950,8 @@ int irq_domain_xlate_onetwocell(struct irq_domain *d,
 {
 	if (WARN_ON(intsize < 1))
 		return -EINVAL;
+	if (WARN_ON(intspec[0] > d->hwirq_max))
+		return -EINVAL;
 	*out_hwirq = intspec[0];
 	if (intsize > 1)
 		*out_type = intspec[1] & IRQ_TYPE_SENSE_MASK;
@@ -973,6 +977,8 @@ int irq_domain_translate_onecell(struct irq_domain *d,
 {
 	if (WARN_ON(fwspec->param_count < 1))
 		return -EINVAL;
+	if (WARN_ON(fwspec->param[0] > d->hwirq_max))
+		return -EINVAL;
 	*out_hwirq = fwspec->param[0];
 	*out_type = IRQ_TYPE_NONE;
 	return 0;
@@ -994,6 +1000,8 @@ int irq_domain_translate_twocell(struct irq_domain *d,
 {
 	if (WARN_ON(fwspec->param_count < 2))
 		return -EINVAL;
+	if (WARN_ON(fwspec->param[0] > d->hwirq_max))
+		return -EINVAL;
 	*out_hwirq = fwspec->param[0];
 	*out_type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
 	return 0;
--
2.30.2


2021-11-05 12:46:50

by Marc Zyngier

Subject: Re: [PATCH] irqdomain: check irq mapping against domain size

Hi Ben,

On Fri, 05 Nov 2021 09:06:01 +0000,
Ben Dooks <[email protected]> wrote:
>
> The irq translate code does not check the irq number against
> the maximum a domain can handle. This can cause an OOPS if
> the firmware data has been damaged in any way. Check the intspec
> or fwdata against the irqdomain and return -EINVAL if over.
>
> This is the result of a bug somewhere in the boot of a SiFive Unmatched
> board where the 5th argument of the pcie node is being damaged, which
> causes an OOPS in the startup code.
>
> Signed-off-by: Ben Dooks <[email protected]>
> ---
> kernel/irq/irqdomain.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 6284443b87ec..e61397420723 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -906,6 +906,8 @@ int irq_domain_xlate_onecell(struct irq_domain *d, struct device_node *ctrlr,
>  {
>  	if (WARN_ON(intsize < 1))
>  		return -EINVAL;
> +	if (WARN_ON(intspec[0] > d->hwirq_max))
> +		return -EINVAL;

This doesn't seem right.

For a start, d->hwirq_max is 0 when the domain is backed by a radix
tree. Also, nothing says that what you read from the DT is something
that should be directly meaningful to the irqdomain. A driver could
well call into this and perform some extra processing on the data
before it lands into the irqdomain.
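
Both objections can be illustrated with a minimal user-space sketch. It is
not kernel code: `struct model_domain`, `model_xlate_onecell()`,
`model_driver_xlate()` and the offset of 32 are hypothetical names and
values made up for the illustration, and the sketch takes the statement
above (hwirq_max is 0 for a tree-backed domain) as given.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the single irq_domain field the patch reads. */
struct model_domain {
	unsigned long hwirq_max;	/* assumed 0 for a tree-backed domain */
};

/* Models a one-cell translation with the proposed unconditional check. */
static int model_xlate_onecell(const struct model_domain *d,
			       const uint32_t *intspec, unsigned int intsize,
			       unsigned long *out_hwirq)
{
	assert(d && intspec && out_hwirq);

	if (intsize < 1)
		return -EINVAL;
	if (intspec[0] > d->hwirq_max)
		return -EINVAL;

	*out_hwirq = intspec[0];
	return 0;
}

/* Hypothetical offset that a driver strips from the raw cell value. */
#define MODEL_CELL_OFFSET 32u

/*
 * Hypothetical driver wrapper: it calls the generic helper first and only
 * then converts the raw cell into the number the domain actually uses.
 */
static int model_driver_xlate(const struct model_domain *d,
			      const uint32_t *intspec, unsigned int intsize,
			      unsigned long *out_hwirq)
{
	int ret = model_xlate_onecell(d, intspec, intsize, out_hwirq);

	if (ret)
		return ret;
	if (*out_hwirq < MODEL_CELL_OFFSET)
		return -EINVAL;

	*out_hwirq -= MODEL_CELL_OFFSET;
	return 0;
}
```

In this model a domain with hwirq_max of 0 rejects every non-zero cell,
and a domain with hwirq_max of 15 rejects a raw cell of 40 even though the
wrapper would have turned it into 8.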

In general, this looks like DT validation code, and I'm not keen on
that in the core code.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2021-12-13 16:07:52

by Ben Dooks

Subject: Re: [PATCH] irqdomain: check irq mapping against domain size

On 05/11/2021 12:09, Marc Zyngier wrote:
> Hi Ben,
>
> On Fri, 05 Nov 2021 09:06:01 +0000,
> Ben Dooks <[email protected]> wrote:
>>
>> The irq translate code does not check the irq number against
>> the maximum a domain can handle. This can cause an OOPS if
>> the firmware data has been damaged in any way. Check the intspec
>> or fwdata against the irqdomain and return -EINVAL if over.
>>
>> This is the result of a bug somewhere in the boot of a SiFive Unmatched
>> board where the 5th argument of the pcie node is being damaged, which
>> causes an OOPS in the startup code.
>>
>> Signed-off-by: Ben Dooks <[email protected]>
>> ---
>> kernel/irq/irqdomain.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
>> index 6284443b87ec..e61397420723 100644
>> --- a/kernel/irq/irqdomain.c
>> +++ b/kernel/irq/irqdomain.c
>> @@ -906,6 +906,8 @@ int irq_domain_xlate_onecell(struct irq_domain *d, struct device_node *ctrlr,
>>  {
>>  	if (WARN_ON(intsize < 1))
>>  		return -EINVAL;
>> +	if (WARN_ON(intspec[0] > d->hwirq_max))
>> +		return -EINVAL;
>
> This doesn't seem right.
>
> For a start, d->hwirq_max is 0 when the domain is backed by a radix
> tree. Also, nothing says that what you read from the DT is something
> that should be directly meaningful to the irqdomain. A driver could
> well call into this and perform some extra processing on the data
> before it lands into the irqdomain.

Thanks, didn't know that.

Would doing:

+ if (WARN_ON(d->hwirq_max && intspec[0] > d->hwirq_max))
+ return -EINVAL;

be acceptable?

> In general, this looks like DT validation code, and I'm not keen on
> that in the core code.

I thought the core was probably the only place to do this. I didn't
think the DT code would know about the hardware capabilities of the
DT controller.

It seems bad that corrupted data can crash the kernel early and
unrecoverably, in a way that can only be diagnosed with specific debug
features such as early-printk enabled. Would anyone else have a way of
fixing this?

--
Ben Dooks http://www.codethink.co.uk/
Senior Engineer Codethink - Providing Genius

https://www.codethink.co.uk/privacy.html