2012-10-28 15:57:39

by Roland Stigge

[permalink] [raw]
Subject: irq_set_chained_handler() called too early for hwirq to be initialized

Hi,

consider arch/arm/mach-lpc32xx/irq.c: irq_set_chained_handler() is
called at a point where it accesses
irq_to_desc(IRQ_LPC32XX_SUB2IRQ)->irq_data.hwirq but which is not yet
initialized.

(This bug just surfaced on lpc32xx when the chained interrupt controller
SIC2 wasn't working. SIC1 does, but just by chance: The uninitialized
value 0 is just coincidentally the correct one.)

...->hwirq is actually defined only later on in lpc32xx_init_irq() at
irq_domain_add_legacy(). Ideally, I would just move the
irq_set_chained_handler() calls to after of_irq_init() and
irq_domain_add_legacy(). Is this OK or does this produce any race condition?

Thanks in advance,

Roland


2012-10-28 17:35:03

by Thomas Gleixner

[permalink] [raw]
Subject: Re: irq_set_chained_handler() called too early for hwirq to be initialized

On Sun, 28 Oct 2012, Roland Stigge wrote:
>
> consider arch/arm/mach-lpc32xx/irq.c: irq_set_chained_handler() is
> called at a point where it accesses
> irq_to_desc(IRQ_LPC32XX_SUB2IRQ)->irq_data.hwirq but which is not yet
> initialized.

None of the functions which are called inside of
irq_set_chained_handler() touches desc->irq_data.hwirq.

So what are you talking about?

I just had a look into that lpc32xx irq code. At the end of
lpc32xx_init_irq():

irq_base = irq_alloc_descs(-1, 0, NR_IRQS, 0);
if (irq_base < 0) {
pr_warn("Cannot allocate irq_descs, assuming pre-allocated\n");
irq_base = 0;
}

That's just hilarious.

Of course are the interrupts preallocated, simply because
machine_desc->nr_irqs is 0 and therefor the ARM core code allocates
NR_IRQS irq descriptors in the early setup way before
lpc32xx_init_irq() is called.

If those interrupts would not be preallocated, then the code would
fail to initialize any interrupt at all. And of course nothing would
notice as all function calls to set_irq_* do not check the return
value.

That brilliant thing came in via commit f5c42271 (ARM: LPC32xx: Device
tree support). I have no idea from where you copied that and why you
thought putting it at the end of the init function would be a good
idea.

Though, that has nothing todo with your problem description above.

Thanks,

tglx

2012-10-28 18:37:17

by Roland Stigge

[permalink] [raw]
Subject: Re: irq_set_chained_handler() called too early for hwirq to be initialized

On 28/10/12 18:34, Thomas Gleixner wrote:
> On Sun, 28 Oct 2012, Roland Stigge wrote:
>> consider arch/arm/mach-lpc32xx/irq.c: irq_set_chained_handler() is
>> called at a point where it accesses
>> irq_to_desc(IRQ_LPC32XX_SUB2IRQ)->irq_data.hwirq but which is not yet
>> initialized.
>
> None of the functions which are called inside of
> irq_set_chained_handler() touches desc->irq_data.hwirq.
>
> So what are you talking about?

Via the call trace:

irq_set_chained_handler()
-> __irq_set_handler()
-> irq_startup()
-> irq_enable()
-> desc->irq_data.chip->irq_unmask()

The code path comes back to irq.c's lpc32xx_unmask_irq() which reads the
above described ->hwirq which is only later initialized on
irq_domain_add_legacy(). Hope this explains my above short description.

> Of course are the interrupts preallocated, simply because
> machine_desc->nr_irqs is 0 and therefor the ARM core code allocates
> NR_IRQS irq descriptors in the early setup way before
> lpc32xx_init_irq() is called.

OK, will remove the call to irq_alloc_descs() since it is superfluous.

Still, my question remains if I can move the irq_set_chained_handler()
calls to after of_irq_init() and irq_domain_add_legacy() since only the
latter initializes hwirq.

> If those interrupts would not be preallocated, then the code would
> fail to initialize any interrupt at all. And of course nothing would
> notice as all function calls to set_irq_* do not check the return
> value.

Do you mean mach-lpc32xx/irq.c's calls to set_irq_* not checking the
return values? Maybe because those are declared "void"?

static inline void
irq_set_chained_handler(unsigned int irq, irq_flow_handler_t handle);
void set_irq_flags(unsigned int irq, unsigned int iflags);
static inline void irq_set_chip_and_handler(unsigned int irq,
struct irq_chip *chip,
irq_flow_handler_t handle);

Or did I misunderstand sth.?

Thanks in advance,

Roland

2012-10-28 18:46:22

by Thomas Gleixner

[permalink] [raw]
Subject: Re: irq_set_chained_handler() called too early for hwirq to be initialized

On Sun, 28 Oct 2012, Roland Stigge wrote:
> On 28/10/12 18:34, Thomas Gleixner wrote:
> > On Sun, 28 Oct 2012, Roland Stigge wrote:
> >> consider arch/arm/mach-lpc32xx/irq.c: irq_set_chained_handler() is
> >> called at a point where it accesses
> >> irq_to_desc(IRQ_LPC32XX_SUB2IRQ)->irq_data.hwirq but which is not yet
> >> initialized.
> >
> > None of the functions which are called inside of
> > irq_set_chained_handler() touches desc->irq_data.hwirq.
> >
> > So what are you talking about?
>
> Via the call trace:
>
> irq_set_chained_handler()
> -> __irq_set_handler()
> -> irq_startup()
> -> irq_enable()
> -> desc->irq_data.chip->irq_unmask()

That's right. Though your description was so vague that I really could
not figure that out.

> Still, my question remains if I can move the irq_set_chained_handler()
> calls to after of_irq_init() and irq_domain_add_legacy() since only the
> latter initializes hwirq.

That's not a question. You MUST do that. And there is no reason why
you can't.

> > If those interrupts would not be preallocated, then the code would
> > fail to initialize any interrupt at all. And of course nothing would
> > notice as all function calls to set_irq_* do not check the return
> > value.
>
> Do you mean mach-lpc32xx/irq.c's calls to set_irq_* not checking the
> return values? Maybe because those are declared "void"?
>
> static inline void
> irq_set_chained_handler(unsigned int irq, irq_flow_handler_t handle);
> void set_irq_flags(unsigned int irq, unsigned int iflags);
> static inline void irq_set_chip_and_handler(unsigned int irq,
> struct irq_chip *chip,
> irq_flow_handler_t handle);

Bah. Yes. We should change that or at least put a warning into the
core code.

Thanks,

tglx