The eoi opreation in mips_cpu_irq_controller caused chained_irq_enter
falsely consider CPU IP interrupt as a FastEOI type IRQ. So the interrupt
won't be masked during in handler. Which might lead to spurious interrupt.
Thus we simply remove eoi operation for mips_cpu_irq_controller,
Signed-off-by: Jiaxun Yang <[email protected]>
---
drivers/irqchip/irq-mips-cpu.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/irqchip/irq-mips-cpu.c b/drivers/irqchip/irq-mips-cpu.c
index 95d4fd8f7a96..0ad7f1f9a58b 100644
--- a/drivers/irqchip/irq-mips-cpu.c
+++ b/drivers/irqchip/irq-mips-cpu.c
@@ -55,7 +55,6 @@ static struct irq_chip mips_cpu_irq_controller = {
.irq_mask = mask_mips_irq,
.irq_mask_ack = mask_mips_irq,
.irq_unmask = unmask_mips_irq,
- .irq_eoi = unmask_mips_irq,
.irq_disable = mask_mips_irq,
.irq_enable = unmask_mips_irq,
};
--
2.24.1
Hi Jiaxun,
On Mon, Jan 13, 2020 at 06:12:51PM +0800, Jiaxun Yang wrote:
> The eoi opreation in mips_cpu_irq_controller caused chained_irq_enter
> falsely consider CPU IP interrupt as a FastEOI type IRQ. So the interrupt
> won't be masked during in handler. Which might lead to spurious interrupt.
>
> Thus we simply remove eoi operation for mips_cpu_irq_controller,
>
> Signed-off-by: Jiaxun Yang <[email protected]>
> ---
> drivers/irqchip/irq-mips-cpu.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/drivers/irqchip/irq-mips-cpu.c b/drivers/irqchip/irq-mips-cpu.c
> index 95d4fd8f7a96..0ad7f1f9a58b 100644
> --- a/drivers/irqchip/irq-mips-cpu.c
> +++ b/drivers/irqchip/irq-mips-cpu.c
> @@ -55,7 +55,6 @@ static struct irq_chip mips_cpu_irq_controller = {
> .irq_mask = mask_mips_irq,
> .irq_mask_ack = mask_mips_irq,
> .irq_unmask = unmask_mips_irq,
> - .irq_eoi = unmask_mips_irq,
> .irq_disable = mask_mips_irq,
> .irq_enable = unmask_mips_irq,
> };
This one scares me; something doesn't seem right. The irq_eoi (n?e eoi)
callback was first added way back in commit 1417836e81c0 ("[MIPS] use
generic_handle_irq, handle_level_irq, handle_percpu_irq"). The commit
message there states that the motivation was to allow use of
handle_percpu_irq(), and indeed handle_percpu_irq() does:
irq_ack() (ie. mask)
invoke the handler(s)
irq_eoi() (ie. unmask)
By removing the irq_eoi callback I don't see how we'd ever unmask the
interrupt again..?
Thanks,
Paul
于 2020年1月15日 GMT+08:00 上午7:30:25, Paul Burton <[email protected]> 写到:
>Hi Jiaxun,
>
>On Mon, Jan 13, 2020 at 06:12:51PM +0800, Jiaxun Yang wrote:
>> The eoi opreation in mips_cpu_irq_controller caused chained_irq_enter
>> falsely consider CPU IP interrupt as a FastEOI type IRQ. So the
>interrupt
>> won't be masked during in handler. Which might lead to spurious
>interrupt.
>>
>> Thus we simply remove eoi operation for mips_cpu_irq_controller,
>>
>> Signed-off-by: Jiaxun Yang <[email protected]>
>> ---
>> drivers/irqchip/irq-mips-cpu.c | 1 -
>> 1 file changed, 1 deletion(-)
>>
>> diff --git a/drivers/irqchip/irq-mips-cpu.c
>b/drivers/irqchip/irq-mips-cpu.c
>> index 95d4fd8f7a96..0ad7f1f9a58b 100644
>> --- a/drivers/irqchip/irq-mips-cpu.c
>> +++ b/drivers/irqchip/irq-mips-cpu.c
>> @@ -55,7 +55,6 @@ static struct irq_chip mips_cpu_irq_controller = {
>> .irq_mask = mask_mips_irq,
>> .irq_mask_ack = mask_mips_irq,
>> .irq_unmask = unmask_mips_irq,
>> - .irq_eoi = unmask_mips_irq,
>> .irq_disable = mask_mips_irq,
>> .irq_enable = unmask_mips_irq,
>> };
>
>This one scares me; something doesn't seem right. The irq_eoi (née eoi)
>callback was first added way back in commit 1417836e81c0 ("[MIPS] use
>generic_handle_irq, handle_level_irq, handle_percpu_irq"). The commit
>message there states that the motivation was to allow use of
>handle_percpu_irq(), and indeed handle_percpu_irq() does:
>
> irq_ack() (ie. mask)
> invoke the handler(s)
> irq_eoi() (ie. unmask)
>
>By removing the irq_eoi callback I don't see how we'd ever unmask the
>interrupt again..?
Hi Paul,
Sorry I didn't discover that by myself as all of my drivers are using chained handler.
So how should we deal with the chained case?
Probably we need a check in percpu IRQ handler to determine whether it's or not
level type?
>
>Thanks,
> Paul
--
Jiaxun Yang
On 2020-01-14 23:30, Paul Burton wrote:
> Hi Jiaxun,
>
> On Mon, Jan 13, 2020 at 06:12:51PM +0800, Jiaxun Yang wrote:
>> The eoi opreation in mips_cpu_irq_controller caused chained_irq_enter
>> falsely consider CPU IP interrupt as a FastEOI type IRQ. So the
>> interrupt
>> won't be masked during in handler. Which might lead to spurious
>> interrupt.
>>
>> Thus we simply remove eoi operation for mips_cpu_irq_controller,
>>
>> Signed-off-by: Jiaxun Yang <[email protected]>
>> ---
>> drivers/irqchip/irq-mips-cpu.c | 1 -
>> 1 file changed, 1 deletion(-)
>>
>> diff --git a/drivers/irqchip/irq-mips-cpu.c
>> b/drivers/irqchip/irq-mips-cpu.c
>> index 95d4fd8f7a96..0ad7f1f9a58b 100644
>> --- a/drivers/irqchip/irq-mips-cpu.c
>> +++ b/drivers/irqchip/irq-mips-cpu.c
>> @@ -55,7 +55,6 @@ static struct irq_chip mips_cpu_irq_controller = {
>> .irq_mask = mask_mips_irq,
>> .irq_mask_ack = mask_mips_irq,
>> .irq_unmask = unmask_mips_irq,
>> - .irq_eoi = unmask_mips_irq,
>> .irq_disable = mask_mips_irq,
>> .irq_enable = unmask_mips_irq,
>> };
>
> This one scares me; something doesn't seem right. The irq_eoi (née eoi)
> callback was first added way back in commit 1417836e81c0 ("[MIPS] use
> generic_handle_irq, handle_level_irq, handle_percpu_irq"). The commit
> message there states that the motivation was to allow use of
> handle_percpu_irq(), and indeed handle_percpu_irq() does:
>
> irq_ack() (ie. mask)
> invoke the handler(s)
> irq_eoi() (ie. unmask)
>
> By removing the irq_eoi callback I don't see how we'd ever unmask the
> interrupt again..?
To be completely blunt, the fact that unmask and eoi are implemented the
same way is a clear sign that this is a bit broken.
irq_eoi is used if the irqchip tracks the IRQ life-cycle in HW, and it's
not obvious that this is the case. The fact that ack is also mapped to
mask
just adds to my feeling...
M.
--
Jazz is not dead. It just smells funny...
于 2020年1月15日 GMT+08:00 下午9:40:31, Marc Zyngier <[email protected]> 写到:
>On 2020-01-14 23:30, Paul Burton wrote:
>> Hi Jiaxun,
>>
>> On Mon, Jan 13, 2020 at 06:12:51PM +0800, Jiaxun Yang wrote:
>>> The eoi opreation in mips_cpu_irq_controller caused
>chained_irq_enter
>>> falsely consider CPU IP interrupt as a FastEOI type IRQ. So the
>>> interrupt
>>> won't be masked during in handler. Which might lead to spurious
>>> interrupt.
>>>
>>> Thus we simply remove eoi operation for mips_cpu_irq_controller,
>>>
>>> Signed-off-by: Jiaxun Yang <[email protected]>
>>> ---
>>> drivers/irqchip/irq-mips-cpu.c | 1 -
>>> 1 file changed, 1 deletion(-)
>>>
>>> diff --git a/drivers/irqchip/irq-mips-cpu.c
>>> b/drivers/irqchip/irq-mips-cpu.c
>>> index 95d4fd8f7a96..0ad7f1f9a58b 100644
>>> --- a/drivers/irqchip/irq-mips-cpu.c
>>> +++ b/drivers/irqchip/irq-mips-cpu.c
>>> @@ -55,7 +55,6 @@ static struct irq_chip mips_cpu_irq_controller = {
>>> .irq_mask = mask_mips_irq,
>>> .irq_mask_ack = mask_mips_irq,
>>> .irq_unmask = unmask_mips_irq,
>>> - .irq_eoi = unmask_mips_irq,
>>> .irq_disable = mask_mips_irq,
>>> .irq_enable = unmask_mips_irq,
>>> };
>>
>> This one scares me; something doesn't seem right. The irq_eoi (née
>eoi)
>> callback was first added way back in commit 1417836e81c0 ("[MIPS] use
>> generic_handle_irq, handle_level_irq, handle_percpu_irq"). The commit
>> message there states that the motivation was to allow use of
>> handle_percpu_irq(), and indeed handle_percpu_irq() does:
>>
>> irq_ack() (ie. mask)
>> invoke the handler(s)
>> irq_eoi() (ie. unmask)
>>
>> By removing the irq_eoi callback I don't see how we'd ever unmask the
>> interrupt again..?
>
>To be completely blunt, the fact that unmask and eoi are implemented
>the
>same way is a clear sign that this is a bit broken.
>
>irq_eoi is used if the irqchip tracks the IRQ life-cycle in HW, and
>it's
>not obvious that this is the case. The fact that ack is also mapped to
>mask
It's just a kind of hack to workaround the fact that our current percpu irq handler assumed
all percpu irqs are edge triggered or fasteoi type.
However MIPS processor implemented it in level triggered way.
My solution would be add a check. If neither ack nor eoi exist for the chip,
than we assume it's level triggered and process precpu irq in mask/unmask way.
Could it be a possible option?
Thanks.
>just adds to my feeling...
>
> M.
--
Jiaxun Yang
On 2020-01-15 14:23, Jiaxun Yang wrote:
> 于 2020年1月15日 GMT+08:00 下午9:40:31, Marc Zyngier <[email protected]> 写到:
>> On 2020-01-14 23:30, Paul Burton wrote:
>>> Hi Jiaxun,
>>>
>>> On Mon, Jan 13, 2020 at 06:12:51PM +0800, Jiaxun Yang wrote:
>>>> The eoi opreation in mips_cpu_irq_controller caused
>> chained_irq_enter
>>>> falsely consider CPU IP interrupt as a FastEOI type IRQ. So the
>>>> interrupt
>>>> won't be masked during in handler. Which might lead to spurious
>>>> interrupt.
>>>>
>>>> Thus we simply remove eoi operation for mips_cpu_irq_controller,
>>>>
>>>> Signed-off-by: Jiaxun Yang <[email protected]>
>>>> ---
>>>> drivers/irqchip/irq-mips-cpu.c | 1 -
>>>> 1 file changed, 1 deletion(-)
>>>>
>>>> diff --git a/drivers/irqchip/irq-mips-cpu.c
>>>> b/drivers/irqchip/irq-mips-cpu.c
>>>> index 95d4fd8f7a96..0ad7f1f9a58b 100644
>>>> --- a/drivers/irqchip/irq-mips-cpu.c
>>>> +++ b/drivers/irqchip/irq-mips-cpu.c
>>>> @@ -55,7 +55,6 @@ static struct irq_chip mips_cpu_irq_controller = {
>>>> .irq_mask = mask_mips_irq,
>>>> .irq_mask_ack = mask_mips_irq,
>>>> .irq_unmask = unmask_mips_irq,
>>>> - .irq_eoi = unmask_mips_irq,
>>>> .irq_disable = mask_mips_irq,
>>>> .irq_enable = unmask_mips_irq,
>>>> };
>>>
>>> This one scares me; something doesn't seem right. The irq_eoi (née
>> eoi)
>>> callback was first added way back in commit 1417836e81c0 ("[MIPS] use
>>> generic_handle_irq, handle_level_irq, handle_percpu_irq"). The commit
>>> message there states that the motivation was to allow use of
>>> handle_percpu_irq(), and indeed handle_percpu_irq() does:
>>>
>>> irq_ack() (ie. mask)
>>> invoke the handler(s)
>>> irq_eoi() (ie. unmask)
>>>
>>> By removing the irq_eoi callback I don't see how we'd ever unmask the
>>> interrupt again..?
>>
>> To be completely blunt, the fact that unmask and eoi are implemented
>> the
>> same way is a clear sign that this is a bit broken.
>>
>> irq_eoi is used if the irqchip tracks the IRQ life-cycle in HW, and
>> it's
>> not obvious that this is the case. The fact that ack is also mapped to
>> mask
>
> It's just a kind of hack to workaround the fact that our current
> percpu irq handler assumed
> all percpu irqs are edge triggered or fasteoi type.
>
> However MIPS processor implemented it in level triggered way.
>
> My solution would be add a check. If neither ack nor eoi exist for the
> chip,
> than we assume it's level triggered and process precpu irq in
> mask/unmask way.
>
> Could it be a possible option?
Post the patch, and we'll discuss it.
Thanks,
M.
--
Jazz is not dead. It just smells funny...
MIPS processors implemented their IPI IRQ and CPU interrupt line
as level triggered IRQ. However, our current percpu_irq flow is trying
do it in a level triggered manner.
Thus we attempt to determine whether it is or not level triggered type by
checking if both ack and eoi operation not exist. And handle it in
mask/unmask way.
Signed-off-by: Jiaxun Yang <[email protected]>
---
kernel/irq/chip.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index b3fa2d87d2f3..4fafe572d022 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -893,6 +893,7 @@ void handle_edge_eoi_irq(struct irq_desc *desc)
void handle_percpu_irq(struct irq_desc *desc)
{
struct irq_chip *chip = irq_desc_get_chip(desc);
+ bool is_level = !chip->irq_ack && !chip->irq_eoi;
/*
* PER CPU interrupts are not serialized. Do not touch
@@ -900,12 +901,16 @@ void handle_percpu_irq(struct irq_desc *desc)
*/
__kstat_incr_irqs_this_cpu(desc);
- if (chip->irq_ack)
+ if (is_level)
+ chip->irq_mask(&desc->irq_data);
+ else if (chip->irq_ack)
chip->irq_ack(&desc->irq_data);
handle_irq_event_percpu(desc);
- if (chip->irq_eoi)
+ if (is_level)
+ chip->irq_unmask(&desc->irq_data);
+ else if (chip->irq_eoi)
chip->irq_eoi(&desc->irq_data);
}
@@ -925,6 +930,7 @@ void handle_percpu_devid_irq(struct irq_desc *desc)
struct irq_chip *chip = irq_desc_get_chip(desc);
struct irqaction *action = desc->action;
unsigned int irq = irq_desc_get_irq(desc);
+ bool is_level = !chip->irq_ack && !chip->irq_eoi;
irqreturn_t res;
/*
@@ -933,7 +939,9 @@ void handle_percpu_devid_irq(struct irq_desc *desc)
*/
__kstat_incr_irqs_this_cpu(desc);
- if (chip->irq_ack)
+ if (is_level)
+ chip->irq_mask(&desc->irq_data);
+ else if (chip->irq_ack)
chip->irq_ack(&desc->irq_data);
if (likely(action)) {
@@ -951,7 +959,9 @@ void handle_percpu_devid_irq(struct irq_desc *desc)
enabled ? " and unmasked" : "", irq, cpu);
}
- if (chip->irq_eoi)
+ if (is_level)
+ chip->irq_unmask(&desc->irq_data);
+ else if (chip->irq_eoi)
chip->irq_eoi(&desc->irq_data);
}
--
2.24.1
Actually, all MIPS processor interrupt lines are level triggered.
So providing ack/eoi operation could lead to some unexpected results,
Like chained IRQ handeler failed to mask upstream CPU IRQ.
Signed-off-by: Jiaxun Yang <[email protected]>
---
drivers/irqchip/irq-mips-cpu.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/irqchip/irq-mips-cpu.c b/drivers/irqchip/irq-mips-cpu.c
index 95d4fd8f7a96..e34f4317bb88 100644
--- a/drivers/irqchip/irq-mips-cpu.c
+++ b/drivers/irqchip/irq-mips-cpu.c
@@ -51,11 +51,9 @@ static inline void mask_mips_irq(struct irq_data *d)
static struct irq_chip mips_cpu_irq_controller = {
.name = "MIPS",
- .irq_ack = mask_mips_irq,
.irq_mask = mask_mips_irq,
.irq_mask_ack = mask_mips_irq,
.irq_unmask = unmask_mips_irq,
- .irq_eoi = unmask_mips_irq,
.irq_disable = mask_mips_irq,
.irq_enable = unmask_mips_irq,
};
@@ -112,11 +110,9 @@ static void mips_mt_send_ipi(struct irq_data *d, unsigned int cpu)
static struct irq_chip mips_mt_cpu_irq_controller = {
.name = "MIPS",
.irq_startup = mips_mt_cpu_irq_startup,
- .irq_ack = mips_mt_cpu_irq_ack,
.irq_mask = mask_mips_irq,
.irq_mask_ack = mips_mt_cpu_irq_ack,
.irq_unmask = unmask_mips_irq,
- .irq_eoi = unmask_mips_irq,
.irq_disable = mask_mips_irq,
.irq_enable = unmask_mips_irq,
#ifdef CONFIG_GENERIC_IRQ_IPI
--
2.24.1
Jiaxun Yang <[email protected]> writes:
> MIPS processors implemented their IPI IRQ and CPU interrupt line
> as level triggered IRQ. However, our current percpu_irq flow is trying
> do it in a level triggered manner.
So what are you trying to solve? The CPU uses level type and the per cpu
irq flow does the same. I can't find the problem with that.
> Thus we attempt to determine whether it is or not level triggered type by
> checking if both ack and eoi operation not exist. And handle it in
> mask/unmask way.
What? No. This is fundamentally wrong.
percpu interrupts which are used for IPIs/per CPU timers/PERF have the
the following semantics:
1) No serialization required because the interrupt cannot happen
concurrently on the same CPU
2) The interrupt handler is strictly hard interrupt context. There
can't be a threaded handler associated to it.
3) The interrupt hardware interaction is reduced to the absolute minimum
depending on the trigger type of the relevant interrupt controller.
Now #3 is the interesting one here and there are the following cases:
A) Edge type
Edge type interrupts are events which are triggered once when the
level of the interrupt line changes. Depending on the implementation
and configuration of the interrupt controller they react on either
one direction of level change (low -> high, high ->low) or on both,
They require that the interrupt is acked at the interrupt controller
_before_ the handler runs. That's required because otherwise the
device could issue a new interrupt after the initial one has been
handled and before the ack was issued. That ack would then clear the
new interrupt which would get lost and potentially starve the device
forever.
See handle_edge_irq()
Edge type interrupts are also not suitable for interrupt sharing.
Note, that MSI interrupts are edge type because the interrupt
message is only sent once when the event happens.
B) Level type
Level type interrupts react on a static level state. If the
interrupt line has the implementation/configuration defined level
then the interrupt is delivered to the CPU up to the point where the
interrupt line returns to the inactive level.
For regular operation they usually require that the interrupt is
masked before the handler runs and unmasked after the handler
completes. This is required to prevent interrupt storms in case the
handler reenables interrupts at the CPU level which is not relevant
for Linux unless the interrupt handler is threaded. For threaded
handlers the line must be masked until the thread completes.
They are lossless because if the interrupt is reactivated after
handling the initial one before the unmask of the line happens it is
immediately delivered to the CPU again after the unmask takes place.
Level interrupts can be shared lossless. (Note that interrupt
sharing is a design fail from a performance perspective because the
handling requires that all devices which share the same line need to
be polled whether they actually issued an interrupt).
Some level type controllers require an MASK/ACK sequence to work
correctly. That's often the case when the controller supports both
level and edge mode.
C) EOI type
EOI type interrupts are from their concept level interrupts.
Contrary to regular level interrupts they do not require the mask
and unmask operation to prevent an interrupt storm. The interrupt
controller guarantees not to deliver the still active interrupt up
to the point where the software issues EOI.
The EOI happens after the actual device handler has run. That's a
very efficient solution which avoids slow mask/unmask operations and
at the same time guarantees that no interrupts are lost.
So lets look how this relates to strict per cpu interrupts,
i.e. interrupts which can only be delivered to one particular CPU.
As documented above the strict percpuness removes the requirement for
locking as the CPU itself guarantees non-reentrancy for the same
interrupt as long as interrupts are disabled at the CPU level while the
interrupt is serviced.
Aside of that this also affects the required operations for interrupt
handling. None of the true per cpu interrupts requires any form of
mask/unmask operation in the handler. They only require that the
relevant ACK/EOI operations take place. So for the 3 types above this
results in the following:
A) Edge type
Issue ACK before invoking the handler
B) Level type
If the controller does not require ACK, then no action is required.
If the controller requires ACK, the ACK has to be issued before
invoking the handler.
C) EOI type
Issue EOI after invoking the handler
As the MIPS GIC is a strict level type controller which does not require
ACK or EOI operations, the irq_ack/irq_eoi callbacks are completely
bogus.
There is also no need to change anything in the core code percpu handler
implementations.
All you need to do is to remove the bogus irq_ack/irq_eoi function
pointers from the MIPS GIC irq_chip and be done with it.
Thanks,
tglx
17.01.2020, 09:29, "Thomas Gleixner" <[email protected]>:
> Jiaxun Yang <[email protected]> writes:
>> MIPS processors implemented their IPI IRQ and CPU interrupt line
>> as level triggered IRQ. However, our current percpu_irq flow is trying
>> do it in a level triggered manner.
>
Hi Thomas,
Thanks for your kind explanation.
That appears to be my misunderstanding of the trigger type.
Paul, I have confirmed it seems fine to handle percpu IRQ without mask
it on both Ingenic and Loongson processors. How about other MIPS Cores?
Could you please help check that?
Thanks.
--
Jiaxun Yang