2021-07-24 22:48:01

by Linus Walleij

[permalink] [raw]
Subject: [PATCH 2/2] clocksource/drivers/fttmr010: Be stricter on IRQs

Make sure we check that the right interrupt occurred before
calling the event handler for timer 1. Report spurious IRQs
as IRQ_NONE.

Cc: Cédric Le Goater <[email protected]>
Cc: Joel Stanley <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
---
drivers/clocksource/timer-fttmr010.c | 32 +++++++++++++++++++++-------
1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/clocksource/timer-fttmr010.c b/drivers/clocksource/timer-fttmr010.c
index 126fb1f259b2..de29d424ec95 100644
--- a/drivers/clocksource/timer-fttmr010.c
+++ b/drivers/clocksource/timer-fttmr010.c
@@ -253,20 +253,36 @@ static int fttmr010_timer_set_periodic(struct clock_event_device *evt)
*/
static irqreturn_t fttmr010_timer_interrupt(int irq, void *dev_id)
{
- struct clock_event_device *evt = dev_id;
+ struct fttmr010 *fttmr010 = dev_id;
+ struct clock_event_device *evt = &fttmr010->clkevt;
+ u32 val;
+
+ val = readl(fttmr010->base + TIMER_INTR_STATE);
+ if (val & (TIMER_1_INT_MATCH1 | TIMER_1_INT_OVERFLOW))
+ evt->event_handler(evt);
+ else
+ /* Spurious IRQ */
+ return IRQ_NONE;

- evt->event_handler(evt);
return IRQ_HANDLED;
}

static irqreturn_t ast2600_timer_interrupt(int irq, void *dev_id)
{
- struct clock_event_device *evt = dev_id;
- struct fttmr010 *fttmr010 = to_fttmr010(evt);
+ struct fttmr010 *fttmr010 = dev_id;
+ struct clock_event_device *evt = &fttmr010->clkevt;
+ u32 val;

- writel(0x1, fttmr010->base + TIMER_INTR_STATE);
+ val = readl(fttmr010->base + TIMER_INTR_STATE);
+ if (val & (TIMER_1_INT_MATCH1 | TIMER_1_INT_OVERFLOW)) {
+ writel(TIMER_1_INT_MATCH1, fttmr010->base + TIMER_INTR_STATE);
+ evt->event_handler(evt);
+ } else {
+ /* Just clear any spurious IRQs from the block */
+ writel(val, fttmr010->base + TIMER_INTR_STATE);
+ return IRQ_NONE;
+ }

- evt->event_handler(evt);
return IRQ_HANDLED;
}

@@ -384,12 +400,12 @@ static int __init fttmr010_common_init(struct device_node *np,
fttmr010->timer_shutdown = ast2600_timer_shutdown;
ret = request_irq(irq, ast2600_timer_interrupt,
IRQF_TIMER, "FTTMR010-TIMER1",
- &fttmr010->clkevt);
+ fttmr010);
} else {
fttmr010->timer_shutdown = fttmr010_timer_shutdown;
ret = request_irq(irq, fttmr010_timer_interrupt,
IRQF_TIMER, "FTTMR010-TIMER1",
- &fttmr010->clkevt);
+ fttmr010);
}
if (ret) {
pr_err("FTTMR010-TIMER1 no IRQ\n");
--
2.31.1


2021-08-20 08:55:01

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH 2/2] clocksource/drivers/fttmr010: Be stricter on IRQs

On 25/07/2021 00:44, Linus Walleij wrote:
> Make sure we check that the right interrupt occurred before
> calling the event handler for timer 1. Report spurious IRQs
> as IRQ_NONE.

Does it not mean there is something wrong with the initial setup ?


> Cc: Cédric Le Goater <[email protected]>
> Cc: Joel Stanley <[email protected]>
> Signed-off-by: Linus Walleij <[email protected]>
> ---
> drivers/clocksource/timer-fttmr010.c | 32 +++++++++++++++++++++-------
> 1 file changed, 24 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/clocksource/timer-fttmr010.c b/drivers/clocksource/timer-fttmr010.c
> index 126fb1f259b2..de29d424ec95 100644
> --- a/drivers/clocksource/timer-fttmr010.c
> +++ b/drivers/clocksource/timer-fttmr010.c
> @@ -253,20 +253,36 @@ static int fttmr010_timer_set_periodic(struct clock_event_device *evt)
> */
> static irqreturn_t fttmr010_timer_interrupt(int irq, void *dev_id)
> {
> - struct clock_event_device *evt = dev_id;
> + struct fttmr010 *fttmr010 = dev_id;
> + struct clock_event_device *evt = &fttmr010->clkevt;
> + u32 val;
> +
> + val = readl(fttmr010->base + TIMER_INTR_STATE);
> + if (val & (TIMER_1_INT_MATCH1 | TIMER_1_INT_OVERFLOW))
> + evt->event_handler(evt);
> + else
> + /* Spurious IRQ */
> + return IRQ_NONE;
>
> - evt->event_handler(evt);
> return IRQ_HANDLED;
> }
>
> static irqreturn_t ast2600_timer_interrupt(int irq, void *dev_id)
> {
> - struct clock_event_device *evt = dev_id;
> - struct fttmr010 *fttmr010 = to_fttmr010(evt);
> + struct fttmr010 *fttmr010 = dev_id;
> + struct clock_event_device *evt = &fttmr010->clkevt;
> + u32 val;
>
> - writel(0x1, fttmr010->base + TIMER_INTR_STATE);
> + val = readl(fttmr010->base + TIMER_INTR_STATE);
> + if (val & (TIMER_1_INT_MATCH1 | TIMER_1_INT_OVERFLOW)) {
> + writel(TIMER_1_INT_MATCH1, fttmr010->base + TIMER_INTR_STATE);
> + evt->event_handler(evt);
> + } else {
> + /* Just clear any spurious IRQs from the block */
> + writel(val, fttmr010->base + TIMER_INTR_STATE);
> + return IRQ_NONE;
> + }
>
> - evt->event_handler(evt);
> return IRQ_HANDLED;
> }
>
> @@ -384,12 +400,12 @@ static int __init fttmr010_common_init(struct device_node *np,
> fttmr010->timer_shutdown = ast2600_timer_shutdown;
> ret = request_irq(irq, ast2600_timer_interrupt,
> IRQF_TIMER, "FTTMR010-TIMER1",
> - &fttmr010->clkevt);
> + fttmr010);
> } else {
> fttmr010->timer_shutdown = fttmr010_timer_shutdown;
> ret = request_irq(irq, fttmr010_timer_interrupt,
> IRQF_TIMER, "FTTMR010-TIMER1",
> - &fttmr010->clkevt);
> + fttmr010);
> }
> if (ret) {
> pr_err("FTTMR010-TIMER1 no IRQ\n");
>


--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

2021-08-20 13:42:31

by Linus Walleij

[permalink] [raw]
Subject: Re: [PATCH 2/2] clocksource/drivers/fttmr010: Be stricter on IRQs

On Fri, Aug 20, 2021 at 10:53 AM Daniel Lezcano
<[email protected]> wrote:
>
> On 25/07/2021 00:44, Linus Walleij wrote:
> > Make sure we check that the right interrupt occurred before
> > calling the event handler for timer 1. Report spurious IRQs
> > as IRQ_NONE.
>
> Does it not mean there is something wrong with the initial setup ?

No it is not occurring. This is just to protect against ordinary
spurious interrupts (return from sleep glitches in electronics,
quantum effects...)

Yours,
Linus Walleij

2021-08-20 14:18:09

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH 2/2] clocksource/drivers/fttmr010: Be stricter on IRQs

On 20/08/2021 15:39, Linus Walleij wrote:
> On Fri, Aug 20, 2021 at 10:53 AM Daniel Lezcano
> <[email protected]> wrote:
>>
>> On 25/07/2021 00:44, Linus Walleij wrote:
>>> Make sure we check that the right interrupt occurred before
>>> calling the event handler for timer 1. Report spurious IRQs
>>> as IRQ_NONE.
>>
>> Does it not mean there is something wrong with the initial setup ?
>
> No it is not occurring. This is just to protect against ordinary
> spurious interrupts (return from sleep glitches in electronics,
> quantum effects...)

I was not aware such problems can happen.

Thanks for the clarification.


--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

2021-08-21 04:22:49

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 2/2] clocksource/drivers/fttmr010: Be stricter on IRQs

Hi,

On Sun, Jul 25, 2021 at 12:44:24AM +0200, Linus Walleij wrote:
> Make sure we check that the right interrupt occurred before
> calling the event handler for timer 1. Report spurious IRQs
> as IRQ_NONE.
>
> Cc: C?dric Le Goater <[email protected]>
> Cc: Joel Stanley <[email protected]>
> Signed-off-by: Linus Walleij <[email protected]>

This patch results in boot stalls with several qemu aspeed emulations
(quanta-q71l-bmc, palmetto-bmc, witherspoon-bmc, ast2500-evb,
romulus-bmc, g220a-bmc). Reverting this patch together with
"clocksource/drivers/fttmr010: Clear also overflow bit on AST2600"
fixes the problem. Bisect log is attached.

Guenter

---
# bad: [86ed57fd8c93fdfaabb4f58e78455180fa7d8a84] Add linux-next specific files for 20210820
# good: [7c60610d476766e128cc4284bb6349732cbd6606] Linux 5.14-rc6
git bisect start 'HEAD' 'v5.14-rc6'
# good: [2b14a6beaaddc78bd4c7aeb0d94ef4d1ba708b7b] Merge remote-tracking branch 'crypto/master'
git bisect good 2b14a6beaaddc78bd4c7aeb0d94ef4d1ba708b7b
# good: [ecbc858174455ac847fb308ececc33ad408d2b3c] Merge remote-tracking branch 'tip/auto-latest'
git bisect good ecbc858174455ac847fb308ececc33ad408d2b3c
# bad: [7a307d6053886b8c05f897b4386365d0f556c2e9] Merge remote-tracking branch 'staging/staging-next'
git bisect bad 7a307d6053886b8c05f897b4386365d0f556c2e9
# bad: [b97ca8377c80f00f51767c249310ee37db949df4] Merge remote-tracking branch 'tty/tty-next'
git bisect bad b97ca8377c80f00f51767c249310ee37db949df4
# bad: [e83fe57c15b3a06707266e28590ee805920be987] Merge remote-tracking branch 'kvm/next'
git bisect bad e83fe57c15b3a06707266e28590ee805920be987
# good: [213605c149ff869a7206db53eefbee14fd22a78d] kcsan: selftest: Cleanup and add missing __init
git bisect good 213605c149ff869a7206db53eefbee14fd22a78d
# good: [049e1cd8365ea416016e21eb2c7d9ca553fc1dc7] KVM: x86: don't disable APICv memslot when inhibited
git bisect good 049e1cd8365ea416016e21eb2c7d9ca553fc1dc7
# bad: [c01ef7a90edfa8be6dc8aaa9e0bfdfe0eb7ea083] Merge remote-tracking branch 'irqchip/irq/irqchip-next'
git bisect bad c01ef7a90edfa8be6dc8aaa9e0bfdfe0eb7ea083
# good: [4513fb87e1402ad815912ec7f027eb17149f44ee] Merge branch irq/misc-5.15 into irq/irqchip-next
git bisect good 4513fb87e1402ad815912ec7f027eb17149f44ee
# bad: [dcc5fdc6b0f491a612ca372ad18d098103103bdb] Merge remote-tracking branch 'edac/edac-for-next'
git bisect bad dcc5fdc6b0f491a612ca372ad18d098103103bdb
# bad: [8903227aa72ce8c013e47b027be8e153e114bf19] clocksource/drivers/fttmr010: Be stricter on IRQs
git bisect bad 8903227aa72ce8c013e47b027be8e153e114bf19
# good: [be83c3b6e7b8ff22f72827a613bf6f3aa5afadbb] clocksource/drivers/sh_cmt: Fix wrong setting if don't request IRQ for clock source channel
git bisect good be83c3b6e7b8ff22f72827a613bf6f3aa5afadbb
# good: [ce9570657d45d6387a68d7f419fe70d085200a2f] clocksource/drivers/mediatek: Optimize systimer irq clear flow on shutdown
git bisect good ce9570657d45d6387a68d7f419fe70d085200a2f
# good: [3a95de59730eb9ac8dd6a367018f5653a873ecaa] clocksource/drivers/fttmr010: Pass around less pointers
git bisect good 3a95de59730eb9ac8dd6a367018f5653a873ecaa
# first bad commit: [8903227aa72ce8c013e47b027be8e153e114bf19] clocksource/drivers/fttmr010: Be stricter on IRQs

2021-08-21 08:06:12

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH 2/2] clocksource/drivers/fttmr010: Be stricter on IRQs

On 21/08/2021 06:20, Guenter Roeck wrote:
> Hi,
>
> On Sun, Jul 25, 2021 at 12:44:24AM +0200, Linus Walleij wrote:
>> Make sure we check that the right interrupt occurred before
>> calling the event handler for timer 1. Report spurious IRQs
>> as IRQ_NONE.
>>
>> Cc: Cédric Le Goater <[email protected]>
>> Cc: Joel Stanley <[email protected]>
>> Signed-off-by: Linus Walleij <[email protected]>
>
> This patch results in boot stalls with several qemu aspeed emulations
> (quanta-q71l-bmc, palmetto-bmc, witherspoon-bmc, ast2500-evb,
> romulus-bmc, g220a-bmc). Reverting this patch together with
> "clocksource/drivers/fttmr010: Clear also overflow bit on AST2600"
> fixes the problem. Bisect log is attached.

Thanks Guenter for git-bisecting.

Linus I have dropped the aforementioned patches.

Thanks

-- Daniel

> ---
> # bad: [86ed57fd8c93fdfaabb4f58e78455180fa7d8a84] Add linux-next specific files for 20210820
> # good: [7c60610d476766e128cc4284bb6349732cbd6606] Linux 5.14-rc6
> git bisect start 'HEAD' 'v5.14-rc6'
> # good: [2b14a6beaaddc78bd4c7aeb0d94ef4d1ba708b7b] Merge remote-tracking branch 'crypto/master'
> git bisect good 2b14a6beaaddc78bd4c7aeb0d94ef4d1ba708b7b
> # good: [ecbc858174455ac847fb308ececc33ad408d2b3c] Merge remote-tracking branch 'tip/auto-latest'
> git bisect good ecbc858174455ac847fb308ececc33ad408d2b3c
> # bad: [7a307d6053886b8c05f897b4386365d0f556c2e9] Merge remote-tracking branch 'staging/staging-next'
> git bisect bad 7a307d6053886b8c05f897b4386365d0f556c2e9
> # bad: [b97ca8377c80f00f51767c249310ee37db949df4] Merge remote-tracking branch 'tty/tty-next'
> git bisect bad b97ca8377c80f00f51767c249310ee37db949df4
> # bad: [e83fe57c15b3a06707266e28590ee805920be987] Merge remote-tracking branch 'kvm/next'
> git bisect bad e83fe57c15b3a06707266e28590ee805920be987
> # good: [213605c149ff869a7206db53eefbee14fd22a78d] kcsan: selftest: Cleanup and add missing __init
> git bisect good 213605c149ff869a7206db53eefbee14fd22a78d
> # good: [049e1cd8365ea416016e21eb2c7d9ca553fc1dc7] KVM: x86: don't disable APICv memslot when inhibited
> git bisect good 049e1cd8365ea416016e21eb2c7d9ca553fc1dc7
> # bad: [c01ef7a90edfa8be6dc8aaa9e0bfdfe0eb7ea083] Merge remote-tracking branch 'irqchip/irq/irqchip-next'
> git bisect bad c01ef7a90edfa8be6dc8aaa9e0bfdfe0eb7ea083
> # good: [4513fb87e1402ad815912ec7f027eb17149f44ee] Merge branch irq/misc-5.15 into irq/irqchip-next
> git bisect good 4513fb87e1402ad815912ec7f027eb17149f44ee
> # bad: [dcc5fdc6b0f491a612ca372ad18d098103103bdb] Merge remote-tracking branch 'edac/edac-for-next'
> git bisect bad dcc5fdc6b0f491a612ca372ad18d098103103bdb
> # bad: [8903227aa72ce8c013e47b027be8e153e114bf19] clocksource/drivers/fttmr010: Be stricter on IRQs
> git bisect bad 8903227aa72ce8c013e47b027be8e153e114bf19
> # good: [be83c3b6e7b8ff22f72827a613bf6f3aa5afadbb] clocksource/drivers/sh_cmt: Fix wrong setting if don't request IRQ for clock source channel
> git bisect good be83c3b6e7b8ff22f72827a613bf6f3aa5afadbb
> # good: [ce9570657d45d6387a68d7f419fe70d085200a2f] clocksource/drivers/mediatek: Optimize systimer irq clear flow on shutdown
> git bisect good ce9570657d45d6387a68d7f419fe70d085200a2f
> # good: [3a95de59730eb9ac8dd6a367018f5653a873ecaa] clocksource/drivers/fttmr010: Pass around less pointers
> git bisect good 3a95de59730eb9ac8dd6a367018f5653a873ecaa
> # first bad commit: [8903227aa72ce8c013e47b027be8e153e114bf19] clocksource/drivers/fttmr010: Be stricter on IRQs
>


--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

2021-08-27 22:03:53

by Linus Walleij

[permalink] [raw]
Subject: Re: [PATCH 2/2] clocksource/drivers/fttmr010: Be stricter on IRQs

On Sat, Aug 21, 2021 at 6:20 AM Guenter Roeck <[email protected]> wrote:
> On Sun, Jul 25, 2021 at 12:44:24AM +0200, Linus Walleij wrote:

> > Make sure we check that the right interrupt occurred before
> > calling the event handler for timer 1. Report spurious IRQs
> > as IRQ_NONE.
> >
> > Cc: Cédric Le Goater <[email protected]>
> > Cc: Joel Stanley <[email protected]>
> > Signed-off-by: Linus Walleij <[email protected]>
>
> This patch results in boot stalls with several qemu aspeed emulations
> (quanta-q71l-bmc, palmetto-bmc, witherspoon-bmc, ast2500-evb,
> romulus-bmc, g220a-bmc). Reverting this patch together with
> "clocksource/drivers/fttmr010: Clear also overflow bit on AST2600"
> fixes the problem. Bisect log is attached.

Has it been tested on real hardware?

We are reading register 0x34 TIMER_INTR_STATE for this.
So this should reflect the state of raw interrupts from the timers.

I looked in qemu/hw/timer/aspeed_timer.c
and the aspeed_timer_read() looks dubious.
It rather looks like this falls down to returning whatever
was written to this register and not reflect which IRQ
was fired at all.

Andrew: have you tested this when developing the
QEMU driver?

Yours,
Linus Walleij

2021-08-27 22:34:16

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 2/2] clocksource/drivers/fttmr010: Be stricter on IRQs

On 8/27/21 3:01 PM, Linus Walleij wrote:
> On Sat, Aug 21, 2021 at 6:20 AM Guenter Roeck <[email protected]> wrote:
>> On Sun, Jul 25, 2021 at 12:44:24AM +0200, Linus Walleij wrote:
>
>>> Make sure we check that the right interrupt occurred before
>>> calling the event handler for timer 1. Report spurious IRQs
>>> as IRQ_NONE.
>>>
>>> Cc: Cédric Le Goater <[email protected]>
>>> Cc: Joel Stanley <[email protected]>
>>> Signed-off-by: Linus Walleij <[email protected]>
>>
>> This patch results in boot stalls with several qemu aspeed emulations
>> (quanta-q71l-bmc, palmetto-bmc, witherspoon-bmc, ast2500-evb,
>> romulus-bmc, g220a-bmc). Reverting this patch together with
>> "clocksource/drivers/fttmr010: Clear also overflow bit on AST2600"
>> fixes the problem. Bisect log is attached.
>
> Has it been tested on real hardware?
>
> We are reading register 0x34 TIMER_INTR_STATE for this.
> So this should reflect the state of raw interrupts from the timers.
>
> I looked in qemu/hw/timer/aspeed_timer.c
> and the aspeed_timer_read() looks dubious.
> It rather looks like this falls down to returning whatever
> was written to this register and not reflect which IRQ
> was fired at all.
>

Interesting question. There is aspeed_timer_read(), which doesn't return
anything useful for register 0x34, and there is aspeed_{2400,2500,2600}_timer_read,
which does. No idea what is actually executed.

Guenter

> Andrew: have you tested this when developing the
> QEMU driver?
>
> Yours,
> Linus Walleij
>

2021-08-28 03:39:02

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 2/2] clocksource/drivers/fttmr010: Be stricter on IRQs

On 8/27/21 3:01 PM, Linus Walleij wrote:
> On Sat, Aug 21, 2021 at 6:20 AM Guenter Roeck <[email protected]> wrote:
>> On Sun, Jul 25, 2021 at 12:44:24AM +0200, Linus Walleij wrote:
>
>>> Make sure we check that the right interrupt occurred before
>>> calling the event handler for timer 1. Report spurious IRQs
>>> as IRQ_NONE.
>>>
>>> Cc: Cédric Le Goater <[email protected]>
>>> Cc: Joel Stanley <[email protected]>
>>> Signed-off-by: Linus Walleij <[email protected]>
>>
>> This patch results in boot stalls with several qemu aspeed emulations
>> (quanta-q71l-bmc, palmetto-bmc, witherspoon-bmc, ast2500-evb,
>> romulus-bmc, g220a-bmc). Reverting this patch together with
>> "clocksource/drivers/fttmr010: Clear also overflow bit on AST2600"
>> fixes the problem. Bisect log is attached.
>
> Has it been tested on real hardware?
>
> We are reading register 0x34 TIMER_INTR_STATE for this.
> So this should reflect the state of raw interrupts from the timers.
>
> I looked in qemu/hw/timer/aspeed_timer.c
> and the aspeed_timer_read() looks dubious.
> It rather looks like this falls down to returning whatever
> was written to this register and not reflect which IRQ
> was fired at all.
>

Actually, no. Turns out the qemu code is just a bit difficult to understand.
The code in question is:

default:
value = ASPEED_TIMER_GET_CLASS(s)->read(s, offset);
break;

For ast2500-evb, that translates to a call to aspeed_2500_timer_read().
Here is a trace example (after adding some more tracing):

aspeed_2500_timer_read From 0x34: 0x0
aspeed_timer_read From 0x34: of size 4: 0x0

Problem is that - at least in qemu - only the 2600 uses register 0x34
for the interrupt status. On the 2500, 0x34 is the ctrl2 register.

Indeed, the patch works fine on, for example, ast2600-evb.
It only fails on ast2400 and ast2500 boards.

I don't have the manuals, so I can't say what the correct behavior is,
but at least there is some evidence that TIMER_INTR_STATE may not exist
on ast2400 and ast2500 SOCs. From drivers/clocksource/timer-fttmr010.c:

/*
* Interrupt status/mask register definitions for fttmr010/gemini/moxart
* timers.
* The registers don't exist and they are not needed on aspeed timers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* because:
* - aspeed timer overflow interrupt is controlled by bits in Control
* Register (TMC30).
* - aspeed timers always generate interrupt when either one of the
* Match registers equals to Status register.
*/

Guenter

2021-08-28 10:38:42

by Cédric Le Goater

[permalink] [raw]
Subject: Re: [PATCH 2/2] clocksource/drivers/fttmr010: Be stricter on IRQs

Hello,

On 8/28/21 5:37 AM, Guenter Roeck wrote:
> On 8/27/21 3:01 PM, Linus Walleij wrote:
>> On Sat, Aug 21, 2021 at 6:20 AM Guenter Roeck <[email protected]> wrote:
>>> On Sun, Jul 25, 2021 at 12:44:24AM +0200, Linus Walleij wrote:
>>
>>>> Make sure we check that the right interrupt occurred before
>>>> calling the event handler for timer 1. Report spurious IRQs
>>>> as IRQ_NONE.
>>>>
>>>> Cc: Cédric Le Goater <[email protected]>
>>>> Cc: Joel Stanley <[email protected]>
>>>> Signed-off-by: Linus Walleij <[email protected]>
>>>
>>> This patch results in boot stalls with several qemu aspeed emulations
>>> (quanta-q71l-bmc, palmetto-bmc, witherspoon-bmc, ast2500-evb,
>>> romulus-bmc, g220a-bmc). Reverting this patch together with
>>> "clocksource/drivers/fttmr010: Clear also overflow bit on AST2600"
>>> fixes the problem. Bisect log is attached.
>>
>> Has it been tested on real hardware?

It breaks the AST2500 EVB.

>>
>> We are reading register 0x34 TIMER_INTR_STATE for this.
>> So this should reflect the state of raw interrupts from the timers.
>>
>> I looked in qemu/hw/timer/aspeed_timer.c
>> and the aspeed_timer_read() looks dubious.
>> It rather looks like this falls down to returning whatever
>> was written to this register and not reflect which IRQ
>> was fired at all.
>>
>
> Actually, no. Turns out the qemu code is just a bit difficult to understand.
> The code in question is:
>
>     default:
>         value = ASPEED_TIMER_GET_CLASS(s)->read(s, offset);
>         break;
>
> For ast2500-evb, that translates to a call to aspeed_2500_timer_read().
> Here is a trace example (after adding some more tracing):
>
> aspeed_2500_timer_read From 0x34: 0x0
> aspeed_timer_read From 0x34: of size 4: 0x0
>
> Problem is that - at least in qemu - only the 2600 uses register 0x34
> for the interrupt status. On the 2500, 0x34 is the ctrl2 register.
>
> Indeed, the patch works fine on, for example, ast2600-evb.
> It only fails on ast2400 and ast2500 boards.

The QEMU modelling is doing a good job ! I agree that the timer model
is not the most obvious one to read.

> I don't have the manuals, so I can't say what the correct behavior is,
> but at least there is some evidence that TIMER_INTR_STATE may not exist
> on ast2400 and ast2500 SOCs.

On Aspeed SoCs AST2400 and AST2500, the TMC[34] register is a
"control register #2" whereas on the AST2600 it is an "interrupt
status register" with bits [0-7] holding the timers status.

I would say that the patch simply should handle the "is_aspeed" case.

> From drivers/clocksource/timer-fttmr010.c:

yes. See commit 86fe57fc47b1 ("clocksource/drivers/fttmr010: Fix
invalid interrupt register access")

AFAICT, the ast2600 does not use a fttmr010 clocksource. Something to
clarify may be Joel ?

Thanks,

C.

>
> /*
>  * Interrupt status/mask register definitions for fttmr010/gemini/moxart
>  * timers.
>  * The registers don't exist and they are not needed on aspeed timers
>    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>  * because:
>  *   - aspeed timer overflow interrupt is controlled by bits in Control
>  *     Register (TMC30).
>  *   - aspeed timers always generate interrupt when either one of the
>  *     Match registers equals to Status register.
>  */
>
> Guenter

2021-08-30 04:19:16

by Andrew Jeffery

[permalink] [raw]
Subject: Re: [PATCH 2/2] clocksource/drivers/fttmr010: Be stricter on IRQs



On Sat, 28 Aug 2021, at 17:38, Cédric Le Goater wrote:
> Hello,
>
> On 8/28/21 5:37 AM, Guenter Roeck wrote:
> > On 8/27/21 3:01 PM, Linus Walleij wrote:
> >> On Sat, Aug 21, 2021 at 6:20 AM Guenter Roeck <[email protected]> wrote:
> >>> On Sun, Jul 25, 2021 at 12:44:24AM +0200, Linus Walleij wrote:
> >>
> >>>> Make sure we check that the right interrupt occurred before
> >>>> calling the event handler for timer 1. Report spurious IRQs
> >>>> as IRQ_NONE.
> >>>>
> >>>> Cc: Cédric Le Goater <[email protected]>
> >>>> Cc: Joel Stanley <[email protected]>
> >>>> Signed-off-by: Linus Walleij <[email protected]>
> >>>
> >>> This patch results in boot stalls with several qemu aspeed emulations
> >>> (quanta-q71l-bmc, palmetto-bmc, witherspoon-bmc, ast2500-evb,
> >>> romulus-bmc, g220a-bmc). Reverting this patch together with
> >>> "clocksource/drivers/fttmr010: Clear also overflow bit on AST2600"
> >>> fixes the problem. Bisect log is attached.
> >>
> >> Has it been tested on real hardware?
>
> It breaks the AST2500 EVB.
>
> >>
> >> We are reading register 0x34 TIMER_INTR_STATE for this.
> >> So this should reflect the state of raw interrupts from the timers.
> >>
> >> I looked in qemu/hw/timer/aspeed_timer.c
> >> and the aspeed_timer_read() looks dubious.
> >> It rather looks like this falls down to returning whatever
> >> was written to this register and not reflect which IRQ
> >> was fired at all.
> >>
> >
> > Actually, no. Turns out the qemu code is just a bit difficult to understand.
> > The code in question is:
> >
> >     default:
> >         value = ASPEED_TIMER_GET_CLASS(s)->read(s, offset);
> >         break;
> >
> > For ast2500-evb, that translates to a call to aspeed_2500_timer_read().
> > Here is a trace example (after adding some more tracing):
> >
> > aspeed_2500_timer_read From 0x34: 0x0
> > aspeed_timer_read From 0x34: of size 4: 0x0
> >
> > Problem is that - at least in qemu - only the 2600 uses register 0x34
> > for the interrupt status. On the 2500, 0x34 is the ctrl2 register.
> >
> > Indeed, the patch works fine on, for example, ast2600-evb.
> > It only fails on ast2400 and ast2500 boards.
>
> The QEMU modelling is doing a good job ! I agree that the timer model
> is not the most obvious one to read.

I'll buy a drink for whoever refactors it and improves readability :)

>
> > I don't have the manuals, so I can't say what the correct behavior is,
> > but at least there is some evidence that TIMER_INTR_STATE may not exist
> > on ast2400 and ast2500 SOCs.
>
> On Aspeed SoCs AST2400 and AST2500, the TMC[34] register is a
> "control register #2" whereas on the AST2600 it is an "interrupt
> status register" with bits [0-7] holding the timers status.
>
> I would say that the patch simply should handle the "is_aspeed" case.

Well, is_aspeed is set true in the driver for all of the 2400, 2500 and
2600. 0x34 behaves the way this patch expects on the 2600. So I think
we need something less coarse than is_aspeed?

Andrew

2021-08-30 04:59:33

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 2/2]: Be stric clocksource/drivers/fttmr010ter on IRQs

On 8/29/21 9:16 PM, Andrew Jeffery wrote:
[ ... ]
>>
>>> I don't have the manuals, so I can't say what the correct behavior is,
>>> but at least there is some evidence that TIMER_INTR_STATE may not exist
>>> on ast2400 and ast2500 SOCs.
>>
>> On Aspeed SoCs AST2400 and AST2500, the TMC[34] register is a
>> "control register #2" whereas on the AST2600 it is an "interruptarch/arm/boot/dts/ast2600-facebook-netbmc-common.dtsi:#include
>> status register" with bits [0-7] holding the timers status.
>>
>> I would say that the patch simply should handle the "is_aspeed" case.
>
> Well, is_aspeed is set true in the driver for all of the 2400, 2500 and
> 2600. 0x34 behaves the way this patch expects on the 2600. So I think
> we need something less coarse than is_aspeed?
>

If I understand the code correctly, ast2400 and ast2500 execute
fttmr010_timer_interrupt(), while ast2600 has its own interrupt handler.
To make this work, it would probably be necessary to check for is_aspeed
in fttmr010_timer_interrupt(), and only execute the new code if the flag
is false. The existing flag in struct fttmr010 should be good enough
for that.

Guenter

2021-08-30 06:32:56

by Andrew Jeffery

[permalink] [raw]
Subject: Re: [PATCH 2/2]: Be stric clocksource/drivers/fttmr010ter on IRQs



On Mon, 30 Aug 2021, at 14:28, Guenter Roeck wrote:
> On 8/29/21 9:16 PM, Andrew Jeffery wrote:
> [ ... ]
> >>
> >>> I don't have the manuals, so I can't say what the correct behavior is,
> >>> but at least there is some evidence that TIMER_INTR_STATE may not exist
> >>> on ast2400 and ast2500 SOCs.
> >>
> >> On Aspeed SoCs AST2400 and AST2500, the TMC[34] register is a
> >> "control register #2" whereas on the AST2600 it is an "interruptarch/arm/boot/dts/ast2600-facebook-netbmc-common.dtsi:#include
> >> status register" with bits [0-7] holding the timers status.
> >>
> >> I would say that the patch simply should handle the "is_aspeed" case.
> >
> > Well, is_aspeed is set true in the driver for all of the 2400, 2500 and
> > 2600. 0x34 behaves the way this patch expects on the 2600. So I think
> > we need something less coarse than is_aspeed?
> >
>
> If I understand the code correctly, ast2400 and ast2500 execute
> fttmr010_timer_interrupt(), while ast2600 has its own interrupt handler.
> To make this work, it would probably be necessary to check for is_aspeed
> in fttmr010_timer_interrupt(), and only execute the new code if the flag
> is false. The existing flag in struct fttmr010 should be good enough
> for that.

Sounds good.

Andrew

2021-08-30 07:48:10

by Cédric Le Goater

[permalink] [raw]
Subject: Re: [PATCH 2/2]: Be stric clocksource/drivers/fttmr010 on IRQs

On 8/30/21 6:58 AM, Guenter Roeck wrote:
> On 8/29/21 9:16 PM, Andrew Jeffery wrote:
> [ ... ]
>>>
>>>> I don't have the manuals, so I can't say what the correct behavior is,
>>>> but at least there is some evidence that TIMER_INTR_STATE may not exist
>>>> on ast2400 and ast2500 SOCs.
>>>
>>> On Aspeed SoCs AST2400 and AST2500, the TMC[34] register is a
>>> "control register #2" whereas on the AST2600 it is an "interruptarch/arm/boot/dts/ast2600-facebook-netbmc-common.dtsi:#include
>>> status register" with bits [0-7] holding the timers status.
>>>
>>> I would say that the patch simply should handle the "is_aspeed" case.
>>
>> Well, is_aspeed is set true in the driver for all of the 2400, 2500 and
>> 2600. 0x34 behaves the way this patch expects on the 2600. So I think
>> we need something less coarse than is_aspeed?
>>
>
> If I understand the code correctly, ast2400 and ast2500 execute
> fttmr010_timer_interrupt(), while ast2600 has its own interrupt handler.
> To make this work, it would probably be necessary to check for is_aspeed
> in fttmr010_timer_interrupt(), and only execute the new code if the flag
> is false. The existing flag in struct fttmr010 should be good enough
> for that.

yes.

I wonder why we have ast2600 support in fttmr010. The AST2600 boards use
the arm_arch_timer AFAICT.

C.

2021-08-30 15:25:11

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 2/2]: Be stric clocksource/drivers/fttmr010 on IRQs

On 8/30/21 12:47 AM, Cédric Le Goater wrote:
> On 8/30/21 6:58 AM, Guenter Roeck wrote:
>> On 8/29/21 9:16 PM, Andrew Jeffery wrote:
>> [ ... ]
>>>>
>>>>> I don't have the manuals, so I can't say what the correct behavior is,
>>>>> but at least there is some evidence that TIMER_INTR_STATE may not exist
>>>>> on ast2400 and ast2500 SOCs.
>>>>
>>>> On Aspeed SoCs AST2400 and AST2500, the TMC[34] register is a
>>>> "control register #2" whereas on the AST2600 it is an "interruptarch/arm/boot/dts/ast2600-facebook-netbmc-common.dtsi:#include
>>>> status register" with bits [0-7] holding the timers status.
>>>>
>>>> I would say that the patch simply should handle the "is_aspeed" case.
>>>
>>> Well, is_aspeed is set true in the driver for all of the 2400, 2500 and
>>> 2600. 0x34 behaves the way this patch expects on the 2600. So I think
>>> we need something less coarse than is_aspeed?
>>>
>>
>> If I understand the code correctly, ast2400 and ast2500 execute
>> fttmr010_timer_interrupt(), while ast2600 has its own interrupt handler.
>> To make this work, it would probably be necessary to check for is_aspeed
>> in fttmr010_timer_interrupt(), and only execute the new code if the flag
>> is false. The existing flag in struct fttmr010 should be good enough
>> for that.
>
> yes.
>
> I wonder why we have ast2600 support in fttmr010. The AST2600 boards use
> the arm_arch_timer AFAICT.
>

It was introduced and enabled, but later disabled with commit c998f40f2ae6a4
("ARM: dts: aspeed: ast2600: Set arch timer always-on"):

"According to ASPEED, FTTMR010 is not intended to be used in the AST2600.
The arch timer should be used, but Linux doesn't enable high-res timers
without being assured that the arch timer is always on, so set that
property in the devicetree."

That commit also disables the FTTMR010 timer, but doesn't remove the devicetree
node. Maybe it would make sense to remove the ast2600 code from the fttmr010
driver, including the devicetree node. After all, it looks like it is dead.

Guenter

2021-09-01 23:51:46

by Joel Stanley

[permalink] [raw]
Subject: Re: [PATCH 2/2] clocksource/drivers/fttmr010: Be stricter on IRQs

On Sat, 28 Aug 2021 at 08:08, Cédric Le Goater <[email protected]> wrote:

> AFAICT, the ast2600 does not use a fttmr010 clocksource.

Correct.

When doing bringup of the ast2600, I updated this driver to support
the new layout. Only once I'd submitted it did I learn that aspeed
preferred to use only the Cortex A7's timer source, and only that
source.

The armv7 timer can be lost when a core goes into power save mode, but
I am told the ast2600 does not implement power saving modes, so we set
the always-on property for the arm,armv7-timer. This allows us to use
the timer as a clocksource for hrtimers, removing the need to use the
fttmr010 driver at all.

Cheers,

Joel