Force threaded interrupts for meson_mmc_irq to prevent possible deadlock
condition
during mmc operations when using preempt_rt with 5.9.0-rc3-rt3 patches
on arm64.
Using meson-gx-mmc with an emmc device on Hardkernel Odroid N2+
configured with
preempt_rt resulted in the soc becoming unresponsive. With lock
checking enabled
the below inconsistent lock state was observed during boot.
After some discussions with tglx in IRC #linux-rt the attached patch was
suggested
to remove IRQF_ONESHOT from request_threaded_irq.
This has been tested and confirmed by me to resolve both the
unresponsive soc and
the inconsistent lock state warning when using 5.9.0-rc3-rt3 on arm64
Odroid N2+.
Further review and testing is required to ensure there are no adverse
impacts or
concerns and that is the correct method to resolve the problem. I will
continue
to test on various amlogic devices with both standard mainline low
latency kernel
and preempt_rt kernel with -rt patches.
[ 7.858446] ================================
[ 7.858448] WARNING: inconsistent lock state
[ 7.858450] 5.9.0-rc3-rt3+ #33 Not tainted
[ 7.858453] --------------------------------
[ 7.858456] inconsistent {IN-HARDIRQ-R} -> {HARDIRQ-ON-W} usage.
[ 7.858459] swapper/0/1 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 7.858465] ffff80001219f4d8 (&trig->leddev_list_lock){+?.+}-{0:0}, at:
led_trigger_set+0x104/0x270
[ 7.858482] {IN-HARDIRQ-R} state was registered at:
[ 7.858484] lock_acquire+0xec/0x468
[ 7.858491] rt_read_lock+0xb0/0x108
[ 7.858497] led_trigger_event+0x34/0x88
[ 7.858501] mmc_request_done+0x3f0/0x450
[ 7.858505] meson_mmc_irq+0x284/0x378
[ 7.858511] __handle_irq_event_percpu+0xcc/0x4a8
[ 7.858515] handle_irq_event_percpu+0x60/0xb0
[ 7.858519] handle_irq_event+0x50/0x108
[ 7.858522] handle_fasteoi_irq+0xd0/0x180
[ 7.858527] generic_handle_irq+0x38/0x50
[ 7.858530] __handle_domain_irq+0x6c/0xc8
[ 7.858533] gic_handle_irq+0x5c/0xb8
[ 7.858537] el1_irq+0xbc/0x180
[ 7.858540] arch_cpu_idle+0x28/0x38
[ 7.858544] default_idle_call+0x90/0x3f0
[ 7.858547] do_idle+0x250/0x268
[ 7.858551] cpu_startup_entry+0x2c/0x78
[ 7.858554] rest_init+0x1b0/0x284
[ 7.858559] arch_call_rest_init+0x18/0x24
[ 7.858565] start_kernel+0x550/0x588
[ 7.858569] irq event stamp: 1925495
[ 7.858571] hardirqs last enabled at (1925495): [<ffff8000111e3ba4>]
_raw_spin_unlock_irqrestore+0xa4/0xb0
[ 7.858576] hardirqs last disabled at (1925494): [<ffff8000111e3c58>]
_raw_spin_lock_irqsave+0xa8/0xb8
[ 7.858580] softirqs last enabled at (1857856): [<ffff80001024705c>]
bdi_register_va+0x114/0x368
[ 7.858586] softirqs last disabled at (1857849): [<ffff80001024705c>]
bdi_register_va+0x114/0x368
[ 7.858590]
other info that might help us debug this:
[ 7.858592] Possible unsafe locking scenario:
[ 7.858594] CPU0
[ 7.858595] ----
[ 7.858597] lock(&trig->leddev_list_lock);
[ 7.858600] <Interrupt>
[ 7.858602] lock(&trig->leddev_list_lock);
[ 7.858604]
*** DEADLOCK ***
[ 7.858606] 3 locks held by swapper/0/1:
[ 7.858609] #0: ffff80001219eb30 (leds_list_lock){++++}-{0:0}, at:
led_trigger_register+0xf4/0x1c0
[ 7.858619] #1: ffff0000b0696a70 (&led_cdev->trigger_lock){+.+.}-{0:0},
at: led_trigger_register+0x134/0x1c0
[ 7.858629] #2: ffff800011fb83d0 (rcu_read_lock){....}-{1:2}, at:
rt_write_lock+0x8/0x108
[ 7.858637]
stack backtrace:
[ 7.858640] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc3-rt3+ #33
[ 7.858643] Hardware name: Hardkernel ODROID-N2Plus (DT)
[ 7.858646] Call trace:
[ 7.858647] dump_backtrace+0x0/0x1e8
[ 7.858650] show_stack+0x20/0x30
[ 7.858653] dump_stack+0xf0/0x164
[ 7.858659] print_usage_bug+0x2b4/0x2c0
[ 7.858662] mark_lock+0x2e8/0x360
[ 7.858665] __lock_acquire+0x238/0x1858
[ 7.858669] lock_acquire+0xec/0x468
[ 7.858672] rt_write_lock+0xb0/0x108
[ 7.858675] led_trigger_set+0x104/0x270
[ 7.858678] led_trigger_register+0x180/0x1c0
[ 7.858681] heartbeat_trig_init+0x28/0x5c
[ 7.858686] do_one_initcall+0x90/0x4bc
[ 7.858690] kernel_init_freeable+0x2cc/0x338
[ 7.858694] kernel_init+0x1c/0x11c
[ 7.858697] ret_from_fork+0x10/0x34
Signed-off-by: Brad Harper <[email protected]>
---
drivers/mmc/host/meson-gx-mmc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mmc/host/meson-gx-mmc.c
b/drivers/mmc/host/meson-gx-mmc.c
index 08a3b1c05..130ac134d 100644
--- a/drivers/mmc/host/meson-gx-mmc.c
+++ b/drivers/mmc/host/meson-gx-mmc.c
@@ -1139,7 +1139,7 @@ static int meson_mmc_probe(struct platform_device
*pdev)
host->regs + SD_EMMC_IRQ_EN);
ret = request_threaded_irq(host->irq, meson_mmc_irq,
- meson_mmc_irq_thread, IRQF_ONESHOT,
+ meson_mmc_irq_thread, 0,
dev_name(&pdev->dev), host);
if (ret)
goto err_init_clk;
--
2.20.1
Hi Brad,
Brad Harper <[email protected]> writes:
> Force threaded interrupts for meson_mmc_irq to prevent possible deadlock
> condition
> during mmc operations when using preempt_rt with 5.9.0-rc3-rt3 patches
> on arm64.
>
> Using meson-gx-mmc with an emmc device on Hardkernel Odroid N2+
> configured with
> preempt_rt resulted in the soc becoming unresponsive. With lock
> checking enabled
> the below inconsistent lock state was observed during boot.
>
> After some discussions with tglx in IRC #linux-rt the attached patch was
> suggested
> to remove IRQF_ONESHOT from request_threaded_irq.
> This has been tested and confirmed by me to resolve both the
> unresponsive soc and
> the inconsistent lock state warning when using 5.9.0-rc3-rt3 on arm64
> Odroid N2+.
>
> Further review and testing is required to ensure there are no adverse
> impacts or
> concerns and that is the correct method to resolve the problem. I will
> continue
> to test on various amlogic devices with both standard mainline low
> latency kernel
> and preempt_rt kernel with -rt patches.
This looks right to me, thanks for sending a fix.
For broader testing, I can add this to my testing branch so it gets
booted on a bunch more platform in KernelCI also.
However...
[...]
> Signed-off-by: Brad Harper <[email protected]>
> ---
> drivers/mmc/host/meson-gx-mmc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/host/meson-gx-mmc.c
> b/drivers/mmc/host/meson-gx-mmc.c
> index 08a3b1c05..130ac134d 100644
> --- a/drivers/mmc/host/meson-gx-mmc.c
> +++ b/drivers/mmc/host/meson-gx-mmc.c
> @@ -1139,7 +1139,7 @@ static int meson_mmc_probe(struct platform_device
> *pdev)
> host->regs + SD_EMMC_IRQ_EN);
>
> ret = request_threaded_irq(host->irq, meson_mmc_irq,
> - meson_mmc_irq_thread, IRQF_ONESHOT,
> + meson_mmc_irq_thread, 0,
> dev_name(&pdev->dev), host);
> if (ret)
> goto err_init_clk;
This patch has been mangled by your mailer, so it doesn't apply cleanly.
If you're using the gmail web UI, this is a common problem.
I strongly recommend using git-send-email to send directly via gmail
SMTP. The git-send-email docs[1] give some examples on how to set this
up.
Kevin
[1] https://git-scm.com/docs/git-send-email#_examples
On Thu 24 Sep 2020 at 19:01, Kevin Hilman <[email protected]> wrote:
> Hi Brad,
>
> Brad Harper <[email protected]> writes:
>
>> Force threaded interrupts for meson_mmc_irq to prevent possible deadlock
>> condition
>> during mmc operations when using preempt_rt with 5.9.0-rc3-rt3 patches
>> on arm64.
>>
>> Using meson-gx-mmc with an emmc device on Hardkernel Odroid N2+
>> configured with
>> preempt_rt resulted in the soc becoming unresponsive. With lock
>> checking enabled
>> the below inconsistent lock state was observed during boot.
>>
>> After some discussions with tglx in IRC #linux-rt the attached patch was
>> suggested
>> to remove IRQF_ONESHOT from request_threaded_irq.
>> This has been tested and confirmed by me to resolve both the
>> unresponsive soc and
>> the inconsistent lock state warning when using 5.9.0-rc3-rt3 on arm64
>> Odroid N2+.
>>
>> Further review and testing is required to ensure there are no adverse
>> impacts or
>> concerns and that is the correct method to resolve the problem. I will
>> continue
>> to test on various amlogic devices with both standard mainline low
>> latency kernel
>> and preempt_rt kernel with -rt patches.
>
> This looks right to me, thanks for sending a fix.
>
> For broader testing, I can add this to my testing branch so it gets
> booted on a bunch more platform in KernelCI also.
>
> However...
>
> [...]
>
>> Signed-off-by: Brad Harper <[email protected]>
>> ---
>> drivers/mmc/host/meson-gx-mmc.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/mmc/host/meson-gx-mmc.c
>> b/drivers/mmc/host/meson-gx-mmc.c
>> index 08a3b1c05..130ac134d 100644
>> --- a/drivers/mmc/host/meson-gx-mmc.c
>> +++ b/drivers/mmc/host/meson-gx-mmc.c
>> @@ -1139,7 +1139,7 @@ static int meson_mmc_probe(struct platform_device
>> *pdev)
>> host->regs + SD_EMMC_IRQ_EN);
>>
>> ret = request_threaded_irq(host->irq, meson_mmc_irq,
>> - meson_mmc_irq_thread, IRQF_ONESHOT,
>> + meson_mmc_irq_thread, 0,
>> dev_name(&pdev->dev), host);
I'm not sure about this.
As you have explained on IRC, I understand that IRQF_ONESHOT is causing
trouble with RT as the hard IRQ part of the thread will not be migrated
to a thread. That was certainly not the intent when putting this flag.
As described in include/linux/interrupt.h:
* IRQF_ONESHOT - Interrupt is not reenabled after the hardirq handler finished.
* Used by threaded interrupts which need to keep the
* irq line disabled until the threaded handler has been run.
The intent was only that, making sure the irq is not re-enabled until
the threaded part is done. AFAIU, removing this flag would allow the
hard irq handler to fire again while we are not done handling the
previous IRQ in threaded part.
This seems pretty unsafe to me. Maybe we could improve the driver so it
copes with this case gracefully. ATM, I don't think it would.
Maybe I missed something and I am happy to be enlightened if that is the
case :)
>> if (ret)
>> goto err_init_clk;
>
> This patch has been mangled by your mailer, so it doesn't apply cleanly.
> If you're using the gmail web UI, this is a common problem.
>
> I strongly recommend using git-send-email to send directly via gmail
> SMTP. The git-send-email docs[1] give some examples on how to set this
> up.
>
> Kevin
>
> [1] https://git-scm.com/docs/git-send-email#_examples
>
> _______________________________________________
> linux-amlogic mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-amlogic
On 2020-09-25 11:11:42 [+0200], Jerome Brunet wrote:
> I'm not sure about this.
> As you have explained on IRC, I understand that IRQF_ONESHOT is causing
> trouble with RT as the hard IRQ part of the thread will not be migrated
> to a thread. That was certainly not the intent when putting this flag.
That is my understanding as well.
> This seems pretty unsafe to me. Maybe we could improve the driver so it
> copes with this case gracefully. ATM, I don't think it would.
Running the primary handler in hardirq context is bad, because it
invokes meson_mmc_request_done() at the very end. And here:
- mmc_complete_cmd() -> complete_all()
There is a lockdep_assert_RT_in_threaded_ctx() which should trigger.
- led_trigger_event() -> led_trigger_event()
This should trigger a might_sleep() warning somewhere.
So removing IRQF_ONESHOT is okay but it should additionally disable the
IRQ source in meson_mmc_irq() and re-enable back in
meson_mmc_irq_thread(). Otherwise the IRQ remains asserted and may fire
multiple times before the thread has a chance to run.
Sebastian
On Fri 25 Sep 2020 at 15:44, Sebastian Andrzej Siewior <[email protected]> wrote:
> On 2020-09-25 11:11:42 [+0200], Jerome Brunet wrote:
>> I'm not sure about this.
>> As you have explained on IRC, I understand that IRQF_ONESHOT is causing
>> trouble with RT as the hard IRQ part of the thread will not be migrated
>> to a thread. That was certainly not the intent when putting this flag.
>
> That is my understanding as well.
>
>> This seems pretty unsafe to me. Maybe we could improve the driver so it
>> copes with this case gracefully. ATM, I don't think it would.
>
> Running the primary handler in hardirq context is bad, because it
> invokes meson_mmc_request_done() at the very end. And here:
> - mmc_complete_cmd() -> complete_all()
> There is a lockdep_assert_RT_in_threaded_ctx() which should trigger.
>
> - led_trigger_event() -> led_trigger_event()
> This should trigger a might_sleep() warning somewhere.
>
> So removing IRQF_ONESHOT is okay but it should additionally disable the
> IRQ source in meson_mmc_irq() and re-enable back in
> meson_mmc_irq_thread(). Otherwise the IRQ remains asserted and may fire
> multiple times before the thread has a chance to run.
Looks like we need to do manually what IRQF_ONESHOT was doing for us :(
This brings a few questions:
* The consideration you described is not mentioned near the description
of IRQF_ONESHOT. Maybe it should so other drivers with same intent
don't end up in the same pitfall ?
* Why doesn't RT move the IRQ with this flag ? Seems completly unrelated
to RT (maybe it is the same documentation problem)
* Can't we have flag doing the irq disable in the same way while still
allowing to RT to do its magic ? seems better than open coding it in
the driver ?
>
> Sebastian
On 2020-09-25 16:14:09 [+0200], Jerome Brunet wrote:
> Looks like we need to do manually what IRQF_ONESHOT was doing for us :(
IRQF_ONESHOT disables the IRQ at the irqchip level. You must ensure that
the device keeps quite. Usually you mast the interrupt source at the
device lee.
> This brings a few questions:
>
> * The consideration you described is not mentioned near the description
> of IRQF_ONESHOT. Maybe it should so other drivers with same intent
> don't end up in the same pitfall ?
From request_threaded_irq() ->
| * If you want to set up a threaded irq handler for your device
| * then you need to supply @handler and @thread_fn. @handler is
| * still called in hard interrupt context and has to check
| * whether the interrupt originates from the device. If yes it
| * needs to disable the interrupt on the device and return
| * IRQ_WAKE_THREAD which will wake up the handler thread and run
| * @thread_fn. This split handler design is necessary to support
| * shared interrupts.
Just the line that saying what needs to be done before returning
IRQ_WAKE_THREAD.
> * Why doesn't RT move the IRQ with this flag ? Seems completly unrelated
> to RT (maybe it is the same documentation problem)
It is unrelated to RT. Mostly. You end up with the same problem booting
with `threadirqs'. RT has the additional restrictions that you may not
acquire any sleeping locks in hardirq context. This you can see with
addinional lockdep magic.
> * Can't we have flag doing the irq disable in the same way while still
> allowing to RT to do its magic ? seems better than open coding it in
> the driver ?
Puh. That should be forwarded the IRQ department.
So we have IRQF_NO_THREAD to avoid force threading. This is documented
as such. Then we have IRQF_TIMER and IRQF_PERCPU which are also not
force threaded and it is not documented as such. However it is used for
the timer-IRQ, IPI, perf and such - things you obviously don't want to
thread and need to run in hard-IRQ context.
What you have ist a primary and secondary and IRQF_ONESHOT and don't
want the primary handler to be force-threaded. I can't answer why we
don't.
However, drivers usually disable the source themself if they providing
both handler.
Sebastian