This is an updated experimental patch for review following discussions
with Jerome / Sebastian regarding the usage of threaded interrupts in
meson-gx-mmc. I don't have a complete understanding, nor am I a kernel
developer, but this is my best-effort attempt to address this issue.
Also, thanks to both of them for opening up the discussions and to
Kevin for pointing me in the right direction for patch formatting.
Force threaded interrupts for meson_mmc_irq to prevent possible deadlock
condition during mmc operations when using preempt_rt with 5.9.0-rc3-rt3
patches on arm64.
Using meson-gx-mmc with an eMMC device on a Hardkernel Odroid N2+
configured with preempt_rt resulted in the SoC becoming unresponsive.
With lock checking enabled, the inconsistent lock state below was
observed during boot.
After some discussions with tglx in IRC #linux-rt, a patch was suggested
to remove IRQF_ONESHOT from the request_threaded_irq call.
I have tested this and confirmed that it resolves both the
unresponsive SoC and the inconsistent lock state warning when using
5.9.0-rc3-rt3 on arm64 Odroid N2+.
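My understanding of why this helps, which may well be incomplete: when
interrupt threading is forced (preempt_rt, or `threadirqs' on a standard
kernel), the irq core leaves handlers that were requested with
IRQF_ONESHOT alone, so meson_mmc_irq kept running in hard interrupt
context and reached the LED trigger lock from there, as the trace below
shows. A simplified userspace model of that decision follows. It is not
the kernel code; the MODEL_* flag values and the function names are made
up for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Model-only flag bits; the values are arbitrary, not the kernel's. */
#define MODEL_IRQF_ONESHOT   (1u << 0)
#define MODEL_IRQF_NO_THREAD (1u << 1)
#define MODEL_IRQF_PERCPU    (1u << 2)

/*
 * Returns true if the primary handler would be moved into a thread.
 * force_irqthreads models preempt_rt or booting with "threadirqs".
 */
bool model_primary_is_force_threaded(unsigned int flags,
				     bool force_irqthreads)
{
	if (!force_irqthreads)
		return false;
	/* Actions carrying any of these flags stay in hard irq context. */
	if (flags & (MODEL_IRQF_ONESHOT | MODEL_IRQF_NO_THREAD |
		     MODEL_IRQF_PERCPU))
		return false;
	return true;
}

/* Self-check covering the two cases discussed in this thread. */
void model_self_check(void)
{
	/* Before the patch: IRQF_ONESHOT set, primary handler not threaded. */
	assert(!model_primary_is_force_threaded(MODEL_IRQF_ONESHOT, true));
	/* After the patch: flags == 0, primary handler is threaded. */
	assert(model_primary_is_force_threaded(0, true));
}
```

With flags of 0, as in this patch, the model reports the primary handler
as threaded whenever threading is forced, and as unchanged otherwise.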
Further review and testing is required to ensure there are no adverse
impacts or concerns and that this is the correct method to resolve the
problem. I will continue to test on various Amlogic devices with both
the standard mainline low-latency kernel and the preempt_rt kernel with
-rt patches.
Changes since v1:
- Add spinlock_t lock to meson_host structure
- Add spin_lock_init to driver probe for the host lock to ensure the
  irq will not attempt to fire again if the threaded irq component
  is not complete
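For background on the second item: IRQF_ONESHOT is what keeps the
interrupt line masked until the threaded handler has finished, so
dropping the flag removes that guarantee. The toy model below only
illustrates that difference in userspace. The struct and function names
are hypothetical, and it does not show how the new lock would have to be
taken in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one interrupt line with a threaded handler. */
struct model_irq_line {
	bool oneshot;		/* models IRQF_ONESHOT being set */
	bool masked;		/* line currently masked */
	bool thread_pending;	/* threaded handler has work to do */
	int primary_runs;	/* times the primary handler ran */
	int thread_runs;	/* times the threaded handler completed */
};

/* Device asserts the line; returns true if the primary handler ran. */
bool model_irq_raise(struct model_irq_line *l)
{
	assert(l != NULL);
	if (l->masked)
		return false;
	l->primary_runs++;
	l->thread_pending = true;	/* primary asks for the thread */
	if (l->oneshot)
		l->masked = true;	/* stay masked until thread is done */
	return true;
}

/* Threaded handler runs to completion and the line is unmasked. */
void model_irq_thread_done(struct model_irq_line *l)
{
	assert(l != NULL);
	if (!l->thread_pending)
		return;
	l->thread_pending = false;
	l->thread_runs++;
	l->masked = false;
}
```

In this model a second model_irq_raise() is refused while the thread is
still pending only when oneshot is set; with oneshot clear the primary
handler runs again straight away.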
[ 7.858446] ================================
[ 7.858448] WARNING: inconsistent lock state
[ 7.858450] 5.9.0-rc3-rt3+ #33 Not tainted
[ 7.858453] --------------------------------
[ 7.858456] inconsistent {IN-HARDIRQ-R} -> {HARDIRQ-ON-W} usage.
[ 7.858459] swapper/0/1 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 7.858465] ffff80001219f4d8 (&trig->leddev_list_lock){+?.+}-{0:0}, at: led_trigger_set+0x104/0x270
[ 7.858482] {IN-HARDIRQ-R} state was registered at:
[ 7.858484] lock_acquire+0xec/0x468
[ 7.858491] rt_read_lock+0xb0/0x108
[ 7.858497] led_trigger_event+0x34/0x88
[ 7.858501] mmc_request_done+0x3f0/0x450
[ 7.858505] meson_mmc_irq+0x284/0x378
[ 7.858511] __handle_irq_event_percpu+0xcc/0x4a8
[ 7.858515] handle_irq_event_percpu+0x60/0xb0
[ 7.858519] handle_irq_event+0x50/0x108
[ 7.858522] handle_fasteoi_irq+0xd0/0x180
[ 7.858527] generic_handle_irq+0x38/0x50
[ 7.858530] __handle_domain_irq+0x6c/0xc8
[ 7.858533] gic_handle_irq+0x5c/0xb8
[ 7.858537] el1_irq+0xbc/0x180
[ 7.858540] arch_cpu_idle+0x28/0x38
[ 7.858544] default_idle_call+0x90/0x3f0
[ 7.858547] do_idle+0x250/0x268
[ 7.858551] cpu_startup_entry+0x2c/0x78
[ 7.858554] rest_init+0x1b0/0x284
[ 7.858559] arch_call_rest_init+0x18/0x24
[ 7.858565] start_kernel+0x550/0x588
[ 7.858569] irq event stamp: 1925495
[ 7.858571] hardirqs last enabled at (1925495): [<ffff8000111e3ba4>] _raw_spin_unlock_irqrestore+0xa4/0xb0
[ 7.858576] hardirqs last disabled at (1925494): [<ffff8000111e3c58>] _raw_spin_lock_irqsave+0xa8/0xb8
[ 7.858580] softirqs last enabled at (1857856): [<ffff80001024705c>] bdi_register_va+0x114/0x368
[ 7.858586] softirqs last disabled at (1857849): [<ffff80001024705c>] bdi_register_va+0x114/0x368
[ 7.858590]
other info that might help us debug this:
[ 7.858592] Possible unsafe locking scenario:
[ 7.858594] CPU0
[ 7.858595] ----
[ 7.858597] lock(&trig->leddev_list_lock);
[ 7.858600] <Interrupt>
[ 7.858602] lock(&trig->leddev_list_lock);
[ 7.858604]
*** DEADLOCK ***
[ 7.858606] 3 locks held by swapper/0/1:
[ 7.858609] #0: ffff80001219eb30 (leds_list_lock){++++}-{0:0}, at: led_trigger_register+0xf4/0x1c0
[ 7.858619] #1: ffff0000b0696a70 (&led_cdev->trigger_lock){+.+.}-{0:0}, at: led_trigger_register+0x134/0x1c0
[ 7.858629] #2: ffff800011fb83d0 (rcu_read_lock){....}-{1:2}, at: rt_write_lock+0x8/0x108
[ 7.858637]
stack backtrace:
[ 7.858640] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc3-rt3+ #33
[ 7.858643] Hardware name: Hardkernel ODROID-N2Plus (DT)
[ 7.858646] Call trace:
[ 7.858647] dump_backtrace+0x0/0x1e8
[ 7.858650] show_stack+0x20/0x30
[ 7.858653] dump_stack+0xf0/0x164
[ 7.858659] print_usage_bug+0x2b4/0x2c0
[ 7.858662] mark_lock+0x2e8/0x360
[ 7.858665] __lock_acquire+0x238/0x1858
[ 7.858669] lock_acquire+0xec/0x468
[ 7.858672] rt_write_lock+0xb0/0x108
[ 7.858675] led_trigger_set+0x104/0x270
[ 7.858678] led_trigger_register+0x180/0x1c0
[ 7.858681] heartbeat_trig_init+0x28/0x5c
[ 7.858686] do_one_initcall+0x90/0x4bc
[ 7.858690] kernel_init_freeable+0x2cc/0x338
[ 7.858694] kernel_init+0x1c/0x11c
[ 7.858697] ret_from_fork+0x10/0x34
Brad Harper (1):
mmc: host: meson-gx-mmc: fix possible deadlock condition for
preempt_rt
drivers/mmc/host/meson-gx-mmc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--
2.20.1
---
drivers/mmc/host/meson-gx-mmc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
index 08a3b1c05..3ba8f988d 100644
--- a/drivers/mmc/host/meson-gx-mmc.c
+++ b/drivers/mmc/host/meson-gx-mmc.c
@@ -146,6 +146,7 @@ struct sd_emmc_desc {
 };
 
 struct meson_host {
+	spinlock_t lock;
 	struct device *dev;
 	struct meson_mmc_data *data;
 	struct mmc_host *mmc;
@@ -1051,6 +1052,7 @@ static int meson_mmc_probe(struct platform_device *pdev)
 	host->mmc = mmc;
 	host->dev = &pdev->dev;
 	dev_set_drvdata(&pdev->dev, host);
+	spin_lock_init(&host->lock);
 
 	/* The G12A SDIO Controller needs an SRAM bounce buffer */
 	host->dram_access_quirk = device_property_read_bool(&pdev->dev,
@@ -1139,7 +1141,7 @@ static int meson_mmc_probe(struct platform_device *pdev)
 	       host->regs + SD_EMMC_IRQ_EN);
 
 	ret = request_threaded_irq(host->irq, meson_mmc_irq,
-				   meson_mmc_irq_thread, IRQF_ONESHOT,
+				   meson_mmc_irq_thread, 0,
 				   dev_name(&pdev->dev), host);
 	if (ret)
 		goto err_init_clk;
--
2.20.1
On 2020-09-26 22:54:18 [-0400], Brad Harper wrote:
> ---
What happens if you boot this on a non-RT kernel with the `threadirqs'
command line option?
Sebastian
Brad Harper <[email protected]> writes:
> ---
> drivers/mmc/host/meson-gx-mmc.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
This patch still needs a changelog summarizing the problem and what is
being fixed by the patch. Most of what's in the cover letter belongs
here.
The cover letter can be used to describe the history/background that you
don't want in the patch itself. Alternatively, you could include that
information in a single patch email as well, because everything after
the "---" line does not end up in git history.
> diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
> index 08a3b1c05..3ba8f988d 100644
> --- a/drivers/mmc/host/meson-gx-mmc.c
> +++ b/drivers/mmc/host/meson-gx-mmc.c
> @@ -146,6 +146,7 @@ struct sd_emmc_desc {
> };
>
> struct meson_host {
> + spinlock_t lock;
> struct device *dev;
> struct meson_mmc_data *data;
> struct mmc_host *mmc;
> @@ -1051,6 +1052,7 @@ static int meson_mmc_probe(struct platform_device *pdev)
> host->mmc = mmc;
> host->dev = &pdev->dev;
> dev_set_drvdata(&pdev->dev, host);
> + spin_lock_init(&host->lock);
I'm confused about what this lock is intended to do. You init it here,
but it's never used anywhere.
> /* The G12A SDIO Controller needs an SRAM bounce buffer */
> host->dram_access_quirk = device_property_read_bool(&pdev->dev,
> @@ -1139,7 +1141,7 @@ static int meson_mmc_probe(struct platform_device *pdev)
> host->regs + SD_EMMC_IRQ_EN);
>
> ret = request_threaded_irq(host->irq, meson_mmc_irq,
> - meson_mmc_irq_thread, IRQF_ONESHOT,
> + meson_mmc_irq_thread, 0,
> dev_name(&pdev->dev), host);
> if (ret)
> goto err_init_clk;
Kevin
Hi Kevin,
I think you are right. I don't have a good enough understanding to
make this work, so please disregard the patch. I will take
Sebastian's advice and do some testing with the 'threadirqs' parameter
enabled on a standard kernel to see if I can reproduce my original issue
there. I'm hoping Jerome might also be able to spare some time to help
find a proper solution.
Many Thanks,
Brad.
On 29/09/2020 10:35 am, Kevin Hilman wrote:
> Brad Harper <[email protected]> writes:
>
>> ---
>> drivers/mmc/host/meson-gx-mmc.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> This patch still needs a changelog summarizing the problem and what is
> being fixed by the patch. Most of what's in the cover letter belongs
> here.
>
> The cover letter can be used to describe the history/background that you
> don't want in the patch itself. Alternatively, you could include that
> information in a single patch email as well, because everything after
> the "---" line does not end up in git history.
>
>> diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
>> index 08a3b1c05..3ba8f988d 100644
>> --- a/drivers/mmc/host/meson-gx-mmc.c
>> +++ b/drivers/mmc/host/meson-gx-mmc.c
>> @@ -146,6 +146,7 @@ struct sd_emmc_desc {
>> };
>>
>> struct meson_host {
>> + spinlock_t lock;
>> struct device *dev;
>> struct meson_mmc_data *data;
>> struct mmc_host *mmc;
>> @@ -1051,6 +1052,7 @@ static int meson_mmc_probe(struct platform_device *pdev)
>> host->mmc = mmc;
>> host->dev = &pdev->dev;
>> dev_set_drvdata(&pdev->dev, host);
>> + spin_lock_init(&host->lock);
> I'm confused about what this lock is intended to do. You init it here,
> but it's never used anywhere.
>
>> /* The G12A SDIO Controller needs an SRAM bounce buffer */
>> host->dram_access_quirk = device_property_read_bool(&pdev->dev,
>> @@ -1139,7 +1141,7 @@ static int meson_mmc_probe(struct platform_device *pdev)
>> host->regs + SD_EMMC_IRQ_EN);
>>
>> ret = request_threaded_irq(host->irq, meson_mmc_irq,
>> - meson_mmc_irq_thread, IRQF_ONESHOT,
>> + meson_mmc_irq_thread, 0,
>> dev_name(&pdev->dev), host);
>> if (ret)
>> goto err_init_clk;
> Kevin