2020-11-02 10:43:49

by Andrea Righi

[permalink] [raw]
Subject: [PATCH] leds: trigger: fix potential deadlock with libata

We have the following potential deadlock condition:

========================================================
WARNING: possible irq lock inversion dependency detected
5.10.0-rc2+ #25 Not tainted
--------------------------------------------------------
swapper/3/0 just changed the state of lock:
ffff8880063bd618 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x27/0x200
but this lock took another, HARDIRQ-READ-unsafe lock in the past:
(&trig->leddev_list_lock){.+.?}-{2:2}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
Possible interrupt unsafe locking scenario:

CPU0 CPU1
---- ----
lock(&trig->leddev_list_lock);
local_irq_disable();
lock(&host->lock);
lock(&trig->leddev_list_lock);
<Interrupt>
lock(&host->lock);

*** DEADLOCK ***

no locks held by swapper/3/0.

the shortest dependencies between 2nd lock and 1st lock:
-> (&trig->leddev_list_lock){.+.?}-{2:2} ops: 46 {
HARDIRQ-ON-R at:
lock_acquire+0x15f/0x420
_raw_read_lock+0x42/0x90
led_trigger_event+0x2b/0x70
rfkill_global_led_trigger_worker+0x94/0xb0
process_one_work+0x240/0x560
worker_thread+0x58/0x3d0
kthread+0x151/0x170
ret_from_fork+0x1f/0x30
IN-SOFTIRQ-R at:
lock_acquire+0x15f/0x420
_raw_read_lock+0x42/0x90
led_trigger_event+0x2b/0x70
kbd_bh+0x9e/0xc0
tasklet_action_common.constprop.0+0xe9/0x100
tasklet_action+0x22/0x30
__do_softirq+0xcc/0x46d
run_ksoftirqd+0x3f/0x70
smpboot_thread_fn+0x116/0x1f0
kthread+0x151/0x170
ret_from_fork+0x1f/0x30
SOFTIRQ-ON-R at:
lock_acquire+0x15f/0x420
_raw_read_lock+0x42/0x90
led_trigger_event+0x2b/0x70
rfkill_global_led_trigger_worker+0x94/0xb0
process_one_work+0x240/0x560
worker_thread+0x58/0x3d0
kthread+0x151/0x170
ret_from_fork+0x1f/0x30
INITIAL READ USE at:
lock_acquire+0x15f/0x420
_raw_read_lock+0x42/0x90
led_trigger_event+0x2b/0x70
rfkill_global_led_trigger_worker+0x94/0xb0
process_one_work+0x240/0x560
worker_thread+0x58/0x3d0
kthread+0x151/0x170
ret_from_fork+0x1f/0x30
}
... key at: [<ffffffff83da4c00>] __key.0+0x0/0x10
... acquired at:
_raw_read_lock+0x42/0x90
led_trigger_blink_oneshot+0x3b/0x90
ledtrig_disk_activity+0x3c/0xa0
ata_qc_complete+0x26/0x450
ata_do_link_abort+0xa3/0xe0
ata_port_freeze+0x2e/0x40
ata_hsm_qc_complete+0x94/0xa0
ata_sff_hsm_move+0x177/0x7a0
ata_sff_pio_task+0xc7/0x1b0
process_one_work+0x240/0x560
worker_thread+0x58/0x3d0
kthread+0x151/0x170
ret_from_fork+0x1f/0x30

-> (&host->lock){-...}-{2:2} ops: 69 {
IN-HARDIRQ-W at:
lock_acquire+0x15f/0x420
_raw_spin_lock_irqsave+0x52/0xa0
ata_bmdma_interrupt+0x27/0x200
__handle_irq_event_percpu+0xd5/0x2b0
handle_irq_event+0x57/0xb0
handle_edge_irq+0x8c/0x230
asm_call_irq_on_stack+0xf/0x20
common_interrupt+0x100/0x1c0
asm_common_interrupt+0x1e/0x40
native_safe_halt+0xe/0x10
arch_cpu_idle+0x15/0x20
default_idle_call+0x59/0x1c0
do_idle+0x22c/0x2c0
cpu_startup_entry+0x20/0x30
start_secondary+0x11d/0x150
secondary_startup_64_no_verify+0xa6/0xab
INITIAL USE at:
lock_acquire+0x15f/0x420
_raw_spin_lock_irqsave+0x52/0xa0
ata_dev_init+0x54/0xe0
ata_link_init+0x8b/0xd0
ata_port_alloc+0x1f1/0x210
ata_host_alloc+0xf1/0x130
ata_host_alloc_pinfo+0x14/0xb0
ata_pci_sff_prepare_host+0x41/0xa0
ata_pci_bmdma_prepare_host+0x14/0x30
piix_init_one+0x21f/0x600
local_pci_probe+0x48/0x80
pci_device_probe+0x105/0x1c0
really_probe+0x221/0x490
driver_probe_device+0xe9/0x160
device_driver_attach+0xb2/0xc0
__driver_attach+0x91/0x150
bus_for_each_dev+0x81/0xc0
driver_attach+0x1e/0x20
bus_add_driver+0x138/0x1f0
driver_register+0x91/0xf0
__pci_register_driver+0x73/0x80
piix_init+0x1e/0x2e
do_one_initcall+0x5f/0x2d0
kernel_init_freeable+0x26f/0x2cf
kernel_init+0xe/0x113
ret_from_fork+0x1f/0x30
}
... key at: [<ffffffff83d9fdc0>] __key.6+0x0/0x10
... acquired at:
__lock_acquire+0x9da/0x2370
lock_acquire+0x15f/0x420
_raw_spin_lock_irqsave+0x52/0xa0
ata_bmdma_interrupt+0x27/0x200
__handle_irq_event_percpu+0xd5/0x2b0
handle_irq_event+0x57/0xb0
handle_edge_irq+0x8c/0x230
asm_call_irq_on_stack+0xf/0x20
common_interrupt+0x100/0x1c0
asm_common_interrupt+0x1e/0x40
native_safe_halt+0xe/0x10
arch_cpu_idle+0x15/0x20
default_idle_call+0x59/0x1c0
do_idle+0x22c/0x2c0
cpu_startup_entry+0x20/0x30
start_secondary+0x11d/0x150
secondary_startup_64_no_verify+0xa6/0xab

This lockdep splat is reported after:
commit e918188611f0 ("locking: More accurate annotations for read_lock()")

To clarify:
- read-locks are recursive only in interrupt context (when
in_interrupt() returns true)
- after acquiring host->lock in CPU1, another cpu (i.e. CPU2) may call
write_lock(&trig->leddev_list_lock) that would be blocked by CPU0
that holds trig->leddev_list_lock in read-mode
- when CPU1 (ata_ac_complete()) tries to read-lock
trig->leddev_list_lock, it would be blocked by the write-lock waiter
on CPU2 (because we are not in interrupt context, so the read-lock is
not recursive)
- at this point if an interrupt happens on CPU0 and
ata_bmdma_interrupt() is executed it will try to acquire host->lock,
that is held by CPU1, that is currently blocked by CPU2, so:

* CPU0 blocked by CPU1
* CPU1 blocked by CPU2
* CPU2 blocked by CPU0

*** DEADLOCK ***

The deadlock scenario is better represented by the following schema
(thanks to Boqun Feng <[email protected]> for the schema and the
detailed explanation of the deadlock condition):

CPU 0: CPU 1: CPU 2:
----- ----- -----
led_trigger_event():
read_lock(&trig->leddev_list_lock);
<workqueue>
ata_hsm_qc_complete():
spin_lock_irqsave(&host->lock);
write_lock(&trig->leddev_list_lock);
ata_port_freeze():
ata_do_link_abort():
ata_qc_complete():
ledtrig_disk_activity():
led_trigger_blink_oneshot():
read_lock(&trig->leddev_list_lock);
// ^ not in in_interrupt() context, so could get blocked by CPU 2
<interrupt>
ata_bmdma_interrupt():
spin_lock_irqsave(&host->lock);

Fix by using read_lock_irqsave/irqrestore() in led_trigger_event(), so
that no interrupt can happen in between, preventing the deadlock
condition.

Link: https://lore.kernel.org/lkml/20201101092614.GB3989@xps-13-7390/
Fixes: eb25cb9956cc ("leds: convert IDE trigger to common disk trigger")
Signed-off-by: Andrea Righi <[email protected]>
---
drivers/leds/led-triggers.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
index 91da90cfb11d..16d1a93a10a8 100644
--- a/drivers/leds/led-triggers.c
+++ b/drivers/leds/led-triggers.c
@@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig,
enum led_brightness brightness)
{
struct led_classdev *led_cdev;
+ unsigned long flags;

if (!trig)
return;

- read_lock(&trig->leddev_list_lock);
+ read_lock_irqsave(&trig->leddev_list_lock, flags);
list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list)
led_set_brightness(led_cdev, brightness);
- read_unlock(&trig->leddev_list_lock);
+ read_unlock_irqrestore(&trig->leddev_list_lock, flags);
}
EXPORT_SYMBOL_GPL(led_trigger_event);

--
2.27.0


2020-11-25 12:50:38

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] leds: trigger: fix potential deadlock with libata

Hi!

> We have the following potential deadlock condition:
>
> ========================================================
> WARNING: possible irq lock inversion dependency detected
> 5.10.0-rc2+ #25 Not tainted
> --------------------------------------------------------
> swapper/3/0 just changed the state of lock:
> ffff8880063bd618 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x27/0x200
> but this lock took another, HARDIRQ-READ-unsafe lock in the past:
> (&trig->leddev_list_lock){.+.?}-{2:2}
>
> and interrupts could create inverse lock ordering between them.
>
> other info that might help us debug this:
> Possible interrupt unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(&trig->leddev_list_lock);
> local_irq_disable();
> lock(&host->lock);
> lock(&trig->leddev_list_lock);
> <Interrupt>
> lock(&host->lock);
>
> *** DEADLOCK ***
>
> no locks held by swapper/3/0.
>
> the shortest dependencies between 2nd lock and 1st lock:
> -> (&trig->leddev_list_lock){.+.?}-{2:2} ops: 46 {
> HARDIRQ-ON-R at:
> lock_acquire+0x15f/0x420
> _raw_read_lock+0x42/0x90
> led_trigger_event+0x2b/0x70
> rfkill_global_led_trigger_worker+0x94/0xb0
> process_one_work+0x240/0x560
> worker_thread+0x58/0x3d0
> kthread+0x151/0x170
> ret_from_fork+0x1f/0x30
> IN-SOFTIRQ-R at:
> lock_acquire+0x15f/0x420
> _raw_read_lock+0x42/0x90
> led_trigger_event+0x2b/0x70
> kbd_bh+0x9e/0xc0
> tasklet_action_common.constprop.0+0xe9/0x100
> tasklet_action+0x22/0x30
> __do_softirq+0xcc/0x46d
> run_ksoftirqd+0x3f/0x70
> smpboot_thread_fn+0x116/0x1f0
> kthread+0x151/0x170
> ret_from_fork+0x1f/0x30
> SOFTIRQ-ON-R at:
> lock_acquire+0x15f/0x420
> _raw_read_lock+0x42/0x90
> led_trigger_event+0x2b/0x70
> rfkill_global_led_trigger_worker+0x94/0xb0
> process_one_work+0x240/0x560
> worker_thread+0x58/0x3d0
> kthread+0x151/0x170
> ret_from_fork+0x1f/0x30
> INITIAL READ USE at:
> lock_acquire+0x15f/0x420
> _raw_read_lock+0x42/0x90
> led_trigger_event+0x2b/0x70
> rfkill_global_led_trigger_worker+0x94/0xb0
> process_one_work+0x240/0x560
> worker_thread+0x58/0x3d0
> kthread+0x151/0x170
> ret_from_fork+0x1f/0x30
> }
> ... key at: [<ffffffff83da4c00>] __key.0+0x0/0x10
> ... acquired at:
> _raw_read_lock+0x42/0x90
> led_trigger_blink_oneshot+0x3b/0x90
> ledtrig_disk_activity+0x3c/0xa0
> ata_qc_complete+0x26/0x450
> ata_do_link_abort+0xa3/0xe0
> ata_port_freeze+0x2e/0x40
> ata_hsm_qc_complete+0x94/0xa0
> ata_sff_hsm_move+0x177/0x7a0
> ata_sff_pio_task+0xc7/0x1b0
> process_one_work+0x240/0x560
> worker_thread+0x58/0x3d0
> kthread+0x151/0x170
> ret_from_fork+0x1f/0x30
>
> -> (&host->lock){-...}-{2:2} ops: 69 {
> IN-HARDIRQ-W at:
> lock_acquire+0x15f/0x420
> _raw_spin_lock_irqsave+0x52/0xa0
> ata_bmdma_interrupt+0x27/0x200
> __handle_irq_event_percpu+0xd5/0x2b0
> handle_irq_event+0x57/0xb0
> handle_edge_irq+0x8c/0x230
> asm_call_irq_on_stack+0xf/0x20
> common_interrupt+0x100/0x1c0
> asm_common_interrupt+0x1e/0x40
> native_safe_halt+0xe/0x10
> arch_cpu_idle+0x15/0x20
> default_idle_call+0x59/0x1c0
> do_idle+0x22c/0x2c0
> cpu_startup_entry+0x20/0x30
> start_secondary+0x11d/0x150
> secondary_startup_64_no_verify+0xa6/0xab
> INITIAL USE at:
> lock_acquire+0x15f/0x420
> _raw_spin_lock_irqsave+0x52/0xa0
> ata_dev_init+0x54/0xe0
> ata_link_init+0x8b/0xd0
> ata_port_alloc+0x1f1/0x210
> ata_host_alloc+0xf1/0x130
> ata_host_alloc_pinfo+0x14/0xb0
> ata_pci_sff_prepare_host+0x41/0xa0
> ata_pci_bmdma_prepare_host+0x14/0x30
> piix_init_one+0x21f/0x600
> local_pci_probe+0x48/0x80
> pci_device_probe+0x105/0x1c0
> really_probe+0x221/0x490
> driver_probe_device+0xe9/0x160
> device_driver_attach+0xb2/0xc0
> __driver_attach+0x91/0x150
> bus_for_each_dev+0x81/0xc0
> driver_attach+0x1e/0x20
> bus_add_driver+0x138/0x1f0
> driver_register+0x91/0xf0
> __pci_register_driver+0x73/0x80
> piix_init+0x1e/0x2e
> do_one_initcall+0x5f/0x2d0
> kernel_init_freeable+0x26f/0x2cf
> kernel_init+0xe/0x113
> ret_from_fork+0x1f/0x30
> }
> ... key at: [<ffffffff83d9fdc0>] __key.6+0x0/0x10
> ... acquired at:
> __lock_acquire+0x9da/0x2370
> lock_acquire+0x15f/0x420
> _raw_spin_lock_irqsave+0x52/0xa0
> ata_bmdma_interrupt+0x27/0x200
> __handle_irq_event_percpu+0xd5/0x2b0
> handle_irq_event+0x57/0xb0
> handle_edge_irq+0x8c/0x230
> asm_call_irq_on_stack+0xf/0x20
> common_interrupt+0x100/0x1c0
> asm_common_interrupt+0x1e/0x40
> native_safe_halt+0xe/0x10
> arch_cpu_idle+0x15/0x20
> default_idle_call+0x59/0x1c0
> do_idle+0x22c/0x2c0
> cpu_startup_entry+0x20/0x30
> start_secondary+0x11d/0x150
> secondary_startup_64_no_verify+0xa6/0xab
>
> This lockdep splat is reported after:
> commit e918188611f0 ("locking: More accurate annotations for read_lock()")
>
> To clarify:
> - read-locks are recursive only in interrupt context (when
> in_interrupt() returns true)
> - after acquiring host->lock in CPU1, another cpu (i.e. CPU2) may call
> write_lock(&trig->leddev_list_lock) that would be blocked by CPU0
> that holds trig->leddev_list_lock in read-mode
> - when CPU1 (ata_ac_complete()) tries to read-lock
> trig->leddev_list_lock, it would be blocked by the write-lock waiter
> on CPU2 (because we are not in interrupt context, so the read-lock is
> not recursive)
> - at this point if an interrupt happens on CPU0 and
> ata_bmdma_interrupt() is executed it will try to acquire host->lock,
> that is held by CPU1, that is currently blocked by CPU2, so:
>
> * CPU0 blocked by CPU1
> * CPU1 blocked by CPU2
> * CPU2 blocked by CPU0
>
> *** DEADLOCK ***
>
> The deadlock scenario is better represented by the following schema
> (thanks to Boqun Feng <[email protected]> for the schema and the
> detailed explanation of the deadlock condition):
>
> CPU 0: CPU 1: CPU 2:
> ----- ----- -----
> led_trigger_event():
> read_lock(&trig->leddev_list_lock);
> <workqueue>
> ata_hsm_qc_complete():
> spin_lock_irqsave(&host->lock);
> write_lock(&trig->leddev_list_lock);
> ata_port_freeze():
> ata_do_link_abort():
> ata_qc_complete():
> ledtrig_disk_activity():
> led_trigger_blink_oneshot():
> read_lock(&trig->leddev_list_lock);
> // ^ not in in_interrupt() context, so could get blocked by CPU 2
> <interrupt>
> ata_bmdma_interrupt():
> spin_lock_irqsave(&host->lock);
>
> Fix by using read_lock_irqsave/irqrestore() in led_trigger_event(), so
> that no interrupt can happen in between, preventing the deadlock
> condition.
>
> Link: https://lore.kernel.org/lkml/20201101092614.GB3989@xps-13-7390/
> Fixes: eb25cb9956cc ("leds: convert IDE trigger to common disk trigger")
> Signed-off-by: Andrea Righi <[email protected]>

I'd hate to see this in stable 3 days after Linus merges it...

Do these need _irqsave, too?

drivers/leds/led-triggers.c: read_lock(&trig->leddev_list_lock);
drivers/leds/led-triggers.c: read_unlock(&trig->leddev_list_lock);
drivers/leds/led-triggers.c: read_lock(&trig->leddev_list_lock);
drivers/leds/led-triggers.c: read_unlock(&trig->leddev_list_lock);

Best regards,
Pavel

> ---
> drivers/leds/led-triggers.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
> index 91da90cfb11d..16d1a93a10a8 100644
> --- a/drivers/leds/led-triggers.c
> +++ b/drivers/leds/led-triggers.c
> @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig,
> enum led_brightness brightness)
> {
> struct led_classdev *led_cdev;
> + unsigned long flags;
>
> if (!trig)
> return;
>
> - read_lock(&trig->leddev_list_lock);
> + read_lock_irqsave(&trig->leddev_list_lock, flags);
> list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list)
> led_set_brightness(led_cdev, brightness);
> - read_unlock(&trig->leddev_list_lock);
> + read_unlock_irqrestore(&trig->leddev_list_lock, flags);
> }
> EXPORT_SYMBOL_GPL(led_trigger_event);
>

--
http://www.livejournal.com/~pavelmachek


Attachments:
(No filename) (10.11 kB)
signature.asc (188.00 B)
Digital signature
Download all attachments

2020-11-25 14:05:29

by Boqun Feng

[permalink] [raw]
Subject: Re: [PATCH] leds: trigger: fix potential deadlock with libata

On Wed, Nov 25, 2020 at 01:46:48PM +0100, Pavel Machek wrote:
> Hi!
>
> > We have the following potential deadlock condition:
> >
> > ========================================================
> > WARNING: possible irq lock inversion dependency detected
> > 5.10.0-rc2+ #25 Not tainted
> > --------------------------------------------------------
> > swapper/3/0 just changed the state of lock:
> > ffff8880063bd618 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x27/0x200
> > but this lock took another, HARDIRQ-READ-unsafe lock in the past:
> > (&trig->leddev_list_lock){.+.?}-{2:2}
> >
> > and interrupts could create inverse lock ordering between them.
> >
> > other info that might help us debug this:
> > Possible interrupt unsafe locking scenario:
> >
> > CPU0 CPU1
> > ---- ----
> > lock(&trig->leddev_list_lock);
> > local_irq_disable();
> > lock(&host->lock);
> > lock(&trig->leddev_list_lock);
> > <Interrupt>
> > lock(&host->lock);
> >
> > *** DEADLOCK ***
> >
> > no locks held by swapper/3/0.
> >
> > the shortest dependencies between 2nd lock and 1st lock:
> > -> (&trig->leddev_list_lock){.+.?}-{2:2} ops: 46 {
> > HARDIRQ-ON-R at:
> > lock_acquire+0x15f/0x420
> > _raw_read_lock+0x42/0x90
> > led_trigger_event+0x2b/0x70
> > rfkill_global_led_trigger_worker+0x94/0xb0
> > process_one_work+0x240/0x560
> > worker_thread+0x58/0x3d0
> > kthread+0x151/0x170
> > ret_from_fork+0x1f/0x30
> > IN-SOFTIRQ-R at:
> > lock_acquire+0x15f/0x420
> > _raw_read_lock+0x42/0x90
> > led_trigger_event+0x2b/0x70
> > kbd_bh+0x9e/0xc0
> > tasklet_action_common.constprop.0+0xe9/0x100
> > tasklet_action+0x22/0x30
> > __do_softirq+0xcc/0x46d
> > run_ksoftirqd+0x3f/0x70
> > smpboot_thread_fn+0x116/0x1f0
> > kthread+0x151/0x170
> > ret_from_fork+0x1f/0x30
> > SOFTIRQ-ON-R at:
> > lock_acquire+0x15f/0x420
> > _raw_read_lock+0x42/0x90
> > led_trigger_event+0x2b/0x70
> > rfkill_global_led_trigger_worker+0x94/0xb0
> > process_one_work+0x240/0x560
> > worker_thread+0x58/0x3d0
> > kthread+0x151/0x170
> > ret_from_fork+0x1f/0x30
> > INITIAL READ USE at:
> > lock_acquire+0x15f/0x420
> > _raw_read_lock+0x42/0x90
> > led_trigger_event+0x2b/0x70
> > rfkill_global_led_trigger_worker+0x94/0xb0
> > process_one_work+0x240/0x560
> > worker_thread+0x58/0x3d0
> > kthread+0x151/0x170
> > ret_from_fork+0x1f/0x30
> > }
> > ... key at: [<ffffffff83da4c00>] __key.0+0x0/0x10
> > ... acquired at:
> > _raw_read_lock+0x42/0x90
> > led_trigger_blink_oneshot+0x3b/0x90
> > ledtrig_disk_activity+0x3c/0xa0
> > ata_qc_complete+0x26/0x450
> > ata_do_link_abort+0xa3/0xe0
> > ata_port_freeze+0x2e/0x40
> > ata_hsm_qc_complete+0x94/0xa0
> > ata_sff_hsm_move+0x177/0x7a0
> > ata_sff_pio_task+0xc7/0x1b0
> > process_one_work+0x240/0x560
> > worker_thread+0x58/0x3d0
> > kthread+0x151/0x170
> > ret_from_fork+0x1f/0x30
> >
> > -> (&host->lock){-...}-{2:2} ops: 69 {
> > IN-HARDIRQ-W at:
> > lock_acquire+0x15f/0x420
> > _raw_spin_lock_irqsave+0x52/0xa0
> > ata_bmdma_interrupt+0x27/0x200
> > __handle_irq_event_percpu+0xd5/0x2b0
> > handle_irq_event+0x57/0xb0
> > handle_edge_irq+0x8c/0x230
> > asm_call_irq_on_stack+0xf/0x20
> > common_interrupt+0x100/0x1c0
> > asm_common_interrupt+0x1e/0x40
> > native_safe_halt+0xe/0x10
> > arch_cpu_idle+0x15/0x20
> > default_idle_call+0x59/0x1c0
> > do_idle+0x22c/0x2c0
> > cpu_startup_entry+0x20/0x30
> > start_secondary+0x11d/0x150
> > secondary_startup_64_no_verify+0xa6/0xab
> > INITIAL USE at:
> > lock_acquire+0x15f/0x420
> > _raw_spin_lock_irqsave+0x52/0xa0
> > ata_dev_init+0x54/0xe0
> > ata_link_init+0x8b/0xd0
> > ata_port_alloc+0x1f1/0x210
> > ata_host_alloc+0xf1/0x130
> > ata_host_alloc_pinfo+0x14/0xb0
> > ata_pci_sff_prepare_host+0x41/0xa0
> > ata_pci_bmdma_prepare_host+0x14/0x30
> > piix_init_one+0x21f/0x600
> > local_pci_probe+0x48/0x80
> > pci_device_probe+0x105/0x1c0
> > really_probe+0x221/0x490
> > driver_probe_device+0xe9/0x160
> > device_driver_attach+0xb2/0xc0
> > __driver_attach+0x91/0x150
> > bus_for_each_dev+0x81/0xc0
> > driver_attach+0x1e/0x20
> > bus_add_driver+0x138/0x1f0
> > driver_register+0x91/0xf0
> > __pci_register_driver+0x73/0x80
> > piix_init+0x1e/0x2e
> > do_one_initcall+0x5f/0x2d0
> > kernel_init_freeable+0x26f/0x2cf
> > kernel_init+0xe/0x113
> > ret_from_fork+0x1f/0x30
> > }
> > ... key at: [<ffffffff83d9fdc0>] __key.6+0x0/0x10
> > ... acquired at:
> > __lock_acquire+0x9da/0x2370
> > lock_acquire+0x15f/0x420
> > _raw_spin_lock_irqsave+0x52/0xa0
> > ata_bmdma_interrupt+0x27/0x200
> > __handle_irq_event_percpu+0xd5/0x2b0
> > handle_irq_event+0x57/0xb0
> > handle_edge_irq+0x8c/0x230
> > asm_call_irq_on_stack+0xf/0x20
> > common_interrupt+0x100/0x1c0
> > asm_common_interrupt+0x1e/0x40
> > native_safe_halt+0xe/0x10
> > arch_cpu_idle+0x15/0x20
> > default_idle_call+0x59/0x1c0
> > do_idle+0x22c/0x2c0
> > cpu_startup_entry+0x20/0x30
> > start_secondary+0x11d/0x150
> > secondary_startup_64_no_verify+0xa6/0xab
> >
> > This lockdep splat is reported after:
> > commit e918188611f0 ("locking: More accurate annotations for read_lock()")
> >
> > To clarify:
> > - read-locks are recursive only in interrupt context (when
> > in_interrupt() returns true)
> > - after acquiring host->lock in CPU1, another cpu (i.e. CPU2) may call
> > write_lock(&trig->leddev_list_lock) that would be blocked by CPU0
> > that holds trig->leddev_list_lock in read-mode
> > - when CPU1 (ata_ac_complete()) tries to read-lock
> > trig->leddev_list_lock, it would be blocked by the write-lock waiter
> > on CPU2 (because we are not in interrupt context, so the read-lock is
> > not recursive)
> > - at this point if an interrupt happens on CPU0 and
> > ata_bmdma_interrupt() is executed it will try to acquire host->lock,
> > that is held by CPU1, that is currently blocked by CPU2, so:
> >
> > * CPU0 blocked by CPU1
> > * CPU1 blocked by CPU2
> > * CPU2 blocked by CPU0
> >
> > *** DEADLOCK ***
> >
> > The deadlock scenario is better represented by the following schema
> > (thanks to Boqun Feng <[email protected]> for the schema and the
> > detailed explanation of the deadlock condition):
> >
> > CPU 0: CPU 1: CPU 2:
> > ----- ----- -----
> > led_trigger_event():
> > read_lock(&trig->leddev_list_lock);
> > <workqueue>
> > ata_hsm_qc_complete():
> > spin_lock_irqsave(&host->lock);
> > write_lock(&trig->leddev_list_lock);
> > ata_port_freeze():
> > ata_do_link_abort():
> > ata_qc_complete():
> > ledtrig_disk_activity():
> > led_trigger_blink_oneshot():
> > read_lock(&trig->leddev_list_lock);
> > // ^ not in in_interrupt() context, so could get blocked by CPU 2
> > <interrupt>
> > ata_bmdma_interrupt():
> > spin_lock_irqsave(&host->lock);
> >
> > Fix by using read_lock_irqsave/irqrestore() in led_trigger_event(), so
> > that no interrupt can happen in between, preventing the deadlock
> > condition.
> >
> > Link: https://lore.kernel.org/lkml/20201101092614.GB3989@xps-13-7390/
> > Fixes: eb25cb9956cc ("leds: convert IDE trigger to common disk trigger")
> > Signed-off-by: Andrea Righi <[email protected]>
>
> I'd hate to see this in stable 3 days after Linus merges it...
>
> Do these need _irqsave, too?
>
> drivers/leds/led-triggers.c: read_lock(&trig->leddev_list_lock);
> drivers/leds/led-triggers.c: read_unlock(&trig->leddev_list_lock);
> drivers/leds/led-triggers.c: read_lock(&trig->leddev_list_lock);
> drivers/leds/led-triggers.c: read_unlock(&trig->leddev_list_lock);
>

I think so, if you mean the read_{un,}lock in led_trigger_blink_setup()
also need to convert to irq-safe version ;-)

Regards,
Boqun

> Best regards,
> Pavel
>
> > ---
> > drivers/leds/led-triggers.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
> > index 91da90cfb11d..16d1a93a10a8 100644
> > --- a/drivers/leds/led-triggers.c
> > +++ b/drivers/leds/led-triggers.c
> > @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig,
> > enum led_brightness brightness)
> > {
> > struct led_classdev *led_cdev;
> > + unsigned long flags;
> >
> > if (!trig)
> > return;
> >
> > - read_lock(&trig->leddev_list_lock);
> > + read_lock_irqsave(&trig->leddev_list_lock, flags);
> > list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list)
> > led_set_brightness(led_cdev, brightness);
> > - read_unlock(&trig->leddev_list_lock);
> > + read_unlock_irqrestore(&trig->leddev_list_lock, flags);
> > }
> > EXPORT_SYMBOL_GPL(led_trigger_event);
> >
>
> --
> http://www.livejournal.com/~pavelmachek


2020-11-25 14:19:18

by Andrea Righi

[permalink] [raw]
Subject: Re: [PATCH] leds: trigger: fix potential deadlock with libata

Hi Pavel,

On Wed, Nov 25, 2020 at 01:46:48PM +0100, Pavel Machek wrote:
> Hi!
>
> > We have the following potential deadlock condition:
> >
> > ========================================================
> > WARNING: possible irq lock inversion dependency detected
> > 5.10.0-rc2+ #25 Not tainted
> > --------------------------------------------------------
> > swapper/3/0 just changed the state of lock:
> > ffff8880063bd618 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x27/0x200
> > but this lock took another, HARDIRQ-READ-unsafe lock in the past:
> > (&trig->leddev_list_lock){.+.?}-{2:2}
> >
> > and interrupts could create inverse lock ordering between them.
> >
> > other info that might help us debug this:
> > Possible interrupt unsafe locking scenario:
> >
> > CPU0 CPU1
> > ---- ----
> > lock(&trig->leddev_list_lock);
> > local_irq_disable();
> > lock(&host->lock);
> > lock(&trig->leddev_list_lock);
> > <Interrupt>
> > lock(&host->lock);
> >
> > *** DEADLOCK ***
> >
> > no locks held by swapper/3/0.
> >
> > the shortest dependencies between 2nd lock and 1st lock:
> > -> (&trig->leddev_list_lock){.+.?}-{2:2} ops: 46 {
> > HARDIRQ-ON-R at:
> > lock_acquire+0x15f/0x420
> > _raw_read_lock+0x42/0x90
> > led_trigger_event+0x2b/0x70
> > rfkill_global_led_trigger_worker+0x94/0xb0
> > process_one_work+0x240/0x560
> > worker_thread+0x58/0x3d0
> > kthread+0x151/0x170
> > ret_from_fork+0x1f/0x30
> > IN-SOFTIRQ-R at:
> > lock_acquire+0x15f/0x420
> > _raw_read_lock+0x42/0x90
> > led_trigger_event+0x2b/0x70
> > kbd_bh+0x9e/0xc0
> > tasklet_action_common.constprop.0+0xe9/0x100
> > tasklet_action+0x22/0x30
> > __do_softirq+0xcc/0x46d
> > run_ksoftirqd+0x3f/0x70
> > smpboot_thread_fn+0x116/0x1f0
> > kthread+0x151/0x170
> > ret_from_fork+0x1f/0x30
> > SOFTIRQ-ON-R at:
> > lock_acquire+0x15f/0x420
> > _raw_read_lock+0x42/0x90
> > led_trigger_event+0x2b/0x70
> > rfkill_global_led_trigger_worker+0x94/0xb0
> > process_one_work+0x240/0x560
> > worker_thread+0x58/0x3d0
> > kthread+0x151/0x170
> > ret_from_fork+0x1f/0x30
> > INITIAL READ USE at:
> > lock_acquire+0x15f/0x420
> > _raw_read_lock+0x42/0x90
> > led_trigger_event+0x2b/0x70
> > rfkill_global_led_trigger_worker+0x94/0xb0
> > process_one_work+0x240/0x560
> > worker_thread+0x58/0x3d0
> > kthread+0x151/0x170
> > ret_from_fork+0x1f/0x30
> > }
> > ... key at: [<ffffffff83da4c00>] __key.0+0x0/0x10
> > ... acquired at:
> > _raw_read_lock+0x42/0x90
> > led_trigger_blink_oneshot+0x3b/0x90
> > ledtrig_disk_activity+0x3c/0xa0
> > ata_qc_complete+0x26/0x450
> > ata_do_link_abort+0xa3/0xe0
> > ata_port_freeze+0x2e/0x40
> > ata_hsm_qc_complete+0x94/0xa0
> > ata_sff_hsm_move+0x177/0x7a0
> > ata_sff_pio_task+0xc7/0x1b0
> > process_one_work+0x240/0x560
> > worker_thread+0x58/0x3d0
> > kthread+0x151/0x170
> > ret_from_fork+0x1f/0x30
> >
> > -> (&host->lock){-...}-{2:2} ops: 69 {
> > IN-HARDIRQ-W at:
> > lock_acquire+0x15f/0x420
> > _raw_spin_lock_irqsave+0x52/0xa0
> > ata_bmdma_interrupt+0x27/0x200
> > __handle_irq_event_percpu+0xd5/0x2b0
> > handle_irq_event+0x57/0xb0
> > handle_edge_irq+0x8c/0x230
> > asm_call_irq_on_stack+0xf/0x20
> > common_interrupt+0x100/0x1c0
> > asm_common_interrupt+0x1e/0x40
> > native_safe_halt+0xe/0x10
> > arch_cpu_idle+0x15/0x20
> > default_idle_call+0x59/0x1c0
> > do_idle+0x22c/0x2c0
> > cpu_startup_entry+0x20/0x30
> > start_secondary+0x11d/0x150
> > secondary_startup_64_no_verify+0xa6/0xab
> > INITIAL USE at:
> > lock_acquire+0x15f/0x420
> > _raw_spin_lock_irqsave+0x52/0xa0
> > ata_dev_init+0x54/0xe0
> > ata_link_init+0x8b/0xd0
> > ata_port_alloc+0x1f1/0x210
> > ata_host_alloc+0xf1/0x130
> > ata_host_alloc_pinfo+0x14/0xb0
> > ata_pci_sff_prepare_host+0x41/0xa0
> > ata_pci_bmdma_prepare_host+0x14/0x30
> > piix_init_one+0x21f/0x600
> > local_pci_probe+0x48/0x80
> > pci_device_probe+0x105/0x1c0
> > really_probe+0x221/0x490
> > driver_probe_device+0xe9/0x160
> > device_driver_attach+0xb2/0xc0
> > __driver_attach+0x91/0x150
> > bus_for_each_dev+0x81/0xc0
> > driver_attach+0x1e/0x20
> > bus_add_driver+0x138/0x1f0
> > driver_register+0x91/0xf0
> > __pci_register_driver+0x73/0x80
> > piix_init+0x1e/0x2e
> > do_one_initcall+0x5f/0x2d0
> > kernel_init_freeable+0x26f/0x2cf
> > kernel_init+0xe/0x113
> > ret_from_fork+0x1f/0x30
> > }
> > ... key at: [<ffffffff83d9fdc0>] __key.6+0x0/0x10
> > ... acquired at:
> > __lock_acquire+0x9da/0x2370
> > lock_acquire+0x15f/0x420
> > _raw_spin_lock_irqsave+0x52/0xa0
> > ata_bmdma_interrupt+0x27/0x200
> > __handle_irq_event_percpu+0xd5/0x2b0
> > handle_irq_event+0x57/0xb0
> > handle_edge_irq+0x8c/0x230
> > asm_call_irq_on_stack+0xf/0x20
> > common_interrupt+0x100/0x1c0
> > asm_common_interrupt+0x1e/0x40
> > native_safe_halt+0xe/0x10
> > arch_cpu_idle+0x15/0x20
> > default_idle_call+0x59/0x1c0
> > do_idle+0x22c/0x2c0
> > cpu_startup_entry+0x20/0x30
> > start_secondary+0x11d/0x150
> > secondary_startup_64_no_verify+0xa6/0xab
> >
> > This lockdep splat is reported after:
> > commit e918188611f0 ("locking: More accurate annotations for read_lock()")
> >
> > To clarify:
> > - read-locks are recursive only in interrupt context (when
> > in_interrupt() returns true)
> > - after acquiring host->lock in CPU1, another cpu (i.e. CPU2) may call
> > write_lock(&trig->leddev_list_lock) that would be blocked by CPU0
> > that holds trig->leddev_list_lock in read-mode
> > - when CPU1 (ata_ac_complete()) tries to read-lock
> > trig->leddev_list_lock, it would be blocked by the write-lock waiter
> > on CPU2 (because we are not in interrupt context, so the read-lock is
> > not recursive)
> > - at this point if an interrupt happens on CPU0 and
> > ata_bmdma_interrupt() is executed it will try to acquire host->lock,
> > that is held by CPU1, that is currently blocked by CPU2, so:
> >
> > * CPU0 blocked by CPU1
> > * CPU1 blocked by CPU2
> > * CPU2 blocked by CPU0
> >
> > *** DEADLOCK ***
> >
> > The deadlock scenario is better represented by the following schema
> > (thanks to Boqun Feng <[email protected]> for the schema and the
> > detailed explanation of the deadlock condition):
> >
> > CPU 0: CPU 1: CPU 2:
> > ----- ----- -----
> > led_trigger_event():
> > read_lock(&trig->leddev_list_lock);
> > <workqueue>
> > ata_hsm_qc_complete():
> > spin_lock_irqsave(&host->lock);
> > write_lock(&trig->leddev_list_lock);
> > ata_port_freeze():
> > ata_do_link_abort():
> > ata_qc_complete():
> > ledtrig_disk_activity():
> > led_trigger_blink_oneshot():
> > read_lock(&trig->leddev_list_lock);
> > // ^ not in in_interrupt() context, so could get blocked by CPU 2
> > <interrupt>
> > ata_bmdma_interrupt():
> > spin_lock_irqsave(&host->lock);
> >
> > Fix by using read_lock_irqsave/irqrestore() in led_trigger_event(), so
> > that no interrupt can happen in between, preventing the deadlock
> > condition.
> >
> > Link: https://lore.kernel.org/lkml/20201101092614.GB3989@xps-13-7390/
> > Fixes: eb25cb9956cc ("leds: convert IDE trigger to common disk trigger")
> > Signed-off-by: Andrea Righi <[email protected]>
>
> I'd hate to see this in stable 3 days after Linus merges it...
>
> Do these need _irqsave, too?
>
> drivers/leds/led-triggers.c: read_lock(&trig->leddev_list_lock);
> drivers/leds/led-triggers.c: read_unlock(&trig->leddev_list_lock);
> drivers/leds/led-triggers.c: read_lock(&trig->leddev_list_lock);
> drivers/leds/led-triggers.c: read_unlock(&trig->leddev_list_lock);
>
> Best regards,

I think also led_trigger_blink_setup() needs to use irqsave/irqrestore,
in fact:

$ git grep "led_trigger_blink("
drivers/leds/led-triggers.c:void led_trigger_blink(struct led_trigger *trig,
drivers/power/supply/power_supply_leds.c: led_trigger_blink(psy->charging_blink_full_solid_trig,
include/linux/leds.h:void led_trigger_blink(struct led_trigger *trigger, unsigned long *delay_on,
include/linux/leds.h:static inline void led_trigger_blink(struct led_trigger *trigger,

power_supply_leds.c is using led_trigger_blink() from a workqueue
context, so potentially the same deadlock condition can also happen.

Let me know if you want me to send a new patch to include also this
case.

-Andrea

2020-11-25 15:27:17

by Andrea Righi

[permalink] [raw]
Subject: Re: [PATCH] leds: trigger: fix potential deadlock with libata

On Wed, Nov 25, 2020 at 03:15:18PM +0100, Andrea Righi wrote:
...
> > I'd hate to see this in stable 3 days after Linus merges it...
> >
> > Do these need _irqsave, too?
> >
> > drivers/leds/led-triggers.c: read_lock(&trig->leddev_list_lock);
> > drivers/leds/led-triggers.c: read_unlock(&trig->leddev_list_lock);
> > drivers/leds/led-triggers.c: read_lock(&trig->leddev_list_lock);
> > drivers/leds/led-triggers.c: read_unlock(&trig->leddev_list_lock);
> >
> > Best regards,
>
> I think also led_trigger_blink_setup() needs to use irqsave/irqrestore,
> in fact:
>
> $ git grep "led_trigger_blink("
> drivers/leds/led-triggers.c:void led_trigger_blink(struct led_trigger *trig,
> drivers/power/supply/power_supply_leds.c: led_trigger_blink(psy->charging_blink_full_solid_trig,
> include/linux/leds.h:void led_trigger_blink(struct led_trigger *trigger, unsigned long *delay_on,
> include/linux/leds.h:static inline void led_trigger_blink(struct led_trigger *trigger,
>
> power_supply_leds.c is using led_trigger_blink() from a workqueue
> context, so potentially the same deadlock condition can also happen.
>
> Let me know if you want me to send a new patch to include also this
> case.

Just sent (and tested) a v2 of this patch that changes also
led_trigger_blink_setup().

-Andrea

2021-03-06 20:42:52

by Marc Kleine-Budde

[permalink] [raw]
Subject: Re: [PATCH] leds: trigger: fix potential deadlock with libata

Hello *,

On 02.11.2020 11:41:52, Andrea Righi wrote:
> We have the following potential deadlock condition:
>
> ========================================================
> WARNING: possible irq lock inversion dependency detected
> 5.10.0-rc2+ #25 Not tainted
> --------------------------------------------------------
> swapper/3/0 just changed the state of lock:
> ffff8880063bd618 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x27/0x200
> but this lock took another, HARDIRQ-READ-unsafe lock in the past:
> (&trig->leddev_list_lock){.+.?}-{2:2}
>
> and interrupts could create inverse lock ordering between them.

[...]

> ---
> drivers/leds/led-triggers.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
> index 91da90cfb11d..16d1a93a10a8 100644
> --- a/drivers/leds/led-triggers.c
> +++ b/drivers/leds/led-triggers.c
> @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig,
> enum led_brightness brightness)
> {
> struct led_classdev *led_cdev;
> + unsigned long flags;
>
> if (!trig)
> return;
>
> - read_lock(&trig->leddev_list_lock);
> + read_lock_irqsave(&trig->leddev_list_lock, flags);
> list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list)
> led_set_brightness(led_cdev, brightness);
> - read_unlock(&trig->leddev_list_lock);
> + read_unlock_irqrestore(&trig->leddev_list_lock, flags);
> }
> EXPORT_SYMBOL_GPL(led_trigger_event);

meanwhile this patch hit v5.10.x stable and caused a performance
degradation on our use case:

It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN
controller. CAN stands for Controller Area Network and here used to
connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific
Transport Protocol) transfer is running. With this patch, we see CAN
frames delayed for ~6ms, the usual gap between CAN frames is 240µs.

Reverting this patch, restores the old performance.

What is the best way to solve this dilemma? Identify the critical path
in our use case? Is there a way we can get around the irqsave in
led_trigger_event()?

regards,
Marc

--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |


Attachments:
(No filename) (2.44 kB)
signature.asc (499.00 B)
Download all attachments

2021-03-07 02:14:54

by Boqun Feng

[permalink] [raw]
Subject: Re: [PATCH] leds: trigger: fix potential deadlock with libata

On Sat, Mar 06, 2021 at 09:39:54PM +0100, Marc Kleine-Budde wrote:
> Hello *,
>
> On 02.11.2020 11:41:52, Andrea Righi wrote:
> > We have the following potential deadlock condition:
> >
> > ========================================================
> > WARNING: possible irq lock inversion dependency detected
> > 5.10.0-rc2+ #25 Not tainted
> > --------------------------------------------------------
> > swapper/3/0 just changed the state of lock:
> > ffff8880063bd618 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x27/0x200
> > but this lock took another, HARDIRQ-READ-unsafe lock in the past:
> > (&trig->leddev_list_lock){.+.?}-{2:2}
> >
> > and interrupts could create inverse lock ordering between them.
>
> [...]
>
> > ---
> > drivers/leds/led-triggers.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
> > index 91da90cfb11d..16d1a93a10a8 100644
> > --- a/drivers/leds/led-triggers.c
> > +++ b/drivers/leds/led-triggers.c
> > @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig,
> > enum led_brightness brightness)
> > {
> > struct led_classdev *led_cdev;
> > + unsigned long flags;
> >
> > if (!trig)
> > return;
> >
> > - read_lock(&trig->leddev_list_lock);
> > + read_lock_irqsave(&trig->leddev_list_lock, flags);
> > list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list)
> > led_set_brightness(led_cdev, brightness);
> > - read_unlock(&trig->leddev_list_lock);
> > + read_unlock_irqrestore(&trig->leddev_list_lock, flags);
> > }
> > EXPORT_SYMBOL_GPL(led_trigger_event);
>
> meanwhile this patch hit v5.10.x stable and caused a performance
> degradation on our use case:
>
> It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN
> controller. CAN stands for Controller Area Network and here used to
> connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific
> Transport Protocol) transfer is running. With this patch, we see CAN
> frames delayed for ~6ms, the usual gap between CAN frames is 240?s.
>
> Reverting this patch, restores the old performance.
>
> What is the best way to solve this dilemma? Identify the critical path
> in our use case? Is there a way we can get around the irqsave in
> led_trigger_event()?
>

Probably, we can change from rwlock to rcu here, POC code as follow,
only compile tested. Marc, could you see whether this help the
performance on your platform? Please note that I haven't test it in a
running kernel and I'm not that familir with led subsystem, so use it
with caution ;-)

(While at it, I think maybe we miss the leddev_list_lock in net/mac80211
in the patch)

Regards,
Boqun
------->8
diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
index 4e7b78a84149..ae68ccab6cc9 100644
--- a/drivers/leds/led-triggers.c
+++ b/drivers/leds/led-triggers.c
@@ -171,10 +171,12 @@ int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig)

/* Remove any existing trigger */
if (led_cdev->trigger) {
- write_lock_irqsave(&led_cdev->trigger->leddev_list_lock, flags);
- list_del(&led_cdev->trig_list);
- write_unlock_irqrestore(&led_cdev->trigger->leddev_list_lock,
+ spin_lock_irqsave(&led_cdev->trigger->leddev_list_lock, flags);
+ list_del_rcu(&led_cdev->trig_list);
+ spin_unlock_irqrestore(&led_cdev->trigger->leddev_list_lock,
flags);
+ /* Wait for the readers gone */
+ synchronize_rcu();
cancel_work_sync(&led_cdev->set_brightness_work);
led_stop_software_blink(led_cdev);
if (led_cdev->trigger->deactivate)
@@ -186,9 +188,9 @@ int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig)
led_set_brightness(led_cdev, LED_OFF);
}
if (trig) {
- write_lock_irqsave(&trig->leddev_list_lock, flags);
- list_add_tail(&led_cdev->trig_list, &trig->led_cdevs);
- write_unlock_irqrestore(&trig->leddev_list_lock, flags);
+ spin_lock_irqsave(&trig->leddev_list_lock, flags);
+ list_add_tail_rcu(&led_cdev->trig_list, &trig->led_cdevs);
+ spin_unlock_irqrestore(&trig->leddev_list_lock, flags);
led_cdev->trigger = trig;

if (trig->activate)
@@ -223,9 +225,12 @@ int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig)
trig->deactivate(led_cdev);
err_activate:

- write_lock_irqsave(&led_cdev->trigger->leddev_list_lock, flags);
- list_del(&led_cdev->trig_list);
- write_unlock_irqrestore(&led_cdev->trigger->leddev_list_lock, flags);
+ spin_lock_irqsave(&led_cdev->trigger->leddev_list_lock, flags);
+ list_del_rcu(&led_cdev->trig_list);
+ spin_unlock_irqrestore(&led_cdev->trigger->leddev_list_lock, flags);
+
+ /* XXX could use call_rcu() here? */
+ synchronize_rcu();
led_cdev->trigger = NULL;
led_cdev->trigger_data = NULL;
led_set_brightness(led_cdev, LED_OFF);
@@ -285,7 +290,7 @@ int led_trigger_register(struct led_trigger *trig)
struct led_classdev *led_cdev;
struct led_trigger *_trig;

- rwlock_init(&trig->leddev_list_lock);
+ spin_lock_init(&trig->leddev_list_lock);
INIT_LIST_HEAD(&trig->led_cdevs);

down_write(&triggers_list_lock);
@@ -378,15 +383,14 @@ void led_trigger_event(struct led_trigger *trig,
enum led_brightness brightness)
{
struct led_classdev *led_cdev;
- unsigned long flags;

if (!trig)
return;

- read_lock_irqsave(&trig->leddev_list_lock, flags);
- list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list)
+ rcu_read_lock();
+ list_for_each_entry_rcu(led_cdev, &trig->led_cdevs, trig_list)
led_set_brightness(led_cdev, brightness);
- read_unlock_irqrestore(&trig->leddev_list_lock, flags);
+ rcu_read_unlock();
}
EXPORT_SYMBOL_GPL(led_trigger_event);

@@ -397,20 +401,19 @@ static void led_trigger_blink_setup(struct led_trigger *trig,
int invert)
{
struct led_classdev *led_cdev;
- unsigned long flags;

if (!trig)
return;

- read_lock_irqsave(&trig->leddev_list_lock, flags);
- list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list) {
+ rcu_read_lock();
+ list_for_each_entry_rcu(led_cdev, &trig->led_cdevs, trig_list) {
if (oneshot)
led_blink_set_oneshot(led_cdev, delay_on, delay_off,
invert);
else
led_blink_set(led_cdev, delay_on, delay_off);
}
- read_unlock_irqrestore(&trig->leddev_list_lock, flags);
+ rcu_read_unlock();
}

void led_trigger_blink(struct led_trigger *trig,
diff --git a/include/linux/leds.h b/include/linux/leds.h
index 6a8d6409c993..5acc0e8a9f18 100644
--- a/include/linux/leds.h
+++ b/include/linux/leds.h
@@ -356,7 +356,7 @@ struct led_trigger {
struct led_hw_trigger_type *trigger_type;

/* LEDs under control by this trigger (for simple triggers) */
- rwlock_t leddev_list_lock;
+ spinlock_t leddev_list_lock;
struct list_head led_cdevs;

/* Link to next registered trigger */
diff --git a/net/mac80211/led.c b/net/mac80211/led.c
index b275c8853074..5ec5070fe210 100644
--- a/net/mac80211/led.c
+++ b/net/mac80211/led.c
@@ -283,10 +283,10 @@ static void tpt_trig_timer(struct timer_list *t)
}
}

- read_lock(&local->tpt_led.leddev_list_lock);
+ rcu_read_lock();
list_for_each_entry(led_cdev, &local->tpt_led.led_cdevs, trig_list)
led_blink_set(led_cdev, &on, &off);
- read_unlock(&local->tpt_led.leddev_list_lock);
+ rcu_read_unlock();
}

const char *
@@ -349,10 +349,10 @@ static void ieee80211_stop_tpt_led_trig(struct ieee80211_local *local)
tpt_trig->running = false;
del_timer_sync(&tpt_trig->timer);

- read_lock(&local->tpt_led.leddev_list_lock);
+ rcu_read_lock();
list_for_each_entry(led_cdev, &local->tpt_led.led_cdevs, trig_list)
led_set_brightness(led_cdev, LED_OFF);
- read_unlock(&local->tpt_led.leddev_list_lock);
+ rcu_read_unlock();
}

void ieee80211_mod_tpt_led_trig(struct ieee80211_local *local,

2021-03-07 07:48:13

by Andrea Righi

[permalink] [raw]
Subject: Re: [PATCH] leds: trigger: fix potential deadlock with libata

On Sun, Mar 07, 2021 at 10:02:32AM +0800, Boqun Feng wrote:
> On Sat, Mar 06, 2021 at 09:39:54PM +0100, Marc Kleine-Budde wrote:
> > Hello *,
> >
> > On 02.11.2020 11:41:52, Andrea Righi wrote:
> > > We have the following potential deadlock condition:
> > >
> > > ========================================================
> > > WARNING: possible irq lock inversion dependency detected
> > > 5.10.0-rc2+ #25 Not tainted
> > > --------------------------------------------------------
> > > swapper/3/0 just changed the state of lock:
> > > ffff8880063bd618 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x27/0x200
> > > but this lock took another, HARDIRQ-READ-unsafe lock in the past:
> > > (&trig->leddev_list_lock){.+.?}-{2:2}
> > >
> > > and interrupts could create inverse lock ordering between them.
> >
> > [...]
> >
> > > ---
> > > drivers/leds/led-triggers.c | 5 +++--
> > > 1 file changed, 3 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
> > > index 91da90cfb11d..16d1a93a10a8 100644
> > > --- a/drivers/leds/led-triggers.c
> > > +++ b/drivers/leds/led-triggers.c
> > > @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig,
> > > enum led_brightness brightness)
> > > {
> > > struct led_classdev *led_cdev;
> > > + unsigned long flags;
> > >
> > > if (!trig)
> > > return;
> > >
> > > - read_lock(&trig->leddev_list_lock);
> > > + read_lock_irqsave(&trig->leddev_list_lock, flags);
> > > list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list)
> > > led_set_brightness(led_cdev, brightness);
> > > - read_unlock(&trig->leddev_list_lock);
> > > + read_unlock_irqrestore(&trig->leddev_list_lock, flags);
> > > }
> > > EXPORT_SYMBOL_GPL(led_trigger_event);
> >
> > meanwhile this patch hit v5.10.x stable and caused a performance
> > degradation on our use case:
> >
> > It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN
> > controller. CAN stands for Controller Area Network and here used to
> > connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific
> > Transport Protocol) transfer is running. With this patch, we see CAN
> > frames delayed for ~6ms, the usual gap between CAN frames is 240?s.
> >
> > Reverting this patch, restores the old performance.
> >
> > What is the best way to solve this dilemma? Identify the critical path
> > in our use case? Is there a way we can get around the irqsave in
> > led_trigger_event()?
> >
>
> Probably, we can change from rwlock to rcu here, POC code as follow,
> only compile tested. Marc, could you see whether this help the
> performance on your platform? Please note that I haven't test it in a
> running kernel and I'm not that familir with led subsystem, so use it
> with caution ;-)

If we don't want to touch the led subsystem at all maybe we could try to
fix the problem in libata, we just need to prevent calling
led_trigger_blink_oneshot() with &host->lock held from
ata_qc_complete(), maybe doing the led blinking from another context (a
workqueue for example)?

-Andrea

2021-03-08 02:43:04

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] leds: trigger: fix potential deadlock with libata

Hi!

> > --- a/drivers/leds/led-triggers.c
> > +++ b/drivers/leds/led-triggers.c
> > @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig,
> > enum led_brightness brightness)
> > {
> > struct led_classdev *led_cdev;
> > + unsigned long flags;
> >
> > if (!trig)
> > return;
> >
> > - read_lock(&trig->leddev_list_lock);
> > + read_lock_irqsave(&trig->leddev_list_lock, flags);
> > list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list)
> > led_set_brightness(led_cdev, brightness);
> > - read_unlock(&trig->leddev_list_lock);
> > + read_unlock_irqrestore(&trig->leddev_list_lock, flags);
> > }
> > EXPORT_SYMBOL_GPL(led_trigger_event);
>
> meanwhile this patch hit v5.10.x stable and caused a performance
> degradation on our use case:
>
> It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN
> controller. CAN stands for Controller Area Network and here used to
> connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific
> Transport Protocol) transfer is running. With this patch, we see CAN
> frames delayed for ~6ms, the usual gap between CAN frames is 240?s.
>
> Reverting this patch, restores the old performance.
>
> What is the best way to solve this dilemma? Identify the critical path
> in our use case? Is there a way we can get around the irqsave in
> led_trigger_event()?

Hans was pushing for this patch, perhaps he has some ideas...
Pavel
--
http://www.livejournal.com/~pavelmachek


Attachments:
(No filename) (1.49 kB)
signature.asc (201.00 B)
Download all attachments

2021-03-08 03:30:45

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] leds: trigger: fix potential deadlock with libata

Hi!

> > ---
> > drivers/leds/led-triggers.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
> > index 91da90cfb11d..16d1a93a10a8 100644
> > --- a/drivers/leds/led-triggers.c
> > +++ b/drivers/leds/led-triggers.c
> > @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig,
> > enum led_brightness brightness)
> > {
> > struct led_classdev *led_cdev;
> > + unsigned long flags;
> >
> > if (!trig)
> > return;
> >
> > - read_lock(&trig->leddev_list_lock);
> > + read_lock_irqsave(&trig->leddev_list_lock, flags);
> > list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list)
> > led_set_brightness(led_cdev, brightness);
> > - read_unlock(&trig->leddev_list_lock);
> > + read_unlock_irqrestore(&trig->leddev_list_lock, flags);
> > }
> > EXPORT_SYMBOL_GPL(led_trigger_event);
>
> meanwhile this patch hit v5.10.x stable and caused a performance
> degradation on our use case:
>
> It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN
> controller. CAN stands for Controller Area Network and here used to
> connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific
> Transport Protocol) transfer is running. With this patch, we see CAN
> frames delayed for ~6ms, the usual gap between CAN frames is 240?s.
>
> Reverting this patch, restores the old performance.
>
> What is the best way to solve this dilemma? Identify the critical path
> in our use case? Is there a way we can get around the irqsave in
> led_trigger_event()?

6ms is quite long. Are you actively using any triggers? Do you have
LED blinking on CAN access?

Can you verify if it is cli/sti taking too long, or if the
led_set_brightness takes too long?

Best regards,
Pavel

--
http://www.livejournal.com/~pavelmachek


Attachments:
(No filename) (1.87 kB)
signature.asc (201.00 B)
Download all attachments

2021-03-08 08:15:57

by Hans de Goede

[permalink] [raw]
Subject: Re: [PATCH] leds: trigger: fix potential deadlock with libata

Hi,

On 3/7/21 5:13 PM, Pavel Machek wrote:
> Hi!
>
>>> --- a/drivers/leds/led-triggers.c
>>> +++ b/drivers/leds/led-triggers.c
>>> @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig,
>>> enum led_brightness brightness)
>>> {
>>> struct led_classdev *led_cdev;
>>> + unsigned long flags;
>>>
>>> if (!trig)
>>> return;
>>>
>>> - read_lock(&trig->leddev_list_lock);
>>> + read_lock_irqsave(&trig->leddev_list_lock, flags);
>>> list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list)
>>> led_set_brightness(led_cdev, brightness);
>>> - read_unlock(&trig->leddev_list_lock);
>>> + read_unlock_irqrestore(&trig->leddev_list_lock, flags);
>>> }
>>> EXPORT_SYMBOL_GPL(led_trigger_event)
>>
>> meanwhile this patch hit v5.10.x stable and caused a performance
>> degradation on our use case:
>>
>> It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN
>> controller. CAN stands for Controller Area Network and here used to
>> connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific
>> Transport Protocol) transfer is running. With this patch, we see CAN
>> frames delayed for ~6ms, the usual gap between CAN frames is 240?s.
>>
>> Reverting this patch, restores the old performance.
>>
>> What is the best way to solve this dilemma? Identify the critical path
>> in our use case? Is there a way we can get around the irqsave in
>> led_trigger_event()?
>
> Hans was pushing for this patch, perhaps he has some ideas...

I was not pushing for this particular fix, I was asking about a fix
for the lockdep identified potential deadlock.

And you replied that this was already fixed in your for-next branch
when I asked, so all in all, other then reporting the potential deadlock
(after it was already fixed) I have very little do to with this patch.

With that all said, I must say that I'm surprised that switching from
read_lock() to read_lock_irqsave() causes such a hefty penalty, so I
wonder what is really going on here. Using the irqsave version disables
interrupts, but AFAIK only on the current core and only for the duration
of the led_set_brightness() call(s) .

Is the system perhaps pinning IRQs to a specific CPU in combination with
a led_set_brightness() somehow taking much longer then it should?

Note that led_set_brightness() calls are not allowed to block, if they
block they should use the brightness_set_blocking callback in their
led_class_dev struct not the regular brightness_set callback. In which case
the LED-core will defer the actually setting of the LED to a workqueue.

So one thing which might be worthwhile to check is if any of the LED
drivers on the system in question are using the brightness_set callback,
where they should be using the blocking one.

Regards,

Hans

2021-03-08 15:00:52

by Marc Kleine-Budde

[permalink] [raw]
Subject: Re: [PATCH] leds: trigger: fix potential deadlock with libata

On 07.03.2021 10:02:32, Boqun Feng wrote:
> On Sat, Mar 06, 2021 at 09:39:54PM +0100, Marc Kleine-Budde wrote:
> > Hello *,
> >
> > On 02.11.2020 11:41:52, Andrea Righi wrote:
> > > We have the following potential deadlock condition:
> > >
> > > ========================================================
> > > WARNING: possible irq lock inversion dependency detected
> > > 5.10.0-rc2+ #25 Not tainted
> > > --------------------------------------------------------
> > > swapper/3/0 just changed the state of lock:
> > > ffff8880063bd618 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x27/0x200
> > > but this lock took another, HARDIRQ-READ-unsafe lock in the past:
> > > (&trig->leddev_list_lock){.+.?}-{2:2}
> > >
> > > and interrupts could create inverse lock ordering between them.
> >
> > [...]
> >
> > > ---
> > > drivers/leds/led-triggers.c | 5 +++--
> > > 1 file changed, 3 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
> > > index 91da90cfb11d..16d1a93a10a8 100644
> > > --- a/drivers/leds/led-triggers.c
> > > +++ b/drivers/leds/led-triggers.c
> > > @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig,
> > > enum led_brightness brightness)
> > > {
> > > struct led_classdev *led_cdev;
> > > + unsigned long flags;
> > >
> > > if (!trig)
> > > return;
> > >
> > > - read_lock(&trig->leddev_list_lock);
> > > + read_lock_irqsave(&trig->leddev_list_lock, flags);
> > > list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list)
> > > led_set_brightness(led_cdev, brightness);
> > > - read_unlock(&trig->leddev_list_lock);
> > > + read_unlock_irqrestore(&trig->leddev_list_lock, flags);
> > > }
> > > EXPORT_SYMBOL_GPL(led_trigger_event);
> >
> > meanwhile this patch hit v5.10.x stable and caused a performance
> > degradation on our use case:
> >
> > It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN
> > controller. CAN stands for Controller Area Network and here used to
> > connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific
> > Transport Protocol) transfer is running. With this patch, we see CAN
> > frames delayed for ~6ms, the usual gap between CAN frames is 240µs.
> >
> > Reverting this patch, restores the old performance.
> >
> > What is the best way to solve this dilemma? Identify the critical path
> > in our use case? Is there a way we can get around the irqsave in
> > led_trigger_event()?
> >
>
> Probably, we can change from rwlock to rcu here, POC code as follow,
> only compile tested. Marc, could you see whether this help the
> performance on your platform? Please note that I haven't test it in a
> running kernel and I'm not that familir with led subsystem, so use it
> with caution ;-)
>
> (While at it, I think maybe we miss the leddev_list_lock in net/mac80211
> in the patch)

I can confirm, this patch basically restores the old performance.

Marc

--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |


Attachments:
(No filename) (3.23 kB)
signature.asc (499.00 B)
Download all attachments

2021-03-08 15:09:03

by Marc Kleine-Budde

[permalink] [raw]
Subject: Re: [PATCH] leds: trigger: fix potential deadlock with libata

On 08.03.2021 15:56:53, Marc Kleine-Budde wrote:
> > > meanwhile this patch hit v5.10.x stable and caused a performance
> > > degradation on our use case:
> > >
> > > It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN
> > > controller. CAN stands for Controller Area Network and here used to
> > > connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific
> > > Transport Protocol) transfer is running. With this patch, we see CAN
> > > frames delayed for ~6ms, the usual gap between CAN frames is 240µs.
> > >
> > > Reverting this patch, restores the old performance.
> > >
> > > What is the best way to solve this dilemma? Identify the critical path
> > > in our use case? Is there a way we can get around the irqsave in
> > > led_trigger_event()?
> > >
> >
> > Probably, we can change from rwlock to rcu here, POC code as follow,
> > only compile tested. Marc, could you see whether this help the
> > performance on your platform? Please note that I haven't test it in a
> > running kernel and I'm not that familir with led subsystem, so use it
> > with caution ;-)
> >
> > (While at it, I think maybe we miss the leddev_list_lock in net/mac80211
> > in the patch)
>
> I can confirm, this patch basically restores the old performance.

Hmmm - it first looked OK, now it doesn't - let me do some better testing.

Sorry for the noise,
Marc

--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |


Attachments:
(No filename) (1.64 kB)
signature.asc (499.00 B)
Download all attachments