2021-06-17 02:08:55

by Johannes Berg

[permalink] [raw]
Subject: lockdep issue with trig->leddev_list_lock again/still?

Hi Boqun, all,

I've been reading the threads at

https://lore.kernel.org/lkml/20201101092614.GB3989@xps-13-7390/

and

https://lore.kernel.org/lkml/20201102104152.GG9930@xps-13-7390/

as well as the previous fix commit 27af8e2c90fb ("leds: trigger: fix
potential deadlock with libata"), but I cannot make sense of this.


I'm getting a report - on 5.10.27, which has that fix backported - that
says we have a chain with leddev_list_lock involved in it, but I don't
really understand it.

FWIW, I have the following patch, that Peter gave me some time ago but
said you (Boqun) wrote, applied to the tree as well:

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 48d736aa03b2..9de9ae8a4ae0 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2667,8 +2667,18 @@ static int check_irq_usage(struct task_struct *curr, struct held_lock *prev,
* Step 3: we found a bad match! Now retrieve a lock from the backward
* list whose usage mask matches the exclusive usage mask from the
* lock found on the forward list.
+ *
+ * Note, we should only keep the LOCKF_ENABLED_IRQ_ALL bits, considering
+ * the follow case:
+ *
+ * When trying to add A -> B to the graph, we find that there is a
+ * hardirq-safe L, that L -> ... -> A, and another hardirq-unsafe M,
+ * that B -> ... -> M. However M is **softirq-safe**, if we use exact
+ * invert bits of M's usage_mask, we will find another lock N that is
+ * **softirq-unsafe** and N -> ... -> A, however N -> .. -> M will not
+ * cause a inversion deadlock.
*/
- backward_mask = original_mask(target_entry1->class->usage_mask);
+ backward_mask = original_mask(target_entry1->class->usage_mask & LOCKF_ENABLED_IRQ_ALL);

ret = find_usage_backwards(&this, backward_mask, &target_entry);
if (bfs_error(ret)) {


Without the patch, the lockdep report makes no sense at all, so thanks
for that. :-)


With that patch, I'm getting the following report from our driver vs.
libata vs. LED triggers:

[ 498.399434] =====================================================
[ 498.399481] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
[ 498.399523] 5.10.27-dev-lockdep-fix2+ #1 Tainted: G OE
[ 498.399565] -----------------------------------------------------
[ 498.399608] ip/364 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[ 498.399650] ffff8880310fa438 (&iwl_txq_cmd_queue_lock_class){+.-.}-{2:2}, at: iwl_pcie_gen2_enqueue_hcmd+0x2c5/0x2100 [iwlwifi]
[ 498.399749]
[ 498.399749] and this task is already holding:
[ 498.399789] ffff88800558b5b8 (&trig->leddev_list_lock){.-.-}-{2:2}, at: led_trigger_event+0x34/0xd0
[ 498.399845] which would create a new lock dependency:
[ 498.399884] (&trig->leddev_list_lock){.-.-}-{2:2} -> (&iwl_txq_cmd_queue_lock_class){+.-.}-{2:2}
[ 498.399938]
[ 498.399938] but this new dependency connects a HARDIRQ-irq-safe lock:
[ 498.399984] (&host->lock){-.-.}-{2:2}
[ 498.399992]
[ 498.399992] ... which became HARDIRQ-irq-safe at:
[ 498.400077] lock_acquire+0x1b8/0x720
[ 498.400121] _raw_spin_lock_irqsave+0x4d/0x70
[ 498.400165] ata_bmdma_interrupt+0x21/0x5a0
[ 498.400208] __handle_irq_event_percpu+0x25e/0x630
[ 498.400250] handle_irq_event+0x10e/0x290
[ 498.400290] handle_edge_irq+0x1f7/0xbb0
[ 498.400332] asm_call_irq_on_stack+0x12/0x20
[ 498.400374] common_interrupt+0xfb/0x1a0
[ 498.400415] asm_common_interrupt+0x1e/0x40
[ 498.400456] _raw_spin_unlock_irq+0x2f/0x40
[ 498.400497] ata_sff_flush_pio_task+0xa0/0x170
[ 498.400537] ata_exec_internal_sg+0xbe0/0x1840
[ 498.400577] ata_dev_read_id+0x8cd/0xcb0
[ 498.400617] ata_dev_reread_id+0xef/0x1c0
[ 498.400658] ata_dev_revalidate+0xcb/0x1f0
[ 498.400698] ata_do_set_mode+0x126b/0x2410
[ 498.400741] ata_set_mode+0x305/0x3e0
[ 498.400786] ata_eh_recover+0x268c/0x4110
[ 498.400828] ata_do_eh+0x7a/0xe0
[ 498.400866] ata_sff_error_handler+0x263/0x360
[ 498.400906] ata_bmdma_error_handler+0x295/0x580
[ 498.400945] ata_scsi_port_error_handler+0x7c1/0x17b0
[ 498.400986] ata_scsi_error+0x1f8/0x410
[ 498.401027] scsi_error_handler+0x29b/0x14a0
[ 498.401077] kthread+0x370/0x440
[ 498.401119] ret_from_fork+0x22/0x30
[ 498.401158]
[ 498.401158] to a HARDIRQ-irq-unsafe lock:
[ 498.401197] (&iwl_txq_cmd_queue_lock_class){+.-.}-{2:2}
[ 498.401204]
[ 498.401204] ... which became HARDIRQ-irq-unsafe at:
[ 498.401277] ...
[ 498.401282] lock_acquire+0x1b8/0x720
[ 498.401359] _raw_spin_lock_bh+0x38/0x50
[ 498.401418] iwl_pcie_hcmd_complete+0x1c7/0x12b0 [iwlwifi]
[ 498.401481] iwl_pcie_rx_handle+0xe56/0x27e0 [iwlwifi]
[ 498.401543] iwl_pcie_napi_poll_msix+0xa3/0x240 [iwlwifi]
[ 498.401591] net_rx_action+0x3dd/0xea0
[ 498.401628] __do_softirq+0x1bc/0x899
[ 498.401668] asm_call_irq_on_stack+0x12/0x20
[ 498.401711] do_softirq_own_stack+0x8b/0xb0
[ 498.401756] do_softirq+0xab/0xd0
[ 498.401797] __local_bh_enable_ip+0xf3/0x110
[ 498.401854] iwl_pcie_irq_rx_msix_handler+0x26e/0x4e0 [iwlwifi]
[ 498.401901] irq_thread_fn+0x92/0x160
[ 498.401940] irq_thread+0x244/0x4a0
[ 498.401980] kthread+0x370/0x440
[ 498.402022] ret_from_fork+0x22/0x30
[ 498.402061]
[ 498.402061] other info that might help us debug this:
[ 498.402061]
[ 498.402101] Chain exists of:
[ 498.402101] &host->lock --> &trig->leddev_list_lock --> &iwl_txq_cmd_queue_lock_class
[ 498.402101]
[ 498.402153] Possible interrupt unsafe locking scenario:
[ 498.402153]
[ 498.402199] CPU0 CPU1
[ 498.402239] ---- ----
[ 498.402278] lock(&iwl_txq_cmd_queue_lock_class);
[ 498.402320] local_irq_disable();
[ 498.402358] lock(&host->lock);
[ 498.402398] lock(&trig->leddev_list_lock);
[ 498.402437] <Interrupt>
[ 498.402473] lock(&host->lock);
[ 498.402515]
[ 498.402515] *** DEADLOCK ***
[ 498.402515]
[ 498.402555] 3 locks held by ip/364:
[ 498.402594] #0: ffffffff9cf45f08 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x319/0x910
[ 498.402654] #1: ffff8880055885e8 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_open+0x165/0x230 [mac80211]
[ 498.402782] #2: ffff88800558b5b8 (&trig->leddev_list_lock){.-.-}-{2:2}, at: led_trigger_event+0x34/0xd0
[ 498.402837]
[ 498.402837] the dependencies between HARDIRQ-irq-safe lock and the holding lock:
[ 498.402888] -> (&host->lock){-.-.}-{2:2} {
[ 498.402934] IN-HARDIRQ-W at:
[ 498.402977] lock_acquire+0x1b8/0x720
[ 498.403020] _raw_spin_lock_irqsave+0x4d/0x70
[ 498.403065] ata_bmdma_interrupt+0x21/0x5a0
[ 498.403109] __handle_irq_event_percpu+0x25e/0x630
[ 498.403162] handle_irq_event+0x10e/0x290
[ 498.403208] handle_edge_irq+0x1f7/0xbb0
[ 498.403248] asm_call_irq_on_stack+0x12/0x20
[ 498.403290] common_interrupt+0xfb/0x1a0
[ 498.403331] asm_common_interrupt+0x1e/0x40
[ 498.403378] _raw_spin_unlock_irq+0x2f/0x40
[ 498.403420] ata_sff_flush_pio_task+0xa0/0x170
[ 498.403461] ata_exec_internal_sg+0xbe0/0x1840
[ 498.403501] ata_dev_read_id+0x8cd/0xcb0
[ 498.403541] ata_dev_reread_id+0xef/0x1c0
[ 498.403582] ata_dev_revalidate+0xcb/0x1f0
[ 498.403623] ata_do_set_mode+0x126b/0x2410
[ 498.403664] ata_set_mode+0x305/0x3e0
[ 498.403706] ata_eh_recover+0x268c/0x4110
[ 498.403749] ata_do_eh+0x7a/0xe0
[ 498.403790] ata_sff_error_handler+0x263/0x360
[ 498.403831] ata_bmdma_error_handler+0x295/0x580
[ 498.403870] ata_scsi_port_error_handler+0x7c1/0x17b0
[ 498.403911] ata_scsi_error+0x1f8/0x410
[ 498.403951] scsi_error_handler+0x29b/0x14a0
[ 498.403991] kthread+0x370/0x440
[ 498.404030] ret_from_fork+0x22/0x30
[ 498.404070] IN-SOFTIRQ-W at:
[ 498.404110] lock_acquire+0x1b8/0x720
[ 498.404150] _raw_spin_lock_irqsave+0x4d/0x70
[ 498.404190] ata_bmdma_interrupt+0x21/0x5a0
[ 498.404232] __handle_irq_event_percpu+0x25e/0x630
[ 498.404272] handle_irq_event+0x10e/0x290
[ 498.404312] handle_edge_irq+0x1f7/0xbb0
[ 498.404363] common_interrupt+0xad/0x1a0
[ 498.404414] asm_common_interrupt+0x1e/0x40
[ 498.404455] __do_softirq+0x173/0x899
[ 498.404495] asm_call_irq_on_stack+0x12/0x20
[ 498.404539] do_softirq_own_stack+0x8b/0xb0
[ 498.404578] irq_exit_rcu+0x12c/0x1c0
[ 498.404619] sysvec_apic_timer_interrupt+0x4f/0xb0
[ 498.404661] asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 498.404703] _raw_spin_unlock_irqrestore+0x31/0x50
[ 498.404744] ata_scsi_queuecmd+0xd5/0x190
[ 498.404786] scsi_queue_rq+0x12ff/0x2c50
[ 498.404848] blk_mq_dispatch_rq_list+0x367/0x2050
[ 498.404889] __blk_mq_do_dispatch_sched+0x35f/0xa10
[ 498.404930] __blk_mq_sched_dispatch_requests+0x305/0x4b0
[ 498.404970] blk_mq_sched_dispatch_requests+0xd8/0x130
[ 498.405012] __blk_mq_run_hw_queue+0x116/0x2c0
[ 498.405052] blk_mq_run_work_fn+0x48/0x60
[ 498.405094] process_one_work+0x7cc/0x1410
[ 498.405135] worker_thread+0x583/0xf70
[ 498.405174] kthread+0x370/0x440
[ 498.405215] ret_from_fork+0x22/0x30
[ 498.405253] INITIAL USE at:
[ 498.405295] lock_acquire+0x1b8/0x720
[ 498.405336] _raw_spin_lock_irqsave+0x4d/0x70
[ 498.405376] ata_dev_init+0x19c/0x3d0
[ 498.405416] ata_link_init+0x1a6/0x310
[ 498.405462] ata_port_alloc+0x4d7/0x5c0
[ 498.405502] ata_host_alloc+0x1d2/0x260
[ 498.405542] ata_host_alloc_pinfo+0x21/0x580
[ 498.405583] ata_pci_sff_prepare_host+0x41/0xc0
[ 498.405623] ata_pci_bmdma_prepare_host+0x14/0x60
[ 498.405661] piix_init_one+0x41b/0xbe0
[ 498.405703] local_pci_probe+0xde/0x170
[ 498.405743] pci_device_probe+0x351/0x6d0
[ 498.405787] really_probe+0x222/0xcc0
[ 498.405827] driver_probe_device+0x1fe/0x390
[ 498.405868] device_driver_attach+0x1e9/0x250
[ 498.405909] __driver_attach+0xfb/0x270
[ 498.405949] bus_for_each_dev+0x123/0x1c0
[ 498.405988] driver_attach+0x3d/0x50
[ 498.406028] bus_add_driver+0x35c/0x580
[ 498.406068] driver_register+0x219/0x380
[ 498.406108] __pci_register_driver+0x185/0x240
[ 498.406150] piix_init+0x1e/0x2e
[ 498.406190] do_one_initcall+0xcb/0x410
[ 498.406232] kernel_init_freeable+0x627/0x6bc
[ 498.406273] kernel_init+0x12/0x127
[ 498.406313] ret_from_fork+0x22/0x30
[ 498.406351] }
[ 498.406393] ... key at: [<ffffffff9ea0afc0>] __key.6+0x0/0x40
[ 498.406432] ... acquired at:
[ 498.406479] _raw_read_lock_irqsave+0x89/0x93
[ 498.406522] led_trigger_blink_oneshot+0x3f/0xe0
[ 498.406564] ledtrig_disk_activity+0x79/0xf0
[ 498.406602] ata_qc_complete+0x69/0xd80
[ 498.406641] ata_do_link_abort+0x1a4/0x360
[ 498.406681] ata_port_freeze+0x69/0x80
[ 498.406719] ata_hsm_qc_complete+0x1da/0x240
[ 498.406758] ata_sff_hsm_move+0x39a/0x1e10
[ 498.406797] ata_sff_pio_task+0x397/0x4a0
[ 498.406837] process_one_work+0x7cc/0x1410
[ 498.406877] worker_thread+0x583/0xf70
[ 498.406917] kthread+0x370/0x440
[ 498.406957] ret_from_fork+0x22/0x30
[ 498.406992]
[ 498.407029] -> (&trig->leddev_list_lock){.-.-}-{2:2} {
[ 498.407075] IN-HARDIRQ-R at:
[ 498.407117] lock_acquire+0x1b8/0x720
[ 498.407168] _raw_read_lock_irqsave+0x59/0x93
[ 498.407212] led_trigger_blink_oneshot+0x3f/0xe0
[ 498.407252] ledtrig_disk_activity+0x79/0xf0
[ 498.407291] ata_qc_complete+0x69/0xd80
[ 498.407331] ata_hsm_qc_complete+0x1c1/0x240
[ 498.407371] ata_sff_hsm_move+0x39a/0x1e10
[ 498.407411] __ata_sff_port_intr+0x1d2/0x3e0
[ 498.407450] ata_bmdma_port_intr+0x96/0x330
[ 498.409897] ata_bmdma_interrupt+0x1fd/0x5a0
[ 498.409940] __handle_irq_event_percpu+0x25e/0x630
[ 498.409980] handle_irq_event+0x10e/0x290
[ 498.410018] handle_edge_irq+0x1f7/0xbb0
[ 498.410057] asm_call_irq_on_stack+0x12/0x20
[ 498.410098] common_interrupt+0xfb/0x1a0
[ 498.410141] asm_common_interrupt+0x1e/0x40
[ 498.410182] _raw_spin_unlock_irq+0x2f/0x40
[ 498.410222] ata_sff_flush_pio_task+0xa0/0x170
[ 498.410262] ata_exec_internal_sg+0xbe0/0x1840
[ 498.410301] ata_dev_read_id+0x8cd/0xcb0
[ 498.410340] ata_dev_reread_id+0xef/0x1c0
[ 498.410381] ata_dev_revalidate+0xcb/0x1f0
[ 498.410421] ata_do_set_mode+0x126b/0x2410
[ 498.410461] ata_set_mode+0x305/0x3e0
[ 498.410502] ata_eh_recover+0x268c/0x4110
[ 498.410542] ata_do_eh+0x7a/0xe0
[ 498.410580] ata_sff_error_handler+0x263/0x360
[ 498.410618] ata_bmdma_error_handler+0x295/0x580
[ 498.410658] ata_scsi_port_error_handler+0x7c1/0x17b0
[ 498.410700] ata_scsi_error+0x1f8/0x410
[ 498.410739] scsi_error_handler+0x29b/0x14a0
[ 498.410779] kthread+0x370/0x440
[ 498.410819] ret_from_fork+0x22/0x30
[ 498.410858] IN-SOFTIRQ-R at:
[ 498.410909] lock_acquire+0x1b8/0x720
[ 498.410948] _raw_read_lock_irqsave+0x59/0x93
[ 498.410987] led_trigger_event+0x34/0xd0
[ 498.411029] kbd_bh+0x199/0x220
[ 498.411070] tasklet_action_common.constprop.0+0x260/0x300
[ 498.411110] tasklet_action+0x22/0x30
[ 498.411159] __do_softirq+0x1bc/0x899
[ 498.411200] run_ksoftirqd+0x3f/0x70
[ 498.411243] smpboot_thread_fn+0x356/0x810
[ 498.411283] kthread+0x370/0x440
[ 498.411321] ret_from_fork+0x22/0x30
[ 498.411358] INITIAL USE at:
[ 498.411400] lock_acquire+0x1b8/0x720
[ 498.411440] _raw_write_lock_irqsave+0x4d/0x70
[ 498.411481] led_trigger_set+0x47a/0xd80
[ 498.411522] led_trigger_set_default+0x16e/0x1f0
[ 498.411562] led_classdev_register_ext+0x596/0x7e0
[ 498.411603] input_leds_connect+0x35a/0x649 [input_leds]
[ 498.411645] input_attach_handler.isra.0+0x10b/0x160
[ 498.411685] input_register_handler+0x139/0x1e0
[ 498.411724] 0xffffffffc02d8015
[ 498.411762] do_one_initcall+0xcb/0x410
[ 498.411802] do_init_module+0x1d8/0x770
[ 498.411842] load_module+0x9eec/0xa9d0
[ 498.411881] __do_sys_finit_module+0x131/0x1c0
[ 498.411928] __x64_sys_finit_module+0x73/0xb0
[ 498.411968] do_syscall_64+0x38/0x90
[ 498.412010] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 498.412048] INITIAL READ USE at:
[ 498.412088] lock_acquire+0x1b8/0x720
[ 498.412129] _raw_read_lock_irqsave+0x89/0x93
[ 498.412169] led_trigger_event+0x34/0xd0
[ 498.412213] rfkill_global_led_trigger_worker+0xbe/0xf0
[ 498.412253] process_one_work+0x7cc/0x1410
[ 498.412293] worker_thread+0x583/0xf70
[ 498.412333] kthread+0x370/0x440
[ 498.412373] ret_from_fork+0x22/0x30
[ 498.412410] }
[ 498.412452] ... key at: [<ffffffff9ea17c40>] __key.0+0x0/0x40
[ 498.412491] ... acquired at:
[ 498.412530] lock_acquire+0x1b8/0x720
[ 498.412572] _raw_spin_lock_irqsave+0x4d/0x70
[ 498.412631] iwl_pcie_gen2_enqueue_hcmd+0x2c5/0x2100 [iwlwifi]
[ 498.412692] iwl_trans_txq_send_hcmd+0x191/0xf20 [iwlwifi]
[ 498.412751] iwl_trans_send_cmd+0x1b4/0x410 [iwlwifi]
[ 498.412834] iwl_mvm_send_cmd+0xf1/0x1d0 [iwlmvm]
[ 498.412896] iwl_mvm_led_set+0x1df/0x270 [iwlmvm]
[ 498.412954] iwl_led_brightness_set+0x1f/0x30 [iwlmvm]
[ 498.413001] led_set_brightness_nosleep+0xd4/0x1a0
[ 498.413041] led_set_brightness+0x108/0x150
[ 498.413088] led_trigger_event+0x73/0xd0
[ 498.413170] ieee80211_led_radio+0x6e/0x90 [mac80211]
[ 498.413253] ieee80211_do_open+0x896/0x1fa0 [mac80211]
[ 498.413333] ieee80211_open+0x176/0x230 [mac80211]
[ 498.413383] __dev_open+0x260/0x450
[ 498.413423] __dev_change_flags+0x4c9/0x750
[ 498.413464] dev_change_flags+0x87/0x160
[ 498.413504] do_setlink+0x7a3/0x3160
[ 498.413543] __rtnl_newlink+0xbdd/0x15a0
[ 498.413588] rtnl_newlink+0x64/0xa0
[ 498.413627] rtnetlink_rcv_msg+0x36a/0x910
[ 498.413670] netlink_rcv_skb+0x140/0x3d0
[ 498.413710] rtnetlink_rcv+0x15/0x20
[ 498.413748] netlink_unicast+0x465/0x7a0
[ 498.413788] netlink_sendmsg+0x7b4/0xcf0
[ 498.413830] sock_sendmsg+0xec/0x120
[ 498.413870] ____sys_sendmsg+0x696/0x9f0
[ 498.413909] ___sys_sendmsg+0x101/0x190
[ 498.413948] __sys_sendmsg+0xc8/0x160
[ 498.413986] __x64_sys_sendmsg+0x78/0xb0
[ 498.414026] do_syscall_64+0x38/0x90
[ 498.414067] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 498.414105]
[ 498.414142]
[ 498.414142] the dependencies between the lock to be acquired
[ 498.414145] and HARDIRQ-irq-unsafe lock:
[ 498.414228] -> (&iwl_txq_cmd_queue_lock_class){+.-.}-{2:2} {
[ 498.414275] HARDIRQ-ON-W at:
[ 498.414316] lock_acquire+0x1b8/0x720
[ 498.414357] _raw_spin_lock_bh+0x38/0x50
[ 498.414414] iwl_pcie_hcmd_complete+0x1c7/0x12b0 [iwlwifi]
[ 498.414473] iwl_pcie_rx_handle+0xe56/0x27e0 [iwlwifi]
[ 498.414533] iwl_pcie_napi_poll_msix+0xa3/0x240 [iwlwifi]
[ 498.414579] net_rx_action+0x3dd/0xea0
[ 498.414636] __do_softirq+0x1bc/0x899
[ 498.414676] asm_call_irq_on_stack+0x12/0x20
[ 498.414717] do_softirq_own_stack+0x8b/0xb0
[ 498.414756] do_softirq+0xab/0xd0
[ 498.414795] __local_bh_enable_ip+0xf3/0x110
[ 498.414851] iwl_pcie_irq_rx_msix_handler+0x26e/0x4e0 [iwlwifi]
[ 498.414896] irq_thread_fn+0x92/0x160
[ 498.414935] irq_thread+0x244/0x4a0
[ 498.414974] kthread+0x370/0x440
[ 498.415014] ret_from_fork+0x22/0x30
[ 498.415051] IN-SOFTIRQ-W at:
[ 498.415091] lock_acquire+0x1b8/0x720
[ 498.415131] _raw_spin_lock_bh+0x38/0x50
[ 498.415195] iwl_pcie_hcmd_complete+0x1c7/0x12b0 [iwlwifi]
[ 498.415254] iwl_pcie_rx_handle+0xe56/0x27e0 [iwlwifi]
[ 498.415314] iwl_pcie_napi_poll_msix+0xa3/0x240 [iwlwifi]
[ 498.415360] net_rx_action+0x3dd/0xea0
[ 498.415406] __do_softirq+0x1bc/0x899
[ 498.415445] asm_call_irq_on_stack+0x12/0x20
[ 498.415484] do_softirq_own_stack+0x8b/0xb0
[ 498.415524] do_softirq+0xab/0xd0
[ 498.415565] __local_bh_enable_ip+0xf3/0x110
[ 498.415620] iwl_pcie_irq_rx_msix_handler+0x26e/0x4e0 [iwlwifi]
[ 498.415665] irq_thread_fn+0x92/0x160
[ 498.415705] irq_thread+0x244/0x4a0
[ 498.415744] kthread+0x370/0x440
[ 498.415783] ret_from_fork+0x22/0x30
[ 498.415820] INITIAL USE at:
[ 498.415860] lock_acquire+0x1b8/0x720
[ 498.415899] _raw_spin_lock_irqsave+0x4d/0x70
[ 498.415956] iwl_pcie_gen2_enqueue_hcmd+0x2c5/0x2100 [iwlwifi]
[ 498.416016] iwl_trans_txq_send_hcmd+0x3fc/0xf20 [iwlwifi]
[ 498.416074] iwl_trans_send_cmd+0x1b4/0x410 [iwlwifi]
[ 498.416133] iwl_dbg_tlv_send_hcmds.isra.0+0x18f/0x240 [iwlwifi]
[ 498.416192] _iwl_dbg_tlv_time_point+0x261/0x2ca0 [iwlwifi]
[ 498.416248] iwl_run_unified_mvm_ucode+0x1fe/0x760 [iwlmvm]
[ 498.416303] iwl_run_init_mvm_ucode+0x5ef/0xb80 [iwlmvm]
[ 498.416359] iwl_mvm_start_get_nvm+0x1ee/0x7d0 [iwlmvm]
[ 498.416413] iwl_op_mode_mvm_start+0x2082/0x3280 [iwlmvm]
[ 498.416470] _iwl_op_mode_start.isra.0+0xef/0x1f0 [iwlwifi]
[ 498.416528] iwl_opmode_register+0x13f/0x210 [iwlwifi]
[ 498.416572] 0xffffffffc055803f
[ 498.416617] do_one_initcall+0xcb/0x410
[ 498.416658] do_init_module+0x1d8/0x770
[ 498.416697] load_module+0x9eec/0xa9d0
[ 498.416736] __do_sys_finit_module+0x131/0x1c0
[ 498.416775] __x64_sys_finit_module+0x73/0xb0
[ 498.416814] do_syscall_64+0x38/0x90
[ 498.416854] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 498.416893] }
[ 498.416949] ... key at: [<ffffffffc0a85bc0>] iwl_txq_cmd_queue_lock_class.89285+0x0/0xfffffffffff99440 [iwlwifi]
[ 498.416993] ... acquired at:
[ 498.417030] lock_acquire+0x1b8/0x720
[ 498.417069] _raw_spin_lock_irqsave+0x4d/0x70
[ 498.417126] iwl_pcie_gen2_enqueue_hcmd+0x2c5/0x2100 [iwlwifi]
[ 498.417182] iwl_trans_txq_send_hcmd+0x191/0xf20 [iwlwifi]
[ 498.417240] iwl_trans_send_cmd+0x1b4/0x410 [iwlwifi]
[ 498.417298] iwl_mvm_send_cmd+0xf1/0x1d0 [iwlmvm]
[ 498.417356] iwl_mvm_led_set+0x1df/0x270 [iwlmvm]
[ 498.417412] iwl_led_brightness_set+0x1f/0x30 [iwlmvm]
[ 498.417457] led_set_brightness_nosleep+0xd4/0x1a0
[ 498.417498] led_set_brightness+0x108/0x150
[ 498.417537] led_trigger_event+0x73/0xd0
[ 498.417615] ieee80211_led_radio+0x6e/0x90 [mac80211]
[ 498.417695] ieee80211_do_open+0x896/0x1fa0 [mac80211]
[ 498.417775] ieee80211_open+0x176/0x230 [mac80211]
[ 498.417823] __dev_open+0x260/0x450
[ 498.417873] __dev_change_flags+0x4c9/0x750
[ 498.417913] dev_change_flags+0x87/0x160
[ 498.417950] do_setlink+0x7a3/0x3160
[ 498.417988] __rtnl_newlink+0xbdd/0x15a0
[ 498.418027] rtnl_newlink+0x64/0xa0
[ 498.418066] rtnetlink_rcv_msg+0x36a/0x910
[ 498.418105] netlink_rcv_skb+0x140/0x3d0
[ 498.418143] rtnetlink_rcv+0x15/0x20
[ 498.418191] netlink_unicast+0x465/0x7a0
[ 498.418229] netlink_sendmsg+0x7b4/0xcf0
[ 498.418267] sock_sendmsg+0xec/0x120
[ 498.418307] ____sys_sendmsg+0x696/0x9f0
[ 498.418345] ___sys_sendmsg+0x101/0x190
[ 498.418383] __sys_sendmsg+0xc8/0x160
[ 498.418421] __x64_sys_sendmsg+0x78/0xb0
[ 498.418459] do_syscall_64+0x38/0x90
[ 498.418499] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 498.418535]
[ 498.418572]
[ 498.418572] stack backtrace:
[ 498.418620] CPU: 3 PID: 364 Comm: ip Tainted: G OE 5.10.27-dev-lockdep-fix2+ #1
[ 498.418666] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[ 498.418709] Call Trace:
[ 498.418754] dump_stack+0xb3/0xed
[ 498.418798] check_irq_usage.cold+0x4e1/0x66d
[ 498.419100] __lock_acquire+0x2b0e/0x5460
[ 498.419239] lock_acquire+0x1b8/0x720
[ 498.419618] _raw_spin_lock_irqsave+0x4d/0x70
[ 498.419737] iwl_pcie_gen2_enqueue_hcmd+0x2c5/0x2100 [iwlwifi]
[ 498.420081] iwl_trans_txq_send_hcmd+0x191/0xf20 [iwlwifi]
[ 498.420245] iwl_trans_send_cmd+0x1b4/0x410 [iwlwifi]
[ 498.420305] iwl_mvm_send_cmd+0xf1/0x1d0 [iwlmvm]
[ 498.420365] iwl_mvm_led_set+0x1df/0x270 [iwlmvm]
[ 498.423581] iwl_led_brightness_set+0x1f/0x30 [iwlmvm]
[ 498.423627] led_set_brightness_nosleep+0xd4/0x1a0
[ 498.423668] led_set_brightness+0x108/0x150
[ 498.423709] led_trigger_event+0x73/0xd0
[ 498.423789] ieee80211_led_radio+0x6e/0x90 [mac80211]
[ 498.423869] ieee80211_do_open+0x896/0x1fa0 [mac80211]
[ 498.424028] ieee80211_open+0x176/0x230 [mac80211]
[ 498.424076] __dev_open+0x260/0x450
[ 498.424249] __dev_change_flags+0x4c9/0x750
[ 498.424339] dev_change_flags+0x87/0x160
[ 498.424382] do_setlink+0x7a3/0x3160
[ 498.425060] __rtnl_newlink+0xbdd/0x15a0
[ 498.425697] rtnl_newlink+0x64/0xa0
[ 498.425739] rtnetlink_rcv_msg+0x36a/0x910
[ 498.425938] netlink_rcv_skb+0x140/0x3d0
[ 498.426137] rtnetlink_rcv+0x15/0x20
[ 498.426177] netlink_unicast+0x465/0x7a0
[ 498.426384] netlink_sendmsg+0x7b4/0xcf0
[ 498.426556] sock_sendmsg+0xec/0x120
[ 498.426597] ____sys_sendmsg+0x696/0x9f0


Note that the iwlwifi module was a newer version, not necessarily
matching the 5.10(.27) version, especially wrt. the locking of the txq-
>lock (which is represented here by iwl_txq_cmd_queue_lock_class due to
us separating the lock classes for different types of queues).


Clearly, the scenario lockdep outputs isn't really an issue at all,
since that will just continue to make progress on CPU1, finish, and then
the interrupt on CPU0 can make progress.

I tried thinking about adding write_lock() to the scenario as you
(Boqun) did in the thread I linked to above, but even then I don't see
how it could deadlock since the write_lock() for leddev_list_lock is
always write_lock_irqsave(), so it can't get stuck on any of the other
locks involved here due to interrupts, I think?


All of the bits here that lockdep is reporting seem correct, but I can't
figure out how this chain matters?

&host->lock --> &trig->leddev_list_lock --> &iwl_txq_cmd_queue_lock_class

Yes, obviously we take leddev_list_lock under host->lock, and we also
take the iwl lock under leddev_list_lock. No argument there. But the
argument seems to be that we can have the iwl lock taken without
disabling interrupts (which we do), and that's a different code path
unrelated to the leddev_list_lock?

Basically, I don't think I understand how a read-lock in the middle can
cause this chain to be a problem?

Can anyone help me decode this?

Thanks,
johannes


2021-06-21 07:25:33

by Boqun Feng

[permalink] [raw]
Subject: Re: lockdep issue with trig->leddev_list_lock again/still?

On Wed, Jun 16, 2021 at 09:54:04PM +0200, Johannes Berg wrote:
> Hi Boqun, all,
>

Hello,

Sorry for the late response, please see below:

> I've been reading the threads at
>
> https://lore.kernel.org/lkml/20201101092614.GB3989@xps-13-7390/
>
> and
>
> https://lore.kernel.org/lkml/20201102104152.GG9930@xps-13-7390/
>
> as well as the previous fix commit 27af8e2c90fb ("leds: trigger: fix
> potential deadlock with libata"), but I cannot make sense of this.
>
>
> I'm getting a report - on 5.10.27, which has that fix backported - that
> says we have a chain with leddev_list_lock involved in it, but I don't
> really understand it.
>
> FWIW, I have the following patch, that Peter gave me some time ago but
> said you (Boqun) wrote, applied to the tree as well:
>
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 48d736aa03b2..9de9ae8a4ae0 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -2667,8 +2667,18 @@ static int check_irq_usage(struct task_struct *curr, struct held_lock *prev,
> * Step 3: we found a bad match! Now retrieve a lock from the backward
> * list whose usage mask matches the exclusive usage mask from the
> * lock found on the forward list.
> + *
> + * Note, we should only keep the LOCKF_ENABLED_IRQ_ALL bits, considering
> + * the follow case:
> + *
> + * When trying to add A -> B to the graph, we find that there is a
> + * hardirq-safe L, that L -> ... -> A, and another hardirq-unsafe M,
> + * that B -> ... -> M. However M is **softirq-safe**, if we use exact
> + * invert bits of M's usage_mask, we will find another lock N that is
> + * **softirq-unsafe** and N -> ... -> A, however N -> .. -> M will not
> + * cause a inversion deadlock.
> */
> - backward_mask = original_mask(target_entry1->class->usage_mask);
> + backward_mask = original_mask(target_entry1->class->usage_mask & LOCKF_ENABLED_IRQ_ALL);
>
> ret = find_usage_backwards(&this, backward_mask, &target_entry);
> if (bfs_error(ret)) {
>
>
> Without the patch, the lockdep report makes no sense at all, so thanks
> for that. :-)
>
>
> With that patch, I'm getting the following report from our driver vs.
> libata vs. LED triggers:
>
> [ 498.399434] =====================================================
> [ 498.399481] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
> [ 498.399523] 5.10.27-dev-lockdep-fix2+ #1 Tainted: G OE
> [ 498.399565] -----------------------------------------------------
> [ 498.399608] ip/364 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
> [ 498.399650] ffff8880310fa438 (&iwl_txq_cmd_queue_lock_class){+.-.}-{2:2}, at: iwl_pcie_gen2_enqueue_hcmd+0x2c5/0x2100 [iwlwifi]
> [ 498.399749]
> [ 498.399749] and this task is already holding:
> [ 498.399789] ffff88800558b5b8 (&trig->leddev_list_lock){.-.-}-{2:2}, at: led_trigger_event+0x34/0xd0
> [ 498.399845] which would create a new lock dependency:
> [ 498.399884] (&trig->leddev_list_lock){.-.-}-{2:2} -> (&iwl_txq_cmd_queue_lock_class){+.-.}-{2:2}
> [ 498.399938]
> [ 498.399938] but this new dependency connects a HARDIRQ-irq-safe lock:
> [ 498.399984] (&host->lock){-.-.}-{2:2}
> [ 498.399992]
> [ 498.399992] ... which became HARDIRQ-irq-safe at:
> [ 498.400077] lock_acquire+0x1b8/0x720
> [ 498.400121] _raw_spin_lock_irqsave+0x4d/0x70
> [ 498.400165] ata_bmdma_interrupt+0x21/0x5a0
> [ 498.400208] __handle_irq_event_percpu+0x25e/0x630
> [ 498.400250] handle_irq_event+0x10e/0x290
> [ 498.400290] handle_edge_irq+0x1f7/0xbb0
> [ 498.400332] asm_call_irq_on_stack+0x12/0x20
> [ 498.400374] common_interrupt+0xfb/0x1a0
> [ 498.400415] asm_common_interrupt+0x1e/0x40
> [ 498.400456] _raw_spin_unlock_irq+0x2f/0x40
> [ 498.400497] ata_sff_flush_pio_task+0xa0/0x170
> [ 498.400537] ata_exec_internal_sg+0xbe0/0x1840
> [ 498.400577] ata_dev_read_id+0x8cd/0xcb0
> [ 498.400617] ata_dev_reread_id+0xef/0x1c0
> [ 498.400658] ata_dev_revalidate+0xcb/0x1f0
> [ 498.400698] ata_do_set_mode+0x126b/0x2410
> [ 498.400741] ata_set_mode+0x305/0x3e0
> [ 498.400786] ata_eh_recover+0x268c/0x4110
> [ 498.400828] ata_do_eh+0x7a/0xe0
> [ 498.400866] ata_sff_error_handler+0x263/0x360
> [ 498.400906] ata_bmdma_error_handler+0x295/0x580
> [ 498.400945] ata_scsi_port_error_handler+0x7c1/0x17b0
> [ 498.400986] ata_scsi_error+0x1f8/0x410
> [ 498.401027] scsi_error_handler+0x29b/0x14a0
> [ 498.401077] kthread+0x370/0x440
> [ 498.401119] ret_from_fork+0x22/0x30
> [ 498.401158]
> [ 498.401158] to a HARDIRQ-irq-unsafe lock:
> [ 498.401197] (&iwl_txq_cmd_queue_lock_class){+.-.}-{2:2}
> [ 498.401204]
> [ 498.401204] ... which became HARDIRQ-irq-unsafe at:
> [ 498.401277] ...
> [ 498.401282] lock_acquire+0x1b8/0x720
> [ 498.401359] _raw_spin_lock_bh+0x38/0x50
> [ 498.401418] iwl_pcie_hcmd_complete+0x1c7/0x12b0 [iwlwifi]
> [ 498.401481] iwl_pcie_rx_handle+0xe56/0x27e0 [iwlwifi]
> [ 498.401543] iwl_pcie_napi_poll_msix+0xa3/0x240 [iwlwifi]
> [ 498.401591] net_rx_action+0x3dd/0xea0
> [ 498.401628] __do_softirq+0x1bc/0x899
> [ 498.401668] asm_call_irq_on_stack+0x12/0x20
> [ 498.401711] do_softirq_own_stack+0x8b/0xb0
> [ 498.401756] do_softirq+0xab/0xd0
> [ 498.401797] __local_bh_enable_ip+0xf3/0x110
> [ 498.401854] iwl_pcie_irq_rx_msix_handler+0x26e/0x4e0 [iwlwifi]
> [ 498.401901] irq_thread_fn+0x92/0x160
> [ 498.401940] irq_thread+0x244/0x4a0
> [ 498.401980] kthread+0x370/0x440
> [ 498.402022] ret_from_fork+0x22/0x30
> [ 498.402061]
> [ 498.402061] other info that might help us debug this:
> [ 498.402061]
> [ 498.402101] Chain exists of:
> [ 498.402101] &host->lock --> &trig->leddev_list_lock --> &iwl_txq_cmd_queue_lock_class
> [ 498.402101]
> [ 498.402153] Possible interrupt unsafe locking scenario:
> [ 498.402153]
> [ 498.402199] CPU0 CPU1
> [ 498.402239] ---- ----
> [ 498.402278] lock(&iwl_txq_cmd_queue_lock_class);
> [ 498.402320] local_irq_disable();
> [ 498.402358] lock(&host->lock);
> [ 498.402398] lock(&trig->leddev_list_lock);
> [ 498.402437] <Interrupt>
> [ 498.402473] lock(&host->lock);
> [ 498.402515]
> [ 498.402515] *** DEADLOCK ***
> [ 498.402515]
> [ 498.402555] 3 locks held by ip/364:
> [ 498.402594] #0: ffffffff9cf45f08 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x319/0x910
> [ 498.402654] #1: ffff8880055885e8 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_open+0x165/0x230 [mac80211]
> [ 498.402782] #2: ffff88800558b5b8 (&trig->leddev_list_lock){.-.-}-{2:2}, at: led_trigger_event+0x34/0xd0
> [ 498.402837]
> [ 498.402837] the dependencies between HARDIRQ-irq-safe lock and the holding lock:
> [ 498.402888] -> (&host->lock){-.-.}-{2:2} {
> [ 498.402934] IN-HARDIRQ-W at:
> [ 498.402977] lock_acquire+0x1b8/0x720
> [ 498.403020] _raw_spin_lock_irqsave+0x4d/0x70
> [ 498.403065] ata_bmdma_interrupt+0x21/0x5a0
> [ 498.403109] __handle_irq_event_percpu+0x25e/0x630
> [ 498.403162] handle_irq_event+0x10e/0x290
> [ 498.403208] handle_edge_irq+0x1f7/0xbb0
> [ 498.403248] asm_call_irq_on_stack+0x12/0x20
> [ 498.403290] common_interrupt+0xfb/0x1a0
> [ 498.403331] asm_common_interrupt+0x1e/0x40
> [ 498.403378] _raw_spin_unlock_irq+0x2f/0x40
> [ 498.403420] ata_sff_flush_pio_task+0xa0/0x170
> [ 498.403461] ata_exec_internal_sg+0xbe0/0x1840
> [ 498.403501] ata_dev_read_id+0x8cd/0xcb0
> [ 498.403541] ata_dev_reread_id+0xef/0x1c0
> [ 498.403582] ata_dev_revalidate+0xcb/0x1f0
> [ 498.403623] ata_do_set_mode+0x126b/0x2410
> [ 498.403664] ata_set_mode+0x305/0x3e0
> [ 498.403706] ata_eh_recover+0x268c/0x4110
> [ 498.403749] ata_do_eh+0x7a/0xe0
> [ 498.403790] ata_sff_error_handler+0x263/0x360
> [ 498.403831] ata_bmdma_error_handler+0x295/0x580
> [ 498.403870] ata_scsi_port_error_handler+0x7c1/0x17b0
> [ 498.403911] ata_scsi_error+0x1f8/0x410
> [ 498.403951] scsi_error_handler+0x29b/0x14a0
> [ 498.403991] kthread+0x370/0x440
> [ 498.404030] ret_from_fork+0x22/0x30
> [ 498.404070] IN-SOFTIRQ-W at:
> [ 498.404110] lock_acquire+0x1b8/0x720
> [ 498.404150] _raw_spin_lock_irqsave+0x4d/0x70
> [ 498.404190] ata_bmdma_interrupt+0x21/0x5a0
> [ 498.404232] __handle_irq_event_percpu+0x25e/0x630
> [ 498.404272] handle_irq_event+0x10e/0x290
> [ 498.404312] handle_edge_irq+0x1f7/0xbb0
> [ 498.404363] common_interrupt+0xad/0x1a0
> [ 498.404414] asm_common_interrupt+0x1e/0x40
> [ 498.404455] __do_softirq+0x173/0x899
> [ 498.404495] asm_call_irq_on_stack+0x12/0x20
> [ 498.404539] do_softirq_own_stack+0x8b/0xb0
> [ 498.404578] irq_exit_rcu+0x12c/0x1c0
> [ 498.404619] sysvec_apic_timer_interrupt+0x4f/0xb0
> [ 498.404661] asm_sysvec_apic_timer_interrupt+0x12/0x20
> [ 498.404703] _raw_spin_unlock_irqrestore+0x31/0x50
> [ 498.404744] ata_scsi_queuecmd+0xd5/0x190
> [ 498.404786] scsi_queue_rq+0x12ff/0x2c50
> [ 498.404848] blk_mq_dispatch_rq_list+0x367/0x2050
> [ 498.404889] __blk_mq_do_dispatch_sched+0x35f/0xa10
> [ 498.404930] __blk_mq_sched_dispatch_requests+0x305/0x4b0
> [ 498.404970] blk_mq_sched_dispatch_requests+0xd8/0x130
> [ 498.405012] __blk_mq_run_hw_queue+0x116/0x2c0
> [ 498.405052] blk_mq_run_work_fn+0x48/0x60
> [ 498.405094] process_one_work+0x7cc/0x1410
> [ 498.405135] worker_thread+0x583/0xf70
> [ 498.405174] kthread+0x370/0x440
> [ 498.405215] ret_from_fork+0x22/0x30
> [ 498.405253] INITIAL USE at:
> [ 498.405295] lock_acquire+0x1b8/0x720
> [ 498.405336] _raw_spin_lock_irqsave+0x4d/0x70
> [ 498.405376] ata_dev_init+0x19c/0x3d0
> [ 498.405416] ata_link_init+0x1a6/0x310
> [ 498.405462] ata_port_alloc+0x4d7/0x5c0
> [ 498.405502] ata_host_alloc+0x1d2/0x260
> [ 498.405542] ata_host_alloc_pinfo+0x21/0x580
> [ 498.405583] ata_pci_sff_prepare_host+0x41/0xc0
> [ 498.405623] ata_pci_bmdma_prepare_host+0x14/0x60
> [ 498.405661] piix_init_one+0x41b/0xbe0
> [ 498.405703] local_pci_probe+0xde/0x170
> [ 498.405743] pci_device_probe+0x351/0x6d0
> [ 498.405787] really_probe+0x222/0xcc0
> [ 498.405827] driver_probe_device+0x1fe/0x390
> [ 498.405868] device_driver_attach+0x1e9/0x250
> [ 498.405909] __driver_attach+0xfb/0x270
> [ 498.405949] bus_for_each_dev+0x123/0x1c0
> [ 498.405988] driver_attach+0x3d/0x50
> [ 498.406028] bus_add_driver+0x35c/0x580
> [ 498.406068] driver_register+0x219/0x380
> [ 498.406108] __pci_register_driver+0x185/0x240
> [ 498.406150] piix_init+0x1e/0x2e
> [ 498.406190] do_one_initcall+0xcb/0x410
> [ 498.406232] kernel_init_freeable+0x627/0x6bc
> [ 498.406273] kernel_init+0x12/0x127
> [ 498.406313] ret_from_fork+0x22/0x30
> [ 498.406351] }
> [ 498.406393] ... key at: [<ffffffff9ea0afc0>] __key.6+0x0/0x40
> [ 498.406432] ... acquired at:
> [ 498.406479] _raw_read_lock_irqsave+0x89/0x93
> [ 498.406522] led_trigger_blink_oneshot+0x3f/0xe0
> [ 498.406564] ledtrig_disk_activity+0x79/0xf0
> [ 498.406602] ata_qc_complete+0x69/0xd80
> [ 498.406641] ata_do_link_abort+0x1a4/0x360
> [ 498.406681] ata_port_freeze+0x69/0x80
> [ 498.406719] ata_hsm_qc_complete+0x1da/0x240
> [ 498.406758] ata_sff_hsm_move+0x39a/0x1e10
> [ 498.406797] ata_sff_pio_task+0x397/0x4a0
> [ 498.406837] process_one_work+0x7cc/0x1410
> [ 498.406877] worker_thread+0x583/0xf70
> [ 498.406917] kthread+0x370/0x440
> [ 498.406957] ret_from_fork+0x22/0x30
> [ 498.406992]
> [ 498.407029] -> (&trig->leddev_list_lock){.-.-}-{2:2} {
> [ 498.407075] IN-HARDIRQ-R at:
> [ 498.407117] lock_acquire+0x1b8/0x720
> [ 498.407168] _raw_read_lock_irqsave+0x59/0x93
> [ 498.407212] led_trigger_blink_oneshot+0x3f/0xe0
> [ 498.407252] ledtrig_disk_activity+0x79/0xf0
> [ 498.407291] ata_qc_complete+0x69/0xd80
> [ 498.407331] ata_hsm_qc_complete+0x1c1/0x240
> [ 498.407371] ata_sff_hsm_move+0x39a/0x1e10
> [ 498.407411] __ata_sff_port_intr+0x1d2/0x3e0
> [ 498.407450] ata_bmdma_port_intr+0x96/0x330
> [ 498.409897] ata_bmdma_interrupt+0x1fd/0x5a0
> [ 498.409940] __handle_irq_event_percpu+0x25e/0x630
> [ 498.409980] handle_irq_event+0x10e/0x290
> [ 498.410018] handle_edge_irq+0x1f7/0xbb0
> [ 498.410057] asm_call_irq_on_stack+0x12/0x20
> [ 498.410098] common_interrupt+0xfb/0x1a0
> [ 498.410141] asm_common_interrupt+0x1e/0x40
> [ 498.410182] _raw_spin_unlock_irq+0x2f/0x40
> [ 498.410222] ata_sff_flush_pio_task+0xa0/0x170
> [ 498.410262] ata_exec_internal_sg+0xbe0/0x1840
> [ 498.410301] ata_dev_read_id+0x8cd/0xcb0
> [ 498.410340] ata_dev_reread_id+0xef/0x1c0
> [ 498.410381] ata_dev_revalidate+0xcb/0x1f0
> [ 498.410421] ata_do_set_mode+0x126b/0x2410
> [ 498.410461] ata_set_mode+0x305/0x3e0
> [ 498.410502] ata_eh_recover+0x268c/0x4110
> [ 498.410542] ata_do_eh+0x7a/0xe0
> [ 498.410580] ata_sff_error_handler+0x263/0x360
> [ 498.410618] ata_bmdma_error_handler+0x295/0x580
> [ 498.410658] ata_scsi_port_error_handler+0x7c1/0x17b0
> [ 498.410700] ata_scsi_error+0x1f8/0x410
> [ 498.410739] scsi_error_handler+0x29b/0x14a0
> [ 498.410779] kthread+0x370/0x440
> [ 498.410819] ret_from_fork+0x22/0x30
> [ 498.410858] IN-SOFTIRQ-R at:
> [ 498.410909] lock_acquire+0x1b8/0x720
> [ 498.410948] _raw_read_lock_irqsave+0x59/0x93
> [ 498.410987] led_trigger_event+0x34/0xd0
> [ 498.411029] kbd_bh+0x199/0x220
> [ 498.411070] tasklet_action_common.constprop.0+0x260/0x300
> [ 498.411110] tasklet_action+0x22/0x30
> [ 498.411159] __do_softirq+0x1bc/0x899
> [ 498.411200] run_ksoftirqd+0x3f/0x70
> [ 498.411243] smpboot_thread_fn+0x356/0x810
> [ 498.411283] kthread+0x370/0x440
> [ 498.411321] ret_from_fork+0x22/0x30
> [ 498.411358] INITIAL USE at:
> [ 498.411400] lock_acquire+0x1b8/0x720
> [ 498.411440] _raw_write_lock_irqsave+0x4d/0x70
> [ 498.411481] led_trigger_set+0x47a/0xd80
> [ 498.411522] led_trigger_set_default+0x16e/0x1f0
> [ 498.411562] led_classdev_register_ext+0x596/0x7e0
> [ 498.411603] input_leds_connect+0x35a/0x649 [input_leds]
> [ 498.411645] input_attach_handler.isra.0+0x10b/0x160
> [ 498.411685] input_register_handler+0x139/0x1e0
> [ 498.411724] 0xffffffffc02d8015
> [ 498.411762] do_one_initcall+0xcb/0x410
> [ 498.411802] do_init_module+0x1d8/0x770
> [ 498.411842] load_module+0x9eec/0xa9d0
> [ 498.411881] __do_sys_finit_module+0x131/0x1c0
> [ 498.411928] __x64_sys_finit_module+0x73/0xb0
> [ 498.411968] do_syscall_64+0x38/0x90
> [ 498.412010] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 498.412048] INITIAL READ USE at:
> [ 498.412088] lock_acquire+0x1b8/0x720
> [ 498.412129] _raw_read_lock_irqsave+0x89/0x93
> [ 498.412169] led_trigger_event+0x34/0xd0
> [ 498.412213] rfkill_global_led_trigger_worker+0xbe/0xf0
> [ 498.412253] process_one_work+0x7cc/0x1410
> [ 498.412293] worker_thread+0x583/0xf70
> [ 498.412333] kthread+0x370/0x440
> [ 498.412373] ret_from_fork+0x22/0x30
> [ 498.412410] }
> [ 498.412452] ... key at: [<ffffffff9ea17c40>] __key.0+0x0/0x40
> [ 498.412491] ... acquired at:
> [ 498.412530] lock_acquire+0x1b8/0x720
> [ 498.412572] _raw_spin_lock_irqsave+0x4d/0x70
> [ 498.412631] iwl_pcie_gen2_enqueue_hcmd+0x2c5/0x2100 [iwlwifi]
> [ 498.412692] iwl_trans_txq_send_hcmd+0x191/0xf20 [iwlwifi]
> [ 498.412751] iwl_trans_send_cmd+0x1b4/0x410 [iwlwifi]
> [ 498.412834] iwl_mvm_send_cmd+0xf1/0x1d0 [iwlmvm]
> [ 498.412896] iwl_mvm_led_set+0x1df/0x270 [iwlmvm]
> [ 498.412954] iwl_led_brightness_set+0x1f/0x30 [iwlmvm]
> [ 498.413001] led_set_brightness_nosleep+0xd4/0x1a0
> [ 498.413041] led_set_brightness+0x108/0x150
> [ 498.413088] led_trigger_event+0x73/0xd0
> [ 498.413170] ieee80211_led_radio+0x6e/0x90 [mac80211]
> [ 498.413253] ieee80211_do_open+0x896/0x1fa0 [mac80211]
> [ 498.413333] ieee80211_open+0x176/0x230 [mac80211]
> [ 498.413383] __dev_open+0x260/0x450
> [ 498.413423] __dev_change_flags+0x4c9/0x750
> [ 498.413464] dev_change_flags+0x87/0x160
> [ 498.413504] do_setlink+0x7a3/0x3160
> [ 498.413543] __rtnl_newlink+0xbdd/0x15a0
> [ 498.413588] rtnl_newlink+0x64/0xa0
> [ 498.413627] rtnetlink_rcv_msg+0x36a/0x910
> [ 498.413670] netlink_rcv_skb+0x140/0x3d0
> [ 498.413710] rtnetlink_rcv+0x15/0x20
> [ 498.413748] netlink_unicast+0x465/0x7a0
> [ 498.413788] netlink_sendmsg+0x7b4/0xcf0
> [ 498.413830] sock_sendmsg+0xec/0x120
> [ 498.413870] ____sys_sendmsg+0x696/0x9f0
> [ 498.413909] ___sys_sendmsg+0x101/0x190
> [ 498.413948] __sys_sendmsg+0xc8/0x160
> [ 498.413986] __x64_sys_sendmsg+0x78/0xb0
> [ 498.414026] do_syscall_64+0x38/0x90
> [ 498.414067] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 498.414105]
> [ 498.414142]
> [ 498.414142] the dependencies between the lock to be acquired
> [ 498.414145] and HARDIRQ-irq-unsafe lock:
> [ 498.414228] -> (&iwl_txq_cmd_queue_lock_class){+.-.}-{2:2} {
> [ 498.414275] HARDIRQ-ON-W at:
> [ 498.414316] lock_acquire+0x1b8/0x720
> [ 498.414357] _raw_spin_lock_bh+0x38/0x50
> [ 498.414414] iwl_pcie_hcmd_complete+0x1c7/0x12b0 [iwlwifi]
> [ 498.414473] iwl_pcie_rx_handle+0xe56/0x27e0 [iwlwifi]
> [ 498.414533] iwl_pcie_napi_poll_msix+0xa3/0x240 [iwlwifi]
> [ 498.414579] net_rx_action+0x3dd/0xea0
> [ 498.414636] __do_softirq+0x1bc/0x899
> [ 498.414676] asm_call_irq_on_stack+0x12/0x20
> [ 498.414717] do_softirq_own_stack+0x8b/0xb0
> [ 498.414756] do_softirq+0xab/0xd0
> [ 498.414795] __local_bh_enable_ip+0xf3/0x110
> [ 498.414851] iwl_pcie_irq_rx_msix_handler+0x26e/0x4e0 [iwlwifi]
> [ 498.414896] irq_thread_fn+0x92/0x160
> [ 498.414935] irq_thread+0x244/0x4a0
> [ 498.414974] kthread+0x370/0x440
> [ 498.415014] ret_from_fork+0x22/0x30
> [ 498.415051] IN-SOFTIRQ-W at:
> [ 498.415091] lock_acquire+0x1b8/0x720
> [ 498.415131] _raw_spin_lock_bh+0x38/0x50
> [ 498.415195] iwl_pcie_hcmd_complete+0x1c7/0x12b0 [iwlwifi]
> [ 498.415254] iwl_pcie_rx_handle+0xe56/0x27e0 [iwlwifi]
> [ 498.415314] iwl_pcie_napi_poll_msix+0xa3/0x240 [iwlwifi]
> [ 498.415360] net_rx_action+0x3dd/0xea0
> [ 498.415406] __do_softirq+0x1bc/0x899
> [ 498.415445] asm_call_irq_on_stack+0x12/0x20
> [ 498.415484] do_softirq_own_stack+0x8b/0xb0
> [ 498.415524] do_softirq+0xab/0xd0
> [ 498.415565] __local_bh_enable_ip+0xf3/0x110
> [ 498.415620] iwl_pcie_irq_rx_msix_handler+0x26e/0x4e0 [iwlwifi]
> [ 498.415665] irq_thread_fn+0x92/0x160
> [ 498.415705] irq_thread+0x244/0x4a0
> [ 498.415744] kthread+0x370/0x440
> [ 498.415783] ret_from_fork+0x22/0x30
> [ 498.415820] INITIAL USE at:
> [ 498.415860] lock_acquire+0x1b8/0x720
> [ 498.415899] _raw_spin_lock_irqsave+0x4d/0x70
> [ 498.415956] iwl_pcie_gen2_enqueue_hcmd+0x2c5/0x2100 [iwlwifi]
> [ 498.416016] iwl_trans_txq_send_hcmd+0x3fc/0xf20 [iwlwifi]
> [ 498.416074] iwl_trans_send_cmd+0x1b4/0x410 [iwlwifi]
> [ 498.416133] iwl_dbg_tlv_send_hcmds.isra.0+0x18f/0x240 [iwlwifi]
> [ 498.416192] _iwl_dbg_tlv_time_point+0x261/0x2ca0 [iwlwifi]
> [ 498.416248] iwl_run_unified_mvm_ucode+0x1fe/0x760 [iwlmvm]
> [ 498.416303] iwl_run_init_mvm_ucode+0x5ef/0xb80 [iwlmvm]
> [ 498.416359] iwl_mvm_start_get_nvm+0x1ee/0x7d0 [iwlmvm]
> [ 498.416413] iwl_op_mode_mvm_start+0x2082/0x3280 [iwlmvm]
> [ 498.416470] _iwl_op_mode_start.isra.0+0xef/0x1f0 [iwlwifi]
> [ 498.416528] iwl_opmode_register+0x13f/0x210 [iwlwifi]
> [ 498.416572] 0xffffffffc055803f
> [ 498.416617] do_one_initcall+0xcb/0x410
> [ 498.416658] do_init_module+0x1d8/0x770
> [ 498.416697] load_module+0x9eec/0xa9d0
> [ 498.416736] __do_sys_finit_module+0x131/0x1c0
> [ 498.416775] __x64_sys_finit_module+0x73/0xb0
> [ 498.416814] do_syscall_64+0x38/0x90
> [ 498.416854] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 498.416893] }
> [ 498.416949] ... key at: [<ffffffffc0a85bc0>] iwl_txq_cmd_queue_lock_class.89285+0x0/0xfffffffffff99440 [iwlwifi]
> [ 498.416993] ... acquired at:
> [ 498.417030] lock_acquire+0x1b8/0x720
> [ 498.417069] _raw_spin_lock_irqsave+0x4d/0x70
> [ 498.417126] iwl_pcie_gen2_enqueue_hcmd+0x2c5/0x2100 [iwlwifi]
> [ 498.417182] iwl_trans_txq_send_hcmd+0x191/0xf20 [iwlwifi]
> [ 498.417240] iwl_trans_send_cmd+0x1b4/0x410 [iwlwifi]
> [ 498.417298] iwl_mvm_send_cmd+0xf1/0x1d0 [iwlmvm]
> [ 498.417356] iwl_mvm_led_set+0x1df/0x270 [iwlmvm]
> [ 498.417412] iwl_led_brightness_set+0x1f/0x30 [iwlmvm]
> [ 498.417457] led_set_brightness_nosleep+0xd4/0x1a0
> [ 498.417498] led_set_brightness+0x108/0x150
> [ 498.417537] led_trigger_event+0x73/0xd0
> [ 498.417615] ieee80211_led_radio+0x6e/0x90 [mac80211]
> [ 498.417695] ieee80211_do_open+0x896/0x1fa0 [mac80211]
> [ 498.417775] ieee80211_open+0x176/0x230 [mac80211]
> [ 498.417823] __dev_open+0x260/0x450
> [ 498.417873] __dev_change_flags+0x4c9/0x750
> [ 498.417913] dev_change_flags+0x87/0x160
> [ 498.417950] do_setlink+0x7a3/0x3160
> [ 498.417988] __rtnl_newlink+0xbdd/0x15a0
> [ 498.418027] rtnl_newlink+0x64/0xa0
> [ 498.418066] rtnetlink_rcv_msg+0x36a/0x910
> [ 498.418105] netlink_rcv_skb+0x140/0x3d0
> [ 498.418143] rtnetlink_rcv+0x15/0x20
> [ 498.418191] netlink_unicast+0x465/0x7a0
> [ 498.418229] netlink_sendmsg+0x7b4/0xcf0
> [ 498.418267] sock_sendmsg+0xec/0x120
> [ 498.418307] ____sys_sendmsg+0x696/0x9f0
> [ 498.418345] ___sys_sendmsg+0x101/0x190
> [ 498.418383] __sys_sendmsg+0xc8/0x160
> [ 498.418421] __x64_sys_sendmsg+0x78/0xb0
> [ 498.418459] do_syscall_64+0x38/0x90
> [ 498.418499] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 498.418535]
> [ 498.418572]
> [ 498.418572] stack backtrace:
> [ 498.418620] CPU: 3 PID: 364 Comm: ip Tainted: G OE 5.10.27-dev-lockdep-fix2+ #1
> [ 498.418666] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> [ 498.418709] Call Trace:
> [ 498.418754] dump_stack+0xb3/0xed
> [ 498.418798] check_irq_usage.cold+0x4e1/0x66d
> [ 498.419100] __lock_acquire+0x2b0e/0x5460
> [ 498.419239] lock_acquire+0x1b8/0x720
> [ 498.419618] _raw_spin_lock_irqsave+0x4d/0x70
> [ 498.419737] iwl_pcie_gen2_enqueue_hcmd+0x2c5/0x2100 [iwlwifi]
> [ 498.420081] iwl_trans_txq_send_hcmd+0x191/0xf20 [iwlwifi]
> [ 498.420245] iwl_trans_send_cmd+0x1b4/0x410 [iwlwifi]
> [ 498.420305] iwl_mvm_send_cmd+0xf1/0x1d0 [iwlmvm]
> [ 498.420365] iwl_mvm_led_set+0x1df/0x270 [iwlmvm]
> [ 498.423581] iwl_led_brightness_set+0x1f/0x30 [iwlmvm]
> [ 498.423627] led_set_brightness_nosleep+0xd4/0x1a0
> [ 498.423668] led_set_brightness+0x108/0x150
> [ 498.423709] led_trigger_event+0x73/0xd0
> [ 498.423789] ieee80211_led_radio+0x6e/0x90 [mac80211]
> [ 498.423869] ieee80211_do_open+0x896/0x1fa0 [mac80211]
> [ 498.424028] ieee80211_open+0x176/0x230 [mac80211]
> [ 498.424076] __dev_open+0x260/0x450
> [ 498.424249] __dev_change_flags+0x4c9/0x750
> [ 498.424339] dev_change_flags+0x87/0x160
> [ 498.424382] do_setlink+0x7a3/0x3160
> [ 498.425060] __rtnl_newlink+0xbdd/0x15a0
> [ 498.425697] rtnl_newlink+0x64/0xa0
> [ 498.425739] rtnetlink_rcv_msg+0x36a/0x910
> [ 498.425938] netlink_rcv_skb+0x140/0x3d0
> [ 498.426137] rtnetlink_rcv+0x15/0x20
> [ 498.426177] netlink_unicast+0x465/0x7a0
> [ 498.426384] netlink_sendmsg+0x7b4/0xcf0
> [ 498.426556] sock_sendmsg+0xec/0x120
> [ 498.426597] ____sys_sendmsg+0x696/0x9f0
>
>
> Note that the iwlwifi module was a newer version, not necessarily
> matching the 5.10(.27) version, especially wrt. the locking of the txq-
> >lock (which is represented here by iwl_txq_cmd_queue_lock_class due to
> us separating the lock classes for different types of queues).
>
>
> Clearly, the scenario lockdep outputs isn't really an issue at all,
> since that will just continue to make progress on CPU1, finish, and then
> the interrupt on CPU0 can make progress.
>
> I tried thinking about adding write_lock() to the scenario as you
> (Boqun) did in the thread I linked to above, but even then I don't see
> how it could deadlock since the write_lock() for leddev_list_lock is
> always write_lock_irqsave(), so it can't get stuck on any of the other
> locks involved here due to interrupts, I think?
>
>
> All of the bits here that lockdep is reporting seem correct, but I can't
> figure out how this chain matters?
>
> &host->lock --> &trig->leddev_list_lock --> &iwl_txq_cmd_queue_lock_class
>
> Yes, obviously we take leddev_list_lock under host->lock, and we also
> take the iwl lock under leddev_list_lock. No argument there. But the
> argument seems to be that we can have the iwl lock taken without
> disabling interrupts (which we do), and that's a different code path
> unrelated to the leddev_list_lock?
>
> Basically, I don't think I understand how a read-lock in the middle can
> cause this chain to be a problem?
>

The deadlock senario lockdep reported is as follow:

CPU 0 CPU 1 CPU 2 CPU 3
===== ==== ===== =====
<irq enabled>
spin_lock(&txq->lock);
<irq disabled>
spin_lock(&host->lock);
<irq disabled>
read_lock(&trig->leddev_list_lock);
write_lock_irqsave(&trig->leddev_list_lock);
spin_lock(&txq->lock);

/* The following blocked because of fairness */
read_lock(&trig->leddev_list_lock);
<interrupted>
spin_lock(&host->lock);

The key to understand this deadlock is that via a write_lock() (no
matter whether irq is disabled or not), a read_lock() can block another
read_lock() because of the fairness: in this case, CPU 2 is the reader
of ->leddev_list_lock, and when CPU 3 comes and tries to acquire the
writer of ->leddev_list_lock, it gets blocked. And no one would acquire
->leddev_list_lock earlier than CPU 3 (*fairness*, even a reader cannot
steal the lock), so CPU 1 will get blocked on its read_lock().
Therefore, this is a 4-CPU involved deadlock.

> Can anyone help me decode this?

Does the above help?

Regards,
Boqun

>
> Thanks,
> johannes
>

2021-06-29 00:25:49

by Johannes Berg

[permalink] [raw]
Subject: Re: lockdep issue with trig->leddev_list_lock again/still?

Hi Boqun,

First, thanks for the explanation. I have a follow-up question or two,
I'm not sure I understand things correctly.

You state the following deadlock scenario:

> The deadlock senario lockdep reported is as follow:
>
> CPU 0 CPU 1 CPU 2 CPU 3
> ===== ==== ===== =====
> <irq enabled>
> spin_lock(&txq->lock);
> <irq disabled>
> spin_lock(&host->lock);
> <irq disabled>
> read_lock(&trig->leddev_list_lock);
> write_lock_irqsave(&trig->leddev_list_lock);
> spin_lock(&txq->lock);
>
> /* The following blocked because of fairness */
> read_lock(&trig->leddev_list_lock);
> <interrupted>
> spin_lock(&host->lock);
>
> The key to understand this deadlock is that via a write_lock() (no
> matter whether irq is disabled or not), a read_lock() can block another
> read_lock() because of the fairness: in this case, CPU 2 is the reader
> of ->leddev_list_lock, and when CPU 3 comes and tries to acquire the
> writer of ->leddev_list_lock, it gets blocked. And no one would acquire
> ->leddev_list_lock earlier than CPU 3 (*fairness*, even a reader cannot
> steal the lock), so CPU 1 will get blocked on its read_lock().
> Therefore, this is a 4-CPU involved deadlock.

And all of that makes sense.

One of the key "ingredients" to the deadlock here is obviously CPU 1,
notably the host->lock handling there.

Would lockdep be able to tell the difference - like read_lock() -
between in_interrupt() and interrupts disabled? I'm not sure, and I'm
kind of just curious because I think in this case we have e.g.
ata_exec_internal_sg() definitely calling it like this, i.e. having a
code like in your CPU 1.


Also, the addition of "CPU 3" is kind of what I missed - though I tried
to add a write_lock(), but couldn't "make" it deadlock with 3 CPUs :)


However, since basically any CPU can actually execute a write_lock()
anywhere, should we actually even think if a read_lock() be any
different from a regular spinlock wrt. lockdep? I mean, the above
becomes kind of (more) obvious when you think of the read_lock() as just
spin_lock(), and then you don't even need CPU 3 to be executing a
write_lock(), you just deadlock right away?

Or does lockdep in fact map the difference - like my question above -
between read_lock() in and out of interrupt, and thus it could tell the
difference? Again, just wondering.



Now, what does this actually mean? I think the only reasonable thing we
could possibly do is convert the LED framework to RCU or such, because I
don't see how we can make something that's so intertwined with lots of
things in the kernel require that e.g. the iwlwifi driver disables
interrupts for its locking all the time (which is actually kind of bad,
especially if it's not required), only because ATA code wants to be able
to blink an LED with interrupts disabled ...

Any thoughts?

Thanks,
johannes