Message-ID: <074cf5ed-c39d-1c16-12e7-4b14bbe0cac4@alu.unizg.hr>
Date: Wed, 26 Apr 2023 16:02:32 +0200
Subject: Re: [PATCH v4 1/1] wifi: mac80211: fortify the spinlock against deadlock by interrupt
To: Leon Romanovsky
Cc: Johannes Berg, linux-wireless@vger.kernel.org, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, Johannes Berg, "David S.
 Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Gregory Greenman,
 Alexander Wetzel
References: <20230425164005.25272-1-mirsad.todorovac@alu.unizg.hr> <20230426064145.GE27649@unreal>
From: Mirsad Todorovac
In-Reply-To: <20230426064145.GE27649@unreal>
X-Mailing-List: linux-wireless@vger.kernel.org

On 4/26/23 08:41, Leon Romanovsky wrote:
> On Tue, Apr 25, 2023 at 06:40:08PM +0200, Mirsad Goran Todorovac wrote:
>> In the function ieee80211_tx_dequeue() there is a particular locking
>> sequence:
>>
>> begin:
>> 	spin_lock(&local->queue_stop_reason_lock);
>> 	q_stopped = local->queue_stop_reasons[q];
>> 	spin_unlock(&local->queue_stop_reason_lock);
>>
>> However small the chance (increased by ftracetest), an asynchronous
>> interrupt can occur in between of spin_lock() and spin_unlock(),
>> and the interrupt routine will attempt to lock the same
>> &local->queue_stop_reason_lock again.
>>
>> This will cause a costly reset of the CPU and the wifi device or an
>> altogether hang in the single CPU and single core scenario.
>>
>> The only remaining spin_lock(&local->queue_stop_reason_lock) that
>> did not disable interrupts was patched, which should prevent any
>> deadlocks on the same CPU/core and the same wifi device.
>>
>> This is the probable trace of the deadlock:
>>
>> kernel: ================================
>> kernel: WARNING: inconsistent lock state
>> kernel: 6.3.0-rc6-mt-20230401-00001-gf86822a1170f #4 Tainted: G W
>> kernel: --------------------------------
>> kernel: inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
>> kernel: kworker/5:0/25656 [HC0[0]:SC0[0]:HE1:SE1] takes:
>> kernel: ffff9d6190779478 (&local->queue_stop_reason_lock){+.?.}-{2:2}, at: return_to_handler+0x0/0x40
>> kernel: {IN-SOFTIRQ-W} state was registered at:
>> kernel:   lock_acquire+0xc7/0x2d0
>> kernel:   _raw_spin_lock+0x36/0x50
>> kernel:   ieee80211_tx_dequeue+0xb4/0x1330 [mac80211]
>> kernel:   iwl_mvm_mac_itxq_xmit+0xae/0x210 [iwlmvm]
>> kernel:   iwl_mvm_mac_wake_tx_queue+0x2d/0xd0 [iwlmvm]
>> kernel:   ieee80211_queue_skb+0x450/0x730 [mac80211]
>> kernel:   __ieee80211_xmit_fast.constprop.66+0x834/0xa50 [mac80211]
>> kernel:   __ieee80211_subif_start_xmit+0x217/0x530 [mac80211]
>> kernel:   ieee80211_subif_start_xmit+0x60/0x580 [mac80211]
>> kernel:   dev_hard_start_xmit+0xb5/0x260
>> kernel:   __dev_queue_xmit+0xdbe/0x1200
>> kernel:   neigh_resolve_output+0x166/0x260
>> kernel:   ip_finish_output2+0x216/0xb80
>> kernel:   __ip_finish_output+0x2a4/0x4d0
>> kernel:   ip_finish_output+0x2d/0xd0
>> kernel:   ip_output+0x82/0x2b0
>> kernel:   ip_local_out+0xec/0x110
>> kernel:   igmpv3_sendpack+0x5c/0x90
>> kernel:   igmp_ifc_timer_expire+0x26e/0x4e0
>> kernel:   call_timer_fn+0xa5/0x230
>> kernel:   run_timer_softirq+0x27f/0x550
>> kernel:   __do_softirq+0xb4/0x3a4
>> kernel:   irq_exit_rcu+0x9b/0xc0
>> kernel:   sysvec_apic_timer_interrupt+0x80/0xa0
>> kernel:   asm_sysvec_apic_timer_interrupt+0x1f/0x30
>> kernel:   _raw_spin_unlock_irqrestore+0x3f/0x70
>> kernel:   free_to_partial_list+0x3d6/0x590
>> kernel:   __slab_free+0x1b7/0x310
>> kernel:   kmem_cache_free+0x52d/0x550
>> kernel:   putname+0x5d/0x70
>> kernel:   do_sys_openat2+0x1d7/0x310
>> kernel:   do_sys_open+0x51/0x80
>> kernel:   __x64_sys_openat+0x24/0x30
>> kernel:   do_syscall_64+0x5c/0x90
>> kernel:   entry_SYSCALL_64_after_hwframe+0x72/0xdc
>> kernel: irq event stamp: 5120729
>> kernel: hardirqs last enabled at (5120729): [] trace_graph_return+0xd6/0x120
>> kernel: hardirqs last disabled at (5120728): [] trace_graph_return+0xf0/0x120
>> kernel: softirqs last enabled at (5069900): [] return_to_handler+0x0/0x40
>> kernel: softirqs last disabled at (5067555): [] return_to_handler+0x0/0x40
>> kernel:
>> other info that might help us debug this:
>> kernel: Possible unsafe locking scenario:
>> kernel:        CPU0
>> kernel:        ----
>> kernel:   lock(&local->queue_stop_reason_lock);
>> kernel:   <Interrupt>
>> kernel:     lock(&local->queue_stop_reason_lock);
>> kernel:
>>  *** DEADLOCK ***
>> kernel: 8 locks held by kworker/5:0/25656:
>> kernel: #0: ffff9d618009d138 ((wq_completion)events_freezable){+.+.}-{0:0}, at: process_one_work+0x1ca/0x530
>> kernel: #1: ffffb1ef4637fe68 ((work_completion)(&local->restart_work)){+.+.}-{0:0}, at: process_one_work+0x1ce/0x530
>> kernel: #2: ffffffff9f166548 (rtnl_mutex){+.+.}-{3:3}, at: return_to_handler+0x0/0x40
>> kernel: #3: ffff9d6190778728 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: return_to_handler+0x0/0x40
>> kernel: #4: ffff9d619077b480 (&mvm->mutex){+.+.}-{3:3}, at: return_to_handler+0x0/0x40
>> kernel: #5: ffff9d61907bacd8 (&trans_pcie->mutex){+.+.}-{3:3}, at: return_to_handler+0x0/0x40
>> kernel: #6: ffffffff9ef9cda0 (rcu_read_lock){....}-{1:2}, at: iwl_mvm_queue_state_change+0x59/0x3a0 [iwlmvm]
>> kernel: #7: ffffffff9ef9cda0 (rcu_read_lock){....}-{1:2}, at: iwl_mvm_mac_itxq_xmit+0x42/0x210 [iwlmvm]
>> kernel:
>> stack backtrace:
>> kernel: CPU: 5 PID: 25656 Comm: kworker/5:0 Tainted: G W 6.3.0-rc6-mt-20230401-00001-gf86822a1170f #4
>> kernel: Hardware name: LENOVO 82H8/LNVNB161216, BIOS GGCN51WW 11/16/2022
>> kernel: Workqueue: events_freezable ieee80211_restart_work [mac80211]
>> kernel: Call Trace:
>> kernel:  <TASK>
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  dump_stack_lvl+0x5f/0xa0
>> kernel:  dump_stack+0x14/0x20
>> kernel:  print_usage_bug.part.46+0x208/0x2a0
>> kernel:  mark_lock.part.47+0x605/0x630
>> kernel:  ? sched_clock+0xd/0x20
>> kernel:  ? trace_clock_local+0x14/0x30
>> kernel:  ? __rb_reserve_next+0x5f/0x490
>> kernel:  ? _raw_spin_lock+0x1b/0x50
>> kernel:  __lock_acquire+0x464/0x1990
>> kernel:  ? mark_held_locks+0x4e/0x80
>> kernel:  lock_acquire+0xc7/0x2d0
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  ? ftrace_return_to_handler+0x8b/0x100
>> kernel:  ? preempt_count_add+0x4/0x70
>> kernel:  _raw_spin_lock+0x36/0x50
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  ieee80211_tx_dequeue+0xb4/0x1330 [mac80211]
>> kernel:  ? prepare_ftrace_return+0xc5/0x190
>> kernel:  ? ftrace_graph_func+0x16/0x20
>> kernel:  ? 0xffffffffc02ab0b1
>> kernel:  ? lock_acquire+0xc7/0x2d0
>> kernel:  ? iwl_mvm_mac_itxq_xmit+0x42/0x210 [iwlmvm]
>> kernel:  ? ieee80211_tx_dequeue+0x9/0x1330 [mac80211]
>> kernel:  ? __rcu_read_lock+0x4/0x40
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  iwl_mvm_mac_itxq_xmit+0xae/0x210 [iwlmvm]
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  iwl_mvm_queue_state_change+0x311/0x3a0 [iwlmvm]
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  iwl_mvm_wake_sw_queue+0x17/0x20 [iwlmvm]
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  iwl_txq_gen2_unmap+0x1c9/0x1f0 [iwlwifi]
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  iwl_txq_gen2_free+0x55/0x130 [iwlwifi]
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  iwl_txq_gen2_tx_free+0x63/0x80 [iwlwifi]
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  _iwl_trans_pcie_gen2_stop_device+0x3f3/0x5b0 [iwlwifi]
>> kernel:  ? _iwl_trans_pcie_gen2_stop_device+0x9/0x5b0 [iwlwifi]
>> kernel:  ? mutex_lock_nested+0x4/0x30
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  iwl_trans_pcie_gen2_stop_device+0x5f/0x90 [iwlwifi]
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  iwl_mvm_stop_device+0x78/0xd0 [iwlmvm]
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  __iwl_mvm_mac_start+0x114/0x210 [iwlmvm]
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  iwl_mvm_mac_start+0x76/0x150 [iwlmvm]
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  drv_start+0x79/0x180 [mac80211]
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  ieee80211_reconfig+0x1523/0x1ce0 [mac80211]
>> kernel:  ? synchronize_net+0x4/0x50
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  ieee80211_restart_work+0x108/0x170 [mac80211]
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  process_one_work+0x250/0x530
>> kernel:  ? ftrace_regs_caller_end+0x66/0x66
>> kernel:  worker_thread+0x48/0x3a0
>> kernel:  ? __pfx_worker_thread+0x10/0x10
>> kernel:  kthread+0x10f/0x140
>> kernel:  ? __pfx_kthread+0x10/0x10
>> kernel:  ret_from_fork+0x29/0x50
>> kernel:  </TASK>
>>
>> Fixes: 4444bc2116ae ("wifi: mac80211: Proper mark iTXQs for resumption")
>> Link: https://lore.kernel.org/all/1f58a0d1-d2b9-d851-73c3-93fcc607501c@alu.unizg.hr/
>> Reported-by: Mirsad Goran Todorovac
>> Cc: Gregory Greenman
>> Cc: Johannes Berg
>> Link: https://lore.kernel.org/all/cdc80531-f25f-6f9d-b15f-25e16130b53a@alu.unizg.hr/
>> Cc: David S. Miller
>> Cc: Eric Dumazet
>> Cc: Jakub Kicinski
>> Cc: Paolo Abeni
>> Cc: Leon Romanovsky
>> Cc: Alexander Wetzel
>> Signed-off-by: Mirsad Goran Todorovac
>> ---
>> v3 -> v4:
>> - Added whole lockdep trace as advised.
>> - Trimmed irrelevant line prefix.
>> v2 -> v3:
>> - Fix the Fixes: tag as advised.
>> - Change the net: to wifi: to comply with the original patch that
>>   is being fixed.
>> v1 -> v2:
>> - Minor rewording and clarification.
>> - Cc:-ed people that replied to the original bug report (forgotten
>>   in v1 by omission).
>>
>>  net/mac80211/tx.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> Thanks,
> Reviewed-by: Leon Romanovsky

Not at all. That's awesome!

Just to ask, do I need to send a PATCH v5 with the Reviewed-by: tag,
or is it picked up automatically?

Thanks.

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia

"What’s this thing suddenly coming towards me very fast?
Very very fast. ... I wonder if it will be friends with me?"