From: Linus Lüssing
To: Kalle Valo, Felix Fietkau, Sujith Manoharan, ath9k-devel@qca.qualcomm.com
Cc: linux-wireless@vger.kernel.org, "David S . Miller", Jakub Kicinski, "John W . Linville", Felix Fietkau, Simon Wunderlich, Sven Eckelmann, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Linus Lüssing
Subject: [PATCH 2/3] ath9k: Fix potential interrupt storm on queue reset
Date: Tue, 14 Sep 2021 21:25:14 +0200
Message-Id: <20210914192515.9273-3-linus.luessing@c0d3.blue>
In-Reply-To: <20210914192515.9273-1-linus.luessing@c0d3.blue>
References: <20210914192515.9273-1-linus.luessing@c0d3.blue>
X-Mailing-List: linux-wireless@vger.kernel.org

From: Linus Lüssing

In tests with two Lima boards from 8devices (QCA4531 based) on OpenWrt
19.07 we could force a silent restart of a device with no serial
output when we were sending a large amount of UDP traffic (iperf3 at 80
MBit/s in both directions from external hosts, saturating the wifi and
causing a load of about 4.5 to 6) and were then triggering an
ath9k_queue_reset().

Further debugging showed that the restart was caused by the ath79
watchdog. With the watchdog disabled we could observe that the device
was constantly entering the ath_isr() interrupt handler and returning
early after the ATH_OP_HW_RESET flag test, without clearing any
interrupts, even though ath9k_queue_reset() calls
ath9k_hw_kill_interrupts().

With JTAG we could observe the following race condition:

1) ath9k_queue_reset()
   ...
   -> ath9k_hw_kill_interrupts()
   -> set_bit(ATH_OP_HW_RESET, &common->op_flags);
   ...
   <- returns

      2) ath9k_tasklet()
         ...
         -> ath9k_hw_resume_interrupts()
         ...
         <- returns

                 3) loops around:
                    ...
                    handle_int()
                    -> ath_isr()
                       ...
                       -> if (test_bit(ATH_OP_HW_RESET,
                                       &common->op_flags))
                            return IRQ_HANDLED;

                    x) ath_reset_internal():
                       => never reached <=

And in ath_isr() we would typically see the following interrupts /
interrupt causes:

* status: 0x00111030 or 0x00110030
* async_cause: 2 (AR_INTR_MAC_IPQ)
* sync_cause: 0

So ath9k_tasklet() re-enables, through ath9k_hw_resume_interrupts(),
the ath9k interrupts which ath9k_queue_reset() had just disabled. And
ath_isr() then keeps firing because it returns IRQ_HANDLED without
actually clearing the interrupt.

To fix this IRQ storm, also clear/disable the interrupts again when we
are in reset state.

Cc: Sven Eckelmann
Cc: Simon Wunderlich
Cc: Linus Lüssing
Fixes: 872b5d814f99 ("ath9k: do not access hardware on IRQs during reset")
Signed-off-by: Linus Lüssing
---
 drivers/net/wireless/ath/ath9k/main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
index 139831539da3..98090e40e1cf 100644
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -533,8 +533,10 @@ irqreturn_t ath_isr(int irq, void *dev)
 	ath9k_debug_sync_cause(sc, sync_cause);
 	status &= ah->imask;	/* discard unasked-for bits */
 
-	if (test_bit(ATH_OP_HW_RESET, &common->op_flags))
+	if (test_bit(ATH_OP_HW_RESET, &common->op_flags)) {
+		ath9k_hw_kill_interrupts(sc->sc_ah);
 		return IRQ_HANDLED;
+	}
 
 	/*
 	 * If there are no status bits set, then this interrupt was not
-- 
2.31.0