Return-path: Received: from mail-wy0-f174.google.com ([74.125.82.174]:44064 "EHLO mail-wy0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752615Ab1H2LyB convert rfc822-to-8bit (ORCPT ); Mon, 29 Aug 2011 07:54:01 -0400 Received: by wyg24 with SMTP id 24so3995857wyg.19 for ; Mon, 29 Aug 2011 04:54:00 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: References: <20110827113253.GA1444@ecki> Date: Mon, 29 Aug 2011 17:23:58 +0530 Message-ID: (sfid-20110829_135406_728182_A80818ED) Subject: Re: ath9k: irq storm after suspend/resume From: Mohammed Shafi To: Clemens Buchacher Cc: linux-wireless@vger.kernel.org, beta992@gmail.com Content-Type: text/plain; charset=ISO-8859-1 Sender: linux-wireless-owner@vger.kernel.org List-ID: On Mon, Aug 29, 2011 at 1:25 PM, Mohammed Shafi wrote: > On Sat, Aug 27, 2011 at 5:02 PM, Clemens Buchacher wrote: >> Hi, >> > Hi Clemens, > >> I am trying to debug this issue: >> https://bugzilla.kernel.org/show_bug.cgi?id=39112 >> >> An IRQ storm triggers during resume after suspend to RAM. The IRQ >> storm also triggers during insmod after suspend/resume, even if >> ath9k is not loaded during suspend. > > a recent commit, not sure this will help suspend/resume > > commit 0682c9b52bf51fbc67c4e79fcbdadcf70bd600f8 > Author: Rajkumar Manoharan > Date: ? Sat Aug 13 10:28:09 2011 +0530 > > ? ?ath9k: Fix rx overrun interrupt storm > > ? ?Whenever RXEOL is received, both RXORN and RXEOL got cleared > ? ?to avoid rx overrun interrupt storm. This was handled only for > ? ?edma chips. The same scenario was also observered with AR9280, > ? ?doing frequent channel type switch b/w HT20/40 with bidi traffic > ? ?that is causing failure to stop rx dma. This patch clears > ? ?the RXEOL & RXORN interrupts for all chips. > > >> >> In the former case, the IRQ storm happens right after the PCI >> subsystem restores the config space for ath9k. >> >> In the latter case, the IRQ storm happens immediately after >> request_irq in ath_pci_probe. So I have moved ath9k_init_device >> ahead of request_irq, in order to call ath9k_hw_set_interrupts(ah, >> 0) before enabling the interrupt. >> >> But still, the interrupts come. Note that according to >> /proc/interrupts, the IRQ line is not shared with any other device. >> I did not manage to determine which interrupt it is exactly, >> because the device is not in a ready state (SC_OP_INVALID is set) >> when they happen (in either scenario that triggers the IRQ storm). >> And SC_OP_INVALID is cleared only much later in ath9k_start. >> >> So, I am at a loss. Any ideas? > > please provide the lspci -vvvxx. > also looking at > /sys/kernel/debug/ieee80211/phy0/ath9k$ sudo cat interrupt. hi, i think this will help, please get the message sudo modprobe ath9k debug=0xffffffff. few fatal PCI interrupt messages are based on ATH_DEBUG_ANY. thanks. > > also this additional information might help: > in case have you seen this is happening in 32 bit also ? > is this happening in wireless-testing ?Linux 3.1-rc3 ? or the latest > compat wireless? > i did some preliminary testing, not able to recreate it. will try > further.thanks! > >> >> Below is the output I get during insmod'ing right after resume >> (complete logs can be found in the bug report). Below that are the >> changes I made to produce the output. >> >> Clemens >> --- >> >> [ ? 53.387989] ath9k 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 >> [ ? 53.389306] ath9k 0000:02:00.0: setting latency timer to 64 >> [ ? 53.389371] ath: reset power on >> [ ? 53.392725] ath: UNDEFINED -> AWAKE >> [ ? 53.392783] ath: serialize_regmode is 0 >> [ ? 53.438096] ath: Eeprom VER: 14, REV: 13 >> [ ? 53.438098] ath: Initialize ANI >> [ ? 53.438112] ath: Enable MIB counters >> [ ? 53.438143] ath: Setup TX queue: 9 >> [ ? 53.438144] ath: Set queue properties for: 9 >> [ ? 53.438146] ath: Setup TX queue: 8 >> [ ? 53.438147] ath: Set queue properties for: 8 >> [ ? 53.438149] ath: Set queue properties for: 8 >> [ ? 53.438151] ath: Reset TX queue: 8 >> [ ? 53.438176] ath: tx ok 0x0 err 0x0 desc 0x100 eol 0x100 urn 0x0 >> [ ? 53.438178] ath: Setup TX queue: 0 >> [ ? 53.438179] ath: Set queue properties for: 0 >> [ ? 53.438181] ath: Setup TX queue: 1 >> [ ? 53.438182] ath: Set queue properties for: 1 >> [ ? 53.438184] ath: Setup TX queue: 2 >> [ ? 53.438185] ath: Set queue properties for: 2 >> [ ? 53.438186] ath: Setup TX queue: 3 >> [ ? 53.438187] ath: Set queue properties for: 3 >> [ ? 53.439692] ath: TX streams 1, RX streams: 1 >> [ ? 53.439694] ath: EEPROM regdomain: 0x60 >> [ ? 53.439695] ath: EEPROM indicates we should expect a direct regpair map >> [ ? 53.439698] ath: Country alpha2 being used: 00 >> [ ? 53.439699] ath: Regpair used: 0x60 >> [ ? 53.439702] cfg80211: Updating information on frequency 2412 MHz for a 20 MHz width channel with regulatory rule: >> [ ? 53.439704] cfg80211: 2402000 KHz - 2472000 KHz @ ?KHz), (N/A mBi, 2000 mBm) >> [ ? 53.439706] cfg80211: Updating information on frequency 2417 MHz for a 20 MHz width channel with regulatory rule: >> [ ? 53.439709] cfg80211: 2402000 KHz - 2472000 KHz @ ?KHz), (N/A mBi, 2000 mBm) >> [ ? 53.439710] cfg80211: Updating information on frequency 2422 MHz for a 20 MHz width channel with regulatory rule: >> [ ? 53.439713] cfg80211: 2402000 KHz - 2472000 KHz @ ?KHz), (N/A mBi, 2000 mBm) >> [ ? 53.439714] cfg80211: Updating information on frequency 2427 MHz for a 20 MHz width channel with regulatory rule: >> [ ? 53.439717] cfg80211: 2402000 KHz - 2472000 KHz @ ?KHz), (N/A mBi, 2000 mBm) >> [ ? 53.439719] cfg80211: Updating information on frequency 2432 MHz for a 20 MHz width channel with regulatory rule: >> [ ? 53.439721] cfg80211: 2402000 KHz - 2472000 KHz @ ?KHz), (N/A mBi, 2000 mBm) >> [ ? 53.439723] cfg80211: Updating information on frequency 2437 MHz for a 20 MHz width channel with regulatory rule: >> [ ? 53.439725] cfg80211: 2402000 KHz - 2472000 KHz @ ?KHz), (N/A mBi, 2000 mBm) >> [ ? 53.439727] cfg80211: Updating information on frequency 2442 MHz for a 20 MHz width channel with regulatory rule: >> [ ? 53.439729] cfg80211: 2402000 KHz - 2472000 KHz @ ?KHz), (N/A mBi, 2000 mBm) >> [ ? 53.439731] cfg80211: Updating information on frequency 2447 MHz for a 20 MHz width channel with regulatory rule: >> [ ? 53.439733] cfg80211: 2402000 KHz - 2472000 KHz @ ?KHz), (N/A mBi, 2000 mBm) >> [ ? 53.439735] cfg80211: Updating information on frequency 2452 MHz for a 20 MHz width channel with regulatory rule: >> [ ? 53.439737] cfg80211: 2402000 KHz - 2472000 KHz @ ?KHz), (N/A mBi, 2000 mBm) >> [ ? 53.439739] cfg80211: Updating information on frequency 2457 MHz for a 20 MHz width channel with regulatory rule: >> [ ? 53.439741] cfg80211: 2402000 KHz - 2472000 KHz @ ?KHz), (N/A mBi, 2000 mBm) >> [ ? 53.439743] cfg80211: Updating information on frequency 2462 MHz for a 20 MHz width channel with regulatory rule: >> [ ? 53.439746] cfg80211: 2402000 KHz - 2472000 KHz @ ?KHz), (N/A mBi, 2000 mBm) >> [ ? 53.439747] cfg80211: Updating information on frequency 2467 MHz for a 20 MHz width channel with regulatory rule: >> [ ? 53.439750] cfg80211: 2457000 KHz - 2482000 KHz @ ?KHz), (N/A mBi, 2000 mBm) >> [ ? 53.439751] cfg80211: Updating information on frequency 2472 MHz for a 20 MHz width channel with regulatory rule: >> [ ? 53.439754] cfg80211: 2457000 KHz - 2482000 KHz @ ?KHz), (N/A mBi, 2000 mBm) >> [ ? 53.439756] cfg80211: Updating information on frequency 2484 MHz for a 20 MHz width channel with regulatory rule: >> [ ? 53.439758] cfg80211: 2474000 KHz - 2494000 KHz @ ?KHz), (N/A mBi, 2000 mBm) >> [ ? 53.439761] ath: tx DMA: 512 buffers 1 desc/buf >> [ ? 53.439772] ath: tx DMA map: ffff880071f90000 (54600) -> 71f90000 (54600) >> [ ? 53.439792] ath: beacon DMA: 4 buffers 1 desc/buf >> [ ? 53.439795] ath: beacon DMA map: ffff880071ecc000 (416) -> 71ecc000 (416) >> [ ? 53.439797] ath: cachelsz 64 rxbufsize 1926 >> [ ? 53.439799] ath: rx DMA: 512 buffers 1 desc/buf >> [ ? 53.439811] ath: rx DMA map: ffff880071fb0000 (54600) -> 71fb0000 (54600) >> [ ? 53.441193] cfg80211: Ignoring regulatory request Set by core since the driver uses its own custom regulatory domain >> [ ? 53.441269] ieee80211 phy1: Selected rate control algorithm 'ath9k_rate_control' >> [ ? 53.441628] Registered led device: ath9k-phy1 >> [ ? 53.441641] ath9k: disabling interrupts >> [ ? 53.442979] ath: disable IER >> [ ? 53.442990] ath: 0x0 => 0x0 >> [ ? 53.442992] ath: new IMR 0x0 >> [ ? 53.443013] ath9k: requesting irq >> [ ? 53.444293] ath9k: interrupt with invalid flag set (1 times) >> [ ? 53.444296] wait 500 ms >> [ ? 53.446255] ath9k: interrupt with invalid flag set (2 times) >> [ ? 53.446288] ath9k: interrupt with invalid flag set (4 times) >> [ ? 53.446330] ath9k: interrupt with invalid flag set (8 times) >> [ ? 53.446374] ath9k: interrupt with invalid flag set (16 times) >> [ ? 53.446453] ath9k: interrupt with invalid flag set (32 times) >> [ ? 53.446572] ath9k: interrupt with invalid flag set (64 times) >> [ ? 53.446821] ath9k: interrupt with invalid flag set (128 times) >> [ ? 53.447319] ath9k: interrupt with invalid flag set (256 times) >> [ ? 53.448526] ath9k: interrupt with invalid flag set (512 times) >> [ ? 53.450942] ath9k: interrupt with invalid flag set (1024 times) >> [ ? 53.454793] ath9k: interrupt with invalid flag set (2048 times) >> [ ? 53.462493] ath9k: interrupt with invalid flag set (4096 times) >> [ ? 53.477738] ath9k: interrupt with invalid flag set (8192 times) >> [ ? 53.508164] ath9k: interrupt with invalid flag set (16384 times) >> [ ? 53.571798] ath9k: interrupt with invalid flag set (32768 times) >> [ ? 53.697472] ath9k: interrupt with invalid flag set (65536 times) >> [ ? 53.832522] irq 17: nobody cared (try booting with the "irqpoll" option) >> [ ? 53.834253] Pid: 0, comm: kworker/0:0 Tainted: G ? ? ? ? C ?3.1.0-rc2-ath9k-debug #16 >> [ ? 53.835991] Call Trace: >> [ ? 53.837727] ? ?[] __report_bad_irq+0x3d/0xe0 >> [ ? 53.839487] ?[] note_interrupt+0x14d/0x210 >> [ ? 53.841220] ?[] handle_irq_event_percpu+0xc9/0x2a0 >> [ ? 53.842952] ?[] handle_irq_event+0x48/0x70 >> [ ? 53.844659] ?[] handle_fasteoi_irq+0x5a/0xe0 >> [ ? 53.846379] ?[] handle_irq+0x22/0x40 >> [ ? 53.848073] ?[] do_IRQ+0x5a/0xe0 >> [ ? 53.849751] ?[] common_interrupt+0x6e/0x6e >> [ ? 53.851438] ? ?[] ? schedule+0x316/0x9b0 >> [ ? 53.853151] ?[] ? intel_idle+0xcb/0x120 >> [ ? 53.854822] ?[] ? intel_idle+0xad/0x120 >> [ ? 53.856502] ?[] cpuidle_idle_call+0xc6/0x350 >> [ ? 53.858166] ?[] cpu_idle+0xc9/0x120 >> [ ? 53.859806] ?[] start_secondary+0x202/0x209 >> [ ? 53.861438] handlers: >> [ ? 53.863009] [] ath_isr >> [ ? 53.864524] Disabling IRQ #17 >> [ ? 53.943777] done waiting >> [ ? 53.944849] ieee80211 phy1: Atheros AR9285 Rev:2 mem=0xffffc90010ea0000, irq=17 >> >> diff --git a/drivers/net/wireless/ath/ath9k/hw.c b/drivers/net/wireless/ath/ath9k/hw.c >> index 8dcefe7..fe7780c 100644 >> --- a/drivers/net/wireless/ath/ath9k/hw.c >> +++ b/drivers/net/wireless/ath/ath9k/hw.c >> @@ -16,6 +16,7 @@ >> >> ?#include >> ?#include >> +#include >> ?#include >> >> ?#include "hw.h" >> @@ -532,6 +533,7 @@ static int __ath9k_hw_init(struct ath_hw *ah) >> ? ? ? ?ah->WARegVal |= (AR_WA_D3_L1_DISABLE | >> ? ? ? ? ? ? ? ? ? ? ? ? AR_WA_ASPM_TIMER_BASED_DISABLE); >> >> + ? ? ? ath_dbg(common, ATH_DBG_RESET, "reset power on\n"); >> ? ? ? ?if (!ath9k_hw_set_reset_reg(ah, ATH9K_RESET_POWER_ON)) { >> ? ? ? ? ? ? ? ?ath_err(common, "Couldn't reset chip\n"); >> ? ? ? ? ? ? ? ?return -EIO; >> diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c >> index 9098aaa..fdd5dd1 100644 >> --- a/drivers/net/wireless/ath/ath9k/main.c >> +++ b/drivers/net/wireless/ath/ath9k/main.c >> @@ -16,6 +16,7 @@ >> >> ?#include >> ?#include >> +#include >> ?#include "ath9k.h" >> ?#include "btcoex.h" >> >> @@ -770,14 +771,28 @@ irqreturn_t ath_isr(int irq, void *dev) >> ? ? ? ? * touch anything. Note this can happen early >> ? ? ? ? * on if the IRQ is shared. >> ? ? ? ? */ >> - ? ? ? if (sc->sc_flags & SC_OP_INVALID) >> + ? ? ? if (sc->sc_flags & SC_OP_INVALID) { >> + ? ? ? ? ? ? ? static int count = 0; >> + ? ? ? ? ? ? ? if (count < INT_MAX) { >> + ? ? ? ? ? ? ? ? ? ? ? count++; >> + ? ? ? ? ? ? ? ? ? ? ? if (is_power_of_2(count)) >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? printk(KERN_DEBUG "ath9k: interrupt with invalid flag set (%d times)\n", count); >> + ? ? ? ? ? ? ? } >> ? ? ? ? ? ? ? ?return IRQ_NONE; >> + ? ? ? } >> >> >> ? ? ? ?/* shared irq, not for us */ >> >> - ? ? ? if (!ath9k_hw_intrpend(ah)) >> + ? ? ? if (!ath9k_hw_intrpend(ah)) { >> + ? ? ? ? ? ? ? static int count = 0; >> + ? ? ? ? ? ? ? if (count < INT_MAX) { >> + ? ? ? ? ? ? ? ? ? ? ? count++; >> + ? ? ? ? ? ? ? ? ? ? ? if (is_power_of_2(count)) >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? printk(KERN_DEBUG "ath9k: shared irq, not for us (%d times)\n", count); >> + ? ? ? ? ? ? ? } >> ? ? ? ? ? ? ? ?return IRQ_NONE; >> + ? ? ? } >> >> ? ? ? ?/* >> ? ? ? ? * Figure out the reason(s) for the interrupt. ?Note >> @@ -792,8 +807,15 @@ irqreturn_t ath_isr(int irq, void *dev) >> ? ? ? ? * If there are no status bits set, then this interrupt was not >> ? ? ? ? * for me (should have been caught above). >> ? ? ? ? */ >> - ? ? ? if (!status) >> + ? ? ? if (!status) { >> + ? ? ? ? ? ? ? static int count = 0; >> + ? ? ? ? ? ? ? if (count < INT_MAX) { >> + ? ? ? ? ? ? ? ? ? ? ? count++; >> + ? ? ? ? ? ? ? ? ? ? ? if (is_power_of_2(count)) >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? printk(KERN_DEBUG "ath9k: interrupt, but no status bit set (%d times)\n", count); >> + ? ? ? ? ? ? ? } >> ? ? ? ? ? ? ? ?return IRQ_NONE; >> + ? ? ? } >> >> ? ? ? ?/* Cache the status */ >> ? ? ? ?sc->intrstatus = status; >> diff --git a/drivers/net/wireless/ath/ath9k/pci.c b/drivers/net/wireless/ath/ath9k/pci.c >> index be4ea13..6fdabd8 100644 >> --- a/drivers/net/wireless/ath/ath9k/pci.c >> +++ b/drivers/net/wireless/ath/ath9k/pci.c >> @@ -242,21 +242,35 @@ static int ath_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) >> ? ? ? ?/* Will be cleared in ath9k_start() */ >> ? ? ? ?sc->sc_flags |= SC_OP_INVALID; >> >> + ? ? ? pci_read_config_word(pdev, PCI_SUBSYSTEM_ID, &subsysid); >> + ? ? ? ret = ath9k_init_device(id->device, sc, subsysid, &ath_pci_bus_ops); >> + ? ? ? if (ret) { >> + ? ? ? ? ? ? ? dev_err(&pdev->dev, "Failed to initialize device\n"); >> + ? ? ? ? ? ? ? goto err_init; >> + ? ? ? } >> + >> + ? ? ? { >> + ? ? ? ? ? ? ? struct ath_hw *ah = sc->sc_ah; >> + ? ? ? ? ? ? ? printk(KERN_INFO "ath9k: disabling interrupts\n"); >> + ? ? ? ? ? ? ? ath9k_ps_wakeup(sc); >> + ? ? ? ? ? ? ? mutex_lock(&sc->mutex); >> + ? ? ? ? ? ? ? ath9k_hw_set_interrupts(ah, 0); >> + ? ? ? ? ? ? ? mutex_unlock(&sc->mutex); >> + ? ? ? ? ? ? ? ath9k_ps_restore(sc); >> + ? ? ? } >> + >> + ? ? ? printk(KERN_INFO "ath9k: requesting irq\n"); >> ? ? ? ?ret = request_irq(pdev->irq, ath_isr, IRQF_SHARED, "ath9k", sc); >> ? ? ? ?if (ret) { >> ? ? ? ? ? ? ? ?dev_err(&pdev->dev, "request_irq failed\n"); >> ? ? ? ? ? ? ? ?goto err_irq; >> ? ? ? ?} >> + ? ? ? printk(KERN_INFO "wait 500 ms\n"); >> + ? ? ? msleep(500); >> + ? ? ? printk(KERN_INFO "done waiting\n"); >> >> ? ? ? ?sc->irq = pdev->irq; >> >> - ? ? ? pci_read_config_word(pdev, PCI_SUBSYSTEM_ID, &subsysid); >> - ? ? ? ret = ath9k_init_device(id->device, sc, subsysid, &ath_pci_bus_ops); >> - ? ? ? if (ret) { >> - ? ? ? ? ? ? ? dev_err(&pdev->dev, "Failed to initialize device\n"); >> - ? ? ? ? ? ? ? goto err_init; >> - ? ? ? } >> - >> ? ? ? ?ath9k_hw_name(sc->sc_ah, hw_name, sizeof(hw_name)); >> ? ? ? ?wiphy_info(hw->wiphy, "%s mem=0x%lx, irq=%d\n", >> ? ? ? ? ? ? ? ? ? hw_name, (unsigned long)mem, pdev->irq); >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at ?http://vger.kernel.org/majordomo-info.html >> > > > > -- > shafi > -- shafi