2012-03-07 14:25:29

by Justin P. Mattock

[permalink] [raw]
Subject: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

ath9k craps out from time to time(more annoying if anything)

[44309.840830] ath: Failed to stop TX DMA, queues=0x001!
[44309.858246] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
[44309.858255] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[44309.932801] ath: Failed to stop TX DMA, queues=0x001!
[44309.950352] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
[44309.950361] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[44310.024301] ath: Failed to stop TX DMA, queues=0x001!
[44310.041641] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
[44310.041646] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[44310.116316] ath: Failed to stop TX DMA, queues=0x001!
[44310.133909] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
[44310.133918] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[44310.208021] ath: Failed to stop TX DMA, queues=0x001!
[44310.225752] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
[44310.225761] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[44310.299922] ath: Failed to stop TX DMA, queues=0x001!
[44310.317579] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
[44310.317587] ath: Could not stop RX, we could be confusing the DMA
engine whe


part of dmesg here:
http://fpaste.org/xew6/
I can supply full log if needed..

Justin P. Mattock


2012-03-25 23:00:30

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/20/2012 01:29 AM, Sujith Manoharan wrote:
> Justin P. Mattock wrote:
>> yeah this works:
>>
>> eading symbols from
>> /home/kernel/linux-next/drivers/net/wireless/ath/ath9k/ath9k.o...done.
>> (gdb) l *(ath_tx_start+0x284)
>> 0xcad4 is in ath_tx_start (drivers/net/wireless/ath/ath9k/xmit.c:1878).
>> 1873 ieee80211_is_data_qos(hdr->frame_control)) {
>> 1874 tidno = ieee80211_get_qos_ctl(hdr)[0]&
>> 1875 IEEE80211_QOS_CTL_TID_MASK;
>> 1876 tid = ATH_AN_2_TID(txctl->an, tidno);
>> 1877
>> 1878 WARN_ON(tid->ac->txq != txctl->txq);
>> 1879 }
>> 1880
>> 1881 if ((tx_info->flags& IEEE80211_TX_CTL_AMPDU)&& tid) {
>> 1882 /*
>> (gdb)
>
> Can you try this patch ?
>
> From: Sujith Manoharan<[email protected]>
> Date: Tue, 20 Mar 2012 13:51:26 +0530
> Subject: [PATCH] ath9k: Use HW HT capabilites properly
>
> The commit "ath9k: Remove aggregation flags" changed how
> nodes were being initialized. Use the HW HT cap bits
> to initialize/de-initialize nodes, else we would be
> accessing an uninitialized entry during a suspend/resume cycle,
> resulting in a panic.
>
> Reported-by: Justin P. Mattock<[email protected]>
> Signed-off-by: Sujith Manoharan<[email protected]>
> ---
> drivers/net/wireless/ath/ath9k/main.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
> index 3879485..215eb25 100644
> --- a/drivers/net/wireless/ath/ath9k/main.c
> +++ b/drivers/net/wireless/ath/ath9k/main.c
> @@ -640,7 +640,7 @@ static void ath_node_attach(struct ath_softc *sc, struct ieee80211_sta *sta,
> an->sta = sta;
> an->vif = vif;
>
> - if (sta->ht_cap.ht_supported) {
> + if (sc->sc_ah->caps.hw_caps& ATH9K_HW_CAP_HT) {
> ath_tx_node_init(sc, an);
> an->maxampdu = 1<< (IEEE80211_HT_MAX_AMPDU_FACTOR +
> sta->ht_cap.ampdu_factor);
> @@ -659,7 +659,7 @@ static void ath_node_detach(struct ath_softc *sc, struct ieee80211_sta *sta)
> an->sta = NULL;
> #endif
>
> - if (sta->ht_cap.ht_supported)
> + if (sc->sc_ah->caps.hw_caps& ATH9K_HW_CAP_HT)
> ath_tx_node_cleanup(sc, an);
> }
>

I would have to say this patch above does get rid of this crash I was
seeing. as a quick test I simply connect to a WPA network, then connect
to an open network going back and forth triggers this freeze for me
after applying this I am able to toggle back and forth without a freeze.

Thanks!

Justin P. Mattock

2012-03-20 09:37:22

by Mohammed Shafi

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

Hi Sujith/Justin,

On Tue, Mar 20, 2012 at 1:59 PM, Sujith Manoharan
<[email protected]> wrote:
> Justin P. Mattock wrote:
>> yeah this works:
>>
>> eading symbols from
>> /home/kernel/linux-next/drivers/net/wireless/ath/ath9k/ath9k.o...done.
>> (gdb) l *(ath_tx_start+0x284)
>> 0xcad4 is in ath_tx_start (drivers/net/wireless/ath/ath9k/xmit.c:1878).
>> 1873 ? ? ? ? ? ? ? ? ?ieee80211_is_data_qos(hdr->frame_control)) {
>> 1874 ? ? ? ? ? ? ? ? ?tidno = ieee80211_get_qos_ctl(hdr)[0] &
>> 1875 ? ? ? ? ? ? ? ? ? ? ? ? ?IEEE80211_QOS_CTL_TID_MASK;
>> 1876 ? ? ? ? ? ? ? ? ?tid = ATH_AN_2_TID(txctl->an, tidno);
>> 1877
>> 1878 ? ? ? ? ? ? ? ? ?WARN_ON(tid->ac->txq != txctl->txq);
>> 1879 ? ? ? ? ?}
>> 1880
>> 1881 ? ? ? ? ?if ((tx_info->flags & IEEE80211_TX_CTL_AMPDU) && tid) {
>> 1882 ? ? ? ? ? ? ? ? ?/*
>> (gdb)
>
> Can you try this patch ?

just found out that 'ht_supported' may not be set, if assoc response
does not has ht_cap IE (or) if we could not parse it (why), then the
driver won't initialize those tid related structures ath_tx_node_init,
while we later access them in ath_tx_start. so this should fix the
issue.

>
> From: Sujith Manoharan <[email protected]>
> Date: Tue, 20 Mar 2012 13:51:26 +0530
> Subject: [PATCH] ath9k: Use HW HT capabilites properly
>
> The commit "ath9k: Remove aggregation flags" changed how
> nodes were being initialized. Use the HW HT cap bits
> to initialize/de-initialize nodes, else we would be
> accessing an uninitialized entry during a suspend/resume cycle,
> resulting in a panic.
>
> Reported-by: Justin P. Mattock <[email protected]>
> Signed-off-by: Sujith Manoharan <[email protected]>
> ---
> ?drivers/net/wireless/ath/ath9k/main.c | ? ?4 ++--
> ?1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
> index 3879485..215eb25 100644
> --- a/drivers/net/wireless/ath/ath9k/main.c
> +++ b/drivers/net/wireless/ath/ath9k/main.c
> @@ -640,7 +640,7 @@ static void ath_node_attach(struct ath_softc *sc, struct ieee80211_sta *sta,
> ? ? ? ?an->sta = sta;
> ? ? ? ?an->vif = vif;
>
> - ? ? ? if (sta->ht_cap.ht_supported) {
> + ? ? ? if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_HT) {
> ? ? ? ? ? ? ? ?ath_tx_node_init(sc, an);
> ? ? ? ? ? ? ? ?an->maxampdu = 1 << (IEEE80211_HT_MAX_AMPDU_FACTOR +
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sta->ht_cap.ampdu_factor);
> @@ -659,7 +659,7 @@ static void ath_node_detach(struct ath_softc *sc, struct ieee80211_sta *sta)
> ? ? ? ?an->sta = NULL;
> ?#endif
>
> - ? ? ? if (sta->ht_cap.ht_supported)
> + ? ? ? if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_HT)
> ? ? ? ? ? ? ? ?ath_tx_node_cleanup(sc, an);
> ?}
>
> --
> 1.7.9.4
>



--
thanks,
shafi

2012-03-20 06:36:11

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/19/2012 11:32 PM, Mohammed Shafi wrote:
> On Tue, Mar 20, 2012 at 11:57 AM, Justin P. Mattock
> <[email protected]> wrote:
>> On 03/19/2012 11:19 PM, Mohammed Shafi wrote:
>>>>
>>>> (gdb) list *ath9k_tx+start+0x284
>>>> No symbol "start" in current context.
>>>
>>>
>>> hey, pls try l *(ath9k_tx_start+0x284), yesterday i did the same in
>>> the latest wireless testing tree it pointed to line 1929 of xmit.c
>>>
>>> sc->tx.seq_no +=0x10;
>>>
>>> also did few suspend/resume in my machine, nothing interesting happened
>>>
>>>>
>>>> before receiving the email to do the above code with gdb I had cleaned
>>>> the
>>>> tree out with git clean -fx so rebuilding might have misalighned
>>>> things(but
>>>> could be wrong)..
>>>>
>>>
>>>
>>>
>>
>>
>> seems gdb is not working with that code but works if I remove _start.
>> (maybe -next took this out or something)
>>
>> (gdb) l *(ath9k_tx_start+0x284)
>> No symbol "ath9k_tx_start" in current context.
>
> sorry its
> l *(ath_tx_start+0x284)
>
>> (gdb) l *(ath9k_tx+0x284)
>> 0x4854 is in ath_paprd_activate (drivers/net/wireless/ath/ath9k/main.c:395).
>> 390 continue;
>> 391
>> 392 ar9003_paprd_populate_single_table(ah, caldata,
>> chain);
>> 393 }
>> 394
>> 395 ar9003_paprd_enable(ah, true);
>> 396 ath9k_ps_restore(sc);
>> 397 }
>> 398
>> 399 static bool ath_paprd_send_frame(struct ath_softc *sc, struct
>> sk_buff *skb, int chain)
>>
>> right now I am running next, will wait for the RX DMA to get a clean trace,
>> then will look to see.. as for the other issue I just rebooted and hit this
>> during bootup.
>>
>> Justin P. Mattock
>
>
>

yeah this works:

eading symbols from
/home/kernel/linux-next/drivers/net/wireless/ath/ath9k/ath9k.o...done.
(gdb) l *(ath_tx_start+0x284)
0xcad4 is in ath_tx_start (drivers/net/wireless/ath/ath9k/xmit.c:1878).
1873 ieee80211_is_data_qos(hdr->frame_control)) {
1874 tidno = ieee80211_get_qos_ctl(hdr)[0] &
1875 IEEE80211_QOS_CTL_TID_MASK;
1876 tid = ATH_AN_2_TID(txctl->an, tidno);
1877
1878 WARN_ON(tid->ac->txq != txctl->txq);
1879 }
1880
1881 if ((tx_info->flags & IEEE80211_TX_CTL_AMPDU) && tid) {
1882 /*
(gdb)


2012-03-19 14:43:42

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/16/2012 12:45 PM, Justin P. Mattock wrote:
> On 03/15/2012 06:57 AM, Justin P. Mattock wrote:
>>
>>>> ok! I have set that on the machine.. will see if I get this message in
>>>> the upcoming weeks.
>>> Also, please try this patch with powersave enabled:
>>> http://nbd.name/ps-fix.patch
>>>
>>> - Felix
>>>
>>
>> I did have ps-fix patch running, but removed it in the kernel due to not
>> being able to connect once I wakeup from suspend. anyway I have enable
>> debugging for the wireless(hopefully) and was able to trigger this after
>> a fresh build of the kernel last night(git clean -fx and all)
>>
>> the message I have in dmesg is:
>>
>> [ 3348.274932] ath: Failed to stop TX DMA, queues=0x004!
>> [ 3348.292637] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
>> [ 3348.292643] ath: Could not stop RX, we could be confusing the DMA
>> engine when we start RX up
>> [ 3348.292648] ------------[ cut here ]------------
>> [ 3348.292662] WARNING: at drivers/net/wireless/ath/ath9k/recv.c:537
>> ath_stoprecv+0xed/0x100 [ath9k]()
>> [ 3348.292666] Hardware name: 0914
>> [ 3348.292669] Modules linked in: dm_crypt ipt_REJECT ipt_LOG xt_limit
>> xt_tcpudp xt_addrtype joydev xt_state arc4 snd_hda_codec_hdmi
>> snd_hda_codec_conexant ath9k snd_hda_intel snd_hda_codec snd_hwdep
>> mac80211 snd_pcm snd_seq_midi nf_conntrack_netbios_ns
>> nf_conntrack_broadcast nf_nat_ftp nf_nat snd_rawmidi ath9k_common
>> nf_conntrack_ipv4 uvcvideo nf_defrag_ipv4 ath9k_hw snd_seq_midi_event
>> nf_conntrack_ftp nf_conntrack snd_seq snd_timer snd_seq_device ath
>> videobuf2_core videodev cfg80211 psmouse iptable_filter ip_tables
>> videobuf2_vmalloc videobuf2_memops name_laptop intel_ips sparse_keymap
>> serio_raw snd lp parport x_tables soundcore snd_page_alloc i915
>> drm_kms_helper drm i2c_algo_bit intel_agp intel_gtt usb_storage agpgart
>> video r8169 uas
>> [ 3348.292786] Pid: 5972, comm: kworker/u:24 Not tainted
>> 3.3.0-rc7-00048-g762ad8a #1
>> [ 3348.292790] Call Trace:
>> [ 3348.292804] [<c1046f32>] warn_slowpath_common+0x72/0xa0
>> [ 3348.292816] [<f8f528bd>] ? ath_stoprecv+0xed/0x100 [ath9k]
>> [ 3348.292826] [<f8f528bd>] ? ath_stoprecv+0xed/0x100 [ath9k]
>> [ 3348.292832] [<c1046f82>] warn_slowpath_null+0x22/0x30
>> [ 3348.292842] [<f8f528bd>] ath_stoprecv+0xed/0x100 [ath9k]
>> [ 3348.292855] [<f8f4e35d>] ath_prepare_reset+0x5d/0xd0 [ath9k]
>> [ 3348.292865] [<f8f4f92c>] ath_reset_internal+0x7c/0x160 [ath9k]
>> [ 3348.292875] [<c10a862b>] ? trace_hardirqs_on+0xb/0x10
>> [ 3348.292885] [<f8f4fa35>] ath_reset+0x25/0xb0 [ath9k]
>> [ 3348.292891] [<c10a8594>] ? trace_hardirqs_on_caller+0xf4/0x180
>> [ 3348.292901] [<f8f51377>] ath_reset_work+0x17/0x20 [ath9k]
>> [ 3348.292909] [<c10663d6>] process_one_work+0x166/0x5b0
>> [ 3348.292916] [<c1066362>] ? process_one_work+0xf2/0x5b0
>> [ 3348.292924] [<c12ca72b>] ? do_raw_spin_lock+0x3b/0xf0
>> [ 3348.292935] [<f8f51360>] ? ath_isr+0x270/0x270 [ath9k]
>> [ 3348.292944] [<c1066b34>] worker_thread+0x124/0x2c0
>> [ 3348.292950] [<c1066a10>] ? rescuer_thread+0x1c0/0x1c0
>> [ 3348.292957] [<c106b77d>] kthread+0x7d/0x90
>> [ 3348.292963] [<c106b700>] ? __init_kthread_worker+0x60/0x60
>> [ 3348.292972] [<c158ed42>] kernel_thread_helper+0x6/0x10
>> [ 3348.292976] ---[ end trace ced0f48896d487b2 ]---
>> [ 3349.735411] ath: Failed to stop TX DMA, queues=0x005!
>> [ 3349.753028] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
>> [ 3349.753034] ath: Could not stop RX, we could be confusing the DMA
>> engine when we start RX up
>>
>>
>> full dmesg is here;
>> http://fpaste.org/YhqT/
>>
>> hope this helps with finding the bug!
>>
>> Justin P. Mattock
>
> not sure if this is correct, but from what I remember, not cleaning the
> directory of the kernel seems to hide this bug, doing a git clean -fx
> has triggered this twice now, as opposed to waiting a week.. anyway I
> added more debug options to the kernel to see what it might produce when
> this is triggered. this is what i have now..:
>
>
> [10558.732792] TKIP decrypt: data(len=1468) 08 28 08 20 00 00 00 00 aa
> aa 03 00 00 00 08 00 45 00 05 a8 54 36 40 00 3a 06 6a 5d cc 02 a6 81 0a
> 00 00 39 00 50 9f 02 8c ff aa e2 22 c1 69 aa 80 10 10 f8 1d 52 00 00 01
> 01 08 0a 0d 20 6d 44 00 27 3b 4d c1 08 29 9c c6 b1 d3 18 22 9e 13 08 a5
> dd ac 66 24 3a 18 22 11 84 ac 6a 49 94 c4 84 fb 73 86 94 10 9e 72 e9 04
> 51 04 d2 93 d3 2e a3 28 51 4c b4 8c 89 27 4c bd 3e 8c 60 44 10 2a 91 32
> e9 da 25 84 08 81 49 45 47 0e cc e0 8a 00 25 34 26 5d 7e 5c 47 e0 85 10
> 9e 03 c8 9c 57 fe 4f 4c c6 50 44 46 09 e0 74 23 0b 10 00 13 ef 12 3d 12
> 82 21 18 0f 22 47 12 64 25 e9 96 30 44 23 03 b9 10 c4 63 8c cf b3 38 6b
> 98 14 16 a5 12 27 d3 db 02 20 52 41 3d 92 9e 03 38 51 4e 05 0d 3d a0 fe
> 08 22 8c 25 20 9c 44 bd b8 41 14 40 a1 23 2c 06 70 45 38 14 06 a3 98 97
> 5c a0 4a 09 19 1f 7a 5e d1 9c 28 84 c3 49 42 58 ce 72 f5 fa 7d 30 44 23
> 09 e4 49 53 96 03 0c 25 28 69 42 29 aa d5 99 18 9e 9d 61 45 38 14 81 5c
> f2 eb 39 67 0d 28 14 e2 48 c3 db 02 64 10 84 ba 4b d7 02 20 51 4c c0 9e
> 42 04 10 42 72 54 0e 7e 88 e5 c5 0d 69 37 27 24 28 93 21 1c c5 3c 27 91
> 05 1e d0 41 ed 8e 81 48 85 6c f9 74 6d b5 d4 de 90 65 de 69 a6 c3 10 74
> 4d 73 f4 4a 72 f6 c6 35 67 ee 75 2f 41 4c d8 53 03 fe 47 dc 16 9f 3d ad
> 6a a1 db 64 ac 68 fc fe a6 c4 a7 30 94 49 5d b9 18 96 82 f3 d0 aa 57 64
> e9 50 39 04 a4 4e 53 fc 71 82 5c 62 57 d4 98 c6 e1 11 01 09 7a 23 98 95
> 26 10 81 06 63 4a 4a dc 24 04 21 22 65 44 99 00 07 69 30 07 11 72 4e 63
> 4d a4 05 66 5c f8 9b 6b ed 1b 15 be b3 7c 57 5c 9d bb 5d 4a 91 4b 65 b0
> 32 cb 8f 05 a4 05 29 33 75 2a 0e 68 04 6b 20 a7 1c a7 9c 5c 95 21 cf ca
> bc ad 4f ac 86 98 4b 68 87 3e 5f 72 e5 6e 9d 8b b6 2d 9b 0e 9f 71 5b ab
> 2b aa 1f ab ba 0b 78 a4 ad 61 14 ab a6 4f 86 71 e2 87 5a 91 58 74 68 9e
> ad 5a 4a 48 90 eb 1c 3e 49 6c 6d b9 5b a3 f5 31 3a 6b 58 5a 00 20 c3 a6
> fc d6 67 50 93 22 7d 99 f6 45 58 af 43 00 9c 94 80 71 18 fe 48 71 29 16
> 8e 44 48 c2 52 04 41 14 83 42 b3 78 a7 89 f6 6e f7 a1 79 0a b9 5c a9 ee
> 74 68 6d 75 c8 4a 18 4b 00 be a7 83 7d d9 29 52 95 ee b4 75 4f ac 4c d6
> 44 46 2b 06 b3 d4 26 53 cc c2 1a d8 75 aa f6 f6 9b 20 ae d5 65 f1 a6 d6
> a4 21 54 cf 5c 5a 4b 4e ba 0a 41 2a 09 4f bb a7 51 c0 88 26 cb 32 cc 15
> ef 4f ab 15 4c c5 0b 41 5a 3a 42 44 d3 9f 5e 91 09 71 39 56 80 63 40 80
> 02 08 89 6a 9a 46 13 cb d1 04 4a eb 08 37 a6 fb b8 4f 0f c5 04 4a 30 0e
> 45 2a e3 4d ad b5 f7 2d f9 16 8b c5 5d 6d 25 4d 61 4b 76 ef 06 86 96 8d
> 49 43 ab 74 ba 56 09 48 92 40 4c 86 71 24 b6 e2 ca b3 7d 42 a1 d2 1a 1c
> d6 82 23 6a d9 e4 bd 9d b6 76 bd d5 cb 45 ad fb 85 65 6d 39 42 ab 5f a9
> 69 1e 19 b4 ba 84 ad 21 2e 36 94 cd 7e f6 51 23 e4 90 dc 51 b1 52 a1 f5
> 3d bc dc 0e 68 06 19 14 45 23 51 13 00 7a fd 18 e3 15 e2 b7 8b 44 11 28
> 46 a9 80 31 12 20 f4 9c 04 94 83 40 c8 92 90 24 41 12 ea 65 04 4a e8 00
> a7 5b 47 8a a8 6a b6 75 56 f5 dd 77 07 2d 3b 69 84 a9 6c a6 95 b4 ae a5
> f4 a5 5d de a9 ac 29 28 d6 b9 25 03 49 2a cf 01 29 cf 2a 5b 9c 60 b0 ab
> fd 4d b2 1d 85 ad 05 c2 f8 dc b2 db b6 16 c0 ba d9 af 97 6a 1a 8b e5 20
> b4 5a ea ee 0c 5b 6e b4 ac d3 be ff 00 70 8d 69 74 38 90 50 eb 22 52 50
> 40 49 9a 93 88 eb 33 a9 88 30 8a cf 97 eb a6 11 2c 02 db 6c c9 df c9 6a
> af 34 04 e6 31 94 c0 22 71 46 25 7a fc 2d 39 12 f7 49 38 65 99 eb f2 c3
> c4 50 58 d3 78 09 69 4c 89 90 9e 32 10 62 3c a8 32 da 4d a0 29 67 17 ed
> 4d af ba 37 18 b2 5d ab 2a e9 2a ab 24 9b 6f 83 6d a2 da 8b 4d bc f3 bd
> ea 9c 4a 8a 7d c4 0d 32 19 ce 24 60 c5 64 56 67 a8 ce 75 3b 43 da d6 9c
> 87 e0 bb f5 7c 63 b2 93 c9 ed ec 5a 4a ab d5 55 43 28 52 ae 4f 34 c5 3a
> 92 d1 71 0c 38 d2 b5 e9 09 0d 25 0e 28 b8 48 9c e4 04 e2 63 26 00 18 de
> b2 99 ea ee 73 5c ec 2d 8b 61 90 c3 a3 a6 d1 67 24 54 6b 7b 59 b8 fa dc
> f3 b4 bb 5e be be f0 f5 2b dd d5 65 6a da 6c d0 a6 49 24 84 3c da 52 14
> ad 52 1d 9d 90 e6 c9 7b 5b 88 95 2f a7 7a 9b 67 cd c0 5a d1 1e 41 94 28
> c9 50 19 65 2e 9e d8 a7 15 e9 c3 42 1a 40 90 e8 7a f4 94 30 e2 12 2c 69
> bc 04 56 90 72 cb af 66 38 c2 89 40 68 e4 4d d2 99 e5 81 fb fe 88 22 ba
> c2 10 29 44 86 19 e5 86 7d 21 e2 3c ab 91 2d bc 81 00 10 44 b3 90 eb db
> 0b 11 be 29 e0 6c 21 01 04 a4 08 00 1c 4e 1e a9 99 08 22 53 c2 39 14 bf
> 8c f6 7e d8 dc f7 63 6a b9 56 56 d3 5c 1f 27 c2 0a 54 34 59 29 6d b5 2d
> 5a d4 b0 a2 14 74 99 44 ac 6e 28 da b1 fd 46 ad f4 f0 2d 6b 48 2b ae be
> 3b d8 83 90 53 b5 3c 75 de 53 14 cb ab ee 59 d0 6b 1c 09 71 28 0b d3 22
> 80 da bd ec 27 38 98 c9 22 06 37 ac c6 7a c3 de c7 41 ad 8b 79 8c 20 b9
> bc 9d b2 b6 ce d6 ac 16 cb 6b d7 1a bb 8b 5d db 95 6e 54 34 df 85 6d a7
> 52 ad 3f 9c 42 53 ef 95 01 81 e9 1c ba 49 0d 72 9f 8f ed
> [10558.734622] TKIP decrypt: iv16=0808 iv32=00000000
> [10560.232164] wlan0: detected beacon loss from AP - sending probe request
> [10560.645078] ath: Failed to stop TX DMA, queues=0x005!
> [10560.662741] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
> [10560.662750] ath: Could not stop RX, we could be confusing the DMA
> engine when we start RX up
> [10560.662754] ------------[ cut here ]------------
> [10560.662770] WARNING: at drivers/net/wireless/ath/ath9k/recv.c:537
> ath_stoprecv+0xed/0x100 [ath9k]()
> [10560.662775] Hardware name: 0914
> [10560.662778] Modules linked in: dm_crypt ipt_REJECT ipt_LOG joydev
> xt_limit xt_tcpudp xt_addrtype xt_state arc4 ath9k mac80211
> snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel snd_hda_codec
> snd_hwdep ath9k_common ath9k_hw nf_conntrack_netbios_ns
> nf_conntrack_broadcast snd_pcm snd_seq_midi snd_rawmidi nf_nat_ftp
> nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 snd_seq_midi_event
> nf_conntrack_ftp psmouse uvcvideo lp snd_seq videobuf2_core nf_conntrack
> ath videodev serio_raw ideapad_laptop sparse_keymap parport intel_ips
> cfg80211 videobuf2_vmalloc videobuf2_memops snd_timer snd_seq_device snd
> iptable_filter ip_tables x_tables soundcore snd_page_alloc usb_storage
> uas i915 drm_kms_helper drm intel_agp r8169 i2c_algo_bit intel_gtt
> agpgart video
> [10560.662895] Pid: 10046, comm: kworker/u:20 Not tainted
> 3.3.0-rc7-00103-g0c4d067 #1
> [10560.662899] Call Trace:
> [10560.662914] [<c1047572>] warn_slowpath_common+0x72/0xa0
> [10560.662925] [<f8c08bdd>] ? ath_stoprecv+0xed/0x100 [ath9k]
> [10560.662935] [<f8c08bdd>] ? ath_stoprecv+0xed/0x100 [ath9k]
> [10560.662942] [<c10475c2>] warn_slowpath_null+0x22/0x30
> [10560.662952] [<f8c08bdd>] ath_stoprecv+0xed/0x100 [ath9k]
> [10560.662965] [<f8c0452d>] ath_prepare_reset+0x5d/0xd0 [ath9k]
> [10560.662975] [<f8c05adc>] ath_reset_internal+0x7c/0x160 [ath9k]
> [10560.662984] [<c10a8cab>] ? trace_hardirqs_on+0xb/0x10
> [10560.662994] [<f8c05be5>] ath_reset+0x25/0xb0 [ath9k]
> [10560.663000] [<c10a8c14>] ? trace_hardirqs_on_caller+0xf4/0x180
> [10560.663011] [<f8c07527>] ath_reset_work+0x17/0x20 [ath9k]
> [10560.663020] [<c10669d8>] process_one_work+0x168/0x5c0
> [10560.663027] [<c1066964>] ? process_one_work+0xf4/0x5c0
> [10560.663035] [<c12cc55b>] ? do_raw_spin_lock+0x3b/0xf0
> [10560.663045] [<f8c07510>] ? ath_isr+0x270/0x270 [ath9k]
> [10560.663054] [<c1067141>] worker_thread+0x121/0x2c0
> [10560.663061] [<c1067020>] ? rescuer_thread+0x1c0/0x1c0
> [10560.663067] [<c106be5d>] kthread+0x7d/0x90
> [10560.663073] [<c106bde0>] ? __init_kthread_worker+0x60/0x60
> [10560.663083] [<c15a1102>] kernel_thread_helper+0x6/0x10
> [10560.663087] ---[ end trace 7fd01c4e029fbbc7 ]---
> [10560.670137] ieee80211 phy0: wlan0: No ack for nullfunc frame to AP
> 00:1e:2a:62:6b:1e, try 1/2
> [10561.166344] ieee80211 phy0: wlan0: Failed to send nullfunc to AP
> 00:1e:2a:62:6b:1e after 500ms, disconnecting.
> [10561.168266] wlan0: moving STA 00:1e:2a:62:6b:1e to state 2
> [10561.168272] wlan0: moving STA 00:1e:2a:62:6b:1e to state 1
> [10561.168277] wlan0: moving STA 00:1e:2a:62:6b:1e to state 0
> [10561.174026] ieee80211 phy0: Removed STA 00:1e:2a:62:6b:1e
> [10561.174390] ieee80211 phy0: Destroyed STA 00:1e:2a:62:6b:1e
> [10561.175170] ieee80211 phy0: device now idle
> [10561.175231] cfg80211: All devices are disconnected, going to restore
> regulatory settings
> [10561.175271] cfg80211: Restoring regulatory settings
> [10561.175292] cfg80211: Calling CRDA to update world regulatory domain
> [10561.185363] cfg80211: Ignoring regulatory request Set by core since
> the driver uses its own custom regulatory domain
> [10561.185415] cfg80211: World regulatory domain updated:
> [10561.185419] cfg80211: (start_freq - end_freq @ bandwidth),
> (max_antenna_gain, max_eirp)
> [10561.185424] cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (300
> mBi, 2000 mBm)
> [10561.185429] cfg80211: (2457000 KHz - 2482000 KHz @ 20000 KHz), (300
> mBi, 2000 mBm)
> [10561.185434] cfg80211: (2474000 KHz - 2494000 KHz @ 20000 KHz), (300
> mBi, 2000 mBm)
> [10561.185439] cfg80211: (5170000 KHz - 5250000 KHz @ 40000 KHz), (300
> mBi, 2000 mBm)
> [10561.185444] cfg80211: (5735000 KHz - 5835000 KHz @ 40000 KHz), (300
> mBi, 2000 mBm)
> [10561.578320] ieee80211 phy0: device no longer idle - scanning
> [10561.603489] ath: Failed to stop TX DMA, queues=0x005!
> [10561.621059] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
> [10561.621067] ath: Could not stop RX, we could be confusing the DMA
> engine when we start RX up
> [10561.694090] ath: Failed to stop TX DMA, queues=0x001!
> [10561.711600] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
> [10561.711608] ath: Could not stop RX, we could be confusing the DMA
> engine when we start RX up
> [10561.785820] ath: Failed to stop TX DMA, queues=0x001!
> [10561.803374] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
> [10561.803382] ath: Could not stop RX, we could be confusing the DMA
> engine when we start RX up
> [10561.877450] ath: Failed to stop TX DMA, queues=0x001!
> [10561.894814] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
> [10561.894821] ath: Could not stop RX, we could be confusing the DMA
> engine when we start RX up
> [10561.969359] ath: Failed to stop TX DMA, queues=0x001!
> [10561.986773] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
> [10561.986777] ath: Could not stop RX, we could be confusing the DMA
> engine when we start RX up
> [10562.061391] ath: Failed to stop TX DMA, queues=0x001!
> [10562.078921] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
> [10562.078928] ath: Could not stop RX, we could be confusing the DMA
> engine when we start RX up
> [10562.153191] ath: Failed to stop TX DMA, queues=0x001!
> [10562.170828] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
> [10562.170835] ath: Could not stop RX, we could be confusing the DMA
> engine when we start RX up
> [10562.244937] ath: Failed to stop TX DMA, queues=0x001!
> [10562.262556] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
> [10562.262563] ath: Could not stop RX, we could be confusing the DMA
> engine when we start RX up
> [10562.336708] ath: Failed to stop TX DMA, queues=0x001!
> [10562.354225] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
> [10562.354232] ath: Could not stop RX, we could be confusing the DMA
> engine when we start RX up
> [10562.428559] ath: Failed to stop TX DMA, queues=0x001!
> [10562.446067] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
> [10562.446075] ath: Could not stop RX, we could be confusing the DMA
> engine when we start RX up
> [10562.520388] ath: Failed to stop TX DMA, queues=0x001!
> [10562.537981] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
> [10562.537987] ath: Could not stop RX, we could be confusing the DMA
> engine when we start RX up
> [10562.612159] ath: Failed to stop TX DMA, queues=0x001!
>
>
>
> hope this helps.
>
> Justin P. Mattock
>

seems I got this when waking up from suspend, was trying to connect with
NM then all of a sudden Xorg shut off and this appeared on the
screen(image links below I could not save dmesg only pic)
http://flic.kr/p/bEhf2K
http://flic.kr/p/bEhf2x
http://flic.kr/p/bEhf2n
http://flic.kr/p/bEhf26

I see ath9k in the image, but am unsure if its related.
kernel is: 3.3.0-rc7-next-20120316

Justin P. Mattock




2012-03-20 06:32:42

by Mohammed Shafi

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On Tue, Mar 20, 2012 at 11:57 AM, Justin P. Mattock
<[email protected]> wrote:
> On 03/19/2012 11:19 PM, Mohammed Shafi wrote:
>>>
>>> (gdb) list *ath9k_tx+start+0x284
>>> No symbol "start" in current context.
>>
>>
>> hey, pls try l *(ath9k_tx_start+0x284), yesterday i did the same in
>> the latest wireless testing tree it pointed to line 1929 of xmit.c
>>
>> sc->tx.seq_no +=0x10;
>>
>> also did few suspend/resume in my machine, nothing interesting happened
>>
>>>
>>> before receiving the email to do the above code with gdb I had cleaned
>>> the
>>> tree out with git clean -fx so rebuilding might have misalighned
>>> things(but
>>> could be wrong)..
>>>
>>
>>
>>
>
>
> seems gdb is not working with that code but works if I remove _start.
> (maybe -next took this out or something)
>
> (gdb) l *(ath9k_tx_start+0x284)
> No symbol "ath9k_tx_start" in current context.

sorry its
l *(ath_tx_start+0x284)

> (gdb) l *(ath9k_tx+0x284)
> 0x4854 is in ath_paprd_activate (drivers/net/wireless/ath/ath9k/main.c:395).
> 390 ? ? ? ? ? ? ? ? ? ? ? ? ? ? continue;
> 391
> 392 ? ? ? ? ? ? ? ? ? ? ar9003_paprd_populate_single_table(ah, caldata,
> chain);
> 393 ? ? ? ? ? ? }
> 394
> 395 ? ? ? ? ? ? ar9003_paprd_enable(ah, true);
> 396 ? ? ? ? ? ? ath9k_ps_restore(sc);
> 397 ? ? }
> 398
> 399 ? ? static bool ath_paprd_send_frame(struct ath_softc *sc, struct
> sk_buff *skb, int chain)
>
> right now I am running next, will wait for the RX DMA to get a clean trace,
> then will look to see.. as for the other issue I just rebooted and hit this
> during bootup.
>
> Justin P. Mattock



--
thanks,
shafi

2012-03-20 06:27:44

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/19/2012 11:19 PM, Mohammed Shafi wrote:
>> (gdb) list *ath9k_tx+start+0x284
>> No symbol "start" in current context.
>
> hey, pls try l *(ath9k_tx_start+0x284), yesterday i did the same in
> the latest wireless testing tree it pointed to line 1929 of xmit.c
>
> sc->tx.seq_no +=0x10;
>
> also did few suspend/resume in my machine, nothing interesting happened
>
>>
>> before receiving the email to do the above code with gdb I had cleaned the
>> tree out with git clean -fx so rebuilding might have misalighned things(but
>> could be wrong)..
>>
>
>
>


seems gdb is not working with that code but works if I remove _start.
(maybe -next took this out or something)

(gdb) l *(ath9k_tx_start+0x284)
No symbol "ath9k_tx_start" in current context.
(gdb) l *(ath9k_tx+0x284)
0x4854 is in ath_paprd_activate (drivers/net/wireless/ath/ath9k/main.c:395).
390 continue;
391
392 ar9003_paprd_populate_single_table(ah, caldata, chain);
393 }
394
395 ar9003_paprd_enable(ah, true);
396 ath9k_ps_restore(sc);
397 }
398
399 static bool ath_paprd_send_frame(struct ath_softc *sc, struct
sk_buff *skb, int chain)

right now I am running next, will wait for the RX DMA to get a clean
trace, then will look to see.. as for the other issue I just rebooted
and hit this during bootup.

Justin P. Mattock

2012-03-20 15:23:29

by Sujith Manoharan

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

Justin P. Mattock wrote:
> both mohammed and sujith Thank you for the patches for this issue!
> I really appreciate this..
>
> I went and applied the debug printk and added ath_node_attach as well to
> my linux-next clone, will run for a few days to see if I get anything..
> full dmesg is here..: http://fpaste.org/jHSX/

Thanks for testing and please report if the issue is fixed.

Sujith

2012-03-15 13:57:49

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040


>> ok! I have set that on the machine.. will see if I get this message in
>> the upcoming weeks.
> Also, please try this patch with powersave enabled:
> http://nbd.name/ps-fix.patch
>
> - Felix
>

I did have ps-fix patch running, but removed it in the kernel due to not
being able to connect once I wakeup from suspend. anyway I have enable
debugging for the wireless(hopefully) and was able to trigger this after
a fresh build of the kernel last night(git clean -fx and all)

the message I have in dmesg is:

[ 3348.274932] ath: Failed to stop TX DMA, queues=0x004!
[ 3348.292637] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
[ 3348.292643] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[ 3348.292648] ------------[ cut here ]------------
[ 3348.292662] WARNING: at drivers/net/wireless/ath/ath9k/recv.c:537
ath_stoprecv+0xed/0x100 [ath9k]()
[ 3348.292666] Hardware name: 0914
[ 3348.292669] Modules linked in: dm_crypt ipt_REJECT ipt_LOG xt_limit
xt_tcpudp xt_addrtype joydev xt_state arc4 snd_hda_codec_hdmi
snd_hda_codec_conexant ath9k snd_hda_intel snd_hda_codec snd_hwdep
mac80211 snd_pcm snd_seq_midi nf_conntrack_netbios_ns
nf_conntrack_broadcast nf_nat_ftp nf_nat snd_rawmidi ath9k_common
nf_conntrack_ipv4 uvcvideo nf_defrag_ipv4 ath9k_hw snd_seq_midi_event
nf_conntrack_ftp nf_conntrack snd_seq snd_timer snd_seq_device ath
videobuf2_core videodev cfg80211 psmouse iptable_filter ip_tables
videobuf2_vmalloc videobuf2_memops name_laptop intel_ips sparse_keymap
serio_raw snd lp parport x_tables soundcore snd_page_alloc i915
drm_kms_helper drm i2c_algo_bit intel_agp intel_gtt usb_storage agpgart
video r8169 uas
[ 3348.292786] Pid: 5972, comm: kworker/u:24 Not tainted
3.3.0-rc7-00048-g762ad8a #1
[ 3348.292790] Call Trace:
[ 3348.292804] [<c1046f32>] warn_slowpath_common+0x72/0xa0
[ 3348.292816] [<f8f528bd>] ? ath_stoprecv+0xed/0x100 [ath9k]
[ 3348.292826] [<f8f528bd>] ? ath_stoprecv+0xed/0x100 [ath9k]
[ 3348.292832] [<c1046f82>] warn_slowpath_null+0x22/0x30
[ 3348.292842] [<f8f528bd>] ath_stoprecv+0xed/0x100 [ath9k]
[ 3348.292855] [<f8f4e35d>] ath_prepare_reset+0x5d/0xd0 [ath9k]
[ 3348.292865] [<f8f4f92c>] ath_reset_internal+0x7c/0x160 [ath9k]
[ 3348.292875] [<c10a862b>] ? trace_hardirqs_on+0xb/0x10
[ 3348.292885] [<f8f4fa35>] ath_reset+0x25/0xb0 [ath9k]
[ 3348.292891] [<c10a8594>] ? trace_hardirqs_on_caller+0xf4/0x180
[ 3348.292901] [<f8f51377>] ath_reset_work+0x17/0x20 [ath9k]
[ 3348.292909] [<c10663d6>] process_one_work+0x166/0x5b0
[ 3348.292916] [<c1066362>] ? process_one_work+0xf2/0x5b0
[ 3348.292924] [<c12ca72b>] ? do_raw_spin_lock+0x3b/0xf0
[ 3348.292935] [<f8f51360>] ? ath_isr+0x270/0x270 [ath9k]
[ 3348.292944] [<c1066b34>] worker_thread+0x124/0x2c0
[ 3348.292950] [<c1066a10>] ? rescuer_thread+0x1c0/0x1c0
[ 3348.292957] [<c106b77d>] kthread+0x7d/0x90
[ 3348.292963] [<c106b700>] ? __init_kthread_worker+0x60/0x60
[ 3348.292972] [<c158ed42>] kernel_thread_helper+0x6/0x10
[ 3348.292976] ---[ end trace ced0f48896d487b2 ]---
[ 3349.735411] ath: Failed to stop TX DMA, queues=0x005!
[ 3349.753028] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
[ 3349.753034] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up


full dmesg is here;
http://fpaste.org/YhqT/

hope this helps with finding the bug!

Justin P. Mattock

2012-03-20 05:54:10

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/19/2012 08:33 AM, Felix Fietkau wrote:
> On 2012-03-19 4:25 PM, Justin P. Mattock wrote:
>> On 03/19/2012 08:20 AM, Mohammed Shafi wrote:
>>> Hi Justin,
>>>
>>>> seems I got this when waking up from suspend, was trying to connect with NM
>>>> then all of a sudden Xorg shut off and this appeared on the screen(image
>>>> links below I could not save dmesg only pic)
>>>> http://flic.kr/p/bEhf2K
>>>> http://flic.kr/p/bEhf2x
>>>> http://flic.kr/p/bEhf2n
>>>> http://flic.kr/p/bEhf26
>>>>
>>>> I see ath9k in the image, but am unsure if its related.
>>>> kernel is: 3.3.0-rc7-next-20120316
>>>
>>> this looks like some new issue in ath9k, lets try to recreate it here
>>>
>>>>
>>>> Justin P. Mattock
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>> would be nice to recreate, then bisecting will be more accurate to do..
>> I am trying userspace tweaks here and there but no luck(hostapd etc..)
>> keep in mind this is for the RX DMA, the above just occured..
>> as for writing a small app I still need to try, but my C skills only go
>> so far(anything to trigger!).
> Was your kernel compiled with debug info? If so, please run gdb on the
> ath9k.ko module and enter this:
> l *ath9k_tx+start+0x284
> Then send me the output of that.
>
> - Felix
>

I changed the Cc''s so hopefully its correct..
anyway I tried the above but might have gotten it wrong..:

Reading symbols from
/home/kernel/linux-next/drivers/net/wireless/ath/ath9k/ath9k.o...done.
(gdb) list *ath9k_tx+0xa3
0x4673 is in ath9k_tx (drivers/net/wireless/ath/ath9k/main.c:1136).
1131 memset(&txctl, 0, sizeof(struct ath_tx_control));
1132 txctl.txq = sc->tx.txq_map[skb_get_queue_mapping(skb)];
1133
1134 ath_dbg(common, XMIT, "transmitting packet, skb: %p\n", skb);
1135
1136 if (ath_tx_start(hw, skb, &txctl) != 0) {
1137 ath_dbg(common, XMIT, "TX failed\n");
1138 goto exit;
1139 }
1140
(gdb) list *ath9k_tx+start+0x284
No symbol "start" in current context.

before receiving the email to do the above code with gdb I had cleaned
the tree out with git clean -fx so rebuilding might have misalighned
things(but could be wrong)..


2012-03-07 15:24:16

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/07/2012 07:19 AM, Mohammed Shafi wrote:
> On Wed, Mar 7, 2012 at 7:55 PM, Justin P. Mattock
> <[email protected]> wrote:
>> ath9k craps out from time to time(more annoying if anything)
>>
>> [44309.840830] ath: Failed to stop TX DMA, queues=0x001!
>> [44309.858246] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>> [44309.858255] ath: Could not stop RX, we could be confusing the DMA engine
>> when we start RX up
>> [44309.932801] ath: Failed to stop TX DMA, queues=0x001!
>> [44309.950352] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>> [44309.950361] ath: Could not stop RX, we could be confusing the DMA engine
>> when we start RX up
>> [44310.024301] ath: Failed to stop TX DMA, queues=0x001!
>> [44310.041641] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>> [44310.041646] ath: Could not stop RX, we could be confusing the DMA engine
>> when we start RX up
>> [44310.116316] ath: Failed to stop TX DMA, queues=0x001!
>> [44310.133909] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>> [44310.133918] ath: Could not stop RX, we could be confusing the DMA engine
>> when we start RX up
>> [44310.208021] ath: Failed to stop TX DMA, queues=0x001!
>> [44310.225752] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>> [44310.225761] ath: Could not stop RX, we could be confusing the DMA engine
>> when we start RX up
>> [44310.299922] ath: Failed to stop TX DMA, queues=0x001!
>> [44310.317579] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
>> [44310.317587] ath: Could not stop RX, we could be confusing the DMA engine
>> whe
>
> this is has become a know issue but less harmful, please try to see if
> the workaround of disabling power save helps
>
> iw dev wlanX set power_save off
>

ok! I have set that on the machine.. will see if I get this message in
the upcoming weeks.

>
>
>>
>>
>> part of dmesg here:
>> http://fpaste.org/xew6/
>> I can supply full log if needed..
>>
>> Justin P. Mattock
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
>

Justin P. Mattock

2012-03-19 15:20:26

by Mohammed Shafi

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

Hi Justin,

> seems I got this when waking up from suspend, was trying to connect with NM
> then all of a sudden Xorg shut off and this appeared on the screen(image
> links below I could not save dmesg only pic)
> http://flic.kr/p/bEhf2K
> http://flic.kr/p/bEhf2x
> http://flic.kr/p/bEhf2n
> http://flic.kr/p/bEhf26
>
> I see ath9k in the image, but am unsure if its related.
> kernel is: 3.3.0-rc7-next-20120316

this looks like some new issue in ath9k, lets try to recreate it here

>
> Justin P. Mattock
>
>
>



--
thanks,
shafi

2012-03-07 17:27:38

by Felix Fietkau

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 2012-03-07 4:24 PM, Justin P. Mattock wrote:
> On 03/07/2012 07:19 AM, Mohammed Shafi wrote:
>> On Wed, Mar 7, 2012 at 7:55 PM, Justin P. Mattock
>> <[email protected]> wrote:
>>> ath9k craps out from time to time(more annoying if anything)
>>>
>>> [44309.840830] ath: Failed to stop TX DMA, queues=0x001!
>>> [44309.858246] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>> [44309.858255] ath: Could not stop RX, we could be confusing the DMA engine
>>> when we start RX up
>>> [44309.932801] ath: Failed to stop TX DMA, queues=0x001!
>>> [44309.950352] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>> [44309.950361] ath: Could not stop RX, we could be confusing the DMA engine
>>> when we start RX up
>>> [44310.024301] ath: Failed to stop TX DMA, queues=0x001!
>>> [44310.041641] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>> [44310.041646] ath: Could not stop RX, we could be confusing the DMA engine
>>> when we start RX up
>>> [44310.116316] ath: Failed to stop TX DMA, queues=0x001!
>>> [44310.133909] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>> [44310.133918] ath: Could not stop RX, we could be confusing the DMA engine
>>> when we start RX up
>>> [44310.208021] ath: Failed to stop TX DMA, queues=0x001!
>>> [44310.225752] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>> [44310.225761] ath: Could not stop RX, we could be confusing the DMA engine
>>> when we start RX up
>>> [44310.299922] ath: Failed to stop TX DMA, queues=0x001!
>>> [44310.317579] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
>>> [44310.317587] ath: Could not stop RX, we could be confusing the DMA engine
>>> whe
>>
>> this is has become a know issue but less harmful, please try to see if
>> the workaround of disabling power save helps
>>
>> iw dev wlanX set power_save off
>>
>
> ok! I have set that on the machine.. will see if I get this message in
> the upcoming weeks.
Also, please try this patch with powersave enabled:
http://nbd.name/ps-fix.patch

- Felix

2012-03-19 16:09:40

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/19/2012 08:33 AM, Felix Fietkau wrote:
> On 2012-03-19 4:25 PM, Justin P. Mattock wrote:
>> On 03/19/2012 08:20 AM, Mohammed Shafi wrote:
>>> Hi Justin,
>>>
>>>> seems I got this when waking up from suspend, was trying to connect with NM
>>>> then all of a sudden Xorg shut off and this appeared on the screen(image
>>>> links below I could not save dmesg only pic)
>>>> http://flic.kr/p/bEhf2K
>>>> http://flic.kr/p/bEhf2x
>>>> http://flic.kr/p/bEhf2n
>>>> http://flic.kr/p/bEhf26
>>>>
>>>> I see ath9k in the image, but am unsure if its related.
>>>> kernel is: 3.3.0-rc7-next-20120316
>>>
>>> this looks like some new issue in ath9k, lets try to recreate it here
>>>
>>>>
>>>> Justin P. Mattock
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>> would be nice to recreate, then bisecting will be more accurate to do..
>> I am trying userspace tweaks here and there but no luck(hostapd etc..)
>> keep in mind this is for the RX DMA, the above just occured..
>> as for writing a small app I still need to try, but my C skills only go
>> so far(anything to trigger!).
> Was your kernel compiled with debug info? If so, please run gdb on the
> ath9k.ko module and enter this:
> l *ath9k_tx+start+0x284
> Then send me the output of that.
>
> - Felix
>

sure! I will do this as soon as I get to a stable connection(am remote
now)so I can install gdb to gather the info for you.

Justin P. Mattock

2012-03-20 08:30:35

by Sujith Manoharan

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

Justin P. Mattock wrote:
> yeah this works:
>
> eading symbols from
> /home/kernel/linux-next/drivers/net/wireless/ath/ath9k/ath9k.o...done.
> (gdb) l *(ath_tx_start+0x284)
> 0xcad4 is in ath_tx_start (drivers/net/wireless/ath/ath9k/xmit.c:1878).
> 1873 ieee80211_is_data_qos(hdr->frame_control)) {
> 1874 tidno = ieee80211_get_qos_ctl(hdr)[0] &
> 1875 IEEE80211_QOS_CTL_TID_MASK;
> 1876 tid = ATH_AN_2_TID(txctl->an, tidno);
> 1877
> 1878 WARN_ON(tid->ac->txq != txctl->txq);
> 1879 }
> 1880
> 1881 if ((tx_info->flags & IEEE80211_TX_CTL_AMPDU) && tid) {
> 1882 /*
> (gdb)

Can you try this patch ?

From: Sujith Manoharan <[email protected]>
Date: Tue, 20 Mar 2012 13:51:26 +0530
Subject: [PATCH] ath9k: Use HW HT capabilites properly

The commit "ath9k: Remove aggregation flags" changed how
nodes were being initialized. Use the HW HT cap bits
to initialize/de-initialize nodes, else we would be
accessing an uninitialized entry during a suspend/resume cycle,
resulting in a panic.

Reported-by: Justin P. Mattock <[email protected]>
Signed-off-by: Sujith Manoharan <[email protected]>
---
drivers/net/wireless/ath/ath9k/main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
index 3879485..215eb25 100644
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -640,7 +640,7 @@ static void ath_node_attach(struct ath_softc *sc, struct ieee80211_sta *sta,
an->sta = sta;
an->vif = vif;

- if (sta->ht_cap.ht_supported) {
+ if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_HT) {
ath_tx_node_init(sc, an);
an->maxampdu = 1 << (IEEE80211_HT_MAX_AMPDU_FACTOR +
sta->ht_cap.ampdu_factor);
@@ -659,7 +659,7 @@ static void ath_node_detach(struct ath_softc *sc, struct ieee80211_sta *sta)
an->sta = NULL;
#endif

- if (sta->ht_cap.ht_supported)
+ if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_HT)
ath_tx_node_cleanup(sc, an);
}

--
1.7.9.4


2012-03-19 15:25:44

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/19/2012 08:20 AM, Mohammed Shafi wrote:
> Hi Justin,
>
>> seems I got this when waking up from suspend, was trying to connect with NM
>> then all of a sudden Xorg shut off and this appeared on the screen(image
>> links below I could not save dmesg only pic)
>> http://flic.kr/p/bEhf2K
>> http://flic.kr/p/bEhf2x
>> http://flic.kr/p/bEhf2n
>> http://flic.kr/p/bEhf26
>>
>> I see ath9k in the image, but am unsure if its related.
>> kernel is: 3.3.0-rc7-next-20120316
>
> this looks like some new issue in ath9k, lets try to recreate it here
>
>>
>> Justin P. Mattock
>>
>>
>>
>
>
>

would be nice to recreate, then bisecting will be more accurate to do..
I am trying userspace tweaks here and there but no luck(hostapd etc..)
keep in mind this is for the RX DMA, the above just occured..
as for writing a small app I still need to try, but my C skills only go
so far(anything to trigger!).

Justin P. mattock

2012-03-25 15:05:09

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/20/2012 12:44 AM, Mohammed Shafi wrote:
>> /home/kernel/linux-next/drivers/net/wireless/ath/ath9k/ath9k.o...done.
>> (gdb) l *(ath_tx_start+0x284)
>> 0xcad4 is in ath_tx_start (drivers/net/wireless/ath/ath9k/xmit.c:1878).
>> 1873 ieee80211_is_data_qos(hdr->frame_control)) {
>> 1874 tidno = ieee80211_get_qos_ctl(hdr)[0]&
>> 1875 IEEE80211_QOS_CTL_TID_MASK;
>> 1876 tid = ATH_AN_2_TID(txctl->an, tidno);
>> 1877
>> 1878 WARN_ON(tid->ac->txq != txctl->txq);
>> 1879 }
>> 1880
>> 1881 if ((tx_info->flags& IEEE80211_TX_CTL_AMPDU)&& tid) {
>> 1882 /*
>> (gdb
>
> please check with the attached debug patch if tid is 'NULL'
>
>
> its my theory that tid is not initialized in ath_tx_node_init just after resume.
> during suspend ath9k_sta_remove may be called
> ieee80211_suspend -> drv_sta_state ->
> drv_sta_remove->ath9k_sta_remove->ath_node_detach->ath_tx_node_cleanup
>
> during resume
>
> ieee80211_resume ->
> ieee80211_reconfig->drv_sta_state->ath9k_sta_add->ath_node_attach->ath_tx_node_init
>
> now sta.ht_cap.ht_supported should be true. it will be set to true in
> assoc_sucess path and then sta_add is called.
>
> let me see if i had missed something or my analysis itself is wrong.
>
>>
>
>
>


after a few days of running these patches I did not hit anything, then
on a reboot I hit this. positive side though is I didnt loose internet
connectivity altogether i.g. my music streaming stayed connected but
internet surfing lost connection for a few seconds(30-40) then came back.

this is in dmesg..:

[ 18.801995] ieee80211 phy0: Selected rate control algorithm
'ath9k_rate_control'
[ 18.803477] Registered led device: ath9k-phy0
[ 18.803489] ieee80211 phy0: Atheros AR9285 Rev:2 mem=0xf8fa0000, irq=17
[ 18.863699] input: ETPS/2 Elantech Touchpad as
/devices/platform/i8042/serio1/input/input8
[ 18.905167] HDMI status: Codec=3 Pin=5 Presence_Detect=0 ELD_Valid=0
[ 18.905338] input: HDA Intel HDMI/DP,pcm=3 as
/devices/pci0000:00/0000:00:1b.0/sound/card0/input9
[ 22.178072] init: avahi-daemon main process (794) terminated with
status 2
[ 22.178116] init: avahi-daemon main process ended, respawning
[ 22.515462] init: ufw pre-start process (503) terminated with status 1
[ 22.523987] init: cups pre-start process (802) terminated with status 1
[ 23.122716] init: Failed to spawn smbd main process: unable to
execute: No such file or directory
[ 23.146521] init: failsafe main process (907) killed by TERM signal
[ 25.011529] ieee80211 phy0: device now idle
[ 25.046220] r8169 0000:06:00.0: eth0: bmcr: 1000 bmsr: 7849 gbcr:
0000 gbsr: 0000
[ 25.046223] r8169 0000:06:00.0: eth0: link down
[ 25.686361] ieee80211 phy0: device no longer idle - scanning
[ 27.311270] ieee80211 phy0: device now idle
[ 27.336454] ieee80211 phy0: device no longer idle - scanning
[ 28.206726] wlan0: authenticate with 00:1e:2a:62:6b:1e
[ 28.206800] ieee80211 phy0: Allocated STA 00:1e:2a:62:6b:1e
[ 28.220206] ieee80211 phy0: Inserted STA 00:1e:2a:62:6b:1e
[ 28.220211] wlan0: send auth to 00:1e:2a:62:6b:1e (try 1/3)
[ 28.222300] wlan0: authenticated
[ 28.222317] wlan0: moving STA 00:1e:2a:62:6b:1e to state 2
[ 28.229855] wlan0: associate with 00:1e:2a:62:6b:1e (try 1/3)
[ 28.232472] wlan0: RX AssocResp from 00:1e:2a:62:6b:1e (capab=0x31
status=0 aid=5)
[ 28.232479] wlan0: associated
[ 28.232495] wlan0: moving STA 00:1e:2a:62:6b:1e to state 3
[ 28.245788] wlan0: moving STA 00:1e:2a:62:6b:1e to state 4
[ 31.910960] init: nmbd pre-start process (1192) terminated with
status 127
[ 32.110327] EXT4-fs (sda9): re-mounted. Opts:
errors=remount-ro,commit=600
[ 35.706391] init: plymouth-stop pre-start process (1295) terminated
with status 1
[ 38.315982] EXT4-fs (sda9): re-mounted. Opts: errors=remount-ro,commit=0
[ 48.568651] EXT4-fs (sda9): re-mounted. Opts:
errors=remount-ro,commit=600
[ 941.868195] wlan0: detected beacon loss from AP - sending probe request
[ 942.040883] ath: phy0: Failed to stop TX DMA, queues=0x005!
[ 942.058737] ath: phy0: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
[ 942.058749] ath: phy0: Could not stop RX, we could be confusing the
DMA engine when we start RX up
[ 942.058757] ------------[ cut here ]------------
[ 942.058778] WARNING: at drivers/net/wireless/ath/ath9k/recv.c:531
ath_stoprecv+0xfa/0x110 [ath9k]()
[ 942.058785] Hardware name: 0914
[ 942.058790] Modules linked in: dm_crypt xt_limit xt_tcpudp
xt_addrtype joydev xt_state snd_hda_codec_hdmi snd_hda_codec_conexant
arc4 ath9k snd_hda_intel mac80211 nf_conntrack_netbios_ns
nf_conntrack_broadcast snd_hda_codec snd_hwdep nf_nat_ftp nf_nat snd_pcm
nf_conntrack_ipv4 nf_defrag_ipv4 snd_seq_midi nf_conntrack_ftp
nf_conntrack snd_rawmidi ath9k_common snd_seq_midi_event snd_seq
snd_timer iptable_filter uvcvideo ath9k_hw ath lp parport cfg80211
snd_seq_device snd ip_tables ideapad_laptop videobuf2_core soundcore
snd_page_alloc x_tables psmouse sparse_keymap serio_raw videodev
videobuf2_vmalloc videobuf2_memops intel_ips i915 drm_kms_helper drm
i2c_algo_bit intel_agp intel_gtt r8169 agpgart video
[ 942.058960] Pid: 65, comm: kworker/u:3 Not tainted
3.3.0-next-20120322-00001-gba6003c #3
[ 942.058967] Call Trace:
[ 942.058987] [<c10492c2>] warn_slowpath_common+0x72/0xa0
[ 942.059003] [<f8b0897a>] ? ath_stoprecv+0xfa/0x110 [ath9k]
[ 942.059019] [<f8b0897a>] ? ath_stoprecv+0xfa/0x110 [ath9k]
[ 942.059030] [<c1049312>] warn_slowpath_null+0x22/0x30
[ 942.059045] [<f8b0897a>] ath_stoprecv+0xfa/0x110 [ath9k]
[ 942.059066] [<f8b044cd>] ath_prepare_reset+0x5d/0xd0 [ath9k]
[ 942.059082] [<f8b05adc>] ath_reset_internal+0x6c/0x170 [ath9k]
[ 942.059095] [<c10abcab>] ? trace_hardirqs_on+0xb/0x10
[ 942.059110] [<f8b05c05>] ath_reset+0x25/0xb0 [ath9k]
[ 942.059148] [<c10abc14>] ? trace_hardirqs_on_caller+0xf4/0x180
[ 942.059164] [<f8b07407>] ath_reset_work+0x17/0x20 [ath9k]
[ 942.059176] [<c1068706>] process_one_work+0x166/0x5c0
[ 942.059185] [<c1068692>] ? process_one_work+0xf2/0x5c0
[ 942.059199] [<c12d546b>] ? do_raw_spin_lock+0x3b/0xf0
[ 942.059219] [<f8b073f0>] ? ath_isr+0x270/0x270 [ath9k]
[ 942.059235] [<c1068e74>] worker_thread+0x124/0x2c0
[ 942.059245] [<c1068d50>] ? rescuer_thread+0x1c0/0x1c0
[ 942.059257] [<c106dbed>] kthread+0x7d/0x90
[ 942.059267] [<c106db70>] ? __init_kthread_worker+0x60/0x60
[ 942.059283] [<c15a0082>] kernel_thread_helper+0x6/0x10
[ 942.059286] ---[ end trace 6df3dcfe38b1fdb8 ]---
[ 942.066296] ieee80211 phy0: wlan0: No ack for nullfunc frame to AP
00:1e:2a:62:6b:1e, try 1/2
[ 942.561934] ieee80211 phy0: wlan0: Failed to send nullfunc to AP
00:1e:2a:62:6b:1e after 500ms, disconnecting.
[ 942.566506] wlan0: moving STA 00:1e:2a:62:6b:1e to state 3
[ 942.566514] wlan0: moving STA 00:1e:2a:62:6b:1e to state 2
[ 942.566556] wlan0: moving STA 00:1e:2a:62:6b:1e to state 1
[ 942.569979] ieee80211 phy0: Removed STA 00:1e:2a:62:6b:1e
[ 942.570946] ieee80211 phy0: Destroyed STA 00:1e:2a:62:6b:1e
[ 942.574474] ieee80211 phy0: device now idle
[ 942.574743] cfg80211: All devices are disconnected, going to restore
regulatory settings
[ 942.574757] cfg80211: Restoring regulatory settings
[ 942.574983] cfg80211: Calling CRDA to update world regulatory domain
[ 942.585392] cfg80211: Ignoring regulatory request Set by core since
the driver uses its own custom regulatory domain
[ 942.585519] cfg80211: World regulatory domain updated:
[ 942.585524] cfg80211: (start_freq - end_freq @ bandwidth),
(max_antenna_gain, max_eirp)
[ 942.585530] cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (300
mBi, 2000 mBm)
[ 942.585536] cfg80211: (2457000 KHz - 2482000 KHz @ 20000 KHz), (300
mBi, 2000 mBm)
[ 942.585541] cfg80211: (2474000 KHz - 2494000 KHz @ 20000 KHz), (300
mBi, 2000 mBm)
[ 942.585546] cfg80211: (5170000 KHz - 5250000 KHz @ 40000 KHz), (300
mBi, 2000 mBm)
[ 942.585552] cfg80211: (5735000 KHz - 5835000 KHz @ 40000 KHz), (300
mBi, 2000 mBm)
[ 942.965150] ieee80211 phy0: device no longer idle - scanning
[ 943.839899] wlan0: authenticate with 00:1e:2a:62:6b:1e
[ 943.839961] ieee80211 phy0: Allocated STA 00:1e:2a:62:6b:1e
[ 943.853169] ieee80211 phy0: Inserted STA 00:1e:2a:62:6b:1e
[ 943.853178] wlan0: send auth to 00:1e:2a:62:6b:1e (try 1/3)
[ 943.856967] wlan0: authenticated
[ 943.856980] wlan0: moving STA 00:1e:2a:62:6b:1e to state 2
[ 943.866660] wlan0: associate with 00:1e:2a:62:6b:1e (try 1/3)
[ 943.869352] wlan0: RX AssocResp from 00:1e:2a:62:6b:1e (capab=0x31
status=0 aid=5)
[ 943.869362] wlan0: associated
[ 943.869385] wlan0: moving STA 00:1e:2a:62:6b:1e to state 3
[ 943.884712] wlan0: moving STA 00:1e:2a:62:6b:1e to state 4

Justin P. Mattock

2012-03-08 21:18:15

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/07/2012 09:27 AM, Felix Fietkau wrote:
> On 2012-03-07 4:24 PM, Justin P. Mattock wrote:
>> On 03/07/2012 07:19 AM, Mohammed Shafi wrote:
>>> On Wed, Mar 7, 2012 at 7:55 PM, Justin P. Mattock
>>> <[email protected]> wrote:
>>>> ath9k craps out from time to time(more annoying if anything)
>>>>
>>>> [44309.840830] ath: Failed to stop TX DMA, queues=0x001!
>>>> [44309.858246] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>>> [44309.858255] ath: Could not stop RX, we could be confusing the DMA engine
>>>> when we start RX up
>>>> [44309.932801] ath: Failed to stop TX DMA, queues=0x001!
>>>> [44309.950352] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>>> [44309.950361] ath: Could not stop RX, we could be confusing the DMA engine
>>>> when we start RX up
>>>> [44310.024301] ath: Failed to stop TX DMA, queues=0x001!
>>>> [44310.041641] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>>> [44310.041646] ath: Could not stop RX, we could be confusing the DMA engine
>>>> when we start RX up
>>>> [44310.116316] ath: Failed to stop TX DMA, queues=0x001!
>>>> [44310.133909] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>>> [44310.133918] ath: Could not stop RX, we could be confusing the DMA engine
>>>> when we start RX up
>>>> [44310.208021] ath: Failed to stop TX DMA, queues=0x001!
>>>> [44310.225752] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>>> [44310.225761] ath: Could not stop RX, we could be confusing the DMA engine
>>>> when we start RX up
>>>> [44310.299922] ath: Failed to stop TX DMA, queues=0x001!
>>>> [44310.317579] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
>>>> [44310.317587] ath: Could not stop RX, we could be confusing the DMA engine
>>>> whe
>>>
>>> this is has become a know issue but less harmful, please try to see if
>>> the workaround of disabling power save helps
>>>
>>> iw dev wlanX set power_save off
>>>
>>
>> ok! I have set that on the machine.. will see if I get this message in
>> the upcoming weeks.
> Also, please try this patch with powersave enabled:
> http://nbd.name/ps-fix.patch
>
> - Felix
>


probably to early to tell if this patch is working or not(since this bug
is shows up randomly), but I have applied this to the current Mainline
and am so far not seeing the DMA failed to stop
as for any issues: waking up from suspend I get no
connection(rmmod/modprobe recovers this) then after a few suspends the
connection is real crappy(on and offline).

dmesg with debug gives me this:

[ 8804.908128] ath: PDADC (0, 0): a2f0 68686868
[ 8804.908132] ath: PDADC: Chain 0 | PDADC 112 Value 104 | PDADC 113
Value 104 | PDADC 114 Value 104 | PDADC 115 Value 104 |
[ 8804.908137] ath: PDADC (0, 0): a2f4 68686868
[ 8804.908141] ath: PDADC: Chain 0 | PDADC 116 Value 104 | PDADC 117
Value 104 | PDADC 118 Value 104 | PDADC 119 Value 104 |
[ 8804.908145] ath: PDADC (0, 0): a2f8 68686868
[ 8804.908149] ath: PDADC: Chain 0 | PDADC 120 Value 104 | PDADC 121
Value 104 | PDADC 122 Value 104 | PDADC 123 Value 104 |
[ 8804.908154] ath: PDADC (0, 0): a2fc 68686868
[ 8804.908158] ath: PDADC: Chain 0 | PDADC 124 Value 104 | PDADC 125
Value 104 | PDADC 126 Value 104 | PDADC 127 Value 104 |
[ 8804.908506] ath: Getting spur idx:0 is2Ghz:1 val:8000
[ 8804.908609] ath: Enable MIB counters
[ 8804.908632] ath: ANI parameters:
[ 8804.908635] ath: noiseImmunityLevel=0, spurImmunityLevel=7,
ofdmWeakSigDetectOff=1
[ 8804.908638] ath: cckWeakSigThreshold=0, firstepLevel=0, listenTime=0
[ 8804.908641] ath: ofdmPhyErrCount=0, cckPhyErrCount=0
[ 8804.908643]
[ 8804.908646] ath: Writing ofdmbase=12582412 cckbase=12582712
[ 8804.908669] ath: New interrupt mask 0xf4041071
[ 8804.908672] ath: new IMR 0x918414b0
[ 8804.908679] ath: enable IER
[ 8804.908690] ath: AR_IMR 0x918414b0 IER 0x1
[ 8804.945518] ath: disable IER
[ 8804.945554] ath: enable IER
[ 8804.945565] ath: AR_IMR 0x918414b0 IER 0x1
[ 8805.031114] ath: Set channel: 2422 MHz type: 0
[ 8805.031243] ath: disable IER
[ 8805.031380] ath: Disable MIB counters
[ 8805.031425] ath: Reset to 2422 MHz, HT40: 0 fastcc: 0
[ 8805.031441] ath: NF calibrated [ctl] [chain 0] is -114
[ 8805.033988] ath: PDADC (0, 0): a280 04030201
[ 8805.033995] ath: PDADC: Chain 0 | PDADC 0 Value 1 | PDADC 1
Value 2 | PDADC 2 Value 3 | PDADC 3 Value 4 |
[ 8805.034002] ath: PDADC (0, 0): a284 08070605
[ 8805.034007] ath: PDADC: Chain 0 | PDADC 4 Value 5 | PDADC 5
Value 6 | PDADC 6 Value 7 | PDADC 7 Value 8 |
[ 8805.034013] ath: PDADC (0, 0): a288 0c0b0a09
[ 8805.034019] ath: PDADC: Chain 0 | PDADC 8 Value 9 | PDADC 9
Value 10 | PDADC 10 Value 11 | PDADC 11 Value 12 |
[ 8805.034025] ath: PDADC (0, 0): a28c 100f0e0d
[ 8805.034030] ath: PDADC: Chain 0 | PDADC 12 Value 13 | PDADC 13
Value 14 | PDADC 14 Value 15 | PDADC 15 Value 16 |
[ 8805.034037] ath: PDADC (0, 0): a290 18161412
[ 8805.034042] ath: PDADC: Chain 0 | PDADC 16 Value 18 | PDADC 17
Value 20 | PDADC 18 Value 22 | PDADC 19 Value 24 |
[ 8805.034048] ath: PDADC (0, 0): a294 23201d1a
[ 8805.034053] ath: PDADC: Chain 0 | PDADC 20 Value 26 | PDADC 21
Value 29 | PDADC 22 Value 32 | PDADC 23 Value 35 |
[ 8805.034059] ath: PDADC (0, 0): a298 302b2825
[ 8805.034064] ath: PDADC: Chain 0 | PDADC 24 Value 37 | PDADC 25
Value 40 | PDADC 26 Value 43 | PDADC 27 Value 48 |
[ 8805.034070] ath: PDADC (0, 0): a29c 45403b36
[ 8805.034075] ath: PDADC: Chain 0 | PDADC 28 Value 54 | PDADC 29
Value 59 | PDADC 30 Value 64 | PDADC 31 Value 69 |
[ 8805.034081] ath: PDADC (0, 0): a2a0 5a544f4a
[ 8805.034086] ath: PDADC: Chain 0 | PDADC 32 Value 74 | PDADC 33
Value 79 | PDADC 34 Value 84 | PDADC 35 Value 90 |
[ 8805.034092] ath: PDADC (0, 0): a2a4 726c6660
[ 8805.034098] ath: PDADC: Chain 0 | PDADC 36 Value 96 | PDADC 37
Value 102 | PDADC 38 Value 108 | PDADC 39 Value 114 |
[ 8805.034104] ath: PDADC (0, 0): a2a8 0f0d0b09
[ 8805.034110] ath: PDADC: Chain 0 | PDADC 40 Value 9 | PDADC 41
Value 11 | PDADC 42 Value 13 | PDADC 43 Value 15 |
[ 8805.034116] ath: PDADC (0, 0): a2ac 17151311
[ 8805.034121] ath: PDADC: Chain 0 | PDADC 44 Value 17 | PDADC 45
Value 19 | PDADC 46 Value 21 | PDADC 47 Value 23 |
[ 8805.034127] ath: PDADC (0, 0): a2b0 201d1b19
[ 8805.034133] ath: PDADC: Chain 0 | PDADC 48 Value 25 | PDADC 49
Value 27 | PDADC 50 Value 29 | PDADC 51 Value 32 |
[ 8805.034139] ath: PDADC (0, 0): a2b4 2b282522
[ 8805.034145] ath: PDADC: Chain 0 | PDADC 52 Value 34 | PDADC 53
Value 37 | PDADC 54 Value 40 | PDADC 55 Value 43 |
[ 8805.034151] ath: PDADC (0, 0): a2b8 3935312e
[ 8805.034156] ath: PDADC: Chain 0 | PDADC 56 Value 46 | PDADC 57
Value 49 | PDADC 58 Value 53 | PDADC 59 Value 57 |
[ 8805.034163] ath: PDADC (0, 0): a2bc 4b44403c
[ 8805.034168] ath: PDADC: Chain 0 | PDADC 60 Value 60 | PDADC 61
Value 64 | PDADC 62 Value 68 | PDADC 63 Value 75 |
[ 8805.034174] ath: PDADC (0, 0): a2c0 67605952
[ 8805.034179] ath: PDADC: Chain 0 | PDADC 64 Value 82 | PDADC 65
Value 89 | PDADC 66 Value 96 | PDADC 67 Value 103 |
[ 8805.034185] ath: PDADC (0, 0): a2c4 7575756e
[ 8805.034190] ath: PDADC: Chain 0 | PDADC 68 Value 110 | PDADC 69
Value 117 | PDADC 70 Value 117 | PDADC 71 Value 117 |
[ 8805.034196] ath: PDADC (0, 0): a2c8 75757575
[ 8805.034202] ath: PDADC: Chain 0 | PDADC 72 Value 117 | PDADC 73
Value 117 | PDADC 74 Value 117 | PDADC 75 Value 117 |
[ 8805.034208] ath: PDADC (0, 0): a2cc 75757575
[ 8805.034213] ath: PDADC: Chain 0 | PDADC 76 Value 117 | PDADC 77
Value 117 | PDADC 78 Value 117 | PDADC 79 Value 117 |
[ 8805.034219] ath: PDADC (0, 0): a2d0 75757575
[ 8805.034224] ath: PDADC: Chain 0 | PDADC 80 Value 117 | PDADC 81
Value 117 | PDADC 82 Value 117 | PDADC 83 Value 117 |
[ 8805.034230] ath: PDADC (0, 0): a2d4 75757575
[ 8805.034236] ath: PDADC: Chain 0 | PDADC 84 Value 117 | PDADC 85
Value 117 | PDADC 86 Value 117 | PDADC 87 Value 117 |
[ 8805.034241] ath: PDADC (0, 0): a2d8 75757575
[ 8805.034246] ath: PDADC: Chain 0 | PDADC 88 Value 117 | PDADC 89
Value 117 | PDADC 90 Value 117 | PDADC 91 Value 117 |
[ 8805.034252] ath: PDADC (0, 0): a2dc 75757575
[ 8805.034257] ath: PDADC: Chain 0 | PDADC 92 Value 117 | PDADC 93
Value 117 | PDADC 94 Value 117 | PDADC 95 Value 117 |
[ 8805.034264] ath: PDADC (0, 0): a2e0 75757575
[ 8805.034269] ath: PDADC: Chain 0 | PDADC 96 Value 117 | PDADC 97
Value 117 | PDADC 98 Value 117 | PDADC 99 Value 117 |
[ 8805.034275] ath: PDADC (0, 0): a2e4 75757575
[ 8805.034281] ath: PDADC: Chain 0 | PDADC 100 Value 117 | PDADC 101
Value 117 | PDADC 102 Value 117 | PDADC 103 Value 117 |
[ 8805.034286] ath: PDADC (0, 0): a2e8 75757575
[ 8805.034292] ath: PDADC: Chain 0 | PDADC 104 Value 117 | PDADC 105
Value 117 | PDADC 106 Value 117 | PDADC 107 Value 117 |
[ 8805.034298] ath: PDADC (0, 0): a2ec 75757575
[ 8805.034303] ath: PDADC: Chain 0 | PDADC 108 Value 117 | PDADC 109
Value 117 | PDADC 110 Value 117 | PDADC 111 Value 117 |
[ 8805.034310] ath: PDADC (0, 0): a2f0 75757575
[ 8805.034315] ath: PDADC: Chain 0 | PDADC 112 Value 117 | PDADC 113
Value 117 | PDADC 114 Value 117 | PDADC 115 Value 117 |
[ 8805.034321] ath: PDADC (0, 0): a2f4 75757575
[ 8805.034326] ath: PDADC: Chain 0 | PDADC 116 Value 117 | PDADC 117
Value 117 | PDADC 118 Value 117 | PDADC 119 Value 117 |
[ 8805.034332] ath: PDADC (0, 0): a2f8 75757575
[ 8805.034338] ath: PDADC: Chain 0 | PDADC 120 Value 117 | PDADC 121
Value 117 | PDADC 122 Value 117 | PDADC 123 Value 117 |
[ 8805.034344] ath: PDADC (0, 0): a2fc 75757575
[ 8805.034349] ath: PDADC: Chain 0 | PDADC 124 Value 117 | PDADC 125
Value 117 | PDADC 126 Value 117 | PDADC 127 Value 117 |
[ 8805.034377] ath: Getting spur idx:0 is2Ghz:1 val:8000
[ 8805.036119] ath: Reset TX queue: 0
[ 8805.036125] ath: tx ok 0x0 err 0x0 desc 0x10f eol 0x10f urn 0x0
[ 8805.036130] ath: Reset TX queue: 1
[ 8805.036135] ath: tx ok 0x0 err 0x0 desc 0x10f eol 0x10f urn 0x0
[ 8805.036140] ath: Reset TX queue: 2
[ 8805.036146] ath: tx ok 0x0 err 0x0 desc 0x10f eol 0x10f urn 0x0
[ 8805.036151] ath: Reset TX queue: 3
[ 8805.036157] ath: tx ok 0x0 err 0x0 desc 0x10f eol 0x10f urn 0x0
[ 8805.036163] ath: Reset TXQ, inactive queue: 4
[ 8805.036167] ath: Reset TXQ, inactive queue: 5
[ 8805.036171] ath: Reset TXQ, inactive queue: 6
[ 8805.036175] ath: Reset TXQ, inactive queue: 7
[ 8805.036180] ath: Reset TX queue: 8
[ 8805.036195] ath: tx ok 0x0 err 0x0 desc 0x10f eol 0x10f urn 0x0
[ 8805.036201] ath: Reset TX queue: 9
[ 8805.036216] ath: tx ok 0x0 err 0x0 desc 0x10f eol 0x10f urn 0x0
[ 8805.036230] ath: ah->misc_mode 0x4
[ 8805.037647] ath: Running PA Calibration
[ 8805.038316] ath: enabling ADC Gain Calibration
[ 8805.038320] ath: enabling ADC DC Calibration
[ 8805.038324] ath: enabling IQ Calibration
[ 8805.038333] ath: starting ADC Gain Calibration
[ 8805.038471] ath: Enable MIB counters
[ 8805.038496] ath: ANI parameters:
[ 8805.038501] ath: noiseImmunityLevel=0, spurImmunityLevel=7,
ofdmWeakSigDetectOff=1
[ 8805.038507] ath: cckWeakSigThreshold=0, firstepLevel=2, listenTime=0
[ 8805.038511] ath: ofdmPhyErrCount=0, cckPhyErrCount=0
[ 8805.038513]
[ 8805.038521] ath: ANI parameters:
[ 8805.038525] ath: noiseImmunityLevel=0, spurImmunityLevel=7,
ofdmWeakSigDetectOff=1
[ 8805.038531] ath: cckWeakSigThreshold=0, firstepLevel=2, listenTime=0
[ 8805.038536] ath: ofdmPhyErrCount=0, cckPhyErrCount=0
[ 8805.038538]
[ 8805.038543] ath: Writing ofdmbase=12582412 cckbase=12582712
[ 8805.038578] ath: New interrupt mask 0xf4041071
[ 8805.038583] ath: new IMR 0x918414b0
[ 8805.038592] ath: enable IER
[ 8805.038604] ath: AR_IMR 0x918414b0 IER 0x1
[ 8805.038695] ath: Set HW RX filter: 0x207
[ 8805.038724] ieee80211 phy5: device now idle
[ 8805.038826] ath: AWAKE -> FULL-SLEEP
[ 8805.038911] ath: FULL-SLEEP -> AWAKE
[ 8805.041787] ath: AWAKE -> FULL-SLEEP
[ 8805.041808] ieee80211 phy5: device no longer idle - working
[ 8805.041817] ath: FULL-SLEEP -> AWAKE
[ 8805.044666] wlan0: authenticate with 00:1e:2a:62:6b:1e (try 1)
[ 8805.044679] ath: transmitting packet, skb: f6b88c00
[ 8805.044688] ath: qnum: 0, txq depth: 0
[ 8805.044690] ath: TXDP[0] = 2ecd6e8c (eecd6e8c)
[ 8805.044692] ath: Enable TXE on queue: 0
[ 8805.242495] wlan0: authenticate with 00:1e:2a:62:6b:1e (try 2)
[ 8805.242511] ath: transmitting packet, skb: f71309c0
[ 8805.242521] ath: qnum: 0, txq depth: 1
[ 8805.242525] ath: link[0] (eecd6e8c)=2ecd6ef0 (eecd6ef0)
[ 8805.242528] ath: Enable TXE on queue: 0
[ 8805.441878] wlan0: authenticate with 00:1e:2a:62:6b:1e (try 3)
[ 8805.441893] ath: transmitting packet, skb: f6b88b40
[ 8805.441903] ath: qnum: 0, txq depth: 2
[ 8805.441907] ath: link[0] (eecd6ef0)=2ecd6f54 (eecd6f54)
[ 8805.441911] ath: Enable TXE on queue: 0
[ 8805.641298] wlan0: authentication with 00:1e:2a:62:6b:1e timed out
[ 8805.641445] ieee80211 phy5: device now idle


will continue running the system to see if I get anymore leads on this
issue.

Justin P. Mattock




2012-03-20 14:19:33

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/20/2012 01:29 AM, Sujith Manoharan wrote:
> Justin P. Mattock wrote:
>> yeah this works:
>>
>> eading symbols from
>> /home/kernel/linux-next/drivers/net/wireless/ath/ath9k/ath9k.o...done.
>> (gdb) l *(ath_tx_start+0x284)
>> 0xcad4 is in ath_tx_start (drivers/net/wireless/ath/ath9k/xmit.c:1878).
>> 1873 ieee80211_is_data_qos(hdr->frame_control)) {
>> 1874 tidno = ieee80211_get_qos_ctl(hdr)[0]&
>> 1875 IEEE80211_QOS_CTL_TID_MASK;
>> 1876 tid = ATH_AN_2_TID(txctl->an, tidno);
>> 1877
>> 1878 WARN_ON(tid->ac->txq != txctl->txq);
>> 1879 }
>> 1880
>> 1881 if ((tx_info->flags& IEEE80211_TX_CTL_AMPDU)&& tid) {
>> 1882 /*
>> (gdb)
>
> Can you try this patch ?
>
> From: Sujith Manoharan<[email protected]>
> Date: Tue, 20 Mar 2012 13:51:26 +0530
> Subject: [PATCH] ath9k: Use HW HT capabilites properly
>
> The commit "ath9k: Remove aggregation flags" changed how
> nodes were being initialized. Use the HW HT cap bits
> to initialize/de-initialize nodes, else we would be
> accessing an uninitialized entry during a suspend/resume cycle,
> resulting in a panic.
>
> Reported-by: Justin P. Mattock<[email protected]>
> Signed-off-by: Sujith Manoharan<[email protected]>
> ---
> drivers/net/wireless/ath/ath9k/main.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
> index 3879485..215eb25 100644
> --- a/drivers/net/wireless/ath/ath9k/main.c
> +++ b/drivers/net/wireless/ath/ath9k/main.c
> @@ -640,7 +640,7 @@ static void ath_node_attach(struct ath_softc *sc, struct ieee80211_sta *sta,
> an->sta = sta;
> an->vif = vif;
>
> - if (sta->ht_cap.ht_supported) {
> + if (sc->sc_ah->caps.hw_caps& ATH9K_HW_CAP_HT) {
> ath_tx_node_init(sc, an);
> an->maxampdu = 1<< (IEEE80211_HT_MAX_AMPDU_FACTOR +
> sta->ht_cap.ampdu_factor);
> @@ -659,7 +659,7 @@ static void ath_node_detach(struct ath_softc *sc, struct ieee80211_sta *sta)
> an->sta = NULL;
> #endif
>
> - if (sta->ht_cap.ht_supported)
> + if (sc->sc_ah->caps.hw_caps& ATH9K_HW_CAP_HT)
> ath_tx_node_cleanup(sc, an);
> }
>


alright!

both mohammed and sujith Thank you for the patches for this issue!
I really appreciate this..

I went and applied the debug printk and added ath_node_attach as well to
my linux-next clone, will run for a few days to see if I get anything..
full dmesg is here..: http://fpaste.org/jHSX/

Justin P. Mattock


2012-03-16 19:45:27

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/15/2012 06:57 AM, Justin P. Mattock wrote:
>
>>> ok! I have set that on the machine.. will see if I get this message in
>>> the upcoming weeks.
>> Also, please try this patch with powersave enabled:
>> http://nbd.name/ps-fix.patch
>>
>> - Felix
>>
>
> I did have ps-fix patch running, but removed it in the kernel due to not
> being able to connect once I wakeup from suspend. anyway I have enable
> debugging for the wireless(hopefully) and was able to trigger this after
> a fresh build of the kernel last night(git clean -fx and all)
>
> the message I have in dmesg is:
>
> [ 3348.274932] ath: Failed to stop TX DMA, queues=0x004!
> [ 3348.292637] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
> [ 3348.292643] ath: Could not stop RX, we could be confusing the DMA
> engine when we start RX up
> [ 3348.292648] ------------[ cut here ]------------
> [ 3348.292662] WARNING: at drivers/net/wireless/ath/ath9k/recv.c:537
> ath_stoprecv+0xed/0x100 [ath9k]()
> [ 3348.292666] Hardware name: 0914
> [ 3348.292669] Modules linked in: dm_crypt ipt_REJECT ipt_LOG xt_limit
> xt_tcpudp xt_addrtype joydev xt_state arc4 snd_hda_codec_hdmi
> snd_hda_codec_conexant ath9k snd_hda_intel snd_hda_codec snd_hwdep
> mac80211 snd_pcm snd_seq_midi nf_conntrack_netbios_ns
> nf_conntrack_broadcast nf_nat_ftp nf_nat snd_rawmidi ath9k_common
> nf_conntrack_ipv4 uvcvideo nf_defrag_ipv4 ath9k_hw snd_seq_midi_event
> nf_conntrack_ftp nf_conntrack snd_seq snd_timer snd_seq_device ath
> videobuf2_core videodev cfg80211 psmouse iptable_filter ip_tables
> videobuf2_vmalloc videobuf2_memops name_laptop intel_ips sparse_keymap
> serio_raw snd lp parport x_tables soundcore snd_page_alloc i915
> drm_kms_helper drm i2c_algo_bit intel_agp intel_gtt usb_storage agpgart
> video r8169 uas
> [ 3348.292786] Pid: 5972, comm: kworker/u:24 Not tainted
> 3.3.0-rc7-00048-g762ad8a #1
> [ 3348.292790] Call Trace:
> [ 3348.292804] [<c1046f32>] warn_slowpath_common+0x72/0xa0
> [ 3348.292816] [<f8f528bd>] ? ath_stoprecv+0xed/0x100 [ath9k]
> [ 3348.292826] [<f8f528bd>] ? ath_stoprecv+0xed/0x100 [ath9k]
> [ 3348.292832] [<c1046f82>] warn_slowpath_null+0x22/0x30
> [ 3348.292842] [<f8f528bd>] ath_stoprecv+0xed/0x100 [ath9k]
> [ 3348.292855] [<f8f4e35d>] ath_prepare_reset+0x5d/0xd0 [ath9k]
> [ 3348.292865] [<f8f4f92c>] ath_reset_internal+0x7c/0x160 [ath9k]
> [ 3348.292875] [<c10a862b>] ? trace_hardirqs_on+0xb/0x10
> [ 3348.292885] [<f8f4fa35>] ath_reset+0x25/0xb0 [ath9k]
> [ 3348.292891] [<c10a8594>] ? trace_hardirqs_on_caller+0xf4/0x180
> [ 3348.292901] [<f8f51377>] ath_reset_work+0x17/0x20 [ath9k]
> [ 3348.292909] [<c10663d6>] process_one_work+0x166/0x5b0
> [ 3348.292916] [<c1066362>] ? process_one_work+0xf2/0x5b0
> [ 3348.292924] [<c12ca72b>] ? do_raw_spin_lock+0x3b/0xf0
> [ 3348.292935] [<f8f51360>] ? ath_isr+0x270/0x270 [ath9k]
> [ 3348.292944] [<c1066b34>] worker_thread+0x124/0x2c0
> [ 3348.292950] [<c1066a10>] ? rescuer_thread+0x1c0/0x1c0
> [ 3348.292957] [<c106b77d>] kthread+0x7d/0x90
> [ 3348.292963] [<c106b700>] ? __init_kthread_worker+0x60/0x60
> [ 3348.292972] [<c158ed42>] kernel_thread_helper+0x6/0x10
> [ 3348.292976] ---[ end trace ced0f48896d487b2 ]---
> [ 3349.735411] ath: Failed to stop TX DMA, queues=0x005!
> [ 3349.753028] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
> [ 3349.753034] ath: Could not stop RX, we could be confusing the DMA
> engine when we start RX up
>
>
> full dmesg is here;
> http://fpaste.org/YhqT/
>
> hope this helps with finding the bug!
>
> Justin P. Mattock

not sure if this is correct, but from what I remember, not cleaning the
directory of the kernel seems to hide this bug, doing a git clean -fx
has triggered this twice now, as opposed to waiting a week.. anyway I
added more debug options to the kernel to see what it might produce when
this is triggered. this is what i have now..:


[10558.732792] TKIP decrypt: data(len=1468) 08 28 08 20 00 00 00 00 aa
aa 03 00 00 00 08 00 45 00 05 a8 54 36 40 00 3a 06 6a 5d cc 02 a6 81 0a
00 00 39 00 50 9f 02 8c ff aa e2 22 c1 69 aa 80 10 10 f8 1d 52 00 00 01
01 08 0a 0d 20 6d 44 00 27 3b 4d c1 08 29 9c c6 b1 d3 18 22 9e 13 08 a5
dd ac 66 24 3a 18 22 11 84 ac 6a 49 94 c4 84 fb 73 86 94 10 9e 72 e9 04
51 04 d2 93 d3 2e a3 28 51 4c b4 8c 89 27 4c bd 3e 8c 60 44 10 2a 91 32
e9 da 25 84 08 81 49 45 47 0e cc e0 8a 00 25 34 26 5d 7e 5c 47 e0 85 10
9e 03 c8 9c 57 fe 4f 4c c6 50 44 46 09 e0 74 23 0b 10 00 13 ef 12 3d 12
82 21 18 0f 22 47 12 64 25 e9 96 30 44 23 03 b9 10 c4 63 8c cf b3 38 6b
98 14 16 a5 12 27 d3 db 02 20 52 41 3d 92 9e 03 38 51 4e 05 0d 3d a0 fe
08 22 8c 25 20 9c 44 bd b8 41 14 40 a1 23 2c 06 70 45 38 14 06 a3 98 97
5c a0 4a 09 19 1f 7a 5e d1 9c 28 84 c3 49 42 58 ce 72 f5 fa 7d 30 44 23
09 e4 49 53 96 03 0c 25 28 69 42 29 aa d5 99 18 9e 9d 61 45 38 14 81 5c
f2 eb 39 67 0d 28 14 e2 48 c3 db 02 64 10 84 ba 4b d7 02 20 51 4c c0 9e
42 04 10 42 72 54 0e 7e 88 e5 c5 0d 69 37 27 24 28 93 21 1c c5 3c 27 91
05 1e d0 41 ed 8e 81 48 85 6c f9 74 6d b5 d4 de 90 65 de 69 a6 c3 10 74
4d 73 f4 4a 72 f6 c6 35 67 ee 75 2f 41 4c d8 53 03 fe 47 dc 16 9f 3d ad
6a a1 db 64 ac 68 fc fe a6 c4 a7 30 94 49 5d b9 18 96 82 f3 d0 aa 57 64
e9 50 39 04 a4 4e 53 fc 71 82 5c 62 57 d4 98 c6 e1 11 01 09 7a 23 98 95
26 10 81 06 63 4a 4a dc 24 04 21 22 65 44 99 00 07 69 30 07 11 72 4e 63
4d a4 05 66 5c f8 9b 6b ed 1b 15 be b3 7c 57 5c 9d bb 5d 4a 91 4b 65 b0
32 cb 8f 05 a4 05 29 33 75 2a 0e 68 04 6b 20 a7 1c a7 9c 5c 95 21 cf ca
bc ad 4f ac 86 98 4b 68 87 3e 5f 72 e5 6e 9d 8b b6 2d 9b 0e 9f 71 5b ab
2b aa 1f ab ba 0b 78 a4 ad 61 14 ab a6 4f 86 71 e2 87 5a 91 58 74 68 9e
ad 5a 4a 48 90 eb 1c 3e 49 6c 6d b9 5b a3 f5 31 3a 6b 58 5a 00 20 c3 a6
fc d6 67 50 93 22 7d 99 f6 45 58 af 43 00 9c 94 80 71 18 fe 48 71 29 16
8e 44 48 c2 52 04 41 14 83 42 b3 78 a7 89 f6 6e f7 a1 79 0a b9 5c a9 ee
74 68 6d 75 c8 4a 18 4b 00 be a7 83 7d d9 29 52 95 ee b4 75 4f ac 4c d6
44 46 2b 06 b3 d4 26 53 cc c2 1a d8 75 aa f6 f6 9b 20 ae d5 65 f1 a6 d6
a4 21 54 cf 5c 5a 4b 4e ba 0a 41 2a 09 4f bb a7 51 c0 88 26 cb 32 cc 15
ef 4f ab 15 4c c5 0b 41 5a 3a 42 44 d3 9f 5e 91 09 71 39 56 80 63 40 80
02 08 89 6a 9a 46 13 cb d1 04 4a eb 08 37 a6 fb b8 4f 0f c5 04 4a 30 0e
45 2a e3 4d ad b5 f7 2d f9 16 8b c5 5d 6d 25 4d 61 4b 76 ef 06 86 96 8d
49 43 ab 74 ba 56 09 48 92 40 4c 86 71 24 b6 e2 ca b3 7d 42 a1 d2 1a 1c
d6 82 23 6a d9 e4 bd 9d b6 76 bd d5 cb 45 ad fb 85 65 6d 39 42 ab 5f a9
69 1e 19 b4 ba 84 ad 21 2e 36 94 cd 7e f6 51 23 e4 90 dc 51 b1 52 a1 f5
3d bc dc 0e 68 06 19 14 45 23 51 13 00 7a fd 18 e3 15 e2 b7 8b 44 11 28
46 a9 80 31 12 20 f4 9c 04 94 83 40 c8 92 90 24 41 12 ea 65 04 4a e8 00
a7 5b 47 8a a8 6a b6 75 56 f5 dd 77 07 2d 3b 69 84 a9 6c a6 95 b4 ae a5
f4 a5 5d de a9 ac 29 28 d6 b9 25 03 49 2a cf 01 29 cf 2a 5b 9c 60 b0 ab
fd 4d b2 1d 85 ad 05 c2 f8 dc b2 db b6 16 c0 ba d9 af 97 6a 1a 8b e5 20
b4 5a ea ee 0c 5b 6e b4 ac d3 be ff 00 70 8d 69 74 38 90 50 eb 22 52 50
40 49 9a 93 88 eb 33 a9 88 30 8a cf 97 eb a6 11 2c 02 db 6c c9 df c9 6a
af 34 04 e6 31 94 c0 22 71 46 25 7a fc 2d 39 12 f7 49 38 65 99 eb f2 c3
c4 50 58 d3 78 09 69 4c 89 90 9e 32 10 62 3c a8 32 da 4d a0 29 67 17 ed
4d af ba 37 18 b2 5d ab 2a e9 2a ab 24 9b 6f 83 6d a2 da 8b 4d bc f3 bd
ea 9c 4a 8a 7d c4 0d 32 19 ce 24 60 c5 64 56 67 a8 ce 75 3b 43 da d6 9c
87 e0 bb f5 7c 63 b2 93 c9 ed ec 5a 4a ab d5 55 43 28 52 ae 4f 34 c5 3a
92 d1 71 0c 38 d2 b5 e9 09 0d 25 0e 28 b8 48 9c e4 04 e2 63 26 00 18 de
b2 99 ea ee 73 5c ec 2d 8b 61 90 c3 a3 a6 d1 67 24 54 6b 7b 59 b8 fa dc
f3 b4 bb 5e be be f0 f5 2b dd d5 65 6a da 6c d0 a6 49 24 84 3c da 52 14
ad 52 1d 9d 90 e6 c9 7b 5b 88 95 2f a7 7a 9b 67 cd c0 5a d1 1e 41 94 28
c9 50 19 65 2e 9e d8 a7 15 e9 c3 42 1a 40 90 e8 7a f4 94 30 e2 12 2c 69
bc 04 56 90 72 cb af 66 38 c2 89 40 68 e4 4d d2 99 e5 81 fb fe 88 22 ba
c2 10 29 44 86 19 e5 86 7d 21 e2 3c ab 91 2d bc 81 00 10 44 b3 90 eb db
0b 11 be 29 e0 6c 21 01 04 a4 08 00 1c 4e 1e a9 99 08 22 53 c2 39 14 bf
8c f6 7e d8 dc f7 63 6a b9 56 56 d3 5c 1f 27 c2 0a 54 34 59 29 6d b5 2d
5a d4 b0 a2 14 74 99 44 ac 6e 28 da b1 fd 46 ad f4 f0 2d 6b 48 2b ae be
3b d8 83 90 53 b5 3c 75 de 53 14 cb ab ee 59 d0 6b 1c 09 71 28 0b d3 22
80 da bd ec 27 38 98 c9 22 06 37 ac c6 7a c3 de c7 41 ad 8b 79 8c 20 b9
bc 9d b2 b6 ce d6 ac 16 cb 6b d7 1a bb 8b 5d db 95 6e 54 34 df 85 6d a7
52 ad 3f 9c 42 53 ef 95 01 81 e9 1c ba 49 0d 72 9f 8f ed
[10558.734622] TKIP decrypt: iv16=0808 iv32=00000000
[10560.232164] wlan0: detected beacon loss from AP - sending probe request
[10560.645078] ath: Failed to stop TX DMA, queues=0x005!
[10560.662741] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
[10560.662750] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[10560.662754] ------------[ cut here ]------------
[10560.662770] WARNING: at drivers/net/wireless/ath/ath9k/recv.c:537
ath_stoprecv+0xed/0x100 [ath9k]()
[10560.662775] Hardware name: 0914
[10560.662778] Modules linked in: dm_crypt ipt_REJECT ipt_LOG joydev
xt_limit xt_tcpudp xt_addrtype xt_state arc4 ath9k mac80211
snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel snd_hda_codec
snd_hwdep ath9k_common ath9k_hw nf_conntrack_netbios_ns
nf_conntrack_broadcast snd_pcm snd_seq_midi snd_rawmidi nf_nat_ftp
nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 snd_seq_midi_event
nf_conntrack_ftp psmouse uvcvideo lp snd_seq videobuf2_core nf_conntrack
ath videodev serio_raw ideapad_laptop sparse_keymap parport intel_ips
cfg80211 videobuf2_vmalloc videobuf2_memops snd_timer snd_seq_device snd
iptable_filter ip_tables x_tables soundcore snd_page_alloc usb_storage
uas i915 drm_kms_helper drm intel_agp r8169 i2c_algo_bit intel_gtt
agpgart video
[10560.662895] Pid: 10046, comm: kworker/u:20 Not tainted
3.3.0-rc7-00103-g0c4d067 #1
[10560.662899] Call Trace:
[10560.662914] [<c1047572>] warn_slowpath_common+0x72/0xa0
[10560.662925] [<f8c08bdd>] ? ath_stoprecv+0xed/0x100 [ath9k]
[10560.662935] [<f8c08bdd>] ? ath_stoprecv+0xed/0x100 [ath9k]
[10560.662942] [<c10475c2>] warn_slowpath_null+0x22/0x30
[10560.662952] [<f8c08bdd>] ath_stoprecv+0xed/0x100 [ath9k]
[10560.662965] [<f8c0452d>] ath_prepare_reset+0x5d/0xd0 [ath9k]
[10560.662975] [<f8c05adc>] ath_reset_internal+0x7c/0x160 [ath9k]
[10560.662984] [<c10a8cab>] ? trace_hardirqs_on+0xb/0x10
[10560.662994] [<f8c05be5>] ath_reset+0x25/0xb0 [ath9k]
[10560.663000] [<c10a8c14>] ? trace_hardirqs_on_caller+0xf4/0x180
[10560.663011] [<f8c07527>] ath_reset_work+0x17/0x20 [ath9k]
[10560.663020] [<c10669d8>] process_one_work+0x168/0x5c0
[10560.663027] [<c1066964>] ? process_one_work+0xf4/0x5c0
[10560.663035] [<c12cc55b>] ? do_raw_spin_lock+0x3b/0xf0
[10560.663045] [<f8c07510>] ? ath_isr+0x270/0x270 [ath9k]
[10560.663054] [<c1067141>] worker_thread+0x121/0x2c0
[10560.663061] [<c1067020>] ? rescuer_thread+0x1c0/0x1c0
[10560.663067] [<c106be5d>] kthread+0x7d/0x90
[10560.663073] [<c106bde0>] ? __init_kthread_worker+0x60/0x60
[10560.663083] [<c15a1102>] kernel_thread_helper+0x6/0x10
[10560.663087] ---[ end trace 7fd01c4e029fbbc7 ]---
[10560.670137] ieee80211 phy0: wlan0: No ack for nullfunc frame to AP
00:1e:2a:62:6b:1e, try 1/2
[10561.166344] ieee80211 phy0: wlan0: Failed to send nullfunc to AP
00:1e:2a:62:6b:1e after 500ms, disconnecting.
[10561.168266] wlan0: moving STA 00:1e:2a:62:6b:1e to state 2
[10561.168272] wlan0: moving STA 00:1e:2a:62:6b:1e to state 1
[10561.168277] wlan0: moving STA 00:1e:2a:62:6b:1e to state 0
[10561.174026] ieee80211 phy0: Removed STA 00:1e:2a:62:6b:1e
[10561.174390] ieee80211 phy0: Destroyed STA 00:1e:2a:62:6b:1e
[10561.175170] ieee80211 phy0: device now idle
[10561.175231] cfg80211: All devices are disconnected, going to restore
regulatory settings
[10561.175271] cfg80211: Restoring regulatory settings
[10561.175292] cfg80211: Calling CRDA to update world regulatory domain
[10561.185363] cfg80211: Ignoring regulatory request Set by core since
the driver uses its own custom regulatory domain
[10561.185415] cfg80211: World regulatory domain updated:
[10561.185419] cfg80211: (start_freq - end_freq @ bandwidth),
(max_antenna_gain, max_eirp)
[10561.185424] cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (300
mBi, 2000 mBm)
[10561.185429] cfg80211: (2457000 KHz - 2482000 KHz @ 20000 KHz), (300
mBi, 2000 mBm)
[10561.185434] cfg80211: (2474000 KHz - 2494000 KHz @ 20000 KHz), (300
mBi, 2000 mBm)
[10561.185439] cfg80211: (5170000 KHz - 5250000 KHz @ 40000 KHz), (300
mBi, 2000 mBm)
[10561.185444] cfg80211: (5735000 KHz - 5835000 KHz @ 40000 KHz), (300
mBi, 2000 mBm)
[10561.578320] ieee80211 phy0: device no longer idle - scanning
[10561.603489] ath: Failed to stop TX DMA, queues=0x005!
[10561.621059] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
[10561.621067] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[10561.694090] ath: Failed to stop TX DMA, queues=0x001!
[10561.711600] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
[10561.711608] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[10561.785820] ath: Failed to stop TX DMA, queues=0x001!
[10561.803374] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
[10561.803382] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[10561.877450] ath: Failed to stop TX DMA, queues=0x001!
[10561.894814] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
[10561.894821] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[10561.969359] ath: Failed to stop TX DMA, queues=0x001!
[10561.986773] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
[10561.986777] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[10562.061391] ath: Failed to stop TX DMA, queues=0x001!
[10562.078921] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
[10562.078928] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[10562.153191] ath: Failed to stop TX DMA, queues=0x001!
[10562.170828] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
[10562.170835] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[10562.244937] ath: Failed to stop TX DMA, queues=0x001!
[10562.262556] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
[10562.262563] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[10562.336708] ath: Failed to stop TX DMA, queues=0x001!
[10562.354225] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
[10562.354232] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[10562.428559] ath: Failed to stop TX DMA, queues=0x001!
[10562.446067] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
[10562.446075] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[10562.520388] ath: Failed to stop TX DMA, queues=0x001!
[10562.537981] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
[10562.537987] ath: Could not stop RX, we could be confusing the DMA
engine when we start RX up
[10562.612159] ath: Failed to stop TX DMA, queues=0x001!



hope this helps.

Justin P. Mattock


2012-03-20 11:36:24

by Mohammed Shafi

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On Tue, Mar 20, 2012 at 4:45 PM, Mohammed Shafi
<[email protected]> wrote:
> On Tue, Mar 20, 2012 at 3:07 PM, Mohammed Shafi
> <[email protected]> wrote:
>> Hi Sujith/Justin,
>>
>> On Tue, Mar 20, 2012 at 1:59 PM, Sujith Manoharan
>> <[email protected]> wrote:
>>> Justin P. Mattock wrote:
>>>> yeah this works:
>>>>
>>>> eading symbols from
>>>> /home/kernel/linux-next/drivers/net/wireless/ath/ath9k/ath9k.o...done.
>>>> (gdb) l *(ath_tx_start+0x284)
>>>> 0xcad4 is in ath_tx_start (drivers/net/wireless/ath/ath9k/xmit.c:1878).
>>>> 1873 ? ? ? ? ? ? ? ? ?ieee80211_is_data_qos(hdr->frame_control)) {
>>>> 1874 ? ? ? ? ? ? ? ? ?tidno = ieee80211_get_qos_ctl(hdr)[0] &
>>>> 1875 ? ? ? ? ? ? ? ? ? ? ? ? ?IEEE80211_QOS_CTL_TID_MASK;
>>>> 1876 ? ? ? ? ? ? ? ? ?tid = ATH_AN_2_TID(txctl->an, tidno);
>>>> 1877
>>>> 1878 ? ? ? ? ? ? ? ? ?WARN_ON(tid->ac->txq != txctl->txq);
>>>> 1879 ? ? ? ? ?}
>>>> 1880
>>>> 1881 ? ? ? ? ?if ((tx_info->flags & IEEE80211_TX_CTL_AMPDU) && tid) {
>>>> 1882 ? ? ? ? ? ? ? ? ?/*
>>>> (gdb)
>>>
>>> Can you try this patch ?
>>
>> just found out that 'ht_supported' may not ?be set, if assoc response
>> does not has ht_cap IE (or) if we could not parse it (why), then the
>> driver won't initialize those tid related structures ath_tx_node_init,
>> while we later access them in ath_tx_start. so this should fix the
>> issue.
>
> now i was able to quite easily recreate this issue in wireless testing
> even without suspend resume and with the patch the issue seems to get
> fixed.
> i was using 2ghz channel 6 and TKIP (no HT!).

i was keep on trying with 5ghz TKIP and it seems HT enable even with
TKIP (need to see why this happens), thats why even i could not
recreate this issue. mac80211 seems to disable HT, ht_supported false
while in driver we go with ATH9K_HW_CAP HT


>
>
>>
>>>
>>> From: Sujith Manoharan <[email protected]>
>>> Date: Tue, 20 Mar 2012 13:51:26 +0530
>>> Subject: [PATCH] ath9k: Use HW HT capabilites properly
>>>
>>> The commit "ath9k: Remove aggregation flags" changed how
>>> nodes were being initialized. Use the HW HT cap bits
>>> to initialize/de-initialize nodes, else we would be
>>> accessing an uninitialized entry during a suspend/resume cycle,
>>> resulting in a panic.
>>>
>>> Reported-by: Justin P. Mattock <[email protected]>
>>> Signed-off-by: Sujith Manoharan <[email protected]>
>>> ---
>>> ?drivers/net/wireless/ath/ath9k/main.c | ? ?4 ++--
>>> ?1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
>>> index 3879485..215eb25 100644
>>> --- a/drivers/net/wireless/ath/ath9k/main.c
>>> +++ b/drivers/net/wireless/ath/ath9k/main.c
>>> @@ -640,7 +640,7 @@ static void ath_node_attach(struct ath_softc *sc, struct ieee80211_sta *sta,
>>> ? ? ? ?an->sta = sta;
>>> ? ? ? ?an->vif = vif;
>>>
>>> - ? ? ? if (sta->ht_cap.ht_supported) {
>>> + ? ? ? if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_HT) {
>>> ? ? ? ? ? ? ? ?ath_tx_node_init(sc, an);
>>> ? ? ? ? ? ? ? ?an->maxampdu = 1 << (IEEE80211_HT_MAX_AMPDU_FACTOR +
>>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sta->ht_cap.ampdu_factor);
>>> @@ -659,7 +659,7 @@ static void ath_node_detach(struct ath_softc *sc, struct ieee80211_sta *sta)
>>> ? ? ? ?an->sta = NULL;
>>> ?#endif
>>>
>>> - ? ? ? if (sta->ht_cap.ht_supported)
>>> + ? ? ? if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_HT)
>>> ? ? ? ? ? ? ? ?ath_tx_node_cleanup(sc, an);
>>> ?}
>>>
>>> --
>>> 1.7.9.4
>>>
>>
>>
>>
>> --
>> thanks,
>> shafi
>
>
>
> --
> thanks,
> shafi



--
thanks,
shafi

2012-03-19 15:33:09

by Felix Fietkau

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 2012-03-19 4:25 PM, Justin P. Mattock wrote:
> On 03/19/2012 08:20 AM, Mohammed Shafi wrote:
>> Hi Justin,
>>
>>> seems I got this when waking up from suspend, was trying to connect with NM
>>> then all of a sudden Xorg shut off and this appeared on the screen(image
>>> links below I could not save dmesg only pic)
>>> http://flic.kr/p/bEhf2K
>>> http://flic.kr/p/bEhf2x
>>> http://flic.kr/p/bEhf2n
>>> http://flic.kr/p/bEhf26
>>>
>>> I see ath9k in the image, but am unsure if its related.
>>> kernel is: 3.3.0-rc7-next-20120316
>>
>> this looks like some new issue in ath9k, lets try to recreate it here
>>
>>>
>>> Justin P. Mattock
>>>
>>>
>>>
>>
>>
>>
>
> would be nice to recreate, then bisecting will be more accurate to do..
> I am trying userspace tweaks here and there but no luck(hostapd etc..)
> keep in mind this is for the RX DMA, the above just occured..
> as for writing a small app I still need to try, but my C skills only go
> so far(anything to trigger!).
Was your kernel compiled with debug info? If so, please run gdb on the
ath9k.ko module and enter this:
l *ath9k_tx+start+0x284
Then send me the output of that.

- Felix

2012-03-20 17:42:15

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/20/2012 08:22 AM, Sujith Manoharan wrote:
> Justin P. Mattock wrote:
>> both mohammed and sujith Thank you for the patches for this issue!
>> I really appreciate this..
>>
>> I went and applied the debug printk and added ath_node_attach as well to
>> my linux-next clone, will run for a few days to see if I get anything..
>> full dmesg is here..: http://fpaste.org/jHSX/
>
> Thanks for testing and please report if the issue is fixed.
>
> Sujith
>

no worries.. hopefully we can get this bug fixed.. will continue running
these patches and will send out anything I see..

Justin P. Mattock

2012-03-20 07:44:41

by Mohammed Shafi

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

> /home/kernel/linux-next/drivers/net/wireless/ath/ath9k/ath9k.o...done.
> (gdb) l *(ath_tx_start+0x284)
> 0xcad4 is in ath_tx_start (drivers/net/wireless/ath/ath9k/xmit.c:1878).
> 1873 ? ? ? ? ? ? ? ? ? ?ieee80211_is_data_qos(hdr->frame_control)) {
> 1874 ? ? ? ? ? ? ? ? ? ?tidno = ieee80211_get_qos_ctl(hdr)[0] &
> 1875 ? ? ? ? ? ? ? ? ? ? ? ? ? ?IEEE80211_QOS_CTL_TID_MASK;
> 1876 ? ? ? ? ? ? ? ? ? ?tid = ATH_AN_2_TID(txctl->an, tidno);
> 1877
> 1878 ? ? ? ? ? ? ? ? ? ?WARN_ON(tid->ac->txq != txctl->txq);
> 1879 ? ? ? ? ? ?}
> 1880
> 1881 ? ? ? ? ? ?if ((tx_info->flags & IEEE80211_TX_CTL_AMPDU) && tid) {
> 1882 ? ? ? ? ? ? ? ? ? ?/*
> (gdb

please check with the attached debug patch if tid is 'NULL'


its my theory that tid is not initialized in ath_tx_node_init just after resume.
during suspend ath9k_sta_remove may be called
ieee80211_suspend -> drv_sta_state ->
drv_sta_remove->ath9k_sta_remove->ath_node_detach->ath_tx_node_cleanup

during resume

ieee80211_resume ->
ieee80211_reconfig->drv_sta_state->ath9k_sta_add->ath_node_attach->ath_tx_node_init

now sta.ht_cap.ht_supported should be true. it will be set to true in
assoc_sucess path and then sta_add is called.

let me see if i had missed something or my analysis itself is wrong.

>



--
thanks,
shafi


Attachments:
debug-panic.patch (493.00 B)

2012-03-07 15:19:38

by Mohammed Shafi

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On Wed, Mar 7, 2012 at 7:55 PM, Justin P. Mattock
<[email protected]> wrote:
> ath9k craps out from time to time(more annoying if anything)
>
> [44309.840830] ath: Failed to stop TX DMA, queues=0x001!
> [44309.858246] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
> [44309.858255] ath: Could not stop RX, we could be confusing the DMA engine
> when we start RX up
> [44309.932801] ath: Failed to stop TX DMA, queues=0x001!
> [44309.950352] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
> [44309.950361] ath: Could not stop RX, we could be confusing the DMA engine
> when we start RX up
> [44310.024301] ath: Failed to stop TX DMA, queues=0x001!
> [44310.041641] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
> [44310.041646] ath: Could not stop RX, we could be confusing the DMA engine
> when we start RX up
> [44310.116316] ath: Failed to stop TX DMA, queues=0x001!
> [44310.133909] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
> [44310.133918] ath: Could not stop RX, we could be confusing the DMA engine
> when we start RX up
> [44310.208021] ath: Failed to stop TX DMA, queues=0x001!
> [44310.225752] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
> [44310.225761] ath: Could not stop RX, we could be confusing the DMA engine
> when we start RX up
> [44310.299922] ath: Failed to stop TX DMA, queues=0x001!
> [44310.317579] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
> [44310.317587] ath: Could not stop RX, we could be confusing the DMA engine
> whe

this is has become a know issue but less harmful, please try to see if
the workaround of disabling power save helps

iw dev wlanX set power_save off



>
>
> part of dmesg here:
> http://fpaste.org/xew6/
> I can supply full log if needed..
>
> Justin P. Mattock
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html



--
thanks,
shafi

2012-03-27 02:18:18

by Sujith Manoharan

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

Justin P. Mattock wrote:
> I would have to say this patch above does get rid of this crash I was
> seeing. as a quick test I simply connect to a WPA network, then connect
> to an open network going back and forth triggers this freeze for me
> after applying this I am able to toggle back and forth without a freeze.

Thanks for verifying.

Sujith

2012-03-20 11:15:57

by Mohammed Shafi

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On Tue, Mar 20, 2012 at 3:07 PM, Mohammed Shafi
<[email protected]> wrote:
> Hi Sujith/Justin,
>
> On Tue, Mar 20, 2012 at 1:59 PM, Sujith Manoharan
> <[email protected]> wrote:
>> Justin P. Mattock wrote:
>>> yeah this works:
>>>
>>> eading symbols from
>>> /home/kernel/linux-next/drivers/net/wireless/ath/ath9k/ath9k.o...done.
>>> (gdb) l *(ath_tx_start+0x284)
>>> 0xcad4 is in ath_tx_start (drivers/net/wireless/ath/ath9k/xmit.c:1878).
>>> 1873 ? ? ? ? ? ? ? ? ?ieee80211_is_data_qos(hdr->frame_control)) {
>>> 1874 ? ? ? ? ? ? ? ? ?tidno = ieee80211_get_qos_ctl(hdr)[0] &
>>> 1875 ? ? ? ? ? ? ? ? ? ? ? ? ?IEEE80211_QOS_CTL_TID_MASK;
>>> 1876 ? ? ? ? ? ? ? ? ?tid = ATH_AN_2_TID(txctl->an, tidno);
>>> 1877
>>> 1878 ? ? ? ? ? ? ? ? ?WARN_ON(tid->ac->txq != txctl->txq);
>>> 1879 ? ? ? ? ?}
>>> 1880
>>> 1881 ? ? ? ? ?if ((tx_info->flags & IEEE80211_TX_CTL_AMPDU) && tid) {
>>> 1882 ? ? ? ? ? ? ? ? ?/*
>>> (gdb)
>>
>> Can you try this patch ?
>
> just found out that 'ht_supported' may not ?be set, if assoc response
> does not has ht_cap IE (or) if we could not parse it (why), then the
> driver won't initialize those tid related structures ath_tx_node_init,
> while we later access them in ath_tx_start. so this should fix the
> issue.

now i was able to quite easily recreate this issue in wireless testing
even without suspend resume and with the patch the issue seems to get
fixed.
i was using 2ghz channel 6 and TKIP (no HT!).


>
>>
>> From: Sujith Manoharan <[email protected]>
>> Date: Tue, 20 Mar 2012 13:51:26 +0530
>> Subject: [PATCH] ath9k: Use HW HT capabilites properly
>>
>> The commit "ath9k: Remove aggregation flags" changed how
>> nodes were being initialized. Use the HW HT cap bits
>> to initialize/de-initialize nodes, else we would be
>> accessing an uninitialized entry during a suspend/resume cycle,
>> resulting in a panic.
>>
>> Reported-by: Justin P. Mattock <[email protected]>
>> Signed-off-by: Sujith Manoharan <[email protected]>
>> ---
>> ?drivers/net/wireless/ath/ath9k/main.c | ? ?4 ++--
>> ?1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
>> index 3879485..215eb25 100644
>> --- a/drivers/net/wireless/ath/ath9k/main.c
>> +++ b/drivers/net/wireless/ath/ath9k/main.c
>> @@ -640,7 +640,7 @@ static void ath_node_attach(struct ath_softc *sc, struct ieee80211_sta *sta,
>> ? ? ? ?an->sta = sta;
>> ? ? ? ?an->vif = vif;
>>
>> - ? ? ? if (sta->ht_cap.ht_supported) {
>> + ? ? ? if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_HT) {
>> ? ? ? ? ? ? ? ?ath_tx_node_init(sc, an);
>> ? ? ? ? ? ? ? ?an->maxampdu = 1 << (IEEE80211_HT_MAX_AMPDU_FACTOR +
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sta->ht_cap.ampdu_factor);
>> @@ -659,7 +659,7 @@ static void ath_node_detach(struct ath_softc *sc, struct ieee80211_sta *sta)
>> ? ? ? ?an->sta = NULL;
>> ?#endif
>>
>> - ? ? ? if (sta->ht_cap.ht_supported)
>> + ? ? ? if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_HT)
>> ? ? ? ? ? ? ? ?ath_tx_node_cleanup(sc, an);
>> ?}
>>
>> --
>> 1.7.9.4
>>
>
>
>
> --
> thanks,
> shafi



--
thanks,
shafi

2012-03-20 06:19:33

by Mohammed Shafi

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

> (gdb) list *ath9k_tx+start+0x284
> No symbol "start" in current context.

hey, pls try l *(ath9k_tx_start+0x284), yesterday i did the same in
the latest wireless testing tree it pointed to line 1929 of xmit.c

sc->tx.seq_no +=0x10;

also did few suspend/resume in my machine, nothing interesting happened

>
> before receiving the email to do the above code with gdb I had cleaned the
> tree out with git clean -fx so rebuilding might have misalighned things(but
> could be wrong)..
>



--
thanks,
shafi

2012-03-07 17:37:18

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/07/2012 09:27 AM, Felix Fietkau wrote:
> On 2012-03-07 4:24 PM, Justin P. Mattock wrote:
>> On 03/07/2012 07:19 AM, Mohammed Shafi wrote:
>>> On Wed, Mar 7, 2012 at 7:55 PM, Justin P. Mattock
>>> <[email protected]> wrote:
>>>> ath9k craps out from time to time(more annoying if anything)
>>>>
>>>> [44309.840830] ath: Failed to stop TX DMA, queues=0x001!
>>>> [44309.858246] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>>> [44309.858255] ath: Could not stop RX, we could be confusing the DMA engine
>>>> when we start RX up
>>>> [44309.932801] ath: Failed to stop TX DMA, queues=0x001!
>>>> [44309.950352] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>>> [44309.950361] ath: Could not stop RX, we could be confusing the DMA engine
>>>> when we start RX up
>>>> [44310.024301] ath: Failed to stop TX DMA, queues=0x001!
>>>> [44310.041641] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>>> [44310.041646] ath: Could not stop RX, we could be confusing the DMA engine
>>>> when we start RX up
>>>> [44310.116316] ath: Failed to stop TX DMA, queues=0x001!
>>>> [44310.133909] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>>> [44310.133918] ath: Could not stop RX, we could be confusing the DMA engine
>>>> when we start RX up
>>>> [44310.208021] ath: Failed to stop TX DMA, queues=0x001!
>>>> [44310.225752] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040
>>>> [44310.225761] ath: Could not stop RX, we could be confusing the DMA engine
>>>> when we start RX up
>>>> [44310.299922] ath: Failed to stop TX DMA, queues=0x001!
>>>> [44310.317579] ath: DMA failed to stop in 10 ms AR_CR=0x00000024
>>>> AR_DIAG_SW=0x02000020 DMADBG_7=0x00006040
>>>> [44310.317587] ath: Could not stop RX, we could be confusing the DMA engine
>>>> whe
>>>
>>> this is has become a know issue but less harmful, please try to see if
>>> the workaround of disabling power save helps
>>>
>>> iw dev wlanX set power_save off
>>>
>>
>> ok! I have set that on the machine.. will see if I get this message in
>> the upcoming weeks.
> Also, please try this patch with powersave enabled:
> http://nbd.name/ps-fix.patch
>
> - Felix
>

I will give that a try as well..

Thanks!

Justin P. Mattock


2012-04-06 16:40:29

by Justin P. Mattock

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 03/26/2012 07:17 PM, Sujith Manoharan wrote:
> Justin P. Mattock wrote:
>> I would have to say this patch above does get rid of this crash I was
>> seeing. as a quick test I simply connect to a WPA network, then connect
>> to an open network going back and forth triggers this freeze for me
>> after applying this I am able to toggle back and forth without a freeze.
>
> Thanks for verifying.
>
> Sujith
>


From going through the warning message I was playing around and am
looking at: AH_UNPLUGGED .. seems if I remove this from the code I have
not seen this warning message.
is there something wrong with AH_UNPLUGGED = 0x2 ?
is this the wrong address(0x2)?

the patch that I was using is:

From ea8a7d6ecf8b0706f3f0521cd22cc359b20b9373 Mon Sep 17 00:00:00 2001
From: "Justin P. Mattock" <[email protected]>
Date: Sun, 25 Mar 2012 21:53:31 -0700
Subject: [PATCH] ath_test

Signed-off-by: Justin P. Mattock <[email protected]>

---
drivers/net/wireless/ath/ath9k/recv.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/recv.c
b/drivers/net/wireless/ath/ath9k/recv.c
index f4ae3ba..d887cee 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -523,8 +523,7 @@ bool ath_stoprecv(struct ath_softc *sc)
sc->rx.rxlink = NULL;
spin_unlock_bh(&sc->rx.rxbuflock);

- if (!(ah->ah_flags & AH_UNPLUGGED) &&
- unlikely(!stopped)) {
+ if (!(ah->ah_flags) && unlikely(!stopped)) {
ath_err(ath9k_hw_common(sc->sc_ah),
"Could not stop RX, we could be "
"confusing the DMA engine when we start RX up\n");
--
1.7.5.4


let me know what you think!

Justin P. Mattock

2012-04-06 16:56:21

by Felix Fietkau

[permalink] [raw]
Subject: Re: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x02000020 DMADBG_7=0x00008040

On 2012-04-06 6:40 PM, Justin P. Mattock wrote:
> On 03/26/2012 07:17 PM, Sujith Manoharan wrote:
>> Justin P. Mattock wrote:
>>> I would have to say this patch above does get rid of this crash I was
>>> seeing. as a quick test I simply connect to a WPA network, then connect
>>> to an open network going back and forth triggers this freeze for me
>>> after applying this I am able to toggle back and forth without a freeze.
>>
>> Thanks for verifying.
>>
>> Sujith
>>
>
>
> From going through the warning message I was playing around and am
> looking at: AH_UNPLUGGED .. seems if I remove this from the code I have
> not seen this warning message.
> is there something wrong with AH_UNPLUGGED = 0x2 ?
> is this the wrong address(0x2)?
>
> the patch that I was using is:
>
> From ea8a7d6ecf8b0706f3f0521cd22cc359b20b9373 Mon Sep 17 00:00:00 2001
> From: "Justin P. Mattock" <[email protected]>
> Date: Sun, 25 Mar 2012 21:53:31 -0700
> Subject: [PATCH] ath_test
>
> Signed-off-by: Justin P. Mattock <[email protected]>
>
> ---
> drivers/net/wireless/ath/ath9k/recv.c | 3 +--
> 1 files changed, 1 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath9k/recv.c
> b/drivers/net/wireless/ath/ath9k/recv.c
> index f4ae3ba..d887cee 100644
> --- a/drivers/net/wireless/ath/ath9k/recv.c
> +++ b/drivers/net/wireless/ath/ath9k/recv.c
> @@ -523,8 +523,7 @@ bool ath_stoprecv(struct ath_softc *sc)
> sc->rx.rxlink = NULL;
> spin_unlock_bh(&sc->rx.rxbuflock);
>
> - if (!(ah->ah_flags & AH_UNPLUGGED) &&
> - unlikely(!stopped)) {
> + if (!(ah->ah_flags) && unlikely(!stopped)) {
That patch is completely messed up - before showing the warning, it
checks if ah_flags is set to 0, which will never happen with most
devices. You're simply making the warning impossible to trigger.

- Felix