2012-01-24 02:23:36

by Norbert Preining

[permalink] [raw]
Subject: iwlagn is getting even worse with 3.3-rc1

Dear all,

(please cc)

long time ago we had a long discussion about iwlagn regressions in 3.1.0
Up to 3.2 I still have regular problems on the university network
with stuck iwl driver not even realizing that nothing goes out or in.

Today I switched to 3.3-rc1 to see it is getting even worse, now on my
home network I get stuck queue and no connection:

Here a bit of dmsg:
[ 59.074751] wlan0: authenticate with 00:0a:79:eb:56:10 (try 1)
[ 59.077291] wlan0: authenticated
[ 59.082048] wlan0: associate with 00:0a:79:eb:56:10 (try 1)
[ 59.088907] wlan0: RX AssocResp from 00:0a:79:eb:56:10 (capab=0x411 status=0 aid=2)
[ 59.088911] wlan0: associated
[ 59.088914] wlan0: moving STA 00:0a:79:eb:56:10 to state 1
[ 59.088916] wlan0: moving STA 00:0a:79:eb:56:10 to state 2
[ 59.094289] cfg80211: Calling CRDA for country: JP
....
[ 59.100462] cfg80211: (5490000 KHz - 5710000 KHz @ 40000 KHz), (N/A, 2300 mBm)
[ 59.640945] wlan0: moving STA 00:0a:79:eb:56:10 to state 3
[ 59.893618] Rx A-MPDU request on tid 0 result 0
[ 67.500767] type=1305 audit(1327367378.735:34933): auid=4294967295 ses=4294967295 op="remove rule" key=(null) list=4 res=1
[ 67.500772] type=1305 audit(1327367378.735:34934): audit_enabled=0 old=1 auid=4294967295 ses=4294967295 res=1
[ 70.367923] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 70.380150] activated addBA response timer on tid 0
[ 70.382052] switched off addBA timer for tid 0
[ 70.382054] Aggregation is on for tid 0
[ 70.382193] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 79.307470] Rx A-MPDU request on tid 6 result 0
[ 88.975687] Open BA session requested for 00:0a:79:eb:56:10 tid 6
[ 88.996195] activated addBA response timer on tid 6
[ 88.998130] switched off addBA timer for tid 6
[ 88.998132] Aggregation is on for tid 6
[ 88.998294] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 6
[ 94.128034] tx session timer expired on tid 6
[ 94.128059] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 6
[ 94.144199] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 6
[ 204.928067] tx session timer expired on tid 0
[ 204.928080] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[ 204.952166] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[ 213.387258] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 213.404237] activated addBA response timer on tid 0
[ 213.406212] switched off addBA timer for tid 0
[ 213.406214] Aggregation is on for tid 0
[ 213.406375] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 226.208032] tx session timer expired on tid 0
[ 226.208043] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[ 226.240201] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[ 228.599092] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 228.624228] activated addBA response timer on tid 0
[ 228.626198] switched off addBA timer for tid 0
[ 228.626200] Aggregation is on for tid 0
[ 228.626317] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 258.908018] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
[ 258.908022] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
[ 258.908024] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
[ 258.908027] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
[ 258.908030] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
[ 259.316067] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
[ 259.316071] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
[ 259.316073] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
[ 259.316076] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
[ 259.316078] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
[ 259.827797] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 inititator reason: 0
[ 259.828087] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 6 inititator reason: 0
[ 259.828856] Rx A-MPDU request on tid 0 result 0
[ 282.159584] Rx A-MPDU request on tid 6 result 0
[ 317.440056] tx session timer expired on tid 0
[ 317.440092] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[ 317.460203] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[ 319.344568] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 319.364211] activated addBA response timer on tid 0
[ 319.366199] switched off addBA timer for tid 0
[ 319.366201] Aggregation is on for tid 0
[ 319.366376] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 331.616071] tx session timer expired on tid 0
[ 331.616118] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[ 331.636237] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[ 335.560927] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 335.592265] activated addBA response timer on tid 0
[ 335.594210] switched off addBA timer for tid 0
[ 335.594217] Aggregation is on for tid 0
[ 335.594374] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 352.912068] tx session timer expired on tid 0
[ 352.912099] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[ 358.537559] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[ 358.542672] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 358.564308] activated addBA response timer on tid 0
[ 358.566251] switched off addBA timer for tid 0
[ 358.566259] Aggregation is on for tid 0
[ 358.566414] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 376.928052] tx session timer expired on tid 0
[ 376.928078] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[ 376.952178] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[ 389.583167] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 389.596155] activated addBA response timer on tid 0
[ 389.598079] switched off addBA timer for tid 0
[ 389.598081] Aggregation is on for tid 0
[ 389.598211] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 472.736062] tx session timer expired on tid 0
[ 472.736103] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[ 478.343144] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[ 483.817635] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 483.832171] activated addBA response timer on tid 0
[ 483.834122] switched off addBA timer for tid 0
[ 483.834124] Aggregation is on for tid 0
[ 483.834251] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 559.520070] tx session timer expired on tid 0
[ 559.520118] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[ 559.540222] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[ 564.472540] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 564.496123] activated addBA response timer on tid 0
[ 564.498038] switched off addBA timer for tid 0
[ 564.498041] Aggregation is on for tid 0
[ 564.498179] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 580.800050] tx session timer expired on tid 0
[ 580.800077] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[ 580.820245] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[ 589.529736] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 589.540229] activated addBA response timer on tid 0
[ 589.542081] switched off addBA timer for tid 0
[ 589.542089] Aggregation is on for tid 0
[ 589.542247] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 602.128033] tx session timer expired on tid 0
[ 602.128077] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0

Here everything stopped working without any reaction or signal
from kernel, driver, whatsoever.

After some time I decided to reset (echo 1 > ...../force_reset):

[ 685.975706] Open BA session requested for 00:0a:79:eb:56:10 tid 6
[ 686.000235] activated addBA response timer on tid 6
[ 686.002201] switched off addBA timer for tid 6
[ 686.002203] Aggregation is on for tid 6
[ 686.002393] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 6
[ 688.643598] iwlwifi 0000:06:00.0: On demand firmware reload
[ 688.644044] ieee80211 phy0: Hardware restart was requested
[ 688.644147] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 688.647179] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 688.700816] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 recipient reason: 36
[ 688.700983] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 6
[ 688.701115] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 6 recipient reason: 36
[ 688.701124] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 6

Still no success!!! So rfkill on and off:
[ 725.234448] iwlwifi 0000:06:00.0: RF_KILL bit toggled to disable radio.
[ 725.816063] iwlwifi 0000:06:00.0: Not sending command - RF KILL
[ 725.816100] iwlwifi 0000:06:00.0: Not sending command - RF KILL
[ 726.816121] iwlwifi 0000:06:00.0: Not sending command - RF KILL
[ 726.816158] iwlwifi 0000:06:00.0: Not sending command - RF KILL
[ 727.232124] iwlwifi 0000:06:00.0: Command REPLY_RXON aborted: RF KILL Switch
[ 727.232151] iwlwifi 0000:06:00.0: Command REPLY_RXON aborted: RF KILL Switch
[ 727.232161] iwlwifi 0000:06:00.0: Command REPLY_RXON aborted: RF KILL Switch
[ 727.232177] wlan0: deauthenticating from 00:0a:79:eb:56:10 by local choice (reason=3)
[ 727.264074] iwlwifi 0000:06:00.0: Command REPLY_ADD_STA aborted: RF KILL Switch
[ 727.264088] ieee80211 phy0: failed to remove key (1, ff:ff:ff:ff:ff:ff) from hardware (-125)
[ 727.288133] iwlwifi 0000:06:00.0: Command REPLY_ADD_STA aborted: RF KILL Switch
[ 727.288146] ieee80211 phy0: failed to remove key (0, 00:0a:79:eb:56:10) from hardware (-125)
[ 727.288221] wlan0: moving STA 00:0a:79:eb:56:10 to state 2
[ 727.288227] wlan0: moving STA 00:0a:79:eb:56:10 to state 1
[ 727.288234] wlan0: moving STA 00:0a:79:eb:56:10 to state 0
[ 727.288243] iwlwifi 0000:06:00.0: Command REPLY_REMOVE_STA aborted: RF KILL Switch
[ 727.320227] ieee80211 phy0: TX A-MPDU purging 264 packets for tid=0
[ 727.320461] iwlwifi 0000:06:00.0: Command REPLY_RXON aborted: RF KILL Switch
[ 727.320497] iwlwifi 0000:06:00.0: Not sending command - RF KILL
[ 727.320510] iwlwifi 0000:06:00.0: Command REPLY_RXON aborted: RF KILL Switch
[ 727.320557] cfg80211: Calling CRDA to update world regulatory domain
[ 727.340111] iwlwifi 0000:06:00.0: Command REPLY_RXON aborted: RF KILL Switch
[ 727.345213] cfg80211: World regulatory domain updated:
[ 727.345222] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
[ 727.345232] cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[ 727.345241] cfg80211: (2457000 KHz - 2482000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
[ 727.345249] cfg80211: (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
[ 727.345257] cfg80211: (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[ 727.345266] cfg80211: (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[ 735.347967] iwlwifi 0000:06:00.0: RF_KILL bit toggled to enable radio.
[ 735.350637] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 735.353938] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 742.984202] wlan0: authenticate with 00:0a:79:eb:56:10 (try 1)
[ 742.989491] wlan0: authenticated
[ 742.991201] wlan0: associate with 00:0a:79:eb:56:10 (try 1)
[ 742.997018] wlan0: RX AssocResp from 00:0a:79:eb:56:10 (capab=0x411 status=0 aid=2)
[ 742.997037] wlan0: moving STA 00:0a:79:eb:56:10 to state 1
[ 742.997044] wlan0: moving STA 00:0a:79:eb:56:10 to state 2
[ 743.003883] cfg80211: Calling CRDA for country: JP
[ 743.013936] cfg80211: Regulatory domain changed to country: JP
[ 743.013946] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
[ 743.013955] cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (N/A, 2000 mBm)
[ 743.013964] cfg80211: (2457000 KHz - 2482000 KHz @ 20000 KHz), (N/A, 2000 mBm)
[ 743.013971] cfg80211: (2474000 KHz - 2494000 KHz @ 20000 KHz), (N/A, 2000 mBm)
[ 743.013979] cfg80211: (4910000 KHz - 4930000 KHz @ 10000 KHz), (N/A, 2300 mBm)
[ 743.013987] cfg80211: (4910000 KHz - 4990000 KHz @ 40000 KHz), (N/A, 2300 mBm)
[ 743.013994] cfg80211: (4930000 KHz - 4950000 KHz @ 10000 KHz), (N/A, 2300 mBm)
[ 743.014002] cfg80211: (5030000 KHz - 5045000 KHz @ 10000 KHz), (N/A, 2300 mBm)
[ 743.014010] cfg80211: (5030000 KHz - 5090000 KHz @ 40000 KHz), (N/A, 2300 mBm)
[ 743.014017] cfg80211: (5050000 KHz - 5060000 KHz @ 10000 KHz), (N/A, 2300 mBm)
[ 743.014025] cfg80211: (5170000 KHz - 5250000 KHz @ 40000 KHz), (N/A, 2000 mBm)
[ 743.014033] cfg80211: (5250000 KHz - 5330000 KHz @ 40000 KHz), (N/A, 2000 mBm)
[ 743.014040] cfg80211: (5490000 KHz - 5710000 KHz @ 40000 KHz), (N/A, 2300 mBm)
[ 743.641206] wlan0: moving STA 00:0a:79:eb:56:10 to state 3
[ 743.707121] Rx A-MPDU request on tid 0 result 0
[ 754.151203] Rx A-MPDU request on tid 6 result 0
[ 757.526560] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 757.544182] activated addBA response timer on tid 0
[ 758.544087] addBA response timer expired on tid 0
[ 758.544135] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[ 765.471766] switched off addBA timer for tid 0
[ 765.471773] got addBA resp for tid 0 but we already gave up
[ 765.471812] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[ 766.117423] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 766.137379] activated addBA response timer on tid 0
[ 766.139746] switched off addBA timer for tid 0
[ 766.139752] Aggregation is on for tid 0
[ 766.139895] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 807.975974] Open BA session requested for 00:0a:79:eb:56:10 tid 6
[ 807.996257] activated addBA response timer on tid 6
[ 807.998205] switched off addBA timer for tid 6
[ 807.998212] Aggregation is on for tid 6
[ 807.998370] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 6
[ 814.112068] tx session timer expired on tid 6
[ 814.112118] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 6
[ 814.132223] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 6
[ 869.975764] Open BA session requested for 00:0a:79:eb:56:10 tid 6
[ 869.992279] activated addBA response timer on tid 6

etc etc etc etc etc ... but at least now it is working.

Seems that iwlwifi is not really improving in any way.

Please let me know if you want me to test more patches or debugging.

Thanks

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
ADRIGOLE (n.)
The centrepiece of a merry-go-round on which the man with the tickets
stands unnervingly still.
--- Douglas Adams, The Meaning of Liff


2012-01-26 05:49:07

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

Hi Norbert,

>
> syslog output is at http://www.logic.at/people/preining/syslog.log
> this is after a suspend and resume.
>
> first connection after resume succeeded at around kernel time 378.859794
> then it worked for say 20sec or so, hard to say when exactely it dropped,
> but defeinitely at kernel time 451 it was finished.
>
> After that nothing went out or in again.
>
> Hope you can make something out of that.
>

Unfortunately, not really. I can see you are hitting PASSIVE_NO_RX,
meaning that the uCode complains that you are trying to Tx on a
passive channel on which it hasn't heard any beacon yet. But then we
finally get a beacon there, and open the queue again.

Anyway, can you try the attached patch ?
Note that I have no real way to test it (I have to admit I haven't
even compiled... I am not in my office).
I hope it might help.


Attachments:
tid.patch (1.95 kB)

2012-01-25 05:32:46

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

On Tue, Jan 24, 2012 at 16:22, Norbert Preining <[email protected]> wrote:
> Hi Emmanuel,
>
> On Mo, 23 Jan 2012, Emmanuel Grumbach wrote:
>> Can you please run with debug flags 0x80000000 ?
>
> Here we go. Booting into 3.3-rc1 tries to connect, the connection seems
> to be established, but nothing actually goes out/in.
> Main problem seems to be:
> ? ? ? ?iwlagn_tx_agg_stop Can't proceed: ssn 340, next_recl = 329
>
> rfkill block wifi ; sleep 10; rfkill unblock wifi
>
> helps.
>
> I attach the dmesg of all wifi related things since boot.
>

yeah... looks like the same issue as Dave. Can you please try with 0xc0000000 ?
it will be noisy so be prepared...

I will send a patch to clarify one thing about mgmt frames, but as I
said I am not in my office until end of next week.

2012-01-31 02:42:12

by Dave Jones

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

On Thu, Jan 26, 2012 at 04:06:36PM -0800, Emmanuel Grumbach wrote:
> On Thu, Jan 26, 2012 at 15:25, Norbert Preining <[email protected]> wrote:
> > Hi Emmanuel,
> >
> > On Do, 26 Jan 2012, Emmanuel Grumbach wrote:
> >> are suffering from the same issue. I am merging the mail threads here.
> >
> > First tests are promising, after reboot it was working immediately
> > without need to rfkill block/unblock. lso after suspend and resume.
> >
> > Will test more the next days and report back.
> >
> > dmesg output without debugging still the same, but connection
> > continues without any problem:
> > [ ? 62.437916] Open BA session requested for 00:0a:79:eb:56:10 tid 0
> > [ ? 62.456184] activated addBA response timer on tid 0
> > [ ? 63.456042] addBA response timer expired on tid 0
> > [ ? 63.456063] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
> > [ ? 67.008297] switched off addBA timer for tid 0
> > [ ? 67.008305] got addBA resp for tid 0 but we already gave up
> > [ ? 67.008340] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
> > [ ? 67.015503] Open BA session requested for 00:0a:79:eb:56:10 tid 0
> > [ ? 67.031648] activated addBA response timer on tid 0
> > [ ? 67.129907] switched off addBA timer for tid 0
> > [ ? 67.129914] Aggregation is on for tid 0
> > [ ? 67.130076] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
> > [ ?100.288018] tx session timer expired on tid 0
> > [ ?100.288032] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
> > [ ?106.347302] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
> > [ ?106.374760] Open BA session requested for 00:0a:79:eb:56:10 tid 0
> > [ ?106.388178] activated addBA response timer on tid 0
> > [ ?106.390041] switched off addBA timer for tid 0
> > [ ?106.390044] Aggregation is on for tid 0
> > [ ?106.390193] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
> > [ ?135.786765] Open BA session requested for 00:0a:79:eb:56:10 tid 6
> > [ ?135.804164] activated addBA response timer on tid 6
> > [ ?135.806088] switched off addBA timer for tid 6
> > [ ?135.806092] Aggregation is on for tid 6
> > [ ?135.806241] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 6
> > [ ?146.912031] tx session timer expired on tid 6
> > [ ?146.912071] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 6
> > [ ?146.924187] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 6
> > [ ?256.560100] tx session timer expired on tid 0
> > [ ?256.560132] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
> > [ ?256.584379] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
> > [ ?258.477345] Open BA session requested for 00:0a:79:eb:56:10 tid 0
> > [ ?258.500189] activated addBA response timer on tid 0
> > [ ?258.502148] switched off addBA timer for tid 0
> > [ ?258.502151] Aggregation is on for tid 0
> > [ ?258.502340] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
> >
> > Later on after resume I got a few hickups but the connection just
> > hang for a shor ttime, then came back:
> > [ ?589.895086] delba from 00:0a:79:eb:56:10 (initiator) tid 0 reason code 39
> > [ ?589.895095] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 inititator reason: 0
> > [ ?589.900879] Rx A-MPDU request on tid 0 result 0
> > [ ?589.900950] unexpected AddBA Req from 00:0a:79:eb:56:10 on tid 0
> > [ ?589.900958] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 recipient reason: 32
> > [ ?589.902649] Rx A-MPDU request on tid 0 result 0
> > [ ?589.902690] unexpected AddBA Req from 00:0a:79:eb:56:10 on tid 0
> > [ ?589.902698] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 recipient reason: 32
> > [ ?589.903732] Rx A-MPDU request on tid 0 result 0
> > [ ?589.903810] unexpected AddBA Req from 00:0a:79:eb:56:10 on tid 0
> > [ ?589.903817] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 recipient reason: 32
> > [ ?589.904965] Rx A-MPDU request on tid 0 result 0
> > [ ?589.905045] unexpected AddBA Req from 00:0a:79:eb:56:10 on tid 0
> > [ ?589.905053] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 recipient reason: 32
> > [ ?589.906510] Rx A-MPDU request on tid 0 result 0
> > [ ?590.456062] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
> > [ ?590.456075] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
> > [ ?590.456083] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
> > [ ?594.413810] delba from 00:0a:79:eb:56:10 (initiator) tid 0 reason code 39
> > [ ?594.413822] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 inititator reason: 0
> > [ ?594.428073] Rx A-MPDU request on tid 0 result 0
> >
> >
> > Generally fine.
> >
> > Thanks, more reports later.
> >
>
> Dave ?

after running with it all evening, it seems to be working fine so far.

(Sorry for delay, been on vacation for a week)

Dave


2012-01-24 04:47:16

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

On Mon, Jan 23, 2012 at 17:36, Norbert Preining <[email protected]> wrote:
> Dear all,
>
> (please cc)
>
> long time ago we had a long discussion about iwlagn regressions in 3.1.0
> Up to 3.2 I still have regular problems on the university network
> with stuck iwl driver not even realizing that nothing goes out or in.
>
> Today I switched to 3.3-rc1 to see it is getting even worse, now on my
> home network I get stuck queue and no connection:
>

Can you please run with debug flags 0x80000000 ?
We will probably need to enable TX_REPLY too: 0xC0000000 but this one
is noisy, so let's begin the 0x80000000

2012-01-25 00:23:05

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

Hi Emmanuel,

On Mo, 23 Jan 2012, Emmanuel Grumbach wrote:
> Can you please run with debug flags 0x80000000 ?

Here we go. Booting into 3.3-rc1 tries to connect, the connection seems
to be established, but nothing actually goes out/in.
Main problem seems to be:
iwlagn_tx_agg_stop Can't proceed: ssn 340, next_recl = 329

rfkill block wifi ; sleep 10; rfkill unblock wifi

helps.

I attach the dmesg of all wifi related things since boot.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
WHAPLODE DROVE (n.)
A homicidal golf stroke.
--- Douglas Adams, The Meaning of Liff


Attachments:
(No filename) (914.00 B)
dmesg.txt (17.30 kB)
Download all attachments

2012-01-27 01:50:27

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

Hi Emmanuel,

On Fr, 27 Jan 2012, Norbert Preining wrote:
> First tests are promising, after reboot it was working immediately
> without need to rfkill block/unblock. lso after suspend and resume.
>
> Will test more the next days and report back.

Unfortunately, at the university it is still a complete no-go.
Usually the connection works for a short time, then breaks down.
After that even unloading and loading the module did not reactivate
it, I cannot get a connection at all. But other units, or with
older kernel (it was 2.6.3X AFAIR) it was working without a glitch.

I uploaded a syslog output including kernel and network manager logs
to
http://www.logic.at/people/preining/syslog.log
(new one). This shows a session from loading the module up to giving up.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
AITH (n.)
The single bristle that sticks out sideways on a cheap paintbrush.
--- Douglas Adams, The Meaning of Liff

2012-01-27 03:11:59

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

> Hi Emmanuel,
>
> On Fr, 27 Jan 2012, Norbert Preining wrote:
>> First tests are promising, after reboot it was working immediately
>> without need to rfkill block/unblock. lso after suspend and resume.
>>
>> Will test more the next days and report back.
>
> Unfortunately, at the university it is still a complete no-go.
> Usually the connection works for a short time, then breaks down.
> After that even unloading and loading the module did not reactivate
> it, I cannot get a connection at all. But other units, or with
> older kernel (it was 2.6.3X AFAIR) it was working without a glitch.
>
> I uploaded a syslog output including kernel and network manager logs
> to
> ? ? ? ?http://www.logic.at/people/preining/syslog.log
> (new one). This shows a session from loading the module up to giving up.
>

I glanced at the logs, and they look healthy from the wifi driver
side. You just don't get any reply to DHCP_DISCOVER apparently... can
you get a sniffer ?
I am pretty sure that the packet in sent in the air, but if you can
get a capture of that we could check that out.

2012-01-27 00:06:38

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

On Thu, Jan 26, 2012 at 15:25, Norbert Preining <[email protected]> wrote:
> Hi Emmanuel,
>
> On Do, 26 Jan 2012, Emmanuel Grumbach wrote:
>> are suffering from the same issue. I am merging the mail threads here.
>
> First tests are promising, after reboot it was working immediately
> without need to rfkill block/unblock. lso after suspend and resume.
>
> Will test more the next days and report back.
>
> dmesg output without debugging still the same, but connection
> continues without any problem:
> [ ? 62.437916] Open BA session requested for 00:0a:79:eb:56:10 tid 0
> [ ? 62.456184] activated addBA response timer on tid 0
> [ ? 63.456042] addBA response timer expired on tid 0
> [ ? 63.456063] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
> [ ? 67.008297] switched off addBA timer for tid 0
> [ ? 67.008305] got addBA resp for tid 0 but we already gave up
> [ ? 67.008340] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
> [ ? 67.015503] Open BA session requested for 00:0a:79:eb:56:10 tid 0
> [ ? 67.031648] activated addBA response timer on tid 0
> [ ? 67.129907] switched off addBA timer for tid 0
> [ ? 67.129914] Aggregation is on for tid 0
> [ ? 67.130076] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
> [ ?100.288018] tx session timer expired on tid 0
> [ ?100.288032] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
> [ ?106.347302] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
> [ ?106.374760] Open BA session requested for 00:0a:79:eb:56:10 tid 0
> [ ?106.388178] activated addBA response timer on tid 0
> [ ?106.390041] switched off addBA timer for tid 0
> [ ?106.390044] Aggregation is on for tid 0
> [ ?106.390193] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
> [ ?135.786765] Open BA session requested for 00:0a:79:eb:56:10 tid 6
> [ ?135.804164] activated addBA response timer on tid 6
> [ ?135.806088] switched off addBA timer for tid 6
> [ ?135.806092] Aggregation is on for tid 6
> [ ?135.806241] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 6
> [ ?146.912031] tx session timer expired on tid 6
> [ ?146.912071] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 6
> [ ?146.924187] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 6
> [ ?256.560100] tx session timer expired on tid 0
> [ ?256.560132] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
> [ ?256.584379] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
> [ ?258.477345] Open BA session requested for 00:0a:79:eb:56:10 tid 0
> [ ?258.500189] activated addBA response timer on tid 0
> [ ?258.502148] switched off addBA timer for tid 0
> [ ?258.502151] Aggregation is on for tid 0
> [ ?258.502340] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
>
> Later on after resume I got a few hickups but the connection just
> hang for a shor ttime, then came back:
> [ ?589.895086] delba from 00:0a:79:eb:56:10 (initiator) tid 0 reason code 39
> [ ?589.895095] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 inititator reason: 0
> [ ?589.900879] Rx A-MPDU request on tid 0 result 0
> [ ?589.900950] unexpected AddBA Req from 00:0a:79:eb:56:10 on tid 0
> [ ?589.900958] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 recipient reason: 32
> [ ?589.902649] Rx A-MPDU request on tid 0 result 0
> [ ?589.902690] unexpected AddBA Req from 00:0a:79:eb:56:10 on tid 0
> [ ?589.902698] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 recipient reason: 32
> [ ?589.903732] Rx A-MPDU request on tid 0 result 0
> [ ?589.903810] unexpected AddBA Req from 00:0a:79:eb:56:10 on tid 0
> [ ?589.903817] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 recipient reason: 32
> [ ?589.904965] Rx A-MPDU request on tid 0 result 0
> [ ?589.905045] unexpected AddBA Req from 00:0a:79:eb:56:10 on tid 0
> [ ?589.905053] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 recipient reason: 32
> [ ?589.906510] Rx A-MPDU request on tid 0 result 0
> [ ?590.456062] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
> [ ?590.456075] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
> [ ?590.456083] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
> [ ?594.413810] delba from 00:0a:79:eb:56:10 (initiator) tid 0 reason code 39
> [ ?594.413822] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 inititator reason: 0
> [ ?594.428073] Rx A-MPDU request on tid 0 result 0
>
>
> Generally fine.
>
> Thanks, more reports later.
>

Dave ?

2012-01-26 23:26:12

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

Hi Emmanuel,

On Do, 26 Jan 2012, Emmanuel Grumbach wrote:
> are suffering from the same issue. I am merging the mail threads here.

First tests are promising, after reboot it was working immediately
without need to rfkill block/unblock. lso after suspend and resume.

Will test more the next days and report back.

dmesg output without debugging still the same, but connection
continues without any problem:
[ 62.437916] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 62.456184] activated addBA response timer on tid 0
[ 63.456042] addBA response timer expired on tid 0
[ 63.456063] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[ 67.008297] switched off addBA timer for tid 0
[ 67.008305] got addBA resp for tid 0 but we already gave up
[ 67.008340] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[ 67.015503] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 67.031648] activated addBA response timer on tid 0
[ 67.129907] switched off addBA timer for tid 0
[ 67.129914] Aggregation is on for tid 0
[ 67.130076] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 100.288018] tx session timer expired on tid 0
[ 100.288032] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[ 106.347302] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[ 106.374760] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 106.388178] activated addBA response timer on tid 0
[ 106.390041] switched off addBA timer for tid 0
[ 106.390044] Aggregation is on for tid 0
[ 106.390193] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 135.786765] Open BA session requested for 00:0a:79:eb:56:10 tid 6
[ 135.804164] activated addBA response timer on tid 6
[ 135.806088] switched off addBA timer for tid 6
[ 135.806092] Aggregation is on for tid 6
[ 135.806241] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 6
[ 146.912031] tx session timer expired on tid 6
[ 146.912071] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 6
[ 146.924187] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 6
[ 256.560100] tx session timer expired on tid 0
[ 256.560132] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[ 256.584379] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[ 258.477345] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 258.500189] activated addBA response timer on tid 0
[ 258.502148] switched off addBA timer for tid 0
[ 258.502151] Aggregation is on for tid 0
[ 258.502340] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0

Later on after resume I got a few hickups but the connection just
hang for a shor ttime, then came back:
[ 589.895086] delba from 00:0a:79:eb:56:10 (initiator) tid 0 reason code 39
[ 589.895095] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 inititator reason: 0
[ 589.900879] Rx A-MPDU request on tid 0 result 0
[ 589.900950] unexpected AddBA Req from 00:0a:79:eb:56:10 on tid 0
[ 589.900958] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 recipient reason: 32
[ 589.902649] Rx A-MPDU request on tid 0 result 0
[ 589.902690] unexpected AddBA Req from 00:0a:79:eb:56:10 on tid 0
[ 589.902698] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 recipient reason: 32
[ 589.903732] Rx A-MPDU request on tid 0 result 0
[ 589.903810] unexpected AddBA Req from 00:0a:79:eb:56:10 on tid 0
[ 589.903817] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 recipient reason: 32
[ 589.904965] Rx A-MPDU request on tid 0 result 0
[ 589.905045] unexpected AddBA Req from 00:0a:79:eb:56:10 on tid 0
[ 589.905053] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 recipient reason: 32
[ 589.906510] Rx A-MPDU request on tid 0 result 0
[ 590.456062] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
[ 590.456075] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
[ 590.456083] ieee80211 phy0: release an RX reorder frame due to timeout on earlier frames
[ 594.413810] delba from 00:0a:79:eb:56:10 (initiator) tid 0 reason code 39
[ 594.413822] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 inititator reason: 0
[ 594.428073] Rx A-MPDU request on tid 0 result 0


Generally fine.

Thanks, more reports later.

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
FAIRYMOUNT (vb.n.)
Polite word for buggery.
--- Douglas Adams, The Meaning of Liff

2012-01-26 00:37:53

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

Hi Emmanuel,

On Di, 24 Jan 2012, Emmanuel Grumbach wrote:
> yeah... looks like the same issue as Dave. Can you please try with 0xc0000000 ?
> it will be noisy so be prepared...

syslog output is at http://www.logic.at/people/preining/syslog.log
this is after a suspend and resume.

first connection after resume succeeded at around kernel time 378.859794
then it worked for say 20sec or so, hard to say when exactely it dropped,
but defeinitely at kernel time 451 it was finished.

After that nothing went out or in again.

Hope you can make something out of that.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
SKETTY (n.)
Apparently self-propelled little dance a beer glass performs in its
own puddle.
--- Douglas Adams, The Meaning of Liff

2012-01-26 21:13:07

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

Hey Dave,

Can you please try to patch attached to this thread. You and Norbert
are suffering from the same issue. I am merging the mail threads here.

On Wed, Jan 25, 2012 at 21:49, Emmanuel Grumbach <[email protected]> wrote:
> Hi Norbert,
>
>>
>> syslog output is at http://www.logic.at/people/preining/syslog.log
>> this is after a suspend and resume.
>>
>> first connection after resume succeeded at around kernel time 378.859794
>> then it worked for say 20sec or so, hard to say when exactely it dropped,
>> but defeinitely at kernel time 451 it was finished.
>>
>> After that nothing went out or in again.
>>
>> Hope you can make something out of that.
>>
>
> Unfortunately, not really. I can see you are hitting PASSIVE_NO_RX,
> meaning that the uCode complains that you are trying to Tx on a
> passive channel on which it hasn't heard any beacon yet. But then we
> finally get a beacon there, and open the queue again.
>
> Anyway, can you try the attached patch ?
> Note that I have no real way to test it (I have to admit I haven't
> even compiled... I am not in my office).
> I hope it might help.

2012-02-22 08:37:51

by Pekka Enberg

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

On Tue, Jan 24, 2012 at 3:36 AM, Norbert Preining <[email protected]> wrote:
> long time ago we had a long discussion about iwlagn regressions in 3.1.0
> Up to 3.2 I still have regular problems on the university network
> with stuck iwl driver not even realizing that nothing goes out or in.
>
> Today I switched to 3.3-rc1 to see it is getting even worse, now on my
> home network I get stuck queue and no connection:

I'm seeing the same thing with my home network as well with 3.3-rc3.
Is there a fix in the pipeline for 3.3 final or do you want me to
bisect which commits to revert?

Pekka

2012-02-23 07:06:18

by Pekka Enberg

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

Hi Emmanuel,

On Wed, Feb 22, 2012 at 10:54 AM, Emmanuel Grumbach <[email protected]> wrote:
> Please check this one:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=3d29dd9b5b160ba4542a9b8f869a220559e633a0
>
> it is in 3.3-rc4

I used 3.3-rc4 yesterday and didn't notice the problem with my home
network. Is the patch supposed to fix Norbert's issue as well?

Pekka

2012-02-27 08:37:04

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

Hi Emmanuel,

sorry for the silence, I wasn't at the university for a few days so
I couldn't test.

Now I did:
kernel 3.3-rc5

On Mi, 22 Feb 2012, Emmanuel Grumbach wrote:
> From the log, I can see that we have a lot of "passive channel
> failures". Can you try to disable 11n (module_parameter) ?
> Please also try with debug=0xc0000000

Disabling 11n and adding this parameter created loads of output, but
still makes the connection break down and reconnection does not work
anymore. I attach a (small) part of the syslog file. I had
loads *LOADS* (200000+) of messages
Feb 27 12:58:04 mithrandir kernel: [ 1447.835281] iwlwifi 0000:06:00.0: I iwlagn_rx_reply_tx Next reclaimed packet:3333
Feb 27 12:58:04 mithrandir kernel: [ 1447.835294] iwlwifi 0000:06:00.0: I iwl_trans_pcie_reclaim [Q 2 | AC 2] 213 -> 214 (470)
in the log. Then there is something about "TX on passive channel"
and from then on everything goes off.

> network. Is the patch supposed to fix Norbert's issue as well?

Apparently not.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
FRADDAM (n.)
The small awkward-shaped piece of cheese which remains after grating a
large regular-shaped piece of cheese and enables you to cut your
fingers.
--- Douglas Adams, The Meaning of Liff


Attachments:
(No filename) (1.56 kB)
syslog (22.37 kB)
Download all attachments

2012-02-22 00:38:04

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

Hi Emmanuel,

sorry to come back soo late to that matter ... I was *really* busy
with real work.

On Do, 26 Jan 2012, Emmanuel Grumbach wrote:
> >
> > On Fr, 27 Jan 2012, Norbert Preining wrote:
> >> First tests are promising, after reboot it was working immediately
> >> without need to rfkill block/unblock. lso after suspend and resume.
> >>
> >> Will test more the next days and report back.
> >
> > Unfortunately, at the university it is still a complete no-go.
> > Usually the connection works for a short time, then breaks down.
> > After that even unloading and loading the module did not reactivate
> > it, I cannot get a connection at all. But other units, or with
> > older kernel (it was 2.6.3X AFAIR) it was working without a glitch.
> >
> > I uploaded a syslog output including kernel and network manager logs
> > to
> > ? ? ? ?http://www.logic.at/people/preining/syslog.log
> > (new one). This shows a session from loading the module up to giving up.
> >
>
> I glanced at the logs, and they look healthy from the wifi driver
> side. You just don't get any reply to DHCP_DISCOVER apparently... can
> you get a sniffer ?
> I am pretty sure that the packet in sent in the air, but if you can
> get a capture of that we could check that out.

I still see that, on 3.3-rc4, and it is the same as usual. The interface
believes it is up and connected, but nothing works.

I am *100%* sure that this is related to the driver, because in old
revisions (somewhen around 2.6.27 or so) it was working without
any problem, and when it started I reported it long time ago.

Anyway, today it was really hopeless again, and the exact time it always
hangs is when the kernel driver spits out:
Rx A-MPDU request on tid 0 result 0
and with debugging on I get in addition:
ieee80211 phy3: release an RX reorder frame due to timeout on earlier frames
that is where it all goes down the gully, without any reaction from the
outside world suddenly. Before ping was running, then off.

I uploaded a new syslog.log in the above location that shows 5 min
or so of trial and error.

I don't know what else to provide then that, but it is definitely
a real problem, I cannot work without cable anymore at the university,
which is a serious rpbolem.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
HAUGHAM (n.)
One who loudly informs other diners in a restaurant what kind of man
he is by calling for the chef by his christian name from the lobby.
--- Douglas Adams, The Meaning of Liff

2012-02-22 06:58:13

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

On Wed, Feb 22, 2012 at 02:37, Norbert Preining <[email protected]> wrote:
>
> Hi Emmanuel,
>
> sorry to come back soo late to that matter ... I was *really* busy
> with real work.
>
> On Do, 26 Jan 2012, Emmanuel Grumbach wrote:
> > >
> > > On Fr, 27 Jan 2012, Norbert Preining wrote:
> > >> First tests are promising, after reboot it was working immediately
> > >> without need to rfkill block/unblock. lso after suspend and resume.
> > >>
> > >> Will test more the next days and report back.
> > >
> > > Unfortunately, at the university it is still a complete no-go.
> > > Usually the connection works for a short time, then breaks down.
> > > After that even unloading and loading the module did not reactivate
> > > it, I cannot get a connection at all. But other units, or with
> > > older kernel (it was 2.6.3X AFAIR) it was working without a glitch.
> > >
> > > I uploaded a syslog output including kernel and network manager logs
> > > to
> > > ? ? ? ?http://www.logic.at/people/preining/syslog.log
> > > (new one). This shows a session from loading the module up to giving
> > > up.
> > >
> >
> > I glanced at the logs, and they look healthy from the wifi driver
> > side. You just don't get any reply to DHCP_DISCOVER apparently... can
> > you get a sniffer ?
> > I am pretty sure that the packet in sent in the air, but if you can
> > get a capture of that we could check that out.
>
> I still see that, on 3.3-rc4, and it is the same as usual. The interface
> believes it is up and connected, but nothing works.
>
> I am *100%* sure that this is related to the driver, because in old
> revisions (somewhen around 2.6.27 or so) it was working without
> any problem, and when it started I reported it long time ago.
>
> Anyway, today it was really hopeless again, and the exact time it always
> hangs is when the kernel driver spits out:
> ? ? ? ?Rx A-MPDU request on tid 0 result 0
> and with debugging on I get in addition:
> ? ? ? ?ieee80211 phy3: release an RX reorder frame due to timeout on
> earlier frames
> that is where it all goes down the gully, without any reaction from the
> outside world suddenly. Before ping was running, then off.
>
> I uploaded a new syslog.log in the above location that shows 5 min
> or so of trial and error.

>From the log, I can see that we have a lot of "passive channel
failures". Can you try to disable 11n (module_parameter) ?
Please also try with debug=0xc0000000

2012-02-01 15:53:12

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

On Wed, 2012-02-01 at 10:43 -0500, Dave Jones wrote:
> On Mon, Jan 30, 2012 at 09:41:43PM -0500, Dave Jones wrote:
>
> > > > Generally fine.
> > > >
> > > > Thanks, more reports later.
> > >
> > > Dave ?
> >
> > after running with it all evening, it seems to be working fine so far.
>
> Two days later, it hasn't locked up at all, so this is definitly fixing my problem.
>
> One thing though, dmesg is flooded with these..
>
> [ 4384.849732] iwlwifi 0000:04:00.0: Tx aggregation enabled on ra = b8:c7:5d:0c:39:87 tid = 0
> [ 4434.332275] iwlwifi 0000:04:00.0: Tx aggregation enabled on ra = b8:c7:5d:0c:39:87 tid = 0
> [ 4475.309883] iwlwifi 0000:04:00.0: Tx aggregation enabled on ra = b8:c7:5d:0c:39:87 tid = 0
>
> Can it be supressed if the ra hasn't changed ?

John has a patch to fix this :-)

Wey
>



2012-02-27 18:01:12

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

> sorry for the silence, I wasn't at the university for a few days so
> I couldn't test.
>
> Now I did:
> kernel 3.3-rc5
>
> On Mi, 22 Feb 2012, Emmanuel Grumbach wrote:
>> From the log, I can see that we have a lot of "passive channel
>> failures". Can you try to disable 11n (module_parameter) ?
>> Please also try with debug=0xc0000000

So this doesn't seem to be related to my rework of the AMPDU logic....
Will keep you updated if I think about something.

2012-02-22 08:54:22

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

Please check this one:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=3d29dd9b5b160ba4542a9b8f869a220559e633a0

it is in 3.3-rc4


On Wed, Feb 22, 2012 at 10:40, Pekka Enberg <[email protected]> wrote:
> On Wed, Feb 22, 2012 at 2:37 AM, Norbert Preining <[email protected]> wrote:
>> I still see that, on 3.3-rc4, and it is the same as usual. The interface
>> believes it is up and connected, but nothing works.
>>
>> I am *100%* sure that this is related to the driver, because in old
>> revisions (somewhen around 2.6.27 or so) it was working without
>> any problem, and when it started I reported it long time ago.
>
> It's definitely a kernel regression, no question about it. Was 3.3-rc1
> already broken for you? It shouldn't be that difficult to find the
> offending commit with git bisect if it's that easy to reproduce for
> you.
>
> ? ? ? ? ? ? ? ? ? ? ? ?Pekka

2012-02-01 15:44:23

by Dave Jones

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

On Mon, Jan 30, 2012 at 09:41:43PM -0500, Dave Jones wrote:

> > > Generally fine.
> > >
> > > Thanks, more reports later.
> >
> > Dave ?
>
> after running with it all evening, it seems to be working fine so far.

Two days later, it hasn't locked up at all, so this is definitly fixing my problem.

One thing though, dmesg is flooded with these..

[ 4384.849732] iwlwifi 0000:04:00.0: Tx aggregation enabled on ra = b8:c7:5d:0c:39:87 tid = 0
[ 4434.332275] iwlwifi 0000:04:00.0: Tx aggregation enabled on ra = b8:c7:5d:0c:39:87 tid = 0
[ 4475.309883] iwlwifi 0000:04:00.0: Tx aggregation enabled on ra = b8:c7:5d:0c:39:87 tid = 0

Can it be supressed if the ra hasn't changed ?

Dave


2012-02-27 22:42:37

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

On Mo, 27 Feb 2012, Emmanuel Grumbach wrote:
> So this doesn't seem to be related to my rework of the AMPDU logic....
> Will keep you updated if I think about something.

Thanks.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
MEATHOP (n.)
One who sets off for the scene of an aircraft crash with a picnic
hamper.
--- Douglas Adams, The Meaning of Liff

2012-02-22 08:40:44

by Pekka Enberg

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

On Wed, Feb 22, 2012 at 2:37 AM, Norbert Preining <[email protected]> wrote:
> I still see that, on 3.3-rc4, and it is the same as usual. The interface
> believes it is up and connected, but nothing works.
>
> I am *100%* sure that this is related to the driver, because in old
> revisions (somewhen around 2.6.27 or so) it was working without
> any problem, and when it started I reported it long time ago.

It's definitely a kernel regression, no question about it. Was 3.3-rc1
already broken for you? It shouldn't be that difficult to find the
offending commit with git bisect if it's that easy to reproduce for
you.

Pekka

2012-03-13 23:39:17

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting even worse with 3.3-rc1

Hi Emmanuel,

(now at -rc6)

On Mo, 27 Feb 2012, Emmanuel Grumbach wrote:
> > Now I did:
> > kernel 3.3-rc5
> >
> > On Mi, 22 Feb 2012, Emmanuel Grumbach wrote:
> >> From the log, I can see that we have a lot of "passive channel
> >> failures". Can you try to disable 11n (module_parameter) ?
> >> Please also try with debug=0xc0000000
>
> So this doesn't seem to be related to my rework of the AMPDU logic....
> Will keep you updated if I think about something.

Another thing ... I don't know what the hack is going on here, since
-rc6 I think I get *loads* of messages:
delba from 00:0a:79:eb:56:10 (initiator) tid 0 reason code 39
After last reboot in about 10min (920.95) I got 414 of these lines.
Always in a triple:
[ 914.882870] delba from 00:0a:79:eb:56:10 (initiator) tid 0 reason code 39
[ 914.882881] Rx BA session stop requested for 00:0a:79:eb:56:10 tid 0 inititator reason: 0
[ 915.012136] Rx A-MPDU request on tid 0 result 0
[ 916.157983] delba from 00:0a:79:eb:56:10 (initiator) tid 0 reason code 39
...

No idea if that helps in any way, but it is new.

Interestingly, after doing rfkill block wifi; sleep 10; rfkill unblock wifi,
and that sometimes 2 or three times, at some point these messages disappear
and I have only:
[18876.624068] tx session timer expired on tid 0
[18876.624117] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[18876.644224] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[18877.609764] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[18877.640279] activated addBA response timer on tid 0
[18877.642218] switched off addBA timer for tid 0
[18877.642224] Aggregation is on for tid 0
(again many of them)

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
MELCOMBE REGIS (n.)
The name of the style of decoration used in cocktail lounges in mock
Tudor hotels in Surrey.
--- Douglas Adams, The Meaning of Liff