2012-02-22 00:38:06

by Norbert Preining

Subject: Re: iwlagn is getting even worse with 3.3-rc1

Hi Emmanuel,

sorry to come back so late to this matter ... I was *really* busy
with real work.

On Do, 26 Jan 2012, Emmanuel Grumbach wrote:
> >
> > On Fr, 27 Jan 2012, Norbert Preining wrote:
> >> First tests are promising, after reboot it was working immediately
> >> without need to rfkill block/unblock. Also after suspend and resume.
> >>
> >> Will test more the next days and report back.
> >
> > Unfortunately, at the university it is still a complete no-go.
> > Usually the connection works for a short time, then breaks down.
> > After that even unloading and loading the module did not reactivate
> > it, I cannot get a connection at all. But other units, or with
> > older kernel (it was 2.6.3X AFAIR) it was working without a glitch.
> >
> > I uploaded a syslog output including kernel and network manager logs
> > to
> >     http://www.logic.at/people/preining/syslog.log
> > (new one). This shows a session from loading the module up to giving up.
> >
>
> I glanced at the logs, and they look healthy from the wifi driver
> side. You just don't get any reply to DHCP_DISCOVER apparently... can
> you get a sniffer ?
> I am pretty sure that the packet is sent in the air, but if you can
> get a capture of that we could check that out.

I still see that, on 3.3-rc4, and it is the same as usual. The interface
believes it is up and connected, but nothing works.

I am *100%* sure that this is related to the driver, because in old
revisions (sometime around 2.6.27 or so) it was working without
any problem, and when it started I reported it a long time ago.

Anyway, today it was really hopeless again, and the exact moment it always
hangs is when the kernel driver spits out:
        Rx A-MPDU request on tid 0 result 0
and with debugging on I get in addition:
        ieee80211 phy3: release an RX reorder frame due to timeout on earlier frames
That is where it all goes down the gully: suddenly there is no reaction
from the outside world anymore. Before that, ping was running; then it
stopped.

I uploaded a new syslog.log in the above location that shows 5 min
or so of trial and error.

I don't know what else to provide other than that, but it is definitely
a real problem: I can no longer work at the university without a cable,
which is a serious issue.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
HAUGHAM (n.)
One who loudly informs other diners in a restaurant what kind of man
he is by calling for the chef by his christian name from the lobby.
--- Douglas Adams, The Meaning of Liff


2012-02-22 06:58:15

by Emmanuel Grumbach

Subject: Re: iwlagn is getting even worse with 3.3-rc1

On Wed, Feb 22, 2012 at 02:37, Norbert Preining <[email protected]> wrote:
>
> Hi Emmanuel,
>
> sorry to come back so late to this matter ... I was *really* busy
> with real work.
>
> On Do, 26 Jan 2012, Emmanuel Grumbach wrote:
> > >
> > > On Fr, 27 Jan 2012, Norbert Preining wrote:
> > >> First tests are promising, after reboot it was working immediately
> > >> without need to rfkill block/unblock. Also after suspend and resume.
> > >>
> > >> Will test more the next days and report back.
> > >
> > > Unfortunately, at the university it is still a complete no-go.
> > > Usually the connection works for a short time, then breaks down.
> > > After that even unloading and loading the module did not reactivate
> > > it, I cannot get a connection at all. But other units, or with
> > > older kernel (it was 2.6.3X AFAIR) it was working without a glitch.
> > >
> > > I uploaded a syslog output including kernel and network manager logs
> > > to
> > >     http://www.logic.at/people/preining/syslog.log
> > > (new one). This shows a session from loading the module up to giving
> > > up.
> > >
> >
> > I glanced at the logs, and they look healthy from the wifi driver
> > side. You just don't get any reply to DHCP_DISCOVER apparently... can
> > you get a sniffer ?
> > I am pretty sure that the packet is sent in the air, but if you can
> > get a capture of that we could check that out.
>
> I still see that, on 3.3-rc4, and it is the same as usual. The interface
> believes it is up and connected, but nothing works.
>
> I am *100%* sure that this is related to the driver, because in old
> revisions (sometime around 2.6.27 or so) it was working without
> any problem, and when it started I reported it a long time ago.
>
> Anyway, today it was really hopeless again, and the exact time it always
> hangs is when the kernel driver spits out:
>         Rx A-MPDU request on tid 0 result 0
> and with debugging on I get in addition:
>         ieee80211 phy3: release an RX reorder frame due to timeout on earlier frames
> that is where it all goes down the gully, without any reaction from the
> outside world suddenly. Before ping was running, then off.
>
> I uploaded a new syslog.log in the above location that shows 5 min
> or so of trial and error.

From the log, I can see that we have a lot of "passive channel
failures". Can you try to disable 11n (module_parameter) ?
Please also try with debug=0xc0000000
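
For reference, the debug value is a bitmask. The following is a minimal Python sketch, not driver code, that only shows which bit positions 0xc0000000 sets. What each bit enables is defined by the driver's debug header and is not stated in this thread, so no flag names are assumed here; the helper name set_bits is made up for illustration.

```python
def set_bits(mask: int) -> list[int]:
    """Return the positions (0 = least significant) of the bits set in mask."""
    return [pos for pos in range(mask.bit_length()) if (mask >> pos) & 1]


if __name__ == "__main__":
    debug_mask = 0xC0000000  # value suggested above for the debug parameter
    print(set_bits(debug_mask))  # the two highest bits of a 32-bit mask
```

So the request amounts to turning on exactly two debug categories, bits 30 and 31.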

2012-02-22 08:40:45

by Pekka Enberg

Subject: Re: iwlagn is getting even worse with 3.3-rc1

On Wed, Feb 22, 2012 at 2:37 AM, Norbert Preining <[email protected]> wrote:
> I still see that, on 3.3-rc4, and it is the same as usual. The interface
> believes it is up and connected, but nothing works.
>
> I am *100%* sure that this is related to the driver, because in old
> revisions (sometime around 2.6.27 or so) it was working without
> any problem, and when it started I reported it a long time ago.

It's definitely a kernel regression, no question about it. Was 3.3-rc1
already broken for you? It shouldn't be that difficult to find the
offending commit with git bisect if it's that easy to reproduce for
you.

Pekka
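
To illustrate why bisecting is cheap when the failure reproduces reliably, here is a toy Python model of the search that git bisect performs. It is a sketch under simplifying assumptions (a linear history, a known-good first commit, a known-bad last commit, and a test that never gives a wrong answer); real git bisect also handles merges and skipped commits. The names first_bad and is_bad, and the commit labels, are hypothetical.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> tuple[str, int]:
    """Binary-search a linear history for the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and that every commit
    after the first bad one is bad as well. Returns the first bad commit
    and how many commits had to be tested.
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1  # one build-boot-test cycle in real life
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad], tests


if __name__ == "__main__":
    history = [f"c{i}" for i in range(1000)]           # made-up commit labels
    culprit, tests = first_bad(history, lambda c: int(c[1:]) >= 613)
    print(culprit, tests)
```

The number of test cycles grows with the logarithm of the number of commits between the good and the bad kernel, which is why a reliable reproducer matters more than the size of the range.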

2012-02-22 08:54:23

by Emmanuel Grumbach

Subject: Re: iwlagn is getting even worse with 3.3-rc1

Please check this one:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=3d29dd9b5b160ba4542a9b8f869a220559e633a0

it is in 3.3-rc4


On Wed, Feb 22, 2012 at 10:40, Pekka Enberg <[email protected]> wrote:
> On Wed, Feb 22, 2012 at 2:37 AM, Norbert Preining <[email protected]> wrote:
>> I still see that, on 3.3-rc4, and it is the same as usual. The interface
>> believes it is up and connected, but nothing works.
>>
>> I am *100%* sure that this is related to the driver, because in old
>> revisions (sometime around 2.6.27 or so) it was working without
>> any problem, and when it started I reported it a long time ago.
>
> It's definitely a kernel regression, no question about it. Was 3.3-rc1
> already broken for you? It shouldn't be that difficult to find the
> offending commit with git bisect if it's that easy to reproduce for
> you.
>
>                         Pekka

2012-02-23 07:06:20

by Pekka Enberg

Subject: Re: iwlagn is getting even worse with 3.3-rc1

Hi Emmanuel,

On Wed, Feb 22, 2012 at 10:54 AM, Emmanuel Grumbach <[email protected]> wrote:
> Please check this one:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=3d29dd9b5b160ba4542a9b8f869a220559e633a0
>
> it is in 3.3-rc4

I used 3.3-rc4 yesterday and didn't notice the problem with my home
network. Is the patch supposed to fix Norbert's issue as well?

Pekka

2012-02-27 08:37:06

by Norbert Preining

Subject: Re: iwlagn is getting even worse with 3.3-rc1

Hi Emmanuel,

sorry for the silence, I wasn't at the university for a few days so
I couldn't test.

Now I did:
kernel 3.3-rc5

On Mi, 22 Feb 2012, Emmanuel Grumbach wrote:
> From the log, I can see that we have a lot of "passive channel
> failures". Can you try to disable 11n (module_parameter) ?
> Please also try with debug=0xc0000000

Disabling 11n and adding this parameter created loads of output, but
the connection still breaks down and reconnecting no longer works.
I attach a (small) part of the syslog file. I had
loads, *LOADS* (200000+), of messages like
        Feb 27 12:58:04 mithrandir kernel: [ 1447.835281] iwlwifi 0000:06:00.0: I iwlagn_rx_reply_tx Next reclaimed packet:3333
        Feb 27 12:58:04 mithrandir kernel: [ 1447.835294] iwlwifi 0000:06:00.0: I iwl_trans_pcie_reclaim [Q 2 | AC 2] 213 -> 214 (470)
in the log. Then there is something about "TX on passive channel",
and from then on everything goes off.
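
With that many lines, a per-function tally can make the log easier to share and read. The following is a minimal Python sketch that assumes every debug line has the shape of the two lines quoted above ("iwlwifi <device>: <level letter> <function name> <message>"); the regular expression and the name count_by_function are illustrative and not part of any existing tool.

```python
import re
from collections import Counter

# Shape assumed from the two sample lines above:
#   "... iwlwifi 0000:06:00.0: I <function_name> <message>"
DEBUG_LINE = re.compile(r"iwlwifi \S+: \w (\w+) ")


def count_by_function(lines):
    """Count debug messages per emitting function name."""
    counts = Counter()
    for line in lines:
        match = DEBUG_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Feb 27 12:58:04 mithrandir kernel: [ 1447.835281] iwlwifi 0000:06:00.0: "
        "I iwlagn_rx_reply_tx Next reclaimed packet:3333",
        "Feb 27 12:58:04 mithrandir kernel: [ 1447.835294] iwlwifi 0000:06:00.0: "
        "I iwl_trans_pcie_reclaim [Q 2 | AC 2] 213 -> 214 (470)",
    ]
    for name, n in count_by_function(sample).most_common():
        print(name, n)
```

On a real log, the same function could be fed an open file object instead of the sample list, since it only iterates over lines.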

> network. Is the patch supposed to fix Norbert's issue as well?

Apparently not.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
FRADDAM (n.)
The small awkward-shaped piece of cheese which remains after grating a
large regular-shaped piece of cheese and enables you to cut your
fingers.
--- Douglas Adams, The Meaning of Liff


Attachments:
(No filename) (1.56 kB)
syslog (22.37 kB)

2012-02-27 18:01:14

by Emmanuel Grumbach

Subject: Re: iwlagn is getting even worse with 3.3-rc1

> sorry for the silence, I wasn't at the university for a few days so
> I couldn't test.
>
> Now I did:
> kernel 3.3-rc5
>
> On Mi, 22 Feb 2012, Emmanuel Grumbach wrote:
>> From the log, I can see that we have a lot of "passive channel
>> failures". Can you try to disable 11n (module_parameter) ?
>> Please also try with debug=0xc0000000

So this doesn't seem to be related to my rework of the AMPDU logic....
Will keep you updated if I think of something.

2012-02-27 22:42:38

by Norbert Preining

Subject: Re: iwlagn is getting even worse with 3.3-rc1

On Mo, 27 Feb 2012, Emmanuel Grumbach wrote:
> So this doesn't seem to be related to my rework of the AMPDU logic....
> Will keep you updated if I think of something.

Thanks.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
MEATHOP (n.)
One who sets off for the scene of an aircraft crash with a picnic
hamper.
--- Douglas Adams, The Meaning of Liff