2011-11-24 11:49:51

by Pedro Francisco

Subject: iwl3945 firmware errors: tentative debugging

Hello!
iwl3945 has had firmware errors triggered 'by' NM since NM started using
nl80211 instead of wext. Since Intel has stopped supporting iwl3945,
no firmware fix has been possible. It has been worked around by making
disable_hw_scan=1 the default, at the cost of lower network performance
and frequent 'hangs' on the connection.

I was able to trigger the firmware error by doing "iw dev wlan0 scan
passive". By comparison, "iw dev wlan0 scan" does NOT trigger the
firmware error.

Having activated firmware debugging, it would seem a firmware error
occurs when a full passive scan is done. If all channels 1-140 are
scanned passively, a firmware error occurs. If at least one of those
channels is actively scanned, no error occurs.
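
A minimal sketch for checking a kernel log for the failure after each scan variant. It assumes the driver reports the failure with a line containing "Microcode SW error" (the message quoted later in this thread); the function names are illustrative and not part of any existing tool.

```python
import re
import subprocess

# Assumption: the failure shows up in the kernel log as a line containing
# this text; adjust the pattern if your driver words it differently.
FW_ERROR_RE = re.compile(r"Microcode SW error", re.IGNORECASE)


def count_firmware_errors(log_text):
    """Return how many lines of log_text look like a firmware error report."""
    return sum(1 for line in log_text.splitlines() if FW_ERROR_RE.search(line))


def read_kernel_log():
    """Return the current kernel ring buffer as text (may require root)."""
    return subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    ).stdout
```

Comparing count_firmware_errors(read_kernel_log()) before and after "iw dev wlan0 scan passive" shows whether that scan added a new error.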

Where should I look next?

Thanks in Advance,
--
Pedro


2011-11-25 16:18:59

by Pedro Francisco

Subject: Re: iwl3945 firmware errors: tentative debugging

Going by the firmware logs, NM uses both active and passive scans.

As to why, I don't know, though I suppose it would make sense to let the
wireless card handle scanning unless some condition arose which led
NM to think it would have to roam soon.

On Fri, Nov 25, 2011 at 7:00 AM, Kalle Valo <[email protected]> wrote:
> Pedro Francisco <[email protected]> writes:
>
>>
>> I was able to trigger the firmware error by doing "iw dev wlan0 scan
>> passive". By comparison, "iw dev wlan0 scan" does NOT trigger the
>> firmware error.
>
> So NM uses passive scan? Now I'm curious, why is that?
>

2011-11-29 15:43:48

by Pedro Francisco

Subject: Re: iwl3945 firmware errors: tentative debugging

I'd just like to add that if I only use 802.11b-g frequencies there is
no crash on the microcode.

I'll try messing with the frequencies first before trying kernel
2.6.32 or earlier.

Now to hack that loveable /lib/crda/regulatory.bin binary file to try
every possible frequency combination....


On Thu, Nov 24, 2011 at 12:54 PM, Stanislaw Gruszka <[email protected]> wrote:
>
> Hi Pedro
>
> On Thu, Nov 24, 2011 at 11:49:30AM +0000, Pedro Francisco wrote:
> > iwl3945 has had firmware errors triggered 'by' NM since NM started using
> > nl80211 instead of wext. Since Intel has stopped supporting iwl3945,
> > no firmware fix has been possible. It has been worked around by making
> > disable_hw_scan=1 the default, at the cost of lower network performance
> > and frequent 'hangs' on the connection.
> Eh, we changed to software scan by default to work around various
> problems. Unfortunately that is causing other problems for other
> users. For now, there is no best default disable_hw_scan= value that
> would please everyone :-(
>
> > I was able to trigger the firmware error by doing "iw dev wlan0 scan
> > passive". By comparison, "iw dev wlan0 scan" does NOT trigger the
> > firmware error.
>
> > Having activated firmware debugging, it would seem a firmware error
> > occurs when a full passive scan is done. If all channels 1-140 are
> > scanned passively, a firmware error occurs. If at least one of those
> > channels is actively scanned, no error occurs.
> >
> > Where should I look next?
> Good finding, I'm able to reproduce that firmware error too. Perhaps
> you could install an older kernel, e.g. 2.6.32 or even 2.6.24, and see if
> you can recreate the problem there (but I'm not sure if the
> "iw dev wlan0 scan passive" command will work on an older kernel). If you
> find a kernel version where the issue is not present, you might figure out
> what is different regarding setting up the SCAN command, or let me know so
> I will look at that :-) If it is broken on old kernels as well, we could
> perhaps modify the code to disallow scans where all channels are passive.
>
> Thanks
> Stanislaw

2011-11-25 07:00:53

by Kalle Valo

Subject: Re: iwl3945 firmware errors: tentative debugging

Pedro Francisco <[email protected]> writes:

> iwl3945 has had firmware errors triggered 'by' NM since NM started using
> nl80211 instead of wext. Since Intel has stopped supporting iwl3945,
> no firmware fix has been possible. It has been worked around by making
> disable_hw_scan=1 the default, at the cost of lower network performance
> and frequent 'hangs' on the connection.
>
> I was able to trigger the firmware error by doing "iw dev wlan0 scan
> passive". By comparison, "iw dev wlan0 scan" does NOT trigger the
> firmware error.

So NM uses passive scan? Now I'm curious, why is that?

--
Kalle Valo

2011-11-24 12:51:26

by Stanislaw Gruszka

Subject: Re: iwl3945 firmware errors: tentative debugging

Hi Pedro

On Thu, Nov 24, 2011 at 11:49:30AM +0000, Pedro Francisco wrote:
> iwl3945 has had firmware errors triggered 'by' NM since NM started using
> nl80211 instead of wext. Since Intel has stopped supporting iwl3945,
> no firmware fix has been possible. It has been worked around by making
> disable_hw_scan=1 the default, at the cost of lower network performance
> and frequent 'hangs' on the connection.
Eh, we changed to software scan by default to work around various
problems. Unfortunately that is causing other problems for other
users. For now, there is no best default disable_hw_scan= value that
would please everyone :-(

> I was able to trigger the firmware error by doing "iw dev wlan0 scan
> passive". By comparison, "iw dev wlan0 scan" does NOT trigger the
> firmware error.

> Having activated firmware debugging, it would seem a firmware error
> occurs when a full passive scan is done. If all channels 1-140 are
> scanned passively, a firmware error occurs. If at least one of those
> channels is actively scanned, no error occurs.
>
> Where should I look next?
Good finding, I'm able to reproduce that firmware error too. Perhaps
you could install an older kernel, e.g. 2.6.32 or even 2.6.24, and see if
you can recreate the problem there (but I'm not sure if the
"iw dev wlan0 scan passive" command will work on an older kernel). If you
find a kernel version where the issue is not present, you might figure out
what is different regarding setting up the SCAN command, or let me know so
I will look at that :-) If it is broken on old kernels as well, we could
perhaps modify the code to disallow scans where all channels are passive.

Thanks
Stanislaw

2011-11-29 17:08:16

by Pedro Francisco

Subject: Re: iwl3945 firmware errors: tentative debugging

My country has the following regulatory.bin lines:

country PT:
(2402.000 - 2482.000 @ 40.000), (N/A, 20.00)
(5170.000 - 5250.000 @ 40.000), (N/A, 20.00)
(5250.000 - 5330.000 @ 40.000), (N/A, 20.00), DFS <------ removed whole line
(5490.000 - 5710.000 @ 40.000), (N/A, 27.00), DFS <------ removed whole line

If I remove the last two lines, everything works as expected, i.e., NO
"Microcode SW error". If I include either of the last two, the microcode
will issue an error.

I've tried every combination of those lines except just one of
"(5250.000 - 5330.000 @ 40.000), (N/A, 20.00), DFS" OR "(5490.000 -
5710.000 @ 40.000), (N/A, 27.00), DFS".
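
To see which channels the two DFS rules cover, here is a small sketch that maps the PT frequency ranges quoted above to 20 MHz channel numbers. The candidate channel list and the helper names are assumptions made for illustration; the thread does not say which of these channels the driver actually scans passively.

```python
# Rules copied from the PT entry above: (start MHz, end MHz, flags).
PT_RULES = [
    (2402, 2482, frozenset()),
    (5170, 5250, frozenset()),
    (5250, 5330, frozenset({"DFS"})),
    (5490, 5710, frozenset({"DFS"})),
]

# Assumed candidate set: 2.4 GHz channels 1-13 plus the usual 5 GHz
# 20 MHz channels (36-64 and 100-140, in steps of 4).
CANDIDATES = list(range(1, 14)) + list(range(36, 65, 4)) + list(range(100, 141, 4))


def center_freq(channel):
    """Centre frequency in MHz of a 2.4 GHz (1-13) or 5 GHz channel."""
    return 2407 + 5 * channel if channel <= 13 else 5000 + 5 * channel


def channels_in_range(start, end, width=20):
    """Candidate channels whose full width fits inside [start, end] MHz."""
    half = width // 2
    return [
        ch for ch in CANDIDATES
        if start <= center_freq(ch) - half and center_freq(ch) + half <= end
    ]


def dfs_channels(rules):
    """Channels that fall inside any rule flagged DFS."""
    found = []
    for start, end, flags in rules:
        if "DFS" in flags:
            found.extend(channels_in_range(start, end))
    return found


print(dfs_channels(PT_RULES))
```

With the PT rules this lists channels 52 to 64 and 100 to 140, so removing the two DFS lines removes exactly those channels from the scan.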

Next week I'll look at the source code.

On Thu, Nov 24, 2011 at 12:54 PM, Stanislaw Gruszka <[email protected]> wrote:
> Hi Pedro
>
> On Thu, Nov 24, 2011 at 11:49:30AM +0000, Pedro Francisco wrote:
>> iwl3945 has had firmware errors triggered 'by' NM since NM started using
>> nl80211 instead of wext. Since Intel has stopped supporting iwl3945,
>> no firmware fix has been possible. It has been worked around by making
>> disable_hw_scan=1 the default, at the cost of lower network performance
>> and frequent 'hangs' on the connection.
> Eh, we changed to software scan by default to work around various
> problems. Unfortunately that is causing other problems for other
> users. For now, there is no best default disable_hw_scan= value that
> would please everyone :-(
>
>> I was able to trigger the firmware error by doing "iw dev wlan0 scan
>> passive". By comparison, "iw dev wlan0 scan" does NOT trigger the
>> firmware error.
>
>> Having activated firmware debugging, it would seem a firmware error
>> occurs when a full passive scan is done. If all channels 1-140 are
>> scanned passively, a firmware error occurs. If at least one of those
>> channels is actively scanned, no error occurs.
>>
>> Where should I look next?
> Good finding, I'm able to reproduce that firmware error too. Perhaps
> you could install an older kernel, e.g. 2.6.32 or even 2.6.24, and see if
> you can recreate the problem there (but I'm not sure if the
> "iw dev wlan0 scan passive" command will work on an older kernel). If you
> find a kernel version where the issue is not present, you might figure out
> what is different regarding setting up the SCAN command, or let me know so
> I will look at that :-) If it is broken on old kernels as well, we could
> perhaps modify the code to disallow scans where all channels are passive.
>
> Thanks
> Stanislaw

2011-12-21 11:33:42

by Stanislaw Gruszka

Subject: Re: iwl3945 firmware errors: tentative debugging

Hi Pedro,

On Tue, Nov 29, 2011 at 05:07:55PM +0000, Pedro Francisco wrote:
> My country has the following regulatory.bin lines:
>
> country PT:
> (2402.000 - 2482.000 @ 40.000), (N/A, 20.00)
> (5170.000 - 5250.000 @ 40.000), (N/A, 20.00)
> (5250.000 - 5330.000 @ 40.000), (N/A, 20.00), DFS <------ removed whole line
> (5490.000 - 5710.000 @ 40.000), (N/A, 27.00), DFS <------ removed whole line
>
> If I remove the last two lines, everything works as expected, i.e., NO
> "Microcode SW error". If I include either of the last two, the microcode
> will issue an error.
>
> I've tried every combination of those lines except just one of
> "(5250.000 - 5330.000 @ 40.000), (N/A, 20.00), DFS" OR "(5490.000 -
> 5710.000 @ 40.000), (N/A, 27.00), DFS".

The attached patch stops the error from triggering on my setup with
"iw dev wlan0 scan passive". Can you check if it also fixes the problem
under your normal wireless workload?

Thanks
Stanislaw


Attachments:
(No filename) (920.00 B)
iwlegacy-do-not-promote-to-active-scan-on-transmistion-detect.patch (911.00 B)

2012-01-17 16:07:33

by Pedro Francisco

Subject: Re: iwl3945 firmware errors: tentative debugging

Hi!
Sorry for the delay!

On Wed, Dec 21, 2011 at 11:33 AM, Stanislaw Gruszka <[email protected]> wrote:
> Hi Pedro,
>
> On Tue, Nov 29, 2011 at 05:07:55PM +0000, Pedro Francisco wrote:
>> My country has the following regulatory.bin lines:
>>
>> country PT:
>> (2402.000 - 2482.000 @ 40.000), (N/A, 20.00)
>> (5170.000 - 5250.000 @ 40.000), (N/A, 20.00)
>> (5250.000 - 5330.000 @ 40.000), (N/A, 20.00), DFS <------ removed whole line
>> (5490.000 - 5710.000 @ 40.000), (N/A, 27.00), DFS <------ removed whole line
>>
>> If I remove the last two lines, everything works as expected, i.e., NO
>> "Microcode SW error". If I include either of the last two, the microcode
>> will issue an error.
>>
>> I've tried every combination of those lines except just one of
>> "(5250.000 - 5330.000 @ 40.000), (N/A, 20.00), DFS" OR "(5490.000 -
>> 5710.000 @ 40.000), (N/A, 27.00), DFS".
>
> The attached patch stops the error from triggering on my setup with
> "iw dev wlan0 scan passive". Can you check if it also fixes the problem
> under your normal wireless workload?

It does seem fixed! However, I had to tweak the patch a little
(attached, for the convenience of anyone else following the thread: "IL_"
-> "IWL_", the file to patch is different and the paragraphs are different).
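
The prefix rename mentioned above can be applied mechanically. A minimal sketch, assuming the only textual change needed is "IL_" -> "IWL_" on whole identifiers; IL_EXAMPLE_CONSTANT is a made-up name, and the changed file path and hunk positions still have to be fixed by hand.

```python
import re

# Match "IL_" only at the start of an identifier, so names that merely
# contain "IL_" in the middle are left alone.
PREFIX_RE = re.compile(r"\bIL_")


def rename_prefix(patch_text):
    """Return patch_text with leading IL_ identifier prefixes changed to IWL_."""
    return PREFIX_RE.sub("IWL_", patch_text)


# IL_EXAMPLE_CONSTANT is a hypothetical identifier, not taken from the patch.
print(rename_prefix("+\tvalue = IL_EXAMPLE_CONSTANT;"))
```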

Also, probably unrelated, I got _just once_ another firmware error
("iwl3945 Command REPLY_RATE_SCALE failed: FW Error
iwl3945 Error setting HW rate table: FFFFFFFB") but since the kernels
have changed and I've also changed to the PAE kernel, it's probably
unrelated to the patch (also I'm using swcrypto=0 which may help).
I'll continue testing but unless there is some other issue I won't
report back.

Thank you for the fix!
--
Pedro


Attachments:
iwlegacy-CUSTOM-PEDRO-do-not-promote-to-active-scan-on-transmistion-detect.patch (946.00 B)