2008-08-05 11:56:50

by Johannes Berg

Subject: iwl5000 oopses with Linus's tree

Regardless of whether it is enabled or not, this shouldn't happen:

[ 373.387834] iwl4965: uCode did not respond OK.
[ 373.387959] iwl4965: Error wrong command queue 0 command id 0x1
[ 373.388032] ------------[ cut here ]------------
[ 373.388058] kernel BUG at drivers/net/wireless/iwlwifi/iwl-tx.c:1163!
[ 373.388102] Oops: Exception in kernel mode, sig: 5 [#1]

This is kernel 2.6.27-rc1-00504-g2b12a4c-dirty

johannes



2008-08-28 15:45:22

by Ian Schram

Subject: Re: iwl5000 oopses



Tomas Winkler wrote:
> On Thu, Aug 28, 2008 at 3:17 PM, Johannes Berg
> <[email protected]> wrote:
>> On Thu, 2008-08-28 at 14:39 +0300, Tomas Winkler wrote:
>>> On Thu, Aug 28, 2008 at 1:36 PM, Johannes Berg
>>> <[email protected]> wrote:
>>>> On Tue, 2008-08-05 at 18:20 +0300, Tomas Winkler wrote:
>>>>> On Tue, Aug 5, 2008 at 3:22 PM, Johannes Berg <[email protected]> wrote:
>>>>>>> This is kernel 2.6.27-rc1-00504-g2b12a4c-dirty
>>>>>> [ 126.826663] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 1.3.27kds
>>>>>> [ 126.826947] iwlagn: Copyright(c) 2003-2008 Intel Corporation
>>>>>> [ 126.828369] iwlagn: Detected Intel Wireless WiFi Link 5350AGN REV=0x24
>>>>>> [ 126.848680] iwlagn: Tunable channels: 13 802.11bg, 24 802.11a channels
>>>>>> [ 127.014564] firmware: requesting iwlwifi-5000-1.ucode
>>>>>> [ 127.170640] iwlagn: Error wrong command queue 43 command id 0x6B
>>>>>> [ 127.170832] ------------[ cut here ]------------
>>>>>> [ 127.170884] kernel BUG at drivers/net/wireless/iwlwifi/iwl-tx.c:1163!
>>>>>> [ 127.170941] Oops: Exception in kernel mode, sig: 5 [#1]
>>>> This is still happening with -rc4.
>>> I know, at least one regression.
>> Well, I guess for me the addition of the 5000 series code to the kernel
>> is the regression, without it I can use the machine just fine, just have
>> no wireless ;)
>
> And when I say that driver is half backed because I'm not done
> cleaning bugs it's somehow not understood
> Instead of chasing bugs I have to spend time to fitght the system.
> Tomas
> --

Probably a good idea not to see this as
"you vs. the system". Anyway, that discussion is going on in other threads;
perhaps we can focus on what has to be done about this bug.

What's known about this bug? And where does it trigger? Is it reproducible?

The error message clearly shows an invalid queue id (43, or 0x2b) where it should be
a number in the range [0,4]. Is this multiqueue related?

The value in this error message was set by the driver and then relayed back by the ucode,
so that the driver knows which "command" this is a response to.

Assuming there is no memory corruption, and the ucode is correct, ...

It might be set wrong. The value that is set is either the command queue, or a
tx_command queue which is determined by a call to skb_get_queue_mapping(skb).

It might be nice to add some debug output documenting what this function is returning.



Finally, can I quickly ask why these macros (which "encode" the queue id into the field in which it is passed to the ucode):
#define SEQ_TO_QUEUE(x) ((x >> 8) & 0xbf)
#define QUEUE_TO_SEQ(x) ((x & 0xbf) << 8)
use 0xbf, when according to the source code comments the field only uses the last 6 bits? I would
expect 0x3f. In QUEUE_TO_SEQ this msb should never be set, so I wonder if there is a hack
I'm missing somewhere.
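
To make the mask question concrete, here is a minimal C sketch, not taken from the driver, that compares what 0xbf and 0x3f keep from the high byte of a 16-bit sequence value. The helper seq_to_queue() and its mask parameter are hypothetical; the driver macros hard-code the mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: extract the queue field from a 16-bit sequence
 * value with a caller-supplied mask. */
static inline unsigned int seq_to_queue(uint16_t seq, unsigned int mask)
{
	return ((unsigned int)seq >> 8) & mask;
}

/* Expected values for the two masks discussed above. */
void self_check(void)
{
	/* 0xbf = 1011 1111: keeps bits 0-5 and bit 7, drops bit 6. */
	assert(seq_to_queue(0x8000, 0xbf) == 0x80);
	assert(seq_to_queue(0x4000, 0xbf) == 0x00);
	/* 0x3f = 0011 1111: keeps only the low 6 bits. */
	assert(seq_to_queue(0x8000, 0x3f) == 0x00);
	/* The reported queue id 43 (0x2b) passes through either mask. */
	assert(seq_to_queue(0x2b00, 0xbf) == 43);
	assert(seq_to_queue(0x2b00, 0x3f) == 43);
}
```

In other words, 0xbf is not a contiguous field mask: it lets bit 7 through while dropping bit 6, which is what makes it look like a typo for 0x3f. Neither mask would reject the value 43 seen in the log.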


2008-08-28 23:58:50

by Ian Schram

Subject: Re: iwl5000 oopses



Tomas Winkler wrote:
> On Thu, Aug 28, 2008 at 6:44 PM, Ian Schram <[email protected]> wrote:
>>
>> Tomas Winkler wrote:
>>> On Thu, Aug 28, 2008 at 3:17 PM, Johannes Berg
>>> <[email protected]> wrote:
>>>> On Thu, 2008-08-28 at 14:39 +0300, Tomas Winkler wrote:
>>>>> On Thu, Aug 28, 2008 at 1:36 PM, Johannes Berg
>>>>> <[email protected]> wrote:
>>>>>> On Tue, 2008-08-05 at 18:20 +0300, Tomas Winkler wrote:
>>>>>>> On Tue, Aug 5, 2008 at 3:22 PM, Johannes Berg
>>>>>>> <[email protected]> wrote:
>>>>>>>>> This is kernel 2.6.27-rc1-00504-g2b12a4c-dirty
>>>>>>>> [ 126.826663] iwlagn: Intel(R) Wireless WiFi Link AGN driver for
>>>>>>>> Linux, 1.3.27kds
>>>>>>>> [ 126.826947] iwlagn: Copyright(c) 2003-2008 Intel Corporation
>>>>>>>> [ 126.828369] iwlagn: Detected Intel Wireless WiFi Link 5350AGN
>>>>>>>> REV=0x24
>>>>>>>> [ 126.848680] iwlagn: Tunable channels: 13 802.11bg, 24 802.11a
>>>>>>>> channels
>>>>>>>> [ 127.014564] firmware: requesting iwlwifi-5000-1.ucode
>>>>>>>> [ 127.170640] iwlagn: Error wrong command queue 43 command id 0x6B
>>>>>>>> [ 127.170832] ------------[ cut here ]------------
>>>>>>>> [ 127.170884] kernel BUG at
>>>>>>>> drivers/net/wireless/iwlwifi/iwl-tx.c:1163!
>>>>>>>> [ 127.170941] Oops: Exception in kernel mode, sig: 5 [#1]
>>>>>> This is still happening with -rc4.
>>>>> I know, at least one regression.
>>>> Well, I guess for me the addition of the 5000 series code to the kernel
>>>> is the regression, without it I can use the machine just fine, just have
>>>> no wireless ;)
>>> And when I say that driver is half backed because I'm not done
>>> cleaning bugs it's somehow not understood
>>> Instead of chasing bugs I have to spend time to fitght the system.
>>> Tomas
>>> --
>> Probably a good idea to not see this as
>> ,,you vs system'' .. Anyways that discussion is going on in other threads
>> perhaps we can focus on what has to be done about this bug.
>>
>> what's known about this bug? ad where does it trigger? reproducible?
>>
>> the error message clearly shows an invalid queue id (43 or 0x2b) where it
>> should be
>> a number in the range of [0,4], this is multiqueue related?
>>
>> the value in this error message was set by the driver, and then relayed by
>> the ucode
>> in order to know which "command" this is a response to.
>>
>> assuming there is no memory corruption, and the ucode is correct, ...
>>
>> It might be set wrong. The value that is set is either the command queue, or
>> a
>> tx_command queue which is determined by a call to skb_get_queue_mapping(skb)
>>
>> might be nice to add some debug output documenting what this function is
>> returning.
>>
>>
>>
>> finally can i quickly ask why these macro's (that "encode" this queue id to
>> the field in which it's passed to the ucode):
>> #define SEQ_TO_QUEUE(x) ((x >> 8) & 0xbf)
>> #define QUEUE_TO_SEQ(x) ((x & 0xbf) << 8)
>> use 0xbf, when according to the sourcecode comments it only uses the last 6
>> bits, hence i would
>> expect 0x3f. In QUEUE_TO_SEQ this msb should never be set .. so i wonder if
>> there is a hack
>> i'm missing somewhere.
>
> Actually this is the correct settings (there is still a lot of old
> days junk in the code)
>
> +#define SEQ_TO_QUEUE(s) (((s) >> 8) & 0x1f)
> +#define QUEUE_TO_SEQ(q) (((q) & 0x1f) << 8)
> +#define SEQ_TO_INDEX(s) ((s) & 0xff)
> +#define INDEX_TO_SEQ(i) ((i) & 0xff)
>
> Yet this is not it an issue first of all it works pretty well I never

True. 0x1f seems slightly inconsistent with iwl-command.h, but
that's not really the issue right now.

> hit this one if not under load.
> ' Error wrong command queue 43 command id ___0x6B___' 6b looks more
> like slub poison -- accessing already freed skb
>
> Thanks
> Tomas

Hmm, 0x6B indeed is not a documented command ID...
That it only triggers under load must point to some overflow or race, I guess.

I should get myself a new laptop to be able to play with this...
The best I can do now is wonder whether this patch
"[PATCH 08/10] iwlwifi: decrement rx skb counter in scan abort handler"
might be responsible, but that's just fuzzy string matching "recent patches" with
"freed skb" ;-)

2008-08-05 12:23:06

by Johannes Berg

Subject: Re: iwl5000 oopses (was: iwl5000 oopses with Linus's tree)


> This is kernel 2.6.27-rc1-00504-g2b12a4c-dirty

Same happens with a newer kernel, I merged together wireless-testing and
linux-2.6 (because powerpc won't build in -rc1) and get this:

[ 126.826663] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 1.3.27kds
[ 126.826947] iwlagn: Copyright(c) 2003-2008 Intel Corporation
[ 126.828369] iwlagn: Detected Intel Wireless WiFi Link 5350AGN REV=0x24
[ 126.848680] iwlagn: Tunable channels: 13 802.11bg, 24 802.11a channels
[ 127.014564] firmware: requesting iwlwifi-5000-1.ucode
[ 127.170640] iwlagn: Error wrong command queue 43 command id 0x6B
[ 127.170832] ------------[ cut here ]------------
[ 127.170884] kernel BUG at drivers/net/wireless/iwlwifi/iwl-tx.c:1163!
[ 127.170941] Oops: Exception in kernel mode, sig: 5 [#1]

johannes



2008-08-28 10:42:47

by Johannes Berg

Subject: Re: iwl5000 oopses (was: iwl5000 oopses with Linus's tree)

On Tue, 2008-08-05 at 18:20 +0300, Tomas Winkler wrote:
> On Tue, Aug 5, 2008 at 3:22 PM, Johannes Berg <[email protected]> wrote:
> >
> >> This is kernel 2.6.27-rc1-00504-g2b12a4c-dirty

> > [ 126.826663] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 1.3.27kds
> > [ 126.826947] iwlagn: Copyright(c) 2003-2008 Intel Corporation
> > [ 126.828369] iwlagn: Detected Intel Wireless WiFi Link 5350AGN REV=0x24
> > [ 126.848680] iwlagn: Tunable channels: 13 802.11bg, 24 802.11a channels
> > [ 127.014564] firmware: requesting iwlwifi-5000-1.ucode
> > [ 127.170640] iwlagn: Error wrong command queue 43 command id 0x6B
> > [ 127.170832] ------------[ cut here ]------------
> > [ 127.170884] kernel BUG at drivers/net/wireless/iwlwifi/iwl-tx.c:1163!
> > [ 127.170941] Oops: Exception in kernel mode, sig: 5 [#1]

This is still happening with -rc4.

johannes



2008-08-29 00:15:47

by Tomas Winkler

Subject: Re: iwl5000 oopses

On Fri, Aug 29, 2008 at 2:58 AM, Ian Schram <[email protected]> wrote:
>
>
> Tomas Winkler wrote:
>>
>> On Thu, Aug 28, 2008 at 6:44 PM, Ian Schram <[email protected]> wrote:
>>>
>>> Tomas Winkler wrote:
>>>>
>>>> On Thu, Aug 28, 2008 at 3:17 PM, Johannes Berg
>>>> <[email protected]> wrote:
>>>>>
>>>>> On Thu, 2008-08-28 at 14:39 +0300, Tomas Winkler wrote:
>>>>>>
>>>>>> On Thu, Aug 28, 2008 at 1:36 PM, Johannes Berg
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> On Tue, 2008-08-05 at 18:20 +0300, Tomas Winkler wrote:
>>>>>>>>
>>>>>>>> On Tue, Aug 5, 2008 at 3:22 PM, Johannes Berg
>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> This is kernel 2.6.27-rc1-00504-g2b12a4c-dirty
>>>>>>>>>
>>>>>>>>> [ 126.826663] iwlagn: Intel(R) Wireless WiFi Link AGN driver for
>>>>>>>>> Linux, 1.3.27kds
>>>>>>>>> [ 126.826947] iwlagn: Copyright(c) 2003-2008 Intel Corporation
>>>>>>>>> [ 126.828369] iwlagn: Detected Intel Wireless WiFi Link 5350AGN
>>>>>>>>> REV=0x24
>>>>>>>>> [ 126.848680] iwlagn: Tunable channels: 13 802.11bg, 24 802.11a
>>>>>>>>> channels
>>>>>>>>> [ 127.014564] firmware: requesting iwlwifi-5000-1.ucode
>>>>>>>>> [ 127.170640] iwlagn: Error wrong command queue 43 command id 0x6B
>>>>>>>>> [ 127.170832] ------------[ cut here ]------------
>>>>>>>>> [ 127.170884] kernel BUG at
>>>>>>>>> drivers/net/wireless/iwlwifi/iwl-tx.c:1163!
>>>>>>>>> [ 127.170941] Oops: Exception in kernel mode, sig: 5 [#1]
>>>>>>>
>>>>>>> This is still happening with -rc4.
>>>>>>
>>>>>> I know, at least one regression.
>>>>>
>>>>> Well, I guess for me the addition of the 5000 series code to the kernel
>>>>> is the regression, without it I can use the machine just fine, just
>>>>> have
>>>>> no wireless ;)
>>>>
>>>> And when I say that driver is half backed because I'm not done
>>>> cleaning bugs it's somehow not understood
>>>> Instead of chasing bugs I have to spend time to fitght the system.
>>>> Tomas
>>>> --
>>>
>>> Probably a good idea to not see this as
>>> ,,you vs system'' .. Anyways that discussion is going on in other
>>> threads
>>> perhaps we can focus on what has to be done about this bug.
>>>
>>> what's known about this bug? ad where does it trigger? reproducible?
>>>
>>> the error message clearly shows an invalid queue id (43 or 0x2b) where it
>>> should be
>>> a number in the range of [0,4], this is multiqueue related?
>>>
>>> the value in this error message was set by the driver, and then relayed
>>> by
>>> the ucode
>>> in order to know which "command" this is a response to.
>>>
>>> assuming there is no memory corruption, and the ucode is correct, ...
>>>
>>> It might be set wrong. The value that is set is either the command queue,
>>> or
>>> a
>>> tx_command queue which is determined by a call to
>>> skb_get_queue_mapping(skb)
>>>
>>> might be nice to add some debug output documenting what this function is
>>> returning.
>>>
>>>
>>>
>>> finally can i quickly ask why these macro's (that "encode" this queue id
>>> to
>>> the field in which it's passed to the ucode):
>>> #define SEQ_TO_QUEUE(x) ((x >> 8) & 0xbf)
>>> #define QUEUE_TO_SEQ(x) ((x & 0xbf) << 8)
>>> use 0xbf, when according to the sourcecode comments it only uses the last
>>> 6
>>> bits, hence i would
>>> expect 0x3f. In QUEUE_TO_SEQ this msb should never be set .. so i wonder
>>> if
>>> there is a hack
>>> i'm missing somewhere.
>>
>> Actually this is the correct settings (there is still a lot of old
>> days junk in the code)
>>
>> +#define SEQ_TO_QUEUE(s) (((s) >> 8) & 0x1f)
>> +#define QUEUE_TO_SEQ(q) (((q) & 0x1f) << 8)
>> +#define SEQ_TO_INDEX(s) ((s) & 0xff)
>> +#define INDEX_TO_SEQ(i) ((i) & 0xff)
>>
>> Yet this is not it an issue first of all it works pretty well I never
>
> True. 0x1f seems slightly inconsistent with the iwl-command.h, but
> that's not really the issue right now.

Also, the comment is wrong. It should be 0x1f (bits 8:12 >> 8); bit 13 is
reserved. Will post a patch.

>> hit this one if not under load.
>> ' Error wrong command queue 43 command id ___0x6B___' 6b looks more
>> like slub poison -- accessing already freed skb
>>

>
> hmm, 0x6B indeed is not a documented command ID...
> Only triggering under load must point to some overflow or race i guess.
>
Yep.

> I should get myself a new laptop to be able to play with this...
> The best i can do now, is wonder if this patch
> "[PATCH 08/10] iwlwifi: decrement rx skb counter in scan abort handler"
> might be responsible, but that's just fuzzy string matching "recent patches"
> with
> "freed skb" ;-)

No, that's a good patch. Johannes' failure looks like he hit it right
at the beginning, before scanning.
We need to open more logs and check which slab allocator is in use.

Tomas
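
As an illustration of the 0x1f layout described above (queue in bits 8:12, index in bits 0:7), here is a small C sketch of a pack/unpack round trip. The four macros are copied from the proposal quoted in this thread; make_seq() is a hypothetical helper added only for the example, and none of this is verified against the final patch.

```c
#include <assert.h>
#include <stdint.h>

/* Macros as proposed in this thread. */
#define SEQ_TO_QUEUE(s) (((s) >> 8) & 0x1f)
#define QUEUE_TO_SEQ(q) (((q) & 0x1f) << 8)
#define SEQ_TO_INDEX(s) ((s) & 0xff)
#define INDEX_TO_SEQ(i) ((i) & 0xff)

/* Hypothetical helper: combine a queue id and an index into one
 * 16-bit sequence value. */
static inline uint16_t make_seq(unsigned int queue, unsigned int index)
{
	return (uint16_t)(QUEUE_TO_SEQ(queue) | INDEX_TO_SEQ(index));
}

void self_check(void)
{
	uint16_t seq = make_seq(4, 0x2a);

	assert(seq == 0x042a);
	assert(SEQ_TO_QUEUE(seq) == 4);
	assert(SEQ_TO_INDEX(seq) == 0x2a);
	/* Bit 13 (0x2000) lies outside the 5-bit field and is ignored. */
	assert(SEQ_TO_QUEUE(seq | 0x2000) == 4);
	/* An out-of-range queue id is silently truncated: 43 & 0x1f == 11. */
	assert(SEQ_TO_QUEUE(make_seq(43, 0)) == 11);
}
```

Note that masking only truncates a bad queue id; it does not detect one, so a range check against the number of queues is still needed wherever the value is used.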

2008-08-28 14:52:08

by Tomas Winkler

Subject: Re: iwl5000 oopses (was: iwl5000 oopses with Linus's tree)

On Thu, Aug 28, 2008 at 3:17 PM, Johannes Berg
<[email protected]> wrote:
> On Thu, 2008-08-28 at 14:39 +0300, Tomas Winkler wrote:
>> On Thu, Aug 28, 2008 at 1:36 PM, Johannes Berg
>> <[email protected]> wrote:
>> > On Tue, 2008-08-05 at 18:20 +0300, Tomas Winkler wrote:
>> >> On Tue, Aug 5, 2008 at 3:22 PM, Johannes Berg <[email protected]> wrote:
>> >> >
>> >> >> This is kernel 2.6.27-rc1-00504-g2b12a4c-dirty
>> >
>> >> > [ 126.826663] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 1.3.27kds
>> >> > [ 126.826947] iwlagn: Copyright(c) 2003-2008 Intel Corporation
>> >> > [ 126.828369] iwlagn: Detected Intel Wireless WiFi Link 5350AGN REV=0x24
>> >> > [ 126.848680] iwlagn: Tunable channels: 13 802.11bg, 24 802.11a channels
>> >> > [ 127.014564] firmware: requesting iwlwifi-5000-1.ucode
>> >> > [ 127.170640] iwlagn: Error wrong command queue 43 command id 0x6B
>> >> > [ 127.170832] ------------[ cut here ]------------
>> >> > [ 127.170884] kernel BUG at drivers/net/wireless/iwlwifi/iwl-tx.c:1163!
>> >> > [ 127.170941] Oops: Exception in kernel mode, sig: 5 [#1]
>> >
>> > This is still happening with -rc4.
>>
>> I know, at least one regression.
>
> Well, I guess for me the addition of the 5000 series code to the kernel
> is the regression, without it I can use the machine just fine, just have
> no wireless ;)

And when I say that the driver is half-baked because I'm not done
cleaning up bugs, it's somehow not understood.
Instead of chasing bugs I have to spend time fighting the system.
Tomas

2008-08-28 12:18:25

by Johannes Berg

Subject: Re: iwl5000 oopses (was: iwl5000 oopses with Linus's tree)

On Thu, 2008-08-28 at 14:39 +0300, Tomas Winkler wrote:
> On Thu, Aug 28, 2008 at 1:36 PM, Johannes Berg
> <[email protected]> wrote:
> > On Tue, 2008-08-05 at 18:20 +0300, Tomas Winkler wrote:
> >> On Tue, Aug 5, 2008 at 3:22 PM, Johannes Berg <[email protected]> wrote:
> >> >
> >> >> This is kernel 2.6.27-rc1-00504-g2b12a4c-dirty
> >
> >> > [ 126.826663] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 1.3.27kds
> >> > [ 126.826947] iwlagn: Copyright(c) 2003-2008 Intel Corporation
> >> > [ 126.828369] iwlagn: Detected Intel Wireless WiFi Link 5350AGN REV=0x24
> >> > [ 126.848680] iwlagn: Tunable channels: 13 802.11bg, 24 802.11a channels
> >> > [ 127.014564] firmware: requesting iwlwifi-5000-1.ucode
> >> > [ 127.170640] iwlagn: Error wrong command queue 43 command id 0x6B
> >> > [ 127.170832] ------------[ cut here ]------------
> >> > [ 127.170884] kernel BUG at drivers/net/wireless/iwlwifi/iwl-tx.c:1163!
> >> > [ 127.170941] Oops: Exception in kernel mode, sig: 5 [#1]
> >
> > This is still happening with -rc4.
>
> I know, at least one regression.

Well, I guess for me the addition of the 5000 series code to the kernel
is the regression, without it I can use the machine just fine, just have
no wireless ;)

johannes



2008-08-28 21:30:30

by Tomas Winkler

Subject: Re: iwl5000 oopses

On Thu, Aug 28, 2008 at 6:44 PM, Ian Schram <[email protected]> wrote:
>
>
> Tomas Winkler wrote:
>>
>> On Thu, Aug 28, 2008 at 3:17 PM, Johannes Berg
>> <[email protected]> wrote:
>>>
>>> On Thu, 2008-08-28 at 14:39 +0300, Tomas Winkler wrote:
>>>>
>>>> On Thu, Aug 28, 2008 at 1:36 PM, Johannes Berg
>>>> <[email protected]> wrote:
>>>>>
>>>>> On Tue, 2008-08-05 at 18:20 +0300, Tomas Winkler wrote:
>>>>>>
>>>>>> On Tue, Aug 5, 2008 at 3:22 PM, Johannes Berg
>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>> This is kernel 2.6.27-rc1-00504-g2b12a4c-dirty
>>>>>>>
>>>>>>> [ 126.826663] iwlagn: Intel(R) Wireless WiFi Link AGN driver for
>>>>>>> Linux, 1.3.27kds
>>>>>>> [ 126.826947] iwlagn: Copyright(c) 2003-2008 Intel Corporation
>>>>>>> [ 126.828369] iwlagn: Detected Intel Wireless WiFi Link 5350AGN
>>>>>>> REV=0x24
>>>>>>> [ 126.848680] iwlagn: Tunable channels: 13 802.11bg, 24 802.11a
>>>>>>> channels
>>>>>>> [ 127.014564] firmware: requesting iwlwifi-5000-1.ucode
>>>>>>> [ 127.170640] iwlagn: Error wrong command queue 43 command id 0x6B
>>>>>>> [ 127.170832] ------------[ cut here ]------------
>>>>>>> [ 127.170884] kernel BUG at
>>>>>>> drivers/net/wireless/iwlwifi/iwl-tx.c:1163!
>>>>>>> [ 127.170941] Oops: Exception in kernel mode, sig: 5 [#1]
>>>>>
>>>>> This is still happening with -rc4.
>>>>
>>>> I know, at least one regression.
>>>
>>> Well, I guess for me the addition of the 5000 series code to the kernel
>>> is the regression, without it I can use the machine just fine, just have
>>> no wireless ;)
>>
>> And when I say that driver is half backed because I'm not done
>> cleaning bugs it's somehow not understood
>> Instead of chasing bugs I have to spend time to fitght the system.
>> Tomas
>> --
>
> Probably a good idea to not see this as
> ,,you vs system'' .. Anyways that discussion is going on in other threads
> perhaps we can focus on what has to be done about this bug.
>
> what's known about this bug? ad where does it trigger? reproducible?
>
> the error message clearly shows an invalid queue id (43 or 0x2b) where it
> should be
> a number in the range of [0,4], this is multiqueue related?
>
> the value in this error message was set by the driver, and then relayed by
> the ucode
> in order to know which "command" this is a response to.
>
> assuming there is no memory corruption, and the ucode is correct, ...
>
> It might be set wrong. The value that is set is either the command queue, or
> a
> tx_command queue which is determined by a call to skb_get_queue_mapping(skb)
>
> might be nice to add some debug output documenting what this function is
> returning.
>
>
>
> finally can i quickly ask why these macro's (that "encode" this queue id to
> the field in which it's passed to the ucode):
> #define SEQ_TO_QUEUE(x) ((x >> 8) & 0xbf)
> #define QUEUE_TO_SEQ(x) ((x & 0xbf) << 8)
> use 0xbf, when according to the sourcecode comments it only uses the last 6
> bits, hence i would
> expect 0x3f. In QUEUE_TO_SEQ this msb should never be set .. so i wonder if
> there is a hack
> i'm missing somewhere.

Actually, these are the correct settings (there is still a lot of
old-days junk in the code):

+#define SEQ_TO_QUEUE(s) (((s) >> 8) & 0x1f)
+#define QUEUE_TO_SEQ(q) (((q) & 0x1f) << 8)
+#define SEQ_TO_INDEX(s) ((s) & 0xff)
+#define INDEX_TO_SEQ(i) ((i) & 0xff)

Yet this is not the issue. First of all, it works pretty well; I never
hit this one if not under load.
'Error wrong command queue 43 command id ___0x6B___': 6b looks more
like slub poison, i.e. accessing an already freed skb.

Thanks
Tomas
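
The poison hypothesis can be checked with a little arithmetic. The C sketch below fills a stand-in header with 0x6b, the byte the kernel's slab poisoning writes into freed objects (POISON_FREE in include/linux/poison.h), and decodes it. The struct fake_cmd_header layout (one-byte command id, one-byte flags, 16-bit sequence) is an assumption made for illustration and is not checked against the driver; OLD_SEQ_TO_QUEUE and NEW_SEQ_TO_QUEUE are hypothetical names for the 0xbf and 0x1f variants quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define POISON_FREE 0x6b /* byte written into freed slab objects */

/* Assumed header layout, for illustration only. */
struct fake_cmd_header {
	uint8_t cmd;
	uint8_t flags;
	uint16_t sequence;
};

#define OLD_SEQ_TO_QUEUE(s) (((s) >> 8) & 0xbf) /* mask in the tree at the time */
#define NEW_SEQ_TO_QUEUE(s) (((s) >> 8) & 0x1f) /* mask proposed above */

/* Return a header whose every byte is the free-poison value. */
static inline struct fake_cmd_header poisoned_header(void)
{
	struct fake_cmd_header h;

	memset(&h, POISON_FREE, sizeof(h));
	return h;
}

void self_check(void)
{
	struct fake_cmd_header h = poisoned_header();

	assert(h.cmd == 0x6b);         /* "command id 0x6B" */
	assert(h.sequence == 0x6b6b);  /* same on either endianness */
	assert(OLD_SEQ_TO_QUEUE(h.sequence) == 43); /* 0x6b & 0xbf == 0x2b */
	assert(NEW_SEQ_TO_QUEUE(h.sequence) == 11); /* 0x6b & 0x1f == 0x0b */
}
```

Under these assumptions a fully poisoned header decodes to command id 0x6B and, with the 0xbf mask, queue 43, which matches the logged line. That is consistent with a use-after-free but does not prove one.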




2008-08-28 11:39:55

by Tomas Winkler

Subject: Re: iwl5000 oopses (was: iwl5000 oopses with Linus's tree)

On Thu, Aug 28, 2008 at 1:36 PM, Johannes Berg
<[email protected]> wrote:
> On Tue, 2008-08-05 at 18:20 +0300, Tomas Winkler wrote:
>> On Tue, Aug 5, 2008 at 3:22 PM, Johannes Berg <[email protected]> wrote:
>> >
>> >> This is kernel 2.6.27-rc1-00504-g2b12a4c-dirty
>
>> > [ 126.826663] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 1.3.27kds
>> > [ 126.826947] iwlagn: Copyright(c) 2003-2008 Intel Corporation
>> > [ 126.828369] iwlagn: Detected Intel Wireless WiFi Link 5350AGN REV=0x24
>> > [ 126.848680] iwlagn: Tunable channels: 13 802.11bg, 24 802.11a channels
>> > [ 127.014564] firmware: requesting iwlwifi-5000-1.ucode
>> > [ 127.170640] iwlagn: Error wrong command queue 43 command id 0x6B
>> > [ 127.170832] ------------[ cut here ]------------
>> > [ 127.170884] kernel BUG at drivers/net/wireless/iwlwifi/iwl-tx.c:1163!
>> > [ 127.170941] Oops: Exception in kernel mode, sig: 5 [#1]
>
> This is still happening with -rc4.

I know, at least one regression.
Tomas

2008-08-29 07:31:27

by Johannes Berg

Subject: Re: iwl5000 oopses


> >> ' Error wrong command queue 43 command id ___0x6B___' 6b looks more
> >> like slub poison -- accessing already freed skb
> >>
>
> >
> > hmm, 0x6B indeed is not a documented command ID...
> > Only triggering under load must point to some overflow or race i guess.
> >
> Yep.

Though the queue number actually keeps varying (it wasn't always 43),
you're right, 0x6b sounds like poison.

Have you enabled poison on your test machines?

> > I should get myself a new laptop to be able to play with this...
> > The best i can do now, is wonder if this patch
> > "[PATCH 08/10] iwlwifi: decrement rx skb counter in scan abort handler"
> > might be responsible, but that's just fuzzy string matching "recent patches"
> > with
> > "freed skb" ;-)
>
> No, that's a good patch.. Johannes failure looks like he got it right
> in the begining before scanning.
> We need open more logs and check what slub allocator is in use

Indeed, I'm using slub on a 6.5G 64-bit powerpc box with IOMMU so
there's lots of room for differences...

johannes



2008-08-05 15:26:42

by Tomas Winkler

Subject: Re: iwl5000 oopses (was: iwl5000 oopses with Linus's tree)

On Tue, Aug 5, 2008 at 3:22 PM, Johannes Berg <[email protected]> wrote:
>
>> This is kernel 2.6.27-rc1-00504-g2b12a4c-dirty
>
> Same happens with a newer kernel, I merged together wireless-testing and
> linux-2.6 (because powerpc won't build in -rc1) and get this:
>
> [ 126.826663] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 1.3.27kds
> [ 126.826947] iwlagn: Copyright(c) 2003-2008 Intel Corporation
> [ 126.828369] iwlagn: Detected Intel Wireless WiFi Link 5350AGN REV=0x24
> [ 126.848680] iwlagn: Tunable channels: 13 802.11bg, 24 802.11a channels
> [ 127.014564] firmware: requesting iwlwifi-5000-1.ucode
> [ 127.170640] iwlagn: Error wrong command queue 43 command id 0x6B
> [ 127.170832] ------------[ cut here ]------------
> [ 127.170884] kernel BUG at drivers/net/wireless/iwlwifi/iwl-tx.c:1163!
> [ 127.170941] Oops: Exception in kernel mode, sig: 5 [#1]


Something is messing up the queues.
Will check
Thanks
Tomas

2008-10-06 12:29:31

by Johannes Berg

Subject: Re: iwl5000 oopses with Linus's tree

On Tue, 2008-08-05 at 13:56 +0200, Johannes Berg wrote:
> Regardless of whether it is enabled or not, this shouldn't happen:
>
> [ 373.387834] iwl4965: uCode did not respond OK.
> [ 373.387959] iwl4965: Error wrong command queue 0 command id 0x1
> [ 373.388032] ------------[ cut here ]------------
> [ 373.388058] kernel BUG at drivers/net/wireless/iwlwifi/iwl-tx.c:1163!
> [ 373.388102] Oops: Exception in kernel mode, sig: 5 [#1]
>
> This is kernel 2.6.27-rc1-00504-g2b12a4c-dirty

I just found another thing that seems critical to this issue (I am now
using kernel 2.6.27-rc6-wl-01382-g0bea1f7-dirty):

It goes away completely when I use 4K-pages instead of 64K-pages.

johannes

