2008-10-17 15:27:57

by Richard Scherping

Subject: Re: iwlagn: associating with AP causes kernel hiccup

>>>> When I associate with an AP, Linux 2.6.27 seems to "hang" for a few
>>>> seconds. During that time, all sound stops playing and keyboard and
>>>> mouse input is impossible.
>>
>> I'm using Mandriva 2009.0 x86_64 with wpa_supplicant and Mandriva's
>> wireless network configuration tool (drakroam).
>>
>> Actually I just found out that running # ifconfig wlan0 down
>> is enough to trigger the sound and mouse hanging for a few seconds.
>
> And shortly after I wrote that, while associating while getting an IP
> with dhclient when associating with a WPA encrypted AP, I got this
> backtrace in my logs:
> [...]

I have a similar problem here. No crash so far, but the very same "hang" for a few seconds on "ifconfig wlan0 down". Interestingly, this only happens after a normal boot: once I have done a suspend and resume (S3), the hang no longer occurs.

Hardware: Thinkpad T61p with Intel 4965 agn
Software: Debian Lenny x86_64 with vanilla 2.6.27 kernel

I remember having the same "hang" with older kernels, too.

Richard


2008-10-17 20:02:21

by Tomas Winkler

Subject: Re: iwlagn: associating with AP causes kernel hiccup

On Fri, Oct 17, 2008 at 5:27 PM, Richard Scherping <[email protected]> wrote:
>>>>> When I associate with an AP, Linux 2.6.27 seems to "hang" for a few
>>>>> seconds. During that time, all sound stops playing and keyboard and
>>>>> mouse input is impossible.
>>>
>>> I'm using Mandriva 2009.0 x86_64 with wpa_supplicant and Mandriva's
>>> wireless network configuration tool (drakroam).
>>>
>>> Actually I just found out that running # ifconfig wlan0 down
>>> is enough to trigger the sound and mouse hanging for a few seconds.
>>
>> And shortly after I wrote that, while associating while getting an IP
>> with dhclient when associating with a WPA encrypted AP, I got this
>> backtrace in my logs:
>> [...]
>
> I have a similar problem here. No crash up to now, but the very same "hang" for a few seconds on "ifconfig wlan0 down". Interestingly this does only happen after a normal boot - once I did a suspend and resume (S3), there is no hang anymore.
>
> Hardware: Thinkpad T61p with Intel 4965 agn
> Software: Debian Lenny x86_64 with vanilla 2.6.27 kernel
>
The driver in 2.6.27 is not stable; please try to reproduce this in
current wireless-testing.git.
Thanks

2008-10-18 16:17:49

by Richard Scherping

Subject: Re: iwlagn: associating with AP causes kernel hiccup

Tomas Winkler wrote:
> On Fri, Oct 17, 2008 at 5:27 PM, Richard Scherping <[email protected]> wrote:
>>>>>> When I associate with an AP, Linux 2.6.27 seems to "hang" for a few
>>>>>> seconds. During that time, all sound stops playing and keyboard and
>>>>>> mouse input is impossible.
>>>> I'm using Mandriva 2009.0 x86_64 with wpa_supplicant and Mandriva's
>>>> wireless network configuration tool (drakroam).
>>>>
>>>> Actually I just found out that running # ifconfig wlan0 down
>>>> is enough to trigger the sound and mouse hanging for a few seconds.
>>> And shortly after I wrote that, while associating while getting an IP
>>> with dhclient when associating with a WPA encrypted AP, I got this
>>> backtrace in my logs:
>>> [...]
>> I have a similar problem here. No crash up to now, but the very same "hang" for a few seconds on "ifconfig wlan0 down". Interestingly this does only happen after a normal boot - once I did a suspend and resume (S3), there is no hang anymore.
>>
>> Hardware: Thinkpad T61p with Intel 4965 agn
>> Software: Debian Lenny x86_64 with vanilla 2.6.27 kernel
>>
> Driver in 2.6.27 is not stable, please try to reproduce this in
> current wireless-testing.git.

I do not have the time to compile and test wireless-testing ATM, sorry.

In fact I am annoyed by the fact that iwlagn is "known to be unstable" in a stable kernel release and that this even seems to be a totally normal thing...

Once compat-wireless is working again on my system I will try that and check whether the hang still occurs. The current compat-wireless does not work for me; compilation says

make[1]: Entering directory `/usr/src/linux-2.6.27'
/home/richard/Desktop/compat-wireless-2008-10-17/config.mk:44: "WARNING: You are running a kernel >= 2.6.23, you should enable in it CONFIG_NETDEVICES_MULTIQUEUE for 802.11[ne] support"

and I could not find CONFIG_NETDEVICES_MULTIQUEUE in menuconfig. Loading the modules fails with tons of unknown symbols and disagreeing symbol versions (already at the cfg80211 and mac80211 modules).

Perhaps I should go back to the Debian stock 2.6.26 kernel, but there Ad-Hoc is broken on iwl4965, although it worked without problems earlier in 2.6.24.

Richard

2008-10-19 15:27:58

by Andy Lutomirski

Subject: Re: iwlagn: associating with AP causes kernel hiccup

Richard Scherping wrote:
> Tomas Winkler wrote:
>> On Fri, Oct 17, 2008 at 5:27 PM, Richard Scherping <[email protected]> wrote:
>>>>>>> When I associate with an AP, Linux 2.6.27 seems to "hang" for a few
>>>>>>> seconds. During that time, all sound stops playing and keyboard and
>>>>>>> mouse input is impossible.
>>>>> I'm using Mandriva 2009.0 x86_64 with wpa_supplicant and Mandriva's
>>>>> wireless network configuration tool (drakroam).
>>>>>
>>>>> Actually I just found out that running # ifconfig wlan0 down
>>>>> is enough to trigger the sound and mouse hanging for a few seconds.
>>>> And shortly after I wrote that, while associating while getting an IP
>>>> with dhclient when associating with a WPA encrypted AP, I got this
>>>> backtrace in my logs:
>>>> [...]
>>> I have a similar problem here. No crash up to now, but the very same "hang" for a few seconds on "ifconfig wlan0 down". Interestingly this does only happen after a normal boot - once I did a suspend and resume (S3), there is no hang anymore.
>>>
>>> Hardware: Thinkpad T61p with Intel 4965 agn
>>> Software: Debian Lenny x86_64 with vanilla 2.6.27 kernel
>>>
>> Driver in 2.6.27 is not stable, please try to reproduce this in
>> current wireless-testing.git.
>
> I do not have the time to compile and test wireless-testing ATM, sorry.
>
> In fact I am annoyed by the fact that iwlagn is "known to be unstable" in a stable kernel release and that this even seems to be a totally normal thing...

Amen. This driver has been available and more-or-less working for ages.
What kernel am I supposed to run if I just want a stable system?
Haven't found one yet, other than distro kernels...

In any case, I've seen these complete system hiccups with iwl4965 and
iwlagn since at least 2.6.25 and through quite a few wireless-testing
versions. I bet that this, along with things like it, is the culprit:

In many, many functions:
        spin_lock_irqsave(&priv->lock, flags);
        ...
        ret = iwl_grab_nic_access(priv);

In iwl-io.h (2.6.26.something):
static inline int _iwl_grab_nic_access(struct iwl_priv *priv)
{
        ...
        ret = _iwl_poll_bit(priv, CSR_GP_CNTRL,
                            CSR_GP_CNTRL_REG_VAL_MAC_ACCESS_EN,
                            (CSR_GP_CNTRL_REG_FLAG_MAC_CLOCK_READY |
                             CSR_GP_CNTRL_REG_FLAG_GOING_TO_SLEEP), 50);
        ...
}

static inline int _iwl_poll_bit(struct iwl_priv *priv, u32 addr,
                                u32 bits, u32 mask, int timeout)
{
        int i = 0;

        do {
                if ((_iwl_read32(priv, addr) & mask) == (bits & mask))
                        return i;
                mdelay(10);
                i += 10;
        } while (i < timeout);

        return -ETIMEDOUT;
}

Polling the hardware waiting for firmware to do something *with IRQs
disabled*? I'd really rather the drivers on my system didn't do this.
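
To put a rough number on that, here is a minimal user-space model of the
polling loop quoted above. It is only a sketch: struct fake_nic,
model_read32, model_mdelay and model_poll_bit are hypothetical stand-ins,
not driver functions, and the delay is merely counted rather than waited.
With the 50 ms timeout and 10 ms step shown above, a register that never
becomes ready costs five reads and 50 ms of busy-waiting per call, all of
it with IRQs off whenever the caller holds priv->lock via
spin_lock_irqsave().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the device; not a driver type. */
struct fake_nic {
        uint32_t reg;           /* value the register shows once "ready" */
        int ready_after_reads;  /* reads before reg is visible; < 0 = never */
        int reads;              /* number of register reads performed */
        int busy_wait_ms;       /* total time "spent" in model_mdelay() */
};

/* Models a register read: returns 0 until the device becomes ready. */
static uint32_t model_read32(struct fake_nic *nic)
{
        int ready = nic->ready_after_reads >= 0 &&
                    nic->reads >= nic->ready_after_reads;

        nic->reads++;
        return ready ? nic->reg : 0;
}

/* Models mdelay(): only accounts for the time, does not actually wait. */
static void model_mdelay(struct fake_nic *nic, int ms)
{
        nic->busy_wait_ms += ms;
}

/* Same loop shape as the quoted _iwl_poll_bit(), on the fake device. */
static int model_poll_bit(struct fake_nic *nic, uint32_t bits,
                          uint32_t mask, int timeout_ms)
{
        int i = 0;

        assert(timeout_ms > 0);
        do {
                if ((model_read32(nic) & mask) == (bits & mask))
                        return i;       /* elapsed ms until the bits matched */
                model_mdelay(nic, 10);
                i += 10;
        } while (i < timeout_ms);

        return -ETIMEDOUT;
}
```

For example, calling model_poll_bit(&nic, 0x1, 0x1, 50) on a fake_nic
whose ready_after_reads is -1 returns -ETIMEDOUT and leaves busy_wait_ms
at 50. Several such calls back to back under the same lock would add up
accordingly.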

I'd attempt to fix this myself, but I have no clue what the locking
rules are supposed to be.

Would I be out of line for wishing the iwlwifi developers would fix
longstanding issues (latency and maybe horkage after resume, although
the latter seems much improved lately) before adding fancy new things?

--Andy

2008-10-19 22:13:25

by Tomas Winkler

Subject: Re: iwlagn: associating with AP causes kernel hiccup

On Sun, Oct 19, 2008 at 5:18 PM, Andy Lutomirski <[email protected]> wrote:
> Richard Scherping wrote:
>>
>> Tomas Winkler wrote:
>>>
>>> On Fri, Oct 17, 2008 at 5:27 PM, Richard Scherping
>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>> When I associate with an AP, Linux 2.6.27 seems to "hang" for a few
>>>>>>>> seconds. During that time, all sound stops playing and keyboard and
>>>>>>>> mouse input is impossible.
>>>>>>
>>>>>> I'm using Mandriva 2009.0 x86_64 with wpa_supplicant and Mandriva's
>>>>>> wireless network configuration tool (drakroam).
>>>>>>
>>>>>> Actually I just found out that running # ifconfig wlan0 down
>>>>>> is enough to trigger the sound and mouse hanging for a few seconds.
>>>>>
>>>>> And shortly after I wrote that, while associating while getting an IP
>>>>> with dhclient when associating with a WPA encrypted AP, I got this
>>>>> backtrace in my logs:
>>>>> [...]
>>>>
>>>> I have a similar problem here. No crash up to now, but the very same
>>>> "hang" for a few seconds on "ifconfig wlan0 down". Interestingly this does
>>>> only happen after a normal boot - once I did a suspend and resume (S3),
>>>> there is no hang anymore.
>>>>
>>>> Hardware: Thinkpad T61p with Intel 4965 agn
>>>> Software: Debian Lenny x86_64 with vanilla 2.6.27 kernel
>>>>
>>> Driver in 2.6.27 is not stable, please try to reproduce this in
>>> current wireless-testing.git.
>>
>> I do not have the time to compile and test wireless-testing ATM, sorry.
>>
>> In fact I am annoyed by the fact that iwlagn is "known to be unstable" in
>> a stable kernel release and that this even seems to be a totally normal
>> thing...

>
> Amen.
Stable doesn't mean all components are stable. To quote Linus's blog:
"It doesn't have to be perfect (and obviously no release ever is), but
it needs to be in reasonable shape"

The fact is that some critical patches were rejected during the rc
cycle as not being regression fixes, and they probably need to be
pushed to the stable version now, or distributions will merge them.
We gave higher priority to testing the 32-bit version, so it is more
stable than the 64-bit one, which got much less in-house testing, and
we've missed many issues there. The driver doesn't get full exposure
until it reaches the public in a stable version, so no bugs are opened
during the rc cycle and consequently they are not fixed in the stable
version either. Unfortunately, there is not much system testing at all
for what gets into the merge window.
Second, the whole mac80211 stack didn't fully address the MQ rewrite,
so it's a bit shaky as well, and this will also be the case in 2.6.28.

> This driver has been available and more-or-less working for ages.
> What kernel am I supposed to run if I just want a stable system? Haven't
> found one yet, other than distro kernels...
>
> In any case, I've seen these complete system hiccups with iwl4965 and iwlagn
> since at least 2.6.25 and through quite a few wireless-testing versions. I
> bet that this, along with things like it, is the culprit:

I haven't seen that you've filed a bug for it.

>
> In many, many functions:
>         spin_lock_irqsave(&priv->lock, flags);
>         ...
>         ret = iwl_grab_nic_access(priv);
>
> In iwl-io.h (2.6.26.something):

This code has been here since at least version 2.6.18; it was just moved around.

> static inline int _iwl_grab_nic_access(struct iwl_priv *priv)
> {
>         ...
>         ret = _iwl_poll_bit(priv, CSR_GP_CNTRL,
>                             CSR_GP_CNTRL_REG_VAL_MAC_ACCESS_EN,
>                             (CSR_GP_CNTRL_REG_FLAG_MAC_CLOCK_READY |
>                              CSR_GP_CNTRL_REG_FLAG_GOING_TO_SLEEP), 50);
>         ...
> }
>
> static inline int _iwl_poll_bit(struct iwl_priv *priv, u32 addr,
>                                 u32 bits, u32 mask, int timeout)
> {
>         int i = 0;
>
>         do {
>                 if ((_iwl_read32(priv, addr) & mask) == (bits & mask))
>                         return i;
>                 mdelay(10);
>                 i += 10;
>         } while (i < timeout);
>
>         return -ETIMEDOUT;
> }
>
> Polling the hardware waiting for firmware to do something *with IRQs
> disabled*? I'd really rather the drivers on my system didn't do this.
>
> I'd attempt to fix this myself, but I have no clue what the locking rules
> are supposed to be.

The locking really needs to be revised, but until now I haven't seen
any show-stopper issues, so it didn't get priority.

> Would I be out of line for wishing the iwlwifi developers
Patches are always welcome

> would fix
> longstanding issues (latency and maybe horkage after resume, although the
> latter seems much improved lately) before adding fancy new things?

There are also problems in mac80211 itself, and we have done some
work to improve it a bit as well.
Tomas

2008-10-19 22:52:20

by Andy Lutomirski

Subject: Re: iwlagn: associating with AP causes kernel hiccup

On Sun, Oct 19, 2008 at 6:12 PM, Tomas Winkler <[email protected]> wrote:
> On Sun, Oct 19, 2008 at 5:18 PM, Andy Lutomirski <[email protected]> wrote:
>> Richard Scherping wrote:
>>>
>>> Tomas Winkler wrote:
>
>>
>> Amen.
> Stable doesn't mean all components are stable. To quote Linus's blog:
> "It doesn't have to be perfect (and obviously no release ever is), but
> it needs to be in reasonable shape"
>
> The fact is that some critical patches were rejected during the rc
> cycle as not being regression fixes, and they probably need to be
> pushed to the stable version now, or distributions will merge them.
> We gave higher priority to testing the 32-bit version, so it is more
> stable than the 64-bit one, which got much less in-house testing, and
> we've missed many issues there. The driver doesn't get full exposure
> until it reaches the public in a stable version, so no bugs are opened
> during the rc cycle and consequently they are not fixed in the stable
> version either. Unfortunately, there is not much system testing at all
> for what gets into the merge window.
> Second, the whole mac80211 stack didn't fully address the MQ rewrite,
> so it's a bit shaky as well, and this will also be the case in 2.6.28.

OK.

>
> This driver has been available and more-or-less working for ages.
>> What kernel am I supposed to run if I just want a stable system? Haven't
>> found one yet, other than distro kernels...
>>
>> In any case, I've seen these complete system hiccups with iwl4965 and iwlagn
>> since at least 2.6.25 and through quite a few wireless-testing versions. I
>> bet that this, along with things like it, is the culprit:
>
> I haven't seen that you've filed a bug for it.

Fair enough. #1790.

>
> The locking really needs to be revised, but until now I haven't seen
> any show-stopper issues, so it didn't get priority.
>
>> Would I be out of line for wishing the iwlwifi developers
> Patches are always welcome

I can write a patch to add a mutex and change it to:

take mutex
grab_nic
spinlock

but I bet that would break all kinds of things. :)
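
Roughly what I have in mind, as a toy user-space model rather than a
patch: every name below (struct model_cpu, model_wait, model_grab_nic,
current_order, proposed_order) is hypothetical, the mutex is modelled as
a no-op because taking a sleepable lock leaves IRQs enabled, and the
sketch says nothing about whether every caller is actually allowed to
sleep, which a kernel mutex would require. It only counts how much of
the worst-case wait happens with IRQs off under each ordering.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; nothing here is a kernel or driver API. */
struct model_cpu {
        bool irqs_off;          /* true while the modelled spinlock is held */
        int irq_off_ms;         /* busy-wait time accumulated with IRQs off */
        int total_wait_ms;      /* all busy-wait time, IRQs on or off */
};

/* Accounts for a busy-wait of ms milliseconds. */
static void model_wait(struct model_cpu *cpu, int ms)
{
        assert(ms >= 0);
        cpu->total_wait_ms += ms;
        if (cpu->irqs_off)
                cpu->irq_off_ms += ms;
}

/* Stand-in for grab_nic: in the worst case it polls for wait_ms. */
static void model_grab_nic(struct model_cpu *cpu, int wait_ms)
{
        model_wait(cpu, wait_ms);
}

static void model_spin_lock_irqsave(struct model_cpu *cpu)
{
        cpu->irqs_off = true;
}

static void model_spin_unlock_irqrestore(struct model_cpu *cpu)
{
        cpu->irqs_off = false;
}

/* Ordering used today: spinlock first, then the slow poll. */
static void current_order(struct model_cpu *cpu, int wait_ms)
{
        model_spin_lock_irqsave(cpu);
        model_grab_nic(cpu, wait_ms);
        model_spin_unlock_irqrestore(cpu);
}

/* Suggested ordering: (mutex), slow poll, then a short spinlock section. */
static void proposed_order(struct model_cpu *cpu, int wait_ms)
{
        /* take mutex: modelled as a no-op, IRQs stay enabled */
        model_grab_nic(cpu, wait_ms);
        model_spin_lock_irqsave(cpu);
        /* the short register access would go here */
        model_spin_unlock_irqrestore(cpu);
        /* release mutex: also a no-op in this model */
}
```

With a 50 ms worst-case poll, current_order() books all 50 ms as
IRQ-off time, while proposed_order() books none of it.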

--Andy

2008-10-19 23:12:23

by Tomas Winkler

Subject: Re: iwlagn: associating with AP causes kernel hiccup

On Mon, Oct 20, 2008 at 12:52 AM, Andrew Lutomirski <[email protected]> wrote:
> On Sun, Oct 19, 2008 at 6:12 PM, Tomas Winkler <[email protected]> wrote:
>> On Sun, Oct 19, 2008 at 5:18 PM, Andy Lutomirski <[email protected]> wrote:
>>> Richard Scherping wrote:
>>>>
>>>> Tomas Winkler wrote:
>>
>>>
>>> Amen.
>> Stable doesn't mean all components are stable. To quote Linus's blog:
>> "It doesn't have to be perfect (and obviously no release ever is), but
>> it needs to be in reasonable shape"
>>
>> The fact is that some critical patches were rejected during the rc
>> cycle as not being regression fixes, and they probably need to be
>> pushed to the stable version now, or distributions will merge them.
>> We gave higher priority to testing the 32-bit version, so it is more
>> stable than the 64-bit one, which got much less in-house testing, and
>> we've missed many issues there. The driver doesn't get full exposure
>> until it reaches the public in a stable version, so no bugs are opened
>> during the rc cycle and consequently they are not fixed in the stable
>> version either. Unfortunately, there is not much system testing at all
>> for what gets into the merge window.
>> Second, the whole mac80211 stack didn't fully address the MQ rewrite,
>> so it's a bit shaky as well, and this will also be the case in 2.6.28.
>
> OK.
>
>>
>> This driver has been available and more-or-less working for ages.
>>> What kernel am I supposed to run if I just want a stable system? Haven't
>>> found one yet, other than distro kernels...
>>>
>>> In any case, I've seen these complete system hiccups with iwl4965 and iwlagn
>>> since at least 2.6.25 and through quite a few wireless-testing versions. I
>>> bet that this, along with things like it, is the culprit:
>>
>> I haven't seen that you've filed a bug for it.
>
> Fair enough. #1790.
>
Appreciated.

>>
>> The locking really needs to be revised, but until now I haven't seen
>> any show-stopper issues, so it didn't get priority.
>>
>>> Would I be out of line for wishing the iwlwifi developers
>> Patches are always welcome
>
> I can write a patch to add a mutex and change it to:
>
> take mutex
> grab_nic
> spinlock
>
> but I bet that would break all kinds of things. :)
>
I'm far from being a locking master, but I think a mutex just won't
work here: it can be used only in a sleepable context. Also, if I'm not
mistaken, if you are using the lock in IRQ context, we must use the
irq-safe version of the spin lock.
Thanks
Tomas