2008-04-05 16:26:04

by Bas Hulsken

Subject: hostapd hangs on rt2500pci, leaving the nic in an unstable state

Hi Ivo,

First of all, thanks a lot for your help so far. I feel I'm getting
really close to making rt2500pci work as an AP with hostapd :) As I've
told you in previous emails, hostapd hangs somewhere during the
authentication process, just after receiving the EAPOL-Key frame.

After hostapd outputs:
WPA: 00:1b:77:a4:db:26 WPA_PTK entering state PTKCALCNEGOTIATING2
WPA: 00:1b:77:a4:db:26 WPA_PTK entering state PTKINITNEGOTIATING
it hangs, and wlan1 has received exactly one frame from the
laptop (there's a message showing up in the logs at this time,
complaining that wlan1 can't find any IPv6 routers). No further frames
are sent from hostapd, although beacon frames are still being sent, and
the interface still receives frames as well. The last frame rt2500pci
receives is the EAPOL key from the laptop. Only ctrl-c can kill
hostapd, and after that the driver is in an unusable and unstable
state. Even unloading rt2500pci does not help, and eventually the
entire system will become unstable, in particular under high I/O loads.
After some time, a message shows up that the IRQ for the rt2500pci is
being disabled (ACPI: PCI interrupt for device 0000:05:01.0 disabled).
If after that I get a lot of I/O activity on devices sharing the same
PCI interrupt (like recording some TV with an ivtv card), then a lot of
shit happens: ivtv gives timeouts, hard drive controllers give I/O
errors, etc. As I've mentioned, this also happens after I unload and reload
the rt2500pci module. So apparently the hardware is left in a bad state
after hostapd locks up.

Johannes, in the meantime, has tested hostapd again for current
wireless-testing, and for him things work (on a b43). His guess, also
based on the fact that the entire system gets unstable, is that
something still goes wrong in the rt2500pci driver. Do you have any
suggestions as to what I can do to further debug this issue? I've made a
register dump, as suggested on the serialmonkey forum, it's attached to
this mail. I've also attached an archive with a frame dump from debugfs,
obtained during the authentication process leading up to the hang. The
archive also contains the hostapd and interface configuration files, and
a log of the hostapd output.

Well, I hope you can help me a bit further on the basis of this
information. If there is anything else I can do to debug this issue
further, please let me know.

Thanks in advance, and best regards,
Bas


Attachments:
hostapdhang_regdump.txt (5.09 kB)
hostapd_try3.tar.bz2 (97.14 kB)

2008-04-10 16:43:57

by Bas Hulsken

Subject: Re: hostapd hangs on rt2500pci, leaving the nic in an unstable state

Hi,

On Wed, 2008-04-09 at 19:33 +0200, Ivo van Doorn wrote:

> > Does the patch by Daniel Wagner:
> > rt61pci: rt61pci_beacon_update do not free skb twice
> > which was sent to the linux-wireless mailing list today help?
>
> wait never mind, that was a stupid comment since you are using rt2500pci... :S
> I'll see if the patch also is applicable for rt2500pci and post that patch if that is
> the case. :)

I assume the patch as it applies to rt2500pci is this one: "rt2x00: Only
free skb when beacon_update fails"

I've tried it, and it does not solve the problem, still exactly the same
hang as before.

Bas



2008-04-07 17:51:25

by Bas Hulsken

Subject: Re: hostapd hangs on rt2500pci, leaving the nic in an unstable state

Hi

> Could you try editing rt2500pci.c
> and in rt2500pci_interrupt() change:
>
> if (rt2x00_get_field32(reg, CSR7_TBCN_EXPIRE))
>     rt2x00lib_beacondone(rt2x00dev);
>
> to
>
> //if (rt2x00_get_field32(reg, CSR7_TBCN_EXPIRE))
> //    rt2x00lib_beacondone(rt2x00dev);

did that, also applied Johannes' "fix STA AID bug" patch, but no change
in behaviour, still the same hang in hostapd.

I've attached the debugfs dumps, same meaning as in the previous mail
(see below)

> >
> > 1) regdump_beforeifup: a regdump taken before the interface is brought
> > up (module is loaded of course)
> > 2) regdump_beforehostapd: a regdump taken immediately after the
> > interface is brought up, but before hostapd is running.
> > 3) regdump_beforeauth: a regdump taken after hostapd is started, and
> > running, but before a client has started an authentication.
> > 4) regdump_afterauthhang: a regdump, after hostapd has locked up, due to
> > an authentication attempt.
>

Bas


Attachments:
regdump_afterauthhang (5.09 kB)
regdump_beforeauth (5.09 kB)
regdump_beforehostapd (5.09 kB)
regdump_beforeifup (5.09 kB)

2008-04-06 15:06:53

by Ivo Van Doorn

Subject: Re: hostapd hangs on rt2500pci, leaving the nic in an unstable state

Hi,

> > This is all very strange behavior, I don't know what can cause the
> > ACPI to disable an interrupt for the device, but what is interesting to
> > see is that the BSSID in the register seems to have been cleared....
> I remember reading somewhere that ACPI can do that, if it detects
> spurious interrupts coming from the device. Not sure if that's the only
> situation where ACPI does this.

Could you try editing rt2500pci.c
and in rt2500pci_interrupt() change:

if (rt2x00_get_field32(reg, CSR7_TBCN_EXPIRE))
    rt2x00lib_beacondone(rt2x00dev);

to

//if (rt2x00_get_field32(reg, CSR7_TBCN_EXPIRE))
//    rt2x00lib_beacondone(rt2x00dev);

> > Could you create a debugfs dump of the time just before it breaks?
> > Because it is very odd to see the BSSID to be suddenly cleared, currently
> > it seems to occur with some people in managed mode as well and so
> > far I haven't been able to trace it.
> > Although now that it occurs in master mode as well it becomes
> > more worrying since mac80211 doesn't control the BSSID in that case
> > (rt2x00 just grabs the MAC address). So if this seems reproducible there
> > might be some sort of hardware register reset occurring that messes
> > things up badly.. :S
> ok, here are some regdumps:
>
> 1) regdump_beforeifup: a regdump taken before the interface is brought
> up (module is loaded of course)
> 2) regdump_beforehostapd: a regdump taken immediately after the
> interface is brought up, but before hostapd is running.
> 3) regdump_beforeauth: a regdump taken after hostapd is started, and
> running, but before a client has started an authentication.
> 4) regdump_afterauthhang: a regdump, after hostapd has locked up, due to
> an authentication attempt.

Ivo

2008-04-09 17:28:15

by Ivo Van Doorn

Subject: Re: hostapd hangs on rt2500pci, leaving the nic in an unstable state

On Monday 07 April 2008, Bas Hulsken wrote:
> Hi
>
> > Could you try editing rt2500pci.c
> > and in rt2500pci_interrupt() change:
> >
> > if (rt2x00_get_field32(reg, CSR7_TBCN_EXPIRE))
> >     rt2x00lib_beacondone(rt2x00dev);
> >
> > to
> >
> > //if (rt2x00_get_field32(reg, CSR7_TBCN_EXPIRE))
> > //    rt2x00lib_beacondone(rt2x00dev);
>
> did that, also applied Johannes' "fix STA AID bug" patch, but no change
> in behaviour, still the same hang in hostapd.

Does the patch by Daniel Wagner:
rt61pci: rt61pci_beacon_update do not free skb twice
which was sent to the linux-wireless mailing list today help?

Ivo



2008-04-06 12:56:10

by Bas Hulsken

Subject: Re: hostapd hangs on rt2500pci, leaving the nic in an unstable state

Hi Ivo,

On Sat, 2008-04-05 at 19:46 +0200, Ivo van Doorn wrote:
> Hi,

>
> This is all very strange behavior, I don't know what can cause the
> ACPI to disable an interrupt for the device, but what is interesting to
> see is that the BSSID in the register seems to have been cleared....
I remember reading somewhere that ACPI can do that, if it detects
spurious interrupts coming from the device. Not sure if that's the only
situation where ACPI does this.

>
> Could you create a debugfs dump of the time just before it breaks?
> Because it is very odd to see the BSSID to be suddenly cleared, currently
> it seems to occur with some people in managed mode as well and so
> far I haven't been able to trace it.
> Although now that it occurs in master mode as well it becomes
> more worrying since mac80211 doesn't control the BSSID in that case
> (rt2x00 just grabs the MAC address). So if this seems reproducible there
> might be some sort of hardware register reset occurring that messes
> things up badly.. :S
ok, here are some regdumps:

1) regdump_beforeifup: a regdump taken before the interface is brought
up (module is loaded of course)
2) regdump_beforehostapd: a regdump taken immediately after the
interface is brought up, but before hostapd is running.
3) regdump_beforeauth: a regdump taken after hostapd is started, and
running, but before a client has started an authentication.
4) regdump_afterauthhang: a regdump, after hostapd has locked up, due to
an authentication attempt.
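
One way to see what changed between two of these regdumps (for example,
whether a BSSID register got cleared) is to diff them. The following is a
minimal Python sketch and not part of the attached files. It assumes each
dump line holds a register name or offset followed by its value, separated
by whitespace, which may not match the real debugfs layout. The helper
names parse_regdump and diff_regdumps are hypothetical, as are the register
labels in the usage comment.

```python
def parse_regdump(text):
    """Parse '<register> <value>' lines into a dict.

    Assumed format only: blank lines and lines with fewer than two
    whitespace-separated fields are skipped.
    """
    regs = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        regs[parts[0]] = parts[1]
    return regs


def diff_regdumps(before, after):
    """Return {register: (old, new)} for every register whose value differs.

    A register present in only one dump is reported with None on the
    missing side.
    """
    changed = {}
    for name in sorted(set(before) | set(after)):
        old = before.get(name)
        new = after.get(name)
        if old != new:
            changed[name] = (old, new)
    return changed


# Usage with placeholder register labels (REG_A, REG_B are made up):
#   before = parse_regdump(open("regdump_beforeauth").read())
#   after = parse_regdump(open("regdump_afterauthhang").read())
#   for name, (old, new) in diff_regdumps(before, after).items():
#       print(name, old, "->", new)
```

Comparing regdump_beforeauth against regdump_afterauthhang this way would
narrow the search to the registers that changed during the hang.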

hope this helps,
Bas


Attachments:
regdump_afterauthhang (5.09 kB)
regdump_beforeauth (5.09 kB)
regdump_beforehostapd (5.09 kB)
regdump_beforeifup (5.09 kB)

2008-04-10 16:59:23

by Ivo Van Doorn

Subject: Re: hostapd hangs on rt2500pci, leaving the nic in an unstable state

On Thursday 10 April 2008, Bas Hulsken wrote:
> Hi,
>
> On Wed, 2008-04-09 at 19:33 +0200, Ivo van Doorn wrote:
>
> > > Does the patch by Daniel Wagner:
> > > rt61pci: rt61pci_beacon_update do not free skb twice
> > > which was sent to the linux-wireless mailing list today help?
> >
> > wait never mind, that was a stupid comment since you are using rt2500pci... :S
> > I'll see if the patch also is applicable for rt2500pci and post that patch if that is
> > the case. :)
>
> I assume the patch as it applies to rt2500pci is this one: "rt2x00: Only
> free skb when beacon_update fails"

Not really, that one was something I found while looking through the beaconing code.
The rt61pci patch indeed fixed something that was only broken in rt61pci; the other
drivers did the beacon handling correctly.

> I've tried it, and it does not solve the problem, still exactly the same
> hang as before.

I expected as much. I'm still looking into other causes for this bug.

Ivo

2008-04-09 17:29:08

by Ivo Van Doorn

Subject: Re: hostapd hangs on rt2500pci, leaving the nic in an unstable state

On Wednesday 09 April 2008, Ivo van Doorn wrote:
> On Monday 07 April 2008, Bas Hulsken wrote:
> > Hi
> >
> > > Could you try editing rt2500pci.c
> > > and in rt2500pci_interrupt() change:
> > >
> > > if (rt2x00_get_field32(reg, CSR7_TBCN_EXPIRE))
> > >     rt2x00lib_beacondone(rt2x00dev);
> > >
> > > to
> > >
> > > //if (rt2x00_get_field32(reg, CSR7_TBCN_EXPIRE))
> > > //    rt2x00lib_beacondone(rt2x00dev);
> >
> > did that, also applied Johannes' "fix STA AID bug" patch, but no change
> > in behaviour, still the same hang in hostapd.
>
> Does the patch by Daniel Wagner:
> rt61pci: rt61pci_beacon_update do not free skb twice
> which was sent to the linux-wireless mailing list today help?

wait never mind, that was a stupid comment since you are using rt2500pci... :S
I'll see if the patch also is applicable for rt2500pci and post that patch if that is
the case. :)

Ivo

2008-04-05 17:44:54

by Ivo Van Doorn

Subject: Re: hostapd hangs on rt2500pci, leaving the nic in an unstable state

Hi,

> After hostapd outputs:
> WPA: 00:1b:77:a4:db:26 WPA_PTK entering state PTKCALCNEGOTIATING2
> WPA: 00:1b:77:a4:db:26 WPA_PTK entering state PTKINITNEGOTIATING
> it hangs, and wlan1 has received exactly one frame from the
> laptop (there's a message showing up in the logs at this time,
> complaining that wlan1 can't find any IPv6 routers). No further frames
> are sent from hostapd, although beacon frames are still being sent, and
> the interface still receives frames as well. The last frame rt2500pci
> receives is the EAPOL key from the laptop. Only ctrl-c can kill
> hostapd, and after that the driver is in an unusable and unstable
> state. Even unloading rt2500pci does not help, and eventually the
> entire system will become unstable, in particular under high I/O loads.
> After some time, a message shows up that the IRQ for the rt2500pci is
> being disabled (ACPI: PCI interrupt for device 0000:05:01.0 disabled).
> If after that I get a lot of I/O activity on devices sharing the same
> PCI interrupt (like recording some TV with an ivtv card), then a lot of
> shit happens: ivtv gives timeouts, hard drive controllers give I/O
> errors, etc. As I've mentioned, this also happens after I unload and reload
> the rt2500pci module. So apparently the hardware is left in a bad state
> after hostapd locks up.

This is all very strange behavior, I don't know what can cause the
ACPI to disable an interrupt for the device, but what is interesting to
see is that the BSSID in the register seems to have been cleared....

Could you create a debugfs dump of the time just before it breaks?
Because it is very odd to see the BSSID to be suddenly cleared, currently
it seems to occur with some people in managed mode as well and so
far I haven't been able to trace it.
Although now that it occurs in master mode as well it becomes
more worrying since mac80211 doesn't control the BSSID in that case
(rt2x00 just grabs the MAC address). So if this seems reproducible there
might be some sort of hardware register reset occurring that messes
things up badly.. :S

Ivo