2009-10-31 18:48:58

by Rob Browning

Subject: ath9k based AP stops responding after half a day or so


I have a machine that's running a recent compat-wireless (2009-10-17),
kernel 2.6.30-2-686 (Debian), and a build of hostapd as of 2009-10-17
(08d38568). The machine also has an AR5008 (ath9k) based PCI card.

After the machine reboots, wireless clients can connect to the AP
normally for a half day or so. Then, at some point, all connections
fail, and clients cannot even see the network until the machine is
rebooted. Reloading the wireless modules and restarting hostapd
doesn't help.

Once the network has disappeared, if I restart hostapd with -dd, it gets
to this point:

WPA: group state machine entering state GTK_INIT (VLAN-ID 0)
GMK - hexdump(len=32): [REMOVED]
GTK - hexdump(len=32): [REMOVED]
WPA: group state machine entering state SETKEYSDONE (VLAN-ID 0)
nl_set_encr: ifindex=5 alg=2 addr=(nil) key_idx=1 set_tx=1 seq_len=0 key_len=32
nl80211: Set beacon (beacon_set=0)
wlan0: Setup of interface done.
MGMT (TX callback) ACK

and then pauses for a long time. Eventually it prints this:

wlan0: WPA rekeying GTK
WPA: group state machine entering state SETKEYS (VLAN-ID 0)
GMK - hexdump(len=32): [REMOVED]
GTK - hexdump(len=32): [REMOVED]
wpa_group_setkeys: GKeyDoneStations=0
WPA: group state machine entering state SETKEYSDONE (VLAN-ID 0)
nl_set_encr: ifindex=5 alg=2 addr=(nil) key_idx=2 set_tx=1 seq_len=0 key_len=32
wlan0: WPA rekeying GTK

but clients still can't connect until I reboot the machine.

If it helps, once the AP stopped responding, I reloaded the ath9k
modules with debugging enabled (modprobe ath9k debug=0xffffffff) and
found that while the machine is in this state, it prints something like
this to the log repeatedly:

10:50:06 kernel: [230631.480019] ath: Writing ofdmbase=12582412 cckbase=12582712
10:50:07 kernel: [230632.080018] ath: Writing ofdmbase=12582412 cckbase=12582712
10:50:08 kernel: [230632.680020] ath: ANI parameters:
10:50:08 kernel: [230632.680024] ath: noiseImmunityLevel=0, spurImmunityLevel=1, ofdmWeakSigDetectOff=1
10:50:08 kernel: [230632.680029] ath: cckWeakSigThreshold=0, firstepLevel=0, listenTime=551
10:50:08 kernel: [230632.680032] ath: cycleCount=-1758147678, ofdmPhyErrCount=99, cckPhyErrCount=1
10:50:08 kernel: [230632.680035]
10:50:08 kernel: [230632.680037] ath: Writing ofdmbase=12582412 cckbase=12582712
10:50:08 kernel: [230632.880017] ath: invalid cmd 2
10:50:08 kernel: [230632.880023] ath: ANI parameters:
10:50:08 kernel: [230632.880025] ath: noiseImmunityLevel=0, spurImmunityLevel=2, ofdmWeakSigDetectOff=1
10:50:08 kernel: [230632.880029] ath: cckWeakSigThreshold=0, firstepLevel=0, listenTime=188
10:50:08 kernel: [230632.880032] ath: cycleCount=-1749347644, ofdmPhyErrCount=107, cckPhyErrCount=0
10:50:08 kernel: [230632.880035]
10:50:08 kernel: [230632.880037] ath: Writing ofdmbase=12582412 cckbase=12582712
10:50:08 kernel: [230633.480017] ath: Writing ofdmbase=12582412 cckbase=12582712
10:50:09 kernel: [230634.080018] ath: Writing ofdmbase=12582412 cckbase=12582712
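
A note on reading the log above: the negative cycleCount values look like a 32-bit counter printed with a signed format. That is an assumption, not something the log confirms. Under that assumption, the following minimal Python sketch reinterprets the two quoted values as unsigned and estimates how fast the counter advances between them. The helper names (as_u32, cycle_samples, cycles_per_second) are made up for this illustration and are not part of ath9k.

```python
import re

# The two cycleCount lines quoted in the log above.
LOG = """\
10:50:08 kernel: [230632.680032] ath: cycleCount=-1758147678, ofdmPhyErrCount=99, cckPhyErrCount=1
10:50:08 kernel: [230632.880032] ath: cycleCount=-1749347644, ofdmPhyErrCount=107, cckPhyErrCount=0
"""

# Kernel timestamp in brackets, then the (possibly negative) counter value.
PATTERN = re.compile(r"\[\s*(\d+\.\d+)\]\s+ath: cycleCount=(-?\d+)")


def as_u32(value):
    """Reinterpret an integer as an unsigned 32-bit value (assumed counter width)."""
    return value & 0xFFFFFFFF


def cycle_samples(text):
    """Return (timestamp_seconds, unsigned_cycle_count) pairs found in the log text."""
    return [(float(ts), as_u32(int(raw))) for ts, raw in PATTERN.findall(text)]


def cycles_per_second(samples):
    """Estimate the counter rate from the first and last sample.

    The difference is masked to 32 bits so that a single wrap-around
    between the two samples is still handled.
    """
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    return as_u32(c1 - c0) / (t1 - t0)


samples = cycle_samples(LOG)
for ts, count in samples:
    print(ts, count)
print(round(cycles_per_second(samples) / 1e6, 1), "million cycles/s")
```

For the two samples quoted above this works out to roughly 44 million cycles per second, so the counter at least appears to be advancing while the AP is unresponsive. Whether that says anything about the cause of the hang is not established by this log.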

Assuming this isn't a local configuration problem, I'd like to help fix
it if I can. In case it matters, the machine in question is running a
firewall (shorewall).

Please let me know if I can provide further information.

Thanks
--
Rob Browning
rlb @defaultvalue.org and @debian.org; previously @cs.utexas.edu
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4


2009-11-29 12:51:55

by Björn Smedman

Subject: Re: ath9k based AP stops responding after half a day or so

Hi all,

I'm seeing a similar problem with compat-wireless-2.6.32-rc1 and
hostapd 0.6.9 on an AR9100 (ath9k) mips/ahb based system. This is with
an open access point and without ANI so my guess is that Rob's log
unfortunately does not pinpoint the problem.

I can spend some time on this but like Rob I'm not sure where to
start. Any thoughts? It should be somewhere close to the hardware as a
restart/reload doesn't solve the problem (though I haven't been able to
try this myself). Does anybody know of anything related to
beacons/monitor/injection that may deteriorate over time and be
persistent across an rmmod/insmod?

Best regards,

Björn

On Sat, Nov 7, 2009 at 11:18 PM, Rob Browning <[email protected]> wrote:
>
> Rob Browning <[email protected]> writes:
>
> > I have an machine that's running a recent compat-wireless (2009-10-17),
> > kernel 2.6.30-2-686 (Debian), and a build of hostapd as of 2009-10-17
> > (08d38568). The machine also has an AR5008 (ath9k) based PCI card.
> >
> > After the machine reboots, wireless clients can connect to the AP
> > normally for a half day or so. Then, at some point, all connections
> > fail, and clients cannot even see the network until the machine is
> > rebooted. Reloading the wireless modules, and restarting hostapd
> > doesn't help.
>
> Just as an update, this problem persists with hostapd as of eb999fef,
> and compat-wireless 2009-11-03.
>
> I'd be happy to provide further information.
> --
> Rob Browning
> rlb @defaultvalue.org and @debian.org; previously @cs.utexas.edu
> GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html



--
Venatech AB
Ideon Innovation
Ole Römers väg 12
SE-22370 LUND
Sweden

+46 (0) 46 286 86 20
[email protected]
http://www.venatech.se

2009-11-29 18:46:08

by RHS Linux User

Subject: Re: [ath9k-devel] ath9k based AP stops responding after half a day or so


Hi,

I have seen several "similar" sorts of things. I assume that
the chip(s) are not being reset successfully. That is to say, some
particular "dance" is needed to get the chip restarted.

Speaking as somewhat of a hardware type: a "full" powerdown may be needed
for a significant time period, and even then it may only work sometimes.
There may be one or more flip-flops inside the chip that cannot be reset
(a chip design bug), so even a "full" powerdown reset doesn't guarantee
that the chip will come up OK. Or some special sequence may be required.
You get the idea.

In this case the only strategy left is to do a "full" restart and see if
the circuit is now working. If not (not on the web, etc.), repeat the
process.
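
The restart-and-check strategy described above can be sketched as a bounded retry loop. This is only an illustration in Python: reset_until_working, FlakyChip, and their parameters are hypothetical names, and the real "reset" (power cycle, module reload, reboot) and "is it working" check (for example, can a client see the network) would have to be supplied by whoever uses it.

```python
import time


def reset_until_working(reset, is_working, max_attempts=5, settle_seconds=0.0):
    """Call reset(), wait, then check is_working(); repeat up to max_attempts times.

    Returns the number of the attempt that succeeded, or None if the
    device never came up within max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        reset()
        time.sleep(settle_seconds)  # give the hardware time to settle
        if is_working():
            return attempt
    return None


class FlakyChip:
    """Simulated device (hypothetical) that only comes up after a given number of resets."""

    def __init__(self, good_on=3):
        self.resets = 0
        self.good_on = good_on

    def reset(self):
        self.resets += 1

    def is_working(self):
        return self.resets >= self.good_on


chip = FlakyChip(good_on=3)
print(reset_until_working(chip.reset, chip.is_working))
```

Capping the number of attempts keeps the loop from spinning forever on hardware that never recovers, in which case the caller gets None and can escalate (for example, to a full reboot).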

Pretty sad.... IMHO. But that's what getting complex hardware
to actually work is sometimes like.

Hope that helps.

If the manufacturer actually knew how their chip works, it would help.
From what I can see, manufacturers often don't publish specs because they
really don't know for sure and don't want to appear stupid about their
"industry standard" chips!

Some big manufacturer like Microsoft may have "accidentally" stumbled
onto how to get a given chip to actually work, and now the "correct"
power-up method/sequence becomes somewhat of a proprietary secret, since
the chip manufacturer doesn't actually know how to make their own chip work!

Good luck!

Wiz (pen name)




2009-11-07 22:18:07

by Rob Browning

Subject: Re: ath9k based AP stops responding after half a day or so

Rob Browning <[email protected]> writes:

> I have an machine that's running a recent compat-wireless (2009-10-17),
> kernel 2.6.30-2-686 (Debian), and a build of hostapd as of 2009-10-17
> (08d38568). The machine also has an AR5008 (ath9k) based PCI card.
>
> After the machine reboots, wireless clients can connect to the AP
> normally for a half day or so. Then, at some point, all connections
> fail, and clients cannot even see the network until the machine is
> rebooted. Reloading the wireless modules, and restarting hostapd
> doesn't help.

Just as an update, this problem persists with hostapd as of eb999fef,
and compat-wireless 2009-11-03.

I'd be happy to provide further information.
--
Rob Browning
rlb @defaultvalue.org and @debian.org; previously @cs.utexas.edu
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4