2008-02-25 15:43:29

by Bob Copeland

Subject: Re: [ath5k-devel] [PATCH 6/8] ath5k: Fixes for PCI-E cards

> > Hey, that's a good clue... I just switched over to b-only and it seems to
> > be much more stable.

...or not. I still got some calibration errors last night in b-mode. Just
so we're on the same page, I see things in the dmesg like:

ath0: failed to restore operational channel after scan
ath5k phy0: calibration timeout (2412 MHz)
ath5k phy0: ath5k_chan_set: unable to reset channel (2412 Mhz)
ath0: failed to set freq to 2412 MHz for scan
ath5k phy0: calibration timeout (2417 MHz)

> If i'm correct you should get 4-7Mbit/sec @ 11Mbit. Plz let me know if
> you have some results, meanwhile i'll try to figure out the i/q
> calibration algo (we are ok for noise floor calibration i believe).

One thing I noticed from my traces is that the binary driver sets
bits AR5K_PHY_AGCCTL_NF | AR5K_PHY_AGCCTL_CAL in AR5K_PHY_AGCCTL.
Then it makes a whole lot of misc register writes, then re-reads
AR5K_PHY_AGCCTL; in that time only the noise floor bit got cleared
but _CAL is still high. Dunno if that means anything to you or not.

e.g:

R: 0x9860 = 0x00009d18 - AR5K_PHY_AGCCTL
W: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL <-- set (_CAL | _NF)
W: 0x1000 = 0x00000001 - AR5K_DCU_QCUMASK_BASE
W: 0x1004 = 0x00000002 - unknown
W: 0x1008 = 0x00000004 - unknown
[... lots more writes to DCU & IMR regs ...]
W: 0x00a0 = 0x00080965 - AR5K_PIMR
R: 0x00ac = 0x00000000 - AR5K_SIMR2
W: 0x00ac = 0x00070000 - AR5K_SIMR2
R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
R: 0x9860 = 0x00009d1a - AR5K_PHY_AGCCTL <-- _NF cleared

Maybe a red herring as obviously the current method of doing things
works for me sometimes...
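
For anyone comparing their own mmio traces, here is a minimal userspace sketch of how successive AGCCTL reads like the ones above can be diffed bit by bit. It is not driver code, and set_bits, cleared_bits, changed_bits and describe_change are made-up helper names. It deliberately reports raw bit positions instead of the _CAL/_NF names, on the assumption that the name-to-bit mapping is best checked against reg.h.

```c
#include <stdint.h>
#include <stdio.h>

/* Bits that differ between two successive reads of the same register. */
static uint32_t changed_bits(uint32_t before, uint32_t after)
{
	return before ^ after;
}

/* Bits that were set in 'before' and are clear in 'after'. */
static uint32_t cleared_bits(uint32_t before, uint32_t after)
{
	return before & ~after;
}

/* Bits that were clear in 'before' and are set in 'after'. */
static uint32_t set_bits(uint32_t before, uint32_t after)
{
	return ~before & after;
}

/* Print a one-line summary of a register transition (hypothetical helper). */
static void describe_change(uint32_t before, uint32_t after)
{
	printf("0x%08x -> 0x%08x: changed 0x%08x, set 0x%08x, cleared 0x%08x\n",
	       (unsigned)before, (unsigned)after,
	       (unsigned)changed_bits(before, after),
	       (unsigned)set_bits(before, after),
	       (unsigned)cleared_bits(before, after));
}
```

With the values from the trace, describe_change(0x9d18, 0x9d1b) reports bits 0x3 as set, and describe_change(0x9d1b, 0x9d1a) reports bit 0x1 as cleared.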

--
Bob Copeland %% http://www.bobcopeland.com




2008-02-26 03:51:57

by Bob Copeland

Subject: Re: [ath5k-devel] [PATCH 6/8] ath5k: Fixes for PCI-E cards

On Tue, Feb 26, 2008 at 11:13:34AM +0900, bruno randolf wrote:
> so when is AR5K_PHY_AGCCTL_CAL getting cleared?

Actually in my trace I didn't see it resetting on its own. The only
reads with _CAL == 0 happen right after a write that clears it.
(See attached from grep "^[RW].*AGCCTL"). It would be nice if my
mmio trace had timing info...

On the other hand, it must happen or else ath5k_hw_reset would never
work on the 5424.

> the HAL enables noise floor calibration, and reads the result on the next
> calibration interval, then enables calibration again and reads on the next...
> i think that way it makes sure there is always enough time for the noise
> floor calibration to take place in the mean time (the calibration can only
> happen when the channel is otherwise silent).

Good to know. I guess calibration timeouts could also be symptomatic of
general problems as well. Incidentally, in my current session I've
disabled G mode from the capabilities set. All is not well though, I'll
get a run of:

ath0: invalid Michael MIC in data frame from 00:1a:70:da:a9:cb
ath0: invalid Michael MIC in data frame from 00:1a:70:da:a9:cb
ath0: invalid Michael MIC in data frame from 00:1a:70:da:a9:cb
ath0: deauthenticate(reason=14)

(then it reassociates)

> another 20ms for the noise floor value to settle ( for (i = 20; i > 0; i--)
> mdelay(1); ) that worked well for older chipsets, but also for 5424 this
> results in more noise floor calibration timeouts.

Sounds plausible, though I didn't have a lot of success bumping up the
timeout there. But then, I have no idea what I'm doing so I'll test any
patches :)

--
Bob Copeland %% http://www.bobcopeland.com


Attachments:
(No filename) (1.64 kB)
agc.txt (5.43 kB)

2008-02-26 02:14:05

by Bruno Randolf

Subject: Re: [ath5k-devel] [PATCH 6/8] ath5k: Fixes for PCI-E cards

On Tuesday 26 February 2008 00:43:07 Bob Copeland wrote:
> > > Hey, that's a good clue... I just switched over to b-only and it seems
> > > to be much more stable.
>
> ...or not. I still got some calibration errors last night in b-mode. Just
> so we're on the same page, I see things in the dmesg like:
>
> ath0: failed to restore operational channel after scan
> ath5k phy0: calibration timeout (2412 MHz)
> ath5k phy0: ath5k_chan_set: unable to reset channel (2412 Mhz)
> ath0: failed to set freq to 2412 MHz for scan
> ath5k phy0: calibration timeout (2417 MHz)
>
> > If i'm correct you should get 4-7Mbit/sec @ 11Mbit. Plz let me know if
> > you have some results, meanwhile i'll try to figure out the i/q
> > calibration algo (we are ok for noise floor calibration i believe).
>
> One thing I noticed from my traces is that the binary driver sets
> bits AR5K_PHY_AGCCTL_NF | AR5K_PHY_AGCCTL_CAL in AR5K_PHY_AGCCTL.
> Then it makes a whole lot of misc register writes, then re-reads
> AR5K_PHY_AGCCTL; in that time only the noise floor bit got cleared
> but _CAL is still high. Dunno if that means anything to you or not.
>
> e.g:
>
> R: 0x9860 = 0x00009d18 - AR5K_PHY_AGCCTL
> W: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL <-- set (_CAL | _NF)
> W: 0x1000 = 0x00000001 - AR5K_DCU_QCUMASK_BASE
> W: 0x1004 = 0x00000002 - unknown
> W: 0x1008 = 0x00000004 - unknown
> [... lots more writes to DCU & IMR regs ...]
> W: 0x00a0 = 0x00080965 - AR5K_PIMR
> R: 0x00ac = 0x00000000 - AR5K_SIMR2
> W: 0x00ac = 0x00070000 - AR5K_SIMR2
> R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
> R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
> R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
> R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
> R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
> R: 0x9860 = 0x00009d1a - AR5K_PHY_AGCCTL <-- _NF cleared
>
> Maybe a red herring as obviously the current method of doing things
> works for me sometimes...

so when is AR5K_PHY_AGCCTL_CAL getting cleared?

i'm not sure if that is related because the error message you are seeing is
not due to noise floor calibration timeout but because AR5K_PHY_AGCCTL_CAL is
not cleared in the time that ath5k expects, but i want to mention it anyways:

the HAL enables noise floor calibration, and reads the result on the next
calibration interval, then enables calibration again and reads on the next...
i think that way it makes sure there is always enough time for the noise
floor calibration to take place in the mean time (the calibration can only
happen when the channel is otherwise silent).
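
a rough sketch of that "start now, read on the next interval" pattern, to make the idea concrete. this is a userspace model only: struct fake_phy, struct nf_state and nf_periodic are hypothetical names and do not exist in the HAL or in ath5k, and the busy flag merely stands in for the noise floor bit in AR5K_PHY_AGCCTL.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the device; not a real ath5k or HAL structure. */
struct fake_phy {
	bool nf_busy;        /* models the NF calibration bit still being set */
	int16_t nf_result;   /* value the hardware would report when done */
};

struct nf_state {
	bool started;        /* a calibration was kicked off earlier */
	int16_t last_nf;     /* last noise floor value we accepted */
	unsigned int missed; /* intervals where the hardware was not done */
};

/*
 * Called once per calibration interval. Collects the result of the
 * calibration started on a previous interval (if it has finished),
 * then starts the next one. It never busy-waits.
 * Returns true if a fresh value was read on this call.
 */
static bool nf_periodic(struct fake_phy *phy, struct nf_state *st)
{
	bool fresh = false;

	if (st->started) {
		if (phy->nf_busy) {
			/* Still running (channel never quiet?); check again next time. */
			st->missed++;
			return false;
		}
		st->last_nf = phy->nf_result;
		fresh = true;
	}

	/* Kick off the next calibration and return immediately. */
	phy->nf_busy = true;
	st->started = true;
	return fresh;
}
```

the point of the design is that a slow calibration only costs a stale value for one more interval instead of a failed reset.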

if i understand correctly, ath5k currently expects the noise floor calibration
to finish within 300ms (ath5k_hw_register_timeout) and then we give it
another 20ms for the noise floor value to settle ( for (i = 20; i > 0; i--)
mdelay(1); ) that worked well for older chipsets, but also for 5424 this
results in more noise floor calibration timeouts.
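
for reference, the wait described above boils down to a bounded polling loop. the sketch below is a userspace model under stated assumptions, not the actual ath5k_hw_register_timeout: poll_flag, reg_read_fn, struct fake_reg and fake_read are hypothetical names, the delay between reads is left out, and the fake register simply clears its low bit after a fixed number of reads.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register accessor type, so the sketch runs without hardware. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/*
 * Poll until (reg & flag) == want, or give up after max_polls reads.
 * Returns 0 on success, -1 on timeout. A real driver would delay
 * between reads; that is omitted here.
 */
static int poll_flag(reg_read_fn read, void *ctx, uint32_t flag,
		     uint32_t want, unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if ((read(ctx) & flag) == want)
			return 0;
	}
	return -1;
}

/* Fake register whose low bit clears on the Nth read (0 means never). */
struct fake_reg {
	uint32_t value;
	unsigned int reads_until_clear;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_reg *r = ctx;

	if (r->reads_until_clear > 0 && --r->reads_until_clear == 0)
		r->value &= ~(uint32_t)0x1;
	return r->value;
}
```

whether a timeout here means "calibration is slow" or "the device is wedged" cannot be told from the loop itself, which is why the poll budget matters on chips that take longer.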

bruno





2008-02-26 03:57:24

by Nick Kossifidis

Subject: Re: [ath5k-devel] [PATCH 6/8] ath5k: Fixes for PCI-E cards

2008/2/26, bruno randolf <[email protected]>:
> On Tuesday 26 February 2008 00:43:07 Bob Copeland wrote:
> > > > Hey, that's a good clue... I just switched over to b-only and it seems
> > > > to be much more stable.
> >
> > ...or not. I still got some calibration errors last night in b-mode. Just
> > so we're on the same page, I see things in the dmesg like:
> >
> > ath0: failed to restore operational channel after scan
> > ath5k phy0: calibration timeout (2412 MHz)
> > ath5k phy0: ath5k_chan_set: unable to reset channel (2412 Mhz)
> > ath0: failed to set freq to 2412 MHz for scan
> > ath5k phy0: calibration timeout (2417 MHz)
> >
> > > If i'm correct you should get 4-7Mbit/sec @ 11Mbit. Plz let me know if
> > > you have some results, meanwhile i'll try to figure out the i/q
> > > calibration algo (we are ok for noise floor calibration i believe).
> >
> > One thing I noticed from my traces is that the binary driver sets
> > bits AR5K_PHY_AGCCTL_NF | AR5K_PHY_AGCCTL_CAL in AR5K_PHY_AGCCTL.
> > Then it makes a whole lot of misc register writes, then re-reads
> > AR5K_PHY_AGCCTL; in that time only the noise floor bit got cleared
> > but _CAL is still high. Dunno if that means anything to you or not.
> >
> > e.g:
> >
> > R: 0x9860 = 0x00009d18 - AR5K_PHY_AGCCTL
> > W: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL <-- set (_CAL | _NF)
> > W: 0x1000 = 0x00000001 - AR5K_DCU_QCUMASK_BASE
> > W: 0x1004 = 0x00000002 - unknown
> > W: 0x1008 = 0x00000004 - unknown
> > [... lots more writes to DCU & IMR regs ...]
> > W: 0x00a0 = 0x00080965 - AR5K_PIMR
> > R: 0x00ac = 0x00000000 - AR5K_SIMR2
> > W: 0x00ac = 0x00070000 - AR5K_SIMR2
> > R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
> > R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
> > R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
> > R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
> > R: 0x9860 = 0x00009d1b - AR5K_PHY_AGCCTL
> > R: 0x9860 = 0x00009d1a - AR5K_PHY_AGCCTL <-- _NF cleared
> >
> > Maybe a red herring as obviously the current method of doing things
> > works for me sometimes...
>
>
> so when is AR5K_PHY_AGCCTL_CAL getting cleared?
>
> i'm not sure if that is related because the error message you are seeing is
> not due to noise floor calibration timeout but because AR5K_PHY_AGCCTL_CAL is
> not cleared in the time that ath5k expects, but i want to mention it anyways:
>
> the HAL enables noise floor calibration, and reads the result on the next
> calibration interval, then enables calibration again and reads on the next...
> i think that way it makes sure there is always enough time for the noise
> floor calibration to take place in the mean time (the calibration can only
> happen when the channel is otherwise silent).
>
> if i understand correctly, ath5k currently expects the noise floor calibration
> to finish within 300ms (ath5k_hw_register_timeout) and then we give it
> another 20ms for the noise floor value to settle ( for (i = 20; i > 0; i--)
> mdelay(1); ) that worked well for older chipsets, but also for 5424 this
> results in more noise floor calibration timeouts.
>
>
> bruno
>

There are 3 types of calibration: noise floor calibration (which gets
us the noise floor value), I/Q calibration (which fixes QAM
constellation on OFDM rates) and the _CAL bit which is another kind of
calibration we don't know about (i'll check out Atheros patent docs
again, they might give us a clue)... I know that NF and I/Q
calibration must be done periodically (i also had an idea about
forcing I/Q calibration on certain rate changes since QAM
constellation changes) but i have no idea about _CAL stuff...

Noise floor calibration is not the problem; I/Q calibration is where
we are failing, and that's why we have problems in OFDM rates. Even in
madwifi we get weird noise floor values sometimes (and the calibration
function returns failure sometimes). We might have to add some extra
time for noise floor calibration. HAL has one function (phy_calibrate)
for both of them (I/Q + noise floor); i guess we can treat them
differently in our implementation.

Anyway we don't get a NF calibration timeout but we get timeout for
_CAL stuff and since HAL doesn't seem to check for it (it only sets
that bit during reset and during phy_calibrate but doesn't check for
it) we can also skip the test i guess...

Bob try removing this and see what happens (in terms of
performance/stability)...

	if (ath5k_hw_register_timeout(ah, AR5K_PHY_AGCCTL,
			AR5K_PHY_AGCCTL_CAL, 0, false)) {
		ATH5K_ERR(ah->ah_sc, "calibration timeout (%uMHz)\n",
			channel->center_freq);
		return -EAGAIN;
	}
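
A less drastic variant than deleting the check would be to keep the message but not fail the reset. The snippet below is only a userspace sketch of that idea: enum cal_timeout_policy and handle_cal_timeout are hypothetical names, not existing ath5k code, and -1 stands in for -EAGAIN.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical policy knob; not an existing ath5k option. */
enum cal_timeout_policy {
	CAL_TIMEOUT_FATAL,	/* behaviour of the check above: abort */
	CAL_TIMEOUT_WARN,	/* log the timeout and carry on */
};

/*
 * Decide what the caller should return after waiting for the
 * calibration bit. 'timed_out' is the outcome of that wait.
 * Returns 0 to continue, -1 to abort (standing in for -EAGAIN).
 */
static int handle_cal_timeout(bool timed_out, enum cal_timeout_policy policy,
			      unsigned int freq_mhz)
{
	if (!timed_out)
		return 0;

	fprintf(stderr, "calibration timeout (%uMHz)\n", freq_mhz);
	return policy == CAL_TIMEOUT_FATAL ? -1 : 0;
}
```

That way the dmesg output still shows how often the bit fails to clear while testing whether the failure actually matters.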


--
GPG ID: 0xD21DB2DB
As you read this post global entropy rises. Have Fun ;-)
Nick

2008-02-26 04:39:25

by Bob Copeland

Subject: Re: [ath5k-devel] [PATCH 6/8] ath5k: Fixes for PCI-E cards

On Tue, Feb 26, 2008 at 05:57:22AM +0200, Nick Kossifidis wrote:
> Bob try removing this and see what happens (in terms of
> performance/stability)...
>
> 	if (ath5k_hw_register_timeout(ah, AR5K_PHY_AGCCTL,
> 			AR5K_PHY_AGCCTL_CAL, 0, false)) {
> 		ATH5K_ERR(ah->ah_sc, "calibration timeout (%uMHz)\n",
> 			channel->center_freq);
> 		return -EAGAIN;
> 	}

It doesn't really have an effect, because even without this, the noise
floor calibration fails anyway. I suspect by the time it gets in this
state the device is just hung generally.

So, I can't sustain a connection long enough to do iperf. I got numbers
from madwifi though :)

Seems when it gets a big hunk of data the device barfs. When I'm doing
something like ssh or just browsing a little, it's annoying but still
useable because most of the time it'll disassociate with the AP for a
few seconds, reassociate, and it's working again.

--
Bob Copeland %% http://www.bobcopeland.com