2009-05-19 00:28:32

by Steven Rostedt

[permalink] [raw]
Subject: [PATCH] ath5k: prevent infinite loop


After updating my laptop with the latest kernel (had 2.6.26 before that),
my laptop load went to 100%. The daemon phy0 was at 99.9% of the CPU.
Luckily I compiled with a preempt kernel otherwise this would have
been a lock up.

Using ftrace to dig into the problem I found that that the ath5k driver
was in an infinite loop. The code in ath5k_get_linear_pcdac_min has:

pwr_i = pwrR[0];
do {
pwr_i--;
tmp = (s8) ath5k_get_interpolated_value(pwr_i,
pwrR[0], pwrR[1],
stepR[0], stepR[1]);
} while (tmp > 1);


But ath5k_get_interpolated returns stepR[0] if pwrR[0] == pwrR[1] or
stepR[0] == stepR[1]. The pwr_i is ignored and we enter an infinite loop
because tmp never changes between iterations. Using ftrace, I was able to
determine that is exactly what happened in the case of my laptop.

This patch tries to keep the same result that would happen when this case
occurs. That is, the pwr_i becomes a minimal number. I used the minimum
number that a signed short may be to initialize the min pwrL and pwrR.
Then if the case that the code would cause an infinite loop, we bypass it.

Signed-off-by: Steven Rostedt <[email protected]>

diff --git a/drivers/net/wireless/ath5k/phy.c b/drivers/net/wireless/ath5k/phy.c
index 9e2faae..2c916fc 100644
--- a/drivers/net/wireless/ath5k/phy.c
+++ b/drivers/net/wireless/ath5k/phy.c
@@ -1473,6 +1473,8 @@ ath5k_get_interpolated_value(s16 target, s16 x_left, s16 x_right,
return result;
}

+#define MIN_PWR (-32768)
+
/*
* Find vertical boundary (min pwr) for the linear PCDAC curve.
*
@@ -1486,29 +1488,32 @@ ath5k_get_linear_pcdac_min(const u8 *stepL, const u8 *stepR,
const s16 *pwrL, const s16 *pwrR)
{
s8 tmp;
- s16 min_pwrL, min_pwrR;
+ s16 min_pwrL = MIN_PWR, min_pwrR = MIN_PWR;
s16 pwr_i = pwrL[0];

- do {
- pwr_i--;
- tmp = (s8) ath5k_get_interpolated_value(pwr_i,
- pwrL[0], pwrL[1],
- stepL[0], stepL[1]);
-
- } while (tmp > 1);
-
- min_pwrL = pwr_i;
+ /* Avoid infinite loop */
+ if (pwrL[0] != pwrL[1] && stepL[0] != stepL[1]) {
+ do {
+ pwr_i--;
+ tmp = (s8) ath5k_get_interpolated_value(pwr_i,
+ pwrL[0], pwrL[1],
+ stepL[0], stepL[1]);
+ } while (tmp > 1);
+ min_pwrL = pwr_i;
+ }

pwr_i = pwrR[0];
- do {
- pwr_i--;
- tmp = (s8) ath5k_get_interpolated_value(pwr_i,
- pwrR[0], pwrR[1],
- stepR[0], stepR[1]);
-
- } while (tmp > 1);

- min_pwrR = pwr_i;
+ /* Avoid infinite loop */
+ if (pwrR[0] != pwrR[1] && stepR[0] != stepR[1]) {
+ do {
+ pwr_i--;
+ tmp = (s8) ath5k_get_interpolated_value(pwr_i,
+ pwrR[0], pwrR[1],
+ stepR[0], stepR[1]);
+ } while (tmp > 1);
+ min_pwrR = pwr_i;
+ }

/* Keep the right boundary so that it works for both curves */
return max(min_pwrL, min_pwrR);


2009-05-20 03:38:35

by Bob Copeland

[permalink] [raw]
Subject: [PATCH] ath5k: avoid and warn on potential infinite loop

If we are trying to interpolate a curve with slope == 0, the return
value will always be the y-coordinate. In this code we are looping
until we reach a minimum y-coordinate on a line, which in the 0-slope
case can never happen, thus the loop never terminates.

The PCDAC steps come from the EEPROM and should never be equal, but
we should gracefully handle that case, so warn and bail out.

Reported-by: Steven Rostedt <[email protected]>
Signed-off-by: Bob Copeland <[email protected]>
---

This ok? (based on top of the wireless-testing tree -- the patch
that's already there can go for 2.6.30. This is in addition to that
patch, this handles the other, unlikely, corner case.)

Having looked at the code, I don't see why we need to loop at all,
couldn't we just invert the slope and solve directly for y=0?
Certainly recomputing the slope in a loop could be optimized.
Anyway, that's another patch for another day.

drivers/net/wireless/ath/ath5k/phy.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/wireless/ath/ath5k/phy.c b/drivers/net/wireless/ath/ath5k/phy.c
index d0d1c35..a876ca8 100644
--- a/drivers/net/wireless/ath/ath5k/phy.c
+++ b/drivers/net/wireless/ath/ath5k/phy.c
@@ -1897,6 +1897,9 @@ ath5k_get_linear_pcdac_min(const u8 *stepL, const u8 *stepR,
s16 min_pwrL, min_pwrR;
s16 pwr_i;

+ if (WARN_ON(stepL[0] == stepL[1] || stepR[0] == stepR[1]))
+ return 0;
+
if (pwrL[0] == pwrL[1])
min_pwrL = pwrL[0];
else {
--
1.6.0.6

--
Bob Copeland %% http://www.bobcopeland.com


2009-05-19 00:43:24

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] ath5k: prevent infinite loop


On Tue, 19 May 2009, Nick Kossifidis wrote:
>
> This is already fixed on wireless-testing ;-)
> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=blob;f=drivers/net/wireless/ath/ath5k/phy.c;h=d0d1c350025aebba1fe4e17a44550536a59951ba;hb=HEAD

Thanks, but this does only half. Although I did not hit this in my laptop,
it can be an issue. If step[0] == step[1] you have the same problem.

-- Steve


2009-05-19 06:32:41

by Nick Kossifidis

[permalink] [raw]
Subject: Re: [PATCH] ath5k: prevent infinite loop

2009/5/19 Steven Rostedt <[email protected]>:
>
> On Tue, 19 May 2009, Nick Kossifidis wrote:
>>
>> This is already fixed on wireless-testing ;-)
>> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=blob;f=drivers/net/wireless/ath/ath5k/phy.c;h=d0d1c350025aebba1fe4e17a44550536a59951ba;hb=HEAD
>
> Thanks, but this does only half. Although I did not hit this in my laptop,
> it can be an issue. If step[0] == step[1] you have the same problem.
>

Having the same power value for 2 different steps is something we can
expect (although docs say that we expect the line to be monotonically
increasing but anyway), having the same step twice is way out of spec,
there is no way we can have the same step twice on EEPROM, only if we
have a corrupted EEPROM (we need to add some sanity checks indeed here
-> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=blob;f=drivers/net/wireless/ath/ath5k/eeprom.c;h=c56b494d417acd40d445d922f2861b53cc2315df;hb=HEAD#l910
to handle such a case but first we need to have a "default" eeprom
dataset to fallback when we get such errors).



--
GPG ID: 0xD21DB2DB
As you read this post global entropy rises. Have Fun ;-)
Nick

2009-05-19 06:39:24

by Luis R. Rodriguez

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH] ath5k: prevent infinite loop

On Mon, May 18, 2009 at 11:32 PM, Nick Kossifidis <[email protected]> wrote:
> 2009/5/19 Steven Rostedt <[email protected]>:
>>
>> On Tue, 19 May 2009, Nick Kossifidis wrote:
>>>
>>> This is already fixed on wireless-testing ;-)
>>> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=blob;f=drivers/net/wireless/ath/ath5k/phy.c;h=d0d1c350025aebba1fe4e17a44550536a59951ba;hb=HEAD
>>
>> Thanks, but this does only half. Although I did not hit this in my laptop,
>> it can be an issue. If step[0] == step[1] you have the same problem.
>>
>
> Having the same power value for 2 different steps is something we can
> expect (although docs say that we expect the line to be monotonically
> increasing but anyway), having the same step twice is way out of spec,
> there is no way we can have the same step twice on EEPROM, only if we
> have a corrupted EEPROM (we need to add some sanity checks indeed here
> -> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=blob;f=drivers/net/wireless/ath/ath5k/eeprom.c;h=c56b494d417acd40d445d922f2861b53cc2315df;hb=HEAD#l910
> to handle such a case but first we need to have a "default" eeprom
> dataset to fallback when we get such errors).

Don't bother with busted EEPROMs, if its busted its busted. Chances
are the complexity we'd need to add to deal with such devices is
simply not worth it.

Luis

2009-05-19 12:35:15

by Bob Copeland

[permalink] [raw]
Subject: Re: [PATCH] ath5k: prevent infinite loop

On Tue, May 19, 2009 at 07:54:23AM -0400, Steven Rostedt wrote:
> > Do these patches need to go to 2.6.30?
>
> Considering that it locked up my box, I would think that it should. Well,
> at least the change that checks for the infinite loop.

Yeah, I just wasn't sure if you were using that or -next (which should be
fixed, I guess.) I mistakenly thought Nick posted commit ishs but they
were blobs, the actual commit was:

e5f1d7f3c192c8ebeb492427bab84611ed5568eb "ath5k: fix interpolation with
equal power levels" in wireless-testing.

Those loops really want a helper function...

Hmm, not too familiar with this particular code but getting back -32768
in the degenerate case seems wrong so I'd be inclined towards a hybrid
version of the two that also sanity checks the steps (corrupt eeproms
are bad, but one rather likely corruption is a big block of ones or
zeroes).

--
Bob Copeland %% http://www.bobcopeland.com


2009-05-19 11:54:23

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] ath5k: prevent infinite loop



On Tue, 19 May 2009, Bob Copeland wrote:

> On Tue, May 19, 2009 at 09:32:42AM +0300, Nick Kossifidis wrote:
> > >> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=blob;f=drivers/net/wireless/ath/ath5k/phy.c;h=d0d1c350025aebba1fe4e17a44550536a59951ba;hb=HEAD
>
> > -> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=blob;f=drivers/net/wireless/ath/ath5k/eeprom.c;h=c56b494d417acd40d445d922f2861b53cc2315df;hb=HEAD#l910
>
> Do these patches need to go to 2.6.30?
>

Considering that it locked up my box, I would think that it should. Well,
at least the change that checks for the infinite loop.

-- Steve


2009-05-19 11:45:15

by Steven Rostedt

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH] ath5k: prevent infinite loop


On Mon, 18 May 2009, Luis R. Rodriguez wrote:

> On Mon, May 18, 2009 at 11:32 PM, Nick Kossifidis <[email protected]> wrote:
> > 2009/5/19 Steven Rostedt <[email protected]>:
> >>
> >> On Tue, 19 May 2009, Nick Kossifidis wrote:
> >>>
> >>> This is already fixed on wireless-testing ;-)
> >>> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=blob;f=drivers/net/wireless/ath/ath5k/phy.c;h=d0d1c350025aebba1fe4e17a44550536a59951ba;hb=HEAD
> >>
> >> Thanks, but this does only half. Although I did not hit this in my laptop,
> >> it can be an issue. If step[0] == step[1] you have the same problem.
> >>
> >
> > Having the same power value for 2 different steps is something we can
> > expect (although docs say that we expect the line to be monotonically
> > increasing but anyway), having the same step twice is way out of spec,
> > there is no way we can have the same step twice on EEPROM, only if we
> > have a corrupted EEPROM (we need to add some sanity checks indeed here
> > -> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=blob;f=drivers/net/wireless/ath/ath5k/eeprom.c;h=c56b494d417acd40d445d922f2861b53cc2315df;hb=HEAD#l910
> > to handle such a case but first we need to have a "default" eeprom
> > dataset to fallback when we get such errors).
>
> Don't bother with busted EEPROMs, if its busted its busted. Chances
> are the complexity we'd need to add to deal with such devices is
> simply not worth it.

My concern is that a busted EEPROM should not lock up the kernel, when we
can avoid it. Put in a nasty WARN_ON if the steps are equal, and exit the
routine. But don't let it go into an infinite loop and have the user
wondering why their system just locked up.

-- Steve


2009-05-19 16:38:42

by Luis R. Rodriguez

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH] ath5k: prevent infinite loop

On Tue, May 19, 2009 at 04:45:14AM -0700, Steven Rostedt wrote:
>
> On Mon, 18 May 2009, Luis R. Rodriguez wrote:
>
> > On Mon, May 18, 2009 at 11:32 PM, Nick Kossifidis <[email protected]> wrote:
> > > 2009/5/19 Steven Rostedt <[email protected]>:
> > >>
> > >> On Tue, 19 May 2009, Nick Kossifidis wrote:
> > >>>
> > >>> This is already fixed on wireless-testing ;-)
> > >>> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=blob;f=drivers/net/wireless/ath/ath5k/phy.c;h=d0d1c350025aebba1fe4e17a44550536a59951ba;hb=HEAD
> > >>
> > >> Thanks, but this does only half. Although I did not hit this in my laptop,
> > >> it can be an issue. If step[0] == step[1] you have the same problem.
> > >>
> > >
> > > Having the same power value for 2 different steps is something we can
> > > expect (although docs say that we expect the line to be monotonically
> > > increasing but anyway), having the same step twice is way out of spec,
> > > there is no way we can have the same step twice on EEPROM, only if we
> > > have a corrupted EEPROM (we need to add some sanity checks indeed here
> > > -> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=blob;f=drivers/net/wireless/ath/ath5k/eeprom.c;h=c56b494d417acd40d445d922f2861b53cc2315df;hb=HEAD#l910
> > > to handle such a case but first we need to have a "default" eeprom
> > > dataset to fallback when we get such errors).
> >
> > Don't bother with busted EEPROMs, if its busted its busted. Chances
> > are the complexity we'd need to add to deal with such devices is
> > simply not worth it.
>
> My concern is that a busted EEPROM should not lock up the kernel, when we
> can avoid it.

Sure, which is why there is a checksum which can be checked first.
IIRC we don't bail out if that fails in ath5k.

> Put in a nasty WARN_ON if the steps are equal, and exit the
> routine. But don't let it go into an infinite loop and have the user
> wondering why their system just locked up.

Sure, my point was more for not implemention a default EEPROM
as that would bring more complexity to the driver.

Luis


2009-05-19 00:34:37

by Nick Kossifidis

[permalink] [raw]
Subject: Re: [PATCH] ath5k: prevent infinite loop

2009/5/19 Steven Rostedt <[email protected]>:
>
> After updating my laptop with the latest kernel (had 2.6.26 before that),
> my laptop load went to 100%. The daemon phy0 was at 99.9% of the CPU.
> Luckily I compiled with a preempt kernel otherwise this would have
> been a lock up.
>
> Using ftrace to dig into the problem I found that that the ath5k driver
> was in an infinite loop. The code in ath5k_get_linear_pcdac_min has:
>
>        pwr_i = pwrR[0];
>        do {
>                pwr_i--;
>                tmp = (s8) ath5k_get_interpolated_value(pwr_i,
>                                                pwrR[0], pwrR[1],
>                                                stepR[0], stepR[1]);
>        } while (tmp > 1);
>
>
> But ath5k_get_interpolated returns stepR[0] if pwrR[0] == pwrR[1] or
> stepR[0] == stepR[1]. The pwr_i is ignored and we enter an infinite loop
> because tmp never changes between iterations. Using ftrace, I was able to
> determine that is exactly what happened in the case of my laptop.
>
> This patch tries to keep the same result that would happen when this case
> occurs. That is, the pwr_i becomes a minimal number. I used the minimum
> number that a signed short may be to initialize the min pwrL and pwrR.
> Then if the case that the code would cause an infinite loop, we bypass it.
>
> Signed-off-by: Steven Rostedt <[email protected]>
>

This is already fixed on wireless-testing ;-)
http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=blob;f=drivers/net/wireless/ath/ath5k/phy.c;h=d0d1c350025aebba1fe4e17a44550536a59951ba;hb=HEAD

--
GPG ID: 0xD21DB2DB
As you read this post global entropy rises. Have Fun ;-)
Nick