2010-05-03 17:28:55

by John W. Linville

[permalink] [raw]
Subject: [PATCH] iwlagn: Change the TPT calculations sanity-check to WARN_ON

From: Adel Gadllah <[email protected]>

Currently it is a BUG_ON() which will hang the machine once triggered.

(Changed from WARN_ON to WARN_ON_ONCE. -- JWL)

Signed-off-by: Adel Gadllah <[email protected]>
Signed-off-by: John W. Linville <[email protected]>
---
drivers/net/wireless/iwlwifi/iwl-agn-rs.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/iwl-agn-rs.c b/drivers/net/wireless/iwlwifi/iwl-agn-rs.c
index 8f8d5e3..ca63ff9 100644
--- a/drivers/net/wireless/iwlwifi/iwl-agn-rs.c
+++ b/drivers/net/wireless/iwlwifi/iwl-agn-rs.c
@@ -2079,8 +2079,9 @@ static void rs_rate_scale_perform(struct iwl_priv *priv,
* actual average throughput */

/* Sanity-check TPT calculations */
- BUG_ON(window->average_tpt != ((window->success_ratio *
- tbl->expected_tpt[index] + 64) / 128));
+ if (WARN_ON_ONCE(window->average_tpt != ((window->success_ratio *
+ tbl->expected_tpt[index] + 64) / 128)))
+ return;

/* If we are searching for better modulation mode, check success. */
if (lq_sta->search_better_tbl &&
--
1.6.6.1



2010-05-03 20:48:38

by Gábor Stefanik

[permalink] [raw]
Subject: Re: [PATCH] iwlagn: Change the TPT calculations sanity-check to WARN_ON

On Mon, May 3, 2010 at 10:29 PM, Kalle Valo <[email protected]> wrote:
> "John W. Linville" <[email protected]> writes:
>
>> From: Adel Gadllah <[email protected]>
>>
>> Currently it is a BUG_ON() which will hang the machine once triggered.
>
> Related to this: can we have a rule that no wireless driver should
> ever use BUG_ON()? I think we could extend this even to cfg80211 and
> mac80211.
>
> BUG_ON() is valid whenever there's a risk of corrupting data, for
> example on a filesystem, but I really don't see the point of using
> them in wireless drivers. They just make things miserable, especially
> for the normal users. Printing a warning and handling the case as
> gracefully as possible is much better IMHO.
>

One exception I can think of: major misconfiguration can cause a
wireless device to DMA data into sensitive memory locations. When
evidence of this is detected, it might make sense to BUG_ON()
(especially if the bogus DMA operations can be exploited remotely to
overwrite arbitrary memory addresses). However, in that case, the
attacker may have already overwritten panic() with malicious code as
well, so even this case doesn't hold.

The other thing that comes to my mind is when there is a risk of
physically frying the card, but given that BUG_ON() doesn't cut power
to the PCI bus (at least not on x86 - dunno about other platforms),
this one falls down pretty easily too.

> --
> Kalle Valo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
>



--
Vista: [V]iruses, [I]ntruders, [S]pyware, [T]rojans and [A]dware. :-)

2010-05-06 18:30:22

by John W. Linville

[permalink] [raw]
Subject: Re: [PATCH] iwlwifi: recalculate average tpt if not current

On Thu, May 06, 2010 at 09:11:08AM -0700, reinette chatre wrote:
> On Mon, 2010-05-03 at 10:55 -0700, reinette chatre wrote:
> > From: Reinette Chatre <[email protected]>
> >
> > We currently have this check as a BUG_ON, which is being hit by people.
> > Previously it was an error with a recalculation if not current, return that
> > code.
> >
> > The BUG_ON was introduced by:
> > commit 3110bef78cb4282c58245bc8fd6d95d9ccb19749
> > Author: Guy Cohen <[email protected]>
> > Date: Tue Sep 9 10:54:54 2008 +0800
> >
> > iwlwifi: Added support for 3 antennas
> >
> > ... the portion adding the BUG_ON is reverted since we are encountering the error
> > and BUG_ON was created with assumption that error is not encountered.
> >
> > Signed-off-by: Reinette Chatre <[email protected]>
> > ---
>
> I noticed this patch in your wireless-next-2.6 pull request. Since it is
> addressing a system hang issue, could it perhaps be included in
> wireless-2.6 also? I should have included the bug report reference for
> this purpose, sorry ... it is
> https://bugzilla.redhat.com/show_bug.cgi?id=588021

I didn't send it that way because a) that code has been there for a
really long time; and b) the reporter couldn't reliably reproduce
the bug and therefore can't reliably test the fix. While I agree
that the fix looks harmless, no update is zero-risk.

Can you reliably hit that code? Has it been tested enough that we
should risk holding-up 2.6.34's release for it?

John
--
John W. Linville Someday the world will need a hero, and you
[email protected] might be all we have. Be ready.

2010-05-03 21:20:31

by Gábor Stefanik

[permalink] [raw]
Subject: Re: [PATCH] iwlagn: Change the TPT calculations sanity-check to WARN_ON

On Mon, May 3, 2010 at 11:10 PM, Adel Gadllah <[email protected]> wrote:
> 2010/5/3 Pavel Roskin <[email protected]>:
>> On Mon, 2010-05-03 at 22:48 +0200, G?bor Stefanik wrote:
>>
>>> One exception I can think of: major misconfiguration can cause a
>>> wireless device to DMA data into sensitive memory locations. When
>>> evidence of this is detected, it might make sense to BUG_ON()
>>> (especially if the bogus DMA operations can be exploited remotely to
>>> overwrite arbitrary memory addresses). However, in that case, the
>>> attacker may have already overwritten panic() with malicious code as
>>> well, so even this case doesn't hold.
>>
>> And then there is a case when encryption fails and there is a risk of
>> transmitting data without encryption or accepting data without
>> verification.
>
> So kill the connection rather than the whole system.

Or maybe just drop the packet, as mac80211 AFAIK usually does. Perhaps
print a WARN_ON to let developers know of the issue. But a BUG_ON is
still excessive - I can't think of any way a WARN_ON + interface down
may fail in preventing unencrypted data leak.

--
Vista: [V]iruses, [I]ntruders, [S]pyware, [T]rojans and [A]dware. :-)

2010-05-06 16:11:12

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH] iwlwifi: recalculate average tpt if not current

Hi John,

On Mon, 2010-05-03 at 10:55 -0700, reinette chatre wrote:
> From: Reinette Chatre <[email protected]>
>
> We currently have this check as a BUG_ON, which is being hit by people.
> Previously it was an error with a recalculation if not current, return that
> code.
>
> The BUG_ON was introduced by:
> commit 3110bef78cb4282c58245bc8fd6d95d9ccb19749
> Author: Guy Cohen <[email protected]>
> Date: Tue Sep 9 10:54:54 2008 +0800
>
> iwlwifi: Added support for 3 antennas
>
> ... the portion adding the BUG_ON is reverted since we are encountering the error
> and BUG_ON was created with assumption that error is not encountered.
>
> Signed-off-by: Reinette Chatre <[email protected]>
> ---

I noticed this patch in your wireless-next-2.6 pull request. Since it is
addressing a system hang issue, could it perhaps be included in
wireless-2.6 also? I should have included the bug report reference for
this purpose, sorry ... it is
https://bugzilla.redhat.com/show_bug.cgi?id=588021

Reinette


2010-05-03 21:10:35

by Adel Gadllah

[permalink] [raw]
Subject: Re: [PATCH] iwlagn: Change the TPT calculations sanity-check to WARN_ON

2010/5/3 Pavel Roskin <[email protected]>:
> On Mon, 2010-05-03 at 22:48 +0200, G?bor Stefanik wrote:
>
>> One exception I can think of: major misconfiguration can cause a
>> wireless device to DMA data into sensitive memory locations. When
>> evidence of this is detected, it might make sense to BUG_ON()
>> (especially if the bogus DMA operations can be exploited remotely to
>> overwrite arbitrary memory addresses). However, in that case, the
>> attacker may have already overwritten panic() with malicious code as
>> well, so even this case doesn't hold.
>
> And then there is a case when encryption fails and there is a risk of
> transmitting data without encryption or accepting data without
> verification.

So kill the connection rather than the whole system.

2010-05-06 18:50:17

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH] iwlwifi: recalculate average tpt if not current

Hi John,

On Thu, 2010-05-06 at 11:22 -0700, John W. Linville wrote:
> On Thu, May 06, 2010 at 09:11:08AM -0700, reinette chatre wrote:
> > I noticed this patch in your wireless-next-2.6 pull request. Since it is
> > addressing a system hang issue, could it perhaps be included in
> > wireless-2.6 also? I should have included the bug report reference for
> > this purpose, sorry ... it is
> > https://bugzilla.redhat.com/show_bug.cgi?id=588021
>
> I didn't send it that way because a) that code has been there for a
> really long time; and b) the reporter couldn't reliably reproduce
> the bug and therefore can't reliably test the fix. While I agree
> that the fix looks harmless, no update is zero-risk.
>
> Can you reliably hit that code? Has it been tested enough that we
> should risk holding-up 2.6.34's release for it?

Good point ... we have not seen this BUG being hit before and it seems
to have been there since 2.6.28. In addition to the report in RedHat
bugzilla there is a lkml thread that seems to be related to this code
also (http://lkml.org/lkml/2010/5/3/229 ) ... but that reporter did not
seem to test this patch and my last reply to that thread bounced to his
email address.

We thus seem to have two recent reports of users hitting the BUG, which
may mean some other recent change in driver causes this BUG to be hit,
not really just the presence of the BUG code itself.

As far as testing goes ... this patch has just been merged and will from
now on be included in our regression testing. Unfortunately this had not
been exercised much yet since I just merged.

I guess we will proceed with the "wait and see" here and forward this
patch to stable if the BUG is encountered again.

Reinette




2010-05-03 21:01:49

by Pavel Roskin

[permalink] [raw]
Subject: Re: [PATCH] iwlagn: Change the TPT calculations sanity-check to WARN_ON

On Mon, 2010-05-03 at 22:48 +0200, Gábor Stefanik wrote:

> One exception I can think of: major misconfiguration can cause a
> wireless device to DMA data into sensitive memory locations. When
> evidence of this is detected, it might make sense to BUG_ON()
> (especially if the bogus DMA operations can be exploited remotely to
> overwrite arbitrary memory addresses). However, in that case, the
> attacker may have already overwritten panic() with malicious code as
> well, so even this case doesn't hold.

And then there is a case when encryption fails and there is a risk of
transmitting data without encryption or accepting data without
verification.

But generally, I agree.

--
Regards,
Pavel Roskin

2010-05-03 20:29:51

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH] iwlagn: Change the TPT calculations sanity-check to WARN_ON

"John W. Linville" <[email protected]> writes:

> From: Adel Gadllah <[email protected]>
>
> Currently it is a BUG_ON() which will hang the machine once triggered.

Related to this: can we have a rule that no wireless driver should
ever use BUG_ON()? I think we could extend this even to cfg80211 and
mac80211.

BUG_ON() is valid whenever there's a risk of corrupting data, for
example on a filesystem, but I really don't see the point of using
them in wireless drivers. They just make things miserable, especially
for the normal users. Printing a warning and handling the case as
gracefully as possible is much better IMHO.

--
Kalle Valo

2010-05-03 17:48:55

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH] iwlagn: Change the TPT calculations sanity-check to WARN_ON

On Mon, 2010-05-03 at 10:25 -0700, John W. Linville wrote:
> From: Adel Gadllah <[email protected]>
>
> Currently it is a BUG_ON() which will hang the machine once triggered.
>
> (Changed from WARN_ON to WARN_ON_ONCE. -- JWL)
>
> Signed-off-by: Adel Gadllah <[email protected]>
> Signed-off-by: John W. Linville <[email protected]>

I can see a potential race condition here in the calculation of the
average throughput so a BUG_ON seems extreme.

I looked at the history of this code and it seems as though the BUG_ON was
added as a sidenote to a patch implementing something else.

The patch adding this BUG_ON is:

commit 3110bef78cb4282c58245bc8fd6d95d9ccb19749
Author: Guy Cohen <[email protected]>
Date: Tue Sep 9 10:54:54 2008 +0800

iwlwifi: Added support for 3 antennas


... and it thus seems as though this BUG_ON was added along the way while doing
something else ... especially considering that the comments describing the
original code has not been removed yet. Since the current code still contains:

/* Else we have enough samples; calculate estimate of
* actual average throughput */

.. .which is obviously not done right now.

I looked at the original code and think we can revert the portion of this patch
adding the BUG_ON. Since users have not encountered the error I assume the
author considered that a BUG_ON was warranted, but now we know that users do
indeed encounter the error and we should return the original code.

I'll send the revert as a separate patch.

Reinette



2010-05-03 17:55:08

by Reinette Chatre

[permalink] [raw]
Subject: [PATCH] iwlwifi: recalculate average tpt if not current

From: Reinette Chatre <[email protected]>

We currently have this check as a BUG_ON, which is being hit by people.
Previously it was an error with a recalculation if not current, return that
code.

The BUG_ON was introduced by:
commit 3110bef78cb4282c58245bc8fd6d95d9ccb19749
Author: Guy Cohen <[email protected]>
Date: Tue Sep 9 10:54:54 2008 +0800

iwlwifi: Added support for 3 antennas

... the portion adding the BUG_ON is reverted since we are encountering the error
and BUG_ON was created with assumption that error is not encountered.

Signed-off-by: Reinette Chatre <[email protected]>
---

drivers/net/wireless/iwlwifi/iwl-agn-rs.c | 10 ++++++----
1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/iwl-agn-rs.c b/drivers/net/wireless/iwlwifi/iwl-agn-rs.c
index b93e491..75a145c 100644
--- a/drivers/net/wireless/iwlwifi/iwl-agn-rs.c
+++ b/drivers/net/wireless/iwlwifi/iwl-agn-rs.c
@@ -2070,10 +2070,12 @@ static void rs_rate_scale_perform(struct iwl_priv *priv,
}
/* Else we have enough samples; calculate estimate of
* actual average throughput */
-
- /* Sanity-check TPT calculations */
- BUG_ON(window->average_tpt != ((window->success_ratio *
- tbl->expected_tpt[index] + 64) / 128));
+ if (window->average_tpt != ((window->success_ratio *
+ tbl->expected_tpt[index] + 64) / 128)) {
+ IWL_ERR(priv, "expected_tpt should have been calculated by now\n");
+ window->average_tpt = ((window->success_ratio *
+ tbl->expected_tpt[index] + 64) / 128);
+ }

/* If we are searching for better modulation mode, check success. */
if (lq_sta->search_better_tbl &&
--
1.6.3.3




2010-06-04 18:15:10

by John W. Linville

[permalink] [raw]
Subject: Re: [PATCH] iwlwifi: recalculate average tpt if not current

On Fri, Jun 04, 2010 at 06:32:18PM +0200, Nils Radtke wrote:
> Hi John,
>
> http://marc.info/?l=linux-wireless&m=127317062320707&w=2 :
> > > > The BUG_ON was introduced by:
> > > > commit 3110bef78cb4282c58245bc8fd6d95d9ccb19749
> > > > Author: Guy Cohen <[email protected]>
> > > > Date: Tue Sep 9 10:54:54 2008 +0800
> > > >
> > > > iwlwifi: Added support for 3 antennas
> > > >
> > > > ... the portion adding the BUG_ON is reverted since we are encountering the error
> > > > and BUG_ON was created with assumption that error is not encountered.
> > > >
> > > > Signed-off-by: Reinette Chatre <[email protected]>
> > > > ---
> > >
> > > I noticed this patch in your wireless-next-2.6 pull request. Since it is
> > > addressing a system hang issue, could it perhaps be included in
> > > wireless-2.6 also? I should have included the bug report reference for
> > > this purpose, sorry ... it is
> > > https://bugzilla.redhat.com/show_bug.cgi?id=588021
> >
> > I didn't send it that way because a) that code has been there for a
> > really long time; and b) the reporter couldn't reliably reproduce
> > the bug and therefore can't reliably test the fix. While I agree
> > that the fix looks harmless, no update is zero-risk.
> >
> > Can you reliably hit that code? Has it been tested enough that we
> > should risk holding-up 2.6.34's release for it?
> I'm one of those _very reliably_ hitting this BUG. I can tell how
> annoying it is. I have to backport the patch w/ every kernel release
> to be able to use the wireless link _at all_. So please consider
> it as somewhat urgent to get the patch included.
>
> I'm still in contact w/ Reinette regularly to get those problems fixed.
> It involves physically moving around the city for testing any modification
> so this working is slowly processing as it takes every time an enormous
> amount of time to do so. Work in (slow) progress, but it's alive..
> C.f. http://permalink.gmane.org/gmane.linux.kernel/992941
>
> BTW, https://bugzilla.redhat.com/show_bug.cgi?id=588021 this isn't fixed.
> Maybe RH did fix it. kernel.org sources haven't. This is a blocker ever
> since.

The patch is in linux-2.6. I suppose it could go to [email protected].

commit 3d79b2a9eeaa066b35c49fbb17e3156a3c482c3e
Author: Reinette Chatre <[email protected]>
Date: Mon May 3 10:55:07 2010 -0700

iwlwifi: recalculate average tpt if not current

We currently have this check as a BUG_ON, which is being hit by people.
Previously it was an error with a recalculation if not current, return that
code.

The BUG_ON was introduced by:
commit 3110bef78cb4282c58245bc8fd6d95d9ccb19749
Author: Guy Cohen <[email protected]>
Date: Tue Sep 9 10:54:54 2008 +0800

iwlwifi: Added support for 3 antennas

... the portion adding the BUG_ON is reverted since we are encountering the
and BUG_ON was created with assumption that error is not encountered.

Signed-off-by: Reinette Chatre <[email protected]>
Signed-off-by: John W. Linville <[email protected]>

--
John W. Linville Someday the world will need a hero, and you
[email protected] might be all we have. Be ready.

2010-06-04 16:32:28

by Nils Radtke

[permalink] [raw]
Subject: Re: [PATCH] iwlwifi: recalculate average tpt if not current

Hi John,

http://marc.info/?l=linux-wireless&m=127317062320707&w=2 :
> > > The BUG_ON was introduced by:
> > > commit 3110bef78cb4282c58245bc8fd6d95d9ccb19749
> > > Author: Guy Cohen <[email protected]>
> > > Date: Tue Sep 9 10:54:54 2008 +0800
> > >
> > > iwlwifi: Added support for 3 antennas
> > >
> > > ... the portion adding the BUG_ON is reverted since we are encountering the error
> > > and BUG_ON was created with assumption that error is not encountered.
> > >
> > > Signed-off-by: Reinette Chatre <[email protected]>
> > > ---
> >
> > I noticed this patch in your wireless-next-2.6 pull request. Since it is
> > addressing a system hang issue, could it perhaps be included in
> > wireless-2.6 also? I should have included the bug report reference for
> > this purpose, sorry ... it is
> > https://bugzilla.redhat.com/show_bug.cgi?id=588021
>
> I didn't send it that way because a) that code has been there for a
> really long time; and b) the reporter couldn't reliably reproduce
> the bug and therefore can't reliably test the fix. While I agree
> that the fix looks harmless, no update is zero-risk.
>
> Can you reliably hit that code? Has it been tested enough that we
> should risk holding-up 2.6.34's release for it?
I'm one of those _very reliably_ hitting this BUG. I can tell how
annoying it is. I have to backport the patch w/ every kernel release
to be able to use the wireless link _at all_. So please consider
it as somewhat urgent to get the patch included.

I'm still in contact w/ Reinette regularly to get those problems fixed.
It involves physically moving around the city for testing any modification
so this working is slowly processing as it takes every time an enormous
amount of time to do so. Work in (slow) progress, but it's alive..
C.f. http://permalink.gmane.org/gmane.linux.kernel/992941

BTW, https://bugzilla.redhat.com/show_bug.cgi?id=588021 this isn't fixed.
Maybe RH did fix it. kernel.org sources haven't. This is a blocker ever
since.

Cheers,

Nils



2010-06-25 00:26:18

by Greg KH

[permalink] [raw]
Subject: Re: [stable] [PATCH] iwlwifi: recalculate average tpt if not current

On Fri, Jun 04, 2010 at 02:03:28PM -0400, John W. Linville wrote:
> On Fri, Jun 04, 2010 at 06:32:18PM +0200, Nils Radtke wrote:
> > Hi John,
> >
> > http://marc.info/?l=linux-wireless&m=127317062320707&w=2 :
> > > > > The BUG_ON was introduced by:
> > > > > commit 3110bef78cb4282c58245bc8fd6d95d9ccb19749
> > > > > Author: Guy Cohen <[email protected]>
> > > > > Date: Tue Sep 9 10:54:54 2008 +0800
> > > > >
> > > > > iwlwifi: Added support for 3 antennas
> > > > >
> > > > > ... the portion adding the BUG_ON is reverted since we are encountering the error
> > > > > and BUG_ON was created with assumption that error is not encountered.
> > > > >
> > > > > Signed-off-by: Reinette Chatre <[email protected]>
> > > > > ---
> > > >
> > > > I noticed this patch in your wireless-next-2.6 pull request. Since it is
> > > > addressing a system hang issue, could it perhaps be included in
> > > > wireless-2.6 also? I should have included the bug report reference for
> > > > this purpose, sorry ... it is
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=588021
> > >
> > > I didn't send it that way because a) that code has been there for a
> > > really long time; and b) the reporter couldn't reliably reproduce
> > > the bug and therefore can't reliably test the fix. While I agree
> > > that the fix looks harmless, no update is zero-risk.
> > >
> > > Can you reliably hit that code? Has it been tested enough that we
> > > should risk holding-up 2.6.34's release for it?
> > I'm one of those _very reliably_ hitting this BUG. I can tell how
> > annoying it is. I have to backport the patch w/ every kernel release
> > to be able to use the wireless link _at all_. So please consider
> > it as somewhat urgent to get the patch included.
> >
> > I'm still in contact w/ Reinette regularly to get those problems fixed.
> > It involves physically moving around the city for testing any modification
> > so this working is slowly processing as it takes every time an enormous
> > amount of time to do so. Work in (slow) progress, but it's alive..
> > C.f. http://permalink.gmane.org/gmane.linux.kernel/992941
> >
> > BTW, https://bugzilla.redhat.com/show_bug.cgi?id=588021 this isn't fixed.
> > Maybe RH did fix it. kernel.org sources haven't. This is a blocker ever
> > since.
>
> The patch is in linux-2.6. I suppose it could go to [email protected].

Now queued up, thanks.

greg k-h