2014-01-11 19:48:42

by Larry Finger

[permalink] [raw]
Subject: [PATCH 0/3] Fixes for b43 and b43legacy

These 3 fixes should be applied as soon as possible. I understand that 3.13
will be released soon. As all 3 are marked for stable, inclusion in 3.14
will be satisfactory.

Larry


Larry Finger (3):
b43: Fix lockdep splat
b43: Fix oops if firmware is not available
b43legacy: Fix oops if firmware is not available

drivers/net/wireless/b43/main.c | 24 +++++++++++++++---------
drivers/net/wireless/b43legacy/main.c | 5 +++--
2 files changed, 18 insertions(+), 11 deletions(-)

--
1.8.4



2014-01-11 19:48:45

by Larry Finger

[permalink] [raw]
Subject: [PATCH 2/3] b43: Fix oops if firmware is not available

On openSUSE systems, the script that installs the firmware for b43 also
unloads and reloads the driver. When the firmware was not previously
available, the driver has stalled at a wait_for_completion(). When the
unload routine releases that hold, the driver encounters structures
that have already been deleted and generates a fatal condition. When
the user does a manual restart, the file system cleanup frequently
results in the firmware files being deleted and the user is never able
to install the firmware. The fix is to change the wait_for_completion()
with a wait_for_completion_timeout() with a 60 second wait period.

There is a potential race condition; however, the chances that less
than a minute has elapsed between the initial driver load and a
subsequent unload is very unlikely.

This patch also fixes a typo in a comment.

Signed-off-by: Larry Finger <[email protected]>
Cc: Stable <[email protected]>
---
drivers/net/wireless/b43/main.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/b43/main.c b/drivers/net/wireless/b43/main.c
index 86b2030..5ce9ecb 100644
--- a/drivers/net/wireless/b43/main.c
+++ b/drivers/net/wireless/b43/main.c
@@ -2150,12 +2150,13 @@ int b43_do_request_fw(struct b43_request_fw_context *ctx,
pr_err("Unable to load firmware\n");
return err;
}
- /* stall here until fw ready */
- wait_for_completion(&ctx->fw_load_complete);
+ /* stall here until fw ready or 60 sec elapses */
+ wait_for_completion_timeout(&ctx->fw_load_complete,
+ msecs_to_jiffies(60000));
if (ctx->blob)
goto fw_ready;
/* On some ARM systems, the async request will fail, but the next sync
- * request works. For this reason, we dall through here
+ * request works. For this reason, we fall through here
*/
}
err = request_firmware(&ctx->blob, ctx->fwname,
--
1.8.4


2014-01-12 04:24:25

by Ben Hutchings

[permalink] [raw]
Subject: Re: [PATCH 2/3] b43: Fix oops if firmware is not available

On Sat, 2014-01-11 at 21:55 -0600, Larry Finger wrote:
> On 01/11/2014 09:27 PM, Ben Hutchings wrote:
> > On Sat, 2014-01-11 at 13:48 -0600, Larry Finger wrote:
> >> On openSUSE systems, the script that installs the firmware for b43 also
> >> unloads and reloads the driver. When the firmware was not previously
> >> available, the driver has stalled at a wait_for_completion(). When the
> >> unload routine releases that hold, the driver encounters structures
> >> that have already been deleted and generates a fatal condition. When
> >> the user does a manual restart, the file system cleanup frequently
> >> results in the firmware files being deleted and the user is never able
> >> to install the firmware. The fix is to change the wait_for_completion()
> >> with a wait_for_completion_timeout() with a 60 second wait period.
> >>
> >> There is a potential race condition; however, the chances that less
> >> than a minute has elapsed between the initial driver load and a
> >> subsequent unload is very unlikely.
> >
> > A minute-long race is 'unlikely' to be hit? Seriously?!
>
> Ben,
>
> If you force a reboot before the minute expires, nothing weird happens. The only
> race condition happens when the user has to

...remove the module. Exactly how the bug reporter triggered module
removal seems irrelevant.

> log in, open a terminal, run a
> script that downloads 13.5 MiB of files from the Internet, and then executes the
> firmware extraction program. On my 10 Mbps external line and a 2 GHz CPU, that
> takes 32 s, plus any time to enter the password for a sudo operation. That was
> the basis for my conclusion that a race is unlikely.
>
> What is the minimum time that should be allowed for a request_firmware_nowait()
> to respond? I know we had to go to asynchronous fw loading because the
> synchronous version would timeout at 30 s.

You could switch back to synchronous firmware loading soon, as it's not
going to support a usermode helper any more.

But until then, the proper fix for this is going to be to cancel the
waiter earlier in teardown.

Ben.

--
Ben Hutchings
Quantity is no substitute for quality, but it's the only one we've got.


Attachments:
signature.asc (828.00 B)
This is a digitally signed message part

2014-01-11 19:48:46

by Larry Finger

[permalink] [raw]
Subject: [PATCH 3/3] b43legacy: Fix oops if firmware is not available

On openSUSE systems, the script that installs the firmware for b43legacy
also unloads and reloads the driver. When the firmware was not previously
available, the driver has stalled at a wait_for_completion(). When the
unload routine releases that hold, the driver encounters structures
that have already been deleted and generates a fatal condition. When
the user does a manual restart, the file system cleanup frequently
results in the firmware files being deleted and the user is never able
to install the firmware. The fix is to change the wait_for_completion()
with a wait_for_completion_timeout() with a 60 second wait period.

There is a potential race condition; however, the chances that less
than a minute has elapsed between the initial driver load and a
subsequent unload is very unlikely.

Signed-off-by: Larry Finger <[email protected]>
Cc: Stable <[email protected]>
---
drivers/net/wireless/b43legacy/main.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/b43legacy/main.c b/drivers/net/wireless/b43legacy/main.c
index 5726688..bd5f142 100644
--- a/drivers/net/wireless/b43legacy/main.c
+++ b/drivers/net/wireless/b43legacy/main.c
@@ -1546,8 +1546,9 @@ static int do_request_fw(struct b43legacy_wldev *dev,
b43legacyerr(dev->wl, "Unable to load firmware\n");
return err;
}
- /* stall here until fw ready */
- wait_for_completion(&dev->fw_load_complete);
+ /* stall here until fw ready or 60 sec elapses */
+ wait_for_completion_timeout(&dev->fw_load_complete,
+ msecs_to_jiffies(60000));
if (!dev->fwp)
err = -EINVAL;
*fw = dev->fwp;
--
1.8.4


2014-01-12 03:28:08

by Ben Hutchings

[permalink] [raw]
Subject: Re: [PATCH 2/3] b43: Fix oops if firmware is not available

On Sat, 2014-01-11 at 13:48 -0600, Larry Finger wrote:
> On openSUSE systems, the script that installs the firmware for b43 also
> unloads and reloads the driver. When the firmware was not previously
> available, the driver has stalled at a wait_for_completion(). When the
> unload routine releases that hold, the driver encounters structures
> that have already been deleted and generates a fatal condition. When
> the user does a manual restart, the file system cleanup frequently
> results in the firmware files being deleted and the user is never able
> to install the firmware. The fix is to change the wait_for_completion()
> with a wait_for_completion_timeout() with a 60 second wait period.
>
> There is a potential race condition; however, the chances that less
> than a minute has elapsed between the initial driver load and a
> subsequent unload is very unlikely.

A minute-long race is 'unlikely' to be hit? Seriously?!

Ben.

> This patch also fixes a typo in a comment.
>
> Signed-off-by: Larry Finger <[email protected]>
> Cc: Stable <[email protected]>
> ---
> drivers/net/wireless/b43/main.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/wireless/b43/main.c b/drivers/net/wireless/b43/main.c
> index 86b2030..5ce9ecb 100644
> --- a/drivers/net/wireless/b43/main.c
> +++ b/drivers/net/wireless/b43/main.c
> @@ -2150,12 +2150,13 @@ int b43_do_request_fw(struct b43_request_fw_context *ctx,
> pr_err("Unable to load firmware\n");
> return err;
> }
> - /* stall here until fw ready */
> - wait_for_completion(&ctx->fw_load_complete);
> + /* stall here until fw ready or 60 sec elapses */
> + wait_for_completion_timeout(&ctx->fw_load_complete,
> + msecs_to_jiffies(60000));
> if (ctx->blob)
> goto fw_ready;
> /* On some ARM systems, the async request will fail, but the next sync
> - * request works. For this reason, we dall through here
> + * request works. For this reason, we fall through here
> */
> }
> err = request_firmware(&ctx->blob, ctx->fwname,

--
Ben Hutchings
Quantity is no substitute for quality, but it's the only one we've got.


Attachments:
signature.asc (828.00 B)
This is a digitally signed message part

2014-01-12 08:40:49

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH 2/3] b43: Fix oops if firmware is not available

On Sun, 2014-01-12 at 04:24 +0000, Ben Hutchings wrote:

> You could switch back to synchronous firmware loading soon, as it's not
> going to support a usermode helper any more.
>
> But until then, the proper fix for this is going to be to cancel the
> waiter earlier in teardown.

I don't think we found a way when we looked at this, and instead made
the module unload wait for the request_firmware callback to come back
(see all users of the request_firmware_complete struct member in
iwlwifi/iwl-drv.c)

johannes


2014-01-12 19:21:16

by Larry Finger

[permalink] [raw]
Subject: Re: [PATCH 2/3] b43: Fix oops if firmware is not available

On 01/11/2014 10:24 PM, Ben Hutchings wrote:
> On Sat, 2014-01-11 at 21:55 -0600, Larry Finger wrote:
>> On 01/11/2014 09:27 PM, Ben Hutchings wrote:
>>> On Sat, 2014-01-11 at 13:48 -0600, Larry Finger wrote:
>>>> On openSUSE systems, the script that installs the firmware for b43 also
>>>> unloads and reloads the driver. When the firmware was not previously
>>>> available, the driver has stalled at a wait_for_completion(). When the
>>>> unload routine releases that hold, the driver encounters structures
>>>> that have already been deleted and generates a fatal condition. When
>>>> the user does a manual restart, the file system cleanup frequently
>>>> results in the firmware files being deleted and the user is never able
>>>> to install the firmware. The fix is to change the wait_for_completion()
>>>> with a wait_for_completion_timeout() with a 60 second wait period.
>>>>
>>>> There is a potential race condition; however, the chances that less
>>>> than a minute has elapsed between the initial driver load and a
>>>> subsequent unload is very unlikely.
>>>
>>> A minute-long race is 'unlikely' to be hit? Seriously?!
>>
>> Ben,
>>
>> If you force a reboot before the minute expires, nothing weird happens. The only
>> race condition happens when the user has to
>
> ...remove the module. Exactly how the bug reporter triggered module
> removal seems irrelevant.
>
>> log in, open a terminal, run a
>> script that downloads 13.5 MiB of files from the Internet, and then executes the
>> firmware extraction program. On my 10 Mbps external line and a 2 GHz CPU, that
>> takes 32 s, plus any time to enter the password for a sudo operation. That was
>> the basis for my conclusion that a race is unlikely.
>>
>> What is the minimum time that should be allowed for a request_firmware_nowait()
>> to respond? I know we had to go to asynchronous fw loading because the
>> synchronous version would timeout at 30 s.
>
> You could switch back to synchronous firmware loading soon, as it's not
> going to support a usermode helper any more.
>
> But until then, the proper fix for this is going to be to cancel the
> waiter earlier in teardown.

After closer inspection, it turns out the waiter was never canceled in the
teardown. I have had to move the completion struct to make it available at
module exit, but now I have a better fix.

Thanks for the critical review.

Larry



2014-01-12 03:55:20

by Larry Finger

[permalink] [raw]
Subject: Re: [PATCH 2/3] b43: Fix oops if firmware is not available

On 01/11/2014 09:27 PM, Ben Hutchings wrote:
> On Sat, 2014-01-11 at 13:48 -0600, Larry Finger wrote:
>> On openSUSE systems, the script that installs the firmware for b43 also
>> unloads and reloads the driver. When the firmware was not previously
>> available, the driver has stalled at a wait_for_completion(). When the
>> unload routine releases that hold, the driver encounters structures
>> that have already been deleted and generates a fatal condition. When
>> the user does a manual restart, the file system cleanup frequently
>> results in the firmware files being deleted and the user is never able
>> to install the firmware. The fix is to change the wait_for_completion()
>> with a wait_for_completion_timeout() with a 60 second wait period.
>>
>> There is a potential race condition; however, the chances that less
>> than a minute has elapsed between the initial driver load and a
>> subsequent unload is very unlikely.
>
> A minute-long race is 'unlikely' to be hit? Seriously?!

Ben,

If you force a reboot before the minute expires, nothing weird happens. The only
race condition happens when the user has to log in, open a terminal, run a
script that downloads 13.5 MiB of files from the Internet, and then executes the
firmware extraction program. On my 10 Mbps external line and a 2 GHz CPU, that
takes 32 s, plus any time to enter the password for a sudo operation. That was
the basis for my conclusion that a race is unlikely.

What is the minimum time that should be allowed for a request_firmware_nowait()
to respond? I know we had to go to asynchronous fw loading because the
synchronous version would timeout at 30 s.

Larry