2016-11-11 15:32:30

by Yves-Alexis Perez

[permalink] [raw]
Subject: [PATCH v2] firmware: fix async, manual firmware loading

When wait_for_completion_interruptible_timeout() is called from
_request_firmware_load() with a large timeout value (here, MAX_JIFFY_OFFSET
because it's a an explicit call to the user helper), its return value (a
long) will overflow when silently casted to int, be stored as a negative
integer and then treated as an error.

This bug was introduced in commit 68ff2a00dbf5 ("firmware_loader: handle
timeout via wait_for_completion_interruptible_timeout()") when a delay work
was replaced by the call to wait_for_completion_interruptible_timeout().

Fix this by re-using the timeout variable and only set retval in specific
cases.

Signed-off-by: Yves-Alexis Perez <[email protected]>
Fixes: 68ff2a00dbf5 "firmware_loader: handle timeout via wait_for_completion_interruptible_timeout()"
Cc: Luis R. Rodriguez <[email protected]>
Cc: Ming Lei <[email protected]>
Cc: Bjorn Andersson <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: [email protected]
Acked-by: Luis R. Rodriguez <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
---
Changelog:

v2: rewrite the changelog following comments by Luis

drivers/base/firmware_class.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 22d1760..a95e1e5 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -955,13 +955,14 @@ static int _request_firmware_load(struct firmware_priv *fw_priv,
timeout = MAX_JIFFY_OFFSET;
}

- retval = wait_for_completion_interruptible_timeout(&buf->completion,
+ timeout = wait_for_completion_interruptible_timeout(&buf->completion,
timeout);
- if (retval == -ERESTARTSYS || !retval) {
+ if (timeout == -ERESTARTSYS || !timeout) {
+ retval = timeout;
mutex_lock(&fw_lock);
fw_load_abort(fw_priv);
mutex_unlock(&fw_lock);
- } else if (retval > 0) {
+ } else if (timeout > 0) {
retval = 0;
}

--
2.10.2


2016-11-11 19:11:02

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v2] firmware: fix async, manual firmware loading

On Fri, Nov 11, 2016 at 04:32:17PM +0100, Yves-Alexis Perez wrote:
> When wait_for_completion_interruptible_timeout() is called from
> _request_firmware_load() with a large timeout value (here, MAX_JIFFY_OFFSET
> because it's a an explicit call to the user helper), its return value (a
> long) will overflow when silently casted to int, be stored as a negative
> integer and then treated as an error.
>
> This bug was introduced in commit 68ff2a00dbf5 ("firmware_loader: handle
> timeout via wait_for_completion_interruptible_timeout()") when a delay work
> was replaced by the call to wait_for_completion_interruptible_timeout().
>
> Fix this by re-using the timeout variable and only set retval in specific
> cases.
>
> Signed-off-by: Yves-Alexis Perez <[email protected]>
> Fixes: 68ff2a00dbf5 "firmware_loader: handle timeout via wait_for_completion_interruptible_timeout()"
> Cc: Luis R. Rodriguez <[email protected]>
> Cc: Ming Lei <[email protected]>
> Cc: Bjorn Andersson <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: [email protected]
> Acked-by: Luis R. Rodriguez <[email protected]>
> Reviewed-by: Bjorn Andersson <[email protected]>

Sorry this commit log still does not help kernel maintainers enough, we want
them well informed immediately what the default behaviour is, what the impact
of the issue is, how broad of an issue it is, how this issue is triggered and
exactly how this can happen. I'll massage this a bit myself and re-submit,
as otherwise this work will be done by each and every single Linux kernel
maintainer working on a Linux distribution trying to determine if its correct
to merge this into their tree or not.

I'll re-submit shortly.

Luis

2016-11-11 19:28:48

by Luis Chamberlain

[permalink] [raw]
Subject: [PATCH] firmware: fix usermode helper fallback loading

From: Yves-Alexis Perez <[email protected]>

When you use the firmware usermode helper fallback with a timeout value set to a
value greater than INT_MAX (2147483647) a cast overflow issue causes the
timeout value to go negative and breaks all usermode helper loading. This
regression was introduced through commit 68ff2a00dbf5 ("firmware_loader:
handle timeout via wait_for_completion_interruptible_timeout()") on kernel
v4.0.

The firmware_class drivers relies on the firmware usermode helper
fallback as a mechanism to look for firmware if the direct filesystem
search failed only if:

a) You've enabled CONFIG_FW_LOADER_USER_HELPER_FALLBACK (not many distros):

Then all of these callers will rely on the fallback mechanism in case
the firmware is not found through an initial direct filesystem lookup:

o request_firmware()
o request_firmware_into_buf()
o request_firmware_nowait()

b) If you've only enabled CONFIG_FW_LOADER_USER_HELPER (most distros):

Then only callers using request_firmware_nowait() with the second
argument set to false, this explicitly is requesting the UMH firmware
fallback to be relied on in case the first filesystem lookup fails.

Using Coccinelle SmPL grammar we have identified only two drivers
explicitly requesting the UMH firmware fallback mechanism:

- drivers/firmware/dell_rbu.c
- drivers/leds/leds-lp55xx-common.c

Since most distributions only enable CONFIG_FW_LOADER_USER_HELPER the
biggest impact of this regression are users of the dell_rbu and
leds-lp55xx-common device driver which required the UMH to find their
respective needed firmwares.

The default timeout for the UMH is set to 60 seconds always, as of
commit 68ff2a00dbf5 ("firmware_loader: handle timeout via
wait_for_completion_interruptible_timeout()") the timeout was bumped
to MAX_JIFFY_OFFSET ((LONG_MAX >> 1)-1). Additionally the MAX_JIFFY_OFFSET
value was also used if the timeout was configured by a user to 0.

The following works:

echo 2147483647 > /sys/class/firmware/timeout

But both of the following set the timeout to MAX_JIFFY_OFFSET even if
we display 0 back to userspace:

echo 2147483648 > /sys/class/firmware/timeout
cat /sys/class/firmware/timeout
0

echo 0> /sys/class/firmware/timeout
cat /sys/class/firmware/timeout
0

A max value of INT_MAX (2147483647) seconds is therefore implicit due to the
another cast with simple_strtol().

This fixes the secondary cast (the first one is simple_strtol() but its an
issue only by forcing an implicit limit) by re-using the timeout variable and
only setting retval in appropriate cases.

Lastly worth noting systemd had ripped out the UMH firmware fallback
mechanism from udev since udev 2014 via commit be2ea723b1d023b3d
("udev: remove userspace firmware loading support"), so as of systemd v217.

Signed-off-by: Yves-Alexis Perez <[email protected]>
Fixes: 68ff2a00dbf5 "firmware_loader: handle timeout via wait_for_completion_interruptible_timeout()"
Cc: Luis R. Rodriguez <[email protected]>
Cc: Ming Lei <[email protected]>
Cc: Bjorn Andersson <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: [email protected]
Acked-by: Luis R. Rodriguez <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
[[email protected]: gave commit log a whole lot of love]
Signed-off-by: Luis R. Rodriguez <[email protected]>
---

Summary:

o the UMH fallback has been broken since Linux v4.0 !
o systemd removed the UMH firmware loader since systemd v217 in 2014 !

And no one had kicked or screamed...

drivers/base/firmware_class.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 22d1760a4278..a95e1e572697 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -955,13 +955,14 @@ static int _request_firmware_load(struct firmware_priv *fw_priv,
timeout = MAX_JIFFY_OFFSET;
}

- retval = wait_for_completion_interruptible_timeout(&buf->completion,
+ timeout = wait_for_completion_interruptible_timeout(&buf->completion,
timeout);
- if (retval == -ERESTARTSYS || !retval) {
+ if (timeout == -ERESTARTSYS || !timeout) {
+ retval = timeout;
mutex_lock(&fw_lock);
fw_load_abort(fw_priv);
mutex_unlock(&fw_lock);
- } else if (retval > 0) {
+ } else if (timeout > 0) {
retval = 0;
}

--
2.10.1