Hi,
Peter wrote a fuzzing uinput program[1] to check on libinput,
and the result is that the kernel fails more often than
libinput :)
These 2 patches allow to fix the early failures.
I marked them as stable as I believe eventhough not many
people discovered those and reported them, they should
still be fixed in current kernels.
The fuzzing process still manages to crash the kernel,
bu I have the feeling those crashes are now related
to some races between other userspace process that attempt
to handle the spurious events injected in those random
devices. For instance, the KEY_BLUETOOTH or RF_KILL seem
to manage to mess up my network driver.
Cheers,
Benjamin
[1] https://github.com/whot/fuzzydevice
Benjamin Tissoires (2):
input - leds: do not iterate over non initialized leds
input - leds: fix input_led_disconnect path
drivers/input/input-leds.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--
2.14.3
We only instantiate the led classes if there is a definition in
input_led_info[].
However, the max for EV_LED is bigger than the values filled in this
array, and there are some holes in it.
In .connect(), we check for these holes, but in leds_init_work() we do
not, leading to some nice kernel oopses.
Found by running https://github.com/whot/fuzzydevice
Cc: [email protected]
Signed-off-by: Benjamin Tissoires <[email protected]>
---
drivers/input/input-leds.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/input/input-leds.c b/drivers/input/input-leds.c
index 83d930f7396a..c86eb3d648bf 100644
--- a/drivers/input/input-leds.c
+++ b/drivers/input/input-leds.c
@@ -94,6 +94,9 @@ static void leds_init_work(struct work_struct *work)
int led_no = 0;
for_each_set_bit(led_code, leds->handle.dev->ledbit, LED_CNT) {
+ if (!input_led_info[led_code].name)
+ continue;
+
led = &leds->leds[led_no];
down_read(&led->cdev.trigger_lock);
--
2.14.3
Before unregistering the led classes, we have to be sure there is no
more events in the input pipeline.
Closing the input node before removing the led classes flushes the
pipeline and this prevents segfaults.
Found with https://github.com/whot/fuzzydevice
Link: https://bugzilla.kernel.org/show_bug.cgi?id=197679
Cc: [email protected]
Signed-off-by: Benjamin Tissoires <[email protected]>
---
drivers/input/input-leds.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/input/input-leds.c b/drivers/input/input-leds.c
index c86eb3d648bf..8aefcc186a02 100644
--- a/drivers/input/input-leds.c
+++ b/drivers/input/input-leds.c
@@ -211,6 +211,7 @@ static void input_leds_disconnect(struct input_handle *handle)
int i;
cancel_delayed_work_sync(&leds->init_work);
+ input_close_device(handle);
for (i = 0; i < leds->num_leds; i++) {
struct input_led *led = &leds->leds[i];
@@ -219,7 +220,6 @@ static void input_leds_disconnect(struct input_handle *handle)
kfree(led->cdev.name);
}
- input_close_device(handle);
input_unregister_handle(handle);
kfree(leds);
--
2.14.3
Hello,
Benjamin Tissoires, on jeu. 14 déc. 2017 14:25:20 +0100, wrote:
> I marked them as stable as I believe eventhough not many
> people discovered those and reported them, they should
> still be fixed in current kernels.
Well, the quoted source lines are not in any stable tree, I don't even
find them in the mainline tree. Where do these lines come from?
Samuel
Benjamin Tissoires, on jeu. 14 déc. 2017 14:25:22 +0100, wrote:
> Before unregistering the led classes, we have to be sure there is no
> more events in the input pipeline.
> Closing the input node before removing the led classes flushes the
> pipeline and this prevents segfaults.
>
> Found with https://github.com/whot/fuzzydevice
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=197679
>
> Cc: [email protected]
> Signed-off-by: Benjamin Tissoires <[email protected]>
input_close_device does run synchronize_rcu() which we seem to have to
process before freeing the rest indeed. Thus,
Acked-by: Samuel Thibault <[email protected]>
(though AFAIK it doesn't apply on the mainline tree)
> ---
> drivers/input/input-leds.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/input/input-leds.c b/drivers/input/input-leds.c
> index c86eb3d648bf..8aefcc186a02 100644
> --- a/drivers/input/input-leds.c
> +++ b/drivers/input/input-leds.c
> @@ -211,6 +211,7 @@ static void input_leds_disconnect(struct input_handle *handle)
> int i;
>
> cancel_delayed_work_sync(&leds->init_work);
> + input_close_device(handle);
>
> for (i = 0; i < leds->num_leds; i++) {
> struct input_led *led = &leds->leds[i];
> @@ -219,7 +220,6 @@ static void input_leds_disconnect(struct input_handle *handle)
> kfree(led->cdev.name);
> }
>
> - input_close_device(handle);
> input_unregister_handle(handle);
>
> kfree(leds);
> --
> 2.14.3
>
On Fri, Dec 15, 2017 at 1:16 AM, Samuel Thibault
<[email protected]> wrote:
> Hello,
>
> Benjamin Tissoires, on jeu. 14 déc. 2017 14:25:20 +0100, wrote:
>> I marked them as stable as I believe eventhough not many
>> people discovered those and reported them, they should
>> still be fixed in current kernels.
>
> Well, the quoted source lines are not in any stable tree, I don't even
> find them in the mainline tree. Where do these lines come from?
Oops, indeed, the patch 1/2 fixes a local commit I wrote to fix the
LED not being updated after the connect, that has already been
rejected upstream. (FWIW, it was
https://patchwork.kernel.org/patch/9542397/)
Dmitry, please disregard 1/2 and only have a look at 2/2.
Cheers,
Benjamin
>
> Samuel
Hi Benjamin,
On Thu, Dec 14, 2017 at 02:25:22PM +0100, Benjamin Tissoires wrote:
> Before unregistering the led classes, we have to be sure there is no
> more events in the input pipeline.
> Closing the input node before removing the led classes flushes the
> pipeline and this prevents segfaults.
I do not think this actually the issue. input_leds_event() is an empty
stub, it does not really do anything with events it receives form input
core, and it does not reference the led structures. The way input leds
work is that input driver owns the hardware led and is responsible for
communicating with it. The LED subsystem is a secondary interface,
requests coming from /sys/class/led/... are being forwarded to the input
core and then to the input hardware driver by the way of:
input_leds_brightness_set() ->
input_inject_event() ->
input_handle_event()->
disposition & INPUT_PASS_TO_DEVICE
dev->event() (in atkbd or hid-input)
actual control of LED
I do not see how stopping event flow from input core to input-leds would
help here.
Thanks.
--
Dmitry
On Sat, Dec 16, 2017 at 1:48 AM, Dmitry Torokhov
<[email protected]> wrote:
> Hi Benjamin,
>
> On Thu, Dec 14, 2017 at 02:25:22PM +0100, Benjamin Tissoires wrote:
>> Before unregistering the led classes, we have to be sure there is no
>> more events in the input pipeline.
>> Closing the input node before removing the led classes flushes the
>> pipeline and this prevents segfaults.
>
> I do not think this actually the issue. input_leds_event() is an empty
> stub, it does not really do anything with events it receives form input
> core, and it does not reference the led structures. The way input leds
Right. Actually, we could simply drop the stub as input_to_handler()
checks for the validity of .event().
> work is that input driver owns the hardware led and is responsible for
> communicating with it. The LED subsystem is a secondary interface,
> requests coming from /sys/class/led/... are being forwarded to the input
> core and then to the input hardware driver by the way of:
>
> input_leds_brightness_set() ->
> input_inject_event() ->
> input_handle_event()->
> disposition & INPUT_PASS_TO_DEVICE
> dev->event() (in atkbd or hid-input)
> actual control of LED
>
> I do not see how stopping event flow from input core to input-leds would
> help here.
I wonder if this solution in this patch is not just masking the race
condition by forcing the sync.
After further researches, I realized that the patch is actually not
needed. In the upstream repo of Peter, the code inject events and
closes the device ASAP, triggering races with udev.
Adding the proper udev stubs seem to solve the issues.
I still have other random crashes, but I can't seem to reproduce the
error of https://bugzilla.kernel.org/show_bug.cgi?id=197679 now.
Anyway, I can't explain the backtrace of the kernel bug, it is as if
we are calling the unregister function without having properly
registered the device. But this can not happen because of the mutexes
in place.
The race seems to be udev and probably the led class accesses...
BTW, if the handler doesn't use the events coming in from the input
subsystem, is there any benefits of opening the device from the
handler?
I tried without, and it seemed to be working fine, but I wonder if
there is no hidden dependency I haven't see.
Cheers,
Benjamin
On Wed, Dec 20, 2017 at 9:11 AM, Benjamin Tissoires
<[email protected]> wrote:
> On Sat, Dec 16, 2017 at 1:48 AM, Dmitry Torokhov
> <[email protected]> wrote:
>> Hi Benjamin,
>>
>> On Thu, Dec 14, 2017 at 02:25:22PM +0100, Benjamin Tissoires wrote:
>>> Before unregistering the led classes, we have to be sure there is no
>>> more events in the input pipeline.
>>> Closing the input node before removing the led classes flushes the
>>> pipeline and this prevents segfaults.
>>
>> I do not think this actually the issue. input_leds_event() is an empty
>> stub, it does not really do anything with events it receives form input
>> core, and it does not reference the led structures. The way input leds
>
> Right. Actually, we could simply drop the stub as input_to_handler()
> checks for the validity of .event().
>
>> work is that input driver owns the hardware led and is responsible for
>> communicating with it. The LED subsystem is a secondary interface,
>> requests coming from /sys/class/led/... are being forwarded to the input
>> core and then to the input hardware driver by the way of:
>>
>> input_leds_brightness_set() ->
>> input_inject_event() ->
>> input_handle_event()->
>> disposition & INPUT_PASS_TO_DEVICE
>> dev->event() (in atkbd or hid-input)
>> actual control of LED
>>
>> I do not see how stopping event flow from input core to input-leds would
>> help here.
>
> I wonder if this solution in this patch is not just masking the race
> condition by forcing the sync.
>
> After further researches, I realized that the patch is actually not
> needed. In the upstream repo of Peter, the code inject events and
> closes the device ASAP, triggering races with udev.
> Adding the proper udev stubs seem to solve the issues.
> I still have other random crashes, but I can't seem to reproduce the
> error of https://bugzilla.kernel.org/show_bug.cgi?id=197679 now.
>
> Anyway, I can't explain the backtrace of the kernel bug, it is as if
> we are calling the unregister function without having properly
> registered the device. But this can not happen because of the mutexes
> in place.
> The race seems to be udev and probably the led class accesses...
I wonder if the crash was observed only with your first patch adding
the delay in initializing the leds?
>
> BTW, if the handler doesn't use the events coming in from the input
> subsystem, is there any benefits of opening the device from the
> handler?
> I tried without, and it seemed to be working fine, but I wonder if
> there is no hidden dependency I haven't see.
You want to open the device as it is what essentially powers it up, so
that it can react to input_inject_event() sent via LED subsystem. In
case of atkbd input_open_device() will result in call to serio_open()
which enables the KBD port of i8042. You probably do not see any
difference because the VT subsystem already attached the keyboard
handler to the device and it already "opened" the device in question.
Thanks.
--
Dmitry
On Wed, Dec 20, 2017 at 7:20 PM, Dmitry Torokhov
<[email protected]> wrote:
> On Wed, Dec 20, 2017 at 9:11 AM, Benjamin Tissoires
> <[email protected]> wrote:
>> On Sat, Dec 16, 2017 at 1:48 AM, Dmitry Torokhov
>> <[email protected]> wrote:
>>> Hi Benjamin,
>>>
>>> On Thu, Dec 14, 2017 at 02:25:22PM +0100, Benjamin Tissoires wrote:
>>>> Before unregistering the led classes, we have to be sure there is no
>>>> more events in the input pipeline.
>>>> Closing the input node before removing the led classes flushes the
>>>> pipeline and this prevents segfaults.
>>>
>>> I do not think this actually the issue. input_leds_event() is an empty
>>> stub, it does not really do anything with events it receives form input
>>> core, and it does not reference the led structures. The way input leds
>>
>> Right. Actually, we could simply drop the stub as input_to_handler()
>> checks for the validity of .event().
>>
>>> work is that input driver owns the hardware led and is responsible for
>>> communicating with it. The LED subsystem is a secondary interface,
>>> requests coming from /sys/class/led/... are being forwarded to the input
>>> core and then to the input hardware driver by the way of:
>>>
>>> input_leds_brightness_set() ->
>>> input_inject_event() ->
>>> input_handle_event()->
>>> disposition & INPUT_PASS_TO_DEVICE
>>> dev->event() (in atkbd or hid-input)
>>> actual control of LED
>>>
>>> I do not see how stopping event flow from input core to input-leds would
>>> help here.
>>
>> I wonder if this solution in this patch is not just masking the race
>> condition by forcing the sync.
>>
>> After further researches, I realized that the patch is actually not
>> needed. In the upstream repo of Peter, the code inject events and
>> closes the device ASAP, triggering races with udev.
>> Adding the proper udev stubs seem to solve the issues.
>> I still have other random crashes, but I can't seem to reproduce the
>> error of https://bugzilla.kernel.org/show_bug.cgi?id=197679 now.
>>
>> Anyway, I can't explain the backtrace of the kernel bug, it is as if
>> we are calling the unregister function without having properly
>> registered the device. But this can not happen because of the mutexes
>> in place.
>> The race seems to be udev and probably the led class accesses...
>
> I wonder if the crash was observed only with your first patch adding
> the delay in initializing the leds?
Nah, the bug was initially found by Peter on a plain Fedora system
without my crap :)
>
>>
>> BTW, if the handler doesn't use the events coming in from the input
>> subsystem, is there any benefits of opening the device from the
>> handler?
>> I tried without, and it seemed to be working fine, but I wonder if
>> there is no hidden dependency I haven't see.
>
> You want to open the device as it is what essentially powers it up, so
> that it can react to input_inject_event() sent via LED subsystem. In
> case of atkbd input_open_device() will result in call to serio_open()
> which enables the KBD port of i8042. You probably do not see any
> difference because the VT subsystem already attached the keyboard
> handler to the device and it already "opened" the device in question.
>
Right. So I guess there is not much to do then :)
Cheers,
Benjamin