Subject: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

Hi, Thorsten here, the Linux kernel's regression tracker.

I noticed a regression report in bugzilla.kernel.org. As many (most?)
kernel developers don't keep an eye on it, I decided to forward it by mail.

Quoting from https://bugzilla.kernel.org/show_bug.cgi?id=217412 :

> guy.b 2023-05-07 07:37:34 UTC
>
> Hello,
>
> Since kernel 6.3.1 the boot process hangs (~5 seconds) during uevent triggering, with the following error:
>
> logitech-hidpp-device 0003:046D:405E.0004: hidpp_devicenametype_get_count: received protocol error 0x07
>
>
> The logs about logitech input:
>
> usb 1-8: new full-speed USB device number 2 using xhci_hcd
> mai 06 11:54:24 Cockpit kernel: usb 1-8: New USB device found, idVendor=046d, idProduct=c52b, bcdDevice=24.10
> mai 06 11:54:24 Cockpit kernel: usb 1-8: New USB device strings: Mfr=1, Product=2, SerialNumber=0
> mai 06 11:54:24 Cockpit kernel: usb 1-8: Product: USB Receiver
> mai 06 11:54:24 Cockpit kernel: usb 1-8: Manufacturer: Logitech
> mai 06 11:54:24 Cockpit kernel: input: Logitech USB Receiver as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.0/0003:046D:C52B.0001/input/input4
> mai 06 11:54:24 Cockpit kernel: hid-generic 0003:046D:C52B.0001: input,hidraw0: USB HID v1.11 Keyboard [Logitech USB Receiver] on usb-0000:00:14.0-8/input0
> mai 06 11:54:24 Cockpit kernel: input: Logitech USB Receiver Mouse as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.1/0003:046D:C52B.0002/input/input5
> mai 06 11:54:24 Cockpit kernel: input: Logitech USB Receiver Consumer Control as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.1/0003:046D:C52B.0002/input/input6
> mai 06 11:54:24 Cockpit kernel: input: Logitech USB Receiver System Control as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.1/0003:046D:C52B.0002/input/input7
> mai 06 11:54:24 Cockpit kernel: hid-generic 0003:046D:C52B.0002: input,hiddev96,hidraw1: USB HID v1.11 Mouse [Logitech USB Receiver] on usb-0000:00:14.0-8/input1
> mai 06 11:54:24 Cockpit kernel: hid-generic 0003:046D:C52B.0003: hiddev97,hidraw2: USB HID v1.11 Device [Logitech USB Receiver] on usb-0000:00:14.0-8/input2
> mai 06 11:54:24 Cockpit kernel: usbcore: registered new interface driver usbhid
> mai 06 11:54:24 Cockpit kernel: usbhid: USB HID core driver
> mai 06 11:54:24 Cockpit kernel: logitech-djreceiver 0003:046D:C52B.0003: hiddev96,hidraw0: USB HID v1.11 Device [Logitech USB Receiver] on usb-0000:00:14.0-8/input2
> mai 06 11:54:24 Cockpit kernel: input: Logitech Wireless Device PID:405e Keyboard as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.2/0003:046D:C52B.0003/0003:046D:405E.0004/input/input9
> mai 06 11:54:24 Cockpit kernel: input: Logitech Wireless Device PID:405e Mouse as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.2/0003:046D:C52B.0003/0003:046D:405E.0004/input/input10
> mai 06 11:54:24 Cockpit kernel: hid-generic 0003:046D:405E.0004: input,hidraw1: USB HID v1.11 Keyboard [Logitech Wireless Device PID:405e] on usb-0000:00:14.0-8/input2:1
> mai 06 11:54:24 Cockpit kernel: input: Logitech Wireless Device PID:2010 Keyboard as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.2/0003:046D:C52B.0003/0003:046D:2010.0005/input/input14
> mai 06 11:54:24 Cockpit kernel: hid-generic 0003:046D:2010.0005: input,hidraw2: USB HID v1.11 Keyboard [Logitech Wireless Device PID:2010] on usb-0000:00:14.0-8/input2:2
> mai 06 11:54:24 Cockpit kernel: logitech-hidpp-device 0003:046D:405E.0004: HID++ 4.5 device connected.
> mai 06 11:54:24 Cockpit kernel: logitech-hidpp-device 0003:046D:405E.0004: hidpp_devicenametype_get_count: received protocol error 0x07
> mai 06 11:54:24 Cockpit kernel: input: Logitech Wireless Device PID:405e as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.2/0003:046D:C52B.0003/0003:046D:405E.0004/input/input18
> mai 06 11:54:24 Cockpit kernel: logitech-hidpp-device 0003:046D:405E.0004: input,hidraw1: USB HID v1.11 Keyboard [Logitech Wireless Device PID:405e] on usb-0000:00:14.0-8/input2:1
> mai 06 11:54:24 Cockpit kernel: input: Logitech Wireless Device PID:2010 as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.2/0003:046D:C52B.0003/0003:046D:2010.0005/input/input19
> mai 06 11:54:24 Cockpit kernel: logitech-hidpp-device 0003:046D:2010.0005: input,hidraw2: USB HID v1.11 Keyboard [Logitech Wireless Device PID:2010] on usb-0000:00:14.0-8/input2:2
>
> Next, once booted, if I remove the unify receiver and plug it in again, there is a massive lag (~15 seconds) before the receiver is ready and the mouse and keyboard become functional, with the following errors:
>
> kernel: logitech-hidpp-device 0003:046D:405E.0022: hidpp_devicenametype_get_count: received protocol error 0x07
> kernel: logitech-hidpp-device 0003:046D:405E.0023: Couldn't get wheel multiplier (error -110)
>
> Unify receiver with K800 keyboard and M720 Triathlon mouse paired.
>
> This happens on my desktop computer but not on my laptop with a unify receiver and a marathon M705 mouse.
>
> Both computers are on Arch Linux and up to date.
>
> On the desktop the boot is fine without the unify receiver.
>
> Let me know if you need more info.
>
> Thank you.


See the ticket for more details.

Note, there are two users affected by this (see Comment 5 for the
second), but you have to use bugzilla to reach the second reporter, as I
sadly[1] cannot simply CC them in mails like this (the initial
reporter gave permission).


[TLDR for the rest of this mail: I'm adding this report to the list of
tracked Linux kernel regressions; the text you find below is based on a
few template paragraphs you might have encountered already in similar
form.]

BTW, let me use this mail to also add the report to the list of tracked
regressions to ensure it doesn't fall through the cracks:

#regzbot introduced: v6.2..v6.3
https://bugzilla.kernel.org/show_bug.cgi?id=217412
#regzbot title: input: hid: logitech unify receiver not working properly
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (e.g. the bugzilla ticket and maybe this mail as well, if
this thread sees some discussion). See page linked in footer for details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

[1] because bugzilla.kernel.org tells users upon registration their
"email address will never be displayed to logged out users"


Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

Hi, Thorsten here, the Linux kernel's regression tracker.

On 08.05.23 11:55, Linux regression tracking (Thorsten Leemhuis) wrote:
> Hi, Thorsten here, the Linux kernel's regression tracker.
>
> I noticed a regression report in bugzilla.kernel.org. As many (most?)
> kernel developers don't keep an eye on it, I decided to forward it by mail.
>
> Quoting from https://bugzilla.kernel.org/show_bug.cgi?id=217412 :

TWIMC: a few other users (three or four iirc) showed up in that ticket
and reported problems with the receiver, albeit the symptoms are not
exactly the same for all of them, so there might be more than one problem.

I'll try to motivate the affected users to perform a bisection. But it
would be great if those with more knowledge about this code could
briefly look into the ticket; maybe the details the users shared allow
one of you to guess what causes this.

Ciao, Thorsten


2023-05-16 13:39:08

by Bagas Sanjaya

Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

On 5/11/23 18:58, Thorsten Leemhuis wrote:
> Hi, Thorsten here, the Linux kernel's regression tracker.
>
> On 08.05.23 11:55, Linux regression tracking (Thorsten Leemhuis) wrote:
>> Hi, Thorsten here, the Linux kernel's regression tracker.
>>
>> I noticed a regression report in bugzilla.kernel.org. As many (most?)
>> kernel developers don't keep an eye on it, I decided to forward it by mail.
>>
>> Quoting from https://bugzilla.kernel.org/show_bug.cgi?id=217412 :
>
> TWIMC: a few other users (three or four iirc) showed up in that ticket
> and reported problems with the receiver, albeit the symptoms are not
> exactly the same for all of them, so there might be more than one problem.
>
> I'll try to motivate the affected users to perform a bisection. But
> would be great if those with more knowledge about this code could
> briefly look into the ticket, maybe the details the users shared allows
> one of you to guess what causes this.
>

Hmm,

You noted in a similar report [1] that the developers involved here
are ignoring these regressions. I wonder if Linus has to be brought in
in this case, and if so, let's listen closely to what
he has to say.

Thanks.


[1]: https://lore.kernel.org/regressions/[email protected]/

--
An old man doll... just what I always wanted! - Clara


2023-05-16 13:55:34

by Benjamin Tissoires

Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

On Tue, May 16, 2023 at 3:25 PM Bagas Sanjaya <[email protected]> wrote:
>
> On 5/11/23 18:58, Thorsten Leemhuis wrote:
> > Hi, Thorsten here, the Linux kernel's regression tracker.
> >
> > On 08.05.23 11:55, Linux regression tracking (Thorsten Leemhuis) wrote:
> >> Hi, Thorsten here, the Linux kernel's regression tracker.
> >>
> >> I noticed a regression report in bugzilla.kernel.org. As many (most?)
> >> kernel developers don't keep an eye on it, I decided to forward it by mail.
> >>
> >> Quoting from https://bugzilla.kernel.org/show_bug.cgi?id=217412 :
> >
> > TWIMC: a few other users (three or four iirc) showed up in that ticket
> > and reported problems with the receiver, albeit the symptoms are not
> > exactly the same for all of them, so there might be more than one problem.
> >
> > I'll try to motivate the affected users to perform a bisection. But
> > would be great if those with more knowledge about this code could
> > briefly look into the ticket, maybe the details the users shared allows
> > one of you to guess what causes this.
> >
>
> Hmm,
>
> You noted in the similar report [1] that developers involved here
> ignore this regressions. I wonder if Linus has to be hired in
> this case, and if it is the case, let's take a look and hear closely what
> he will say.

Sigh... Not answering an email is bad, but maybe you can also
understand that developers can take days off?

And it turns out that I was waiting for Bastien to chime in, but I can
access his calendar too and just found out that he was AFK for the
entire month, except for the first week, which I wasn't aware of. May is
a time when people in France have a lot of public holidays, and it is
also the cutoff for using up time off before it expires.

For me, I'll also be taking time off the rest of this week, so I won't
be able to have a look at this until next week at the earliest.

Cheers,
Benjamin

>
> Thanks.
>
>
> [1]: https://lore.kernel.org/regressions/[email protected]/
>
> --
> An old man doll... just what I always wanted! - Clara
>


Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

On 16.05.23 15:24, Bagas Sanjaya wrote:
> On 5/11/23 18:58, Thorsten Leemhuis wrote:
>> Hi, Thorsten here, the Linux kernel's regression tracker.
>>
>> On 08.05.23 11:55, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> Hi, Thorsten here, the Linux kernel's regression tracker.
>>>
>>> I noticed a regression report in bugzilla.kernel.org. As many (most?)
>>> kernel developers don't keep an eye on it, I decided to forward it by mail.
>>>
>>> Quoting from https://bugzilla.kernel.org/show_bug.cgi?id=217412 :
>>
>> TWIMC: a few other users (three or four iirc) showed up in that ticket
>> and reported problems with the receiver, albeit the symptoms are not
>> exactly the same for all of them, so there might be more than one problem.
>>
>> I'll try to motivate the affected users to perform a bisection. But
>> would be great if those with more knowledge about this code could
>> briefly look into the ticket, maybe the details the users shared allows
>> one of you to guess what causes this.
>
> Hmm,
>
> You noted in the similar report [1] that developers involved here
> ignore this regressions. I wonder if Linus has to be hired in
> this case, and if it is the case, let's take a look and hear closely what
> he will say.
>
> [1]: https://lore.kernel.org/regressions/[email protected]/

You CCed him so maybe we'll learn soon.

I expect he doesn't like the situation, but at the same time I guess
there is nothing much he will do (which is why I do not CC him in cases
like this, unless they are urgent/severe or something like that).

That's because, as far as I know, in the end it is the duty of the
reporter(s) to find the culprit.

Because in the end developers and subsystem maintainers are volunteers,
too -- and making them bisect each and every report would make the job
way too hard. And the question "which developer/subsystem maintainer
needs to perform the bisection" would often become quickly complicated
as well, as an issue in one area of the kernel can be caused by a change
in a totally different area (for file-systems that is way more likely
than for input drivers, but still). Not to mention that developers and
subsystem maintainers might not even have the environment at hand to
reproduce the issue.

That being said: I think a quick "We looked into these three reports
that might be related, but we have no idea what might cause this;
somebody needs to bisect things" from one of the involved
developers/maintainers would have been nice (appropriate?).

Ciao, Thorsten

2023-05-17 13:36:10

by Bagas Sanjaya

Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

On Thu, May 11, 2023 at 01:58:43PM +0200, Thorsten Leemhuis wrote:
> Hi, Thorsten here, the Linux kernel's regression tracker.
>
> On 08.05.23 11:55, Linux regression tracking (Thorsten Leemhuis) wrote:
> > Hi, Thorsten here, the Linux kernel's regression tracker.
> >
> > I noticed a regression report in bugzilla.kernel.org. As many (most?)
> > kernel developers don't keep an eye on it, I decided to forward it by mail.
> >
> > Quoting from https://bugzilla.kernel.org/show_bug.cgi?id=217412 :
>
> TWIMC: a few other users (three or four iirc) showed up in that ticket
> and reported problems with the receiver, albeit the symptoms are not
> exactly the same for all of them, so there might be more than one problem.
>
> I'll try to motivate the affected users to perform a bisection. But
> would be great if those with more knowledge about this code could
> briefly look into the ticket, maybe the details the users shared allows
> one of you to guess what causes this.

Hi Thorsten,

Another reporter in the same bug ticket has already pinned down the
culprit to 586e8fede7953b ("HID: logitech-hidpp: Retry commands when device
is busy"). I'm now updating the regzbot entry (and Cc'ing the culprit's author):

#regzbot introduced: 586e8fede7953b

Thanks.

--
An old man doll... just what I always wanted! - Clara


Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

On 16.05.23 15:34, Benjamin Tissoires wrote:
> On Tue, May 16, 2023 at 3:25 PM Bagas Sanjaya <[email protected]> wrote:
>> On 5/11/23 18:58, Thorsten Leemhuis wrote:
>>> On 08.05.23 11:55, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>> I noticed a regression report in bugzilla.kernel.org. As many (most?)
>>>> kernel developers don't keep an eye on it, I decided to forward it by mail.
>>>>
>>>> Quoting from https://bugzilla.kernel.org/show_bug.cgi?id=217412 :
>>>
>>> TWIMC: a few other users (three or four iirc) showed up in that ticket
>>> and reported problems with the receiver, albeit the symptoms are not
>>> exactly the same for all of them, so there might be more than one problem.
>>>
>>> I'll try to motivate the affected users to perform a bisection. But
>>> would be great if those with more knowledge about this code could
>>> briefly look into the ticket, maybe the details the users shared allows
>>> one of you to guess what causes this.
>>
>> Hmm,
>>
>> You noted in the similar report [1] that developers involved here
>> ignore this regressions. I wonder if Linus has to be hired in
>> this case, and if it is the case, let's take a look and hear closely what
>> he will say.
>
> Sigh... Not answering an email is bad, but maybe you can also
> understand that developers can take days off?

Yup, also a totally valid reason I forgot to mention in my reply last week.

> And it turns out that I was waiting for Bastien to chime in, but I can
> access his calendar too and just found out that he was AFK for the
> entire month, except for the first week, which I wasn't aware. May is
> a time where people in France have a lot of public holidays and is
> also the cut to use those time off or they expire.

Thx for that, knowing that Bastien is unavailable is really helpful.

> For me, I'll also be taking time off the rest of this week, so I won't
> be able to have a look at this until next week at the earliest.

Hope you enjoyed your days off!

FWIW, in case anybody is interested in a status update: one reporter
bisected the problem down to 586e8fede79 ("HID: logitech-hidpp: Retry
commands when device is busy"); reverting that commit on-top of 6.3
fixes the problem for that reporter. For that reporter things also work
on 6.4-rc; but for someone else that is affected that's not the case.

Makes me wonder if we are dealing with two different issues here. I just
asked the reporter for whom 6.4-rc does not work whether reverting
586e8fede79 fixes things for them as well.

For more details, see https://bugzilla.kernel.org/show_bug.cgi?id=217412

Ciao, Thorsten

2023-05-22 18:29:48

by Linus Torvalds

Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

On Mon, May 22, 2023 at 5:38 AM Linux regression tracking (Thorsten
Leemhuis) <[email protected]> wrote:
>
> FWIW, in case anybody is interested in a status update: one reporter
> bisected the problem down to 586e8fede79 ("HID: logitech-hidpp: Retry
> commands when device is busy"); reverting that commit on-top of 6.3
> fixes the problem for that reporter. For that reporter things also work
> on 6.4-rc; but for someone else that is affected that's not the case.

Hmm. It's likely timing-dependent.

But that code is clearly buggy.

If the wait_event_timeout() returns early, the device hasn't replied,
but the code does

        if (!wait_event_timeout(hidpp->wait, hidpp->answer_available,
                                5*HZ)) {
                dbg_hid("%s:timeout waiting for response\n", __func__);
                memset(response, 0, sizeof(struct hidpp_report));
                ret = -ETIMEDOUT;
        }

and then continues to look at the response _anyway_.

Now, depending on our hardening options, that response may have been
initialized by the compiler, or may just be random stack contents.

That bug is pre-existing (ie the problem was not introduced by that
commit), but who knows if the retry makes things worse (ie if it then
triggers on a retry, the response data will be the *previous*
response).

The whole "goto exit" games should be removed too, because we're in a
for-loop, and instead of "goto exit" it should just do "break".

IOW, something like this might be worth testing.

That said, while I think the code is buggy, I doubt this is the actual
cause of the problem people are reporting. But it would be lovely to
hear if the attached patch makes any difference, and I think this is
fixing a real - but unlikely - problem anyway.

And obviously it might be helpful to actually enable those dbg_hid()
messages, but I didn't look at what the magic config option to do so
was.

NOTE! Patch below *ENTIRELY* untested. I just looked at the code when
that commit was mentioned, and went "that's not right"...

Linus


Attachments:
patch.diff (1.49 kB)
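
The attached patch is not reproduced in this archive. As a rough
illustration of the control flow described above (stop at the timeout
instead of falling through to the response checks, and use "break"
rather than "goto exit" inside the retry loop), here is a minimal
user-space sketch. It is not the attached patch and not the driver code:
struct report, struct fake_dev, send_and_wait(), send_sync(),
MAX_RETRIES and ERR_BUSY are all hypothetical names, and the scripted
fake device merely stands in for real hardware.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical stand-ins, not the driver's real types. */
#define MAX_RETRIES 3
#define ERR_BUSY    0x08	/* assumed "device busy, try again" code */

struct report {
	unsigned char id;
	unsigned char error;	/* 0 means success in this sketch */
};

/* Scripted outcome of each send attempt, standing in for real hardware. */
enum outcome { OUT_TIMEOUT, OUT_BUSY, OUT_OK };

struct fake_dev {
	const enum outcome *script;	/* one entry per attempt */
	int pos;			/* number of attempts made so far */
};

/* Stand-in for "send, then wait with a timeout": false means timed out. */
static bool send_and_wait(struct fake_dev *dev, struct report *resp)
{
	enum outcome o = dev->script[dev->pos++];

	if (o == OUT_TIMEOUT)
		return false;	/* resp is left untouched, i.e. stale */

	resp->id = 0x11;
	resp->error = (o == OUT_BUSY) ? ERR_BUSY : 0;
	return true;
}

/* Returns 0 on success, -ETIMEDOUT on timeout, or a positive device error. */
static int send_sync(struct fake_dev *dev, struct report *resp)
{
	int ret = -1;

	assert(dev != NULL && resp != NULL);

	for (int tries = MAX_RETRIES; tries > 0; tries--) {
		if (!send_and_wait(dev, resp)) {
			/*
			 * No answer arrived: clear the buffer and leave the
			 * loop, so nothing below interprets stale data.
			 */
			memset(resp, 0, sizeof(*resp));
			ret = -ETIMEDOUT;
			break;
		}

		ret = resp->error;
		if (ret != ERR_BUSY)
			break;	/* success or a hard error: stop retrying */
		/* busy: go around the loop and retry */
	}

	return ret;
}
```

With a script of { OUT_TIMEOUT, OUT_OK }, send_sync() returns -ETIMEDOUT
after a single attempt and leaves the report zeroed; with
{ OUT_BUSY, OUT_OK } it retries once and returns 0.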

2023-05-23 12:49:43

by Jiri Kosina

Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

On Mon, 22 May 2023, Linus Torvalds wrote:

> > FWIW, in case anybody is interested in a status update: one reporter
> > bisected the problem down to 586e8fede79 ("HID: logitech-hidpp: Retry
> > commands when device is busy"); reverting that commit on-top of 6.3
> > fixes the problem for that reporter. For that reporter things also work
> > on 6.4-rc; but for someone else that is affected that's not the case.

FWIW, I was pretty much away for the past few weeks as well, same as
Benjamin and Bastien. Which is unfortunate timing, but that's how things
pan out sometimes.

> Hmm. It's likely timing-dependent.
>
> But that code is clearly buggy.
>
> If the wait_event_timeout() returns early, the device hasn't replied,
> but the code does
>
>         if (!wait_event_timeout(hidpp->wait, hidpp->answer_available,
>                                 5*HZ)) {
>                 dbg_hid("%s:timeout waiting for response\n", __func__);
>                 memset(response, 0, sizeof(struct hidpp_report));
>                 ret = -ETIMEDOUT;
>         }
>
> and then continues to look at the response _anyway_.

Yeah; we are zeroing it out, but that doesn't really make things any
better in principle, given all the dereferences later.

The issue seems to have existed ever since 2f31c52529 ("HID: Introduce
hidpp, a module to handle Logitech hid++ devices") when this whole driver
was introduced, as far as I can tell.

> Now, depending on our hardening options, that response may have been
> initialized by the compiler, or may just be random stack contents.

Again, since in the timeout case the buffer is just zeroed out, I'd much
rather expect a NULL pointer dereference in that case. Which is not what
we are seeing here.

> That bug is pre-existing (ie the problem was not introduced by that
> commit), but who knows if the retry makes things worse (ie if it then
> triggers on a retry, the response data will be the *previous* response).
>
> The whole "goto exit" games should be removed too, because we're in a
> for-loop, and instead of "goto exit" it should just do "break".
>
> IOW, something like this might be worth testing.
>
> That said, while I think the code is buggy, I doubt this is the actual
> cause of the problem people are reporting. But it would be lovely to
> hear if the attached patch makes any difference, and I think this is
> fixing a real - but unlikely - problem anyway.
>
> And obviously it might be helpful to actually enable those dbg_hid()
> messages, but I didn't look at what the magic config option to do so
> was.

dbg_hid() is just pr_debug(), which means that on kernels with
CONFIG_DYNAMIC_DEBUG, this makes use of the dynamic debug facility;
otherwise it just becomes printk(KERN_DEBUG...).

Thanks,

--
Jiri Kosina
SUSE Labs


2023-05-24 15:34:39

by Benjamin Tissoires

Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

On Tue, May 23, 2023 at 2:31 PM Jiri Kosina <[email protected]> wrote:
>
> On Mon, 22 May 2023, Linus Torvalds wrote:
>
> > > FWIW, in case anybody is interested in a status update: one reporter
> > > bisected the problem down to 586e8fede79 ("HID: logitech-hidpp: Retry
> > > commands when device is busy"); reverting that commit on-top of 6.3
> > > fixes the problem for that reporter. For that reporter things also work
> > > on 6.4-rc; but for someone else that is affected that's not the case.
>
> FWIW, I was pretty much away for past few weeks as well, same as Benjamin
> as Bastien. Which is unfortunate timing, but that's how things pan out
> sometimes.
>
> > Hmm. It's likely timing-dependent.
> >
> > But that code is clearly buggy.
> >
> > If the wait_event_timeout() returns early, the device hasn't replied,
> > but the code does
> >
> >         if (!wait_event_timeout(hidpp->wait, hidpp->answer_available,
> >                                 5*HZ)) {
> >                 dbg_hid("%s:timeout waiting for response\n", __func__);
> >                 memset(response, 0, sizeof(struct hidpp_report));
> >                 ret = -ETIMEDOUT;
> >         }
> >
> > and then continues to look at the response _anyway_.
>
> Yeah; we are zeroing it out, but that doesn't really make things any
> better in principle, given all the dereferences later.
>
> The issue seems to be existing ever since 2f31c52529 ("HID: Introduce
> hidpp, a module to handle Logitech hid++ devices") when this whole driver
> was introduced, as far as I can tell.

Yep, that was on me. But the weird part is that I should be able to
reproduce this locally then, but I don't.

>
> > Now, depending on our hardening options, that response may have been
> > initialized by the compiler, or may just be random stack contents.
>
> Again, as in the case of a timeout the buffer is just zeroed out, I'd much
> rather expect a NULL pointer dereference in such a case. Which is not what
> we are seeing here.

Returning -ETIMEDOUT seems good to me FWIW.
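
For illustration only, here is a minimal user-space sketch of that idea.
It is not the driver code: every fake_* name, the struct layout, the
FAKE_ERR_BUSY value and the retry count are assumptions made up for the
example. It only shows a retry loop that leaves with "break" as soon as
the wait times out, so the zeroed response is never inspected, and that
retries only when the (simulated) device reports busy.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins; none of this is the kernel's hidpp layout. */
enum fake_outcome { FAKE_TIMEOUT, FAKE_BUSY, FAKE_OK };

struct fake_response {
	bool is_error;
	int error_code;
	int payload;
};

#define FAKE_ERR_BUSY    8	/* assumed value, for the example only */
#define FAKE_MAX_RETRIES 3	/* assumed retry count */

/* Simulates one wait: returns false on timeout, else fills *response. */
static bool fake_wait_for_answer(enum fake_outcome outcome,
				 struct fake_response *response)
{
	switch (outcome) {
	case FAKE_BUSY:
		response->is_error = true;
		response->error_code = FAKE_ERR_BUSY;
		response->payload = 0;
		return true;
	case FAKE_OK:
		response->is_error = false;
		response->error_code = 0;
		response->payload = 42;
		return true;
	default:
		return false;	/* timeout: *response is left untouched */
	}
}

/*
 * script[] gives the simulated outcome of each attempt and must hold
 * at least FAKE_MAX_RETRIES entries.
 */
static int fake_send_message(const enum fake_outcome *script,
			     struct fake_response *response)
{
	int ret = 0;
	int attempt;

	assert(script != NULL && response != NULL);

	for (attempt = 0; attempt < FAKE_MAX_RETRIES; attempt++) {
		if (!fake_wait_for_answer(script[attempt], response)) {
			memset(response, 0, sizeof(*response));
			ret = -ETIMEDOUT;
			break;	/* never look at the response after a timeout */
		}

		if (response->is_error &&
		    response->error_code == FAKE_ERR_BUSY) {
			ret = -EBUSY;
			continue;	/* device busy: try again */
		}

		ret = response->is_error ? -EIO : 0;
		break;
	}

	return ret;
}
```

With a script of { FAKE_BUSY, FAKE_TIMEOUT, FAKE_OK } the function
returns -ETIMEDOUT on the second attempt and the stale "busy" answer
from the first attempt is wiped rather than parsed.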

>
> > That bug is pre-existing (ie the problem was not introduced by that
> > commit), but who knows if the retry makes things worse (ie if it then
> > triggers on a retry, the response data will be the *previous* response).
> >
> > The whole "goto exit" games should be removed too, because we're in a
> > for-loop, and instead of "goto exit" it should just do "break".
> >
> > IOW, something like this might be worth testing.
> >
> > That said, while I think the code is buggy, I doubt this is the actual
> > cause of the problem people are reporting. But it would be lovely to
> > hear if the attached patch makes any difference, and I think this is
> > fixing a real - but unlikely - problem anyway.

FWIW, Linus, your patch is
Reviewed-by: Benjamin Tissoires <[email protected]>

Feel free to submit it to us or to apply it directly if you prefer as
this is clearly a fix for a code path issue.

I am struggling to keep up with everything now that I am back from last
week; being sick at the beginning of the week and still not feeling
completely well doesn't help.

Cheers,
Benjamin

> >
> > And obviously it might be helpful to actually enable those dbg_hid()
> > messages, but I didn't look at what the magic config option to do so
> > was.
>
> dbg_hid() is just pr_debug(), which means that on kernels with
> CONFIG_DYNAMIC_DEBUG, this makes use of the dynamic debug facility;
> otherwise it just becomes printk(KERN_DEBUG...).
>
> Thanks,
>
> --
> Jiri Kosina
> SUSE Labs
>


2023-05-25 11:20:48

by Jiri Kosina

Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

On Wed, 24 May 2023, Benjamin Tissoires wrote:

> > > That bug is pre-existing (ie the problem was not introduced by that
> > > commit), but who knows if the retry makes things worse (ie if it then
> > > triggers on a retry, the response data will be the *previous* response).
> > >
> > > The whole "goto exit" games should be removed too, because we're in a
> > > for-loop, and instead of "goto exit" it should just do "break".
> > >
> > > IOW, something like this might be worth testing.
> > >
> > > That said, while I think the code is buggy, I doubt this is the actual
> > > cause of the problem people are reporting. But it would be lovely to
> > > hear if the attached patch makes any difference, and I think this is
> > > fixing a real - but unlikely - problem anyway.
>
> FWIW, Linus, your patch is
> Reviewed-by: Benjamin Tissoires <[email protected]>
>
> Feel free to submit it to us or to apply it directly if you prefer as
> this is clearly a fix for a code path issue.

It would be nice to hear from the people who were able to reproduce the
issue whether this makes any observable difference in behavior though. I
don't currently think it would, as it fixes a potential NULL pointer
dereference, which is not what has been reported.

Has anyone of the affected people tried to bisect the issue?

Thanks,

--
Jiri Kosina
SUSE Labs


2023-05-25 11:34:26

by Benjamin Tissoires

Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

On Thu, May 25, 2023 at 1:10 PM Jiri Kosina <[email protected]> wrote:
>
> On Wed, 24 May 2023, Benjamin Tissoires wrote:
>
> > > > That bug is pre-existing (ie the problem was not introduced by that
> > > > commit), but who knows if the retry makes things worse (ie if it then
> > > > triggers on a retry, the response data will be the *previous* response).
> > > >
> > > > The whole "goto exit" games should be removed too, because we're in a
> > > > for-loop, and instead of "goto exit" it should just do "break".
> > > >
> > > > IOW, something like this might be worth testing.
> > > >
> > > > That said, while I think the code is buggy, I doubt this is the actual
> > > > cause of the problem people are reporting. But it would be lovely to
> > > > hear if the attached patch makes any difference, and I think this is
> > > > fixing a real - but unlikely - problem anyway.
> >
> > FWIW, Linus, your patch is
> > Reviewed-by: Benjamin Tissoires <[email protected]>
> >
> > Feel free to submit it to us or to apply it directly if you prefer as
> > this is clearly a fix for a code path issue.
>
> It would be nice to hear from the people who were able to reproduce the
> issue whether this makes any observable difference in behavior though. I
> don't currently think it would, as it fixes a potential NULL pointer
> dereference, which is not what has been reported.

Well, yes, I didn't mean it would fix the bug. But this is an obvious
fix that we need to take given that we now see it :)

>
> Has anyone of the affected people tried to bisect the issue?

I just checked the BZ linked above, and... it still doesn't work
(which is not surprising):
https://bugzilla.kernel.org/show_bug.cgi?id=217412#c33

Cheers,
Benjamin


2023-05-26 18:51:30

by Jiri Kosina

Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

On Thu, 25 May 2023, Jiri Kosina wrote:

> > > > That bug is pre-existing (ie the problem was not introduced by that
> > > > commit), but who knows if the retry makes things worse (ie if it then
> > > > triggers on a retry, the response data will be the *previous* response).
> > > >
> > > > The whole "goto exit" games should be removed too, because we're in a
> > > > for-loop, and instead of "goto exit" it should just do "break".
> > > >
> > > > IOW, something like this might be worth testing.
> > > >
> > > > That said, while I think the code is buggy, I doubt this is the actual
> > > > cause of the problem people are reporting. But it would be lovely to
> > > > hear if the attached patch makes any difference, and I think this is
> > > > fixing a real - but unlikely - problem anyway.
> >
> > FWIW, Linus, your patch is
> > Reviewed-by: Benjamin Tissoires <[email protected]>
> >
> > Feel free to submit it to us or to apply it directly if you prefer as
> > this is clearly a fix for a code path issue.
>
> It would be nice to hear from the people who were able to reproduce the
> issue whether this makes any observable difference in behavior though. I
> don't currently think it would, as it fixes a potential NULL pointer
> dereference, which is not what has been reported.
>
> Has anyone of the affected people tried to bisect the issue?

Could anyone who is able to reproduce the issue please check whether
reverting

586e8fede7953b16 ("HID: logitech-hidpp: Retry commands when device is busy")

has any observable effect?

Thanks,

--
Jiri Kosina
SUSE Labs


Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly



On 26.05.23 20:41, Jiri Kosina wrote:
> On Thu, 25 May 2023, Jiri Kosina wrote:
>
>>>>> That bug is pre-existing (ie the problem was not introduced by that
>>>>> commit), but who knows if the retry makes things worse (ie if it then
>>>>> triggers on a retry, the response data will be the *previous* response).
>>>>>
>>>>> The whole "goto exit" games should be removed too, because we're in a
>>>>> for-loop, and instead of "goto exit" it should just do "break".
>>>>>
>>>>> IOW, something like this might be worth testing.
>>>>>
>>>>> That said, while I think the code is buggy, I doubt this is the actual
>>>>> cause of the problem people are reporting. But it would be lovely to
>>>>> hear if the attached patch makes any difference, and I think this is
>>>>> fixing a real - but unlikely - problem anyway.
>>>
>>> FWIW, Linus, your patch is
>>> Reviewed-by: Benjamin Tissoires <[email protected]>
>>>
>>> Feel free to submit it to us or to apply it directly if you prefer as
>>> this is clearly a fix for a code path issue.
>>
>> It would be nice to hear from the people who were able to reproduce the
>> issue whether this makes any observable difference in behavior though. I
>> don't currently think it would, as it fixes a potential NULL pointer
>> dereference, which is not what has been reported.
>>
>> Has anyone of the affected people tried to bisect the issue?
>
> Could anyone

Reminder: a lot of the affected users can only be reached through the
bugzilla ticket (https://bugzilla.kernel.org/show_bug.cgi?id=217412 )
that made me start this thread. Sorry for this mess, but I can't simply
CC them because on account creation users are told that the "email
address will never be displayed to logged out users". Bugbot will
hopefully soon make this sort of problem history.

> who is able to reproduce the issue please check whether
> reverting
>
> 586e8fede7953b16 ("HID: logitech-hidpp: Retry commands when device is busy")
>
> has any observable effect?

See https://bugzilla.kernel.org/show_bug.cgi?id=217412#c26 and later –
it at least solved the problem for one user.

But it's all a mess (at least afaics). Earlier in that ticket some other
user said things work with a 6.4-rc kernel, while another confirmed
things are still broken. So maybe we are dealing with more than one
problem. Or testing went sideways for some of the users.

Ciao, Thorsten

2023-05-31 08:49:17

by Jiri Kosina

Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

On Sun, 28 May 2023, Thorsten Leemhuis wrote:

> > who is able to reproduce the issue please check whether
> > reverting
> >
> > 586e8fede7953b16 ("HID: logitech-hidpp: Retry commands when device is busy")
> >
> > has any observable effect?
>
> See https://bugzilla.kernel.org/show_bug.cgi?id=217412#c26 and later –
> it at least solved the problem for one user.
>
> But it's all a mess (at least afaics). Earlier in that ticket some other
> user said things work with a 6.4-rc kernel, while another confirmed
> things are still broken. So maybe we are dealing with more than one
> problem. Or testing went sideways for some of the users.

The patch that needs to be tested now by the affected users is here:

https://patchwork.kernel.org/project/linux-input/patch/[email protected]/

--
Jiri Kosina
SUSE Labs


2023-05-31 08:58:58

by Bastien Nocera

Subject: Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

On Mon, 2023-05-22 at 11:23 -0700, Linus Torvalds wrote:
> On Mon, May 22, 2023 at 5:38 AM Linux regression tracking (Thorsten
> Leemhuis) <[email protected]> wrote:
> >
> > FWIW, in case anybody is interested in a status update: one reporter
> > bisected the problem down to 586e8fede79 ("HID: logitech-hidpp: Retry
> > commands when device is busy"); reverting that commit on-top of 6.3
> > fixes the problem for that reporter. For that reporter things also work
> > on 6.4-rc; but for someone else that is affected that's not the case.
>
> Hmm. It's likely timing-dependent.
>
> But that code is clearly buggy.
>
> If the wait_event_timeout() returns early, the device hasn't replied,
> but the code does
>
>                 if (!wait_event_timeout(hidpp->wait, hidpp->answer_available,
>                                         5*HZ)) {
>                         dbg_hid("%s:timeout waiting for response\n", __func__);
>                         memset(response, 0, sizeof(struct hidpp_report));
>                         ret = -ETIMEDOUT;
>                 }
>
> and then continues to look at the response _anyway_.
>
> Now, depending on our hardening options, that response may have been
> initialized by the compiler, or may just be random stack contents.

It's kzalloc()'ed in the 2 places it's used, hidpp_send_message_sync().

> That bug is pre-existing (ie the problem was not introduced by that
> commit), but who knows if the retry makes things worse (ie if it then
> triggers on a retry, the response data will be the *previous* response).
>
> The whole "goto exit" games should be removed too, because we're in a
> for-loop, and instead of "goto exit" it should just do "break".
>
> IOW, something like this might be worth testing.
>
> That said, while I think the code is buggy, I doubt this is the actual
> cause of the problem people are reporting. But it would be lovely to
> hear if the attached patch makes any difference, and I think this is
> fixing a real - but unlikely - problem anyway.
>
> And obviously it might be helpful to actually enable those dbg_hid()
> messages, but I didn't look at what the magic config option to do so
> was.

Thomas Weißschuh's patch ("HID: use standard debug APIs") linked all
those debug calls to the dynamic debugging system, so something like
this will work after boot:
echo 'file hid-logitech-hidpp.c +p' > /sys/kernel/debug/dynamic_debug/control

Adding this to the kernel command-line to get some debug during boot
should work:
dyndbg="file hid-logitech-hidpp.c +p"

In both cases, check it's enabled and that the messages can be printed
with:
grep -i hidpp /sys/kernel/debug/dynamic_debug/control

> NOTE! Patch below *ENTIRELY* untested. I just looked at the code when
> that commit was mentioned, and went "that's not right"...

I sent a similar patch before seeing your version, in answer to a
separate report I was sent. It doesn't change the style of the code,
and just fixes that one omission:
https://patchwork.kernel.org/project/linux-input/patch/[email protected]/

Cheers

>
>                      Linus