2023-05-08 06:46:29

by Mirsad Goran Todorovac

Subject: [BUG] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2

Hi,

There seems to be a kernel memory leak in the USB keyboard driver.

The leaked allocations are 96 and 512 bytes in size.

The platform is Ubuntu 22.04 LTS on an assembled AMD Ryzen 9 machine with
an X670E PG Lightning motherboard and a Genius SlimStar i220 GK-080012
keyboard.

(A Logitech M100 HID mouse is not affected by the bug.)

BIOS is:

     *-firmware
          description: BIOS
          vendor: American Megatrends International, LLC.
          physical id: 0
          version: 1.21
          date: 04/26/2023
          size: 64KiB

The kernel is 6.3.0-torvalds-<id>-13466-gfc4354c6e5c2.

The keyboard is recognised as Chicony:

                 *-usb
                      description: Keyboard
                      product: CHICONY USB Keyboard
                      vendor: CHICONY
                      physical id: 2
                      bus info: usb@5:2
                      logical name: input35
                      logical name: /dev/input/event4
                      logical name: input35::capslock
                      logical name: input35::numlock
                      logical name: input35::scrolllock
                      logical name: input36
                      logical name: /dev/input/event5
                      logical name: input37
                      logical name: /dev/input/event6
                      logical name: input38
                      logical name: /dev/input/event8
                      version: 2.30
                      capabilities: usb-2.00 usb
                      configuration: driver=usbhid maxpower=100mA speed=1Mbit/s

The bug is easily reproduced by unplugging the USB keyboard, waiting
a couple of seconds, reconnecting it, and then scanning for memory
leaks twice.
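
For reference, the "scan twice" step can be sketched as below. This is only a minimal illustration, assuming a kernel built with CONFIG_DEBUG_KMEMLEAK, debugfs mounted at /sys/kernel/debug, and root privileges; the helper name scan_twice and its pause parameter are hypothetical, not part of any kernel tooling.

```python
import time
from pathlib import Path

# Standard debugfs location of the kmemleak control/report file.
KMEMLEAK = Path("/sys/kernel/debug/kmemleak")


def scan_twice(path=KMEMLEAK, pause=5.0):
    """Trigger two kmemleak scans and return the report text.

    Writing "scan" to the file is the same as running
    `echo scan > /sys/kernel/debug/kmemleak`.
    """
    for _ in range(2):
        path.write_text("scan\n")
        time.sleep(pause)  # hypothetical settle time between scans
    # Reading the file back is the same as `cat /sys/kernel/debug/kmemleak`.
    return path.read_text()
```

Run as root after replugging the keyboard, print(scan_twice()) would show whatever objects kmemleak currently considers unreferenced.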

The kmemleak log is as follows [edited privacy info]:

root@hostname:/home/username# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff8dd020037c00 (size 96):
  comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
  hex dump (first 32 bytes):
    5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
  backtrace:
    [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
    [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
    [<ffffffffb87543d9>] class_create+0x29/0x80
    [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
    [<ffffffffc03d7bab>] hiddev_connect+0x11b/0x1b0 [usbhid]
    [<ffffffffc03b6d4e>] hid_connect+0xde/0x580 [hid]
    [<ffffffffc03b724c>] hid_hw_start+0x4c/0x70 [hid]
    [<ffffffffc0388092>] hid_generic_probe+0x32/0x40 [hid_generic]
    [<ffffffffc03b7450>] hid_device_probe+0x100/0x170 [hid]
    [<ffffffffb8752082>] really_probe+0x1b2/0x420
    [<ffffffffb875237e>] __driver_probe_device+0x7e/0x170
    [<ffffffffb87524a3>] driver_probe_device+0x23/0xa0
    [<ffffffffb8752748>] __driver_attach+0xe8/0x1e0
    [<ffffffffb874f8ee>] bus_for_each_dev+0x7e/0xd0
    [<ffffffffb8751822>] driver_attach+0x22/0x30
    [<ffffffffb8750e70>] bus_add_driver+0x120/0x220
unreferenced object 0xffff8dd015653e00 (size 512):
  comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
  hex dump (first 32 bytes):
    00 3e 65 15 d0 8d ff ff 00 3e 65 15 d0 8d ff ff .>e......>e.....
    00 00 00 00 00 00 00 00 5d 8e 4e b9 ff ff ff ff ........].N.....
  backtrace:
    [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
    [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
    [<ffffffffb8754292>] class_register+0x32/0x140
    [<ffffffffb87543f4>] class_create+0x44/0x80
    [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
    [<ffffffffc03d7bab>] hiddev_connect+0x11b/0x1b0 [usbhid]
    [<ffffffffc03b6d4e>] hid_connect+0xde/0x580 [hid]
    [<ffffffffc03b724c>] hid_hw_start+0x4c/0x70 [hid]
    [<ffffffffc0388092>] hid_generic_probe+0x32/0x40 [hid_generic]
    [<ffffffffc03b7450>] hid_device_probe+0x100/0x170 [hid]
    [<ffffffffb8752082>] really_probe+0x1b2/0x420
    [<ffffffffb875237e>] __driver_probe_device+0x7e/0x170
    [<ffffffffb87524a3>] driver_probe_device+0x23/0xa0
    [<ffffffffb8752748>] __driver_attach+0xe8/0x1e0
    [<ffffffffb874f8ee>] bus_for_each_dev+0x7e/0xd0
    [<ffffffffb8751822>] driver_attach+0x22/0x30
unreferenced object 0xffff8dd020037f60 (size 96):
  comm "kworker/0:2", pid 496256, jiffies 4295987055 (age 4531.432s)
  hex dump (first 32 bytes):
    5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
  backtrace:
    [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
    [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
    [<ffffffffb87543d9>] class_create+0x29/0x80
    [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
    [<ffffffffc03d7bab>] hiddev_connect+0x11b/0x1b0 [usbhid]
    [<ffffffffc03b6d4e>] hid_connect+0xde/0x580 [hid]
    [<ffffffffc03b724c>] hid_hw_start+0x4c/0x70 [hid]
    [<ffffffffc0388092>] hid_generic_probe+0x32/0x40 [hid_generic]
    [<ffffffffc03b7450>] hid_device_probe+0x100/0x170 [hid]
    [<ffffffffb8752082>] really_probe+0x1b2/0x420
    [<ffffffffb875237e>] __driver_probe_device+0x7e/0x170
    [<ffffffffb87524a3>] driver_probe_device+0x23/0xa0
    [<ffffffffb87525c2>] __device_attach_driver+0x92/0x120
    [<ffffffffb874f9da>] bus_for_each_drv+0x8a/0xe0
    [<ffffffffb8752ad1>] __device_attach+0xc1/0x1f0
    [<ffffffffb8752ed7>] device_initial_probe+0x17/0x20
unreferenced object 0xffff8dd07df25a00 (size 512):
  comm "kworker/0:2", pid 496256, jiffies 4295987055 (age 4531.500s)
  hex dump (first 32 bytes):
    00 5a f2 7d d0 8d ff ff 00 5a f2 7d d0 8d ff ff .Z.}.....Z.}....
    00 00 00 00 00 00 00 00 5d 8e 4e b9 ff ff ff ff ........].N.....
  backtrace:
    [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
    [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
    [<ffffffffb8754292>] class_register+0x32/0x140
    [<ffffffffb87543f4>] class_create+0x44/0x80
    [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
    [<ffffffffc03d7bab>] hiddev_connect+0x11b/0x1b0 [usbhid]
    [<ffffffffc03b6d4e>] hid_connect+0xde/0x580 [hid]
    [<ffffffffc03b724c>] hid_hw_start+0x4c/0x70 [hid]
    [<ffffffffc0388092>] hid_generic_probe+0x32/0x40 [hid_generic]
    [<ffffffffc03b7450>] hid_device_probe+0x100/0x170 [hid]
    [<ffffffffb8752082>] really_probe+0x1b2/0x420
    [<ffffffffb875237e>] __driver_probe_device+0x7e/0x170
    [<ffffffffb87524a3>] driver_probe_device+0x23/0xa0
    [<ffffffffb87525c2>] __device_attach_driver+0x92/0x120
    [<ffffffffb874f9da>] bus_for_each_drv+0x8a/0xe0
    [<ffffffffb8752ad1>] __device_attach+0xc1/0x1f0
unreferenced object 0xffff8dd015e66e40 (size 96):
  comm "kworker/1:0", pid 487844, jiffies 4296102566 (age 4069.472s)
  hex dump (first 32 bytes):
    5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
  backtrace:
    [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
    [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
    [<ffffffffb87543d9>] class_create+0x29/0x80
    [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
    [<ffffffffc03d7bab>] hiddev_connect+0x11b/0x1b0 [usbhid]
    [<ffffffffc03b6d4e>] hid_connect+0xde/0x580 [hid]
    [<ffffffffc03b724c>] hid_hw_start+0x4c/0x70 [hid]
    [<ffffffffc0388092>] hid_generic_probe+0x32/0x40 [hid_generic]
    [<ffffffffc03b7450>] hid_device_probe+0x100/0x170 [hid]
    [<ffffffffb8752082>] really_probe+0x1b2/0x420
    [<ffffffffb875237e>] __driver_probe_device+0x7e/0x170
    [<ffffffffb87524a3>] driver_probe_device+0x23/0xa0
    [<ffffffffb87525c2>] __device_attach_driver+0x92/0x120
    [<ffffffffb874f9da>] bus_for_each_drv+0x8a/0xe0
    [<ffffffffb8752ad1>] __device_attach+0xc1/0x1f0
    [<ffffffffb8752ed7>] device_initial_probe+0x17/0x20
unreferenced object 0xffff8dd0caffe800 (size 512):
  comm "kworker/1:0", pid 487844, jiffies 4296102566 (age 4069.472s)
  hex dump (first 32 bytes):
    00 e8 ff ca d0 8d ff ff 00 e8 ff ca d0 8d ff ff ................
    00 00 00 00 00 00 00 00 5d 8e 4e b9 ff ff ff ff ........].N.....
  backtrace:
    [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
    [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
    [<ffffffffb8754292>] class_register+0x32/0x140
    [<ffffffffb87543f4>] class_create+0x44/0x80
    [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
    [<ffffffffc03d7bab>] hiddev_connect+0x11b/0x1b0 [usbhid]
    [<ffffffffc03b6d4e>] hid_connect+0xde/0x580 [hid]
    [<ffffffffc03b724c>] hid_hw_start+0x4c/0x70 [hid]
    [<ffffffffc0388092>] hid_generic_probe+0x32/0x40 [hid_generic]
    [<ffffffffc03b7450>] hid_device_probe+0x100/0x170 [hid]
    [<ffffffffb8752082>] really_probe+0x1b2/0x420
    [<ffffffffb875237e>] __driver_probe_device+0x7e/0x170
    [<ffffffffb87524a3>] driver_probe_device+0x23/0xa0
    [<ffffffffb87525c2>] __device_attach_driver+0x92/0x120
    [<ffffffffb874f9da>] bus_for_each_drv+0x8a/0xe0
    [<ffffffffb8752ad1>] __device_attach+0xc1/0x1f0
root@hostname:/home/username#
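
A report like the one above can be condensed by grouping objects by size and by the first backtrace frame that is not a generic allocator. The following is a rough sketch only; the function name summarize and the choice of which frames count as "allocator" frames are assumptions made for illustration.

```python
import re
from collections import Counter

HEADER = re.compile(r"unreferenced object (0x[0-9a-f]+) \(size (\d+)\):")
FRAME = re.compile(r"\[<[0-9a-f]+>\] (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")

# Generic allocator frames skipped when looking for the interesting caller
# (an assumption based on the backtraces in this report).
ALLOCATOR_FRAMES = {"__kmem_cache_alloc_node", "kmalloc_trace"}


def summarize(report):
    """Count leaked objects by (size, first non-allocator frame)."""
    counts = Counter()
    size = None
    for line in report.splitlines():
        header = HEADER.search(line)
        if header:
            size = int(header.group(2))
            continue
        frame = FRAME.search(line)
        if frame and size is not None and frame.group(1) not in ALLOCATOR_FRAMES:
            counts[(size, frame.group(1))] += 1
            size = None  # record only the first such frame per object
    return counts
```

Applied to the log above, this should report three 96-byte objects allocated directly in class_create() and three 512-byte objects allocated in class_register(), one pair per probe of the keyboard.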

Best regards,
Mirsad


Attachments:
config-6.3.0-torvalds-13466-gfc4354c6e5c2.xz (55.50 kB)
hid-kmemleak.log (4.62 kB)
lshw.txt (56.87 kB)

2023-05-08 06:59:30

by Greg Kroah-Hartman

Subject: Re: [BUG] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2

On Mon, May 08, 2023 at 08:30:07AM +0200, Mirsad Goran Todorovac wrote:
> Hi,
>
> There seems to be a kernel memory leak in the USB keyboard driver.
>
> The leaked memory allocs are 96 and 512 bytes.
>
> The platform is Ubuntu 22.04 LTS on a assembled AMD Ryzen 9 with X670E PG
> Lightning mobo,
> and Genius SlimStar i220 GK-080012 keyboard.
>
> (Logitech M100 HID mouse is not affected by the bug.)
>
> BIOS is:
>
>      *-firmware
>           description: BIOS
>           vendor: American Megatrends International, LLC.
>           physical id: 0
>           version: 1.21
>           date: 04/26/2023
>           size: 64KiB
>
> The kernel is 6.3.0-torvalds-<id>-13466-gfc4354c6e5c2.
>
> The keyboard is recognised as Chicony:
>
>                  *-usb
>                       description: Keyboard
>                       product: CHICONY USB Keyboard
>                       vendor: CHICONY
>                       physical id: 2
>                       bus info: usb@5:2
>                       logical name: input35
>                       logical name: /dev/input/event4
>                       logical name: input35::capslock
>                       logical name: input35::numlock
>                       logical name: input35::scrolllock
>                       logical name: input36
>                       logical name: /dev/input/event5
>                       logical name: input37
>                       logical name: /dev/input/event6
>                       logical name: input38
>                       logical name: /dev/input/event8
>                       version: 2.30
>                       capabilities: usb-2.00 usb
>                       configuration: driver=usbhid maxpower=100mA
> speed=1Mbit/s
>
> The bug is easily reproduced by unplugging the USB keyboard, waiting about a
> couple of seconds,
> and then reconnect and scan for memory leaks twice.
>
> The kmemleak log is as follows [edited privacy info]:
>
> root@hostname:/home/username# cat /sys/kernel/debug/kmemleak
> unreferenced object 0xffff8dd020037c00 (size 96):
>   comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
>   hex dump (first 32 bytes):
>     5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>   backtrace:
>     [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
>     [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
>     [<ffffffffb87543d9>] class_create+0x29/0x80
>     [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0

As the call to class_create() in this path is now gone in 6.4-rc1, can
you retry that release to see if this is still there or not?

thanks,

greg k-h

2023-05-08 08:03:05

by Mirsad Todorovac

Subject: Re: [BUG] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2

On 5/8/23 08:51, Greg Kroah-Hartman wrote:
> On Mon, May 08, 2023 at 08:30:07AM +0200, Mirsad Goran Todorovac wrote:
>> Hi,
>>
>> There seems to be a kernel memory leak in the USB keyboard driver.
>>
>> The leaked memory allocs are 96 and 512 bytes.
>>
>> The platform is Ubuntu 22.04 LTS on a assembled AMD Ryzen 9 with X670E PG
>> Lightning mobo,
>> and Genius SlimStar i220 GK-080012 keyboard.
>>
>> (Logitech M100 HID mouse is not affected by the bug.)
>>
>> BIOS is:
>>
>>      *-firmware
>>           description: BIOS
>>           vendor: American Megatrends International, LLC.
>>           physical id: 0
>>           version: 1.21
>>           date: 04/26/2023
>>           size: 64KiB
>>
>> The kernel is 6.3.0-torvalds-<id>-13466-gfc4354c6e5c2.
>>
>> The keyboard is recognised as Chicony:
>>
>>                  *-usb
>>                       description: Keyboard
>>                       product: CHICONY USB Keyboard
>>                       vendor: CHICONY
>>                       physical id: 2
>>                       bus info: usb@5:2
>>                       logical name: input35
>>                       logical name: /dev/input/event4
>>                       logical name: input35::capslock
>>                       logical name: input35::numlock
>>                       logical name: input35::scrolllock
>>                       logical name: input36
>>                       logical name: /dev/input/event5
>>                       logical name: input37
>>                       logical name: /dev/input/event6
>>                       logical name: input38
>>                       logical name: /dev/input/event8
>>                       version: 2.30
>>                       capabilities: usb-2.00 usb
>>                       configuration: driver=usbhid maxpower=100mA
>> speed=1Mbit/s
>>
>> The bug is easily reproduced by unplugging the USB keyboard, waiting about a
>> couple of seconds,
>> and then reconnect and scan for memory leaks twice.
>>
>> The kmemleak log is as follows [edited privacy info]:
>>
>> root@hostname:/home/username# cat /sys/kernel/debug/kmemleak
>> unreferenced object 0xffff8dd020037c00 (size 96):
>>   comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
>>   hex dump (first 32 bytes):
>>     5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
>>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>   backtrace:
>>     [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
>>     [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
>>     [<ffffffffb87543d9>] class_create+0x29/0x80
>>     [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
>
> As the call to class_create() in this path is now gone in 6.4-rc1, can
> you retry that release to see if this is still there or not?

Certainly, but probably not before 6 PM UTC+02.

Best regards,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia

"What’s this thing suddenly coming towards me very fast? Very very fast.
... I wonder if it will be friends with me?"

2023-05-08 14:22:56

by Greg Kroah-Hartman

Subject: Re: [BUG] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2

On Mon, May 08, 2023 at 08:51:55AM +0200, Greg Kroah-Hartman wrote:
> On Mon, May 08, 2023 at 08:30:07AM +0200, Mirsad Goran Todorovac wrote:
> > Hi,
> >
> > There seems to be a kernel memory leak in the USB keyboard driver.
> >
> > The leaked memory allocs are 96 and 512 bytes.
> >
> > The platform is Ubuntu 22.04 LTS on a assembled AMD Ryzen 9 with X670E PG
> > Lightning mobo,
> > and Genius SlimStar i220 GK-080012 keyboard.
> >
> > (Logitech M100 HID mouse is not affected by the bug.)
> >
> > BIOS is:
> >
> >      *-firmware
> >           description: BIOS
> >           vendor: American Megatrends International, LLC.
> >           physical id: 0
> >           version: 1.21
> >           date: 04/26/2023
> >           size: 64KiB
> >
> > The kernel is 6.3.0-torvalds-<id>-13466-gfc4354c6e5c2.
> >
> > The keyboard is recognised as Chicony:
> >
> >                  *-usb
> >                       description: Keyboard
> >                       product: CHICONY USB Keyboard
> >                       vendor: CHICONY
> >                       physical id: 2
> >                       bus info: usb@5:2
> >                       logical name: input35
> >                       logical name: /dev/input/event4
> >                       logical name: input35::capslock
> >                       logical name: input35::numlock
> >                       logical name: input35::scrolllock
> >                       logical name: input36
> >                       logical name: /dev/input/event5
> >                       logical name: input37
> >                       logical name: /dev/input/event6
> >                       logical name: input38
> >                       logical name: /dev/input/event8
> >                       version: 2.30
> >                       capabilities: usb-2.00 usb
> >                       configuration: driver=usbhid maxpower=100mA
> > speed=1Mbit/s
> >
> > The bug is easily reproduced by unplugging the USB keyboard, waiting about a
> > couple of seconds,
> > and then reconnect and scan for memory leaks twice.
> >
> > The kmemleak log is as follows [edited privacy info]:
> >
> > root@hostname:/home/username# cat /sys/kernel/debug/kmemleak
> > unreferenced object 0xffff8dd020037c00 (size 96):
> >   comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
> >   hex dump (first 32 bytes):
> >     5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
> >     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> >   backtrace:
> >     [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
> >     [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
> >     [<ffffffffb87543d9>] class_create+0x29/0x80
> >     [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
>
> As the call to class_create() in this path is now gone in 6.4-rc1, can
> you retry that release to see if this is still there or not?

No, wait, it's still there, I was looking at a development branch of
mine that isn't sent upstream yet. And syzbot just reported the same
thing:
https://lore.kernel.org/r/[email protected]

So something's wrong here, let me dig into it tomorrow when I get a
chance...

thanks,

greg k-h

2023-05-08 17:39:18

by Mirsad Goran Todorovac

Subject: Re: [BUG] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2

On 08. 05. 2023. 16:01, Greg Kroah-Hartman wrote:

> On Mon, May 08, 2023 at 08:51:55AM +0200, Greg Kroah-Hartman wrote:
>> On Mon, May 08, 2023 at 08:30:07AM +0200, Mirsad Goran Todorovac wrote:
>>> Hi,
>>>
>>> There seems to be a kernel memory leak in the USB keyboard driver.
>>>
>>> The leaked memory allocs are 96 and 512 bytes.
>>>
>>> The platform is Ubuntu 22.04 LTS on a assembled AMD Ryzen 9 with X670E PG
>>> Lightning mobo,
>>> and Genius SlimStar i220 GK-080012 keyboard.
>>>
>>> (Logitech M100 HID mouse is not affected by the bug.)
>>>
>>> BIOS is:
>>>
>>>      *-firmware
>>>           description: BIOS
>>>           vendor: American Megatrends International, LLC.
>>>           physical id: 0
>>>           version: 1.21
>>>           date: 04/26/2023
>>>           size: 64KiB
>>>
>>> The kernel is 6.3.0-torvalds-<id>-13466-gfc4354c6e5c2.
>>>
>>> The keyboard is recognised as Chicony:
>>>
>>>                  *-usb
>>>                       description: Keyboard
>>>                       product: CHICONY USB Keyboard
>>>                       vendor: CHICONY
>>>                       physical id: 2
>>>                       bus info: usb@5:2
>>>                       logical name: input35
>>>                       logical name: /dev/input/event4
>>>                       logical name: input35::capslock
>>>                       logical name: input35::numlock
>>>                       logical name: input35::scrolllock
>>>                       logical name: input36
>>>                       logical name: /dev/input/event5
>>>                       logical name: input37
>>>                       logical name: /dev/input/event6
>>>                       logical name: input38
>>>                       logical name: /dev/input/event8
>>>                       version: 2.30
>>>                       capabilities: usb-2.00 usb
>>>                       configuration: driver=usbhid maxpower=100mA
>>> speed=1Mbit/s
>>>
>>> The bug is easily reproduced by unplugging the USB keyboard, waiting about a
>>> couple of seconds,
>>> and then reconnect and scan for memory leaks twice.
>>>
>>> The kmemleak log is as follows [edited privacy info]:
>>>
>>> root@hostname:/home/username# cat /sys/kernel/debug/kmemleak
>>> unreferenced object 0xffff8dd020037c00 (size 96):
>>>   comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
>>>   hex dump (first 32 bytes):
>>>     5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
>>>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>>   backtrace:
>>>     [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
>>>     [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
>>>     [<ffffffffb87543d9>] class_create+0x29/0x80
>>>     [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
>> As the call to class_create() in this path is now gone in 6.4-rc1, can
>> you retry that release to see if this is still there or not?
> No, wait, it's still there, I was looking at a development branch of
> mine that isn't sent upstream yet. And syzbot just reported the same
> thing:
> https://lore.kernel.org/r/[email protected]
>
> So something's wrong here, let me dig into it tomorrow when I get a
> chance...

Hi,

I can confirm that the leak is still present in 6.4-rc1:

root@host:/home/user# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff9e6b57bd8ea0 (size 96):
  comm "systemd-udevd", pid 322, jiffies 4294892584 (age 123.516s)
  hex dump (first 32 bytes):
    a4 90 ee b6 ff ff ff ff 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
  backtrace:
    [<ffffffffb5ba74be>] __kmem_cache_alloc_node+0x22e/0x2b0
    [<ffffffffb5b27b6e>] kmalloc_trace+0x2e/0xa0
    [<ffffffffb6154959>] class_create+0x29/0x80
    [<ffffffffb62812a4>] usb_register_dev+0x1d4/0x2e0
    [<ffffffffc066ebab>] hiddev_connect+0x11b/0x1b0 [usbhid]
    [<ffffffffc0629d4e>] hid_connect+0xde/0x580 [hid]
    [<ffffffffc062a24c>] hid_hw_start+0x4c/0x70 [hid]
    [<ffffffffc05e8092>] hid_generic_probe+0x32/0x40 [hid_generic]
    [<ffffffffc062a450>] hid_device_probe+0x100/0x170 [hid]
    [<ffffffffb6152602>] really_probe+0x1b2/0x420
    [<ffffffffb61528fe>] __driver_probe_device+0x7e/0x170
    [<ffffffffb6152a23>] driver_probe_device+0x23/0xa0
    [<ffffffffb6152cc8>] __driver_attach+0xe8/0x1e0
    [<ffffffffb614fe6e>] bus_for_each_dev+0x7e/0xd0
    [<ffffffffb6151da2>] driver_attach+0x22/0x30
    [<ffffffffb61513f0>] bus_add_driver+0x120/0x220
unreferenced object 0xffff9e6b58d75800 (size 512):
  comm "systemd-udevd", pid 322, jiffies 4294892584 (age 123.516s)
  hex dump (first 32 bytes):
    00 58 d7 58 6b 9e ff ff 00 58 d7 58 6b 9e ff ff .X.Xk....X.Xk...
    00 00 00 00 00 00 00 00 a4 90 ee b6 ff ff ff ff ................
  backtrace:
    [<ffffffffb5ba74be>] __kmem_cache_alloc_node+0x22e/0x2b0
    [<ffffffffb5b27b6e>] kmalloc_trace+0x2e/0xa0
    [<ffffffffb6154812>] class_register+0x32/0x140
    [<ffffffffb6154974>] class_create+0x44/0x80
    [<ffffffffb62812a4>] usb_register_dev+0x1d4/0x2e0
    [<ffffffffc066ebab>] hiddev_connect+0x11b/0x1b0 [usbhid]
    [<ffffffffc0629d4e>] hid_connect+0xde/0x580 [hid]
    [<ffffffffc062a24c>] hid_hw_start+0x4c/0x70 [hid]
    [<ffffffffc05e8092>] hid_generic_probe+0x32/0x40 [hid_generic]
    [<ffffffffc062a450>] hid_device_probe+0x100/0x170 [hid]
    [<ffffffffb6152602>] really_probe+0x1b2/0x420
    [<ffffffffb61528fe>] __driver_probe_device+0x7e/0x170
    [<ffffffffb6152a23>] driver_probe_device+0x23/0xa0
    [<ffffffffb6152cc8>] __driver_attach+0xe8/0x1e0
    [<ffffffffb614fe6e>] bus_for_each_dev+0x7e/0xd0
    [<ffffffffb6151da2>] driver_attach+0x22/0x30
root@host:/home/user#

Would you need a bisect on this one? Maybe it would help.

Best regards,
Mirsad


2023-05-09 00:09:13

by Mirsad Goran Todorovac

Subject: Re: [BUG] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2



On 08. 05. 2023. 16:01, Greg Kroah-Hartman wrote:
> On Mon, May 08, 2023 at 08:51:55AM +0200, Greg Kroah-Hartman wrote:
>> On Mon, May 08, 2023 at 08:30:07AM +0200, Mirsad Goran Todorovac wrote:
>>> Hi,
>>>
>>> There seems to be a kernel memory leak in the USB keyboard driver.
>>>
>>> The leaked memory allocs are 96 and 512 bytes.
>>>
>>> The platform is Ubuntu 22.04 LTS on a assembled AMD Ryzen 9 with X670E PG
>>> Lightning mobo,
>>> and Genius SlimStar i220 GK-080012 keyboard.
>>>
>>> (Logitech M100 HID mouse is not affected by the bug.)
>>>
>>> BIOS is:
>>>
>>>      *-firmware
>>>           description: BIOS
>>>           vendor: American Megatrends International, LLC.
>>>           physical id: 0
>>>           version: 1.21
>>>           date: 04/26/2023
>>>           size: 64KiB
>>>
>>> The kernel is 6.3.0-torvalds-<id>-13466-gfc4354c6e5c2.
>>>
>>> The keyboard is recognised as Chicony:
>>>
>>>                  *-usb
>>>                       description: Keyboard
>>>                       product: CHICONY USB Keyboard
>>>                       vendor: CHICONY
>>>                       physical id: 2
>>>                       bus info: usb@5:2
>>>                       logical name: input35
>>>                       logical name: /dev/input/event4
>>>                       logical name: input35::capslock
>>>                       logical name: input35::numlock
>>>                       logical name: input35::scrolllock
>>>                       logical name: input36
>>>                       logical name: /dev/input/event5
>>>                       logical name: input37
>>>                       logical name: /dev/input/event6
>>>                       logical name: input38
>>>                       logical name: /dev/input/event8
>>>                       version: 2.30
>>>                       capabilities: usb-2.00 usb
>>>                       configuration: driver=usbhid maxpower=100mA
>>> speed=1Mbit/s
>>>
>>> The bug is easily reproduced by unplugging the USB keyboard, waiting about a
>>> couple of seconds,
>>> and then reconnect and scan for memory leaks twice.
>>>
>>> The kmemleak log is as follows [edited privacy info]:
>>>
>>> root@hostname:/home/username# cat /sys/kernel/debug/kmemleak
>>> unreferenced object 0xffff8dd020037c00 (size 96):
>>>   comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
>>>   hex dump (first 32 bytes):
>>>     5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
>>>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>>   backtrace:
>>>     [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
>>>     [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
>>>     [<ffffffffb87543d9>] class_create+0x29/0x80
>>>     [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
>>
>> As the call to class_create() in this path is now gone in 6.4-rc1, can
>> you retry that release to see if this is still there or not?
>
> No, wait, it's still there, I was looking at a development branch of
> mine that isn't sent upstream yet. And syzbot just reported the same
> thing:
> https://lore.kernel.org/r/[email protected]
>
> So something's wrong here, let me dig into it tomorrow when I get a
> chance...

In case it helps, here is the bisect log for the bug (I could not work
out what exactly is wrong in the first bad commit):

user@host:~/linux/kernel/linux_torvalds$ git bisect log
git bisect start
# bad: [ac9a78681b921877518763ba0e89202254349d1b] Linux 6.4-rc1
git bisect bad ac9a78681b921877518763ba0e89202254349d1b
# good: [c9c3395d5e3dcc6daee66c6908354d47bf98cb0c] Linux 6.2
git bisect good c9c3395d5e3dcc6daee66c6908354d47bf98cb0c
# good: [85496c9b3bf8dbe15e2433d3a0197954d323cadc] Merge branch 'net-remove-some-rcu_bh-cruft'
git bisect good 85496c9b3bf8dbe15e2433d3a0197954d323cadc
# good: [b68ee1c6131c540a62ecd443be89c406401df091] Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
git bisect good b68ee1c6131c540a62ecd443be89c406401df091
# bad: [888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0] Merge tag 'sysctl-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
git bisect bad 888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0
# good: [34b62f186db9614e55d021f8c58d22fc44c57911] Merge tag 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
git bisect good 34b62f186db9614e55d021f8c58d22fc44c57911
# good: [34da76dca4673ab1819830b4924bb5b436325b26] Merge tag 'for-linus-2023042601' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
git bisect good 34da76dca4673ab1819830b4924bb5b436325b26
# good: [97b2ff294381d05e59294a931c4db55276470cb5] Merge tag 'staging-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
git bisect good 97b2ff294381d05e59294a931c4db55276470cb5
# good: [2025b2ca8004c04861903d076c67a73a0ec6dfca] mcb-lpc: Reallocate memory region to avoid memory overlapping
git bisect good 2025b2ca8004c04861903d076c67a73a0ec6dfca
# bad: [d06f5a3f7140921ada47d49574ae6fa4de5e2a89] cdx: fix build failure due to sysfs 'bus_type' argument needing to be const
git bisect bad d06f5a3f7140921ada47d49574ae6fa4de5e2a89
# good: [dcfbb67e48a2becfce7990386e985b9c45098ee5] driver core: class: use lock_class_key already present in struct subsys_private
git bisect good dcfbb67e48a2becfce7990386e985b9c45098ee5
# bad: [6f14c02220c791d5c46b0f965b9340c58f3d503d] driver core: create class_is_registered()
git bisect bad 6f14c02220c791d5c46b0f965b9340c58f3d503d
# good: [2f9e87f5a2941b259336c7ea6c5a1499ede4554a] driver core: Add a comment to set_primary_fwnode() on nullifying
git bisect good 2f9e87f5a2941b259336c7ea6c5a1499ede4554a
# bad: [02fe26f25325b547b7a31a65deb0326c04bb5174] firmware_loader: Add debug message with checksum for FW file
git bisect bad 02fe26f25325b547b7a31a65deb0326c04bb5174
# good: [884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82] driver core: class: implement class_get/put without the private pointer.
git bisect good 884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82
# bad: [3f84aa5ec052dba960baca4ab8a352d43d47028e] base: soc: populate machine name in soc_device_register if empty
git bisect bad 3f84aa5ec052dba960baca4ab8a352d43d47028e
# bad: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core: class.c: convert to only use class_to_subsys
git bisect bad 7b884b7f24b42fa25e92ed724ad82f137610afaf
# first bad commit: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core: class.c: convert to only use class_to_subsys
user@host:~/linux/kernel/linux_torvalds$

Have a nice day and God bless.

Best regards,
Mirsad

2023-05-09 03:26:49

by Greg Kroah-Hartman

Subject: Re: [BUG] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2

On Tue, May 09, 2023 at 01:51:35AM +0200, Mirsad Goran Todorovac wrote:
>
>
> On 08. 05. 2023. 16:01, Greg Kroah-Hartman wrote:
> > On Mon, May 08, 2023 at 08:51:55AM +0200, Greg Kroah-Hartman wrote:
> > > On Mon, May 08, 2023 at 08:30:07AM +0200, Mirsad Goran Todorovac wrote:
> > > > Hi,
> > > >
> > > > There seems to be a kernel memory leak in the USB keyboard driver.
> > > >
> > > > The leaked memory allocs are 96 and 512 bytes.
> > > >
> > > > The platform is Ubuntu 22.04 LTS on a assembled AMD Ryzen 9 with X670E PG
> > > > Lightning mobo,
> > > > and Genius SlimStar i220 GK-080012 keyboard.
> > > >
> > > > (Logitech M100 HID mouse is not affected by the bug.)
> > > >
> > > > BIOS is:
> > > >
> > > >      *-firmware
> > > >           description: BIOS
> > > >           vendor: American Megatrends International, LLC.
> > > >           physical id: 0
> > > >           version: 1.21
> > > >           date: 04/26/2023
> > > >           size: 64KiB
> > > >
> > > > The kernel is 6.3.0-torvalds-<id>-13466-gfc4354c6e5c2.
> > > >
> > > > The keyboard is recognised as Chicony:
> > > >
> > > >                  *-usb
> > > >                       description: Keyboard
> > > >                       product: CHICONY USB Keyboard
> > > >                       vendor: CHICONY
> > > >                       physical id: 2
> > > >                       bus info: usb@5:2
> > > >                       logical name: input35
> > > >                       logical name: /dev/input/event4
> > > >                       logical name: input35::capslock
> > > >                       logical name: input35::numlock
> > > >                       logical name: input35::scrolllock
> > > >                       logical name: input36
> > > >                       logical name: /dev/input/event5
> > > >                       logical name: input37
> > > >                       logical name: /dev/input/event6
> > > >                       logical name: input38
> > > >                       logical name: /dev/input/event8
> > > >                       version: 2.30
> > > >                       capabilities: usb-2.00 usb
> > > >                       configuration: driver=usbhid maxpower=100mA
> > > > speed=1Mbit/s
> > > >
> > > > The bug is easily reproduced by unplugging the USB keyboard, waiting about a
> > > > couple of seconds,
> > > > and then reconnect and scan for memory leaks twice.
> > > >
> > > > The kmemleak log is as follows [edited privacy info]:
> > > >
> > > > root@hostname:/home/username# cat /sys/kernel/debug/kmemleak
> > > > unreferenced object 0xffff8dd020037c00 (size 96):
> > > >   comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
> > > >   hex dump (first 32 bytes):
> > > >     5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
> > > >     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> > > >   backtrace:
> > > >     [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
> > > >     [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
> > > >     [<ffffffffb87543d9>] class_create+0x29/0x80
> > > >     [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
> > >
> > > As the call to class_create() in this path is now gone in 6.4-rc1, can
> > > you retry that release to see if this is still there or not?
> >
> > No, wait, it's still there, I was looking at a development branch of
> > mine that isn't sent upstream yet. And syzbot just reported the same
> > thing:
> > https://lore.kernel.org/r/[email protected]
> >
> > So something's wrong here, let me dig into it tomorrow when I get a
> > chance...
>
> If this could help, here is the bisect of the bug (I could not discern what
> could possibly be wrong):
>
> user@host:~/linux/kernel/linux_torvalds$ git bisect log
> git bisect start
> # bad: [ac9a78681b921877518763ba0e89202254349d1b] Linux 6.4-rc1
> git bisect bad ac9a78681b921877518763ba0e89202254349d1b
> # good: [c9c3395d5e3dcc6daee66c6908354d47bf98cb0c] Linux 6.2
> git bisect good c9c3395d5e3dcc6daee66c6908354d47bf98cb0c
> # good: [85496c9b3bf8dbe15e2433d3a0197954d323cadc] Merge branch
> 'net-remove-some-rcu_bh-cruft'
> git bisect good 85496c9b3bf8dbe15e2433d3a0197954d323cadc
> # good: [b68ee1c6131c540a62ecd443be89c406401df091] Merge tag 'scsi-misc' of
> git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
> git bisect good b68ee1c6131c540a62ecd443be89c406401df091
> # bad: [888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0] Merge tag 'sysctl-6.4-rc1'
> of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
> git bisect bad 888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0
> # good: [34b62f186db9614e55d021f8c58d22fc44c57911] Merge tag
> 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
> git bisect good 34b62f186db9614e55d021f8c58d22fc44c57911
> # good: [34da76dca4673ab1819830b4924bb5b436325b26] Merge tag
> 'for-linus-2023042601' of
> git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
> git bisect good 34da76dca4673ab1819830b4924bb5b436325b26
> # good: [97b2ff294381d05e59294a931c4db55276470cb5] Merge tag
> 'staging-6.4-rc1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
> git bisect good 97b2ff294381d05e59294a931c4db55276470cb5
> # good: [2025b2ca8004c04861903d076c67a73a0ec6dfca] mcb-lpc: Reallocate
> memory region to avoid memory overlapping
> git bisect good 2025b2ca8004c04861903d076c67a73a0ec6dfca
> # bad: [d06f5a3f7140921ada47d49574ae6fa4de5e2a89] cdx: fix build failure due
> to sysfs 'bus_type' argument needing to be const
> git bisect bad d06f5a3f7140921ada47d49574ae6fa4de5e2a89
> # good: [dcfbb67e48a2becfce7990386e985b9c45098ee5] driver core: class: use
> lock_class_key already present in struct subsys_private
> git bisect good dcfbb67e48a2becfce7990386e985b9c45098ee5
> # bad: [6f14c02220c791d5c46b0f965b9340c58f3d503d] driver core: create
> class_is_registered()
> git bisect bad 6f14c02220c791d5c46b0f965b9340c58f3d503d
> # good: [2f9e87f5a2941b259336c7ea6c5a1499ede4554a] driver core: Add a
> comment to set_primary_fwnode() on nullifying
> git bisect good 2f9e87f5a2941b259336c7ea6c5a1499ede4554a
> # bad: [02fe26f25325b547b7a31a65deb0326c04bb5174] firmware_loader: Add debug
> message with checksum for FW file
> git bisect bad 02fe26f25325b547b7a31a65deb0326c04bb5174
> # good: [884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82] driver core: class:
> implement class_get/put without the private pointer.
> git bisect good 884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82
> # bad: [3f84aa5ec052dba960baca4ab8a352d43d47028e] base: soc: populate
> machine name in soc_device_register if empty
> git bisect bad 3f84aa5ec052dba960baca4ab8a352d43d47028e
> # bad: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core: class.c:
> convert to only use class_to_subsys
> git bisect bad 7b884b7f24b42fa25e92ed724ad82f137610afaf
> # first bad commit: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core:
> class.c: convert to only use class_to_subsys
> user@host:~/linux/kernel/linux_torvalds$

This helps a lot, thanks. I got the reference counting wrong somewhere
in here, I thought I tested this better, odd it shows up now...

I'll try to work on it this week.

thanks,

greg k-h

2023-05-12 19:23:29

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: [BUG] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2

Hi, Greg,

On 09. 05. 2023. 04:59, Greg Kroah-Hartman wrote:
> On Tue, May 09, 2023 at 01:51:35AM +0200, Mirsad Goran Todorovac wrote:
>>
>>
>> On 08. 05. 2023. 16:01, Greg Kroah-Hartman wrote:
>>> On Mon, May 08, 2023 at 08:51:55AM +0200, Greg Kroah-Hartman wrote:
>>>> On Mon, May 08, 2023 at 08:30:07AM +0200, Mirsad Goran Todorovac wrote:
>>>>> Hi,
>>>>>
>>>>> There seems to be a kernel memory leak in the USB keyboard driver.
>>>>>
>>>>> The leaked memory allocs are 96 and 512 bytes.
>>>>>
>>>>> The platform is Ubuntu 22.04 LTS on a assembled AMD Ryzen 9 with X670E PG
>>>>> Lightning mobo,
>>>>> and Genius SlimStar i220 GK-080012 keyboard.
>>>>>
>>>>> (Logitech M100 HID mouse is not affected by the bug.)
>>>>>
>>>>> BIOS is:
>>>>>
>>>>>      *-firmware
>>>>>           description: BIOS
>>>>>           vendor: American Megatrends International, LLC.
>>>>>           physical id: 0
>>>>>           version: 1.21
>>>>>           date: 04/26/2023
>>>>>           size: 64KiB
>>>>>
>>>>> The kernel is 6.3.0-torvalds-<id>-13466-gfc4354c6e5c2.
>>>>>
>>>>> The keyboard is recognised as Chicony:
>>>>>
>>>>>                  *-usb
>>>>>                       description: Keyboard
>>>>>                       product: CHICONY USB Keyboard
>>>>>                       vendor: CHICONY
>>>>>                       physical id: 2
>>>>>                       bus info: usb@5:2
>>>>>                       logical name: input35
>>>>>                       logical name: /dev/input/event4
>>>>>                       logical name: input35::capslock
>>>>>                       logical name: input35::numlock
>>>>>                       logical name: input35::scrolllock
>>>>>                       logical name: input36
>>>>>                       logical name: /dev/input/event5
>>>>>                       logical name: input37
>>>>>                       logical name: /dev/input/event6
>>>>>                       logical name: input38
>>>>>                       logical name: /dev/input/event8
>>>>>                       version: 2.30
>>>>>                       capabilities: usb-2.00 usb
>>>>>                       configuration: driver=usbhid maxpower=100mA
>>>>> speed=1Mbit/s
>>>>>
>>>>> The bug is easily reproduced by unplugging the USB keyboard, waiting about a
>>>>> couple of seconds,
>>>>> and then reconnect and scan for memory leaks twice.
>>>>>
>>>>> The kmemleak log is as follows [edited privacy info]:
>>>>>
>>>>> root@hostname:/home/username# cat /sys/kernel/debug/kmemleak
>>>>> unreferenced object 0xffff8dd020037c00 (size 96):
>>>>>   comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
>>>>>   hex dump (first 32 bytes):
>>>>>     5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
>>>>>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>>>>   backtrace:
>>>>>     [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
>>>>>     [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
>>>>>     [<ffffffffb87543d9>] class_create+0x29/0x80
>>>>>     [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
>>>>
>>>> As the call to class_create() in this path is now gone in 6.4-rc1, can
>>>> you retry that release to see if this is still there or not?
>>>
>>> No, wait, it's still there, I was looking at a development branch of
>>> mine that isn't sent upstream yet. And syzbot just reported the same
>>> thing:
>>> https://lore.kernel.org/r/[email protected]
>>>
>>> So something's wrong here, let me dig into it tomorrow when I get a
>>> chance...
>>
>> If this could help, here is the bisect of the bug (I could not discern what
>> could possibly be wrong):
>>
>> user@host:~/linux/kernel/linux_torvalds$ git bisect log
>> git bisect start
>> # bad: [ac9a78681b921877518763ba0e89202254349d1b] Linux 6.4-rc1
>> git bisect bad ac9a78681b921877518763ba0e89202254349d1b
>> # good: [c9c3395d5e3dcc6daee66c6908354d47bf98cb0c] Linux 6.2
>> git bisect good c9c3395d5e3dcc6daee66c6908354d47bf98cb0c
>> # good: [85496c9b3bf8dbe15e2433d3a0197954d323cadc] Merge branch
>> 'net-remove-some-rcu_bh-cruft'
>> git bisect good 85496c9b3bf8dbe15e2433d3a0197954d323cadc
>> # good: [b68ee1c6131c540a62ecd443be89c406401df091] Merge tag 'scsi-misc' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
>> git bisect good b68ee1c6131c540a62ecd443be89c406401df091
>> # bad: [888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0] Merge tag 'sysctl-6.4-rc1'
>> of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
>> git bisect bad 888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0
>> # good: [34b62f186db9614e55d021f8c58d22fc44c57911] Merge tag
>> 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
>> git bisect good 34b62f186db9614e55d021f8c58d22fc44c57911
>> # good: [34da76dca4673ab1819830b4924bb5b436325b26] Merge tag
>> 'for-linus-2023042601' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
>> git bisect good 34da76dca4673ab1819830b4924bb5b436325b26
>> # good: [97b2ff294381d05e59294a931c4db55276470cb5] Merge tag
>> 'staging-6.4-rc1' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
>> git bisect good 97b2ff294381d05e59294a931c4db55276470cb5
>> # good: [2025b2ca8004c04861903d076c67a73a0ec6dfca] mcb-lpc: Reallocate
>> memory region to avoid memory overlapping
>> git bisect good 2025b2ca8004c04861903d076c67a73a0ec6dfca
>> # bad: [d06f5a3f7140921ada47d49574ae6fa4de5e2a89] cdx: fix build failure due
>> to sysfs 'bus_type' argument needing to be const
>> git bisect bad d06f5a3f7140921ada47d49574ae6fa4de5e2a89
>> # good: [dcfbb67e48a2becfce7990386e985b9c45098ee5] driver core: class: use
>> lock_class_key already present in struct subsys_private
>> git bisect good dcfbb67e48a2becfce7990386e985b9c45098ee5
>> # bad: [6f14c02220c791d5c46b0f965b9340c58f3d503d] driver core: create
>> class_is_registered()
>> git bisect bad 6f14c02220c791d5c46b0f965b9340c58f3d503d
>> # good: [2f9e87f5a2941b259336c7ea6c5a1499ede4554a] driver core: Add a
>> comment to set_primary_fwnode() on nullifying
>> git bisect good 2f9e87f5a2941b259336c7ea6c5a1499ede4554a
>> # bad: [02fe26f25325b547b7a31a65deb0326c04bb5174] firmware_loader: Add debug
>> message with checksum for FW file
>> git bisect bad 02fe26f25325b547b7a31a65deb0326c04bb5174
>> # good: [884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82] driver core: class:
>> implement class_get/put without the private pointer.
>> git bisect good 884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82
>> # bad: [3f84aa5ec052dba960baca4ab8a352d43d47028e] base: soc: populate
>> machine name in soc_device_register if empty
>> git bisect bad 3f84aa5ec052dba960baca4ab8a352d43d47028e
>> # bad: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core: class.c:
>> convert to only use class_to_subsys
>> git bisect bad 7b884b7f24b42fa25e92ed724ad82f137610afaf
>> # first bad commit: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core:
>> class.c: convert to only use class_to_subsys
>> user@host:~/linux/kernel/linux_torvalds$
>
> This helps a lot, thanks. I got the reference counting wrong somewhere
> in here, I thought I tested this better, odd it shows up now...
>
> I'll try to work on it this week.
>
> thanks,
>
> greg k-h

Not at all!

I hope you have better luck, because this part of the code still looks
like hieroglyphs to me.

The Linux kernel has grown to 10.9M lines; at 1000 lines a day it would
take me thirty years just to read it once. 6.7M of those lines are
"just drivers".

# find . -name '*.c' -o -name '*.h' -print0 | wc --files0-from -
10913623 35587483 631377958 total
# find drivers -name '*.c' -o -name '*.h' -print0 | wc --files0-from -
6705084 19985060 495162001 total
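
A note on the two find invocations above: the implicit "and" binds tighter
than -o, so -print0 attaches only to the '*.h' test and the totals would
cover header files only. A minimal sketch of a grouped variant follows; the
function name count_c_and_h_lines is hypothetical, and it pipes through
xargs and cat rather than wc --files0-from so that it prints a single number.

```shell
# Hypothetical helper: total line count of all *.c and *.h files under $1.
# The escaped parentheses group the two -name tests so that -print0
# applies to both of them, not just to '*.h'.
count_c_and_h_lines() {
    find "$1" \( -name '*.c' -o -name '*.h' \) -print0 | xargs -0 cat | wc -l
}

# Example (run from the top of a kernel tree):
#   count_c_and_h_lines .
#   count_c_and_h_lines drivers
```

No expected totals are given here, since they depend on the tree being
counted.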

Best regards,
Mirsad

2023-05-12 21:40:47

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: [BUG][NEW DATA] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2

Hi,

On 5/9/23 04:59, Greg Kroah-Hartman wrote:
> On Tue, May 09, 2023 at 01:51:35AM +0200, Mirsad Goran Todorovac wrote:
>>
>>
>> On 08. 05. 2023. 16:01, Greg Kroah-Hartman wrote:
>>> On Mon, May 08, 2023 at 08:51:55AM +0200, Greg Kroah-Hartman wrote:
>>>> On Mon, May 08, 2023 at 08:30:07AM +0200, Mirsad Goran Todorovac wrote:
>>>>> Hi,
>>>>>
>>>>> There seems to be a kernel memory leak in the USB keyboard driver.
>>>>>
>>>>> The leaked memory allocs are 96 and 512 bytes.
>>>>>
>>>>> The platform is Ubuntu 22.04 LTS on a assembled AMD Ryzen 9 with X670E PG
>>>>> Lightning mobo,
>>>>> and Genius SlimStar i220 GK-080012 keyboard.
>>>>>
>>>>> (Logitech M100 HID mouse is not affected by the bug.)
>>>>>
>>>>> BIOS is:
>>>>>
>>>>>      *-firmware
>>>>>           description: BIOS
>>>>>           vendor: American Megatrends International, LLC.
>>>>>           physical id: 0
>>>>>           version: 1.21
>>>>>           date: 04/26/2023
>>>>>           size: 64KiB
>>>>>
>>>>> The kernel is 6.3.0-torvalds-<id>-13466-gfc4354c6e5c2.
>>>>>
>>>>> The keyboard is recognised as Chicony:
>>>>>
>>>>>                  *-usb
>>>>>                       description: Keyboard
>>>>>                       product: CHICONY USB Keyboard
>>>>>                       vendor: CHICONY
>>>>>                       physical id: 2
>>>>>                       bus info: usb@5:2
>>>>>                       logical name: input35
>>>>>                       logical name: /dev/input/event4
>>>>>                       logical name: input35::capslock
>>>>>                       logical name: input35::numlock
>>>>>                       logical name: input35::scrolllock
>>>>>                       logical name: input36
>>>>>                       logical name: /dev/input/event5
>>>>>                       logical name: input37
>>>>>                       logical name: /dev/input/event6
>>>>>                       logical name: input38
>>>>>                       logical name: /dev/input/event8
>>>>>                       version: 2.30
>>>>>                       capabilities: usb-2.00 usb
>>>>>                       configuration: driver=usbhid maxpower=100mA
>>>>> speed=1Mbit/s
>>>>>
>>>>> The bug is easily reproduced by unplugging the USB keyboard, waiting about a
>>>>> couple of seconds,
>>>>> and then reconnect and scan for memory leaks twice.
>>>>>
>>>>> The kmemleak log is as follows [edited privacy info]:
>>>>>
>>>>> root@hostname:/home/username# cat /sys/kernel/debug/kmemleak
>>>>> unreferenced object 0xffff8dd020037c00 (size 96):
>>>>>   comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
>>>>>   hex dump (first 32 bytes):
>>>>>     5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
>>>>>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>>>>   backtrace:
>>>>>     [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
>>>>>     [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
>>>>>     [<ffffffffb87543d9>] class_create+0x29/0x80
>>>>>     [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
>>>>
>>>> As the call to class_create() in this path is now gone in 6.4-rc1, can
>>>> you retry that release to see if this is still there or not?
>>>
>>> No, wait, it's still there, I was looking at a development branch of
>>> mine that isn't sent upstream yet. And syzbot just reported the same
>>> thing:
>>> https://lore.kernel.org/r/[email protected]
>>>
>>> So something's wrong here, let me dig into it tomorrow when I get a
>>> chance...
>>
>> If this could help, here is the bisect of the bug (I could not discern what
>> could possibly be wrong):
>>
>> user@host:~/linux/kernel/linux_torvalds$ git bisect log
>> git bisect start
>> # bad: [ac9a78681b921877518763ba0e89202254349d1b] Linux 6.4-rc1
>> git bisect bad ac9a78681b921877518763ba0e89202254349d1b
>> # good: [c9c3395d5e3dcc6daee66c6908354d47bf98cb0c] Linux 6.2
>> git bisect good c9c3395d5e3dcc6daee66c6908354d47bf98cb0c
>> # good: [85496c9b3bf8dbe15e2433d3a0197954d323cadc] Merge branch
>> 'net-remove-some-rcu_bh-cruft'
>> git bisect good 85496c9b3bf8dbe15e2433d3a0197954d323cadc
>> # good: [b68ee1c6131c540a62ecd443be89c406401df091] Merge tag 'scsi-misc' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
>> git bisect good b68ee1c6131c540a62ecd443be89c406401df091
>> # bad: [888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0] Merge tag 'sysctl-6.4-rc1'
>> of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
>> git bisect bad 888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0
>> # good: [34b62f186db9614e55d021f8c58d22fc44c57911] Merge tag
>> 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
>> git bisect good 34b62f186db9614e55d021f8c58d22fc44c57911
>> # good: [34da76dca4673ab1819830b4924bb5b436325b26] Merge tag
>> 'for-linus-2023042601' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
>> git bisect good 34da76dca4673ab1819830b4924bb5b436325b26
>> # good: [97b2ff294381d05e59294a931c4db55276470cb5] Merge tag
>> 'staging-6.4-rc1' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
>> git bisect good 97b2ff294381d05e59294a931c4db55276470cb5
>> # good: [2025b2ca8004c04861903d076c67a73a0ec6dfca] mcb-lpc: Reallocate
>> memory region to avoid memory overlapping
>> git bisect good 2025b2ca8004c04861903d076c67a73a0ec6dfca
>> # bad: [d06f5a3f7140921ada47d49574ae6fa4de5e2a89] cdx: fix build failure due
>> to sysfs 'bus_type' argument needing to be const
>> git bisect bad d06f5a3f7140921ada47d49574ae6fa4de5e2a89
>> # good: [dcfbb67e48a2becfce7990386e985b9c45098ee5] driver core: class: use
>> lock_class_key already present in struct subsys_private
>> git bisect good dcfbb67e48a2becfce7990386e985b9c45098ee5
>> # bad: [6f14c02220c791d5c46b0f965b9340c58f3d503d] driver core: create
>> class_is_registered()
>> git bisect bad 6f14c02220c791d5c46b0f965b9340c58f3d503d
>> # good: [2f9e87f5a2941b259336c7ea6c5a1499ede4554a] driver core: Add a
>> comment to set_primary_fwnode() on nullifying
>> git bisect good 2f9e87f5a2941b259336c7ea6c5a1499ede4554a
>> # bad: [02fe26f25325b547b7a31a65deb0326c04bb5174] firmware_loader: Add debug
>> message with checksum for FW file
>> git bisect bad 02fe26f25325b547b7a31a65deb0326c04bb5174
>> # good: [884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82] driver core: class:
>> implement class_get/put without the private pointer.
>> git bisect good 884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82
>> # bad: [3f84aa5ec052dba960baca4ab8a352d43d47028e] base: soc: populate
>> machine name in soc_device_register if empty
>> git bisect bad 3f84aa5ec052dba960baca4ab8a352d43d47028e
>> # bad: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core: class.c:
>> convert to only use class_to_subsys
>> git bisect bad 7b884b7f24b42fa25e92ed724ad82f137610afaf
>> # first bad commit: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core:
>> class.c: convert to only use class_to_subsys
>> user@host:~/linux/kernel/linux_torvalds$
>
> This helps a lot, thanks. I got the reference counting wrong somewhere
> in here, I thought I tested this better, odd it shows up now...
>
> I'll try to work on it this week.

I have figured out that the leak occurs only when the keyboard is
unplugged, with one or two leaks per unplug (maybe a race condition?).

Please NOTE that the number of leaks is now odd:

root@defiant:/home/marvin# cat /sys/kernel/debug/kmemleak | grep comm
comm "systemd-udevd", pid 330, jiffies 4294892588 (age 715.772s)
comm "systemd-udevd", pid 330, jiffies 4294892588 (age 715.772s)
comm "kworker/6:0", pid 54, jiffies 4294907989 (age 654.224s)
comm "kworker/6:0", pid 54, jiffies 4294907989 (age 654.272s)
comm "kworker/6:3", pid 3046, jiffies 4294935362 (age 544.780s)
comm "kworker/6:0", pid 54, jiffies 4294964122 (age 429.740s)
comm "kworker/6:0", pid 54, jiffies 4294964122 (age 429.784s)
root@defiant:/home/marvin#

Only once did unplugging the keyboard generate a single leak. Since it
requires physically unplugging the keyboard, I have not found a way to
automate it, but it doesn't seem to require root access.
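
The per-task tally in the listing above can be produced directly from the
report. A minimal sketch, assuming the report format shown in this thread
(one `comm "NAME", pid N, ...` line per leaked object); the function name
kmemleak_by_comm is hypothetical, and reading the real debugfs file needs
root and a kernel built with CONFIG_DEBUG_KMEMLEAK.

```shell
# Hypothetical helper: count kmemleak records per task name.
# $1: path to a kmemleak report (defaults to the debugfs file).
kmemleak_by_comm() {
    report="${1:-/sys/kernel/debug/kmemleak}"
    # Each leaked object carries one line of the form
    #   comm "NAME", pid N, jiffies J (age S)
    # Extract NAME, then count how often each name occurs.
    sed -n 's/^[[:space:]]*comm "\([^"]*\)".*/\1/p' "$report" | sort | uniq -c
}

# To force a fresh scan before reading the report (as root):
#   echo scan > /sys/kernel/debug/kmemleak
```

An odd total, as in the listing above, then shows up as an odd count on one
of the kworker lines.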

BTW, I've seen in syzbot output that kmemleak output has debug source
file names and line numbers. I couldn't make that work with the dbg .deb.

I will do some more homework, but this was a rough week.

Best regards,
Mirsad

2023-05-16 15:03:32

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [BUG][NEW DATA] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2

On Fri, May 12, 2023 at 11:33:31PM +0200, Mirsad Goran Todorovac wrote:
> Hi,
>
> On 5/9/23 04:59, Greg Kroah-Hartman wrote:
> > On Tue, May 09, 2023 at 01:51:35AM +0200, Mirsad Goran Todorovac wrote:
> > >
> > >
> > > On 08. 05. 2023. 16:01, Greg Kroah-Hartman wrote:
> > > > On Mon, May 08, 2023 at 08:51:55AM +0200, Greg Kroah-Hartman wrote:
> > > > > On Mon, May 08, 2023 at 08:30:07AM +0200, Mirsad Goran Todorovac wrote:
> > > > > > Hi,
> > > > > >
> > > > > > There seems to be a kernel memory leak in the USB keyboard driver.
> > > > > >
> > > > > > The leaked memory allocs are 96 and 512 bytes.
> > > > > >
> > > > > > The platform is Ubuntu 22.04 LTS on a assembled AMD Ryzen 9 with X670E PG
> > > > > > Lightning mobo,
> > > > > > and Genius SlimStar i220 GK-080012 keyboard.
> > > > > >
> > > > > > (Logitech M100 HID mouse is not affected by the bug.)
> > > > > >
> > > > > > BIOS is:
> > > > > >
> > > > > >      *-firmware
> > > > > >           description: BIOS
> > > > > >           vendor: American Megatrends International, LLC.
> > > > > >           physical id: 0
> > > > > >           version: 1.21
> > > > > >           date: 04/26/2023
> > > > > >           size: 64KiB
> > > > > >
> > > > > > The kernel is 6.3.0-torvalds-<id>-13466-gfc4354c6e5c2.
> > > > > >
> > > > > > The keyboard is recognised as Chicony:
> > > > > >
> > > > > >                  *-usb
> > > > > >                       description: Keyboard
> > > > > >                       product: CHICONY USB Keyboard
> > > > > >                       vendor: CHICONY
> > > > > >                       physical id: 2
> > > > > >                       bus info: usb@5:2
> > > > > >                       logical name: input35
> > > > > >                       logical name: /dev/input/event4
> > > > > >                       logical name: input35::capslock
> > > > > >                       logical name: input35::numlock
> > > > > >                       logical name: input35::scrolllock
> > > > > >                       logical name: input36
> > > > > >                       logical name: /dev/input/event5
> > > > > >                       logical name: input37
> > > > > >                       logical name: /dev/input/event6
> > > > > >                       logical name: input38
> > > > > >                       logical name: /dev/input/event8
> > > > > >                       version: 2.30
> > > > > >                       capabilities: usb-2.00 usb
> > > > > >                       configuration: driver=usbhid maxpower=100mA
> > > > > > speed=1Mbit/s
> > > > > >
> > > > > > The bug is easily reproduced by unplugging the USB keyboard, waiting about a
> > > > > > couple of seconds,
> > > > > > and then reconnect and scan for memory leaks twice.
> > > > > >
> > > > > > The kmemleak log is as follows [edited privacy info]:
> > > > > >
> > > > > > root@hostname:/home/username# cat /sys/kernel/debug/kmemleak
> > > > > > unreferenced object 0xffff8dd020037c00 (size 96):
> > > > > >   comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
> > > > > >   hex dump (first 32 bytes):
> > > > > >     5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
> > > > > >     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> > > > > >   backtrace:
> > > > > >     [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
> > > > > >     [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
> > > > > >     [<ffffffffb87543d9>] class_create+0x29/0x80
> > > > > >     [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
> > > > >
> > > > > As the call to class_create() in this path is now gone in 6.4-rc1, can
> > > > > you retry that release to see if this is still there or not?
> > > >
> > > > No, wait, it's still there, I was looking at a development branch of
> > > > mine that isn't sent upstream yet. And syzbot just reported the same
> > > > thing:
> > > > https://lore.kernel.org/r/[email protected]
> > > >
> > > > So something's wrong here, let me dig into it tomorrow when I get a
> > > > chance...
> > >
> > > If this could help, here is the bisect of the bug (I could not discern what
> > > could possibly be wrong):
> > >
> > > user@host:~/linux/kernel/linux_torvalds$ git bisect log
> > > git bisect start
> > > # bad: [ac9a78681b921877518763ba0e89202254349d1b] Linux 6.4-rc1
> > > git bisect bad ac9a78681b921877518763ba0e89202254349d1b
> > > # good: [c9c3395d5e3dcc6daee66c6908354d47bf98cb0c] Linux 6.2
> > > git bisect good c9c3395d5e3dcc6daee66c6908354d47bf98cb0c
> > > # good: [85496c9b3bf8dbe15e2433d3a0197954d323cadc] Merge branch
> > > 'net-remove-some-rcu_bh-cruft'
> > > git bisect good 85496c9b3bf8dbe15e2433d3a0197954d323cadc
> > > # good: [b68ee1c6131c540a62ecd443be89c406401df091] Merge tag 'scsi-misc' of
> > > git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
> > > git bisect good b68ee1c6131c540a62ecd443be89c406401df091
> > > # bad: [888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0] Merge tag 'sysctl-6.4-rc1'
> > > of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
> > > git bisect bad 888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0
> > > # good: [34b62f186db9614e55d021f8c58d22fc44c57911] Merge tag
> > > 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
> > > git bisect good 34b62f186db9614e55d021f8c58d22fc44c57911
> > > # good: [34da76dca4673ab1819830b4924bb5b436325b26] Merge tag
> > > 'for-linus-2023042601' of
> > > git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
> > > git bisect good 34da76dca4673ab1819830b4924bb5b436325b26
> > > # good: [97b2ff294381d05e59294a931c4db55276470cb5] Merge tag
> > > 'staging-6.4-rc1' of
> > > git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
> > > git bisect good 97b2ff294381d05e59294a931c4db55276470cb5
> > > # good: [2025b2ca8004c04861903d076c67a73a0ec6dfca] mcb-lpc: Reallocate
> > > memory region to avoid memory overlapping
> > > git bisect good 2025b2ca8004c04861903d076c67a73a0ec6dfca
> > > # bad: [d06f5a3f7140921ada47d49574ae6fa4de5e2a89] cdx: fix build failure due
> > > to sysfs 'bus_type' argument needing to be const
> > > git bisect bad d06f5a3f7140921ada47d49574ae6fa4de5e2a89
> > > # good: [dcfbb67e48a2becfce7990386e985b9c45098ee5] driver core: class: use
> > > lock_class_key already present in struct subsys_private
> > > git bisect good dcfbb67e48a2becfce7990386e985b9c45098ee5
> > > # bad: [6f14c02220c791d5c46b0f965b9340c58f3d503d] driver core: create
> > > class_is_registered()
> > > git bisect bad 6f14c02220c791d5c46b0f965b9340c58f3d503d
> > > # good: [2f9e87f5a2941b259336c7ea6c5a1499ede4554a] driver core: Add a
> > > comment to set_primary_fwnode() on nullifying
> > > git bisect good 2f9e87f5a2941b259336c7ea6c5a1499ede4554a
> > > # bad: [02fe26f25325b547b7a31a65deb0326c04bb5174] firmware_loader: Add debug
> > > message with checksum for FW file
> > > git bisect bad 02fe26f25325b547b7a31a65deb0326c04bb5174
> > > # good: [884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82] driver core: class:
> > > implement class_get/put without the private pointer.
> > > git bisect good 884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82
> > > # bad: [3f84aa5ec052dba960baca4ab8a352d43d47028e] base: soc: populate
> > > machine name in soc_device_register if empty
> > > git bisect bad 3f84aa5ec052dba960baca4ab8a352d43d47028e
> > > # bad: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core: class.c:
> > > convert to only use class_to_subsys
> > > git bisect bad 7b884b7f24b42fa25e92ed724ad82f137610afaf
> > > # first bad commit: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core:
> > > class.c: convert to only use class_to_subsys
> > > user@host:~/linux/kernel/linux_torvalds$
> >
> > This helps a lot, thanks. I got the reference counting wrong somewhere
> > in here, I thought I tested this better, odd it shows up now...
> >
> > I'll try to work on it this week.
>
> I have figured out that the leak occurs on keyboard unplugging only, one
> or two leaks (maybe a race condition?).
>
> Please NOTE that the number of leaks is now odd:
>
> root@defiant:/home/marvin# cat /sys/kernel/debug/kmemleak | grep comm
> comm "systemd-udevd", pid 330, jiffies 4294892588 (age 715.772s)
> comm "systemd-udevd", pid 330, jiffies 4294892588 (age 715.772s)
> comm "kworker/6:0", pid 54, jiffies 4294907989 (age 654.224s)
> comm "kworker/6:0", pid 54, jiffies 4294907989 (age 654.272s)
> comm "kworker/6:3", pid 3046, jiffies 4294935362 (age 544.780s)
> comm "kworker/6:0", pid 54, jiffies 4294964122 (age 429.740s)
> comm "kworker/6:0", pid 54, jiffies 4294964122 (age 429.784s)
> root@defiant:/home/marvin#
>
> At one time unplugging keyboard generated only one leak, but only at one
> time. As it requires manually unplugging keyboard, I didn't seem to find a
> way to automate it, but it doesn't seem to require root access.
>
> BTW, I've seen in syzbot output that kmemleak output has debug source file
> names and line numbers. I couldn't make that work with the dbg .deb.
>
> I will do some more homework, but this was a rough week.

I made up a patch based on code inspection alone, as I couldn't
reproduce this locally at all:
https://lore.kernel.org/r/2023051628-thumb-boaster-5680@gregkh
and it seemed to pass syzbot's tests.

I've included it here below, can you test it as well?

Hm, I only tested with a USB mouse unplug/plug cycle, maybe the issue is
a keyboard?

thanks,

greg k-h

-------------

diff --git a/drivers/base/class.c b/drivers/base/class.c
index ac1808d1a2e8..9b44edc8416f 100644
--- a/drivers/base/class.c
+++ b/drivers/base/class.c
@@ -320,6 +322,7 @@ void class_dev_iter_init(struct class_dev_iter *iter, const struct class *class,
 		start_knode = &start->p->knode_class;
 	klist_iter_init_node(&sp->klist_devices, &iter->ki, start_knode);
 	iter->type = type;
+	iter->sp = sp;
 }
 EXPORT_SYMBOL_GPL(class_dev_iter_init);

@@ -361,6 +364,7 @@ EXPORT_SYMBOL_GPL(class_dev_iter_next);
 void class_dev_iter_exit(struct class_dev_iter *iter)
 {
 	klist_iter_exit(&iter->ki);
+	subsys_put(iter->sp);
 }
 EXPORT_SYMBOL_GPL(class_dev_iter_exit);

diff --git a/include/linux/device/class.h b/include/linux/device/class.h
index 9deeaeb457bb..abf3d3bfb6fe 100644
--- a/include/linux/device/class.h
+++ b/include/linux/device/class.h
@@ -74,6 +74,7 @@ struct class {
 struct class_dev_iter {
 	struct klist_iter ki;
 	const struct device_type *type;
+	struct subsys_private *sp;
 };

 int __must_check class_register(const struct class *class);
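
As a rough illustration of what the patch changes, here is a user-space
sketch of the get/put pairing. It is not kernel code: struct model_subsys,
struct model_iter and every model_* function are hypothetical stand-ins,
and it assumes (as the patch suggests) that the iterator's init path takes
a reference that nothing dropped before the fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct subsys_private: only a reference count. */
struct model_subsys {
	int refcount;
};

/* Hypothetical stand-in for struct class_dev_iter after the patch. */
struct model_iter {
	struct model_subsys *sp;	/* reference held while iterating */
};

/* Models taking a reference during lookup (assumed behaviour). */
static struct model_subsys *model_subsys_get(struct model_subsys *sp)
{
	if (sp)
		sp->refcount++;
	return sp;
}

/* Models dropping one reference. */
static void model_subsys_put(struct model_subsys *sp)
{
	if (sp)
		sp->refcount--;
}

/* init takes a reference and remembers it, like iter->sp = sp in the patch. */
static void model_iter_init(struct model_iter *iter, struct model_subsys *sp)
{
	iter->sp = model_subsys_get(sp);
}

/* Behaviour before the patch: exit never drops the reference. */
static void model_iter_exit_leaky(struct model_iter *iter)
{
	(void)iter;
}

/* Behaviour after the patch: exit drops the reference taken in init. */
static void model_iter_exit_fixed(struct model_iter *iter)
{
	model_subsys_put(iter->sp);
	iter->sp = NULL;
}

/* Run n init/exit cycles and return how many references are left over. */
static int model_leftover_refs(int n, int fixed)
{
	struct model_subsys sp = { .refcount = 0 };
	struct model_iter iter;

	for (int i = 0; i < n; i++) {
		model_iter_init(&iter, &sp);
		if (fixed)
			model_iter_exit_fixed(&iter);
		else
			model_iter_exit_leaky(&iter);
	}
	return sp.refcount;
}
```

In this model, model_leftover_refs(3, 0) leaves three references behind
while model_leftover_refs(3, 1) leaves none. A count that never returns to
zero means the object is never released, which is the kind of leftover
kmemleak would report.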

2023-05-17 07:48:03

by Mirsad Todorovac

Subject: Re: [BUG][NEW DATA] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2

On 16.5.2023. 16:36, Greg Kroah-Hartman wrote:
> On Fri, May 12, 2023 at 11:33:31PM +0200, Mirsad Goran Todorovac wrote:
>> Hi,
>>
>> On 5/9/23 04:59, Greg Kroah-Hartman wrote:
>>> On Tue, May 09, 2023 at 01:51:35AM +0200, Mirsad Goran Todorovac wrote:
>>>>
>>>>
>>>> On 08. 05. 2023. 16:01, Greg Kroah-Hartman wrote:
>>>>> On Mon, May 08, 2023 at 08:51:55AM +0200, Greg Kroah-Hartman wrote:
>>>>>> On Mon, May 08, 2023 at 08:30:07AM +0200, Mirsad Goran Todorovac wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> There seems to be a kernel memory leak in the USB keyboard driver.
>>>>>>>
>>>>>>> The leaked memory allocs are 96 and 512 bytes.
>>>>>>>
>>>>>>> The platform is Ubuntu 22.04 LTS on a assembled AMD Ryzen 9 with X670E PG
>>>>>>> Lightning mobo,
>>>>>>> and Genius SlimStar i220 GK-080012 keyboard.
>>>>>>>
>>>>>>> (Logitech M100 HID mouse is not affected by the bug.)
>>>>>>>
>>>>>>> BIOS is:
>>>>>>>
>>>>>>>      *-firmware
>>>>>>>           description: BIOS
>>>>>>>           vendor: American Megatrends International, LLC.
>>>>>>>           physical id: 0
>>>>>>>           version: 1.21
>>>>>>>           date: 04/26/2023
>>>>>>>           size: 64KiB
>>>>>>>
>>>>>>> The kernel is 6.3.0-torvalds-<id>-13466-gfc4354c6e5c2.
>>>>>>>
>>>>>>> The keyboard is recognised as Chicony:
>>>>>>>
>>>>>>>                  *-usb
>>>>>>>                       description: Keyboard
>>>>>>>                       product: CHICONY USB Keyboard
>>>>>>>                       vendor: CHICONY
>>>>>>>                       physical id: 2
>>>>>>>                       bus info: usb@5:2
>>>>>>>                       logical name: input35
>>>>>>>                       logical name: /dev/input/event4
>>>>>>>                       logical name: input35::capslock
>>>>>>>                       logical name: input35::numlock
>>>>>>>                       logical name: input35::scrolllock
>>>>>>>                       logical name: input36
>>>>>>>                       logical name: /dev/input/event5
>>>>>>>                       logical name: input37
>>>>>>>                       logical name: /dev/input/event6
>>>>>>>                       logical name: input38
>>>>>>>                       logical name: /dev/input/event8
>>>>>>>                       version: 2.30
>>>>>>>                       capabilities: usb-2.00 usb
>>>>>>>                       configuration: driver=usbhid maxpower=100mA
>>>>>>> speed=1Mbit/s
>>>>>>>
>>>>>>> The bug is easily reproduced by unplugging the USB keyboard, waiting about a
>>>>>>> couple of seconds,
>>>>>>> and then reconnect and scan for memory leaks twice.
>>>>>>>
>>>>>>> The kmemleak log is as follows [edited privacy info]:
>>>>>>>
>>>>>>> root@hostname:/home/username# cat /sys/kernel/debug/kmemleak
>>>>>>> unreferenced object 0xffff8dd020037c00 (size 96):
>>>>>>>   comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
>>>>>>>   hex dump (first 32 bytes):
>>>>>>>     5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
>>>>>>>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>>>>>>   backtrace:
>>>>>>>     [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
>>>>>>>     [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
>>>>>>>     [<ffffffffb87543d9>] class_create+0x29/0x80
>>>>>>>     [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
>>>>>>
>>>>>> As the call to class_create() in this path is now gone in 6.4-rc1, can
>>>>>> you retry that release to see if this is still there or not?
>>>>>
>>>>> No, wait, it's still there, I was looking at a development branch of
>>>>> mine that isn't sent upstream yet. And syzbot just reported the same
>>>>> thing:
>>>>> https://lore.kernel.org/r/[email protected]
>>>>>
>>>>> So something's wrong here, let me dig into it tomorrow when I get a
>>>>> chance...
>>>>
>>>> If this could help, here is the bisect of the bug (I could not discern what
>>>> could possibly be wrong):
>>>>
>>>> user@host:~/linux/kernel/linux_torvalds$ git bisect log
>>>> git bisect start
>>>> # bad: [ac9a78681b921877518763ba0e89202254349d1b] Linux 6.4-rc1
>>>> git bisect bad ac9a78681b921877518763ba0e89202254349d1b
>>>> # good: [c9c3395d5e3dcc6daee66c6908354d47bf98cb0c] Linux 6.2
>>>> git bisect good c9c3395d5e3dcc6daee66c6908354d47bf98cb0c
>>>> # good: [85496c9b3bf8dbe15e2433d3a0197954d323cadc] Merge branch
>>>> 'net-remove-some-rcu_bh-cruft'
>>>> git bisect good 85496c9b3bf8dbe15e2433d3a0197954d323cadc
>>>> # good: [b68ee1c6131c540a62ecd443be89c406401df091] Merge tag 'scsi-misc' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
>>>> git bisect good b68ee1c6131c540a62ecd443be89c406401df091
>>>> # bad: [888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0] Merge tag 'sysctl-6.4-rc1'
>>>> of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
>>>> git bisect bad 888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0
>>>> # good: [34b62f186db9614e55d021f8c58d22fc44c57911] Merge tag
>>>> 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
>>>> git bisect good 34b62f186db9614e55d021f8c58d22fc44c57911
>>>> # good: [34da76dca4673ab1819830b4924bb5b436325b26] Merge tag
>>>> 'for-linus-2023042601' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
>>>> git bisect good 34da76dca4673ab1819830b4924bb5b436325b26
>>>> # good: [97b2ff294381d05e59294a931c4db55276470cb5] Merge tag
>>>> 'staging-6.4-rc1' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
>>>> git bisect good 97b2ff294381d05e59294a931c4db55276470cb5
>>>> # good: [2025b2ca8004c04861903d076c67a73a0ec6dfca] mcb-lpc: Reallocate
>>>> memory region to avoid memory overlapping
>>>> git bisect good 2025b2ca8004c04861903d076c67a73a0ec6dfca
>>>> # bad: [d06f5a3f7140921ada47d49574ae6fa4de5e2a89] cdx: fix build failure due
>>>> to sysfs 'bus_type' argument needing to be const
>>>> git bisect bad d06f5a3f7140921ada47d49574ae6fa4de5e2a89
>>>> # good: [dcfbb67e48a2becfce7990386e985b9c45098ee5] driver core: class: use
>>>> lock_class_key already present in struct subsys_private
>>>> git bisect good dcfbb67e48a2becfce7990386e985b9c45098ee5
>>>> # bad: [6f14c02220c791d5c46b0f965b9340c58f3d503d] driver core: create
>>>> class_is_registered()
>>>> git bisect bad 6f14c02220c791d5c46b0f965b9340c58f3d503d
>>>> # good: [2f9e87f5a2941b259336c7ea6c5a1499ede4554a] driver core: Add a
>>>> comment to set_primary_fwnode() on nullifying
>>>> git bisect good 2f9e87f5a2941b259336c7ea6c5a1499ede4554a
>>>> # bad: [02fe26f25325b547b7a31a65deb0326c04bb5174] firmware_loader: Add debug
>>>> message with checksum for FW file
>>>> git bisect bad 02fe26f25325b547b7a31a65deb0326c04bb5174
>>>> # good: [884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82] driver core: class:
>>>> implement class_get/put without the private pointer.
>>>> git bisect good 884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82
>>>> # bad: [3f84aa5ec052dba960baca4ab8a352d43d47028e] base: soc: populate
>>>> machine name in soc_device_register if empty
>>>> git bisect bad 3f84aa5ec052dba960baca4ab8a352d43d47028e
>>>> # bad: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core: class.c:
>>>> convert to only use class_to_subsys
>>>> git bisect bad 7b884b7f24b42fa25e92ed724ad82f137610afaf
>>>> # first bad commit: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core:
>>>> class.c: convert to only use class_to_subsys
>>>> user@host:~/linux/kernel/linux_torvalds$
>>>
>>> This helps a lot, thanks. I got the reference counting wrong somewhere
>>> in here, I thought I tested this better, odd it shows up now...
>>>
>>> I'll try to work on it this week.
>>
>> I have figured out that the leak occurs on keyboard unplugging only, one
>> or two leaks (maybe a race condition?).
>>
>> Please NOTE that the number of leaks is now odd:
>>
>> root@defiant:/home/marvin# cat /sys/kernel/debug/kmemleak | grep comm
>> comm "systemd-udevd", pid 330, jiffies 4294892588 (age 715.772s)
>> comm "systemd-udevd", pid 330, jiffies 4294892588 (age 715.772s)
>> comm "kworker/6:0", pid 54, jiffies 4294907989 (age 654.224s)
>> comm "kworker/6:0", pid 54, jiffies 4294907989 (age 654.272s)
>> comm "kworker/6:3", pid 3046, jiffies 4294935362 (age 544.780s)
>> comm "kworker/6:0", pid 54, jiffies 4294964122 (age 429.740s)
>> comm "kworker/6:0", pid 54, jiffies 4294964122 (age 429.784s)
>> root@defiant:/home/marvin#
>>
>> At one time unplugging keyboard generated only one leak, but only at one
>> time. As it requires manually unplugging keyboard, I didn't seem to find a
>> way to automate it, but it doesn't seem to require root access.
>>
>> BTW, I've seen in syzbot output that kmemleak output has debug source file
>> names and line numbers. I couldn't make that work with the dbg .deb.
>>
>> I will do some more homework, but this was a rough week.
>
> I made up a patch based on code inspection alone, as I couldn't
> reproduce this locally at all:
> https://lore.kernel.org/r/2023051628-thumb-boaster-5680@gregkh
> and it seemed to pass syzbot's tests.
>
> I've included it here below, can you test it as well?
>
> Hm, I only tested with a USB mouse unplug/plug cycle, maybe the issue is
> a keyboard?
>
> thanks,
>
> greg k-h

Hi, Greg,

Yes, the problem is with the keyboard, and only on unplugging; the mouse
unplugs without problems.

I will test the patch in the afternoon, as the issue did not appear on the
work computer. It is probably best to reproduce the exact environment for
the test.

Best regards,
Mirsad

> -------------
>
> diff --git a/drivers/base/class.c b/drivers/base/class.c
> index ac1808d1a2e8..9b44edc8416f 100644
> --- a/drivers/base/class.c
> +++ b/drivers/base/class.c
> @@ -320,6 +322,7 @@ void class_dev_iter_init(struct class_dev_iter *iter, const struct class *class,
>  		start_knode = &start->p->knode_class;
>  	klist_iter_init_node(&sp->klist_devices, &iter->ki, start_knode);
>  	iter->type = type;
> +	iter->sp = sp;
>  }
>  EXPORT_SYMBOL_GPL(class_dev_iter_init);
>
> @@ -361,6 +364,7 @@ EXPORT_SYMBOL_GPL(class_dev_iter_next);
>  void class_dev_iter_exit(struct class_dev_iter *iter)
>  {
>  	klist_iter_exit(&iter->ki);
> +	subsys_put(iter->sp);
>  }
>  EXPORT_SYMBOL_GPL(class_dev_iter_exit);
>
> diff --git a/include/linux/device/class.h b/include/linux/device/class.h
> index 9deeaeb457bb..abf3d3bfb6fe 100644
> --- a/include/linux/device/class.h
> +++ b/include/linux/device/class.h
> @@ -74,6 +74,7 @@ struct class {
>  struct class_dev_iter {
>  	struct klist_iter ki;
>  	const struct device_type *type;
> +	struct subsys_private *sp;
>  };
>
>  int __must_check class_register(const struct class *class);

--
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union


2023-05-17 16:18:03

by Mirsad Todorovac

Subject: Re: [BUG][NEW DATA] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2

On 5/16/23 16:36, Greg Kroah-Hartman wrote:
> On Fri, May 12, 2023 at 11:33:31PM +0200, Mirsad Goran Todorovac wrote:
>> Hi,
>>
>> On 5/9/23 04:59, Greg Kroah-Hartman wrote:
>>> On Tue, May 09, 2023 at 01:51:35AM +0200, Mirsad Goran Todorovac wrote:
>>>>
>>>>
>>>> On 08. 05. 2023. 16:01, Greg Kroah-Hartman wrote:
>>>>> On Mon, May 08, 2023 at 08:51:55AM +0200, Greg Kroah-Hartman wrote:
>>>>>> On Mon, May 08, 2023 at 08:30:07AM +0200, Mirsad Goran Todorovac wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> There seems to be a kernel memory leak in the USB keyboard driver.
>>>>>>>
>>>>>>> The leaked memory allocs are 96 and 512 bytes.
>>>>>>>
>>>>>>> The platform is Ubuntu 22.04 LTS on a assembled AMD Ryzen 9 with X670E PG
>>>>>>> Lightning mobo,
>>>>>>> and Genius SlimStar i220 GK-080012 keyboard.
>>>>>>>
>>>>>>> (Logitech M100 HID mouse is not affected by the bug.)
>>>>>>>
>>>>>>> BIOS is:
>>>>>>>
>>>>>>>      *-firmware
>>>>>>>           description: BIOS
>>>>>>>           vendor: American Megatrends International, LLC.
>>>>>>>           physical id: 0
>>>>>>>           version: 1.21
>>>>>>>           date: 04/26/2023
>>>>>>>           size: 64KiB
>>>>>>>
>>>>>>> The kernel is 6.3.0-torvalds-<id>-13466-gfc4354c6e5c2.
>>>>>>>
>>>>>>> The keyboard is recognised as Chicony:
>>>>>>>
>>>>>>>                  *-usb
>>>>>>>                       description: Keyboard
>>>>>>>                       product: CHICONY USB Keyboard
>>>>>>>                       vendor: CHICONY
>>>>>>>                       physical id: 2
>>>>>>>                       bus info: usb@5:2
>>>>>>>                       logical name: input35
>>>>>>>                       logical name: /dev/input/event4
>>>>>>>                       logical name: input35::capslock
>>>>>>>                       logical name: input35::numlock
>>>>>>>                       logical name: input35::scrolllock
>>>>>>>                       logical name: input36
>>>>>>>                       logical name: /dev/input/event5
>>>>>>>                       logical name: input37
>>>>>>>                       logical name: /dev/input/event6
>>>>>>>                       logical name: input38
>>>>>>>                       logical name: /dev/input/event8
>>>>>>>                       version: 2.30
>>>>>>>                       capabilities: usb-2.00 usb
>>>>>>>                       configuration: driver=usbhid maxpower=100mA
>>>>>>> speed=1Mbit/s
>>>>>>>
>>>>>>> The bug is easily reproduced by unplugging the USB keyboard, waiting about a
>>>>>>> couple of seconds,
>>>>>>> and then reconnect and scan for memory leaks twice.
>>>>>>>
>>>>>>> The kmemleak log is as follows [edited privacy info]:
>>>>>>>
>>>>>>> root@hostname:/home/username# cat /sys/kernel/debug/kmemleak
>>>>>>> unreferenced object 0xffff8dd020037c00 (size 96):
>>>>>>>   comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
>>>>>>>   hex dump (first 32 bytes):
>>>>>>>     5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
>>>>>>>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>>>>>>   backtrace:
>>>>>>>     [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
>>>>>>>     [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
>>>>>>>     [<ffffffffb87543d9>] class_create+0x29/0x80
>>>>>>>     [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
>>>>>>
>>>>>> As the call to class_create() in this path is now gone in 6.4-rc1, can
>>>>>> you retry that release to see if this is still there or not?
>>>>>
>>>>> No, wait, it's still there, I was looking at a development branch of
>>>>> mine that isn't sent upstream yet. And syzbot just reported the same
>>>>> thing:
>>>>> https://lore.kernel.org/r/[email protected]
>>>>>
>>>>> So something's wrong here, let me dig into it tomorrow when I get a
>>>>> chance...
>>>>
>>>> If this could help, here is the bisect of the bug (I could not discern what
>>>> could possibly be wrong):
>>>>
>>>> user@host:~/linux/kernel/linux_torvalds$ git bisect log
>>>> git bisect start
>>>> # bad: [ac9a78681b921877518763ba0e89202254349d1b] Linux 6.4-rc1
>>>> git bisect bad ac9a78681b921877518763ba0e89202254349d1b
>>>> # good: [c9c3395d5e3dcc6daee66c6908354d47bf98cb0c] Linux 6.2
>>>> git bisect good c9c3395d5e3dcc6daee66c6908354d47bf98cb0c
>>>> # good: [85496c9b3bf8dbe15e2433d3a0197954d323cadc] Merge branch
>>>> 'net-remove-some-rcu_bh-cruft'
>>>> git bisect good 85496c9b3bf8dbe15e2433d3a0197954d323cadc
>>>> # good: [b68ee1c6131c540a62ecd443be89c406401df091] Merge tag 'scsi-misc' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
>>>> git bisect good b68ee1c6131c540a62ecd443be89c406401df091
>>>> # bad: [888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0] Merge tag 'sysctl-6.4-rc1'
>>>> of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
>>>> git bisect bad 888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0
>>>> # good: [34b62f186db9614e55d021f8c58d22fc44c57911] Merge tag
>>>> 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
>>>> git bisect good 34b62f186db9614e55d021f8c58d22fc44c57911
>>>> # good: [34da76dca4673ab1819830b4924bb5b436325b26] Merge tag
>>>> 'for-linus-2023042601' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
>>>> git bisect good 34da76dca4673ab1819830b4924bb5b436325b26
>>>> # good: [97b2ff294381d05e59294a931c4db55276470cb5] Merge tag
>>>> 'staging-6.4-rc1' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
>>>> git bisect good 97b2ff294381d05e59294a931c4db55276470cb5
>>>> # good: [2025b2ca8004c04861903d076c67a73a0ec6dfca] mcb-lpc: Reallocate
>>>> memory region to avoid memory overlapping
>>>> git bisect good 2025b2ca8004c04861903d076c67a73a0ec6dfca
>>>> # bad: [d06f5a3f7140921ada47d49574ae6fa4de5e2a89] cdx: fix build failure due
>>>> to sysfs 'bus_type' argument needing to be const
>>>> git bisect bad d06f5a3f7140921ada47d49574ae6fa4de5e2a89
>>>> # good: [dcfbb67e48a2becfce7990386e985b9c45098ee5] driver core: class: use
>>>> lock_class_key already present in struct subsys_private
>>>> git bisect good dcfbb67e48a2becfce7990386e985b9c45098ee5
>>>> # bad: [6f14c02220c791d5c46b0f965b9340c58f3d503d] driver core: create
>>>> class_is_registered()
>>>> git bisect bad 6f14c02220c791d5c46b0f965b9340c58f3d503d
>>>> # good: [2f9e87f5a2941b259336c7ea6c5a1499ede4554a] driver core: Add a
>>>> comment to set_primary_fwnode() on nullifying
>>>> git bisect good 2f9e87f5a2941b259336c7ea6c5a1499ede4554a
>>>> # bad: [02fe26f25325b547b7a31a65deb0326c04bb5174] firmware_loader: Add debug
>>>> message with checksum for FW file
>>>> git bisect bad 02fe26f25325b547b7a31a65deb0326c04bb5174
>>>> # good: [884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82] driver core: class:
>>>> implement class_get/put without the private pointer.
>>>> git bisect good 884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82
>>>> # bad: [3f84aa5ec052dba960baca4ab8a352d43d47028e] base: soc: populate
>>>> machine name in soc_device_register if empty
>>>> git bisect bad 3f84aa5ec052dba960baca4ab8a352d43d47028e
>>>> # bad: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core: class.c:
>>>> convert to only use class_to_subsys
>>>> git bisect bad 7b884b7f24b42fa25e92ed724ad82f137610afaf
>>>> # first bad commit: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core:
>>>> class.c: convert to only use class_to_subsys
>>>> user@host:~/linux/kernel/linux_torvalds$
>>>
>>> This helps a lot, thanks. I got the reference counting wrong somewhere
>>> in here, I thought I tested this better, odd it shows up now...
>>>
>>> I'll try to work on it this week.
>>
>> I have figured out that the leak occurs on keyboard unplugging only, one
>> or two leaks (maybe a race condition?).
>>
>> Please NOTE that the number of leaks is now odd:
>>
>> root@defiant:/home/marvin# cat /sys/kernel/debug/kmemleak | grep comm
>> comm "systemd-udevd", pid 330, jiffies 4294892588 (age 715.772s)
>> comm "systemd-udevd", pid 330, jiffies 4294892588 (age 715.772s)
>> comm "kworker/6:0", pid 54, jiffies 4294907989 (age 654.224s)
>> comm "kworker/6:0", pid 54, jiffies 4294907989 (age 654.272s)
>> comm "kworker/6:3", pid 3046, jiffies 4294935362 (age 544.780s)
>> comm "kworker/6:0", pid 54, jiffies 4294964122 (age 429.740s)
>> comm "kworker/6:0", pid 54, jiffies 4294964122 (age 429.784s)
>> root@defiant:/home/marvin#
>>
>> At one time unplugging keyboard generated only one leak, but only at one
>> time. As it requires manually unplugging keyboard, I didn't seem to find a
>> way to automate it, but it doesn't seem to require root access.
>>
>> BTW, I've seen in syzbot output that kmemleak output has debug source file
>> names and line numbers. I couldn't make that work with the dbg .deb.
>>
>> I will do some more homework, but this was a rough week.
>
> I made up a patch based on code inspection alone, as I couldn't
> reproduce this locally at all:
> https://lore.kernel.org/r/2023051628-thumb-boaster-5680@gregkh
> and it seemed to pass syzbot's tests.
>
> I've included it here below, can you test it as well?
>
> Hm, I only tested with a USB mouse unplug/plug cycle, maybe the issue is
> a keyboard?
>
> thanks,
>
> greg k-h
>
> -------------
>
> diff --git a/drivers/base/class.c b/drivers/base/class.c
> index ac1808d1a2e8..9b44edc8416f 100644
> --- a/drivers/base/class.c
> +++ b/drivers/base/class.c
> @@ -320,6 +322,7 @@ void class_dev_iter_init(struct class_dev_iter *iter, const struct class *class,
>  		start_knode = &start->p->knode_class;
>  	klist_iter_init_node(&sp->klist_devices, &iter->ki, start_knode);
>  	iter->type = type;
> +	iter->sp = sp;
>  }
>  EXPORT_SYMBOL_GPL(class_dev_iter_init);
>
> @@ -361,6 +364,7 @@ EXPORT_SYMBOL_GPL(class_dev_iter_next);
>  void class_dev_iter_exit(struct class_dev_iter *iter)
>  {
>  	klist_iter_exit(&iter->ki);
> +	subsys_put(iter->sp);
>  }
>  EXPORT_SYMBOL_GPL(class_dev_iter_exit);
>
> diff --git a/include/linux/device/class.h b/include/linux/device/class.h
> index 9deeaeb457bb..abf3d3bfb6fe 100644
> --- a/include/linux/device/class.h
> +++ b/include/linux/device/class.h
> @@ -74,6 +74,7 @@ struct class {
>  struct class_dev_iter {
>  	struct klist_iter ki;
>  	const struct device_type *type;
> +	struct subsys_private *sp;
>  };
>
>  int __must_check class_register(const struct class *class);

The build of the latest 6.4-rc2 without this patch still leaked; the build
of the same commit with this patch applied showed no leaks:

root@defiant:/home/marvin# cat /sys/kernel/debug/kmemleak
root@defiant:/home/marvin#

I tried three times, and it is OK.

Congratulations! This has fixed the leak.

I wonder why it didn't show up in other contexts, on other hardware and archs?

Best regards,
Mirsad
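
For repeating this check over several unplug/plug cycles, the number of
records in a saved kmemleak report can be counted instead of read by eye.
The following is a minimal sketch, assuming each record begins with a line
starting "unreferenced object" as in the report quoted above; the helper
name count_leak_records() is hypothetical and is not part of any kernel
tool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count kmemleak records in a report held in memory by counting the
 * lines that begin with "unreferenced object ".
 */
static size_t count_leak_records(const char *report)
{
	static const char tag[] = "unreferenced object ";
	size_t n = 0;
	const char *line = report;

	while (line && *line) {
		if (strncmp(line, tag, sizeof(tag) - 1) == 0)
			n++;
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}
	return n;
}
```

An empty report, like the one above, counts as zero records.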

2023-05-17 19:27:04

by Greg Kroah-Hartman

Subject: Re: [BUG][NEW DATA] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2

On Wed, May 17, 2023 at 06:10:54PM +0200, Mirsad Goran Todorovac wrote:
> On 5/16/23 16:36, Greg Kroah-Hartman wrote:
> > On Fri, May 12, 2023 at 11:33:31PM +0200, Mirsad Goran Todorovac wrote:
> > > Hi,
> > >
> > > On 5/9/23 04:59, Greg Kroah-Hartman wrote:
> > > > On Tue, May 09, 2023 at 01:51:35AM +0200, Mirsad Goran Todorovac wrote:
> > > > >
> > > > >
> > > > > On 08. 05. 2023. 16:01, Greg Kroah-Hartman wrote:
> > > > > > On Mon, May 08, 2023 at 08:51:55AM +0200, Greg Kroah-Hartman wrote:
> > > > > > > On Mon, May 08, 2023 at 08:30:07AM +0200, Mirsad Goran Todorovac wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > There seems to be a kernel memory leak in the USB keyboard driver.
> > > > > > > >
> > > > > > > > The leaked memory allocs are 96 and 512 bytes.
> > > > > > > >
> > > > > > > > The platform is Ubuntu 22.04 LTS on a assembled AMD Ryzen 9 with X670E PG
> > > > > > > > Lightning mobo,
> > > > > > > > and Genius SlimStar i220 GK-080012 keyboard.
> > > > > > > >
> > > > > > > > (Logitech M100 HID mouse is not affected by the bug.)
> > > > > > > >
> > > > > > > > BIOS is:
> > > > > > > >
> > > > > > > >      *-firmware
> > > > > > > >           description: BIOS
> > > > > > > >           vendor: American Megatrends International, LLC.
> > > > > > > >           physical id: 0
> > > > > > > >           version: 1.21
> > > > > > > >           date: 04/26/2023
> > > > > > > >           size: 64KiB
> > > > > > > >
> > > > > > > > The kernel is 6.3.0-torvalds-<id>-13466-gfc4354c6e5c2.
> > > > > > > >
> > > > > > > > The keyboard is recognised as Chicony:
> > > > > > > >
> > > > > > > >                  *-usb
> > > > > > > >                       description: Keyboard
> > > > > > > >                       product: CHICONY USB Keyboard
> > > > > > > >                       vendor: CHICONY
> > > > > > > >                       physical id: 2
> > > > > > > >                       bus info: usb@5:2
> > > > > > > >                       logical name: input35
> > > > > > > >                       logical name: /dev/input/event4
> > > > > > > >                       logical name: input35::capslock
> > > > > > > >                       logical name: input35::numlock
> > > > > > > >                       logical name: input35::scrolllock
> > > > > > > >                       logical name: input36
> > > > > > > >                       logical name: /dev/input/event5
> > > > > > > >                       logical name: input37
> > > > > > > >                       logical name: /dev/input/event6
> > > > > > > >                       logical name: input38
> > > > > > > >                       logical name: /dev/input/event8
> > > > > > > >                       version: 2.30
> > > > > > > >                       capabilities: usb-2.00 usb
> > > > > > > >                       configuration: driver=usbhid maxpower=100mA
> > > > > > > > speed=1Mbit/s
> > > > > > > >
> > > > > > > > The bug is easily reproduced by unplugging the USB keyboard, waiting about a
> > > > > > > > couple of seconds,
> > > > > > > > and then reconnect and scan for memory leaks twice.
> > > > > > > >
> > > > > > > > The kmemleak log is as follows [edited privacy info]:
> > > > > > > >
> > > > > > > > root@hostname:/home/username# cat /sys/kernel/debug/kmemleak
> > > > > > > > unreferenced object 0xffff8dd020037c00 (size 96):
> > > > > > > >   comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
> > > > > > > >   hex dump (first 32 bytes):
> > > > > > > >     5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
> > > > > > > >     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> > > > > > > >   backtrace:
> > > > > > > >     [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
> > > > > > > >     [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
> > > > > > > >     [<ffffffffb87543d9>] class_create+0x29/0x80
> > > > > > > >     [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
> > > > > > >
> > > > > > > As the call to class_create() in this path is now gone in 6.4-rc1, can
> > > > > > > you retry that release to see if this is still there or not?
> > > > > >
> > > > > > No, wait, it's still there, I was looking at a development branch of
> > > > > > mine that isn't sent upstream yet. And syzbot just reported the same
> > > > > > thing:
> > > > > > https://lore.kernel.org/r/[email protected]
> > > > > >
> > > > > > So something's wrong here, let me dig into it tomorrow when I get a
> > > > > > chance...
> > > > >
> > > > > If this could help, here is the bisect of the bug (I could not discern what
> > > > > could possibly be wrong):
> > > > >
> > > > > user@host:~/linux/kernel/linux_torvalds$ git bisect log
> > > > > git bisect start
> > > > > # bad: [ac9a78681b921877518763ba0e89202254349d1b] Linux 6.4-rc1
> > > > > git bisect bad ac9a78681b921877518763ba0e89202254349d1b
> > > > > # good: [c9c3395d5e3dcc6daee66c6908354d47bf98cb0c] Linux 6.2
> > > > > git bisect good c9c3395d5e3dcc6daee66c6908354d47bf98cb0c
> > > > > # good: [85496c9b3bf8dbe15e2433d3a0197954d323cadc] Merge branch
> > > > > 'net-remove-some-rcu_bh-cruft'
> > > > > git bisect good 85496c9b3bf8dbe15e2433d3a0197954d323cadc
> > > > > # good: [b68ee1c6131c540a62ecd443be89c406401df091] Merge tag 'scsi-misc' of
> > > > > git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
> > > > > git bisect good b68ee1c6131c540a62ecd443be89c406401df091
> > > > > # bad: [888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0] Merge tag 'sysctl-6.4-rc1'
> > > > > of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
> > > > > git bisect bad 888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0
> > > > > # good: [34b62f186db9614e55d021f8c58d22fc44c57911] Merge tag
> > > > > 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
> > > > > git bisect good 34b62f186db9614e55d021f8c58d22fc44c57911
> > > > > # good: [34da76dca4673ab1819830b4924bb5b436325b26] Merge tag
> > > > > 'for-linus-2023042601' of
> > > > > git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
> > > > > git bisect good 34da76dca4673ab1819830b4924bb5b436325b26
> > > > > # good: [97b2ff294381d05e59294a931c4db55276470cb5] Merge tag
> > > > > 'staging-6.4-rc1' of
> > > > > git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
> > > > > git bisect good 97b2ff294381d05e59294a931c4db55276470cb5
> > > > > # good: [2025b2ca8004c04861903d076c67a73a0ec6dfca] mcb-lpc: Reallocate
> > > > > memory region to avoid memory overlapping
> > > > > git bisect good 2025b2ca8004c04861903d076c67a73a0ec6dfca
> > > > > # bad: [d06f5a3f7140921ada47d49574ae6fa4de5e2a89] cdx: fix build failure due
> > > > > to sysfs 'bus_type' argument needing to be const
> > > > > git bisect bad d06f5a3f7140921ada47d49574ae6fa4de5e2a89
> > > > > # good: [dcfbb67e48a2becfce7990386e985b9c45098ee5] driver core: class: use
> > > > > lock_class_key already present in struct subsys_private
> > > > > git bisect good dcfbb67e48a2becfce7990386e985b9c45098ee5
> > > > > # bad: [6f14c02220c791d5c46b0f965b9340c58f3d503d] driver core: create
> > > > > class_is_registered()
> > > > > git bisect bad 6f14c02220c791d5c46b0f965b9340c58f3d503d
> > > > > # good: [2f9e87f5a2941b259336c7ea6c5a1499ede4554a] driver core: Add a
> > > > > comment to set_primary_fwnode() on nullifying
> > > > > git bisect good 2f9e87f5a2941b259336c7ea6c5a1499ede4554a
> > > > > # bad: [02fe26f25325b547b7a31a65deb0326c04bb5174] firmware_loader: Add debug
> > > > > message with checksum for FW file
> > > > > git bisect bad 02fe26f25325b547b7a31a65deb0326c04bb5174
> > > > > # good: [884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82] driver core: class:
> > > > > implement class_get/put without the private pointer.
> > > > > git bisect good 884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82
> > > > > # bad: [3f84aa5ec052dba960baca4ab8a352d43d47028e] base: soc: populate
> > > > > machine name in soc_device_register if empty
> > > > > git bisect bad 3f84aa5ec052dba960baca4ab8a352d43d47028e
> > > > > # bad: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core: class.c:
> > > > > convert to only use class_to_subsys
> > > > > git bisect bad 7b884b7f24b42fa25e92ed724ad82f137610afaf
> > > > > # first bad commit: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core:
> > > > > class.c: convert to only use class_to_subsys
> > > > > user@host:~/linux/kernel/linux_torvalds$
> > > >
> > > > This helps a lot, thanks. I got the reference counting wrong somewhere
> > > > in here, I thought I tested this better, odd it shows up now...
> > > >
> > > > I'll try to work on it this week.
> > >
> > > I have figured out that the leak occurs on keyboard unplugging only, one
> > > or two leaks (maybe a race condition?).
> > >
> > > Please NOTE that the number of leaks is now odd:
> > >
> > > root@defiant:/home/marvin# cat /sys/kernel/debug/kmemleak | grep comm
> > > comm "systemd-udevd", pid 330, jiffies 4294892588 (age 715.772s)
> > > comm "systemd-udevd", pid 330, jiffies 4294892588 (age 715.772s)
> > > comm "kworker/6:0", pid 54, jiffies 4294907989 (age 654.224s)
> > > comm "kworker/6:0", pid 54, jiffies 4294907989 (age 654.272s)
> > > comm "kworker/6:3", pid 3046, jiffies 4294935362 (age 544.780s)
> > > comm "kworker/6:0", pid 54, jiffies 4294964122 (age 429.740s)
> > > comm "kworker/6:0", pid 54, jiffies 4294964122 (age 429.784s)
> > > root@defiant:/home/marvin#
> > >
> > > At one time unplugging keyboard generated only one leak, but only at one
> > > time. As it requires manually unplugging keyboard, I didn't seem to find a
> > > way to automate it, but it doesn't seem to require root access.
> > >
> > > BTW, I've seen in syzbot output that kmemleak output has debug source file
> > > names and line numbers. I couldn't make that work with the dbg .deb.
> > >
> > > I will do some more homework, but this was a rough week.
> >
> > I made up a patch based on code inspection alone, as I couldn't
> > reproduce this locally at all:
> > https://lore.kernel.org/r/2023051628-thumb-boaster-5680@gregkh
> > and it seemed to pass syzbot's tests.
> >
> > I've included it here below, can you test it as well?
> >
> > Hm, I only tested with a USB mouse unplug/plug cycle, maybe the issue is
> > a keyboard?
> >
> > thanks,
> >
> > greg k-h
> >
> > -------------
> >
> > diff --git a/drivers/base/class.c b/drivers/base/class.c
> > index ac1808d1a2e8..9b44edc8416f 100644
> > --- a/drivers/base/class.c
> > +++ b/drivers/base/class.c
> > @@ -320,6 +322,7 @@ void class_dev_iter_init(struct class_dev_iter *iter, const struct class *class,
> > start_knode = &start->p->knode_class;
> > klist_iter_init_node(&sp->klist_devices, &iter->ki, start_knode);
> > iter->type = type;
> > + iter->sp = sp;
> > }
> > EXPORT_SYMBOL_GPL(class_dev_iter_init);
> > @@ -361,6 +364,7 @@ EXPORT_SYMBOL_GPL(class_dev_iter_next);
> > void class_dev_iter_exit(struct class_dev_iter *iter)
> > {
> > klist_iter_exit(&iter->ki);
> > + subsys_put(iter->sp);
> > }
> > EXPORT_SYMBOL_GPL(class_dev_iter_exit);
> > diff --git a/include/linux/device/class.h b/include/linux/device/class.h
> > index 9deeaeb457bb..abf3d3bfb6fe 100644
> > --- a/include/linux/device/class.h
> > +++ b/include/linux/device/class.h
> > @@ -74,6 +74,7 @@ struct class {
> > struct class_dev_iter {
> > struct klist_iter ki;
> > const struct device_type *type;
> > + struct subsys_private *sp;
> > };
> > int __must_check class_register(const struct class *class);
>
> The build with the latest 6.4-rc2 and without this patch still leaked,
> the build with the same commit and this patch applied was successful:
>
> root@defiant:/home/marvin# cat /sys/kernel/debug/kmemleak
> root@defiant:/home/marvin#
>
> Tried three times, and it is OK.
>
> Congratulations! This has fixed the leak.

Wonderful, thanks for testing, can I add your "Tested-by:" to it?

> I wonder why it didn't show in the other contexts, hardware and archs?

It might depend on your keyboard if it has other things on it? I don't
know, sorry, I didn't spend much time digging after I found the "obvious
leak" based on the bisection you provided, which was very very helpful,
thanks for that.

And leaks are hard to notice, especially ones that only show up when you
remove a specific type of device.

thanks again for your help here,

greg k-h
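
For readers following the patch quoted above: it appears to pair a
reference obtained when the class device iterator is initialised with a
matching put when the iterator is torn down. Below is a minimal userspace
sketch of that pairing, not the kernel code. struct subsys_model,
struct iter_model and every function name in it are hypothetical stand-ins
for struct subsys_private, struct class_dev_iter and the driver core
helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct subsys_private: only a reference count. */
struct subsys_model {
	int refcount;
};

/* Hypothetical stand-in for struct class_dev_iter after the patch. */
struct iter_model {
	struct subsys_model *sp;	/* remembered so exit can drop the reference */
};

/* Take one reference; returns the same pointer for convenience. */
struct subsys_model *subsys_model_get(struct subsys_model *sp)
{
	if (sp)
		sp->refcount++;
	return sp;
}

/* Drop one reference; a real implementation would free at zero. */
void subsys_model_put(struct subsys_model *sp)
{
	if (sp) {
		assert(sp->refcount > 0);
		sp->refcount--;
	}
}

/* Init takes a reference and stores the pointer in the iterator. */
void iter_model_init(struct iter_model *iter, struct subsys_model *sp)
{
	iter->sp = subsys_model_get(sp);
}

/*
 * Exit drops the reference taken in init. If this put is missing, every
 * init/exit cycle leaves the count one higher and the object can never
 * be released.
 */
void iter_model_exit(struct iter_model *iter)
{
	subsys_model_put(iter->sp);
	iter->sp = NULL;
}

/* Returns the refcount after n init/exit cycles, starting from 1. */
int refcount_after_cycles(int n)
{
	struct subsys_model sp = { .refcount = 1 };
	struct iter_model iter;

	for (int i = 0; i < n; i++) {
		iter_model_init(&iter, &sp);
		iter_model_exit(&iter);
	}
	return sp.refcount;
}
```

With the put in place, refcount_after_cycles() returns the starting count
however many cycles run. Removing the subsys_model_put() call from
iter_model_exit() makes the count grow by one per cycle, which is the shape
of leak the bisection pointed at.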

2023-05-17 19:54:29

by Mirsad Todorovac

Subject: Re: [BUG][NEW DATA] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2

On 5/17/23 20:57, Greg Kroah-Hartman wrote:
> On Wed, May 17, 2023 at 06:10:54PM +0200, Mirsad Goran Todorovac wrote:
>> On 5/16/23 16:36, Greg Kroah-Hartman wrote:
>>> On Fri, May 12, 2023 at 11:33:31PM +0200, Mirsad Goran Todorovac wrote:
>>>> Hi,
>>>>
>>>> On 5/9/23 04:59, Greg Kroah-Hartman wrote:
>>>>> On Tue, May 09, 2023 at 01:51:35AM +0200, Mirsad Goran Todorovac wrote:
>>>>>>
>>>>>>
>>>>>> On 08. 05. 2023. 16:01, Greg Kroah-Hartman wrote:
>>>>>>> On Mon, May 08, 2023 at 08:51:55AM +0200, Greg Kroah-Hartman wrote:
>>>>>>>> On Mon, May 08, 2023 at 08:30:07AM +0200, Mirsad Goran Todorovac wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> There seems to be a kernel memory leak in the USB keyboard driver.
>>>>>>>>>
>>>>>>>>> The leaked memory allocs are 96 and 512 bytes.
>>>>>>>>>
>>>>>>>>> The platform is Ubuntu 22.04 LTS on a assembled AMD Ryzen 9 with X670E PG
>>>>>>>>> Lightning mobo,
>>>>>>>>> and Genius SlimStar i220 GK-080012 keyboard.
>>>>>>>>>
>>>>>>>>> (Logitech M100 HID mouse is not affected by the bug.)
>>>>>>>>>
>>>>>>>>> BIOS is:
>>>>>>>>>
>>>>>>>>>      *-firmware
>>>>>>>>>           description: BIOS
>>>>>>>>>           vendor: American Megatrends International, LLC.
>>>>>>>>>           physical id: 0
>>>>>>>>>           version: 1.21
>>>>>>>>>           date: 04/26/2023
>>>>>>>>>           size: 64KiB
>>>>>>>>>
>>>>>>>>> The kernel is 6.3.0-torvalds-<id>-13466-gfc4354c6e5c2.
>>>>>>>>>
>>>>>>>>> The keyboard is recognised as Chicony:
>>>>>>>>>
>>>>>>>>>                  *-usb
>>>>>>>>>                       description: Keyboard
>>>>>>>>>                       product: CHICONY USB Keyboard
>>>>>>>>>                       vendor: CHICONY
>>>>>>>>>                       physical id: 2
>>>>>>>>>                       bus info: usb@5:2
>>>>>>>>>                       logical name: input35
>>>>>>>>>                       logical name: /dev/input/event4
>>>>>>>>>                       logical name: input35::capslock
>>>>>>>>>                       logical name: input35::numlock
>>>>>>>>>                       logical name: input35::scrolllock
>>>>>>>>>                       logical name: input36
>>>>>>>>>                       logical name: /dev/input/event5
>>>>>>>>>                       logical name: input37
>>>>>>>>>                       logical name: /dev/input/event6
>>>>>>>>>                       logical name: input38
>>>>>>>>>                       logical name: /dev/input/event8
>>>>>>>>>                       version: 2.30
>>>>>>>>>                       capabilities: usb-2.00 usb
>>>>>>>>>                       configuration: driver=usbhid maxpower=100mA
>>>>>>>>> speed=1Mbit/s
>>>>>>>>>
>>>>>>>>> The bug is easily reproduced by unplugging the USB keyboard, waiting about a
>>>>>>>>> couple of seconds,
>>>>>>>>> and then reconnect and scan for memory leaks twice.
>>>>>>>>>
>>>>>>>>> The kmemleak log is as follows [edited privacy info]:
>>>>>>>>>
>>>>>>>>> root@hostname:/home/username# cat /sys/kernel/debug/kmemleak
>>>>>>>>> unreferenced object 0xffff8dd020037c00 (size 96):
>>>>>>>>>   comm "systemd-udevd", pid 435, jiffies 4294892550 (age 8909.356s)
>>>>>>>>>   hex dump (first 32 bytes):
>>>>>>>>>     5d 8e 4e b9 ff ff ff ff 00 00 00 00 00 00 00 00 ].N.............
>>>>>>>>>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>>>>>>>>   backtrace:
>>>>>>>>>     [<ffffffffb81a74be>] __kmem_cache_alloc_node+0x22e/0x2b0
>>>>>>>>>     [<ffffffffb8127b6e>] kmalloc_trace+0x2e/0xa0
>>>>>>>>>     [<ffffffffb87543d9>] class_create+0x29/0x80
>>>>>>>>>     [<ffffffffb8880d24>] usb_register_dev+0x1d4/0x2e0
>>>>>>>>
>>>>>>>> As the call to class_create() in this path is now gone in 6.4-rc1, can
>>>>>>>> you retry that release to see if this is still there or not?
>>>>>>>
>>>>>>> No, wait, it's still there, I was looking at a development branch of
>>>>>>> mine that isn't sent upstream yet. And syzbot just reported the same
>>>>>>> thing:
>>>>>>> https://lore.kernel.org/r/[email protected]
>>>>>>>
>>>>>>> So something's wrong here, let me dig into it tomorrow when I get a
>>>>>>> chance...
>>>>>>
>>>>>> If this could help, here is the bisect of the bug (I could not discern what
>>>>>> could possibly be wrong):
>>>>>>
>>>>>> user@host:~/linux/kernel/linux_torvalds$ git bisect log
>>>>>> git bisect start
>>>>>> # bad: [ac9a78681b921877518763ba0e89202254349d1b] Linux 6.4-rc1
>>>>>> git bisect bad ac9a78681b921877518763ba0e89202254349d1b
>>>>>> # good: [c9c3395d5e3dcc6daee66c6908354d47bf98cb0c] Linux 6.2
>>>>>> git bisect good c9c3395d5e3dcc6daee66c6908354d47bf98cb0c
>>>>>> # good: [85496c9b3bf8dbe15e2433d3a0197954d323cadc] Merge branch
>>>>>> 'net-remove-some-rcu_bh-cruft'
>>>>>> git bisect good 85496c9b3bf8dbe15e2433d3a0197954d323cadc
>>>>>> # good: [b68ee1c6131c540a62ecd443be89c406401df091] Merge tag 'scsi-misc' of
>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
>>>>>> git bisect good b68ee1c6131c540a62ecd443be89c406401df091
>>>>>> # bad: [888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0] Merge tag 'sysctl-6.4-rc1'
>>>>>> of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
>>>>>> git bisect bad 888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0
>>>>>> # good: [34b62f186db9614e55d021f8c58d22fc44c57911] Merge tag
>>>>>> 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
>>>>>> git bisect good 34b62f186db9614e55d021f8c58d22fc44c57911
>>>>>> # good: [34da76dca4673ab1819830b4924bb5b436325b26] Merge tag
>>>>>> 'for-linus-2023042601' of
>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
>>>>>> git bisect good 34da76dca4673ab1819830b4924bb5b436325b26
>>>>>> # good: [97b2ff294381d05e59294a931c4db55276470cb5] Merge tag
>>>>>> 'staging-6.4-rc1' of
>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
>>>>>> git bisect good 97b2ff294381d05e59294a931c4db55276470cb5
>>>>>> # good: [2025b2ca8004c04861903d076c67a73a0ec6dfca] mcb-lpc: Reallocate
>>>>>> memory region to avoid memory overlapping
>>>>>> git bisect good 2025b2ca8004c04861903d076c67a73a0ec6dfca
>>>>>> # bad: [d06f5a3f7140921ada47d49574ae6fa4de5e2a89] cdx: fix build failure due
>>>>>> to sysfs 'bus_type' argument needing to be const
>>>>>> git bisect bad d06f5a3f7140921ada47d49574ae6fa4de5e2a89
>>>>>> # good: [dcfbb67e48a2becfce7990386e985b9c45098ee5] driver core: class: use
>>>>>> lock_class_key already present in struct subsys_private
>>>>>> git bisect good dcfbb67e48a2becfce7990386e985b9c45098ee5
>>>>>> # bad: [6f14c02220c791d5c46b0f965b9340c58f3d503d] driver core: create
>>>>>> class_is_registered()
>>>>>> git bisect bad 6f14c02220c791d5c46b0f965b9340c58f3d503d
>>>>>> # good: [2f9e87f5a2941b259336c7ea6c5a1499ede4554a] driver core: Add a
>>>>>> comment to set_primary_fwnode() on nullifying
>>>>>> git bisect good 2f9e87f5a2941b259336c7ea6c5a1499ede4554a
>>>>>> # bad: [02fe26f25325b547b7a31a65deb0326c04bb5174] firmware_loader: Add debug
>>>>>> message with checksum for FW file
>>>>>> git bisect bad 02fe26f25325b547b7a31a65deb0326c04bb5174
>>>>>> # good: [884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82] driver core: class:
>>>>>> implement class_get/put without the private pointer.
>>>>>> git bisect good 884f8ce42ccec9d0bf11d8bf9f111e5961ca1c82
>>>>>> # bad: [3f84aa5ec052dba960baca4ab8a352d43d47028e] base: soc: populate
>>>>>> machine name in soc_device_register if empty
>>>>>> git bisect bad 3f84aa5ec052dba960baca4ab8a352d43d47028e
>>>>>> # bad: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core: class.c:
>>>>>> convert to only use class_to_subsys
>>>>>> git bisect bad 7b884b7f24b42fa25e92ed724ad82f137610afaf
>>>>>> # first bad commit: [7b884b7f24b42fa25e92ed724ad82f137610afaf] driver core:
>>>>>> class.c: convert to only use class_to_subsys
>>>>>> user@host:~/linux/kernel/linux_torvalds$
>>>>>
>>>>> This helps a lot, thanks. I got the reference counting wrong somewhere
>>>>> in here, I thought I tested this better, odd it shows up now...
>>>>>
>>>>> I'll try to work on it this week.
>>>>
>>>> I have figured out that the leak occurs on keyboard unplugging only, one
>>>> or two leaks (maybe a race condition?).
>>>>
>>>> Please NOTE that the number of leaks is now odd:
>>>>
>>>> root@defiant:/home/marvin# cat /sys/kernel/debug/kmemleak | grep comm
>>>> comm "systemd-udevd", pid 330, jiffies 4294892588 (age 715.772s)
>>>> comm "systemd-udevd", pid 330, jiffies 4294892588 (age 715.772s)
>>>> comm "kworker/6:0", pid 54, jiffies 4294907989 (age 654.224s)
>>>> comm "kworker/6:0", pid 54, jiffies 4294907989 (age 654.272s)
>>>> comm "kworker/6:3", pid 3046, jiffies 4294935362 (age 544.780s)
>>>> comm "kworker/6:0", pid 54, jiffies 4294964122 (age 429.740s)
>>>> comm "kworker/6:0", pid 54, jiffies 4294964122 (age 429.784s)
>>>> root@defiant:/home/marvin#
>>>>
>>>> At one time unplugging keyboard generated only one leak, but only at one
>>>> time. As it requires manually unplugging keyboard, I didn't seem to find a
>>>> way to automate it, but it doesn't seem to require root access.
>>>>
>>>> BTW, I've seen in syzbot output that kmemleak output has debug source file
>>>> names and line numbers. I couldn't make that work with the dbg .deb.
>>>>
>>>> I will do some more homework, but this was a rough week.
>>>
>>> I made up a patch based on code inspection alone, as I couldn't
>>> reproduce this locally at all:
>>> https://lore.kernel.org/r/2023051628-thumb-boaster-5680@gregkh
>>> and it seemed to pass syzbot's tests.
>>>
>>> I've included it here below, can you test it as well?
>>>
>>> Hm, I only tested with a USB mouse unplug/plug cycle, maybe the issue is
>>> a keyboard?
>>>
>>> thanks,
>>>
>>> greg k-h
>>>
>>> -------------
>>>
>>> diff --git a/drivers/base/class.c b/drivers/base/class.c
>>> index ac1808d1a2e8..9b44edc8416f 100644
>>> --- a/drivers/base/class.c
>>> +++ b/drivers/base/class.c
>>> @@ -320,6 +322,7 @@ void class_dev_iter_init(struct class_dev_iter *iter, const struct class *class,
>>> start_knode = &start->p->knode_class;
>>> klist_iter_init_node(&sp->klist_devices, &iter->ki, start_knode);
>>> iter->type = type;
>>> + iter->sp = sp;
>>> }
>>> EXPORT_SYMBOL_GPL(class_dev_iter_init);
>>> @@ -361,6 +364,7 @@ EXPORT_SYMBOL_GPL(class_dev_iter_next);
>>> void class_dev_iter_exit(struct class_dev_iter *iter)
>>> {
>>> klist_iter_exit(&iter->ki);
>>> + subsys_put(iter->sp);
>>> }
>>> EXPORT_SYMBOL_GPL(class_dev_iter_exit);
>>> diff --git a/include/linux/device/class.h b/include/linux/device/class.h
>>> index 9deeaeb457bb..abf3d3bfb6fe 100644
>>> --- a/include/linux/device/class.h
>>> +++ b/include/linux/device/class.h
>>> @@ -74,6 +74,7 @@ struct class {
>>> struct class_dev_iter {
>>> struct klist_iter ki;
>>> const struct device_type *type;
>>> + struct subsys_private *sp;
>>> };
>>> int __must_check class_register(const struct class *class);
>>
>> The build with the latest 6.4-rc2 and without this patch still leaked,
>> the build with the same commit and this patch applied was successful:
>>
>> root@defiant:/home/marvin# cat /sys/kernel/debug/kmemleak
>> root@defiant:/home/marvin#
>>
>> Tried three times, and it is a OK.
>>
>> Congratulations! This had fixed the leak.
>
> Wonderful, thanks for testing, can I add your "Tested-by:" to it?

Don't mention it. Tested-by: is fine, certainly.

>> I wonder why it didn't show in the other contexts, hardware and archs?
>
> It might depend on your keyboard if it has other things on it? I don't
> know, sorry, I didn't spend much time digging after I found the "obvious
> leak" based on the bisection you provided, which was very very helpful,
> thanks for that.

It looks like a rather normal Genius keyboard, which the system detected
as "Chicony".

Maybe it is the motherboard or the controller? Did I send the lshw.txt?

Yes, I did.

I can't tell which one of these it is:

root@defiant:/home/marvin# lspci -k | grep -B2 xhci
15:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 43f7 (rev 01)
Subsystem: ASMedia Technology Inc. Device 1142
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
--
17:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 43f7 (rev 01)
Subsystem: ASMedia Technology Inc. Device 1142
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
--
1a:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15b6
Subsystem: ASRock Incorporation Device 15b6
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
1a:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15b7
Subsystem: ASRock Incorporation Device 15b6
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
--
1b:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15b8
Subsystem: ASRock Incorporation Device 15b6
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
root@defiant:/home/marvin#

As for the bisection, no need to thank me; I was actually happy to test
the new hardware.

Before, a bisect that big would have taken 15 hours.

The box did overheat slightly (91 °C), though. There probably ought to
be a way to recompile only the changed sources after a "git checkout";
that would speed up many bisects, BTW.

> And leaks are hard to notice, especially ones that only show up when you
> remove a specific type of device.

Well, the leaks are rather harmless, but they show some inconsistency in
the code, so I took them seriously.

> thanks again for your help here,

Not at all, I hope to find some "real" bugs. :-)

> greg k-h

Best regards,
Mirsad
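
As an aside on checking the report by hand: each kmemleak record in the
log quoted above starts with a line beginning "unreferenced object", so
the number of reported leaks can be counted mechanically. This is a
minimal sketch, assuming the report has already been opened as a stdio
stream. The function name count_unreferenced() is hypothetical and is not
part of any kernel tooling.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Count kmemleak records in a report stream by counting the lines that
 * begin with "unreferenced object ". Lines longer than the buffer are
 * handled by only testing the chunk that starts a new line.
 */
int count_unreferenced(FILE *fp)
{
	const char *marker = "unreferenced object ";
	char line[512];
	int at_line_start = 1;
	int count = 0;

	assert(fp != NULL);

	while (fgets(line, sizeof(line), fp)) {
		if (at_line_start &&
		    strncmp(line, marker, strlen(marker)) == 0)
			count++;
		/* No newline in this chunk means the line continues. */
		at_line_start = (strchr(line, '\n') != NULL);
	}
	return count;
}
```

On a kernel built with kmemleak, the stream would come from opening
/sys/kernel/debug/kmemleak for reading as root, after writing "scan" to the
same file to trigger a scan. An empty report, as in the patched build
above, gives a count of zero.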

2023-05-18 08:16:18

by Mirsad Todorovac

Subject: Re: [BUG][NEW DATA] Kmemleak, possibly hiddev_connect(), in 6.3.0+ torvalds tree commit gfc4354c6e5c2

On 5/17/23 20:57, Greg Kroah-Hartman wrote:

> And leaks are hard to notice, especially ones that only show up when you
> remove a specific type of device.
>
> thanks again for your help here,

I feel more like a distraction from the real issues than a help.

Memory leaks seem easy to detect; however, building with KMEMLEAK
debugging on can take up to 50-67% of system time, as I noticed
a couple of days ago ...

It obviously incurs some overhead. I did not expect kernel compilation,
as a computation-heavy process, to be affected so much by memory object
debugging.

Best regards,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia

"What’s this thing suddenly coming towards me very fast? Very very fast.
... I wonder if it will be friends with me?"