2021-10-04 09:38:18

by Kent Gibson

[permalink] [raw]
Subject: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

Hi,

I'm seeing a refcount underflow when I unload the gpio-mockup module on
Linux v5.15-rc4 (and going back to v5.15-rc1):

# modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
# rmmod gpio-mockup
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
Modules linked in: gpio_mockup(-)
CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
EIP: refcount_warn_saturate+0xd1/0x120
Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
Call Trace:
kobject_put+0xdc/0xf0
software_node_notify_remove+0xa8/0xc0
device_del+0x15a/0x3e0
? kfree_const+0xf/0x30
? kobject_put+0xa6/0xf0
? module_remove_driver+0x73/0xa0
platform_device_del.part.0+0xf/0x80
platform_device_unregister+0x19/0x40
gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
__ia32_sys_delete_module+0x137/0x1e0
? task_work_run+0x61/0x90
? exit_to_user_mode_prepare+0x1b5/0x1c0
__do_fast_syscall_32+0x50/0xc0
do_fast_syscall_32+0x32/0x70
do_SYSENTER_32+0x15/0x20
entry_SYSENTER_32+0x98/0xe7
EIP: 0xb7eda549
Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
---[ end trace 3d71387f54bc2d06 ]---

I suspect this is related to the recent changes to swnode.c or
platform.c, as gpio-mockup hasn't changed, but haven't had the
chance to debug further.

Cheers,
Kent.


2021-10-04 09:45:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> Hi,
>
> I'm seeing a refcount underflow when I unload the gpio-mockup module on
> Linux v5.15-rc4 (and going back to v5.15-rc1):
>
> # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> # rmmod gpio-mockup
> ------------[ cut here ]------------
> refcount_t: underflow; use-after-free.
> WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> Modules linked in: gpio_mockup(-)
> CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> EIP: refcount_warn_saturate+0xd1/0x120
> Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> Call Trace:
> kobject_put+0xdc/0xf0
> software_node_notify_remove+0xa8/0xc0
> device_del+0x15a/0x3e0
> ? kfree_const+0xf/0x30
> ? kobject_put+0xa6/0xf0
> ? module_remove_driver+0x73/0xa0
> platform_device_del.part.0+0xf/0x80
> platform_device_unregister+0x19/0x40
> gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
> gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
> __ia32_sys_delete_module+0x137/0x1e0
> ? task_work_run+0x61/0x90
> ? exit_to_user_mode_prepare+0x1b5/0x1c0
> __do_fast_syscall_32+0x50/0xc0
> do_fast_syscall_32+0x32/0x70
> do_SYSENTER_32+0x15/0x20
> entry_SYSENTER_32+0x98/0xe7
> EIP: 0xb7eda549
> Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> ---[ end trace 3d71387f54bc2d06 ]---
>
> I suspect this is related to the recent changes to swnode.c or
> platform.c, as gpio-mockup hasn't changed, but haven't had the
> chance to debug further.

Any chance you can run 'git bisect' for this?

thanks,

greg k-h

2021-10-04 10:11:08

by Kent Gibson

[permalink] [raw]
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > Hi,
> >
> > I'm seeing a refcount underflow when I unload the gpio-mockup module on
> > Linux v5.15-rc4 (and going back to v5.15-rc1):
> >
> > # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> > # rmmod gpio-mockup
> > ------------[ cut here ]------------
> > refcount_t: underflow; use-after-free.
> > WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> > Modules linked in: gpio_mockup(-)
> > CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > EIP: refcount_warn_saturate+0xd1/0x120
> > Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> > EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> > ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> > CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> > Call Trace:
> > kobject_put+0xdc/0xf0
> > software_node_notify_remove+0xa8/0xc0
> > device_del+0x15a/0x3e0
> > ? kfree_const+0xf/0x30
> > ? kobject_put+0xa6/0xf0
> > ? module_remove_driver+0x73/0xa0
> > platform_device_del.part.0+0xf/0x80
> > platform_device_unregister+0x19/0x40
> > gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
> > gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
> > __ia32_sys_delete_module+0x137/0x1e0
> > ? task_work_run+0x61/0x90
> > ? exit_to_user_mode_prepare+0x1b5/0x1c0
> > __do_fast_syscall_32+0x50/0xc0
> > do_fast_syscall_32+0x32/0x70
> > do_SYSENTER_32+0x15/0x20
> > entry_SYSENTER_32+0x98/0xe7
> > EIP: 0xb7eda549
> > Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> > EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> > ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> > DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> > ---[ end trace 3d71387f54bc2d06 ]---
> >
> > I suspect this is related to the recent changes to swnode.c or
> > platform.c, as gpio-mockup hasn't changed, but haven't had the
> > chance to debug further.
>
> Any chance you can run 'git bisect' for this?
>

Will do - just wanted to get the problem out there as a heads-up and in
case anyone had some immediate insight.

Cheers,
Kent.

2021-10-04 12:21:24

by Kent Gibson

[permalink] [raw]
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > Hi,
> >
> > I'm seeing a refcount underflow when I unload the gpio-mockup module on
> > Linux v5.15-rc4 (and going back to v5.15-rc1):
> >
> > # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> > # rmmod gpio-mockup
> > ------------[ cut here ]------------
> > refcount_t: underflow; use-after-free.
> > WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> > Modules linked in: gpio_mockup(-)
> > CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > EIP: refcount_warn_saturate+0xd1/0x120
> > Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> > EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> > ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> > CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> > Call Trace:
> > kobject_put+0xdc/0xf0
> > software_node_notify_remove+0xa8/0xc0
> > device_del+0x15a/0x3e0
> > ? kfree_const+0xf/0x30
> > ? kobject_put+0xa6/0xf0
> > ? module_remove_driver+0x73/0xa0
> > platform_device_del.part.0+0xf/0x80
> > platform_device_unregister+0x19/0x40
> > gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
> > gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
> > __ia32_sys_delete_module+0x137/0x1e0
> > ? task_work_run+0x61/0x90
> > ? exit_to_user_mode_prepare+0x1b5/0x1c0
> > __do_fast_syscall_32+0x50/0xc0
> > do_fast_syscall_32+0x32/0x70
> > do_SYSENTER_32+0x15/0x20
> > entry_SYSENTER_32+0x98/0xe7
> > EIP: 0xb7eda549
> > Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> > EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> > ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> > DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> > ---[ end trace 3d71387f54bc2d06 ]---
> >
> > I suspect this is related to the recent changes to swnode.c or
> > platform.c, as gpio-mockup hasn't changed, but haven't had the
> > chance to debug further.
>
> Any chance you can run 'git bisect' for this?
>

That results in:

bd1e336aa8535a99f339e2d66a611984262221ce is the first bad commit
commit bd1e336aa8535a99f339e2d66a611984262221ce
Author: Heikki Krogerus <[email protected]>
Date: Tue Aug 17 13:24:49 2021 +0300

driver core: platform: Remove platform_device_add_properties()

Cheers,
Kent.

2021-10-04 12:32:48

by Heikki Krogerus

[permalink] [raw]
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > Hi,
> > >
> > > I'm seeing a refcount underflow when I unload the gpio-mockup module on
> > > Linux v5.15-rc4 (and going back to v5.15-rc1):
> > >
> > > # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> > > # rmmod gpio-mockup
> > > ------------[ cut here ]------------
> > > refcount_t: underflow; use-after-free.
> > > WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> > > Modules linked in: gpio_mockup(-)
> > > CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > > EIP: refcount_warn_saturate+0xd1/0x120
> > > Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> > > EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> > > ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> > > CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> > > Call Trace:
> > > kobject_put+0xdc/0xf0
> > > software_node_notify_remove+0xa8/0xc0
> > > device_del+0x15a/0x3e0
> > > ? kfree_const+0xf/0x30
> > > ? kobject_put+0xa6/0xf0
> > > ? module_remove_driver+0x73/0xa0
> > > platform_device_del.part.0+0xf/0x80
> > > platform_device_unregister+0x19/0x40
> > > gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
> > > gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
> > > __ia32_sys_delete_module+0x137/0x1e0
> > > ? task_work_run+0x61/0x90
> > > ? exit_to_user_mode_prepare+0x1b5/0x1c0
> > > __do_fast_syscall_32+0x50/0xc0
> > > do_fast_syscall_32+0x32/0x70
> > > do_SYSENTER_32+0x15/0x20
> > > entry_SYSENTER_32+0x98/0xe7
> > > EIP: 0xb7eda549
> > > Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> > > EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> > > ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> > > DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> > > ---[ end trace 3d71387f54bc2d06 ]---
> > >
> > > I suspect this is related to the recent changes to swnode.c or
> > > platform.c, as gpio-mockup hasn't changed, but haven't had the
> > > chance to debug further.
> >
> > Any chance you can run 'git bisect' for this?
> >
>
> That results in:
>
> bd1e336aa8535a99f339e2d66a611984262221ce is the first bad commit
> commit bd1e336aa8535a99f339e2d66a611984262221ce
> Author: Heikki Krogerus <[email protected]>
> Date: Tue Aug 17 13:24:49 2021 +0300
>
> driver core: platform: Remove platform_device_add_properties()

Can you test does this patch help:
https://lore.kernel.org/all/[email protected]/

thanks,

--
heikki

2021-10-04 14:09:27

by Kent Gibson

[permalink] [raw]
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

On Mon, Oct 04, 2021 at 03:30:43PM +0300, Heikki Krogerus wrote:
> On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> > On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > > Hi,
> > > >
> > > > I'm seeing a refcount underflow when I unload the gpio-mockup module on
> > > > Linux v5.15-rc4 (and going back to v5.15-rc1):
> > > >
> > > > # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> > > > # rmmod gpio-mockup
> > > > ------------[ cut here ]------------
> > > > refcount_t: underflow; use-after-free.
> > > > WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> > > > Modules linked in: gpio_mockup(-)
> > > > CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > > > EIP: refcount_warn_saturate+0xd1/0x120
> > > > Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> > > > EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> > > > ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> > > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> > > > CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> > > > Call Trace:
> > > > kobject_put+0xdc/0xf0
> > > > software_node_notify_remove+0xa8/0xc0
> > > > device_del+0x15a/0x3e0
> > > > ? kfree_const+0xf/0x30
> > > > ? kobject_put+0xa6/0xf0
> > > > ? module_remove_driver+0x73/0xa0
> > > > platform_device_del.part.0+0xf/0x80
> > > > platform_device_unregister+0x19/0x40
> > > > gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
> > > > gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
> > > > __ia32_sys_delete_module+0x137/0x1e0
> > > > ? task_work_run+0x61/0x90
> > > > ? exit_to_user_mode_prepare+0x1b5/0x1c0
> > > > __do_fast_syscall_32+0x50/0xc0
> > > > do_fast_syscall_32+0x32/0x70
> > > > do_SYSENTER_32+0x15/0x20
> > > > entry_SYSENTER_32+0x98/0xe7
> > > > EIP: 0xb7eda549
> > > > Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> > > > EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> > > > ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> > > > DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> > > > ---[ end trace 3d71387f54bc2d06 ]---
> > > >
> > > > I suspect this is related to the recent changes to swnode.c or
> > > > platform.c, as gpio-mockup hasn't changed, but haven't had the
> > > > chance to debug further.
> > >
> > > Any chance you can run 'git bisect' for this?
> > >
> >
> > That results in:
> >
> > bd1e336aa8535a99f339e2d66a611984262221ce is the first bad commit
> > commit bd1e336aa8535a99f339e2d66a611984262221ce
> > Author: Heikki Krogerus <[email protected]>
> > Date: Tue Aug 17 13:24:49 2021 +0300
> >
> > driver core: platform: Remove platform_device_add_properties()
>
> Can you test does this patch help:
> https://lore.kernel.org/all/[email protected]/
>

You sure that is the patch you have in mind? It only removes dead code,
so I don't see how that would help. And it isn't quite dead either -
drivers/pci/quirks.c is still using device_add_properties(), so it won't
build.

Looking at the offending patch, it effectively replaces a call to
device_add_properties() with one to
device_create_managed_software_node(), and those two functions appear
quite different - at least at first glance.
Is that correct?

Cheers,
Kent.

2021-10-04 17:52:28

by Kent Gibson

[permalink] [raw]
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

On Mon, Oct 04, 2021 at 10:17:54PM +0800, Kent Gibson wrote:
> On Mon, Oct 04, 2021 at 04:20:55PM +0300, Heikki Krogerus wrote:
> > On Mon, Oct 04, 2021 at 08:47:01PM +0800, Kent Gibson wrote:
> > > On Mon, Oct 04, 2021 at 03:30:43PM +0300, Heikki Krogerus wrote:
> > > > On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> > > > > On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > > > > > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > > > > > Hi,
> > > > > > >

[snip]

> > > Looking at the offending patch, it effectively replaces a call to
> > > device_add_properties() with one to
> > > device_create_managed_software_node(), and those two functions appear
> > > quite different - at least at first glance.
> > > Is that correct?
> >
> > The only real difference between the two functions is that
> > device_create_managed_software_node() marks the software node it
> > creates (and it does it exactly the same way as
> > device_add_properties()) as "managed" with a specific flag.
> >
>

That managed flag makes all the difference.

I've tried to find a fix along the same lines as Laurentiu Tudor's
5aeb05b27f8 software node: balance refcount for managed software nodes
but haven't found anything that works.

What does work for me is to revert the call to
device_create_managed_software_node() to a call to
device_add_properties():

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 652531f67135..2f30bdb94fab 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -826,8 +826,7 @@ struct platform_device *platform_device_register_full(
goto err;

if (pdevinfo->properties) {
- ret = device_create_managed_software_node(&pdev->dev,
- pdevinfo->properties, NULL);
+ ret = device_add_properties(&pdev->dev, pdevinfo->properties);
if (ret)
goto err;
}

That obviously wont work with your latest series that removes
device_add_properties(), but that is the simplest, and only, solution
that I've found so far.

Cheers,
Kent.

2021-10-04 20:58:39

by Kent Gibson

[permalink] [raw]
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

On Mon, Oct 04, 2021 at 04:20:55PM +0300, Heikki Krogerus wrote:
> On Mon, Oct 04, 2021 at 08:47:01PM +0800, Kent Gibson wrote:
> > On Mon, Oct 04, 2021 at 03:30:43PM +0300, Heikki Krogerus wrote:
> > > On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> > > > On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > > > > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I'm seeing a refcount underflow when I unload the gpio-mockup module on
> > > > > > Linux v5.15-rc4 (and going back to v5.15-rc1):
> > > > > >
> > > > > > # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> > > > > > # rmmod gpio-mockup
> > > > > > ------------[ cut here ]------------
> > > > > > refcount_t: underflow; use-after-free.
> > > > > > WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> > > > > > Modules linked in: gpio_mockup(-)
> > > > > > CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> > > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > > > > > EIP: refcount_warn_saturate+0xd1/0x120
> > > > > > Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> > > > > > EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> > > > > > ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> > > > > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> > > > > > CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> > > > > > Call Trace:
> > > > > > kobject_put+0xdc/0xf0
> > > > > > software_node_notify_remove+0xa8/0xc0
> > > > > > device_del+0x15a/0x3e0
> > > > > > ? kfree_const+0xf/0x30
> > > > > > ? kobject_put+0xa6/0xf0
> > > > > > ? module_remove_driver+0x73/0xa0
> > > > > > platform_device_del.part.0+0xf/0x80
> > > > > > platform_device_unregister+0x19/0x40
> > > > > > gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
> > > > > > gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
> > > > > > __ia32_sys_delete_module+0x137/0x1e0
> > > > > > ? task_work_run+0x61/0x90
> > > > > > ? exit_to_user_mode_prepare+0x1b5/0x1c0
> > > > > > __do_fast_syscall_32+0x50/0xc0
> > > > > > do_fast_syscall_32+0x32/0x70
> > > > > > do_SYSENTER_32+0x15/0x20
> > > > > > entry_SYSENTER_32+0x98/0xe7
> > > > > > EIP: 0xb7eda549
> > > > > > Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> > > > > > EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> > > > > > ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> > > > > > DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> > > > > > ---[ end trace 3d71387f54bc2d06 ]---
> > > > > >
> > > > > > I suspect this is related to the recent changes to swnode.c or
> > > > > > platform.c, as gpio-mockup hasn't changed, but haven't had the
> > > > > > chance to debug further.
> > > > >
> > > > > Any chance you can run 'git bisect' for this?
> > > > >
> > > >
> > > > That results in:
> > > >
> > > > bd1e336aa8535a99f339e2d66a611984262221ce is the first bad commit
> > > > commit bd1e336aa8535a99f339e2d66a611984262221ce
> > > > Author: Heikki Krogerus <[email protected]>
> > > > Date: Tue Aug 17 13:24:49 2021 +0300
> > > >
> > > > driver core: platform: Remove platform_device_add_properties()
> > >
> > > Can you test does this patch help:
> > > https://lore.kernel.org/all/[email protected]/
> > >
> >
> > You sure that is the patch you have in mind? It only removes dead code,
> > so I don't see how that would help. And it isn't quite dead either -
> > drivers/pci/quirks.c is still using device_add_properties(), so it won't
> > build.
>
> Right, so can you test with the whole series that patch is part of?
>

Well, you could've said that to start with ;-).

> > Looking at the offending patch, it effectively replaces a call to
> > device_add_properties() with one to
> > device_create_managed_software_node(), and those two functions appear
> > quite different - at least at first glance.
> > Is that correct?
>
> The only real difference between the two functions is that
> device_create_managed_software_node() marks the software node it
> creates (and it does it exactly the same way as
> device_add_properties()) as "managed" with a specific flag.
>

Yeah, my bad - not sure what function I was looking at but it wasn't
device_create_managed_software_node().

> It means that when the device is removed, so is the software node.
> It happens when device_del() calls device_platform_notify_remove(),
> which then calls software_node_notify_remove().
>
> The problem is that after doing that step, device_del() then calls
> device_remove_properties() unconditionally which also attempts to
> remove the software node. So you end up doing the same thing twice.
>
> So the code in the patch that we're interested, and that I would like
> you to test, is this:
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 938cfcd1674eb..152a611a7e9ca 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -3583,7 +3583,6 @@ void device_del(struct device *dev)
> device_pm_remove(dev);
> driver_deferred_probe_del(dev);
> device_platform_notify_remove(dev);
> - device_remove_properties(dev);
> device_links_purge(dev);
>
> if (dev->bus)
>

Makes sense.
Good news is the patch built.
Bad news is I still get the same result:

------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 0 PID: 98 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x150
Modules linked in: gpio_mockup(-)
CPU: 0 PID: 98 Comm: rmmod Not tainted 5.15.0-rc4 #1
Hardware name: linux,dummy-virt (DT)
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : refcount_warn_saturate+0xf4/0x150
lr : refcount_warn_saturate+0xf4/0x150
sp : ffffffc010be3c50
x29: ffffffc010be3c50 x28: ffffff8001fa5780 x27: 0000000000000000
x26: 0000000000000000 x25: 0000000000000000 x24: ffffffc010a14518
x23: 0000000000000000 x22: ffffffc010a13798 x21: ffffffc010a7a480
x20: ffffff8002126c10 x19: ffffff8002146600 x18: fffffffffffe25f8
x17: 666f733d4d455453 x16: 5953425553003065 x15: ffffffc01098ac80
x14: fffffffffffc25f7 x13: 2e656572662d7265 x12: ffffffc01098acd0
x11: 0000000000000093 x10: 6e75203a745f746e x9 : 00000000ffffefff
x8 : ffffffc0109e2cd0 x7 : 0000000000017fe8 x6 : 0000000000000001
x5 : ffffff8007f8fa68 x4 : 0000000000000000 x3 : 0000000000000027
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff8001fa5780
Call trace:
refcount_warn_saturate+0xf4/0x150
kobject_put+0xf4/0x110
software_node_notify_remove+0xc4/0xe0
device_del+0x18c/0x430
platform_device_del.part.0+0x1c/0x90
platform_device_unregister+0x28/0x50
gpio_mockup_unregister_pdevs+0x28/0x50 [gpio_mockup]
gpio_mockup_exit+0x28/0x3c0 [gpio_mockup]
__arm64_sys_delete_module+0x180/0x200
invoke_syscall+0x54/0x130
el0_svc_common.constprop.0+0x44/0xf0
do_el0_svc+0x40/0xa0
el0_svc+0x20/0x60
el0t_64_sync_handler+0x1a4/0x1b0
el0t_64_sync+0x1a0/0x1a4
---[ end trace 44421b7a22d450dd ]---


Cheers,
Kent.

2021-10-04 21:35:45

by Heikki Krogerus

[permalink] [raw]
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

On Mon, Oct 04, 2021 at 08:47:01PM +0800, Kent Gibson wrote:
> On Mon, Oct 04, 2021 at 03:30:43PM +0300, Heikki Krogerus wrote:
> > On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> > > On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > > > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > > > Hi,
> > > > >
> > > > > I'm seeing a refcount underflow when I unload the gpio-mockup module on
> > > > > Linux v5.15-rc4 (and going back to v5.15-rc1):
> > > > >
> > > > > # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> > > > > # rmmod gpio-mockup
> > > > > ------------[ cut here ]------------
> > > > > refcount_t: underflow; use-after-free.
> > > > > WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> > > > > Modules linked in: gpio_mockup(-)
> > > > > CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > > > > EIP: refcount_warn_saturate+0xd1/0x120
> > > > > Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> > > > > EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> > > > > ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> > > > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> > > > > CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> > > > > Call Trace:
> > > > > kobject_put+0xdc/0xf0
> > > > > software_node_notify_remove+0xa8/0xc0
> > > > > device_del+0x15a/0x3e0
> > > > > ? kfree_const+0xf/0x30
> > > > > ? kobject_put+0xa6/0xf0
> > > > > ? module_remove_driver+0x73/0xa0
> > > > > platform_device_del.part.0+0xf/0x80
> > > > > platform_device_unregister+0x19/0x40
> > > > > gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
> > > > > gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
> > > > > __ia32_sys_delete_module+0x137/0x1e0
> > > > > ? task_work_run+0x61/0x90
> > > > > ? exit_to_user_mode_prepare+0x1b5/0x1c0
> > > > > __do_fast_syscall_32+0x50/0xc0
> > > > > do_fast_syscall_32+0x32/0x70
> > > > > do_SYSENTER_32+0x15/0x20
> > > > > entry_SYSENTER_32+0x98/0xe7
> > > > > EIP: 0xb7eda549
> > > > > Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> > > > > EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> > > > > ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> > > > > DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> > > > > ---[ end trace 3d71387f54bc2d06 ]---
> > > > >
> > > > > I suspect this is related to the recent changes to swnode.c or
> > > > > platform.c, as gpio-mockup hasn't changed, but haven't had the
> > > > > chance to debug further.
> > > >
> > > > Any chance you can run 'git bisect' for this?
> > > >
> > >
> > > That results in:
> > >
> > > bd1e336aa8535a99f339e2d66a611984262221ce is the first bad commit
> > > commit bd1e336aa8535a99f339e2d66a611984262221ce
> > > Author: Heikki Krogerus <[email protected]>
> > > Date: Tue Aug 17 13:24:49 2021 +0300
> > >
> > > driver core: platform: Remove platform_device_add_properties()
> >
> > Can you test does this patch help:
> > https://lore.kernel.org/all/[email protected]/
> >
>
> You sure that is the patch you have in mind? It only removes dead code,
> so I don't see how that would help. And it isn't quite dead either -
> drivers/pci/quirks.c is still using device_add_properties(), so it won't
> build.

Right, so can you test with the whole series that patch is part of?

> Looking at the offending patch, it effectively replaces a call to
> device_add_properties() with one to
> device_create_managed_software_node(), and those two functions appear
> quite different - at least at first glance.
> Is that correct?

The only real difference between the two functions is that
device_create_managed_software_node() marks the software node it
creates (and it does it exactly the same way as
device_add_properties()) as "managed" with a specific flag.

It means that when the device is removed, so is the software node.
It happens when device_del() calls device_platform_notify_remove(),
which then calls software_node_notify_remove().

The problem is that after doing that step, device_del() then calls
device_remove_properties() unconditionally which also attempts to
remove the software node. So you end up doing the same thing twice.

So the code in the patch that we're interested, and that I would like
you to test, is this:

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 938cfcd1674eb..152a611a7e9ca 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -3583,7 +3583,6 @@ void device_del(struct device *dev)
device_pm_remove(dev);
driver_deferred_probe_del(dev);
device_platform_notify_remove(dev);
- device_remove_properties(dev);
device_links_purge(dev);

if (dev->bus)


thanks,

--
heikki

2021-10-04 23:42:22

by Andy Shevchenko

[permalink] [raw]
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

On Mon, Oct 4, 2021 at 8:51 PM Kent Gibson <[email protected]> wrote:
>
> On Mon, Oct 04, 2021 at 10:17:54PM +0800, Kent Gibson wrote:
> > On Mon, Oct 04, 2021 at 04:20:55PM +0300, Heikki Krogerus wrote:
> > > On Mon, Oct 04, 2021 at 08:47:01PM +0800, Kent Gibson wrote:
> > > > On Mon, Oct 04, 2021 at 03:30:43PM +0300, Heikki Krogerus wrote:
> > > > > On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> > > > > > On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > > > > > > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > > > > > > Hi,
> > > > > > > >
>
> [snip]
>
> > > > Looking at the offending patch, it effectively replaces a call to
> > > > device_add_properties() with one to
> > > > device_create_managed_software_node(), and those two functions appear
> > > > quite different - at least at first glance.
> > > > Is that correct?
> > >
> > > The only real difference between the two functions is that
> > > device_create_managed_software_node() marks the software node it
> > > creates (and it does it exactly the same way as
> > > device_add_properties()) as "managed" with a specific flag.
> > >
> >
>
> That managed flag makes all the difference.
>
> I've tried to find a fix along the same lines as Laurentiu Tudor's
> 5aeb05b27f8 software node: balance refcount for managed software nodes
> but haven't found anything that works.
>
> What does work for me is to revert the call to
> device_create_managed_software_node() to a call to
> device_add_properties():
>
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 652531f67135..2f30bdb94fab 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -826,8 +826,7 @@ struct platform_device *platform_device_register_full(
> goto err;
>
> if (pdevinfo->properties) {
> - ret = device_create_managed_software_node(&pdev->dev,
> - pdevinfo->properties, NULL);
> + ret = device_add_properties(&pdev->dev, pdevinfo->properties);
> if (ret)
> goto err;
> }
>
> That obviously wont work with your latest series that removes
> device_add_properties(), but that is the simplest, and only, solution
> that I've found so far.

Here is debug prints of what happens

# modprobe gpio-mockup gpio_mockup_ranges=-1,10
[ 212.850993] (null): device_create_managed_software_node <<< 0
[ 212.858177] platform gpio-mockup.0: software_node_notify 0 <<<
[ 212.865264] platform gpio-mockup.0: software_node_notify 1 <<< 1
[ 212.874159] gpio-mockup gpio-mockup.0: no of_node; not parsing pinctrl DT
[ 212.882962] gpiochip_find_base: found new base at 840
[ 212.889873] gpio gpiochip3: software_node_notify 0 <<<
[ 212.896164] gpio gpiochip3: software_node_notify 1 <<< 1
[ 212.905099] gpio gpiochip3: (gpio-mockup-A): added GPIO chardev (254:3)
[ 212.913313] gpio gpiochip840: software_node_notify 0 <<<
[ 212.920676] gpio gpiochip3: registered GPIOs 840 to 849 on gpio-mockup-A
[ 212.935601] modprobe (185) used greatest stack depth: 12744 bytes left

As I read it the software node is created for gpio-mockup device and
then _shared_ with the GPIO device, since it's managed it's gone when
GPIO device gone followed by UAF when mockup (platform) device itself
gone. I.o.w. this software node mustn't be managed because it covers
the lifetime of two different objects. The correct fix is to create
manually software node and assign it to the pdev and manually free in
gpio_mockup_unregister_pdevs()/

Tl;DR: it's a bug in gpio-mockup.


--
With Best Regards,
Andy Shevchenko

2021-10-05 00:43:15

by Kent Gibson

[permalink] [raw]
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

On Mon, Oct 04, 2021 at 10:56:00PM +0300, Andy Shevchenko wrote:
> On Mon, Oct 4, 2021 at 8:51 PM Kent Gibson <[email protected]> wrote:
> >
> > On Mon, Oct 04, 2021 at 10:17:54PM +0800, Kent Gibson wrote:
> > > On Mon, Oct 04, 2021 at 04:20:55PM +0300, Heikki Krogerus wrote:
> > > > On Mon, Oct 04, 2021 at 08:47:01PM +0800, Kent Gibson wrote:
> > > > > On Mon, Oct 04, 2021 at 03:30:43PM +0300, Heikki Krogerus wrote:
> > > > > > On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> > > > > > > On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > > > > > > > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > > > > > > > Hi,
> > > > > > > > >
> >
> > [snip]
> >
> > > > > Looking at the offending patch, it effectively replaces a call to
> > > > > device_add_properties() with one to
> > > > > device_create_managed_software_node(), and those two functions appear
> > > > > quite different - at least at first glance.
> > > > > Is that correct?
> > > >
> > > > The only real difference between the two functions is that
> > > > device_create_managed_software_node() marks the software node it
> > > > creates (and it does it exactly the same way as
> > > > device_add_properties()) as "managed" with a specific flag.
> > > >
> > >
> >
> > That managed flag makes all the difference.
> >
> > I've tried to find a fix along the same lines as Laurentiu Tudor's
> > 5aeb05b27f8 software node: balance refcount for managed software nodes
> > but haven't found anything that works.
> >
> > What does work for me is to revert the call to
> > device_create_managed_software_node() to a call to
> > device_add_properties():
> >
> > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > index 652531f67135..2f30bdb94fab 100644
> > --- a/drivers/base/platform.c
> > +++ b/drivers/base/platform.c
> > @@ -826,8 +826,7 @@ struct platform_device *platform_device_register_full(
> > goto err;
> >
> > if (pdevinfo->properties) {
> > - ret = device_create_managed_software_node(&pdev->dev,
> > - pdevinfo->properties, NULL);
> > + ret = device_add_properties(&pdev->dev, pdevinfo->properties);
> > if (ret)
> > goto err;
> > }
> >
> > That obviously wont work with your latest series that removes
> > device_add_properties(), but that is the simplest, and only, solution
> > that I've found so far.
>
> Here is debug prints of what happens
>
> # modprobe gpio-mockup gpio_mockup_ranges=-1,10
> [ 212.850993] (null): device_create_managed_software_node <<< 0
> [ 212.858177] platform gpio-mockup.0: software_node_notify 0 <<<
> [ 212.865264] platform gpio-mockup.0: software_node_notify 1 <<< 1
> [ 212.874159] gpio-mockup gpio-mockup.0: no of_node; not parsing pinctrl DT
> [ 212.882962] gpiochip_find_base: found new base at 840
> [ 212.889873] gpio gpiochip3: software_node_notify 0 <<<
> [ 212.896164] gpio gpiochip3: software_node_notify 1 <<< 1
> [ 212.905099] gpio gpiochip3: (gpio-mockup-A): added GPIO chardev (254:3)
> [ 212.913313] gpio gpiochip840: software_node_notify 0 <<<
> [ 212.920676] gpio gpiochip3: registered GPIOs 840 to 849 on gpio-mockup-A
> [ 212.935601] modprobe (185) used greatest stack depth: 12744 bytes left
>
> As I read it the software node is created for gpio-mockup device and
> then _shared_ with the GPIO device, since it's managed it's gone when
> GPIO device gone followed by UAF when mockup (platform) device itself
> gone. I.o.w. this software node mustn't be managed because it covers
> the lifetime of two different objects. The correct fix is to create
> manually software node and assign it to the pdev and manually free in
> gpio_mockup_unregister_pdevs()/
>
> Tl;DR: it's a bug in gpio-mockup.
>

So the bug is in the behaviour of gpio_mockup_register_chip()?
That is out of my court, so I'll leave it to you and Bart to sort out.

Cheers,
Kent.

2021-10-05 08:24:11

by Andy Shevchenko

[permalink] [raw]
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

On Tue, Oct 05, 2021 at 08:40:35AM +0800, Kent Gibson wrote:
> On Mon, Oct 04, 2021 at 10:56:00PM +0300, Andy Shevchenko wrote:
> > On Mon, Oct 4, 2021 at 8:51 PM Kent Gibson <[email protected]> wrote:

...

> > Here is debug prints of what happens
> >
> > # modprobe gpio-mockup gpio_mockup_ranges=-1,10
> > [ 212.850993] (null): device_create_managed_software_node <<< 0
> > [ 212.858177] platform gpio-mockup.0: software_node_notify 0 <<<
> > [ 212.865264] platform gpio-mockup.0: software_node_notify 1 <<< 1
> > [ 212.874159] gpio-mockup gpio-mockup.0: no of_node; not parsing pinctrl DT
> > [ 212.882962] gpiochip_find_base: found new base at 840
> > [ 212.889873] gpio gpiochip3: software_node_notify 0 <<<
> > [ 212.896164] gpio gpiochip3: software_node_notify 1 <<< 1
> > [ 212.905099] gpio gpiochip3: (gpio-mockup-A): added GPIO chardev (254:3)
> > [ 212.913313] gpio gpiochip840: software_node_notify 0 <<<
> > [ 212.920676] gpio gpiochip3: registered GPIOs 840 to 849 on gpio-mockup-A
> > [ 212.935601] modprobe (185) used greatest stack depth: 12744 bytes left
> >
> > As I read it the software node is created for gpio-mockup device and
> > then _shared_ with the GPIO device, since it's managed it's gone when
> > GPIO device gone followed by UAF when mockup (platform) device itself
> > gone. I.o.w. this software node mustn't be managed because it covers
> > the lifetime of two different objects. The correct fix is to create
> > manually software node and assign it to the pdev and manually free in
> > gpio_mockup_unregister_pdevs()/
> >
> > Tl;DR: it's a bug in gpio-mockup.
>
> So the bug is in the behaviour of gpio_mockup_register_chip()?

Not really. The utter root cause is the former API (device_add_properties()
et al) which Heikki is getting rid of in particular because of this issue,
i.e. when users blindly call it without fully understanding the picture of
the object lifetimes.

> That is out of my court, so I'll leave it to you and Bart to sort out.

I'll see what I can do about.

--
With Best Regards,
Andy Shevchenko


2021-10-05 09:52:03

by Heikki Krogerus

[permalink] [raw]
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

On Tue, Oct 05, 2021 at 11:21:53AM +0300, Andy Shevchenko wrote:
> On Tue, Oct 05, 2021 at 08:40:35AM +0800, Kent Gibson wrote:
> > On Mon, Oct 04, 2021 at 10:56:00PM +0300, Andy Shevchenko wrote:
> > > On Mon, Oct 4, 2021 at 8:51 PM Kent Gibson <[email protected]> wrote:
>
> ...
>
> > > Here is debug prints of what happens
> > >
> > > # modprobe gpio-mockup gpio_mockup_ranges=-1,10
> > > [ 212.850993] (null): device_create_managed_software_node <<< 0
> > > [ 212.858177] platform gpio-mockup.0: software_node_notify 0 <<<
> > > [ 212.865264] platform gpio-mockup.0: software_node_notify 1 <<< 1
> > > [ 212.874159] gpio-mockup gpio-mockup.0: no of_node; not parsing pinctrl DT
> > > [ 212.882962] gpiochip_find_base: found new base at 840
> > > [ 212.889873] gpio gpiochip3: software_node_notify 0 <<<
> > > [ 212.896164] gpio gpiochip3: software_node_notify 1 <<< 1
> > > [ 212.905099] gpio gpiochip3: (gpio-mockup-A): added GPIO chardev (254:3)
> > > [ 212.913313] gpio gpiochip840: software_node_notify 0 <<<
> > > [ 212.920676] gpio gpiochip3: registered GPIOs 840 to 849 on gpio-mockup-A
> > > [ 212.935601] modprobe (185) used greatest stack depth: 12744 bytes left
> > >
> > > As I read it the software node is created for gpio-mockup device and
> > > then _shared_ with the GPIO device, since it's managed it's gone when
> > > GPIO device gone followed by UAF when mockup (platform) device itself
> > > gone. I.o.w. this software node mustn't be managed because it covers
> > > the lifetime of two different objects. The correct fix is to create
> > > manually software node and assign it to the pdev and manually free in
> > > gpio_mockup_unregister_pdevs()/
> > >
> > > Tl;DR: it's a bug in gpio-mockup.
> >
> > So the bug is in the behaviour of gpio_mockup_register_chip()?
>
> Not really. The utter root cause is the former API (device_add_properties()
> et al) which Heikki is getting rid of in particular because of this issue,
> i.e. when users blindly call it without fully understanding the picture of
> the object lifetimes.
>
> > That is out of my court, so I'll leave it to you and Bart to sort out.
>
> I'll see what I can do about.

So, something like this (attached)?

--
heikki


Attachments:
(No filename) (2.23 kB)
gpio-mockup.diff (2.49 kB)
Download all attachments

2021-10-05 09:55:54

by Andy Shevchenko

[permalink] [raw]
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

On Tue, Oct 5, 2021 at 12:50 PM Heikki Krogerus
<[email protected]> wrote:
>
> On Tue, Oct 05, 2021 at 11:21:53AM +0300, Andy Shevchenko wrote:
> > On Tue, Oct 05, 2021 at 08:40:35AM +0800, Kent Gibson wrote:

...

> > I'll see what I can do about.
>
> So, something like this (attached)?

Actually I have sent another solution:
https://lore.kernel.org/linux-gpio/[email protected]/T/#u

--
With Best Regards,
Andy Shevchenko

2021-10-05 10:01:58

by Heikki Krogerus

[permalink] [raw]
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

On Tue, Oct 05, 2021 at 12:53:04PM +0300, Andy Shevchenko wrote:
> On Tue, Oct 5, 2021 at 12:50 PM Heikki Krogerus
> <[email protected]> wrote:
> >
> > On Tue, Oct 05, 2021 at 11:21:53AM +0300, Andy Shevchenko wrote:
> > > On Tue, Oct 05, 2021 at 08:40:35AM +0800, Kent Gibson wrote:
>
> ...
>
> > > I'll see what I can do about.
> >
> > So, something like this (attached)?
>
> Actually I have sent another solution:
> https://lore.kernel.org/linux-gpio/[email protected]/T/#u

Ah, OK. Thanks Andy.

--
heikki