From: Jouni Hogander <[email protected]>
Currently, error paths use device_del() to undo the preparations done
by device_add(). This leaks memory, because dev->p is allocated in
device_add() but only freed in device_release(). Fix this by moving the
kfree() of dev->p into device_del(), the counterpart of device_add().
This memory leak was reported by Syzkaller:
BUG: memory leak
unreferenced object 0xffff8880675ca008 (size 256):
comm "netdev_register", pid 281, jiffies 4294696663 (age 6.808s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<0000000058ca4711>] kmem_cache_alloc_trace+0x167/0x280
[<000000002340019b>] device_add+0x882/0x1750
[<000000001d588c3a>] netdev_register_kobject+0x128/0x380
[<0000000011ef5535>] register_netdevice+0xa1b/0xf00
[<000000007fcf1c99>] __tun_chr_ioctl+0x20d5/0x3dd0
[<000000006a5b7b2b>] tun_chr_ioctl+0x2f/0x40
[<00000000f30f834a>] do_vfs_ioctl+0x1c7/0x1510
[<00000000fba062ea>] ksys_ioctl+0x99/0xb0
[<00000000b1c1b8d2>] __x64_sys_ioctl+0x78/0xb0
[<00000000984cabb9>] do_syscall_64+0x16f/0x580
[<000000000bde033d>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<00000000e6ca2d9f>] 0xffffffffffffffff
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Lukas Bulwahn <[email protected]>
Signed-off-by: Jouni Hogander <[email protected]>
---
drivers/base/core.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 7bd9cd366d41..cb4b27e82a9f 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1080,7 +1080,6 @@ EXPORT_SYMBOL_GPL(device_show_bool);
static void device_release(struct kobject *kobj)
{
struct device *dev = kobj_to_dev(kobj);
- struct device_private *p = dev->p;
/*
* Some platform devices are driven without driver attached
@@ -1102,7 +1101,6 @@ static void device_release(struct kobject *kobj)
else
WARN(1, KERN_ERR "Device '%s' does not have a release() function, it is broken and must be fixed. See Documentation/kobject.txt.\n",
dev_name(dev));
- kfree(p);
}
static const void *device_namespace(struct kobject *kobj)
@@ -2388,6 +2386,7 @@ void device_del(struct device *dev)
kobject_del(&dev->kobj);
cleanup_glue_dir(dev, glue_dir);
put_device(parent);
+ kfree(dev->p);
}
EXPORT_SYMBOL_GPL(device_del);
--
2.17.1
On Thu, Nov 14, 2019 at 02:18:40PM +0200, [email protected] wrote:
> From: Jouni Hogander <[email protected]>
>
> Currently error paths are using device_del to clean-up preparations
> done by device_add. This is causing memory leak as free of dev->p
> allocated in device_add is freed in device_release. This is fixed by
> moving freeing dev->p to counterpart of device_add i.e. device_del.
Are you sure that is safe? The device can still be "alive" after
device_del() is called. The only place you know that it should be freed
is in the release callback.
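
To illustrate the concern, here is a minimal userspace sketch (not kernel
code). The sketch_* names are hypothetical stand-ins for get_device(),
put_device() and device_del(), and a plain int stands in for the kobject
reference count. It assumes some other code path still holds a reference
when the device is deleted, and shows that the private data must stay
valid until the last reference is dropped.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct device_private / struct device. */
struct sketch_private { int value; };

struct sketch_device {
	int refcount;              /* models the kobject reference count */
	struct sketch_private *p;  /* models dev->p */
};

static void sketch_get(struct sketch_device *d)
{
	d->refcount++;
}

/* Drops one reference; the "release" work happens only on the last put. */
static void sketch_put(struct sketch_device *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0) {
		free(d->p);
		d->p = NULL;
	}
}

/* Models unregistering only: other holders may still dereference d->p. */
static void sketch_del(struct sketch_device *d)
{
	(void)d;
}

/* Returns the value a second reference holder reads after the delete. */
static int value_seen_by_second_holder(void)
{
	struct sketch_device d = { .refcount = 1, .p = NULL };
	int seen;

	d.p = malloc(sizeof(*d.p));
	if (!d.p)
		return -1;
	d.p->value = 42;

	sketch_get(&d);     /* some other code path takes a reference */
	sketch_del(&d);     /* device is unregistered ...             */
	sketch_put(&d);     /* ... and the creator drops its reference */

	seen = d.p->value;  /* still valid: one reference remains */
	sketch_put(&d);     /* last reference: private data is freed here */
	return seen;
}
```

If sketch_del() freed d->p instead, the read after the delete would touch
freed memory, which is the situation the question above is about.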
> This memory leak was reported by Syzkaller:
>
> BUG: memory leak unreferenced object 0xffff8880675ca008 (size 256):
> comm "netdev_register", pid 281, jiffies 4294696663 (age 6.808s)
> hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> backtrace:
> [<0000000058ca4711>] kmem_cache_alloc_trace+0x167/0x280
> [<000000002340019b>] device_add+0x882/0x1750
> [<000000001d588c3a>] netdev_register_kobject+0x128/0x380
> [<0000000011ef5535>] register_netdevice+0xa1b/0xf00
> [<000000007fcf1c99>] __tun_chr_ioctl+0x20d5/0x3dd0
> [<000000006a5b7b2b>] tun_chr_ioctl+0x2f/0x40
> [<00000000f30f834a>] do_vfs_ioctl+0x1c7/0x1510
> [<00000000fba062ea>] ksys_ioctl+0x99/0xb0
> [<00000000b1c1b8d2>] __x64_sys_ioctl+0x78/0xb0
> [<00000000984cabb9>] do_syscall_64+0x16f/0x580
> [<000000000bde033d>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [<00000000e6ca2d9f>] 0xffffffffffffffff
How is this a leak? This is in device_add(), not removing the device.
When the structure really is freed then it can be removed.
Or are you triggering an error in device_add() somehow to trigger this
callback?
thanks,
greg k-h
Greg Kroah-Hartman <[email protected]> writes:
> On Thu, Nov 14, 2019 at 02:18:40PM +0200, [email protected] wrote:
>> From: Jouni Hogander <[email protected]>
>>
>> Currently error paths are using device_del to clean-up preparations
>> done by device_add. This is causing memory leak as free of dev->p
>> allocated in device_add is freed in device_release. This is fixed by
>> moving freeing dev->p to counterpart of device_add i.e. device_del.
>
> Are you sure that is safe? The device can still be "alive" after
> device_del() is called. The only place you know that it should be freed
> is in the release callback.
Now that you have pointed this out, I'm not.
>
>> This memory leak was reported by Syzkaller:
>>
>> BUG: memory leak unreferenced object 0xffff8880675ca008 (size 256):
>> comm "netdev_register", pid 281, jiffies 4294696663 (age 6.808s)
>> hex dump (first 32 bytes):
>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>> backtrace:
>> [<0000000058ca4711>] kmem_cache_alloc_trace+0x167/0x280
>> [<000000002340019b>] device_add+0x882/0x1750
>> [<000000001d588c3a>] netdev_register_kobject+0x128/0x380
>> [<0000000011ef5535>] register_netdevice+0xa1b/0xf00
>> [<000000007fcf1c99>] __tun_chr_ioctl+0x20d5/0x3dd0
>> [<000000006a5b7b2b>] tun_chr_ioctl+0x2f/0x40
>> [<00000000f30f834a>] do_vfs_ioctl+0x1c7/0x1510
>> [<00000000fba062ea>] ksys_ioctl+0x99/0xb0
>> [<00000000b1c1b8d2>] __x64_sys_ioctl+0x78/0xb0
>> [<00000000984cabb9>] do_syscall_64+0x16f/0x580
>> [<000000000bde033d>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> [<00000000e6ca2d9f>] 0xffffffffffffffff
>
> How is this a leak? This is in device_add(), not removing the device.
> When the structure really is freed then it can be removed.
In net/core/net-sysfs.c:netdev_register_kobject(), device_add() allocates
dev->p. If register_queue_kobjects() then fails, the error path calls
device_del() and dev->p is never freed. Could the proper fix here be to
call put_device() after device_del()?
>
> Or are you triggering an error in device_add() somehow to trigger this
> callback?
This was found using Syzkaller with fault injection and memory leak
detection enabled. The error is not triggered in device_add() itself but
after device_add() has returned.
BR,
Jouni Högander
On Fri, Nov 15, 2019 at 09:59:43AM +0200, Jouni Högander wrote:
> Greg Kroah-Hartman <[email protected]> writes:
>
> > On Thu, Nov 14, 2019 at 02:18:40PM +0200, [email protected] wrote:
> >> From: Jouni Hogander <[email protected]>
> >>
> >> Currently error paths are using device_del to clean-up preparations
> >> done by device_add. This is causing memory leak as free of dev->p
> >> allocated in device_add is freed in device_release. This is fixed by
> >> moving freeing dev->p to counterpart of device_add i.e. device_del.
> >
> > Are you sure that is safe? The device can still be "alive" after
> > device_del() is called. The only place you know that it should be freed
> > is in the release callback.
>
> Now as you pointed this out I'm not.
>
> >
> >> This memory leak was reported by Syzkaller:
> >>
> >> BUG: memory leak unreferenced object 0xffff8880675ca008 (size 256):
> >> comm "netdev_register", pid 281, jiffies 4294696663 (age 6.808s)
> >> hex dump (first 32 bytes):
> >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> >> backtrace:
> >> [<0000000058ca4711>] kmem_cache_alloc_trace+0x167/0x280
> >> [<000000002340019b>] device_add+0x882/0x1750
> >> [<000000001d588c3a>] netdev_register_kobject+0x128/0x380
> >> [<0000000011ef5535>] register_netdevice+0xa1b/0xf00
> >> [<000000007fcf1c99>] __tun_chr_ioctl+0x20d5/0x3dd0
> >> [<000000006a5b7b2b>] tun_chr_ioctl+0x2f/0x40
> >> [<00000000f30f834a>] do_vfs_ioctl+0x1c7/0x1510
> >> [<00000000fba062ea>] ksys_ioctl+0x99/0xb0
> >> [<00000000b1c1b8d2>] __x64_sys_ioctl+0x78/0xb0
> >> [<00000000984cabb9>] do_syscall_64+0x16f/0x580
> >> [<000000000bde033d>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >> [<00000000e6ca2d9f>] 0xffffffffffffffff
> >
> > How is this a leak? This is in device_add(), not removing the device.
> > When the structure really is freed then it can be removed.
>
> In net/core/net-sysfs.c:netdev_register_kobject device_add allocates
> dev->p. Now if register_queue_kobjects fails the error path is calling
> device_del and dev->p is never freed. Proper fix here could be to call
> put_device after device_del?
Hm, this sounds like you have a reference count leak here, as
put_device() should be properly called already in this case. You might
want to look further to see where exactly the register_queue_kobjects()
call fails in order to see if we grabbed a reference we forgot to put
back on an error path.
thanks,
greg k-h
Greg Kroah-Hartman <[email protected]> writes:
>> >> This memory leak was reported by Syzkaller:
>> >>
>> >> BUG: memory leak unreferenced object 0xffff8880675ca008 (size 256):
>> >> comm "netdev_register", pid 281, jiffies 4294696663 (age 6.808s)
>> >> hex dump (first 32 bytes):
>> >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>> >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>> >> backtrace:
>> >> [<0000000058ca4711>] kmem_cache_alloc_trace+0x167/0x280
>> >> [<000000002340019b>] device_add+0x882/0x1750
>> >> [<000000001d588c3a>] netdev_register_kobject+0x128/0x380
>> >> [<0000000011ef5535>] register_netdevice+0xa1b/0xf00
>> >> [<000000007fcf1c99>] __tun_chr_ioctl+0x20d5/0x3dd0
>> >> [<000000006a5b7b2b>] tun_chr_ioctl+0x2f/0x40
>> >> [<00000000f30f834a>] do_vfs_ioctl+0x1c7/0x1510
>> >> [<00000000fba062ea>] ksys_ioctl+0x99/0xb0
>> >> [<00000000b1c1b8d2>] __x64_sys_ioctl+0x78/0xb0
>> >> [<00000000984cabb9>] do_syscall_64+0x16f/0x580
>> >> [<000000000bde033d>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> >> [<00000000e6ca2d9f>] 0xffffffffffffffff
>> >
>> > How is this a leak? This is in device_add(), not removing the device.
>> > When the structure really is freed then it can be removed.
>>
>> In net/core/net-sysfs.c:netdev_register_kobject device_add allocates
>> dev->p. Now if register_queue_kobjects fails the error path is calling
>> device_del and dev->p is never freed. Proper fix here could be to call
>> put_device after device_del?
>
> Hm, this sounds like you have a reference count leak here, as
> put_device() should be properly called already in this case. You might
> want to look further to see where exactly the register_queue_kobjects()
> call fails in order to see if we grabbed a reference we forgot to put
> back on an error path.
OK, I did some more debugging on this.
net/core/net-sysfs.c:netdev_register_kobject() calls
device_initialize(dev). The comment on
drivers/base/core.c:device_initialize() says:
* NOTE: Use put_device() to give up your reference instead of freeing
* @dev directly once you have called this function.
My understanding is that the reference remaining on the error path is the
one taken by device_initialize() and, as the note above instructs, it
should be given up using put_device()? I tested this and it fixes the
memory leak I found in my Syzkaller exercise. In addition, it also seems
to fix this one:
https://syzkaller.appspot.com/bug?id=f5f4af9fb9ffb3112ad6e30f717f769decdccdfc
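
To make the reasoning concrete, here is a rough userspace model of that
error path (a sketch, not the actual kernel change). All toy_* names and
the live_allocs counter are made up for illustration, and the simulated
failure stands in for register_queue_kobjects() failing. It compares an
error path that only calls the delete step with one that also drops the
initial reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static int live_allocs;  /* private blocks allocated and not yet freed */

struct toy_device {
	int refcount;  /* models the kobject reference count */
	void *p;       /* models dev->p */
};

/* Models device_initialize(): the caller starts with one reference. */
static void toy_initialize(struct toy_device *d)
{
	d->refcount = 1;
	d->p = NULL;
}

/* Models device_add() allocating the private data. */
static int toy_add(struct toy_device *d)
{
	d->p = malloc(256);
	if (!d->p)
		return -1;
	live_allocs++;
	return 0;
}

/* Models device_del(): unregisters, but does not free d->p. */
static void toy_del(struct toy_device *d)
{
	(void)d;
}

/* Models put_device(): the release work runs on the last reference. */
static void toy_put(struct toy_device *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0 && d->p) {
		free(d->p);
		d->p = NULL;
		live_allocs--;
	}
}

/* Returns how many private blocks the simulated error path leaves behind. */
static int leaked_after_failed_register(bool put_on_error)
{
	struct toy_device d;
	int before = live_allocs;
	int leaked;

	toy_initialize(&d);
	if (toy_add(&d) != 0) {
		toy_put(&d);
		return live_allocs - before;
	}

	/* Pretend a later registration step fails here. */
	toy_del(&d);
	if (put_on_error)
		toy_put(&d);

	leaked = live_allocs - before;
	if (!put_on_error)
		toy_put(&d);  /* tidy up so the demo itself does not leak */
	return leaked;
}
```

In this model, leaked_after_failed_register(false) reports one block left
behind and leaked_after_failed_register(true) reports none, which matches
the behaviour described above.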
BR,
Jouni Högander
On Fri, Nov 15, 2019 at 12:05:17PM +0200, Jouni Högander wrote:
> Greg Kroah-Hartman <[email protected]> writes:
>
> >> >> This memory leak was reported by Syzkaller:
> >> >>
> >> >> BUG: memory leak unreferenced object 0xffff8880675ca008 (size 256):
> >> >> comm "netdev_register", pid 281, jiffies 4294696663 (age 6.808s)
> >> >> hex dump (first 32 bytes):
> >> >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> >> >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> >> >> backtrace:
> >> >> [<0000000058ca4711>] kmem_cache_alloc_trace+0x167/0x280
> >> >> [<000000002340019b>] device_add+0x882/0x1750
> >> >> [<000000001d588c3a>] netdev_register_kobject+0x128/0x380
> >> >> [<0000000011ef5535>] register_netdevice+0xa1b/0xf00
> >> >> [<000000007fcf1c99>] __tun_chr_ioctl+0x20d5/0x3dd0
> >> >> [<000000006a5b7b2b>] tun_chr_ioctl+0x2f/0x40
> >> >> [<00000000f30f834a>] do_vfs_ioctl+0x1c7/0x1510
> >> >> [<00000000fba062ea>] ksys_ioctl+0x99/0xb0
> >> >> [<00000000b1c1b8d2>] __x64_sys_ioctl+0x78/0xb0
> >> >> [<00000000984cabb9>] do_syscall_64+0x16f/0x580
> >> >> [<000000000bde033d>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >> >> [<00000000e6ca2d9f>] 0xffffffffffffffff
> >> >
> >> > How is this a leak? This is in device_add(), not removing the device.
> >> > When the structure really is freed then it can be removed.
> >>
> >> In net/core/net-sysfs.c:netdev_register_kobject device_add allocates
> >> dev->p. Now if register_queue_kobjects fails the error path is calling
> >> device_del and dev->p is never freed. Proper fix here could be to call
> >> put_device after device_del?
> >
> > Hm, this sounds like you have a reference count leak here, as
> > put_device() should be properly called already in this case. You might
> > want to look further to see where exactly the register_queue_kobjects()
> > call fails in order to see if we grabbed a reference we forgot to put
> > back on an error path.
>
> Ok, did some more debugging on
> this. net/core/net-sysfs.c:netdev_register_kobject is doing
> device_initialize(dev). This is in
> drivers/base/core.c:device_initialize:
>
> * NOTE: Use put_device() to give up your reference instead of freeing
> * @dev directly once you have called this function.
>
> My understanding is that remaining reference on error path is taken by
> device_initialize and as instructed in the note above it should be given
> up using put_device?
Yes, that is correct.
> Tested this and it's fixing the memory leak I found in my Syzkaller
> exercise. Addition to that it seems to be fixing also this one:
>
> https://syzkaller.appspot.com/bug?id=f5f4af9fb9ffb3112ad6e30f717f769decdccdfc
Great! Care to submit a patch for this?
thanks,
greg k-h
Greg Kroah-Hartman <[email protected]> writes:
>>
>> Ok, did some more debugging on
>> this. net/core/net-sysfs.c:netdev_register_kobject is doing
>> device_initialize(dev). This is in
>> drivers/base/core.c:device_initialize:
>>
>> * NOTE: Use put_device() to give up your reference instead of freeing
>> * @dev directly once you have called this function.
>>
>> My understanding is that remaining reference on error path is taken by
>> device_initialize and as instructed in the note above it should be given
>> up using put_device?
>
> Yes, that is correct.
>
>> Tested this and it's fixing the memory leak I found in my Syzkaller
>> exercise. Addition to that it seems to be fixing also this one:
>>
>> https://syzkaller.appspot.com/bug?id=f5f4af9fb9ffb3112ad6e30f717f769decdccdfc
>
> Great! Care to submit a patch for this?
I will submit another patch and Cc you there. This patch should be ignored.
BR,
Jouni Högander