2018-12-21 15:24:17

by Chen, Rong A

[permalink] [raw]
Subject: [LKP] [bochs] df2052cc92: WARNING:at_drivers/gpu/drm/drm_mode_config.c:#drm_mode_config_cleanup

FYI, we noticed the following commit (built with gcc-4.9):

commit: df2052cc922136e98a5c8d9730f6a4fd0a958c94 ("bochs: convert to drm_fb_helper_fbdev_setup/teardown")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: trinity
with following parameters:

runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 768M

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+-----------------------------------------------------------------------+------------+------------+
| | 70c0ef7bd3 | df2052cc92 |
+-----------------------------------------------------------------------+------------+------------+
| boot_successes | 0 | 0 |
| boot_failures | 4 | 4 |
| WARNING:at_block/blk.h:#blk_register_queue | 4 | 4 |
| EIP:blk_register_queue | 4 | 4 |
| WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page | 4 | 4 |
| EIP:note_page | 4 | 4 |
| WARNING:at_drivers/gpu/drm/drm_mode_config.c:#drm_mode_config_cleanup | 0 | 4 |
| EIP:drm_mode_config_cleanup | 0 | 4 |
+-----------------------------------------------------------------------+------------+------------+



[ 487.591733] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_mode_config.c:478 drm_mode_config_cleanup+0x270/0x290
[ 487.599141] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G T 4.19.0-rc1-00164-gdf2052c #1
[ 487.599141] EIP: drm_mode_config_cleanup+0x270/0x290
[ 487.599141] Code: 26 00 ff 70 28 68 37 44 91 42 e8 3b f2 ff ff 58 5a 8d 45 dc e8 f1 98 ff ff 85 c0 75 e5 8d 45 dc e8 75 99 ff ff e9 43 fe ff ff <0f> 0b e9 15 ff ff ff 0f 0b 8d 65 f4 5b 5e 5f 5d c3 90 90 90 90 90
[ 487.599141] EAX: 6b879614 EBX: 6ee7970c ECX: 6ee795c8 EDX: 6ee795c8
[ 487.599141] ESI: 6ee79724 EDI: 6ee79178 EBP: 40125e20 ESP: 40125df8
[ 487.599141] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00210206
[ 487.599141] CR0: 80050033 CR2: 00000000 CR3: 02d44000 CR4: 000006b0
[ 487.599141] Call Trace:
[ 487.599141] ? debugfs_remove_recursive+0x15a/0x170
[ 487.599141] ? bochs_kms_fini+0x1e/0x30
[ 487.599141] ? bochs_unload+0x18/0x40
[ 487.599141] ? drm_dev_unregister+0x3a/0xd0
[ 487.599141] ? drm_put_dev+0x22/0x50
[ 487.599141] ? bochs_pci_remove+0xe/0x10
[ 487.599141] ? pci_device_remove+0x1c/0x50
[ 487.599141] ? really_probe+0x9a/0x2f0
[ 487.599141] ? driver_probe_device+0x101/0x120
[ 487.599141] ? __driver_attach+0xa1/0xb0
[ 487.599141] ? driver_probe_device+0x120/0x120
[ 487.599141] ? bus_for_each_dev+0x53/0x90
[ 487.599141] ? driver_attach+0x14/0x20
[ 487.599141] ? driver_probe_device+0x120/0x120
[ 487.599141] ? bus_add_driver+0x177/0x1f0
[ 487.599141] ? qxl_init+0x61/0x61
[ 487.599141] ? driver_register+0x51/0xe0
[ 487.599141] ? qxl_init+0x61/0x61
[ 487.599141] ? __pci_register_driver+0x4b/0x50
[ 487.599141] ? bochs_init+0x3e/0x57
[ 487.599141] ? do_one_initcall+0x83/0x1c7
[ 487.599141] ? smp_apic_timer_interrupt+0x48/0x80
[ 487.599141] ? apic_timer_interrupt+0xcb/0xd0
[ 487.599141] ? kernel_init_freeable+0x129/0x204
[ 487.599141] ? rest_init+0xe0/0xe0
[ 487.599141] ? kernel_init+0x8/0xf0
[ 487.599141] ? ret_from_fork+0x19/0x30
[ 487.599141] ---[ end trace 9072929f8a181f7a ]---


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



Thanks,
Rong Chen


Attachments:
(No filename) (4.04 kB)
config-4.19.0-rc1-00164-gdf2052c (120.45 kB)
job-script (4.13 kB)
dmesg.xz (24.38 kB)
Download all attachments

2018-12-22 17:42:09

by Linus Torvalds

[permalink] [raw]
Subject: Re: [LKP] [bochs] df2052cc92: WARNING:at_drivers/gpu/drm/drm_mode_config.c:#drm_mode_config_cleanup

On Fri, Dec 21, 2018 at 12:32 AM kernel test robot
<[email protected]> wrote:
>
> FYI, we noticed commit df2052cc9221 ("bochs: convert to
> drm_fb_helper_fbdev_setup/teardown") caused
>
> [ 487.591733] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_mode_config.c:478 drm_mode_config_cleanup+0x270/0x290

Ok, this is apparently just a leak for what appears to be a not
particularly interesting error case, but the warning is new to 4.20
(*) so it would be nice to have somebody look at it.

That commit is supposed to fix a leak, but there's apparently
something still there.

Daniel, Peter, Gerd?

Linus

(*) the *problem* is probably not new, it's just now exposed by the
switch to drm_mode_config_cleanup().

2018-12-23 16:35:31

by Peter Wu

[permalink] [raw]
Subject: Re: [LKP] [bochs] df2052cc92: WARNING:at_drivers/gpu/drm/drm_mode_config.c:#drm_mode_config_cleanup

On Fri, Dec 21, 2018 at 11:25:41AM -0800, Linus Torvalds wrote:
> On Fri, Dec 21, 2018 at 12:32 AM kernel test robot
> <[email protected]> wrote:
> >
> > FYI, we noticed commit df2052cc9221 ("bochs: convert to
> > drm_fb_helper_fbdev_setup/teardown") caused
> >
> > [ 487.591733] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_mode_config.c:478 drm_mode_config_cleanup+0x270/0x290
>
> Ok, this is apparently just a leak for what appears to be a not
> particularly interesting error case, but the warning is new to 4.20
> (*) so it would be nice to have somebody look at it.
>
> That commit is supposed to fix a leak, but there's apparently
> something still there.

> (*) the *problem* is probably not new, it's just now exposed by the
> switch to drm_mode_config_cleanup().

I concur, the issue was only revealed because a (not so interesting
error path was triggered). Reproduced this on current master
(v4.20-rc7-274-g23203e3f34c9), the trace leading up to the warning is
the same:

[ 50.008030] bochsdrmfb: enable CONFIG_FB_LITTLE_ENDIAN to support this framebuffer
[ 50.009436] bochs-drm 0000:00:02.0: [drm:drm_fb_helper_fbdev_setup] *ERROR* fbdev: Failed to set configuration (ret=-38)
[ 50.011456] [drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:02.0 on minor 2
[ 50.013604] WARNING: CPU: 1 PID: 1 at drivers/gpu/drm/drm_mode_config.c:477 drm_mode_config_cleanup+0x280/0x2a0
[ 50.016175] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G T 4.20.0-rc7 #1
[ 50.017732] EIP: drm_mode_config_cleanup+0x280/0x2a0
...
[ 50.023155] Call Trace:
[ 50.023155] ? bochs_kms_fini+0x1e/0x30
[ 50.023155] ? bochs_unload+0x18/0x40
[ 50.023155] ? bochs_pci_remove+0x18/0x30
[ 50.023155] ? pci_device_remove+0x1c/0x50
[ 50.031880] ? really_probe+0xf3/0x2d0
[ 50.031880] ? driver_probe_device+0x23/0xa0

The warning suggests that drm_framebuffer_init was called at some point without
a matching call to drm_framebuffer_cleanup. Adding dump_stack() reveals:

[ 97.673399] drm_framebuffer_init+0x17d/0x190
[ 97.674134] drm_gem_fb_alloc+0xbe/0x120
[ 97.674678] drm_gem_fbdev_fb_create+0x184/0x1c0
[ 97.675322] ? drm_gem_fb_simple_display_pipe_prepare_fb+0x20/0x20
[ 97.676771] ? drm_fb_helper_alloc_fbi+0xe1/0x120
[ 97.677408] bochsfb_create+0x245/0x5f0
[ 97.677935] ? bochsfb_mmap+0x60/0x60
[ 97.678421] __drm_fb_helper_initial_config_and_unlock+0x3d3/0x7b0
[ 97.678827] ? drm_setup_crtcs+0x1430/0x1430
[ 97.678827] drm_fb_helper_fbdev_setup+0x12b/0x230
[ 97.678827] bochs_fbdev_init+0x33/0x40
[ 97.678827] bochs_pci_probe+0x197/0x1a0
[ 97.678827] pci_device_probe+0xe9/0x180
[ 97.693848] bochsdrmfb: enable CONFIG_FB_LITTLE_ENDIAN to support this framebuffer
[ 97.694880] bochs-drm 0000:00:02.0: [drm:drm_fb_helper_fbdev_setup] *ERROR* fbdev: Failed to set configuration (ret=-38)

More precisely, this is the call chain (obtained via GDB):

drm_fb_helper_fbdev_setup
-> drm_fb_helper_initial_config
-> __drm_fb_helper_initial_config_and_unlock
-> drm_fb_helper_single_fb_probe
-> bochsfb_create
-> drm_gem_fbdev_fb_create
-> drm_gem_fb_alloc
-> drm_framebuffer_init

Let's have a look at the source of the error message, keep in mind that
drm_fb_helper_fini is called on the error path:

int drm_fb_helper_fbdev_setup(struct drm_device *dev, ...) {
/* ... */
ret = drm_fb_helper_initial_config(fb_helper, preferred_bpp);
if (ret < 0) {
DRM_DEV_ERROR(dev->dev, "fbdev: Failed to set configuration (ret=%d)\n", ret);
goto err_drm_fb_helper_fini;
}

return 0;

err_drm_fb_helper_fini:
drm_fb_helper_fini(fb_helper);

return ret;
}

The CONFIG_FB_LITTLE_ENDIAN error above suggests that this code path is reached
*after* calling bochsfb_create:

drm_fb_helper_fbdev_setup
-> drm_fb_helper_initial_config
-> __drm_fb_helper_initial_config_and_unlock
-> register_framebuffer
-> do_register_framebuffer
-> fb_check_foreignness
(prints error and propagates error code back to drm_fb_helper_fbdev_setup).

What does "drm_fb_helper_fini" do? Among other things it basically kfree's
memory associated with "fb_helper->fbdev" which was created using
"drm_fb_helper_alloc_fbi" in the "fb_probe" callback. This is sufficient for
"drm_fb_helper_generic_probe" (introduced by Noralf), but not for
"bochsfb_create" which additionally calls "drm_gem_fbdev_fb_create":

info = drm_fb_helper_alloc_fbi(helper);
if (IS_ERR(info)) {
DRM_ERROR("Failed to allocate fbi: %ld\n", PTR_ERR(info));
return PTR_ERR(info);
}

info->par = &bochs->fb.helper;

fb = drm_gem_fbdev_fb_create(bochs->dev, sizes, 0, gobj, NULL);
if (IS_ERR(fb)) {
DRM_ERROR("Failed to create framebuffer: %ld\n", PTR_ERR(fb));
return PTR_ERR(fb);
}

/* setup helper */
bochs->fb.helper.fb = fb;

Note that "fb" is unhandled by "drm_fb_helper_fini", so it leaks.

What is the usual behavior? drm_fb_helper_fbdev_setup succeeds and on unload
drm_fb_helper_fbdev_teardown is called which properly releases "fb":

void drm_fb_helper_fbdev_teardown(struct drm_device *dev)
{
struct drm_fb_helper *fb_helper = dev->fb_helper;
struct fb_ops *fbops = NULL;

if (!fb_helper)
return;

/* Unregister if it hasn't been done already */
if (fb_helper->fbdev && fb_helper->fbdev->dev)
drm_fb_helper_unregister_fbi(fb_helper);

if (fb_helper->fbdev && fb_helper->fbdev->fbdefio) {
fb_deferred_io_cleanup(fb_helper->fbdev);
kfree(fb_helper->fbdev->fbdefio);
fbops = fb_helper->fbdev->fbops;
}

drm_fb_helper_fini(fb_helper);
kfree(fbops);

if (fb_helper->fb)
drm_framebuffer_remove(fb_helper->fb); // yay!
}

Due to calling "drm_fb_helper_fini" however, "dev->fb_helper" will be NULL and
thus this function does nothing on the error path.

So in summary, "drm_fb_helper_fbdev_setup" calls the driver callback
drm_fb_helper_funcs::fb_probe, detects an error but does not properly release
all resources from the callback even after calling "drm_fb_helper_fini". On
unload, "drm_fb_helper_fbdev_teardown" has no effect because the earlier call to
"drm_fb_helper_fini" and skips the required "drm_framebuffer_remove" call.

I'll send a proposed patch in a reply.
--
Kind regards,
Peter Wu
https://lekensteyn.nl

2018-12-23 16:36:35

by Peter Wu

[permalink] [raw]
Subject: [PATCH] drm/fb-helper: fix leaks in error path of drm_fb_helper_fbdev_setup

After drm_fb_helper_fbdev_setup calls drm_fb_helper_init,
"dev->fb_helper" will be initialized (and thus drm_fb_helper_fini will
have some effect). After that, drm_fb_helper_initial_config is called
which may call the "fb_probe" driver callback.

This driver callback may call drm_fb_helper_defio_init (as is done by
drm_fb_helper_generic_probe) or set a framebuffer (as is done by bochs)
as documented. These are normally cleaned up on exit by
drm_fb_helper_fbdev_teardown which also calls drm_fb_helper_fini.

If an error occurs after "fb_probe", but before setup is complete, then
calling just drm_fb_helper_fini will leak resources. This was triggered
by df2052cc922 ("bochs: convert to drm_fb_helper_fbdev_setup/teardown"):

[ 50.008030] bochsdrmfb: enable CONFIG_FB_LITTLE_ENDIAN to support this framebuffer
[ 50.009436] bochs-drm 0000:00:02.0: [drm:drm_fb_helper_fbdev_setup] *ERROR* fbdev: Failed to set configuration (ret=-38)
[ 50.011456] [drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:02.0 on minor 2
[ 50.013604] WARNING: CPU: 1 PID: 1 at drivers/gpu/drm/drm_mode_config.c:477 drm_mode_config_cleanup+0x280/0x2a0
[ 50.016175] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G T 4.20.0-rc7 #1
[ 50.017732] EIP: drm_mode_config_cleanup+0x280/0x2a0
...
[ 50.023155] Call Trace:
[ 50.023155] ? bochs_kms_fini+0x1e/0x30
[ 50.023155] ? bochs_unload+0x18/0x40

This can be reproduced with QEMU and CONFIG_FB_LITTLE_ENDIAN=n.

Link: https://lkml.kernel.org/r/20181221083226.GI23332@shao2-debian
Link: https://lkml.kernel.org/r/20181223004315.GA11455@al
Fixes: 8741216396b2 ("drm/fb-helper: Add drm_fb_helper_fbdev_setup/teardown()")
Reported-by: kernel test robot <[email protected]>
Cc: Noralf Trønnes <[email protected]>
Signed-off-by: Peter Wu <[email protected]>
---
drivers/gpu/drm/drm_fb_helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 9d64f874f965..432e0f3b9267 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -2860,7 +2860,7 @@ int drm_fb_helper_fbdev_setup(struct drm_device *dev,
return 0;

err_drm_fb_helper_fini:
- drm_fb_helper_fini(fb_helper);
+ drm_fb_helper_fbdev_teardown(dev);

return ret;
}
--
2.20.0


2018-12-23 16:44:25

by Noralf Trønnes

[permalink] [raw]
Subject: Re: [PATCH] drm/fb-helper: fix leaks in error path of drm_fb_helper_fbdev_setup



Den 23.12.2018 01.55, skrev Peter Wu:
> After drm_fb_helper_fbdev_setup calls drm_fb_helper_init,
> "dev->fb_helper" will be initialized (and thus drm_fb_helper_fini will
> have some effect). After that, drm_fb_helper_initial_config is called
> which may call the "fb_probe" driver callback.
>
> This driver callback may call drm_fb_helper_defio_init (as is done by
> drm_fb_helper_generic_probe) or set a framebuffer (as is done by bochs)
> as documented. These are normally cleaned up on exit by
> drm_fb_helper_fbdev_teardown which also calls drm_fb_helper_fini.
>
> If an error occurs after "fb_probe", but before setup is complete, then
> calling just drm_fb_helper_fini will leak resources. This was triggered
> by df2052cc922 ("bochs: convert to drm_fb_helper_fbdev_setup/teardown"):
>
> [ 50.008030] bochsdrmfb: enable CONFIG_FB_LITTLE_ENDIAN to support this framebuffer
> [ 50.009436] bochs-drm 0000:00:02.0: [drm:drm_fb_helper_fbdev_setup] *ERROR* fbdev: Failed to set configuration (ret=-38)
> [ 50.011456] [drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:02.0 on minor 2
> [ 50.013604] WARNING: CPU: 1 PID: 1 at drivers/gpu/drm/drm_mode_config.c:477 drm_mode_config_cleanup+0x280/0x2a0
> [ 50.016175] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G T 4.20.0-rc7 #1
> [ 50.017732] EIP: drm_mode_config_cleanup+0x280/0x2a0
> ...
> [ 50.023155] Call Trace:
> [ 50.023155] ? bochs_kms_fini+0x1e/0x30
> [ 50.023155] ? bochs_unload+0x18/0x40
>
> This can be reproduced with QEMU and CONFIG_FB_LITTLE_ENDIAN=n.
>
> Link: https://lkml.kernel.org/r/20181221083226.GI23332@shao2-debian
> Link: https://lkml.kernel.org/r/20181223004315.GA11455@al
> Fixes: 8741216396b2 ("drm/fb-helper: Add drm_fb_helper_fbdev_setup/teardown()")
> Reported-by: kernel test robot <[email protected]>
> Cc: Noralf Trønnes <[email protected]>
> Signed-off-by: Peter Wu <[email protected]>
> ---
> drivers/gpu/drm/drm_fb_helper.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index 9d64f874f965..432e0f3b9267 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -2860,7 +2860,7 @@ int drm_fb_helper_fbdev_setup(struct drm_device *dev,
> return 0;
>
> err_drm_fb_helper_fini:
> - drm_fb_helper_fini(fb_helper);
> + drm_fb_helper_fbdev_teardown(dev);

This change will break the error path for drm_fbdev_generic_setup()
because drm_fb_helper_generic_probe() cleans up on error but doesn't
clear drm_fb_helper->fb resulting in a double drm_framebuffer_remove().

My assumption has been that the drm_fb_helper_funcs->fb_probe callback
cleans up its resources on error. Clearly this is not the case for
bochs, so my take on this is that bochsfb_create() needs to clean up on
error.

Gerd has a patchset that switches bochs over to the generic fbdev
emulation, but ofc that doesn't help with 4.20:
https://patchwork.freedesktop.org/series/54269/

Noralf.

>
> return ret;
> }
>

2018-12-23 23:12:05

by Peter Wu

[permalink] [raw]
Subject: Re: [PATCH] drm/fb-helper: fix leaks in error path of drm_fb_helper_fbdev_setup

On Sun, Dec 23, 2018 at 02:55:52PM +0100, Noralf Tr?nnes wrote:
>
>
> Den 23.12.2018 01.55, skrev Peter Wu:
> > After drm_fb_helper_fbdev_setup calls drm_fb_helper_init,
> > "dev->fb_helper" will be initialized (and thus drm_fb_helper_fini will
> > have some effect). After that, drm_fb_helper_initial_config is called
> > which may call the "fb_probe" driver callback.
> >
> > This driver callback may call drm_fb_helper_defio_init (as is done by
> > drm_fb_helper_generic_probe) or set a framebuffer (as is done by bochs)
> > as documented. These are normally cleaned up on exit by
> > drm_fb_helper_fbdev_teardown which also calls drm_fb_helper_fini.
> >
> > If an error occurs after "fb_probe", but before setup is complete, then
> > calling just drm_fb_helper_fini will leak resources. This was triggered
> > by df2052cc922 ("bochs: convert to drm_fb_helper_fbdev_setup/teardown"):
> >
> > [ 50.008030] bochsdrmfb: enable CONFIG_FB_LITTLE_ENDIAN to support this framebuffer
> > [ 50.009436] bochs-drm 0000:00:02.0: [drm:drm_fb_helper_fbdev_setup] *ERROR* fbdev: Failed to set configuration (ret=-38)
> > [ 50.011456] [drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:02.0 on minor 2
> > [ 50.013604] WARNING: CPU: 1 PID: 1 at drivers/gpu/drm/drm_mode_config.c:477 drm_mode_config_cleanup+0x280/0x2a0
> > [ 50.016175] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G T 4.20.0-rc7 #1
> > [ 50.017732] EIP: drm_mode_config_cleanup+0x280/0x2a0
> > ...
> > [ 50.023155] Call Trace:
> > [ 50.023155] ? bochs_kms_fini+0x1e/0x30
> > [ 50.023155] ? bochs_unload+0x18/0x40
> >
> > This can be reproduced with QEMU and CONFIG_FB_LITTLE_ENDIAN=n.
> >
> > Link: https://lkml.kernel.org/r/20181221083226.GI23332@shao2-debian
> > Link: https://lkml.kernel.org/r/20181223004315.GA11455@al
> > Fixes: 8741216396b2 ("drm/fb-helper: Add drm_fb_helper_fbdev_setup/teardown()")
> > Reported-by: kernel test robot <[email protected]>
> > Cc: Noralf Tr?nnes <[email protected]>
> > Signed-off-by: Peter Wu <[email protected]>
> > ---
> > drivers/gpu/drm/drm_fb_helper.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> > index 9d64f874f965..432e0f3b9267 100644
> > --- a/drivers/gpu/drm/drm_fb_helper.c
> > +++ b/drivers/gpu/drm/drm_fb_helper.c
> > @@ -2860,7 +2860,7 @@ int drm_fb_helper_fbdev_setup(struct drm_device *dev,
> > return 0;
> > err_drm_fb_helper_fini:
> > - drm_fb_helper_fini(fb_helper);
> > + drm_fb_helper_fbdev_teardown(dev);
>
> This change will break the error path for drm_fbdev_generic_setup()
> because drm_fb_helper_generic_probe() cleans up on error but doesn't
> clear drm_fb_helper->fb resulting in a double drm_framebuffer_remove().

This should probably considered a bug of drm_fb_helper_generic_probe.
Ownership of fb_helper should remain with the caller. The caller can
detect an error and act accordingly.

> My assumption has been that the drm_fb_helper_funcs->fb_probe callback
> cleans up its resources on error. Clearly this is not the case for bochs, so
> my take on this is that bochsfb_create() needs to clean up on error.

That assumption still holds for bochs. The problem is this sequence:
- drm_fb_helper_fbdev_setup is called.
- fb_probe succeeds (this is crucial).
- register_framebuffer fails.
- error path of setup is triggered.

As fb_helper is fully setup by drivers, the drm_fb_helper core should
fully deallocate it again on the error path or else a leak occurs.

> Gerd has a patchset that switches bochs over to the generic fbdev
> emulation, but ofc that doesn't help with 4.20:
> https://patchwork.freedesktop.org/series/54269/

And that does not help with other users of the drm_fb_helper who use
functions like drm_fb_helper_defio_init. They will likely run in the
same problem.

I don't have a way to test tinydrm or other drivers, but if you force
register_framebuffer to fail, you should be able to reproduce the
problem with drm_fb_helper_generic_probe.
--
Kind regards,
Peter Wu
https://lekensteyn.nl

2018-12-24 14:54:06

by Noralf Trønnes

[permalink] [raw]
Subject: Re: [PATCH] drm/fb-helper: fix leaks in error path of drm_fb_helper_fbdev_setup



Den 24.12.2018 00.10, skrev Peter Wu:
> On Sun, Dec 23, 2018 at 02:55:52PM +0100, Noralf Trønnes wrote:
>>
>>
>> Den 23.12.2018 01.55, skrev Peter Wu:
>>> After drm_fb_helper_fbdev_setup calls drm_fb_helper_init,
>>> "dev->fb_helper" will be initialized (and thus drm_fb_helper_fini will
>>> have some effect). After that, drm_fb_helper_initial_config is called
>>> which may call the "fb_probe" driver callback.
>>>
>>> This driver callback may call drm_fb_helper_defio_init (as is done by
>>> drm_fb_helper_generic_probe) or set a framebuffer (as is done by bochs)
>>> as documented. These are normally cleaned up on exit by
>>> drm_fb_helper_fbdev_teardown which also calls drm_fb_helper_fini.
>>>
>>> If an error occurs after "fb_probe", but before setup is complete, then
>>> calling just drm_fb_helper_fini will leak resources. This was triggered
>>> by df2052cc922 ("bochs: convert to drm_fb_helper_fbdev_setup/teardown"):
>>>
>>> [ 50.008030] bochsdrmfb: enable CONFIG_FB_LITTLE_ENDIAN to support this framebuffer
>>> [ 50.009436] bochs-drm 0000:00:02.0: [drm:drm_fb_helper_fbdev_setup] *ERROR* fbdev: Failed to set configuration (ret=-38)
>>> [ 50.011456] [drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:02.0 on minor 2
>>> [ 50.013604] WARNING: CPU: 1 PID: 1 at drivers/gpu/drm/drm_mode_config.c:477 drm_mode_config_cleanup+0x280/0x2a0
>>> [ 50.016175] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G T 4.20.0-rc7 #1
>>> [ 50.017732] EIP: drm_mode_config_cleanup+0x280/0x2a0
>>> ...
>>> [ 50.023155] Call Trace:
>>> [ 50.023155] ? bochs_kms_fini+0x1e/0x30
>>> [ 50.023155] ? bochs_unload+0x18/0x40
>>>
>>> This can be reproduced with QEMU and CONFIG_FB_LITTLE_ENDIAN=n.
>>>
>>> Link: https://lkml.kernel.org/r/20181221083226.GI23332@shao2-debian
>>> Link: https://lkml.kernel.org/r/20181223004315.GA11455@al
>>> Fixes: 8741216396b2 ("drm/fb-helper: Add drm_fb_helper_fbdev_setup/teardown()")
>>> Reported-by: kernel test robot <[email protected]>
>>> Cc: Noralf Trønnes <[email protected]>
>>> Signed-off-by: Peter Wu <[email protected]>
>>> ---
>>> drivers/gpu/drm/drm_fb_helper.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
>>> index 9d64f874f965..432e0f3b9267 100644
>>> --- a/drivers/gpu/drm/drm_fb_helper.c
>>> +++ b/drivers/gpu/drm/drm_fb_helper.c
>>> @@ -2860,7 +2860,7 @@ int drm_fb_helper_fbdev_setup(struct drm_device *dev,
>>> return 0;
>>> err_drm_fb_helper_fini:
>>> - drm_fb_helper_fini(fb_helper);
>>> + drm_fb_helper_fbdev_teardown(dev);
>>
>> This change will break the error path for drm_fbdev_generic_setup()
>> because drm_fb_helper_generic_probe() cleans up on error but doesn't
>> clear drm_fb_helper->fb resulting in a double drm_framebuffer_remove().
>
> This should probably considered a bug of drm_fb_helper_generic_probe.
> Ownership of fb_helper should remain with the caller. The caller can
> detect an error and act accordingly.
>
>> My assumption has been that the drm_fb_helper_funcs->fb_probe callback
>> cleans up its resources on error. Clearly this is not the case for bochs, so
>> my take on this is that bochsfb_create() needs to clean up on error.
>
> That assumption still holds for bochs. The problem is this sequence:
> - drm_fb_helper_fbdev_setup is called.
> - fb_probe succeeds (this is crucial).
> - register_framebuffer fails.
> - error path of setup is triggered.
>
> As fb_helper is fully setup by drivers, the drm_fb_helper core should
> fully deallocate it again on the error path or else a leak occurs.
>
>> Gerd has a patchset that switches bochs over to the generic fbdev
>> emulation, but ofc that doesn't help with 4.20:
>> https://patchwork.freedesktop.org/series/54269/
>
> And that does not help with other users of the drm_fb_helper who use
> functions like drm_fb_helper_defio_init. They will likely run in the
> same problem.
>
> I don't have a way to test tinydrm or other drivers, but if you force
> register_framebuffer to fail, you should be able to reproduce the
> problem with drm_fb_helper_generic_probe.
>

Now I understand. I have looked at the drivers that use drm_fb_helper
and no one seem to handle the case where register_framebuffer() is
failing.

Here's what drivers do when drm_fb_helper_initial_config() fails:

Doesn't check:
amdgpu
virtio

Calls drm_fb_helper_fini():
armada
ast
exynos
gma500
hisilicon
mgag200
msm
nouveau
omap
radeon
rockchip
tegra
udl
bochs - Uses drm_fb_helper_fbdev_setup()
qxl - Uses drm_fb_helper_fbdev_setup()
vboxvideo - Uses drm_fb_helper_fbdev_setup()

Might clean up, not sure:
cirrus

Looks suspicious:
i915

I looked at bochs before it switched to drm_fb_helper_fbdev_setup() and
it also just called drm_fb_helper_fini().

It looks like you've uncovered something no one has though about (or
not implemented at least).

It's not just the framebuffer that's not destroyed, the buffer object
is also leaked. drm_mode_config_cleanup() yells about the framebuffer
(and frees it), but says nothing about the buffer object. It might be
that it can't even be made to detect that since some drivers do special
stuff for the fbdev buffer.

I'll pick up on this and do some testing after the Christmas holidays.

Noralf.

2018-12-24 15:04:26

by Peter Wu

[permalink] [raw]
Subject: Re: [PATCH] drm/fb-helper: fix leaks in error path of drm_fb_helper_fbdev_setup

On Mon, Dec 24, 2018 at 03:52:55PM +0100, Noralf Tr?nnes wrote:
>
>
> Den 24.12.2018 00.10, skrev Peter Wu:
> > On Sun, Dec 23, 2018 at 02:55:52PM +0100, Noralf Tr?nnes wrote:
> > >
> > >
> > > Den 23.12.2018 01.55, skrev Peter Wu:
> > > > After drm_fb_helper_fbdev_setup calls drm_fb_helper_init,
> > > > "dev->fb_helper" will be initialized (and thus drm_fb_helper_fini will
> > > > have some effect). After that, drm_fb_helper_initial_config is called
> > > > which may call the "fb_probe" driver callback.
> > > >
> > > > This driver callback may call drm_fb_helper_defio_init (as is done by
> > > > drm_fb_helper_generic_probe) or set a framebuffer (as is done by bochs)
> > > > as documented. These are normally cleaned up on exit by
> > > > drm_fb_helper_fbdev_teardown which also calls drm_fb_helper_fini.
> > > >
> > > > If an error occurs after "fb_probe", but before setup is complete, then
> > > > calling just drm_fb_helper_fini will leak resources. This was triggered
> > > > by df2052cc922 ("bochs: convert to drm_fb_helper_fbdev_setup/teardown"):
> > > >
> > > > [ 50.008030] bochsdrmfb: enable CONFIG_FB_LITTLE_ENDIAN to support this framebuffer
> > > > [ 50.009436] bochs-drm 0000:00:02.0: [drm:drm_fb_helper_fbdev_setup] *ERROR* fbdev: Failed to set configuration (ret=-38)
> > > > [ 50.011456] [drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:02.0 on minor 2
> > > > [ 50.013604] WARNING: CPU: 1 PID: 1 at drivers/gpu/drm/drm_mode_config.c:477 drm_mode_config_cleanup+0x280/0x2a0
> > > > [ 50.016175] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G T 4.20.0-rc7 #1
> > > > [ 50.017732] EIP: drm_mode_config_cleanup+0x280/0x2a0
> > > > ...
> > > > [ 50.023155] Call Trace:
> > > > [ 50.023155] ? bochs_kms_fini+0x1e/0x30
> > > > [ 50.023155] ? bochs_unload+0x18/0x40
> > > >
> > > > This can be reproduced with QEMU and CONFIG_FB_LITTLE_ENDIAN=n.
> > > >
> > > > Link: https://lkml.kernel.org/r/20181221083226.GI23332@shao2-debian
> > > > Link: https://lkml.kernel.org/r/20181223004315.GA11455@al
> > > > Fixes: 8741216396b2 ("drm/fb-helper: Add drm_fb_helper_fbdev_setup/teardown()")
> > > > Reported-by: kernel test robot <[email protected]>
> > > > Cc: Noralf Tr?nnes <[email protected]>
> > > > Signed-off-by: Peter Wu <[email protected]>
> > > > ---
> > > > drivers/gpu/drm/drm_fb_helper.c | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> > > > index 9d64f874f965..432e0f3b9267 100644
> > > > --- a/drivers/gpu/drm/drm_fb_helper.c
> > > > +++ b/drivers/gpu/drm/drm_fb_helper.c
> > > > @@ -2860,7 +2860,7 @@ int drm_fb_helper_fbdev_setup(struct drm_device *dev,
> > > > return 0;
> > > > err_drm_fb_helper_fini:
> > > > - drm_fb_helper_fini(fb_helper);
> > > > + drm_fb_helper_fbdev_teardown(dev);
> > >
> > > This change will break the error path for drm_fbdev_generic_setup()
> > > because drm_fb_helper_generic_probe() cleans up on error but doesn't
> > > clear drm_fb_helper->fb resulting in a double drm_framebuffer_remove().
> >
> > This should probably considered a bug of drm_fb_helper_generic_probe.
> > Ownership of fb_helper should remain with the caller. The caller can
> > detect an error and act accordingly.
> >
> > > My assumption has been that the drm_fb_helper_funcs->fb_probe callback
> > > cleans up its resources on error. Clearly this is not the case for bochs, so
> > > my take on this is that bochsfb_create() needs to clean up on error.
> >
> > That assumption still holds for bochs. The problem is this sequence:
> > - drm_fb_helper_fbdev_setup is called.
> > - fb_probe succeeds (this is crucial).
> > - register_framebuffer fails.
> > - error path of setup is triggered.
> >
> > As fb_helper is fully setup by drivers, the drm_fb_helper core should
> > fully deallocate it again on the error path or else a leak occurs.
> >
> > > Gerd has a patchset that switches bochs over to the generic fbdev
> > > emulation, but ofc that doesn't help with 4.20:
> > > https://patchwork.freedesktop.org/series/54269/
> >
> > And that does not help with other users of the drm_fb_helper who use
> > functions like drm_fb_helper_defio_init. They will likely run in the
> > same problem.
> >
> > I don't have a way to test tinydrm or other drivers, but if you force
> > register_framebuffer to fail, you should be able to reproduce the
> > problem with drm_fb_helper_generic_probe.
> >
>
> Now I understand. I have looked at the drivers that use drm_fb_helper
> and no one seem to handle the case where register_framebuffer() is
> failing.
>
> Here's what drivers do when drm_fb_helper_initial_config() fails:
>
> Doesn't check:
> amdgpu
> virtio
>
> Calls drm_fb_helper_fini():
> armada
> ast
> exynos
> gma500
> hisilicon
> mgag200
> msm
> nouveau
> omap
> radeon
> rockchip
> tegra
> udl
> bochs - Uses drm_fb_helper_fbdev_setup()
> qxl - Uses drm_fb_helper_fbdev_setup()
> vboxvideo - Uses drm_fb_helper_fbdev_setup()
>
> Might clean up, not sure:
> cirrus
>
> Looks suspicious:
> i915
>
> I looked at bochs before it switched to drm_fb_helper_fbdev_setup() and
> it also just called drm_fb_helper_fini().
>
> It looks like you've uncovered something no one has though about (or
> not implemented at least).
>
> It's not just the framebuffer that's not destroyed, the buffer object
> is also leaked. drm_mode_config_cleanup() yells about the framebuffer
> (and frees it), but says nothing about the buffer object. It might be
> that it can't even be made to detect that since some drivers do special
> stuff for the fbdev buffer.
>
> I'll pick up on this and do some testing after the Christmas holidays.

Thanks, the warning is bad for CI (which uses QEMU), but otherwise it
should not have any effect on regular users so it can wait.
--
Kind regards,
Peter Wu
https://lekensteyn.nl

2019-01-05 18:27:29

by Noralf Trønnes

[permalink] [raw]
Subject: Re: [PATCH] drm/fb-helper: fix leaks in error path of drm_fb_helper_fbdev_setup



Den 24.12.2018 16.03, skrev Peter Wu:
> On Mon, Dec 24, 2018 at 03:52:55PM +0100, Noralf Trønnes wrote:
>>
>>
>> Den 24.12.2018 00.10, skrev Peter Wu:
>>> On Sun, Dec 23, 2018 at 02:55:52PM +0100, Noralf Trønnes wrote:
>>>>
>>>>
>>>> Den 23.12.2018 01.55, skrev Peter Wu:
>>>>> After drm_fb_helper_fbdev_setup calls drm_fb_helper_init,
>>>>> "dev->fb_helper" will be initialized (and thus drm_fb_helper_fini will
>>>>> have some effect). After that, drm_fb_helper_initial_config is called
>>>>> which may call the "fb_probe" driver callback.
>>>>>
>>>>> This driver callback may call drm_fb_helper_defio_init (as is done by
>>>>> drm_fb_helper_generic_probe) or set a framebuffer (as is done by bochs)
>>>>> as documented. These are normally cleaned up on exit by
>>>>> drm_fb_helper_fbdev_teardown which also calls drm_fb_helper_fini.
>>>>>
>>>>> If an error occurs after "fb_probe", but before setup is complete, then
>>>>> calling just drm_fb_helper_fini will leak resources. This was triggered
>>>>> by df2052cc922 ("bochs: convert to drm_fb_helper_fbdev_setup/teardown"):
>>>>>
>>>>> [ 50.008030] bochsdrmfb: enable CONFIG_FB_LITTLE_ENDIAN to support this framebuffer
>>>>> [ 50.009436] bochs-drm 0000:00:02.0: [drm:drm_fb_helper_fbdev_setup] *ERROR* fbdev: Failed to set configuration (ret=-38)
>>>>> [ 50.011456] [drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:02.0 on minor 2
>>>>> [ 50.013604] WARNING: CPU: 1 PID: 1 at drivers/gpu/drm/drm_mode_config.c:477 drm_mode_config_cleanup+0x280/0x2a0
>>>>> [ 50.016175] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G T 4.20.0-rc7 #1
>>>>> [ 50.017732] EIP: drm_mode_config_cleanup+0x280/0x2a0
>>>>> ...
>>>>> [ 50.023155] Call Trace:
>>>>> [ 50.023155] ? bochs_kms_fini+0x1e/0x30
>>>>> [ 50.023155] ? bochs_unload+0x18/0x40
>>>>>
>>>>> This can be reproduced with QEMU and CONFIG_FB_LITTLE_ENDIAN=n.
>>>>>
>>>>> Link: https://lkml.kernel.org/r/20181221083226.GI23332@shao2-debian
>>>>> Link: https://lkml.kernel.org/r/20181223004315.GA11455@al
>>>>> Fixes: 8741216396b2 ("drm/fb-helper: Add drm_fb_helper_fbdev_setup/teardown()")
>>>>> Reported-by: kernel test robot <[email protected]>
>>>>> Cc: Noralf Trønnes <[email protected]>
>>>>> Signed-off-by: Peter Wu <[email protected]>
>>>>> ---
>>>>> drivers/gpu/drm/drm_fb_helper.c | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
>>>>> index 9d64f874f965..432e0f3b9267 100644
>>>>> --- a/drivers/gpu/drm/drm_fb_helper.c
>>>>> +++ b/drivers/gpu/drm/drm_fb_helper.c
>>>>> @@ -2860,7 +2860,7 @@ int drm_fb_helper_fbdev_setup(struct drm_device *dev,
>>>>> return 0;
>>>>> err_drm_fb_helper_fini:
>>>>> - drm_fb_helper_fini(fb_helper);
>>>>> + drm_fb_helper_fbdev_teardown(dev);
>>>>
>>>> This change will break the error path for drm_fbdev_generic_setup()
>>>> because drm_fb_helper_generic_probe() cleans up on error but doesn't
>>>> clear drm_fb_helper->fb resulting in a double drm_framebuffer_remove().
>>>
>>> This should probably considered a bug of drm_fb_helper_generic_probe.
>>> Ownership of fb_helper should remain with the caller. The caller can
>>> detect an error and act accordingly.
>>>
>>>> My assumption has been that the drm_fb_helper_funcs->fb_probe callback
>>>> cleans up its resources on error. Clearly this is not the case for bochs, so
>>>> my take on this is that bochsfb_create() needs to clean up on error.
>>>
>>> That assumption still holds for bochs. The problem is this sequence:
>>> - drm_fb_helper_fbdev_setup is called.
>>> - fb_probe succeeds (this is crucial).
>>> - register_framebuffer fails.
>>> - error path of setup is triggered.
>>>
>>> As fb_helper is fully setup by drivers, the drm_fb_helper core should
>>> fully deallocate it again on the error path or else a leak occurs.
>>>
>>>> Gerd has a patchset that switches bochs over to the generic fbdev
>>>> emulation, but ofc that doesn't help with 4.20:
>>>> https://patchwork.freedesktop.org/series/54269/
>>>
>>> And that does not help with other users of the drm_fb_helper who use
>>> functions like drm_fb_helper_defio_init. They will likely run in the
>>> same problem.
>>>
>>> I don't have a way to test tinydrm or other drivers, but if you force
>>> register_framebuffer to fail, you should be able to reproduce the
>>> problem with drm_fb_helper_generic_probe.
>>>
>>
>> Now I understand. I have looked at the drivers that use drm_fb_helper
>> and no one seem to handle the case where register_framebuffer() is
>> failing.
>>
>> Here's what drivers do when drm_fb_helper_initial_config() fails:
>>
>> Doesn't check:
>> amdgpu
>> virtio
>>
>> Calls drm_fb_helper_fini():
>> armada
>> ast
>> exynos
>> gma500
>> hisilicon
>> mgag200
>> msm
>> nouveau
>> omap
>> radeon
>> rockchip
>> tegra
>> udl
>> bochs - Uses drm_fb_helper_fbdev_setup()
>> qxl - Uses drm_fb_helper_fbdev_setup()
>> vboxvideo - Uses drm_fb_helper_fbdev_setup()
>>
>> Might clean up, not sure:
>> cirrus
>>
>> Looks suspicious:
>> i915
>>
>> I looked at bochs before it switched to drm_fb_helper_fbdev_setup() and
>> it also just called drm_fb_helper_fini().
>>
>> It looks like you've uncovered something no one has though about (or
>> not implemented at least).
>>
>> It's not just the framebuffer that's not destroyed, the buffer object
>> is also leaked. drm_mode_config_cleanup() yells about the framebuffer
>> (and frees it), but says nothing about the buffer object. It might be
>> that it can't even be made to detect that since some drivers do special
>> stuff for the fbdev buffer.
>>
>> I'll pick up on this and do some testing after the Christmas holidays.
>
> Thanks, the warning is bad for CI (which uses QEMU), but otherwise it
> should not have any effect on regular users so it can wait.
>

This patch is good as long as it's applied along side the fix[1] to the
generic emulation:

Reviewed-by: Noralf Trønnes <[email protected]>

I can apply them both when I get an ack/rb on the other patch.

Thanks for fixing this.

Noralf.

[1] https://patchwork.freedesktop.org/patch/275002/

2019-01-08 17:58:25

by Noralf Trønnes

[permalink] [raw]
Subject: Re: [PATCH] drm/fb-helper: fix leaks in error path of drm_fb_helper_fbdev_setup



Den 05.01.2019 19.25, skrev Noralf Trønnes:
>
>
> Den 24.12.2018 16.03, skrev Peter Wu:
>> On Mon, Dec 24, 2018 at 03:52:55PM +0100, Noralf Trønnes wrote:
>>>
>>>
>>> Den 24.12.2018 00.10, skrev Peter Wu:
>>>> On Sun, Dec 23, 2018 at 02:55:52PM +0100, Noralf Trønnes wrote:
>>>>>
>>>>>
>>>>> Den 23.12.2018 01.55, skrev Peter Wu:
>>>>>> After drm_fb_helper_fbdev_setup calls drm_fb_helper_init,
>>>>>> "dev->fb_helper" will be initialized (and thus drm_fb_helper_fini
>>>>>> will
>>>>>> have some effect). After that, drm_fb_helper_initial_config is called
>>>>>> which may call the "fb_probe" driver callback.
>>>>>>
>>>>>> This driver callback may call drm_fb_helper_defio_init (as is done by
>>>>>> drm_fb_helper_generic_probe) or set a framebuffer (as is done by
>>>>>> bochs)
>>>>>> as documented. These are normally cleaned up on exit by
>>>>>> drm_fb_helper_fbdev_teardown which also calls drm_fb_helper_fini.
>>>>>>
>>>>>> If an error occurs after "fb_probe", but before setup is complete,
>>>>>> then
>>>>>> calling just drm_fb_helper_fini will leak resources. This was
>>>>>> triggered
>>>>>> by df2052cc922 ("bochs: convert to
>>>>>> drm_fb_helper_fbdev_setup/teardown"):
>>>>>>
>>>>>>        [   50.008030] bochsdrmfb: enable CONFIG_FB_LITTLE_ENDIAN
>>>>>> to support this framebuffer
>>>>>>        [   50.009436] bochs-drm 0000:00:02.0:
>>>>>> [drm:drm_fb_helper_fbdev_setup] *ERROR* fbdev: Failed to set
>>>>>> configuration (ret=-38)
>>>>>>        [   50.011456] [drm] Initialized bochs-drm 1.0.0 20130925
>>>>>> for 0000:00:02.0 on minor 2
>>>>>>        [   50.013604] WARNING: CPU: 1 PID: 1 at
>>>>>> drivers/gpu/drm/drm_mode_config.c:477
>>>>>> drm_mode_config_cleanup+0x280/0x2a0
>>>>>>        [   50.016175] CPU: 1 PID: 1 Comm: swapper/0 Tainted:
>>>>>> G                T 4.20.0-rc7 #1
>>>>>>        [   50.017732] EIP: drm_mode_config_cleanup+0x280/0x2a0
>>>>>>        ...
>>>>>>        [   50.023155] Call Trace:
>>>>>>        [   50.023155]  ? bochs_kms_fini+0x1e/0x30
>>>>>>        [   50.023155]  ? bochs_unload+0x18/0x40
>>>>>>
>>>>>> This can be reproduced with QEMU and CONFIG_FB_LITTLE_ENDIAN=n.
>>>>>>
>>>>>> Link: https://lkml.kernel.org/r/20181221083226.GI23332@shao2-debian
>>>>>> Link: https://lkml.kernel.org/r/20181223004315.GA11455@al
>>>>>> Fixes: 8741216396b2 ("drm/fb-helper: Add
>>>>>> drm_fb_helper_fbdev_setup/teardown()")
>>>>>> Reported-by: kernel test robot <[email protected]>
>>>>>> Cc: Noralf Trønnes <[email protected]>
>>>>>> Signed-off-by: Peter Wu <[email protected]>
>>>>>> ---
>>>>>>     drivers/gpu/drm/drm_fb_helper.c | 2 +-
>>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/drm_fb_helper.c
>>>>>> b/drivers/gpu/drm/drm_fb_helper.c
>>>>>> index 9d64f874f965..432e0f3b9267 100644
>>>>>> --- a/drivers/gpu/drm/drm_fb_helper.c
>>>>>> +++ b/drivers/gpu/drm/drm_fb_helper.c
>>>>>> @@ -2860,7 +2860,7 @@ int drm_fb_helper_fbdev_setup(struct
>>>>>> drm_device *dev,
>>>>>>         return 0;
>>>>>>     err_drm_fb_helper_fini:
>>>>>> -    drm_fb_helper_fini(fb_helper);
>>>>>> +    drm_fb_helper_fbdev_teardown(dev);
>>>>>
>>>>> This change will break the error path for drm_fbdev_generic_setup()
>>>>> because drm_fb_helper_generic_probe() cleans up on error but doesn't
>>>>> clear drm_fb_helper->fb resulting in a double
>>>>> drm_framebuffer_remove().
>>>>
>>>> This should probably considered a bug of drm_fb_helper_generic_probe.
>>>> Ownership of fb_helper should remain with the caller. The caller can
>>>> detect an error and act accordingly.
>>>>
>>>>> My assumption has been that the drm_fb_helper_funcs->fb_probe callback
>>>>> cleans up its resources on error. Clearly this is not the case for
>>>>> bochs, so
>>>>> my take on this is that bochsfb_create() needs to clean up on error.
>>>>
>>>> That assumption still holds for bochs. The problem is this sequence:
>>>> - drm_fb_helper_fbdev_setup is called.
>>>> - fb_probe succeeds (this is crucial).
>>>> - register_framebuffer fails.
>>>> - error path of setup is triggered.
>>>>
>>>> As fb_helper is fully setup by drivers, the drm_fb_helper core should
>>>> fully deallocate it again on the error path or else a leak occurs.
>>>>
>>>>> Gerd has a patchset that switches bochs over to the generic fbdev
>>>>> emulation, but ofc that doesn't help with 4.20:
>>>>> https://patchwork.freedesktop.org/series/54269/
>>>>
>>>> And that does not help with other users of the drm_fb_helper who use
>>>> functions like drm_fb_helper_defio_init. They will likely run in the
>>>> same problem.
>>>>
>>>> I don't have a way to test tinydrm or other drivers, but if you force
>>>> register_framebuffer to fail, you should be able to reproduce the
>>>> problem with drm_fb_helper_generic_probe.
>>>>
>>>
>>> Now I understand. I have looked at the drivers that use drm_fb_helper
>>> and no one seem to handle the case where register_framebuffer() is
>>> failing.
>>>
>>> Here's what drivers do when drm_fb_helper_initial_config() fails:
>>>
>>> Doesn't check:
>>> amdgpu
>>> virtio
>>>
>>> Calls drm_fb_helper_fini():
>>> armada
>>> ast
>>> exynos
>>> gma500
>>> hisilicon
>>> mgag200
>>> msm
>>> nouveau
>>> omap
>>> radeon
>>> rockchip
>>> tegra
>>> udl
>>> bochs - Uses drm_fb_helper_fbdev_setup()
>>> qxl - Uses drm_fb_helper_fbdev_setup()
>>> vboxvideo - Uses drm_fb_helper_fbdev_setup()
>>>
>>> Might clean up, not sure:
>>> cirrus
>>>
>>> Looks suspicious:
>>> i915
>>>
>>> I looked at bochs before it switched to drm_fb_helper_fbdev_setup() and
>>> it also just called drm_fb_helper_fini().
>>>
>>> It looks like you've uncovered something no one has though about (or
>>> not implemented at least).
>>>
>>> It's not just the framebuffer that's not destroyed, the buffer object
>>> is also leaked. drm_mode_config_cleanup() yells about the framebuffer
>>> (and frees it), but says nothing about the buffer object. It might be
>>> that it can't even be made to detect that since some drivers do special
>>> stuff for the fbdev buffer.
>>>
>>> I'll pick up on this and do some testing after the Christmas holidays.
>>
>> Thanks, the warning is bad for CI (which uses QEMU), but otherwise it
>> should not have any effect on regular users so it can wait.
>>
>
> This patch is good as long as it's applied along side the fix[1] to the
> generic emulation:
>
> Reviewed-by: Noralf Trønnes <[email protected]>
>
> I can apply them both when I get an ack/rb on the other patch.
>

Applied to drm-misc-next.

Noralf.