2018-09-04 21:10:26

by Peter Wu

Subject: [PATCH] qxl: fix null-pointer crash during suspend

"crtc->helper_private" is not initialized by the QXL driver and thus the
"crtc_funcs->disable" call would crash (resulting in suspend failure).
Fix this by converting the suspend/resume functions to use the
drm_mode_config_helper_* helpers.

Tested system sleep with QEMU 3.0 using "echo mem > /sys/power/state".
During suspend the following message is visible from QEMU:

spice/server/display-channel.c:2425:display_channel_validate_surface: canvas address is 0x7fd05da68308 for 0 (and is NULL)
spice/server/display-channel.c:2426:display_channel_validate_surface: failed on 0

This seems to be triggered by QXL_IO_NOTIFY_CMD after
QXL_IO_DESTROY_PRIMARY_ASYNC, but aside from the warning things still
seem to work (tested with both the GTK and -spice options).

Signed-off-by: Peter Wu <[email protected]>
---
Hi,

I found this issue while trying to suspend a VM that uses QXL. In order to see
the stack trace over serial, boot with no_console_suspend. Searching for
"qxl_drm_freeze" showed one recent report from Alan:
https://lkml.kernel.org/r/[email protected]

Kind regards,
Peter
---
drivers/gpu/drm/qxl/qxl_drv.c | 26 +++++---------------------
1 file changed, 5 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
index 2445e75cf7ea..d00f45eed03c 100644
--- a/drivers/gpu/drm/qxl/qxl_drv.c
+++ b/drivers/gpu/drm/qxl/qxl_drv.c
@@ -136,20 +136,11 @@ static int qxl_drm_freeze(struct drm_device *dev)
 {
 	struct pci_dev *pdev = dev->pdev;
 	struct qxl_device *qdev = dev->dev_private;
-	struct drm_crtc *crtc;
-
-	drm_kms_helper_poll_disable(dev);
-
-	console_lock();
-	qxl_fbdev_set_suspend(qdev, 1);
-	console_unlock();
+	int ret;

-	/* unpin the front buffers */
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		const struct drm_crtc_helper_funcs *crtc_funcs = crtc->helper_private;
-		if (crtc->enabled)
-			(*crtc_funcs->disable)(crtc);
-	}
+	ret = drm_mode_config_helper_suspend(dev);
+	if (ret)
+		return ret;

 	qxl_destroy_monitors_object(qdev);
 	qxl_surf_evict(qdev);
@@ -175,14 +166,7 @@ static int qxl_drm_resume(struct drm_device *dev, bool thaw)
}

 	qxl_create_monitors_object(qdev);
-	drm_helper_resume_force_mode(dev);
-
-	console_lock();
-	qxl_fbdev_set_suspend(qdev, 0);
-	console_unlock();
-
-	drm_kms_helper_poll_enable(dev);
-	return 0;
+	return drm_mode_config_helper_resume(dev);
 }

 static int qxl_pm_suspend(struct device *dev)
--
2.18.0
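
For readers following the control flow of the patched freeze path, here is a
minimal user-space sketch of the "call the helper, bail out on error, then run
the driver teardown" shape. Everything in it is a hypothetical mock:
mock_device, mock_helper_suspend(), mock_destroy_monitors() and mock_freeze()
are illustrative names only and are not the kernel's definitions of
drm_mode_config_helper_suspend() or the QXL functions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical user-space mock; none of these are the kernel's definitions. */
struct mock_device {
	int suspend_result;     /* value the mocked helper will return */
	int monitors_destroyed; /* set once the mocked teardown has run */
};

/* Stand-in for the suspend helper: returns 0 or a negative errno. */
static int mock_helper_suspend(struct mock_device *dev)
{
	return dev->suspend_result;
}

/* Stand-in for the driver-specific teardown that follows the helper call. */
static void mock_destroy_monitors(struct mock_device *dev)
{
	dev->monitors_destroyed = 1;
}

/* Same shape as the patched freeze path: return before teardown on error. */
static int mock_freeze(struct mock_device *dev)
{
	int ret;

	ret = mock_helper_suspend(dev);
	if (ret)
		return ret;

	mock_destroy_monitors(dev);
	return 0;
}

/* Exercises the success path and the early-return path. */
static void mock_freeze_selfcheck(void)
{
	struct mock_device ok = { .suspend_result = 0 };
	struct mock_device bad = { .suspend_result = -EBUSY };

	assert(mock_freeze(&ok) == 0);
	assert(ok.monitors_destroyed == 1);

	assert(mock_freeze(&bad) == -EBUSY);
	assert(bad.monitors_destroyed == 0);
}
```

Calling mock_freeze_selfcheck() from a test program checks that a helper
failure is propagated unchanged and that the teardown step is skipped in that
case.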



2018-10-01 20:15:58

by Fubo Chen

Subject: Re: [PATCH] qxl: fix null-pointer crash during suspend

On Tue, Sep 4, 2018 at 2:10 PM Peter Wu <[email protected]> wrote:
>
> "crtc->helper_private" is not initialized by the QXL driver and thus the
> "crtc_funcs->disable" call would crash (resulting in suspend failure).
> Fix this by converting the suspend/resume functions to use the
> drm_mode_config_helper_* helpers.
>
> Tested system sleep with QEMU 3.0 using "echo mem > /sys/power/state".
> During suspend the following message is visible from QEMU:
>
> spice/server/display-channel.c:2425:display_channel_validate_surface: canvas address is 0x7fd05da68308 for 0 (and is NULL)
> spice/server/display-channel.c:2426:display_channel_validate_surface: failed on 0
>
> This seems to be triggered by QXL_IO_NOTIFY_CMD after
> QXL_IO_DESTROY_PRIMARY_ASYNC, but aside from the warning things still
> seem to work (tested with both the GTK and -spice options).
>
> Signed-off-by: Peter Wu <[email protected]>

Is this a new issue or something that was introduced a long time ago?
In the latter case, please consider adding a "Cc:
<[email protected]>" tag to this patch.

Thanks,

Fubo.

2018-10-01 20:33:55

by Peter Wu

Subject: Re: [PATCH] qxl: fix null-pointer crash during suspend

On Mon, Oct 01, 2018 at 01:13:59PM -0700, Fubo Chen wrote:
> On Tue, Sep 4, 2018 at 2:10 PM Peter Wu <[email protected]> wrote:
> >
> > "crtc->helper_private" is not initialized by the QXL driver and thus the
> > "crtc_funcs->disable" call would crash (resulting in suspend failure).
> > Fix this by converting the suspend/resume functions to use the
> > drm_mode_config_helper_* helpers.
> >
> > Tested system sleep with QEMU 3.0 using "echo mem > /sys/power/state".
> > During suspend the following message is visible from QEMU:
> >
> > spice/server/display-channel.c:2425:display_channel_validate_surface: canvas address is 0x7fd05da68308 for 0 (and is NULL)
> > spice/server/display-channel.c:2426:display_channel_validate_surface: failed on 0
> >
> > This seems to be triggered by QXL_IO_NOTIFY_CMD after
> > QXL_IO_DESTROY_PRIMARY_ASYNC, but aside from the warning things still
> > seem to work (tested with both the GTK and -spice options).
> >
> > Signed-off-by: Peter Wu <[email protected]>
>
> Is this a new issue or something that was introduced a long time ago?
> In the latter case, please consider adding a "Cc:
> <[email protected]>" tag to this patch.

I am not sure exactly when the issue was introduced, but the original
code was added in v3.10-rc7-800-gd84300bf7934 while the new
drm_mode_config_helper_suspend API was added in 4.16.

The intended call chain to initialize the private object seems to be:
drm_crtc_helper_add
<- qdev_crtc_init
<- qxl_modeset_init
<- qxl_pci_probe

If any error occurs along the call chain, the helper_private pointer
will remain NULL. The same could happen if the crtc is obtained in some
other way (I am not sure how).
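
The hypothesis above (an error somewhere in the chain leaves the pointer
unset) can be sketched in user-space C. This is only an illustration of the
idea, not the driver code: the mock_* names, their signatures and the
fail_early flag are all hypothetical stand-ins for drm_crtc_helper_add,
qdev_crtc_init, qxl_modeset_init and qxl_pci_probe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space mocks; names and signatures are NOT the kernel's. */
struct mock_helper_funcs {
	int placeholder; /* the real struct holds callbacks */
};

struct mock_crtc {
	const struct mock_helper_funcs *helper_private;
};

static const struct mock_helper_funcs mock_funcs = { 0 };

/* Stand-in for drm_crtc_helper_add(): records the helper table on the crtc. */
static void mock_crtc_helper_add(struct mock_crtc *crtc,
				 const struct mock_helper_funcs *funcs)
{
	crtc->helper_private = funcs;
}

/* Stand-in for qdev_crtc_init(); fail_early models an error before the add. */
static int mock_crtc_init(struct mock_crtc *crtc, int fail_early)
{
	if (fail_early)
		return -1;
	mock_crtc_helper_add(crtc, &mock_funcs);
	return 0;
}

/* Stand-ins for qxl_modeset_init() and qxl_pci_probe(): pass errors upward. */
static int mock_modeset_init(struct mock_crtc *crtc, int fail_early)
{
	return mock_crtc_init(crtc, fail_early);
}

static int mock_probe(struct mock_crtc *crtc, int fail_early)
{
	crtc->helper_private = NULL;
	return mock_modeset_init(crtc, fail_early);
}

/* Exercises both paths of the mocked chain. */
static void mock_chain_selfcheck(void)
{
	struct mock_crtc ok, broken;

	assert(mock_probe(&ok, 0) == 0);
	assert(ok.helper_private == &mock_funcs);

	assert(mock_probe(&broken, 1) != 0);
	assert(broken.helper_private == NULL);
}
```

In the mock, a failure before the helper-add step leaves helper_private NULL,
while the successful path sets it.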

I am not sure it is worth backporting: suspend/resume does not seem to be
an important use case for VMs using QXL, and the fix was not validated on
older kernels.
--
Kind regards,
Peter Wu
https://lekensteyn.nl

2018-10-02 08:14:53

by Daniel Vetter

Subject: Re: [PATCH] qxl: fix null-pointer crash during suspend

On Tue, Sep 04, 2018 at 10:27:47PM +0200, Peter Wu wrote:
> "crtc->helper_private" is not initialized by the QXL driver and thus the

This is still initialized, it's the ->disable that goes boom. At least the
call to drm_crtc_helper_add is still there. The ->disable was removed in:

commit 64581714b58bc3e16ede8dc37a025c3aa0e0eef1
Author: Laurent Pinchart <[email protected]>
Date: Fri Jun 30 12:36:45 2017 +0300

drm: Convert atomic drivers from CRTC .disable() to .atomic_disable()

Fixes: 64581714b58b ("drm: Convert atomic drivers from CRTC .disable() to .atomic_disable()")
Cc: <[email protected]> # v4.14+
Reviewed-by: Daniel Vetter <[email protected]>

I'll let Gerd pick this one up, after some testing. Also adding Laurent.
-Daniel
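
The distinction drawn above (helper_private is set, but its ->disable member
is NULL) can be sketched in user-space C. The struct layouts and function
names below are hypothetical mocks that only mirror the shape of the problem;
they are not the kernel's drm_crtc or drm_crtc_helper_funcs definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct mock_crtc;

struct mock_crtc_helper_funcs {
	void (*disable)(struct mock_crtc *crtc);        /* legacy hook, may be NULL */
	void (*atomic_disable)(struct mock_crtc *crtc); /* atomic hook */
};

struct mock_crtc {
	bool enabled;
	const struct mock_crtc_helper_funcs *helper_private;
};

static void mock_atomic_disable(struct mock_crtc *crtc)
{
	crtc->enabled = false;
}

/* Models a driver after the conversion: only .atomic_disable is filled in. */
static const struct mock_crtc_helper_funcs mock_funcs = {
	.atomic_disable = mock_atomic_disable,
};

/* True only if a call through funcs->disable would not hit a NULL pointer. */
static bool legacy_disable_is_callable(const struct mock_crtc *crtc)
{
	const struct mock_crtc_helper_funcs *funcs = crtc->helper_private;

	return funcs != NULL && funcs->disable != NULL;
}

/* Shows that the table pointer is valid while the legacy member is NULL. */
static void mock_disable_selfcheck(void)
{
	struct mock_crtc crtc = { .enabled = true, .helper_private = &mock_funcs };

	assert(crtc.helper_private != NULL);
	assert(!legacy_disable_is_callable(&crtc));

	crtc.helper_private->atomic_disable(&crtc);
	assert(!crtc.enabled);
}
```

An unconditional call through the legacy member, as in the removed loop,
would dereference a NULL function pointer in this mock even though the table
pointer itself is valid.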

> "crtc_funcs->disable" call would crash (resulting in suspend failure).
> Fix this by converting the suspend/resume functions to use the
> drm_mode_config_helper_* helpers.
>
> Tested system sleep with QEMU 3.0 using "echo mem > /sys/power/state".
> During suspend the following message is visible from QEMU:
>
> spice/server/display-channel.c:2425:display_channel_validate_surface: canvas address is 0x7fd05da68308 for 0 (and is NULL)
> spice/server/display-channel.c:2426:display_channel_validate_surface: failed on 0
>
> This seems to be triggered by QXL_IO_NOTIFY_CMD after
> QXL_IO_DESTROY_PRIMARY_ASYNC, but aside from the warning things still
> seem to work (tested with both the GTK and -spice options).
>
> Signed-off-by: Peter Wu <[email protected]>
> ---
> Hi,
>
> I found this issue while trying to suspend a VM that uses QXL. In order to see
> the stack trace over serial, boot with no_console_suspend. Searching for
> "qxl_drm_freeze" showed one recent report from Alan:
> https://lkml.kernel.org/r/[email protected]
>
> Kind regards,
> Peter
> ---
> drivers/gpu/drm/qxl/qxl_drv.c | 26 +++++---------------------
> 1 file changed, 5 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
> index 2445e75cf7ea..d00f45eed03c 100644
> --- a/drivers/gpu/drm/qxl/qxl_drv.c
> +++ b/drivers/gpu/drm/qxl/qxl_drv.c
> @@ -136,20 +136,11 @@ static int qxl_drm_freeze(struct drm_device *dev)
> {
> struct pci_dev *pdev = dev->pdev;
> struct qxl_device *qdev = dev->dev_private;
> - struct drm_crtc *crtc;
> -
> - drm_kms_helper_poll_disable(dev);
> -
> - console_lock();
> - qxl_fbdev_set_suspend(qdev, 1);
> - console_unlock();
> + int ret;
>
> - /* unpin the front buffers */
> - list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
> - const struct drm_crtc_helper_funcs *crtc_funcs = crtc->helper_private;
> - if (crtc->enabled)
> - (*crtc_funcs->disable)(crtc);
> - }
> + ret = drm_mode_config_helper_suspend(dev);
> + if (ret)
> + return ret;
>
> qxl_destroy_monitors_object(qdev);
> qxl_surf_evict(qdev);
> @@ -175,14 +166,7 @@ static int qxl_drm_resume(struct drm_device *dev, bool thaw)
> }
>
> qxl_create_monitors_object(qdev);
> - drm_helper_resume_force_mode(dev);
> -
> - console_lock();
> - qxl_fbdev_set_suspend(qdev, 0);
> - console_unlock();
> -
> - drm_kms_helper_poll_enable(dev);
> - return 0;
> + return drm_mode_config_helper_resume(dev);
> }
>
> static int qxl_pm_suspend(struct device *dev)
> --
> 2.18.0
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2018-10-02 10:07:37

by Laurent Pinchart

Subject: Re: [PATCH] qxl: fix null-pointer crash during suspend

Hello,

On Tuesday, 2 October 2018 11:14:22 EEST Daniel Vetter wrote:
> On Tue, Sep 04, 2018 at 10:27:47PM +0200, Peter Wu wrote:
> > "crtc->helper_private" is not initialized by the QXL driver and thus the
>
> This is still initialized, it's the ->disable that goes boom. At least the
> call to drm_crtc_helper_add is still there. The ->disable was removed in:
>
> commit 64581714b58bc3e16ede8dc37a025c3aa0e0eef1
> Author: Laurent Pinchart <[email protected]>
> Date: Fri Jun 30 12:36:45 2017 +0300
>
> drm: Convert atomic drivers from CRTC .disable() to .atomic_disable()
>
> Fixes: 64581714b58b ("drm: Convert atomic drivers from CRTC .disable() to .atomic_disable()")
> Cc: <[email protected]> # v4.14+
> Reviewed-by: Daniel Vetter <[email protected]>
>
> I'll let Gerd pick this one up, after some testing. Also adding Laurent.

Sorry for breaking it :-( Please let me know if there's something I can do to
help.

> > "crtc_funcs->disable" call would crash (resulting in suspend failure).
> > Fix this by converting the suspend/resume functions to use the
> > drm_mode_config_helper_* helpers.
> >
> > Tested system sleep with QEMU 3.0 using "echo mem > /sys/power/state".
> > During suspend the following message is visible from QEMU:
> >
> > spice/server/display-channel.c:2425:display_channel_validate_surface: canvas address is 0x7fd05da68308 for 0 (and is NULL)
> > spice/server/display-channel.c:2426:display_channel_validate_surface: failed on 0
> >
> > This seems to be triggered by QXL_IO_NOTIFY_CMD after
> > QXL_IO_DESTROY_PRIMARY_ASYNC, but aside from the warning things still
> > seem to work (tested with both the GTK and -spice options).
> >
> > Signed-off-by: Peter Wu <[email protected]>
> > ---
> > Hi,
> >
> > I found this issue while trying to suspend a VM that uses QXL. In order to
> > see the stack trace over serial, boot with no_console_suspend. Searching
> > for "qxl_drm_freeze" showed one recent report from Alan:
> > https://lkml.kernel.org/r/[email protected]
> >
> > Kind regards,
> > Peter
> > ---
> >
> > drivers/gpu/drm/qxl/qxl_drv.c | 26 +++++---------------------
> > 1 file changed, 5 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
> > index 2445e75cf7ea..d00f45eed03c 100644
> > --- a/drivers/gpu/drm/qxl/qxl_drv.c
> > +++ b/drivers/gpu/drm/qxl/qxl_drv.c
> > @@ -136,20 +136,11 @@ static int qxl_drm_freeze(struct drm_device *dev)
> > {
> > struct pci_dev *pdev = dev->pdev;
> > struct qxl_device *qdev = dev->dev_private;
> > - struct drm_crtc *crtc;
> > -
> > - drm_kms_helper_poll_disable(dev);
> > -
> > - console_lock();
> > - qxl_fbdev_set_suspend(qdev, 1);
> > - console_unlock();
> > + int ret;
> >
> > - /* unpin the front buffers */
> > - list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
> > - const struct drm_crtc_helper_funcs *crtc_funcs = crtc->helper_private;
> > - if (crtc->enabled)
> > - (*crtc_funcs->disable)(crtc);
> > - }
> > + ret = drm_mode_config_helper_suspend(dev);
> > + if (ret)
> > + return ret;
> >
> > qxl_destroy_monitors_object(qdev);
> > qxl_surf_evict(qdev);
> > @@ -175,14 +166,7 @@ static int qxl_drm_resume(struct drm_device *dev, bool thaw)
> > }
> >
> > qxl_create_monitors_object(qdev);
> > - drm_helper_resume_force_mode(dev);
> > -
> > - console_lock();
> > - qxl_fbdev_set_suspend(qdev, 0);
> > - console_unlock();
> > -
> > - drm_kms_helper_poll_enable(dev);
> > - return 0;
> > + return drm_mode_config_helper_resume(dev);
> > }
> >
> > static int qxl_pm_suspend(struct device *dev)


--
Regards,

Laurent Pinchart