2018-07-12 17:04:46

by Lyude Paul

[permalink] [raw]
Subject: [PATCH v2 0/3] drm/nouveau: Fix runtime PM leaks

This is the latest version of
https://patchwork.freedesktop.org/series/45862/ . One new patch has been
added that also addresses some additional issues I found with
pmops_runtime_idle that would stop nouveau from suspending the GPU when
running under X.

Additionally,
"drm/nouveau: Fix runtime PM leak in nv50_disp_atomic_commit()" has had
it's CC to stable removed.

Lyude Paul (3):
drm/nouveau: Fix runtime PM leak in drm_open()
drm/nouveau: Fix runtime PM leak in nv50_disp_atomic_commit()
drm/nouveau: Remove bogus crtc check in pmops_runtime_idle

drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_drm.c | 17 ++++-------------
2 files changed, 5 insertions(+), 14 deletions(-)

--
2.17.1



2018-07-12 17:03:57

by Lyude Paul

[permalink] [raw]
Subject: [PATCH v2 1/3] drm/nouveau: Fix runtime PM leak in drm_open()

Noticed this as I was skimming through, if we fail to allocate memory
for cli we'll end up returning without dropping the runtime PM ref we
got. Additionally, we'll even return the wrong return code! (ret most
likely will == 0 here, we want -ENOMEM).

Signed-off-by: Lyude Paul <[email protected]>
Reviewed-by: Lukas Wunner <[email protected]>
---
drivers/gpu/drm/nouveau/nouveau_drm.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 0452b18d36b9..0f668e275ee1 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -919,8 +919,10 @@ nouveau_drm_open(struct drm_device *dev, struct drm_file *fpriv)
get_task_comm(tmpname, current);
snprintf(name, sizeof(name), "%s[%d]", tmpname, pid_nr(fpriv->pid));

- if (!(cli = kzalloc(sizeof(*cli), GFP_KERNEL)))
- return ret;
+ if (!(cli = kzalloc(sizeof(*cli), GFP_KERNEL))) {
+ ret = -ENOMEM;
+ goto done;
+ }

ret = nouveau_cli_init(drm, name, cli);
if (ret)
--
2.17.1


2018-07-12 17:03:59

by Lyude Paul

[permalink] [raw]
Subject: [PATCH v2 3/3] drm/nouveau: Remove bogus crtc check in pmops_runtime_idle

This both uses the legacy modesetting structures in a racy manner, and
additionally also doesn't even check the right variable (enabled != the
CRTC is actually turned on for atomic).

This fixes issues on my P50 regarding the dedicated GPU not entering
runtime suspend.

Signed-off-by: Lyude Paul <[email protected]>
Cc: [email protected]
---
drivers/gpu/drm/nouveau/nouveau_drm.c | 11 -----------
1 file changed, 11 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 0f668e275ee1..c7ec86d6c3c9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -881,22 +881,11 @@ nouveau_pmops_runtime_resume(struct device *dev)
static int
nouveau_pmops_runtime_idle(struct device *dev)
{
- struct pci_dev *pdev = to_pci_dev(dev);
- struct drm_device *drm_dev = pci_get_drvdata(pdev);
- struct nouveau_drm *drm = nouveau_drm(drm_dev);
- struct drm_crtc *crtc;
-
if (!nouveau_pmops_runtime()) {
pm_runtime_forbid(dev);
return -EBUSY;
}

- list_for_each_entry(crtc, &drm->dev->mode_config.crtc_list, head) {
- if (crtc->enabled) {
- DRM_DEBUG_DRIVER("failing to power off - crtc active\n");
- return -EBUSY;
- }
- }
pm_runtime_mark_last_busy(dev);
pm_runtime_autosuspend(dev);
/* we don't want the main rpm_idle to call suspend - we want to autosuspend */
--
2.17.1


2018-07-12 17:04:06

by Lyude Paul

[permalink] [raw]
Subject: [PATCH v2 2/3] drm/nouveau: Fix runtime PM leak in nv50_disp_atomic_commit()

A CRTC being enabled doesn't mean it's on! It doesn't even necessarily
mean it's being used. This fixes runtime PM leaks on the P50 I've got
next to me.

Signed-off-by: Lyude Paul <[email protected]>
Cc: [email protected]
---
drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index d9da69c83ae7..9bae4db84cfb 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -1878,7 +1878,7 @@ nv50_disp_atomic_commit(struct drm_device *dev,
nv50_disp_atomic_commit_tail(state);

drm_for_each_crtc(crtc, dev) {
- if (crtc->state->enable) {
+ if (crtc->state->active) {
if (!drm->have_disp_power_ref) {
drm->have_disp_power_ref = true;
return 0;
--
2.17.1


2018-07-12 17:18:20

by Daniel Vetter

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] drm/nouveau: Remove bogus crtc check in pmops_runtime_idle

On Thu, Jul 12, 2018 at 01:02:54PM -0400, Lyude Paul wrote:
> This both uses the legacy modesetting structures in a racy manner, and
> additionally also doesn't even check the right variable (enabled != the
> CRTC is actually turned on for atomic).
>
> This fixes issues on my P50 regarding the dedicated GPU not entering
> runtime suspend.
>
> Signed-off-by: Lyude Paul <[email protected]>
> Cc: [email protected]

On both patch 2&3:

Reviewed-by: Daniel Vetter <[email protected]>

->enable vs. ->active is probably the biggest source of pain in atomic,
and beyond typing even more kerneldoc that will be ignored (there's
another series doing exactly that on the list) I have no idea what to do.
90% rule is to look at ->enable in atomic_check code (since DPMS changes
should always work) and ->active in atomic_commit code.

Wrt the legacy state: For the legacy pointers we can set them to NULL for
atomic, and Ville has done that. That's real effective at stopping drivers
from looking at the wrong thing. But for the others like this one here I
dunno what to do to effectively hide them from atomic drivers.

</rant>

Cheers, Daniel

> ---
> drivers/gpu/drm/nouveau/nouveau_drm.c | 11 -----------
> 1 file changed, 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
> index 0f668e275ee1..c7ec86d6c3c9 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> @@ -881,22 +881,11 @@ nouveau_pmops_runtime_resume(struct device *dev)
> static int
> nouveau_pmops_runtime_idle(struct device *dev)
> {
> - struct pci_dev *pdev = to_pci_dev(dev);
> - struct drm_device *drm_dev = pci_get_drvdata(pdev);
> - struct nouveau_drm *drm = nouveau_drm(drm_dev);
> - struct drm_crtc *crtc;
> -
> if (!nouveau_pmops_runtime()) {
> pm_runtime_forbid(dev);
> return -EBUSY;
> }
>
> - list_for_each_entry(crtc, &drm->dev->mode_config.crtc_list, head) {
> - if (crtc->enabled) {
> - DRM_DEBUG_DRIVER("failing to power off - crtc active\n");
> - return -EBUSY;
> - }
> - }
> pm_runtime_mark_last_busy(dev);
> pm_runtime_autosuspend(dev);
> /* we don't want the main rpm_idle to call suspend - we want to autosuspend */
> --
> 2.17.1
>
> _______________________________________________
> dri-devel mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2018-07-17 07:34:59

by Lukas Wunner

[permalink] [raw]
Subject: Re: [Nouveau] [PATCH v2 2/3] drm/nouveau: Fix runtime PM leak in nv50_disp_atomic_commit()

On Thu, Jul 12, 2018 at 01:02:53PM -0400, Lyude Paul wrote:
> --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
> +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> @@ -1878,7 +1878,7 @@ nv50_disp_atomic_commit(struct drm_device *dev,
> nv50_disp_atomic_commit_tail(state);
>
> drm_for_each_crtc(crtc, dev) {
> - if (crtc->state->enable) {
> + if (crtc->state->active) {
> if (!drm->have_disp_power_ref) {
> drm->have_disp_power_ref = true;
> return 0;

Somewhat tangential comment on this older patch, since you
continue to dig around in the runtime PM area:

Whenever a crtc is activated or deactivated in nouveau, we iterate
over all crtcs and acquire a runtime PM if a crtc is active and
previously there was no active one, or we drop a ref if none is
active and previously there was an active one.

For a while now I've been thinking that it would be more straightforward
to acquire a ref whenever a crtc is activated and drop one when a crtc
is deactivated, i.e. hold one ref for every active crtc. That way the
have_disp_power_ref variable as well as the iteration logic could be
removed, leading to a simplification. Just a suggestion anyway.

Thanks,

Lukas

2018-07-17 10:32:20

by Ville Syrjälä

[permalink] [raw]
Subject: Re: [Nouveau] [PATCH v2 2/3] drm/nouveau: Fix runtime PM leak in nv50_disp_atomic_commit()

On Tue, Jul 17, 2018 at 09:33:52AM +0200, Lukas Wunner wrote:
> On Thu, Jul 12, 2018 at 01:02:53PM -0400, Lyude Paul wrote:
> > --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
> > +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> > @@ -1878,7 +1878,7 @@ nv50_disp_atomic_commit(struct drm_device *dev,
> > nv50_disp_atomic_commit_tail(state);
> >
> > drm_for_each_crtc(crtc, dev) {
> > - if (crtc->state->enable) {
> > + if (crtc->state->active) {
> > if (!drm->have_disp_power_ref) {
> > drm->have_disp_power_ref = true;
> > return 0;
>
> Somewhat tangential comment on this older patch, since you
> continue to dig around in the runtime PM area:
>
> Whenever a crtc is activated or deactivated in nouveau, we iterate
> over all crtcs and acquire a runtime PM if a crtc is active and
> previously there was no active one, or we drop a ref if none is
> active and previously there was an active one.
>
> For a while now I've been thinking that it would be more straightforward
> to acquire a ref whenever a crtc is activated and drop one when a crtc
> is deactivated, i.e. hold one ref for every active crtc. That way the
> have_disp_power_ref variable as well as the iteration logic could be
> removed, leading to a simplification. Just a suggestion anyway.

The current code looks somewhat busted anyway. First problem is that
it's accessing crtc->state without the appropriate locks held (unless
something always pulls in all crtcs to every commit?). Second issue
is that the rpm_put() is called without waiting for nonblocking commits
to have finished so it looks like you can currentlly remove the power
before the hardware has been properly shut down.

--
Ville Syrj?l?
Intel