2023-09-21 19:53:50

by Doug Anderson

Subject: [RFT PATCH v2 07/12] drm/amdgpu: Call drm_atomic_helper_shutdown() at shutdown time

Based on grepping through the source code this driver appears to be
missing a call to drm_atomic_helper_shutdown() at system shutdown
time. Among other things, this means that if a panel is in use, it
won't be cleanly powered off at system shutdown time.

The fact that we should call drm_atomic_helper_shutdown() in the case
of OS shutdown/restart comes straight out of the kernel doc "driver
instance overview" in drm_drv.c.

Suggested-by: Maxime Ripard <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
---
This commit is only compile-time tested.

...and further, I'd say that this patch is more of a plea for help
than a patch I think is actually right. I'm _fairly_ certain that
drm/amdgpu needs this call at shutdown time but the logic is a bit
hard for me to follow. I'd appreciate if anyone who actually knows
what this should look like could illuminate me, or perhaps even just
post a patch themselves!

(no changes since v1)

drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 ++
3 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 8f2255b3a38a..cfcff0b37466 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1104,6 +1104,7 @@ static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_device *bdev)
int amdgpu_device_init(struct amdgpu_device *adev,
uint32_t flags);
void amdgpu_device_fini_hw(struct amdgpu_device *adev);
+void amdgpu_device_shutdown_hw(struct amdgpu_device *adev);
void amdgpu_device_fini_sw(struct amdgpu_device *adev);

int amdgpu_gpu_wait_for_idle(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a2cdde0ca0a7..fa5925c2092d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4247,6 +4247,16 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)

}

+void amdgpu_device_shutdown_hw(struct amdgpu_device *adev)
+{
+	if (adev->mode_info.mode_config_initialized) {
+		if (!drm_drv_uses_atomic_modeset(adev_to_drm(adev)))
+			drm_helper_force_disable_all(adev_to_drm(adev));
+		else
+			drm_atomic_helper_shutdown(adev_to_drm(adev));
+	}
+}
+
void amdgpu_device_fini_sw(struct amdgpu_device *adev)
{
int idx;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index e90f730eb715..3a7cbff111d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2333,6 +2333,8 @@ amdgpu_pci_shutdown(struct pci_dev *pdev)
 	struct drm_device *dev = pci_get_drvdata(pdev);
 	struct amdgpu_device *adev = drm_to_adev(dev);
 
+	amdgpu_device_shutdown_hw(adev);
+
 	if (amdgpu_ras_intr_triggered())
 		return;

--
2.42.0.515.g380fc7ccd1-goog


2023-09-26 00:46:43

by Doug Anderson

Subject: Re: [RFT PATCH v2 07/12] drm/amdgpu: Call drm_atomic_helper_shutdown() at shutdown time

Hi,

On Mon, Sep 25, 2023 at 8:57 AM Deucher, Alexander
<[email protected]> wrote:
>
> [Public]
>
> > -----Original Message-----
> > From: Douglas Anderson <[email protected]>
> > Sent: Thursday, September 21, 2023 3:27 PM
> > To: [email protected]; Maxime Ripard <[email protected]>
> > Cc: Douglas Anderson <[email protected]>; Zhang, Bokun
> > <[email protected]>; Zhang, Hawking <[email protected]>;
> > Zhu, James <[email protected]>; Zhao, Victor <[email protected]>;
> > Pan, Xinhui <[email protected]>; [email protected]; Deucher, Alexander
> > <[email protected]>; [email protected]; Koenig,
> > Christian <[email protected]>; [email protected]; Kuehling, Felix
> > <[email protected]>; [email protected]; Ma, Le
> > <[email protected]>; Lazar, Lijo <[email protected]>; linux-
> > [email protected]; [email protected]; Limonciello,
> > Mario <[email protected]>; [email protected]; Zhang,
> > Morris <[email protected]>; SHANMUGAM, SRINIVASAN
> > <[email protected]>; [email protected]
> > Subject: [RFT PATCH v2 07/12] drm/amdgpu: Call
> > drm_atomic_helper_shutdown() at shutdown time
> >
> > Based on grepping through the source code this driver appears to be missing a
> > call to drm_atomic_helper_shutdown() at system shutdown time. Among
> > other things, this means that if a panel is in use, it won't be cleanly
> > powered off at system shutdown time.
> >
> > The fact that we should call drm_atomic_helper_shutdown() in the case of OS
> > shutdown/restart comes straight out of the kernel doc "driver instance
> > overview" in drm_drv.c.
> >
> > Suggested-by: Maxime Ripard <[email protected]>
> > Signed-off-by: Douglas Anderson <[email protected]>
> > ---
> > This commit is only compile-time tested.
> >
> > ...and further, I'd say that this patch is more of a plea for help than a patch I
> > think is actually right. I'm _fairly_ certain that drm/amdgpu needs this call at
> > shutdown time but the logic is a bit hard for me to follow. I'd appreciate if
> > anyone who actually knows what this should look like could illuminate me, or
> > perhaps even just post a patch themselves!
> >
> > (no changes since v1)
> >
> > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
> > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++++++++++
> > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 ++
> > 3 files changed, 13 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > index 8f2255b3a38a..cfcff0b37466 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > @@ -1104,6 +1104,7 @@ static inline struct amdgpu_device
> > *amdgpu_ttm_adev(struct ttm_device *bdev) int amdgpu_device_init(struct
> > amdgpu_device *adev,
> > uint32_t flags);
> > void amdgpu_device_fini_hw(struct amdgpu_device *adev);
> > +void amdgpu_device_shutdown_hw(struct amdgpu_device *adev);
> > void amdgpu_device_fini_sw(struct amdgpu_device *adev);
> >
> > int amdgpu_gpu_wait_for_idle(struct amdgpu_device *adev); diff --git
> > a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index a2cdde0ca0a7..fa5925c2092d 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -4247,6 +4247,16 @@ void amdgpu_device_fini_hw(struct
> > amdgpu_device *adev)
> >
> > }
> >
> > +void amdgpu_device_shutdown_hw(struct amdgpu_device *adev) {
>
> This needs a better name since it's only for displays. Also maybe move it into amdgpu_display.c since it's really about turning off the displays. That said, is this really even needed? The driver already calls its suspend functionality to turn off all of the hardware and put it into a quiescent state before shutdown. It basically shares the same code we use for suspend.

As per my comment above, for this driver, my patch was a "plea for
help". I have no idea if it's really needed or if suspend handles it.

My main concerns are:

a) If it's possible that someone out there is using this DRM driver
with a "drm_panel" then we need to make sure the panel gets disabled /
unprepared properly at shutdown time. The goal is to remove the
special logic in some panel drivers that disables the panel at
shutdown time. The guidance I got from Maxime is that we should be
relying on the DRM driver to disable panels at shutdown time and not
have extra per-panel code for it.

b) It is documented that DRM drivers call drm_atomic_helper_shutdown()
at shutdown time. Even if things are working today, it's always
possible that something will change later and break for drivers that
aren't doing this.


If you're confident that everything is great for the "amdgpu" driver
then I'm happy to drop this patch and not consider it a blocker for
the eventual removal of the code in the individual panel drivers.

If, after reading this, you conclude that some sort of patch is
needed, I'd love it if you could test/post a patch yourself and then
I'll drop this patch from my series.


-Doug

2023-09-26 03:22:19

by Deucher, Alexander

Subject: RE: [RFT PATCH v2 07/12] drm/amdgpu: Call drm_atomic_helper_shutdown() at shutdown time

[Public]

> -----Original Message-----
> From: Douglas Anderson <[email protected]>
> Sent: Thursday, September 21, 2023 3:27 PM
> To: [email protected]; Maxime Ripard <[email protected]>
> Cc: Douglas Anderson <[email protected]>; Zhang, Bokun
> <[email protected]>; Zhang, Hawking <[email protected]>;
> Zhu, James <[email protected]>; Zhao, Victor <[email protected]>;
> Pan, Xinhui <[email protected]>; [email protected]; Deucher, Alexander
> <[email protected]>; [email protected]; Koenig,
> Christian <[email protected]>; [email protected]; Kuehling, Felix
> <[email protected]>; [email protected]; Ma, Le
> <[email protected]>; Lazar, Lijo <[email protected]>; linux-
> [email protected]; [email protected]; Limonciello,
> Mario <[email protected]>; [email protected]; Zhang,
> Morris <[email protected]>; SHANMUGAM, SRINIVASAN
> <[email protected]>; [email protected]
> Subject: [RFT PATCH v2 07/12] drm/amdgpu: Call
> drm_atomic_helper_shutdown() at shutdown time
>
> Based on grepping through the source code this driver appears to be missing a
> call to drm_atomic_helper_shutdown() at system shutdown time. Among
> other things, this means that if a panel is in use, it won't be cleanly
> powered off at system shutdown time.
>
> The fact that we should call drm_atomic_helper_shutdown() in the case of OS
> shutdown/restart comes straight out of the kernel doc "driver instance
> overview" in drm_drv.c.
>
> Suggested-by: Maxime Ripard <[email protected]>
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
> This commit is only compile-time tested.
>
> ...and further, I'd say that this patch is more of a plea for help than a patch I
> think is actually right. I'm _fairly_ certain that drm/amdgpu needs this call at
> shutdown time but the logic is a bit hard for me to follow. I'd appreciate if
> anyone who actually knows what this should look like could illuminate me, or
> perhaps even just post a patch themselves!
>
> (no changes since v1)
>
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++++++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 ++
> 3 files changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 8f2255b3a38a..cfcff0b37466 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1104,6 +1104,7 @@ static inline struct amdgpu_device
> *amdgpu_ttm_adev(struct ttm_device *bdev) int amdgpu_device_init(struct
> amdgpu_device *adev,
> uint32_t flags);
> void amdgpu_device_fini_hw(struct amdgpu_device *adev);
> +void amdgpu_device_shutdown_hw(struct amdgpu_device *adev);
> void amdgpu_device_fini_sw(struct amdgpu_device *adev);
>
> int amdgpu_gpu_wait_for_idle(struct amdgpu_device *adev); diff --git
> a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index a2cdde0ca0a7..fa5925c2092d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4247,6 +4247,16 @@ void amdgpu_device_fini_hw(struct
> amdgpu_device *adev)
>
> }
>
> +void amdgpu_device_shutdown_hw(struct amdgpu_device *adev) {

This needs a better name since it's only for displays. Also maybe move it into amdgpu_display.c since it's really about turning off the displays. That said, is this really even needed? The driver already calls its suspend functionality to turn off all of the hardware and put it into a quiescent state before shutdown. It basically shares the same code we use for suspend.


> +	if (adev->mode_info.mode_config_initialized) {
> +		if (!drm_drv_uses_atomic_modeset(adev_to_drm(adev)))
> +			drm_helper_force_disable_all(adev_to_drm(adev));
> +		else
> +			drm_atomic_helper_shutdown(adev_to_drm(adev));
> +	}
> +}
> +
> void amdgpu_device_fini_sw(struct amdgpu_device *adev) {
> int idx;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index e90f730eb715..3a7cbff111d1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -2333,6 +2333,8 @@ amdgpu_pci_shutdown(struct pci_dev *pdev)
> struct drm_device *dev = pci_get_drvdata(pdev);
> struct amdgpu_device *adev = drm_to_adev(dev);
>
> + amdgpu_device_shutdown_hw(adev);

I would move this after the ras_intr check below.

Alex

> +
> if (amdgpu_ras_intr_triggered())
> return;
>
> --
> 2.42.0.515.g380fc7ccd1-goog
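
For readers following the review, here is a minimal user-space sketch of
what the two suggestions above would roughly amount to: a helper named
for displays only, called after the RAS interrupt check rather than
before it. Everything in it is a mock and an assumption. struct
mock_adev, mock_display_shutdown() and mock_pci_shutdown() are
hypothetical names, the boolean fields stand in for the real driver
state and predicates, and the enum merely records which DRM helper the
real code would have called. It is not kernel code and has not been
checked against the actual amdgpu shutdown path.

```c
#include <stdbool.h>

/* Records which real DRM helper the mock "called". */
enum mock_helper {
	MOCK_NONE = 0,
	MOCK_ATOMIC_SHUTDOWN,   /* stands in for drm_atomic_helper_shutdown() */
	MOCK_FORCE_DISABLE_ALL, /* stands in for drm_helper_force_disable_all() */
};

/* Hypothetical stand-in for struct amdgpu_device; not the real layout. */
struct mock_adev {
	bool mode_config_initialized; /* adev->mode_info.mode_config_initialized */
	bool uses_atomic;             /* result of drm_drv_uses_atomic_modeset() */
	bool ras_intr_triggered;      /* result of amdgpu_ras_intr_triggered() */
	enum mock_helper helper;      /* which helper ran, MOCK_NONE if none */
	bool suspended;               /* set once the suspend-style path has run */
};

/* Display-only helper (hypothetical name), same branching as the patch. */
static void mock_display_shutdown(struct mock_adev *adev)
{
	if (!adev->mode_config_initialized)
		return;

	if (!adev->uses_atomic)
		adev->helper = MOCK_FORCE_DISABLE_ALL;
	else
		adev->helper = MOCK_ATOMIC_SHUTDOWN;
}

/* Shutdown hook with the display call moved after the RAS check. */
static void mock_pci_shutdown(struct mock_adev *adev)
{
	if (adev->ras_intr_triggered)
		return;

	mock_display_shutdown(adev);

	/* Stands in for the suspend-based quiesce the driver already does. */
	adev->suspended = true;
}
```

With ras_intr_triggered set, mock_pci_shutdown() returns before touching
the display at all, which is the consequence of the suggested ordering.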