2019-01-08 00:59:18

by Lyude Paul

[permalink] [raw]
Subject: [PATCH 0/3] drm/amdgpu: Fix suspend/resume issues with MST

Fix the suspend/issues that Jerry Zuo found in amdgpu, and add some
compiler warnings for drivers ignoring the return code of
drm_dp_mst_topology_mgr_resume() to help ensure we don't need to fix
this again in the future for someone else's driver.

Cc: Jerry Zuo <[email protected]>

Lyude Paul (3):
drm/amdgpu: Don't ignore rc from drm_dp_mst_topology_mgr_resume()
drm/amdgpu: Don't fail resume process if resuming atomic state fails
drm/dp_mst: Add __must_check to drm_dp_mst_topology_mgr_resume()

.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 37 +++++++++++++------
drivers/gpu/drm/drm_dp_mst_topology.c | 3 +-
include/drm/drm_dp_mst_helper.h | 3 +-
3 files changed, 29 insertions(+), 14 deletions(-)

--
2.20.1



2019-01-08 00:59:18

by Lyude Paul

[permalink] [raw]
Subject: [PATCH 1/3] drm/amdgpu: Don't ignore rc from drm_dp_mst_topology_mgr_resume()

drm_dp_mst_topology_mgr_resume() returns whether or not it managed to
find the topology in question after a suspend resume cycle, and the
driver is supposed to check this value and disable MST accordingly if
it's gone-in addition to sending a hotplug in order to notify userspace
that something changed during suspend.

Currently, amdgpu just makes the mistake of ignoring the return code
from drm_dp_mst_topology_mgr_resume() which means that if a topology was
removed in suspend, amdgpu never notices and assumes it's still
connected which leads to all sorts of problems.

So, fix this by actually checking the rc from
drm_dp_mst_topology_mgr_resume(). Also, reformat the rest of the
function while we're at it to fix the over-indenting.

Signed-off-by: Lyude Paul <[email protected]>
Cc: Jerry Zuo <[email protected]>
Cc: <[email protected]> # v4.15+
---
.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 32 +++++++++++++------
1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 8a626d16e8e3..3f326a2c513b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -699,22 +699,36 @@ static void s3_handle_mst(struct drm_device *dev, bool suspend)
{
struct amdgpu_dm_connector *aconnector;
struct drm_connector *connector;
+ struct drm_dp_mst_topology_mgr *mgr;
+ int ret;
+ bool need_hotplug = false;

drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);

- list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
- aconnector = to_amdgpu_dm_connector(connector);
- if (aconnector->dc_link->type == dc_connection_mst_branch &&
- !aconnector->mst_port) {
+ list_for_each_entry(connector, &dev->mode_config.connector_list,
+ head) {
+ aconnector = to_amdgpu_dm_connector(connector);
+ if (aconnector->dc_link->type != dc_connection_mst_branch ||
+ aconnector->mst_port)
+ continue;
+
+ mgr = &aconnector->mst_mgr;

- if (suspend)
- drm_dp_mst_topology_mgr_suspend(&aconnector->mst_mgr);
- else
- drm_dp_mst_topology_mgr_resume(&aconnector->mst_mgr);
- }
+ if (suspend) {
+ drm_dp_mst_topology_mgr_suspend(mgr);
+ } else {
+ ret = drm_dp_mst_topology_mgr_resume(mgr);
+ if (ret < 0) {
+ drm_dp_mst_topology_mgr_set_mst(mgr, false);
+ need_hotplug = true;
+ }
+ }
}

drm_modeset_unlock(&dev->mode_config.connection_mutex);
+
+ if (need_hotplug)
+ drm_kms_helper_hotplug_event(dev);
}

/**
--
2.20.1


2019-01-08 00:59:45

by Lyude Paul

[permalink] [raw]
Subject: [PATCH 2/3] drm/amdgpu: Don't fail resume process if resuming atomic state fails

This is an ugly one unfortunately. Currently, all DRM drivers supporting
atomic modesetting will save the state that userspace had set before
suspending, then attempt to restore that state on resume. This probably
worked very well at one point, like many other things, until DP MST came
into the picture. While it's easy to restore state on normal display
connectors that were disconnected during suspend regardless of their
state post-resume, this can't really be done with MST because of the
fact that setting up a downstream sink requires performing sideband
transactions between the source and the MST hub, sending out the ACT
packets, etc.

Because of this, there isn't really a guarantee that we can restore the
atomic state we had before suspend once we've resumed. This sucks pretty
bad, but so far I haven't run into any compositors that this actually
causes serious issues with. Most compositors will notice the hotplug we
send afterwards, and then reprobe state.

Since nouveau and i915 also don't fail the suspend/resume process due to
failing to restore the atomic state, let's make amdgpu match this
behavior. Better to resume the GPU properly, then to stop the process
half way because of a potentially unavoidable atomic commit failure.

Eventually, we'll have a real fix for this problem on the DRM level. But
we've got some more important low-hanging fruit to deal with first.

Signed-off-by: Lyude Paul <[email protected]>
Cc: Jerry Zuo <[email protected]>
Cc: <[email protected]> # v4.15+
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 3f326a2c513b..a3e65e457348 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -912,7 +912,6 @@ static int dm_resume(void *handle)
struct drm_plane_state *new_plane_state;
struct dm_plane_state *dm_new_plane_state;
enum dc_connection_type new_connection_type = dc_connection_none;
- int ret;
int i;

/* power on hardware */
@@ -985,13 +984,13 @@ static int dm_resume(void *handle)
}
}

- ret = drm_atomic_helper_resume(ddev, dm->cached_state);
+ drm_atomic_helper_resume(ddev, dm->cached_state);

dm->cached_state = NULL;

amdgpu_dm_irq_resume_late(adev);

- return ret;
+ return 0;
}

/**
--
2.20.1


2019-01-08 01:01:05

by Lyude Paul

[permalink] [raw]
Subject: [PATCH 3/3] drm/dp_mst: Add __must_check to drm_dp_mst_topology_mgr_resume()

Since I've had to fix two cases of drivers not checking the return code
from this function, let's make the compiler complain so this doesn't
come up again in the future.

Signed-off-by: Lyude Paul <[email protected]>
Cc: Jerry Zuo <[email protected]>
Cc: Daniel Vetter <[email protected]>
---
drivers/gpu/drm/drm_dp_mst_topology.c | 3 ++-
include/drm/drm_dp_mst_helper.h | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index 2ab16c9e6243..13a7e497bfe2 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -2226,7 +2226,8 @@ EXPORT_SYMBOL(drm_dp_mst_topology_mgr_suspend);
* if the device fails this returns -1, and the driver should do
* a full MST reprobe, in case we were undocked.
*/
-int drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr)
+int __must_check
+drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr)
{
int ret = 0;

diff --git a/include/drm/drm_dp_mst_helper.h b/include/drm/drm_dp_mst_helper.h
index 371cc2816477..4355b55d0081 100644
--- a/include/drm/drm_dp_mst_helper.h
+++ b/include/drm/drm_dp_mst_helper.h
@@ -614,7 +614,8 @@ void drm_dp_mst_dump_topology(struct seq_file *m,
struct drm_dp_mst_topology_mgr *mgr);

void drm_dp_mst_topology_mgr_suspend(struct drm_dp_mst_topology_mgr *mgr);
-int drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr);
+int __must_check
+drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr);
struct drm_dp_mst_topology_state *drm_atomic_get_mst_topology_state(struct drm_atomic_state *state,
struct drm_dp_mst_topology_mgr *mgr);
int drm_dp_atomic_find_vcpi_slots(struct drm_atomic_state *state,
--
2.20.1


2019-01-08 08:33:53

by Daniel Vetter

[permalink] [raw]
Subject: Re: [PATCH 3/3] drm/dp_mst: Add __must_check to drm_dp_mst_topology_mgr_resume()

On Mon, Jan 07, 2019 at 07:56:47PM -0500, Lyude Paul wrote:
> Since I've had to fix two cases of drivers not checking the return code
> from this function, let's make the compiler complain so this doesn't
> come up again in the future.
>
> Signed-off-by: Lyude Paul <[email protected]>
> Cc: Jerry Zuo <[email protected]>
> Cc: Daniel Vetter <[email protected]>
> ---
> drivers/gpu/drm/drm_dp_mst_topology.c | 3 ++-
> include/drm/drm_dp_mst_helper.h | 3 ++-
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
> index 2ab16c9e6243..13a7e497bfe2 100644
> --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> @@ -2226,7 +2226,8 @@ EXPORT_SYMBOL(drm_dp_mst_topology_mgr_suspend);
> * if the device fails this returns -1, and the driver should do
> * a full MST reprobe, in case we were undocked.
> */
> -int drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr)
> +int __must_check

I think you need the must_check only in the header. With that fixed:

Reviewed-by: Daniel Vetter <[email protected]>

> +drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr)
> {
> int ret = 0;
>
> diff --git a/include/drm/drm_dp_mst_helper.h b/include/drm/drm_dp_mst_helper.h
> index 371cc2816477..4355b55d0081 100644
> --- a/include/drm/drm_dp_mst_helper.h
> +++ b/include/drm/drm_dp_mst_helper.h
> @@ -614,7 +614,8 @@ void drm_dp_mst_dump_topology(struct seq_file *m,
> struct drm_dp_mst_topology_mgr *mgr);
>
> void drm_dp_mst_topology_mgr_suspend(struct drm_dp_mst_topology_mgr *mgr);
> -int drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr);
> +int __must_check
> +drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr);
> struct drm_dp_mst_topology_state *drm_atomic_get_mst_topology_state(struct drm_atomic_state *state,
> struct drm_dp_mst_topology_mgr *mgr);
> int drm_dp_atomic_find_vcpi_slots(struct drm_atomic_state *state,
> --
> 2.20.1
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2019-01-08 17:35:42

by Harry Wentland

[permalink] [raw]
Subject: Re: [PATCH 0/3] drm/amdgpu: Fix suspend/resume issues with MST

On 2019-01-07 7:56 p.m., Lyude Paul wrote:
> Fix the suspend/issues that Jerry Zuo found in amdgpu, and add some
> compiler warnings for drivers ignoring the return code of
> drm_dp_mst_topology_mgr_resume() to help ensure we don't need to fix
> this again in the future for someone else's driver.
>
> Cc: Jerry Zuo <[email protected]>

With the small change Daniel mentioned this series is
Reviewed-by: Harry Wentland <[email protected]>

Harry

>
> Lyude Paul (3):
> drm/amdgpu: Don't ignore rc from drm_dp_mst_topology_mgr_resume()
> drm/amdgpu: Don't fail resume process if resuming atomic state fails
> drm/dp_mst: Add __must_check to drm_dp_mst_topology_mgr_resume()
>
> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 37 +++++++++++++------
> drivers/gpu/drm/drm_dp_mst_topology.c | 3 +-
> include/drm/drm_dp_mst_helper.h | 3 +-
> 3 files changed, 29 insertions(+), 14 deletions(-)
>