The issues caused by:
64e62bdf04ab ("drm/dp_mst: Remove VCPI while disabling topology mgr")
Prompted me to take a closer look at how we clear the payload state in
general when disabling the topology, and it turns out there's actually
two subtle issues here.
The first is that we're not grabbing &mgr.payload_lock when clearing the
payloads in drm_dp_mst_topology_mgr_set_mst(). Seeing as the canonical
lock order is &mgr.payload_lock -> &mgr.lock (because we always want
&mgr.lock to be the inner-most lock so topology validation always
works), this makes perfect sense. It also means that -technically- there
could be racing between someone calling
drm_dp_mst_topology_mgr_set_mst() to disable the topology, along with a
modeset occurring that's modifying the payload state at the same time.
The second is the more obvious issue that Wayne Lin discovered, that
we're not clearing proposed_payloads when disabling the topology.
I actually can't see any obvious places where the racing caused by the
first issue would break something, and it could be that some of our
higher-level locks already prevent this by happenstance, but better safe
then sorry. So, let's make it so that drm_dp_mst_topology_mgr_set_mst()
first grabs &mgr.payload_lock followed by &mgr.lock so that we never
race when modifying the payload state. Then, we also clear
proposed_payloads to fix the original issue of enabling a new topology
with a dirty payload state. This doesn't clear any of the drm_dp_vcpi
structures, but those are getting destroyed along with the ports anyway.
Changes since v1:
* Use sizeof(mgr->payloads[0])/sizeof(mgr->proposed_vcpis[0]) instead -
vsyrjala
Cc: Sean Paul <[email protected]>
Cc: Wayne Lin <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Signed-off-by: Lyude Paul <[email protected]>
---
drivers/gpu/drm/drm_dp_mst_topology.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index 3649e82b963d..23cf46bfef74 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -3501,6 +3501,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
int ret = 0;
struct drm_dp_mst_branch *mstb = NULL;
+ mutex_lock(&mgr->payload_lock);
mutex_lock(&mgr->lock);
if (mst_state == mgr->mst_state)
goto out_unlock;
@@ -3559,7 +3560,10 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
/* this can fail if the device is gone */
drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL, 0);
ret = 0;
- memset(mgr->payloads, 0, mgr->max_payloads * sizeof(struct drm_dp_payload));
+ memset(mgr->payloads, 0,
+ mgr->max_payloads * sizeof(mgr->payloads[0]));
+ memset(mgr->proposed_vcpis, 0,
+ mgr->max_payloads * sizeof(mgr->proposed_vcpis[0]));
mgr->payload_mask = 0;
set_bit(0, &mgr->payload_mask);
mgr->vcpi_mask = 0;
@@ -3568,6 +3572,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
out_unlock:
mutex_unlock(&mgr->lock);
+ mutex_unlock(&mgr->payload_lock);
if (mstb)
drm_dp_mst_topology_put_mstb(mstb);
return ret;
--
2.24.1
Mention that the size of these two structs is determined by
max_payloads. Suggested by Ville Syrjälä.
Signed-off-by: Lyude Paul <[email protected]>
Cc: Ville Syrjälä <[email protected]>
---
include/drm/drm_dp_mst_helper.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/include/drm/drm_dp_mst_helper.h b/include/drm/drm_dp_mst_helper.h
index bcb39da9adb4..5483f888712a 100644
--- a/include/drm/drm_dp_mst_helper.h
+++ b/include/drm/drm_dp_mst_helper.h
@@ -635,11 +635,13 @@ struct drm_dp_mst_topology_mgr {
struct mutex payload_lock;
/**
* @proposed_vcpis: Array of pointers for the new VCPI allocation. The
- * VCPI structure itself is &drm_dp_mst_port.vcpi.
+ * VCPI structure itself is &drm_dp_mst_port.vcpi, and the size of
+ * this array is determined by @max_payloads.
*/
struct drm_dp_vcpi **proposed_vcpis;
/**
- * @payloads: Array of payloads.
+ * @payloads: Array of payloads. The size of this array is determined
+ * by @max_payloads.
*/
struct drm_dp_payload *payloads;
/**
--
2.24.1
On Wed, Jan 22, 2020 at 02:43:21PM -0500, Lyude Paul wrote:
> Mention that the size of these two structs is determined by
> max_payloads. Suggested by Ville Syrj?l?.
>
> Signed-off-by: Lyude Paul <[email protected]>
> Cc: Ville Syrj?l? <[email protected]>
> ---
> include/drm/drm_dp_mst_helper.h | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/include/drm/drm_dp_mst_helper.h b/include/drm/drm_dp_mst_helper.h
> index bcb39da9adb4..5483f888712a 100644
> --- a/include/drm/drm_dp_mst_helper.h
> +++ b/include/drm/drm_dp_mst_helper.h
> @@ -635,11 +635,13 @@ struct drm_dp_mst_topology_mgr {
> struct mutex payload_lock;
> /**
> * @proposed_vcpis: Array of pointers for the new VCPI allocation. The
> - * VCPI structure itself is &drm_dp_mst_port.vcpi.
> + * VCPI structure itself is &drm_dp_mst_port.vcpi, and the size of
> + * this array is determined by @max_payloads.
> */
> struct drm_dp_vcpi **proposed_vcpis;
> /**
> - * @payloads: Array of payloads.
> + * @payloads: Array of payloads. The size of this array is determined
> + * by @max_payloads.
> */
Reviewed-by: Ville Syrj?l? <[email protected]>
> struct drm_dp_payload *payloads;
> /**
> --
> 2.24.1
--
Ville Syrj?l?
Intel
On Wed, Jan 22, 2020 at 02:43:20PM -0500, Lyude Paul wrote:
> The issues caused by:
>
> 64e62bdf04ab ("drm/dp_mst: Remove VCPI while disabling topology mgr")
>
> Prompted me to take a closer look at how we clear the payload state in
> general when disabling the topology, and it turns out there's actually
> two subtle issues here.
>
> The first is that we're not grabbing &mgr.payload_lock when clearing the
> payloads in drm_dp_mst_topology_mgr_set_mst(). Seeing as the canonical
> lock order is &mgr.payload_lock -> &mgr.lock (because we always want
> &mgr.lock to be the inner-most lock so topology validation always
> works), this makes perfect sense. It also means that -technically- there
> could be racing between someone calling
> drm_dp_mst_topology_mgr_set_mst() to disable the topology, along with a
> modeset occurring that's modifying the payload state at the same time.
>
> The second is the more obvious issue that Wayne Lin discovered, that
> we're not clearing proposed_payloads when disabling the topology.
>
> I actually can't see any obvious places where the racing caused by the
> first issue would break something, and it could be that some of our
> higher-level locks already prevent this by happenstance, but better safe
> then sorry. So, let's make it so that drm_dp_mst_topology_mgr_set_mst()
> first grabs &mgr.payload_lock followed by &mgr.lock so that we never
> race when modifying the payload state. Then, we also clear
> proposed_payloads to fix the original issue of enabling a new topology
> with a dirty payload state. This doesn't clear any of the drm_dp_vcpi
> structures, but those are getting destroyed along with the ports anyway.
>
> Changes since v1:
> * Use sizeof(mgr->payloads[0])/sizeof(mgr->proposed_vcpis[0]) instead -
> vsyrjala
>
> Cc: Sean Paul <[email protected]>
> Cc: Wayne Lin <[email protected]>
> Cc: Ville Syrj?l? <[email protected]>
> Signed-off-by: Lyude Paul <[email protected]>
> ---
> drivers/gpu/drm/drm_dp_mst_topology.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
> index 3649e82b963d..23cf46bfef74 100644
> --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> @@ -3501,6 +3501,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
> int ret = 0;
> struct drm_dp_mst_branch *mstb = NULL;
>
> + mutex_lock(&mgr->payload_lock);
> mutex_lock(&mgr->lock);
> if (mst_state == mgr->mst_state)
> goto out_unlock;
> @@ -3559,7 +3560,10 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
> /* this can fail if the device is gone */
> drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL, 0);
> ret = 0;
> - memset(mgr->payloads, 0, mgr->max_payloads * sizeof(struct drm_dp_payload));
> + memset(mgr->payloads, 0,
> + mgr->max_payloads * sizeof(mgr->payloads[0]));
> + memset(mgr->proposed_vcpis, 0,
> + mgr->max_payloads * sizeof(mgr->proposed_vcpis[0]));
> mgr->payload_mask = 0;
> set_bit(0, &mgr->payload_mask);
> mgr->vcpi_mask = 0;
> @@ -3568,6 +3572,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
>
> out_unlock:
> mutex_unlock(&mgr->lock);
> + mutex_unlock(&mgr->payload_lock);
Locking order looks sane. Not entirely sure what the implications of
clearing all that stuff outside of a proper modeset is, but at least
it matches what we already do. So
Reviewed-by: Ville Syrj?l? <[email protected]>
> if (mstb)
> drm_dp_mst_topology_put_mstb(mstb);
> return ret;
> --
> 2.24.1
--
Ville Syrj?l?
Intel
On Wed, 2020-01-22 at 22:51 +0200, Ville Syrjälä wrote:
> On Wed, Jan 22, 2020 at 02:43:20PM -0500, Lyude Paul wrote:
> > The issues caused by:
> >
> > 64e62bdf04ab ("drm/dp_mst: Remove VCPI while disabling topology mgr")
> >
> > Prompted me to take a closer look at how we clear the payload state in
> > general when disabling the topology, and it turns out there's actually
> > two subtle issues here.
> >
> > The first is that we're not grabbing &mgr.payload_lock when clearing the
> > payloads in drm_dp_mst_topology_mgr_set_mst(). Seeing as the canonical
> > lock order is &mgr.payload_lock -> &mgr.lock (because we always want
> > &mgr.lock to be the inner-most lock so topology validation always
> > works), this makes perfect sense. It also means that -technically- there
> > could be racing between someone calling
> > drm_dp_mst_topology_mgr_set_mst() to disable the topology, along with a
> > modeset occurring that's modifying the payload state at the same time.
> >
> > The second is the more obvious issue that Wayne Lin discovered, that
> > we're not clearing proposed_payloads when disabling the topology.
> >
> > I actually can't see any obvious places where the racing caused by the
> > first issue would break something, and it could be that some of our
> > higher-level locks already prevent this by happenstance, but better safe
> > then sorry. So, let's make it so that drm_dp_mst_topology_mgr_set_mst()
> > first grabs &mgr.payload_lock followed by &mgr.lock so that we never
> > race when modifying the payload state. Then, we also clear
> > proposed_payloads to fix the original issue of enabling a new topology
> > with a dirty payload state. This doesn't clear any of the drm_dp_vcpi
> > structures, but those are getting destroyed along with the ports anyway.
> >
> > Changes since v1:
> > * Use sizeof(mgr->payloads[0])/sizeof(mgr->proposed_vcpis[0]) instead -
> > vsyrjala
> >
> > Cc: Sean Paul <[email protected]>
> > Cc: Wayne Lin <[email protected]>
> > Cc: Ville Syrjälä <[email protected]>
> > Signed-off-by: Lyude Paul <[email protected]>
> > ---
> > drivers/gpu/drm/drm_dp_mst_topology.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c
> > b/drivers/gpu/drm/drm_dp_mst_topology.c
> > index 3649e82b963d..23cf46bfef74 100644
> > --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> > @@ -3501,6 +3501,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct
> > drm_dp_mst_topology_mgr *mgr, bool ms
> > int ret = 0;
> > struct drm_dp_mst_branch *mstb = NULL;
> >
> > + mutex_lock(&mgr->payload_lock);
> > mutex_lock(&mgr->lock);
> > if (mst_state == mgr->mst_state)
> > goto out_unlock;
> > @@ -3559,7 +3560,10 @@ int drm_dp_mst_topology_mgr_set_mst(struct
> > drm_dp_mst_topology_mgr *mgr, bool ms
> > /* this can fail if the device is gone */
> > drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL, 0);
> > ret = 0;
> > - memset(mgr->payloads, 0, mgr->max_payloads * sizeof(struct
> > drm_dp_payload));
> > + memset(mgr->payloads, 0,
> > + mgr->max_payloads * sizeof(mgr->payloads[0]));
> > + memset(mgr->proposed_vcpis, 0,
> > + mgr->max_payloads * sizeof(mgr->proposed_vcpis[0]));
> > mgr->payload_mask = 0;
> > set_bit(0, &mgr->payload_mask);
> > mgr->vcpi_mask = 0;
> > @@ -3568,6 +3572,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct
> > drm_dp_mst_topology_mgr *mgr, bool ms
> >
> > out_unlock:
> > mutex_unlock(&mgr->lock);
> > + mutex_unlock(&mgr->payload_lock);
>
> Locking order looks sane. Not entirely sure what the implications of
> clearing all that stuff outside of a proper modeset is, but at least
> it matches what we already do. So
fwiw - it's not clearing anything currently programmed in hw at that point,
it's just clearing the arrays we use for keeping track of pending/current
payload allocations so there's nothing in them the next time we enable MST on
a display. This is basically what we want anyway - once we disable MST through
the DPCD, we won't end up sending any payload updates anyway so these don't
get used in the modesets that follow.
tbh, part of me wonders if we could handle more of this in atomic state in the
future, but I haven't seen a good enough reason to try that yet.
>
> Reviewed-by: Ville Syrjälä <[email protected]>
>
> > if (mstb)
> > drm_dp_mst_topology_put_mstb(mstb);
> > return ret;
> > --
> > 2.24.1
--
Cheers,
Lyude Paul
[AMD Official Use Only - Internal Distribution Only]
Hi Lyude,
I also notice one thing that I was not aware of before. Since we're clearing payloads here, maybe we should also call drm_dp_mst_put_port_malloc() to put malloc refcounts here?
Appreciate for all your help! :)
________________________________________
From: Lyude Paul <[email protected]>
Sent: Thursday, January 23, 2020 07:43
To: Ville Syrj?l?
Cc: [email protected]; Sean Paul; Lin, Wayne; Maarten Lankhorst; Maxime Ripard; David Airlie; Daniel Vetter; [email protected]
Subject: Re: [PATCH v2 1/2] drm/dp_mst: Fix clearing payload state on topology disable
On Wed, 2020-01-22 at 22:51 +0200, Ville Syrj?l? wrote:
> On Wed, Jan 22, 2020 at 02:43:20PM -0500, Lyude Paul wrote:
> > The issues caused by:
> >
> > 64e62bdf04ab ("drm/dp_mst: Remove VCPI while disabling topology mgr")
> >
> > Prompted me to take a closer look at how we clear the payload state in
> > general when disabling the topology, and it turns out there's actually
> > two subtle issues here.
> >
> > The first is that we're not grabbing &mgr.payload_lock when clearing the
> > payloads in drm_dp_mst_topology_mgr_set_mst(). Seeing as the canonical
> > lock order is &mgr.payload_lock -> &mgr.lock (because we always want
> > &mgr.lock to be the inner-most lock so topology validation always
> > works), this makes perfect sense. It also means that -technically- there
> > could be racing between someone calling
> > drm_dp_mst_topology_mgr_set_mst() to disable the topology, along with a
> > modeset occurring that's modifying the payload state at the same time.
> >
> > The second is the more obvious issue that Wayne Lin discovered, that
> > we're not clearing proposed_payloads when disabling the topology.
> >
> > I actually can't see any obvious places where the racing caused by the
> > first issue would break something, and it could be that some of our
> > higher-level locks already prevent this by happenstance, but better safe
> > then sorry. So, let's make it so that drm_dp_mst_topology_mgr_set_mst()
> > first grabs &mgr.payload_lock followed by &mgr.lock so that we never
> > race when modifying the payload state. Then, we also clear
> > proposed_payloads to fix the original issue of enabling a new topology
> > with a dirty payload state. This doesn't clear any of the drm_dp_vcpi
> > structures, but those are getting destroyed along with the ports anyway.
> >
> > Changes since v1:
> > * Use sizeof(mgr->payloads[0])/sizeof(mgr->proposed_vcpis[0]) instead -
> > vsyrjala
> >
> > Cc: Sean Paul <[email protected]>
> > Cc: Wayne Lin <[email protected]>
> > Cc: Ville Syrj?l? <[email protected]>
> > Signed-off-by: Lyude Paul <[email protected]>
> > ---
> > drivers/gpu/drm/drm_dp_mst_topology.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c
> > b/drivers/gpu/drm/drm_dp_mst_topology.c
> > index 3649e82b963d..23cf46bfef74 100644
> > --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> > @@ -3501,6 +3501,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct
> > drm_dp_mst_topology_mgr *mgr, bool ms
> > int ret = 0;
> > struct drm_dp_mst_branch *mstb = NULL;
> >
> > + mutex_lock(&mgr->payload_lock);
> > mutex_lock(&mgr->lock);
> > if (mst_state == mgr->mst_state)
> > goto out_unlock;
> > @@ -3559,7 +3560,10 @@ int drm_dp_mst_topology_mgr_set_mst(struct
> > drm_dp_mst_topology_mgr *mgr, bool ms
> > /* this can fail if the device is gone */
> > drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL, 0);
> > ret = 0;
> > - memset(mgr->payloads, 0, mgr->max_payloads * sizeof(struct
> > drm_dp_payload));
> > + memset(mgr->payloads, 0,
> > + mgr->max_payloads * sizeof(mgr->payloads[0]));
> > + memset(mgr->proposed_vcpis, 0,
> > + mgr->max_payloads * sizeof(mgr->proposed_vcpis[0]));
> > mgr->payload_mask = 0;
> > set_bit(0, &mgr->payload_mask);
> > mgr->vcpi_mask = 0;
> > @@ -3568,6 +3572,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct
> > drm_dp_mst_topology_mgr *mgr, bool ms
> >
> > out_unlock:
> > mutex_unlock(&mgr->lock);
> > + mutex_unlock(&mgr->payload_lock);
>
> Locking order looks sane. Not entirely sure what the implications of
> clearing all that stuff outside of a proper modeset is, but at least
> it matches what we already do. So
fwiw - it's not clearing anything currently programmed in hw at that point,
it's just clearing the arrays we use for keeping track of pending/current
payload allocations so there's nothing in them the next time we enable MST on
a display. This is basically what we want anyway - once we disable MST through
the DPCD, we won't end up sending any payload updates anyway so these don't
get used in the modesets that follow.
tbh, part of me wonders if we could handle more of this in atomic state in the
future, but I haven't seen a good enough reason to try that yet.
>
> Reviewed-by: Ville Syrj?l? <[email protected]>
>
> > if (mstb)
> > drm_dp_mst_topology_put_mstb(mstb);
> > return ret;
> > --
> > 2.24.1
--
Cheers,
Lyude Paul