2022-03-11 20:23:19

by Ricardo Cañuelo

[permalink] [raw]
Subject: [PATCH 0/2 RESEND] Mitigate race condition problems when unbinding DRM driver

Hi all,

I'm sending these patches to try to improve the current situation for a
particular corner case (DRM driver unbinding).

I could reproduce a specific race condition during the unbinding of the
mediatek-drm driver that caused an invalid memory address. The race
condition is triggered by a userspace event (gnome-shell requesting a
DRM GET_CONNECTOR ioctl) while the encoders and drivers are in the
process of being disabled.

While I tried to mitigate this by making a small change in the
parade-ps8640 driver (for the bridge I'm testing on) and by making a
couple of functions in drm_bridge.c more robust, this is only a symptom
of a larger problem that might not be getting enough attention,
understandably, because this is an unusual corner case.

The scenario looks like this:

<userspace>: unbind mediatek-drm --------------------+
| |
<kernel> |
| |
... |
| ...
mtk_dsi_unbind |
| |
`- drm_encoder_cleanup v
| | gnome-shell
... `- drm_bridge_detach *<------ ioctl (GET_CONNECTOR)
|
<kernel>
|
...
|
|
ps8640_bridge_get_edid
|
`drm_bridge_chain_post_disable

which causes drm_bridge_chain_post_disable() to walk the bridge chain
after the bridge has already been detached and removed from the list. I
guess a more radical and subsystem-wide solution would be to not allow
or to block certain ioctl calls once the driver has started to unbind,
but I'd like to hear your opinion on this.

This was tested on an Acer Chromebook R13 (Elm, MT8173) running Debian
Sid, the command that triggers the race condition is

echo mediatek-drm.12.auto > /sys/bus/platform/drivers/mediatek-drm/unbind

Cheers,
Ricardo

Ricardo Cañuelo (2):
drm/bridge: parade-ps8640: avoid race condition on driver unbinding
drm/bridge: Add extra checks in pre_enable and post_enable

drivers/gpu/drm/bridge/parade-ps8640.c | 6 +++---
drivers/gpu/drm/drm_bridge.c | 4 ++--
2 files changed, 5 insertions(+), 5 deletions(-)

--
2.25.1


2022-03-11 22:01:26

by Ricardo Cañuelo

[permalink] [raw]
Subject: [PATCH 1/2 RESEND] drm/bridge: parade-ps8640: avoid race condition on driver unbinding

When unbinding a DRM master driver there's a race condition that
sometimes results in an invalid vm access when userspace (gnome-shell)
issues a DRM_IOCTL_MODE_GETCONNECTOR ioctl right after a bridge has been
removed from an encoder's bridge chain.

This means that once a bridge has been disabled and gone through
ps8640_post_disable(), if ps8640_bridge_get_edid() runs afterwards as a
result of that ioctl call it will call drm_bridge_chain_pre_enable()
and drm_bridge_chain_post_disable() again in a bridge that has been
already detached.

Setting `ps_bridge->pre_enabled = false` at a later stage of the
bringdown path and calling drm_bridge_chain_pre_enable() inside
ps8640_bridge_get_edid() only if needed avoid this, although a proper
subsystem-wide fix would be the proper solution, since the same type of
race conditions might happen somewhere else.

Tested on an Acer Chromebook R13 (Elm, MT8173) with Debian Sid.

Signed-off-by: Ricardo Cañuelo <[email protected]>
---
drivers/gpu/drm/bridge/parade-ps8640.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/bridge/parade-ps8640.c b/drivers/gpu/drm/bridge/parade-ps8640.c
index 3f17337ee389..a927787a89bf 100644
--- a/drivers/gpu/drm/bridge/parade-ps8640.c
+++ b/drivers/gpu/drm/bridge/parade-ps8640.c
@@ -434,8 +434,6 @@ static void ps8640_post_disable(struct drm_bridge *bridge)
{
struct ps8640 *ps_bridge = bridge_to_ps8640(bridge);

- ps_bridge->pre_enabled = false;
-
ps8640_bridge_vdo_control(ps_bridge, DISABLE);
pm_runtime_put_sync_suspend(&ps_bridge->page[PAGE0_DP_CNTL]->dev);
}
@@ -487,6 +485,7 @@ static void ps8640_bridge_detach(struct drm_bridge *bridge)
drm_dp_aux_unregister(&ps_bridge->aux);
if (ps_bridge->link)
device_link_del(ps_bridge->link);
+ ps_bridge->pre_enabled = false;
}

static struct edid *ps8640_bridge_get_edid(struct drm_bridge *bridge,
@@ -508,7 +507,8 @@ static struct edid *ps8640_bridge_get_edid(struct drm_bridge *bridge,
* EDID, for this chip, we need to do a full poweron, otherwise it will
* fail.
*/
- drm_bridge_chain_pre_enable(bridge);
+ if (poweroff)
+ drm_bridge_chain_pre_enable(bridge);

edid = drm_get_edid(connector,
ps_bridge->page[PAGE0_DP_CNTL]->adapter);
--
2.25.1

2022-03-11 22:04:07

by Ricardo Cañuelo

[permalink] [raw]
Subject: [PATCH 2/2 RESEND] drm/bridge: Add extra checks in pre_enable and post_enable

Depending on the bridge code, certain userspace events during a driver
teardown (such as a DRM ioctl call) might cause a race condition where
the drm_bridge_chain_pre_enable() and drm_bridge_chain_post_enable()
functions could be called for a bridge that has just been detached and
removed from the bridge chain of an encoder.

This change makes these functions a bit more robust by bailing out if
the bridge has already been detached.

Tested on an Acer Chromebook R13 (Elm, MT8173) with Debian Sid.

Signed-off-by: Ricardo Cañuelo <[email protected]>
---
drivers/gpu/drm/drm_bridge.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c
index c96847fc0ebc..e074aa456dd1 100644
--- a/drivers/gpu/drm/drm_bridge.c
+++ b/drivers/gpu/drm/drm_bridge.c
@@ -529,7 +529,7 @@ void drm_bridge_chain_post_disable(struct drm_bridge *bridge)
{
struct drm_encoder *encoder;

- if (!bridge)
+ if (!bridge || !bridge->dev)
return;

encoder = bridge->encoder;
@@ -585,7 +585,7 @@ void drm_bridge_chain_pre_enable(struct drm_bridge *bridge)
struct drm_encoder *encoder;
struct drm_bridge *iter;

- if (!bridge)
+ if (!bridge || !bridge->dev)
return;

encoder = bridge->encoder;
--
2.25.1

Subject: Re: [PATCH 1/2 RESEND] drm/bridge: parade-ps8640: avoid race condition on driver unbinding

Il 11/03/22 10:34, Ricardo Cañuelo ha scritto:
> When unbinding a DRM master driver there's a race condition that
> sometimes results in an invalid vm access when userspace (gnome-shell)
> issues a DRM_IOCTL_MODE_GETCONNECTOR ioctl right after a bridge has been
> removed from an encoder's bridge chain.
>
> This means that once a bridge has been disabled and gone through
> ps8640_post_disable(), if ps8640_bridge_get_edid() runs afterwards as a
> result of that ioctl call it will call drm_bridge_chain_pre_enable()
> and drm_bridge_chain_post_disable() again in a bridge that has been
> already detached.
>
> Setting `ps_bridge->pre_enabled = false` at a later stage of the
> bringdown path and calling drm_bridge_chain_pre_enable() inside
> ps8640_bridge_get_edid() only if needed avoid this, although a proper
> subsystem-wide fix would be the proper solution, since the same type of
> race conditions might happen somewhere else.
>
> Tested on an Acer Chromebook R13 (Elm, MT8173) with Debian Sid.
>
> Signed-off-by: Ricardo Cañuelo <[email protected]>

Reviewed-by: AngeloGioacchino Del Regno <[email protected]>

> ---
> drivers/gpu/drm/bridge/parade-ps8640.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>

Subject: Re: [PATCH 2/2 RESEND] drm/bridge: Add extra checks in pre_enable and post_enable

Il 11/03/22 10:34, Ricardo Cañuelo ha scritto:
> Depending on the bridge code, certain userspace events during a driver
> teardown (such as a DRM ioctl call) might cause a race condition where
> the drm_bridge_chain_pre_enable() and drm_bridge_chain_post_enable()
> functions could be called for a bridge that has just been detached and
> removed from the bridge chain of an encoder.
>
> This change makes these functions a bit more robust by bailing out if
> the bridge has already been detached.
>
> Tested on an Acer Chromebook R13 (Elm, MT8173) with Debian Sid.
>
> Signed-off-by: Ricardo Cañuelo <[email protected]>

On various chromebooks with different MediaTek SoC models, and on
Qualcomm SC7180 Trogdor,

Tested-by: AngeloGioacchino Del Regno <[email protected]>