2023-06-12 22:32:41

by Bjorn Andersson

Subject: [PATCH] drm/msm/dp: Drop aux devices together with DP controller

Using devres to depopulate the aux bus made sure that upon a probe
deferral the EDP panel device would be destroyed and recreated upon next
attempt.

But the struct device which the devres is tied to is the DPUs
(drm_dev->dev), which may be happen after the DP controller is torn
down.

Indications of this can be seen in the commonly seen EDID-hexdump full
of zeros in the log, or the occasional/rare KASAN fault where the
panel's attempt to read the EDID information causes a use after free on
DP resources.

It's tempting to move the devres to the DP controller's struct device,
but the resources used by the device(s) on the aux bus are explicitly
torn down in the error path. The KASAN-reported use-after-free also
remains, as the DP aux "module" explicitly frees its devres-allocated
memory in this code path.

As such, explicitly depopulate the aux bus in the error path, and in the
component unbind path, to avoid these issues.

Fixes: 2b57f726611e ("drm/msm/dp: fix aux-bus EP lifetime")
Signed-off-by: Bjorn Andersson <[email protected]>
---
drivers/gpu/drm/msm/dp/dp_display.c | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index 3d8fa2e73583..bbb0550a022b 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -322,6 +322,8 @@ static void dp_display_unbind(struct device *dev, struct device *master,

 	kthread_stop(dp->ev_tsk);

+	of_dp_aux_depopulate_bus(dp->aux);
+
 	dp_power_client_deinit(dp->power);
 	dp_unregister_audio_driver(dev, dp->audio);
 	dp_aux_unregister(dp->aux);
@@ -1521,11 +1523,6 @@ void msm_dp_debugfs_init(struct msm_dp *dp_display, struct drm_minor *minor)
 	}
 }

-static void of_dp_aux_depopulate_bus_void(void *data)
-{
-	of_dp_aux_depopulate_bus(data);
-}
-
 static int dp_display_get_next_bridge(struct msm_dp *dp)
 {
 	int rc;
@@ -1554,12 +1551,6 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)
 		of_node_put(aux_bus);
 		if (rc)
 			goto error;
-
-		rc = devm_add_action_or_reset(dp->drm_dev->dev,
-						of_dp_aux_depopulate_bus_void,
-						dp_priv->aux);
-		if (rc)
-			goto error;
 	} else if (dp->is_edp) {
 		DRM_ERROR("eDP aux_bus not found\n");
 		return -ENODEV;
@@ -1583,6 +1574,7 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)

 error:
 	if (dp->is_edp) {
+		of_dp_aux_depopulate_bus(dp_priv->aux);
 		disable_irq(dp_priv->irq);
 		dp_display_host_phy_exit(dp_priv);
 		dp_display_host_deinit(dp_priv);
--
2.25.1
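
[Editorial illustration, not part of the patch.] The ordering problem
described in the commit message can be sketched with a small userspace
model. Everything below is hypothetical (the model_* names, the
stale_reads counter, the two run_* scenarios) and none of it is kernel
API; it only assumes that release actions registered on a device run when
that device is released, which in the old scheme could be after the DP
controller had already torn down its aux state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */

#define MAX_ACTIONS 4

struct model_aux {
	bool alive;       /* false once the controller freed its aux state */
	bool populated;   /* true while a panel device sits on the aux bus */
	int  stale_reads; /* panel reads made after the aux state was freed */
};

typedef void (*release_fn)(struct model_aux *aux);

struct model_device {
	release_fn actions[MAX_ACTIONS];
	struct model_aux *data[MAX_ACTIONS];
	size_t count;
};

/* Register a release action on a device, loosely like a devres action. */
static void model_add_action(struct model_device *dev, release_fn fn,
			     struct model_aux *aux)
{
	assert(dev->count < MAX_ACTIONS);
	dev->actions[dev->count] = fn;
	dev->data[dev->count] = aux;
	dev->count++;
}

/* Run the registered actions in reverse order of registration. */
static void model_release(struct model_device *dev)
{
	while (dev->count > 0) {
		dev->count--;
		dev->actions[dev->count](dev->data[dev->count]);
	}
}

/* Remove the panel from the bus; harmless if the bus is already empty. */
static void model_depopulate(struct model_aux *aux)
{
	aux->populated = false;
}

/* The controller frees the state that the aux channel relies on. */
static void model_controller_teardown(struct model_aux *aux)
{
	aux->alive = false;
}

/* A panel that still exists reads EDID; stale if the aux state is gone. */
static void model_panel_read_edid(struct model_aux *aux)
{
	if (aux->populated && !aux->alive)
		aux->stale_reads++; /* stands in for the use-after-free */
}

/* Old scheme: depopulate is a release action on the parent (DPU) device. */
int run_devres_on_parent(void)
{
	struct model_device dpu = { 0 };
	struct model_aux aux = { .alive = true, .populated = true };

	model_add_action(&dpu, model_depopulate, &aux);
	model_controller_teardown(&aux); /* controller goes first */
	model_panel_read_edid(&aux);     /* panel is still on the bus */
	model_release(&dpu);             /* panel removed too late */
	return aux.stale_reads;
}

/* Patched scheme: depopulate explicitly, then tear down the controller. */
int run_explicit_depopulate(void)
{
	struct model_aux aux = { .alive = true, .populated = true };

	model_depopulate(&aux);
	model_controller_teardown(&aux);
	model_panel_read_edid(&aux);     /* no panel left to do the read */
	return aux.stale_reads;
}
```

In this model the first scenario records one stale read and the second
records none, which is the window the patch closes by calling
of_dp_aux_depopulate_bus() before the rest of the teardown.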



2023-06-12 23:22:14

by Dmitry Baryshkov

Subject: Re: [PATCH] drm/msm/dp: Drop aux devices together with DP controller

On 13/06/2023 01:01, Bjorn Andersson wrote:
> Using devres to depopulate the aux bus made sure that upon a probe
> deferral the EDP panel device would be destroyed and recreated upon next
> attempt.
>
> But the struct device which the devres is tied to is the DPUs
> (drm_dev->dev), which may be happen after the DP controller is torn
> down.
>
> Indications of this can be seen in the commonly seen EDID-hexdump full
> of zeros in the log, or the occasional/rare KASAN fault where the
> panel's attempt to read the EDID information causes a use after free on
> DP resources.
>
> It's tempting to move the devres to the DP controller's struct device,
> but the resources used by the device(s) on the aux bus are explicitly
> torn down in the error path.

I hoped that proper usage of of_dp_aux_populate_bus(), with the callback
function being non-NULL would have solved at least this part. But it
seems I'll never see this patch.

> The KASAN-reported use-after-free also
> remains, as the DP aux "module" explicitly frees its devres-allocated
> memory in this code path.
>
> As such, explicitly depopulate the aux bus in the error path, and in the
> component unbind path, to avoid these issues.
>
> Fixes: 2b57f726611e ("drm/msm/dp: fix aux-bus EP lifetime")
> Signed-off-by: Bjorn Andersson <[email protected]>

Reviewed-by: Dmitry Baryshkov <[email protected]>

> ---
> drivers/gpu/drm/msm/dp/dp_display.c | 14 +++-----------
> 1 file changed, 3 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
> index 3d8fa2e73583..bbb0550a022b 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -322,6 +322,8 @@ static void dp_display_unbind(struct device *dev, struct device *master,
>
> kthread_stop(dp->ev_tsk);
>
> + of_dp_aux_depopulate_bus(dp->aux);
> +
> dp_power_client_deinit(dp->power);
> dp_unregister_audio_driver(dev, dp->audio);
> dp_aux_unregister(dp->aux);
> @@ -1521,11 +1523,6 @@ void msm_dp_debugfs_init(struct msm_dp *dp_display, struct drm_minor *minor)
> }
> }
>
> -static void of_dp_aux_depopulate_bus_void(void *data)
> -{
> - of_dp_aux_depopulate_bus(data);
> -}
> -
> static int dp_display_get_next_bridge(struct msm_dp *dp)
> {
> int rc;
> @@ -1554,12 +1551,6 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)
> of_node_put(aux_bus);
> if (rc)
> goto error;
> -
> - rc = devm_add_action_or_reset(dp->drm_dev->dev,
> - of_dp_aux_depopulate_bus_void,
> - dp_priv->aux);
> - if (rc)
> - goto error;
> } else if (dp->is_edp) {
> DRM_ERROR("eDP aux_bus not found\n");
> return -ENODEV;
> @@ -1583,6 +1574,7 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)
>
> error:
> if (dp->is_edp) {
> + of_dp_aux_depopulate_bus(dp_priv->aux);
> disable_irq(dp_priv->irq);
> dp_display_host_phy_exit(dp_priv);
> dp_display_host_deinit(dp_priv);

--
With best wishes
Dmitry


2023-06-13 20:10:47

by Doug Anderson

Subject: Re: [PATCH] drm/msm/dp: Drop aux devices together with DP controller

Hi,

On Mon, Jun 12, 2023 at 3:40 PM Dmitry Baryshkov
<[email protected]> wrote:
>
> On 13/06/2023 01:01, Bjorn Andersson wrote:
> > Using devres to depopulate the aux bus made sure that upon a probe
> > deferral the EDP panel device would be destroyed and recreated upon next
> > attempt.
> >
> > But the struct device which the devres is tied to is the DPUs
> > (drm_dev->dev), which may be happen after the DP controller is torn
> > down.
> >
> > Indications of this can be seen in the commonly seen EDID-hexdump full
> > of zeros in the log, or the occasional/rare KASAN fault where the
> > panel's attempt to read the EDID information causes a use after free on
> > DP resources.
> >
> > It's tempting to move the devres to the DP controller's struct device,
> > but the resources used by the device(s) on the aux bus are explicitly
> > torn down in the error path.
>
> I hoped that proper usage of of_dp_aux_populate_bus(), with the callback
> function being non-NULL would have solved at least this part. But it
> seems I'll never see this patch.

Agreed. This has been pending for > 1 year now with no significant
progress. Abhinav: Is there anything that can be done about this? Not
following up on agreed-to cleanups in a timely manner doesn't set a
good precedent. Next time the Qualcomm display wants to land something
and promises to land a followup people will be less likely to believe
them...


> > The KASAN-reported use-after-free also
> > remains, as the DP aux "module" explicitly frees its devres-allocated
> > memory in this code path.
> >
> > As such, explicitly depopulate the aux bus in the error path, and in the
> > component unbind path, to avoid these issues.
> >
> > Fixes: 2b57f726611e ("drm/msm/dp: fix aux-bus EP lifetime")
> > Signed-off-by: Bjorn Andersson <[email protected]>
>
> Reviewed-by: Dmitry Baryshkov <[email protected]>

Reviewed-by: Douglas Anderson <[email protected]>

2023-06-13 21:37:19

by Abhinav Kumar

Subject: Re: [PATCH] drm/msm/dp: Drop aux devices together with DP controller

Hi Doug

On 6/13/2023 12:33 PM, Doug Anderson wrote:
> Hi,
>
> On Mon, Jun 12, 2023 at 3:40 PM Dmitry Baryshkov
> <[email protected]> wrote:
>>
>> On 13/06/2023 01:01, Bjorn Andersson wrote:
>>> Using devres to depopulate the aux bus made sure that upon a probe
>>> deferral the EDP panel device would be destroyed and recreated upon next
>>> attempt.
>>>
>>> But the struct device which the devres is tied to is the DPUs
>>> (drm_dev->dev), which may be happen after the DP controller is torn
>>> down.
>>>
>>> Indications of this can be seen in the commonly seen EDID-hexdump full
>>> of zeros in the log, or the occasional/rare KASAN fault where the
>>> panel's attempt to read the EDID information causes a use after free on
>>> DP resources.
>>>
>>> It's tempting to move the devres to the DP controller's struct device,
>>> but the resources used by the device(s) on the aux bus are explicitly
>>> torn down in the error path.
>>
>> I hoped that proper usage of of_dp_aux_populate_bus(), with the callback
>> function being non-NULL would have solved at least this part. But it
>> seems I'll never see this patch.
>
> Agreed. This has been pending for > 1 year now with no significant
> progress. Abhinav: Is there anything that can be done about this? Not
> following up on agreed-to cleanups in a timely manner doesn't set a
> good precedent. Next time the Qualcomm display wants to land something
> and promises to land a followup people will be less likely to believe
> them...
>

Both QC and Google know there were other factors which delayed this over
the last 3-4 months.

But, I do not have any concrete justification to give you for the delays
before that apart from perhaps other higher priority chrome and upstream
bugs which kept cropping up.

Hence, all I can offer is my apologies for the delay.

After seeing this patch on the list, we have revived this effort and
reassigned it within our team to pick up from where it was left off. The
transition will need some time, but this will see the end of the tunnel
soon.

Thanks

Abhinav

2023-06-15 12:15:45

by Dmitry Baryshkov

Subject: Re: [PATCH] drm/msm/dp: Drop aux devices together with DP controller


On Mon, 12 Jun 2023 15:01:06 -0700, Bjorn Andersson wrote:
> Using devres to depopulate the aux bus made sure that upon a probe
> deferral the EDP panel device would be destroyed and recreated upon next
> attempt.
>
> But the struct device which the devres is tied to is the DPUs
> (drm_dev->dev), which may be happen after the DP controller is torn
> down.
>
> [...]

Applied, thanks!

[1/1] drm/msm/dp: Drop aux devices together with DP controller
https://gitlab.freedesktop.org/lumag/msm/-/commit/a7bfb2ad2184

Best regards,
--
Dmitry Baryshkov <[email protected]>

2023-06-19 12:50:30

by Johan Hovold

Subject: Re: [PATCH] drm/msm/dp: Drop aux devices together with DP controller

On Mon, Jun 12, 2023 at 03:01:06PM -0700, Bjorn Andersson wrote:
> Using devres to depopulate the aux bus made sure that upon a probe
> deferral the EDP panel device would be destroyed and recreated upon next
> attempt.
>
> But the struct device which the devres is tied to is the DPUs
> (drm_dev->dev), which may be happen after the DP controller is torn
> down.

There appear to be some words missing in this sentence.

> Indications of this can be seen in the commonly seen EDID-hexdump full
> of zeros in the log,

This could also happen when the aux bus lifetime was tied to the DP
controller, and is mostly benign as dp_aux_deinit() sets the "initted"
flag to false.

> or the occasional/rare KASAN fault where the
> panel's attempt to read the EDID information causes a use after free on
> DP resources.

But this is clearly a bug as there's a small window where the aux bus
struct holding the above flag may also have been released...

> It's tempting to move the devres to the DP controller's struct device,
> but the resources used by the device(s) on the aux bus are explicitly
> torn down in the error path. The KASAN-reported use-after-free also
> remains, as the DP aux "module" explicitly frees its devres-allocated
> memory in this code path.

Right, and this would also not work as the aux bus could remain
populated for the next bind attempt which would then fail (as described
in the commit message of the offending commit).

> As such, explicitly depopulate the aux bus in the error path, and in the
> component unbind path, to avoid these issues.

Sounds good.

> Fixes: 2b57f726611e ("drm/msm/dp: fix aux-bus EP lifetime")

This one should also have a stable tag:

Cc: [email protected] # 5.19

> Signed-off-by: Bjorn Andersson <[email protected]>
> ---
> drivers/gpu/drm/msm/dp/dp_display.c | 14 +++-----------
> 1 file changed, 3 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
> index 3d8fa2e73583..bbb0550a022b 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -322,6 +322,8 @@ static void dp_display_unbind(struct device *dev, struct device *master,
>
> kthread_stop(dp->ev_tsk);
>
> + of_dp_aux_depopulate_bus(dp->aux);

This may now be called without first having populated the bus, but looks
like that still works.

> +
> dp_power_client_deinit(dp->power);
> dp_unregister_audio_driver(dev, dp->audio);
> dp_aux_unregister(dp->aux);

I know this one was merged while I was out-of-office last week, but for
the record:

Reviewed-by: Johan Hovold <[email protected]>
Tested-by: Johan Hovold <[email protected]>

Johan
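
[Editorial illustration, not part of the thread.] Two of the review
points above can be sketched in a small userspace model: a bus left
populated makes the next bind attempt fail, and depopulating a bus that
was never populated is harmless. All names here (struct model_bus, the
ep_model_* functions, the -1 error value) are hypothetical and are not
the kernel's implementation; the early return in the depopulate helper is
an assumption about why the unconditional call in dp_display_unbind()
"still works".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the kernel's functions. */
struct model_bus {
	bool populated; /* an endpoint (panel) device currently exists */
	int removals;   /* how many endpoint devices were actually removed */
};

/* Populating an already populated bus fails. */
int ep_model_populate(struct model_bus *bus)
{
	assert(bus != NULL);
	if (bus->populated)
		return -1;
	bus->populated = true;
	return 0;
}

/* Safe on an empty bus: there is nothing to remove, so nothing happens. */
void ep_model_depopulate(struct model_bus *bus)
{
	assert(bus != NULL);
	if (!bus->populated)
		return;
	bus->populated = false;
	bus->removals++;
}

/*
 * One bind attempt that populates the bus and then hits a later failure
 * (for example a probe deferral). The cleanup flag selects whether the
 * error path depopulates the bus. Returns the result of the populate step.
 */
int ep_model_bind_attempt(struct model_bus *bus, bool cleanup)
{
	int rc = ep_model_populate(bus);

	if (rc)
		return rc;
	/* ... a later step fails here ... */
	if (cleanup)
		ep_model_depopulate(bus);
	return 0;
}
```

With cleanup disabled, a second attempt fails at the populate step
because the endpoint from the first attempt is still there; with cleanup
enabled, repeated attempts keep succeeding at that step. That matches the
reason the error path needs an explicit depopulate.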