If the display is not enable()d, then we aren't holding a runtime PM
reference here. Thus, it's easy to accidentally cause a hang, if user
space is poking around at /dev/drm_dp_aux0 at the "wrong" time.
Let's get the panel and PM state right before trying to talk AUX.
Fixes: 0d97ad03f422 ("drm/bridge: analogix_dp: Remove duplicated code")
Cc: <[email protected]>
Cc: Tomeu Vizoso <[email protected]>
Signed-off-by: Brian Norris <[email protected]>
---
Changes in v2:
- Fix spelling in Subject
- DRM_DEV_ERROR() -> drm_err()
- Propagate errors from un-analogix_dp_prepare_panel()
.../drm/bridge/analogix/analogix_dp_core.c | 21 ++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
index b7d2e4449cfa..6fc46ac93ef8 100644
--- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
@@ -1632,8 +1632,27 @@ static ssize_t analogix_dpaux_transfer(struct drm_dp_aux *aux,
struct drm_dp_aux_msg *msg)
{
struct analogix_dp_device *dp = to_dp(aux);
+ int ret, ret2;
- return analogix_dp_transfer(dp, msg);
+ ret = analogix_dp_prepare_panel(dp, true, false);
+ if (ret) {
+ drm_err(dp->drm_dev, "Failed to prepare panel (%d)\n", ret);
+ return ret;
+ }
+
+ pm_runtime_get_sync(dp->dev);
+ ret = analogix_dp_transfer(dp, msg);
+ pm_runtime_put(dp->dev);
+
+ ret2 = analogix_dp_prepare_panel(dp, false, false);
+ if (ret2) {
+ drm_err(dp->drm_dev, "Failed to unprepare panel (%d)\n", ret2);
+ /* Prefer the analogix_dp_transfer() error, if it exists. */
+ if (!ret)
+ ret = ret2;
+ }
+
+ return ret;
}
struct analogix_dp_device *
--
2.33.0.800.g4c38ced690-goog
(updating Andrzej's email)
On Fri, Oct 1, 2021 at 2:50 PM Brian Norris <[email protected]> wrote:
> If the display is not enable()d, then we aren't holding a runtime PM
> reference here. Thus, it's easy to accidentally cause a hang, if user
> space is poking around at /dev/drm_dp_aux0 at the "wrong" time.
>
> Let's get the panel and PM state right before trying to talk AUX.
>
> Fixes: 0d97ad03f422 ("drm/bridge: analogix_dp: Remove duplicated code")
> Cc: <[email protected]>
> Cc: Tomeu Vizoso <[email protected]>
> Signed-off-by: Brian Norris <[email protected]>
> ---
>
> Changes in v2:
> - Fix spelling in Subject
> - DRM_DEV_ERROR() -> drm_err()
> - Propagate errors from un-analogix_dp_prepare_panel()
>
> .../drm/bridge/analogix/analogix_dp_core.c | 21 ++++++++++++++++++-
> 1 file changed, 20 insertions(+), 1 deletion(-)
Ping? Do I need to do anything more here?
Thanks,
Brian
Hi Brian,
I am not DP specialist so CC-ed people working with DP
On 01.10.2021 23:42, Brian Norris wrote:
> If the display is not enable()d, then we aren't holding a runtime PM
> reference here. Thus, it's easy to accidentally cause a hang, if user
> space is poking around at /dev/drm_dp_aux0 at the "wrong" time.
>
> Let's get the panel and PM state right before trying to talk AUX.
>
> Fixes: 0d97ad03f422 ("drm/bridge: analogix_dp: Remove duplicated code")
> Cc: <[email protected]>
> Cc: Tomeu Vizoso <[email protected]>
> Signed-off-by: Brian Norris <[email protected]>
Few questions/issues here:
1. If it is just to avoid accidental 'hangs' it would be better to just
check if the panel is working before transfer, if not, return error
code. If there is better reason for this pm dance, please provide it in
description.
2. Again I see an assumption that panel-prepare enables power for
something different than video transmission, accidentally it is true for
most devices, but devices having more fine grained power management will
break, or at least will be used inefficiently - but maybe in case of dp
it is OK ???
3. More general issue - I am not sure if this should not be handled
uniformly for all drm_dp devices.
Regards
Andrzej
> ---
>
> Changes in v2:
> - Fix spelling in Subject
> - DRM_DEV_ERROR() -> drm_err()
> - Propagate errors from un-analogix_dp_prepare_panel()
>
> .../drm/bridge/analogix/analogix_dp_core.c | 21 ++++++++++++++++++-
> 1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> index b7d2e4449cfa..6fc46ac93ef8 100644
> --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> @@ -1632,8 +1632,27 @@ static ssize_t analogix_dpaux_transfer(struct drm_dp_aux *aux,
> struct drm_dp_aux_msg *msg)
> {
> struct analogix_dp_device *dp = to_dp(aux);
> + int ret, ret2;
>
> - return analogix_dp_transfer(dp, msg);
> + ret = analogix_dp_prepare_panel(dp, true, false);
> + if (ret) {
> + drm_err(dp->drm_dev, "Failed to prepare panel (%d)\n", ret);
> + return ret;
> + }
> +
> + pm_runtime_get_sync(dp->dev);
> + ret = analogix_dp_transfer(dp, msg);
> + pm_runtime_put(dp->dev);
> +
> + ret2 = analogix_dp_prepare_panel(dp, false, false);
> + if (ret2) {
> + drm_err(dp->drm_dev, "Failed to unprepare panel (%d)\n", ret2);
> + /* Prefer the analogix_dp_transfer() error, if it exists. */
> + if (!ret)
> + ret = ret2;
> + }
> +
> + return ret;
> }
>
> struct analogix_dp_device *
Hi Andrzej,
On Tue, Jan 11, 2022 at 5:26 AM Andrzej Hajda <[email protected]> wrote:
> I am not DP specialist so CC-ed people working with DP
Thanks for the review regardless! I'll also not claim to be a DP
specialist -- although I've had to learn my fair share to debug a good
handful of issues on an SoC using this driver.
> On 01.10.2021 23:42, Brian Norris wrote:
> > If the display is not enable()d, then we aren't holding a runtime PM
> > reference here. Thus, it's easy to accidentally cause a hang, if user
> > space is poking around at /dev/drm_dp_aux0 at the "wrong" time.
> >
> > Let's get the panel and PM state right before trying to talk AUX.
> >
> > Fixes: 0d97ad03f422 ("drm/bridge: analogix_dp: Remove duplicated code")
> > Cc: <[email protected]>
> > Cc: Tomeu Vizoso <[email protected]>
> > Signed-off-by: Brian Norris <[email protected]>
>
>
> Few questions/issues here:
>
> 1. If it is just to avoid accidental 'hangs' it would be better to just
> check if the panel is working before transfer, if not, return error
> code. If there is better reason for this pm dance, please provide it in
> description.
I'm not that familiar with DP-AUX, but I believe it can potentially
provide a variety of useful information (e.g., EDID?) to users without
the display and primary video link being active. So it doesn't sound
like a good idea to me to purposely leave this interface uninitialized
(and emitting errors) even when the user is asking for communication
(via /dev/drm_dp_aux<N>). Do you want me to document what
/dev/drm_dp_aux<N> does, and why someone would use it, in the commit
message?
> 2. Again I see an assumption that panel-prepare enables power for
> something different than video transmission, accidentally it is true for
> most devices, but devices having more fine grained power management will
> break, or at least will be used inefficiently - but maybe in case of dp
> it is OK ???
For this part, I'm less sure -- I wasn't sure what the general needs
are for AUX communication, and whether we need the panel enabled or
not. It seems logical that we need something powered, and I don't know
of anything besides "prepare()" that ensures that for DP panels.
(NB: the key to _my_ problem is the PM runtime reference. It's
absolutely essential that we don't try to utilize the DP hardware
without powering it up. The panel power state is less critical.)
> 3. More general issue - I am not sure if this should not be handled
> uniformly for all drm_dp devices.
I'm not sure what precisely you mean by #3. But FWIW, this is at least
partially documented ("make sure it's been properly enabled"):
/**
* @transfer: transfers a message representing a single AUX
* transaction.
*
* This is a hardware-specific implementation of how
* transactions are executed that the drivers must provide.
...
* Also note that this callback can be called no matter the
* state @dev is in. Drivers that need that device to be powered
* to perform this operation will first need to make sure it's
* been properly enabled.
*/
ssize_t (*transfer)(struct drm_dp_aux *aux,
struct drm_dp_aux_msg *msg);
But maybe the definition of "properly enabled" is what you're unsure
about? (I'm also a little unsure.)
Regards,
Brian
On Tue, Feb 15, 2022 at 1:31 PM Doug Anderson <[email protected]> wrote:
>
> Hi,
Hi!
> On Fri, Oct 1, 2021 at 2:50 PM Brian Norris <[email protected]> wrote:
> >
> > If the display is not enable()d, then we aren't holding a runtime PM
> > reference here. Thus, it's easy to accidentally cause a hang, if user
> > space is poking around at /dev/drm_dp_aux0 at the "wrong" time.
> >
> > Let's get the panel and PM state right before trying to talk AUX.
> > diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> > index b7d2e4449cfa..6fc46ac93ef8 100644
> > --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> > +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> > @@ -1632,8 +1632,27 @@ static ssize_t analogix_dpaux_transfer(struct drm_dp_aux *aux,
...
> > + pm_runtime_get_sync(dp->dev);
> > + ret = analogix_dp_transfer(dp, msg);
> > + pm_runtime_put(dp->dev);
>
> I've spent an unfortunate amount of time digging around the DP AUX bus
> recently, so I can at least say that I have some experience and some
> opinions here.
Thanks! Experience is welcome, and opinions too sometimes ;)
> IMO:
>
> 1. Don't power the panel on. If the panel isn't powered on then the DP
> AUX transfer will timeout. Tough nuggies. Think of yourself more like
> an i2c controller and of this as an i2c transfer implementation. The
> i2c controller isn't in charge of powering up the i2c devices on the
> bus. If userspace does an "i2c detect" on an i2c bus and some of the
> devices aren't powered then they won't be found. If you try to
> read/write from a powered off device that won't work either.
I guess this, paired with the driver examples below (ti-sn65dsi86.c,
especially, which specifically throws errors if the panel isn't on),
makes some sense. It's approximately (but more verbosely) what Andrzej
was recommending too, I guess. It still makes me wonder what the point
of the /dev/drm_dp_aux<N> interface is though, because it seems like
you're pretty much destined to not have reliable operation through
that means.
Also note: I found that the AUX bus is really not working properly at
all (even with this patch) in some cases due to self-refresh. Not only
do we need the panel enabled, but we need to not be in self-refresh
mode. Self-refresh is not currently exposed to user space, so user
space has no way of knowing the panel is currently active, aside from
racily inducing artificial display activity.
But if we're OK with "just throw errors" or "just let stuff time out",
then I guess that's not a big deal. My purpose is to avoid hanging the
system, not to make /dev/drm_dp_aux<N> useful.
> 2. In theory if the DP driver can read HPD (I haven't looked through
> the analogix code to see how it handles it) then you can fail an AUX
> transfer right away if HPD isn't asserted instead of timing out. If
> this is hard, it's probably fine to just time out though.
This driver does handle HPD, but it also has overrides because
apparently it doesn't work on some systems. I might see if we can
leverage it, or I might just follow the bridge-enabled state (similar
to ti-sn65dsi86.c's 'comms_enabled').
> 3. Do the "pm_runtime" calls, but enable "autosuspend" with something
> ~1 second autosuspend delay. When using the AUX bus to read an EDID
> the underlying code will call your function 16 times in quick
> succession. If you're powering up and down constantly that'll be a bit
> of a waste.
Does this part really matter? For properly active cases, the bridge
remains enabled, and it holds a runtime PM reference. For "maybe
active" (your "tough nuggies" situation above), you're probably right
that it's inefficient, but does it matter, when it's going to be a
slow timed-out operation anyway? The AUX failure will be much slower
than the PM transition.
I guess I can do this anyway, but frankly, I'll just be copy/pasting
stuff from other drivers, because the runtime PM documentation still
confuses me, and moreso once you involve autosuspend.
> For a reference, you could look at
> `drivers/gpu/drm/bridge/ti-sn65dsi86.c`. Also
> `drivers/gpu/drm/bridge/parade-ps8640.c`
Thanks for these. They look like reasonable patterns to follow.
Brian
On Tue, Feb 15, 2022 at 3:46 PM Doug Anderson <[email protected]> wrote:
> On Tue, Feb 15, 2022 at 2:52 PM Brian Norris <[email protected]> wrote:
> > It still makes me wonder what the point
> > of the /dev/drm_dp_aux<N> interface is though, because it seems like
> > you're pretty much destined to not have reliable operation through
> > that means.
>
> I can't say I have tons of history for those files. I seem to recall
> maybe someone using them to have userspace tweak the embedded
> backlight on some external DP connected panels? I think we also might
> use it in Chrome OS to update the firmware of panels (dunno if
> internal or external) in some cases too? I suspect that it works OK
> for certain situations but it's really not going to work in all
> cases...
Yes, I believe I'm only submitting patches like this because fwupd
apparently likes to indiscriminately whack at dpaux devices:
https://github.com/fwupd/fwupd/tree/main/plugins/synaptics-mst#kernel-dp-aux-interface
That seems like a bad idea.
(We've already disabled that plugin on these systems, but it seems
wise not to leave the stumbling block here for the next time.)
> I suppose this just further proves the point that this is really not a
> great interface to rely on. It's fine for debugging during hardware
> bringup and I guess in limited situations it might be OK, but it's
> really not something we want userspace tweaking with anyway, right? In
> general I expect it's up to the kernel to be controlling peripherals
> on the DP AUX bus. The kernel should have a backlight driver and
> should do the AUX transfers needed. Having userspace in there mucking
> with things is just a bad idea. I mean, userspace also doesn't know
> when a panel has been power cycled and potentially lost any changes
> that they might have written, right?
>
> I sorta suspect that most of the uses of these files are there because
> there wasn't a kernel driver and someone thought that doing it in
> userspace was the way to go?
*shrug* beats me.
Brian
Hi,
On Fri, Oct 1, 2021 at 2:50 PM Brian Norris <[email protected]> wrote:
>
> If the display is not enable()d, then we aren't holding a runtime PM
> reference here. Thus, it's easy to accidentally cause a hang, if user
> space is poking around at /dev/drm_dp_aux0 at the "wrong" time.
>
> Let's get the panel and PM state right before trying to talk AUX.
>
> Fixes: 0d97ad03f422 ("drm/bridge: analogix_dp: Remove duplicated code")
> Cc: <[email protected]>
> Cc: Tomeu Vizoso <[email protected]>
> Signed-off-by: Brian Norris <[email protected]>
> ---
>
> Changes in v2:
> - Fix spelling in Subject
> - DRM_DEV_ERROR() -> drm_err()
> - Propagate errors from un-analogix_dp_prepare_panel()
>
> .../drm/bridge/analogix/analogix_dp_core.c | 21 ++++++++++++++++++-
> 1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> index b7d2e4449cfa..6fc46ac93ef8 100644
> --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> @@ -1632,8 +1632,27 @@ static ssize_t analogix_dpaux_transfer(struct drm_dp_aux *aux,
> struct drm_dp_aux_msg *msg)
> {
> struct analogix_dp_device *dp = to_dp(aux);
> + int ret, ret2;
>
> - return analogix_dp_transfer(dp, msg);
> + ret = analogix_dp_prepare_panel(dp, true, false);
> + if (ret) {
> + drm_err(dp->drm_dev, "Failed to prepare panel (%d)\n", ret);
> + return ret;
> + }
> +
> + pm_runtime_get_sync(dp->dev);
> + ret = analogix_dp_transfer(dp, msg);
> + pm_runtime_put(dp->dev);
I've spent an unfortunate amount of time digging around the DP AUX bus
recently, so I can at least say that I have some experience and some
opinions here.
IMO:
1. Don't power the panel on. If the panel isn't powered on then the DP
AUX transfer will timeout. Tough nuggies. Think of yourself more like
an i2c controller and of this as an i2c transfer implementation. The
i2c controller isn't in charge of powering up the i2c devices on the
bus. If userspace does an "i2c detect" on an i2c bus and some of the
devices aren't powered then they won't be found. If you try to
read/write from a powered off device that won't work either.
2. In theory if the DP driver can read HPD (I haven't looked through
the analogix code to see how it handles it) then you can fail an AUX
transfer right away if HPD isn't asserted instead of timing out. If
this is hard, it's probably fine to just time out though.
3. Do the "pm_runtime" calls, but enable "autosuspend" with something
~1 second autosuspend delay. When using the AUX bus to read an EDID
the underlying code will call your function 16 times in quick
succession. If you're powering up and down constantly that'll be a bit
of a waste.
The above will help set us up for when someone goes through and
enables the "new" DP AUX bus and generic eDP panels on for
analogix-edp. See commit aeb33699fc2c ("drm: Introduce the DP AUX
bus"). In that case panels will actually be instantiated DP AUX
Endpoints instead of platform devices. They'll be given the DP AUX bus
and they'll be able to read the EDID over that. In this case, the
panel code turns itself on (it knows how to turn itself on enough to
read the EDID) and then calls the DP AUX transfer code. :-)
For a reference, you could look at
`drivers/gpu/drm/bridge/ti-sn65dsi86.c`. Also
`drivers/gpu/drm/bridge/parade-ps8640.c`
-Doug
Hi,
On Tue, Feb 15, 2022 at 4:41 PM Brian Norris <[email protected]> wrote:
>
> On Tue, Feb 15, 2022 at 3:46 PM Doug Anderson <[email protected]> wrote:
> > On Tue, Feb 15, 2022 at 2:52 PM Brian Norris <[email protected]> wrote:
> > > It still makes me wonder what the point
> > > of the /dev/drm_dp_aux<N> interface is though, because it seems like
> > > you're pretty much destined to not have reliable operation through
> > > that means.
> >
> > I can't say I have tons of history for those files. I seem to recall
> > maybe someone using them to have userspace tweak the embedded
> > backlight on some external DP connected panels? I think we also might
> > use it in Chrome OS to update the firmware of panels (dunno if
> > internal or external) in some cases too? I suspect that it works OK
> > for certain situations but it's really not going to work in all
> > cases...
>
> Yes, I believe I'm only submitting patches like this because fwupd
> apparently likes to indiscriminately whack at dpaux devices:
> https://github.com/fwupd/fwupd/tree/main/plugins/synaptics-mst#kernel-dp-aux-interface
>
> That seems like a bad idea.
>
> (We've already disabled that plugin on these systems, but it seems
> wise not to leave the stumbling block here for the next time.)
Yeah, it doesn't seem great. I guess it's _slightly_ less bad since
it's for an external device. As I understand it, they never really
turn off in the same way. It still feels like letting userspace
indiscriminately whack at DP AUX registers isn't a great idea, though.