2024-02-17 15:03:34

by Johan Hovold

[permalink] [raw]
Subject: [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

Starting with 6.8-rc1 the internal display sometimes fails to come up on
machines like the Lenovo ThinkPad X13s and the logs indicate that this
is due to a regression in the DRM subsystem [1].

This series fixes a race in the pmic_glink_altmode driver which was
exposed / triggered by the transparent DRM bridges rework that went into
6.8-rc1 and that manifested itself as a bridge failing to attach and
sometimes triggering a NULL-pointer dereference.

The intermittent hard resets that have also been reported since 6.8-rc1
unfortunately still remains and suggests that we are dealing with two
separate regressions. There is some indication that also the hard resets
(e.g. due to register accesses to unclocked hardware) are also due to
changes in the DRM subsystem as it happens around the time that the eDP
panel and display controller would be initialised during boot (the
runtime PM rework?). This remains to be verified, however.

Included is also a fix for a related OF node reference leak in the
aux-hpd driver found through inspection when reworking the driver.

The use-after-free bug is triggered by a probe deferral and highlighted
some further bugs in the involved drivers, which were registering child
devices before deferring probe. This behaviour is not correct and can
both trigger probe deferral loops and potentially also further issues
with the DRM bridge implementation.

This series can either go through the Qualcomm SoC tree (pmic_glink) or
the DRM tree. The PHY patches do not depend on the rest of the series
and could possibly be merged separately through the PHY tree.

Whichever gets this to mainline the fastest.

Johan


[1] https://lore.kernel.org/lkml/[email protected]/


Johan Hovold (5):
drm/bridge: aux-hpd: fix OF node leaks
drm/bridge: aux-hpd: separate allocation and registration
soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free
phy: qcom-qmp-combo: fix drm bridge registration
phy: qcom-qmp-combo: fix type-c switch registration

Rob Clark (1):
soc: qcom: pmic_glink: Fix boot when QRTR=m

drivers/gpu/drm/bridge/aux-hpd-bridge.c | 70 ++++++++++++++++++-----
drivers/phy/qualcomm/phy-qcom-qmp-combo.c | 16 +++---
drivers/soc/qcom/pmic_glink.c | 21 +++----
drivers/soc/qcom/pmic_glink_altmode.c | 16 +++++-
include/drm/bridge/aux-bridge.h | 15 +++++
5 files changed, 102 insertions(+), 36 deletions(-)

--
2.43.0



2024-02-17 15:04:04

by Johan Hovold

[permalink] [raw]
Subject: [PATCH 5/6] phy: qcom-qmp-combo: fix drm bridge registration

Due to a long-standing issue in driver core, drivers may not probe defer
after having registered child devices to avoid triggering a probe
deferral loop (see fbc35b45f9f6 ("Add documentation on meaning of
-EPROBE_DEFER")).

This could potentially also trigger a bug in the DRM bridge
implementation which does not expect bridges to go away even if device
links may avoid triggering this (when enabled).

Move registration of the DRM aux bridge to after looking up clocks and
other resources.

Note that PHY creation can in theory also trigger a probe deferral when
a 'phy' supply is used. This does not seem to affect the QMP PHY driver
but the PHY subsystem should be reworked to address this (i.e. by
separating initialisation and registration of the PHY).

Fixes: 35921910bbd0 ("phy: qcom: qmp-combo: switch to DRM_AUX_BRIDGE")
Fixes: 1904c3f578dc ("phy: qcom-qmp-combo: Introduce drm_bridge")
Cc: [email protected] # 6.5
Cc: Bjorn Andersson <[email protected]>
Cc: Dmitry Baryshkov <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
---
drivers/phy/qualcomm/phy-qcom-qmp-combo.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
index 1ad10110dd25..e19d6a084f10 100644
--- a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
+++ b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
@@ -3566,10 +3566,6 @@ static int qmp_combo_probe(struct platform_device *pdev)
if (ret)
return ret;

- ret = drm_aux_bridge_register(dev);
- if (ret)
- return ret;
-
/* Check for legacy binding with child nodes. */
usb_np = of_get_child_by_name(dev->of_node, "usb3-phy");
if (usb_np) {
@@ -3589,6 +3585,10 @@ static int qmp_combo_probe(struct platform_device *pdev)
if (ret)
goto err_node_put;

+ ret = drm_aux_bridge_register(dev);
+ if (ret)
+ goto err_node_put;
+
pm_runtime_set_active(dev);
ret = devm_pm_runtime_enable(dev);
if (ret)
--
2.43.0


2024-02-17 15:04:08

by Johan Hovold

[permalink] [raw]
Subject: [PATCH 3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

A recent DRM series purporting to simplify support for "transparent
bridges" and handling of probe deferrals ironically exposed a
use-after-free issue on pmic_glink_altmode probe deferral.

This has manifested itself as the display subsystem occasionally failing
to initialise and NULL-pointer dereferences during boot of machines like
the Lenovo ThinkPad X13s.

Specifically, the dp-hpd bridge is currently registered before all
resources have been acquired which means that it can also be
deregistered on probe deferrals.

In the meantime there is a race window where the new aux bridge driver
(or PHY driver previously) may have looked up the dp-hpd bridge and
stored a (non-reference-counted) pointer to the bridge which is about to
be deallocated.

When the display controller is later initialised, this triggers a
use-after-free when attaching the bridges:

dp -> aux -> dp-hpd (freed)

which may, for example, result in the freed bridge failing to attach:

[drm:drm_bridge_attach [drm]] *ERROR* failed to attach bridge /soc@0/phy@88eb000 to encoder TMDS-31: -16

or a NULL-pointer dereference:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
...
Call trace:
drm_bridge_attach+0x70/0x1a8 [drm]
drm_aux_bridge_attach+0x24/0x38 [aux_bridge]
drm_bridge_attach+0x80/0x1a8 [drm]
dp_bridge_init+0xa8/0x15c [msm]
msm_dp_modeset_init+0x28/0xc4 [msm]

The DRM bridge implementation is clearly fragile and implicitly built on
the assumption that bridges may never go away. In this case, the fix is
to move the bridge registration in the pmic_glink_altmode driver to
after all resources have been looked up.

Incidentally, with the new dp-hpd bridge implementation, which registers
child devices, this is also a requirement due to a long-standing issue
in driver core that can otherwise lead to a probe deferral loop (see
fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER")).

Fixes: 080b4e24852b ("soc: qcom: pmic_glink: Introduce altmode support")
Fixes: 2bcca96abfbf ("soc: qcom: pmic-glink: switch to DRM_AUX_HPD_BRIDGE")
Cc: [email protected] # 6.3
Cc: Bjorn Andersson <[email protected]>
Cc: Dmitry Baryshkov <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
---
drivers/soc/qcom/pmic_glink_altmode.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/soc/qcom/pmic_glink_altmode.c b/drivers/soc/qcom/pmic_glink_altmode.c
index 5fcd0fdd2faa..b3808fc24c69 100644
--- a/drivers/soc/qcom/pmic_glink_altmode.c
+++ b/drivers/soc/qcom/pmic_glink_altmode.c
@@ -76,7 +76,7 @@ struct pmic_glink_altmode_port {

struct work_struct work;

- struct device *bridge;
+ struct auxiliary_device *bridge;

enum typec_orientation orientation;
u16 svid;
@@ -230,7 +230,7 @@ static void pmic_glink_altmode_worker(struct work_struct *work)
else
pmic_glink_altmode_enable_usb(altmode, alt_port);

- drm_aux_hpd_bridge_notify(alt_port->bridge,
+ drm_aux_hpd_bridge_notify(&alt_port->bridge->dev,
alt_port->hpd_state ?
connector_status_connected :
connector_status_disconnected);
@@ -454,7 +454,7 @@ static int pmic_glink_altmode_probe(struct auxiliary_device *adev,
alt_port->index = port;
INIT_WORK(&alt_port->work, pmic_glink_altmode_worker);

- alt_port->bridge = drm_dp_hpd_bridge_register(dev, to_of_node(fwnode));
+ alt_port->bridge = devm_drm_dp_hpd_bridge_alloc(dev, to_of_node(fwnode));
if (IS_ERR(alt_port->bridge)) {
fwnode_handle_put(fwnode);
return PTR_ERR(alt_port->bridge);
@@ -510,6 +510,16 @@ static int pmic_glink_altmode_probe(struct auxiliary_device *adev,
}
}

+ for (port = 0; port < ARRAY_SIZE(altmode->ports); port++) {
+ alt_port = &altmode->ports[port];
+ if (!alt_port->bridge)
+ continue;
+
+ ret = devm_drm_dp_hpd_bridge_add(dev, alt_port->bridge);
+ if (ret)
+ return ret;
+ }
+
altmode->client = devm_pmic_glink_register_client(dev,
altmode->owner_id,
pmic_glink_altmode_callback,
--
2.43.0


2024-02-17 15:04:15

by Johan Hovold

[permalink] [raw]
Subject: [PATCH 2/6] drm/bridge: aux-hpd: separate allocation and registration

Combining allocation and registration is an anti-pattern that should be
avoided. Add two new functions for allocating and registering an dp-hpd
bridge with a proper 'devm' prefix so that it is clear that these are
device managed interfaces.

devm_drm_dp_hpd_bridge_alloc()
devm_drm_dp_hpd_bridge_add()

The new interface will be used to fix a use-after-free bug in the
Qualcomm PMIC GLINK driver and may prevent similar issues from being
introduced elsewhere.

The existing drm_dp_hpd_bridge_register() is reimplemented using the
above and left in place for now.

Signed-off-by: Johan Hovold <[email protected]>
---
drivers/gpu/drm/bridge/aux-hpd-bridge.c | 67 +++++++++++++++++++------
include/drm/bridge/aux-bridge.h | 15 ++++++
2 files changed, 67 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/bridge/aux-hpd-bridge.c b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
index 9e71daf95bde..6886db2d9e00 100644
--- a/drivers/gpu/drm/bridge/aux-hpd-bridge.c
+++ b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
@@ -30,16 +30,13 @@ static void drm_aux_hpd_bridge_release(struct device *dev)
kfree(adev);
}

-static void drm_aux_hpd_bridge_unregister_adev(void *_adev)
+static void drm_aux_hpd_bridge_free_adev(void *_adev)
{
- struct auxiliary_device *adev = _adev;
-
- auxiliary_device_delete(adev);
- auxiliary_device_uninit(adev);
+ auxiliary_device_uninit(_adev);
}

/**
- * drm_dp_hpd_bridge_register - Create a simple HPD DisplayPort bridge
+ * devm_drm_dp_hpd_bridge_alloc - allocate a HPD DisplayPort bridge
* @parent: device instance providing this bridge
* @np: device node pointer corresponding to this bridge instance
*
@@ -47,11 +44,9 @@ static void drm_aux_hpd_bridge_unregister_adev(void *_adev)
* DRM_MODE_CONNECTOR_DisplayPort, which terminates the bridge chain and is
* able to send the HPD events.
*
- * Return: device instance that will handle created bridge or an error code
- * encoded into the pointer.
+ * Return: bridge auxiliary device pointer or an error pointer
*/
-struct device *drm_dp_hpd_bridge_register(struct device *parent,
- struct device_node *np)
+struct auxiliary_device *devm_drm_dp_hpd_bridge_alloc(struct device *parent, struct device_node *np)
{
struct auxiliary_device *adev;
int ret;
@@ -82,13 +77,55 @@ struct device *drm_dp_hpd_bridge_register(struct device *parent,
return ERR_PTR(ret);
}

- ret = auxiliary_device_add(adev);
- if (ret) {
- auxiliary_device_uninit(adev);
+ ret = devm_add_action_or_reset(parent, drm_aux_hpd_bridge_free_adev, adev);
+ if (ret)
return ERR_PTR(ret);
- }

- ret = devm_add_action_or_reset(parent, drm_aux_hpd_bridge_unregister_adev, adev);
+ return adev;
+}
+EXPORT_SYMBOL_GPL(devm_drm_dp_hpd_bridge_alloc);
+
+static void drm_aux_hpd_bridge_del_adev(void *_adev)
+{
+ auxiliary_device_delete(_adev);
+}
+
+/**
+ * devm_drm_dp_hpd_bridge_add - register a HDP DisplayPort bridge
+ * @dev: struct device to tie registration lifetime to
+ * @adev: bridge auxiliary device to be registered
+ *
+ * Returns: zero on success or a negative errno
+ */
+int devm_drm_dp_hpd_bridge_add(struct device *dev, struct auxiliary_device *adev)
+{
+ int ret;
+
+ ret = auxiliary_device_add(adev);
+ if (ret)
+ return ret;
+
+ return devm_add_action_or_reset(dev, drm_aux_hpd_bridge_del_adev, adev);
+}
+EXPORT_SYMBOL_GPL(devm_drm_dp_hpd_bridge_add);
+
+/**
+ * drm_dp_hpd_bridge_register - allocate and register a HDP DisplayPort bridge
+ * @parent: device instance providing this bridge
+ * @np: device node pointer corresponding to this bridge instance
+ *
+ * Return: device instance that will handle created bridge or an error pointer
+ */
+struct device *drm_dp_hpd_bridge_register(struct device *parent, struct device_node *np)
+{
+ struct auxiliary_device *adev;
+ int ret;
+
+ adev = devm_drm_dp_hpd_bridge_alloc(parent, np);
+ if (IS_ERR(adev))
+ return ERR_CAST(adev);
+
+ ret = devm_drm_dp_hpd_bridge_add(parent, adev);
if (ret)
return ERR_PTR(ret);

diff --git a/include/drm/bridge/aux-bridge.h b/include/drm/bridge/aux-bridge.h
index c4c423e97f06..4453906105ca 100644
--- a/include/drm/bridge/aux-bridge.h
+++ b/include/drm/bridge/aux-bridge.h
@@ -9,6 +9,8 @@

#include <drm/drm_connector.h>

+struct auxiliary_device;
+
#if IS_ENABLED(CONFIG_DRM_AUX_BRIDGE)
int drm_aux_bridge_register(struct device *parent);
#else
@@ -19,10 +21,23 @@ static inline int drm_aux_bridge_register(struct device *parent)
#endif

#if IS_ENABLED(CONFIG_DRM_AUX_HPD_BRIDGE)
+struct auxiliary_device *devm_drm_dp_hpd_bridge_alloc(struct device *parent, struct device_node *np);
+int devm_drm_dp_hpd_bridge_add(struct device *dev, struct auxiliary_device *adev);
struct device *drm_dp_hpd_bridge_register(struct device *parent,
struct device_node *np);
void drm_aux_hpd_bridge_notify(struct device *dev, enum drm_connector_status status);
#else
+static inline struct auxiliary_device *devm_drm_dp_hpd_bridge_alloc(struct device *parent,
+ struct device_node *np)
+{
+ return NULL;
+}
+
+static inline int devm_drm_dp_hpd_bridge_add(struct auxiliary_device *adev)
+{
+ return 0;
+}
+
static inline struct device *drm_dp_hpd_bridge_register(struct device *parent,
struct device_node *np)
{
--
2.43.0


2024-02-17 15:04:32

by Johan Hovold

[permalink] [raw]
Subject: [PATCH 1/6] drm/bridge: aux-hpd: fix OF node leaks

The two device node references taken during allocation need to be
dropped when the auxiliary device is freed.

Fixes: 6914968a0b52 ("drm/bridge: properly refcount DT nodes in aux bridge drivers")
Cc: Dmitry Baryshkov <[email protected]>
Cc: Neil Armstrong <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
---
drivers/gpu/drm/bridge/aux-hpd-bridge.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/bridge/aux-hpd-bridge.c b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
index bb55f697a181..9e71daf95bde 100644
--- a/drivers/gpu/drm/bridge/aux-hpd-bridge.c
+++ b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
@@ -25,6 +25,7 @@ static void drm_aux_hpd_bridge_release(struct device *dev)
ida_free(&drm_aux_hpd_bridge_ida, adev->id);

of_node_put(adev->dev.platform_data);
+ of_node_put(adev->dev.of_node);

kfree(adev);
}
@@ -74,6 +75,8 @@ struct device *drm_dp_hpd_bridge_register(struct device *parent,

ret = auxiliary_device_init(adev);
if (ret) {
+ of_node_put(adev->dev.platform_data);
+ of_node_put(adev->dev.of_node);
ida_free(&drm_aux_hpd_bridge_ida, adev->id);
kfree(adev);
return ERR_PTR(ret);
--
2.43.0


2024-02-17 15:04:32

by Johan Hovold

[permalink] [raw]
Subject: [PATCH 4/6] soc: qcom: pmic_glink: Fix boot when QRTR=m

From: Rob Clark <[email protected]>

We need to bail out before adding/removing devices if we are going to
-EPROBE_DEFER. Otherwise boot can get stuck in a probe deferral loop due
to a long-standing issue in driver core (see fbc35b45f9f6 ("Add
documentation on meaning of -EPROBE_DEFER")).

Deregistering the altmode child device can potentially also trigger bugs
in the DRM bridge implementation, which does not expect bridges to go
away.

Suggested-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[ johan: rebase on 6.8-rc4, amend commit message and mention DRM ]
Fixes: 58ef4ece1e41 ("soc: qcom: pmic_glink: Introduce base PMIC GLINK driver")
Cc: [email protected] # 6.3
Cc: Bjorn Andersson <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
---
drivers/soc/qcom/pmic_glink.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/soc/qcom/pmic_glink.c b/drivers/soc/qcom/pmic_glink.c
index f4bfd24386f1..f913e9bd57ed 100644
--- a/drivers/soc/qcom/pmic_glink.c
+++ b/drivers/soc/qcom/pmic_glink.c
@@ -265,10 +265,17 @@ static int pmic_glink_probe(struct platform_device *pdev)

pg->client_mask = *match_data;

+ pg->pdr = pdr_handle_alloc(pmic_glink_pdr_callback, pg);
+ if (IS_ERR(pg->pdr)) {
+ ret = dev_err_probe(&pdev->dev, PTR_ERR(pg->pdr),
+ "failed to initialize pdr\n");
+ return ret;
+ }
+
if (pg->client_mask & BIT(PMIC_GLINK_CLIENT_UCSI)) {
ret = pmic_glink_add_aux_device(pg, &pg->ucsi_aux, "ucsi");
if (ret)
- return ret;
+ goto out_release_pdr_handle;
}
if (pg->client_mask & BIT(PMIC_GLINK_CLIENT_ALTMODE)) {
ret = pmic_glink_add_aux_device(pg, &pg->altmode_aux, "altmode");
@@ -281,17 +288,11 @@ static int pmic_glink_probe(struct platform_device *pdev)
goto out_release_altmode_aux;
}

- pg->pdr = pdr_handle_alloc(pmic_glink_pdr_callback, pg);
- if (IS_ERR(pg->pdr)) {
- ret = dev_err_probe(&pdev->dev, PTR_ERR(pg->pdr), "failed to initialize pdr\n");
- goto out_release_aux_devices;
- }
-
service = pdr_add_lookup(pg->pdr, "tms/servreg", "msm/adsp/charger_pd");
if (IS_ERR(service)) {
ret = dev_err_probe(&pdev->dev, PTR_ERR(service),
"failed adding pdr lookup for charger_pd\n");
- goto out_release_pdr_handle;
+ goto out_release_aux_devices;
}

mutex_lock(&__pmic_glink_lock);
@@ -300,8 +301,6 @@ static int pmic_glink_probe(struct platform_device *pdev)

return 0;

-out_release_pdr_handle:
- pdr_handle_release(pg->pdr);
out_release_aux_devices:
if (pg->client_mask & BIT(PMIC_GLINK_CLIENT_BATT))
pmic_glink_del_aux_device(pg, &pg->ps_aux);
@@ -311,6 +310,8 @@ static int pmic_glink_probe(struct platform_device *pdev)
out_release_ucsi_aux:
if (pg->client_mask & BIT(PMIC_GLINK_CLIENT_UCSI))
pmic_glink_del_aux_device(pg, &pg->ucsi_aux);
+out_release_pdr_handle:
+ pdr_handle_release(pg->pdr);

return ret;
}
--
2.43.0


2024-02-17 15:10:48

by Johan Hovold

[permalink] [raw]
Subject: [PATCH 6/6] phy: qcom-qmp-combo: fix type-c switch registration

Due to a long-standing issue in driver core, drivers may not probe defer
after having registered child devices to avoid triggering a probe
deferral loop (see fbc35b45f9f6 ("Add documentation on meaning of
-EPROBE_DEFER")).

Move registration of the typec switch to after looking up clocks and
other resources.

Note that PHY creation can in theory also trigger a probe deferral when
a 'phy' supply is used. This does not seem to affect the QMP PHY driver
but the PHY subsystem should be reworked to address this (i.e. by
separating initialisation and registration of the PHY).

Fixes: 2851117f8f42 ("phy: qcom-qmp-combo: Introduce orientation switching")
Cc: [email protected] # 6.5
Cc: Bjorn Andersson <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
---
drivers/phy/qualcomm/phy-qcom-qmp-combo.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
index e19d6a084f10..17c4ad7553a5 100644
--- a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
+++ b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
@@ -3562,10 +3562,6 @@ static int qmp_combo_probe(struct platform_device *pdev)
if (ret)
return ret;

- ret = qmp_combo_typec_switch_register(qmp);
- if (ret)
- return ret;
-
/* Check for legacy binding with child nodes. */
usb_np = of_get_child_by_name(dev->of_node, "usb3-phy");
if (usb_np) {
@@ -3585,6 +3581,10 @@ static int qmp_combo_probe(struct platform_device *pdev)
if (ret)
goto err_node_put;

+ ret = qmp_combo_typec_switch_register(qmp);
+ if (ret)
+ goto err_node_put;
+
ret = drm_aux_bridge_register(dev);
if (ret)
goto err_node_put;
--
2.43.0


2024-02-19 09:04:05

by Neil Armstrong

[permalink] [raw]
Subject: Re: [PATCH 5/6] phy: qcom-qmp-combo: fix drm bridge registration

On 17/02/2024 16:02, Johan Hovold wrote:
> Due to a long-standing issue in driver core, drivers may not probe defer
> after having registered child devices to avoid triggering a probe
> deferral loop (see fbc35b45f9f6 ("Add documentation on meaning of
> -EPROBE_DEFER")).
>
> This could potentially also trigger a bug in the DRM bridge
> implementation which does not expect bridges to go away even if device
> links may avoid triggering this (when enabled).
>
> Move registration of the DRM aux bridge to after looking up clocks and
> other resources.
>
> Note that PHY creation can in theory also trigger a probe deferral when
> a 'phy' supply is used. This does not seem to affect the QMP PHY driver
> but the PHY subsystem should be reworked to address this (i.e. by
> separating initialisation and registration of the PHY).
>
> Fixes: 35921910bbd0 ("phy: qcom: qmp-combo: switch to DRM_AUX_BRIDGE")
> Fixes: 1904c3f578dc ("phy: qcom-qmp-combo: Introduce drm_bridge")
> Cc: [email protected] # 6.5
> Cc: Bjorn Andersson <[email protected]>
> Cc: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
> ---
> drivers/phy/qualcomm/phy-qcom-qmp-combo.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
> index 1ad10110dd25..e19d6a084f10 100644
> --- a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
> +++ b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
> @@ -3566,10 +3566,6 @@ static int qmp_combo_probe(struct platform_device *pdev)
> if (ret)
> return ret;
>
> - ret = drm_aux_bridge_register(dev);
> - if (ret)
> - return ret;
> -
> /* Check for legacy binding with child nodes. */
> usb_np = of_get_child_by_name(dev->of_node, "usb3-phy");
> if (usb_np) {
> @@ -3589,6 +3585,10 @@ static int qmp_combo_probe(struct platform_device *pdev)
> if (ret)
> goto err_node_put;
>
> + ret = drm_aux_bridge_register(dev);
> + if (ret)
> + goto err_node_put;
> +
> pm_runtime_set_active(dev);
> ret = devm_pm_runtime_enable(dev);
> if (ret)

Reviewed-by: Neil Armstrong <[email protected]>

2024-02-19 17:49:43

by Markus Elfring

[permalink] [raw]
Subject: Re: [PATCH 1/6] drm/bridge: aux-hpd: fix OF node leaks

> The two device node references taken during allocation need to be
> dropped when the auxiliary device is freed.

> +++ b/drivers/gpu/drm/bridge/aux-hpd-bridge.c

> @@ -74,6 +75,8 @@ struct device *drm_dp_hpd_bridge_register(struct device *parent,
>
> ret = auxiliary_device_init(adev);
> if (ret) {
> + of_node_put(adev->dev.platform_data);
> + of_node_put(adev->dev.of_node);
> ida_free(&drm_aux_hpd_bridge_ida, adev->id);
> kfree(adev);
> return ERR_PTR(ret);

The last two statements are also used in a previous if branch.
https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/gpu/drm/bridge/aux-hpd-bridge.c#L63

How do you think about to avoid such a bit of duplicate source code
by adding a label here?

Regards,
Markus

2024-02-20 07:25:43

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 1/6] drm/bridge: aux-hpd: fix OF node leaks

On Mon, Feb 19, 2024 at 06:48:30PM +0100, Markus Elfring wrote:
> > The two device node references taken during allocation need to be
> > dropped when the auxiliary device is freed.
> …
> > +++ b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> …
> > @@ -74,6 +75,8 @@ struct device *drm_dp_hpd_bridge_register(struct device *parent,
> >
> > ret = auxiliary_device_init(adev);
> > if (ret) {
> > + of_node_put(adev->dev.platform_data);
> > + of_node_put(adev->dev.of_node);
> > ida_free(&drm_aux_hpd_bridge_ida, adev->id);
> > kfree(adev);
> > return ERR_PTR(ret);
>
> The last two statements are also used in a previous if branch.
> https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/gpu/drm/bridge/aux-hpd-bridge.c#L63
>
> How do you think about to avoid such a bit of duplicate source code
> by adding a label here?

No, the current code is fine and what you are suggesting is in any case
unrelated to this fix.

If this function ever grows a third error path like that, I too would
consider it however.

Johan

2024-02-20 08:27:44

by Markus Elfring

[permalink] [raw]
Subject: Re: [PATCH 3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free


> Specifically, the dp-hpd bridge is currently registered before all
> resources have been acquired which means that it can also be
> deregistered on probe deferrals.
>
> In the meantime there is a race window where the new aux bridge driver
> (or PHY driver previously) may have looked up the dp-hpd bridge and
> stored a (non-reference-counted) pointer to the bridge which is about to
> be deallocated.


I got the impression that the change description can be improved another bit.

1. Will any additional imperative wordings become helpful?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.8-rc5#n94



> +++ b/drivers/soc/qcom/pmic_glink_altmode.c
> @@ -76,7 +76,7 @@ struct pmic_glink_altmode_port {
>
> struct work_struct work;
>
> - struct device *bridge;
> + struct auxiliary_device *bridge;
>
> enum typec_orientation orientation;
> u16 svid;


2. How do you think about to stress such a data type adjustment?

Regards,
Markus

2024-02-20 11:24:57

by Markus Elfring

[permalink] [raw]
Subject: Re: [PATCH 3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free


> Specifically, the dp-hpd bridge is currently registered before all
> resources have been acquired which means that it can also be
> deregistered on probe deferrals.
>
> In the meantime there is a race window where the new aux bridge driver
> (or PHY driver previously) may have looked up the dp-hpd bridge and
> stored a (non-reference-counted) pointer to the bridge which is about to
> be deallocated.

> +++ b/drivers/soc/qcom/pmic_glink_altmode.c

> @@ -454,7 +454,7 @@ static int pmic_glink_altmode_probe(struct auxiliary_device *adev,
> alt_port->index = port;
> INIT_WORK(&alt_port->work, pmic_glink_altmode_worker);
>
> - alt_port->bridge = drm_dp_hpd_bridge_register(dev, to_of_node(fwnode));
> + alt_port->bridge = devm_drm_dp_hpd_bridge_alloc(dev, to_of_node(fwnode));
> if (IS_ERR(alt_port->bridge)) {
> fwnode_handle_put(fwnode);
> return PTR_ERR(alt_port->bridge);


The function call “fwnode_handle_put(fwnode)” is used in multiple if branches.
https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/soc/qcom/pmic_glink_altmode.c#L435

I suggest to add a jump target so that a bit of exception handling
can be better reused at the end of this function implementation.

Regards,
Markus

2024-02-20 11:27:18

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On Tue, Feb 20, 2024 at 11:55:57AM +0100, Markus Elfring wrote:
> …
> > Specifically, the dp-hpd bridge is currently registered before all
> > resources have been acquired which means that it can also be
> > deregistered on probe deferrals.
> >
> > In the meantime there is a race window where the new aux bridge driver
> > (or PHY driver previously) may have looked up the dp-hpd bridge and
> > stored a (non-reference-counted) pointer to the bridge which is about to
> > be deallocated.
> …
> > +++ b/drivers/soc/qcom/pmic_glink_altmode.c
> …
> > @@ -454,7 +454,7 @@ static int pmic_glink_altmode_probe(struct auxiliary_device *adev,
> > alt_port->index = port;
> > INIT_WORK(&alt_port->work, pmic_glink_altmode_worker);
> >
> > - alt_port->bridge = drm_dp_hpd_bridge_register(dev, to_of_node(fwnode));
> > + alt_port->bridge = devm_drm_dp_hpd_bridge_alloc(dev, to_of_node(fwnode));
> > if (IS_ERR(alt_port->bridge)) {
> > fwnode_handle_put(fwnode);
> > return PTR_ERR(alt_port->bridge);
> …
>
> The function call “fwnode_handle_put(fwnode)” is used in multiple if branches.
> https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/soc/qcom/pmic_glink_altmode.c#L435
>
> I suggest to add a jump target so that a bit of exception handling
> can be better reused at the end of this function implementation.

Markus, as people have told you repeatedly, just stop with these
comments. You're not helping, in fact, you are actively harmful to the
kernel community as you are wasting people's time.

Johan

2024-02-20 11:56:28

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH 1/6] drm/bridge: aux-hpd: fix OF node leaks

On Tue, 20 Feb 2024 at 13:52, Julia Lawall <[email protected]> wrote:
>
>
>
> On Tue, 20 Feb 2024, Johan Hovold wrote:
>
> > On Mon, Feb 19, 2024 at 06:48:30PM +0100, Markus Elfring wrote:
> > > > The two device node references taken during allocation need to be
> > > > dropped when the auxiliary device is freed.
> > > …
> > > > +++ b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> > > …
> > > > @@ -74,6 +75,8 @@ struct device *drm_dp_hpd_bridge_register(struct device *parent,
> > > >
> > > > ret = auxiliary_device_init(adev);
> > > > if (ret) {
> > > > + of_node_put(adev->dev.platform_data);
> > > > + of_node_put(adev->dev.of_node);
> > > > ida_free(&drm_aux_hpd_bridge_ida, adev->id);
> > > > kfree(adev);
> > > > return ERR_PTR(ret);
> > >
> > > The last two statements are also used in a previous if branch.
> > > https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/gpu/drm/bridge/aux-hpd-bridge.c#L63
> > >
> > > How do you think about to avoid such a bit of duplicate source code
> > > by adding a label here?
> >
> > No, the current code is fine and what you are suggesting is in any case
> > unrelated to this fix.
> >
> > If this function ever grows a third error path like that, I too would
> > consider it however.
>
> I guess these of_node_puts can all go away shortly with cleanup anyway?

I'm not sure about it. Those are long-living variables, so they are
not a subject of cleanup.h, are they?


--
With best wishes
Dmitry

2024-02-20 12:02:09

by Julia Lawall

[permalink] [raw]
Subject: Re: [PATCH 1/6] drm/bridge: aux-hpd: fix OF node leaks



On Tue, 20 Feb 2024, Johan Hovold wrote:

> On Mon, Feb 19, 2024 at 06:48:30PM +0100, Markus Elfring wrote:
> > > The two device node references taken during allocation need to be
> > > dropped when the auxiliary device is freed.
> > …
> > > +++ b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> > …
> > > @@ -74,6 +75,8 @@ struct device *drm_dp_hpd_bridge_register(struct device *parent,
> > >
> > > ret = auxiliary_device_init(adev);
> > > if (ret) {
> > > + of_node_put(adev->dev.platform_data);
> > > + of_node_put(adev->dev.of_node);
> > > ida_free(&drm_aux_hpd_bridge_ida, adev->id);
> > > kfree(adev);
> > > return ERR_PTR(ret);
> >
> > The last two statements are also used in a previous if branch.
> > https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/gpu/drm/bridge/aux-hpd-bridge.c#L63
> >
> > How do you think about to avoid such a bit of duplicate source code
> > by adding a label here?
>
> No, the current code is fine and what you are suggesting is in any case
> unrelated to this fix.
>
> If this function ever grows a third error path like that, I too would
> consider it however.

I guess these of_node_puts can all go away shortly with cleanup anyway?

julia

2024-02-20 12:41:19

by Markus Elfring

[permalink] [raw]
Subject: Re: [3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

>> The function call “fwnode_handle_put(fwnode)” is used in multiple if branches.
>> https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/soc/qcom/pmic_glink_altmode.c#L435
>>
>> I suggest to add a jump target so that a bit of exception handling
>> can be better reused at the end of this function implementation.
>
> Markus, as people have told you repeatedly, just stop with these comments.

How does such a response fit to advices from another known information sources?

Section “7) Centralized exiting of functions”
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?h=v6.8-rc5#n526


> You're not helping, in fact, you are actively harmful to the
> kernel community as you are wasting people's time.

The proposed source code transformation can eventually be (automatically) achieved
also with help of improved development tools.

Regards,
Markus

2024-02-20 13:02:59

by Julia Lawall

[permalink] [raw]
Subject: Re: [PATCH 1/6] drm/bridge: aux-hpd: fix OF node leaks



On Tue, 20 Feb 2024, Dmitry Baryshkov wrote:

> On Tue, 20 Feb 2024 at 13:52, Julia Lawall <[email protected]> wrote:
> >
> >
> >
> > On Tue, 20 Feb 2024, Johan Hovold wrote:
> >
> > > On Mon, Feb 19, 2024 at 06:48:30PM +0100, Markus Elfring wrote:
> > > > > The two device node references taken during allocation need to be
> > > > > dropped when the auxiliary device is freed.
> > > > …
> > > > > +++ b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> > > > …
> > > > > @@ -74,6 +75,8 @@ struct device *drm_dp_hpd_bridge_register(struct device *parent,
> > > > >
> > > > > ret = auxiliary_device_init(adev);
> > > > > if (ret) {
> > > > > + of_node_put(adev->dev.platform_data);
> > > > > + of_node_put(adev->dev.of_node);
> > > > > ida_free(&drm_aux_hpd_bridge_ida, adev->id);
> > > > > kfree(adev);
> > > > > return ERR_PTR(ret);
> > > >
> > > > The last two statements are also used in a previous if branch.
> > > > https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/gpu/drm/bridge/aux-hpd-bridge.c#L63
> > > >
> > > > How do you think about to avoid such a bit of duplicate source code
> > > > by adding a label here?
> > >
> > > No, the current code is fine and what you are suggesting is in any case
> > > unrelated to this fix.
> > >
> > > If this function ever grows a third error path like that, I too would
> > > consider it however.
> >
> > I guess these of_node_puts can all go away shortly with cleanup anyway?
>
> I'm not sure about it. Those are long-living variables, so they are
> not a subject of cleanup.h, are they?

OK, I didn't look at this code in detail, but cleanup would just call
of_node_put, not actually free the data.

julia

2024-02-20 13:36:03

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH 1/6] drm/bridge: aux-hpd: fix OF node leaks

On Tue, 20 Feb 2024 at 14:56, Julia Lawall <[email protected]> wrote:
>
>
>
> On Tue, 20 Feb 2024, Dmitry Baryshkov wrote:
>
> > On Tue, 20 Feb 2024 at 13:52, Julia Lawall <[email protected]> wrote:
> > >
> > >
> > >
> > > On Tue, 20 Feb 2024, Johan Hovold wrote:
> > >
> > > > On Mon, Feb 19, 2024 at 06:48:30PM +0100, Markus Elfring wrote:
> > > > > > The two device node references taken during allocation need to be
> > > > > > dropped when the auxiliary device is freed.
> > > > > …
> > > > > > +++ b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> > > > > …
> > > > > > @@ -74,6 +75,8 @@ struct device *drm_dp_hpd_bridge_register(struct device *parent,
> > > > > >
> > > > > > ret = auxiliary_device_init(adev);
> > > > > > if (ret) {
> > > > > > + of_node_put(adev->dev.platform_data);
> > > > > > + of_node_put(adev->dev.of_node);
> > > > > > ida_free(&drm_aux_hpd_bridge_ida, adev->id);
> > > > > > kfree(adev);
> > > > > > return ERR_PTR(ret);
> > > > >
> > > > > The last two statements are also used in a previous if branch.
> > > > > https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/gpu/drm/bridge/aux-hpd-bridge.c#L63
> > > > >
> > > > > How do you think about to avoid such a bit of duplicate source code
> > > > > by adding a label here?
> > > >
> > > > No, the current code is fine and what you are suggesting is in any case
> > > > unrelated to this fix.
> > > >
> > > > If this function ever grows a third error path like that, I too would
> > > > consider it however.
> > >
> > > I guess these of_node_puts can all go away shortly with cleanup anyway?
> >
> > I'm not sure about it. Those are long-living variables, so they are
> > not a subject of cleanup.h, are they?
>
> OK, I didn't look at this code in detail, but cleanup would just call
> of_node_put, not actually free the data.

Yes. The nodes should be put either in case of the failure or (if
everything goes fine) at the device unregistration.

--
With best wishes
Dmitry

2024-02-22 01:22:44

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH 1/6] drm/bridge: aux-hpd: fix OF node leaks

On Sat, Feb 17, 2024 at 04:02:23PM +0100, Johan Hovold wrote:
> The two device node references taken during allocation need to be
> dropped when the auxiliary device is freed.
>
> Fixes: 6914968a0b52 ("drm/bridge: properly refcount DT nodes in aux bridge drivers")
> Cc: Dmitry Baryshkov <[email protected]>
> Cc: Neil Armstrong <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>

Reviewed-by: Bjorn Andersson <[email protected]>

Regards,
Bjorn

> ---
> drivers/gpu/drm/bridge/aux-hpd-bridge.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/bridge/aux-hpd-bridge.c b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> index bb55f697a181..9e71daf95bde 100644
> --- a/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> +++ b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> @@ -25,6 +25,7 @@ static void drm_aux_hpd_bridge_release(struct device *dev)
> ida_free(&drm_aux_hpd_bridge_ida, adev->id);
>
> of_node_put(adev->dev.platform_data);
> + of_node_put(adev->dev.of_node);
>
> kfree(adev);
> }
> @@ -74,6 +75,8 @@ struct device *drm_dp_hpd_bridge_register(struct device *parent,
>
> ret = auxiliary_device_init(adev);
> if (ret) {
> + of_node_put(adev->dev.platform_data);
> + of_node_put(adev->dev.of_node);
> ida_free(&drm_aux_hpd_bridge_ida, adev->id);
> kfree(adev);
> return ERR_PTR(ret);
> --
> 2.43.0
>

2024-02-22 02:07:06

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH 2/6] drm/bridge: aux-hpd: separate allocation and registration

On Sat, Feb 17, 2024 at 04:02:24PM +0100, Johan Hovold wrote:
> diff --git a/drivers/gpu/drm/bridge/aux-hpd-bridge.c b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
[..]
> +/**
> + * devm_drm_dp_hpd_bridge_add - register a HDP DisplayPort bridge

kernel-doc wants () after function names.

> + * @dev: struct device to tie registration lifetime to
> + * @adev: bridge auxiliary device to be registered
> + *
> + * Returns: zero on success or a negative errno

and "Return:" without the 's'.

This could however be done in a separate patch, as the file is already
wrong in this regard.

Reviewed-by: Bjorn Andersson <[email protected]>

Regards,
Bjorn

2024-02-22 02:18:14

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH 4/6] soc: qcom: pmic_glink: Fix boot when QRTR=m

On Sat, Feb 17, 2024 at 04:02:26PM +0100, Johan Hovold wrote:
> From: Rob Clark <[email protected]>
>
> We need to bail out before adding/removing devices if we are going to
> -EPROBE_DEFER. Otherwise boot can get stuck in a probe deferral loop due
> to a long-standing issue in driver core (see fbc35b45f9f6 ("Add
> documentation on meaning of -EPROBE_DEFER")).
>
> Deregistering the altmode child device can potentially also trigger bugs
> in the DRM bridge implementation, which does not expect bridges to go
> away.
>
> Suggested-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Rob Clark <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> [ johan: rebase on 6.8-rc4, amend commit message and mention DRM ]
> Fixes: 58ef4ece1e41 ("soc: qcom: pmic_glink: Introduce base PMIC GLINK driver")
> Cc: [email protected] # 6.3
> Cc: Bjorn Andersson <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>

Reviewed-by: Bjorn Andersson <[email protected]>

Regards,
Bjorn

> ---
> drivers/soc/qcom/pmic_glink.c | 21 +++++++++++----------
> 1 file changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/soc/qcom/pmic_glink.c b/drivers/soc/qcom/pmic_glink.c
> index f4bfd24386f1..f913e9bd57ed 100644
> --- a/drivers/soc/qcom/pmic_glink.c
> +++ b/drivers/soc/qcom/pmic_glink.c
> @@ -265,10 +265,17 @@ static int pmic_glink_probe(struct platform_device *pdev)
>
> pg->client_mask = *match_data;
>
> + pg->pdr = pdr_handle_alloc(pmic_glink_pdr_callback, pg);
> + if (IS_ERR(pg->pdr)) {
> + ret = dev_err_probe(&pdev->dev, PTR_ERR(pg->pdr),
> + "failed to initialize pdr\n");
> + return ret;
> + }
> +
> if (pg->client_mask & BIT(PMIC_GLINK_CLIENT_UCSI)) {
> ret = pmic_glink_add_aux_device(pg, &pg->ucsi_aux, "ucsi");
> if (ret)
> - return ret;
> + goto out_release_pdr_handle;
> }
> if (pg->client_mask & BIT(PMIC_GLINK_CLIENT_ALTMODE)) {
> ret = pmic_glink_add_aux_device(pg, &pg->altmode_aux, "altmode");
> @@ -281,17 +288,11 @@ static int pmic_glink_probe(struct platform_device *pdev)
> goto out_release_altmode_aux;
> }
>
> - pg->pdr = pdr_handle_alloc(pmic_glink_pdr_callback, pg);
> - if (IS_ERR(pg->pdr)) {
> - ret = dev_err_probe(&pdev->dev, PTR_ERR(pg->pdr), "failed to initialize pdr\n");
> - goto out_release_aux_devices;
> - }
> -
> service = pdr_add_lookup(pg->pdr, "tms/servreg", "msm/adsp/charger_pd");
> if (IS_ERR(service)) {
> ret = dev_err_probe(&pdev->dev, PTR_ERR(service),
> "failed adding pdr lookup for charger_pd\n");
> - goto out_release_pdr_handle;
> + goto out_release_aux_devices;
> }
>
> mutex_lock(&__pmic_glink_lock);
> @@ -300,8 +301,6 @@ static int pmic_glink_probe(struct platform_device *pdev)
>
> return 0;
>
> -out_release_pdr_handle:
> - pdr_handle_release(pg->pdr);
> out_release_aux_devices:
> if (pg->client_mask & BIT(PMIC_GLINK_CLIENT_BATT))
> pmic_glink_del_aux_device(pg, &pg->ps_aux);
> @@ -311,6 +310,8 @@ static int pmic_glink_probe(struct platform_device *pdev)
> out_release_ucsi_aux:
> if (pg->client_mask & BIT(PMIC_GLINK_CLIENT_UCSI))
> pmic_glink_del_aux_device(pg, &pg->ucsi_aux);
> +out_release_pdr_handle:
> + pdr_handle_release(pg->pdr);
>
> return ret;
> }
> --
> 2.43.0
>

2024-02-22 02:21:38

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH 3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On Sat, Feb 17, 2024 at 04:02:25PM +0100, Johan Hovold wrote:
> A recent DRM series purporting to simplify support for "transparent
> bridges" and handling of probe deferrals ironically exposed a
> use-after-free issue on pmic_glink_altmode probe deferral.
>
> This has manifested itself as the display subsystem occasionally failing
> to initialise and NULL-pointer dereferences during boot of machines like
> the Lenovo ThinkPad X13s.
>
> Specifically, the dp-hpd bridge is currently registered before all
> resources have been acquired which means that it can also be
> deregistered on probe deferrals.
>
> In the meantime there is a race window where the new aux bridge driver
> (or PHY driver previously) may have looked up the dp-hpd bridge and
> stored a (non-reference-counted) pointer to the bridge which is about to
> be deallocated.
>
> When the display controller is later initialised, this triggers a
> use-after-free when attaching the bridges:
>
> dp -> aux -> dp-hpd (freed)
>
> which may, for example, result in the freed bridge failing to attach:
>
> [drm:drm_bridge_attach [drm]] *ERROR* failed to attach bridge /soc@0/phy@88eb000 to encoder TMDS-31: -16
>
> or a NULL-pointer dereference:
>
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> ...
> Call trace:
> drm_bridge_attach+0x70/0x1a8 [drm]
> drm_aux_bridge_attach+0x24/0x38 [aux_bridge]
> drm_bridge_attach+0x80/0x1a8 [drm]
> dp_bridge_init+0xa8/0x15c [msm]
> msm_dp_modeset_init+0x28/0xc4 [msm]
>
> The DRM bridge implementation is clearly fragile and implicitly built on
> the assumption that bridges may never go away. In this case, the fix is
> to move the bridge registration in the pmic_glink_altmode driver to
> after all resources have been looked up.
>
> Incidentally, with the new dp-hpd bridge implementation, which registers
> child devices, this is also a requirement due to a long-standing issue
> in driver core that can otherwise lead to a probe deferral loop (see
> fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER")).
>
> Fixes: 080b4e24852b ("soc: qcom: pmic_glink: Introduce altmode support")
> Fixes: 2bcca96abfbf ("soc: qcom: pmic-glink: switch to DRM_AUX_HPD_BRIDGE")
> Cc: [email protected] # 6.3
> Cc: Bjorn Andersson <[email protected]>
> Cc: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>

Reviewed-by: Bjorn Andersson <[email protected]>

Regards,
Bjorn

> ---
> drivers/soc/qcom/pmic_glink_altmode.c | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/soc/qcom/pmic_glink_altmode.c b/drivers/soc/qcom/pmic_glink_altmode.c
> index 5fcd0fdd2faa..b3808fc24c69 100644
> --- a/drivers/soc/qcom/pmic_glink_altmode.c
> +++ b/drivers/soc/qcom/pmic_glink_altmode.c
> @@ -76,7 +76,7 @@ struct pmic_glink_altmode_port {
>
> struct work_struct work;
>
> - struct device *bridge;
> + struct auxiliary_device *bridge;
>
> enum typec_orientation orientation;
> u16 svid;
> @@ -230,7 +230,7 @@ static void pmic_glink_altmode_worker(struct work_struct *work)
> else
> pmic_glink_altmode_enable_usb(altmode, alt_port);
>
> - drm_aux_hpd_bridge_notify(alt_port->bridge,
> + drm_aux_hpd_bridge_notify(&alt_port->bridge->dev,
> alt_port->hpd_state ?
> connector_status_connected :
> connector_status_disconnected);
> @@ -454,7 +454,7 @@ static int pmic_glink_altmode_probe(struct auxiliary_device *adev,
> alt_port->index = port;
> INIT_WORK(&alt_port->work, pmic_glink_altmode_worker);
>
> - alt_port->bridge = drm_dp_hpd_bridge_register(dev, to_of_node(fwnode));
> + alt_port->bridge = devm_drm_dp_hpd_bridge_alloc(dev, to_of_node(fwnode));
> if (IS_ERR(alt_port->bridge)) {
> fwnode_handle_put(fwnode);
> return PTR_ERR(alt_port->bridge);
> @@ -510,6 +510,16 @@ static int pmic_glink_altmode_probe(struct auxiliary_device *adev,
> }
> }
>
> + for (port = 0; port < ARRAY_SIZE(altmode->ports); port++) {
> + alt_port = &altmode->ports[port];
> + if (!alt_port->bridge)
> + continue;
> +
> + ret = devm_drm_dp_hpd_bridge_add(dev, alt_port->bridge);
> + if (ret)
> + return ret;
> + }
> +
> altmode->client = devm_pmic_glink_register_client(dev,
> altmode->owner_id,
> pmic_glink_altmode_callback,
> --
> 2.43.0
>

2024-02-22 02:21:44

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH 5/6] phy: qcom-qmp-combo: fix drm bridge registration

On Sat, Feb 17, 2024 at 04:02:27PM +0100, Johan Hovold wrote:
> Due to a long-standing issue in driver core, drivers may not probe defer
> after having registered child devices to avoid triggering a probe
> deferral loop (see fbc35b45f9f6 ("Add documentation on meaning of
> -EPROBE_DEFER")).
>
> This could potentially also trigger a bug in the DRM bridge
> implementation which does not expect bridges to go away even if device
> links may avoid triggering this (when enabled).
>
> Move registration of the DRM aux bridge to after looking up clocks and
> other resources.
>
> Note that PHY creation can in theory also trigger a probe deferral when
> a 'phy' supply is used. This does not seem to affect the QMP PHY driver
> but the PHY subsystem should be reworked to address this (i.e. by
> separating initialisation and registration of the PHY).
>
> Fixes: 35921910bbd0 ("phy: qcom: qmp-combo: switch to DRM_AUX_BRIDGE")
> Fixes: 1904c3f578dc ("phy: qcom-qmp-combo: Introduce drm_bridge")
> Cc: [email protected] # 6.5
> Cc: Bjorn Andersson <[email protected]>
> Cc: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>

Reviewed-by: Bjorn Andersson <[email protected]>

Regards,
Bjorn

> ---
> drivers/phy/qualcomm/phy-qcom-qmp-combo.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
> index 1ad10110dd25..e19d6a084f10 100644
> --- a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
> +++ b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
> @@ -3566,10 +3566,6 @@ static int qmp_combo_probe(struct platform_device *pdev)
> if (ret)
> return ret;
>
> - ret = drm_aux_bridge_register(dev);
> - if (ret)
> - return ret;
> -
> /* Check for legacy binding with child nodes. */
> usb_np = of_get_child_by_name(dev->of_node, "usb3-phy");
> if (usb_np) {
> @@ -3589,6 +3585,10 @@ static int qmp_combo_probe(struct platform_device *pdev)
> if (ret)
> goto err_node_put;
>
> + ret = drm_aux_bridge_register(dev);
> + if (ret)
> + goto err_node_put;
> +
> pm_runtime_set_active(dev);
> ret = devm_pm_runtime_enable(dev);
> if (ret)
> --
> 2.43.0
>

2024-02-22 02:23:16

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH 6/6] phy: qcom-qmp-combo: fix type-c switch registration

On Sat, Feb 17, 2024 at 04:02:28PM +0100, Johan Hovold wrote:
> Due to a long-standing issue in driver core, drivers may not probe defer
> after having registered child devices to avoid triggering a probe
> deferral loop (see fbc35b45f9f6 ("Add documentation on meaning of
> -EPROBE_DEFER")).
>
> Move registration of the typec switch to after looking up clocks and
> other resources.
>
> Note that PHY creation can in theory also trigger a probe deferral when
> a 'phy' supply is used. This does not seem to affect the QMP PHY driver
> but the PHY subsystem should be reworked to address this (i.e. by
> separating initialisation and registration of the PHY).
>
> Fixes: 2851117f8f42 ("phy: qcom-qmp-combo: Introduce orientation switching")
> Cc: [email protected] # 6.5
> Cc: Bjorn Andersson <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>

Reviewed-by: Bjorn Andersson <[email protected]>

Regards,
Bjorn

> ---
> drivers/phy/qualcomm/phy-qcom-qmp-combo.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
> index e19d6a084f10..17c4ad7553a5 100644
> --- a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
> +++ b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
> @@ -3562,10 +3562,6 @@ static int qmp_combo_probe(struct platform_device *pdev)
> if (ret)
> return ret;
>
> - ret = qmp_combo_typec_switch_register(qmp);
> - if (ret)
> - return ret;
> -
> /* Check for legacy binding with child nodes. */
> usb_np = of_get_child_by_name(dev->of_node, "usb3-phy");
> if (usb_np) {
> @@ -3585,6 +3581,10 @@ static int qmp_combo_probe(struct platform_device *pdev)
> if (ret)
> goto err_node_put;
>
> + ret = qmp_combo_typec_switch_register(qmp);
> + if (ret)
> + goto err_node_put;
> +
> ret = drm_aux_bridge_register(dev);
> if (ret)
> goto err_node_put;
> --
> 2.43.0
>

2024-02-22 16:07:33

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 2/6] drm/bridge: aux-hpd: separate allocation and registration

On Wed, Feb 21, 2024 at 08:06:41PM -0600, Bjorn Andersson wrote:
> On Sat, Feb 17, 2024 at 04:02:24PM +0100, Johan Hovold wrote:
> > diff --git a/drivers/gpu/drm/bridge/aux-hpd-bridge.c b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> [..]
> > +/**
> > + * devm_drm_dp_hpd_bridge_add - register a HDP DisplayPort bridge
>
> kernel-doc wants () after function names.

I don't think that's required for the symbol name here even if some
subsystems (drivers) use it.

> > + * @dev: struct device to tie registration lifetime to
> > + * @adev: bridge auxiliary device to be registered
> > + *
> > + * Returns: zero on success or a negative errno
>
> and "Return:" without the 's'.

This was a mistake however. Perhaps whoever applies this can drop it, or
I can send a v2.

Side note: Looks like there are more instances with an 's' than without
under driver/gpu/drm...

> This could however be done in a separate patch, as the file is already
> wrong in this regard.
>
> Reviewed-by: Bjorn Andersson <[email protected]>

Thanks for reviewing.

Johan

2024-02-22 20:57:58

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH 2/6] drm/bridge: aux-hpd: separate allocation and registration

On Sat, 17 Feb 2024 at 17:03, Johan Hovold <[email protected]> wrote:
>
> Combining allocation and registration is an anti-pattern that should be
> avoided. Add two new functions for allocating and registering an dp-hpd
> bridge with a proper 'devm' prefix so that it is clear that these are
> device managed interfaces.
>
> devm_drm_dp_hpd_bridge_alloc()
> devm_drm_dp_hpd_bridge_add()
>
> The new interface will be used to fix a use-after-free bug in the
> Qualcomm PMIC GLINK driver and may prevent similar issues from being
> introduced elsewhere.
>
> The existing drm_dp_hpd_bridge_register() is reimplemented using the
> above and left in place for now.
>
> Signed-off-by: Johan Hovold <[email protected]>

Reviewed-by: Dmitry Baryshkov <[email protected]>

Minor nit below.

> ---
> drivers/gpu/drm/bridge/aux-hpd-bridge.c | 67 +++++++++++++++++++------
> include/drm/bridge/aux-bridge.h | 15 ++++++
> 2 files changed, 67 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/bridge/aux-hpd-bridge.c b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> index 9e71daf95bde..6886db2d9e00 100644
> --- a/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> +++ b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> @@ -30,16 +30,13 @@ static void drm_aux_hpd_bridge_release(struct device *dev)
> kfree(adev);
> }
>
> -static void drm_aux_hpd_bridge_unregister_adev(void *_adev)
> +static void drm_aux_hpd_bridge_free_adev(void *_adev)
> {
> - struct auxiliary_device *adev = _adev;
> -
> - auxiliary_device_delete(adev);
> - auxiliary_device_uninit(adev);
> + auxiliary_device_uninit(_adev);
> }
>
> /**
> - * drm_dp_hpd_bridge_register - Create a simple HPD DisplayPort bridge
> + * devm_drm_dp_hpd_bridge_alloc - allocate a HPD DisplayPort bridge
> * @parent: device instance providing this bridge
> * @np: device node pointer corresponding to this bridge instance
> *
> @@ -47,11 +44,9 @@ static void drm_aux_hpd_bridge_unregister_adev(void *_adev)
> * DRM_MODE_CONNECTOR_DisplayPort, which terminates the bridge chain and is
> * able to send the HPD events.
> *
> - * Return: device instance that will handle created bridge or an error code
> - * encoded into the pointer.
> + * Return: bridge auxiliary device pointer or an error pointer
> */
> -struct device *drm_dp_hpd_bridge_register(struct device *parent,
> - struct device_node *np)
> +struct auxiliary_device *devm_drm_dp_hpd_bridge_alloc(struct device *parent, struct device_node *np)
> {
> struct auxiliary_device *adev;
> int ret;
> @@ -82,13 +77,55 @@ struct device *drm_dp_hpd_bridge_register(struct device *parent,
> return ERR_PTR(ret);
> }
>
> - ret = auxiliary_device_add(adev);
> - if (ret) {
> - auxiliary_device_uninit(adev);
> + ret = devm_add_action_or_reset(parent, drm_aux_hpd_bridge_free_adev, adev);
> + if (ret)
> return ERR_PTR(ret);
> - }
>
> - ret = devm_add_action_or_reset(parent, drm_aux_hpd_bridge_unregister_adev, adev);
> + return adev;
> +}
> +EXPORT_SYMBOL_GPL(devm_drm_dp_hpd_bridge_alloc);
> +
> +static void drm_aux_hpd_bridge_del_adev(void *_adev)
> +{
> + auxiliary_device_delete(_adev);
> +}
> +
> +/**
> + * devm_drm_dp_hpd_bridge_add - register a HDP DisplayPort bridge
> + * @dev: struct device to tie registration lifetime to
> + * @adev: bridge auxiliary device to be registered
> + *
> + * Returns: zero on success or a negative errno
> + */
> +int devm_drm_dp_hpd_bridge_add(struct device *dev, struct auxiliary_device *adev)
> +{
> + int ret;
> +
> + ret = auxiliary_device_add(adev);
> + if (ret)
> + return ret;
> +
> + return devm_add_action_or_reset(dev, drm_aux_hpd_bridge_del_adev, adev);
> +}
> +EXPORT_SYMBOL_GPL(devm_drm_dp_hpd_bridge_add);
> +
> +/**
> + * drm_dp_hpd_bridge_register - allocate and register a HDP DisplayPort bridge
> + * @parent: device instance providing this bridge
> + * @np: device node pointer corresponding to this bridge instance
> + *
> + * Return: device instance that will handle created bridge or an error pointer
> + */
> +struct device *drm_dp_hpd_bridge_register(struct device *parent, struct device_node *np)
> +{
> + struct auxiliary_device *adev;
> + int ret;
> +
> + adev = devm_drm_dp_hpd_bridge_alloc(parent, np);
> + if (IS_ERR(adev))
> + return ERR_CAST(adev);
> +
> + ret = devm_drm_dp_hpd_bridge_add(parent, adev);
> if (ret)
> return ERR_PTR(ret);
>
> diff --git a/include/drm/bridge/aux-bridge.h b/include/drm/bridge/aux-bridge.h
> index c4c423e97f06..4453906105ca 100644
> --- a/include/drm/bridge/aux-bridge.h
> +++ b/include/drm/bridge/aux-bridge.h
> @@ -9,6 +9,8 @@
>
> #include <drm/drm_connector.h>
>
> +struct auxiliary_device;
> +
> #if IS_ENABLED(CONFIG_DRM_AUX_BRIDGE)
> int drm_aux_bridge_register(struct device *parent);
> #else
> @@ -19,10 +21,23 @@ static inline int drm_aux_bridge_register(struct device *parent)
> #endif
>
> #if IS_ENABLED(CONFIG_DRM_AUX_HPD_BRIDGE)
> +struct auxiliary_device *devm_drm_dp_hpd_bridge_alloc(struct device *parent, struct device_node *np);
> +int devm_drm_dp_hpd_bridge_add(struct device *dev, struct auxiliary_device *adev);

I had a pretty close idea during prototyping, but I ended up doing it
as a single function for the following reasons:

First, this exports the implementation detail that internally the code
uses an aux device.
Also, by exporting the aux device the code becomes less type-safe. By
mistake one can call devm_drm_dp_hpd_bridge_add() on any aux device,
which is not necessarily the HPD bridge.
I'd prefer to see an opaque device-specific structure instead. WDYT?

> struct device *drm_dp_hpd_bridge_register(struct device *parent,
> struct device_node *np);
> void drm_aux_hpd_bridge_notify(struct device *dev, enum drm_connector_status status);
> #else
> +static inline struct auxiliary_device *devm_drm_dp_hpd_bridge_alloc(struct device *parent,
> + struct device_node *np)
> +{
> + return NULL;
> +}
> +
> +static inline int devm_drm_dp_hpd_bridge_add(struct auxiliary_device *adev)
> +{
> + return 0;
> +}
> +
> static inline struct device *drm_dp_hpd_bridge_register(struct device *parent,
> struct device_node *np)
> {
> --
> 2.43.0
>


--
With best wishes
Dmitry

2024-02-22 21:00:47

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH 1/6] drm/bridge: aux-hpd: fix OF node leaks

On Sat, 17 Feb 2024 at 17:03, Johan Hovold <[email protected]> wrote:
>
> The two device node references taken during allocation need to be
> dropped when the auxiliary device is freed.
>
> Fixes: 6914968a0b52 ("drm/bridge: properly refcount DT nodes in aux bridge drivers")
> Cc: Dmitry Baryshkov <[email protected]>
> Cc: Neil Armstrong <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>

Reviewed-by: Dmitry Baryshkov <[email protected]>

> ---
> drivers/gpu/drm/bridge/aux-hpd-bridge.c | 3 +++
> 1 file changed, 3 insertions(+)


--
With best wishes
Dmitry

2024-02-22 21:12:26

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH 5/6] phy: qcom-qmp-combo: fix drm bridge registration

On Sat, 17 Feb 2024 at 17:03, Johan Hovold <[email protected]> wrote:
>
> Due to a long-standing issue in driver core, drivers may not probe defer
> after having registered child devices to avoid triggering a probe
> deferral loop (see fbc35b45f9f6 ("Add documentation on meaning of
> -EPROBE_DEFER")).
>
> This could potentially also trigger a bug in the DRM bridge
> implementation which does not expect bridges to go away even if device
> links may avoid triggering this (when enabled).
>
> Move registration of the DRM aux bridge to after looking up clocks and
> other resources.
>
> Note that PHY creation can in theory also trigger a probe deferral when
> a 'phy' supply is used. This does not seem to affect the QMP PHY driver
> but the PHY subsystem should be reworked to address this (i.e. by
> separating initialisation and registration of the PHY).
>
> Fixes: 35921910bbd0 ("phy: qcom: qmp-combo: switch to DRM_AUX_BRIDGE")
> Fixes: 1904c3f578dc ("phy: qcom-qmp-combo: Introduce drm_bridge")
> Cc: [email protected] # 6.5
> Cc: Bjorn Andersson <[email protected]>
> Cc: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
> ---
> drivers/phy/qualcomm/phy-qcom-qmp-combo.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
Reviewed-by: Dmitry Baryshkov <[email protected]>


--
With best wishes
Dmitry

2024-02-22 21:12:47

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH 4/6] soc: qcom: pmic_glink: Fix boot when QRTR=m

On Sat, 17 Feb 2024 at 17:03, Johan Hovold <[email protected]> wrote:
>
> From: Rob Clark <[email protected]>
>
> We need to bail out before adding/removing devices if we are going to
> -EPROBE_DEFER. Otherwise boot can get stuck in a probe deferral loop due
> to a long-standing issue in driver core (see fbc35b45f9f6 ("Add
> documentation on meaning of -EPROBE_DEFER")).
>
> Deregistering the altmode child device can potentially also trigger bugs
> in the DRM bridge implementation, which does not expect bridges to go
> away.
>
> Suggested-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Rob Clark <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> [ johan: rebase on 6.8-rc4, amend commit message and mention DRM ]
> Fixes: 58ef4ece1e41 ("soc: qcom: pmic_glink: Introduce base PMIC GLINK driver")
> Cc: [email protected] # 6.3
> Cc: Bjorn Andersson <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
> ---
> drivers/soc/qcom/pmic_glink.c | 21 +++++++++++----------
> 1 file changed, 11 insertions(+), 10 deletions(-)
>

Reviewed-by: Dmitry Baryshkov <[email protected]>

--
With best wishes
Dmitry

2024-02-22 21:13:00

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH 6/6] phy: qcom-qmp-combo: fix type-c switch registration

On Sat, 17 Feb 2024 at 17:03, Johan Hovold <[email protected]> wrote:
>
> Due to a long-standing issue in driver core, drivers may not probe defer
> after having registered child devices to avoid triggering a probe
> deferral loop (see fbc35b45f9f6 ("Add documentation on meaning of
> -EPROBE_DEFER")).
>
> Move registration of the typec switch to after looking up clocks and
> other resources.
>
> Note that PHY creation can in theory also trigger a probe deferral when
> a 'phy' supply is used. This does not seem to affect the QMP PHY driver
> but the PHY subsystem should be reworked to address this (i.e. by
> separating initialisation and registration of the PHY).
>
> Fixes: 2851117f8f42 ("phy: qcom-qmp-combo: Introduce orientation switching")
> Cc: [email protected] # 6.5
> Cc: Bjorn Andersson <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
> ---
> drivers/phy/qualcomm/phy-qcom-qmp-combo.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Dmitry Baryshkov <[email protected]>

Note to myself (or to anybody else, who has spare hands), we should
probably implement the same changes for phy-qcom-qmp-usbc.c

>
> diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
> index e19d6a084f10..17c4ad7553a5 100644
> --- a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
> +++ b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
> @@ -3562,10 +3562,6 @@ static int qmp_combo_probe(struct platform_device *pdev)
> if (ret)
> return ret;
>
> - ret = qmp_combo_typec_switch_register(qmp);
> - if (ret)
> - return ret;
> -
> /* Check for legacy binding with child nodes. */
> usb_np = of_get_child_by_name(dev->of_node, "usb3-phy");
> if (usb_np) {
> @@ -3585,6 +3581,10 @@ static int qmp_combo_probe(struct platform_device *pdev)
> if (ret)
> goto err_node_put;
>
> + ret = qmp_combo_typec_switch_register(qmp);
> + if (ret)
> + goto err_node_put;
> +
> ret = drm_aux_bridge_register(dev);
> if (ret)
> goto err_node_put;
> --
> 2.43.0
>


--
With best wishes
Dmitry

2024-02-22 21:29:15

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH 3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On Sat, 17 Feb 2024 at 17:03, Johan Hovold <[email protected]> wrote:
>
> A recent DRM series purporting to simplify support for "transparent
> bridges" and handling of probe deferrals ironically exposed a
> use-after-free issue on pmic_glink_altmode probe deferral.
>
> This has manifested itself as the display subsystem occasionally failing
> to initialise and NULL-pointer dereferences during boot of machines like
> the Lenovo ThinkPad X13s.
>
> Specifically, the dp-hpd bridge is currently registered before all
> resources have been acquired which means that it can also be
> deregistered on probe deferrals.
>
> In the meantime there is a race window where the new aux bridge driver
> (or PHY driver previously) may have looked up the dp-hpd bridge and
> stored a (non-reference-counted) pointer to the bridge which is about to
> be deallocated.
>
> When the display controller is later initialised, this triggers a
> use-after-free when attaching the bridges:
>
> dp -> aux -> dp-hpd (freed)
>
> which may, for example, result in the freed bridge failing to attach:
>
> [drm:drm_bridge_attach [drm]] *ERROR* failed to attach bridge /soc@0/phy@88eb000 to encoder TMDS-31: -16
>
> or a NULL-pointer dereference:
>
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> ...
> Call trace:
> drm_bridge_attach+0x70/0x1a8 [drm]
> drm_aux_bridge_attach+0x24/0x38 [aux_bridge]
> drm_bridge_attach+0x80/0x1a8 [drm]
> dp_bridge_init+0xa8/0x15c [msm]
> msm_dp_modeset_init+0x28/0xc4 [msm]
>
> The DRM bridge implementation is clearly fragile and implicitly built on
> the assumption that bridges may never go away. In this case, the fix is
> to move the bridge registration in the pmic_glink_altmode driver to
> after all resources have been looked up.
>
> Incidentally, with the new dp-hpd bridge implementation, which registers
> child devices, this is also a requirement due to a long-standing issue
> in driver core that can otherwise lead to a probe deferral loop (see
> fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER")).
>
> Fixes: 080b4e24852b ("soc: qcom: pmic_glink: Introduce altmode support")
> Fixes: 2bcca96abfbf ("soc: qcom: pmic-glink: switch to DRM_AUX_HPD_BRIDGE")
> Cc: [email protected] # 6.3
> Cc: Bjorn Andersson <[email protected]>
> Cc: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
> ---
> drivers/soc/qcom/pmic_glink_altmode.c | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>

Reviewed-by: Dmitry Baryshkov <[email protected]>


--
With best wishes
Dmitry

2024-02-23 10:57:01

by Neil Armstrong

[permalink] [raw]
Subject: Re: [PATCH 1/6] drm/bridge: aux-hpd: fix OF node leaks

On 17/02/2024 16:02, Johan Hovold wrote:
> The two device node references taken during allocation need to be
> dropped when the auxiliary device is freed.
>
> Fixes: 6914968a0b52 ("drm/bridge: properly refcount DT nodes in aux bridge drivers")
> Cc: Dmitry Baryshkov <[email protected]>
> Cc: Neil Armstrong <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
> ---
> drivers/gpu/drm/bridge/aux-hpd-bridge.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/bridge/aux-hpd-bridge.c b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> index bb55f697a181..9e71daf95bde 100644
> --- a/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> +++ b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> @@ -25,6 +25,7 @@ static void drm_aux_hpd_bridge_release(struct device *dev)
> ida_free(&drm_aux_hpd_bridge_ida, adev->id);
>
> of_node_put(adev->dev.platform_data);
> + of_node_put(adev->dev.of_node);
>
> kfree(adev);
> }
> @@ -74,6 +75,8 @@ struct device *drm_dp_hpd_bridge_register(struct device *parent,
>
> ret = auxiliary_device_init(adev);
> if (ret) {
> + of_node_put(adev->dev.platform_data);
> + of_node_put(adev->dev.of_node);
> ida_free(&drm_aux_hpd_bridge_ida, adev->id);
> kfree(adev);
> return ERR_PTR(ret);

Reviewed-by: Neil Armstrong <[email protected]>

2024-02-23 10:57:29

by Neil Armstrong

[permalink] [raw]
Subject: Re: [PATCH 1/6] drm/bridge: aux-hpd: fix OF node leaks

On 17/02/2024 16:02, Johan Hovold wrote:
> The two device node references taken during allocation need to be
> dropped when the auxiliary device is freed.
>
> Fixes: 6914968a0b52 ("drm/bridge: properly refcount DT nodes in aux bridge drivers")
> Cc: Dmitry Baryshkov <[email protected]>
> Cc: Neil Armstrong <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
> ---
> drivers/gpu/drm/bridge/aux-hpd-bridge.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/bridge/aux-hpd-bridge.c b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> index bb55f697a181..9e71daf95bde 100644
> --- a/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> +++ b/drivers/gpu/drm/bridge/aux-hpd-bridge.c
> @@ -25,6 +25,7 @@ static void drm_aux_hpd_bridge_release(struct device *dev)
> ida_free(&drm_aux_hpd_bridge_ida, adev->id);
>
> of_node_put(adev->dev.platform_data);
> + of_node_put(adev->dev.of_node);
>
> kfree(adev);
> }
> @@ -74,6 +75,8 @@ struct device *drm_dp_hpd_bridge_register(struct device *parent,
>
> ret = auxiliary_device_init(adev);
> if (ret) {
> + of_node_put(adev->dev.platform_data);
> + of_node_put(adev->dev.of_node);
> ida_free(&drm_aux_hpd_bridge_ida, adev->id);
> kfree(adev);
> return ERR_PTR(ret);

Reviewed-by: Neil Armstrong <[email protected]>

2024-02-23 11:29:31

by Neil Armstrong

[permalink] [raw]
Subject: Re: [PATCH 4/6] soc: qcom: pmic_glink: Fix boot when QRTR=m

On 17/02/2024 16:02, Johan Hovold wrote:
> From: Rob Clark <[email protected]>
>
> We need to bail out before adding/removing devices if we are going to
> -EPROBE_DEFER. Otherwise boot can get stuck in a probe deferral loop due
> to a long-standing issue in driver core (see fbc35b45f9f6 ("Add
> documentation on meaning of -EPROBE_DEFER")).
>
> Deregistering the altmode child device can potentially also trigger bugs
> in the DRM bridge implementation, which does not expect bridges to go
> away.
>
> Suggested-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Rob Clark <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> [ johan: rebase on 6.8-rc4, amend commit message and mention DRM ]
> Fixes: 58ef4ece1e41 ("soc: qcom: pmic_glink: Introduce base PMIC GLINK driver")
> Cc: [email protected] # 6.3
> Cc: Bjorn Andersson <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
> ---
> drivers/soc/qcom/pmic_glink.c | 21 +++++++++++----------
> 1 file changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/soc/qcom/pmic_glink.c b/drivers/soc/qcom/pmic_glink.c
> index f4bfd24386f1..f913e9bd57ed 100644
> --- a/drivers/soc/qcom/pmic_glink.c
> +++ b/drivers/soc/qcom/pmic_glink.c
> @@ -265,10 +265,17 @@ static int pmic_glink_probe(struct platform_device *pdev)
>
> pg->client_mask = *match_data;
>
> + pg->pdr = pdr_handle_alloc(pmic_glink_pdr_callback, pg);
> + if (IS_ERR(pg->pdr)) {
> + ret = dev_err_probe(&pdev->dev, PTR_ERR(pg->pdr),
> + "failed to initialize pdr\n");
> + return ret;
> + }
> +
> if (pg->client_mask & BIT(PMIC_GLINK_CLIENT_UCSI)) {
> ret = pmic_glink_add_aux_device(pg, &pg->ucsi_aux, "ucsi");
> if (ret)
> - return ret;
> + goto out_release_pdr_handle;
> }
> if (pg->client_mask & BIT(PMIC_GLINK_CLIENT_ALTMODE)) {
> ret = pmic_glink_add_aux_device(pg, &pg->altmode_aux, "altmode");
> @@ -281,17 +288,11 @@ static int pmic_glink_probe(struct platform_device *pdev)
> goto out_release_altmode_aux;
> }
>
> - pg->pdr = pdr_handle_alloc(pmic_glink_pdr_callback, pg);
> - if (IS_ERR(pg->pdr)) {
> - ret = dev_err_probe(&pdev->dev, PTR_ERR(pg->pdr), "failed to initialize pdr\n");
> - goto out_release_aux_devices;
> - }
> -
> service = pdr_add_lookup(pg->pdr, "tms/servreg", "msm/adsp/charger_pd");
> if (IS_ERR(service)) {
> ret = dev_err_probe(&pdev->dev, PTR_ERR(service),
> "failed adding pdr lookup for charger_pd\n");
> - goto out_release_pdr_handle;
> + goto out_release_aux_devices;
> }
>
> mutex_lock(&__pmic_glink_lock);
> @@ -300,8 +301,6 @@ static int pmic_glink_probe(struct platform_device *pdev)
>
> return 0;
>
> -out_release_pdr_handle:
> - pdr_handle_release(pg->pdr);
> out_release_aux_devices:
> if (pg->client_mask & BIT(PMIC_GLINK_CLIENT_BATT))
> pmic_glink_del_aux_device(pg, &pg->ps_aux);
> @@ -311,6 +310,8 @@ static int pmic_glink_probe(struct platform_device *pdev)
> out_release_ucsi_aux:
> if (pg->client_mask & BIT(PMIC_GLINK_CLIENT_UCSI))
> pmic_glink_del_aux_device(pg, &pg->ucsi_aux);
> +out_release_pdr_handle:
> + pdr_handle_release(pg->pdr);
>
> return ret;
> }

Reviewed-by: Neil Armstrong <[email protected]>

2024-02-23 11:30:34

by Neil Armstrong

[permalink] [raw]
Subject: Re: [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On 23/02/2024 12:02, Neil Armstrong wrote:
> Hi,
>
> On Sat, 17 Feb 2024 16:02:22 +0100, Johan Hovold wrote:
>> Starting with 6.8-rc1 the internal display sometimes fails to come up on
>> machines like the Lenovo ThinkPad X13s and the logs indicate that this
>> is due to a regression in the DRM subsystem [1].
>>
>> This series fixes a race in the pmic_glink_altmode driver which was
>> exposed / triggered by the transparent DRM bridges rework that went into
>> 6.8-rc1 and that manifested itself as a bridge failing to attach and
>> sometimes triggering a NULL-pointer dereference.
>>
>> [...]
>
> Thanks, Applied to https://anongit.freedesktop.org/git/drm/drm-misc.git (drm-misc-fixes)
>
> [1/6] drm/bridge: aux-hpd: fix OF node leaks
> https://cgit.freedesktop.org/drm/drm-misc/commit/?id=9ee485bdda68d6d3f5728cbe3150eb9013d7d22b
> [2/6] drm/bridge: aux-hpd: separate allocation and registration
> (no commit info)
> [3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free
> (no commit info)
> [4/6] soc: qcom: pmic_glink: Fix boot when QRTR=m
> (no commit info)
> [5/6] phy: qcom-qmp-combo: fix drm bridge registration
> (no commit info)
> [6/6] phy: qcom-qmp-combo: fix type-c switch registration
> (no commit info)
>

To clarify, I only applied patch 1 to drm-misc-fixes

Thanks,
Neil

2024-02-23 11:34:58

by Neil Armstrong

[permalink] [raw]
Subject: Re: [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

Hi,

On Sat, 17 Feb 2024 16:02:22 +0100, Johan Hovold wrote:
> Starting with 6.8-rc1 the internal display sometimes fails to come up on
> machines like the Lenovo ThinkPad X13s and the logs indicate that this
> is due to a regression in the DRM subsystem [1].
>
> This series fixes a race in the pmic_glink_altmode driver which was
> exposed / triggered by the transparent DRM bridges rework that went into
> 6.8-rc1 and that manifested itself as a bridge failing to attach and
> sometimes triggering a NULL-pointer dereference.
>
> [...]

Thanks, Applied to https://anongit.freedesktop.org/git/drm/drm-misc.git (drm-misc-fixes)

[1/6] drm/bridge: aux-hpd: fix OF node leaks
https://cgit.freedesktop.org/drm/drm-misc/commit/?id=9ee485bdda68d6d3f5728cbe3150eb9013d7d22b
[2/6] drm/bridge: aux-hpd: separate allocation and registration
(no commit info)
[3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free
(no commit info)
[4/6] soc: qcom: pmic_glink: Fix boot when QRTR=m
(no commit info)
[5/6] phy: qcom-qmp-combo: fix drm bridge registration
(no commit info)
[6/6] phy: qcom-qmp-combo: fix type-c switch registration
(no commit info)

--
Neil


2024-02-23 12:10:27

by Vinod Koul

[permalink] [raw]
Subject: Re: [PATCH 5/6] phy: qcom-qmp-combo: fix drm bridge registration

On 17-02-24, 16:02, Johan Hovold wrote:
> Due to a long-standing issue in driver core, drivers may not probe defer
> after having registered child devices to avoid triggering a probe
> deferral loop (see fbc35b45f9f6 ("Add documentation on meaning of
> -EPROBE_DEFER")).
>
> This could potentially also trigger a bug in the DRM bridge
> implementation which does not expect bridges to go away even if device
> links may avoid triggering this (when enabled).
>
> Move registration of the DRM aux bridge to after looking up clocks and
> other resources.
>
> Note that PHY creation can in theory also trigger a probe deferral when
> a 'phy' supply is used. This does not seem to affect the QMP PHY driver
> but the PHY subsystem should be reworked to address this (i.e. by
> separating initialisation and registration of the PHY).

Acked-by: Vinod Koul <[email protected]>

--
~Vinod

2024-02-23 12:10:54

by Vinod Koul

[permalink] [raw]
Subject: Re: [PATCH 6/6] phy: qcom-qmp-combo: fix type-c switch registration

On 17-02-24, 16:02, Johan Hovold wrote:
> Due to a long-standing issue in driver core, drivers may not probe defer
> after having registered child devices to avoid triggering a probe
> deferral loop (see fbc35b45f9f6 ("Add documentation on meaning of
> -EPROBE_DEFER")).
>
> Move registration of the typec switch to after looking up clocks and
> other resources.
>
> Note that PHY creation can in theory also trigger a probe deferral when
> a 'phy' supply is used. This does not seem to affect the QMP PHY driver
> but the PHY subsystem should be reworked to address this (i.e. by
> separating initialisation and registration of the PHY).

Acked-by: Vinod Koul <[email protected]>

--
~Vinod

2024-02-23 12:46:50

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 2/6] drm/bridge: aux-hpd: separate allocation and registration

On Thu, Feb 22, 2024 at 10:57:07PM +0200, Dmitry Baryshkov wrote:
> On Sat, 17 Feb 2024 at 17:03, Johan Hovold <[email protected]> wrote:
> >
> > Combining allocation and registration is an anti-pattern that should be
> > avoided. Add two new functions for allocating and registering an dp-hpd
> > bridge with a proper 'devm' prefix so that it is clear that these are
> > device managed interfaces.
> >
> > devm_drm_dp_hpd_bridge_alloc()
> > devm_drm_dp_hpd_bridge_add()
> >
> > The new interface will be used to fix a use-after-free bug in the
> > Qualcomm PMIC GLINK driver and may prevent similar issues from being
> > introduced elsewhere.
> >
> > The existing drm_dp_hpd_bridge_register() is reimplemented using the
> > above and left in place for now.
> >
> > Signed-off-by: Johan Hovold <[email protected]>
>
> Reviewed-by: Dmitry Baryshkov <[email protected]>

Thanks for reviewing.

> Minor nit below.

> > diff --git a/include/drm/bridge/aux-bridge.h b/include/drm/bridge/aux-bridge.h
> > index c4c423e97f06..4453906105ca 100644
> > --- a/include/drm/bridge/aux-bridge.h
> > +++ b/include/drm/bridge/aux-bridge.h
> > @@ -9,6 +9,8 @@
> >
> > #include <drm/drm_connector.h>
> >
> > +struct auxiliary_device;
> > +
> > #if IS_ENABLED(CONFIG_DRM_AUX_BRIDGE)
> > int drm_aux_bridge_register(struct device *parent);
> > #else
> > @@ -19,10 +21,23 @@ static inline int drm_aux_bridge_register(struct device *parent)
> > #endif
> >
> > #if IS_ENABLED(CONFIG_DRM_AUX_HPD_BRIDGE)
> > +struct auxiliary_device *devm_drm_dp_hpd_bridge_alloc(struct device *parent, struct device_node *np);
> > +int devm_drm_dp_hpd_bridge_add(struct device *dev, struct auxiliary_device *adev);
>
> I had a pretty close idea during prototyping, but I ended up doing it
> as a single function for the following reasons:
>
> First, this exports the implementation detail that internally the code
> uses an aux device.

That's not an issue. The opposite, with interfaces trying to do too much
and hide details from the developers so that they can no longer reason
about what is going on, is a real problem though.

> Also, by exporting the aux device the code becomes less type-safe. By
> mistake one can call devm_drm_dp_hpd_bridge_add() on any aux device,
> which is not necessarily the HPD bridge.

No. First, that is currently not even an issue either as the
registration interface is safe to use with any aux device.

Second, if you cared about about type-safety you wouldn't have used a
struct device pointer for drm_aux_hpd_bridge_notify() which you back
cast to an aux device.

> I'd prefer to see an opaque device-specific structure instead. WDYT?

That should have been there from the start. But I'm not interested in
cleaning up this mess beyond what is minimally required to fix the
regressions it caused.

This can be reworked for 6.9 or later.

> > struct device *drm_dp_hpd_bridge_register(struct device *parent,
> > struct device_node *np);
> > void drm_aux_hpd_bridge_notify(struct device *dev, enum drm_connector_status status);

Johan

2024-02-23 12:58:18

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On Fri, Feb 23, 2024 at 12:03:10PM +0100, Neil Armstrong wrote:
> On 23/02/2024 12:02, Neil Armstrong wrote:
> > Hi,
> >
> > On Sat, 17 Feb 2024 16:02:22 +0100, Johan Hovold wrote:
> >> Starting with 6.8-rc1 the internal display sometimes fails to come up on
> >> machines like the Lenovo ThinkPad X13s and the logs indicate that this
> >> is due to a regression in the DRM subsystem [1].
> >>
> >> This series fixes a race in the pmic_glink_altmode driver which was
> >> exposed / triggered by the transparent DRM bridges rework that went into
> >> 6.8-rc1 and that manifested itself as a bridge failing to attach and
> >> sometimes triggering a NULL-pointer dereference.
> >>
> >> [...]
> >
> > Thanks, Applied to https://anongit.freedesktop.org/git/drm/drm-misc.git (drm-misc-fixes)
> >
> > [1/6] drm/bridge: aux-hpd: fix OF node leaks
> > https://cgit.freedesktop.org/drm/drm-misc/commit/?id=9ee485bdda68d6d3f5728cbe3150eb9013d7d22b
> > [2/6] drm/bridge: aux-hpd: separate allocation and registration
> > (no commit info)
> > [3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free
> > (no commit info)
> > [4/6] soc: qcom: pmic_glink: Fix boot when QRTR=m
> > (no commit info)
> > [5/6] phy: qcom-qmp-combo: fix drm bridge registration
> > (no commit info)
> > [6/6] phy: qcom-qmp-combo: fix type-c switch registration
> > (no commit info)
> >
>
> To clarify, I only applied patch 1 to drm-misc-fixes

Ok, but can you please not do that? :)

These patches should go in through the same tree to avoid conflicts.

I discussed this with Bjorn and Dmitry the other day and the conclusion
was that it was easiest to take all of these through DRM.

With Vinod acking the PHY patches, I believe you have what you need to
merge the whole series now?

Johan

2024-02-23 13:53:04

by Neil Armstrong

[permalink] [raw]
Subject: Re: [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On 23/02/2024 13:51, Johan Hovold wrote:
> On Fri, Feb 23, 2024 at 12:03:10PM +0100, Neil Armstrong wrote:
>> On 23/02/2024 12:02, Neil Armstrong wrote:
>>> Hi,
>>>
>>> On Sat, 17 Feb 2024 16:02:22 +0100, Johan Hovold wrote:
>>>> Starting with 6.8-rc1 the internal display sometimes fails to come up on
>>>> machines like the Lenovo ThinkPad X13s and the logs indicate that this
>>>> is due to a regression in the DRM subsystem [1].
>>>>
>>>> This series fixes a race in the pmic_glink_altmode driver which was
>>>> exposed / triggered by the transparent DRM bridges rework that went into
>>>> 6.8-rc1 and that manifested itself as a bridge failing to attach and
>>>> sometimes triggering a NULL-pointer dereference.
>>>>
>>>> [...]
>>>
>>> Thanks, Applied to https://anongit.freedesktop.org/git/drm/drm-misc.git (drm-misc-fixes)
>>>
>>> [1/6] drm/bridge: aux-hpd: fix OF node leaks
>>> https://cgit.freedesktop.org/drm/drm-misc/commit/?id=9ee485bdda68d6d3f5728cbe3150eb9013d7d22b
>>> [2/6] drm/bridge: aux-hpd: separate allocation and registration
>>> (no commit info)
>>> [3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free
>>> (no commit info)
>>> [4/6] soc: qcom: pmic_glink: Fix boot when QRTR=m
>>> (no commit info)
>>> [5/6] phy: qcom-qmp-combo: fix drm bridge registration
>>> (no commit info)
>>> [6/6] phy: qcom-qmp-combo: fix type-c switch registration
>>> (no commit info)
>>>
>>
>> To clarify, I only applied patch 1 to drm-misc-fixes
>
> Ok, but can you please not do that? :)
>
> These patches should go in through the same tree to avoid conflicts.
>
> I discussed this with Bjorn and Dmitry the other day and the conclusion
> was that it was easiest to take all of these through DRM.

I only applied patch 1, which is a standalone fix and goes into a separate tree,
for the next patches it would be indeed simpler for them to go via drm-misc when
they are properly acked.

Neil

>
> With Vinod acking the PHY patches, I believe you have what you need to
> merge the whole series now?
>
> Johan


2024-02-23 14:20:11

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On Fri, 23 Feb 2024 at 15:52, Neil Armstrong <[email protected]> wrote:
>
> On 23/02/2024 13:51, Johan Hovold wrote:
> > On Fri, Feb 23, 2024 at 12:03:10PM +0100, Neil Armstrong wrote:
> >> On 23/02/2024 12:02, Neil Armstrong wrote:
> >>> Hi,
> >>>
> >>> On Sat, 17 Feb 2024 16:02:22 +0100, Johan Hovold wrote:
> >>>> Starting with 6.8-rc1 the internal display sometimes fails to come up on
> >>>> machines like the Lenovo ThinkPad X13s and the logs indicate that this
> >>>> is due to a regression in the DRM subsystem [1].
> >>>>
> >>>> This series fixes a race in the pmic_glink_altmode driver which was
> >>>> exposed / triggered by the transparent DRM bridges rework that went into
> >>>> 6.8-rc1 and that manifested itself as a bridge failing to attach and
> >>>> sometimes triggering a NULL-pointer dereference.
> >>>>
> >>>> [...]
> >>>
> >>> Thanks, Applied to https://anongit.freedesktop.org/git/drm/drm-misc.git (drm-misc-fixes)
> >>>
> >>> [1/6] drm/bridge: aux-hpd: fix OF node leaks
> >>> https://cgit.freedesktop.org/drm/drm-misc/commit/?id=9ee485bdda68d6d3f5728cbe3150eb9013d7d22b
> >>> [2/6] drm/bridge: aux-hpd: separate allocation and registration
> >>> (no commit info)
> >>> [3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free
> >>> (no commit info)
> >>> [4/6] soc: qcom: pmic_glink: Fix boot when QRTR=m
> >>> (no commit info)
> >>> [5/6] phy: qcom-qmp-combo: fix drm bridge registration
> >>> (no commit info)
> >>> [6/6] phy: qcom-qmp-combo: fix type-c switch registration
> >>> (no commit info)
> >>>
> >>
> >> To clarify, I only applied patch 1 to drm-misc-fixes
> >
> > Ok, but can you please not do that? :)
> >
> > These patches should go in through the same tree to avoid conflicts.
> >
> > I discussed this with Bjorn and Dmitry the other day and the conclusion
> > was that it was easiest to take all of these through DRM.
>
> I only applied patch 1, which is a standalone fix and goes into a separate tree,
> for the next patches it would be indeed simpler for them to go via drm-misc when
> they are properly acked.

I think PHY patches can go through a usual route (phy/next or
phy/fixes). For patches 3 and 4 I'd need an ack from Bjorn to merge
them through drm-misc-next or drm-misc-fixes.


--
With best wishes
Dmitry

2024-02-23 14:21:58

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On Fri, Feb 23, 2024 at 02:52:28PM +0100, Neil Armstrong wrote:
> On 23/02/2024 13:51, Johan Hovold wrote:
> > On Fri, Feb 23, 2024 at 12:03:10PM +0100, Neil Armstrong wrote:
> >> On 23/02/2024 12:02, Neil Armstrong wrote:

> >>> Thanks, Applied to https://anongit.freedesktop.org/git/drm/drm-misc.git (drm-misc-fixes)
> >>>
> >>> [1/6] drm/bridge: aux-hpd: fix OF node leaks
> >>> https://cgit.freedesktop.org/drm/drm-misc/commit/?id=9ee485bdda68d6d3f5728cbe3150eb9013d7d22b
> >>> [2/6] drm/bridge: aux-hpd: separate allocation and registration
> >>> (no commit info)
> >>> [3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free
> >>> (no commit info)
> >>> [4/6] soc: qcom: pmic_glink: Fix boot when QRTR=m
> >>> (no commit info)
> >>> [5/6] phy: qcom-qmp-combo: fix drm bridge registration
> >>> (no commit info)
> >>> [6/6] phy: qcom-qmp-combo: fix type-c switch registration
> >>> (no commit info)
> >>>
> >>
> >> To clarify, I only applied patch 1 to drm-misc-fixes
> >
> > Ok, but can you please not do that? :)
> >
> > These patches should go in through the same tree to avoid conflicts.
> >
> > I discussed this with Bjorn and Dmitry the other day and the conclusion
> > was that it was easiest to take all of these through DRM.
>
> I only applied patch 1, which is a standalone fix and goes into a separate tree,
> for the next patches it would be indeed simpler for them to go via drm-misc when
> they are properly acked.

But it is *not* standalone as I tried to explain above.

So you have to drop it again as the later patches depend on it and
cannot be merged (through a different tree) without it.

I thought you had all the acks you needed to take this through drm-misc,
but we can wait a bit more if necessary (and there's no rush to get the
first one in).

Johan

2024-02-23 14:28:21

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On Fri, Feb 23, 2024 at 04:18:08PM +0200, Dmitry Baryshkov wrote:
> On Fri, 23 Feb 2024 at 15:52, Neil Armstrong <[email protected]> wrote:
> > On 23/02/2024 13:51, Johan Hovold wrote:
> > > On Fri, Feb 23, 2024 at 12:03:10PM +0100, Neil Armstrong wrote:
> > >> On 23/02/2024 12:02, Neil Armstrong wrote:

> > >>> Thanks, Applied to https://anongit.freedesktop.org/git/drm/drm-misc.git (drm-misc-fixes)
> > >>>
> > >>> [1/6] drm/bridge: aux-hpd: fix OF node leaks
> > >>> https://cgit.freedesktop.org/drm/drm-misc/commit/?id=9ee485bdda68d6d3f5728cbe3150eb9013d7d22b
> > >>> [2/6] drm/bridge: aux-hpd: separate allocation and registration
> > >>> (no commit info)
> > >>> [3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free
> > >>> (no commit info)
> > >>> [4/6] soc: qcom: pmic_glink: Fix boot when QRTR=m
> > >>> (no commit info)
> > >>> [5/6] phy: qcom-qmp-combo: fix drm bridge registration
> > >>> (no commit info)
> > >>> [6/6] phy: qcom-qmp-combo: fix type-c switch registration
> > >>> (no commit info)
> > >>>
> > >>
> > >> To clarify, I only applied patch 1 to drm-misc-fixes
> > >
> > > Ok, but can you please not do that? :)
> > >
> > > These patches should go in through the same tree to avoid conflicts.
> > >
> > > I discussed this with Bjorn and Dmitry the other day and the conclusion
> > > was that it was easiest to take all of these through DRM.
> >
> > I only applied patch 1, which is a standalone fix and goes into a separate tree,
> > for the next patches it would be indeed simpler for them to go via drm-misc when
> > they are properly acked.
>
> I think PHY patches can go through a usual route (phy/next or
> phy/fixes).

They can, but I've explicitly asked Vinod to ack them so that they can
go in with the rest of the series through DRM.

They also fix a regression that came in through DRM in 6.8-rc1 (the
bridge rework which started registering child devices) so it makes sense
to also route the fix the same way. And to do it for this cycle.

> For patches 3 and 4 I'd need an ack from Bjorn to merge
> them through drm-misc-next or drm-misc-fixes.

You have Bjorn's ack. He's reviewed all the patches for this purpose and
we discussed this in person two days ago.

And, again, this has to go in for *this* cycle. You broke the display on
the X13s and other machines so this cannot wait for 6.9.

Johan

2024-02-23 14:42:43

by Neil Armstrong

[permalink] [raw]
Subject: Re: [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On 23/02/2024 15:21, Johan Hovold wrote:
> On Fri, Feb 23, 2024 at 02:52:28PM +0100, Neil Armstrong wrote:
>> On 23/02/2024 13:51, Johan Hovold wrote:
>>> On Fri, Feb 23, 2024 at 12:03:10PM +0100, Neil Armstrong wrote:
>>>> On 23/02/2024 12:02, Neil Armstrong wrote:
>
>>>>> Thanks, Applied to https://anongit.freedesktop.org/git/drm/drm-misc.git (drm-misc-fixes)
>>>>>
>>>>> [1/6] drm/bridge: aux-hpd: fix OF node leaks
>>>>> https://cgit.freedesktop.org/drm/drm-misc/commit/?id=9ee485bdda68d6d3f5728cbe3150eb9013d7d22b
>>>>> [2/6] drm/bridge: aux-hpd: separate allocation and registration
>>>>> (no commit info)
>>>>> [3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free
>>>>> (no commit info)
>>>>> [4/6] soc: qcom: pmic_glink: Fix boot when QRTR=m
>>>>> (no commit info)
>>>>> [5/6] phy: qcom-qmp-combo: fix drm bridge registration
>>>>> (no commit info)
>>>>> [6/6] phy: qcom-qmp-combo: fix type-c switch registration
>>>>> (no commit info)
>>>>>
>>>>
>>>> To clarify, I only applied patch 1 to drm-misc-fixes
>>>
>>> Ok, but can you please not do that? :)
>>>
>>> These patches should go in through the same tree to avoid conflicts.
>>>
>>> I discussed this with Bjorn and Dmitry the other day and the conclusion
>>> was that it was easiest to take all of these through DRM.
>>
>> I only applied patch 1, which is a standalone fix and goes into a separate tree,
>> for the next patches it would be indeed simpler for them to go via drm-misc when
>> they are properly acked.
>
> But it is *not* standalone as I tried to explain above.
>
> So you have to drop it again as the later patches depend on it and
> cannot be merged (through a different tree) without it.

drm-misc branches cannot be rebased, it must be reverted, but it can still be applied
on drm-misc-next and I'll send a revert patch for drm-misc-fixes if needed, not a big deal.

> I thought you had all the acks you needed to take this through drm-misc,
> but we can wait a bit more if necessary (and there's no rush to get the
> first one in).

If you want it to be in v6.9, it's too late since the last drm-misc-next PR has been sent
yesterday (https://cgit.freedesktop.org/drm/drm-misc/tag/?h=drm-misc-next-2024-02-22)

Please ping Thomas or Maxime, perhaps it's not too late since the drm-misc-next tree
really closes on sunday.

Neil


>
> Johan


2024-02-23 14:53:25

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On Fri, Feb 23, 2024 at 03:38:13PM +0100, Neil Armstrong wrote:
> On 23/02/2024 15:21, Johan Hovold wrote:

> > But it is *not* standalone as I tried to explain above.
> >
> > So you have to drop it again as the later patches depend on it and
> > cannot be merged (through a different tree) without it.
>
> drm-misc branches cannot be rebased, it must be reverted, but it can still be applied
> on drm-misc-next and I'll send a revert patch for drm-misc-fixes if needed, not a big deal.
>
> > I thought you had all the acks you needed to take this through drm-misc,
> > but we can wait a bit more if necessary (and there's no rush to get the
> > first one in).
>
> If you want it to be in v6.9, it's too late since the last drm-misc-next PR has been sent
> yesterday (https://cgit.freedesktop.org/drm/drm-misc/tag/?h=drm-misc-next-2024-02-22)
>
> Please ping Thomas or Maxime, perhaps it's not too late since the drm-misc-next tree
> really closes on sunday.

I don't want this in 6.9, this is needed for *6.8* as this fixes a DRM
regression in 6.8-rc1 that breaks the display on machines like the X13s.

If you guys can't sort this out in time, then perhaps Bjorn can take
this through the Qualcomm tree instead (with DRM acks).

But again, this is fixing a severe *regression* in 6.8-rc1. It can not
wait for 6.9.

Johan

2024-02-23 14:54:52

by Neil Armstrong

[permalink] [raw]
Subject: Re: [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On 17/02/2024 16:02, Johan Hovold wrote:
> Starting with 6.8-rc1 the internal display sometimes fails to come up on
> machines like the Lenovo ThinkPad X13s and the logs indicate that this
> is due to a regression in the DRM subsystem [1].
>
> This series fixes a race in the pmic_glink_altmode driver which was
> exposed / triggered by the transparent DRM bridges rework that went into
> 6.8-rc1 and that manifested itself as a bridge failing to attach and
> sometimes triggering a NULL-pointer dereference.
>
> The intermittent hard resets that have also been reported since 6.8-rc1
> unfortunately still remains and suggests that we are dealing with two
> separate regressions. There is some indication that also the hard resets
> (e.g. due to register accesses to unclocked hardware) are also due to
> changes in the DRM subsystem as it happens around the time that the eDP
> panel and display controller would be initialised during boot (the
> runtime PM rework?). This remains to be verified, however.
>
> Included is also a fix for a related OF node reference leak in the
> aux-hpd driver found through inspection when reworking the driver.
>
> The use-after-free bug is triggered by a probe deferral and highlighted
> some further bugs in the involved drivers, which were registering child
> devices before deferring probe. This behaviour is not correct and can
> both trigger probe deferral loops and potentially also further issues
> with the DRM bridge implementation.
>
> This series can either go through the Qualcomm SoC tree (pmic_glink) or
> the DRM tree. The PHY patches do not depend on the rest of the series
> and could possibly be merged separately through the PHY tree.
>
> Whichever gets this to mainline the fastest.
>
> Johan
>
>
> [1] https://lore.kernel.org/lkml/[email protected]/
>
>
> Johan Hovold (5):
> drm/bridge: aux-hpd: fix OF node leaks
> drm/bridge: aux-hpd: separate allocation and registration
> soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free
> phy: qcom-qmp-combo: fix drm bridge registration
> phy: qcom-qmp-combo: fix type-c switch registration
>
> Rob Clark (1):
> soc: qcom: pmic_glink: Fix boot when QRTR=m
>
> drivers/gpu/drm/bridge/aux-hpd-bridge.c | 70 ++++++++++++++++++-----
> drivers/phy/qualcomm/phy-qcom-qmp-combo.c | 16 +++---
> drivers/soc/qcom/pmic_glink.c | 21 +++----
> drivers/soc/qcom/pmic_glink_altmode.c | 16 +++++-
> include/drm/bridge/aux-bridge.h | 15 +++++
> 5 files changed, 102 insertions(+), 36 deletions(-)
>

For the serie:
Acked-by: Neil Armstrong <[email protected]>

After an offline discussion, Dmitry, it's ok to push the remaining patches to drm-misc-fixes.

Thanks,
Neil

2024-02-23 14:55:58

by Neil Armstrong

[permalink] [raw]
Subject: Re: [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On 23/02/2024 15:52, Johan Hovold wrote:
> On Fri, Feb 23, 2024 at 03:38:13PM +0100, Neil Armstrong wrote:
>> On 23/02/2024 15:21, Johan Hovold wrote:
>
>>> But it is *not* standalone as I tried to explain above.
>>>
>>> So you have to drop it again as the later patches depend on it and
>>> cannot be merged (through a different tree) without it.
>>
>> drm-misc branches cannot be rebased, it must be reverted, but it can still be applied
>> on drm-misc-next and I'll send a revert patch for drm-misc-fixes if needed, not a big deal.
>>
>>> I thought you had all the acks you needed to take this through drm-misc,
>>> but we can wait a bit more if necessary (and there's no rush to get the
>>> first one in).
>>
>> If you want it to be in v6.9, it's too late since the last drm-misc-next PR has been sent
>> yesterday (https://cgit.freedesktop.org/drm/drm-misc/tag/?h=drm-misc-next-2024-02-22)
>>
>> Please ping Thomas or Maxime, perhaps it's not too late since the drm-misc-next tree
>> really closes on sunday.
>
> I don't want this in 6.9, this is needed for *6.8* as this fixes a DRM
> regression in 6.8-rc1 that breaks the display on machines like the X13s.
>
> If you guys can't sort this out in time, then perhaps Bjorn can take
> this through the Qualcomm tree instead (with DRM acks).
>
> But again, this is fixing a severe *regression* in 6.8-rc1. It can not
> wait for 6.9.

Right, I can't apply them right now, I send a patchset ack so it can be applied ASAP,

Thanks,
Neil

>
> Johan


2024-02-23 15:06:32

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: (subset) [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On Sat, 17 Feb 2024 16:02:22 +0100, Johan Hovold wrote:
> Starting with 6.8-rc1 the internal display sometimes fails to come up on
> machines like the Lenovo ThinkPad X13s and the logs indicate that this
> is due to a regression in the DRM subsystem [1].
>
> This series fixes a race in the pmic_glink_altmode driver which was
> exposed / triggered by the transparent DRM bridges rework that went into
> 6.8-rc1 and that manifested itself as a bridge failing to attach and
> sometimes triggering a NULL-pointer dereference.
>
> [...]

Applied to drm-misc-fixes, thanks!

[2/6] drm/bridge: aux-hpd: separate allocation and registration
commit: e5ca263508f7e9d2cf711edf3258d11ca087885c
[3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free
commit: b979f2d50a099f3402418d7ff5f26c3952fb08bb
[4/6] soc: qcom: pmic_glink: Fix boot when QRTR=m
commit: f79ee78767ca60e7a2c89eacd2dbdf237d97e838

Note, PHY patches (5,6) do not have dependency on the drm patch, so they can go
through the phy/fixes tree.

Best regards,
--
Dmitry Baryshkov <[email protected]>

2024-02-23 15:07:46

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

On Fri, 23 Feb 2024 at 16:55, Neil Armstrong <[email protected]> wrote:
>
> On 23/02/2024 15:52, Johan Hovold wrote:
> > On Fri, Feb 23, 2024 at 03:38:13PM +0100, Neil Armstrong wrote:
> >> On 23/02/2024 15:21, Johan Hovold wrote:
> >
> >>> But it is *not* standalone as I tried to explain above.
> >>>
> >>> So you have to drop it again as the later patches depend on it and
> >>> cannot be merged (through a different tree) without it.
> >>
> >> drm-misc branches cannot be rebased, it must be reverted, but it can still be applied
> >> on drm-misc-next and I'll send a revert patch for drm-misc-fixes if needed, not a big deal.
> >>
> >>> I thought you had all the acks you needed to take this through drm-misc,
> >>> but we can wait a bit more if necessary (and there's no rush to get the
> >>> first one in).
> >>
> >> If you want it to be in v6.9, it's too late since the last drm-misc-next PR has been sent
> >> yesterday (https://cgit.freedesktop.org/drm/drm-misc/tag/?h=drm-misc-next-2024-02-22)
> >>
> >> Please ping Thomas or Maxime, perhaps it's not too late since the drm-misc-next tree
> >> really closes on sunday.
> >
> > I don't want this in 6.9, this is needed for *6.8* as this fixes a DRM
> > regression in 6.8-rc1 that breaks the display on machines like the X13s.
> >
> > If you guys can't sort this out in time, then perhaps Bjorn can take
> > this through the Qualcomm tree instead (with DRM acks).
> >
> > But again, this is fixing a severe *regression* in 6.8-rc1. It can not
> > wait for 6.9.
>
> Right, I can't apply them right now, I send a patchset ack so it can be applied ASAP,

Applied and pushed patches 2-4. Patches 5 and 6 can go through the
phy/fixes. There is no need for them to go through drm-misc tree.

--
With best wishes
Dmitry

2024-03-06 17:49:00

by Vinod Koul

[permalink] [raw]
Subject: Re: (subset) [PATCH 0/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free


On Sat, 17 Feb 2024 16:02:22 +0100, Johan Hovold wrote:
> Starting with 6.8-rc1 the internal display sometimes fails to come up on
> machines like the Lenovo ThinkPad X13s and the logs indicate that this
> is due to a regression in the DRM subsystem [1].
>
> This series fixes a race in the pmic_glink_altmode driver which was
> exposed / triggered by the transparent DRM bridges rework that went into
> 6.8-rc1 and that manifested itself as a bridge failing to attach and
> sometimes triggering a NULL-pointer dereference.
>
> [...]

Applied, thanks!

[5/6] phy: qcom-qmp-combo: fix drm bridge registration
commit: d2d7b8e88023b75320662c2305d61779ff060950
[6/6] phy: qcom-qmp-combo: fix type-c switch registration
commit: 47b412c1ea77112f1148b4edd71700a388c7c80f

Best regards,
--
~Vinod