The MSM DRM driver is currently broken in multiple ways with respect to
probe deferral. Not only does the driver currently fail to probe again
after a late deferral, but due to a related use-after-free bug this also
triggers NULL-pointer dereferences.
These bugs are not new but have become critical with the release of
5.19 where probe is deferred in case the aux-bus EP panel driver has not
yet been loaded.
The underlying problem is lifetime issues due to careless use of
device-managed resources.
Specifically, device-managed resources allocated post component bind
must be tied to the lifetime of the aggregate DRM device or they will
not necessarily be released when binding of the aggregate device is
deferred.
The following call chain and pseudo code serves as an illustration of
the problem:
- platform_probe(pdev1)
- dp_display_probe()
- component_add()
- platform_probe(pdev2) // last component
- dp_display_probe() // d0
- component_add()
- try_to_bring_up_aggregate_device()
- devres_open_group(adev->parent) // d1
- msm_drm_bind()
- msm_drm_init()
- component_bind_all()
- for_each_component()
- component_bind()
- devres_open_group(&pdev->dev) // d2
- dp_display_bind()
- devm_kzalloc(&pdev->dev) // a1, OK
- devres_close_group(&pdev->dev) // d3
- dpu_kms_hw_init()
- for_each_panel()
- msm_dp_modeset_init()
- dp_display_request_irq()
- devm_request_irq(&pdev->dev) // a2, BUG
- if (pdev == pdev2 && condition)
- return -EPROBE_DEFER;
- if (error)
- component_unbind_all()
- for_each_component()
- component_unbind()
- dp_display_unbind()
- devres_release_group(&pdev->dev) // d4, only a1 is freed
- if (error)
- devres_release_group(adev->parent) // d5
The device-managed allocation a2 is buggy as its lifetime is tied to the
component platform device and will not be released when the aggregate
device bind fails (e.g. due to a probe deferral).
When pdev2 is later probed again, the attempt to allocate the IRQ a
second time will fail for pdev1 (which is still bound to its platform
driver).
This series fixes the lifetime issues by tying the lifetime of a2 (and
similar allocations) to the lifetime of the aggregate device so that a2
is released at d5.
In some cases, such has for the DP IRQ, the above situation can also be
avoided by moving the allocation in question to the platform driver
probe (d0) or component bind (between d2 and d3). But as doing so is not
a general fix, this can be done later as a cleanup/optimisation.
Johan
Changes in v2
- use a custom devres action instead of amending the AUX bus interface
(Doug)
- split sanity check fixes and cleanups per bridge type (Dmitry)
- add another Fixes tag for the missing bridge counter reset (Dmitry)
Johan Hovold (10):
drm/msm: fix use-after-free on probe deferral
drm/msm/dp: fix memory corruption with too many bridges
drm/msm/dsi: fix memory corruption with too many bridges
drm/msm/hdmi: fix memory corruption with too many bridges
drm/msm/dp: fix IRQ lifetime
drm/msm/dp: fix aux-bus EP lifetime
drm/msm/dp: fix bridge lifetime
drm/msm/hdmi: fix IRQ lifetime
drm/msm/dp: drop modeset sanity checks
drm/msm/dsi: drop modeset sanity checks
drivers/gpu/drm/msm/dp/dp_display.c | 26 +++++++++++++++++++-------
drivers/gpu/drm/msm/dp/dp_parser.c | 6 +++---
drivers/gpu/drm/msm/dp/dp_parser.h | 5 +++--
drivers/gpu/drm/msm/dsi/dsi.c | 9 +++++----
drivers/gpu/drm/msm/hdmi/hdmi.c | 7 ++++++-
drivers/gpu/drm/msm/msm_drv.c | 1 +
6 files changed, 37 insertions(+), 17 deletions(-)
--
2.35.1
Drop the overly defensive modeset sanity checks of function parameters
which have already been checked or used by the callers.
Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
---
drivers/gpu/drm/msm/dsi/dsi.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/msm/dsi/dsi.c b/drivers/gpu/drm/msm/dsi/dsi.c
index 8a95c744972a..31fdee2052be 100644
--- a/drivers/gpu/drm/msm/dsi/dsi.c
+++ b/drivers/gpu/drm/msm/dsi/dsi.c
@@ -211,14 +211,9 @@ void __exit msm_dsi_unregister(void)
int msm_dsi_modeset_init(struct msm_dsi *msm_dsi, struct drm_device *dev,
struct drm_encoder *encoder)
{
- struct msm_drm_private *priv;
+ struct msm_drm_private *priv = dev->dev_private;
int ret;
- if (WARN_ON(!encoder) || WARN_ON(!msm_dsi) || WARN_ON(!dev))
- return -EINVAL;
-
- priv = dev->dev_private;
-
if (priv->num_bridges == ARRAY_SIZE(priv->bridges)) {
DRM_DEV_ERROR(dev->dev, "too many bridges\n");
return -ENOSPC;
--
2.35.1
Device-managed resources allocated post component bind must be tied to
the lifetime of the aggregate DRM device or they will not necessarily be
released when binding of the aggregate device is deferred.
This is specifically true for the DP IRQ, which will otherwise remain
requested so that the next bind attempt fails when requesting the IRQ a
second time.
Since commit c3bf8e21b38a ("drm/msm/dp: Add eDP support via aux_bus")
this can happen when the aux-bus panel driver has not yet been loaded so
that probe is deferred.
Fix this by tying the device-managed lifetime of the DP IRQ to the DRM
device so that it is released when bind fails.
Fixes: c943b4948b58 ("drm/msm/dp: add displayPort driver support")
Cc: [email protected] # 5.10
Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
---
drivers/gpu/drm/msm/dp/dp_display.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index fbe950edaefe..ba557328710a 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -1258,7 +1258,7 @@ int dp_display_request_irq(struct msm_dp *dp_display)
return -EINVAL;
}
- rc = devm_request_irq(&dp->pdev->dev, dp->irq,
+ rc = devm_request_irq(dp_display->drm_dev->dev, dp->irq,
dp_display_irq_handler,
IRQF_TRIGGER_HIGH, "dp_display_isr", dp);
if (rc < 0) {
--
2.35.1
The bridge counter was never reset when tearing down the DRM device so
that stale pointers to deallocated structures would be accessed on the
next tear down (e.g. after a second late bind deferral).
Given enough bridges and a few probe deferrals this could currently also
lead to data beyond the bridge array being corrupted.
Fixes: d28ea556267c ("drm/msm: properly add and remove internal bridges")
Fixes: a3376e3ec81c ("drm/msm: convert to drm_bridge")
Cc: [email protected] # 3.12
Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
---
drivers/gpu/drm/msm/msm_drv.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 391d86b54ded..d254fe2507ec 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -241,6 +241,7 @@ static int msm_drm_uninit(struct device *dev)
for (i = 0; i < priv->num_bridges; i++)
drm_bridge_remove(priv->bridges[i]);
+ priv->num_bridges = 0;
pm_runtime_get_sync(dev);
msm_irq_uninstall(ddev);
--
2.35.1
Drop the overly defensive modeset sanity checks of function parameters
which have already been checked or used by the callers.
Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
---
drivers/gpu/drm/msm/dp/dp_display.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index 808a516e84c5..33daec11f813 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -1607,15 +1607,10 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)
int msm_dp_modeset_init(struct msm_dp *dp_display, struct drm_device *dev,
struct drm_encoder *encoder)
{
- struct msm_drm_private *priv;
+ struct msm_drm_private *priv = dev->dev_private;
struct dp_display_private *dp_priv;
int ret;
- if (WARN_ON(!encoder) || WARN_ON(!dp_display) || WARN_ON(!dev))
- return -EINVAL;
-
- priv = dev->dev_private;
-
if (priv->num_bridges == ARRAY_SIZE(priv->bridges)) {
DRM_DEV_ERROR(dev->dev, "too many bridges\n");
return -ENOSPC;
--
2.35.1
Device-managed resources allocated post component bind must be tied to
the lifetime of the aggregate DRM device or they will not necessarily be
released when binding of the aggregate device is deferred.
This can lead resource leaks or failure to bind the aggregate device
when binding is later retried and a second attempt to allocate the
resources is made.
For the DP bridges, previously allocated bridges will leak on probe
deferral.
Fix this by amending the DP parser interface and tying the lifetime of
the bridge device to the DRM device rather than DP platform device.
Fixes: c3bf8e21b38a ("drm/msm/dp: Add eDP support via aux_bus")
Cc: [email protected] # 5.19
Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
---
drivers/gpu/drm/msm/dp/dp_display.c | 2 +-
drivers/gpu/drm/msm/dp/dp_parser.c | 6 +++---
drivers/gpu/drm/msm/dp/dp_parser.h | 5 +++--
3 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index 4b0a2d4bb61e..808a516e84c5 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -1586,7 +1586,7 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)
* For DisplayPort interfaces external bridges are optional, so
* silently ignore an error if one is not present (-ENODEV).
*/
- rc = dp_parser_find_next_bridge(dp_priv->parser);
+ rc = devm_dp_parser_find_next_bridge(dp->drm_dev->dev, dp_priv->parser);
if (!dp->is_edp && rc == -ENODEV)
return 0;
diff --git a/drivers/gpu/drm/msm/dp/dp_parser.c b/drivers/gpu/drm/msm/dp/dp_parser.c
index dd732215d55b..dcbe893d66d7 100644
--- a/drivers/gpu/drm/msm/dp/dp_parser.c
+++ b/drivers/gpu/drm/msm/dp/dp_parser.c
@@ -240,12 +240,12 @@ static int dp_parser_clock(struct dp_parser *parser)
return 0;
}
-int dp_parser_find_next_bridge(struct dp_parser *parser)
+int devm_dp_parser_find_next_bridge(struct device *dev, struct dp_parser *parser)
{
- struct device *dev = &parser->pdev->dev;
+ struct platform_device *pdev = parser->pdev;
struct drm_bridge *bridge;
- bridge = devm_drm_of_get_bridge(dev, dev->of_node, 1, 0);
+ bridge = devm_drm_of_get_bridge(dev, pdev->dev.of_node, 1, 0);
if (IS_ERR(bridge))
return PTR_ERR(bridge);
diff --git a/drivers/gpu/drm/msm/dp/dp_parser.h b/drivers/gpu/drm/msm/dp/dp_parser.h
index 866c1a82bf1a..d30ab773db46 100644
--- a/drivers/gpu/drm/msm/dp/dp_parser.h
+++ b/drivers/gpu/drm/msm/dp/dp_parser.h
@@ -138,8 +138,9 @@ struct dp_parser {
struct dp_parser *dp_parser_get(struct platform_device *pdev);
/**
- * dp_parser_find_next_bridge() - find an additional bridge to DP
+ * devm_dp_parser_find_next_bridge() - find an additional bridge to DP
*
+ * @dev: device to tie bridge lifetime to
* @parser: dp_parser data from client
*
* This function is used to find any additional bridge attached to
@@ -147,6 +148,6 @@ struct dp_parser *dp_parser_get(struct platform_device *pdev);
*
* Return: 0 if able to get the bridge, otherwise negative errno for failure.
*/
-int dp_parser_find_next_bridge(struct dp_parser *parser);
+int devm_dp_parser_find_next_bridge(struct device *dev, struct dp_parser *parser);
#endif
--
2.35.1
Hi Johan,
On 9/13/22 3:53 AM, Johan Hovold wrote:
> The MSM DRM driver is currently broken in multiple ways with respect to
> probe deferral. Not only does the driver currently fail to probe again
> after a late deferral, but due to a related use-after-free bug this also
> triggers NULL-pointer dereferences.
>
> These bugs are not new but have become critical with the release of
> 5.19 where probe is deferred in case the aux-bus EP panel driver has not
> yet been loaded.
>
> The underlying problem is lifetime issues due to careless use of
> device-managed resources.
>
> Specifically, device-managed resources allocated post component bind
> must be tied to the lifetime of the aggregate DRM device or they will
> not necessarily be released when binding of the aggregate device is
> deferred.
>
> The following call chain and pseudo code serves as an illustration of
> the problem:
>
> - platform_probe(pdev1)
> - dp_display_probe()
> - component_add()
>
> - platform_probe(pdev2) // last component
> - dp_display_probe() // d0
> - component_add()
> - try_to_bring_up_aggregate_device()
> - devres_open_group(adev->parent) // d1
>
> - msm_drm_bind()
> - msm_drm_init()
> - component_bind_all()
> - for_each_component()
> - component_bind()
> - devres_open_group(&pdev->dev) // d2
> - dp_display_bind()
> - devm_kzalloc(&pdev->dev) // a1, OK
> - devres_close_group(&pdev->dev) // d3
>
> - dpu_kms_hw_init()
> - for_each_panel()
> - msm_dp_modeset_init()
> - dp_display_request_irq()
> - devm_request_irq(&pdev->dev) // a2, BUG
> - if (pdev == pdev2 && condition)
> - return -EPROBE_DEFER;
>
> - if (error)
> - component_unbind_all()
> - for_each_component()
> - component_unbind()
> - dp_display_unbind()
> - devres_release_group(&pdev->dev) // d4, only a1 is freed
>
> - if (error)
> - devres_release_group(adev->parent) // d5
>
> The device-managed allocation a2 is buggy as its lifetime is tied to the
> component platform device and will not be released when the aggregate
> device bind fails (e.g. due to a probe deferral).
>
> When pdev2 is later probed again, the attempt to allocate the IRQ a
> second time will fail for pdev1 (which is still bound to its platform
> driver).
>
> This series fixes the lifetime issues by tying the lifetime of a2 (and
> similar allocations) to the lifetime of the aggregate device so that a2
> is released at d5.
>
> In some cases, such has for the DP IRQ, the above situation can also be
> avoided by moving the allocation in question to the platform driver
> probe (d0) or component bind (between d2 and d3). But as doing so is not
> a general fix, this can be done later as a cleanup/optimisation.
>
> Johan
>
> Changes in v2
> - use a custom devres action instead of amending the AUX bus interface
> (Doug)
> - split sanity check fixes and cleanups per bridge type (Dmitry)
> - add another Fixes tag for the missing bridge counter reset (Dmitry)
>
>
> Johan Hovold (10):
> drm/msm: fix use-after-free on probe deferral
> drm/msm/dp: fix memory corruption with too many bridges
> drm/msm/dsi: fix memory corruption with too many bridges
> drm/msm/hdmi: fix memory corruption with too many bridges
> drm/msm/dp: fix IRQ lifetime
> drm/msm/dp: fix aux-bus EP lifetime
> drm/msm/dp: fix bridge lifetime
> drm/msm/hdmi: fix IRQ lifetime
> drm/msm/dp: drop modeset sanity checks
> drm/msm/dsi: drop modeset sanity checks
>
> drivers/gpu/drm/msm/dp/dp_display.c | 26 +++++++++++++++++++-------
> drivers/gpu/drm/msm/dp/dp_parser.c | 6 +++---
> drivers/gpu/drm/msm/dp/dp_parser.h | 5 +++--
> drivers/gpu/drm/msm/dsi/dsi.c | 9 +++++----
> drivers/gpu/drm/msm/hdmi/hdmi.c | 7 ++++++-
> drivers/gpu/drm/msm/msm_drv.c | 1 +
> 6 files changed, 37 insertions(+), 17 deletions(-)
>
I've tested this on both sc8180x (Lenovo Flex 5G) and sdm850 (Lenovo
Yoga C630), and both of them show the same issue:
[ 7.449305] platform ae9a000.displayport-controller: Fixing up cyclic
dependency with ae01000.mdp [ 7.454344] Unable to handle kernel NULL
pointer dereference at virtual address 0000000000000008 [ 7.454406] Mem
abort info: [ 7.454423] ESR = 0x0000000096000004 [ 7.454446] EC = 0x25:
DABT (current EL), IL = 32 bits [ 7.454475] SET = 0, FnV = 0 [ 7.454494]
EA = 0, S1PTW = 0 [ 7.454512] FSC = 0x04: level 0 translation fault [
7.454539] Data abort info: [ 7.454556] ISV = 0, ISS = 0x00000004 [
7.454577] CM = 0, WnR = 0 [ 7.454595] user pgtable: 4k pages, 48-bit
VAs, pgdp=0000000101504000 [ 7.454629] [0000000000000008]
pgd=0000000000000000, p4d=0000000000000000 [ 7.454669] Internal error:
Oops: 96000004 [#1] PREEMPT SMP [ 7.454700] Modules linked in:
i2c_hid_of i2c_hid leds_qcom_lpg led_class_multicolor rtc_pm8xxx msm
mdt_loader gpu_sched drm_dp_aux_bus drm_display_helper drm_kms_helper
drm phy_qcom_edp llcc_qcom i2c_qcom_geni phy_qcom_qmp_combo
phy_qcom_snps_femto_v2 phy_qcom_qmp_ufs phy_qcom_qmp_pcie ufs_qcom
pwm_bl [ 7.454860] CPU: 2 PID: 76 Comm: kworker/u16:2 Not tainted
5.19.0-rc8-next-20220728 #26 [ 7.454902] Hardware name: LENOVO
82AK/LNVNB161216, BIOS EACN43WW(V1.15) 09/13/2021 [ 7.454941] Workqueue:
events_unbound deferred_probe_work_func [ 7.454982] pstate: 40400005
(nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 7.455020] pc :
dp_display_request_irq+0x50/0xdc [msm] [ 7.455145] lr :
dp_display_request_irq+0x2c/0xdc [msm] [ 7.455265] sp : ffff800008c1bb30
[ 7.455285] x29: ffff800008c1bb30 x28: 0000000000000000 x27:
0000000000000000 [ 7.455327] x26: ffffc9c918420000 x25: ffffc9c9186ec570
x24: 000000000000003a [ 7.455368] x23: ffffc9c918811d30 x22:
ffff2a5806baa998 x21: ffff2a5806ba3410 [ 7.455410] x20: ffff2a5806baa880
x19: ffff2a5806baa998 x18: ffffffffffffffff [ 7.455451] x17:
0000000000000038 x16: ffffc9c9164eeb24 x15: ffffffffffffffff [ 7.455492]
x14: ffff2a5806bc3004 x13: ffff2a5806bc3000 x12: 0000000000000000 [
7.455533] x11: 0000000000000040 x10: ffffc9c918493080 x9 :
ffffc9c918493078 [ 7.455574] x8 : ffff2a5800681b88 x7 : 0000000000000000
x6 : ffff2a5806baa880 [ 7.455616] x5 : ffffc9c8ca2de000 x4 :
0000000000080004 x3 : 0000000000000000 [ 7.455656] x2 : ffffc9c8ca296000
x1 : 00000000000000a8 x0 : 0000000000000000 [ 7.455698] Call trace: [
7.455714] dp_display_request_irq+0x50/0xdc [msm] [ 7.455834]
dp_display_probe+0xf8/0x4a4 [msm] [ 7.455950] platform_probe+0x6c/0xc4 [
7.455976] really_probe+0xbc/0x2d4 [ 7.455999]
__driver_probe_device+0x78/0xe0 [ 7.456025]
driver_probe_device+0x3c/0x13c [ 7.456051]
__device_attach_driver+0xb8/0x120 [ 7.456077] bus_for_each_drv+0x78/0xd0
[ 7.456105] __device_attach+0x9c/0x1a0 [ 7.456129]
device_initial_probe+0x18/0x2c [ 7.456154] bus_probe_device+0x9c/0xa4 [
7.456178] deferred_probe_work_func+0x88/0xc0 [ 7.456204]
process_one_work+0x1d4/0x330 [ 7.456231] worker_thread+0x70/0x42c [
7.456255] kthread+0x10c/0x110 [ 7.456278] ret_from_fork+0x10/0x20 [
7.456306] Code: aa1403e6 f2a00104 f0000225 f0ffffe2 (f9400400) [
7.456341] ---[ end trace 0000000000000000 ]---
This is from the sc8180x, sdm850 is the same call stack, just with
different addresses.
I do have
https://lore.kernel.org/all/[email protected]/
applied here which makes the 10th patch not apply cleanly.
It fails actually, but I applied it manually here.
On Tue, Sep 13, 2022 at 03:23:10PM -0500, Steev Klimaszewski wrote:
> Hi Johan,
>
> On 9/13/22 3:53 AM, Johan Hovold wrote:
> > The MSM DRM driver is currently broken in multiple ways with respect to
> > probe deferral. Not only does the driver currently fail to probe again
> > after a late deferral, but due to a related use-after-free bug this also
> > triggers NULL-pointer dereferences.
> > In some cases, such has for the DP IRQ, the above situation can also be
> > avoided by moving the allocation in question to the platform driver
> > probe (d0) or component bind (between d2 and d3). But as doing so is not
> > a general fix, this can be done later as a cleanup/optimisation.
> I've tested this on both sc8180x (Lenovo Flex 5G) and sdm850 (Lenovo
> Yoga C630), and both of them show the same issue:
[ Copied the below from IRC instead as the formatting in your mail was
off. ]
> [ 7.449305] platform ae9a000.displayport-controller: Fixing up cyclic dependency with ae01000.mdp
> [ 7.454344] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> [ 7.454406] Mem abort info:
> [ 7.454423] ESR = 0x0000000096000004
> [ 7.454446] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 7.454475] SET = 0, FnV = 0
> [ 7.454494] EA = 0, S1PTW = 0
> [ 7.454512] FSC = 0x04: level 0 translation fault
> [ 7.454539] Data abort info:
> [ 7.454556] ISV = 0, ISS = 0x00000004
> [ 7.454577] CM = 0, WnR = 0
> [ 7.454595] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000101504000
> [ 7.454629] [0000000000000008] pgd=0000000000000000, p4d=0000000000000000
> [ 7.454669] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> [ 7.454700] Modules linked in: i2c_hid_of i2c_hid leds_qcom_lpg led_class_multicolor rtc_pm8xxx msm mdt_loader gpu_sched drm_dp_aux_bus drm_display_helper drm_kms_helper drm phy_qcom_edp llcc_qcom i2c_qcom_geni phy_qcom_qmp_combo phy_qcom_snps_femto_v2 phy_qcom_qmp_ufs phy_qcom_qmp_pcie ufs_qcom pwm_bl
> [ 7.454860] CPU: 2 PID: 76 Comm: kworker/u16:2 Not tainted 5.19.0-rc8-next-20220728 #26
> [ 7.454902] Hardware name: LENOVO 82AK/LNVNB161216, BIOS EACN43WW(V1.15) 09/13/2021
> [ 7.454941] Workqueue: events_unbound deferred_probe_work_func
> [ 7.454982] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 7.455020] pc : dp_display_request_irq+0x50/0xdc [msm]
> [ 7.455145] lr : dp_display_request_irq+0x2c/0xdc [msm]
> [ 7.455265] sp : ffff800008c1bb30
> [ 7.455285] x29: ffff800008c1bb30 x28: 0000000000000000 x27: 0000000000000000
> [ 7.455327] x26: ffffc9c918420000 x25: ffffc9c9186ec570 x24: 000000000000003a
> [ 7.455368] x23: ffffc9c918811d30 x22: ffff2a5806baa998 x21: ffff2a5806ba3410
> [ 7.455410] x20: ffff2a5806baa880 x19: ffff2a5806baa998 x18: ffffffffffffffff
> [ 7.455451] x17: 0000000000000038 x16: ffffc9c9164eeb24 x15: ffffffffffffffff
> [ 7.455492] x14: ffff2a5806bc3004 x13: ffff2a5806bc3000 x12: 0000000000000000
> [ 7.455533] x11: 0000000000000040 x10: ffffc9c918493080 x9 : ffffc9c918493078
> [ 7.455574] x8 : ffff2a5800681b88 x7 : 0000000000000000 x6 : ffff2a5806baa880
> [ 7.455616] x5 : ffffc9c8ca2de000 x4 : 0000000000080004 x3 : 0000000000000000
> [ 7.455656] x2 : ffffc9c8ca296000 x1 : 00000000000000a8 x0 : 0000000000000000
> [ 7.455698] Call trace:
> [ 7.455714] dp_display_request_irq+0x50/0xdc [msm]
> [ 7.455834] dp_display_probe+0xf8/0x4a4 [msm]
> [ 7.455950] platform_probe+0x6c/0xc4
> [ 7.455976] really_probe+0xbc/0x2d4
> [ 7.455999] __driver_probe_device+0x78/0xe0
> [ 7.456025] driver_probe_device+0x3c/0x13c
> [ 7.456051] __device_attach_driver+0xb8/0x120
> [ 7.456077] bus_for_each_drv+0x78/0xd0
> [ 7.456105] __device_attach+0x9c/0x1a0
> [ 7.456129] device_initial_probe+0x18/0x2c
> [ 7.456154] bus_probe_device+0x9c/0xa4
> [ 7.456178] deferred_probe_work_func+0x88/0xc0
> [ 7.456204] process_one_work+0x1d4/0x330
> [ 7.456231] worker_thread+0x70/0x42c
> [ 7.456255] kthread+0x10c/0x110
> [ 7.456278] ret_from_fork+0x10/0x20
> [ 7.456306] Code: aa1403e6 f2a00104 f0000225 f0ffffe2 (f9400400)
> [ 7.456341] ---[ end trace 0000000000000000 ]---
> This is from the sc8180x, sdm850 is the same call stack, just with
> different addresses.
>
> I do have
> https://lore.kernel.org/all/[email protected]/
> applied here which makes the 10th patch not apply cleanly.
Yeah, that is expected. You need to drop Dmitry's series first. Once you
verified that this series works, you can add it back if you want but you
then need to restore the device pointer used when allocating the irq in
dp_display_request_irq():
- rc = devm_request_irq(dp_display->drm_dev->dev, dp->irq,
+ rc = devm_request_irq(&dp->pdev->dev, dp->irq,
> It fails actually, but I applied it manually here.
Please drop that series and give this one another spin.
Johan
Hi Johan,
On Wed, Sep 14, 2022 at 1:01 AM Johan Hovold <[email protected]> wrote:
>
> On Tue, Sep 13, 2022 at 03:23:10PM -0500, Steev Klimaszewski wrote:
> > Hi Johan,
> >
> > On 9/13/22 3:53 AM, Johan Hovold wrote:
> > > The MSM DRM driver is currently broken in multiple ways with respect to
> > > probe deferral. Not only does the driver currently fail to probe again
> > > after a late deferral, but due to a related use-after-free bug this also
> > > triggers NULL-pointer dereferences.
>
> > > In some cases, such has for the DP IRQ, the above situation can also be
> > > avoided by moving the allocation in question to the platform driver
> > > probe (d0) or component bind (between d2 and d3). But as doing so is not
> > > a general fix, this can be done later as a cleanup/optimisation.
>
> > I've tested this on both sc8180x (Lenovo Flex 5G) and sdm850 (Lenovo
> > Yoga C630), and both of them show the same issue:
>
> [ Copied the below from IRC instead as the formatting in your mail was
> off. ]
>
> > [ 7.449305] platform ae9a000.displayport-controller: Fixing up cyclic dependency with ae01000.mdp
> > [ 7.454344] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> > [ 7.454406] Mem abort info:
> > [ 7.454423] ESR = 0x0000000096000004
> > [ 7.454446] EC = 0x25: DABT (current EL), IL = 32 bits
> > [ 7.454475] SET = 0, FnV = 0
> > [ 7.454494] EA = 0, S1PTW = 0
> > [ 7.454512] FSC = 0x04: level 0 translation fault
> > [ 7.454539] Data abort info:
> > [ 7.454556] ISV = 0, ISS = 0x00000004
> > [ 7.454577] CM = 0, WnR = 0
> > [ 7.454595] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000101504000
> > [ 7.454629] [0000000000000008] pgd=0000000000000000, p4d=0000000000000000
> > [ 7.454669] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> > [ 7.454700] Modules linked in: i2c_hid_of i2c_hid leds_qcom_lpg led_class_multicolor rtc_pm8xxx msm mdt_loader gpu_sched drm_dp_aux_bus drm_display_helper drm_kms_helper drm phy_qcom_edp llcc_qcom i2c_qcom_geni phy_qcom_qmp_combo phy_qcom_snps_femto_v2 phy_qcom_qmp_ufs phy_qcom_qmp_pcie ufs_qcom pwm_bl
> > [ 7.454860] CPU: 2 PID: 76 Comm: kworker/u16:2 Not tainted 5.19.0-rc8-next-20220728 #26
> > [ 7.454902] Hardware name: LENOVO 82AK/LNVNB161216, BIOS EACN43WW(V1.15) 09/13/2021
> > [ 7.454941] Workqueue: events_unbound deferred_probe_work_func
> > [ 7.454982] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [ 7.455020] pc : dp_display_request_irq+0x50/0xdc [msm]
> > [ 7.455145] lr : dp_display_request_irq+0x2c/0xdc [msm]
> > [ 7.455265] sp : ffff800008c1bb30
> > [ 7.455285] x29: ffff800008c1bb30 x28: 0000000000000000 x27: 0000000000000000
> > [ 7.455327] x26: ffffc9c918420000 x25: ffffc9c9186ec570 x24: 000000000000003a
> > [ 7.455368] x23: ffffc9c918811d30 x22: ffff2a5806baa998 x21: ffff2a5806ba3410
> > [ 7.455410] x20: ffff2a5806baa880 x19: ffff2a5806baa998 x18: ffffffffffffffff
> > [ 7.455451] x17: 0000000000000038 x16: ffffc9c9164eeb24 x15: ffffffffffffffff
> > [ 7.455492] x14: ffff2a5806bc3004 x13: ffff2a5806bc3000 x12: 0000000000000000
> > [ 7.455533] x11: 0000000000000040 x10: ffffc9c918493080 x9 : ffffc9c918493078
> > [ 7.455574] x8 : ffff2a5800681b88 x7 : 0000000000000000 x6 : ffff2a5806baa880
> > [ 7.455616] x5 : ffffc9c8ca2de000 x4 : 0000000000080004 x3 : 0000000000000000
> > [ 7.455656] x2 : ffffc9c8ca296000 x1 : 00000000000000a8 x0 : 0000000000000000
> > [ 7.455698] Call trace:
> > [ 7.455714] dp_display_request_irq+0x50/0xdc [msm]
> > [ 7.455834] dp_display_probe+0xf8/0x4a4 [msm]
> > [ 7.455950] platform_probe+0x6c/0xc4
> > [ 7.455976] really_probe+0xbc/0x2d4
> > [ 7.455999] __driver_probe_device+0x78/0xe0
> > [ 7.456025] driver_probe_device+0x3c/0x13c
> > [ 7.456051] __device_attach_driver+0xb8/0x120
> > [ 7.456077] bus_for_each_drv+0x78/0xd0
> > [ 7.456105] __device_attach+0x9c/0x1a0
> > [ 7.456129] device_initial_probe+0x18/0x2c
> > [ 7.456154] bus_probe_device+0x9c/0xa4
> > [ 7.456178] deferred_probe_work_func+0x88/0xc0
> > [ 7.456204] process_one_work+0x1d4/0x330
> > [ 7.456231] worker_thread+0x70/0x42c
> > [ 7.456255] kthread+0x10c/0x110
> > [ 7.456278] ret_from_fork+0x10/0x20
> > [ 7.456306] Code: aa1403e6 f2a00104 f0000225 f0ffffe2 (f9400400)
> > [ 7.456341] ---[ end trace 0000000000000000 ]---
>
> > This is from the sc8180x, sdm850 is the same call stack, just with
> > different addresses.
> >
> > I do have
> > https://lore.kernel.org/all/[email protected]/
> > applied here which makes the 10th patch not apply cleanly.
>
> Yeah, that is expected. You need to drop Dmitry's series first. Once you
> verified that this series works, you can add it back if you want but you
> then need to restore the device pointer used when allocating the irq in
> dp_display_request_irq():
>
> - rc = devm_request_irq(dp_display->drm_dev->dev, dp->irq,
> + rc = devm_request_irq(&dp->pdev->dev, dp->irq,
>
> > It fails actually, but I applied it manually here.
>
> Please drop that series and give this one another spin.
>
> Johan
I thought as much but wasn't sure. Thanks for the clarification.
With Dmitriy's patchset backed out, this series does work as expected.
Tested-by: Steev Klimaszewski <[email protected]>
On Tue, Sep 13, 2022 at 10:53:10AM +0200, Johan Hovold wrote:
> The MSM DRM driver is currently broken in multiple ways with respect to
> probe deferral. Not only does the driver currently fail to probe again
> after a late deferral, but due to a related use-after-free bug this also
> triggers NULL-pointer dereferences.
>
> These bugs are not new but have become critical with the release of
> 5.19 where probe is deferred in case the aux-bus EP panel driver has not
> yet been loaded.
>
> The underlying problem is lifetime issues due to careless use of
> device-managed resources.
Any chance of getting this merged for 6.1?
Johan
> Changes in v2
> - use a custom devres action instead of amending the AUX bus interface
> (Doug)
> - split sanity check fixes and cleanups per bridge type (Dmitry)
> - add another Fixes tag for the missing bridge counter reset (Dmitry)
>
>
> Johan Hovold (10):
> drm/msm: fix use-after-free on probe deferral
> drm/msm/dp: fix memory corruption with too many bridges
> drm/msm/dsi: fix memory corruption with too many bridges
> drm/msm/hdmi: fix memory corruption with too many bridges
> drm/msm/dp: fix IRQ lifetime
> drm/msm/dp: fix aux-bus EP lifetime
> drm/msm/dp: fix bridge lifetime
> drm/msm/hdmi: fix IRQ lifetime
> drm/msm/dp: drop modeset sanity checks
> drm/msm/dsi: drop modeset sanity checks
>
> drivers/gpu/drm/msm/dp/dp_display.c | 26 +++++++++++++++++++-------
> drivers/gpu/drm/msm/dp/dp_parser.c | 6 +++---
> drivers/gpu/drm/msm/dp/dp_parser.h | 5 +++--
> drivers/gpu/drm/msm/dsi/dsi.c | 9 +++++----
> drivers/gpu/drm/msm/hdmi/hdmi.c | 7 ++++++-
> drivers/gpu/drm/msm/msm_drv.c | 1 +
> 6 files changed, 37 insertions(+), 17 deletions(-)
On 9/13/2022 1:53 AM, Johan Hovold wrote:
> The bridge counter was never reset when tearing down the DRM device so
> that stale pointers to deallocated structures would be accessed on the
> next tear down (e.g. after a second late bind deferral).
>
> Given enough bridges and a few probe deferrals this could currently also
> lead to data beyond the bridge array being corrupted.
>
> Fixes: d28ea556267c ("drm/msm: properly add and remove internal bridges")
> Fixes: a3376e3ec81c ("drm/msm: convert to drm_bridge")
> Cc: [email protected] # 3.12
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
Tested-by: Kuogee Hsieh <[email protected]>
Reviewed-by: Kuogee Hsieh <[email protected]>
> ---
> drivers/gpu/drm/msm/msm_drv.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index 391d86b54ded..d254fe2507ec 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -241,6 +241,7 @@ static int msm_drm_uninit(struct device *dev)
>
> for (i = 0; i < priv->num_bridges; i++)
> drm_bridge_remove(priv->bridges[i]);
> + priv->num_bridges = 0;
>
> pm_runtime_get_sync(dev);
> msm_irq_uninstall(ddev);
On 9/13/2022 1:53 AM, Johan Hovold wrote:
> Device-managed resources allocated post component bind must be tied to
> the lifetime of the aggregate DRM device or they will not necessarily be
> released when binding of the aggregate device is deferred.
>
> This is specifically true for the DP IRQ, which will otherwise remain
> requested so that the next bind attempt fails when requesting the IRQ a
> second time.
>
> Since commit c3bf8e21b38a ("drm/msm/dp: Add eDP support via aux_bus")
> this can happen when the aux-bus panel driver has not yet been loaded so
> that probe is deferred.
>
> Fix this by tying the device-managed lifetime of the DP IRQ to the DRM
> device so that it is released when bind fails.
>
> Fixes: c943b4948b58 ("drm/msm/dp: add displayPort driver support")
> Cc: [email protected] # 5.10
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
Tested-by: Kuogee Hsieh <[email protected]>
Reviewed-by: Kuogee Hsieh <[email protected]>
> ---
> drivers/gpu/drm/msm/dp/dp_display.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
> index fbe950edaefe..ba557328710a 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -1258,7 +1258,7 @@ int dp_display_request_irq(struct msm_dp *dp_display)
> return -EINVAL;
> }
>
> - rc = devm_request_irq(&dp->pdev->dev, dp->irq,
> + rc = devm_request_irq(dp_display->drm_dev->dev, dp->irq,
> dp_display_irq_handler,
> IRQF_TRIGGER_HIGH, "dp_display_isr", dp);
> if (rc < 0) {
On 9/13/2022 1:53 AM, Johan Hovold wrote:
> Device-managed resources allocated post component bind must be tied to
> the lifetime of the aggregate DRM device or they will not necessarily be
> released when binding of the aggregate device is deferred.
>
> This can lead resource leaks or failure to bind the aggregate device
> when binding is later retried and a second attempt to allocate the
> resources is made.
>
> For the DP bridges, previously allocated bridges will leak on probe
> deferral.
>
> Fix this by amending the DP parser interface and tying the lifetime of
> the bridge device to the DRM device rather than DP platform device.
>
> Fixes: c3bf8e21b38a ("drm/msm/dp: Add eDP support via aux_bus")
> Cc: [email protected] # 5.19
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
Tested-by: Kuogee Hsieh <[email protected]>
Reviewed-by: Kuogee Hsieh <[email protected]>
> ---
> drivers/gpu/drm/msm/dp/dp_display.c | 2 +-
> drivers/gpu/drm/msm/dp/dp_parser.c | 6 +++---
> drivers/gpu/drm/msm/dp/dp_parser.h | 5 +++--
> 3 files changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
> index 4b0a2d4bb61e..808a516e84c5 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -1586,7 +1586,7 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)
> * For DisplayPort interfaces external bridges are optional, so
> * silently ignore an error if one is not present (-ENODEV).
> */
> - rc = dp_parser_find_next_bridge(dp_priv->parser);
> + rc = devm_dp_parser_find_next_bridge(dp->drm_dev->dev, dp_priv->parser);
> if (!dp->is_edp && rc == -ENODEV)
> return 0;
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_parser.c b/drivers/gpu/drm/msm/dp/dp_parser.c
> index dd732215d55b..dcbe893d66d7 100644
> --- a/drivers/gpu/drm/msm/dp/dp_parser.c
> +++ b/drivers/gpu/drm/msm/dp/dp_parser.c
> @@ -240,12 +240,12 @@ static int dp_parser_clock(struct dp_parser *parser)
> return 0;
> }
>
> -int dp_parser_find_next_bridge(struct dp_parser *parser)
> +int devm_dp_parser_find_next_bridge(struct device *dev, struct dp_parser *parser)
> {
> - struct device *dev = &parser->pdev->dev;
> + struct platform_device *pdev = parser->pdev;
> struct drm_bridge *bridge;
>
> - bridge = devm_drm_of_get_bridge(dev, dev->of_node, 1, 0);
> + bridge = devm_drm_of_get_bridge(dev, pdev->dev.of_node, 1, 0);
> if (IS_ERR(bridge))
> return PTR_ERR(bridge);
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_parser.h b/drivers/gpu/drm/msm/dp/dp_parser.h
> index 866c1a82bf1a..d30ab773db46 100644
> --- a/drivers/gpu/drm/msm/dp/dp_parser.h
> +++ b/drivers/gpu/drm/msm/dp/dp_parser.h
> @@ -138,8 +138,9 @@ struct dp_parser {
> struct dp_parser *dp_parser_get(struct platform_device *pdev);
>
> /**
> - * dp_parser_find_next_bridge() - find an additional bridge to DP
> + * devm_dp_parser_find_next_bridge() - find an additional bridge to DP
> *
> + * @dev: device to tie bridge lifetime to
> * @parser: dp_parser data from client
> *
> * This function is used to find any additional bridge attached to
> @@ -147,6 +148,6 @@ struct dp_parser *dp_parser_get(struct platform_device *pdev);
> *
> * Return: 0 if able to get the bridge, otherwise negative errno for failure.
> */
> -int dp_parser_find_next_bridge(struct dp_parser *parser);
> +int devm_dp_parser_find_next_bridge(struct device *dev, struct dp_parser *parser);
>
> #endif
On 9/13/2022 1:53 AM, Johan Hovold wrote:
> Drop the overly defensive modeset sanity checks of function parameters
> which have already been checked or used by the callers.
>
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
Tested-by: Kuogee Hsieh <[email protected]>
Reviewed-by: Kuogee Hsieh <[email protected]>
> ---
> drivers/gpu/drm/msm/dsi/dsi.c | 7 +------
> 1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dsi/dsi.c b/drivers/gpu/drm/msm/dsi/dsi.c
> index 8a95c744972a..31fdee2052be 100644
> --- a/drivers/gpu/drm/msm/dsi/dsi.c
> +++ b/drivers/gpu/drm/msm/dsi/dsi.c
> @@ -211,14 +211,9 @@ void __exit msm_dsi_unregister(void)
> int msm_dsi_modeset_init(struct msm_dsi *msm_dsi, struct drm_device *dev,
> struct drm_encoder *encoder)
> {
> - struct msm_drm_private *priv;
> + struct msm_drm_private *priv = dev->dev_private;
> int ret;
>
> - if (WARN_ON(!encoder) || WARN_ON(!msm_dsi) || WARN_ON(!dev))
> - return -EINVAL;
> -
> - priv = dev->dev_private;
> -
> if (priv->num_bridges == ARRAY_SIZE(priv->bridges)) {
> DRM_DEV_ERROR(dev->dev, "too many bridges\n");
> return -ENOSPC;
On 9/13/2022 1:53 AM, Johan Hovold wrote:
> The bridge counter was never reset when tearing down the DRM device so
> that stale pointers to deallocated structures would be accessed on the
> next tear down (e.g. after a second late bind deferral).
>
> Given enough bridges and a few probe deferrals this could currently also
> lead to data beyond the bridge array being corrupted.
>
> Fixes: d28ea556267c ("drm/msm: properly add and remove internal bridges")
> Fixes: a3376e3ec81c ("drm/msm: convert to drm_bridge")
> Cc: [email protected] # 3.12
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
Tested-by: Kuogee Hsieh <[email protected]>
Reviewed-by: Kuogee Hsieh <[email protected]>
> ---
> drivers/gpu/drm/msm/msm_drv.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index 391d86b54ded..d254fe2507ec 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -241,6 +241,7 @@ static int msm_drm_uninit(struct device *dev)
>
> for (i = 0; i < priv->num_bridges; i++)
> drm_bridge_remove(priv->bridges[i]);
> + priv->num_bridges = 0;
>
> pm_runtime_get_sync(dev);
> msm_irq_uninstall(ddev);
On 9/13/2022 1:53 AM, Johan Hovold wrote:
> Drop the overly defensive modeset sanity checks of function parameters
> which have already been checked or used by the callers.
>
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
Tested-by: Kuogee Hsieh <[email protected]>
Reviewed-by: Kuogee Hsieh <[email protected]>
> ---
> drivers/gpu/drm/msm/dp/dp_display.c | 7 +------
> 1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
> index 808a516e84c5..33daec11f813 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -1607,15 +1607,10 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)
> int msm_dp_modeset_init(struct msm_dp *dp_display, struct drm_device *dev,
> struct drm_encoder *encoder)
> {
> - struct msm_drm_private *priv;
> + struct msm_drm_private *priv = dev->dev_private;
> struct dp_display_private *dp_priv;
> int ret;
>
> - if (WARN_ON(!encoder) || WARN_ON(!dp_display) || WARN_ON(!dev))
> - return -EINVAL;
> -
> - priv = dev->dev_private;
> -
> if (priv->num_bridges == ARRAY_SIZE(priv->bridges)) {
> DRM_DEV_ERROR(dev->dev, "too many bridges\n");
> return -ENOSPC;
On 9/13/2022 1:53 AM, Johan Hovold wrote:
> Drop the overly defensive modeset sanity checks of function parameters
> which have already been checked or used by the callers.
>
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
The change LGTM, hence
Reviewed-by: Abhinav Kumar <[email protected]>
I think we can use below fixes tag so that we can pick up this entire
series for the fixes cycle.
Fixes: 3f0689e66352 ("drm/msm/dsi: check msm_dsi and dsi pointers before
use")
> ---
> drivers/gpu/drm/msm/dsi/dsi.c | 7 +------
> 1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dsi/dsi.c b/drivers/gpu/drm/msm/dsi/dsi.c
> index 8a95c744972a..31fdee2052be 100644
> --- a/drivers/gpu/drm/msm/dsi/dsi.c
> +++ b/drivers/gpu/drm/msm/dsi/dsi.c
> @@ -211,14 +211,9 @@ void __exit msm_dsi_unregister(void)
> int msm_dsi_modeset_init(struct msm_dsi *msm_dsi, struct drm_device *dev,
> struct drm_encoder *encoder)
> {
> - struct msm_drm_private *priv;
> + struct msm_drm_private *priv = dev->dev_private;
> int ret;
>
> - if (WARN_ON(!encoder) || WARN_ON(!msm_dsi) || WARN_ON(!dev))
> - return -EINVAL;
> -
> - priv = dev->dev_private;
> -
> if (priv->num_bridges == ARRAY_SIZE(priv->bridges)) {
> DRM_DEV_ERROR(dev->dev, "too many bridges\n");
> return -ENOSPC;
On 9/13/2022 1:53 AM, Johan Hovold wrote:
> Drop the overly defensive modeset sanity checks of function parameters
> which have already been checked or used by the callers.
>
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
The change LGTM, hence
Reviewed-by: Abhinav Kumar <[email protected]>
I think we can use below fixes tag so that we can pick up this entire
series for the fixes cycle.
Fixes: c943b4948b58 ("drm/msm/dp: add displayPort driver support")
> ---
> drivers/gpu/drm/msm/dp/dp_display.c | 7 +------
> 1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
> index 808a516e84c5..33daec11f813 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -1607,15 +1607,10 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)
> int msm_dp_modeset_init(struct msm_dp *dp_display, struct drm_device *dev,
> struct drm_encoder *encoder)
> {
> - struct msm_drm_private *priv;
> + struct msm_drm_private *priv = dev->dev_private;
> struct dp_display_private *dp_priv;
> int ret;
>
> - if (WARN_ON(!encoder) || WARN_ON(!dp_display) || WARN_ON(!dev))
> - return -EINVAL;
> -
> - priv = dev->dev_private;
> -
> if (priv->num_bridges == ARRAY_SIZE(priv->bridges)) {
> DRM_DEV_ERROR(dev->dev, "too many bridges\n");
> return -ENOSPC;
On Mon, Sep 26, 2022 at 11:21:38AM -0700, Abhinav Kumar wrote:
>
>
> On 9/13/2022 1:53 AM, Johan Hovold wrote:
> > Drop the overly defensive modeset sanity checks of function parameters
> > which have already been checked or used by the callers.
> >
> > Reviewed-by: Dmitry Baryshkov <[email protected]>
> > Signed-off-by: Johan Hovold <[email protected]>
>
> The change LGTM, hence
>
> Reviewed-by: Abhinav Kumar <[email protected]>
>
> I think we can use below fixes tag so that we can pick up this entire
> series for the fixes cycle.
>
> Fixes: 3f0689e66352 ("drm/msm/dsi: check msm_dsi and dsi pointers before
> use")
Same here. I wouldn't add a Fixes tag unless required by some DRM
process.
Johan
On Mon, Sep 26, 2022 at 11:17:20AM -0700, Abhinav Kumar wrote:
> On 9/13/2022 1:53 AM, Johan Hovold wrote:
> > Drop the overly defensive modeset sanity checks of function parameters
> > which have already been checked or used by the callers.
> >
> > Reviewed-by: Dmitry Baryshkov <[email protected]>
> > Signed-off-by: Johan Hovold <[email protected]>
>
> The change LGTM, hence
>
> Reviewed-by: Abhinav Kumar <[email protected]>
>
> I think we can use below fixes tag so that we can pick up this entire
> series for the fixes cycle.
>
> Fixes: c943b4948b58 ("drm/msm/dp: add displayPort driver support")
Perhaps that's a requirement for drm, but I wouldn't add a Fixes tag for
this otherwise as it's not a bug.
You also have to watch out for Sasha and his autosel scripts which will
probably try to backport this to stable if it finds a Fixes tag.
Johan
On 9/27/2022 12:14 AM, Johan Hovold wrote:
> On Mon, Sep 26, 2022 at 11:17:20AM -0700, Abhinav Kumar wrote:
>> On 9/13/2022 1:53 AM, Johan Hovold wrote:
>>> Drop the overly defensive modeset sanity checks of function parameters
>>> which have already been checked or used by the callers.
>>>
>>> Reviewed-by: Dmitry Baryshkov <[email protected]>
>>> Signed-off-by: Johan Hovold <[email protected]>
>>
>> The change LGTM, hence
>>
>> Reviewed-by: Abhinav Kumar <[email protected]>
>>
>> I think we can use below fixes tag so that we can pick up this entire
>> series for the fixes cycle.
>>
>> Fixes: c943b4948b58 ("drm/msm/dp: add displayPort driver support")
>
> Perhaps that's a requirement for drm, but I wouldn't add a Fixes tag for
> this otherwise as it's not a bug.
>
> You also have to watch out for Sasha and his autosel scripts which will
> probably try to backport this to stable if it finds a Fixes tag.
>
> Johan
Discussed with Rob on IRC, we will apply everything except the last two
patches of this series in the -fixes and take these two for the next
kernel rev push.
On 9/27/2022 12:16 AM, Johan Hovold wrote:
> On Mon, Sep 26, 2022 at 11:21:38AM -0700, Abhinav Kumar wrote:
>>
>>
>> On 9/13/2022 1:53 AM, Johan Hovold wrote:
>>> Drop the overly defensive modeset sanity checks of function parameters
>>> which have already been checked or used by the callers.
>>>
>>> Reviewed-by: Dmitry Baryshkov <[email protected]>
>>> Signed-off-by: Johan Hovold <[email protected]>
>>
>> The change LGTM, hence
>>
>> Reviewed-by: Abhinav Kumar <[email protected]>
>>
>> I think we can use below fixes tag so that we can pick up this entire
>> series for the fixes cycle.
>>
>> Fixes: 3f0689e66352 ("drm/msm/dsi: check msm_dsi and dsi pointers before
>> use")
>
> Same here. I wouldn't add a Fixes tag unless required by some DRM
> process.
>
> Johan
Same response as the prev one, we will apply everything except the last
two patches of this series in the -fixes and take these two for the next
kernel rev push.
On Tue, Sep 27, 2022 at 11:42:53AM -0700, Abhinav Kumar wrote:
> On 9/27/2022 12:14 AM, Johan Hovold wrote:
> > On Mon, Sep 26, 2022 at 11:17:20AM -0700, Abhinav Kumar wrote:
> >> On 9/13/2022 1:53 AM, Johan Hovold wrote:
> >>> Drop the overly defensive modeset sanity checks of function parameters
> >>> which have already been checked or used by the callers.
> >>>
> >>> Reviewed-by: Dmitry Baryshkov <[email protected]>
> >>> Signed-off-by: Johan Hovold <[email protected]>
> >>
> >> The change LGTM, hence
> >>
> >> Reviewed-by: Abhinav Kumar <[email protected]>
> >>
> >> I think we can use below fixes tag so that we can pick up this entire
> >> series for the fixes cycle.
> >>
> >> Fixes: c943b4948b58 ("drm/msm/dp: add displayPort driver support")
> >
> > Perhaps that's a requirement for drm, but I wouldn't add a Fixes tag for
> > this otherwise as it's not a bug.
> >
> > You also have to watch out for Sasha and his autosel scripts which will
> > probably try to backport this to stable if it finds a Fixes tag.
> Discussed with Rob on IRC, we will apply everything except the last two
> patches of this series in the -fixes and take these two for the next
> kernel rev push.
So the fixes go in 6.0 and the two follow-on cleanups in 6.1? Or did you
mean 6.1 and 6.2?
Johan
On 9/28/2022 5:24 AM, Johan Hovold wrote:
> On Tue, Sep 27, 2022 at 11:42:53AM -0700, Abhinav Kumar wrote:
>> On 9/27/2022 12:14 AM, Johan Hovold wrote:
>>> On Mon, Sep 26, 2022 at 11:17:20AM -0700, Abhinav Kumar wrote:
>>>> On 9/13/2022 1:53 AM, Johan Hovold wrote:
>>>>> Drop the overly defensive modeset sanity checks of function parameters
>>>>> which have already been checked or used by the callers.
>>>>>
>>>>> Reviewed-by: Dmitry Baryshkov <[email protected]>
>>>>> Signed-off-by: Johan Hovold <[email protected]>
>>>>
>>>> The change LGTM, hence
>>>>
>>>> Reviewed-by: Abhinav Kumar <[email protected]>
>>>>
>>>> I think we can use below fixes tag so that we can pick up this entire
>>>> series for the fixes cycle.
>>>>
>>>> Fixes: c943b4948b58 ("drm/msm/dp: add displayPort driver support")
>>>
>>> Perhaps that's a requirement for drm, but I wouldn't add a Fixes tag for
>>> this otherwise as it's not a bug.
>>>
>>> You also have to watch out for Sasha and his autosel scripts which will
>>> probably try to backport this to stable if it finds a Fixes tag.
>
>> Discussed with Rob on IRC, we will apply everything except the last two
>> patches of this series in the -fixes and take these two for the next
>> kernel rev push.
>
> So the fixes go in 6.0 and the two follow-on cleanups in 6.1? Or did you
> mean 6.1 and 6.2?
>
> Johan
The fixes will go in 6.1 first.
The two follow-on cleanups in 6.2.
Thanks
Abhinav
On Wed, Sep 28, 2022 at 08:33:52AM -0700, Abhinav Kumar wrote:
> On 9/28/2022 5:24 AM, Johan Hovold wrote:
> > On Tue, Sep 27, 2022 at 11:42:53AM -0700, Abhinav Kumar wrote:
> >> Discussed with Rob on IRC, we will apply everything except the last two
> >> patches of this series in the -fixes and take these two for the next
> >> kernel rev push.
> >
> > So the fixes go in 6.0 and the two follow-on cleanups in 6.1? Or did you
> > mean 6.1 and 6.2?
> The fixes will go in 6.1 first.
>
> The two follow-on cleanups in 6.2.
Ok, sounds good. Thanks.
Johan
On Tue, Sep 20, 2022 at 11:06:30AM +0200, Johan Hovold wrote:
> On Tue, Sep 13, 2022 at 10:53:10AM +0200, Johan Hovold wrote:
> > The MSM DRM driver is currently broken in multiple ways with respect to
> > probe deferral. Not only does the driver currently fail to probe again
> > after a late deferral, but due to a related use-after-free bug this also
> > triggers NULL-pointer dereferences.
> >
> > These bugs are not new but have become critical with the release of
> > 5.19 where probe is deferred in case the aux-bus EP panel driver has not
> > yet been loaded.
> >
> > The underlying problem is lifetime issues due to careless use of
> > device-managed resources.
>
> Any chance of getting this merged for 6.1?
Is anyone picking these up as fixes for 6.1-rc as we discussed?
Johan
> > Changes in v2
> > - use a custom devres action instead of amending the AUX bus interface
> > (Doug)
> > - split sanity check fixes and cleanups per bridge type (Dmitry)
> > - add another Fixes tag for the missing bridge counter reset (Dmitry)
> >
> >
> > Johan Hovold (10):
> > drm/msm: fix use-after-free on probe deferral
> > drm/msm/dp: fix memory corruption with too many bridges
> > drm/msm/dsi: fix memory corruption with too many bridges
> > drm/msm/hdmi: fix memory corruption with too many bridges
> > drm/msm/dp: fix IRQ lifetime
> > drm/msm/dp: fix aux-bus EP lifetime
> > drm/msm/dp: fix bridge lifetime
> > drm/msm/hdmi: fix IRQ lifetime
> > drm/msm/dp: drop modeset sanity checks
> > drm/msm/dsi: drop modeset sanity checks
> >
> > drivers/gpu/drm/msm/dp/dp_display.c | 26 +++++++++++++++++++-------
> > drivers/gpu/drm/msm/dp/dp_parser.c | 6 +++---
> > drivers/gpu/drm/msm/dp/dp_parser.h | 5 +++--
> > drivers/gpu/drm/msm/dsi/dsi.c | 9 +++++----
> > drivers/gpu/drm/msm/hdmi/hdmi.c | 7 ++++++-
> > drivers/gpu/drm/msm/msm_drv.c | 1 +
> > 6 files changed, 37 insertions(+), 17 deletions(-)
Hi Johan
On 10/20/2022 11:27 PM, Johan Hovold wrote:
> On Tue, Sep 20, 2022 at 11:06:30AM +0200, Johan Hovold wrote:
>> On Tue, Sep 13, 2022 at 10:53:10AM +0200, Johan Hovold wrote:
>>> The MSM DRM driver is currently broken in multiple ways with respect to
>>> probe deferral. Not only does the driver currently fail to probe again
>>> after a late deferral, but due to a related use-after-free bug this also
>>> triggers NULL-pointer dereferences.
>>>
>>> These bugs are not new but have become critical with the release of
>>> 5.19 where probe is deferred in case the aux-bus EP panel driver has not
>>> yet been loaded.
>>>
>>> The underlying problem is lifetime issues due to careless use of
>>> device-managed resources.
>>
>> Any chance of getting this merged for 6.1?
>
> Is anyone picking these up as fixes for 6.1-rc as we discussed?
>
> Johan
All of these except the last two ( as discussed ) have landed in the
-fixes tree
https://gitlab.freedesktop.org/drm/msm/-/commit/6808abdb33bf90330e70a687d29f038507e06ebb
Thanks
Abhinav
>
>>> Changes in v2
>>> - use a custom devres action instead of amending the AUX bus interface
>>> (Doug)
>>> - split sanity check fixes and cleanups per bridge type (Dmitry)
>>> - add another Fixes tag for the missing bridge counter reset (Dmitry)
>>>
>>>
>>> Johan Hovold (10):
>>> drm/msm: fix use-after-free on probe deferral
>>> drm/msm/dp: fix memory corruption with too many bridges
>>> drm/msm/dsi: fix memory corruption with too many bridges
>>> drm/msm/hdmi: fix memory corruption with too many bridges
>>> drm/msm/dp: fix IRQ lifetime
>>> drm/msm/dp: fix aux-bus EP lifetime
>>> drm/msm/dp: fix bridge lifetime
>>> drm/msm/hdmi: fix IRQ lifetime
>>> drm/msm/dp: drop modeset sanity checks
>>> drm/msm/dsi: drop modeset sanity checks
>>>
>>> drivers/gpu/drm/msm/dp/dp_display.c | 26 +++++++++++++++++++-------
>>> drivers/gpu/drm/msm/dp/dp_parser.c | 6 +++---
>>> drivers/gpu/drm/msm/dp/dp_parser.h | 5 +++--
>>> drivers/gpu/drm/msm/dsi/dsi.c | 9 +++++----
>>> drivers/gpu/drm/msm/hdmi/hdmi.c | 7 ++++++-
>>> drivers/gpu/drm/msm/msm_drv.c | 1 +
>>> 6 files changed, 37 insertions(+), 17 deletions(-)
On Fri, Oct 21, 2022 at 09:05:52AM -0700, Abhinav Kumar wrote:
> Hi Johan
>
> On 10/20/2022 11:27 PM, Johan Hovold wrote:
> > On Tue, Sep 20, 2022 at 11:06:30AM +0200, Johan Hovold wrote:
> >> On Tue, Sep 13, 2022 at 10:53:10AM +0200, Johan Hovold wrote:
> >>> The MSM DRM driver is currently broken in multiple ways with respect to
> >>> probe deferral. Not only does the driver currently fail to probe again
> >>> after a late deferral, but due to a related use-after-free bug this also
> >>> triggers NULL-pointer dereferences.
> >>>
> >>> These bugs are not new but have become critical with the release of
> >>> 5.19 where probe is deferred in case the aux-bus EP panel driver has not
> >>> yet been loaded.
> >>>
> >>> The underlying problem is lifetime issues due to careless use of
> >>> device-managed resources.
> >>
> >> Any chance of getting this merged for 6.1?
> >
> > Is anyone picking these up as fixes for 6.1-rc as we discussed?
> >
> > Johan
>
> All of these except the last two ( as discussed ) have landed in the
> -fixes tree
>
> https://gitlab.freedesktop.org/drm/msm/-/commit/6808abdb33bf90330e70a687d29f038507e06ebb
Ah, perfect, thanks.
When do you expect to send these on so that they end up in linux-next
and eventually Linus's tree?
Note that it looks like something happened with the commit messages when
you applied these. Specifically, the Fixes tags appears to now have a
line break in them and there's also some random whitespace before your
SoB:
Fixes: c3bf8e21
("drm/msm/dp: Add eDP support via aux_bus")
Cc: [email protected] # 5.19
Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Tested-by: Kuogee Hsieh <[email protected]>
Reviewed-by: Kuogee Hsieh <[email protected]>
Patchwork: https://patchwork.freedesktop.org/patch/502667/
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Abhinav Kumar's avatarAbhinav Kumar <[email protected]>
It's possible just the gitlab UI that's messed up, but perhaps you can
double check before they hit linux-next, which should complain about
this otherwise.
Johan
On Mon, Oct 24, 2022 at 4:34 AM Johan Hovold <[email protected]> wrote:
>
> On Fri, Oct 21, 2022 at 09:05:52AM -0700, Abhinav Kumar wrote:
> > Hi Johan
> >
> > On 10/20/2022 11:27 PM, Johan Hovold wrote:
> > > On Tue, Sep 20, 2022 at 11:06:30AM +0200, Johan Hovold wrote:
> > >> On Tue, Sep 13, 2022 at 10:53:10AM +0200, Johan Hovold wrote:
> > >>> The MSM DRM driver is currently broken in multiple ways with respect to
> > >>> probe deferral. Not only does the driver currently fail to probe again
> > >>> after a late deferral, but due to a related use-after-free bug this also
> > >>> triggers NULL-pointer dereferences.
> > >>>
> > >>> These bugs are not new but have become critical with the release of
> > >>> 5.19 where probe is deferred in case the aux-bus EP panel driver has not
> > >>> yet been loaded.
> > >>>
> > >>> The underlying problem is lifetime issues due to careless use of
> > >>> device-managed resources.
> > >>
> > >> Any chance of getting this merged for 6.1?
> > >
> > > Is anyone picking these up as fixes for 6.1-rc as we discussed?
> > >
> > > Johan
> >
> > All of these except the last two ( as discussed ) have landed in the
> > -fixes tree
> >
> > https://gitlab.freedesktop.org/drm/msm/-/commit/6808abdb33bf90330e70a687d29f038507e06ebb
>
> Ah, perfect, thanks.
>
> When do you expect to send these on so that they end up in linux-next
> and eventually Linus's tree?
I'll send a -fixes PR this week
> Note that it looks like something happened with the commit messages when
> you applied these. Specifically, the Fixes tags appears to now have a
> line break in them and there's also some random whitespace before your
> SoB:
>
> Fixes: c3bf8e21
>
> ("drm/msm/dp: Add eDP support via aux_bus")
naw, that is just some problem with gitlab's html generation, the
actual patch is fine ;-)
BR,
-R
> Cc: [email protected] # 5.19
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Johan Hovold <[email protected]>
> Tested-by: Kuogee Hsieh <[email protected]>
> Reviewed-by: Kuogee Hsieh <[email protected]>
> Patchwork: https://patchwork.freedesktop.org/patch/502667/
> Link: https://lore.kernel.org/r/[email protected]
>
>
> Signed-off-by: Abhinav Kumar's avatarAbhinav Kumar <[email protected]>
>
> It's possible just the gitlab UI that's messed up, but perhaps you can
> double check before they hit linux-next, which should complain about
> this otherwise.
>
> Johan