2021-09-26 19:04:08

by Rob Clark

Subject: [PATCH] drm/msm: Fix crash on dev file close

From: Rob Clark <[email protected]>

If the device file was opened prior to fw being available (such as from
initrd before rootfs is mounted, when the initrd does not contain GPU
fw), that would cause a later crash when the dev file is closed due to
an uninitialized submitqueues list:

CPU: 4 PID: 263 Comm: plymouthd Tainted: G W 5.15.0-rc2-next-20210924 #2
Hardware name: LENOVO 81JL/LNVNB161216, BIOS 9UCN33WW(V2.06) 06/ 4/2019
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : msm_submitqueue_close+0x30/0x190 [msm]
lr : msm_postclose+0x54/0xf0 [msm]
sp : ffff80001074bb80
x29: ffff80001074bb80 x28: ffff03ad80c4db80 x27: ffff03ad80dc5ab0
x26: 0000000000000000 x25: ffff03ad80dc5af8 x24: ffff03ad81e90800
x23: 0000000000000000 x22: ffff03ad81e90800 x21: ffff03ad8b35e788
x20: ffff03ad81e90878 x19: 0000000000000000 x18: 0000000000000000
x17: 0000000000000000 x16: ffffda15f14f7940 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000001 x12: 0000000000000040
x11: 0000000000000000 x10: 0000000000000000 x9 : ffffda15cd18ff88
x8 : ffff03ad80c4db80 x7 : 0000000000000228 x6 : 0000000000000000
x5 : 1793a4e807e636bd x4 : ffff03ad80c4db80 x3 : ffff03ad81e90878
x2 : 0000000000000000 x1 : ffff03ad80c4db80 x0 : 0000000000000000
Call trace:
msm_submitqueue_close+0x30/0x190 [msm]
msm_postclose+0x54/0xf0 [msm]
drm_file_free.part.0+0x1cc/0x2e0 [drm]
drm_close_helper.isra.0+0x74/0x84 [drm]
drm_release+0x78/0x120 [drm]
__fput+0x78/0x23c
____fput+0x1c/0x30
task_work_run+0xcc/0x22c
do_exit+0x304/0x9f4
do_group_exit+0x44/0xb0
__wake_up_parent+0x0/0x3c
invoke_syscall+0x50/0x120
el0_svc_common.constprop.0+0x4c/0xf4
do_el0_svc+0x30/0x9c
el0_svc+0x20/0x60
el0t_64_sync_handler+0xe8/0xf0
el0t_64_sync+0x1a0/0x1a4
Code: aa0003f5 a90153f3 f8408eb3 aa1303e0 (f85e8674)
---[ end trace 39b2fa37509a2be2 ]---
Fixing recursive fault but reboot is needed!

Fixes: 86c2a0f000c1 ("drm/msm: Small submitqueue creation cleanup")
Reported-by: Steev Klimaszewski <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/msm_drv.c | 3 +++
drivers/gpu/drm/msm/msm_submitqueue.c | 4 ----
2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index f350de754f84..938765ad7109 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -689,6 +689,9 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
if (!ctx)
return -ENOMEM;

+ INIT_LIST_HEAD(&ctx->submitqueues);
+ rwlock_init(&ctx->queuelock);
+
kref_init(&ctx->ref);
msm_submitqueue_init(dev, ctx);

diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c b/drivers/gpu/drm/msm/msm_submitqueue.c
index 32a55d81b58b..7ce0771b5582 100644
--- a/drivers/gpu/drm/msm/msm_submitqueue.c
+++ b/drivers/gpu/drm/msm/msm_submitqueue.c
@@ -140,10 +140,6 @@ int msm_submitqueue_init(struct drm_device *drm, struct msm_file_private *ctx)
*/
default_prio = DIV_ROUND_UP(max_priority, 2);

- INIT_LIST_HEAD(&ctx->submitqueues);
-
- rwlock_init(&ctx->queuelock);
-
return msm_submitqueue_create(drm, ctx, default_prio, 0, NULL);
}

--
2.31.1
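
For readers following the reasoning, here is a minimal user-space sketch of
why the list head has to be set up before anything that can bail out. It is
not the driver code: struct file_ctx, struct queue, sketch_context_init() and
sketch_context_close() are hypothetical stand-ins, and the early return in
sketch_submitqueue_init() is an assumption modelled on the "fw not yet
available" case described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void sketch_init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void sketch_list_add_tail(struct list_head *n, struct list_head *h)
{
	n->next = h;
	n->prev = h->prev;
	h->prev->next = n;
	h->prev = n;
}

/* Hypothetical stand-in for the per-file context. */
struct file_ctx {
	struct list_head submitqueues;
};

/* Hypothetical queue object; node is the first member so a cast is valid. */
struct queue {
	struct list_head node;
};

/* Assumed behaviour: returns early when the GPU is not usable yet. */
static int sketch_submitqueue_init(struct file_ctx *ctx, bool gpu_ready)
{
	struct queue *q;

	if (!gpu_ready)
		return 0; /* nothing below this line runs */

	q = calloc(1, sizeof(*q));
	if (!q)
		return -1;
	sketch_list_add_tail(&q->node, &ctx->submitqueues);
	return 0;
}

/* Mirrors the patched ordering: initialize the list unconditionally. */
struct file_ctx *sketch_context_init(bool gpu_ready)
{
	struct file_ctx *ctx = calloc(1, sizeof(*ctx)); /* zeroed, like kzalloc */

	if (!ctx)
		return NULL;
	sketch_init_list_head(&ctx->submitqueues);
	sketch_submitqueue_init(ctx, gpu_ready);
	return ctx;
}

/* Walks and frees every queue; returns how many were freed. */
int sketch_context_close(struct file_ctx *ctx)
{
	struct list_head *pos = ctx->submitqueues.next;
	int freed = 0;

	/* A zeroed, never-initialized head would have next == NULL here. */
	assert(pos != NULL);
	while (pos != &ctx->submitqueues) {
		struct list_head *next = pos->next;

		free((struct queue *)pos);
		freed++;
		pos = next;
	}
	free(ctx);
	return freed;
}
```

With the list initialized up front, closing a context that never got a queue
simply walks an empty list. If the initialization lived after the early
return instead, the head would still be all zeroes at close time and the walk
would dereference a NULL pointer.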


2021-09-26 19:36:00

by Steev Klimaszewski

Subject: Re: [PATCH] drm/msm: Fix crash on dev file close


On 9/26/21 2:05 PM, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> If the device file was opened prior to fw being available (such as from
> initrd before rootfs is mounted, when the initrd does not contain GPU
> fw), that would cause a later crash when the dev file is closed due to
> an uninitialized submitqueues list:

Have not seen the crash since applying the patch.

Tested-by: Steev Klimaszewski <[email protected]>

2021-09-26 19:39:47

by Dmitry Baryshkov

Subject: Re: [PATCH] drm/msm: Fix crash on dev file close

On Sun, 26 Sept 2021 at 22:01, Rob Clark <[email protected]> wrote:
>
> From: Rob Clark <[email protected]>
>
> If the device file was opened prior to fw being available (such as from
> initrd before rootfs is mounted, when the initrd does not contain GPU
> fw), that would cause a later crash when the dev file is closed due to
> an uninitialized submitqueues list:

Reviewed-by: Dmitry Baryshkov <[email protected]>

I sent a similar version of this patch a day or so ago, but yours is
better, as I did not touch the rwlock init.



--
With best wishes
Dmitry

2021-09-27 15:18:42

by Rob Clark

Subject: Re: [PATCH] drm/msm: Fix crash on dev file close

On Sun, Sep 26, 2021 at 12:36 PM Dmitry Baryshkov
<[email protected]> wrote:
>
> On Sun, 26 Sept 2021 at 22:01, Rob Clark <[email protected]> wrote:
> >
> > From: Rob Clark <[email protected]>
> >
> > If the device file was opened prior to fw being available (such as from
> > initrd before rootfs is mounted, when the initrd does not contain GPU
> > fw), that would cause a later crash when the dev file is closed due to
> > an uninitialized submitqueues list:
>
> Reviewed-by: Dmitry Baryshkov <[email protected]>
>
> I sent a similar version of this patch a day or so ago, but yours is
> better, as I did not touch the rwlock init.

Thanks, and sorry I did not see your patch earlier.

BR,
-R
