2023-06-06 14:58:39

by Erico Nunes

Subject: [PATCH] drm/lima: fix sched context destroy

The drm sched entity must be flushed before finishing, to account for
jobs potentially still in flight at that time.
Lima did not do this flush until now, so switch the destroy call to the
drm_sched_entity_destroy() wrapper which will take care of that.

This fixes a regression on lima which started since the rework in
commit 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
where some specific types of applications may hang indefinitely.

Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
Signed-off-by: Erico Nunes <[email protected]>
---
 drivers/gpu/drm/lima/lima_sched.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index ff003403fbbc..ffd91a5ee299 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -165,7 +165,7 @@ int lima_sched_context_init(struct lima_sched_pipe *pipe,
 void lima_sched_context_fini(struct lima_sched_pipe *pipe,
 			     struct lima_sched_context *context)
 {
-	drm_sched_entity_fini(&context->base);
+	drm_sched_entity_destroy(&context->base);
 }
 
 struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
--
2.40.1
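
Editor's note: the difference between the two calls in the patch above can be
illustrated with a small userspace model. This is only a sketch: the
mock_entity type and the mock_* helpers are hypothetical stand-ins, not the
DRM scheduler API, and "flushing" is reduced to draining a counter. It shows
only the ordering the commit message describes, namely that the destroy
wrapper flushes before it finishes the entity.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct drm_sched_entity. */
struct mock_entity {
	int queued_jobs;   /* jobs still waiting in the entity queue */
	bool fini_called;  /* set once teardown has run */
	int jobs_at_fini;  /* queue depth observed when teardown ran */
};

/* Models the flush step: wait until every queued job has been retired. */
static void mock_entity_flush(struct mock_entity *e)
{
	while (e->queued_jobs > 0)
		e->queued_jobs--; /* pretend the hardware retired one job */
}

/* Models the fini step: tear down immediately, without waiting. */
static void mock_entity_fini(struct mock_entity *e)
{
	e->jobs_at_fini = e->queued_jobs;
	e->fini_called = true;
}

/* Models the destroy wrapper: flush first, then fini. */
static void mock_entity_destroy(struct mock_entity *e)
{
	mock_entity_flush(e);
	mock_entity_fini(e);
}
```

In this model, calling mock_entity_fini() directly on an entity with queued
jobs records a non-zero jobs_at_fini, which corresponds to the situation the
patch avoids; mock_entity_destroy() always records zero.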



2023-06-07 01:44:03

by Vasily Khoruzhick

Subject: Re: [Lima] [PATCH] drm/lima: fix sched context destroy

On Tue, Jun 6, 2023 at 7:33 AM Erico Nunes <[email protected]> wrote:
>
> The drm sched entity must be flushed before finishing, to account for
> jobs potentially still in flight at that time.
> Lima did not do this flush until now, so switch the destroy call to the
> drm_sched_entity_destroy() wrapper which will take care of that.
>
> This fixes a regression on lima which started since the rework in
> commit 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
> where some specific types of applications may hang indefinitely.
>
> Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
> Signed-off-by: Erico Nunes <[email protected]>

Reviewed-by: Vasily Khoruzhick <[email protected]>

> ---
> drivers/gpu/drm/lima/lima_sched.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> index ff003403fbbc..ffd91a5ee299 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -165,7 +165,7 @@ int lima_sched_context_init(struct lima_sched_pipe *pipe,
> void lima_sched_context_fini(struct lima_sched_pipe *pipe,
> struct lima_sched_context *context)
> {
> - drm_sched_entity_fini(&context->base);
> + drm_sched_entity_destroy(&context->base);
> }
>
> struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
> --
> 2.40.1
>

2023-06-07 04:33:36

by Qiang Yu

Subject: Re: [Lima] [PATCH] drm/lima: fix sched context destroy

Reviewed-by: Qiang Yu <[email protected]>

Applied to drm-misc-fixes.

On Wed, Jun 7, 2023 at 9:18 AM Vasily Khoruzhick <[email protected]> wrote:
>
> On Tue, Jun 6, 2023 at 7:33 AM Erico Nunes <[email protected]> wrote:
> >
> > The drm sched entity must be flushed before finishing, to account for
> > jobs potentially still in flight at that time.
> > Lima did not do this flush until now, so switch the destroy call to the
> > drm_sched_entity_destroy() wrapper which will take care of that.
> >
> > This fixes a regression on lima which started since the rework in
> > commit 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
> > where some specific types of applications may hang indefinitely.
> >
> > Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
> > Signed-off-by: Erico Nunes <[email protected]>
>
> Reviewed-by: Vasily Khoruzhick <[email protected]>
>
> > ---
> > drivers/gpu/drm/lima/lima_sched.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> > index ff003403fbbc..ffd91a5ee299 100644
> > --- a/drivers/gpu/drm/lima/lima_sched.c
> > +++ b/drivers/gpu/drm/lima/lima_sched.c
> > @@ -165,7 +165,7 @@ int lima_sched_context_init(struct lima_sched_pipe *pipe,
> > void lima_sched_context_fini(struct lima_sched_pipe *pipe,
> > struct lima_sched_context *context)
> > {
> > - drm_sched_entity_fini(&context->base);
> > + drm_sched_entity_destroy(&context->base);
> > }
> >
> > struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
> > --
> > 2.40.1
> >

2023-06-07 09:59:22

by Christian König

Subject: Re: [Lima] [PATCH] drm/lima: fix sched context destroy

Acked-by: Christian König <[email protected]>

While you are at it: It's beneficial for drivers to implement the flush
callback on the file descriptor.

This way you can still send a SIGKILL when a terminating application
waits for the entity to be flushed out to the hardware and all the
pending jobs are canceled.

Regards,
Christian.

On 07.06.23 at 06:04, Qiang Yu wrote:
> Reviewed-by: Qiang Yu <[email protected]>
>
> Applied to drm-misc-fixes.
>
> On Wed, Jun 7, 2023 at 9:18 AM Vasily Khoruzhick <[email protected]> wrote:
>> On Tue, Jun 6, 2023 at 7:33 AM Erico Nunes <[email protected]> wrote:
>>> The drm sched entity must be flushed before finishing, to account for
>>> jobs potentially still in flight at that time.
>>> Lima did not do this flush until now, so switch the destroy call to the
>>> drm_sched_entity_destroy() wrapper which will take care of that.
>>>
>>> This fixes a regression on lima which started since the rework in
>>> commit 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
>>> where some specific types of applications may hang indefinitely.
>>>
>>> Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
>>> Signed-off-by: Erico Nunes <[email protected]>
>> Reviewed-by: Vasily Khoruzhick <[email protected]>
>>
>>> ---
>>> drivers/gpu/drm/lima/lima_sched.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>> index ff003403fbbc..ffd91a5ee299 100644
>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>> @@ -165,7 +165,7 @@ int lima_sched_context_init(struct lima_sched_pipe *pipe,
>>> void lima_sched_context_fini(struct lima_sched_pipe *pipe,
>>> struct lima_sched_context *context)
>>> {
>>> - drm_sched_entity_fini(&context->base);
>>> + drm_sched_entity_destroy(&context->base);
>>> }
>>>
>>> struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
>>> --
>>> 2.40.1
>>>
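
Editor's note: the last reply suggests implementing the flush callback on the
file descriptor (the `flush` member of `struct file_operations`), so that a
terminating application stuck waiting for its entity can still be killed. The
userspace model below is a rough sketch of that behaviour under stated
assumptions: mock_ctx, mock_fd_flush(), the retire_budget parameter and the
flush_result values are all hypothetical and are not part of lima or the DRM
scheduler. It models only the three outcomes: the queue drains, the wait is
cut short by SIGKILL and the pending jobs are cancelled, or the caller keeps
waiting.

```c
#include <stdbool.h>

/* Hypothetical per-file context holding one scheduler entity's queue. */
struct mock_ctx {
	int pending_jobs;   /* jobs not yet pushed to the hardware */
	int cancelled_jobs; /* jobs dropped because the process was killed */
};

enum flush_result {
	FLUSH_DRAINED,       /* every pending job reached the hardware */
	FLUSH_KILLED,        /* SIGKILL arrived; remaining jobs were cancelled */
	FLUSH_STILL_WAITING  /* jobs remain and nothing interrupted the wait */
};

/*
 * Models a flush-on-close hook. retire_budget is an assumption of this
 * sketch: the number of jobs the hardware manages to accept before the
 * caller would have to block.
 */
static enum flush_result mock_fd_flush(struct mock_ctx *ctx,
				       int retire_budget,
				       bool sigkill_pending)
{
	while (ctx->pending_jobs > 0 && retire_budget > 0) {
		ctx->pending_jobs--;
		retire_budget--;
	}

	if (ctx->pending_jobs == 0)
		return FLUSH_DRAINED;

	if (sigkill_pending) {
		/* A killed process gives up waiting and cancels the rest. */
		ctx->cancelled_jobs = ctx->pending_jobs;
		ctx->pending_jobs = 0;
		return FLUSH_KILLED;
	}

	return FLUSH_STILL_WAITING;
}
```

A real driver would perform the wait inside the scheduler rather than with a
counter; the sketch is only meant to show why a flush hook gives SIGKILL a
point at which to take effect.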