2020-08-13 00:03:49

by Rob Clark

[permalink] [raw]
Subject: [PATCH] drm/msm/adreno: fix updating ring fence

From: Rob Clark <[email protected]>

We need to set it to the most recent completed fence, not the most
recent submitted. Otherwise we have races where we think we can retire
submits that the GPU is not finished with, if the GPU doesn't manage to
overwrite the seqno before we look at it.

This can show up with hang recovery if one of the submits after the
crashing submit also hangs after it is replayed.

Fixes: f97decac5f4c ("drm/msm: Support multiple ringbuffers")
Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index f9e3badf2fca..34e6242c1767 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -405,7 +405,7 @@ int adreno_hw_init(struct msm_gpu *gpu)
ring->next = ring->start;

/* reset completed fence seqno: */
- ring->memptrs->fence = ring->seqno;
+ ring->memptrs->fence = ring->fctx->completed_fence;
ring->memptrs->rptr = 0;
}

--
2.26.2


2020-08-17 22:35:42

by Jordan Crouse

[permalink] [raw]
Subject: Re: [PATCH] drm/msm/adreno: fix updating ring fence

On Wed, Aug 12, 2020 at 05:03:09PM -0700, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> We need to set it to the most recent completed fence, not the most
> recent submitted. Otherwise we have races where we think we can retire
> submits that the GPU is not finished with, if the GPU doesn't manage to
> overwrite the seqno before we look at it.
>
> This can show up with hang recovery if one of the submits after the
> crashing submit also hangs after it is replayed.

Reviewed-by: Jordan Crouse <[email protected]>

> Fixes: f97decac5f4c ("drm/msm: Support multiple ringbuffers")
> Signed-off-by: Rob Clark <[email protected]>
> ---
> drivers/gpu/drm/msm/adreno/adreno_gpu.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index f9e3badf2fca..34e6242c1767 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -405,7 +405,7 @@ int adreno_hw_init(struct msm_gpu *gpu)
> ring->next = ring->start;
>
> /* reset completed fence seqno: */
> - ring->memptrs->fence = ring->seqno;
> + ring->memptrs->fence = ring->fctx->completed_fence;
> ring->memptrs->rptr = 0;
> }
>
> --
> 2.26.2
>

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project