2022-10-24 20:24:14

by Greg Kroah-Hartman

Subject: [PATCH 5.10 384/390] Revert "drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega"

From: Shuah Khan <[email protected]>

This reverts commit 9f55f36f749a7608eeef57d7d72991a9bd557341 which is
commit e3163bc8ffdfdb405e10530b140135b2ee487f89 upstream.

This commit causes repeated WARN_ONs from

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7391
amdgpu_dm_atomic_commit_tail+0x23b9/0x2430 [amdgpu]

dmesg fills up with these messages and drm initialization takes
a very long time.

Cc: <[email protected]> # 5.10
Signed-off-by: Shuah Khan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 5 -----
drivers/gpu/drm/amd/amdgpu/soc15.c | 25 +++++++++++++++++++++++++
2 files changed, 25 insertions(+), 5 deletions(-)

--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -1475,11 +1475,6 @@ static int sdma_v4_0_start(struct amdgpu
 		WREG32_SDMA(i, mmSDMA0_CNTL, temp);

 		if (!amdgpu_sriov_vf(adev)) {
-			ring = &adev->sdma.instance[i].ring;
-			adev->nbio.funcs->sdma_doorbell_range(adev, i,
-				ring->use_doorbell, ring->doorbell_index,
-				adev->doorbell_index.sdma_doorbell_range);
-
 			/* unhalt engine */
 			temp = RREG32_SDMA(i, mmSDMA0_F32_CNTL);
 			temp = REG_SET_FIELD(temp, SDMA0_F32_CNTL, HALT, 0);
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1332,6 +1332,25 @@ static int soc15_common_sw_fini(void *ha
 	return 0;
 }

+static void soc15_doorbell_range_init(struct amdgpu_device *adev)
+{
+	int i;
+	struct amdgpu_ring *ring;
+
+	/* sdma/ih doorbell range are programed by hypervisor */
+	if (!amdgpu_sriov_vf(adev)) {
+		for (i = 0; i < adev->sdma.num_instances; i++) {
+			ring = &adev->sdma.instance[i].ring;
+			adev->nbio.funcs->sdma_doorbell_range(adev, i,
+				ring->use_doorbell, ring->doorbell_index,
+				adev->doorbell_index.sdma_doorbell_range);
+		}
+
+		adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
+						adev->irq.ih.doorbell_index);
+	}
+}
+
 static int soc15_common_hw_init(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
@@ -1351,6 +1370,12 @@ static int soc15_common_hw_init(void *ha

 	/* enable the doorbell aperture */
 	soc15_enable_doorbell_aperture(adev, true);
+	/* HW doorbell routing policy: doorbell writing not
+	 * in SDMA/IH/MM/ACV range will be routed to CP. So
+	 * we need to init SDMA/IH/MM/ACV doorbell range prior
+	 * to CP ip block init and ring test.
+	 */
+	soc15_doorbell_range_init(adev);

 	return 0;
 }
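
The routing policy described in the comment above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not amdgpu code: the names db_router, db_router_program() and db_router_route(), the target enum and the doorbell index values are all hypothetical. It shows why a doorbell write for an engine whose range has not been programmed yet would land on CP, which is the ordering concern the restored soc15_doorbell_range_init() call addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not amdgpu code: which block receives a doorbell write. */
enum db_target { DB_TARGET_CP, DB_TARGET_SDMA, DB_TARGET_IH };

struct db_range {
	uint32_t first;        /* first doorbell index claimed by the block */
	uint32_t count;        /* number of consecutive indices claimed */
	enum db_target target; /* block that owns the range */
};

#define DB_MAX_RANGES 4

struct db_router {
	struct db_range ranges[DB_MAX_RANGES];
	size_t nranges;
};

/* Claim a range for a block; loosely plays the role of the
 * sdma_doorbell_range()/ih_doorbell_range() callbacks in the patch. */
static int db_router_program(struct db_router *r, enum db_target target,
			     uint32_t first, uint32_t count)
{
	if (r->nranges >= DB_MAX_RANGES)
		return 0;
	r->ranges[r->nranges].first = first;
	r->ranges[r->nranges].count = count;
	r->ranges[r->nranges].target = target;
	r->nranges++;
	return 1;
}

/* Policy from the patch comment: a write outside every programmed
 * range is routed to CP. */
static enum db_target db_router_route(const struct db_router *r, uint32_t index)
{
	size_t i;

	for (i = 0; i < r->nranges; i++) {
		const struct db_range *rg = &r->ranges[i];

		if (index >= rg->first && index - rg->first < rg->count)
			return rg->target;
	}
	return DB_TARGET_CP;
}

/* Walk through the ordering problem with made-up index values. */
static void db_router_selfcheck(void)
{
	struct db_router r = { .nranges = 0 };
	const uint32_t sdma_db = 0x100; /* made-up SDMA doorbell index */

	/* Range not programmed yet: the SDMA doorbell is routed to CP. */
	assert(db_router_route(&r, sdma_db) == DB_TARGET_CP);

	/* After programming, the same write reaches SDMA. */
	assert(db_router_program(&r, DB_TARGET_SDMA, 0x100, 4));
	assert(db_router_route(&r, sdma_db) == DB_TARGET_SDMA);

	/* Indices just past the range still fall through to CP. */
	assert(db_router_route(&r, 0x104) == DB_TARGET_CP);
}
```

In this model, calling db_router_program() for every engine before any ring is exercised corresponds to running the range setup from the common hw_init path rather than from each engine's own start routine.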



2022-10-25 09:22:19

by Salvatore Bonaccorso

Subject: Re: [PATCH 5.10 384/390] Revert "drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega"

Hi Greg,

On Mon, Oct 24, 2022 at 01:33:01PM +0200, Greg Kroah-Hartman wrote:
> From: Shuah Khan <[email protected]>
>
> This reverts commit 9f55f36f749a7608eeef57d7d72991a9bd557341 which is
> commit e3163bc8ffdfdb405e10530b140135b2ee487f89 upstream.
>
> This commit causes repeated WARN_ONs from
>
> drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7391
> amdgpu_dm_atomic_commit_tail+0x23b9/0x2430 [amdgpu]
>
> dmesg fills up with these messages and drm initialization takes
> a very long time.
>
> Cc: <[email protected]> # 5.10
> Signed-off-by: Shuah Khan <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
> ---
> drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 5 -----
> drivers/gpu/drm/amd/amdgpu/soc15.c | 25 +++++++++++++++++++++++++
> 2 files changed, 25 insertions(+), 5 deletions(-)
>
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> @@ -1475,11 +1475,6 @@ static int sdma_v4_0_start(struct amdgpu
> WREG32_SDMA(i, mmSDMA0_CNTL, temp);
>
> if (!amdgpu_sriov_vf(adev)) {
> - ring = &adev->sdma.instance[i].ring;
> - adev->nbio.funcs->sdma_doorbell_range(adev, i,
> - ring->use_doorbell, ring->doorbell_index,
> - adev->doorbell_index.sdma_doorbell_range);
> -
> /* unhalt engine */
> temp = RREG32_SDMA(i, mmSDMA0_F32_CNTL);
> temp = REG_SET_FIELD(temp, SDMA0_F32_CNTL, HALT, 0);
> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> @@ -1332,6 +1332,25 @@ static int soc15_common_sw_fini(void *ha
> return 0;
> }
>
> +static void soc15_doorbell_range_init(struct amdgpu_device *adev)
> +{
> + int i;
> + struct amdgpu_ring *ring;
> +
> + /* sdma/ih doorbell range are programed by hypervisor */
> + if (!amdgpu_sriov_vf(adev)) {
> + for (i = 0; i < adev->sdma.num_instances; i++) {
> + ring = &adev->sdma.instance[i].ring;
> + adev->nbio.funcs->sdma_doorbell_range(adev, i,
> + ring->use_doorbell, ring->doorbell_index,
> + adev->doorbell_index.sdma_doorbell_range);
> + }
> +
> + adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
> + adev->irq.ih.doorbell_index);
> + }
> +}
> +
> static int soc15_common_hw_init(void *handle)
> {
> struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> @@ -1351,6 +1370,12 @@ static int soc15_common_hw_init(void *ha
>
> /* enable the doorbell aperture */
> soc15_enable_doorbell_aperture(adev, true);
> + /* HW doorbell routing policy: doorbell writing not
> + * in SDMA/IH/MM/ACV range will be routed to CP. So
> + * we need to init SDMA/IH/MM/ACV doorbell range prior
> + * to CP ip block init and ring test.
> + */
> + soc15_doorbell_range_init(adev);
>
> return 0;
> }

Can you please also revert 7b0db849ea030a70b8fb9c9afec67c81f955482e
on top?

See https://lore.kernel.org/stable/BL1PR12MB5144F3CC640A18DF0C36E414F72E9@BL1PR12MB5144.namprd12.prod.outlook.com/

Both of these reverts need to be applied to fix regressions which were
reported in https://gitlab.freedesktop.org/drm/amd/-/issues/2216 and
downstream in Debian (https://bugs.debian.org/1022025).

If this is no longer possible for 5.10.150, can you pick the revert
for 5.10.151?

Regards,
Salvatore

2022-10-25 15:36:00

by Greg Kroah-Hartman

Subject: Re: [PATCH 5.10 384/390] Revert "drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega"

On Tue, Oct 25, 2022 at 11:02:33AM +0200, Salvatore Bonaccorso wrote:
> Hi Greg,
>
> On Mon, Oct 24, 2022 at 01:33:01PM +0200, Greg Kroah-Hartman wrote:
> > From: Shuah Khan <[email protected]>
> >
> > This reverts commit 9f55f36f749a7608eeef57d7d72991a9bd557341 which is
> > commit e3163bc8ffdfdb405e10530b140135b2ee487f89 upstream.
> >
> > This commit causes repeated WARN_ONs from
> >
> > drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7391
> > amdgpu_dm_atomic_commit_tail+0x23b9/0x2430 [amdgpu]
> >
> > dmesg fills up with these messages and drm initialization takes
> > a very long time.
> >
> > Cc: <[email protected]> # 5.10
> > Signed-off-by: Shuah Khan <[email protected]>
> > Signed-off-by: Greg Kroah-Hartman <[email protected]>
> > ---
> > drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 5 -----
> > drivers/gpu/drm/amd/amdgpu/soc15.c | 25 +++++++++++++++++++++++++
> > 2 files changed, 25 insertions(+), 5 deletions(-)
> >
> > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > @@ -1475,11 +1475,6 @@ static int sdma_v4_0_start(struct amdgpu
> > WREG32_SDMA(i, mmSDMA0_CNTL, temp);
> >
> > if (!amdgpu_sriov_vf(adev)) {
> > - ring = &adev->sdma.instance[i].ring;
> > - adev->nbio.funcs->sdma_doorbell_range(adev, i,
> > - ring->use_doorbell, ring->doorbell_index,
> > - adev->doorbell_index.sdma_doorbell_range);
> > -
> > /* unhalt engine */
> > temp = RREG32_SDMA(i, mmSDMA0_F32_CNTL);
> > temp = REG_SET_FIELD(temp, SDMA0_F32_CNTL, HALT, 0);
> > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > @@ -1332,6 +1332,25 @@ static int soc15_common_sw_fini(void *ha
> > return 0;
> > }
> >
> > +static void soc15_doorbell_range_init(struct amdgpu_device *adev)
> > +{
> > + int i;
> > + struct amdgpu_ring *ring;
> > +
> > + /* sdma/ih doorbell range are programed by hypervisor */
> > + if (!amdgpu_sriov_vf(adev)) {
> > + for (i = 0; i < adev->sdma.num_instances; i++) {
> > + ring = &adev->sdma.instance[i].ring;
> > + adev->nbio.funcs->sdma_doorbell_range(adev, i,
> > + ring->use_doorbell, ring->doorbell_index,
> > + adev->doorbell_index.sdma_doorbell_range);
> > + }
> > +
> > + adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
> > + adev->irq.ih.doorbell_index);
> > + }
> > +}
> > +
> > static int soc15_common_hw_init(void *handle)
> > {
> > struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> > @@ -1351,6 +1370,12 @@ static int soc15_common_hw_init(void *ha
> >
> > /* enable the doorbell aperture */
> > soc15_enable_doorbell_aperture(adev, true);
> > + /* HW doorbell routing policy: doorbell writing not
> > + * in SDMA/IH/MM/ACV range will be routed to CP. So
> > + * we need to init SDMA/IH/MM/ACV doorbell range prior
> > + * to CP ip block init and ring test.
> > + */
> > + soc15_doorbell_range_init(adev);
> >
> > return 0;
> > }
>
> Can you please also revert 7b0db849ea030a70b8fb9c9afec67c81f955482e
> on top?
>
> See https://lore.kernel.org/stable/BL1PR12MB5144F3CC640A18DF0C36E414F72E9@BL1PR12MB5144.namprd12.prod.outlook.com/
>
> Both of these reverts need to be applied to fix regressions which were
> reported in https://gitlab.freedesktop.org/drm/amd/-/issues/2216 and
> downstream in Debian (https://bugs.debian.org/1022025).
>
> If this is no longer possible for 5.10.150, can you pick the revert
> for 5.10.151?

Now queued up.

greg k-h