2024-05-26 14:06:44

by Mikhail Gavrilov

Subject: 6.10/bisected/regression - commits bc87d666c05 and 6d4279cb99ac cause appearing green flashing bar on top of screen on Radeon 6900XT and 120Hz

Hi,
The day before yesterday I swapped my 7900XTX for a 6900XT to find out
in which kernel the warning message "DMA-API: amdgpu 0000:0f:00.0:
cacheline tracking EEXIST, overlapping mappings aren't supported" first
appeared.
(Kernel 6.3 and older won't boot on a computer with a Radeon 7900XTX.)
When I booted the system with the 6900XT I saw a green flashing bar at
the top of the screen whenever I typed commands in a GNOME Terminal
window that was maximized to full screen.
Demonstration: https://youtu.be/tTvwQ_5pRkk
To reproduce it you need a Radeon 6900XT GPU connected to a 120Hz OLED TV over HDMI.

I bisected the issue and the first commit which I found was 6d4279cb99ac.
commit 6d4279cb99ac4f51d10409501d29969f687ac8dc (HEAD)
Author: Rodrigo Siqueira <[email protected]>
Date: Tue Mar 26 10:42:05 2024 -0600

drm/amd/display: Drop legacy code

This commit removes code that are not used by display anymore.

Acked-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

 drivers/gpu/drm/amd/display/dc/inc/hw/stream_encoder.h         |  4 ----
 drivers/gpu/drm/amd/display/dc/inc/resource.h                  |  7 -------
 drivers/gpu/drm/amd/display/dc/optc/dcn20/dcn20_optc.c         | 10 ----------
 drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c | 33 +--------------------------------
 4 files changed, 1 insertion(+), 53 deletions(-)

After bisecting I usually verify that I found the right commit by
building a kernel with the bad commit reverted.
But this time I still observed the issue when running a kernel built
without commit 6d4279cb99ac,
so I decided to look for a second bad commit.
The second bad commit turned out to be bc87d666c05.
commit bc87d666c05a13e6d4ae1ddce41fc43d2567b9a2 (HEAD)
Author: Rodrigo Siqueira <[email protected]>
Date: Tue Mar 26 11:55:19 2024 -0600

drm/amd/display: Add fallback configuration for set DRR in DCN10

Set OTG/OPTC parameters to 0 if something goes wrong on DCN10.

Acked-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

After reverting both these commits on top of 54f71b0369c9 the issue is gone.

I have also attached the build config.

My hardware specs: https://linux-hardware.org/?probe=f25a873c5e

Rodrigo, or anyone else from the AMD team, could you please take a look?

--
Best Regards,
Mike Gavrilov.


Attachments:
.config.zip (64.97 kB)

2024-06-05 13:20:16

by Mikhail Gavrilov

Subject: Re: 6.10/bisected/regression - commits bc87d666c05 and 6d4279cb99ac cause appearing green flashing bar on top of screen on Radeon 6900XT and 120Hz

Has anyone had a chance to look at this?

--
Best Regards,
Mike Gavrilov.

2024-06-07 12:34:01

by Thorsten Leemhuis

Subject: Re: 6.10/bisected/regression - commits bc87d666c05 and 6d4279cb99ac cause appearing green flashing bar on top of screen on Radeon 6900XT and 120Hz

[CCing the other amd drm maintainers]

On 05.06.24 14:04, Mikhail Gavrilov wrote:
> On Sun, May 26, 2024 at 7:06 PM Mikhail Gavrilov
> <[email protected]> wrote:
>>
>> Day before yesterday I replaced 7900XTX to 6900XT for got clear in
>> which kernel first time appeared warning message "DMA-API: amdgpu
>> 0000:0f:00.0: cacheline tracking EEXIST, overlapping mappings aren't
>> supported".
>> The kernel 6.3 and older won't boot on a computer with Radeon 7900XTX.

Mikhail: are those details in any way relevant? If not, then in the
future it is best to leave them out (or to make things easier to
follow): they make the bug report confusing and make it sound like this
is just a bug, when in fact your bisection makes it sound like a
regression.

Anyway, @amd maintainers: is there a reason why this report did not get
at least a single reply? Or was there some progress somewhere and I just
missed it? Or would it be better if Mikhail would report this to
https://gitlab.freedesktop.org/drm/amd/-/issues/ ?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke


2024-06-07 13:40:03

by Alex Deucher

Subject: Re: 6.10/bisected/regression - commits bc87d666c05 and 6d4279cb99ac cause appearing green flashing bar on top of screen on Radeon 6900XT and 120Hz


@Siqueira, Rodrigo can you take a look? The two patches change the
programming of OTG_V_TOTAL_CONTROL. The first patch removes this
code:

diff --git a/drivers/gpu/drm/amd/display/dc/optc/dcn20/dcn20_optc.c b/drivers/gpu/drm/amd/display/dc/optc/dcn20/dcn20_optc.c
index 58bdbd859bf9..d6f095b4555d 100644
--- a/drivers/gpu/drm/amd/display/dc/optc/dcn20/dcn20_optc.c
+++ b/drivers/gpu/drm/amd/display/dc/optc/dcn20/dcn20_optc.c
@@ -462,16 +462,6 @@ void optc2_setup_manual_trigger(struct timing_generator *optc)
 {
 	struct optc *optc1 = DCN10TG_FROM_TG(optc);

-	/* Set the min/max selectors unconditionally so that
-	 * DMCUB fw may change OTG timings when necessary
-	 * TODO: Remove the w/a after fixing the issue in DMCUB firmware
-	 */
-	REG_UPDATE_4(OTG_V_TOTAL_CONTROL,
-			OTG_V_TOTAL_MIN_SEL, 1,
-			OTG_V_TOTAL_MAX_SEL, 1,
-			OTG_FORCE_LOCK_ON_EVENT, 0,
-			OTG_SET_V_TOTAL_MIN_MASK, (1 << 1)); /* TRIGA */
-
 	REG_SET_8(OTG_TRIGA_CNTL, 0,
 			OTG_TRIGA_SOURCE_SELECT, 21,
 			OTG_TRIGA_SOURCE_PIPE_SELECT, optc->inst,

and the second patch adds this hunk:

diff --git a/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c b/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
index f109a101d84f..5574bc628053 100644
--- a/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
+++ b/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
@@ -945,10 +945,19 @@ void optc1_set_drr(
 				OTG_FORCE_LOCK_ON_EVENT, 0,
 				OTG_SET_V_TOTAL_MIN_MASK_EN, 0,
 				OTG_SET_V_TOTAL_MIN_MASK, 0);
-	}

-	// Setup manual flow control for EOF via TRIG_A
-	optc->funcs->setup_manual_trigger(optc);
+		// Setup manual flow control for EOF via TRIG_A
+		optc->funcs->setup_manual_trigger(optc);
+
+	} else {
+		REG_UPDATE_4(OTG_V_TOTAL_CONTROL,
+				OTG_SET_V_TOTAL_MIN_MASK, 0,
+				OTG_V_TOTAL_MIN_SEL, 0,
+				OTG_V_TOTAL_MAX_SEL, 0,
+				OTG_FORCE_LOCK_ON_EVENT, 0);
+
+		optc->funcs->set_vtotal_min_max(optc, 0, 0);
+	}
 }
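
For readers unfamiliar with the register macros in the two hunks above,
here is a minimal sketch of what a call like REG_UPDATE_4 is assumed to
do: a read-modify-write that changes only the named fields and leaves
the other bits of the register alone. The field layout, the helper
names field_update()/field_get() and apply_removed_workaround() are all
invented for illustration; they do not match the real
OTG_V_TOTAL_CONTROL layout or the driver's macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field descriptor: position and mask inside a 32-bit register. */
struct reg_field {
    uint32_t shift;
    uint32_t mask; /* mask already shifted into position */
};

/* Invented layout, for illustration only (NOT the real register layout). */
static const struct reg_field V_TOTAL_MIN_SEL      = { 0,  0x00000001u };
static const struct reg_field V_TOTAL_MAX_SEL      = { 1,  0x00000002u };
static const struct reg_field FORCE_LOCK_ON_EVENT  = { 2,  0x00000004u };
static const struct reg_field SET_V_TOTAL_MIN_MASK = { 16, 0xFFFF0000u };

/* Replace one field in reg, leaving every other bit untouched. */
static uint32_t field_update(uint32_t reg, struct reg_field f, uint32_t value)
{
    /* The value must fit into the width of the field. */
    assert((value & ~(f.mask >> f.shift)) == 0);
    return (reg & ~f.mask) | (value << f.shift);
}

/* Extract one field from reg. */
static uint32_t field_get(uint32_t reg, struct reg_field f)
{
    return (reg & f.mask) >> f.shift;
}

/*
 * Mirrors the four field writes of the REG_UPDATE_4 call that the first
 * patch removed: both selectors set to 1, force-lock cleared, and the
 * min mask set to (1 << 1), which the source comments as TRIGA.
 */
uint32_t apply_removed_workaround(uint32_t reg)
{
    reg = field_update(reg, V_TOTAL_MIN_SEL, 1);
    reg = field_update(reg, V_TOTAL_MAX_SEL, 1);
    reg = field_update(reg, FORCE_LOCK_ON_EVENT, 0);
    reg = field_update(reg, SET_V_TOTAL_MIN_MASK, 1u << 1);
    return reg;
}
```

Under these assumptions, any bit that is not part of one of the four
named fields keeps whatever value it had before the call, which is why
the order and the set of fields written on each code path matters.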


Looks like both the if and the else side paths end up programming
OTG_V_TOTAL_CONTROL differently after the change. Perhaps
OTG_SET_V_TOTAL_MIN_MASK needs to be set differently depending on the
DMCUB firmware version? @Mikhail Gavrilov does this patch fix it?

diff --git a/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c b/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
index 336488c0574e..933c7a342936 100644
--- a/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
+++ b/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
@@ -944,7 +944,7 @@ void optc1_set_drr(
 				OTG_V_TOTAL_MAX_SEL, 1,
 				OTG_FORCE_LOCK_ON_EVENT, 0,
 				OTG_SET_V_TOTAL_MIN_MASK_EN, 0,
-				OTG_SET_V_TOTAL_MIN_MASK, 0);
+				OTG_SET_V_TOTAL_MIN_MASK, (1 << 1)); /* TRIGA */

 		// Setup manual flow control for EOF via TRIG_A
 		optc->funcs->setup_manual_trigger(optc);
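
To make the comparison between the code states concrete, here is a
small model built only from the hunks quoted in this message. It is a
sketch under stated assumptions, not driver code: the struct, the enum
and the function names are hypothetical, the real condition guarding
the "if" branch is not visible in the quoted hunks (it is reduced to a
drr_requested flag here), and the model starts from all-zero fields
rather than from whatever the hardware register held before.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Value the removed workaround wrote to OTG_SET_V_TOTAL_MIN_MASK. */
#define TRIGA_MASK (1u << 1)

/* Hypothetical snapshot of the fields discussed in this thread. */
struct vtotal_ctl {
    unsigned int min_sel;
    unsigned int max_sel;
    unsigned int min_mask;
    bool manual_trigger; /* true if setup_manual_trigger() was called */
};

enum variant {
    BEFORE_BOTH_COMMITS,
    AFTER_BOTH_COMMITS,
    WITH_PROPOSED_PATCH /* both commits plus the one-line change above */
};

/* Before the first commit, this hook also forced the selectors and mask. */
static void model_setup_manual_trigger(struct vtotal_ctl *c, enum variant v)
{
    assert(c != NULL);
    if (v == BEFORE_BOTH_COMMITS) {
        c->min_sel = 1;
        c->max_sel = 1;
        c->min_mask = TRIGA_MASK;
    }
    c->manual_trigger = true;
}

/* Model of the set_drr paths as they appear in the quoted hunks. */
struct vtotal_ctl model_set_drr(enum variant v, bool drr_requested)
{
    struct vtotal_ctl c = { 0, 0, 0, false };

    if (drr_requested) {
        c.min_sel = 1;
        c.max_sel = 1;
        c.min_mask = (v == WITH_PROPOSED_PATCH) ? TRIGA_MASK : 0;
        model_setup_manual_trigger(&c, v);
    } else if (v == BEFORE_BOTH_COMMITS) {
        /* No else branch existed; the trigger setup ran unconditionally. */
        model_setup_manual_trigger(&c, v);
    } else {
        /* Fallback added by the second commit: everything set to 0. */
        c.min_sel = 0;
        c.max_sel = 0;
        c.min_mask = 0;
    }
    return c;
}
```

In this model the one-line change brings the mask on the "if" path back
to the value it had before the two commits, while the "else" path still
ends up with different selector and mask values than before and never
calls the manual trigger setup. Whether that difference matters on real
hardware is not something the model can tell.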


Thanks,

Alex


2024-06-09 21:19:30

by Mikhail Gavrilov

Subject: Re: 6.10/bisected/regression - commits bc87d666c05 and 6d4279cb99ac cause appearing green flashing bar on top of screen on Radeon 6900XT and 120Hz

On Fri, Jun 7, 2024 at 6:39 PM Alex Deucher <[email protected]> wrote:
>
> --- a/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
> +++ b/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
> @@ -944,7 +944,7 @@ void optc1_set_drr(
>  				OTG_V_TOTAL_MAX_SEL, 1,
>  				OTG_FORCE_LOCK_ON_EVENT, 0,
>  				OTG_SET_V_TOTAL_MIN_MASK_EN, 0,
> -				OTG_SET_V_TOTAL_MIN_MASK, 0);
> +				OTG_SET_V_TOTAL_MIN_MASK, (1 << 1)); /* TRIGA */
>
>  		// Setup manual flow control for EOF via TRIG_A
>  		optc->funcs->setup_manual_trigger(optc);

Thanks, Alex.
I applied this patch on top of 771ed66105de, and unfortunately it does
not fix the issue.
I still see the green flashing bar at the top of the screen.

--
Best Regards,
Mike Gavrilov.

2024-06-10 22:05:22

by Mikhail Gavrilov

Subject: Re: 6.10/bisected/regression - commits bc87d666c05 and 6d4279cb99ac cause appearing green flashing bar on top of screen on Radeon 6900XT and 120Hz

On Fri, Jun 7, 2024 at 5:29 PM Linux regression tracking (Thorsten
Leemhuis) <[email protected]> wrote:
>
> [CCing the other amd drm maintainers]
>
> Mikhail: are those details in any way relevant? If not, then in the
> future it is best to leave them out (or to make things easier to
> follow): they make the bug report confusing and make it sound like this
> is just a bug, when in fact your bisection makes it sound like a
> regression.

Apologies if my backstory was confusing. I just wanted to say that I
moved completely to the 7900XTX more than a year ago, so I was surprised
to see this regression on the old 6900XT. I found this issue by
accident, because I had not planned to use the old hardware.

--
Best Regards,
Mike Gavrilov.