Fix below crash by adding a check in the drm_dp_dpcd_access which
ensures that aux->transfer was actually initialized earlier.
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 0 P4D 0
Oops: 0010 [#1] SMP NOPTI
RIP: 0010:0x0
Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
RSP: 0018:ffffa8d64225bab8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000020 RCX: ffffa8d64225bb5e
RDX: ffff93151d921880 RSI: ffffa8d64225bac8 RDI: ffff931511a1a9d8
RBP: ffffa8d64225bb10 R08: 0000000000000001 R09: ffffa8d64225ba60
R10: 0000000000000002 R11: 000000000000000d R12: 0000000000000001
R13: 0000000000000000 R14: ffffa8d64225bb5e R15: ffff931511a1a9d8
FS: 00007ff8ea7fa9c0(0000) GS:ffff9317fe6c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000010d5a4000 CR4: 0000000000750ee0
PKRU: 55555554
Call Trace:
drm_dp_dpcd_access+0x72/0x110 [drm_kms_helper]
drm_dp_dpcd_read+0xb7/0xf0 [drm_kms_helper]
drm_dp_start_crc+0x38/0xb0 [drm_kms_helper]
amdgpu_dm_crtc_set_crc_source+0x1ae/0x3e0 [amdgpu]
crtc_crc_open+0x174/0x220 [drm]
full_proxy_open+0x168/0x1f0
? open_proxy_open+0x100/0x100
do_dentry_open+0x156/0x370
vfs_open+0x2d/0x30
v2: fix some typo
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/gpu/drm/drm_dp_helper.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index 6d0f2c447f3b..76b28396001a 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -260,6 +260,10 @@ static int drm_dp_dpcd_access(struct drm_dp_aux *aux, u8 request,
msg.buffer = buffer;
msg.size = size;
+ /* No transfer function is set, so not an available DP connector */
+ if (!aux->transfer)
+ return -EINVAL;
+
mutex_lock(&aux->hw_mutex);
/*
--
2.25.1
On Mon, 01 Nov 2021, Perry Yuan <[email protected]> wrote:
> Fix below crash by adding a check in the drm_dp_dpcd_access which
> ensures that aux->transfer was actually initialized earlier.
Gut feeling says this is papering over a real usage issue somewhere
else. Why is the aux being used for transfers before ->transfer has been
set? Why should the dp helper be defensive against all kinds of
misprogramming?
BR,
Jani.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> PGD 0 P4D 0
> Oops: 0010 [#1] SMP NOPTI
> RIP: 0010:0x0
> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
> RSP: 0018:ffffa8d64225bab8 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000000020 RCX: ffffa8d64225bb5e
> RDX: ffff93151d921880 RSI: ffffa8d64225bac8 RDI: ffff931511a1a9d8
> RBP: ffffa8d64225bb10 R08: 0000000000000001 R09: ffffa8d64225ba60
> R10: 0000000000000002 R11: 000000000000000d R12: 0000000000000001
> R13: 0000000000000000 R14: ffffa8d64225bb5e R15: ffff931511a1a9d8
> FS: 00007ff8ea7fa9c0(0000) GS:ffff9317fe6c0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd6 CR3: 000000010d5a4000 CR4: 0000000000750ee0
> PKRU: 55555554
> Call Trace:
> drm_dp_dpcd_access+0x72/0x110 [drm_kms_helper]
> drm_dp_dpcd_read+0xb7/0xf0 [drm_kms_helper]
> drm_dp_start_crc+0x38/0xb0 [drm_kms_helper]
> amdgpu_dm_crtc_set_crc_source+0x1ae/0x3e0 [amdgpu]
> crtc_crc_open+0x174/0x220 [drm]
> full_proxy_open+0x168/0x1f0
> ? open_proxy_open+0x100/0x100
> do_dentry_open+0x156/0x370
> vfs_open+0x2d/0x30
>
> v2: fix some typo
>
> Signed-off-by: Perry Yuan <[email protected]>
> ---
> drivers/gpu/drm/drm_dp_helper.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
> index 6d0f2c447f3b..76b28396001a 100644
> --- a/drivers/gpu/drm/drm_dp_helper.c
> +++ b/drivers/gpu/drm/drm_dp_helper.c
> @@ -260,6 +260,10 @@ static int drm_dp_dpcd_access(struct drm_dp_aux *aux, u8 request,
> msg.buffer = buffer;
> msg.size = size;
>
> + /* No transfer function is set, so not an available DP connector */
> + if (!aux->transfer)
> + return -EINVAL;
> +
> mutex_lock(&aux->hw_mutex);
>
> /*
--
Jani Nikula, Intel Open Source Graphics Center
[AMD Official Use Only]
Hi Jani:
Thanks for your comments.
> -----Original Message-----
> From: Jani Nikula <[email protected]>
> Sent: Monday, November 1, 2021 9:07 PM
> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
> <[email protected]>; Maxime Ripard <[email protected]>;
> Thomas Zimmermann <[email protected]>; David Airlie <[email protected]>;
> Daniel Vetter <[email protected]>
> Cc: Yuan, Perry <[email protected]>; [email protected]; linux-
> [email protected]; Huang, Shimmer <[email protected]>; Huang,
> Ray <[email protected]>
> Subject: Re: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer dereference on
> drm_dp_dpcd_access
>
> [CAUTION: External Email]
>
> On Mon, 01 Nov 2021, Perry Yuan <[email protected]> wrote:
> > Fix below crash by adding a check in the drm_dp_dpcd_access which
> > ensures that aux->transfer was actually initialized earlier.
>
> Gut feeling says this is papering over a real usage issue somewhere else. Why is
> the aux being used for transfers before ->transfer has been set? Why should the
> dp helper be defensive against all kinds of misprogramming?
>
>
> BR,
> Jani.
>
The issue was found by Intel IGT test suite, graphic by pass test case.
https://gitlab.freedesktop.org/drm/igt-gpu-tools
normally use case will not see the issue.
To avoid this issue happy again when we run the test case , it will be nice to add a check before the transfer is called.
And we can see that it really needs to have a check here to make ITG &kernel happy.
Perry.
>
> >
> > BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0
> > P4D 0
> > Oops: 0010 [#1] SMP NOPTI
> > RIP: 0010:0x0
> > Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
> > RSP: 0018:ffffa8d64225bab8 EFLAGS: 00010246
> > RAX: 0000000000000000 RBX: 0000000000000020 RCX: ffffa8d64225bb5e
> > RDX: ffff93151d921880 RSI: ffffa8d64225bac8 RDI: ffff931511a1a9d8
> > RBP: ffffa8d64225bb10 R08: 0000000000000001 R09: ffffa8d64225ba60
> > R10: 0000000000000002 R11: 000000000000000d R12: 0000000000000001
> > R13: 0000000000000000 R14: ffffa8d64225bb5e R15: ffff931511a1a9d8
> > FS: 00007ff8ea7fa9c0(0000) GS:ffff9317fe6c0000(0000)
> > knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: ffffffffffffffd6 CR3: 000000010d5a4000 CR4: 0000000000750ee0
> > PKRU: 55555554
> > Call Trace:
> > drm_dp_dpcd_access+0x72/0x110 [drm_kms_helper]
> > drm_dp_dpcd_read+0xb7/0xf0 [drm_kms_helper]
> > drm_dp_start_crc+0x38/0xb0 [drm_kms_helper]
> > amdgpu_dm_crtc_set_crc_source+0x1ae/0x3e0 [amdgpu]
> > crtc_crc_open+0x174/0x220 [drm]
> > full_proxy_open+0x168/0x1f0
> > ? open_proxy_open+0x100/0x100
> > do_dentry_open+0x156/0x370
> > vfs_open+0x2d/0x30
> >
> > v2: fix some typo
> >
> > Signed-off-by: Perry Yuan <[email protected]>
> > ---
> > drivers/gpu/drm/drm_dp_helper.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_dp_helper.c
> > b/drivers/gpu/drm/drm_dp_helper.c index 6d0f2c447f3b..76b28396001a
> > 100644
> > --- a/drivers/gpu/drm/drm_dp_helper.c
> > +++ b/drivers/gpu/drm/drm_dp_helper.c
> > @@ -260,6 +260,10 @@ static int drm_dp_dpcd_access(struct drm_dp_aux
> *aux, u8 request,
> > msg.buffer = buffer;
> > msg.size = size;
> >
> > + /* No transfer function is set, so not an available DP connector */
> > + if (!aux->transfer)
> > + return -EINVAL;
> > +
> > mutex_lock(&aux->hw_mutex);
> >
> > /*
>
> --
> Jani Nikula, Intel Open Source Graphics Center
On Tue, 02 Nov 2021, "Yuan, Perry" <[email protected]> wrote:
> [AMD Official Use Only]
>
> Hi Jani:
> Thanks for your comments.
>
>> -----Original Message-----
>> From: Jani Nikula <[email protected]>
>> Sent: Monday, November 1, 2021 9:07 PM
>> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
>> <[email protected]>; Maxime Ripard <[email protected]>;
>> Thomas Zimmermann <[email protected]>; David Airlie <[email protected]>;
>> Daniel Vetter <[email protected]>
>> Cc: Yuan, Perry <[email protected]>; [email protected]; linux-
>> [email protected]; Huang, Shimmer <[email protected]>; Huang,
>> Ray <[email protected]>
>> Subject: Re: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer dereference on
>> drm_dp_dpcd_access
>>
>> [CAUTION: External Email]
>>
>> On Mon, 01 Nov 2021, Perry Yuan <[email protected]> wrote:
>> > Fix below crash by adding a check in the drm_dp_dpcd_access which
>> > ensures that aux->transfer was actually initialized earlier.
>>
>> Gut feeling says this is papering over a real usage issue somewhere else. Why is
>> the aux being used for transfers before ->transfer has been set? Why should the
>> dp helper be defensive against all kinds of misprogramming?
>>
>>
>> BR,
>> Jani.
>>
>
> The issue was found by Intel IGT test suite, graphic by pass test case.
> https://gitlab.freedesktop.org/drm/igt-gpu-tools
> normally use case will not see the issue.
> To avoid this issue happy again when we run the test case , it will be nice to add a check before the transfer is called.
> And we can see that it really needs to have a check here to make ITG &kernel happy.
You're missing my point. What is the root cause? Why do you have the aux
device or connector registered before ->transfer function is
initialized. I don't think you should do that.
BR,
Jani.
>
> Perry.
>
>>
>> >
>> > BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0
>> > P4D 0
>> > Oops: 0010 [#1] SMP NOPTI
>> > RIP: 0010:0x0
>> > Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
>> > RSP: 0018:ffffa8d64225bab8 EFLAGS: 00010246
>> > RAX: 0000000000000000 RBX: 0000000000000020 RCX: ffffa8d64225bb5e
>> > RDX: ffff93151d921880 RSI: ffffa8d64225bac8 RDI: ffff931511a1a9d8
>> > RBP: ffffa8d64225bb10 R08: 0000000000000001 R09: ffffa8d64225ba60
>> > R10: 0000000000000002 R11: 000000000000000d R12: 0000000000000001
>> > R13: 0000000000000000 R14: ffffa8d64225bb5e R15: ffff931511a1a9d8
>> > FS: 00007ff8ea7fa9c0(0000) GS:ffff9317fe6c0000(0000)
>> > knlGS:0000000000000000
>> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > CR2: ffffffffffffffd6 CR3: 000000010d5a4000 CR4: 0000000000750ee0
>> > PKRU: 55555554
>> > Call Trace:
>> > drm_dp_dpcd_access+0x72/0x110 [drm_kms_helper]
>> > drm_dp_dpcd_read+0xb7/0xf0 [drm_kms_helper]
>> > drm_dp_start_crc+0x38/0xb0 [drm_kms_helper]
>> > amdgpu_dm_crtc_set_crc_source+0x1ae/0x3e0 [amdgpu]
>> > crtc_crc_open+0x174/0x220 [drm]
>> > full_proxy_open+0x168/0x1f0
>> > ? open_proxy_open+0x100/0x100
>> > do_dentry_open+0x156/0x370
>> > vfs_open+0x2d/0x30
>> >
>> > v2: fix some typo
>> >
>> > Signed-off-by: Perry Yuan <[email protected]>
>> > ---
>> > drivers/gpu/drm/drm_dp_helper.c | 4 ++++
>> > 1 file changed, 4 insertions(+)
>> >
>> > diff --git a/drivers/gpu/drm/drm_dp_helper.c
>> > b/drivers/gpu/drm/drm_dp_helper.c index 6d0f2c447f3b..76b28396001a
>> > 100644
>> > --- a/drivers/gpu/drm/drm_dp_helper.c
>> > +++ b/drivers/gpu/drm/drm_dp_helper.c
>> > @@ -260,6 +260,10 @@ static int drm_dp_dpcd_access(struct drm_dp_aux
>> *aux, u8 request,
>> > msg.buffer = buffer;
>> > msg.size = size;
>> >
>> > + /* No transfer function is set, so not an available DP connector */
>> > + if (!aux->transfer)
>> > + return -EINVAL;
>> > +
>> > mutex_lock(&aux->hw_mutex);
>> >
>> > /*
>>
>> --
>> Jani Nikula, Intel Open Source Graphics Center
--
Jani Nikula, Intel Open Source Graphics Center
[AMD Official Use Only]
Hi Jani:
> -----Original Message-----
> From: Jani Nikula <[email protected]>
> Sent: Tuesday, November 2, 2021 4:40 PM
> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
> <[email protected]>; Maxime Ripard <[email protected]>;
> Thomas Zimmermann <[email protected]>; David Airlie <[email protected]>;
> Daniel Vetter <[email protected]>
> Cc: [email protected]; [email protected]; Huang,
> Shimmer <[email protected]>; Huang, Ray <[email protected]>
> Subject: RE: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer dereference on
> drm_dp_dpcd_access
>
> [CAUTION: External Email]
>
> On Tue, 02 Nov 2021, "Yuan, Perry" <[email protected]> wrote:
> > [AMD Official Use Only]
> >
> > Hi Jani:
> > Thanks for your comments.
> >
> >> -----Original Message-----
> >> From: Jani Nikula <[email protected]>
> >> Sent: Monday, November 1, 2021 9:07 PM
> >> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
> >> <[email protected]>; Maxime Ripard
> >> <[email protected]>; Thomas Zimmermann <[email protected]>;
> David
> >> Airlie <[email protected]>; Daniel Vetter <[email protected]>
> >> Cc: Yuan, Perry <[email protected]>;
> >> [email protected]; linux- [email protected];
> >> Huang, Shimmer <[email protected]>; Huang, Ray
> <[email protected]>
> >> Subject: Re: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer
> >> dereference on drm_dp_dpcd_access
> >>
> >> [CAUTION: External Email]
> >>
> >> On Mon, 01 Nov 2021, Perry Yuan <[email protected]> wrote:
> >> > Fix below crash by adding a check in the drm_dp_dpcd_access which
> >> > ensures that aux->transfer was actually initialized earlier.
> >>
> >> Gut feeling says this is papering over a real usage issue somewhere
> >> else. Why is the aux being used for transfers before ->transfer has
> >> been set? Why should the dp helper be defensive against all kinds of
> misprogramming?
> >>
> >>
> >> BR,
> >> Jani.
> >>
> >
> > The issue was found by Intel IGT test suite, graphic by pass test case.
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitl
> > ab.freedesktop.org%2Fdrm%2Figt-gpu-
> tools&data=04%7C01%7CPerry.Yuan
> > %40amd.com%7C83d011acfe65437c0fa808d99ddc65b0%7C3dd8961fe4884e6
> 08e11a8
> >
> 2d994e183d%7C0%7C0%7C637714392203200313%7CUnknown%7CTWFpbGZsb
> 3d8eyJWIj
> >
> oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C100
> 0&am
> >
> p;sdata=snPpRYLGeJtTpNGle1YHZAvevcABbgLkgOsffiNzQPw%3D&reserved
> =0
> > normally use case will not see the issue.
> > To avoid this issue happy again when we run the test case , it will be nice to
> add a check before the transfer is called.
> > And we can see that it really needs to have a check here to make ITG &kernel
> happy.
>
> You're missing my point. What is the root cause? Why do you have the aux
> device or connector registered before ->transfer function is initialized. I don't
> think you should do that.
>
> BR,
> Jani.
>
One potential IGT fix patch to resolve the test case failure is:
tests/amdgpu/amd_bypass.c
data->pipe_crc = igt_pipe_crc_new(data->drm_fd, data->pipe_id,
- AMDGPU_PIPE_CRC_SOURCE_DPRX);
+ INTEL_PIPE_CRC_SOURCE_AUTO);
The kernel panic error gone after change "dprx" to "auto" in the IGT test.
In my view ,the IGT amdgpu bypass test will do some common setup work including crc piple, source.
When the IGT sets up a new CRC pipe capture source for amdgpu bypass test, the SOURCE was set as "dprx" instead of "auto"
It makes "amdgpu_dm_crtc_set_crc_source()" failed to set correct AUX and it's transfer function invalid .
The system I tested use HDMI port connected to monitor .
amdgpu_dm_crtc_set_crc_source-> (aux = (aconn->port) ? &aconn->port->aux : &aconn->dm_dp_aux.aux;)
drm_dp_start_crc ->
drm_dp_dpcd_readb-> aux->transfer is NULL, issue here.
The fix will use the "auto" keyword, which will let the driver select a default source of frame CRCs for this CRTC.
Correct me if have some wrong points.
Thank you!
Perry.
>
> >
> > Perry.
> >
> >>
> >> >
> >> > BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD
> >> > 0 P4D 0
> >> > Oops: 0010 [#1] SMP NOPTI
> >> > RIP: 0010:0x0
> >> > Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
> >> > RSP: 0018:ffffa8d64225bab8 EFLAGS: 00010246
> >> > RAX: 0000000000000000 RBX: 0000000000000020 RCX: ffffa8d64225bb5e
> >> > RDX: ffff93151d921880 RSI: ffffa8d64225bac8 RDI: ffff931511a1a9d8
> >> > RBP: ffffa8d64225bb10 R08: 0000000000000001 R09: ffffa8d64225ba60
> >> > R10: 0000000000000002 R11: 000000000000000d R12: 0000000000000001
> >> > R13: 0000000000000000 R14: ffffa8d64225bb5e R15: ffff931511a1a9d8
> >> > FS: 00007ff8ea7fa9c0(0000) GS:ffff9317fe6c0000(0000)
> >> > knlGS:0000000000000000
> >> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> > CR2: ffffffffffffffd6 CR3: 000000010d5a4000 CR4: 0000000000750ee0
> >> > PKRU: 55555554
> >> > Call Trace:
> >> > drm_dp_dpcd_access+0x72/0x110 [drm_kms_helper]
> >> > drm_dp_dpcd_read+0xb7/0xf0 [drm_kms_helper]
> >> > drm_dp_start_crc+0x38/0xb0 [drm_kms_helper]
> >> > amdgpu_dm_crtc_set_crc_source+0x1ae/0x3e0 [amdgpu]
> >> > crtc_crc_open+0x174/0x220 [drm]
> >> > full_proxy_open+0x168/0x1f0
> >> > ? open_proxy_open+0x100/0x100
> >> > do_dentry_open+0x156/0x370
> >> > vfs_open+0x2d/0x30
> >> >
> >> > v2: fix some typo
> >> >
> >> > Signed-off-by: Perry Yuan <[email protected]>
> >> > ---
> >> > drivers/gpu/drm/drm_dp_helper.c | 4 ++++
> >> > 1 file changed, 4 insertions(+)
> >> >
> >> > diff --git a/drivers/gpu/drm/drm_dp_helper.c
> >> > b/drivers/gpu/drm/drm_dp_helper.c index 6d0f2c447f3b..76b28396001a
> >> > 100644
> >> > --- a/drivers/gpu/drm/drm_dp_helper.c
> >> > +++ b/drivers/gpu/drm/drm_dp_helper.c
> >> > @@ -260,6 +260,10 @@ static int drm_dp_dpcd_access(struct
> >> > drm_dp_aux
> >> *aux, u8 request,
> >> > msg.buffer = buffer;
> >> > msg.size = size;
> >> >
> >> > + /* No transfer function is set, so not an available DP connector */
> >> > + if (!aux->transfer)
> >> > + return -EINVAL;
> >> > +
> >> > mutex_lock(&aux->hw_mutex);
> >> >
> >> > /*
> >>
> >> --
> >> Jani Nikula, Intel Open Source Graphics Center
>
> --
> Jani Nikula, Intel Open Source Graphics Center
On Wed, 03 Nov 2021, "Yuan, Perry" <[email protected]> wrote:
> [AMD Official Use Only]
>
> Hi Jani:
>
>> -----Original Message-----
>> From: Jani Nikula <[email protected]>
>> Sent: Tuesday, November 2, 2021 4:40 PM
>> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
>> <[email protected]>; Maxime Ripard <[email protected]>;
>> Thomas Zimmermann <[email protected]>; David Airlie <[email protected]>;
>> Daniel Vetter <[email protected]>
>> Cc: [email protected]; [email protected]; Huang,
>> Shimmer <[email protected]>; Huang, Ray <[email protected]>
>> Subject: RE: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer dereference on
>> drm_dp_dpcd_access
>>
>> [CAUTION: External Email]
>>
>> On Tue, 02 Nov 2021, "Yuan, Perry" <[email protected]> wrote:
>> > [AMD Official Use Only]
>> >
>> > Hi Jani:
>> > Thanks for your comments.
>> >
>> >> -----Original Message-----
>> >> From: Jani Nikula <[email protected]>
>> >> Sent: Monday, November 1, 2021 9:07 PM
>> >> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
>> >> <[email protected]>; Maxime Ripard
>> >> <[email protected]>; Thomas Zimmermann <[email protected]>;
>> David
>> >> Airlie <[email protected]>; Daniel Vetter <[email protected]>
>> >> Cc: Yuan, Perry <[email protected]>;
>> >> [email protected]; linux- [email protected];
>> >> Huang, Shimmer <[email protected]>; Huang, Ray
>> <[email protected]>
>> >> Subject: Re: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer
>> >> dereference on drm_dp_dpcd_access
>> >>
>> >> [CAUTION: External Email]
>> >>
>> >> On Mon, 01 Nov 2021, Perry Yuan <[email protected]> wrote:
>> >> > Fix below crash by adding a check in the drm_dp_dpcd_access which
>> >> > ensures that aux->transfer was actually initialized earlier.
>> >>
>> >> Gut feeling says this is papering over a real usage issue somewhere
>> >> else. Why is the aux being used for transfers before ->transfer has
>> >> been set? Why should the dp helper be defensive against all kinds of
>> misprogramming?
>> >>
>> >>
>> >> BR,
>> >> Jani.
>> >>
>> >
>> > The issue was found by Intel IGT test suite, graphic by pass test case.
>> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitl
>> > ab.freedesktop.org%2Fdrm%2Figt-gpu-
>> tools&data=04%7C01%7CPerry.Yuan
>> > %40amd.com%7C83d011acfe65437c0fa808d99ddc65b0%7C3dd8961fe4884e6
>> 08e11a8
>> >
>> 2d994e183d%7C0%7C0%7C637714392203200313%7CUnknown%7CTWFpbGZsb
>> 3d8eyJWIj
>> >
>> oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C100
>> 0&am
>> >
>> p;sdata=snPpRYLGeJtTpNGle1YHZAvevcABbgLkgOsffiNzQPw%3D&reserved
>> =0
>> > normally use case will not see the issue.
>> > To avoid this issue happy again when we run the test case , it will be nice to
>> add a check before the transfer is called.
>> > And we can see that it really needs to have a check here to make ITG &kernel
>> happy.
>>
>> You're missing my point. What is the root cause? Why do you have the aux
>> device or connector registered before ->transfer function is initialized. I don't
>> think you should do that.
>>
>> BR,
>> Jani.
>>
>
> One potential IGT fix patch to resolve the test case failure is:
>
> tests/amdgpu/amd_bypass.c
> data->pipe_crc = igt_pipe_crc_new(data->drm_fd, data->pipe_id,
> - AMDGPU_PIPE_CRC_SOURCE_DPRX);
> + INTEL_PIPE_CRC_SOURCE_AUTO);
> The kernel panic error gone after change "dprx" to "auto" in the IGT test.
>
> In my view ,the IGT amdgpu bypass test will do some common setup work including crc piple, source.
> When the IGT sets up a new CRC pipe capture source for amdgpu bypass test, the SOURCE was set as "dprx" instead of "auto"
> It makes "amdgpu_dm_crtc_set_crc_source()" failed to set correct AUX and it's transfer function invalid .
> The system I tested use HDMI port connected to monitor .
>
> amdgpu_dm_crtc_set_crc_source-> (aux = (aconn->port) ? &aconn->port->aux : &aconn->dm_dp_aux.aux;)
> drm_dp_start_crc ->
> drm_dp_dpcd_readb-> aux->transfer is NULL, issue here.
> The fix will use the "auto" keyword, which will let the driver select a default source of frame CRCs for this CRTC.
>
> Correct me if have some wrong points.
Apparently I'm completely failing to communicate my POV to you.
If you have a kernel oops, the fix needs to be in the kernel, not IGT.
The question is, why is it possible for IGT (or any userspace) to
trigger AUX communication when the ->transfer function is not set? In my
opinion, the kernel driver should not have exposed the interface at all
if the ->transfer function is not set. The interface is useless without
the ->transfer function. IMO, that's the bug.
BR,
Jani.
>
> Thank you!
> Perry.
>
>>
>> >
>> > Perry.
>> >
>> >>
>> >> >
>> >> > BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD
>> >> > 0 P4D 0
>> >> > Oops: 0010 [#1] SMP NOPTI
>> >> > RIP: 0010:0x0
>> >> > Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
>> >> > RSP: 0018:ffffa8d64225bab8 EFLAGS: 00010246
>> >> > RAX: 0000000000000000 RBX: 0000000000000020 RCX: ffffa8d64225bb5e
>> >> > RDX: ffff93151d921880 RSI: ffffa8d64225bac8 RDI: ffff931511a1a9d8
>> >> > RBP: ffffa8d64225bb10 R08: 0000000000000001 R09: ffffa8d64225ba60
>> >> > R10: 0000000000000002 R11: 000000000000000d R12: 0000000000000001
>> >> > R13: 0000000000000000 R14: ffffa8d64225bb5e R15: ffff931511a1a9d8
>> >> > FS: 00007ff8ea7fa9c0(0000) GS:ffff9317fe6c0000(0000)
>> >> > knlGS:0000000000000000
>> >> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> >> > CR2: ffffffffffffffd6 CR3: 000000010d5a4000 CR4: 0000000000750ee0
>> >> > PKRU: 55555554
>> >> > Call Trace:
>> >> > drm_dp_dpcd_access+0x72/0x110 [drm_kms_helper]
>> >> > drm_dp_dpcd_read+0xb7/0xf0 [drm_kms_helper]
>> >> > drm_dp_start_crc+0x38/0xb0 [drm_kms_helper]
>> >> > amdgpu_dm_crtc_set_crc_source+0x1ae/0x3e0 [amdgpu]
>> >> > crtc_crc_open+0x174/0x220 [drm]
>> >> > full_proxy_open+0x168/0x1f0
>> >> > ? open_proxy_open+0x100/0x100
>> >> > do_dentry_open+0x156/0x370
>> >> > vfs_open+0x2d/0x30
>> >> >
>> >> > v2: fix some typo
>> >> >
>> >> > Signed-off-by: Perry Yuan <[email protected]>
>> >> > ---
>> >> > drivers/gpu/drm/drm_dp_helper.c | 4 ++++
>> >> > 1 file changed, 4 insertions(+)
>> >> >
>> >> > diff --git a/drivers/gpu/drm/drm_dp_helper.c
>> >> > b/drivers/gpu/drm/drm_dp_helper.c index 6d0f2c447f3b..76b28396001a
>> >> > 100644
>> >> > --- a/drivers/gpu/drm/drm_dp_helper.c
>> >> > +++ b/drivers/gpu/drm/drm_dp_helper.c
>> >> > @@ -260,6 +260,10 @@ static int drm_dp_dpcd_access(struct
>> >> > drm_dp_aux
>> >> *aux, u8 request,
>> >> > msg.buffer = buffer;
>> >> > msg.size = size;
>> >> >
>> >> > + /* No transfer function is set, so not an available DP connector */
>> >> > + if (!aux->transfer)
>> >> > + return -EINVAL;
>> >> > +
>> >> > mutex_lock(&aux->hw_mutex);
>> >> >
>> >> > /*
>> >>
>> >> --
>> >> Jani Nikula, Intel Open Source Graphics Center
>>
>> --
>> Jani Nikula, Intel Open Source Graphics Center
--
Jani Nikula, Intel Open Source Graphics Center
[AMD Official Use Only]
Hi Jani:
> -----Original Message-----
> From: Jani Nikula <[email protected]>
> Sent: Wednesday, November 3, 2021 7:31 PM
> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
> <[email protected]>; Maxime Ripard
> <[email protected]>; Thomas Zimmermann <[email protected]>;
> David Airlie <[email protected]>; Daniel Vetter <[email protected]>
> Cc: [email protected]; [email protected]; Huang,
> Shimmer <[email protected]>; Huang, Ray <[email protected]>
> Subject: RE: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer dereference
> on drm_dp_dpcd_access
>
> [CAUTION: External Email]
>
> On Wed, 03 Nov 2021, "Yuan, Perry" <[email protected]> wrote:
> > [AMD Official Use Only]
> >
> > Hi Jani:
> >
> >> -----Original Message-----
> >> From: Jani Nikula <[email protected]>
> >> Sent: Tuesday, November 2, 2021 4:40 PM
> >> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
> >> <[email protected]>; Maxime Ripard
> >> <[email protected]>; Thomas Zimmermann <[email protected]>;
> David
> >> Airlie <[email protected]>; Daniel Vetter <[email protected]>
> >> Cc: [email protected]; [email protected];
> >> Huang, Shimmer <[email protected]>; Huang, Ray
> <[email protected]>
> >> Subject: RE: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer
> >> dereference on drm_dp_dpcd_access
> >>
> >> [CAUTION: External Email]
> >>
> >> On Tue, 02 Nov 2021, "Yuan, Perry" <[email protected]> wrote:
> >> > [AMD Official Use Only]
> >> >
> >> > Hi Jani:
> >> > Thanks for your comments.
> >> >
> >> >> -----Original Message-----
> >> >> From: Jani Nikula <[email protected]>
> >> >> Sent: Monday, November 1, 2021 9:07 PM
> >> >> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
> >> >> <[email protected]>; Maxime Ripard
> >> >> <[email protected]>; Thomas Zimmermann
> <[email protected]>;
> >> David
> >> >> Airlie <[email protected]>; Daniel Vetter <[email protected]>
> >> >> Cc: Yuan, Perry <[email protected]>;
> >> >> [email protected]; linux- [email protected];
> >> >> Huang, Shimmer <[email protected]>; Huang, Ray
> >> <[email protected]>
> >> >> Subject: Re: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer
> >> >> dereference on drm_dp_dpcd_access
> >> >>
> >> >> [CAUTION: External Email]
> >> >>
> >> >> On Mon, 01 Nov 2021, Perry Yuan <[email protected]> wrote:
> >> >> > Fix below crash by adding a check in the drm_dp_dpcd_access
> >> >> > which ensures that aux->transfer was actually initialized earlier.
> >> >>
> >> >> Gut feeling says this is papering over a real usage issue
> >> >> somewhere else. Why is the aux being used for transfers before
> >> >> ->transfer has been set? Why should the dp helper be defensive
> >> >> against all kinds of
> >> misprogramming?
> >> >>
> >> >>
> >> >> BR,
> >> >> Jani.
> >> >>
> >> >
> >> > The issue was found by Intel IGT test suite, graphic by pass test case.
> >> >
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fg
> >> > itl
> >> > ab.freedesktop.org%2Fdrm%2Figt-gpu-
> >> tools&data=04%7C01%7CPerry.Yuan
> >> > %40amd.com%7C83d011acfe65437c0fa808d99ddc65b0%7C3dd8961fe4
> 884e6
> >> 08e11a8
> >> >
> >>
> 2d994e183d%7C0%7C0%7C637714392203200313%7CUnknown%7CTWFpbG
> Zsb
> >> 3d8eyJWIj
> >> >
> >>
> oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1
> 00
> >> 0&am
> >> >
> >>
> p;sdata=snPpRYLGeJtTpNGle1YHZAvevcABbgLkgOsffiNzQPw%3D&reser
> ved
> >> =0
> >> > normally use case will not see the issue.
> >> > To avoid this issue happy again when we run the test case , it will
> >> > be nice to
> >> add a check before the transfer is called.
> >> > And we can see that it really needs to have a check here to make
> >> > ITG &kernel
> >> happy.
> >>
> >> You're missing my point. What is the root cause? Why do you have the
> >> aux device or connector registered before ->transfer function is
> >> initialized. I don't think you should do that.
> >>
> >> BR,
> >> Jani.
> >>
> >
> > One potential IGT fix patch to resolve the test case failure is:
> >
> > tests/amdgpu/amd_bypass.c
> > data->pipe_crc = igt_pipe_crc_new(data->drm_fd, data->pipe_id,
> > - AMDGPU_PIPE_CRC_SOURCE_DPRX);
> > + INTEL_PIPE_CRC_SOURCE_AUTO);
> > The kernel panic error gone after change "dprx" to "auto" in the IGT test.
> >
> > In my view ,the IGT amdgpu bypass test will do some common setup work
> including crc piple, source.
> > When the IGT sets up a new CRC pipe capture source for amdgpu bypass
> test, the SOURCE was set as "dprx" instead of "auto"
> > It makes "amdgpu_dm_crtc_set_crc_source()" failed to set correct AUX
> and it's transfer function invalid .
> > The system I tested use HDMI port connected to monitor .
> >
> > amdgpu_dm_crtc_set_crc_source-> (aux = (aconn->port) ? &aconn-
> >port->aux : &aconn->dm_dp_aux.aux;)
> > drm_dp_start_crc ->
> > drm_dp_dpcd_readb-> aux->transfer is NULL, issue here.
> > The fix will use the "auto" keyword, which will let the driver select a
> default source of frame CRCs for this CRTC.
> >
> > Correct me if have some wrong points.
>
> Apparently I'm completely failing to communicate my POV to you.
>
> If you have a kernel oops, the fix needs to be in the kernel, not IGT.
>
> The question is, why is it possible for IGT (or any userspace) to trigger AUX
> communication when the ->transfer function is not set? In my opinion, the
> kernel driver should not have exposed the interface at all if the ->transfer
> function is not set. The interface is useless without the ->transfer function.
> IMO, that's the bug.
>
Yes , you are correct , the transfer shouldn't be called before it is ready !
Let me explain more details in my view .
Maybe the root cause is not why the aux->transfer is not called before it is registered in this case.
I suppose the issue was triggered by wrong CRC pipe source .
Actually the aux->transfer has been registered when amdgpu DM registered at kernel boot.
IGT test was run when system login to Gnome desktop.
amdgpu_dm_initialize_dp_connector->
aconnector->dm_dp_aux.aux.transfer = dm_dp_aux_transfer;
The test case failed when the IGT set an "DPRX" CRC pipe source while the HDMI connected to monitor only.
At this time, the aux->transfer is NULL, and dp helper did not check the transfer pointer NULL or not.
It calls the transfers to DPCD read, then you see the kernel panic log.
amdgpu_dm_crtc_funcs-> amdgpu_dm_crtc_set_crc_source-> drm_dp_start_crc
* And if the DP cable connected only, the issue will not happen. Test will pass.
* If I change the CRC source to "auto", kernel will not see the panic as well.
Maybe the failed test case need to run on the DP instead of HDMI, I am not sure at now.
Hopping this info can help.
Perry.
>
> BR,
> Jani.
>
> >
> > Thank you!
> > Perry.
> >
> >>
> >> >
> >> > Perry.
> >> >
> >> >>
> >> >> >
> >> >> > BUG: kernel NULL pointer dereference, address: 0000000000000000
> >> >> > PGD
> >> >> > 0 P4D 0
> >> >> > Oops: 0010 [#1] SMP NOPTI
> >> >> > RIP: 0010:0x0
> >> >> > Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
> >> >> > RSP: 0018:ffffa8d64225bab8 EFLAGS: 00010246
> >> >> > RAX: 0000000000000000 RBX: 0000000000000020 RCX:
> >> >> > ffffa8d64225bb5e
> >> >> > RDX: ffff93151d921880 RSI: ffffa8d64225bac8 RDI:
> >> >> > ffff931511a1a9d8
> >> >> > RBP: ffffa8d64225bb10 R08: 0000000000000001 R09:
> >> >> > ffffa8d64225ba60
> >> >> > R10: 0000000000000002 R11: 000000000000000d R12:
> >> >> > 0000000000000001
> >> >> > R13: 0000000000000000 R14: ffffa8d64225bb5e R15:
> >> >> > ffff931511a1a9d8
> >> >> > FS: 00007ff8ea7fa9c0(0000) GS:ffff9317fe6c0000(0000)
> >> >> > knlGS:0000000000000000
> >> >> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> >> > CR2: ffffffffffffffd6 CR3: 000000010d5a4000 CR4:
> >> >> > 0000000000750ee0
> >> >> > PKRU: 55555554
> >> >> > Call Trace:
> >> >> > drm_dp_dpcd_access+0x72/0x110 [drm_kms_helper]
> >> >> > drm_dp_dpcd_read+0xb7/0xf0 [drm_kms_helper]
> >> >> > drm_dp_start_crc+0x38/0xb0 [drm_kms_helper]
> >> >> > amdgpu_dm_crtc_set_crc_source+0x1ae/0x3e0 [amdgpu]
> >> >> > crtc_crc_open+0x174/0x220 [drm]
> >> >> > full_proxy_open+0x168/0x1f0
> >> >> > ? open_proxy_open+0x100/0x100
> >> >> > do_dentry_open+0x156/0x370
> >> >> > vfs_open+0x2d/0x30
> >> >> >
> >> >> > v2: fix some typo
> >> >> >
> >> >> > Signed-off-by: Perry Yuan <[email protected]>
> >> >> > ---
> >> >> > drivers/gpu/drm/drm_dp_helper.c | 4 ++++
> >> >> > 1 file changed, 4 insertions(+)
> >> >> >
> >> >> > diff --git a/drivers/gpu/drm/drm_dp_helper.c
> >> >> > b/drivers/gpu/drm/drm_dp_helper.c index
> >> >> > 6d0f2c447f3b..76b28396001a
> >> >> > 100644
> >> >> > --- a/drivers/gpu/drm/drm_dp_helper.c
> >> >> > +++ b/drivers/gpu/drm/drm_dp_helper.c
> >> >> > @@ -260,6 +260,10 @@ static int drm_dp_dpcd_access(struct
> >> >> > drm_dp_aux
> >> >> *aux, u8 request,
> >> >> > msg.buffer = buffer;
> >> >> > msg.size = size;
> >> >> >
> >> >> > + /* No transfer function is set, so not an available DP connector */
> >> >> > + if (!aux->transfer)
> >> >> > + return -EINVAL;
> >> >> > +
> >> >> > mutex_lock(&aux->hw_mutex);
> >> >> >
> >> >> > /*
> >> >>
> >> >> --
> >> >> Jani Nikula, Intel Open Source Graphics Center
> >>
> >> --
> >> Jani Nikula, Intel Open Source Graphics Center
>
> --
> Jani Nikula, Intel Open Source Graphics Center
On 2021-11-05 03:35, Yuan, Perry wrote:
> [AMD Official Use Only]
>
> Hi Jani:
>
>
>> -----Original Message-----
>> From: Jani Nikula <[email protected]>
>> Sent: Wednesday, November 3, 2021 7:31 PM
>> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
>> <[email protected]>; Maxime Ripard
>> <[email protected]>; Thomas Zimmermann <[email protected]>;
>> David Airlie <[email protected]>; Daniel Vetter <[email protected]>
>> Cc: [email protected]; [email protected]; Huang,
>> Shimmer <[email protected]>; Huang, Ray <[email protected]>
>> Subject: RE: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer dereference
>> on drm_dp_dpcd_access
>>
>> [CAUTION: External Email]
>>
>> On Wed, 03 Nov 2021, "Yuan, Perry" <[email protected]> wrote:
>>> [AMD Official Use Only]
>>>
>>> Hi Jani:
>>>
>>>> -----Original Message-----
>>>> From: Jani Nikula <[email protected]>
>>>> Sent: Tuesday, November 2, 2021 4:40 PM
>>>> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
>>>> <[email protected]>; Maxime Ripard
>>>> <[email protected]>; Thomas Zimmermann <[email protected]>;
>> David
>>>> Airlie <[email protected]>; Daniel Vetter <[email protected]>
>>>> Cc: [email protected]; [email protected];
>>>> Huang, Shimmer <[email protected]>; Huang, Ray
>> <[email protected]>
>>>> Subject: RE: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer
>>>> dereference on drm_dp_dpcd_access
>>>>
>>>> [CAUTION: External Email]
>>>>
>>>> On Tue, 02 Nov 2021, "Yuan, Perry" <[email protected]> wrote:
>>>>> [AMD Official Use Only]
>>>>>
>>>>> Hi Jani:
>>>>> Thanks for your comments.
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Jani Nikula <[email protected]>
>>>>>> Sent: Monday, November 1, 2021 9:07 PM
>>>>>> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
>>>>>> <[email protected]>; Maxime Ripard
>>>>>> <[email protected]>; Thomas Zimmermann
>> <[email protected]>;
>>>> David
>>>>>> Airlie <[email protected]>; Daniel Vetter <[email protected]>
>>>>>> Cc: Yuan, Perry <[email protected]>;
>>>>>> [email protected]; linux- [email protected];
>>>>>> Huang, Shimmer <[email protected]>; Huang, Ray
>>>> <[email protected]>
>>>>>> Subject: Re: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer
>>>>>> dereference on drm_dp_dpcd_access
>>>>>>
>>>>>> [CAUTION: External Email]
>>>>>>
>>>>>> On Mon, 01 Nov 2021, Perry Yuan <[email protected]> wrote:
>>>>>>> Fix below crash by adding a check in the drm_dp_dpcd_access
>>>>>>> which ensures that aux->transfer was actually initialized earlier.
>>>>>>
>>>>>> Gut feeling says this is papering over a real usage issue
>>>>>> somewhere else. Why is the aux being used for transfers before
>>>>>> ->transfer has been set? Why should the dp helper be defensive
>>>>>> against all kinds of
>>>> misprogramming?
>>>>>>
>>>>>>
>>>>>> BR,
>>>>>> Jani.
>>>>>>
>>>>>
>>>>> The issue was found by Intel IGT test suite, graphic by pass test case.
>>>>>
>> https://g
itl
>>>>> ab.freedesktop.org%2Fdrm%2Figt-gpu-
>>>> tools&data=04%7C01%7CPerry.Yuan
>>>>> %40amd.com%7C83d011acfe65437c0fa808d99ddc65b0%7C3dd8961fe4
>> 884e6
>>>> 08e11a8
>>>>>
>>>>
>> 2d994e183d%7C0%7C0%7C637714392203200313%7CUnknown%7CTWFpbG
>> Zsb
>>>> 3d8eyJWIj
>>>>>
>>>>
>> oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1
>> 00
>>>> 0&am
>>>>>
>>>>
>> p;sdata=snPpRYLGeJtTpNGle1YHZAvevcABbgLkgOsffiNzQPw%3D&reser
>> ved
>>>> =0
>>>>> normally use case will not see the issue.
>>>>> To avoid this issue happy again when we run the test case , it will
>>>>> be nice to
>>>> add a check before the transfer is called.
>>>>> And we can see that it really needs to have a check here to make
>>>>> ITG &kernel
>>>> happy.
>>>>
>>>> You're missing my point. What is the root cause? Why do you have the
>>>> aux device or connector registered before ->transfer function is
>>>> initialized. I don't think you should do that.
>>>>
>>>> BR,
>>>> Jani.
>>>>
>>>
>>> One potential IGT fix patch to resolve the test case failure is:
>>>
>>> tests/amdgpu/amd_bypass.c
>>> data->pipe_crc = igt_pipe_crc_new(data->drm_fd, data->pipe_id,
>>> - AMDGPU_PIPE_CRC_SOURCE_DPRX);
>>> + INTEL_PIPE_CRC_SOURCE_AUTO);
>>> The kernel panic error gone after change "dprx" to "auto" in the IGT test.
>>>
>>> In my view ,the IGT amdgpu bypass test will do some common setup work
>> including crc piple, source.
>>> When the IGT sets up a new CRC pipe capture source for amdgpu bypass
>> test, the SOURCE was set as "dprx" instead of "auto"
>>> It makes "amdgpu_dm_crtc_set_crc_source()" failed to set correct AUX
>> and it's transfer function invalid .
>>> The system I tested use HDMI port connected to monitor .
>>>
>>> amdgpu_dm_crtc_set_crc_source-> (aux = (aconn->port) ? &aconn-
>>> port->aux : &aconn->dm_dp_aux.aux;)
>>> drm_dp_start_crc ->
>>> drm_dp_dpcd_readb-> aux->transfer is NULL, issue here.
>>> The fix will use the "auto" keyword, which will let the driver select a
>> default source of frame CRCs for this CRTC.
>>>
>>> Correct me if have some wrong points.
>>
>> Apparently I'm completely failing to communicate my POV to you.
>>
>> If you have a kernel oops, the fix needs to be in the kernel, not IGT.
>>
>> The question is, why is it possible for IGT (or any userspace) to trigger AUX
>> communication when the ->transfer function is not set? In my opinion, the
>> kernel driver should not have exposed the interface at all if the ->transfer
>> function is not set. The interface is useless without the ->transfer function.
>> IMO, that's the bug.
>>
>
> Yes , you are correct , the transfer shouldn't be called before it is ready !
>
> Let me explain more details in my view .
> Maybe the root cause is not why the aux->transfer is not called before it is registered in this case.
> I suppose the issue was triggered by wrong CRC pipe source .
>
> Actually the aux->transfer has been registered when amdgpu DM registered at kernel boot.
> IGT test was run when system login to Gnome desktop.
>
> amdgpu_dm_initialize_dp_connector->
> aconnector->dm_dp_aux.aux.transfer = dm_dp_aux_transfer;
>
> The test case failed when the IGT set an "DPRX" CRC pipe source while the HDMI connected to monitor only.
> At this time, the aux->transfer is NULL, and dp helper did not check the transfer pointer NULL or not.
> It calls the transfers to DPCD read, then you see the kernel panic log.
>
> amdgpu_dm_crtc_funcs-> amdgpu_dm_crtc_set_crc_source-> drm_dp_start_crc
>
> * And if the DP cable connected only, the issue will not happen. Test will pass.
> * If I change the CRC source to "auto", kernel will not see the panic as well.
> Maybe the failed test case need to run on the DP instead of HDMI, I am not sure at now.
>
Two things need to happen:
1) IGT should skip tests requiring DPRX CRC source if not on a
DP connector.
2) Driver should return EINVAL (or another appropriate error) if
DPRX CRC source is requested when the CRTC is not connected to
a DP display. Alternatively we could make sure that DPRX is
not advertised as a CRC source in this case but I'm not sure
how difficult that would be.
Like Jani said, I don't think the current patch is the correct one
as it doesn't get to the root cause. The root cause fix should be
in the CRC debugfs handling code.
Harry
>
> Hopping this info can help.
>
> Perry.
>
>
>>
>> BR,
>> Jani.
>>
>>>
>>> Thank you!
>>> Perry.
>>>
>>>>
>>>>>
>>>>> Perry.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> BUG: kernel NULL pointer dereference, address: 0000000000000000
>>>>>>> PGD
>>>>>>> 0 P4D 0
>>>>>>> Oops: 0010 [#1] SMP NOPTI
>>>>>>> RIP: 0010:0x0
>>>>>>> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
>>>>>>> RSP: 0018:ffffa8d64225bab8 EFLAGS: 00010246
>>>>>>> RAX: 0000000000000000 RBX: 0000000000000020 RCX:
>>>>>>> ffffa8d64225bb5e
>>>>>>> RDX: ffff93151d921880 RSI: ffffa8d64225bac8 RDI:
>>>>>>> ffff931511a1a9d8
>>>>>>> RBP: ffffa8d64225bb10 R08: 0000000000000001 R09:
>>>>>>> ffffa8d64225ba60
>>>>>>> R10: 0000000000000002 R11: 000000000000000d R12:
>>>>>>> 0000000000000001
>>>>>>> R13: 0000000000000000 R14: ffffa8d64225bb5e R15:
>>>>>>> ffff931511a1a9d8
>>>>>>> FS: 00007ff8ea7fa9c0(0000) GS:ffff9317fe6c0000(0000)
>>>>>>> knlGS:0000000000000000
>>>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>> CR2: ffffffffffffffd6 CR3: 000000010d5a4000 CR4:
>>>>>>> 0000000000750ee0
>>>>>>> PKRU: 55555554
>>>>>>> Call Trace:
>>>>>>> drm_dp_dpcd_access+0x72/0x110 [drm_kms_helper]
>>>>>>> drm_dp_dpcd_read+0xb7/0xf0 [drm_kms_helper]
>>>>>>> drm_dp_start_crc+0x38/0xb0 [drm_kms_helper]
>>>>>>> amdgpu_dm_crtc_set_crc_source+0x1ae/0x3e0 [amdgpu]
>>>>>>> crtc_crc_open+0x174/0x220 [drm]
>>>>>>> full_proxy_open+0x168/0x1f0
>>>>>>> ? open_proxy_open+0x100/0x100
>>>>>>> do_dentry_open+0x156/0x370
>>>>>>> vfs_open+0x2d/0x30
>>>>>>>
>>>>>>> v2: fix some typo
>>>>>>>
>>>>>>> Signed-off-by: Perry Yuan <[email protected]>
>>>>>>> ---
>>>>>>> drivers/gpu/drm/drm_dp_helper.c | 4 ++++
>>>>>>> 1 file changed, 4 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/drm_dp_helper.c
>>>>>>> b/drivers/gpu/drm/drm_dp_helper.c index
>>>>>>> 6d0f2c447f3b..76b28396001a
>>>>>>> 100644
>>>>>>> --- a/drivers/gpu/drm/drm_dp_helper.c
>>>>>>> +++ b/drivers/gpu/drm/drm_dp_helper.c
>>>>>>> @@ -260,6 +260,10 @@ static int drm_dp_dpcd_access(struct
>>>>>>> drm_dp_aux
>>>>>> *aux, u8 request,
>>>>>>> msg.buffer = buffer;
>>>>>>> msg.size = size;
>>>>>>>
>>>>>>> + /* No transfer function is set, so not an available DP connector */
>>>>>>> + if (!aux->transfer)
>>>>>>> + return -EINVAL;
>>>>>>> +
>>>>>>> mutex_lock(&aux->hw_mutex);
>>>>>>>
>>>>>>> /*
>>>>>>
>>>>>> --
>>>>>> Jani Nikula, Intel Open Source Graphics Center
>>>>
>>>> --
>>>> Jani Nikula, Intel Open Source Graphics Center
>>
>> --
>> Jani Nikula, Intel Open Source Graphics Center
[AMD Official Use Only]
Hi Harry.
> -----Original Message-----
> From: Wentland, Harry <[email protected]>
> Sent: Wednesday, November 10, 2021 11:32 PM
> To: Yuan, Perry <[email protected]>; Jani Nikula
> <[email protected]>; Maarten Lankhorst
> <[email protected]>; Maxime Ripard <[email protected]>;
> Thomas Zimmermann <[email protected]>; David Airlie <[email protected]>;
> Daniel Vetter <[email protected]>
> Cc: Huang, Shimmer <[email protected]>; Huang, Ray
> <[email protected]>; [email protected]; dri-
> [email protected]; Limonciello, Mario <[email protected]>
> Subject: Re: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer dereference on
> drm_dp_dpcd_access
>
> On 2021-11-05 03:35, Yuan, Perry wrote:
> > [AMD Official Use Only]
> >
> > Hi Jani:
> >
> >
> >> -----Original Message-----
> >> From: Jani Nikula <[email protected]>
> >> Sent: Wednesday, November 3, 2021 7:31 PM
> >> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
> >> <[email protected]>; Maxime Ripard
> >> <[email protected]>; Thomas Zimmermann <[email protected]>;
> David
> >> Airlie <[email protected]>; Daniel Vetter <[email protected]>
> >> Cc: [email protected]; [email protected];
> >> Huang, Shimmer <[email protected]>; Huang, Ray
> <[email protected]>
> >> Subject: RE: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer
> >> dereference on drm_dp_dpcd_access
> >>
> >> [CAUTION: External Email]
> >>
> >> On Wed, 03 Nov 2021, "Yuan, Perry" <[email protected]> wrote:
> >>> [AMD Official Use Only]
> >>>
> >>> Hi Jani:
> >>>
> >>>> -----Original Message-----
> >>>> From: Jani Nikula <[email protected]>
> >>>> Sent: Tuesday, November 2, 2021 4:40 PM
> >>>> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
> >>>> <[email protected]>; Maxime Ripard
> >>>> <[email protected]>; Thomas Zimmermann <[email protected]>;
> >> David
> >>>> Airlie <[email protected]>; Daniel Vetter <[email protected]>
> >>>> Cc: [email protected]; [email protected];
> >>>> Huang, Shimmer <[email protected]>; Huang, Ray
> >> <[email protected]>
> >>>> Subject: RE: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer
> >>>> dereference on drm_dp_dpcd_access
> >>>>
> >>>> [CAUTION: External Email]
> >>>>
> >>>> On Tue, 02 Nov 2021, "Yuan, Perry" <[email protected]> wrote:
> >>>>> [AMD Official Use Only]
> >>>>>
> >>>>> Hi Jani:
> >>>>> Thanks for your comments.
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Jani Nikula <[email protected]>
> >>>>>> Sent: Monday, November 1, 2021 9:07 PM
> >>>>>> To: Yuan, Perry <[email protected]>; Maarten Lankhorst
> >>>>>> <[email protected]>; Maxime Ripard
> >>>>>> <[email protected]>; Thomas Zimmermann
> >> <[email protected]>;
> >>>> David
> >>>>>> Airlie <[email protected]>; Daniel Vetter <[email protected]>
> >>>>>> Cc: Yuan, Perry <[email protected]>;
> >>>>>> [email protected]; linux- [email protected];
> >>>>>> Huang, Shimmer <[email protected]>; Huang, Ray
> >>>> <[email protected]>
> >>>>>> Subject: Re: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer
> >>>>>> dereference on drm_dp_dpcd_access
> >>>>>>
> >>>>>> [CAUTION: External Email]
> >>>>>>
> >>>>>> On Mon, 01 Nov 2021, Perry Yuan <[email protected]> wrote:
> >>>>>>> Fix below crash by adding a check in the drm_dp_dpcd_access
> >>>>>>> which ensures that aux->transfer was actually initialized earlier.
> >>>>>>
> >>>>>> Gut feeling says this is papering over a real usage issue
> >>>>>> somewhere else. Why is the aux being used for transfers before
> >>>>>> ->transfer has been set? Why should the dp helper be defensive
> >>>>>> against all kinds of
> >>>> misprogramming?
> >>>>>>
> >>>>>>
> >>>>>> BR,
> >>>>>> Jani.
> >>>>>>
> >>>>>
> >>>>> The issue was found by Intel IGT test suite, graphic by pass test case.
> >>>>>
> >> https://g
> itl
> >>>>> ab.freedesktop.org%2Fdrm%2Figt-gpu-
> >>>> tools&data=04%7C01%7CPerry.Yuan
> >>>>> %40amd.com%7C83d011acfe65437c0fa808d99ddc65b0%7C3dd8961fe4
> >> 884e6
> >>>> 08e11a8
> >>>>>
> >>>>
> >> 2d994e183d%7C0%7C0%7C637714392203200313%7CUnknown%7CTWFpbG
> >> Zsb
> >>>> 3d8eyJWIj
> >>>>>
> >>>>
> >> oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1
> >> 00
> >>>> 0&am
> >>>>>
> >>>>
> >> p;sdata=snPpRYLGeJtTpNGle1YHZAvevcABbgLkgOsffiNzQPw%3D&reser
> >> ved
> >>>> =0
> >>>>> normally use case will not see the issue.
> >>>>> To avoid this issue happy again when we run the test case , it
> >>>>> will be nice to
> >>>> add a check before the transfer is called.
> >>>>> And we can see that it really needs to have a check here to make
> >>>>> ITG &kernel
> >>>> happy.
> >>>>
> >>>> You're missing my point. What is the root cause? Why do you have
> >>>> the aux device or connector registered before ->transfer function
> >>>> is initialized. I don't think you should do that.
> >>>>
> >>>> BR,
> >>>> Jani.
> >>>>
> >>>
> >>> One potential IGT fix patch to resolve the test case failure is:
> >>>
> >>> tests/amdgpu/amd_bypass.c
> >>> data->pipe_crc = igt_pipe_crc_new(data->drm_fd, data->pipe_id,
> >>> - AMDGPU_PIPE_CRC_SOURCE_DPRX);
> >>> +
> >>> INTEL_PIPE_CRC_SOURCE_AUTO); The kernel panic error gone after change
> "dprx" to "auto" in the IGT test.
> >>>
> >>> In my view ,the IGT amdgpu bypass test will do some common setup
> >>> work
> >> including crc piple, source.
> >>> When the IGT sets up a new CRC pipe capture source for amdgpu bypass
> >> test, the SOURCE was set as "dprx" instead of "auto"
> >>> It makes "amdgpu_dm_crtc_set_crc_source()" failed to set correct
> >>> AUX
> >> and it's transfer function invalid .
> >>> The system I tested use HDMI port connected to monitor .
> >>>
> >>> amdgpu_dm_crtc_set_crc_source-> (aux = (aconn->port) ? &aconn-
> >>> port->aux : &aconn->dm_dp_aux.aux;)
> >>> drm_dp_start_crc ->
> >>> drm_dp_dpcd_readb-> aux->transfer is NULL, issue here.
> >>> The fix will use the "auto" keyword, which will let the driver
> >>> select a
> >> default source of frame CRCs for this CRTC.
> >>>
> >>> Correct me if have some wrong points.
> >>
> >> Apparently I'm completely failing to communicate my POV to you.
> >>
> >> If you have a kernel oops, the fix needs to be in the kernel, not IGT.
> >>
> >> The question is, why is it possible for IGT (or any userspace) to
> >> trigger AUX communication when the ->transfer function is not set? In
> >> my opinion, the kernel driver should not have exposed the interface
> >> at all if the ->transfer function is not set. The interface is useless without the -
> >transfer function.
> >> IMO, that's the bug.
> >>
> >
> > Yes , you are correct , the transfer shouldn't be called before it is ready !
> >
> > Let me explain more details in my view .
> > Maybe the root cause is not why the aux->transfer is not called before it is
> registered in this case.
> > I suppose the issue was triggered by wrong CRC pipe source .
> >
> > Actually the aux->transfer has been registered when amdgpu DM registered at
> kernel boot.
> > IGT test was run when system login to Gnome desktop.
> >
> > amdgpu_dm_initialize_dp_connector->
> > aconnector->dm_dp_aux.aux.transfer = dm_dp_aux_transfer;
> >
> > The test case failed when the IGT set an "DPRX" CRC pipe source while the
> HDMI connected to monitor only.
> > At this time, the aux->transfer is NULL, and dp helper did not check the
> transfer pointer NULL or not.
> > It calls the transfers to DPCD read, then you see the kernel panic log.
> >
> > amdgpu_dm_crtc_funcs-> amdgpu_dm_crtc_set_crc_source->
> > drm_dp_start_crc
> >
> > * And if the DP cable connected only, the issue will not happen. Test will pass.
> > * If I change the CRC source to "auto", kernel will not see the panic as well.
> > Maybe the failed test case need to run on the DP instead of HDMI, I am not
> sure at now.
> >
>
> Two things need to happen:
> 1) IGT should skip tests requiring DPRX CRC source if not on a
> DP connector.
> 2) Driver should return EINVAL (or another appropriate error) if
> DPRX CRC source is requested when the CRTC is not connected to
> a DP display. Alternatively we could make sure that DPRX is
> not advertised as a CRC source in this case but I'm not sure
> how difficult that would be.
>
> Like Jani said, I don't think the current patch is the correct one as it doesn't get
> to the root cause. The root cause fix should be in the CRC debugfs handling code.
>
> Harry
Got your point.
I will make another two patches as you suggested.
Thanks for your feedback.
Perry.
>
> >
> > Hopping this info can help.
> >
> > Perry.
> >
> >
> >>
> >> BR,
> >> Jani.
> >>
> >>>
> >>> Thank you!
> >>> Perry.
> >>>
> >>>>
> >>>>>
> >>>>> Perry.
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> BUG: kernel NULL pointer dereference, address: 0000000000000000
> >>>>>>> PGD
> >>>>>>> 0 P4D 0
> >>>>>>> Oops: 0010 [#1] SMP NOPTI
> >>>>>>> RIP: 0010:0x0
> >>>>>>> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
> >>>>>>> RSP: 0018:ffffa8d64225bab8 EFLAGS: 00010246
> >>>>>>> RAX: 0000000000000000 RBX: 0000000000000020 RCX:
> >>>>>>> ffffa8d64225bb5e
> >>>>>>> RDX: ffff93151d921880 RSI: ffffa8d64225bac8 RDI:
> >>>>>>> ffff931511a1a9d8
> >>>>>>> RBP: ffffa8d64225bb10 R08: 0000000000000001 R09:
> >>>>>>> ffffa8d64225ba60
> >>>>>>> R10: 0000000000000002 R11: 000000000000000d R12:
> >>>>>>> 0000000000000001
> >>>>>>> R13: 0000000000000000 R14: ffffa8d64225bb5e R15:
> >>>>>>> ffff931511a1a9d8
> >>>>>>> FS: 00007ff8ea7fa9c0(0000) GS:ffff9317fe6c0000(0000)
> >>>>>>> knlGS:0000000000000000
> >>>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>>>>> CR2: ffffffffffffffd6 CR3: 000000010d5a4000 CR4:
> >>>>>>> 0000000000750ee0
> >>>>>>> PKRU: 55555554
> >>>>>>> Call Trace:
> >>>>>>> drm_dp_dpcd_access+0x72/0x110 [drm_kms_helper]
> >>>>>>> drm_dp_dpcd_read+0xb7/0xf0 [drm_kms_helper]
> >>>>>>> drm_dp_start_crc+0x38/0xb0 [drm_kms_helper]
> >>>>>>> amdgpu_dm_crtc_set_crc_source+0x1ae/0x3e0 [amdgpu]
> >>>>>>> crtc_crc_open+0x174/0x220 [drm]
> >>>>>>> full_proxy_open+0x168/0x1f0
> >>>>>>> ? open_proxy_open+0x100/0x100
> >>>>>>> do_dentry_open+0x156/0x370
> >>>>>>> vfs_open+0x2d/0x30
> >>>>>>>
> >>>>>>> v2: fix some typo
> >>>>>>>
> >>>>>>> Signed-off-by: Perry Yuan <[email protected]>
> >>>>>>> ---
> >>>>>>> drivers/gpu/drm/drm_dp_helper.c | 4 ++++
> >>>>>>> 1 file changed, 4 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/drivers/gpu/drm/drm_dp_helper.c
> >>>>>>> b/drivers/gpu/drm/drm_dp_helper.c index
> >>>>>>> 6d0f2c447f3b..76b28396001a
> >>>>>>> 100644
> >>>>>>> --- a/drivers/gpu/drm/drm_dp_helper.c
> >>>>>>> +++ b/drivers/gpu/drm/drm_dp_helper.c
> >>>>>>> @@ -260,6 +260,10 @@ static int drm_dp_dpcd_access(struct
> >>>>>>> drm_dp_aux
> >>>>>> *aux, u8 request,
> >>>>>>> msg.buffer = buffer;
> >>>>>>> msg.size = size;
> >>>>>>>
> >>>>>>> + /* No transfer function is set, so not an available DP connector */
> >>>>>>> + if (!aux->transfer)
> >>>>>>> + return -EINVAL;
> >>>>>>> +
> >>>>>>> mutex_lock(&aux->hw_mutex);
> >>>>>>>
> >>>>>>> /*
> >>>>>>
> >>>>>> --
> >>>>>> Jani Nikula, Intel Open Source Graphics Center
> >>>>
> >>>> --
> >>>> Jani Nikula, Intel Open Source Graphics Center
> >>
> >> --
> >> Jani Nikula, Intel Open Source Graphics Center