2021-10-04 22:35:00

by Doug Anderson

[permalink] [raw]
Subject: [PATCH] drm/edid: Fix crash with zero/invalid EDID

In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of
the EDID") I broke out reading the base block of the EDID to its own
function. Unfortunately, when I did that I messed up the handling when
drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or
when we went through 4 loops and didn't get a valid EDID. Specifically
I needed to pass the broken EDID to connector_bad_edid() but now I was
passing an error-pointer.

Let's re-jigger things so we can pass the bad EDID in properly.

Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID")
Reported-by: kernel test robot <[email protected]>
Reported-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
---

drivers/gpu/drm/drm_edid.c | 27 +++++++++++----------------
1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 9b19eee0e1b4..9c9463ec5465 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -1911,13 +1911,15 @@ int drm_add_override_edid_modes(struct drm_connector *connector)
}
EXPORT_SYMBOL(drm_add_override_edid_modes);

-static struct edid *drm_do_get_edid_base_block(
+static struct edid *drm_do_get_edid_base_block(struct drm_connector *connector,
int (*get_edid_block)(void *data, u8 *buf, unsigned int block,
size_t len),
- void *data, bool *edid_corrupt, int *null_edid_counter)
+ void *data)
{
- int i;
+ int *null_edid_counter = connector ? &connector->null_edid_counter : NULL;
+ bool *edid_corrupt = connector ? &connector->edid_corrupt : NULL;
void *edid;
+ int i;

edid = kmalloc(EDID_LENGTH, GFP_KERNEL);
if (edid == NULL)
@@ -1941,9 +1943,8 @@ static struct edid *drm_do_get_edid_base_block(
return edid;

carp:
- kfree(edid);
- return ERR_PTR(-EINVAL);
-
+ if (connector)
+ connector_bad_edid(connector, edid, 1);
out:
kfree(edid);
return NULL;
@@ -1982,14 +1983,9 @@ struct edid *drm_do_get_edid(struct drm_connector *connector,
if (override)
return override;

- edid = (u8 *)drm_do_get_edid_base_block(get_edid_block, data,
- &connector->edid_corrupt,
- &connector->null_edid_counter);
- if (IS_ERR_OR_NULL(edid)) {
- if (IS_ERR(edid))
- connector_bad_edid(connector, edid, 1);
+ edid = (u8 *)drm_do_get_edid_base_block(connector, get_edid_block, data);
+ if (!edid)
return NULL;
- }

/* if there's no extensions or no connector, we're done */
valid_extensions = edid[0x7e];
@@ -2142,14 +2138,13 @@ u32 drm_edid_get_panel_id(struct i2c_adapter *adapter)
struct edid *edid;
u32 panel_id;

- edid = drm_do_get_edid_base_block(drm_do_probe_ddc_edid, adapter,
- NULL, NULL);
+ edid = drm_do_get_edid_base_block(NULL, drm_do_probe_ddc_edid, adapter);

/*
* There are no manufacturer IDs of 0, so if there is a problem reading
* the EDID then we'll just return 0.
*/
- if (IS_ERR_OR_NULL(edid))
+ if (!edid)
return 0;

panel_id = edid_extract_panel_id(edid);
--
2.33.0.800.g4c38ced690-goog


2021-10-04 23:39:28

by Ville Syrjälä

[permalink] [raw]
Subject: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID

On Mon, Oct 04, 2021 at 09:21:27AM -0700, Douglas Anderson wrote:
> In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of
> the EDID") I broke out reading the base block of the EDID to its own
> function. Unfortunately, when I did that I messed up the handling when
> drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or
> when we went through 4 loops and didn't get a valid EDID. Specifically
> I needed to pass the broken EDID to connector_bad_edid() but now I was
> passing an error-pointer.
>
> Let's re-jigger things so we can pass the bad EDID in properly.
>
> Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID")
> Reported-by: kernel test robot <[email protected]>
> Reported-by: Geert Uytterhoeven <[email protected]>
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
>
> drivers/gpu/drm/drm_edid.c | 27 +++++++++++----------------
> 1 file changed, 11 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> index 9b19eee0e1b4..9c9463ec5465 100644
> --- a/drivers/gpu/drm/drm_edid.c
> +++ b/drivers/gpu/drm/drm_edid.c
> @@ -1911,13 +1911,15 @@ int drm_add_override_edid_modes(struct drm_connector *connector)
> }
> EXPORT_SYMBOL(drm_add_override_edid_modes);
>
> -static struct edid *drm_do_get_edid_base_block(
> +static struct edid *drm_do_get_edid_base_block(struct drm_connector *connector,
> int (*get_edid_block)(void *data, u8 *buf, unsigned int block,
> size_t len),
> - void *data, bool *edid_corrupt, int *null_edid_counter)
> + void *data)
> {
> - int i;
> + int *null_edid_counter = connector ? &connector->null_edid_counter : NULL;
> + bool *edid_corrupt = connector ? &connector->edid_corrupt : NULL;
> void *edid;
> + int i;
>
> edid = kmalloc(EDID_LENGTH, GFP_KERNEL);
> if (edid == NULL)
> @@ -1941,9 +1943,8 @@ static struct edid *drm_do_get_edid_base_block(
> return edid;
>
> carp:
> - kfree(edid);
> - return ERR_PTR(-EINVAL);
> -
> + if (connector)
> + connector_bad_edid(connector, edid, 1);

BTW I believe connector_bad_edid() itself is broken since
commit e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid
corruption test"). Before we've even allocated the memory for the
extension blocks that code now assumes edid[0x7e] is to be 100%
trusted and goes and calculates the checksum on a block based on
that. So that's likely going to be pointing somewhere beyond the base
block into memory we've not even allocated. So anyone who wanted
could craft a bogus EDID and maybe get something interesting to
happen.

Would be good if someone could fix that while at it. Or just revert
the offending commit if there is no simple solution immediately in
sight.

The fact that we're parsing entirely untrustworthy crap in the kernel
always worries me. Either we need super careful review of all relevant
code, and/or we need to think about moving the parser out of the kernel.
I was considering playing around with the usermode helper stuff. IIRC
there is a way to embed the userspace binary into the kernel and just
fire it up when needed. But so far it's been the usual -ENOTIME for
me...

--
Ville Syrj?l?
Intel

2021-10-05 00:18:04

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID

Hi Douglas,

On Mon, Oct 4, 2021 at 6:22 PM Douglas Anderson <[email protected]> wrote:
> In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of
> the EDID") I broke out reading the base block of the EDID to its own
> function. Unfortunately, when I did that I messed up the handling when
> drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or
> when we went through 4 loops and didn't get a valid EDID. Specifically
> I needed to pass the broken EDID to connector_bad_edid() but now I was
> passing an error-pointer.
>
> Let's re-jigger things so we can pass the bad EDID in properly.
>
> Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID")
> Reported-by: kernel test robot <[email protected]>
> Reported-by: Geert Uytterhoeven <[email protected]>
> Signed-off-by: Douglas Anderson <[email protected]>

The crash is was seeing is gone, so
Tested-by: Geert Uytterhoeven <[email protected]>

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2021-10-05 00:44:18

by Doug Anderson

[permalink] [raw]
Subject: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID

Hi,

On Mon, Oct 4, 2021 at 10:14 AM Geert Uytterhoeven <[email protected]> wrote:
>
> Hi Douglas,
>
> On Mon, Oct 4, 2021 at 6:22 PM Douglas Anderson <[email protected]> wrote:
> > In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of
> > the EDID") I broke out reading the base block of the EDID to its own
> > function. Unfortunately, when I did that I messed up the handling when
> > drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or
> > when we went through 4 loops and didn't get a valid EDID. Specifically
> > I needed to pass the broken EDID to connector_bad_edid() but now I was
> > passing an error-pointer.
> >
> > Let's re-jigger things so we can pass the bad EDID in properly.
> >
> > Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID")
> > Reported-by: kernel test robot <[email protected]>
> > Reported-by: Geert Uytterhoeven <[email protected]>
> > Signed-off-by: Douglas Anderson <[email protected]>
>
> The crash is was seeing is gone, so
> Tested-by: Geert Uytterhoeven <[email protected]>

Thanks for testing! I'll plan to apply tomorrow morning (California
time) to balance between giving folks a chance to yell at me for my
patch and the urgency of fixing the breakage.

-Doug

2021-10-05 13:34:55

by Fangzhi Zuo

[permalink] [raw]
Subject: RE: [PATCH] drm/edid: Fix crash with zero/invalid EDID

[AMD Official Use Only]

Hi Ville:

Yhea, it is pretty old change from two years ago, and it is no long valid anymore. Please simply drop it.

Regards,
Jerry

> -----Original Message-----
> From: Ville Syrj?l? <[email protected]>
> Sent: October 4, 2021 3:45 PM
> To: Douglas Anderson <[email protected]>
> Cc: [email protected]; [email protected];
> [email protected]; Daniel Vetter <[email protected]>; David Airlie
> <[email protected]>; Jani Nikula <[email protected]>; Linus Walleij
> <[email protected]>; Maarten Lankhorst
> <[email protected]>; Maxime Ripard
> <[email protected]>; Sam Ravnborg <[email protected]>; Thomas
> Zimmermann <[email protected]>; [email protected]; Zuo,
> Jerry <[email protected]>; Wentland, Harry <[email protected]>;
> Siqueira, Rodrigo <[email protected]>
> Subject: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID
>
> On Mon, Oct 04, 2021 at 09:21:27AM -0700, Douglas Anderson wrote:
> > In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of
> > the EDID") I broke out reading the base block of the EDID to its own
> > function. Unfortunately, when I did that I messed up the handling when
> > drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or
> > when we went through 4 loops and didn't get a valid EDID. Specifically
> > I needed to pass the broken EDID to connector_bad_edid() but now I was
> > passing an error-pointer.
> >
> > Let's re-jigger things so we can pass the bad EDID in properly.
> >
> > Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the
> > EDID")
> > Reported-by: kernel test robot <[email protected]>
> > Reported-by: Geert Uytterhoeven <[email protected]>
> > Signed-off-by: Douglas Anderson <[email protected]>
> > ---
> >
> > drivers/gpu/drm/drm_edid.c | 27 +++++++++++----------------
> > 1 file changed, 11 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> > index 9b19eee0e1b4..9c9463ec5465 100644
> > --- a/drivers/gpu/drm/drm_edid.c
> > +++ b/drivers/gpu/drm/drm_edid.c
> > @@ -1911,13 +1911,15 @@ int drm_add_override_edid_modes(struct
> > drm_connector *connector) }
> > EXPORT_SYMBOL(drm_add_override_edid_modes);
> >
> > -static struct edid *drm_do_get_edid_base_block(
> > +static struct edid *drm_do_get_edid_base_block(struct drm_connector
> > +*connector,
> > int (*get_edid_block)(void *data, u8 *buf, unsigned int block,
> > size_t len),
> > - void *data, bool *edid_corrupt, int *null_edid_counter)
> > + void *data)
> > {
> > - int i;
> > + int *null_edid_counter = connector ? &connector-
> >null_edid_counter : NULL;
> > + bool *edid_corrupt = connector ? &connector->edid_corrupt : NULL;
> > void *edid;
> > + int i;
> >
> > edid = kmalloc(EDID_LENGTH, GFP_KERNEL);
> > if (edid == NULL)
> > @@ -1941,9 +1943,8 @@ static struct edid
> *drm_do_get_edid_base_block(
> > return edid;
> >
> > carp:
> > - kfree(edid);
> > - return ERR_PTR(-EINVAL);
> > -
> > + if (connector)
> > + connector_bad_edid(connector, edid, 1);
>
> BTW I believe connector_bad_edid() itself is broken since commit
> e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption
> test"). Before we've even allocated the memory for the extension blocks
> that code now assumes edid[0x7e] is to be 100% trusted and goes and
> calculates the checksum on a block based on that. So that's likely going to be
> pointing somewhere beyond the base block into memory we've not even
> allocated. So anyone who wanted could craft a bogus EDID and maybe get
> something interesting to happen.
>
> Would be good if someone could fix that while at it. Or just revert the
> offending commit if there is no simple solution immediately in sight.
>
> The fact that we're parsing entirely untrustworthy crap in the kernel always
> worries me. Either we need super careful review of all relevant code, and/or
> we need to think about moving the parser out of the kernel.
> I was considering playing around with the usermode helper stuff. IIRC there
> is a way to embed the userspace binary into the kernel and just fire it up
> when needed. But so far it's been the usual -ENOTIME for me...
>
> --
> Ville Syrj?l?
> Intel

2021-10-05 13:47:21

by Doug Anderson

[permalink] [raw]
Subject: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID

Hi,

On Mon, Oct 4, 2021 at 5:40 PM Doug Anderson <[email protected]> wrote:
>
> Hi,
>
> On Mon, Oct 4, 2021 at 10:14 AM Geert Uytterhoeven <[email protected]> wrote:
> >
> > Hi Douglas,
> >
> > On Mon, Oct 4, 2021 at 6:22 PM Douglas Anderson <[email protected]> wrote:
> > > In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of
> > > the EDID") I broke out reading the base block of the EDID to its own
> > > function. Unfortunately, when I did that I messed up the handling when
> > > drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or
> > > when we went through 4 loops and didn't get a valid EDID. Specifically
> > > I needed to pass the broken EDID to connector_bad_edid() but now I was
> > > passing an error-pointer.
> > >
> > > Let's re-jigger things so we can pass the bad EDID in properly.
> > >
> > > Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID")
> > > Reported-by: kernel test robot <[email protected]>
> > > Reported-by: Geert Uytterhoeven <[email protected]>
> > > Signed-off-by: Douglas Anderson <[email protected]>
> >
> > The crash is was seeing is gone, so
> > Tested-by: Geert Uytterhoeven <[email protected]>
>
> Thanks for testing! I'll plan to apply tomorrow morning (California
> time) to balance between giving folks a chance to yell at me for my
> patch and the urgency of fixing the breakage.

Ah, doh! I can't push until I can get a review tag from someone. As
soon as I see one then I'll give it a push.

-Doug

2021-10-05 15:18:17

by Doug Anderson

[permalink] [raw]
Subject: Re: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID)

Hi,

On Tue, Oct 5, 2021 at 6:33 AM Zuo, Jerry <[email protected]> wrote:
>
> > BTW I believe connector_bad_edid() itself is broken since commit
> > e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption
> > test"). Before we've even allocated the memory for the extension blocks
> > that code now assumes edid[0x7e] is to be 100% trusted and goes and
> > calculates the checksum on a block based on that. So that's likely going to be
> > pointing somewhere beyond the base block into memory we've not even
> > allocated. So anyone who wanted could craft a bogus EDID and maybe get
> > something interesting to happen.
> >
> > Would be good if someone could fix that while at it. Or just revert the
> > offending commit if there is no simple solution immediately in sight.
> >
> > The fact that we're parsing entirely untrustworthy crap in the kernel always
> > worries me. Either we need super careful review of all relevant code, and/or
> > we need to think about moving the parser out of the kernel.
> > I was considering playing around with the usermode helper stuff. IIRC there
> > is a way to embed the userspace binary into the kernel and just fire it up
> > when needed. But so far it's been the usual -ENOTIME for me...
> >
> [AMD Official Use Only]
>
> Hi Ville:
>
> Yhea, it is pretty old change from two years ago, and it is no long valid anymore. Please simply drop it.
>
> Regards,
> Jerry

I've cut out other bits from this email and changed the subject line
since I think this is an issue unrelated to the one my original patch
was fixing.

I don't actually know a ton about DP compliance testing, but I
attempted to try to be helpful and revert commit e11f5bd8228f ("drm:
Add support for DP 1.4 Compliance edid corruption test"). It wasn't
too hard to deal with the conflicts in the revert itself, but then
things didn't compile because there are two places that use
`real_edid_checksum` and that goes away if I revert the patch.

I've made an attempt to fix the problem by just adding a bounds check.
Perhaps you can see if that looks good to you:

https://lore.kernel.org/r/20211005081022.1.Ib059f9c23c2611cb5a9d760e7d0a700c1295928d@changeid

-Doug

2021-10-05 15:28:08

by Fangzhi Zuo

[permalink] [raw]
Subject: RE: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID)

[AMD Official Use Only]

> -----Original Message-----
> From: Doug Anderson <[email protected]>
> Sent: October 5, 2021 11:14 AM
> To: Zuo, Jerry <[email protected]>
> Cc: Ville Syrj?l? <[email protected]>; dri-
> [email protected]; [email protected]; [email protected];
> Daniel Vetter <[email protected]>; David Airlie <[email protected]>; Jani Nikula
> <[email protected]>; Linus Walleij <[email protected]>; Maarten
> Lankhorst <[email protected]>; Maxime Ripard
> <[email protected]>; Sam Ravnborg <[email protected]>; Thomas
> Zimmermann <[email protected]>; [email protected];
> Wentland, Harry <[email protected]>; Siqueira, Rodrigo
> <[email protected]>; Kuogee Hsieh <[email protected]>
> Subject: Re: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid:
> Fix crash with zero/invalid EDID)
>
> Hi,
>
> On Tue, Oct 5, 2021 at 6:33 AM Zuo, Jerry <[email protected]> wrote:
> >
> > > BTW I believe connector_bad_edid() itself is broken since commit
> > > e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid
> > > corruption test"). Before we've even allocated the memory for the
> > > extension blocks that code now assumes edid[0x7e] is to be 100%
> > > trusted and goes and calculates the checksum on a block based on
> > > that. So that's likely going to be pointing somewhere beyond the
> > > base block into memory we've not even allocated. So anyone who
> > > wanted could craft a bogus EDID and maybe get something interesting to
> happen.
> > >
> > > Would be good if someone could fix that while at it. Or just revert
> > > the offending commit if there is no simple solution immediately in sight.
> > >
> > > The fact that we're parsing entirely untrustworthy crap in the
> > > kernel always worries me. Either we need super careful review of all
> > > relevant code, and/or we need to think about moving the parser out of
> the kernel.
> > > I was considering playing around with the usermode helper stuff.
> > > IIRC there is a way to embed the userspace binary into the kernel
> > > and just fire it up when needed. But so far it's been the usual -ENOTIME
> for me...
> > >
> > [AMD Official Use Only]
> >
> > Hi Ville:
> >
> > Yhea, it is pretty old change from two years ago, and it is no long valid
> anymore. Please simply drop it.
> >
> > Regards,
> > Jerry
>
> I've cut out other bits from this email and changed the subject line since I
> think this is an issue unrelated to the one my original patch was fixing.
>
> I don't actually know a ton about DP compliance testing, but I attempted to
> try to be helpful and revert commit e11f5bd8228f ("drm:
> Add support for DP 1.4 Compliance edid corruption test"). It wasn't too hard
> to deal with the conflicts in the revert itself, but then things didn't compile
> because there are two places that use `real_edid_checksum` and that goes
> away if I revert the patch.
>
> I've made an attempt to fix the problem by just adding a bounds check.
> Perhaps you can see if that looks good to you:
>
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.
> kernel.org%2Fr%2F20211005081022.1.Ib059f9c23c2611cb5a9d760e7d0a700c1
> 295928d%40changeid&amp;data=04%7C01%7CJerry.Zuo%40amd.com%7C90
> b948659454400cedd308d98812c339%7C3dd8961fe4884e608e11a82d994e183d
> %7C0%7C0%7C637690436453163864%7CUnknown%7CTWFpbGZsb3d8eyJWIj
> oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1
> 000&amp;sdata=OtSngWlYyDc1NbNSgAeALqN3nF%2Bnw08nJ068cpAKZJk%3
> D&amp;reserved=0
>
> -Doug

The patch used for DP1.4 compliance edid corruption test. Let me double check if edid corruption test could be passed without the patch.

2021-10-05 16:47:30

by Ville Syrjälä

[permalink] [raw]
Subject: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID

On Mon, Oct 04, 2021 at 09:21:27AM -0700, Douglas Anderson wrote:
> In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of
> the EDID") I broke out reading the base block of the EDID to its own
> function. Unfortunately, when I did that I messed up the handling when
> drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or
> when we went through 4 loops and didn't get a valid EDID. Specifically
> I needed to pass the broken EDID to connector_bad_edid() but now I was
> passing an error-pointer.
>
> Let's re-jigger things so we can pass the bad EDID in properly.
>
> Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID")
> Reported-by: kernel test robot <[email protected]>
> Reported-by: Geert Uytterhoeven <[email protected]>
> Signed-off-by: Douglas Anderson <[email protected]>

A bit of historical fallout zone this part of the code. So
not the easiest thing to read in general. But looks like what
you have here should work.

Reviewed-by: Ville Syrj?l? <[email protected]>

> ---
>
> drivers/gpu/drm/drm_edid.c | 27 +++++++++++----------------
> 1 file changed, 11 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> index 9b19eee0e1b4..9c9463ec5465 100644
> --- a/drivers/gpu/drm/drm_edid.c
> +++ b/drivers/gpu/drm/drm_edid.c
> @@ -1911,13 +1911,15 @@ int drm_add_override_edid_modes(struct drm_connector *connector)
> }
> EXPORT_SYMBOL(drm_add_override_edid_modes);
>
> -static struct edid *drm_do_get_edid_base_block(
> +static struct edid *drm_do_get_edid_base_block(struct drm_connector *connector,
> int (*get_edid_block)(void *data, u8 *buf, unsigned int block,
> size_t len),
> - void *data, bool *edid_corrupt, int *null_edid_counter)
> + void *data)
> {
> - int i;
> + int *null_edid_counter = connector ? &connector->null_edid_counter : NULL;
> + bool *edid_corrupt = connector ? &connector->edid_corrupt : NULL;
> void *edid;
> + int i;
>
> edid = kmalloc(EDID_LENGTH, GFP_KERNEL);
> if (edid == NULL)
> @@ -1941,9 +1943,8 @@ static struct edid *drm_do_get_edid_base_block(
> return edid;
>
> carp:
> - kfree(edid);
> - return ERR_PTR(-EINVAL);
> -
> + if (connector)
> + connector_bad_edid(connector, edid, 1);
> out:
> kfree(edid);
> return NULL;
> @@ -1982,14 +1983,9 @@ struct edid *drm_do_get_edid(struct drm_connector *connector,
> if (override)
> return override;
>
> - edid = (u8 *)drm_do_get_edid_base_block(get_edid_block, data,
> - &connector->edid_corrupt,
> - &connector->null_edid_counter);
> - if (IS_ERR_OR_NULL(edid)) {
> - if (IS_ERR(edid))
> - connector_bad_edid(connector, edid, 1);
> + edid = (u8 *)drm_do_get_edid_base_block(connector, get_edid_block, data);
> + if (!edid)
> return NULL;
> - }
>
> /* if there's no extensions or no connector, we're done */
> valid_extensions = edid[0x7e];
> @@ -2142,14 +2138,13 @@ u32 drm_edid_get_panel_id(struct i2c_adapter *adapter)
> struct edid *edid;
> u32 panel_id;
>
> - edid = drm_do_get_edid_base_block(drm_do_probe_ddc_edid, adapter,
> - NULL, NULL);
> + edid = drm_do_get_edid_base_block(NULL, drm_do_probe_ddc_edid, adapter);
>
> /*
> * There are no manufacturer IDs of 0, so if there is a problem reading
> * the EDID then we'll just return 0.
> */
> - if (IS_ERR_OR_NULL(edid))
> + if (!edid)
> return 0;
>
> panel_id = edid_extract_panel_id(edid);
> --
> 2.33.0.800.g4c38ced690-goog

--
Ville Syrj?l?
Intel

2021-10-05 18:05:51

by Harry Wentland

[permalink] [raw]
Subject: Re: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID)



On 2021-10-05 11:25, Zuo, Jerry wrote:
> [AMD Official Use Only]
>
>> -----Original Message-----
>> From: Doug Anderson <[email protected]>
>> Sent: October 5, 2021 11:14 AM
>> To: Zuo, Jerry <[email protected]>
>> Cc: Ville Syrjälä <[email protected]>; dri-
>> [email protected]; [email protected]; [email protected];
>> Daniel Vetter <[email protected]>; David Airlie <[email protected]>; Jani Nikula
>> <[email protected]>; Linus Walleij <[email protected]>; Maarten
>> Lankhorst <[email protected]>; Maxime Ripard
>> <[email protected]>; Sam Ravnborg <[email protected]>; Thomas
>> Zimmermann <[email protected]>; [email protected];
>> Wentland, Harry <[email protected]>; Siqueira, Rodrigo
>> <[email protected]>; Kuogee Hsieh <[email protected]>
>> Subject: Re: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid:
>> Fix crash with zero/invalid EDID)
>>
>> Hi,
>>
>> On Tue, Oct 5, 2021 at 6:33 AM Zuo, Jerry <[email protected]> wrote:
>>>
>>>> BTW I believe connector_bad_edid() itself is broken since commit
>>>> e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid
>>>> corruption test"). Before we've even allocated the memory for the
>>>> extension blocks that code now assumes edid[0x7e] is to be 100%
>>>> trusted and goes and calculates the checksum on a block based on
>>>> that. So that's likely going to be pointing somewhere beyond the
>>>> base block into memory we've not even allocated. So anyone who
>>>> wanted could craft a bogus EDID and maybe get something interesting to
>> happen.
>>>>
>>>> Would be good if someone could fix that while at it. Or just revert
>>>> the offending commit if there is no simple solution immediately in sight.
>>>>
>>>> The fact that we're parsing entirely untrustworthy crap in the
>>>> kernel always worries me. Either we need super careful review of all
>>>> relevant code, and/or we need to think about moving the parser out of
>> the kernel.
>>>> I was considering playing around with the usermode helper stuff.
>>>> IIRC there is a way to embed the userspace binary into the kernel
>>>> and just fire it up when needed. But so far it's been the usual -ENOTIME
>> for me...
>>>>
>>> [AMD Official Use Only]
>>>
>>> Hi Ville:
>>>
>>> Yhea, it is pretty old change from two years ago, and it is no long valid
>> anymore. Please simply drop it.
>>>
>>> Regards,
>>> Jerry
>>
>> I've cut out other bits from this email and changed the subject line since I
>> think this is an issue unrelated to the one my original patch was fixing.
>>
>> I don't actually know a ton about DP compliance testing, but I attempted to
>> try to be helpful and revert commit e11f5bd8228f ("drm:
>> Add support for DP 1.4 Compliance edid corruption test"). It wasn't too hard
>> to deal with the conflicts in the revert itself, but then things didn't compile
>> because there are two places that use `real_edid_checksum` and that goes
>> away if I revert the patch.
>>
>> I've made an attempt to fix the problem by just adding a bounds check.
>> Perhaps you can see if that looks good to you:
>>
>> https://lore.
kernel.org%2Fr%2F20211005081022.1.Ib059f9c23c2611cb5a9d760e7d0a700c1
>> 295928d%40changeid&amp;data=04%7C01%7CJerry.Zuo%40amd.com%7C90
>> b948659454400cedd308d98812c339%7C3dd8961fe4884e608e11a82d994e183d
>> %7C0%7C0%7C637690436453163864%7CUnknown%7CTWFpbGZsb3d8eyJWIj
>> oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1
>> 000&amp;sdata=OtSngWlYyDc1NbNSgAeALqN3nF%2Bnw08nJ068cpAKZJk%3
>> D&amp;reserved=0
>>
>> -Doug
>
> The patch used for DP1.4 compliance edid corruption test. Let me double check if edid corruption test could be passed without the patch.
>

Can you try the CTS test with Doug's fix?

https://patchwork.freedesktop.org/patch/457537/

Harry

2021-10-06 02:25:08

by Doug Anderson

[permalink] [raw]
Subject: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID

Hi,

On Tue, Oct 5, 2021 at 9:43 AM Ville Syrjälä
<[email protected]> wrote:
>
> On Mon, Oct 04, 2021 at 09:21:27AM -0700, Douglas Anderson wrote:
> > In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of
> > the EDID") I broke out reading the base block of the EDID to its own
> > function. Unfortunately, when I did that I messed up the handling when
> > drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or
> > when we went through 4 loops and didn't get a valid EDID. Specifically
> > I needed to pass the broken EDID to connector_bad_edid() but now I was
> > passing an error-pointer.
> >
> > Let's re-jigger things so we can pass the bad EDID in properly.
> >
> > Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID")
> > Reported-by: kernel test robot <[email protected]>
> > Reported-by: Geert Uytterhoeven <[email protected]>
> > Signed-off-by: Douglas Anderson <[email protected]>
>
> A bit of historical fallout zone this part of the code. So
> not the easiest thing to read in general. But looks like what
> you have here should work.
>
> Reviewed-by: Ville Syrjälä <[email protected]>

Thanks! Pushed to drm-misc/drm-misc-next:

e7bd95a7ed4e drm/edid: Fix crash with zero/invalid EDID

2021-10-06 12:08:42

by Fangzhi Zuo

[permalink] [raw]
Subject: RE: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID)

[AMD Official Use Only]

> -----Original Message-----
> From: Wentland, Harry <[email protected]>
> Sent: October 5, 2021 2:04 PM
> To: Zuo, Jerry <[email protected]>; Doug Anderson
> <[email protected]>
> Cc: Ville Syrjälä <[email protected]>; dri-
> [email protected]; [email protected]; [email protected];
> Daniel Vetter <[email protected]>; David Airlie <[email protected]>; Jani Nikula
> <[email protected]>; Linus Walleij <[email protected]>; Maarten
> Lankhorst <[email protected]>; Maxime Ripard
> <[email protected]>; Sam Ravnborg <[email protected]>; Thomas
> Zimmermann <[email protected]>; [email protected];
> Siqueira, Rodrigo <[email protected]>; Kuogee Hsieh
> <[email protected]>
> Subject: Re: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid:
> Fix crash with zero/invalid EDID)
>
>
>
> On 2021-10-05 11:25, Zuo, Jerry wrote:
> > [AMD Official Use Only]
> >
> >> -----Original Message-----
> >> From: Doug Anderson <[email protected]>
> >> Sent: October 5, 2021 11:14 AM
> >> To: Zuo, Jerry <[email protected]>
> >> Cc: Ville Syrjälä <[email protected]>; dri-
> >> [email protected]; [email protected];
> >> [email protected]; Daniel Vetter <[email protected]>; David Airlie
> >> <[email protected]>; Jani Nikula <[email protected]>; Linus
> >> Walleij <[email protected]>; Maarten Lankhorst
> >> <[email protected]>; Maxime Ripard
> >> <[email protected]>; Sam Ravnborg <[email protected]>; Thomas
> >> Zimmermann <[email protected]>; [email protected];
> >> Wentland, Harry <[email protected]>; Siqueira, Rodrigo
> >> <[email protected]>; Kuogee Hsieh <[email protected]>
> >> Subject: Re: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid:
> >> Fix crash with zero/invalid EDID)
> >>
> >> Hi,
> >>
> >> On Tue, Oct 5, 2021 at 6:33 AM Zuo, Jerry <[email protected]> wrote:
> >>>
> >>>> BTW I believe connector_bad_edid() itself is broken since commit
> >>>> e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid
> >>>> corruption test"). Before we've even allocated the memory for the
> >>>> extension blocks that code now assumes edid[0x7e] is to be 100%
> >>>> trusted and goes and calculates the checksum on a block based on
> >>>> that. So that's likely going to be pointing somewhere beyond the
> >>>> base block into memory we've not even allocated. So anyone who
> >>>> wanted could craft a bogus EDID and maybe get something interesting
> >>>> to
> >> happen.
> >>>>
> >>>> Would be good if someone could fix that while at it. Or just revert
> >>>> the offending commit if there is no simple solution immediately in sight.
> >>>>
> >>>> The fact that we're parsing entirely untrustworthy crap in the
> >>>> kernel always worries me. Either we need super careful review of
> >>>> all relevant code, and/or we need to think about moving the parser
> >>>> out of
> >> the kernel.
> >>>> I was considering playing around with the usermode helper stuff.
> >>>> IIRC there is a way to embed the userspace binary into the kernel
> >>>> and just fire it up when needed. But so far it's been the usual
> >>>> -ENOTIME
> >> for me...
> >>>>
> >>> [AMD Official Use Only]
> >>>
> >>> Hi Ville:
> >>>
> >>> Yhea, it is pretty old change from two years ago, and it is no
> >>> long valid
> >> anymore. Please simply drop it.
> >>>
> >>> Regards,
> >>> Jerry
> >>
> >> I've cut out other bits from this email and changed the subject line
> >> since I think this is an issue unrelated to the one my original patch was
> fixing.
> >>
> >> I don't actually know a ton about DP compliance testing, but I
> >> attempted to try to be helpful and revert commit e11f5bd8228f ("drm:
> >> Add support for DP 1.4 Compliance edid corruption test"). It wasn't
> >> too hard to deal with the conflicts in the revert itself, but then
> >> things didn't compile because there are two places that use
> >> `real_edid_checksum` and that goes away if I revert the patch.
> >>
> >> I've made an attempt to fix the problem by just adding a bounds check.
> >> Perhaps you can see if that looks good to you:
> >>
> >> https://lore.
> kernel.org%2Fr%2F20211005081022.1.Ib059f9c23c2611cb5a9d760e7d0a700c1
> >>
> 295928d%40changeid&amp;data=04%7C01%7CJerry.Zuo%40amd.com%7C90
> >>
> b948659454400cedd308d98812c339%7C3dd8961fe4884e608e11a82d994e183d
> >> %7C0%7C0%7C637690436453163864%7CUnknown%7CTWFpbGZsb3d8eyJ
> WIj
> >>
> oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1
> >>
> 000&amp;sdata=OtSngWlYyDc1NbNSgAeALqN3nF%2Bnw08nJ068cpAKZJk%3
> >> D&amp;reserved=0
> >>
> >> -Doug
> >
> > The patch used for DP1.4 compliance edid corruption test. Let me double
> check if edid corruption test could be passed without the patch.
> >
>
> Can you try the CTS test with Doug's fix?
>
> https://patchwork.freedesktop.org/patch/457537/
>
> Harry

Yes, I'll give a try on that.