2022-11-14 20:10:52

by Rob Clark

Subject: [PATCH] drm/msm/a6xx: Fix speed-bin detection vs probe-defer

From: Rob Clark <[email protected]>

If we get an error (other than -ENOENT) we need to propagate that up the
stack. Otherwise if the nvmem driver hasn't probed yet, we'll end up with
whatever OPP(s) are represented by bit zero.

Fixes: fe7952c629da ("drm/msm: Add speed-bin support to a618 gpu")
Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 7fe60c65a1eb..96de2202c86c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1956,7 +1956,7 @@ static int a6xx_set_supported_hw(struct device *dev, struct adreno_rev rev)
DRM_DEV_ERROR(dev,
"failed to read speed-bin (%d). Some OPPs may not be supported by hardware",
ret);
- goto done;
+ return ret;
}

supp_hw = fuse_to_supp_hw(dev, rev, speedbin);
--
2.38.1



2022-11-14 20:14:02

by Akhil P Oommen

Subject: Re: [PATCH] drm/msm/a6xx: Fix speed-bin detection vs probe-defer

On 11/15/2022 1:11 AM, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> If we get an error (other than -ENOENT) we need to propagate that up the
> stack. Otherwise if the nvmem driver hasn't probed yet, we'll end up with
> whatever OPP(s) are represented by bit zero.
>
> Fixes: fe7952c629da ("drm/msm: Add speed-bin support to a618 gpu")
> Signed-off-by: Rob Clark <[email protected]>
> ---
> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 7fe60c65a1eb..96de2202c86c 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1956,7 +1956,7 @@ static int a6xx_set_supported_hw(struct device *dev, struct adreno_rev rev)
> DRM_DEV_ERROR(dev,
> "failed to read speed-bin (%d). Some OPPs may not be supported by hardware",
I just noticed and was going to send a similar fix. We should remove ".
Some OPPs may not be supported by hardware" here.

Reviewed-by: Akhil P Oommen <[email protected]>

Btw, on msm-next-external-fixes + this fix, I still see boot up issue
in herobrine due to drm_dev_alloc() failure with -ENOSPC error.

-Akhil.
> ret);
> - goto done;
> + return ret;
> }
>
> supp_hw = fuse_to_supp_hw(dev, rev, speedbin);


2022-11-14 20:39:12

by Doug Anderson

Subject: Re: [PATCH] drm/msm/a6xx: Fix speed-bin detection vs probe-defer

Hi,

On Mon, Nov 14, 2022 at 11:41 AM Rob Clark <[email protected]> wrote:
>
> From: Rob Clark <[email protected]>
>
> If we get an error (other than -ENOENT) we need to propagate that up the
> stack. Otherwise if the nvmem driver hasn't probed yet, we'll end up with
> whatever OPP(s) are represented by bit zero.

Can you explain the "whatever OPP(s) are represented by bit zero"
part? This doesn't seem to be true because `supp_hw` is initialized to
UINT_MAX. If I'm remembering how this all works, doesn't that mean
that if we get an error we'll assume all OPPs are OK?

I'm not saying that I'm against your change, but I think maybe you're
misdescribing the old behavior.

Speaking of the initialization of supp_hw, if we want to change the
behavior like your patch does then we should be able to remove that
initialization, right?

I would also suspect that your patch will result in a compiler
warning, at least on some compilers. The goto label `done` is no
longer needed, right?

-Doug
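
To make the two error paths discussed above concrete, here is a minimal userspace model in C. It is only a sketch under stated assumptions: `mock_read_speedbin()`, `mock_fuse_to_supp_hw()`, the "bin N maps to bit N" rule, and the `registered` out-parameter are hypothetical stand-ins and are not the driver's real helpers. The only behavior taken from the thread is that -ENOENT is tolerated, that the mask starts out as UINT_MAX, and that the patch turns `goto done` into `return ret`.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* EPROBE_DEFER is kernel-internal (517) and absent from userspace <errno.h>. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical stand-in for the nvmem speed-bin read: a nonzero
 * simulated_ret is returned as the error, otherwise bin 2 is reported. */
static int mock_read_speedbin(int simulated_ret, unsigned int *speedbin)
{
	if (simulated_ret)
		return simulated_ret;
	*speedbin = 2;
	return 0;
}

/* Hypothetical fuse mapping, for illustration only: bin N -> bit N. */
static unsigned int mock_fuse_to_supp_hw(unsigned int speedbin)
{
	return 1u << speedbin;
}

/* Model of the flow before the patch: an error other than -ENOENT jumps
 * to the registration step with the initial UINT_MAX mask, and the
 * caller never learns that the read failed. */
static int set_supported_hw_old(int simulated_ret, unsigned int *registered)
{
	unsigned int supp_hw = UINT_MAX;
	unsigned int speedbin = 0;
	int ret = mock_read_speedbin(simulated_ret, &speedbin);

	if (ret == -ENOENT)
		return 0;		/* no speed-bin: nothing is registered */
	if (!ret)
		supp_hw = mock_fuse_to_supp_hw(speedbin);
	*registered = supp_hw;		/* still UINT_MAX on any other error */
	return 0;
}

/* Model of the flow after the patch: other errors, including
 * -EPROBE_DEFER, go back to the caller and nothing is registered. */
static int set_supported_hw_new(int simulated_ret, unsigned int *registered)
{
	unsigned int speedbin = 0;
	int ret = mock_read_speedbin(simulated_ret, &speedbin);

	if (ret == -ENOENT)
		return 0;
	if (ret)
		return ret;
	*registered = mock_fuse_to_supp_hw(speedbin);
	return 0;
}

/* Small self-check that exercises the probe-defer case on both models. */
static void speedbin_sketch_selfcheck(void)
{
	unsigned int reg = 0;

	assert(set_supported_hw_old(-EPROBE_DEFER, &reg) == 0);
	assert(reg == UINT_MAX);

	reg = 0;
	assert(set_supported_hw_new(-EPROBE_DEFER, &reg) == -EPROBE_DEFER);
	assert(reg == 0);
}
```

Calling `speedbin_sketch_selfcheck()` from a `main()` runs both models. In the old model a deferred nvmem read ends with every mask bit set, which matches the "assume all OPPs are OK" reading above rather than "bit zero".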

2022-11-14 20:54:06

by Rob Clark

Subject: Re: [PATCH] drm/msm/a6xx: Fix speed-bin detection vs probe-defer

On Mon, Nov 14, 2022 at 12:27 PM Doug Anderson <[email protected]> wrote:
>
> Hi,
>
> On Mon, Nov 14, 2022 at 11:41 AM Rob Clark <[email protected]> wrote:
> >
> > From: Rob Clark <[email protected]>
> >
> > If we get an error (other than -ENOENT) we need to propagate that up the
> > stack. Otherwise if the nvmem driver hasn't probed yet, we'll end up with
> > whatever OPP(s) are represented by bit zero.
>
> Can you explain the "whatever OPP(s) are represented by bit zero"
> part? This doesn't seem to be true because `supp_hw` is initialized to
> UINT_MAX. If I'm remembering how this all works, doesn't that mean
> that if we get an error we'll assume all OPPs are OK?

Oh, that's right.. and even worse! Ok, stand by for v2

> I'm not saying that I'm against your change, but I think maybe you're
> misdescribing the old behavior.
>
> Speaking of the initialization of supp_hw, if we want to change the
> behavior like your patch does then we should be able to remove that
> initialization, right?
>
> I would also suspect that your patch will result in a compiler
> warning, at least on some compilers. The goto label `done` is no
> longer needed, right?
>
> -Doug

2022-11-14 20:54:12

by Akhil P Oommen

Subject: Re: [PATCH] drm/msm/a6xx: Fix speed-bin detection vs probe-defer

On 11/15/2022 1:57 AM, Doug Anderson wrote:
> Hi,
>
> On Mon, Nov 14, 2022 at 11:41 AM Rob Clark <[email protected]> wrote:
>> From: Rob Clark <[email protected]>
>>
>> If we get an error (other than -ENOENT) we need to propagate that up the
>> stack. Otherwise if the nvmem driver hasn't probed yet, we'll end up with
>> whatever OPP(s) are represented by bit zero.
> Can you explain the "whatever OPP(s) are represented by bit zero"
> part? This doesn't seem to be true because `supp_hw` is initialized to
> UINT_MAX. If I'm remembering how this all works, doesn't that mean
> that if we get an error we'll assume all OPPs are OK?
>
> I'm not saying that I'm against your change, but I think maybe you're
> misdescribing the old behavior.
>
> Speaking of the initialization of supp_hw, if we want to change the
> behavior like your patch does then we should be able to remove that
> initialization, right?
>
> I would also suspect that your patch will result in a compiler
> warning, at least on some compilers. The goto label `done` is no
> longer needed, right?
>
> -Doug
You are right about the commit message. The problem is we can't enable
all bits in supp_hw anymore due to changes like this:
https://patchwork.kernel.org/project/linux-arm-msm/patch/20220829011035.1.Ie3564662150e038571b7e2779cac7229191cf3bf@changeid/

This creates 2 opps with same freq when supp_hw = UINT_MAX.

-Akhil.
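
The duplicate-frequency problem can be illustrated with a small C model of mask matching. This is a sketch that assumes an OPP is kept when its hardware mask shares at least one bit with the mask the driver registers. The table, the frequencies, and the names `example_opps`, `opp_matches()` and `count_enabled_at()` are hypothetical and are not taken from any real device tree.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* One OPP entry: a frequency plus the hardware mask it is valid for. */
struct opp_entry {
	unsigned long freq_hz;
	unsigned int supported_hw;
};

/* Hypothetical table: two entries share a frequency and are meant to be
 * told apart by their hardware mask (one per speed bin). */
static const struct opp_entry example_opps[] = {
	{ 315000000UL, 0x1 },
	{ 450000000UL, 0x1 },
	{ 450000000UL, 0x2 },
};

/* Assumed matching rule: keep the OPP if any mask bit overlaps. */
static int opp_matches(const struct opp_entry *opp, unsigned int supp_hw)
{
	return (opp->supported_hw & supp_hw) != 0;
}

/* Count how many entries at freq_hz survive for a given registered mask. */
static size_t count_enabled_at(unsigned long freq_hz, unsigned int supp_hw)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(example_opps) / sizeof(example_opps[0]); i++)
		if (example_opps[i].freq_hz == freq_hz &&
		    opp_matches(&example_opps[i], supp_hw))
			n++;
	return n;
}

/* With a single-bin mask one entry survives; with UINT_MAX both do. */
static void check_example_table(void)
{
	assert(count_enabled_at(450000000UL, 0x2) == 1);
	assert(count_enabled_at(450000000UL, UINT_MAX) == 2);
}
```

Under these assumptions, registering UINT_MAX lets both 450 MHz entries through, which is the "2 opps with same freq" situation, while a mask for a single bin selects exactly one of them.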

2022-11-14 22:05:19

by Rob Clark

Subject: Re: [PATCH] drm/msm/a6xx: Fix speed-bin detection vs probe-defer

On Mon, Nov 14, 2022 at 11:59 AM Akhil P Oommen
<[email protected]> wrote:
>
> On 11/15/2022 1:11 AM, Rob Clark wrote:
> > From: Rob Clark <[email protected]>
> >
> > If we get an error (other than -ENOENT) we need to propagate that up the
> > stack. Otherwise if the nvmem driver hasn't probed yet, we'll end up with
> > whatever OPP(s) are represented by bit zero.
> >
> > Fixes: fe7952c629da ("drm/msm: Add speed-bin support to a618 gpu")
> > Signed-off-by: Rob Clark <[email protected]>
> > ---
> > drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > index 7fe60c65a1eb..96de2202c86c 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > @@ -1956,7 +1956,7 @@ static int a6xx_set_supported_hw(struct device *dev, struct adreno_rev rev)
> > DRM_DEV_ERROR(dev,
> > "failed to read speed-bin (%d). Some OPPs may not be supported by hardware",
> I just noticed and was going to send a similar fix. We should remove ".
> Some OPPs may not be supported by hardware" here.
>
> Reviewed-by: Akhil P Oommen <[email protected]>
>
> Btw, on msm-next-external-fixes + this fix, I still see boot up issue
> in herobrine due to drm_dev_alloc() failure with -ENOSPC error.

Could you track it down one level deeper? I wonder if there is some
missing cleanup in the probe-defer path and we end up failing in
drm_minor_alloc() or something along those lines

BR,
-R

> -Akhil.
> > ret);
> > - goto done;
> > + return ret;
> > }
> >
> > supp_hw = fuse_to_supp_hw(dev, rev, speedbin);
>