2022-05-23 18:33:31

by Cristian Marussi

[permalink] [raw]
Subject: [PATCH] firmware: arm_scmi: Relax BASE protocol sanity checks on protocol list

Even though malformed replies from firmware must be treated carefully to
avoid memory corruption Kernel side, some out-of-spec SCMI replies can
be tolerated to avoid breaking existing deployed system, as long as they
won't cause memory issues.

Reported-by: Nicolas Frattaroli <[email protected]>
Cc: Etienne Carriere <[email protected]>
Cc: Sudeep Holla <[email protected]>
Signed-off-by: Cristian Marussi <[email protected]>
---
drivers/firmware/arm_scmi/base.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/firmware/arm_scmi/base.c b/drivers/firmware/arm_scmi/base.c
index 20fba7370f4e..d0ac96da1ddf 100644
--- a/drivers/firmware/arm_scmi/base.c
+++ b/drivers/firmware/arm_scmi/base.c
@@ -221,11 +221,17 @@ scmi_base_implementation_list_get(const struct scmi_protocol_handle *ph,
calc_list_sz = (1 + (loop_num_ret - 1) / sizeof(u32)) *
sizeof(u32);
if (calc_list_sz != real_list_sz) {
- dev_err(dev,
- "Malformed reply - real_sz:%zd calc_sz:%u\n",
- real_list_sz, calc_list_sz);
- ret = -EPROTO;
- break;
+ dev_warn(dev,
+ "Malformed reply - real_sz:%zd calc_sz:%u (loop_num_ret:%d)\n",
+ real_list_sz, calc_list_sz, loop_num_ret);
+ /*
+ * Bail out if the expected list size is bigger than the
+ * total payload size of the received reply.
+ */
+ if (calc_list_sz > real_list_sz) {
+ ret = -EPROTO;
+ break;
+ }
}

for (loop = 0; loop < loop_num_ret; loop++)
--
2.36.1



2022-06-01 21:13:29

by Etienne Carriere

[permalink] [raw]
Subject: Re: [PATCH] firmware: arm_scmi: Relax BASE protocol sanity checks on protocol list

Hello Cristian,

On Mon, 23 May 2022 at 19:17, Cristian Marussi <[email protected]> wrote:
>
> Even though malformed replies from firmware must be treated carefully to
> avoid memory corruption Kernel side, some out-of-spec SCMI replies can
> be tolerated to avoid breaking existing deployed system, as long as they
> won't cause memory issues.
>
> Reported-by: Nicolas Frattaroli <[email protected]>
> Cc: Etienne Carriere <[email protected]>
> Cc: Sudeep Holla <[email protected]>
> Signed-off-by: Cristian Marussi <[email protected]>

Acked-by: Etienne Carriere <[email protected]>

Best regards,
etienne

> ---
> drivers/firmware/arm_scmi/base.c | 16 +++++++++++-----
> 1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/firmware/arm_scmi/base.c b/drivers/firmware/arm_scmi/base.c
> index 20fba7370f4e..d0ac96da1ddf 100644
> --- a/drivers/firmware/arm_scmi/base.c
> +++ b/drivers/firmware/arm_scmi/base.c
> @@ -221,11 +221,17 @@ scmi_base_implementation_list_get(const struct scmi_protocol_handle *ph,
> calc_list_sz = (1 + (loop_num_ret - 1) / sizeof(u32)) *
> sizeof(u32);
> if (calc_list_sz != real_list_sz) {
> - dev_err(dev,
> - "Malformed reply - real_sz:%zd calc_sz:%u\n",
> - real_list_sz, calc_list_sz);
> - ret = -EPROTO;
> - break;
> + dev_warn(dev,
> + "Malformed reply - real_sz:%zd calc_sz:%u (loop_num_ret:%d)\n",
> + real_list_sz, calc_list_sz, loop_num_ret);
> + /*
> + * Bail out if the expected list size is bigger than the
> + * total payload size of the received reply.
> + */
> + if (calc_list_sz > real_list_sz) {
> + ret = -EPROTO;
> + break;
> + }
> }
>
> for (loop = 0; loop < loop_num_ret; loop++)
> --
> 2.36.1
>

2022-06-06 13:06:45

by Michael Riesch

[permalink] [raw]
Subject: Re: [PATCH] firmware: arm_scmi: Relax BASE protocol sanity checks on protocol list

Hi Cristian,

On 5/23/22 19:15, Cristian Marussi wrote:
> Even though malformed replies from firmware must be treated carefully to
> avoid memory corruption Kernel side, some out-of-spec SCMI replies can
> be tolerated to avoid breaking existing deployed system, as long as they
> won't cause memory issues.
>
> Reported-by: Nicolas Frattaroli <[email protected]>
> Cc: Etienne Carriere <[email protected]>
> Cc: Sudeep Holla <[email protected]>
> Signed-off-by: Cristian Marussi <[email protected]>

Thanks a lot, without this fix the Mali G52 GPU won't probe on my RK3568
EVB1 in vanilla v5.19-rc1.

I guess this patch should have a Fixes: tag, right?

Would be great to have this in v5.19. AFAIC:

Acked-by: Michael Riesch <[email protected]>

Best regards,
Michael

> ---
> drivers/firmware/arm_scmi/base.c | 16 +++++++++++-----
> 1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/firmware/arm_scmi/base.c b/drivers/firmware/arm_scmi/base.c
> index 20fba7370f4e..d0ac96da1ddf 100644
> --- a/drivers/firmware/arm_scmi/base.c
> +++ b/drivers/firmware/arm_scmi/base.c
> @@ -221,11 +221,17 @@ scmi_base_implementation_list_get(const struct scmi_protocol_handle *ph,
> calc_list_sz = (1 + (loop_num_ret - 1) / sizeof(u32)) *
> sizeof(u32);
> if (calc_list_sz != real_list_sz) {
> - dev_err(dev,
> - "Malformed reply - real_sz:%zd calc_sz:%u\n",
> - real_list_sz, calc_list_sz);
> - ret = -EPROTO;
> - break;
> + dev_warn(dev,
> + "Malformed reply - real_sz:%zd calc_sz:%u (loop_num_ret:%d)\n",
> + real_list_sz, calc_list_sz, loop_num_ret);
> + /*
> + * Bail out if the expected list size is bigger than the
> + * total payload size of the received reply.
> + */
> + if (calc_list_sz > real_list_sz) {
> + ret = -EPROTO;
> + break;
> + }
> }
>
> for (loop = 0; loop < loop_num_ret; loop++)

2022-06-06 13:41:02

by Cristian Marussi

[permalink] [raw]
Subject: Re: [PATCH] firmware: arm_scmi: Relax BASE protocol sanity checks on protocol list

On Mon, Jun 06, 2022 at 02:59:10PM +0200, Michael Riesch wrote:
> Hi Cristian,
>
Hi Michael,

> On 5/23/22 19:15, Cristian Marussi wrote:
> > Even though malformed replies from firmware must be treated carefully to
> > avoid memory corruption Kernel side, some out-of-spec SCMI replies can
> > be tolerated to avoid breaking existing deployed system, as long as they
> > won't cause memory issues.
> >
> > Reported-by: Nicolas Frattaroli <[email protected]>
> > Cc: Etienne Carriere <[email protected]>
> > Cc: Sudeep Holla <[email protected]>
> > Signed-off-by: Cristian Marussi <[email protected]>
>
> Thanks a lot, without this fix the Mali G52 GPU won't probe on my RK3568
> EVB1 in vanilla v5.19-rc1.
>

Yes, the break was reported on -next and today it appeared in 5.19-rc1.
A proper FW fix is also up for review by Etienne but in the meantime
this tries to limit damages relaxing a bit the checks.

> I guess this patch should have a Fixes: tag, right?
>

It has not a Fixes tag because the issue was introduced in 5.19-rc1 and the
fix will go in with the next round of v5.19 fixes by Sudeep (AFAIU) so it
will be solved within the v5.19 cycle and I thought the Fixes tag was
not needed in this case (I could be wrong...)

> Would be great to have this in v5.19. AFAIC:
>
> Acked-by: Michael Riesch <[email protected]>

Thanks for testing it.
Cristian

2022-06-06 14:17:37

by Frank Wunderlich

[permalink] [raw]
Subject: Aw: Re: [PATCH] firmware: arm_scmi: Relax BASE protocol sanity checks on protocol list

Hi

> Gesendet: Montag, 06. Juni 2022 um 14:59 Uhr
> Von: "Michael Riesch" <[email protected]>

> Thanks a lot, without this fix the Mali G52 GPU won't probe on my RK3568
> EVB1 in vanilla v5.19-rc1.
>
> I guess this patch should have a Fixes: tag, right?
>
> Would be great to have this in v5.19. AFAIC:
>
> Acked-by: Michael Riesch <[email protected]>

Thanks for the Patch, had also this on my rk3568 bpi-r2 pro board:

panfrost fde60000.gpu: get clock failed -517

fixed by this patch

Tested-By: Frank Wunderlich <[email protected]>

regards Frank

2022-06-06 14:56:02

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH] firmware: arm_scmi: Relax BASE protocol sanity checks on protocol list

On Mon, Jun 06, 2022 at 02:31:48PM +0100, Cristian Marussi wrote:
> On Mon, Jun 06, 2022 at 02:59:10PM +0200, Michael Riesch wrote:
> > Hi Cristian,
> >
> Hi Michael,
>
> > On 5/23/22 19:15, Cristian Marussi wrote:
> > > Even though malformed replies from firmware must be treated carefully to
> > > avoid memory corruption Kernel side, some out-of-spec SCMI replies can
> > > be tolerated to avoid breaking existing deployed system, as long as they
> > > won't cause memory issues.
> > >
> > > Reported-by: Nicolas Frattaroli <[email protected]>
> > > Cc: Etienne Carriere <[email protected]>
> > > Cc: Sudeep Holla <[email protected]>
> > > Signed-off-by: Cristian Marussi <[email protected]>
> >
> > Thanks a lot, without this fix the Mali G52 GPU won't probe on my RK3568
> > EVB1 in vanilla v5.19-rc1.
> >
>
> Yes, the break was reported on -next and today it appeared in 5.19-rc1.
> A proper FW fix is also up for review by Etienne but in the meantime
> this tries to limit damages relaxing a bit the checks.
>
> > I guess this patch should have a Fixes: tag, right?
> >
>
> It has not a Fixes tag because the issue was introduced in 5.19-rc1 and the
> fix will go in with the next round of v5.19 fixes by Sudeep (AFAIU) so it
> will be solved within the v5.19 cycle and I thought the Fixes tag was
> not needed in this case (I could be wrong...)

Correct, if for some reason, we can't push this before v5.19, then fixes
tag needs to added. I will add that then, but for now let us target getting
it in before v5.19

--
Regards,
Sudeep

2022-06-06 15:02:04

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH] firmware: arm_scmi: Relax BASE protocol sanity checks on protocol list

On Mon, 23 May 2022 18:15:59 +0100, Cristian Marussi wrote:
> Even though malformed replies from firmware must be treated carefully to
> avoid memory corruption Kernel side, some out-of-spec SCMI replies can
> be tolerated to avoid breaking existing deployed system, as long as they
> won't cause memory issues.
>

Applied to sudeep.holla/linux (for-next/scmi), thanks!

[1/1] firmware: arm_scmi: Relax BASE protocol sanity checks on protocol list
https://git.kernel.org/sudeep.holla/c/122839b58a

--
Regards,
Sudeep

2022-06-06 15:07:39

by Heiko Stübner

[permalink] [raw]
Subject: Re: [PATCH] firmware: arm_scmi: Relax BASE protocol sanity checks on protocol list

Am Montag, 6. Juni 2022, 16:43:05 CEST schrieb Sudeep Holla:
> On Mon, Jun 06, 2022 at 02:31:48PM +0100, Cristian Marussi wrote:
> > On Mon, Jun 06, 2022 at 02:59:10PM +0200, Michael Riesch wrote:
> > > Hi Cristian,
> > >
> > Hi Michael,
> >
> > > On 5/23/22 19:15, Cristian Marussi wrote:
> > > > Even though malformed replies from firmware must be treated carefully to
> > > > avoid memory corruption Kernel side, some out-of-spec SCMI replies can
> > > > be tolerated to avoid breaking existing deployed system, as long as they
> > > > won't cause memory issues.
> > > >
> > > > Reported-by: Nicolas Frattaroli <[email protected]>
> > > > Cc: Etienne Carriere <[email protected]>
> > > > Cc: Sudeep Holla <[email protected]>
> > > > Signed-off-by: Cristian Marussi <[email protected]>
> > >
> > > Thanks a lot, without this fix the Mali G52 GPU won't probe on my RK3568
> > > EVB1 in vanilla v5.19-rc1.
> > >
> >
> > Yes, the break was reported on -next and today it appeared in 5.19-rc1.
> > A proper FW fix is also up for review by Etienne but in the meantime
> > this tries to limit damages relaxing a bit the checks.
> >
> > > I guess this patch should have a Fixes: tag, right?
> > >
> >
> > It has not a Fixes tag because the issue was introduced in 5.19-rc1 and the
> > fix will go in with the next round of v5.19 fixes by Sudeep (AFAIU) so it
> > will be solved within the v5.19 cycle and I thought the Fixes tag was
> > not needed in this case (I could be wrong...)
>
> Correct, if for some reason, we can't push this before v5.19, then fixes
> tag needs to added. I will add that then, but for now let us target getting
> it in before v5.19

hmm, I'd disagree for the generalization.

While true that is not 100% necessary to be present in all cases, so
definitly no reason for a new version when applied to the same -rc series,
having the Fixes tag not only clearly marks the patch as such, but also
allows people reading either mailing lists or the later the git history
to actually see where the issue started. So I really think it is a
nice-to-have in most cases.


Heiko


2022-06-06 15:22:08

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH] firmware: arm_scmi: Relax BASE protocol sanity checks on protocol list

On Mon, Jun 06, 2022 at 04:55:10PM +0200, Heiko St?bner wrote:
> Am Montag, 6. Juni 2022, 16:43:05 CEST schrieb Sudeep Holla:
> > On Mon, Jun 06, 2022 at 02:31:48PM +0100, Cristian Marussi wrote:
> > > On Mon, Jun 06, 2022 at 02:59:10PM +0200, Michael Riesch wrote:
> > > > Hi Cristian,
> > > >
> > > Hi Michael,
> > >
> > > > On 5/23/22 19:15, Cristian Marussi wrote:
> > > > > Even though malformed replies from firmware must be treated carefully to
> > > > > avoid memory corruption Kernel side, some out-of-spec SCMI replies can
> > > > > be tolerated to avoid breaking existing deployed system, as long as they
> > > > > won't cause memory issues.
> > > > >
> > > > > Reported-by: Nicolas Frattaroli <[email protected]>
> > > > > Cc: Etienne Carriere <[email protected]>
> > > > > Cc: Sudeep Holla <[email protected]>
> > > > > Signed-off-by: Cristian Marussi <[email protected]>
> > > >
> > > > Thanks a lot, without this fix the Mali G52 GPU won't probe on my RK3568
> > > > EVB1 in vanilla v5.19-rc1.
> > > >
> > >
> > > Yes, the break was reported on -next and today it appeared in 5.19-rc1.
> > > A proper FW fix is also up for review by Etienne but in the meantime
> > > this tries to limit damages relaxing a bit the checks.
> > >
> > > > I guess this patch should have a Fixes: tag, right?
> > > >
> > >
> > > It has not a Fixes tag because the issue was introduced in 5.19-rc1 and the
> > > fix will go in with the next round of v5.19 fixes by Sudeep (AFAIU) so it
> > > will be solved within the v5.19 cycle and I thought the Fixes tag was
> > > not needed in this case (I could be wrong...)
> >
> > Correct, if for some reason, we can't push this before v5.19, then fixes
> > tag needs to added. I will add that then, but for now let us target getting
> > it in before v5.19
>
> hmm, I'd disagree for the generalization.
>
> While true that is not 100% necessary to be present in all cases, so
> definitly no reason for a new version when applied to the same -rc series,
> having the Fixes tag not only clearly marks the patch as such, but also
> allows people reading either mailing lists or the later the git history
> to actually see where the issue started. So I really think it is a
> nice-to-have in most cases.

Absolutely agreed in general. But I asked Cristian to not add as we are work
around the firmware bug by relaxing the checks in the kernel. This is not
fixing anything in the original commit IMO, it is just that the original
commit highlighted the firmware issue on that system.

--
Regards,
Sudeep