2022-05-18 07:39:39

by Ang, Tien Sung

Subject: [PATCH] fpga: altera-cvp: allow interrupt to continue next time

From: Dinh Nguyen <[email protected]>

The CFG_READY signal/bit may time out because the firmware does not
respond within the given time-out. This time varies with numerous
factors, such as the size of the bitstream.
This time-out error does not impact the result of the previous CvP
transactions. The CvP driver shall then respond with
EAGAIN instead of a time-out error.

Signed-off-by: Dinh Nguyen <[email protected]>
Signed-off-by: Ang Tien Sung <[email protected]>
---
drivers/fpga/altera-cvp.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/fpga/altera-cvp.c b/drivers/fpga/altera-cvp.c
index 4ffb9da537d8..d74ff63c61e8 100644
--- a/drivers/fpga/altera-cvp.c
+++ b/drivers/fpga/altera-cvp.c
@@ -309,10 +309,22 @@ static int altera_cvp_teardown(struct fpga_manager *mgr,
/* STEP 15 - poll CVP_CONFIG_READY bit for 0 with 10us timeout */
ret = altera_cvp_wait_status(conf, VSE_CVP_STATUS_CFG_RDY, 0,
conf->priv->poll_time_us);
- if (ret)
+ if (ret) {
dev_err(&mgr->dev, "CFG_RDY == 0 timeout\n");
+ goto error_path;
+ }

return ret;
+
+error_path:
+ /* reset CVP_MODE and HIP_CLK_SEL bit */
+ altera_read_config_dword(conf, VSE_CVP_MODE_CTRL, &val);
+ val &= ~VSE_CVP_MODE_CTRL_HIP_CLK_SEL;
+ val &= ~VSE_CVP_MODE_CTRL_CVP_MODE;
+ altera_write_config_dword(conf, VSE_CVP_MODE_CTRL, val);
+
+ return -EAGAIN;
+
}

static int altera_cvp_write_init(struct fpga_manager *mgr,
--
2.25.1



2022-05-18 14:08:50

by Tom Rix

Subject: Re: [PATCH] fpga: altera-cvp: allow interrupt to continue next time


On 5/18/22 12:38 AM, [email protected] wrote:
> From: Dinh Nguyen <[email protected]>
>
> CFG_READY signal/bit may time-out due to firmware not responding
> within the given time-out. This time varies due to numerous
> factors like size of bitstream and others.
> This time-out error does not impact the result of the CvP
> previous transactions. The CvP driver shall then, respond with
> EAGAIN instead Time out error.
>
> Signed-off-by: Dinh Nguyen <[email protected]>
> Signed-off-by: Ang Tien Sung <[email protected]>
> ---
> drivers/fpga/altera-cvp.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/fpga/altera-cvp.c b/drivers/fpga/altera-cvp.c
> index 4ffb9da537d8..d74ff63c61e8 100644
> --- a/drivers/fpga/altera-cvp.c
> +++ b/drivers/fpga/altera-cvp.c
> @@ -309,10 +309,22 @@ static int altera_cvp_teardown(struct fpga_manager *mgr,
> /* STEP 15 - poll CVP_CONFIG_READY bit for 0 with 10us timeout */
> ret = altera_cvp_wait_status(conf, VSE_CVP_STATUS_CFG_RDY, 0,
> conf->priv->poll_time_us);
> - if (ret)
> + if (ret) {
> dev_err(&mgr->dev, "CFG_RDY == 0 timeout\n");
> + goto error_path;
> + }
>
> return ret;
> +
> +error_path:
> + /* reset CVP_MODE and HIP_CLK_SEL bit */
> + altera_read_config_dword(conf, VSE_CVP_MODE_CTRL, &val);
> + val &= ~VSE_CVP_MODE_CTRL_HIP_CLK_SEL;
> + val &= ~VSE_CVP_MODE_CTRL_CVP_MODE;
> + altera_write_config_dword(conf, VSE_CVP_MODE_CTRL, val);
> +
> + return -EAGAIN;

This will set fpga_mgr->state to *_ERR.

Is this OK, or do you think we need a couple of new *_BUSY enums?

Tom

> +
> }
>
> static int altera_cvp_write_init(struct fpga_manager *mgr,
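
The point about the manager state can be illustrated with a small user-space model. This is only a sketch: `mini_mgr`, the `STATE_*` names and the `op_*` callbacks are hypothetical stand-ins invented for illustration, not the kernel's `struct fpga_manager` or its state enum. It assumes, as the comment above says, that the core moves to an error state for any non-zero return from the driver callback, so -EAGAIN and -ETIMEDOUT end up in the same place.

```c
#include <errno.h>

/* Simplified stand-ins for the manager states discussed above (hypothetical names). */
enum mgr_state {
	STATE_WRITE_COMPLETE,
	STATE_WRITE_COMPLETE_ERR,
	STATE_OPERATING,
};

/* Hypothetical miniature manager; the real structure has many more fields. */
struct mini_mgr {
	enum mgr_state state;
	int (*write_complete)(struct mini_mgr *mgr);
};

/* Any non-zero return from the driver callback ends in the *_ERR state. */
static int mini_mgr_write_complete(struct mini_mgr *mgr)
{
	int ret;

	mgr->state = STATE_WRITE_COMPLETE;
	ret = mgr->write_complete(mgr);
	if (ret) {
		/* The specific error code does not influence the state. */
		mgr->state = STATE_WRITE_COMPLETE_ERR;
		return ret;
	}

	mgr->state = STATE_OPERATING;
	return 0;
}

/* Example driver callbacks returning the codes debated in this thread. */
static int op_ok(struct mini_mgr *mgr)      { (void)mgr; return 0; }
static int op_again(struct mini_mgr *mgr)   { (void)mgr; return -EAGAIN; }
static int op_timeout(struct mini_mgr *mgr) { (void)mgr; return -ETIMEDOUT; }
```

In this model a caller can still tell the two codes apart from the return value, but the recorded state is the same for both, which is why a separate busy state would need new enum values.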


2022-05-19 15:13:27

by Ang, Tien Sung

Subject: Re: [PATCH] fpga: altera-cvp: Truncated bitstream error support

Thanks for bringing this up. Yes, you are right that the fpga_mgr treats this
as an error irrespective of the value. The CvP driver is now changed to simply
return the error code that recommends a retry; to my understanding, that is
what EAGAIN means. In short, the fpga manager will still report a CvP
failure.
A BUSY state does not seem able to solve this issue.
Even an extended time-out did not resolve this error state. The current time-out
is set to 10 seconds.
However, the main objective is also to handle the error when the CvP firmware
is not responsive. The error_path flow resets the CVP_MODE and HIP_CLK_SEL bits,
as recommended by the firmware engineers.
The flow prescribed here is identical to that of a working CvP driver
that is also owned by Intel. That driver is a downstream driver and is
not part of the Linux kernel. We are now porting these differences over to
the current upstream CvP driver.

2022-05-28 19:51:41

by Xu Yilun

Subject: Re: [PATCH] fpga: altera-cvp: allow interrupt to continue next time

On Wed, May 18, 2022 at 03:38:44PM +0800, [email protected] wrote:
> From: Dinh Nguyen <[email protected]>
>
> CFG_READY signal/bit may time-out due to firmware not responding
> within the given time-out. This time varies due to numerous
> factors like size of bitstream and others.
> This time-out error does not impact the result of the CvP
> previous transactions. The CvP driver shall then, respond with

Do you mean the reprogramming is successful even if you hit the time-out
in write_complete()? If so, would returning 0 be better?

And could you specify what the time-out means in the write_init() phase?

Thanks,
Yilun

> EAGAIN instead Time out error.
>
> Signed-off-by: Dinh Nguyen <[email protected]>
> Signed-off-by: Ang Tien Sung <[email protected]>
> ---
> drivers/fpga/altera-cvp.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/fpga/altera-cvp.c b/drivers/fpga/altera-cvp.c
> index 4ffb9da537d8..d74ff63c61e8 100644
> --- a/drivers/fpga/altera-cvp.c
> +++ b/drivers/fpga/altera-cvp.c
> @@ -309,10 +309,22 @@ static int altera_cvp_teardown(struct fpga_manager *mgr,
> /* STEP 15 - poll CVP_CONFIG_READY bit for 0 with 10us timeout */
> ret = altera_cvp_wait_status(conf, VSE_CVP_STATUS_CFG_RDY, 0,
> conf->priv->poll_time_us);
> - if (ret)
> + if (ret) {
> dev_err(&mgr->dev, "CFG_RDY == 0 timeout\n");
> + goto error_path;
> + }
>
> return ret;
> +
> +error_path:
> + /* reset CVP_MODE and HIP_CLK_SEL bit */
> + altera_read_config_dword(conf, VSE_CVP_MODE_CTRL, &val);
> + val &= ~VSE_CVP_MODE_CTRL_HIP_CLK_SEL;
> + val &= ~VSE_CVP_MODE_CTRL_CVP_MODE;
> + altera_write_config_dword(conf, VSE_CVP_MODE_CTRL, val);
> +
> + return -EAGAIN;
> +
> }
>
> static int altera_cvp_write_init(struct fpga_manager *mgr,
> --
> 2.25.1

2022-05-28 19:58:49

by Xu Yilun

Subject: Re: [PATCH] fpga: altera-cvp: Truncated bitstream error support

On Thu, May 19, 2022 at 05:39:07PM +0800, [email protected] wrote:
> Thanks for bringing this up. Yes, you are right that the fpga_mgr sees this
> as an error irrespective of the value. The CvP driver is changed now to just
> indicate the correct error which recommends a retry. To me understanding,
> EAGAIN was this. The fpga manager now looks like is going to return a CvP
> failure in short.
> A BUSY state does not seem to be able to solve this issue.
> Even an extended time-out didn't resolve this error state. The current time-out
> is set to 10seconds.
> However, the main objective is to also handle the error if the CvP firmware
> is not responsive. The error_path flow is to reset the CVP mode and HIP_CLK_SEL bit

Please add your main objective to the commit message.

Thanks,
Yilun

> as recommended by the firmware engineers.
> The flow prescribed here is also an identical copy of working CvP driver
> which is also owned by Intel. This driver is a downstream driver which is
> not part of the Linux kernel. We are now porting this differences over to
> the current upstream CvP driver.

2022-06-01 16:23:13

by Ang, Tien Sung

Subject: [PATCH v2] fpga: altera-cvp: allow interrupt to continue next time

From: Dinh Nguyen <[email protected]>

The main objective of this change is to perform error handling
if the CvP firmware becomes unresponsive. The error_path flow
resets the CvP mode and HIP_CLK_SEL bit.

The CFG_READY signal/bit may time out because the firmware does not
respond within the given time-out. This time varies with numerous
factors, such as the size of the bitstream.
This time-out error may or may not impact the result of the previous
CvP transactions. The CvP driver shall then respond with
EAGAIN instead of a time-out error.

Signed-off-by: Dinh Nguyen <[email protected]>
Signed-off-by: Ang Tien Sung <[email protected]>
---

changelog v2:
* Amend the commit message

---
drivers/fpga/altera-cvp.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/fpga/altera-cvp.c b/drivers/fpga/altera-cvp.c
index 4ffb9da537d8..d74ff63c61e8 100644
--- a/drivers/fpga/altera-cvp.c
+++ b/drivers/fpga/altera-cvp.c
@@ -309,10 +309,22 @@ static int altera_cvp_teardown(struct fpga_manager *mgr,
/* STEP 15 - poll CVP_CONFIG_READY bit for 0 with 10us timeout */
ret = altera_cvp_wait_status(conf, VSE_CVP_STATUS_CFG_RDY, 0,
conf->priv->poll_time_us);
- if (ret)
+ if (ret) {
dev_err(&mgr->dev, "CFG_RDY == 0 timeout\n");
+ goto error_path;
+ }

return ret;
+
+error_path:
+ /* reset CVP_MODE and HIP_CLK_SEL bit */
+ altera_read_config_dword(conf, VSE_CVP_MODE_CTRL, &val);
+ val &= ~VSE_CVP_MODE_CTRL_HIP_CLK_SEL;
+ val &= ~VSE_CVP_MODE_CTRL_CVP_MODE;
+ altera_write_config_dword(conf, VSE_CVP_MODE_CTRL, val);
+
+ return -EAGAIN;
+
}

static int altera_cvp_write_init(struct fpga_manager *mgr,
--
2.25.1


2022-06-01 20:15:22

by Ang, Tien Sung

Subject: Re: [PATCH] fpga: altera-cvp: allow interrupt to continue next time

>> CFG_READY signal/bit may time-out due to firmware not responding
>> within the given time-out. This time varies due to numerous
>> factors like size of bitstream and others.
>> This time-out error does not impact the result of the CvP
>> previous transactions. The CvP driver shall then, respond with

>Do you mean the reprogramming is successful even if you find the time
>out in write_complete()? Then return 0 is better?
Based on the information given by the Intel FPGA firmware team,
CFG_READY is essential for indicating whether the current FPGA
configuration session actually succeeded. There are
cases we test in the lab where CFG_READY stays invalid and
the tests performed subsequently to verify the FPGA functionality
cannot detect the failed session. A failed FPGA
configuration session means that the new bitstream was not
successfully configured, and tests run later will simply pass
against the previous working bitstream version. In short, CFG_READY
is essential, and an error indicating the time-out is a must.
As another example, using an incorrect SOF/FPGA design results
in CFG_READY being invalid. The user must be informed of a
potential error.
I will correct the wording I used earlier, which said that
the time-out error does not impact the results of the previous CvP
transactions. It may do so if the firmware has some sort
of error.

>And could you specify what the time-out mean on write_init() phase?
I could not really understand your question. We set large
time-outs of ~10 seconds. Every wait for the firmware to respond
is potentially a hazard. Unfortunately, the CvP firmware has its
limitations.
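
To make the CFG_READY wait concrete, here is a minimal user-space sketch of a bounded status poll of the kind discussed above. Everything in it is an assumption made for illustration: `fake_dev`, `fake_read_status()` and `wait_status()` are hypothetical names, the bit position of `STATUS_CFG_RDY` is illustrative, and the 10 us polling step is assumed rather than taken from the driver. It only shows the shape of the logic: poll until the masked bits match the wanted value, or give up with -ETIMEDOUT.

```c
#include <errno.h>
#include <stdint.h>

#define STATUS_CFG_RDY (1u << 18)	/* illustrative bit position for this sketch */

/* Simulated device: CFG_RDY drops after a configurable number of reads. */
struct fake_dev {
	uint32_t status;		/* simulated status register */
	unsigned int reads;		/* reads performed so far */
	unsigned int clear_after;	/* CFG_RDY clears on this read; 0 means never */
};

static uint32_t fake_read_status(struct fake_dev *dev)
{
	dev->reads++;
	if (dev->clear_after && dev->reads >= dev->clear_after)
		dev->status &= ~STATUS_CFG_RDY;
	return dev->status;
}

/* Poll until (status & mask) == want, one poll per assumed 10 us step. */
static int wait_status(struct fake_dev *dev, uint32_t mask, uint32_t want,
		       unsigned int timeout_us)
{
	unsigned int retries = (timeout_us + 9) / 10;	/* round up */

	if (retries == 0)
		retries = 1;	/* always poll at least once */

	do {
		if ((fake_read_status(dev) & mask) == want)
			return 0;
		/* A real driver would sleep briefly here before re-checking. */
	} while (--retries);

	return -ETIMEDOUT;
}
```

With a firmware model that never clears the bit, the poll exhausts its retries and reports the time-out, which is the situation the error handling in this patch is meant to cover.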


2022-06-03 15:45:30

by Xu Yilun

Subject: Re: [PATCH] fpga: altera-cvp: allow interrupt to continue next time

On Tue, May 31, 2022 at 10:20:04AM +0800, [email protected] wrote:
> >> CFG_READY signal/bit may time-out due to firmware not responding
> >> within the given time-out. This time varies due to numerous
> >> factors like size of bitstream and others.
> >> This time-out error does not impact the result of the CvP
> >> previous transactions. The CvP driver shall then, respond with
>
> >Do you mean the reprogramming is successful even if you find the time
> >out in write_complete()? Then return 0 is better?
> Based on the information given by the Intel FPGA firmware team,
> CFG_READY is essential to indicate if the current FPGA
> configuration session is indeed a success. There are
> cases we test in the lab whereby, CFG_READY stays invalid and
> the tests performed subsequently to verify the FPGA functionality
> could not detect the failed session. A failed FPGA
> configuration session means, the new bitstream wasn't
> successfully configured and tests ran later will just be passing
> on the previous working bitstream version. In short, CFG_READY
> is esential, and an error indicating the time-out is a must.
> Another example, using an incorrect SOF/Design FPGA results
> in CFG_READY being invalid. The user must be informed of a
> potential error.
> I will correct the wordings i used earlier that says that
> the timoeut error does not impact the results of the CvP
> previous transactions. It may so if the firmware has some sort
> of error.

Understood. But given your new comment, why must you change the error
code to -EAGAIN rather than keep the timeout?

I think you should change your commit message. The main change is adding
the error handling. The error code change is minor, and not even necessary
if you don't have a strong reason.

Thanks,
Yilun

>
> >And could you specify what the time-out mean on write_init() phase?
> I could not really understand your question. We set huge
> time-outs of ~10seconds. Every wait for the firmware to respond
> is potentially a hazard. The firmware CvP is has it's limitation
> unfortunately.


2022-06-03 16:45:41

by Xu Yilun

Subject: Re: [PATCH v2] fpga: altera-cvp: allow interrupt to continue next time

On Wed, Jun 01, 2022 at 09:40:27AM +0800, [email protected] wrote:
> From: Dinh Nguyen <[email protected]>
>
> The main objective of this change is to perform error handling
> if the CvP firmware becomes unresponsive. The error_path flow
> resets the CvP mode and HIP_CLK_SEL bit.
>
> CFG_READY signal/bit may time-out due to firmware not responding
> within the given time-out. This time varies due to numerous
> factors like size of bitstream and others.
> This time-out error may or may not impact the result of the CvP
> previous transactions. The CvP driver shall then, respond with
> EAGAIN instead Time out error.
>
> Signed-off-by: Dinh Nguyen <[email protected]>
> Signed-off-by: Ang Tien Sung <[email protected]>
> ---
>
> changelog v2:
> * Amend the commit message
>
> ---
> drivers/fpga/altera-cvp.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/fpga/altera-cvp.c b/drivers/fpga/altera-cvp.c
> index 4ffb9da537d8..d74ff63c61e8 100644
> --- a/drivers/fpga/altera-cvp.c
> +++ b/drivers/fpga/altera-cvp.c
> @@ -309,10 +309,22 @@ static int altera_cvp_teardown(struct fpga_manager *mgr,
> /* STEP 15 - poll CVP_CONFIG_READY bit for 0 with 10us timeout */
> ret = altera_cvp_wait_status(conf, VSE_CVP_STATUS_CFG_RDY, 0,
> conf->priv->poll_time_us);
> - if (ret)
> + if (ret) {
> dev_err(&mgr->dev, "CFG_RDY == 0 timeout\n");
> + goto error_path;

I assume the error handling is specific to the CFG_RDY timeout, is it? Then it
could be embedded in this code block.

Also, please return -EAGAIN only in this code block.

Usually a goto error path is for a common failure exit.

> + }
>
> return ret;
> +
> +error_path:
> + /* reset CVP_MODE and HIP_CLK_SEL bit */
> + altera_read_config_dword(conf, VSE_CVP_MODE_CTRL, &val);
> + val &= ~VSE_CVP_MODE_CTRL_HIP_CLK_SEL;
> + val &= ~VSE_CVP_MODE_CTRL_CVP_MODE;
> + altera_write_config_dword(conf, VSE_CVP_MODE_CTRL, val);
> +
> + return -EAGAIN;

Please still specify the reason for returning -EAGAIN rather than a timeout error.

Thanks,
Yilun

> +
> }
>
> static int altera_cvp_write_init(struct fpga_manager *mgr,
> --
> 2.25.1
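
For reference, the structure suggested in this review (recovery embedded in the timeout branch, no separate label) might look roughly like the user-space sketch below. It is not the driver code: `fake_conf`, `fake_wait_cfg_rdy_clear()` and `teardown_sketch()` are hypothetical, the register is a plain variable, and the bit values are illustrative. The sketch keeps -EAGAIN as in the patch; whether that code or the original timeout error should be returned is still an open question in this thread.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative bit values for the simulated mode control register. */
#define MODE_CTRL_CVP_MODE	(1u << 0)
#define MODE_CTRL_HIP_CLK_SEL	(1u << 1)

/* Hypothetical stand-in for the driver's per-device configuration state. */
struct fake_conf {
	uint32_t mode_ctrl;	/* simulated mode control register */
	int wait_result;	/* result the simulated CFG_RDY poll returns */
};

/* Simulated poll for CFG_RDY == 0; returns 0 or a negative errno. */
static int fake_wait_cfg_rdy_clear(struct fake_conf *conf)
{
	return conf->wait_result;
}

static int teardown_sketch(struct fake_conf *conf)
{
	int ret = fake_wait_cfg_rdy_clear(conf);

	if (ret) {
		/*
		 * Recovery stays local to the only failure that needs it:
		 * clear the CVP_MODE and HIP_CLK_SEL bits, then report the error.
		 */
		conf->mode_ctrl &= ~(MODE_CTRL_HIP_CLK_SEL | MODE_CTRL_CVP_MODE);
		return -EAGAIN;
	}

	return 0;
}
```

Keeping the recovery inside the branch leaves the success path as a straight return and avoids a label that only one caller would ever jump to.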