2020-03-11 10:55:51

by Loic Pallardy

Subject: [RFC 0/2] Allow client to recover crashed processor

The following 2 patches propose changes that allow a user space
client to shut down and restart a crashed co-processor.
This is required when auto recovery is disabled at the framework level
or when the auto recovery procedure has failed.

Sent as an RFC, as it may become part of Mathieu's proposal for early
boot/late attach support.

Loic Pallardy (2):
  remoteproc: sysfs: authorize rproc shutdown when rproc is crashed
  remoteproc: core: keep rproc in crash state in case of recovery
    failure

 drivers/remoteproc/remoteproc_core.c  | 8 +++++++-
 drivers/remoteproc/remoteproc_sysfs.c | 2 +-
 2 files changed, 8 insertions(+), 2 deletions(-)

--
2.7.4
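
For illustration, the user space flow the cover letter describes could look roughly like the sketch below. It assumes the remoteproc sysfs `state` attribute (typically /sys/class/remoteproc/remoteprocN/state), which accepts "stop" and "start". The helper names `write_state()` and `restart_remote_processor()` are hypothetical, and whether "stop" is accepted while the processor is crashed depends on patch 1/2, which is not shown in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Write one command string to a sysfs-style attribute file.
 * Returns 0 on success or a negative errno-style value on failure.
 */
static int write_state(const char *path, const char *cmd)
{
	FILE *f;
	size_t len;
	int ret = 0;

	assert(path && cmd);

	f = fopen(path, "w");
	if (!f)
		return -errno;

	len = strlen(cmd);
	if (fwrite(cmd, 1, len, f) != len)
		ret = -EIO;

	/* stdio buffers the write, so a rejected command may only show up here. */
	if (fclose(f) == EOF && !ret)
		ret = -errno;

	return ret;
}

/*
 * Hypothetical client helper: shut the co-processor down, then boot it
 * again. state_path is supplied by the caller, for example
 * "/sys/class/remoteproc/remoteproc0/state".
 */
int restart_remote_processor(const char *state_path)
{
	int ret = write_state(state_path, "stop");

	if (ret)
		return ret;

	return write_state(state_path, "start");
}
```

A client would call `restart_remote_processor()` after noticing that the `state` attribute reports a crash and that auto recovery is disabled or has failed.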


2020-03-11 10:56:23

by Loic Pallardy

Subject: [RFC 2/2] remoteproc: core: keep rproc in crash state in case of recovery failure

When an error occurs during the recovery procedure, the internal rproc
variables may be left inconsistent:
- state is set to RPROC_OFFLINE
- the power atomic is not equal to 0
which is expected, as only rproc_stop() has been executed and not
rproc_shutdown().

In such a case, rproc_boot() can be re-executed by the client to
reboot the co-processor.

This patch proposes to keep rproc in the RPROC_CRASHED state in case
of recovery failure, to be coherent with the recovery-disabled mode.

Signed-off-by: Loic Pallardy <[email protected]>
---
drivers/remoteproc/remoteproc_core.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index 7ac87a75cd1b..def4f9fc881d 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -1679,6 +1679,12 @@ int rproc_trigger_recovery(struct rproc *rproc)
 	release_firmware(firmware_p);

 unlock_mutex:
+	/*
+	 * In case of error during recovery sequence restore rproc
+	 * state in CRASHED
+	 */
+	if (ret)
+		rproc->state = RPROC_CRASHED;
 	mutex_unlock(&rproc->lock);
 	return ret;
 }
--
2.7.4
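
The effect of the hunk above can be sketched with a small userspace model. This is not kernel code: the `model_*` names, the `start_fails` knob and the `keep_crashed_on_error` flag are hypothetical, and the real recovery sequence has more steps (firmware loading, locking) than the stop/start pair modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's rproc states and struct rproc. */
enum model_state { MODEL_OFFLINE, MODEL_RUNNING, MODEL_CRASHED };

struct model_rproc {
	enum model_state state;
	int power;		/* stands in for the atomic power refcount */
	bool start_fails;	/* hypothetical knob: make the restart step fail */
};

/* Like rproc_stop(), the model marks the processor offline. */
static int model_stop(struct model_rproc *r)
{
	r->state = MODEL_OFFLINE;
	return 0;
}

/* Like rproc_start(), the model marks the processor running on success. */
static int model_start(struct model_rproc *r)
{
	if (r->start_fails)
		return -1;
	r->state = MODEL_RUNNING;
	return 0;
}

/*
 * Model of the recovery path. With keep_crashed_on_error set, a failed
 * recovery puts the state back to CRASHED, as the patch proposes.
 * The power count is never touched here, mirroring the fact that only
 * the stop step runs and not a full shutdown.
 */
int model_trigger_recovery(struct model_rproc *r, bool keep_crashed_on_error)
{
	int ret;

	assert(r->state == MODEL_CRASHED);

	ret = model_stop(r);
	if (ret)
		goto out;

	ret = model_start(r);
out:
	if (ret && keep_crashed_on_error)
		r->state = MODEL_CRASHED;
	return ret;
}
```

Without the flag, a failed recovery leaves the model OFFLINE with a non-zero power count, which is the inconsistency the commit message describes. With the flag, the model ends in CRASHED, the same state it would report when recovery is disabled.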

2020-03-11 14:57:05

by Mathieu Poirier

Subject: Re: [RFC 0/2] Allow client to recover crashed processor

On Wed, 11 Mar 2020 at 04:54, Loic Pallardy <[email protected]> wrote:
>
> The following 2 patches propose changes that allow a user space
> client to shut down and restart a crashed co-processor.
> This is required when auto recovery is disabled at the framework level
> or when the auto recovery procedure has failed.
>
> Sent as an RFC, as it may become part of Mathieu's proposal for early
> boot/late attach support.

Perfect timing - thanks for sending those out.

Mathieu

>
> Loic Pallardy (2):
> remoteproc: sysfs: authorize rproc shutdown when rproc is crashed
> remoteproc: core: keep rproc in crash state in case of recovery
> failure
>
> drivers/remoteproc/remoteproc_core.c | 8 +++++++-
> drivers/remoteproc/remoteproc_sysfs.c | 2 +-
> 2 files changed, 8 insertions(+), 2 deletions(-)
>
> --
> 2.7.4
>

2020-05-06 02:28:28

by Bjorn Andersson

Subject: Re: [RFC 2/2] remoteproc: core: keep rproc in crash state in case of recovery failure

On Wed 11 Mar 03:54 PDT 2020, Loic Pallardy wrote:

> When an error occurs during the recovery procedure, the internal rproc
> variables may be left inconsistent:
> - state is set to RPROC_OFFLINE
> - the power atomic is not equal to 0
> which is expected, as only rproc_stop() has been executed and not
> rproc_shutdown().
>
> In such a case, rproc_boot() can be re-executed by the client to
> reboot the co-processor.
>
> This patch proposes to keep rproc in the RPROC_CRASHED state in case
> of recovery failure, to be coherent with the recovery-disabled mode.
>
> Signed-off-by: Loic Pallardy <[email protected]>
> ---
> drivers/remoteproc/remoteproc_core.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index 7ac87a75cd1b..def4f9fc881d 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1679,6 +1679,12 @@ int rproc_trigger_recovery(struct rproc *rproc)
>  	release_firmware(firmware_p);
>
>  unlock_mutex:
> +	/*
> +	 * In case of error during recovery sequence restore rproc
> +	 * state in CRASHED
> +	 */
> +	if (ret)
> +		rproc->state = RPROC_CRASHED;

I got back to this after looking at Mathieu's synchronization series.
I think it would be cleaner if we moved the rproc->state update out of
rproc_start() and rproc_stop().

That way we would leave the rproc in the CRASHED state throughout the
recovery process, which I think makes it easier to reason about the
various states and their transitions.

Regards,
Bjorn

>  	mutex_unlock(&rproc->lock);
>  	return ret;
>  }
> --
> 2.7.4
>
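
One possible reading of Bjorn's suggestion, as a userspace sketch rather than kernel code: the start and stop helpers no longer touch the state, and each caller decides which state to publish. All `sketch_*` names and the `start_fails` knob are hypothetical, and the real functions take more arguments and do more work than shown here.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_state { SKETCH_OFFLINE, SKETCH_RUNNING, SKETCH_CRASHED };

struct sketch_rproc {
	enum sketch_state state;
	bool start_fails;	/* hypothetical knob: make the start step fail */
};

/* The low-level helpers only act on the (imaginary) hardware. */
static int sketch_stop(struct sketch_rproc *r)
{
	(void)r;
	return 0;
}

static int sketch_start(struct sketch_rproc *r)
{
	return r->start_fails ? -1 : 0;
}

/* Normal boot path: the caller publishes RUNNING once start succeeded. */
int sketch_boot(struct sketch_rproc *r)
{
	int ret = sketch_start(r);

	if (!ret)
		r->state = SKETCH_RUNNING;
	return ret;
}

/* Normal shutdown path: the caller publishes OFFLINE after stopping. */
void sketch_shutdown(struct sketch_rproc *r)
{
	sketch_stop(r);
	r->state = SKETCH_OFFLINE;
}

/*
 * Recovery path: the state stays CRASHED for the whole sequence and only
 * changes to RUNNING once every step has succeeded, so no error path
 * needs to restore it.
 */
int sketch_trigger_recovery(struct sketch_rproc *r)
{
	int ret;

	assert(r->state == SKETCH_CRASHED);

	ret = sketch_stop(r);
	if (ret)
		return ret;

	ret = sketch_start(r);
	if (ret)
		return ret;

	r->state = SKETCH_RUNNING;
	return 0;
}
```

In this arrangement a failed recovery needs no explicit "restore CRASHED" step, because the state is never left CRASHED until recovery completes.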