2020-11-22 05:43:16

by Bjorn Andersson

[permalink] [raw]
Subject: [PATCH v3 0/4] remoteproc: Improvement for the Qualcomm sysmon

The core part of this series is the update to the sysmon driver to ensure that
notifications sent to the remote processor are consistent and always present
valid state transitions.

In testing this I finally took the time to fix up the issue of the SMP2P based
graceful shutdown in the remoteproc drivers always timing out if sysmon has
already successfully shut down the remote processor.

Bjorn Andersson (4):
remoteproc: sysmon: Ensure remote notification ordering
remoteproc: sysmon: Expose the shutdown result
remoteproc: qcom: q6v5: Query sysmon before graceful shutdown
remoteproc: sysmon: Improve error messages

drivers/remoteproc/qcom_common.h | 6 ++
drivers/remoteproc/qcom_q6v5.c | 8 +-
drivers/remoteproc/qcom_q6v5.h | 3 +-
drivers/remoteproc/qcom_q6v5_adsp.c | 2 +-
drivers/remoteproc/qcom_q6v5_mss.c | 2 +-
drivers/remoteproc/qcom_q6v5_pas.c | 2 +-
drivers/remoteproc/qcom_q6v5_wcss.c | 2 +-
drivers/remoteproc/qcom_sysmon.c | 119 +++++++++++++++++++++-------
8 files changed, 109 insertions(+), 35 deletions(-)

--
2.28.0


2020-11-22 05:43:59

by Bjorn Andersson

[permalink] [raw]
Subject: [PATCH v3 4/4] remoteproc: sysmon: Improve error messages

Improve the style of a few of the error messages printed by the sysmon
implementation and fix the copy-pasted shutdown error in the send-event
function.

Tested-by: Steev Klimaszewski <[email protected]>
Reviewed-by: Rishabh Bhatnagar <[email protected]>
Signed-off-by: Bjorn Andersson <[email protected]>
---

Change since v2:
- None

drivers/remoteproc/qcom_sysmon.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/remoteproc/qcom_sysmon.c b/drivers/remoteproc/qcom_sysmon.c
index a428b707a6de..9fed11a2b4ba 100644
--- a/drivers/remoteproc/qcom_sysmon.c
+++ b/drivers/remoteproc/qcom_sysmon.c
@@ -352,9 +352,9 @@ static bool ssctl_request_shutdown(struct qcom_sysmon *sysmon)

ret = qmi_txn_wait(&txn, 5 * HZ);
if (ret < 0) {
- dev_err(sysmon->dev, "failed receiving QMI response\n");
+ dev_err(sysmon->dev, "timeout waiting for shutdown response\n");
} else if (resp.resp.result) {
- dev_err(sysmon->dev, "shutdown request failed\n");
+ dev_err(sysmon->dev, "shutdown request rejected\n");
} else {
dev_dbg(sysmon->dev, "shutdown request completed\n");
acked = true;
@@ -397,18 +397,18 @@ static void ssctl_send_event(struct qcom_sysmon *sysmon,
SSCTL_SUBSYS_EVENT_REQ, 40,
ssctl_subsys_event_req_ei, &req);
if (ret < 0) {
- dev_err(sysmon->dev, "failed to send shutdown request\n");
+ dev_err(sysmon->dev, "failed to send subsystem event\n");
qmi_txn_cancel(&txn);
return;
}

ret = qmi_txn_wait(&txn, 5 * HZ);
if (ret < 0)
- dev_err(sysmon->dev, "failed receiving QMI response\n");
+ dev_err(sysmon->dev, "timeout waiting for subsystem event response\n");
else if (resp.resp.result)
- dev_err(sysmon->dev, "ssr event send failed\n");
+ dev_err(sysmon->dev, "subsystem event rejected\n");
else
- dev_dbg(sysmon->dev, "ssr event send completed\n");
+ dev_dbg(sysmon->dev, "subsystem event accepted\n");
}

/**
--
2.28.0

2020-11-22 05:44:22

by Bjorn Andersson

[permalink] [raw]
Subject: [PATCH v3 2/4] remoteproc: sysmon: Expose the shutdown result

A graceful shutdown of the Qualcomm remote processors where
traditionally performed by invoking a shared memory state signal and
waiting for the associated ack.

This was later superseded by the "sysmon" mechanism, where some form of
shared memory bus is used to send a "graceful shutdown request" message
and one of more signals comes back to indicate its success.

But when this newer mechanism is in effect the firmware is shut down by
the time the older mechanism, implemented in the remoteproc drivers,
attempts to perform a graceful shutdown - and as such it will never
receive an ack back.

This patch therefor track the success of the latest shutdown attempt in
sysmon and exposes a new function in the API that the remoteproc driver
can use to query the success and the necessity of invoking the older
mechanism.

Tested-by: Steev Klimaszewski <[email protected]>
Reviewed-by: Rishabh Bhatnagar <[email protected]>
Signed-off-by: Bjorn Andersson <[email protected]>
---

Change since v2:
- None

drivers/remoteproc/qcom_common.h | 6 +++
drivers/remoteproc/qcom_sysmon.c | 82 ++++++++++++++++++++++++--------
2 files changed, 69 insertions(+), 19 deletions(-)

diff --git a/drivers/remoteproc/qcom_common.h b/drivers/remoteproc/qcom_common.h
index dfc641c3a98b..8ba9052955bd 100644
--- a/drivers/remoteproc/qcom_common.h
+++ b/drivers/remoteproc/qcom_common.h
@@ -51,6 +51,7 @@ struct qcom_sysmon *qcom_add_sysmon_subdev(struct rproc *rproc,
const char *name,
int ssctl_instance);
void qcom_remove_sysmon_subdev(struct qcom_sysmon *sysmon);
+bool qcom_sysmon_shutdown_acked(struct qcom_sysmon *sysmon);
#else
static inline struct qcom_sysmon *qcom_add_sysmon_subdev(struct rproc *rproc,
const char *name,
@@ -62,6 +63,11 @@ static inline struct qcom_sysmon *qcom_add_sysmon_subdev(struct rproc *rproc,
static inline void qcom_remove_sysmon_subdev(struct qcom_sysmon *sysmon)
{
}
+
+static inline bool qcom_sysmon_shutdown_acked(struct qcom_sysmon *sysmon)
+{
+ return false;
+}
#endif

#endif
diff --git a/drivers/remoteproc/qcom_sysmon.c b/drivers/remoteproc/qcom_sysmon.c
index b37b111b15b3..a428b707a6de 100644
--- a/drivers/remoteproc/qcom_sysmon.c
+++ b/drivers/remoteproc/qcom_sysmon.c
@@ -44,6 +44,7 @@ struct qcom_sysmon {
struct mutex lock;

bool ssr_ack;
+ bool shutdown_acked;

struct qmi_handle qmi;
struct sockaddr_qrtr ssctl;
@@ -115,10 +116,13 @@ static void sysmon_send_event(struct qcom_sysmon *sysmon,
/**
* sysmon_request_shutdown() - request graceful shutdown of remote
* @sysmon: sysmon context
+ *
+ * Return: boolean indicator of the remote processor acking the request
*/
-static void sysmon_request_shutdown(struct qcom_sysmon *sysmon)
+static bool sysmon_request_shutdown(struct qcom_sysmon *sysmon)
{
char *req = "ssr:shutdown";
+ bool acked = false;
int ret;

mutex_lock(&sysmon->lock);
@@ -141,9 +145,13 @@ static void sysmon_request_shutdown(struct qcom_sysmon *sysmon)
if (!sysmon->ssr_ack)
dev_err(sysmon->dev,
"unexpected response to sysmon shutdown request\n");
+ else
+ acked = true;

out_unlock:
mutex_unlock(&sysmon->lock);
+
+ return acked;
}

static int sysmon_callback(struct rpmsg_device *rpdev, void *data, int count,
@@ -297,14 +305,33 @@ static struct qmi_msg_handler qmi_indication_handler[] = {
{}
};

+static bool ssctl_request_shutdown_wait(struct qcom_sysmon *sysmon)
+{
+ int ret;
+
+ ret = wait_for_completion_timeout(&sysmon->shutdown_comp, 10 * HZ);
+ if (ret)
+ return true;
+
+ ret = try_wait_for_completion(&sysmon->ind_comp);
+ if (ret)
+ return true;
+
+ dev_err(sysmon->dev, "timeout waiting for shutdown ack\n");
+ return false;
+}
+
/**
* ssctl_request_shutdown() - request shutdown via SSCTL QMI service
* @sysmon: sysmon context
+ *
+ * Return: boolean indicator of the remote processor acking the request
*/
-static void ssctl_request_shutdown(struct qcom_sysmon *sysmon)
+static bool ssctl_request_shutdown(struct qcom_sysmon *sysmon)
{
struct ssctl_shutdown_resp resp;
struct qmi_txn txn;
+ bool acked = false;
int ret;

reinit_completion(&sysmon->ind_comp);
@@ -312,7 +339,7 @@ static void ssctl_request_shutdown(struct qcom_sysmon *sysmon)
ret = qmi_txn_init(&sysmon->qmi, &txn, ssctl_shutdown_resp_ei, &resp);
if (ret < 0) {
dev_err(sysmon->dev, "failed to allocate QMI txn\n");
- return;
+ return false;
}

ret = qmi_send_request(&sysmon->qmi, &sysmon->ssctl, &txn,
@@ -320,27 +347,23 @@ static void ssctl_request_shutdown(struct qcom_sysmon *sysmon)
if (ret < 0) {
dev_err(sysmon->dev, "failed to send shutdown request\n");
qmi_txn_cancel(&txn);
- return;
+ return false;
}

ret = qmi_txn_wait(&txn, 5 * HZ);
- if (ret < 0)
+ if (ret < 0) {
dev_err(sysmon->dev, "failed receiving QMI response\n");
- else if (resp.resp.result)
+ } else if (resp.resp.result) {
dev_err(sysmon->dev, "shutdown request failed\n");
- else
+ } else {
dev_dbg(sysmon->dev, "shutdown request completed\n");
-
- if (sysmon->shutdown_irq > 0) {
- ret = wait_for_completion_timeout(&sysmon->shutdown_comp,
- 10 * HZ);
- if (!ret) {
- ret = try_wait_for_completion(&sysmon->ind_comp);
- if (!ret)
- dev_err(sysmon->dev,
- "timeout waiting for shutdown ack\n");
- }
+ acked = true;
}
+
+ if (sysmon->shutdown_irq > 0)
+ return ssctl_request_shutdown_wait(sysmon);
+
+ return acked;
}

/**
@@ -510,6 +533,9 @@ static void sysmon_stop(struct rproc_subdev *subdev, bool crashed)
.subsys_name = sysmon->name,
.ssr_event = SSCTL_SSR_EVENT_BEFORE_SHUTDOWN
};
+ bool acked;
+
+ sysmon->shutdown_acked = false;

mutex_lock(&sysmon->state_lock);
sysmon->state = SSCTL_SSR_EVENT_BEFORE_SHUTDOWN;
@@ -521,9 +547,11 @@ static void sysmon_stop(struct rproc_subdev *subdev, bool crashed)
return;

if (sysmon->ssctl_version)
- ssctl_request_shutdown(sysmon);
+ acked = ssctl_request_shutdown(sysmon);
else if (sysmon->ept)
- sysmon_request_shutdown(sysmon);
+ acked = sysmon_request_shutdown(sysmon);
+
+ sysmon->shutdown_acked = acked;
}

static void sysmon_unprepare(struct rproc_subdev *subdev)
@@ -681,6 +709,22 @@ void qcom_remove_sysmon_subdev(struct qcom_sysmon *sysmon)
}
EXPORT_SYMBOL_GPL(qcom_remove_sysmon_subdev);

+/**
+ * qcom_sysmon_shutdown_acked() - query the success of the last shutdown
+ * @sysmon: sysmon context
+ *
+ * When sysmon is used to request a graceful shutdown of the remote processor
+ * this can be used by the remoteproc driver to query the success, in order to
+ * know if it should fall back to other means of requesting a shutdown.
+ *
+ * Return: boolean indicator of the success of the last shutdown request
+ */
+bool qcom_sysmon_shutdown_acked(struct qcom_sysmon *sysmon)
+{
+ return sysmon && sysmon->shutdown_acked;
+}
+EXPORT_SYMBOL_GPL(qcom_sysmon_shutdown_acked);
+
/**
* sysmon_probe() - probe sys_mon channel
* @rpdev: rpmsg device handle
--
2.28.0

2020-11-22 05:44:24

by Bjorn Andersson

[permalink] [raw]
Subject: [PATCH v3 3/4] remoteproc: qcom: q6v5: Query sysmon before graceful shutdown

Requesting a graceful shutdown through the shared memory state signals
will not be acked in the event that sysmon has already successfully shut
down the remote firmware. So extend the stop request API to optionally
take the remoteproc's sysmon instance and query if there's already been
a successful shutdown attempt, before doing the signal dance.

Tested-by: Steev Klimaszewski <[email protected]>
Reviewed-by: Rishabh Bhatnagar <[email protected]>
Signed-off-by: Bjorn Andersson <[email protected]>
---

Change since v2:
- Fixed spelling of optionally in commit message

drivers/remoteproc/qcom_q6v5.c | 8 +++++++-
drivers/remoteproc/qcom_q6v5.h | 3 ++-
drivers/remoteproc/qcom_q6v5_adsp.c | 2 +-
drivers/remoteproc/qcom_q6v5_mss.c | 2 +-
drivers/remoteproc/qcom_q6v5_pas.c | 2 +-
drivers/remoteproc/qcom_q6v5_wcss.c | 2 +-
6 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/remoteproc/qcom_q6v5.c b/drivers/remoteproc/qcom_q6v5.c
index fd6fd36268d9..9627a950928e 100644
--- a/drivers/remoteproc/qcom_q6v5.c
+++ b/drivers/remoteproc/qcom_q6v5.c
@@ -13,6 +13,7 @@
#include <linux/soc/qcom/smem.h>
#include <linux/soc/qcom/smem_state.h>
#include <linux/remoteproc.h>
+#include "qcom_common.h"
#include "qcom_q6v5.h"

#define Q6V5_PANIC_DELAY_MS 200
@@ -146,15 +147,20 @@ static irqreturn_t q6v5_stop_interrupt(int irq, void *data)
/**
* qcom_q6v5_request_stop() - request the remote processor to stop
* @q6v5: reference to qcom_q6v5 context
+ * @sysmon: reference to the remote's sysmon instance, or NULL
*
* Return: 0 on success, negative errno on failure
*/
-int qcom_q6v5_request_stop(struct qcom_q6v5 *q6v5)
+int qcom_q6v5_request_stop(struct qcom_q6v5 *q6v5, struct qcom_sysmon *sysmon)
{
int ret;

q6v5->running = false;

+ /* Don't perform SMP2P dance if sysmon already shut down the remote */
+ if (qcom_sysmon_shutdown_acked(sysmon))
+ return 0;
+
qcom_smem_state_update_bits(q6v5->state,
BIT(q6v5->stop_bit), BIT(q6v5->stop_bit));

diff --git a/drivers/remoteproc/qcom_q6v5.h b/drivers/remoteproc/qcom_q6v5.h
index c4ed887c1499..1c212f670cbc 100644
--- a/drivers/remoteproc/qcom_q6v5.h
+++ b/drivers/remoteproc/qcom_q6v5.h
@@ -8,6 +8,7 @@

struct rproc;
struct qcom_smem_state;
+struct qcom_sysmon;

struct qcom_q6v5 {
struct device *dev;
@@ -40,7 +41,7 @@ int qcom_q6v5_init(struct qcom_q6v5 *q6v5, struct platform_device *pdev,

int qcom_q6v5_prepare(struct qcom_q6v5 *q6v5);
int qcom_q6v5_unprepare(struct qcom_q6v5 *q6v5);
-int qcom_q6v5_request_stop(struct qcom_q6v5 *q6v5);
+int qcom_q6v5_request_stop(struct qcom_q6v5 *q6v5, struct qcom_sysmon *sysmon);
int qcom_q6v5_wait_for_start(struct qcom_q6v5 *q6v5, int timeout);
unsigned long qcom_q6v5_panic(struct qcom_q6v5 *q6v5);

diff --git a/drivers/remoteproc/qcom_q6v5_adsp.c b/drivers/remoteproc/qcom_q6v5_adsp.c
index f0b7363b5b26..2f8f38408eb7 100644
--- a/drivers/remoteproc/qcom_q6v5_adsp.c
+++ b/drivers/remoteproc/qcom_q6v5_adsp.c
@@ -266,7 +266,7 @@ static int adsp_stop(struct rproc *rproc)
int handover;
int ret;

- ret = qcom_q6v5_request_stop(&adsp->q6v5);
+ ret = qcom_q6v5_request_stop(&adsp->q6v5, adsp->sysmon);
if (ret == -ETIMEDOUT)
dev_err(adsp->dev, "timed out on wait\n");

diff --git a/drivers/remoteproc/qcom_q6v5_mss.c b/drivers/remoteproc/qcom_q6v5_mss.c
index ac289e08062e..55f7c5740920 100644
--- a/drivers/remoteproc/qcom_q6v5_mss.c
+++ b/drivers/remoteproc/qcom_q6v5_mss.c
@@ -1373,7 +1373,7 @@ static int q6v5_stop(struct rproc *rproc)
struct q6v5 *qproc = (struct q6v5 *)rproc->priv;
int ret;

- ret = qcom_q6v5_request_stop(&qproc->q6v5);
+ ret = qcom_q6v5_request_stop(&qproc->q6v5, qproc->sysmon);
if (ret == -ETIMEDOUT)
dev_err(qproc->dev, "timed out on wait\n");

diff --git a/drivers/remoteproc/qcom_q6v5_pas.c b/drivers/remoteproc/qcom_q6v5_pas.c
index 0678b417707e..49ea0133ff04 100644
--- a/drivers/remoteproc/qcom_q6v5_pas.c
+++ b/drivers/remoteproc/qcom_q6v5_pas.c
@@ -217,7 +217,7 @@ static int adsp_stop(struct rproc *rproc)
int handover;
int ret;

- ret = qcom_q6v5_request_stop(&adsp->q6v5);
+ ret = qcom_q6v5_request_stop(&adsp->q6v5, adsp->sysmon);
if (ret == -ETIMEDOUT)
dev_err(adsp->dev, "timed out on wait\n");

diff --git a/drivers/remoteproc/qcom_q6v5_wcss.c b/drivers/remoteproc/qcom_q6v5_wcss.c
index 8846ef0b0f1a..d6639856069b 100644
--- a/drivers/remoteproc/qcom_q6v5_wcss.c
+++ b/drivers/remoteproc/qcom_q6v5_wcss.c
@@ -390,7 +390,7 @@ static int q6v5_wcss_stop(struct rproc *rproc)
int ret;

/* WCSS powerdown */
- ret = qcom_q6v5_request_stop(&wcss->q6v5);
+ ret = qcom_q6v5_request_stop(&wcss->q6v5, wcss->sysmon);
if (ret == -ETIMEDOUT) {
dev_err(wcss->dev, "timed out on wait\n");
return ret;
--
2.28.0

2020-11-22 05:45:49

by Bjorn Andersson

[permalink] [raw]
Subject: [PATCH v3 1/4] remoteproc: sysmon: Ensure remote notification ordering

The reliance on the remoteproc's state for determining when to send
sysmon notifications to a remote processor is racy with regard to
concurrent remoteproc operations.

Further more the advertisement of the state of other remote processor to
a newly started remote processor might not only send the wrong state,
but might result in a stream of state changes that are out of order.

Address this by introducing state tracking within the sysmon instances
themselves and extend the locking to ensure that the notifications are
consistent with this state.

Fixes: 1f36ab3f6e3b ("remoteproc: sysmon: Inform current rproc about all active rprocs")
Fixes: 1877f54f75ad ("remoteproc: sysmon: Add notifications for events")
Fixes: 1fb82ee806d1 ("remoteproc: qcom: Introduce sysmon")
Cc: [email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
---

Changes since v2:
- Hold sysmon_lock during traversal of sysmons in sysmon_start()

drivers/remoteproc/qcom_sysmon.c | 25 +++++++++++++++++++++----
1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/remoteproc/qcom_sysmon.c b/drivers/remoteproc/qcom_sysmon.c
index 9eb2f6bccea6..b37b111b15b3 100644
--- a/drivers/remoteproc/qcom_sysmon.c
+++ b/drivers/remoteproc/qcom_sysmon.c
@@ -22,6 +22,9 @@ struct qcom_sysmon {
struct rproc_subdev subdev;
struct rproc *rproc;

+ int state;
+ struct mutex state_lock;
+
struct list_head node;

const char *name;
@@ -448,7 +451,10 @@ static int sysmon_prepare(struct rproc_subdev *subdev)
.ssr_event = SSCTL_SSR_EVENT_BEFORE_POWERUP
};

+ mutex_lock(&sysmon->state_lock);
+ sysmon->state = SSCTL_SSR_EVENT_BEFORE_POWERUP;
blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
+ mutex_unlock(&sysmon->state_lock);

return 0;
}
@@ -472,20 +478,25 @@ static int sysmon_start(struct rproc_subdev *subdev)
.ssr_event = SSCTL_SSR_EVENT_AFTER_POWERUP
};

+ mutex_lock(&sysmon->state_lock);
+ sysmon->state = SSCTL_SSR_EVENT_AFTER_POWERUP;
blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
+ mutex_unlock(&sysmon->state_lock);

mutex_lock(&sysmon_lock);
list_for_each_entry(target, &sysmon_list, node) {
- if (target == sysmon ||
- target->rproc->state != RPROC_RUNNING)
+ if (target == sysmon)
continue;

+ mutex_lock(&target->state_lock);
event.subsys_name = target->name;
+ event.ssr_event = target->state;

if (sysmon->ssctl_version == 2)
ssctl_send_event(sysmon, &event);
else if (sysmon->ept)
sysmon_send_event(sysmon, &event);
+ mutex_unlock(&target->state_lock);
}
mutex_unlock(&sysmon_lock);

@@ -500,7 +511,10 @@ static void sysmon_stop(struct rproc_subdev *subdev, bool crashed)
.ssr_event = SSCTL_SSR_EVENT_BEFORE_SHUTDOWN
};

+ mutex_lock(&sysmon->state_lock);
+ sysmon->state = SSCTL_SSR_EVENT_BEFORE_SHUTDOWN;
blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
+ mutex_unlock(&sysmon->state_lock);

/* Don't request graceful shutdown if we've crashed */
if (crashed)
@@ -521,7 +535,10 @@ static void sysmon_unprepare(struct rproc_subdev *subdev)
.ssr_event = SSCTL_SSR_EVENT_AFTER_SHUTDOWN
};

+ mutex_lock(&sysmon->state_lock);
+ sysmon->state = SSCTL_SSR_EVENT_AFTER_SHUTDOWN;
blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
+ mutex_unlock(&sysmon->state_lock);
}

/**
@@ -534,11 +551,10 @@ static int sysmon_notify(struct notifier_block *nb, unsigned long event,
void *data)
{
struct qcom_sysmon *sysmon = container_of(nb, struct qcom_sysmon, nb);
- struct rproc *rproc = sysmon->rproc;
struct sysmon_event *sysmon_event = data;

/* Skip non-running rprocs and the originating instance */
- if (rproc->state != RPROC_RUNNING ||
+ if (sysmon->state != SSCTL_SSR_EVENT_AFTER_POWERUP ||
!strcmp(sysmon_event->subsys_name, sysmon->name)) {
dev_dbg(sysmon->dev, "not notifying %s\n", sysmon->name);
return NOTIFY_DONE;
@@ -591,6 +607,7 @@ struct qcom_sysmon *qcom_add_sysmon_subdev(struct rproc *rproc,
init_completion(&sysmon->ind_comp);
init_completion(&sysmon->shutdown_comp);
mutex_init(&sysmon->lock);
+ mutex_init(&sysmon->state_lock);

sysmon->shutdown_irq = of_irq_get_byname(sysmon->dev->of_node,
"shutdown-ack");
--
2.28.0

2020-11-25 22:14:01

by Rishabh Bhatnagar

[permalink] [raw]
Subject: Re: [PATCH v3 1/4] remoteproc: sysmon: Ensure remote notification ordering

On 2020-11-21 21:41, Bjorn Andersson wrote:
> The reliance on the remoteproc's state for determining when to send
> sysmon notifications to a remote processor is racy with regard to
> concurrent remoteproc operations.
>
> Further more the advertisement of the state of other remote processor
> to
> a newly started remote processor might not only send the wrong state,
> but might result in a stream of state changes that are out of order.
>
> Address this by introducing state tracking within the sysmon instances
> themselves and extend the locking to ensure that the notifications are
> consistent with this state.
>
> Fixes: 1f36ab3f6e3b ("remoteproc: sysmon: Inform current rproc about
> all active rprocs")
> Fixes: 1877f54f75ad ("remoteproc: sysmon: Add notifications for
> events")
> Fixes: 1fb82ee806d1 ("remoteproc: qcom: Introduce sysmon")
> Cc: [email protected]
> Signed-off-by: Bjorn Andersson <[email protected]>
> ---
>
> Changes since v2:
> - Hold sysmon_lock during traversal of sysmons in sysmon_start()
>
> drivers/remoteproc/qcom_sysmon.c | 25 +++++++++++++++++++++----
> 1 file changed, 21 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/remoteproc/qcom_sysmon.c
> b/drivers/remoteproc/qcom_sysmon.c
> index 9eb2f6bccea6..b37b111b15b3 100644
> --- a/drivers/remoteproc/qcom_sysmon.c
> +++ b/drivers/remoteproc/qcom_sysmon.c
> @@ -22,6 +22,9 @@ struct qcom_sysmon {
> struct rproc_subdev subdev;
> struct rproc *rproc;
>
> + int state;
> + struct mutex state_lock;
> +
> struct list_head node;
>
> const char *name;
> @@ -448,7 +451,10 @@ static int sysmon_prepare(struct rproc_subdev
> *subdev)
> .ssr_event = SSCTL_SSR_EVENT_BEFORE_POWERUP
> };
>
> + mutex_lock(&sysmon->state_lock);
> + sysmon->state = SSCTL_SSR_EVENT_BEFORE_POWERUP;
> blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
> + mutex_unlock(&sysmon->state_lock);
>
> return 0;
> }
> @@ -472,20 +478,25 @@ static int sysmon_start(struct rproc_subdev
> *subdev)
> .ssr_event = SSCTL_SSR_EVENT_AFTER_POWERUP
> };
>
> + mutex_lock(&sysmon->state_lock);
> + sysmon->state = SSCTL_SSR_EVENT_AFTER_POWERUP;
> blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
> + mutex_unlock(&sysmon->state_lock);
>
> mutex_lock(&sysmon_lock);
> list_for_each_entry(target, &sysmon_list, node) {
> - if (target == sysmon ||
> - target->rproc->state != RPROC_RUNNING)
> + if (target == sysmon)
> continue;
>
> + mutex_lock(&target->state_lock);
> event.subsys_name = target->name;
> + event.ssr_event = target->state;
>
> if (sysmon->ssctl_version == 2)
> ssctl_send_event(sysmon, &event);
> else if (sysmon->ept)
> sysmon_send_event(sysmon, &event);
> + mutex_unlock(&target->state_lock);
> }
> mutex_unlock(&sysmon_lock);
>
> @@ -500,7 +511,10 @@ static void sysmon_stop(struct rproc_subdev
> *subdev, bool crashed)
> .ssr_event = SSCTL_SSR_EVENT_BEFORE_SHUTDOWN
> };
>
> + mutex_lock(&sysmon->state_lock);
> + sysmon->state = SSCTL_SSR_EVENT_BEFORE_SHUTDOWN;
> blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
> + mutex_unlock(&sysmon->state_lock);
>
> /* Don't request graceful shutdown if we've crashed */
> if (crashed)
> @@ -521,7 +535,10 @@ static void sysmon_unprepare(struct rproc_subdev
> *subdev)
> .ssr_event = SSCTL_SSR_EVENT_AFTER_SHUTDOWN
> };
>
> + mutex_lock(&sysmon->state_lock);
> + sysmon->state = SSCTL_SSR_EVENT_AFTER_SHUTDOWN;
> blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
> + mutex_unlock(&sysmon->state_lock);
> }
>
> /**
> @@ -534,11 +551,10 @@ static int sysmon_notify(struct notifier_block
> *nb, unsigned long event,
> void *data)
> {
> struct qcom_sysmon *sysmon = container_of(nb, struct qcom_sysmon,
> nb);
> - struct rproc *rproc = sysmon->rproc;
> struct sysmon_event *sysmon_event = data;
>
> /* Skip non-running rprocs and the originating instance */
> - if (rproc->state != RPROC_RUNNING ||
> + if (sysmon->state != SSCTL_SSR_EVENT_AFTER_POWERUP ||
> !strcmp(sysmon_event->subsys_name, sysmon->name)) {
> dev_dbg(sysmon->dev, "not notifying %s\n", sysmon->name);
> return NOTIFY_DONE;
> @@ -591,6 +607,7 @@ struct qcom_sysmon *qcom_add_sysmon_subdev(struct
> rproc *rproc,
> init_completion(&sysmon->ind_comp);
> init_completion(&sysmon->shutdown_comp);
> mutex_init(&sysmon->lock);
> + mutex_init(&sysmon->state_lock);
>
> sysmon->shutdown_irq = of_irq_get_byname(sysmon->dev->of_node,
> "shutdown-ack");

Reviewed-by: Rishabh Bhatnagar <[email protected]>

2020-12-07 20:05:16

by Evan Green

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] remoteproc: sysmon: Expose the shutdown result

On Sat, Nov 21, 2020 at 9:43 PM Bjorn Andersson
<[email protected]> wrote:
>
> A graceful shutdown of the Qualcomm remote processors where
> traditionally performed by invoking a shared memory state signal and
> waiting for the associated ack.
>
> This was later superseded by the "sysmon" mechanism, where some form of
> shared memory bus is used to send a "graceful shutdown request" message
> and one of more signals comes back to indicate its success.
>
> But when this newer mechanism is in effect the firmware is shut down by
> the time the older mechanism, implemented in the remoteproc drivers,
> attempts to perform a graceful shutdown - and as such it will never
> receive an ack back.
>
> This patch therefor track the success of the latest shutdown attempt in
> sysmon and exposes a new function in the API that the remoteproc driver
> can use to query the success and the necessity of invoking the older
> mechanism.
>
> Tested-by: Steev Klimaszewski <[email protected]>
> Reviewed-by: Rishabh Bhatnagar <[email protected]>
> Signed-off-by: Bjorn Andersson <[email protected]>
> ---
>
> Change since v2:
> - None
>
> drivers/remoteproc/qcom_common.h | 6 +++
> drivers/remoteproc/qcom_sysmon.c | 82 ++++++++++++++++++++++++--------
> 2 files changed, 69 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/remoteproc/qcom_common.h b/drivers/remoteproc/qcom_common.h
> index dfc641c3a98b..8ba9052955bd 100644
> --- a/drivers/remoteproc/qcom_common.h
> +++ b/drivers/remoteproc/qcom_common.h
> @@ -51,6 +51,7 @@ struct qcom_sysmon *qcom_add_sysmon_subdev(struct rproc *rproc,
> const char *name,
> int ssctl_instance);
> void qcom_remove_sysmon_subdev(struct qcom_sysmon *sysmon);
> +bool qcom_sysmon_shutdown_acked(struct qcom_sysmon *sysmon);
> #else
> static inline struct qcom_sysmon *qcom_add_sysmon_subdev(struct rproc *rproc,
> const char *name,
> @@ -62,6 +63,11 @@ static inline struct qcom_sysmon *qcom_add_sysmon_subdev(struct rproc *rproc,
> static inline void qcom_remove_sysmon_subdev(struct qcom_sysmon *sysmon)
> {
> }
> +
> +static inline bool qcom_sysmon_shutdown_acked(struct qcom_sysmon *sysmon)
> +{
> + return false;
> +}
> #endif
>
> #endif
> diff --git a/drivers/remoteproc/qcom_sysmon.c b/drivers/remoteproc/qcom_sysmon.c
> index b37b111b15b3..a428b707a6de 100644
> --- a/drivers/remoteproc/qcom_sysmon.c
> +++ b/drivers/remoteproc/qcom_sysmon.c
> @@ -44,6 +44,7 @@ struct qcom_sysmon {
> struct mutex lock;
>
> bool ssr_ack;
> + bool shutdown_acked;
>
> struct qmi_handle qmi;
> struct sockaddr_qrtr ssctl;
> @@ -115,10 +116,13 @@ static void sysmon_send_event(struct qcom_sysmon *sysmon,
> /**
> * sysmon_request_shutdown() - request graceful shutdown of remote
> * @sysmon: sysmon context
> + *
> + * Return: boolean indicator of the remote processor acking the request
> */
> -static void sysmon_request_shutdown(struct qcom_sysmon *sysmon)
> +static bool sysmon_request_shutdown(struct qcom_sysmon *sysmon)
> {
> char *req = "ssr:shutdown";
> + bool acked = false;
> int ret;
>
> mutex_lock(&sysmon->lock);
> @@ -141,9 +145,13 @@ static void sysmon_request_shutdown(struct qcom_sysmon *sysmon)
> if (!sysmon->ssr_ack)
> dev_err(sysmon->dev,
> "unexpected response to sysmon shutdown request\n");
> + else
> + acked = true;
>
> out_unlock:
> mutex_unlock(&sysmon->lock);
> +
> + return acked;
> }
>
> static int sysmon_callback(struct rpmsg_device *rpdev, void *data, int count,
> @@ -297,14 +305,33 @@ static struct qmi_msg_handler qmi_indication_handler[] = {
> {}
> };
>
> +static bool ssctl_request_shutdown_wait(struct qcom_sysmon *sysmon)
> +{
> + int ret;
> +
> + ret = wait_for_completion_timeout(&sysmon->shutdown_comp, 10 * HZ);
> + if (ret)
> + return true;
> +
> + ret = try_wait_for_completion(&sysmon->ind_comp);
> + if (ret)
> + return true;
> +
> + dev_err(sysmon->dev, "timeout waiting for shutdown ack\n");
> + return false;
> +}
> +
> /**
> * ssctl_request_shutdown() - request shutdown via SSCTL QMI service
> * @sysmon: sysmon context
> + *
> + * Return: boolean indicator of the remote processor acking the request
> */
> -static void ssctl_request_shutdown(struct qcom_sysmon *sysmon)
> +static bool ssctl_request_shutdown(struct qcom_sysmon *sysmon)
> {
> struct ssctl_shutdown_resp resp;
> struct qmi_txn txn;
> + bool acked = false;
> int ret;
>
> reinit_completion(&sysmon->ind_comp);
> @@ -312,7 +339,7 @@ static void ssctl_request_shutdown(struct qcom_sysmon *sysmon)
> ret = qmi_txn_init(&sysmon->qmi, &txn, ssctl_shutdown_resp_ei, &resp);
> if (ret < 0) {
> dev_err(sysmon->dev, "failed to allocate QMI txn\n");
> - return;
> + return false;
> }
>
> ret = qmi_send_request(&sysmon->qmi, &sysmon->ssctl, &txn,
> @@ -320,27 +347,23 @@ static void ssctl_request_shutdown(struct qcom_sysmon *sysmon)
> if (ret < 0) {
> dev_err(sysmon->dev, "failed to send shutdown request\n");
> qmi_txn_cancel(&txn);
> - return;
> + return false;
> }
>
> ret = qmi_txn_wait(&txn, 5 * HZ);
> - if (ret < 0)
> + if (ret < 0) {
> dev_err(sysmon->dev, "failed receiving QMI response\n");
> - else if (resp.resp.result)
> + } else if (resp.resp.result) {
> dev_err(sysmon->dev, "shutdown request failed\n");
> - else
> + } else {
> dev_dbg(sysmon->dev, "shutdown request completed\n");
> -
> - if (sysmon->shutdown_irq > 0) {
> - ret = wait_for_completion_timeout(&sysmon->shutdown_comp,
> - 10 * HZ);
> - if (!ret) {
> - ret = try_wait_for_completion(&sysmon->ind_comp);
> - if (!ret)
> - dev_err(sysmon->dev,
> - "timeout waiting for shutdown ack\n");
> - }
> + acked = true;
> }
> +
> + if (sysmon->shutdown_irq > 0)
> + return ssctl_request_shutdown_wait(sysmon);
> +
> + return acked;
> }
>
> /**
> @@ -510,6 +533,9 @@ static void sysmon_stop(struct rproc_subdev *subdev, bool crashed)
> .subsys_name = sysmon->name,
> .ssr_event = SSCTL_SSR_EVENT_BEFORE_SHUTDOWN
> };
> + bool acked;
> +
> + sysmon->shutdown_acked = false;
>
> mutex_lock(&sysmon->state_lock);
> sysmon->state = SSCTL_SSR_EVENT_BEFORE_SHUTDOWN;
> @@ -521,9 +547,11 @@ static void sysmon_stop(struct rproc_subdev *subdev, bool crashed)
> return;
>
> if (sysmon->ssctl_version)

> - ssctl_request_shutdown(sysmon);
> + acked = ssctl_request_shutdown(sysmon);
> else if (sysmon->ept)
> - sysmon_request_shutdown(sysmon);
> + acked = sysmon_request_shutdown(sysmon);
> +
> + sysmon->shutdown_acked = acked;

Guenter noticed that the 0-day bot complains about acked being
potentially uninitialized here. He put a fix for us into Chrome OS:

https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/2577656

Bjorn, do you want to tweak the patch in your tree?
-Evan

2020-12-07 20:12:20

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] remoteproc: sysmon: Expose the shutdown result

On Mon, Dec 7, 2020 at 2:00 PM Evan Green <[email protected]> wrote:
>
> On Sat, Nov 21, 2020 at 9:43 PM Bjorn Andersson
> <[email protected]> wrote:
> >
> > A graceful shutdown of the Qualcomm remote processors where
> > traditionally performed by invoking a shared memory state signal and
> > waiting for the associated ack.
> >
> > This was later superseded by the "sysmon" mechanism, where some form of
> > shared memory bus is used to send a "graceful shutdown request" message
> > and one of more signals comes back to indicate its success.
> >
> > But when this newer mechanism is in effect the firmware is shut down by
> > the time the older mechanism, implemented in the remoteproc drivers,
> > attempts to perform a graceful shutdown - and as such it will never
> > receive an ack back.
> >
> > This patch therefor track the success of the latest shutdown attempt in
> > sysmon and exposes a new function in the API that the remoteproc driver
> > can use to query the success and the necessity of invoking the older
> > mechanism.
> >
> > Tested-by: Steev Klimaszewski <[email protected]>
> > Reviewed-by: Rishabh Bhatnagar <[email protected]>
> > Signed-off-by: Bjorn Andersson <[email protected]>
> > ---
> >
> > Change since v2:
> > - None
> >
> > drivers/remoteproc/qcom_common.h | 6 +++
> > drivers/remoteproc/qcom_sysmon.c | 82 ++++++++++++++++++++++++--------
> > 2 files changed, 69 insertions(+), 19 deletions(-)
> >
> > diff --git a/drivers/remoteproc/qcom_common.h b/drivers/remoteproc/qcom_common.h
> > index dfc641c3a98b..8ba9052955bd 100644
> > --- a/drivers/remoteproc/qcom_common.h
> > +++ b/drivers/remoteproc/qcom_common.h
> > @@ -51,6 +51,7 @@ struct qcom_sysmon *qcom_add_sysmon_subdev(struct rproc *rproc,
> > const char *name,
> > int ssctl_instance);
> > void qcom_remove_sysmon_subdev(struct qcom_sysmon *sysmon);
> > +bool qcom_sysmon_shutdown_acked(struct qcom_sysmon *sysmon);
> > #else
> > static inline struct qcom_sysmon *qcom_add_sysmon_subdev(struct rproc *rproc,
> > const char *name,
> > @@ -62,6 +63,11 @@ static inline struct qcom_sysmon *qcom_add_sysmon_subdev(struct rproc *rproc,
> > static inline void qcom_remove_sysmon_subdev(struct qcom_sysmon *sysmon)
> > {
> > }
> > +
> > +static inline bool qcom_sysmon_shutdown_acked(struct qcom_sysmon *sysmon)
> > +{
> > + return false;
> > +}
> > #endif
> >
> > #endif
> > diff --git a/drivers/remoteproc/qcom_sysmon.c b/drivers/remoteproc/qcom_sysmon.c
> > index b37b111b15b3..a428b707a6de 100644
> > --- a/drivers/remoteproc/qcom_sysmon.c
> > +++ b/drivers/remoteproc/qcom_sysmon.c
> > @@ -44,6 +44,7 @@ struct qcom_sysmon {
> > struct mutex lock;
> >
> > bool ssr_ack;
> > + bool shutdown_acked;
> >
> > struct qmi_handle qmi;
> > struct sockaddr_qrtr ssctl;
> > @@ -115,10 +116,13 @@ static void sysmon_send_event(struct qcom_sysmon *sysmon,
> > /**
> > * sysmon_request_shutdown() - request graceful shutdown of remote
> > * @sysmon: sysmon context
> > + *
> > + * Return: boolean indicator of the remote processor acking the request
> > */
> > -static void sysmon_request_shutdown(struct qcom_sysmon *sysmon)
> > +static bool sysmon_request_shutdown(struct qcom_sysmon *sysmon)
> > {
> > char *req = "ssr:shutdown";
> > + bool acked = false;
> > int ret;
> >
> > mutex_lock(&sysmon->lock);
> > @@ -141,9 +145,13 @@ static void sysmon_request_shutdown(struct qcom_sysmon *sysmon)
> > if (!sysmon->ssr_ack)
> > dev_err(sysmon->dev,
> > "unexpected response to sysmon shutdown request\n");
> > + else
> > + acked = true;
> >
> > out_unlock:
> > mutex_unlock(&sysmon->lock);
> > +
> > + return acked;
> > }
> >
> > static int sysmon_callback(struct rpmsg_device *rpdev, void *data, int count,
> > @@ -297,14 +305,33 @@ static struct qmi_msg_handler qmi_indication_handler[] = {
> > {}
> > };
> >
> > +static bool ssctl_request_shutdown_wait(struct qcom_sysmon *sysmon)
> > +{
> > + int ret;
> > +
> > + ret = wait_for_completion_timeout(&sysmon->shutdown_comp, 10 * HZ);
> > + if (ret)
> > + return true;
> > +
> > + ret = try_wait_for_completion(&sysmon->ind_comp);
> > + if (ret)
> > + return true;
> > +
> > + dev_err(sysmon->dev, "timeout waiting for shutdown ack\n");
> > + return false;
> > +}
> > +
> > /**
> > * ssctl_request_shutdown() - request shutdown via SSCTL QMI service
> > * @sysmon: sysmon context
> > + *
> > + * Return: boolean indicator of the remote processor acking the request
> > */
> > -static void ssctl_request_shutdown(struct qcom_sysmon *sysmon)
> > +static bool ssctl_request_shutdown(struct qcom_sysmon *sysmon)
> > {
> > struct ssctl_shutdown_resp resp;
> > struct qmi_txn txn;
> > + bool acked = false;
> > int ret;
> >
> > reinit_completion(&sysmon->ind_comp);
> > @@ -312,7 +339,7 @@ static void ssctl_request_shutdown(struct qcom_sysmon *sysmon)
> > ret = qmi_txn_init(&sysmon->qmi, &txn, ssctl_shutdown_resp_ei, &resp);
> > if (ret < 0) {
> > dev_err(sysmon->dev, "failed to allocate QMI txn\n");
> > - return;
> > + return false;
> > }
> >
> > ret = qmi_send_request(&sysmon->qmi, &sysmon->ssctl, &txn,
> > @@ -320,27 +347,23 @@ static void ssctl_request_shutdown(struct qcom_sysmon *sysmon)
> > if (ret < 0) {
> > dev_err(sysmon->dev, "failed to send shutdown request\n");
> > qmi_txn_cancel(&txn);
> > - return;
> > + return false;
> > }
> >
> > ret = qmi_txn_wait(&txn, 5 * HZ);
> > - if (ret < 0)
> > + if (ret < 0) {
> > dev_err(sysmon->dev, "failed receiving QMI response\n");
> > - else if (resp.resp.result)
> > + } else if (resp.resp.result) {
> > dev_err(sysmon->dev, "shutdown request failed\n");
> > - else
> > + } else {
> > dev_dbg(sysmon->dev, "shutdown request completed\n");
> > -
> > - if (sysmon->shutdown_irq > 0) {
> > - ret = wait_for_completion_timeout(&sysmon->shutdown_comp,
> > - 10 * HZ);
> > - if (!ret) {
> > - ret = try_wait_for_completion(&sysmon->ind_comp);
> > - if (!ret)
> > - dev_err(sysmon->dev,
> > - "timeout waiting for shutdown ack\n");
> > - }
> > + acked = true;
> > }
> > +
> > + if (sysmon->shutdown_irq > 0)
> > + return ssctl_request_shutdown_wait(sysmon);
> > +
> > + return acked;
> > }
> >
> > /**
> > @@ -510,6 +533,9 @@ static void sysmon_stop(struct rproc_subdev *subdev, bool crashed)
> > .subsys_name = sysmon->name,
> > .ssr_event = SSCTL_SSR_EVENT_BEFORE_SHUTDOWN
> > };
> > + bool acked;
> > +
> > + sysmon->shutdown_acked = false;
> >
> > mutex_lock(&sysmon->state_lock);
> > sysmon->state = SSCTL_SSR_EVENT_BEFORE_SHUTDOWN;
> > @@ -521,9 +547,11 @@ static void sysmon_stop(struct rproc_subdev *subdev, bool crashed)
> > return;
> >
> > if (sysmon->ssctl_version)
>
> > - ssctl_request_shutdown(sysmon);
> > + acked = ssctl_request_shutdown(sysmon);
> > else if (sysmon->ept)
> > - sysmon_request_shutdown(sysmon);
> > + acked = sysmon_request_shutdown(sysmon);
> > +
> > + sysmon->shutdown_acked = acked;
>
> Guenter noticed that the 0-day bot complains about acked being
> potentially uninitialized here. He put a fix for us into Chrome OS:
>
> https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/2577656
>
> Bjorn, do you want to tweak the patch in your tree?

No, I prefer not to force push to the tree. I did however merge and
push out Arnd's fixup to this. You can find it here:

https://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc.git/commit/?h=for-next&id=9d7b4a40387d0f91512a74caed6654ffa23d5ce4

Regards,
Bjorn