hwspinlocks can be acquired by many devices on the SoC. If any of these
devices go into a bad state before the device releases the hwspinlock,
then that hwspinlock may end up in an unusable state.
In the case of smem, each remoteproc takes a hwspinlock before trying to
allocate an smem item. If the remoteproc were to suddenly crash without
releasing this, it would be impossible for other remoteprocs to allocate
any smem items.
We propose a new api to bust a hwspinlock. A driver can use the
the bust api if it detects the device has gone into an error state, thus
allowing other entities in the system to use the hwspinlock.
These patches were tested on an sm8650 mtp using engineering cdsp
firmware that triggers a watchdog with the smem hwspinlock acquired.
Checked for error in dt-bindings with below.
- make DT_CHECKER_FLAGS=-m DT_SCHEMA_FILES=remoteproc/qcom,pas-common.yaml dt_binding_check
- make qcom/sm8650-mtp.dtb CHECK_DTBS=1
Signed-off-by: Chris Lew <[email protected]>
---
Changes in v3:
- Fixed 80 char comment formatting and missing space suggested by Bjorn
- Changed unsigned to unsigned int in smem apis from checkpatch warning
- Removed null pointer check in smem bust api
- Picked up reviewed-by trailers from Bjorn
- Link to v2: https://lore.kernel.org/r/[email protected]
Changes in v2:
- Remove extra print in qcom_q6v5_pas as suggested by Bryan
- Expose SMEM API that remotproc can call to bust the smem hwlock as suggested by Krzysztof
- Drop patches related to multiple references on a hwlock
- Link to v1: https://lore.kernel.org/r/[email protected]
---
Chris Lew (1):
soc: qcom: smem: Add qcom_smem_bust_hwspin_lock_by_host()
Richard Maina (3):
hwspinlock: Introduce hwspin_lock_bust()
hwspinlock: qcom: implement bust operation
remoteproc: qcom_q6v5_pas: Add hwspinlock bust on stop
Documentation/locking/hwspinlock.rst | 11 +++++++++++
drivers/hwspinlock/hwspinlock_core.c | 28 ++++++++++++++++++++++++++++
drivers/hwspinlock/hwspinlock_internal.h | 3 +++
drivers/hwspinlock/qcom_hwspinlock.c | 25 +++++++++++++++++++++++++
drivers/remoteproc/qcom_q6v5_pas.c | 11 +++++++++++
drivers/soc/qcom/smem.c | 26 ++++++++++++++++++++++++++
include/linux/hwspinlock.h | 6 ++++++
include/linux/soc/qcom/smem.h | 2 ++
8 files changed, 112 insertions(+)
---
base-commit: e7b4ef8fffaca247809337bb78daceb406659f2d
change-id: 20240509-hwspinlock-bust-d497a70c1a3a
Best regards,
--
Chris Lew <[email protected]>
From: Richard Maina <[email protected]>
When a remoteproc crashes or goes down unexpectedly this can result in
a state where locks held by the remoteproc will remain locked possibly
resulting in deadlock. This new API hwspin_lock_bust() allows
hwspinlock implementers to define a bust operation for freeing previously
acquired hwspinlocks after verifying ownership of the acquired lock.
Signed-off-by: Richard Maina <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
Signed-off-by: Chris Lew <[email protected]>
---
Documentation/locking/hwspinlock.rst | 11 +++++++++++
drivers/hwspinlock/hwspinlock_core.c | 28 ++++++++++++++++++++++++++++
drivers/hwspinlock/hwspinlock_internal.h | 3 +++
include/linux/hwspinlock.h | 6 ++++++
4 files changed, 48 insertions(+)
diff --git a/Documentation/locking/hwspinlock.rst b/Documentation/locking/hwspinlock.rst
index 6f03713b7003..2ffaa3cbd63f 100644
--- a/Documentation/locking/hwspinlock.rst
+++ b/Documentation/locking/hwspinlock.rst
@@ -85,6 +85,17 @@ is already free).
Should be called from a process context (might sleep).
+::
+
+ int hwspin_lock_bust(struct hwspinlock *hwlock, unsigned int id);
+
+After verifying the owner of the hwspinlock, release a previously acquired
+hwspinlock; returns 0 on success, or an appropriate error code on failure
+(e.g. -EOPNOTSUPP if the bust operation is not defined for the specific
+hwspinlock).
+
+Should be called from a process context (might sleep).
+
::
int hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned int timeout);
diff --git a/drivers/hwspinlock/hwspinlock_core.c b/drivers/hwspinlock/hwspinlock_core.c
index 0c0a932c00f3..6505261e6068 100644
--- a/drivers/hwspinlock/hwspinlock_core.c
+++ b/drivers/hwspinlock/hwspinlock_core.c
@@ -305,6 +305,34 @@ void __hwspin_unlock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
}
EXPORT_SYMBOL_GPL(__hwspin_unlock);
+/**
+ * hwspin_lock_bust() - bust a specific hwspinlock
+ * @hwlock: a previously-acquired hwspinlock which we want to bust
+ * @id: identifier of the remote lock holder, if applicable
+ *
+ * This function will bust a hwspinlock that was previously acquired as
+ * long as the current owner of the lock matches the id given by the caller.
+ *
+ * Context: Process context.
+ *
+ * Returns: 0 on success, or -EINVAL if the hwspinlock does not exist, or
+ * the bust operation fails, and -EOPNOTSUPP if the bust operation is not
+ * defined for the hwspinlock.
+ */
+int hwspin_lock_bust(struct hwspinlock *hwlock, unsigned int id)
+{
+ if (WARN_ON(!hwlock))
+ return -EINVAL;
+
+ if (!hwlock->bank->ops->bust) {
+ pr_err("bust operation not defined\n");
+ return -EOPNOTSUPP;
+ }
+
+ return hwlock->bank->ops->bust(hwlock, id);
+}
+EXPORT_SYMBOL_GPL(hwspin_lock_bust);
+
/**
* of_hwspin_lock_simple_xlate - translate hwlock_spec to return a lock id
* @hwlock_spec: hwlock specifier as found in the device tree
diff --git a/drivers/hwspinlock/hwspinlock_internal.h b/drivers/hwspinlock/hwspinlock_internal.h
index 29892767bb7a..f298fc0ee5ad 100644
--- a/drivers/hwspinlock/hwspinlock_internal.h
+++ b/drivers/hwspinlock/hwspinlock_internal.h
@@ -21,6 +21,8 @@ struct hwspinlock_device;
* @trylock: make a single attempt to take the lock. returns 0 on
* failure and true on success. may _not_ sleep.
* @unlock: release the lock. always succeed. may _not_ sleep.
+ * @bust: optional, platform-specific bust handler, called by hwspinlock
+ * core to bust a specific lock.
* @relax: optional, platform-specific relax handler, called by hwspinlock
* core while spinning on a lock, between two successive
* invocations of @trylock. may _not_ sleep.
@@ -28,6 +30,7 @@ struct hwspinlock_device;
struct hwspinlock_ops {
int (*trylock)(struct hwspinlock *lock);
void (*unlock)(struct hwspinlock *lock);
+ int (*bust)(struct hwspinlock *lock, unsigned int id);
void (*relax)(struct hwspinlock *lock);
};
diff --git a/include/linux/hwspinlock.h b/include/linux/hwspinlock.h
index bfe7c1f1ac6d..f0231dbc4777 100644
--- a/include/linux/hwspinlock.h
+++ b/include/linux/hwspinlock.h
@@ -68,6 +68,7 @@ int __hwspin_lock_timeout(struct hwspinlock *, unsigned int, int,
int __hwspin_trylock(struct hwspinlock *, int, unsigned long *);
void __hwspin_unlock(struct hwspinlock *, int, unsigned long *);
int of_hwspin_lock_get_id_byname(struct device_node *np, const char *name);
+int hwspin_lock_bust(struct hwspinlock *hwlock, unsigned int id);
int devm_hwspin_lock_free(struct device *dev, struct hwspinlock *hwlock);
struct hwspinlock *devm_hwspin_lock_request(struct device *dev);
struct hwspinlock *devm_hwspin_lock_request_specific(struct device *dev,
@@ -127,6 +128,11 @@ void __hwspin_unlock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
{
}
+static inline int hwspin_lock_bust(struct hwspinlock *hwlock, unsigned int id)
+{
+ return 0;
+}
+
static inline int of_hwspin_lock_get_id(struct device_node *np, int index)
{
return 0;
--
2.25.1
Add qcom_smem_bust_hwspin_lock_by_host to enable remoteproc to bust the
hwspin_lock owned by smem. In the event the remoteproc crashes
unexpectedly, the remoteproc driver can invoke this API to try and bust
the hwspin_lock and release the lock if still held by the remoteproc
device.
Signed-off-by: Chris Lew <[email protected]>
---
drivers/soc/qcom/smem.c | 26 ++++++++++++++++++++++++++
include/linux/soc/qcom/smem.h | 2 ++
2 files changed, 28 insertions(+)
diff --git a/drivers/soc/qcom/smem.c b/drivers/soc/qcom/smem.c
index 7191fa0c087f..50039e983eba 100644
--- a/drivers/soc/qcom/smem.c
+++ b/drivers/soc/qcom/smem.c
@@ -359,6 +359,32 @@ static struct qcom_smem *__smem;
/* Timeout (ms) for the trylock of remote spinlocks */
#define HWSPINLOCK_TIMEOUT 1000
+/* The qcom hwspinlock id is always plus one from the smem host id */
+#define SMEM_HOST_ID_TO_HWSPINLOCK_ID(__x) ((__x) + 1)
+
+/**
+ * qcom_smem_bust_hwspin_lock_by_host() - bust the smem hwspinlock for a host
+ * @host: remote processor id
+ *
+ * Busts the hwspin_lock for the given smem host id. This helper is intended
+ * for remoteproc drivers that manage remoteprocs with an equivalent smem
+ * driver instance in the remote firmware. Drivers can force a release of the
+ * smem hwspin_lock if the rproc unexpectedly goes into a bad state.
+ *
+ * Context: Process context.
+ *
+ * Returns: 0 on success, otherwise negative errno.
+ */
+int qcom_smem_bust_hwspin_lock_by_host(unsigned int host)
+{
+ /* This function is for remote procs, so ignore SMEM_HOST_APPS */
+ if (host == SMEM_HOST_APPS || host >= SMEM_HOST_COUNT)
+ return -EINVAL;
+
+ return hwspin_lock_bust(__smem->hwlock, SMEM_HOST_ID_TO_HWSPINLOCK_ID(host));
+}
+EXPORT_SYMBOL_GPL(qcom_smem_bust_hwspin_lock_by_host);
+
/**
* qcom_smem_is_available() - Check if SMEM is available
*
diff --git a/include/linux/soc/qcom/smem.h b/include/linux/soc/qcom/smem.h
index a36a3b9d4929..03187bc95851 100644
--- a/include/linux/soc/qcom/smem.h
+++ b/include/linux/soc/qcom/smem.h
@@ -14,4 +14,6 @@ phys_addr_t qcom_smem_virt_to_phys(void *p);
int qcom_smem_get_soc_id(u32 *id);
+int qcom_smem_bust_hwspin_lock_by_host(unsigned int host);
+
#endif
--
2.25.1
From: Richard Maina <[email protected]>
When remoteproc goes down unexpectedly this results in a state where any
acquired hwspinlocks will remain locked possibly resulting in deadlock.
In order to ensure all locks are freed we include a call to
qcom_smem_bust_hwspin_lock_by_host() during remoteproc shutdown.
For qcom_q6v5_pas remoteprocs, each remoteproc has an assigned smem
host_id. Remoteproc can pass this id to smem to try and bust the lock on
remoteproc stop.
This edge case only occurs with q6v5_pas watchdog crashes. The error
fatal case has handling to clear the hwspinlock before the error fatal
interrupt is triggered.
Signed-off-by: Richard Maina <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
Signed-off-by: Chris Lew <[email protected]>
---
drivers/remoteproc/qcom_q6v5_pas.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/remoteproc/qcom_q6v5_pas.c b/drivers/remoteproc/qcom_q6v5_pas.c
index 54d8005d40a3..8458bcfe9e19 100644
--- a/drivers/remoteproc/qcom_q6v5_pas.c
+++ b/drivers/remoteproc/qcom_q6v5_pas.c
@@ -52,6 +52,7 @@ struct adsp_data {
const char *ssr_name;
const char *sysmon_name;
int ssctl_id;
+ unsigned int smem_host_id;
int region_assign_idx;
int region_assign_count;
@@ -81,6 +82,7 @@ struct qcom_adsp {
int lite_pas_id;
unsigned int minidump_id;
int crash_reason_smem;
+ unsigned int smem_host_id;
bool decrypt_shutdown;
const char *info_name;
@@ -399,6 +401,9 @@ static int adsp_stop(struct rproc *rproc)
if (handover)
qcom_pas_handover(&adsp->q6v5);
+ if (adsp->smem_host_id)
+ ret = qcom_smem_bust_hwspin_lock_by_host(adsp->smem_host_id);
+
return ret;
}
@@ -727,6 +732,7 @@ static int adsp_probe(struct platform_device *pdev)
adsp->pas_id = desc->pas_id;
adsp->lite_pas_id = desc->lite_pas_id;
adsp->info_name = desc->sysmon_name;
+ adsp->smem_host_id = desc->smem_host_id;
adsp->decrypt_shutdown = desc->decrypt_shutdown;
adsp->region_assign_idx = desc->region_assign_idx;
adsp->region_assign_count = min_t(int, MAX_ASSIGN_COUNT, desc->region_assign_count);
@@ -1196,6 +1202,7 @@ static const struct adsp_data sm8550_adsp_resource = {
.ssr_name = "lpass",
.sysmon_name = "adsp",
.ssctl_id = 0x14,
+ .smem_host_id = 2,
};
static const struct adsp_data sm8550_cdsp_resource = {
@@ -1216,6 +1223,7 @@ static const struct adsp_data sm8550_cdsp_resource = {
.ssr_name = "cdsp",
.sysmon_name = "cdsp",
.ssctl_id = 0x17,
+ .smem_host_id = 5,
};
static const struct adsp_data sm8550_mpss_resource = {
@@ -1236,6 +1244,7 @@ static const struct adsp_data sm8550_mpss_resource = {
.ssr_name = "mpss",
.sysmon_name = "modem",
.ssctl_id = 0x12,
+ .smem_host_id = 1,
.region_assign_idx = 2,
.region_assign_count = 1,
.region_assign_vmid = QCOM_SCM_VMID_MSS_MSA,
@@ -1275,6 +1284,7 @@ static const struct adsp_data sm8650_cdsp_resource = {
.ssr_name = "cdsp",
.sysmon_name = "cdsp",
.ssctl_id = 0x17,
+ .smem_host_id = 5,
.region_assign_idx = 2,
.region_assign_count = 1,
.region_assign_shared = true,
@@ -1299,6 +1309,7 @@ static const struct adsp_data sm8650_mpss_resource = {
.ssr_name = "mpss",
.sysmon_name = "modem",
.ssctl_id = 0x12,
+ .smem_host_id = 1,
.region_assign_idx = 2,
.region_assign_count = 3,
.region_assign_vmid = QCOM_SCM_VMID_MSS_MSA,
--
2.25.1
On Wed, May 29, 2024 at 11:09:57AM -0700, Chris Lew wrote:
> Add qcom_smem_bust_hwspin_lock_by_host to enable remoteproc to bust the
> hwspin_lock owned by smem. In the event the remoteproc crashes
> unexpectedly, the remoteproc driver can invoke this API to try and bust
> the hwspin_lock and release the lock if still held by the remoteproc
> device.
>
Reviewed-by: Bjorn Andersson <[email protected]>
Regards,
Bjorn
> Signed-off-by: Chris Lew <[email protected]>
> ---
> drivers/soc/qcom/smem.c | 26 ++++++++++++++++++++++++++
> include/linux/soc/qcom/smem.h | 2 ++
> 2 files changed, 28 insertions(+)
>
> diff --git a/drivers/soc/qcom/smem.c b/drivers/soc/qcom/smem.c
> index 7191fa0c087f..50039e983eba 100644
> --- a/drivers/soc/qcom/smem.c
> +++ b/drivers/soc/qcom/smem.c
> @@ -359,6 +359,32 @@ static struct qcom_smem *__smem;
> /* Timeout (ms) for the trylock of remote spinlocks */
> #define HWSPINLOCK_TIMEOUT 1000
>
> +/* The qcom hwspinlock id is always plus one from the smem host id */
> +#define SMEM_HOST_ID_TO_HWSPINLOCK_ID(__x) ((__x) + 1)
> +
> +/**
> + * qcom_smem_bust_hwspin_lock_by_host() - bust the smem hwspinlock for a host
> + * @host: remote processor id
> + *
> + * Busts the hwspin_lock for the given smem host id. This helper is intended
> + * for remoteproc drivers that manage remoteprocs with an equivalent smem
> + * driver instance in the remote firmware. Drivers can force a release of the
> + * smem hwspin_lock if the rproc unexpectedly goes into a bad state.
> + *
> + * Context: Process context.
> + *
> + * Returns: 0 on success, otherwise negative errno.
> + */
> +int qcom_smem_bust_hwspin_lock_by_host(unsigned int host)
> +{
> + /* This function is for remote procs, so ignore SMEM_HOST_APPS */
> + if (host == SMEM_HOST_APPS || host >= SMEM_HOST_COUNT)
> + return -EINVAL;
> +
> + return hwspin_lock_bust(__smem->hwlock, SMEM_HOST_ID_TO_HWSPINLOCK_ID(host));
> +}
> +EXPORT_SYMBOL_GPL(qcom_smem_bust_hwspin_lock_by_host);
> +
> /**
> * qcom_smem_is_available() - Check if SMEM is available
> *
> diff --git a/include/linux/soc/qcom/smem.h b/include/linux/soc/qcom/smem.h
> index a36a3b9d4929..03187bc95851 100644
> --- a/include/linux/soc/qcom/smem.h
> +++ b/include/linux/soc/qcom/smem.h
> @@ -14,4 +14,6 @@ phys_addr_t qcom_smem_virt_to_phys(void *p);
>
> int qcom_smem_get_soc_id(u32 *id);
>
> +int qcom_smem_bust_hwspin_lock_by_host(unsigned int host);
> +
> #endif
>
> --
> 2.25.1
>
On Wed, 29 May 2024 11:09:54 -0700, Chris Lew wrote:
> hwspinlocks can be acquired by many devices on the SoC. If any of these
> devices go into a bad state before the device releases the hwspinlock,
> then that hwspinlock may end up in an unusable state.
>
> In the case of smem, each remoteproc takes a hwspinlock before trying to
> allocate an smem item. If the remoteproc were to suddenly crash without
> releasing this, it would be impossible for other remoteprocs to allocate
> any smem items.
>
> [...]
Applied, thanks!
[3/4] soc: qcom: smem: Add qcom_smem_bust_hwspin_lock_by_host()
commit: 2e3f0d693875db698891ffe89a18121bda5b95b8
[4/4] remoteproc: qcom_q6v5_pas: Add hwspinlock bust on stop
commit: 568b13b65078e2b557ccf47674a354cecd1db641
Best regards,
--
Bjorn Andersson <[email protected]>