2018-09-10 06:27:50

by Vivek Gautam

[permalink] [raw]
Subject: [PATCH v2 0/4] Qcom smmu-500 TLB invalidation errata for sdm845

Qcom's implementation of arm,mmu-500 on sdm845 has a functional/performance
errata [1] because of which the TCU cache look ups are stalled during
invalidation cycle. This is mitigated by serializing all the invalidation
requests coming to the smmu.

This patch series addresses this errata by adding new tlb_ops for
qcom,sdm845-smmu-500 [2]. These ops take context bank locks for all the
tlb_ops that queue and sync the TLB invalidation requests.

Besides adding locks, there's a way to expadite these TLB invalidations
for display and camera devices by turning off the 'wait-for-safe' logic
in hardware that holds the tlb invalidations until a safe level.
This 'wait-for-safe' logic is controlled by toggling a chicken bit
through a secure register. This secure register is accessed by making an
explicit SCM call into the EL3 firmware.
There are two ways of handling this logic -
* Firmware, such as tz present on sdm845-mtp devices has a handler to do
all the register access and bit set/clear. So is the handling in
downstream arm-smmu driver [3].
* Other firmwares can have handlers to just read/write this secure
register. In such cases the kernel make io_read/writel scm calls to
modify the register.
This patch series adds APIs in qcom-scm driver to handle both of these
cases.

Lastly, since these TLB invalidations can happen in atomic contexts
there's a need to add atomic versions of qcom_scm_io_readl/writel() and
qcom_scm_call() APIs. The traditional scm calls take mutex and we therefore
can't use these calls in atomic contexts.

This patch series is adapted version of how the errata is handled in
downstream [1].

Changes since v1:
* Addressed Will and Robin's comments:
- Dropped the patch[4] that forked out __arm_smmu_tlb_inv_range_nosync(),
and __arm_smmu_tlb_sync().
- Cleaned up the errata patch further to use downstream polling mechanism
for tlb sync.
* No change in SCM call patches - patches 1 to 3.

[1] https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4842
[2] https://lore.kernel.org/patchwork/patch/974114/
[3] https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4864
[4] https://patchwork.kernel.org/patch/10565349/

Vivek Gautam (4):
firmware: qcom_scm-64: Add atomic version of qcom_scm_call
firmware/qcom_scm: Add atomic version of io read/write APIs
firmware/qcom_scm: Add scm call to handle smmu errata
iommu/arm-smmu: Add support to handle Qcom's TLBI serialization errata

drivers/firmware/qcom_scm-32.c | 17 ++++
drivers/firmware/qcom_scm-64.c | 181 +++++++++++++++++++++++++++++++----------
drivers/firmware/qcom_scm.c | 18 ++++
drivers/firmware/qcom_scm.h | 9 ++
drivers/iommu/arm-smmu-regs.h | 2 +
drivers/iommu/arm-smmu.c | 133 +++++++++++++++++++++++++++++-
include/linux/qcom_scm.h | 6 ++
7 files changed, 320 insertions(+), 46 deletions(-)

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



2018-09-10 06:27:47

by Vivek Gautam

[permalink] [raw]
Subject: [PATCH v2 1/4] firmware: qcom_scm-64: Add atomic version of qcom_scm_call

There are scnenarios where drivers are required to make a
scm call in atomic context, such as in one of the qcom's
arm-smmu-500 errata [1].

[1] ("https://source.codeaurora.org/quic/la/kernel/msm-4.9/
tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4842")

Signed-off-by: Vivek Gautam <[email protected]>
---
drivers/firmware/qcom_scm-64.c | 136 ++++++++++++++++++++++++++++-------------
1 file changed, 92 insertions(+), 44 deletions(-)

diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index 688525dd4aee..3a8c867cdf51 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -70,32 +70,71 @@ static DEFINE_MUTEX(qcom_scm_lock);
#define FIRST_EXT_ARG_IDX 3
#define N_REGISTER_ARGS (MAX_QCOM_SCM_ARGS - N_EXT_QCOM_SCM_ARGS + 1)

-/**
- * qcom_scm_call() - Invoke a syscall in the secure world
- * @dev: device
- * @svc_id: service identifier
- * @cmd_id: command identifier
- * @desc: Descriptor structure containing arguments and return values
- *
- * Sends a command to the SCM and waits for the command to finish processing.
- * This should *only* be called in pre-emptible context.
-*/
-static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
- const struct qcom_scm_desc *desc,
- struct arm_smccc_res *res)
+static void __qcom_scm_call_do(const struct qcom_scm_desc *desc,
+ struct arm_smccc_res *res, u32 fn_id,
+ u64 x5, u32 type)
+{
+ u64 cmd;
+ struct arm_smccc_quirk quirk = {.id = ARM_SMCCC_QUIRK_QCOM_A6};
+
+ cmd = ARM_SMCCC_CALL_VAL(type, qcom_smccc_convention,
+ ARM_SMCCC_OWNER_SIP, fn_id);
+
+ quirk.state.a6 = 0;
+
+ do {
+ arm_smccc_smc_quirk(cmd, desc->arginfo, desc->args[0],
+ desc->args[1], desc->args[2], x5,
+ quirk.state.a6, 0, res, &quirk);
+
+ if (res->a0 == QCOM_SCM_INTERRUPTED)
+ cmd = res->a0;
+
+ } while (res->a0 == QCOM_SCM_INTERRUPTED);
+}
+
+static void qcom_scm_call_do(const struct qcom_scm_desc *desc,
+ struct arm_smccc_res *res, u32 fn_id,
+ u64 x5, bool atomic)
+{
+ int retry_count = 0;
+
+ if (!atomic) {
+ do {
+ mutex_lock(&qcom_scm_lock);
+
+ __qcom_scm_call_do(desc, res, fn_id, x5,
+ ARM_SMCCC_STD_CALL);
+
+ mutex_unlock(&qcom_scm_lock);
+
+ if (res->a0 == QCOM_SCM_V2_EBUSY) {
+ if (retry_count++ > QCOM_SCM_EBUSY_MAX_RETRY)
+ break;
+ msleep(QCOM_SCM_EBUSY_WAIT_MS);
+ }
+ } while (res->a0 == QCOM_SCM_V2_EBUSY);
+ } else {
+ __qcom_scm_call_do(desc, res, fn_id, x5, ARM_SMCCC_FAST_CALL);
+ }
+}
+
+static int ___qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
+ const struct qcom_scm_desc *desc,
+ struct arm_smccc_res *res, bool atomic)
{
int arglen = desc->arginfo & 0xf;
- int retry_count = 0, i;
+ int i;
u32 fn_id = QCOM_SCM_FNID(svc_id, cmd_id);
- u64 cmd, x5 = desc->args[FIRST_EXT_ARG_IDX];
+ u64 x5 = desc->args[FIRST_EXT_ARG_IDX];
dma_addr_t args_phys = 0;
void *args_virt = NULL;
size_t alloc_len;
- struct arm_smccc_quirk quirk = {.id = ARM_SMCCC_QUIRK_QCOM_A6};
+ gfp_t flag = atomic ? GFP_ATOMIC : GFP_KERNEL;

if (unlikely(arglen > N_REGISTER_ARGS)) {
alloc_len = N_EXT_QCOM_SCM_ARGS * sizeof(u64);
- args_virt = kzalloc(PAGE_ALIGN(alloc_len), GFP_KERNEL);
+ args_virt = kzalloc(PAGE_ALIGN(alloc_len), flag);

if (!args_virt)
return -ENOMEM;
@@ -125,33 +164,7 @@ static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
x5 = args_phys;
}

- do {
- mutex_lock(&qcom_scm_lock);
-
- cmd = ARM_SMCCC_CALL_VAL(ARM_SMCCC_STD_CALL,
- qcom_smccc_convention,
- ARM_SMCCC_OWNER_SIP, fn_id);
-
- quirk.state.a6 = 0;
-
- do {
- arm_smccc_smc_quirk(cmd, desc->arginfo, desc->args[0],
- desc->args[1], desc->args[2], x5,
- quirk.state.a6, 0, res, &quirk);
-
- if (res->a0 == QCOM_SCM_INTERRUPTED)
- cmd = res->a0;
-
- } while (res->a0 == QCOM_SCM_INTERRUPTED);
-
- mutex_unlock(&qcom_scm_lock);
-
- if (res->a0 == QCOM_SCM_V2_EBUSY) {
- if (retry_count++ > QCOM_SCM_EBUSY_MAX_RETRY)
- break;
- msleep(QCOM_SCM_EBUSY_WAIT_MS);
- }
- } while (res->a0 == QCOM_SCM_V2_EBUSY);
+ qcom_scm_call_do(desc, res, fn_id, x5, atomic);

if (args_virt) {
dma_unmap_single(dev, args_phys, alloc_len, DMA_TO_DEVICE);
@@ -164,6 +177,41 @@ static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
return 0;
}

+/**
+ * qcom_scm_call() - Invoke a syscall in the secure world
+ * @dev: device
+ * @svc_id: service identifier
+ * @cmd_id: command identifier
+ * @desc: Descriptor structure containing arguments and return values
+ *
+ * Sends a command to the SCM and waits for the command to finish processing.
+ * This should *only* be called in pre-emptible context.
+ */
+static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
+ const struct qcom_scm_desc *desc,
+ struct arm_smccc_res *res)
+{
+ return ___qcom_scm_call(dev, svc_id, cmd_id, desc, res, false);
+}
+
+/**
+ * qcom_scm_call_atomic() - atomic variation of qcom_scm_call()
+ * @dev: device
+ * @svc_id: service identifier
+ * @cmd_id: command identifier
+ * @desc: Descriptor structure containing arguments and return values
+ * @res: Structure containing results from SMC/HVC call
+ *
+ * Sends a command to the SCM and waits for the command to finish processing.
+ * This should be called in atomic context only.
+ */
+static int qcom_scm_call_atomic(struct device *dev, u32 svc_id, u32 cmd_id,
+ const struct qcom_scm_desc *desc,
+ struct arm_smccc_res *res)
+{
+ return ___qcom_scm_call(dev, svc_id, cmd_id, desc, res, true);
+}
+
/**
* qcom_scm_set_cold_boot_addr() - Set the cold boot address for cpus
* @entry: Entry point function for the cpus
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


2018-09-10 06:28:13

by Vivek Gautam

[permalink] [raw]
Subject: [PATCH v2 3/4] firmware/qcom_scm: Add scm call to handle smmu errata

Qcom's smmu-500 needs to toggle wait-for-safe sequence to
handle TLB invalidation sync's.
Few firmwares allow doing that through SCM interface.
Add API to toggle wait for safe from firmware through a
SCM call.

Signed-off-by: Vivek Gautam <[email protected]>
---
drivers/firmware/qcom_scm-32.c | 5 +++++
drivers/firmware/qcom_scm-64.c | 13 +++++++++++++
drivers/firmware/qcom_scm.c | 6 ++++++
drivers/firmware/qcom_scm.h | 5 +++++
include/linux/qcom_scm.h | 2 ++
5 files changed, 31 insertions(+)

diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
index 7293e5efad69..2d301ad053f8 100644
--- a/drivers/firmware/qcom_scm-32.c
+++ b/drivers/firmware/qcom_scm-32.c
@@ -639,3 +639,8 @@ int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
{
return -ENODEV;
}
+
+int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev, bool enable)
+{
+ return -ENODEV;
+}
diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index 6bf55403f6e3..f13bcabc5d78 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -590,3 +590,16 @@ int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
return qcom_scm_call_atomic(dev, QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
&desc, &res);
}
+
+int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev, bool en)
+{
+ struct qcom_scm_desc desc = {0};
+ struct arm_smccc_res res;
+
+ desc.args[0] = QCOM_SCM_CONFIG_ERRATA1_CLIENT_ALL;
+ desc.args[1] = en;
+ desc.arginfo = QCOM_SCM_ARGS(2);
+
+ return qcom_scm_call_atomic(dev, QCOM_SCM_SVC_SMMU_PROGRAM,
+ QCOM_SCM_CONFIG_ERRATA1, &desc, &res);
+}
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index 36da0000b37f..5f15cc2e9f69 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -353,6 +353,12 @@ int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare)
}
EXPORT_SYMBOL(qcom_scm_iommu_secure_ptbl_init);

+int qcom_scm_qsmmu500_wait_safe_toggle(bool en)
+{
+ return __qcom_scm_qsmmu500_wait_safe_toggle(__scm->dev, en);
+}
+EXPORT_SYMBOL(qcom_scm_qsmmu500_wait_safe_toggle);
+
int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val)
{
return __qcom_scm_io_readl(__scm->dev, addr, val);
diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
index bb176107f51e..89a822c23e33 100644
--- a/drivers/firmware/qcom_scm.h
+++ b/drivers/firmware/qcom_scm.h
@@ -103,10 +103,15 @@ extern int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id,
u32 spare);
#define QCOM_SCM_IOMMU_SECURE_PTBL_SIZE 3
#define QCOM_SCM_IOMMU_SECURE_PTBL_INIT 4
+#define QCOM_SCM_SVC_SMMU_PROGRAM 0x15
+#define QCOM_SCM_CONFIG_ERRATA1 0x3
+#define QCOM_SCM_CONFIG_ERRATA1_CLIENT_ALL 0x2
extern int __qcom_scm_iommu_secure_ptbl_size(struct device *dev, u32 spare,
size_t *size);
extern int __qcom_scm_iommu_secure_ptbl_init(struct device *dev, u64 addr,
u32 size, u32 spare);
+extern int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev,
+ bool enable);
#define QCOM_MEM_PROT_ASSIGN_ID 0x16
extern int __qcom_scm_assign_mem(struct device *dev,
phys_addr_t mem_region, size_t mem_sz,
diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
index 6a5d0c98b328..46e6b1692998 100644
--- a/include/linux/qcom_scm.h
+++ b/include/linux/qcom_scm.h
@@ -62,6 +62,7 @@ extern int qcom_scm_set_remote_state(u32 state, u32 id);
extern int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare);
extern int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size);
extern int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare);
+extern int qcom_scm_qsmmu500_wait_safe_toggle(bool en);
extern int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val);
extern int qcom_scm_io_writel(phys_addr_t addr, unsigned int val);
extern int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val);
@@ -100,6 +101,7 @@ qcom_scm_set_remote_state(u32 state,u32 id) { return -ENODEV; }
static inline int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare) { return -ENODEV; }
static inline int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size) { return -ENODEV; }
static inline int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare) { return -ENODEV; }
+static inline int qcom_scm_qsmmu500_wait_safe_toggle(bool en) { return -ENODEV; }
static inline int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val) { return -ENODEV; }
static inline int qcom_scm_io_writel(phys_addr_t addr, unsigned int val) { return -ENODEV; }
static inline int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val) { return -ENODEV; }
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


2018-09-10 06:28:15

by Vivek Gautam

[permalink] [raw]
Subject: [PATCH v2 2/4] firmware/qcom_scm: Add atomic version of io read/write APIs

Add atomic versions of qcom_scm_io_readl/writel to enable
reading/writing secure registers from atomic context.

Signed-off-by: Vivek Gautam <[email protected]>
---
drivers/firmware/qcom_scm-32.c | 12 ++++++++++++
drivers/firmware/qcom_scm-64.c | 32 ++++++++++++++++++++++++++++++++
drivers/firmware/qcom_scm.c | 12 ++++++++++++
drivers/firmware/qcom_scm.h | 4 ++++
include/linux/qcom_scm.h | 4 ++++
5 files changed, 64 insertions(+)

diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
index 4e24e591ae74..7293e5efad69 100644
--- a/drivers/firmware/qcom_scm-32.c
+++ b/drivers/firmware/qcom_scm-32.c
@@ -627,3 +627,15 @@ int __qcom_scm_io_writel(struct device *dev, phys_addr_t addr, unsigned int val)
return qcom_scm_call_atomic2(QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
addr, val);
}
+
+int __qcom_scm_io_readl_atomic(struct device *dev, phys_addr_t addr,
+ unsigned int *val)
+{
+ return -ENODEV;
+}
+
+int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
+ unsigned int val)
+{
+ return -ENODEV;
+}
diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index 3a8c867cdf51..6bf55403f6e3 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -558,3 +558,35 @@ int __qcom_scm_io_writel(struct device *dev, phys_addr_t addr, unsigned int val)
return qcom_scm_call(dev, QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
&desc, &res);
}
+
+int __qcom_scm_io_readl_atomic(struct device *dev, phys_addr_t addr,
+ unsigned int *val)
+{
+ struct qcom_scm_desc desc = {0};
+ struct arm_smccc_res res;
+ int ret;
+
+ desc.args[0] = addr;
+ desc.arginfo = QCOM_SCM_ARGS(1);
+
+ ret = qcom_scm_call_atomic(dev, QCOM_SCM_SVC_IO, QCOM_SCM_IO_READ,
+ &desc, &res);
+ if (ret >= 0)
+ *val = res.a1;
+
+ return ret < 0 ? ret : 0;
+}
+
+int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
+ unsigned int val)
+{
+ struct qcom_scm_desc desc = {0};
+ struct arm_smccc_res res;
+
+ desc.args[0] = addr;
+ desc.args[1] = val;
+ desc.arginfo = QCOM_SCM_ARGS(2);
+
+ return qcom_scm_call_atomic(dev, QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
+ &desc, &res);
+}
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index e778af766fae..36da0000b37f 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -365,6 +365,18 @@ int qcom_scm_io_writel(phys_addr_t addr, unsigned int val)
}
EXPORT_SYMBOL(qcom_scm_io_writel);

+int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val)
+{
+ return __qcom_scm_io_readl_atomic(__scm->dev, addr, val);
+}
+EXPORT_SYMBOL(qcom_scm_io_readl_atomic);
+
+int qcom_scm_io_writel_atomic(phys_addr_t addr, unsigned int val)
+{
+ return __qcom_scm_io_writel_atomic(__scm->dev, addr, val);
+}
+EXPORT_SYMBOL(qcom_scm_io_writel_atomic);
+
static void qcom_scm_set_download_mode(bool enable)
{
bool avail;
diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
index dcd7f7917fc7..bb176107f51e 100644
--- a/drivers/firmware/qcom_scm.h
+++ b/drivers/firmware/qcom_scm.h
@@ -37,6 +37,10 @@ extern void __qcom_scm_cpu_power_down(u32 flags);
#define QCOM_SCM_IO_WRITE 0x2
extern int __qcom_scm_io_readl(struct device *dev, phys_addr_t addr, unsigned int *val);
extern int __qcom_scm_io_writel(struct device *dev, phys_addr_t addr, unsigned int val);
+extern int __qcom_scm_io_readl_atomic(struct device *dev, phys_addr_t addr,
+ unsigned int *val);
+extern int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
+ unsigned int val);

#define QCOM_SCM_SVC_INFO 0x6
#define QCOM_IS_CALL_AVAIL_CMD 0x1
diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
index 5d65521260b3..6a5d0c98b328 100644
--- a/include/linux/qcom_scm.h
+++ b/include/linux/qcom_scm.h
@@ -64,6 +64,8 @@ extern int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size);
extern int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare);
extern int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val);
extern int qcom_scm_io_writel(phys_addr_t addr, unsigned int val);
+extern int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val);
+extern int qcom_scm_io_writel_atomic(phys_addr_t addr, unsigned int val);
#else
static inline
int qcom_scm_set_cold_boot_addr(void *entry, const cpumask_t *cpus)
@@ -100,5 +102,7 @@ static inline int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size) { ret
static inline int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare) { return -ENODEV; }
static inline int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val) { return -ENODEV; }
static inline int qcom_scm_io_writel(phys_addr_t addr, unsigned int val) { return -ENODEV; }
+static inline int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val) { return -ENODEV; }
+static inline int qcom_scm_io_writel_atomic(phys_addr_t addr, unsigned int val) { return -ENODEV; }
#endif
#endif
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


2018-09-10 06:28:53

by Vivek Gautam

[permalink] [raw]
Subject: [PATCH v2 4/4] iommu/arm-smmu: Add support to handle Qcom's TLBI serialization errata

Qcom's implementation of arm,mmu-500 require to serialize all
TLB invalidations for context banks.
In case the TLB invalidation requests don't go through the first
time, there's a way to disable/enable the wait for safe logic.
Disabling this logic expadites the TLBIs.

Different bootloaders with their access control policies allow this
register access differntly. With one, we should be able to directly
make qcom-scm call to do io read/write, while with other we should
use the specific SCM command to send request to do the complete
register configuration.
A separate device tree flag for arm-smmu will allow to identify
which firmware configuration of the two mentioned above we use.

Signed-off-by: Vivek Gautam <[email protected]>
---
drivers/iommu/arm-smmu-regs.h | 2 +
drivers/iommu/arm-smmu.c | 133 +++++++++++++++++++++++++++++++++++++++++-
2 files changed, 133 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
index a1226e4ab5f8..71662cae9806 100644
--- a/drivers/iommu/arm-smmu-regs.h
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -177,6 +177,8 @@ enum arm_smmu_s2cr_privcfg {
#define ARM_SMMU_CB_ATS1PR 0x800
#define ARM_SMMU_CB_ATSR 0x8f0

+#define ARM_SMMU_GID_QCOM_CUSTOM_CFG 0x300
+
#define SCTLR_S1_ASIDPNE (1 << 12)
#define SCTLR_CFCFG (1 << 7)
#define SCTLR_CFIE (1 << 6)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 411e5ac57c64..de9c4a5bf686 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -49,6 +49,7 @@
#include <linux/pci.h>
#include <linux/platform_device.h>
#include <linux/pm_runtime.h>
+#include <linux/qcom_scm.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

@@ -181,7 +182,8 @@ struct arm_smmu_device {
#define ARM_SMMU_FEAT_EXIDS (1 << 12)
u32 features;

-#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
+#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
+#define ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA (1 << 1)
u32 options;
enum arm_smmu_arch_version version;
enum arm_smmu_implementation model;
@@ -266,6 +268,7 @@ static bool using_legacy_binding, using_generic_binding;

static struct arm_smmu_option_prop arm_smmu_options[] = {
{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
+ { ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA, "qcom,smmu-500-fw-impl-errata" },
{ 0, NULL},
};

@@ -531,12 +534,134 @@ static void arm_smmu_tlb_inv_vmid_nosync(unsigned long iova, size_t size,
writel_relaxed(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMID);
}

+#define CUSTOM_CFG_MDP_SAFE_ENABLE BIT(15)
+#define CUSTOM_CFG_IFE1_SAFE_ENABLE BIT(14)
+#define CUSTOM_CFG_IFE0_SAFE_ENABLE BIT(13)
+
+static int __qsmmu500_wait_safe_toggle(struct arm_smmu_device *smmu, int en)
+{
+ int ret;
+ u32 val, gid_phys_base;
+ phys_addr_t reg;
+ struct vm_struct *vm;
+
+ /* We want physical address of SMMU, so the vm_area */
+ vm = find_vm_area(smmu->base);
+
+ /*
+ * GID (implementation defined address space) is located at
+ * SMMU_BASE + (2 × PAGESIZE).
+ */
+ gid_phys_base = vm->phys_addr + (2 << (smmu)->pgshift);
+ reg = gid_phys_base + ARM_SMMU_GID_QCOM_CUSTOM_CFG;
+
+ ret = qcom_scm_io_readl_atomic(reg, &val);
+ if (ret)
+ return ret;
+
+ if (en)
+ val |= CUSTOM_CFG_MDP_SAFE_ENABLE |
+ CUSTOM_CFG_IFE0_SAFE_ENABLE |
+ CUSTOM_CFG_IFE1_SAFE_ENABLE;
+ else
+ val &= ~(CUSTOM_CFG_MDP_SAFE_ENABLE |
+ CUSTOM_CFG_IFE0_SAFE_ENABLE |
+ CUSTOM_CFG_IFE1_SAFE_ENABLE);
+
+ ret = qcom_scm_io_writel_atomic(reg, val);
+
+ return ret;
+}
+
+static int qsmmu500_wait_safe_toggle(struct arm_smmu_device *smmu,
+ int en, bool is_fw_impl)
+{
+ if (is_fw_impl)
+ return qcom_scm_qsmmu500_wait_safe_toggle(en);
+ else
+ return __qsmmu500_wait_safe_toggle(smmu, en);
+}
+
+static void qcom_errata_tlb_sync(struct arm_smmu_domain *smmu_domain)
+{
+ struct arm_smmu_device *smmu = smmu_domain->smmu;
+ void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx);
+ bool is_fw_impl;
+ u32 val;
+
+ writel_relaxed(0, base + ARM_SMMU_CB_TLBSYNC);
+
+ if (!readl_poll_timeout_atomic(base + ARM_SMMU_CB_TLBSTATUS, val,
+ !(val & sTLBGSTATUS_GSACTIVE), 0, 100))
+ return;
+
+ is_fw_impl = smmu->options & ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA ?
+ true : false;
+
+ /* SCM call here to disable the wait-for-safe logic. */
+ if (WARN(qsmmu500_wait_safe_toggle(smmu, false, is_fw_impl),
+ "Failed to disable wait-safe logic, bad hw state\n"))
+ return;
+
+ if (!readl_poll_timeout_atomic(base + ARM_SMMU_CB_TLBSTATUS, val,
+ !(val & sTLBGSTATUS_GSACTIVE), 0, 10000))
+ return;
+
+ /* SCM call here to re-enable the wait-for-safe logic. */
+ WARN(qsmmu500_wait_safe_toggle(smmu, true, is_fw_impl),
+ "Failed to re-enable wait-safe logic, bad hw state\n");
+
+ dev_err_ratelimited(smmu->dev,
+ "TLB sync timed out -- SMMU in bad state\n");
+}
+
+static void qcom_errata_tlb_sync_context(void *cookie)
+{
+ struct arm_smmu_domain *smmu_domain = cookie;
+ unsigned long flags;
+
+ spin_lock_irqsave(&smmu_domain->cb_lock, flags);
+ qcom_errata_tlb_sync(smmu_domain);
+ spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
+}
+
+static void qcom_errata_tlb_inv_context_s1(void *cookie)
+{
+ struct arm_smmu_domain *smmu_domain = cookie;
+ struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+ void __iomem *base = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx);
+ unsigned long flags;
+
+ spin_lock_irqsave(&smmu_domain->cb_lock, flags);
+ writel_relaxed(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID);
+ qcom_errata_tlb_sync(cookie);
+ spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
+}
+
+static void qcom_errata_tlb_inv_range_nosync(unsigned long iova, size_t size,
+ size_t granule, bool leaf,
+ void *cookie)
+{
+ struct arm_smmu_domain *smmu_domain = cookie;
+ unsigned long flags;
+
+ spin_lock_irqsave(&smmu_domain->cb_lock, flags);
+ arm_smmu_tlb_inv_range_nosync(iova, size, granule, leaf, cookie);
+ spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
+}
+
static const struct iommu_gather_ops arm_smmu_s1_tlb_ops = {
.tlb_flush_all = arm_smmu_tlb_inv_context_s1,
.tlb_add_flush = arm_smmu_tlb_inv_range_nosync,
.tlb_sync = arm_smmu_tlb_sync_context,
};

+static const struct iommu_gather_ops qcom_errata_s1_tlb_ops = {
+ .tlb_flush_all = qcom_errata_tlb_inv_context_s1,
+ .tlb_add_flush = qcom_errata_tlb_inv_range_nosync,
+ .tlb_sync = qcom_errata_tlb_sync_context,
+};
+
static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v2 = {
.tlb_flush_all = arm_smmu_tlb_inv_context_s2,
.tlb_add_flush = arm_smmu_tlb_inv_range_nosync,
@@ -824,7 +949,11 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
ias = min(ias, 32UL);
oas = min(oas, 32UL);
}
- smmu_domain->tlb_ops = &arm_smmu_s1_tlb_ops;
+ if (of_device_is_compatible(smmu->dev->of_node,
+ "qcom,sdm845-smmu-500"))
+ smmu_domain->tlb_ops = &qcom_errata_s1_tlb_ops;
+ else
+ smmu_domain->tlb_ops = &arm_smmu_s1_tlb_ops;
break;
case ARM_SMMU_DOMAIN_NESTED:
/*
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


2018-09-10 10:40:41

by Vivek Gautam

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Qcom smmu-500 TLB invalidation errata for sdm845

+linux-arm-msm


On 09/10/2018 11:55 AM, Vivek Gautam wrote:
> Qcom's implementation of arm,mmu-500 on sdm845 has a functional/performance
> errata [1] because of which the TCU cache look ups are stalled during
> invalidation cycle. This is mitigated by serializing all the invalidation
> requests coming to the smmu.
>
> This patch series addresses this errata by adding new tlb_ops for
> qcom,sdm845-smmu-500 [2]. These ops take context bank locks for all the
> tlb_ops that queue and sync the TLB invalidation requests.
>
> Besides adding locks, there's a way to expadite these TLB invalidations
> for display and camera devices by turning off the 'wait-for-safe' logic
> in hardware that holds the tlb invalidations until a safe level.
> This 'wait-for-safe' logic is controlled by toggling a chicken bit
> through a secure register. This secure register is accessed by making an
> explicit SCM call into the EL3 firmware.
> There are two ways of handling this logic -
> * Firmware, such as tz present on sdm845-mtp devices has a handler to do
> all the register access and bit set/clear. So is the handling in
> downstream arm-smmu driver [3].
> * Other firmwares can have handlers to just read/write this secure
> register. In such cases the kernel make io_read/writel scm calls to
> modify the register.
> This patch series adds APIs in qcom-scm driver to handle both of these
> cases.
>
> Lastly, since these TLB invalidations can happen in atomic contexts
> there's a need to add atomic versions of qcom_scm_io_readl/writel() and
> qcom_scm_call() APIs. The traditional scm calls take mutex and we therefore
> can't use these calls in atomic contexts.
>
> This patch series is adapted version of how the errata is handled in
> downstream [1].
>
> Changes since v1:
> * Addressed Will and Robin's comments:
> - Dropped the patch[4] that forked out __arm_smmu_tlb_inv_range_nosync(),
> and __arm_smmu_tlb_sync().
> - Cleaned up the errata patch further to use downstream polling mechanism
> for tlb sync.
> * No change in SCM call patches - patches 1 to 3.
>
> [1] https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4842
> [2] https://lore.kernel.org/patchwork/patch/974114/
> [3] https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4864
> [4] https://patchwork.kernel.org/patch/10565349/
>
> Vivek Gautam (4):
> firmware: qcom_scm-64: Add atomic version of qcom_scm_call
> firmware/qcom_scm: Add atomic version of io read/write APIs
> firmware/qcom_scm: Add scm call to handle smmu errata
> iommu/arm-smmu: Add support to handle Qcom's TLBI serialization errata
>
> drivers/firmware/qcom_scm-32.c | 17 ++++
> drivers/firmware/qcom_scm-64.c | 181 +++++++++++++++++++++++++++++++----------
> drivers/firmware/qcom_scm.c | 18 ++++
> drivers/firmware/qcom_scm.h | 9 ++
> drivers/iommu/arm-smmu-regs.h | 2 +
> drivers/iommu/arm-smmu.c | 133 +++++++++++++++++++++++++++++-
> include/linux/qcom_scm.h | 6 ++
> 7 files changed, 320 insertions(+), 46 deletions(-)
>

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


2018-09-25 06:00:05

by Vivek Gautam

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Qcom smmu-500 TLB invalidation errata for sdm845

Hi Bjorn,


On Mon, Sep 10, 2018 at 4:08 PM Vivek Gautam
<[email protected]> wrote:
>
> +linux-arm-msm
>
>
> On 09/10/2018 11:55 AM, Vivek Gautam wrote:
> > Qcom's implementation of arm,mmu-500 on sdm845 has a functional/performance
> > errata [1] because of which the TCU cache look ups are stalled during
> > invalidation cycle. This is mitigated by serializing all the invalidation
> > requests coming to the smmu.
> >
> > This patch series addresses this errata by adding new tlb_ops for
> > qcom,sdm845-smmu-500 [2]. These ops take context bank locks for all the
> > tlb_ops that queue and sync the TLB invalidation requests.
> >
> > Besides adding locks, there's a way to expadite these TLB invalidations
> > for display and camera devices by turning off the 'wait-for-safe' logic
> > in hardware that holds the tlb invalidations until a safe level.
> > This 'wait-for-safe' logic is controlled by toggling a chicken bit
> > through a secure register. This secure register is accessed by making an
> > explicit SCM call into the EL3 firmware.
> > There are two ways of handling this logic -
> > * Firmware, such as tz present on sdm845-mtp devices has a handler to do
> > all the register access and bit set/clear. So is the handling in
> > downstream arm-smmu driver [3].
> > * Other firmwares can have handlers to just read/write this secure
> > register. In such cases the kernel make io_read/writel scm calls to
> > modify the register.
> > This patch series adds APIs in qcom-scm driver to handle both of these
> > cases.
> >
> > Lastly, since these TLB invalidations can happen in atomic contexts
> > there's a need to add atomic versions of qcom_scm_io_readl/writel() and
> > qcom_scm_call() APIs. The traditional scm calls take mutex and we therefore
> > can't use these calls in atomic contexts.
> >
> > This patch series is adapted version of how the errata is handled in
> > downstream [1].

Gentle ping. Please let me know if you have comments on the SCM pieces
in this series.

Thanks & Regards
Vivek

> >
> > Changes since v1:
> > * Addressed Will and Robin's comments:
> > - Dropped the patch[4] that forked out __arm_smmu_tlb_inv_range_nosync(),
> > and __arm_smmu_tlb_sync().
> > - Cleaned up the errata patch further to use downstream polling mechanism
> > for tlb sync.
> > * No change in SCM call patches - patches 1 to 3.
> >
> > [1] https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4842
> > [2] https://lore.kernel.org/patchwork/patch/974114/
> > [3] https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4864
> > [4] https://patchwork.kernel.org/patch/10565349/
> >
> > Vivek Gautam (4):
> > firmware: qcom_scm-64: Add atomic version of qcom_scm_call
> > firmware/qcom_scm: Add atomic version of io read/write APIs
> > firmware/qcom_scm: Add scm call to handle smmu errata
> > iommu/arm-smmu: Add support to handle Qcom's TLBI serialization errata
> >
> > drivers/firmware/qcom_scm-32.c | 17 ++++
> > drivers/firmware/qcom_scm-64.c | 181 +++++++++++++++++++++++++++++++----------
> > drivers/firmware/qcom_scm.c | 18 ++++
> > drivers/firmware/qcom_scm.h | 9 ++
> > drivers/iommu/arm-smmu-regs.h | 2 +
> > drivers/iommu/arm-smmu.c | 133 +++++++++++++++++++++++++++++-
> > include/linux/qcom_scm.h | 6 ++
> > 7 files changed, 320 insertions(+), 46 deletions(-)
> >
>
> --
> The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> a Linux Foundation Collaborative Project
>
> _______________________________________________
> iommu mailing list
> [email protected]
> https://lists.linuxfoundation.org/mailman/listinfo/iommu



--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

2018-09-25 12:11:20

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Qcom smmu-500 TLB invalidation errata for sdm845

On Mon, Sep 10, 2018 at 11:55:47AM +0530, Vivek Gautam wrote:
> Vivek Gautam (4):
> firmware: qcom_scm-64: Add atomic version of qcom_scm_call
> firmware/qcom_scm: Add atomic version of io read/write APIs
> firmware/qcom_scm: Add scm call to handle smmu errata
> iommu/arm-smmu: Add support to handle Qcom's TLBI serialization errata
>
> drivers/firmware/qcom_scm-32.c | 17 ++++
> drivers/firmware/qcom_scm-64.c | 181 +++++++++++++++++++++++++++++++----------
> drivers/firmware/qcom_scm.c | 18 ++++
> drivers/firmware/qcom_scm.h | 9 ++
> drivers/iommu/arm-smmu-regs.h | 2 +
> drivers/iommu/arm-smmu.c | 133 +++++++++++++++++++++++++++++-
> include/linux/qcom_scm.h | 6 ++
> 7 files changed, 320 insertions(+), 46 deletions(-)

Should this go through the iommu-tree? In that case I need Acked-by's
for the firmware code changes.

Regards,

Joerg

2018-09-25 12:32:38

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] iommu/arm-smmu: Add support to handle Qcom's TLBI serialization errata

On 10/09/18 07:25, Vivek Gautam wrote:
> Qcom's implementation of arm,mmu-500 require to serialize all
> TLB invalidations for context banks.

What does "serailize all TLB invalidations" actually mean, because it's
not entirely clear from context, and furthermore this patch appears to
behave subtly differently to the downstream code so I'm really
struggling to figure out whether it's actually doing what it's intended
to do.

> In case the TLB invalidation requests don't go through the first
> time, there's a way to disable/enable the wait for safe logic.
> Disabling this logic expadites the TLBIs.
>
> Different bootloaders with their access control policies allow this
> register access differntly. With one, we should be able to directly
> make qcom-scm call to do io read/write, while with other we should
> use the specific SCM command to send request to do the complete
> register configuration.
> A separate device tree flag for arm-smmu will allow to identify
> which firmware configuration of the two mentioned above we use.
>
> Signed-off-by: Vivek Gautam <[email protected]>
> ---
> drivers/iommu/arm-smmu-regs.h | 2 +
> drivers/iommu/arm-smmu.c | 133 +++++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 133 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
> index a1226e4ab5f8..71662cae9806 100644
> --- a/drivers/iommu/arm-smmu-regs.h
> +++ b/drivers/iommu/arm-smmu-regs.h
> @@ -177,6 +177,8 @@ enum arm_smmu_s2cr_privcfg {
> #define ARM_SMMU_CB_ATS1PR 0x800
> #define ARM_SMMU_CB_ATSR 0x8f0
>
> +#define ARM_SMMU_GID_QCOM_CUSTOM_CFG 0x300
> +
> #define SCTLR_S1_ASIDPNE (1 << 12)
> #define SCTLR_CFCFG (1 << 7)
> #define SCTLR_CFIE (1 << 6)
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 411e5ac57c64..de9c4a5bf686 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -49,6 +49,7 @@
> #include <linux/pci.h>
> #include <linux/platform_device.h>
> #include <linux/pm_runtime.h>
> +#include <linux/qcom_scm.h>
> #include <linux/slab.h>
> #include <linux/spinlock.h>
>
> @@ -181,7 +182,8 @@ struct arm_smmu_device {
> #define ARM_SMMU_FEAT_EXIDS (1 << 12)
> u32 features;
>
> -#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
> +#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
> +#define ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA (1 << 1)
> u32 options;
> enum arm_smmu_arch_version version;
> enum arm_smmu_implementation model;
> @@ -266,6 +268,7 @@ static bool using_legacy_binding, using_generic_binding;
>
> static struct arm_smmu_option_prop arm_smmu_options[] = {
> { ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
> + { ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA, "qcom,smmu-500-fw-impl-errata" },
> { 0, NULL},
> };
>
> @@ -531,12 +534,134 @@ static void arm_smmu_tlb_inv_vmid_nosync(unsigned long iova, size_t size,
> writel_relaxed(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMID);
> }
>
> +#define CUSTOM_CFG_MDP_SAFE_ENABLE BIT(15)
> +#define CUSTOM_CFG_IFE1_SAFE_ENABLE BIT(14)
> +#define CUSTOM_CFG_IFE0_SAFE_ENABLE BIT(13)
> +
> +static int __qsmmu500_wait_safe_toggle(struct arm_smmu_device *smmu, int en)
> +{
> + int ret;
> + u32 val, gid_phys_base;
> + phys_addr_t reg;
> + struct vm_struct *vm;
> +
> + /* We want physical address of SMMU, so the vm_area */
> + vm = find_vm_area(smmu->base);
> +
> + /*
> + * GID (implementation defined address space) is located at
> + * SMMU_BASE + (2 × PAGESIZE).
> + */
> + gid_phys_base = vm->phys_addr + (2 << (smmu)->pgshift);
> + reg = gid_phys_base + ARM_SMMU_GID_QCOM_CUSTOM_CFG;
> +
> + ret = qcom_scm_io_readl_atomic(reg, &val);
> + if (ret)
> + return ret;
> +
> + if (en)
> + val |= CUSTOM_CFG_MDP_SAFE_ENABLE |
> + CUSTOM_CFG_IFE0_SAFE_ENABLE |
> + CUSTOM_CFG_IFE1_SAFE_ENABLE;
> + else
> + val &= ~(CUSTOM_CFG_MDP_SAFE_ENABLE |
> + CUSTOM_CFG_IFE0_SAFE_ENABLE |
> + CUSTOM_CFG_IFE1_SAFE_ENABLE);
> +
> + ret = qcom_scm_io_writel_atomic(reg, val);
> +
> + return ret;
> +}
> +
> +static int qsmmu500_wait_safe_toggle(struct arm_smmu_device *smmu,
> + int en, bool is_fw_impl)
> +{
> + if (is_fw_impl)
> + return qcom_scm_qsmmu500_wait_safe_toggle(en);
> + else
> + return __qsmmu500_wait_safe_toggle(smmu, en);
> +}
> +
> +static void qcom_errata_tlb_sync(struct arm_smmu_domain *smmu_domain)
> +{
> + struct arm_smmu_device *smmu = smmu_domain->smmu;
> + void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx);
> + bool is_fw_impl;
> + u32 val;
> +
> + writel_relaxed(0, base + ARM_SMMU_CB_TLBSYNC);
> +
> + if (!readl_poll_timeout_atomic(base + ARM_SMMU_CB_TLBSTATUS, val,
> + !(val & sTLBGSTATUS_GSACTIVE), 0, 100))
> + return;
> +
> + is_fw_impl = smmu->options & ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA ?
> + true : false;
> +
> + /* SCM call here to disable the wait-for-safe logic. */

I take it this is a global state, so it can't just be turned off for the
relevant contexts and left that way?

> + if (WARN(qsmmu500_wait_safe_toggle(smmu, false, is_fw_impl),
> + "Failed to disable wait-safe logic, bad hw state\n"))
> + return;
> +
> + if (!readl_poll_timeout_atomic(base + ARM_SMMU_CB_TLBSTATUS, val,
> + !(val & sTLBGSTATUS_GSACTIVE), 0, 10000))
> + return;
> +
> + /* SCM call here to re-enable the wait-for-safe logic. */
> + WARN(qsmmu500_wait_safe_toggle(smmu, true, is_fw_impl),
> + "Failed to re-enable wait-safe logic, bad hw state\n");
> +
> + dev_err_ratelimited(smmu->dev,
> + "TLB sync timed out -- SMMU in bad state\n");
> +}
> +
> +static void qcom_errata_tlb_sync_context(void *cookie)
> +{
> + struct arm_smmu_domain *smmu_domain = cookie;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&smmu_domain->cb_lock, flags);
> + qcom_errata_tlb_sync(smmu_domain);
> + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
> +}
> +
> +static void qcom_errata_tlb_inv_context_s1(void *cookie)
> +{
> + struct arm_smmu_domain *smmu_domain = cookie;
> + struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> + void __iomem *base = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx);
> + unsigned long flags;
> +
> + spin_lock_irqsave(&smmu_domain->cb_lock, flags);
> + writel_relaxed(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID);
> + qcom_errata_tlb_sync(cookie);
> + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
> +}
> +
> +static void qcom_errata_tlb_inv_range_nosync(unsigned long iova, size_t size,
> + size_t granule, bool leaf,
> + void *cookie)
> +{
> + struct arm_smmu_domain *smmu_domain = cookie;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&smmu_domain->cb_lock, flags);
> + arm_smmu_tlb_inv_range_nosync(iova, size, granule, leaf, cookie);
> + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);

I don't get what this locking is trying to achieve - the only thing it
protects is one or more writes to TLBIVA{L}, which *must* surely be
"serialised" by the interconnect anyway?

The downstream code doesn't appear to implement .tlb_add_flush at all,
so something smells off.

Robin.

> +}
> +
> static const struct iommu_gather_ops arm_smmu_s1_tlb_ops = {
> .tlb_flush_all = arm_smmu_tlb_inv_context_s1,
> .tlb_add_flush = arm_smmu_tlb_inv_range_nosync,
> .tlb_sync = arm_smmu_tlb_sync_context,
> };
>
> +static const struct iommu_gather_ops qcom_errata_s1_tlb_ops = {
> + .tlb_flush_all = qcom_errata_tlb_inv_context_s1,
> + .tlb_add_flush = qcom_errata_tlb_inv_range_nosync,
> + .tlb_sync = qcom_errata_tlb_sync_context,
> +};
> +
> static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v2 = {
> .tlb_flush_all = arm_smmu_tlb_inv_context_s2,
> .tlb_add_flush = arm_smmu_tlb_inv_range_nosync,
> @@ -824,7 +949,11 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
> ias = min(ias, 32UL);
> oas = min(oas, 32UL);
> }
> - smmu_domain->tlb_ops = &arm_smmu_s1_tlb_ops;
> + if (of_device_is_compatible(smmu->dev->of_node,
> + "qcom,sdm845-smmu-500"))
> + smmu_domain->tlb_ops = &qcom_errata_s1_tlb_ops;
> + else
> + smmu_domain->tlb_ops = &arm_smmu_s1_tlb_ops;
> break;
> case ARM_SMMU_DOMAIN_NESTED:
> /*
>

2018-09-25 16:41:07

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Qcom smmu-500 TLB invalidation errata for sdm845

On Tue, Sep 25, 2018 at 02:09:34PM +0200, Joerg Roedel wrote:
> On Mon, Sep 10, 2018 at 11:55:47AM +0530, Vivek Gautam wrote:
> > Vivek Gautam (4):
> > firmware: qcom_scm-64: Add atomic version of qcom_scm_call
> > firmware/qcom_scm: Add atomic version of io read/write APIs
> > firmware/qcom_scm: Add scm call to handle smmu errata
> > iommu/arm-smmu: Add support to handle Qcom's TLBI serialization errata
> >
> > drivers/firmware/qcom_scm-32.c | 17 ++++
> > drivers/firmware/qcom_scm-64.c | 181 +++++++++++++++++++++++++++++++----------
> > drivers/firmware/qcom_scm.c | 18 ++++
> > drivers/firmware/qcom_scm.h | 9 ++
> > drivers/iommu/arm-smmu-regs.h | 2 +
> > drivers/iommu/arm-smmu.c | 133 +++++++++++++++++++++++++++++-
> > include/linux/qcom_scm.h | 6 ++
> > 7 files changed, 320 insertions(+), 46 deletions(-)
>
> Should this go through the iommu-tree? In that case I need Acked-by's
> for the firmware code changes.

Yup, and I'd also like Robin's ack on the arm-smmu*.c changes (I see he has
some comments on the code as it stands).

Will

2018-09-26 06:26:08

by Vivek Gautam

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Qcom smmu-500 TLB invalidation errata for sdm845

On Tue, Sep 25, 2018 at 10:09 PM Will Deacon <[email protected]> wrote:
>
> On Tue, Sep 25, 2018 at 02:09:34PM +0200, Joerg Roedel wrote:
> > On Mon, Sep 10, 2018 at 11:55:47AM +0530, Vivek Gautam wrote:
> > > Vivek Gautam (4):
> > > firmware: qcom_scm-64: Add atomic version of qcom_scm_call
> > > firmware/qcom_scm: Add atomic version of io read/write APIs
> > > firmware/qcom_scm: Add scm call to handle smmu errata
> > > iommu/arm-smmu: Add support to handle Qcom's TLBI serialization errata
> > >
> > > drivers/firmware/qcom_scm-32.c | 17 ++++
> > > drivers/firmware/qcom_scm-64.c | 181 +++++++++++++++++++++++++++++++----------
> > > drivers/firmware/qcom_scm.c | 18 ++++
> > > drivers/firmware/qcom_scm.h | 9 ++
> > > drivers/iommu/arm-smmu-regs.h | 2 +
> > > drivers/iommu/arm-smmu.c | 133 +++++++++++++++++++++++++++++-
> > > include/linux/qcom_scm.h | 6 ++
> > > 7 files changed, 320 insertions(+), 46 deletions(-)
> >
> > Should this go through the iommu-tree? In that case I need Acked-by's
> > for the firmware code changes.
>
> Yup, and I'd also like Robin's ack on the arm-smmu*.c changes (I see he has
> some comments on the code as it stands).

Thanks Joerg, and Will for taking a look at it. I will answer and
address Robin's comments
for smmu part.

Best regards
Vivek

>
> Will
> _______________________________________________
> iommu mailing list
> [email protected]
> https://lists.linuxfoundation.org/mailman/listinfo/iommu



--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

2018-10-23 07:46:55

by Vivek Gautam

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] iommu/arm-smmu: Add support to handle Qcom's TLBI serialization errata

Hi Robin,

On Tue, Sep 25, 2018 at 6:01 PM Robin Murphy <[email protected]> wrote:
>
> On 10/09/18 07:25, Vivek Gautam wrote:
> > Qcom's implementation of arm,mmu-500 require to serialize all
> > TLB invalidations for context banks.
>
> What does "serailize all TLB invalidations" actually mean, because it's
> not entirely clear from context, and furthermore this patch appears to
> behave subtly differently to the downstream code so I'm really
> struggling to figure out whether it's actually doing what it's intended
> to do.

Adding Pratik Patel from downstream team.

Thanks for taking a look at this.
We want to space out the TLB invalidation and then the workaround
to toggle wait-safe logic would let the safe checks in HW work and
only allow invalidation to occur when device is expected to not
run into underruns.

> > In case the TLB invalidation requests don't go through the first
> > time, there's a way to disable/enable the wait for safe logic.
> > Disabling this logic expadites the TLBIs.
> >
> > Different bootloaders with their access control policies allow this
> > register access differntly. With one, we should be able to directly
> > make qcom-scm call to do io read/write, while with other we should
> > use the specific SCM command to send request to do the complete
> > register configuration.
> > A separate device tree flag for arm-smmu will allow to identify
> > which firmware configuration of the two mentioned above we use.
> >
> > Signed-off-by: Vivek Gautam <[email protected]>
> > ---
> > drivers/iommu/arm-smmu-regs.h | 2 +
> > drivers/iommu/arm-smmu.c | 133 +++++++++++++++++++++++++++++++++++++++++-
> > 2 files changed, 133 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
> > index a1226e4ab5f8..71662cae9806 100644
> > --- a/drivers/iommu/arm-smmu-regs.h
> > +++ b/drivers/iommu/arm-smmu-regs.h
> > @@ -177,6 +177,8 @@ enum arm_smmu_s2cr_privcfg {
> > #define ARM_SMMU_CB_ATS1PR 0x800
> > #define ARM_SMMU_CB_ATSR 0x8f0
> >
> > +#define ARM_SMMU_GID_QCOM_CUSTOM_CFG 0x300
> > +
> > #define SCTLR_S1_ASIDPNE (1 << 12)
> > #define SCTLR_CFCFG (1 << 7)
> > #define SCTLR_CFIE (1 << 6)
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index 411e5ac57c64..de9c4a5bf686 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -49,6 +49,7 @@
> > #include <linux/pci.h>
> > #include <linux/platform_device.h>
> > #include <linux/pm_runtime.h>
> > +#include <linux/qcom_scm.h>
> > #include <linux/slab.h>
> > #include <linux/spinlock.h>
> >
> > @@ -181,7 +182,8 @@ struct arm_smmu_device {
> > #define ARM_SMMU_FEAT_EXIDS (1 << 12)
> > u32 features;
> >
> > -#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
> > +#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
> > +#define ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA (1 << 1)
> > u32 options;
> > enum arm_smmu_arch_version version;
> > enum arm_smmu_implementation model;
> > @@ -266,6 +268,7 @@ static bool using_legacy_binding, using_generic_binding;
> >
> > static struct arm_smmu_option_prop arm_smmu_options[] = {
> > { ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
> > + { ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA, "qcom,smmu-500-fw-impl-errata" },
> > { 0, NULL},
> > };
> >
> > @@ -531,12 +534,134 @@ static void arm_smmu_tlb_inv_vmid_nosync(unsigned long iova, size_t size,
> > writel_relaxed(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMID);
> > }
> >
> > +#define CUSTOM_CFG_MDP_SAFE_ENABLE BIT(15)
> > +#define CUSTOM_CFG_IFE1_SAFE_ENABLE BIT(14)
> > +#define CUSTOM_CFG_IFE0_SAFE_ENABLE BIT(13)
> > +
> > +static int __qsmmu500_wait_safe_toggle(struct arm_smmu_device *smmu, int en)
> > +{
> > + int ret;
> > + u32 val, gid_phys_base;
> > + phys_addr_t reg;
> > + struct vm_struct *vm;
> > +
> > + /* We want physical address of SMMU, so the vm_area */
> > + vm = find_vm_area(smmu->base);
> > +
> > + /*
> > + * GID (implementation defined address space) is located at
> > + * SMMU_BASE + (2 × PAGESIZE).
> > + */
> > + gid_phys_base = vm->phys_addr + (2 << (smmu)->pgshift);
> > + reg = gid_phys_base + ARM_SMMU_GID_QCOM_CUSTOM_CFG;
> > +
> > + ret = qcom_scm_io_readl_atomic(reg, &val);
> > + if (ret)
> > + return ret;
> > +
> > + if (en)
> > + val |= CUSTOM_CFG_MDP_SAFE_ENABLE |
> > + CUSTOM_CFG_IFE0_SAFE_ENABLE |
> > + CUSTOM_CFG_IFE1_SAFE_ENABLE;
> > + else
> > + val &= ~(CUSTOM_CFG_MDP_SAFE_ENABLE |
> > + CUSTOM_CFG_IFE0_SAFE_ENABLE |
> > + CUSTOM_CFG_IFE1_SAFE_ENABLE);
> > +
> > + ret = qcom_scm_io_writel_atomic(reg, val);
> > +
> > + return ret;
> > +}
> > +
> > +static int qsmmu500_wait_safe_toggle(struct arm_smmu_device *smmu,
> > + int en, bool is_fw_impl)
> > +{
> > + if (is_fw_impl)
> > + return qcom_scm_qsmmu500_wait_safe_toggle(en);
> > + else
> > + return __qsmmu500_wait_safe_toggle(smmu, en);
> > +}
> > +
> > +static void qcom_errata_tlb_sync(struct arm_smmu_domain *smmu_domain)
> > +{
> > + struct arm_smmu_device *smmu = smmu_domain->smmu;
> > + void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx);
> > + bool is_fw_impl;
> > + u32 val;
> > +
> > + writel_relaxed(0, base + ARM_SMMU_CB_TLBSYNC);
> > +
> > + if (!readl_poll_timeout_atomic(base + ARM_SMMU_CB_TLBSTATUS, val,
> > + !(val & sTLBGSTATUS_GSACTIVE), 0, 100))
> > + return;
> > +
> > + is_fw_impl = smmu->options & ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA ?
> > + true : false;
> > +
> > + /* SCM call here to disable the wait-for-safe logic. */
>
> I take it this is a global state, so it can't just be turned off for the
> relevant contexts and left that way?

Yes this is a global state disable/enable. But can we disable it for
the required context altogether forever, I will have to find it out.

>
> > + if (WARN(qsmmu500_wait_safe_toggle(smmu, false, is_fw_impl),
> > + "Failed to disable wait-safe logic, bad hw state\n"))
> > + return;
> > +
> > + if (!readl_poll_timeout_atomic(base + ARM_SMMU_CB_TLBSTATUS, val,
> > + !(val & sTLBGSTATUS_GSACTIVE), 0, 10000))
> > + return;
> > +
> > + /* SCM call here to re-enable the wait-for-safe logic. */
> > + WARN(qsmmu500_wait_safe_toggle(smmu, true, is_fw_impl),
> > + "Failed to re-enable wait-safe logic, bad hw state\n");
> > +
> > + dev_err_ratelimited(smmu->dev,
> > + "TLB sync timed out -- SMMU in bad state\n");
> > +}
> > +
> > +static void qcom_errata_tlb_sync_context(void *cookie)
> > +{
> > + struct arm_smmu_domain *smmu_domain = cookie;
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&smmu_domain->cb_lock, flags);
> > + qcom_errata_tlb_sync(smmu_domain);
> > + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
> > +}
> > +
> > +static void qcom_errata_tlb_inv_context_s1(void *cookie)
> > +{
> > + struct arm_smmu_domain *smmu_domain = cookie;
> > + struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> > + void __iomem *base = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx);
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&smmu_domain->cb_lock, flags);
> > + writel_relaxed(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID);
> > + qcom_errata_tlb_sync(cookie);
> > + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
> > +}
> > +
> > +static void qcom_errata_tlb_inv_range_nosync(unsigned long iova, size_t size,
> > + size_t granule, bool leaf,
> > + void *cookie)
> > +{
> > + struct arm_smmu_domain *smmu_domain = cookie;
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&smmu_domain->cb_lock, flags);
> > + arm_smmu_tlb_inv_range_nosync(iova, size, granule, leaf, cookie);
> > + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
>
> I don't get what this locking is trying to achieve - the only thing it
> protects is one or more writes to TLBIVA{L}, which *must* surely be
> "serialised" by the interconnect anyway?
>

Okay, right. This callback is just a write to TLBIVA{L} registers, so this
should be good without any locking.
A subsequent .tlb_sync has the locking anyways.

> The downstream code doesn't appear to implement .tlb_add_flush at all,
> so something smells off.

Yes, the downstream pg-table driver never uses tlb_add_flush and
tlb_sync to do TLBIs.
Rather it relies on tlb_flush_all to flush the entire TLB after the
unmap is finished.
Moreover, in downstream a global remote spinlock is used to protect
the invalidation
requests owing to other entities besides kernel handling the TLB operations.

Assuming we don't want to do tlb_flush_all, we still want to serialize
the invalidation
requests, and then toggle the wait-safe logic too.
We would also not want the remote spinlock as we are invalidating based on ASIDs
in the kernel.

Regards
Vivek

>
> Robin.
>
> > +}
> > +
> > static const struct iommu_gather_ops arm_smmu_s1_tlb_ops = {
> > .tlb_flush_all = arm_smmu_tlb_inv_context_s1,
> > .tlb_add_flush = arm_smmu_tlb_inv_range_nosync,
> > .tlb_sync = arm_smmu_tlb_sync_context,
> > };
> >
> > +static const struct iommu_gather_ops qcom_errata_s1_tlb_ops = {
> > + .tlb_flush_all = qcom_errata_tlb_inv_context_s1,
> > + .tlb_add_flush = qcom_errata_tlb_inv_range_nosync,
> > + .tlb_sync = qcom_errata_tlb_sync_context,
> > +};
> > +
> > static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v2 = {
> > .tlb_flush_all = arm_smmu_tlb_inv_context_s2,
> > .tlb_add_flush = arm_smmu_tlb_inv_range_nosync,
> > @@ -824,7 +949,11 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
> > ias = min(ias, 32UL);
> > oas = min(oas, 32UL);
> > }
> > - smmu_domain->tlb_ops = &arm_smmu_s1_tlb_ops;
> > + if (of_device_is_compatible(smmu->dev->of_node,
> > + "qcom,sdm845-smmu-500"))
> > + smmu_domain->tlb_ops = &qcom_errata_s1_tlb_ops;
> > + else
> > + smmu_domain->tlb_ops = &arm_smmu_s1_tlb_ops;
> > break;
> > case ARM_SMMU_DOMAIN_NESTED:
> > /*
> >
> _______________________________________________
> iommu mailing list
> [email protected]
> https://lists.linuxfoundation.org/mailman/listinfo/iommu



--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

2019-03-25 21:11:10

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] firmware/qcom_scm: Add atomic version of io read/write APIs

On Sun 09 Sep 23:25 PDT 2018, Vivek Gautam wrote:

> Add atomic versions of qcom_scm_io_readl/writel to enable
> reading/writing secure registers from atomic context.
>
> Signed-off-by: Vivek Gautam <[email protected]>

Reviewed-by: Bjorn Andersson <[email protected]>

Regards,
Bjorn

> ---
> drivers/firmware/qcom_scm-32.c | 12 ++++++++++++
> drivers/firmware/qcom_scm-64.c | 32 ++++++++++++++++++++++++++++++++
> drivers/firmware/qcom_scm.c | 12 ++++++++++++
> drivers/firmware/qcom_scm.h | 4 ++++
> include/linux/qcom_scm.h | 4 ++++
> 5 files changed, 64 insertions(+)
>
> diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
> index 4e24e591ae74..7293e5efad69 100644
> --- a/drivers/firmware/qcom_scm-32.c
> +++ b/drivers/firmware/qcom_scm-32.c
> @@ -627,3 +627,15 @@ int __qcom_scm_io_writel(struct device *dev, phys_addr_t addr, unsigned int val)
> return qcom_scm_call_atomic2(QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
> addr, val);
> }
> +
> +int __qcom_scm_io_readl_atomic(struct device *dev, phys_addr_t addr,
> + unsigned int *val)
> +{
> + return -ENODEV;
> +}
> +
> +int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
> + unsigned int val)
> +{
> + return -ENODEV;
> +}
> diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
> index 3a8c867cdf51..6bf55403f6e3 100644
> --- a/drivers/firmware/qcom_scm-64.c
> +++ b/drivers/firmware/qcom_scm-64.c
> @@ -558,3 +558,35 @@ int __qcom_scm_io_writel(struct device *dev, phys_addr_t addr, unsigned int val)
> return qcom_scm_call(dev, QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
> &desc, &res);
> }
> +
> +int __qcom_scm_io_readl_atomic(struct device *dev, phys_addr_t addr,
> + unsigned int *val)
> +{
> + struct qcom_scm_desc desc = {0};
> + struct arm_smccc_res res;
> + int ret;
> +
> + desc.args[0] = addr;
> + desc.arginfo = QCOM_SCM_ARGS(1);
> +
> + ret = qcom_scm_call_atomic(dev, QCOM_SCM_SVC_IO, QCOM_SCM_IO_READ,
> + &desc, &res);
> + if (ret >= 0)
> + *val = res.a1;
> +
> + return ret < 0 ? ret : 0;
> +}
> +
> +int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
> + unsigned int val)
> +{
> + struct qcom_scm_desc desc = {0};
> + struct arm_smccc_res res;
> +
> + desc.args[0] = addr;
> + desc.args[1] = val;
> + desc.arginfo = QCOM_SCM_ARGS(2);
> +
> + return qcom_scm_call_atomic(dev, QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
> + &desc, &res);
> +}
> diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
> index e778af766fae..36da0000b37f 100644
> --- a/drivers/firmware/qcom_scm.c
> +++ b/drivers/firmware/qcom_scm.c
> @@ -365,6 +365,18 @@ int qcom_scm_io_writel(phys_addr_t addr, unsigned int val)
> }
> EXPORT_SYMBOL(qcom_scm_io_writel);
>
> +int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val)
> +{
> + return __qcom_scm_io_readl_atomic(__scm->dev, addr, val);
> +}
> +EXPORT_SYMBOL(qcom_scm_io_readl_atomic);
> +
> +int qcom_scm_io_writel_atomic(phys_addr_t addr, unsigned int val)
> +{
> + return __qcom_scm_io_writel_atomic(__scm->dev, addr, val);
> +}
> +EXPORT_SYMBOL(qcom_scm_io_writel_atomic);
> +
> static void qcom_scm_set_download_mode(bool enable)
> {
> bool avail;
> diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
> index dcd7f7917fc7..bb176107f51e 100644
> --- a/drivers/firmware/qcom_scm.h
> +++ b/drivers/firmware/qcom_scm.h
> @@ -37,6 +37,10 @@ extern void __qcom_scm_cpu_power_down(u32 flags);
> #define QCOM_SCM_IO_WRITE 0x2
> extern int __qcom_scm_io_readl(struct device *dev, phys_addr_t addr, unsigned int *val);
> extern int __qcom_scm_io_writel(struct device *dev, phys_addr_t addr, unsigned int val);
> +extern int __qcom_scm_io_readl_atomic(struct device *dev, phys_addr_t addr,
> + unsigned int *val);
> +extern int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
> + unsigned int val);
>
> #define QCOM_SCM_SVC_INFO 0x6
> #define QCOM_IS_CALL_AVAIL_CMD 0x1
> diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
> index 5d65521260b3..6a5d0c98b328 100644
> --- a/include/linux/qcom_scm.h
> +++ b/include/linux/qcom_scm.h
> @@ -64,6 +64,8 @@ extern int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size);
> extern int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare);
> extern int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val);
> extern int qcom_scm_io_writel(phys_addr_t addr, unsigned int val);
> +extern int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val);
> +extern int qcom_scm_io_writel_atomic(phys_addr_t addr, unsigned int val);
> #else
> static inline
> int qcom_scm_set_cold_boot_addr(void *entry, const cpumask_t *cpus)
> @@ -100,5 +102,7 @@ static inline int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size) { ret
> static inline int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare) { return -ENODEV; }
> static inline int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val) { return -ENODEV; }
> static inline int qcom_scm_io_writel(phys_addr_t addr, unsigned int val) { return -ENODEV; }
> +static inline int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val) { return -ENODEV; }
> +static inline int qcom_scm_io_writel_atomic(phys_addr_t addr, unsigned int val) { return -ENODEV; }
> #endif
> #endif
> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> of Code Aurora Forum, hosted by The Linux Foundation
>

2019-03-25 21:11:47

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] firmware/qcom_scm: Add scm call to handle smmu errata

On Sun 09 Sep 23:25 PDT 2018, Vivek Gautam wrote:

> Qcom's smmu-500 needs to toggle wait-for-safe sequence to
> handle TLB invalidation sync's.
> Few firmwares allow doing that through SCM interface.
> Add API to toggle wait for safe from firmware through a
> SCM call.
>
> Signed-off-by: Vivek Gautam <[email protected]>

Reviewed-by: Bjorn Andersson <[email protected]>

Regards,
Bjorn

> ---
> drivers/firmware/qcom_scm-32.c | 5 +++++
> drivers/firmware/qcom_scm-64.c | 13 +++++++++++++
> drivers/firmware/qcom_scm.c | 6 ++++++
> drivers/firmware/qcom_scm.h | 5 +++++
> include/linux/qcom_scm.h | 2 ++
> 5 files changed, 31 insertions(+)
>
> diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
> index 7293e5efad69..2d301ad053f8 100644
> --- a/drivers/firmware/qcom_scm-32.c
> +++ b/drivers/firmware/qcom_scm-32.c
> @@ -639,3 +639,8 @@ int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
> {
> return -ENODEV;
> }
> +
> +int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev, bool enable)
> +{
> + return -ENODEV;
> +}
> diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
> index 6bf55403f6e3..f13bcabc5d78 100644
> --- a/drivers/firmware/qcom_scm-64.c
> +++ b/drivers/firmware/qcom_scm-64.c
> @@ -590,3 +590,16 @@ int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
> return qcom_scm_call_atomic(dev, QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
> &desc, &res);
> }
> +
> +int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev, bool en)
> +{
> + struct qcom_scm_desc desc = {0};
> + struct arm_smccc_res res;
> +
> + desc.args[0] = QCOM_SCM_CONFIG_ERRATA1_CLIENT_ALL;
> + desc.args[1] = en;
> + desc.arginfo = QCOM_SCM_ARGS(2);
> +
> + return qcom_scm_call_atomic(dev, QCOM_SCM_SVC_SMMU_PROGRAM,
> + QCOM_SCM_CONFIG_ERRATA1, &desc, &res);
> +}
> diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
> index 36da0000b37f..5f15cc2e9f69 100644
> --- a/drivers/firmware/qcom_scm.c
> +++ b/drivers/firmware/qcom_scm.c
> @@ -353,6 +353,12 @@ int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare)
> }
> EXPORT_SYMBOL(qcom_scm_iommu_secure_ptbl_init);
>
> +int qcom_scm_qsmmu500_wait_safe_toggle(bool en)
> +{
> + return __qcom_scm_qsmmu500_wait_safe_toggle(__scm->dev, en);
> +}
> +EXPORT_SYMBOL(qcom_scm_qsmmu500_wait_safe_toggle);
> +
> int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val)
> {
> return __qcom_scm_io_readl(__scm->dev, addr, val);
> diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
> index bb176107f51e..89a822c23e33 100644
> --- a/drivers/firmware/qcom_scm.h
> +++ b/drivers/firmware/qcom_scm.h
> @@ -103,10 +103,15 @@ extern int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id,
> u32 spare);
> #define QCOM_SCM_IOMMU_SECURE_PTBL_SIZE 3
> #define QCOM_SCM_IOMMU_SECURE_PTBL_INIT 4
> +#define QCOM_SCM_SVC_SMMU_PROGRAM 0x15
> +#define QCOM_SCM_CONFIG_ERRATA1 0x3
> +#define QCOM_SCM_CONFIG_ERRATA1_CLIENT_ALL 0x2
> extern int __qcom_scm_iommu_secure_ptbl_size(struct device *dev, u32 spare,
> size_t *size);
> extern int __qcom_scm_iommu_secure_ptbl_init(struct device *dev, u64 addr,
> u32 size, u32 spare);
> +extern int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev,
> + bool enable);
> #define QCOM_MEM_PROT_ASSIGN_ID 0x16
> extern int __qcom_scm_assign_mem(struct device *dev,
> phys_addr_t mem_region, size_t mem_sz,
> diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
> index 6a5d0c98b328..46e6b1692998 100644
> --- a/include/linux/qcom_scm.h
> +++ b/include/linux/qcom_scm.h
> @@ -62,6 +62,7 @@ extern int qcom_scm_set_remote_state(u32 state, u32 id);
> extern int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare);
> extern int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size);
> extern int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare);
> +extern int qcom_scm_qsmmu500_wait_safe_toggle(bool en);
> extern int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val);
> extern int qcom_scm_io_writel(phys_addr_t addr, unsigned int val);
> extern int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val);
> @@ -100,6 +101,7 @@ qcom_scm_set_remote_state(u32 state,u32 id) { return -ENODEV; }
> static inline int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare) { return -ENODEV; }
> static inline int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size) { return -ENODEV; }
> static inline int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare) { return -ENODEV; }
> +static inline int qcom_scm_qsmmu500_wait_safe_toggle(bool en) { return -ENODEV; }
> static inline int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val) { return -ENODEV; }
> static inline int qcom_scm_io_writel(phys_addr_t addr, unsigned int val) { return -ENODEV; }
> static inline int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val) { return -ENODEV; }
> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> of Code Aurora Forum, hosted by The Linux Foundation
>

2019-03-25 21:11:58

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] firmware: qcom_scm-64: Add atomic version of qcom_scm_call

On Sun 09 Sep 23:25 PDT 2018, Vivek Gautam wrote:

> There are scnenarios where drivers are required to make a
> scm call in atomic context, such as in one of the qcom's
> arm-smmu-500 errata [1].
>
> [1] ("https://source.codeaurora.org/quic/la/kernel/msm-4.9/
> tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4842")
>
> Signed-off-by: Vivek Gautam <[email protected]>

Reviewed-by: Bjorn Andersson <[email protected]>

Regards,
Bjorn

> ---
> drivers/firmware/qcom_scm-64.c | 136 ++++++++++++++++++++++++++++-------------
> 1 file changed, 92 insertions(+), 44 deletions(-)
>
> diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
> index 688525dd4aee..3a8c867cdf51 100644
> --- a/drivers/firmware/qcom_scm-64.c
> +++ b/drivers/firmware/qcom_scm-64.c
> @@ -70,32 +70,71 @@ static DEFINE_MUTEX(qcom_scm_lock);
> #define FIRST_EXT_ARG_IDX 3
> #define N_REGISTER_ARGS (MAX_QCOM_SCM_ARGS - N_EXT_QCOM_SCM_ARGS + 1)
>
> -/**
> - * qcom_scm_call() - Invoke a syscall in the secure world
> - * @dev: device
> - * @svc_id: service identifier
> - * @cmd_id: command identifier
> - * @desc: Descriptor structure containing arguments and return values
> - *
> - * Sends a command to the SCM and waits for the command to finish processing.
> - * This should *only* be called in pre-emptible context.
> -*/
> -static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
> - const struct qcom_scm_desc *desc,
> - struct arm_smccc_res *res)
> +static void __qcom_scm_call_do(const struct qcom_scm_desc *desc,
> + struct arm_smccc_res *res, u32 fn_id,
> + u64 x5, u32 type)
> +{
> + u64 cmd;
> + struct arm_smccc_quirk quirk = {.id = ARM_SMCCC_QUIRK_QCOM_A6};
> +
> + cmd = ARM_SMCCC_CALL_VAL(type, qcom_smccc_convention,
> + ARM_SMCCC_OWNER_SIP, fn_id);
> +
> + quirk.state.a6 = 0;
> +
> + do {
> + arm_smccc_smc_quirk(cmd, desc->arginfo, desc->args[0],
> + desc->args[1], desc->args[2], x5,
> + quirk.state.a6, 0, res, &quirk);
> +
> + if (res->a0 == QCOM_SCM_INTERRUPTED)
> + cmd = res->a0;
> +
> + } while (res->a0 == QCOM_SCM_INTERRUPTED);
> +}
> +
> +static void qcom_scm_call_do(const struct qcom_scm_desc *desc,
> + struct arm_smccc_res *res, u32 fn_id,
> + u64 x5, bool atomic)
> +{
> + int retry_count = 0;
> +
> + if (!atomic) {
> + do {
> + mutex_lock(&qcom_scm_lock);
> +
> + __qcom_scm_call_do(desc, res, fn_id, x5,
> + ARM_SMCCC_STD_CALL);
> +
> + mutex_unlock(&qcom_scm_lock);
> +
> + if (res->a0 == QCOM_SCM_V2_EBUSY) {
> + if (retry_count++ > QCOM_SCM_EBUSY_MAX_RETRY)
> + break;
> + msleep(QCOM_SCM_EBUSY_WAIT_MS);
> + }
> + } while (res->a0 == QCOM_SCM_V2_EBUSY);
> + } else {
> + __qcom_scm_call_do(desc, res, fn_id, x5, ARM_SMCCC_FAST_CALL);
> + }
> +}
> +
> +static int ___qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
> + const struct qcom_scm_desc *desc,
> + struct arm_smccc_res *res, bool atomic)
> {
> int arglen = desc->arginfo & 0xf;
> - int retry_count = 0, i;
> + int i;
> u32 fn_id = QCOM_SCM_FNID(svc_id, cmd_id);
> - u64 cmd, x5 = desc->args[FIRST_EXT_ARG_IDX];
> + u64 x5 = desc->args[FIRST_EXT_ARG_IDX];
> dma_addr_t args_phys = 0;
> void *args_virt = NULL;
> size_t alloc_len;
> - struct arm_smccc_quirk quirk = {.id = ARM_SMCCC_QUIRK_QCOM_A6};
> + gfp_t flag = atomic ? GFP_ATOMIC : GFP_KERNEL;
>
> if (unlikely(arglen > N_REGISTER_ARGS)) {
> alloc_len = N_EXT_QCOM_SCM_ARGS * sizeof(u64);
> - args_virt = kzalloc(PAGE_ALIGN(alloc_len), GFP_KERNEL);
> + args_virt = kzalloc(PAGE_ALIGN(alloc_len), flag);
>
> if (!args_virt)
> return -ENOMEM;
> @@ -125,33 +164,7 @@ static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
> x5 = args_phys;
> }
>
> - do {
> - mutex_lock(&qcom_scm_lock);
> -
> - cmd = ARM_SMCCC_CALL_VAL(ARM_SMCCC_STD_CALL,
> - qcom_smccc_convention,
> - ARM_SMCCC_OWNER_SIP, fn_id);
> -
> - quirk.state.a6 = 0;
> -
> - do {
> - arm_smccc_smc_quirk(cmd, desc->arginfo, desc->args[0],
> - desc->args[1], desc->args[2], x5,
> - quirk.state.a6, 0, res, &quirk);
> -
> - if (res->a0 == QCOM_SCM_INTERRUPTED)
> - cmd = res->a0;
> -
> - } while (res->a0 == QCOM_SCM_INTERRUPTED);
> -
> - mutex_unlock(&qcom_scm_lock);
> -
> - if (res->a0 == QCOM_SCM_V2_EBUSY) {
> - if (retry_count++ > QCOM_SCM_EBUSY_MAX_RETRY)
> - break;
> - msleep(QCOM_SCM_EBUSY_WAIT_MS);
> - }
> - } while (res->a0 == QCOM_SCM_V2_EBUSY);
> + qcom_scm_call_do(desc, res, fn_id, x5, atomic);
>
> if (args_virt) {
> dma_unmap_single(dev, args_phys, alloc_len, DMA_TO_DEVICE);
> @@ -164,6 +177,41 @@ static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
> return 0;
> }
>
> +/**
> + * qcom_scm_call() - Invoke a syscall in the secure world
> + * @dev: device
> + * @svc_id: service identifier
> + * @cmd_id: command identifier
> + * @desc: Descriptor structure containing arguments and return values
> + *
> + * Sends a command to the SCM and waits for the command to finish processing.
> + * This should *only* be called in pre-emptible context.
> + */
> +static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
> + const struct qcom_scm_desc *desc,
> + struct arm_smccc_res *res)
> +{
> + return ___qcom_scm_call(dev, svc_id, cmd_id, desc, res, false);
> +}
> +
> +/**
> + * qcom_scm_call_atomic() - atomic variation of qcom_scm_call()
> + * @dev: device
> + * @svc_id: service identifier
> + * @cmd_id: command identifier
> + * @desc: Descriptor structure containing arguments and return values
> + * @res: Structure containing results from SMC/HVC call
> + *
> + * Sends a command to the SCM and waits for the command to finish processing.
> + * This should be called in atomic context only.
> + */
> +static int qcom_scm_call_atomic(struct device *dev, u32 svc_id, u32 cmd_id,
> + const struct qcom_scm_desc *desc,
> + struct arm_smccc_res *res)
> +{
> + return ___qcom_scm_call(dev, svc_id, cmd_id, desc, res, true);
> +}
> +
> /**
> * qcom_scm_set_cold_boot_addr() - Set the cold boot address for cpus
> * @entry: Entry point function for the cpus
> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> of Code Aurora Forum, hosted by The Linux Foundation
>

2019-03-25 21:19:05

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] iommu/arm-smmu: Add support to handle Qcom's TLBI serialization errata

On Sun 09 Sep 23:25 PDT 2018, Vivek Gautam wrote:

> Qcom's implementation of arm,mmu-500 require to serialize all
> TLB invalidations for context banks.
> In case the TLB invalidation requests don't go through the first
> time, there's a way to disable/enable the wait for safe logic.
> Disabling this logic expadites the TLBIs.
>

Reading this I get a feeling that this will give a slight performance
boost, but after finally narrowing the last weeks performance hunt down
I've concluded that without this patch we loose 1000x throughput on UFS
and USB (even I2C is completely useless) when the display is on.

So this is definitely needed for SDM845 devices, such as the Dragonboard
845c and MTP, to be functional.

> Different bootloaders with their access control policies allow this
> register access differntly. With one, we should be able to directly
> make qcom-scm call to do io read/write, while with other we should
> use the specific SCM command to send request to do the complete
> register configuration.
> A separate device tree flag for arm-smmu will allow to identify
> which firmware configuration of the two mentioned above we use.
>
> Signed-off-by: Vivek Gautam <[email protected]>

Tested-by: Bjorn Andersson <[email protected]>

Regards,
Bjorn

> ---
> drivers/iommu/arm-smmu-regs.h | 2 +
> drivers/iommu/arm-smmu.c | 133 +++++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 133 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
> index a1226e4ab5f8..71662cae9806 100644
> --- a/drivers/iommu/arm-smmu-regs.h
> +++ b/drivers/iommu/arm-smmu-regs.h
> @@ -177,6 +177,8 @@ enum arm_smmu_s2cr_privcfg {
> #define ARM_SMMU_CB_ATS1PR 0x800
> #define ARM_SMMU_CB_ATSR 0x8f0
>
> +#define ARM_SMMU_GID_QCOM_CUSTOM_CFG 0x300
> +
> #define SCTLR_S1_ASIDPNE (1 << 12)
> #define SCTLR_CFCFG (1 << 7)
> #define SCTLR_CFIE (1 << 6)
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 411e5ac57c64..de9c4a5bf686 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -49,6 +49,7 @@
> #include <linux/pci.h>
> #include <linux/platform_device.h>
> #include <linux/pm_runtime.h>
> +#include <linux/qcom_scm.h>
> #include <linux/slab.h>
> #include <linux/spinlock.h>
>
> @@ -181,7 +182,8 @@ struct arm_smmu_device {
> #define ARM_SMMU_FEAT_EXIDS (1 << 12)
> u32 features;
>
> -#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
> +#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
> +#define ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA (1 << 1)
> u32 options;
> enum arm_smmu_arch_version version;
> enum arm_smmu_implementation model;
> @@ -266,6 +268,7 @@ static bool using_legacy_binding, using_generic_binding;
>
> static struct arm_smmu_option_prop arm_smmu_options[] = {
> { ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
> + { ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA, "qcom,smmu-500-fw-impl-errata" },
> { 0, NULL},
> };
>
> @@ -531,12 +534,134 @@ static void arm_smmu_tlb_inv_vmid_nosync(unsigned long iova, size_t size,
> writel_relaxed(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMID);
> }
>
> +#define CUSTOM_CFG_MDP_SAFE_ENABLE BIT(15)
> +#define CUSTOM_CFG_IFE1_SAFE_ENABLE BIT(14)
> +#define CUSTOM_CFG_IFE0_SAFE_ENABLE BIT(13)
> +
> +static int __qsmmu500_wait_safe_toggle(struct arm_smmu_device *smmu, int en)
> +{
> + int ret;
> + u32 val, gid_phys_base;
> + phys_addr_t reg;
> + struct vm_struct *vm;
> +
> + /* We want physical address of SMMU, so the vm_area */
> + vm = find_vm_area(smmu->base);
> +
> + /*
> + * GID (implementation defined address space) is located at
> + * SMMU_BASE + (2 ? PAGESIZE).
> + */
> + gid_phys_base = vm->phys_addr + (2 << (smmu)->pgshift);
> + reg = gid_phys_base + ARM_SMMU_GID_QCOM_CUSTOM_CFG;
> +
> + ret = qcom_scm_io_readl_atomic(reg, &val);
> + if (ret)
> + return ret;
> +
> + if (en)
> + val |= CUSTOM_CFG_MDP_SAFE_ENABLE |
> + CUSTOM_CFG_IFE0_SAFE_ENABLE |
> + CUSTOM_CFG_IFE1_SAFE_ENABLE;
> + else
> + val &= ~(CUSTOM_CFG_MDP_SAFE_ENABLE |
> + CUSTOM_CFG_IFE0_SAFE_ENABLE |
> + CUSTOM_CFG_IFE1_SAFE_ENABLE);
> +
> + ret = qcom_scm_io_writel_atomic(reg, val);
> +
> + return ret;
> +}
> +
> +static int qsmmu500_wait_safe_toggle(struct arm_smmu_device *smmu,
> + int en, bool is_fw_impl)
> +{
> + if (is_fw_impl)
> + return qcom_scm_qsmmu500_wait_safe_toggle(en);
> + else
> + return __qsmmu500_wait_safe_toggle(smmu, en);
> +}
> +
> +static void qcom_errata_tlb_sync(struct arm_smmu_domain *smmu_domain)
> +{
> + struct arm_smmu_device *smmu = smmu_domain->smmu;
> + void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx);
> + bool is_fw_impl;
> + u32 val;
> +
> + writel_relaxed(0, base + ARM_SMMU_CB_TLBSYNC);
> +
> + if (!readl_poll_timeout_atomic(base + ARM_SMMU_CB_TLBSTATUS, val,
> + !(val & sTLBGSTATUS_GSACTIVE), 0, 100))
> + return;
> +
> + is_fw_impl = smmu->options & ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA ?
> + true : false;
> +
> + /* SCM call here to disable the wait-for-safe logic. */
> + if (WARN(qsmmu500_wait_safe_toggle(smmu, false, is_fw_impl),
> + "Failed to disable wait-safe logic, bad hw state\n"))
> + return;
> +
> + if (!readl_poll_timeout_atomic(base + ARM_SMMU_CB_TLBSTATUS, val,
> + !(val & sTLBGSTATUS_GSACTIVE), 0, 10000))
> + return;
> +
> + /* SCM call here to re-enable the wait-for-safe logic. */
> + WARN(qsmmu500_wait_safe_toggle(smmu, true, is_fw_impl),
> + "Failed to re-enable wait-safe logic, bad hw state\n");
> +
> + dev_err_ratelimited(smmu->dev,
> + "TLB sync timed out -- SMMU in bad state\n");
> +}
> +
> +static void qcom_errata_tlb_sync_context(void *cookie)
> +{
> + struct arm_smmu_domain *smmu_domain = cookie;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&smmu_domain->cb_lock, flags);
> + qcom_errata_tlb_sync(smmu_domain);
> + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
> +}
> +
> +static void qcom_errata_tlb_inv_context_s1(void *cookie)
> +{
> + struct arm_smmu_domain *smmu_domain = cookie;
> + struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> + void __iomem *base = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx);
> + unsigned long flags;
> +
> + spin_lock_irqsave(&smmu_domain->cb_lock, flags);
> + writel_relaxed(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID);
> + qcom_errata_tlb_sync(cookie);
> + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
> +}
> +
> +static void qcom_errata_tlb_inv_range_nosync(unsigned long iova, size_t size,
> + size_t granule, bool leaf,
> + void *cookie)
> +{
> + struct arm_smmu_domain *smmu_domain = cookie;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&smmu_domain->cb_lock, flags);
> + arm_smmu_tlb_inv_range_nosync(iova, size, granule, leaf, cookie);
> + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
> +}
> +
> static const struct iommu_gather_ops arm_smmu_s1_tlb_ops = {
> .tlb_flush_all = arm_smmu_tlb_inv_context_s1,
> .tlb_add_flush = arm_smmu_tlb_inv_range_nosync,
> .tlb_sync = arm_smmu_tlb_sync_context,
> };
>
> +static const struct iommu_gather_ops qcom_errata_s1_tlb_ops = {
> + .tlb_flush_all = qcom_errata_tlb_inv_context_s1,
> + .tlb_add_flush = qcom_errata_tlb_inv_range_nosync,
> + .tlb_sync = qcom_errata_tlb_sync_context,
> +};
> +
> static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v2 = {
> .tlb_flush_all = arm_smmu_tlb_inv_context_s2,
> .tlb_add_flush = arm_smmu_tlb_inv_range_nosync,
> @@ -824,7 +949,11 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
> ias = min(ias, 32UL);
> oas = min(oas, 32UL);
> }
> - smmu_domain->tlb_ops = &arm_smmu_s1_tlb_ops;
> + if (of_device_is_compatible(smmu->dev->of_node,
> + "qcom,sdm845-smmu-500"))
> + smmu_domain->tlb_ops = &qcom_errata_s1_tlb_ops;
> + else
> + smmu_domain->tlb_ops = &arm_smmu_s1_tlb_ops;
> break;
> case ARM_SMMU_DOMAIN_NESTED:
> /*
> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> of Code Aurora Forum, hosted by The Linux Foundation
>

2019-03-26 08:03:48

by Vivek Gautam

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] firmware: qcom_scm-64: Add atomic version of qcom_scm_call


On 3/26/2019 2:39 AM, Bjorn Andersson wrote:
> On Sun 09 Sep 23:25 PDT 2018, Vivek Gautam wrote:
>
>> There are scnenarios where drivers are required to make a
>> scm call in atomic context, such as in one of the qcom's
>> arm-smmu-500 errata [1].
>>
>> [1] ("https://source.codeaurora.org/quic/la/kernel/msm-4.9/
>> tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4842")
>>
>> Signed-off-by: Vivek Gautam <[email protected]>
> Reviewed-by: Bjorn Andersson <[email protected]>


Thanks Bjorn for reviewing and testing this series.
I will repost the series on latest head.

Best regards
Vivek

>
> Regards,
> Bjorn
>
>> ---
>> drivers/firmware/qcom_scm-64.c | 136 ++++++++++++++++++++++++++++-------------
>> 1 file changed, 92 insertions(+), 44 deletions(-)
>>
>> diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
>> index 688525dd4aee..3a8c867cdf51 100644
>> --- a/drivers/firmware/qcom_scm-64.c
>> +++ b/drivers/firmware/qcom_scm-64.c
>> @@ -70,32 +70,71 @@ static DEFINE_MUTEX(qcom_scm_lock);
>> #define FIRST_EXT_ARG_IDX 3
>> #define N_REGISTER_ARGS (MAX_QCOM_SCM_ARGS - N_EXT_QCOM_SCM_ARGS + 1)
>>
>> -/**
>> - * qcom_scm_call() - Invoke a syscall in the secure world
>> - * @dev: device
>> - * @svc_id: service identifier
>> - * @cmd_id: command identifier
>> - * @desc: Descriptor structure containing arguments and return values
>> - *
>> - * Sends a command to the SCM and waits for the command to finish processing.
>> - * This should *only* be called in pre-emptible context.
>> -*/
>> -static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
>> - const struct qcom_scm_desc *desc,
>> - struct arm_smccc_res *res)
>> +static void __qcom_scm_call_do(const struct qcom_scm_desc *desc,
>> + struct arm_smccc_res *res, u32 fn_id,
>> + u64 x5, u32 type)
>> +{
>> + u64 cmd;
>> + struct arm_smccc_quirk quirk = {.id = ARM_SMCCC_QUIRK_QCOM_A6};
>> +
>> + cmd = ARM_SMCCC_CALL_VAL(type, qcom_smccc_convention,
>> + ARM_SMCCC_OWNER_SIP, fn_id);
>> +
>> + quirk.state.a6 = 0;
>> +
>> + do {
>> + arm_smccc_smc_quirk(cmd, desc->arginfo, desc->args[0],
>> + desc->args[1], desc->args[2], x5,
>> + quirk.state.a6, 0, res, &quirk);
>> +
>> + if (res->a0 == QCOM_SCM_INTERRUPTED)
>> + cmd = res->a0;
>> +
>> + } while (res->a0 == QCOM_SCM_INTERRUPTED);
>> +}
>> +
>> +static void qcom_scm_call_do(const struct qcom_scm_desc *desc,
>> + struct arm_smccc_res *res, u32 fn_id,
>> + u64 x5, bool atomic)
>> +{
>> + int retry_count = 0;
>> +
>> + if (!atomic) {
>> + do {
>> + mutex_lock(&qcom_scm_lock);
>> +
>> + __qcom_scm_call_do(desc, res, fn_id, x5,
>> + ARM_SMCCC_STD_CALL);
>> +
>> + mutex_unlock(&qcom_scm_lock);
>> +
>> + if (res->a0 == QCOM_SCM_V2_EBUSY) {
>> + if (retry_count++ > QCOM_SCM_EBUSY_MAX_RETRY)
>> + break;
>> + msleep(QCOM_SCM_EBUSY_WAIT_MS);
>> + }
>> + } while (res->a0 == QCOM_SCM_V2_EBUSY);
>> + } else {
>> + __qcom_scm_call_do(desc, res, fn_id, x5, ARM_SMCCC_FAST_CALL);
>> + }
>> +}
>> +
>> +static int ___qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
>> + const struct qcom_scm_desc *desc,
>> + struct arm_smccc_res *res, bool atomic)
>> {
>> int arglen = desc->arginfo & 0xf;
>> - int retry_count = 0, i;
>> + int i;
>> u32 fn_id = QCOM_SCM_FNID(svc_id, cmd_id);
>> - u64 cmd, x5 = desc->args[FIRST_EXT_ARG_IDX];
>> + u64 x5 = desc->args[FIRST_EXT_ARG_IDX];
>> dma_addr_t args_phys = 0;
>> void *args_virt = NULL;
>> size_t alloc_len;
>> - struct arm_smccc_quirk quirk = {.id = ARM_SMCCC_QUIRK_QCOM_A6};
>> + gfp_t flag = atomic ? GFP_ATOMIC : GFP_KERNEL;
>>
>> if (unlikely(arglen > N_REGISTER_ARGS)) {
>> alloc_len = N_EXT_QCOM_SCM_ARGS * sizeof(u64);
>> - args_virt = kzalloc(PAGE_ALIGN(alloc_len), GFP_KERNEL);
>> + args_virt = kzalloc(PAGE_ALIGN(alloc_len), flag);
>>
>> if (!args_virt)
>> return -ENOMEM;
>> @@ -125,33 +164,7 @@ static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
>> x5 = args_phys;
>> }
>>
>> - do {
>> - mutex_lock(&qcom_scm_lock);
>> -
>> - cmd = ARM_SMCCC_CALL_VAL(ARM_SMCCC_STD_CALL,
>> - qcom_smccc_convention,
>> - ARM_SMCCC_OWNER_SIP, fn_id);
>> -
>> - quirk.state.a6 = 0;
>> -
>> - do {
>> - arm_smccc_smc_quirk(cmd, desc->arginfo, desc->args[0],
>> - desc->args[1], desc->args[2], x5,
>> - quirk.state.a6, 0, res, &quirk);
>> -
>> - if (res->a0 == QCOM_SCM_INTERRUPTED)
>> - cmd = res->a0;
>> -
>> - } while (res->a0 == QCOM_SCM_INTERRUPTED);
>> -
>> - mutex_unlock(&qcom_scm_lock);
>> -
>> - if (res->a0 == QCOM_SCM_V2_EBUSY) {
>> - if (retry_count++ > QCOM_SCM_EBUSY_MAX_RETRY)
>> - break;
>> - msleep(QCOM_SCM_EBUSY_WAIT_MS);
>> - }
>> - } while (res->a0 == QCOM_SCM_V2_EBUSY);
>> + qcom_scm_call_do(desc, res, fn_id, x5, atomic);
>>
>> if (args_virt) {
>> dma_unmap_single(dev, args_phys, alloc_len, DMA_TO_DEVICE);
>> @@ -164,6 +177,41 @@ static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
>> return 0;
>> }
>>
>> +/**
>> + * qcom_scm_call() - Invoke a syscall in the secure world
>> + * @dev: device
>> + * @svc_id: service identifier
>> + * @cmd_id: command identifier
>> + * @desc: Descriptor structure containing arguments and return values
>> + *
>> + * Sends a command to the SCM and waits for the command to finish processing.
>> + * This should *only* be called in pre-emptible context.
>> + */
>> +static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
>> + const struct qcom_scm_desc *desc,
>> + struct arm_smccc_res *res)
>> +{
>> + return ___qcom_scm_call(dev, svc_id, cmd_id, desc, res, false);
>> +}
>> +
>> +/**
>> + * qcom_scm_call_atomic() - atomic variation of qcom_scm_call()
>> + * @dev: device
>> + * @svc_id: service identifier
>> + * @cmd_id: command identifier
>> + * @desc: Descriptor structure containing arguments and return values
>> + * @res: Structure containing results from SMC/HVC call
>> + *
>> + * Sends a command to the SCM and waits for the command to finish processing.
>> + * This should be called in atomic context only.
>> + */
>> +static int qcom_scm_call_atomic(struct device *dev, u32 svc_id, u32 cmd_id,
>> + const struct qcom_scm_desc *desc,
>> + struct arm_smccc_res *res)
>> +{
>> + return ___qcom_scm_call(dev, svc_id, cmd_id, desc, res, true);
>> +}
>> +
>> /**
>> * qcom_scm_set_cold_boot_addr() - Set the cold boot address for cpus
>> * @entry: Entry point function for the cpus
>> --
>> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
>> of Code Aurora Forum, hosted by The Linux Foundation
>>