2018-08-14 12:00:12

by Vivek Gautam

Subject: [PATCH 0/5] Qcom smmu-500 TLB invalidation errata for sdm845

Qcom's implementation of arm,mmu-500 on sdm845 has a functional/performance
errata [1] because of which the TCU cache look ups are stalled during
invalidation cycle. This is mitigated by serializing all the invalidation
requests coming to the smmu.

This patch series addresses this errata by adding new tlb_ops for
qcom,sdm845-smmu-500 [2]. These ops take context bank locks for all the
tlb_ops that queue and sync the TLB invalidation requests.
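
As a rough illustration of what "serializing" means here, the user-space
sketch below models a context bank whose invalidation and sync operations
all run under one lock. It is not the driver code: struct ctx_bank and the
cb_*() helpers are hypothetical names, and a C11 atomic_flag spinlock stands
in for the kernel spinlock the patches take.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space model of one SMMU context bank. */
struct ctx_bank {
	atomic_flag lock;       /* stands in for the per-context-bank spinlock */
	unsigned int queued;    /* invalidations issued but not yet synced */
	unsigned int completed; /* invalidations already covered by a sync */
};

void cb_init(struct ctx_bank *cb)
{
	assert(cb);
	atomic_flag_clear(&cb->lock);
	cb->queued = 0;
	cb->completed = 0;
}

void cb_lock(struct ctx_bank *cb)
{
	/* Spin until the flag was previously clear, i.e. we now own the lock. */
	while (atomic_flag_test_and_set_explicit(&cb->lock,
						 memory_order_acquire))
		;
}

void cb_unlock(struct ctx_bank *cb)
{
	atomic_flag_clear_explicit(&cb->lock, memory_order_release);
}

/* Queue one invalidation; only one caller touches the bank at a time. */
void cb_tlb_inv_range(struct ctx_bank *cb)
{
	cb_lock(cb);
	cb->queued++;
	cb_unlock(cb);
}

/* Sync: everything queued so far is treated as complete. Returns the count. */
unsigned int cb_tlb_sync(struct ctx_bank *cb)
{
	unsigned int done;

	cb_lock(cb);
	done = cb->queued;
	cb->completed += done;
	cb->queued = 0;
	cb_unlock(cb);

	return done;
}
```

The point of the model is only that queueing and syncing share the same
lock, so a sync never overlaps with another invalidation on the same bank.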

Besides adding locks, there's a way to expedite these TLB invalidations
for display and camera devices by turning off the 'wait-for-safe' logic
in hardware, which holds the TLB invalidations until a safe level.
This 'wait-for-safe' logic is controlled by toggling a chicken bit
through a secure register. This secure register is accessed by making an
explicit SCM call into the EL3 firmware.
There are two ways of handling this logic -
* Firmware, such as tz present on sdm845-mtp devices, has a handler to do
all the register access and bit set/clear. The downstream arm-smmu
driver [3] handles it the same way.
* Other firmwares can have handlers to just read/write this secure
register. In such cases the kernel makes io_readl/writel scm calls to
modify the register.
This patch series adds APIs in qcom-scm driver to handle both of these
cases.
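
The two cases can be sketched in plain C as follows. This is a user-space
model under stated assumptions, not the qcom-scm API: struct fw_ops, the
fake_*() helpers and wait_safe_toggle() are hypothetical names, and a plain
uint32_t stands in for the secure register. The three bit positions are the
ones defined in patch 5 of this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions as defined in patch 5 of this series. */
#define CUSTOM_CFG_MDP_SAFE_ENABLE	(1u << 15)
#define CUSTOM_CFG_IFE1_SAFE_ENABLE	(1u << 14)
#define CUSTOM_CFG_IFE0_SAFE_ENABLE	(1u << 13)
#define CUSTOM_CFG_SAFE_MASK	(CUSTOM_CFG_MDP_SAFE_ENABLE | \
				 CUSTOM_CFG_IFE1_SAFE_ENABLE | \
				 CUSTOM_CFG_IFE0_SAFE_ENABLE)

/* Hypothetical firmware interface covering both cases. */
struct fw_ops {
	int (*toggle)(void *ctx, bool en);	/* case 1, NULL if absent */
	int (*readl)(void *ctx, uint32_t *val);	/* case 2 */
	int (*writel)(void *ctx, uint32_t val);	/* case 2 */
};

/* Pure helper: set or clear the three wait-for-safe enable bits. */
uint32_t wait_safe_apply(uint32_t val, bool en)
{
	return en ? (val | CUSTOM_CFG_SAFE_MASK) : (val & ~CUSTOM_CFG_SAFE_MASK);
}

/* Fake secure register accessors; ctx points at a plain uint32_t. */
int fake_readl(void *ctx, uint32_t *val)
{
	*val = *(uint32_t *)ctx;
	return 0;
}

int fake_writel(void *ctx, uint32_t val)
{
	*(uint32_t *)ctx = val;
	return 0;
}

/* Fake "firmware does everything" handler for case 1. */
int fake_fw_toggle(void *ctx, bool en)
{
	uint32_t *reg = ctx;

	*reg = wait_safe_apply(*reg, en);
	return 0;
}

/* Use the full firmware handler if present, else read-modify-write. */
int wait_safe_toggle(const struct fw_ops *fw, void *ctx, bool en)
{
	uint32_t val;
	int ret;

	if (fw->toggle)
		return fw->toggle(ctx, en);

	ret = fw->readl(ctx, &val);
	if (ret)
		return ret;

	return fw->writel(ctx, wait_safe_apply(val, en));
}
```

Either path ends with the same register contents; the difference is only in
who performs the read-modify-write.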

Lastly, since these TLB invalidations can happen in atomic contexts
there's a need to add atomic versions of qcom_scm_io_readl/writel() and
qcom_scm_call() APIs. The existing scm calls take a mutex and may sleep,
so they can't be used in atomic contexts.
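
A minimal user-space sketch of the difference between the two call paths is
given below. It assumes a firmware entry point that can report "busy"; the
names scm_call_model(), fake_fw_call() and the FW_* constants are
hypothetical, and the retry limit and busy code are illustrative values
rather than the driver's. The sleeping path retries while the firmware is
busy; the atomic path makes exactly one attempt and never sleeps.

```c
#include <assert.h>
#include <stdbool.h>

#define FW_OK		0
#define FW_EBUSY	(-12)	/* illustrative "firmware busy" code */
#define FW_MAX_RETRY	3	/* illustrative retry limit */

typedef int (*fw_call_fn)(void *ctx);

struct call_stats {
	unsigned int attempts;	/* firmware calls actually issued */
	unsigned int sleeps;	/* places where the real code would sleep */
};

/* Fake firmware: reports busy for the first busy_left calls. */
struct fake_fw {
	unsigned int busy_left;
};

int fake_fw_call(void *ctx)
{
	struct fake_fw *fw = ctx;

	if (fw->busy_left > 0) {
		fw->busy_left--;
		return FW_EBUSY;
	}
	return FW_OK;
}

int scm_call_model(fw_call_fn call, void *ctx, bool atomic,
		   struct call_stats *st)
{
	unsigned int retries = 0;
	int ret;

	assert(call && st);

	for (;;) {
		/* Non-atomic path: a mutex would be held around this call. */
		ret = call(ctx);
		st->attempts++;

		/* Atomic callers cannot sleep, so they never retry. */
		if (atomic || ret != FW_EBUSY)
			return ret;

		if (retries++ >= FW_MAX_RETRY)
			return ret;

		st->sleeps++;	/* stand-in for sleeping before the retry */
	}
}
```

With a firmware that is busy twice, the non-atomic call succeeds on the
third attempt, while the atomic call returns the busy code after one.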

This patch series is an adapted version of how the errata is handled in
downstream [1].

[1] https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4842
[2] https://lore.kernel.org/patchwork/patch/974114/
[3] https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4864

Vivek Gautam (5):
firmware: qcom_scm-64: Add atomic version of qcom_scm_call
firmware/qcom_scm: Add atomic version of io read/write APIs
firmware/qcom_scm: Add scm call to handle smmu errata
iommu/arm-smmu: Make way to add Qcom's smmu-500 errata handling
iommu/arm-smmu: Add support to handle Qcom's TLBI serialization errata

drivers/firmware/qcom_scm-32.c | 17 ++++
drivers/firmware/qcom_scm-64.c | 181 +++++++++++++++++++++++++++++++----------
drivers/firmware/qcom_scm.c | 18 ++++
drivers/firmware/qcom_scm.h | 9 ++
drivers/iommu/arm-smmu-regs.h | 2 +
drivers/iommu/arm-smmu.c | 168 ++++++++++++++++++++++++++++++++++++--
include/linux/qcom_scm.h | 6 ++
7 files changed, 348 insertions(+), 53 deletions(-)

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



2018-08-14 10:57:40

by Vivek Gautam

Subject: [PATCH 1/5] firmware: qcom_scm-64: Add atomic version of qcom_scm_call

There are scenarios where drivers are required to make an
scm call in atomic context, such as in one of Qcom's
arm-smmu-500 errata [1].

[1] https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4842

Signed-off-by: Vivek Gautam <[email protected]>
---
drivers/firmware/qcom_scm-64.c | 136 ++++++++++++++++++++++++++++-------------
1 file changed, 92 insertions(+), 44 deletions(-)

diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index 688525dd4aee..3a8c867cdf51 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -70,32 +70,71 @@ static DEFINE_MUTEX(qcom_scm_lock);
#define FIRST_EXT_ARG_IDX 3
#define N_REGISTER_ARGS (MAX_QCOM_SCM_ARGS - N_EXT_QCOM_SCM_ARGS + 1)

-/**
- * qcom_scm_call() - Invoke a syscall in the secure world
- * @dev: device
- * @svc_id: service identifier
- * @cmd_id: command identifier
- * @desc: Descriptor structure containing arguments and return values
- *
- * Sends a command to the SCM and waits for the command to finish processing.
- * This should *only* be called in pre-emptible context.
-*/
-static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
- const struct qcom_scm_desc *desc,
- struct arm_smccc_res *res)
+static void __qcom_scm_call_do(const struct qcom_scm_desc *desc,
+ struct arm_smccc_res *res, u32 fn_id,
+ u64 x5, u32 type)
+{
+ u64 cmd;
+ struct arm_smccc_quirk quirk = {.id = ARM_SMCCC_QUIRK_QCOM_A6};
+
+ cmd = ARM_SMCCC_CALL_VAL(type, qcom_smccc_convention,
+ ARM_SMCCC_OWNER_SIP, fn_id);
+
+ quirk.state.a6 = 0;
+
+ do {
+ arm_smccc_smc_quirk(cmd, desc->arginfo, desc->args[0],
+ desc->args[1], desc->args[2], x5,
+ quirk.state.a6, 0, res, &quirk);
+
+ if (res->a0 == QCOM_SCM_INTERRUPTED)
+ cmd = res->a0;
+
+ } while (res->a0 == QCOM_SCM_INTERRUPTED);
+}
+
+static void qcom_scm_call_do(const struct qcom_scm_desc *desc,
+ struct arm_smccc_res *res, u32 fn_id,
+ u64 x5, bool atomic)
+{
+ int retry_count = 0;
+
+ if (!atomic) {
+ do {
+ mutex_lock(&qcom_scm_lock);
+
+ __qcom_scm_call_do(desc, res, fn_id, x5,
+ ARM_SMCCC_STD_CALL);
+
+ mutex_unlock(&qcom_scm_lock);
+
+ if (res->a0 == QCOM_SCM_V2_EBUSY) {
+ if (retry_count++ > QCOM_SCM_EBUSY_MAX_RETRY)
+ break;
+ msleep(QCOM_SCM_EBUSY_WAIT_MS);
+ }
+ } while (res->a0 == QCOM_SCM_V2_EBUSY);
+ } else {
+ __qcom_scm_call_do(desc, res, fn_id, x5, ARM_SMCCC_FAST_CALL);
+ }
+}
+
+static int ___qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
+ const struct qcom_scm_desc *desc,
+ struct arm_smccc_res *res, bool atomic)
{
int arglen = desc->arginfo & 0xf;
- int retry_count = 0, i;
+ int i;
u32 fn_id = QCOM_SCM_FNID(svc_id, cmd_id);
- u64 cmd, x5 = desc->args[FIRST_EXT_ARG_IDX];
+ u64 x5 = desc->args[FIRST_EXT_ARG_IDX];
dma_addr_t args_phys = 0;
void *args_virt = NULL;
size_t alloc_len;
- struct arm_smccc_quirk quirk = {.id = ARM_SMCCC_QUIRK_QCOM_A6};
+ gfp_t flag = atomic ? GFP_ATOMIC : GFP_KERNEL;

if (unlikely(arglen > N_REGISTER_ARGS)) {
alloc_len = N_EXT_QCOM_SCM_ARGS * sizeof(u64);
- args_virt = kzalloc(PAGE_ALIGN(alloc_len), GFP_KERNEL);
+ args_virt = kzalloc(PAGE_ALIGN(alloc_len), flag);

if (!args_virt)
return -ENOMEM;
@@ -125,33 +164,7 @@ static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
x5 = args_phys;
}

- do {
- mutex_lock(&qcom_scm_lock);
-
- cmd = ARM_SMCCC_CALL_VAL(ARM_SMCCC_STD_CALL,
- qcom_smccc_convention,
- ARM_SMCCC_OWNER_SIP, fn_id);
-
- quirk.state.a6 = 0;
-
- do {
- arm_smccc_smc_quirk(cmd, desc->arginfo, desc->args[0],
- desc->args[1], desc->args[2], x5,
- quirk.state.a6, 0, res, &quirk);
-
- if (res->a0 == QCOM_SCM_INTERRUPTED)
- cmd = res->a0;
-
- } while (res->a0 == QCOM_SCM_INTERRUPTED);
-
- mutex_unlock(&qcom_scm_lock);
-
- if (res->a0 == QCOM_SCM_V2_EBUSY) {
- if (retry_count++ > QCOM_SCM_EBUSY_MAX_RETRY)
- break;
- msleep(QCOM_SCM_EBUSY_WAIT_MS);
- }
- } while (res->a0 == QCOM_SCM_V2_EBUSY);
+ qcom_scm_call_do(desc, res, fn_id, x5, atomic);

if (args_virt) {
dma_unmap_single(dev, args_phys, alloc_len, DMA_TO_DEVICE);
@@ -164,6 +177,41 @@ static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
return 0;
}

+/**
+ * qcom_scm_call() - Invoke a syscall in the secure world
+ * @dev: device
+ * @svc_id: service identifier
+ * @cmd_id: command identifier
+ * @desc: Descriptor structure containing arguments and return values
+ *
+ * Sends a command to the SCM and waits for the command to finish processing.
+ * This should *only* be called in pre-emptible context.
+ */
+static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
+ const struct qcom_scm_desc *desc,
+ struct arm_smccc_res *res)
+{
+ return ___qcom_scm_call(dev, svc_id, cmd_id, desc, res, false);
+}
+
+/**
+ * qcom_scm_call_atomic() - atomic variation of qcom_scm_call()
+ * @dev: device
+ * @svc_id: service identifier
+ * @cmd_id: command identifier
+ * @desc: Descriptor structure containing arguments and return values
+ * @res: Structure containing results from SMC/HVC call
+ *
+ * Sends a command to the SCM and waits for the command to finish processing.
+ * This should be called in atomic context only.
+ */
+static int qcom_scm_call_atomic(struct device *dev, u32 svc_id, u32 cmd_id,
+ const struct qcom_scm_desc *desc,
+ struct arm_smccc_res *res)
+{
+ return ___qcom_scm_call(dev, svc_id, cmd_id, desc, res, true);
+}
+
/**
* qcom_scm_set_cold_boot_addr() - Set the cold boot address for cpus
* @entry: Entry point function for the cpus


2018-08-14 10:58:37

by Vivek Gautam

Subject: [PATCH 5/5] iommu/arm-smmu: Add support to handle Qcom's TLBI serialization errata

Qcom's implementation of arm,mmu-500 requires serializing all
TLB invalidations for context banks.
In case the TLB invalidation requests don't go through the first
time, there's a way to disable/enable the wait-for-safe logic.
Disabling this logic expedites the TLBIs.

Different bootloaders, with their access control policies, allow this
register access differently. With one, we should be able to directly
make qcom-scm calls to do the io read/write, while with the other we
should use the specific SCM command to request the complete
register configuration.
A separate device tree flag for arm-smmu identifies which of the two
firmware configurations mentioned above is in use.
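
The retry sequence described above can be sketched in user space as shown
below. This is a model under stated assumptions, not the driver code:
struct sync_hw, errata_tlb_sync() and the fake_*() helpers are hypothetical
names, and the fake hardware only completes a sync once wait-for-safe is
off. The sketch re-enables wait-for-safe on both outcomes of the retry;
that is a choice made for the sketch, not something this commit message
specifies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical hardware hooks used by the sync sequence. */
struct sync_hw {
	bool (*sync_wait)(void *ctx);		/* true once the sync is done */
	int (*wait_safe_toggle)(void *ctx, bool en);
	void *ctx;
};

enum sync_result {
	SYNC_OK,		/* first wait succeeded */
	SYNC_OK_AFTER_RETRY,	/* succeeded with wait-for-safe disabled */
	SYNC_TIMEOUT,		/* still pending after the retry */
	SYNC_FW_ERROR,		/* toggling wait-for-safe failed */
};

/* Fake SMMU: a sync only completes while wait-for-safe is disabled. */
struct fake_smmu {
	bool wait_safe_en;
	unsigned int toggles;
};

bool fake_sync_wait(void *ctx)
{
	struct fake_smmu *s = ctx;

	return !s->wait_safe_en;
}

int fake_toggle(void *ctx, bool en)
{
	struct fake_smmu *s = ctx;

	s->wait_safe_en = en;
	s->toggles++;
	return 0;
}

enum sync_result errata_tlb_sync(const struct sync_hw *hw)
{
	enum sync_result res;

	assert(hw && hw->sync_wait && hw->wait_safe_toggle);

	if (hw->sync_wait(hw->ctx))
		return SYNC_OK;

	/* First wait timed out: disable wait-for-safe and try again. */
	if (hw->wait_safe_toggle(hw->ctx, false))
		return SYNC_FW_ERROR;

	res = hw->sync_wait(hw->ctx) ? SYNC_OK_AFTER_RETRY : SYNC_TIMEOUT;

	/* Restore wait-for-safe whatever the retry outcome was. */
	if (hw->wait_safe_toggle(hw->ctx, true))
		return SYNC_FW_ERROR;

	return res;
}
```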

Signed-off-by: Vivek Gautam <[email protected]>
---
drivers/iommu/arm-smmu-regs.h | 2 +
drivers/iommu/arm-smmu.c | 136 +++++++++++++++++++++++++++++++++++++++++-
2 files changed, 136 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
index a1226e4ab5f8..71662cae9806 100644
--- a/drivers/iommu/arm-smmu-regs.h
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -177,6 +177,8 @@ enum arm_smmu_s2cr_privcfg {
#define ARM_SMMU_CB_ATS1PR 0x800
#define ARM_SMMU_CB_ATSR 0x8f0

+#define ARM_SMMU_GID_QCOM_CUSTOM_CFG 0x300
+
#define SCTLR_S1_ASIDPNE (1 << 12)
#define SCTLR_CFCFG (1 << 7)
#define SCTLR_CFIE (1 << 6)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 75c146751c87..fafdaeb4d097 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -48,6 +48,7 @@
#include <linux/of_iommu.h>
#include <linux/pci.h>
#include <linux/platform_device.h>
+#include <linux/qcom_scm.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

@@ -179,7 +180,8 @@ struct arm_smmu_device {
#define ARM_SMMU_FEAT_EXIDS (1 << 12)
u32 features;

-#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
+#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
+#define ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA (1 << 1)
u32 options;
enum arm_smmu_arch_version version;
enum arm_smmu_implementation model;
@@ -262,6 +264,7 @@ static bool using_legacy_binding, using_generic_binding;

static struct arm_smmu_option_prop arm_smmu_options[] = {
{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
+ { ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA, "qcom,smmu-500-fw-impl-errata" },
{ 0, NULL},
};

@@ -531,12 +534,137 @@ static void arm_smmu_tlb_inv_vmid_nosync(unsigned long iova, size_t size,
writel_relaxed(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMID);
}

+#define CUSTOM_CFG_MDP_SAFE_ENABLE BIT(15)
+#define CUSTOM_CFG_IFE1_SAFE_ENABLE BIT(14)
+#define CUSTOM_CFG_IFE0_SAFE_ENABLE BIT(13)
+
+static int __qsmmu500_wait_safe_toggle(struct arm_smmu_device *smmu, int en)
+{
+ int ret;
+ u32 val;
+ phys_addr_t reg, gid_phys_base;
+ struct vm_struct *vm;
+
+ /* We need the SMMU's physical address, so look up its vm_area */
+ vm = find_vm_area(smmu->base);
+
+ /*
+ * GID (implementation defined address space) is located at
+ * SMMU_BASE + (2 × PAGESIZE).
+ */
+ gid_phys_base = vm->phys_addr + (2 << (smmu)->pgshift);
+ reg = gid_phys_base + ARM_SMMU_GID_QCOM_CUSTOM_CFG;
+
+ ret = qcom_scm_io_readl_atomic(reg, &val);
+ if (ret)
+ return ret;
+
+ if (en)
+ val |= CUSTOM_CFG_MDP_SAFE_ENABLE |
+ CUSTOM_CFG_IFE0_SAFE_ENABLE |
+ CUSTOM_CFG_IFE1_SAFE_ENABLE;
+ else
+ val &= ~(CUSTOM_CFG_MDP_SAFE_ENABLE |
+ CUSTOM_CFG_IFE0_SAFE_ENABLE |
+ CUSTOM_CFG_IFE1_SAFE_ENABLE);
+
+ ret = qcom_scm_io_writel_atomic(reg, val);
+
+ return ret;
+}
+
+static int qsmmu500_wait_safe_toggle(struct arm_smmu_device *smmu,
+ int en, bool is_fw_impl)
+{
+ if (is_fw_impl)
+ return qcom_scm_qsmmu500_wait_safe_toggle(en);
+ else
+ return __qsmmu500_wait_safe_toggle(smmu, en);
+}
+
+static void qcom_errata_tlb_sync(struct arm_smmu_domain *smmu_domain)
+{
+ struct arm_smmu_device *smmu = smmu_domain->smmu;
+ void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx);
+ void __iomem *status = base + ARM_SMMU_CB_TLBSTATUS;
+ bool is_fw_impl;
+
+ writel_relaxed(0, base + ARM_SMMU_CB_TLBSYNC);
+
+ if (!__arm_smmu_tlb_sync_wait(status))
+ return;
+
+ is_fw_impl = smmu->options & ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA ?
+ true : false;
+
+ /* SCM call here to disable the wait-for-safe logic. */
+ if (WARN(qsmmu500_wait_safe_toggle(smmu, false, is_fw_impl),
+ "Failed to disable wait-safe logic, bad hw state\n"))
+ return;
+
+ if (!__arm_smmu_tlb_sync_wait(status))
+ return;
+
+ /* SCM call here to re-enable the wait-for-safe logic. */
+ WARN(qsmmu500_wait_safe_toggle(smmu, true, is_fw_impl),
+ "Failed to re-enable wait-safe logic, bad hw state\n");
+
+ dev_err_ratelimited(smmu->dev,
+ "TLB sync timed out -- SMMU in bad state\n");
+}
+
+static void __qcom_errata_tlb_sync_context(struct arm_smmu_domain *smmu_domain)
+{
+ qcom_errata_tlb_sync(smmu_domain);
+}
+
+static void qcom_errata_tlb_sync_context(void *cookie)
+{
+ struct arm_smmu_domain *smmu_domain = cookie;
+ unsigned long flags;
+
+ spin_lock_irqsave(&smmu_domain->cb_lock, flags);
+ qcom_errata_tlb_sync(smmu_domain);
+ spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
+}
+
+static void qcom_errata_tlb_inv_context_s1(void *cookie)
+{
+ struct arm_smmu_domain *smmu_domain = cookie;
+ struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+ void __iomem *base = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx);
+ unsigned long flags;
+
+ spin_lock_irqsave(&smmu_domain->cb_lock, flags);
+ writel_relaxed(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID);
+ __qcom_errata_tlb_sync_context(cookie);
+ spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
+}
+
+static void qcom_errata_tlb_inv_range_nosync(unsigned long iova, size_t size,
+ size_t granule, bool leaf,
+ void *cookie)
+{
+ struct arm_smmu_domain *smmu_domain = cookie;
+ unsigned long flags;
+
+ spin_lock_irqsave(&smmu_domain->cb_lock, flags);
+ __arm_smmu_tlb_inv_range_nosync(iova, size, granule, leaf, cookie);
+ spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
+}
+
static const struct iommu_gather_ops arm_smmu_s1_tlb_ops = {
.tlb_flush_all = arm_smmu_tlb_inv_context_s1,
.tlb_add_flush = arm_smmu_tlb_inv_range_nosync,
.tlb_sync = arm_smmu_tlb_sync_context,
};

+static const struct iommu_gather_ops qcom_errata_s1_tlb_ops = {
+ .tlb_flush_all = qcom_errata_tlb_inv_context_s1,
+ .tlb_add_flush = qcom_errata_tlb_inv_range_nosync,
+ .tlb_sync = qcom_errata_tlb_sync_context,
+};
+
static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v2 = {
.tlb_flush_all = arm_smmu_tlb_inv_context_s2,
.tlb_add_flush = arm_smmu_tlb_inv_range_nosync,
@@ -824,7 +952,11 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
ias = min(ias, 32UL);
oas = min(oas, 32UL);
}
- smmu_domain->tlb_ops = &arm_smmu_s1_tlb_ops;
+ if (of_device_is_compatible(smmu->dev->of_node,
+ "qcom,sdm845-smmu-500"))
+ smmu_domain->tlb_ops = &qcom_errata_s1_tlb_ops;
+ else
+ smmu_domain->tlb_ops = &arm_smmu_s1_tlb_ops;
break;
case ARM_SMMU_DOMAIN_NESTED:
/*


2018-08-14 10:58:57

by Vivek Gautam

Subject: [PATCH 3/5] firmware/qcom_scm: Add scm call to handle smmu errata

Qcom's smmu-500 needs to toggle the wait-for-safe sequence to
handle TLB invalidation syncs.
A few firmwares allow doing that through the SCM interface.
Add an API to toggle wait-for-safe from firmware through an
SCM call.

Signed-off-by: Vivek Gautam <[email protected]>
---
drivers/firmware/qcom_scm-32.c | 5 +++++
drivers/firmware/qcom_scm-64.c | 13 +++++++++++++
drivers/firmware/qcom_scm.c | 6 ++++++
drivers/firmware/qcom_scm.h | 5 +++++
include/linux/qcom_scm.h | 2 ++
5 files changed, 31 insertions(+)

diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
index 7293e5efad69..2d301ad053f8 100644
--- a/drivers/firmware/qcom_scm-32.c
+++ b/drivers/firmware/qcom_scm-32.c
@@ -639,3 +639,8 @@ int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
{
return -ENODEV;
}
+
+int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev, bool enable)
+{
+ return -ENODEV;
+}
diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index 6bf55403f6e3..f13bcabc5d78 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -590,3 +590,16 @@ int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
return qcom_scm_call_atomic(dev, QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
&desc, &res);
}
+
+int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev, bool en)
+{
+ struct qcom_scm_desc desc = {0};
+ struct arm_smccc_res res;
+
+ desc.args[0] = QCOM_SCM_CONFIG_ERRATA1_CLIENT_ALL;
+ desc.args[1] = en;
+ desc.arginfo = QCOM_SCM_ARGS(2);
+
+ return qcom_scm_call_atomic(dev, QCOM_SCM_SVC_SMMU_PROGRAM,
+ QCOM_SCM_CONFIG_ERRATA1, &desc, &res);
+}
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index 36da0000b37f..5f15cc2e9f69 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -353,6 +353,12 @@ int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare)
}
EXPORT_SYMBOL(qcom_scm_iommu_secure_ptbl_init);

+int qcom_scm_qsmmu500_wait_safe_toggle(bool en)
+{
+ return __qcom_scm_qsmmu500_wait_safe_toggle(__scm->dev, en);
+}
+EXPORT_SYMBOL(qcom_scm_qsmmu500_wait_safe_toggle);
+
int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val)
{
return __qcom_scm_io_readl(__scm->dev, addr, val);
diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
index bb176107f51e..89a822c23e33 100644
--- a/drivers/firmware/qcom_scm.h
+++ b/drivers/firmware/qcom_scm.h
@@ -103,10 +103,15 @@ extern int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id,
u32 spare);
#define QCOM_SCM_IOMMU_SECURE_PTBL_SIZE 3
#define QCOM_SCM_IOMMU_SECURE_PTBL_INIT 4
+#define QCOM_SCM_SVC_SMMU_PROGRAM 0x15
+#define QCOM_SCM_CONFIG_ERRATA1 0x3
+#define QCOM_SCM_CONFIG_ERRATA1_CLIENT_ALL 0x2
extern int __qcom_scm_iommu_secure_ptbl_size(struct device *dev, u32 spare,
size_t *size);
extern int __qcom_scm_iommu_secure_ptbl_init(struct device *dev, u64 addr,
u32 size, u32 spare);
+extern int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev,
+ bool enable);
#define QCOM_MEM_PROT_ASSIGN_ID 0x16
extern int __qcom_scm_assign_mem(struct device *dev,
phys_addr_t mem_region, size_t mem_sz,
diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
index 6a5d0c98b328..46e6b1692998 100644
--- a/include/linux/qcom_scm.h
+++ b/include/linux/qcom_scm.h
@@ -62,6 +62,7 @@ extern int qcom_scm_set_remote_state(u32 state, u32 id);
extern int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare);
extern int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size);
extern int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare);
+extern int qcom_scm_qsmmu500_wait_safe_toggle(bool en);
extern int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val);
extern int qcom_scm_io_writel(phys_addr_t addr, unsigned int val);
extern int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val);
@@ -100,6 +101,7 @@ qcom_scm_set_remote_state(u32 state,u32 id) { return -ENODEV; }
static inline int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare) { return -ENODEV; }
static inline int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size) { return -ENODEV; }
static inline int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare) { return -ENODEV; }
+static inline int qcom_scm_qsmmu500_wait_safe_toggle(bool en) { return -ENODEV; }
static inline int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val) { return -ENODEV; }
static inline int qcom_scm_io_writel(phys_addr_t addr, unsigned int val) { return -ENODEV; }
static inline int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val) { return -ENODEV; }


2018-08-14 10:59:23

by Vivek Gautam

Subject: [PATCH 4/5] iommu/arm-smmu: Make way to add Qcom's smmu-500 errata handling

Cleanup to re-use some of the stuff

Signed-off-by: Vivek Gautam <[email protected]>
---
drivers/iommu/arm-smmu.c | 32 +++++++++++++++++++++++++-------
1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 32e86df80428..75c146751c87 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -391,21 +391,31 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx)
clear_bit(idx, map);
}

-/* Wait for any pending TLB invalidations to complete */
-static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu,
- void __iomem *sync, void __iomem *status)
+static int __arm_smmu_tlb_sync_wait(void __iomem *status)
{
unsigned int spin_cnt, delay;

- writel_relaxed(0, sync);
for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
if (!(readl_relaxed(status) & sTLBGSTATUS_GSACTIVE))
- return;
+ return 0;
cpu_relax();
}
udelay(delay);
}
+
+ return -EBUSY;
+}
+
+/* Wait for any pending TLB invalidations to complete */
+static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu,
+ void __iomem *sync, void __iomem *status)
+{
+ writel_relaxed(0, sync);
+
+ if (!__arm_smmu_tlb_sync_wait(status))
+ return;
+
dev_err_ratelimited(smmu->dev,
"TLB sync timed out -- SMMU may be deadlocked\n");
}
@@ -461,8 +471,9 @@ static void arm_smmu_tlb_inv_context_s2(void *cookie)
arm_smmu_tlb_sync_global(smmu);
}

-static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
- size_t granule, bool leaf, void *cookie)
+static void __arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
+ size_t granule, bool leaf,
+ void *cookie)
{
struct arm_smmu_domain *smmu_domain = cookie;
struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
@@ -498,6 +509,13 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
}
}

+static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
+ size_t granule, bool leaf,
+ void *cookie)
+{
+ __arm_smmu_tlb_inv_range_nosync(iova, size, granule, leaf, cookie);
+}
+
/*
* On MMU-401 at least, the cost of firing off multiple TLBIVMIDs appears
* almost negligible, but the benefit of getting the first one in as far ahead


2018-08-14 11:41:46

by Will Deacon

Subject: Re: [PATCH 4/5] iommu/arm-smmu: Make way to add Qcom's smmu-500 errata handling

On Tue, Aug 14, 2018 at 04:25:27PM +0530, Vivek Gautam wrote:
> Cleanup to re-use some of the stuff

Maybe we should factor a few of the other bits whilst we're here.

Or just write a proper commit message ;)

Will

2018-08-14 12:13:42

by Vivek Gautam

Subject: [PATCH 2/5] firmware/qcom_scm: Add atomic version of io read/write APIs

Add atomic versions of qcom_scm_io_readl/writel to enable
reading/writing secure registers from atomic context.

Signed-off-by: Vivek Gautam <[email protected]>
---
drivers/firmware/qcom_scm-32.c | 12 ++++++++++++
drivers/firmware/qcom_scm-64.c | 32 ++++++++++++++++++++++++++++++++
drivers/firmware/qcom_scm.c | 12 ++++++++++++
drivers/firmware/qcom_scm.h | 4 ++++
include/linux/qcom_scm.h | 4 ++++
5 files changed, 64 insertions(+)

diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
index 4e24e591ae74..7293e5efad69 100644
--- a/drivers/firmware/qcom_scm-32.c
+++ b/drivers/firmware/qcom_scm-32.c
@@ -627,3 +627,15 @@ int __qcom_scm_io_writel(struct device *dev, phys_addr_t addr, unsigned int val)
return qcom_scm_call_atomic2(QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
addr, val);
}
+
+int __qcom_scm_io_readl_atomic(struct device *dev, phys_addr_t addr,
+ unsigned int *val)
+{
+ return -ENODEV;
+}
+
+int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
+ unsigned int val)
+{
+ return -ENODEV;
+}
diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index 3a8c867cdf51..6bf55403f6e3 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -558,3 +558,35 @@ int __qcom_scm_io_writel(struct device *dev, phys_addr_t addr, unsigned int val)
return qcom_scm_call(dev, QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
&desc, &res);
}
+
+int __qcom_scm_io_readl_atomic(struct device *dev, phys_addr_t addr,
+ unsigned int *val)
+{
+ struct qcom_scm_desc desc = {0};
+ struct arm_smccc_res res;
+ int ret;
+
+ desc.args[0] = addr;
+ desc.arginfo = QCOM_SCM_ARGS(1);
+
+ ret = qcom_scm_call_atomic(dev, QCOM_SCM_SVC_IO, QCOM_SCM_IO_READ,
+ &desc, &res);
+ if (ret >= 0)
+ *val = res.a1;
+
+ return ret < 0 ? ret : 0;
+}
+
+int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
+ unsigned int val)
+{
+ struct qcom_scm_desc desc = {0};
+ struct arm_smccc_res res;
+
+ desc.args[0] = addr;
+ desc.args[1] = val;
+ desc.arginfo = QCOM_SCM_ARGS(2);
+
+ return qcom_scm_call_atomic(dev, QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
+ &desc, &res);
+}
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index e778af766fae..36da0000b37f 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -365,6 +365,18 @@ int qcom_scm_io_writel(phys_addr_t addr, unsigned int val)
}
EXPORT_SYMBOL(qcom_scm_io_writel);

+int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val)
+{
+ return __qcom_scm_io_readl_atomic(__scm->dev, addr, val);
+}
+EXPORT_SYMBOL(qcom_scm_io_readl_atomic);
+
+int qcom_scm_io_writel_atomic(phys_addr_t addr, unsigned int val)
+{
+ return __qcom_scm_io_writel_atomic(__scm->dev, addr, val);
+}
+EXPORT_SYMBOL(qcom_scm_io_writel_atomic);
+
static void qcom_scm_set_download_mode(bool enable)
{
bool avail;
diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
index dcd7f7917fc7..bb176107f51e 100644
--- a/drivers/firmware/qcom_scm.h
+++ b/drivers/firmware/qcom_scm.h
@@ -37,6 +37,10 @@ extern void __qcom_scm_cpu_power_down(u32 flags);
#define QCOM_SCM_IO_WRITE 0x2
extern int __qcom_scm_io_readl(struct device *dev, phys_addr_t addr, unsigned int *val);
extern int __qcom_scm_io_writel(struct device *dev, phys_addr_t addr, unsigned int val);
+extern int __qcom_scm_io_readl_atomic(struct device *dev, phys_addr_t addr,
+ unsigned int *val);
+extern int __qcom_scm_io_writel_atomic(struct device *dev, phys_addr_t addr,
+ unsigned int val);

#define QCOM_SCM_SVC_INFO 0x6
#define QCOM_IS_CALL_AVAIL_CMD 0x1
diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
index 5d65521260b3..6a5d0c98b328 100644
--- a/include/linux/qcom_scm.h
+++ b/include/linux/qcom_scm.h
@@ -64,6 +64,8 @@ extern int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size);
extern int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare);
extern int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val);
extern int qcom_scm_io_writel(phys_addr_t addr, unsigned int val);
+extern int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val);
+extern int qcom_scm_io_writel_atomic(phys_addr_t addr, unsigned int val);
#else
static inline
int qcom_scm_set_cold_boot_addr(void *entry, const cpumask_t *cpus)
@@ -100,5 +102,7 @@ static inline int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size) { ret
static inline int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare) { return -ENODEV; }
static inline int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val) { return -ENODEV; }
static inline int qcom_scm_io_writel(phys_addr_t addr, unsigned int val) { return -ENODEV; }
+static inline int qcom_scm_io_readl_atomic(phys_addr_t addr, unsigned int *val) { return -ENODEV; }
+static inline int qcom_scm_io_writel_atomic(phys_addr_t addr, unsigned int val) { return -ENODEV; }
#endif
#endif


2018-08-14 12:14:13

by Will Deacon

Subject: Re: [PATCH 0/5] Qcom smmu-500 TLB invalidation errata for sdm845

Hi Vivek,

On Tue, Aug 14, 2018 at 04:25:23PM +0530, Vivek Gautam wrote:
> Qcom's implementation of arm,mmu-500 on sdm845 has a functional/performance
> errata [1] because of which the TCU cache look ups are stalled during
> invalidation cycle. This is mitigated by serializing all the invalidation
> requests coming to the smmu.

How does this implementation differ from the one supported by qcom_iommu.c?
I notice you're adding firmware hooks here, which we avoided by having the
extra driver. Please help me understand which devices exist, how they
differ, and which drivers are intended to support them!

Also -- you didn't CC all the maintainers for the firmware bits, so adding
Andy here for that, and Rob for the previous question.

Thanks,

Will

2018-08-14 12:34:22

by Vivek Gautam

Subject: Re: [PATCH 0/5] Qcom smmu-500 TLB invalidation errata for sdm845

Hi Will,


On 8/14/2018 5:10 PM, Will Deacon wrote:
> Hi Vivek,
>
> On Tue, Aug 14, 2018 at 04:25:23PM +0530, Vivek Gautam wrote:
>> Qcom's implementation of arm,mmu-500 on sdm845 has a functional/performance
>> errata [1] because of which the TCU cache look ups are stalled during
>> invalidation cycle. This is mitigated by serializing all the invalidation
>> requests coming to the smmu.
> How does this implementation differ from the one supported by qcom_iommu.c?
> I notice you're adding firmware hooks here, which we avoided by having the
> extra driver. Please help me understand which devices exist, how they
> differ, and which drivers are intended to support them!

IIRC, the qcom_iommu driver was intended to support the static context
bank - SID mapping, and is very specific to the smmu-v2 version present
on the msm8916 SoC.
However, this errata is specific to Qcom's mmu-500 implementation, and
qcom_iommu will not be able to support mmu-500 configurations.
Rob Clark can add more.
Let me know what you suggest.

>
> Also -- you didn't CC all the maintainers for the firmware bits, so adding
> Andy here for that, and Rob for the previous question.

I added Andy to the series, would you want me to add Rob H also?

Best regards
Vivek

>
> Thanks,
>
> Will


2018-08-14 12:34:41

by Vivek Gautam

Subject: Re: [PATCH 4/5] iommu/arm-smmu: Make way to add Qcom's smmu-500 errata handling



On 8/14/2018 5:10 PM, Will Deacon wrote:
> On Tue, Aug 14, 2018 at 04:25:27PM +0530, Vivek Gautam wrote:
>> Cleanup to re-use some of the stuff
> Maybe we should factor a few of the other bits whilst we're here.

Sure, do you want me to refactor anything besides this change?

>
> Or just write a proper commit message ;)

My bad. I should have written a more descriptive commit message. :|
Will change this.

Best regards
Vivek

>
> Will


2018-08-14 17:00:22

by Robin Murphy

Subject: Re: [PATCH 4/5] iommu/arm-smmu: Make way to add Qcom's smmu-500 errata handling

On 14/08/18 11:55, Vivek Gautam wrote:
> Cleanup to re-use some of the stuff
>
> Signed-off-by: Vivek Gautam <[email protected]>
> ---
> drivers/iommu/arm-smmu.c | 32 +++++++++++++++++++++++++-------
> 1 file changed, 25 insertions(+), 7 deletions(-)

I think the overall diffstat would be an awful lot smaller if the
erratum workaround just has its own readl_poll_timeout() as it does in
the vendor kernel. The burst-polling loop is for minimising latency in
high-throughput situations, and if you're in a workaround which has to
lock *every* register write and issue two firmware calls around each
sync I think you're already well out of that game.
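
To illustrate the simpler shape being suggested, here is a minimal userspace
sketch of a plain poll-with-timeout loop, as opposed to the nested
burst-polling loop in the patch. In the kernel this would normally be written
with readl_poll_timeout_atomic() from <linux/iopoll.h>; everything below
(struct fake_status, fake_readl(), simple_sync_wait(), and the GSACTIVE bit
value) is a hypothetical stand-in so that the logic can be compiled and
checked outside the kernel, and is not the code from the series or from the
vendor kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed bit position; models sTLBGSTATUS_GSACTIVE for illustration only. */
#define GSACTIVE (1u << 0)

/*
 * Hypothetical stand-in for an MMIO status register: it reports "sync
 * active" for busy_reads reads and "idle" afterwards.
 */
struct fake_status {
	unsigned int busy_reads;	/* reads left that still see GSACTIVE */
	unsigned int reads;		/* total number of reads performed */
};

static uint32_t fake_readl(struct fake_status *s)
{
	s->reads++;
	if (s->busy_reads > 0) {
		s->busy_reads--;
		return GSACTIVE;
	}
	return 0;
}

/*
 * Poll the status until GSACTIVE clears or max_polls reads have been made.
 * Returns 0 on success and -ETIMEDOUT on timeout. A kernel version would
 * delay between reads instead of counting them.
 */
static int simple_sync_wait(struct fake_status *s, unsigned int max_polls)
{
	unsigned int i;

	assert(max_polls > 0);

	for (i = 0; i < max_polls; i++) {
		if (!(fake_readl(s) & GSACTIVE))
			return 0;
	}

	return -ETIMEDOUT;
}
```

The point of the sketch is only that a single flat loop with one timeout is
enough once every sync is already serialised by a lock and bracketed by
firmware calls; the latency tuning of the burst loop buys little there.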

> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 32e86df80428..75c146751c87 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -391,21 +391,31 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx)
>  	clear_bit(idx, map);
>  }
>
> -/* Wait for any pending TLB invalidations to complete */
> -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu,
> -				void __iomem *sync, void __iomem *status)
> +static int __arm_smmu_tlb_sync_wait(void __iomem *status)
>  {
>  	unsigned int spin_cnt, delay;
>
> -	writel_relaxed(0, sync);
>  	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
>  		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
>  			if (!(readl_relaxed(status) & sTLBGSTATUS_GSACTIVE))
> -				return;
> +				return 0;
>  			cpu_relax();
>  		}
>  		udelay(delay);
>  	}
> +
> +	return -EBUSY;
> +}
> +
> +/* Wait for any pending TLB invalidations to complete */
> +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu,
> +				void __iomem *sync, void __iomem *status)
> +{
> +	writel_relaxed(0, sync);
> +
> +	if (!__arm_smmu_tlb_sync_wait(status))
> +		return;
> +
>  	dev_err_ratelimited(smmu->dev,
>  			    "TLB sync timed out -- SMMU may be deadlocked\n");
>  }
> @@ -461,8 +471,9 @@ static void arm_smmu_tlb_inv_context_s2(void *cookie)
>  	arm_smmu_tlb_sync_global(smmu);
>  }
>
> -static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
> -					  size_t granule, bool leaf, void *cookie)
> +static void __arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
> +					    size_t granule, bool leaf,
> +					    void *cookie)
>  {
>  	struct arm_smmu_domain *smmu_domain = cookie;
>  	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> @@ -498,6 +509,13 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
>  	}
>  }
>
> +static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
> +					  size_t granule, bool leaf,
> +					  void *cookie)
> +{
> +	__arm_smmu_tlb_inv_range_nosync(iova, size, granule, leaf, cookie);
> +}
> +

AFAICS even after patch #5 this does absolutely nothing except make the
code needlessly harder to read :(

Robin.

> /*
> * On MMU-401 at least, the cost of firing off multiple TLBIVMIDs appears
> * almost negligible, but the benefit of getting the first one in as far ahead
>

2018-08-28 07:00:32

by Vivek Gautam

Subject: Re: [PATCH 4/5] iommu/arm-smmu: Make way to add Qcom's smmu-500 errata handling

Hi Robin,


On 8/14/2018 10:29 PM, Robin Murphy wrote:
> On 14/08/18 11:55, Vivek Gautam wrote:
>> Cleanup to re-use some of the stuff
>>
>> Signed-off-by: Vivek Gautam <[email protected]>
>> ---
>>   drivers/iommu/arm-smmu.c | 32 +++++++++++++++++++++++++-------
>>   1 file changed, 25 insertions(+), 7 deletions(-)
>
> I think the overall diffstat would be an awful lot smaller if the
> erratum workaround just has its own readl_poll_timeout() as it does in
> the vendor kernel. The burst-polling loop is for minimising latency in
> high-throughput situations, and if you're in a workaround which has to
> lock *every* register write and issue two firmware calls around each
> sync I think you're already well out of that game.

Sorry for the delayed response. I was on vacation.
I will fix this in my next version by adding a separate
readl_poll_timeout() for the erratum WA.

>
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index 32e86df80428..75c146751c87 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -391,21 +391,31 @@ static void __arm_smmu_free_bitmap(unsigned
>> long *map, int idx)
>>       clear_bit(idx, map);
>>   }
>>   -/* Wait for any pending TLB invalidations to complete */
>> -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu,
>> -                void __iomem *sync, void __iomem *status)
>> +static int __arm_smmu_tlb_sync_wait(void __iomem *status)
>>   {
>>       unsigned int spin_cnt, delay;
>>   -    writel_relaxed(0, sync);
>>       for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
>>           for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
>>               if (!(readl_relaxed(status) & sTLBGSTATUS_GSACTIVE))
>> -                return;
>> +                return 0;
>>               cpu_relax();
>>           }
>>           udelay(delay);
>>       }
>> +
>> +    return -EBUSY;
>> +}
>> +
>> +/* Wait for any pending TLB invalidations to complete */
>> +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu,
>> +                void __iomem *sync, void __iomem *status)
>> +{
>> +    writel_relaxed(0, sync);
>> +
>> +    if (!__arm_smmu_tlb_sync_wait(status))
>> +        return;
>> +
>>       dev_err_ratelimited(smmu->dev,
>>                   "TLB sync timed out -- SMMU may be deadlocked\n");
>>   }
>> @@ -461,8 +471,9 @@ static void arm_smmu_tlb_inv_context_s2(void
>> *cookie)
>>       arm_smmu_tlb_sync_global(smmu);
>>   }
>>   -static void arm_smmu_tlb_inv_range_nosync(unsigned long iova,
>> size_t size,
>> -                      size_t granule, bool leaf, void *cookie)
>> +static void __arm_smmu_tlb_inv_range_nosync(unsigned long iova,
>> size_t size,
>> +                        size_t granule, bool leaf,
>> +                        void *cookie)
>>   {
>>       struct arm_smmu_domain *smmu_domain = cookie;
>>       struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
>> @@ -498,6 +509,13 @@ static void
>> arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
>>       }
>>   }
>>   +static void arm_smmu_tlb_inv_range_nosync(unsigned long iova,
>> size_t size,
>> +                      size_t granule, bool leaf,
>> +                      void *cookie)
>> +{
>> +    __arm_smmu_tlb_inv_range_nosync(iova, size, granule, leaf, cookie);
>> +}
>> +
>
> AFAICS even after patch #5 this does absolutely nothing except make
> the code needlessly harder to read :(

Sure, I will call arm_smmu_tlb_inv_range_nosync() from
qcom_errata_tlb_inv_range_nosync() rather than making this change.
Thanks for the review.
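
Roughly, the shape I have in mind is sketched below as a userspace model, so
it can be compiled on its own. It is only an illustration under assumptions:
struct cb_state, generic_tlb_inv_range_nosync() and
errata_tlb_inv_range_nosync() are hypothetical names, the atomic_flag stands
in for the context bank spinlock, and the counter stands in for the register
writes done by the existing arm_smmu_tlb_inv_range_nosync().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-context-bank state, for illustration only. */
struct cb_state {
	atomic_flag lock;		/* stands in for a kernel spinlock */
	unsigned int inv_issued;	/* invalidations "written" to hardware */
};

/* Models the existing generic op: one invalidation per granule in range. */
static void generic_tlb_inv_range_nosync(unsigned long iova, size_t size,
					 size_t granule, struct cb_state *cb)
{
	size_t off;

	(void)iova;		/* the model does not need the address itself */
	assert(granule > 0);	/* a zero granule would never terminate */

	for (off = 0; off < size; off += granule)
		cb->inv_issued++;
}

/*
 * Errata variant: serialise by taking the context bank lock, then call the
 * unchanged generic op, so no extra wrapper is needed around the latter.
 */
static void errata_tlb_inv_range_nosync(unsigned long iova, size_t size,
					size_t granule, struct cb_state *cb)
{
	/* Spin until the flag is acquired. */
	while (atomic_flag_test_and_set_explicit(&cb->lock,
						 memory_order_acquire))
		;

	generic_tlb_inv_range_nosync(iova, size, granule, cb);

	atomic_flag_clear_explicit(&cb->lock, memory_order_release);
}
```

With this arrangement the generic function keeps its current name and body,
and only the errata-specific op knows about the lock.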

Best regards
Vivek

>
> Robin.
>
>>   /*
>>    * On MMU-401 at least, the cost of firing off multiple TLBIVMIDs
>> appears
>>    * almost negligible, but the benefit of getting the first one in
>> as far ahead
>>


2018-09-05 09:24:01

by Vivek Gautam

Subject: Re: [PATCH 0/5] Qcom smmu-500 TLB invalidation errata for sdm845


On 8/14/2018 5:54 PM, Vivek Gautam wrote:
> Hi Will,
>
>
> On 8/14/2018 5:10 PM, Will Deacon wrote:
>> Hi Vivek,
>>
>> On Tue, Aug 14, 2018 at 04:25:23PM +0530, Vivek Gautam wrote:
>>> Qcom's implementation of arm,mmu-500 on sdm845 has a
>>> functional/performance
>>> errata [1] because of which the TCU cache look ups are stalled during
>>> invalidation cycle. This is mitigated by serializing all the
>>> invalidation
>>> requests coming to the smmu.
>> How does this implementation differ from the one supported by
>> qcom_iommu.c?
>> I notice you're adding firmware hooks here, which we avoided by
>> having the
>> extra driver. Please help me understand which devices exist, how they
>> differ, and which drivers are intended to support them!
>
> IIRC, the qcom_iommu driver was intended to support the static context
> bank - SID
> mapping, and is very specific to the smmu-v2 version present on
> msm8916 soc.
> However, this is the qcom's mmu-500 implementation specific errata.
> qcom_iommu
> will not be able to support mmu-500 configurations.
> Rob Clark can add more.
> Let you know what you suggest.

Rob, can you please comment on how the qcom_iommu driver's implementation
differs from that of the arm-smmu driver?
Will, in case we want to use the arm-smmu driver, what would you suggest
for handling the firmware hooks?
Thanks.

Best regards
Vivek
>
>>
>> Also -- you didn't CC all the maintainers for the firmware bits, so
>> adding
>> Andy here for that, and Rob for the previous question.
>
> I added Andy to the series, would you want me to add Rob H also?
>
> Best regards
> Vivek
>
>>
>> Thanks,
>>
>> Will
>


2018-09-05 10:06:21

by Rob Clark

Subject: Re: [PATCH 0/5] Qcom smmu-500 TLB invalidation errata for sdm845

On Wed, Sep 5, 2018 at 5:22 AM Vivek Gautam <[email protected]> wrote:
>
>
> On 8/14/2018 5:54 PM, Vivek Gautam wrote:
> > Hi Will,
> >
> >
> > On 8/14/2018 5:10 PM, Will Deacon wrote:
> >> Hi Vivek,
> >>
> >> On Tue, Aug 14, 2018 at 04:25:23PM +0530, Vivek Gautam wrote:
> >>> Qcom's implementation of arm,mmu-500 on sdm845 has a
> >>> functional/performance
> >>> errata [1] because of which the TCU cache look ups are stalled during
> >>> invalidation cycle. This is mitigated by serializing all the
> >>> invalidation
> >>> requests coming to the smmu.
> >> How does this implementation differ from the one supported by
> >> qcom_iommu.c?
> >> I notice you're adding firmware hooks here, which we avoided by
> >> having the
> >> extra driver. Please help me understand which devices exist, how they
> >> differ, and which drivers are intended to support them!
> >
> > IIRC, the qcom_iommu driver was intended to support the static context
> > bank - SID
> > mapping, and is very specific to the smmu-v2 version present on
> > msm8916 soc.
> > However, this is the qcom's mmu-500 implementation specific errata.
> > qcom_iommu
> > will not be able to support mmu-500 configurations.
> > Rob Clark can add more.
> > Let you know what you suggest.
>
> Rob, can you please comment about how qcom-smmu driver has different
> implementation
> from arm-smmu driver?

sorry, I missed this thread earlier. But yeah, as you mentioned, the
purpose for qcom_iommu.c was to deal with the static context/SID
mapping.

(I guess it is all just software, and we could make qcom_iommu.c
support dynamic mapping as well, but I think then it starts to
duplicate most of arm_smmu.c, so that doesn't seem like the right
direction)

BR,
-R

> Will, in case we would want to use arm-smmu driver, what would you
> suggest for
> having the firmware hooks?
> Thanks.
>
> Best regards
> Vivek

2018-09-05 11:26:58

by Vivek Gautam

Subject: Re: [PATCH 0/5] Qcom smmu-500 TLB invalidation errata for sdm845



On 9/5/2018 3:34 PM, Rob Clark wrote:
> On Wed, Sep 5, 2018 at 5:22 AM Vivek Gautam <[email protected]> wrote:
>>
>> On 8/14/2018 5:54 PM, Vivek Gautam wrote:
>>> Hi Will,
>>>
>>>
>>> On 8/14/2018 5:10 PM, Will Deacon wrote:
>>>> Hi Vivek,
>>>>
>>>> On Tue, Aug 14, 2018 at 04:25:23PM +0530, Vivek Gautam wrote:
>>>>> Qcom's implementation of arm,mmu-500 on sdm845 has a
>>>>> functional/performance
>>>>> errata [1] because of which the TCU cache look ups are stalled during
>>>>> invalidation cycle. This is mitigated by serializing all the
>>>>> invalidation
>>>>> requests coming to the smmu.
>>>> How does this implementation differ from the one supported by
>>>> qcom_iommu.c?
>>>> I notice you're adding firmware hooks here, which we avoided by
>>>> having the
>>>> extra driver. Please help me understand which devices exist, how they
>>>> differ, and which drivers are intended to support them!
>>> IIRC, the qcom_iommu driver was intended to support the static context
>>> bank - SID
>>> mapping, and is very specific to the smmu-v2 version present on
>>> msm8916 soc.
>>> However, this is the qcom's mmu-500 implementation specific errata.
>>> qcom_iommu
>>> will not be able to support mmu-500 configurations.
>>> Rob Clark can add more.
>>> Let you know what you suggest.
>> Rob, can you please comment about how qcom-smmu driver has different
>> implementation
>> from arm-smmu driver?
> sorry, I missed this thread earlier. But yeah, as you mentioned, the
> purpose for qcom_iommu.c was to deal with the static context/SID
> mapping.
>
> (I guess it is all just software, and we could make qcom_iommu.c
> support dynamic mapping as well, but I think then it starts to
> duplicate most of arm_smmu.c, so that doesn't seem like the right
> direction)

Thanks, Rob, for the response. I will wait for Will's response on how he
would like this support to be implemented.

Best regards
Vivek
>
> BR,
> -R
>
>> Will, in case we would want to use arm-smmu driver, what would you
>> suggest for
>> having the firmware hooks?
>> Thanks.
>>
>> Best regards
>> Vivek