2017-08-31 08:30:33

by Yisheng Xie

[permalink] [raw]
Subject: [RFC PATCH 0/6] Add platform device SVM support for ARM SMMUv3

Jean-Philippe has post a patchset for Adding PCIe SVM support to ARM SMMUv3:
https://www.spinics.net/lists/arm-kernel/msg565155.html

But for some platform devices(aka on-chip integrated devices), there is also
SVM requirement, which works based on the SMMU stall mode.
Jean-Philippe has prepared a prototype patchset to support it:
git://linux-arm.org/linux-jpb.git svm/stall

We tested this patchset with some fixes on a on-chip integrated device. The
basic function is ok, so I just send them out for review, although this
patchset heavily depends on the former patchset (PCIe SVM support for ARM
SMMUv3), which is still under discussion.

Patch Overview:
*1 to 3 prepare for device tree or acpi get the device stall ability and pasid bits
*4 is to realise the SVM function for platform device
*5 is fix a bug when test SVM function while SMMU donnot support this feature
*6 avoid ILLEGAL setting of STE and CD entry about stall

Acctually here, I also have a question about SVM on SMMUv3:

1. Why the SVM feature on SMMUv3 depends on BTM feature? when bind a task to device,
it will register a mmu_notify. Therefore, when a page range is invalid, we can
send TLBI or ATC invalid without BTM?

2. According to ACPI IORT spec, named component specific data has a node flags field
whoes bit0 is for Stall support. However, it do not have any field for pasid bit.
Can we use other 5 bits[5:1] for pasid bit numbers, so we can have 32 pasid bit for
a single platform device which should be enough, because SMMU only support 20 bit pasid

3. Presently, the pasid is allocate for a task but not for a context, if a task is trying
to bind to 2 device A and B:
a) A support 5 pasid bits
b) B support 2 pasid bits
c) when the task bind to device A, it allocate pasid = 16
d) then it must be fail when trying to bind to task B, for its highest pasid is 4.
So it should allocate a single pasid for a context to avoid this?


Jean-Philippe Brucker (3):
dt-bindings: document stall and PASID properties for IOMMU masters
iommu/of: Add stall and pasid properties to iommu_fwspec
iommu/arm-smmu-v3: Add SVM support for platform devices

Yisheng Xie (3):
ACPI: IORT: Add stall and pasid properties to iommu_fwspec
iommu/arm-smmu-v3: fix panic when handle stall mode irq
iommu/arm-smmu-v3: Avoid ILLEGAL setting of STE.S1STALLD and CD.S

Documentation/devicetree/bindings/iommu/iommu.txt | 13 ++
drivers/acpi/arm64/iort.c | 20 ++
drivers/iommu/arm-smmu-v3.c | 230 ++++++++++++++++++----
drivers/iommu/of_iommu.c | 11 +
include/acpi/actbl2.h | 5 +
include/linux/iommu.h | 2 +
6 files changed, 244 insertions(+), 37 deletions(-)

--
1.7.12.4


2017-08-31 08:29:29

by Yisheng Xie

[permalink] [raw]
Subject: [RFC PATCH 3/6] ACPI: IORT: Add stall and pasid properties to iommu_fwspec

According to ACPI IORT spec, named component specific data has a node
flags field whoes bit0 is for Stall support. However, it do not have any
field for pasid bit.

As PCIe SMMU support 20 pasid bits, this patch suggest to use 5 bits[5:1]
in node flags field for pasid bits which means we can have 32 pasid bits.
And this should be enough for platform device. Anyway, this is just a RFC.

Signed-off-by: Yisheng Xie <[email protected]>
---
drivers/acpi/arm64/iort.c | 20 ++++++++++++++++++++
include/acpi/actbl2.h | 5 +++++
2 files changed, 25 insertions(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 4d9ccdd0..514caca3 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -710,6 +710,22 @@ static bool iort_pci_rc_supports_ats(struct acpi_iort_node *node)
return pci_rc->ats_attribute & ACPI_IORT_ATS_SUPPORTED;
}

+static bool iort_platform_support_stall(struct acpi_iort_node *node)
+{
+ struct acpi_iort_named_component *ncomp;
+
+ ncomp = (struct acpi_iort_named_component *)node->node_data;
+ return ncomp->node_flags & ACPI_IORT_STALL_SUPPORTED;
+}
+
+static unsigned long iort_platform_get_pasid_bits(struct acpi_iort_node *node)
+{
+ struct acpi_iort_named_component *ncomp;
+
+ ncomp = (struct acpi_iort_named_component *)node->node_data;
+ return (ncomp->node_flags & ACPI_IORT_PASID_BITS_MASK) >> ACPI_IORT_PASID_BITS_SHIFT;
+}
+
/**
* iort_iommu_configure - Set-up IOMMU configuration for a device.
*
@@ -772,6 +788,10 @@ const struct iommu_ops *iort_iommu_configure(struct device *dev)
IORT_IOMMU_TYPE,
i++);
}
+ if (!IS_ERR_OR_NULL(ops)) {
+ dev->iommu_fwspec->can_stall = iort_platform_support_stall(node);
+ dev->iommu_fwspec->num_pasid_bits = iort_platform_get_pasid_bits(node);
+ }
}

/*
diff --git a/include/acpi/actbl2.h b/include/acpi/actbl2.h
index 707dda74..125b150 100644
--- a/include/acpi/actbl2.h
+++ b/include/acpi/actbl2.h
@@ -749,6 +749,11 @@ struct acpi_iort_named_component {
char device_name[1]; /* Path of namespace object */
};

+#define ACPI_IORT_STALL_SUPPORTED 0x00000001 /* The platform device supports stall */
+#define ACPI_IORT_STALL_UNSUPPORTED 0x00000000 /* The platform device doesn't support stall */
+#define ACPI_IORT_PASID_BITS_MASK 0x0000003e /* 5 bits for PASID BITS */
+#define ACPI_IORT_PASID_BITS_SHIFT 1 /* PASID BITS numbers shift */
+
struct acpi_iort_root_complex {
u64 memory_properties; /* Memory access properties */
u32 ats_attribute;
--
1.7.12.4

2017-08-31 08:29:28

by Yisheng Xie

[permalink] [raw]
Subject: [RFC PATCH 6/6] iommu/arm-smmu-v3: Avoid ILLEGAL setting of STE.S1STALLD and CD.S

It is ILLEGAL to set STE.S1STALLD if STALL_MODEL is not 0b00, which
means we should not disable stall mode if stall/terminate mode is not
configuable.

Meanwhile, it is also ILLEGAL when STALL_MODEL==0b10 && CD.S==0 which
means if stall mode is force we should always set CD.S.

This patch add ARM_SMMU_FEAT_TERMINATE feature bit for smmu, and use
TERMINATE feature checking to ensue above ILLEGAL cases from happening.

Signed-off-by: Yisheng Xie <[email protected]>
---
drivers/iommu/arm-smmu-v3.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index dbda2eb..0745522 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -55,6 +55,7 @@
#define IDR0_STALL_MODEL_SHIFT 24
#define IDR0_STALL_MODEL_MASK 0x3
#define IDR0_STALL_MODEL_STALL (0 << IDR0_STALL_MODEL_SHIFT)
+#define IDR0_STALL_MODEL_NS (1 << IDR0_STALL_MODEL_SHIFT)
#define IDR0_STALL_MODEL_FORCE (2 << IDR0_STALL_MODEL_SHIFT)
#define IDR0_TTENDIAN_SHIFT 21
#define IDR0_TTENDIAN_MASK 0x3
@@ -766,6 +767,7 @@ struct arm_smmu_device {
#define ARM_SMMU_FEAT_SVM (1 << 15)
#define ARM_SMMU_FEAT_HA (1 << 16)
#define ARM_SMMU_FEAT_HD (1 << 17)
+#define ARM_SMMU_FEAT_TERMINATE (1 << 18)
u32 features;

#define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
@@ -1402,6 +1404,7 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain,
u64 val;
bool cd_live;
__u64 *cdptr = arm_smmu_get_cd_ptr(smmu_domain, ssid);
+ struct arm_smmu_device *smmu = smmu_domain->smmu;

/*
* This function handles the following cases:
@@ -1468,9 +1471,11 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain,
CTXDESC_CD_0_V;

/*
- * FIXME: STALL_MODEL==0b10 && CD.S==0 is ILLEGAL
+ * STALL_MODEL==0b10 && CD.S==0 is ILLEGAL
*/
- if (ssid && smmu_domain->s1_cfg.can_stall)
+ if ((ssid && smmu_domain->s1_cfg.can_stall) ||
+ (!(smmu->features & ARM_SMMU_FEAT_TERMINATE) &&
+ smmu->features & ARM_SMMU_FEAT_STALLS))
val |= CTXDESC_CD_0_S;

cdptr[0] = cpu_to_le64(val);
@@ -1690,12 +1695,13 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
dst[1] |= STRTAB_STE_1_PPAR;

/*
- * FIXME: it is illegal to set S1STALLD if STALL_MODEL=0b10
- * (force). But according to the spec, it *must* be set for
+ * According to spec, it is illegal to set S1STALLD if
+ * STALL_MODEL is not 0b00. And it *must* be set for
* devices that aren't capable of stalling (notably pci!)
- * So we need a "STALL_MODEL=0b00" feature bit.
*/
- if (smmu->features & ARM_SMMU_FEAT_STALLS && !ste->can_stall)
+ if (smmu->features & ARM_SMMU_FEAT_STALLS &&
+ smmu->features & ARM_SMMU_FEAT_TERMINATE &&
+ !ste->can_stall)
dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);

val |= (s1ctxptr & STRTAB_STE_0_S1CTXPTR_MASK
@@ -4577,9 +4583,13 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)

switch (reg & IDR0_STALL_MODEL_MASK << IDR0_STALL_MODEL_SHIFT) {
case IDR0_STALL_MODEL_STALL:
+ smmu->features |= ARM_SMMU_FEAT_TERMINATE;
/* Fallthrough */
case IDR0_STALL_MODEL_FORCE:
smmu->features |= ARM_SMMU_FEAT_STALLS;
+ break;
+ case IDR0_STALL_MODEL_NS:
+ smmu->features |= ARM_SMMU_FEAT_TERMINATE;
}

if (reg & IDR0_S1P)
--
1.7.12.4

2017-08-31 08:29:26

by Yisheng Xie

[permalink] [raw]
Subject: [RFC PATCH 1/6] dt-bindings: document stall and PASID properties for IOMMU masters

From: Jean-Philippe Brucker <[email protected]>

Document the bindings for stall and PASID capable platform devices.

Signed-off-by: Jean-Philippe Brucker <[email protected]>
---
Documentation/devicetree/bindings/iommu/iommu.txt | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt
index 5a8b462..7355d7f 100644
--- a/Documentation/devicetree/bindings/iommu/iommu.txt
+++ b/Documentation/devicetree/bindings/iommu/iommu.txt
@@ -86,6 +86,19 @@ have a means to turn off translation. But it is invalid in such cases to
disable the IOMMU's device tree node in the first place because it would
prevent any driver from properly setting up the translations.

+Optional properties:
+--------------------
+- dma-can-stall: When present, the master can wait for a DMA transaction
+ to be handled by the IOMMU and can recover from a fault. For example, if
+ the page accessed by the DMA transaction isn't mapped, some IOMMUs are
+ able to stall the transaction and let the OS populate the page tables.
+ The IOMMU then performs the transaction if the fault was successfully
+ handled, or aborts the transaction otherwise.
+
+- pasid-bits: Some masters support multiple address spaces for DMA. By
+ tagging DMA transactions with an address space identifier. By default,
+ this is 0, which means that the device only has one address space.
+

Notes:
======
--
1.7.12.4

2017-08-31 08:30:31

by Yisheng Xie

[permalink] [raw]
Subject: [RFC PATCH 5/6] iommu/arm-smmu-v3: fix panic when handle stall mode irq

When SMMU do not support SVM feature, however the master support SVM,
which means matser can stall and with mult-pasid number, then the user
can bind a task to device using API like iommu_bind_task(). however,
when device trigger a stall mode fault i will cause panic:

[ 106.996087] Unable to handle kernel NULL pointer dereference at virtual address 00000100
[ 106.996122] user pgtable: 4k pages, 48-bit VAs, pgd = ffff80003e023000
[ 106.996150] [0000000000000100] *pgd=000000003e04a003, *pud=000000003e04b003, *pmd=0000000000000000
[ 106.996201] Internal error: Oops: 96000006 [#1] PREEMPT SM
[ 106.996224] Modules linked in:
[ 106.996256] CPU: 0 PID: 916 Comm: irq/14-arm-smmu Not tainted 4.13.0-rc5-00035-g1235ddd-dirty #67
[ 106.996288] Hardware name: Hisilicon PhosphorHi1383 ESL (DT)
[ 106.996317] task: ffff80003adc1c00 task.stack: ffff80003a9f8000
[ 106.996347] PC is at __queue_work+0x30/0x3a8
[ 106.996374] LR is at queue_work_on+0x60/0x78
[ 106.996401] pc : [<ffff0000080d7d10>] lr : [<ffff0000080d80e8>] pstate: 40c001c9
[ 106.996430] sp : ffff80003a9fbc20
[ 106.996451] x29: ffff80003a9fbc20 x28: ffff80003adc1c00
[ 106.996488] x27: ffff000008d05080 x26: ffff80003ab0e028
[ 106.996526] x25: ffff80003a9900ac x24: 0000000000000001
[ 106.996562] x23: 0000000000000040 x22: 0000000000000000
[ 106.996598] x21: 0000000000000000 x20: 0000000000000140
[ 106.996634] x19: ffff80003ab0e028 x18: 0000000000000010
[ 106.996670] x17: 0000ffffa52a5040 x16: ffff00000820f260
[ 106.996708] x15: 00000018e97629e0 x14: ffff80003fb89468
[ 106.996744] x13: 0000000000000000 x12: ffff80003abb0600
[ 106.996781] x11: 0000000000000000 x10: 0000010100000100
[ 106.996817] x9 : 0000ffff85de5010 x8 : 00000000e4830001
[ 106.996854] x7 : ffff80003a9fbcf8 x6 : 0000000fffffffe0
[ 106.996890] x5 : 0000000000000000 x4 : 0000000000000001
[ 106.996926] x3 : 0000000000000000 x2 : ffff80003ab0e028
[ 106.996962] x1 : 0000000000000000 x0 : 00000000000001c0
[ 106.997002] Process irq/14-arm-smmu (pid: 916, stack limit =0xffff80003a9f8000)
[ 106.997035] Stack: (0xffff80003a9fbc20 to 0xffff80003a9fc000)
[...]
[ 106.998366] Call trace:
[ 106.998842] [<ffff0000080d7d10>] __queue_work+0x30/0x3a8
[ 106.998874] [<ffff0000080d80e8>] queue_work_on+0x60/0x78
[ 106.998912] [<ffff00000857aae4>] arm_smmu_handle_stall+0x104/0x138
[ 106.998952] [<ffff00000857b150>] arm_smmu_evtq_thread+0xc0/0x158
[ 106.998989] [<ffff000008112128>] irq_thread_fn+0x28/0x68
[ 106.999025] [<ffff0000081123e0>] irq_thread+0x128/0x1d0
[ 106.999060] [<ffff0000080df6bc>] kthread+0xfc/0x128
[ 106.999093] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
[ 106.999130] Code: a90153f3 a90573fb d53b4220 363814c0 (b94102a0)
[ 106.999159] ---[ end trace 7e5c9f0cb1f2fecd ]---

And the resean is we donot init fault_queue while the fault handle need
to use it. Fix by return -EINVAL in arm_smmu_bind_task() when smmu do not
support the feature of SVM.

Signed-off-by: Yisheng Xie <[email protected]>
---
drivers/iommu/arm-smmu-v3.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d44256a..dbda2eb 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2922,6 +2922,8 @@ static int arm_smmu_bind_task(struct device *dev, struct task_struct *task,
return -EINVAL;

smmu = master->smmu;
+ if (!(smmu->features & ARM_SMMU_FEAT_SVM))
+ return -EINVAL;

domain = iommu_get_domain_for_dev(dev);
if (WARN_ON(!domain))
--
1.7.12.4

2017-08-31 08:30:27

by Yisheng Xie

[permalink] [raw]
Subject: [RFC PATCH 4/6] iommu/arm-smmu-v3: Add SVM support for platform devices

From: Jean-Philippe Brucker <[email protected]>

Platform device can realise SVM function by using the stall mode. That
is to say, when device access a memory via iova which is not populated,
it will stalled and when SMMU try to translate this iova, it also will
stall and meanwhile send an event to CPU via MSI.

After SMMU driver handle the event and populated the iova, it will send
a RESUME command to SMMU to exit the stall mode, therefore the platform
device can contiue access the memory.

Signed-off-by: Jean-Philippe Brucker <[email protected]>
---
drivers/iommu/arm-smmu-v3.c | 218 ++++++++++++++++++++++++++++++++++++--------
1 file changed, 181 insertions(+), 37 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index cafbeef..d44256a 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -359,6 +359,7 @@
#define CTXDESC_CD_0_TCR_HA_SHIFT 43
#define CTXDESC_CD_0_HD (1UL << CTXDESC_CD_0_TCR_HD_SHIFT)
#define CTXDESC_CD_0_HA (1UL << CTXDESC_CD_0_TCR_HA_SHIFT)
+#define CTXDESC_CD_0_S (1UL << 44)
#define CTXDESC_CD_0_R (1UL << 45)
#define CTXDESC_CD_0_A (1UL << 46)
#define CTXDESC_CD_0_ASET_SHIFT 47
@@ -432,6 +433,15 @@
#define CMDQ_PRI_1_RESP_FAIL (1UL << CMDQ_PRI_1_RESP_SHIFT)
#define CMDQ_PRI_1_RESP_SUCC (2UL << CMDQ_PRI_1_RESP_SHIFT)

+#define CMDQ_RESUME_0_SID_SHIFT 32
+#define CMDQ_RESUME_0_SID_MASK 0xffffffffUL
+#define CMDQ_RESUME_0_ACTION_SHIFT 12
+#define CMDQ_RESUME_0_ACTION_TERM (0UL << CMDQ_RESUME_0_ACTION_SHIFT)
+#define CMDQ_RESUME_0_ACTION_RETRY (1UL << CMDQ_RESUME_0_ACTION_SHIFT)
+#define CMDQ_RESUME_0_ACTION_ABORT (2UL << CMDQ_RESUME_0_ACTION_SHIFT)
+#define CMDQ_RESUME_1_STAG_SHIFT 0
+#define CMDQ_RESUME_1_STAG_MASK 0xffffUL
+
#define CMDQ_SYNC_0_CS_SHIFT 12
#define CMDQ_SYNC_0_CS_NONE (0UL << CMDQ_SYNC_0_CS_SHIFT)
#define CMDQ_SYNC_0_CS_SEV (2UL << CMDQ_SYNC_0_CS_SHIFT)
@@ -443,6 +453,31 @@
#define EVTQ_0_ID_SHIFT 0
#define EVTQ_0_ID_MASK 0xffUL

+#define EVT_ID_TRANSLATION_FAULT 0x10
+#define EVT_ID_ADDR_SIZE_FAULT 0x11
+#define EVT_ID_ACCESS_FAULT 0x12
+#define EVT_ID_PERMISSION_FAULT 0x13
+
+#define EVTQ_0_SSV (1UL << 11)
+#define EVTQ_0_SSID_SHIFT 12
+#define EVTQ_0_SSID_MASK 0xfffffUL
+#define EVTQ_0_SID_SHIFT 32
+#define EVTQ_0_SID_MASK 0xffffffffUL
+#define EVTQ_1_STAG_SHIFT 0
+#define EVTQ_1_STAG_MASK 0xffffUL
+#define EVTQ_1_STALL (1UL << 31)
+#define EVTQ_1_PRIV (1UL << 33)
+#define EVTQ_1_EXEC (1UL << 34)
+#define EVTQ_1_READ (1UL << 35)
+#define EVTQ_1_S2 (1UL << 39)
+#define EVTQ_1_CLASS_SHIFT 40
+#define EVTQ_1_CLASS_MASK 0x3UL
+#define EVTQ_1_TT_READ (1UL << 44)
+#define EVTQ_2_ADDR_SHIFT 0
+#define EVTQ_2_ADDR_MASK 0xffffffffffffffffUL
+#define EVTQ_3_IPA_SHIFT 12
+#define EVTQ_3_IPA_MASK 0xffffffffffUL
+
/* PRI queue */
#define PRIQ_ENT_DWORDS 2
#define PRIQ_MAX_SZ_SHIFT 8
@@ -586,6 +621,13 @@ struct arm_smmu_cmdq_ent {
enum fault_status resp;
} pri;

+ #define CMDQ_OP_RESUME 0x44
+ struct {
+ u32 sid;
+ u16 stag;
+ enum fault_status resp;
+ } resume;
+
#define CMDQ_OP_CMD_SYNC 0x46
};
};
@@ -659,6 +701,7 @@ struct arm_smmu_s1_cfg {

struct list_head contexts;
size_t num_contexts;
+ bool can_stall;

struct arm_smmu_ctx_desc cd; /* Default context (SSID0) */
};
@@ -682,6 +725,7 @@ struct arm_smmu_strtab_ent {
struct arm_smmu_s2_cfg *s2_cfg;

bool prg_response_needs_ssid;
+ bool can_stall;
};

struct arm_smmu_strtab_cfg {
@@ -816,6 +860,7 @@ struct arm_smmu_fault {
bool priv;

bool last;
+ bool stall;

struct work_struct work;
};
@@ -1098,6 +1143,21 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
return -EINVAL;
}
break;
+ case CMDQ_OP_RESUME:
+ switch (ent->resume.resp) {
+ case ARM_SMMU_FAULT_FAIL:
+ case ARM_SMMU_FAULT_DENY:
+ cmd[0] |= CMDQ_RESUME_0_ACTION_ABORT;
+ break;
+ case ARM_SMMU_FAULT_SUCC:
+ cmd[0] |= CMDQ_RESUME_0_ACTION_RETRY;
+ break;
+ default:
+ return -EINVAL;
+ }
+ cmd[0] |= (u64)ent->resume.sid << CMDQ_RESUME_0_SID_SHIFT;
+ cmd[1] |= ent->resume.stag << CMDQ_RESUME_1_STAG_SHIFT;
+ break;
case CMDQ_OP_CMD_SYNC:
cmd[0] |= CMDQ_SYNC_0_CS_SEV;
break;
@@ -1194,16 +1254,21 @@ static void arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu,
static void arm_smmu_fault_reply(struct arm_smmu_fault *fault,
enum fault_status resp)
{
- struct arm_smmu_cmdq_ent cmd = {
- .opcode = CMDQ_OP_PRI_RESP,
- .substream_valid = fault->ssv,
- .pri = {
- .sid = fault->sid,
- .ssid = fault->ssid,
- .grpid = fault->grpid,
- .resp = resp,
- },
- };
+ struct arm_smmu_cmdq_ent cmd = { 0 };
+
+ if (fault->stall) {
+ cmd.opcode = CMDQ_OP_RESUME;
+ cmd.resume.sid = fault->sid;
+ cmd.resume.stag = fault->grpid;
+ cmd.resume.resp = resp;
+ } else {
+ cmd.opcode = CMDQ_OP_PRI_RESP;
+ cmd.substream_valid = fault->ssv;
+ cmd.pri.sid = fault->sid;
+ cmd.pri.ssid = fault->ssid;
+ cmd.pri.grpid = fault->grpid;
+ cmd.pri.resp = resp;
+ }

if (!fault->last || resp == ARM_SMMU_FAULT_IGNORE)
return;
@@ -1402,8 +1467,13 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain,
(u64)cd->asid << CTXDESC_CD_0_ASID_SHIFT |
CTXDESC_CD_0_V;

- cdptr[0] = cpu_to_le64(val);
+ /*
+ * FIXME: STALL_MODEL==0b10 && CD.S==0 is ILLEGAL
+ */
+ if (ssid && smmu_domain->s1_cfg.can_stall)
+ val |= CTXDESC_CD_0_S;

+ cdptr[0] = cpu_to_le64(val);
}

if (ssid || cd_live)
@@ -1619,7 +1689,13 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
if (ste->prg_response_needs_ssid)
dst[1] |= STRTAB_STE_1_PPAR;

- if (smmu->features & ARM_SMMU_FEAT_STALLS)
+ /*
+ * FIXME: it is illegal to set S1STALLD if STALL_MODEL=0b10
+ * (force). But according to the spec, it *must* be set for
+ * devices that aren't capable of stalling (notably pci!)
+ * So we need a "STALL_MODEL=0b00" feature bit.
+ */
+ if (smmu->features & ARM_SMMU_FEAT_STALLS && !ste->can_stall)
dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);

val |= (s1ctxptr & STRTAB_STE_0_S1CTXPTR_MASK
@@ -1701,6 +1777,53 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
return 0;
}

+static void arm_smmu_handle_fault(struct work_struct *work);
+
+static int arm_smmu_handle_stall(struct arm_smmu_device *smmu, u64 *evt)
+{
+ u8 id = evt[0] >> EVTQ_0_ID_SHIFT & EVTQ_0_ID_MASK;
+ struct arm_smmu_fault *fault;
+ struct arm_smmu_fault params = {
+ .smmu = smmu,
+
+ .sid = evt[0] >> EVTQ_0_SID_SHIFT & EVTQ_0_SID_MASK,
+ .ssv = evt[0] & EVTQ_0_SSV,
+ .ssid = evt[0] >> EVTQ_0_SSID_SHIFT & EVTQ_0_SSID_MASK,
+ .grpid = evt[1] >> EVTQ_1_STAG_SHIFT & EVTQ_1_STAG_MASK,
+
+ .iova = evt[2] >> EVTQ_2_ADDR_SHIFT & EVTQ_2_ADDR_MASK,
+ .read = evt[1] & EVTQ_1_READ,
+ .write = !(evt[1] & EVTQ_1_READ),
+ .exec = evt[1] & EVTQ_1_EXEC,
+ .priv = evt[1] & EVTQ_1_PRIV,
+
+ .last = true,
+ .stall = true,
+ };
+
+ switch (id) {
+ case EVT_ID_TRANSLATION_FAULT:
+ case EVT_ID_ADDR_SIZE_FAULT:
+ case EVT_ID_ACCESS_FAULT:
+ case EVT_ID_PERMISSION_FAULT:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ fault = kmem_cache_alloc(arm_smmu_fault_cache, GFP_KERNEL);
+ if (!fault) {
+ arm_smmu_fault_reply(&params, ARM_SMMU_FAULT_FAIL);
+ return -ENOMEM;
+ }
+
+ *fault = params;
+ INIT_WORK(&fault->work, arm_smmu_handle_fault);
+ queue_work(smmu->fault_queue, &fault->work);
+
+ return 0;
+}
+
/* IRQ and event handlers */
static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
{
@@ -1713,6 +1836,10 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
while (!queue_remove_raw(q, evt)) {
u8 id = evt[0] >> EVTQ_0_ID_SHIFT & EVTQ_0_ID_MASK;

+ if (evt[1] & EVTQ_1_STALL &&
+ !arm_smmu_handle_stall(smmu, evt))
+ continue;
+
dev_info(smmu->dev, "event 0x%02x received:\n", id);
for (i = 0; i < ARRAY_SIZE(evt); ++i)
dev_info(smmu->dev, "\t0x%016llx\n",
@@ -1733,8 +1860,6 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
return IRQ_HANDLED;
}

-static void arm_smmu_handle_fault(struct work_struct *work);
-
static void arm_smmu_handle_ppr(struct arm_smmu_device *smmu, u64 *evt,
bool overflowing)
{
@@ -2648,11 +2773,13 @@ static enum fault_status _arm_smmu_handle_fault(struct arm_smmu_fault *fault)
goto out_put_context;
}

- list_for_each_entry(tmp_prg, &smmu_context->prgs, list) {
- if (tmp_prg->sid == fault->sid && tmp_prg->index ==
- fault->grpid) {
- prg = tmp_prg;
- break;
+ if (!fault->stall) {
+ list_for_each_entry(tmp_prg, &smmu_context->prgs, list) {
+ if (tmp_prg->sid == fault->sid &&
+ tmp_prg->index == fault->grpid) {
+ prg = tmp_prg;
+ break;
+ }
}
}

@@ -2742,8 +2869,7 @@ static void arm_smmu_flush_prgs(struct work_struct *work)

static bool arm_smmu_master_supports_svm(struct arm_smmu_master_data *master)
{
- return dev_is_pci(master->dev) && master->can_fault &&
- master->num_ssids > 1;
+ return master->can_fault && master->num_ssids > 1;
}

static int arm_smmu_set_svm_ops(struct device *dev,
@@ -2892,7 +3018,7 @@ static int arm_smmu_unbind_task(struct device *dev, int pasid)
* TODO: only do this if the context hasn't been invalidated yet (with a
* stop PASID).
*/
- if (master->can_fault) {
+ if (dev_is_pci(dev) && master->can_fault) {
ret = arm_smmu_flush_priq(smmu);
if (ret)
return ret;
@@ -2958,7 +3084,7 @@ static int arm_smmu_detach_all_contexts(struct arm_smmu_master_data *master)
spin_unlock(&smmu->contexts_lock);

/* Propagate the invalidations */
- if (master->can_fault) {
+ if (dev_is_pci(master->dev) && master->can_fault) {
arm_smmu_flush_priq(smmu); /* may fail */
flush_workqueue(smmu->fault_queue);
}
@@ -3271,8 +3397,10 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
if (!smmu_domain->smmu) {
smmu_domain->smmu = smmu;

- if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1)
+ if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
smmu_domain->s1_cfg.num_contexts = master->num_ssids;
+ smmu_domain->s1_cfg.can_stall = master->ste.can_stall;
+ }

ret = arm_smmu_domain_finalise(domain);
if (ret) {
@@ -3286,17 +3414,26 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
dev_name(smmu->dev));
ret = -ENXIO;
goto out_unlock;
- } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
- master->num_ssids > 1 &&
- smmu_domain->s1_cfg.num_contexts > master->num_ssids) {
- /*
- * New device does not support enough contexts for SVM. This is
- * OK within a domain (the device can simply be put in its own
- * domain) but within a group it might be problematic. FIXME
- */
- dev_err(dev, "cannot attach to domain (not enough contexts)");
- ret = -ESRCH;
- goto out_unlock;
+ } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+ /* Sanity checks */
+ if (master->num_ssids > 1 &&
+ smmu_domain->s1_cfg.num_contexts > master->num_ssids) {
+ /*
+ * New device does not support enough contexts for SVM.
+ * This is OK within a domain (the device can simply be
+ * put in its own domain) but within a group it might be
+ * problematic. FIXME
+ */
+ dev_err(dev, "cannot attach to domain (not enough contexts)");
+ ret = -ESRCH;
+ goto out_unlock;
+ }
+
+ if (master->ste.can_stall != smmu_domain->s1_cfg.can_stall) {
+ dev_err(dev, "devices within a domain must have the same stall capabilities");
+ ret = -EINVAL;
+ goto out_unlock;
+ }
}

ste->assigned = true;
@@ -3444,9 +3581,10 @@ static int arm_smmu_enable_ssid(struct arm_smmu_master_data *master)
int features;
int num_ssids;
struct pci_dev *pdev;
+ struct iommu_fwspec *fwspec = master->dev->iommu_fwspec;

if (!dev_is_pci(master->dev))
- return -ENOSYS;
+ return 1 << fwspec->num_pasid_bits;

pdev = to_pci_dev(master->dev);

@@ -3589,6 +3727,11 @@ static int arm_smmu_add_device(struct device *dev)
if (!arm_smmu_enable_ats(master))
arm_smmu_enable_pri(master);

+ if (fwspec->can_stall && smmu->features & ARM_SMMU_FEAT_STALLS) {
+ master->can_fault = true;
+ master->ste.can_stall = true;
+ }
+
group = iommu_group_get_for_dev(dev);
if (IS_ERR(group)) {
ret = PTR_ERR(group);
@@ -3937,7 +4080,8 @@ static int arm_smmu_init_structures(struct arm_smmu_device *smmu)
return ret;

if (smmu->features & ARM_SMMU_FEAT_SVM &&
- smmu->features & ARM_SMMU_FEAT_PRI) {
+ (smmu->features & ARM_SMMU_FEAT_PRI ||
+ smmu->features & ARM_SMMU_FEAT_STALLS)) {
/*
* Ensure strict ordering of the queue. We can't go reordering
* page faults willy nilly since they work in groups, with a
--
1.7.12.4

2017-08-31 08:30:25

by Yisheng Xie

[permalink] [raw]
Subject: [RFC PATCH 2/6] iommu/of: Add stall and pasid properties to iommu_fwspec

From: Jean-Philippe Brucker <[email protected]>

Add stall and pasid properties to iommu_fwspec.

Signed-off-by: Jean-Philippe Brucker <[email protected]>
---
drivers/iommu/of_iommu.c | 11 +++++++++++
include/linux/iommu.h | 2 ++
2 files changed, 12 insertions(+)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 84fa6b4..b926276 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -209,6 +209,16 @@ static bool of_pci_rc_supports_ats(struct device_node *rc_node)
break;
}

+ if (dev->iommu_fwspec) {
+ const __be32 *prop;
+
+ if (of_get_property(np, "dma-can-stall", NULL))
+ dev->iommu_fwspec->can_stall = true;
+
+ prop = of_get_property(np, "pasid-bits", NULL);
+ if (prop)
+ dev->iommu_fwspec->num_pasid_bits = be32_to_cpu(*prop);
+ }
+
return ops;
}

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 1c5a216..5f580de 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -387,6 +387,8 @@ struct iommu_fwspec {
void *iommu_priv;
u32 flags;
unsigned int num_ids;
+ unsigned long num_pasid_bits;
+ bool can_stall;
u32 ids[1];
};

--
1.7.12.4

2017-09-05 12:49:36

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [RFC PATCH 1/6] dt-bindings: document stall and PASID properties for IOMMU masters

On 31/08/17 09:20, Yisheng Xie wrote:
> From: Jean-Philippe Brucker <[email protected]>
>
> Document the bindings for stall and PASID capable platform devices.
>
> Signed-off-by: Jean-Philippe Brucker <[email protected]>

Huh? No, I don't think I did. Please leave the patch as you found it, and
ask me for a version I consider clean next time.

Thanks,
Jean

2017-09-05 12:49:45

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [RFC PATCH 2/6] iommu/of: Add stall and pasid properties to iommu_fwspec

On 31/08/17 09:20, Yisheng Xie wrote:
> From: Jean-Philippe Brucker <[email protected]>
>
> Add stall and pasid properties to iommu_fwspec.
>
> Signed-off-by: Jean-Philippe Brucker <[email protected]>

No. This is a draft, I didn't sign it off.

Thanks,
Jean

2017-09-05 12:50:17

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [RFC PATCH 4/6] iommu/arm-smmu-v3: Add SVM support for platform devices

On 31/08/17 09:20, Yisheng Xie wrote:
> From: Jean-Philippe Brucker <[email protected]>
>
> Platform device can realise SVM function by using the stall mode. That
> is to say, when device access a memory via iova which is not populated,
> it will stalled and when SMMU try to translate this iova, it also will
> stall and meanwhile send an event to CPU via MSI.
>
> After SMMU driver handle the event and populated the iova, it will send
> a RESUME command to SMMU to exit the stall mode, therefore the platform
> device can contiue access the memory.
>
> Signed-off-by: Jean-Philippe Brucker <[email protected]>

No. Please don't forge a signed-off-by under a commit message you wrote,
it's rude. I didn't sign it, didn't consider it fit for mainline or even
as an RFC, and wanted to have another read before sending. My mistake,
I'll think twice before sharing prototypes in the future.

Thanks,
Jean

2017-09-05 12:50:41

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [RFC PATCH 5/6] iommu/arm-smmu-v3: fix panic when handle stall mode irq

On 31/08/17 09:20, Yisheng Xie wrote:
> When SMMU do not support SVM feature, however the master support SVM,
> which means matser can stall and with mult-pasid number, then the user
> can bind a task to device using API like iommu_bind_task(). however,
> when device trigger a stall mode fault i will cause panic:
>
> [ 106.996087] Unable to handle kernel NULL pointer dereference at virtual address 00000100
> [ 106.996122] user pgtable: 4k pages, 48-bit VAs, pgd = ffff80003e023000
> [ 106.996150] [0000000000000100] *pgd=000000003e04a003, *pud=000000003e04b003, *pmd=0000000000000000
> [ 106.996201] Internal error: Oops: 96000006 [#1] PREEMPT SM
> [ 106.996224] Modules linked in:
> [ 106.996256] CPU: 0 PID: 916 Comm: irq/14-arm-smmu Not tainted 4.13.0-rc5-00035-g1235ddd-dirty #67
> [ 106.996288] Hardware name: Hisilicon PhosphorHi1383 ESL (DT)
> [ 106.996317] task: ffff80003adc1c00 task.stack: ffff80003a9f8000
> [ 106.996347] PC is at __queue_work+0x30/0x3a8
> [ 106.996374] LR is at queue_work_on+0x60/0x78
> [ 106.996401] pc : [<ffff0000080d7d10>] lr : [<ffff0000080d80e8>] pstate: 40c001c9
> [ 106.996430] sp : ffff80003a9fbc20
> [ 106.996451] x29: ffff80003a9fbc20 x28: ffff80003adc1c00
> [ 106.996488] x27: ffff000008d05080 x26: ffff80003ab0e028
> [ 106.996526] x25: ffff80003a9900ac x24: 0000000000000001
> [ 106.996562] x23: 0000000000000040 x22: 0000000000000000
> [ 106.996598] x21: 0000000000000000 x20: 0000000000000140
> [ 106.996634] x19: ffff80003ab0e028 x18: 0000000000000010
> [ 106.996670] x17: 0000ffffa52a5040 x16: ffff00000820f260
> [ 106.996708] x15: 00000018e97629e0 x14: ffff80003fb89468
> [ 106.996744] x13: 0000000000000000 x12: ffff80003abb0600
> [ 106.996781] x11: 0000000000000000 x10: 0000010100000100
> [ 106.996817] x9 : 0000ffff85de5010 x8 : 00000000e4830001
> [ 106.996854] x7 : ffff80003a9fbcf8 x6 : 0000000fffffffe0
> [ 106.996890] x5 : 0000000000000000 x4 : 0000000000000001
> [ 106.996926] x3 : 0000000000000000 x2 : ffff80003ab0e028
> [ 106.996962] x1 : 0000000000000000 x0 : 00000000000001c0
> [ 106.997002] Process irq/14-arm-smmu (pid: 916, stack limit =0xffff80003a9f8000)
> [ 106.997035] Stack: (0xffff80003a9fbc20 to 0xffff80003a9fc000)
> [...]
> [ 106.998366] Call trace:
> [ 106.998842] [<ffff0000080d7d10>] __queue_work+0x30/0x3a8
> [ 106.998874] [<ffff0000080d80e8>] queue_work_on+0x60/0x78
> [ 106.998912] [<ffff00000857aae4>] arm_smmu_handle_stall+0x104/0x138
> [ 106.998952] [<ffff00000857b150>] arm_smmu_evtq_thread+0xc0/0x158
> [ 106.998989] [<ffff000008112128>] irq_thread_fn+0x28/0x68
> [ 106.999025] [<ffff0000081123e0>] irq_thread+0x128/0x1d0
> [ 106.999060] [<ffff0000080df6bc>] kthread+0xfc/0x128
> [ 106.999093] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> [ 106.999130] Code: a90153f3 a90573fb d53b4220 363814c0 (b94102a0)
> [ 106.999159] ---[ end trace 7e5c9f0cb1f2fecd ]---
>
> And the resean is we donot init fault_queue while the fault handle need
> to use it.
>
> Fix by return -EINVAL in arm_smmu_bind_task() when smmu do not
> support the feature of SVM.
>
> Signed-off-by: Yisheng Xie <[email protected]>
> ---
> drivers/iommu/arm-smmu-v3.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index d44256a..dbda2eb 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -2922,6 +2922,8 @@ static int arm_smmu_bind_task(struct device *dev, struct task_struct *task,
> return -EINVAL;
>
> smmu = master->smmu;
> + if (!(smmu->features & ARM_SMMU_FEAT_SVM))
> + return -EINVAL;

FEAT_SVM is set when the SMMU supports the same page table format as the
MMU, it doesn't say anything about PRI/stall ability. To fix the above
splat we should either instantiate fault_queue even when !FEAT_SVM, or
avoid enabling master->can_fault and can_stall if !FEAT_SVM. I prefer the
latter.

Thanks,
Jean

2017-09-05 12:51:06

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [RFC PATCH 6/6] iommu/arm-smmu-v3: Avoid ILLEGAL setting of STE.S1STALLD and CD.S

On 31/08/17 09:20, Yisheng Xie wrote:
> It is ILLEGAL to set STE.S1STALLD if STALL_MODEL is not 0b00, which
> means we should not disable stall mode if stall/terminate mode is not
> configuable.
>
> Meanwhile, it is also ILLEGAL when STALL_MODEL==0b10 && CD.S==0 which
> means if stall mode is force we should always set CD.S.
>
> This patch add ARM_SMMU_FEAT_TERMINATE feature bit for smmu, and use
> TERMINATE feature checking to ensue above ILLEGAL cases from happening.
>
> Signed-off-by: Yisheng Xie <[email protected]>
> ---
> drivers/iommu/arm-smmu-v3.c | 22 ++++++++++++++++------
> 1 file changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index dbda2eb..0745522 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -55,6 +55,7 @@
> #define IDR0_STALL_MODEL_SHIFT 24
> #define IDR0_STALL_MODEL_MASK 0x3
> #define IDR0_STALL_MODEL_STALL (0 << IDR0_STALL_MODEL_SHIFT)
> +#define IDR0_STALL_MODEL_NS (1 << IDR0_STALL_MODEL_SHIFT)
> #define IDR0_STALL_MODEL_FORCE (2 << IDR0_STALL_MODEL_SHIFT)
> #define IDR0_TTENDIAN_SHIFT 21
> #define IDR0_TTENDIAN_MASK 0x3
> @@ -766,6 +767,7 @@ struct arm_smmu_device {
> #define ARM_SMMU_FEAT_SVM (1 << 15)
> #define ARM_SMMU_FEAT_HA (1 << 16)
> #define ARM_SMMU_FEAT_HD (1 << 17)
> +#define ARM_SMMU_FEAT_TERMINATE (1 << 18)

I'd rather introduce something like "ARM_SMMU_FEAT_STALL_FORCE" instead.
Terminate model has another meaning, and is defined by a different bit in
IDR0.

Thanks,
Jean

2017-09-05 12:52:45

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add platform device SVM support for ARM SMMUv3

On 31/08/17 09:20, Yisheng Xie wrote:
> Jean-Philippe has post a patchset for Adding PCIe SVM support to ARM SMMUv3:
> https://www.spinics.net/lists/arm-kernel/msg565155.html
>
> But for some platform devices(aka on-chip integrated devices), there is also
> SVM requirement, which works based on the SMMU stall mode.
> Jean-Philippe has prepared a prototype patchset to support it:
> git://linux-arm.org/linux-jpb.git svm/stall

Only meant for testing at that point, and unfit even for an RFC.

> We tested this patchset with some fixes on a on-chip integrated device. The
> basic function is ok, so I just send them out for review, although this
> patchset heavily depends on the former patchset (PCIe SVM support for ARM
> SMMUv3), which is still under discussion.
>
> Patch Overview:
> *1 to 3 prepare for device tree or acpi get the device stall ability and pasid bits
> *4 is to realise the SVM function for platform device
> *5 is fix a bug when test SVM function while SMMU donnot support this feature
> *6 avoid ILLEGAL setting of STE and CD entry about stall
>
> Acctually here, I also have a question about SVM on SMMUv3:
>
> 1. Why the SVM feature on SMMUv3 depends on BTM feature? when bind a task to device,
> it will register a mmu_notify. Therefore, when a page range is invalid, we can
> send TLBI or ATC invalid without BTM?

We could, but the end goal for SVM is to perfectly mirror the CPU page
tables. So for platform SVM we would like to get rid of MMU notifiers
entirely.

> 2. According to ACPI IORT spec, named component specific data has a node flags field
> whoes bit0 is for Stall support. However, it do not have any field for pasid bit.
> Can we use other 5 bits[5:1] for pasid bit numbers, so we can have 32 pasid bit for
> a single platform device which should be enough, because SMMU only support 20 bit pasid
>
> 3. Presently, the pasid is allocate for a task but not for a context, if a task is trying
> to bind to 2 device A and B:
> a) A support 5 pasid bits
> b) B support 2 pasid bits
> c) when the task bind to device A, it allocate pasid = 16
> d) then it must be fail when trying to bind to task B, for its highest pasid is 4.
> So it should allocate a single pasid for a context to avoid this?

Ideally yes, but the model chosen for the IOMMU API was one PASID per
task, so I implemented this model (the PASID allocator will be common to
IOMMU core in the future).

Therefore the PASID allocation will fail in your example, and there is no
way around it. If you do (d) then (c), the task will have PASID 4.

Thanks,
Jean

2017-09-06 00:51:47

by Liubo(OS Lab)

[permalink] [raw]
Subject: Re: [RFC PATCH 4/6] iommu/arm-smmu-v3: Add SVM support for platform devices

On 2017/9/5 20:53, Jean-Philippe Brucker wrote:
> On 31/08/17 09:20, Yisheng Xie wrote:
>> From: Jean-Philippe Brucker <[email protected]>
>>
>> Platform device can realise SVM function by using the stall mode. That
>> is to say, when device access a memory via iova which is not populated,
>> it will stalled and when SMMU try to translate this iova, it also will
>> stall and meanwhile send an event to CPU via MSI.
>>
>> After SMMU driver handle the event and populated the iova, it will send
>> a RESUME command to SMMU to exit the stall mode, therefore the platform
>> device can contiue access the memory.
>>
>> Signed-off-by: Jean-Philippe Brucker <[email protected]>
>
> No. Please don't forge a signed-off-by under a commit message you wrote,

Really sorry for that.
We sent out the wrong version, I should take more careful review.

Regards,
Liubo

> it's rude. I didn't sign it, didn't consider it fit for mainline or even
> as an RFC, and wanted to have another read before sending. My mistake,
> I'll think twice before sharing prototypes in the future.
>
> Thanks,
> Jean
>



2017-09-06 01:03:44

by Liubo(OS Lab)

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add platform device SVM support for ARM SMMUv3

On 2017/9/5 20:56, Jean-Philippe Brucker wrote:
> On 31/08/17 09:20, Yisheng Xie wrote:
>> Jean-Philippe has post a patchset for Adding PCIe SVM support to ARM SMMUv3:
>> https://www.spinics.net/lists/arm-kernel/msg565155.html
>>
>> But for some platform devices(aka on-chip integrated devices), there is also
>> SVM requirement, which works based on the SMMU stall mode.
>> Jean-Philippe has prepared a prototype patchset to support it:
>> git://linux-arm.org/linux-jpb.git svm/stall
>
> Only meant for testing at that point, and unfit even for an RFC.
>

Sorry for the misunderstanding.
The PRI mode patches is in RFC even no hardware for testing, so I thought it's fine for "Stall mode" patches sent as RFC.
We have tested the Stall mode on our platform.
Anyway, I should confirm with you in advance.

Btw, Would you consider the "stall mode" upstream at first? Since there is no hardware for testing the PRI mode.
(We can provide you the hardware which support SMMU stall mode if necessary.)

>> We tested this patchset with some fixes on a on-chip integrated device. The
>> basic function is ok, so I just send them out for review, although this
>> patchset heavily depends on the former patchset (PCIe SVM support for ARM
>> SMMUv3), which is still under discussion.
>>
>> Patch Overview:
>> *1 to 3 prepare for device tree or acpi get the device stall ability and pasid bits
>> *4 is to realise the SVM function for platform device
>> *5 is fix a bug when test SVM function while SMMU donnot support this feature
>> *6 avoid ILLEGAL setting of STE and CD entry about stall
>>
>> Acctually here, I also have a question about SVM on SMMUv3:
>>
>> 1. Why the SVM feature on SMMUv3 depends on BTM feature? when bind a task to device,
>> it will register a mmu_notify. Therefore, when a page range is invalid, we can
>> send TLBI or ATC invalid without BTM?
>
> We could, but the end goal for SVM is to perfectly mirror the CPU page
> tables. So for platform SVM we would like to get rid of MMU notifiers
> entirely.
>
>> 2. According to ACPI IORT spec, named component specific data has a node flags field
>> whoes bit0 is for Stall support. However, it do not have any field for pasid bit.
>> Can we use other 5 bits[5:1] for pasid bit numbers, so we can have 32 pasid bit for
>> a single platform device which should be enough, because SMMU only support 20 bit pasid
>>

Any comment on this?
The ACPI IORT spec may need be updated?

Regards,
Liubo

>> 3. Presently, the pasid is allocate for a task but not for a context, if a task is trying
>> to bind to 2 device A and B:
>> a) A support 5 pasid bits
>> b) B support 2 pasid bits
>> c) when the task bind to device A, it allocate pasid = 16
>> d) then it must be fail when trying to bind to task B, for its highest pasid is 4.
>> So it should allocate a single pasid for a context to avoid this?
>
> Ideally yes, but the model chosen for the IOMMU API was one PASID per
> task, so I implemented this model (the PASID allocator will be common to
> IOMMU core in the future).
>
> Therefore the PASID allocation will fail in your example, and there is no
> way around it. If you do (d) then (c), the task will have PASID 4.
>
> Thanks,
> Jean
>
> .
>


2017-09-06 01:17:56

by Yisheng Xie

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add platform device SVM support for ARM SMMUv3

Hi Jean-Philippe,

On 2017/9/5 20:56, Jean-Philippe Brucker wrote:
> On 31/08/17 09:20, Yisheng Xie wrote:
>> Jean-Philippe has post a patchset for Adding PCIe SVM support to ARM SMMUv3:
>> https://www.spinics.net/lists/arm-kernel/msg565155.html
>>
>> But for some platform devices(aka on-chip integrated devices), there is also
>> SVM requirement, which works based on the SMMU stall mode.
>> Jean-Philippe has prepared a prototype patchset to support it:
>> git://linux-arm.org/linux-jpb.git svm/stall
>
> Only meant for testing at that point, and unfit even for an RFC.

Sorry about that, I should ask you before send it out. It's my mistake. For I also
have some question about this patchset.

We have related device, and would like to do some help about it. Do you have
any plan about upstream ?

>
>> We tested this patchset with some fixes on a on-chip integrated device. The
>> basic function is ok, so I just send them out for review, although this
>> patchset heavily depends on the former patchset (PCIe SVM support for ARM
>> SMMUv3), which is still under discussion.
>>
>> Patch Overview:
>> *1 to 3 prepare for device tree or acpi get the device stall ability and pasid bits
>> *4 is to realise the SVM function for platform device
>> *5 is fix a bug when test SVM function while SMMU donnot support this feature
>> *6 avoid ILLEGAL setting of STE and CD entry about stall
>>
>> Acctually here, I also have some questions about SVM on SMMUv3:
>>
>> 1. Why the SVM feature on SMMUv3 depends on BTM feature? when bind a task to device,
>> it will register a mmu_notify. Therefore, when a page range is invalid, we can
>> send TLBI or ATC invalid without BTM?
>
> We could, but the end goal for SVM is to perfectly mirror the CPU page
> tables. So for platform SVM we would like to get rid of MMU notifiers
> entirely.

I see, but for some SMMU which do not support BTM, it cannot benefit from SVM.

Meanwhile, do you mean even with BTM feature, the PCI-e device also need to send a
ATC invalid by MMU notify? It seems not fair, why not hardware do the entirely work
in this case? It may costly for send ATC invalid and sync.

>
>> 2. According to ACPI IORT spec, named component specific data has a node flags field
>> whoes bit0 is for Stall support. However, it do not have any field for pasid bit.
>> Can we use other 5 bits[5:1] for pasid bit numbers, so we can have 32 pasid bit for
>> a single platform device which should be enough, because SMMU only support 20 bit pasid
>>
>> 3. Presently, the pasid is allocate for a task but not for a context, if a task is trying
>> to bind to 2 device A and B:
>> a) A support 5 pasid bits
>> b) B support 2 pasid bits
>> c) when the task bind to device A, it allocate pasid = 16
>> d) then it must be fail when trying to bind to task B, for its highest pasid is 4.
>> So it should allocate a single pasid for a context to avoid this?
>
> Ideally yes, but the model chosen for the IOMMU API was one PASID per
> task, so I implemented this model (the PASID allocator will be common to
> IOMMU core in the future).
It is fair, for each IOMMU need PASID allocator to support SVM.

Thanks
Yisheng Xie

>
> Therefore the PASID allocation will fail in your example, and there is no
> way around it. If you do (d) then (c), the task will have PASID 4.
>
> Thanks,
> Jean
>
> .
>

2017-09-06 01:21:15

by Yisheng Xie

[permalink] [raw]
Subject: Re: [RFC PATCH 4/6] iommu/arm-smmu-v3: Add SVM support for platform devices

Hi Jean-Philippe,

On 2017/9/6 8:51, Bob Liu wrote:
> On 2017/9/5 20:53, Jean-Philippe Brucker wrote:
>> On 31/08/17 09:20, Yisheng Xie wrote:
>>> From: Jean-Philippe Brucker <[email protected]>
>>>
>>> Platform device can realise SVM function by using the stall mode. That
>>> is to say, when device access a memory via iova which is not populated,
>>> it will stalled and when SMMU try to translate this iova, it also will
>>> stall and meanwhile send an event to CPU via MSI.
>>>
>>> After SMMU driver handle the event and populated the iova, it will send
>>> a RESUME command to SMMU to exit the stall mode, therefore the platform
>>> device can contiue access the memory.
>>>
>>> Signed-off-by: Jean-Philippe Brucker <[email protected]>
>>
>> No. Please don't forge a signed-off-by under a commit message you wrote,

Sorry about that, it is my mistake.

>
> Really sorry for that.
> We sent out the wrong version, I should take more careful review.
>
> Regards,
> Liubo
>
>> it's rude. I didn't sign it, didn't consider it fit for mainline or even
>> as an RFC, and wanted to have another read before sending. My mistake,
>> I'll think twice before sharing prototypes in the future.
>>
>> Thanks,
>> Jean
>>
>
>
>
>
> .
>

2017-09-06 01:24:35

by Hanjun Guo

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add platform device SVM support for ARM SMMUv3

On 2017/8/31 16:20, Yisheng Xie wrote:
> Jean-Philippe has post a patchset for Adding PCIe SVM support to ARM SMMUv3:
> https://www.spinics.net/lists/arm-kernel/msg565155.html
>
> But for some platform devices(aka on-chip integrated devices), there is also
> SVM requirement, which works based on the SMMU stall mode.
> Jean-Philippe has prepared a prototype patchset to support it:
> git://linux-arm.org/linux-jpb.git svm/stall
>
> We tested this patchset with some fixes on a on-chip integrated device. The
> basic function is ok, so I just send them out for review, although this
> patchset heavily depends on the former patchset (PCIe SVM support for ARM
> SMMUv3), which is still under discussion.
>
> Patch Overview:
> *1 to 3 prepare for device tree or acpi get the device stall ability and pasid bits
> *4 is to realise the SVM function for platform device
> *5 is fix a bug when test SVM function while SMMU donnot support this feature
> *6 avoid ILLEGAL setting of STE and CD entry about stall
>
> Acctually here, I also have a question about SVM on SMMUv3:
>
> 1. Why the SVM feature on SMMUv3 depends on BTM feature? when bind a task to device,
> it will register a mmu_notify. Therefore, when a page range is invalid, we can
> send TLBI or ATC invalid without BTM?
>
> 2. According to ACPI IORT spec, named component specific data has a node flags field
> whoes bit0 is for Stall support. However, it do not have any field for pasid bit.
> Can we use other 5 bits[5:1] for pasid bit numbers, so we can have 32 pasid bit for
> a single platform device which should be enough, because SMMU only support 20 bit pasid

I think we can propose something similar, it's a missing function in
IORT.

Thanks
Hanjun

2017-09-06 02:24:15

by Yisheng Xie

[permalink] [raw]
Subject: Re: [RFC PATCH 6/6] iommu/arm-smmu-v3: Avoid ILLEGAL setting of STE.S1STALLD and CD.S

Hi Jean-Philippe,

On 2017/9/5 20:54, Jean-Philippe Brucker wrote:
> On 31/08/17 09:20, Yisheng Xie wrote:
>> It is ILLEGAL to set STE.S1STALLD if STALL_MODEL is not 0b00, which
>> means we should not disable stall mode if stall/terminate mode is not
>> configuable.
>>
>> Meanwhile, it is also ILLEGAL when STALL_MODEL==0b10 && CD.S==0 which
>> means if stall mode is force we should always set CD.S.
>>
>> This patch add ARM_SMMU_FEAT_TERMINATE feature bit for smmu, and use
>> TERMINATE feature checking to ensue above ILLEGAL cases from happening.
>>
>> Signed-off-by: Yisheng Xie <[email protected]>
>> ---
>> drivers/iommu/arm-smmu-v3.c | 22 ++++++++++++++++------
>> 1 file changed, 16 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index dbda2eb..0745522 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -55,6 +55,7 @@
>> #define IDR0_STALL_MODEL_SHIFT 24
>> #define IDR0_STALL_MODEL_MASK 0x3
>> #define IDR0_STALL_MODEL_STALL (0 << IDR0_STALL_MODEL_SHIFT)
>> +#define IDR0_STALL_MODEL_NS (1 << IDR0_STALL_MODEL_SHIFT)
>> #define IDR0_STALL_MODEL_FORCE (2 << IDR0_STALL_MODEL_SHIFT)
>> #define IDR0_TTENDIAN_SHIFT 21
>> #define IDR0_TTENDIAN_MASK 0x3
>> @@ -766,6 +767,7 @@ struct arm_smmu_device {
>> #define ARM_SMMU_FEAT_SVM (1 << 15)
>> #define ARM_SMMU_FEAT_HA (1 << 16)
>> #define ARM_SMMU_FEAT_HD (1 << 17)
>> +#define ARM_SMMU_FEAT_TERMINATE (1 << 18)
>
> I'd rather introduce something like "ARM_SMMU_FEAT_STALL_FORCE" instead.
> Terminate model has another meaning, and is defined by a different bit in
> IDR0.

Ok, sound more reasonable.

Thanks
Yisheng Xie

>
> Thanks,
> Jean
>
> .
>

2017-09-06 09:54:20

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add platform device SVM support for ARM SMMUv3

On 06/09/17 02:02, Bob Liu wrote:
> On 2017/9/5 20:56, Jean-Philippe Brucker wrote:
>> On 31/08/17 09:20, Yisheng Xie wrote:
>>> Jean-Philippe has post a patchset for Adding PCIe SVM support to ARM SMMUv3:
>>> https://www.spinics.net/lists/arm-kernel/msg565155.html
>>>
>>> But for some platform devices(aka on-chip integrated devices), there is also
>>> SVM requirement, which works based on the SMMU stall mode.
>>> Jean-Philippe has prepared a prototype patchset to support it:
>>> git://linux-arm.org/linux-jpb.git svm/stall
>>
>> Only meant for testing at that point, and unfit even for an RFC.
>>
>
> Sorry for the misunderstanding.
> The PRI mode patches is in RFC even no hardware for testing, so I thought it's fine for "Stall mode" patches sent as RFC.
> We have tested the Stall mode on our platform.
> Anyway, I should confirm with you in advance.
>
> Btw, Would you consider the "stall mode" upstream at first? Since there is no hardware for testing the PRI mode.
> (We can provide you the hardware which support SMMU stall mode if necessary.)

Yes. What's blocking the ATS, PRI and PASID patches at the moment is the
lack of endpoints for testing. There has been lots of discussion on the
API side since my first RFC and I'd like to resubmit the API changes soon.
It is the same API for ATS+PRI+PASID and SSID+Stall, so the backend
doesn't matter.

I'm considering upstreaming SSID+Stall first if it can be tested on
hardware (having direct access to it would certainly speed things up).
This would require some work in moving the PCI bits at the end of the
series. I can reserve some time in the coming months to do it, but I need
to know what to focus on. Are you able to test SSID as well?

>>> We tested this patchset with some fixes on a on-chip integrated device. The
>>> basic function is ok, so I just send them out for review, although this
>>> patchset heavily depends on the former patchset (PCIe SVM support for ARM
>>> SMMUv3), which is still under discussion.
>>>
>>> Patch Overview:
>>> *1 to 3 prepare for device tree or acpi get the device stall ability and pasid bits
>>> *4 is to realise the SVM function for platform device
>>> *5 is fix a bug when test SVM function while SMMU donnot support this feature
>>> *6 avoid ILLEGAL setting of STE and CD entry about stall
>>>
>>> Acctually here, I also have a question about SVM on SMMUv3:
>>>
>>> 1. Why the SVM feature on SMMUv3 depends on BTM feature? when bind a task to device,
>>> it will register a mmu_notify. Therefore, when a page range is invalid, we can
>>> send TLBI or ATC invalid without BTM?
>>
>> We could, but the end goal for SVM is to perfectly mirror the CPU page
>> tables. So for platform SVM we would like to get rid of MMU notifiers
>> entirely.
>>
>>> 2. According to ACPI IORT spec, named component specific data has a node flags field
>>> whoes bit0 is for Stall support. However, it do not have any field for pasid bit.
>>> Can we use other 5 bits[5:1] for pasid bit numbers, so we can have 32 pasid bit for
>>> a single platform device which should be enough, because SMMU only support 20 bit pasid
>>>
>
> Any comment on this?
> The ACPI IORT spec may need be updated?

I suppose that the Named Component Node could be used for SSID and stall
capability bits. Can't the ACPI namespace entries be extended to host
these capabilities in a more generic way? Platforms with different IOMMUs
might also need this information some day.

Thanks,
Jean

2017-09-06 09:56:28

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add platform device SVM support for ARM SMMUv3

On 06/09/17 02:16, Yisheng Xie wrote:
> Hi Jean-Philippe,
>
> On 2017/9/5 20:56, Jean-Philippe Brucker wrote:
>> On 31/08/17 09:20, Yisheng Xie wrote:
>>> Jean-Philippe has post a patchset for Adding PCIe SVM support to ARM SMMUv3:
>>> https://www.spinics.net/lists/arm-kernel/msg565155.html
>>>
>>> But for some platform devices(aka on-chip integrated devices), there is also
>>> SVM requirement, which works based on the SMMU stall mode.
>>> Jean-Philippe has prepared a prototype patchset to support it:
>>> git://linux-arm.org/linux-jpb.git svm/stall
>>
>> Only meant for testing at that point, and unfit even for an RFC.
>
> Sorry about that, I should ask you before send it out. It's my mistake. For I also
> have some question about this patchset.
>
> We have related device, and would like to do some help about it. Do you have
> any plan about upstream ?
>
>>
>>> We tested this patchset with some fixes on a on-chip integrated device. The
>>> basic function is ok, so I just send them out for review, although this
>>> patchset heavily depends on the former patchset (PCIe SVM support for ARM
>>> SMMUv3), which is still under discussion.
>>>
>>> Patch Overview:
>>> *1 to 3 prepare for device tree or acpi get the device stall ability and pasid bits
>>> *4 is to realise the SVM function for platform device
>>> *5 is fix a bug when test SVM function while SMMU donnot support this feature
>>> *6 avoid ILLEGAL setting of STE and CD entry about stall
>>>
>>> Acctually here, I also have some questions about SVM on SMMUv3:
>>>
>>> 1. Why the SVM feature on SMMUv3 depends on BTM feature? when bind a task to device,
>>> it will register a mmu_notify. Therefore, when a page range is invalid, we can
>>> send TLBI or ATC invalid without BTM?
>>
>> We could, but the end goal for SVM is to perfectly mirror the CPU page
>> tables. So for platform SVM we would like to get rid of MMU notifiers
>> entirely.
>
> I see, but for some SMMU which do not support BTM, it cannot benefit from SVM.
>
> Meanwhile, do you mean even with BTM feature, the PCI-e device also need to send a
> ATC invalid by MMU notify? It seems not fair, why not hardware do the entirely work
> in this case? It may costly for send ATC invalid and sync.

It will certainly be costly. But there are major problems with
transforming broadcast TLB maintenance into ATC invalidations in HW:

* VMID:ASID to SID:SSID conversion. TLBIs use VMID:ASID, while ATCIs use
SID:SSID.

* Most importantly, ATC invalidations accounting. Each endpoint has a
limited number of in-flight ATC invalidate requests. The conversion module
would have to buffer incoming invalidations and wait for in-flight ATC
invalidation to complete before sending the next ones. In case of
overflow, either we lose invalidation (which opens security holes) or we
somehow put back-pressure on the interconnect (no idea how feasible this
is, I suspect really hard).

Solving the last one is also quite difficult in software, but at least we
can still invalidate a range. In hardware we would invalidate the ATC
page-by-page and quickly jam the bus.

Thanks,
Jean

2017-09-07 01:44:09

by Liubo(OS Lab)

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add platform device SVM support for ARM SMMUv3

On 2017/9/6 17:57, Jean-Philippe Brucker wrote:
> On 06/09/17 02:02, Bob Liu wrote:
>> On 2017/9/5 20:56, Jean-Philippe Brucker wrote:
>>> On 31/08/17 09:20, Yisheng Xie wrote:
>>>> Jean-Philippe has post a patchset for Adding PCIe SVM support to ARM SMMUv3:
>>>> https://www.spinics.net/lists/arm-kernel/msg565155.html
>>>>
>>>> But for some platform devices(aka on-chip integrated devices), there is also
>>>> SVM requirement, which works based on the SMMU stall mode.
>>>> Jean-Philippe has prepared a prototype patchset to support it:
>>>> git://linux-arm.org/linux-jpb.git svm/stall
>>>
>>> Only meant for testing at that point, and unfit even for an RFC.
>>>
>>
>> Sorry for the misunderstanding.
>> The PRI mode patches is in RFC even no hardware for testing, so I thought it's fine for "Stall mode" patches sent as RFC.
>> We have tested the Stall mode on our platform.
>> Anyway, I should confirm with you in advance.
>>
>> Btw, Would you consider the "stall mode" upstream at first? Since there is no hardware for testing the PRI mode.
>> (We can provide you the hardware which support SMMU stall mode if necessary.)
>
> Yes. What's blocking the ATS, PRI and PASID patches at the moment is the
> lack of endpoints for testing. There has been lots of discussion on the
> API side since my first RFC and I'd like to resubmit the API changes soon.
> It is the same API for ATS+PRI+PASID and SSID+Stall, so the backend
> doesn't matter.
>

Indeed!

> I'm considering upstreaming SSID+Stall first if it can be tested on
> hardware (having direct access to it would certainly speed things up).

Glad to hear that.

> This would require some work in moving the PCI bits at the end of the
> series. I can reserve some time in the coming months to do it, but I need
> to know what to focus on. Are you able to test SSID as well?
>

Yes, but the difficulty is our devices are on-chip integrated hardware accelerators which requires complicate driver.
You may need much time to understand the driver.
That's the same case as intel/amd SVM, the current user is their GPU :-(

Btw, what kind of device/method do you think is ideal for testing arm-SVM?

>>>> We tested this patchset with some fixes on a on-chip integrated device. The
>>>> basic function is ok, so I just send them out for review, although this
>>>> patchset heavily depends on the former patchset (PCIe SVM support for ARM
>>>> SMMUv3), which is still under discussion.
>>>>
>>>> Patch Overview:
>>>> *1 to 3 prepare for device tree or acpi get the device stall ability and pasid bits
>>>> *4 is to realise the SVM function for platform device
>>>> *5 is fix a bug when test SVM function while SMMU donnot support this feature
>>>> *6 avoid ILLEGAL setting of STE and CD entry about stall
>>>>
>>>> Acctually here, I also have a question about SVM on SMMUv3:
>>>>
>>>> 1. Why the SVM feature on SMMUv3 depends on BTM feature? when bind a task to device,
>>>> it will register a mmu_notify. Therefore, when a page range is invalid, we can
>>>> send TLBI or ATC invalid without BTM?
>>>
>>> We could, but the end goal for SVM is to perfectly mirror the CPU page
>>> tables. So for platform SVM we would like to get rid of MMU notifiers
>>> entirely.
>>>
>>>> 2. According to ACPI IORT spec, named component specific data has a node flags field
>>>> whoes bit0 is for Stall support. However, it do not have any field for pasid bit.
>>>> Can we use other 5 bits[5:1] for pasid bit numbers, so we can have 32 pasid bit for
>>>> a single platform device which should be enough, because SMMU only support 20 bit pasid
>>>>
>>
>> Any comment on this?
>> The ACPI IORT spec may need be updated?
>
> I suppose that the Named Component Node could be used for SSID and stall
> capability bits. Can't the ACPI namespace entries be extended to host
> these capabilities in a more generic way? Platforms with different IOMMUs
> might also need this information some day.
>

Hmm, that would be better.
But in anyway, it depends on the ACPI IORT Spec would be extended in next version.

--
Thanks,
Bob Liu



2017-09-07 01:58:51

by Liubo(OS Lab)

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add platform device SVM support for ARM SMMUv3

On 2017/9/6 17:59, Jean-Philippe Brucker wrote:
> On 06/09/17 02:16, Yisheng Xie wrote:
>> Hi Jean-Philippe,
>>
>> On 2017/9/5 20:56, Jean-Philippe Brucker wrote:
>>> On 31/08/17 09:20, Yisheng Xie wrote:
>>>> Jean-Philippe has post a patchset for Adding PCIe SVM support to ARM SMMUv3:
>>>> https://www.spinics.net/lists/arm-kernel/msg565155.html
>>>>
>>>> But for some platform devices(aka on-chip integrated devices), there is also
>>>> SVM requirement, which works based on the SMMU stall mode.
>>>> Jean-Philippe has prepared a prototype patchset to support it:
>>>> git://linux-arm.org/linux-jpb.git svm/stall
>>>
>>> Only meant for testing at that point, and unfit even for an RFC.
>>
>> Sorry about that, I should ask you before send it out. It's my mistake. For I also
>> have some question about this patchset.
>>
>> We have related device, and would like to do some help about it. Do you have
>> any plan about upstream ?
>>
>>>
>>>> We tested this patchset with some fixes on a on-chip integrated device. The
>>>> basic function is ok, so I just send them out for review, although this
>>>> patchset heavily depends on the former patchset (PCIe SVM support for ARM
>>>> SMMUv3), which is still under discussion.
>>>>
>>>> Patch Overview:
>>>> *1 to 3 prepare for device tree or acpi get the device stall ability and pasid bits
>>>> *4 is to realise the SVM function for platform device
>>>> *5 is fix a bug when test SVM function while SMMU donnot support this feature
>>>> *6 avoid ILLEGAL setting of STE and CD entry about stall
>>>>
>>>> Acctually here, I also have some questions about SVM on SMMUv3:
>>>>
>>>> 1. Why the SVM feature on SMMUv3 depends on BTM feature? when bind a task to device,
>>>> it will register a mmu_notify. Therefore, when a page range is invalid, we can
>>>> send TLBI or ATC invalid without BTM?
>>>
>>> We could, but the end goal for SVM is to perfectly mirror the CPU page
>>> tables. So for platform SVM we would like to get rid of MMU notifiers
>>> entirely.
>>
>> I see, but for some SMMU which do not support BTM, it cannot benefit from SVM.
>>
>> Meanwhile, do you mean even with BTM feature, the PCI-e device also need to send a
>> ATC invalid by MMU notify? It seems not fair, why not hardware do the entirely work
>> in this case? It may costly for send ATC invalid and sync.
>
> It will certainly be costly. But there are major problems with
> transforming broadcast TLB maintenance into ATC invalidations in HW:
>
> * VMID:ASID to SID:SSID conversion. TLBIs use VMID:ASID, while ATCIs use
> SID:SSID.
>
> * Most importantly, ATC invalidations accounting. Each endpoint has a
> limited number of in-flight ATC invalidate requests. The conversion module
> would have to buffer incoming invalidations and wait for in-flight ATC
> invalidation to complete before sending the next ones. In case of
> overflow, either we lose invalidation (which opens security holes) or we
> somehow put back-pressure on the interconnect (no idea how feasible this
> is, I suspect really hard).
>
> Solving the last one is also quite difficult in software, but at least we
> can still invalidate a range. In hardware we would invalidate the ATC
> page-by-page and quickly jam the bus.
>

Speak to the invalidation, I have one more question.

There is a time window between 1) modify page table; 2) tlb invalidate;

ARM-CPU Device

1. modify page table

^^^^^
Can still write data through smmu tlb even page table was already modified.
(At this point, the same virtual addr may not point to the same thing for CPU and device!!!
I'm afraid there may be some data-loss or other potential problems if this situation happens.)

2. tlb invalidate range

--
Thanks,
Bob

2017-09-07 16:27:17

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add platform device SVM support for ARM SMMUv3

On 07/09/17 02:55, Bob Liu wrote:
> Speak to the invalidation, I have one more question.
>
> There is a time window between 1) modify page table; 2) tlb invalidate;
>
> ARM-CPU Device
>
> 1. modify page table
>
> ^^^^^
> Can still write data through smmu tlb even page table was already modified.
> (At this point, the same virtual addr may not point to the same thing for CPU and device!!!
> I'm afraid there may be some data-loss or other potential problems if this situation happens.)
>
> 2. tlb invalidate range

The mm code serializes map/unmap operations with mm->mmap_sem, and at a
lower level I think the pte lock is used to prevent more subtle races.
Don't take my word for it though, mm/ is still very obscure to me. So the
kernel shouldn't be able to reuse the VA for something else before the tlb
invalidation completes. Even if you're using the CMDQ to invalidate
instead of TLBI instructions, you're still called by a notifier from the
mm code so there is no problem.

Thanks,
Jean

2017-09-07 16:28:52

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add platform device SVM support for ARM SMMUv3

On 07/09/17 02:41, Bob Liu wrote:
>> This would require some work in moving the PCI bits at the end of the
>> series. I can reserve some time in the coming months to do it, but I need
>> to know what to focus on. Are you able to test SSID as well?
>>
>
> Yes, but the difficulty is our devices are on-chip integrated hardware accelerators which requires complicate driver.
> You may need much time to understand the driver.
> That's the same case as intel/amd SVM, the current user is their GPU :-(
>
> Btw, what kind of device/method do you think is ideal for testing arm-SVM?

A simple, bare DMA engine would be ideal. Something just capable of
performing memcpy with parameters (PASID, input IOVA, output IOVA, size)
can be used for validating SVM and virtualization. You could easily create
reproducible unit tests and userspace drivers. If it supports isolated
channels (as in SR-IOV), even better.

As you said, having a useful device like a full GPU/accelerator as opposed
to a dummy validation engine makes it difficult to fully test the SMMU.
However it can be helpful for evaluating driver performances and is still
good enough for confirming that the IOMMU works.

Thanks,
Jean

2017-09-13 01:12:10

by Liubo(OS Lab)

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add platform device SVM support for ARM SMMUv3

On 2017/9/6 17:57, Jean-Philippe Brucker wrote:
> On 06/09/17 02:02, Bob Liu wrote:
>> On 2017/9/5 20:56, Jean-Philippe Brucker wrote:
>>> On 31/08/17 09:20, Yisheng Xie wrote:
>>>> Jean-Philippe has post a patchset for Adding PCIe SVM support to ARM SMMUv3:
>>>> https://www.spinics.net/lists/arm-kernel/msg565155.html
>>>>
>>>> But for some platform devices(aka on-chip integrated devices), there is also
>>>> SVM requirement, which works based on the SMMU stall mode.
>>>> Jean-Philippe has prepared a prototype patchset to support it:
>>>> git://linux-arm.org/linux-jpb.git svm/stall
>>>
>>> Only meant for testing at that point, and unfit even for an RFC.
>>>
>>
>> Sorry for the misunderstanding.
>> The PRI mode patches is in RFC even no hardware for testing, so I thought it's fine for "Stall mode" patches sent as RFC.
>> We have tested the Stall mode on our platform.
>> Anyway, I should confirm with you in advance.
>>
>> Btw, Would you consider the "stall mode" upstream at first? Since there is no hardware for testing the PRI mode.
>> (We can provide you the hardware which support SMMU stall mode if necessary.)
>
> Yes. What's blocking the ATS, PRI and PASID patches at the moment is the
> lack of endpoints for testing. There has been lots of discussion on the
> API side since my first RFC and I'd like to resubmit the API changes soon.
> It is the same API for ATS+PRI+PASID and SSID+Stall, so the backend
> doesn't matter.
>
> I'm considering upstreaming SSID+Stall first if it can be tested on
> hardware (having direct access to it would certainly speed things up).
> This would require some work in moving the PCI bits at the end of the
> series. I can reserve some time in the coming months to do it, but I need
> to know what to focus on. Are you able to test SSID as well?
>

Update:
Our current platform device has only one SSID register, so that have to do manually
switch(write different ssid to that register) if want to use by different processes.

But we're going to have an new platform who's platform device can support multi ssid.

Regards,
Bob

>>>> We tested this patchset with some fixes on a on-chip integrated device. The
>>>> basic function is ok, so I just send them out for review, although this
>>>> patchset heavily depends on the former patchset (PCIe SVM support for ARM
>>>> SMMUv3), which is still under discussion.
>>>>
>>>> Patch Overview:
>>>> *1 to 3 prepare for device tree or acpi get the device stall ability and pasid bits
>>>> *4 is to realise the SVM function for platform device
>>>> *5 is fix a bug when test SVM function while SMMU donnot support this feature
>>>> *6 avoid ILLEGAL setting of STE and CD entry about stall
>>>>
>>>> Acctually here, I also have a question about SVM on SMMUv3:
>>>>
>>>> 1. Why the SVM feature on SMMUv3 depends on BTM feature? when bind a task to device,
>>>> it will register a mmu_notify. Therefore, when a page range is invalid, we can
>>>> send TLBI or ATC invalid without BTM?
>>>
>>> We could, but the end goal for SVM is to perfectly mirror the CPU page
>>> tables. So for platform SVM we would like to get rid of MMU notifiers
>>> entirely.
>>>
>>>> 2. According to ACPI IORT spec, named component specific data has a node flags field
>>>> whoes bit0 is for Stall support. However, it do not have any field for pasid bit.
>>>> Can we use other 5 bits[5:1] for pasid bit numbers, so we can have 32 pasid bit for
>>>> a single platform device which should be enough, because SMMU only support 20 bit pasid
>>>>
>>
>> Any comment on this?
>> The ACPI IORT spec may need be updated?
>
> I suppose that the Named Component Node could be used for SSID and stall
> capability bits. Can't the ACPI namespace entries be extended to host
> these capabilities in a more generic way? Platforms with different IOMMUs
> might also need this information some day.
>


2017-09-13 03:06:37

by Will Deacon

[permalink] [raw]
Subject: Re: [RFC PATCH 6/6] iommu/arm-smmu-v3: Avoid ILLEGAL setting of STE.S1STALLD and CD.S

On Tue, Sep 05, 2017 at 01:54:19PM +0100, Jean-Philippe Brucker wrote:
> On 31/08/17 09:20, Yisheng Xie wrote:
> > It is ILLEGAL to set STE.S1STALLD if STALL_MODEL is not 0b00, which
> > means we should not disable stall mode if stall/terminate mode is not
> > configuable.
> >
> > Meanwhile, it is also ILLEGAL when STALL_MODEL==0b10 && CD.S==0 which
> > means if stall mode is force we should always set CD.S.
> >
> > This patch add ARM_SMMU_FEAT_TERMINATE feature bit for smmu, and use
> > TERMINATE feature checking to ensue above ILLEGAL cases from happening.
> >
> > Signed-off-by: Yisheng Xie <[email protected]>
> > ---
> > drivers/iommu/arm-smmu-v3.c | 22 ++++++++++++++++------
> > 1 file changed, 16 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> > index dbda2eb..0745522 100644
> > --- a/drivers/iommu/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm-smmu-v3.c
> > @@ -55,6 +55,7 @@
> > #define IDR0_STALL_MODEL_SHIFT 24
> > #define IDR0_STALL_MODEL_MASK 0x3
> > #define IDR0_STALL_MODEL_STALL (0 << IDR0_STALL_MODEL_SHIFT)
> > +#define IDR0_STALL_MODEL_NS (1 << IDR0_STALL_MODEL_SHIFT)
> > #define IDR0_STALL_MODEL_FORCE (2 << IDR0_STALL_MODEL_SHIFT)
> > #define IDR0_TTENDIAN_SHIFT 21
> > #define IDR0_TTENDIAN_MASK 0x3
> > @@ -766,6 +767,7 @@ struct arm_smmu_device {
> > #define ARM_SMMU_FEAT_SVM (1 << 15)
> > #define ARM_SMMU_FEAT_HA (1 << 16)
> > #define ARM_SMMU_FEAT_HD (1 << 17)
> > +#define ARM_SMMU_FEAT_TERMINATE (1 << 18)
>
> I'd rather introduce something like "ARM_SMMU_FEAT_STALL_FORCE" instead.
> Terminate model has another meaning, and is defined by a different bit in
> IDR0.

Yes. What we need to do is:

- If STALL_MODEL is 0b00, then set S1STALLD
- If STALL_MODEL is 0b01, then we're ok (in future, avoiding trying to use
stalls, even for masters that claim to support it)
- If STALL_MODEL is 0b10, then force all PCI devices and any platform
devices that don't claim to support stalls into bypass (depending on
disable_bypass).

Reasonable? We could actually knock up a fix for mainline to do most of
this already.

Will

2017-09-13 10:14:02

by Yisheng Xie

[permalink] [raw]
Subject: Re: [RFC PATCH 6/6] iommu/arm-smmu-v3: Avoid ILLEGAL setting of STE.S1STALLD and CD.S

Hi Will,

On 2017/9/13 11:06, Will Deacon wrote:
> On Tue, Sep 05, 2017 at 01:54:19PM +0100, Jean-Philippe Brucker wrote:
>> On 31/08/17 09:20, Yisheng Xie wrote:
>>> It is ILLEGAL to set STE.S1STALLD if STALL_MODEL is not 0b00, which
>>> means we should not disable stall mode if stall/terminate mode is not
>>> configuable.
>>>
>>> Meanwhile, it is also ILLEGAL when STALL_MODEL==0b10 && CD.S==0 which
>>> means if stall mode is force we should always set CD.S.
>>>
>>> This patch add ARM_SMMU_FEAT_TERMINATE feature bit for smmu, and use
>>> TERMINATE feature checking to ensue above ILLEGAL cases from happening.
>>>
>>> Signed-off-by: Yisheng Xie <[email protected]>
>>> ---
>>> drivers/iommu/arm-smmu-v3.c | 22 ++++++++++++++++------
>>> 1 file changed, 16 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>>> index dbda2eb..0745522 100644
>>> --- a/drivers/iommu/arm-smmu-v3.c
>>> +++ b/drivers/iommu/arm-smmu-v3.c
>>> @@ -55,6 +55,7 @@
>>> #define IDR0_STALL_MODEL_SHIFT 24
>>> #define IDR0_STALL_MODEL_MASK 0x3
>>> #define IDR0_STALL_MODEL_STALL (0 << IDR0_STALL_MODEL_SHIFT)
>>> +#define IDR0_STALL_MODEL_NS (1 << IDR0_STALL_MODEL_SHIFT)
>>> #define IDR0_STALL_MODEL_FORCE (2 << IDR0_STALL_MODEL_SHIFT)
>>> #define IDR0_TTENDIAN_SHIFT 21
>>> #define IDR0_TTENDIAN_MASK 0x3
>>> @@ -766,6 +767,7 @@ struct arm_smmu_device {
>>> #define ARM_SMMU_FEAT_SVM (1 << 15)
>>> #define ARM_SMMU_FEAT_HA (1 << 16)
>>> #define ARM_SMMU_FEAT_HD (1 << 17)
>>> +#define ARM_SMMU_FEAT_TERMINATE (1 << 18)
>>
>> I'd rather introduce something like "ARM_SMMU_FEAT_STALL_FORCE" instead.
>> Terminate model has another meaning, and is defined by a different bit in
>> IDR0.
>
> Yes. What we need to do is:
>
> - If STALL_MODEL is 0b00, then set S1STALLD

Yes, and within this case, we can only set the S1STALLD for masters which can
not stall in the future?

> - If STALL_MODEL is 0b01, then we're ok (in future, avoiding trying to use
> stalls, even for masters that claim to support it)
> - If STALL_MODEL is 0b10, then force all PCI devices and any platform
> devices that don't claim to support stalls into bypass (depending on
> disable_bypass).
>
> Reasonable? We could actually knock up a fix for mainline to do most of
> this already.
This sound reasonable to me. And I can be a volunteer to prepare this patch if
Jean-Philippe do not oppose :)

Thanks
Yisheng Xie

>
> Will
>
> .
>

2017-09-13 15:47:08

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [RFC PATCH 6/6] iommu/arm-smmu-v3: Avoid ILLEGAL setting of STE.S1STALLD and CD.S

On 13/09/17 11:11, Yisheng Xie wrote:
> Hi Will,
>
> On 2017/9/13 11:06, Will Deacon wrote:
>> On Tue, Sep 05, 2017 at 01:54:19PM +0100, Jean-Philippe Brucker wrote:
>>> On 31/08/17 09:20, Yisheng Xie wrote:
>>>> It is ILLEGAL to set STE.S1STALLD if STALL_MODEL is not 0b00, which
>>>> means we should not disable stall mode if stall/terminate mode is not
>>>> configuable.
>>>>
>>>> Meanwhile, it is also ILLEGAL when STALL_MODEL==0b10 && CD.S==0 which
>>>> means if stall mode is force we should always set CD.S.
>>>>
>>>> This patch add ARM_SMMU_FEAT_TERMINATE feature bit for smmu, and use
>>>> TERMINATE feature checking to ensue above ILLEGAL cases from happening.
>>>>
>>>> Signed-off-by: Yisheng Xie <[email protected]>
>>>> ---
>>>>? drivers/iommu/arm-smmu-v3.c | 22 ++++++++++++++++------
>>>>? 1 file changed, 16 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>>>> index dbda2eb..0745522 100644
>>>> --- a/drivers/iommu/arm-smmu-v3.c
>>>> +++ b/drivers/iommu/arm-smmu-v3.c
>>>> @@ -55,6 +55,7 @@
>>>>? #define IDR0_STALL_MODEL_SHIFT???????????? 24
>>>>? #define IDR0_STALL_MODEL_MASK????????????? 0x3
>>>>? #define IDR0_STALL_MODEL_STALL???????????? (0 << IDR0_STALL_MODEL_SHIFT)
>>>> +#define IDR0_STALL_MODEL_NS??????????????? (1 << IDR0_STALL_MODEL_SHIFT)
>>>>? #define IDR0_STALL_MODEL_FORCE???????????? (2 << IDR0_STALL_MODEL_SHIFT)
>>>>? #define IDR0_TTENDIAN_SHIFT??????????????? 21
>>>>? #define IDR0_TTENDIAN_MASK???????? 0x3
>>>> @@ -766,6 +767,7 @@ struct arm_smmu_device {
>>>>? #define ARM_SMMU_FEAT_SVM????????? (1 << 15)
>>>>? #define ARM_SMMU_FEAT_HA?????????? (1 << 16)
>>>>? #define ARM_SMMU_FEAT_HD?????????? (1 << 17)
>>>> +#define ARM_SMMU_FEAT_TERMINATE??????????? (1 << 18)
>>>
>>> I'd rather introduce something like "ARM_SMMU_FEAT_STALL_FORCE" instead.
>>> Terminate model has another meaning, and is defined by a different bit in
>>> IDR0.
>>
>> Yes. What we need to do is:
>>
>> - If STALL_MODEL is 0b00, then set S1STALLD
>
> Yes, and within this case, we can only set the S1STALLD for masters which can
> not stall in the future?
>
>> - If STALL_MODEL is 0b01, then we're ok (in future, avoiding trying to use
>>?? stalls, even for masters that claim to support it)
>> - If STALL_MODEL is 0b10, then force all PCI devices and any platform
>>?? devices that don't claim to support stalls into bypass (depending on
>>?? disable_bypass).
>>
>> Reasonable? We could actually knock up a fix for mainline to do most of
>> this already.
> This sound reasonable to me. And I can be a volunteer to prepare this patch if
> Jean-Philippe do not oppose :)

Sure go ahead, I'll rebase the platform SVM work on top of it.

Thanks,
Jean

2017-09-13 17:11:52

by Will Deacon

[permalink] [raw]
Subject: Re: [RFC PATCH 6/6] iommu/arm-smmu-v3: Avoid ILLEGAL setting of STE.S1STALLD and CD.S

On Wed, Sep 13, 2017 at 06:11:13PM +0800, Yisheng Xie wrote:
> On 2017/9/13 11:06, Will Deacon wrote:
> > On Tue, Sep 05, 2017 at 01:54:19PM +0100, Jean-Philippe Brucker wrote:
> >> On 31/08/17 09:20, Yisheng Xie wrote:
> >>> It is ILLEGAL to set STE.S1STALLD if STALL_MODEL is not 0b00, which
> >>> means we should not disable stall mode if stall/terminate mode is not
> >>> configuable.
> >>>
> >>> Meanwhile, it is also ILLEGAL when STALL_MODEL==0b10 && CD.S==0 which
> >>> means if stall mode is force we should always set CD.S.
> >>>
> >>> This patch add ARM_SMMU_FEAT_TERMINATE feature bit for smmu, and use
> >>> TERMINATE feature checking to ensue above ILLEGAL cases from happening.
> >>>
> >>> Signed-off-by: Yisheng Xie <[email protected]>
> >>> ---
> >>> drivers/iommu/arm-smmu-v3.c | 22 ++++++++++++++++------
> >>> 1 file changed, 16 insertions(+), 6 deletions(-)
> >>>
> >>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> >>> index dbda2eb..0745522 100644
> >>> --- a/drivers/iommu/arm-smmu-v3.c
> >>> +++ b/drivers/iommu/arm-smmu-v3.c
> >>> @@ -55,6 +55,7 @@
> >>> #define IDR0_STALL_MODEL_SHIFT 24
> >>> #define IDR0_STALL_MODEL_MASK 0x3
> >>> #define IDR0_STALL_MODEL_STALL (0 << IDR0_STALL_MODEL_SHIFT)
> >>> +#define IDR0_STALL_MODEL_NS (1 << IDR0_STALL_MODEL_SHIFT)
> >>> #define IDR0_STALL_MODEL_FORCE (2 << IDR0_STALL_MODEL_SHIFT)
> >>> #define IDR0_TTENDIAN_SHIFT 21
> >>> #define IDR0_TTENDIAN_MASK 0x3
> >>> @@ -766,6 +767,7 @@ struct arm_smmu_device {
> >>> #define ARM_SMMU_FEAT_SVM (1 << 15)
> >>> #define ARM_SMMU_FEAT_HA (1 << 16)
> >>> #define ARM_SMMU_FEAT_HD (1 << 17)
> >>> +#define ARM_SMMU_FEAT_TERMINATE (1 << 18)
> >>
> >> I'd rather introduce something like "ARM_SMMU_FEAT_STALL_FORCE" instead.
> >> Terminate model has another meaning, and is defined by a different bit in
> >> IDR0.
> >
> > Yes. What we need to do is:
> >
> > - If STALL_MODEL is 0b00, then set S1STALLD
>
> Yes, and within this case, we can only set the S1STALLD for masters which can
> not stall in the future?

Yeah, something like that. I'd probably predicate it on having afault
handler registered too.

Will