2023-08-02 17:15:33

by Michael Shavit

[permalink] [raw]
Subject: [PATCH v4 0/8] Refactor the SMMU's CD table ownership

Hi all,

This series refactors stage 1 domains so that they describe a single CD
entry. These entries are now inserted into a CD table that is owned by
the arm_smmu_master instead of the domain.
This is conceptually cleaner and unblocks other features, such as
attaching domains with PASID (for unmanaged/dma domains).

This patch series was originally part of a larger patch series that
implemented the set_dev_pasid callback for non-SVA domains but is now
split into a distinct series.

This patch series is also available on gerrit with Jean's SMMU test
engine patches cherry-picked on top for testing:
https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/24742/5

Thanks,
Michael Shavit

Changes in v4:
- Added comment about the cd_table's dependency on the iommu core's
group mutex.
- Narrowed the range of code for which the domain's init_mutex is held
on attach since it now only protects the arm_smmu_domain_finalise
call.
- Link to v3: https://lore.kernel.org/all/[email protected]/

Changes in v3:
- Add a helper to write a CD to all masters that a domain is attached
to.
- Fixed an issue where an arm_smmu_write_ctx_desc error return wasn't
correctly handled by its caller.
- Flip the cd_table.installed bit back off when table is detached
- re-order the commit later in the series since flipping the installed
bit to off isn't obvious when the cd_table is still shared by multiple
masters.
- Link to v2: https://lore.kernel.org/all/[email protected]/

Changes in v2:
- Allocate CD table when it's first needed instead of on probe.
- Minor changes
- Added commit to rename remaining usages of cdcfg to cd_table
- Link to v1: https://lore.kernel.org/all/[email protected]/#r

Changes in v1:
- Replace s1_cfg with arm_smmu_ctx_desc_cfg representing the CD table
- Assume that the CD table is owned by the SMMU master for most
operations. This is forward-compatible with the nested patch series as
these operations wouldn't be called when the installed CD table comes
from nested domains.
- Split off as a distinct patch series from https://lore.kernel.org/all/[email protected]/

Michael Shavit (8):
iommu/arm-smmu-v3: Move ctx_desc out of s1_cfg
iommu/arm-smmu-v3: Replace s1_cfg with cdtab_cfg
iommu/arm-smmu-v3: Encapsulate ctx_desc_cfg init in alloc_cd_tables
iommu/arm-smmu-v3: move stall_enabled to the cd table
iommu/arm-smmu-v3: Refactor write_ctx_desc
iommu/arm-smmu-v3: Move CD table to arm_smmu_master
iommu/arm-smmu-v3: Skip cd sync if CD table isn't active
iommu/arm-smmu-v3: Rename cdcfg to cd_table

.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 33 ++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 234 ++++++++----------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 22 +-
3 files changed, 147 insertions(+), 142 deletions(-)


base-commit: 57012c57536f8814dec92e74197ee96c3498d24e
--
2.41.0.585.gd2178a4bd4-goog



2023-08-02 17:15:58

by Michael Shavit

[permalink] [raw]
Subject: [PATCH v4 1/8] iommu/arm-smmu-v3: Move ctx_desc out of s1_cfg

s1_cfg describes the CD table that is inserted into an SMMU's STEs. It's
weird for s1_cfg to also own ctx_desc which describes a CD that is
inserted into that table. It is more appropriate for arm_smmu_domain to
own ctx_desc.

Signed-off-by: Michael Shavit <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
---

(no changes since v2)

Changes in v2:
- Undo over-reaching column alignment change

.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 23 ++++++++++---------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 6 +++--
3 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index a5a63b1c947e..968559d625c4 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -62,7 +62,7 @@ arm_smmu_share_asid(struct mm_struct *mm, u16 asid)
return cd;
}

- smmu_domain = container_of(cd, struct arm_smmu_domain, s1_cfg.cd);
+ smmu_domain = container_of(cd, struct arm_smmu_domain, cd);
smmu = smmu_domain->smmu;

ret = xa_alloc(&arm_smmu_asid_xa, &new_asid, cd,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9b0dc3505601..bb277ff86f65 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1869,7 +1869,7 @@ static void arm_smmu_tlb_inv_context(void *cookie)
* careful, 007.
*/
if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
- arm_smmu_tlb_inv_asid(smmu, smmu_domain->s1_cfg.cd.asid);
+ arm_smmu_tlb_inv_asid(smmu, smmu_domain->cd.asid);
} else {
cmd.opcode = CMDQ_OP_TLBI_S12_VMALL;
cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
@@ -1957,7 +1957,7 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
cmd.opcode = smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ?
CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA;
- cmd.tlbi.asid = smmu_domain->s1_cfg.cd.asid;
+ cmd.tlbi.asid = smmu_domain->cd.asid;
} else {
cmd.opcode = CMDQ_OP_TLBI_S2_IPA;
cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
@@ -2088,7 +2088,7 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
mutex_lock(&arm_smmu_asid_lock);
if (cfg->cdcfg.cdtab)
arm_smmu_free_cd_tables(smmu_domain);
- arm_smmu_free_asid(&cfg->cd);
+ arm_smmu_free_asid(&smmu_domain->cd);
mutex_unlock(&arm_smmu_asid_lock);
} else {
struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
@@ -2107,13 +2107,14 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
u32 asid;
struct arm_smmu_device *smmu = smmu_domain->smmu;
struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg;
+ struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr;

- refcount_set(&cfg->cd.refs, 1);
+ refcount_set(&cd->refs, 1);

/* Prevent SVA from modifying the ASID until it is written to the CD */
mutex_lock(&arm_smmu_asid_lock);
- ret = xa_alloc(&arm_smmu_asid_xa, &asid, &cfg->cd,
+ ret = xa_alloc(&arm_smmu_asid_xa, &asid, cd,
XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
if (ret)
goto out_unlock;
@@ -2126,23 +2127,23 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
if (ret)
goto out_free_asid;

- cfg->cd.asid = (u16)asid;
- cfg->cd.ttbr = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
- cfg->cd.tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, tcr->tsz) |
+ cd->asid = (u16)asid;
+ cd->ttbr = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
+ cd->tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, tcr->tsz) |
FIELD_PREP(CTXDESC_CD_0_TCR_TG0, tcr->tg) |
FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, tcr->irgn) |
FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, tcr->orgn) |
FIELD_PREP(CTXDESC_CD_0_TCR_SH0, tcr->sh) |
FIELD_PREP(CTXDESC_CD_0_TCR_IPS, tcr->ips) |
CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64;
- cfg->cd.mair = pgtbl_cfg->arm_lpae_s1_cfg.mair;
+ cd->mair = pgtbl_cfg->arm_lpae_s1_cfg.mair;

/*
* Note that this will end up calling arm_smmu_sync_cd() before
* the master has been added to the devices list for this domain.
* This isn't an issue because the STE hasn't been installed yet.
*/
- ret = arm_smmu_write_ctx_desc(smmu_domain, 0, &cfg->cd);
+ ret = arm_smmu_write_ctx_desc(smmu_domain, 0, cd);
if (ret)
goto out_free_cd_tables;

@@ -2152,7 +2153,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
out_free_cd_tables:
arm_smmu_free_cd_tables(smmu_domain);
out_free_asid:
- arm_smmu_free_asid(&cfg->cd);
+ arm_smmu_free_asid(cd);
out_unlock:
mutex_unlock(&arm_smmu_asid_lock);
return ret;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index dcab85698a4e..f841383a55a3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -599,7 +599,6 @@ struct arm_smmu_ctx_desc_cfg {

struct arm_smmu_s1_cfg {
struct arm_smmu_ctx_desc_cfg cdcfg;
- struct arm_smmu_ctx_desc cd;
u8 s1fmt;
u8 s1cdmax;
};
@@ -724,7 +723,10 @@ struct arm_smmu_domain {

enum arm_smmu_domain_stage stage;
union {
- struct arm_smmu_s1_cfg s1_cfg;
+ struct {
+ struct arm_smmu_ctx_desc cd;
+ struct arm_smmu_s1_cfg s1_cfg;
+ };
struct arm_smmu_s2_cfg s2_cfg;
};

--
2.41.0.585.gd2178a4bd4-goog


2023-08-02 17:25:27

by Michael Shavit

[permalink] [raw]
Subject: [PATCH v4 3/8] iommu/arm-smmu-v3: Encapsulate ctx_desc_cfg init in alloc_cd_tables

This is slighlty cleaner: arm_smmu_ctx_desc_cfg is initialized in a
single function instead of having pieces set ahead-of time by its caller.

Signed-off-by: Michael Shavit <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
---

(no changes since v1)

drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index ded613aedbb0..fe4b19c3b8de 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1132,7 +1132,8 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
return 0;
}

-static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain)
+static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_master *master)
{
int ret;
size_t l1size;
@@ -1140,6 +1141,7 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain)
struct arm_smmu_device *smmu = smmu_domain->smmu;
struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table;

+ cdcfg->max_cds_bits = master->ssid_bits;
max_contexts = 1 << cdcfg->max_cds_bits;

if (!(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB) ||
@@ -2107,7 +2109,6 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
int ret;
u32 asid;
struct arm_smmu_device *smmu = smmu_domain->smmu;
- struct arm_smmu_ctx_desc_cfg *cd_table = &smmu_domain->cd_table;
struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr;

@@ -2120,11 +2121,9 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
if (ret)
goto out_unlock;

- cd_table->max_cds_bits = master->ssid_bits;
-
smmu_domain->stall_enabled = master->stall_enabled;

- ret = arm_smmu_alloc_cd_tables(smmu_domain);
+ ret = arm_smmu_alloc_cd_tables(smmu_domain, master);
if (ret)
goto out_free_asid;

--
2.41.0.585.gd2178a4bd4-goog


2023-08-02 17:29:45

by Michael Shavit

[permalink] [raw]
Subject: [PATCH v4 4/8] iommu/arm-smmu-v3: move stall_enabled to the cd table

This controls whether CD entries will have the stall bit set when
writing entries into the table.

Signed-off-by: Michael Shavit <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
---

(no changes since v2)

Changes in v2:
- Use a bitfield instead of a bool for stall_enabled

drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 8 ++++----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 ++-
2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index fe4b19c3b8de..c01023404c26 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1114,7 +1114,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) |
CTXDESC_CD_0_V;

- if (smmu_domain->stall_enabled)
+ if (smmu_domain->cd_table.stall_enabled)
val |= CTXDESC_CD_0_S;
}

@@ -1141,6 +1141,7 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain,
struct arm_smmu_device *smmu = smmu_domain->smmu;
struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table;

+ cdcfg->stall_enabled = master->stall_enabled;
cdcfg->max_cds_bits = master->ssid_bits;
max_contexts = 1 << cdcfg->max_cds_bits;

@@ -2121,8 +2122,6 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
if (ret)
goto out_unlock;

- smmu_domain->stall_enabled = master->stall_enabled;
-
ret = arm_smmu_alloc_cd_tables(smmu_domain, master);
if (ret)
goto out_free_asid;
@@ -2461,7 +2460,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
ret = -EINVAL;
goto out_unlock;
} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
- smmu_domain->stall_enabled != master->stall_enabled) {
+ smmu_domain->cd_table.stall_enabled !=
+ master->stall_enabled) {
ret = -EINVAL;
goto out_unlock;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 35a93e885887..05b1f0ee6080 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -597,6 +597,8 @@ struct arm_smmu_ctx_desc_cfg {
unsigned int num_l1_ents;
/* log2 of the maximum number of CDs supported by this table */
u8 max_cds_bits;
+ /* Whether CD entries in this table have the stall bit set. */
+ u8 stall_enabled:1;
};

struct arm_smmu_s2_cfg {
@@ -714,7 +716,6 @@ struct arm_smmu_domain {
struct mutex init_mutex; /* Protects smmu pointer */

struct io_pgtable_ops *pgtbl_ops;
- bool stall_enabled;
atomic_t nr_ats_masters;

enum arm_smmu_domain_stage stage;
--
2.41.0.585.gd2178a4bd4-goog


2023-08-02 17:39:31

by Michael Shavit

[permalink] [raw]
Subject: [PATCH v4 8/8] iommu/arm-smmu-v3: Rename cdcfg to cd_table

cdcfg is a confusing name, especially given other variables with the cfg
suffix in this driver. cd_table more clearly describes what is being
operated on.

Signed-off-by: Michael Shavit <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
---

(no changes since v3)

Changes in v3:
- Commit message update

Changes in v2:
- New commit

drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 66 ++++++++++-----------
1 file changed, 33 insertions(+), 33 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 00b602ae9636..61de66d17a5d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1028,18 +1028,18 @@ static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid)
unsigned int idx;
struct arm_smmu_l1_ctx_desc *l1_desc;
struct arm_smmu_device *smmu = master->smmu;
- struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table;
+ struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;

- if (!cdcfg->l1_desc)
- return cdcfg->cdtab + ssid * CTXDESC_CD_DWORDS;
+ if (!cd_table->l1_desc)
+ return cd_table->cdtab + ssid * CTXDESC_CD_DWORDS;

idx = ssid >> CTXDESC_SPLIT;
- l1_desc = &cdcfg->l1_desc[idx];
+ l1_desc = &cd_table->l1_desc[idx];
if (!l1_desc->l2ptr) {
if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
return NULL;

- l1ptr = cdcfg->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
+ l1ptr = cd_table->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
/* An invalid L1CD can be cached */
arm_smmu_sync_cd(master, ssid, false);
@@ -1134,33 +1134,33 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
size_t l1size;
size_t max_contexts;
struct arm_smmu_device *smmu = master->smmu;
- struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table;
+ struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;

- cdcfg->stall_enabled = master->stall_enabled;
- cdcfg->max_cds_bits = master->ssid_bits;
- max_contexts = 1 << cdcfg->max_cds_bits;
+ cd_table->stall_enabled = master->stall_enabled;
+ cd_table->max_cds_bits = master->ssid_bits;
+ max_contexts = 1 << cd_table->max_cds_bits;

if (!(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB) ||
max_contexts <= CTXDESC_L2_ENTRIES) {
- cdcfg->num_l1_ents = max_contexts;
+ cd_table->num_l1_ents = max_contexts;

l1size = max_contexts * (CTXDESC_CD_DWORDS << 3);
} else {
- cdcfg->num_l1_ents = DIV_ROUND_UP(max_contexts,
+ cd_table->num_l1_ents = DIV_ROUND_UP(max_contexts,
CTXDESC_L2_ENTRIES);

- cdcfg->l1_desc = devm_kcalloc(smmu->dev, cdcfg->num_l1_ents,
- sizeof(*cdcfg->l1_desc),
+ cd_table->l1_desc = devm_kcalloc(smmu->dev, cd_table->num_l1_ents,
+ sizeof(*cd_table->l1_desc),
GFP_KERNEL);
- if (!cdcfg->l1_desc)
+ if (!cd_table->l1_desc)
return -ENOMEM;

- l1size = cdcfg->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3);
+ l1size = cd_table->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3);
}

- cdcfg->cdtab = dmam_alloc_coherent(smmu->dev, l1size, &cdcfg->cdtab_dma,
+ cd_table->cdtab = dmam_alloc_coherent(smmu->dev, l1size, &cd_table->cdtab_dma,
GFP_KERNEL);
- if (!cdcfg->cdtab) {
+ if (!cd_table->cdtab) {
dev_warn(smmu->dev, "failed to allocate context descriptor\n");
ret = -ENOMEM;
goto err_free_l1;
@@ -1169,9 +1169,9 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
return 0;

err_free_l1:
- if (cdcfg->l1_desc) {
- devm_kfree(smmu->dev, cdcfg->l1_desc);
- cdcfg->l1_desc = NULL;
+ if (cd_table->l1_desc) {
+ devm_kfree(smmu->dev, cd_table->l1_desc);
+ cd_table->l1_desc = NULL;
}
return ret;
}
@@ -1181,30 +1181,30 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_master *master)
int i;
size_t size, l1size;
struct arm_smmu_device *smmu = master->smmu;
- struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table;
+ struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;

- if (cdcfg->l1_desc) {
+ if (cd_table->l1_desc) {
size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3);

- for (i = 0; i < cdcfg->num_l1_ents; i++) {
- if (!cdcfg->l1_desc[i].l2ptr)
+ for (i = 0; i < cd_table->num_l1_ents; i++) {
+ if (!cd_table->l1_desc[i].l2ptr)
continue;

dmam_free_coherent(smmu->dev, size,
- cdcfg->l1_desc[i].l2ptr,
- cdcfg->l1_desc[i].l2ptr_dma);
+ cd_table->l1_desc[i].l2ptr,
+ cd_table->l1_desc[i].l2ptr_dma);
}
- devm_kfree(smmu->dev, cdcfg->l1_desc);
- cdcfg->l1_desc = NULL;
+ devm_kfree(smmu->dev, cd_table->l1_desc);
+ cd_table->l1_desc = NULL;

- l1size = cdcfg->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3);
+ l1size = cd_table->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3);
} else {
- l1size = cdcfg->num_l1_ents * (CTXDESC_CD_DWORDS << 3);
+ l1size = cd_table->num_l1_ents * (CTXDESC_CD_DWORDS << 3);
}

- dmam_free_coherent(smmu->dev, l1size, cdcfg->cdtab, cdcfg->cdtab_dma);
- cdcfg->cdtab_dma = 0;
- cdcfg->cdtab = NULL;
+ dmam_free_coherent(smmu->dev, l1size, cd_table->cdtab, cd_table->cdtab_dma);
+ cd_table->cdtab_dma = 0;
+ cd_table->cdtab = NULL;
}

bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd)
--
2.41.0.585.gd2178a4bd4-goog


2023-08-02 19:03:36

by Michael Shavit

[permalink] [raw]
Subject: [PATCH v4 6/8] iommu/arm-smmu-v3: Move CD table to arm_smmu_master

With this change, each master will now own its own CD table instead of
sharing one with other masters attached to the same domain. Attaching a
stage 1 domain installs CD entries into the master's CD table. SVA
writes its CD entries into each master's CD table if the domain is
shared across masters.

Signed-off-by: Michael Shavit <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
---

Changes in v4:
- Added comment about the cd_table's dependency on the iommu core's
group mutex.
- Narrowed the range of code for which the domain's init_mutex is held
on attach since it now only protects the arm_smmu_domain_finalise
call.

Changes in v2:
- Allocate CD table when it's first needed instead of on probe.

Changes in v1:
- The master's CD table allocation was previously split to a different
commit. This change now atomically allocates the new CD table, uses
it, and removes the old one.

drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 82 +++++++++------------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 +-
2 files changed, 39 insertions(+), 50 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 34bd7815aeb8..e2c33024ad85 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1025,7 +1025,7 @@ static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid)
unsigned int idx;
struct arm_smmu_l1_ctx_desc *l1_desc;
struct arm_smmu_device *smmu = master->smmu;
- struct arm_smmu_ctx_desc_cfg *cdcfg = &master->domain->cd_table;
+ struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table;

if (!cdcfg->l1_desc)
return cdcfg->cdtab + ssid * CTXDESC_CD_DWORDS;
@@ -1062,7 +1062,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
u64 val;
bool cd_live;
__le64 *cdptr;
- struct arm_smmu_ctx_desc_cfg *cd_table = &master->domain->cd_table;
+ struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;

if (WARN_ON(ssid >= (1 << cd_table->max_cds_bits)))
return -E2BIG;
@@ -1125,14 +1125,13 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
return 0;
}

-static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain,
- struct arm_smmu_master *master)
+static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
{
int ret;
size_t l1size;
size_t max_contexts;
struct arm_smmu_device *smmu = master->smmu;
- struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table;
+ struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table;

cdcfg->stall_enabled = master->stall_enabled;
cdcfg->max_cds_bits = master->ssid_bits;
@@ -1174,12 +1173,12 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain,
return ret;
}

-static void arm_smmu_free_cd_tables(struct arm_smmu_domain *smmu_domain)
+static void arm_smmu_free_cd_tables(struct arm_smmu_master *master)
{
int i;
size_t size, l1size;
- struct arm_smmu_device *smmu = smmu_domain->smmu;
- struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table;
+ struct arm_smmu_device *smmu = master->smmu;
+ struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table;

if (cdcfg->l1_desc) {
size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3);
@@ -1287,7 +1286,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
if (smmu_domain) {
switch (smmu_domain->stage) {
case ARM_SMMU_DOMAIN_S1:
- cd_table = &smmu_domain->cd_table;
+ cd_table = &master->cd_table;
break;
case ARM_SMMU_DOMAIN_S2:
case ARM_SMMU_DOMAIN_NESTED:
@@ -2077,14 +2076,10 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)

free_io_pgtable_ops(smmu_domain->pgtbl_ops);

- /* Free the CD and ASID, if we allocated them */
+ /* Free the ASID or VMID */
if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
- struct arm_smmu_ctx_desc_cfg *cd_table = &smmu_domain->cd_table;
-
/* Prevent SVA from touching the CD while we're freeing it */
mutex_lock(&arm_smmu_asid_lock);
- if (cd_table->cdtab)
- arm_smmu_free_cd_tables(smmu_domain);
arm_smmu_free_asid(&smmu_domain->cd);
mutex_unlock(&arm_smmu_asid_lock);
} else {
@@ -2096,7 +2091,7 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
kfree(smmu_domain);
}

-static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
+static int arm_smmu_domain_finalise_cd(struct arm_smmu_domain *smmu_domain,
struct arm_smmu_master *master,
struct io_pgtable_cfg *pgtbl_cfg)
{
@@ -2115,10 +2110,6 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
if (ret)
goto out_unlock;

- ret = arm_smmu_alloc_cd_tables(smmu_domain, master);
- if (ret)
- goto out_free_asid;
-
cd->asid = (u16)asid;
cd->ttbr = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
cd->tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, tcr->tsz) |
@@ -2130,17 +2121,9 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64;
cd->mair = pgtbl_cfg->arm_lpae_s1_cfg.mair;

- ret = arm_smmu_write_ctx_desc(master, 0, cd);
- if (ret)
- goto out_free_cd_tables;
-
mutex_unlock(&arm_smmu_asid_lock);
return 0;

-out_free_cd_tables:
- arm_smmu_free_cd_tables(smmu_domain);
-out_free_asid:
- arm_smmu_free_asid(cd);
out_unlock:
mutex_unlock(&arm_smmu_asid_lock);
return ret;
@@ -2203,7 +2186,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain,
ias = min_t(unsigned long, ias, VA_BITS);
oas = smmu->ias;
fmt = ARM_64_LPAE_S1;
- finalise_stage_fn = arm_smmu_domain_finalise_s1;
+ finalise_stage_fn = arm_smmu_domain_finalise_cd;
break;
case ARM_SMMU_DOMAIN_NESTED:
case ARM_SMMU_DOMAIN_S2:
@@ -2436,22 +2419,14 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
if (!smmu_domain->smmu) {
smmu_domain->smmu = smmu;
ret = arm_smmu_domain_finalise(domain, master);
- if (ret) {
+ if (ret)
smmu_domain->smmu = NULL;
- goto out_unlock;
- }
- } else if (smmu_domain->smmu != smmu) {
+ } else if (smmu_domain->smmu != smmu)
ret = -EINVAL;
- goto out_unlock;
- } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
- master->ssid_bits != smmu_domain->cd_table.max_cds_bits) {
- ret = -EINVAL;
- goto out_unlock;
- } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
- smmu_domain->cd_table.stall_enabled != master->stall_enabled) {
- ret = -EINVAL;
- goto out_unlock;
- }
+
+ mutex_unlock(&smmu_domain->init_mutex);
+ if (ret)
+ return ret;

master->domain = smmu_domain;

@@ -2465,6 +2440,22 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
if (smmu_domain->stage != ARM_SMMU_DOMAIN_BYPASS)
master->ats_enabled = arm_smmu_ats_supported(master);

+ if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+ if (!master->cd_table.cdtab) {
+ ret = arm_smmu_alloc_cd_tables(master);
+ if (ret) {
+ master->domain = NULL;
+ return ret;
+ }
+ }
+
+ ret = arm_smmu_write_ctx_desc(master, 0, &smmu_domain->cd);
+ if (ret) {
+ master->domain = NULL;
+ return ret;
+ }
+ }
+
arm_smmu_install_ste_for_dev(master);

spin_lock_irqsave(&smmu_domain->devices_lock, flags);
@@ -2472,10 +2463,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);

arm_smmu_enable_ats(master);
-
-out_unlock:
- mutex_unlock(&smmu_domain->init_mutex);
- return ret;
+ return 0;
}

static int arm_smmu_map_pages(struct iommu_domain *domain, unsigned long iova,
@@ -2719,6 +2707,8 @@ static void arm_smmu_release_device(struct device *dev)
arm_smmu_detach_dev(master);
arm_smmu_disable_pasid(master);
arm_smmu_remove_master(master);
+ if (master->cd_table.cdtab_dma)
+ arm_smmu_free_cd_tables(master);
kfree(master);
}

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 6066a09c0199..1f3b37025777 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -694,6 +694,8 @@ struct arm_smmu_master {
struct arm_smmu_domain *domain;
struct list_head domain_head;
struct arm_smmu_stream *streams;
+ /* Locked by the iommu core using the group mutex */
+ struct arm_smmu_ctx_desc_cfg cd_table;
unsigned int num_streams;
bool ats_enabled;
bool stall_enabled;
@@ -720,11 +722,8 @@ struct arm_smmu_domain {

enum arm_smmu_domain_stage stage;
union {
- struct {
struct arm_smmu_ctx_desc cd;
- struct arm_smmu_ctx_desc_cfg cd_table;
- };
- struct arm_smmu_s2_cfg s2_cfg;
+ struct arm_smmu_s2_cfg s2_cfg;
};

struct iommu_domain domain;
--
2.41.0.585.gd2178a4bd4-goog


2023-08-03 18:44:12

by Michael Shavit

[permalink] [raw]
Subject: Re: [PATCH v4 6/8] iommu/arm-smmu-v3: Move CD table to arm_smmu_master

This patch introduces a subtle bug.

Previously, the arm-smmu-v3 driver could get away with skipping the
clearing of the CD entry on detach, since the table belonged to the
domain and wouldn't be re-written on re-attach. When we switch to the
master-owned table model, that CDTE in the master's table can get
written to with different CD domains. When the CD domain get's
switched to a new one without first being cleared, arm_smmu_write_ctx
will mis-interpret its call as an ASID update instead of an entirely
new Cd.

This bug was handled by clearing the CD entry on detach in the
"iommu/arm-smmu-v3: Refactor write_ctx_desc" commit before it was
split from the set_dev_pasid
series(https://lore.kernel.org/all/[email protected]/).
The change was lost when the commit moved to this series however. It's
splitting hairs a little, but that fix probably belongs in this patch
instead.

2023-08-03 19:09:30

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v4 6/8] iommu/arm-smmu-v3: Move CD table to arm_smmu_master

On Fri, Aug 04, 2023 at 01:56:12AM +0800, Michael Shavit wrote:
> This patch introduces a subtle bug.
>
> Previously, the arm-smmu-v3 driver could get away with skipping the
> clearing of the CD entry on detach, since the table belonged to the
> domain and wouldn't be re-written on re-attach. When we switch to the
> master-owned table model, that CDTE in the master's table can get
> written to with different CD domains. When the CD domain get's
> switched to a new one without first being cleared, arm_smmu_write_ctx
> will mis-interpret its call as an ASID update instead of an entirely
> new Cd.

I'm not surprised, I think arm_smmu_write_ctx is a little too clever
for its own good..

I would have written it by computing the full target CD entry,
extracted directly from the domain.

Something like:

struct cd_entry {
__le64 val[4];
};

static void arm_smmu_get_domain_cd_value(struct arm_smmu_domain *domain,
struct arm_smmu_master *master,
bool quiet, struct cd_entry *entry)
{
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
struct arm_smmu_ctx_desc *cd = &domain->cd;
u64 val0;

if (!domain) {
memset(entry, 0, sizeof(*entry));
return;
}

val0 = cd->tcr |
#ifdef __BIG_ENDIAN
CTXDESC_CD_0_ENDI |
#endif
CTXDESC_CD_0_R | CTXDESC_CD_0_A |
(cd->mm ? 0 : CTXDESC_CD_0_ASET) | CTXDESC_CD_0_AA64 |
FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) | CTXDESC_CD_0_V;

if (cd_table->stall_enabled)
val0 |= CTXDESC_CD_0_S;

if (quiet)
val0 |= CTXDESC_CD_0_TCR_EPD0;

entry->val[0] = cpu_to_le64(val0);
entry->val[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK);
entry->val[2] = 0;
entry->val[3] = cpu_to_le64(cd->mair);
}

int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
struct arm_smmu_ctx_desc *cd)
{
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
struct cd_entry *cur_cd;
struct cd_entry new_cd;

if (WARN_ON(ssid >= (1 << cd_table->max_cds_bits)))
return -E2BIG;

new_cd = arm_smmu_get_cd_ptr(master, ssid);
if (!new_cd)
return -ENOMEM;

arm_smmu_get_domain_cd_value(domain, master, cd == &quiet_cd, &new_cd);

/*
* The SMMU accesses 64-bit values atomically. See IHI0070Ca 3.21.3
* "Configuration structures and configuration invalidation completion"
*
* The size of single-copy atomic reads made by the SMMU is
* IMPLEMENTATION DEFINED but must be at least 64 bits. Any single
* field within an aligned 64-bit span of a structure can be altered
* without first making the structure invalid.
*/

/*
* Changing only dword 0 is common enough that we give it a fast path.
*/
if (cur_cd->val[1] != new_cd.val[1] ||
cur_cd->val[2] != new_cd.val[2] ||
cur_cd->val[3] != new_cd.val[3]) {
/* Make it invalid so we can update all 4 values */
if (le64_to_cpu(cur_cd->val[0]) & CTXDESC_CD_0_V) {
if (le64_to_cpu(new_cd.val[0]) & CTXDESC_CD_0_V)
WRITE_ONCE(cur_cd->val[0], 0);
else
WRITE_ONCE(cur_cd->val[0], new_cd.val[0]);
arm_smmu_sync_cd(master, ssid, true);
}

cur_cd->val[1] = new_cd.val[1];
cur_cd->val[2] = new_cd.val[2];
cur_cd->val[3] = new_cd.val[3];

/*
* CD entry may be live, and the SMMU might read dwords of this
* CD in any order. Ensure that it observes valid values before
* reading V=1.
*/
if (le64_to_cpu(new_cd.val[0]) & CTXDESC_CD_0_V)
arm_smmu_sync_cd(master, ssid, true);
}
if (cur_cd->val[0] == new_cd.val[0])
return 0;

WRITE_ONCE(cur_cd->val[0], new_cd.val[0]);
arm_smmu_sync_cd(master, ssid, true);
}

Jason

2023-08-04 20:50:38

by Nicolin Chen

[permalink] [raw]
Subject: Re: [PATCH v4 8/8] iommu/arm-smmu-v3: Rename cdcfg to cd_table

On Thu, Aug 03, 2023 at 12:32:36AM +0800, Michael Shavit wrote:
>
> cdcfg is a confusing name, especially given other variables with the cfg
> suffix in this driver. cd_table more clearly describes what is being
> operated on.
>
> Signed-off-by: Michael Shavit <[email protected]>
> Reviewed-by: Jason Gunthorpe <[email protected]>

Reviewed-by: Nicolin Chen <[email protected]>

2023-08-04 21:28:45

by Nicolin Chen

[permalink] [raw]
Subject: Re: [PATCH v4 1/8] iommu/arm-smmu-v3: Move ctx_desc out of s1_cfg

On Thu, Aug 03, 2023 at 12:32:29AM +0800, Michael Shavit wrote:
> s1_cfg describes the CD table that is inserted into an SMMU's STEs. It's
> weird for s1_cfg to also own ctx_desc which describes a CD that is
> inserted into that table. It is more appropriate for arm_smmu_domain to
> own ctx_desc.
>
> Signed-off-by: Michael Shavit <[email protected]>
> Reviewed-by: Jason Gunthorpe <[email protected]>

Reviewed-by: Nicolin Chen <[email protected]>

2023-08-04 21:38:36

by Nicolin Chen

[permalink] [raw]
Subject: Re: [PATCH v4 3/8] iommu/arm-smmu-v3: Encapsulate ctx_desc_cfg init in alloc_cd_tables

On Thu, Aug 03, 2023 at 12:32:31AM +0800, Michael Shavit wrote:
> This is slighlty cleaner: arm_smmu_ctx_desc_cfg is initialized in a
> single function instead of having pieces set ahead-of time by its caller.
>
> Signed-off-by: Michael Shavit <[email protected]>
> Reviewed-by: Jason Gunthorpe <[email protected]>

Reviewed-by: Nicolin Chen <[email protected]>

2023-08-04 23:47:54

by Nicolin Chen

[permalink] [raw]
Subject: Re: [PATCH v4 6/8] iommu/arm-smmu-v3: Move CD table to arm_smmu_master

On Thu, Aug 03, 2023 at 12:32:34AM +0800, Michael Shavit wrote:
> +static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
> {
> int ret;
> size_t l1size;
> size_t max_contexts;
> struct arm_smmu_device *smmu = master->smmu;
> - struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table;
> + struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table;
>
> cdcfg->stall_enabled = master->stall_enabled;

We have stall_enabled at both master->cd_table->stall_enabled
and master->stall_enabled, and we removed stall_enabled from
the CD structure...

> @@ -2436,22 +2419,14 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> if (!smmu_domain->smmu) {
> smmu_domain->smmu = smmu;
> ret = arm_smmu_domain_finalise(domain, master);
> - if (ret) {
> + if (ret)
> smmu_domain->smmu = NULL;
> - goto out_unlock;
> - }
> - } else if (smmu_domain->smmu != smmu) {
> + } else if (smmu_domain->smmu != smmu)
> ret = -EINVAL;
> - goto out_unlock;
> - } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
> - master->ssid_bits != smmu_domain->cd_table.max_cds_bits) {
> - ret = -EINVAL;
> - goto out_unlock;
> - } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
> - smmu_domain->cd_table.stall_enabled != master->stall_enabled) {
> - ret = -EINVAL;
> - goto out_unlock;
> - }

... then we remove this stall_enabled sanity also.

This means a shared domain (holding a shared CD) being inserted
to two CD tables from two masters would have two different CDTE
configurations at the stall bit.

If this is fine (I can't think of something wrong but not sure),
it would be okay here, though I feel we could mention this some-
where (maybe commit logs) since it changes the attach behavior?

Thanks
Nicolin

2023-08-05 00:17:40

by Nicolin Chen

[permalink] [raw]
Subject: Re: [PATCH v4 6/8] iommu/arm-smmu-v3: Move CD table to arm_smmu_master

On Fri, Aug 04, 2023 at 07:46:23PM -0300, Jason Gunthorpe wrote:
> On Fri, Aug 04, 2023 at 03:25:43PM -0700, Nicolin Chen wrote:
> > > @@ -2436,22 +2419,14 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> > > if (!smmu_domain->smmu) {
> > > smmu_domain->smmu = smmu;
> > > ret = arm_smmu_domain_finalise(domain, master);
> > > - if (ret) {
> > > + if (ret)
> > > smmu_domain->smmu = NULL;
> > > - goto out_unlock;
> > > - }
> > > - } else if (smmu_domain->smmu != smmu) {
> > > + } else if (smmu_domain->smmu != smmu)
> > > ret = -EINVAL;
> > > - goto out_unlock;
> > > - } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
> > > - master->ssid_bits != smmu_domain->cd_table.max_cds_bits) {
> > > - ret = -EINVAL;
> > > - goto out_unlock;
> > > - } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
> > > - smmu_domain->cd_table.stall_enabled != master->stall_enabled) {
> > > - ret = -EINVAL;
> > > - goto out_unlock;
> > > - }
> >
> > ... then we remove this stall_enabled sanity also.
> >
> > This means a shared domain (holding a shared CD) being inserted
> > to two CD tables from two masters would have two different CDTE
> > configurations at the stall bit.
>
> I looked through the spec for a while and I thought this was fine..
>
> Stall is basically a master specific behavior on how to operate page
> faulting. It makes sense that it follows the master and the IOPTEs in
> the domain can be used with both the faulting and non-faulting page
> faulting path.
>
> I would expect the page faulting path to figure out what to (if there
> is anything special to do) do based on the master that triggered the
> fault, not based on the domain that received it.

Yea, I went through the spec too yet didn't find anything that
could block us. And there is no SW dependency on the STALL bit
of the CDTE: actually it has an inverse relationship with the
S1STALLD bit in the STE, so following the STE/cd_table/master
makes sense. So long as a master has its own cd_table holding
its own CDTE for a shared domain, HW CD caching should be fine
as well.

With that being said, I think mentioning this behavior change
in the commit log wouldn't hurt. Someday people might want to
check this out in case something breaks.

Thanks
Nic

2023-08-05 01:23:08

by Nicolin Chen

[permalink] [raw]
Subject: Re: [PATCH v4 4/8] iommu/arm-smmu-v3: move stall_enabled to the cd table

On Thu, Aug 03, 2023 at 12:32:32AM +0800, Michael Shavit wrote:

> This controls whether CD entries will have the stall bit set when
> writing entries into the table.

It'd be nicer to spare a few more lines here -- something like
this yet feel free to rephrase:

The stall bit of a CD entry should follow the master->stall_enabled and
has an inverse relationship with the STE.S1STALLD bit.

Also, a domain should be able to share between two masters, even if they
have different stall_enabled configurations, in which case, two masters
should have two sets of CD tables, i.e. having two different CD entries
for the same domain holding a CD.

So, move the "stall_enabled" out of domain to cd_table. It then controls
whether CD entries will have the stall bit set when writing entries into
the table.

> Signed-off-by: Michael Shavit <[email protected]>
> Reviewed-by: Jason Gunthorpe <[email protected]>

With that,
Reviewed-by: Nicolin Chen <[email protected]>

And btw, you should probably put your Signed-off-by at the end
the commit log, i.e. behind "Reviewed-by", meaning you created/
updated the commit, and then signed it off.

Nicolin

2023-08-05 01:32:23

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v4 6/8] iommu/arm-smmu-v3: Move CD table to arm_smmu_master

On Fri, Aug 04, 2023 at 03:25:43PM -0700, Nicolin Chen wrote:
> > @@ -2436,22 +2419,14 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> > if (!smmu_domain->smmu) {
> > smmu_domain->smmu = smmu;
> > ret = arm_smmu_domain_finalise(domain, master);
> > - if (ret) {
> > + if (ret)
> > smmu_domain->smmu = NULL;
> > - goto out_unlock;
> > - }
> > - } else if (smmu_domain->smmu != smmu) {
> > + } else if (smmu_domain->smmu != smmu)
> > ret = -EINVAL;
> > - goto out_unlock;
> > - } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
> > - master->ssid_bits != smmu_domain->cd_table.max_cds_bits) {
> > - ret = -EINVAL;
> > - goto out_unlock;
> > - } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
> > - smmu_domain->cd_table.stall_enabled != master->stall_enabled) {
> > - ret = -EINVAL;
> > - goto out_unlock;
> > - }
>
> ... then we remove this stall_enabled sanity also.
>
> This means a shared domain (holding a shared CD) being inserted
> to two CD tables from two masters would have two different CDTE
> configurations at the stall bit.

I looked through the spec for a while and I thought this was fine..

Stall is basically a master specific behavior on how to operate page
faulting. It makes sense that it follows the master and the IOPTEs in
the domain can be used with both the faulting and non-faulting page
faulting path.

I would expect the page faulting path to figure out what to (if there
is anything special to do) do based on the master that triggered the
fault, not based on the domain that received it.

Jason

2023-08-07 13:09:19

by Michael Shavit

[permalink] [raw]
Subject: Re: [PATCH v4 4/8] iommu/arm-smmu-v3: move stall_enabled to the cd table

On Sat, Aug 5, 2023 at 7:32 AM Nicolin Chen <[email protected]> wrote:
> It'd be nicer to spare a few more lines here -- something like
> this yet feel free to rephrase:

Yeah you're right, this commit message was pretty sparse. Will update.

> And btw, you should probably put your Signed-off-by at the end
> the commit log, i.e. behind "Reviewed-by", meaning you created/
> updated the commit, and then signed it off.
>
> Nicolin

Ah, thanks for the tip, will do as well :) .

2023-08-07 14:19:33

by Michael Shavit

[permalink] [raw]
Subject: Re: [PATCH v4 6/8] iommu/arm-smmu-v3: Move CD table to arm_smmu_master

On Fri, Aug 4, 2023 at 2:47 AM Jason Gunthorpe <[email protected]> wrote:
>
>
> I'm not surprised, I think arm_smmu_write_ctx is a little too clever
> for its own good..
>
> I would have written it by computing the full target CD entry,
> extracted directly from the domain.
>

Yeah I was considering making a fix to arm_smmu_write_ctx instead; but
clearing the CD entry on detach feels like the right thing to do.
Relying on the 0th CD entry being re-written when the CD table is
re-inserted feels fragile.

Perhaps re-writing arm_smmu_write_ctx could be considered as a
separate singleton patch?

2023-08-08 01:02:20

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v4 6/8] iommu/arm-smmu-v3: Move CD table to arm_smmu_master

On Mon, Aug 07, 2023 at 08:19:44PM +0800, Michael Shavit wrote:
> On Fri, Aug 4, 2023 at 2:47 AM Jason Gunthorpe <[email protected]> wrote:
> >
> >
> > I'm not surprised, I think arm_smmu_write_ctx is a little too clever
> > for its own good..
> >
> > I would have written it by computing the full target CD entry,
> > extracted directly from the domain.
> >
>
> Yeah I was considering making a fix to arm_smmu_write_ctx instead; but
> clearing the CD entry on detach feels like the right thing to do.
> Relying on the 0th CD entry being re-written when the CD table is
> re-inserted feels fragile.
>
> Perhaps re-writing arm_smmu_write_ctx could be considered as a
> separate singleton patch?

I wouldn't touch it in this series at least

Jason