2023-08-08 20:47:31

by Michael Shavit

[permalink] [raw]
Subject: [PATCH v5 0/9] Refactor the SMMU's CD table ownership


Hi all,

This series refactors stage 1 domains so that they describe a single CD
entry. These entries are now inserted into a CD table that is owned by
the arm_smmu_master instead of the domain.
This is conceptually cleaner and unblocks other features, such as
attaching domains with PASID (for unmanaged/dma domains).

This patch series was originally part of a larger patch series that
implemented the set_dev_pasid callback for non-SVA domains but is now
split into a distinct series.

This patch series is also available on gerrit with Jean's SMMU test
engine patches cherry-picked on top for testing:
https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/24742/8

Thanks,
Michael Shavit

Changes in v5:
- Clear the 0th CD entry when the domain is detached. Not clearing it
caused a bug in arm_smmu_write_ctx_desc which doesn't expect the entry
to already be set.
- Fix an issue where cd_table.installed wasn't correctly updated.
- Added commit to clean-up now unused master parameter in
arm_smmu_domain_finalise
- Link to v4: https://lore.kernel.org/all/[email protected]/

Changes in v4:
- Added comment about the cd_table's dependency on the iommu core's
group mutex.
- Narrowed the range of code for which the domain's init_mutex is held
on attach since it now only protects the arm_smmu_domain_finalise
call.
- Link to v3: https://lore.kernel.org/all/[email protected]/

Changes in v3:
- Add a helper to write a CD to all masters that a domain is attached
to.
- Fixed an issue where an arm_smmu_write_ctx_desc error return wasn't
correctly handled by its caller.
- Flip the cd_table.installed bit back off when table is detached
- re-order the commit later in the series since flipping the installed
bit to off isn't obvious when the cd_table is still shared by multiple
masters.
- Link to v2: https://lore.kernel.org/all/[email protected]/

Changes in v2:
- Allocate CD table when it's first needed instead of on probe.
- Minor changes
- Added commit to rename remaining usages of cdcfg to cd_table
- Link to v1: https://lore.kernel.org/all/[email protected]/#r

Changes in v1:
- Replace s1_cfg with arm_smmu_ctx_desc_cfg representing the CD table
- Assume that the CD table is owned by the SMMU master for most
operations. This is forward-compatible with the nested patch series as
these operations wouldn't be called when the installed CD table comes
from nested domains.
- Split off as a distinct patch series from https://lore.kernel.org/all/[email protected]/

Michael Shavit (9):
iommu/arm-smmu-v3: Move ctx_desc out of s1_cfg
iommu/arm-smmu-v3: Replace s1_cfg with cdtab_cfg
iommu/arm-smmu-v3: Encapsulate ctx_desc_cfg init in alloc_cd_tables
iommu/arm-smmu-v3: move stall_enabled to the cd table
iommu/arm-smmu-v3: Refactor write_ctx_desc
iommu/arm-smmu-v3: Move CD table to arm_smmu_master
iommu/arm-smmu-v3: Cleanup arm_smmu_domain_finalise
iommu/arm-smmu-v3: Skip cd sync if CD table isn't active
iommu/arm-smmu-v3: Rename cdcfg to cd_table

.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 33 ++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 255 +++++++++---------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 22 +-
3 files changed, 161 insertions(+), 149 deletions(-)


base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4
--
2.41.0.640.ga95def55d0-goog



2023-08-08 22:37:43

by Michael Shavit

[permalink] [raw]
Subject: [PATCH v5 9/9] iommu/arm-smmu-v3: Rename cdcfg to cd_table

cdcfg is a confusing name, especially given other variables with the cfg
suffix in this driver. cd_table more clearly describes what is being
operated on.

Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Nicolin Chen <[email protected]>
Signed-off-by: Michael Shavit <[email protected]>
---

(no changes since v3)

Changes in v3:
- Commit message update

Changes in v2:
- New commit

drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 66 ++++++++++-----------
1 file changed, 33 insertions(+), 33 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 488d12dd2d4aa..e29e548776c1d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1028,18 +1028,18 @@ static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid)
unsigned int idx;
struct arm_smmu_l1_ctx_desc *l1_desc;
struct arm_smmu_device *smmu = master->smmu;
- struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table;
+ struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;

- if (!cdcfg->l1_desc)
- return cdcfg->cdtab + ssid * CTXDESC_CD_DWORDS;
+ if (!cd_table->l1_desc)
+ return cd_table->cdtab + ssid * CTXDESC_CD_DWORDS;

idx = ssid >> CTXDESC_SPLIT;
- l1_desc = &cdcfg->l1_desc[idx];
+ l1_desc = &cd_table->l1_desc[idx];
if (!l1_desc->l2ptr) {
if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
return NULL;

- l1ptr = cdcfg->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
+ l1ptr = cd_table->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
/* An invalid L1CD can be cached */
arm_smmu_sync_cd(master, ssid, false);
@@ -1134,33 +1134,33 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
size_t l1size;
size_t max_contexts;
struct arm_smmu_device *smmu = master->smmu;
- struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table;
+ struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;

- cdcfg->stall_enabled = master->stall_enabled;
- cdcfg->max_cds_bits = master->ssid_bits;
- max_contexts = 1 << cdcfg->max_cds_bits;
+ cd_table->stall_enabled = master->stall_enabled;
+ cd_table->max_cds_bits = master->ssid_bits;
+ max_contexts = 1 << cd_table->max_cds_bits;

if (!(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB) ||
max_contexts <= CTXDESC_L2_ENTRIES) {
- cdcfg->num_l1_ents = max_contexts;
+ cd_table->num_l1_ents = max_contexts;

l1size = max_contexts * (CTXDESC_CD_DWORDS << 3);
} else {
- cdcfg->num_l1_ents = DIV_ROUND_UP(max_contexts,
+ cd_table->num_l1_ents = DIV_ROUND_UP(max_contexts,
CTXDESC_L2_ENTRIES);

- cdcfg->l1_desc = devm_kcalloc(smmu->dev, cdcfg->num_l1_ents,
- sizeof(*cdcfg->l1_desc),
+ cd_table->l1_desc = devm_kcalloc(smmu->dev, cd_table->num_l1_ents,
+ sizeof(*cd_table->l1_desc),
GFP_KERNEL);
- if (!cdcfg->l1_desc)
+ if (!cd_table->l1_desc)
return -ENOMEM;

- l1size = cdcfg->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3);
+ l1size = cd_table->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3);
}

- cdcfg->cdtab = dmam_alloc_coherent(smmu->dev, l1size, &cdcfg->cdtab_dma,
+ cd_table->cdtab = dmam_alloc_coherent(smmu->dev, l1size, &cd_table->cdtab_dma,
GFP_KERNEL);
- if (!cdcfg->cdtab) {
+ if (!cd_table->cdtab) {
dev_warn(smmu->dev, "failed to allocate context descriptor\n");
ret = -ENOMEM;
goto err_free_l1;
@@ -1169,9 +1169,9 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
return 0;

err_free_l1:
- if (cdcfg->l1_desc) {
- devm_kfree(smmu->dev, cdcfg->l1_desc);
- cdcfg->l1_desc = NULL;
+ if (cd_table->l1_desc) {
+ devm_kfree(smmu->dev, cd_table->l1_desc);
+ cd_table->l1_desc = NULL;
}
return ret;
}
@@ -1181,30 +1181,30 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_master *master)
int i;
size_t size, l1size;
struct arm_smmu_device *smmu = master->smmu;
- struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table;
+ struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;

- if (cdcfg->l1_desc) {
+ if (cd_table->l1_desc) {
size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3);

- for (i = 0; i < cdcfg->num_l1_ents; i++) {
- if (!cdcfg->l1_desc[i].l2ptr)
+ for (i = 0; i < cd_table->num_l1_ents; i++) {
+ if (!cd_table->l1_desc[i].l2ptr)
continue;

dmam_free_coherent(smmu->dev, size,
- cdcfg->l1_desc[i].l2ptr,
- cdcfg->l1_desc[i].l2ptr_dma);
+ cd_table->l1_desc[i].l2ptr,
+ cd_table->l1_desc[i].l2ptr_dma);
}
- devm_kfree(smmu->dev, cdcfg->l1_desc);
- cdcfg->l1_desc = NULL;
+ devm_kfree(smmu->dev, cd_table->l1_desc);
+ cd_table->l1_desc = NULL;

- l1size = cdcfg->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3);
+ l1size = cd_table->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3);
} else {
- l1size = cdcfg->num_l1_ents * (CTXDESC_CD_DWORDS << 3);
+ l1size = cd_table->num_l1_ents * (CTXDESC_CD_DWORDS << 3);
}

- dmam_free_coherent(smmu->dev, l1size, cdcfg->cdtab, cdcfg->cdtab_dma);
- cdcfg->cdtab_dma = 0;
- cdcfg->cdtab = NULL;
+ dmam_free_coherent(smmu->dev, l1size, cd_table->cdtab, cd_table->cdtab_dma);
+ cd_table->cdtab_dma = 0;
+ cd_table->cdtab = NULL;
}

bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd)
--
2.41.0.640.ga95def55d0-goog


2023-08-08 22:49:17

by Michael Shavit

[permalink] [raw]
Subject: [PATCH v5 4/9] iommu/arm-smmu-v3: move stall_enabled to the cd table

A domain can be attached to multiple masters with different
master->stall_enabled values. The stall bit of a CD entry should follow
master->stall_enabled and has an inverse relationship with the
STE.S1STALLD bit.

The stall_enabled bit does not depend on any property of the domain, so
move it out of the arm_smmu_domain struct. Move it to the CD table
struct so that it can fully describe how CD entries should be written to
it.

Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Nicolin Chen <[email protected]>
Signed-off-by: Michael Shavit <[email protected]>
---

Changes in v5:
- Reword commit

Changes in v2:
- Use a bitfield instead of a bool for stall_enabled

drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 8 ++++----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 ++-
2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index fe4b19c3b8dee..c01023404c26c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1114,7 +1114,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) |
CTXDESC_CD_0_V;

- if (smmu_domain->stall_enabled)
+ if (smmu_domain->cd_table.stall_enabled)
val |= CTXDESC_CD_0_S;
}

@@ -1141,6 +1141,7 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain,
struct arm_smmu_device *smmu = smmu_domain->smmu;
struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table;

+ cdcfg->stall_enabled = master->stall_enabled;
cdcfg->max_cds_bits = master->ssid_bits;
max_contexts = 1 << cdcfg->max_cds_bits;

@@ -2121,8 +2122,6 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
if (ret)
goto out_unlock;

- smmu_domain->stall_enabled = master->stall_enabled;
-
ret = arm_smmu_alloc_cd_tables(smmu_domain, master);
if (ret)
goto out_free_asid;
@@ -2461,7 +2460,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
ret = -EINVAL;
goto out_unlock;
} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
- smmu_domain->stall_enabled != master->stall_enabled) {
+ smmu_domain->cd_table.stall_enabled !=
+ master->stall_enabled) {
ret = -EINVAL;
goto out_unlock;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 35a93e8858872..05b1f0ee60808 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -597,6 +597,8 @@ struct arm_smmu_ctx_desc_cfg {
unsigned int num_l1_ents;
/* log2 of the maximum number of CDs supported by this table */
u8 max_cds_bits;
+ /* Whether CD entries in this table have the stall bit set. */
+ u8 stall_enabled:1;
};

struct arm_smmu_s2_cfg {
@@ -714,7 +716,6 @@ struct arm_smmu_domain {
struct mutex init_mutex; /* Protects smmu pointer */

struct io_pgtable_ops *pgtbl_ops;
- bool stall_enabled;
atomic_t nr_ats_masters;

enum arm_smmu_domain_stage stage;
--
2.41.0.640.ga95def55d0-goog