This commit explicitly keeps track of whether a CD table is installed in
an STE so that arm_smmu_sync_cd can skip the sync when unnecessary. This
was previously achieved through the domain->devices list, but we are
moving to a model where arm_smmu_sync_cd directly operates on a master
and the master's CD table instead of a domain.
Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Nicolin Chen <[email protected]>
Signed-off-by: Michael Shavit <[email protected]>
---
Changes in v5:
- Fix an issue where cd_table.installed wasn't correctly updated.
Changes in v3:
- Flip the cd_table.installed bit back off when table is detached
- re-order the commit later in the series since flipping the installed
bit to off isn't obvious when the cd_table is still shared by multiple
masters.
Changes in v2:
- Store field as a bit instead of a bool. Fix comment about STE being
live before the sync in write_ctx_desc().
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 ++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 ++
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f5ad386cc8760..488d12dd2d4aa 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -985,6 +985,9 @@ static void arm_smmu_sync_cd(struct arm_smmu_master *master,
 		},
 	};
 
+	if (!master->cd_table.installed)
+		return;
+
 	cmds.num = 0;
 	for (i = 0; i < master->num_streams; i++) {
 		cmd.cfgi.sid = master->streams[i].id;
@@ -1091,7 +1094,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
 	cdptr[3] = cpu_to_le64(cd->mair);
 
 	/*
-	 * STE is live, and the SMMU might read dwords of this CD in any
+	 * STE may be live, and the SMMU might read dwords of this CD in any
 	 * order. Ensure that it observes valid values before reading
 	 * V=1.
 	 */
@@ -1333,6 +1336,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		 */
 		if (smmu)
 			arm_smmu_sync_ste_for_sid(smmu, sid);
+		master->cd_table.installed = false;
 		return;
 	}
 
@@ -1360,6 +1364,9 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 				   cd_table->l1_desc ?
 					STRTAB_STE_0_S1FMT_64K_L2 :
 					STRTAB_STE_0_S1FMT_LINEAR);
+		cd_table->installed = true;
+	} else {
+		master->cd_table.installed = false;
 	}
 
 	if (s2_cfg) {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 1f3b370257779..e76452e735a04 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -599,6 +599,8 @@ struct arm_smmu_ctx_desc_cfg {
 	u8 max_cds_bits;
 	/* Whether CD entries in this table have the stall bit set. */
 	u8 stall_enabled:1;
+	/* Whether this CD table is installed in any STE */
+	u8 installed:1;
 };
 
 struct arm_smmu_s2_cfg {
--
2.41.0.640.ga95def55d0-goog
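
For readers following the thread, the bookkeeping in this patch can be sketched as a small userspace model. This is only an illustration under stated assumptions: the toy_* names, the cfgi_cd_issued counter, and the boolean s1 argument are hypothetical and do not exist in the driver; only the installed bit and the early return in the sync path mirror the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct toy_cd_table {
	unsigned int installed:1;	/* mirrors cd_table.installed in the patch */
};

struct toy_master {
	struct toy_cd_table cd_table;
	size_t num_streams;
	unsigned int cfgi_cd_issued;	/* counts simulated CD invalidations */
};

/* Models arm_smmu_sync_cd(): skip when no STE points at the CD table. */
static void toy_sync_cd(struct toy_master *master)
{
	if (!master->cd_table.installed)
		return;

	/* One simulated invalidation command per stream ID. */
	master->cfgi_cd_issued += master->num_streams;
}

/*
 * Models the bookkeeping added to arm_smmu_write_strtab_ent():
 * s1 == true stands for "the STE now points at the CD table".
 */
static void toy_write_strtab_ent(struct toy_master *master, bool s1)
{
	master->cd_table.installed = s1;
}

/* Walks through attach and detach, returning the total commands issued. */
static unsigned int toy_demo(void)
{
	struct toy_master m = { .num_streams = 2 };

	toy_sync_cd(&m);			/* not installed: skipped */
	assert(m.cfgi_cd_issued == 0);

	toy_write_strtab_ent(&m, true);		/* CD table installed */
	toy_sync_cd(&m);			/* issues one command per stream */

	toy_write_strtab_ent(&m, false);	/* STE no longer uses the table */
	toy_sync_cd(&m);			/* skipped again */

	return m.cfgi_cd_issued;
}
```

Calling toy_demo() from a main() function returns the number of simulated invalidations; in this model only the sync issued while the table is installed contributes to the count.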
On Wed, Aug 09, 2023 at 01:12:04AM +0800, Michael Shavit wrote:
> This commit explicitly keeps track of whether a CD table is installed in
> an STE so that arm_smmu_sync_cd can skip the sync when unnecessary. This
> was previously achieved through the domain->devices list, but we are
> moving to a model where arm_smmu_sync_cd directly operates on a master
> and the master's CD table instead of a domain.
Why is this path worth optimising?
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index f5ad386cc8760..488d12dd2d4aa 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -985,6 +985,9 @@ static void arm_smmu_sync_cd(struct arm_smmu_master *master,
> },
> };
>
> + if (!master->cd_table.installed)
> + return;
Doesn't this interact badly with the sync in arm_smmu_detach_dev(), which I
think happens after zapping the STE?
> cmds.num = 0;
> for (i = 0; i < master->num_streams; i++) {
> cmd.cfgi.sid = master->streams[i].id;
> @@ -1091,7 +1094,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
> cdptr[3] = cpu_to_le64(cd->mair);
>
> /*
> - * STE is live, and the SMMU might read dwords of this CD in any
> + * STE may be live, and the SMMU might read dwords of this CD in any
> * order. Ensure that it observes valid values before reading
> * V=1.
> */
Why does this patch need to update this comment?
Will
On Wed, Aug 9, 2023 at 9:50 PM Will Deacon <[email protected]> wrote:
>
> On Wed, Aug 09, 2023 at 01:12:04AM +0800, Michael Shavit wrote:
> > This commit explicitly keeps track of whether a CD table is installed in
> > an STE so that arm_smmu_sync_cd can skip the sync when unnecessary. This
> > was previously achieved through the domain->devices list, but we are
> > moving to a model where arm_smmu_sync_cd directly operates on a master
> > and the master's CD table instead of a domain.
>
> Why is this path worth optimising?
I have no idea what the practical impact of this optimization is, but
the motivation here was to make the overall series as close to a nop
as possible. This optimization existed before but is "broken" by the
previous patch. This patch restores it.
> Doesn't this interact badly with the sync in arm_smmu_detach_dev(), which I
> think happens after zapping the STE?
The arm_smmu_write_ctx_desc call added in arm_smmu_detach_dev() was
inserted after zapping the STE precisely so that we could skip the
sync. Is there a concern that a stale CD could be used when the
CDtable is re-inserted into the STE?
> > /*
> > - * STE is live, and the SMMU might read dwords of this CD in any
> > + * STE may be live, and the SMMU might read dwords of this CD in any
> > * order. Ensure that it observes valid values before reading
> > * V=1.
> > */
>
> Why does this patch need to update this comment?
This is a drive-by to make this comment more accurate. Note how
(before this patch series) arm_smmu_domain_finalise_s1 explicitly
mentions that it calls arm_smmu_write_ctx_desc while the STE isn't
installed yet. Yet this comment asserts the STE *is* live.
On Thu, Aug 10, 2023 at 04:34:39PM +0800, Michael Shavit wrote:
> On Wed, Aug 9, 2023 at 9:50 PM Will Deacon <[email protected]> wrote:
> >
> > On Wed, Aug 09, 2023 at 01:12:04AM +0800, Michael Shavit wrote:
> > > This commit explicitly keeps track of whether a CD table is installed in
> > > an STE so that arm_smmu_sync_cd can skip the sync when unnecessary. This
> > > was previously achieved through the domain->devices list, but we are
> > > moving to a model where arm_smmu_sync_cd directly operates on a master
> > > and the master's CD table instead of a domain.
> >
> > Why is this path worth optimising?
>
> I have no idea what the practical impact of this optimization is, but
> the motivation here was to make the overall series as close to a nop
> as possible. This optimization existed before but is "broken" by the
> previous patch. This patch restores it.
I'm not sure it's necessary, tbh. It's not like we're calling
arm_smmu_sync_cd() all over the place -- it's used when we're actually
working with the CD.
> > Doesn't this interact badly with the sync in arm_smmu_detach_dev(), which I
> > think happens after zapping the STE?
>
> The arm_smmu_write_ctx_desc call added in arm_smmu_detach_dev() was
> inserted after zapping the STE precisely so that we could skip the
> sync. Is there a concern that a stale CD could be used when the
> CDtable is re-inserted into the STE?
Ah, sorry, I went and looked at the architecture and it says for
CMD_CFGI_STE:
| This command invalidates all Context descriptors (including L1CD)
| that were cached using the given StreamID.
so as long as we make the CD unreachable in the STE before the STE
invalidation (which I think we do by setting the Config field to bypass or
abort), then I agree that we don't need the subsequent CD invalidation.
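
A toy model of that argument, as a sketch only: it assumes a per-stream configuration cache that behaves as the quoted text describes, and every toy_* name below is hypothetical. It does not represent how any SMMU implementation actually caches configuration.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_SSID 4

enum toy_ste_config { TOY_STE_ABORT, TOY_STE_S1_TRANS };

/* Hypothetical model of one stream: in-memory STE plus cached config. */
struct toy_stream {
	enum toy_ste_config ste;		/* STE as written in memory */
	enum toy_ste_config cached_ste;		/* STE as cached by the "SMMU" */
	bool ste_cached;
	bool cd_cached[TOY_MAX_SSID];		/* CDs cached via this stream */
};

/* Models a translation: fetch and cache the STE, then the CD if stage 1. */
static bool toy_translate(struct toy_stream *s, int ssid)
{
	if (!s->ste_cached) {
		s->cached_ste = s->ste;
		s->ste_cached = true;
	}
	if (s->cached_ste != TOY_STE_S1_TRANS)
		return false;			/* CD is unreachable, never fetched */

	s->cd_cached[ssid] = true;
	return true;
}

/* Models CMD_CFGI_STE: drops the cached STE and all CDs cached with it. */
static void toy_cfgi_ste(struct toy_stream *s)
{
	int i;

	s->ste_cached = false;
	for (i = 0; i < TOY_MAX_SSID; i++)
		s->cd_cached[i] = false;
}

static int toy_cached_cds(const struct toy_stream *s)
{
	int i, n = 0;

	for (i = 0; i < TOY_MAX_SSID; i++)
		n += s->cd_cached[i];
	return n;
}

/* Detach sequence: make the CD unreachable, then invalidate the STE. */
static int toy_detach_demo(void)
{
	struct toy_stream s = { .ste = TOY_STE_S1_TRANS };

	toy_translate(&s, 1);			/* caches the STE and CD 1 */
	assert(toy_cached_cds(&s) == 1);

	s.ste = TOY_STE_ABORT;			/* zap the STE in memory first */
	toy_cfgi_ste(&s);			/* then invalidate STE and CDs */

	toy_translate(&s, 1);			/* refetches STE, cannot reach a CD */
	return toy_cached_cds(&s);
}
```

In this model toy_detach_demo() leaves no cached CD behind, which is the property that makes a separate CD invalidation redundant once the STE has been zapped and invalidated in that order.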
> > > /*
> > > - * STE is live, and the SMMU might read dwords of this CD in any
> > > + * STE may be live, and the SMMU might read dwords of this CD in any
> > > * order. Ensure that it observes valid values before reading
> > > * V=1.
> > > */
> >
> > Why does this patch need to update this comment?
>
> This is a drive-by to make this comment more accurate. Note how
> (before this patch series) arm_smmu_domain_finalise_s1 explicitly
> mentions that it calls arm_smmu_write_ctx_desc while the STE isn't
> installed yet. Yet this comment asserts the STE *is* live.
Can you do it as its own patch then, please?
Will