From: Joerg Roedel <[email protected]>
Hi,
a previous discussion pointed out that using atomic64_t for
domain->pt_root is a bit of overkill. This patch-set replaces it with
unsigned long and first introduces some helpers to make the change easier.
Qian, can you please test these patches in your environment? You can
trigger any race-condition there pretty reliably :)
Other than that, please review and test.
Regards,
Joerg
Changes since v1:
- Addressed review comments from Qian.
Joerg Roedel (2):
iommu/amd: Add helper functions to update domain->pt_root
iommu/amd: Use 'unsigned long' for domain->pt_root
drivers/iommu/amd/amd_iommu_types.h | 2 +-
drivers/iommu/amd/iommu.c | 44 +++++++++++++++++++++--------
2 files changed, 33 insertions(+), 13 deletions(-)
--
2.27.0
From: Joerg Roedel <[email protected]>
Using atomic64_t can be quite expensive, so use unsigned long instead.
This is safe because a naturally aligned, word-sized write becomes
visible atomically.
Signed-off-by: Joerg Roedel <[email protected]>
---
drivers/iommu/amd/amd_iommu_types.h | 2 +-
drivers/iommu/amd/iommu.c | 15 +++++++++++++--
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 30a5d412255a..f6f102282dda 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -468,7 +468,7 @@ struct protection_domain {
iommu core code */
spinlock_t lock; /* mostly used to lock the page table*/
u16 id; /* the domain id written to the device table */
- atomic64_t pt_root; /* pgtable root and pgtable mode */
+ unsigned long pt_root; /* pgtable root and pgtable mode */
int glx; /* Number of levels for GCR3 table */
u64 *gcr3_tbl; /* Guest CR3 table */
unsigned long flags; /* flags to find out type of domain */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 5286ddcfc2f9..aec585f47646 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -156,7 +156,12 @@ static struct protection_domain *to_pdomain(struct iommu_domain *dom)
static void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
struct domain_pgtable *pgtable)
{
- u64 pt_root = atomic64_read(&domain->pt_root);
+ unsigned long pt_root;
+
+ /* Match the barrier in amd_iommu_domain_set_pt_root() */
+ smp_rmb();
+
+ pt_root = READ_ONCE(domain->pt_root);
pgtable->root = (u64 *)(pt_root & PAGE_MASK);
pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
@@ -164,7 +169,13 @@ static void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
static void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
{
- atomic64_set(&domain->pt_root, root);
+ WRITE_ONCE(domain->pt_root, root);
+
+ /*
+ * The new value needs to be globally visible in case pt_root gets
+ * cleared, so that the page-table can be safely freed.
+ */
+ smp_wmb();
}
static void amd_iommu_domain_clr_pt_root(struct protection_domain *domain)
--
2.27.0
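For readers following the patch, here is a minimal userspace sketch of the
pt_root encoding that amd_iommu_domain_get_pgtable() decodes: the
page-aligned root pointer sits in the upper bits and the page-table mode in
the lowest 3 bits. The sketch_* names and the 4 KiB page size are
assumptions made for illustration; they are not taken from the driver.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, standing in for the kernel's PAGE_MASK. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_MODE_MASK 7UL	/* lowest 3 bits encode the pgtable mode */

/* Hypothetical stand-in for struct domain_pgtable. */
struct sketch_pgtable {
	uint64_t *root;
	int mode;
};

/* Hypothetical helper: fold a page-aligned root and a mode into one word. */
static unsigned long sketch_pack_pt_root(uint64_t *root, int mode)
{
	return ((unsigned long)(uintptr_t)root & SKETCH_PAGE_MASK) |
	       ((unsigned long)mode & SKETCH_MODE_MASK);
}

/* Same split as in the patch: mask off the root, then the mode bits. */
static void sketch_unpack_pt_root(unsigned long pt_root,
				  struct sketch_pgtable *pgtable)
{
	pgtable->root = (uint64_t *)(uintptr_t)(pt_root & SKETCH_PAGE_MASK);
	pgtable->mode = (int)(pt_root & SKETCH_MODE_MASK);
}
```

Because root and mode travel in a single word, one store updates both at
once, which is what lets a plain unsigned long replace the atomic64_t here.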
> On Jun 26, 2020, at 4:05 AM, Joerg Roedel <[email protected]> wrote:
>
> a previous discussion pointed out that using atomic64_t for
> domain->pt_root is a bit of overkill. This patch-set replaces it with
> unsigned long and first introduces some helpers to make the change easier.
BTW, from the previous discussion, Linus mentioned,
“
The thing is, the 64-bit atomic reads/writes are very expensive on
32-bit x86. If it was just a native pointer, it would be much cheaper
than an "atomic64_t".
”
However, AMD_IOMMU depends on x86_64 here, so I am wondering whether it
makes any sense to run this code on 32-bit x86 at all?
On Fri, Jun 26, 2020 at 08:30:21AM -0400, Qian Cai wrote:
> BTW, from the previous discussion, Linus mentioned,
>
> “
> The thing is, the 64-bit atomic reads/writes are very expensive on
> 32-bit x86. If it was just a native pointer, it would be much cheaper
> than an "atomic64_t".
> ”
>
> However, AMD_IOMMU depends on x86_64 here, so I am wondering whether it
> makes any sense to run this code on 32-bit x86 at all?
No, it doesn't. The driver is not supported on 32-bit and probably never
will be. I will skip this patch and only apply the first one, as it is an
improvement in itself.
Regards,
Joerg
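As a rough userspace analogy to the trade-off discussed in this thread (not
the kernel primitives from the patch), C11 atomics on a native word show why
a pointer-sized value is cheap to publish. The sketch_* names are
hypothetical, and the release/acquire ordering is an assumption chosen for
the sketch rather than what the driver does.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for domain->pt_root: one native machine word. */
static _Atomic unsigned long sketch_pt_root;

/* Writer side: publish the new root/mode word with release ordering. */
static void sketch_set_pt_root(unsigned long root)
{
	atomic_store_explicit(&sketch_pt_root, root, memory_order_release);
}

/* Reader side: take a single snapshot of the word with acquire ordering. */
static unsigned long sketch_get_pt_root(void)
{
	return atomic_load_explicit(&sketch_pt_root, memory_order_acquire);
}

/*
 * ATOMIC_LONG_LOCK_FREE is 2 when atomic operations on long are always
 * lock-free on the target, i.e. they map to plain machine instructions.
 */
static int sketch_long_is_always_lock_free(void)
{
	return ATOMIC_LONG_LOCK_FREE == 2;
}
```

On a 64-bit target a word-sized load or store needs no special instruction
sequence. The cost Linus describes arises for 64-bit values on 32-bit x86,
which does not apply to a driver that only builds on x86_64.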