2024-04-13 03:48:25

by Nicolin Chen

[permalink] [raw]
Subject: [PATCH RFCv1 00/14] Add Tegra241 (Grace) CMDQV Support (part 2/2)

This is an experimental RFC series for VIOMMU infrastructure, using NVIDIA
Tegra241 (Grace) CMDQV as a test instance.

VIOMMU obj is used to represent a virtual interface (iommu) backed by an
underlying IOMMU's HW-accelerated feature for virtualizaion: for example,
NVIDIA's VINTF (v-interface for CMDQV) and AMD"s vIOMMU.

VQUEUE obj is used to represent a virtual command queue (buffer) backed by
an underlying IOMMU command queue to passthrough for VMs to use directly:
for example, NVIDIA's Virtual Command Queue and AMD's Command Buffer.

NVIDIA's CMDQV requires a pair of physical and virtual device Stream IDs
to process ATC invalidation commands by ARM SMMU. So, set/unset_dev_id ops
and ioctls are introduced to VIOMMU.

Also, a passthrough queue has a pair of start and tail pointers/indexes in
the real HW registers, which should be mmaped to user space for hypervisor
to map to VM's mmio region directly. Thus, iommufd needs an mmap op too.

Some todos/opens:
1. Add selftest coverages for new ioctls
2. The mmap needs a way to get viommu_id. Currently it's getting from
vma->vm_pgoff, which might not be ideal.
3. This series is only verified with a single passthrough device that's
hehind a physical ARM SMMU. So, devices behind two+ IOMMUs might need
some additional support (and verifications).
4. Requires for comments from AMD folks to support AMD's vIOMMU feature.

This series is on Github (for review and reference only):
https://github.com/nicolinc/iommufd/commits/vcmdq_user_space-rfc-v1

Real HW tests wre conducted with this QEMU branch:
https://github.com/nicolinc/qemu/commits/wip/iommufd_vcmdq/

Thanks

Nicolin Chen (14):
iommufd: Move iommufd_object to public iommufd header
iommufd: Swap _iommufd_object_alloc and __iommufd_object_alloc
iommufd: Prepare for viommu structures and functions
iommufd: Add struct iommufd_viommu and iommufd_viommu_ops
iommufd: Add IOMMUFD_OBJ_VIOMMU and IOMMUFD_CMD_VIOMMU_ALLOC
iommufd/selftest: Add IOMMU_VIOMMU_ALLOC test coverage
iommufd: Add viommu set/unset_dev_id ops
iommufd: Add IOMMU_VIOMMU_SET_DEV_ID ioctl
iommufd/selftest: Add IOMMU_VIOMMU_SET_DEV_ID test coverage
iommufd/selftest: Add IOMMU_TEST_OP_MV_CHECK_DEV_ID
iommufd: Add struct iommufd_vqueue and its related viommu ops
iommufd: Add IOMMUFD_OBJ_VQUEUE and IOMMUFD_CMD_VQUEUE_ALLOC
iommufd: Add mmap infrastructure
iommu/tegra241-cmdqv: Add user-space use support

drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 19 ++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 19 ++
.../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 284 +++++++++++++++++-
drivers/iommu/iommufd/Makefile | 3 +-
drivers/iommu/iommufd/device.c | 11 +
drivers/iommu/iommufd/hw_pagetable.c | 4 +-
drivers/iommu/iommufd/iommufd_private.h | 71 +++--
drivers/iommu/iommufd/iommufd_test.h | 5 +
drivers/iommu/iommufd/main.c | 69 ++++-
drivers/iommu/iommufd/selftest.c | 100 ++++++
drivers/iommu/iommufd/viommu.c | 235 +++++++++++++++
include/linux/iommu.h | 16 +
include/linux/iommufd.h | 100 ++++++
include/uapi/linux/iommufd.h | 98 ++++++
tools/testing/selftests/iommu/iommufd.c | 44 +++
tools/testing/selftests/iommu/iommufd_utils.h | 71 +++++
16 files changed, 1103 insertions(+), 46 deletions(-)
create mode 100644 drivers/iommu/iommufd/viommu.c

--
2.43.0



2024-04-13 03:48:27

by Nicolin Chen

[permalink] [raw]
Subject: [PATCH RFCv1 03/14] iommufd: Prepare for viommu structures and functions

The following changes will introduce a new iommufd_viommu and its related
structures and functions. These new core-level structures will be embedded
in the driver-level structures. So a helper to allocate the bundle of core
and driver strucutres will be nice to have.

As a preparatory change, introduce a viommu.c file and put a new structure
allocator in it. Since some of the object allocation will be similar, also
move the common part to a new level-3 allocator ___iommufd_object_alloc().

Signed-off-by: Nicolin Chen <[email protected]>
---
drivers/iommu/iommufd/Makefile | 3 ++-
drivers/iommu/iommufd/iommufd_private.h | 20 ++++++++++++++++++++
drivers/iommu/iommufd/main.c | 9 +++------
drivers/iommu/iommufd/viommu.c | 19 +++++++++++++++++++
4 files changed, 44 insertions(+), 7 deletions(-)
create mode 100644 drivers/iommu/iommufd/viommu.c

diff --git a/drivers/iommu/iommufd/Makefile b/drivers/iommu/iommufd/Makefile
index 34b446146961..76c2bc41af40 100644
--- a/drivers/iommu/iommufd/Makefile
+++ b/drivers/iommu/iommufd/Makefile
@@ -6,7 +6,8 @@ iommufd-y := \
ioas.o \
main.o \
pages.o \
- vfio_compat.o
+ vfio_compat.o \
+ viommu.o

iommufd-$(CONFIG_IOMMUFD_TEST) += selftest.o

diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 3acbc67dd5f0..eccc565ed38e 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -156,6 +156,26 @@ void iommufd_object_abort_and_destroy(struct iommufd_ctx *ictx,
void iommufd_object_finalize(struct iommufd_ctx *ictx,
struct iommufd_object *obj);

+static inline struct iommufd_object *___iommufd_object_alloc(size_t size)
+{
+ struct iommufd_object *obj;
+
+ obj = kzalloc(size, GFP_KERNEL_ACCOUNT);
+ if (!obj)
+ return ERR_PTR(-ENOMEM);
+
+ /* Starts out bias'd by 1 until it is removed from the xarray */
+ refcount_set(&obj->shortterm_users, 1);
+ refcount_set(&obj->users, 1);
+
+ /*
+ * The allocation of an obj->id needs an ictx, so it has to be done
+ * after this ___iommufd_object_alloc() callback.
+ */
+
+ return obj;
+}
+
enum {
REMOVE_WAIT_SHORTTERM = 1,
};
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index a51ab766e183..5187942b375d 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -36,13 +36,10 @@ struct iommufd_object *__iommufd_object_alloc(struct iommufd_ctx *ictx,
struct iommufd_object *obj;
int rc;

- obj = kzalloc(size, GFP_KERNEL_ACCOUNT);
- if (!obj)
- return ERR_PTR(-ENOMEM);
+ obj = ___iommufd_object_alloc(size);
+ if (IS_ERR(obj))
+ return obj;
obj->type = type;
- /* Starts out bias'd by 1 until it is removed from the xarray */
- refcount_set(&obj->shortterm_users, 1);
- refcount_set(&obj->users, 1);

/*
* Reserve an ID in the xarray but do not publish the pointer yet since
diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
new file mode 100644
index 000000000000..f77d6972d552
--- /dev/null
+++ b/drivers/iommu/iommufd/viommu.c
@@ -0,0 +1,19 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES
+ */
+
+#include <linux/iommufd.h>
+
+#include "iommufd_private.h"
+
+#define viommu_struct_alloc(name) \
+ struct iommufd_##name *_iommufd_##name##_alloc(size_t size) \
+ { \
+ struct iommufd_object *obj; \
+ if (WARN_ON(size < sizeof(struct iommufd_##name))) \
+ return NULL; \
+ obj = ___iommufd_object_alloc(size); \
+ if (IS_ERR(obj)) \
+ return NULL; \
+ return container_of(obj, struct iommufd_##name, obj); \
+ }
--
2.43.0


2024-04-13 03:49:15

by Nicolin Chen

[permalink] [raw]
Subject: [PATCH RFCv1 04/14] iommufd: Add struct iommufd_viommu and iommufd_viommu_ops

Add a new iommufd_viommu core structure to represent a vIOMMU instance in
the user space, typically backed by a HW-accelerated feature of an IOMMU,
e.g. NVIDIA CMDQ-Virtualization (an ARM SMMUv3 extension) and AMD Hardware
Accelerated Virtualized IOMMU (vIOMMU).

Define a new iommufd_viommu_ops to hold the free op and any future op.

A driver should embed this core structure in its driver viommu structure
and call the new iommufd_viommu_alloc() helper to allocate a core/driver
structure bundle and fill its core viommu->ops:
struct my_driver_viommu {
struct iommufd_viommu core;
....
};

static const struct iommufd_viommu_ops my_driver_viommu_ops = {
.free = my_driver_viommu_free,
};

struct my_driver_viommu *my_viommu =
iommufd_viommu_alloc(my_driver_viommu, core);
my_viommu->core.ops = &my_driver_viommu_ops;

Lastly, add viommu_alloc to iommu_ops so iommufd can link to the driver to
allocate a viommu object.

Any future viommu op will be added to the new struct iommufd_viommu_ops.

Signed-off-by: Nicolin Chen <[email protected]>
---
drivers/iommu/iommufd/viommu.c | 2 ++
include/linux/iommu.h | 15 ++++++++++++++
include/linux/iommufd.h | 38 ++++++++++++++++++++++++++++++++++
3 files changed, 55 insertions(+)

diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index f77d6972d552..3886b1dd1f13 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -17,3 +17,5 @@
return NULL; \
return container_of(obj, struct iommufd_##name, obj); \
}
+
+viommu_struct_alloc(viommu);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 2e925b5eba53..4d4461146288 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -145,6 +145,8 @@ struct iopf_queue {
struct mutex lock;
};

+struct iommufd_viommu;
+
/* iommu fault flags */
#define IOMMU_FAULT_READ 0x0
#define IOMMU_FAULT_WRITE 0x1
@@ -527,6 +529,14 @@ static inline int __iommu_copy_struct_from_user_array(
* @of_xlate: add OF master IDs to iommu grouping
* @is_attach_deferred: Check if domain attach should be deferred from iommu
* driver init to device driver init (default no)
+ * @viommu_alloc: Allocate an iommufd_viommu as a user space IOMMU instance,
+ * associated to a nested parent @domain, for iommu-specific
+ * hardware acceleration. The @viommu_type must be defined in
+ * the include/uapi/linux/iommufd.h header.
+ * It is suggested to call iommufd_viommu_alloc() helper for
+ * a bundled allocation of the core and the driver structure,
+ * and a driver in gernal should assign an iommufd_viommu_ops
+ * to the core structure's viommu->ops.
* @dev_enable/disable_feat: per device entries to enable/disable
* iommu specific features.
* @page_response: handle page request response
@@ -570,6 +580,11 @@ struct iommu_ops {
int (*of_xlate)(struct device *dev, const struct of_phandle_args *args);
bool (*is_attach_deferred)(struct device *dev);

+ /* User space instance allocation by the iommu driver */
+ struct iommufd_viommu *(*viommu_alloc)(struct device *dev,
+ unsigned int viommu_type,
+ struct iommu_domain *domain);
+
/* Per device IOMMU features */
int (*dev_enable_feat)(struct device *dev, enum iommu_dev_features f);
int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f);
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index a0cb08a4b653..650acfac307a 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -16,6 +16,7 @@ struct iommufd_device;
struct page;
struct iommufd_ctx;
struct iommufd_access;
+struct iommufd_hwpt_paging;
struct file;
struct iommu_group;

@@ -77,6 +78,26 @@ void iommufd_access_detach(struct iommufd_access *access);

void iommufd_ctx_get(struct iommufd_ctx *ictx);

+struct iommufd_viommu {
+ struct iommufd_object obj;
+ struct iommufd_ctx *ictx;
+ struct iommu_device *iommu_dev;
+ struct iommufd_hwpt_paging *hwpt;
+
+ const struct iommufd_viommu_ops *ops;
+
+ unsigned int type;
+};
+
+/**
+ * struct iommufd_viommu_ops - viommu specific operations
+ * @free: Free all driver-specific parts of an iommufd_viommu. The memory
+ * of the entire viommu will be free-ed by iommufd core
+ */
+struct iommufd_viommu_ops {
+ void (*free)(struct iommufd_viommu *viommu);
+};
+
#if IS_ENABLED(CONFIG_IOMMUFD)
struct iommufd_ctx *iommufd_ctx_from_file(struct file *file);
struct iommufd_ctx *iommufd_ctx_from_fd(int fd);
@@ -93,6 +114,8 @@ int iommufd_access_rw(struct iommufd_access *access, unsigned long iova,
int iommufd_vfio_compat_ioas_get_id(struct iommufd_ctx *ictx, u32 *out_ioas_id);
int iommufd_vfio_compat_ioas_create(struct iommufd_ctx *ictx);
int iommufd_vfio_compat_set_no_iommu(struct iommufd_ctx *ictx);
+
+struct iommufd_viommu *_iommufd_viommu_alloc(size_t size);
#else /* !CONFIG_IOMMUFD */
static inline struct iommufd_ctx *iommufd_ctx_from_file(struct file *file)
{
@@ -133,5 +156,20 @@ static inline int iommufd_vfio_compat_set_no_iommu(struct iommufd_ctx *ictx)
{
return -EOPNOTSUPP;
}
+
+static inline struct iommufd_viommu *_iommufd_viommu_alloc(size_t size)
+{
+ return NULL;
+}
#endif /* CONFIG_IOMMUFD */
+
+/*
+ * Helpers for IOMMU driver to allocate driver structures that will be freed by
+ * the iommufd core. A driver is only responsible for its own struct cleanup.
+ */
+#define iommufd_viommu_alloc(drv_struct, member) \
+ container_of(_iommufd_viommu_alloc(sizeof(struct drv_struct) + \
+ BUILD_BUG_ON_ZERO(offsetof( \
+ struct drv_struct, member))), \
+ struct drv_struct, member)
#endif
--
2.43.0


2024-04-13 03:49:25

by Nicolin Chen

[permalink] [raw]
Subject: [PATCH RFCv1 05/14] iommufd: Add IOMMUFD_OBJ_VIOMMU and IOMMUFD_CMD_VIOMMU_ALLOC

Corresponding to the new iommufd_viommu core structure that represents a
vIOMMU instance in the user space for HW-accelerated features, add a new
IOMMUFD_OBJ_VIOMMU and its ioctl for user space to allocate it.

Signed-off-by: Nicolin Chen <[email protected]>
---
drivers/iommu/iommufd/iommufd_private.h | 3 +
drivers/iommu/iommufd/main.c | 6 ++
drivers/iommu/iommufd/viommu.c | 83 +++++++++++++++++++++++++
include/linux/iommufd.h | 1 +
include/uapi/linux/iommufd.h | 30 +++++++++
5 files changed, 123 insertions(+)

diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index eccc565ed38e..ae90b4493109 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -424,6 +424,9 @@ void iopt_remove_access(struct io_pagetable *iopt,
u32 iopt_access_list_id);
void iommufd_access_destroy_object(struct iommufd_object *obj);

+int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd);
+void iommufd_viommu_destroy(struct iommufd_object *obj);
+
#ifdef CONFIG_IOMMUFD_TEST
int iommufd_test(struct iommufd_ucmd *ucmd);
void iommufd_selftest_destroy(struct iommufd_object *obj);
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 5187942b375d..9de7e3e63ce4 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -323,6 +323,7 @@ union ucmd_buffer {
struct iommu_hwpt_set_dirty_tracking set_dirty_tracking;
struct iommu_ioas_alloc alloc;
struct iommu_ioas_allow_iovas allow_iovas;
+ struct iommu_viommu_alloc viommu;
struct iommu_ioas_copy ioas_copy;
struct iommu_ioas_iova_ranges iova_ranges;
struct iommu_ioas_map map;
@@ -378,6 +379,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
val64),
IOCTL_OP(IOMMU_VFIO_IOAS, iommufd_vfio_ioas, struct iommu_vfio_ioas,
__reserved),
+ IOCTL_OP(IOMMU_VIOMMU_ALLOC, iommufd_viommu_alloc_ioctl,
+ struct iommu_viommu_alloc, out_viommu_id),
#ifdef CONFIG_IOMMUFD_TEST
IOCTL_OP(IOMMU_TEST_CMD, iommufd_test, struct iommu_test_cmd, last),
#endif
@@ -510,6 +513,9 @@ static const struct iommufd_object_ops iommufd_object_ops[] = {
.destroy = iommufd_hwpt_nested_destroy,
.abort = iommufd_hwpt_nested_abort,
},
+ [IOMMUFD_OBJ_VIOMMU] = {
+ .destroy = iommufd_viommu_destroy,
+ },
#ifdef CONFIG_IOMMUFD_TEST
[IOMMUFD_OBJ_SELFTEST] = {
.destroy = iommufd_selftest_destroy,
diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index 3886b1dd1f13..079e0ff79942 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -19,3 +19,86 @@
}

viommu_struct_alloc(viommu);
+
+void iommufd_viommu_destroy(struct iommufd_object *obj)
+{
+ struct iommufd_viommu *viommu =
+ container_of(obj, struct iommufd_viommu, obj);
+
+ if (viommu->ops && viommu->ops->free)
+ viommu->ops->free(viommu);
+ refcount_dec(&viommu->hwpt->common.obj.users);
+}
+
+int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd)
+{
+ struct iommu_viommu_alloc *cmd = ucmd->cmd;
+ struct iommufd_hwpt_paging *hwpt_paging;
+ struct iommu_device *iommu_dev;
+ struct iommufd_viommu *viommu;
+ struct iommufd_device *idev;
+ int rc;
+
+ if (cmd->flags)
+ return -EOPNOTSUPP;
+
+ idev = iommufd_get_device(ucmd, cmd->dev_id);
+ if (IS_ERR(idev))
+ return PTR_ERR(idev);
+ iommu_dev = idev->dev->iommu->iommu_dev;
+
+ if (!iommu_dev->ops->viommu_alloc) {
+ rc = -EOPNOTSUPP;
+ goto out_put_idev;
+ }
+
+ hwpt_paging = iommufd_get_hwpt_paging(ucmd, cmd->hwpt_id);
+ if (IS_ERR(hwpt_paging)) {
+ rc = PTR_ERR(hwpt_paging);
+ goto out_put_idev;
+ }
+
+ if (!hwpt_paging->nest_parent) {
+ rc = -EINVAL;
+ goto out_put_hwpt;
+ }
+
+ viommu = iommu_dev->ops->viommu_alloc(idev->dev, cmd->type,
+ hwpt_paging->common.domain);
+ if (IS_ERR(viommu)) {
+ rc = PTR_ERR(viommu);
+ goto out_put_hwpt;
+ }
+
+ /* iommufd_object_finalize will store the viommu->obj.id */
+ rc = xa_alloc(&ucmd->ictx->objects, &viommu->obj.id, XA_ZERO_ENTRY,
+ xa_limit_31b, GFP_KERNEL_ACCOUNT);
+ if (rc)
+ goto out_free;
+
+ viommu->obj.type = IOMMUFD_OBJ_VIOMMU;
+ viommu->type = cmd->type;
+
+ viommu->ictx = ucmd->ictx;
+ viommu->hwpt = hwpt_paging;
+ viommu->iommu_dev = idev->dev->iommu->iommu_dev;
+ cmd->out_viommu_id = viommu->obj.id;
+ rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+ if (rc)
+ goto out_erase_xa;
+ iommufd_object_finalize(ucmd->ictx, &viommu->obj);
+ refcount_inc(&viommu->hwpt->common.obj.users);
+ goto out_put_hwpt;
+
+out_erase_xa:
+ xa_erase(&ucmd->ictx->objects, viommu->obj.id);
+out_free:
+ if (viommu->ops && viommu->ops->free)
+ viommu->ops->free(viommu);
+ kfree(viommu);
+out_put_hwpt:
+ iommufd_put_object(ucmd->ictx, &hwpt_paging->common.obj);
+out_put_idev:
+ iommufd_put_object(ucmd->ictx, &idev->obj);
+ return rc;
+}
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 650acfac307a..dec10c6bb261 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -28,6 +28,7 @@ enum iommufd_object_type {
IOMMUFD_OBJ_HWPT_NESTED,
IOMMUFD_OBJ_IOAS,
IOMMUFD_OBJ_ACCESS,
+ IOMMUFD_OBJ_VIOMMU,
#ifdef CONFIG_IOMMUFD_TEST
IOMMUFD_OBJ_SELFTEST,
#endif
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 1dfeaa2e649e..2b0825d69846 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -50,6 +50,7 @@ enum {
IOMMUFD_CMD_HWPT_SET_DIRTY_TRACKING,
IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP,
IOMMUFD_CMD_HWPT_INVALIDATE,
+ IOMMUFD_CMD_VIOMMU_ALLOC,
};

/**
@@ -692,4 +693,33 @@ struct iommu_hwpt_invalidate {
__u32 __reserved;
};
#define IOMMU_HWPT_INVALIDATE _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_INVALIDATE)
+
+/**
+ * enum iommu_viommu_type - VIOMMU Type
+ * @IOMMU_VIOMMU_TEGRA241_CMDQV: NVIDIA Tegra241 CMDQV Extension for SMMUv3
+ */
+enum iommu_viommu_type {
+ IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV,
+};
+
+/**
+ * struct iommu_viommu_alloc - ioctl(IOMMU_VIOMMU_ALLOC)
+ * @size: sizeof(struct iommu_viommu_alloc)
+ * @flags: Must be 0
+ * @type: Type of the VIOMMU object. Must be defined in enum iommu_viommu_type
+ * @dev_id: The device to allocate this virtual IOMMU for
+ * @hwpt_id: ID of a nested parent HWPT
+ * @out_viommu_id: Output virtual IOMMU ID for the allocated object
+ *
+ * Allocate an virtual IOMMU object that holds a (shared) nested parent HWPT
+ */
+struct iommu_viommu_alloc {
+ __u32 size;
+ __u32 flags;
+ __u32 type;
+ __u32 dev_id;
+ __u32 hwpt_id;
+ __u32 out_viommu_id;
+};
+#define IOMMU_VIOMMU_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VIOMMU_ALLOC)
#endif
--
2.43.0


2024-04-13 03:49:39

by Nicolin Chen

[permalink] [raw]
Subject: [PATCH RFCv1 08/14] iommufd: Add IOMMU_VIOMMU_SET_DEV_ID ioctl

Introduce a new ioctl to set a per-viommu device virtual id that should be
linked to the physical device id (or just a struct device pointer).

Since a viommu (user space IOMMU instance) can have multiple devices while
it's not ideal to confine a device to one single user space IOMMU instance
either, these two shouldn't just do a 1:1 mapping. Add two xarrays in their
structures to bind them bidirectionally.

Signed-off-by: Nicolin Chen <[email protected]>
---
drivers/iommu/iommufd/device.c | 11 +++++
drivers/iommu/iommufd/iommufd_private.h | 10 +++++
drivers/iommu/iommufd/main.c | 2 +
drivers/iommu/iommufd/viommu.c | 58 +++++++++++++++++++++++++
include/linux/iommufd.h | 2 +
include/uapi/linux/iommufd.h | 20 +++++++++
6 files changed, 103 insertions(+)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 873630c111c1..68086f3074b6 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -135,7 +135,16 @@ void iommufd_device_destroy(struct iommufd_object *obj)
{
struct iommufd_device *idev =
container_of(obj, struct iommufd_device, obj);
+ struct iommufd_viommu *viommu;
+ unsigned long index;

+ xa_for_each(&idev->viommus, index, viommu) {
+ if (viommu->ops->unset_dev_id)
+ viommu->ops->unset_dev_id(viommu, idev->dev);
+ xa_erase(&viommu->idevs, idev->obj.id);
+ xa_erase(&idev->viommus, index);
+ }
+ xa_destroy(&idev->viommus);
iommu_device_release_dma_owner(idev->dev);
iommufd_put_group(idev->igroup);
if (!iommufd_selftest_is_mock_dev(idev->dev))
@@ -216,6 +225,8 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
/* igroup refcount moves into iommufd_device */
idev->igroup = igroup;

+ xa_init_flags(&idev->viommus, XA_FLAGS_ALLOC1);
+
/*
* If the caller fails after this success it must call
* iommufd_unbind_device() which is safe since we hold this refcount.
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index ae90b4493109..9ba8f4ecc221 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -392,6 +392,7 @@ struct iommufd_device {
struct list_head group_item;
/* always the physical device */
struct device *dev;
+ struct xarray viommus;
bool enforce_cache_coherency;
};

@@ -424,8 +425,17 @@ void iopt_remove_access(struct io_pagetable *iopt,
u32 iopt_access_list_id);
void iommufd_access_destroy_object(struct iommufd_object *obj);

+static inline struct iommufd_viommu *
+iommufd_get_viommu(struct iommufd_ucmd *ucmd, u32 id)
+{
+ return container_of(iommufd_get_object(ucmd->ictx, id,
+ IOMMUFD_OBJ_VIOMMU),
+ struct iommufd_viommu, obj);
+}
+
int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd);
void iommufd_viommu_destroy(struct iommufd_object *obj);
+int iommufd_viommu_set_device_id(struct iommufd_ucmd *ucmd);

#ifdef CONFIG_IOMMUFD_TEST
int iommufd_test(struct iommufd_ucmd *ucmd);
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 9de7e3e63ce4..16efc3346a2a 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -381,6 +381,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
__reserved),
IOCTL_OP(IOMMU_VIOMMU_ALLOC, iommufd_viommu_alloc_ioctl,
struct iommu_viommu_alloc, out_viommu_id),
+ IOCTL_OP(IOMMU_VIOMMU_SET_DEV_ID, iommufd_viommu_set_device_id,
+ struct iommu_viommu_set_dev_id, id),
#ifdef CONFIG_IOMMUFD_TEST
IOCTL_OP(IOMMU_TEST_CMD, iommufd_test, struct iommu_test_cmd, last),
#endif
diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index 079e0ff79942..71baca0c75de 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -69,6 +69,7 @@ int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd)
rc = PTR_ERR(viommu);
goto out_put_hwpt;
}
+ xa_init_flags(&viommu->idevs, XA_FLAGS_ALLOC1);

/* iommufd_object_finalize will store the viommu->obj.id */
rc = xa_alloc(&ucmd->ictx->objects, &viommu->obj.id, XA_ZERO_ENTRY,
@@ -102,3 +103,60 @@ int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd)
iommufd_put_object(ucmd->ictx, &idev->obj);
return rc;
}
+
+int iommufd_viommu_set_device_id(struct iommufd_ucmd *ucmd)
+{
+ struct iommu_viommu_set_dev_id *cmd = ucmd->cmd;
+ unsigned int dev_id, viommu_id;
+ struct iommufd_viommu *viommu;
+ struct iommufd_device *idev;
+ int rc;
+
+ idev = iommufd_get_device(ucmd, cmd->dev_id);
+ if (IS_ERR(idev))
+ return PTR_ERR(idev);
+ dev_id = idev->obj.id;
+
+ viommu = iommufd_get_viommu(ucmd, cmd->viommu_id);
+ if (IS_ERR(viommu)) {
+ rc = PTR_ERR(viommu);
+ goto out_put_idev;
+ }
+
+ if (viommu->iommu_dev != idev->dev->iommu->iommu_dev) {
+ rc = -EINVAL;
+ goto out_put_viommu;
+ }
+
+ if (!viommu->ops->set_dev_id || !viommu->ops->unset_dev_id) {
+ rc = -EOPNOTSUPP;
+ goto out_put_viommu;
+ }
+
+ rc = xa_alloc(&idev->viommus, &viommu_id, viommu,
+ XA_LIMIT(viommu->obj.id, viommu->obj.id),
+ GFP_KERNEL_ACCOUNT);
+ if (rc)
+ goto out_put_viommu;
+
+ rc = xa_alloc(&viommu->idevs, &dev_id, idev,
+ XA_LIMIT(dev_id, dev_id), GFP_KERNEL_ACCOUNT);
+ if (rc)
+ goto out_xa_erase_viommu;
+
+ rc = viommu->ops->set_dev_id(viommu, idev->dev, cmd->id);
+ if (rc)
+ goto out_xa_erase_idev;
+
+ goto out_put_viommu;
+
+out_xa_erase_idev:
+ xa_erase(&viommu->idevs, idev->obj.id);
+out_xa_erase_viommu:
+ xa_erase(&idev->viommus, viommu->obj.id);
+out_put_viommu:
+ iommufd_put_object(ucmd->ictx, &viommu->obj);
+out_put_idev:
+ iommufd_put_object(ucmd->ictx, &idev->obj);
+ return rc;
+}
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index ca6ac8a1ffd0..2be302b82f47 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -10,6 +10,7 @@
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/refcount.h>
+#include <linux/xarray.h>

struct device;
struct iommufd_device;
@@ -88,6 +89,7 @@ struct iommufd_viommu {
const struct iommufd_viommu_ops *ops;

unsigned int type;
+ struct xarray idevs;
};

/**
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 2b0825d69846..eaa192de63d3 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -51,6 +51,7 @@ enum {
IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP,
IOMMUFD_CMD_HWPT_INVALIDATE,
IOMMUFD_CMD_VIOMMU_ALLOC,
+ IOMMUFD_CMD_VIOMMU_SET_DEV_ID,
};

/**
@@ -722,4 +723,23 @@ struct iommu_viommu_alloc {
__u32 out_viommu_id;
};
#define IOMMU_VIOMMU_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VIOMMU_ALLOC)
+
+/**
+ * struct iommu_viommu_set_dev_id - ioctl(IOMMU_VIOMMU_SET_DEV_ID)
+ * @size: sizeof(struct iommu_viommu_set_dev_id)
+ * @viommu_id: viommu ID to associate with the device to store its virtual ID
+ * @dev_id: device ID to set a device virtual ID
+ * @__reserved: Must be 0
+ * @id: Device virtual ID
+ *
+ * Set a viommu-specific virtual ID of a device
+ */
+struct iommu_viommu_set_dev_id {
+ __u32 size;
+ __u32 viommu_id;
+ __u32 dev_id;
+ __u32 __reserved;
+ __aligned_u64 id;
+};
+#define IOMMU_VIOMMU_SET_DEV_ID _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VIOMMU_SET_DEV_ID)
#endif
--
2.43.0


2024-04-13 03:49:41

by Nicolin Chen

[permalink] [raw]
Subject: [PATCH RFCv1 10/14] iommufd/selftest: Add IOMMU_TEST_OP_MV_CHECK_DEV_ID

This allows verifying the ids xarray of struct mock_viommu, and confirming
the correctness of an IOMMU_VIOMMU_SET_DEV_ID call.

Signed-off-by: Nicolin Chen <[email protected]>
---
drivers/iommu/iommufd/iommufd_test.h | 5 +++
drivers/iommu/iommufd/selftest.c | 32 +++++++++++++++++++
tools/testing/selftests/iommu/iommufd.c | 7 ++++
tools/testing/selftests/iommu/iommufd_utils.h | 24 ++++++++++++++
4 files changed, 68 insertions(+)

diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h
index e854d3f67205..91af979a0c23 100644
--- a/drivers/iommu/iommufd/iommufd_test.h
+++ b/drivers/iommu/iommufd/iommufd_test.h
@@ -22,6 +22,7 @@ enum {
IOMMU_TEST_OP_MOCK_DOMAIN_FLAGS,
IOMMU_TEST_OP_DIRTY,
IOMMU_TEST_OP_MD_CHECK_IOTLB,
+ IOMMU_TEST_OP_MV_CHECK_DEVID,
};

enum {
@@ -127,6 +128,10 @@ struct iommu_test_cmd {
__u32 id;
__u32 iotlb;
} check_iotlb;
+ struct {
+ __u32 idev_id;
+ __u32 dev_id;
+ } check_dev_id;
};
__u32 last;
};
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 4caed9304065..b7d0ce3d3659 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -999,6 +999,34 @@ static int iommufd_test_md_check_iotlb(struct iommufd_ucmd *ucmd,
return rc;
}

+static int iommufd_test_mv_check_dev_id(struct iommufd_ucmd *ucmd,
+ u32 viommu_id, unsigned int idev_id,
+ u32 dev_id)
+{
+ struct iommufd_device *idev;
+ struct mock_viommu *mv;
+ int rc = 0;
+
+ mv = container_of(iommufd_get_viommu(ucmd, viommu_id),
+ struct mock_viommu, core);
+ if (IS_ERR(mv))
+ return PTR_ERR(mv);
+
+ idev = iommufd_get_device(ucmd, idev_id);
+ if (IS_ERR(idev)) {
+ rc = PTR_ERR(idev);
+ goto out_put_viommu;
+ }
+
+ if (idev->dev != xa_load(&mv->ids, dev_id))
+ rc = -EINVAL;
+
+ iommufd_put_object(ucmd->ictx, &idev->obj);
+out_put_viommu:
+ iommufd_put_object(ucmd->ictx, &mv->core.obj);
+ return rc;
+}
+
struct selftest_access {
struct iommufd_access *access;
struct file *file;
@@ -1484,6 +1512,10 @@ int iommufd_test(struct iommufd_ucmd *ucmd)
return iommufd_test_md_check_iotlb(ucmd, cmd->id,
cmd->check_iotlb.id,
cmd->check_iotlb.iotlb);
+ case IOMMU_TEST_OP_MV_CHECK_DEVID:
+ return iommufd_test_mv_check_dev_id(ucmd, cmd->id,
+ cmd->check_dev_id.idev_id,
+ cmd->check_dev_id.dev_id);
case IOMMU_TEST_OP_CREATE_ACCESS:
return iommufd_test_create_access(ucmd, cmd->id,
cmd->create_access.flags);
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index 378fbf00730e..3af5932c227c 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -288,14 +288,21 @@ TEST_F(iommufd_ioas, viommu)
test_err_viommu_set_dev_id(EINVAL, dev_id, viommu_id, 1ULL << 32);
test_cmd_viommu_set_dev_id(dev_id, viommu_id, 0x99);
test_err_viommu_set_dev_id(EBUSY, dev_id, viommu_id, 0x99);
+ test_cmd_viommu_check_dev_id(viommu_id, dev_id, 0x99);

test_cmd_mock_domain(self->ioas_id, &stdev2, &hwpt2, &device2);
test_err_viommu_set_dev_id(EBUSY, device2, viommu_id, 0x99);
test_cmd_viommu_set_dev_id(device2, viommu_id, 0xaa);
test_err_viommu_set_dev_id(EBUSY, device2, viommu_id, 0xaa);
+ test_cmd_viommu_check_dev_id(viommu_id, dev_id, 0x99);
+ test_cmd_viommu_check_dev_id(viommu_id, device2, 0xaa);
+
test_ioctl_destroy(stdev2);
+ test_cmd_viommu_check_dev_id(viommu_id, dev_id, 0x99);
+ test_err_viommu_check_dev_id(ENOENT, viommu_id, device2, 0xaa);

test_ioctl_destroy(viommu_id);
+ test_err_viommu_check_dev_id(ENOENT, viommu_id, dev_id, 0x99);
test_ioctl_destroy(hwpt_id);
} else {
test_err_viommu_alloc(ENOENT, dev_id, hwpt_id, &viommu_id);
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 81e9184fd1d5..e926fa289baa 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -731,3 +731,27 @@ static int _test_cmd_viommu_set_dev_id(int fd, __u32 device_id,
EXPECT_ERRNO(_errno, \
_test_cmd_viommu_set_dev_id(self->fd, device_id, \
viommu_id, virtual_id))
+
+static int _test_cmd_viommu_check_dev_id(int fd, __u32 viommu_id,
+ __u32 device_id, __u64 virtual_id)
+{
+ struct iommu_test_cmd test_cmd = {
+ .size = sizeof(test_cmd),
+ .op = IOMMU_TEST_OP_MV_CHECK_DEVID,
+ .id = viommu_id,
+ .check_dev_id = {
+ .idev_id = device_id,
+ .dev_id = virtual_id,
+ },
+ };
+ return ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_MV_CHECK_DEVID),
+ &test_cmd);
+}
+#define test_cmd_viommu_check_dev_id(viommu_id, device_id, expected) \
+ ASSERT_EQ(0, _test_cmd_viommu_check_dev_id(self->fd, viommu_id, \
+ device_id, expected))
+
+#define test_err_viommu_check_dev_id(_errno, viommu_id, device_id, expected) \
+ EXPECT_ERRNO(_errno, \
+ _test_cmd_viommu_check_dev_id(self->fd, viommu_id, \
+ device_id, expected))
--
2.43.0


2024-04-13 03:49:58

by Nicolin Chen

[permalink] [raw]
Subject: [PATCH RFCv1 02/14] iommufd: Swap _iommufd_object_alloc and __iommufd_object_alloc

Currently, the object allocation function calls:
level-0: iommufd_object_alloc()
level-1: ___iommufd_object_alloc()
level-2: _iommufd_object_alloc()

So the level-1 and level-2 look inverted.

As the following change will add another level-3 helper, to make it clear:
level-0: iommufd_object_alloc()
level-1: _iommufd_object_alloc()
level-2: __iommufd_object_alloc()
level-3: ___iommufd_object_alloc()

Swap the names of the level-1 and level-2 functions.

Signed-off-by: Nicolin Chen <[email protected]>
---
drivers/iommu/iommufd/hw_pagetable.c | 4 ++--
drivers/iommu/iommufd/iommufd_private.h | 12 ++++++------
drivers/iommu/iommufd/main.c | 6 +++---
3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index 33d142f8057d..111b8154cce8 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -115,7 +115,7 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
if (flags & ~valid_flags)
return ERR_PTR(-EOPNOTSUPP);

- hwpt_paging = __iommufd_object_alloc(
+ hwpt_paging = _iommufd_object_alloc(
ictx, hwpt_paging, IOMMUFD_OBJ_HWPT_PAGING, common.obj);
if (IS_ERR(hwpt_paging))
return ERR_CAST(hwpt_paging);
@@ -218,7 +218,7 @@ iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx,
if (parent->auto_domain || !parent->nest_parent)
return ERR_PTR(-EINVAL);

- hwpt_nested = __iommufd_object_alloc(
+ hwpt_nested = _iommufd_object_alloc(
ictx, hwpt_nested, IOMMUFD_OBJ_HWPT_NESTED, common.obj);
if (IS_ERR(hwpt_nested))
return ERR_CAST(hwpt_nested);
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 3ea0a093ee50..3acbc67dd5f0 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -200,12 +200,12 @@ iommufd_object_put_and_try_destroy(struct iommufd_ctx *ictx,
iommufd_object_remove(ictx, obj, obj->id, 0);
}

-struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
- size_t size,
- enum iommufd_object_type type);
+struct iommufd_object *__iommufd_object_alloc(struct iommufd_ctx *ictx,
+ size_t size,
+ enum iommufd_object_type type);

-#define __iommufd_object_alloc(ictx, ptr, type, obj) \
- container_of(_iommufd_object_alloc( \
+#define _iommufd_object_alloc(ictx, ptr, type, obj) \
+ container_of(__iommufd_object_alloc( \
ictx, \
sizeof(*(ptr)) + BUILD_BUG_ON_ZERO( \
offsetof(typeof(*(ptr)), \
@@ -214,7 +214,7 @@ struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
typeof(*(ptr)), obj)

#define iommufd_object_alloc(ictx, ptr, type) \
- __iommufd_object_alloc(ictx, ptr, type, obj)
+ _iommufd_object_alloc(ictx, ptr, type, obj)

/*
* The IO Address Space (IOAS) pagetable is a virtual page table backed by the
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 39b32932c61e..a51ab766e183 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -29,9 +29,9 @@ struct iommufd_object_ops {
static const struct iommufd_object_ops iommufd_object_ops[];
static struct miscdevice vfio_misc_dev;

-struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
- size_t size,
- enum iommufd_object_type type)
+struct iommufd_object *__iommufd_object_alloc(struct iommufd_ctx *ictx,
+ size_t size,
+ enum iommufd_object_type type)
{
struct iommufd_object *obj;
int rc;
--
2.43.0


2024-04-13 03:50:33

by Nicolin Chen

[permalink] [raw]
Subject: [PATCH RFCv1 11/14] iommufd: Add struct iommufd_vqueue and its related viommu ops

Inroduce a new core structure and its allocator iommufd_vqueue_alloc().

This can be used for a viommu to allocate a HW-accelerated queue, e.g.
NVIDIA's virtual command queue and AMD vIOMMU's command buffer.

Also add a pair of viommu ops for iommufd to forward user space ioctls
to IOMMU drivers.

Signed-off-by: Nicolin Chen <[email protected]>
---
drivers/iommu/iommufd/viommu.c | 1 +
include/linux/iommu.h | 1 +
include/linux/iommufd.h | 27 +++++++++++++++++++++++++++
3 files changed, 29 insertions(+)

diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index 71baca0c75de..8894878c1e73 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -19,6 +19,7 @@
}

viommu_struct_alloc(viommu);
+viommu_struct_alloc(vqueue);

void iommufd_viommu_destroy(struct iommufd_object *obj)
{
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 4d4461146288..475f41f2d41f 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -146,6 +146,7 @@ struct iopf_queue {
};

struct iommufd_viommu;
+struct iommufd_vqueue;

/* iommu fault flags */
#define IOMMU_FAULT_READ 0x0
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 2be302b82f47..5b97b04aa145 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -20,6 +20,7 @@ struct iommufd_access;
struct iommufd_hwpt_paging;
struct file;
struct iommu_group;
+struct iommu_user_data;

enum iommufd_object_type {
IOMMUFD_OBJ_NONE,
@@ -97,12 +98,27 @@ struct iommufd_viommu {
* @free: Free all driver-specific parts of an iommufd_viommu. The memory
* of the entire viommu will be free-ed by iommufd core
* @set/unset_dev_id: set/unset a user space virtual id for a device
+ * @vqueue_alloc: Allocate an iommufd_vqueue as a user space command queue for a
+ * @viommu instance. Queue specific @user_data must be defined in
+ * the include/uapi/linux/iommufd.h header.
+ * @vqueue_free: Free all driver-specific parts of an iommufd_vqueue. The memory
+ * of the iommufd_vqueue will be free-ed by iommufd core
*/
struct iommufd_viommu_ops {
void (*free)(struct iommufd_viommu *viommu);
int (*set_dev_id)(struct iommufd_viommu *viommu,
struct device *dev, u64 dev_id);
void (*unset_dev_id)(struct iommufd_viommu *viommu, struct device *dev);
+ struct iommufd_vqueue *(*vqueue_alloc)(
+ struct iommufd_viommu *viommu,
+ const struct iommu_user_data *user_data);
+ void (*vqueue_free)(struct iommufd_vqueue *vqueue);
+};
+
+struct iommufd_vqueue {
+ struct iommufd_object obj;
+ struct iommufd_ctx *ictx;
+ struct iommufd_viommu *viommu;
};

#if IS_ENABLED(CONFIG_IOMMUFD)
@@ -123,6 +139,7 @@ int iommufd_vfio_compat_ioas_create(struct iommufd_ctx *ictx);
int iommufd_vfio_compat_set_no_iommu(struct iommufd_ctx *ictx);

struct iommufd_viommu *_iommufd_viommu_alloc(size_t size);
+struct iommufd_vqueue *_iommufd_vqueue_alloc(size_t size);
#else /* !CONFIG_IOMMUFD */
static inline struct iommufd_ctx *iommufd_ctx_from_file(struct file *file)
{
@@ -168,6 +185,11 @@ static inline struct iommufd_viommu *_iommufd_viommu_alloc(size_t size)
{
return NULL;
}
+
+static inline struct iommufd_vqueue *_iommufd_vqueue_alloc(size_t size)
+{
+ return NULL;
+}
#endif /* CONFIG_IOMMUFD */

/*
@@ -179,4 +201,9 @@ static inline struct iommufd_viommu *_iommufd_viommu_alloc(size_t size)
BUILD_BUG_ON_ZERO(offsetof( \
struct drv_struct, member))), \
struct drv_struct, member)
+#define iommufd_vqueue_alloc(drv_struct, member) \
+ container_of(_iommufd_vqueue_alloc(sizeof(struct drv_struct) + \
+ BUILD_BUG_ON_ZERO(offsetof( \
+ struct drv_struct, member))), \
+ struct drv_struct, member)
#endif
--
2.43.0


2024-04-13 03:50:42

by Nicolin Chen

[permalink] [raw]
Subject: [PATCH RFCv1 13/14] iommufd: Add mmap infrastructure

Add for sharing the kernel page with user space. This allows to pass
through HW resource (VCMDQ MMIO pages for example) to user space VMM
and guest OS. Use vma->vm_pgoff as the carrier of a viommu_id.

Signed-off-by: Nicolin Chen <[email protected]>
---
drivers/iommu/iommufd/main.c | 40 ++++++++++++++++++++++++++++++++++++
include/linux/iommufd.h | 4 ++++
2 files changed, 44 insertions(+)

diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 96ef81530809..5b401c80cca8 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -16,6 +16,7 @@
#include <linux/mutex.h>
#include <linux/bug.h>
#include <uapi/linux/iommufd.h>
+#include <linux/iommu.h>
#include <linux/iommufd.h>

#include "io_pagetable.h"
@@ -427,11 +428,50 @@ static long iommufd_fops_ioctl(struct file *filp, unsigned int cmd,
return ret;
}

+static int iommufd_fops_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+ struct iommufd_ctx *ictx = filp->private_data;
+ size_t size = vma->vm_end - vma->vm_start;
+ u32 viommu_id = (u32)vma->vm_pgoff;
+ struct iommufd_viommu *viommu;
+ unsigned long pfn;
+ int rc;
+
+ if (size > PAGE_SIZE)
+ return -EINVAL;
+
+ viommu = container_of(iommufd_get_object(ictx, viommu_id,
+ IOMMUFD_OBJ_VIOMMU),
+ struct iommufd_viommu, obj);
+ if (IS_ERR(viommu))
+ return PTR_ERR(viommu);
+
+ if (!viommu->ops->get_mmap_pfn) {
+ rc = -EOPNOTSUPP;
+ goto out_put_viommu;
+ }
+
+ pfn = viommu->ops->get_mmap_pfn(viommu, size);
+ if (!pfn) {
+ rc = -ENOMEM;
+ goto out_put_viommu;
+ }
+
+ vma->vm_pgoff = 0;
+ vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+ vm_flags_set(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP);
+ rc = remap_pfn_range(vma, vma->vm_start, pfn, size, vma->vm_page_prot);
+out_put_viommu:
+ iommufd_put_object(ictx, &viommu->obj);
+ return rc;
+}
+
static const struct file_operations iommufd_fops = {
.owner = THIS_MODULE,
.open = iommufd_fops_open,
.release = iommufd_fops_release,
.unlocked_ioctl = iommufd_fops_ioctl,
+ .mmap = iommufd_fops_mmap,
};

/**
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 707b6d4b20a3..4a9c6b281d35 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -104,6 +104,8 @@ struct iommufd_viommu {
* the include/uapi/linux/iommufd.h header.
* @vqueue_free: Free all driver-specific parts of an iommufd_vqueue. The memory
* of the iommufd_vqueue will be free-ed by iommufd core
+ * @get_mmap_pfn: Return the PFN of a viommu given a finite size, for user space
+ * to mmap the page(s).
*/
struct iommufd_viommu_ops {
void (*free)(struct iommufd_viommu *viommu);
@@ -114,6 +116,8 @@ struct iommufd_viommu_ops {
struct iommufd_viommu *viommu,
const struct iommu_user_data *user_data);
void (*vqueue_free)(struct iommufd_vqueue *vqueue);
+ unsigned long (*get_mmap_pfn)(struct iommufd_viommu *viommu,
+ size_t pgsize);
};

struct iommufd_vqueue {
--
2.43.0


2024-04-13 03:50:51

by Nicolin Chen

[permalink] [raw]
Subject: [PATCH RFCv1 09/14] iommufd/selftest: Add IOMMU_VIOMMU_SET_DEV_ID test coverage

Add an xarray to store the array of virtual ids to mock_devs, to cover the
new IOMMU_VIOMMU_SET_DEV_ID ioctl.

Signed-off-by: Nicolin Chen <[email protected]>
---
drivers/iommu/iommufd/selftest.c | 50 +++++++++++++++++++
tools/testing/selftests/iommu/iommufd.c | 14 ++++++
tools/testing/selftests/iommu/iommufd_utils.h | 21 ++++++++
3 files changed, 85 insertions(+)

diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index e2fc2ec23093..4caed9304065 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -139,6 +139,7 @@ struct mock_dev {
struct device dev;
unsigned long flags;
int id;
+ unsigned int id_user;
};

struct selftest_obj {
@@ -516,6 +517,53 @@ static struct iommu_device *mock_probe_device(struct device *dev)

struct mock_viommu {
struct iommufd_viommu core;
+ struct xarray ids;
+};
+
+static void mock_viommu_free(struct iommufd_viommu *viommu)
+{
+ struct mock_viommu *mv = container_of(viommu, struct mock_viommu, core);
+ struct device *dev;
+ unsigned long index;
+
+ xa_for_each(&mv->ids, index, dev)
+ xa_erase(&mv->ids, index);
+ xa_destroy(&mv->ids);
+}
+
+static int mock_viommu_set_dev_id(struct iommufd_viommu *viommu,
+ struct device *dev, u64 dev_id)
+{
+ struct mock_viommu *mv = container_of(viommu, struct mock_viommu, core);
+ struct mock_dev *mdev = container_of(dev, struct mock_dev, dev);
+ u32 id = (u32)dev_id;
+ int rc;
+
+ if (dev_id > UINT_MAX)
+ return -EINVAL;
+ if (mdev->id_user > 0)
+ return -EBUSY;
+ rc = xa_alloc(&mv->ids, &id, dev, XA_LIMIT(id, id), GFP_KERNEL);
+ if (rc)
+ return rc;
+ mdev->id_user = (unsigned int)dev_id;
+ return 0;
+}
+
+static void mock_viommu_unset_dev_id(struct iommufd_viommu *viommu,
+ struct device *dev)
+{
+ struct mock_viommu *mv = container_of(viommu, struct mock_viommu, core);
+ struct mock_dev *mdev = container_of(dev, struct mock_dev, dev);
+
+ WARN_ON(dev != xa_erase(&mv->ids, mdev->id_user));
+ mdev->id_user = 0;
+}
+
+static const struct iommufd_viommu_ops mock_viommu_ops = {
+ .free = mock_viommu_free,
+ .set_dev_id = mock_viommu_set_dev_id,
+ .unset_dev_id = mock_viommu_unset_dev_id,
};

static struct iommufd_viommu *mock_viommu_alloc(struct device *dev,
@@ -527,6 +575,8 @@ static struct iommufd_viommu *mock_viommu_alloc(struct device *dev,
mv = iommufd_viommu_alloc(mock_viommu, core);
if (!mv)
return ERR_PTR(-ENOMEM);
+ mv->core.ops = &mock_viommu_ops;
+ xa_init_flags(&mv->ids, XA_FLAGS_ALLOC1);

return &mv->core;
}
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index dd7b7c7a06c6..378fbf00730e 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -271,6 +271,9 @@ TEST_F(iommufd_ioas, viommu)
uint32_t dev_id = self->device_id;
uint32_t viommu_id = 0;
uint32_t hwpt_id = 0;
+ uint32_t device2;
+ uint32_t stdev2;
+ uint32_t hwpt2;

if (dev_id) {
test_err_viommu_alloc(ENOENT, dev_id, hwpt_id, &viommu_id);
@@ -282,10 +285,21 @@ TEST_F(iommufd_ioas, viommu)
IOMMU_HWPT_ALLOC_NEST_PARENT,
&hwpt_id);
test_cmd_viommu_alloc(dev_id, hwpt_id, &viommu_id);
+ test_err_viommu_set_dev_id(EINVAL, dev_id, viommu_id, 1ULL << 32);
+ test_cmd_viommu_set_dev_id(dev_id, viommu_id, 0x99);
+ test_err_viommu_set_dev_id(EBUSY, dev_id, viommu_id, 0x99);
+
+ test_cmd_mock_domain(self->ioas_id, &stdev2, &hwpt2, &device2);
+ test_err_viommu_set_dev_id(EBUSY, device2, viommu_id, 0x99);
+ test_cmd_viommu_set_dev_id(device2, viommu_id, 0xaa);
+ test_err_viommu_set_dev_id(EBUSY, device2, viommu_id, 0xaa);
+ test_ioctl_destroy(stdev2);
+
test_ioctl_destroy(viommu_id);
test_ioctl_destroy(hwpt_id);
} else {
test_err_viommu_alloc(ENOENT, dev_id, hwpt_id, &viommu_id);
+ test_err_viommu_set_dev_id(ENOENT, dev_id, viommu_id, 99);
}
}

diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 037c84189d50..81e9184fd1d5 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -710,3 +710,24 @@ static int _test_cmd_viommu_alloc(int fd, __u32 device_id, __u32 hwpt_id,
#define test_err_viommu_alloc(_errno, device_id, hwpt_id, viommu_id) \
EXPECT_ERRNO(_errno, _test_cmd_viommu_alloc(self->fd, device_id, \
hwpt_id, 0, viommu_id))
+
+static int _test_cmd_viommu_set_dev_id(int fd, __u32 device_id,
+ __u32 viommu_id, __u64 virtual_id)
+{
+ struct iommu_viommu_set_dev_id cmd = {
+ .size = sizeof(cmd),
+ .dev_id = device_id,
+ .viommu_id = viommu_id,
+ .id = virtual_id,
+ };
+
+ return ioctl(fd, IOMMU_VIOMMU_SET_DEV_ID, &cmd);
+}
+
+#define test_cmd_viommu_set_dev_id(device_id, viommu_id, virtual_id) \
+ ASSERT_EQ(0, _test_cmd_viommu_set_dev_id(self->fd, device_id, \
+ viommu_id, virtual_id))
+#define test_err_viommu_set_dev_id(_errno, device_id, viommu_id, virtual_id) \
+ EXPECT_ERRNO(_errno, \
+ _test_cmd_viommu_set_dev_id(self->fd, device_id, \
+ viommu_id, virtual_id))
--
2.43.0


2024-04-13 03:51:03

by Nicolin Chen

[permalink] [raw]
Subject: [PATCH RFCv1 12/14] iommufd: Add IOMMUFD_OBJ_VQUEUE and IOMMUFD_CMD_VQUEUE_ALLOC

Introduce a new IOMMUFD_OBJ_VQUEU to represent a virtual command queue
instance. And add a new ioctl for user space to allocate it.

As an initial version, ddd IOMMU_VQUEUE_DATA_TEGRA241_CMDQV to the enum
iommu_vqueue_data_type and the corresponding iommu_vqueue_tegra241_cmdqv
data structure.

Signed-off-by: Nicolin Chen <[email protected]>
---
drivers/iommu/iommufd/iommufd_private.h | 2 +
drivers/iommu/iommufd/main.c | 6 +++
drivers/iommu/iommufd/viommu.c | 72 +++++++++++++++++++++++++
include/linux/iommufd.h | 1 +
include/uapi/linux/iommufd.h | 48 +++++++++++++++++
5 files changed, 129 insertions(+)

diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 9ba8f4ecc221..c994f2db3052 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -436,6 +436,8 @@ iommufd_get_viommu(struct iommufd_ucmd *ucmd, u32 id)
int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd);
void iommufd_viommu_destroy(struct iommufd_object *obj);
int iommufd_viommu_set_device_id(struct iommufd_ucmd *ucmd);
+int iommufd_vqueue_alloc_ioctl(struct iommufd_ucmd *ucmd);
+void iommufd_vqueue_destroy(struct iommufd_object *obj);

#ifdef CONFIG_IOMMUFD_TEST
int iommufd_test(struct iommufd_ucmd *ucmd);
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 16efc3346a2a..96ef81530809 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -324,6 +324,7 @@ union ucmd_buffer {
struct iommu_ioas_alloc alloc;
struct iommu_ioas_allow_iovas allow_iovas;
struct iommu_viommu_alloc viommu;
+ struct iommu_vqueue_alloc vqueue;
struct iommu_ioas_copy ioas_copy;
struct iommu_ioas_iova_ranges iova_ranges;
struct iommu_ioas_map map;
@@ -383,6 +384,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
struct iommu_viommu_alloc, out_viommu_id),
IOCTL_OP(IOMMU_VIOMMU_SET_DEV_ID, iommufd_viommu_set_device_id,
struct iommu_viommu_set_dev_id, id),
+ IOCTL_OP(IOMMU_VQUEUE_ALLOC, iommufd_vqueue_alloc_ioctl,
+ struct iommu_vqueue_alloc, data_uptr),
#ifdef CONFIG_IOMMUFD_TEST
IOCTL_OP(IOMMU_TEST_CMD, iommufd_test, struct iommu_test_cmd, last),
#endif
@@ -518,6 +521,9 @@ static const struct iommufd_object_ops iommufd_object_ops[] = {
[IOMMUFD_OBJ_VIOMMU] = {
.destroy = iommufd_viommu_destroy,
},
+ [IOMMUFD_OBJ_VQUEUE] = {
+ .destroy = iommufd_vqueue_destroy,
+ },
#ifdef CONFIG_IOMMUFD_TEST
[IOMMUFD_OBJ_SELFTEST] = {
.destroy = iommufd_selftest_destroy,
diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index 8894878c1e73..56c9ea818bfa 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -161,3 +161,75 @@ int iommufd_viommu_set_device_id(struct iommufd_ucmd *ucmd)
iommufd_put_object(ucmd->ictx, &idev->obj);
return rc;
}
+
+void iommufd_vqueue_destroy(struct iommufd_object *obj)
+{
+ struct iommufd_vqueue *vqueue =
+ container_of(obj, struct iommufd_vqueue, obj);
+ struct iommufd_viommu *viommu = vqueue->viommu;
+
+ if (viommu->ops->vqueue_free)
+ viommu->ops->vqueue_free(vqueue);
+ refcount_dec(&viommu->obj.users);
+}
+
+int iommufd_vqueue_alloc_ioctl(struct iommufd_ucmd *ucmd)
+{
+ struct iommu_vqueue_alloc *cmd = ucmd->cmd;
+ const struct iommu_user_data user_data = {
+ .type = cmd->data_type,
+ .uptr = u64_to_user_ptr(cmd->data_uptr),
+ .len = cmd->data_len,
+ };
+ struct iommufd_vqueue *vqueue;
+ struct iommufd_viommu *viommu;
+ int rc;
+
+ if (cmd->flags)
+ return -EOPNOTSUPP;
+ if (!cmd->data_len)
+ return -EINVAL;
+
+ viommu = iommufd_get_viommu(ucmd, cmd->viommu_id);
+ if (IS_ERR(viommu))
+ return PTR_ERR(viommu);
+
+ if (!viommu->ops || !viommu->ops->vqueue_alloc) {
+ rc = -EOPNOTSUPP;
+ goto out_put_viommu;
+ }
+
+ vqueue = viommu->ops->vqueue_alloc(
+ viommu, user_data.len ? &user_data : NULL);
+ if (IS_ERR(vqueue)) {
+ rc = PTR_ERR(vqueue);
+ goto out_put_viommu;
+ }
+
+ /* iommufd_object_finalize will store the vqueue->obj.id */
+ rc = xa_alloc(&ucmd->ictx->objects, &vqueue->obj.id, XA_ZERO_ENTRY,
+ xa_limit_31b, GFP_KERNEL_ACCOUNT);
+ if (rc)
+ goto out_free;
+
+ vqueue->obj.type = IOMMUFD_OBJ_VQUEUE;
+
+ vqueue->ictx = ucmd->ictx;
+ vqueue->viommu = viommu;
+ cmd->out_vqueue_id = vqueue->obj.id;
+ rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+ if (rc)
+ goto out_erase_xa;
+ iommufd_object_finalize(ucmd->ictx, &vqueue->obj);
+ refcount_inc(&viommu->obj.users);
+ goto out_put_viommu;
+
+out_erase_xa:
+ xa_erase(&ucmd->ictx->objects, vqueue->obj.id);
+out_free:
+ if (viommu->ops->vqueue_free)
+ viommu->ops->vqueue_free(vqueue);
+out_put_viommu:
+ iommufd_put_object(ucmd->ictx, &viommu->obj);
+ return rc;
+}
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 5b97b04aa145..707b6d4b20a3 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -31,6 +31,7 @@ enum iommufd_object_type {
IOMMUFD_OBJ_IOAS,
IOMMUFD_OBJ_ACCESS,
IOMMUFD_OBJ_VIOMMU,
+ IOMMUFD_OBJ_VQUEUE,
#ifdef CONFIG_IOMMUFD_TEST
IOMMUFD_OBJ_SELFTEST,
#endif
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index eaa192de63d3..95abe3100e3b 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -52,6 +52,7 @@ enum {
IOMMUFD_CMD_HWPT_INVALIDATE,
IOMMUFD_CMD_VIOMMU_ALLOC,
IOMMUFD_CMD_VIOMMU_SET_DEV_ID,
+ IOMMUFD_CMD_VQUEUE_ALLOC,
};

/**
@@ -742,4 +743,51 @@ struct iommu_viommu_set_dev_id {
__aligned_u64 id;
};
#define IOMMU_VIOMMU_SET_DEV_ID _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VIOMMU_SET_DEV_ID)
+
+/**
+ * struct iommu_vqueue_tegra241_cmdqv - NVIDIA Tegra241's Virtual Command Queue
+ * for its CMDQV Extension for ARM SMMUv3
+ * (IOMMU_VQUEUE_DATA_TEGRA241_CMDQV)
+ * @vcmdq_id: logical ID of a virtual command queue in the VIOMMU instance
+ * @vcmdq_log2size: (1 << @vcmdq_log2size) will be the size of the vcmdq
+ * @vcmdq_base: guest physical address (IPA) to the vcmdq base address
+ */
+struct iommu_vqueue_tegra241_cmdqv {
+ __u32 vcmdq_id;
+ __u32 vcmdq_log2size;
+ __aligned_u64 vcmdq_base;
+};
+
+/**
+ * enum iommu_vqueue_data_type - VQUEUE Data Type
+ * @IOMMU_VQUEUE_DATA_TEGRA241_CMDQV: NVIDIA Tegra241 CMDQV Extension for SMMUv3
+ */
+enum iommu_vqueue_data_type {
+ IOMMU_VQUEUE_DATA_TEGRA241_CMDQV,
+};
+
+/**
+ * struct iommu_vqueue_alloc - ioctl(IOMMU_VQUEUE_ALLOC)
+ * @size: sizeof(struct iommu_vqueue_alloc)
+ * @flags: Must be 0
+ * @viommu_id: viommu ID to associate the virtual queue with
+ * @out_vqueue_id: The ID of the new virtual queue
+ * @data_type: One of enum iommu_vqueue_data_type
+ * @data_len: Length of the type specific data
+ * @data_uptr: User pointer to the type specific data
+ *
+ * Allocate an virtual queue object for driver-specific HW-accelerated queue
+ */
+
+struct iommu_vqueue_alloc {
+ __u32 size;
+ __u32 flags;
+ __u32 viommu_id;
+ __u32 out_vqueue_id;
+ __u32 data_type;
+ __u32 data_len;
+ __aligned_u64 data_uptr;
+};
+#define IOMMU_VQUEUE_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VQUEUE_ALLOC)
+
#endif
--
2.43.0


2024-04-13 03:51:11

by Nicolin Chen

[permalink] [raw]
Subject: [PATCH RFCv1 14/14] iommu/tegra241-cmdqv: Add user-space use support

Add the support via VIOMMU infrastructure for virtualization use case.

This basically allows VMM to allocate VINTFs (as a viommu object) and
assign VCMDQs to it. A VINTF's MMIO page0 can be mmap'd to user space
for VM to access directly without VMEXIT and corresponding hypercall.

As an initial version, the number of VCMDQs per VINTF is fixed to two.

Signed-off-by: Nicolin Chen <[email protected]>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 19 ++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 19 ++
.../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 284 +++++++++++++++++-
3 files changed, 317 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9af6659ea488..367df65d1e2a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -17,6 +17,7 @@
#include <linux/err.h>
#include <linux/interrupt.h>
#include <linux/io-pgtable.h>
+#include <linux/iommufd.h>
#include <linux/iopoll.h>
#include <linux/module.h>
#include <linux/msi.h>
@@ -3077,6 +3078,23 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
arm_smmu_sva_remove_dev_pasid(domain, dev, pasid);
}

+static struct iommufd_viommu *arm_smmu_viommu_alloc(struct device *dev,
+ unsigned int viommu_type,
+ struct iommu_domain *domain)
+{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain);
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
+ if (!master || !master->smmu)
+ return ERR_PTR(-ENODEV);
+
+ if (master->smmu->tegra241_cmdqv &&
+ viommu_type == IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV)
+ return tegra241_cmdqv_viommu_alloc(
+ master->smmu->tegra241_cmdqv, smmu_domain);
+ return ERR_PTR(-EOPNOTSUPP);
+}
+
static struct iommu_ops arm_smmu_ops = {
.identity_domain = &arm_smmu_identity_domain,
.blocked_domain = &arm_smmu_blocked_domain,
@@ -3091,6 +3109,7 @@ static struct iommu_ops arm_smmu_ops = {
.remove_dev_pasid = arm_smmu_remove_dev_pasid,
.dev_enable_feat = arm_smmu_dev_enable_feature,
.dev_disable_feat = arm_smmu_dev_disable_feature,
+ .viommu_alloc = arm_smmu_viommu_alloc,
.page_response = arm_smmu_page_response,
.def_domain_type = arm_smmu_def_domain_type,
.pgsize_bitmap = -1UL, /* Restricted during device attach */
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index fdc3d570cf43..8ee3f10086b9 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -701,6 +701,7 @@ struct arm_smmu_device {

struct arm_smmu_stream {
u32 id;
+ u32 cmdqv_sid_slot;
struct arm_smmu_master *master;
struct rb_node node;
};
@@ -776,6 +777,14 @@ int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
unsigned long prod_off, unsigned long cons_off,
size_t dwords, const char *name);

+static inline phys_addr_t
+arm_smmu_domain_ipa_to_pa(struct arm_smmu_domain *smmu_domain, u64 ipa)
+{
+ if (WARN_ON_ONCE(smmu_domain->stage != ARM_SMMU_DOMAIN_S2))
+ return 0;
+ return iommu_iova_to_phys(&smmu_domain->domain, ipa);
+}
+
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
bool arm_smmu_master_sva_supported(struct arm_smmu_master *master);
@@ -838,6 +847,9 @@ tegra241_cmdqv_acpi_probe(struct arm_smmu_device *smmu, int id);
int tegra241_cmdqv_device_reset(struct arm_smmu_device *smmu);
struct arm_smmu_cmdq *tegra241_cmdqv_get_cmdq(struct arm_smmu_device *smmu,
u64 *cmds, int n);
+struct iommufd_viommu *
+tegra241_cmdqv_viommu_alloc(struct tegra241_cmdqv *cmdqv,
+ struct arm_smmu_domain *smmu_domain);
#else /* CONFIG_TEGRA241_CMDQV */
static inline struct tegra241_cmdqv *
tegra241_cmdqv_acpi_probe(struct arm_smmu_device *smmu, int id)
@@ -855,6 +867,13 @@ tegra241_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
{
return NULL;
}
+
+static inline struct iommufd_viommu *
+tegra241_cmdqv_viommu_alloc(struct tegra241_cmdqv *cmdqv,
+ struct arm_smmu_domain *smmu_domain);
+{
+ return -ENODEV;
+}
#endif /* CONFIG_TEGRA241_CMDQV */

#endif /* _ARM_SMMU_V3_H */
diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
index 7aeaf810980c..18e250ec60dc 100644
--- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
+++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
@@ -8,7 +8,12 @@
#include <linux/dma-mapping.h>
#include <linux/interrupt.h>
#include <linux/iommu.h>
+#include <linux/iommufd.h>
#include <linux/iopoll.h>
+#include <linux/kvm_host.h>
+#include <linux/platform_device.h>
+#include <linux/vfio.h>
+#include <uapi/linux/iommufd.h>

#include <acpi/acpixf.h>

@@ -28,8 +33,10 @@
#define CMDQV_EN BIT(0)

#define TEGRA241_CMDQV_PARAM 0x0004
+#define CMDQV_NUM_SID_PER_VM_LOG2 GENMASK(15, 12)
#define CMDQV_NUM_VINTF_LOG2 GENMASK(11, 8)
#define CMDQV_NUM_VCMDQ_LOG2 GENMASK(7, 4)
+#define CMDQV_VER GENMASK(3, 0)

#define TEGRA241_CMDQV_STATUS 0x0008
#define CMDQV_ENABLED BIT(0)
@@ -47,6 +54,14 @@
/* VINTF config regs */
#define TEGRA241_VINTF(v) (0x1000 + 0x100*(v))

+#define TEGRA241_VINTFi_CONFIG(i) (TEGRA241_VINTF(i) + TEGRA241_VINTF_CONFIG)
+#define TEGRA241_VINTFi_STATUS(i) (TEGRA241_VINTF(i) + TEGRA241_VINTF_STATUS)
+#define TEGRA241_VINTFi_SID_MATCH(i, s) (TEGRA241_VINTF(i) + TEGRA241_VINTF_SID_MATCH(s))
+#define TEGRA241_VINTFi_SID_REPLACE(i, s) \
+ (TEGRA241_VINTF(i) + TEGRA241_VINTF_SID_REPLACE(s))
+#define TEGRA241_VINTFi_CMDQ_ERR_MAP(i, m) \
+ (TEGRA241_VINTF(i) + TEGRA241_VINTF_CMDQ_ERR_MAP(m))
+
#define TEGRA241_VINTF_CONFIG 0x0000
#define VINTF_HYP_OWN BIT(17)
#define VINTF_VMID GENMASK(16, 1)
@@ -55,6 +70,10 @@
#define TEGRA241_VINTF_STATUS 0x0004
#define VINTF_STATUS GENMASK(3, 1)
#define VINTF_ENABLED BIT(0)
+#define VINTF_VI_NUM_LVCMDQ GENMASK(23, 16)
+
+#define TEGRA241_VINTF_SID_MATCH(s) (0x0040 + 0x4*(s))
+#define TEGRA241_VINTF_SID_REPLACE(s) (0x0080 + 0x4*(s))

#define TEGRA241_VINTF_CMDQ_ERR_MAP(m) (0x00C0 + 0x4*(m))

@@ -236,6 +255,7 @@ MODULE_PARM_DESC(bypass_vcmdq,

/**
* struct tegra241_vcmdq - Virtual Command Queue
+ * @core: Embedded iommufd_vqueue structure
* @idx: Global index in the CMDQV HW
* @lidx: Local index in the VINTF
* @cmdqv: CMDQV HW pointer
@@ -246,6 +266,8 @@ MODULE_PARM_DESC(bypass_vcmdq,
* @page1: MMIO Page1 base address
*/
struct tegra241_vcmdq {
+ struct iommufd_vqueue core;
+
u16 idx;
u16 lidx;

@@ -257,19 +279,27 @@ struct tegra241_vcmdq {
void __iomem *page0;
void __iomem *page1;
};
+#define vqueue_to_vcmdq(v) container_of(v, struct tegra241_vcmdq, core)

/**
* struct tegra241_vintf - Virtual Interface
+ * @core: Embedded iommufd_viommu structure
* @idx: Global index in the CMDQV HW
+ * @vmid: VMID for configuration
* @enabled: Enabled or not
* @hyp_own: Owned by hypervisor (in-kernel)
* @error: Status error or not
* @cmdqv: CMDQV HW pointer
* @vcmdqs: List of VCMDQ pointers
* @base: MMIO base address
+ * @s2_domain: Stage-2 SMMU domain
+ * @sid_slots: Stream ID Slot allocator
*/
struct tegra241_vintf {
+ struct iommufd_viommu core;
+
u16 idx;
+ u16 vmid;

bool enabled;
bool hyp_own;
@@ -279,13 +309,19 @@ struct tegra241_vintf {
struct tegra241_vcmdq **vcmdqs;

void __iomem *base;
+ struct arm_smmu_domain *s2_domain;
+
+#define TEGRA241_VINTF_MAX_SID_SLOT 15
+ struct ida sid_slots;
};
+#define viommu_to_vintf(v) container_of(v, struct tegra241_vintf, core)

/**
* struct tegra241_cmdqv - CMDQ-V for SMMUv3
* @smmu: SMMUv3 pointer
* @dev: Device pointer
* @base: MMIO base address
+ * @base_phys: Page frame number of @base, for mmap
* @irq: IRQ number
* @num_vintfs: Total number of VINTFs
* @num_vcmdqs: Total number of VCMDQs
@@ -299,6 +335,7 @@ struct tegra241_cmdqv {

struct device *dev;
void __iomem *base;
+ unsigned long base_pfn;
int irq;

/* CMDQV Hardware Params */
@@ -446,15 +483,11 @@ static void tegra241_vcmdq_hw_deinit(struct tegra241_vcmdq *vcmdq)
vcmdq_dbg("deinited\n");
}

-static int tegra241_vcmdq_hw_init(struct tegra241_vcmdq *vcmdq)
+static void _tegra241_vcmdq_hw_init(struct tegra241_vcmdq *vcmdq)
{
struct tegra241_cmdqv *cmdqv = vcmdq->cmdqv;
struct tegra241_vintf *vintf = vcmdq->vintf;
u32 regval;
- int ret;
-
- /* Configure and enable the vcmdq */
- tegra241_vcmdq_hw_deinit(vcmdq);

regval = FIELD_PREP(CMDQV_CMDQ_ALLOC_VINTF, vintf->idx);
regval |= FIELD_PREP(CMDQV_CMDQ_ALLOC_LVCMDQ, vcmdq->lidx);
@@ -462,7 +495,15 @@ static int tegra241_vcmdq_hw_init(struct tegra241_vcmdq *vcmdq)
cmdqv_writel_relaxed(regval, CMDQ_ALLOC(vcmdq->idx));

vcmdq_page1_writeq_relaxed(vcmdq->cmdq.q.q_base, BASE);
+}
+
+static int tegra241_vcmdq_hw_init(struct tegra241_vcmdq *vcmdq)
+{
+ int ret;

+ /* Configure and enable the vcmdq */
+ tegra241_vcmdq_hw_deinit(vcmdq);
+ _tegra241_vcmdq_hw_init(vcmdq);
ret = vcmdq_write_config(VCMDQ_EN);
if (ret) {
vcmdq_err("GERRORN=0x%X\n", vcmdq_page0_readl_relaxed(GERRORN));
@@ -723,6 +764,8 @@ tegra241_cmdqv_find_resource(struct arm_smmu_device *smmu, int id)

acpi_dev_free_resource_list(&resource_list);

+ cmdqv->base_pfn = rentry->res->start >> PAGE_SHIFT;
+
INIT_LIST_HEAD(&resource_list);

ret = acpi_dev_get_resources(adev, &resource_list,
@@ -843,3 +886,234 @@ tegra241_cmdqv_acpi_probe(struct arm_smmu_device *smmu, int id)

return cmdqv;
}
+
+static int tegra241_vcmdq_hw_init_user(struct tegra241_vcmdq *vcmdq)
+{
+ /* Configure the vcmdq only; User space does the enabling */
+ _tegra241_vcmdq_hw_init(vcmdq);
+
+ vcmdq_dbg("inited at host PA 0x%llx size 0x%lx\n",
+ vcmdq->cmdq.q.q_base & VCMDQ_ADDR,
+ 1UL << (vcmdq->cmdq.q.q_base & VCMDQ_LOG2SIZE));
+ return 0;
+}
+
+static struct iommufd_vqueue *
+tegra241_cmdqv_vqueue_alloc(struct iommufd_viommu *viommu,
+ const struct iommu_user_data *user_data)
+{
+ struct tegra241_vintf *vintf = viommu_to_vintf(viommu);
+ struct tegra241_cmdqv *cmdqv = vintf->cmdqv;
+ struct iommu_vqueue_tegra241_cmdqv arg;
+ struct tegra241_vcmdq *vcmdq;
+ phys_addr_t q_base;
+ int ret;
+
+ ret = iommu_copy_struct_from_user(&arg, user_data,
+ IOMMU_VQUEUE_DATA_TEGRA241_CMDQV,
+ vcmdq_base);
+ if (ret)
+ return ERR_PTR(ret);
+
+ if (!arg.vcmdq_base || arg.vcmdq_base & ~VCMDQ_ADDR)
+ return ERR_PTR(-EINVAL);
+ if (!arg.vcmdq_log2size || arg.vcmdq_log2size > VCMDQ_LOG2SIZE)
+ return ERR_PTR(-EINVAL);
+ if (arg.vcmdq_id >= cmdqv->num_vcmdqs_per_vintf)
+ return ERR_PTR(-EINVAL);
+ q_base = arm_smmu_domain_ipa_to_pa(vintf->s2_domain, arg.vcmdq_base);
+ if (!q_base)
+ return ERR_PTR(-EINVAL);
+
+ if (vintf->vcmdqs[arg.vcmdq_id]) {
+ vcmdq = vintf->vcmdqs[arg.vcmdq_id];
+
+ /* deinit the previous setting as a reset, before re-init */
+ tegra241_vcmdq_hw_deinit(vcmdq);
+
+ vcmdq->cmdq.q.q_base = q_base & VCMDQ_ADDR;
+ vcmdq->cmdq.q.q_base |= arg.vcmdq_log2size;
+ tegra241_vcmdq_hw_init_user(vcmdq);
+
+ return &vcmdq->core;
+ }
+
+ vcmdq = iommufd_vqueue_alloc(tegra241_vcmdq, core);
+ if (!vcmdq)
+ return ERR_PTR(-ENOMEM);
+
+ ret = tegra241_vintf_lvcmdq_init(vintf, arg.vcmdq_id, vcmdq);
+ if (ret)
+ goto free_vcmdq;
+
+ vcmdq->cmdq.q.q_base = q_base & VCMDQ_ADDR;
+ vcmdq->cmdq.q.q_base |= arg.vcmdq_log2size;
+
+ ret = tegra241_vcmdq_hw_init_user(vcmdq);
+ if (ret)
+ goto deinit_lvcmdq;
+ vintf->vcmdqs[arg.vcmdq_id] = vcmdq;
+ vcmdq_dbg("allocated\n");
+
+ return &vcmdq->core;
+deinit_lvcmdq:
+ tegra241_vintf_lvcmdq_deinit(vcmdq);
+free_vcmdq:
+ kfree(vcmdq);
+ return ERR_PTR(ret);
+}
+
+static void tegra241_cmdqv_vqueue_free(struct iommufd_vqueue *vqueue)
+{
+ struct tegra241_vcmdq *vcmdq = vqueue_to_vcmdq(vqueue);
+
+ tegra241_vcmdq_hw_deinit(vcmdq);
+ tegra241_vintf_lvcmdq_deinit(vcmdq);
+ vcmdq->vintf->vcmdqs[vcmdq->lidx] = NULL;
+ vcmdq_dbg("deallocated\n");
+
+ /* IOMMUFD core frees the memory of vcmdq and vqueue */
+}
+
+static void tegra241_cmdqv_viommu_free(struct iommufd_viommu *viommu)
+{
+ struct tegra241_vintf *vintf = viommu_to_vintf(viommu);
+ struct tegra241_cmdqv *cmdqv = vintf->cmdqv;
+ int qidx;
+
+ /* Just in case, make sure all LVCMDQs are disabled */
+ for (qidx = 0; qidx < cmdqv->num_vcmdqs_per_vintf; qidx++)
+ if (vintf->vcmdqs[qidx])
+ tegra241_cmdqv_vqueue_free(&vintf->vcmdqs[qidx]->core);
+
+ vintf_write_config(0);
+
+ ida_destroy(&vintf->sid_slots);
+ kfree(vintf->vcmdqs);
+
+ ida_free(&cmdqv->vintf_ids, vintf->idx);
+ cmdqv->vintfs[vintf->idx] = NULL;
+ vintf_dbg("deallocated with vmid (%d)\n", vintf->vmid);
+
+ /* IOMMUFD core frees the memory of vintf and viommu */
+}
+
+static int tegra241_cmdqv_viommu_set_dev_id(struct iommufd_viommu *viommu,
+ struct device *dev, u64 dev_id)
+{
+ struct tegra241_vintf *vintf =
+ container_of(viommu, struct tegra241_vintf, core);
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_stream *stream = &master->streams[0];
+ int slot;
+
+ WARN_ON_ONCE(master->num_streams != 1);
+
+ /* Find an empty slot of SID_MATCH and SID_REPLACE */
+ slot = ida_alloc_max(&vintf->sid_slots,
+ TEGRA241_VINTF_MAX_SID_SLOT, GFP_KERNEL);
+ if (slot < 0)
+ return slot;
+
+ vintf_writel_relaxed(stream->id, SID_REPLACE(slot));
+ vintf_writel_relaxed(dev_id << 1 | 0x1, SID_MATCH(slot));
+ stream->cmdqv_sid_slot = slot;
+ vintf_dbg("allocated a slot (%d) for pSID=%x, vSID=%x\n",
+ slot, stream->id, (u32)dev_id);
+
+ return 0;
+}
+
+static void tegra241_cmdqv_viommu_unset_dev_id(struct iommufd_viommu *viommu,
+ struct device *dev)
+{
+ struct tegra241_vintf *vintf =
+ container_of(viommu, struct tegra241_vintf, core);
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_stream *stream = &master->streams[0];
+ int slot = stream->cmdqv_sid_slot;
+
+ vintf_writel_relaxed(0, SID_REPLACE(slot));
+ vintf_writel_relaxed(0, SID_MATCH(slot));
+ ida_free(&vintf->sid_slots, slot);
+ vintf_dbg("deallocated a slot (%d) for pSID=%x\n", slot, stream->id);
+}
+
+static unsigned long tegra241_cmdqv_get_mmap_pfn(struct iommufd_viommu *viommu,
+ size_t pgsize)
+{
+ struct tegra241_vintf *vintf =
+ container_of(viommu, struct tegra241_vintf, core);
+ struct tegra241_cmdqv *cmdqv = vintf->cmdqv;
+
+ return cmdqv->base_pfn + TEGRA241_VINTFi_PAGE0(vintf->idx) / PAGE_SIZE;
+}
+
+static const struct iommufd_viommu_ops tegra241_cmdqv_viommu_ops = {
+ .free = tegra241_cmdqv_viommu_free,
+ .set_dev_id = tegra241_cmdqv_viommu_set_dev_id,
+ .unset_dev_id = tegra241_cmdqv_viommu_unset_dev_id,
+ .vqueue_alloc = tegra241_cmdqv_vqueue_alloc,
+ .vqueue_free = tegra241_cmdqv_vqueue_free,
+ .get_mmap_pfn = tegra241_cmdqv_get_mmap_pfn,
+};
+
+struct iommufd_viommu *
+tegra241_cmdqv_viommu_alloc(struct tegra241_cmdqv *cmdqv,
+ struct arm_smmu_domain *smmu_domain)
+{
+ struct tegra241_vintf *vintf;
+ int idx, ret;
+ u32 regval;
+
+ if (!smmu_domain || smmu_domain->stage != ARM_SMMU_DOMAIN_S2)
+ return ERR_PTR(-EINVAL);
+
+ vintf = iommufd_viommu_alloc(tegra241_vintf, core);
+ if (!vintf)
+ return ERR_PTR(-ENOMEM);
+ vintf->core.ops = &tegra241_cmdqv_viommu_ops;
+
+ ret = ida_alloc_range(&cmdqv->vintf_ids, 1,
+ cmdqv->num_vintfs - 1, GFP_KERNEL);
+ if (ret < 0) {
+ dev_err(cmdqv->dev, "no more available vintf\n");
+ goto out_free;
+ }
+ idx = ret;
+ cmdqv->vintfs[idx] = vintf;
+
+ vintf->idx = idx;
+ vintf->cmdqv = cmdqv;
+ vintf->vmid = smmu_domain->vmid;
+ vintf->s2_domain = smmu_domain;
+ vintf->base = cmdqv->base + TEGRA241_VINTF(idx);
+
+ ret = vintf_write_config(0);
+ if (ret)
+ goto out_ida_free;
+ regval = FIELD_PREP(VINTF_VMID, vintf->vmid) |
+ FIELD_PREP(VINTF_EN, 1);
+ ret = vintf_write_config(regval);
+ if (ret)
+ goto out_ida_free;
+
+ vintf->vcmdqs = kcalloc(cmdqv->num_vcmdqs_per_vintf,
+ sizeof(*vintf->vcmdqs), GFP_KERNEL);
+ if (!vintf->vcmdqs) {
+ ret = -ENOMEM;
+ goto out_ida_free;
+ }
+
+ ida_init(&vintf->sid_slots);
+
+ vintf_dbg("allocated with vmid (%d)\n", vintf->vmid);
+
+ return &vintf->core;
+
+out_ida_free:
+ ida_free(&cmdqv->vintf_ids, vintf->idx);
+out_free:
+ kfree(vintf);
+ return ERR_PTR(ret);
+}
--
2.43.0