2023-09-19 09:31:05

by Yi Liu

Subject: [PATCH 0/6] iommufd support allocating nested parent domain

IOMMU hardware that supports nested translation has two stages of
address translation (normally referred to as stage-1 and stage-2). The page
table formats of stage-1 and stage-2 can differ; e.g., VT-d has
different page table formats for stage-1 and stage-2.

A nested parent domain is the iommu domain used to represent the stage-2
translation. In IOMMUFD, both stage-1 and stage-2 translations are tracked
as HWPTs (a.k.a. iommu domains). The stage-2 HWPT is the parent of the
stage-1 HWPT, as stage-1 cannot work alone in nested translation. When the
stage-1 and stage-2 page table formats differ, the parent HWPT must use
exactly the stage-2 page table format. However, the existing kernel hides the
format selection in the iommu drivers, so a domain allocated via
IOMMU_HWPT_ALLOC can use either the stage-1 or the stage-2 page table format;
there is no guarantee which.

To enforce the page table format of the nested parent domain, this series
introduces a new iommu op (domain_alloc_user) that accepts user flags
to allocate a domain as userspace requires. It also converts IOMMUFD to use
the new domain_alloc_user op for domain allocation if supported, then extends
the IOMMU_HWPT_ALLOC ioctl to pass down a NEST_PARENT flag to allocate a HWPT
that can be used as a parent. This series implements the new op in the Intel
iommu driver to give a complete picture. It is preparation for adding nesting
support in IOMMUFD/IOMMU.

The complete code can be found at:
https://github.com/yiliu1765/iommufd/tree/iommufd_alloc_user_v1

Regards,
Yi Liu

Yi Liu (6):
iommu: Add new iommu op to create domains owned by userspace
iommufd/hw_pagetable: Use domain_alloc_user op for domain allocation
iommufd/hw_pagetable: Accepts user flags for domain allocation
iommufd/hw_pagetable: Support allocating nested parent domain
iommufd/selftest: Add domain_alloc_user() support in iommu mock
iommu/vt-d: Add domain_alloc_user op

drivers/iommu/intel/iommu.c | 20 ++++++++++++
drivers/iommu/iommufd/device.c | 2 +-
drivers/iommu/iommufd/hw_pagetable.c | 31 ++++++++++++++-----
drivers/iommu/iommufd/iommufd_private.h | 3 +-
drivers/iommu/iommufd/selftest.c | 16 ++++++++++
include/linux/iommu.h | 8 +++++
include/uapi/linux/iommufd.h | 12 ++++++-
tools/testing/selftests/iommu/iommufd.c | 24 +++++++++++---
.../selftests/iommu/iommufd_fail_nth.c | 2 +-
tools/testing/selftests/iommu/iommufd_utils.h | 11 +++++--
10 files changed, 111 insertions(+), 18 deletions(-)

--
2.34.1
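
For illustration, the userspace side of the request described in the cover letter might be prepared roughly as follows. This is only a sketch: struct mock_iommu_hwpt_alloc and make_nest_parent_request() are hypothetical stand-ins that list the fields named in the uAPI comment of patch 1/6, and the flag value is copied from the enum added there. A real caller would use the definitions from include/uapi/linux/iommufd.h and pass the structure to ioctl(IOMMU_HWPT_ALLOC).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Value copied from enum iommufd_hwpt_alloc_flags in patch 1/6. */
#define IOMMU_HWPT_ALLOC_NEST_PARENT (1u << 0)

/*
 * Hypothetical local stand-in listing the fields documented for
 * struct iommu_hwpt_alloc. The real definition lives in
 * include/uapi/linux/iommufd.h and should be used instead.
 */
struct mock_iommu_hwpt_alloc {
	uint32_t size;
	uint32_t flags;
	uint32_t dev_id;
	uint32_t pt_id;
	uint32_t out_hwpt_id;
	uint32_t reserved;
};

/* Build a request asking for a HWPT that can serve as a nested parent. */
static struct mock_iommu_hwpt_alloc
make_nest_parent_request(uint32_t dev_id, uint32_t ioas_id)
{
	struct mock_iommu_hwpt_alloc cmd;

	memset(&cmd, 0, sizeof(cmd));	/* reserved fields must stay zero */
	cmd.size = sizeof(cmd);
	cmd.flags = IOMMU_HWPT_ALLOC_NEST_PARENT;
	cmd.dev_id = dev_id;		/* device to allocate the HWPT for */
	cmd.pt_id = ioas_id;		/* IOAS to connect the HWPT to */
	return cmd;
}
```

With this series applied, the kernel is expected to reject the request with EOPNOTSUPP when the device's iommu driver does not provide domain_alloc_user.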


2023-09-19 10:29:09

by Yi Liu

Subject: [PATCH 4/6] iommufd/hw_pagetable: Support allocating nested parent domain

This extends IOMMU_HWPT_ALLOC to allocate domains used as parent (stage-2)
in nested translation.

Signed-off-by: Yi Liu <[email protected]>
---
drivers/iommu/iommufd/hw_pagetable.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index 5be7a31cbd9c..26a8a818ffa3 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -83,6 +83,9 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,

lockdep_assert_held(&ioas->mutex);

+ if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ops->domain_alloc_user)
+ return ERR_PTR(-EOPNOTSUPP);
+
hwpt = iommufd_object_alloc(ictx, hwpt, IOMMUFD_OBJ_HW_PAGETABLE);
if (IS_ERR(hwpt))
return hwpt;
@@ -154,7 +157,7 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd)
struct iommufd_ioas *ioas;
int rc;

- if (cmd->flags || cmd->__reserved)
+ if (cmd->flags & ~IOMMU_HWPT_ALLOC_NEST_PARENT || cmd->__reserved)
return -EOPNOTSUPP;

idev = iommufd_get_device(ucmd, cmd->dev_id);
--
2.34.1

2023-09-19 12:32:37

by Yi Liu

Subject: [PATCH 2/6] iommufd/hw_pagetable: Use domain_alloc_user op for domain allocation

This makes IOMMUFD use the domain_alloc_user op for iommu_domain
creation, as IOMMUFD needs to support iommu_domain allocation with
parameters from userspace for nesting support. If the iommu driver
doesn't provide a domain_alloc_user callback, IOMMUFD falls back to
iommu_domain_alloc().

Suggested-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Lu Baolu <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Co-developed-by: Nicolin Chen <[email protected]>
Signed-off-by: Nicolin Chen <[email protected]>
Signed-off-by: Yi Liu <[email protected]>
---
drivers/iommu/iommufd/hw_pagetable.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index cf2c1504e20d..48874f896521 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -5,6 +5,7 @@
#include <linux/iommu.h>
#include <uapi/linux/iommufd.h>

+#include "../iommu-priv.h"
#include "iommufd_private.h"

void iommufd_hw_pagetable_destroy(struct iommufd_object *obj)
@@ -74,6 +75,7 @@ struct iommufd_hw_pagetable *
iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
struct iommufd_device *idev, bool immediate_attach)
{
+ const struct iommu_ops *ops = dev_iommu_ops(idev->dev);
struct iommufd_hw_pagetable *hwpt;
int rc;

@@ -88,10 +90,19 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
refcount_inc(&ioas->obj.users);
hwpt->ioas = ioas;

- hwpt->domain = iommu_domain_alloc(idev->dev->bus);
- if (!hwpt->domain) {
- rc = -ENOMEM;
- goto out_abort;
+ if (ops->domain_alloc_user) {
+ hwpt->domain = ops->domain_alloc_user(idev->dev, 0);
+ if (IS_ERR(hwpt->domain)) {
+ rc = PTR_ERR(hwpt->domain);
+ hwpt->domain = NULL;
+ goto out_abort;
+ }
+ } else {
+ hwpt->domain = iommu_domain_alloc(idev->dev->bus);
+ if (!hwpt->domain) {
+ rc = -ENOMEM;
+ goto out_abort;
+ }
}

/*
--
2.34.1
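
The two allocation paths in the hunk above report failure differently: domain_alloc_user returns an ERR_PTR, while iommu_domain_alloc() returns NULL. A minimal userspace model of that fallback, assuming simplified stand-ins for the kernel's ERR_PTR helpers, could look like this. All mock_* names are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct mock_domain { int id; };

struct mock_ops {
	/* New-style op: returns ERR_PTR() on failure, may be NULL. */
	struct mock_domain *(*domain_alloc_user)(uint32_t flags);
	/* Legacy path: returns NULL on failure. */
	struct mock_domain *(*domain_alloc)(void);
};

static struct mock_domain user_domain = { 2 };
static struct mock_domain legacy_domain = { 1 };

/* Mock driver callback that supports no flags at all. */
static struct mock_domain *mock_user_alloc(uint32_t flags)
{
	if (flags)
		return ERR_PTR(-EOPNOTSUPP);
	return &user_domain;
}

static struct mock_domain *mock_legacy_alloc(void) { return &legacy_domain; }
static struct mock_domain *mock_legacy_alloc_fail(void) { return NULL; }

/* Prefer domain_alloc_user when present, else use the legacy path. */
static int mock_alloc_domain(const struct mock_ops *ops, uint32_t flags,
			     struct mock_domain **out)
{
	struct mock_domain *domain;

	*out = NULL;
	if (ops->domain_alloc_user) {
		domain = ops->domain_alloc_user(flags);
		if (IS_ERR(domain))
			return (int)PTR_ERR(domain);
	} else {
		domain = ops->domain_alloc();
		if (!domain)
			return -ENOMEM;
	}
	*out = domain;
	return 0;
}
```

Clearing the output pointer on failure mirrors the patch setting hwpt->domain to NULL before jumping to the abort path, so that later cleanup never sees an ERR_PTR value.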

2023-09-19 13:53:20

by Yi Liu

Subject: [PATCH 3/6] iommufd/hw_pagetable: Accepts user flags for domain allocation

This extends iommufd_hw_pagetable_alloc() to accept user flags.

Signed-off-by: Yi Liu <[email protected]>
---
drivers/iommu/iommufd/device.c | 2 +-
drivers/iommu/iommufd/hw_pagetable.c | 9 ++++++---
drivers/iommu/iommufd/iommufd_private.h | 3 ++-
3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index ce78c3671539..e88fa73a45e6 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -540,7 +540,7 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev,
}

hwpt = iommufd_hw_pagetable_alloc(idev->ictx, ioas, idev,
- immediate_attach);
+ 0, immediate_attach);
if (IS_ERR(hwpt)) {
destroy_hwpt = ERR_CAST(hwpt);
goto out_unlock;
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index 48874f896521..5be7a31cbd9c 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -61,6 +61,7 @@ int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt)
* @ictx: iommufd context
* @ioas: IOAS to associate the domain with
* @idev: Device to get an iommu_domain for
+ * @flags: Flags from userspace
* @immediate_attach: True if idev should be attached to the hwpt
*
* Allocate a new iommu_domain and return it as a hw_pagetable. The HWPT
@@ -73,7 +74,8 @@ int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt)
*/
struct iommufd_hw_pagetable *
iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
- struct iommufd_device *idev, bool immediate_attach)
+ struct iommufd_device *idev, u32 flags,
+ bool immediate_attach)
{
const struct iommu_ops *ops = dev_iommu_ops(idev->dev);
struct iommufd_hw_pagetable *hwpt;
@@ -91,7 +93,7 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
hwpt->ioas = ioas;

if (ops->domain_alloc_user) {
- hwpt->domain = ops->domain_alloc_user(idev->dev, 0);
+ hwpt->domain = ops->domain_alloc_user(idev->dev, flags);
if (IS_ERR(hwpt->domain)) {
rc = PTR_ERR(hwpt->domain);
hwpt->domain = NULL;
@@ -166,7 +168,8 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd)
}

mutex_lock(&ioas->mutex);
- hwpt = iommufd_hw_pagetable_alloc(ucmd->ictx, ioas, idev, false);
+ hwpt = iommufd_hw_pagetable_alloc(ucmd->ictx, ioas,
+ idev, cmd->flags, false);
if (IS_ERR(hwpt)) {
rc = PTR_ERR(hwpt);
goto out_unlock;
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 2c58670011fe..3064997a0181 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -242,7 +242,8 @@ struct iommufd_hw_pagetable {

struct iommufd_hw_pagetable *
iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
- struct iommufd_device *idev, bool immediate_attach);
+ struct iommufd_device *idev, u32 flags,
+ bool immediate_attach);
int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt);
int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
struct iommufd_device *idev);
--
2.34.1

2023-09-19 18:02:52

by Yi Liu

Subject: [PATCH 1/6] iommu: Add new iommu op to create domains owned by userspace

Introduce a new iommu_domain op to create domains owned by userspace,
e.g. through IOMMUFD. These domains have a few different properties
compared to kernel-owned domains:

- They may be UNMANAGED domains, but created with special parameters.
For instance aperture size changes/number of levels, different
IOPTE formats, or other things necessary to make a vIOMMU work

- We have to track all the memory allocations with GFP_KERNEL_ACCOUNT
to make the cgroup sandbox stronger

- Device-specialty domains, such as NESTED domains can be created by
IOMMUFD.

The new op clearly says the domain is being created by IOMMUFD, that
the domain is intended for userspace use, and it provides a way to pass
user flags or a driver specific uAPI structure to customize the created
domain to exactly what the vIOMMU userspace driver requires.

iommu drivers that cannot support VFIO/IOMMUFD should not support this
op. This includes any driver that cannot provide a fully functional
UNMANAGED domain.

This new op for now is only supposed to be used by IOMMUFD, hence no
wrapper for it. IOMMUFD would call the callback directly. As for domain
free, IOMMUFD would use iommu_domain_free().

Suggested-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Co-developed-by: Nicolin Chen <[email protected]>
Signed-off-by: Nicolin Chen <[email protected]>
Signed-off-by: Yi Liu <[email protected]>
---
include/linux/iommu.h | 8 ++++++++
include/uapi/linux/iommufd.h | 12 +++++++++++-
2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c50a769d569a..660dc1931dc9 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -235,6 +235,13 @@ struct iommu_iotlb_gather {
* use. The information type is one of enum iommu_hw_info_type defined
* in include/uapi/linux/iommufd.h.
* @domain_alloc: allocate iommu domain
+ * @domain_alloc_user: Allocate an iommu domain corresponding to the input
+ * parameters like flags defined as enum iommufd_hwpt_alloc_flags
+ * in include/uapi/linux/iommufd.h. Different from the
+ * domain_alloc op, it requires iommu driver to fully
+ * initialize a new domain including the generic iommu_domain
+ * struct. Upon success, a domain is returned. Upon failure,
+ * ERR_PTR must be returned.
* @probe_device: Add device to iommu driver handling
* @release_device: Remove device from iommu driver handling
* @probe_finalize: Do final setup work after the device is added to an IOMMU
@@ -267,6 +274,7 @@ struct iommu_ops {

/* Domain allocation and freeing by the iommu driver */
struct iommu_domain *(*domain_alloc)(unsigned iommu_domain_type);
+ struct iommu_domain *(*domain_alloc_user)(struct device *dev, u32 flags);

struct iommu_device *(*probe_device)(struct device *dev);
void (*release_device)(struct device *dev);
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index b4ba0c0cbab6..4a7c5c8fdbb4 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -347,10 +347,20 @@ struct iommu_vfio_ioas {
};
#define IOMMU_VFIO_IOAS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VFIO_IOAS)

+/**
+ * enum iommufd_hwpt_alloc_flags - Flags for HWPT allocation
+ * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a domain which can serve
+ * as the parent domain in the nesting
+ * configuration.
+ */
+enum iommufd_hwpt_alloc_flags {
+ IOMMU_HWPT_ALLOC_NEST_PARENT = 1 << 0,
+};
+
/**
* struct iommu_hwpt_alloc - ioctl(IOMMU_HWPT_ALLOC)
* @size: sizeof(struct iommu_hwpt_alloc)
- * @flags: Must be 0
+ * @flags: Combination of enum iommufd_hwpt_alloc_flags
* @dev_id: The device to allocate this HWPT for
* @pt_id: The IOAS to connect this HWPT to
* @out_hwpt_id: The ID of the new HWPT
--
2.34.1

2023-09-19 18:15:40

by Yi Liu

Subject: [PATCH 6/6] iommu/vt-d: Add domain_alloc_user op

This adds the domain_alloc_user op implementation. It supports allocating
domains to be used as parent under nested translation.

Signed-off-by: Yi Liu <[email protected]>
---
drivers/iommu/intel/iommu.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 5db283c17e0d..491bcde1ff96 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4074,6 +4074,25 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
return NULL;
}

+static struct iommu_domain *
+intel_iommu_domain_alloc_user(struct device *dev, u32 flags)
+{
+ struct iommu_domain *domain;
+ struct intel_iommu *iommu;
+
+ iommu = device_to_iommu(dev, NULL, NULL);
+ if (!iommu)
+ return ERR_PTR(-ENODEV);
+
+ if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ecap_nest(iommu->ecap))
+ return ERR_PTR(-EOPNOTSUPP);
+
+ domain = iommu_domain_alloc(dev->bus);
+ if (!domain)
+ domain = ERR_PTR(-ENOMEM);
+ return domain;
+}
+
static void intel_iommu_domain_free(struct iommu_domain *domain)
{
if (domain != &si_domain->domain && domain != &blocking_domain)
@@ -4807,6 +4826,7 @@ const struct iommu_ops intel_iommu_ops = {
.capable = intel_iommu_capable,
.hw_info = intel_iommu_hw_info,
.domain_alloc = intel_iommu_domain_alloc,
+ .domain_alloc_user = intel_iommu_domain_alloc_user,
.probe_device = intel_iommu_probe_device,
.probe_finalize = intel_iommu_probe_finalize,
.release_device = intel_iommu_release_device,
--
2.34.1

2023-09-20 07:47:56

by Lu Baolu

Subject: Re: [PATCH 6/6] iommu/vt-d: Add domain_alloc_user op

On 9/19/23 5:25 PM, Yi Liu wrote:
> This adds the domain_alloc_user op implementation. It supports allocating
> domains to be used as parent under nested translation.

Documentation/process/submitting-patches.rst:

Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
to do frotz", as if you are giving orders to the codebase to change
its behaviour.

So how about,

Add the domain_alloc_user callback to support allocating domains used as
parent under nested translation.

?

>
> Signed-off-by: Yi Liu <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 5db283c17e0d..491bcde1ff96 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4074,6 +4074,25 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
> return NULL;
> }
>
> +static struct iommu_domain *
> +intel_iommu_domain_alloc_user(struct device *dev, u32 flags)
> +{
> + struct iommu_domain *domain;
> + struct intel_iommu *iommu;
> +
> + iommu = device_to_iommu(dev, NULL, NULL);
> + if (!iommu)
> + return ERR_PTR(-ENODEV);
> +
> + if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ecap_nest(iommu->ecap))
> + return ERR_PTR(-EOPNOTSUPP);
> +
> + domain = iommu_domain_alloc(dev->bus);

No need to bounce between core and driver. Just,

intel_iommu_domain_alloc(IOMMU_DOMAIN_UNMANAGED);

and fully initialize it before return.

> + if (!domain)
> + domain = ERR_PTR(-ENOMEM);
> + return domain;
> +}
> +
> static void intel_iommu_domain_free(struct iommu_domain *domain)
> {
> if (domain != &si_domain->domain && domain != &blocking_domain)
> @@ -4807,6 +4826,7 @@ const struct iommu_ops intel_iommu_ops = {
> .capable = intel_iommu_capable,
> .hw_info = intel_iommu_hw_info,
> .domain_alloc = intel_iommu_domain_alloc,
> + .domain_alloc_user = intel_iommu_domain_alloc_user,
> .probe_device = intel_iommu_probe_device,
> .probe_finalize = intel_iommu_probe_finalize,
> .release_device = intel_iommu_release_device,

Best regards,
baolu

2023-09-20 10:22:59

by Lu Baolu

Subject: Re: [PATCH 4/6] iommufd/hw_pagetable: Support allocating nested parent domain

On 9/19/23 5:25 PM, Yi Liu wrote:
> This extends IOMMU_HWPT_ALLOC to allocate domains used as parent (stage-2)
> in nested translation.
>
> Signed-off-by: Yi Liu <[email protected]>
> ---
> drivers/iommu/iommufd/hw_pagetable.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
> index 5be7a31cbd9c..26a8a818ffa3 100644
> --- a/drivers/iommu/iommufd/hw_pagetable.c
> +++ b/drivers/iommu/iommufd/hw_pagetable.c
> @@ -83,6 +83,9 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
>
> lockdep_assert_held(&ioas->mutex);
>
> + if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ops->domain_alloc_user)
> + return ERR_PTR(-EOPNOTSUPP);
> +
> hwpt = iommufd_object_alloc(ictx, hwpt, IOMMUFD_OBJ_HW_PAGETABLE);
> if (IS_ERR(hwpt))
> return hwpt;
> @@ -154,7 +157,7 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd)
> struct iommufd_ioas *ioas;
> int rc;
>
> - if (cmd->flags || cmd->__reserved)
> + if (cmd->flags & ~IOMMU_HWPT_ALLOC_NEST_PARENT || cmd->__reserved)
> return -EOPNOTSUPP;

Please add parentheses here. '&' already binds tighter than '||', so the
compiler parses it the same way, but the explicit grouping is easier to read.

if ((cmd->flags & ~IOMMU_HWPT_ALLOC_NEST_PARENT) || cmd->__reserved)
return -EOPNOTSUPP;

Best regards,
baolu
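
On the precedence question: in C, bitwise '&' binds tighter than logical '||', so the unparenthesized and parenthesized forms of the check should evaluate identically, and the extra parentheses mainly help readability. A small sketch to confirm this, using hypothetical helper names and the flag value from patch 1/6:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value copied from enum iommufd_hwpt_alloc_flags in patch 1/6. */
#define IOMMU_HWPT_ALLOC_NEST_PARENT (1u << 0)

/* The condition as written in the patch, without extra parentheses. */
static bool reject_unparenthesized(uint32_t flags, uint32_t reserved)
{
	return flags & ~IOMMU_HWPT_ALLOC_NEST_PARENT || reserved;
}

/* The condition as suggested in review, with explicit grouping. */
static bool reject_parenthesized(uint32_t flags, uint32_t reserved)
{
	return (flags & ~IOMMU_HWPT_ALLOC_NEST_PARENT) || reserved;
}

/* Returns true if both forms agree for the given inputs. */
static bool forms_agree(uint32_t flags, uint32_t reserved)
{
	return reject_unparenthesized(flags, reserved) ==
	       reject_parenthesized(flags, reserved);
}
```

Both helpers accept flags of 0 or NEST_PARENT with a zero reserved field, and reject any other flag bit or a non-zero reserved field.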

2023-09-20 11:56:52

by Yang, Weijiang

Subject: Re: [PATCH 6/6] iommu/vt-d: Add domain_alloc_user op

On 9/19/2023 5:25 PM, Yi Liu wrote:
> This adds the domain_alloc_user op implementation. It supports allocating
> domains to be used as parent under nested translation.
>
> Signed-off-by: Yi Liu <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 5db283c17e0d..491bcde1ff96 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4074,6 +4074,25 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
> return NULL;
> }
>
> +static struct iommu_domain *
> +intel_iommu_domain_alloc_user(struct device *dev, u32 flags)
> +{
> + struct iommu_domain *domain;
> + struct intel_iommu *iommu;
> +
> + iommu = device_to_iommu(dev, NULL, NULL);
> + if (!iommu)
> + return ERR_PTR(-ENODEV);
> +
> + if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ecap_nest(iommu->ecap))
> + return ERR_PTR(-EOPNOTSUPP);

The outer caller has checked (flags & IOMMU_HWPT_ALLOC_NEST_PARENT) before it comes here.
If this callback is dedicated to nested domain allocation, then you may omit the condition here.

2023-09-20 15:21:37

by Jason Gunthorpe

Subject: Re: [PATCH 6/6] iommu/vt-d: Add domain_alloc_user op

On Wed, Sep 20, 2023 at 01:10:04PM +0000, Liu, Yi L wrote:
> > From: Jason Gunthorpe <[email protected]>
> > Sent: Wednesday, September 20, 2023 9:05 PM
> >
> > On Wed, Sep 20, 2023 at 01:28:41PM +0800, Baolu Lu wrote:
> > > >
> > > > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > > > index 5db283c17e0d..491bcde1ff96 100644
> > > > --- a/drivers/iommu/intel/iommu.c
> > > > +++ b/drivers/iommu/intel/iommu.c
> > > > @@ -4074,6 +4074,25 @@ static struct iommu_domain
> > *intel_iommu_domain_alloc(unsigned type)
> > > > return NULL;
> > > > }
> > > > +static struct iommu_domain *
> > > > +intel_iommu_domain_alloc_user(struct device *dev, u32 flags)
> > > > +{
> > > > + struct iommu_domain *domain;
> > > > + struct intel_iommu *iommu;
> > > > +
> > > > + iommu = device_to_iommu(dev, NULL, NULL);
> > > > + if (!iommu)
> > > > + return ERR_PTR(-ENODEV);
> > > > +
> > > > + if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ecap_nest(iommu-
> > >ecap))
> > > > + return ERR_PTR(-EOPNOTSUPP);
> >
> > There is a check missing for supported flags
> >
> > if (flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT))
> > return ERR_PTR(-EOPNOTSUPP);
>
> Well, the iommufd has such check. But I also noticed your another
> reply to Weijiang. So your preference is to do the flags validation
> in iommu driver instead of iommufd. Isn't it?

The core code should check that only kernel known bits are set

The driver code should check that only driver supported bits are set.

Today there is only one bit so the checks are the same code.

Tomorrow when we add a new bit the checks will not be the same

Jason
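
The split described above can be sketched with two masks. This is an illustration only: MOCK_HWPT_ALLOC_FUTURE is a hypothetical second flag that does not exist in the series, added here to show how the core check and the driver check diverge once the kernel knows a bit that a given driver does not support.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Value copied from enum iommufd_hwpt_alloc_flags in patch 1/6. */
#define IOMMU_HWPT_ALLOC_NEST_PARENT (1u << 0)
/* Hypothetical future flag, invented for this sketch only. */
#define MOCK_HWPT_ALLOC_FUTURE (1u << 1)

/* Bits the core (iommufd) knows about. */
#define MOCK_CORE_KNOWN_FLAGS \
	(IOMMU_HWPT_ALLOC_NEST_PARENT | MOCK_HWPT_ALLOC_FUTURE)
/* Bits this particular mock driver supports. */
#define MOCK_DRIVER_SUPPORTED_FLAGS IOMMU_HWPT_ALLOC_NEST_PARENT

/* Core check: reject bits the kernel does not define at all. */
static int core_check_flags(uint32_t flags)
{
	return (flags & ~MOCK_CORE_KNOWN_FLAGS) ? -EOPNOTSUPP : 0;
}

/* Driver check: reject bits this driver does not implement. */
static int driver_check_flags(uint32_t flags)
{
	return (flags & ~MOCK_DRIVER_SUPPORTED_FLAGS) ? -EOPNOTSUPP : 0;
}
```

With only one flag defined the two masks are equal, so the two checks are the same code. With the hypothetical second flag, the core accepts it while this driver still rejects it.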

2023-09-20 15:27:30

by Yi Liu

Subject: RE: [PATCH 6/6] iommu/vt-d: Add domain_alloc_user op

> From: Jason Gunthorpe <[email protected]>
> Sent: Wednesday, September 20, 2023 9:05 PM
>
> On Wed, Sep 20, 2023 at 01:28:41PM +0800, Baolu Lu wrote:
> > >
> > > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > > index 5db283c17e0d..491bcde1ff96 100644
> > > --- a/drivers/iommu/intel/iommu.c
> > > +++ b/drivers/iommu/intel/iommu.c
> > > @@ -4074,6 +4074,25 @@ static struct iommu_domain
> *intel_iommu_domain_alloc(unsigned type)
> > > return NULL;
> > > }
> > > +static struct iommu_domain *
> > > +intel_iommu_domain_alloc_user(struct device *dev, u32 flags)
> > > +{
> > > + struct iommu_domain *domain;
> > > + struct intel_iommu *iommu;
> > > +
> > > + iommu = device_to_iommu(dev, NULL, NULL);
> > > + if (!iommu)
> > > + return ERR_PTR(-ENODEV);
> > > +
> > > + if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ecap_nest(iommu-
> >ecap))
> > > + return ERR_PTR(-EOPNOTSUPP);
>
> There is a check missing for supported flags
>
> if (flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT))
> return ERR_PTR(-EOPNOTSUPP);

Well, iommufd already has such a check. But I also noticed your other
reply to Weijiang. So your preference is to do the flags validation
in the iommu driver instead of iommufd, right?

> > > +
> > > + domain = iommu_domain_alloc(dev->bus);
> >
> > No need to bounce between core and driver. Just,
> >
> > intel_iommu_domain_alloc(IOMMU_DOMAIN_UNMANAGED);
> >
> > and fully initialize it before return.
>
> If you are going to do that then intel_iommu_domain_alloc() should
> fully initialize the domain, not here.

I also considered what Baolu described, but it requires some
extra initialization that duplicates what iommu_domain_alloc() does.
So I chose this simpler way.

Regards,
Yi Liu

2023-09-20 21:15:20

by Jason Gunthorpe

Subject: Re: [PATCH 6/6] iommu/vt-d: Add domain_alloc_user op

On Wed, Sep 20, 2023 at 01:41:07PM +0800, Yang, Weijiang wrote:
> On 9/19/2023 5:25 PM, Yi Liu wrote:
> > This adds the domain_alloc_user op implementation. It supports allocating
> > domains to be used as parent under nested translation.
> >
> > Signed-off-by: Yi Liu <[email protected]>
> > ---
> > drivers/iommu/intel/iommu.c | 20 ++++++++++++++++++++
> > 1 file changed, 20 insertions(+)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index 5db283c17e0d..491bcde1ff96 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -4074,6 +4074,25 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
> > return NULL;
> > }
> > +static struct iommu_domain *
> > +intel_iommu_domain_alloc_user(struct device *dev, u32 flags)
> > +{
> > + struct iommu_domain *domain;
> > + struct intel_iommu *iommu;
> > +
> > + iommu = device_to_iommu(dev, NULL, NULL);
> > + if (!iommu)
> > + return ERR_PTR(-ENODEV);
> > +
> > + if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ecap_nest(iommu->ecap))
> > + return ERR_PTR(-EOPNOTSUPP);
>
> The outer caller has checked (flags & IOMMU_HWPT_ALLOC_NEST_PARENT) before it comes here.
> If this callback is dedicated for nested domain allocation, then you may omit the condition here.

No, please don't.

The point of the flags is to be passed to the driver. The driver
should validate them, not the core code.

We will add more flags, I don't want to change every driver to do
this.

Jason

2023-09-20 21:43:22

by Yi Liu

Subject: RE: [PATCH 6/6] iommu/vt-d: Add domain_alloc_user op

> From: Baolu Lu <[email protected]>
> Sent: Wednesday, September 20, 2023 1:29 PM
>
> On 9/19/23 5:25 PM, Yi Liu wrote:
> > This adds the domain_alloc_user op implementation. It supports allocating
> > domains to be used as parent under nested translation.
>
> Documentation/process/submitting-patches.rst:
>
> Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
> instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
> to do frotz", as if you are giving orders to the codebase to change
> its behaviour.
>
> So how about,
>
> Add the domain_alloc_user callback to support allocating domains used as
> parent under nested translation.
>

Sure.

2023-09-20 22:31:07

by Jason Gunthorpe

Subject: Re: [PATCH 6/6] iommu/vt-d: Add domain_alloc_user op

On Wed, Sep 20, 2023 at 01:28:41PM +0800, Baolu Lu wrote:
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index 5db283c17e0d..491bcde1ff96 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -4074,6 +4074,25 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
> > return NULL;
> > }
> > +static struct iommu_domain *
> > +intel_iommu_domain_alloc_user(struct device *dev, u32 flags)
> > +{
> > + struct iommu_domain *domain;
> > + struct intel_iommu *iommu;
> > +
> > + iommu = device_to_iommu(dev, NULL, NULL);
> > + if (!iommu)
> > + return ERR_PTR(-ENODEV);
> > +
> > + if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ecap_nest(iommu->ecap))
> > + return ERR_PTR(-EOPNOTSUPP);

There is a check missing for supported flags

if (flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT))
return ERR_PTR(-EOPNOTSUPP);

> > +
> > + domain = iommu_domain_alloc(dev->bus);
>
> No need to bounce between core and driver. Just,
>
> intel_iommu_domain_alloc(IOMMU_DOMAIN_UNMANAGED);
>
> and fully initialize it before return.

If you are going to do that then intel_iommu_domain_alloc() should
fully initialize the domain, not here.

Jason

2023-09-20 23:55:30

by Yi Liu

Subject: RE: [PATCH 6/6] iommu/vt-d: Add domain_alloc_user op

> From: Yang, Weijiang <[email protected]>
> Sent: Wednesday, September 20, 2023 1:41 PM
> On 9/19/2023 5:25 PM, Yi Liu wrote:
> > This adds the domain_alloc_user op implementation. It supports allocating
> > domains to be used as parent under nested translation.
> >
> > Signed-off-by: Yi Liu <[email protected]>
> > ---
> > drivers/iommu/intel/iommu.c | 20 ++++++++++++++++++++
> > 1 file changed, 20 insertions(+)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index 5db283c17e0d..491bcde1ff96 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -4074,6 +4074,25 @@ static struct iommu_domain
> *intel_iommu_domain_alloc(unsigned type)
> > return NULL;
> > }
> >
> > +static struct iommu_domain *
> > +intel_iommu_domain_alloc_user(struct device *dev, u32 flags)
> > +{
> > + struct iommu_domain *domain;
> > + struct intel_iommu *iommu;
> > +
> > + iommu = device_to_iommu(dev, NULL, NULL);
> > + if (!iommu)
> > + return ERR_PTR(-ENODEV);
> > +
> > + if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ecap_nest(iommu-
> >ecap))
> > + return ERR_PTR(-EOPNOTSUPP);
>
> The outer caller has checked (flags & IOMMU_HWPT_ALLOC_NEST_PARENT) before it
> comes here.
> If this callback is dedicated for nested domain allocation, then you may omit the
> condition here.

This check is different. It aims to fail the call if the iommu hardware does
not support nested translation. I just realized that it may also need to check
whether scalable mode is enabled, which would be more accurate.

Regards,
Yi Liu

2023-09-21 08:09:30

by Lu Baolu

Subject: Re: [PATCH 6/6] iommu/vt-d: Add domain_alloc_user op

On 9/20/23 9:10 PM, Liu, Yi L wrote:
>>>> +
>>>> + domain = iommu_domain_alloc(dev->bus);
>>> No need to bounce between core and driver. Just,
>>>
>>> intel_iommu_domain_alloc(IOMMU_DOMAIN_UNMANAGED);
>>>
>>> and fully initialize it before return.
>> If you are going to do that then intel_iommu_domain_alloc() should
>> fully initialize the domain, not here.
> I've also considered what Baolu described, but it requires to do some
> extra initialization which is duplicated with iommu_domain_alloc().
> So I chose this simple way.

Okay, got you.

Once Jason's paging domain and Robin's bus->iommu_ops retirement series
have landed, the VT-d driver will need some refactoring. Therefore, I'm
fine with you using a simpler approach here. I'll refactor everything
later.

Best regards,
baolu

2023-09-25 08:18:37

by Yi Liu

Subject: Re: [PATCH 4/6] iommufd/hw_pagetable: Support allocating nested parent domain

On 2023/9/20 13:05, Baolu Lu wrote:
> On 9/19/23 5:25 PM, Yi Liu wrote:
>> This extends IOMMU_HWPT_ALLOC to allocate domains used as parent (stage-2)
>> in nested translation.
>>
>> Signed-off-by: Yi Liu <[email protected]>
>> ---
>>   drivers/iommu/iommufd/hw_pagetable.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/iommufd/hw_pagetable.c
>> b/drivers/iommu/iommufd/hw_pagetable.c
>> index 5be7a31cbd9c..26a8a818ffa3 100644
>> --- a/drivers/iommu/iommufd/hw_pagetable.c
>> +++ b/drivers/iommu/iommufd/hw_pagetable.c
>> @@ -83,6 +83,9 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx,
>> struct iommufd_ioas *ioas,
>>       lockdep_assert_held(&ioas->mutex);
>> +    if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ops->domain_alloc_user)
>> +        return ERR_PTR(-EOPNOTSUPP);
>> +
>>       hwpt = iommufd_object_alloc(ictx, hwpt, IOMMUFD_OBJ_HW_PAGETABLE);
>>       if (IS_ERR(hwpt))
>>           return hwpt;
>> @@ -154,7 +157,7 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd)
>>       struct iommufd_ioas *ioas;
>>       int rc;
>> -    if (cmd->flags || cmd->__reserved)
>> +    if (cmd->flags & ~IOMMU_HWPT_ALLOC_NEST_PARENT || cmd->__reserved)
>>           return -EOPNOTSUPP;
>
> Please add parentheses here. '&' already binds tighter than '||', so the
> compiler parses it the same way, but the explicit grouping is easier to read.
>
>     if ((cmd->flags & ~IOMMU_HWPT_ALLOC_NEST_PARENT) || cmd->__reserved)
>         return -EOPNOTSUPP;

ok.

--
Regards,
Yi Liu

2023-09-25 10:44:37

by Yi Liu

Subject: Re: [PATCH 6/6] iommu/vt-d: Add domain_alloc_user op

On 2023/9/21 09:31, Baolu Lu wrote:
> On 9/20/23 9:10 PM, Liu, Yi L wrote:
>>>>> +
>>>>> +    domain = iommu_domain_alloc(dev->bus);
>>>> No need to bounce between core and driver. Just,
>>>>
>>>>     intel_iommu_domain_alloc(IOMMU_DOMAIN_UNMANAGED);
>>>>
>>>> and fully initialize it before return.
>>> If you are going to do that then intel_iommu_domain_alloc() should
>>> fully initialize the domain, not here.
>> I've also considered what Baolu described, but it requires to do some
>> extra initialization which is duplicated with iommu_domain_alloc().
>> So I chose this simple way.
>
> Okay, got you.
>
> Once Jason's paging domain and Robin's bus->iommu_ops retirement series
> have landed, the VT-d driver will need some refactoring. Therefore, I'm
> fine with you using a simpler approach here. I'll refactor everything
> later.

yes.

--
Regards,
Yi Liu

2023-09-25 11:03:00

by Yi Liu

[permalink] [raw]
Subject: Re: [PATCH 6/6] iommu/vt-d: Add domain_alloc_user op

On 2023/9/20 21:18, Jason Gunthorpe wrote:
> On Wed, Sep 20, 2023 at 01:10:04PM +0000, Liu, Yi L wrote:
>>> From: Jason Gunthorpe <[email protected]>
>>> Sent: Wednesday, September 20, 2023 9:05 PM
>>>
>>> On Wed, Sep 20, 2023 at 01:28:41PM +0800, Baolu Lu wrote:
>>>>>
>>>>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>>>>> index 5db283c17e0d..491bcde1ff96 100644
>>>>> --- a/drivers/iommu/intel/iommu.c
>>>>> +++ b/drivers/iommu/intel/iommu.c
>>>>> @@ -4074,6 +4074,25 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
>>>>> return NULL;
>>>>> }
>>>>> +static struct iommu_domain *
>>>>> +intel_iommu_domain_alloc_user(struct device *dev, u32 flags)
>>>>> +{
>>>>> + struct iommu_domain *domain;
>>>>> + struct intel_iommu *iommu;
>>>>> +
>>>>> + iommu = device_to_iommu(dev, NULL, NULL);
>>>>> + if (!iommu)
>>>>> + return ERR_PTR(-ENODEV);
>>>>> +
>>>>> + if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ecap_nest(iommu->ecap))
>>>>> + return ERR_PTR(-EOPNOTSUPP);
>>>
>>> There is a check missing for supported flags
>>>
>>> if (flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT))
>>> return ERR_PTR(-EOPNOTSUPP);
>>
>> Well, iommufd already has such a check. But I also noticed your other
>> reply to Weijiang. So your preference is to do the flags validation
>> in the iommu driver instead of in iommufd, is that right?
>
> The core code should check that only kernel known bits are set
>
> The driver code should check that only driver supported bits are set.
>
> Today there is only one bit so the checks are the same code.
>
> Tomorrow when we add a new bit the checks will not be the same

Fair enough. I'll add the check in both the core and the iommu driver:

    if (flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT))
        return ERR_PTR(-EOPNOTSUPP);


--
Regards,
Yi Liu
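
The split between core and driver validation described in this message can be sketched in plain userspace C. This is only an illustration under stated assumptions: the bit values are stand-ins, `HWPT_ALLOC_FUTURE_FLAG` is a hypothetical later addition that does not exist in the series, and the `mock_*` functions are invented names that return errno-style integers instead of ERR_PTR().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-in bit values for illustration; not the real uapi definitions. */
#define HWPT_ALLOC_NEST_PARENT	(1u << 0)
#define HWPT_ALLOC_FUTURE_FLAG	(1u << 1)	/* hypothetical later addition */

/* Every bit the (mock) core knows about. */
#define CORE_KNOWN_FLAGS	(HWPT_ALLOC_NEST_PARENT | HWPT_ALLOC_FUTURE_FLAG)
/* The subset this (mock) driver implements. */
#define DRIVER_SUPPORTED_FLAGS	HWPT_ALLOC_NEST_PARENT

/* Driver-level check: reject bits this driver does not implement. */
int mock_driver_alloc(uint32_t flags)
{
	if (flags & ~DRIVER_SUPPORTED_FLAGS)
		return -EOPNOTSUPP;
	return 0;
}

/* Core-level check: reject bits the core does not define, then defer. */
int mock_core_alloc(uint32_t flags)
{
	if (flags & ~CORE_KNOWN_FLAGS)
		return -EOPNOTSUPP;
	return mock_driver_alloc(flags);
}

/* The two masks differ once a second bit exists. */
void run_checks(void)
{
	assert(mock_core_alloc(HWPT_ALLOC_NEST_PARENT) == 0);
	assert(mock_core_alloc(HWPT_ALLOC_FUTURE_FLAG) == -EOPNOTSUPP);
}
```

With a single defined bit the two masks would be identical, which matches the observation that today both checks are the same code. Once a second bit is defined, the core accepts it while a driver that has not implemented it still rejects it.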

2023-09-26 05:52:51

by Yi Liu

[permalink] [raw]
Subject: Re: [PATCH 1/6] iommu: Add new iommu op to create domains owned by userspace

On 2023/9/26 13:28, Tian, Kevin wrote:
>> From: Liu, Yi L <[email protected]>
>> Sent: Tuesday, September 19, 2023 5:25 PM
>>
>> @@ -235,6 +235,13 @@ struct iommu_iotlb_gather {
>> * use. The information type is one of enum iommu_hw_info_type
>> defined
>> * in include/uapi/linux/iommufd.h.
>> * @domain_alloc: allocate iommu domain
>
> Given that we now have two @alloc ops, it'd be clearer to also update the
> comment here so the explanation for @domain_alloc_user() is easier
> to understand, e.g.:
>
> @domain_alloc: allocate and return an iommu domain on success. Otherwise
> NULL is returned. The domain is not fully initialized until
> the caller iommu_domain_alloc() returns.
>
>> + * @domain_alloc_user: Allocate an iommu domain corresponding to the input
>> + * parameters like flags defined as enum iommufd_ioas_map_flags
>> + * in include/uapi/linux/iommufd.h. Different from the
>
> "to the input parameters as defined in include/uapi/linux/iommufd.h".
>
>> + * domain_alloc op, it requires iommu driver to fully
>> + * initialize a new domain including the generic iommu_domain
>
> "Unlike @domain_alloc, it is called only by iommufd and must fully initialize
> the new domain before return".
>
> *domain* here already refers to the generic iommu_domain struct.
>

The comments above are well received.

--
Regards,
Yi Liu

2023-09-26 06:46:02

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 6/6] iommu/vt-d: Add domain_alloc_user op

> From: Baolu Lu <[email protected]>
> Sent: Thursday, September 21, 2023 9:31 AM
>
> On 9/20/23 9:10 PM, Liu, Yi L wrote:
> >>>> +
> >>>> + domain = iommu_domain_alloc(dev->bus);
> >>> No need to bounce between core and driver. Just,
> >>>
> >>> intel_iommu_domain_alloc(IOMMU_DOMAIN_UNMANAGED);
> >>>
> >>> and fully initialize it before return.
> >> If you are going to do that then intel_iommu_domain_alloc() should
> >> fully initialize the domain, not here.
> > I've also considered what Baolu described, but it requires some extra
> > initialization that duplicates what iommu_domain_alloc() already does.
> > So I chose this simple way.
>
> Okay, got you.
>

Please add a comment for this temporary option.

2023-09-26 07:41:21

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 4/6] iommufd/hw_pagetable: Support allocating nested parent domain

> From: Liu, Yi L <[email protected]>
> Sent: Tuesday, September 19, 2023 5:25 PM
>
> @@ -83,6 +83,9 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx,
> struct iommufd_ioas *ioas,
>
> lockdep_assert_held(&ioas->mutex);
>
> + if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ops->domain_alloc_user)
> + return ERR_PTR(-EOPNOTSUPP);
> +

    if (flags && !ops->domain_alloc_user)
        return ERR_PTR(-EOPNOTSUPP);

As long as flags is non-zero, we need the new domain_alloc_user op.
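
The suggested condition fits into the allocation path roughly as sketched below. This is a userspace mock under stated assumptions, not the iommufd implementation: `struct mock_ops` is a hypothetical, trimmed-down stand-in for the real ops table, the callbacks return small integers instead of domain pointers, and the fallback order follows the cover letter's description of using domain_alloc_user when the driver supports it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down ops table; the real struct iommu_ops differs. */
struct mock_ops {
	int (*domain_alloc)(void);			/* legacy path, no flags */
	int (*domain_alloc_user)(uint32_t flags);	/* may be NULL */
};

/* Pick an allocation path the way the review comment suggests. */
int mock_hwpt_alloc(const struct mock_ops *ops, uint32_t flags)
{
	/* Any non-zero flags can only be honoured by the new op. */
	if (flags && !ops->domain_alloc_user)
		return -EOPNOTSUPP;

	/* Prefer the new op whenever the driver provides it. */
	if (ops->domain_alloc_user)
		return ops->domain_alloc_user(flags);

	return ops->domain_alloc();
}

/* Mock callbacks that only report which path was taken. */
static int legacy_alloc(void) { return 1; }
static int user_alloc(uint32_t flags) { (void)flags; return 2; }

void run_checks(void)
{
	struct mock_ops legacy_only = { legacy_alloc, NULL };
	struct mock_ops with_user = { legacy_alloc, user_alloc };

	assert(mock_hwpt_alloc(&legacy_only, 0) == 1);
	assert(mock_hwpt_alloc(&legacy_only, 1u) == -EOPNOTSUPP);
	assert(mock_hwpt_alloc(&with_user, 0) == 2);
	assert(mock_hwpt_alloc(&with_user, 1u) == 2);
}
```

Testing `flags` as a whole rather than one named bit means that any flag added later is rejected on drivers without the new op, with no change to this check.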

2023-09-26 07:55:15

by Yi Liu

[permalink] [raw]
Subject: Re: [PATCH 4/6] iommufd/hw_pagetable: Support allocating nested parent domain

On 2023/9/26 13:32, Tian, Kevin wrote:
>> From: Liu, Yi L <[email protected]>
>> Sent: Tuesday, September 19, 2023 5:25 PM
>>
>> @@ -83,6 +83,9 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx,
>> struct iommufd_ioas *ioas,
>>
>> lockdep_assert_held(&ioas->mutex);
>>
>> + if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ops->domain_alloc_user)
>> + return ERR_PTR(-EOPNOTSUPP);
>> +
>
> if (flags && !ops->domain_alloc_user)
> return ERR_PTR(-EOPNOTSUPP);
>
> as long as flags is non-zero we'll need the new alloc_user ops.

yes.

--
Regards,
Yi Liu

2023-09-26 09:54:46

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 1/6] iommu: Add new iommu op to create domains owned by userspace

> From: Liu, Yi L <[email protected]>
> Sent: Tuesday, September 19, 2023 5:25 PM
>
> @@ -235,6 +235,13 @@ struct iommu_iotlb_gather {
> * use. The information type is one of enum iommu_hw_info_type
> defined
> * in include/uapi/linux/iommufd.h.
> * @domain_alloc: allocate iommu domain

Given that we now have two @alloc ops, it'd be clearer to also update the
comment here so the explanation for @domain_alloc_user() is easier
to understand, e.g.:

@domain_alloc: allocate and return an iommu domain on success. Otherwise
NULL is returned. The domain is not fully initialized until
the caller iommu_domain_alloc() returns.

> + * @domain_alloc_user: Allocate an iommu domain corresponding to the input
> + * parameters like flags defined as enum iommufd_ioas_map_flags
> + * in include/uapi/linux/iommufd.h. Different from the

"to the input parameters as defined in include/uapi/linux/iommufd.h".

> + * domain_alloc op, it requires iommu driver to fully
> + * initialize a new domain including the generic iommu_domain

"Unlike @domain_alloc, it is called only by iommufd and must fully initialize
the new domain before return".

*domain* here already refers to the generic iommu_domain struct.

2023-09-26 11:37:02

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 3/6] iommufd/hw_pagetable: Accepts user flags for domain allocation

> From: Liu, Yi L <[email protected]>
> Sent: Tuesday, September 19, 2023 5:25 PM
>
> This extends iommufd_hw_pagetable_alloc() to accept user flags.
>
> Signed-off-by: Yi Liu <[email protected]>

Reviewed-by: Kevin Tian <[email protected]>