2024-01-22 07:45:38

by Lu Baolu

Subject: [PATCH v3 3/8] iommufd: Add fault and response message definitions

iommu_hwpt_pgfaults represent fault messages that the userspace can
retrieve. Multiple iommu_hwpt_pgfaults might be put in an iopf group,
with the IOMMU_PGFAULT_FLAGS_LAST_PAGE flag set only for the last
iommu_hwpt_pgfault.

An iommu_hwpt_page_response is a response message that the userspace
should send to the kernel after finishing handling a group of fault
messages. The @dev_id, @pasid, and @grpid fields in the message
identify an outstanding iopf group for a device. The @addr field,
which matches the fault address of the last fault in the group, will
be used by the kernel for a sanity check.

Signed-off-by: Lu Baolu <[email protected]>
---
include/uapi/linux/iommufd.h | 67 ++++++++++++++++++++++++++++++++++++
1 file changed, 67 insertions(+)

diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 1dfeaa2e649e..d59e839ae49e 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -692,4 +692,71 @@ struct iommu_hwpt_invalidate {
 	__u32 __reserved;
 };
 #define IOMMU_HWPT_INVALIDATE _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_INVALIDATE)
+
+/**
+ * enum iommu_hwpt_pgfault_flags - flags for struct iommu_hwpt_pgfault
+ * @IOMMU_PGFAULT_FLAGS_PASID_VALID: The pasid field of the fault data is
+ *                                   valid.
+ * @IOMMU_PGFAULT_FLAGS_LAST_PAGE: It's the last fault of a fault group.
+ */
+enum iommu_hwpt_pgfault_flags {
+	IOMMU_PGFAULT_FLAGS_PASID_VALID = (1 << 0),
+	IOMMU_PGFAULT_FLAGS_LAST_PAGE = (1 << 1),
+};
+
+/**
+ * enum iommu_hwpt_pgfault_perm - perm bits for struct iommu_hwpt_pgfault
+ * @IOMMU_PGFAULT_PERM_READ: request for read permission
+ * @IOMMU_PGFAULT_PERM_WRITE: request for write permission
+ * @IOMMU_PGFAULT_PERM_EXEC: request for execute permission
+ * @IOMMU_PGFAULT_PERM_PRIV: request for privileged permission
+ */
+enum iommu_hwpt_pgfault_perm {
+	IOMMU_PGFAULT_PERM_READ = (1 << 0),
+	IOMMU_PGFAULT_PERM_WRITE = (1 << 1),
+	IOMMU_PGFAULT_PERM_EXEC = (1 << 2),
+	IOMMU_PGFAULT_PERM_PRIV = (1 << 3),
+};
+
+/**
+ * struct iommu_hwpt_pgfault - iommu page fault data
+ * @size: sizeof(struct iommu_hwpt_pgfault)
+ * @flags: Combination of enum iommu_hwpt_pgfault_flags
+ * @dev_id: id of the originated device
+ * @pasid: Process Address Space ID
+ * @grpid: Page Request Group Index
+ * @perm: Combination of enum iommu_hwpt_pgfault_perm
+ * @addr: page address
+ */
+struct iommu_hwpt_pgfault {
+	__u32 size;
+	__u32 flags;
+	__u32 dev_id;
+	__u32 pasid;
+	__u32 grpid;
+	__u32 perm;
+	__u64 addr;
+};
+
+/**
+ * struct iommu_hwpt_page_response - IOMMU page fault response
+ * @size: sizeof(struct iommu_hwpt_page_response)
+ * @flags: Must be set to 0
+ * @dev_id: device ID of target device for the response
+ * @pasid: Process Address Space ID
+ * @grpid: Page Request Group Index
+ * @code: response code. The supported codes include:
+ *        0: Successful; 1: Response Failure; 2: Invalid Request.
+ * @addr: The fault address. Must match the addr field of the
+ *        last iommu_hwpt_pgfault of a reported iopf group.
+ */
+struct iommu_hwpt_page_response {
+	__u32 size;
+	__u32 flags;
+	__u32 dev_id;
+	__u32 pasid;
+	__u32 grpid;
+	__u32 code;
+	__u64 addr;
+};
 #endif
--
2.34.1
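
The group and response semantics described in the commit message can be illustrated with a small userspace sketch. It is only a sketch: the structures are local copies of the definitions in this patch (with <stdint.h> types instead of __u32/__u64), and build_group_response() is a hypothetical helper, not something the patch provides. How the messages are actually read from and written to the kernel is outside this patch and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the definitions proposed in this patch. */
enum iommu_hwpt_pgfault_flags {
	IOMMU_PGFAULT_FLAGS_PASID_VALID = (1 << 0),
	IOMMU_PGFAULT_FLAGS_LAST_PAGE = (1 << 1),
};

struct iommu_hwpt_pgfault {
	uint32_t size;
	uint32_t flags;
	uint32_t dev_id;
	uint32_t pasid;
	uint32_t grpid;
	uint32_t perm;
	uint64_t addr;
};

struct iommu_hwpt_page_response {
	uint32_t size;
	uint32_t flags;
	uint32_t dev_id;
	uint32_t pasid;
	uint32_t grpid;
	uint32_t code;
	uint64_t addr;
};

static_assert(sizeof(struct iommu_hwpt_pgfault) == 32, "unexpected padding");
static_assert(sizeof(struct iommu_hwpt_page_response) == 32, "unexpected padding");

/*
 * Hypothetical helper: walk faults[0..n) until the fault that carries
 * IOMMU_PGFAULT_FLAGS_LAST_PAGE and fill *resp for that group.
 * Returns the number of faults consumed, or 0 if the group is incomplete.
 */
static size_t build_group_response(const struct iommu_hwpt_pgfault *faults,
				   size_t n, uint32_t code,
				   struct iommu_hwpt_page_response *resp)
{
	for (size_t i = 0; i < n; i++) {
		if (!(faults[i].flags & IOMMU_PGFAULT_FLAGS_LAST_PAGE))
			continue;

		memset(resp, 0, sizeof(*resp));
		resp->size = sizeof(*resp);
		resp->flags = 0;			/* must be 0 */
		resp->dev_id = faults[i].dev_id;
		resp->pasid = faults[i].pasid;
		resp->grpid = faults[i].grpid;
		resp->code = code;
		resp->addr = faults[i].addr;	/* addr of the last fault */
		return i + 1;
	}
	return 0;
}
```

A caller would pass the faults it has buffered so far and only send the response once the helper reports a complete group.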



2024-03-08 17:50:20

by Jason Gunthorpe

Subject: Re: [PATCH v3 3/8] iommufd: Add fault and response message definitions

On Mon, Jan 22, 2024 at 03:38:58PM +0800, Lu Baolu wrote:

> +/**
> + * enum iommu_hwpt_pgfault_flags - flags for struct iommu_hwpt_pgfault
> + * @IOMMU_PGFAULT_FLAGS_PASID_VALID: The pasid field of the fault data is
> + * valid.
> + * @IOMMU_PGFAULT_FLAGS_LAST_PAGE: It's the last fault of a fault group.
> + */
> +enum iommu_hwpt_pgfault_flags {
> + IOMMU_PGFAULT_FLAGS_PASID_VALID = (1 << 0),
> + IOMMU_PGFAULT_FLAGS_LAST_PAGE = (1 << 1),
> +};
> +
> +/**
> + * enum iommu_hwpt_pgfault_perm - perm bits for struct iommu_hwpt_pgfault
> + * @IOMMU_PGFAULT_PERM_READ: request for read permission
> + * @IOMMU_PGFAULT_PERM_WRITE: request for write permission
> + * @IOMMU_PGFAULT_PERM_EXEC: request for execute permission
> + * @IOMMU_PGFAULT_PERM_PRIV: request for privileged permission

You are going to have to elaborate what PRIV is for.. We don't have
any concept of this in the UAPI for iommufd so what is a userspace
supposed to do if it hits this? EXEC is similar, we can't actually
enable exec permissions from userspace IIRC..

> +enum iommu_hwpt_pgfault_perm {
> + IOMMU_PGFAULT_PERM_READ = (1 << 0),
> + IOMMU_PGFAULT_PERM_WRITE = (1 << 1),
> + IOMMU_PGFAULT_PERM_EXEC = (1 << 2),
> + IOMMU_PGFAULT_PERM_PRIV = (1 << 3),
> +};
> +
> +/**
> + * struct iommu_hwpt_pgfault - iommu page fault data
> + * @size: sizeof(struct iommu_hwpt_pgfault)
> + * @flags: Combination of enum iommu_hwpt_pgfault_flags
> + * @dev_id: id of the originated device
> + * @pasid: Process Address Space ID
> + * @grpid: Page Request Group Index
> + * @perm: Combination of enum iommu_hwpt_pgfault_perm
> + * @addr: page address
> + */
> +struct iommu_hwpt_pgfault {
> + __u32 size;
> + __u32 flags;
> + __u32 dev_id;
> + __u32 pasid;
> + __u32 grpid;
> + __u32 perm;
> + __u64 addr;
> +};

Do we need an addr + size here? I've seen a few things where I wonder
if that might become an enhancement someday.

> +/**
> + * struct iommu_hwpt_page_response - IOMMU page fault response
> + * @size: sizeof(struct iommu_hwpt_page_response)
> + * @flags: Must be set to 0
> + * @dev_id: device ID of target device for the response
> + * @pasid: Process Address Space ID
> + * @grpid: Page Request Group Index
> + * @code: response code. The supported codes include:
> + * 0: Successful; 1: Response Failure; 2: Invalid Request.

This should be an enum

> + * @addr: The fault address. Must match the addr field of the
> + * last iommu_hwpt_pgfault of a reported iopf group.
> + */
> +struct iommu_hwpt_page_response {
> + __u32 size;
> + __u32 flags;
> + __u32 dev_id;
> + __u32 pasid;
> + __u32 grpid;
> + __u32 code;
> + __u64 addr;
> +};

Do we want some kind of opaque ID value from the kernel here to match
request with response exactly? Or is the plan to search on the addr?

Jason

2024-03-14 16:41:50

by Lu Baolu

Subject: Re: [PATCH v3 3/8] iommufd: Add fault and response message definitions

On 2024/3/9 1:50, Jason Gunthorpe wrote:
> On Mon, Jan 22, 2024 at 03:38:58PM +0800, Lu Baolu wrote:
>
>> +/**
>> + * enum iommu_hwpt_pgfault_flags - flags for struct iommu_hwpt_pgfault
>> + * @IOMMU_PGFAULT_FLAGS_PASID_VALID: The pasid field of the fault data is
>> + * valid.
>> + * @IOMMU_PGFAULT_FLAGS_LAST_PAGE: It's the last fault of a fault group.
>> + */
>> +enum iommu_hwpt_pgfault_flags {
>> + IOMMU_PGFAULT_FLAGS_PASID_VALID = (1 << 0),
>> + IOMMU_PGFAULT_FLAGS_LAST_PAGE = (1 << 1),
>> +};
>> +
>> +/**
>> + * enum iommu_hwpt_pgfault_perm - perm bits for struct iommu_hwpt_pgfault
>> + * @IOMMU_PGFAULT_PERM_READ: request for read permission
>> + * @IOMMU_PGFAULT_PERM_WRITE: request for write permission
>> + * @IOMMU_PGFAULT_PERM_EXEC: request for execute permission
>> + * @IOMMU_PGFAULT_PERM_PRIV: request for privileged permission
>
> You are going to have to elaborate what PRIV is for.. We don't have
> any concept of this in the UAPI for iommufd so what is a userspace
> supposed to do if it hits this? EXEC is similar, we can't actually
> enable exec permissions from userspace IIRC..

The PCIe spec, section "10.4.1 Page Request Message" and "6.20.2 PASID
Information Layout":

The PCI PASID TLP Prefix defines "Execute Requested" and "Privileged
Mode Requested" bits.

PERM_EXEC indicates a page request with a PASID that has the "Execute
Requested" bit set. Similarly, PERM_PRIV indicates a page request with a
PASID that has "Privileged Mode Requested" bit set.
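
As a rough illustration of how a userspace handler could inspect these bits, here is a minimal sketch. The enum is a local copy of the one in this patch; pgfault_perm_str() is a hypothetical helper used only to show the bit tests, and it says nothing about what policy userspace should apply to EXEC or PRIV requests.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the perm bits proposed in this patch. */
enum iommu_hwpt_pgfault_perm {
	IOMMU_PGFAULT_PERM_READ = (1 << 0),
	IOMMU_PGFAULT_PERM_WRITE = (1 << 1),
	IOMMU_PGFAULT_PERM_EXEC = (1 << 2),
	IOMMU_PGFAULT_PERM_PRIV = (1 << 3),
};

static_assert((IOMMU_PGFAULT_PERM_READ | IOMMU_PGFAULT_PERM_WRITE |
	       IOMMU_PGFAULT_PERM_EXEC | IOMMU_PGFAULT_PERM_PRIV) == 0xf,
	      "perm bits occupy the low nibble");

/*
 * Hypothetical helper: render @perm as a 4-character string such as
 * "r-x-" (read, write, execute requested, privileged mode requested).
 * @buf must have room for 5 bytes.
 */
static void pgfault_perm_str(uint32_t perm, char buf[5])
{
	buf[0] = (perm & IOMMU_PGFAULT_PERM_READ) ? 'r' : '-';
	buf[1] = (perm & IOMMU_PGFAULT_PERM_WRITE) ? 'w' : '-';
	buf[2] = (perm & IOMMU_PGFAULT_PERM_EXEC) ? 'x' : '-';
	buf[3] = (perm & IOMMU_PGFAULT_PERM_PRIV) ? 'p' : '-';
	buf[4] = '\0';
}
```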

>
>> +enum iommu_hwpt_pgfault_perm {
>> + IOMMU_PGFAULT_PERM_READ = (1 << 0),
>> + IOMMU_PGFAULT_PERM_WRITE = (1 << 1),
>> + IOMMU_PGFAULT_PERM_EXEC = (1 << 2),
>> + IOMMU_PGFAULT_PERM_PRIV = (1 << 3),
>> +};
>> +
>> +/**
>> + * struct iommu_hwpt_pgfault - iommu page fault data
>> + * @size: sizeof(struct iommu_hwpt_pgfault)
>> + * @flags: Combination of enum iommu_hwpt_pgfault_flags
>> + * @dev_id: id of the originated device
>> + * @pasid: Process Address Space ID
>> + * @grpid: Page Request Group Index
>> + * @perm: Combination of enum iommu_hwpt_pgfault_perm
>> + * @addr: page address
>> + */
>> +struct iommu_hwpt_pgfault {
>> + __u32 size;
>> + __u32 flags;
>> + __u32 dev_id;
>> + __u32 pasid;
>> + __u32 grpid;
>> + __u32 perm;
>> + __u64 addr;
>> +};
>
> Do we need an addr + size here? I've seen a few things where I wonder
> if that might become an enhancement someday.

I am not sure. The page size is not part of ATS/PRI. Can you please
elaborate a bit about how the size could be used? Perhaps I
misunderstood here?

>
>> +/**
>> + * struct iommu_hwpt_page_response - IOMMU page fault response
>> + * @size: sizeof(struct iommu_hwpt_page_response)
>> + * @flags: Must be set to 0
>> + * @dev_id: device ID of target device for the response
>> + * @pasid: Process Address Space ID
>> + * @grpid: Page Request Group Index
>> + * @code: response code. The supported codes include:
>> + * 0: Successful; 1: Response Failure; 2: Invalid Request.
>
> This should be an enum

Sure.
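
For reference, a minimal sketch of what such an enum could look like. The numeric values are the ones listed in the @code comment of this patch; the enum name, the enumerator names, and the validity helper are hypothetical placeholders rather than the naming a later revision would necessarily use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; values follow the @code kernel-doc in this patch. */
enum iommu_hwpt_pgfault_resp_code {
	IOMMU_PGFAULT_RESP_SUCCESS = 0,	/* Successful */
	IOMMU_PGFAULT_RESP_FAILURE = 1,	/* Response Failure */
	IOMMU_PGFAULT_RESP_INVALID = 2,	/* Invalid Request */
};

static_assert(IOMMU_PGFAULT_RESP_INVALID == 2, "values must match the v3 comment");

/* Hypothetical check for a response code supplied by userspace. */
static bool pgfault_resp_code_valid(uint32_t code)
{
	return code <= IOMMU_PGFAULT_RESP_INVALID;
}
```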

>
>> + * @addr: The fault address. Must match the addr field of the
>> + * last iommu_hwpt_pgfault of a reported iopf group.
>> + */
>> +struct iommu_hwpt_page_response {
>> + __u32 size;
>> + __u32 flags;
>> + __u32 dev_id;
>> + __u32 pasid;
>> + __u32 grpid;
>> + __u32 code;
>> + __u64 addr;
>> +};
>
> Do we want some kind of opaque ID value from the kernel here to match
> request with response exactly? Or is the plan to search on the addr?

I am using "addr" as the opaque data to search for the request in this
series. Is that enough?

Best regards,
baolu

2024-03-22 17:06:53

by Jason Gunthorpe

Subject: Re: [PATCH v3 3/8] iommufd: Add fault and response message definitions

On Thu, Mar 14, 2024 at 09:41:45PM +0800, Baolu Lu wrote:
> On 2024/3/9 1:50, Jason Gunthorpe wrote:
> > On Mon, Jan 22, 2024 at 03:38:58PM +0800, Lu Baolu wrote:
> >
> > > +/**
> > > + * enum iommu_hwpt_pgfault_flags - flags for struct iommu_hwpt_pgfault
> > > + * @IOMMU_PGFAULT_FLAGS_PASID_VALID: The pasid field of the fault data is
> > > + * valid.
> > > + * @IOMMU_PGFAULT_FLAGS_LAST_PAGE: It's the last fault of a fault group.
> > > + */
> > > +enum iommu_hwpt_pgfault_flags {
> > > + IOMMU_PGFAULT_FLAGS_PASID_VALID = (1 << 0),
> > > + IOMMU_PGFAULT_FLAGS_LAST_PAGE = (1 << 1),
> > > +};
> > > +
> > > +/**
> > > + * enum iommu_hwpt_pgfault_perm - perm bits for struct iommu_hwpt_pgfault
> > > + * @IOMMU_PGFAULT_PERM_READ: request for read permission
> > > + * @IOMMU_PGFAULT_PERM_WRITE: request for write permission
> > > + * @IOMMU_PGFAULT_PERM_EXEC: request for execute permission
> > > + * @IOMMU_PGFAULT_PERM_PRIV: request for privileged permission
> >
> > You are going to have to elaborate what PRIV is for.. We don't have
> > any concept of this in the UAPI for iommufd so what is a userspace
> > supposed to do if it hits this? EXEC is similar, we can't actually
> > enable exec permissions from userspace IIRC..
>
> The PCIe spec, section "10.4.1 Page Request Message" and "6.20.2 PASID
> Information Layout":
>
> The PCI PASID TLP Prefix defines "Execute Requested" and "Privileged
> Mode Requested" bits.
>
> PERM_EXEC indicates a page request with a PASID that has the "Execute
> Requested" bit set. Similarly, PERM_PRIV indicates a page request with a
> PASID that has "Privileged Mode Requested" bit set.

Oh, I see! OK. Maybe just add a note that it follows PCIe 10.4.1.

> > > +struct iommu_hwpt_pgfault {
> > > + __u32 size;
> > > + __u32 flags;
> > > + __u32 dev_id;
> > > + __u32 pasid;
> > > + __u32 grpid;
> > > + __u32 perm;
> > > + __u64 addr;
> > > +};
> >
> > Do we need an addr + size here? I've seen a few things where I wonder
> > if that might become an enhancement someday.
>
> I am not sure. The page size is not part of ATS/PRI. Can you please
> elaborate a bit about how the size could be used? Perhaps I
> misunderstood here?

size would be advice on how much data the requestor is expecting to
fetch. E.g., if the PRI initiator knows it is going to do a 10MB transfer,
it could fill in 10MB and the OS could pre-fault in 10MB of IOVA.

It is not in the spec, it may never be in the spec, but it seems like
it would be good to consider it, at least make sure we have
compatibility to add it later.
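
One way the existing @size field could leave room for this is sketched below. It rests on assumptions that are not part of this series: struct pgfault_with_length and its trailing length member are hypothetical, as is pgfault_length_hint(). The base struct is a local copy of the v3 layout. The idea is only that a consumer can tell from @size whether a longer message carrying the hint was sent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the v3 fault message layout. */
struct iommu_hwpt_pgfault {
	uint32_t size;
	uint32_t flags;
	uint32_t dev_id;
	uint32_t pasid;
	uint32_t grpid;
	uint32_t perm;
	uint64_t addr;
};

/* Hypothetical extended layout: same prefix plus a trailing length hint. */
struct pgfault_with_length {
	struct iommu_hwpt_pgfault base;
	uint64_t length;
};

static_assert(offsetof(struct pgfault_with_length, length) ==
	      sizeof(struct iommu_hwpt_pgfault),
	      "length must directly follow the v3 layout");

/*
 * Hypothetical reader: return the length hint if the message is large
 * enough to carry one, otherwise @fallback (e.g. a single page).
 */
static uint64_t pgfault_length_hint(const void *msg, uint64_t fallback)
{
	uint32_t size;
	uint64_t length;

	memcpy(&size, msg, sizeof(size));	/* @size is the first field */
	if (size < sizeof(struct pgfault_with_length))
		return fallback;

	memcpy(&length,
	       (const char *)msg + offsetof(struct pgfault_with_length, length),
	       sizeof(length));
	return length;
}
```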

> > > + * @addr: The fault address. Must match the addr field of the
> > > + * last iommu_hwpt_pgfault of a reported iopf group.
> > > + */
> > > +struct iommu_hwpt_page_response {
> > > + __u32 size;
> > > + __u32 flags;
> > > + __u32 dev_id;
> > > + __u32 pasid;
> > > + __u32 grpid;
> > > + __u32 code;
> > > + __u64 addr;
> > > +};
> >
> > Do we want some kind of opaque ID value from the kernel here to match
> > request with response exactly? Or is the plan to search on the addr?
>
> I am using "addr" as the opaque data to search for the request in this
> series. Is that enough?

I'm not sure, the other discussion about grpid seems to be the main
question so let's see there.

Jason

2024-03-25 14:15:30

by Lu Baolu

Subject: Re: [PATCH v3 3/8] iommufd: Add fault and response message definitions

On 3/23/24 1:04 AM, Jason Gunthorpe wrote:
>>>> +struct iommu_hwpt_pgfault {
>>>> + __u32 size;
>>>> + __u32 flags;
>>>> + __u32 dev_id;
>>>> + __u32 pasid;
>>>> + __u32 grpid;
>>>> + __u32 perm;
>>>> + __u64 addr;
>>>> +};
>>> Do we need an addr + size here? I've seen a few things where I wonder
>>> if that might become an enhancement someday.
>> I am not sure. The page size is not part of ATS/PRI. Can you please
>> elaborate a bit about how the size could be used? Perhaps I
>> misunderstood here?
> size would be advice on how much data the requestor is expecting to
> fetch. E.g., if the PRI initiator knows it is going to do a 10MB transfer,
> it could fill in 10MB and the OS could pre-fault in 10MB of IOVA.
>
> It is not in the spec, it may never be in the spec, but it seems like
> it would be good to consider it, at least make sure we have
> compatibility to add it later.

Thanks for the explanation. It sounds reasonable. I will take it and add
some comments to explain the motivation as you described above.

Best regards,
baolu