v7:
- Add sign-off and review tag from Boris
Boris:
- Fixed up ENQCMDS patch
Vinod:
- Fix line formatting
- Add comment for completion address compare
v6:
Boris:
- Fixup MOVDIR64B inline asm input/output constraints
v5:
Boris:
- Fixup commit headers
- Fixup var names for movdir64b()
- Move enqcmds() to special_insns.h
- Fix up comments for enqcmds()
- Change enqcmds() to reflect the instruction's return: 0 on success, -EAGAIN on failure.
DavidL:
- Fixup enqcmds() gas constraints
v4:
- Rebased against latest dmaengine/next tree
- Split out the ENQCMD and PASID dependency.
v3:
- Rebased against latest dmaengine/next tree.
- Updated API doc with new kernel version and dates.
- Changed to allow the driver to load without ENQCMD support.
- Broke out some patches that can be sent ahead of this series for inclusion.
v2:
- Dropped device feature enabling (GregKH)
- Dropped PCI device feature enabling (Bjorn)
- https://members.pcisig.com/wg/PCI-SIG/document/14237
- After some internal discussion, we have decided to hold off on enabling DMWR for the
  following reasons:
  1. Most first gen hw will not have the feature bits.
  2. First gen hw that supports the feature are all Root Complex integrated endpoints (RCiEP).
  3. PCI devices that are not RCiEPs with this capability won't surface for a few years,
     so we can wait until we can test the full code.
- Dropped special ioremap (hch)
- Added proper support for WQ flush (tony, dan)
- Changed descriptor submission to use sbitmap_queue for blocking. (dan)
Driver stage 1 postings for context: [1]
The patch series has a compilation and functional dependency on Fenghua's "Tag application
address space for devices" patch series for ENQCMD CPU feature enumeration and PASID MSR
support. [2]
== Background ==
A typical DMA device requires the driver to translate application buffers to hardware addresses,
and a kernel-user transition to notify the hardware of new work. Shared Virtual Addressing (SVA)
allows the processor and device to use the same virtual addresses without requiring software to
translate between the address spaces. ENQCMD is a new instruction on Intel platforms that allows
user applications to directly notify hardware of new work, much like how doorbells are used in
some hardware, but it carries a payload along with it. ENQCMDS is the supervisor version (ring0)
of ENQCMD.
== ENQCMDS ==
Introduce enqcmds(), a helper function that copies an input payload to a 64-byte-aligned
destination and confirms whether the payload was accepted by the device or not.
enqcmds() wraps the new ENQCMDS CPU instruction. ENQCMDS is a ring 0 CPU instruction that
performs similarly to ENQCMD. Descriptor submission must use ENQCMD(S) for shared
workqueues (swq) on an Intel DSA device.
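As a rough sketch of how a kernel caller might use this helper (the function and the
"portal"/"desc" names below are hypothetical, not part of this series), submission simply
retries while the device pushes back:

	/*
	 * Hypothetical sketch only: "portal" is the swq's 64-byte MMIO
	 * submission window and "desc" a 64-byte-aligned work descriptor.
	 * A real driver would bound the retries instead of spinning forever.
	 */
	static int submit_swq_desc(void __iomem *portal, const void *desc)
	{
		int rc;

		do {
			/* 0 on success, -EAGAIN if the device did not accept it */
			rc = enqcmds(portal, desc);
		} while (rc == -EAGAIN);

		return rc;
	}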
== Shared WQ support ==
Introduce shared workqueue (swq) support for the idxd driver. The current idxd driver contains
dedicated workqueue (dwq) support only. A dwq accepts descriptors from a MOVDIR64B instruction.
MOVDIR64B generates a posted write on the PCIe bus; it does not wait for any response from the
device, so if the wq is full, submitted descriptors are silently dropped. A swq is instead fed
with ENQCMDS in ring 0, a non-posted write: the CPU's zero flag is set to 1 if the device rejects
the descriptor or the wq is full. A swq can therefore be shared between multiple users
(kernel or userspace), since no user has to track the wq-full condition for submission.
A swq requires PASID and can only run with SVA support.
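For contrast, a dwq submission is fire-and-forget. A minimal sketch (again with hypothetical
names) of the dedicated path:

	/*
	 * Hypothetical dwq submission sketch. MOVDIR64B is posted: no
	 * status comes back from the device, so the driver must account
	 * for free wq entries itself or risk silently dropped descriptors.
	 */
	static void submit_dwq_desc(void *portal, const void *desc)
	{
		movdir64b(portal, desc);	/* fire and forget */
	}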
== IDXD SVA support ==
Add utilization of PASID to support Shared Virtual Addressing (SVA). With PASID support,
descriptors can be programmed with host virtual addresses (HVA) rather than IOVAs.
The hardware works with the IOMMU to fulfill page requests. With SVA support,
a user app using the char device interface can now submit descriptors without having to pin,
in its own address space, the virtual memory ranges it wants to DMA to or from.
The series does not add SVA support for the dmaengine subsystem. That support is coming at a
later time.
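To illustrate what SVA buys (the struct and field names below are invented for illustration
and are not from the DSA spec), a descriptor can be programmed with plain virtual addresses,
leaving address translation and page faults to the IOMMU instead of pinning and mapping the
buffers up front:

	/* Hypothetical descriptor layout, for illustration only */
	struct example_desc {
		u64 src_addr;
		u64 dst_addr;
		u32 xfer_size;
	};

	static void program_desc_sva(struct example_desc *desc,
				     void *src, void *dst, u32 len)
	{
		desc->src_addr = (u64)src;	/* virtual address, not an IOVA */
		desc->dst_addr = (u64)dst;
		desc->xfer_size = len;
	}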
[1]: https://lore.kernel.org/lkml/157965011794.73301.15960052071729101309.stgit@djiang5-desk3.ch.intel.com/
[2]: https://lore.kernel.org/lkml/[email protected]/
[3]: https://software.intel.com/en-us/articles/intel-sdm
[4]: https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
[5]: https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification
[6]: https://01.org/blogs/2019/introducing-intel-data-streaming-accelerator
[7]: https://intel.github.io/idxd/
[8]: https://github.com/intel/idxd-driver idxd-stage2
Dave Jiang (5):
x86/asm: Carve out a generic movdir64b() helper for general usage
x86/asm: Add an enqcmds() wrapper for the ENQCMDS instruction
dmaengine: idxd: Add shared workqueue support
dmaengine: idxd: Clean up descriptors with fault error
dmaengine: idxd: Add ABI documentation for shared wq
.../ABI/stable/sysfs-driver-dma-idxd | 14 ++
arch/x86/include/asm/io.h | 17 +-
arch/x86/include/asm/special_insns.h | 64 ++++++++
drivers/dma/Kconfig | 10 ++
drivers/dma/idxd/cdev.c | 49 +++++-
drivers/dma/idxd/device.c | 91 ++++++++++-
drivers/dma/idxd/dma.c | 9 --
drivers/dma/idxd/idxd.h | 33 +++-
drivers/dma/idxd/init.c | 92 ++++++++---
drivers/dma/idxd/irq.c | 146 ++++++++++++++++--
drivers/dma/idxd/registers.h | 14 ++
drivers/dma/idxd/submit.c | 35 ++++-
drivers/dma/idxd/sysfs.c | 127 +++++++++++++++
13 files changed, 631 insertions(+), 70 deletions(-)
--
2.26.2
Currently, the MOVDIR64B instruction is used to atomically submit
64-byte work descriptors to devices. Although it can encounter errors
like device queue full, command not accepted, device not ready, etc. when
writing to device MMIO, MOVDIR64B cannot report back on errors from
the device itself. This means that MOVDIR64B users need to separately
interact with a device to see if a descriptor was successfully queued,
which slows down device interactions.
ENQCMD and ENQCMDS also atomically submit 64-byte work descriptors
to devices. But they *can* report back errors directly from the
device, such as the device being busy, not enabled, or not supporting
the command. This immediate feedback from the submission
instruction itself reduces the number of interactions with the device
and can greatly increase efficiency.
ENQCMD can be used at any privilege level, but can effectively only
submit work on behalf of the current process. ENQCMDS is a ring0-only
instruction and can explicitly specify a process context instead of
being tied to the current process or needing to reprogram the IA32_PASID
MSR.
Use ENQCMDS for work submission within the kernel because a Process
Address Space ID (PASID) is set up to translate the kernel virtual address
space. This PASID is provided to ENQCMDS from the descriptor structure
submitted to the device and not retrieved from the IA32_PASID MSR, which is
set up for the current user address space.
See Intel Software Developer’s Manual for more information on the
instructions.
[ bp:
- Make operand constraints like movdir64b() because both insns are
basically doing the same thing, more or less.
- Fixup comments and cleanup. ]
Signed-off-by: Dave Jiang <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/include/asm/special_insns.h | 42 ++++++++++++++++++++++++++++
1 file changed, 42 insertions(+)
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 2258c7d6e281..83f7c1a391e0 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -256,6 +256,48 @@ static inline void movdir64b(void *dst, const void *src)
: "m" (*__src), "a" (__dst), "d" (__src));
}
+/**
+ * enqcmds - Enqueue a command in supervisor (CPL0) mode
+ * @dst: destination, in MMIO space (must be 512-bit aligned)
+ * @src: 512 bits memory operand
+ *
+ * The ENQCMDS instruction allows software to write a 512-bit command to
+ * a 512-bit-aligned special MMIO region that supports the instruction.
+ * A return status is loaded into the ZF flag in the RFLAGS register.
+ * ZF = 0 equates to success, and ZF = 1 indicates retry or error.
+ *
+ * This function issues the ENQCMDS instruction to submit data from
+ * kernel space to MMIO space, in a unit of 512 bits. Order of data access
+ * is not guaranteed, nor is a memory barrier performed afterwards. It
+ * returns 0 on success and -EAGAIN on failure.
+ *
+ * Warning: Do not use this helper unless your driver has checked that the
+ * ENQCMDS instruction is supported on the platform and the device accepts
+ * ENQCMDS.
+ */
+static inline int enqcmds(void __iomem *dst, const void *src)
+{
+ const struct { char _[64]; } *__src = src;
+ struct { char _[64]; } *__dst = dst;
+ int zf;
+
+ /*
+ * ENQCMDS %(rdx), rax
+ *
+ * See movdir64b()'s comment on operand specification.
+ */
+ asm volatile(".byte 0xf3, 0x0f, 0x38, 0xf8, 0x02, 0x66, 0x90"
+ CC_SET(z)
+ : CC_OUT(z) (zf), "+m" (*__dst)
+ : "m" (*__src), "a" (__dst), "d" (__src));
+
+ /* Submission failure is indicated via EFLAGS.ZF=1 */
+ if (zf)
+ return -EAGAIN;
+
+ return 0;
+}
+
#endif /* __KERNEL__ */
#endif /* _ASM_X86_SPECIAL_INSNS_H */
--
2.26.2
Add code to "complete" a descriptor when the descriptor or its completion
address hits a fault error while SVA mode is in use. This error can be
triggered by bad programming in the user application. A lock is introduced in order
to protect the descriptor completion lists since the fault handler will run
from the system work queue after being scheduled in the interrupt handler.
Signed-off-by: Dave Jiang <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
---
drivers/dma/idxd/idxd.h | 5 ++
drivers/dma/idxd/init.c | 1 +
drivers/dma/idxd/irq.c | 146 ++++++++++++++++++++++++++++++++++++----
3 files changed, 140 insertions(+), 12 deletions(-)
diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
index 43a216c42d25..b64b6266ca97 100644
--- a/drivers/dma/idxd/idxd.h
+++ b/drivers/dma/idxd/idxd.h
@@ -34,6 +34,11 @@ struct idxd_irq_entry {
int id;
struct llist_head pending_llist;
struct list_head work_list;
+ /*
+ * Lock to protect the descriptor lists, which are accessed by both
+ * the irq thread completing descriptors and the fault-processing work.
+ */
+ spinlock_t list_lock;
};
struct idxd_group {
diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index 626401a71fdd..1bb7637b02eb 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -97,6 +97,7 @@ static int idxd_setup_interrupts(struct idxd_device *idxd)
for (i = 0; i < msixcnt; i++) {
idxd->irq_entries[i].id = i;
idxd->irq_entries[i].idxd = idxd;
+ spin_lock_init(&idxd->irq_entries[i].list_lock);
}
msix = &idxd->msix_entries[0];
diff --git a/drivers/dma/idxd/irq.c b/drivers/dma/idxd/irq.c
index 17a65a13fb64..593a2f6ed16c 100644
--- a/drivers/dma/idxd/irq.c
+++ b/drivers/dma/idxd/irq.c
@@ -11,6 +11,24 @@
#include "idxd.h"
#include "registers.h"
+enum irq_work_type {
+ IRQ_WORK_NORMAL = 0,
+ IRQ_WORK_PROCESS_FAULT,
+};
+
+struct idxd_fault {
+ struct work_struct work;
+ u64 addr;
+ struct idxd_device *idxd;
+};
+
+static int irq_process_work_list(struct idxd_irq_entry *irq_entry,
+ enum irq_work_type wtype,
+ int *processed, u64 data);
+static int irq_process_pending_llist(struct idxd_irq_entry *irq_entry,
+ enum irq_work_type wtype,
+ int *processed, u64 data);
+
static void idxd_device_reinit(struct work_struct *work)
{
struct idxd_device *idxd = container_of(work, struct idxd_device, work);
@@ -44,6 +62,46 @@ static void idxd_device_reinit(struct work_struct *work)
idxd_device_wqs_clear_state(idxd);
}
+static void idxd_device_fault_work(struct work_struct *work)
+{
+ struct idxd_fault *fault = container_of(work, struct idxd_fault, work);
+ struct idxd_irq_entry *ie;
+ int i;
+ int processed;
+ int irqcnt = fault->idxd->num_wq_irqs + 1;
+
+ for (i = 1; i < irqcnt; i++) {
+ ie = &fault->idxd->irq_entries[i];
+ irq_process_work_list(ie, IRQ_WORK_PROCESS_FAULT,
+ &processed, fault->addr);
+ if (processed)
+ break;
+
+ irq_process_pending_llist(ie, IRQ_WORK_PROCESS_FAULT,
+ &processed, fault->addr);
+ if (processed)
+ break;
+ }
+
+ kfree(fault);
+}
+
+static int idxd_device_schedule_fault_process(struct idxd_device *idxd,
+ u64 fault_addr)
+{
+ struct idxd_fault *fault;
+
+ fault = kmalloc(sizeof(*fault), GFP_ATOMIC);
+ if (!fault)
+ return -ENOMEM;
+
+ fault->addr = fault_addr;
+ fault->idxd = idxd;
+ INIT_WORK(&fault->work, idxd_device_fault_work);
+ queue_work(idxd->wq, &fault->work);
+ return 0;
+}
+
irqreturn_t idxd_irq_handler(int vec, void *data)
{
struct idxd_irq_entry *irq_entry = data;
@@ -125,6 +183,15 @@ irqreturn_t idxd_misc_thread(int vec, void *data)
if (!err)
goto out;
+ /*
+ * This case should rarely happen and typically is due to software
+ * programming error by the driver.
+ */
+ if (idxd->sw_err.valid &&
+ idxd->sw_err.desc_valid &&
+ idxd->sw_err.fault_addr)
+ idxd_device_schedule_fault_process(idxd, idxd->sw_err.fault_addr);
+
gensts.bits = ioread32(idxd->reg_base + IDXD_GENSTATS_OFFSET);
if (gensts.state == IDXD_DEVICE_STATE_HALT) {
idxd->state = IDXD_DEV_HALTED;
@@ -152,57 +219,110 @@ irqreturn_t idxd_misc_thread(int vec, void *data)
return IRQ_HANDLED;
}
+static bool process_fault(struct idxd_desc *desc, u64 fault_addr)
+{
+ /*
+ * Completion address can be bad as well. Check fault address match for descriptor
+ * and completion address.
+ */
+ if ((u64)desc->hw == fault_addr ||
+ (u64)desc->completion == fault_addr) {
+ idxd_dma_complete_txd(desc, IDXD_COMPLETE_DEV_FAIL);
+ return true;
+ }
+
+ return false;
+}
+
+static bool complete_desc(struct idxd_desc *desc)
+{
+ if (desc->completion->status) {
+ idxd_dma_complete_txd(desc, IDXD_COMPLETE_NORMAL);
+ return true;
+ }
+
+ return false;
+}
+
static int irq_process_pending_llist(struct idxd_irq_entry *irq_entry,
- int *processed)
+ enum irq_work_type wtype,
+ int *processed, u64 data)
{
struct idxd_desc *desc, *t;
struct llist_node *head;
int queued = 0;
+ bool completed = false;
+ unsigned long flags;
*processed = 0;
head = llist_del_all(&irq_entry->pending_llist);
if (!head)
- return 0;
+ goto out;
llist_for_each_entry_safe(desc, t, head, llnode) {
- if (desc->completion->status) {
- idxd_dma_complete_txd(desc, IDXD_COMPLETE_NORMAL);
+ if (wtype == IRQ_WORK_NORMAL)
+ completed = complete_desc(desc);
+ else if (wtype == IRQ_WORK_PROCESS_FAULT)
+ completed = process_fault(desc, data);
+
+ if (completed) {
idxd_free_desc(desc->wq, desc);
(*processed)++;
+ if (wtype == IRQ_WORK_PROCESS_FAULT)
+ break;
} else {
- list_add_tail(&desc->list, &irq_entry->work_list);
+ spin_lock_irqsave(&irq_entry->list_lock, flags);
+ list_add_tail(&desc->list,
+ &irq_entry->work_list);
+ spin_unlock_irqrestore(&irq_entry->list_lock, flags);
queued++;
}
}
+ out:
return queued;
}
static int irq_process_work_list(struct idxd_irq_entry *irq_entry,
- int *processed)
+ enum irq_work_type wtype,
+ int *processed, u64 data)
{
struct list_head *node, *next;
int queued = 0;
+ bool completed = false;
+ unsigned long flags;
*processed = 0;
+ spin_lock_irqsave(&irq_entry->list_lock, flags);
if (list_empty(&irq_entry->work_list))
- return 0;
+ goto out;
list_for_each_safe(node, next, &irq_entry->work_list) {
struct idxd_desc *desc =
container_of(node, struct idxd_desc, list);
- if (desc->completion->status) {
+ spin_unlock_irqrestore(&irq_entry->list_lock, flags);
+ if (wtype == IRQ_WORK_NORMAL)
+ completed = complete_desc(desc);
+ else if (wtype == IRQ_WORK_PROCESS_FAULT)
+ completed = process_fault(desc, data);
+
+ if (completed) {
+ spin_lock_irqsave(&irq_entry->list_lock, flags);
list_del(&desc->list);
- /* process and callback */
- idxd_dma_complete_txd(desc, IDXD_COMPLETE_NORMAL);
+ spin_unlock_irqrestore(&irq_entry->list_lock, flags);
idxd_free_desc(desc->wq, desc);
(*processed)++;
+ if (wtype == IRQ_WORK_PROCESS_FAULT)
+ return queued;
} else {
queued++;
}
+ spin_lock_irqsave(&irq_entry->list_lock, flags);
}
+ out:
+ spin_unlock_irqrestore(&irq_entry->list_lock, flags);
return queued;
}
@@ -230,12 +350,14 @@ static int idxd_desc_process(struct idxd_irq_entry *irq_entry)
* 5. Repeat until no more descriptors.
*/
do {
- rc = irq_process_work_list(irq_entry, &processed);
+ rc = irq_process_work_list(irq_entry, IRQ_WORK_NORMAL,
+ &processed, 0);
total += processed;
if (rc != 0)
continue;
- rc = irq_process_pending_llist(irq_entry, &processed);
+ rc = irq_process_pending_llist(irq_entry, IRQ_WORK_NORMAL,
+ &processed, 0);
total += processed;
} while (rc != 0);
--
2.26.2
On 05-10-20, 08:11, Dave Jiang wrote:
> [ .. cover letter snipped .. ]
Applied, thanks
--
~Vinod
On Wed, Oct 07, 2020 at 12:31:32PM +0530, Vinod Koul wrote:
> Applied, thanks
I'm tired of repeating what you should've done - your branch doesn't
even build. How did you test it?
Also, what happens if Linus merges your branch first, before tip?
Oh boy.
In file included from ./arch/x86/include/asm/tsc.h:9,
from ./arch/x86/include/asm/timex.h:6,
from ./include/linux/timex.h:65,
from ./include/linux/time32.h:13,
from ./include/linux/time.h:73,
from ./include/linux/stat.h:19,
from ./include/linux/module.h:13,
from drivers/dma/idxd/init.c:5:
drivers/dma/idxd/init.c: In function ‘idxd_init_module’:
drivers/dma/idxd/init.c:526:20: error: ‘X86_FEATURE_ENQCMD’ undeclared (first use in this function); did you mean ‘X86_FEATURE_PCID’?
526 | if (!boot_cpu_has(X86_FEATURE_ENQCMD))
| ^~~~~~~~~~~~~~~~~~
./arch/x86/include/asm/cpufeature.h:118:24: note: in definition of macro ‘cpu_has’
118 | (__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \
| ^~~
drivers/dma/idxd/init.c:526:7: note: in expansion of macro ‘boot_cpu_has’
526 | if (!boot_cpu_has(X86_FEATURE_ENQCMD))
| ^~~~~~~~~~~~
drivers/dma/idxd/init.c:526:20: note: each undeclared identifier is reported only once for each function it appears in
526 | if (!boot_cpu_has(X86_FEATURE_ENQCMD))
| ^~~~~~~~~~~~~~~~~~
./arch/x86/include/asm/cpufeature.h:118:24: note: in definition of macro ‘cpu_has’
118 | (__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \
| ^~~
drivers/dma/idxd/init.c:526:7: note: in expansion of macro ‘boot_cpu_has’
526 | if (!boot_cpu_has(X86_FEATURE_ENQCMD))
| ^~~~~~~~~~~~
make[3]: *** [scripts/Makefile.build:283: drivers/dma/idxd/init.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [scripts/Makefile.build:500: drivers/dma/idxd] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [scripts/Makefile.build:500: drivers/dma] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:1788: drivers] Error 2
make: *** Waiting for unfinished jobs....
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 07-10-20, 10:48, Borislav Petkov wrote:
> On Wed, Oct 07, 2020 at 12:31:32PM +0530, Vinod Koul wrote:
> > Applied, thanks
>
> I'm tired of repeating what you should've done - your branch doesn't
> even build. How did you test it?
Right, my build failed for x86 and I have dropped these now. I would have
expected the dependency to be a signed tag that could be cross-merged when
I was asked to merge this.
--
~Vinod
On Wed, Oct 07, 2020 at 03:23:13PM +0530, Vinod Koul wrote:
> Right, my build failed for x86 and I have dropped these now. I would have
> expected the dependency to be a signed tag that could be cross-merged when
> I was asked to merge this.
I can give you a signed tag if you prefer but that's usually not
necessary. You can simply merge tip's x86/pasid branch, then apply those
on top and test.
HTH.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 07-10-20, 12:04, Borislav Petkov wrote:
> On Wed, Oct 07, 2020 at 03:23:13PM +0530, Vinod Koul wrote:
> > Right, my build failed for x86 and I have dropped these now. I would have
> > expected the dependency to be a signed tag that could be cross-merged when
> > I was asked to merge this.
>
> I can give you a signed tag if you prefer but that's usually not
That would be better; signed tags are preferred
> necessary. You can simply merge tip's x86/pasid branch, then apply those
> on top and test.
While at it, it would be good if the x86 patches of this series came from
your tree; that makes more sense if we are doing a cross merge
Thanks
--
~Vinod
On Wed, Oct 07, 2020 at 08:27:33PM +0530, Vinod Koul wrote:
> That would be better, signed tags are preferred
...
> While at it, it would be good if x86 patches of this series come from
> your tree, that makes more sense if we are doing a cross merge
All done, here it is:
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/tag/?h=x86_pasid_for_5.10
That's going to be the tag I send to Linus next week too. I'll send it
on Monday when the merge window opens so that he can merge it before
your branch.
HTH.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 10/7/2020 12:01 AM, Vinod Koul wrote:
> On 05-10-20, 08:11, Dave Jiang wrote:
>
>> [ .. cover letter snipped .. ]
>
> Applied, thanks
>
Thanks Vinod!