2022-02-07 17:24:39

by Iouri Tarassov

Subject: [PATCH v2 00/24] Driver for Hyper-v virtual compute device

This is a follow-up to the changes we sent a few weeks back [1].

[1] https://lore.kernel.org/lkml/[email protected]/

The changes since [1]:

- the driver code is split into more patches for easier reviewing
- static variable dxgglobaldev is removed
- various compiler warnings are fixed
- the patch for DXGSYNCFILE is removed. This patch requires more work and will
be submitted separately.

Foreword
-------------------------------------------------------
The patches contain implementation of the Hyper-V vGPU / Compute hardware
driver, which powers the Windows Subsystem for Linux (WSL) and will soon power
the Windows Subsystem for Android (WSA).

This set of patches is rebuilt from the ground up and organized in logical
layers, making it easier for reviewers to understand and follow how the
driver is built. The first patch introduces headers and driver
initialization. The subsequent patches add functionality to the driver.

Per earlier feedback, the driver is now located under the Hyper-V node, as it
is not a classic Linux GPU (KMS/DRM) driver and only makes sense when running
under Hyper-V with a Windows host. Although we refer to this driver as vGPU
for shorthand, in reality it is generic virtualization infrastructure for
various classes of compute accelerators, the most popular and ubiquitous being
the GPU. We support virtualization of non-GPU devices through this
infrastructure. These devices are exposed to user space and used by various
APIs and frameworks, such as CUDA, OpenCL, OpenVINO, OneAPI, DX12, etc.

One critical piece of feedback from the community on our earlier submission
was the lack of an open source user-space to go along with our kernel compute
driver. At the time we only had the CUDA and DX12 user-space APIs, both of
which are closed source. This is a divisive issue in the community and is
heavily debated (https://lwn.net/Articles/821817/).

We took this feedback to heart and spent the last year working to address it.
Working closely with our partner Intel, we're happy to say that we now have a
fully open source user-space for the OpenCL, OpenVINO and OneAPI compute
family of APIs on Intel GPU platforms, supported by this open source project
(https://github.com/intel/compute-runtime).

To make it easy for our partners to build compute drivers that are usable in
both Windows and WSL environments, we provide a library that offers a stable
interface to our compute device abstraction. This was originally part of the
closed source libdxcore API, but we have now spun it off into its own open
source project (https://github.com/microsoft/libdxg).

Between the Intel compute runtime project and libdxg, we now have a fully
open source implementation of our virtualized compute stack inside of WSL.
We will continue to support both open source user-space APIs against our
compute abstraction and closed source ones (CUDA, DX12), leaving it to the
API owners and partners to decide what makes the most sense for them.

A lot of effort went into addressing community feedback in this revised set
of patches, and we hope it is getting closer to what the community would like
to see.

We're looking forward to additional feedback.

Driver overview
-------------------------------------------------------
dxgkrnl is a driver for Hyper-V virtual compute devices, such as vGPU
devices, which are projected to a Linux virtual machine (VM) by a Windows
host. dxgkrnl works in the context of WDDM (Windows Display Driver Model)
for GPU devices or MCDM (Microsoft Compute Driver Model) for non-GPU devices.
WDDM/MCDM consists of the following components:
- Graphics or Compute applications
- A graphics or compute user mode API (for example OpenGL, Vulkan, OpenCL,
OpenVINO, OneAPI, CUDA, DX12, ...)
- User Mode Driver (UMD), written by a hardware vendor
- Optional libdxg library, which helps UMD portability across Windows and Linux
- dxgkrnl Linux kernel driver (this driver)
- Kernel mode port driver on the Windows host (dxgkrnl.sys / dxgmms*.sys)
- Kernel mode miniport driver (KMD), written by a hardware vendor, running
on the Windows host and interfacing with the hardware device

dxgkrnl exposes a subset of the WDDM/MCDM D3DKMT interface to user space. See
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/d3dkmthk/
This interface provides user space with an abstracted view and control
of compute devices in a portable way across Windows and WSL. It is used for
devices such as GPUs or AI/ML processors. This interface is mapped to custom
IOCTLs on Linux (see d3dkmthk.h).

A more detailed overview of this architecture is available here:
https://devblogs.microsoft.com/directx/directx-heart-linux/

Compute devices are paravirtualized, meaning that the actual communication with
the corresponding device happens on the host. The version of dxgkrnl inside
of the Linux kernel coordinates with the version of dxgkrnl running on Windows
to provide a consistent and portable abstraction for the device that the various
APIs and UMD can rely on across Windows and Linux.

Dxgkrnl creates the /dev/dxg device, which can be used to enumerate devices.
A UMD or an API runtime opens the device and sends ioctls to create various
device objects (dxgdevice, dxgcontext, dxgallocation, etc., defined in
dxgkrnl.h) and to submit work to the device. The WDDM objects are represented
in user mode as opaque handles (struct d3dkmthandle). Dxgkrnl creates a
dxgprocess object for each process that opens the /dev/dxg device. This
object has a handle table, which is used to translate d3dkmt handles to
kernel objects. Handle table definitions are in hmgr.h. There is also a
global handle table for objects that do not belong to a particular process.
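To illustrate the handle table idea, here is a hypothetical, much-simplified
model of translating an opaque d3dkmthandle into a kernel object pointer.
The real definitions in hmgr.h differ; the table type, capacity and function
names below are invented for illustration only:

```c
/* Simplified sketch of a handle table (not the real hmgr.h code). */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HMGR_MAX_HANDLES 64		/* illustrative capacity only */

struct d3dkmthandle { uint32_t v; };	/* 0 means "no handle" */

struct hmgrtable {
	/* slot 0 is never used, so a zero handle is always invalid */
	void *objects[HMGR_MAX_HANDLES];
};

/* Allocate a slot for an object and return its handle; {0} on failure. */
static struct d3dkmthandle hmgr_alloc_handle(struct hmgrtable *t, void *object)
{
	for (uint32_t i = 1; i < HMGR_MAX_HANDLES; i++) {
		if (t->objects[i] == NULL) {
			t->objects[i] = object;
			return (struct d3dkmthandle){ .v = i };
		}
	}
	return (struct d3dkmthandle){ 0 };
}

/* Translate a handle back to the object pointer; NULL if invalid. */
static void *hmgr_handle_to_object(struct hmgrtable *t, struct d3dkmthandle h)
{
	if (h.v == 0 || h.v >= HMGR_MAX_HANDLES)
		return NULL;
	return t->objects[h.v];
}
```

User mode only ever sees the opaque value; the driver performs the
translation on every ioctl that refers to an object.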

Driver initialization
-------------------------------------------------------
When dxgkrnl is loaded, dxgkrnl registers for virtual PCI device arrival
notifications and VM bus channel device notifications (see dxgmodule.c). When
the first virtual device is started, dxgkrnl creates a misc device (/dev/dxg).
A user mode client can open the /dev/dxg device and send IOCTLs to enumerate
and control virtual compute devices.

Virtual device initialization
-------------------------------------------------------
A virtual device is represented by a dxgadapter object. It is created when the
corresponding device arrives on the virtual PCI bus. The device vendor is
PCI_VENDOR_ID_MICROSOFT and the device id is PCI_DEVICE_ID_VIRTUAL_RENDER.
The adapter is started when the corresponding VM bus channel and the global
VM bus channel are initialized.
Dynamic arrival/removal of devices is supported.

Internal objects
-------------------------------------------------------
Dxgkrnl creates various internal objects in response to IOCTL calls. The
corresponding objects are also created on the host. Each object is placed in
a process or global handle table. Object handles (d3dkmthandle) are returned
to user mode. The object handles are also used to reference objects on the
host. Corresponding objects in the guest and the host have the same handle
value to avoid handle translation. The internal objects are:
- dxgadapter
Represents a virtual device object. It is created for every device projected by
the host to the VM.
- dxgprocess
The object is created for each Linux process that opens /dev/dxg. It has the
object handle table, which holds pointers to all internal objects created by
this process.
- dxgcontext
Represents a device execution thread in the packet scheduling mode (as opposed
to the hardware scheduling mode).
- dxghwqueue
Represents a device execution thread in the hardware scheduling mode.
- dxgdevice
A collection of device contexts, allocations, resources, sync objects, etc.
- dxgallocation
Represents a device accessible memory allocation.
- dxgresource
A collection of dxgallocation objects. This object can be shared between
devices and processes.
- dxgsharedresource
Represents a dxgresource object, which is sharable between processes.
- dxgsyncobject
Represents a device synchronization object, which is used to synchronize
execution of the device execution contexts.
- dxgsharedsyncobject
Represents a device synchronization object, which is sharable between processes.
- dxgpagingqueue
Represents a queue, which is used to manage residency of the device allocation
objects.

Communications with the host
-------------------------------------------------------
Dxgkrnl communicates with the host via Hyper-V VM bus channels. There is a
global channel and a per device channel. The VM bus messages are defined in
dxgvmbus.h and the implementation is in dxgvmbus.c.

Most VM bus messages to the host are synchronous. When the host enables the
asynchronous mode, some high frequency VM bus messages are sent asynchronously
to improve performance. When async messages are enabled, all VM bus messages are
sent only via the global channel to maintain the order of messages on the host.

The host can send asynchronous messages to dxgkrnl via the global VM bus
channel. The host messages are handled by dxgvmbuschannel_receive().

PCI config space of the device is used to exchange information between the host
and the guest during dxgkrnl initialization. See dxg_pci_probe_device().

CPU access to device accessible allocations
-------------------------------------------------------
The D3DKMT API allows creation of allocations that are accessible by both the
device and the CPU. The global VM bus channel has associated IO space, which
is used to implement CPU access to CPU visible allocations. For each such
allocation
the host allocates a portion of the guest IO space and maps it to the
allocation memory (it could be in system memory or in device local memory).
A user mode application calls the LX_DXLOCK2 ioctl to get the allocation
CPU address. Dxgkrnl uses vm_mmap to allocate a user space VA range and maps
it to the allocation IO space using io_remap_pfn_range(). This way Linux
user mode virtual addresses point to the host system memory or device local
memory.

Sharing objects between processes
-------------------------------------------------------
Some dxgkrnl objects can be shared between processes. This includes resources
(dxgresource) and synchronization objects (dxgsyncobject).
The WDDM API provides a way to share objects using so-called "NT handles".
An "NT handle" on Windows is a native Windows process handle (HANDLE). "NT
handles" are implemented as file descriptors (FDs) on Linux.
The LX_DXSHAREOBJECTS ioctl is used to get an FD for a shared object.
Before a shared object can be used, it needs to be "opened" to get a "local"
d3dkmthandle handle. The LX_DXOPENRESOURCEFROMNTHANDLE and
LX_DXOPENSYNCOBJECTFROMNTHANDLE2 ioctls are used to open a shared object.
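A rough model of this sharing flow, with all names invented for illustration
(the driver's real internals differ): one process publishes an object and
receives an FD-like value, and another process "opens" that value to obtain
its own local handle referring to the same object:

```c
/* Toy model of FD-based object sharing; not the driver's real API. */
#include <assert.h>
#include <stddef.h>

#define MAX_SHARED 8
#define MAX_LOCAL  8

struct shared_resource { int payload; };

/* Global table of shareable objects; the index plays the role of the FD. */
static struct shared_resource *g_shared[MAX_SHARED];

/* Per-process table of "local" handles (nonzero handle = index + 1). */
struct process_handles { struct shared_resource *local[MAX_LOCAL]; };

/* LX_DXSHAREOBJECTS analogue: publish an object, return an fd (or -1). */
static int share_object(struct shared_resource *res)
{
	for (int fd = 0; fd < MAX_SHARED; fd++) {
		if (g_shared[fd] == NULL) {
			g_shared[fd] = res;
			return fd;
		}
	}
	return -1;
}

/* LX_DXOPENRESOURCEFROMNTHANDLE analogue: open the fd in another
 * process, producing a local handle value (0 on failure). */
static int open_shared(struct process_handles *p, int fd)
{
	if (fd < 0 || fd >= MAX_SHARED || g_shared[fd] == NULL)
		return 0;
	for (int i = 0; i < MAX_LOCAL; i++) {
		if (p->local[i] == NULL) {
			p->local[i] = g_shared[fd];
			return i + 1;
		}
	}
	return 0;
}
```

In the real driver the FD carries the usual Linux semantics (it can be
passed over a unix socket or inherited), which is what makes it a natural
stand-in for the Windows "NT handle".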

Iouri Tarassov (24):
drivers: hv: dxgkrnl: Driver initialization and creation of dxgadapter
drivers: hv: dxgkrnl: Open device file and dxgprocess creation
drivers: hv: dxgkrnl: Enumerate and open dxgadapter objects
drivers: hv: dxgkrnl: Creation of dxgdevice
drivers: hv: dxgkrnl: Creation of dxgcontext objects
drivers: hv: dxgkrnl: Creation of GPU allocations and resources
drivers: hv: dxgkrnl: Create and destroy GPU sync objects
drivers: hv: dxgkrnl: Operations using GPU sync objects
drivers: hv: dxgkrnl: Sharing of dxgresource objects
drivers: hv: dxgkrnl: Sharing of sync objects
drivers: hv: dxgkrnl: Creation of hardware queue. Sync object
operations to hw queue.
drivers: hv: dxgkrnl: Creation of paging queue objects.
drivers: hv: dxgkrnl: Submit execution commands to the compute device
drivers: hv: dxgkrnl: Implement LX_DXSHAREOBJECTWITHHOST ioctl
drivers: hv: dxgkrnl: IOCTL to get the dxgdevice state
LX_DXGETDEVICESTATE
drivers: hv: dxgkrnl: Mmap(unmap) CPU address to device allocation:
LX_DXLOCK2, LX_DXUNLOCK2
drivers: hv: dxgkrnl: IOCTLs to handle GPU allocation properties
drivers: hv: dxgkrnl: Various simple IOCTLs and unused ones
LX_DXQUERYVIDEOMEMORYINFO, LX_DXFLUSHHEAPTRANSITIONS,
LX_DXINVALIDATECACHE LX_DXGETSHAREDRESOURCEADAPTERLUID
drivers: hv: dxgkrnl: Simple IOCTLs LX_DXESCAPE,
LX_DXMARKDEVICEASERROR, LX_DXQUERYSTATISTICS,
LX_DXQUERYCLOCKCALIBRATION
drivers: hv: dxgkrnl: IOCTLs to offer and reclaim allocations
drivers: hv: dxgkrnl: Ioctls to set/get scheduling priority
drivers: hv: dxgkrnl: IOCTLs to manage allocation residency
drivers: hv: dxgkrnl: IOCTLs to handle GPU virtual addressing (GPU VA)
drivers: hv: dxgkrnl: Add support to map guest pages by host

MAINTAINERS | 7 +
drivers/hv/Kconfig | 2 +
drivers/hv/Makefile | 1 +
drivers/hv/dxgkrnl/Kconfig | 26 +
drivers/hv/dxgkrnl/Makefile | 5 +
drivers/hv/dxgkrnl/dxgadapter.c | 1371 ++++++++
drivers/hv/dxgkrnl/dxgkrnl.h | 956 ++++++
drivers/hv/dxgkrnl/dxgmodule.c | 942 ++++++
drivers/hv/dxgkrnl/dxgprocess.c | 334 ++
drivers/hv/dxgkrnl/dxgvmbus.c | 3750 +++++++++++++++++++++
drivers/hv/dxgkrnl/dxgvmbus.h | 882 +++++
drivers/hv/dxgkrnl/hmgr.c | 566 ++++
drivers/hv/dxgkrnl/hmgr.h | 112 +
drivers/hv/dxgkrnl/ioctl.c | 5450 +++++++++++++++++++++++++++++++
drivers/hv/dxgkrnl/misc.c | 43 +
drivers/hv/dxgkrnl/misc.h | 96 +
include/linux/hyperv.h | 16 +
include/uapi/misc/d3dkmthk.h | 1945 +++++++++++
18 files changed, 16504 insertions(+)
create mode 100644 drivers/hv/dxgkrnl/Kconfig
create mode 100644 drivers/hv/dxgkrnl/Makefile
create mode 100644 drivers/hv/dxgkrnl/dxgadapter.c
create mode 100644 drivers/hv/dxgkrnl/dxgkrnl.h
create mode 100644 drivers/hv/dxgkrnl/dxgmodule.c
create mode 100644 drivers/hv/dxgkrnl/dxgprocess.c
create mode 100644 drivers/hv/dxgkrnl/dxgvmbus.c
create mode 100644 drivers/hv/dxgkrnl/dxgvmbus.h
create mode 100644 drivers/hv/dxgkrnl/hmgr.c
create mode 100644 drivers/hv/dxgkrnl/hmgr.h
create mode 100644 drivers/hv/dxgkrnl/ioctl.c
create mode 100644 drivers/hv/dxgkrnl/misc.c
create mode 100644 drivers/hv/dxgkrnl/misc.h
create mode 100644 include/uapi/misc/d3dkmthk.h

--
2.35.1



2022-02-07 19:47:05

by Iouri Tarassov

Subject: [PATCH v2 20/24] drivers: hv: dxgkrnl: IOCTLs to offer and reclaim allocations

LX_DXOFFERALLOCATIONS (D3DKMTOfferAllocations),
LX_DXRECLAIMALLOCATIONS2 (D3DKMTReclaimAllocations)

When a user mode driver does not need to access an allocation, it can
"offer" it. This means that the allocation is not in use and its local
device memory could be reclaimed and given to another allocation.
When the allocation is needed again, the UMD can attempt to "reclaim" the
allocation. If the allocation is still in the device local memory,
the reclaim operation succeeds. If not, the caller must restore the content
of the allocation before it can be used by the device.
The IOCTLs are implemented by sending messages to the host and returning
results back to the caller.
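The offer/reclaim contract described above can be modeled as a toy state
machine (purely illustrative; the real bookkeeping happens on the host):

```c
/* Toy model of the offer/reclaim contract; names are illustrative. */
#include <assert.h>
#include <stdbool.h>

enum reclaim_result { RECLAIM_OK, RECLAIM_DISCARDED };

struct allocation {
	bool offered;	/* UMD does not currently need the content     */
	bool discarded;	/* host reused the memory while it was offered */
};

static void offer(struct allocation *a)
{
	a->offered = true;
}

/* Host-side memory pressure may discard only offered allocations. */
static void memory_pressure(struct allocation *a)
{
	if (a->offered)
		a->discarded = true;
}

/* Reclaim: if the content is gone, the caller must restore it before
 * the device may use the allocation again. */
static enum reclaim_result reclaim(struct allocation *a)
{
	a->offered = false;
	if (a->discarded) {
		a->discarded = false;
		return RECLAIM_DISCARDED;
	}
	return RECLAIM_OK;
}
```

This mirrors the per-allocation results array the ioctl can return: each
entry tells the caller whether that allocation's content survived.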

Signed-off-by: Iouri Tarassov <[email protected]>
---
drivers/hv/dxgkrnl/dxgkrnl.h | 8 +++
drivers/hv/dxgkrnl/dxgvmbus.c | 122 ++++++++++++++++++++++++++++++++++
drivers/hv/dxgkrnl/ioctl.c | 119 +++++++++++++++++++++++++++++++++
3 files changed, 249 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 8310dd9b7843..52d7d74a93e4 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -839,6 +839,14 @@ int dxgvmb_send_set_allocation_priority(struct dxgprocess *process,
int dxgvmb_send_get_allocation_priority(struct dxgprocess *process,
struct dxgadapter *adapter,
struct d3dkmt_getallocationpriority *a);
+int dxgvmb_send_offer_allocations(struct dxgprocess *process,
+ struct dxgadapter *adapter,
+ struct d3dkmt_offerallocations *args);
+int dxgvmb_send_reclaim_allocations(struct dxgprocess *process,
+ struct dxgadapter *adapter,
+ struct d3dkmthandle device,
+ struct d3dkmt_reclaimallocations2 *args,
+ u64 __user *paging_fence_value);
int dxgvmb_send_change_vidmem_reservation(struct dxgprocess *process,
struct dxgadapter *adapter,
struct d3dkmthandle other_process,
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index f7fcbf62f95b..84034c4fafe2 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -2889,6 +2889,128 @@ int dxgvmb_send_get_allocation_priority(struct dxgprocess *process,
return ret;
}

+int dxgvmb_send_offer_allocations(struct dxgprocess *process,
+ struct dxgadapter *adapter,
+ struct d3dkmt_offerallocations *args)
+{
+ struct dxgkvmb_command_offerallocations *command;
+ int ret = -EINVAL;
+ u32 alloc_size = sizeof(struct d3dkmthandle) * args->allocation_count;
+ u32 cmd_size = sizeof(struct dxgkvmb_command_offerallocations) +
+ alloc_size - sizeof(struct d3dkmthandle);
+ struct dxgvmbusmsg msg = {.hdr = NULL};
+
+ ret = init_message(&msg, adapter, process, cmd_size);
+ if (ret)
+ goto cleanup;
+ command = (void *)msg.msg;
+
+ command_vgpu_to_host_init2(&command->hdr,
+ DXGK_VMBCOMMAND_OFFERALLOCATIONS,
+ process->host_handle);
+ command->flags = args->flags;
+ command->priority = args->priority;
+ command->device = args->device;
+ command->allocation_count = args->allocation_count;
+ if (args->resources) {
+ command->resources = true;
+ ret = copy_from_user(command->allocations, args->resources,
+ alloc_size);
+ } else {
+ ret = copy_from_user(command->allocations,
+ args->allocations, alloc_size);
+ }
+ if (ret) {
+ pr_err("%s failed to copy input handles", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+
+cleanup:
+ free_message(&msg, process);
+ if (ret)
+ pr_debug("err: %s %d", __func__, ret);
+ return ret;
+}
+
+int dxgvmb_send_reclaim_allocations(struct dxgprocess *process,
+ struct dxgadapter *adapter,
+ struct d3dkmthandle device,
+ struct d3dkmt_reclaimallocations2 *args,
+ u64 __user *paging_fence_value)
+{
+ struct dxgkvmb_command_reclaimallocations *command;
+ struct dxgkvmb_command_reclaimallocations_return *result;
+ int ret;
+ u32 alloc_size = sizeof(struct d3dkmthandle) * args->allocation_count;
+ u32 cmd_size = sizeof(struct dxgkvmb_command_reclaimallocations) +
+ alloc_size - sizeof(struct d3dkmthandle);
+ u32 result_size = sizeof(*result);
+ struct dxgvmbusmsgres msg = {.hdr = NULL};
+
+ if (args->results)
+ result_size += (args->allocation_count - 1) *
+ sizeof(enum d3dddi_reclaim_result);
+
+ ret = init_message_res(&msg, adapter, process, cmd_size, result_size);
+ if (ret)
+ goto cleanup;
+ command = (void *)msg.msg;
+ result = msg.res;
+
+ command_vgpu_to_host_init2(&command->hdr,
+ DXGK_VMBCOMMAND_RECLAIMALLOCATIONS,
+ process->host_handle);
+ command->device = device;
+ command->paging_queue = args->paging_queue;
+ command->allocation_count = args->allocation_count;
+ command->write_results = args->results != NULL;
+ if (args->resources) {
+ command->resources = true;
+ ret = copy_from_user(command->allocations, args->resources,
+ alloc_size);
+ } else {
+ ret = copy_from_user(command->allocations,
+ args->allocations, alloc_size);
+ }
+ if (ret) {
+ pr_err("%s failed to copy input handles", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+ result, msg.res_size);
+ if (ret < 0)
+ goto cleanup;
+ ret = copy_to_user(paging_fence_value,
+ &result->paging_fence_value, sizeof(u64));
+ if (ret) {
+ pr_err("%s failed to copy paging fence", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ ret = ntstatus2int(result->status);
+ if (NT_SUCCESS(result->status) && args->results) {
+ ret = copy_to_user(args->results, result->discarded,
+ sizeof(result->discarded[0]) *
+ args->allocation_count);
+ if (ret) {
+ pr_err("%s failed to copy results", __func__);
+ ret = -EINVAL;
+ }
+ }
+
+cleanup:
+ free_message((struct dxgvmbusmsg *)&msg, process);
+ if (ret)
+ pr_debug("err: %s %d", __func__, ret);
+ return ret;
+}
+
int dxgvmb_send_change_vidmem_reservation(struct dxgprocess *process,
struct dxgadapter *adapter,
struct d3dkmthandle other_process,
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 6c9b6e6ea296..3370e5de4314 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -2010,6 +2010,121 @@ dxgk_destroy_allocation(struct dxgprocess *process, void *__user inargs)
return ret;
}

+static int
+dxgk_offer_allocations(struct dxgprocess *process, void *__user inargs)
+{
+ int ret;
+ struct d3dkmt_offerallocations args;
+ struct dxgdevice *device = NULL;
+ struct dxgadapter *adapter = NULL;
+
+ pr_debug("ioctl: %s", __func__);
+ ret = copy_from_user(&args, inargs, sizeof(args));
+ if (ret) {
+ pr_err("%s failed to copy input args", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ if (args.allocation_count > D3DKMT_MAKERESIDENT_ALLOC_MAX ||
+ args.allocation_count == 0) {
+ pr_err("invalid number of allocations");
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ if ((args.resources == NULL) == (args.allocations == NULL)) {
+ pr_err("invalid pointer to resources/allocations");
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ device = dxgprocess_device_by_handle(process, args.device);
+ if (device == NULL) {
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ adapter = device->adapter;
+ ret = dxgadapter_acquire_lock_shared(adapter);
+ if (ret < 0) {
+ adapter = NULL;
+ goto cleanup;
+ }
+
+ ret = dxgvmb_send_offer_allocations(process, adapter, &args);
+
+cleanup:
+
+ if (adapter)
+ dxgadapter_release_lock_shared(adapter);
+ if (device)
+ kref_put(&device->device_kref, dxgdevice_release);
+
+ pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+ return ret;
+}
+
+static int
+dxgk_reclaim_allocations(struct dxgprocess *process, void *__user inargs)
+{
+ int ret;
+ struct d3dkmt_reclaimallocations2 args;
+ struct dxgdevice *device = NULL;
+ struct dxgadapter *adapter = NULL;
+ struct d3dkmt_reclaimallocations2 * __user in_args = inargs;
+
+ pr_debug("ioctl: %s", __func__);
+ ret = copy_from_user(&args, inargs, sizeof(args));
+ if (ret) {
+ pr_err("%s failed to copy input args", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ if (args.allocation_count > D3DKMT_MAKERESIDENT_ALLOC_MAX ||
+ args.allocation_count == 0) {
+ pr_err("invalid number of allocations");
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ if ((args.resources == NULL) == (args.allocations == NULL)) {
+ pr_err("invalid pointer to resources/allocations");
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ device = dxgprocess_device_by_object_handle(process,
+ HMGRENTRY_TYPE_DXGPAGINGQUEUE,
+ args.paging_queue);
+ if (device == NULL) {
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ adapter = device->adapter;
+ ret = dxgadapter_acquire_lock_shared(adapter);
+ if (ret < 0) {
+ adapter = NULL;
+ goto cleanup;
+ }
+
+ ret = dxgvmb_send_reclaim_allocations(process, adapter,
+ device->handle, &args,
+ &in_args->paging_fence_value);
+
+cleanup:
+
+ if (adapter)
+ dxgadapter_release_lock_shared(adapter);
+ if (device)
+ kref_put(&device->device_kref, dxgdevice_release);
+
+ pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+ return ret;
+}
+
static int
dxgk_submit_command(struct dxgprocess *process, void *__user inargs)
{
@@ -4736,6 +4851,8 @@ void init_ioctls(void)
LX_DXLOCK2);
SET_IOCTL(/*0x26 */ dxgk_mark_device_as_error,
LX_DXMARKDEVICEASERROR);
+ SET_IOCTL(/*0x27 */ dxgk_offer_allocations,
+ LX_DXOFFERALLOCATIONS);
SET_IOCTL(/*0x28 */ dxgk_open_resource,
LX_DXOPENRESOURCE);
SET_IOCTL(/*0x29 */ dxgk_open_sync_object,
@@ -4744,6 +4861,8 @@ void init_ioctls(void)
LX_DXQUERYALLOCATIONRESIDENCY);
SET_IOCTL(/*0x2b */ dxgk_query_resource_info,
LX_DXQUERYRESOURCEINFO);
+ SET_IOCTL(/*0x2c */ dxgk_reclaim_allocations,
+ LX_DXRECLAIMALLOCATIONS2);
SET_IOCTL(/*0x2d */ dxgk_render,
LX_DXRENDER);
SET_IOCTL(/*0x2e */ dxgk_set_allocation_priority,
--
2.35.1


2022-02-07 19:56:51

by Iouri Tarassov

Subject: [PATCH v2 22/24] drivers: hv: dxgkrnl: IOCTLs to manage allocation residency

LX_DXMAKERESIDENT (D3DKMTMakeResident)
LX_DXEVICT (D3DKMTEvict)

An allocation is resident when the GPU is set up to access it. The current
WDDM design does not support on-demand GPU page faulting. An
allocation must be resident (in local device memory or
in non-pageable system memory) before the GPU is allowed to access it.
Applications need to call D3DKMTMakeResident to make an allocation
resident and D3DKMTEvict to evict it from GPU accessible memory.
The IOCTLs send the requests to the host and return the results back.
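As a toy model of the residency rule (illustrative only; the real budget
accounting is done by the host video memory manager, and the field names
below merely echo the D3DKMT output parameters):

```c
/* Toy model of make-resident/evict against a memory budget. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct vidmem {
	uint64_t budget;	/* device-accessible memory available */
	uint64_t used;
};

/* D3DKMTMakeResident analogue: succeeds only within budget; on failure
 * reports how many bytes the caller should trim (evict) first. */
static bool make_resident(struct vidmem *m, uint64_t size,
			  uint64_t *num_bytes_to_trim)
{
	if (m->used + size > m->budget) {
		*num_bytes_to_trim = m->used + size - m->budget;
		return false;
	}
	m->used += size;
	*num_bytes_to_trim = 0;
	return true;
}

/* D3DKMTEvict analogue: releases device-accessible memory. */
static void evict(struct vidmem *m, uint64_t size)
{
	m->used = size > m->used ? 0 : m->used - size;
}
```

A UMD typically reacts to a failed make-resident by evicting at least
num_bytes_to_trim worth of allocations and retrying.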

Signed-off-by: Iouri Tarassov <[email protected]>
---
drivers/hv/dxgkrnl/dxgkrnl.h | 4 +
drivers/hv/dxgkrnl/dxgvmbus.c | 98 +++++++++++++++++++++++
drivers/hv/dxgkrnl/ioctl.c | 144 ++++++++++++++++++++++++++++++++++
3 files changed, 246 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index eebb9cee39c3..70cb31654aac 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -784,6 +784,10 @@ int dxgvmb_send_create_allocation(struct dxgprocess *pr, struct dxgdevice *dev,
int dxgvmb_send_destroy_allocation(struct dxgprocess *pr, struct dxgdevice *dev,
struct d3dkmt_destroyallocation2 *args,
struct d3dkmthandle *alloc_handles);
+int dxgvmb_send_make_resident(struct dxgprocess *pr, struct dxgadapter *adapter,
+ struct d3dddi_makeresident *args);
+int dxgvmb_send_evict(struct dxgprocess *pr, struct dxgadapter *adapter,
+ struct d3dkmt_evict *args);
int dxgvmb_send_submit_command(struct dxgprocess *pr,
struct dxgadapter *adapter,
struct d3dkmt_submitcommand *args);
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 2b64d53685a0..4299127af863 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -2224,6 +2224,104 @@ int dxgvmb_send_get_stdalloc_data(struct dxgdevice *device,
return ret;
}

+int dxgvmb_send_make_resident(struct dxgprocess *process,
+ struct dxgadapter *adapter,
+ struct d3dddi_makeresident *args)
+{
+ int ret;
+ u32 cmd_size;
+ struct dxgkvmb_command_makeresident_return result = { };
+ struct dxgkvmb_command_makeresident *command = NULL;
+ struct dxgvmbusmsg msg = {.hdr = NULL};
+
+ cmd_size = (args->alloc_count - 1) * sizeof(struct d3dkmthandle) +
+ sizeof(struct dxgkvmb_command_makeresident);
+
+ ret = init_message(&msg, adapter, process, cmd_size);
+ if (ret)
+ goto cleanup;
+ command = (void *)msg.msg;
+
+ ret = copy_from_user(command->allocations, args->allocation_list,
+ args->alloc_count *
+ sizeof(struct d3dkmthandle));
+ if (ret) {
+ pr_err("%s failed to copy alloc handles", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+ command_vgpu_to_host_init2(&command->hdr,
+ DXGK_VMBCOMMAND_MAKERESIDENT,
+ process->host_handle);
+ command->alloc_count = args->alloc_count;
+ command->paging_queue = args->paging_queue;
+ command->flags = args->flags;
+
+ ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+ &result, sizeof(result));
+ if (ret < 0) {
+ pr_err("send_make_resident failed %x", ret);
+ goto cleanup;
+ }
+
+ args->paging_fence_value = result.paging_fence_value;
+ args->num_bytes_to_trim = result.num_bytes_to_trim;
+ ret = ntstatus2int(result.status);
+
+cleanup:
+
+ free_message(&msg, process);
+ if (ret)
+ pr_debug("err: %s %d", __func__, ret);
+ return ret;
+}
+
+int dxgvmb_send_evict(struct dxgprocess *process,
+ struct dxgadapter *adapter,
+ struct d3dkmt_evict *args)
+{
+ int ret;
+ u32 cmd_size;
+ struct dxgkvmb_command_evict_return result = { };
+ struct dxgkvmb_command_evict *command = NULL;
+ struct dxgvmbusmsg msg = {.hdr = NULL};
+
+ cmd_size = (args->alloc_count - 1) * sizeof(struct d3dkmthandle) +
+ sizeof(struct dxgkvmb_command_evict);
+ ret = init_message(&msg, adapter, process, cmd_size);
+ if (ret)
+ goto cleanup;
+ command = (void *)msg.msg;
+ ret = copy_from_user(command->allocations, args->allocations,
+ args->alloc_count *
+ sizeof(struct d3dkmthandle));
+ if (ret) {
+ pr_err("%s failed to copy alloc handles", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+ command_vgpu_to_host_init2(&command->hdr,
+ DXGK_VMBCOMMAND_EVICT, process->host_handle);
+ command->alloc_count = args->alloc_count;
+ command->device = args->device;
+ command->flags = args->flags;
+
+ ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+ &result, sizeof(result));
+ if (ret < 0) {
+ pr_err("send_evict failed %x", ret);
+ goto cleanup;
+ }
+ args->num_bytes_to_trim = result.num_bytes_to_trim;
+
+cleanup:
+
+ free_message(&msg, process);
+ if (ret)
+ pr_debug("err: %s %d", __func__, ret);
+ return ret;
+}
+
int dxgvmb_send_submit_command(struct dxgprocess *process,
struct dxgadapter *adapter,
struct d3dkmt_submitcommand *args)
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 61bdc9a91903..317a4f2938d1 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -2010,6 +2010,146 @@ dxgk_destroy_allocation(struct dxgprocess *process, void *__user inargs)
return ret;
}

+static int
+dxgk_make_resident(struct dxgprocess *process, void *__user inargs)
+{
+ int ret, ret2;
+ struct d3dddi_makeresident args;
+ struct d3dddi_makeresident *input = inargs;
+ struct dxgdevice *device = NULL;
+ struct dxgadapter *adapter = NULL;
+
+ pr_debug("ioctl: %s", __func__);
+
+ ret = copy_from_user(&args, inargs, sizeof(args));
+ if (ret) {
+ pr_err("%s failed to copy input args", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ if (args.alloc_count > D3DKMT_MAKERESIDENT_ALLOC_MAX ||
+ args.alloc_count == 0) {
+ pr_err("invalid number of allocations");
+ ret = -EINVAL;
+ goto cleanup;
+ }
+ if (args.paging_queue.v == 0) {
+ pr_err("paging queue is missing");
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ device = dxgprocess_device_by_object_handle(process,
+ HMGRENTRY_TYPE_DXGPAGINGQUEUE,
+ args.paging_queue);
+ if (device == NULL) {
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ adapter = device->adapter;
+ ret = dxgadapter_acquire_lock_shared(adapter);
+ if (ret < 0) {
+ adapter = NULL;
+ goto cleanup;
+ }
+
+ ret = dxgvmb_send_make_resident(process, adapter, &args);
+ if (ret < 0)
+ goto cleanup;
+ /* STATUS_PENDING is a success code > 0. It is returned to user mode */
+ if (!(ret == STATUS_PENDING || ret == 0)) {
+ pr_err("%s Unexpected error %x", __func__, ret);
+ goto cleanup;
+ }
+
+ ret2 = copy_to_user(&input->paging_fence_value,
+ &args.paging_fence_value, sizeof(u64));
+ if (ret2) {
+ pr_err("%s failed to copy paging fence", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ ret2 = copy_to_user(&input->num_bytes_to_trim,
+ &args.num_bytes_to_trim, sizeof(u64));
+ if (ret2) {
+ pr_err("%s failed to copy bytes to trim", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+cleanup:
+
+ if (adapter)
+ dxgadapter_release_lock_shared(adapter);
+ if (device)
+ kref_put(&device->device_kref, dxgdevice_release);
+
+ pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+
+ return ret;
+}
+
+static int
+dxgk_evict(struct dxgprocess *process, void *__user inargs)
+{
+ int ret;
+ struct d3dkmt_evict args;
+ struct d3dkmt_evict *input = inargs;
+ struct dxgdevice *device = NULL;
+ struct dxgadapter *adapter = NULL;
+
+ pr_debug("ioctl: %s", __func__);
+ ret = copy_from_user(&args, inargs, sizeof(args));
+ if (ret) {
+ pr_err("%s failed to copy input args", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ if (args.alloc_count > D3DKMT_MAKERESIDENT_ALLOC_MAX ||
+ args.alloc_count == 0) {
+ pr_err("invalid number of allocations");
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ device = dxgprocess_device_by_handle(process, args.device);
+ if (device == NULL) {
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ adapter = device->adapter;
+ ret = dxgadapter_acquire_lock_shared(adapter);
+ if (ret < 0) {
+ adapter = NULL;
+ goto cleanup;
+ }
+
+ ret = dxgvmb_send_evict(process, adapter, &args);
+ if (ret < 0)
+ goto cleanup;
+
+ ret = copy_to_user(&input->num_bytes_to_trim,
+ &args.num_bytes_to_trim, sizeof(u64));
+ if (ret) {
+ pr_err("%s failed to copy bytes to trim to user", __func__);
+ ret = -EINVAL;
+ }
+cleanup:
+
+ if (adapter)
+ dxgadapter_release_lock_shared(adapter);
+ if (device)
+ kref_put(&device->device_kref, dxgdevice_release);
+
+ pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+ return ret;
+}
+
static int
dxgk_offer_allocations(struct dxgprocess *process, void *__user inargs)
{
@@ -4972,6 +5112,8 @@ void init_ioctls(void)
LX_DXQUERYADAPTERINFO);
SET_IOCTL(/*0xa */ dxgk_query_vidmem_info,
LX_DXQUERYVIDEOMEMORYINFO);
+ SET_IOCTL(/*0xb */ dxgk_make_resident,
+ LX_DXMAKERESIDENT);
SET_IOCTL(/*0xd */ dxgk_escape,
LX_DXESCAPE);
SET_IOCTL(/*0xe */ dxgk_get_device_state,
@@ -5006,6 +5148,8 @@ void init_ioctls(void)
LX_DXDESTROYPAGINGQUEUE);
SET_IOCTL(/*0x1d */ dxgk_destroy_sync_object,
LX_DXDESTROYSYNCHRONIZATIONOBJECT);
+ SET_IOCTL(/*0x1e */ dxgk_evict,
+ LX_DXEVICT);
SET_IOCTL(/*0x1f */ dxgk_flush_heap_transitions,
LX_DXFLUSHHEAPTRANSITIONS);
SET_IOCTL(/*0x21 */ dxgk_get_context_process_scheduling_priority,
--
2.35.1


2022-02-08 11:31:54

by Iouri Tarassov

[permalink] [raw]
Subject: [PATCH v2 16/24] drivers: hv: dxgkrnl: Mmap(unmap) CPU address to device allocation: LX_DXLOCK2, LX_DXUNLOCK2

- LX_DXLOCK2(D3DKMTLock2) maps a CPU virtual address to a compute
device allocation. The allocation can be located in system
memory or in local device memory.
When the device allocation is created from guest system memory
(an existing sysmem allocation), the allocation CPU address is
already known and is returned to the caller.
For other CPU visible allocations the code flow is the following:
1. A VM bus message is sent to the host.
2. The host allocates a portion of the guest IO space and maps it
to the allocation backing store. The IO space address is returned
to the guest.
3. The guest allocates a CPU virtual address and maps it to the IO
space (dxg_map_iospace).
4. The CPU VA is returned to the caller.
cpu_address_mapped and cpu_address_refcount track how many times
an allocation has been mapped.

- LX_DXUNLOCK2(D3DKMTUnlock2) unmaps a CPU virtual address from
a compute device allocation.

Signed-off-by: Iouri Tarassov <[email protected]>
---
drivers/hv/dxgkrnl/dxgadapter.c | 11 +++
drivers/hv/dxgkrnl/dxgkrnl.h | 14 +++
drivers/hv/dxgkrnl/dxgvmbus.c | 107 +++++++++++++++++++++
drivers/hv/dxgkrnl/ioctl.c | 161 ++++++++++++++++++++++++++++++++
4 files changed, 293 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgadapter.c b/drivers/hv/dxgkrnl/dxgadapter.c
index f4e1d402fdbe..e28f5d3e4210 100644
--- a/drivers/hv/dxgkrnl/dxgadapter.c
+++ b/drivers/hv/dxgkrnl/dxgadapter.c
@@ -885,6 +885,15 @@ void dxgallocation_stop(struct dxgallocation *alloc)
vfree(alloc->pages);
alloc->pages = NULL;
}
+ dxgprocess_ht_lock_exclusive_down(alloc->process);
+ if (alloc->cpu_address_mapped) {
+ dxg_unmap_iospace(alloc->cpu_address,
+ alloc->num_pages << PAGE_SHIFT);
+ alloc->cpu_address_mapped = false;
+ alloc->cpu_address = NULL;
+ alloc->cpu_address_refcount = 0;
+ }
+ dxgprocess_ht_lock_exclusive_up(alloc->process);
}

void dxgallocation_free_handle(struct dxgallocation *alloc)
@@ -926,6 +935,8 @@ void dxgallocation_destroy(struct dxgallocation *alloc)
}
if (alloc->priv_drv_data)
vfree(alloc->priv_drv_data);
+ if (alloc->cpu_address_mapped)
+ pr_err("Alloc IO space is mapped: %p", alloc);
vfree(alloc);
}

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 892c95a11368..fa46abf0350a 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -691,6 +691,8 @@ struct dxgallocation {
struct d3dkmthandle alloc_handle;
/* Set to 1 when allocation belongs to resource. */
u32 resource_owner:1;
+ /* Set to 1 when 'cpu_address' is mapped to the IO space. */
+ u32 cpu_address_mapped:1;
/* Set to 1 when the allocatio is mapped as cached */
u32 cached:1;
u32 handle_valid:1;
@@ -698,6 +700,11 @@ struct dxgallocation {
struct vmbus_gpadl gpadl;
/* Number of pages in the 'pages' array */
u32 num_pages;
+ /*
+ * Number of times dxgk_lock2 was called for an allocation that is
+ * mapped to the IO space.
+ */
+ u32 cpu_address_refcount;
/*
* CPU address from the existing sysmem allocation, or
* mapped to the CPU visible backing store in the IO space
@@ -811,6 +818,13 @@ int dxgvmb_send_wait_sync_object_cpu(struct dxgprocess *process,
d3dkmt_waitforsynchronizationobjectfromcpu
*args,
u64 cpu_event);
+int dxgvmb_send_lock2(struct dxgprocess *process,
+ struct dxgadapter *adapter,
+ struct d3dkmt_lock2 *args,
+ struct d3dkmt_lock2 *__user outargs);
+int dxgvmb_send_unlock2(struct dxgprocess *process,
+ struct dxgadapter *adapter,
+ struct d3dkmt_unlock2 *args);
int dxgvmb_send_create_hwqueue(struct dxgprocess *process,
struct dxgadapter *adapter,
struct d3dkmt_createhwqueue *args,
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index b5225756ee6e..a064efa320cf 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -2294,6 +2294,113 @@ int dxgvmb_send_wait_sync_object_gpu(struct dxgprocess *process,
return ret;
}

+int dxgvmb_send_lock2(struct dxgprocess *process,
+ struct dxgadapter *adapter,
+ struct d3dkmt_lock2 *args,
+ struct d3dkmt_lock2 *__user outargs)
+{
+ int ret;
+ struct dxgkvmb_command_lock2 *command;
+ struct dxgkvmb_command_lock2_return result = { };
+ struct dxgallocation *alloc = NULL;
+ struct dxgvmbusmsg msg = {.hdr = NULL};
+
+ ret = init_message(&msg, adapter, process, sizeof(*command));
+ if (ret)
+ goto cleanup;
+ command = (void *)msg.msg;
+
+ command_vgpu_to_host_init2(&command->hdr,
+ DXGK_VMBCOMMAND_LOCK2, process->host_handle);
+ command->args = *args;
+
+ ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size,
+ &result, sizeof(result));
+ if (ret < 0)
+ goto cleanup;
+
+ ret = ntstatus2int(result.status);
+ if (ret < 0)
+ goto cleanup;
+
+ hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+ alloc = hmgrtable_get_object_by_type(&process->handle_table,
+ HMGRENTRY_TYPE_DXGALLOCATION,
+ args->allocation);
+ if (alloc == NULL) {
+ pr_err("%s invalid alloc", __func__);
+ ret = -EINVAL;
+ } else {
+ if (alloc->cpu_address) {
+ args->data = alloc->cpu_address;
+ if (alloc->cpu_address_mapped)
+ alloc->cpu_address_refcount++;
+ } else {
+ u64 offset = (u64)result.cpu_visible_buffer_offset;
+
+ args->data = dxg_map_iospace(offset,
+ alloc->num_pages << PAGE_SHIFT,
+ PROT_READ | PROT_WRITE, alloc->cached);
+ if (args->data) {
+ alloc->cpu_address_refcount = 1;
+ alloc->cpu_address_mapped = true;
+ alloc->cpu_address = args->data;
+ }
+ }
+ if (args->data == NULL) {
+ ret = -ENOMEM;
+ } else {
+ ret = copy_to_user(&outargs->data, &args->data,
+ sizeof(args->data));
+ if (ret) {
+ pr_err("%s failed to copy data", __func__);
+ ret = -EINVAL;
+ alloc->cpu_address_refcount--;
+ if (alloc->cpu_address_refcount == 0) {
+ dxg_unmap_iospace(alloc->cpu_address,
+ alloc->num_pages << PAGE_SHIFT);
+ alloc->cpu_address_mapped = false;
+ alloc->cpu_address = NULL;
+ }
+ }
+ }
+ }
+ hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+
+cleanup:
+ free_message(&msg, process);
+ if (ret)
+ pr_debug("err: %s %d", __func__, ret);
+ return ret;
+}
+
+int dxgvmb_send_unlock2(struct dxgprocess *process,
+ struct dxgadapter *adapter,
+ struct d3dkmt_unlock2 *args)
+{
+ int ret;
+ struct dxgkvmb_command_unlock2 *command;
+ struct dxgvmbusmsg msg = {.hdr = NULL};
+
+ ret = init_message(&msg, adapter, process, sizeof(*command));
+ if (ret)
+ goto cleanup;
+ command = (void *)msg.msg;
+
+ command_vgpu_to_host_init2(&command->hdr,
+ DXGK_VMBCOMMAND_UNLOCK2,
+ process->host_handle);
+ command->args = *args;
+
+ ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+
+cleanup:
+ free_message(&msg, process);
+ if (ret)
+ pr_debug("err: %s %d", __func__, ret);
+ return ret;
+}
+
int dxgvmb_send_create_hwqueue(struct dxgprocess *process,
struct dxgadapter *adapter,
struct d3dkmt_createhwqueue *args,
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index bb6eab08898f..12a4593c6aa5 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -3180,6 +3180,163 @@ dxgk_wait_sync_object_gpu(struct dxgprocess *process, void *__user inargs)
return ret;
}

+static int
+dxgk_lock2(struct dxgprocess *process, void *__user inargs)
+{
+ struct d3dkmt_lock2 args;
+ struct d3dkmt_lock2 *__user result = inargs;
+ int ret;
+ struct dxgadapter *adapter = NULL;
+ struct dxgdevice *device = NULL;
+ struct dxgallocation *alloc = NULL;
+
+ ret = copy_from_user(&args, inargs, sizeof(args));
+ if (ret) {
+ pr_err("%s failed to copy input args", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ args.data = NULL;
+ hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+ alloc = hmgrtable_get_object_by_type(&process->handle_table,
+ HMGRENTRY_TYPE_DXGALLOCATION,
+ args.allocation);
+ if (alloc == NULL) {
+ ret = -EINVAL;
+ } else {
+ if (alloc->cpu_address) {
+ ret = copy_to_user(&result->data,
+ &alloc->cpu_address,
+ sizeof(args.data));
+ if (ret == 0) {
+ args.data = alloc->cpu_address;
+ if (alloc->cpu_address_mapped)
+ alloc->cpu_address_refcount++;
+ } else {
+ pr_err("%s Failed to copy cpu address",
+ __func__);
+ ret = -EINVAL;
+ }
+ }
+ }
+ hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+ if (ret < 0)
+ goto cleanup;
+ if (args.data)
+ goto success;
+
+ /*
+ * The call acquires reference on the device. It is safe to access the
+ * adapter, because the device holds reference on it.
+ */
+ device = dxgprocess_device_by_handle(process, args.device);
+ if (device == NULL) {
+ ret = -EINVAL;
+ goto cleanup;
+ }
+ adapter = device->adapter;
+ ret = dxgadapter_acquire_lock_shared(adapter);
+ if (ret < 0) {
+ adapter = NULL;
+ goto cleanup;
+ }
+
+ ret = dxgvmb_send_lock2(process, adapter, &args, result);
+
+cleanup:
+
+ if (adapter)
+ dxgadapter_release_lock_shared(adapter);
+
+ if (device)
+ kref_put(&device->device_kref, dxgdevice_release);
+
+success:
+ pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+ return ret;
+}
+
+static int
+dxgk_unlock2(struct dxgprocess *process, void *__user inargs)
+{
+ struct d3dkmt_unlock2 args;
+ int ret;
+ struct dxgadapter *adapter = NULL;
+ struct dxgdevice *device = NULL;
+ struct dxgallocation *alloc = NULL;
+ bool done = false;
+
+ ret = copy_from_user(&args, inargs, sizeof(args));
+ if (ret) {
+ pr_err("%s failed to copy input args", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ hmgrtable_lock(&process->handle_table, DXGLOCK_EXCL);
+ alloc = hmgrtable_get_object_by_type(&process->handle_table,
+ HMGRENTRY_TYPE_DXGALLOCATION,
+ args.allocation);
+ if (alloc == NULL) {
+ ret = -EINVAL;
+ } else {
+ if (alloc->cpu_address == NULL) {
+ pr_err("Allocation is not locked: %p", alloc);
+ ret = -EINVAL;
+ } else if (alloc->cpu_address_mapped) {
+ if (alloc->cpu_address_refcount > 0) {
+ alloc->cpu_address_refcount--;
+ if (alloc->cpu_address_refcount != 0) {
+ done = true;
+ } else {
+ dxg_unmap_iospace(alloc->cpu_address,
+ alloc->num_pages << PAGE_SHIFT);
+ alloc->cpu_address_mapped = false;
+ alloc->cpu_address = NULL;
+ }
+ } else {
+ pr_err("Invalid cpu access refcount");
+ done = true;
+ }
+ }
+ }
+ hmgrtable_unlock(&process->handle_table, DXGLOCK_EXCL);
+ if (done)
+ goto success;
+ if (ret < 0)
+ goto cleanup;
+
+ /*
+ * The call acquires reference on the device. It is safe to access the
+ * adapter, because the device holds reference on it.
+ */
+ device = dxgprocess_device_by_handle(process, args.device);
+ if (device == NULL) {
+ ret = -EINVAL;
+ goto cleanup;
+ }
+ adapter = device->adapter;
+ ret = dxgadapter_acquire_lock_shared(adapter);
+ if (ret < 0) {
+ adapter = NULL;
+ goto cleanup;
+ }
+
+ ret = dxgvmb_send_unlock2(process, adapter, &args);
+
+cleanup:
+ if (adapter)
+ dxgadapter_release_lock_shared(adapter);
+
+ if (device)
+ kref_put(&device->device_kref, dxgdevice_release);
+
+success:
+ pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+ return ret;
+}
+
static int
dxgk_get_device_state(struct dxgprocess *process, void *__user inargs)
{
@@ -4011,6 +4168,8 @@ void init_ioctls(void)
LX_DXDESTROYPAGINGQUEUE);
SET_IOCTL(/*0x1d */ dxgk_destroy_sync_object,
LX_DXDESTROYSYNCHRONIZATIONOBJECT);
+ SET_IOCTL(/*0x25 */ dxgk_lock2,
+ LX_DXLOCK2);
SET_IOCTL(/*0x31 */ dxgk_signal_sync_object_cpu,
LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMCPU);
SET_IOCTL(/*0x32 */ dxgk_signal_sync_object_gpu,
@@ -4023,6 +4182,8 @@ void init_ioctls(void)
LX_DXSUBMITWAITFORSYNCOBJECTSTOHWQUEUE);
SET_IOCTL(/*0x36 */ dxgk_submit_signal_to_hwqueue,
LX_DXSUBMITSIGNALSYNCOBJECTSTOHWQUEUE);
+ SET_IOCTL(/*0x37 */ dxgk_unlock2,
+ LX_DXUNLOCK2);
SET_IOCTL(/*0x3a */ dxgk_wait_sync_object_cpu,
LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMCPU);
SET_IOCTL(/*0x3b */ dxgk_wait_sync_object_gpu,
--
2.35.1


2022-02-08 13:15:29

by Iouri Tarassov

[permalink] [raw]
Subject: [PATCH v2 23/24] drivers: hv: dxgkrnl: IOCTLs to handle GPU virtual addressing (GPU VA)

LX_DXRESERVEGPUVIRTUALADDRESS(D3DKMTReserveGpuVirtualAddress)
LX_DXFREEGPUVIRTUALADDRESS(D3DKMTFreeGpuVirtualAddress)
LX_DXMAPGPUVIRTUALADDRESS(D3DKMTMapGpuVirtualAddress)
LX_DXUPDATEGPUVIRTUALADDRESS(D3DKMTUpdateGpuVirtualAddress)

WDDM supports accessing GPU memory by using GPU virtual addresses
(GPU VAs). Each process has a dedicated GPU VA address space.
A GPU VA can be reserved, mapped to a GPU allocation, updated or
freed.
The video memory manager on the host updates the GPU page tables
for the GPU virtual addresses.
The IOCTLs are simply forwarded to the host and the results are
returned to the caller.

Signed-off-by: Iouri Tarassov <[email protected]>
---
drivers/hv/dxgkrnl/dxgkrnl.h | 10 ++
drivers/hv/dxgkrnl/dxgvmbus.c | 150 ++++++++++++++++++++++
drivers/hv/dxgkrnl/ioctl.c | 230 ++++++++++++++++++++++++++++++++++
3 files changed, 390 insertions(+)

diff --git a/drivers/hv/dxgkrnl/dxgkrnl.h b/drivers/hv/dxgkrnl/dxgkrnl.h
index 70cb31654aac..4a2b4c7611ff 100644
--- a/drivers/hv/dxgkrnl/dxgkrnl.h
+++ b/drivers/hv/dxgkrnl/dxgkrnl.h
@@ -791,6 +791,16 @@ int dxgvmb_send_evict(struct dxgprocess *pr, struct dxgadapter *adapter,
int dxgvmb_send_submit_command(struct dxgprocess *pr,
struct dxgadapter *adapter,
struct d3dkmt_submitcommand *args);
+int dxgvmb_send_map_gpu_va(struct dxgprocess *pr, struct d3dkmthandle h,
+ struct dxgadapter *adapter,
+ struct d3dddi_mapgpuvirtualaddress *args);
+int dxgvmb_send_reserve_gpu_va(struct dxgprocess *pr,
+ struct dxgadapter *adapter,
+ struct d3dddi_reservegpuvirtualaddress *args);
+int dxgvmb_send_free_gpu_va(struct dxgprocess *pr, struct dxgadapter *adapter,
+ struct d3dkmt_freegpuvirtualaddress *args);
+int dxgvmb_send_update_gpu_va(struct dxgprocess *pr, struct dxgadapter *adapter,
+ struct d3dkmt_updategpuvirtualaddress *args);
int dxgvmb_send_create_sync_object(struct dxgprocess *pr,
struct dxgadapter *adapter,
struct d3dkmt_createsynchronizationobject2
diff --git a/drivers/hv/dxgkrnl/dxgvmbus.c b/drivers/hv/dxgkrnl/dxgvmbus.c
index 4299127af863..7a4b17938f53 100644
--- a/drivers/hv/dxgkrnl/dxgvmbus.c
+++ b/drivers/hv/dxgkrnl/dxgvmbus.c
@@ -2376,6 +2376,156 @@ int dxgvmb_send_submit_command(struct dxgprocess *process,
return ret;
}

+int dxgvmb_send_map_gpu_va(struct dxgprocess *process,
+ struct d3dkmthandle device,
+ struct dxgadapter *adapter,
+ struct d3dddi_mapgpuvirtualaddress *args)
+{
+ struct dxgkvmb_command_mapgpuvirtualaddress *command;
+ struct dxgkvmb_command_mapgpuvirtualaddress_return result;
+ int ret;
+ struct dxgvmbusmsg msg = {.hdr = NULL};
+
+ ret = init_message(&msg, adapter, process, sizeof(*command));
+ if (ret)
+ goto cleanup;
+ command = (void *)msg.msg;
+
+ command_vgpu_to_host_init2(&command->hdr,
+ DXGK_VMBCOMMAND_MAPGPUVIRTUALADDRESS,
+ process->host_handle);
+ command->args = *args;
+ command->device = device;
+
+ ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size, &result,
+ sizeof(result));
+ if (ret < 0)
+ goto cleanup;
+ args->virtual_address = result.virtual_address;
+ args->paging_fence_value = result.paging_fence_value;
+ ret = ntstatus2int(result.status);
+
+cleanup:
+
+ free_message(&msg, process);
+ if (ret)
+ pr_debug("err: %s %d", __func__, ret);
+ return ret;
+}
+
+int dxgvmb_send_reserve_gpu_va(struct dxgprocess *process,
+ struct dxgadapter *adapter,
+ struct d3dddi_reservegpuvirtualaddress *args)
+{
+ struct dxgkvmb_command_reservegpuvirtualaddress *command;
+ struct dxgkvmb_command_reservegpuvirtualaddress_return result;
+ int ret;
+ struct dxgvmbusmsg msg = {.hdr = NULL};
+
+ ret = init_message(&msg, adapter, process, sizeof(*command));
+ if (ret)
+ goto cleanup;
+ command = (void *)msg.msg;
+
+ command_vgpu_to_host_init2(&command->hdr,
+ DXGK_VMBCOMMAND_RESERVEGPUVIRTUALADDRESS,
+ process->host_handle);
+ command->args = *args;
+
+ ret = dxgvmb_send_sync_msg(msg.channel, msg.hdr, msg.size, &result,
+ sizeof(result));
+ if (ret < 0)
+ goto cleanup;
+ args->virtual_address = result.virtual_address;
+
+cleanup:
+ free_message(&msg, process);
+ if (ret)
+ pr_debug("err: %s %d", __func__, ret);
+ return ret;
+}
+
+int dxgvmb_send_free_gpu_va(struct dxgprocess *process,
+ struct dxgadapter *adapter,
+ struct d3dkmt_freegpuvirtualaddress *args)
+{
+ struct dxgkvmb_command_freegpuvirtualaddress *command;
+ int ret;
+ struct dxgvmbusmsg msg = {.hdr = NULL};
+
+ ret = init_message(&msg, adapter, process, sizeof(*command));
+ if (ret)
+ goto cleanup;
+ command = (void *)msg.msg;
+
+ command_vgpu_to_host_init2(&command->hdr,
+ DXGK_VMBCOMMAND_FREEGPUVIRTUALADDRESS,
+ process->host_handle);
+ command->args = *args;
+
+ ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+
+cleanup:
+ free_message(&msg, process);
+ if (ret)
+ pr_debug("err: %s %d", __func__, ret);
+ return ret;
+}
+
+int dxgvmb_send_update_gpu_va(struct dxgprocess *process,
+ struct dxgadapter *adapter,
+ struct d3dkmt_updategpuvirtualaddress *args)
+{
+ struct dxgkvmb_command_updategpuvirtualaddress *command;
+ u32 cmd_size;
+ u32 op_size;
+ int ret;
+ struct dxgvmbusmsg msg = {.hdr = NULL};
+
+ if (args->num_operations == 0 ||
+ (DXG_MAX_VM_BUS_PACKET_SIZE /
+ sizeof(struct d3dddi_updategpuvirtualaddress_operation)) <
+ args->num_operations) {
+ ret = -EINVAL;
+ pr_err("Invalid number of operations: %d",
+ args->num_operations);
+ goto cleanup;
+ }
+
+ op_size = args->num_operations *
+ sizeof(struct d3dddi_updategpuvirtualaddress_operation);
+ cmd_size = sizeof(struct dxgkvmb_command_updategpuvirtualaddress) +
+ op_size - sizeof(args->operations[0]);
+
+ ret = init_message(&msg, adapter, process, cmd_size);
+ if (ret)
+ goto cleanup;
+ command = (void *)msg.msg;
+
+ command_vgpu_to_host_init2(&command->hdr,
+ DXGK_VMBCOMMAND_UPDATEGPUVIRTUALADDRESS,
+ process->host_handle);
+ command->fence_value = args->fence_value;
+ command->device = args->device;
+ command->context = args->context;
+ command->fence_object = args->fence_object;
+ command->num_operations = args->num_operations;
+ command->flags = args->flags.value;
+ ret = copy_from_user(command->operations, args->operations,
+ op_size);
+ if (ret) {
+ pr_err("%s failed to copy operations", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ ret = dxgvmb_send_sync_msg_ntstatus(msg.channel, msg.hdr, msg.size);
+
+cleanup:
+ free_message(&msg, process);
+ if (ret)
+ pr_debug("err: %s %d", __func__, ret);
+ return ret;
+}
+
static void set_result(struct d3dkmt_createsynchronizationobject2 *args,
u64 fence_gpu_va, u8 *va)
{
diff --git a/drivers/hv/dxgkrnl/ioctl.c b/drivers/hv/dxgkrnl/ioctl.c
index 317a4f2938d1..26bdcdfeba86 100644
--- a/drivers/hv/dxgkrnl/ioctl.c
+++ b/drivers/hv/dxgkrnl/ioctl.c
@@ -2549,6 +2549,228 @@ dxgk_submit_wait_to_hwqueue(struct dxgprocess *process, void *__user inargs)
return ret;
}

+static int
+dxgk_map_gpu_va(struct dxgprocess *process, void *__user inargs)
+{
+ int ret, ret2;
+ struct d3dddi_mapgpuvirtualaddress args;
+ struct d3dddi_mapgpuvirtualaddress *input = inargs;
+ struct dxgdevice *device = NULL;
+ struct dxgadapter *adapter = NULL;
+
+ pr_debug("ioctl: %s", __func__);
+ ret = copy_from_user(&args, inargs, sizeof(args));
+ if (ret) {
+ pr_err("%s failed to copy input args", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ device = dxgprocess_device_by_object_handle(process,
+ HMGRENTRY_TYPE_DXGPAGINGQUEUE,
+ args.paging_queue);
+ if (device == NULL) {
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ adapter = device->adapter;
+ ret = dxgadapter_acquire_lock_shared(adapter);
+ if (ret < 0) {
+ adapter = NULL;
+ goto cleanup;
+ }
+
+ ret = dxgvmb_send_map_gpu_va(process, zerohandle, adapter, &args);
+ if (ret < 0)
+ goto cleanup;
+ /* STATUS_PENDING is a success code > 0. It is returned to user mode */
+ if (!(ret == STATUS_PENDING || ret == 0)) {
+ pr_err("%s Unexpected error %x", __func__, ret);
+ goto cleanup;
+ }
+
+ ret2 = copy_to_user(&input->paging_fence_value,
+ &args.paging_fence_value, sizeof(u64));
+ if (ret2) {
+ pr_err("%s failed to copy paging fence to user", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ ret2 = copy_to_user(&input->virtual_address, &args.virtual_address,
+ sizeof(args.virtual_address));
+ if (ret2) {
+ pr_err("%s failed to copy va to user", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+cleanup:
+
+ if (adapter)
+ dxgadapter_release_lock_shared(adapter);
+ if (device)
+ kref_put(&device->device_kref, dxgdevice_release);
+
+ pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+ return ret;
+}
+
+static int
+dxgk_reserve_gpu_va(struct dxgprocess *process, void *__user inargs)
+{
+ int ret;
+ struct d3dddi_reservegpuvirtualaddress args;
+ struct d3dddi_reservegpuvirtualaddress *input = inargs;
+ struct dxgadapter *adapter = NULL;
+ struct dxgdevice *device = NULL;
+
+ pr_debug("ioctl: %s", __func__);
+ ret = copy_from_user(&args, inargs, sizeof(args));
+ if (ret) {
+ pr_err("%s failed to copy input args", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ adapter = dxgprocess_adapter_by_handle(process, args.adapter);
+ if (adapter == NULL) {
+ device = dxgprocess_device_by_object_handle(process,
+ HMGRENTRY_TYPE_DXGPAGINGQUEUE,
+ args.adapter);
+ if (device == NULL) {
+ pr_err("invalid adapter or paging queue: 0x%x",
+ args.adapter.v);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+ adapter = device->adapter;
+ kref_get(&adapter->adapter_kref);
+ kref_put(&device->device_kref, dxgdevice_release);
+ } else {
+ args.adapter = adapter->host_handle;
+ }
+
+ ret = dxgadapter_acquire_lock_shared(adapter);
+ if (ret < 0) {
+ kref_put(&adapter->adapter_kref, dxgadapter_release);
+ adapter = NULL;
+ goto cleanup;
+ }
+
+ ret = dxgvmb_send_reserve_gpu_va(process, adapter, &args);
+ if (ret < 0)
+ goto cleanup;
+
+ ret = copy_to_user(&input->virtual_address, &args.virtual_address,
+ sizeof(args.virtual_address));
+ if (ret) {
+ pr_err("%s failed to copy VA to user", __func__);
+ ret = -EINVAL;
+ }
+
+cleanup:
+
+ if (adapter) {
+ dxgadapter_release_lock_shared(adapter);
+ kref_put(&adapter->adapter_kref, dxgadapter_release);
+ }
+
+ pr_debug("ioctl:%s %s %d", errorstr(ret), __func__, ret);
+ return ret;
+}
+
+static int
+dxgk_free_gpu_va(struct dxgprocess *process, void *__user inargs)
+{
+ int ret;
+ struct d3dkmt_freegpuvirtualaddress args;
+ struct dxgadapter *adapter = NULL;
+
+ ret = copy_from_user(&args, inargs, sizeof(args));
+ if (ret) {
+ pr_err("%s failed to copy input args", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ adapter = dxgprocess_adapter_by_handle(process, args.adapter);
+ if (adapter == NULL) {
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ ret = dxgadapter_acquire_lock_shared(adapter);
+ if (ret < 0) {
+ kref_put(&adapter->adapter_kref, dxgadapter_release);
+ adapter = NULL;
+ goto cleanup;
+ }
+
+ args.adapter = adapter->host_handle;
+ ret = dxgvmb_send_free_gpu_va(process, adapter, &args);
+
+cleanup:
+
+ if (adapter) {
+ dxgadapter_release_lock_shared(adapter);
+ kref_put(&adapter->adapter_kref, dxgadapter_release);
+ }
+
+ return ret;
+}
+
+static int
+dxgk_update_gpu_va(struct dxgprocess *process, void *__user inargs)
+{
+ int ret;
+ struct d3dkmt_updategpuvirtualaddress args;
+ struct d3dkmt_updategpuvirtualaddress *input = inargs;
+ struct dxgadapter *adapter = NULL;
+ struct dxgdevice *device = NULL;
+
+ ret = copy_from_user(&args, inargs, sizeof(args));
+ if (ret) {
+ pr_err("%s failed to copy input args", __func__);
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ device = dxgprocess_device_by_handle(process, args.device);
+ if (device == NULL) {
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ adapter = device->adapter;
+ ret = dxgadapter_acquire_lock_shared(adapter);
+ if (ret < 0) {
+ adapter = NULL;
+ goto cleanup;
+ }
+
+ ret = dxgvmb_send_update_gpu_va(process, adapter, &args);
+ if (ret < 0)
+ goto cleanup;
+
+ ret = copy_to_user(&input->fence_value, &args.fence_value,
+ sizeof(args.fence_value));
+ if (ret) {
+ pr_err("%s failed to copy fence value to user", __func__);
+ ret = -EINVAL;
+ }
+
+cleanup:
+
+ if (adapter)
+ dxgadapter_release_lock_shared(adapter);
+ if (device)
+ kref_put(&device->device_kref, dxgdevice_release);
+
+ return ret;
+}
+
static int
dxgk_create_sync_object(struct dxgprocess *process, void *__user inargs)
{
@@ -5108,12 +5330,16 @@ void init_ioctls(void)
LX_DXCREATEALLOCATION);
SET_IOCTL(/*0x7 */ dxgk_create_paging_queue,
LX_DXCREATEPAGINGQUEUE);
+ SET_IOCTL(/*0x8 */ dxgk_reserve_gpu_va,
+ LX_DXRESERVEGPUVIRTUALADDRESS);
SET_IOCTL(/*0x9 */ dxgk_query_adapter_info,
LX_DXQUERYADAPTERINFO);
SET_IOCTL(/*0xa */ dxgk_query_vidmem_info,
LX_DXQUERYVIDEOMEMORYINFO);
SET_IOCTL(/*0xb */ dxgk_make_resident,
LX_DXMAKERESIDENT);
+ SET_IOCTL(/*0xc */ dxgk_map_gpu_va,
+ LX_DXMAPGPUVIRTUALADDRESS);
SET_IOCTL(/*0xd */ dxgk_escape,
LX_DXESCAPE);
SET_IOCTL(/*0xe */ dxgk_get_device_state,
@@ -5152,6 +5378,8 @@ void init_ioctls(void)
LX_DXEVICT);
SET_IOCTL(/*0x1f */ dxgk_flush_heap_transitions,
LX_DXFLUSHHEAPTRANSITIONS);
+ SET_IOCTL(/*0x20 */ dxgk_free_gpu_va,
+ LX_DXFREEGPUVIRTUALADDRESS);
SET_IOCTL(/*0x21 */ dxgk_get_context_process_scheduling_priority,
LX_DXGETCONTEXTINPROCESSSCHEDULINGPRIORITY);
SET_IOCTL(/*0x22 */ dxgk_get_context_scheduling_priority,
@@ -5200,6 +5428,8 @@ void init_ioctls(void)
LX_DXUNLOCK2);
SET_IOCTL(/*0x38 */ dxgk_update_alloc_property,
LX_DXUPDATEALLOCPROPERTY);
+ SET_IOCTL(/*0x39 */ dxgk_update_gpu_va,
+ LX_DXUPDATEGPUVIRTUALADDRESS);
SET_IOCTL(/*0x3a */ dxgk_wait_sync_object_cpu,
LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMCPU);
SET_IOCTL(/*0x3b */ dxgk_wait_sync_object_gpu,
--
2.35.1