LinuxLists.cc - [PATCH drm-next v4 00/14] [RFC] DRM GPUVA Manager & Nouveau VM

2023-06-06 22:54:06

Subject: [PATCH drm-next v4 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

With the DRM GPUVA manager, this series also provides a new DRM core feature to
keep track of GPU virtual address (VA) mappings in a more generic way.

The DRM GPUVA manager is intended to help drivers implement userspace-manageable
GPU VA spaces in reference to the Vulkan API. To achieve this goal, it
serves the following purposes:

1) Provide infrastructure to track GPU VA allocations and mappings,
making use of the maple_tree.

2) Generically connect GPU VA mappings to their backing buffers, in
particular DRM GEM objects.

3) Provide a common implementation to perform more complex mapping
operations on the GPU VA space. In particular splitting and merging
of GPU VA mappings, e.g. for intersecting mapping requests or partial
unmap requests.
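
As a rough illustration of the splitting described in 3), the sketch below
computes which parts of an existing mapping survive a partial unmap request.
It is a standalone userspace model, not the manager's implementation; struct
va_range, struct split_result and split_for_unmap() are hypothetical names
introduced only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a mapping covering [addr, addr + range). */
struct va_range {
	uint64_t addr;
	uint64_t range;
};

/* What remains of a mapping after part of it has been unmapped. */
struct split_result {
	bool has_prev;          /* part below the unmapped area survives */
	bool has_next;          /* part above the unmapped area survives */
	struct va_range prev;   /* valid only if has_prev is set */
	struct va_range next;   /* valid only if has_next is set */
};

/*
 * Compute the remainders of mapping 'm' when 'req' is unmapped.
 * Returns false if the two ranges do not intersect at all.
 * Assumes neither range wraps around the 64-bit address space.
 */
static bool split_for_unmap(struct va_range m, struct va_range req,
			    struct split_result *out)
{
	uint64_t m_end = m.addr + m.range;
	uint64_t r_end = req.addr + req.range;

	out->has_prev = false;
	out->has_next = false;

	if (m.range == 0 || req.range == 0 ||
	    r_end <= m.addr || req.addr >= m_end)
		return false;

	/* Keep the piece in front of the unmapped area, if any. */
	if (req.addr > m.addr) {
		out->has_prev = true;
		out->prev.addr = m.addr;
		out->prev.range = req.addr - m.addr;
	}

	/* Keep the piece behind the unmapped area, if any. */
	if (r_end < m_end) {
		out->has_next = true;
		out->next.addr = r_end;
		out->next.range = m_end - r_end;
	}

	return true;
}
```

Unmapping the middle of a mapping yields both a prev and a next remainder;
unmapping the whole mapping yields neither.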

The new VM_BIND Nouveau UAPI builds on top of the DRM GPUVA manager and itself
provides the following new interfaces.

1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl
for UMDs to specify the portion of VA space managed by the kernel and
userspace, respectively.

2) Allocate and free a VA space region as well as bind and unbind memory
to the GPU's VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl.

3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.

Both DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC make use of the DRM
scheduler to queue jobs and support asynchronous processing with DRM syncobjs
as the synchronization mechanism.

By default, DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing, whereas
DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.
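
To make the split of the VA space into a kernel-managed and a
userspace-managed portion more concrete, here is a minimal userspace sketch of
the kind of check it implies for a userspace bind request. The UAPI structures
are not shown in this text, so struct va_layout and all function names below
are hypothetical and do not reflect the actual ioctl arguments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a VA space with a kernel-managed portion. */
struct va_layout {
	uint64_t start;         /* first address of the whole VA space */
	uint64_t range;         /* size of the whole VA space */
	uint64_t kernel_addr;   /* start of the kernel-managed portion */
	uint64_t kernel_range;  /* size of the kernel-managed portion */
};

/* True if [addr, addr + range) lies entirely in [start, start + size). */
static bool range_within(uint64_t addr, uint64_t range,
			 uint64_t start, uint64_t size)
{
	return range <= size && addr >= start &&
	       addr - start <= size - range;
}

/* True if the two ranges share at least one address (overflow-safe). */
static bool ranges_overlap(uint64_t a, uint64_t a_range,
			   uint64_t b, uint64_t b_range)
{
	if (a_range == 0 || b_range == 0)
		return false;
	if (a >= b)
		return a - b < b_range;
	return b - a < a_range;
}

/*
 * A userspace request is acceptable in this model if it is non-empty,
 * fits into the VA space and stays out of the kernel-managed portion.
 */
static bool user_range_valid(const struct va_layout *l,
			     uint64_t addr, uint64_t range)
{
	if (range == 0)
		return false;
	if (!range_within(addr, range, l->start, l->range))
		return false;
	return !ranges_overlap(addr, range, l->kernel_addr, l->kernel_range);
}
```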

The new VM_BIND UAPI for Nouveau also makes use of drm_exec (execution context
for GEM buffers) by Christian König. Since the patch implementing drm_exec has
not yet been merged into drm-next, it is part of this series, as well as a small
fix for this patch, which was found while testing this series.

This patch series is also available at [1].

There is a Mesa NVK merge request by Dave Airlie [2] implementing the
corresponding userspace parts for this series.

The Vulkan CTS test suite passes the sparse binding and sparse residency test
cases for the new UAPI together with Dave's Mesa work.

There are also some test cases in the igt-gpu-tools project [3] for the new UAPI
and hence the DRM GPU VA manager. However, most of them are testing the DRM GPU
VA manager's logic through Nouveau's new UAPI and should be considered merely a
helper for the implementation.

I do intend to convert those test cases to proper KUnit test cases for the DRM
GPUVA manager, once and if we agree on its usefulness and design.

[1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next /
https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
[2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
[3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind

Changes in V2:
==============
Nouveau:
- Reworked the Nouveau VM_BIND UAPI to avoid memory allocations in fence
signalling critical sections. Updates to the VA space are split up into three
separate stages, where only the second stage executes in a fence signalling
critical section:

1. update the VA space, allocate new structures and page tables
2. (un-)map the requested memory bindings
3. free structures and page tables

- Separated generic job scheduler code from specific job implementations.
- Separated the EXEC and VM_BIND implementation of the UAPI.
- Reworked the locking parts of the nvkm/vmm RAW interface, such that
(un-)map operations can be executed in fence signalling critical sections.
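
The three-stage split listed above can be modelled in plain userspace C as
follows. This is only a sketch of the pattern (allocate up front, do the work
without allocating, free afterwards); struct bind_job and its functions are
hypothetical and are not taken from the Nouveau code.

```c
#include <stdlib.h>

/* Hypothetical bind job: everything run() needs is allocated up front. */
struct bind_job {
	unsigned int nr_ops;   /* number of (un-)map operations */
	int *slots;            /* pre-allocated per-operation state */
	unsigned int done;     /* operations completed by run() */
};

/* Stage 1: may allocate and may fail; runs before the critical section. */
static int bind_job_prepare(struct bind_job *job, unsigned int nr_ops)
{
	job->slots = calloc(nr_ops, sizeof(*job->slots));
	if (!job->slots && nr_ops)
		return -1;

	job->nr_ops = nr_ops;
	job->done = 0;
	return 0;
}

/* Stage 2: models the critical section; no allocation, cannot fail. */
static void bind_job_run(struct bind_job *job)
{
	for (unsigned int i = 0; i < job->nr_ops; i++) {
		job->slots[i] = 1; /* stand-in for the actual (un-)map */
		job->done++;
	}
}

/* Stage 3: frees what stage 1 allocated; runs after the critical section. */
static void bind_job_cleanup(struct bind_job *job)
{
	free(job->slots);
	job->slots = NULL;
	job->nr_ops = 0;
}
```

The point of the pattern is that anything that can block on memory reclaim is
kept out of the second stage.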

GPUVA Manager:
- made drm_gpuva_regions optional for users of the GPUVA manager
- allow NULL GEMs for drm_gpuva entries
- switched from drm_mm to maple_tree for tracking drm_gpuva / drm_gpuva_region
entries
- provide callbacks for users to allocate custom drm_gpuva_op structures to
allow inheritance
- added user bits to drm_gpuva_flags
- added a prefetch operation type in order to support generating prefetch
operations in the same way other operations are generated
- hand the responsibility for mutual exclusion for a GEM's
drm_gpuva list to the user; simplified corresponding (un-)link functions

Maple Tree:
- I added two maple tree patches to the series, one to support custom tree
walk macros and one to hand the locking responsibility to the user of the
GPUVA manager without pre-defined lockdep checks.

Changes in V3:
==============
Nouveau:
- Reworked the Nouveau VM_BIND UAPI to do the job cleanup (including page
table cleanup) within a workqueue rather than the job_free() callback of
the scheduler itself. A job_free() callback can stall the execution (run()
callback) of the next job in the queue. Since the page table cleanup
requires taking the same locks that are needed for page table
allocation, doing it directly in the job_free() callback would still
violate the fence signalling critical path.
- Separated Nouveau fence allocation and emit, such that we do not violate
the fence signalling critical path in EXEC jobs.
- Implement "regions" (for handling sparse mappings through PDEs and dual
page tables) within Nouveau.
- Drop the requirement for every mapping to be contained within a region.
- Add necessary synchronization of VM_BIND job operation sequences in order
to work around limitations in page table handling. This will be addressed
in a future re-work of Nouveau's page table handling.
- Fixed a couple of race conditions found through more testing. Thanks to
Dave for consistently trying to break it. :-)

GPUVA Manager:
- Implement pre-allocation capabilities for tree modifications within fence
signalling critical sections.
- Implement accessors to apply tree modifications while walking the GPUVA
tree in order to actually support processing of drm_gpuva_ops through
callbacks in fence signalling critical sections rather than through
pre-allocated operation lists.
- Remove merging of GPUVAs; the kernel has limited to no knowledge about
the semantics of mapping sequences. Hence, merging is purely speculative.
It seems that gaining a significant (or at least a measurable) performance
increase through merging is way more likely to happen when userspace is
responsible for merging mappings up to the next larger page size if
possible.
- Since merging was removed, regions pretty much lose their right to exist.
They might still be useful for handling dual page tables or similar
mechanisms, but since Nouveau seems to be the only driver having a need
for this for now, regions were removed from the GPUVA manager.
- Fixed a couple of maple_tree related issues; thanks to Liam for helping me
out.
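
For the userspace-side merging mentioned above, a minimal sketch of a merge
check could look like this. It assumes that two mappings are merge candidates
only when they are adjacent in VA space, backed by the same buffer and
contiguous in buffer offset; struct mapping and can_merge() are hypothetical
and are not part of Mesa or of this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace view of a single GPU VA mapping. */
struct mapping {
	uint64_t addr;    /* start of the mapping in GPU VA space */
	uint64_t range;   /* size of the mapping */
	uint32_t handle;  /* hypothetical id of the backing buffer */
	uint64_t offset;  /* offset into the backing buffer */
};

/*
 * True if 'b' directly follows 'a' in VA space and continues the same
 * backing buffer without a gap, so both could be expressed as one mapping.
 */
static bool can_merge(const struct mapping *a, const struct mapping *b)
{
	return a->handle == b->handle &&
	       a->addr + a->range == b->addr &&
	       a->offset + a->range == b->offset;
}
```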

Changes in V4:
==============
Nouveau:
- Refactored how specific VM_BIND and EXEC jobs are created and how their
arguments are passed to the generic job implementation.
- Fixed a UAF race condition where bind job ops could have been freed
already while still waiting for a job cleanup to finish. This is because
in certain cases we need to wait for mappings actually being unmapped
before creating sparse regions in the same area.
- Re-based the code onto drm_exec v4 patch.

GPUVA Manager:
- Fixed a maple tree related bug when pre-allocating MA states.
(Boris Brezillion)
- Made struct drm_gpuva_fn_ops a const object in all occurrences.
(Boris Brezillion)

TODO
====
Maple Tree:
- Maple tree uses the 'unsigned long' type for node entries. While this
works for 64bit, it's incompatible with the DRM GPUVA Manager on 32bit,
since the DRM GPUVA Manager uses the u64 type and so do drivers using it.
While it's questionable whether a 32bit kernel and a > 32bit GPU address
space make any sense, it creates tons of compiler warnings when compiling
for 32bit. Maybe it makes sense to expand the maple tree API to let users
decide which size to pick - other ideas / proposals are welcome.
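
The mismatch can be illustrated with a small check of whether a u64 range is
representable with a 32-bit index. This is a hedged userspace sketch that uses
uint32_t as a stand-in for a 32-bit 'unsigned long'; range_fits_u32() is a
hypothetical helper, not something the maple tree or the GPUVA manager
provides.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if every address in [addr, addr + range) fits into 32 bits.
 * An empty range only needs its start address to fit.
 */
static bool range_fits_u32(uint64_t addr, uint64_t range)
{
	if (addr > UINT32_MAX)
		return false;
	if (range == 0)
		return true;

	/* Compare the last address without overflowing the addition. */
	return range - 1 <= UINT32_MAX - addr;
}
```

A full 4 GiB range starting at 0 still fits, while the same range starting at
0x1000 does not, since its last address exceeds 32 bits.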

Christian König (1):
drm: execution context for GEM buffers v4

Danilo Krummrich (13):
maple_tree: split up MA_STATE() macro
drm: manager to keep track of GPUs VA mappings
drm: debugfs: provide infrastructure to dump a DRM GPU VA space
drm/nouveau: new VM_BIND uapi interfaces
drm/nouveau: get vmm via nouveau_cli_vmm()
drm/nouveau: bo: initialize GEM GPU VA interface
drm/nouveau: move usercopy helpers to nouveau_drv.h
drm/nouveau: fence: separate fence alloc and emit
drm/nouveau: fence: fail to emit when fence context is killed
drm/nouveau: chan: provide nouveau_channel_kill()
drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
drm/nouveau: implement new VM_BIND uAPI
drm/nouveau: debugfs: implement DRM GPU VA debugfs

Documentation/gpu/driver-uapi.rst | 11 +
Documentation/gpu/drm-mm.rst | 43 +
drivers/gpu/drm/Kconfig | 6 +
drivers/gpu/drm/Makefile | 3 +
drivers/gpu/drm/drm_debugfs.c | 41 +
drivers/gpu/drm/drm_exec.c | 278 +++
drivers/gpu/drm/drm_gem.c | 3 +
drivers/gpu/drm/drm_gpuva_mgr.c | 1687 +++++++++++++++
drivers/gpu/drm/nouveau/Kbuild | 3 +
drivers/gpu/drm/nouveau/Kconfig | 2 +
drivers/gpu/drm/nouveau/dispnv04/crtc.c | 9 +-
drivers/gpu/drm/nouveau/include/nvif/if000c.h | 26 +-
drivers/gpu/drm/nouveau/include/nvif/vmm.h | 19 +-
.../gpu/drm/nouveau/include/nvkm/subdev/mmu.h | 20 +-
drivers/gpu/drm/nouveau/nouveau_abi16.c | 24 +
drivers/gpu/drm/nouveau/nouveau_abi16.h | 1 +
drivers/gpu/drm/nouveau/nouveau_bo.c | 204 +-
drivers/gpu/drm/nouveau/nouveau_bo.h | 2 +-
drivers/gpu/drm/nouveau/nouveau_chan.c | 22 +-
drivers/gpu/drm/nouveau/nouveau_chan.h | 1 +
drivers/gpu/drm/nouveau/nouveau_debugfs.c | 39 +
drivers/gpu/drm/nouveau/nouveau_dmem.c | 9 +-
drivers/gpu/drm/nouveau/nouveau_drm.c | 27 +-
drivers/gpu/drm/nouveau/nouveau_drv.h | 94 +-
drivers/gpu/drm/nouveau/nouveau_exec.c | 418 ++++
drivers/gpu/drm/nouveau/nouveau_exec.h | 54 +
drivers/gpu/drm/nouveau/nouveau_fence.c | 23 +-
drivers/gpu/drm/nouveau/nouveau_fence.h | 5 +-
drivers/gpu/drm/nouveau/nouveau_gem.c | 62 +-
drivers/gpu/drm/nouveau/nouveau_mem.h | 5 +
drivers/gpu/drm/nouveau/nouveau_prime.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_sched.c | 461 ++++
drivers/gpu/drm/nouveau/nouveau_sched.h | 123 ++
drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_uvmm.c | 1898 +++++++++++++++++
drivers/gpu/drm/nouveau/nouveau_uvmm.h | 107 +
drivers/gpu/drm/nouveau/nouveau_vmm.c | 4 +-
drivers/gpu/drm/nouveau/nvif/vmm.c | 100 +-
.../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c | 213 +-
drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 197 +-
drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 25 +
.../drm/nouveau/nvkm/subdev/mmu/vmmgf100.c | 16 +-
.../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 16 +-
.../gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c | 27 +-
include/drm/drm_debugfs.h | 25 +
include/drm/drm_drv.h | 6 +
include/drm/drm_exec.h | 119 ++
include/drm/drm_gem.h | 75 +
include/drm/drm_gpuva_mgr.h | 681 ++++++
include/linux/maple_tree.h | 7 +-
include/uapi/drm/nouveau_drm.h | 209 ++
51 files changed, 7212 insertions(+), 242 deletions(-)
create mode 100644 drivers/gpu/drm/drm_exec.c
create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c
create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h
create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c
create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h
create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c
create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h
create mode 100644 include/drm/drm_exec.h
create mode 100644 include/drm/drm_gpuva_mgr.h

base-commit: 33a86170888b7e4aa0cea94ebb9c67180139cea9
--
2.40.1

2023-06-06 22:54:32

by Danilo Krummrich


Subject: [PATCH drm-next v4 04/14] drm: debugfs: provide infrastructure to dump a DRM GPU VA space

This commit adds a function to dump a DRM GPU VA space and a macro for
drivers to register the struct drm_info_list 'gpuvas' entry.

Most drivers will likely maintain one DRM GPU VA space per struct
drm_file, but there might also be drivers without a fixed relation
between DRM GPU VA spaces and DRM core infrastructure; hence we need the
indirection via the driver iterating its maintained DRM GPU VA spaces.

Signed-off-by: Danilo Krummrich <[email protected]>
---
drivers/gpu/drm/drm_debugfs.c | 41 +++++++++++++++++++++++++++++++++++
include/drm/drm_debugfs.h | 25 +++++++++++++++++++++
2 files changed, 66 insertions(+)

diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
index 4855230ba2c6..82180fb1c200 100644
--- a/drivers/gpu/drm/drm_debugfs.c
+++ b/drivers/gpu/drm/drm_debugfs.c
@@ -39,6 +39,7 @@
#include <drm/drm_file.h>
#include <drm/drm_gem.h>
#include <drm/drm_managed.h>
+#include <drm/drm_gpuva_mgr.h>

#include "drm_crtc_internal.h"
#include "drm_internal.h"
@@ -175,6 +176,46 @@ static const struct file_operations drm_debugfs_fops = {
.release = single_release,
};

+/**
+ * drm_debugfs_gpuva_info - dump the given DRM GPU VA space
+ * @m: pointer to the &seq_file to write
+ * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ *
+ * Dumps the GPU VA mappings of a given DRM GPU VA manager.
+ *
+ * For each DRM GPU VA space drivers should call this function from their
+ * &drm_info_list's show callback.
+ *
+ * Returns: 0 on success, -ENODEV if the &mgr is not initialized
+ */
+int drm_debugfs_gpuva_info(struct seq_file *m,
+ struct drm_gpuva_manager *mgr)
+{
+ DRM_GPUVA_ITER(it, mgr, 0);
+ struct drm_gpuva *va, *kva = &mgr->kernel_alloc_node;
+
+ if (!mgr->name)
+ return -ENODEV;
+
+ seq_printf(m, "DRM GPU VA space (%s) [0x%016llx;0x%016llx]\n",
+ mgr->name, mgr->mm_start, mgr->mm_start + mgr->mm_range);
+ seq_printf(m, "Kernel reserved node [0x%016llx;0x%016llx]\n",
+ kva->va.addr, kva->va.addr + kva->va.range);
+ seq_puts(m, "\n");
+ seq_puts(m, " VAs | start | range | end | object | object offset\n");
+ seq_puts(m, "-------------------------------------------------------------------------------------------------------------\n");
+ drm_gpuva_iter_for_each(va, it) {
+ if (unlikely(va == &mgr->kernel_alloc_node))
+ continue;
+
+ seq_printf(m, " | 0x%016llx | 0x%016llx | 0x%016llx | 0x%016llx | 0x%016llx\n",
+ va->va.addr, va->va.range, va->va.addr + va->va.range,
+ (u64)va->gem.obj, va->gem.offset);
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(drm_debugfs_gpuva_info);

/**
* drm_debugfs_create_files - Initialize a given set of debugfs files for DRM
diff --git a/include/drm/drm_debugfs.h b/include/drm/drm_debugfs.h
index 7616f457ce70..cb2c1956a214 100644
--- a/include/drm/drm_debugfs.h
+++ b/include/drm/drm_debugfs.h
@@ -34,6 +34,22 @@

#include <linux/types.h>
#include <linux/seq_file.h>
+
+#include <drm/drm_gpuva_mgr.h>
+
+/**
+ * DRM_DEBUGFS_GPUVA_INFO - &drm_info_list entry to dump a GPU VA space
+ * @show: the &drm_info_list's show callback
+ * @data: driver private data
+ *
+ * Drivers should use this macro to define a &drm_info_list entry to provide a
+ * debugfs file for dumping the GPU VA space regions and mappings.
+ *
+ * For each DRM GPU VA space drivers should call drm_debugfs_gpuva_info() from
+ * their @show callback.
+ */
+#define DRM_DEBUGFS_GPUVA_INFO(show, data) {"gpuvas", show, DRIVER_GEM_GPUVA, data}
+
/**
* struct drm_info_list - debugfs info list entry
*
@@ -134,6 +150,9 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,

void drm_debugfs_add_files(struct drm_device *dev,
const struct drm_debugfs_info *files, int count);
+
+int drm_debugfs_gpuva_info(struct seq_file *m,
+ struct drm_gpuva_manager *mgr);
#else
static inline void drm_debugfs_create_files(const struct drm_info_list *files,
int count, struct dentry *root,
@@ -155,6 +174,12 @@ static inline void drm_debugfs_add_files(struct drm_device *dev,
const struct drm_debugfs_info *files,
int count)
{}
+
+static inline int drm_debugfs_gpuva_info(struct seq_file *m,
+ struct drm_gpuva_manager *mgr)
+{
+ return 0;
+}
#endif

#endif /* _DRM_DEBUGFS_H_ */
--
2.40.1
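
For readers who want to see what a row of the dump looks like, the following
userspace sketch formats one row with the same printf-style format string that
appears in the patch text above. It only mimics the seq_printf() call with
snprintf(); format_gpuva_row() is a hypothetical helper, and the column
spacing is taken as-is from the text of this archive.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format one mapping as a dump row: start, range, end, object, offset.
 * Returns what snprintf() returns, i.e. the length the full row needs.
 */
static int format_gpuva_row(char *buf, size_t len, uint64_t addr,
			    uint64_t range, uint64_t obj, uint64_t offset)
{
	return snprintf(buf, len,
			" | 0x%016llx | 0x%016llx | 0x%016llx | 0x%016llx | 0x%016llx\n",
			(unsigned long long)addr,
			(unsigned long long)range,
			(unsigned long long)(addr + range),
			(unsigned long long)obj,
			(unsigned long long)offset);
}
```

The end column is derived from start plus range, as in
drm_debugfs_gpuva_info().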

2023-06-06 22:56:50

by Danilo Krummrich


Subject: [PATCH drm-next v4 13/14] drm/nouveau: implement new VM_BIND uAPI

This commit provides the implementation for the new uapi motivated by the
Vulkan API. It allows user mode drivers (UMDs) to:

1) Initialize a GPU virtual address (VA) space via the new
DRM_IOCTL_NOUVEAU_VM_INIT ioctl for UMDs to specify the portion of VA
space managed by the kernel and userspace, respectively.

2) Allocate and free a VA space region as well as bind and unbind memory
to the GPU's VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl.
UMDs can request the named operations to be processed either
synchronously or asynchronously. It supports DRM syncobjs
(incl. timelines) as the synchronization mechanism. The management of the
GPU VA mappings is implemented with the DRM GPU VA manager.

3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl. The
execution happens asynchronously. It supports DRM syncobjs (incl.
timelines) as the synchronization mechanism. DRM GEM object locking is
handled with drm_exec.

Both DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC use the DRM
GPU scheduler for the asynchronous paths.

Signed-off-by: Danilo Krummrich <[email protected]>
---
Documentation/gpu/driver-uapi.rst | 3 +
drivers/gpu/drm/nouveau/Kbuild | 3 +
drivers/gpu/drm/nouveau/Kconfig | 2 +
drivers/gpu/drm/nouveau/nouveau_abi16.c | 24 +
drivers/gpu/drm/nouveau/nouveau_abi16.h | 1 +
drivers/gpu/drm/nouveau/nouveau_bo.c | 147 +-
drivers/gpu/drm/nouveau/nouveau_bo.h | 2 +-
drivers/gpu/drm/nouveau/nouveau_drm.c | 27 +-
drivers/gpu/drm/nouveau/nouveau_drv.h | 59 +-
drivers/gpu/drm/nouveau/nouveau_exec.c | 418 +++++
drivers/gpu/drm/nouveau/nouveau_exec.h | 54 +
drivers/gpu/drm/nouveau/nouveau_gem.c | 25 +-
drivers/gpu/drm/nouveau/nouveau_mem.h | 5 +
drivers/gpu/drm/nouveau/nouveau_prime.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_sched.c | 461 ++++++
drivers/gpu/drm/nouveau/nouveau_sched.h | 123 ++
drivers/gpu/drm/nouveau/nouveau_uvmm.c | 1898 +++++++++++++++++++++++
drivers/gpu/drm/nouveau/nouveau_uvmm.h | 107 ++
18 files changed, 3296 insertions(+), 65 deletions(-)
create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c
create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h
create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c
create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h
create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c
create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h

diff --git a/Documentation/gpu/driver-uapi.rst b/Documentation/gpu/driver-uapi.rst
index 9c7ca6e33a68..c08bcbb95fb3 100644
--- a/Documentation/gpu/driver-uapi.rst
+++ b/Documentation/gpu/driver-uapi.rst
@@ -13,4 +13,7 @@ drm/nouveau uAPI
VM_BIND / EXEC uAPI
-------------------

+.. kernel-doc:: drivers/gpu/drm/nouveau/nouveau_exec.c
+ :doc: Overview
+
.. kernel-doc:: include/uapi/drm/nouveau_drm.h
diff --git a/drivers/gpu/drm/nouveau/Kbuild b/drivers/gpu/drm/nouveau/Kbuild
index 5e5617006da5..cf6b3a80c0c8 100644
--- a/drivers/gpu/drm/nouveau/Kbuild
+++ b/drivers/gpu/drm/nouveau/Kbuild
@@ -47,6 +47,9 @@ nouveau-y += nouveau_prime.o
nouveau-y += nouveau_sgdma.o
nouveau-y += nouveau_ttm.o
nouveau-y += nouveau_vmm.o
+nouveau-y += nouveau_exec.o
+nouveau-y += nouveau_sched.o
+nouveau-y += nouveau_uvmm.o

# DRM - modesetting
nouveau-$(CONFIG_DRM_NOUVEAU_BACKLIGHT) += nouveau_backlight.o
diff --git a/drivers/gpu/drm/nouveau/Kconfig b/drivers/gpu/drm/nouveau/Kconfig
index a70bd65e1400..c52e8096cca4 100644
--- a/drivers/gpu/drm/nouveau/Kconfig
+++ b/drivers/gpu/drm/nouveau/Kconfig
@@ -10,6 +10,8 @@ config DRM_NOUVEAU
select DRM_KMS_HELPER
select DRM_TTM
select DRM_TTM_HELPER
+ select DRM_EXEC
+ select DRM_SCHED
select I2C
select I2C_ALGOBIT
select BACKLIGHT_CLASS_DEVICE if DRM_NOUVEAU_BACKLIGHT
diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.c b/drivers/gpu/drm/nouveau/nouveau_abi16.c
index 82dab51d8aeb..a112f28681d3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_abi16.c
+++ b/drivers/gpu/drm/nouveau/nouveau_abi16.c
@@ -35,6 +35,7 @@
#include "nouveau_chan.h"
#include "nouveau_abi16.h"
#include "nouveau_vmm.h"
+#include "nouveau_sched.h"

static struct nouveau_abi16 *
nouveau_abi16(struct drm_file *file_priv)
@@ -125,6 +126,17 @@ nouveau_abi16_chan_fini(struct nouveau_abi16 *abi16,
{
struct nouveau_abi16_ntfy *ntfy, *temp;

+ /* When a client exits without waiting for its queued up jobs to
+ * finish it might happen that we fault the channel. This is due to
+ * drm_file_free() calling drm_gem_release() before the postclose()
+ * callback. Hence, we can't tear down this scheduler entity before
+ * uvmm mappings are unmapped. Currently, we can't detect this case.
+ *
+ * However, this should be rare and harmless, since the channel isn't
+ * needed anymore.
+ */
+ nouveau_sched_entity_fini(&chan->sched_entity);
+
/* wait for all activity to stop before cleaning up */
if (chan->chan)
nouveau_channel_idle(chan->chan);
@@ -261,6 +273,13 @@ nouveau_abi16_ioctl_channel_alloc(ABI16_IOCTL_ARGS)
if (!drm->channel)
return nouveau_abi16_put(abi16, -ENODEV);

+ /* If uvmm wasn't initialized until now disable it completely to prevent
+ * userspace from mixing up UAPIs.
+ *
+ * The client lock is already acquired by nouveau_abi16_get().
+ */
+ __nouveau_cli_uvmm_disable(cli);
+
device = &abi16->device;
engine = NV_DEVICE_HOST_RUNLIST_ENGINES_GR;

@@ -304,6 +323,11 @@ nouveau_abi16_ioctl_channel_alloc(ABI16_IOCTL_ARGS)
if (ret)
goto done;

+ ret = nouveau_sched_entity_init(&chan->sched_entity, &drm->sched,
+ drm->sched_wq);
+ if (ret)
+ goto done;
+
init->channel = chan->chan->chid;

if (device->info.family >= NV_DEVICE_INFO_V0_TESLA)
diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.h b/drivers/gpu/drm/nouveau/nouveau_abi16.h
index 27eae85f33e6..8209eb28feaf 100644
--- a/drivers/gpu/drm/nouveau/nouveau_abi16.h
+++ b/drivers/gpu/drm/nouveau/nouveau_abi16.h
@@ -26,6 +26,7 @@ struct nouveau_abi16_chan {
struct nouveau_bo *ntfy;
struct nouveau_vma *ntfy_vma;
struct nvkm_mm heap;
+ struct nouveau_sched_entity sched_entity;
};

struct nouveau_abi16 {
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index e9cbbf594e6f..6487185f2d11 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -199,7 +199,7 @@ nouveau_bo_fixup_align(struct nouveau_bo *nvbo, int *align, u64 *size)

struct nouveau_bo *
nouveau_bo_alloc(struct nouveau_cli *cli, u64 *size, int *align, u32 domain,
- u32 tile_mode, u32 tile_flags)
+ u32 tile_mode, u32 tile_flags, bool internal)
{
struct nouveau_drm *drm = cli->drm;
struct nouveau_bo *nvbo;
@@ -235,68 +235,103 @@ nouveau_bo_alloc(struct nouveau_cli *cli, u64 *size, int *align, u32 domain,
nvbo->force_coherent = true;
}

- if (cli->device.info.family >= NV_DEVICE_INFO_V0_FERMI) {
- nvbo->kind = (tile_flags & 0x0000ff00) >> 8;
- if (!nvif_mmu_kind_valid(mmu, nvbo->kind)) {
- kfree(nvbo);
- return ERR_PTR(-EINVAL);
+ nvbo->contig = !(tile_flags & NOUVEAU_GEM_TILE_NONCONTIG);
+ if (!nouveau_cli_uvmm(cli) || internal) {
+ /* for BO noVM allocs, don't assign kinds */
+ if (cli->device.info.family >= NV_DEVICE_INFO_V0_FERMI) {
+ nvbo->kind = (tile_flags & 0x0000ff00) >> 8;
+ if (!nvif_mmu_kind_valid(mmu, nvbo->kind)) {
+ kfree(nvbo);
+ return ERR_PTR(-EINVAL);
+ }
+
+ nvbo->comp = mmu->kind[nvbo->kind] != nvbo->kind;
+ } else if (cli->device.info.family >= NV_DEVICE_INFO_V0_TESLA) {
+ nvbo->kind = (tile_flags & 0x00007f00) >> 8;
+ nvbo->comp = (tile_flags & 0x00030000) >> 16;
+ if (!nvif_mmu_kind_valid(mmu, nvbo->kind)) {
+ kfree(nvbo);
+ return ERR_PTR(-EINVAL);
+ }
+ } else {
+ nvbo->zeta = (tile_flags & 0x00000007);
}
+ nvbo->mode = tile_mode;
+
+ /* Determine the desirable target GPU page size for the buffer. */
+ for (i = 0; i < vmm->page_nr; i++) {
+ /* Because we cannot currently allow VMM maps to fail
+ * during buffer migration, we need to determine page
+ * size for the buffer up-front, and pre-allocate its
+ * page tables.
+ *
+ * Skip page sizes that can't support needed domains.
+ */
+ if (cli->device.info.family > NV_DEVICE_INFO_V0_CURIE &&
+ (domain & NOUVEAU_GEM_DOMAIN_VRAM) && !vmm->page[i].vram)
+ continue;
+ if ((domain & NOUVEAU_GEM_DOMAIN_GART) &&
+ (!vmm->page[i].host || vmm->page[i].shift > PAGE_SHIFT))
+ continue;

- nvbo->comp = mmu->kind[nvbo->kind] != nvbo->kind;
- } else
- if (cli->device.info.family >= NV_DEVICE_INFO_V0_TESLA) {
- nvbo->kind = (tile_flags & 0x00007f00) >> 8;
- nvbo->comp = (tile_flags & 0x00030000) >> 16;
- if (!nvif_mmu_kind_valid(mmu, nvbo->kind)) {
+ /* Select this page size if it's the first that supports
+ * the potential memory domains, or when it's compatible
+ * with the requested compression settings.
+ */
+ if (pi < 0 || !nvbo->comp || vmm->page[i].comp)
+ pi = i;
+
+ /* Stop once the buffer is larger than the current page size. */
+ if (*size >= 1ULL << vmm->page[i].shift)
+ break;
+ }
+
+ if (WARN_ON(pi < 0)) {
kfree(nvbo);
return ERR_PTR(-EINVAL);
}
- } else {
- nvbo->zeta = (tile_flags & 0x00000007);
- }
- nvbo->mode = tile_mode;
- nvbo->contig = !(tile_flags & NOUVEAU_GEM_TILE_NONCONTIG);
-
- /* Determine the desirable target GPU page size for the buffer. */
- for (i = 0; i < vmm->page_nr; i++) {
- /* Because we cannot currently allow VMM maps to fail
- * during buffer migration, we need to determine page
- * size for the buffer up-front, and pre-allocate its
- * page tables.
- *
- * Skip page sizes that can't support needed domains.
- */
- if (cli->device.info.family > NV_DEVICE_INFO_V0_CURIE &&
- (domain & NOUVEAU_GEM_DOMAIN_VRAM) && !vmm->page[i].vram)
- continue;
- if ((domain & NOUVEAU_GEM_DOMAIN_GART) &&
- (!vmm->page[i].host || vmm->page[i].shift > PAGE_SHIFT))
- continue;

- /* Select this page size if it's the first that supports
- * the potential memory domains, or when it's compatible
- * with the requested compression settings.
- */
- if (pi < 0 || !nvbo->comp || vmm->page[i].comp)
- pi = i;
-
- /* Stop once the buffer is larger than the current page size. */
- if (*size >= 1ULL << vmm->page[i].shift)
- break;
- }
+ /* Disable compression if suitable settings couldn't be found. */
+ if (nvbo->comp && !vmm->page[pi].comp) {
+ if (mmu->object.oclass >= NVIF_CLASS_MMU_GF100)
+ nvbo->kind = mmu->kind[nvbo->kind];
+ nvbo->comp = 0;
+ }
+ nvbo->page = vmm->page[pi].shift;
+ } else {
+ /* reject other tile flags when in VM mode. */
+ if (tile_mode)
+ return ERR_PTR(-EINVAL);
+ if (tile_flags & ~NOUVEAU_GEM_TILE_NONCONTIG)
+ return ERR_PTR(-EINVAL);

- if (WARN_ON(pi < 0)) {
- kfree(nvbo);
- return ERR_PTR(-EINVAL);
- }
+ /* Determine the desirable target GPU page size for the buffer. */
+ for (i = 0; i < vmm->page_nr; i++) {
+ /* Because we cannot currently allow VMM maps to fail
+ * during buffer migration, we need to determine page
+ * size for the buffer up-front, and pre-allocate its
+ * page tables.
+ *
+ * Skip page sizes that can't support needed domains.
+ */
+ if ((domain & NOUVEAU_GEM_DOMAIN_VRAM) && !vmm->page[i].vram)
+ continue;
+ if ((domain & NOUVEAU_GEM_DOMAIN_GART) &&
+ (!vmm->page[i].host || vmm->page[i].shift > PAGE_SHIFT))
+ continue;

- /* Disable compression if suitable settings couldn't be found. */
- if (nvbo->comp && !vmm->page[pi].comp) {
- if (mmu->object.oclass >= NVIF_CLASS_MMU_GF100)
- nvbo->kind = mmu->kind[nvbo->kind];
- nvbo->comp = 0;
+ if (pi < 0)
+ pi = i;
+ /* Stop once the buffer is larger than the current page size. */
+ if (*size >= 1ULL << vmm->page[i].shift)
+ break;
+ }
+ if (WARN_ON(pi < 0)) {
+ kfree(nvbo);
+ return ERR_PTR(-EINVAL);
+ }
+ nvbo->page = vmm->page[pi].shift;
}
- nvbo->page = vmm->page[pi].shift;

nouveau_bo_fixup_align(nvbo, align, size);

@@ -334,7 +369,7 @@ nouveau_bo_new(struct nouveau_cli *cli, u64 size, int align,
int ret;

nvbo = nouveau_bo_alloc(cli, &size, &align, domain, tile_mode,
- tile_flags);
+ tile_flags, true);
if (IS_ERR(nvbo))
return PTR_ERR(nvbo);

@@ -948,6 +983,7 @@ static void nouveau_bo_move_ntfy(struct ttm_buffer_object *bo,
list_for_each_entry(vma, &nvbo->vma_list, head) {
nouveau_vma_map(vma, mem);
}
+ nouveau_uvmm_bo_map_all(nvbo, mem);
} else {
list_for_each_entry(vma, &nvbo->vma_list, head) {
ret = dma_resv_wait_timeout(bo->base.resv,
@@ -956,6 +992,7 @@ static void nouveau_bo_move_ntfy(struct ttm_buffer_object *bo,
WARN_ON(ret <= 0);
nouveau_vma_unmap(vma);
}
+ nouveau_uvmm_bo_unmap_all(nvbo);
}

if (new_reg)
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.h b/drivers/gpu/drm/nouveau/nouveau_bo.h
index 774dd93ca76b..cb85207d9e8f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.h
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.h
@@ -73,7 +73,7 @@ extern struct ttm_device_funcs nouveau_bo_driver;

void nouveau_bo_move_init(struct nouveau_drm *);
struct nouveau_bo *nouveau_bo_alloc(struct nouveau_cli *, u64 *size, int *align,
- u32 domain, u32 tile_mode, u32 tile_flags);
+ u32 domain, u32 tile_mode, u32 tile_flags, bool internal);
int nouveau_bo_init(struct nouveau_bo *, u64 size, int align, u32 domain,
struct sg_table *sg, struct dma_resv *robj);
int nouveau_bo_new(struct nouveau_cli *, u64 size, int align, u32 domain,
diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index cc7c5b4a05fd..a06f8ad227ad 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -68,6 +68,9 @@
#include "nouveau_platform.h"
#include "nouveau_svm.h"
#include "nouveau_dmem.h"
+#include "nouveau_exec.h"
+#include "nouveau_uvmm.h"
+#include "nouveau_sched.h"

DECLARE_DYNDBG_CLASSMAP(drm_debug_classes, DD_CLASS_TYPE_DISJOINT_BITS, 0,
"DRM_UT_CORE",
@@ -190,6 +193,8 @@ nouveau_cli_fini(struct nouveau_cli *cli)
WARN_ON(!list_empty(&cli->worker));

usif_client_fini(cli);
+ nouveau_uvmm_fini(&cli->uvmm);
+ nouveau_sched_entity_fini(&cli->sched_entity);
nouveau_vmm_fini(&cli->svm);
nouveau_vmm_fini(&cli->vmm);
nvif_mmu_dtor(&cli->mmu);
@@ -295,6 +300,12 @@ nouveau_cli_init(struct nouveau_drm *drm, const char *sname,
}

cli->mem = &mems[ret];
+
+ ret = nouveau_sched_entity_init(&cli->sched_entity, &drm->sched,
+ drm->sched_wq);
+ if (ret)
+ goto done;
+
return 0;
done:
if (ret)
@@ -548,10 +559,14 @@ nouveau_drm_device_init(struct drm_device *dev)
nvif_parent_ctor(&nouveau_parent, &drm->parent);
drm->master.base.object.parent = &drm->parent;

- ret = nouveau_cli_init(drm, "DRM-master", &drm->master);
+ ret = nouveau_sched_init(drm);
if (ret)
goto fail_alloc;

+ ret = nouveau_cli_init(drm, "DRM-master", &drm->master);
+ if (ret)
+ goto fail_sched;
+
ret = nouveau_cli_init(drm, "DRM", &drm->client);
if (ret)
goto fail_master;
@@ -608,7 +623,6 @@ nouveau_drm_device_init(struct drm_device *dev)
}

return 0;
-
fail_dispinit:
nouveau_display_destroy(dev);
fail_dispctor:
@@ -621,6 +635,8 @@ nouveau_drm_device_init(struct drm_device *dev)
nouveau_cli_fini(&drm->client);
fail_master:
nouveau_cli_fini(&drm->master);
+fail_sched:
+ nouveau_sched_fini(drm);
fail_alloc:
nvif_parent_dtor(&drm->parent);
kfree(drm);
@@ -672,6 +688,8 @@ nouveau_drm_device_fini(struct drm_device *dev)
}
mutex_unlock(&drm->clients_lock);

+ nouveau_sched_fini(drm);
+
nouveau_cli_fini(&drm->client);
nouveau_cli_fini(&drm->master);
nvif_parent_dtor(&drm->parent);
@@ -1173,6 +1191,9 @@ nouveau_ioctls[] = {
DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_CPU_PREP, nouveau_gem_ioctl_cpu_prep, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_CPU_FINI, nouveau_gem_ioctl_cpu_fini, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_INFO, nouveau_gem_ioctl_info, DRM_RENDER_ALLOW),
+ DRM_IOCTL_DEF_DRV(NOUVEAU_VM_INIT, nouveau_uvmm_ioctl_vm_init, DRM_RENDER_ALLOW),
+ DRM_IOCTL_DEF_DRV(NOUVEAU_VM_BIND, nouveau_uvmm_ioctl_vm_bind, DRM_RENDER_ALLOW),
+ DRM_IOCTL_DEF_DRV(NOUVEAU_EXEC, nouveau_exec_ioctl_exec, DRM_RENDER_ALLOW),
};

long
@@ -1220,6 +1241,8 @@ nouveau_driver_fops = {
static struct drm_driver
driver_stub = {
.driver_features = DRIVER_GEM |
+ DRIVER_SYNCOBJ | DRIVER_SYNCOBJ_TIMELINE |
+ DRIVER_GEM_GPUVA |
DRIVER_MODESET |
DRIVER_RENDER,
.open = nouveau_drm_open,
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 20a7f31b9082..ab810b4e028b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -10,8 +10,8 @@
#define DRIVER_DATE "20120801"

#define DRIVER_MAJOR 1
-#define DRIVER_MINOR 3
-#define DRIVER_PATCHLEVEL 1
+#define DRIVER_MINOR 4
+#define DRIVER_PATCHLEVEL 0

/*
* 1.1.1:
@@ -63,7 +63,9 @@ struct platform_device;

#include "nouveau_fence.h"
#include "nouveau_bios.h"
+#include "nouveau_sched.h"
#include "nouveau_vmm.h"
+#include "nouveau_uvmm.h"

struct nouveau_drm_tile {
struct nouveau_fence *fence;
@@ -91,6 +93,10 @@ struct nouveau_cli {
struct nvif_mmu mmu;
struct nouveau_vmm vmm;
struct nouveau_vmm svm;
+ struct nouveau_uvmm uvmm;
+
+ struct nouveau_sched_entity sched_entity;
+
const struct nvif_mclass *mem;

struct list_head head;
@@ -112,15 +118,60 @@ struct nouveau_cli_work {
struct dma_fence_cb cb;
};

+static inline struct nouveau_uvmm *
+nouveau_cli_uvmm(struct nouveau_cli *cli)
+{
+ if (!cli || !cli->uvmm.vmm.cli)
+ return NULL;
+
+ return &cli->uvmm;
+}
+
+static inline struct nouveau_uvmm *
+nouveau_cli_uvmm_locked(struct nouveau_cli *cli)
+{
+ struct nouveau_uvmm *uvmm;
+
+ mutex_lock(&cli->mutex);
+ uvmm = nouveau_cli_uvmm(cli);
+ mutex_unlock(&cli->mutex);
+
+ return uvmm;
+}
+
static inline struct nouveau_vmm *
nouveau_cli_vmm(struct nouveau_cli *cli)
{
+ struct nouveau_uvmm *uvmm;
+
+ uvmm = nouveau_cli_uvmm(cli);
+ if (uvmm)
+ return &uvmm->vmm;
+
if (cli->svm.cli)
return &cli->svm;

return &cli->vmm;
}

+static inline void
+__nouveau_cli_uvmm_disable(struct nouveau_cli *cli)
+{
+ struct nouveau_uvmm *uvmm;
+
+ uvmm = nouveau_cli_uvmm(cli);
+ if (!uvmm)
+ cli->uvmm.disabled = true;
+}
+
+static inline void
+nouveau_cli_uvmm_disable(struct nouveau_cli *cli)
+{
+ mutex_lock(&cli->mutex);
+ __nouveau_cli_uvmm_disable(cli);
+ mutex_unlock(&cli->mutex);
+}
+
void nouveau_cli_work_queue(struct nouveau_cli *, struct dma_fence *,
struct nouveau_cli_work *);

@@ -257,6 +308,10 @@ struct nouveau_drm {
struct mutex lock;
bool component_registered;
} audio;
+
+ struct drm_gpu_scheduler sched;
+ struct workqueue_struct *sched_wq;
+
};

static inline struct nouveau_drm *
diff --git a/drivers/gpu/drm/nouveau/nouveau_exec.c b/drivers/gpu/drm/nouveau/nouveau_exec.c
new file mode 100644
index 000000000000..7fd533f54662
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nouveau_exec.c
@@ -0,0 +1,418 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright (c) 2022 Red Hat.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ * Danilo Krummrich <[email protected]>
+ *
+ */
+
+#include <drm/drm_exec.h>
+
+#include "nouveau_drv.h"
+#include "nouveau_gem.h"
+#include "nouveau_mem.h"
+#include "nouveau_dma.h"
+#include "nouveau_exec.h"
+#include "nouveau_abi16.h"
+#include "nouveau_chan.h"
+#include "nouveau_sched.h"
+#include "nouveau_uvmm.h"
+
+/**
+ * DOC: Overview
+ *
+ * Nouveau's VM_BIND / EXEC UAPI consists of three ioctls: DRM_NOUVEAU_VM_INIT,
+ * DRM_NOUVEAU_VM_BIND and DRM_NOUVEAU_EXEC.
+ *
+ * In order to use the UAPI, a user client must first initialize the VA space
+ * using the DRM_NOUVEAU_VM_INIT ioctl, specifying which region of the VA space
+ * should be managed by the kernel and which by the UMD.
+ *
+ * The DRM_NOUVEAU_VM_BIND ioctl provides clients an interface to manage the
+ * userspace-manageable portion of the VA space. It provides operations to map
+ * and unmap memory. Mappings may be flagged as sparse. Sparse mappings are not
+ * backed by a GEM object and the kernel will ignore GEM handles provided
+ * alongside a sparse mapping.
+ *
+ * Userspace may request memory backed mappings either within or outside of the
+ * bounds (but not crossing those bounds) of a previously mapped sparse
+ * mapping. Subsequently requested memory backed mappings within a sparse
+ * mapping will take precedence over the corresponding range of the sparse
+ * mapping. If such memory backed mappings are unmapped the kernel will make
+ * sure that the corresponding sparse mapping will take their place again.
+ * Requests to unmap a sparse mapping that still contains memory backed mappings
+ * will result in those memory backed mappings being unmapped first.
+ *
+ * Unmap requests are not bound to the range of existing mappings and can even
+ * overlap the bounds of sparse mappings. For such a request the kernel will
+ * make sure to unmap all memory backed mappings within the given range,
+ * splitting up memory backed mappings which are only partially contained
+ * within the given range. Unmap requests with the sparse flag set must match
+ * the range of a previously mapped sparse mapping exactly though.
+ *
+ * While the kernel generally permits arbitrary sequences and ranges of memory
+ * backed mappings being mapped and unmapped, either within a single or multiple
+ * VM_BIND ioctl calls, there are some restrictions for sparse mappings.
+ *
+ * The kernel does not permit userspace to:
+ * - unmap non-existent sparse mappings
+ * - unmap a sparse mapping and map a new sparse mapping overlapping the range
+ * of the previously unmapped sparse mapping within the same VM_BIND ioctl
+ * - unmap a sparse mapping and map new memory backed mappings overlapping the
+ * range of the previously unmapped sparse mapping within the same VM_BIND
+ * ioctl
+ *
+ * When using the VM_BIND ioctl to request the kernel to map memory to a given
+ * virtual address in the GPU's VA space there is no guarantee that the actual
+ * mappings are created in the GPU's MMU. If the given memory is swapped out
+ * at the time the bind operation is executed the kernel will stash the mapping
+ * details into its internal allocator and create the actual MMU mappings once
+ * the memory is swapped back in. While this is transparent for userspace, it is
+ * guaranteed that all the backing memory is swapped back in and all the memory
+ * mappings, as requested by userspace previously, are actually mapped once the
+ * DRM_NOUVEAU_EXEC ioctl is called to submit an exec job.
+ *
+ * A VM_BIND job can be executed either synchronously or asynchronously. If
+ * executed asynchronously, userspace may provide a list of syncobjs this job
+ * will wait for and/or a list of syncobjs the kernel will signal once the
+ * VM_BIND job has finished execution. If executed synchronously, the ioctl will
+ * block until the bind job is finished. For synchronous jobs the kernel will
+ * not permit any syncobjs to be submitted.
+ *
+ * To execute a push buffer the UAPI provides the DRM_NOUVEAU_EXEC ioctl. EXEC
+ * jobs are always executed asynchronously and, like VM_BIND jobs, provide
+ * the option to synchronize them with syncobjs.
+ *
+ * Besides that, EXEC jobs can be scheduled for a specified channel to execute on.
+ *
+ * Since VM_BIND jobs update the GPU's VA space on job submit, EXEC jobs do have
+ * an up to date view of the VA space. However, the actual mappings might still
+ * be pending. Hence, EXEC jobs require the fences of the VM_BIND jobs they
+ * depend on to be attached to them.
+ */
+
+static int
+nouveau_exec_job_submit(struct nouveau_job *job)
+{
+ struct nouveau_exec_job *exec_job = to_nouveau_exec_job(job);
+ struct nouveau_cli *cli = exec_job->base.cli;
+ struct nouveau_uvmm *uvmm = nouveau_cli_uvmm(cli);
+ struct drm_exec *exec = &job->exec;
+ struct drm_gem_object *obj;
+ unsigned long index;
+ int ret;
+
+ ret = nouveau_fence_new(&exec_job->fence);
+ if (ret)
+ return ret;
+
+ nouveau_uvmm_lock(uvmm);
+ drm_exec_while_not_all_locked(exec) {
+ DRM_GPUVA_ITER(it, &uvmm->umgr, 0);
+ struct drm_gpuva *va;
+
+ drm_gpuva_iter_for_each(va, it) {
+
+ if (unlikely(va == &uvmm->umgr.kernel_alloc_node))
+ continue;
+
+ ret = drm_exec_prepare_obj(exec, va->gem.obj, 1);
+ drm_exec_break_on_contention(exec);
+ if (ret == -EALREADY) {
+ continue;
+ } else if (ret) {
+ nouveau_uvmm_unlock(uvmm);
+ return ret;
+ }
+ }
+ }
+ nouveau_uvmm_unlock(uvmm);
+
+ drm_exec_for_each_locked_object(exec, index, obj) {
+ struct nouveau_bo *nvbo = nouveau_gem_object(obj);
+
+ ret = nouveau_bo_validate(nvbo, true, false);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static struct dma_fence *
+nouveau_exec_job_run(struct nouveau_job *job)
+{
+ struct nouveau_exec_job *exec_job = to_nouveau_exec_job(job);
+ struct nouveau_channel *chan = exec_job->chan;
+ struct nouveau_fence *fence = exec_job->fence;
+ int i, ret;
+
+ ret = nouveau_dma_wait(chan, exec_job->push.count + 1, 16);
+ if (ret) {
+ NV_PRINTK(err, job->cli, "nv50cal_space: %d\n", ret);
+ return ERR_PTR(ret);
+ }
+
+ for (i = 0; i < exec_job->push.count; i++) {
+ nv50_dma_push(chan, exec_job->push.s[i].va,
+ exec_job->push.s[i].va_len);
+ }
+
+ ret = nouveau_fence_emit(fence, chan);
+ if (ret) {
+ NV_PRINTK(err, job->cli, "error fencing pushbuf: %d\n", ret);
+ WIND_RING(chan);
+ return ERR_PTR(ret);
+ }
+
+ exec_job->fence = NULL;
+
+ return &fence->base;
+}
+
+static void
+nouveau_exec_job_free(struct nouveau_job *job)
+{
+ struct nouveau_exec_job *exec_job = to_nouveau_exec_job(job);
+
+ nouveau_job_free(job);
+
+ nouveau_fence_unref(&exec_job->fence);
+ kfree(exec_job->push.s);
+ kfree(exec_job);
+}
+
+static enum drm_gpu_sched_stat
+nouveau_exec_job_timeout(struct nouveau_job *job)
+{
+ struct nouveau_exec_job *exec_job = to_nouveau_exec_job(job);
+ struct nouveau_channel *chan = exec_job->chan;
+
+ if (unlikely(!atomic_read(&chan->killed)))
+ nouveau_channel_kill(chan);
+
+ NV_PRINTK(warn, job->cli, "job timeout, channel %d killed!\n",
+ chan->chid);
+
+ nouveau_sched_entity_fini(job->entity);
+
+ return DRM_GPU_SCHED_STAT_ENODEV;
+}
+
+static struct nouveau_job_ops nouveau_exec_job_ops = {
+ .submit = nouveau_exec_job_submit,
+ .run = nouveau_exec_job_run,
+ .free = nouveau_exec_job_free,
+ .timeout = nouveau_exec_job_timeout,
+};
+
+int
+nouveau_exec_job_init(struct nouveau_exec_job **pjob,
+ struct nouveau_exec_job_args *__args)
+{
+ struct nouveau_exec_job *job;
+ struct nouveau_job_args args = {};
+ int ret;
+
+ job = *pjob = kzalloc(sizeof(*job), GFP_KERNEL);
+ if (!job)
+ return -ENOMEM;
+
+ job->push.count = __args->push.count;
+ job->push.s = kmemdup(__args->push.s,
+ sizeof(*__args->push.s) *
+ __args->push.count,
+ GFP_KERNEL);
+ if (!job->push.s) {
+ ret = -ENOMEM;
+ goto err_free_job;
+ }
+
+ job->chan = __args->chan;
+
+ args.sched_entity = __args->sched_entity;
+ args.file_priv = __args->file_priv;
+
+ args.in_sync.count = __args->in_sync.count;
+ args.in_sync.s = __args->in_sync.s;
+
+ args.out_sync.count = __args->out_sync.count;
+ args.out_sync.s = __args->out_sync.s;
+
+ args.ops = &nouveau_exec_job_ops;
+ args.resv_usage = DMA_RESV_USAGE_WRITE;
+
+ ret = nouveau_job_init(&job->base, &args);
+ if (ret)
+ goto err_free_pushs;
+
+ return 0;
+
+err_free_pushs:
+ kfree(job->push.s);
+err_free_job:
+ kfree(job);
+ *pjob = NULL;
+
+ return ret;
+}
+
+static int
+nouveau_exec(struct nouveau_exec_job_args *args)
+{
+ struct nouveau_exec_job *job;
+ int ret;
+
+ ret = nouveau_exec_job_init(&job, args);
+ if (ret)
+ return ret;
+
+ ret = nouveau_job_submit(&job->base);
+ if (ret)
+ goto err_job_fini;
+
+ return 0;
+
+err_job_fini:
+ nouveau_job_fini(&job->base);
+ return ret;
+}
+
+static int
+nouveau_exec_ucopy(struct nouveau_exec_job_args *args,
+ struct drm_nouveau_exec __user *req)
+{
+ struct drm_nouveau_sync **s;
+ u32 inc = req->wait_count;
+ u64 ins = req->wait_ptr;
+ u32 outc = req->sig_count;
+ u64 outs = req->sig_ptr;
+ u32 pushc = req->push_count;
+ u64 pushs = req->push_ptr;
+ int ret;
+
+ args->push.count = pushc;
+ args->push.s = u_memcpya(pushs, pushc, sizeof(*args->push.s));
+ if (IS_ERR(args->push.s))
+ return PTR_ERR(args->push.s);
+
+ if (inc) {
+ s = &args->in_sync.s;
+
+ args->in_sync.count = inc;
+ *s = u_memcpya(ins, inc, sizeof(**s));
+ if (IS_ERR(*s)) {
+ ret = PTR_ERR(*s);
+ goto err_free_pushs;
+ }
+ }
+
+ if (outc) {
+ s = &args->out_sync.s;
+
+ args->out_sync.count = outc;
+ *s = u_memcpya(outs, outc, sizeof(**s));
+ if (IS_ERR(*s)) {
+ ret = PTR_ERR(*s);
+ goto err_free_ins;
+ }
+ }
+
+ return 0;
+
+err_free_ins:
+ u_free(args->in_sync.s);
+err_free_pushs:
+ u_free(args->push.s);
+ return ret;
+}
+
+static void
+nouveau_exec_ufree(struct nouveau_exec_job_args *args)
+{
+ u_free(args->push.s);
+ u_free(args->in_sync.s);
+ u_free(args->out_sync.s);
+}
+
+int
+nouveau_exec_ioctl_exec(struct drm_device *dev,
+ void __user *data,
+ struct drm_file *file_priv)
+{
+ struct nouveau_abi16 *abi16 = nouveau_abi16_get(file_priv);
+ struct nouveau_cli *cli = nouveau_cli(file_priv);
+ struct nouveau_abi16_chan *chan16;
+ struct nouveau_channel *chan = NULL;
+ struct nouveau_exec_job_args args = {};
+ struct drm_nouveau_exec __user *req = data;
+ int ret = 0;
+
+ if (unlikely(!abi16))
+ return -ENOMEM;
+
+ /* abi16 locks already */
+ if (unlikely(!nouveau_cli_uvmm(cli)))
+ return nouveau_abi16_put(abi16, -ENOSYS);
+
+ list_for_each_entry(chan16, &abi16->channels, head) {
+ if (chan16->chan->chid == req->channel) {
+ chan = chan16->chan;
+ break;
+ }
+ }
+
+ if (!chan)
+ return nouveau_abi16_put(abi16, -ENOENT);
+
+ if (unlikely(atomic_read(&chan->killed)))
+ return nouveau_abi16_put(abi16, -ENODEV);
+
+ if (!chan->dma.ib_max)
+ return nouveau_abi16_put(abi16, -ENOSYS);
+
+ if (unlikely(req->push_count == 0))
+ goto out;
+
+ if (unlikely(req->push_count > NOUVEAU_GEM_MAX_PUSH)) {
+ NV_PRINTK(err, cli, "pushbuf push count exceeds limit: %d max %d\n",
+ req->push_count, NOUVEAU_GEM_MAX_PUSH);
+ return nouveau_abi16_put(abi16, -EINVAL);
+ }
+
+ ret = nouveau_exec_ucopy(&args, req);
+ if (ret)
+ goto out;
+
+ args.sched_entity = &chan16->sched_entity;
+ args.file_priv = file_priv;
+ args.chan = chan;
+
+ ret = nouveau_exec(&args);
+ if (ret)
+ goto out_free_args;
+
+out_free_args:
+ nouveau_exec_ufree(&args);
+out:
+ return nouveau_abi16_put(abi16, ret);
+}
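
Editorial note, not part of the patch: the DOC: Overview comment in
nouveau_exec.c says that an unmap request may cover only part of a memory
backed mapping, in which case the mapping is split. The range arithmetic
behind such a split can be sketched in plain userspace C. The types and the
function va_unmap_split() are hypothetical names chosen for illustration; they
are not the DRM GPUVA manager API, and the sketch assumes that addr + range
does not overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration only; not the drm_gpuva API. */
struct va_range {
	uint64_t addr;
	uint64_t range;
};

struct va_split {
	bool overlaps;        /* false if the request does not touch the mapping */
	bool has_prev;        /* remainder of the mapping below the request */
	struct va_range prev;
	bool has_next;        /* remainder of the mapping above the request */
	struct va_range next;
};

/*
 * Compute which parts of an existing mapping survive an unmap request.
 * Assumes addr + range does not wrap around for either argument.
 */
struct va_split
va_unmap_split(struct va_range map, struct va_range req)
{
	struct va_split s = { 0 };
	uint64_t map_end = map.addr + map.range;
	uint64_t req_end = req.addr + req.range;

	/* Empty or disjoint request: nothing to unmap from this mapping. */
	if (req.range == 0 || req_end <= map.addr || req.addr >= map_end)
		return s;

	s.overlaps = true;

	/* Part of the mapping that lies below the unmapped range. */
	if (req.addr > map.addr) {
		s.has_prev = true;
		s.prev.addr = map.addr;
		s.prev.range = req.addr - map.addr;
	}

	/* Part of the mapping that lies above the unmapped range. */
	if (req_end < map_end) {
		s.has_next = true;
		s.next.addr = req_end;
		s.next.range = map_end - req_end;
	}

	return s;
}
```

A request that covers the whole mapping yields neither remainder, one that
cuts into one edge yields a single remainder, and one strictly inside the
mapping yields both.
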
diff --git a/drivers/gpu/drm/nouveau/nouveau_exec.h b/drivers/gpu/drm/nouveau/nouveau_exec.h
new file mode 100644
index 000000000000..3032db27b8d7
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nouveau_exec.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: MIT */
+
+#ifndef __NOUVEAU_EXEC_H__
+#define __NOUVEAU_EXEC_H__
+
+#include <drm/drm_exec.h>
+
+#include "nouveau_drv.h"
+#include "nouveau_sched.h"
+
+struct nouveau_exec_job_args {
+ struct drm_file *file_priv;
+ struct nouveau_sched_entity *sched_entity;
+
+ struct drm_exec exec;
+ struct nouveau_channel *chan;
+
+ struct {
+ struct drm_nouveau_sync *s;
+ u32 count;
+ } in_sync;
+
+ struct {
+ struct drm_nouveau_sync *s;
+ u32 count;
+ } out_sync;
+
+ struct {
+ struct drm_nouveau_exec_push *s;
+ u32 count;
+ } push;
+};
+
+struct nouveau_exec_job {
+ struct nouveau_job base;
+ struct nouveau_fence *fence;
+ struct nouveau_channel *chan;
+
+ struct {
+ struct drm_nouveau_exec_push *s;
+ u32 count;
+ } push;
+};
+
+#define to_nouveau_exec_job(job) \
+ container_of((job), struct nouveau_exec_job, base)
+
+int nouveau_exec_job_init(struct nouveau_exec_job **job,
+ struct nouveau_exec_job_args *args);
+
+int nouveau_exec_ioctl_exec(struct drm_device *dev, void __user *data,
+ struct drm_file *file_priv);
+
+#endif
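
Editorial note, not part of the patch: to_nouveau_exec_job() in
nouveau_exec.h recovers the driver-specific job from the embedded base job via
container_of(). A minimal userspace sketch of that pattern follows. The
container_of() definition here is a simplified stand-in for the kernel macro,
and struct base_job / struct exec_job are hypothetical. The embedded member is
deliberately not the first field, so the offset subtraction is visible.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical generic job, playing the role of the embedded base. */
struct base_job {
	int id;
};

/* Hypothetical specialised job embedding the generic one. */
struct exec_job {
	int chan_id;
	struct base_job base;
	int push_count;
};

#define to_exec_job(job) container_of((job), struct exec_job, base)

/* A callback that only receives the base job can still reach the outer one. */
int base_job_push_count(struct base_job *job)
{
	return to_exec_job(job)->push_count;
}
```

This is what lets generic scheduler callbacks take a pointer to the base job
while each job type keeps its own private fields.
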
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 9c8d1b911a01..3b0fbaedfb57 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -120,7 +120,11 @@ nouveau_gem_object_open(struct drm_gem_object *gem, struct drm_file *file_priv)
goto out;
}

- ret = nouveau_vma_new(nvbo, vmm, &vma);
+ /* only create a VMA on binding */
+ if (!nouveau_cli_uvmm(cli))
+ ret = nouveau_vma_new(nvbo, vmm, &vma);
+ else
+ ret = 0;
pm_runtime_mark_last_busy(dev);
pm_runtime_put_autosuspend(dev);
out:
@@ -187,6 +191,9 @@ nouveau_gem_object_close(struct drm_gem_object *gem, struct drm_file *file_priv)
if (vmm->vmm.object.oclass < NVIF_CLASS_VMM_NV50)
return;

+ if (nouveau_cli_uvmm(cli))
+ return;
+
ret = ttm_bo_reserve(&nvbo->bo, false, false, NULL);
if (ret)
return;
@@ -231,7 +238,7 @@ nouveau_gem_new(struct nouveau_cli *cli, u64 size, int align, uint32_t domain,
domain |= NOUVEAU_GEM_DOMAIN_CPU;

nvbo = nouveau_bo_alloc(cli, &size, &align, domain, tile_mode,
- tile_flags);
+ tile_flags, false);
if (IS_ERR(nvbo))
return PTR_ERR(nvbo);

@@ -279,13 +286,15 @@ nouveau_gem_info(struct drm_file *file_priv, struct drm_gem_object *gem,
else
rep->domain = NOUVEAU_GEM_DOMAIN_VRAM;
rep->offset = nvbo->offset;
- if (vmm->vmm.object.oclass >= NVIF_CLASS_VMM_NV50) {
+ if (vmm->vmm.object.oclass >= NVIF_CLASS_VMM_NV50 &&
+ !nouveau_cli_uvmm(cli)) {
vma = nouveau_vma_find(nvbo, vmm);
if (!vma)
return -EINVAL;

rep->offset = vma->addr;
- }
+ } else
+ rep->offset = 0;

rep->size = nvbo->bo.base.size;
rep->map_handle = drm_vma_node_offset_addr(&nvbo->bo.base.vma_node);
@@ -310,6 +319,11 @@ nouveau_gem_ioctl_new(struct drm_device *dev, void *data,
struct nouveau_bo *nvbo = NULL;
int ret = 0;

+ /* If uvmm wasn't initialized until now disable it completely to prevent
+ * userspace from mixing up UAPIs.
+ */
+ nouveau_cli_uvmm_disable(cli);
+
ret = nouveau_gem_new(cli, req->info.size, req->align,
req->info.domain, req->info.tile_mode,
req->info.tile_flags, &nvbo);
@@ -721,6 +735,9 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data,
if (unlikely(!abi16))
return -ENOMEM;

+ if (unlikely(nouveau_cli_uvmm(cli)))
+ return -ENOSYS;
+
list_for_each_entry(temp, &abi16->channels, head) {
if (temp->chan->chid == req->channel) {
chan = temp->chan;
diff --git a/drivers/gpu/drm/nouveau/nouveau_mem.h b/drivers/gpu/drm/nouveau/nouveau_mem.h
index 76c86d8bb01e..5365a3d3a17f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_mem.h
+++ b/drivers/gpu/drm/nouveau/nouveau_mem.h
@@ -35,4 +35,9 @@ int nouveau_mem_vram(struct ttm_resource *, bool contig, u8 page);
int nouveau_mem_host(struct ttm_resource *, struct ttm_tt *);
void nouveau_mem_fini(struct nouveau_mem *);
int nouveau_mem_map(struct nouveau_mem *, struct nvif_vmm *, struct nvif_vma *);
+int
+nouveau_mem_map_fixed(struct nouveau_mem *mem,
+ struct nvif_vmm *vmm,
+ u8 kind, u64 addr,
+ u64 offset, u64 range);
#endif
diff --git a/drivers/gpu/drm/nouveau/nouveau_prime.c b/drivers/gpu/drm/nouveau/nouveau_prime.c
index f42c2b1b0363..6a883b9a799a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_prime.c
+++ b/drivers/gpu/drm/nouveau/nouveau_prime.c
@@ -50,7 +50,7 @@ struct drm_gem_object *nouveau_gem_prime_import_sg_table(struct drm_device *dev,

dma_resv_lock(robj, NULL);
nvbo = nouveau_bo_alloc(&drm->client, &size, &align,
- NOUVEAU_GEM_DOMAIN_GART, 0, 0);
+ NOUVEAU_GEM_DOMAIN_GART, 0, 0, true);
if (IS_ERR(nvbo)) {
obj = ERR_CAST(nvbo);
goto unlock;
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
new file mode 100644
index 000000000000..5cade4cfca6d
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
@@ -0,0 +1,461 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright (c) 2022 Red Hat.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ * Danilo Krummrich <[email protected]>
+ *
+ */
+
+#include <linux/slab.h>
+#include <drm/gpu_scheduler.h>
+#include <drm/drm_syncobj.h>
+
+#include "nouveau_drv.h"
+#include "nouveau_gem.h"
+#include "nouveau_mem.h"
+#include "nouveau_dma.h"
+#include "nouveau_exec.h"
+#include "nouveau_abi16.h"
+#include "nouveau_sched.h"
+
+/* FIXME
+ *
+ * We want to make sure that jobs currently executing can't be deferred by
+ * other jobs competing for the hardware. Otherwise we might end up with job
+ * timeouts just because of too many clients submitting too many jobs. We don't
+ * want jobs to time out because of system load, but because of the job being
+ * too bulky.
+ *
+ * For now allow for up to 16 concurrent jobs in flight until we know how many
+ * rings the hardware can process in parallel.
+ */
+#define NOUVEAU_SCHED_HW_SUBMISSIONS 16
+#define NOUVEAU_SCHED_JOB_TIMEOUT_MS 10000
+
+int
+nouveau_job_init(struct nouveau_job *job,
+ struct nouveau_job_args *args)
+{
+ struct nouveau_sched_entity *entity = args->sched_entity;
+ int ret;
+
+ job->file_priv = args->file_priv;
+ job->cli = nouveau_cli(args->file_priv);
+ job->entity = entity;
+
+ job->sync = args->sync;
+ job->resv_usage = args->resv_usage;
+
+ job->ops = args->ops;
+
+ job->in_sync.count = args->in_sync.count;
+ if (job->in_sync.count) {
+ if (job->sync)
+ return -EINVAL;
+
+ job->in_sync.data = kmemdup(args->in_sync.s,
+ sizeof(*args->in_sync.s) *
+ args->in_sync.count,
+ GFP_KERNEL);
+ if (!job->in_sync.data)
+ return -ENOMEM;
+ }
+
+ job->out_sync.count = args->out_sync.count;
+ if (job->out_sync.count) {
+ if (job->sync) {
+ ret = -EINVAL;
+ goto err_free_in_sync;
+ }
+
+ job->out_sync.data = kmemdup(args->out_sync.s,
+ sizeof(*args->out_sync.s) *
+ args->out_sync.count,
+ GFP_KERNEL);
+ if (!job->out_sync.data) {
+ ret = -ENOMEM;
+ goto err_free_in_sync;
+ }
+
+ job->out_sync.objs = kcalloc(job->out_sync.count,
+ sizeof(*job->out_sync.objs),
+ GFP_KERNEL);
+ if (!job->out_sync.objs) {
+ ret = -ENOMEM;
+ goto err_free_out_sync;
+ }
+
+ job->out_sync.chains = kcalloc(job->out_sync.count,
+ sizeof(*job->out_sync.chains),
+ GFP_KERNEL);
+ if (!job->out_sync.chains) {
+ ret = -ENOMEM;
+ goto err_free_objs;
+ }
+
+ }
+
+ ret = drm_sched_job_init(&job->base, &entity->base, NULL);
+ if (ret)
+ goto err_free_chains;
+
+ job->state = NOUVEAU_JOB_INITIALIZED;
+
+ return 0;
+
+err_free_chains:
+ kfree(job->out_sync.chains);
+err_free_objs:
+ kfree(job->out_sync.objs);
+err_free_out_sync:
+ kfree(job->out_sync.data);
+err_free_in_sync:
+ kfree(job->in_sync.data);
+ return ret;
+}
+
+void
+nouveau_job_free(struct nouveau_job *job)
+{
+ kfree(job->in_sync.data);
+ kfree(job->out_sync.data);
+ kfree(job->out_sync.objs);
+ kfree(job->out_sync.chains);
+}
+
+void nouveau_job_fini(struct nouveau_job *job)
+{
+ dma_fence_put(job->done_fence);
+ drm_sched_job_cleanup(&job->base);
+ job->ops->free(job);
+}
+
+static int
+sync_find_fence(struct nouveau_job *job,
+ struct drm_nouveau_sync *sync,
+ struct dma_fence **fence)
+{
+ u32 stype = sync->flags & DRM_NOUVEAU_SYNC_TYPE_MASK;
+ u64 point = 0;
+ int ret;
+
+ if (stype != DRM_NOUVEAU_SYNC_SYNCOBJ &&
+ stype != DRM_NOUVEAU_SYNC_TIMELINE_SYNCOBJ)
+ return -EOPNOTSUPP;
+
+ if (stype == DRM_NOUVEAU_SYNC_TIMELINE_SYNCOBJ)
+ point = sync->timeline_value;
+
+ ret = drm_syncobj_find_fence(job->file_priv,
+ sync->handle, point,
+ sync->flags, fence);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static int
+nouveau_job_add_deps(struct nouveau_job *job)
+{
+ struct dma_fence *in_fence = NULL;
+ int ret, i;
+
+ for (i = 0; i < job->in_sync.count; i++) {
+ struct drm_nouveau_sync *sync = &job->in_sync.data[i];
+
+ ret = sync_find_fence(job, sync, &in_fence);
+ if (ret) {
+ NV_PRINTK(warn, job->cli,
+ "Failed to find syncobj (-> in): handle=%d\n",
+ sync->handle);
+ return ret;
+ }
+
+ ret = drm_sched_job_add_dependency(&job->base, in_fence);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static void
+nouveau_job_fence_attach_cleanup(struct nouveau_job *job)
+{
+ int i;
+
+ for (i = 0; i < job->out_sync.count; i++) {
+ struct drm_syncobj *obj = job->out_sync.objs[i];
+ struct dma_fence_chain *chain = job->out_sync.chains[i];
+
+ if (obj)
+ drm_syncobj_put(obj);
+
+ if (chain)
+ dma_fence_chain_free(chain);
+ }
+}
+
+static int
+nouveau_job_fence_attach_prepare(struct nouveau_job *job)
+{
+ int i, ret;
+
+ for (i = 0; i < job->out_sync.count; i++) {
+ struct drm_nouveau_sync *sync = &job->out_sync.data[i];
+ struct drm_syncobj **pobj = &job->out_sync.objs[i];
+ struct dma_fence_chain **pchain = &job->out_sync.chains[i];
+ u32 stype = sync->flags & DRM_NOUVEAU_SYNC_TYPE_MASK;
+
+ if (stype != DRM_NOUVEAU_SYNC_SYNCOBJ &&
+ stype != DRM_NOUVEAU_SYNC_TIMELINE_SYNCOBJ) {
+ ret = -EINVAL;
+ goto err_sync_cleanup;
+ }
+
+ *pobj = drm_syncobj_find(job->file_priv, sync->handle);
+ if (!*pobj) {
+ NV_PRINTK(warn, job->cli,
+ "Failed to find syncobj (-> out): handle=%d\n",
+ sync->handle);
+ ret = -ENOENT;
+ goto err_sync_cleanup;
+ }
+
+ if (stype == DRM_NOUVEAU_SYNC_TIMELINE_SYNCOBJ) {
+ *pchain = dma_fence_chain_alloc();
+ if (!*pchain) {
+ ret = -ENOMEM;
+ goto err_sync_cleanup;
+ }
+ }
+ }
+
+ return 0;
+
+err_sync_cleanup:
+ nouveau_job_fence_attach_cleanup(job);
+ return ret;
+}
+
+static void
+nouveau_job_fence_attach(struct nouveau_job *job)
+{
+ struct dma_fence *fence = job->done_fence;
+ int i;
+
+ for (i = 0; i < job->out_sync.count; i++) {
+ struct drm_nouveau_sync *sync = &job->out_sync.data[i];
+ struct drm_syncobj **pobj = &job->out_sync.objs[i];
+ struct dma_fence_chain **pchain = &job->out_sync.chains[i];
+ u32 stype = sync->flags & DRM_NOUVEAU_SYNC_TYPE_MASK;
+
+ if (stype == DRM_NOUVEAU_SYNC_TIMELINE_SYNCOBJ) {
+ drm_syncobj_add_point(*pobj, *pchain, fence,
+ sync->timeline_value);
+ } else {
+ drm_syncobj_replace_fence(*pobj, fence);
+ }
+
+ drm_syncobj_put(*pobj);
+ *pobj = NULL;
+ *pchain = NULL;
+ }
+}
+
+static void
+nouveau_job_resv_add_fence(struct nouveau_job *job)
+{
+ struct drm_exec *exec = &job->exec;
+ struct drm_gem_object *obj;
+ unsigned long index;
+
+ drm_exec_for_each_locked_object(exec, index, obj) {
+ struct dma_resv *resv = obj->resv;
+
+ dma_resv_add_fence(resv, job->done_fence, job->resv_usage);
+ }
+}
+
+int
+nouveau_job_submit(struct nouveau_job *job)
+{
+ struct nouveau_sched_entity *entity = to_nouveau_sched_entity(job->base.entity);
+ struct dma_fence *done_fence = NULL;
+ int ret;
+
+ ret = nouveau_job_add_deps(job);
+ if (ret)
+ goto err;
+
+ ret = nouveau_job_fence_attach_prepare(job);
+ if (ret)
+ goto err;
+
+ /* Make sure the job appears on the sched_entity's queue in the same
+ * order as it was submitted.
+ */
+ mutex_lock(&entity->mutex);
+
+ drm_exec_init(&job->exec, true);
+
+ /* Guarantee that we won't fail after the submit() callback
+ * returned successfully.
+ */
+ if (job->ops->submit) {
+ ret = job->ops->submit(job);
+ if (ret)
+ goto err_cleanup;
+ }
+
+ drm_sched_job_arm(&job->base);
+ job->done_fence = dma_fence_get(&job->base.s_fence->finished);
+ if (job->sync)
+ done_fence = dma_fence_get(job->done_fence);
+
+ nouveau_job_fence_attach(job);
+ nouveau_job_resv_add_fence(job);
+
+ drm_exec_fini(&job->exec);
+
+ /* Set job state before pushing the job to the scheduler,
+ * such that we do not overwrite the job state set in run().
+ */
+ job->state = NOUVEAU_JOB_SUBMIT_SUCCESS;
+
+ drm_sched_entity_push_job(&job->base);
+
+ mutex_unlock(&entity->mutex);
+
+ if (done_fence) {
+ dma_fence_wait(done_fence, true);
+ dma_fence_put(done_fence);
+ }
+
+ return 0;
+
+err_cleanup:
+ drm_exec_fini(&job->exec);
+ mutex_unlock(&entity->mutex);
+ nouveau_job_fence_attach_cleanup(job);
+err:
+ job->state = NOUVEAU_JOB_SUBMIT_FAILED;
+ return ret;
+}
+
+bool
+nouveau_sched_entity_qwork(struct nouveau_sched_entity *entity,
+ struct work_struct *work)
+{
+ return queue_work(entity->sched_wq, work);
+}
+
+static struct dma_fence *
+nouveau_job_run(struct nouveau_job *job)
+{
+ struct dma_fence *fence;
+
+ fence = job->ops->run(job);
+ if (unlikely(IS_ERR(fence)))
+ job->state = NOUVEAU_JOB_RUN_FAILED;
+ else
+ job->state = NOUVEAU_JOB_RUN_SUCCESS;
+
+ return fence;
+}
+
+static struct dma_fence *
+nouveau_sched_run_job(struct drm_sched_job *sched_job)
+{
+ struct nouveau_job *job = to_nouveau_job(sched_job);
+
+ return nouveau_job_run(job);
+}
+
+static enum drm_gpu_sched_stat
+nouveau_sched_timedout_job(struct drm_sched_job *sched_job)
+{
+ struct nouveau_job *job = to_nouveau_job(sched_job);
+
+ NV_PRINTK(warn, job->cli, "Job timed out.\n");
+
+ if (job->ops->timeout)
+ return job->ops->timeout(job);
+
+ return DRM_GPU_SCHED_STAT_ENODEV;
+}
+
+static void
+nouveau_sched_free_job(struct drm_sched_job *sched_job)
+{
+ struct nouveau_job *job = to_nouveau_job(sched_job);
+
+ nouveau_job_fini(job);
+}
+
+int nouveau_sched_entity_init(struct nouveau_sched_entity *entity,
+ struct drm_gpu_scheduler *sched,
+ struct workqueue_struct *sched_wq)
+{
+ mutex_init(&entity->mutex);
+ spin_lock_init(&entity->job.list.lock);
+ INIT_LIST_HEAD(&entity->job.list.head);
+ init_waitqueue_head(&entity->job.wq);
+
+ entity->sched_wq = sched_wq;
+ return drm_sched_entity_init(&entity->base,
+ DRM_SCHED_PRIORITY_NORMAL,
+ &sched, 1, NULL);
+}
+
+void
+nouveau_sched_entity_fini(struct nouveau_sched_entity *entity)
+{
+ drm_sched_entity_destroy(&entity->base);
+}
+
+static const struct drm_sched_backend_ops nouveau_sched_ops = {
+ .run_job = nouveau_sched_run_job,
+ .timedout_job = nouveau_sched_timedout_job,
+ .free_job = nouveau_sched_free_job,
+};
+
+int nouveau_sched_init(struct nouveau_drm *drm)
+{
+ struct drm_gpu_scheduler *sched = &drm->sched;
+ long job_hang_limit = msecs_to_jiffies(NOUVEAU_SCHED_JOB_TIMEOUT_MS);
+
+ drm->sched_wq = create_singlethread_workqueue("nouveau_sched_wq");
+ if (!drm->sched_wq)
+ return -ENOMEM;
+
+ return drm_sched_init(sched, &nouveau_sched_ops,
+ NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
+ NULL, NULL, "nouveau_sched", drm->dev->dev);
+}
+
+void nouveau_sched_fini(struct nouveau_drm *drm)
+{
+ destroy_workqueue(drm->sched_wq);
+ drm_sched_fini(&drm->sched);
+}
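
Editorial note, not part of the patch: nouveau_job_init() in nouveau_sched.c
allocates several arrays in sequence and, on failure, releases whatever was
already allocated by jumping to a chain of labels in reverse allocation order.
The same pattern can be sketched in userspace C. Everything here is
hypothetical: struct out_sync, alloc_step() and the fail_at fault-injection
parameter exist only so the unwind paths can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical container for three arrays allocated in sequence. */
struct out_sync {
	int *data;
	void **objs;
	void **chains;
	size_t count;
};

/* Hypothetical fault injection: fail allocation number 'fail_at' (0 = never). */
static void *alloc_step(size_t n, size_t size, int step, int fail_at)
{
	if (step == fail_at)
		return NULL;
	return calloc(n, size);
}

int out_sync_init(struct out_sync *s, size_t count, int fail_at)
{
	int ret;

	s->count = count;
	s->data = NULL;
	s->objs = NULL;
	s->chains = NULL;

	s->data = alloc_step(count, sizeof(*s->data), 1, fail_at);
	if (!s->data) {
		ret = -ENOMEM;
		goto err;
	}

	s->objs = alloc_step(count, sizeof(*s->objs), 2, fail_at);
	if (!s->objs) {
		ret = -ENOMEM;
		goto err_free_data;
	}

	s->chains = alloc_step(count, sizeof(*s->chains), 3, fail_at);
	if (!s->chains) {
		ret = -ENOMEM;
		goto err_free_objs;
	}

	return 0;

	/* Labels appear in reverse allocation order; each falls through. */
err_free_objs:
	free(s->objs);
	s->objs = NULL;
err_free_data:
	free(s->data);
	s->data = NULL;
err:
	return ret;
}

void out_sync_fini(struct out_sync *s)
{
	free(s->chains);
	free(s->objs);
	free(s->data);
}
```

Each label is named after what it frees, and a failure at step N jumps to the
label that releases step N-1, so nothing is freed twice and nothing leaks.
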
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.h b/drivers/gpu/drm/nouveau/nouveau_sched.h
new file mode 100644
index 000000000000..8b27b5f3dd8d
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.h
@@ -0,0 +1,123 @@
+/* SPDX-License-Identifier: MIT */
+
+#ifndef NOUVEAU_SCHED_H
+#define NOUVEAU_SCHED_H
+
+#include <linux/types.h>
+
+#include <drm/drm_exec.h>
+#include <drm/gpu_scheduler.h>
+
+#include "nouveau_drv.h"
+
+#define to_nouveau_job(sched_job) \
+ container_of((sched_job), struct nouveau_job, base)
+
+struct nouveau_job_ops;
+
+enum nouveau_job_state {
+ NOUVEAU_JOB_UNINITIALIZED = 0,
+ NOUVEAU_JOB_INITIALIZED,
+ NOUVEAU_JOB_SUBMIT_SUCCESS,
+ NOUVEAU_JOB_SUBMIT_FAILED,
+ NOUVEAU_JOB_RUN_SUCCESS,
+ NOUVEAU_JOB_RUN_FAILED,
+};
+
+struct nouveau_job_args {
+ struct drm_file *file_priv;
+ struct nouveau_sched_entity *sched_entity;
+
+ enum dma_resv_usage resv_usage;
+ bool sync;
+
+ struct {
+ struct drm_nouveau_sync *s;
+ u32 count;
+ } in_sync;
+
+ struct {
+ struct drm_nouveau_sync *s;
+ u32 count;
+ } out_sync;
+
+ struct nouveau_job_ops *ops;
+};
+
+struct nouveau_job {
+ struct drm_sched_job base;
+
+ enum nouveau_job_state state;
+
+ struct nouveau_sched_entity *entity;
+
+ struct drm_file *file_priv;
+ struct nouveau_cli *cli;
+
+ struct drm_exec exec;
+ enum dma_resv_usage resv_usage;
+ struct dma_fence *done_fence;
+
+ bool sync;
+
+ struct {
+ struct drm_nouveau_sync *data;
+ u32 count;
+ } in_sync;
+
+ struct {
+ struct drm_nouveau_sync *data;
+ struct drm_syncobj **objs;
+ struct dma_fence_chain **chains;
+ u32 count;
+ } out_sync;
+
+ struct nouveau_job_ops {
+ int (*submit)(struct nouveau_job *);
+ struct dma_fence *(*run)(struct nouveau_job *);
+ void (*free)(struct nouveau_job *);
+ enum drm_gpu_sched_stat (*timeout)(struct nouveau_job *);
+ } *ops;
+};
+
+int nouveau_job_ucopy_syncs(struct nouveau_job_args *args,
+ u32 inc, u64 ins,
+ u32 outc, u64 outs);
+
+int nouveau_job_init(struct nouveau_job *job,
+ struct nouveau_job_args *args);
+void nouveau_job_free(struct nouveau_job *job);
+
+int nouveau_job_submit(struct nouveau_job *job);
+void nouveau_job_fini(struct nouveau_job *job);
+
+#define to_nouveau_sched_entity(entity) \
+ container_of((entity), struct nouveau_sched_entity, base)
+
+struct nouveau_sched_entity {
+ struct drm_sched_entity base;
+ struct mutex mutex;
+
+ struct workqueue_struct *sched_wq;
+
+ struct {
+ struct {
+ struct list_head head;
+ spinlock_t lock;
+ } list;
+ struct wait_queue_head wq;
+ } job;
+};
+
+int nouveau_sched_entity_init(struct nouveau_sched_entity *entity,
+ struct drm_gpu_scheduler *sched,
+ struct workqueue_struct *sched_wq);
+void nouveau_sched_entity_fini(struct nouveau_sched_entity *entity);
+
+bool nouveau_sched_entity_qwork(struct nouveau_sched_entity *entity,
+ struct work_struct *work);
+
+int nouveau_sched_init(struct nouveau_drm *drm);
+void nouveau_sched_fini(struct nouveau_drm *drm);
+
+#endif
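The to_nouveau_job() and to_nouveau_sched_entity() helpers in this header rely on container_of() to get from an embedded base object back to the driver structure that contains it. A minimal userspace sketch of the idea follows. The container_of() here is a simplified stand-in without the kernel's type checking, and sched_job, demo_job and demo_job_from_base() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct drm_sched_job. */
struct sched_job {
	int id;
};

/* Hypothetical stand-in for struct nouveau_job. */
struct demo_job {
	int state;
	struct sched_job base;	/* embedded base, deliberately not first */
};

#define to_demo_job(sched_job_ptr) \
	container_of((sched_job_ptr), struct demo_job, base)

/* Recover the outer job from a pointer to its embedded base. */
struct demo_job *demo_job_from_base(struct sched_job *sj)
{
	assert(sj != NULL);
	return to_demo_job(sj);
}
```

Because the offset of the member is subtracted from the pointer, the conversion works wherever the base is placed inside the outer structure. This is what lets the scheduler callbacks receive a generic job pointer and still reach the driver-private fields.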
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
new file mode 100644
index 000000000000..f530546e67e8
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -0,0 +1,1898 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright (c) 2022 Red Hat.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ * Danilo Krummrich <[email protected]>
+ *
+ */
+
+/*
+ * Locking:
+ *
+ * The uvmm mutex protects any operations on the GPU VA space provided by the
+ * DRM GPU VA manager.
+ *
+ * The DRM GEM GPUVA lock protects a GEM's GPUVA list. It also protects single
+ * map/unmap operations against a BO move, which itself walks the GEM's GPUVA
+ * list in order to map/unmap its entries.
+ *
+ * We'd also need to protect the DRM_GPUVA_EVICTED flag for each individual
+ * GPUVA, however this isn't necessary since any read or write to this flag
+ * happens when we already took the DRM GEM GPUVA lock of the backing GEM of
+ * the particular GPUVA.
+ */
+
+#include "nouveau_drv.h"
+#include "nouveau_gem.h"
+#include "nouveau_mem.h"
+#include "nouveau_uvmm.h"
+
+#include <nvif/vmm.h>
+#include <nvif/mem.h>
+
+#include <nvif/class.h>
+#include <nvif/if000c.h>
+#include <nvif/if900d.h>
+
+#define NOUVEAU_VA_SPACE_BITS 47 /* FIXME */
+#define NOUVEAU_VA_SPACE_START 0x0
+#define NOUVEAU_VA_SPACE_END (1ULL << NOUVEAU_VA_SPACE_BITS)
+
+#define list_last_op(_ops) list_last_entry(_ops, struct bind_job_op, entry)
+#define list_prev_op(_op) list_prev_entry(_op, entry)
+#define list_for_each_op(_op, _ops) list_for_each_entry(_op, _ops, entry)
+#define list_for_each_op_from_reverse(_op, _ops) \
+ list_for_each_entry_from_reverse(_op, _ops, entry)
+#define list_for_each_op_safe(_op, _n, _ops) list_for_each_entry_safe(_op, _n, _ops, entry)
+
+enum vm_bind_op {
+ OP_MAP = DRM_NOUVEAU_VM_BIND_OP_MAP,
+ OP_UNMAP = DRM_NOUVEAU_VM_BIND_OP_UNMAP,
+ OP_MAP_SPARSE,
+ OP_UNMAP_SPARSE,
+};
+
+struct nouveau_uvma_prealloc {
+ struct nouveau_uvma *map;
+ struct nouveau_uvma *prev;
+ struct nouveau_uvma *next;
+};
+
+struct bind_job_op {
+ struct list_head entry;
+
+ enum vm_bind_op op;
+ u32 flags;
+
+ struct {
+ u64 addr;
+ u64 range;
+ } va;
+
+ struct {
+ u32 handle;
+ u64 offset;
+ struct drm_gem_object *obj;
+ } gem;
+
+ struct nouveau_uvma_region *reg;
+ struct nouveau_uvma_prealloc new;
+ struct drm_gpuva_ops *ops;
+};
+
+struct uvmm_map_args {
+ struct nouveau_uvma_region *region;
+ u64 addr;
+ u64 range;
+ u8 kind;
+};
+
+static int
+nouveau_uvmm_vmm_sparse_ref(struct nouveau_uvmm *uvmm,
+ u64 addr, u64 range)
+{
+ struct nvif_vmm *vmm = &uvmm->vmm.vmm;
+
+ return nvif_vmm_raw_sparse(vmm, addr, range, true);
+}
+
+static int
+nouveau_uvmm_vmm_sparse_unref(struct nouveau_uvmm *uvmm,
+ u64 addr, u64 range)
+{
+ struct nvif_vmm *vmm = &uvmm->vmm.vmm;
+
+ return nvif_vmm_raw_sparse(vmm, addr, range, false);
+}
+
+static int
+nouveau_uvmm_vmm_get(struct nouveau_uvmm *uvmm,
+ u64 addr, u64 range)
+{
+ struct nvif_vmm *vmm = &uvmm->vmm.vmm;
+
+ return nvif_vmm_raw_get(vmm, addr, range, PAGE_SHIFT);
+}
+
+static int
+nouveau_uvmm_vmm_put(struct nouveau_uvmm *uvmm,
+ u64 addr, u64 range)
+{
+ struct nvif_vmm *vmm = &uvmm->vmm.vmm;
+
+ return nvif_vmm_raw_put(vmm, addr, range, PAGE_SHIFT);
+}
+
+static int
+nouveau_uvmm_vmm_unmap(struct nouveau_uvmm *uvmm,
+ u64 addr, u64 range, bool sparse)
+{
+ struct nvif_vmm *vmm = &uvmm->vmm.vmm;
+
+ return nvif_vmm_raw_unmap(vmm, addr, range, PAGE_SHIFT, sparse);
+}
+
+static int
+nouveau_uvmm_vmm_map(struct nouveau_uvmm *uvmm,
+ u64 addr, u64 range,
+ u64 bo_offset, u8 kind,
+ struct nouveau_mem *mem)
+{
+ struct nvif_vmm *vmm = &uvmm->vmm.vmm;
+ union {
+ struct gf100_vmm_map_v0 gf100;
+ } args;
+ u32 argc = 0;
+
+ switch (vmm->object.oclass) {
+ case NVIF_CLASS_VMM_GF100:
+ case NVIF_CLASS_VMM_GM200:
+ case NVIF_CLASS_VMM_GP100:
+ args.gf100.version = 0;
+ if (mem->mem.type & NVIF_MEM_VRAM)
+ args.gf100.vol = 0;
+ else
+ args.gf100.vol = 1;
+ args.gf100.ro = 0;
+ args.gf100.priv = 0;
+ args.gf100.kind = kind;
+ argc = sizeof(args.gf100);
+ break;
+ default:
+ WARN_ON(1);
+ return -ENOSYS;
+ }
+
+ return nvif_vmm_raw_map(vmm, addr, range, PAGE_SHIFT,
+ &args, argc,
+ &mem->mem, bo_offset);
+}
+
+static int
+nouveau_uvma_region_sparse_unref(struct nouveau_uvma_region *reg)
+{
+ u64 addr = reg->va.addr;
+ u64 range = reg->va.range;
+
+ return nouveau_uvmm_vmm_sparse_unref(reg->uvmm, addr, range);
+}
+
+static int
+nouveau_uvma_vmm_put(struct nouveau_uvma *uvma)
+{
+ u64 addr = uvma->va.va.addr;
+ u64 range = uvma->va.va.range;
+
+ return nouveau_uvmm_vmm_put(uvma->uvmm, addr, range);
+}
+
+static int
+nouveau_uvma_map(struct nouveau_uvma *uvma,
+ struct nouveau_mem *mem)
+{
+ u64 addr = uvma->va.va.addr;
+ u64 offset = uvma->va.gem.offset;
+ u64 range = uvma->va.va.range;
+
+ return nouveau_uvmm_vmm_map(uvma->uvmm, addr, range,
+ offset, uvma->kind, mem);
+}
+
+static int
+nouveau_uvma_unmap(struct nouveau_uvma *uvma)
+{
+ u64 addr = uvma->va.va.addr;
+ u64 range = uvma->va.va.range;
+ bool sparse = !!uvma->region;
+
+ if (drm_gpuva_evicted(&uvma->va))
+ return 0;
+
+ return nouveau_uvmm_vmm_unmap(uvma->uvmm, addr, range, sparse);
+}
+
+static int
+nouveau_uvma_alloc(struct nouveau_uvma **puvma)
+{
+ *puvma = kzalloc(sizeof(**puvma), GFP_KERNEL);
+ if (!*puvma)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void
+nouveau_uvma_free(struct nouveau_uvma *uvma)
+{
+ kfree(uvma);
+}
+
+static int
+__nouveau_uvma_insert(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma *uvma)
+{
+ return drm_gpuva_insert(&uvmm->umgr, &uvma->va);
+}
+
+static int
+nouveau_uvma_insert(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma *uvma,
+ struct nouveau_uvma_region *region,
+ struct drm_gem_object *obj,
+ u64 bo_offset, u64 addr,
+ u64 range, u8 kind)
+{
+ int ret;
+
+ uvma->uvmm = uvmm;
+ uvma->region = region;
+ uvma->kind = kind;
+ uvma->va.va.addr = addr;
+ uvma->va.va.range = range;
+ uvma->va.gem.offset = bo_offset;
+ uvma->va.gem.obj = obj;
+
+ ret = __nouveau_uvma_insert(uvmm, uvma);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static void
+nouveau_uvma_remove(struct nouveau_uvma *uvma)
+{
+ drm_gpuva_remove(&uvma->va);
+}
+
+static void
+nouveau_uvma_gem_get(struct nouveau_uvma *uvma)
+{
+ drm_gem_object_get(uvma->va.gem.obj);
+}
+
+static void
+nouveau_uvma_gem_put(struct nouveau_uvma *uvma)
+{
+ drm_gem_object_put(uvma->va.gem.obj);
+}
+
+static int
+nouveau_uvma_region_alloc(struct nouveau_uvma_region **preg)
+{
+ *preg = kzalloc(sizeof(**preg), GFP_KERNEL);
+ if (!*preg)
+ return -ENOMEM;
+
+ kref_init(&(*preg)->kref);
+
+ return 0;
+}
+
+static void
+nouveau_uvma_region_free(struct kref *kref)
+{
+ struct nouveau_uvma_region *reg =
+ container_of(kref, struct nouveau_uvma_region, kref);
+
+ kfree(reg);
+}
+
+static void
+nouveau_uvma_region_get(struct nouveau_uvma_region *reg)
+{
+ kref_get(&reg->kref);
+}
+
+static void
+nouveau_uvma_region_put(struct nouveau_uvma_region *reg)
+{
+ kref_put(&reg->kref, nouveau_uvma_region_free);
+}
+
+static int
+__nouveau_uvma_region_insert(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma_region *reg)
+{
+ u64 addr = reg->va.addr;
+ u64 range = reg->va.range;
+ u64 last = addr + range - 1;
+ MA_STATE(mas, &uvmm->region_mt, addr, addr);
+
+ if (unlikely(mas_walk(&mas))) {
+ mas_unlock(&mas);
+ return -EEXIST;
+ }
+
+ if (unlikely(mas.last < last)) {
+ mas_unlock(&mas);
+ return -EEXIST;
+ }
+
+ mas.index = addr;
+ mas.last = last;
+
+ mas_store_gfp(&mas, reg, GFP_KERNEL);
+
+ reg->uvmm = uvmm;
+
+ return 0;
+}
+
+static int
+nouveau_uvma_region_insert(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma_region *reg,
+ u64 addr, u64 range)
+{
+ int ret;
+
+ reg->uvmm = uvmm;
+ reg->va.addr = addr;
+ reg->va.range = range;
+
+ ret = __nouveau_uvma_region_insert(uvmm, reg);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static void
+nouveau_uvma_region_remove(struct nouveau_uvma_region *reg)
+{
+ struct nouveau_uvmm *uvmm = reg->uvmm;
+ MA_STATE(mas, &uvmm->region_mt, reg->va.addr, 0);
+
+ mas_erase(&mas);
+}
+
+static int
+nouveau_uvma_region_create(struct nouveau_uvmm *uvmm,
+ u64 addr, u64 range)
+{
+ struct nouveau_uvma_region *reg;
+ int ret;
+
+ if (!drm_gpuva_interval_empty(&uvmm->umgr, addr, range))
+ return -ENOSPC;
+
+ ret = nouveau_uvma_region_alloc(&reg);
+ if (ret)
+ return ret;
+
+ ret = nouveau_uvma_region_insert(uvmm, reg, addr, range);
+ if (ret)
+ goto err_free_region;
+
+ ret = nouveau_uvmm_vmm_sparse_ref(uvmm, addr, range);
+ if (ret)
+ goto err_region_remove;
+
+ return 0;
+
+err_region_remove:
+ nouveau_uvma_region_remove(reg);
+err_free_region:
+ nouveau_uvma_region_put(reg);
+ return ret;
+}
+
+static struct nouveau_uvma_region *
+nouveau_uvma_region_find_first(struct nouveau_uvmm *uvmm,
+ u64 addr, u64 range)
+{
+ MA_STATE(mas, &uvmm->region_mt, addr, 0);
+
+ return mas_find(&mas, addr + range - 1);
+}
+
+static struct nouveau_uvma_region *
+nouveau_uvma_region_find(struct nouveau_uvmm *uvmm,
+ u64 addr, u64 range)
+{
+ struct nouveau_uvma_region *reg;
+
+ reg = nouveau_uvma_region_find_first(uvmm, addr, range);
+ if (!reg)
+ return NULL;
+
+ if (reg->va.addr != addr ||
+ reg->va.range != range)
+ return NULL;
+
+ return reg;
+}
+
+static bool
+nouveau_uvma_region_empty(struct nouveau_uvma_region *reg)
+{
+ struct nouveau_uvmm *uvmm = reg->uvmm;
+
+ return drm_gpuva_interval_empty(&uvmm->umgr,
+ reg->va.addr,
+ reg->va.range);
+}
+
+static int
+__nouveau_uvma_region_destroy(struct nouveau_uvma_region *reg)
+{
+ struct nouveau_uvmm *uvmm = reg->uvmm;
+ u64 addr = reg->va.addr;
+ u64 range = reg->va.range;
+
+ if (!nouveau_uvma_region_empty(reg))
+ return -EBUSY;
+
+ nouveau_uvma_region_remove(reg);
+ nouveau_uvmm_vmm_sparse_unref(uvmm, addr, range);
+ nouveau_uvma_region_put(reg);
+
+ return 0;
+}
+
+static int
+nouveau_uvma_region_destroy(struct nouveau_uvmm *uvmm,
+ u64 addr, u64 range)
+{
+ struct nouveau_uvma_region *reg;
+
+ reg = nouveau_uvma_region_find(uvmm, addr, range);
+ if (!reg)
+ return -ENOENT;
+
+ return __nouveau_uvma_region_destroy(reg);
+}
+
+static void
+nouveau_uvma_region_dirty(struct nouveau_uvma_region *reg)
+{
+
+ init_completion(&reg->complete);
+ reg->dirty = true;
+}
+
+static void
+nouveau_uvma_region_complete(struct nouveau_uvma_region *reg)
+{
+ complete_all(&reg->complete);
+}
+
+static void
+op_map_prepare_unwind(struct nouveau_uvma *uvma)
+{
+ nouveau_uvma_gem_put(uvma);
+ nouveau_uvma_remove(uvma);
+ nouveau_uvma_free(uvma);
+}
+
+static void
+op_unmap_prepare_unwind(struct drm_gpuva *va)
+{
+ drm_gpuva_insert(va->mgr, va);
+}
+
+static void
+nouveau_uvmm_sm_prepare_unwind(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma_prealloc *new,
+ struct drm_gpuva_ops *ops,
+ struct drm_gpuva_op *last,
+ struct uvmm_map_args *args)
+{
+ struct drm_gpuva_op *op = last;
+ u64 vmm_get_start = args ? args->addr : 0;
+ u64 vmm_get_end = args ? args->addr + args->range : 0;
+
+ /* Unwind GPUVA space. */
+ drm_gpuva_for_each_op_from_reverse(op, ops) {
+ switch (op->op) {
+ case DRM_GPUVA_OP_MAP:
+ op_map_prepare_unwind(new->map);
+ break;
+ case DRM_GPUVA_OP_REMAP: {
+ struct drm_gpuva_op_remap *r = &op->remap;
+
+ if (r->next)
+ op_map_prepare_unwind(new->next);
+
+ if (r->prev)
+ op_map_prepare_unwind(new->prev);
+
+ op_unmap_prepare_unwind(r->unmap->va);
+ break;
+ }
+ case DRM_GPUVA_OP_UNMAP:
+ op_unmap_prepare_unwind(op->unmap.va);
+ break;
+ default:
+ break;
+ }
+ }
+
+ /* Unmap operations don't allocate page tables, hence skip the following
+ * page table unwind.
+ */
+ if (!args)
+ return;
+
+ drm_gpuva_for_each_op(op, ops) {
+ switch (op->op) {
+ case DRM_GPUVA_OP_MAP: {
+ u64 vmm_get_range = vmm_get_end - vmm_get_start;
+
+ if (vmm_get_range)
+ nouveau_uvmm_vmm_put(uvmm, vmm_get_start,
+ vmm_get_range);
+ break;
+ }
+ case DRM_GPUVA_OP_REMAP: {
+ struct drm_gpuva_op_remap *r = &op->remap;
+ struct drm_gpuva *va = r->unmap->va;
+ u64 ustart = va->va.addr;
+ u64 urange = va->va.range;
+ u64 uend = ustart + urange;
+
+ if (r->prev)
+ vmm_get_start = uend;
+
+ if (r->next)
+ vmm_get_end = ustart;
+
+ if (r->prev && r->next)
+ vmm_get_start = vmm_get_end = 0;
+
+ break;
+ }
+ case DRM_GPUVA_OP_UNMAP: {
+ struct drm_gpuva_op_unmap *u = &op->unmap;
+ struct drm_gpuva *va = u->va;
+ u64 ustart = va->va.addr;
+ u64 urange = va->va.range;
+ u64 uend = ustart + urange;
+
+ /* Nothing to do for mappings we merge with. */
+ if (uend == vmm_get_start ||
+ ustart == vmm_get_end)
+ break;
+
+ if (ustart > vmm_get_start) {
+ u64 vmm_get_range = ustart - vmm_get_start;
+
+ nouveau_uvmm_vmm_put(uvmm, vmm_get_start,
+ vmm_get_range);
+ }
+ vmm_get_start = uend;
+ break;
+ }
+ default:
+ break;
+ }
+
+ if (op == last)
+ break;
+ }
+}
+
+static void
+nouveau_uvmm_sm_map_prepare_unwind(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma_prealloc *new,
+ struct drm_gpuva_ops *ops,
+ u64 addr, u64 range)
+{
+ struct drm_gpuva_op *last = drm_gpuva_last_op(ops);
+ struct uvmm_map_args args = {
+ .addr = addr,
+ .range = range,
+ };
+
+ nouveau_uvmm_sm_prepare_unwind(uvmm, new, ops, last, &args);
+}
+
+static void
+nouveau_uvmm_sm_unmap_prepare_unwind(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma_prealloc *new,
+ struct drm_gpuva_ops *ops)
+{
+ struct drm_gpuva_op *last = drm_gpuva_last_op(ops);
+
+ nouveau_uvmm_sm_prepare_unwind(uvmm, new, ops, last, NULL);
+}
+
+static int
+op_map_prepare(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma **puvma,
+ struct drm_gpuva_op_map *m,
+ struct uvmm_map_args *args)
+{
+ struct nouveau_uvma *uvma;
+ int ret;
+
+ ret = nouveau_uvma_alloc(&uvma);
+ if (ret)
+ goto err;
+
+ ret = nouveau_uvma_insert(uvmm, uvma, args->region,
+ m->gem.obj, m->gem.offset,
+ m->va.addr, m->va.range,
+ args->kind);
+ if (ret)
+ goto err_free_uvma;
+
+ /* Keep a reference until this uvma is destroyed. */
+ nouveau_uvma_gem_get(uvma);
+
+ *puvma = uvma;
+ return 0;
+
+err_free_uvma:
+ nouveau_uvma_free(uvma);
+err:
+ *puvma = NULL;
+ return ret;
+}
+
+static void
+op_unmap_prepare(struct drm_gpuva_op_unmap *u)
+{
+ struct nouveau_uvma *uvma = uvma_from_va(u->va);
+
+ nouveau_uvma_remove(uvma);
+}
+
+static int
+nouveau_uvmm_sm_prepare(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma_prealloc *new,
+ struct drm_gpuva_ops *ops,
+ struct uvmm_map_args *args)
+{
+ struct drm_gpuva_op *op;
+ u64 vmm_get_start = args ? args->addr : 0;
+ u64 vmm_get_end = args ? args->addr + args->range : 0;
+ int ret;
+
+ drm_gpuva_for_each_op(op, ops) {
+ switch (op->op) {
+ case DRM_GPUVA_OP_MAP: {
+ u64 vmm_get_range = vmm_get_end - vmm_get_start;
+
+ ret = op_map_prepare(uvmm, &new->map, &op->map, args);
+ if (ret)
+ goto unwind;
+
+ if (args && vmm_get_range) {
+ ret = nouveau_uvmm_vmm_get(uvmm, vmm_get_start,
+ vmm_get_range);
+ if (ret) {
+ op_map_prepare_unwind(new->map);
+ goto unwind;
+ }
+ }
+ break;
+ }
+ case DRM_GPUVA_OP_REMAP: {
+ struct drm_gpuva_op_remap *r = &op->remap;
+ struct drm_gpuva *va = r->unmap->va;
+ struct uvmm_map_args remap_args = {
+ .kind = uvma_from_va(va)->kind,
+ };
+ u64 ustart = va->va.addr;
+ u64 urange = va->va.range;
+ u64 uend = ustart + urange;
+
+ op_unmap_prepare(r->unmap);
+
+ if (r->prev) {
+ ret = op_map_prepare(uvmm, &new->prev, r->prev,
+ &remap_args);
+ if (ret)
+ goto unwind;
+
+ if (args)
+ vmm_get_start = uend;
+ }
+
+ if (r->next) {
+ ret = op_map_prepare(uvmm, &new->next, r->next,
+ &remap_args);
+ if (ret) {
+ if (r->prev)
+ op_map_prepare_unwind(new->prev);
+ goto unwind;
+ }
+
+ if (args)
+ vmm_get_end = ustart;
+ }
+
+ if (args && (r->prev && r->next))
+ vmm_get_start = vmm_get_end = 0;
+
+ break;
+ }
+ case DRM_GPUVA_OP_UNMAP: {
+ struct drm_gpuva_op_unmap *u = &op->unmap;
+ struct drm_gpuva *va = u->va;
+ u64 ustart = va->va.addr;
+ u64 urange = va->va.range;
+ u64 uend = ustart + urange;
+
+ op_unmap_prepare(u);
+
+ if (!args)
+ break;
+
+ /* Nothing to do for mappings we merge with. */
+ if (uend == vmm_get_start ||
+ ustart == vmm_get_end)
+ break;
+
+ if (ustart > vmm_get_start) {
+ u64 vmm_get_range = ustart - vmm_get_start;
+
+ ret = nouveau_uvmm_vmm_get(uvmm, vmm_get_start,
+ vmm_get_range);
+ if (ret) {
+ op_unmap_prepare_unwind(va);
+ goto unwind;
+ }
+ }
+ vmm_get_start = uend;
+
+ break;
+ }
+ default:
+ ret = -EINVAL;
+ goto unwind;
+ }
+ }
+
+ return 0;
+
+unwind:
+ if (op != drm_gpuva_first_op(ops))
+ nouveau_uvmm_sm_prepare_unwind(uvmm, new, ops,
+ drm_gpuva_prev_op(op),
+ args);
+ return ret;
+}
+
+static int
+nouveau_uvmm_sm_map_prepare(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma_prealloc *new,
+ struct nouveau_uvma_region *region,
+ struct drm_gpuva_ops *ops,
+ u64 addr, u64 range, u8 kind)
+{
+ struct uvmm_map_args args = {
+ .region = region,
+ .addr = addr,
+ .range = range,
+ .kind = kind,
+ };
+
+ return nouveau_uvmm_sm_prepare(uvmm, new, ops, &args);
+}
+
+static int
+nouveau_uvmm_sm_unmap_prepare(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma_prealloc *new,
+ struct drm_gpuva_ops *ops)
+{
+ return nouveau_uvmm_sm_prepare(uvmm, new, ops, NULL);
+}
+
+static struct drm_gem_object *
+op_gem_obj(struct drm_gpuva_op *op)
+{
+ switch (op->op) {
+ case DRM_GPUVA_OP_MAP:
+ return op->map.gem.obj;
+ case DRM_GPUVA_OP_REMAP:
+ return op->remap.unmap->va->gem.obj;
+ case DRM_GPUVA_OP_UNMAP:
+ return op->unmap.va->gem.obj;
+ default:
+ WARN(1, "Unknown operation.\n");
+ return NULL;
+ }
+}
+
+static void
+op_map(struct nouveau_uvma *uvma)
+{
+ struct nouveau_bo *nvbo = nouveau_gem_object(uvma->va.gem.obj);
+
+ nouveau_uvma_map(uvma, nouveau_mem(nvbo->bo.resource));
+ drm_gpuva_link(&uvma->va);
+}
+
+static void
+op_unmap(struct drm_gpuva_op_unmap *u)
+{
+ struct drm_gpuva *va = u->va;
+ struct nouveau_uvma *uvma = uvma_from_va(va);
+
+ /* nouveau_uvma_unmap() does not unmap if backing BO is evicted. */
+ if (!u->keep)
+ nouveau_uvma_unmap(uvma);
+ drm_gpuva_unlink(va);
+}
+
+static void
+op_unmap_range(struct drm_gpuva_op_unmap *u,
+ u64 addr, u64 range)
+{
+ struct nouveau_uvma *uvma = uvma_from_va(u->va);
+ bool sparse = !!uvma->region;
+
+ if (!drm_gpuva_evicted(u->va))
+ nouveau_uvmm_vmm_unmap(uvma->uvmm, addr, range, sparse);
+
+ drm_gpuva_unlink(u->va);
+}
+
+static void
+op_remap(struct drm_gpuva_op_remap *r,
+ struct nouveau_uvma_prealloc *new)
+{
+ struct drm_gpuva_op_unmap *u = r->unmap;
+ struct nouveau_uvma *uvma = uvma_from_va(u->va);
+ u64 addr = uvma->va.va.addr;
+ u64 range = uvma->va.va.range;
+
+ if (r->prev) {
+ addr = r->prev->va.addr + r->prev->va.range;
+ drm_gpuva_link(&new->prev->va);
+ }
+
+ if (r->next) {
+ range = r->next->va.addr - addr;
+ drm_gpuva_link(&new->next->va);
+ }
+
+ op_unmap_range(u, addr, range);
+}
+
+static int
+nouveau_uvmm_sm(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma_prealloc *new,
+ struct drm_gpuva_ops *ops)
+{
+ struct drm_gpuva_op *op;
+
+ drm_gpuva_for_each_op(op, ops) {
+ struct drm_gem_object *obj = op_gem_obj(op);
+
+ if (!obj)
+ continue;
+
+ drm_gem_gpuva_lock(obj);
+ switch (op->op) {
+ case DRM_GPUVA_OP_MAP:
+ op_map(new->map);
+ break;
+ case DRM_GPUVA_OP_REMAP:
+ op_remap(&op->remap, new);
+ break;
+ case DRM_GPUVA_OP_UNMAP:
+ op_unmap(&op->unmap);
+ break;
+ default:
+ break;
+ }
+ drm_gem_gpuva_unlock(obj);
+ }
+
+ return 0;
+}
+
+static int
+nouveau_uvmm_sm_map(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma_prealloc *new,
+ struct drm_gpuva_ops *ops)
+{
+ return nouveau_uvmm_sm(uvmm, new, ops);
+}
+
+static int
+nouveau_uvmm_sm_unmap(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma_prealloc *new,
+ struct drm_gpuva_ops *ops)
+{
+ return nouveau_uvmm_sm(uvmm, new, ops);
+}
+
+static void
+nouveau_uvmm_sm_cleanup(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma_prealloc *new,
+ struct drm_gpuva_ops *ops, bool unmap)
+{
+ struct drm_gpuva_op *op;
+
+ drm_gpuva_for_each_op(op, ops) {
+ switch (op->op) {
+ case DRM_GPUVA_OP_MAP:
+ break;
+ case DRM_GPUVA_OP_REMAP: {
+ struct drm_gpuva_op_remap *r = &op->remap;
+ struct drm_gpuva_op_map *p = r->prev;
+ struct drm_gpuva_op_map *n = r->next;
+ struct drm_gpuva *va = r->unmap->va;
+ struct nouveau_uvma *uvma = uvma_from_va(va);
+
+ if (unmap) {
+ u64 addr = va->va.addr;
+ u64 end = addr + va->va.range;
+
+ if (p)
+ addr = p->va.addr + p->va.range;
+
+ if (n)
+ end = n->va.addr;
+
+ nouveau_uvmm_vmm_put(uvmm, addr, end - addr);
+ }
+
+ nouveau_uvma_gem_put(uvma);
+ nouveau_uvma_free(uvma);
+ break;
+ }
+ case DRM_GPUVA_OP_UNMAP: {
+ struct drm_gpuva_op_unmap *u = &op->unmap;
+ struct drm_gpuva *va = u->va;
+ struct nouveau_uvma *uvma = uvma_from_va(va);
+
+ if (unmap)
+ nouveau_uvma_vmm_put(uvma);
+
+ nouveau_uvma_gem_put(uvma);
+ nouveau_uvma_free(uvma);
+ break;
+ }
+ default:
+ break;
+ }
+ }
+}
+
+static void
+nouveau_uvmm_sm_map_cleanup(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma_prealloc *new,
+ struct drm_gpuva_ops *ops)
+{
+ nouveau_uvmm_sm_cleanup(uvmm, new, ops, false);
+}
+
+static void
+nouveau_uvmm_sm_unmap_cleanup(struct nouveau_uvmm *uvmm,
+ struct nouveau_uvma_prealloc *new,
+ struct drm_gpuva_ops *ops)
+{
+ nouveau_uvmm_sm_cleanup(uvmm, new, ops, true);
+}
+
+static int
+nouveau_uvmm_validate_range(struct nouveau_uvmm *uvmm, u64 addr, u64 range)
+{
+ u64 end = addr + range;
+ u64 unmanaged_end = uvmm->unmanaged_addr +
+ uvmm->unmanaged_size;
+
+ if (addr & ~PAGE_MASK)
+ return -EINVAL;
+
+ if (range & ~PAGE_MASK)
+ return -EINVAL;
+
+ if (end <= addr)
+ return -EINVAL;
+
+ if (addr < NOUVEAU_VA_SPACE_START ||
+ end > NOUVEAU_VA_SPACE_END)
+ return -EINVAL;
+
+ if (addr < unmanaged_end &&
+ end > uvmm->unmanaged_addr)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int
+nouveau_uvmm_bind_job_alloc(struct nouveau_uvmm_bind_job **pjob)
+{
+ *pjob = kzalloc(sizeof(**pjob), GFP_KERNEL);
+ if (!*pjob)
+ return -ENOMEM;
+
+ kref_init(&(*pjob)->kref);
+
+ return 0;
+}
+
+static void
+nouveau_uvmm_bind_job_free(struct kref *kref)
+{
+ struct nouveau_uvmm_bind_job *job =
+ container_of(kref, struct nouveau_uvmm_bind_job, kref);
+
+ nouveau_job_free(&job->base);
+ kfree(job);
+}
+
+static void
+nouveau_uvmm_bind_job_get(struct nouveau_uvmm_bind_job *job)
+{
+ kref_get(&job->kref);
+}
+
+static void
+nouveau_uvmm_bind_job_put(struct nouveau_uvmm_bind_job *job)
+{
+ kref_put(&job->kref, nouveau_uvmm_bind_job_free);
+}
+
+static int
+bind_validate_op(struct nouveau_job *job,
+ struct bind_job_op *op)
+{
+ struct nouveau_uvmm *uvmm = nouveau_cli_uvmm(job->cli);
+ struct drm_gem_object *obj = op->gem.obj;
+
+ if (op->op == OP_MAP) {
+ if (op->gem.offset & ~PAGE_MASK)
+ return -EINVAL;
+
+ if (obj->size <= op->gem.offset)
+ return -EINVAL;
+
+ if (op->va.range > (obj->size - op->gem.offset))
+ return -EINVAL;
+ }
+
+ return nouveau_uvmm_validate_range(uvmm, op->va.addr, op->va.range);
+}
+
+static void
+bind_validate_map_sparse(struct nouveau_job *job, u64 addr, u64 range)
+{
+ struct nouveau_uvmm_bind_job *bind_job;
+ struct nouveau_sched_entity *entity = job->entity;
+ struct bind_job_op *op;
+ u64 end = addr + range;
+
+again:
+ spin_lock(&entity->job.list.lock);
+ list_for_each_entry(bind_job, &entity->job.list.head, entry) {
+ list_for_each_op(op, &bind_job->ops) {
+ if (op->op == OP_UNMAP) {
+ u64 op_addr = op->va.addr;
+ u64 op_end = op_addr + op->va.range;
+
+ if (!(end <= op_addr || addr >= op_end)) {
+ nouveau_uvmm_bind_job_get(bind_job);
+ spin_unlock(&entity->job.list.lock);
+ wait_for_completion(&bind_job->complete);
+ nouveau_uvmm_bind_job_put(bind_job);
+ goto again;
+ }
+ }
+ }
+ }
+ spin_unlock(&entity->job.list.lock);
+}
+
+static int
+bind_validate_map_common(struct nouveau_job *job, u64 addr, u64 range,
+ bool sparse)
+{
+ struct nouveau_uvmm *uvmm = nouveau_cli_uvmm(job->cli);
+ struct nouveau_uvma_region *reg;
+ u64 reg_addr, reg_end;
+ u64 end = addr + range;
+
+again:
+ nouveau_uvmm_lock(uvmm);
+ reg = nouveau_uvma_region_find_first(uvmm, addr, range);
+ if (!reg) {
+ nouveau_uvmm_unlock(uvmm);
+ return 0;
+ }
+
+ /* Generally, job submits are serialized, hence only
+ * dirty regions can be modified concurrently. */
+ if (reg->dirty) {
+ nouveau_uvma_region_get(reg);
+ nouveau_uvmm_unlock(uvmm);
+ wait_for_completion(&reg->complete);
+ nouveau_uvma_region_put(reg);
+ goto again;
+ }
+ nouveau_uvmm_unlock(uvmm);
+
+ if (sparse)
+ return -ENOSPC;
+
+ reg_addr = reg->va.addr;
+ reg_end = reg_addr + reg->va.range;
+
+ /* Make sure the mapping is either outside of a
+ * region or fully enclosed by a region.
+ */
+ if (reg_addr > addr || reg_end < end)
+ return -ENOSPC;
+
+ return 0;
+}
+
+static int
+bind_validate_region(struct nouveau_job *job)
+{
+ struct nouveau_uvmm_bind_job *bind_job = to_uvmm_bind_job(job);
+ struct bind_job_op *op;
+ int ret;
+
+ list_for_each_op(op, &bind_job->ops) {
+ u64 op_addr = op->va.addr;
+ u64 op_range = op->va.range;
+ bool sparse = false;
+
+ switch (op->op) {
+ case OP_MAP_SPARSE:
+ sparse = true;
+ bind_validate_map_sparse(job, op_addr, op_range);
+ fallthrough;
+ case OP_MAP:
+ ret = bind_validate_map_common(job, op_addr, op_range,
+ sparse);
+ if (ret)
+ return ret;
+ break;
+ default:
+ break;
+ }
+ }
+
+ return 0;
+}
+
+static int
+nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
+{
+ struct nouveau_uvmm *uvmm = nouveau_cli_uvmm(job->cli);
+ struct nouveau_uvmm_bind_job *bind_job = to_uvmm_bind_job(job);
+ struct nouveau_sched_entity *entity = job->entity;
+ struct drm_exec *exec = &job->exec;
+ struct drm_gem_object *obj;
+ struct bind_job_op *op;
+ unsigned long index;
+ int ret;
+
+ list_for_each_op(op, &bind_job->ops) {
+ if (op->op == OP_MAP) {
+ op->gem.obj = drm_gem_object_lookup(job->file_priv,
+ op->gem.handle);
+ if (!op->gem.obj)
+ return -ENOENT;
+ }
+
+ ret = bind_validate_op(job, op);
+ if (ret)
+ return ret;
+ }
+
+ /* If a sparse region or mapping overlaps a dirty region, we need to
+ * wait for the region to complete the unbind process. This is due to
+ * how page table management is currently implemented. A future
+ * implementation might change this.
+ */
+ ret = bind_validate_region(job);
+ if (ret)
+ return ret;
+
+ /* Once we start modifying the GPU VA space we need to keep holding the
+ * uvmm lock until we can't fail anymore. This is because the set of GPU
+ * VA space changes must appear atomically and we need to be able to
+ * unwind all GPU VA space changes on failure.
+ */
+ nouveau_uvmm_lock(uvmm);
+ list_for_each_op(op, &bind_job->ops) {
+ switch (op->op) {
+ case OP_MAP_SPARSE:
+ ret = nouveau_uvma_region_create(uvmm,
+ op->va.addr,
+ op->va.range);
+ if (ret)
+ goto unwind_continue;
+
+ break;
+ case OP_UNMAP_SPARSE:
+ op->reg = nouveau_uvma_region_find(uvmm, op->va.addr,
+ op->va.range);
+ if (!op->reg || op->reg->dirty) {
+ ret = -ENOENT;
+ goto unwind_continue;
+ }
+
+ op->ops = drm_gpuva_sm_unmap_ops_create(&uvmm->umgr,
+ op->va.addr,
+ op->va.range);
+ if (IS_ERR(op->ops)) {
+ ret = PTR_ERR(op->ops);
+ goto unwind_continue;
+ }
+
+ ret = nouveau_uvmm_sm_unmap_prepare(uvmm, &op->new,
+ op->ops);
+ if (ret) {
+ drm_gpuva_ops_free(&uvmm->umgr, op->ops);
+ op->ops = NULL;
+ op->reg = NULL;
+ goto unwind_continue;
+ }
+
+ nouveau_uvma_region_dirty(op->reg);
+
+ break;
+ case OP_MAP: {
+ struct nouveau_uvma_region *reg;
+
+ reg = nouveau_uvma_region_find_first(uvmm,
+ op->va.addr,
+ op->va.range);
+ if (reg) {
+ u64 reg_addr = reg->va.addr;
+ u64 reg_end = reg_addr + reg->va.range;
+ u64 op_addr = op->va.addr;
+ u64 op_end = op_addr + op->va.range;
+
+ if (unlikely(reg->dirty)) {
+ ret = -EINVAL;
+ goto unwind_continue;
+ }
+
+ /* Make sure the mapping is either outside of a
+ * region or fully enclosed by a region.
+ */
+ if (reg_addr > op_addr || reg_end < op_end) {
+ ret = -ENOSPC;
+ goto unwind_continue;
+ }
+ }
+
+ op->ops = drm_gpuva_sm_map_ops_create(&uvmm->umgr,
+ op->va.addr,
+ op->va.range,
+ op->gem.obj,
+ op->gem.offset);
+ if (IS_ERR(op->ops)) {
+ ret = PTR_ERR(op->ops);
+ goto unwind_continue;
+ }
+
+ ret = nouveau_uvmm_sm_map_prepare(uvmm, &op->new,
+ reg, op->ops,
+ op->va.addr,
+ op->va.range,
+ op->flags & 0xff);
+ if (ret) {
+ drm_gpuva_ops_free(&uvmm->umgr, op->ops);
+ op->ops = NULL;
+ goto unwind_continue;
+ }
+
+ break;
+ }
+ case OP_UNMAP:
+ op->ops = drm_gpuva_sm_unmap_ops_create(&uvmm->umgr,
+ op->va.addr,
+ op->va.range);
+ if (IS_ERR(op->ops)) {
+ ret = PTR_ERR(op->ops);
+ goto unwind_continue;
+ }
+
+ ret = nouveau_uvmm_sm_unmap_prepare(uvmm, &op->new,
+ op->ops);
+ if (ret) {
+ drm_gpuva_ops_free(&uvmm->umgr, op->ops);
+ op->ops = NULL;
+ goto unwind_continue;
+ }
+
+ break;
+ default:
+ ret = -EINVAL;
+ goto unwind_continue;
+ }
+ }
+
+ drm_exec_while_not_all_locked(exec) {
+ list_for_each_op(op, &bind_job->ops) {
+ if (op->op != OP_MAP)
+ continue;
+
+ ret = drm_exec_prepare_obj(exec, op->gem.obj, 1);
+ drm_exec_break_on_contention(exec);
+ if (ret == -EALREADY) {
+ continue;
+ } else if (ret) {
+ op = list_last_op(&bind_job->ops);
+ goto unwind;
+ }
+ }
+ }
+
+ drm_exec_for_each_locked_object(exec, index, obj) {
+ struct nouveau_bo *nvbo = nouveau_gem_object(obj);
+
+ ret = nouveau_bo_validate(nvbo, true, false);
+ if (ret) {
+ op = list_last_op(&bind_job->ops);
+ goto unwind;
+ }
+ }
+ nouveau_uvmm_unlock(uvmm);
+
+ spin_lock(&entity->job.list.lock);
+ list_add(&bind_job->entry, &entity->job.list.head);
+ spin_unlock(&entity->job.list.lock);
+
+ return 0;
+
+unwind_continue:
+ op = list_prev_op(op);
+unwind:
+ list_for_each_op_from_reverse(op, &bind_job->ops) {
+ switch (op->op) {
+ case OP_MAP_SPARSE:
+ nouveau_uvma_region_destroy(uvmm, op->va.addr,
+ op->va.range);
+ break;
+ case OP_UNMAP_SPARSE:
+ __nouveau_uvma_region_insert(uvmm, op->reg);
+ nouveau_uvmm_sm_unmap_prepare_unwind(uvmm, &op->new,
+ op->ops);
+ break;
+ case OP_MAP:
+ nouveau_uvmm_sm_map_prepare_unwind(uvmm, &op->new,
+ op->ops,
+ op->va.addr,
+ op->va.range);
+ break;
+ case OP_UNMAP:
+ nouveau_uvmm_sm_unmap_prepare_unwind(uvmm, &op->new,
+ op->ops);
+ break;
+ }
+
+ drm_gpuva_ops_free(&uvmm->umgr, op->ops);
+ op->ops = NULL;
+ op->reg = NULL;
+ }
+
+ nouveau_uvmm_unlock(uvmm);
+ return ret;
+}
+
+static struct dma_fence *
+nouveau_uvmm_bind_job_run(struct nouveau_job *job)
+{
+ struct nouveau_uvmm_bind_job *bind_job = to_uvmm_bind_job(job);
+ struct nouveau_uvmm *uvmm = nouveau_cli_uvmm(job->cli);
+ struct bind_job_op *op;
+ int ret = 0;
+
+ list_for_each_op(op, &bind_job->ops) {
+ switch (op->op) {
+ case OP_MAP_SPARSE:
+ /* noop */
+ break;
+ case OP_MAP:
+ ret = nouveau_uvmm_sm_map(uvmm, &op->new, op->ops);
+ if (ret)
+ goto out;
+ break;
+ case OP_UNMAP_SPARSE:
+ fallthrough;
+ case OP_UNMAP:
+ ret = nouveau_uvmm_sm_unmap(uvmm, &op->new, op->ops);
+ if (ret)
+ goto out;
+ break;
+ }
+ }
+
+out:
+ if (ret)
+ NV_PRINTK(err, job->cli, "bind job failed: %d\n", ret);
+ return ERR_PTR(ret);
+}
+
+static void
+nouveau_uvmm_bind_job_free_work_fn(struct work_struct *work)
+{
+ struct nouveau_uvmm_bind_job *bind_job =
+ container_of(work, struct nouveau_uvmm_bind_job, work);
+ struct nouveau_job *job = &bind_job->base;
+ struct nouveau_uvmm *uvmm = nouveau_cli_uvmm(job->cli);
+ struct nouveau_sched_entity *entity = job->entity;
+ struct bind_job_op *op, *next;
+
+ list_for_each_op(op, &bind_job->ops) {
+ struct drm_gem_object *obj = op->gem.obj;
+
+ /* When nouveau_uvmm_bind_job_submit() fails op->ops and op->reg
+ * will be NULL, hence skip the cleanup.
+ */
+ switch (op->op) {
+ case OP_MAP_SPARSE:
+ /* noop */
+ break;
+ case OP_UNMAP_SPARSE:
+ if (!IS_ERR_OR_NULL(op->ops))
+ nouveau_uvmm_sm_unmap_cleanup(uvmm, &op->new,
+ op->ops);
+
+ if (op->reg) {
+ nouveau_uvma_region_sparse_unref(op->reg);
+ nouveau_uvmm_lock(uvmm);
+ nouveau_uvma_region_remove(op->reg);
+ nouveau_uvmm_unlock(uvmm);
+ nouveau_uvma_region_complete(op->reg);
+ nouveau_uvma_region_put(op->reg);
+ }
+
+ break;
+ case OP_MAP:
+ if (!IS_ERR_OR_NULL(op->ops))
+ nouveau_uvmm_sm_map_cleanup(uvmm, &op->new,
+ op->ops);
+ break;
+ case OP_UNMAP:
+ if (!IS_ERR_OR_NULL(op->ops))
+ nouveau_uvmm_sm_unmap_cleanup(uvmm, &op->new,
+ op->ops);
+ break;
+ }
+
+ if (!IS_ERR_OR_NULL(op->ops))
+ drm_gpuva_ops_free(&uvmm->umgr, op->ops);
+
+ if (obj)
+ drm_gem_object_put(obj);
+ }
+
+ spin_lock(&entity->job.list.lock);
+ list_del(&bind_job->entry);
+ spin_unlock(&entity->job.list.lock);
+
+ complete_all(&bind_job->complete);
+ wake_up(&entity->job.wq);
+
+ /* Remove and free ops after removing the bind job from the job list to
+ * avoid races against bind_validate_map_sparse().
+ */
+ list_for_each_op_safe(op, next, &bind_job->ops) {
+ list_del(&op->entry);
+ kfree(op);
+ }
+
+ nouveau_uvmm_bind_job_put(bind_job);
+}
+
+static void
+nouveau_uvmm_bind_job_free_qwork(struct nouveau_job *job)
+{
+ struct nouveau_uvmm_bind_job *bind_job = to_uvmm_bind_job(job);
+ struct nouveau_sched_entity *entity = job->entity;
+
+ nouveau_sched_entity_qwork(entity, &bind_job->work);
+}
+
+static struct nouveau_job_ops nouveau_bind_job_ops = {
+ .submit = nouveau_uvmm_bind_job_submit,
+ .run = nouveau_uvmm_bind_job_run,
+ .free = nouveau_uvmm_bind_job_free_qwork,
+};
+
+static int
+bind_job_op_from_uop(struct bind_job_op **pop,
+ struct drm_nouveau_vm_bind_op *uop)
+{
+ struct bind_job_op *op;
+
+ op = *pop = kzalloc(sizeof(*op), GFP_KERNEL);
+ if (!op)
+ return -ENOMEM;
+
+ switch (uop->op) {
+ case OP_MAP:
+ op->op = uop->flags & DRM_NOUVEAU_VM_BIND_SPARSE ?
+ OP_MAP_SPARSE : OP_MAP;
+ break;
+ case OP_UNMAP:
+ op->op = uop->flags & DRM_NOUVEAU_VM_BIND_SPARSE ?
+ OP_UNMAP_SPARSE : OP_UNMAP;
+ break;
+ default:
+ op->op = uop->op;
+ break;
+ }
+
+ op->flags = uop->flags;
+ op->va.addr = uop->addr;
+ op->va.range = uop->range;
+ op->gem.handle = uop->handle;
+ op->gem.offset = uop->bo_offset;
+
+ return 0;
+}
+
+static void
+bind_job_ops_free(struct list_head *ops)
+{
+ struct bind_job_op *op, *next;
+
+ list_for_each_op_safe(op, next, ops) {
+ list_del(&op->entry);
+ kfree(op);
+ }
+}
+
+static int
+nouveau_uvmm_bind_job_init(struct nouveau_uvmm_bind_job **pjob,
+ struct nouveau_uvmm_bind_job_args *__args)
+{
+ struct nouveau_uvmm_bind_job *job;
+ struct nouveau_job_args args = {};
+ struct bind_job_op *op;
+ int i, ret;
+
+ ret = nouveau_uvmm_bind_job_alloc(&job);
+ if (ret)
+ return ret;
+
+ INIT_LIST_HEAD(&job->ops);
+ INIT_LIST_HEAD(&job->entry);
+
+ for (i = 0; i < __args->op.count; i++) {
+ ret = bind_job_op_from_uop(&op, &__args->op.s[i]);
+ if (ret)
+ goto err_free;
+
+ list_add_tail(&op->entry, &job->ops);
+ }
+
+ init_completion(&job->complete);
+ INIT_WORK(&job->work, nouveau_uvmm_bind_job_free_work_fn);
+
+ args.sched_entity = __args->sched_entity;
+ args.file_priv = __args->file_priv;
+
+ args.in_sync.count = __args->in_sync.count;
+ args.in_sync.s = __args->in_sync.s;
+
+ args.out_sync.count = __args->out_sync.count;
+ args.out_sync.s = __args->out_sync.s;
+
+ args.sync = !(__args->flags & DRM_NOUVEAU_VM_BIND_RUN_ASYNC);
+ args.ops = &nouveau_bind_job_ops;
+ args.resv_usage = DMA_RESV_USAGE_BOOKKEEP;
+
+ ret = nouveau_job_init(&job->base, &args);
+ if (ret)
+ goto err_free;
+
+ *pjob = job;
+ return 0;
+
+err_free:
+ bind_job_ops_free(&job->ops);
+ kfree(job);
+ *pjob = NULL;
+
+ return ret;
+}
+
+int
+nouveau_uvmm_ioctl_vm_init(struct drm_device *dev,
+ void *data,
+ struct drm_file *file_priv)
+{
+ struct nouveau_cli *cli = nouveau_cli(file_priv);
+ struct drm_nouveau_vm_init *init = data;
+
+ return nouveau_uvmm_init(&cli->uvmm, cli, init->unmanaged_addr,
+ init->unmanaged_size);
+}
+
+static int
+nouveau_uvmm_vm_bind(struct nouveau_uvmm_bind_job_args *args)
+{
+ struct nouveau_uvmm_bind_job *job;
+ int ret;
+
+ ret = nouveau_uvmm_bind_job_init(&job, args);
+ if (ret)
+ return ret;
+
+ ret = nouveau_job_submit(&job->base);
+ if (ret)
+ goto err_job_fini;
+
+ return 0;
+
+err_job_fini:
+ nouveau_job_fini(&job->base);
+ return ret;
+}
+
+static int
+nouveau_uvmm_vm_bind_ucopy(struct nouveau_uvmm_bind_job_args *args,
+ struct drm_nouveau_vm_bind __user *req)
+{
+ struct drm_nouveau_sync **s;
+ u32 inc = req->wait_count;
+ u64 ins = req->wait_ptr;
+ u32 outc = req->sig_count;
+ u64 outs = req->sig_ptr;
+ u32 opc = req->op_count;
+ u64 ops = req->op_ptr;
+ int ret;
+
+ args->flags = req->flags;
+
+ args->op.count = opc;
+ args->op.s = u_memcpya(ops, opc,
+ sizeof(*args->op.s));
+ if (IS_ERR(args->op.s))
+ return PTR_ERR(args->op.s);
+
+ if (inc) {
+ s = &args->in_sync.s;
+
+ args->in_sync.count = inc;
+ *s = u_memcpya(ins, inc, sizeof(**s));
+ if (IS_ERR(*s)) {
+ ret = PTR_ERR(*s);
+ goto err_free_ops;
+ }
+ }
+
+ if (outc) {
+ s = &args->out_sync.s;
+
+ args->out_sync.count = outc;
+ *s = u_memcpya(outs, outc, sizeof(**s));
+ if (IS_ERR(*s)) {
+ ret = PTR_ERR(*s);
+ goto err_free_ins;
+ }
+ }
+
+ return 0;
+
+err_free_ins:
+ u_free(args->in_sync.s);
+err_free_ops:
+ u_free(args->op.s);
+ return ret;
+}
+
+static void
+nouveau_uvmm_vm_bind_ufree(struct nouveau_uvmm_bind_job_args *args)
+{
+ u_free(args->op.s);
+ u_free(args->in_sync.s);
+ u_free(args->out_sync.s);
+}
+
+int
+nouveau_uvmm_ioctl_vm_bind(struct drm_device *dev,
+ void __user *data,
+ struct drm_file *file_priv)
+{
+ struct nouveau_cli *cli = nouveau_cli(file_priv);
+ struct nouveau_uvmm_bind_job_args args = {};
+ struct drm_nouveau_vm_bind __user *req = data;
+ int ret = 0;
+
+ if (unlikely(!nouveau_cli_uvmm_locked(cli)))
+ return -ENOSYS;
+
+ ret = nouveau_uvmm_vm_bind_ucopy(&args, req);
+ if (ret)
+ return ret;
+
+ args.sched_entity = &cli->sched_entity;
+ args.file_priv = file_priv;
+
+ ret = nouveau_uvmm_vm_bind(&args);
+ if (ret)
+ goto out_free_args;
+
+out_free_args:
+ nouveau_uvmm_vm_bind_ufree(&args);
+ return ret;
+}
+
+void
+nouveau_uvmm_bo_map_all(struct nouveau_bo *nvbo, struct nouveau_mem *mem)
+{
+ struct drm_gem_object *obj = &nvbo->bo.base;
+ struct drm_gpuva *va;
+
+ drm_gem_gpuva_lock(obj);
+ drm_gem_for_each_gpuva(va, obj) {
+ struct nouveau_uvma *uvma = uvma_from_va(va);
+
+ nouveau_uvma_map(uvma, mem);
+ drm_gpuva_evict(va, false);
+ }
+ drm_gem_gpuva_unlock(obj);
+}
+
+void
+nouveau_uvmm_bo_unmap_all(struct nouveau_bo *nvbo)
+{
+ struct drm_gem_object *obj = &nvbo->bo.base;
+ struct drm_gpuva *va;
+
+ drm_gem_gpuva_lock(obj);
+ drm_gem_for_each_gpuva(va, obj) {
+ struct nouveau_uvma *uvma = uvma_from_va(va);
+
+ nouveau_uvma_unmap(uvma);
+ drm_gpuva_evict(va, true);
+ }
+ drm_gem_gpuva_unlock(obj);
+}
+
+int
+nouveau_uvmm_init(struct nouveau_uvmm *uvmm, struct nouveau_cli *cli,
+ u64 unmanaged_addr, u64 unmanaged_size)
+{
+ int ret;
+ u64 unmanaged_end = unmanaged_addr + unmanaged_size;
+
+ mutex_init(&uvmm->mutex);
+ mt_init_flags(&uvmm->region_mt, MT_FLAGS_LOCK_EXTERN);
+ mt_set_external_lock(&uvmm->region_mt, &uvmm->mutex);
+
+ mutex_lock(&cli->mutex);
+
+ if (unlikely(cli->uvmm.disabled)) {
+ ret = -ENOSYS;
+ goto out_unlock;
+ }
+
+ if (unmanaged_end <= unmanaged_addr) {
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+
+ if (unmanaged_end > NOUVEAU_VA_SPACE_END) {
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+
+ uvmm->unmanaged_addr = unmanaged_addr;
+ uvmm->unmanaged_size = unmanaged_size;
+
+ drm_gpuva_manager_init(&uvmm->umgr, cli->name,
+ NOUVEAU_VA_SPACE_START,
+ NOUVEAU_VA_SPACE_END,
+ unmanaged_addr, unmanaged_size,
+ NULL);
+
+ ret = nvif_vmm_ctor(&cli->mmu, "uvmm",
+ cli->vmm.vmm.object.oclass, RAW,
+ unmanaged_addr, unmanaged_size,
+ NULL, 0, &cli->uvmm.vmm.vmm);
+ if (ret)
+ goto out_free_gpuva_mgr;
+
+ cli->uvmm.vmm.cli = cli;
+ mutex_unlock(&cli->mutex);
+
+ return 0;
+
+out_free_gpuva_mgr:
+ drm_gpuva_manager_destroy(&uvmm->umgr);
+out_unlock:
+ mutex_unlock(&cli->mutex);
+ return ret;
+}
+
+void
+nouveau_uvmm_fini(struct nouveau_uvmm *uvmm)
+{
+ DRM_GPUVA_ITER(it, &uvmm->umgr, 0);
+ MA_STATE(mas, &uvmm->region_mt, 0, 0);
+ struct nouveau_uvma_region *reg;
+ struct nouveau_cli *cli = uvmm->vmm.cli;
+ struct nouveau_sched_entity *entity = &cli->sched_entity;
+ struct drm_gpuva *va;
+
+ if (!cli)
+ return;
+
+ rmb(); /* for list_empty to work without lock */
+ wait_event(entity->job.wq, list_empty(&entity->job.list.head));
+
+ nouveau_uvmm_lock(uvmm);
+ drm_gpuva_iter_for_each(va, it) {
+ struct nouveau_uvma *uvma = uvma_from_va(va);
+ struct drm_gem_object *obj = va->gem.obj;
+
+ if (unlikely(va == &uvmm->umgr.kernel_alloc_node))
+ continue;
+
+ drm_gpuva_iter_remove(&it);
+
+ drm_gem_gpuva_lock(obj);
+ nouveau_uvma_unmap(uvma);
+ drm_gpuva_unlink(va);
+ drm_gem_gpuva_unlock(obj);
+
+ nouveau_uvma_vmm_put(uvma);
+
+ nouveau_uvma_gem_put(uvma);
+ nouveau_uvma_free(uvma);
+ }
+
+ mas_for_each(&mas, reg, ULONG_MAX) {
+ mas_erase(&mas);
+ nouveau_uvma_region_sparse_unref(reg);
+ nouveau_uvma_region_put(reg);
+ }
+
+ WARN(!mtree_empty(&uvmm->region_mt),
+ "nouveau_uvma_region tree not empty, potentially leaking memory.");
+ __mt_destroy(&uvmm->region_mt);
+ nouveau_uvmm_unlock(uvmm);
+
+ mutex_lock(&cli->mutex);
+ nouveau_vmm_fini(&uvmm->vmm);
+ drm_gpuva_manager_destroy(&uvmm->umgr);
+ mutex_unlock(&cli->mutex);
+}
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.h b/drivers/gpu/drm/nouveau/nouveau_uvmm.h
new file mode 100644
index 000000000000..374b8fbd2a59
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.h
@@ -0,0 +1,107 @@
+/* SPDX-License-Identifier: MIT */
+
+#ifndef __NOUVEAU_UVMM_H__
+#define __NOUVEAU_UVMM_H__
+
+#include <drm/drm_gpuva_mgr.h>
+
+#include "nouveau_drv.h"
+
+struct nouveau_uvmm {
+ struct nouveau_vmm vmm;
+ struct drm_gpuva_manager umgr;
+ struct maple_tree region_mt;
+ struct mutex mutex;
+
+ u64 unmanaged_addr;
+ u64 unmanaged_size;
+
+ bool disabled;
+};
+
+struct nouveau_uvma_region {
+ struct nouveau_uvmm *uvmm;
+
+ struct {
+ u64 addr;
+ u64 range;
+ } va;
+
+ struct kref kref;
+
+ struct completion complete;
+ bool dirty;
+};
+
+struct nouveau_uvma {
+ struct drm_gpuva va;
+
+ struct nouveau_uvmm *uvmm;
+ struct nouveau_uvma_region *region;
+
+ u8 kind;
+};
+
+struct nouveau_uvmm_bind_job {
+ struct nouveau_job base;
+
+ struct kref kref;
+ struct list_head entry;
+ struct work_struct work;
+ struct completion complete;
+
+ /* struct bind_job_op */
+ struct list_head ops;
+};
+
+struct nouveau_uvmm_bind_job_args {
+ struct drm_file *file_priv;
+ struct nouveau_sched_entity *sched_entity;
+
+ unsigned int flags;
+
+ struct {
+ struct drm_nouveau_sync *s;
+ u32 count;
+ } in_sync;
+
+ struct {
+ struct drm_nouveau_sync *s;
+ u32 count;
+ } out_sync;
+
+ struct {
+ struct drm_nouveau_vm_bind_op *s;
+ u32 count;
+ } op;
+};
+
+#define to_uvmm_bind_job(job) container_of((job), struct nouveau_uvmm_bind_job, base)
+
+#define uvmm_from_mgr(x) container_of((x), struct nouveau_uvmm, umgr)
+#define uvma_from_va(x) container_of((x), struct nouveau_uvma, va)
+
+int nouveau_uvmm_init(struct nouveau_uvmm *uvmm, struct nouveau_cli *cli,
+ u64 unmanaged_addr, u64 unmanaged_size);
+void nouveau_uvmm_fini(struct nouveau_uvmm *uvmm);
+
+void nouveau_uvmm_bo_map_all(struct nouveau_bo *nvbo, struct nouveau_mem *mem);
+void nouveau_uvmm_bo_unmap_all(struct nouveau_bo *nvbo);
+
+int nouveau_uvmm_ioctl_vm_init(struct drm_device *dev, void __user *data,
+ struct drm_file *file_priv);
+
+int nouveau_uvmm_ioctl_vm_bind(struct drm_device *dev, void __user *data,
+ struct drm_file *file_priv);
+
+static inline void nouveau_uvmm_lock(struct nouveau_uvmm *uvmm)
+{
+ mutex_lock(&uvmm->mutex);
+}
+
+static inline void nouveau_uvmm_unlock(struct nouveau_uvmm *uvmm)
+{
+ mutex_unlock(&uvmm->mutex);
+}
+
+#endif
--
2.40.1

2023-06-06 22:57:02

by Danilo Krummrich

Subject: [PATCH drm-next v4 02/14] maple_tree: split up MA_STATE() macro

Split up the MA_STATE() macro such that components using the maple tree
can easily inherit from struct ma_state and build custom tree walk
macros to hide their internals from users.

Example:

struct sample_iterator {
struct ma_state mas;
struct sample_mgr *mgr;
};

#define SAMPLE_ITERATOR(name, __mgr, start) \
struct sample_iterator name = { \
.mas = MA_STATE_INIT(&(__mgr)->mt, start, 0), \
.mgr = __mgr, \
}

#define sample_iter_for_each_range(it__, entry__, end__) \
mas_for_each(&(it__).mas, entry__, end__)

--

struct sample *sample;
SAMPLE_ITERATOR(si, mgr, min);

sample_iter_for_each_range(si, sample, max) {
frob(mgr, sample);
}
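
The same split can be tried outside the kernel. The following is a minimal
userspace sketch, not kernel code: struct demo_state is a hypothetical
stand-in for struct ma_state, and the DEMO_* macros, struct demo_mgr and
demo_sum_from() are names invented for illustration only. It shows why a bare
initializer macro helps: a declaration macro cannot initialize a member nested
inside another structure, while a brace-enclosed initializer can.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ma_state; not the kernel type. */
struct demo_state {
	const int *tree;	/* backing array playing the role of the tree */
	size_t index;		/* current position */
	size_t last;		/* last valid position */
};

/* Initializer only, so it can also be used for nested members. */
#define DEMO_STATE_INIT(t, first, end)	\
	{				\
		.tree = (t),		\
		.index = (first),	\
		.last = (end),		\
	}

/* The declaration macro is rebuilt on top of the initializer. */
#define DEMO_STATE(name, t, first, end) \
	struct demo_state name = DEMO_STATE_INIT(t, first, end)

/* Hypothetical component owning the data that is walked. */
struct demo_mgr {
	int values[4];
};

/* A custom iterator "inherits" from the state by embedding it. */
struct demo_iterator {
	struct demo_state st;
	struct demo_mgr *mgr;
};

#define DEMO_ITERATOR(name, mgr__, start)				\
	struct demo_iterator name = {					\
		.st = DEMO_STATE_INIT((mgr__)->values, start, 3),	\
		.mgr = (mgr__),						\
	}

/* Sum all values from @start to the end, walking via the iterator. */
static int demo_sum_from(struct demo_mgr *mgr, size_t start)
{
	DEMO_ITERATOR(it, mgr, start);
	int sum = 0;

	assert(it.st.index <= it.st.last + 1);

	for (; it.st.index <= it.st.last; it.st.index++)
		sum += it.st.tree[it.st.index];

	return sum;
}
```

Callers of demo_sum_from() never touch struct demo_state directly, which is
the kind of encapsulation the split-up macro is meant to allow.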

Signed-off-by: Danilo Krummrich <[email protected]>
---
include/linux/maple_tree.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
index 1fadb5f5978b..87d55334f1c2 100644
--- a/include/linux/maple_tree.h
+++ b/include/linux/maple_tree.h
@@ -423,8 +423,8 @@ struct ma_wr_state {
#define MA_ERROR(err) \
((struct maple_enode *)(((unsigned long)err << 2) | 2UL))

-#define MA_STATE(name, mt, first, end) \
- struct ma_state name = { \
+#define MA_STATE_INIT(mt, first, end) \
+ { \
.tree = mt, \
.index = first, \
.last = end, \
@@ -435,6 +435,9 @@ struct ma_wr_state {
.mas_flags = 0, \
}

+#define MA_STATE(name, mt, first, end) \
+ struct ma_state name = MA_STATE_INIT(mt, first, end)
+
#define MA_WR_STATE(name, ma_state, wr_entry) \
struct ma_wr_state name = { \
.mas = ma_state, \
--
2.40.1

2023-06-06 23:15:36

by Danilo Krummrich

Subject: [PATCH drm-next v4 03/14] drm: manager to keep track of GPUs VA mappings

Add infrastructure to keep track of GPU virtual address (VA) mappings
with a dedicated VA space manager implementation.

New UAPIs, motivated by the Vulkan sparse memory bindings that graphics
drivers are starting to implement, allow userspace applications to request
multiple and arbitrary GPU VA mappings of buffer objects. The DRM GPU VA
manager is intended to serve the following purposes in this context.

1) Provide infrastructure to track GPU VA allocations and mappings,
making use of the maple_tree.

2) Generically connect GPU VA mappings to their backing buffers, in
particular DRM GEM objects.

3) Provide a common implementation to perform more complex mapping
operations on the GPU VA space. In particular splitting and merging
of GPU VA mappings, e.g. for intersecting mapping requests or partial
unmap requests.
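
The splitting mentioned in 3) boils down to range arithmetic. The following is
a minimal userspace sketch of that arithmetic only, under stated assumptions:
struct demo_va, struct demo_split and demo_split_mapping() are hypothetical
names, not part of this series, and the sketch handles a single existing
mapping rather than walking a tree as the manager does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mapping descriptor; not struct drm_gpuva. */
struct demo_va {
	uint64_t addr;		/* start of the mapping in VA space */
	uint64_t range;		/* size of the mapping */
	uint64_t bo_offset;	/* offset into the backing buffer */
};

/* Remainders of an existing mapping once a request has been carved out. */
struct demo_split {
	bool has_prev;		/* a piece survives in front of the request */
	bool has_next;		/* a piece survives behind the request */
	struct demo_va prev;
	struct demo_va next;
};

/*
 * Returns false if the request does not touch @old at all. Otherwise fills
 * @out; if neither has_prev nor has_next is set, @old is fully covered by
 * the request and would simply be unmapped.
 */
static bool demo_split_mapping(const struct demo_va *old,
			       uint64_t req_addr, uint64_t req_range,
			       struct demo_split *out)
{
	uint64_t old_end = old->addr + old->range;
	uint64_t req_end = req_addr + req_range;

	assert(old->range && req_range);

	*out = (struct demo_split){ 0 };

	if (req_end <= old->addr || old_end <= req_addr)
		return false;

	if (old->addr < req_addr) {
		out->has_prev = true;
		out->prev.addr = old->addr;
		out->prev.range = req_addr - old->addr;
		out->prev.bo_offset = old->bo_offset;
	}

	if (req_end < old_end) {
		out->has_next = true;
		out->next.addr = req_end;
		out->next.range = old_end - req_end;
		/* The tail begins req_end - old->addr into the old mapping. */
		out->next.bo_offset = old->bo_offset + (req_end - old->addr);
	}

	return true;
}
```

A request can cut at most one existing mapping at its start and one at its
end, so a full implementation would need this split for no more than two
mappings per request; everything in between is covered entirely.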

Suggested-by: Dave Airlie <[email protected]>
Signed-off-by: Danilo Krummrich <[email protected]>
---
Documentation/gpu/drm-mm.rst | 31 +
drivers/gpu/drm/Makefile | 1 +
drivers/gpu/drm/drm_gem.c | 3 +
drivers/gpu/drm/drm_gpuva_mgr.c | 1687 +++++++++++++++++++++++++++++++
include/drm/drm_drv.h | 6 +
include/drm/drm_gem.h | 75 ++
include/drm/drm_gpuva_mgr.h | 681 +++++++++++++
7 files changed, 2484 insertions(+)
create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
create mode 100644 include/drm/drm_gpuva_mgr.h

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index a52e6f4117d6..c9f120cfe730 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -466,6 +466,37 @@ DRM MM Range Allocator Function References
.. kernel-doc:: drivers/gpu/drm/drm_mm.c
:export:

+DRM GPU VA Manager
+==================
+
+Overview
+--------
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+ :doc: Overview
+
+Split and Merge
+---------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+ :doc: Split and Merge
+
+Locking
+-------
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+ :doc: Locking
+
+
+DRM GPU VA Manager Function References
+--------------------------------------
+
+.. kernel-doc:: include/drm/drm_gpuva_mgr.h
+ :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+ :export:
+
DRM Buddy Allocator
===================

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 9c6446eb3c83..8eeed446a078 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -45,6 +45,7 @@ drm-y := \
drm_vblank.o \
drm_vblank_work.o \
drm_vma_manager.o \
+ drm_gpuva_mgr.o \
drm_writeback.o
drm-$(CONFIG_DRM_LEGACY) += \
drm_agpsupport.o \
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 1a5a2cd0d4ec..cd878ebddbd0 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -164,6 +164,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
if (!obj->resv)
obj->resv = &obj->_resv;

+ if (drm_core_check_feature(dev, DRIVER_GEM_GPUVA))
+ drm_gem_gpuva_init(obj);
+
drm_vma_node_reset(&obj->vma_node);
INIT_LIST_HEAD(&obj->lru_node);
}
diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c
new file mode 100644
index 000000000000..dd8dd7fef14b
--- /dev/null
+++ b/drivers/gpu/drm/drm_gpuva_mgr.c
@@ -0,0 +1,1687 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2022 Red Hat.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ * Danilo Krummrich <[email protected]>
+ *
+ */
+
+#include <drm/drm_gem.h>
+#include <drm/drm_gpuva_mgr.h>
+
+/**
+ * DOC: Overview
+ *
+ * The DRM GPU VA Manager, represented by struct drm_gpuva_manager keeps track
+ * of a GPU's virtual address (VA) space and manages the corresponding virtual
+ * mappings represented by &drm_gpuva objects. It also keeps track of the
+ * mapping's backing &drm_gem_object buffers.
+ *
+ * &drm_gem_object buffers maintain a list (and a corresponding list lock) of
+ * &drm_gpuva objects representing all existent GPU VA mappings using this
+ * &drm_gem_object as backing buffer.
+ *
+ * GPU VAs can be flagged as sparse, such that drivers may use GPU VAs to also
+ * keep track of sparse PTEs in order to support Vulkan 'Sparse Resources'.
+ *
+ * The GPU VA manager internally uses a &maple_tree to manage the
+ * &drm_gpuva mappings within a GPU's virtual address space.
+ *
+ * The &drm_gpuva_manager contains a special &drm_gpuva representing the
+ * portion of VA space reserved by the kernel. This node is initialized together
+ * with the GPU VA manager instance and removed when the GPU VA manager is
+ * destroyed.
+ *
+ * In a typical application drivers would embed struct drm_gpuva_manager and
+ * struct drm_gpuva within their own driver specific structures, there won't be
+ * any memory allocations of its own nor memory allocations of &drm_gpuva
+ * entries.
+ *
+ * However, the &drm_gpuva_manager needs to allocate nodes for its internal
+ * tree structures when &drm_gpuva entries are inserted. In order to support
+ * inserting &drm_gpuva entries from dma-fence signalling critical sections the
+ * &drm_gpuva_manager provides struct drm_gpuva_prealloc. Drivers may create
+ * pre-allocated nodes with drm_gpuva_prealloc_create() and subsequently insert
+ * a new &drm_gpuva entry with drm_gpuva_insert_prealloc().
+ */
+
+/**
+ * DOC: Split and Merge
+ *
+ * The DRM GPU VA manager also provides an algorithm implementing splitting and
+ * merging of existent GPU VA mappings with the ones that are requested to be
+ * mapped or unmapped. This feature is required by the Vulkan API to implement
+ * Vulkan 'Sparse Memory Bindings' - driver UAPIs often refer to this as
+ * VM BIND.
+ *
+ * Drivers can call drm_gpuva_sm_map() to receive a sequence of callbacks
+ * containing map, unmap and remap operations for a given newly requested
+ * mapping. The sequence of callbacks represents the set of operations to
+ * execute in order to integrate the new mapping cleanly into the current state
+ * of the GPU VA space.
+ *
+ * Depending on how the new GPU VA mapping intersects with the existent mappings
+ * of the GPU VA space the &drm_gpuva_fn_ops callbacks contain an arbitrary
+ * amount of unmap operations, a maximum of two remap operations and a single
+ * map operation. The caller might receive no callback at all if no operation is
+ * required, e.g. if the requested mapping already exists in the exact same way.
+ *
+ * The single map operation represents the original map operation requested by
+ * the caller.
+ *
+ * &drm_gpuva_op_unmap contains a 'keep' field, which indicates whether the
+ * &drm_gpuva to unmap is physically contiguous with the original mapping
+ * request. Optionally, if 'keep' is set, drivers may keep the actual page table
+ * entries for this &drm_gpuva, adding the missing page table entries only and
+ * update the &drm_gpuva_manager's view of things accordingly.
+ *
+ * Drivers may do the same optimization, namely delta page table updates, also
+ * for remap operations. This is possible since &drm_gpuva_op_remap consists of
+ * one unmap operation and one or two map operations, such that drivers can
+ * derive the page table update delta accordingly.
+ *
+ * Note that there can't be more than two existent mappings to split up, one at
+ * the beginning and one at the end of the new mapping, hence there is a
+ * maximum of two remap operations.
+ *
+ * Analogous to drm_gpuva_sm_map() drm_gpuva_sm_unmap() uses &drm_gpuva_fn_ops
+ * to call back into the driver in order to unmap a range of GPU VA space. The
+ * logic behind this function is way simpler though: For all existent mappings
+ * enclosed by the given range unmap operations are created. For mappings which
+ * are only partially located within the given range, remap operations are
+ * created such that those mappings are split up and re-mapped partially.
+ *
+ * To update the &drm_gpuva_manager's view of the GPU VA space
+ * drm_gpuva_insert(), drm_gpuva_insert_prealloc(), and drm_gpuva_remove() may
+ * be used. Please note that these functions are not safe to be called from a
+ * &drm_gpuva_fn_ops callback originating from drm_gpuva_sm_map() or
+ * drm_gpuva_sm_unmap(). The drm_gpuva_map(), drm_gpuva_remap() and
+ * drm_gpuva_unmap() helpers should be used instead.
+ *
+ * The following diagram depicts the basic relationships of existent GPU VA
+ * mappings, a newly requested mapping and the resulting mappings as implemented
+ * by drm_gpuva_sm_map() - it doesn't cover any arbitrary combinations of these.
+ *
+ * 1) Requested mapping is identical. Replace it, but indicate the backing PTEs
+ * could be kept.
+ *
+ * ::
+ *
+ * 0 a 1
+ * old: |-----------| (bo_offset=n)
+ *
+ * 0 a 1
+ * req: |-----------| (bo_offset=n)
+ *
+ * 0 a 1
+ * new: |-----------| (bo_offset=n)
+ *
+ *
+ * 2) Requested mapping is identical, except for the BO offset, hence replace
+ * the mapping.
+ *
+ * ::
+ *
+ * 0 a 1
+ * old: |-----------| (bo_offset=n)
+ *
+ * 0 a 1
+ * req: |-----------| (bo_offset=m)
+ *
+ * 0 a 1
+ * new: |-----------| (bo_offset=m)
+ *
+ *
+ * 3) Requested mapping is identical, except for the backing BO, hence replace
+ * the mapping.
+ *
+ * ::
+ *
+ * 0 a 1
+ * old: |-----------| (bo_offset=n)
+ *
+ * 0 b 1
+ * req: |-----------| (bo_offset=n)
+ *
+ * 0 b 1
+ * new: |-----------| (bo_offset=n)
+ *
+ *
+ * 4) Existent mapping is a left aligned subset of the requested one, hence
+ * replace the existent one.
+ *
+ * ::
+ *
+ * 0 a 1
+ * old: |-----| (bo_offset=n)
+ *
+ * 0 a 2
+ * req: |-----------| (bo_offset=n)
+ *
+ * 0 a 2
+ * new: |-----------| (bo_offset=n)
+ *
+ * .. note::
+ * We expect to see the same result for a request with a different BO
+ * and/or non-contiguous BO offset.
+ *
+ *
+ * 5) Requested mapping's range is a left aligned subset of the existent one,
+ * but backed by a different BO. Hence, map the requested mapping and split
+ * the existent one adjusting its BO offset.
+ *
+ * ::
+ *
+ * 0 a 2
+ * old: |-----------| (bo_offset=n)
+ *
+ * 0 b 1
+ * req: |-----| (bo_offset=n)
+ *
+ * 0 b 1 a' 2
+ * new: |-----|-----| (b.bo_offset=n, a.bo_offset=n+1)
+ *
+ * .. note::
+ * We expect to see the same result for a request with a different BO
+ * and/or non-contiguous BO offset.
+ *
+ *
+ * 6) Existent mapping is a superset of the requested mapping. Split it up, but
+ * indicate that the backing PTEs could be kept.
+ *
+ * ::
+ *
+ * 0 a 2
+ * old: |-----------| (bo_offset=n)
+ *
+ * 0 a 1
+ * req: |-----| (bo_offset=n)
+ *
+ * 0 a 1 a' 2
+ * new: |-----|-----| (a.bo_offset=n, a'.bo_offset=n+1)
+ *
+ *
+ * 7) Requested mapping's range is a right aligned subset of the existent one,
+ * but backed by a different BO. Hence, map the requested mapping and split
+ * the existent one, without adjusting the BO offset.
+ *
+ * ::
+ *
+ * 0 a 2
+ * old: |-----------| (bo_offset=n)
+ *
+ * 1 b 2
+ * req: |-----| (bo_offset=m)
+ *
+ * 0 a 1 b 2
+ * new: |-----|-----| (a.bo_offset=n,b.bo_offset=m)
+ *
+ *
+ * 8) Existent mapping is a superset of the requested mapping. Split it up, but
+ * indicate that the backing PTEs could be kept.
+ *
+ * ::
+ *
+ * 0 a 2
+ * old: |-----------| (bo_offset=n)
+ *
+ * 1 a 2
+ * req: |-----| (bo_offset=n+1)
+ *
+ * 0 a' 1 a 2
+ * new: |-----|-----| (a'.bo_offset=n, a.bo_offset=n+1)
+ *
+ *
+ * 9) Existent mapping is overlapped at the end by the requested mapping backed
+ * by a different BO. Hence, map the requested mapping and split up the
+ * existent one, without adjusting the BO offset.
+ *
+ * ::
+ *
+ * 0 a 2
+ * old: |-----------| (bo_offset=n)
+ *
+ * 1 b 3
+ * req: |-----------| (bo_offset=m)
+ *
+ * 0 a 1 b 3
+ * new: |-----|-----------| (a.bo_offset=n,b.bo_offset=m)
+ *
+ *
+ * 10) Existent mapping is overlapped by the requested mapping, both having the
+ * same backing BO with a contiguous offset. Indicate the backing PTEs of
+ * the old mapping could be kept.
+ *
+ * ::
+ *
+ * 0 a 2
+ * old: |-----------| (bo_offset=n)
+ *
+ * 1 a 3
+ * req: |-----------| (bo_offset=n+1)
+ *
+ * 0 a' 1 a 3
+ * new: |-----|-----------| (a'.bo_offset=n, a.bo_offset=n+1)
+ *
+ *
+ * 11) Requested mapping's range is a centered subset of the existent one
+ * having a different backing BO. Hence, map the requested mapping and split
+ * up the existent one in two mappings, adjusting the BO offset of the right
+ * one accordingly.
+ *
+ * ::
+ *
+ * 0 a 3
+ * old: |-----------------| (bo_offset=n)
+ *
+ * 1 b 2
+ * req: |-----| (bo_offset=m)
+ *
+ * 0 a 1 b 2 a' 3
+ * new: |-----|-----|-----| (a.bo_offset=n,b.bo_offset=m,a'.bo_offset=n+2)
+ *
+ *
+ * 12) Requested mapping is a contiguous subset of the existent one. Split it
+ * up, but indicate that the backing PTEs could be kept.
+ *
+ * ::
+ *
+ * 0 a 3
+ * old: |-----------------| (bo_offset=n)
+ *
+ * 1 a 2
+ * req: |-----| (bo_offset=n+1)
+ *
+ * 0 a' 1 a 2 a'' 3
+ * new: |-----|-----|-----| (a'.bo_offset=n, a.bo_offset=n+1, a''.bo_offset=n+2)
+ *
+ *
+ * 13) Existent mapping is a right aligned subset of the requested one, hence
+ * replace the existent one.
+ *
+ * ::
+ *
+ * 1 a 2
+ * old: |-----| (bo_offset=n+1)
+ *
+ * 0 a 2
+ * req: |-----------| (bo_offset=n)
+ *
+ * 0 a 2
+ * new: |-----------| (bo_offset=n)
+ *
+ * .. note::
+ * We expect to see the same result for a request with a different bo
+ * and/or non-contiguous bo_offset.
+ *
+ *
+ * 14) Existent mapping is a centered subset of the requested one, hence
+ * replace the existent one.
+ *
+ * ::
+ *
+ * 1 a 2
+ * old: |-----| (bo_offset=n+1)
+ *
+ * 0 a 3
+ * req: |----------------| (bo_offset=n)
+ *
+ * 0 a 3
+ * new: |----------------| (bo_offset=n)
+ *
+ * .. note::
+ * We expect to see the same result for a request with a different bo
+ * and/or non-contiguous bo_offset.
+ *
+ *
+ * 15) Existent mapping is overlapped at the beginning by the requested mapping
+ * backed by a different BO. Hence, map the requested mapping and split up
+ * the existent one, adjusting its BO offset accordingly.
+ *
+ * ::
+ *
+ * 1 a 3
+ * old: |-----------| (bo_offset=n)
+ *
+ * 0 b 2
+ * req: |-----------| (bo_offset=m)
+ *
+ * 0 b 2 a' 3
+ * new: |-----------|-----| (b.bo_offset=m,a.bo_offset=n+2)
+ */
+
+/**
+ * DOC: Locking
+ *
+ * Generally, the GPU VA manager does not take care of locking itself, it is
+ * the driver's responsibility to take care of locking. Drivers might want to
+ * protect the following operations: inserting, removing and iterating
+ * &drm_gpuva objects as well as generating all kinds of operations, such as
+ * split / merge or prefetch.
+ *
+ * The GPU VA manager also does not take care of the locking of the backing
+ * &drm_gem_object buffers' GPU VA lists by itself; drivers are responsible to
+ * enforce mutual exclusion.
+ */
+
+ /*
+ * Maple Tree Locking
+ *
+ * The maple tree's advanced API requires the user of the API to protect
+ * certain tree operations with a lock (either the external or internal tree
+ * lock) for tree internal reasons.
+ *
+ * The actual rules (when to acquire/release the lock) are enforced by lockdep
+ * through the maple tree implementation.
+ *
+ * For this reason the DRM GPUVA manager takes the maple tree's internal
+ * spinlock according to the lockdep enforced rules.
+ *
+ * Please note that this lock is *only* meant to fulfill the maple tree's
+ * requirements and intentionally does not protect the DRM GPUVA manager
+ * against concurrent access.
+ *
+ * The following mail thread provides more details on why the maple tree
+ * has this requirement.
+ *
+ * https://lore.kernel.org/lkml/[email protected]/
+ */
+
+static int __drm_gpuva_insert(struct drm_gpuva_manager *mgr,
+ struct drm_gpuva *va);
+static void __drm_gpuva_remove(struct drm_gpuva *va);
+
+/**
+ * drm_gpuva_manager_init - initialize a &drm_gpuva_manager
+ * @mgr: pointer to the &drm_gpuva_manager to initialize
+ * @name: the name of the GPU VA space
+ * @start_offset: the start offset of the GPU VA space
+ * @range: the size of the GPU VA space
+ * @reserve_offset: the start of the kernel reserved GPU VA area
+ * @reserve_range: the size of the kernel reserved GPU VA area
+ * @ops: &drm_gpuva_fn_ops called on &drm_gpuva_sm_map / &drm_gpuva_sm_unmap
+ *
+ * The &drm_gpuva_manager must be initialized with this function before use.
+ *
+ * Note that @mgr must be cleared to 0 before calling this function. The given
+ * &name is expected to be managed by the surrounding driver structures.
+ */
+void
+drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
+ const char *name,
+ u64 start_offset, u64 range,
+ u64 reserve_offset, u64 reserve_range,
+ const struct drm_gpuva_fn_ops *ops)
+{
+ mt_init(&mgr->mtree);
+
+ mgr->mm_start = start_offset;
+ mgr->mm_range = range;
+
+ mgr->name = name ? name : "unknown";
+ mgr->ops = ops;
+
+ memset(&mgr->kernel_alloc_node, 0, sizeof(struct drm_gpuva));
+
+ if (reserve_range) {
+ mgr->kernel_alloc_node.va.addr = reserve_offset;
+ mgr->kernel_alloc_node.va.range = reserve_range;
+
+ __drm_gpuva_insert(mgr, &mgr->kernel_alloc_node);
+ }
+
+}
+EXPORT_SYMBOL(drm_gpuva_manager_init);
+
+/**
+ * drm_gpuva_manager_destroy - cleanup a &drm_gpuva_manager
+ * @mgr: pointer to the &drm_gpuva_manager to clean up
+ *
+ * Note that it is a bug to call this function on a manager that still
+ * holds GPU VA mappings.
+ */
+void
+drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr)
+{
+ mgr->name = NULL;
+
+ if (mgr->kernel_alloc_node.va.range)
+ __drm_gpuva_remove(&mgr->kernel_alloc_node);
+
+ mtree_lock(&mgr->mtree);
+ WARN(!mtree_empty(&mgr->mtree),
+ "GPUVA tree is not empty, potentially leaking memory.");
+ __mt_destroy(&mgr->mtree);
+ mtree_unlock(&mgr->mtree);
+}
+EXPORT_SYMBOL(drm_gpuva_manager_destroy);
+
+static inline bool
+drm_gpuva_in_mm_range(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
+{
+ u64 end = addr + range;
+ u64 mm_start = mgr->mm_start;
+ u64 mm_end = mm_start + mgr->mm_range;
+
+ return addr < mm_end && mm_start < end;
+}
+
+static inline bool
+drm_gpuva_in_kernel_node(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
+{
+ u64 end = addr + range;
+ u64 kstart = mgr->kernel_alloc_node.va.addr;
+ u64 krange = mgr->kernel_alloc_node.va.range;
+ u64 kend = kstart + krange;
+
+ return krange && addr < kend && kstart < end;
+}
+
+static inline bool
+drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
+ u64 addr, u64 range)
+{
+ return drm_gpuva_in_mm_range(mgr, addr, range) &&
+ !drm_gpuva_in_kernel_node(mgr, addr, range);
+}
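The three range helpers above reduce to a half-open interval intersection test. The following userspace sketch restates that arithmetic; struct va_space, ranges_overlap() and range_valid() are hypothetical names used for illustration only and are not part of the patch. Note that, as written in the patch, the mm range check tests for intersection with the managed VA space rather than full containment, and the sketch keeps that behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the range fields kept by the manager. */
struct va_space {
	uint64_t mm_start, mm_range;	/* managed VA space */
	uint64_t k_start, k_range;	/* kernel reserved area, k_range == 0 if none */
};

/* True if the half-open intervals [a, a + ar) and [b, b + br) intersect. */
bool ranges_overlap(uint64_t a, uint64_t ar, uint64_t b, uint64_t br)
{
	return a < b + br && b < a + ar;
}

/* Same decision as drm_gpuva_range_valid(): touch the VA space, avoid the kernel node. */
bool range_valid(const struct va_space *s, uint64_t addr, uint64_t range)
{
	bool in_mm = ranges_overlap(addr, range, s->mm_start, s->mm_range);
	bool in_kernel = s->k_range &&
			 ranges_overlap(addr, range, s->k_start, s->k_range);

	return in_mm && !in_kernel;
}

/* Usage: VA space [0x1000, 0x11000) with the first page reserved. */
void example_range_valid(void)
{
	struct va_space s = { 0x1000, 0x10000, 0x1000, 0x1000 };

	assert(range_valid(&s, 0x2000, 0x1000));	/* plain user range */
	assert(!range_valid(&s, 0x1800, 0x1000));	/* hits the kernel node */
}
```

A request that only partially lies inside the managed space still passes this check, which is worth keeping in mind when validating userspace input in a driver.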
+
+/**
+ * drm_gpuva_iter_remove - removes the iterator's current element
+ * @it: the &drm_gpuva_iterator
+ *
+ * This removes the element the iterator currently points to.
+ */
+void
+drm_gpuva_iter_remove(struct drm_gpuva_iterator *it)
+{
+ mas_lock(&it->mas);
+ mas_erase(&it->mas);
+ mas_unlock(&it->mas);
+}
+EXPORT_SYMBOL(drm_gpuva_iter_remove);
+
+/**
+ * drm_gpuva_prealloc_create - creates a preallocated node to store a
+ * &drm_gpuva entry.
+ *
+ * Returns: the &drm_gpuva_prealloc object on success, NULL on failure
+ */
+struct drm_gpuva_prealloc *
+drm_gpuva_prealloc_create(struct drm_gpuva_manager *mgr)
+{
+ struct drm_gpuva_prealloc *pa;
+
+ pa = kzalloc(sizeof(*pa), GFP_KERNEL);
+ if (!pa)
+ return NULL;
+
+ mas_init(&pa->mas, &mgr->mtree, 0);
+ if (mas_preallocate(&pa->mas, GFP_KERNEL)) {
+ kfree(pa);
+ return NULL;
+ }
+
+ return pa;
+}
+EXPORT_SYMBOL(drm_gpuva_prealloc_create);
+
+/**
+ * drm_gpuva_prealloc_destroy - destroys a preallocated node and frees the
+ * &drm_gpuva_prealloc
+ *
+ * @pa: the &drm_gpuva_prealloc to destroy
+ */
+void
+drm_gpuva_prealloc_destroy(struct drm_gpuva_prealloc *pa)
+{
+ mas_destroy(&pa->mas);
+ kfree(pa);
+}
+EXPORT_SYMBOL(drm_gpuva_prealloc_destroy);
+
+static int
+drm_gpuva_insert_state(struct drm_gpuva_manager *mgr,
+ struct ma_state *mas,
+ struct drm_gpuva *va)
+{
+ u64 addr = va->va.addr;
+ u64 range = va->va.range;
+ u64 last = addr + range - 1;
+
+ mas_set(mas, addr);
+
+ mas_lock(mas);
+ if (unlikely(mas_walk(mas))) {
+ mas_unlock(mas);
+ return -EEXIST;
+ }
+
+ if (unlikely(mas->last < last)) {
+ mas_unlock(mas);
+ return -EEXIST;
+ }
+
+ mas->index = addr;
+ mas->last = last;
+
+ mas_store_prealloc(mas, va);
+ mas_unlock(mas);
+
+ va->mgr = mgr;
+
+ return 0;
+}
+
+static int
+__drm_gpuva_insert(struct drm_gpuva_manager *mgr,
+ struct drm_gpuva *va)
+{
+ MA_STATE(mas, &mgr->mtree, 0, 0);
+ int ret;
+
+ ret = mas_preallocate(&mas, GFP_KERNEL);
+ if (ret)
+ return ret;
+
+ return drm_gpuva_insert_state(mgr, &mas, va);
+}
+
+/**
+ * drm_gpuva_insert - insert a &drm_gpuva
+ * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
+ * @va: the &drm_gpuva to insert
+ *
+ * Insert a &drm_gpuva with a given address and range into a
+ * &drm_gpuva_manager.
+ *
+ * It is not allowed to use this function while iterating this GPU VA space,
+ * e.g. via drm_gpuva_iter_for_each().
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuva_insert(struct drm_gpuva_manager *mgr,
+ struct drm_gpuva *va)
+{
+ u64 addr = va->va.addr;
+ u64 range = va->va.range;
+
+ if (unlikely(!drm_gpuva_range_valid(mgr, addr, range)))
+ return -EINVAL;
+
+ return __drm_gpuva_insert(mgr, va);
+}
+EXPORT_SYMBOL(drm_gpuva_insert);
+
+/**
+ * drm_gpuva_insert_prealloc - insert a &drm_gpuva with a preallocated node
+ * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
+ * @va: the &drm_gpuva to insert
+ * @pa: the &drm_gpuva_prealloc node
+ *
+ * Insert a &drm_gpuva with a given address and range into a
+ * &drm_gpuva_manager.
+ *
+ * It is not allowed to use this function while iterating this GPU VA space,
+ * e.g. via drm_gpuva_iter_for_each().
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuva_insert_prealloc(struct drm_gpuva_manager *mgr,
+ struct drm_gpuva_prealloc *pa,
+ struct drm_gpuva *va)
+{
+ struct ma_state *mas = &pa->mas;
+ u64 addr = va->va.addr;
+ u64 range = va->va.range;
+
+ if (unlikely(!drm_gpuva_range_valid(mgr, addr, range)))
+ return -EINVAL;
+
+ mas->tree = &mgr->mtree;
+ return drm_gpuva_insert_state(mgr, mas, va);
+}
+EXPORT_SYMBOL(drm_gpuva_insert_prealloc);
+
+static void
+__drm_gpuva_remove(struct drm_gpuva *va)
+{
+ MA_STATE(mas, &va->mgr->mtree, va->va.addr, 0);
+
+ mas_lock(&mas);
+ mas_erase(&mas);
+ mas_unlock(&mas);
+}
+
+/**
+ * drm_gpuva_remove - remove a &drm_gpuva
+ * @va: the &drm_gpuva to remove
+ *
+ * This removes the given &va from the underlying tree.
+ *
+ * It is not allowed to use this function while iterating this GPU VA space,
+ * e.g. via drm_gpuva_iter_for_each(). Please use drm_gpuva_iter_remove()
+ * instead.
+ */
+void
+drm_gpuva_remove(struct drm_gpuva *va)
+{
+ struct drm_gpuva_manager *mgr = va->mgr;
+
+ if (unlikely(va == &mgr->kernel_alloc_node)) {
+ WARN(1, "Can't destroy kernel reserved node.\n");
+ return;
+ }
+
+ __drm_gpuva_remove(va);
+}
+EXPORT_SYMBOL(drm_gpuva_remove);
+
+/**
+ * drm_gpuva_link - link a &drm_gpuva
+ * @va: the &drm_gpuva to link
+ *
+ * This adds the given &va to the GPU VA list of the &drm_gem_object it is
+ * associated with.
+ *
+ * This function expects the caller to protect the GEM's GPUVA list against
+ * concurrent access.
+ */
+void
+drm_gpuva_link(struct drm_gpuva *va)
+{
+ if (likely(va->gem.obj))
+ list_add_tail(&va->gem.entry, &va->gem.obj->gpuva.list);
+}
+EXPORT_SYMBOL(drm_gpuva_link);
+
+/**
+ * drm_gpuva_unlink - unlink a &drm_gpuva
+ * @va: the &drm_gpuva to unlink
+ *
+ * This removes the given &va from the GPU VA list of the &drm_gem_object it is
+ * associated with.
+ *
+ * This function expects the caller to protect the GEM's GPUVA list against
+ * concurrent access.
+ */
+void
+drm_gpuva_unlink(struct drm_gpuva *va)
+{
+ if (likely(va->gem.obj))
+ list_del_init(&va->gem.entry);
+}
+EXPORT_SYMBOL(drm_gpuva_unlink);
+
+/**
+ * drm_gpuva_find_first - find the first &drm_gpuva in the given range
+ * @mgr: the &drm_gpuva_manager to search in
+ * @addr: the &drm_gpuva's address
+ * @range: the &drm_gpuva's range
+ *
+ * Returns: the first &drm_gpuva within the given range
+ */
+struct drm_gpuva *
+drm_gpuva_find_first(struct drm_gpuva_manager *mgr,
+ u64 addr, u64 range)
+{
+ MA_STATE(mas, &mgr->mtree, addr, 0);
+ struct drm_gpuva *va;
+
+ mas_lock(&mas);
+ va = mas_find(&mas, addr + range - 1);
+ mas_unlock(&mas);
+
+ return va;
+}
+EXPORT_SYMBOL(drm_gpuva_find_first);
+
+/**
+ * drm_gpuva_find - find a &drm_gpuva
+ * @mgr: the &drm_gpuva_manager to search in
+ * @addr: the &drm_gpuva's address
+ * @range: the &drm_gpuva's range
+ *
+ * Returns: the &drm_gpuva at a given &addr and with a given &range
+ */
+struct drm_gpuva *
+drm_gpuva_find(struct drm_gpuva_manager *mgr,
+ u64 addr, u64 range)
+{
+ struct drm_gpuva *va;
+
+ va = drm_gpuva_find_first(mgr, addr, range);
+ if (!va)
+ goto out;
+
+ if (va->va.addr != addr ||
+ va->va.range != range)
+ goto out;
+
+ return va;
+
+out:
+ return NULL;
+}
+EXPORT_SYMBOL(drm_gpuva_find);
+
+/**
+ * drm_gpuva_find_prev - find the &drm_gpuva before the given address
+ * @mgr: the &drm_gpuva_manager to search in
+ * @start: the given GPU VA's start address
+ *
+ * Find the adjacent &drm_gpuva before the GPU VA with given &start address.
+ *
+ * Note that if there is any free space between the GPU VA mappings no mapping
+ * is returned.
+ *
+ * Returns: a pointer to the found &drm_gpuva or NULL if none was found
+ */
+struct drm_gpuva *
+drm_gpuva_find_prev(struct drm_gpuva_manager *mgr, u64 start)
+{
+ MA_STATE(mas, &mgr->mtree, start - 1, 0);
+ struct drm_gpuva *va;
+
+ if (start <= mgr->mm_start ||
+ start > (mgr->mm_start + mgr->mm_range))
+ return NULL;
+
+ mas_lock(&mas);
+ va = mas_walk(&mas);
+ mas_unlock(&mas);
+
+ return va;
+}
+EXPORT_SYMBOL(drm_gpuva_find_prev);
+
+/**
+ * drm_gpuva_find_next - find the &drm_gpuva after the given address
+ * @mgr: the &drm_gpuva_manager to search in
+ * @end: the given GPU VA's end address
+ *
+ * Find the adjacent &drm_gpuva after the GPU VA with given &end address.
+ *
+ * Note that if there is any free space between the GPU VA mappings no mapping
+ * is returned.
+ *
+ * Returns: a pointer to the found &drm_gpuva or NULL if none was found
+ */
+struct drm_gpuva *
+drm_gpuva_find_next(struct drm_gpuva_manager *mgr, u64 end)
+{
+ MA_STATE(mas, &mgr->mtree, end, 0);
+ struct drm_gpuva *va;
+
+ if (end < mgr->mm_start ||
+ end >= (mgr->mm_start + mgr->mm_range))
+ return NULL;
+
+ mas_lock(&mas);
+ va = mas_walk(&mas);
+ mas_unlock(&mas);
+
+ return va;
+}
+EXPORT_SYMBOL(drm_gpuva_find_next);
+
+/**
+ * drm_gpuva_interval_empty - indicate whether a given interval of the VA space
+ * is empty
+ * @mgr: the &drm_gpuva_manager to check the range for
+ * @addr: the start address of the range
+ * @range: the range of the interval
+ *
+ * Returns: true if the interval is empty, false otherwise
+ */
+bool
+drm_gpuva_interval_empty(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
+{
+ DRM_GPUVA_ITER(it, mgr, addr);
+ struct drm_gpuva *va;
+
+ drm_gpuva_iter_for_each_range(va, it, addr + range)
+ return false;
+
+ return true;
+}
+EXPORT_SYMBOL(drm_gpuva_interval_empty);
+
+/**
+ * drm_gpuva_map - helper to insert a &drm_gpuva from &drm_gpuva_fn_ops
+ * callbacks
+ *
+ * @mgr: the &drm_gpuva_manager
+ * @pa: the &drm_gpuva_prealloc
+ * @va: the &drm_gpuva to insert
+ */
+int
+drm_gpuva_map(struct drm_gpuva_manager *mgr,
+ struct drm_gpuva_prealloc *pa,
+ struct drm_gpuva *va)
+{
+ return drm_gpuva_insert_prealloc(mgr, pa, va);
+}
+EXPORT_SYMBOL(drm_gpuva_map);
+
+/**
+ * drm_gpuva_remap - helper to remap a &drm_gpuva from &drm_gpuva_fn_ops
+ * callbacks
+ *
+ * @state: the current &drm_gpuva_state
+ * @prev: the &drm_gpuva to remap when keeping the start of a mapping,
+ * may be NULL
+ * @next: the &drm_gpuva to remap when keeping the end of a mapping,
+ * may be NULL
+ */
+int
+drm_gpuva_remap(drm_gpuva_state_t state,
+ struct drm_gpuva *prev,
+ struct drm_gpuva *next)
+{
+ struct ma_state *mas = &state->mas;
+ u64 max = mas->last;
+
+ if (unlikely(!prev && !next))
+ return -EINVAL;
+
+ if (prev) {
+ u64 addr = prev->va.addr;
+ u64 last = addr + prev->va.range - 1;
+
+ if (unlikely(addr != mas->index))
+ return -EINVAL;
+
+ if (unlikely(last >= mas->last))
+ return -EINVAL;
+ }
+
+ if (next) {
+ u64 addr = next->va.addr;
+ u64 last = addr + next->va.range - 1;
+
+ if (unlikely(last != mas->last))
+ return -EINVAL;
+
+ if (unlikely(addr <= mas->index))
+ return -EINVAL;
+ }
+
+ if (prev && next) {
+ u64 p_last = prev->va.addr + prev->va.range - 1;
+ u64 n_addr = next->va.addr;
+
+ if (unlikely(p_last > n_addr))
+ return -EINVAL;
+
+ if (unlikely(n_addr - p_last <= 1))
+ return -EINVAL;
+ }
+
+ mas_lock(mas);
+ if (prev) {
+ mas_store(mas, prev);
+ mas_next(mas, max);
+ if (!next)
+ mas_store(mas, NULL);
+ }
+
+ if (next) {
+ mas->last = next->va.addr - 1;
+ mas_store(mas, NULL);
+ mas_next(mas, max);
+ mas_store(mas, next);
+ }
+ mas_unlock(mas);
+
+ return 0;
+}
+EXPORT_SYMBOL(drm_gpuva_remap);
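The sanity checks drm_gpuva_remap() performs before touching the tree can be read as constraints on how the remaining pieces relate to the entry being replaced. The sketch below restates them in plain C; struct span and remap_check() are hypothetical names, and index/last stand for the first and last address of the current tree entry.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the address and range of a mapping. */
struct span {
	uint64_t addr, range;
};

/*
 * Returns 0 if prev/next are acceptable remainders of the entry covering
 * [index, last], -EINVAL otherwise.
 */
int remap_check(uint64_t index, uint64_t last,
		const struct span *prev, const struct span *next)
{
	if (!prev && !next)
		return -EINVAL;

	if (prev) {
		uint64_t p_last = prev->addr + prev->range - 1;

		/* prev must keep the start and be strictly shorter */
		if (prev->addr != index || p_last >= last)
			return -EINVAL;
	}

	if (next) {
		uint64_t n_last = next->addr + next->range - 1;

		/* next must keep the end and start after the entry's start */
		if (n_last != last || next->addr <= index)
			return -EINVAL;
	}

	if (prev && next) {
		uint64_t p_last = prev->addr + prev->range - 1;

		/* with both present there must be a hole in between */
		if (p_last > next->addr || next->addr - p_last <= 1)
			return -EINVAL;
	}

	return 0;
}

/* Usage: entry [0x1000, 0x4fff], keep the first page and the last two. */
void example_remap_check(void)
{
	struct span prev = { 0x1000, 0x1000 }, next = { 0x3000, 0x2000 };

	assert(remap_check(0x1000, 0x4fff, &prev, &next) == 0);
	assert(remap_check(0x1000, 0x4fff, NULL, NULL) == -EINVAL);
}
```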
+
+/**
+ * drm_gpuva_unmap - helper to remove a &drm_gpuva from &drm_gpuva_fn_ops
+ * callbacks
+ *
+ * @state: the current &drm_gpuva_state
+ *
+ * The entry associated with the current state is removed.
+ */
+void
+drm_gpuva_unmap(drm_gpuva_state_t state)
+{
+ drm_gpuva_iter_remove(state);
+}
+EXPORT_SYMBOL(drm_gpuva_unmap);
+
+static int
+op_map_cb(const struct drm_gpuva_fn_ops *fn, void *priv,
+ u64 addr, u64 range,
+ struct drm_gem_object *obj, u64 offset)
+{
+ struct drm_gpuva_op op = {};
+
+ op.op = DRM_GPUVA_OP_MAP;
+ op.map.va.addr = addr;
+ op.map.va.range = range;
+ op.map.gem.obj = obj;
+ op.map.gem.offset = offset;
+
+ return fn->sm_step_map(&op, priv);
+}
+
+static int
+op_remap_cb(const struct drm_gpuva_fn_ops *fn,
+ drm_gpuva_state_t state, void *priv,
+ struct drm_gpuva_op_map *prev,
+ struct drm_gpuva_op_map *next,
+ struct drm_gpuva_op_unmap *unmap)
+{
+ struct drm_gpuva_op op = {};
+ struct drm_gpuva_op_remap *r;
+
+ op.op = DRM_GPUVA_OP_REMAP;
+ r = &op.remap;
+ r->prev = prev;
+ r->next = next;
+ r->unmap = unmap;
+
+ return fn->sm_step_remap(&op, state, priv);
+}
+
+static int
+op_unmap_cb(const struct drm_gpuva_fn_ops *fn,
+ drm_gpuva_state_t state, void *priv,
+ struct drm_gpuva *va, bool merge)
+{
+ struct drm_gpuva_op op = {};
+
+ op.op = DRM_GPUVA_OP_UNMAP;
+ op.unmap.va = va;
+ op.unmap.keep = merge;
+
+ return fn->sm_step_unmap(&op, state, priv);
+}
+
+static int
+__drm_gpuva_sm_map(struct drm_gpuva_manager *mgr,
+ const struct drm_gpuva_fn_ops *ops, void *priv,
+ u64 req_addr, u64 req_range,
+ struct drm_gem_object *req_obj, u64 req_offset)
+{
+ DRM_GPUVA_ITER(it, mgr, req_addr);
+ struct drm_gpuva *va, *prev = NULL;
+ u64 req_end = req_addr + req_range;
+ int ret;
+
+ if (unlikely(!drm_gpuva_in_mm_range(mgr, req_addr, req_range)))
+ return -EINVAL;
+
+ if (unlikely(drm_gpuva_in_kernel_node(mgr, req_addr, req_range)))
+ return -EINVAL;
+
+ drm_gpuva_iter_for_each_range(va, it, req_end) {
+ struct drm_gem_object *obj = va->gem.obj;
+ u64 offset = va->gem.offset;
+ u64 addr = va->va.addr;
+ u64 range = va->va.range;
+ u64 end = addr + range;
+ bool merge = !!va->gem.obj;
+
+ if (addr == req_addr) {
+ merge &= obj == req_obj &&
+ offset == req_offset;
+
+ if (end == req_end) {
+ ret = op_unmap_cb(ops, &it, priv, va, merge);
+ if (ret)
+ return ret;
+ break;
+ }
+
+ if (end < req_end) {
+ ret = op_unmap_cb(ops, &it, priv, va, merge);
+ if (ret)
+ return ret;
+ goto next;
+ }
+
+ if (end > req_end) {
+ struct drm_gpuva_op_map n = {
+ .va.addr = req_end,
+ .va.range = range - req_range,
+ .gem.obj = obj,
+ .gem.offset = offset + req_range,
+ };
+ struct drm_gpuva_op_unmap u = {
+ .va = va,
+ .keep = merge,
+ };
+
+ ret = op_remap_cb(ops, &it, priv, NULL, &n, &u);
+ if (ret)
+ return ret;
+ break;
+ }
+ } else if (addr < req_addr) {
+ u64 ls_range = req_addr - addr;
+ struct drm_gpuva_op_map p = {
+ .va.addr = addr,
+ .va.range = ls_range,
+ .gem.obj = obj,
+ .gem.offset = offset,
+ };
+ struct drm_gpuva_op_unmap u = { .va = va };
+
+ merge &= obj == req_obj &&
+ offset + ls_range == req_offset;
+ u.keep = merge;
+
+ if (end == req_end) {
+ ret = op_remap_cb(ops, &it, priv, &p, NULL, &u);
+ if (ret)
+ return ret;
+ break;
+ }
+
+ if (end < req_end) {
+ ret = op_remap_cb(ops, &it, priv, &p, NULL, &u);
+ if (ret)
+ return ret;
+ goto next;
+ }
+
+ if (end > req_end) {
+ struct drm_gpuva_op_map n = {
+ .va.addr = req_end,
+ .va.range = end - req_end,
+ .gem.obj = obj,
+ .gem.offset = offset + ls_range +
+ req_range,
+ };
+
+ ret = op_remap_cb(ops, &it, priv, &p, &n, &u);
+ if (ret)
+ return ret;
+ break;
+ }
+ } else if (addr > req_addr) {
+ merge &= obj == req_obj &&
+ offset == req_offset +
+ (addr - req_addr);
+
+ if (end == req_end) {
+ ret = op_unmap_cb(ops, &it, priv, va, merge);
+ if (ret)
+ return ret;
+ break;
+ }
+
+ if (end < req_end) {
+ ret = op_unmap_cb(ops, &it, priv, va, merge);
+ if (ret)
+ return ret;
+ goto next;
+ }
+
+ if (end > req_end) {
+ struct drm_gpuva_op_map n = {
+ .va.addr = req_end,
+ .va.range = end - req_end,
+ .gem.obj = obj,
+ .gem.offset = offset + req_end - addr,
+ };
+ struct drm_gpuva_op_unmap u = {
+ .va = va,
+ .keep = merge,
+ };
+
+ ret = op_remap_cb(ops, &it, priv, NULL, &n, &u);
+ if (ret)
+ return ret;
+ break;
+ }
+ }
+next:
+ prev = va;
+ }
+
+ return op_map_cb(ops, priv,
+ req_addr, req_range,
+ req_obj, req_offset);
+}
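In all three branches of __drm_gpuva_sm_map() the merge flag, which ends up in the unmap operation's keep field, is computed from the same idea: the existent mapping must be backed by the same object and its BO offset must be contiguous with the request once the address difference is accounted for. A minimal userspace restatement, with can_keep() as a hypothetical name and opaque pointers standing in for GEM objects:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * True if an existent mapping (obj, addr, offset) is backed contiguously
 * with respect to the request (req_obj, req_addr, req_offset).
 */
bool can_keep(const void *obj, uint64_t addr, uint64_t offset,
	      const void *req_obj, uint64_t req_addr, uint64_t req_offset)
{
	/* mappings without a backing object are never kept */
	if (!obj || obj != req_obj)
		return false;

	if (addr <= req_addr)
		return offset + (req_addr - addr) == req_offset;

	return offset == req_offset + (addr - req_addr);
}

/* Usage: old mapping at address 1, offset n+1; request at address 0, offset n. */
void example_can_keep(void)
{
	int bo;

	assert(can_keep(&bo, 1, 11, &bo, 0, 10));
	assert(!can_keep(&bo, 1, 20, &bo, 0, 10));
}
```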
+
+static int
+__drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr,
+ const struct drm_gpuva_fn_ops *ops, void *priv,
+ u64 req_addr, u64 req_range)
+{
+ DRM_GPUVA_ITER(it, mgr, req_addr);
+ struct drm_gpuva *va;
+ u64 req_end = req_addr + req_range;
+ int ret;
+
+ if (unlikely(drm_gpuva_in_kernel_node(mgr, req_addr, req_range)))
+ return -EINVAL;
+
+ drm_gpuva_iter_for_each_range(va, it, req_end) {
+ struct drm_gpuva_op_map prev = {}, next = {};
+ bool prev_split = false, next_split = false;
+ struct drm_gem_object *obj = va->gem.obj;
+ u64 offset = va->gem.offset;
+ u64 addr = va->va.addr;
+ u64 range = va->va.range;
+ u64 end = addr + range;
+
+ if (addr < req_addr) {
+ prev.va.addr = addr;
+ prev.va.range = req_addr - addr;
+ prev.gem.obj = obj;
+ prev.gem.offset = offset;
+
+ prev_split = true;
+ }
+
+ if (end > req_end) {
+ next.va.addr = req_end;
+ next.va.range = end - req_end;
+ next.gem.obj = obj;
+ next.gem.offset = offset + (req_end - addr);
+
+ next_split = true;
+ }
+
+ if (prev_split || next_split) {
+ struct drm_gpuva_op_unmap unmap = { .va = va };
+
+ ret = op_remap_cb(ops, &it, priv,
+ prev_split ? &prev : NULL,
+ next_split ? &next : NULL,
+ &unmap);
+ if (ret)
+ return ret;
+ } else {
+ ret = op_unmap_cb(ops, &it, priv, va, false);
+ if (ret)
+ return ret;
+ }
+ }
+
+ return 0;
+}
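The per-mapping arithmetic of __drm_gpuva_sm_unmap() can be tried out in isolation. The sketch below computes the prev and next remainders of one existent mapping for a given unmap request; struct mapping, struct split and split_for_unmap() are hypothetical names, and the BO offset of the next piece is advanced by the distance from the mapping's start to the end of the request, as in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for a mapping and the result of splitting it. */
struct mapping {
	uint64_t addr, range, offset;
};

struct split {
	bool has_prev, has_next;
	struct mapping prev, next;
};

/* Compute what remains of va after unmapping [req_addr, req_addr + req_range). */
struct split split_for_unmap(struct mapping va,
			     uint64_t req_addr, uint64_t req_range)
{
	struct split s = { 0 };
	uint64_t req_end = req_addr + req_range;
	uint64_t end = va.addr + va.range;

	if (va.addr < req_addr) {
		/* the head of the mapping survives, offset unchanged */
		s.prev.addr = va.addr;
		s.prev.range = req_addr - va.addr;
		s.prev.offset = va.offset;
		s.has_prev = true;
	}

	if (end > req_end) {
		/* the tail survives, offset moves with the new start */
		s.next.addr = req_end;
		s.next.range = end - req_end;
		s.next.offset = va.offset + (req_end - va.addr);
		s.has_next = true;
	}

	return s;
}

/* Usage: punch a one page hole into a four page mapping. */
void example_split_for_unmap(void)
{
	struct mapping va = { 0x1000, 0x4000, 0x100 };
	struct split s = split_for_unmap(va, 0x2000, 0x1000);

	assert(s.has_prev && s.prev.range == 0x1000);
	assert(s.has_next && s.next.addr == 0x3000);
}
```

If neither piece survives, the caller issues a plain unmap; otherwise a single remap carries the surviving pieces together with the unmap of the original mapping.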
+
+/**
+ * drm_gpuva_sm_map - creates the &drm_gpuva_op split/merge steps
+ * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * @req_addr: the start address of the new mapping
+ * @req_range: the range of the new mapping
+ * @req_obj: the &drm_gem_object to map
+ * @req_offset: the offset within the &drm_gem_object
+ * @priv: pointer to a driver private data structure
+ *
+ * This function iterates the given range of the GPU VA space. It utilizes the
+ * &drm_gpuva_fn_ops to call back into the driver providing the split and merge
+ * steps.
+ *
+ * Drivers may use these callbacks to update the GPU VA space right away within
+ * the callback. In case the driver decides to copy and store the operations for
+ * later processing neither this function nor &drm_gpuva_sm_unmap is allowed to
+ * be called before the &drm_gpuva_manager's view of the GPU VA space was
+ * updated with the previous set of operations. To update the
+ * &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
+ * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
+ * used.
+ *
+ * A sequence of callbacks can contain map, unmap and remap operations, but
+ * the sequence of callbacks might also be empty if no operation is required,
+ * e.g. if the requested mapping already exists in the exact same way.
+ *
+ * There can be an arbitrary amount of unmap operations, a maximum of two remap
+ * operations and a single map operation. The latter one represents the original
+ * map operation requested by the caller.
+ *
+ * Returns: 0 on success or a negative error code
+ */
+int
+drm_gpuva_sm_map(struct drm_gpuva_manager *mgr, void *priv,
+ u64 req_addr, u64 req_range,
+ struct drm_gem_object *req_obj, u64 req_offset)
+{
+ const struct drm_gpuva_fn_ops *ops = mgr->ops;
+
+ if (unlikely(!(ops && ops->sm_step_map &&
+ ops->sm_step_remap &&
+ ops->sm_step_unmap)))
+ return -EINVAL;
+
+ return __drm_gpuva_sm_map(mgr, ops, priv,
+ req_addr, req_range,
+ req_obj, req_offset);
+}
+EXPORT_SYMBOL(drm_gpuva_sm_map);
+
+/**
+ * drm_gpuva_sm_unmap - creates the &drm_gpuva_ops to split on unmap
+ * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * @priv: pointer to a driver private data structure
+ * @req_addr: the start address of the range to unmap
+ * @req_range: the range of the mappings to unmap
+ *
+ * This function iterates the given range of the GPU VA space. It utilizes the
+ * &drm_gpuva_fn_ops to call back into the driver providing the operations to
+ * unmap and, if required, split existent mappings.
+ *
+ * Drivers may use these callbacks to update the GPU VA space right away within
+ * the callback. In case the driver decides to copy and store the operations for
+ * later processing neither this function nor &drm_gpuva_sm_map is allowed to be
+ * called before the &drm_gpuva_manager's view of the GPU VA space was updated
+ * with the previous set of operations. To update the &drm_gpuva_manager's view
+ * of the GPU VA space drm_gpuva_insert(), drm_gpuva_destroy_locked() and/or
+ * drm_gpuva_destroy_unlocked() should be used.
+ *
+ * A sequence of callbacks can contain unmap and remap operations, depending on
+ * whether there are actual overlapping mappings to split.
+ *
+ * There can be an arbitrary amount of unmap operations and a maximum of two
+ * remap operations.
+ *
+ * Returns: 0 on success or a negative error code
+ */
+int
+drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr, void *priv,
+ u64 req_addr, u64 req_range)
+{
+ const struct drm_gpuva_fn_ops *ops = mgr->ops;
+
+ if (unlikely(!(ops && ops->sm_step_remap &&
+ ops->sm_step_unmap)))
+ return -EINVAL;
+
+ return __drm_gpuva_sm_unmap(mgr, ops, priv,
+ req_addr, req_range);
+}
+EXPORT_SYMBOL(drm_gpuva_sm_unmap);
+
+static struct drm_gpuva_op *
+gpuva_op_alloc(struct drm_gpuva_manager *mgr)
+{
+ const struct drm_gpuva_fn_ops *fn = mgr->ops;
+ struct drm_gpuva_op *op;
+
+ if (fn && fn->op_alloc)
+ op = fn->op_alloc();
+ else
+ op = kzalloc(sizeof(*op), GFP_KERNEL);
+
+ if (unlikely(!op))
+ return NULL;
+
+ return op;
+}
+
+static void
+gpuva_op_free(struct drm_gpuva_manager *mgr,
+ struct drm_gpuva_op *op)
+{
+ const struct drm_gpuva_fn_ops *fn = mgr->ops;
+
+ if (fn && fn->op_free)
+ fn->op_free(op);
+ else
+ kfree(op);
+}
+
+static int
+drm_gpuva_sm_step(struct drm_gpuva_op *__op,
+ drm_gpuva_state_t state,
+ void *priv)
+{
+ struct {
+ struct drm_gpuva_manager *mgr;
+ struct drm_gpuva_ops *ops;
+ } *args = priv;
+ struct drm_gpuva_manager *mgr = args->mgr;
+ struct drm_gpuva_ops *ops = args->ops;
+ struct drm_gpuva_op *op;
+
+ op = gpuva_op_alloc(mgr);
+ if (unlikely(!op))
+ goto err;
+
+ memcpy(op, __op, sizeof(*op));
+
+ if (op->op == DRM_GPUVA_OP_REMAP) {
+ struct drm_gpuva_op_remap *__r = &__op->remap;
+ struct drm_gpuva_op_remap *r = &op->remap;
+
+ r->unmap = kmemdup(__r->unmap, sizeof(*r->unmap),
+ GFP_KERNEL);
+ if (unlikely(!r->unmap))
+ goto err_free_op;
+
+ if (__r->prev) {
+ r->prev = kmemdup(__r->prev, sizeof(*r->prev),
+ GFP_KERNEL);
+ if (unlikely(!r->prev))
+ goto err_free_unmap;
+ }
+
+ if (__r->next) {
+ r->next = kmemdup(__r->next, sizeof(*r->next),
+ GFP_KERNEL);
+ if (unlikely(!r->next))
+ goto err_free_prev;
+ }
+ }
+
+ list_add_tail(&op->entry, &ops->list);
+
+ return 0;
+
+err_free_prev:
+ kfree(op->remap.prev);
+err_free_unmap:
+ kfree(op->remap.unmap);
+err_free_op:
+ gpuva_op_free(mgr, op);
+err:
+ return -ENOMEM;
+}
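The deep copy of a remap operation allocates up to three sub-objects, so the unwind path has to release whatever was allocated before the failing step. A userspace sketch of the same pattern, using malloc() in place of kmemdup() and hypothetical names (struct piece, struct remap, remap_dup(), remap_free()), with labels ordered in reverse order of allocation:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the map/unmap parts of a remap operation. */
struct piece {
	uint64_t addr, range;
};

struct remap {
	struct piece *prev, *next, *unmap;
};

/* Userspace replacement for kmemdup(). */
static void *dup_mem(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}

/* Deep-copy src into dst; src->unmap is mandatory, prev/next are optional. */
int remap_dup(struct remap *dst, const struct remap *src)
{
	dst->prev = dst->next = NULL;

	dst->unmap = dup_mem(src->unmap, sizeof(*src->unmap));
	if (!dst->unmap)
		goto err;

	if (src->prev) {
		dst->prev = dup_mem(src->prev, sizeof(*src->prev));
		if (!dst->prev)
			goto err_free_unmap;
	}

	if (src->next) {
		dst->next = dup_mem(src->next, sizeof(*src->next));
		if (!dst->next)
			goto err_free_prev;
	}

	return 0;

err_free_prev:
	free(dst->prev);	/* free(NULL) is a no-op if prev was absent */
err_free_unmap:
	free(dst->unmap);
err:
	return -ENOMEM;
}

/* Release everything remap_dup() allocated. */
void remap_free(struct remap *r)
{
	free(r->prev);
	free(r->next);
	free(r->unmap);
}
```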
+
+static int
+drm_gpuva_sm_step_map(struct drm_gpuva_op *__op, void *priv)
+{
+ return drm_gpuva_sm_step(__op, NULL, priv);
+}
+
+static const struct drm_gpuva_fn_ops gpuva_list_ops = {
+ .sm_step_map = drm_gpuva_sm_step_map,
+ .sm_step_remap = drm_gpuva_sm_step,
+ .sm_step_unmap = drm_gpuva_sm_step,
+};
+
+/**
+ * drm_gpuva_sm_map_ops_create - creates the &drm_gpuva_ops to split and merge
+ * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * @req_addr: the start address of the new mapping
+ * @req_range: the range of the new mapping
+ * @req_obj: the &drm_gem_object to map
+ * @req_offset: the offset within the &drm_gem_object
+ *
+ * This function creates a list of operations to perform splitting and merging
+ * of existent mapping(s) with the newly requested one.
+ *
+ * The list can be iterated with &drm_gpuva_for_each_op and must be processed
+ * in the given order. It can contain map, unmap and remap operations, but it
+ * also can be empty if no operation is required, e.g. if the requested mapping
+ * already exists in the exact same way.
+ *
+ * There can be an arbitrary amount of unmap operations, a maximum of two remap
+ * operations and a single map operation. The latter one represents the original
+ * map operation requested by the caller.
+ *
+ * Note that before calling this function again with another mapping request it
+ * is necessary to update the &drm_gpuva_manager's view of the GPU VA space. The
+ * previously obtained operations must be either processed or abandoned. To
+ * update the &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
+ * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
+ * used.
+ *
+ * After the caller finished processing the returned &drm_gpuva_ops, they must
+ * be freed with &drm_gpuva_ops_free.
+ *
+ * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
+ */
+struct drm_gpuva_ops *
+drm_gpuva_sm_map_ops_create(struct drm_gpuva_manager *mgr,
+ u64 req_addr, u64 req_range,
+ struct drm_gem_object *req_obj, u64 req_offset)
+{
+ struct drm_gpuva_ops *ops;
+ struct {
+ struct drm_gpuva_manager *mgr;
+ struct drm_gpuva_ops *ops;
+ } args;
+ int ret;
+
+ ops = kzalloc(sizeof(*ops), GFP_KERNEL);
+ if (unlikely(!ops))
+ return ERR_PTR(-ENOMEM);
+
+ INIT_LIST_HEAD(&ops->list);
+
+ args.mgr = mgr;
+ args.ops = ops;
+
+ ret = __drm_gpuva_sm_map(mgr, &gpuva_list_ops, &args,
+ req_addr, req_range,
+ req_obj, req_offset);
+ if (ret)
+ goto err_free_ops;
+
+ return ops;
+
+err_free_ops:
+ drm_gpuva_ops_free(mgr, ops);
+ return ERR_PTR(ret);
+}
+EXPORT_SYMBOL(drm_gpuva_sm_map_ops_create);
+
+/**
+ * drm_gpuva_sm_unmap_ops_create - creates the &drm_gpuva_ops to split on unmap
+ * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * @req_addr: the start address of the range to unmap
+ * @req_range: the range of the mappings to unmap
+ *
+ * This function creates a list of operations to perform unmapping and, if
+ * required, splitting of the mappings overlapping the unmap range.
+ *
+ * The list can be iterated with &drm_gpuva_for_each_op and must be processed
+ * in the given order. It can contain unmap and remap operations, depending on
+ * whether there are actual overlapping mappings to split.
+ *
+ * There can be an arbitrary amount of unmap operations and a maximum of two
+ * remap operations.
+ *
+ * Note that before calling this function again with another range to unmap it
+ * is necessary to update the &drm_gpuva_manager's view of the GPU VA space. The
+ * previously obtained operations must be processed or abandoned. To update the
+ * &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
+ * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
+ * used.
+ *
+ * After the caller finished processing the returned &drm_gpuva_ops, they must
+ * be freed with &drm_gpuva_ops_free.
+ *
+ * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
+ */
+struct drm_gpuva_ops *
+drm_gpuva_sm_unmap_ops_create(struct drm_gpuva_manager *mgr,
+ u64 req_addr, u64 req_range)
+{
+ struct drm_gpuva_ops *ops;
+ struct {
+ struct drm_gpuva_manager *mgr;
+ struct drm_gpuva_ops *ops;
+ } args;
+ int ret;
+
+ ops = kzalloc(sizeof(*ops), GFP_KERNEL);
+ if (unlikely(!ops))
+ return ERR_PTR(-ENOMEM);
+
+ INIT_LIST_HEAD(&ops->list);
+
+ args.mgr = mgr;
+ args.ops = ops;
+
+ ret = __drm_gpuva_sm_unmap(mgr, &gpuva_list_ops, &args,
+ req_addr, req_range);
+ if (ret)
+ goto err_free_ops;
+
+ return ops;
+
+err_free_ops:
+ drm_gpuva_ops_free(mgr, ops);
+ return ERR_PTR(ret);
+}
+EXPORT_SYMBOL(drm_gpuva_sm_unmap_ops_create);
+
+/**
+ * drm_gpuva_prefetch_ops_create - creates the &drm_gpuva_ops to prefetch
+ * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * @addr: the start address of the range to prefetch
+ * @range: the range of the mappings to prefetch
+ *
+ * This function creates a list of operations to perform prefetching.
+ *
+ * The list can be iterated with &drm_gpuva_for_each_op and must be processed
+ * in the given order. It can contain prefetch operations.
+ *
+ * There can be an arbitrary amount of prefetch operations.
+ *
+ * After the caller finished processing the returned &drm_gpuva_ops, they must
+ * be freed with &drm_gpuva_ops_free.
+ *
+ * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
+ */
+struct drm_gpuva_ops *
+drm_gpuva_prefetch_ops_create(struct drm_gpuva_manager *mgr,
+ u64 addr, u64 range)
+{
+ DRM_GPUVA_ITER(it, mgr, addr);
+ struct drm_gpuva_ops *ops;
+ struct drm_gpuva_op *op;
+ struct drm_gpuva *va;
+ int ret;
+
+ ops = kzalloc(sizeof(*ops), GFP_KERNEL);
+ if (!ops)
+ return ERR_PTR(-ENOMEM);
+
+ INIT_LIST_HEAD(&ops->list);
+
+ drm_gpuva_iter_for_each_range(va, it, addr + range) {
+ op = gpuva_op_alloc(mgr);
+ if (!op) {
+ ret = -ENOMEM;
+ goto err_free_ops;
+ }
+
+ op->op = DRM_GPUVA_OP_PREFETCH;
+ op->prefetch.va = va;
+ list_add_tail(&op->entry, &ops->list);
+ }
+
+ return ops;
+
+err_free_ops:
+ drm_gpuva_ops_free(mgr, ops);
+ return ERR_PTR(ret);
+}
+EXPORT_SYMBOL(drm_gpuva_prefetch_ops_create);
+
+/**
+ * drm_gpuva_gem_unmap_ops_create - creates the &drm_gpuva_ops to unmap a GEM
+ * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * @obj: the &drm_gem_object to unmap
+ *
+ * This function creates a list of operations to perform unmapping for every
+ * GPUVA attached to a GEM.
+ *
+ * The list can be iterated with &drm_gpuva_for_each_op and consists of an
+ * arbitrary number of unmap operations.
+ *
+ * After the caller finished processing the returned &drm_gpuva_ops, they must
+ * be freed with &drm_gpuva_ops_free.
+ *
+ * It is the caller's responsibility to protect the GEM's GPUVA list against
+ * concurrent access.
+ *
+ * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
+ */
+struct drm_gpuva_ops *
+drm_gpuva_gem_unmap_ops_create(struct drm_gpuva_manager *mgr,
+ struct drm_gem_object *obj)
+{
+ struct drm_gpuva_ops *ops;
+ struct drm_gpuva_op *op;
+ struct drm_gpuva *va;
+ int ret;
+
+ ops = kzalloc(sizeof(*ops), GFP_KERNEL);
+ if (!ops)
+ return ERR_PTR(-ENOMEM);
+
+ INIT_LIST_HEAD(&ops->list);
+
+ drm_gem_for_each_gpuva(va, obj) {
+ op = gpuva_op_alloc(mgr);
+ if (!op) {
+ ret = -ENOMEM;
+ goto err_free_ops;
+ }
+
+ op->op = DRM_GPUVA_OP_UNMAP;
+ op->unmap.va = va;
+ list_add_tail(&op->entry, &ops->list);
+ }
+
+ return ops;
+
+err_free_ops:
+ drm_gpuva_ops_free(mgr, ops);
+ return ERR_PTR(ret);
+}
+EXPORT_SYMBOL(drm_gpuva_gem_unmap_ops_create);
+
+
+/**
+ * drm_gpuva_ops_free - free the given &drm_gpuva_ops
+ * @mgr: the &drm_gpuva_manager the ops were created for
+ * @ops: the &drm_gpuva_ops to free
+ *
+ * Frees the given &drm_gpuva_ops structure including all the ops associated
+ * with it.
+ */
+void
+drm_gpuva_ops_free(struct drm_gpuva_manager *mgr,
+ struct drm_gpuva_ops *ops)
+{
+ struct drm_gpuva_op *op, *next;
+
+ drm_gpuva_for_each_op_safe(op, next, ops) {
+ list_del(&op->entry);
+
+ if (op->op == DRM_GPUVA_OP_REMAP) {
+ kfree(op->remap.prev);
+ kfree(op->remap.next);
+ kfree(op->remap.unmap);
+ }
+
+ gpuva_op_free(mgr, op);
+ }
+
+ kfree(ops);
+}
+EXPORT_SYMBOL(drm_gpuva_ops_free);
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index b419c59c4bef..b6e22f66c3fd 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -104,6 +104,12 @@ enum drm_driver_feature {
* acceleration should be handled by two drivers that are connected using auxiliary bus.
*/
DRIVER_COMPUTE_ACCEL = BIT(7),
+ /**
+ * @DRIVER_GEM_GPUVA:
+ *
+ * Driver supports user defined GPU VA bindings for GEM objects.
+ */
+ DRIVER_GEM_GPUVA = BIT(8),

/* IMPORTANT: Below are all the legacy flags, add new ones above. */

diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index b8efd836edef..f2782f55b7e7 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -36,6 +36,8 @@

#include <linux/kref.h>
#include <linux/dma-resv.h>
+#include <linux/list.h>
+#include <linux/mutex.h>

#include <drm/drm_vma_manager.h>

@@ -347,6 +349,17 @@ struct drm_gem_object {
*/
struct dma_resv _resv;

+ /**
+ * @gpuva:
+ *
+ * Provides the list and list mutex of GPU VAs attached to this
+ * GEM object.
+ */
+ struct {
+ struct list_head list;
+ struct mutex mutex;
+ } gpuva;
+
/**
* @funcs:
*
@@ -494,4 +507,66 @@ unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru,

int drm_gem_evict(struct drm_gem_object *obj);

+/**
+ * drm_gem_gpuva_init - initialize the gpuva list of a GEM object
+ * @obj: the &drm_gem_object
+ *
+ * This initializes the &drm_gem_object's &drm_gpuva list and the mutex
+ * protecting it.
+ *
+ * Calling this function is only necessary for drivers intending to support the
+ * &drm_driver_feature DRIVER_GEM_GPUVA.
+ */
+static inline void drm_gem_gpuva_init(struct drm_gem_object *obj)
+{
+ INIT_LIST_HEAD(&obj->gpuva.list);
+ mutex_init(&obj->gpuva.mutex);
+}
+
+/**
+ * drm_gem_gpuva_lock - lock the GEM's gpuva list mutex
+ * @obj: the &drm_gem_object
+ *
+ * This locks the mutex protecting the &drm_gem_object's &drm_gpuva list.
+ */
+static inline void drm_gem_gpuva_lock(struct drm_gem_object *obj)
+{
+ mutex_lock(&obj->gpuva.mutex);
+}
+
+/**
+ * drm_gem_gpuva_unlock - unlock the GEM's gpuva list mutex
+ * @obj: the &drm_gem_object
+ *
+ * This unlocks the mutex protecting the &drm_gem_object's &drm_gpuva list.
+ */
+static inline void drm_gem_gpuva_unlock(struct drm_gem_object *obj)
+{
+ mutex_unlock(&obj->gpuva.mutex);
+}
+
+/**
+ * drm_gem_for_each_gpuva - iterator to walk over a list of gpuvas
+ * @entry__: &drm_gpuva structure to assign to in each iteration step
+ * @obj__: the &drm_gem_object the &drm_gpuvas to walk are associated with
+ *
+ * This iterator walks over all &drm_gpuva structures associated with the
+ * &drm_gem_object.
+ */
+#define drm_gem_for_each_gpuva(entry__, obj__) \
+ list_for_each_entry(entry__, &(obj__)->gpuva.list, gem.entry)
+
+/**
+ * drm_gem_for_each_gpuva_safe - iterator to safely walk over a list of gpuvas
+ * @entry__: &drm_gpuva structure to assign to in each iteration step
+ * @next__: the next &drm_gpuva, stored to make the walk safe against removal
+ * @obj__: the &drm_gem_object the &drm_gpuvas to walk are associated with
+ *
+ * This iterator walks over all &drm_gpuva structures associated with the
+ * &drm_gem_object. It is implemented with list_for_each_entry_safe(), hence
+ * it is safe against removal of elements.
+ */
+#define drm_gem_for_each_gpuva_safe(entry__, next__, obj__) \
+ list_for_each_entry_safe(entry__, next__, &(obj__)->gpuva.list, gem.entry)
+
#endif /* __DRM_GEM_H__ */
diff --git a/include/drm/drm_gpuva_mgr.h b/include/drm/drm_gpuva_mgr.h
new file mode 100644
index 000000000000..b52ac2d00d12
--- /dev/null
+++ b/include/drm/drm_gpuva_mgr.h
@@ -0,0 +1,681 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __DRM_GPUVA_MGR_H__
+#define __DRM_GPUVA_MGR_H__
+
+/*
+ * Copyright (c) 2022 Red Hat.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include <linux/maple_tree.h>
+#include <linux/mm.h>
+#include <linux/rbtree.h>
+#include <linux/spinlock.h>
+#include <linux/types.h>
+
+struct drm_gpuva_manager;
+struct drm_gpuva_fn_ops;
+struct drm_gpuva_prealloc;
+
+/**
+ * enum drm_gpuva_flags - flags for struct drm_gpuva
+ */
+enum drm_gpuva_flags {
+ /**
+ * @DRM_GPUVA_EVICTED:
+ *
+ * Flag indicating that the &drm_gpuva's backing GEM is evicted.
+ */
+ DRM_GPUVA_EVICTED = (1 << 0),
+
+ /**
+ * @DRM_GPUVA_SPARSE:
+ *
+ * Flag indicating that the &drm_gpuva is a sparse mapping.
+ */
+ DRM_GPUVA_SPARSE = (1 << 1),
+
+ /**
+ * @DRM_GPUVA_USERBITS: user defined bits
+ */
+ DRM_GPUVA_USERBITS = (1 << 2),
+};
+
+/**
+ * struct drm_gpuva - structure to track a GPU VA mapping
+ *
+ * This structure represents a GPU VA mapping and is associated with a
+ * &drm_gpuva_manager.
+ *
+ * Typically, this structure is embedded in bigger driver structures.
+ */
+struct drm_gpuva {
+ /**
+ * @mgr: the &drm_gpuva_manager this object is associated with
+ */
+ struct drm_gpuva_manager *mgr;
+
+ /**
+ * @flags: the &drm_gpuva_flags for this mapping
+ */
+ enum drm_gpuva_flags flags;
+
+ /**
+ * @va: structure containing the address and range of the &drm_gpuva
+ */
+ struct {
+ /**
+ * @addr: the start address
+ */
+ u64 addr;
+
+ /**
+ * @range: the range
+ */
+ u64 range;
+ } va;
+
+ /**
+ * @gem: structure containing the &drm_gem_object and its offset
+ */
+ struct {
+ /**
+ * @offset: the offset within the &drm_gem_object
+ */
+ u64 offset;
+
+ /**
+ * @obj: the mapped &drm_gem_object
+ */
+ struct drm_gem_object *obj;
+
+ /**
+ * @entry: the &list_head to attach this object to a &drm_gem_object
+ */
+ struct list_head entry;
+ } gem;
+};
+
+void drm_gpuva_link(struct drm_gpuva *va);
+void drm_gpuva_unlink(struct drm_gpuva *va);
+
+int drm_gpuva_insert(struct drm_gpuva_manager *mgr,
+ struct drm_gpuva *va);
+int drm_gpuva_insert_prealloc(struct drm_gpuva_manager *mgr,
+ struct drm_gpuva_prealloc *pa,
+ struct drm_gpuva *va);
+void drm_gpuva_remove(struct drm_gpuva *va);
+
+struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager *mgr,
+ u64 addr, u64 range);
+struct drm_gpuva *drm_gpuva_find_first(struct drm_gpuva_manager *mgr,
+ u64 addr, u64 range);
+struct drm_gpuva *drm_gpuva_find_prev(struct drm_gpuva_manager *mgr, u64 start);
+struct drm_gpuva *drm_gpuva_find_next(struct drm_gpuva_manager *mgr, u64 end);
+
+bool drm_gpuva_interval_empty(struct drm_gpuva_manager *mgr, u64 addr, u64 range);
+
+/**
+ * drm_gpuva_evict - sets whether the backing GEM of this &drm_gpuva is evicted
+ * @va: the &drm_gpuva to set the evict flag for
+ * @evict: indicates whether the &drm_gpuva is evicted
+ */
+static inline void drm_gpuva_evict(struct drm_gpuva *va, bool evict)
+{
+ if (evict)
+ va->flags |= DRM_GPUVA_EVICTED;
+ else
+ va->flags &= ~DRM_GPUVA_EVICTED;
+}
+
+/**
+ * drm_gpuva_evicted - indicates whether the backing BO of this &drm_gpuva
+ * is evicted
+ * @va: the &drm_gpuva to check
+ */
+static inline bool drm_gpuva_evicted(struct drm_gpuva *va)
+{
+ return va->flags & DRM_GPUVA_EVICTED;
+}
+
+/**
+ * struct drm_gpuva_manager - DRM GPU VA Manager
+ *
+ * The DRM GPU VA Manager keeps track of a GPU's virtual address space by using
+ * &maple_tree structures. Typically, this structure is embedded in bigger
+ * driver structures.
+ *
+ * Drivers can pass addresses and ranges in an arbitrary unit, e.g. bytes or
+ * pages.
+ *
+ * There should be one manager instance per GPU virtual address space.
+ */
+struct drm_gpuva_manager {
+ /**
+ * @name: the name of the DRM GPU VA space
+ */
+ const char *name;
+
+ /**
+ * @mm_start: start of the VA space
+ */
+ u64 mm_start;
+
+ /**
+ * @mm_range: length of the VA space
+ */
+ u64 mm_range;
+
+ /**
+ * @mtree: the &maple_tree to track GPU VA mappings
+ */
+ struct maple_tree mtree;
+
+ /**
+ * @kernel_alloc_node:
+ *
+ * &drm_gpuva representing the address space cutout reserved for
+ * the kernel
+ */
+ struct drm_gpuva kernel_alloc_node;
+
+ /**
+ * @ops: &drm_gpuva_fn_ops providing the split/merge steps to drivers
+ */
+ const struct drm_gpuva_fn_ops *ops;
+};
+
+void drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
+ const char *name,
+ u64 start_offset, u64 range,
+ u64 reserve_offset, u64 reserve_range,
+ const struct drm_gpuva_fn_ops *ops);
+void drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr);
+
+/**
+ * struct drm_gpuva_prealloc - holds a preallocated node for the
+ * &drm_gpuva_manager to insert a single new entry
+ */
+struct drm_gpuva_prealloc {
+ /**
+ * @mas: the maple tree advanced state
+ */
+ struct ma_state mas;
+};
+
+struct drm_gpuva_prealloc * drm_gpuva_prealloc_create(struct drm_gpuva_manager *mgr);
+void drm_gpuva_prealloc_destroy(struct drm_gpuva_prealloc *pa);
+
+/**
+ * struct drm_gpuva_iterator - iterator for walking the internal (maple) tree
+ */
+struct drm_gpuva_iterator {
+ /**
+ * @mas: the maple tree advanced state
+ */
+ struct ma_state mas;
+
+ /**
+ * @mgr: the &drm_gpuva_manager to iterate
+ */
+ struct drm_gpuva_manager *mgr;
+};
+typedef struct drm_gpuva_iterator * drm_gpuva_state_t;
+
+void drm_gpuva_iter_remove(struct drm_gpuva_iterator *it);
+int drm_gpuva_iter_va_replace(struct drm_gpuva_iterator *it,
+ struct drm_gpuva *va);
+
+static inline struct drm_gpuva *
+drm_gpuva_iter_find(struct drm_gpuva_iterator *it, unsigned long max)
+{
+ struct drm_gpuva *va;
+
+ mas_lock(&it->mas);
+ va = mas_find(&it->mas, max);
+ mas_unlock(&it->mas);
+
+ return va;
+}
+
+/**
+ * DRM_GPUVA_ITER - create an iterator structure to iterate the &drm_gpuva tree
+ * @name: the name of the &drm_gpuva_iterator to create
+ * @mgr__: the &drm_gpuva_manager to iterate
+ * @start: starting offset, the first entry will overlap this
+ */
+#define DRM_GPUVA_ITER(name, mgr__, start) \
+ struct drm_gpuva_iterator name = { \
+ .mas = MA_STATE_INIT(&(mgr__)->mtree, start, 0), \
+ .mgr = mgr__, \
+ }
+
+/**
+ * drm_gpuva_iter_for_each_range - iterator to walk over a range of entries
+ * @va__: the &drm_gpuva found for the current iteration
+ * @it__: &drm_gpuva_iterator structure to assign to in each iteration step
+ * @end__: ending offset, the last entry will start before this (but may overlap)
+ *
+ * This function can be used to iterate &drm_gpuva objects.
+ *
+ * It is safe against the removal of elements using &drm_gpuva_iter_remove,
+ * however it is not safe against the removal of elements using
+ * &drm_gpuva_remove.
+ */
+#define drm_gpuva_iter_for_each_range(va__, it__, end__) \
+ while (((va__) = drm_gpuva_iter_find(&(it__), (end__) - 1)))
+
+/**
+ * drm_gpuva_iter_for_each - iterator to walk over all existing entries
+ * @va__: the &drm_gpuva found for the current iteration
+ * @it__: &drm_gpuva_iterator structure to assign to in each iteration step
+ *
+ * This function can be used to iterate &drm_gpuva objects.
+ *
+ * In order to walk over all potentially existing entries, the
+ * &drm_gpuva_iterator must be initialized to start at
+ * &drm_gpuva_manager->mm_start or simply 0.
+ *
+ * It is safe against the removal of elements using &drm_gpuva_iter_remove,
+ * however it is not safe against the removal of elements using
+ * &drm_gpuva_remove.
+ */
+#define drm_gpuva_iter_for_each(va__, it__) \
+ drm_gpuva_iter_for_each_range(va__, it__, (it__).mgr->mm_start + (it__).mgr->mm_range)
+
+/**
+ * enum drm_gpuva_op_type - GPU VA operation type
+ *
+ * Operations to alter the GPU VA mappings tracked by the &drm_gpuva_manager.
+ */
+enum drm_gpuva_op_type {
+ /**
+ * @DRM_GPUVA_OP_MAP: the map op type
+ */
+ DRM_GPUVA_OP_MAP,
+
+ /**
+ * @DRM_GPUVA_OP_REMAP: the remap op type
+ */
+ DRM_GPUVA_OP_REMAP,
+
+ /**
+ * @DRM_GPUVA_OP_UNMAP: the unmap op type
+ */
+ DRM_GPUVA_OP_UNMAP,
+
+ /**
+ * @DRM_GPUVA_OP_PREFETCH: the prefetch op type
+ */
+ DRM_GPUVA_OP_PREFETCH,
+};
+
+/**
+ * struct drm_gpuva_op_map - GPU VA map operation
+ *
+ * This structure represents a single map operation generated by the
+ * DRM GPU VA manager.
+ */
+struct drm_gpuva_op_map {
+ /**
+ * @va: structure containing address and range of a map
+ * operation
+ */
+ struct {
+ /**
+ * @addr: the base address of the new mapping
+ */
+ u64 addr;
+
+ /**
+ * @range: the range of the new mapping
+ */
+ u64 range;
+ } va;
+
+ /**
+ * @gem: structure containing the &drm_gem_object and its offset
+ */
+ struct {
+ /**
+ * @offset: the offset within the &drm_gem_object
+ */
+ u64 offset;
+
+ /**
+ * @obj: the &drm_gem_object to map
+ */
+ struct drm_gem_object *obj;
+ } gem;
+};
+
+/**
+ * struct drm_gpuva_op_unmap - GPU VA unmap operation
+ *
+ * This structure represents a single unmap operation generated by the
+ * DRM GPU VA manager.
+ */
+struct drm_gpuva_op_unmap {
+ /**
+ * @va: the &drm_gpuva to unmap
+ */
+ struct drm_gpuva *va;
+
+ /**
+ * @keep:
+ *
+ * Indicates whether this &drm_gpuva is physically contiguous with the
+ * original mapping request.
+ *
+ * Optionally, if &keep is set, drivers may keep the actual page table
+ * mappings for this &drm_gpuva, adding the missing page table entries
+ * only and update the &drm_gpuva_manager accordingly.
+ */
+ bool keep;
+};
+
+/**
+ * struct drm_gpuva_op_remap - GPU VA remap operation
+ *
+ * This represents a single remap operation generated by the DRM GPU VA manager.
+ *
+ * A remap operation is generated when an existing GPU VA mapping is split up
+ * by inserting a new GPU VA mapping or by partially unmapping existent
+ * mapping(s), hence it consists of a maximum of two map and one unmap
+ * operation.
+ *
+ * The @unmap operation takes care of removing the original existing mapping.
+ * @prev is used to remap the preceding part, @next the subsequent part.
+ *
+ * If either a new mapping's start address is aligned with the start address
+ * of the old mapping or the new mapping's end address is aligned with the
+ * end address of the old mapping, either @prev or @next is NULL.
+ *
+ * Note, the reason for a dedicated remap operation, rather than arbitrary
+ * unmap and map operations, is to give drivers the chance of extracting driver
+ * specific data for creating the new mappings from the unmap operation's
+ * &drm_gpuva structure which typically is embedded in larger driver specific
+ * structures.
+ */
+struct drm_gpuva_op_remap {
+ /**
+ * @prev: the preceding part of a split mapping
+ */
+ struct drm_gpuva_op_map *prev;
+
+ /**
+ * @next: the subsequent part of a split mapping
+ */
+ struct drm_gpuva_op_map *next;
+
+ /**
+ * @unmap: the unmap operation for the original existing mapping
+ */
+ struct drm_gpuva_op_unmap *unmap;
+};
+
+/**
+ * struct drm_gpuva_op_prefetch - GPU VA prefetch operation
+ *
+ * This structure represents a single prefetch operation generated by the
+ * DRM GPU VA manager.
+ */
+struct drm_gpuva_op_prefetch {
+ /**
+ * @va: the &drm_gpuva to prefetch
+ */
+ struct drm_gpuva *va;
+};
+
+/**
+ * struct drm_gpuva_op - GPU VA operation
+ *
+ * This structure represents a single generic operation.
+ *
+ * The particular type of the operation is defined by @op.
+ */
+struct drm_gpuva_op {
+ /**
+ * @entry:
+ *
+ * The &list_head used to distribute instances of this struct within
+ * &drm_gpuva_ops.
+ */
+ struct list_head entry;
+
+ /**
+ * @op: the type of the operation
+ */
+ enum drm_gpuva_op_type op;
+
+ union {
+ /**
+ * @map: the map operation
+ */
+ struct drm_gpuva_op_map map;
+
+ /**
+ * @remap: the remap operation
+ */
+ struct drm_gpuva_op_remap remap;
+
+ /**
+ * @unmap: the unmap operation
+ */
+ struct drm_gpuva_op_unmap unmap;
+
+ /**
+ * @prefetch: the prefetch operation
+ */
+ struct drm_gpuva_op_prefetch prefetch;
+ };
+};
+
+/**
+ * struct drm_gpuva_ops - wraps a list of &drm_gpuva_op
+ */
+struct drm_gpuva_ops {
+ /**
+ * @list: the &list_head
+ */
+ struct list_head list;
+};
+
+/**
+ * drm_gpuva_for_each_op - iterator to walk over &drm_gpuva_ops
+ * @op: &drm_gpuva_op to assign in each iteration step
+ * @ops: &drm_gpuva_ops to walk
+ *
+ * This iterator walks over all ops within a given list of operations.
+ */
+#define drm_gpuva_for_each_op(op, ops) list_for_each_entry(op, &(ops)->list, entry)
+
+/**
+ * drm_gpuva_for_each_op_safe - iterator to safely walk over &drm_gpuva_ops
+ * @op: &drm_gpuva_op to assign in each iteration step
+ * @next: the next &drm_gpuva_op, stored to make the walk safe against removal
+ * @ops: &drm_gpuva_ops to walk
+ *
+ * This iterator walks over all ops within a given list of operations. It is
+ * implemented with list_for_each_entry_safe(), so it is safe against removal
+ * of elements.
+ */
+#define drm_gpuva_for_each_op_safe(op, next, ops) \
+ list_for_each_entry_safe(op, next, &(ops)->list, entry)
+
+/**
+ * drm_gpuva_for_each_op_from_reverse - iterate backwards from the given point
+ * @op: &drm_gpuva_op to assign in each iteration step
+ * @ops: &drm_gpuva_ops to walk
+ *
+ * This iterator walks over all ops within a given list of operations beginning
+ * from the given operation in reverse order.
+ */
+#define drm_gpuva_for_each_op_from_reverse(op, ops) \
+ list_for_each_entry_from_reverse(op, &(ops)->list, entry)
+
+/**
+ * drm_gpuva_first_op - returns the first &drm_gpuva_op from &drm_gpuva_ops
+ * @ops: the &drm_gpuva_ops to get the first &drm_gpuva_op from
+ */
+#define drm_gpuva_first_op(ops) \
+ list_first_entry(&(ops)->list, struct drm_gpuva_op, entry)
+
+/**
+ * drm_gpuva_last_op - returns the last &drm_gpuva_op from &drm_gpuva_ops
+ * @ops: the &drm_gpuva_ops to get the last &drm_gpuva_op from
+ */
+#define drm_gpuva_last_op(ops) \
+ list_last_entry(&(ops)->list, struct drm_gpuva_op, entry)
+
+/**
+ * drm_gpuva_prev_op - previous &drm_gpuva_op in the list
+ * @op: the current &drm_gpuva_op
+ */
+#define drm_gpuva_prev_op(op) list_prev_entry(op, entry)
+
+/**
+ * drm_gpuva_next_op - next &drm_gpuva_op in the list
+ * @op: the current &drm_gpuva_op
+ */
+#define drm_gpuva_next_op(op) list_next_entry(op, entry)
+
+struct drm_gpuva_ops *
+drm_gpuva_sm_map_ops_create(struct drm_gpuva_manager *mgr,
+ u64 addr, u64 range,
+ struct drm_gem_object *obj, u64 offset);
+struct drm_gpuva_ops *
+drm_gpuva_sm_unmap_ops_create(struct drm_gpuva_manager *mgr,
+ u64 addr, u64 range);
+
+struct drm_gpuva_ops *
+drm_gpuva_prefetch_ops_create(struct drm_gpuva_manager *mgr,
+ u64 addr, u64 range);
+
+struct drm_gpuva_ops *
+drm_gpuva_gem_unmap_ops_create(struct drm_gpuva_manager *mgr,
+ struct drm_gem_object *obj);
+
+void drm_gpuva_ops_free(struct drm_gpuva_manager *mgr,
+ struct drm_gpuva_ops *ops);
+
+/**
+ * struct drm_gpuva_fn_ops - callbacks for split/merge steps
+ *
+ * This structure defines the callbacks used by &drm_gpuva_sm_map and
+ * &drm_gpuva_sm_unmap to provide the split/merge steps for map and unmap
+ * operations to drivers.
+ */
+struct drm_gpuva_fn_ops {
+ /**
+ * @op_alloc: called when the &drm_gpuva_manager allocates
+ * a struct drm_gpuva_op
+ *
+ * Some drivers may want to embed struct drm_gpuva_op into driver
+ * specific structures. By implementing this callback drivers can
+ * allocate memory accordingly.
+ *
+ * This callback is optional.
+ */
+ struct drm_gpuva_op *(*op_alloc)(void);
+
+ /**
+ * @op_free: called when the &drm_gpuva_manager frees a
+ * struct drm_gpuva_op
+ *
+ * Some drivers may want to embed struct drm_gpuva_op into driver
+ * specific structures. By implementing this callback drivers can
+ * free the previously allocated memory accordingly.
+ *
+ * This callback is optional.
+ */
+ void (*op_free)(struct drm_gpuva_op *op);
+
+ /**
+ * @sm_step_map: called from &drm_gpuva_sm_map to finally insert the
+ * mapping once all previous steps were completed
+ *
+ * The &priv pointer matches the one the driver passed to
+ * &drm_gpuva_sm_map or &drm_gpuva_sm_unmap, respectively.
+ *
+ * Can be NULL if &drm_gpuva_sm_map is not used.
+ */
+ int (*sm_step_map)(struct drm_gpuva_op *op, void *priv);
+
+ /**
+ * @sm_step_remap: called from &drm_gpuva_sm_map and
+ * &drm_gpuva_sm_unmap to split up an existent mapping
+ *
+ * This callback is called when an existent mapping needs to be split up.
+ * This is the case when either a newly requested mapping overlaps or
+ * is enclosed by an existent mapping or a partial unmap of an existent
+ * mapping is requested.
+ *
+ * Drivers must not modify the GPUVA space with accessors that do not
+ * take a &drm_gpuva_state_t as argument from this callback.
+ *
+ * The &priv pointer matches the one the driver passed to
+ * &drm_gpuva_sm_map or &drm_gpuva_sm_unmap, respectively.
+ *
+ * Can be NULL if neither &drm_gpuva_sm_map nor &drm_gpuva_sm_unmap is
+ * used.
+ */
+ int (*sm_step_remap)(struct drm_gpuva_op *op,
+ drm_gpuva_state_t state,
+ void *priv);
+
+ /**
+ * @sm_step_unmap: called from &drm_gpuva_sm_map and
+ * &drm_gpuva_sm_unmap to unmap an existent mapping
+ *
+ * This callback is called when an existent mapping needs to be unmapped.
+ * This is the case when either a newly requested mapping encloses an
+ * existent mapping or an unmap of an existent mapping is requested.
+ *
+ * Drivers must not modify the GPUVA space with accessors that do not
+ * take a &drm_gpuva_state_t as argument from this callback.
+ *
+ * The &priv pointer matches the one the driver passed to
+ * &drm_gpuva_sm_map or &drm_gpuva_sm_unmap, respectively.
+ *
+ * Can be NULL if neither &drm_gpuva_sm_map nor &drm_gpuva_sm_unmap is
+ * used.
+ */
+ int (*sm_step_unmap)(struct drm_gpuva_op *op,
+ drm_gpuva_state_t state,
+ void *priv);
+};
+
+int drm_gpuva_sm_map(struct drm_gpuva_manager *mgr, void *priv,
+ u64 addr, u64 range,
+ struct drm_gem_object *obj, u64 offset);
+
+int drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr, void *priv,
+ u64 addr, u64 range);
+
+int drm_gpuva_map(struct drm_gpuva_manager *mgr,
+ struct drm_gpuva_prealloc *pa,
+ struct drm_gpuva *va);
+int drm_gpuva_remap(drm_gpuva_state_t state,
+ struct drm_gpuva *prev,
+ struct drm_gpuva *next);
+void drm_gpuva_unmap(drm_gpuva_state_t state);
+
+#endif /* __DRM_GPUVA_MGR_H__ */
--
2.40.1
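
For readers following the *_ops_create() / drm_gpuva_ops_free() pairing in the
patch above, the lifecycle can be sketched in plain userspace C. This is only
an illustration under stated assumptions, not the kernel code: struct op,
struct ops, prefetch_ops_create() and ops_free() are hypothetical stand-ins, a
hand-rolled singly linked list replaces the kernel's list_head, and NULL
replaces ERR_PTR(). What it keeps from the patch is the shape: allocate the
container, append one op per entry in order, free everything built so far on
allocation failure, and release the whole list with a removal-safe walk.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical op types, mirroring the names of enum drm_gpuva_op_type. */
enum op_type { OP_MAP, OP_REMAP, OP_UNMAP, OP_PREFETCH };

/* Hypothetical stand-in for struct drm_gpuva_op. */
struct op {
	struct op *next;
	enum op_type type;
	uint64_t addr;
};

/* Hypothetical stand-in for struct drm_gpuva_ops: an ordered list of ops. */
struct ops {
	struct op *head;
	struct op **tail;	/* points at the link to fill on the next append */
};

/* Free the container and every op in it; safe on a partially built list. */
static void ops_free(struct ops *ops)
{
	struct op *op, *next;

	if (!ops)
		return;

	/* Read the next pointer before freeing, like a *_safe iterator. */
	for (op = ops->head; op; op = next) {
		next = op->next;
		free(op);
	}
	free(ops);
}

/* Build one prefetch op per address, preserving the input order. */
static struct ops *prefetch_ops_create(const uint64_t *addrs, size_t n)
{
	struct ops *ops;
	size_t i;

	assert(addrs != NULL || n == 0);

	ops = calloc(1, sizeof(*ops));
	if (!ops)
		return NULL;
	ops->tail = &ops->head;

	for (i = 0; i < n; i++) {
		struct op *op = calloc(1, sizeof(*op));

		if (!op) {
			/* Error path: release everything allocated so far. */
			ops_free(ops);
			return NULL;
		}

		op->type = OP_PREFETCH;
		op->addr = addrs[i];
		*ops->tail = op;	/* append at the tail */
		ops->tail = &op->next;
	}

	return ops;
}
```

A caller would walk the list from ops->head in order, act on each op, and then
call ops_free() exactly once, which matches the contract stated in the
kernel-doc comments of the patch.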

2023-06-07 00:14:55

by kernel test robot

Subject: Re: [PATCH drm-next v4 04/14] drm: debugfs: provide infrastructure to dump a DRM GPU VA space

2023-06-07 05:01:26

by kernel test robot

Subject: Re: [PATCH drm-next v4 03/14] drm: manager to keep track of GPUs VA mappings

Hi Danilo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 33a86170888b7e4aa0cea94ebb9c67180139cea9]

url: https://github.com/intel-lab-lkp/linux/commits/Danilo-Krummrich/drm-execution-context-for-GEM-buffers-v4/20230607-063442
base: 33a86170888b7e4aa0cea94ebb9c67180139cea9
patch link: https://lore.kernel.org/r/20230606223130.6132-4-dakr%40redhat.com
patch subject: [PATCH drm-next v4 03/14] drm: manager to keep track of GPUs VA mappings
config: alpha-allyesconfig (https://download.01.org/0day-ci/archive/20230607/[email protected]/config)
compiler: alpha-linux-gcc (GCC) 12.3.0
reproduce (this is a W=1 build):
mkdir -p ~/bin
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout 33a86170888b7e4aa0cea94ebb9c67180139cea9
b4 shazam https://lore.kernel.org/r/[email protected]
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross W=1 O=build_dir ARCH=alpha olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross W=1 O=build_dir ARCH=alpha SHELL=/bin/bash drivers/gpu/drm/

If you fix the issue, kindly add the following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All warnings (new ones prefixed by >>):

drivers/gpu/drm/drm_gpuva_mgr.c: In function '__drm_gpuva_sm_map':
>> drivers/gpu/drm/drm_gpuva_mgr.c:1032:32: warning: variable 'prev' set but not used [-Wunused-but-set-variable]
1032 | struct drm_gpuva *va, *prev = NULL;
| ^~~~

vim +/prev +1032 drivers/gpu/drm/drm_gpuva_mgr.c

1024
1025 static int
1026 __drm_gpuva_sm_map(struct drm_gpuva_manager *mgr,
1027 const struct drm_gpuva_fn_ops *ops, void *priv,
1028 u64 req_addr, u64 req_range,
1029 struct drm_gem_object *req_obj, u64 req_offset)
1030 {
1031 DRM_GPUVA_ITER(it, mgr, req_addr);
> 1032 struct drm_gpuva *va, *prev = NULL;
1033 u64 req_end = req_addr + req_range;
1034 int ret;
1035
1036 if (unlikely(!drm_gpuva_in_mm_range(mgr, req_addr, req_range)))
1037 return -EINVAL;
1038
1039 if (unlikely(drm_gpuva_in_kernel_node(mgr, req_addr, req_range)))
1040 return -EINVAL;
1041
1042 drm_gpuva_iter_for_each_range(va, it, req_end) {
1043 struct drm_gem_object *obj = va->gem.obj;
1044 u64 offset = va->gem.offset;
1045 u64 addr = va->va.addr;
1046 u64 range = va->va.range;
1047 u64 end = addr + range;
1048 bool merge = !!va->gem.obj;
1049
1050 if (addr == req_addr) {
1051 merge &= obj == req_obj &&
1052 offset == req_offset;
1053
1054 if (end == req_end) {
1055 ret = op_unmap_cb(ops, &it, priv, va, merge);
1056 if (ret)
1057 return ret;
1058 break;
1059 }
1060
1061 if (end < req_end) {
1062 ret = op_unmap_cb(ops, &it, priv, va, merge);
1063 if (ret)
1064 return ret;
1065 goto next;
1066 }
1067
1068 if (end > req_end) {
1069 struct drm_gpuva_op_map n = {
1070 .va.addr = req_end,
1071 .va.range = range - req_range,
1072 .gem.obj = obj,
1073 .gem.offset = offset + req_range,
1074 };
1075 struct drm_gpuva_op_unmap u = {
1076 .va = va,
1077 .keep = merge,
1078 };
1079
1080 ret = op_remap_cb(ops, &it, priv, NULL, &n, &u);
1081 if (ret)
1082 return ret;
1083 break;
1084 }
1085 } else if (addr < req_addr) {
1086 u64 ls_range = req_addr - addr;
1087 struct drm_gpuva_op_map p = {
1088 .va.addr = addr,
1089 .va.range = ls_range,
1090 .gem.obj = obj,
1091 .gem.offset = offset,
1092 };
1093 struct drm_gpuva_op_unmap u = { .va = va };
1094
1095 merge &= obj == req_obj &&
1096 offset + ls_range == req_offset;
1097 u.keep = merge;
1098
1099 if (end == req_end) {
1100 ret = op_remap_cb(ops, &it, priv, &p, NULL, &u);
1101 if (ret)
1102 return ret;
1103 break;
1104 }
1105
1106 if (end < req_end) {
1107 ret = op_remap_cb(ops, &it, priv, &p, NULL, &u);
1108 if (ret)
1109 return ret;
1110 goto next;
1111 }
1112
1113 if (end > req_end) {
1114 struct drm_gpuva_op_map n = {
1115 .va.addr = req_end,
1116 .va.range = end - req_end,
1117 .gem.obj = obj,
1118 .gem.offset = offset + ls_range +
1119 req_range,
1120 };
1121
1122 ret = op_remap_cb(ops, &it, priv, &p, &n, &u);
1123 if (ret)
1124 return ret;
1125 break;
1126 }
1127 } else if (addr > req_addr) {
1128 merge &= obj == req_obj &&
1129 offset == req_offset +
1130 (addr - req_addr);
1131
1132 if (end == req_end) {
1133 ret = op_unmap_cb(ops, &it, priv, va, merge);
1134 if (ret)
1135 return ret;
1136 break;
1137 }
1138
1139 if (end < req_end) {
1140 ret = op_unmap_cb(ops, &it, priv, va, merge);
1141 if (ret)
1142 return ret;
1143 goto next;
1144 }
1145
1146 if (end > req_end) {
1147 struct drm_gpuva_op_map n = {
1148 .va.addr = req_end,
1149 .va.range = end - req_end,
1150 .gem.obj = obj,
1151 .gem.offset = offset + req_end - addr,
1152 };
1153 struct drm_gpuva_op_unmap u = {
1154 .va = va,
1155 .keep = merge,
1156 };
1157
1158 ret = op_remap_cb(ops, &it, priv, NULL, &n, &u);
1159 if (ret)
1160 return ret;
1161 break;
1162 }
1163 }
1164 next:
1165 prev = va;
1166 }
1167
1168 return op_map_cb(ops, priv,
1169 req_addr, req_range,
1170 req_obj, req_offset);
1171 }
1172

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
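
The remap branches in the quoted __drm_gpuva_sm_map() all compute the same
thing: which parts of an existing mapping survive once the requested range is
carved out of it. A small userspace sketch of that arithmetic follows. It is
not the kernel implementation: struct span, struct split and split_existing()
are hypothetical names, GEM objects, merging and callbacks are left out, and
the caller is assumed to pass a request that actually overlaps the old mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a mapping: VA start, length, backing offset. */
struct span {
	uint64_t addr;
	uint64_t range;
	uint64_t offset;
};

/* The surviving pieces of a split mapping, like @prev and @next of a remap. */
struct split {
	bool has_prev;
	bool has_next;
	struct span prev;
	struct span next;
};

/*
 * Compute the parts of @old that remain when the request
 * [req_addr, req_addr + req_range) is carved out of it.
 */
static struct split split_existing(struct span old, uint64_t req_addr,
				   uint64_t req_range)
{
	struct split s = { 0 };
	uint64_t old_end = old.addr + old.range;
	uint64_t req_end = req_addr + req_range;

	/* Assumption of this sketch: the two ranges overlap. */
	assert(req_addr < old_end && old.addr < req_end);

	if (old.addr < req_addr) {
		/* The old mapping starts first: keep its head. */
		s.has_prev = true;
		s.prev.addr = old.addr;
		s.prev.range = req_addr - old.addr;
		s.prev.offset = old.offset;
	}

	if (old_end > req_end) {
		/* The old mapping ends last: keep its tail. */
		s.has_next = true;
		s.next.addr = req_end;
		s.next.range = old_end - req_end;
		s.next.offset = old.offset + (req_end - old.addr);
	}

	return s;
}
```

When neither part survives, the old mapping is fully covered by the request,
which corresponds to the plain unmap branches in the quoted function. When only
one part survives, the other pointer of the remap operation is NULL.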

2023-06-07 21:43:53

by kernel test robot

Subject: Re: [PATCH drm-next v4 04/14] drm: debugfs: provide infrastructure to dump a DRM GPU VA space

2023-06-08 13:16:26

by kernel test robot

Subject: Re: [PATCH drm-next v4 13/14] drm/nouveau: implement new VM_BIND uAPI

Hi Danilo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 33a86170888b7e4aa0cea94ebb9c67180139cea9]

url: https://github.com/intel-lab-lkp/linux/commits/Danilo-Krummrich/drm-execution-context-for-GEM-buffers-v4/20230607-063442
base: 33a86170888b7e4aa0cea94ebb9c67180139cea9
patch link: https://lore.kernel.org/r/20230606223130.6132-14-dakr%40redhat.com
patch subject: [PATCH drm-next v4 13/14] drm/nouveau: implement new VM_BIND uAPI
config: alpha-randconfig-s041-20230608 (https://download.01.org/0day-ci/archive/20230608/[email protected]/config)
compiler: alpha-linux-gcc (GCC) 12.3.0
reproduce:
mkdir -p ~/bin
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.4-39-gce1a6720-dirty
# https://github.com/intel-lab-lkp/linux/commit/28d9f3973f9ed165312943fb05304fad878abb33
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Danilo-Krummrich/drm-execution-context-for-GEM-buffers-v4/20230607-063442
git checkout 28d9f3973f9ed165312943fb05304fad878abb33
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=alpha olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=alpha SHELL=/bin/bash drivers/gpu/drm/

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/

sparse warnings: (new ones prefixed by >>)
>> drivers/gpu/drm/nouveau/nouveau_drm.c:1194:9: sparse: sparse: incorrect type in initializer (incompatible argument 2 (different address spaces)) @@ expected int ( [usertype] *func )( ... ) @@ got int ( * )( ... ) @@
drivers/gpu/drm/nouveau/nouveau_drm.c:1194:9: sparse: expected int ( [usertype] *func )( ... )
drivers/gpu/drm/nouveau/nouveau_drm.c:1194:9: sparse: got int ( * )( ... )
drivers/gpu/drm/nouveau/nouveau_drm.c:1195:9: sparse: sparse: incorrect type in initializer (incompatible argument 2 (different address spaces)) @@ expected int ( [usertype] *func )( ... ) @@ got int ( * )( ... ) @@
drivers/gpu/drm/nouveau/nouveau_drm.c:1195:9: sparse: expected int ( [usertype] *func )( ... )
drivers/gpu/drm/nouveau/nouveau_drm.c:1195:9: sparse: got int ( * )( ... )
drivers/gpu/drm/nouveau/nouveau_drm.c:1196:9: sparse: sparse: incorrect type in initializer (incompatible argument 2 (different address spaces)) @@ expected int ( [usertype] *func )( ... ) @@ got int ( * )( ... ) @@
drivers/gpu/drm/nouveau/nouveau_drm.c:1196:9: sparse: expected int ( [usertype] *func )( ... )
drivers/gpu/drm/nouveau/nouveau_drm.c:1196:9: sparse: got int ( * )( ... )
--
>> drivers/gpu/drm/nouveau/nouveau_exec.c:305:19: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_exec.c:306:19: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_exec.c:307:20: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_exec.c:308:20: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_exec.c:309:21: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_exec.c:310:21: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_exec.c:378:43: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_exec.c:393:13: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_exec.c:396:13: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_exec.c:397:17: sparse: sparse: dereference of noderef expression
--
drivers/gpu/drm/nouveau/nouveau_uvmm.c:1637:1: sparse: sparse: symbol 'nouveau_uvmm_ioctl_vm_init' redeclared with different type (incompatible argument 2 (different address spaces)):
>> drivers/gpu/drm/nouveau/nouveau_uvmm.c:1637:1: sparse: int extern [addressable] [signed] [toplevel] nouveau_uvmm_ioctl_vm_init( ... )
drivers/gpu/drm/nouveau/nouveau_uvmm.c: note: in included file (through drivers/gpu/drm/nouveau/nouveau_drv.h):
drivers/gpu/drm/nouveau/nouveau_uvmm.h:91:5: sparse: note: previously declared as:
>> drivers/gpu/drm/nouveau/nouveau_uvmm.h:91:5: sparse: int extern [addressable] [signed] [toplevel] nouveau_uvmm_ioctl_vm_init( ... )
drivers/gpu/drm/nouveau/nouveau_uvmm.c:342:17: sparse: sparse: context imbalance in '__nouveau_uvma_region_insert' - unexpected unlock
>> drivers/gpu/drm/nouveau/nouveau_uvmm.c:1674:19: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_uvmm.c:1675:19: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_uvmm.c:1676:20: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_uvmm.c:1677:20: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_uvmm.c:1678:19: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_uvmm.c:1679:19: sparse: sparse: dereference of noderef expression
drivers/gpu/drm/nouveau/nouveau_uvmm.c:1682:23: sparse: sparse: dereference of noderef expression

vim +1194 drivers/gpu/drm/nouveau/nouveau_drm.c

1177
1178 static const struct drm_ioctl_desc
1179 nouveau_ioctls[] = {
1180 DRM_IOCTL_DEF_DRV(NOUVEAU_GETPARAM, nouveau_abi16_ioctl_getparam, DRM_RENDER_ALLOW),
1181 DRM_IOCTL_DEF_DRV(NOUVEAU_SETPARAM, drm_invalid_op, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
1182 DRM_IOCTL_DEF_DRV(NOUVEAU_CHANNEL_ALLOC, nouveau_abi16_ioctl_channel_alloc, DRM_RENDER_ALLOW),
1183 DRM_IOCTL_DEF_DRV(NOUVEAU_CHANNEL_FREE, nouveau_abi16_ioctl_channel_free, DRM_RENDER_ALLOW),
1184 DRM_IOCTL_DEF_DRV(NOUVEAU_GROBJ_ALLOC, nouveau_abi16_ioctl_grobj_alloc, DRM_RENDER_ALLOW),
1185 DRM_IOCTL_DEF_DRV(NOUVEAU_NOTIFIEROBJ_ALLOC, nouveau_abi16_ioctl_notifierobj_alloc, DRM_RENDER_ALLOW),
1186 DRM_IOCTL_DEF_DRV(NOUVEAU_GPUOBJ_FREE, nouveau_abi16_ioctl_gpuobj_free, DRM_RENDER_ALLOW),
1187 DRM_IOCTL_DEF_DRV(NOUVEAU_SVM_INIT, nouveau_svmm_init, DRM_RENDER_ALLOW),
1188 DRM_IOCTL_DEF_DRV(NOUVEAU_SVM_BIND, nouveau_svmm_bind, DRM_RENDER_ALLOW),
1189 DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_NEW, nouveau_gem_ioctl_new, DRM_RENDER_ALLOW),
1190 DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_PUSHBUF, nouveau_gem_ioctl_pushbuf, DRM_RENDER_ALLOW),
1191 DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_CPU_PREP, nouveau_gem_ioctl_cpu_prep, DRM_RENDER_ALLOW),
1192 DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_CPU_FINI, nouveau_gem_ioctl_cpu_fini, DRM_RENDER_ALLOW),
1193 DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_INFO, nouveau_gem_ioctl_info, DRM_RENDER_ALLOW),
> 1194 DRM_IOCTL_DEF_DRV(NOUVEAU_VM_INIT, nouveau_uvmm_ioctl_vm_init, DRM_RENDER_ALLOW),
1195 DRM_IOCTL_DEF_DRV(NOUVEAU_VM_BIND, nouveau_uvmm_ioctl_vm_bind, DRM_RENDER_ALLOW),
1196 DRM_IOCTL_DEF_DRV(NOUVEAU_EXEC, nouveau_exec_ioctl_exec, DRM_RENDER_ALLOW),
1197 };
1198

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

2023-06-09 12:52:14

by Donald Robson

Subject: Re: [PATCH drm-next v4 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

On Wed, 2023-06-07 at 00:31 +0200, Danilo Krummrich wrote:
>
> Christian König (1):
> drm: execution context for GEM buffers v4
>
> Danilo Krummrich (13):
> maple_tree: split up MA_STATE() macro
> drm: manager to keep track of GPUs VA mappings

I have tested the drm GPUVA manager as part of using it with our new
driver. The link below shows use of the drm_gpuva_sm_[un]map()
functions. I think this is based on the v3 patches, but I have also
tried it locally using v4 patches. We will be submitting this
driver for review soon.

https://gitlab.freedesktop.org/sarah-walker-imgtec/powervr/-/blob/dev/v3/drivers/gpu/drm/imagination/pvr_vm.c

In a previous incarnation, I used the drm_gpuva_insert() and
drm_gpuva_remove() functions directly. In some now abandoned work I
used the drm_gpuva_sm_[un]map_ops_create() route.

The only problem I encountered along the way was the maple tree init
issue already reported by Boris and fixed in v4. One caveat - as
our driver is a work in progress our testing is limited to certain
Sascha Willems tests.

I did find it quite difficult to get the prealloc route with
drm_gpuva_sm_[un]map() working. I'm not sure to what degree this
reflects me being a novice on matters DRM, but I did find myself
wishing for more direction, even with Boris's help.

Tested-by: Donald Robson <[email protected]>

> drm: debugfs: provide infrastructure to dump a DRM GPU VA space
> drm/nouveau: new VM_BIND uapi interfaces
> drm/nouveau: get vmm via nouveau_cli_vmm()
> drm/nouveau: bo: initialize GEM GPU VA interface
> drm/nouveau: move usercopy helpers to nouveau_drv.h
> drm/nouveau: fence: separate fence alloc and emit
> drm/nouveau: fence: fail to emit when fence context is killed
> drm/nouveau: chan: provide nouveau_channel_kill()
> drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
> drm/nouveau: implement new VM_BIND uAPI
> drm/nouveau: debugfs: implement DRM GPU VA debugfs
>

2023-06-13 14:44:32

by Danilo Krummrich

Subject: Re: [PATCH drm-next v4 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

Hi Donald,

On 6/9/23 13:56, Donald Robson wrote:
> On Wed, 2023-06-07 at 00:31 +0200, Danilo Krummrich wrote:
>>
>> Christian König (1):
>> drm: execution context for GEM buffers v4
>>
>> Danilo Krummrich (13):
>> maple_tree: split up MA_STATE() macro
>> drm: manager to keep track of GPUs VA mappings
>
> I have tested the drm GPUVA manager as part of using it with our new
> driver. The link below shows use of the drm_gpuva_sm_[un]map()
> functions. I think this is based on the v3 patches, but I have also
> tried it locally using v4 patches. We will be submitting this
> driver for review soon.

That's awesome - thank you for taking the effort!

>
> https://gitlab.freedesktop.org/sarah-walker-imgtec/powervr/-/blob/dev/v3/drivers/gpu/drm/imagination/pvr_vm.c
>
> In a previous incarnation, I used the drm_gpuva_insert() and
> drm_gpuva_remove() functions directly. In some now abandoned work I
> used the drm_gpuva_sm_[un]map_ops_create() route.
>
> The only problem I encountered along the way was the maple tree init
> issue already reported by Boris and fixed in v4. One caveat - as
> our driver is a work in progress our testing is limited to certain
> Sascha Willems tests.
>
> I did find it quite difficult to get the prealloc route with
> drm_gpuva_sm_[un]map() working. I'm not sure to what degree this
> reflects me being a novice on matters DRM, but I did find myself
> wishing for more direction, even with Boris's help.

I'm definitely up for improving the existing documentation. Anything in
particular you think should be described in more detail?

- Danilo

>
> Tested-by: Donald Robson <[email protected]>
>
>> drm: debugfs: provide infrastructure to dump a DRM GPU VA space
>> drm/nouveau: new VM_BIND uapi interfaces
>> drm/nouveau: get vmm via nouveau_cli_vmm()
>> drm/nouveau: bo: initialize GEM GPU VA interface
>> drm/nouveau: move usercopy helpers to nouveau_drv.h
>> drm/nouveau: fence: separate fence alloc and emit
>> drm/nouveau: fence: fail to emit when fence context is killed
>> drm/nouveau: chan: provide nouveau_channel_kill()
>> drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
>> drm/nouveau: implement new VM_BIND uAPI
>> drm/nouveau: debugfs: implement DRM GPU VA debugfs
>>

2023-06-13 18:07:44

by Liam R. Howlett

Subject: Re: [PATCH drm-next v4 02/14] maple_tree: split up MA_STATE() macro

* Danilo Krummrich <[email protected]> [230606 18:31]:
> Split up the MA_STATE() macro such that components using the maple tree
> can easily inherit from struct ma_state and build custom tree walk
> macros to hide their internals from users.
>
> Example:
>
> struct sample_iterator {
> struct ma_state mas;
> struct sample_mgr *mgr;
> };
>
> #define SAMPLE_ITERATOR(name, __mgr, start) \
> struct sample_iterator name = { \
> .mas = MA_STATE_INIT(&(__mgr)->mt, start, 0), \
> .mgr = __mgr, \
> }
>
> #define sample_iter_for_each_range(it__, entry__, end__) \
> mas_for_each(&(it__).mas, entry__, end__)
>
> --
>
> struct sample *sample;
> SAMPLE_ITERATOR(si, mgr, min);
>
> sample_iter_for_each_range(si, sample, max) {
> frob(mgr, sample);
> }
>
> Signed-off-by: Danilo Krummrich <[email protected]>

Reviewed-by: Liam R. Howlett <[email protected]>

> ---
> include/linux/maple_tree.h | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
> index 1fadb5f5978b..87d55334f1c2 100644
> --- a/include/linux/maple_tree.h
> +++ b/include/linux/maple_tree.h
> @@ -423,8 +423,8 @@ struct ma_wr_state {
> #define MA_ERROR(err) \
> ((struct maple_enode *)(((unsigned long)err << 2) | 2UL))
>
> -#define MA_STATE(name, mt, first, end) \
> - struct ma_state name = { \
> +#define MA_STATE_INIT(mt, first, end) \
> + { \
> .tree = mt, \
> .index = first, \
> .last = end, \
> @@ -435,6 +435,9 @@ struct ma_wr_state {
> .mas_flags = 0, \
> }
>
> +#define MA_STATE(name, mt, first, end) \
> + struct ma_state name = MA_STATE_INIT(mt, first, end)
> +
> #define MA_WR_STATE(name, ma_state, wr_entry) \
> struct ma_wr_state name = { \
> .mas = ma_state, \
> --
> 2.40.1
>
>
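
For readers following along outside the kernel tree, the split described in the commit message can be sketched in userspace. This is only an illustration under stated assumptions: `toy_state`, `toy_mgr`, `toy_iterator` and all `TOY_*` macros are hypothetical stand-ins invented here (an array replaces the maple tree), and the loop macro is not how mas_for_each() works internally. The point it tries to show is that an initializer-only macro can be reused inside the initializer of an embedding struct, which a declaration macro cannot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ma_state; not the kernel type. */
struct toy_state {
	const int *tree;	/* stand-in for the tree pointer */
	size_t index;		/* current index of the walk */
	size_t last;		/* last index of the walk */
};

/* Initializer-only macro: usable inside another initializer. */
#define TOY_STATE_INIT(t, first, end) \
	{ .tree = (t), .index = (first), .last = (end) }

/* Declaration macro built on top of the initializer. */
#define TOY_STATE(name, t, first, end) \
	struct toy_state name = TOY_STATE_INIT(t, first, end)

/* Hypothetical manager and an iterator that embeds the state. */
struct toy_mgr {
	const int *vals;
	size_t count;
};

struct toy_iterator {
	struct toy_state st;
	struct toy_mgr *mgr;
};

#define TOY_ITERATOR(name, mgr__, start) \
	struct toy_iterator name = { \
		.st = TOY_STATE_INIT((mgr__)->vals, start, 0), \
		.mgr = (mgr__), \
	}

/* Visit entries from the iterator's start index up to and including max__. */
#define toy_iter_for_each(it__, entry__, max__) \
	for ((it__).st.last = (max__); \
	     (it__).st.index <= (it__).st.last && \
	     (it__).st.index < (it__).mgr->count && \
	     ((entry__) = (it__).st.tree[(it__).st.index], 1); \
	     (it__).st.index++)

/* Sum the entries in [start, max]; the macro takes the iterator object itself. */
static int toy_sum(struct toy_mgr *mgr, size_t start, size_t max)
{
	int entry, sum = 0;
	TOY_ITERATOR(it, mgr, start);

	toy_iter_for_each(it, entry, max)
		sum += entry;
	return sum;
}
```

Note that the loop macro receives the iterator by name rather than by address, because it applies `.st` to its first argument.
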

2023-06-14 01:05:04

by Liam R. Howlett

Subject: Re: [PATCH drm-next v4 03/14] drm: manager to keep track of GPUs VA mappings

* Danilo Krummrich <[email protected]> [230606 18:32]:
> Add infrastructure to keep track of GPU virtual address (VA) mappings
> with a dedicated VA space manager implementation.
>
> New UAPIs, motivated by Vulkan sparse memory bindings graphics drivers
> start implementing, allow userspace applications to request multiple and
> arbitrary GPU VA mappings of buffer objects. The DRM GPU VA manager is
> intended to serve the following purposes in this context.
>
> 1) Provide infrastructure to track GPU VA allocations and mappings,
> making use of the maple_tree.
>
> 2) Generically connect GPU VA mappings to their backing buffers, in
> particular DRM GEM objects.
>
> 3) Provide a common implementation to perform more complex mapping
> operations on the GPU VA space. In particular splitting and merging
> of GPU VA mappings, e.g. for intersecting mapping requests or partial
> unmap requests.
>
> Suggested-by: Dave Airlie <[email protected]>
> Signed-off-by: Danilo Krummrich <[email protected]>
> ---
> Documentation/gpu/drm-mm.rst | 31 +
> drivers/gpu/drm/Makefile | 1 +
> drivers/gpu/drm/drm_gem.c | 3 +
> drivers/gpu/drm/drm_gpuva_mgr.c | 1687 +++++++++++++++++++++++++++++++
> include/drm/drm_drv.h | 6 +
> include/drm/drm_gem.h | 75 ++
> include/drm/drm_gpuva_mgr.h | 681 +++++++++++++
> 7 files changed, 2484 insertions(+)
> create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
> create mode 100644 include/drm/drm_gpuva_mgr.h
>
> diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
> index a52e6f4117d6..c9f120cfe730 100644
> --- a/Documentation/gpu/drm-mm.rst
> +++ b/Documentation/gpu/drm-mm.rst
> @@ -466,6 +466,37 @@ DRM MM Range Allocator Function References
> .. kernel-doc:: drivers/gpu/drm/drm_mm.c
> :export:
>
> +DRM GPU VA Manager
> +==================
> +
> +Overview
> +--------
> +
> +.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
> + :doc: Overview
> +
> +Split and Merge
> +---------------
> +
> +.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
> + :doc: Split and Merge
> +
> +Locking
> +-------
> +
> +.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
> + :doc: Locking
> +
> +
> +DRM GPU VA Manager Function References
> +--------------------------------------
> +
> +.. kernel-doc:: include/drm/drm_gpuva_mgr.h
> + :internal:
> +
> +.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
> + :export:
> +
> DRM Buddy Allocator
> ===================
>
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index 9c6446eb3c83..8eeed446a078 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -45,6 +45,7 @@ drm-y := \
> drm_vblank.o \
> drm_vblank_work.o \
> drm_vma_manager.o \
> + drm_gpuva_mgr.o \
> drm_writeback.o
> drm-$(CONFIG_DRM_LEGACY) += \
> drm_agpsupport.o \
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 1a5a2cd0d4ec..cd878ebddbd0 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -164,6 +164,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
> if (!obj->resv)
> obj->resv = &obj->_resv;
>
> + if (drm_core_check_feature(dev, DRIVER_GEM_GPUVA))
> + drm_gem_gpuva_init(obj);
> +
> drm_vma_node_reset(&obj->vma_node);
> INIT_LIST_HEAD(&obj->lru_node);
> }
> diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c
> new file mode 100644
> index 000000000000..dd8dd7fef14b
> --- /dev/null
> +++ b/drivers/gpu/drm/drm_gpuva_mgr.c
> @@ -0,0 +1,1687 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2022 Red Hat.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + *
> + * Authors:
> + * Danilo Krummrich <[email protected]>
> + *
> + */
> +
> +#include <drm/drm_gem.h>
> +#include <drm/drm_gpuva_mgr.h>
> +
> +/**
> + * DOC: Overview
> + *
> + * The DRM GPU VA Manager, represented by struct drm_gpuva_manager, keeps track
> + * of a GPU's virtual address (VA) space and manages the corresponding virtual
> + * mappings represented by &drm_gpuva objects. It also keeps track of the
> + * mapping's backing &drm_gem_object buffers.
> + *
> + * &drm_gem_object buffers maintain a list (and a corresponding list lock) of
> + * &drm_gpuva objects representing all existent GPU VA mappings using this
> + * &drm_gem_object as backing buffer.
> + *
> + * GPU VAs can be flagged as sparse, such that drivers may use GPU VAs to also
> + * keep track of sparse PTEs in order to support Vulkan 'Sparse Resources'.
> + *
> + * The GPU VA manager internally uses a &maple_tree to manage the
> + * &drm_gpuva mappings within a GPU's virtual address space.
> + *
> + * The &drm_gpuva_manager contains a special &drm_gpuva representing the
> + * portion of VA space reserved by the kernel. This node is initialized together
> + * with the GPU VA manager instance and removed when the GPU VA manager is
> + * destroyed.
> + *
> + * In a typical application drivers would embed struct drm_gpuva_manager and
> + * struct drm_gpuva within their own driver specific structures; the manager
> + * makes no memory allocations of its own, nor does it allocate &drm_gpuva
> + * entries.
> + *
> + * However, the &drm_gpuva_manager needs to allocate nodes for its internal
> + * tree structures when &drm_gpuva entries are inserted. In order to support
> + * inserting &drm_gpuva entries from dma-fence signalling critical sections the
> + * &drm_gpuva_manager provides struct drm_gpuva_prealloc. Drivers may create
> + * pre-allocated nodes with drm_gpuva_prealloc_create() and subsequently insert
> + * a new &drm_gpuva entry with drm_gpuva_insert_prealloc().
> + */
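
The preallocation idea in the last paragraph of the overview can be sketched in userspace as follows. This is a rough illustration, not the manager's implementation: every `toy_*` name is hypothetical, a linked list stands in for the maple tree, and calloc() stands in for kernel allocation. It only shows the general shape: reserve memory where failure is still acceptable, then insert later on a path that must not allocate.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are kernel types. */
struct toy_node {
	unsigned long addr;
	struct toy_node *next;
};

struct toy_space {
	struct toy_node *head;	/* singly linked list instead of a tree */
};

struct toy_prealloc {
	struct toy_node *node;	/* memory reserved for one future insert */
};

/* May allocate and may fail: call this before the critical section. */
static struct toy_prealloc *toy_prealloc_create(void)
{
	struct toy_prealloc *pa = calloc(1, sizeof(*pa));

	if (!pa)
		return NULL;
	pa->node = calloc(1, sizeof(*pa->node));
	if (!pa->node) {
		free(pa);
		return NULL;
	}
	return pa;
}

/* Never allocates, so it cannot fail for lack of memory. */
static void toy_insert_prealloc(struct toy_space *s, struct toy_prealloc *pa,
				unsigned long addr)
{
	struct toy_node *n = pa->node;

	pa->node = NULL;	/* ownership moves to the space */
	n->addr = addr;
	n->next = s->head;
	s->head = n;
}

/* Releases the reservation; frees the node only if it was never used. */
static void toy_prealloc_destroy(struct toy_prealloc *pa)
{
	free(pa->node);
	free(pa);
}
```

In this sketch the caller would run toy_prealloc_create() while it may still sleep or fail, call toy_insert_prealloc() from the restricted context, and call toy_prealloc_destroy() afterwards in either case.
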
> +
> +/**
> + * DOC: Split and Merge
> + *
> + * The DRM GPU VA manager also provides an algorithm implementing splitting and
> + * merging of existent GPU VA mappings with the ones that are requested to be
> + * mapped or unmapped. This feature is required by the Vulkan API to implement
> + * Vulkan 'Sparse Memory Bindings' - drivers UAPIs often refer to this as
> + * VM BIND.
> + *
> + * Drivers can call drm_gpuva_sm_map() to receive a sequence of callbacks
> + * containing map, unmap and remap operations for a given newly requested
> + * mapping. The sequence of callbacks represents the set of operations to
> + * execute in order to integrate the new mapping cleanly into the current state
> + * of the GPU VA space.
> + *
> + * Depending on how the new GPU VA mapping intersects with the existent mappings
> + * of the GPU VA space the &drm_gpuva_fn_ops callbacks contain an arbitrary
> + * amount of unmap operations, a maximum of two remap operations and a single
> + * map operation. The caller might receive no callback at all if no operation is
> + * required, e.g. if the requested mapping already exists in the exact same way.
> + *
> + * The single map operation represents the original map operation requested by
> + * the caller.
> + *
> + * &drm_gpuva_op_unmap contains a 'keep' field, which indicates whether the
> + * &drm_gpuva to unmap is physically contiguous with the original mapping
> + * request. Optionally, if 'keep' is set, drivers may keep the actual page table
> + * entries for this &drm_gpuva, adding the missing page table entries only and
> + * update the &drm_gpuva_manager's view of things accordingly.
> + *
> + * Drivers may do the same optimization, namely delta page table updates, also
> + * for remap operations. This is possible since &drm_gpuva_op_remap consists of
> + * one unmap operation and one or two map operations, such that drivers can
> + * derive the page table update delta accordingly.
> + *
> + * Note that there can't be more than two existent mappings to split up, one at
> + * the beginning and one at the end of the new mapping, hence there is a
> + * maximum of two remap operations.
> + *
> + * Analogous to drm_gpuva_sm_map(), drm_gpuva_sm_unmap() uses &drm_gpuva_fn_ops
> + * to call back into the driver in order to unmap a range of GPU VA space. The
> + * logic behind this function is much simpler though: for all existent mappings
> + * enclosed by the given range unmap operations are created. For mappings which
> + * are only partially located within the given range, remap operations are
> + * created such that those mappings are split up and re-mapped partially.
> + *
> + * To update the &drm_gpuva_manager's view of the GPU VA space
> + * drm_gpuva_insert(), drm_gpuva_insert_prealloc(), and drm_gpuva_remove() may
> + * be used. Please note that these functions are not safe to be called from a
> + * &drm_gpuva_fn_ops callback originating from drm_gpuva_sm_map() or
> + * drm_gpuva_sm_unmap(). The drm_gpuva_map(), drm_gpuva_remap() and
> + * drm_gpuva_unmap() helpers should be used instead.
> + *
> + * The following diagram depicts the basic relationships of existent GPU VA
> + * mappings, a newly requested mapping and the resulting mappings as implemented
> + * by drm_gpuva_sm_map() - it doesn't cover any arbitrary combinations of these.
> + *
> + * 1) Requested mapping is identical. Replace it, but indicate the backing PTEs
> + * could be kept.
> + *
> + * ::
> + *
> + * 0 a 1
> + * old: |-----------| (bo_offset=n)
> + *
> + * 0 a 1
> + * req: |-----------| (bo_offset=n)
> + *
> + * 0 a 1
> + * new: |-----------| (bo_offset=n)
> + *
> + *
> + * 2) Requested mapping is identical, except for the BO offset, hence replace
> + * the mapping.
> + *
> + * ::
> + *
> + * 0 a 1
> + * old: |-----------| (bo_offset=n)
> + *
> + * 0 a 1
> + * req: |-----------| (bo_offset=m)
> + *
> + * 0 a 1
> + * new: |-----------| (bo_offset=m)
> + *
> + *
> + * 3) Requested mapping is identical, except for the backing BO, hence replace
> + * the mapping.
> + *
> + * ::
> + *
> + * 0 a 1
> + * old: |-----------| (bo_offset=n)
> + *
> + * 0 b 1
> + * req: |-----------| (bo_offset=n)
> + *
> + * 0 b 1
> + * new: |-----------| (bo_offset=n)
> + *
> + *
> + * 4) Existent mapping is a left aligned subset of the requested one, hence
> + * replace the existent one.
> + *
> + * ::
> + *
> + * 0 a 1
> + * old: |-----| (bo_offset=n)
> + *
> + * 0 a 2
> + * req: |-----------| (bo_offset=n)
> + *
> + * 0 a 2
> + * new: |-----------| (bo_offset=n)
> + *
> + * .. note::
> + * We expect to see the same result for a request with a different BO
> + * and/or non-contiguous BO offset.
> + *
> + *
> + * 5) Requested mapping's range is a left aligned subset of the existent one,
> + * but backed by a different BO. Hence, map the requested mapping and split
> + * the existent one adjusting its BO offset.
> + *
> + * ::
> + *
> + * 0 a 2
> + * old: |-----------| (bo_offset=n)
> + *
> + * 0 b 1
> + * req: |-----| (bo_offset=n)
> + *
> + * 0 b 1 a' 2
> + * new: |-----|-----| (b.bo_offset=n, a.bo_offset=n+1)
> + *
> + * .. note::
> + * We expect to see the same result for a request with a different BO
> + * and/or non-contiguous BO offset.
> + *
> + *
> + * 6) Existent mapping is a superset of the requested mapping. Split it up, but
> + * indicate that the backing PTEs could be kept.
> + *
> + * ::
> + *
> + * 0 a 2
> + * old: |-----------| (bo_offset=n)
> + *
> + * 0 a 1
> + * req: |-----| (bo_offset=n)
> + *
> + * 0 a 1 a' 2
> + * new: |-----|-----| (a.bo_offset=n, a'.bo_offset=n+1)
> + *
> + *
> + * 7) Requested mapping's range is a right aligned subset of the existent one,
> + * but backed by a different BO. Hence, map the requested mapping and split
> + * the existent one, without adjusting the BO offset.
> + *
> + * ::
> + *
> + * 0 a 2
> + * old: |-----------| (bo_offset=n)
> + *
> + * 1 b 2
> + * req: |-----| (bo_offset=m)
> + *
> + * 0 a 1 b 2
> + * new: |-----|-----| (a.bo_offset=n,b.bo_offset=m)
> + *
> + *
> + * 8) Existent mapping is a superset of the requested mapping. Split it up, but
> + * indicate that the backing PTEs could be kept.
> + *
> + * ::
> + *
> + * 0 a 2
> + * old: |-----------| (bo_offset=n)
> + *
> + * 1 a 2
> + * req: |-----| (bo_offset=n+1)
> + *
> + * 0 a' 1 a 2
> + * new: |-----|-----| (a'.bo_offset=n, a.bo_offset=n+1)
> + *
> + *
> + * 9) Existent mapping is overlapped at the end by the requested mapping backed
> + * by a different BO. Hence, map the requested mapping and split up the
> + * existent one, without adjusting the BO offset.
> + *
> + * ::
> + *
> + * 0 a 2
> + * old: |-----------| (bo_offset=n)
> + *
> + * 1 b 3
> + * req: |-----------| (bo_offset=m)
> + *
> + * 0 a 1 b 3
> + * new: |-----|-----------| (a.bo_offset=n,b.bo_offset=m)
> + *
> + *
> + * 10) Existent mapping is overlapped by the requested mapping, both having the
> + * same backing BO with a contiguous offset. Indicate the backing PTEs of
> + * the old mapping could be kept.
> + *
> + * ::
> + *
> + * 0 a 2
> + * old: |-----------| (bo_offset=n)
> + *
> + * 1 a 3
> + * req: |-----------| (bo_offset=n+1)
> + *
> + * 0 a' 1 a 3
> + * new: |-----|-----------| (a'.bo_offset=n, a.bo_offset=n+1)
> + *
> + *
> + * 11) Requested mapping's range is a centered subset of the existent one
> + * having a different backing BO. Hence, map the requested mapping and split
> + * up the existent one in two mappings, adjusting the BO offset of the right
> + * one accordingly.
> + *
> + * ::
> + *
> + * 0 a 3
> + * old: |-----------------| (bo_offset=n)
> + *
> + * 1 b 2
> + * req: |-----| (bo_offset=m)
> + *
> + * 0 a 1 b 2 a' 3
> + * new: |-----|-----|-----| (a.bo_offset=n,b.bo_offset=m,a'.bo_offset=n+2)
> + *
> + *
> + * 12) Requested mapping is a contiguous subset of the existent one. Split it
> + * up, but indicate that the backing PTEs could be kept.
> + *
> + * ::
> + *
> + * 0 a 3
> + * old: |-----------------| (bo_offset=n)
> + *
> + * 1 a 2
> + * req: |-----| (bo_offset=n+1)
> + *
> + * 0 a' 1 a 2 a'' 3
> + * new: |-----|-----|-----| (a'.bo_offset=n, a.bo_offset=n+1, a''.bo_offset=n+2)
> + *
> + *
> + * 13) Existent mapping is a right aligned subset of the requested one, hence
> + * replace the existent one.
> + *
> + * ::
> + *
> + * 1 a 2
> + * old: |-----| (bo_offset=n+1)
> + *
> + * 0 a 2
> + * req: |-----------| (bo_offset=n)
> + *
> + * 0 a 2
> + * new: |-----------| (bo_offset=n)
> + *
> + * .. note::
> + * We expect to see the same result for a request with a different bo
> + * and/or non-contiguous bo_offset.
> + *
> + *
> + * 14) Existent mapping is a centered subset of the requested one, hence
> + * replace the existent one.
> + *
> + * ::
> + *
> + * 1 a 2
> + * old: |-----| (bo_offset=n+1)
> + *
> + * 0 a 3
> + * req: |----------------| (bo_offset=n)
> + *
> + * 0 a 3
> + * new: |----------------| (bo_offset=n)
> + *
> + * .. note::
> + * We expect to see the same result for a request with a different bo
> + * and/or non-contiguous bo_offset.
> + *
> + *
> + * 15) Existent mapping is overlapped at the beginning by the requested mapping
> + * backed by a different BO. Hence, map the requested mapping and split up
> + * the existent one, adjusting its BO offset accordingly.
> + *
> + * ::
> + *
> + * 1 a 3
> + * old: |-----------| (bo_offset=n)
> + *
> + * 0 b 2
> + * req: |-----------| (bo_offset=m)
> + *
> + * 0 b 2 a' 3
> + * new: |-----------|-----| (b.bo_offset=m,a'.bo_offset=n+1)
> + */
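
To make the split and merge rules above easier to check by hand, here is a small userspace sketch of how the operation sequence for a map request could be derived. It is an approximation under assumptions, not the manager's algorithm: all `toy_*` names are hypothetical, existing mappings live in a sorted, non-overlapping array, operations are emitted in address order with the map operation last, and a map operation is always emitted. The `keep` flag is computed for every affected mapping as "same BO and contiguous offset".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_op_type { TOY_OP_MAP, TOY_OP_REMAP, TOY_OP_UNMAP };

/* Hypothetical mapping: [addr, addr + range) backed by bo at offset. */
struct toy_va {
	uint64_t addr;
	uint64_t range;
	int bo;
	uint64_t offset;
};

struct toy_op {
	enum toy_op_type type;
	struct toy_va old;		/* UNMAP/REMAP: mapping being replaced */
	int has_prev, has_next;		/* REMAP: which pieces survive */
	struct toy_va prev, next;
	struct toy_va map;		/* MAP: the requested mapping */
	int keep;			/* old PTEs are contiguous with the request */
};

/* Same BO and the offsets line up with the address difference. */
static int toy_contiguous(const struct toy_va *va, const struct toy_va *req)
{
	return va->bo == req->bo &&
	       va->offset + req->addr == req->offset + va->addr;
}

/* Returns the number of ops written, or 0 if ops[] is too small. */
static size_t toy_sm_map(const struct toy_va *vas, size_t n,
			 struct toy_va req, struct toy_op *ops, size_t max_ops)
{
	uint64_t req_end = req.addr + req.range;
	struct toy_op map_op = { 0 };
	size_t i, nops = 0;

	for (i = 0; i < n; i++) {
		const struct toy_va *va = &vas[i];
		uint64_t va_end = va->addr + va->range;
		struct toy_op op = { 0 };

		if (va_end <= req.addr || va->addr >= req_end)
			continue;		/* no intersection */
		if (nops == max_ops)
			return 0;
		op.old = *va;
		op.keep = toy_contiguous(va, &req);
		if (va->addr >= req.addr && va_end <= req_end) {
			op.type = TOY_OP_UNMAP;	/* fully enclosed by the request */
		} else {
			op.type = TOY_OP_REMAP;
			if (va->addr < req.addr) {	/* left piece survives */
				op.has_prev = 1;
				op.prev = *va;
				op.prev.range = req.addr - va->addr;
			}
			if (va_end > req_end) {		/* right piece survives */
				op.has_next = 1;
				op.next = *va;
				op.next.addr = req_end;
				op.next.range = va_end - req_end;
				op.next.offset = va->offset + (req_end - va->addr);
			}
		}
		ops[nops++] = op;
	}
	if (nops == max_ops)
		return 0;
	map_op.type = TOY_OP_MAP;
	map_op.map = req;
	ops[nops++] = map_op;
	return nops;
}
```

Feeding case 11 from the diagram into this sketch (old mapping 0..3 with bo_offset=n, request 1..2 on a different BO) yields one remap with both a prev and a next piece, the next piece at bo_offset=n+2, followed by the map.
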
> +
> +/**
> + * DOC: Locking
> + *
> + * Generally, the GPU VA manager does not take care of locking itself; it is
> + * the driver's responsibility to take care of locking. Drivers might want to
> + * protect the following operations: inserting, removing and iterating
> + * &drm_gpuva objects as well as generating all kinds of operations, such as
> + * split / merge or prefetch.
> + *
> + * The GPU VA manager also does not take care of the locking of the backing
> + * &drm_gem_object buffers' GPU VA lists by itself; drivers are responsible for
> + * enforcing mutual exclusion.
> + */
> +
> + /*
> + * Maple Tree Locking
> + *
> + * The maple tree's advanced API requires the user of the API to protect
> + * certain tree operations with a lock (either the external or internal tree
> + * lock) for tree internal reasons.
> + *
> + * The actual rules (when to acquire/release the lock) are enforced by lockdep
> + * through the maple tree implementation.
> + *
> + * For this reason the DRM GPUVA manager takes the maple tree's internal
> + * spinlock according to the lockdep enforced rules.
> + *
> + * Please note that this lock is *only* meant to fulfill the maple tree's
> + * requirements and does not intentionally protect the DRM GPUVA manager
> + * against concurrent access.
> + *
> + * The following mail thread provides more details on why the maple tree
> + * has this requirement.
> + *
> + * https://lore.kernel.org/lkml/[email protected]/
> + */
> +
> +static int __drm_gpuva_insert(struct drm_gpuva_manager *mgr,
> + struct drm_gpuva *va);
> +static void __drm_gpuva_remove(struct drm_gpuva *va);
> +
> +/**
> + * drm_gpuva_manager_init - initialize a &drm_gpuva_manager
> + * @mgr: pointer to the &drm_gpuva_manager to initialize
> + * @name: the name of the GPU VA space
> + * @start_offset: the start offset of the GPU VA space
> + * @range: the size of the GPU VA space
> + * @reserve_offset: the start of the kernel reserved GPU VA area
> + * @reserve_range: the size of the kernel reserved GPU VA area
> + * @ops: &drm_gpuva_fn_ops called on &drm_gpuva_sm_map / &drm_gpuva_sm_unmap
> + *
> + * The &drm_gpuva_manager must be initialized with this function before use.
> + *
> + * Note that @mgr must be cleared to 0 before calling this function. The given
> + * @name is expected to be managed by the surrounding driver structures.
> + */
> +void
> +drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
> + const char *name,
> + u64 start_offset, u64 range,
> + u64 reserve_offset, u64 reserve_range,
> + const struct drm_gpuva_fn_ops *ops)
> +{
> + mt_init(&mgr->mtree);
> +
> + mgr->mm_start = start_offset;
> + mgr->mm_range = range;
> +
> + mgr->name = name ? name : "unknown";
> + mgr->ops = ops;
> +
> + memset(&mgr->kernel_alloc_node, 0, sizeof(struct drm_gpuva));
> +
> + if (reserve_range) {
> + mgr->kernel_alloc_node.va.addr = reserve_offset;
> + mgr->kernel_alloc_node.va.range = reserve_range;
> +
> + __drm_gpuva_insert(mgr, &mgr->kernel_alloc_node);
> + }
> +
> +}
> +EXPORT_SYMBOL(drm_gpuva_manager_init);
> +
> +/**
> + * drm_gpuva_manager_destroy - cleanup a &drm_gpuva_manager
> + * @mgr: pointer to the &drm_gpuva_manager to clean up
> + *
> + * Note that it is a bug to call this function on a manager that still
> + * holds GPU VA mappings.
> + */
> +void
> +drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr)
> +{
> + mgr->name = NULL;
> +
> + if (mgr->kernel_alloc_node.va.range)
> + __drm_gpuva_remove(&mgr->kernel_alloc_node);
> +
> + mtree_lock(&mgr->mtree);
> + WARN(!mtree_empty(&mgr->mtree),
> + "GPUVA tree is not empty, potentially leaking memory.");
> + __mt_destroy(&mgr->mtree);
> + mtree_unlock(&mgr->mtree);
> +}
> +EXPORT_SYMBOL(drm_gpuva_manager_destroy);
> +
> +static inline bool
> +drm_gpuva_in_mm_range(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
> +{
> + u64 end = addr + range;
> + u64 mm_start = mgr->mm_start;
> + u64 mm_end = mm_start + mgr->mm_range;
> +
> + return addr < mm_end && mm_start < end;
> +}
> +
> +static inline bool
> +drm_gpuva_in_kernel_node(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
> +{
> + u64 end = addr + range;
> + u64 kstart = mgr->kernel_alloc_node.va.addr;
> + u64 krange = mgr->kernel_alloc_node.va.range;
> + u64 kend = kstart + krange;
> +
> + return krange && addr < kend && kstart < end;
> +}
> +
> +static inline bool
> +drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
> + u64 addr, u64 range)
> +{
> + return drm_gpuva_in_mm_range(mgr, addr, range) &&
> + !drm_gpuva_in_kernel_node(mgr, addr, range);
> +}
> +
> +/**
> + * drm_gpuva_iter_remove - removes the iterator's current element
> + * @it: the &drm_gpuva_iterator
> + *
> + * This removes the element the iterator currently points to.
> + */
> +void
> +drm_gpuva_iter_remove(struct drm_gpuva_iterator *it)
> +{
> + mas_lock(&it->mas);
> + mas_erase(&it->mas);
> + mas_unlock(&it->mas);
> +}
> +EXPORT_SYMBOL(drm_gpuva_iter_remove);
> +
> +/**
> + * drm_gpuva_prealloc_create - creates a preallocated node to store a
> + * &drm_gpuva entry.
> + *
> + * Returns: the &drm_gpuva_prealloc object on success, NULL on failure
> + */
> +struct drm_gpuva_prealloc *
> +drm_gpuva_prealloc_create(struct drm_gpuva_manager *mgr)
> +{
> + struct drm_gpuva_prealloc *pa;
> +
> + pa = kzalloc(sizeof(*pa), GFP_KERNEL);
> + if (!pa)
> + return NULL;
> +
> + mas_init(&pa->mas, &mgr->mtree, 0);

I've broken this interface on you too, with the mas_preallocate()
change; see below.

> + if (mas_preallocate(&pa->mas, GFP_KERNEL)) {
> + kfree(pa);
> + return NULL;
> + }
> +
> + return pa;
> +}
> +EXPORT_SYMBOL(drm_gpuva_prealloc_create);
> +
> +/**
> + * drm_gpuva_prealloc_destroy - destroys a preallocated node and frees the
> + * &drm_gpuva_prealloc

I tend to think of it as destroying a maple state by freeing the
preallocated nodes, but I guess the state isn't destroyed.

> + *
> + * @pa: the &drm_gpuva_prealloc to destroy
> + */
> +void
> +drm_gpuva_prealloc_destroy(struct drm_gpuva_prealloc *pa)
> +{
> + mas_destroy(&pa->mas);
> + kfree(pa);
> +}
> +EXPORT_SYMBOL(drm_gpuva_prealloc_destroy);
> +
> +static int
> +drm_gpuva_insert_state(struct drm_gpuva_manager *mgr,
> + struct ma_state *mas,
> + struct drm_gpuva *va)

Couldn't these arguments be on one line?

> +{
> + u64 addr = va->va.addr;
> + u64 range = va->va.range;
> + u64 last = addr + range - 1;
> +
> + mas_set(mas, addr);
> +
> + mas_lock(mas);
> + if (unlikely(mas_walk(mas))) {
> + mas_unlock(mas);
> + return -EEXIST;
> + }
> +
> + if (unlikely(mas->last < last)) {
> + mas_unlock(mas);
> + return -EEXIST;
> + }
> +
> + mas->index = addr;
> + mas->last = last;
> +
> + mas_store_prealloc(mas, va);
> + mas_unlock(mas);
> +
> + va->mgr = mgr;
> +
> + return 0;
> +}
> +
> +static int
> +__drm_gpuva_insert(struct drm_gpuva_manager *mgr,
> + struct drm_gpuva *va)
> +{
> + MA_STATE(mas, &mgr->mtree, 0, 0);
> + int ret;
> +
> + ret = mas_preallocate(&mas, GFP_KERNEL);

mas_preallocate() is in the process of being updated to reduce the
allocations, so this will eventually fail to compile [1].

mas_preallocate(&mas, va, GFP_KERNEL) will work in the future.

The calculated allocations depend on the area being written and if there
is a value or NULL being written.

> + if (ret)
> + return ret;
> +
> + return drm_gpuva_insert_state(mgr, &mas, va);

With that change mas_preallocate() examines the tree by walking it, so
you need to hold the lock during that work. Since you are not holding
the lock here, a writer could also come in and change the range you
preallocated for, leaving the store without enough memory. IIRC you
have another locking strategy that negates this, but you will still
need to hold the lock and have the maple state pointing at the correct
range now (or, well, soon) to keep lockdep happy.

Change this:
MA_STATE(mas, &mgr->mtree, 0, 0);

to something like this (but hopefully less ugly..)
MA_STATE(mas, &mgr->mtree, va->va.addr, va->va.addr + va->va.range - 1);

...maybe use mas_init() instead.

Strictly speaking, this does not need to preallocate since you don't
have complex locking in this case, but I suspect you are preallocating
for external driver use as documented in this patch? This can still
work if the preallocation call sets up the maple state and the driver
doesn't mess things up on you. You could check that by verifying
mas.index and mas.last are what you expect, but I think you'll want to
move your mas_walk() checks to before preallocating.

[1] https://lore.kernel.org/all/[email protected]/

> +}
> +
> +/**
> + * drm_gpuva_insert - insert a &drm_gpuva
> + * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
> + * @va: the &drm_gpuva to insert
> + *
> + * Insert a &drm_gpuva with a given address and range into a
> + * &drm_gpuva_manager.
> + *
> + * It is not allowed to use this function while iterating this GPU VA space,
> + * e.g via drm_gpuva_iter_for_each().
> + *
> + * Returns: 0 on success, negative error code on failure.
> + */
> +int
> +drm_gpuva_insert(struct drm_gpuva_manager *mgr,
> + struct drm_gpuva *va)
> +{
> + u64 addr = va->va.addr;
> + u64 range = va->va.range;
> +
> + if (unlikely(!drm_gpuva_range_valid(mgr, addr, range)))
> + return -EINVAL;
> +
> + return __drm_gpuva_insert(mgr, va);
> +}
> +EXPORT_SYMBOL(drm_gpuva_insert);
> +
> +/**
> + * drm_gpuva_insert_prealloc - insert a &drm_gpuva with a preallocated node
> + * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
> + * @va: the &drm_gpuva to insert
> + * @pa: the &drm_gpuva_prealloc node
> + *
> + * Insert a &drm_gpuva with a given address and range into a
> + * &drm_gpuva_manager.
> + *
> + * It is not allowed to use this function while iterating this GPU VA space,
> + * e.g via drm_gpuva_iter_for_each().
> + *
> + * Returns: 0 on success, negative error code on failure.
> + */
> +int
> +drm_gpuva_insert_prealloc(struct drm_gpuva_manager *mgr,
> + struct drm_gpuva_prealloc *pa,
> + struct drm_gpuva *va)
> +{
> + struct ma_state *mas = &pa->mas;
> + u64 addr = va->va.addr;
> + u64 range = va->va.range;
> +
> + if (unlikely(!drm_gpuva_range_valid(mgr, addr, range)))
> + return -EINVAL;
> +
> + mas->tree = &mgr->mtree;

Are you trying to take the allocated nodes for a write to a different
tree? You may not have enough nodes..

> + return drm_gpuva_insert_state(mgr, mas, va);
> +}
> +EXPORT_SYMBOL(drm_gpuva_insert_prealloc);
> +
> +static void
> +__drm_gpuva_remove(struct drm_gpuva *va)
> +{
> + MA_STATE(mas, &va->mgr->mtree, va->va.addr, 0);
> +
> + mas_lock(&mas);
> + mas_erase(&mas);
> + mas_unlock(&mas);

This should be the same as: mtree_erase(&va->mgr->mtree, va->va.addr);

> +}
> +
> +/**
> + * drm_gpuva_remove - remove a &drm_gpuva
> + * @va: the &drm_gpuva to remove
> + *
> + * This removes the given &va from the underlying tree.
> + *
> + * It is not allowed to use this function while iterating this GPU VA space,
> + * e.g via drm_gpuva_iter_for_each(). Please use drm_gpuva_iter_remove()
> + * instead.
> + */
> +void
> +drm_gpuva_remove(struct drm_gpuva *va)
> +{
> + struct drm_gpuva_manager *mgr = va->mgr;
> +
> + if (unlikely(va == &mgr->kernel_alloc_node)) {
> + WARN(1, "Can't destroy kernel reserved node.\n");
> + return;
> + }
> +
> + __drm_gpuva_remove(va);
> +}
> +EXPORT_SYMBOL(drm_gpuva_remove);
> +
> +/**
> + * drm_gpuva_link - link a &drm_gpuva
> + * @va: the &drm_gpuva to link
> + *
> + * This adds the given &va to the GPU VA list of the &drm_gem_object it is
> + * associated with.
> + *
> + * This function expects the caller to protect the GEM's GPUVA list against
> + * concurrent access.
> + */
> +void
> +drm_gpuva_link(struct drm_gpuva *va)
> +{
> + if (likely(va->gem.obj))
> + list_add_tail(&va->gem.entry, &va->gem.obj->gpuva.list);
> +}
> +EXPORT_SYMBOL(drm_gpuva_link);
> +
> +/**
> + * drm_gpuva_unlink - unlink a &drm_gpuva
> + * @va: the &drm_gpuva to unlink
> + *
> + * This removes the given &va from the GPU VA list of the &drm_gem_object it is
> + * associated with.
> + *
> + * This function expects the caller to protect the GEM's GPUVA list against
> + * concurrent access.
> + */
> +void
> +drm_gpuva_unlink(struct drm_gpuva *va)
> +{
> + if (likely(va->gem.obj))
> + list_del_init(&va->gem.entry);
> +}
> +EXPORT_SYMBOL(drm_gpuva_unlink);
> +
> +/**
> + * drm_gpuva_find_first - find the first &drm_gpuva in the given range
> + * @mgr: the &drm_gpuva_manager to search in
> + * @addr: the &drm_gpuvas address
> + * @range: the &drm_gpuvas range
> + *
> + * Returns: the first &drm_gpuva within the given range
> + */
> +struct drm_gpuva *
> +drm_gpuva_find_first(struct drm_gpuva_manager *mgr,
> + u64 addr, u64 range)
> +{
> + MA_STATE(mas, &mgr->mtree, addr, 0);
> + struct drm_gpuva *va;
> +

Again, this can be an rcu_read_lock()

> + mas_lock(&mas);
> + va = mas_find(&mas, addr + range - 1);
> + mas_unlock(&mas);
> +
> + return va;
> +}
> +EXPORT_SYMBOL(drm_gpuva_find_first);
> +
> +/**
> + * drm_gpuva_find - find a &drm_gpuva
> + * @mgr: the &drm_gpuva_manager to search in
> + * @addr: the &drm_gpuvas address
> + * @range: the &drm_gpuvas range
> + *
> + * Returns: the &drm_gpuva at a given &addr and with a given &range
> + */
> +struct drm_gpuva *
> +drm_gpuva_find(struct drm_gpuva_manager *mgr,
> + u64 addr, u64 range)
> +{
> + struct drm_gpuva *va;
> +
> + va = drm_gpuva_find_first(mgr, addr, range);
> + if (!va)
> + goto out;
> +
> + if (va->va.addr != addr ||
> + va->va.range != range)
> + goto out;
> +
> + return va;
> +
> +out:
> + return NULL;
> +}
> +EXPORT_SYMBOL(drm_gpuva_find);
> +
> +/**
> + * drm_gpuva_find_prev - find the &drm_gpuva before the given address
> + * @mgr: the &drm_gpuva_manager to search in
> + * @start: the given GPU VA's start address
> + *
> + * Find the adjacent &drm_gpuva before the GPU VA with given &start address.
> + *
> + * Note that if there is any free space between the GPU VA mappings no mapping
> + * is returned.
> + *
> + * Returns: a pointer to the found &drm_gpuva or NULL if none was found
> + */
> +struct drm_gpuva *
> +drm_gpuva_find_prev(struct drm_gpuva_manager *mgr, u64 start)
> +{
> + MA_STATE(mas, &mgr->mtree, start - 1, 0);
> + struct drm_gpuva *va;
> +
> + if (start <= mgr->mm_start ||
> + start > (mgr->mm_start + mgr->mm_range))
> + return NULL;
> +

And here as well. Maybe mtree_load() would be easier?

> + mas_lock(&mas);
> + va = mas_walk(&mas);
> + mas_unlock(&mas);
> +
> + return va;
> +}
> +EXPORT_SYMBOL(drm_gpuva_find_prev);
> +
> +/**
> + * drm_gpuva_find_next - find the &drm_gpuva after the given address
> + * @mgr: the &drm_gpuva_manager to search in
> + * @end: the given GPU VA's end address
> + *
> + * Find the adjacent &drm_gpuva after the GPU VA with given &end address.
> + *
> + * Note that if there is any free space between the GPU VA mappings no mapping
> + * is returned.
> + *
> + * Returns: a pointer to the found &drm_gpuva or NULL if none was found
> + */
> +struct drm_gpuva *
> +drm_gpuva_find_next(struct drm_gpuva_manager *mgr, u64 end)
> +{
> + MA_STATE(mas, &mgr->mtree, end, 0);
> + struct drm_gpuva *va;
> +
> + if (end < mgr->mm_start ||
> + end >= (mgr->mm_start + mgr->mm_range))
> + return NULL;
> +

Here too, you can use the mtree_load() function.

A note though that when I store my VMAs in the mm code, the VMAs are
[start, end) and the tree is [start, end], so I always take one away.
Not sure if your VMAs are the same way.
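
To make the off-by-one concrete, here is a minimal user-space sketch of
the conversion from a half-open (addr, range) pair to the inclusive
[index, last] pair that a tree with inclusive ranges stores. The struct
and helper names are hypothetical and not taken from the patch, and
uint64_t stands in for u64.

```c
#include <assert.h>
#include <stdint.h>

struct incl_range {
	uint64_t index;	/* first address covered */
	uint64_t last;	/* last address covered, inclusive */
};

/*
 * Convert half-open [addr, addr + range) to inclusive [index, last].
 * The range must be non-zero, otherwise last would wrap below addr.
 */
static struct incl_range to_inclusive(uint64_t addr, uint64_t range)
{
	struct incl_range r = { .index = addr, .last = addr + range - 1 };

	return r;
}

/* First address after an inclusive range, i.e. the exclusive end. */
static uint64_t exclusive_end(struct incl_range r)
{
	return r.last + 1;
}
```

drm_gpuva_insert_state() in this patch computes last the same way
(addr + range - 1), so a lookup at the exclusive end of one mapping
lands on the first address of whatever follows it.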

> + mas_lock(&mas);
> + va = mas_walk(&mas);
> + mas_unlock(&mas);
> +
> + return va;
> +}
> +EXPORT_SYMBOL(drm_gpuva_find_next);
> +
> +/**
> + * drm_gpuva_interval_empty - indicate whether a given interval of the VA space
> + * is empty
> + * @mgr: the &drm_gpuva_manager to check the range for
> + * @addr: the start address of the range
> + * @range: the range of the interval
> + *
> + * Returns: true if the interval is empty, false otherwise
> + */
> +bool
> +drm_gpuva_interval_empty(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
> +{
> + DRM_GPUVA_ITER(it, mgr, addr);
> + struct drm_gpuva *va;
> +
> + drm_gpuva_iter_for_each_range(va, it, addr + range)
> + return false;
> +
> + return true;
> +}
> +EXPORT_SYMBOL(drm_gpuva_interval_empty);
> +
> +/**
> + * drm_gpuva_map - helper to insert a &drm_gpuva from &drm_gpuva_fn_ops
> + * callbacks
> + *
> + * @mgr: the &drm_gpuva_manager
> + * @pa: the &drm_gpuva_prealloc
> + * @va: the &drm_gpuva to insert
> + */
> +int
> +drm_gpuva_map(struct drm_gpuva_manager *mgr,
> + struct drm_gpuva_prealloc *pa,
> + struct drm_gpuva *va)
> +{
> + return drm_gpuva_insert_prealloc(mgr, pa, va);
> +}
> +EXPORT_SYMBOL(drm_gpuva_map);
> +
> +/**
> + * drm_gpuva_remap - helper to insert a &drm_gpuva from &drm_gpuva_fn_ops
> + * callbacks
> + *
> + * @state: the current &drm_gpuva_state
> + * @prev: the &drm_gpuva to remap when keeping the start of a mapping,
> + * may be NULL
> + * @next: the &drm_gpuva to remap when keeping the end of a mapping,
> + * may be NULL
> + */
> +int
> +drm_gpuva_remap(drm_gpuva_state_t state,
> + struct drm_gpuva *prev,
> + struct drm_gpuva *next)
> +{
> + struct ma_state *mas = &state->mas;
> + u64 max = mas->last;
> +
> + if (unlikely(!prev && !next))
> + return -EINVAL;
> +
> + if (prev) {
> + u64 addr = prev->va.addr;
> + u64 last = addr + prev->va.range - 1;
> +
> + if (unlikely(addr != mas->index))
> + return -EINVAL;
> +
> + if (unlikely(last >= mas->last))
> + return -EINVAL;
> + }
> +
> + if (next) {
> + u64 addr = next->va.addr;
> + u64 last = addr + next->va.range - 1;
> +
> + if (unlikely(last != mas->last))
> + return -EINVAL;
> +
> + if (unlikely(addr <= mas->index))
> + return -EINVAL;
> + }
> +
> + if (prev && next) {
> + u64 p_last = prev->va.addr + prev->va.range - 1;
> + u64 n_addr = next->va.addr;
> +
> + if (unlikely(p_last > n_addr))
> + return -EINVAL;
> +
> + if (unlikely(n_addr - p_last <= 1))
> + return -EINVAL;
> + }
> +
> + mas_lock(mas);
> + if (prev) {
> + mas_store(mas, prev);
> + mas_next(mas, max);

This will advance to the next entry, is that what you want to do? I
think you want mas_next_range(), which is in a recent patch set [2]. I
believe what you have here is a large range which is NULL, and you are
inserting something at the start, at the end, or both?

[2] https://lore.kernel.org/lkml/[email protected]/
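
For reference, the layout such a remap has to leave behind can be
sketched in user space as follows. This only models the range
arithmetic (kept head, hole, kept tail) on inclusive ranges. It does
not touch the maple tree, and all names in it are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct span {
	uint64_t index;	/* first address, inclusive */
	uint64_t last;	/* last address, inclusive */
	bool used;	/* false if this piece does not exist */
};

struct remap_layout {
	struct span prev;	/* kept head of the old mapping */
	struct span hole;	/* part that becomes NULL */
	struct span next;	/* kept tail of the old mapping */
};

/*
 * Split the inclusive range [index, last] of the old mapping into an
 * optional head of prev_len bytes, an optional tail of next_len bytes
 * and the hole in between. Assumes prev_len + next_len is smaller
 * than the size of the old mapping, so the hole is never empty.
 */
static struct remap_layout compute_remap_layout(uint64_t index, uint64_t last,
						uint64_t prev_len,
						uint64_t next_len)
{
	struct remap_layout l = { 0 };
	uint64_t hole_start = index + prev_len;
	uint64_t hole_last = last - next_len;

	if (prev_len)
		l.prev = (struct span){ index, hole_start - 1, true };
	if (next_len)
		l.next = (struct span){ hole_last + 1, last, true };
	l.hole = (struct span){ hole_start, hole_last, true };

	return l;
}
```

Each of the up to three pieces is a distinct range in the tree, which
is why stepping by range rather than by entry matters after the first
store.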

> + if (!next)
> + mas_store(mas, NULL);
> + }
> +
> + if (next) {
> + mas->last = next->va.addr - 1;
> + mas_store(mas, NULL);
> + mas_next(mas, max);
> + mas_store(mas, next);
> + }
> + mas_unlock(mas);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(drm_gpuva_remap);
> +
> +/**
> + * drm_gpuva_unmap - helper to remove a &drm_gpuva from &drm_gpuva_fn_ops
> + * callbacks
> + *
> + * @state: the current &drm_gpuva_state
> + *
> + * The entry associated with the current state is removed.
> + */
> +void
> +drm_gpuva_unmap(drm_gpuva_state_t state)
> +{
> + drm_gpuva_iter_remove(state);
> +}
> +EXPORT_SYMBOL(drm_gpuva_unmap);
> +
> +static int
> +op_map_cb(const struct drm_gpuva_fn_ops *fn, void *priv,
> + u64 addr, u64 range,
> + struct drm_gem_object *obj, u64 offset)
> +{
> + struct drm_gpuva_op op = {};
> +
> + op.op = DRM_GPUVA_OP_MAP;
> + op.map.va.addr = addr;
> + op.map.va.range = range;
> + op.map.gem.obj = obj;
> + op.map.gem.offset = offset;
> +
> + return fn->sm_step_map(&op, priv);
> +}
> +
> +static int
> +op_remap_cb(const struct drm_gpuva_fn_ops *fn,
> + drm_gpuva_state_t state, void *priv,
> + struct drm_gpuva_op_map *prev,
> + struct drm_gpuva_op_map *next,
> + struct drm_gpuva_op_unmap *unmap)
> +{
> + struct drm_gpuva_op op = {};
> + struct drm_gpuva_op_remap *r;
> +
> + op.op = DRM_GPUVA_OP_REMAP;
> + r = &op.remap;
> + r->prev = prev;
> + r->next = next;
> + r->unmap = unmap;
> +
> + return fn->sm_step_remap(&op, state, priv);
> +}
> +
> +static int
> +op_unmap_cb(const struct drm_gpuva_fn_ops *fn,
> + drm_gpuva_state_t state, void *priv,
> + struct drm_gpuva *va, bool merge)
> +{
> + struct drm_gpuva_op op = {};
> +
> + op.op = DRM_GPUVA_OP_UNMAP;
> + op.unmap.va = va;
> + op.unmap.keep = merge;
> +
> + return fn->sm_step_unmap(&op, state, priv);
> +}
> +
> +static int
> +__drm_gpuva_sm_map(struct drm_gpuva_manager *mgr,
> + const struct drm_gpuva_fn_ops *ops, void *priv,
> + u64 req_addr, u64 req_range,
> + struct drm_gem_object *req_obj, u64 req_offset)
> +{
> + DRM_GPUVA_ITER(it, mgr, req_addr);
> + struct drm_gpuva *va, *prev = NULL;
> + u64 req_end = req_addr + req_range;
> + int ret;
> +
> + if (unlikely(!drm_gpuva_in_mm_range(mgr, req_addr, req_range)))
> + return -EINVAL;
> +
> + if (unlikely(drm_gpuva_in_kernel_node(mgr, req_addr, req_range)))
> + return -EINVAL;
> +
> + drm_gpuva_iter_for_each_range(va, it, req_end) {
> + struct drm_gem_object *obj = va->gem.obj;
> + u64 offset = va->gem.offset;
> + u64 addr = va->va.addr;
> + u64 range = va->va.range;
> + u64 end = addr + range;
> + bool merge = !!va->gem.obj;
> +
> + if (addr == req_addr) {
> + merge &= obj == req_obj &&
> + offset == req_offset;
> +
> + if (end == req_end) {
> + ret = op_unmap_cb(ops, &it, priv, va, merge);
> + if (ret)
> + return ret;
> + break;
> + }
> +
> + if (end < req_end) {
> + ret = op_unmap_cb(ops, &it, priv, va, merge);
> + if (ret)
> + return ret;
> + goto next;
> + }
> +
> + if (end > req_end) {
> + struct drm_gpuva_op_map n = {
> + .va.addr = req_end,
> + .va.range = range - req_range,
> + .gem.obj = obj,
> + .gem.offset = offset + req_range,
> + };
> + struct drm_gpuva_op_unmap u = {
> + .va = va,
> + .keep = merge,
> + };
> +
> + ret = op_remap_cb(ops, &it, priv, NULL, &n, &u);
> + if (ret)
> + return ret;
> + break;
> + }
> + } else if (addr < req_addr) {
> + u64 ls_range = req_addr - addr;
> + struct drm_gpuva_op_map p = {
> + .va.addr = addr,
> + .va.range = ls_range,
> + .gem.obj = obj,
> + .gem.offset = offset,
> + };
> + struct drm_gpuva_op_unmap u = { .va = va };
> +
> + merge &= obj == req_obj &&
> + offset + ls_range == req_offset;
> + u.keep = merge;
> +
> + if (end == req_end) {
> + ret = op_remap_cb(ops, &it, priv, &p, NULL, &u);
> + if (ret)
> + return ret;
> + break;
> + }
> +
> + if (end < req_end) {
> + ret = op_remap_cb(ops, &it, priv, &p, NULL, &u);
> + if (ret)
> + return ret;
> + goto next;
> + }
> +
> + if (end > req_end) {
> + struct drm_gpuva_op_map n = {
> + .va.addr = req_end,
> + .va.range = end - req_end,
> + .gem.obj = obj,
> + .gem.offset = offset + ls_range +
> + req_range,
> + };
> +
> + ret = op_remap_cb(ops, &it, priv, &p, &n, &u);
> + if (ret)
> + return ret;
> + break;
> + }
> + } else if (addr > req_addr) {
> + merge &= obj == req_obj &&
> + offset == req_offset +
> + (addr - req_addr);
> +
> + if (end == req_end) {
> + ret = op_unmap_cb(ops, &it, priv, va, merge);
> + if (ret)
> + return ret;
> + break;
> + }
> +
> + if (end < req_end) {
> + ret = op_unmap_cb(ops, &it, priv, va, merge);
> + if (ret)
> + return ret;
> + goto next;
> + }
> +
> + if (end > req_end) {
> + struct drm_gpuva_op_map n = {
> + .va.addr = req_end,
> + .va.range = end - req_end,
> + .gem.obj = obj,
> + .gem.offset = offset + req_end - addr,
> + };
> + struct drm_gpuva_op_unmap u = {
> + .va = va,
> + .keep = merge,
> + };
> +
> + ret = op_remap_cb(ops, &it, priv, NULL, &n, &u);
> + if (ret)
> + return ret;
> + break;
> + }
> + }
> +next:
> + prev = va;
> + }
> +
> + return op_map_cb(ops, priv,
> + req_addr, req_range,
> + req_obj, req_offset);
> +}
> +
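
The three merge conditions in the loop above (addr == req_addr,
addr < req_addr, addr > req_addr) appear to reduce to one check: same
GEM object, and the same distance between VA and GEM offset for the
existing mapping and the request. A minimal user-space sketch of that
reading, with a hypothetical helper name and a void pointer standing in
for struct drm_gem_object *:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if an existing mapping (obj, addr, offset) maps the same backing
 * memory at the same addresses as the request would, so it could be
 * kept instead of being torn down. A NULL obj never qualifies, matching
 * the "merge = !!va->gem.obj" initialisation in the quoted code.
 */
static bool can_keep(const void *obj, uint64_t addr, uint64_t offset,
		     const void *req_obj, uint64_t req_addr,
		     uint64_t req_offset)
{
	/*
	 * Unsigned arithmetic wraps modulo 2^64, so comparing the two
	 * differences is valid for any ordering of addr and req_addr.
	 */
	return obj && obj == req_obj &&
	       offset - addr == req_offset - req_addr;
}
```

For example, a mapping at VA 0x1000 with offset 0 and a request at VA
0x2000 with offset 0x1000 on the same object satisfy the check, which
matches the "offset + ls_range == req_offset" branch.
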
> +static int
> +__drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr,
> + const struct drm_gpuva_fn_ops *ops, void *priv,
> + u64 req_addr, u64 req_range)
> +{
> + DRM_GPUVA_ITER(it, mgr, req_addr);
> + struct drm_gpuva *va;
> + u64 req_end = req_addr + req_range;
> + int ret;
> +
> + if (unlikely(drm_gpuva_in_kernel_node(mgr, req_addr, req_range)))
> + return -EINVAL;
> +
> + drm_gpuva_iter_for_each_range(va, it, req_end) {
> + struct drm_gpuva_op_map prev = {}, next = {};
> + bool prev_split = false, next_split = false;
> + struct drm_gem_object *obj = va->gem.obj;
> + u64 offset = va->gem.offset;
> + u64 addr = va->va.addr;
> + u64 range = va->va.range;
> + u64 end = addr + range;
> +
> + if (addr < req_addr) {
> + prev.va.addr = addr;
> + prev.va.range = req_addr - addr;
> + prev.gem.obj = obj;
> + prev.gem.offset = offset;
> +
> + prev_split = true;
> + }
> +
> + if (end > req_end) {
> + next.va.addr = req_end;
> + next.va.range = end - req_end;
> + next.gem.obj = obj;
> + next.gem.offset = offset + (req_end - addr);
> +
> + next_split = true;
> + }
> +
> + if (prev_split || next_split) {
> + struct drm_gpuva_op_unmap unmap = { .va = va };
> +
> + ret = op_remap_cb(ops, &it, priv,
> + prev_split ? &prev : NULL,
> + next_split ? &next : NULL,
> + &unmap);
> + if (ret)
> + return ret;
> + } else {
> + ret = op_unmap_cb(ops, &it, priv, va, false);
> + if (ret)
> + return ret;
> + }
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * drm_gpuva_sm_map - creates the &drm_gpuva_op split/merge steps
> + * @mgr: the &drm_gpuva_manager representing the GPU VA space
> + * @req_addr: the start address of the new mapping
> + * @req_range: the range of the new mapping
> + * @req_obj: the &drm_gem_object to map
> + * @req_offset: the offset within the &drm_gem_object
> + * @priv: pointer to a driver private data structure
> + *
> + * This function iterates the given range of the GPU VA space. It utilizes the
> + * &drm_gpuva_fn_ops to call back into the driver providing the split and merge
> + * steps.
> + *
> + * Drivers may use these callbacks to update the GPU VA space right away within
> + * the callback. In case the driver decides to copy and store the operations for
> + * later processing neither this function nor &drm_gpuva_sm_unmap is allowed to
> + * be called before the &drm_gpuva_manager's view of the GPU VA space was
> + * updated with the previous set of operations. To update the
> + * &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
> + * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
> + * used.
> + *
> + * A sequence of callbacks can contain map, unmap and remap operations, but
> + * the sequence of callbacks might also be empty if no operation is required,
> + * e.g. if the requested mapping already exists in the exact same way.
> + *
> + * There can be an arbitrary amount of unmap operations, a maximum of two remap
> + * operations and a single map operation. The latter one represents the original
> + * map operation requested by the caller.
> + *
> + * Returns: 0 on success or a negative error code
> + */
> +int
> +drm_gpuva_sm_map(struct drm_gpuva_manager *mgr, void *priv,
> + u64 req_addr, u64 req_range,
> + struct drm_gem_object *req_obj, u64 req_offset)
> +{
> + const struct drm_gpuva_fn_ops *ops = mgr->ops;
> +
> + if (unlikely(!(ops && ops->sm_step_map &&
> + ops->sm_step_remap &&
> + ops->sm_step_unmap)))
> + return -EINVAL;
> +
> + return __drm_gpuva_sm_map(mgr, ops, priv,
> + req_addr, req_range,
> + req_obj, req_offset);
> +}
> +EXPORT_SYMBOL(drm_gpuva_sm_map);
> +
> +/**
> + * drm_gpuva_sm_unmap - creates the &drm_gpuva_ops to split on unmap
> + * @mgr: the &drm_gpuva_manager representing the GPU VA space
> + * @priv: pointer to a driver private data structure
> + * @req_addr: the start address of the range to unmap
> + * @req_range: the range of the mappings to unmap
> + *
> + * This function iterates the given range of the GPU VA space. It utilizes the
> + * &drm_gpuva_fn_ops to call back into the driver providing the operations to
> + * unmap and, if required, split existent mappings.
> + *
> + * Drivers may use these callbacks to update the GPU VA space right away within
> + * the callback. In case the driver decides to copy and store the operations for
> + * later processing neither this function nor &drm_gpuva_sm_map is allowed to be
> + * called before the &drm_gpuva_manager's view of the GPU VA space was updated
> + * with the previous set of operations. To update the &drm_gpuva_manager's view
> + * of the GPU VA space drm_gpuva_insert(), drm_gpuva_destroy_locked() and/or
> + * drm_gpuva_destroy_unlocked() should be used.
> + *
> + * A sequence of callbacks can contain unmap and remap operations, depending on
> + * whether there are actual overlapping mappings to split.
> + *
> + * There can be an arbitrary amount of unmap operations and a maximum of two
> + * remap operations.
> + *
> + * Returns: 0 on success or a negative error code
> + */
> +int
> +drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr, void *priv,
> + u64 req_addr, u64 req_range)
> +{
> + const struct drm_gpuva_fn_ops *ops = mgr->ops;
> +
> + if (unlikely(!(ops && ops->sm_step_remap &&
> + ops->sm_step_unmap)))
> + return -EINVAL;
> +
> + return __drm_gpuva_sm_unmap(mgr, ops, priv,
> + req_addr, req_range);
> +}
> +EXPORT_SYMBOL(drm_gpuva_sm_unmap);
> +
> +static struct drm_gpuva_op *
> +gpuva_op_alloc(struct drm_gpuva_manager *mgr)
> +{
> + const struct drm_gpuva_fn_ops *fn = mgr->ops;
> + struct drm_gpuva_op *op;
> +
> + if (fn && fn->op_alloc)
> + op = fn->op_alloc();
> + else
> + op = kzalloc(sizeof(*op), GFP_KERNEL);
> +
> + if (unlikely(!op))
> + return NULL;
> +
> + return op;
> +}
> +
> +static void
> +gpuva_op_free(struct drm_gpuva_manager *mgr,
> + struct drm_gpuva_op *op)
> +{
> + const struct drm_gpuva_fn_ops *fn = mgr->ops;
> +
> + if (fn && fn->op_free)
> + fn->op_free(op);
> + else
> + kfree(op);
> +}
> +
> +static int
> +drm_gpuva_sm_step(struct drm_gpuva_op *__op,
> + drm_gpuva_state_t state,
> + void *priv)
> +{
> + struct {
> + struct drm_gpuva_manager *mgr;
> + struct drm_gpuva_ops *ops;
> + } *args = priv;
> + struct drm_gpuva_manager *mgr = args->mgr;
> + struct drm_gpuva_ops *ops = args->ops;
> + struct drm_gpuva_op *op;
> +
> + op = gpuva_op_alloc(mgr);
> + if (unlikely(!op))
> + goto err;
> +
> + memcpy(op, __op, sizeof(*op));
> +
> + if (op->op == DRM_GPUVA_OP_REMAP) {
> + struct drm_gpuva_op_remap *__r = &__op->remap;
> + struct drm_gpuva_op_remap *r = &op->remap;
> +
> + r->unmap = kmemdup(__r->unmap, sizeof(*r->unmap),
> + GFP_KERNEL);
> + if (unlikely(!r->unmap))
> + goto err_free_op;
> +
> + if (__r->prev) {
> + r->prev = kmemdup(__r->prev, sizeof(*r->prev),
> + GFP_KERNEL);
> + if (unlikely(!r->prev))
> + goto err_free_unmap;
> + }
> +
> + if (__r->next) {
> + r->next = kmemdup(__r->next, sizeof(*r->next),
> + GFP_KERNEL);
> + if (unlikely(!r->next))
> + goto err_free_prev;
> + }
> + }
> +
> + list_add_tail(&op->entry, &ops->list);
> +
> + return 0;
> +
> +err_free_unmap:
> + kfree(op->remap.unmap);
> +err_free_prev:
> + kfree(op->remap.prev);
> +err_free_op:
> + gpuva_op_free(mgr, op);
> +err:
> + return -ENOMEM;
> +}
> +
> +static int
> +drm_gpuva_sm_step_map(struct drm_gpuva_op *__op, void *priv)
> +{
> + return drm_gpuva_sm_step(__op, NULL, priv);
> +}
> +
> +static const struct drm_gpuva_fn_ops gpuva_list_ops = {
> + .sm_step_map = drm_gpuva_sm_step_map,
> + .sm_step_remap = drm_gpuva_sm_step,
> + .sm_step_unmap = drm_gpuva_sm_step,
> +};
> +
> +/**
> + * drm_gpuva_sm_map_ops_create - creates the &drm_gpuva_ops to split and merge
> + * @mgr: the &drm_gpuva_manager representing the GPU VA space
> + * @req_addr: the start address of the new mapping
> + * @req_range: the range of the new mapping
> + * @req_obj: the &drm_gem_object to map
> + * @req_offset: the offset within the &drm_gem_object
> + *
> + * This function creates a list of operations to perform splitting and merging
> + * of existent mapping(s) with the newly requested one.
> + *
> + * The list can be iterated with &drm_gpuva_for_each_op and must be processed
> + * in the given order. It can contain map, unmap and remap operations, but it
> + * also can be empty if no operation is required, e.g. if the requested mapping
> + * already exists in the exact same way.
> + *
> + * There can be an arbitrary amount of unmap operations, a maximum of two remap
> + * operations and a single map operation. The latter one represents the original
> + * map operation requested by the caller.
> + *
> + * Note that before calling this function again with another mapping request it
> + * is necessary to update the &drm_gpuva_manager's view of the GPU VA space. The
> + * previously obtained operations must be either processed or abandoned. To
> + * update the &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
> + * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
> + * used.
> + *
> + * After the caller finished processing the returned &drm_gpuva_ops, they must
> + * be freed with &drm_gpuva_ops_free.
> + *
> + * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
> + */
> +struct drm_gpuva_ops *
> +drm_gpuva_sm_map_ops_create(struct drm_gpuva_manager *mgr,
> + u64 req_addr, u64 req_range,
> + struct drm_gem_object *req_obj, u64 req_offset)
> +{
> + struct drm_gpuva_ops *ops;
> + struct {
> + struct drm_gpuva_manager *mgr;
> + struct drm_gpuva_ops *ops;
> + } args;
> + int ret;
> +
> + ops = kzalloc(sizeof(*ops), GFP_KERNEL);
> + if (unlikely(!ops))
> + return ERR_PTR(-ENOMEM);
> +
> + INIT_LIST_HEAD(&ops->list);
> +
> + args.mgr = mgr;
> + args.ops = ops;
> +
> + ret = __drm_gpuva_sm_map(mgr, &gpuva_list_ops, &args,
> + req_addr, req_range,
> + req_obj, req_offset);
> + if (ret)
> + goto err_free_ops;
> +
> + return ops;
> +
> +err_free_ops:
> + drm_gpuva_ops_free(mgr, ops);
> + return ERR_PTR(ret);
> +}
> +EXPORT_SYMBOL(drm_gpuva_sm_map_ops_create);
> +
> +/**
> + * drm_gpuva_sm_unmap_ops_create - creates the &drm_gpuva_ops to split on unmap
> + * @mgr: the &drm_gpuva_manager representing the GPU VA space
> + * @req_addr: the start address of the range to unmap
> + * @req_range: the range of the mappings to unmap
> + *
> + * This function creates a list of operations to perform unmapping and, if
> + * required, splitting of the mappings overlapping the unmap range.
> + *
> + * The list can be iterated with &drm_gpuva_for_each_op and must be processed
> + * in the given order. It can contain unmap and remap operations, depending on
> + * whether there are actual overlapping mappings to split.
> + *
> + * There can be an arbitrary amount of unmap operations and a maximum of two
> + * remap operations.
> + *
> + * Note that before calling this function again with another range to unmap it
> + * is necessary to update the &drm_gpuva_manager's view of the GPU VA space. The
> + * previously obtained operations must be processed or abandoned. To update the
> + * &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
> + * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
> + * used.
> + *
> + * After the caller finished processing the returned &drm_gpuva_ops, they must
> + * be freed with &drm_gpuva_ops_free.
> + *
> + * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
> + */
> +struct drm_gpuva_ops *
> +drm_gpuva_sm_unmap_ops_create(struct drm_gpuva_manager *mgr,
> + u64 req_addr, u64 req_range)
> +{
> + struct drm_gpuva_ops *ops;
> + struct {
> + struct drm_gpuva_manager *mgr;
> + struct drm_gpuva_ops *ops;
> + } args;
> + int ret;
> +
> + ops = kzalloc(sizeof(*ops), GFP_KERNEL);
> + if (unlikely(!ops))
> + return ERR_PTR(-ENOMEM);
> +
> + INIT_LIST_HEAD(&ops->list);
> +
> + args.mgr = mgr;
> + args.ops = ops;
> +
> + ret = __drm_gpuva_sm_unmap(mgr, &gpuva_list_ops, &args,
> + req_addr, req_range);
> + if (ret)
> + goto err_free_ops;
> +
> + return ops;
> +
> +err_free_ops:
> + drm_gpuva_ops_free(mgr, ops);
> + return ERR_PTR(ret);
> +}
> +EXPORT_SYMBOL(drm_gpuva_sm_unmap_ops_create);
> +
> +/**
> + * drm_gpuva_prefetch_ops_create - creates the &drm_gpuva_ops to prefetch
> + * @mgr: the &drm_gpuva_manager representing the GPU VA space
> + * @addr: the start address of the range to prefetch
> + * @range: the range of the mappings to prefetch
> + *
> + * This function creates a list of operations to perform prefetching.
> + *
> + * The list can be iterated with &drm_gpuva_for_each_op and must be processed
> + * in the given order. It can contain prefetch operations.
> + *
> + * There can be an arbitrary amount of prefetch operations.
> + *
> + * After the caller finished processing the returned &drm_gpuva_ops, they must
> + * be freed with &drm_gpuva_ops_free.
> + *
> + * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
> + */
> +struct drm_gpuva_ops *
> +drm_gpuva_prefetch_ops_create(struct drm_gpuva_manager *mgr,
> + u64 addr, u64 range)
> +{
> + DRM_GPUVA_ITER(it, mgr, addr);
> + struct drm_gpuva_ops *ops;
> + struct drm_gpuva_op *op;
> + struct drm_gpuva *va;
> + int ret;
> +
> + ops = kzalloc(sizeof(*ops), GFP_KERNEL);
> + if (!ops)
> + return ERR_PTR(-ENOMEM);
> +
> + INIT_LIST_HEAD(&ops->list);
> +
> + drm_gpuva_iter_for_each_range(va, it, addr + range) {
> + op = gpuva_op_alloc(mgr);
> + if (!op) {
> + ret = -ENOMEM;
> + goto err_free_ops;
> + }
> +
> + op->op = DRM_GPUVA_OP_PREFETCH;
> + op->prefetch.va = va;
> + list_add_tail(&op->entry, &ops->list);
> + }
> +
> + return ops;
> +
> +err_free_ops:
> + drm_gpuva_ops_free(mgr, ops);
> + return ERR_PTR(ret);
> +}
> +EXPORT_SYMBOL(drm_gpuva_prefetch_ops_create);
> +
> +/**
> + * drm_gpuva_gem_unmap_ops_create - creates the &drm_gpuva_ops to unmap a GEM
> + * @mgr: the &drm_gpuva_manager representing the GPU VA space
> + * @obj: the &drm_gem_object to unmap
> + *
> + * This function creates a list of operations to perform unmapping for every
> + * GPUVA attached to a GEM.
> + *
> + * The list can be iterated with &drm_gpuva_for_each_op and consists of an
> + * arbitrary number of unmap operations.
> + *
> + * After the caller finished processing the returned &drm_gpuva_ops, they must
> + * be freed with &drm_gpuva_ops_free.
> + *
> + * It is the caller's responsibility to protect the GEM's GPUVA list against
> + * concurrent access.
> + *
> + * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
> + */
> +struct drm_gpuva_ops *
> +drm_gpuva_gem_unmap_ops_create(struct drm_gpuva_manager *mgr,
> + struct drm_gem_object *obj)
> +{
> + struct drm_gpuva_ops *ops;
> + struct drm_gpuva_op *op;
> + struct drm_gpuva *va;
> + int ret;
> +
> + ops = kzalloc(sizeof(*ops), GFP_KERNEL);
> + if (!ops)
> + return ERR_PTR(-ENOMEM);
> +
> + INIT_LIST_HEAD(&ops->list);
> +
> + drm_gem_for_each_gpuva(va, obj) {
> + op = gpuva_op_alloc(mgr);
> + if (!op) {
> + ret = -ENOMEM;
> + goto err_free_ops;
> + }
> +
> + op->op = DRM_GPUVA_OP_UNMAP;
> + op->unmap.va = va;
> + list_add_tail(&op->entry, &ops->list);
> + }
> +
> + return ops;
> +
> +err_free_ops:
> + drm_gpuva_ops_free(mgr, ops);
> + return ERR_PTR(ret);
> +}
> +EXPORT_SYMBOL(drm_gpuva_gem_unmap_ops_create);
> +
> +
> +/**
> + * drm_gpuva_ops_free - free the given &drm_gpuva_ops
> + * @mgr: the &drm_gpuva_manager the ops were created for
> + * @ops: the &drm_gpuva_ops to free
> + *
> + * Frees the given &drm_gpuva_ops structure including all the ops associated
> + * with it.
> + */
> +void
> +drm_gpuva_ops_free(struct drm_gpuva_manager *mgr,
> + struct drm_gpuva_ops *ops)
> +{
> + struct drm_gpuva_op *op, *next;
> +
> + drm_gpuva_for_each_op_safe(op, next, ops) {
> + list_del(&op->entry);
> +
> + if (op->op == DRM_GPUVA_OP_REMAP) {
> + kfree(op->remap.prev);
> + kfree(op->remap.next);
> + kfree(op->remap.unmap);
> + }
> +
> + gpuva_op_free(mgr, op);
> + }
> +
> + kfree(ops);
> +}
> +EXPORT_SYMBOL(drm_gpuva_ops_free);
> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> index b419c59c4bef..b6e22f66c3fd 100644
> --- a/include/drm/drm_drv.h
> +++ b/include/drm/drm_drv.h
> @@ -104,6 +104,12 @@ enum drm_driver_feature {
> * acceleration should be handled by two drivers that are connected using auxiliary bus.
> */
> DRIVER_COMPUTE_ACCEL = BIT(7),
> + /**
> + * @DRIVER_GEM_GPUVA:
> + *
> + * Driver supports user defined GPU VA bindings for GEM objects.
> + */
> + DRIVER_GEM_GPUVA = BIT(8),
>
> /* IMPORTANT: Below are all the legacy flags, add new ones above. */
>
> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> index b8efd836edef..f2782f55b7e7 100644
> --- a/include/drm/drm_gem.h
> +++ b/include/drm/drm_gem.h
> @@ -36,6 +36,8 @@
>
> #include <linux/kref.h>
> #include <linux/dma-resv.h>
> +#include <linux/list.h>
> +#include <linux/mutex.h>
>
> #include <drm/drm_vma_manager.h>
>
> @@ -347,6 +349,17 @@ struct drm_gem_object {
> */
> struct dma_resv _resv;
>
> + /**
> + * @gpuva:
> + *
> + * Provides the list and list mutex of GPU VAs attached to this
> + * GEM object.
> + */
> + struct {
> + struct list_head list;
> + struct mutex mutex;
> + } gpuva;
> +
> /**
> * @funcs:
> *
> @@ -494,4 +507,66 @@ unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru,
>
> int drm_gem_evict(struct drm_gem_object *obj);
>
> +/**
> + * drm_gem_gpuva_init - initialize the gpuva list of a GEM object
> + * @obj: the &drm_gem_object
> + *
> + * This initializes the &drm_gem_object's &drm_gpuva list and the mutex
> + * protecting it.
> + *
> + * Calling this function is only necessary for drivers intending to support the
> + * &drm_driver_feature DRIVER_GEM_GPUVA.
> + */
> +static inline void drm_gem_gpuva_init(struct drm_gem_object *obj)
> +{
> + INIT_LIST_HEAD(&obj->gpuva.list);
> + mutex_init(&obj->gpuva.mutex);
> +}
> +
> +/**
> + * drm_gem_gpuva_lock - lock the GEM's gpuva list mutex
> + * @obj: the &drm_gem_object
> + *
> + * This locks the mutex protecting the &drm_gem_object's &drm_gpuva list.
> + */
> +static inline void drm_gem_gpuva_lock(struct drm_gem_object *obj)
> +{
> + mutex_lock(&obj->gpuva.mutex);
> +}
> +
> +/**
> + * drm_gem_gpuva_unlock - unlock the GEM's gpuva list mutex
> + * @obj: the &drm_gem_object
> + *
> + * This unlocks the mutex protecting the &drm_gem_object's &drm_gpuva list.
> + */
> +static inline void drm_gem_gpuva_unlock(struct drm_gem_object *obj)
> +{
> + mutex_unlock(&obj->gpuva.mutex);
> +}
> +
> +/**
> + * drm_gem_for_each_gpuva - iterator to walk over a list of gpuvas
> + * @entry: &drm_gpuva structure to assign to in each iteration step
> + * @obj: the &drm_gem_object the &drm_gpuvas to walk are associated with
> + *
> + * This iterator walks over all &drm_gpuva structures associated with the
> + * &drm_gem_object.
> + */
> +#define drm_gem_for_each_gpuva(entry__, obj__) \
> + list_for_each_entry(entry__, &(obj__)->gpuva.list, gem.entry)
> +
> +/**
> + * drm_gem_for_each_gpuva_safe - iterator to safely walk over a list of gpuvas
> + * @entry: &drm_gpuva structure to assign to in each iteration step
> + * @next: the next &drm_gpuva, used to store the next step
> + * @obj: the &drm_gem_object the &drm_gpuvas to walk are associated with
> + *
> + * This iterator walks over all &drm_gpuva structures associated with the
> + * &drm_gem_object. It is implemented with list_for_each_entry_safe(), hence
> + * it is safe against removal of elements.
> + */
> +#define drm_gem_for_each_gpuva_safe(entry__, next__, obj__) \
> + list_for_each_entry_safe(entry__, next__, &(obj__)->gpuva.list, gem.entry)
> +
> #endif /* __DRM_GEM_H__ */
> diff --git a/include/drm/drm_gpuva_mgr.h b/include/drm/drm_gpuva_mgr.h
> new file mode 100644
> index 000000000000..b52ac2d00d12
> --- /dev/null
> +++ b/include/drm/drm_gpuva_mgr.h
> @@ -0,0 +1,681 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef __DRM_GPUVA_MGR_H__
> +#define __DRM_GPUVA_MGR_H__
> +
> +/*
> + * Copyright (c) 2022 Red Hat.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + */
> +
> +#include <linux/maple_tree.h>
> +#include <linux/mm.h>
> +#include <linux/rbtree.h>
> +#include <linux/spinlock.h>
> +#include <linux/types.h>
> +
> +struct drm_gpuva_manager;
> +struct drm_gpuva_fn_ops;
> +struct drm_gpuva_prealloc;
> +
> +/**
> + * enum drm_gpuva_flags - flags for struct drm_gpuva
> + */
> +enum drm_gpuva_flags {
> + /**
> + * @DRM_GPUVA_EVICTED:
> + *
> + * Flag indicating that the &drm_gpuva's backing GEM is evicted.
> + */
> + DRM_GPUVA_EVICTED = (1 << 0),
> +
> + /**
> + * @DRM_GPUVA_SPARSE:
> + *
> + * Flag indicating that the &drm_gpuva is a sparse mapping.
> + */
> + DRM_GPUVA_SPARSE = (1 << 1),
> +
> + /**
> + * @DRM_GPUVA_USERBITS: user defined bits
> + */
> + DRM_GPUVA_USERBITS = (1 << 2),
> +};
> +
> +/**
> + * struct drm_gpuva - structure to track a GPU VA mapping
> + *
> + * This structure represents a GPU VA mapping and is associated with a
> + * &drm_gpuva_manager.
> + *
> + * Typically, this structure is embedded in bigger driver structures.
> + */
> +struct drm_gpuva {
> + /**
> + * @mgr: the &drm_gpuva_manager this object is associated with
> + */
> + struct drm_gpuva_manager *mgr;
> +
> + /**
> + * @flags: the &drm_gpuva_flags for this mapping
> + */
> + enum drm_gpuva_flags flags;
> +
> + /**
> + * @va: structure containing the address and range of the &drm_gpuva
> + */
> + struct {
> + /**
> + * @addr: the start address
> + */
> + u64 addr;
> +
> + /**
> + * @range: the range
> + */
> + u64 range;
> + } va;
> +
> + /**
> + * @gem: structure containing the &drm_gem_object and its offset
> + */
> + struct {
> + /**
> + * @offset: the offset within the &drm_gem_object
> + */
> + u64 offset;
> +
> + /**
> + * @obj: the mapped &drm_gem_object
> + */
> + struct drm_gem_object *obj;
> +
> + /**
> + * @entry: the &list_head to attach this object to a &drm_gem_object
> + */
> + struct list_head entry;
> + } gem;
> +};
> +
> +void drm_gpuva_link(struct drm_gpuva *va);
> +void drm_gpuva_unlink(struct drm_gpuva *va);
> +
> +int drm_gpuva_insert(struct drm_gpuva_manager *mgr,
> + struct drm_gpuva *va);
> +int drm_gpuva_insert_prealloc(struct drm_gpuva_manager *mgr,
> + struct drm_gpuva_prealloc *pa,
> + struct drm_gpuva *va);
> +void drm_gpuva_remove(struct drm_gpuva *va);
> +
> +struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager *mgr,
> + u64 addr, u64 range);
> +struct drm_gpuva *drm_gpuva_find_first(struct drm_gpuva_manager *mgr,
> + u64 addr, u64 range);
> +struct drm_gpuva *drm_gpuva_find_prev(struct drm_gpuva_manager *mgr, u64 start);
> +struct drm_gpuva *drm_gpuva_find_next(struct drm_gpuva_manager *mgr, u64 end);
> +
> +bool drm_gpuva_interval_empty(struct drm_gpuva_manager *mgr, u64 addr, u64 range);
> +
> +/**
> + * drm_gpuva_evict - sets whether the backing GEM of this &drm_gpuva is evicted
> + * @va: the &drm_gpuva to set the evict flag for
> + * @evict: indicates whether the &drm_gpuva is evicted
> + */
> +static inline void drm_gpuva_evict(struct drm_gpuva *va, bool evict)
> +{
> + if (evict)
> + va->flags |= DRM_GPUVA_EVICTED;
> + else
> + va->flags &= ~DRM_GPUVA_EVICTED;
> +}
> +
> +/**
> + * drm_gpuva_evicted - indicates whether the backing BO of this &drm_gpuva
> + * is evicted
> + * @va: the &drm_gpuva to check
> + */
> +static inline bool drm_gpuva_evicted(struct drm_gpuva *va)
> +{
> + return va->flags & DRM_GPUVA_EVICTED;
> +}
> +
> +/**
> + * struct drm_gpuva_manager - DRM GPU VA Manager
> + *
> + * The DRM GPU VA Manager keeps track of a GPU's virtual address space by using
> + * &maple_tree structures. Typically, this structure is embedded in bigger
> + * driver structures.
> + *
> + * Drivers can pass addresses and ranges in an arbitrary unit, e.g. bytes or
> + * pages.
> + *
> + * There should be one manager instance per GPU virtual address space.
> + */
> +struct drm_gpuva_manager {
> + /**
> + * @name: the name of the DRM GPU VA space
> + */
> + const char *name;
> +
> + /**
> + * @mm_start: start of the VA space
> + */
> + u64 mm_start;
> +
> + /**
> + * @mm_range: length of the VA space
> + */
> + u64 mm_range;
> +
> + /**
> + * @mtree: the &maple_tree to track GPU VA mappings
> + */
> + struct maple_tree mtree;
> +
> + /**
> + * @kernel_alloc_node:
> + *
> + * &drm_gpuva representing the address space cutout reserved for
> + * the kernel
> + */
> + struct drm_gpuva kernel_alloc_node;
> +
> + /**
> + * @ops: &drm_gpuva_fn_ops providing the split/merge steps to drivers
> + */
> + const struct drm_gpuva_fn_ops *ops;
> +};
> +
> +void drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
> + const char *name,
> + u64 start_offset, u64 range,
> + u64 reserve_offset, u64 reserve_range,
> + const struct drm_gpuva_fn_ops *ops);
> +void drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr);
> +
> +/**
> + * struct drm_gpuva_prealloc - holds a preallocated node for the
> + * &drm_gpuva_manager to insert a single new entry
> + */
> +struct drm_gpuva_prealloc {
> + /**
> + * @mas: the maple tree advanced state
> + */
> + struct ma_state mas;
> +};
> +
> +struct drm_gpuva_prealloc * drm_gpuva_prealloc_create(struct drm_gpuva_manager *mgr);
> +void drm_gpuva_prealloc_destroy(struct drm_gpuva_prealloc *pa);
> +
> +/**
> + * struct drm_gpuva_iterator - iterator for walking the internal (maple) tree
> + */
> +struct drm_gpuva_iterator {
> + /**
> + * @mas: the maple tree advanced state
> + */
> + struct ma_state mas;
> +
> + /**
> + * @mgr: the &drm_gpuva_manager to iterate
> + */
> + struct drm_gpuva_manager *mgr;
> +};
> +typedef struct drm_gpuva_iterator * drm_gpuva_state_t;
> +
> +void drm_gpuva_iter_remove(struct drm_gpuva_iterator *it);
> +int drm_gpuva_iter_va_replace(struct drm_gpuva_iterator *it,
> + struct drm_gpuva *va);
> +
> +static inline struct drm_gpuva *
> +drm_gpuva_iter_find(struct drm_gpuva_iterator *it, unsigned long max)
> +{
> + struct drm_gpuva *va;
> +
> + mas_lock(&it->mas);

This is the write lock; if you can have more than one reader, use
rcu_read_lock() and friends instead. You can also probably use mt_find()
to handle the locking here?

> + va = mas_find(&it->mas, max);
> + mas_unlock(&it->mas);
> +
> + return va;
> +}
> +
> +/**
> + * DRM_GPUVA_ITER - create an iterator structure to iterate the &drm_gpuva tree
> + * @name: the name of the &drm_gpuva_iterator to create
> + * @mgr__: the &drm_gpuva_manager to iterate
> + * @start: starting offset, the first entry will overlap this
> + */
> +#define DRM_GPUVA_ITER(name, mgr__, start) \
> + struct drm_gpuva_iterator name = { \
> + .mas = MA_STATE_INIT(&(mgr__)->mtree, start, 0), \
> + .mgr = mgr__, \
> + }
> +
> +/**
> + * drm_gpuva_iter_for_each_range - iterator to walk over a range of entries
> + * @va__: the &drm_gpuva found for the current iteration
> + * @it__: &drm_gpuva_iterator structure to assign to in each iteration step
> + * @end__: ending offset, the last entry will start before this (but may overlap)
> + *
> + * This function can be used to iterate &drm_gpuva objects.
> + *
> + * It is safe against the removal of elements using &drm_gpuva_iter_remove,
> + * however it is not safe against the removal of elements using
> + * &drm_gpuva_remove.
> + */
> +#define drm_gpuva_iter_for_each_range(va__, it__, end__) \
> + while (((va__) = drm_gpuva_iter_find(&(it__), (end__) - 1)))
> +
> +/**
> + * drm_gpuva_iter_for_each - iterator to walk over all existing entries
> + * @va__: the &drm_gpuva found for the current iteration
> + * @it__: &drm_gpuva_iterator structure to assign to in each iteration step
> + *
> + * This function can be used to iterate &drm_gpuva objects.
> + *
> + * In order to walk over all potentially existing entries, the
> + * &drm_gpuva_iterator must be initialized to start at
> + * &drm_gpuva_manager->mm_start or simply 0.
> + *
> + * It is safe against the removal of elements using &drm_gpuva_iter_remove,
> + * however it is not safe against the removal of elements using
> + * &drm_gpuva_remove.
> + */
> +#define drm_gpuva_iter_for_each(va__, it__) \
> + drm_gpuva_iter_for_each_range(va__, it__, (it__).mgr->mm_start + (it__).mgr->mm_range)
> +
> +/**
> + * enum drm_gpuva_op_type - GPU VA operation type
> + *
> + * Operations to alter the GPU VA mappings tracked by the &drm_gpuva_manager.
> + */
> +enum drm_gpuva_op_type {
> + /**
> + * @DRM_GPUVA_OP_MAP: the map op type
> + */
> + DRM_GPUVA_OP_MAP,
> +
> + /**
> + * @DRM_GPUVA_OP_REMAP: the remap op type
> + */
> + DRM_GPUVA_OP_REMAP,
> +
> + /**
> + * @DRM_GPUVA_OP_UNMAP: the unmap op type
> + */
> + DRM_GPUVA_OP_UNMAP,
> +
> + /**
> + * @DRM_GPUVA_OP_PREFETCH: the prefetch op type
> + */
> + DRM_GPUVA_OP_PREFETCH,
> +};
> +
> +/**
> + * struct drm_gpuva_op_map - GPU VA map operation
> + *
> + * This structure represents a single map operation generated by the
> + * DRM GPU VA manager.
> + */
> +struct drm_gpuva_op_map {
> + /**
> + * @va: structure containing address and range of a map
> + * operation
> + */
> + struct {
> + /**
> + * @addr: the base address of the new mapping
> + */
> + u64 addr;
> +
> + /**
> + * @range: the range of the new mapping
> + */
> + u64 range;
> + } va;
> +
> + /**
> + * @gem: structure containing the &drm_gem_object and its offset
> + */
> + struct {
> + /**
> + * @offset: the offset within the &drm_gem_object
> + */
> + u64 offset;
> +
> + /**
> + * @obj: the &drm_gem_object to map
> + */
> + struct drm_gem_object *obj;
> + } gem;
> +};
> +
> +/**
> + * struct drm_gpuva_op_unmap - GPU VA unmap operation
> + *
> + * This structure represents a single unmap operation generated by the
> + * DRM GPU VA manager.
> + */
> +struct drm_gpuva_op_unmap {
> + /**
> + * @va: the &drm_gpuva to unmap
> + */
> + struct drm_gpuva *va;
> +
> + /**
> + * @keep:
> + *
> + * Indicates whether this &drm_gpuva is physically contiguous with the
> + * original mapping request.
> + *
> + * Optionally, if &keep is set, drivers may keep the actual page table
> + * mappings for this &drm_gpuva, adding the missing page table entries
> + * only and update the &drm_gpuva_manager accordingly.
> + */
> + bool keep;
> +};
> +
> +/**
> + * struct drm_gpuva_op_remap - GPU VA remap operation
> + *
> + * This represents a single remap operation generated by the DRM GPU VA manager.
> + *
> + * A remap operation is generated when an existing GPU VA mapping is split up
> + * by inserting a new GPU VA mapping or by partially unmapping existent
> + * mapping(s), hence it consists of a maximum of two map and one unmap
> + * operation.
> + *
> + * The @unmap operation takes care of removing the original existing mapping.
> + * @prev is used to remap the preceding part, @next the subsequent part.
> + *
> + * If either a new mapping's start address is aligned with the start address
> + * of the old mapping or the new mapping's end address is aligned with the
> + * end address of the old mapping, either @prev or @next is NULL.
> + *
> + * Note, the reason for a dedicated remap operation, rather than arbitrary
> + * unmap and map operations, is to give drivers the chance of extracting driver
> + * specific data for creating the new mappings from the unmap operation's
> + * &drm_gpuva structure which typically is embedded in larger driver specific
> + * structures.
> + */
> +struct drm_gpuva_op_remap {
> + /**
> + * @prev: the preceding part of a split mapping
> + */
> + struct drm_gpuva_op_map *prev;
> +
> + /**
> + * @next: the subsequent part of a split mapping
> + */
> + struct drm_gpuva_op_map *next;
> +
> + /**
> + * @unmap: the unmap operation for the original existing mapping
> + */
> + struct drm_gpuva_op_unmap *unmap;
> +};
> +
> +/**
> + * struct drm_gpuva_op_prefetch - GPU VA prefetch operation
> + *
> + * This structure represents a single prefetch operation generated by the
> + * DRM GPU VA manager.
> + */
> +struct drm_gpuva_op_prefetch {
> + /**
> + * @va: the &drm_gpuva to prefetch
> + */
> + struct drm_gpuva *va;
> +};
> +
> +/**
> + * struct drm_gpuva_op - GPU VA operation
> + *
> + * This structure represents a single generic operation.
> + *
> + * The particular type of the operation is defined by @op.
> + */
> +struct drm_gpuva_op {
> + /**
> + * @entry:
> + *
> + * The &list_head used to distribute instances of this struct within
> + * &drm_gpuva_ops.
> + */
> + struct list_head entry;
> +
> + /**
> + * @op: the type of the operation
> + */
> + enum drm_gpuva_op_type op;
> +
> + union {
> + /**
> + * @map: the map operation
> + */
> + struct drm_gpuva_op_map map;
> +
> + /**
> + * @remap: the remap operation
> + */
> + struct drm_gpuva_op_remap remap;
> +
> + /**
> + * @unmap: the unmap operation
> + */
> + struct drm_gpuva_op_unmap unmap;
> +
> + /**
> + * @prefetch: the prefetch operation
> + */
> + struct drm_gpuva_op_prefetch prefetch;
> + };
> +};
> +
> +/**
> + * struct drm_gpuva_ops - wraps a list of &drm_gpuva_op
> + */
> +struct drm_gpuva_ops {
> + /**
> + * @list: the &list_head
> + */
> + struct list_head list;
> +};
> +
> +/**
> + * drm_gpuva_for_each_op - iterator to walk over &drm_gpuva_ops
> + * @op: &drm_gpuva_op to assign in each iteration step
> + * @ops: &drm_gpuva_ops to walk
> + *
> + * This iterator walks over all ops within a given list of operations.
> + */
> +#define drm_gpuva_for_each_op(op, ops) list_for_each_entry(op, &(ops)->list, entry)
> +
> +/**
> + * drm_gpuva_for_each_op_safe - iterator to safely walk over &drm_gpuva_ops
> + * @op: &drm_gpuva_op to assign in each iteration step
> + * @next: the next &drm_gpuva_op, used to store the next step
> + * @ops: &drm_gpuva_ops to walk
> + *
> + * This iterator walks over all ops within a given list of operations. It is
> + * implemented with list_for_each_entry_safe(), so it is safe against removal
> + * of elements.
> + */
> +#define drm_gpuva_for_each_op_safe(op, next, ops) \
> + list_for_each_entry_safe(op, next, &(ops)->list, entry)
> +
> +/**
> + * drm_gpuva_for_each_op_from_reverse - iterate backwards from the given point
> + * @op: &drm_gpuva_op to assign in each iteration step
> + * @ops: &drm_gpuva_ops to walk
> + *
> + * This iterator walks over all ops within a given list of operations beginning
> + * from the given operation in reverse order.
> + */
> +#define drm_gpuva_for_each_op_from_reverse(op, ops) \
> + list_for_each_entry_from_reverse(op, &(ops)->list, entry)
> +
> +/**
> + * drm_gpuva_first_op - returns the first &drm_gpuva_op from &drm_gpuva_ops
> + * @ops: the &drm_gpuva_ops to get the first &drm_gpuva_op from
> + */
> +#define drm_gpuva_first_op(ops) \
> + list_first_entry(&(ops)->list, struct drm_gpuva_op, entry)
> +
> +/**
> + * drm_gpuva_last_op - returns the last &drm_gpuva_op from &drm_gpuva_ops
> + * @ops: the &drm_gpuva_ops to get the last &drm_gpuva_op from
> + */
> +#define drm_gpuva_last_op(ops) \
> + list_last_entry(&(ops)->list, struct drm_gpuva_op, entry)
> +
> +/**
> + * drm_gpuva_prev_op - previous &drm_gpuva_op in the list
> + * @op: the current &drm_gpuva_op
> + */
> +#define drm_gpuva_prev_op(op) list_prev_entry(op, entry)
> +
> +/**
> + * drm_gpuva_next_op - next &drm_gpuva_op in the list
> + * @op: the current &drm_gpuva_op
> + */
> +#define drm_gpuva_next_op(op) list_next_entry(op, entry)
> +
> +struct drm_gpuva_ops *
> +drm_gpuva_sm_map_ops_create(struct drm_gpuva_manager *mgr,
> + u64 addr, u64 range,
> + struct drm_gem_object *obj, u64 offset);
> +struct drm_gpuva_ops *
> +drm_gpuva_sm_unmap_ops_create(struct drm_gpuva_manager *mgr,
> + u64 addr, u64 range);
> +
> +struct drm_gpuva_ops *
> +drm_gpuva_prefetch_ops_create(struct drm_gpuva_manager *mgr,
> + u64 addr, u64 range);
> +
> +struct drm_gpuva_ops *
> +drm_gpuva_gem_unmap_ops_create(struct drm_gpuva_manager *mgr,
> + struct drm_gem_object *obj);
> +
> +void drm_gpuva_ops_free(struct drm_gpuva_manager *mgr,
> + struct drm_gpuva_ops *ops);
> +
> +/**
> + * struct drm_gpuva_fn_ops - callbacks for split/merge steps
> + *
> + * This structure defines the callbacks used by &drm_gpuva_sm_map and
> + * &drm_gpuva_sm_unmap to provide the split/merge steps for map and unmap
> + * operations to drivers.
> + */
> +struct drm_gpuva_fn_ops {
> + /**
> + * @op_alloc: called when the &drm_gpuva_manager allocates
> + * a struct drm_gpuva_op
> + *
> + * Some drivers may want to embed struct drm_gpuva_op into driver
> + * specific structures. By implementing this callback drivers can
> + * allocate memory accordingly.
> + *
> + * This callback is optional.
> + */
> + struct drm_gpuva_op *(*op_alloc)(void);
> +
> + /**
> + * @op_free: called when the &drm_gpuva_manager frees a
> + * struct drm_gpuva_op
> + *
> + * Some drivers may want to embed struct drm_gpuva_op into driver
> + * specific structures. By implementing this callback drivers can
> + * free the previously allocated memory accordingly.
> + *
> + * This callback is optional.
> + */
> + void (*op_free)(struct drm_gpuva_op *op);
> +
> + /**
> + * @sm_step_map: called from &drm_gpuva_sm_map to finally insert the
> + * mapping once all previous steps were completed
> + *
> + * The &priv pointer matches the one the driver passed to
> + * &drm_gpuva_sm_map or &drm_gpuva_sm_unmap, respectively.
> + *
> + * Can be NULL if &drm_gpuva_sm_map is not used.
> + */
> + int (*sm_step_map)(struct drm_gpuva_op *op, void *priv);
> +
> + /**
> + * @sm_step_remap: called from &drm_gpuva_sm_map and
> + * &drm_gpuva_sm_unmap to split up an existent mapping
> + *
> + * This callback is called when an existing mapping needs to be split up.
> + * This is the case when either a newly requested mapping overlaps or
> + * is enclosed by an existent mapping or a partial unmap of an existent
> + * mapping is requested.
> + *
> + * Drivers must not modify the GPUVA space with accessors that do not
> + * take a &drm_gpuva_state as argument from this callback.
> + *
> + * The &priv pointer matches the one the driver passed to
> + * &drm_gpuva_sm_map or &drm_gpuva_sm_unmap, respectively.
> + *
> + * Can be NULL if neither &drm_gpuva_sm_map nor &drm_gpuva_sm_unmap is
> + * used.
> + */
> + int (*sm_step_remap)(struct drm_gpuva_op *op,
> + drm_gpuva_state_t state,
> + void *priv);
> +
> + /**
> + * @sm_step_unmap: called from &drm_gpuva_sm_map and
> + * &drm_gpuva_sm_unmap to unmap an existent mapping
> + *
> + * This callback is called when an existing mapping needs to be unmapped.
> + * This is the case when either a newly requested mapping encloses an
> + * existent mapping or an unmap of an existent mapping is requested.
> + *
> + * Drivers must not modify the GPUVA space with accessors that do not
> + * take a &drm_gpuva_state as argument from this callback.
> + *
> + * The &priv pointer matches the one the driver passed to
> + * &drm_gpuva_sm_map or &drm_gpuva_sm_unmap, respectively.
> + *
> + * Can be NULL if neither &drm_gpuva_sm_map nor &drm_gpuva_sm_unmap is
> + * used.
> + */
> + int (*sm_step_unmap)(struct drm_gpuva_op *op,
> + drm_gpuva_state_t state,
> + void *priv);
> +};
> +
> +int drm_gpuva_sm_map(struct drm_gpuva_manager *mgr, void *priv,
> + u64 addr, u64 range,
> + struct drm_gem_object *obj, u64 offset);
> +
> +int drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr, void *priv,
> + u64 addr, u64 range);
> +
> +int drm_gpuva_map(struct drm_gpuva_manager *mgr,
> + struct drm_gpuva_prealloc *pa,
> + struct drm_gpuva *va);
> +int drm_gpuva_remap(drm_gpuva_state_t state,
> + struct drm_gpuva *prev,
> + struct drm_gpuva *next);
> +void drm_gpuva_unmap(drm_gpuva_state_t state);
> +
> +#endif /* __DRM_GPUVA_MGR_H__ */
> --
> 2.40.1
>

2023-06-14 08:04:20

by Donald Robson

Subject: Re: [PATCH drm-next v4 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

On Tue, 2023-06-13 at 16:20 +0200, Danilo Krummrich wrote:

> I'm definitely up for improving the existing documentation. Anything in
> particular you think should be described in more detail?
>
> - Danilo

Hi Danilo,

As I said, with inexperience it's possible I missed what I was
looking for in the existing documentation, which is highly detailed
in regard to how it deals with operations, but usage was where I fell
down.

If I understand correctly, there are three ways to use this:
1) Using drm_gpuva_insert() and drm_gpuva_remove() directly using
stack va objects.
2) Using drm_gpuva_insert() and drm_gpuva_remove() in a callback
context, after having created ops lists using
drm_gpuva_sm_[un]map_ops_create().
3) Using drm_gpuva_[un]map() in callback context after having
preallocated a node and va objects for map/remap function use,
which must be forwarded in as the 'priv' argument to
drm_gpuva_sm_[un]map().

The first of these is pretty self-explanatory. The second was also
fairly easy to understand, it has an example in your own driver, and
since it takes care of allocs in drm_gpuva_sm_map_ops_create() it
leads to pretty clean code too.

The third case, which I am using in the new PowerVR driver did not
have an example of usage and the approach is quite different to 2)
in that you have to prealloc everything explicitly. I didn't realise
this, so it led to a fair amount of frustration.

I think if you're willing, it would help inexperienced implementers a
lot if there were some brief 'how to' snippets for each of the three
use cases.

Thanks,
Donald
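
For the callback-based paths in 2) and 3), the part a driver has to reason
about is the remap step: an existing mapping is split around the requested
region, leaving an optional preceding part (@prev) and an optional subsequent
part (@next). The following is only a userspace model of that arithmetic, as
described in the struct drm_gpuva_op_remap documentation. It is not kernel
code and does not call the GPUVA manager; the names va_span, remap_model and
split_for_unmap are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the addr/range pair stored in a mapping. */
struct va_span {
	uint64_t addr;
	uint64_t range;
};

/* Hypothetical model of a remap result: which parts of the old mapping stay. */
struct remap_model {
	bool has_prev;          /* models a non-NULL @prev */
	bool has_next;          /* models a non-NULL @next */
	struct va_span prev;    /* preceding part that must be mapped again */
	struct va_span next;    /* subsequent part that must be mapped again */
};

/*
 * Split @old around the requested region [req_addr, req_addr + req_range).
 * The request is assumed to overlap @old; address overflow is not handled.
 */
static struct remap_model split_for_unmap(struct va_span old,
					  uint64_t req_addr,
					  uint64_t req_range)
{
	struct remap_model r = { 0 };
	uint64_t old_end = old.addr + old.range;
	uint64_t req_end = req_addr + req_range;

	assert(req_addr < old_end && req_end > old.addr);

	/* The request starts inside the old mapping: keep the head. */
	if (req_addr > old.addr) {
		r.has_prev = true;
		r.prev.addr = old.addr;
		r.prev.range = req_addr - old.addr;
	}

	/* The request ends inside the old mapping: keep the tail. */
	if (req_end < old_end) {
		r.has_next = true;
		r.next.addr = req_end;
		r.next.range = old_end - req_end;
	}

	return r;
}
```

If neither part is kept, the request covers the old mapping completely, which
in this model corresponds to a plain unmap rather than a remap. In approach 3)
the objects backing the kept parts would be the ones preallocated by the driver
and handed in through the 'priv' argument.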

2023-06-15 14:57:32

by Danilo Krummrich

Subject: Re: [PATCH drm-next v4 03/14] drm: manager to keep track of GPUs VA mappings

On Tue, Jun 13, 2023 at 08:29:35PM -0400, Liam R. Howlett wrote:
> * Danilo Krummrich <[email protected]> [230606 18:32]:
> > Add infrastructure to keep track of GPU virtual address (VA) mappings
> > with a dedicated VA space manager implementation.
> >
> > New UAPIs, motivated by the Vulkan sparse memory bindings that graphics
> > drivers are starting to implement, allow userspace applications to request
> > multiple and arbitrary GPU VA mappings of buffer objects. The DRM GPU VA
> > manager is intended to serve the following purposes in this context.
> >
> > 1) Provide infrastructure to track GPU VA allocations and mappings,
> > making use of the maple_tree.
> >
> > 2) Generically connect GPU VA mappings to their backing buffers, in
> > particular DRM GEM objects.
> >
> > 3) Provide a common implementation to perform more complex mapping
> > operations on the GPU VA space. In particular splitting and merging
> > of GPU VA mappings, e.g. for intersecting mapping requests or partial
> > unmap requests.
> >
> > Suggested-by: Dave Airlie <[email protected]>
> > Signed-off-by: Danilo Krummrich <[email protected]>
> > ---
> > Documentation/gpu/drm-mm.rst | 31 +
> > drivers/gpu/drm/Makefile | 1 +
> > drivers/gpu/drm/drm_gem.c | 3 +
> > drivers/gpu/drm/drm_gpuva_mgr.c | 1687 +++++++++++++++++++++++++++++++
> > include/drm/drm_drv.h | 6 +
> > include/drm/drm_gem.h | 75 ++
> > include/drm/drm_gpuva_mgr.h | 681 +++++++++++++
> > 7 files changed, 2484 insertions(+)
> > create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
> > create mode 100644 include/drm/drm_gpuva_mgr.h
> >
> > diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
> > index a52e6f4117d6..c9f120cfe730 100644
> > --- a/Documentation/gpu/drm-mm.rst
> > +++ b/Documentation/gpu/drm-mm.rst
> > @@ -466,6 +466,37 @@ DRM MM Range Allocator Function References
> > .. kernel-doc:: drivers/gpu/drm/drm_mm.c
> > :export:
> >
> > +DRM GPU VA Manager
> > +==================
> > +
> > +Overview
> > +--------
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
> > + :doc: Overview
> > +
> > +Split and Merge
> > +---------------
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
> > + :doc: Split and Merge
> > +
> > +Locking
> > +-------
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
> > + :doc: Locking
> > +
> > +
> > +DRM GPU VA Manager Function References
> > +--------------------------------------
> > +
> > +.. kernel-doc:: include/drm/drm_gpuva_mgr.h
> > + :internal:
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
> > + :export:
> > +
> > DRM Buddy Allocator
> > ===================
> >
> > diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> > index 9c6446eb3c83..8eeed446a078 100644
> > --- a/drivers/gpu/drm/Makefile
> > +++ b/drivers/gpu/drm/Makefile
> > @@ -45,6 +45,7 @@ drm-y := \
> > drm_vblank.o \
> > drm_vblank_work.o \
> > drm_vma_manager.o \
> > + drm_gpuva_mgr.o \
> > drm_writeback.o
> > drm-$(CONFIG_DRM_LEGACY) += \
> > drm_agpsupport.o \
> > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> > index 1a5a2cd0d4ec..cd878ebddbd0 100644
> > --- a/drivers/gpu/drm/drm_gem.c
> > +++ b/drivers/gpu/drm/drm_gem.c
> > @@ -164,6 +164,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
> > if (!obj->resv)
> > obj->resv = &obj->_resv;
> >
> > + if (drm_core_check_feature(dev, DRIVER_GEM_GPUVA))
> > + drm_gem_gpuva_init(obj);
> > +
> > drm_vma_node_reset(&obj->vma_node);
> > INIT_LIST_HEAD(&obj->lru_node);
> > }
> > diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c
> > new file mode 100644
> > index 000000000000..dd8dd7fef14b
> > --- /dev/null
> > +++ b/drivers/gpu/drm/drm_gpuva_mgr.c
> > @@ -0,0 +1,1687 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (c) 2022 Red Hat.
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > + * OTHER DEALINGS IN THE SOFTWARE.
> > + *
> > + * Authors:
> > + * Danilo Krummrich <[email protected]>
> > + *
> > + */
> > +
> > +#include <drm/drm_gem.h>
> > +#include <drm/drm_gpuva_mgr.h>
> > +
> > +/**
> > + * DOC: Overview
> > + *
> > + * The DRM GPU VA Manager, represented by struct drm_gpuva_manager keeps track
> > + * of a GPU's virtual address (VA) space and manages the corresponding virtual
> > + * mappings represented by &drm_gpuva objects. It also keeps track of the
> > + * mapping's backing &drm_gem_object buffers.
> > + *
> > + * &drm_gem_object buffers maintain a list (and a corresponding list lock) of
> > + * &drm_gpuva objects representing all existent GPU VA mappings using this
> > + * &drm_gem_object as backing buffer.
> > + *
> > + * GPU VAs can be flagged as sparse, such that drivers may use GPU VAs to also
> > + * keep track of sparse PTEs in order to support Vulkan 'Sparse Resources'.
> > + *
> > + * The GPU VA manager internally uses a &maple_tree to manage the
> > + * &drm_gpuva mappings within a GPU's virtual address space.
> > + *
> > + * The &drm_gpuva_manager contains a special &drm_gpuva representing the
> > + * portion of VA space reserved by the kernel. This node is initialized together
> > + * with the GPU VA manager instance and removed when the GPU VA manager is
> > + * destroyed.
> > + *
> > + * In a typical application drivers would embed struct drm_gpuva_manager and
> > + * struct drm_gpuva within their own driver specific structures, there won't be
> > + * any memory allocations of its own nor memory allocations of &drm_gpuva
> > + * entries.
> > + *
> > + * However, the &drm_gpuva_manager needs to allocate nodes for its internal
> > + * tree structures when &drm_gpuva entries are inserted. In order to support
> > + * inserting &drm_gpuva entries from dma-fence signalling critical sections the
> > + * &drm_gpuva_manager provides struct drm_gpuva_prealloc. Drivers may create
> > + * pre-allocated nodes with drm_gpuva_prealloc_create() and subsequently insert
> > + * a new &drm_gpuva entry with drm_gpuva_insert_prealloc().
> > + */
> > +
> > +/**
> > + * DOC: Split and Merge
> > + *
> > + * The DRM GPU VA manager also provides an algorithm implementing splitting and
> > + * merging of existent GPU VA mappings with the ones that are requested to be
> > + * mapped or unmapped. This feature is required by the Vulkan API to implement
> > + * Vulkan 'Sparse Memory Bindings' - drivers UAPIs often refer to this as
> > + * VM BIND.
> > + *
> > + * Drivers can call drm_gpuva_sm_map() to receive a sequence of callbacks
> > + * containing map, unmap and remap operations for a given newly requested
> > + * mapping. The sequence of callbacks represents the set of operations to
> > + * execute in order to integrate the new mapping cleanly into the current state
> > + * of the GPU VA space.
> > + *
> > + * Depending on how the new GPU VA mapping intersects with the existent mappings
> > + * of the GPU VA space the &drm_gpuva_fn_ops callbacks contain an arbitrary
> > + * amount of unmap operations, a maximum of two remap operations and a single
> > + * map operation. The caller might receive no callback at all if no operation is
> > + * required, e.g. if the requested mapping already exists in the exact same way.
> > + *
> > + * The single map operation represents the original map operation requested by
> > + * the caller.
> > + *
> > + * &drm_gpuva_op_unmap contains a 'keep' field, which indicates whether the
> > + * &drm_gpuva to unmap is physically contiguous with the original mapping
> > + * request. Optionally, if 'keep' is set, drivers may keep the actual page table
> > + * entries for this &drm_gpuva, adding the missing page table entries only and
> > + * update the &drm_gpuva_manager's view of things accordingly.
> > + *
> > + * Drivers may do the same optimization, namely delta page table updates, also
> > + * for remap operations. This is possible since &drm_gpuva_op_remap consists of
> > + * one unmap operation and one or two map operations, such that drivers can
> > + * derive the page table update delta accordingly.
> > + *
> > + * Note that there can't be more than two existent mappings to split up, one at
> > + * the beginning and one at the end of the new mapping, hence there is a
> > + * maximum of two remap operations.
> > + *
> > + * Analogous to drm_gpuva_sm_map() drm_gpuva_sm_unmap() uses &drm_gpuva_fn_ops
> > + * to call back into the driver in order to unmap a range of GPU VA space. The
> > + * logic behind this function is way simpler though: For all existent mappings
> > + * enclosed by the given range unmap operations are created. For mappings which
> > + * are only partially located within the given range, remap operations are
> > + * created such that those mappings are split up and re-mapped partially.
> > + *
> > + * To update the &drm_gpuva_manager's view of the GPU VA space
> > + * drm_gpuva_insert(), drm_gpuva_insert_prealloc(), and drm_gpuva_remove() may
> > + * be used. Please note that these functions are not safe to be called from a
> > + * &drm_gpuva_fn_ops callback originating from drm_gpuva_sm_map() or
> > + * drm_gpuva_sm_unmap(). The drm_gpuva_map(), drm_gpuva_remap() and
> > + * drm_gpuva_unmap() helpers should be used instead.
> > + *
> > + * The following diagram depicts the basic relationships of existent GPU VA
> > + * mappings, a newly requested mapping and the resulting mappings as implemented
> > + * by drm_gpuva_sm_map() - it doesn't cover any arbitrary combinations of these.
> > + *
> > + * 1) Requested mapping is identical. Replace it, but indicate the backing PTEs
> > + * could be kept.
> > + *
> > + * ::
> > + *
> > + * 0 a 1
> > + * old: |-----------| (bo_offset=n)
> > + *
> > + * 0 a 1
> > + * req: |-----------| (bo_offset=n)
> > + *
> > + * 0 a 1
> > + * new: |-----------| (bo_offset=n)
> > + *
> > + *
> > + * 2) Requested mapping is identical, except for the BO offset, hence replace
> > + * the mapping.
> > + *
> > + * ::
> > + *
> > + * 0 a 1
> > + * old: |-----------| (bo_offset=n)
> > + *
> > + * 0 a 1
> > + * req: |-----------| (bo_offset=m)
> > + *
> > + * 0 a 1
> > + * new: |-----------| (bo_offset=m)
> > + *
> > + *
> > + * 3) Requested mapping is identical, except for the backing BO, hence replace
> > + * the mapping.
> > + *
> > + * ::
> > + *
> > + * 0 a 1
> > + * old: |-----------| (bo_offset=n)
> > + *
> > + * 0 b 1
> > + * req: |-----------| (bo_offset=n)
> > + *
> > + * 0 b 1
> > + * new: |-----------| (bo_offset=n)
> > + *
> > + *
> > + * 4) Existent mapping is a left aligned subset of the requested one, hence
> > + * replace the existent one.
> > + *
> > + * ::
> > + *
> > + * 0 a 1
> > + * old: |-----| (bo_offset=n)
> > + *
> > + * 0 a 2
> > + * req: |-----------| (bo_offset=n)
> > + *
> > + * 0 a 2
> > + * new: |-----------| (bo_offset=n)
> > + *
> > + * .. note::
> > + * We expect to see the same result for a request with a different BO
> > + * and/or non-contiguous BO offset.
> > + *
> > + *
> > + * 5) Requested mapping's range is a left aligned subset of the existent one,
> > + * but backed by a different BO. Hence, map the requested mapping and split
> > + * the existent one adjusting its BO offset.
> > + *
> > + * ::
> > + *
> > + * 0 a 2
> > + * old: |-----------| (bo_offset=n)
> > + *
> > + * 0 b 1
> > + * req: |-----| (bo_offset=n)
> > + *
> > + * 0 b 1 a' 2
> > + * new: |-----|-----| (b.bo_offset=n, a.bo_offset=n+1)
> > + *
> > + * .. note::
> > + * We expect to see the same result for a request with a different BO
> > + * and/or non-contiguous BO offset.
> > + *
> > + *
> > + * 6) Existent mapping is a superset of the requested mapping. Split it up, but
> > + * indicate that the backing PTEs could be kept.
> > + *
> > + * ::
> > + *
> > + * 0 a 2
> > + * old: |-----------| (bo_offset=n)
> > + *
> > + * 0 a 1
> > + * req: |-----| (bo_offset=n)
> > + *
> > + * 0 a 1 a' 2
> > + * new: |-----|-----| (a.bo_offset=n, a'.bo_offset=n+1)
> > + *
> > + *
> > + * 7) Requested mapping's range is a right aligned subset of the existent one,
> > + * but backed by a different BO. Hence, map the requested mapping and split
> > + * the existent one, without adjusting the BO offset.
> > + *
> > + * ::
> > + *
> > + * 0 a 2
> > + * old: |-----------| (bo_offset=n)
> > + *
> > + * 1 b 2
> > + * req: |-----| (bo_offset=m)
> > + *
> > + * 0 a 1 b 2
> > + * new: |-----|-----| (a.bo_offset=n,b.bo_offset=m)
> > + *
> > + *
> > + * 8) Existent mapping is a superset of the requested mapping. Split it up, but
> > + * indicate that the backing PTEs could be kept.
> > + *
> > + * ::
> > + *
> > + * 0 a 2
> > + * old: |-----------| (bo_offset=n)
> > + *
> > + * 1 a 2
> > + * req: |-----| (bo_offset=n+1)
> > + *
> > + * 0 a' 1 a 2
> > + * new: |-----|-----| (a'.bo_offset=n, a.bo_offset=n+1)
> > + *
> > + *
> > + * 9) Existent mapping is overlapped at the end by the requested mapping backed
> > + * by a different BO. Hence, map the requested mapping and split up the
> > + * existent one, without adjusting the BO offset.
> > + *
> > + * ::
> > + *
> > + * 0 a 2
> > + * old: |-----------| (bo_offset=n)
> > + *
> > + * 1 b 3
> > + * req: |-----------| (bo_offset=m)
> > + *
> > + * 0 a 1 b 3
> > + * new: |-----|-----------| (a.bo_offset=n,b.bo_offset=m)
> > + *
> > + *
> > + * 10) Existent mapping is overlapped by the requested mapping, both having the
> > + * same backing BO with a contiguous offset. Indicate the backing PTEs of
> > + * the old mapping could be kept.
> > + *
> > + * ::
> > + *
> > + * 0 a 2
> > + * old: |-----------| (bo_offset=n)
> > + *
> > + * 1 a 3
> > + * req: |-----------| (bo_offset=n+1)
> > + *
> > + * 0 a' 1 a 3
> > + * new: |-----|-----------| (a'.bo_offset=n, a.bo_offset=n+1)
> > + *
> > + *
> > + * 11) Requested mapping's range is a centered subset of the existent one
> > + * having a different backing BO. Hence, map the requested mapping and split
> > + * up the existent one in two mappings, adjusting the BO offset of the right
> > + * one accordingly.
> > + *
> > + * ::
> > + *
> > + * 0 a 3
> > + * old: |-----------------| (bo_offset=n)
> > + *
> > + * 1 b 2
> > + * req: |-----| (bo_offset=m)
> > + *
> > + * 0 a 1 b 2 a' 3
> > + * new: |-----|-----|-----| (a.bo_offset=n,b.bo_offset=m,a'.bo_offset=n+2)
> > + *
> > + *
> > + * 12) Requested mapping is a contiguous subset of the existent one. Split it
> > + * up, but indicate that the backing PTEs could be kept.
> > + *
> > + * ::
> > + *
> > + * 0 a 3
> > + * old: |-----------------| (bo_offset=n)
> > + *
> > + * 1 a 2
> > + * req: |-----| (bo_offset=n+1)
> > + *
> > + * 0 a' 1 a 2 a'' 3
> > + * new: |-----|-----|-----| (a'.bo_offset=n, a.bo_offset=n+1, a''.bo_offset=n+2)
> > + *
> > + *
> > + * 13) Existent mapping is a right aligned subset of the requested one, hence
> > + * replace the existent one.
> > + *
> > + * ::
> > + *
> > + * 1 a 2
> > + * old: |-----| (bo_offset=n+1)
> > + *
> > + * 0 a 2
> > + * req: |-----------| (bo_offset=n)
> > + *
> > + * 0 a 2
> > + * new: |-----------| (bo_offset=n)
> > + *
> > + * .. note::
> > + * We expect to see the same result for a request with a different bo
> > + * and/or non-contiguous bo_offset.
> > + *
> > + *
> > + * 14) Existent mapping is a centered subset of the requested one, hence
> > + * replace the existent one.
> > + *
> > + * ::
> > + *
> > + * 1 a 2
> > + * old: |-----| (bo_offset=n+1)
> > + *
> > + * 0 a 3
> > + * req: |----------------| (bo_offset=n)
> > + *
> > + * 0 a 3
> > + * new: |----------------| (bo_offset=n)
> > + *
> > + * .. note::
> > + * We expect to see the same result for a request with a different bo
> > + * and/or non-contiguous bo_offset.
> > + *
> > + *
> > + * 15) Existent mapping is overlapped at the beginning by the requested mapping
> > + * backed by a different BO. Hence, map the requested mapping and split up
> > + * the existent one, adjusting its BO offset accordingly.
> > + *
> > + * ::
> > + *
> > + * 1 a 3
> > + * old: |-----------| (bo_offset=n)
> > + *
> > + * 0 b 2
> > + * req: |-----------| (bo_offset=m)
> > + *
> > + * 0 b 2 a' 3
> > + * new: |-----------|-----| (b.bo_offset=m,a.bo_offset=n+2)
> > + */
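
For illustration, the offset arithmetic in cases 5), 11) and 15) above can be
sketched in plain userspace C. This is only a minimal sketch: struct va_map,
struct va_split and split_around() are hypothetical names made up for this
example and are not part of the patch. It computes the remainders of an
existent mapping that survive when a request is mapped over it, adjusting the
BO offset of the right-hand remainder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a mapping; not struct drm_gpuva. */
struct va_map {
	uint64_t addr;      /* start address of the mapping */
	uint64_t range;     /* size of the mapping */
	uint64_t bo_offset; /* offset into the backing buffer */
};

/* Remainders of one existent mapping after a request is mapped over it. */
struct va_split {
	bool has_prev, has_next;
	struct va_map prev; /* remainder left of the request */
	struct va_map next; /* remainder right of the request */
};

/*
 * Compute what is left of @old when [req_addr, req_addr + req_range) is
 * mapped over it. The caller must make sure the two ranges overlap.
 */
static struct va_split split_around(const struct va_map *old,
				    uint64_t req_addr, uint64_t req_range)
{
	uint64_t old_end = old->addr + old->range;
	uint64_t req_end = req_addr + req_range;
	struct va_split s = { 0 };

	assert(req_addr < old_end && old->addr < req_end);

	if (old->addr < req_addr) {
		/* Left remainder keeps the original BO offset. */
		s.has_prev = true;
		s.prev.addr = old->addr;
		s.prev.range = req_addr - old->addr;
		s.prev.bo_offset = old->bo_offset;
	}
	if (req_end < old_end) {
		/* Right remainder starts further into the BO. */
		s.has_next = true;
		s.next.addr = req_end;
		s.next.range = old_end - req_end;
		s.next.bo_offset = old->bo_offset + (req_end - old->addr);
	}
	return s;
}
```

With an existent mapping of [0, 3) at bo_offset n and a request for [1, 2),
this yields a left remainder [0, 1) at n and a right remainder [2, 3) at n+2,
which matches case 11).
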
> > +
> > +/**
> > + * DOC: Locking
> > + *
> > + * Generally, the GPU VA manager does not take care of locking itself; it is
> > + * the driver's responsibility to take care of locking. Drivers might want to
> > + * protect the following operations: inserting, removing and iterating
> > + * &drm_gpuva objects as well as generating all kinds of operations, such as
> > + * split / merge or prefetch.
> > + *
> > + * The GPU VA manager also does not take care of the locking of the backing
> > + * &drm_gem_object buffers' GPU VA lists by itself; drivers are responsible to
> > + * enforce mutual exclusion.
> > + */
> > +
> > + /*
> > + * Maple Tree Locking
> > + *
> > + * The maple tree's advanced API requires the user of the API to protect
> > + * certain tree operations with a lock (either the external or internal tree
> > + * lock) for tree internal reasons.
> > + *
> > + * The actual rules (when to acquire/release the lock) are enforced by lockdep
> > + * through the maple tree implementation.
> > + *
> > + * For this reason the DRM GPUVA manager takes the maple tree's internal
> > + * spinlock according to the lockdep enforced rules.
> > + *
> > + * Please note that this lock is *only* meant to fulfill the maple tree's
> > + * requirements and does not intentionally protect the DRM GPUVA manager
> > + * against concurrent access.
> > + *
> > + * The following mail thread provides more details on why the maple tree
> > + * has this requirement.
> > + *
> > + * https://lore.kernel.org/lkml/[email protected]/
> > + */
> > +
> > +static int __drm_gpuva_insert(struct drm_gpuva_manager *mgr,
> > + struct drm_gpuva *va);
> > +static void __drm_gpuva_remove(struct drm_gpuva *va);
> > +
> > +/**
> > + * drm_gpuva_manager_init - initialize a &drm_gpuva_manager
> > + * @mgr: pointer to the &drm_gpuva_manager to initialize
> > + * @name: the name of the GPU VA space
> > + * @start_offset: the start offset of the GPU VA space
> > + * @range: the size of the GPU VA space
> > + * @reserve_offset: the start of the kernel reserved GPU VA area
> > + * @reserve_range: the size of the kernel reserved GPU VA area
> > + * @ops: &drm_gpuva_fn_ops called on &drm_gpuva_sm_map / &drm_gpuva_sm_unmap
> > + *
> > + * The &drm_gpuva_manager must be initialized with this function before use.
> > + *
> > + * Note that @mgr must be cleared to 0 before calling this function. The given
> > + * &name is expected to be managed by the surrounding driver structures.
> > + */
> > +void
> > +drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
> > + const char *name,
> > + u64 start_offset, u64 range,
> > + u64 reserve_offset, u64 reserve_range,
> > + const struct drm_gpuva_fn_ops *ops)
> > +{
> > + mt_init(&mgr->mtree);
> > +
> > + mgr->mm_start = start_offset;
> > + mgr->mm_range = range;
> > +
> > + mgr->name = name ? name : "unknown";
> > + mgr->ops = ops;
> > +
> > + memset(&mgr->kernel_alloc_node, 0, sizeof(struct drm_gpuva));
> > +
> > + if (reserve_range) {
> > + mgr->kernel_alloc_node.va.addr = reserve_offset;
> > + mgr->kernel_alloc_node.va.range = reserve_range;
> > +
> > + __drm_gpuva_insert(mgr, &mgr->kernel_alloc_node);
> > + }
> > +
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_manager_init);
> > +
> > +/**
> > + * drm_gpuva_manager_destroy - cleanup a &drm_gpuva_manager
> > + * @mgr: pointer to the &drm_gpuva_manager to clean up
> > + *
> > + * Note that it is a bug to call this function on a manager that still
> > + * holds GPU VA mappings.
> > + */
> > +void
> > +drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr)
> > +{
> > + mgr->name = NULL;
> > +
> > + if (mgr->kernel_alloc_node.va.range)
> > + __drm_gpuva_remove(&mgr->kernel_alloc_node);
> > +
> > + mtree_lock(&mgr->mtree);
> > + WARN(!mtree_empty(&mgr->mtree),
> > + "GPUVA tree is not empty, potentially leaking memory.");
> > + __mt_destroy(&mgr->mtree);
> > + mtree_unlock(&mgr->mtree);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_manager_destroy);
> > +
> > +static inline bool
> > +drm_gpuva_in_mm_range(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
> > +{
> > + u64 end = addr + range;
> > + u64 mm_start = mgr->mm_start;
> > + u64 mm_end = mm_start + mgr->mm_range;
> > +
> > + return addr < mm_end && mm_start < end;
> > +}
> > +
> > +static inline bool
> > +drm_gpuva_in_kernel_node(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
> > +{
> > + u64 end = addr + range;
> > + u64 kstart = mgr->kernel_alloc_node.va.addr;
> > + u64 krange = mgr->kernel_alloc_node.va.range;
> > + u64 kend = kstart + krange;
> > +
> > + return krange && addr < kend && kstart < end;
> > +}
> > +
> > +static inline bool
> > +drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
> > + u64 addr, u64 range)
> > +{
> > + return drm_gpuva_in_mm_range(mgr, addr, range) &&
> > + !drm_gpuva_in_kernel_node(mgr, addr, range);
> > +}
> > +
> > +/**
> > + * drm_gpuva_iter_remove - removes the iterator's current element
> > + * @it: the &drm_gpuva_iterator
> > + *
> > + * This removes the element the iterator currently points to.
> > + */
> > +void
> > +drm_gpuva_iter_remove(struct drm_gpuva_iterator *it)
> > +{
> > + mas_lock(&it->mas);
> > + mas_erase(&it->mas);
> > + mas_unlock(&it->mas);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_iter_remove);
> > +
> > +/**
> > + * drm_gpuva_prealloc_create - creates a preallocated node to store a
> > + * &drm_gpuva entry.
> > + *
> > + * Returns: the &drm_gpuva_prealloc object on success, NULL on failure
> > + */
> > +struct drm_gpuva_prealloc *
> > +drm_gpuva_prealloc_create(struct drm_gpuva_manager *mgr)
> > +{
> > + struct drm_gpuva_prealloc *pa;
> > +
> > + pa = kzalloc(sizeof(*pa), GFP_KERNEL);
> > + if (!pa)
> > + return NULL;
> > +
> > + mas_init(&pa->mas, &mgr->mtree, 0);
>
> I've broken this interface on you too, with the mas_preallocate()
> change - See below.
>
> > + if (mas_preallocate(&pa->mas, GFP_KERNEL)) {
> > + kfree(pa);
> > + return NULL;
> > + }
> > +
> > + return pa;
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_prealloc_create);
> > +
> > +/**
> > + * drm_gpuva_prealloc_destroy - destroys a preallocated node and frees the
> > + * &drm_gpuva_prealloc
>
> I tend to think of it as destroying a maple state by freeing the
> preallocated nodes, but I guess the state isn't destroyed.
>

When I wrote 'preallocated node' I wasn't thinking of the maple tree
implementation behind it. I intended to tell users of the API that the 'node'
(meaning an arbitrary place to store a GPUVA entry) is destroyed internally by
this function.

> > + *
> > + * @pa: the &drm_gpuva_prealloc to destroy
> > + */
> > +void
> > +drm_gpuva_prealloc_destroy(struct drm_gpuva_prealloc *pa)
> > +{
> > + mas_destroy(&pa->mas);
> > + kfree(pa);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_prealloc_destroy);
> > +
> > +static int
> > +drm_gpuva_insert_state(struct drm_gpuva_manager *mgr,
> > + struct ma_state *mas,
> > + struct drm_gpuva *va)
>
> Couldn't these arguments be on one line?
>

Yep, gonna change that.

> > +{
> > + u64 addr = va->va.addr;
> > + u64 range = va->va.range;
> > + u64 last = addr + range - 1;
> > +
> > + mas_set(mas, addr);
> > +
> > + mas_lock(mas);
> > + if (unlikely(mas_walk(mas))) {
> > + mas_unlock(mas);
> > + return -EEXIST;
> > + }
> > +
> > + if (unlikely(mas->last < last)) {
> > + mas_unlock(mas);
> > + return -EEXIST;
> > + }
> > +
> > + mas->index = addr;
> > + mas->last = last;
> > +
> > + mas_store_prealloc(mas, va);
> > + mas_unlock(mas);
> > +
> > + va->mgr = mgr;
> > +
> > + return 0;
> > +}
> > +
> > +static int
> > +__drm_gpuva_insert(struct drm_gpuva_manager *mgr,
> > + struct drm_gpuva *va)
> > +{
> > + MA_STATE(mas, &mgr->mtree, 0, 0);
> > + int ret;
> > +
> > + ret = mas_preallocate(&mas, GFP_KERNEL);
>
> mas_preallocate() is in the process of being updated to reduce the
> allocations, so this will eventually fail to compile [1].
>
> mas_preallocate(&mas, va, GFP_KERNEL) will work in the future.

This is perfectly fine for __drm_gpuva_insert(). In fact, I could also just
use mas_store_gfp() right away in this function. The only reason for the
mas_preallocate() call is to share a common code path, drm_gpuva_insert_state(),
between __drm_gpuva_insert() and drm_gpuva_insert_prealloc(). Given the overhead
of mas_preallocate() in this case, I had already considered dropping that and
implementing __drm_gpuva_insert() with mas_store_gfp() directly.

However, considering your explanation below, I tend to think that this change
could be a showstopper for another use case. Please see the comment on
drm_gpuva_insert_prealloc().

>
> The calculated allocations depend on the area being written and if there
> is a value or NULL being written.
>
> > + if (ret)
> > + return ret;
> > +
> > + return drm_gpuva_insert_state(mgr, &mas, va);
>
> This has the added effect that the mas_preallocate() examines the tree
> by walking it, so you need to hold the lock during that work. It is
> also possible, since you are not holding the lock here, that you could
> have a writer come in and change what you preallocated to store and may
> cause the write to not have enough memory. IIRC you have another
> locking strategy that negates this, but you will still need to hold the
> lock and have the maple state pointing at the correct range now (or,
> well, soon) to keep lockdep happy.
>
> Change this:
> MA_STATE(mas, &mgr->mtree, 0, 0);
>
> to something like this (but hopefully less ugly..)
> MA_STATE(mas, &mgr->mtree, va->va.addr, va->va.addr + va->va.range - 1);
>
> ...maybe use mas_init() instead.
>
> This strictly does not need to preallocate since you don't have complex
> locking in this case, but I suspect you are preallocating for external
> driver use as documented in this patch? This can still work if the
> preallocation call sets up the maple state and the driver doesn't mess
> things up on you. You could check that by verifying mas.index and
> mas.last are what you expect, but I think you'll want to move your
> mas_walk() checks to before preallocating.
>
> [1] https://lore.kernel.org/all/[email protected]/
>
> > +}
> > +
> > +/**
> > + * drm_gpuva_insert - insert a &drm_gpuva
> > + * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
> > + * @va: the &drm_gpuva to insert
> > + *
> > + * Insert a &drm_gpuva with a given address and range into a
> > + * &drm_gpuva_manager.
> > + *
> > + * It is not allowed to use this function while iterating this GPU VA space,
> > + * e.g via drm_gpuva_iter_for_each().
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> > + */
> > +int
> > +drm_gpuva_insert(struct drm_gpuva_manager *mgr,
> > + struct drm_gpuva *va)
> > +{
> > + u64 addr = va->va.addr;
> > + u64 range = va->va.range;
> > +
> > + if (unlikely(!drm_gpuva_range_valid(mgr, addr, range)))
> > + return -EINVAL;
> > +
> > + return __drm_gpuva_insert(mgr, va);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_insert);
> > +
> > +/**
> > + * drm_gpuva_insert_prealloc - insert a &drm_gpuva with a preallocated node
> > + * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
> > + * @va: the &drm_gpuva to insert
> > + * @pa: the &drm_gpuva_prealloc node
> > + *
> > + * Insert a &drm_gpuva with a given address and range into a
> > + * &drm_gpuva_manager.
> > + *
> > + * It is not allowed to use this function while iterating this GPU VA space,
> > + * e.g via drm_gpuva_iter_for_each().
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> > + */
> > +int
> > +drm_gpuva_insert_prealloc(struct drm_gpuva_manager *mgr,
> > + struct drm_gpuva_prealloc *pa,
> > + struct drm_gpuva *va)
> > +{
> > + struct ma_state *mas = &pa->mas;
> > + u64 addr = va->va.addr;
> > + u64 range = va->va.range;
> > +
> > + if (unlikely(!drm_gpuva_range_valid(mgr, addr, range)))
> > + return -EINVAL;
> > +
> > + mas->tree = &mgr->mtree;
>
> Are you trying to take the allocated nodes for a write to a different
> tree? You may not have enough nodes..
>

They're not going to a different tree. I guess you're confused about why I set
mas->tree (again). And so am I - I think it is just a leftover. The tree is
already set in drm_gpuva_prealloc_create() through mas_init().

However, if I correctly understand the changes to mas_preallocate() you
explained above, your point "you may not have enough nodes" might still hold.
Let me explain:

When talking about locking I mentioned earlier that there are two cases.
In the first case users of the GPUVA manager need to hold a mutex (anyway) while
applying changes to the GPUVA space and hence the maple tree, not just to
protect the tree, but also to make sure that multiple changes to the GPUVA space
appear atomically.
In the second case, any accesses to GPUVA space are serialized and hence
technically don't require any locking at all.

In the latter, serialized, case we basically have two stages.
In the first one, jobs to create / remove mappings are submitted to the driver,
which pre-allocates the required resources and puts the job into a job queue.
In the second stage, the actual updates to the GPUVA space are performed
asynchronously, running within a dma_fence signalling critical path, and hence
no memory allocations are permitted. A single job entering the serialized path
can usually contain an arbitrary number of map / unmap / remap operations and
hence an arbitrary number of GPUVAs to add to or remove from the maple tree.

Therefore at the time of pre-allocation we can't predict the state of the maple
tree at the time the pre-allocated nodes are actually used to insert an entry,
nor can we ensure that the maple tree doesn't change until the pre-allocated
nodes were actually used to insert an entry.

If I understand it correctly, until now mas_preallocate() allocates the
worst-case number of nodes, which seems to be exactly what we need for this use
case. With your change, however, you walk the tree to calculate how many nodes
are needed at that point in time, and hence require that the tree does not
change until the nodes have been used to store the indicated entry.

Would it be possible to just have both paths, your new mas_preallocate() and
something like mas_preallocate_worst_case() (hopefully with a better name)?
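
To illustrate the two-stage pattern I have in mind, here is a minimal
userspace sketch. It is not maple tree code: struct node, struct node_pool and
the pool_*() helpers are hypothetical names made up for this example. Stage one
reserves a worst-case number of nodes while allocations are still allowed;
stage two only takes nodes from the pool and never allocates, regardless of how
the tree changed in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tree node; real maple tree nodes differ. */
struct node {
	struct node *next;
};

/* Nodes reserved at submit time and consumed in the critical path. */
struct node_pool {
	struct node *free;
	size_t count;
};

/*
 * Stage one: allocations are allowed and may fail. Reserve nodes until the
 * pool holds at least @n. On failure the caller should call pool_release().
 */
static int pool_reserve(struct node_pool *pool, size_t n)
{
	while (pool->count < n) {
		struct node *nd = malloc(sizeof(*nd));

		if (!nd)
			return -1;
		nd->next = pool->free;
		pool->free = nd;
		pool->count++;
	}
	return 0;
}

/* Stage two: must not allocate; hand out a node reserved earlier. */
static struct node *pool_take(struct node_pool *pool)
{
	struct node *nd = pool->free;

	assert(nd && "worst-case reservation was too small");
	pool->free = nd->next;
	pool->count--;
	return nd;
}

/* After the job has completed: give back whatever was not used. */
static void pool_release(struct node_pool *pool)
{
	while (pool->free) {
		struct node *nd = pool->free;

		pool->free = nd->next;
		free(nd);
	}
	pool->count = 0;
}
```

The point of the sketch is that the size of the reservation must not depend on
the state of the tree at submit time, since that state is unknown by the time
pool_take() runs.
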

> > + return drm_gpuva_insert_state(mgr, mas, va);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_insert_prealloc);
> > +
> > +static void
> > +__drm_gpuva_remove(struct drm_gpuva *va)
> > +{
> > + MA_STATE(mas, &va->mgr->mtree, va->va.addr, 0);
> > +
> > + mas_lock(&mas);
> > + mas_erase(&mas);
> > + mas_unlock(&mas);
>
> This should be the same as: mtree_erase(&va->mgr->mtree, va->va.addr);
>

Yeah, I think there are a few cases where I could do something similar as well.
I think I already covered those in a follow-up patch; seems like I missed
them here.

> > +}
> > +
> > +/**
> > + * drm_gpuva_remove - remove a &drm_gpuva
> > + * @va: the &drm_gpuva to remove
> > + *
> > + * This removes the given &va from the underlying tree.
> > + *
> > + * It is not allowed to use this function while iterating this GPU VA space,
> > + * e.g via drm_gpuva_iter_for_each(). Please use drm_gpuva_iter_remove()
> > + * instead.
> > + */
> > +void
> > +drm_gpuva_remove(struct drm_gpuva *va)
> > +{
> > + struct drm_gpuva_manager *mgr = va->mgr;
> > +
> > + if (unlikely(va == &mgr->kernel_alloc_node)) {
> > + WARN(1, "Can't destroy kernel reserved node.\n");
> > + return;
> > + }
> > +
> > + __drm_gpuva_remove(va);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_remove);
> > +
> > +/**
> > + * drm_gpuva_link - link a &drm_gpuva
> > + * @va: the &drm_gpuva to link
> > + *
> > + * This adds the given &va to the GPU VA list of the &drm_gem_object it is
> > + * associated with.
> > + *
> > + * This function expects the caller to protect the GEM's GPUVA list against
> > + * concurrent access.
> > + */
> > +void
> > +drm_gpuva_link(struct drm_gpuva *va)
> > +{
> > + if (likely(va->gem.obj))
> > + list_add_tail(&va->gem.entry, &va->gem.obj->gpuva.list);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_link);
> > +
> > +/**
> > + * drm_gpuva_unlink - unlink a &drm_gpuva
> > + * @va: the &drm_gpuva to unlink
> > + *
> > + * This removes the given &va from the GPU VA list of the &drm_gem_object it is
> > + * associated with.
> > + *
> > + * This function expects the caller to protect the GEM's GPUVA list against
> > + * concurrent access.
> > + */
> > +void
> > +drm_gpuva_unlink(struct drm_gpuva *va)
> > +{
> > + if (likely(va->gem.obj))
> > + list_del_init(&va->gem.entry);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_unlink);
> > +
> > +/**
> > + * drm_gpuva_find_first - find the first &drm_gpuva in the given range
> > + * @mgr: the &drm_gpuva_manager to search in
> > + * @addr: the &drm_gpuvas address
> > + * @range: the &drm_gpuvas range
> > + *
> > + * Returns: the first &drm_gpuva within the given range
> > + */
> > +struct drm_gpuva *
> > +drm_gpuva_find_first(struct drm_gpuva_manager *mgr,
> > + u64 addr, u64 range)
> > +{
> > + MA_STATE(mas, &mgr->mtree, addr, 0);
> > + struct drm_gpuva *va;
> > +
>
> Again, this can be an rcu_read_lock()
>

Same as below.

> > + mas_lock(&mas);
> > + va = mas_find(&mas, addr + range - 1);
> > + mas_unlock(&mas);
> > +
> > + return va;
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_find_first);
> > +
> > +/**
> > + * drm_gpuva_find - find a &drm_gpuva
> > + * @mgr: the &drm_gpuva_manager to search in
> > + * @addr: the &drm_gpuvas address
> > + * @range: the &drm_gpuvas range
> > + *
> > + * Returns: the &drm_gpuva at a given &addr and with a given &range
> > + */
> > +struct drm_gpuva *
> > +drm_gpuva_find(struct drm_gpuva_manager *mgr,
> > + u64 addr, u64 range)
> > +{
> > + struct drm_gpuva *va;
> > +
> > + va = drm_gpuva_find_first(mgr, addr, range);
> > + if (!va)
> > + goto out;
> > +
> > + if (va->va.addr != addr ||
> > + va->va.range != range)
> > + goto out;
> > +
> > + return va;
> > +
> > +out:
> > + return NULL;
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_find);
> > +
> > +/**
> > + * drm_gpuva_find_prev - find the &drm_gpuva before the given address
> > + * @mgr: the &drm_gpuva_manager to search in
> > + * @start: the given GPU VA's start address
> > + *
> > + * Find the adjacent &drm_gpuva before the GPU VA with given &start address.
> > + *
> > + * Note that if there is any free space between the GPU VA mappings no mapping
> > + * is returned.
> > + *
> > + * Returns: a pointer to the found &drm_gpuva or NULL if none was found
> > + */
> > +struct drm_gpuva *
> > +drm_gpuva_find_prev(struct drm_gpuva_manager *mgr, u64 start)
> > +{
> > + MA_STATE(mas, &mgr->mtree, start - 1, 0);
> > + struct drm_gpuva *va;
> > +
> > + if (start <= mgr->mm_start ||
> > + start > (mgr->mm_start + mgr->mm_range))
> > + return NULL;
> > +
>
> And here as well. Maybe mtree_load() would be easier?
>
> > + mas_lock(&mas);
> > + va = mas_walk(&mas);
> > + mas_unlock(&mas);
> > +
> > + return va;
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_find_prev);
> > +
> > +/**
> > + * drm_gpuva_find_next - find the &drm_gpuva after the given address
> > + * @mgr: the &drm_gpuva_manager to search in
> > + * @end: the given GPU VA's end address
> > + *
> > + * Find the adjacent &drm_gpuva after the GPU VA with given &end address.
> > + *
> > + * Note that if there is any free space between the GPU VA mappings no mapping
> > + * is returned.
> > + *
> > + * Returns: a pointer to the found &drm_gpuva or NULL if none was found
> > + */
> > +struct drm_gpuva *
> > +drm_gpuva_find_next(struct drm_gpuva_manager *mgr, u64 end)
> > +{
> > + MA_STATE(mas, &mgr->mtree, end, 0);
> > + struct drm_gpuva *va;
> > +
> > + if (end < mgr->mm_start ||
> > + end >= (mgr->mm_start + mgr->mm_range))
> > + return NULL;
> > +
>
> Here too, you can use the mtree_load() function.
>
> A note though that when I store my VMAs in the mm code, the VMAs are
> [start, end) and the tree is [start, end], so I always take one away.
> Not sure if your VMAs are the same way.
>
> > + mas_lock(&mas);
> > + va = mas_walk(&mas);
> > + mas_unlock(&mas);
> > +
> > + return va;
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_find_next);
> > +
> > +/**
> > + * drm_gpuva_interval_empty - indicate whether a given interval of the VA space
> > + * is empty
> > + * @mgr: the &drm_gpuva_manager to check the range for
> > + * @addr: the start address of the range
> > + * @range: the range of the interval
> > + *
> > + * Returns: true if the interval is empty, false otherwise
> > + */
> > +bool
> > +drm_gpuva_interval_empty(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
> > +{
> > + DRM_GPUVA_ITER(it, mgr, addr);
> > + struct drm_gpuva *va;
> > +
> > + drm_gpuva_iter_for_each_range(va, it, addr + range)
> > + return false;
> > +
> > + return true;
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_interval_empty);
> > +
> > +/**
> > + * drm_gpuva_map - helper to insert a &drm_gpuva from &drm_gpuva_fn_ops
> > + * callbacks
> > + *
> > + * @mgr: the &drm_gpuva_manager
> > + * @pa: the &drm_gpuva_prealloc
> > + * @va: the &drm_gpuva to insert
> > + */
> > +int
> > +drm_gpuva_map(struct drm_gpuva_manager *mgr,
> > + struct drm_gpuva_prealloc *pa,
> > + struct drm_gpuva *va)
> > +{
> > + return drm_gpuva_insert_prealloc(mgr, pa, va);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_map);
> > +
> > +/**
> > + * drm_gpuva_remap - helper to remap a &drm_gpuva from &drm_gpuva_fn_ops
> > + * callbacks
> > + *
> > + * @state: the current &drm_gpuva_state
> > + * @prev: the &drm_gpuva to remap when keeping the start of a mapping,
> > + * may be NULL
> > + * @next: the &drm_gpuva to remap when keeping the end of a mapping,
> > + * may be NULL
> > + */
> > +int
> > +drm_gpuva_remap(drm_gpuva_state_t state,
> > + struct drm_gpuva *prev,
> > + struct drm_gpuva *next)
> > +{
> > + struct ma_state *mas = &state->mas;
> > + u64 max = mas->last;
> > +
> > + if (unlikely(!prev && !next))
> > + return -EINVAL;
> > +
> > + if (prev) {
> > + u64 addr = prev->va.addr;
> > + u64 last = addr + prev->va.range - 1;
> > +
> > + if (unlikely(addr != mas->index))
> > + return -EINVAL;
> > +
> > + if (unlikely(last >= mas->last))
> > + return -EINVAL;
> > + }
> > +
> > + if (next) {
> > + u64 addr = next->va.addr;
> > + u64 last = addr + next->va.range - 1;
> > +
> > + if (unlikely(last != mas->last))
> > + return -EINVAL;
> > +
> > + if (unlikely(addr <= mas->index))
> > + return -EINVAL;
> > + }
> > +
> > + if (prev && next) {
> > + u64 p_last = prev->va.addr + prev->va.range - 1;
> > + u64 n_addr = next->va.addr;
> > +
> > + if (unlikely(p_last > n_addr))
> > + return -EINVAL;
> > +
> > + if (unlikely(n_addr - p_last <= 1))
> > + return -EINVAL;
> > + }
> > +
> > + mas_lock(mas);
> > + if (prev) {
> > + mas_store(mas, prev);
> > + mas_next(mas, max);
>
> This will advance to the next entry, is that what you want to do? I
> think you want mas_next_range(), which is in a recent patch set [2]. I
> believe, what you have here is a large range which is NULL and you are
> either inserting something at the start, at the end, or both?

drm_gpuva_remap() is called from a callback while walking the tree with
drm_gpuva_iter_for_each_range() in __drm_gpuva_sm_map() or
__drm_gpuva_sm_unmap(), i.e. while inserting a new mapping or unmapping a given
range.

It is called whenever the new mapping to insert or the given range to unmap
intersects with the entry at the iterator's current position. For example:

     0        a        3
old: |-----------------| (bo_offset=n)

           1  b  2
req:       |-----|       (bo_offset=m)

     0  a' 1  b  2 a'' 3
new: |-----|-----|-----| (a'.bo_offset=n,b.bo_offset=m,a''.bo_offset=n+2)

In this example the iterator's current entry would be a, prev would be a' and
next would be a''. When this function returns I expect to have a hole where b
should go. b would be inserted in the last iteration of
drm_gpuva_iter_for_each_range() in __drm_gpuva_sm_map() from another callback,
which is drm_gpuva_map(). drm_gpuva_map() gets a struct drm_gpuva_prealloc
passed by the caller, which holds the pre-allocated nodes for inserting the
entry into the hole we left in this function.

Taking this example the maple tree should look like this right before the call
to mas_next():

0  a' 1     a     3
|-----|-----------|

So, what I expect from mas_next() is that it brings me to offset '1'.
Afterwards, we enter the 'if (next)' path and create the hole with
'mas->last = next->va.addr - 1' and 'mas_store(mas, NULL)'. mas_next() is then
called again to jump to offset '2', where a'' is inserted.

Hopefully, now you can tell me whether I want mas_next() or mas_next_range()
here. :-)
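
To make the arithmetic of the example above concrete, here is a minimal
userspace sketch of how the surviving pieces of a split mapping can be
computed. It only mirrors the address and offset calculations done in
__drm_gpuva_sm_unmap(); it does not touch a maple tree. struct mapping,
mapping_last() and split_mapping() are hypothetical names made up for this
illustration and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the va/gem fields of struct drm_gpuva. */
struct mapping {
	uint64_t addr;   /* start address */
	uint64_t range;  /* size; the mapping covers [addr, addr + range) */
	uint64_t offset; /* offset into the backing buffer object */
};

/*
 * Last address covered by the mapping. The mapping itself is half-open,
 * [addr, addr + range), while a tree keyed by inclusive ranges needs
 * [addr, last], hence the "- 1".
 */
static inline uint64_t mapping_last(const struct mapping *m)
{
	return m->addr + m->range - 1;
}

/*
 * Compute the parts of 'old' that survive a request covering
 * [req_addr, req_addr + req_range). 'prev' keeps the start of the old
 * mapping, 'next' keeps its end. Each output is only valid if the
 * matching has_* flag is set.
 */
static inline void split_mapping(const struct mapping *old,
				 uint64_t req_addr, uint64_t req_range,
				 struct mapping *prev, bool *has_prev,
				 struct mapping *next, bool *has_next)
{
	uint64_t end = old->addr + old->range;
	uint64_t req_end = req_addr + req_range;

	*has_prev = old->addr < req_addr;
	if (*has_prev) {
		prev->addr = old->addr;
		prev->range = req_addr - old->addr;
		prev->offset = old->offset; /* start of the BO is unchanged */
	}

	*has_next = end > req_end;
	if (*has_next) {
		next->addr = req_end;
		next->range = end - req_end;
		/* skip the part of the BO that sat below req_end */
		next->offset = old->offset + (req_end - old->addr);
	}
}
```

With old = { .addr = 0, .range = 3, .offset = n } and a request for
addr = 1, range = 1, this gives prev = { 0, 1, n } and next = { 2, 1, n + 2 },
i.e. a' and a'' from the diagram, with the hole [1, 2) left for b.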

>
> [2] https://lore.kernel.org/lkml/[email protected]/
>
> > + if (!next)
> > + mas_store(mas, NULL);
> > + }
> > +
> > + if (next) {
> > + mas->last = next->va.addr - 1;
> > + mas_store(mas, NULL);
> > + mas_next(mas, max);
> > + mas_store(mas, next);
> > + }
> > + mas_unlock(mas);
> > +
> > + return 0;
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_remap);
> > +
> > +/**
> > + * drm_gpuva_unmap - helper to remove a &drm_gpuva from &drm_gpuva_fn_ops
> > + * callbacks
> > + *
> > + * @state: the current &drm_gpuva_state
> > + *
> > + * The entry associated with the current state is removed.
> > + */
> > +void
> > +drm_gpuva_unmap(drm_gpuva_state_t state)
> > +{
> > + drm_gpuva_iter_remove(state);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_unmap);
> > +
> > +static int
> > +op_map_cb(const struct drm_gpuva_fn_ops *fn, void *priv,
> > + u64 addr, u64 range,
> > + struct drm_gem_object *obj, u64 offset)
> > +{
> > + struct drm_gpuva_op op = {};
> > +
> > + op.op = DRM_GPUVA_OP_MAP;
> > + op.map.va.addr = addr;
> > + op.map.va.range = range;
> > + op.map.gem.obj = obj;
> > + op.map.gem.offset = offset;
> > +
> > + return fn->sm_step_map(&op, priv);
> > +}
> > +
> > +static int
> > +op_remap_cb(const struct drm_gpuva_fn_ops *fn,
> > + drm_gpuva_state_t state, void *priv,
> > + struct drm_gpuva_op_map *prev,
> > + struct drm_gpuva_op_map *next,
> > + struct drm_gpuva_op_unmap *unmap)
> > +{
> > + struct drm_gpuva_op op = {};
> > + struct drm_gpuva_op_remap *r;
> > +
> > + op.op = DRM_GPUVA_OP_REMAP;
> > + r = &op.remap;
> > + r->prev = prev;
> > + r->next = next;
> > + r->unmap = unmap;
> > +
> > + return fn->sm_step_remap(&op, state, priv);
> > +}
> > +
> > +static int
> > +op_unmap_cb(const struct drm_gpuva_fn_ops *fn,
> > + drm_gpuva_state_t state, void *priv,
> > + struct drm_gpuva *va, bool merge)
> > +{
> > + struct drm_gpuva_op op = {};
> > +
> > + op.op = DRM_GPUVA_OP_UNMAP;
> > + op.unmap.va = va;
> > + op.unmap.keep = merge;
> > +
> > + return fn->sm_step_unmap(&op, state, priv);
> > +}
> > +
> > +static int
> > +__drm_gpuva_sm_map(struct drm_gpuva_manager *mgr,
> > + const struct drm_gpuva_fn_ops *ops, void *priv,
> > + u64 req_addr, u64 req_range,
> > + struct drm_gem_object *req_obj, u64 req_offset)
> > +{
> > + DRM_GPUVA_ITER(it, mgr, req_addr);
> > + struct drm_gpuva *va, *prev = NULL;
> > + u64 req_end = req_addr + req_range;
> > + int ret;
> > +
> > + if (unlikely(!drm_gpuva_in_mm_range(mgr, req_addr, req_range)))
> > + return -EINVAL;
> > +
> > + if (unlikely(drm_gpuva_in_kernel_node(mgr, req_addr, req_range)))
> > + return -EINVAL;
> > +
> > + drm_gpuva_iter_for_each_range(va, it, req_end) {
> > + struct drm_gem_object *obj = va->gem.obj;
> > + u64 offset = va->gem.offset;
> > + u64 addr = va->va.addr;
> > + u64 range = va->va.range;
> > + u64 end = addr + range;
> > + bool merge = !!va->gem.obj;
> > +
> > + if (addr == req_addr) {
> > + merge &= obj == req_obj &&
> > + offset == req_offset;
> > +
> > + if (end == req_end) {
> > + ret = op_unmap_cb(ops, &it, priv, va, merge);
> > + if (ret)
> > + return ret;
> > + break;
> > + }
> > +
> > + if (end < req_end) {
> > + ret = op_unmap_cb(ops, &it, priv, va, merge);
> > + if (ret)
> > + return ret;
> > + goto next;
> > + }
> > +
> > + if (end > req_end) {
> > + struct drm_gpuva_op_map n = {
> > + .va.addr = req_end,
> > + .va.range = range - req_range,
> > + .gem.obj = obj,
> > + .gem.offset = offset + req_range,
> > + };
> > + struct drm_gpuva_op_unmap u = {
> > + .va = va,
> > + .keep = merge,
> > + };
> > +
> > + ret = op_remap_cb(ops, &it, priv, NULL, &n, &u);
> > + if (ret)
> > + return ret;
> > + break;
> > + }
> > + } else if (addr < req_addr) {
> > + u64 ls_range = req_addr - addr;
> > + struct drm_gpuva_op_map p = {
> > + .va.addr = addr,
> > + .va.range = ls_range,
> > + .gem.obj = obj,
> > + .gem.offset = offset,
> > + };
> > + struct drm_gpuva_op_unmap u = { .va = va };
> > +
> > + merge &= obj == req_obj &&
> > + offset + ls_range == req_offset;
> > + u.keep = merge;
> > +
> > + if (end == req_end) {
> > + ret = op_remap_cb(ops, &it, priv, &p, NULL, &u);
> > + if (ret)
> > + return ret;
> > + break;
> > + }
> > +
> > + if (end < req_end) {
> > + ret = op_remap_cb(ops, &it, priv, &p, NULL, &u);
> > + if (ret)
> > + return ret;
> > + goto next;
> > + }
> > +
> > + if (end > req_end) {
> > + struct drm_gpuva_op_map n = {
> > + .va.addr = req_end,
> > + .va.range = end - req_end,
> > + .gem.obj = obj,
> > + .gem.offset = offset + ls_range +
> > + req_range,
> > + };
> > +
> > + ret = op_remap_cb(ops, &it, priv, &p, &n, &u);
> > + if (ret)
> > + return ret;
> > + break;
> > + }
> > + } else if (addr > req_addr) {
> > + merge &= obj == req_obj &&
> > + offset == req_offset +
> > + (addr - req_addr);
> > +
> > + if (end == req_end) {
> > + ret = op_unmap_cb(ops, &it, priv, va, merge);
> > + if (ret)
> > + return ret;
> > + break;
> > + }
> > +
> > + if (end < req_end) {
> > + ret = op_unmap_cb(ops, &it, priv, va, merge);
> > + if (ret)
> > + return ret;
> > + goto next;
> > + }
> > +
> > + if (end > req_end) {
> > + struct drm_gpuva_op_map n = {
> > + .va.addr = req_end,
> > + .va.range = end - req_end,
> > + .gem.obj = obj,
> > + .gem.offset = offset + req_end - addr,
> > + };
> > + struct drm_gpuva_op_unmap u = {
> > + .va = va,
> > + .keep = merge,
> > + };
> > +
> > + ret = op_remap_cb(ops, &it, priv, NULL, &n, &u);
> > + if (ret)
> > + return ret;
> > + break;
> > + }
> > + }
> > +next:
> > + prev = va;
> > + }
> > +
> > + return op_map_cb(ops, priv,
> > + req_addr, req_range,
> > + req_obj, req_offset);
> > +}
> > +
> > +static int
> > +__drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr,
> > + const struct drm_gpuva_fn_ops *ops, void *priv,
> > + u64 req_addr, u64 req_range)
> > +{
> > + DRM_GPUVA_ITER(it, mgr, req_addr);
> > + struct drm_gpuva *va;
> > + u64 req_end = req_addr + req_range;
> > + int ret;
> > +
> > + if (unlikely(drm_gpuva_in_kernel_node(mgr, req_addr, req_range)))
> > + return -EINVAL;
> > +
> > + drm_gpuva_iter_for_each_range(va, it, req_end) {
> > + struct drm_gpuva_op_map prev = {}, next = {};
> > + bool prev_split = false, next_split = false;
> > + struct drm_gem_object *obj = va->gem.obj;
> > + u64 offset = va->gem.offset;
> > + u64 addr = va->va.addr;
> > + u64 range = va->va.range;
> > + u64 end = addr + range;
> > +
> > + if (addr < req_addr) {
> > + prev.va.addr = addr;
> > + prev.va.range = req_addr - addr;
> > + prev.gem.obj = obj;
> > + prev.gem.offset = offset;
> > +
> > + prev_split = true;
> > + }
> > +
> > + if (end > req_end) {
> > + next.va.addr = req_end;
> > + next.va.range = end - req_end;
> > + next.gem.obj = obj;
> > + next.gem.offset = offset + (req_end - addr);
> > +
> > + next_split = true;
> > + }
> > +
> > + if (prev_split || next_split) {
> > + struct drm_gpuva_op_unmap unmap = { .va = va };
> > +
> > + ret = op_remap_cb(ops, &it, priv,
> > + prev_split ? &prev : NULL,
> > + next_split ? &next : NULL,
> > + &unmap);
> > + if (ret)
> > + return ret;
> > + } else {
> > + ret = op_unmap_cb(ops, &it, priv, va, false);
> > + if (ret)
> > + return ret;
> > + }
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * drm_gpuva_sm_map - creates the &drm_gpuva_op split/merge steps
> > + * @mgr: the &drm_gpuva_manager representing the GPU VA space
> > + * @req_addr: the start address of the new mapping
> > + * @req_range: the range of the new mapping
> > + * @req_obj: the &drm_gem_object to map
> > + * @req_offset: the offset within the &drm_gem_object
> > + * @priv: pointer to a driver private data structure
> > + *
> > + * This function iterates the given range of the GPU VA space. It utilizes the
> > + * &drm_gpuva_fn_ops to call back into the driver providing the split and merge
> > + * steps.
> > + *
> > + * Drivers may use these callbacks to update the GPU VA space right away within
> > + * the callback. In case the driver decides to copy and store the operations for
> > + * later processing neither this function nor &drm_gpuva_sm_unmap is allowed to
> > + * be called before the &drm_gpuva_manager's view of the GPU VA space was
> > + * updated with the previous set of operations. To update the
> > + * &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
> > + * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
> > + * used.
> > + *
> > + * A sequence of callbacks can contain map, unmap and remap operations, but
> > + * the sequence of callbacks might also be empty if no operation is required,
> > + * e.g. if the requested mapping already exists in the exact same way.
> > + *
> > + * There can be an arbitrary amount of unmap operations, a maximum of two remap
> > + * operations and a single map operation. The latter one represents the original
> > + * map operation requested by the caller.
> > + *
> > + * Returns: 0 on success or a negative error code
> > + */
> > +int
> > +drm_gpuva_sm_map(struct drm_gpuva_manager *mgr, void *priv,
> > + u64 req_addr, u64 req_range,
> > + struct drm_gem_object *req_obj, u64 req_offset)
> > +{
> > + const struct drm_gpuva_fn_ops *ops = mgr->ops;
> > +
> > + if (unlikely(!(ops && ops->sm_step_map &&
> > + ops->sm_step_remap &&
> > + ops->sm_step_unmap)))
> > + return -EINVAL;
> > +
> > + return __drm_gpuva_sm_map(mgr, ops, priv,
> > + req_addr, req_range,
> > + req_obj, req_offset);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_sm_map);
> > +
> > +/**
> > + * drm_gpuva_sm_unmap - creates the &drm_gpuva_ops to split on unmap
> > + * @mgr: the &drm_gpuva_manager representing the GPU VA space
> > + * @priv: pointer to a driver private data structure
> > + * @req_addr: the start address of the range to unmap
> > + * @req_range: the range of the mappings to unmap
> > + *
> > + * This function iterates the given range of the GPU VA space. It utilizes the
> > + * &drm_gpuva_fn_ops to call back into the driver providing the operations to
> > + * unmap and, if required, split existent mappings.
> > + *
> > + * Drivers may use these callbacks to update the GPU VA space right away within
> > + * the callback. In case the driver decides to copy and store the operations for
> > + * later processing neither this function nor &drm_gpuva_sm_map is allowed to be
> > + * called before the &drm_gpuva_manager's view of the GPU VA space was updated
> > + * with the previous set of operations. To update the &drm_gpuva_manager's view
> > + * of the GPU VA space drm_gpuva_insert(), drm_gpuva_destroy_locked() and/or
> > + * drm_gpuva_destroy_unlocked() should be used.
> > + *
> > + * A sequence of callbacks can contain unmap and remap operations, depending on
> > + * whether there are actual overlapping mappings to split.
> > + *
> > + * There can be an arbitrary amount of unmap operations and a maximum of two
> > + * remap operations.
> > + *
> > + * Returns: 0 on success or a negative error code
> > + */
> > +int
> > +drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr, void *priv,
> > + u64 req_addr, u64 req_range)
> > +{
> > + const struct drm_gpuva_fn_ops *ops = mgr->ops;
> > +
> > + if (unlikely(!(ops && ops->sm_step_remap &&
> > + ops->sm_step_unmap)))
> > + return -EINVAL;
> > +
> > + return __drm_gpuva_sm_unmap(mgr, ops, priv,
> > + req_addr, req_range);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_sm_unmap);
> > +
> > +static struct drm_gpuva_op *
> > +gpuva_op_alloc(struct drm_gpuva_manager *mgr)
> > +{
> > + const struct drm_gpuva_fn_ops *fn = mgr->ops;
> > + struct drm_gpuva_op *op;
> > +
> > + if (fn && fn->op_alloc)
> > + op = fn->op_alloc();
> > + else
> > + op = kzalloc(sizeof(*op), GFP_KERNEL);
> > +
> > + if (unlikely(!op))
> > + return NULL;
> > +
> > + return op;
> > +}
> > +
> > +static void
> > +gpuva_op_free(struct drm_gpuva_manager *mgr,
> > + struct drm_gpuva_op *op)
> > +{
> > + const struct drm_gpuva_fn_ops *fn = mgr->ops;
> > +
> > + if (fn && fn->op_free)
> > + fn->op_free(op);
> > + else
> > + kfree(op);
> > +}
> > +
> > +static int
> > +drm_gpuva_sm_step(struct drm_gpuva_op *__op,
> > + drm_gpuva_state_t state,
> > + void *priv)
> > +{
> > + struct {
> > + struct drm_gpuva_manager *mgr;
> > + struct drm_gpuva_ops *ops;
> > + } *args = priv;
> > + struct drm_gpuva_manager *mgr = args->mgr;
> > + struct drm_gpuva_ops *ops = args->ops;
> > + struct drm_gpuva_op *op;
> > +
> > + op = gpuva_op_alloc(mgr);
> > + if (unlikely(!op))
> > + goto err;
> > +
> > + memcpy(op, __op, sizeof(*op));
> > +
> > + if (op->op == DRM_GPUVA_OP_REMAP) {
> > + struct drm_gpuva_op_remap *__r = &__op->remap;
> > + struct drm_gpuva_op_remap *r = &op->remap;
> > +
> > + r->unmap = kmemdup(__r->unmap, sizeof(*r->unmap),
> > + GFP_KERNEL);
> > + if (unlikely(!r->unmap))
> > + goto err_free_op;
> > +
> > + if (__r->prev) {
> > + r->prev = kmemdup(__r->prev, sizeof(*r->prev),
> > + GFP_KERNEL);
> > + if (unlikely(!r->prev))
> > + goto err_free_unmap;
> > + }
> > +
> > + if (__r->next) {
> > + r->next = kmemdup(__r->next, sizeof(*r->next),
> > + GFP_KERNEL);
> > + if (unlikely(!r->next))
> > + goto err_free_prev;
> > + }
> > + }
> > +
> > + list_add_tail(&op->entry, &ops->list);
> > +
> > + return 0;
> > +
> > +err_free_prev:
> > + kfree(op->remap.prev);
> > +err_free_unmap:
> > + kfree(op->remap.unmap);
> > +err_free_op:
> > + gpuva_op_free(mgr, op);
> > +err:
> > + return -ENOMEM;
> > +}
> > +
> > +static int
> > +drm_gpuva_sm_step_map(struct drm_gpuva_op *__op, void *priv)
> > +{
> > + return drm_gpuva_sm_step(__op, NULL, priv);
> > +}
> > +
> > +static const struct drm_gpuva_fn_ops gpuva_list_ops = {
> > + .sm_step_map = drm_gpuva_sm_step_map,
> > + .sm_step_remap = drm_gpuva_sm_step,
> > + .sm_step_unmap = drm_gpuva_sm_step,
> > +};
> > +
> > +/**
> > + * drm_gpuva_sm_map_ops_create - creates the &drm_gpuva_ops to split and merge
> > + * @mgr: the &drm_gpuva_manager representing the GPU VA space
> > + * @req_addr: the start address of the new mapping
> > + * @req_range: the range of the new mapping
> > + * @req_obj: the &drm_gem_object to map
> > + * @req_offset: the offset within the &drm_gem_object
> > + *
> > + * This function creates a list of operations to perform splitting and merging
> > + * of existent mapping(s) with the newly requested one.
> > + *
> > + * The list can be iterated with &drm_gpuva_for_each_op and must be processed
> > + * in the given order. It can contain map, unmap and remap operations, but it
> > + * also can be empty if no operation is required, e.g. if the requested mapping
> > + * already exists in the exact same way.
> > + *
> > + * There can be an arbitrary amount of unmap operations, a maximum of two remap
> > + * operations and a single map operation. The latter one represents the original
> > + * map operation requested by the caller.
> > + *
> > + * Note that before calling this function again with another mapping request it
> > + * is necessary to update the &drm_gpuva_manager's view of the GPU VA space. The
> > + * previously obtained operations must be either processed or abandoned. To
> > + * update the &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
> > + * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
> > + * used.
> > + *
> > + * After the caller finished processing the returned &drm_gpuva_ops, they must
> > + * be freed with &drm_gpuva_ops_free.
> > + *
> > + * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
> > + */
> > +struct drm_gpuva_ops *
> > +drm_gpuva_sm_map_ops_create(struct drm_gpuva_manager *mgr,
> > + u64 req_addr, u64 req_range,
> > + struct drm_gem_object *req_obj, u64 req_offset)
> > +{
> > + struct drm_gpuva_ops *ops;
> > + struct {
> > + struct drm_gpuva_manager *mgr;
> > + struct drm_gpuva_ops *ops;
> > + } args;
> > + int ret;
> > +
> > + ops = kzalloc(sizeof(*ops), GFP_KERNEL);
> > + if (unlikely(!ops))
> > + return ERR_PTR(-ENOMEM);
> > +
> > + INIT_LIST_HEAD(&ops->list);
> > +
> > + args.mgr = mgr;
> > + args.ops = ops;
> > +
> > + ret = __drm_gpuva_sm_map(mgr, &gpuva_list_ops, &args,
> > + req_addr, req_range,
> > + req_obj, req_offset);
> > + if (ret)
> > + goto err_free_ops;
> > +
> > + return ops;
> > +
> > +err_free_ops:
> > + drm_gpuva_ops_free(mgr, ops);
> > + return ERR_PTR(ret);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_sm_map_ops_create);
> > +
> > +/**
> > + * drm_gpuva_sm_unmap_ops_create - creates the &drm_gpuva_ops to split on unmap
> > + * @mgr: the &drm_gpuva_manager representing the GPU VA space
> > + * @req_addr: the start address of the range to unmap
> > + * @req_range: the range of the mappings to unmap
> > + *
> > + * This function creates a list of operations to perform unmapping and, if
> > + * required, splitting of the mappings overlapping the unmap range.
> > + *
> > + * The list can be iterated with &drm_gpuva_for_each_op and must be processed
> > + * in the given order. It can contain unmap and remap operations, depending on
> > + * whether there are actual overlapping mappings to split.
> > + *
> > + * There can be an arbitrary amount of unmap operations and a maximum of two
> > + * remap operations.
> > + *
> > + * Note that before calling this function again with another range to unmap it
> > + * is necessary to update the &drm_gpuva_manager's view of the GPU VA space. The
> > + * previously obtained operations must be processed or abandoned. To update the
> > + * &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
> > + * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
> > + * used.
> > + *
> > + * After the caller finished processing the returned &drm_gpuva_ops, they must
> > + * be freed with &drm_gpuva_ops_free.
> > + *
> > + * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
> > + */
> > +struct drm_gpuva_ops *
> > +drm_gpuva_sm_unmap_ops_create(struct drm_gpuva_manager *mgr,
> > + u64 req_addr, u64 req_range)
> > +{
> > + struct drm_gpuva_ops *ops;
> > + struct {
> > + struct drm_gpuva_manager *mgr;
> > + struct drm_gpuva_ops *ops;
> > + } args;
> > + int ret;
> > +
> > + ops = kzalloc(sizeof(*ops), GFP_KERNEL);
> > + if (unlikely(!ops))
> > + return ERR_PTR(-ENOMEM);
> > +
> > + INIT_LIST_HEAD(&ops->list);
> > +
> > + args.mgr = mgr;
> > + args.ops = ops;
> > +
> > + ret = __drm_gpuva_sm_unmap(mgr, &gpuva_list_ops, &args,
> > + req_addr, req_range);
> > + if (ret)
> > + goto err_free_ops;
> > +
> > + return ops;
> > +
> > +err_free_ops:
> > + drm_gpuva_ops_free(mgr, ops);
> > + return ERR_PTR(ret);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_sm_unmap_ops_create);
> > +
> > +/**
> > + * drm_gpuva_prefetch_ops_create - creates the &drm_gpuva_ops to prefetch
> > + * @mgr: the &drm_gpuva_manager representing the GPU VA space
> > + * @addr: the start address of the range to prefetch
> > + * @range: the range of the mappings to prefetch
> > + *
> > + * This function creates a list of operations to perform prefetching.
> > + *
> > + * The list can be iterated with &drm_gpuva_for_each_op and must be processed
> > + * in the given order. It can contain prefetch operations.
> > + *
> > + * There can be an arbitrary amount of prefetch operations.
> > + *
> > + * After the caller finished processing the returned &drm_gpuva_ops, they must
> > + * be freed with &drm_gpuva_ops_free.
> > + *
> > + * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
> > + */
> > +struct drm_gpuva_ops *
> > +drm_gpuva_prefetch_ops_create(struct drm_gpuva_manager *mgr,
> > + u64 addr, u64 range)
> > +{
> > + DRM_GPUVA_ITER(it, mgr, addr);
> > + struct drm_gpuva_ops *ops;
> > + struct drm_gpuva_op *op;
> > + struct drm_gpuva *va;
> > + int ret;
> > +
> > + ops = kzalloc(sizeof(*ops), GFP_KERNEL);
> > + if (!ops)
> > + return ERR_PTR(-ENOMEM);
> > +
> > + INIT_LIST_HEAD(&ops->list);
> > +
> > + drm_gpuva_iter_for_each_range(va, it, addr + range) {
> > + op = gpuva_op_alloc(mgr);
> > + if (!op) {
> > + ret = -ENOMEM;
> > + goto err_free_ops;
> > + }
> > +
> > + op->op = DRM_GPUVA_OP_PREFETCH;
> > + op->prefetch.va = va;
> > + list_add_tail(&op->entry, &ops->list);
> > + }
> > +
> > + return ops;
> > +
> > +err_free_ops:
> > + drm_gpuva_ops_free(mgr, ops);
> > + return ERR_PTR(ret);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_prefetch_ops_create);
> > +
> > +/**
> > + * drm_gpuva_gem_unmap_ops_create - creates the &drm_gpuva_ops to unmap a GEM
> > + * @mgr: the &drm_gpuva_manager representing the GPU VA space
> > + * @obj: the &drm_gem_object to unmap
> > + *
> > + * This function creates a list of operations to perform unmapping for every
> > + * GPUVA attached to a GEM.
> > + *
> > + * The list can be iterated with &drm_gpuva_for_each_op and consists of an
> > + * arbitrary amount of unmap operations.
> > + *
> > + * After the caller finished processing the returned &drm_gpuva_ops, they must
> > + * be freed with &drm_gpuva_ops_free.
> > + *
> > + * It is the caller's responsibility to protect the GEM's GPUVA list against
> > + * concurrent access.
> > + *
> > + * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
> > + */
> > +struct drm_gpuva_ops *
> > +drm_gpuva_gem_unmap_ops_create(struct drm_gpuva_manager *mgr,
> > + struct drm_gem_object *obj)
> > +{
> > + struct drm_gpuva_ops *ops;
> > + struct drm_gpuva_op *op;
> > + struct drm_gpuva *va;
> > + int ret;
> > +
> > + ops = kzalloc(sizeof(*ops), GFP_KERNEL);
> > + if (!ops)
> > + return ERR_PTR(-ENOMEM);
> > +
> > + INIT_LIST_HEAD(&ops->list);
> > +
> > + drm_gem_for_each_gpuva(va, obj) {
> > + op = gpuva_op_alloc(mgr);
> > + if (!op) {
> > + ret = -ENOMEM;
> > + goto err_free_ops;
> > + }
> > +
> > + op->op = DRM_GPUVA_OP_UNMAP;
> > + op->unmap.va = va;
> > + list_add_tail(&op->entry, &ops->list);
> > + }
> > +
> > + return ops;
> > +
> > +err_free_ops:
> > + drm_gpuva_ops_free(mgr, ops);
> > + return ERR_PTR(ret);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_gem_unmap_ops_create);
> > +
> > +
> > +/**
> > + * drm_gpuva_ops_free - free the given &drm_gpuva_ops
> > + * @mgr: the &drm_gpuva_manager the ops were created for
> > + * @ops: the &drm_gpuva_ops to free
> > + *
> > + * Frees the given &drm_gpuva_ops structure including all the ops associated
> > + * with it.
> > + */
> > +void
> > +drm_gpuva_ops_free(struct drm_gpuva_manager *mgr,
> > + struct drm_gpuva_ops *ops)
> > +{
> > + struct drm_gpuva_op *op, *next;
> > +
> > + drm_gpuva_for_each_op_safe(op, next, ops) {
> > + list_del(&op->entry);
> > +
> > + if (op->op == DRM_GPUVA_OP_REMAP) {
> > + kfree(op->remap.prev);
> > + kfree(op->remap.next);
> > + kfree(op->remap.unmap);
> > + }
> > +
> > + gpuva_op_free(mgr, op);
> > + }
> > +
> > + kfree(ops);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_ops_free);
> > diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> > index b419c59c4bef..b6e22f66c3fd 100644
> > --- a/include/drm/drm_drv.h
> > +++ b/include/drm/drm_drv.h
> > @@ -104,6 +104,12 @@ enum drm_driver_feature {
> > * acceleration should be handled by two drivers that are connected using auxiliary bus.
> > */
> > DRIVER_COMPUTE_ACCEL = BIT(7),
> > + /**
> > + * @DRIVER_GEM_GPUVA:
> > + *
> > + * Driver supports user defined GPU VA bindings for GEM objects.
> > + */
> > + DRIVER_GEM_GPUVA = BIT(8),
> >
> > /* IMPORTANT: Below are all the legacy flags, add new ones above. */
> >
> > diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> > index b8efd836edef..f2782f55b7e7 100644
> > --- a/include/drm/drm_gem.h
> > +++ b/include/drm/drm_gem.h
> > @@ -36,6 +36,8 @@
> >
> > #include <linux/kref.h>
> > #include <linux/dma-resv.h>
> > +#include <linux/list.h>
> > +#include <linux/mutex.h>
> >
> > #include <drm/drm_vma_manager.h>
> >
> > @@ -347,6 +349,17 @@ struct drm_gem_object {
> > */
> > struct dma_resv _resv;
> >
> > + /**
> > + * @gpuva:
> > + *
> > + * Provides the list and list mutex of GPU VAs attached to this
> > + * GEM object.
> > + */
> > + struct {
> > + struct list_head list;
> > + struct mutex mutex;
> > + } gpuva;
> > +
> > /**
> > * @funcs:
> > *
> > @@ -494,4 +507,66 @@ unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru,
> >
> > int drm_gem_evict(struct drm_gem_object *obj);
> >
> > +/**
> > + * drm_gem_gpuva_init - initialize the gpuva list of a GEM object
> > + * @obj: the &drm_gem_object
> > + *
> > + * This initializes the &drm_gem_object's &drm_gpuva list and the mutex
> > + * protecting it.
> > + *
> > + * Calling this function is only necessary for drivers intending to support the
> > + * &drm_driver_feature DRIVER_GEM_GPUVA.
> > + */
> > +static inline void drm_gem_gpuva_init(struct drm_gem_object *obj)
> > +{
> > + INIT_LIST_HEAD(&obj->gpuva.list);
> > + mutex_init(&obj->gpuva.mutex);
> > +}
> > +
> > +/**
> > + * drm_gem_gpuva_lock - lock the GEM's gpuva list mutex
> > + * @obj: the &drm_gem_object
> > + *
> > + * This locks the mutex protecting the &drm_gem_object's &drm_gpuva list.
> > + */
> > +static inline void drm_gem_gpuva_lock(struct drm_gem_object *obj)
> > +{
> > + mutex_lock(&obj->gpuva.mutex);
> > +}
> > +
> > +/**
> > + * drm_gem_gpuva_unlock - unlock the GEM's gpuva list mutex
> > + * @obj: the &drm_gem_object
> > + *
> > + * This unlocks the mutex protecting the &drm_gem_object's &drm_gpuva list.
> > + */
> > +static inline void drm_gem_gpuva_unlock(struct drm_gem_object *obj)
> > +{
> > + mutex_unlock(&obj->gpuva.mutex);
> > +}
> > +
> > +/**
> > + * drm_gem_for_each_gpuva - iterator to walk over a list of gpuvas
> > + * @entry: &drm_gpuva structure to assign to in each iteration step
> > + * @obj: the &drm_gem_object the &drm_gpuvas to walk are associated with
> > + *
> > + * This iterator walks over all &drm_gpuva structures associated with the
> > + * &drm_gem_object.
> > + */
> > +#define drm_gem_for_each_gpuva(entry__, obj__) \
> > + list_for_each_entry(entry__, &(obj__)->gpuva.list, gem.entry)
> > +
> > +/**
> > + * drm_gem_for_each_gpuva_safe - iterator to safely walk over a list of gpuvas
> > + * @entry: &drm_gpuva structure to assign to in each iteration step
> > + * @next: &drm_gpuva to store the next step
> > + * @obj: the &drm_gem_object the &drm_gpuvas to walk are associated with
> > + *
> > + * This iterator walks over all &drm_gpuva structures associated with the
> > + * &drm_gem_object. It is implemented with list_for_each_entry_safe(), hence
> > + * it is safe against removal of elements.
> > + */
> > +#define drm_gem_for_each_gpuva_safe(entry__, next__, obj__) \
> > + list_for_each_entry_safe(entry__, next__, &(obj__)->gpuva.list, gem.entry)
> > +
> > #endif /* __DRM_GEM_H__ */
> > diff --git a/include/drm/drm_gpuva_mgr.h b/include/drm/drm_gpuva_mgr.h
> > new file mode 100644
> > index 000000000000..b52ac2d00d12
> > --- /dev/null
> > +++ b/include/drm/drm_gpuva_mgr.h
> > @@ -0,0 +1,681 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +#ifndef __DRM_GPUVA_MGR_H__
> > +#define __DRM_GPUVA_MGR_H__
> > +
> > +/*
> > + * Copyright (c) 2022 Red Hat.
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > + * OTHER DEALINGS IN THE SOFTWARE.
> > + */
> > +
> > +#include <linux/maple_tree.h>
> > +#include <linux/mm.h>
> > +#include <linux/rbtree.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/types.h>
> > +
> > +struct drm_gpuva_manager;
> > +struct drm_gpuva_fn_ops;
> > +struct drm_gpuva_prealloc;
> > +
> > +/**
> > + * enum drm_gpuva_flags - flags for struct drm_gpuva
> > + */
> > +enum drm_gpuva_flags {
> > + /**
> > + * @DRM_GPUVA_EVICTED:
> > + *
> > + * Flag indicating that the &drm_gpuva's backing GEM is evicted.
> > + */
> > + DRM_GPUVA_EVICTED = (1 << 0),
> > +
> > + /**
> > + * @DRM_GPUVA_SPARSE:
> > + *
> > + * Flag indicating that the &drm_gpuva is a sparse mapping.
> > + */
> > + DRM_GPUVA_SPARSE = (1 << 1),
> > +
> > + /**
> > + * @DRM_GPUVA_USERBITS: user defined bits
> > + */
> > + DRM_GPUVA_USERBITS = (1 << 2),
> > +};
> > +
> > +/**
> > + * struct drm_gpuva - structure to track a GPU VA mapping
> > + *
> > + * This structure represents a GPU VA mapping and is associated with a
> > + * &drm_gpuva_manager.
> > + *
> > + * Typically, this structure is embedded in bigger driver structures.
> > + */
> > +struct drm_gpuva {
> > + /**
> > + * @mgr: the &drm_gpuva_manager this object is associated with
> > + */
> > + struct drm_gpuva_manager *mgr;
> > +
> > + /**
> > + * @flags: the &drm_gpuva_flags for this mapping
> > + */
> > + enum drm_gpuva_flags flags;
> > +
> > + /**
> > + * @va: structure containing the address and range of the &drm_gpuva
> > + */
> > + struct {
> > + /**
> > + * @addr: the start address
> > + */
> > + u64 addr;
> > +
> > + /**
> > + * @range: the range
> > + */
> > + u64 range;
> > + } va;
> > +
> > + /**
> > + * @gem: structure containing the &drm_gem_object and its offset
> > + */
> > + struct {
> > + /**
> > + * @offset: the offset within the &drm_gem_object
> > + */
> > + u64 offset;
> > +
> > + /**
> > + * @obj: the mapped &drm_gem_object
> > + */
> > + struct drm_gem_object *obj;
> > +
> > + /**
> > + * @entry: the &list_head to attach this object to a &drm_gem_object
> > + */
> > + struct list_head entry;
> > + } gem;
> > +};
> > +
> > +void drm_gpuva_link(struct drm_gpuva *va);
> > +void drm_gpuva_unlink(struct drm_gpuva *va);
> > +
> > +int drm_gpuva_insert(struct drm_gpuva_manager *mgr,
> > + struct drm_gpuva *va);
> > +int drm_gpuva_insert_prealloc(struct drm_gpuva_manager *mgr,
> > + struct drm_gpuva_prealloc *pa,
> > + struct drm_gpuva *va);
> > +void drm_gpuva_remove(struct drm_gpuva *va);
> > +
> > +struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager *mgr,
> > + u64 addr, u64 range);
> > +struct drm_gpuva *drm_gpuva_find_first(struct drm_gpuva_manager *mgr,
> > + u64 addr, u64 range);
> > +struct drm_gpuva *drm_gpuva_find_prev(struct drm_gpuva_manager *mgr, u64 start);
> > +struct drm_gpuva *drm_gpuva_find_next(struct drm_gpuva_manager *mgr, u64 end);
> > +
> > +bool drm_gpuva_interval_empty(struct drm_gpuva_manager *mgr, u64 addr, u64 range);
> > +
> > +/**
> > + * drm_gpuva_evict - sets whether the backing GEM of this &drm_gpuva is evicted
> > + * @va: the &drm_gpuva to set the evict flag for
> > + * @evict: indicates whether the &drm_gpuva is evicted
> > + */
> > +static inline void drm_gpuva_evict(struct drm_gpuva *va, bool evict)
> > +{
> > + if (evict)
> > + va->flags |= DRM_GPUVA_EVICTED;
> > + else
> > + va->flags &= ~DRM_GPUVA_EVICTED;
> > +}
> > +
> > +/**
> > + * drm_gpuva_evicted - indicates whether the backing BO of this &drm_gpuva
> > + * is evicted
> > + * @va: the &drm_gpuva to check
> > + */
> > +static inline bool drm_gpuva_evicted(struct drm_gpuva *va)
> > +{
> > + return va->flags & DRM_GPUVA_EVICTED;
> > +}
> > +
> > +/**
> > + * struct drm_gpuva_manager - DRM GPU VA Manager
> > + *
> > + * The DRM GPU VA Manager keeps track of a GPU's virtual address space by using
> > + * &maple_tree structures. Typically, this structure is embedded in bigger
> > + * driver structures.
> > + *
> > + * Drivers can pass addresses and ranges in an arbitrary unit, e.g. bytes or
> > + * pages.
> > + *
> > + * There should be one manager instance per GPU virtual address space.
> > + */
> > +struct drm_gpuva_manager {
> > + /**
> > + * @name: the name of the DRM GPU VA space
> > + */
> > + const char *name;
> > +
> > + /**
> > + * @mm_start: start of the VA space
> > + */
> > + u64 mm_start;
> > +
> > + /**
> > + * @mm_range: length of the VA space
> > + */
> > + u64 mm_range;
> > +
> > + /**
> > + * @mtree: the &maple_tree to track GPU VA mappings
> > + */
> > + struct maple_tree mtree;
> > +
> > + /**
> > + * @kernel_alloc_node:
> > + *
> > + * &drm_gpuva representing the address space cutout reserved for
> > + * the kernel
> > + */
> > + struct drm_gpuva kernel_alloc_node;
> > +
> > + /**
> > + * @ops: &drm_gpuva_fn_ops providing the split/merge steps to drivers
> > + */
> > + const struct drm_gpuva_fn_ops *ops;
> > +};
> > +
> > +void drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
> > + const char *name,
> > + u64 start_offset, u64 range,
> > + u64 reserve_offset, u64 reserve_range,
> > + const struct drm_gpuva_fn_ops *ops);
> > +void drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr);
> > +
> > +/**
> > + * struct drm_gpuva_prealloc - holds a preallocated node for the
> > + * &drm_gpuva_manager to insert a single new entry
> > + */
> > +struct drm_gpuva_prealloc {
> > + /**
> > + * @mas: the maple tree advanced state
> > + */
> > + struct ma_state mas;
> > +};
> > +
> > +struct drm_gpuva_prealloc * drm_gpuva_prealloc_create(struct drm_gpuva_manager *mgr);
> > +void drm_gpuva_prealloc_destroy(struct drm_gpuva_prealloc *pa);
> > +
> > +/**
> > + * struct drm_gpuva_iterator - iterator for walking the internal (maple) tree
> > + */
> > +struct drm_gpuva_iterator {
> > + /**
> > + * @mas: the maple tree advanced state
> > + */
> > + struct ma_state mas;
> > +
> > + /**
> > + * @mgr: the &drm_gpuva_manager to iterate
> > + */
> > + struct drm_gpuva_manager *mgr;
> > +};
> > +typedef struct drm_gpuva_iterator * drm_gpuva_state_t;
> > +
> > +void drm_gpuva_iter_remove(struct drm_gpuva_iterator *it);
> > +int drm_gpuva_iter_va_replace(struct drm_gpuva_iterator *it,
> > + struct drm_gpuva *va);
> > +
> > +static inline struct drm_gpuva *
> > +drm_gpuva_iter_find(struct drm_gpuva_iterator *it, unsigned long max)
> > +{
> > + struct drm_gpuva *va;
> > +
> > + mas_lock(&it->mas);
>
> This is the write lock, if you can have more than one reader then use
> rcu_read_lock() and friends. You can also probably use mt_find() to
> handle the locking here?
>

Calls to this function should either be protected by an external lock or be
serialized anyway. I only got those locks here to make lockdep happy. Hence,
I think there should not be much of a difference. However, I will change read
only sections to use rcu_read_lock(), even if it's just to be a better example.
:-)

> > + va = mas_find(&it->mas, max);
> > + mas_unlock(&it->mas);
> > +
> > + return va;
> > +}
> > +
> > +/**
> > + * DRM_GPUVA_ITER - create an iterator structure to iterate the &drm_gpuva tree
> > + * @name: the name of the &drm_gpuva_iterator to create
> > + * @mgr__: the &drm_gpuva_manager to iterate
> > + * @start: starting offset, the first entry will overlap this
> > + */
> > +#define DRM_GPUVA_ITER(name, mgr__, start) \
> > + struct drm_gpuva_iterator name = { \
> > + .mas = MA_STATE_INIT(&(mgr__)->mtree, start, 0), \
> > + .mgr = mgr__, \
> > + }
> > +
> > +/**
> > + * drm_gpuva_iter_for_each_range - iterator to walk over a range of entries
> > + * @va__: the &drm_gpuva found for the current iteration
> > + * @it__: &drm_gpuva_iterator structure to assign to in each iteration step
> > + * @end__: ending offset, the last entry will start before this (but may overlap)
> > + *
> > + * This function can be used to iterate &drm_gpuva objects.
> > + *
> > + * It is safe against the removal of elements using &drm_gpuva_iter_remove,
> > + * however it is not safe against the removal of elements using
> > + * &drm_gpuva_remove.
> > + */
> > +#define drm_gpuva_iter_for_each_range(va__, it__, end__) \
> > + while (((va__) = drm_gpuva_iter_find(&(it__), (end__) - 1)))
> > +
> > +/**
> > + * drm_gpuva_iter_for_each - iterator to walk over all existing entries
> > + * @va__: the &drm_gpuva found for the current iteration
> > + * @it__: &drm_gpuva_iterator structure to assign to in each iteration step
> > + *
> > + * This function can be used to iterate &drm_gpuva objects.
> > + *
> > + * In order to walk over all potentially existing entries, the
> > + * &drm_gpuva_iterator must be initialized to start at
> > + * &drm_gpuva_manager->mm_start or simply 0.
> > + *
> > + * It is safe against the removal of elements using &drm_gpuva_iter_remove,
> > + * however it is not safe against the removal of elements using
> > + * &drm_gpuva_remove.
> > + */
> > +#define drm_gpuva_iter_for_each(va__, it__) \
> > + drm_gpuva_iter_for_each_range(va__, it__, (it__).mgr->mm_start + (it__).mgr->mm_range)
> > +
> > +/**
> > + * enum drm_gpuva_op_type - GPU VA operation type
> > + *
> > + * Operations to alter the GPU VA mappings tracked by the &drm_gpuva_manager.
> > + */
> > +enum drm_gpuva_op_type {
> > + /**
> > + * @DRM_GPUVA_OP_MAP: the map op type
> > + */
> > + DRM_GPUVA_OP_MAP,
> > +
> > + /**
> > + * @DRM_GPUVA_OP_REMAP: the remap op type
> > + */
> > + DRM_GPUVA_OP_REMAP,
> > +
> > + /**
> > + * @DRM_GPUVA_OP_UNMAP: the unmap op type
> > + */
> > + DRM_GPUVA_OP_UNMAP,
> > +
> > + /**
> > + * @DRM_GPUVA_OP_PREFETCH: the prefetch op type
> > + */
> > + DRM_GPUVA_OP_PREFETCH,
> > +};
> > +
> > +/**
> > + * struct drm_gpuva_op_map - GPU VA map operation
> > + *
> > + * This structure represents a single map operation generated by the
> > + * DRM GPU VA manager.
> > + */
> > +struct drm_gpuva_op_map {
> > + /**
> > + * @va: structure containing address and range of a map
> > + * operation
> > + */
> > + struct {
> > + /**
> > + * @addr: the base address of the new mapping
> > + */
> > + u64 addr;
> > +
> > + /**
> > + * @range: the range of the new mapping
> > + */
> > + u64 range;
> > + } va;
> > +
> > + /**
> > + * @gem: structure containing the &drm_gem_object and its offset
> > + */
> > + struct {
> > + /**
> > + * @offset: the offset within the &drm_gem_object
> > + */
> > + u64 offset;
> > +
> > + /**
> > + * @obj: the &drm_gem_object to map
> > + */
> > + struct drm_gem_object *obj;
> > + } gem;
> > +};
> > +
> > +/**
> > + * struct drm_gpuva_op_unmap - GPU VA unmap operation
> > + *
> > + * This structure represents a single unmap operation generated by the
> > + * DRM GPU VA manager.
> > + */
> > +struct drm_gpuva_op_unmap {
> > + /**
> > + * @va: the &drm_gpuva to unmap
> > + */
> > + struct drm_gpuva *va;
> > +
> > + /**
> > + * @keep:
> > + *
> > + * Indicates whether this &drm_gpuva is physically contiguous with the
> > + * original mapping request.
> > + *
> > + * Optionally, if &keep is set, drivers may keep the actual page table
> > + * mappings for this &drm_gpuva, adding the missing page table entries
> > + * only and update the &drm_gpuva_manager accordingly.
> > + */
> > + bool keep;
> > +};
> > +
> > +/**
> > + * struct drm_gpuva_op_remap - GPU VA remap operation
> > + *
> > + * This represents a single remap operation generated by the DRM GPU VA manager.
> > + *
> > + * A remap operation is generated when an existing GPU VA mapping is split up
> > + * by inserting a new GPU VA mapping or by partially unmapping existent
> > + * mapping(s), hence it consists of a maximum of two map and one unmap
> > + * operation.
> > + *
> > + * The @unmap operation takes care of removing the original existing mapping.
> > + * @prev is used to remap the preceding part, @next the subsequent part.
> > + *
> > + * If either a new mapping's start address is aligned with the start address
> > + * of the old mapping or the new mapping's end address is aligned with the
> > + * end address of the old mapping, either @prev or @next is NULL.
> > + *
> > + * Note, the reason for a dedicated remap operation, rather than arbitrary
> > + * unmap and map operations, is to give drivers the chance of extracting driver
> > + * specific data for creating the new mappings from the unmap operation's
> > + * &drm_gpuva structure which typically is embedded in larger driver specific
> > + * structures.
> > + */
> > +struct drm_gpuva_op_remap {
> > + /**
> > + * @prev: the preceding part of a split mapping
> > + */
> > + struct drm_gpuva_op_map *prev;
> > +
> > + /**
> > + * @next: the subsequent part of a split mapping
> > + */
> > + struct drm_gpuva_op_map *next;
> > +
> > + /**
> > + * @unmap: the unmap operation for the original existing mapping
> > + */
> > + struct drm_gpuva_op_unmap *unmap;
> > +};
> > +
> > +/**
> > + * struct drm_gpuva_op_prefetch - GPU VA prefetch operation
> > + *
> > + * This structure represents a single prefetch operation generated by the
> > + * DRM GPU VA manager.
> > + */
> > +struct drm_gpuva_op_prefetch {
> > + /**
> > + * @va: the &drm_gpuva to prefetch
> > + */
> > + struct drm_gpuva *va;
> > +};
> > +
> > +/**
> > + * struct drm_gpuva_op - GPU VA operation
> > + *
> > + * This structure represents a single generic operation.
> > + *
> > + * The particular type of the operation is defined by @op.
> > + */
> > +struct drm_gpuva_op {
> > + /**
> > + * @entry:
> > + *
> > + * The &list_head used to distribute instances of this struct within
> > + * &drm_gpuva_ops.
> > + */
> > + struct list_head entry;
> > +
> > + /**
> > + * @op: the type of the operation
> > + */
> > + enum drm_gpuva_op_type op;
> > +
> > + union {
> > + /**
> > + * @map: the map operation
> > + */
> > + struct drm_gpuva_op_map map;
> > +
> > + /**
> > + * @remap: the remap operation
> > + */
> > + struct drm_gpuva_op_remap remap;
> > +
> > + /**
> > + * @unmap: the unmap operation
> > + */
> > + struct drm_gpuva_op_unmap unmap;
> > +
> > + /**
> > + * @prefetch: the prefetch operation
> > + */
> > + struct drm_gpuva_op_prefetch prefetch;
> > + };
> > +};
> > +
> > +/**
> > + * struct drm_gpuva_ops - wraps a list of &drm_gpuva_op
> > + */
> > +struct drm_gpuva_ops {
> > + /**
> > + * @list: the &list_head
> > + */
> > + struct list_head list;
> > +};
> > +
> > +/**
> > + * drm_gpuva_for_each_op - iterator to walk over &drm_gpuva_ops
> > + * @op: &drm_gpuva_op to assign in each iteration step
> > + * @ops: &drm_gpuva_ops to walk
> > + *
> > + * This iterator walks over all ops within a given list of operations.
> > + */
> > +#define drm_gpuva_for_each_op(op, ops) list_for_each_entry(op, &(ops)->list, entry)
> > +
> > +/**
> > + * drm_gpuva_for_each_op_safe - iterator to safely walk over &drm_gpuva_ops
> > + * @op: &drm_gpuva_op to assign in each iteration step
> > + * @next: &drm_gpuva_op to store the next step
> > + * @ops: &drm_gpuva_ops to walk
> > + *
> > + * This iterator walks over all ops within a given list of operations. It is
> > + * implemented with list_for_each_entry_safe(), so safe against removal of elements.
> > + */
> > +#define drm_gpuva_for_each_op_safe(op, next, ops) \
> > + list_for_each_entry_safe(op, next, &(ops)->list, entry)
> > +
> > +/**
> > + * drm_gpuva_for_each_op_from_reverse - iterate backwards from the given point
> > + * @op: &drm_gpuva_op to assign in each iteration step
> > + * @ops: &drm_gpuva_ops to walk
> > + *
> > + * This iterator walks over all ops within a given list of operations beginning
> > + * from the given operation in reverse order.
> > + */
> > +#define drm_gpuva_for_each_op_from_reverse(op, ops) \
> > + list_for_each_entry_from_reverse(op, &(ops)->list, entry)
> > +
> > +/**
> > + * drm_gpuva_first_op - returns the first &drm_gpuva_op from &drm_gpuva_ops
> > + * @ops: the &drm_gpuva_ops to get the first &drm_gpuva_op from
> > + */
> > +#define drm_gpuva_first_op(ops) \
> > + list_first_entry(&(ops)->list, struct drm_gpuva_op, entry)
> > +
> > +/**
> > + * drm_gpuva_last_op - returns the last &drm_gpuva_op from &drm_gpuva_ops
> > + * @ops: the &drm_gpuva_ops to get the last &drm_gpuva_op from
> > + */
> > +#define drm_gpuva_last_op(ops) \
> > + list_last_entry(&(ops)->list, struct drm_gpuva_op, entry)
> > +
> > +/**
> > + * drm_gpuva_prev_op - previous &drm_gpuva_op in the list
> > + * @op: the current &drm_gpuva_op
> > + */
> > +#define drm_gpuva_prev_op(op) list_prev_entry(op, entry)
> > +
> > +/**
> > + * drm_gpuva_next_op - next &drm_gpuva_op in the list
> > + * @op: the current &drm_gpuva_op
> > + */
> > +#define drm_gpuva_next_op(op) list_next_entry(op, entry)
> > +
> > +struct drm_gpuva_ops *
> > +drm_gpuva_sm_map_ops_create(struct drm_gpuva_manager *mgr,
> > + u64 addr, u64 range,
> > + struct drm_gem_object *obj, u64 offset);
> > +struct drm_gpuva_ops *
> > +drm_gpuva_sm_unmap_ops_create(struct drm_gpuva_manager *mgr,
> > + u64 addr, u64 range);
> > +
> > +struct drm_gpuva_ops *
> > +drm_gpuva_prefetch_ops_create(struct drm_gpuva_manager *mgr,
> > + u64 addr, u64 range);
> > +
> > +struct drm_gpuva_ops *
> > +drm_gpuva_gem_unmap_ops_create(struct drm_gpuva_manager *mgr,
> > + struct drm_gem_object *obj);
> > +
> > +void drm_gpuva_ops_free(struct drm_gpuva_manager *mgr,
> > + struct drm_gpuva_ops *ops);
> > +
> > +/**
> > + * struct drm_gpuva_fn_ops - callbacks for split/merge steps
> > + *
> > + * This structure defines the callbacks used by &drm_gpuva_sm_map and
> > + * &drm_gpuva_sm_unmap to provide the split/merge steps for map and unmap
> > + * operations to drivers.
> > + */
> > +struct drm_gpuva_fn_ops {
> > + /**
> > + * @op_alloc: called when the &drm_gpuva_manager allocates
> > + * a struct drm_gpuva_op
> > + *
> > + * Some drivers may want to embed struct drm_gpuva_op into driver
> > + * specific structures. By implementing this callback drivers can
> > + * allocate memory accordingly.
> > + *
> > + * This callback is optional.
> > + */
> > + struct drm_gpuva_op *(*op_alloc)(void);
> > +
> > + /**
> > + * @op_free: called when the &drm_gpuva_manager frees a
> > + * struct drm_gpuva_op
> > + *
> > + * Some drivers may want to embed struct drm_gpuva_op into driver
> > + * specific structures. By implementing this callback drivers can
> > + * free the previously allocated memory accordingly.
> > + *
> > + * This callback is optional.
> > + */
> > + void (*op_free)(struct drm_gpuva_op *op);
> > +
> > + /**
> > + * @sm_step_map: called from &drm_gpuva_sm_map to finally insert the
> > + * mapping once all previous steps were completed
> > + *
> > + * The &priv pointer matches the one the driver passed to
> > + * &drm_gpuva_sm_map or &drm_gpuva_sm_unmap, respectively.
> > + *
> > + * Can be NULL if &drm_gpuva_sm_map is used.
> > + */
> > + int (*sm_step_map)(struct drm_gpuva_op *op, void *priv);
> > +
> > + /**
> > + * @sm_step_remap: called from &drm_gpuva_sm_map and
> > + * &drm_gpuva_sm_unmap to split up an existent mapping
> > + *
> > + * This callback is called when an existent mapping needs to be split up.
> > + * This is the case when either a newly requested mapping overlaps or
> > + * is enclosed by an existent mapping or a partial unmap of an existent
> > + * mapping is requested.
> > + *
> > + * Drivers must not modify the GPUVA space with accessors that do not
> > + * take a &drm_gpuva_state as argument from this callback.
> > + *
> > + * The &priv pointer matches the one the driver passed to
> > + * &drm_gpuva_sm_map or &drm_gpuva_sm_unmap, respectively.
> > + *
> > + * Can be NULL if neither &drm_gpuva_sm_map nor &drm_gpuva_sm_unmap is
> > + * used.
> > + */
> > + int (*sm_step_remap)(struct drm_gpuva_op *op,
> > + drm_gpuva_state_t state,
> > + void *priv);
> > +
> > + /**
> > + * @sm_step_unmap: called from &drm_gpuva_sm_map and
> > + * &drm_gpuva_sm_unmap to unmap an existent mapping
> > + *
> > + * This callback is called when an existent mapping needs to be unmapped.
> > + * This is the case when either a newly requested mapping encloses an
> > + * existent mapping or an unmap of an existent mapping is requested.
> > + *
> > + * Drivers must not modify the GPUVA space with accessors that do not
> > + * take a &drm_gpuva_state as argument from this callback.
> > + *
> > + * The &priv pointer matches the one the driver passed to
> > + * &drm_gpuva_sm_map or &drm_gpuva_sm_unmap, respectively.
> > + *
> > + * Can be NULL if neither &drm_gpuva_sm_map nor &drm_gpuva_sm_unmap is
> > + * used.
> > + */
> > + int (*sm_step_unmap)(struct drm_gpuva_op *op,
> > + drm_gpuva_state_t state,
> > + void *priv);
> > +};
> > +
> > +int drm_gpuva_sm_map(struct drm_gpuva_manager *mgr, void *priv,
> > + u64 addr, u64 range,
> > + struct drm_gem_object *obj, u64 offset);
> > +
> > +int drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr, void *priv,
> > + u64 addr, u64 range);
> > +
> > +int drm_gpuva_map(struct drm_gpuva_manager *mgr,
> > + struct drm_gpuva_prealloc *pa,
> > + struct drm_gpuva *va);
> > +int drm_gpuva_remap(drm_gpuva_state_t state,
> > + struct drm_gpuva *prev,
> > + struct drm_gpuva *next);
> > +void drm_gpuva_unmap(drm_gpuva_state_t state);
> > +
> > +#endif /* __DRM_GPUVA_MGR_H__ */
> > --
> > 2.40.1
> >
>

2023-06-15 16:40:22

by Danilo Krummrich

Subject: Re: [PATCH drm-next v4 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

On 6/14/23 09:58, Donald Robson wrote:
> On Tue, 2023-06-13 at 16:20 +0200, Danilo Krummrich wrote:
>
>> I'm definitely up for improving the existing documentation. Anything in
>> particular you think should be described in more detail?
>>
>> - Danilo
>
> Hi Danilo,
>
> As I said, with inexperience it's possible I missed what I was
> looking for in the existing documentation, which is highly detailed
> in regard to how it deals with operations, but usage was where I fell
> down.
>
> If I understand there are three ways to use this, which are:
> 1) Using drm_gpuva_insert() and drm_gpuva_remove() directly using
> stack va objects.

What do you mean by stack va objects?

> 2) Using drm_gpuva_insert() and drm_gpuva_remove() in a callback
> context, after having created ops lists using
> drm_gpuva_sm_[un]map_ops_create().
> 3) Using drm_gpuva_[un]map() in callback context after having
> prealloced a node and va objects for map/remap function use,
> which must be forwarded in as the 'priv' argument to
> drm_gpuva_sm_[un]map().

Right, and I think it might be worth concretely mentioning this in the
documentation.
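
As a rough illustration of what the split steps behind 2) and 3) work out,
here is a small userspace model, not the manager's actual code. It follows
the remap description in the header: an existing mapping that is only
partially covered by a request is replaced by at most a preceding and a
subsequent remainder. All names (va_span, split_result, split_mapping) are
hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a mapping covering [addr, addr + range). */
struct va_span {
	uint64_t addr;
	uint64_t range;
};

/* Hypothetical result of splitting one existing mapping around a request. */
struct split_result {
	bool has_prev;		/* remainder of the old mapping before the request */
	bool has_next;		/* remainder of the old mapping after the request */
	struct va_span prev;
	struct va_span next;
};

/*
 * Split the existing mapping 'old' around the requested span 'req'.
 * Assumes both ranges are non-zero, the spans overlap and addr + range
 * does not overflow; callers have to check this themselves.
 */
static struct split_result split_mapping(struct va_span old, struct va_span req)
{
	struct split_result res = { 0 };
	uint64_t old_end = old.addr + old.range;
	uint64_t req_end = req.addr + req.range;

	assert(old.range && req.range);
	assert(req.addr < old_end && old.addr < req_end);

	if (req.addr > old.addr) {
		/* The old mapping starts first: keep its leading part. */
		res.has_prev = true;
		res.prev.addr = old.addr;
		res.prev.range = req.addr - old.addr;
	}

	if (req_end < old_end) {
		/* The old mapping ends last: keep its trailing part. */
		res.has_next = true;
		res.next.addr = req_end;
		res.next.range = old_end - req_end;
	}

	return res;
}
```

If neither remainder exists, the request encloses the old mapping entirely,
which corresponds to a plain unmap rather than a remap.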

>
> The first of these is pretty self-explanatory. The second was also
> fairly easy to understand, it has an example in your own driver, and
> since it takes care of allocs in drm_gpuva_sm_map_ops_create() it
> leads to pretty clean code too.
>
> The third case, which I am using in the new PowerVR driver did not
> have an example of usage and the approach is quite different to 2)
> in that you have to prealloc everything explicitly. I didn't realise
> this, so it led to a fair amount of frustration.

Yeah, I think it is not entirely obvious why this is the case. I
should maybe add a comment on how the callback way of using this
interface is motivated.

The requirement of pre-allocation arises out of two circumstances.
First, having a single callback for every drm_gpuva_op on the GPUVA
space implies that we're not allowed to fail the operation, because
processing the drm_gpuva_ops directly implies that we can't unwind them
on failure.

I know that the API functions the documentation guides you to use in
this case actually can return error codes, but those are just range
checks. If they fail, it's clearly a bug. However, I did not use WARN()
for those cases, since the driver could still decide to use the
callbacks to keep track of the operations in a driver specific way,
although I would not recommend doing this and rather like to try to
cover the driver's use case within the regular way of creating a list of
operations.

Second, most (other) drivers when using the callback way of this
interface would need to execute the GPUVA space updates asynchronously
in a dma_fence signalling critical path, where no memory allocations are
permitted.
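
The general pattern behind both points can be sketched in plain userspace C.
This is only a model of the idea, not the GPUVA manager API: everything that
can fail (here malloc()) happens in a reserve phase up front, and the later
phase only consumes what was reserved, so it never allocates. The names
(node, prealloc_pool, pool_reserve, pool_take, pool_release) are made up for
this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node; stands in for a tree node or a va object. */
struct node {
	unsigned long key;
	struct node *next;
};

/* Hypothetical pool of nodes that were allocated up front. */
struct prealloc_pool {
	struct node *free;
	size_t count;
};

/* Take one reserved node. Never allocates; NULL means too little was reserved. */
static struct node *pool_take(struct prealloc_pool *pool)
{
	struct node *nd = pool->free;

	if (!nd)
		return NULL;

	pool->free = nd->next;
	pool->count--;
	nd->next = NULL;
	return nd;
}

/* Free all nodes that were reserved but not used. */
static void pool_release(struct prealloc_pool *pool)
{
	struct node *nd;

	while ((nd = pool_take(pool)))
		free(nd);

	assert(pool->count == 0);
}

/*
 * Phase 1: reserve n nodes. This is the only step that may fail, so it
 * has to run in a context where allocating memory is still allowed.
 */
static bool pool_reserve(struct prealloc_pool *pool, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct node *nd = malloc(sizeof(*nd));

		if (!nd)
			return false;	/* caller unwinds with pool_release() */

		nd->key = 0;
		nd->next = pool->free;
		pool->free = nd;
		pool->count++;
	}

	return true;
}
```

A caller would reserve once per request, bail out cleanly if that fails, and
then treat an empty pool during the second phase as a bug rather than as a
recoverable error.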

>
> I think if you're willing, it would help inexperienced implementers a
> lot if there were some brief 'how to' snippets for each of the three
> use cases.

Yes, I can definitely add some.

>
> Thanks,
> Donald

2023-06-15 16:51:44

by Danilo Krummrich

Subject: Re: [PATCH drm-next v4 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

On 6/7/23 00:31, Danilo Krummrich wrote:

> Maple Tree:
> - Maple tree uses the 'unsigned long' type for node entries. While this
> works for 64bit, it's incompatible with the DRM GPUVA Manager on 32bit,
> since the DRM GPUVA Manager uses the u64 type and so do drivers using it.
> While it's questionable whether a 32bit kernel and a > 32bit GPU address
> space make any sense, it creates tons of compiler warnings when compiling
> for 32bit. Maybe it makes sense to expand the maple tree API to let users
> decide which size to pick - other ideas / proposals are welcome.

I remember you told me that the filesystem folks had some interest in a
64-bit maple tree for a 32-bit kernel as well. Is there any news or
plans for such a feature?

For the short term I'd probably add a feature flag to the GPUVA manager,
where drivers explicitly need to promise not to pass in addresses
exceeding 32-bit on a 32-bit kernel and, if they don't, refuse to
initialize the GPUVA manager on 32-bit kernels - or something similar...
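
To make the kind of check meant here a bit more concrete, below is a
userspace sketch, assuming the manager would simply validate a u64 range
against the largest index the tree can hold. The helper names (va_span_fits,
va_span_fits_ulong) are hypothetical and not part of the series.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* The comparison below relies on unsigned long being no wider than u64. */
static_assert(sizeof(unsigned long) <= sizeof(uint64_t),
	      "unsigned long must fit into a 64-bit value");

/*
 * Returns true if [addr, addr + range) is non-empty, does not wrap around
 * and its last address is not larger than max_index.
 */
static bool va_span_fits(uint64_t addr, uint64_t range, uint64_t max_index)
{
	uint64_t last;

	if (range == 0)
		return false;

	last = addr + range - 1;
	if (last < addr)	/* the span wrapped around */
		return false;

	return last <= max_index;
}

/* Same check against the native 'unsigned long' used for tree indices. */
static bool va_span_fits_ulong(uint64_t addr, uint64_t range)
{
	return va_span_fits(addr, range, (uint64_t)ULONG_MAX);
}
```

On a 64-bit build the second helper only rejects empty or wrapping spans; on
a 32-bit build it would additionally reject anything reaching beyond 4 GiB.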