2023-01-18 08:26:58

by Danilo Krummrich

Subject: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

This patch series provides a new UAPI for the Nouveau driver in order to
support Vulkan features, such as sparse bindings and sparse residency.

Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to
keep track of GPU virtual address (VA) mappings in a more generic way.

The DRM GPUVA manager is intended to help drivers implement userspace-manageable
GPU VA spaces, as required by the Vulkan API. In order to achieve this goal it
serves the following purposes in this context.

1) Provide a dedicated range allocator to track GPU VA allocations and
mappings, making use of the drm_mm range allocator.

2) Generically connect GPU VA mappings to their backing buffers, in
particular DRM GEM objects.

3) Provide a common implementation to perform more complex mapping
operations on the GPU VA space. In particular, splitting and merging
of GPU VA mappings, e.g. for intersecting mapping requests or partial
unmap requests; a rough sketch of such a split follows below.
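
To make 3) a bit more tangible, below is a minimal, purely illustrative
sketch of the split step a driver has to apply when a partial unmap hits
an existing mapping. All names in it (example_va, example_unmap,
example_track, example_untrack) are made-up placeholders, not the
interface added by this series; the point is that the GPUVA manager is
meant to compute exactly this kind of bookkeeping generically, so that
drivers do not have to open-code it.

/*
 * Hypothetical sketch only: none of these names are the actual interface
 * added in this series. It just shows the split step a driver applies
 * when a partial unmap request intersects an existing mapping.
 */
#include <linux/minmax.h>
#include <linux/types.h>
#include <drm/drm_gem.h>

struct example_va {
        u64 addr;                       /* GPU VA base of the mapping */
        u64 range;                      /* size of the mapping */
        struct drm_gem_object *obj;     /* backing GEM object */
        u64 offset;                     /* offset into the GEM object */
};

/* Driver-specific helpers, assumed to exist elsewhere. */
void example_unmap(u64 addr, u64 range);
void example_track(struct drm_gem_object *obj, u64 offset, u64 addr, u64 range);
void example_untrack(struct example_va *va);

static void example_partial_unmap(struct example_va *va, u64 addr, u64 range)
{
        u64 req_end = addr + range;
        u64 va_end = va->addr + va->range;
        u64 cut_start = max(va->addr, addr);
        u64 cut_end = min(va_end, req_end);

        /* Tear down only the intersection in the page tables. */
        example_unmap(cut_start, cut_end - cut_start);

        /* Keep track of a leading remainder, if any. */
        if (va->addr < cut_start)
                example_track(va->obj, va->offset,
                              va->addr, cut_start - va->addr);

        /* Keep track of a trailing remainder, if any. */
        if (va_end > cut_end)
                example_track(va->obj, va->offset + (cut_end - va->addr),
                              cut_end, va_end - cut_end);

        /* Drop the original tracking entry covering the whole old range. */
        example_untrack(va);
}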

The new Nouveau VM_BIND UAPI builds on top of the DRM GPUVA manager and
provides the following new interfaces.

1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl
for UMDs to specify the portion of VA space managed by the kernel and
userspace, respectively.

2) Allocate and free a VA space region as well as bind and unbind memory
to the GPU's VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl.

3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.

Both DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC make use of the DRM
scheduler to queue jobs and support asynchronous processing with DRM syncobjs
as the synchronization mechanism.

By default, DRM_IOCTL_NOUVEAU_VM_BIND processes jobs synchronously, while
DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only. A rough
userspace flow illustrating these ioctls is sketched below.
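
To give an idea of how these pieces are meant to fit together, here is a
rough userspace-side sketch. The example_* wrappers are placeholders
standing in for filling the actual structs from
include/uapi/drm/nouveau_drm.h (not reproduced here); the addresses and
sizes are arbitrary, and only the DRM syncobj calls are existing libdrm
functions.

/*
 * Hypothetical userspace flow; the example_* wrappers are placeholders
 * for the real DRM_IOCTL_NOUVEAU_VM_INIT/_VM_BIND/_EXEC argument structs.
 * Error handling is trimmed for brevity.
 */
#include <stdint.h>
#include <xf86drm.h>

int example_vm_init(int fd, uint64_t umd_addr, uint64_t umd_range);
int example_vm_bind_map_async(int fd, uint64_t va, uint64_t range,
                              uint32_t bo_handle, uint64_t bo_offset,
                              uint32_t sig_syncobj);
int example_exec(int fd, uint64_t pushbuf_va, uint32_t pushbuf_words,
                 uint32_t sig_syncobj);

int example_submit(int fd, uint32_t bo_handle)
{
        uint32_t bind_done, exec_done;
        int ret;

        /* 1) Tell the kernel which part of the VA space userspace manages. */
        ret = example_vm_init(fd, 0x100000000ull, 0x700000000ull);
        if (ret)
                return ret;

        drmSyncobjCreate(fd, 0, &bind_done);
        drmSyncobjCreate(fd, 0, &exec_done);

        /* 2) Queue an asynchronous bind; bind_done signals on completion. */
        ret = example_vm_bind_map_async(fd, 0x100001000ull, 0x10000,
                                        bo_handle, 0, bind_done);
        if (ret)
                return ret;

        /*
         * Wait for the bind to complete before submitting work that uses
         * the VA (waits/signals could also be passed to EXEC directly).
         */
        drmSyncobjWait(fd, &bind_done, 1, INT64_MAX, 0, NULL);

        /* 3) Submit a push buffer referencing the freshly bound VA. */
        return example_exec(fd, 0x100001000ull, 64, exec_done);
}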

The new VM_BIND UAPI for Nouveau also makes use of drm_exec (execution context
for GEM buffers) by Christian König. Since the patch implementing drm_exec has
not yet been merged into drm-next, it is included in this series, together with
a small fix for it that was found while testing this series.

This patch series is also available at [1].

There is a Mesa NVK merge request by Dave Airlie [2] implementing the
corresponding userspace parts for this series.

Together with Dave's Mesa work, the new UAPI passes the Vulkan CTS sparse
binding and sparse residency test cases.

There are also some test cases in the igt-gpu-tools project [3] for the new UAPI
and hence the DRM GPUVA manager. However, most of them test the DRM GPUVA
manager's logic through Nouveau's new UAPI and should be considered merely as
helpers for the implementation.

That said, I absolutely intend to turn those test cases into proper KUnit test
cases for the DRM GPUVA manager, if and once we agree on its usefulness and
design.

[1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next /
https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
[2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
[3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind

I also want to give credit to Dave Airlie, who contributed a lot of ideas to
this patch series.

Christian König (1):
drm: execution context for GEM buffers

Danilo Krummrich (13):
drm/exec: fix memory leak in drm_exec_prepare_obj()
drm: manager to keep track of GPUs VA mappings
drm: debugfs: provide infrastructure to dump a DRM GPU VA space
drm/nouveau: new VM_BIND uapi interfaces
drm/nouveau: get vmm via nouveau_cli_vmm()
drm/nouveau: bo: initialize GEM GPU VA interface
drm/nouveau: move usercopy helpers to nouveau_drv.h
drm/nouveau: fence: fail to emit when fence context is killed
drm/nouveau: chan: provide nouveau_channel_kill()
drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
drm/nouveau: implement uvmm for user mode bindings
drm/nouveau: implement new VM_BIND UAPI
drm/nouveau: debugfs: implement DRM GPU VA debugfs

Documentation/gpu/driver-uapi.rst | 11 +
Documentation/gpu/drm-mm.rst | 43 +
drivers/gpu/drm/Kconfig | 6 +
drivers/gpu/drm/Makefile | 3 +
drivers/gpu/drm/amd/amdgpu/Kconfig | 1 +
drivers/gpu/drm/drm_debugfs.c | 56 +
drivers/gpu/drm/drm_exec.c | 294 ++++
drivers/gpu/drm/drm_gem.c | 3 +
drivers/gpu/drm/drm_gpuva_mgr.c | 1323 +++++++++++++++++
drivers/gpu/drm/nouveau/Kbuild | 3 +
drivers/gpu/drm/nouveau/Kconfig | 2 +
drivers/gpu/drm/nouveau/include/nvif/if000c.h | 23 +-
drivers/gpu/drm/nouveau/include/nvif/vmm.h | 17 +-
.../gpu/drm/nouveau/include/nvkm/subdev/mmu.h | 10 +
drivers/gpu/drm/nouveau/nouveau_abi16.c | 23 +
drivers/gpu/drm/nouveau/nouveau_abi16.h | 1 +
drivers/gpu/drm/nouveau/nouveau_bo.c | 152 +-
drivers/gpu/drm/nouveau/nouveau_bo.h | 2 +-
drivers/gpu/drm/nouveau/nouveau_chan.c | 16 +-
drivers/gpu/drm/nouveau/nouveau_chan.h | 1 +
drivers/gpu/drm/nouveau/nouveau_debugfs.c | 24 +
drivers/gpu/drm/nouveau/nouveau_drm.c | 25 +-
drivers/gpu/drm/nouveau/nouveau_drv.h | 92 +-
drivers/gpu/drm/nouveau/nouveau_exec.c | 310 ++++
drivers/gpu/drm/nouveau/nouveau_exec.h | 55 +
drivers/gpu/drm/nouveau/nouveau_fence.c | 7 +
drivers/gpu/drm/nouveau/nouveau_fence.h | 2 +-
drivers/gpu/drm/nouveau/nouveau_gem.c | 83 +-
drivers/gpu/drm/nouveau/nouveau_mem.h | 5 +
drivers/gpu/drm/nouveau/nouveau_prime.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_sched.c | 780 ++++++++++
drivers/gpu/drm/nouveau/nouveau_sched.h | 98 ++
drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_uvmm.c | 575 +++++++
drivers/gpu/drm/nouveau/nouveau_uvmm.h | 68 +
drivers/gpu/drm/nouveau/nouveau_vmm.c | 4 +-
drivers/gpu/drm/nouveau/nvif/vmm.c | 73 +-
.../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c | 168 ++-
.../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.h | 1 +
drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 32 +-
drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 3 +
include/drm/drm_debugfs.h | 25 +
include/drm/drm_drv.h | 6 +
include/drm/drm_exec.h | 144 ++
include/drm/drm_gem.h | 75 +
include/drm/drm_gpuva_mgr.h | 527 +++++++
include/uapi/drm/nouveau_drm.h | 216 +++
47 files changed, 5266 insertions(+), 126 deletions(-)
create mode 100644 drivers/gpu/drm/drm_exec.c
create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c
create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h
create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c
create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h
create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c
create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h
create mode 100644 include/drm/drm_exec.h
create mode 100644 include/drm/drm_gpuva_mgr.h


base-commit: 0b45ac1170ea6416bc1d36798414c04870cd356d
--
2.39.0


2023-01-18 10:18:44

by Christian König

Subject: Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
> This patch series provides a new UAPI for the Nouveau driver in order to
> support Vulkan features, such as sparse bindings and sparse residency.
>
> Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to
> keep track of GPU virtual address (VA) mappings in a more generic way.
>
> The DRM GPUVA manager is indented to help drivers implement userspace-manageable
> GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it
> serves the following purposes in this context.
>
> 1) Provide a dedicated range allocator to track GPU VA allocations and
> mappings, making use of the drm_mm range allocator.

This means that the ranges are allocated by the kernel? If yes that's a
really really bad idea.

Regards,
Christian.

>
> 2) Generically connect GPU VA mappings to their backing buffers, in
> particular DRM GEM objects.
>
> 3) Provide a common implementation to perform more complex mapping
> operations on the GPU VA space. In particular splitting and merging
> of GPU VA mappings, e.g. for intersecting mapping requests or partial
> unmap requests.
>
> The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager, itself
> providing the following new interfaces.
>
> 1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl
> for UMDs to specify the portion of VA space managed by the kernel and
> userspace, respectively.
>
> 2) Allocate and free a VA space region as well as bind and unbind memory
> to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl.
>
> 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.
>
> Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use of the DRM
> scheduler to queue jobs and support asynchronous processing with DRM syncobjs
> as synchronization mechanism.
>
> By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing,
> DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.
>
> The new VM_BIND UAPI for Nouveau makes also use of drm_exec (execution context
> for GEM buffers) by Christian König. Since the patch implementing drm_exec was
> not yet merged into drm-next it is part of this series, as well as a small fix
> for this patch, which was found while testing this series.
>
> This patch series is also available at [1].
>
> There is a Mesa NVK merge request by Dave Airlie [2] implementing the
> corresponding userspace parts for this series.
>
> The Vulkan CTS test suite passes the sparse binding and sparse residency test
> cases for the new UAPI together with Dave's Mesa work.
>
> There are also some test cases in the igt-gpu-tools project [3] for the new UAPI
> and hence the DRM GPU VA manager. However, most of them are testing the DRM GPU
> VA manager's logic through Nouveau's new UAPI and should be considered just as
> helper for implementation.
>
> However, I absolutely intend to change those test cases to proper kunit test
> cases for the DRM GPUVA manager, once and if we agree on it's usefulness and
> design.
>
> [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next /
> https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
> [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
> [3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind
>
> I also want to give credit to Dave Airlie, who contributed a lot of ideas to
> this patch series.
>
> Christian König (1):
> drm: execution context for GEM buffers
>
> Danilo Krummrich (13):
> drm/exec: fix memory leak in drm_exec_prepare_obj()
> drm: manager to keep track of GPUs VA mappings
> drm: debugfs: provide infrastructure to dump a DRM GPU VA space
> drm/nouveau: new VM_BIND uapi interfaces
> drm/nouveau: get vmm via nouveau_cli_vmm()
> drm/nouveau: bo: initialize GEM GPU VA interface
> drm/nouveau: move usercopy helpers to nouveau_drv.h
> drm/nouveau: fence: fail to emit when fence context is killed
> drm/nouveau: chan: provide nouveau_channel_kill()
> drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
> drm/nouveau: implement uvmm for user mode bindings
> drm/nouveau: implement new VM_BIND UAPI
> drm/nouveau: debugfs: implement DRM GPU VA debugfs
>
> Documentation/gpu/driver-uapi.rst | 11 +
> Documentation/gpu/drm-mm.rst | 43 +
> drivers/gpu/drm/Kconfig | 6 +
> drivers/gpu/drm/Makefile | 3 +
> drivers/gpu/drm/amd/amdgpu/Kconfig | 1 +
> drivers/gpu/drm/drm_debugfs.c | 56 +
> drivers/gpu/drm/drm_exec.c | 294 ++++
> drivers/gpu/drm/drm_gem.c | 3 +
> drivers/gpu/drm/drm_gpuva_mgr.c | 1323 +++++++++++++++++
> drivers/gpu/drm/nouveau/Kbuild | 3 +
> drivers/gpu/drm/nouveau/Kconfig | 2 +
> drivers/gpu/drm/nouveau/include/nvif/if000c.h | 23 +-
> drivers/gpu/drm/nouveau/include/nvif/vmm.h | 17 +-
> .../gpu/drm/nouveau/include/nvkm/subdev/mmu.h | 10 +
> drivers/gpu/drm/nouveau/nouveau_abi16.c | 23 +
> drivers/gpu/drm/nouveau/nouveau_abi16.h | 1 +
> drivers/gpu/drm/nouveau/nouveau_bo.c | 152 +-
> drivers/gpu/drm/nouveau/nouveau_bo.h | 2 +-
> drivers/gpu/drm/nouveau/nouveau_chan.c | 16 +-
> drivers/gpu/drm/nouveau/nouveau_chan.h | 1 +
> drivers/gpu/drm/nouveau/nouveau_debugfs.c | 24 +
> drivers/gpu/drm/nouveau/nouveau_drm.c | 25 +-
> drivers/gpu/drm/nouveau/nouveau_drv.h | 92 +-
> drivers/gpu/drm/nouveau/nouveau_exec.c | 310 ++++
> drivers/gpu/drm/nouveau/nouveau_exec.h | 55 +
> drivers/gpu/drm/nouveau/nouveau_fence.c | 7 +
> drivers/gpu/drm/nouveau/nouveau_fence.h | 2 +-
> drivers/gpu/drm/nouveau/nouveau_gem.c | 83 +-
> drivers/gpu/drm/nouveau/nouveau_mem.h | 5 +
> drivers/gpu/drm/nouveau/nouveau_prime.c | 2 +-
> drivers/gpu/drm/nouveau/nouveau_sched.c | 780 ++++++++++
> drivers/gpu/drm/nouveau/nouveau_sched.h | 98 ++
> drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
> drivers/gpu/drm/nouveau/nouveau_uvmm.c | 575 +++++++
> drivers/gpu/drm/nouveau/nouveau_uvmm.h | 68 +
> drivers/gpu/drm/nouveau/nouveau_vmm.c | 4 +-
> drivers/gpu/drm/nouveau/nvif/vmm.c | 73 +-
> .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c | 168 ++-
> .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.h | 1 +
> drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 32 +-
> drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 3 +
> include/drm/drm_debugfs.h | 25 +
> include/drm/drm_drv.h | 6 +
> include/drm/drm_exec.h | 144 ++
> include/drm/drm_gem.h | 75 +
> include/drm/drm_gpuva_mgr.h | 527 +++++++
> include/uapi/drm/nouveau_drm.h | 216 +++
> 47 files changed, 5266 insertions(+), 126 deletions(-)
> create mode 100644 drivers/gpu/drm/drm_exec.c
> create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
> create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c
> create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h
> create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c
> create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h
> create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c
> create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h
> create mode 100644 include/drm/drm_exec.h
> create mode 100644 include/drm/drm_gpuva_mgr.h
>
>
> base-commit: 0b45ac1170ea6416bc1d36798414c04870cd356d

2023-01-18 15:52:04

by Christian König

Subject: Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
> Hi Christian,
>
> On 1/18/23 09:53, Christian König wrote:
>> Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
>>> This patch series provides a new UAPI for the Nouveau driver in
>>> order to
>>> support Vulkan features, such as sparse bindings and sparse residency.
>>>
>>> Furthermore, with the DRM GPUVA manager it provides a new DRM core
>>> feature to
>>> keep track of GPU virtual address (VA) mappings in a more generic way.
>>>
>>> The DRM GPUVA manager is indented to help drivers implement
>>> userspace-manageable
>>> GPU VA spaces in reference to the Vulkan API. In order to achieve
>>> this goal it
>>> serves the following purposes in this context.
>>>
>>>      1) Provide a dedicated range allocator to track GPU VA
>>> allocations and
>>>         mappings, making use of the drm_mm range allocator.
>>
>> This means that the ranges are allocated by the kernel? If yes that's
>> a really really bad idea.
>
> No, it's just for keeping track of the ranges userspace has allocated.

Ok, that makes more sense.

So basically you have an IOCTL which asks the kernel for a free range? Or
what exactly is the drm_mm used for here?

Regards,
Christian.

>
> - Danilo
>
>>
>> Regards,
>> Christian.
>>
>>>
>>>      2) Generically connect GPU VA mappings to their backing
>>> buffers, in
>>>         particular DRM GEM objects.
>>>
>>>      3) Provide a common implementation to perform more complex mapping
>>>         operations on the GPU VA space. In particular splitting and
>>> merging
>>>         of GPU VA mappings, e.g. for intersecting mapping requests
>>> or partial
>>>         unmap requests.
>>>
>>> The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager,
>>> itself
>>> providing the following new interfaces.
>>>
>>>      1) Initialize a GPU VA space via the new
>>> DRM_IOCTL_NOUVEAU_VM_INIT ioctl
>>>         for UMDs to specify the portion of VA space managed by the
>>> kernel and
>>>         userspace, respectively.
>>>
>>>      2) Allocate and free a VA space region as well as bind and
>>> unbind memory
>>>         to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND
>>> ioctl.
>>>
>>>      3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.
>>>
>>> Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use
>>> of the DRM
>>> scheduler to queue jobs and support asynchronous processing with DRM
>>> syncobjs
>>> as synchronization mechanism.
>>>
>>> By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing,
>>> DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.
>>>
>>> The new VM_BIND UAPI for Nouveau makes also use of drm_exec
>>> (execution context
>>> for GEM buffers) by Christian König. Since the patch implementing
>>> drm_exec was
>>> not yet merged into drm-next it is part of this series, as well as a
>>> small fix
>>> for this patch, which was found while testing this series.
>>>
>>> This patch series is also available at [1].
>>>
>>> There is a Mesa NVK merge request by Dave Airlie [2] implementing the
>>> corresponding userspace parts for this series.
>>>
>>> The Vulkan CTS test suite passes the sparse binding and sparse
>>> residency test
>>> cases for the new UAPI together with Dave's Mesa work.
>>>
>>> There are also some test cases in the igt-gpu-tools project [3] for
>>> the new UAPI
>>> and hence the DRM GPU VA manager. However, most of them are testing
>>> the DRM GPU
>>> VA manager's logic through Nouveau's new UAPI and should be
>>> considered just as
>>> helper for implementation.
>>>
>>> However, I absolutely intend to change those test cases to proper
>>> kunit test
>>> cases for the DRM GPUVA manager, once and if we agree on it's
>>> usefulness and
>>> design.
>>>
>>> [1]
>>> https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next
>>> /
>>> https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
>>> [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
>>> [3]
>>> https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind
>>>
>>> I also want to give credit to Dave Airlie, who contributed a lot of
>>> ideas to
>>> this patch series.
>>>
>>> Christian König (1):
>>>    drm: execution context for GEM buffers
>>>
>>> Danilo Krummrich (13):
>>>    drm/exec: fix memory leak in drm_exec_prepare_obj()
>>>    drm: manager to keep track of GPUs VA mappings
>>>    drm: debugfs: provide infrastructure to dump a DRM GPU VA space
>>>    drm/nouveau: new VM_BIND uapi interfaces
>>>    drm/nouveau: get vmm via nouveau_cli_vmm()
>>>    drm/nouveau: bo: initialize GEM GPU VA interface
>>>    drm/nouveau: move usercopy helpers to nouveau_drv.h
>>>    drm/nouveau: fence: fail to emit when fence context is killed
>>>    drm/nouveau: chan: provide nouveau_channel_kill()
>>>    drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
>>>    drm/nouveau: implement uvmm for user mode bindings
>>>    drm/nouveau: implement new VM_BIND UAPI
>>>    drm/nouveau: debugfs: implement DRM GPU VA debugfs
>>>
>>>   Documentation/gpu/driver-uapi.rst             |   11 +
>>>   Documentation/gpu/drm-mm.rst                  |   43 +
>>>   drivers/gpu/drm/Kconfig                       |    6 +
>>>   drivers/gpu/drm/Makefile                      |    3 +
>>>   drivers/gpu/drm/amd/amdgpu/Kconfig            |    1 +
>>>   drivers/gpu/drm/drm_debugfs.c                 |   56 +
>>>   drivers/gpu/drm/drm_exec.c                    |  294 ++++
>>>   drivers/gpu/drm/drm_gem.c                     |    3 +
>>>   drivers/gpu/drm/drm_gpuva_mgr.c               | 1323
>>> +++++++++++++++++
>>>   drivers/gpu/drm/nouveau/Kbuild                |    3 +
>>>   drivers/gpu/drm/nouveau/Kconfig               |    2 +
>>>   drivers/gpu/drm/nouveau/include/nvif/if000c.h |   23 +-
>>>   drivers/gpu/drm/nouveau/include/nvif/vmm.h    |   17 +-
>>>   .../gpu/drm/nouveau/include/nvkm/subdev/mmu.h |   10 +
>>>   drivers/gpu/drm/nouveau/nouveau_abi16.c       |   23 +
>>>   drivers/gpu/drm/nouveau/nouveau_abi16.h       |    1 +
>>>   drivers/gpu/drm/nouveau/nouveau_bo.c          |  152 +-
>>>   drivers/gpu/drm/nouveau/nouveau_bo.h          |    2 +-
>>>   drivers/gpu/drm/nouveau/nouveau_chan.c        |   16 +-
>>>   drivers/gpu/drm/nouveau/nouveau_chan.h        |    1 +
>>>   drivers/gpu/drm/nouveau/nouveau_debugfs.c     |   24 +
>>>   drivers/gpu/drm/nouveau/nouveau_drm.c         |   25 +-
>>>   drivers/gpu/drm/nouveau/nouveau_drv.h         |   92 +-
>>>   drivers/gpu/drm/nouveau/nouveau_exec.c        |  310 ++++
>>>   drivers/gpu/drm/nouveau/nouveau_exec.h        |   55 +
>>>   drivers/gpu/drm/nouveau/nouveau_fence.c       |    7 +
>>>   drivers/gpu/drm/nouveau/nouveau_fence.h       |    2 +-
>>>   drivers/gpu/drm/nouveau/nouveau_gem.c         |   83 +-
>>>   drivers/gpu/drm/nouveau/nouveau_mem.h         |    5 +
>>>   drivers/gpu/drm/nouveau/nouveau_prime.c       |    2 +-
>>>   drivers/gpu/drm/nouveau/nouveau_sched.c       |  780 ++++++++++
>>>   drivers/gpu/drm/nouveau/nouveau_sched.h       |   98 ++
>>>   drivers/gpu/drm/nouveau/nouveau_svm.c         |    2 +-
>>>   drivers/gpu/drm/nouveau/nouveau_uvmm.c        |  575 +++++++
>>>   drivers/gpu/drm/nouveau/nouveau_uvmm.h        |   68 +
>>>   drivers/gpu/drm/nouveau/nouveau_vmm.c         |    4 +-
>>>   drivers/gpu/drm/nouveau/nvif/vmm.c            |   73 +-
>>>   .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c    |  168 ++-
>>>   .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.h    |    1 +
>>>   drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c |   32 +-
>>>   drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |    3 +
>>>   include/drm/drm_debugfs.h                     |   25 +
>>>   include/drm/drm_drv.h                         |    6 +
>>>   include/drm/drm_exec.h                        |  144 ++
>>>   include/drm/drm_gem.h                         |   75 +
>>>   include/drm/drm_gpuva_mgr.h                   |  527 +++++++
>>>   include/uapi/drm/nouveau_drm.h                |  216 +++
>>>   47 files changed, 5266 insertions(+), 126 deletions(-)
>>>   create mode 100644 drivers/gpu/drm/drm_exec.c
>>>   create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
>>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c
>>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h
>>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c
>>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h
>>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c
>>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h
>>>   create mode 100644 include/drm/drm_exec.h
>>>   create mode 100644 include/drm/drm_gpuva_mgr.h
>>>
>>>
>>> base-commit: 0b45ac1170ea6416bc1d36798414c04870cd356d
>>
>

2023-01-18 16:04:13

by Danilo Krummrich

Subject: Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

Hi Christian,

On 1/18/23 09:53, Christian König wrote:
> Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
>> This patch series provides a new UAPI for the Nouveau driver in order to
>> support Vulkan features, such as sparse bindings and sparse residency.
>>
>> Furthermore, with the DRM GPUVA manager it provides a new DRM core
>> feature to
>> keep track of GPU virtual address (VA) mappings in a more generic way.
>>
>> The DRM GPUVA manager is indented to help drivers implement
>> userspace-manageable
>> GPU VA spaces in reference to the Vulkan API. In order to achieve this
>> goal it
>> serves the following purposes in this context.
>>
>>      1) Provide a dedicated range allocator to track GPU VA
>> allocations and
>>         mappings, making use of the drm_mm range allocator.
>
> This means that the ranges are allocated by the kernel? If yes that's a
> really really bad idea.

No, it's just for keeping track of the ranges userspace has allocated.

- Danilo

>
> Regards,
> Christian.
>
>>
>>      2) Generically connect GPU VA mappings to their backing buffers, in
>>         particular DRM GEM objects.
>>
>>      3) Provide a common implementation to perform more complex mapping
>>         operations on the GPU VA space. In particular splitting and
>> merging
>>         of GPU VA mappings, e.g. for intersecting mapping requests or
>> partial
>>         unmap requests.
>>
>> The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager,
>> itself
>> providing the following new interfaces.
>>
>>      1) Initialize a GPU VA space via the new
>> DRM_IOCTL_NOUVEAU_VM_INIT ioctl
>>         for UMDs to specify the portion of VA space managed by the
>> kernel and
>>         userspace, respectively.
>>
>>      2) Allocate and free a VA space region as well as bind and unbind
>> memory
>>         to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl.
>>
>>      3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.
>>
>> Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use
>> of the DRM
>> scheduler to queue jobs and support asynchronous processing with DRM
>> syncobjs
>> as synchronization mechanism.
>>
>> By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing,
>> DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.
>>
>> The new VM_BIND UAPI for Nouveau makes also use of drm_exec (execution
>> context
>> for GEM buffers) by Christian König. Since the patch implementing
>> drm_exec was
>> not yet merged into drm-next it is part of this series, as well as a
>> small fix
>> for this patch, which was found while testing this series.
>>
>> This patch series is also available at [1].
>>
>> There is a Mesa NVK merge request by Dave Airlie [2] implementing the
>> corresponding userspace parts for this series.
>>
>> The Vulkan CTS test suite passes the sparse binding and sparse
>> residency test
>> cases for the new UAPI together with Dave's Mesa work.
>>
>> There are also some test cases in the igt-gpu-tools project [3] for
>> the new UAPI
>> and hence the DRM GPU VA manager. However, most of them are testing
>> the DRM GPU
>> VA manager's logic through Nouveau's new UAPI and should be considered
>> just as
>> helper for implementation.
>>
>> However, I absolutely intend to change those test cases to proper
>> kunit test
>> cases for the DRM GPUVA manager, once and if we agree on it's
>> usefulness and
>> design.
>>
>> [1]
>> https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next /
>>      https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
>> [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
>> [3]
>> https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind
>>
>> I also want to give credit to Dave Airlie, who contributed a lot of
>> ideas to
>> this patch series.
>>
>> Christian König (1):
>>    drm: execution context for GEM buffers
>>
>> Danilo Krummrich (13):
>>    drm/exec: fix memory leak in drm_exec_prepare_obj()
>>    drm: manager to keep track of GPUs VA mappings
>>    drm: debugfs: provide infrastructure to dump a DRM GPU VA space
>>    drm/nouveau: new VM_BIND uapi interfaces
>>    drm/nouveau: get vmm via nouveau_cli_vmm()
>>    drm/nouveau: bo: initialize GEM GPU VA interface
>>    drm/nouveau: move usercopy helpers to nouveau_drv.h
>>    drm/nouveau: fence: fail to emit when fence context is killed
>>    drm/nouveau: chan: provide nouveau_channel_kill()
>>    drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
>>    drm/nouveau: implement uvmm for user mode bindings
>>    drm/nouveau: implement new VM_BIND UAPI
>>    drm/nouveau: debugfs: implement DRM GPU VA debugfs
>>
>>   Documentation/gpu/driver-uapi.rst             |   11 +
>>   Documentation/gpu/drm-mm.rst                  |   43 +
>>   drivers/gpu/drm/Kconfig                       |    6 +
>>   drivers/gpu/drm/Makefile                      |    3 +
>>   drivers/gpu/drm/amd/amdgpu/Kconfig            |    1 +
>>   drivers/gpu/drm/drm_debugfs.c                 |   56 +
>>   drivers/gpu/drm/drm_exec.c                    |  294 ++++
>>   drivers/gpu/drm/drm_gem.c                     |    3 +
>>   drivers/gpu/drm/drm_gpuva_mgr.c               | 1323 +++++++++++++++++
>>   drivers/gpu/drm/nouveau/Kbuild                |    3 +
>>   drivers/gpu/drm/nouveau/Kconfig               |    2 +
>>   drivers/gpu/drm/nouveau/include/nvif/if000c.h |   23 +-
>>   drivers/gpu/drm/nouveau/include/nvif/vmm.h    |   17 +-
>>   .../gpu/drm/nouveau/include/nvkm/subdev/mmu.h |   10 +
>>   drivers/gpu/drm/nouveau/nouveau_abi16.c       |   23 +
>>   drivers/gpu/drm/nouveau/nouveau_abi16.h       |    1 +
>>   drivers/gpu/drm/nouveau/nouveau_bo.c          |  152 +-
>>   drivers/gpu/drm/nouveau/nouveau_bo.h          |    2 +-
>>   drivers/gpu/drm/nouveau/nouveau_chan.c        |   16 +-
>>   drivers/gpu/drm/nouveau/nouveau_chan.h        |    1 +
>>   drivers/gpu/drm/nouveau/nouveau_debugfs.c     |   24 +
>>   drivers/gpu/drm/nouveau/nouveau_drm.c         |   25 +-
>>   drivers/gpu/drm/nouveau/nouveau_drv.h         |   92 +-
>>   drivers/gpu/drm/nouveau/nouveau_exec.c        |  310 ++++
>>   drivers/gpu/drm/nouveau/nouveau_exec.h        |   55 +
>>   drivers/gpu/drm/nouveau/nouveau_fence.c       |    7 +
>>   drivers/gpu/drm/nouveau/nouveau_fence.h       |    2 +-
>>   drivers/gpu/drm/nouveau/nouveau_gem.c         |   83 +-
>>   drivers/gpu/drm/nouveau/nouveau_mem.h         |    5 +
>>   drivers/gpu/drm/nouveau/nouveau_prime.c       |    2 +-
>>   drivers/gpu/drm/nouveau/nouveau_sched.c       |  780 ++++++++++
>>   drivers/gpu/drm/nouveau/nouveau_sched.h       |   98 ++
>>   drivers/gpu/drm/nouveau/nouveau_svm.c         |    2 +-
>>   drivers/gpu/drm/nouveau/nouveau_uvmm.c        |  575 +++++++
>>   drivers/gpu/drm/nouveau/nouveau_uvmm.h        |   68 +
>>   drivers/gpu/drm/nouveau/nouveau_vmm.c         |    4 +-
>>   drivers/gpu/drm/nouveau/nvif/vmm.c            |   73 +-
>>   .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c    |  168 ++-
>>   .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.h    |    1 +
>>   drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c |   32 +-
>>   drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |    3 +
>>   include/drm/drm_debugfs.h                     |   25 +
>>   include/drm/drm_drv.h                         |    6 +
>>   include/drm/drm_exec.h                        |  144 ++
>>   include/drm/drm_gem.h                         |   75 +
>>   include/drm/drm_gpuva_mgr.h                   |  527 +++++++
>>   include/uapi/drm/nouveau_drm.h                |  216 +++
>>   47 files changed, 5266 insertions(+), 126 deletions(-)
>>   create mode 100644 drivers/gpu/drm/drm_exec.c
>>   create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c
>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h
>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c
>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h
>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c
>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h
>>   create mode 100644 include/drm/drm_exec.h
>>   create mode 100644 include/drm/drm_gpuva_mgr.h
>>
>>
>> base-commit: 0b45ac1170ea6416bc1d36798414c04870cd356d
>

2023-01-18 16:43:49

by Danilo Krummrich

Subject: Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

On 1/18/23 16:37, Christian König wrote:
> Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
>> Hi Christian,
>>
>> On 1/18/23 09:53, Christian König wrote:
>>> Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
>>>> This patch series provides a new UAPI for the Nouveau driver in
>>>> order to
>>>> support Vulkan features, such as sparse bindings and sparse residency.
>>>>
>>>> Furthermore, with the DRM GPUVA manager it provides a new DRM core
>>>> feature to
>>>> keep track of GPU virtual address (VA) mappings in a more generic way.
>>>>
>>>> The DRM GPUVA manager is indented to help drivers implement
>>>> userspace-manageable
>>>> GPU VA spaces in reference to the Vulkan API. In order to achieve
>>>> this goal it
>>>> serves the following purposes in this context.
>>>>
>>>>      1) Provide a dedicated range allocator to track GPU VA
>>>> allocations and
>>>>         mappings, making use of the drm_mm range allocator.
>>>
>>> This means that the ranges are allocated by the kernel? If yes that's
>>> a really really bad idea.
>>
>> No, it's just for keeping track of the ranges userspace has allocated.
>
> Ok, that makes more sense.
>
> So basically you have an IOCTL which asks kernel for a free range? Or
> what exactly is the drm_mm used for here?

Not even that: userspace provides both the base address and the range;
the kernel really just keeps track of things. Though writing a UAPI on
top of the GPUVA manager that asks for a free range instead would be
possible by simply adding the corresponding wrapper functions to get a
free hole.

Currently, and that's what I think I read out of your question, the main
benefit of using drm_mm over simply stuffing the entries into a list or
something boils down to easier collision detection and iteration over
sub-ranges of the whole VA space.
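
For illustration, a minimal sketch of what that boils down to; the
surrounding structure is made up, but the drm_mm calls are the existing
kernel API:

/*
 * vas->mm is assumed to have been set up with drm_mm_init() and nodes
 * to be zero-initialized by the caller.
 */
#include <linux/printk.h>
#include <drm/drm_mm.h>

struct example_va_space {
        struct drm_mm mm;       /* tracks userspace-allocated VA regions */
};

static int example_track_region(struct example_va_space *vas,
                                struct drm_mm_node *node,
                                u64 addr, u64 range)
{
        node->start = addr;
        node->size = range;

        /* Fails with -ENOSPC if the span collides with an existing region. */
        return drm_mm_reserve_node(&vas->mm, node);
}

static void example_dump_range(struct example_va_space *vas,
                               u64 addr, u64 range)
{
        struct drm_mm_node *node;

        /* Walk only the regions intersecting [addr, addr + range). */
        drm_mm_for_each_node_in_range(node, &vas->mm, addr, addr + range)
                pr_info("region: 0x%llx + 0x%llx\n", node->start, node->size);
}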

>
> Regards,
> Christian.
>
>>
>> - Danilo
>>
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>>      2) Generically connect GPU VA mappings to their backing
>>>> buffers, in
>>>>         particular DRM GEM objects.
>>>>
>>>>      3) Provide a common implementation to perform more complex mapping
>>>>         operations on the GPU VA space. In particular splitting and
>>>> merging
>>>>         of GPU VA mappings, e.g. for intersecting mapping requests
>>>> or partial
>>>>         unmap requests.
>>>>
>>>> The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager,
>>>> itself
>>>> providing the following new interfaces.
>>>>
>>>>      1) Initialize a GPU VA space via the new
>>>> DRM_IOCTL_NOUVEAU_VM_INIT ioctl
>>>>         for UMDs to specify the portion of VA space managed by the
>>>> kernel and
>>>>         userspace, respectively.
>>>>
>>>>      2) Allocate and free a VA space region as well as bind and
>>>> unbind memory
>>>>         to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND
>>>> ioctl.
>>>>
>>>>      3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.
>>>>
>>>> Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use
>>>> of the DRM
>>>> scheduler to queue jobs and support asynchronous processing with DRM
>>>> syncobjs
>>>> as synchronization mechanism.
>>>>
>>>> By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing,
>>>> DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.
>>>>
>>>> The new VM_BIND UAPI for Nouveau makes also use of drm_exec
>>>> (execution context
>>>> for GEM buffers) by Christian König. Since the patch implementing
>>>> drm_exec was
>>>> not yet merged into drm-next it is part of this series, as well as a
>>>> small fix
>>>> for this patch, which was found while testing this series.
>>>>
>>>> This patch series is also available at [1].
>>>>
>>>> There is a Mesa NVK merge request by Dave Airlie [2] implementing the
>>>> corresponding userspace parts for this series.
>>>>
>>>> The Vulkan CTS test suite passes the sparse binding and sparse
>>>> residency test
>>>> cases for the new UAPI together with Dave's Mesa work.
>>>>
>>>> There are also some test cases in the igt-gpu-tools project [3] for
>>>> the new UAPI
>>>> and hence the DRM GPU VA manager. However, most of them are testing
>>>> the DRM GPU
>>>> VA manager's logic through Nouveau's new UAPI and should be
>>>> considered just as
>>>> helper for implementation.
>>>>
>>>> However, I absolutely intend to change those test cases to proper
>>>> kunit test
>>>> cases for the DRM GPUVA manager, once and if we agree on it's
>>>> usefulness and
>>>> design.
>>>>
>>>> [1]
>>>> https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next /
>>>> https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
>>>> [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
>>>> [3]
>>>> https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind
>>>>
>>>> I also want to give credit to Dave Airlie, who contributed a lot of
>>>> ideas to
>>>> this patch series.
>>>>
>>>> Christian König (1):
>>>>    drm: execution context for GEM buffers
>>>>
>>>> Danilo Krummrich (13):
>>>>    drm/exec: fix memory leak in drm_exec_prepare_obj()
>>>>    drm: manager to keep track of GPUs VA mappings
>>>>    drm: debugfs: provide infrastructure to dump a DRM GPU VA space
>>>>    drm/nouveau: new VM_BIND uapi interfaces
>>>>    drm/nouveau: get vmm via nouveau_cli_vmm()
>>>>    drm/nouveau: bo: initialize GEM GPU VA interface
>>>>    drm/nouveau: move usercopy helpers to nouveau_drv.h
>>>>    drm/nouveau: fence: fail to emit when fence context is killed
>>>>    drm/nouveau: chan: provide nouveau_channel_kill()
>>>>    drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
>>>>    drm/nouveau: implement uvmm for user mode bindings
>>>>    drm/nouveau: implement new VM_BIND UAPI
>>>>    drm/nouveau: debugfs: implement DRM GPU VA debugfs
>>>>
>>>>   Documentation/gpu/driver-uapi.rst             |   11 +
>>>>   Documentation/gpu/drm-mm.rst                  |   43 +
>>>>   drivers/gpu/drm/Kconfig                       |    6 +
>>>>   drivers/gpu/drm/Makefile                      |    3 +
>>>>   drivers/gpu/drm/amd/amdgpu/Kconfig            |    1 +
>>>>   drivers/gpu/drm/drm_debugfs.c                 |   56 +
>>>>   drivers/gpu/drm/drm_exec.c                    |  294 ++++
>>>>   drivers/gpu/drm/drm_gem.c                     |    3 +
>>>>   drivers/gpu/drm/drm_gpuva_mgr.c               | 1323
>>>> +++++++++++++++++
>>>>   drivers/gpu/drm/nouveau/Kbuild                |    3 +
>>>>   drivers/gpu/drm/nouveau/Kconfig               |    2 +
>>>>   drivers/gpu/drm/nouveau/include/nvif/if000c.h |   23 +-
>>>>   drivers/gpu/drm/nouveau/include/nvif/vmm.h    |   17 +-
>>>>   .../gpu/drm/nouveau/include/nvkm/subdev/mmu.h |   10 +
>>>>   drivers/gpu/drm/nouveau/nouveau_abi16.c       |   23 +
>>>>   drivers/gpu/drm/nouveau/nouveau_abi16.h       |    1 +
>>>>   drivers/gpu/drm/nouveau/nouveau_bo.c          |  152 +-
>>>>   drivers/gpu/drm/nouveau/nouveau_bo.h          |    2 +-
>>>>   drivers/gpu/drm/nouveau/nouveau_chan.c        |   16 +-
>>>>   drivers/gpu/drm/nouveau/nouveau_chan.h        |    1 +
>>>>   drivers/gpu/drm/nouveau/nouveau_debugfs.c     |   24 +
>>>>   drivers/gpu/drm/nouveau/nouveau_drm.c         |   25 +-
>>>>   drivers/gpu/drm/nouveau/nouveau_drv.h         |   92 +-
>>>>   drivers/gpu/drm/nouveau/nouveau_exec.c        |  310 ++++
>>>>   drivers/gpu/drm/nouveau/nouveau_exec.h        |   55 +
>>>>   drivers/gpu/drm/nouveau/nouveau_fence.c       |    7 +
>>>>   drivers/gpu/drm/nouveau/nouveau_fence.h       |    2 +-
>>>>   drivers/gpu/drm/nouveau/nouveau_gem.c         |   83 +-
>>>>   drivers/gpu/drm/nouveau/nouveau_mem.h         |    5 +
>>>>   drivers/gpu/drm/nouveau/nouveau_prime.c       |    2 +-
>>>>   drivers/gpu/drm/nouveau/nouveau_sched.c       |  780 ++++++++++
>>>>   drivers/gpu/drm/nouveau/nouveau_sched.h       |   98 ++
>>>>   drivers/gpu/drm/nouveau/nouveau_svm.c         |    2 +-
>>>>   drivers/gpu/drm/nouveau/nouveau_uvmm.c        |  575 +++++++
>>>>   drivers/gpu/drm/nouveau/nouveau_uvmm.h        |   68 +
>>>>   drivers/gpu/drm/nouveau/nouveau_vmm.c         |    4 +-
>>>>   drivers/gpu/drm/nouveau/nvif/vmm.c            |   73 +-
>>>>   .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c    |  168 ++-
>>>>   .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.h    |    1 +
>>>>   drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c |   32 +-
>>>>   drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |    3 +
>>>>   include/drm/drm_debugfs.h                     |   25 +
>>>>   include/drm/drm_drv.h                         |    6 +
>>>>   include/drm/drm_exec.h                        |  144 ++
>>>>   include/drm/drm_gem.h                         |   75 +
>>>>   include/drm/drm_gpuva_mgr.h                   |  527 +++++++
>>>>   include/uapi/drm/nouveau_drm.h                |  216 +++
>>>>   47 files changed, 5266 insertions(+), 126 deletions(-)
>>>>   create mode 100644 drivers/gpu/drm/drm_exec.c
>>>>   create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
>>>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c
>>>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h
>>>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c
>>>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h
>>>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c
>>>>   create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h
>>>>   create mode 100644 include/drm/drm_exec.h
>>>>   create mode 100644 include/drm/drm_gpuva_mgr.h
>>>>
>>>>
>>>> base-commit: 0b45ac1170ea6416bc1d36798414c04870cd356d
>>>
>>
>

2023-01-18 17:05:30

by Alex Deucher

Subject: Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich <[email protected]> wrote:
>
> On 1/18/23 16:37, Christian König wrote:
> > Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
> >> Hi Christian,
> >>
> >> On 1/18/23 09:53, Christian König wrote:
> >>> Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
> >>>> This patch series provides a new UAPI for the Nouveau driver in
> >>>> order to
> >>>> support Vulkan features, such as sparse bindings and sparse residency.
> >>>>
> >>>> Furthermore, with the DRM GPUVA manager it provides a new DRM core
> >>>> feature to
> >>>> keep track of GPU virtual address (VA) mappings in a more generic way.
> >>>>
> >>>> The DRM GPUVA manager is indented to help drivers implement
> >>>> userspace-manageable
> >>>> GPU VA spaces in reference to the Vulkan API. In order to achieve
> >>>> this goal it
> >>>> serves the following purposes in this context.
> >>>>
> >>>> 1) Provide a dedicated range allocator to track GPU VA
> >>>> allocations and
> >>>> mappings, making use of the drm_mm range allocator.
> >>>
> >>> This means that the ranges are allocated by the kernel? If yes that's
> >>> a really really bad idea.
> >>
> >> No, it's just for keeping track of the ranges userspace has allocated.
> >
> > Ok, that makes more sense.
> >
> > So basically you have an IOCTL which asks kernel for a free range? Or
> > what exactly is the drm_mm used for here?
>
> Not even that, userspace provides both the base address and the range,
> the kernel really just keeps track of things. Though, writing a UAPI on
> top of the GPUVA manager asking for a free range instead would be
> possible by just adding the corresponding wrapper functions to get a
> free hole.
>
> Currently, and that's what I think I read out of your question, the main
> benefit of using drm_mm over simply stuffing the entries into a list or
> something boils down to easier collision detection and iterating
> sub-ranges of the whole VA space.

Why not just do this in userspace? We have a range manager in
libdrm_amdgpu that you could lift out into libdrm or some other
helper.

Alex


>
> >
> > Regards,
> > Christian.
> >
> >>
> >> - Danilo
> >>
> >>>
> >>> Regards,
> >>> Christian.
> >>>
> >>>>
> >>>> 2) Generically connect GPU VA mappings to their backing
> >>>> buffers, in
> >>>> particular DRM GEM objects.
> >>>>
> >>>> 3) Provide a common implementation to perform more complex mapping
> >>>> operations on the GPU VA space. In particular splitting and
> >>>> merging
> >>>> of GPU VA mappings, e.g. for intersecting mapping requests
> >>>> or partial
> >>>> unmap requests.
> >>>>
> >>>> The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager,
> >>>> itself
> >>>> providing the following new interfaces.
> >>>>
> >>>> 1) Initialize a GPU VA space via the new
> >>>> DRM_IOCTL_NOUVEAU_VM_INIT ioctl
> >>>> for UMDs to specify the portion of VA space managed by the
> >>>> kernel and
> >>>> userspace, respectively.
> >>>>
> >>>> 2) Allocate and free a VA space region as well as bind and
> >>>> unbind memory
> >>>> to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND
> >>>> ioctl.
> >>>>
> >>>> 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.
> >>>>
> >>>> Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use
> >>>> of the DRM
> >>>> scheduler to queue jobs and support asynchronous processing with DRM
> >>>> syncobjs
> >>>> as synchronization mechanism.
> >>>>
> >>>> By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing,
> >>>> DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.
> >>>>
> >>>> The new VM_BIND UAPI for Nouveau makes also use of drm_exec
> >>>> (execution context
> >>>> for GEM buffers) by Christian König. Since the patch implementing
> >>>> drm_exec was
> >>>> not yet merged into drm-next it is part of this series, as well as a
> >>>> small fix
> >>>> for this patch, which was found while testing this series.
> >>>>
> >>>> This patch series is also available at [1].
> >>>>
> >>>> There is a Mesa NVK merge request by Dave Airlie [2] implementing the
> >>>> corresponding userspace parts for this series.
> >>>>
> >>>> The Vulkan CTS test suite passes the sparse binding and sparse
> >>>> residency test
> >>>> cases for the new UAPI together with Dave's Mesa work.
> >>>>
> >>>> There are also some test cases in the igt-gpu-tools project [3] for
> >>>> the new UAPI
> >>>> and hence the DRM GPU VA manager. However, most of them are testing
> >>>> the DRM GPU
> >>>> VA manager's logic through Nouveau's new UAPI and should be
> >>>> considered just as
> >>>> helper for implementation.
> >>>>
> >>>> However, I absolutely intend to change those test cases to proper
> >>>> kunit test
> >>>> cases for the DRM GPUVA manager, once and if we agree on it's
> >>>> usefulness and
> >>>> design.
> >>>>
> >>>> [1]
> >>>> https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next /
> >>>> https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
> >>>> [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
> >>>> [3]
> >>>> https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind
> >>>>
> >>>> I also want to give credit to Dave Airlie, who contributed a lot of
> >>>> ideas to
> >>>> this patch series.
> >>>>
> >>>> Christian König (1):
> >>>> drm: execution context for GEM buffers
> >>>>
> >>>> Danilo Krummrich (13):
> >>>> drm/exec: fix memory leak in drm_exec_prepare_obj()
> >>>> drm: manager to keep track of GPUs VA mappings
> >>>> drm: debugfs: provide infrastructure to dump a DRM GPU VA space
> >>>> drm/nouveau: new VM_BIND uapi interfaces
> >>>> drm/nouveau: get vmm via nouveau_cli_vmm()
> >>>> drm/nouveau: bo: initialize GEM GPU VA interface
> >>>> drm/nouveau: move usercopy helpers to nouveau_drv.h
> >>>> drm/nouveau: fence: fail to emit when fence context is killed
> >>>> drm/nouveau: chan: provide nouveau_channel_kill()
> >>>> drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
> >>>> drm/nouveau: implement uvmm for user mode bindings
> >>>> drm/nouveau: implement new VM_BIND UAPI
> >>>> drm/nouveau: debugfs: implement DRM GPU VA debugfs
> >>>>
> >>>> Documentation/gpu/driver-uapi.rst | 11 +
> >>>> Documentation/gpu/drm-mm.rst | 43 +
> >>>> drivers/gpu/drm/Kconfig | 6 +
> >>>> drivers/gpu/drm/Makefile | 3 +
> >>>> drivers/gpu/drm/amd/amdgpu/Kconfig | 1 +
> >>>> drivers/gpu/drm/drm_debugfs.c | 56 +
> >>>> drivers/gpu/drm/drm_exec.c | 294 ++++
> >>>> drivers/gpu/drm/drm_gem.c | 3 +
> >>>> drivers/gpu/drm/drm_gpuva_mgr.c | 1323
> >>>> +++++++++++++++++
> >>>> drivers/gpu/drm/nouveau/Kbuild | 3 +
> >>>> drivers/gpu/drm/nouveau/Kconfig | 2 +
> >>>> drivers/gpu/drm/nouveau/include/nvif/if000c.h | 23 +-
> >>>> drivers/gpu/drm/nouveau/include/nvif/vmm.h | 17 +-
> >>>> .../gpu/drm/nouveau/include/nvkm/subdev/mmu.h | 10 +
> >>>> drivers/gpu/drm/nouveau/nouveau_abi16.c | 23 +
> >>>> drivers/gpu/drm/nouveau/nouveau_abi16.h | 1 +
> >>>> drivers/gpu/drm/nouveau/nouveau_bo.c | 152 +-
> >>>> drivers/gpu/drm/nouveau/nouveau_bo.h | 2 +-
> >>>> drivers/gpu/drm/nouveau/nouveau_chan.c | 16 +-
> >>>> drivers/gpu/drm/nouveau/nouveau_chan.h | 1 +
> >>>> drivers/gpu/drm/nouveau/nouveau_debugfs.c | 24 +
> >>>> drivers/gpu/drm/nouveau/nouveau_drm.c | 25 +-
> >>>> drivers/gpu/drm/nouveau/nouveau_drv.h | 92 +-
> >>>> drivers/gpu/drm/nouveau/nouveau_exec.c | 310 ++++
> >>>> drivers/gpu/drm/nouveau/nouveau_exec.h | 55 +
> >>>> drivers/gpu/drm/nouveau/nouveau_fence.c | 7 +
> >>>> drivers/gpu/drm/nouveau/nouveau_fence.h | 2 +-
> >>>> drivers/gpu/drm/nouveau/nouveau_gem.c | 83 +-
> >>>> drivers/gpu/drm/nouveau/nouveau_mem.h | 5 +
> >>>> drivers/gpu/drm/nouveau/nouveau_prime.c | 2 +-
> >>>> drivers/gpu/drm/nouveau/nouveau_sched.c | 780 ++++++++++
> >>>> drivers/gpu/drm/nouveau/nouveau_sched.h | 98 ++
> >>>> drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
> >>>> drivers/gpu/drm/nouveau/nouveau_uvmm.c | 575 +++++++
> >>>> drivers/gpu/drm/nouveau/nouveau_uvmm.h | 68 +
> >>>> drivers/gpu/drm/nouveau/nouveau_vmm.c | 4 +-
> >>>> drivers/gpu/drm/nouveau/nvif/vmm.c | 73 +-
> >>>> .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c | 168 ++-
> >>>> .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.h | 1 +
> >>>> drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 32 +-
> >>>> drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 3 +
> >>>> include/drm/drm_debugfs.h | 25 +
> >>>> include/drm/drm_drv.h | 6 +
> >>>> include/drm/drm_exec.h | 144 ++
> >>>> include/drm/drm_gem.h | 75 +
> >>>> include/drm/drm_gpuva_mgr.h | 527 +++++++
> >>>> include/uapi/drm/nouveau_drm.h | 216 +++
> >>>> 47 files changed, 5266 insertions(+), 126 deletions(-)
> >>>> create mode 100644 drivers/gpu/drm/drm_exec.c
> >>>> create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
> >>>> create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c
> >>>> create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h
> >>>> create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c
> >>>> create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h
> >>>> create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c
> >>>> create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h
> >>>> create mode 100644 include/drm/drm_exec.h
> >>>> create mode 100644 include/drm/drm_gpuva_mgr.h
> >>>>
> >>>>
> >>>> base-commit: 0b45ac1170ea6416bc1d36798414c04870cd356d
> >>>
> >>
> >
>

2023-01-18 17:20:47

by Alex Deucher

Subject: Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich <[email protected]> wrote:
>
>
>
> On 1/18/23 17:30, Alex Deucher wrote:
> > On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich <[email protected]> wrote:
> >>
> >> On 1/18/23 16:37, Christian König wrote:
> >>> Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
> >>>> Hi Christian,
> >>>>
> >>>> On 1/18/23 09:53, Christian König wrote:
> >>>>> Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
> >>>>>> This patch series provides a new UAPI for the Nouveau driver in
> >>>>>> order to
> >>>>>> support Vulkan features, such as sparse bindings and sparse residency.
> >>>>>>
> >>>>>> Furthermore, with the DRM GPUVA manager it provides a new DRM core
> >>>>>> feature to
> >>>>>> keep track of GPU virtual address (VA) mappings in a more generic way.
> >>>>>>
> >>>>>> The DRM GPUVA manager is indented to help drivers implement
> >>>>>> userspace-manageable
> >>>>>> GPU VA spaces in reference to the Vulkan API. In order to achieve
> >>>>>> this goal it
> >>>>>> serves the following purposes in this context.
> >>>>>>
> >>>>>> 1) Provide a dedicated range allocator to track GPU VA
> >>>>>> allocations and
> >>>>>> mappings, making use of the drm_mm range allocator.
> >>>>>
> >>>>> This means that the ranges are allocated by the kernel? If yes that's
> >>>>> a really really bad idea.
> >>>>
> >>>> No, it's just for keeping track of the ranges userspace has allocated.
> >>>
> >>> Ok, that makes more sense.
> >>>
> >>> So basically you have an IOCTL which asks kernel for a free range? Or
> >>> what exactly is the drm_mm used for here?
> >>
> >> Not even that, userspace provides both the base address and the range,
> >> the kernel really just keeps track of things. Though, writing a UAPI on
> >> top of the GPUVA manager asking for a free range instead would be
> >> possible by just adding the corresponding wrapper functions to get a
> >> free hole.
> >>
> >> Currently, and that's what I think I read out of your question, the main
> >> benefit of using drm_mm over simply stuffing the entries into a list or
> >> something boils down to easier collision detection and iterating
> >> sub-ranges of the whole VA space.
> >
> > Why not just do this in userspace? We have a range manager in
> > libdrm_amdgpu that you could lift out into libdrm or some other
> > helper.
>
> The kernel still needs to keep track of the mappings within the various
> VA spaces, e.g. it silently needs to unmap mappings that are backed by
> BOs that get evicted and remap them once they're validated (or swapped
> back in).

Ok, you are just using this for maintaining the GPU VM space in the kernel.

Alex

>
> >
> > Alex
> >
> >
> >>
> >>>
> >>> Regards,
> >>> Christian.
> >>>
> >>>>
> >>>> - Danilo
> >>>>
> >>>>>
> >>>>> Regards,
> >>>>> Christian.
> >>>>>
> >>>>>> [SNIP]

2023-01-18 17:45:04

by Danilo Krummrich

[permalink] [raw]
Subject: Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI



On 1/18/23 17:30, Alex Deucher wrote:
> On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich <[email protected]> wrote:
>>
>> On 1/18/23 16:37, Christian König wrote:
>>> Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
>>>> Hi Christian,
>>>>
>>>> On 1/18/23 09:53, Christian König wrote:
>>>>> Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
>>>>>> This patch series provides a new UAPI for the Nouveau driver in
>>>>>> order to
>>>>>> support Vulkan features, such as sparse bindings and sparse residency.
>>>>>>
>>>>>> Furthermore, with the DRM GPUVA manager it provides a new DRM core
>>>>>> feature to
>>>>>> keep track of GPU virtual address (VA) mappings in a more generic way.
>>>>>>
>>>>>> The DRM GPUVA manager is indented to help drivers implement
>>>>>> userspace-manageable
>>>>>> GPU VA spaces in reference to the Vulkan API. In order to achieve
>>>>>> this goal it
>>>>>> serves the following purposes in this context.
>>>>>>
>>>>>> 1) Provide a dedicated range allocator to track GPU VA
>>>>>> allocations and
>>>>>> mappings, making use of the drm_mm range allocator.
>>>>>
>>>>> This means that the ranges are allocated by the kernel? If yes that's
>>>>> a really really bad idea.
>>>>
>>>> No, it's just for keeping track of the ranges userspace has allocated.
>>>
>>> Ok, that makes more sense.
>>>
>>> So basically you have an IOCTL which asks kernel for a free range? Or
>>> what exactly is the drm_mm used for here?
>>
>> Not even that, userspace provides both the base address and the range,
>> the kernel really just keeps track of things. Though, writing a UAPI on
>> top of the GPUVA manager asking for a free range instead would be
>> possible by just adding the corresponding wrapper functions to get a
>> free hole.
>>
>> Currently, and that's what I think I read out of your question, the main
>> benefit of using drm_mm over simply stuffing the entries into a list or
>> something boils down to easier collision detection and iterating
>> sub-ranges of the whole VA space.
>
> Why not just do this in userspace? We have a range manager in
> libdrm_amdgpu that you could lift out into libdrm or some other
> helper.

The kernel still needs to keep track of the mappings within the various
VA spaces, e.g. it silently needs to unmap mappings that are backed by
BOs that get evicted and remap them once they're validated (or swapped
back in).
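
Roughly speaking, it boils down to something like the sketch below. All
names in it are made up for illustration (this is not code from the
series); the point is just that the kernel needs a per-BO record of the
VA mappings so it can tear them down and replay them:

#include <linux/list.h>
#include <linux/types.h>

struct sketch_vm;

/* Purely hypothetical structures and helpers, not part of this series. */
struct sketch_mapping {
        struct list_head gem_link;      /* entry in the BO's mapping list */
        struct sketch_vm *vm;           /* VA space this mapping lives in */
        u64 addr;                       /* GPU VA base address */
        u64 range;                      /* size of the mapping */
        u64 offset;                     /* offset into the backing BO */
};

struct sketch_bo {
        struct list_head mappings;      /* list of struct sketch_mapping */
        /* ... */
};

void sketch_vm_unmap(struct sketch_vm *vm, u64 addr, u64 range);
void sketch_vm_map(struct sketch_vm *vm, u64 addr, u64 range,
                   struct sketch_bo *bo, u64 offset);

/* Called when the BO gets evicted: silently tear down its mappings. */
static void sketch_bo_evicted(struct sketch_bo *bo)
{
        struct sketch_mapping *m;

        list_for_each_entry(m, &bo->mappings, gem_link)
                sketch_vm_unmap(m->vm, m->addr, m->range);
}

/* Called once the BO is validated again: re-establish the mappings. */
static void sketch_bo_validated(struct sketch_bo *bo)
{
        struct sketch_mapping *m;

        list_for_each_entry(m, &bo->mappings, gem_link)
                sketch_vm_map(m->vm, m->addr, m->range, bo, m->offset);
}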

>
> Alex
>
>
>>
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> - Danilo
>>>>
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> [SNIP]

2023-01-18 19:30:30

by Dave Airlie

[permalink] [raw]
Subject: Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

On Thu, 19 Jan 2023 at 02:54, Alex Deucher <[email protected]> wrote:
>
> On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich <[email protected]> wrote:
> >
> >
> >
> > On 1/18/23 17:30, Alex Deucher wrote:
> > > On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich <[email protected]> wrote:
> > >>
> > >> On 1/18/23 16:37, Christian König wrote:
> > >>> Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
> > >>>> Hi Christian,
> > >>>>
> > >>>> On 1/18/23 09:53, Christian König wrote:
> > >>>>> Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
> > >>>>>> This patch series provides a new UAPI for the Nouveau driver in
> > >>>>>> order to
> > >>>>>> support Vulkan features, such as sparse bindings and sparse residency.
> > >>>>>>
> > >>>>>> Furthermore, with the DRM GPUVA manager it provides a new DRM core
> > >>>>>> feature to
> > >>>>>> keep track of GPU virtual address (VA) mappings in a more generic way.
> > >>>>>>
> > >>>>>> The DRM GPUVA manager is indented to help drivers implement
> > >>>>>> userspace-manageable
> > >>>>>> GPU VA spaces in reference to the Vulkan API. In order to achieve
> > >>>>>> this goal it
> > >>>>>> serves the following purposes in this context.
> > >>>>>>
> > >>>>>> 1) Provide a dedicated range allocator to track GPU VA
> > >>>>>> allocations and
> > >>>>>> mappings, making use of the drm_mm range allocator.
> > >>>>>
> > >>>>> This means that the ranges are allocated by the kernel? If yes that's
> > >>>>> a really really bad idea.
> > >>>>
> > >>>> No, it's just for keeping track of the ranges userspace has allocated.
> > >>>
> > >>> Ok, that makes more sense.
> > >>>
> > >>> So basically you have an IOCTL which asks kernel for a free range? Or
> > >>> what exactly is the drm_mm used for here?
> > >>
> > >> Not even that, userspace provides both the base address and the range,
> > >> the kernel really just keeps track of things. Though, writing a UAPI on
> > >> top of the GPUVA manager asking for a free range instead would be
> > >> possible by just adding the corresponding wrapper functions to get a
> > >> free hole.
> > >>
> > >> Currently, and that's what I think I read out of your question, the main
> > >> benefit of using drm_mm over simply stuffing the entries into a list or
> > >> something boils down to easier collision detection and iterating
> > >> sub-ranges of the whole VA space.
> > >
> > > Why not just do this in userspace? We have a range manager in
> > > libdrm_amdgpu that you could lift out into libdrm or some other
> > > helper.
> >
> > The kernel still needs to keep track of the mappings within the various
> > VA spaces, e.g. it silently needs to unmap mappings that are backed by
> > BOs that get evicted and remap them once they're validated (or swapped
> > back in).
>
> Ok, you are just using this for maintaining the GPU VM space in the kernel.
>

Yes, the idea behind having common code wrapping drm_mm for this is to
allow us to make the rules consistent across drivers.

Userspace (generally Vulkan, some compute) has interfaces that pretty
much dictate a lot of how VMA tracking works, especially around
lifetimes, sparse mappings and splitting/merging underlying page
tables. I'd really like this to be more consistent across drivers,
because I think we've already seen some divergence from amdgpu with
freedreno, and we also have i915/xe to deal with. I'd like to at least
have one place where we can say this is how it should work, since this
is something that *should* be mostly consistent across drivers, as it
is more about how the uapi is exposed.

Dave.

2023-01-18 19:58:48

by Christian König

[permalink] [raw]
Subject: Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

Am 18.01.23 um 20:17 schrieb Dave Airlie:
> On Thu, 19 Jan 2023 at 02:54, Alex Deucher <[email protected]> wrote:
>> On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich <[email protected]> wrote:
>>>
>>>
>>> On 1/18/23 17:30, Alex Deucher wrote:
>>>> On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich <[email protected]> wrote:
>>>>> On 1/18/23 16:37, Christian König wrote:
>>>>>> Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
>>>>>>> Hi Christian,
>>>>>>>
>>>>>>> On 1/18/23 09:53, Christian König wrote:
>>>>>>>> Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
>>>>>>>>> This patch series provides a new UAPI for the Nouveau driver in
>>>>>>>>> order to
>>>>>>>>> support Vulkan features, such as sparse bindings and sparse residency.
>>>>>>>>>
>>>>>>>>> Furthermore, with the DRM GPUVA manager it provides a new DRM core
>>>>>>>>> feature to
>>>>>>>>> keep track of GPU virtual address (VA) mappings in a more generic way.
>>>>>>>>>
>>>>>>>>> The DRM GPUVA manager is indented to help drivers implement
>>>>>>>>> userspace-manageable
>>>>>>>>> GPU VA spaces in reference to the Vulkan API. In order to achieve
>>>>>>>>> this goal it
>>>>>>>>> serves the following purposes in this context.
>>>>>>>>>
>>>>>>>>> 1) Provide a dedicated range allocator to track GPU VA
>>>>>>>>> allocations and
>>>>>>>>> mappings, making use of the drm_mm range allocator.
>>>>>>>> This means that the ranges are allocated by the kernel? If yes that's
>>>>>>>> a really really bad idea.
>>>>>>> No, it's just for keeping track of the ranges userspace has allocated.
>>>>>> Ok, that makes more sense.
>>>>>>
>>>>>> So basically you have an IOCTL which asks kernel for a free range? Or
>>>>>> what exactly is the drm_mm used for here?
>>>>> Not even that, userspace provides both the base address and the range,
>>>>> the kernel really just keeps track of things. Though, writing a UAPI on
>>>>> top of the GPUVA manager asking for a free range instead would be
>>>>> possible by just adding the corresponding wrapper functions to get a
>>>>> free hole.
>>>>>
>>>>> Currently, and that's what I think I read out of your question, the main
>>>>> benefit of using drm_mm over simply stuffing the entries into a list or
>>>>> something boils down to easier collision detection and iterating
>>>>> sub-ranges of the whole VA space.
>>>> Why not just do this in userspace? We have a range manager in
>>>> libdrm_amdgpu that you could lift out into libdrm or some other
>>>> helper.
>>> The kernel still needs to keep track of the mappings within the various
>>> VA spaces, e.g. it silently needs to unmap mappings that are backed by
>>> BOs that get evicted and remap them once they're validated (or swapped
>>> back in).
>> Ok, you are just using this for maintaining the GPU VM space in the kernel.
>>
> Yes the idea behind having common code wrapping drm_mm for this is to
> allow us to make the rules consistent across drivers.
>
> Userspace (generally Vulkan, some compute) has interfaces that pretty
> much dictate a lot of how VMA tracking works, esp around lifetimes,
> sparse mappings and splitting/merging underlying page tables, I'd
> really like this to be more consistent across drivers, because already
> I think we've seen with freedreno some divergence from amdgpu and we
> also have i915/xe to deal with. I'd like to at least have one place
> that we can say this is how it should work, since this is something
> that *should* be consistent across drivers mostly, as it is more about
> how the uapi is exposed.

That's a really good idea, but the implementation with drm_mm won't work
like that.

We have Vulkan applications which use the sparse feature to create
literally millions of mappings. That's why I have fine-tuned the mapping
structure in amdgpu down to ~80 bytes IIRC and save every CPU cycle
possible in the handling of that.

A drm_mm_node is more in the range of ~200 bytes and certainly not
suitable for this kind of job.
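
From memory, such a slimmed-down per-mapping entry boils down to roughly
the following. This is an approximation for illustration, not the exact
amdgpu_bo_va_mapping definition, but it shows why ~80 bytes is realistic:
an interval tree node for range lookups plus a list link and the mapping
parameters, without any of the free-hole bookkeeping a drm_mm_node
carries.

#include <linux/interval_tree.h>
#include <linux/list.h>
#include <linux/types.h>

/* Approximation only, not the real amdgpu definition. */
struct slim_mapping {
        struct interval_tree_node it;   /* rb node + start/last, ~48 bytes on 64-bit */
        struct list_head list;          /* link into the per-BO+VM state, 16 bytes */
        u64 offset;                     /* offset into the backing BO */
        u64 flags;                      /* e.g. readable/writeable/PRT */
};                                      /* adds up to roughly 80 bytes */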

I strongly suggest rather using a good chunk of the amdgpu VM code as a
blueprint for the common infrastructure.

Regards,
Christian.

>
> Dave.

2023-01-19 04:30:45

by Danilo Krummrich

[permalink] [raw]
Subject: Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

On 1/18/23 20:48, Christian König wrote:
> Am 18.01.23 um 20:17 schrieb Dave Airlie:
>> On Thu, 19 Jan 2023 at 02:54, Alex Deucher <[email protected]> wrote:
>>> On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich <[email protected]>
>>> wrote:
>>>>
>>>>
>>>> On 1/18/23 17:30, Alex Deucher wrote:
>>>>> On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich <[email protected]>
>>>>> wrote:
>>>>>> On 1/18/23 16:37, Christian König wrote:
>>>>>>> Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
>>>>>>>> Hi Christian,
>>>>>>>>
>>>>>>>> On 1/18/23 09:53, Christian König wrote:
>>>>>>>>> Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
>>>>>>>>>> This patch series provides a new UAPI for the Nouveau driver in
>>>>>>>>>> order to
>>>>>>>>>> support Vulkan features, such as sparse bindings and sparse
>>>>>>>>>> residency.
>>>>>>>>>>
>>>>>>>>>> Furthermore, with the DRM GPUVA manager it provides a new DRM
>>>>>>>>>> core
>>>>>>>>>> feature to
>>>>>>>>>> keep track of GPU virtual address (VA) mappings in a more
>>>>>>>>>> generic way.
>>>>>>>>>>
>>>>>>>>>> The DRM GPUVA manager is indented to help drivers implement
>>>>>>>>>> userspace-manageable
>>>>>>>>>> GPU VA spaces in reference to the Vulkan API. In order to achieve
>>>>>>>>>> this goal it
>>>>>>>>>> serves the following purposes in this context.
>>>>>>>>>>
>>>>>>>>>>        1) Provide a dedicated range allocator to track GPU VA
>>>>>>>>>> allocations and
>>>>>>>>>>           mappings, making use of the drm_mm range allocator.
>>>>>>>>> This means that the ranges are allocated by the kernel? If yes
>>>>>>>>> that's
>>>>>>>>> a really really bad idea.
>>>>>>>> No, it's just for keeping track of the ranges userspace has
>>>>>>>> allocated.
>>>>>>> Ok, that makes more sense.
>>>>>>>
>>>>>>> So basically you have an IOCTL which asks kernel for a free
>>>>>>> range? Or
>>>>>>> what exactly is the drm_mm used for here?
>>>>>> Not even that, userspace provides both the base address and the
>>>>>> range,
>>>>>> the kernel really just keeps track of things. Though, writing a
>>>>>> UAPI on
>>>>>> top of the GPUVA manager asking for a free range instead would be
>>>>>> possible by just adding the corresponding wrapper functions to get a
>>>>>> free hole.
>>>>>>
>>>>>> Currently, and that's what I think I read out of your question,
>>>>>> the main
>>>>>> benefit of using drm_mm over simply stuffing the entries into a
>>>>>> list or
>>>>>> something boils down to easier collision detection and iterating
>>>>>> sub-ranges of the whole VA space.
>>>>> Why not just do this in userspace?  We have a range manager in
>>>>> libdrm_amdgpu that you could lift out into libdrm or some other
>>>>> helper.
>>>> The kernel still needs to keep track of the mappings within the various
>>>> VA spaces, e.g. it silently needs to unmap mappings that are backed by
>>>> BOs that get evicted and remap them once they're validated (or swapped
>>>> back in).
>>> Ok, you are just using this for maintaining the GPU VM space in the
>>> kernel.
>>>
>> Yes the idea behind having common code wrapping drm_mm for this is to
>> allow us to make the rules consistent across drivers.
>>
>> Userspace (generally Vulkan, some compute) has interfaces that pretty
>> much dictate a lot of how VMA tracking works, esp around lifetimes,
>> sparse mappings and splitting/merging underlying page tables, I'd
>> really like this to be more consistent across drivers, because already
>> I think we've seen with freedreno some divergence from amdgpu and we
>> also have i915/xe to deal with. I'd like to at least have one place
>> that we can say this is how it should work, since this is something
>> that *should* be consistent across drivers mostly, as it is more about
>> how the uapi is exposed.
>
> That's a really good idea, but the implementation with drm_mm won't work
> like that.
>
> We have Vulkan applications which use the sparse feature to create
> literally millions of mappings. That's why I have fine tuned the mapping
> structure in amdgpu down to ~80 bytes IIRC and save every CPU cycle
> possible in the handling of that.

That's valuable information. Can you recommend such an application for
testing / benchmarking?

Your optimization effort sounds great. Might it be worth generalizing
your approach on its own and stacking the drm_gpuva_manager on top of
it?

>
> A drm_mm_node is more in the range of ~200 bytes and certainly not
> suitable for this kind of job.
>
> I strongly suggest to rather use a good bunch of the amdgpu VM code as
> blueprint for the common infrastructure.

I will definitely have a look.

>
> Regards,
> Christian.
>
>>
>> Dave.
>

2023-01-19 05:52:00

by Matthew Brost

[permalink] [raw]
Subject: Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

On Thu, Jan 19, 2023 at 05:04:32AM +0100, Danilo Krummrich wrote:
> On 1/18/23 20:48, Christian König wrote:
> > Am 18.01.23 um 20:17 schrieb Dave Airlie:
> > > On Thu, 19 Jan 2023 at 02:54, Alex Deucher <[email protected]> wrote:
> > > > On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich
> > > > <[email protected]> wrote:
> > > > >
> > > > >
> > > > > On 1/18/23 17:30, Alex Deucher wrote:
> > > > > > On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich
> > > > > > <[email protected]> wrote:
> > > > > > > On 1/18/23 16:37, Christian König wrote:
> > > > > > > > Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
> > > > > > > > > Hi Christian,
> > > > > > > > >
> > > > > > > > > On 1/18/23 09:53, Christian König wrote:
> > > > > > > > > > Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
> > > > > > > > > > > [SNIP]
> > > > > > > > > > >
> > > > > > > > > > >        1) Provide a dedicated range allocator to track GPU VA
> > > > > > > > > > > allocations and
> > > > > > > > > > >           mappings, making use of the drm_mm range allocator.
> > > > > > > > > > This means that the ranges are allocated
> > > > > > > > > > by the kernel? If yes that's
> > > > > > > > > > a really really bad idea.
> > > > > > > > > No, it's just for keeping track of the
> > > > > > > > > ranges userspace has allocated.
> > > > > > > > Ok, that makes more sense.
> > > > > > > >
> > > > > > > > So basically you have an IOCTL which asks kernel
> > > > > > > > for a free range? Or
> > > > > > > > what exactly is the drm_mm used for here?
> > > > > > > Not even that, userspace provides both the base
> > > > > > > address and the range,
> > > > > > > the kernel really just keeps track of things.
> > > > > > > Though, writing a UAPI on
> > > > > > > top of the GPUVA manager asking for a free range instead would be
> > > > > > > possible by just adding the corresponding wrapper functions to get a
> > > > > > > free hole.
> > > > > > >
> > > > > > > Currently, and that's what I think I read out of
> > > > > > > your question, the main
> > > > > > > benefit of using drm_mm over simply stuffing the
> > > > > > > entries into a list or
> > > > > > > something boils down to easier collision detection and iterating
> > > > > > > sub-ranges of the whole VA space.
> > > > > > Why not just do this in userspace? We have a range manager in
> > > > > > libdrm_amdgpu that you could lift out into libdrm or some other
> > > > > > helper.
> > > > > The kernel still needs to keep track of the mappings within the various
> > > > > VA spaces, e.g. it silently needs to unmap mappings that are backed by
> > > > > BOs that get evicted and remap them once they're validated (or swapped
> > > > > back in).
> > > > Ok, you are just using this for maintaining the GPU VM space in
> > > > the kernel.
> > > >
> > > Yes the idea behind having common code wrapping drm_mm for this is to
> > > allow us to make the rules consistent across drivers.
> > >
> > > Userspace (generally Vulkan, some compute) has interfaces that pretty
> > > much dictate a lot of how VMA tracking works, esp around lifetimes,
> > > sparse mappings and splitting/merging underlying page tables, I'd
> > > really like this to be more consistent across drivers, because already
> > > I think we've seen with freedreno some divergence from amdgpu and we
> > > also have i915/xe to deal with. I'd like to at least have one place
> > > that we can say this is how it should work, since this is something
> > > that *should* be consistent across drivers mostly, as it is more about
> > > how the uapi is exposed.
> >
> > That's a really good idea, but the implementation with drm_mm won't work
> > like that.
> >
> > We have Vulkan applications which use the sparse feature to create
> > literally millions of mappings. That's why I have fine tuned the mapping

Is this not an application issue? Millions of mappings seems a bit
absurd to me.

> > structure in amdgpu down to ~80 bytes IIRC and save every CPU cycle
> > possible in the handling of that.

We might need to do a bit of work here in Xe, as our xe_vma structure is
quite big since we currently use it as a dumping ground for various features.

>
> That's a valuable information. Can you recommend such an application for
> testing / benchmarking?
>

Also interested.

> Your optimization effort sounds great. May it be worth thinking about
> generalizing your approach by itself and stacking the drm_gpuva_manager on
> top of it?
>

FWIW, the Xe team is on board with the drm_gpuva_manager effort; we
basically open code all of this right now. I'd like to port over to
drm_gpuva_manager ASAP so we can contribute and help find a viable
solution for all of us.

Matt

> >
> > A drm_mm_node is more in the range of ~200 bytes and certainly not
> > suitable for this kind of job.
> >
> > I strongly suggest to rather use a good bunch of the amdgpu VM code as
> > blueprint for the common infrastructure.
>
> I will definitely have look.
>
> >
> > Regards,
> > Christian.
> >
> > >
> > > Dave.
> >
>

2023-01-19 11:56:24

by Christian König

[permalink] [raw]
Subject: drm_gpuva_manager requirements (was Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI)

Am 19.01.23 um 06:23 schrieb Matthew Brost:
> [SNIP]
>>>> Userspace (generally Vulkan, some compute) has interfaces that pretty
>>>> much dictate a lot of how VMA tracking works, esp around lifetimes,
>>>> sparse mappings and splitting/merging underlying page tables, I'd
>>>> really like this to be more consistent across drivers, because already
>>>> I think we've seen with freedreno some divergence from amdgpu and we
>>>> also have i915/xe to deal with. I'd like to at least have one place
>>>> that we can say this is how it should work, since this is something
>>>> that *should* be consistent across drivers mostly, as it is more about
>>>> how the uapi is exposed.
>>> That's a really good idea, but the implementation with drm_mm won't work
>>> like that.
>>>
>>> We have Vulkan applications which use the sparse feature to create
>>> literally millions of mappings. That's why I have fine tuned the mapping
> Is this not an application issue? Millions of mappings seems a bit
> absurd to me.

That's unfortunately how some games are designed these days.

>>> structure in amdgpu down to ~80 bytes IIRC and save every CPU cycle
>>> possible in the handling of that.
> We might need to bit of work here in Xe as our xe_vma structure is quite
> big as we currently use it as dumping ground for various features.

We have done that as well and it turned out to be a bad idea. At one
point we added some power management information into the mapping
structure, but quickly reverted that.

>> That's a valuable information. Can you recommend such an application for
>> testing / benchmarking?
>>
> Also interested.

One of the most demanding ones is Forza Horizon 5. The general approach
of that game seems to be to allocate 64GiB of address space (which equals
16 million 4KiB pages) and then mmap() whatever data it needs into that
self-managed space, assuming that every 4KiB page is individually
mappable to a different location.

>> Your optimization effort sounds great. May it be worth thinking about
>> generalizing your approach by itself and stacking the drm_gpuva_manager on
>> top of it?
>>
> FWIW the Xe is on board with the drm_gpuva_manager effort, we basically
> open code all of this right now. I'd like to port over to
> drm_gpuva_manager ASAP so we can contribute and help find a viable
> solution for all of us.

Sounds good. I haven't looked into the drm_gpuva_manager code yet, but
here are a few design notes I've learned from amdgpu:

Separate address space management (drm_mm) from page table management.
In other words, when an application asks for 64GiB of free address space
you don't look into the page table structures, but rather into a
separate drm_mm instance. In amdgpu we even moved the latter into
userspace, but the general takeaway is that you have only a handful of
address space requests while you have tons of mapping/unmapping requests.

Separate the tracking structure into two, one for each BO+VM combination
(we call that amdgpu_bo_va) and one for each mapping (called
amdgpu_bo_va_mapping). We unfortunately use that for our hw-dependent
state machine as well, so it isn't easily generalizable.
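
Very roughly, and leaving out the hw state (these are not the exact
amdgpu definitions, just a simplified sketch with made-up names), the
split looks like this:

#include <linux/list.h>
#include <linux/types.h>

struct drm_gem_object;
struct sketch_vm;       /* hypothetical VM structure */

/* Simplified sketch of the two-level tracking, not the real definitions. */
struct sketch_bo_va {                   /* one per BO+VM combination */
        struct drm_gem_object *obj;     /* the BO */
        struct sketch_vm *vm;           /* the VM the BO is used in */
        struct list_head mappings;      /* all mappings of this BO in this VM */
        struct list_head vm_status;     /* per-VM state tracking (evicted, ...) */
};

struct sketch_bo_va_mapping {           /* one per mapping */
        struct sketch_bo_va *bo_va;     /* backpointer to the BO+VM state */
        struct list_head list;          /* entry in bo_va->mappings */
        u64 start, last;                /* mapped VA range (inclusive) */
        u64 offset;                     /* offset into the BO */
        u64 flags;
};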

I've gone back and forth on merging VMAs and then not merging them again.
Not merging them can save quite a bit of overhead, but results in many
more mappings for some use cases.

Regards,
Christian.

>
> Matt
>
>>> A drm_mm_node is more in the range of ~200 bytes and certainly not
>>> suitable for this kind of job.
>>>
>>> I strongly suggest to rather use a good bunch of the amdgpu VM code as
>>> blueprint for the common infrastructure.
>> I will definitely have look.
>>
>>> Regards,
>>> Christian.
>>>
>>>> Dave.

2023-02-06 14:48:50

by Oded Gabbay

[permalink] [raw]
Subject: Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

On Thu, Jan 19, 2023 at 7:24 AM Matthew Brost <[email protected]> wrote:
>
> On Thu, Jan 19, 2023 at 05:04:32AM +0100, Danilo Krummrich wrote:
> > On 1/18/23 20:48, Christian König wrote:
> > > Am 18.01.23 um 20:17 schrieb Dave Airlie:
> > > > On Thu, 19 Jan 2023 at 02:54, Alex Deucher <[email protected]> wrote:
> > > > > On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich
> > > > > <[email protected]> wrote:
> > > > > >
> > > > > >
> > > > > > On 1/18/23 17:30, Alex Deucher wrote:
> > > > > > > On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich
> > > > > > > <[email protected]> wrote:
> > > > > > > > On 1/18/23 16:37, Christian König wrote:
> > > > > > > > > Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
> > > > > > > > > > Hi Christian,
> > > > > > > > > >
> > > > > > > > > > On 1/18/23 09:53, Christian König wrote:
> > > > > > > > > > > Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
> > > > > > > > > > > > [SNIP]
> > > > > > > > > > > >
> > > > > > > > > > > > 1) Provide a dedicated range allocator to track GPU VA
> > > > > > > > > > > > allocations and
> > > > > > > > > > > > mappings, making use of the drm_mm range allocator.
> > > > > > > > > > > This means that the ranges are allocated
> > > > > > > > > > > by the kernel? If yes that's
> > > > > > > > > > > a really really bad idea.
> > > > > > > > > > No, it's just for keeping track of the
> > > > > > > > > > ranges userspace has allocated.
> > > > > > > > > Ok, that makes more sense.
> > > > > > > > >
> > > > > > > > > So basically you have an IOCTL which asks kernel
> > > > > > > > > for a free range? Or
> > > > > > > > > what exactly is the drm_mm used for here?
> > > > > > > > Not even that, userspace provides both the base
> > > > > > > > address and the range,
> > > > > > > > the kernel really just keeps track of things.
> > > > > > > > Though, writing a UAPI on
> > > > > > > > top of the GPUVA manager asking for a free range instead would be
> > > > > > > > possible by just adding the corresponding wrapper functions to get a
> > > > > > > > free hole.
> > > > > > > >
> > > > > > > > Currently, and that's what I think I read out of
> > > > > > > > your question, the main
> > > > > > > > benefit of using drm_mm over simply stuffing the
> > > > > > > > entries into a list or
> > > > > > > > something boils down to easier collision detection and iterating
> > > > > > > > sub-ranges of the whole VA space.
> > > > > > > Why not just do this in userspace? We have a range manager in
> > > > > > > libdrm_amdgpu that you could lift out into libdrm or some other
> > > > > > > helper.
> > > > > > The kernel still needs to keep track of the mappings within the various
> > > > > > VA spaces, e.g. it silently needs to unmap mappings that are backed by
> > > > > > BOs that get evicted and remap them once they're validated (or swapped
> > > > > > back in).
> > > > > Ok, you are just using this for maintaining the GPU VM space in
> > > > > the kernel.
> > > > >
> > > > Yes the idea behind having common code wrapping drm_mm for this is to
> > > > allow us to make the rules consistent across drivers.
> > > >
> > > > Userspace (generally Vulkan, some compute) has interfaces that pretty
> > > > much dictate a lot of how VMA tracking works, esp around lifetimes,
> > > > sparse mappings and splitting/merging underlying page tables, I'd
> > > > really like this to be more consistent across drivers, because already
> > > > I think we've seen with freedreno some divergence from amdgpu and we
> > > > also have i915/xe to deal with. I'd like to at least have one place
> > > > that we can say this is how it should work, since this is something
> > > > that *should* be consistent across drivers mostly, as it is more about
> > > > how the uapi is exposed.
> > >
> > > That's a really good idea, but the implementation with drm_mm won't work
> > > like that.
> > >
> > > We have Vulkan applications which use the sparse feature to create
> > > literally millions of mappings. That's why I have fine tuned the mapping
>
> Is this not an application issue? Millions of mappings seems a bit
> absurd to me.
If I look at the most extreme case for AI, assuming 256GB of HBM
memory and a page mapping size of 2MB, we get to 128K mappings. But that's
really the extreme case imo. I assume most mappings will be much
larger. In fact, in the most realistic scenario of large-scale
training, a single user will probably map the entire HBM memory using
1GB pages.

I also have a question: could this GPUVA code manage VA range
mappings for userptr mappings, assuming we work without svm/uva/usm
(pointer-is-a-pointer)? Because then we are talking about possibly
4KB mappings of 1 - 1.5 TB of host server RAM, i.e. hundreds of millions
of entries. (Implied in my question is the assumption that this can also
be used for non-VK use-cases. Please tell me if I'm totally wrong here.)

Thanks,
Oded

2023-03-16 16:41:32

by Danilo Krummrich

[permalink] [raw]
Subject: Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

Hi Oded,

sorry for the late response, somehow this mail slipped through.

On 2/6/23 15:48, Oded Gabbay wrote:
> On Thu, Jan 19, 2023 at 7:24 AM Matthew Brost <[email protected]> wrote:
>> Is this not an application issue? Millions of mappings seems a bit
>> absurd to me.
> If I look at the most extreme case for AI, assuming 256GB of HBM
> memory and page mapping of 2MB, we get to 128K of mappings. But that's
> really the extreme case imo. I assume most mappings will be much
> larger. In fact, in the most realistic scenario of large-scale
> training, a single user will probably map the entire HBM memory using
> 1GB pages.
>
> I have also a question, could this GPUVA code manage VA ranges
> mappings for userptr mappings, assuming we work without svm/uva/usm
> (pointer-is-a-pointer) ? Because then we are talking about possible
> 4KB mappings of 1 - 1.5 TB host server RAM (Implied in my question is
> the assumption this can be used also for non-VK use-cases. Please tell
> me if I'm totally wrong here).

In V2 I switched from drm_mm to maple tree, which should improve
handling of lots of entries. I also dropped the requirement for GPUVA
entries to be backed by a valid GEM object.

I think it can be used for non-VK use-cases. It basically just keeps
track of mappings (not allocating them in the sense of finding a hole
and providing a base address for a given size). There are basic
functions to insert and remove entries. For those basic functions it is
ensured that colliding entries can't be inserted and only a specific
given entry can be removed, rather than e.g. an arbitrary range.
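
Conceptually, the collision-checked insert is the kind of thing the
maple tree gives you more or less directly. A minimal sketch using the
raw maple tree API (sketch_insert()/sketch_remove() are made-up names,
these are not the drm_gpuva_manager functions and not necessarily how V2
implements it internally):

#include <linux/gfp.h>
#include <linux/maple_tree.h>
#include <linux/types.h>

static DEFINE_MTREE(va_mappings);       /* entries keyed by GPU VA */

/* Track a new mapping; fails with -EEXIST if the range collides with an
 * existing entry. */
static int sketch_insert(u64 addr, u64 range, void *mapping)
{
        return mtree_insert_range(&va_mappings, addr, addr + range - 1,
                                  mapping, GFP_KERNEL);
}

/* Remove (and return) the entry covering this address. */
static void *sketch_remove(u64 addr)
{
        return mtree_erase(&va_mappings, addr);
}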

There are also more advanced functions where users of the GPUVA manager
can request to "force map" a new mapping and to unmap a given range. The
GPUVA manager will figure out the (sub-)operations to make this happen
(e.g. remove mappings in the way, split up mappings, etc.) and either
provide these operations (or steps) through callbacks or through a list
of operations to the caller to process them.
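
As a rough illustration of the callback flavour (hypothetical names
only, the actual functions and ops structure in the series may differ):

#include <linux/types.h>

struct drm_gem_object;
struct sketch_va_mgr;   /* hypothetical, stands in for the GPUVA manager */

/* Sketch only: the manager decomposes a "force map" request into
 * map/remap/unmap steps and hands them to the driver via callbacks. */
struct sketch_va_ops {
        int (*step_map)(void *priv, u64 addr, u64 range,
                        struct drm_gem_object *obj, u64 offset);
        int (*step_remap)(void *priv, u64 new_addr, u64 new_range,
                          u64 new_offset);
        int (*step_unmap)(void *priv, u64 addr, u64 range);
};

/*
 * Map [addr, addr + range) backed by obj at offset: mappings that are
 * completely shadowed are reported through ->step_unmap(), mappings
 * that only partially overlap (and hence need splitting) through
 * ->step_remap(), and the new mapping itself through ->step_map().
 */
int sketch_va_force_map(struct sketch_va_mgr *mgr,
                        const struct sketch_va_ops *ops, void *priv,
                        u64 addr, u64 range,
                        struct drm_gem_object *obj, u64 offset);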

Are there any other use-cases or features you could think of that would
be beneficial for accelerators?

- Danilo

>
> Thanks,
> Oded
>