2020-03-04 00:14:25

by Ralph Campbell

[permalink] [raw]
Subject: [PATCH v3 0/4] nouveau/hmm: map pages after migration

Originally patch 4 was targeted for Jason's rdma tree since other HMM
related changes were queued there. Now that those have been merged,
these patches just contain changes to nouveau so they could go through
any tree. I guess Ben Skeggs' tree would be appropriate.

Changes since v2:
Added patches 1-3 to fix some minor issues.
Eliminated nouveau_find_svmm() since it is easily found.
Applied Jason Gunthorpe's suggestions for nouveau_pfns_to_args().

Changes since v1:
Rebase to linux-5.6.0-rc4
Address Christoph Hellwig's comments


Ralph Campbell (4):
nouveau/hmm: fix vma range check for migration
nouveau/hmm: check for SVM initialized before migrating
nouveau: remove useless SVM range check
nouveau/hmm: map pages after migration

drivers/gpu/drm/nouveau/nouveau_dmem.c | 46 +++++++++++------
drivers/gpu/drm/nouveau/nouveau_dmem.h | 2 +
drivers/gpu/drm/nouveau/nouveau_svm.c | 69 ++++++++++++++++++++++++--
drivers/gpu/drm/nouveau/nouveau_svm.h | 5 ++
4 files changed, 102 insertions(+), 20 deletions(-)

--
2.20.1


2020-03-04 00:14:33

by Ralph Campbell

[permalink] [raw]
Subject: [PATCH v3 2/4] nouveau/hmm: check for SVM initialized before migrating

When migrating system memory to GPU memory, check that SVM has been
enabled. Even though most errors can be ignored since migration is
a performance optimization, return an error because this is a violation
of the API.

Signed-off-by: Ralph Campbell <[email protected]>
---
drivers/gpu/drm/nouveau/nouveau_svm.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 169320409286..c567526b75b8 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -171,6 +171,11 @@ nouveau_svmm_bind(struct drm_device *dev, void *data,
mm = get_task_mm(current);
down_read(&mm->mmap_sem);

+ if (!cli->svm.svmm) {
+ up_read(&mm->mmap_sem);
+ return -EINVAL;
+ }
+
for (addr = args->va_start, end = args->va_start + size; addr < end;) {
struct vm_area_struct *vma;
unsigned long next;
--
2.20.1

2020-03-04 00:14:47

by Ralph Campbell

[permalink] [raw]
Subject: [PATCH v3 3/4] nouveau: remove useless SVM range check

When nouveau processes GPU faults, it checks to see if the fault address
falls within the "unmanaged" range which is reserved for fixed allocations
instead of addresses chosen by the core mm code. If start is greater than
or equal to svmm->unmanaged.limit, then limit will also be greater than
svmm->unmanaged.limit which is greater than svmm->unmanaged.start and the
start = max_t(u64, start, svmm->unmanaged.limit) will change nothing.
Just remove the useless lines of code.

Signed-off-by: Ralph Campbell <[email protected]>
---
drivers/gpu/drm/nouveau/nouveau_svm.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index c567526b75b8..8dfa5cb74826 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -663,9 +663,6 @@ nouveau_svm_fault(struct nvif_notify *notify)
limit = start + (ARRAY_SIZE(args.phys) << PAGE_SHIFT);
if (start < svmm->unmanaged.limit)
limit = min_t(u64, limit, svmm->unmanaged.start);
- else
- if (limit > svmm->unmanaged.start)
- start = max_t(u64, start, svmm->unmanaged.limit);
SVMM_DBG(svmm, "wndw %016llx-%016llx", start, limit);

mm = svmm->notifier.mm;
--
2.20.1

2020-03-04 00:15:00

by Ralph Campbell

[permalink] [raw]
Subject: [PATCH v3 4/4] nouveau/hmm: map pages after migration

When memory is migrated to the GPU, it is likely to be accessed by GPU
code soon afterwards. Instead of waiting for a GPU fault, map the
migrated memory into the GPU page tables with the same access permissions
as the source CPU page table entries. This preserves copy on write
semantics.

Signed-off-by: Ralph Campbell <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Ben Skeggs <[email protected]>
---
drivers/gpu/drm/nouveau/nouveau_dmem.c | 46 +++++++++++++-------
drivers/gpu/drm/nouveau/nouveau_dmem.h | 2 +
drivers/gpu/drm/nouveau/nouveau_svm.c | 59 +++++++++++++++++++++++++-
drivers/gpu/drm/nouveau/nouveau_svm.h | 5 +++
4 files changed, 95 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 0ad5d87b5a8e..981c05a2a6ca 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -25,11 +25,13 @@
#include "nouveau_dma.h"
#include "nouveau_mem.h"
#include "nouveau_bo.h"
+#include "nouveau_svm.h"

#include <nvif/class.h>
#include <nvif/object.h>
#include <nvif/if500b.h>
#include <nvif/if900b.h>
+#include <nvif/if000c.h>

#include <linux/sched/mm.h>
#include <linux/hmm.h>
@@ -558,10 +560,11 @@ nouveau_dmem_init(struct nouveau_drm *drm)
}

static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
- unsigned long src, dma_addr_t *dma_addr)
+ unsigned long src, dma_addr_t *dma_addr, u64 *pfn)
{
struct device *dev = drm->dev->dev;
struct page *dpage, *spage;
+ unsigned long paddr;

spage = migrate_pfn_to_page(src);
if (!spage || !(src & MIGRATE_PFN_MIGRATE))
@@ -569,17 +572,21 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,

dpage = nouveau_dmem_page_alloc_locked(drm);
if (!dpage)
- return 0;
+ goto out;

*dma_addr = dma_map_page(dev, spage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
if (dma_mapping_error(dev, *dma_addr))
goto out_free_page;

+ paddr = nouveau_dmem_page_addr(dpage);
if (drm->dmem->migrate.copy_func(drm, 1, NOUVEAU_APER_VRAM,
- nouveau_dmem_page_addr(dpage), NOUVEAU_APER_HOST,
- *dma_addr))
+ paddr, NOUVEAU_APER_HOST, *dma_addr))
goto out_dma_unmap;

+ *pfn = NVIF_VMM_PFNMAP_V0_V | NVIF_VMM_PFNMAP_V0_VRAM |
+ ((paddr >> PAGE_SHIFT) << NVIF_VMM_PFNMAP_V0_ADDR_SHIFT);
+ if (src & MIGRATE_PFN_WRITE)
+ *pfn |= NVIF_VMM_PFNMAP_V0_W;
return migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;

out_dma_unmap:
@@ -587,18 +594,20 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
out_free_page:
nouveau_dmem_page_free_locked(drm, dpage);
out:
+ *pfn = NVIF_VMM_PFNMAP_V0_NONE;
return 0;
}

static void nouveau_dmem_migrate_chunk(struct nouveau_drm *drm,
- struct migrate_vma *args, dma_addr_t *dma_addrs)
+ struct nouveau_svmm *svmm, struct migrate_vma *args,
+ dma_addr_t *dma_addrs, u64 *pfns)
{
struct nouveau_fence *fence;
unsigned long addr = args->start, nr_dma = 0, i;

for (i = 0; addr < args->end; i++) {
args->dst[i] = nouveau_dmem_migrate_copy_one(drm, args->src[i],
- dma_addrs + nr_dma);
+ dma_addrs + nr_dma, pfns + i);
if (args->dst[i])
nr_dma++;
addr += PAGE_SIZE;
@@ -607,20 +616,18 @@ static void nouveau_dmem_migrate_chunk(struct nouveau_drm *drm,
nouveau_fence_new(drm->dmem->migrate.chan, false, &fence);
migrate_vma_pages(args);
nouveau_dmem_fence_done(&fence);
+ nouveau_pfns_map(svmm, args->vma->vm_mm, args->start, pfns, i);

while (nr_dma--) {
dma_unmap_page(drm->dev->dev, dma_addrs[nr_dma], PAGE_SIZE,
DMA_BIDIRECTIONAL);
}
- /*
- * FIXME optimization: update GPU page table to point to newly migrated
- * memory.
- */
migrate_vma_finalize(args);
}

int
nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
+ struct nouveau_svmm *svmm,
struct vm_area_struct *vma,
unsigned long start,
unsigned long end)
@@ -632,7 +639,8 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
.vma = vma,
.start = start,
};
- unsigned long c, i;
+ unsigned long i;
+ u64 *pfns;
int ret = -ENOMEM;

args.src = kcalloc(max, sizeof(*args.src), GFP_KERNEL);
@@ -646,19 +654,25 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
if (!dma_addrs)
goto out_free_dst;

- for (i = 0; i < npages; i += c) {
- c = min(SG_MAX_SINGLE_ALLOC, npages);
- args.end = start + (c << PAGE_SHIFT);
+ pfns = nouveau_pfns_alloc(max);
+ if (!pfns)
+ goto out_free_dma;
+
+ for (i = 0; i < npages; i += max) {
+ args.end = start + (max << PAGE_SHIFT);
ret = migrate_vma_setup(&args);
if (ret)
- goto out_free_dma;
+ goto out_free_pfns;

if (args.cpages)
- nouveau_dmem_migrate_chunk(drm, &args, dma_addrs);
+ nouveau_dmem_migrate_chunk(drm, svmm, &args, dma_addrs,
+ pfns);
args.start = args.end;
}

ret = 0;
+out_free_pfns:
+ nouveau_pfns_free(pfns);
out_free_dma:
kfree(dma_addrs);
out_free_dst:
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.h b/drivers/gpu/drm/nouveau/nouveau_dmem.h
index 92394be5d649..3e03d9629a38 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.h
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.h
@@ -25,6 +25,7 @@
struct drm_device;
struct drm_file;
struct nouveau_drm;
+struct nouveau_svmm;
struct hmm_range;

#if IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM)
@@ -34,6 +35,7 @@ void nouveau_dmem_suspend(struct nouveau_drm *);
void nouveau_dmem_resume(struct nouveau_drm *);

int nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
+ struct nouveau_svmm *svmm,
struct vm_area_struct *vma,
unsigned long start,
unsigned long end);
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 8dfa5cb74826..d33ae94c28ba 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -70,6 +70,12 @@ struct nouveau_svm {
#define SVM_DBG(s,f,a...) NV_DEBUG((s)->drm, "svm: "f"\n", ##a)
#define SVM_ERR(s,f,a...) NV_WARN((s)->drm, "svm: "f"\n", ##a)

+struct nouveau_pfnmap_args {
+ struct nvif_ioctl_v0 i;
+ struct nvif_ioctl_mthd_v0 m;
+ struct nvif_vmm_pfnmap_v0 p;
+};
+
struct nouveau_ivmm {
struct nouveau_svmm *svmm;
u64 inst;
@@ -187,7 +193,8 @@ nouveau_svmm_bind(struct drm_device *dev, void *data,
addr = max(addr, vma->vm_start);
next = min(vma->vm_end, end);
/* This is a best effort so we ignore errors */
- nouveau_dmem_migrate_vma(cli->drm, vma, addr, next);
+ nouveau_dmem_migrate_vma(cli->drm, cli->svm.svmm, vma, addr,
+ next);
addr = next;
}

@@ -785,6 +792,56 @@ nouveau_svm_fault(struct nvif_notify *notify)
return NVIF_NOTIFY_KEEP;
}

+static struct nouveau_pfnmap_args *
+nouveau_pfns_to_args(void *pfns)
+{
+ return container_of(pfns, struct nouveau_pfnmap_args, p.phys);
+}
+
+u64 *
+nouveau_pfns_alloc(unsigned long npages)
+{
+ struct nouveau_pfnmap_args *args;
+
+ args = kzalloc(struct_size(args, p.phys, npages), GFP_KERNEL);
+ if (!args)
+ return NULL;
+
+ args->i.type = NVIF_IOCTL_V0_MTHD;
+ args->m.method = NVIF_VMM_V0_PFNMAP;
+ args->p.page = PAGE_SHIFT;
+
+ return args->p.phys;
+}
+
+void
+nouveau_pfns_free(u64 *pfns)
+{
+ struct nouveau_pfnmap_args *args = nouveau_pfns_to_args(pfns);
+
+ kfree(args);
+}
+
+void
+nouveau_pfns_map(struct nouveau_svmm *svmm, struct mm_struct *mm,
+ unsigned long addr, u64 *pfns, unsigned long npages)
+{
+ struct nouveau_pfnmap_args *args = nouveau_pfns_to_args(pfns);
+ int ret;
+
+ args->p.addr = addr;
+ args->p.size = npages << PAGE_SHIFT;
+
+ mutex_lock(&svmm->mutex);
+
+ svmm->vmm->vmm.object.client->super = true;
+ ret = nvif_object_ioctl(&svmm->vmm->vmm.object, args, sizeof(*args) +
+ npages * sizeof(args->p.phys[0]), NULL);
+ svmm->vmm->vmm.object.client->super = false;
+
+ mutex_unlock(&svmm->mutex);
+}
+
static void
nouveau_svm_fault_buffer_fini(struct nouveau_svm *svm, int id)
{
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.h b/drivers/gpu/drm/nouveau/nouveau_svm.h
index e839d8189461..f0fcd1b72e8b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.h
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.h
@@ -18,6 +18,11 @@ void nouveau_svmm_fini(struct nouveau_svmm **);
int nouveau_svmm_join(struct nouveau_svmm *, u64 inst);
void nouveau_svmm_part(struct nouveau_svmm *, u64 inst);
int nouveau_svmm_bind(struct drm_device *, void *, struct drm_file *);
+
+u64 *nouveau_pfns_alloc(unsigned long npages);
+void nouveau_pfns_free(u64 *pfns);
+void nouveau_pfns_map(struct nouveau_svmm *svmm, struct mm_struct *mm,
+ unsigned long addr, u64 *pfns, unsigned long npages);
#else /* IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM) */
static inline void nouveau_svm_init(struct nouveau_drm *drm) {}
static inline void nouveau_svm_fini(struct nouveau_drm *drm) {}
--
2.20.1

2020-03-04 00:15:17

by Ralph Campbell

[permalink] [raw]
Subject: [PATCH v3 1/4] nouveau/hmm: fix vma range check for migration

find_vma_intersection(mm, start, end) only guarantees that end is greater
than or equal to vma->vm_start but doesn't guarantee that start is
greater than or equal to vma->vm_start. The calculation for the
intersecting range in nouveau_svmm_bind() isn't accounting for this and
can call migrate_vma_setup() with a starting address less than
vma->vm_start. This results in migrate_vma_setup() returning -EINVAL for
the range instead of nouveau skipping that part of the range and migrating
the rest.

Signed-off-by: Ralph Campbell <[email protected]>
---
drivers/gpu/drm/nouveau/nouveau_svm.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index df9bf1fd1bc0..169320409286 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -179,6 +179,7 @@ nouveau_svmm_bind(struct drm_device *dev, void *data,
if (!vma)
break;

+ addr = max(addr, vma->vm_start);
next = min(vma->vm_end, end);
/* This is a best effort so we ignore errors */
nouveau_dmem_migrate_vma(cli->drm, vma, addr, next);
--
2.20.1

2020-03-12 07:56:30

by Ben Skeggs

[permalink] [raw]
Subject: Re: [Nouveau] [PATCH v3 1/4] nouveau/hmm: fix vma range check for migration

I've taken all 4 patches in my tree.

Thanks Ralph,
Ben.

On Wed, 4 Mar 2020 at 10:14, Ralph Campbell <[email protected]> wrote:
>
> find_vma_intersection(mm, start, end) only guarantees that end is greater
> than or equal to vma->vm_start but doesn't guarantee that start is
> greater than or equal to vma->vm_start. The calculation for the
> intersecting range in nouveau_svmm_bind() isn't accounting for this and
> can call migrate_vma_setup() with a starting address less than
> vma->vm_start. This results in migrate_vma_setup() returning -EINVAL for
> the range instead of nouveau skipping that part of the range and migrating
> the rest.
>
> Signed-off-by: Ralph Campbell <[email protected]>
> ---
> drivers/gpu/drm/nouveau/nouveau_svm.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
> index df9bf1fd1bc0..169320409286 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_svm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
> @@ -179,6 +179,7 @@ nouveau_svmm_bind(struct drm_device *dev, void *data,
> if (!vma)
> break;
>
> + addr = max(addr, vma->vm_start);
> next = min(vma->vm_end, end);
> /* This is a best effort so we ignore errors */
> nouveau_dmem_migrate_vma(cli->drm, vma, addr, next);
> --
> 2.20.1
>
> _______________________________________________
> Nouveau mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/nouveau