2020-07-01 22:56:27

by Ralph Campbell

Subject: [PATCH v3 0/5] mm/hmm/nouveau: add PMD system memory mapping

The goal for this series is to introduce the hmm_pfn_to_map_order()
function. This allows a device driver to know that a given 4K PFN is
actually mapped by the CPU using a larger sized CPU page table entry and
therefore the device driver can safely map system memory using larger
device MMU PTEs.
The series is based on 5.8.0-rc3 and is intended for Jason Gunthorpe's
hmm tree. These were originally part of a larger series:
https://lore.kernel.org/linux-mm/[email protected]/
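
As a rough illustration (not part of this series), a driver that has just
called hmm_range_fault() could derive a safe device mapping size from the
returned PFN array roughly as follows; dummy_device_page_shift() is a
hypothetical helper name, not something added by these patches:

#include <linux/hmm.h>

/* Hypothetical sketch: pick a device page shift for PFN array entry i. */
static u8 dummy_device_page_shift(struct hmm_range *range, unsigned long i)
{
	unsigned int order = hmm_pfn_to_map_order(range->hmm_pfns[i]);

	/*
	 * order is the log2 number of base pages covered by the CPU
	 * mapping of this PFN, so PAGE_SHIFT + order is the largest
	 * device PTE size known to be safe for it (subject to the
	 * device's own alignment and page size constraints).
	 */
	return PAGE_SHIFT + order;
}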

Changes in v3:
Replaced the HMM_PFN_P[MU]D flags with hmm_pfn_to_map_order() to
indicate the size of the CPU mapping.

Changes in v2:
Make the hmm_range_fault() API changes into a separate series and add
two output flags for PMD/PUD instead of a single compound page flag as
suggested by Jason Gunthorpe.
Make the nouveau page table changes a separate patch as suggested by
Ben Skeggs.
Only add support for 2MB nouveau mappings initially since changing the
1:1 CPU/GPU page table size assumptions requires a bigger set of changes.
Rebase to 5.8.0-rc3.

Ralph Campbell (5):
nouveau/hmm: fault one page at a time
mm/hmm: add hmm_mapping order
nouveau: fix mapping 2MB sysmem pages
nouveau/hmm: support mapping large sysmem pages
hmm: add tests for hmm_pfn_to_map_order()

drivers/gpu/drm/nouveau/nouveau_svm.c | 236 ++++++++----------
drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 5 +-
.../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 82 ++++++
include/linux/hmm.h | 24 +-
lib/test_hmm.c | 4 +
lib/test_hmm_uapi.h | 4 +
mm/hmm.c | 14 +-
tools/testing/selftests/vm/hmm-tests.c | 76 ++++++
8 files changed, 299 insertions(+), 146 deletions(-)

--
2.20.1


2020-07-01 22:56:53

by Ralph Campbell

Subject: [PATCH v3 5/5] hmm: add tests for hmm_pfn_to_map_order()

Add a sanity test for hmm_range_fault() returning the page mapping size
order.
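
For reference, the test's kernel side (dmirror_mkentry() below) classifies an
entry as PMD or PUD sized by adding the returned map order to PAGE_SHIFT and
comparing against PMD_SHIFT/PUD_SHIFT. A small userspace sketch of that
arithmetic, using assumed x86-64 shift values, is:

#include <stdio.h>

int main(void)
{
	const unsigned int page_shift = 12;	/* assumed 4KiB base pages */
	const unsigned int pmd_shift = 21;	/* assumed x86-64 PMD_SHIFT */
	const unsigned int pud_shift = 30;	/* assumed x86-64 PUD_SHIFT */
	unsigned int order;

	/* Map order 9 -> 2MiB (PMD) mapping, order 18 -> 1GiB (PUD). */
	for (order = 0; order <= 18; order++) {
		if (order + page_shift == pmd_shift)
			printf("order %u -> PMD sized mapping\n", order);
		else if (order + page_shift == pud_shift)
			printf("order %u -> PUD sized mapping\n", order);
	}
	return 0;
}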

Signed-off-by: Ralph Campbell <[email protected]>
---
lib/test_hmm.c | 4 ++
lib/test_hmm_uapi.h | 4 ++
tools/testing/selftests/vm/hmm-tests.c | 76 ++++++++++++++++++++++++++
3 files changed, 84 insertions(+)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index a2a82262b97b..9aa577afc269 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -766,6 +766,10 @@ static void dmirror_mkentry(struct dmirror *dmirror, struct hmm_range *range,
*perm |= HMM_DMIRROR_PROT_WRITE;
else
*perm |= HMM_DMIRROR_PROT_READ;
+ if (hmm_pfn_to_map_order(entry) + PAGE_SHIFT == PMD_SHIFT)
+ *perm |= HMM_DMIRROR_PROT_PMD;
+ else if (hmm_pfn_to_map_order(entry) + PAGE_SHIFT == PUD_SHIFT)
+ *perm |= HMM_DMIRROR_PROT_PUD;
}

static bool dmirror_snapshot_invalidate(struct mmu_interval_notifier *mni,
diff --git a/lib/test_hmm_uapi.h b/lib/test_hmm_uapi.h
index 67b3b2e6ff5d..670b4ef2a5b6 100644
--- a/lib/test_hmm_uapi.h
+++ b/lib/test_hmm_uapi.h
@@ -40,6 +40,8 @@ struct hmm_dmirror_cmd {
* HMM_DMIRROR_PROT_NONE: unpopulated PTE or PTE with no access
* HMM_DMIRROR_PROT_READ: read-only PTE
* HMM_DMIRROR_PROT_WRITE: read/write PTE
+ * HMM_DMIRROR_PROT_PMD: PMD sized page is fully mapped by same permissions
+ * HMM_DMIRROR_PROT_PUD: PUD sized page is fully mapped by same permissions
* HMM_DMIRROR_PROT_ZERO: special read-only zero page
* HMM_DMIRROR_PROT_DEV_PRIVATE_LOCAL: Migrated device private page on the
* device the ioctl() is made
@@ -51,6 +53,8 @@ enum {
HMM_DMIRROR_PROT_NONE = 0x00,
HMM_DMIRROR_PROT_READ = 0x01,
HMM_DMIRROR_PROT_WRITE = 0x02,
+ HMM_DMIRROR_PROT_PMD = 0x04,
+ HMM_DMIRROR_PROT_PUD = 0x08,
HMM_DMIRROR_PROT_ZERO = 0x10,
HMM_DMIRROR_PROT_DEV_PRIVATE_LOCAL = 0x20,
HMM_DMIRROR_PROT_DEV_PRIVATE_REMOTE = 0x30,
diff --git a/tools/testing/selftests/vm/hmm-tests.c b/tools/testing/selftests/vm/hmm-tests.c
index 79db22604019..b533dd08da1d 100644
--- a/tools/testing/selftests/vm/hmm-tests.c
+++ b/tools/testing/selftests/vm/hmm-tests.c
@@ -1291,6 +1291,82 @@ TEST_F(hmm2, snapshot)
hmm_buffer_free(buffer);
}

+/*
+ * Test that hmm_range_fault() reports a PMD sized map order for huge
+ * pages that are mapped by a large CPU page table entry.
+ */
+TEST_F(hmm, compound)
+{
+ struct hmm_buffer *buffer;
+ unsigned long npages;
+ unsigned long size;
+ int *ptr;
+ unsigned char *m;
+ int ret;
+ long pagesizes[4];
+ int n, idx;
+ unsigned long i;
+
+ /* Skip test if we can't allocate a hugetlbfs page. */
+
+ n = gethugepagesizes(pagesizes, 4);
+ if (n <= 0)
+ return;
+ for (idx = 0; --n > 0; ) {
+ if (pagesizes[n] < pagesizes[idx])
+ idx = n;
+ }
+ size = ALIGN(TWOMEG, pagesizes[idx]);
+ npages = size >> self->page_shift;
+
+ buffer = malloc(sizeof(*buffer));
+ ASSERT_NE(buffer, NULL);
+
+ buffer->ptr = get_hugepage_region(size, GHR_STRICT);
+ if (buffer->ptr == NULL) {
+ free(buffer);
+ return;
+ }
+
+ buffer->size = size;
+ buffer->mirror = malloc(npages);
+ ASSERT_NE(buffer->mirror, NULL);
+
+ /* Initialize the pages the device will snapshot in buffer->ptr. */
+ for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
+ ptr[i] = i;
+
+ /* Simulate a device snapshotting CPU pagetables. */
+ ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_SNAPSHOT, buffer, npages);
+ ASSERT_EQ(ret, 0);
+ ASSERT_EQ(buffer->cpages, npages);
+
+ /* Check what the device saw. */
+ m = buffer->mirror;
+ for (i = 0; i < npages; ++i)
+ ASSERT_EQ(m[i], HMM_DMIRROR_PROT_WRITE |
+ HMM_DMIRROR_PROT_PMD);
+
+ /* Make the region read-only. */
+ ret = mprotect(buffer->ptr, size, PROT_READ);
+ ASSERT_EQ(ret, 0);
+
+ /* Simulate a device snapshotting CPU pagetables. */
+ ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_SNAPSHOT, buffer, npages);
+ ASSERT_EQ(ret, 0);
+ ASSERT_EQ(buffer->cpages, npages);
+
+ /* Check what the device saw. */
+ m = buffer->mirror;
+ for (i = 0; i < npages; ++i)
+ ASSERT_EQ(m[i], HMM_DMIRROR_PROT_READ |
+ HMM_DMIRROR_PROT_PMD);
+
+ free_hugepage_region(buffer->ptr);
+ buffer->ptr = NULL;
+ hmm_buffer_free(buffer);
+}
+
/*
* Test two devices reading the same memory (double mapped).
*/
--
2.20.1

2020-07-01 22:57:51

by Ralph Campbell

Subject: [PATCH v3 3/5] nouveau: fix mapping 2MB sysmem pages

The nvif_object_ioctl() method NVIF_VMM_V0_PFNMAP wasn't correctly
setting the hardware specific GPU page table entries for 2MB sized
pages. Fix this by adding functions to set and clear PD0 GPU page
table entries.
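
For orientation, the system-memory entry written by the new PD0 map function
below can be summarized with this illustrative helper; the field comments
mirror the bit assignments visible in gp100_vmm_pd0_pfn() and are not an
authoritative hardware reference:

#include <stdint.h>
#include <stdbool.h>

/* Illustrative only: compose a 2MiB system-memory PD0 entry the same
 * way gp100_vmm_pd0_pfn() does for the non-VRAM (sysmem) case.
 */
static uint64_t pd0_sysmem_pte(uint64_t dma_addr, bool writable)
{
	uint64_t data = 0;

	if (!writable)
		data |= 1ULL << 6;	/* RO */
	data |= dma_addr >> 4;		/* DMA address field */
	data |= 2ULL << 1;		/* SYSTEM_COHERENT_MEMORY aperture */
	data |= 1ULL << 3;		/* VOL */
	data |= 1ULL << 0;		/* VALID */
	return data;
}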

Signed-off-by: Ralph Campbell <[email protected]>
---
drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 5 +-
.../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 82 +++++++++++++++++++
2 files changed, 84 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
index 199f94e15c5f..19a6804e3989 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
@@ -1204,7 +1204,6 @@ nvkm_vmm_pfn_unmap(struct nvkm_vmm *vmm, u64 addr, u64 size)
/*TODO:
* - Avoid PT readback (for dma_unmap etc), this might end up being dealt
* with inside HMM, which would be a lot nicer for us to deal with.
- * - Multiple page sizes (particularly for huge page support).
* - Support for systems without a 4KiB page size.
*/
int
@@ -1220,8 +1219,8 @@ nvkm_vmm_pfn_map(struct nvkm_vmm *vmm, u8 shift, u64 addr, u64 size, u64 *pfn)
/* Only support mapping where the page size of the incoming page
* array matches a page size available for direct mapping.
*/
- while (page->shift && page->shift != shift &&
- page->desc->func->pfn == NULL)
+ while (page->shift && (page->shift != shift ||
+ page->desc->func->pfn == NULL))
page++;

if (!page->shift || !IS_ALIGNED(addr, 1ULL << shift) ||
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
index d86287565542..ed37fddd063f 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
@@ -258,12 +258,94 @@ gp100_vmm_pd0_unmap(struct nvkm_vmm *vmm,
VMM_FO128(pt, vmm, pdei * 0x10, 0ULL, 0ULL, pdes);
}

+static void
+gp100_vmm_pd0_pfn_unmap(struct nvkm_vmm *vmm,
+ struct nvkm_mmu_pt *pt, u32 ptei, u32 ptes)
+{
+ struct device *dev = vmm->mmu->subdev.device->dev;
+ dma_addr_t addr;
+
+ nvkm_kmap(pt->memory);
+ while (ptes--) {
+ u32 datalo = nvkm_ro32(pt->memory, pt->base + ptei * 16 + 0);
+ u32 datahi = nvkm_ro32(pt->memory, pt->base + ptei * 16 + 4);
+ u64 data = (u64)datahi << 32 | datalo;
+
+ if ((data & (3ULL << 1)) != 0) {
+ addr = (data >> 8) << 12;
+ dma_unmap_page(dev, addr, 1UL << 21, DMA_BIDIRECTIONAL);
+ }
+ ptei++;
+ }
+ nvkm_done(pt->memory);
+}
+
+static bool
+gp100_vmm_pd0_pfn_clear(struct nvkm_vmm *vmm,
+ struct nvkm_mmu_pt *pt, u32 ptei, u32 ptes)
+{
+ bool dma = false;
+
+ nvkm_kmap(pt->memory);
+ while (ptes--) {
+ u32 datalo = nvkm_ro32(pt->memory, pt->base + ptei * 16 + 0);
+ u32 datahi = nvkm_ro32(pt->memory, pt->base + ptei * 16 + 4);
+ u64 data = (u64)datahi << 32 | datalo;
+
+ if ((data & BIT_ULL(0)) && (data & (3ULL << 1)) != 0) {
+ VMM_WO064(pt, vmm, ptei * 16, data & ~BIT_ULL(0));
+ dma = true;
+ }
+ ptei++;
+ }
+ nvkm_done(pt->memory);
+ return dma;
+}
+
+static void
+gp100_vmm_pd0_pfn(struct nvkm_vmm *vmm, struct nvkm_mmu_pt *pt,
+ u32 ptei, u32 ptes, struct nvkm_vmm_map *map)
+{
+ struct device *dev = vmm->mmu->subdev.device->dev;
+ dma_addr_t addr;
+
+ nvkm_kmap(pt->memory);
+ while (ptes--) {
+ u64 data = 0;
+
+ if (!(*map->pfn & NVKM_VMM_PFN_W))
+ data |= BIT_ULL(6); /* RO. */
+
+ if (!(*map->pfn & NVKM_VMM_PFN_VRAM)) {
+ addr = *map->pfn >> NVKM_VMM_PFN_ADDR_SHIFT;
+ addr = dma_map_page(dev, pfn_to_page(addr), 0,
+ 1UL << 21, DMA_BIDIRECTIONAL);
+ if (!WARN_ON(dma_mapping_error(dev, addr))) {
+ data |= addr >> 4;
+ data |= 2ULL << 1; /* SYSTEM_COHERENT_MEMORY. */
+ data |= BIT_ULL(3); /* VOL. */
+ data |= BIT_ULL(0); /* VALID. */
+ }
+ } else {
+ data |= (*map->pfn & NVKM_VMM_PFN_ADDR) >> 4;
+ data |= BIT_ULL(0); /* VALID. */
+ }
+
+ VMM_WO064(pt, vmm, ptei++ * 16, data);
+ map->pfn++;
+ }
+ nvkm_done(pt->memory);
+}
+
static const struct nvkm_vmm_desc_func
gp100_vmm_desc_pd0 = {
.unmap = gp100_vmm_pd0_unmap,
.sparse = gp100_vmm_pd0_sparse,
.pde = gp100_vmm_pd0_pde,
.mem = gp100_vmm_pd0_mem,
+ .pfn = gp100_vmm_pd0_pfn,
+ .pfn_clear = gp100_vmm_pd0_pfn_clear,
+ .pfn_unmap = gp100_vmm_pd0_pfn_unmap,
};

static void
--
2.20.1

2020-07-08 03:23:34

by Ben Skeggs

Subject: Re: [Nouveau] [PATCH v3 3/5] nouveau: fix mapping 2MB sysmem pages

On Thu, 2 Jul 2020 at 08:54, Ralph Campbell <[email protected]> wrote:
>
> The nvif_object_ioctl() method NVIF_VMM_V0_PFNMAP wasn't correctly
> setting the hardware specific GPU page table entries for 2MB sized
> pages. Fix this by adding functions to set and clear PD0 GPU page
> table entries.
I can take this one in my tree now; it's fairly independent of the rest.

Ben.

>
> Signed-off-by: Ralph Campbell <[email protected]>
> ---
> drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 5 +-
> .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 82 +++++++++++++++++++
> 2 files changed, 84 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> index 199f94e15c5f..19a6804e3989 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> @@ -1204,7 +1204,6 @@ nvkm_vmm_pfn_unmap(struct nvkm_vmm *vmm, u64 addr, u64 size)
> /*TODO:
> * - Avoid PT readback (for dma_unmap etc), this might end up being dealt
> * with inside HMM, which would be a lot nicer for us to deal with.
> - * - Multiple page sizes (particularly for huge page support).
> * - Support for systems without a 4KiB page size.
> */
> int
> @@ -1220,8 +1219,8 @@ nvkm_vmm_pfn_map(struct nvkm_vmm *vmm, u8 shift, u64 addr, u64 size, u64 *pfn)
> /* Only support mapping where the page size of the incoming page
> * array matches a page size available for direct mapping.
> */
> - while (page->shift && page->shift != shift &&
> - page->desc->func->pfn == NULL)
> + while (page->shift && (page->shift != shift ||
> + page->desc->func->pfn == NULL))
> page++;
>
> if (!page->shift || !IS_ALIGNED(addr, 1ULL << shift) ||
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
> index d86287565542..ed37fddd063f 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
> @@ -258,12 +258,94 @@ gp100_vmm_pd0_unmap(struct nvkm_vmm *vmm,
> VMM_FO128(pt, vmm, pdei * 0x10, 0ULL, 0ULL, pdes);
> }
>
> +static void
> +gp100_vmm_pd0_pfn_unmap(struct nvkm_vmm *vmm,
> + struct nvkm_mmu_pt *pt, u32 ptei, u32 ptes)
> +{
> + struct device *dev = vmm->mmu->subdev.device->dev;
> + dma_addr_t addr;
> +
> + nvkm_kmap(pt->memory);
> + while (ptes--) {
> + u32 datalo = nvkm_ro32(pt->memory, pt->base + ptei * 16 + 0);
> + u32 datahi = nvkm_ro32(pt->memory, pt->base + ptei * 16 + 4);
> + u64 data = (u64)datahi << 32 | datalo;
> +
> + if ((data & (3ULL << 1)) != 0) {
> + addr = (data >> 8) << 12;
> + dma_unmap_page(dev, addr, 1UL << 21, DMA_BIDIRECTIONAL);
> + }
> + ptei++;
> + }
> + nvkm_done(pt->memory);
> +}
> +
> +static bool
> +gp100_vmm_pd0_pfn_clear(struct nvkm_vmm *vmm,
> + struct nvkm_mmu_pt *pt, u32 ptei, u32 ptes)
> +{
> + bool dma = false;
> +
> + nvkm_kmap(pt->memory);
> + while (ptes--) {
> + u32 datalo = nvkm_ro32(pt->memory, pt->base + ptei * 16 + 0);
> + u32 datahi = nvkm_ro32(pt->memory, pt->base + ptei * 16 + 4);
> + u64 data = (u64)datahi << 32 | datalo;
> +
> + if ((data & BIT_ULL(0)) && (data & (3ULL << 1)) != 0) {
> + VMM_WO064(pt, vmm, ptei * 16, data & ~BIT_ULL(0));
> + dma = true;
> + }
> + ptei++;
> + }
> + nvkm_done(pt->memory);
> + return dma;
> +}
> +
> +static void
> +gp100_vmm_pd0_pfn(struct nvkm_vmm *vmm, struct nvkm_mmu_pt *pt,
> + u32 ptei, u32 ptes, struct nvkm_vmm_map *map)
> +{
> + struct device *dev = vmm->mmu->subdev.device->dev;
> + dma_addr_t addr;
> +
> + nvkm_kmap(pt->memory);
> + while (ptes--) {
> + u64 data = 0;
> +
> + if (!(*map->pfn & NVKM_VMM_PFN_W))
> + data |= BIT_ULL(6); /* RO. */
> +
> + if (!(*map->pfn & NVKM_VMM_PFN_VRAM)) {
> + addr = *map->pfn >> NVKM_VMM_PFN_ADDR_SHIFT;
> + addr = dma_map_page(dev, pfn_to_page(addr), 0,
> + 1UL << 21, DMA_BIDIRECTIONAL);
> + if (!WARN_ON(dma_mapping_error(dev, addr))) {
> + data |= addr >> 4;
> + data |= 2ULL << 1; /* SYSTEM_COHERENT_MEMORY. */
> + data |= BIT_ULL(3); /* VOL. */
> + data |= BIT_ULL(0); /* VALID. */
> + }
> + } else {
> + data |= (*map->pfn & NVKM_VMM_PFN_ADDR) >> 4;
> + data |= BIT_ULL(0); /* VALID. */
> + }
> +
> + VMM_WO064(pt, vmm, ptei++ * 16, data);
> + map->pfn++;
> + }
> + nvkm_done(pt->memory);
> +}
> +
> static const struct nvkm_vmm_desc_func
> gp100_vmm_desc_pd0 = {
> .unmap = gp100_vmm_pd0_unmap,
> .sparse = gp100_vmm_pd0_sparse,
> .pde = gp100_vmm_pd0_pde,
> .mem = gp100_vmm_pd0_mem,
> + .pfn = gp100_vmm_pd0_pfn,
> + .pfn_clear = gp100_vmm_pd0_pfn_clear,
> + .pfn_unmap = gp100_vmm_pd0_pfn_unmap,
> };
>
> static void
> --
> 2.20.1
>

2020-07-10 19:30:10

by Jason Gunthorpe

Subject: Re: [PATCH v3 0/5] mm/hmm/nouveau: add PMD system memory mapping

On Wed, Jul 01, 2020 at 03:53:47PM -0700, Ralph Campbell wrote:
> The goal for this series is to introduce the hmm_pfn_to_map_order()
> function. This allows a device driver to know that a given 4K PFN is
> actually mapped by the CPU using a larger sized CPU page table entry and
> therefore the device driver can safely map system memory using larger
> device MMU PTEs.
> The series is based on 5.8.0-rc3 and is intended for Jason Gunthorpe's
> hmm tree. These were originally part of a larger series:
> https://lore.kernel.org/linux-mm/[email protected]/
>
> Changes in v3:
> Replaced the HMM_PFN_P[MU]D flags with hmm_pfn_to_map_order() to
> indicate the size of the CPU mapping.
>
> Changes in v2:
> Make the hmm_range_fault() API changes into a separate series and add
> two output flags for PMD/PUD instead of a single compound page flag as
> suggested by Jason Gunthorpe.
> Make the nouveau page table changes a separate patch as suggested by
> Ben Skeggs.
> Only add support for 2MB nouveau mappings initially since changing the
> 1:1 CPU/GPU page table size assumptions requires a bigger set of changes.
> Rebase to 5.8.0-rc3.
>
> Ralph Campbell (5):
> nouveau/hmm: fault one page at a time
> mm/hmm: add hmm_mapping order
> nouveau: fix mapping 2MB sysmem pages
> nouveau/hmm: support mapping large sysmem pages
> hmm: add tests for hmm_pfn_to_map_order()

Applied to hmm.git.

I edited the comment for hmm_pfn_to_map_order() and added a function
to compute the field.

Thanks,
Jason

2020-07-10 20:16:30

by Ralph Campbell

Subject: Re: [PATCH v3 0/5] mm/hmm/nouveau: add PMD system memory mapping


On 7/10/20 12:27 PM, Jason Gunthorpe wrote:
> On Wed, Jul 01, 2020 at 03:53:47PM -0700, Ralph Campbell wrote:
>> The goal for this series is to introduce the hmm_pfn_to_map_order()
>> function. This allows a device driver to know that a given 4K PFN is
>> actually mapped by the CPU using a larger sized CPU page table entry and
>> therefore the device driver can safely map system memory using larger
>> device MMU PTEs.
>> The series is based on 5.8.0-rc3 and is intended for Jason Gunthorpe's
>> hmm tree. These were originally part of a larger series:
>> https://lore.kernel.org/linux-mm/[email protected]/
>>
>> Changes in v3:
>> Replaced the HMM_PFN_P[MU]D flags with hmm_pfn_to_map_order() to
>> indicate the size of the CPU mapping.
>>
>> Changes in v2:
>> Make the hmm_range_fault() API changes into a separate series and add
>> two output flags for PMD/PUD instead of a single compound page flag as
>> suggested by Jason Gunthorpe.
>> Make the nouveau page table changes a separate patch as suggested by
>> Ben Skeggs.
>> Only add support for 2MB nouveau mappings initially since changing the
>> 1:1 CPU/GPU page table size assumptions requires a bigger set of changes.
>> Rebase to 5.8.0-rc3.
>>
>> Ralph Campbell (5):
>> nouveau/hmm: fault one page at a time
>> mm/hmm: add hmm_mapping order
>> nouveau: fix mapping 2MB sysmem pages
>> nouveau/hmm: support mapping large sysmem pages
>> hmm: add tests for hmm_pfn_to_map_order()
>
> Applied to hmm.git.
>
> I edited the comment for hmm_pfn_to_map_order() and added a function
> to compute the field.
>
> Thanks,
> Jason

Looks good, thanks.