2018-10-06 02:51:08

by John Hubbard

Subject: [PATCH v3 0/3] get_user_pages*() and RDMA: first steps

From: John Hubbard <[email protected]>

Changes since v2:

-- Absorbed more dirty page handling logic into the put_user_page*()
routines, and handled some page-releasing loops in infiniband more
thoroughly, as per Jason Gunthorpe's feedback.

-- Fixed a bug in the put_user_pages*() routines' loops (thanks to
Ralph Campbell for spotting it).

Changes since v1:

-- Renamed release_user_pages*() to put_user_pages*(), per Jan's feedback.

-- Removed the goldfish.c changes, and instead included only a single
user (infiniband) of the new functions. That is because goldfish.c no
longer has a name collision (its existing release_user_pages() routine
no longer clashes, now that the new functions are named
put_user_page*()), and also because infiniband exercises both the
put_user_page() and put_user_pages*() paths.

-- Updated links to discussions and plans, so as to be sure to include
bounce buffers, thanks to Jerome's feedback.

Also:

-- Dennis, thanks for your earlier review. I have not yet added your
Reviewed-by tag, because this revision changes the things that you had
previously reviewed, and so it may require another look.

This short series prepares for eventually fixing the problem described
in [1], and is following a plan listed in [2], [3], [4].

Patch 1, although not technically critical to do now, is still nice to
have, because it has already been reviewed by Jan, and it is one more
item on the long TODO list here that is ready to be checked off.

Patch 2 is required in order to allow me (and others, if I'm lucky) to
start submitting changes that convert all of the callsites of
get_user_pages*() and put_page(). I think this will work a lot better
than trying to maintain a massive patchset and submit it all at once.

Patch 3 converts infiniband drivers: put_page() --> put_user_page(), and
also exercises put_user_pages_dirty_lock().

Once these are all in, the floodgates can open for converting the large
number of get_user_pages*() callsites.
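
To make the conversion pattern concrete, here is a minimal sketch of a
typical callsite, before and after (the names, flags, and error handling
here are illustrative only; the real conversions are in patch 3):

	nr = get_user_pages(start, npages, FOLL_WRITE, pages, NULL);
	if (nr <= 0)
		return nr ? nr : -EFAULT;

	/* ... hardware operates on the pinned pages ... */

	/* before this series: */
	for (i = 0; i < nr; i++)
		put_page(pages[i]);

	/* after this series: */
	put_user_pages(pages, nr);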

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

[2] https://lkml.kernel.org/r/[email protected]
Proposed steps for fixing get_user_pages() + DMA problems.

[3] https://lkml.kernel.org/r/[email protected]
Bounce buffers (otherwise [2] is not really viable).

[4] https://lkml.kernel.org/r/[email protected]
Follow-up discussions.

CC: Matthew Wilcox <[email protected]>
CC: Michal Hocko <[email protected]>
CC: Christopher Lameter <[email protected]>
CC: Jason Gunthorpe <[email protected]>
CC: Dan Williams <[email protected]>
CC: Jan Kara <[email protected]>
CC: Al Viro <[email protected]>
CC: Jerome Glisse <[email protected]>
CC: Christoph Hellwig <[email protected]>
CC: Ralph Campbell <[email protected]>

John Hubbard (3):
mm: get_user_pages: consolidate error handling
mm: introduce put_user_page*(), placeholder versions
infiniband/mm: convert put_page() to put_user_page*()

drivers/infiniband/core/umem.c | 7 +--
drivers/infiniband/core/umem_odp.c | 2 +-
drivers/infiniband/hw/hfi1/user_pages.c | 11 ++---
drivers/infiniband/hw/mthca/mthca_memfree.c | 6 +--
drivers/infiniband/hw/qib/qib_user_pages.c | 11 ++---
drivers/infiniband/hw/qib/qib_user_sdma.c | 8 ++--
drivers/infiniband/hw/usnic/usnic_uiom.c | 7 +--
include/linux/mm.h | 48 ++++++++++++++++++++-
mm/gup.c | 37 +++++++++-------
9 files changed, 92 insertions(+), 45 deletions(-)

--
2.19.0



2018-10-06 02:50:34

by John Hubbard

Subject: [PATCH v3 1/3] mm: get_user_pages: consolidate error handling

From: John Hubbard <[email protected]>

An upcoming patch requires a way to operate on each page that
any of the get_user_pages*() variants returns.

In preparation for that, consolidate the error handling for
__get_user_pages(). This provides a single location (the "out:" label)
for operating on the collected set of pages that are about to be returned.

As long as every use of the "ret" variable is being edited anyway,
rename "ret" --> "err", so that its name matches its true role.
This also gets rid of two shadowed variable declarations, as a
tiny beneficial side effect.

Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
---
mm/gup.c | 37 ++++++++++++++++++++++---------------
1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 1abc8b4afff6..05ee7c18e59a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -660,6 +660,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
struct vm_area_struct **vmas, int *nonblocking)
{
long i = 0;
+ int err = 0;
unsigned int page_mask;
struct vm_area_struct *vma = NULL;

@@ -685,18 +686,19 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
if (!vma || start >= vma->vm_end) {
vma = find_extend_vma(mm, start);
if (!vma && in_gate_area(mm, start)) {
- int ret;
- ret = get_gate_page(mm, start & PAGE_MASK,
+ err = get_gate_page(mm, start & PAGE_MASK,
gup_flags, &vma,
pages ? &pages[i] : NULL);
- if (ret)
- return i ? : ret;
+ if (err)
+ goto out;
page_mask = 0;
goto next_page;
}

- if (!vma || check_vma_flags(vma, gup_flags))
- return i ? : -EFAULT;
+ if (!vma || check_vma_flags(vma, gup_flags)) {
+ err = -EFAULT;
+ goto out;
+ }
if (is_vm_hugetlb_page(vma)) {
i = follow_hugetlb_page(mm, vma, pages, vmas,
&start, &nr_pages, i,
@@ -709,23 +711,25 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
* If we have a pending SIGKILL, don't keep faulting pages and
* potentially allocating memory.
*/
- if (unlikely(fatal_signal_pending(current)))
- return i ? i : -ERESTARTSYS;
+ if (unlikely(fatal_signal_pending(current))) {
+ err = -ERESTARTSYS;
+ goto out;
+ }
cond_resched();
page = follow_page_mask(vma, start, foll_flags, &page_mask);
if (!page) {
- int ret;
- ret = faultin_page(tsk, vma, start, &foll_flags,
+ err = faultin_page(tsk, vma, start, &foll_flags,
nonblocking);
- switch (ret) {
+ switch (err) {
case 0:
goto retry;
case -EFAULT:
case -ENOMEM:
case -EHWPOISON:
- return i ? i : ret;
+ goto out;
case -EBUSY:
- return i;
+ err = 0;
+ goto out;
case -ENOENT:
goto next_page;
}
@@ -737,7 +741,8 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
*/
goto next_page;
} else if (IS_ERR(page)) {
- return i ? i : PTR_ERR(page);
+ err = PTR_ERR(page);
+ goto out;
}
if (pages) {
pages[i] = page;
@@ -757,7 +762,9 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
start += page_increm * PAGE_SIZE;
nr_pages -= page_increm;
} while (nr_pages);
- return i;
+
+out:
+ return i ? i : err;
}

static bool vma_permits_fault(struct vm_area_struct *vma,
--
2.19.0


2018-10-06 02:50:34

by John Hubbard

Subject: [PATCH v3 3/3] infiniband/mm: convert put_page() to put_user_page*()

From: John Hubbard <[email protected]>

For code that retains pages via get_user_pages*(),
release those pages via the new put_user_page(), or
put_user_pages*(), instead of put_page().

This prepares for eventually fixing the problem described
in [1], and is following a plan listed in [2], [3], [4].

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

[2] https://lkml.kernel.org/r/[email protected]
Proposed steps for fixing get_user_pages() + DMA problems.

[3] https://lkml.kernel.org/r/[email protected]
Bounce buffers (otherwise [2] is not really viable).

[4] https://lkml.kernel.org/r/[email protected]
Follow-up discussions.

CC: Doug Ledford <[email protected]>
CC: Jason Gunthorpe <[email protected]>
CC: Mike Marciniszyn <[email protected]>
CC: Dennis Dalessandro <[email protected]>
CC: Christian Benvenuti <[email protected]>

CC: [email protected]
CC: [email protected]
CC: [email protected]
Signed-off-by: John Hubbard <[email protected]>
---
drivers/infiniband/core/umem.c | 7 ++++---
drivers/infiniband/core/umem_odp.c | 2 +-
drivers/infiniband/hw/hfi1/user_pages.c | 11 ++++-------
drivers/infiniband/hw/mthca/mthca_memfree.c | 6 +++---
drivers/infiniband/hw/qib/qib_user_pages.c | 11 ++++-------
drivers/infiniband/hw/qib/qib_user_sdma.c | 8 ++++----
drivers/infiniband/hw/usnic/usnic_uiom.c | 7 ++++---
7 files changed, 24 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index a41792dbae1f..7ab7a3a35eb4 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -58,9 +58,10 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
for_each_sg(umem->sg_head.sgl, sg, umem->npages, i) {

page = sg_page(sg);
- if (!PageDirty(page) && umem->writable && dirty)
- set_page_dirty_lock(page);
- put_page(page);
+ if (umem->writable && dirty)
+ put_user_pages_dirty_lock(&page, 1);
+ else
+ put_user_page(page);
}

sg_free_table(&umem->sg_head);
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 6ec748eccff7..6227b89cf05c 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -717,7 +717,7 @@ int ib_umem_odp_map_dma_pages(struct ib_umem *umem, u64 user_virt, u64 bcnt,
ret = -EFAULT;
break;
}
- put_page(local_page_list[j]);
+ put_user_page(local_page_list[j]);
continue;
}

diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
index e341e6dcc388..99ccc0483711 100644
--- a/drivers/infiniband/hw/hfi1/user_pages.c
+++ b/drivers/infiniband/hw/hfi1/user_pages.c
@@ -121,13 +121,10 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t np
void hfi1_release_user_pages(struct mm_struct *mm, struct page **p,
size_t npages, bool dirty)
{
- size_t i;
-
- for (i = 0; i < npages; i++) {
- if (dirty)
- set_page_dirty_lock(p[i]);
- put_page(p[i]);
- }
+ if (dirty)
+ put_user_pages_dirty_lock(p, npages);
+ else
+ put_user_pages(p, npages);

if (mm) { /* during close after signal, mm can be NULL */
down_write(&mm->mmap_sem);
diff --git a/drivers/infiniband/hw/mthca/mthca_memfree.c b/drivers/infiniband/hw/mthca/mthca_memfree.c
index cc9c0c8ccba3..b8b12effd009 100644
--- a/drivers/infiniband/hw/mthca/mthca_memfree.c
+++ b/drivers/infiniband/hw/mthca/mthca_memfree.c
@@ -481,7 +481,7 @@ int mthca_map_user_db(struct mthca_dev *dev, struct mthca_uar *uar,

ret = pci_map_sg(dev->pdev, &db_tab->page[i].mem, 1, PCI_DMA_TODEVICE);
if (ret < 0) {
- put_page(pages[0]);
+ put_user_page(pages[0]);
goto out;
}

@@ -489,7 +489,7 @@ int mthca_map_user_db(struct mthca_dev *dev, struct mthca_uar *uar,
mthca_uarc_virt(dev, uar, i));
if (ret) {
pci_unmap_sg(dev->pdev, &db_tab->page[i].mem, 1, PCI_DMA_TODEVICE);
- put_page(sg_page(&db_tab->page[i].mem));
+ put_user_page(sg_page(&db_tab->page[i].mem));
goto out;
}

@@ -555,7 +555,7 @@ void mthca_cleanup_user_db_tab(struct mthca_dev *dev, struct mthca_uar *uar,
if (db_tab->page[i].uvirt) {
mthca_UNMAP_ICM(dev, mthca_uarc_virt(dev, uar, i), 1);
pci_unmap_sg(dev->pdev, &db_tab->page[i].mem, 1, PCI_DMA_TODEVICE);
- put_page(sg_page(&db_tab->page[i].mem));
+ put_user_page(sg_page(&db_tab->page[i].mem));
}
}

diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c b/drivers/infiniband/hw/qib/qib_user_pages.c
index 16543d5e80c3..1a5c64c8695f 100644
--- a/drivers/infiniband/hw/qib/qib_user_pages.c
+++ b/drivers/infiniband/hw/qib/qib_user_pages.c
@@ -40,13 +40,10 @@
static void __qib_release_user_pages(struct page **p, size_t num_pages,
int dirty)
{
- size_t i;
-
- for (i = 0; i < num_pages; i++) {
- if (dirty)
- set_page_dirty_lock(p[i]);
- put_page(p[i]);
- }
+ if (dirty)
+ put_user_pages_dirty_lock(p, num_pages);
+ else
+ put_user_pages(p, num_pages);
}

/*
diff --git a/drivers/infiniband/hw/qib/qib_user_sdma.c b/drivers/infiniband/hw/qib/qib_user_sdma.c
index 926f3c8eba69..14f94d823907 100644
--- a/drivers/infiniband/hw/qib/qib_user_sdma.c
+++ b/drivers/infiniband/hw/qib/qib_user_sdma.c
@@ -266,7 +266,7 @@ static void qib_user_sdma_init_frag(struct qib_user_sdma_pkt *pkt,
pkt->addr[i].length = len;
pkt->addr[i].first_desc = first_desc;
pkt->addr[i].last_desc = last_desc;
- pkt->addr[i].put_page = put_page;
+ pkt->addr[i].put_page = put_user_page;
pkt->addr[i].dma_mapped = dma_mapped;
pkt->addr[i].page = page;
pkt->addr[i].kvaddr = kvaddr;
@@ -321,7 +321,7 @@ static int qib_user_sdma_page_to_frags(const struct qib_devdata *dd,
* the caller can ignore this page.
*/
if (put) {
- put_page(page);
+ put_user_page(page);
} else {
/* coalesce case */
kunmap(page);
@@ -635,7 +635,7 @@ static void qib_user_sdma_free_pkt_frag(struct device *dev,
kunmap(pkt->addr[i].page);

if (pkt->addr[i].put_page)
- put_page(pkt->addr[i].page);
+ put_user_page(pkt->addr[i].page);
else
__free_page(pkt->addr[i].page);
} else if (pkt->addr[i].kvaddr) {
@@ -710,7 +710,7 @@ static int qib_user_sdma_pin_pages(const struct qib_devdata *dd,
/* if error, return all pages not managed by pkt */
free_pages:
while (i < j)
- put_page(pages[i++]);
+ put_user_page(pages[i++]);

done:
return ret;
diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
index 9dd39daa602b..9e3615fd05f7 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
@@ -89,9 +89,10 @@ static void usnic_uiom_put_pages(struct list_head *chunk_list, int dirty)
for_each_sg(chunk->page_list, sg, chunk->nents, i) {
page = sg_page(sg);
pa = sg_phys(sg);
- if (!PageDirty(page) && dirty)
- set_page_dirty_lock(page);
- put_page(page);
+ if (dirty)
+ put_user_pages_dirty_lock(&page, 1);
+ else
+ put_user_page(page);
usnic_dbg("pa: %pa\n", &pa);
}
kfree(chunk);
--
2.19.0


2018-10-06 02:52:20

by John Hubbard

Subject: [PATCH v3 2/3] mm: introduce put_user_page*(), placeholder versions

From: John Hubbard <[email protected]>

Introduces put_user_page(), which simply calls put_page().
This provides a way to update all get_user_pages*() callers,
so that they call put_user_page(), instead of put_page().

Also introduces put_user_pages(), and a few dirty/locked variations,
as a replacement for release_pages(), and also as a replacement
for open-coded loops that release multiple pages.
These may be used for subsequent performance improvements,
via batching of pages to be released.
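
For example, an open-coded release loop of the form that patch 3
converts in the infiniband drivers:

	for (i = 0; i < npages; i++) {
		if (dirty)
			set_page_dirty_lock(pages[i]);
		put_page(pages[i]);
	}

collapses into:

	if (dirty)
		put_user_pages_dirty_lock(pages, npages);
	else
		put_user_pages(pages, npages);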

This prepares for eventually fixing the problem described
in [1], and is following a plan listed in [2], [3], [4].

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

[2] https://lkml.kernel.org/r/[email protected]
Proposed steps for fixing get_user_pages() + DMA problems.

[3] https://lkml.kernel.org/r/[email protected]
Bounce buffers (otherwise [2] is not really viable).

[4] https://lkml.kernel.org/r/[email protected]
Follow-up discussions.

CC: Matthew Wilcox <[email protected]>
CC: Michal Hocko <[email protected]>
CC: Christopher Lameter <[email protected]>
CC: Jason Gunthorpe <[email protected]>
CC: Dan Williams <[email protected]>
CC: Jan Kara <[email protected]>
CC: Al Viro <[email protected]>
CC: Jerome Glisse <[email protected]>
CC: Christoph Hellwig <[email protected]>
CC: Ralph Campbell <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
---
include/linux/mm.h | 48 ++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 46 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0416a7204be3..305b206e6851 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -137,6 +137,8 @@ extern int overcommit_ratio_handler(struct ctl_table *, int, void __user *,
size_t *, loff_t *);
extern int overcommit_kbytes_handler(struct ctl_table *, int, void __user *,
size_t *, loff_t *);
+int set_page_dirty(struct page *page);
+int set_page_dirty_lock(struct page *page);

#define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))

@@ -943,6 +945,50 @@ static inline void put_page(struct page *page)
__put_page(page);
}

+/* Pages that were pinned via get_user_pages*() should be released via
+ * either put_user_page(), or one of the put_user_pages*() routines
+ * below.
+ */
+static inline void put_user_page(struct page *page)
+{
+ put_page(page);
+}
+
+static inline void put_user_pages_dirty(struct page **pages,
+ unsigned long npages)
+{
+ unsigned long index;
+
+ for (index = 0; index < npages; index++) {
+ if (!PageDirty(pages[index]))
+ set_page_dirty(pages[index]);
+
+ put_user_page(pages[index]);
+ }
+}
+
+static inline void put_user_pages_dirty_lock(struct page **pages,
+ unsigned long npages)
+{
+ unsigned long index;
+
+ for (index = 0; index < npages; index++) {
+ if (!PageDirty(pages[index]))
+ set_page_dirty_lock(pages[index]);
+
+ put_user_page(pages[index]);
+ }
+}
+
+static inline void put_user_pages(struct page **pages,
+ unsigned long npages)
+{
+ unsigned long index;
+
+ for (index = 0; index < npages; index++)
+ put_user_page(pages[index]);
+}
+
#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
#define SECTION_IN_PAGE_FLAGS
#endif
@@ -1534,8 +1580,6 @@ int redirty_page_for_writepage(struct writeback_control *wbc,
void account_page_dirtied(struct page *page, struct address_space *mapping);
void account_page_cleaned(struct page *page, struct address_space *mapping,
struct bdi_writeback *wb);
-int set_page_dirty(struct page *page);
-int set_page_dirty_lock(struct page *page);
void __cancel_dirty_page(struct page *page);
static inline void cancel_dirty_page(struct page *page)
{
--
2.19.0


2018-10-08 15:53:46

by Dennis Dalessandro

Subject: Re: [PATCH v3 0/3] get_user_pages*() and RDMA: first steps

On 10/5/2018 10:49 PM, [email protected] wrote:
> From: John Hubbard <[email protected]>
>
> Changes since v2:
>
> -- Absorbed more dirty page handling logic into the put_user_page*()
> routines, and handled some page-releasing loops in infiniband more
> thoroughly, as per Jason Gunthorpe's feedback.
>
> -- Fixed a bug in the put_user_pages*() routines' loops (thanks to
> Ralph Campbell for spotting it).
>
> Changes since v1:
>
> -- Renamed release_user_pages*() to put_user_pages*(), per Jan's feedback.
>
> -- Removed the goldfish.c changes, and instead included only a single
> user (infiniband) of the new functions. That is because goldfish.c no
> longer has a name collision (its existing release_user_pages() routine
> no longer clashes, now that the new functions are named
> put_user_page*()), and also because infiniband exercises both the
> put_user_page() and put_user_pages*() paths.
>
> -- Updated links to discussions and plans, so as to be sure to include
> bounce buffers, thanks to Jerome's feedback.
>
> Also:
>
> -- Dennis, thanks for your earlier review. I have not yet added your
> Reviewed-by tag, because this revision changes the things that you had
> previously reviewed, and so it may require another look.

This spin looks fine to me.

Reviewed-by: Dennis Dalessandro <[email protected]>

2018-10-08 16:49:49

by Jan Kara

Subject: Re: [PATCH v3 2/3] mm: introduce put_user_page*(), placeholder versions

On Fri 05-10-18 19:49:48, [email protected] wrote:
> From: John Hubbard <[email protected]>
>
> Introduces put_user_page(), which simply calls put_page().
> This provides a way to update all get_user_pages*() callers,
> so that they call put_user_page(), instead of put_page().
>
> Also introduces put_user_pages(), and a few dirty/locked variations,
> as a replacement for release_pages(), and also as a replacement
> for open-coded loops that release multiple pages.
> These may be used for subsequent performance improvements,
> via batching of pages to be released.
>
> This prepares for eventually fixing the problem described
> in [1], and is following a plan listed in [2], [3], [4].
>
> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
>
> [2] https://lkml.kernel.org/r/[email protected]
> Proposed steps for fixing get_user_pages() + DMA problems.
>
> [3] https://lkml.kernel.org/r/[email protected]
> Bounce buffers (otherwise [2] is not really viable).
>
> [4] https://lkml.kernel.org/r/[email protected]
> Follow-up discussions.
>
> CC: Matthew Wilcox <[email protected]>
> CC: Michal Hocko <[email protected]>
> CC: Christopher Lameter <[email protected]>
> CC: Jason Gunthorpe <[email protected]>
> CC: Dan Williams <[email protected]>
> CC: Jan Kara <[email protected]>
> CC: Al Viro <[email protected]>
> CC: Jerome Glisse <[email protected]>
> CC: Christoph Hellwig <[email protected]>
> CC: Ralph Campbell <[email protected]>
> Signed-off-by: John Hubbard <[email protected]>
> ---
> include/linux/mm.h | 48 ++++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 46 insertions(+), 2 deletions(-)

Looks good to me. You can add:

Reviewed-by: Jan Kara <[email protected]>

Just one nit below:

> +/* Pages that were pinned via get_user_pages*() should be released via
> + * either put_user_page(), or one of the put_user_pages*() routines
> + * below.
> + */

Multi-line comments usually follow this formatting:

/*
 * Some text here
 * and more text here...
 */
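
Applied to the new comment block, that would read:

/*
 * Pages that were pinned via get_user_pages*() should be released via
 * either put_user_page(), or one of the put_user_pages*() routines
 * below.
 */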

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2018-10-08 16:50:10

by Jan Kara

Subject: Re: [PATCH v3 3/3] infiniband/mm: convert put_page() to put_user_page*()

On Fri 05-10-18 19:49:49, [email protected] wrote:
> From: John Hubbard <[email protected]>
>
> For code that retains pages via get_user_pages*(),
> release those pages via the new put_user_page(), or
> put_user_pages*(), instead of put_page().
>
> This prepares for eventually fixing the problem described
> in [1], and is following a plan listed in [2], [3], [4].
>
> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
>
> [2] https://lkml.kernel.org/r/[email protected]
> Proposed steps for fixing get_user_pages() + DMA problems.
>
> [3] https://lkml.kernel.org/r/[email protected]
> Bounce buffers (otherwise [2] is not really viable).
>
> [4] https://lkml.kernel.org/r/[email protected]
> Follow-up discussions.
>
> CC: Doug Ledford <[email protected]>
> CC: Jason Gunthorpe <[email protected]>
> CC: Mike Marciniszyn <[email protected]>
> CC: Dennis Dalessandro <[email protected]>
> CC: Christian Benvenuti <[email protected]>
>
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> Signed-off-by: John Hubbard <[email protected]>

Looks good to me. You can add:

Reviewed-by: Jan Kara <[email protected]>

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2018-10-08 19:45:27

by Jason Gunthorpe

Subject: Re: [PATCH v3 3/3] infiniband/mm: convert put_page() to put_user_page*()

On Fri, Oct 05, 2018 at 07:49:49PM -0700, [email protected] wrote:
> From: John Hubbard <[email protected]>
>
> For code that retains pages via get_user_pages*(),
> release those pages via the new put_user_page(), or
> put_user_pages*(), instead of put_page().
>
> This prepares for eventually fixing the problem described
> in [1], and is following a plan listed in [2], [3], [4].
>
> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
>
> [2] https://lkml.kernel.org/r/[email protected]
> Proposed steps for fixing get_user_pages() + DMA problems.
>
> [3] https://lkml.kernel.org/r/[email protected]
> Bounce buffers (otherwise [2] is not really viable).
>
> [4] https://lkml.kernel.org/r/[email protected]
> Follow-up discussions.
>
> CC: Doug Ledford <[email protected]>
> CC: Jason Gunthorpe <[email protected]>
> CC: Mike Marciniszyn <[email protected]>
> CC: Dennis Dalessandro <[email protected]>
> CC: Christian Benvenuti <[email protected]>
>
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> Signed-off-by: John Hubbard <[email protected]>
> ---
> drivers/infiniband/core/umem.c | 7 ++++---
> drivers/infiniband/core/umem_odp.c | 2 +-
> drivers/infiniband/hw/hfi1/user_pages.c | 11 ++++-------
> drivers/infiniband/hw/mthca/mthca_memfree.c | 6 +++---
> drivers/infiniband/hw/qib/qib_user_pages.c | 11 ++++-------
> drivers/infiniband/hw/qib/qib_user_sdma.c | 8 ++++----
> drivers/infiniband/hw/usnic/usnic_uiom.c | 7 ++++---
> 7 files changed, 24 insertions(+), 28 deletions(-)

I have no issues with this, do you want this series to go through the
rdma tree? Otherwise:

Acked-by: Jason Gunthorpe <[email protected]>

Thanks,
Jason

2018-10-08 20:38:15

by John Hubbard

Subject: Re: [PATCH v3 3/3] infiniband/mm: convert put_page() to put_user_page*()

On 10/8/18 12:42 PM, Jason Gunthorpe wrote:
> On Fri, Oct 05, 2018 at 07:49:49PM -0700, [email protected] wrote:
>> From: John Hubbard <[email protected]>
[...]
>> drivers/infiniband/core/umem.c | 7 ++++---
>> drivers/infiniband/core/umem_odp.c | 2 +-
>> drivers/infiniband/hw/hfi1/user_pages.c | 11 ++++-------
>> drivers/infiniband/hw/mthca/mthca_memfree.c | 6 +++---
>> drivers/infiniband/hw/qib/qib_user_pages.c | 11 ++++-------
>> drivers/infiniband/hw/qib/qib_user_sdma.c | 8 ++++----
>> drivers/infiniband/hw/usnic/usnic_uiom.c | 7 ++++---
>> 7 files changed, 24 insertions(+), 28 deletions(-)
>
> I have no issues with this, do you want this series to go through the
> rdma tree? Otherwise:
>
> Acked-by: Jason Gunthorpe <[email protected]>
>

The RDMA tree seems like a good path for this, yes, glad you suggested
that.

I'll post a v4 with the comment fix and the recent reviewed-by's, which
should be ready for that. It's based on today's linux.git tree at the
moment, but let me know if I should re-apply it to the RDMA tree.


thanks,
--
John Hubbard
NVIDIA

2018-10-08 20:57:00

by Jason Gunthorpe

Subject: Re: [PATCH v3 3/3] infiniband/mm: convert put_page() to put_user_page*()

On Mon, Oct 08, 2018 at 01:37:35PM -0700, John Hubbard wrote:
> On 10/8/18 12:42 PM, Jason Gunthorpe wrote:
> > On Fri, Oct 05, 2018 at 07:49:49PM -0700, [email protected] wrote:
> >> From: John Hubbard <[email protected]>
> [...]
> >> drivers/infiniband/core/umem.c | 7 ++++---
> >> drivers/infiniband/core/umem_odp.c | 2 +-
> >> drivers/infiniband/hw/hfi1/user_pages.c | 11 ++++-------
> >> drivers/infiniband/hw/mthca/mthca_memfree.c | 6 +++---
> >> drivers/infiniband/hw/qib/qib_user_pages.c | 11 ++++-------
> >> drivers/infiniband/hw/qib/qib_user_sdma.c | 8 ++++----
> >> drivers/infiniband/hw/usnic/usnic_uiom.c | 7 ++++---
> >> 7 files changed, 24 insertions(+), 28 deletions(-)
> >
> > I have no issues with this, do you want this series to go through the
> > rdma tree? Otherwise:
> >
> > Acked-by: Jason Gunthorpe <[email protected]>
> >
>
> The RDMA tree seems like a good path for this, yes, glad you suggested
> that.
>
> I'll post a v4 with the comment fix and the recent reviewed-by's, which
> should be ready for that. It's based on today's linux.git tree at the
> moment, but let me know if I should re-apply it to the RDMA tree.

I'm not clear on who needs to ack the MM sections for us to take this
through the RDMA tree.

Otherwise it is no problem.

Jason

2018-10-08 20:59:47

by John Hubbard

Subject: Re: [PATCH v3 3/3] infiniband/mm: convert put_page() to put_user_page*()

On 10/8/18 1:56 PM, Jason Gunthorpe wrote:
> On Mon, Oct 08, 2018 at 01:37:35PM -0700, John Hubbard wrote:
>> On 10/8/18 12:42 PM, Jason Gunthorpe wrote:
>>> On Fri, Oct 05, 2018 at 07:49:49PM -0700, [email protected] wrote:
>>>> From: John Hubbard <[email protected]>
>> [...]
>>>> drivers/infiniband/core/umem.c | 7 ++++---
>>>> drivers/infiniband/core/umem_odp.c | 2 +-
>>>> drivers/infiniband/hw/hfi1/user_pages.c | 11 ++++-------
>>>> drivers/infiniband/hw/mthca/mthca_memfree.c | 6 +++---
>>>> drivers/infiniband/hw/qib/qib_user_pages.c | 11 ++++-------
>>>> drivers/infiniband/hw/qib/qib_user_sdma.c | 8 ++++----
>>>> drivers/infiniband/hw/usnic/usnic_uiom.c | 7 ++++---
>>>> 7 files changed, 24 insertions(+), 28 deletions(-)
>>>
>>> I have no issues with this, do you want this series to go through the
>>> rdma tree? Otherwise:
>>>
>>> Acked-by: Jason Gunthorpe <[email protected]>
>>>
>>
>> The RDMA tree seems like a good path for this, yes, glad you suggested
>> that.
>>
>> I'll post a v4 with the comment fix and the recent reviewed-by's, which
>> should be ready for that. It's based on today's linux.git tree at the
>> moment, but let me know if I should re-apply it to the RDMA tree.
>
> I'm unclear who needs to ack the MM sections for us to take it to
> RDMA?
>
> Otherwise it is no problem..
>

It needs Andrew Morton (+CC) and preferably also Michal Hocko (already on CC).

thanks,
--
John Hubbard
NVIDIA