2019-04-08 02:39:10

by Huang Shijie

[permalink] [raw]
Subject: [PATCH 1/2] mm/gup.c: fix the wrong comments

When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own
get_user_pages_fast().

In the following scenario, we will may meet the bug in the DMA case:
.....................
get_user_pages_fast(start,,, pages);
......
sg_alloc_table_from_pages(, pages, ...);
.....................

The root cause is that sg_alloc_table_from_pages() requires the
page order to keep the same as it used in the user space, but
get_user_pages_fast() will mess it up.

So change the comments, and make it more clear for the driver
users.

Signed-off-by: Huang Shijie <[email protected]>
---
mm/gup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 22acdd0f79ff..fb11ff90ba3b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1129,10 +1129,6 @@ EXPORT_SYMBOL(get_user_pages_locked);
* with:
*
* get_user_pages_unlocked(tsk, mm, ..., pages);
- *
- * It is functionally equivalent to get_user_pages_fast so
- * get_user_pages_fast should be used instead if specific gup_flags
- * (e.g. FOLL_FORCE) are not required.
*/
long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
struct page **pages, unsigned int gup_flags)
@@ -2147,6 +2143,10 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
* If not successful, it will fall back to taking the lock and
* calling get_user_pages().
*
+ * Note this routine may fill the pages array with entries in a
+ * different order than get_user_pages_unlocked(), which may cause
+ * issues for callers expecting the routines to be equivalent.
+ *
* Returns number of pages pinned. This may be fewer than the number
* requested. If nr_pages is 0 or negative, returns 0. If no pages
* were pinned, returns -errno.
--
2.17.1


2019-04-08 02:39:11

by Huang Shijie

[permalink] [raw]
Subject: [PATCH 2/2] lib/scatterlist.c: add more commit for sg_alloc_table_from_pages

The get_user_pages_fast() may mess up the page order in @pages array,
We will get the wrong DMA results in this case.

Add more commit to clarify it.

Signed-off-by: Huang Shijie <[email protected]>
---
lib/scatterlist.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 739dc9fe2c55..c170afb1a25e 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -450,6 +450,9 @@ EXPORT_SYMBOL(__sg_alloc_table_from_pages);
* specified by the page array. The returned sg table is released by
* sg_free_table.
*
+ * Note: Do not use get_user_pages_fast() to pin the pages for @pages array,
+ * it may mess up the page order, and we will get the wrong DMA results.
+
* Returns:
* 0 on success, negative error on failure
*/
--
2.17.1

2019-04-08 14:16:03

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/gup.c: fix the wrong comments

On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own
> get_user_pages_fast().
>
> In the following scenario, we will may meet the bug in the DMA case:
> .....................
> get_user_pages_fast(start,,, pages);
> ......
> sg_alloc_table_from_pages(, pages, ...);
> .....................
>
> The root cause is that sg_alloc_table_from_pages() requires the
> page order to keep the same as it used in the user space, but
> get_user_pages_fast() will mess it up.

I don't understand how get_user_pages_fast() can return the pages in a
different order in the array from the order they appear in userspace.
Can you explain?

2019-04-09 01:47:01

by Huang Shijie

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/gup.c: fix the wrong comments

On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own
> > get_user_pages_fast().
> >
> > In the following scenario, we will may meet the bug in the DMA case:
> > .....................
> > get_user_pages_fast(start,,, pages);
> > ......
> > sg_alloc_table_from_pages(, pages, ...);
> > .....................
> >
> > The root cause is that sg_alloc_table_from_pages() requires the
> > page order to keep the same as it used in the user space, but
> > get_user_pages_fast() will mess it up.
>
> I don't understand how get_user_pages_fast() can return the pages in a
> different order in the array from the order they appear in userspace.
> Can you explain?
Please see the code in gup.c:

int get_user_pages_fast(unsigned long start, int nr_pages,
unsigned int gup_flags, struct page **pages)
{
.......
if (gup_fast_permitted(start, nr_pages)) {
local_irq_disable();
gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time.
local_irq_enable();
ret = nr;
}
.......
if (nr < nr_pages) {
/* Try to get the remaining pages with get_user_pages */
start += nr << PAGE_SHIFT;
pages += nr; // The @pages is moved forward.

if (gup_flags & FOLL_LONGTERM) {
down_read(&current->mm->mmap_sem);
ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time
start, nr_pages - nr,
pages, NULL, gup_flags);
up_read(&current->mm->mmap_sem);
} else {
/*
* retain FAULT_FOLL_ALLOW_RETRY optimization if
* possible
*/
ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time.
pages, gup_flags);
}
}


.....................

BTW, I do not know why we mess up the page order. It maybe used in some special case.

Thanks
Huang Shijie

2019-04-09 03:06:25

by Huang Shijie

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/gup.c: fix the wrong comments

On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote:
> On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > The root cause is that sg_alloc_table_from_pages() requires the
> > > > page order to keep the same as it used in the user space, but
> > > > get_user_pages_fast() will mess it up.
> > >
> > > I don't understand how get_user_pages_fast() can return the pages in a
> > > different order in the array from the order they appear in userspace.
> > > Can you explain?
> > Please see the code in gup.c:
> >
> > int get_user_pages_fast(unsigned long start, int nr_pages,
> > unsigned int gup_flags, struct page **pages)
> > {
> > .......
> > if (gup_fast_permitted(start, nr_pages)) {
> > local_irq_disable();
> > gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time.
>
> Right ... but if it's not filled entirely, it will be filled part-way,
> and then we stop.
>
> > local_irq_enable();
> > ret = nr;
> > }
> > .......
> > if (nr < nr_pages) {
> > /* Try to get the remaining pages with get_user_pages */
> > start += nr << PAGE_SHIFT;
> > pages += nr; // The @pages is moved forward.
>
> Yes, to the point where gup_pgd_range() stopped.
>
> > if (gup_flags & FOLL_LONGTERM) {
> > down_read(&current->mm->mmap_sem);
> > ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time
>
> Right.
>
> > /*
> > * retain FAULT_FOLL_ALLOW_RETRY optimization if
> > * possible
> > */
> > ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time.
> > pages, gup_flags);
>
> Yes. But they'll be in the same order.
>
> > BTW, I do not know why we mess up the page order. It maybe used in some special case.
>
> I'm not discounting the possibility that you've found a bug.
> But documenting that a bug exists is not the solution; the solution is
> fixing the bug.
I do not think it is a bug :)

If we use the get_user_pages_unlocked(), DMA is okay, such as:
....
get_user_pages_unlocked()
sg_alloc_table_from_pages()
.....

I think the comment is not accurate enough. So just add more comments, and tell the driver
users how to use the GUPs.

Thanks
Huang Shijie

2019-04-09 03:48:53

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/gup.c: fix the wrong comments

On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > The root cause is that sg_alloc_table_from_pages() requires the
> > > page order to keep the same as it used in the user space, but
> > > get_user_pages_fast() will mess it up.
> >
> > I don't understand how get_user_pages_fast() can return the pages in a
> > different order in the array from the order they appear in userspace.
> > Can you explain?
> Please see the code in gup.c:
>
> int get_user_pages_fast(unsigned long start, int nr_pages,
> unsigned int gup_flags, struct page **pages)
> {
> .......
> if (gup_fast_permitted(start, nr_pages)) {
> local_irq_disable();
> gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time.

Right ... but if it's not filled entirely, it will be filled part-way,
and then we stop.

> local_irq_enable();
> ret = nr;
> }
> .......
> if (nr < nr_pages) {
> /* Try to get the remaining pages with get_user_pages */
> start += nr << PAGE_SHIFT;
> pages += nr; // The @pages is moved forward.

Yes, to the point where gup_pgd_range() stopped.

> if (gup_flags & FOLL_LONGTERM) {
> down_read(&current->mm->mmap_sem);
> ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time

Right.

> /*
> * retain FAULT_FOLL_ALLOW_RETRY optimization if
> * possible
> */
> ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time.
> pages, gup_flags);

Yes. But they'll be in the same order.

> BTW, I do not know why we mess up the page order. It maybe used in some special case.

I'm not discounting the possibility that you've found a bug.
But documenting that a bug exists is not the solution; the solution is
fixing the bug.

2019-04-09 11:20:10

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/gup.c: fix the wrong comments

On Tue, Apr 09, 2019 at 11:04:18AM +0800, Huang Shijie wrote:
> On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote:
> > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > > The root cause is that sg_alloc_table_from_pages() requires the
> > > > > page order to keep the same as it used in the user space, but
> > > > > get_user_pages_fast() will mess it up.
> > > >
> > > > I don't understand how get_user_pages_fast() can return the pages in a
> > > > different order in the array from the order they appear in userspace.
> > > > Can you explain?
> > > Please see the code in gup.c:
> > >
> > > int get_user_pages_fast(unsigned long start, int nr_pages,
> > > unsigned int gup_flags, struct page **pages)
> > > {
> > > .......
> > > if (gup_fast_permitted(start, nr_pages)) {
> > > local_irq_disable();
> > > gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time.
> >
> > Right ... but if it's not filled entirely, it will be filled part-way,
> > and then we stop.
> >
> > > local_irq_enable();
> > > ret = nr;
> > > }
> > > .......
> > > if (nr < nr_pages) {
> > > /* Try to get the remaining pages with get_user_pages */
> > > start += nr << PAGE_SHIFT;
> > > pages += nr; // The @pages is moved forward.
> >
> > Yes, to the point where gup_pgd_range() stopped.
> >
> > > if (gup_flags & FOLL_LONGTERM) {
> > > down_read(&current->mm->mmap_sem);
> > > ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time
> >
> > Right.
> >
> > > /*
> > > * retain FAULT_FOLL_ALLOW_RETRY optimization if
> > > * possible
> > > */
> > > ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time.
> > > pages, gup_flags);
> >
> > Yes. But they'll be in the same order.
> >
> > > BTW, I do not know why we mess up the page order. It maybe used in some special case.
> >
> > I'm not discounting the possibility that you've found a bug.
> > But documenting that a bug exists is not the solution; the solution is
> > fixing the bug.
> I do not think it is a bug :)
>
> If we use the get_user_pages_unlocked(), DMA is okay, such as:
> ....
> get_user_pages_unlocked()
> sg_alloc_table_from_pages()
> .....
>
> I think the comment is not accurate enough. So just add more comments, and tell the driver
> users how to use the GUPs.

gup_fast() and gup_unlocked() should return the pages in the same order.
If they do not, then it is a bug.

2019-04-09 14:56:31

by Ira Weiny

[permalink] [raw]
Subject: RE: [PATCH 1/2] mm/gup.c: fix the wrong comments

> On Tue, Apr 09, 2019 at 11:04:18AM +0800, Huang Shijie wrote:
> > On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote:
> > > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > > > The root cause is that sg_alloc_table_from_pages() requires
> > > > > > the page order to keep the same as it used in the user space,
> > > > > > but
> > > > > > get_user_pages_fast() will mess it up.
> > > > >
> > > > > I don't understand how get_user_pages_fast() can return the
> > > > > pages in a different order in the array from the order they appear in
> userspace.
> > > > > Can you explain?
> > > > Please see the code in gup.c:
> > > >
> > > > int get_user_pages_fast(unsigned long start, int nr_pages,
> > > > unsigned int gup_flags, struct page **pages)
> > > > {
> > > > .......
> > > > if (gup_fast_permitted(start, nr_pages)) {
> > > > local_irq_disable();
> > > > gup_pgd_range(addr, end, gup_flags, pages, &nr);
> // The @pages array maybe filled at the first time.
> > >
> > > Right ... but if it's not filled entirely, it will be filled
> > > part-way, and then we stop.
> > >
> > > > local_irq_enable();
> > > > ret = nr;
> > > > }
> > > > .......
> > > > if (nr < nr_pages) {
> > > > /* Try to get the remaining pages with
> get_user_pages */
> > > > start += nr << PAGE_SHIFT;
> > > > pages += nr; // The
> @pages is moved forward.
> > >
> > > Yes, to the point where gup_pgd_range() stopped.
> > >
> > > > if (gup_flags & FOLL_LONGTERM) {
> > > > down_read(&current->mm->mmap_sem);
> > > > ret = __gup_longterm_locked(current,
> current->mm, // The @pages maybe filled at the second time
> > >
> > > Right.
> > >
> > > > /*
> > > > * retain FAULT_FOLL_ALLOW_RETRY
> optimization if
> > > > * possible
> > > > */
> > > > ret = get_user_pages_unlocked(start,
> nr_pages - nr, // The @pages maybe filled at the second time.
> > > > pages, gup_flags);
> > >
> > > Yes. But they'll be in the same order.
> > >
> > > > BTW, I do not know why we mess up the page order. It maybe used in
> some special case.
> > >
> > > I'm not discounting the possibility that you've found a bug.
> > > But documenting that a bug exists is not the solution; the solution
> > > is fixing the bug.
> > I do not think it is a bug :)
> >
> > If we use the get_user_pages_unlocked(), DMA is okay, such as:
> > ....
> > get_user_pages_unlocked()
> > sg_alloc_table_from_pages()
> > .....
> >
> > I think the comment is not accurate enough. So just add more comments,
> > and tell the driver users how to use the GUPs.
>
> gup_fast() and gup_unlocked() should return the pages in the same order.
> If they do not, then it is a bug.

Is there a reproducer for this? Or do you have some debug output which shows this problem?

Ira

2019-04-09 20:24:44

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/gup.c: fix the wrong comments

On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own
> > > get_user_pages_fast().
> > >
> > > In the following scenario, we will may meet the bug in the DMA case:
> > > .....................
> > > get_user_pages_fast(start,,, pages);
> > > ......
> > > sg_alloc_table_from_pages(, pages, ...);
> > > .....................
> > >
> > > The root cause is that sg_alloc_table_from_pages() requires the
> > > page order to keep the same as it used in the user space, but
> > > get_user_pages_fast() will mess it up.
> >
> > I don't understand how get_user_pages_fast() can return the pages in a
> > different order in the array from the order they appear in userspace.
> > Can you explain?
> Please see the code in gup.c:
>
> int get_user_pages_fast(unsigned long start, int nr_pages,
> unsigned int gup_flags, struct page **pages)
> {
> .......
> if (gup_fast_permitted(start, nr_pages)) {
> local_irq_disable();
> gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time.
> local_irq_enable();
> ret = nr;
> }
> .......
> if (nr < nr_pages) {
> /* Try to get the remaining pages with get_user_pages */
> start += nr << PAGE_SHIFT;
> pages += nr; // The @pages is moved forward.
>
> if (gup_flags & FOLL_LONGTERM) {
> down_read(&current->mm->mmap_sem);
> ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time
>

Neither this nor the get_user_pages_unlocked is filling the pages a second
time. It is adding to the page array having moved start and the page array
forward.

Are you doing a FOLL_LONGTERM GUP? Or are you in the else clause below when
you get this bug?

Ira

> start, nr_pages - nr,
> pages, NULL, gup_flags);
> up_read(&current->mm->mmap_sem);
> } else {
> /*
> * retain FAULT_FOLL_ALLOW_RETRY optimization if
> * possible
> */
> ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time.
> pages, gup_flags);
> }
> }
>
>
> .....................
>
> BTW, I do not know why we mess up the page order. It maybe used in some special case.
>
> Thanks
> Huang Shijie
>

2019-04-10 01:19:55

by Huang Shijie

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/gup.c: fix the wrong comments

On Tue, Apr 09, 2019 at 01:23:16PM -0700, Ira Weiny wrote:
> On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own
> > > > get_user_pages_fast().
> > > >
> > > > In the following scenario, we will may meet the bug in the DMA case:
> > > > .....................
> > > > get_user_pages_fast(start,,, pages);
> > > > ......
> > > > sg_alloc_table_from_pages(, pages, ...);
> > > > .....................
> > > >
> > > > The root cause is that sg_alloc_table_from_pages() requires the
> > > > page order to keep the same as it used in the user space, but
> > > > get_user_pages_fast() will mess it up.
> > >
> > > I don't understand how get_user_pages_fast() can return the pages in a
> > > different order in the array from the order they appear in userspace.
> > > Can you explain?
> > Please see the code in gup.c:
> >
> > int get_user_pages_fast(unsigned long start, int nr_pages,
> > unsigned int gup_flags, struct page **pages)
> > {
> > .......
> > if (gup_fast_permitted(start, nr_pages)) {
> > local_irq_disable();
> > gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time.
> > local_irq_enable();
> > ret = nr;
> > }
> > .......
> > if (nr < nr_pages) {
> > /* Try to get the remaining pages with get_user_pages */
> > start += nr << PAGE_SHIFT;
> > pages += nr; // The @pages is moved forward.
> >
> > if (gup_flags & FOLL_LONGTERM) {
> > down_read(&current->mm->mmap_sem);
> > ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time
> >
>
> Neither this nor the get_user_pages_unlocked is filling the pages a second
The get_user_pages_unlocked() will call the handle_mm_fault which will allocate a
new page for the empty PTE, and save the new page into the @pages array.


> time. It is adding to the page array having moved start and the page array
> forward.

Yes. This will mess up the page order.

I will read the code again to check if I am wrong :)

>
> Are you doing a FOLL_LONGTERM GUP? Or are you in the else clause below when
> you get this bug?
I do not use FOLL_LONGTERM, I just use the FOLL_WRITE.

So it seems it runs into the else clause below.

Thanks
Huang Shijie

>
> Ira
>
> > start, nr_pages - nr,
> > pages, NULL, gup_flags);
> > up_read(&current->mm->mmap_sem);
> > } else {
> > /*
> > * retain FAULT_FOLL_ALLOW_RETRY optimization if
> > * possible
> > */
> > ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time.
> > pages, gup_flags);
> > }
> > }
> >
> >

2019-04-10 01:21:42

by Huang Shijie

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/gup.c: fix the wrong comments

On Tue, Apr 09, 2019 at 02:55:31PM +0000, Weiny, Ira wrote:
> > On Tue, Apr 09, 2019 at 11:04:18AM +0800, Huang Shijie wrote:
> > > On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote:
> > > > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > > > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > > > > The root cause is that sg_alloc_table_from_pages() requires
> > > > > > > the page order to keep the same as it used in the user space,
> > > > > > > but
> > > > > > > get_user_pages_fast() will mess it up.
> > > > > >
> > > > > > I don't understand how get_user_pages_fast() can return the
> > > > > > pages in a different order in the array from the order they appear in
> > userspace.
> > > > > > Can you explain?
> > > > > Please see the code in gup.c:
> > > > >
> > > > > int get_user_pages_fast(unsigned long start, int nr_pages,
> > > > > unsigned int gup_flags, struct page **pages)
> > > > > {
> > > > > .......
> > > > > if (gup_fast_permitted(start, nr_pages)) {
> > > > > local_irq_disable();
> > > > > gup_pgd_range(addr, end, gup_flags, pages, &nr);
> > // The @pages array maybe filled at the first time.
> > > >
> > > > Right ... but if it's not filled entirely, it will be filled
> > > > part-way, and then we stop.
> > > >
> > > > > local_irq_enable();
> > > > > ret = nr;
> > > > > }
> > > > > .......
> > > > > if (nr < nr_pages) {
> > > > > /* Try to get the remaining pages with
> > get_user_pages */
> > > > > start += nr << PAGE_SHIFT;
> > > > > pages += nr; // The
> > @pages is moved forward.
> > > >
> > > > Yes, to the point where gup_pgd_range() stopped.
> > > >
> > > > > if (gup_flags & FOLL_LONGTERM) {
> > > > > down_read(&current->mm->mmap_sem);
> > > > > ret = __gup_longterm_locked(current,
> > current->mm, // The @pages maybe filled at the second time
> > > >
> > > > Right.
> > > >
> > > > > /*
> > > > > * retain FAULT_FOLL_ALLOW_RETRY
> > optimization if
> > > > > * possible
> > > > > */
> > > > > ret = get_user_pages_unlocked(start,
> > nr_pages - nr, // The @pages maybe filled at the second time.
> > > > > pages, gup_flags);
> > > >
> > > > Yes. But they'll be in the same order.
> > > >
> > > > > BTW, I do not know why we mess up the page order. It maybe used in
> > some special case.
> > > >
> > > > I'm not discounting the possibility that you've found a bug.
> > > > But documenting that a bug exists is not the solution; the solution
> > > > is fixing the bug.
> > > I do not think it is a bug :)
> > >
> > > If we use the get_user_pages_unlocked(), DMA is okay, such as:
> > > ....
> > > get_user_pages_unlocked()
> > > sg_alloc_table_from_pages()
> > > .....
> > >
> > > I think the comment is not accurate enough. So just add more comments,
> > > and tell the driver users how to use the GUPs.
> >
> > gup_fast() and gup_unlocked() should return the pages in the same order.
> > If they do not, then it is a bug.
>
> Is there a reproducer for this? Or do you have some debug output which shows this problem?
Is Matthew right?

" gup_fast() and gup_unlocked() should return the pages in the same order.
If they do not, then it is a bug."

If Matthew is right,
I need more time to debug the DMA issue...


Thanks
Huang Shijie


>
> Ira
>

2019-04-10 19:08:42

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/gup.c: fix the wrong comments

On Wed, Apr 10, 2019 at 09:20:36AM +0800, Huang Shijie wrote:
> On Tue, Apr 09, 2019 at 02:55:31PM +0000, Weiny, Ira wrote:
> > > On Tue, Apr 09, 2019 at 11:04:18AM +0800, Huang Shijie wrote:
> > > > On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote:
> > > > > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > > > > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > > > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > > > > > The root cause is that sg_alloc_table_from_pages() requires
> > > > > > > > the page order to keep the same as it used in the user space,
> > > > > > > > but
> > > > > > > > get_user_pages_fast() will mess it up.
> > > > > > >
> > > > > > > I don't understand how get_user_pages_fast() can return the
> > > > > > > pages in a different order in the array from the order they appear in
> > > userspace.
> > > > > > > Can you explain?
> > > > > > Please see the code in gup.c:
> > > > > >
> > > > > > int get_user_pages_fast(unsigned long start, int nr_pages,
> > > > > > unsigned int gup_flags, struct page **pages)
> > > > > > {
> > > > > > .......
> > > > > > if (gup_fast_permitted(start, nr_pages)) {
> > > > > > local_irq_disable();
> > > > > > gup_pgd_range(addr, end, gup_flags, pages, &nr);
> > > // The @pages array maybe filled at the first time.
> > > > >
> > > > > Right ... but if it's not filled entirely, it will be filled
> > > > > part-way, and then we stop.
> > > > >
> > > > > > local_irq_enable();
> > > > > > ret = nr;
> > > > > > }
> > > > > > .......
> > > > > > if (nr < nr_pages) {
> > > > > > /* Try to get the remaining pages with
> > > get_user_pages */
> > > > > > start += nr << PAGE_SHIFT;
> > > > > > pages += nr; // The
> > > @pages is moved forward.
> > > > >
> > > > > Yes, to the point where gup_pgd_range() stopped.
> > > > >
> > > > > > if (gup_flags & FOLL_LONGTERM) {
> > > > > > down_read(&current->mm->mmap_sem);
> > > > > > ret = __gup_longterm_locked(current,
> > > current->mm, // The @pages maybe filled at the second time
> > > > >
> > > > > Right.
> > > > >
> > > > > > /*
> > > > > > * retain FAULT_FOLL_ALLOW_RETRY
> > > optimization if
> > > > > > * possible
> > > > > > */
> > > > > > ret = get_user_pages_unlocked(start,
> > > nr_pages - nr, // The @pages maybe filled at the second time.
> > > > > > pages, gup_flags);
> > > > >
> > > > > Yes. But they'll be in the same order.
> > > > >
> > > > > > BTW, I do not know why we mess up the page order. It maybe used in
> > > some special case.
> > > > >
> > > > > I'm not discounting the possibility that you've found a bug.
> > > > > But documenting that a bug exists is not the solution; the solution
> > > > > is fixing the bug.
> > > > I do not think it is a bug :)
> > > >
> > > > If we use the get_user_pages_unlocked(), DMA is okay, such as:
> > > > ....
> > > > get_user_pages_unlocked()
> > > > sg_alloc_table_from_pages()
> > > > .....
> > > >
> > > > I think the comment is not accurate enough. So just add more comments,
> > > > and tell the driver users how to use the GUPs.
> > >
> > > gup_fast() and gup_unlocked() should return the pages in the same order.
> > > If they do not, then it is a bug.
> >
> > Is there a reproducer for this? Or do you have some debug output which shows this problem?
> Is Matthew right?
>
> " gup_fast() and gup_unlocked() should return the pages in the same order.
> If they do not, then it is a bug."

Yes I think he is...

Ira

>
> If Matthew is right,
> I need more time to debug the DMA issue...
>
>
> Thanks
> Huang Shijie
>
>
> >
> > Ira
> >
>

2019-04-10 19:10:16

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/gup.c: fix the wrong comments

On Wed, Apr 10, 2019 at 09:18:50AM +0800, Huang Shijie wrote:
> On Tue, Apr 09, 2019 at 01:23:16PM -0700, Ira Weiny wrote:
> > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > > When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own
> > > > > get_user_pages_fast().
> > > > >
> > > > > In the following scenario, we will may meet the bug in the DMA case:
> > > > > .....................
> > > > > get_user_pages_fast(start,,, pages);
> > > > > ......
> > > > > sg_alloc_table_from_pages(, pages, ...);
> > > > > .....................
> > > > >
> > > > > The root cause is that sg_alloc_table_from_pages() requires the
> > > > > page order to keep the same as it used in the user space, but
> > > > > get_user_pages_fast() will mess it up.
> > > >
> > > > I don't understand how get_user_pages_fast() can return the pages in a
> > > > different order in the array from the order they appear in userspace.
> > > > Can you explain?
> > > Please see the code in gup.c:
> > >
> > > int get_user_pages_fast(unsigned long start, int nr_pages,
> > > unsigned int gup_flags, struct page **pages)
> > > {
> > > .......
> > > if (gup_fast_permitted(start, nr_pages)) {
> > > local_irq_disable();
> > > gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time.
> > > local_irq_enable();
> > > ret = nr;
> > > }
> > > .......
> > > if (nr < nr_pages) {
> > > /* Try to get the remaining pages with get_user_pages */
> > > start += nr << PAGE_SHIFT;
> > > pages += nr; // The @pages is moved forward.
> > >
> > > if (gup_flags & FOLL_LONGTERM) {
> > > down_read(&current->mm->mmap_sem);
> > > ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time
> > >
> >
> > Neither this nor the get_user_pages_unlocked is filling the pages a second
> The get_user_pages_unlocked() will call the handle_mm_fault which will allocate a
> new page for the empty PTE, and save the new page into the @pages array.

But shouldn't this happen if get_user_pages_unlocked() is called directly?

>
>
> > time. It is adding to the page array having moved start and the page array
> > forward.
>
> Yes. This will mess up the page order.
>
> I will read the code again to check if I am wrong :)
>
> >
> > Are you doing a FOLL_LONGTERM GUP? Or are you in the else clause below when
> > you get this bug?
> I do not use FOLL_LONGTERM, I just use the FOLL_WRITE.
>
> So it seems it runs into the else clause below.

Ok thanks,
Ira

>
> Thanks
> Huang Shijie
>
> >
> > Ira
> >
> > > start, nr_pages - nr,
> > > pages, NULL, gup_flags);
> > > up_read(&current->mm->mmap_sem);
> > > } else {
> > > /*
> > > * retain FAULT_FOLL_ALLOW_RETRY optimization if
> > > * possible
> > > */
> > > ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time.
> > > pages, gup_flags);
> > > }
> > > }
> > >
> > >
>