2022-02-01 20:49:30

by William McVicker

[permalink] [raw]
Subject: [PATCH v1 1/1] mm/gup: skip pinnable check for refs==1

This fixes commit 54d516b1d62f ("mm/gup: small refactoring: simplify
try_grab_page()") which refactors try_grab_page() to call
try_grab_compound_head() with refs=1. The refactor commit is causing
pin_user_pages() to return -ENOMEM when we try to pin one user page that
is migratable and not in the movable zone. Previously, try_grab_page()
didn't check if the page was pinnable for FOLL_PIN. To match the same
functionality, this fix adds the check `refs > 1 &&` to skip the call to
is_pinnable_page().

This issue is reproducible with the Pixel 6 on the 5.15 LTS kernel. Here
is the call stack to reproduce the -ENOMEM error:

Call trace:
: dump_backtrace.cfi_jt+0x0/0x8
: show_stack+0x1c/0x2c
: dump_stack_lvl+0x68/0x98
: try_grab_compound_head+0x298/0x3c4
: follow_page_pte+0x1dc/0x330
: follow_page_mask+0x174/0x340
: __get_user_pages+0x158/0x34c
: __gup_longterm_locked+0xfc/0x194
: __gup_longterm_unlocked+0x80/0xf4
: internal_get_user_pages_fast+0xf0/0x15c
: pin_user_pages_fast+0x24/0x40
: edgetpu_device_group_map+0x130/0x584 [abrolhos]
: edgetpu_ioctl_map_buffer+0x110/0x3b4 [abrolhos]
: edgetpu_ioctl+0x238/0x408 [abrolhos]
: edgetpu_fs_ioctl+0x14/0x24 [abrolhos]

Fixes: 54d516b1d62f ("mm/gup: small refactoring: simplify try_grab_page()")
Cc: John Hubbard <[email protected]>
Cc: Minchan Kim <[email protected]>
Signed-off-by: Will McVicker <[email protected]>
---
mm/gup.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/gup.c b/mm/gup.c
index f0af462ac1e2..0509c49c46a3 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -135,7 +135,7 @@ struct page *try_grab_compound_head(struct page *page,
* right zone, so fail and let the caller fall back to the slow
* path.
*/
- if (unlikely((flags & FOLL_LONGTERM) &&
+ if (refs > 1 && unlikely((flags & FOLL_LONGTERM) &&
!is_pinnable_page(page)))
return NULL;

--
2.35.0.rc2.247.g8bbb082509-goog


2022-02-01 20:52:03

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH v1 1/1] mm/gup: skip pinnable check for refs==1

On 1/31/22 12:35, Will McVicker wrote:
> This fixes commit 54d516b1d62f ("mm/gup: small refactoring: simplify
> try_grab_page()") which refactors try_grab_page() to call
> try_grab_compound_head() with refs=1. The refactor commit is causing
> pin_user_pages() to return -ENOMEM when we try to pin one user page that
> is migratable and not in the movable zone. Previously, try_grab_page()
> didn't check if the page was pinnable for FOLL_PIN. To match the same
> functionality, this fix adds the check `refs > 1 &&` to skip the call to
> is_pinnable_page().
>

That's a clear write-up of what you're seeing, what caused it, and how
you'd like to correct it. The previous code had a loophole, and you want
to keep that loophole. More below...

> This issue is reproducible with the Pixel 6 on the 5.15 LTS kernel. Here
> is the call stack to reproduce the -ENOMEM error:
...
> Fixes: 54d516b1d62f ("mm/gup: small refactoring: simplify try_grab_page()")
> Cc: John Hubbard <[email protected]>
> Cc: Minchan Kim <[email protected]>
> Signed-off-by: Will McVicker <[email protected]>
> ---
> mm/gup.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index f0af462ac1e2..0509c49c46a3 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -135,7 +135,7 @@ struct page *try_grab_compound_head(struct page *page,
> * right zone, so fail and let the caller fall back to the slow
> * path.
> */
> - if (unlikely((flags & FOLL_LONGTERM) &&
> + if (refs > 1 && unlikely((flags & FOLL_LONGTERM) &&
> !is_pinnable_page(page)))
> return NULL;
>

...but are you really sure that this is the best way to "fix" the
problem? This trades correctness for "bug-for-bug compatibility" with
the previous code. It says, "it's OK to violate the pinnable and
longterm checks, as long as you do it one page at a time, rather than in
larger chunks.

Wouldn't it be better to try to fix up the calling code so that it's
not in violation of these zone rules?


thanks,
--
John Hubbard
NVIDIA

2022-02-01 20:58:01

by William McVicker

[permalink] [raw]
Subject: Re: [PATCH v1 1/1] mm/gup: skip pinnable check for refs==1

On Mon, Jan 31, 2022 at 12:49 PM John Hubbard <[email protected]> wrote:
>
> On 1/31/22 12:35, Will McVicker wrote:
> > This fixes commit 54d516b1d62f ("mm/gup: small refactoring: simplify
> > try_grab_page()") which refactors try_grab_page() to call
> > try_grab_compound_head() with refs=1. The refactor commit is causing
> > pin_user_pages() to return -ENOMEM when we try to pin one user page that
> > is migratable and not in the movable zone. Previously, try_grab_page()
> > didn't check if the page was pinnable for FOLL_PIN. To match the same
> > functionality, this fix adds the check `refs > 1 &&` to skip the call to
> > is_pinnable_page().
> >
>
> That's a clear write-up of what you're seeing, what caused it, and how
> you'd like to correct it. The previous code had a loophole, and you want
> to keep that loophole. More below...
>
> > This issue is reproducible with the Pixel 6 on the 5.15 LTS kernel. Here
> > is the call stack to reproduce the -ENOMEM error:
> ...
> > Fixes: 54d516b1d62f ("mm/gup: small refactoring: simplify try_grab_page()")
> > Cc: John Hubbard <[email protected]>
> > Cc: Minchan Kim <[email protected]>
> > Signed-off-by: Will McVicker <[email protected]>
> > ---
> > mm/gup.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/gup.c b/mm/gup.c
> > index f0af462ac1e2..0509c49c46a3 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -135,7 +135,7 @@ struct page *try_grab_compound_head(struct page *page,
> > * right zone, so fail and let the caller fall back to the slow
> > * path.
> > */
> > - if (unlikely((flags & FOLL_LONGTERM) &&
> > + if (refs > 1 && unlikely((flags & FOLL_LONGTERM) &&
> > !is_pinnable_page(page)))
> > return NULL;
> >
>
> ...but are you really sure that this is the best way to "fix" the
> problem? This trades correctness for "bug-for-bug compatibility" with
> the previous code. It says, "it's OK to violate the pinnable and
> longterm checks, as long as you do it one page at a time, rather than in
> larger chunks.
>
> Wouldn't it be better to try to fix up the calling code so that it's
> not in violation of these zone rules?
>
>
> thanks,
> --
> John Hubbard
> NVIDIA

Hi John,

Thanks for the prompt response! I'm not super familiar with what
PIN+LONGTERM conditions require, but if this was previously a bug,
then I definitely don't want to re-introduce it. Since you're
confirming that, let me sync-up with the driver owner to see how I can
fix this on the side.

Thanks!
Will

2022-02-02 11:15:31

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH v1 1/1] mm/gup: skip pinnable check for refs==1

On 1/31/22 23:28, Minchan Kim wrote:
...
>>> --- a/mm/gup.c
>>> +++ b/mm/gup.c
>>> @@ -135,7 +135,7 @@ struct page *try_grab_compound_head(struct page *page,
>>> * right zone, so fail and let the caller fall back to the slow
>>> * path.
>>> */
>>> - if (unlikely((flags & FOLL_LONGTERM) &&
>>> + if (refs > 1 && unlikely((flags & FOLL_LONGTERM) &&
>>> !is_pinnable_page(page)))
>>> return NULL;
>>
>> ...but are you really sure that this is the best way to "fix" the
>> problem? This trades correctness for "bug-for-bug compatibility" with
>> the previous code. It says, "it's OK to violate the pinnable and
>> longterm checks, as long as you do it one page at a time, rather than in
>> larger chunks.
>>
>> Wouldn't it be better to try to fix up the calling code so that it's
>> not in violation of these zone rules?
>
> I think the problem is before pin_user_pages can work with CMA pages
> in the fallback path but now it doesn't work with CMA page. Driver

Actually, it "worked" only if the caller did it one page at a time.
(See how the above attempted fix restores the "make it work for
refs == 1.)

> couldn't know whether it will work with CMA page or not in advance.
>
> pin_user_pages
> __get_user_pages_locked
> follow_page_mask
> follow_page_pte
> try_grab_page
> !is_pinnable_page(page)
> return NULL;
> return ERR_PTR(-ENOMEM);
> return -ENOMEM without faultin_page

Yes, that's all clear.


thanks,
--
John Hubbard
NVIDIA

2022-02-02 13:14:09

by Minchan Kim

[permalink] [raw]
Subject: Re: [PATCH v1 1/1] mm/gup: skip pinnable check for refs==1

Hi John,

On Mon, Jan 31, 2022 at 12:49:35PM -0800, John Hubbard wrote:
> On 1/31/22 12:35, Will McVicker wrote:
> > This fixes commit 54d516b1d62f ("mm/gup: small refactoring: simplify
> > try_grab_page()") which refactors try_grab_page() to call
> > try_grab_compound_head() with refs=1. The refactor commit is causing
> > pin_user_pages() to return -ENOMEM when we try to pin one user page that
> > is migratable and not in the movable zone. Previously, try_grab_page()
> > didn't check if the page was pinnable for FOLL_PIN. To match the same
> > functionality, this fix adds the check `refs > 1 &&` to skip the call to
> > is_pinnable_page().
> >
>
> That's a clear write-up of what you're seeing, what caused it, and how
> you'd like to correct it. The previous code had a loophole, and you want
> to keep that loophole. More below...
>
> > This issue is reproducible with the Pixel 6 on the 5.15 LTS kernel. Here
> > is the call stack to reproduce the -ENOMEM error:
> ...
> > Fixes: 54d516b1d62f ("mm/gup: small refactoring: simplify try_grab_page()")
> > Cc: John Hubbard <[email protected]>
> > Cc: Minchan Kim <[email protected]>
> > Signed-off-by: Will McVicker <[email protected]>
> > ---
> > mm/gup.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/gup.c b/mm/gup.c
> > index f0af462ac1e2..0509c49c46a3 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -135,7 +135,7 @@ struct page *try_grab_compound_head(struct page *page,
> > * right zone, so fail and let the caller fall back to the slow
> > * path.
> > */
> > - if (unlikely((flags & FOLL_LONGTERM) &&
> > + if (refs > 1 && unlikely((flags & FOLL_LONGTERM) &&
> > !is_pinnable_page(page)))
> > return NULL;
>
> ...but are you really sure that this is the best way to "fix" the
> problem? This trades correctness for "bug-for-bug compatibility" with
> the previous code. It says, "it's OK to violate the pinnable and
> longterm checks, as long as you do it one page at a time, rather than in
> larger chunks.
>
> Wouldn't it be better to try to fix up the calling code so that it's
> not in violation of these zone rules?

I think the problem is before pin_user_pages can work with CMA pages
in the fallback path but now it doesn't work with CMA page. Driver
couldn't know whether it will work with CMA page or not in advance.

pin_user_pages
__get_user_pages_locked
follow_page_mask
follow_page_pte
try_grab_page
!is_pinnable_page(page)
return NULL;
return ERR_PTR(-ENOMEM);
return -ENOMEM without faultin_page

2022-02-02 15:38:17

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH v1 1/1] mm/gup: skip pinnable check for refs==1

On 1/31/22 23:43, John Hubbard wrote:
>>> pin_user_pages
>>>    __get_user_pages_locked
>>>      follow_page_mask
>>>        follow_page_pte
>>>          try_grab_page
>>>            !is_pinnable_page(page)
>>>              return NULL;
>>>          return ERR_PTR(-ENOMEM);
>>>       return -ENOMEM without faultin_page
>>
>> Yes, that's all clear.
>> ...oh, but I guess you're pointing out that it's always going to be
> page-at-a-time this deep in the pin_user_pages() call path. Which is true.
>
> I hadn't worked through how to fix this yet, my initial reaction was
> that allowing single refs to go through, while prohibiting multiple refs,
> was clearly *not* the way to go.
>

OK, so after looking at this some more, I think that the real problem
with commit 54d516b1d62f ("mm/gup: small refactoring: simplify
try_grab_page()") is that it funnels fast and slow gup through the same
routine (try_grab_compound_head()). And the problem with *that*, is that
try_grab_compound_head() is coded up with that assumption that it is
being called *only* by fast-gup:

/*
* Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a
* right zone, so fail and let the caller fall back to the slow
* path.
*/
if (unlikely((flags & FOLL_LONGTERM) &&
!is_pinnable_page(page)))
return NULL;

Now, to fix that, I'd really rather not conflate "refs == 1" with "on
the slow path", because that's a conceptual mismatch.

So, other ideas:

a) Remove the above check, and fail fast gup at a different point for
the (FOLL_LONGTERM && !is_pinnable) case. Haven't looked closely at this
yet.

b) Pass in FOLL_FAST_ONLY from all call sites *except* try_grag_page().
Skip the above check in slow-gup (!FOLL_FAST_ONLY) cases.

This makes the code ugly, though, and I'm just listing it here for
completeness.

c) Just call the entire refactoring idea a mistake, and roll it back
either entirely, or enough to keep fast and slow gup separate.

Unless a better idea shows up, (c) is probably the way to go, I think...


thanks,
--
John Hubbard
NVIDIA

2022-02-03 05:27:07

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH v1 1/1] mm/gup: skip pinnable check for refs==1

On 1/31/22 23:35, John Hubbard wrote:
> On 1/31/22 23:28, Minchan Kim wrote:
> ...
>>>> --- a/mm/gup.c
>>>> +++ b/mm/gup.c
>>>> @@ -135,7 +135,7 @@ struct page *try_grab_compound_head(struct page *page,
>>>>             * right zone, so fail and let the caller fall back to the slow
>>>>             * path.
>>>>             */
>>>> -        if (unlikely((flags & FOLL_LONGTERM) &&
>>>> +        if (refs > 1 && unlikely((flags & FOLL_LONGTERM) &&
>>>>                     !is_pinnable_page(page)))
>>>>                return NULL;
>>>
>>> ...but are you really sure that this is the best way to "fix" the
>>> problem? This trades correctness for "bug-for-bug compatibility" with
>>> the previous code. It says, "it's OK to violate the pinnable and
>>> longterm checks, as long as you do it one page at a time, rather than in
>>> larger chunks.
>>>
>>> Wouldn't it be better to try to fix up the calling code so that it's
>>> not in violation of these zone rules?
>>
>> I think the problem is before pin_user_pages can work with CMA pages
>> in the fallback path but now it doesn't work with CMA page. Driver
>
> Actually, it "worked" only if the caller did it one page at a time.
> (See how the above attempted fix restores the "make it work for
> refs == 1.)
>
>> couldn't know whether it will work with CMA page or not in advance.
>>
>> pin_user_pages
>>    __get_user_pages_locked
>>      follow_page_mask
>>        follow_page_pte
>>          try_grab_page
>>            !is_pinnable_page(page)
>>              return NULL;
>>          return ERR_PTR(-ENOMEM);
>>       return -ENOMEM without faultin_page
>
> Yes, that's all clear.
> ...oh, but I guess you're pointing out that it's always going to be
page-at-a-time this deep in the pin_user_pages() call path. Which is true.

I hadn't worked through how to fix this yet, my initial reaction was
that allowing single refs to go through, while prohibiting multiple refs,
was clearly *not* the way to go.

thanks,
--
John Hubbard
NVIDIA