2021-05-07 07:37:07

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH 2/3] mm: gup: allow FOLL_PIN to scale in SMP

On 5/6/21 4:25 PM, Peter Xu wrote:
> From: Andrea Arcangeli <[email protected]>
>
> has_pinned cannot be written by each pin-fast or it won't scale in
> SMP. This isn't "false sharing" strictly speaking (it's more like
> "true non-sharing"), but it creates the same SMP scalability
> bottleneck of "false sharing".
>
> To verify the improvement a new "pin_fast.c" program was added to
> the will-it-scale benchmark.
...
>
> This commits increases the SMP scalability of pin_user_pages_fast()
> executed by different threads of the same process by more than 4000%.
>

Remarkable! I mean, yes, everyone knows that atomic writes are
"expensive", but this is a fun, dramatic example of just *how*
expensive they can get, once you start doing contended atomic writes.


Reviewed-by: John Hubbard <[email protected]>

Other notes, that don't have any effect on the above reviewed-by
tag:

On the commit log, I will add a "+1" to the idea of deleting the
pin_fast.c contents from the commit log, and just providing a URL
instead. No need to put C programs in the commit log, IMHO, especially
when you have them elsewhere anyway.


thanks,
--
John Hubbard
NVIDIA


> Signed-off-by: Andrea Arcangeli <[email protected]>
> Signed-off-by: Peter Xu <[email protected]>
> ---
> mm/gup.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 63a079e361a3d..8b513e1723b45 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1292,7 +1292,7 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
> BUG_ON(*locked != 1);
> }
>
> - if (flags & FOLL_PIN)
> + if (flags & FOLL_PIN && !atomic_read(&mm->has_pinned))
> atomic_set(&mm->has_pinned, 1);
>
> /*
> @@ -2617,7 +2617,7 @@ static int internal_get_user_pages_fast(unsigned long start,
> FOLL_FAST_ONLY)))
> return -EINVAL;
>
> - if (gup_flags & FOLL_PIN)
> + if (gup_flags & FOLL_PIN && !atomic_read(&current->mm->has_pinned))
> atomic_set(&current->mm->has_pinned, 1);
>
> if (!(gup_flags & FOLL_FAST_ONLY))
>

thanks,
--
John Hubbard
NVIDIA


2021-05-07 17:21:59

by Peter Xu

[permalink] [raw]
Subject: Re: [PATCH 2/3] mm: gup: allow FOLL_PIN to scale in SMP

On Thu, May 06, 2021 at 11:07:32PM -0700, John Hubbard wrote:
> On 5/6/21 4:25 PM, Peter Xu wrote:
> > From: Andrea Arcangeli <[email protected]>
> >
> > has_pinned cannot be written by each pin-fast or it won't scale in
> > SMP. This isn't "false sharing" strictly speaking (it's more like
> > "true non-sharing"), but it creates the same SMP scalability
> > bottleneck of "false sharing".
> >
> > To verify the improvement a new "pin_fast.c" program was added to
> > the will-it-scale benchmark.
> ...
> >
> > This commits increases the SMP scalability of pin_user_pages_fast()
> > executed by different threads of the same process by more than 4000%.
> >
>
> Remarkable! I mean, yes, everyone knows that atomic writes are
> "expensive", but this is a fun, dramatic example of just *how*
> expensive they can get, once you start doing contended atomic writes.
>
>
> Reviewed-by: John Hubbard <[email protected]>
>
> Other notes, that don't have any effect on the above reviewed-by
> tag:
>
> On the commit log, I will add a "+1" to the idea of deleting the
> pin_fast.c contents from the commit log, and just providing a URL
> instead. No need to put C programs in the commit log, IMHO, especially
> when you have them elsewhere anyway.

I guess it's put there only because it does not exist elsewhere. :)

I didn't run it, but I think it needs to be put into tests/ of if-it-scale repo
and build, something like that.

https://github.com/antonblanchard/will-it-scale/tree/master/tests

But yeah since we've got the 1st patch which can also reproduce this, I'll
reference that instead in the commit mesasge and I'll be able to shrink it.

I'll also start to use parentheses as Linus suggested. My thanks to both of
you on the quick comments!

--
Peter Xu