2020-02-01 03:44:14

by John Hubbard

[permalink] [raw]
Subject: [PATCH v3 10/12] mm/gup: /proc/vmstat: pin_user_pages (FOLL_PIN) reporting

Now that pages are "DMA-pinned" via pin_user_page*(), and unpinned via
unpin_user_pages*(), we need some visibility into whether all of this is
working correctly.

Add two new fields to /proc/vmstat:

nr_foll_pin_requested
nr_foll_pin_returned

These are documented in Documentation/core-api/pin_user_pages.rst.
They represent the number of pages (since boot time) that have been
pinned ("nr_foll_pin_requested") and unpinned ("nr_foll_pin_returned"),
via pin_user_pages*() and unpin_user_pages*().

In the absence of long-running DMA or RDMA operations that hold pages
pinned, the above two fields will normally be equal to each other.

Signed-off-by: John Hubbard <[email protected]>
---
include/linux/mmzone.h | 2 ++
mm/gup.c | 21 +++++++++++++++++++++
mm/vmstat.c | 2 ++
3 files changed, 25 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index c2bc309d1634..01d690586206 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -243,6 +243,8 @@ enum node_stat_item {
NR_DIRTIED, /* page dirtyings since bootup */
NR_WRITTEN, /* page writings since bootup */
NR_KERNEL_MISC_RECLAIMABLE, /* reclaimable non-slab kernel pages */
+ NR_FOLL_PIN_REQUESTED, /* via: pin_user_page(), gup flag: FOLL_PIN */
+ NR_FOLL_PIN_RETURNED, /* pages returned via unpin_user_page() */
NR_VM_NODE_STAT_ITEMS
};

diff --git a/mm/gup.c b/mm/gup.c
index c10d0d051c5b..9fe61d15fc0e 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -29,6 +29,19 @@ struct follow_page_context {
unsigned int page_mask;
};

+#ifdef CONFIG_DEBUG_VM
+static inline void __update_proc_vmstat(struct page *page,
+ enum node_stat_item item, int count)
+{
+ mod_node_page_state(page_pgdat(page), item, count);
+}
+#else
+static inline void __update_proc_vmstat(struct page *page,
+ enum node_stat_item item, int count)
+{
+}
+#endif
+
static void hpage_pincount_add(struct page *page, int refs)
{
VM_BUG_ON_PAGE(!hpage_pincount_available(page), page);
@@ -86,6 +99,8 @@ static __maybe_unused struct page *try_grab_compound_head(struct page *page,
if (flags & FOLL_GET)
return try_get_compound_head(page, refs);
else if (flags & FOLL_PIN) {
+ int orig_refs = refs;
+
/*
* When pinning a compound page of order > 1 (which is what
* hpage_pincount_available() checks for), use an exact count to
@@ -104,6 +119,7 @@ static __maybe_unused struct page *try_grab_compound_head(struct page *page,
if (hpage_pincount_available(page))
hpage_pincount_add(page, refs);

+ __update_proc_vmstat(page, NR_FOLL_PIN_REQUESTED, orig_refs);
return page;
}

@@ -158,6 +174,8 @@ bool __must_check try_grab_page(struct page *page, unsigned int flags)
* once, so that the page really is pinned.
*/
page_ref_add(page, refs);
+
+ __update_proc_vmstat(page, NR_FOLL_PIN_REQUESTED, 1);
}

return true;
@@ -178,6 +196,7 @@ static bool __unpin_devmap_managed_user_page(struct page *page)

count = page_ref_sub_return(page, refs);

+ __update_proc_vmstat(page, NR_FOLL_PIN_RETURNED, 1);
/*
* devmap page refcounts are 1-based, rather than 0-based: if
* refcount is 1, then the page is free and the refcount is
@@ -228,6 +247,8 @@ void unpin_user_page(struct page *page)

if (page_ref_sub_and_test(page, refs))
__put_page(page);
+
+ __update_proc_vmstat(page, NR_FOLL_PIN_RETURNED, 1);
}
EXPORT_SYMBOL(unpin_user_page);

diff --git a/mm/vmstat.c b/mm/vmstat.c
index 78d53378db99..b56808bae1b4 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1168,6 +1168,8 @@ const char * const vmstat_text[] = {
"nr_dirtied",
"nr_written",
"nr_kernel_misc_reclaimable",
+ "nr_foll_pin_requested",
+ "nr_foll_pin_returned",

/* enum writeback_stat_item counters */
"nr_dirty_threshold",
--
2.25.0


2020-02-03 15:55:08

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH v3 10/12] mm/gup: /proc/vmstat: pin_user_pages (FOLL_PIN) reporting

On Fri, Jan 31, 2020 at 07:40:27PM -0800, John Hubbard wrote:
> diff --git a/mm/gup.c b/mm/gup.c
> index c10d0d051c5b..9fe61d15fc0e 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -29,6 +29,19 @@ struct follow_page_context {
> unsigned int page_mask;
> };
>
> +#ifdef CONFIG_DEBUG_VM

Why under CONFIG_DEBUG_VM? There's nothing about this in the cover letter.

> +static inline void __update_proc_vmstat(struct page *page,
> + enum node_stat_item item, int count)
> +{
> + mod_node_page_state(page_pgdat(page), item, count);
> +}
> +#else
> +static inline void __update_proc_vmstat(struct page *page,
> + enum node_stat_item item, int count)
> +{
> +}
> +#endif
> +
> static void hpage_pincount_add(struct page *page, int refs)
> {
> VM_BUG_ON_PAGE(!hpage_pincount_available(page), page);
--
Kirill A. Shutemov

2020-02-03 21:05:30

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH v3 10/12] mm/gup: /proc/vmstat: pin_user_pages (FOLL_PIN) reporting

On 2/3/20 5:53 AM, Kirill A. Shutemov wrote:
> On Fri, Jan 31, 2020 at 07:40:27PM -0800, John Hubbard wrote:
>> diff --git a/mm/gup.c b/mm/gup.c
>> index c10d0d051c5b..9fe61d15fc0e 100644
>> --- a/mm/gup.c
>> +++ b/mm/gup.c
>> @@ -29,6 +29,19 @@ struct follow_page_context {
>> unsigned int page_mask;
>> };
>>
>> +#ifdef CONFIG_DEBUG_VM
>
> Why under CONFIG_DEBUG_VM? There's nothing about this in the cover letter.
>

Early on, gup_benchmark showed a really significant slowdown from using these
counters. And I don't doubt that it's still the case.

I'll re-measure and add a short summary and a few numbers to the patch commit
description, and to the v4 cover letter.


thanks,
--
John Hubbard
NVIDIA

>> +static inline void __update_proc_vmstat(struct page *page,
>> + enum node_stat_item item, int count)
>> +{
>> + mod_node_page_state(page_pgdat(page), item, count);
>> +}
>> +#else
>> +static inline void __update_proc_vmstat(struct page *page,
>> + enum node_stat_item item, int count)
>> +{
>> +}
>> +#endif
>> +
>> static void hpage_pincount_add(struct page *page, int refs)
>> {
>> VM_BUG_ON_PAGE(!hpage_pincount_available(page), page);

2020-02-03 21:32:09

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH v3 10/12] mm/gup: /proc/vmstat: pin_user_pages (FOLL_PIN) reporting

On Mon, Feb 03, 2020 at 01:04:04PM -0800, John Hubbard wrote:
> On 2/3/20 5:53 AM, Kirill A. Shutemov wrote:
> > On Fri, Jan 31, 2020 at 07:40:27PM -0800, John Hubbard wrote:
> >> diff --git a/mm/gup.c b/mm/gup.c
> >> index c10d0d051c5b..9fe61d15fc0e 100644
> >> --- a/mm/gup.c
> >> +++ b/mm/gup.c
> >> @@ -29,6 +29,19 @@ struct follow_page_context {
> >> unsigned int page_mask;
> >> };
> >>
> >> +#ifdef CONFIG_DEBUG_VM
> >
> > Why under CONFIG_DEBUG_VM? There's nothing about this in the cover letter.
> >
>
> Early on, gup_benchmark showed a really significant slowdown from using these
> counters. And I don't doubt that it's still the case.
>
> I'll re-measure and add a short summary and a few numbers to the patch commit
> description, and to the v4 cover letter.

Looks like you'll show zeros for these counters if debug is off. It can be
confusing to the user. I think these counters should go away if you don't
count them.

--
Kirill A. Shutemov

2020-02-03 21:36:27

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH v3 10/12] mm/gup: /proc/vmstat: pin_user_pages (FOLL_PIN) reporting

On 2/3/20 1:30 PM, Kirill A. Shutemov wrote:
> On Mon, Feb 03, 2020 at 01:04:04PM -0800, John Hubbard wrote:
>> On 2/3/20 5:53 AM, Kirill A. Shutemov wrote:
>>> On Fri, Jan 31, 2020 at 07:40:27PM -0800, John Hubbard wrote:
>>>> diff --git a/mm/gup.c b/mm/gup.c
>>>> index c10d0d051c5b..9fe61d15fc0e 100644
>>>> --- a/mm/gup.c
>>>> +++ b/mm/gup.c
>>>> @@ -29,6 +29,19 @@ struct follow_page_context {
>>>> unsigned int page_mask;
>>>> };
>>>>
>>>> +#ifdef CONFIG_DEBUG_VM
>>>
>>> Why under CONFIG_DEBUG_VM? There's nothing about this in the cover letter.
>>>
>>
>> Early on, gup_benchmark showed a really significant slowdown from using these
>> counters. And I don't doubt that it's still the case.
>>
>> I'll re-measure and add a short summary and a few numbers to the patch commit
>> description, and to the v4 cover letter.
>
> Looks like you'll show zeros for these counters if debug is off. It can be
> confusing to the user. I think these counters should go away if you don't
> count them.
>

OK, that's a good point. (And in fact, the counters==0 situation already led me
astray briefly while debugging with Leon R, even. heh.) I'll remove them entirely for
the !CONFIG_DEBUG_VM case.

thanks,
--
John Hubbard
NVIDIA

2020-02-03 23:19:22

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH v3 10/12] mm/gup: /proc/vmstat: pin_user_pages (FOLL_PIN) reporting

On 2/3/20 1:34 PM, John Hubbard wrote:
> On 2/3/20 1:30 PM, Kirill A. Shutemov wrote:
>> On Mon, Feb 03, 2020 at 01:04:04PM -0800, John Hubbard wrote:
>>> On 2/3/20 5:53 AM, Kirill A. Shutemov wrote:
>>>> On Fri, Jan 31, 2020 at 07:40:27PM -0800, John Hubbard wrote:
>>>>> diff --git a/mm/gup.c b/mm/gup.c
>>>>> index c10d0d051c5b..9fe61d15fc0e 100644
>>>>> --- a/mm/gup.c
>>>>> +++ b/mm/gup.c
>>>>> @@ -29,6 +29,19 @@ struct follow_page_context {
>>>>> unsigned int page_mask;
>>>>> };
>>>>>
>>>>> +#ifdef CONFIG_DEBUG_VM
>>>>
>>>> Why under CONFIG_DEBUG_VM? There's nothing about this in the cover letter.
>>>>
>>>
>>> Early on, gup_benchmark showed a really significant slowdown from using these
>>> counters. And I don't doubt that it's still the case.
>>>
>>> I'll re-measure and add a short summary and a few numbers to the patch commit
>>> description, and to the v4 cover letter.
>>
>> Looks like you'll show zeros for these counters if debug is off. It can be
>> confusing to the user. I think these counters should go away if you don't
>> count them.
>>
>
> OK, that's a good point. (And in fact, the counters==0 situation already led me
> astray briefly while debugging with Leon R, even. heh.) I'll remove them entirely for
> the !CONFIG_DEBUG_VM case.
>

On second thought, let me do some more careful performance testing. I don't recall
now if I was just removing every possible perf slowdown item, when I made this decision.
It could be that the perf is not affected, and I could just leave this feature enabled
at all times, which would be nicer.

And after all, these counters were designed for pretty hot-path items. I'll report back
with results...


thanks,
--
John Hubbard
NVIDIA

2020-02-03 23:45:41

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH v3 10/12] mm/gup: /proc/vmstat: pin_user_pages (FOLL_PIN) reporting

On 2/3/20 3:16 PM, John Hubbard wrote:
> On 2/3/20 1:34 PM, John Hubbard wrote:
>> On 2/3/20 1:30 PM, Kirill A. Shutemov wrote:
>>> On Mon, Feb 03, 2020 at 01:04:04PM -0800, John Hubbard wrote:
>>>> On 2/3/20 5:53 AM, Kirill A. Shutemov wrote:
>>>>> On Fri, Jan 31, 2020 at 07:40:27PM -0800, John Hubbard wrote:
>>>>>> diff --git a/mm/gup.c b/mm/gup.c
>>>>>> index c10d0d051c5b..9fe61d15fc0e 100644
>>>>>> --- a/mm/gup.c
>>>>>> +++ b/mm/gup.c
>>>>>> @@ -29,6 +29,19 @@ struct follow_page_context {
>>>>>> unsigned int page_mask;
>>>>>> };
>>>>>>
>>>>>> +#ifdef CONFIG_DEBUG_VM
>>>>>
>>>>> Why under CONFIG_DEBUG_VM? There's nothing about this in the cover letter.
>>>>>
>>>>
>>>> Early on, gup_benchmark showed a really significant slowdown from using these
>>>> counters. And I don't doubt that it's still the case.
>>>>
>>>> I'll re-measure and add a short summary and a few numbers to the patch commit
>>>> description, and to the v4 cover letter.
>>>
>>> Looks like you'll show zeros for these counters if debug is off. It can be
>>> confusing to the user. I think these counters should go away if you don't
>>> count them.
>>>
>>
>> OK, that's a good point. (And in fact, the counters==0 situation already led me
>> astray briefly while debugging with Leon R, even. heh.) I'll remove them entirely for
>> the !CONFIG_DEBUG_VM case.
>>
>
> On second thought, let me do some more careful performance testing. I don't recall
> now if I was just removing every possible perf slowdown item, when I made this decision.
> It could be that the perf is not affected, and I could just leave this feature enabled
> at all times, which would be nicer.
>
> And after all, these counters were designed for pretty hot-path items. I'll report back
> with results...


Sure enough, any perf effects are hidden in the noise. I'll just remove the CONFIG_DEBUG_VM
checks. Glad you asked about this!



thanks,
--
John Hubbard
NVIDIA