2020-05-18 06:48:05

by Anshuman Khandual

[permalink] [raw]
Subject: [RFC V2] mm/vmstat: Add events for PMD based THP migration without split

This adds the following two new VM events which will help in validating PMD
based THP migration without split. Statistics reported through these events
will help in performance debugging.

1. THP_PMD_MIGRATION_SUCCESS
2. THP_PMD_MIGRATION_FAILURE

Cc: Naoya Horiguchi <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Anshuman Khandual <[email protected]>
---
Changes in RFC V2:

- Decopupled and renamed VM events from their implementation per Zi and John
- Added THP_PMD_MIGRATION_FAILURE VM event upon allocation failure and split

Changes in RFC V1: (https://patchwork.kernel.org/patch/11542055/)

include/linux/vm_event_item.h | 4 ++++
mm/migrate.c | 15 +++++++++++++++
mm/vmstat.c | 4 ++++
3 files changed, 23 insertions(+)

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index ffef0f279747..23d8f9884c2b 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -91,6 +91,10 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
THP_ZERO_PAGE_ALLOC_FAILED,
THP_SWPOUT,
THP_SWPOUT_FALLBACK,
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+ THP_PMD_MIGRATION_SUCCESS,
+ THP_PMD_MIGRATION_FAILURE,
+#endif
#endif
#ifdef CONFIG_MEMORY_BALLOON
BALLOON_INFLATE,
diff --git a/mm/migrate.c b/mm/migrate.c
index 7160c1556f79..5325700a3e90 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1170,6 +1170,18 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
#define ICE_noinline
#endif

+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+static inline void thp_migration_success(bool success)
+{
+ if (success)
+ count_vm_event(THP_PMD_MIGRATION_SUCCESS);
+ else
+ count_vm_event(THP_PMD_MIGRATION_FAILURE);
+}
+#else
+static inline void thp_migration_success(bool success) { }
+#endif
+
/*
* Obtain the lock on page, remove all ptes and migrate the page
* to the newly allocated page in newpage.
@@ -1232,6 +1244,8 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
* we want to retry.
*/
if (rc == MIGRATEPAGE_SUCCESS) {
+ if (PageTransHuge(newpage))
+ thp_migration_success(true);
put_page(page);
if (reason == MR_MEMORY_FAILURE) {
/*
@@ -1474,6 +1488,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
unlock_page(page);
if (!rc) {
list_safe_reset_next(page, page2, lru);
+ thp_migration_success(false);
goto retry;
}
}
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 96d21a792b57..e258c782fd3a 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1274,6 +1274,10 @@ const char * const vmstat_text[] = {
"thp_zero_page_alloc_failed",
"thp_swpout",
"thp_swpout_fallback",
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+ "thp_pmd_migration_success",
+ "thp_pmd_migration_failure",
+#endif
#endif
#ifdef CONFIG_MEMORY_BALLOON
"balloon_inflate",
--
2.20.1


2020-05-18 20:12:48

by John Hubbard

[permalink] [raw]
Subject: Re: [RFC V2] mm/vmstat: Add events for PMD based THP migration without split

On 2020-05-17 23:42, Anshuman Khandual wrote:
...
> diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
> index ffef0f279747..23d8f9884c2b 100644
> --- a/include/linux/vm_event_item.h
> +++ b/include/linux/vm_event_item.h
> @@ -91,6 +91,10 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
> THP_ZERO_PAGE_ALLOC_FAILED,
> THP_SWPOUT,
> THP_SWPOUT_FALLBACK,
> +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION


Hi Anshuman,

These new events look great to me. I have some nits below, plus one
lightly controversial suggestion which I'd really like to have someone
more experienced weigh in on, which is:

How about not being quite so granular on the THP config options, and
just guarding these events with the overall CONFIG_TRANSPARENT_HUGEPAGE
option, instead of the sub-option CONFIG_ARCH_ENABLE_THP_MIGRATION?

I tentatively think it's harmless and not really misleading to have
/proc/vmstat showing this in all THP-enabled configurations:

thp_pmd_migration_success 0
thp_pmd_migration_failure 0

...if THP is enabled, and *whether or not* _THP_MIGRATION is enabled.
And this simplifies things a bit. Given how the .config options can get,
I think simplifying would be nice.

However, I'm ready to be corrected on that, if it's a bad idea for
other API reasons perhaps. Can anyone please comment?


> + THP_PMD_MIGRATION_SUCCESS,
> + THP_PMD_MIGRATION_FAILURE,
> +#endif
> #endif
> #ifdef CONFIG_MEMORY_BALLOON
> BALLOON_INFLATE,
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 7160c1556f79..5325700a3e90 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1170,6 +1170,18 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
> #define ICE_noinline
> #endif
>
> +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
> +static inline void thp_migration_success(bool success)


I think this should be named

thp_pmd_migration_success()

, since that's what you're really counting. Or, you could
name the events THP_MIGRATION_SUCCESS|FAILURE. Either way,
just so the function name matches the events it's counting.


> +{
> + if (success)
> + count_vm_event(THP_PMD_MIGRATION_SUCCESS);
> + else
> + count_vm_event(THP_PMD_MIGRATION_FAILURE);
> +}
> +#else
> +static inline void thp_migration_success(bool success) { }


This whole ifdef clause would disappear if my suggestion above is
accepted. However, if not, then I believe the convention for this
kind of situation is:

static inline void thp_migration_success(bool success)
{
}


> +#endif
> +
> /*
> * Obtain the lock on page, remove all ptes and migrate the page
> * to the newly allocated page in newpage.
> @@ -1232,6 +1244,8 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
> * we want to retry.
> */
> if (rc == MIGRATEPAGE_SUCCESS) {
> + if (PageTransHuge(newpage))
> + thp_migration_success(true);
> put_page(page);
> if (reason == MR_MEMORY_FAILURE) {
> /*
> @@ -1474,6 +1488,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> unlock_page(page);
> if (!rc) {
> list_safe_reset_next(page, page2, lru);
> + thp_migration_success(false);
> goto retry;
> }
> }
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 96d21a792b57..e258c782fd3a 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1274,6 +1274,10 @@ const char * const vmstat_text[] = {
> "thp_zero_page_alloc_failed",
> "thp_swpout",
> "thp_swpout_fallback",
> +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
> + "thp_pmd_migration_success",
> + "thp_pmd_migration_failure",
> +#endif
> #endif
> #ifdef CONFIG_MEMORY_BALLOON
> "balloon_inflate",
>

thanks,
--
John Hubbard
NVIDIA

2020-05-20 03:35:09

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [RFC V2] mm/vmstat: Add events for PMD based THP migration without split



On 05/19/2020 01:40 AM, John Hubbard wrote:
> On 2020-05-17 23:42, Anshuman Khandual wrote:
> ...
>> diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
>> index ffef0f279747..23d8f9884c2b 100644
>> --- a/include/linux/vm_event_item.h
>> +++ b/include/linux/vm_event_item.h
>> @@ -91,6 +91,10 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
>>           THP_ZERO_PAGE_ALLOC_FAILED,
>>           THP_SWPOUT,
>>           THP_SWPOUT_FALLBACK,
>> +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
>
>
> Hi Anshuman,
>
> These new events look great to me. I have some nits below, plus one
> lightly controversial suggestion which I'd really like to have someone
> more experienced weigh in on, which is:
>
> How about not being quite so granular on the THP config options, and
> just guarding these events with the overall CONFIG_TRANSPARENT_HUGEPAGE
> option, instead of the sub-option CONFIG_ARCH_ENABLE_THP_MIGRATION?
>
> I tentatively think it's harmless and not really misleading to have
> /proc/vmstat showing this in all THP-enabled configurations:
>
> thp_pmd_migration_success 0
> thp_pmd_migration_failure 0
>
> ...if THP is enabled, and *whether or not* _THP_MIGRATION is enabled.
> And this simplifies things a bit. Given how the .config options can get,
> I think simplifying would be nice.
>
> However, I'm ready to be corrected on that, if it's a bad idea for
> other API reasons perhaps.  Can anyone please comment?

There is no THP migration events to track unless it is enabled. Why to
show these statistics (as 0) when its not even possible. If the config
simplicity is the only intended rationale here, it might not be the
case either. These events and their tracking would still need to be
wrapped with CONFIG_TRANSPARENT_HUGEPAGE otherwise.

If your concern is more towards CONFIG_ARCH_ENABLE_THP_MIGRATION being
unsuitable or with complex dependencies, then that is something how THP
migration feature itself is implemented currently and adding VM events
does not address that. A possible patch in the future patch could solve
all these (together).

But sure, let's hear it for what others have to say on this.

>
>
>> +        THP_PMD_MIGRATION_SUCCESS,
>> +        THP_PMD_MIGRATION_FAILURE,
>> +#endif
>>   #endif
>>   #ifdef CONFIG_MEMORY_BALLOON
>>           BALLOON_INFLATE,
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 7160c1556f79..5325700a3e90 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1170,6 +1170,18 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>>   #define ICE_noinline
>>   #endif
>>   +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
>> +static inline void thp_migration_success(bool success)
>
>
> I think this should be named
>
>     thp_pmd_migration_success()
>
> , since that's what you're really counting. Or, you could
> name the events THP_MIGRATION_SUCCESS|FAILURE. Either way,
> just so the function name matches the events it's counting.

Makes sense but IMHO we should keep _pmd_ to be more specific.
Will change the name here as thp_pmd_migration_success().

>
>
>> +{
>> +    if (success)
>> +        count_vm_event(THP_PMD_MIGRATION_SUCCESS);
>> +    else
>> +        count_vm_event(THP_PMD_MIGRATION_FAILURE);
>> +}
>> +#else
>> +static inline void thp_migration_success(bool success) { }
>
>
> This whole ifdef clause would disappear if my suggestion above is

We will have to protect these with CONFIG_TRANSPARENT_HUGEPAGE as
the events are still conditionally available.

> accepted. However, if not, then I believe the convention for this
> kind of situation is:
>
> static inline void thp_migration_success(bool success)
> {
> }

AFAIK, we have examples both ways but will change if this is preferred.

2020-05-20 05:20:15

by John Hubbard

[permalink] [raw]
Subject: Re: [RFC V2] mm/vmstat: Add events for PMD based THP migration without split

On 2020-05-19 20:32, Anshuman Khandual wrote:
...
>> How about not being quite so granular on the THP config options, and
>> just guarding these events with the overall CONFIG_TRANSPARENT_HUGEPAGE
>> option, instead of the sub-option CONFIG_ARCH_ENABLE_THP_MIGRATION?
>>
>> I tentatively think it's harmless and not really misleading to have
>> /proc/vmstat showing this in all THP-enabled configurations:
>>
>> thp_pmd_migration_success 0
>> thp_pmd_migration_failure 0
>>
>> ...if THP is enabled, and *whether or not* _THP_MIGRATION is enabled.
>> And this simplifies things a bit. Given how the .config options can get,
>> I think simplifying would be nice.
>>
>> However, I'm ready to be corrected on that, if it's a bad idea for
>> other API reasons perhaps.  Can anyone please comment?
>
> There is no THP migration events to track unless it is enabled. Why to
> show these statistics (as 0) when its not even possible. If the config
> simplicity is the only intended rationale here, it might not be the
> case either. These events and their tracking would still need to be
> wrapped with CONFIG_TRANSPARENT_HUGEPAGE otherwise.
>
> If your concern is more towards CONFIG_ARCH_ENABLE_THP_MIGRATION being
> unsuitable or with complex dependencies, then that is something how THP
> migration feature itself is implemented currently and adding VM events
> does not address that. A possible patch in the future patch could solve
> all these (together).
>
> But sure, let's hear it for what others have to say on this.


Well, I don't want to hold up progress. If it's not very convincing to you,
let's just drop the idea/ It was kind of weak. :)


>>> +        THP_PMD_MIGRATION_SUCCESS,
>>> +        THP_PMD_MIGRATION_FAILURE,
>>> +#endif
>>>   #endif
>>>   #ifdef CONFIG_MEMORY_BALLOON
>>>           BALLOON_INFLATE,
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index 7160c1556f79..5325700a3e90 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -1170,6 +1170,18 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>>>   #define ICE_noinline
>>>   #endif
>>>   +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
>>> +static inline void thp_migration_success(bool success)
>>
>>
>> I think this should be named
>>
>>     thp_pmd_migration_success()
>>
>> , since that's what you're really counting. Or, you could
>> name the events THP_MIGRATION_SUCCESS|FAILURE. Either way,
>> just so the function name matches the events it's counting.
>
> Makes sense but IMHO we should keep _pmd_ to be more specific.
> Will change the name here as thp_pmd_migration_success().
>
>>
>>
>>> +{
>>> +    if (success)
>>> +        count_vm_event(THP_PMD_MIGRATION_SUCCESS);
>>> +    else
>>> +        count_vm_event(THP_PMD_MIGRATION_FAILURE);
>>> +}
>>> +#else
>>> +static inline void thp_migration_success(bool success) { }
>>
>>
>> This whole ifdef clause would disappear if my suggestion above is
>
> We will have to protect these with CONFIG_TRANSPARENT_HUGEPAGE as
> the events are still conditionally available.
>

Yes you are right, of course. And I even worked through that, but then
when I sat down to write a response my fingers typed v1 of my understanding
instead of v2. No one knows why. :) Sorry about the misinformation there.

>> accepted. However, if not, then I believe the convention for this
>> kind of situation is:
>>
>> static inline void thp_migration_success(bool success)
>> {
>> }
>
> AFAIK, we have examples both ways but will change if this is preferred.
>

Not worth worrying about, but I do recall a few recent code reviews that
all preferred the multi-line version, which is why I suggested it.

Anyway, either way, with the thp_pmd_migration_success() name change, you
can add:

Reviewed-by: John Hubbard <[email protected]>


thanks,
--
John Hubbard
NVIDIA

Subject: Re: [RFC V2] mm/vmstat: Add events for PMD based THP migration without split

On Mon, May 18, 2020 at 12:12:36PM +0530, Anshuman Khandual wrote:
> This adds the following two new VM events which will help in validating PMD
> based THP migration without split. Statistics reported through these events
> will help in performance debugging.
>
> 1. THP_PMD_MIGRATION_SUCCESS
> 2. THP_PMD_MIGRATION_FAILURE
>
> Cc: Naoya Horiguchi <[email protected]>
> Cc: Zi Yan <[email protected]>
> Cc: John Hubbard <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Anshuman Khandual <[email protected]>

Hi Anshuman,

I'm neutral for additinal lines in /proc/vmstat. It's a classic (so widely
used) but inflexible interface. Users disabling thp are not happy with many
thp-related lines, but judging from the fact that we already have many
thp-related lines some users really need them. So I feel hard to decide to
agree or disagree with additional lines.

I think that tracepoints are the more flexible interfaces for monitoring,
so I'm interested more in whether thp migration could be monitorable via
tracepoint. Do you have any idea/plan on it?

Thanks,
Naoya Horiguchi

2020-05-21 04:12:45

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [RFC V2] mm/vmstat: Add events for PMD based THP migration without split



On 05/20/2020 12:45 PM, HORIGUCHI NAOYA($BKY8}!!D>Li(B) wrote:
> On Mon, May 18, 2020 at 12:12:36PM +0530, Anshuman Khandual wrote:
>> This adds the following two new VM events which will help in validating PMD
>> based THP migration without split. Statistics reported through these events
>> will help in performance debugging.
>>
>> 1. THP_PMD_MIGRATION_SUCCESS
>> 2. THP_PMD_MIGRATION_FAILURE
>>
>> Cc: Naoya Horiguchi <[email protected]>
>> Cc: Zi Yan <[email protected]>
>> Cc: John Hubbard <[email protected]>
>> Cc: Andrew Morton <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Signed-off-by: Anshuman Khandual <[email protected]>
>
> Hi Anshuman,

Hi Naoya,

>
> I'm neutral for additinal lines in /proc/vmstat. It's a classic (so widely
> used) but inflexible interface. Users disabling thp are not happy with many
> thp-related lines, but judging from the fact that we already have many

Right, for similar reason, I am not too keen on enabling these counters
without migration being enabled with ARCH_ENABLE_THP_MIGRATION.

> thp-related lines some users really need them. So I feel hard to decide to
> agree or disagree with additional lines.

Currently these are conditional on ARCH_ENABLE_THP_MIGRATION. So we are
not adding these new lines unless it migration is available and enabled.

>
> I think that tracepoints are the more flexible interfaces for monitoring,
> so I'm interested more in whether thp migration could be monitorable via
> tracepoint. Do you have any idea/plan on it?

Sure, we can add some trace points as well which can give more granular
details regarding THP migration mechanism itself e.g setting and removing
PMD migration entries etc probably with (vaddr, pmdp, pmd) details.

But we will still need /proc/vmstat entries that will be available right
away without requiring additional steps. This simplicity is essential for
folks to consider using these events more often.

Sure, will look into what trace points can be added for THP migration but
in a subsequent patch.

- Anshuman