2020-06-04 04:34:09

by Anshuman Khandual

[permalink] [raw]
Subject: [PATCH V2] mm/vmstat: Add events for THP migration without split

Add the following new VM events which will help in validating THP migration
without split. Statistics reported through these new events will help in
performance debugging.

1. THP_MIGRATION_SUCCESS
2. THP_MIGRATION_FAILURE

THP_MIGRATION_FAILURE in particular represents an event when a THP could
not be migrated as a single entity following an allocation failure and
ended up getting split into constituent normal pages before being retried.
This event, along with PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will help in
quantifying and analyzing THP migration events including both success and
failure cases.

Cc: Naoya Horiguchi <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: [email protected]
Cc: [email protected]
[hughd: fixed oops on NULL newpage]
Signed-off-by: Anshuman Khandual <[email protected]>
---
Changes in V2:

- Dropped PMD reference both from code and commit message per Matthew
- Added documentation and updated the commit message per Daniel

Changes in V1: (https://patchwork.kernel.org/patch/11564497/)

- Changed function name as thp_pmd_migration_success() per John
- Folded in a fix (https://patchwork.kernel.org/patch/11563009/) from Hugh

Changes in RFC V2: (https://patchwork.kernel.org/patch/11554861/)

- Decopupled and renamed VM events from their implementation per Zi and John
- Added THP_PMD_MIGRATION_FAILURE VM event upon allocation failure and split

Changes in RFC V1: (https://patchwork.kernel.org/patch/11542055/)

Documentation/vm/page_migration.rst | 15 +++++++++++++++
include/linux/vm_event_item.h | 4 ++++
mm/migrate.c | 23 +++++++++++++++++++++++
mm/vmstat.c | 4 ++++
4 files changed, 46 insertions(+)

diff --git a/Documentation/vm/page_migration.rst b/Documentation/vm/page_migration.rst
index 1d6cd7db4e43..67e9b067fed7 100644
--- a/Documentation/vm/page_migration.rst
+++ b/Documentation/vm/page_migration.rst
@@ -253,5 +253,20 @@ which are function pointers of struct address_space_operations.
PG_isolated is alias with PG_reclaim flag so driver shouldn't use the flag
for own purpose.

+Quantifying Migration
+=====================
+Following events can be used to quantify page migration.
+
+- PGMIGRATE_SUCCESS
+- PGMIGRATE_FAIL
+- THP_MIGRATION_SUCCESS
+- THP_MIGRATION_FAILURE
+
+THP_MIGRATION_FAILURE in particular represents an event when a THP could not be
+migrated as a single entity following an allocation failure and ended up getting
+split into constituent normal pages before being retried. This event, along with
+PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will help in quantifying and analyzing THP
+migration events including both success and failure cases.
+
Christoph Lameter, May 8, 2006.
Minchan Kim, Mar 28, 2016.
diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index ffef0f279747..6459265461df 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -91,6 +91,10 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
THP_ZERO_PAGE_ALLOC_FAILED,
THP_SWPOUT,
THP_SWPOUT_FALLBACK,
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+ THP_MIGRATION_SUCCESS,
+ THP_MIGRATION_FAILURE,
+#endif
#endif
#ifdef CONFIG_MEMORY_BALLOON
BALLOON_INFLATE,
diff --git a/mm/migrate.c b/mm/migrate.c
index 7160c1556f79..0bb1dbb891bb 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1170,6 +1170,20 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
#define ICE_noinline
#endif

+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+static inline void thp_migration_success(bool success)
+{
+ if (success)
+ count_vm_event(THP_MIGRATION_SUCCESS);
+ else
+ count_vm_event(THP_MIGRATION_FAILURE);
+}
+#else
+static inline void thp_migration_success(bool success)
+{
+}
+#endif
+
/*
* Obtain the lock on page, remove all ptes and migrate the page
* to the newly allocated page in newpage.
@@ -1232,6 +1246,14 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
* we want to retry.
*/
if (rc == MIGRATEPAGE_SUCCESS) {
+ /*
+ * When the page to be migrated has been freed from under
+ * us, that is considered a MIGRATEPAGE_SUCCESS, but no
+ * newpage has been allocated. It should not be counted
+ * as a successful THP migration.
+ */
+ if (newpage && PageTransHuge(newpage))
+ thp_migration_success(true);
put_page(page);
if (reason == MR_MEMORY_FAILURE) {
/*
@@ -1474,6 +1496,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
unlock_page(page);
if (!rc) {
list_safe_reset_next(page, page2, lru);
+ thp_migration_success(false);
goto retry;
}
}
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 96d21a792b57..4ce1ab2e9704 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1274,6 +1274,10 @@ const char * const vmstat_text[] = {
"thp_zero_page_alloc_failed",
"thp_swpout",
"thp_swpout_fallback",
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+ "thp_migration_success",
+ "thp_migration_failure",
+#endif
#endif
#ifdef CONFIG_MEMORY_BALLOON
"balloon_inflate",
--
2.20.1


2020-06-04 12:32:53

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH V2] mm/vmstat: Add events for THP migration without split

On Thu, Jun 04, 2020 at 09:30:45AM +0530, Anshuman Khandual wrote:
> Add the following new VM events which will help in validating THP migration
> without split. Statistics reported through these new events will help in
> performance debugging.
>
> 1. THP_MIGRATION_SUCCESS
> 2. THP_MIGRATION_FAILURE
>
> THP_MIGRATION_FAILURE in particular represents an event when a THP could
> not be migrated as a single entity following an allocation failure and
> ended up getting split into constituent normal pages before being retried.
> This event, along with PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will help in
> quantifying and analyzing THP migration events including both success and
> failure cases.

> +Quantifying Migration
> +=====================
> +Following events can be used to quantify page migration.
> +
> +- PGMIGRATE_SUCCESS
> +- PGMIGRATE_FAIL
> +- THP_MIGRATION_SUCCESS
> +- THP_MIGRATION_FAILURE
> +
> +THP_MIGRATION_FAILURE in particular represents an event when a THP could not be
> +migrated as a single entity following an allocation failure and ended up getting
> +split into constituent normal pages before being retried. This event, along with
> +PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will help in quantifying and analyzing THP
> +migration events including both success and failure cases.

First, I'd suggest running this paragraph through 'fmt'. That way you
don't have to care about line lengths.

Second, this paragraph doesn't really explain what I need to know to
understand the meaning of these numbers. When Linux attempts to migrate
a THP, one of three things can happen:

- It is migrated as a single THP
- It is migrated, but had to be split
- Migration fails

How do I turn these four numbers into an understanding of how often each
of those three situations happen? And why do we need four numbers to
report three situations?

Or is there something else that can happen? If so, I'd like that explained
here too ;-)

2020-06-04 13:54:33

by Zi Yan

[permalink] [raw]
Subject: Re: [PATCH V2] mm/vmstat: Add events for THP migration without split

On 4 Jun 2020, at 7:34, Matthew Wilcox wrote:

> On Thu, Jun 04, 2020 at 09:30:45AM +0530, Anshuman Khandual wrote:
>> Add the following new VM events which will help in validating THP migration
>> without split. Statistics reported through these new events will help in
>> performance debugging.
>>
>> 1. THP_MIGRATION_SUCCESS
>> 2. THP_MIGRATION_FAILURE
>>
>> THP_MIGRATION_FAILURE in particular represents an event when a THP could
>> not be migrated as a single entity following an allocation failure and
>> ended up getting split into constituent normal pages before being retried.
>> This event, along with PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will help in
>> quantifying and analyzing THP migration events including both success and
>> failure cases.
>
>> +Quantifying Migration
>> +=====================
>> +Following events can be used to quantify page migration.
>> +
>> +- PGMIGRATE_SUCCESS
>> +- PGMIGRATE_FAIL
>> +- THP_MIGRATION_SUCCESS
>> +- THP_MIGRATION_FAILURE
>> +
>> +THP_MIGRATION_FAILURE in particular represents an event when a THP could not be
>> +migrated as a single entity following an allocation failure and ended up getting
>> +split into constituent normal pages before being retried. This event, along with
>> +PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will help in quantifying and analyzing THP
>> +migration events including both success and failure cases.
>
> First, I'd suggest running this paragraph through 'fmt'. That way you
> don't have to care about line lengths.
>
> Second, this paragraph doesn't really explain what I need to know to
> understand the meaning of these numbers. When Linux attempts to migrate
> a THP, one of three things can happen:
>
> - It is migrated as a single THP
> - It is migrated, but had to be split
> - Migration fails
>
> How do I turn these four numbers into an understanding of how often each
> of those three situations happen? And why do we need four numbers to
> report three situations?
>
> Or is there something else that can happen? If so, I'd like that explained
> here too ;-)

PGMIGRATE_SUCCESS and PGMIGRATE_FAIL record a combination of different events,
so it is not easy to interpret them. Let me try to explain them.

1. migrating only base pages: PGMIGRATE_SUCCESS and PGMIGRATE_FAIL just mean
these base pages are migrated and fail to migrate respectively.
THP_MIGRATION_SUCCESS and THP_MIGRATION_FAILURE should be 0 in this case.
Simple.

2. migrating only THPs:
- PGMIGRATE_SUCCESS means THPs that are migrated and base pages
(from the split of THPs) that are migrated,

- PGMIGRATE_FAIL means THPs that fail to migrate and base pages that fail to migrated.

- THP_MIGRATION_SUCCESS means THPs that are migrated.

- THP_MIGRATION_FAILURE means THPs that are split.

So PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS means the number of migrated base pages,
which are from the split of THPs.

When it comes to analyze failed migration, PGMIGRATE_FAIL - THP_MIGRATION_FAILURE
means the number of pages that are failed to migrate, but we cannot tell how many
are base pages and how many are THPs.

3. migrating base pages and THP:

The math should be very similar to the second case, except that
a) from PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS, we cannot tell how many are pages begin
as base pages and how many are pages begin as THPs but become base pages after split;
b) from PGMIGRATE_FAIL - THP_MIGRATION_FAILURE, an additional case,
base pages that begin as base pages fail to migrate, is mixed into the number and we
cannot tell three cases apart.



Best Regards,
Yan Zi


Attachments:
signature.asc (871.00 B)
OpenPGP digital signature

2020-06-04 16:51:59

by Zi Yan

[permalink] [raw]
Subject: Re: [PATCH V2] mm/vmstat: Add events for THP migration without split

On 4 Jun 2020, at 12:36, Matthew Wilcox wrote:

> On Thu, Jun 04, 2020 at 09:51:10AM -0400, Zi Yan wrote:
>> On 4 Jun 2020, at 7:34, Matthew Wilcox wrote:
>>> On Thu, Jun 04, 2020 at 09:30:45AM +0530, Anshuman Khandual wrote:
>>>> +Quantifying Migration
>>>> +=====================
>>>> +Following events can be used to quantify page migration.
>>>> +
>>>> +- PGMIGRATE_SUCCESS
>>>> +- PGMIGRATE_FAIL
>>>> +- THP_MIGRATION_SUCCESS
>>>> +- THP_MIGRATION_FAILURE
>>>> +
>>>> +THP_MIGRATION_FAILURE in particular represents an event when a THP could not be
>>>> +migrated as a single entity following an allocation failure and ended up getting
>>>> +split into constituent normal pages before being retried. This event, along with
>>>> +PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will help in quantifying and analyzing THP
>>>> +migration events including both success and failure cases.
>>>
>>> First, I'd suggest running this paragraph through 'fmt'. That way you
>>> don't have to care about line lengths.
>>>
>>> Second, this paragraph doesn't really explain what I need to know to
>>> understand the meaning of these numbers. When Linux attempts to migrate
>>> a THP, one of three things can happen:
>>>
>>> - It is migrated as a single THP
>>> - It is migrated, but had to be split
>>> - Migration fails
>>>
>>> How do I turn these four numbers into an understanding of how often each
>>> of those three situations happen? And why do we need four numbers to
>>> report three situations?
>>>
>>> Or is there something else that can happen? If so, I'd like that explained
>>> here too ;-)
>>
>> PGMIGRATE_SUCCESS and PGMIGRATE_FAIL record a combination of different events,
>> so it is not easy to interpret them. Let me try to explain them.
>
> Thanks! Very helpful explanation.
>
>> 1. migrating only base pages: PGMIGRATE_SUCCESS and PGMIGRATE_FAIL just mean
>> these base pages are migrated and fail to migrate respectively.
>> THP_MIGRATION_SUCCESS and THP_MIGRATION_FAILURE should be 0 in this case.
>> Simple.
>>
>> 2. migrating only THPs:
>> - PGMIGRATE_SUCCESS means THPs that are migrated and base pages
>> (from the split of THPs) that are migrated,
>>
>> - PGMIGRATE_FAIL means THPs that fail to migrate and base pages that fail to migrated.
>>
>> - THP_MIGRATION_SUCCESS means THPs that are migrated.
>>
>> - THP_MIGRATION_FAILURE means THPs that are split.
>>
>> So PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS means the number of migrated base pages,
>> which are from the split of THPs.
>
> Are you sure about that? If I split a THP and each of those subpages
> migrates, won't I then see PGMIGRATE_SUCCESS increase by 512?

That is what I mean. I guess my words did not work. I should have used subpages.

>
>> When it comes to analyze failed migration, PGMIGRATE_FAIL - THP_MIGRATION_FAILURE
>> means the number of pages that are failed to migrate, but we cannot tell how many
>> are base pages and how many are THPs.
>>
>> 3. migrating base pages and THP:
>>
>> The math should be very similar to the second case, except that
>> a) from PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS, we cannot tell how many are pages begin
>> as base pages and how many are pages begin as THPs but become base pages after split;
>> b) from PGMIGRATE_FAIL - THP_MIGRATION_FAILURE, an additional case,
>> base pages that begin as base pages fail to migrate, is mixed into the number and we
>> cannot tell three cases apart.
>
> So why don't we just expose PGMIGRATE_SPLIT? That would be defined as
> the number of times we succeeded in migrating a THP but had to split it
> to succeed.

It might need extra code to get that number. Currently, the subpages from split
THPs are appended to the end of the original page list, so we might need a separate
page list for these subpages to count PGMIGRATE_SPLIT. Also what if some of the
subpages fail to migrate? Do we increase PGMIGRATE_SPLIT or not?



--
Best Regards,
Yan Zi


Attachments:
signature.asc (871.00 B)
OpenPGP digital signature

2020-06-04 18:58:22

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH V2] mm/vmstat: Add events for THP migration without split

On Thu, Jun 04, 2020 at 09:51:10AM -0400, Zi Yan wrote:
> On 4 Jun 2020, at 7:34, Matthew Wilcox wrote:
> > On Thu, Jun 04, 2020 at 09:30:45AM +0530, Anshuman Khandual wrote:
> >> +Quantifying Migration
> >> +=====================
> >> +Following events can be used to quantify page migration.
> >> +
> >> +- PGMIGRATE_SUCCESS
> >> +- PGMIGRATE_FAIL
> >> +- THP_MIGRATION_SUCCESS
> >> +- THP_MIGRATION_FAILURE
> >> +
> >> +THP_MIGRATION_FAILURE in particular represents an event when a THP could not be
> >> +migrated as a single entity following an allocation failure and ended up getting
> >> +split into constituent normal pages before being retried. This event, along with
> >> +PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will help in quantifying and analyzing THP
> >> +migration events including both success and failure cases.
> >
> > First, I'd suggest running this paragraph through 'fmt'. That way you
> > don't have to care about line lengths.
> >
> > Second, this paragraph doesn't really explain what I need to know to
> > understand the meaning of these numbers. When Linux attempts to migrate
> > a THP, one of three things can happen:
> >
> > - It is migrated as a single THP
> > - It is migrated, but had to be split
> > - Migration fails
> >
> > How do I turn these four numbers into an understanding of how often each
> > of those three situations happen? And why do we need four numbers to
> > report three situations?
> >
> > Or is there something else that can happen? If so, I'd like that explained
> > here too ;-)
>
> PGMIGRATE_SUCCESS and PGMIGRATE_FAIL record a combination of different events,
> so it is not easy to interpret them. Let me try to explain them.

Thanks! Very helpful explanation.

> 1. migrating only base pages: PGMIGRATE_SUCCESS and PGMIGRATE_FAIL just mean
> these base pages are migrated and fail to migrate respectively.
> THP_MIGRATION_SUCCESS and THP_MIGRATION_FAILURE should be 0 in this case.
> Simple.
>
> 2. migrating only THPs:
> - PGMIGRATE_SUCCESS means THPs that are migrated and base pages
> (from the split of THPs) that are migrated,
>
> - PGMIGRATE_FAIL means THPs that fail to migrate and base pages that fail to migrated.
>
> - THP_MIGRATION_SUCCESS means THPs that are migrated.
>
> - THP_MIGRATION_FAILURE means THPs that are split.
>
> So PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS means the number of migrated base pages,
> which are from the split of THPs.

Are you sure about that? If I split a THP and each of those subpages
migrates, won't I then see PGMIGRATE_SUCCESS increase by 512?

> When it comes to analyze failed migration, PGMIGRATE_FAIL - THP_MIGRATION_FAILURE
> means the number of pages that are failed to migrate, but we cannot tell how many
> are base pages and how many are THPs.
>
> 3. migrating base pages and THP:
>
> The math should be very similar to the second case, except that
> a) from PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS, we cannot tell how many are pages begin
> as base pages and how many are pages begin as THPs but become base pages after split;
> b) from PGMIGRATE_FAIL - THP_MIGRATION_FAILURE, an additional case,
> base pages that begin as base pages fail to migrate, is mixed into the number and we
> cannot tell three cases apart.

So why don't we just expose PGMIGRATE_SPLIT? That would be defined as
the number of times we succeeded in migrating a THP but had to split it
to succeed.


2020-06-05 03:40:14

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [PATCH V2] mm/vmstat: Add events for THP migration without split



On 06/04/2020 10:19 PM, Zi Yan wrote:
> On 4 Jun 2020, at 12:36, Matthew Wilcox wrote:
>
>> On Thu, Jun 04, 2020 at 09:51:10AM -0400, Zi Yan wrote:
>>> On 4 Jun 2020, at 7:34, Matthew Wilcox wrote:
>>>> On Thu, Jun 04, 2020 at 09:30:45AM +0530, Anshuman Khandual wrote:
>>>>> +Quantifying Migration
>>>>> +=====================
>>>>> +Following events can be used to quantify page migration.
>>>>> +
>>>>> +- PGMIGRATE_SUCCESS
>>>>> +- PGMIGRATE_FAIL
>>>>> +- THP_MIGRATION_SUCCESS
>>>>> +- THP_MIGRATION_FAILURE
>>>>> +
>>>>> +THP_MIGRATION_FAILURE in particular represents an event when a THP could not be
>>>>> +migrated as a single entity following an allocation failure and ended up getting
>>>>> +split into constituent normal pages before being retried. This event, along with
>>>>> +PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will help in quantifying and analyzing THP
>>>>> +migration events including both success and failure cases.
>>>>
>>>> First, I'd suggest running this paragraph through 'fmt'. That way you
>>>> don't have to care about line lengths.
>>>>
>>>> Second, this paragraph doesn't really explain what I need to know to
>>>> understand the meaning of these numbers. When Linux attempts to migrate
>>>> a THP, one of three things can happen:
>>>>
>>>> - It is migrated as a single THP
>>>> - It is migrated, but had to be split
>>>> - Migration fails
>>>>
>>>> How do I turn these four numbers into an understanding of how often each
>>>> of those three situations happen? And why do we need four numbers to
>>>> report three situations?
>>>>
>>>> Or is there something else that can happen? If so, I'd like that explained
>>>> here too ;-)
>>>
>>> PGMIGRATE_SUCCESS and PGMIGRATE_FAIL record a combination of different events,
>>> so it is not easy to interpret them. Let me try to explain them.
>>
>> Thanks! Very helpful explanation.
>>
>>> 1. migrating only base pages: PGMIGRATE_SUCCESS and PGMIGRATE_FAIL just mean
>>> these base pages are migrated and fail to migrate respectively.
>>> THP_MIGRATION_SUCCESS and THP_MIGRATION_FAILURE should be 0 in this case.
>>> Simple.
>>>
>>> 2. migrating only THPs:
>>> - PGMIGRATE_SUCCESS means THPs that are migrated and base pages
>>> (from the split of THPs) that are migrated,
>>>
>>> - PGMIGRATE_FAIL means THPs that fail to migrate and base pages that fail to migrated.
>>>
>>> - THP_MIGRATION_SUCCESS means THPs that are migrated.
>>>
>>> - THP_MIGRATION_FAILURE means THPs that are split.
>>>
>>> So PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS means the number of migrated base pages,
>>> which are from the split of THPs.
>>
>> Are you sure about that? If I split a THP and each of those subpages
>> migrates, won't I then see PGMIGRATE_SUCCESS increase by 512?
>
> That is what I mean. I guess my words did not work. I should have used subpages.
>
>>
>>> When it comes to analyze failed migration, PGMIGRATE_FAIL - THP_MIGRATION_FAILURE
>>> means the number of pages that are failed to migrate, but we cannot tell how many
>>> are base pages and how many are THPs.
>>>
>>> 3. migrating base pages and THP:
>>>
>>> The math should be very similar to the second case, except that
>>> a) from PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS, we cannot tell how many are pages begin
>>> as base pages and how many are pages begin as THPs but become base pages after split;
>>> b) from PGMIGRATE_FAIL - THP_MIGRATION_FAILURE, an additional case,
>>> base pages that begin as base pages fail to migrate, is mixed into the number and we
>>> cannot tell three cases apart.
>>
>> So why don't we just expose PGMIGRATE_SPLIT? That would be defined as
>> the number of times we succeeded in migrating a THP but had to split it
>> to succeed.
>
> It might need extra code to get that number. Currently, the subpages from split
> THPs are appended to the end of the original page list, so we might need a separate
> page list for these subpages to count PGMIGRATE_SPLIT. Also what if some of the
> subpages fail to migrate? Do we increase PGMIGRATE_SPLIT or not?

Thanks Zi, for such a detailed explanation. Ideally, we should separate THP
migration from base page migration in terms of statistics. PGMIGRATE_SUCCESS
and PGMIGRATE_FAIL should continue to track statistics when migration starts
with base pages. But for THP, we should track the following events.

1. THP_MIGRATION_SUCCESS - THP migration is successful, without split
2. THP_MIGRATION_FAILURE - THP could neither be migrated, nor be split
3. THP_MIGRATION_SPLIT_SUCCESS - THP got split and all sub pages migrated
4. THP_MIGRATION_SPLIT_FAILURE - THP got split but all sub pages could not be migrated

THP_MIGRATION_SPLIT_FAILURE could either increment once for a single THP or
number of subpages that did not get migrated after split. As you mentioned,
this will need some extra code in the core migration. Nonetheless, if these
new events look good, will be happy to make required changes.

2020-06-05 14:27:21

by Zi Yan

[permalink] [raw]
Subject: Re: [PATCH V2] mm/vmstat: Add events for THP migration without split

On 4 Jun 2020, at 23:35, Anshuman Khandual wrote:

> On 06/04/2020 10:19 PM, Zi Yan wrote:
>> On 4 Jun 2020, at 12:36, Matthew Wilcox wrote:
>>
>>> On Thu, Jun 04, 2020 at 09:51:10AM -0400, Zi Yan wrote:
>>>> On 4 Jun 2020, at 7:34, Matthew Wilcox wrote:
>>>>> On Thu, Jun 04, 2020 at 09:30:45AM +0530, Anshuman Khandual wrote:
>>>>>> +Quantifying Migration
>>>>>> +=====================
>>>>>> +Following events can be used to quantify page migration.
>>>>>> +
>>>>>> +- PGMIGRATE_SUCCESS
>>>>>> +- PGMIGRATE_FAIL
>>>>>> +- THP_MIGRATION_SUCCESS
>>>>>> +- THP_MIGRATION_FAILURE
>>>>>> +
>>>>>> +THP_MIGRATION_FAILURE in particular represents an event when a THP could not be
>>>>>> +migrated as a single entity following an allocation failure and ended up getting
>>>>>> +split into constituent normal pages before being retried. This event, along with
>>>>>> +PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will help in quantifying and analyzing THP
>>>>>> +migration events including both success and failure cases.
>>>>>
>>>>> First, I'd suggest running this paragraph through 'fmt'. That way you
>>>>> don't have to care about line lengths.
>>>>>
>>>>> Second, this paragraph doesn't really explain what I need to know to
>>>>> understand the meaning of these numbers. When Linux attempts to migrate
>>>>> a THP, one of three things can happen:
>>>>>
>>>>> - It is migrated as a single THP
>>>>> - It is migrated, but had to be split
>>>>> - Migration fails
>>>>>
>>>>> How do I turn these four numbers into an understanding of how often each
>>>>> of those three situations happen? And why do we need four numbers to
>>>>> report three situations?
>>>>>
>>>>> Or is there something else that can happen? If so, I'd like that explained
>>>>> here too ;-)
>>>>
>>>> PGMIGRATE_SUCCESS and PGMIGRATE_FAIL record a combination of different events,
>>>> so it is not easy to interpret them. Let me try to explain them.
>>>
>>> Thanks! Very helpful explanation.
>>>
>>>> 1. migrating only base pages: PGMIGRATE_SUCCESS and PGMIGRATE_FAIL just mean
>>>> these base pages are migrated and fail to migrate respectively.
>>>> THP_MIGRATION_SUCCESS and THP_MIGRATION_FAILURE should be 0 in this case.
>>>> Simple.
>>>>
>>>> 2. migrating only THPs:
>>>> - PGMIGRATE_SUCCESS means THPs that are migrated and base pages
>>>> (from the split of THPs) that are migrated,
>>>>
>>>> - PGMIGRATE_FAIL means THPs that fail to migrate and base pages that fail to migrated.
>>>>
>>>> - THP_MIGRATION_SUCCESS means THPs that are migrated.
>>>>
>>>> - THP_MIGRATION_FAILURE means THPs that are split.
>>>>
>>>> So PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS means the number of migrated base pages,
>>>> which are from the split of THPs.
>>>
>>> Are you sure about that? If I split a THP and each of those subpages
>>> migrates, won't I then see PGMIGRATE_SUCCESS increase by 512?
>>
>> That is what I mean. I guess my words did not work. I should have used subpages.
>>
>>>
>>>> When it comes to analyze failed migration, PGMIGRATE_FAIL - THP_MIGRATION_FAILURE
>>>> means the number of pages that are failed to migrate, but we cannot tell how many
>>>> are base pages and how many are THPs.
>>>>
>>>> 3. migrating base pages and THP:
>>>>
>>>> The math should be very similar to the second case, except that
>>>> a) from PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS, we cannot tell how many are pages begin
>>>> as base pages and how many are pages begin as THPs but become base pages after split;
>>>> b) from PGMIGRATE_FAIL - THP_MIGRATION_FAILURE, an additional case,
>>>> base pages that begin as base pages fail to migrate, is mixed into the number and we
>>>> cannot tell three cases apart.
>>>
>>> So why don't we just expose PGMIGRATE_SPLIT? That would be defined as
>>> the number of times we succeeded in migrating a THP but had to split it
>>> to succeed.
>>
>> It might need extra code to get that number. Currently, the subpages from split
>> THPs are appended to the end of the original page list, so we might need a separate
>> page list for these subpages to count PGMIGRATE_SPLIT. Also what if some of the
>> subpages fail to migrate? Do we increase PGMIGRATE_SPLIT or not?
>
> Thanks Zi, for such a detailed explanation. Ideally, we should separate THP
> migration from base page migration in terms of statistics. PGMIGRATE_SUCCESS
> and PGMIGRATE_FAIL should continue to track statistics when migration starts
> with base pages. But for THP, we should track the following events.

You mean PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will not track the number of migrated subpages
from split THPs? Will it cause userspace issues since their semantics are changed?

>
> 1. THP_MIGRATION_SUCCESS - THP migration is successful, without split
> 2. THP_MIGRATION_FAILURE - THP could neither be migrated, nor be split

They make sense to me.

> 3. THP_MIGRATION_SPLIT_SUCCESS - THP got split and all sub pages migrated
> 4. THP_MIGRATION_SPLIT_FAILURE - THP got split but all sub pages could not be migrated
>
> THP_MIGRATION_SPLIT_FAILURE could either increment once for a single THP or
> number of subpages that did not get migrated after split. As you mentioned,
> this will need some extra code in the core migration. Nonetheless, if these
> new events look good, will be happy to make required changes.

Maybe THP_MIGRATION_SPLIT would be simpler? My concern is that whether we need such
detailed information or not. Maybe trace points would be good enough for 3 and 4.
But if you think it is useful to you, feel free to implement them.

BTW, in terms of stats tracking, what do you think of my patch below? I am trying to
aggregate all stats counting in one place. Feel free to use it if you think it works
for you.


diff --git a/mm/migrate.c b/mm/migrate.c
index 7bfd0962149e..0f3c60470489 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1429,9 +1429,14 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
enum migrate_mode mode, int reason)
{
int retry = 1;
+ int thp_retry = 1;
int nr_failed = 0;
+ int nr_thp_failed = 0;
+ int nr_thp_split = 0;
int nr_succeeded = 0;
+ int nr_thp_succeeded = 0;
int pass = 0;
+ bool is_thp = false;
struct page *page;
struct page *page2;
int swapwrite = current->flags & PF_SWAPWRITE;
@@ -1440,11 +1445,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
if (!swapwrite)
current->flags |= PF_SWAPWRITE;

- for(pass = 0; pass < 10 && retry; pass++) {
+ for(pass = 0; pass < 10 && (retry || thp_retry); pass++) {
retry = 0;
+ thp_retry = 0;

list_for_each_entry_safe(page, page2, from, lru) {
retry:
+ is_thp = PageTransHuge(page);
cond_resched();

if (PageHuge(page))
@@ -1475,15 +1482,20 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
unlock_page(page);
if (!rc) {
list_safe_reset_next(page, page2, lru);
+ nr_thp_split++;
goto retry;
}
}
nr_failed++;
goto out;
case -EAGAIN:
+ if (is_thp)
+ thp_retry++;
retry++;
break;
case MIGRATEPAGE_SUCCESS:
+ if (is_thp)
+ nr_thp_succeeded++;
nr_succeeded++;
break;
default:
@@ -1493,18 +1505,27 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
* removed from migration page list and not
* retried in the next outer loop.
*/
+ if (is_thp)
+ nr_thp_failed++;
nr_failed++;
break;
}
}
}
nr_failed += retry;
+ nr_thp_failed += thp_retry;
rc = nr_failed;
out:
if (nr_succeeded)
count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
if (nr_failed)
count_vm_events(PGMIGRATE_FAIL, nr_failed);
+ if (nr_thp_succeeded)
+ count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
+ if (nr_thp_failed)
+ count_vm_events(THP_MIGRATION_FAIL, nr_thp_failed);
+ if (nr_thp_split)
+ count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split);
trace_mm_migrate_pages(nr_succeeded, nr_failed, mode, reason);

if (!swapwrite)



Best Regards,
Yan Zi


Attachments:
signature.asc (871.00 B)
OpenPGP digital signature

2020-06-09 11:38:15

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [PATCH V2] mm/vmstat: Add events for THP migration without split

On 06/05/2020 07:54 PM, Zi Yan wrote:
> On 4 Jun 2020, at 23:35, Anshuman Khandual wrote:
>
>> On 06/04/2020 10:19 PM, Zi Yan wrote:
>>> On 4 Jun 2020, at 12:36, Matthew Wilcox wrote:
>>>
>>>> On Thu, Jun 04, 2020 at 09:51:10AM -0400, Zi Yan wrote:
>>>>> On 4 Jun 2020, at 7:34, Matthew Wilcox wrote:
>>>>>> On Thu, Jun 04, 2020 at 09:30:45AM +0530, Anshuman Khandual wrote:
>>>>>>> +Quantifying Migration
>>>>>>> +=====================
>>>>>>> +Following events can be used to quantify page migration.
>>>>>>> +
>>>>>>> +- PGMIGRATE_SUCCESS
>>>>>>> +- PGMIGRATE_FAIL
>>>>>>> +- THP_MIGRATION_SUCCESS
>>>>>>> +- THP_MIGRATION_FAILURE
>>>>>>> +
>>>>>>> +THP_MIGRATION_FAILURE in particular represents an event when a THP could not be
>>>>>>> +migrated as a single entity following an allocation failure and ended up getting
>>>>>>> +split into constituent normal pages before being retried. This event, along with
>>>>>>> +PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will help in quantifying and analyzing THP
>>>>>>> +migration events including both success and failure cases.
>>>>>>
>>>>>> First, I'd suggest running this paragraph through 'fmt'. That way you
>>>>>> don't have to care about line lengths.
>>>>>>
>>>>>> Second, this paragraph doesn't really explain what I need to know to
>>>>>> understand the meaning of these numbers. When Linux attempts to migrate
>>>>>> a THP, one of three things can happen:
>>>>>>>>>>>> - It is migrated as a single THP
>>>>>> - It is migrated, but had to be split
>>>>>> - Migration fails
>>>>>>
>>>>>> How do I turn these four numbers into an understanding of how often each
>>>>>> of those three situations happen? And why do we need four numbers to
>>>>>> report three situations?
>>>>>>
>>>>>> Or is there something else that can happen? If so, I'd like that explained
>>>>>> here too ;-)
>>>>>
>>>>> PGMIGRATE_SUCCESS and PGMIGRATE_FAIL record a combination of different events,
>>>>> so it is not easy to interpret them. Let me try to explain them.
>>>>
>>>> Thanks! Very helpful explanation.
>>>>
>>>>> 1. migrating only base pages: PGMIGRATE_SUCCESS and PGMIGRATE_FAIL just mean
>>>>> these base pages are migrated and fail to migrate respectively.
>>>>> THP_MIGRATION_SUCCESS and THP_MIGRATION_FAILURE should be 0 in this case.
>>>>> Simple.
>>>>>
>>>>> 2. migrating only THPs:
>>>>> - PGMIGRATE_SUCCESS means THPs that are migrated and base pages
>>>>> (from the split of THPs) that are migrated,
>>>>>
>>>>> - PGMIGRATE_FAIL means THPs that fail to migrate and base pages that fail to migrated.
>>>>>
>>>>> - THP_MIGRATION_SUCCESS means THPs that are migrated.
>>>>>
>>>>> - THP_MIGRATION_FAILURE means THPs that are split.
>>>>>
>>>>> So PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS means the number of migrated base pages,
>>>>> which are from the split of THPs.
>>>>
>>>> Are you sure about that? If I split a THP and each of those subpages
>>>> migrates, won't I then see PGMIGRATE_SUCCESS increase by 512?
>>>
>>> That is what I mean. I guess my words did not work. I should have used subpages.
>>>
>>>>
>>>>> When it comes to analyze failed migration, PGMIGRATE_FAIL - THP_MIGRATION_FAILURE
>>>>> means the number of pages that are failed to migrate, but we cannot tell how many
>>>>> are base pages and how many are THPs.
>>>>>
>>>>> 3. migrating base pages and THP:
>>>>>
>>>>> The math should be very similar to the second case, except that
>>>>> a) from PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS, we cannot tell how many are pages begin
>>>>> as base pages and how many are pages begin as THPs but become base pages after split;
>>>>> b) from PGMIGRATE_FAIL - THP_MIGRATION_FAILURE, an additional case,
>>>>> base pages that begin as base pages fail to migrate, is mixed into the number and we
>>>>> cannot tell three cases apart.
>>>>
>>>> So why don't we just expose PGMIGRATE_SPLIT? That would be defined as
>>>> the number of times we succeeded in migrating a THP but had to split it
>>>> to succeed.
>>>
>>> It might need extra code to get that number. Currently, the subpages from split
>>> THPs are appended to the end of the original page list, so we might need a separate
>>> page list for these subpages to count PGMIGRATE_SPLIT. Also what if some of the
>>> subpages fail to migrate? Do we increase PGMIGRATE_SPLIT or not?
>>
>> Thanks Zi, for such a detailed explanation. Ideally, we should separate THP
>> migration from base page migration in terms of statistics. PGMIGRATE_SUCCESS
>> and PGMIGRATE_FAIL should continue to track statistics when migration starts
>> with base pages. But for THP, we should track the following events.
>
> You mean PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will not track the number of migrated subpages
> from split THPs? Will it cause userspace issues since their semantics are changed?

Yeah, basic idea is to carve out all THP migration related statistics from
the normal page migration. Not sure if that might cause any issue for user
space as you have mentioned. Does /proc/vmstat indicate some sort of an ABI
whose semantics can not really change ? Some more opinions here from others
would be helpful.

The current situation is definitely bit problematic where PGMIGRATE_SUCCESS
increments (+1) both for normal and successful THP migration. Same situation
for PGMIGRATE_FAILURE as well. Hence, there are two clear choices available.

1. THP and normal page migration stats are separate and mutually exclusive

OR

2. THP migration has specific counters but normal page migration counters
still account for everything in THP migration in terms of normal pages

But IIUC, either way the current PGMIGRATE_SUCCESS or PGMIGRATE_FAIL stats
will change for the user as visible from /proc/vmstat.

>
>>
>> 1. THP_MIGRATION_SUCCESS - THP migration is successful, without split
>> 2. THP_MIGRATION_FAILURE - THP could neither be migrated, nor be split
>
> They make sense to me.>
>> 3. THP_MIGRATION_SPLIT_SUCCESS - THP got split and all sub pages migrated
>> 4. THP_MIGRATION_SPLIT_FAILURE - THP got split but all sub pages could not be migrated
>>
>> THP_MIGRATION_SPLIT_FAILURE could either increment once for a single THP or
>> number of subpages that did not get migrated after split. As you mentioned,
>> this will need some extra code in the core migration. Nonetheless, if these
>> new events look good, will be happy to make required changes.
>
> Maybe THP_MIGRATION_SPLIT would be simpler? My concern is that whether we need such

Also, it will not require a new migration queue tracking the split THP sub pages.

> detailed information or not. Maybe trace points would be good enough for 3 and 4.

But without a separate queue for split THP subpages, will it be possible to track
(3) and (4) through trace events ? Just wondering, where are the intercept points.

> But if you think it is useful to you, feel free to implement them.

Original idea was that, all stats here should give high level view but new upcoming
trace events should help in details like which particular subpages could not be
migrated resulting in a THP_MIGRATION_SPLIT_FAILURE increment etc.

>
> BTW, in terms of stats tracking, what do you think of my patch below? I am trying to
> aggregate all stats counting in one place. Feel free to use it if you think it works
> for you.

Assume that, I could take the liberty to add your 'Signed-off-by' in case end up
using this code chunk below. This seems to be going with option (2) as mentioned
before.

>
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 7bfd0962149e..0f3c60470489 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1429,9 +1429,14 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> enum migrate_mode mode, int reason)
> {
> int retry = 1;
> + int thp_retry = 1;
> int nr_failed = 0;
> + int nr_thp_failed = 0;
> + int nr_thp_split = 0;
> int nr_succeeded = 0;
> + int nr_thp_succeeded = 0;
> int pass = 0;
> + bool is_thp = false;
> struct page *page;
> struct page *page2;
> int swapwrite = current->flags & PF_SWAPWRITE;
> @@ -1440,11 +1445,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> if (!swapwrite)
> current->flags |= PF_SWAPWRITE;
>
> - for(pass = 0; pass < 10 && retry; pass++) {
> + for(pass = 0; pass < 10 && (retry || thp_retry); pass++) {

'thp_retry' check might not be necessary here as 'retry' already
contains 'thp_retry'.

> retry = 0;
> + thp_retry = 0;
>
> list_for_each_entry_safe(page, page2, from, lru) {
> retry:
> + is_thp = PageTransHuge(page);
> cond_resched();
>
> if (PageHuge(page))
> @@ -1475,15 +1482,20 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> unlock_page(page);
> if (!rc) {
> list_safe_reset_next(page, page2, lru);
> + nr_thp_split++;
> goto retry;
> }
> }

Check 'if_thp' and increment 'nr_thp_failed' like 'default' case and
also increment nr_failed by (1 << order) for the THP being migrated.


> nr_failed++;
> goto out;
> case -EAGAIN:
> + if (is_thp)
> + thp_retry++;
> retry++;
> break;
> case MIGRATEPAGE_SUCCESS:
> + if (is_thp)
> + nr_thp_succeeded++;

Increment nr_succeeded by (1 << order) for the THP being migrated.

> nr_succeeded++;
> break;
> default:
> @@ -1493,18 +1505,27 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> * removed from migration page list and not
> * retried in the next outer loop.
> */
> + if (is_thp)
> + nr_thp_failed++;

Increment nr_failed by (1 << order) for the THP being migrated.

> nr_failed++;
> break;
> }
> }
> }
> nr_failed += retry;
> + nr_thp_failed += thp_retry;
> rc = nr_failed;

Right, nr_failed already contains nr_thp_failed. Hence need not change.

> if (nr_succeeded)
> count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
> if (nr_failed)
> count_vm_events(PGMIGRATE_FAIL, nr_failed);
> + if (nr_thp_succeeded)
> + count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
> + if (nr_thp_failed)
> + count_vm_events(THP_MIGRATION_FAIL, nr_thp_failed);
> + if (nr_thp_split)
> + count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split);
> trace_mm_migrate_pages(nr_succeeded, nr_failed, mode, reason);

This existing trace event should add stats for THP migration as well.

>
> if (!swapwrite)
>

Regardless, this change set (may be with some more modifications), makes sense if
we decide to go with option (2), where existing normal page migration stats will
cover THP related stats as well. But it will be really great, if we get some more
opinions on this. Meanwhile, will continue looking into it further.

- Anshuman

2020-06-09 22:31:23

by Daniel Jordan

[permalink] [raw]
Subject: Re: [PATCH V2] mm/vmstat: Add events for THP migration without split

On Tue, Jun 09, 2020 at 05:05:45PM +0530, Anshuman Khandual wrote:
> On 06/05/2020 07:54 PM, Zi Yan wrote:
> > On 4 Jun 2020, at 23:35, Anshuman Khandual wrote:
> >> Thanks Zi, for such a detailed explanation. Ideally, we should separate THP
> >> migration from base page migration in terms of statistics. PGMIGRATE_SUCCESS
> >> and PGMIGRATE_FAIL should continue to track statistics when migration starts
> >> with base pages. But for THP, we should track the following events.
> >
> > You mean PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will not track the number of migrated subpages
> > from split THPs? Will it cause userspace issues since their semantics are changed?
>
> Yeah, basic idea is to carve out all THP migration related statistics from
> the normal page migration. Not sure if that might cause any issue for user
> space as you have mentioned. Does /proc/vmstat indicate some sort of an ABI
> whose semantics can not really change ? Some more opinions here from others
> would be helpful.

vmstats have had their implementations changed for THP before, so there's at
least some precedent.

https://lore.kernel.org/linux-mm/[email protected]/

Others know better than I do.

> The current situation is definitely bit problematic where PGMIGRATE_SUCCESS
> increments (+1) both for normal and successful THP migration. Same situation
> for PGMIGRATE_FAILURE as well.

Yeah, that's suboptimal. Maybe a separate patch to get thps accounted properly
in those stats?

> Hence, there are two clear choices available.
>
> 1. THP and normal page migration stats are separate and mutually exclusive
>
> OR
>
> 2. THP migration has specific counters but normal page migration counters
> still account for everything in THP migration in terms of normal pages

FWIW, 2 makes more sense to me because it suits the existing names (it's
PGMIGRATE_SUCCESS not PGMIGRATE_SMALL_PAGES_SUCCESS) and it's less surprising
than leaving thp out of them.

> But IIUC, either way the current PGMIGRATE_SUCCESS or PGMIGRATE_FAIL stats
> will change for the user as visible from /proc/vmstat.
>
> >
> >>
> >> 1. THP_MIGRATION_SUCCESS - THP migration is successful, without split
> >> 2. THP_MIGRATION_FAILURE - THP could neither be migrated, nor be split
> >
> > They make sense to me.>
> >> 3. THP_MIGRATION_SPLIT_SUCCESS - THP got split and all sub pages migrated
> >> 4. THP_MIGRATION_SPLIT_FAILURE - THP got split but all sub pages could not be migrated
> >>
> >> THP_MIGRATION_SPLIT_FAILURE could either increment once for a single THP or
> >> number of subpages that did not get migrated after split. As you mentioned,
> >> this will need some extra code in the core migration. Nonetheless, if these
> >> new events look good, will be happy to make required changes.
> >
> > Maybe THP_MIGRATION_SPLIT would be simpler? My concern is that whether we need such
>
> Also, it will not require a new migration queue tracking the split THP sub pages.
>
> > detailed information or not. Maybe trace points would be good enough for 3 and 4.

I share the concern about whether that many new stats are necessary. The
changelog says this is for performance debugging, so it seems THP success and
THP split would cover that.

The split stat becomes redundant if the other information is needed in the
future, but it can presumably be removed in that case.

2020-06-09 23:11:49

by Zi Yan

[permalink] [raw]
Subject: Re: [PATCH V2] mm/vmstat: Add events for THP migration without split

On 9 Jun 2020, at 7:35, Anshuman Khandual wrote:

> On 06/05/2020 07:54 PM, Zi Yan wrote:
>> On 4 Jun 2020, at 23:35, Anshuman Khandual wrote:
>>
>>> On 06/04/2020 10:19 PM, Zi Yan wrote:
>>>> On 4 Jun 2020, at 12:36, Matthew Wilcox wrote:
>>>>
>>>>> On Thu, Jun 04, 2020 at 09:51:10AM -0400, Zi Yan wrote:
>>>>>> On 4 Jun 2020, at 7:34, Matthew Wilcox wrote:
>>>>>>> On Thu, Jun 04, 2020 at 09:30:45AM +0530, Anshuman Khandual wrote:
>>>>>>>> +Quantifying Migration
>>>>>>>> +=====================
>>>>>>>> +Following events can be used to quantify page migration.
>>>>>>>> +
>>>>>>>> +- PGMIGRATE_SUCCESS
>>>>>>>> +- PGMIGRATE_FAIL
>>>>>>>> +- THP_MIGRATION_SUCCESS
>>>>>>>> +- THP_MIGRATION_FAILURE
>>>>>>>> +
>>>>>>>> +THP_MIGRATION_FAILURE in particular represents an event when a THP could not be
>>>>>>>> +migrated as a single entity following an allocation failure and ended up getting
>>>>>>>> +split into constituent normal pages before being retried. This event, along with
>>>>>>>> +PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will help in quantifying and analyzing THP
>>>>>>>> +migration events including both success and failure cases.
>>>>>>>
>>>>>>> First, I'd suggest running this paragraph through 'fmt'. That way you
>>>>>>> don't have to care about line lengths.
>>>>>>>
>>>>>>> Second, this paragraph doesn't really explain what I need to know to
>>>>>>> understand the meaning of these numbers. When Linux attempts to migrate
>>>>>>> a THP, one of three things can happen:
>>>>>>>>>>>>> - It is migrated as a single THP
>>>>>>> - It is migrated, but had to be split
>>>>>>> - Migration fails
>>>>>>>
>>>>>>> How do I turn these four numbers into an understanding of how often each
>>>>>>> of those three situations happen? And why do we need four numbers to
>>>>>>> report three situations?
>>>>>>>
>>>>>>> Or is there something else that can happen? If so, I'd like that explained
>>>>>>> here too ;-)
>>>>>>
>>>>>> PGMIGRATE_SUCCESS and PGMIGRATE_FAIL record a combination of different events,
>>>>>> so it is not easy to interpret them. Let me try to explain them.
>>>>>
>>>>> Thanks! Very helpful explanation.
>>>>>
>>>>>> 1. migrating only base pages: PGMIGRATE_SUCCESS and PGMIGRATE_FAIL just mean
>>>>>> these base pages are migrated and fail to migrate respectively.
>>>>>> THP_MIGRATION_SUCCESS and THP_MIGRATION_FAILURE should be 0 in this case.
>>>>>> Simple.
>>>>>>
>>>>>> 2. migrating only THPs:
>>>>>> - PGMIGRATE_SUCCESS means THPs that are migrated and base pages
>>>>>> (from the split of THPs) that are migrated,
>>>>>>
>>>>>> - PGMIGRATE_FAIL means THPs that fail to migrate and base pages that fail to migrated.
>>>>>>
>>>>>> - THP_MIGRATION_SUCCESS means THPs that are migrated.
>>>>>>
>>>>>> - THP_MIGRATION_FAILURE means THPs that are split.
>>>>>>
>>>>>> So PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS means the number of migrated base pages,
>>>>>> which are from the split of THPs.
>>>>>
>>>>> Are you sure about that? If I split a THP and each of those subpages
>>>>> migrates, won't I then see PGMIGRATE_SUCCESS increase by 512?
>>>>
>>>> That is what I mean. I guess my words did not work. I should have used subpages.
>>>>
>>>>>
>>>>>> When it comes to analyze failed migration, PGMIGRATE_FAIL - THP_MIGRATION_FAILURE
>>>>>> means the number of pages that are failed to migrate, but we cannot tell how many
>>>>>> are base pages and how many are THPs.
>>>>>>
>>>>>> 3. migrating base pages and THP:
>>>>>>
>>>>>> The math should be very similar to the second case, except that
>>>>>> a) from PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS, we cannot tell how many are pages begin
>>>>>> as base pages and how many are pages begin as THPs but become base pages after split;
>>>>>> b) from PGMIGRATE_FAIL - THP_MIGRATION_FAILURE, an additional case,
>>>>>> base pages that begin as base pages fail to migrate, is mixed into the number and we
>>>>>> cannot tell three cases apart.
>>>>>
>>>>> So why don't we just expose PGMIGRATE_SPLIT? That would be defined as
>>>>> the number of times we succeeded in migrating a THP but had to split it
>>>>> to succeed.
>>>>
>>>> It might need extra code to get that number. Currently, the subpages from split
>>>> THPs are appended to the end of the original page list, so we might need a separate
>>>> page list for these subpages to count PGMIGRATE_SPLIT. Also what if some of the
>>>> subpages fail to migrate? Do we increase PGMIGRATE_SPLIT or not?
>>>
>>> Thanks Zi, for such a detailed explanation. Ideally, we should separate THP
>>> migration from base page migration in terms of statistics. PGMIGRATE_SUCCESS
>>> and PGMIGRATE_FAIL should continue to track statistics when migration starts
>>> with base pages. But for THP, we should track the following events.
>>
>> You mean PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will not track the number of migrated subpages
>> from split THPs? Will it cause userspace issues since their semantics are changed?
>
> Yeah, basic idea is to carve out all THP migration related statistics from
> the normal page migration. Not sure if that might cause any issue for user
> space as you have mentioned. Does /proc/vmstat indicate some sort of an ABI
> whose semantics can not really change ? Some more opinions here from others
> would be helpful.
>
> The current situation is definitely bit problematic where PGMIGRATE_SUCCESS
> increments (+1) both for normal and successful THP migration. Same situation
> for PGMIGRATE_FAILURE as well. Hence, there are two clear choices available.
>
> 1. THP and normal page migration stats are separate and mutually exclusive
>
> OR
>
> 2. THP migration has specific counters but normal page migration counters
> still account for everything in THP migration in terms of normal pages
>
> But IIUC, either way the current PGMIGRATE_SUCCESS or PGMIGRATE_FAIL stats
> will change for the user as visible from /proc/vmstat.

Why? In the case 2, PGMIGRATE_SUCCESS and PGMIGRATE_FAIL would remain the same
as today, right?

>
>>
>>>
>>> 1. THP_MIGRATION_SUCCESS - THP migration is successful, without split
>>> 2. THP_MIGRATION_FAILURE - THP could neither be migrated, nor be split
>>
>> They make sense to me.>
>>> 3. THP_MIGRATION_SPLIT_SUCCESS - THP got split and all sub pages migrated
>>> 4. THP_MIGRATION_SPLIT_FAILURE - THP got split but all sub pages could not be migrated
>>>
>>> THP_MIGRATION_SPLIT_FAILURE could either increment once for a single THP or
>>> number of subpages that did not get migrated after split. As you mentioned,
>>> this will need some extra code in the core migration. Nonetheless, if these
>>> new events look good, will be happy to make required changes.
>>
>> Maybe THP_MIGRATION_SPLIT would be simpler? My concern is that whether we need such
>
> Also, it will not require a new migration queue tracking the split THP sub pages.
>
>> detailed information or not. Maybe trace points would be good enough for 3 and 4.
>
> But without a separate queue for split THP subpages, will it be possible to track
> (3) and (4) through trace events ? Just wondering, where are the intercept points.

Not easily. You could get a trace of the migration result and the physical address
of individual page, whether each is a THP or not, and the split of a THP.
You then can see which THP is split and what is the migration result of its subpages.
It is more analysis effort but less kernel code.

>
>> But if you think it is useful to you, feel free to implement them.
>
> Original idea was that, all stats here should give high level view but new upcoming
> trace events should help in details like which particular subpages could not be
> migrated resulting in a THP_MIGRATION_SPLIT_FAILURE increment etc.
>
>>
>> BTW, in terms of stats tracking, what do you think of my patch below? I am trying to
>> aggregate all stats counting in one place. Feel free to use it if you think it works
>> for you.
>
> Assume that, I could take the liberty to add your 'Signed-off-by' in case end up
> using this code chunk below. This seems to be going with option (2) as mentioned
> before.

Feel free to do so.

>
>>
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 7bfd0962149e..0f3c60470489 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1429,9 +1429,14 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>> enum migrate_mode mode, int reason)
>> {
>> int retry = 1;
>> + int thp_retry = 1;
>> int nr_failed = 0;
>> + int nr_thp_failed = 0;
>> + int nr_thp_split = 0;
>> int nr_succeeded = 0;
>> + int nr_thp_succeeded = 0;
>> int pass = 0;
>> + bool is_thp = false;
>> struct page *page;
>> struct page *page2;
>> int swapwrite = current->flags & PF_SWAPWRITE;
>> @@ -1440,11 +1445,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>> if (!swapwrite)
>> current->flags |= PF_SWAPWRITE;
>>
>> - for(pass = 0; pass < 10 && retry; pass++) {
>> + for(pass = 0; pass < 10 && (retry || thp_retry); pass++) {
>
> 'thp_retry' check might not be necessary here as 'retry' already
> contains 'thp_retry'.

Right.
>
>> retry = 0;
>> + thp_retry = 0;
>>
>> list_for_each_entry_safe(page, page2, from, lru) {
>> retry:
>> + is_thp = PageTransHuge(page);
>> cond_resched();
>>
>> if (PageHuge(page))
>> @@ -1475,15 +1482,20 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>> unlock_page(page);
>> if (!rc) {
>> list_safe_reset_next(page, page2, lru);
>> + nr_thp_split++;
>> goto retry;
>> }
>> }
>
> Check 'if_thp' and increment 'nr_thp_failed' like 'default' case and
> also increment nr_failed by (1 << order) for the THP being migrated.

Right. hpage_nr_pages() would be better.

>
>
>> nr_failed++;
>> goto out;
>> case -EAGAIN:
>> + if (is_thp)
>> + thp_retry++;
>> retry++;
>> break;
>> case MIGRATEPAGE_SUCCESS:
>> + if (is_thp)
>> + nr_thp_succeeded++;
>
> Increment nr_succeeded by (1 << order) for the THP being migrated.
Ditto.

>
>> nr_succeeded++;
>> break;
>> default:
>> @@ -1493,18 +1505,27 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>> * removed from migration page list and not
>> * retried in the next outer loop.
>> */
>> + if (is_thp)
>> + nr_thp_failed++;
>
> Increment nr_failed by (1 << order) for the THP being migrated.
Ditto.
>
>> nr_failed++;
>> break;
>> }
>> }
>> }
>> nr_failed += retry;
>> + nr_thp_failed += thp_retry;
>> rc = nr_failed;
>
> Right, nr_failed already contains nr_thp_failed. Hence need not change.
>
>> if (nr_succeeded)
>> count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
>> if (nr_failed)
>> count_vm_events(PGMIGRATE_FAIL, nr_failed);
>> + if (nr_thp_succeeded)
>> + count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
>> + if (nr_thp_failed)
>> + count_vm_events(THP_MIGRATION_FAIL, nr_thp_failed);
>> + if (nr_thp_split)
>> + count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split);
>> trace_mm_migrate_pages(nr_succeeded, nr_failed, mode, reason);
>
> This existing trace event should add stats for THP migration as well.

Sure.
>
>>
>> if (!swapwrite)
>>
>
> Regardless, this change set (may be with some more modifications), makes sense if
> we decide to go with option (2), where existing normal page migration stats will
> cover THP related stats as well. But it will be really great, if we get some more
> opinions on this. Meanwhile, will continue looking into it further.

Thank you for working on this. :)


Best Regards,
Yan Zi


Attachments:
signature.asc (871.00 B)
OpenPGP digital signature