2022-10-21 10:40:29

by Baolin Wang

[permalink] [raw]
Subject: [PATCH v2 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully

When THP migration, if THPs are split and all subpages are migrated successfully
, the migrate_pages() will still return the number of THP that were not migrated.
That will confuse the callers of migrate_pages(), for example, which will make
the longterm pinning failed though all pages are migrated successfully.

Thus we should return 0 to indicate all pages are migrated in this case.

Signed-off-by: Baolin Wang <[email protected]>
---
Changes from v1:
- Fix the return value of migrate_pages() instead of fixing the
callers' validation.
---
mm/migrate.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/mm/migrate.c b/mm/migrate.c
index 8e5eb6e..1da0dbc 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1582,6 +1582,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
*/
list_splice(&ret_pages, from);

+ /*
+ * Return 0 in case all subpages of fail-to-migrate THPs are
+ * migrated successfully.
+ */
+ if (nr_thp_split && list_empty(from))
+ rc = 0;
+
count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
count_vm_events(PGMIGRATE_FAIL, nr_failed_pages);
count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
--
1.8.3.1


2022-10-21 10:41:00

by Baolin Wang

[permalink] [raw]
Subject: [PATCH v2 2/2] mm: migrate: Try again if THP split is failed due to page refcnt

When creating a virtual machine, we will use memfd_create() to get
a file descriptor which can be used to create share memory mappings
using the mmap function, meanwhile the mmap() will set the MAP_POPULATE
flag to allocate physical pages for the virtual machine.

When allocating physical pages for the guest, the host can fallback to
allocate some CMA pages for the guest when over half of the zone's free
memory is in the CMA area.

In guest os, when the application wants to do some data transaction with
DMA, our QEMU will call VFIO_IOMMU_MAP_DMA ioctl to do longterm-pin and
create IOMMU mappings for the DMA pages. However, when calling
VFIO_IOMMU_MAP_DMA ioctl to pin the physical pages, we found it will be
failed to longterm-pin sometimes.

After some invetigation, we found the pages used to do DMA mapping can
contain some CMA pages, and these CMA pages will cause a possible
failure of the longterm-pin, due to failed to migrate the CMA pages.
The reason of migration failure may be temporary reference count or
memory allocation failure. So that will cause the VFIO_IOMMU_MAP_DMA
ioctl returns error, which makes the application failed to start.

I observed one migration failure case (which is not easy to reproduce) is
that, the 'thp_migration_fail' count is 1 and the 'thp_split_page_failed'
count is also 1.

That means when migrating a THP which is in CMA area, but can not allocate
a new THP due to memory fragmentation, so it will split the THP. However
THP split is also failed, probably the reason is temporary reference count
of this THP. And the temporary reference count can be caused by dropping
page caches (I observed the drop caches operation in the system), but we
can not drop the shmem page caches due to they are already dirty at that time.

Especially for THP split failure, which is caused by temporary reference
count, we can try again to mitigate the failure of migration in this case
according to previous discussion [1].

[1] https://lore.kernel.org/all/[email protected]/
Signed-off-by: Baolin Wang <[email protected]>
---
Changes from v1:
- Use another variable to save the return value of THP split.
---
mm/huge_memory.c | 4 ++--
mm/migrate.c | 19 ++++++++++++++++---
2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index ad17c8d..a79f03b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2666,7 +2666,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
* split PMDs
*/
if (!can_split_folio(folio, &extra_pins)) {
- ret = -EBUSY;
+ ret = -EAGAIN;
goto out_unlock;
}

@@ -2716,7 +2716,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
xas_unlock(&xas);
local_irq_enable();
remap_page(folio, folio_nr_pages(folio));
- ret = -EBUSY;
+ ret = -EAGAIN;
}

out_unlock:
diff --git a/mm/migrate.c b/mm/migrate.c
index 1da0dbc..6d49a3e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1506,9 +1506,22 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
if (is_thp) {
nr_thp_failed++;
/* THP NUMA faulting doesn't split THP to retry. */
- if (!nosplit && !try_split_thp(page, &thp_split_pages)) {
- nr_thp_split++;
- break;
+ if (!nosplit) {
+ int ret = try_split_thp(page, &thp_split_pages);
+
+ if (!ret) {
+ nr_thp_split++;
+ break;
+ } else if (reason == MR_LONGTERM_PIN &&
+ ret == -EAGAIN) {
+ /*
+ * Try again to split THP to mitigate
+ * the failure of longterm pinning.
+ */
+ thp_retry++;
+ nr_retry_pages += nr_subpages;
+ break;
+ }
}
} else if (!no_subpage_counting) {
nr_failed++;
--
1.8.3.1

2022-10-21 19:32:22

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully

On Fri, 21 Oct 2022 18:16:23 +0800 Baolin Wang <[email protected]> wrote:

> When THP migration, if THPs are split and all subpages are migrated successfully
> , the migrate_pages() will still return the number of THP that were not migrated.
> That will confuse the callers of migrate_pages(), for example, which will make
> the longterm pinning failed though all pages are migrated successfully.
>
> Thus we should return 0 to indicate all pages are migrated in this case.
>

This had me puzzled for a while. I think this wording is clearer?

: During THP migration, if THPs are not migrated but they are split and all
: subpages are migrated successfully, migrate_pages() will still return the
: number of THP pages that were not migrated. This will confuse the callers
: of migrate_pages(). For example, the longterm pinning will failed though
: all pages are migrated successfully.
:
: Thus we should return 0 to indicate that all pages are migrated in this
: case.

This is a fairly longstanding problem? No Fixes: we can identify?

Did you consider the desirability of a -stable backport?

2022-10-21 21:31:50

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully

On Fri, Oct 21, 2022 at 11:41 AM Andrew Morton
<[email protected]> wrote:
>
> On Fri, 21 Oct 2022 18:16:23 +0800 Baolin Wang <[email protected]> wrote:
>
> > When THP migration, if THPs are split and all subpages are migrated successfully
> > , the migrate_pages() will still return the number of THP that were not migrated.
> > That will confuse the callers of migrate_pages(), for example, which will make
> > the longterm pinning failed though all pages are migrated successfully.
> >
> > Thus we should return 0 to indicate all pages are migrated in this case.
> >
>
> This had me puzzled for a while. I think this wording is clearer?
>
> : During THP migration, if THPs are not migrated but they are split and all
> : subpages are migrated successfully, migrate_pages() will still return the
> : number of THP pages that were not migrated. This will confuse the callers
> : of migrate_pages(). For example, the longterm pinning will failed though
> : all pages are migrated successfully.
> :
> : Thus we should return 0 to indicate that all pages are migrated in this
> : case.
>
> This is a fairly longstanding problem? No Fixes: we can identify?

It doesn't seem like a long standing issue. It seems like commit
b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
fixed one problem, but introduced this new one IIUC.

Before this commit, the code did:

nr_failed += retry + thp_retry;
rc = nr_failed;

But retry and thp_retry were actually reset for each retry until the
last one. So as long as there is no permanent migration failure and
THP split failure, nr_failed should be 0 IIUC. TBH the code is a
little bit hard to follow, please correct me if I'm wrong.

>
> Did you consider the desirability of a -stable backport?
>

2022-10-24 02:05:44

by Huang, Ying

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] mm: migrate: Try again if THP split is failed due to page refcnt

Baolin Wang <[email protected]> writes:

> When creating a virtual machine, we will use memfd_create() to get
> a file descriptor which can be used to create share memory mappings
> using the mmap function, meanwhile the mmap() will set the MAP_POPULATE
> flag to allocate physical pages for the virtual machine.
>
> When allocating physical pages for the guest, the host can fallback to
> allocate some CMA pages for the guest when over half of the zone's free
> memory is in the CMA area.
>
> In guest os, when the application wants to do some data transaction with
> DMA, our QEMU will call VFIO_IOMMU_MAP_DMA ioctl to do longterm-pin and
> create IOMMU mappings for the DMA pages. However, when calling
> VFIO_IOMMU_MAP_DMA ioctl to pin the physical pages, we found it will be
> failed to longterm-pin sometimes.
>
> After some invetigation, we found the pages used to do DMA mapping can
> contain some CMA pages, and these CMA pages will cause a possible
> failure of the longterm-pin, due to failed to migrate the CMA pages.
> The reason of migration failure may be temporary reference count or
> memory allocation failure. So that will cause the VFIO_IOMMU_MAP_DMA
> ioctl returns error, which makes the application failed to start.
>
> I observed one migration failure case (which is not easy to reproduce) is
> that, the 'thp_migration_fail' count is 1 and the 'thp_split_page_failed'
> count is also 1.
>
> That means when migrating a THP which is in CMA area, but can not allocate
> a new THP due to memory fragmentation, so it will split the THP. However
> THP split is also failed, probably the reason is temporary reference count
> of this THP. And the temporary reference count can be caused by dropping
> page caches (I observed the drop caches operation in the system), but we
> can not drop the shmem page caches due to they are already dirty at that time.
>
> Especially for THP split failure, which is caused by temporary reference
> count, we can try again to mitigate the failure of migration in this case
> according to previous discussion [1].
>
> [1] https://lore.kernel.org/all/[email protected]/
> Signed-off-by: Baolin Wang <[email protected]>

Thanks!

Reviewed-by: "Huang, Ying" <[email protected]>

Best Regards,
Huang, Ying

> ---
> Changes from v1:
> - Use another variable to save the return value of THP split.
> ---
> mm/huge_memory.c | 4 ++--
> mm/migrate.c | 19 ++++++++++++++++---
> 2 files changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index ad17c8d..a79f03b 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2666,7 +2666,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> * split PMDs
> */
> if (!can_split_folio(folio, &extra_pins)) {
> - ret = -EBUSY;
> + ret = -EAGAIN;
> goto out_unlock;
> }
>
> @@ -2716,7 +2716,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> xas_unlock(&xas);
> local_irq_enable();
> remap_page(folio, folio_nr_pages(folio));
> - ret = -EBUSY;
> + ret = -EAGAIN;
> }
>
> out_unlock:
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 1da0dbc..6d49a3e 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1506,9 +1506,22 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> if (is_thp) {
> nr_thp_failed++;
> /* THP NUMA faulting doesn't split THP to retry. */
> - if (!nosplit && !try_split_thp(page, &thp_split_pages)) {
> - nr_thp_split++;
> - break;
> + if (!nosplit) {
> + int ret = try_split_thp(page, &thp_split_pages);
> +
> + if (!ret) {
> + nr_thp_split++;
> + break;
> + } else if (reason == MR_LONGTERM_PIN &&
> + ret == -EAGAIN) {
> + /*
> + * Try again to split THP to mitigate
> + * the failure of longterm pinning.
> + */
> + thp_retry++;
> + nr_retry_pages += nr_subpages;
> + break;
> + }
> }
> } else if (!no_subpage_counting) {
> nr_failed++;

2022-10-24 02:20:21

by Huang, Ying

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully

Yang Shi <[email protected]> writes:

> On Fri, Oct 21, 2022 at 11:41 AM Andrew Morton
> <[email protected]> wrote:
>>
>> On Fri, 21 Oct 2022 18:16:23 +0800 Baolin Wang <[email protected]> wrote:
>>
>> > When THP migration, if THPs are split and all subpages are migrated successfully
>> > , the migrate_pages() will still return the number of THP that were not migrated.
>> > That will confuse the callers of migrate_pages(), for example, which will make
>> > the longterm pinning failed though all pages are migrated successfully.
>> >
>> > Thus we should return 0 to indicate all pages are migrated in this case.
>> >
>>
>> This had me puzzled for a while. I think this wording is clearer?
>>
>> : During THP migration, if THPs are not migrated but they are split and all
>> : subpages are migrated successfully, migrate_pages() will still return the
>> : number of THP pages that were not migrated. This will confuse the callers
>> : of migrate_pages(). For example, the longterm pinning will failed though
>> : all pages are migrated successfully.
>> :
>> : Thus we should return 0 to indicate that all pages are migrated in this
>> : case.
>>
>> This is a fairly longstanding problem? No Fixes: we can identify?
>
> It doesn't seem like a long standing issue. It seems like commit
> b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
> fixed one problem, but introduced this new one IIUC.
>
> Before this commit, the code did:
>
> nr_failed += retry + thp_retry;
> rc = nr_failed;
>
> But retry and thp_retry were actually reset for each retry until the
> last one. So as long as there is no permanent migration failure and
> THP split failure, nr_failed should be 0 IIUC. TBH the code is a
> little bit hard to follow, please correct me if I'm wrong.

I think that you are correct. We can added

Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")

>> Did you consider the desirability of a -stable backport?

I think this can be backport to -stable.

Best Regards,
Huang, Ying

2022-10-24 03:11:34

by Alistair Popple

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully


Baolin Wang <[email protected]> writes:

> When THP migration, if THPs are split and all subpages are migrated successfully
> , the migrate_pages() will still return the number of THP that were not migrated.
> That will confuse the callers of migrate_pages(), for example, which will make
> the longterm pinning failed though all pages are migrated successfully.
>
> Thus we should return 0 to indicate all pages are migrated in this case.
>
> Signed-off-by: Baolin Wang <[email protected]>
> ---
> Changes from v1:
> - Fix the return value of migrate_pages() instead of fixing the
> callers' validation.
> ---
> mm/migrate.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 8e5eb6e..1da0dbc 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1582,6 +1582,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> */
> list_splice(&ret_pages, from);
>
> + /*
> + * Return 0 in case all subpages of fail-to-migrate THPs are
> + * migrated successfully.
> + */
> + if (nr_thp_split && list_empty(from))
> + rc = 0;

Why do you need to check nr_thp_split? Wouldn't list_empty(from) == True
imply success? And if it doesn't imply success wouldn't it be possible
to end up with nr_thp_split && list_empty(from) whilst still having
pages that failed to migrate?

The list management and return code logic from unmap_and_move() has
gotten pretty difficult to follow and could do with some rework IMHO.

> count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
> count_vm_events(PGMIGRATE_FAIL, nr_failed_pages);
> count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);

2022-10-24 06:14:58

by Baolin Wang

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully



On 10/24/2022 9:56 AM, Huang, Ying wrote:
> Yang Shi <[email protected]> writes:
>
>> On Fri, Oct 21, 2022 at 11:41 AM Andrew Morton
>> <[email protected]> wrote:
>>>
>>> On Fri, 21 Oct 2022 18:16:23 +0800 Baolin Wang <[email protected]> wrote:
>>>
>>>> When THP migration, if THPs are split and all subpages are migrated successfully
>>>> , the migrate_pages() will still return the number of THP that were not migrated.
>>>> That will confuse the callers of migrate_pages(), for example, which will make
>>>> the longterm pinning failed though all pages are migrated successfully.
>>>>
>>>> Thus we should return 0 to indicate all pages are migrated in this case.
>>>>
>>>
>>> This had me puzzled for a while. I think this wording is clearer?
>>>
>>> : During THP migration, if THPs are not migrated but they are split and all
>>> : subpages are migrated successfully, migrate_pages() will still return the
>>> : number of THP pages that were not migrated. This will confuse the callers
>>> : of migrate_pages(). For example, the longterm pinning will failed though
>>> : all pages are migrated successfully.
>>> :
>>> : Thus we should return 0 to indicate that all pages are migrated in this
>>> : case.
>>>
>>> This is a fairly longstanding problem? No Fixes: we can identify?
>>
>> It doesn't seem like a long standing issue. It seems like commit
>> b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
>> fixed one problem, but introduced this new one IIUC.
>>
>> Before this commit, the code did:
>>
>> nr_failed += retry + thp_retry;
>> rc = nr_failed;
>>
>> But retry and thp_retry were actually reset for each retry until the
>> last one. So as long as there is no permanent migration failure and
>> THP split failure, nr_failed should be 0 IIUC. TBH the code is a
>> little bit hard to follow, please correct me if I'm wrong.
>
> I think that you are correct. We can added
>
> Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")

I think so too. Thanks Yang and Ying for pointing it out.

>
>>> Did you consider the desirability of a -stable backport?
>
> I think this can be backport to -stable.

Agree.

Andrew, could you help to add the Fixes tag and cc -stable? Thanks.

2022-10-24 06:53:54

by Baolin Wang

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully



On 10/24/2022 10:36 AM, Alistair Popple wrote:
>
> Baolin Wang <[email protected]> writes:
>
>> When THP migration, if THPs are split and all subpages are migrated successfully
>> , the migrate_pages() will still return the number of THP that were not migrated.
>> That will confuse the callers of migrate_pages(), for example, which will make
>> the longterm pinning failed though all pages are migrated successfully.
>>
>> Thus we should return 0 to indicate all pages are migrated in this case.
>>
>> Signed-off-by: Baolin Wang <[email protected]>
>> ---
>> Changes from v1:
>> - Fix the return value of migrate_pages() instead of fixing the
>> callers' validation.
>> ---
>> mm/migrate.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 8e5eb6e..1da0dbc 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1582,6 +1582,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>> */
>> list_splice(&ret_pages, from);
>>
>> + /*
>> + * Return 0 in case all subpages of fail-to-migrate THPs are
>> + * migrated successfully.
>> + */
>> + if (nr_thp_split && list_empty(from))
>> + rc = 0;
>
> Why do you need to check nr_thp_split? Wouldn't list_empty(from) == True

Only in the case of THP split, we can meet this abnormal case. So if no
THP split, just return the original 'rc' instead of validating the list,
since the 'nr_thp_split' validation is cheaper than the list_empty()
validation IMHO.

> imply success? And if it doesn't imply success wouldn't it be possible
> to end up with nr_thp_split && list_empty(from) whilst still having
> pages that failed to migrate?
>
> The list management and return code logic from unmap_and_move() has
> gotten pretty difficult to follow and could do with some rework IMHO.

Yes, Huang Ying has sent a RFC patchset[1] doing some code refactor,
which seems a good start.

[1] https://lore.kernel.org/all/[email protected]/

2022-10-24 07:43:47

by Alistair Popple

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully


Baolin Wang <[email protected]> writes:

> On 10/24/2022 10:36 AM, Alistair Popple wrote:
>> Baolin Wang <[email protected]> writes:
>>
>>> When THP migration, if THPs are split and all subpages are migrated successfully
>>> , the migrate_pages() will still return the number of THP that were not migrated.
>>> That will confuse the callers of migrate_pages(), for example, which will make
>>> the longterm pinning failed though all pages are migrated successfully.
>>>
>>> Thus we should return 0 to indicate all pages are migrated in this case.
>>>
>>> Signed-off-by: Baolin Wang <[email protected]>
>>> ---
>>> Changes from v1:
>>> - Fix the return value of migrate_pages() instead of fixing the
>>> callers' validation.
>>> ---
>>> mm/migrate.c | 7 +++++++
>>> 1 file changed, 7 insertions(+)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index 8e5eb6e..1da0dbc 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -1582,6 +1582,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>> */
>>> list_splice(&ret_pages, from);
>>>
>>> + /*
>>> + * Return 0 in case all subpages of fail-to-migrate THPs are
>>> + * migrated successfully.
>>> + */
>>> + if (nr_thp_split && list_empty(from))
>>> + rc = 0;
>> Why do you need to check nr_thp_split? Wouldn't list_empty(from) == True
>
> Only in the case of THP split, we can meet this abnormal case. So if no THP
> split, just return the original 'rc' instead of validating the list, since the
> 'nr_thp_split' validation is cheaper than the list_empty() validation IMHO.

Is it really that much cheaper? We're already retrying migrations
multiple times, etc. so surely the difference here would be marginal at
best, and IMHO the code would be much clearer if we always set rc = 0
when list_empty(from) = True.

>> imply success? And if it doesn't imply success wouldn't it be possible
>> to end up with nr_thp_split && list_empty(from) whilst still having
>> pages that failed to migrate?
>> The list management and return code logic from unmap_and_move() has
>> gotten pretty difficult to follow and could do with some rework IMHO.
>
> Yes, Huang Ying has sent a RFC patchset[1] doing some code refactor, which seems
> a good start.

Thanks for pointing that out, I looked at it a while back but missed the
clean ups. I was kind of waiting for the non-RFC version before taking
another closer look.

> [1] https://lore.kernel.org/all/[email protected]/

2022-10-24 08:16:59

by Baolin Wang

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully



On 10/24/2022 3:24 PM, Alistair Popple wrote:
>
> Baolin Wang <[email protected]> writes:
>
>> On 10/24/2022 10:36 AM, Alistair Popple wrote:
>>> Baolin Wang <[email protected]> writes:
>>>
>>>> When THP migration, if THPs are split and all subpages are migrated successfully
>>>> , the migrate_pages() will still return the number of THP that were not migrated.
>>>> That will confuse the callers of migrate_pages(), for example, which will make
>>>> the longterm pinning failed though all pages are migrated successfully.
>>>>
>>>> Thus we should return 0 to indicate all pages are migrated in this case.
>>>>
>>>> Signed-off-by: Baolin Wang <[email protected]>
>>>> ---
>>>> Changes from v1:
>>>> - Fix the return value of migrate_pages() instead of fixing the
>>>> callers' validation.
>>>> ---
>>>> mm/migrate.c | 7 +++++++
>>>> 1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>> index 8e5eb6e..1da0dbc 100644
>>>> --- a/mm/migrate.c
>>>> +++ b/mm/migrate.c
>>>> @@ -1582,6 +1582,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>> */
>>>> list_splice(&ret_pages, from);
>>>>
>>>> + /*
>>>> + * Return 0 in case all subpages of fail-to-migrate THPs are
>>>> + * migrated successfully.
>>>> + */
>>>> + if (nr_thp_split && list_empty(from))
>>>> + rc = 0;
>>> Why do you need to check nr_thp_split? Wouldn't list_empty(from) == True
>>
>> Only in the case of THP split, we can meet this abnormal case. So if no THP
>> split, just return the original 'rc' instead of validating the list, since the
>> 'nr_thp_split' validation is cheaper than the list_empty() validation IMHO.
>
> Is it really that much cheaper? We're already retrying migrations
> multiple times, etc. so surely the difference here would be marginal at
> best, and IMHO the code would be much clearer if we always set rc = 0
> when list_empty(from) = True.

Yeah, the difference is marginal and I have no strong preference. OK,
will drop the 'nr_thp_split' in next version. Thanks.

>>> imply success? And if it doesn't imply success wouldn't it be possible
>>> to end up with nr_thp_split && list_empty(from) whilst still having
>>> pages that failed to migrate?
>>> The list management and return code logic from unmap_and_move() has
>>> gotten pretty difficult to follow and could do with some rework IMHO.
>>
>> Yes, Huang Ying has sent a RFC patchset[1] doing some code refactor, which seems
>> a good start.
>
> Thanks for pointing that out, I looked at it a while back but missed the
> clean ups. I was kind of waiting for the non-RFC version before taking
> another closer look.
>
>> [1] https://lore.kernel.org/all/[email protected]/

2022-10-24 09:01:21

by Alistair Popple

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully


Baolin Wang <[email protected]> writes:

> On 10/24/2022 3:24 PM, Alistair Popple wrote:
>> Baolin Wang <[email protected]> writes:
>>
>>> On 10/24/2022 10:36 AM, Alistair Popple wrote:
>>>> Baolin Wang <[email protected]> writes:
>>>>
>>>>> When THP migration, if THPs are split and all subpages are migrated successfully
>>>>> , the migrate_pages() will still return the number of THP that were not migrated.
>>>>> That will confuse the callers of migrate_pages(), for example, which will make
>>>>> the longterm pinning failed though all pages are migrated successfully.
>>>>>
>>>>> Thus we should return 0 to indicate all pages are migrated in this case.
>>>>>
>>>>> Signed-off-by: Baolin Wang <[email protected]>
>>>>> ---
>>>>> Changes from v1:
>>>>> - Fix the return value of migrate_pages() instead of fixing the
>>>>> callers' validation.
>>>>> ---
>>>>> mm/migrate.c | 7 +++++++
>>>>> 1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>>> index 8e5eb6e..1da0dbc 100644
>>>>> --- a/mm/migrate.c
>>>>> +++ b/mm/migrate.c
>>>>> @@ -1582,6 +1582,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>>> */
>>>>> list_splice(&ret_pages, from);
>>>>>
>>>>> + /*
>>>>> + * Return 0 in case all subpages of fail-to-migrate THPs are
>>>>> + * migrated successfully.
>>>>> + */
>>>>> + if (nr_thp_split && list_empty(from))
>>>>> + rc = 0;
>>>> Why do you need to check nr_thp_split? Wouldn't list_empty(from) == True
>>>
>>> Only in the case of THP split, we can meet this abnormal case. So if no THP
>>> split, just return the original 'rc' instead of validating the list, since the
>>> 'nr_thp_split' validation is cheaper than the list_empty() validation IMHO.
>> Is it really that much cheaper? We're already retrying migrations
>> multiple times, etc. so surely the difference here would be marginal at
>> best, and IMHO the code would be much clearer if we always set rc = 0
>> when list_empty(from) = True.
>
> Yeah, the difference is marginal and I have no strong preference. OK, will drop
> the 'nr_thp_split' in next version. Thanks.

Thanks. With that change feel free to add:

Reviewed-by: Alistair Popple <[email protected]>

>>>> imply success? And if it doesn't imply success wouldn't it be possible
>>>> to end up with nr_thp_split && list_empty(from) whilst still having
>>>> pages that failed to migrate?
>>>> The list management and return code logic from unmap_and_move() has
>>>> gotten pretty difficult to follow and could do with some rework IMHO.
>>>
>>> Yes, Huang Ying has sent a RFC patchset[1] doing some code refactor, which seems
>>> a good start.
>> Thanks for pointing that out, I looked at it a while back but missed the
>> clean ups. I was kind of waiting for the non-RFC version before taking
>> another closer look.
>>
>>> [1] https://lore.kernel.org/all/[email protected]/