2022-03-28 06:51:33

by Wei Yang

Subject: [PATCH 2/2] mm/vmscan: make sure wakeup_kswapd with managed zone

wakeup_kswapd() only wakes up kswapd when the zone is managed.

Two callers of wakeup_kswapd() work from the node perspective:

* wake_all_kswapds
* numamigrate_isolate_page

If they pick a !managed zone, that is not what we expect.

This patch makes sure we pick up a managed zone for wakeup_kswapd().
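
For reference, wakeup_kswapd() itself silently ignores zones with no
managed pages; the guard looks roughly like this (mm/vmscan.c,
simplified sketch):

void wakeup_kswapd(struct zone *zone, gfp_t gfp_flags, int order,
		   enum zone_type highest_zoneidx)
{
	/* No pages managed by the buddy allocator, so there is
	 * nothing for kswapd to reclaim; the wakeup is dropped. */
	if (!managed_zone(zone))
		return;

	/* ... actual wakeup logic follows ... */
}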

Signed-off-by: Wei Yang <[email protected]>
---
mm/migrate.c | 2 +-
mm/page_alloc.c | 2 ++
2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 3d60823afd2d..c4b654c0bdf0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2046,7 +2046,7 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
 		if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING))
 			return 0;
 		for (z = pgdat->nr_zones - 1; z >= 0; z--) {
-			if (populated_zone(pgdat->node_zones + z))
+			if (managed_zone(pgdat->node_zones + z))
 				break;
 		}
 		wakeup_kswapd(pgdat->node_zones + z, 0, order, ZONE_MOVABLE);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4c0c4ef94ba0..6656c2d06e01 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4674,6 +4674,8 @@ static void wake_all_kswapds(unsigned int order, gfp_t gfp_mask,

 	for_each_zone_zonelist_nodemask(zone, z, ac->zonelist, highest_zoneidx,
 					ac->nodemask) {
+		if (!managed_zone(zone))
+			continue;
 		if (last_pgdat != zone->zone_pgdat)
 			wakeup_kswapd(zone, gfp_mask, order, highest_zoneidx);
 		last_pgdat = zone->zone_pgdat;
--
2.33.1


2022-03-28 11:44:29

by Huang, Ying

Subject: Re: [PATCH 2/2] mm/vmscan: make sure wakeup_kswapd with managed zone

Hi, Wei,

Wei Yang <[email protected]> writes:

> wakeup_kswapd() only wakes up kswapd when the zone is managed.
>
> Two callers of wakeup_kswapd() work from the node perspective:
>
> * wake_all_kswapds
> * numamigrate_isolate_page
>
> If they pick a !managed zone, that is not what we expect.
>
> This patch makes sure we pick up a managed zone for wakeup_kswapd().
>
> Signed-off-by: Wei Yang <[email protected]>
> ---
> mm/migrate.c | 2 +-
> mm/page_alloc.c | 2 ++
> 2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 3d60823afd2d..c4b654c0bdf0 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2046,7 +2046,7 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
> if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING))
> return 0;
> for (z = pgdat->nr_zones - 1; z >= 0; z--) {
> - if (populated_zone(pgdat->node_zones + z))
> + if (managed_zone(pgdat->node_zones + z))

This looks good to me! Thanks! It seems that we can replace
populated_zone() in migrate_balanced_pgdat() too. Right?
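
A simplified sketch of the function being referred to (mm/migrate.c;
the populated_zone() check is the one that could become managed_zone()):

static bool migrate_balanced_pgdat(struct pglist_data *pgdat,
				   unsigned long nr_migrate_pages)
{
	int z;

	for (z = pgdat->nr_zones - 1; z >= 0; z--) {
		struct zone *zone = pgdat->node_zones + z;

		if (!populated_zone(zone))	/* candidate for managed_zone() */
			continue;

		/* Avoid waking kswapd by allocating pages_to_migrate pages. */
		if (!zone_watermark_ok(zone, 0,
				       high_wmark_pages(zone) +
				       nr_migrate_pages,
				       ZONE_MOVABLE, 0))
			continue;
		return true;
	}
	return false;
}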

> break;
> }
> wakeup_kswapd(pgdat->node_zones + z, 0, order, ZONE_MOVABLE);
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4c0c4ef94ba0..6656c2d06e01 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4674,6 +4674,8 @@ static void wake_all_kswapds(unsigned int order, gfp_t gfp_mask,
>
> for_each_zone_zonelist_nodemask(zone, z, ac->zonelist, highest_zoneidx,
> ac->nodemask) {
> + if (!managed_zone(zone))
> + continue;
> if (last_pgdat != zone->zone_pgdat)
> wakeup_kswapd(zone, gfp_mask, order, highest_zoneidx);
> last_pgdat = zone->zone_pgdat;

Best Regards,
Huang, Ying

2022-03-29 00:53:03

by Wei Yang

Subject: Re: [PATCH 2/2] mm/vmscan: make sure wakeup_kswapd with managed zone

On Mon, Mar 28, 2022 at 09:08:34AM +0800, Huang, Ying wrote:
>Hi, Wei,
>
>Wei Yang <[email protected]> writes:
>
>> wakeup_kswapd() only wakes up kswapd when the zone is managed.
>>
>> Two callers of wakeup_kswapd() work from the node perspective:
>>
>> * wake_all_kswapds
>> * numamigrate_isolate_page
>>
>> If they pick a !managed zone, that is not what we expect.
>>
>> This patch makes sure we pick up a managed zone for wakeup_kswapd().
>>
>> Signed-off-by: Wei Yang <[email protected]>
>> ---
>> mm/migrate.c | 2 +-
>> mm/page_alloc.c | 2 ++
>> 2 files changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 3d60823afd2d..c4b654c0bdf0 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -2046,7 +2046,7 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
>> if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING))
>> return 0;
>> for (z = pgdat->nr_zones - 1; z >= 0; z--) {
>> - if (populated_zone(pgdat->node_zones + z))
>> + if (managed_zone(pgdat->node_zones + z))
>
>This looks good to me! Thanks! It seems that we can replace
>populated_zone() in migrate_balanced_pgdat() too. Right?
>

Yes, you are right. I didn't spot this.

Since this patch came from following the clue of wakeup_kswapd(), I am not
sure it is appropriate to fold that change into this patch.

Which way would you prefer to include it: merged into this patch, or as a
separate one?

--
Wei Yang
Help you, Help me

2022-03-29 00:57:37

by Huang, Ying

Subject: Re: [PATCH 2/2] mm/vmscan: make sure wakeup_kswapd with managed zone

Wei Yang <[email protected]> writes:

> On Mon, Mar 28, 2022 at 09:08:34AM +0800, Huang, Ying wrote:
>>Hi, Wei,
>>
>>Wei Yang <[email protected]> writes:
>>
>>> wakeup_kswapd() only wakes up kswapd when the zone is managed.
>>>
>>> Two callers of wakeup_kswapd() work from the node perspective:
>>>
>>> * wake_all_kswapds
>>> * numamigrate_isolate_page
>>>
>>> If they pick a !managed zone, that is not what we expect.
>>>
>>> This patch makes sure we pick up a managed zone for wakeup_kswapd().
>>>
>>> Signed-off-by: Wei Yang <[email protected]>
>>> ---
>>> mm/migrate.c | 2 +-
>>> mm/page_alloc.c | 2 ++
>>> 2 files changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index 3d60823afd2d..c4b654c0bdf0 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -2046,7 +2046,7 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
>>> if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING))
>>> return 0;
>>> for (z = pgdat->nr_zones - 1; z >= 0; z--) {
>>> - if (populated_zone(pgdat->node_zones + z))
>>> + if (managed_zone(pgdat->node_zones + z))
>>
>>This looks good to me! Thanks! It seems that we can replace
>>populated_zone() in migrate_balanced_pgdat() too. Right?
>>
>
> Yes, you are right. I didn't spot this.
>
> Since this patch came from following the clue of wakeup_kswapd(), I am not
> sure it is appropriate to fold that change into this patch.
>
> Which way would you prefer to include it: merged into this patch, or as a
> separate one?

Either is OK for me.

Best Regards,
Huang, Ying

2022-03-29 00:57:39

by Wei Yang

Subject: Re: [PATCH 2/2] mm/vmscan: make sure wakeup_kswapd with managed zone

On Mon, Mar 28, 2022 at 03:23:49PM +0800, Miaohe Lin wrote:
>On 2022/3/28 9:08, Huang, Ying wrote:
>> Hi, Wei,
>>
>> Wei Yang <[email protected]> writes:
>>
>>> wakeup_kswapd() only wakes up kswapd when the zone is managed.
>>>
>>> Two callers of wakeup_kswapd() work from the node perspective:
>>>
>>> * wake_all_kswapds
>>> * numamigrate_isolate_page
>>>
>>> If they pick a !managed zone, that is not what we expect.
>>>
>>> This patch makes sure we pick up a managed zone for wakeup_kswapd().
>>>
>>> Signed-off-by: Wei Yang <[email protected]>
>>> ---
>>> mm/migrate.c | 2 +-
>>> mm/page_alloc.c | 2 ++
>>> 2 files changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index 3d60823afd2d..c4b654c0bdf0 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -2046,7 +2046,7 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
>>> if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING))
>>> return 0;
>>> for (z = pgdat->nr_zones - 1; z >= 0; z--) {
>>> - if (populated_zone(pgdat->node_zones + z))
>>> + if (managed_zone(pgdat->node_zones + z))
>>
>> This looks good to me! Thanks! It seems that we can replace
>> populated_zone() in migrate_balanced_pgdat() too. Right?
>
>This patch looks good to me too. Thanks!
>
>BTW: This reminds me of my bewilderment when I read the relevant code.
>It would be very kind of you to explain the difference between
>managed_zone and populated_zone. IIUC, when the caller relies on the

The difference is that managed_zone() means the zone has pages managed by the
buddy allocator, while populated_zone() means the zone has present pages,
which may all be reserved.
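
Roughly, the two helpers in include/linux/mmzone.h are (simplified):

static inline bool managed_zone(struct zone *zone)
{
	/* Pages handed over to the buddy allocator. */
	return zone_managed_pages(zone);
}

static inline bool populated_zone(struct zone *zone)
{
	/* Pages physically present, possibly all reserved. */
	return zone->present_pages;
}

So managed_zone() implies populated_zone(), but not the other way around.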

>activity from the buddy system, managed_zone should always be used. I think
>there are many places, like compaction, that need to use managed_zone but
>currently use populated_zone. They might need to change to managed_zone
>too. Or am I missing something?

This thread comes from reading commit 6aa303defb74, which adjusted the
vmscan code. It looks like there is some misuse in compaction, but I haven't
had time to go through it.

>
>Many Thanks. :)

--
Wei Yang
Help you, Help me

2022-03-29 02:15:50

by Wei Yang

Subject: Re: [PATCH 2/2] mm/vmscan: make sure wakeup_kswapd with managed zone

On Tue, Mar 29, 2022 at 08:43:23AM +0800, Huang, Ying wrote:
[...]
>>>> --- a/mm/migrate.c
>>>> +++ b/mm/migrate.c
>>>> @@ -2046,7 +2046,7 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
>>>> if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING))
>>>> return 0;
>>>> for (z = pgdat->nr_zones - 1; z >= 0; z--) {
>>>> - if (populated_zone(pgdat->node_zones + z))
>>>> + if (managed_zone(pgdat->node_zones + z))
>>>
>>>This looks good to me! Thanks! It seems that we can replace
>>>populated_zone() in migrate_balanced_pgdat() too. Right?
>>>
>>
>> Yes, you are right. I didn't spot this.
>>
>> Since this patch came from following the clue of wakeup_kswapd(), I am not
>> sure it is appropriate to fold that change into this patch.
>>
>> Which way would you prefer to include it: merged into this patch, or as a
>> separate one?
>
>Either is OK for me.
>

After reading the code, I would like to do a little simplification. Does this
look good to you?

From 85c8a5cd708ada3e9f5b0409413407b7be1bc446 Mon Sep 17 00:00:00 2001
From: Wei Yang <[email protected]>
Date: Tue, 29 Mar 2022 09:24:36 +0800
Subject: [PATCH] mm/migrate.c: return valid zone for wakeup_kswapd from
migrate_balanced_pgdat()

To wake up kswapd, we need to iterate over pgdat->node_zones and pick the
proper zone, but that work has already been done in
migrate_balanced_pgdat().

Let's return the valid zone directly instead of doing the iteration again.

Signed-off-by: Wei Yang <[email protected]>
---
mm/migrate.c | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 5adc55b5347c..b086bd781956 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1973,7 +1973,7 @@ SYSCALL_DEFINE6(move_pages, pid_t, pid, unsigned long, nr_pages,
  * Returns true if this is a safe migration target node for misplaced NUMA
  * pages. Currently it only checks the watermarks which is crude.
  */
-static bool migrate_balanced_pgdat(struct pglist_data *pgdat,
+static struct zone *migrate_balanced_pgdat(struct pglist_data *pgdat,
 				   unsigned long nr_migrate_pages)
 {
 	int z;
@@ -1985,14 +1985,13 @@ static bool migrate_balanced_pgdat(struct pglist_data *pgdat,
 			continue;
 
 		/* Avoid waking kswapd by allocating pages_to_migrate pages. */
-		if (!zone_watermark_ok(zone, 0,
+		if (zone_watermark_ok(zone, 0,
 				       high_wmark_pages(zone) +
 				       nr_migrate_pages,
 				       ZONE_MOVABLE, 0))
-			continue;
-		return true;
+			return zone;
 	}
-	return false;
+	return NULL;
 }
 
 static struct page *alloc_misplaced_dst_page(struct page *page,
@@ -2032,6 +2031,7 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
 	int page_lru;
 	int nr_pages = thp_nr_pages(page);
 	int order = compound_order(page);
+	struct zone *zone;
 
 	VM_BUG_ON_PAGE(order && !PageTransHuge(page), page);
 
@@ -2040,16 +2040,11 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
 		return 0;
 
 	/* Avoid migrating to a node that is nearly full */
-	if (!migrate_balanced_pgdat(pgdat, nr_pages)) {
-		int z;
-
+	if ((zone = migrate_balanced_pgdat(pgdat, nr_pages))) {
 		if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING))
 			return 0;
-		for (z = pgdat->nr_zones - 1; z >= 0; z--) {
-			if (managed_zone(pgdat->node_zones + z))
-				break;
-		}
-		wakeup_kswapd(pgdat->node_zones + z, 0, order, ZONE_MOVABLE);
+
+		wakeup_kswapd(zone, 0, order, ZONE_MOVABLE);
 		return 0;
 	}

--
2.33.1


>Best Regards,
>Huang, Ying

--
Wei Yang
Help you, Help me

2022-03-29 02:57:14

by Matthew Wilcox

Subject: Re: [PATCH 2/2] mm/vmscan: make sure wakeup_kswapd with managed zone

On Tue, Mar 29, 2022 at 01:52:30AM +0000, Wei Yang wrote:
> @@ -1985,14 +1985,13 @@ static bool migrate_balanced_pgdat(struct pglist_data *pgdat,
> continue;
>
> /* Avoid waking kswapd by allocating pages_to_migrate pages. */
> - if (!zone_watermark_ok(zone, 0,
> + if (zone_watermark_ok(zone, 0,
> high_wmark_pages(zone) +
> nr_migrate_pages,
> ZONE_MOVABLE, 0))

Someone's done the silly thing of lining up all of these with spaces,
so either all these lines also need to be shrunk by one space, or you
need to break that convention and just go to a reasonable number of
tabs. I'd do it like this:

	if (zone_watermark_ok(zone, 0,
			high_wmark_pages(zone) + nr_migrate_pages,
			ZONE_MOVABLE, 0))

but not everybody would.

> @@ -2040,16 +2040,11 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
> return 0;
>
> /* Avoid migrating to a node that is nearly full */
> - if (!migrate_balanced_pgdat(pgdat, nr_pages)) {
> - int z;
> -
> + if ((zone = migrate_balanced_pgdat(pgdat, nr_pages))) {

Linus had a rant about this style recently. He much prefers:

	zone = migrate_balanced_pgdat(pgdat, nr_pages);
	if (zone) {

(the exception is for while loops:

	while ((zone = migrate_balanced_pgdat(pgdat, nr_pages)) != NULL)

where he wants to see the comparison against NULL instead of the awkward
double-bracket)

2022-03-29 04:32:28

by Huang, Ying

Subject: Re: [PATCH 2/2] mm/vmscan: make sure wakeup_kswapd with managed zone

Wei Yang <[email protected]> writes:

> On Tue, Mar 29, 2022 at 08:43:23AM +0800, Huang, Ying wrote:
> [...]
>>>>> --- a/mm/migrate.c
>>>>> +++ b/mm/migrate.c
>>>>> @@ -2046,7 +2046,7 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
>>>>> if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING))
>>>>> return 0;
>>>>> for (z = pgdat->nr_zones - 1; z >= 0; z--) {
>>>>> - if (populated_zone(pgdat->node_zones + z))
>>>>> + if (managed_zone(pgdat->node_zones + z))
>>>>
>>>>This looks good to me! Thanks! It seems that we can replace
>>>>populated_zone() in migrate_balanced_pgdat() too. Right?
>>>>
>>>
>>> Yes, you are right. I didn't spot this.
>>>
>>> Since this patch came from following the clue of wakeup_kswapd(), I am not
>>> sure it is appropriate to fold that change into this patch.
>>>
>>> Which way would you prefer to include it: merged into this patch, or as a
>>> separate one?
>>
>>Either is OK for me.
>>
>
> After reading the code, I would like to do a little simplification. Does this
> look good to you?
>
> From 85c8a5cd708ada3e9f5b0409413407b7be1bc446 Mon Sep 17 00:00:00 2001
> From: Wei Yang <[email protected]>
> Date: Tue, 29 Mar 2022 09:24:36 +0800
> Subject: [PATCH] mm/migrate.c: return valid zone for wakeup_kswapd from
> migrate_balanced_pgdat()
>
> To wake up kswapd, we need to iterate over pgdat->node_zones and pick the
> proper zone, but that work has already been done in
> migrate_balanced_pgdat().
>
> Let's return the valid zone directly instead of doing the iteration again.
>
> Signed-off-by: Wei Yang <[email protected]>
> ---
> mm/migrate.c | 21 ++++++++-------------
> 1 file changed, 8 insertions(+), 13 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 5adc55b5347c..b086bd781956 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1973,7 +1973,7 @@ SYSCALL_DEFINE6(move_pages, pid_t, pid, unsigned long, nr_pages,
> * Returns true if this is a safe migration target node for misplaced NUMA
> * pages. Currently it only checks the watermarks which is crude.
> */
> -static bool migrate_balanced_pgdat(struct pglist_data *pgdat,
> +static struct zone *migrate_balanced_pgdat(struct pglist_data *pgdat,
> unsigned long nr_migrate_pages)
> {
> int z;
> @@ -1985,14 +1985,13 @@ static bool migrate_balanced_pgdat(struct pglist_data *pgdat,
> continue;
>
> /* Avoid waking kswapd by allocating pages_to_migrate pages. */
> - if (!zone_watermark_ok(zone, 0,
> + if (zone_watermark_ok(zone, 0,
> high_wmark_pages(zone) +
> nr_migrate_pages,
> ZONE_MOVABLE, 0))
> - continue;
> - return true;
> + return zone;
> }
> - return false;
> + return NULL;
> }
>
> static struct page *alloc_misplaced_dst_page(struct page *page,
> @@ -2032,6 +2031,7 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
> int page_lru;
> int nr_pages = thp_nr_pages(page);
> int order = compound_order(page);
> + struct zone *zone;
>
> VM_BUG_ON_PAGE(order && !PageTransHuge(page), page);
>
> @@ -2040,16 +2040,11 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
> return 0;
>
> /* Avoid migrating to a node that is nearly full */
> - if (!migrate_balanced_pgdat(pgdat, nr_pages)) {
> - int z;
> -
> + if ((zone = migrate_balanced_pgdat(pgdat, nr_pages))) {

I think this reverses the original semantics. Originally, we give up and
wake up kswapd if there are not enough free pages on the target node. But
now, you give up and wake up kswapd if there are enough free pages.
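
One way to keep the simplification without reversing the semantics might be
to return the zone to wake kswapd on when the node is *not* balanced, e.g.
(an untested sketch):

static struct zone *migrate_balanced_pgdat(struct pglist_data *pgdat,
					   unsigned long nr_migrate_pages)
{
	int z;
	struct zone *wake_zone = NULL;

	for (z = pgdat->nr_zones - 1; z >= 0; z--) {
		struct zone *zone = pgdat->node_zones + z;

		if (!managed_zone(zone))
			continue;

		/* Remember the highest managed zone for wakeup_kswapd(). */
		if (!wake_zone)
			wake_zone = zone;

		/* Balanced if any zone can take the pages and stay
		 * above its high watermark. */
		if (zone_watermark_ok(zone, 0,
				      high_wmark_pages(zone) + nr_migrate_pages,
				      ZONE_MOVABLE, 0))
			return NULL;
	}
	/* Not balanced: the caller wakes kswapd on the returned zone. */
	return wake_zone;
}

with the caller testing the returned zone before calling wakeup_kswapd().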

Best Regards,
Huang, Ying

> if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING))
> return 0;
> - for (z = pgdat->nr_zones - 1; z >= 0; z--) {
> - if (managed_zone(pgdat->node_zones + z))
> - break;
> - }
> - wakeup_kswapd(pgdat->node_zones + z, 0, order, ZONE_MOVABLE);
> +
> + wakeup_kswapd(zone, 0, order, ZONE_MOVABLE);
> return 0;
> }
>
> --
>
> 2.33.1

2022-03-31 01:09:20

by Wei Yang

Subject: Re: [PATCH 2/2] mm/vmscan: make sure wakeup_kswapd with managed zone

On Tue, Mar 29, 2022 at 03:22:51AM +0100, Matthew Wilcox wrote:
>On Tue, Mar 29, 2022 at 01:52:30AM +0000, Wei Yang wrote:
>> @@ -1985,14 +1985,13 @@ static bool migrate_balanced_pgdat(struct pglist_data *pgdat,
>> continue;
>>
>> /* Avoid waking kswapd by allocating pages_to_migrate pages. */
>> - if (!zone_watermark_ok(zone, 0,
>> + if (zone_watermark_ok(zone, 0,
>> high_wmark_pages(zone) +
>> nr_migrate_pages,
>> ZONE_MOVABLE, 0))
>
>Someone's done the silly thing of lining up all of these with spaces,
>so either all these lines also need to be shrunk by one space, or you
>need to break that convention and just go to a reasonable number of
>tabs. I'd do it like this:
>
>	if (zone_watermark_ok(zone, 0,
>			high_wmark_pages(zone) + nr_migrate_pages,
>			ZONE_MOVABLE, 0))
>
>but not everybody would.
>
>> @@ -2040,16 +2040,11 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
>> return 0;
>>
>> /* Avoid migrating to a node that is nearly full */
>> - if (!migrate_balanced_pgdat(pgdat, nr_pages)) {
>> - int z;
>> -
>> + if ((zone = migrate_balanced_pgdat(pgdat, nr_pages))) {
>
>Linus had a rant about this style recently. He much prefers:
>
>	zone = migrate_balanced_pgdat(pgdat, nr_pages);
>	if (zone) {
>
>(the exception is for while loops:
>
>	while ((zone = migrate_balanced_pgdat(pgdat, nr_pages)) != NULL)
>
>where he wants to see the comparison against NULL instead of the awkward
>double-bracket)

Matthew,

Thanks for your suggestion, I would change this later.

--
Wei Yang
Help you, Help me

2022-03-31 04:52:38

by Wei Yang

Subject: Re: [PATCH 2/2] mm/vmscan: make sure wakeup_kswapd with managed zone

On Tue, Mar 29, 2022 at 10:05:20AM +0800, Huang, Ying wrote:
>Wei Yang <[email protected]> writes:
>
>> On Tue, Mar 29, 2022 at 08:43:23AM +0800, Huang, Ying wrote:
>> [...]
>>>>>> --- a/mm/migrate.c
>>>>>> +++ b/mm/migrate.c
>>>>>> @@ -2046,7 +2046,7 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
>>>>>> if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING))
>>>>>> return 0;
>>>>>> for (z = pgdat->nr_zones - 1; z >= 0; z--) {
>>>>>> - if (populated_zone(pgdat->node_zones + z))
>>>>>> + if (managed_zone(pgdat->node_zones + z))
>>>>>
>>>>>This looks good to me! Thanks! It seems that we can replace
>>>>>populated_zone() in migrate_balanced_pgdat() too. Right?
>>>>>
>>>>
>>>> Yes, you are right. I didn't spot this.
>>>>
>>>> Since this patch came from following the clue of wakeup_kswapd(), I am not
>>>> sure it is appropriate to fold that change into this patch.
>>>>
>>>> Which way would you prefer to include it: merged into this patch, or as a
>>>> separate one?
>>>
>>>Either is OK for me.
>>>
>>
>> After reading the code, I would like to do a little simplification. Does this
>> look good to you?
>>
>> From 85c8a5cd708ada3e9f5b0409413407b7be1bc446 Mon Sep 17 00:00:00 2001
>> From: Wei Yang <[email protected]>
>> Date: Tue, 29 Mar 2022 09:24:36 +0800
>> Subject: [PATCH] mm/migrate.c: return valid zone for wakeup_kswapd from
>> migrate_balanced_pgdat()
>>
>> To wake up kswapd, we need to iterate over pgdat->node_zones and pick the
>> proper zone, but that work has already been done in
>> migrate_balanced_pgdat().
>>
>> Let's return the valid zone directly instead of doing the iteration again.
>>
>> Signed-off-by: Wei Yang <[email protected]>
>> ---
>> mm/migrate.c | 21 ++++++++-------------
>> 1 file changed, 8 insertions(+), 13 deletions(-)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 5adc55b5347c..b086bd781956 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1973,7 +1973,7 @@ SYSCALL_DEFINE6(move_pages, pid_t, pid, unsigned long, nr_pages,
>> * Returns true if this is a safe migration target node for misplaced NUMA
>> * pages. Currently it only checks the watermarks which is crude.
>> */
>> -static bool migrate_balanced_pgdat(struct pglist_data *pgdat,
>> +static struct zone *migrate_balanced_pgdat(struct pglist_data *pgdat,
>> unsigned long nr_migrate_pages)
>> {
>> int z;
>> @@ -1985,14 +1985,13 @@ static bool migrate_balanced_pgdat(struct pglist_data *pgdat,
>> continue;
>>
>> /* Avoid waking kswapd by allocating pages_to_migrate pages. */
>> - if (!zone_watermark_ok(zone, 0,
>> + if (zone_watermark_ok(zone, 0,
>> high_wmark_pages(zone) +
>> nr_migrate_pages,
>> ZONE_MOVABLE, 0))
>> - continue;
>> - return true;
>> + return zone;
>> }
>> - return false;
>> + return NULL;
>> }
>>
>> static struct page *alloc_misplaced_dst_page(struct page *page,
>> @@ -2032,6 +2031,7 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
>> int page_lru;
>> int nr_pages = thp_nr_pages(page);
>> int order = compound_order(page);
>> + struct zone *zone;
>>
>> VM_BUG_ON_PAGE(order && !PageTransHuge(page), page);
>>
>> @@ -2040,16 +2040,11 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
>> return 0;
>>
>> /* Avoid migrating to a node that is nearly full */
>> - if (!migrate_balanced_pgdat(pgdat, nr_pages)) {
>> - int z;
>> -
>> + if ((zone = migrate_balanced_pgdat(pgdat, nr_pages))) {
>
>I think this reverses the original semantics. Originally, we give up and
>wake up kswapd if there are not enough free pages on the target node. But
>now, you give up and wake up kswapd if there are enough free pages.
>

You are right, I misunderstood it.

Sorry


--
Wei Yang
Help you, Help me