Changes from v2:
1. Rewrite the comment in code and the commit message becasue it
turns out that this patch is not the real fix for the oops
descriped. The real fix goes in another patch below:
https://lore.kernel.org/lkml/[email protected]/
Changes from v1:
1. Trim the verbose oops in the commit message. (feedbacked by
Phil Auld)
2. Rewrite a comment in code. (feedbacked by Phil Auld)
--->8---
From 150af2f78e19217a1d03e47e3ee5279684590fb4 Mon Sep 17 00:00:00 2001
From: Byungchul Park <[email protected]>
Date: Fri, 16 Feb 2024 20:18:10 +0900
Subject: [PATCH v3] sched/numa, mm: do not promote folios to nodes not set N_MEMORY
A numa node might not have its local memory but CPUs. Promoting a folio
to the node's local memory is nonsense. So avoid nodes not set N_MEMORY
from getting promoted.
Signed-off-by: Byungchul Park <[email protected]>
---
kernel/sched/fair.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d7a3c63a2171..7ed9ef3c0134 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1828,6 +1828,13 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
int dst_nid = cpu_to_node(dst_cpu);
int last_cpupid, this_cpupid;
+ /*
+ * A node of dst_nid might not have its local memory. Promoting
+ * a folio to the node is meaningless.
+ */
+ if (!node_state(dst_nid, N_MEMORY))
+ return false;
+
/*
* The pages in slow memory node should be migrated according
* to hot/cold instead of private/shared.
--
2.17.1
On Fri, Feb 16, 2024 at 08:40:45PM +0900, Byungchul Park wrote:
> From 150af2f78e19217a1d03e47e3ee5279684590fb4 Mon Sep 17 00:00:00 2001
> From: Byungchul Park <[email protected]>
> Date: Fri, 16 Feb 2024 20:18:10 +0900
> Subject: [PATCH v3] sched/numa, mm: do not promote folios to nodes not set N_MEMORY
"do not try to promote folios to memoryless nodes"
because AFAICS we are just trying.
Even if should_numa_migrate_memory() returns true, I assume that we will
fail somewhere down the chain e.g: migrate_pages() when we see that this
node does not any memory, right?
> A numa node might not have its local memory but CPUs. Promoting a folio
> to the node's local memory is nonsense. So avoid nodes not set N_MEMORY
> from getting promoted.
If you talk about memoryless nodes everybody gets it better IMHO.
"Memoryless nodes do not have any memory to migrate to, so stop trying it."
> Signed-off-by: Byungchul Park <[email protected]>
> ---
> kernel/sched/fair.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d7a3c63a2171..7ed9ef3c0134 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1828,6 +1828,13 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
> int dst_nid = cpu_to_node(dst_cpu);
> int last_cpupid, this_cpupid;
>
> + /*
> + * A node of dst_nid might not have its local memory. Promoting
> + * a folio to the node is meaningless.
> + */
> + if (!node_state(dst_nid, N_MEMORY))
> + return false;
"Cannot migrate to memoryless nodes"
seems shorter and more clear.
So, what happens when we return true here? will we fail at
migrate_pages() I guess? That is quite down the road so I guess
this check can save us some time.
--
Oscar Salvador
SUSE Labs
On Fri, Feb 16, 2024 at 02:51:24PM +0100, Oscar Salvador wrote:
> On Fri, Feb 16, 2024 at 08:40:45PM +0900, Byungchul Park wrote:
> > From 150af2f78e19217a1d03e47e3ee5279684590fb4 Mon Sep 17 00:00:00 2001
> > From: Byungchul Park <[email protected]>
> > Date: Fri, 16 Feb 2024 20:18:10 +0900
> > Subject: [PATCH v3] sched/numa, mm: do not promote folios to nodes not set N_MEMORY
>
> "do not try to promote folios to memoryless nodes"
Thinking some more, promote might be misleading, just something like
"do not try to migrate memory to memoryless nodes".
As this is not a bug fix but an optimization, as we will fail anyways
in migrate_misplaced_folio() when migrate_balanced_pgdat() notices that
we do not have any memory on that code.
With the other comments addressed:
Reviewed-by: Oscar Salvador <[email protected]>
>
> because AFAICS we are just trying.
> Even if should_numa_migrate_memory() returns true, I assume that we will
> fail somewhere down the chain e.g: migrate_pages() when we see that this
> node does not any memory, right?
>
> > A numa node might not have its local memory but CPUs. Promoting a folio
> > to the node's local memory is nonsense. So avoid nodes not set N_MEMORY
> > from getting promoted.
>
> If you talk about memoryless nodes everybody gets it better IMHO.
> "Memoryless nodes do not have any memory to migrate to, so stop trying it."
>
>
> > Signed-off-by: Byungchul Park <[email protected]>
> > ---
> > kernel/sched/fair.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index d7a3c63a2171..7ed9ef3c0134 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -1828,6 +1828,13 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
> > int dst_nid = cpu_to_node(dst_cpu);
> > int last_cpupid, this_cpupid;
> >
> > + /*
> > + * A node of dst_nid might not have its local memory. Promoting
> > + * a folio to the node is meaningless.
> > + */
> > + if (!node_state(dst_nid, N_MEMORY))
> > + return false;
>
> "Cannot migrate to memoryless nodes"
>
> seems shorter and more clear.
>
> So, what happens when we return true here? will we fail at
> migrate_pages() I guess? That is quite down the road so I guess
> this check can save us some time.
>
>
> --
> Oscar Salvador
> SUSE Labs
>
--
Oscar Salvador
SUSE Labs
On Fri, Feb 16, 2024 at 02:51:24PM +0100, Oscar Salvador wrote:
> On Fri, Feb 16, 2024 at 08:40:45PM +0900, Byungchul Park wrote:
> > From 150af2f78e19217a1d03e47e3ee5279684590fb4 Mon Sep 17 00:00:00 2001
> > From: Byungchul Park <[email protected]>
> > Date: Fri, 16 Feb 2024 20:18:10 +0900
> > Subject: [PATCH v3] sched/numa, mm: do not promote folios to nodes not set N_MEMORY
>
> "do not try to promote folios to memoryless nodes"
Thank you. I will.
> because AFAICS we are just trying.
> Even if should_numa_migrate_memory() returns true, I assume that we will
> fail somewhere down the chain e.g: migrate_pages() when we see that this
> node does not any memory, right?
Yes.
> > A numa node might not have its local memory but CPUs. Promoting a folio
> > to the node's local memory is nonsense. So avoid nodes not set N_MEMORY
> > from getting promoted.
>
> If you talk about memoryless nodes everybody gets it better IMHO.
> "Memoryless nodes do not have any memory to migrate to, so stop trying it."
Much better.
> > Signed-off-by: Byungchul Park <[email protected]>
> > ---
> > kernel/sched/fair.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index d7a3c63a2171..7ed9ef3c0134 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -1828,6 +1828,13 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
> > int dst_nid = cpu_to_node(dst_cpu);
> > int last_cpupid, this_cpupid;
> >
> > + /*
> > + * A node of dst_nid might not have its local memory. Promoting
> > + * a folio to the node is meaningless.
> > + */
> > + if (!node_state(dst_nid, N_MEMORY))
> > + return false;
>
> "Cannot migrate to memoryless nodes"
>
> seems shorter and more clear.
Agree.
Byungchul
> So, what happens when we return true here? will we fail at
> migrate_pages() I guess? That is quite down the road so I guess
> this check can save us some time.
>
>
> --
> Oscar Salvador
> SUSE Labs
On Sun, Feb 18, 2024 at 08:46:16AM +0100, Oscar Salvador wrote:
> On Fri, Feb 16, 2024 at 02:51:24PM +0100, Oscar Salvador wrote:
> > On Fri, Feb 16, 2024 at 08:40:45PM +0900, Byungchul Park wrote:
> > > From 150af2f78e19217a1d03e47e3ee5279684590fb4 Mon Sep 17 00:00:00 2001
> > > From: Byungchul Park <[email protected]>
> > > Date: Fri, 16 Feb 2024 20:18:10 +0900
> > > Subject: [PATCH v3] sched/numa, mm: do not promote folios to nodes not set N_MEMORY
> >
> > "do not try to promote folios to memoryless nodes"
>
> Thinking some more, promote might be misleading, just something like
> "do not try to migrate memory to memoryless nodes".
Thank you. I will.
> As this is not a bug fix but an optimization, as we will fail anyways
> in migrate_misplaced_folio() when migrate_balanced_pgdat() notices that
> we do not have any memory on that code.
>
> With the other comments addressed:
>
> Reviewed-by: Oscar Salvador <[email protected]>
Thank you for reviewing.
Byungchul
> > because AFAICS we are just trying.
> > Even if should_numa_migrate_memory() returns true, I assume that we will
> > fail somewhere down the chain e.g: migrate_pages() when we see that this
> > node does not any memory, right?
> >
> > > A numa node might not have its local memory but CPUs. Promoting a folio
> > > to the node's local memory is nonsense. So avoid nodes not set N_MEMORY
> > > from getting promoted.
> >
> > If you talk about memoryless nodes everybody gets it better IMHO.
> > "Memoryless nodes do not have any memory to migrate to, so stop trying it."
> >
> >
> > > Signed-off-by: Byungchul Park <[email protected]>
> > > ---
> > > kernel/sched/fair.c | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index d7a3c63a2171..7ed9ef3c0134 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -1828,6 +1828,13 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
> > > int dst_nid = cpu_to_node(dst_cpu);
> > > int last_cpupid, this_cpupid;
> > >
> > > + /*
> > > + * A node of dst_nid might not have its local memory. Promoting
> > > + * a folio to the node is meaningless.
> > > + */
> > > + if (!node_state(dst_nid, N_MEMORY))
> > > + return false;
> >
> > "Cannot migrate to memoryless nodes"
> >
> > seems shorter and more clear.
> >
> > So, what happens when we return true here? will we fail at
> > migrate_pages() I guess? That is quite down the road so I guess
> > this check can save us some time.
> >
> >
> > --
> > Oscar Salvador
> > SUSE Labs
> >
>
> --
> Oscar Salvador
> SUSE Labs
On 16.02.24 12:40, Byungchul Park wrote:
> Changes from v2:
> 1. Rewrite the comment in code and the commit message becasue it
> turns out that this patch is not the real fix for the oops
> descriped. The real fix goes in another patch below:
>
> https://lore.kernel.org/lkml/[email protected]/
>
> Changes from v1:
> 1. Trim the verbose oops in the commit message. (feedbacked by
> Phil Auld)
> 2. Rewrite a comment in code. (feedbacked by Phil Auld)
>
> --->8---
> From 150af2f78e19217a1d03e47e3ee5279684590fb4 Mon Sep 17 00:00:00 2001
> From: Byungchul Park <[email protected]>
> Date: Fri, 16 Feb 2024 20:18:10 +0900
> Subject: [PATCH v3] sched/numa, mm: do not promote folios to nodes not set N_MEMORY
>
> A numa node might not have its local memory but CPUs. Promoting a folio
> to the node's local memory is nonsense. So avoid nodes not set N_MEMORY
> from getting promoted.
So there is no bug/panic that can be triggered and this is not a "fix"
but an optimization?
>
> Signed-off-by: Byungchul Park <[email protected]>
> ---
> kernel/sched/fair.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d7a3c63a2171..7ed9ef3c0134 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1828,6 +1828,13 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
> int dst_nid = cpu_to_node(dst_cpu);
> int last_cpupid, this_cpupid;
>
> + /*
> + * A node of dst_nid might not have its local memory. Promoting
> + * a folio to the node is meaningless.
> + */
> + if (!node_state(dst_nid, N_MEMORY))
> + return false;
> +
> /*
> * The pages in slow memory node should be migrated according
> * to hot/cold instead of private/shared.
Acked-by: David Hildenbrand <[email protected]>
--
Cheers,
David / dhildenb
On Mon, Feb 19, 2024 at 11:08:54AM +0900 Byungchul Park wrote:
> On Fri, Feb 16, 2024 at 02:51:24PM +0100, Oscar Salvador wrote:
> > On Fri, Feb 16, 2024 at 08:40:45PM +0900, Byungchul Park wrote:
> > > From 150af2f78e19217a1d03e47e3ee5279684590fb4 Mon Sep 17 00:00:00 2001
> > > From: Byungchul Park <[email protected]>
> > > Date: Fri, 16 Feb 2024 20:18:10 +0900
> > > Subject: [PATCH v3] sched/numa, mm: do not promote folios to nodes not set N_MEMORY
> >
> > "do not try to promote folios to memoryless nodes"
>
> Thank you. I will.
>
> > because AFAICS we are just trying.
> > Even if should_numa_migrate_memory() returns true, I assume that we will
> > fail somewhere down the chain e.g: migrate_pages() when we see that this
> > node does not any memory, right?
>
> Yes.
>
> > > A numa node might not have its local memory but CPUs. Promoting a folio
> > > to the node's local memory is nonsense. So avoid nodes not set N_MEMORY
> > > from getting promoted.
> >
> > If you talk about memoryless nodes everybody gets it better IMHO.
> > "Memoryless nodes do not have any memory to migrate to, so stop trying it."
>
> Much better.
>
> > > Signed-off-by: Byungchul Park <[email protected]>
> > > ---
> > > kernel/sched/fair.c | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index d7a3c63a2171..7ed9ef3c0134 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -1828,6 +1828,13 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
> > > int dst_nid = cpu_to_node(dst_cpu);
> > > int last_cpupid, this_cpupid;
> > >
> > > + /*
> > > + * A node of dst_nid might not have its local memory. Promoting
> > > + * a folio to the node is meaningless.
> > > + */
> > > + if (!node_state(dst_nid, N_MEMORY))
> > > + return false;
> >
> > "Cannot migrate to memoryless nodes"
> >
> > seems shorter and more clear.
>
> Agree.
>
We clearly have dst_cpu when we call this so I still think a
check could be done farther up. But this one still looks reasonable
to me.
Thanks,
Phil
Reviewed-by: Phil Auld <[email protected]>
> Byungchul
>
> > So, what happens when we return true here? will we fail at
> > migrate_pages() I guess? That is quite down the road so I guess
> > this check can save us some time.
> >
> >
> > --
> > Oscar Salvador
> > SUSE Labs
>
--
On Sun, 18 Feb 2024, Oscar Salvador wrote:
>On Fri, Feb 16, 2024 at 02:51:24PM +0100, Oscar Salvador wrote:
>> On Fri, Feb 16, 2024 at 08:40:45PM +0900, Byungchul Park wrote:
>> > From 150af2f78e19217a1d03e47e3ee5279684590fb4 Mon Sep 17 00:00:00 2001
>> > From: Byungchul Park <[email protected]>
>> > Date: Fri, 16 Feb 2024 20:18:10 +0900
>> > Subject: [PATCH v3] sched/numa, mm: do not promote folios to nodes not set N_MEMORY
>>
>> "do not try to promote folios to memoryless nodes"
>
>Thinking some more, promote might be misleading, just something like
>"do not try to migrate memory to memoryless nodes".
Yes. Does this also want an unlikely()? Not that it would be measurable.
>As this is not a bug fix but an optimization, as we will fail anyways
>in migrate_misplaced_folio() when migrate_balanced_pgdat() notices that
>we do not have any memory on that code.
This should be in the changelog and the subject is misleading as well.
Reviewed-by: Davidlohr Bueso <[email protected]>
>
>With the other comments addressed:
>
>Reviewed-by: Oscar Salvador <[email protected]>
>
>>
>> because AFAICS we are just trying.
>> Even if should_numa_migrate_memory() returns true, I assume that we will
>> fail somewhere down the chain e.g: migrate_pages() when we see that this
>> node does not any memory, right?
>>
>> > A numa node might not have its local memory but CPUs. Promoting a folio
>> > to the node's local memory is nonsense. So avoid nodes not set N_MEMORY
>> > from getting promoted.
>>
>> If you talk about memoryless nodes everybody gets it better IMHO.
>> "Memoryless nodes do not have any memory to migrate to, so stop trying it."
>>
>>
>> > Signed-off-by: Byungchul Park <[email protected]>
>> > ---
>> > kernel/sched/fair.c | 7 +++++++
>> > 1 file changed, 7 insertions(+)
>> >
>> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> > index d7a3c63a2171..7ed9ef3c0134 100644
>> > --- a/kernel/sched/fair.c
>> > +++ b/kernel/sched/fair.c
>> > @@ -1828,6 +1828,13 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
>> > int dst_nid = cpu_to_node(dst_cpu);
>> > int last_cpupid, this_cpupid;
>> >
>> > + /*
>> > + * A node of dst_nid might not have its local memory. Promoting
>> > + * a folio to the node is meaningless.
>> > + */
>> > + if (!node_state(dst_nid, N_MEMORY))
>> > + return false;
>>
>> "Cannot migrate to memoryless nodes"
>>
>> seems shorter and more clear.
>>
>> So, what happens when we return true here? will we fail at
>> migrate_pages() I guess? That is quite down the road so I guess
>> this check can save us some time.
>>
>>
>> --
>> Oscar Salvador
>> SUSE Labs
>>
>
>--
>Oscar Salvador
>SUSE Labs
>