NUMA balancing tries to migrate pages between nodes, which can
be triggered by memory policy or NUMA group aggregation, but
the page migration can also fail, for example when the target
node has run out of memory.
Since this is critical to performance, the admin should know
how serious the problem is and take action before it causes
too much performance damage. This patch therefore exposes the
counter as 'migfailed' in '/proc/PID/sched'.
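
As an illustration only (not part of this patch), the new field
can be read from userspace along these lines; the pid handling
and parsing below are a minimal sketch:

	#include <stdio.h>
	#include <string.h>

	int main(int argc, char **argv)
	{
		char path[64], line[256];
		FILE *f;

		/* default to the current task if no pid is given */
		snprintf(path, sizeof(path), "/proc/%s/sched",
			 argc > 1 ? argv[1] : "self");
		f = fopen(path, "r");
		if (!f)
			return 1;

		/* print only the new migfailed line */
		while (fgets(line, sizeof(line), f))
			if (!strncmp(line, "migfailed=", 10))
				fputs(line, stdout);

		fclose(f);
		return 0;
	}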
Cc: Peter Zijlstra <[email protected]>
Cc: Michal Koutný <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
kernel/sched/debug.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index f7e4579e746c..73c4809c8f37 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -848,6 +848,7 @@ static void sched_show_numa(struct task_struct *p, struct seq_file *m)
P(total_numa_faults);
SEQ_printf(m, "current_node=%d, numa_group_id=%d\n",
task_node(p), task_numa_group_id(p));
+ SEQ_printf(m, "migfailed=%lu\n", p->numa_faults_locality[2]);
show_numa_stats(p, m);
mpol_put(pol);
#endif
--
2.14.4.44.g2045bb6
* 王贇 <[email protected]> wrote:
> NUMA balancing tries to migrate pages between nodes, which can
> be triggered by memory policy or NUMA group aggregation, but
> the page migration can also fail, for example when the target
> node has run out of memory.
>
> Since this is critical to performance, the admin should know
> how serious the problem is and take action before it causes
> too much performance damage. This patch therefore exposes the
> counter as 'migfailed' in '/proc/PID/sched'.
>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Michal Koutný <[email protected]>
> Acked-by: Mel Gorman <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> kernel/sched/debug.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> index f7e4579e746c..73c4809c8f37 100644
> --- a/kernel/sched/debug.c
> +++ b/kernel/sched/debug.c
> @@ -848,6 +848,7 @@ static void sched_show_numa(struct task_struct *p, struct seq_file *m)
> P(total_numa_faults);
> SEQ_printf(m, "current_node=%d, numa_group_id=%d\n",
> task_node(p), task_numa_group_id(p));
> + SEQ_printf(m, "migfailed=%lu\n", p->numa_faults_locality[2]);
Any reason not to expose the other 2 fields of this array as well, which
show remote/local migrations?
Thanks,
Ingo
On 2019/12/3 3:16 PM, Ingo Molnar wrote:
[snip]
>> kernel/sched/debug.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
>> index f7e4579e746c..73c4809c8f37 100644
>> --- a/kernel/sched/debug.c
>> +++ b/kernel/sched/debug.c
>> @@ -848,6 +848,7 @@ static void sched_show_numa(struct task_struct *p, struct seq_file *m)
>> P(total_numa_faults);
>> SEQ_printf(m, "current_node=%d, numa_group_id=%d\n",
>> task_node(p), task_numa_group_id(p));
>> + SEQ_printf(m, "migfailed=%lu\n", p->numa_faults_locality[2]);
>
> Any reason not to expose the other 2 fields of this array as well, which
> show remote/local migrations?
The other two are the local/remote fault counters, which AFAIK
are not related to migration: when the CPU that triggered the
page fault is on the same node as the page (before migration),
the local fault counter is increased.
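
To make the semantics of the three slots concrete, here is a
simplified userspace model of the accounting; the names and the
exact conditions paraphrase kernel/sched/fair.c rather than
copy it:

	#include <stdbool.h>
	#include <stdio.h>

	/*
	 * Simplified model: [0] remote faults, [1] local faults,
	 * [2] failed migrations (the new 'migfailed').
	 */
	static unsigned long numa_faults_locality[3];

	static void account_numa_fault(bool local, bool migrate_failed,
				       unsigned long pages)
	{
		if (migrate_failed)
			numa_faults_locality[2] += pages;
		numa_faults_locality[local ? 1 : 0] += pages;
	}

	int main(void)
	{
		account_numa_fault(true, false, 512);  /* local fault */
		account_numa_fault(false, true, 128);  /* remote, migration failed */
		printf("remote=%lu local=%lu migfailed=%lu\n",
		       numa_faults_locality[0], numa_faults_locality[1],
		       numa_faults_locality[2]);
		return 0;
	}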
Regards,
Michael Wang
>
> Thanks,
>
> Ingo
>