2022-03-16 13:59:24

by Marcelo Tosatti

Subject: [patch v12 12/13] mm: vmstat_refresh: avoid queueing work item if cpu stats are clean

It is not necessary to queue a work item to run refresh_vm_stats
on a remote CPU if that CPU has no dirty stats and no per-CPU
allocations for remote nodes.

This fixes a sosreport hang (sosreport uses vmstat_refresh) when a
SCHED_FIFO process is spinning on a CPU.

Signed-off-by: Marcelo Tosatti <[email protected]>

---
mm/vmstat.c | 49 ++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 44 insertions(+), 5 deletions(-)

Index: linux-2.6/mm/vmstat.c
===================================================================
--- linux-2.6.orig/mm/vmstat.c
+++ linux-2.6/mm/vmstat.c
@@ -1924,6 +1924,31 @@ static bool need_update(int cpu)
}

#ifdef CONFIG_PROC_FS
+static bool need_drain_remote_zones(int cpu)
+{
+#ifdef CONFIG_NUMA
+	struct zone *zone;
+
+	for_each_populated_zone(zone) {
+		struct per_cpu_pages *pcp;
+
+		pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
+		if (!pcp->count)
+			continue;
+
+		if (!pcp->expire)
+			continue;
+
+		if (zone_to_nid(zone) == cpu_to_node(cpu))
+			continue;
+
+		return true;
+	}
+#endif
+
+	return false;
+}
+
static void refresh_vm_stats(struct work_struct *work)
{
refresh_cpu_vm_stats(true);
@@ -1933,8 +1958,12 @@ int vmstat_refresh(struct ctl_table *tab
void *buffer, size_t *lenp, loff_t *ppos)
{
long val;
-	int err;
-	int i;
+	int i, cpu;
+	struct work_struct __percpu *works;
+
+	works = alloc_percpu(struct work_struct);
+	if (!works)
+		return -ENOMEM;

/*
* The regular update, every sysctl_stat_interval, may come later
@@ -1948,9 +1977,19 @@ int vmstat_refresh(struct ctl_table *tab
* transiently negative values, report an error here if any of
* the stats is negative, so we know to go looking for imbalance.
*/
-	err = schedule_on_each_cpu(refresh_vm_stats);
-	if (err)
-		return err;
+	cpus_read_lock();
+	for_each_online_cpu(cpu) {
+		struct work_struct *work = per_cpu_ptr(works, cpu);
+
+		INIT_WORK(work, refresh_vm_stats);
+		if (need_update(cpu) || need_drain_remote_zones(cpu))
+			schedule_work_on(cpu, work);
+	}
+	for_each_online_cpu(cpu)
+		flush_work(per_cpu_ptr(works, cpu));
+	cpus_read_unlock();
+	free_percpu(works);
+
for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
/*
* Skip checking stats known to go negative occasionally.
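
For readers who want to poke at the queueing decision outside the kernel, here is a rough userspace model of it. This is only a sketch: every model_* name, the array sizes, and the stats_dirty flag (standing in for whatever need_update(cpu) returns) are made up for illustration and are not kernel API. It mirrors the three checks in need_drain_remote_zones() and the "dirty stats or remote pages" test that gates schedule_work_on().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS  4
#define MODEL_NR_ZONES 2

/* Hypothetical stand-in for the per-CPU, per-zone page list state. */
struct model_pcp {
	int count;   /* pages currently held on this CPU's list */
	int expire;  /* nonzero while the list is still aging toward a drain */
};

struct model_zone {
	int nid;                              /* node this zone belongs to */
	struct model_pcp pcp[MODEL_NR_CPUS];
};

struct model_system {
	struct model_zone zones[MODEL_NR_ZONES];
	int cpu_node[MODEL_NR_CPUS];          /* node each CPU sits on */
	bool stats_dirty[MODEL_NR_CPUS];      /* assumed result of need_update(cpu) */
};

/* True if the CPU holds aging per-CPU pages that belong to a remote node. */
static bool model_need_drain_remote_zones(const struct model_system *sys, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	for (size_t z = 0; z < MODEL_NR_ZONES; z++) {
		const struct model_zone *zone = &sys->zones[z];
		const struct model_pcp *pcp = &zone->pcp[cpu];

		if (!pcp->count || !pcp->expire)
			continue;       /* nothing held, or nothing aging */
		if (zone->nid == sys->cpu_node[cpu])
			continue;       /* local zone, no remote drain needed */
		return true;
	}
	return false;
}

/* The test that gates queueing a work item on the given CPU. */
static bool model_should_queue_work(const struct model_system *sys, int cpu)
{
	return sys->stats_dirty[cpu] || model_need_drain_remote_zones(sys, cpu);
}

/* Number of CPUs that would receive a work item in one refresh pass. */
static int model_count_queued(const struct model_system *sys)
{
	int queued = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (model_should_queue_work(sys, cpu))
			queued++;
	return queued;
}
```

With all stats clean and no remote pages held, model_count_queued() returns 0, which is the case where a CPU monopolized by a spinning SCHED_FIFO task is never asked to run anything.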



2022-04-27 10:38:40

by Thomas Gleixner

Subject: Re: [patch v12 12/13] mm: vmstat_refresh: avoid queueing work item if cpu stats are clean

On Tue, Mar 15 2022 at 12:31, Marcelo Tosatti wrote:
> /*
> * The regular update, every sysctl_stat_interval, may come later
> @@ -1948,9 +1977,19 @@ int vmstat_refresh(struct ctl_table *tab
> * transiently negative values, report an error here if any of
> * the stats is negative, so we know to go looking for imbalance.
> */
> - err = schedule_on_each_cpu(refresh_vm_stats);
> - if (err)
> - return err;
> + cpus_read_lock();
> + for_each_online_cpu(cpu) {
> + struct work_struct *work = per_cpu_ptr(works, cpu);
> +
> + INIT_WORK(work, refresh_vm_stats);
> + if (need_update(cpu) || need_drain_remote_zones(cpu))

Of course that makes sense in general, but now you have two ways of
deciding whether updating this is required.

1) The above

2) The per CPU boolean which tells whether vmstats are dirty or not.

Can we have a third method perhaps?

Thanks,

tglx
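
To make the concern above concrete, one possible shape for a single decision point is sketched below as a userspace model. It is an assumption for illustration, not something proposed in this thread: all sketch_* names are hypothetical, and it presumes every path that leaves deferred work behind (stat deltas or remote per-CPU pages) marks the same per-CPU flag, so callers consult one predicate only.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical per-CPU dirty flags: the only source of truth in this sketch. */
static bool sketch_vmstat_dirty[SKETCH_NR_CPUS];

/* Any path that leaves deferred work behind marks the CPU dirty. */
static void sketch_mark_dirty(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	sketch_vmstat_dirty[cpu] = true;
}

/* The single predicate consulted before queueing remote work. */
static bool sketch_cpu_needs_refresh(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return sketch_vmstat_dirty[cpu];
}

/* Clearing the flag models the refresh work having run on that CPU. */
static void sketch_refresh_done(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	sketch_vmstat_dirty[cpu] = false;
}
```

The trade-off in such a design is that correctness depends on no dirtying path forgetting to set the flag, whereas scanning the per-CPU state directly cannot go stale.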

2022-05-04 15:08:52

by Marcelo Tosatti

Subject: Re: [patch v12 12/13] mm: vmstat_refresh: avoid queueing work item if cpu stats are clean

On Wed, Apr 27, 2022 at 09:23:06AM +0200, Thomas Gleixner wrote:
> On Tue, Mar 15 2022 at 12:31, Marcelo Tosatti wrote:
> > /*
> > * The regular update, every sysctl_stat_interval, may come later
> > @@ -1948,9 +1977,19 @@ int vmstat_refresh(struct ctl_table *tab
> > * transiently negative values, report an error here if any of
> > * the stats is negative, so we know to go looking for imbalance.
> > */
> > - err = schedule_on_each_cpu(refresh_vm_stats);
> > - if (err)
> > - return err;
> > + cpus_read_lock();
> > + for_each_online_cpu(cpu) {
> > + struct work_struct *work = per_cpu_ptr(works, cpu);
> > +
> > + INIT_WORK(work, refresh_vm_stats);
> > + if (need_update(cpu) || need_drain_remote_zones(cpu))
>
> Of course that makes sense in general, but now you have two ways of
> deciding whether updating this is required.
>
> 1) The above
>
> 2) The per CPU boolean which tells whether vmstats are dirty or not.
>
> Can we have a third method perhaps?

Ok, will think of a third method to increase clarity :-)

By the fourth method it will be clear for sure!