2020-03-24 14:24:48

by Baoquan He

[permalink] [raw]
Subject: [PATCH 0/5] improvements about lowmem_reserve and /proc/zoneinfo

When tried to review below patchset from Joonsoo, I found out there's
some space to improve about lowmem_reserve and /proc/zoneinfo printing.
So made this patchset to do it.

[PATCH v3 0/2] integrate classzone_idx and high_zoneidx
http://lkml.kernel.org/r/[email protected]

Baoquan He (5):
mm/page_alloc.c: only tune sysctl_lowmem_reserve_ratio value once when
changing it
mm/page_alloc.c: clear out zone->lowmem_reserve[] if the zone is empty
mm/vmstat.c: do not show lowmem reserve protection information of
empty zone
mm/vmstat.c: move the per-node stats to the front of /proc/zoneinfo
mm/vmstat.c: remove the useless code

mm/page_alloc.c | 13 +++++++++++--
mm/vmstat.c | 44 +++++++++++++++++---------------------------
2 files changed, 28 insertions(+), 29 deletions(-)

--
2.17.2


2020-03-24 14:25:08

by Baoquan He

[permalink] [raw]
Subject: [PATCH 4/5] mm/vmstat.c: move the per-node stats to the front of /proc/zoneinfo

This moving makes the layout of /proc/zoneinfo more sensible. And there
are 4 zones at most currently, it doesn't need to scroll down much to get
to the 1st populated zone, even though the 1st populated zone is MOVABLE
zone.

Node 2, per-node stats
nr_inactive_anon 48
nr_active_anon 15454
...
nr_foll_pin_acquired 0
nr_foll_pin_released 0
Node 2, zone DMA
pages free 0
min 0
low 0
high 0
spanned 0
present 0
managed 0
Node 2, zone DMA32
pages free 0
min 0
low 0
high 0
spanned 0
present 0
managed 0
Node 2, zone Normal
pages free 0
min 0
low 0
high 0
spanned 0
present 0
managed 0
Node 2, zone Movable
pages free 196346
min 3540
...
managed 262144

Signed-off-by: Baoquan He <[email protected]>
---
mm/vmstat.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index 6fd1407f4632..4bbf9be786da 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1567,13 +1567,6 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
{
int i;
seq_printf(m, "Node %d, zone %8s", pgdat->node_id, zone->name);
- if (is_zone_first_populated(pgdat, zone)) {
- seq_printf(m, "\n per-node stats");
- for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
- seq_printf(m, "\n %-12s %lu", node_stat_name(i),
- node_page_state(pgdat, i));
- }
- }
seq_printf(m,
"\n pages free %lu"
"\n min %lu"
@@ -1648,7 +1641,18 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
*/
static int zoneinfo_show(struct seq_file *m, void *arg)
{
+ int i;
pg_data_t *pgdat = (pg_data_t *)arg;
+
+ if (node_state(pgdat->node_id, N_MEMORY)) {
+ seq_printf(m, "Node %d, per-node stats", pgdat->node_id);
+ for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
+ seq_printf(m, "\n %-12s %lu", node_stat_name(i),
+ node_page_state(pgdat, i));
+ }
+ seq_putc(m, '\n');
+ }
+
walk_zones_in_node(m, pgdat, false, false, zoneinfo_show_print);
return 0;
}
--
2.17.2

2020-03-24 14:25:16

by Baoquan He

[permalink] [raw]
Subject: [PATCH 5/5] mm/vmstat.c: remove the useless code

No one calls is_zone_first_populated(), remove it.

Signed-off-by: Baoquan He <[email protected]>
---
mm/vmstat.c | 14 --------------
1 file changed, 14 deletions(-)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index 4bbf9be786da..7097eb99f30d 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1548,20 +1548,6 @@ static const struct seq_operations pagetypeinfo_op = {
.show = pagetypeinfo_show,
};

-static bool is_zone_first_populated(pg_data_t *pgdat, struct zone *zone)
-{
- int zid;
-
- for (zid = 0; zid < MAX_NR_ZONES; zid++) {
- struct zone *compare = &pgdat->node_zones[zid];
-
- if (populated_zone(compare))
- return zone == compare;
- }
-
- return false;
-}
-
static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
struct zone *zone)
{
--
2.17.2

2020-03-24 19:27:20

by David Rientjes

[permalink] [raw]
Subject: Re: [PATCH 4/5] mm/vmstat.c: move the per-node stats to the front of /proc/zoneinfo

On Tue, 24 Mar 2020, Baoquan He wrote:

> This moving makes the layout of /proc/zoneinfo more sensible. And there
> are 4 zones at most currently, it doesn't need to scroll down much to get
> to the 1st populated zone, even though the 1st populated zone is MOVABLE
> zone.
>

Doesn't this introduce risk that it will break existing parsers of
/proc/zoneinfo in subtle ways?

In some cases /proc/zoneinfo is a tricky file to correctly parse because
you have to rely on the existing order in which it is printed to determine
which zone is being described. We need to print zones even with unmanaged
pages, for instance, otherwise userspace may be unaware of which zones are
supported and what order they are in. That's important to be able to
construct the proper string to use when writing vm.lowmem_reserve_ratio.

I'd prefer not changing the order of /proc/zoneinfo if it can be avoided
just because the risk outweighs the reward that we may break some
initscript parsers.

> Node 2, per-node stats
> nr_inactive_anon 48
> nr_active_anon 15454
> ...
> nr_foll_pin_acquired 0
> nr_foll_pin_released 0
> Node 2, zone DMA
> pages free 0
> min 0
> low 0
> high 0
> spanned 0
> present 0
> managed 0
> Node 2, zone DMA32
> pages free 0
> min 0
> low 0
> high 0
> spanned 0
> present 0
> managed 0
> Node 2, zone Normal
> pages free 0
> min 0
> low 0
> high 0
> spanned 0
> present 0
> managed 0
> Node 2, zone Movable
> pages free 196346
> min 3540
> ...
> managed 262144
>
> Signed-off-by: Baoquan He <[email protected]>
> ---
> mm/vmstat.c | 18 +++++++++++-------
> 1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 6fd1407f4632..4bbf9be786da 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1567,13 +1567,6 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
> {
> int i;
> seq_printf(m, "Node %d, zone %8s", pgdat->node_id, zone->name);
> - if (is_zone_first_populated(pgdat, zone)) {
> - seq_printf(m, "\n per-node stats");
> - for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
> - seq_printf(m, "\n %-12s %lu", node_stat_name(i),
> - node_page_state(pgdat, i));
> - }
> - }
> seq_printf(m,
> "\n pages free %lu"
> "\n min %lu"
> @@ -1648,7 +1641,18 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
> */
> static int zoneinfo_show(struct seq_file *m, void *arg)
> {
> + int i;
> pg_data_t *pgdat = (pg_data_t *)arg;
> +
> + if (node_state(pgdat->node_id, N_MEMORY)) {
> + seq_printf(m, "Node %d, per-node stats", pgdat->node_id);
> + for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
> + seq_printf(m, "\n %-12s %lu", node_stat_name(i),
> + node_page_state(pgdat, i));
> + }
> + seq_putc(m, '\n');
> + }
> +
> walk_zones_in_node(m, pgdat, false, false, zoneinfo_show_print);
> return 0;
> }
> --
> 2.17.2
>
>
>

2020-03-25 05:55:33

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH 4/5] mm/vmstat.c: move the per-node stats to the front of /proc/zoneinfo

On 03/24/20 at 12:25pm, David Rientjes wrote:
> On Tue, 24 Mar 2020, Baoquan He wrote:
>
> > This moving makes the layout of /proc/zoneinfo more sensible. And there
> > are 4 zones at most currently, it doesn't need to scroll down much to get
> > to the 1st populated zone, even though the 1st populated zone is MOVABLE
> > zone.
> >
>
> Doesn't this introduce risk that it will break existing parsers of
> /proc/zoneinfo in subtle ways?
>
> In some cases /proc/zoneinfo is a tricky file to correctly parse because
> you have to rely on the existing order in which it is printed to determine
> which zone is being described. We need to print zones even with unmanaged
> pages, for instance, otherwise userspace may be unaware of which zones are
> supported and what order they are in. That's important to be able to
> construct the proper string to use when writing vm.lowmem_reserve_ratio.
>
> I'd prefer not changing the order of /proc/zoneinfo if it can be avoided
> just because the risk outweighs the reward that we may break some
> initscript parsers.

Oh, I may not describe the change and result clearly. This patch doesn't
change zone order at all. I only move the per-node stats to the front of
each node, the zone order is completely kept the same, still DMA, DMA32,
NORMAL, MOVABLE.

Before this patch, per-node stats are printed inside the first populated
zone of each node. E.g in this node 2 which only has movable zone, the
per-node stats is embedded in the last zone. In fact, this per-node stats
are made for the whole node, but not for one zone.

Node 2, zone DMA
pages free 0
min 0
low 0
high 0
spanned 0
present 0
managed 0
protection: (0, 0, 0, 1024, 1024)
Node 2, zone DMA32
pages free 0
min 0
low 0
high 0
spanned 0
present 0
managed 0
protection: (0, 0, 0, 1024, 1024)
Node 2, zone Normal
pages free 0
min 0
low 0
high 0
spanned 0
present 0
managed 0
protection: (0, 0, 0, 8192, 8192)
Node 2, zone Movable
per-node stats --------------------------->> start of per-node stats
nr_inactive_anon 42
nr_active_anon 11787
nr_inactive_file 32222
nr_active_file 6081
nr_unevictable 0
nr_slab_reclaimable 0
nr_slab_unreclaimable 0
...... --------- (mid items are omitted)
nr_anon_transparent_hugepages 0
nr_unstable 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_dirtied 25523
nr_written 25113
nr_kernel_misc_reclaimable 0 ------------------------->> end of per-node stats
pages free 211331 ------------------------->> start printing data of zone Movable
min 3524
low 4405
high 5286
spanned 262144
present 262144
managed 262144
protection: (0, 0, 0, 0, 0)
nr_free_pages 211331
nr_zone_inactive_anon 42
nr_zone_active_anon 11787
nr_zone_inactive_file 32222
nr_zone_active_file 6081
nr_zone_unevictable 0
nr_zone_write_pending 2

With this patch applied, only the per-node stats part is moved to the
front of each node.

Node 2, per-node stats --------------------------->> start of per-node stats
nr_inactive_anon 42
nr_active_anon 12358
nr_inactive_file 33139
nr_active_file 10088
nr_unevictable 0
nr_slab_reclaimable 0
...... --------- (mid items are omitted)
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_dirtied 9082
nr_written 8844
nr_kernel_misc_reclaimable 0
nr_foll_pin_acquired 0
nr_foll_pin_released 0 ------------------------->> end of per-node stats
Node 2, zone DMA
pages free 0
min 0
low 0
high 0
spanned 0
present 0
managed 0
Node 2, zone DMA32
pages free 0
min 0
low 0
high 0
spanned 0
present 0
managed 0
Node 2, zone Normal
pages free 0
min 0
low 0
high 0
spanned 0
present 0
managed 0
Node 2, zone Movable
pages free 205601 ------------------------->> start printing data of zone Movable
min 3525
low 4406
high 5287
spanned 262144
present 262144
managed 262144
protection: (0, 0, 0, 0)
nr_free_pages 205601
nr_zone_inactive_anon 42
nr_zone_active_anon 12358
........

2020-03-25 08:56:36

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 4/5] mm/vmstat.c: move the per-node stats to the front of /proc/zoneinfo

On Wed 25-03-20 13:53:31, Baoquan He wrote:
> On 03/24/20 at 12:25pm, David Rientjes wrote:
> > On Tue, 24 Mar 2020, Baoquan He wrote:
> >
> > > This moving makes the layout of /proc/zoneinfo more sensible. And there
> > > are 4 zones at most currently, it doesn't need to scroll down much to get
> > > to the 1st populated zone, even though the 1st populated zone is MOVABLE
> > > zone.
> > >
> >
> > Doesn't this introduce risk that it will break existing parsers of
> > /proc/zoneinfo in subtle ways?
> >
> > In some cases /proc/zoneinfo is a tricky file to correctly parse because
> > you have to rely on the existing order in which it is printed to determine
> > which zone is being described. We need to print zones even with unmanaged
> > pages, for instance, otherwise userspace may be unaware of which zones are
> > supported and what order they are in. That's important to be able to
> > construct the proper string to use when writing vm.lowmem_reserve_ratio.
> >
> > I'd prefer not changing the order of /proc/zoneinfo if it can be avoided
> > just because the risk outweighs the reward that we may break some
> > initscript parsers.
>
> Oh, I may not describe the change and result clearly. This patch doesn't
> change zone order at all. I only move the per-node stats to the front of
> each node, the zone order is completely kept the same, still DMA, DMA32,
> NORMAL, MOVABLE.

Even this can break existing parsers. Fixing that up is likely not hard
and existing parsers would be mostly debugging hacks here and there but
I do miss any actual justification except for you considering it more
sensible. I do not remember this would be a common pain point for people
parsing this file. If anything the overal structure of the file makes it
hard to parse and your patches do not really address that. We are likely
too late to make the output much more sensible TBH.

That being said, I haven't looked more closely on your patches because I
do not have spare cycles for that. Your justification for touching the
code which seems to be working relatively well is quite weak IMHO, yet
it adds a non zero risk for breaking existing parsers.
--
Michal Hocko
SUSE Labs

2020-03-25 14:24:05

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH 4/5] mm/vmstat.c: move the per-node stats to the front of /proc/zoneinfo

On 03/25/20 at 09:55am, Michal Hocko wrote:
> On Wed 25-03-20 13:53:31, Baoquan He wrote:
> > On 03/24/20 at 12:25pm, David Rientjes wrote:
> > > On Tue, 24 Mar 2020, Baoquan He wrote:
> > >
> > > > This moving makes the layout of /proc/zoneinfo more sensible. And there
> > > > are 4 zones at most currently, it doesn't need to scroll down much to get
> > > > to the 1st populated zone, even though the 1st populated zone is MOVABLE
> > > > zone.
> > > >
> > >
> > > Doesn't this introduce risk that it will break existing parsers of
> > > /proc/zoneinfo in subtle ways?
> > >
> > > In some cases /proc/zoneinfo is a tricky file to correctly parse because
> > > you have to rely on the existing order in which it is printed to determine
> > > which zone is being described. We need to print zones even with unmanaged
> > > pages, for instance, otherwise userspace may be unaware of which zones are
> > > supported and what order they are in. That's important to be able to
> > > construct the proper string to use when writing vm.lowmem_reserve_ratio.
> > >
> > > I'd prefer not changing the order of /proc/zoneinfo if it can be avoided
> > > just because the risk outweighs the reward that we may break some
> > > initscript parsers.
> >
> > Oh, I may not describe the change and result clearly. This patch doesn't
> > change zone order at all. I only move the per-node stats to the front of
> > each node, the zone order is completely kept the same, still DMA, DMA32,
> > NORMAL, MOVABLE.
>
> Even this can break existing parsers. Fixing that up is likely not hard
> and existing parsers would be mostly debugging hacks here and there but
> I do miss any actual justification except for you considering it more
> sensible. I do not remember this would be a common pain point for people
> parsing this file. If anything the overal structure of the file makes it
> hard to parse and your patches do not really address that. We are likely
> too late to make the output much more sensible TBH.
>
> That being said, I haven't looked more closely on your patches because I
> do not have spare cycles for that. Your justification for touching the
> code which seems to be working relatively well is quite weak IMHO, yet
> it adds a non zero risk for breaking existing parsers.

I would take the saying of non zero risk for breaking existing parsers.
When considering this change, I thought about the possible risk. However,
found out the per-node stats was added in 2016 which is not so late, and
assume nobody will rely on the order of per-node stats embeded into a
zone. But I have to admit any concern or worry of risk is worth being
considerred carefully since /proc/zoneinfo is a classic interface.

So, in view of objections from you and David, I would like to drop this
patch and patch 5. It's a small improvement, not worth taking any risk.
But if it goes back to this time of 2017, I would like to spend some
time to defend it :-)

commit e2ecc8a79ed49f7838b4fdf352c4c48cec9424ac
Author: Mel Gorman <[email protected]>
Date: Thu Jul 28 15:47:02 2016 -0700

mm, vmstat: print node-based stats in zoneinfo file


2020-03-25 19:46:55

by David Rientjes

[permalink] [raw]
Subject: Re: [PATCH 4/5] mm/vmstat.c: move the per-node stats to the front of /proc/zoneinfo

On Wed, 25 Mar 2020, Baoquan He wrote:

> > Even this can break existing parsers. Fixing that up is likely not hard
> > and existing parsers would be mostly debugging hacks here and there but
> > I do miss any actual justification except for you considering it more
> > sensible. I do not remember this would be a common pain point for people
> > parsing this file. If anything the overal structure of the file makes it
> > hard to parse and your patches do not really address that. We are likely
> > too late to make the output much more sensible TBH.
> >
> > That being said, I haven't looked more closely on your patches because I
> > do not have spare cycles for that. Your justification for touching the
> > code which seems to be working relatively well is quite weak IMHO, yet
> > it adds a non zero risk for breaking existing parsers.
>
> I would take the saying of non zero risk for breaking existing parsers.
> When considering this change, I thought about the possible risk. However,
> found out the per-node stats was added in 2016 which is not so late, and
> assume nobody will rely on the order of per-node stats embeded into a
> zone. But I have to admit any concern or worry of risk is worth being
> considerred carefully since /proc/zoneinfo is a classic interface.
>

For context, we started parsing /proc/zoneinfo in initscripts to be able
to determine the order in which vm.lowmem_reserve_ratio needs to be set
and this required my kernel change from 2017:

commit b2bd8598195f1b2a72130592125ac6b4218988a2
Author: David Rientjes <[email protected]>
Date: Wed May 3 14:52:59 2017 -0700

mm, vmstat: print non-populated zones in zoneinfo

Otherwise, we found, it's much more difficult to determine how this array
should be structured. So at least we parse this file very carefully, I'm
not sure how much others do, but it seems like an unnecessary risk for
little reward. I'm happy to see it has been decided to drop this patch
and patch 5.

> So, in view of objections from you and David, I would like to drop this
> patch and patch 5. It's a small improvement, not worth taking any risk.
> But if it goes back to this time of 2017, I would like to spend some
> time to defend it :-)
>
> commit e2ecc8a79ed49f7838b4fdf352c4c48cec9424ac
> Author: Mel Gorman <[email protected]>
> Date: Thu Jul 28 15:47:02 2016 -0700
>
> mm, vmstat: print node-based stats in zoneinfo file
>
>
>

2020-03-26 04:25:44

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH 4/5] mm/vmstat.c: move the per-node stats to the front of /proc/zoneinfo

On 03/25/20 at 12:45pm, David Rientjes wrote:
> On Wed, 25 Mar 2020, Baoquan He wrote:
>
> > > Even this can break existing parsers. Fixing that up is likely not hard
> > > and existing parsers would be mostly debugging hacks here and there but
> > > I do miss any actual justification except for you considering it more
> > > sensible. I do not remember this would be a common pain point for people
> > > parsing this file. If anything the overal structure of the file makes it
> > > hard to parse and your patches do not really address that. We are likely
> > > too late to make the output much more sensible TBH.
> > >
> > > That being said, I haven't looked more closely on your patches because I
> > > do not have spare cycles for that. Your justification for touching the
> > > code which seems to be working relatively well is quite weak IMHO, yet
> > > it adds a non zero risk for breaking existing parsers.
> >
> > I would take the saying of non zero risk for breaking existing parsers.
> > When considering this change, I thought about the possible risk. However,
> > found out the per-node stats was added in 2016 which is not so late, and
> > assume nobody will rely on the order of per-node stats embeded into a
> > zone. But I have to admit any concern or worry of risk is worth being
> > considerred carefully since /proc/zoneinfo is a classic interface.
> >
>
> For context, we started parsing /proc/zoneinfo in initscripts to be able
> to determine the order in which vm.lowmem_reserve_ratio needs to be set
> and this required my kernel change from 2017:
>
> commit b2bd8598195f1b2a72130592125ac6b4218988a2
> Author: David Rientjes <[email protected]>
> Date: Wed May 3 14:52:59 2017 -0700
>
> mm, vmstat: print non-populated zones in zoneinfo
>
> Otherwise, we found, it's much more difficult to determine how this array
> should be structured. So at least we parse this file very carefully, I'm
> not sure how much others do, but it seems like an unnecessary risk for
> little reward. I'm happy to see it has been decided to drop this patch
> and patch 5.


OK, I see why it is in such a situation, the empty zones were not printed.

I could still not get how vm.lowmem_reserve_ratio is set with
/proc/zoneinfo in the old initscripts, do you see any risk if not
filling and showing the ->lowmem_reserve[] of empty zone in
patch 2 and 3? Thanks in advance.

2020-03-26 06:45:18

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 4/5] mm/vmstat.c: move the per-node stats to the front of /proc/zoneinfo

On Thu 26-03-20 12:24:54, Baoquan He wrote:
> On 03/25/20 at 12:45pm, David Rientjes wrote:
> > On Wed, 25 Mar 2020, Baoquan He wrote:
> >
> > > > Even this can break existing parsers. Fixing that up is likely not hard
> > > > and existing parsers would be mostly debugging hacks here and there but
> > > > I do miss any actual justification except for you considering it more
> > > > sensible. I do not remember this would be a common pain point for people
> > > > parsing this file. If anything the overal structure of the file makes it
> > > > hard to parse and your patches do not really address that. We are likely
> > > > too late to make the output much more sensible TBH.
> > > >
> > > > That being said, I haven't looked more closely on your patches because I
> > > > do not have spare cycles for that. Your justification for touching the
> > > > code which seems to be working relatively well is quite weak IMHO, yet
> > > > it adds a non zero risk for breaking existing parsers.
> > >
> > > I would take the saying of non zero risk for breaking existing parsers.
> > > When considering this change, I thought about the possible risk. However,
> > > found out the per-node stats was added in 2016 which is not so late, and
> > > assume nobody will rely on the order of per-node stats embeded into a
> > > zone. But I have to admit any concern or worry of risk is worth being
> > > considerred carefully since /proc/zoneinfo is a classic interface.
> > >
> >
> > For context, we started parsing /proc/zoneinfo in initscripts to be able
> > to determine the order in which vm.lowmem_reserve_ratio needs to be set
> > and this required my kernel change from 2017:
> >
> > commit b2bd8598195f1b2a72130592125ac6b4218988a2
> > Author: David Rientjes <[email protected]>
> > Date: Wed May 3 14:52:59 2017 -0700
> >
> > mm, vmstat: print non-populated zones in zoneinfo
> >
> > Otherwise, we found, it's much more difficult to determine how this array
> > should be structured. So at least we parse this file very carefully, I'm
> > not sure how much others do, but it seems like an unnecessary risk for
> > little reward. I'm happy to see it has been decided to drop this patch
> > and patch 5.
>
>
> OK, I see why it is in such a situation, the empty zones were not printed.
>
> I could still not get how vm.lowmem_reserve_ratio is set with
> /proc/zoneinfo in the old initscripts, do you see any risk if not
> filling and showing the ->lowmem_reserve[] of empty zone in
> patch 2 and 3? Thanks in advance.

The point is why should we even care. Displaying that information
shouldn't hurt anything, right?

--
Michal Hocko
SUSE Labs

2020-03-26 11:23:00

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH 4/5] mm/vmstat.c: move the per-node stats to the front of /proc/zoneinfo

On 03/26/20 at 07:43am, Michal Hocko wrote:
> On Thu 26-03-20 12:24:54, Baoquan He wrote:
> > On 03/25/20 at 12:45pm, David Rientjes wrote:
> > > On Wed, 25 Mar 2020, Baoquan He wrote:
> > >
> > > > > Even this can break existing parsers. Fixing that up is likely not hard
> > > > > and existing parsers would be mostly debugging hacks here and there but
> > > > > I do miss any actual justification except for you considering it more
> > > > > sensible. I do not remember this would be a common pain point for people
> > > > > parsing this file. If anything the overal structure of the file makes it
> > > > > hard to parse and your patches do not really address that. We are likely
> > > > > too late to make the output much more sensible TBH.
> > > > >
> > > > > That being said, I haven't looked more closely on your patches because I
> > > > > do not have spare cycles for that. Your justification for touching the
> > > > > code which seems to be working relatively well is quite weak IMHO, yet
> > > > > it adds a non zero risk for breaking existing parsers.
> > > >
> > > > I would take the saying of non zero risk for breaking existing parsers.
> > > > When considering this change, I thought about the possible risk. However,
> > > > found out the per-node stats was added in 2016 which is not so late, and
> > > > assume nobody will rely on the order of per-node stats embeded into a
> > > > zone. But I have to admit any concern or worry of risk is worth being
> > > > considerred carefully since /proc/zoneinfo is a classic interface.
> > > >
> > >
> > > For context, we started parsing /proc/zoneinfo in initscripts to be able
> > > to determine the order in which vm.lowmem_reserve_ratio needs to be set
> > > and this required my kernel change from 2017:
> > >
> > > commit b2bd8598195f1b2a72130592125ac6b4218988a2
> > > Author: David Rientjes <[email protected]>
> > > Date: Wed May 3 14:52:59 2017 -0700
> > >
> > > mm, vmstat: print non-populated zones in zoneinfo
> > >
> > > Otherwise, we found, it's much more difficult to determine how this array
> > > should be structured. So at least we parse this file very carefully, I'm
> > > not sure how much others do, but it seems like an unnecessary risk for
> > > little reward. I'm happy to see it has been decided to drop this patch
> > > and patch 5.
> >
> >
> > OK, I see why it is in such a situation, the empty zones were not printed.
> >
> > I could still not get how vm.lowmem_reserve_ratio is set with
> > /proc/zoneinfo in the old initscripts, do you see any risk if not
> > filling and showing the ->lowmem_reserve[] of empty zone in
> > patch 2 and 3? Thanks in advance.
>
> The point is why should we even care. Displaying that information
> shouldn't hurt anything, right?

Well, I would say why not. If saying anything hurted, I often check
/proc/zoneinfo to get information about system memory like many people,
I was wondering why the protection data is over there, but it's am
empty zone, and they protect what. I dare say it's more than once I
asked to myself, just sometime I am too lazy to start to make it clear
when focusing on another issue. Not sure if that is kind of hurting. I
would like to see it's not there to confuse me if anyone else have
stood up to fix it, I absolutely will vote for it.

Surely, we also need to evaluate if any risk or complexity is involved,
While with my understanding, I don't see risk, and the change is quite
simple and easy to understand.