2020-12-27 18:16:45

by Shakeel Butt

Subject: [PATCH 1/2] mm: memcg: fix memcg file_dirty numa stat

The kernel updates the per-node NR_FILE_DIRTY stats on page migration
but not the memcg numa stats. That was not an issue until commit
5f9a4f4a7096 ("mm: memcontrol: add the missing numa_stat interface
for cgroup v2") exposed the numa stats for the memcg. Fix the
file_dirty per-memcg numa stat.

Fixes: 5f9a4f4a7096 ("mm: memcontrol: add the missing numa_stat interface for cgroup v2")
Signed-off-by: Shakeel Butt <[email protected]>
Cc: <[email protected]>
---
mm/migrate.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index ee5e612b4cd8..613794f6a433 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -500,9 +500,9 @@ int migrate_page_move_mapping(struct address_space *mapping,
__inc_lruvec_state(new_lruvec, NR_SHMEM);
}
if (dirty && mapping_can_writeback(mapping)) {
- __dec_node_state(oldzone->zone_pgdat, NR_FILE_DIRTY);
+ __dec_lruvec_state(old_lruvec, NR_FILE_DIRTY);
__dec_zone_state(oldzone, NR_ZONE_WRITE_PENDING);
- __inc_node_state(newzone->zone_pgdat, NR_FILE_DIRTY);
+ __inc_lruvec_state(new_lruvec, NR_FILE_DIRTY);
__inc_zone_state(newzone, NR_ZONE_WRITE_PENDING);
}
}
--
2.29.2.729.g45daf8777d-goog
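
[Editor's sketch, not part of the patch.] The difference between the two helper families can be modelled in a few lines of userspace C. This is a toy under stated assumptions: every `toy_*` name is hypothetical, and the model assumes that the lruvec helpers update both the per-node counter and the per-memcg per-node counter, while the node helpers touch only the per-node counter.

```c
#include <assert.h>

/* Toy stand-ins, not the kernel's real structures. */
struct toy_node {
	long nr_file_dirty;		/* per-node counter */
};

struct toy_lruvec {
	struct toy_node *node;		/* node this lruvec belongs to */
	long nr_file_dirty;		/* per-memcg, per-node counter */
};

/* Models the node-only helpers: only the per-node counter moves. */
static void toy_mod_node_state(struct toy_node *node, long delta)
{
	node->nr_file_dirty += delta;
}

/* Models the lruvec helpers (assumed behaviour): both counters move. */
static void toy_mod_lruvec_state(struct toy_lruvec *lruvec, long delta)
{
	toy_mod_node_state(lruvec->node, delta);
	lruvec->nr_file_dirty += delta;
}

/* Pre-patch accounting for one dirty page moving between nodes. */
static void toy_migrate_dirty_old(struct toy_lruvec *from, struct toy_lruvec *to)
{
	assert(from && to);
	toy_mod_node_state(from->node, -1);
	toy_mod_node_state(to->node, 1);
}

/* Post-patch accounting: the memcg view follows the page as well. */
static void toy_migrate_dirty_new(struct toy_lruvec *from, struct toy_lruvec *to)
{
	assert(from && to);
	toy_mod_lruvec_state(from, -1);
	toy_mod_lruvec_state(to, 1);
}

/*
 * Migrate one dirty page from node 0 to node 1 and return the memcg's
 * dirty count on the destination node afterwards.
 */
static long toy_memcg_dirty_on_dest(int use_lruvec_helpers)
{
	struct toy_node n0 = { .nr_file_dirty = 1 }, n1 = { .nr_file_dirty = 0 };
	struct toy_lruvec l0 = { .node = &n0, .nr_file_dirty = 1 };
	struct toy_lruvec l1 = { .node = &n1, .nr_file_dirty = 0 };

	if (use_lruvec_helpers)
		toy_migrate_dirty_new(&l0, &l1);
	else
		toy_migrate_dirty_old(&l0, &l1);

	return l1.nr_file_dirty;
}
```

In this model, `toy_memcg_dirty_on_dest(0)` leaves the memcg counter on the destination node at 0 even though the page now lives there, while `toy_memcg_dirty_on_dest(1)` moves it to 1 along with the node counter.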


2020-12-27 18:18:43

by Shakeel Butt

Subject: [PATCH 2/2] mm: fix numa stats for thp migration

Currently the kernel is not correctly updating the numa stats for
NR_FILE_PAGES and NR_SHMEM on THP migration. Fix that. For NR_FILE_DIRTY
and NR_ZONE_WRITE_PENDING, there is currently no need to handle THP
migration because the kernel does not yet have write support for file
THP, but to be more future proof, this patch adds THP support for
those stats as well.

Fixes: e71769ae52609 ("mm: enable thp migration for shmem thp")
Signed-off-by: Shakeel Butt <[email protected]>
Cc: <[email protected]>
---
mm/migrate.c | 23 ++++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 613794f6a433..ade163c6ecdf 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -402,6 +402,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
struct zone *oldzone, *newzone;
int dirty;
int expected_count = expected_page_refs(mapping, page) + extra_count;
+ int nr = thp_nr_pages(page);

if (!mapping) {
/* Anonymous page without mapping */
@@ -437,7 +438,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
*/
newpage->index = page->index;
newpage->mapping = page->mapping;
- page_ref_add(newpage, thp_nr_pages(page)); /* add cache reference */
+ page_ref_add(newpage, nr); /* add cache reference */
if (PageSwapBacked(page)) {
__SetPageSwapBacked(newpage);
if (PageSwapCache(page)) {
@@ -459,7 +460,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
if (PageTransHuge(page)) {
int i;

- for (i = 1; i < HPAGE_PMD_NR; i++) {
+ for (i = 1; i < nr; i++) {
xas_next(&xas);
xas_store(&xas, newpage);
}
@@ -470,7 +471,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
* to one less reference.
* We know this isn't the last reference.
*/
- page_ref_unfreeze(page, expected_count - thp_nr_pages(page));
+ page_ref_unfreeze(page, expected_count - nr);

xas_unlock(&xas);
/* Leave irq disabled to prevent preemption while updating stats */
@@ -493,17 +494,17 @@ int migrate_page_move_mapping(struct address_space *mapping,
old_lruvec = mem_cgroup_lruvec(memcg, oldzone->zone_pgdat);
new_lruvec = mem_cgroup_lruvec(memcg, newzone->zone_pgdat);

- __dec_lruvec_state(old_lruvec, NR_FILE_PAGES);
- __inc_lruvec_state(new_lruvec, NR_FILE_PAGES);
+ __mod_lruvec_state(old_lruvec, NR_FILE_PAGES, -nr);
+ __mod_lruvec_state(new_lruvec, NR_FILE_PAGES, nr);
if (PageSwapBacked(page) && !PageSwapCache(page)) {
- __dec_lruvec_state(old_lruvec, NR_SHMEM);
- __inc_lruvec_state(new_lruvec, NR_SHMEM);
+ __mod_lruvec_state(old_lruvec, NR_SHMEM, -nr);
+ __mod_lruvec_state(new_lruvec, NR_SHMEM, nr);
}
if (dirty && mapping_can_writeback(mapping)) {
- __dec_lruvec_state(old_lruvec, NR_FILE_DIRTY);
- __dec_zone_state(oldzone, NR_ZONE_WRITE_PENDING);
- __inc_lruvec_state(new_lruvec, NR_FILE_DIRTY);
- __inc_zone_state(newzone, NR_ZONE_WRITE_PENDING);
+ __mod_lruvec_state(old_lruvec, NR_FILE_DIRTY, -nr);
+ __mod_zone_page_tate(oldzone, NR_ZONE_WRITE_PENDING, -nr);
+ __mod_lruvec_state(new_lruvec, NR_FILE_DIRTY, nr);
+ __mod_zone_page_state(newzone, NR_ZONE_WRITE_PENDING, nr);
}
}
local_irq_enable();
--
2.29.2.729.g45daf8777d-goog
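
[Editor's sketch, not part of the patch.] The accounting error fixed here is one of scale: moving a huge page by a single inc/dec pair shifts one base page's worth of stats instead of `nr`. A toy userspace model follows. All `toy_*` names are hypothetical, and `TOY_HPAGE_NR` assumes a 2 MiB THP built from 4 KiB base pages.

```c
#include <assert.h>

/* Assumption: 2 MiB THP with 4 KiB base pages gives 512 subpages. */
#define TOY_HPAGE_NR 512

/* Toy per-node counter, not the kernel's real structure. */
struct toy_stats {
	long nr_file_pages;
};

static void toy_mod_state(struct toy_stats *s, long delta)
{
	s->nr_file_pages += delta;
}

/* Pre-patch style: one unit moves regardless of the page's size. */
static void toy_migrate_by_one(struct toy_stats *from, struct toy_stats *to)
{
	toy_mod_state(from, -1);
	toy_mod_state(to, 1);
}

/* Post-patch style: the delta is scaled by the number of base pages. */
static void toy_migrate_by_nr(struct toy_stats *from, struct toy_stats *to,
			      long nr)
{
	assert(nr > 0);
	toy_mod_state(from, -nr);
	toy_mod_state(to, nr);
}

/*
 * Migrate one THP off a node that holds only that THP and return how
 * many pages the source node still claims to have afterwards.
 */
static long toy_stale_pages_on_source(int scale_by_nr)
{
	struct toy_stats from = { .nr_file_pages = TOY_HPAGE_NR };
	struct toy_stats to = { .nr_file_pages = 0 };

	if (scale_by_nr)
		toy_migrate_by_nr(&from, &to, TOY_HPAGE_NR);
	else
		toy_migrate_by_one(&from, &to);

	return from.nr_file_pages;
}
```

With the unscaled update the source node is left claiming 511 pages it no longer holds; with the scaled update it drops to 0.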

2020-12-27 18:19:07

by Shakeel Butt

Subject: Re: [PATCH 2/2] mm: fix numa stats for thp migration

On Sun, Dec 27, 2020 at 10:14 AM Shakeel Butt <[email protected]> wrote:
>
> Currently the kernel is not correctly updating the numa stats for
> NR_FILE_PAGES and NR_SHMEM on THP migration. Fix that. For NR_FILE_DIRTY
> and NR_ZONE_WRITE_PENDING, although at the moment there is no need to
> handle THP migration as kernel still does not have write support for
> file THP but to be more future proof, this patch adds the THP support
> for those stats as well.
>
> Fixes: e71769ae52609 ("mm: enable thp migration for shmem thp")
> Signed-off-by: Shakeel Butt <[email protected]>
> Cc: <[email protected]>
> ---
> mm/migrate.c | 23 ++++++++++++-----------
> 1 file changed, 12 insertions(+), 11 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 613794f6a433..ade163c6ecdf 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -402,6 +402,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
> struct zone *oldzone, *newzone;
> int dirty;
> int expected_count = expected_page_refs(mapping, page) + extra_count;
> + int nr = thp_nr_pages(page);
>
> if (!mapping) {
> /* Anonymous page without mapping */
> @@ -437,7 +438,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
> */
> newpage->index = page->index;
> newpage->mapping = page->mapping;
> - page_ref_add(newpage, thp_nr_pages(page)); /* add cache reference */
> + page_ref_add(newpage, nr); /* add cache reference */
> if (PageSwapBacked(page)) {
> __SetPageSwapBacked(newpage);
> if (PageSwapCache(page)) {
> @@ -459,7 +460,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
> if (PageTransHuge(page)) {
> int i;
>
> - for (i = 1; i < HPAGE_PMD_NR; i++) {
> + for (i = 1; i < nr; i++) {
> xas_next(&xas);
> xas_store(&xas, newpage);
> }
> @@ -470,7 +471,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
> * to one less reference.
> * We know this isn't the last reference.
> */
> - page_ref_unfreeze(page, expected_count - thp_nr_pages(page));
> + page_ref_unfreeze(page, expected_count - nr);
>
> xas_unlock(&xas);
> /* Leave irq disabled to prevent preemption while updating stats */
> @@ -493,17 +494,17 @@ int migrate_page_move_mapping(struct address_space *mapping,
> old_lruvec = mem_cgroup_lruvec(memcg, oldzone->zone_pgdat);
> new_lruvec = mem_cgroup_lruvec(memcg, newzone->zone_pgdat);
>
> - __dec_lruvec_state(old_lruvec, NR_FILE_PAGES);
> - __inc_lruvec_state(new_lruvec, NR_FILE_PAGES);
> + __mod_lruvec_state(old_lruvec, NR_FILE_PAGES, -nr);
> + __mod_lruvec_state(new_lruvec, NR_FILE_PAGES, nr);
> if (PageSwapBacked(page) && !PageSwapCache(page)) {
> - __dec_lruvec_state(old_lruvec, NR_SHMEM);
> - __inc_lruvec_state(new_lruvec, NR_SHMEM);
> + __mod_lruvec_state(old_lruvec, NR_SHMEM, -nr);
> + __mod_lruvec_state(new_lruvec, NR_SHMEM, nr);
> }
> if (dirty && mapping_can_writeback(mapping)) {
> - __dec_lruvec_state(old_lruvec, NR_FILE_DIRTY);
> - __dec_zone_state(oldzone, NR_ZONE_WRITE_PENDING);
> - __inc_lruvec_state(new_lruvec, NR_FILE_DIRTY);
> - __inc_zone_state(newzone, NR_ZONE_WRITE_PENDING);
> + __mod_lruvec_state(old_lruvec, NR_FILE_DIRTY, -nr);
> + __mod_zone_page_tate(oldzone, NR_ZONE_WRITE_PENDING, -nr);

This should be __mod_zone_page_state(). I fixed locally but sent the
older patch by mistake.

> + __mod_lruvec_state(new_lruvec, NR_FILE_DIRTY, nr);
> + __mod_zone_page_state(newzone, NR_ZONE_WRITE_PENDING, nr);
> }
> }
> local_irq_enable();
> --
> 2.29.2.729.g45daf8777d-goog
>

2020-12-28 05:43:48

by Muchun Song

Subject: Re: [External] [PATCH 1/2] mm: memcg: fix memcg file_dirty numa stat

On Mon, Dec 28, 2020 at 2:13 AM Shakeel Butt <[email protected]> wrote:
>
> The kernel updates the per-node NR_FILE_DIRTY stats on page migration
> but not the memcg numa stats. That was not an issue until recently the
> commit 5f9a4f4a7096 ("mm: memcontrol: add the missing numa_stat interface
> for cgroup v2") exposed numa stats for the memcg. So fixing the
> file_dirty per-memcg numa stat.
>
> Fixes: 5f9a4f4a7096 ("mm: memcontrol: add the missing numa_stat interface for cgroup v2")

Thanks for catching this problem.

Reviewed-by: Muchun Song <[email protected]>

> Signed-off-by: Shakeel Butt <[email protected]>
> Cc: <[email protected]>
> ---
> mm/migrate.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index ee5e612b4cd8..613794f6a433 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -500,9 +500,9 @@ int migrate_page_move_mapping(struct address_space *mapping,
> __inc_lruvec_state(new_lruvec, NR_SHMEM);
> }
> if (dirty && mapping_can_writeback(mapping)) {
> - __dec_node_state(oldzone->zone_pgdat, NR_FILE_DIRTY);
> + __dec_lruvec_state(old_lruvec, NR_FILE_DIRTY);
> __dec_zone_state(oldzone, NR_ZONE_WRITE_PENDING);
> - __inc_node_state(newzone->zone_pgdat, NR_FILE_DIRTY);
> + __inc_lruvec_state(new_lruvec, NR_FILE_DIRTY);
> __inc_zone_state(newzone, NR_ZONE_WRITE_PENDING);
> }
> }
> --
> 2.29.2.729.g45daf8777d-goog
>


--
Yours,
Muchun

2020-12-29 01:18:35

by Yang Shi

Subject: Re: [PATCH 1/2] mm: memcg: fix memcg file_dirty numa stat

On Sun, Dec 27, 2020 at 10:13 AM Shakeel Butt <[email protected]> wrote:
>
> The kernel updates the per-node NR_FILE_DIRTY stats on page migration
> but not the memcg numa stats. That was not an issue until recently the
> commit 5f9a4f4a7096 ("mm: memcontrol: add the missing numa_stat interface
> for cgroup v2") exposed numa stats for the memcg. So fixing the
> file_dirty per-memcg numa stat.
>
> Fixes: 5f9a4f4a7096 ("mm: memcontrol: add the missing numa_stat interface for cgroup v2")
> Signed-off-by: Shakeel Butt <[email protected]>
> Cc: <[email protected]>

Acked-by: Yang Shi <[email protected]>

> ---
> mm/migrate.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index ee5e612b4cd8..613794f6a433 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -500,9 +500,9 @@ int migrate_page_move_mapping(struct address_space *mapping,
> __inc_lruvec_state(new_lruvec, NR_SHMEM);
> }
> if (dirty && mapping_can_writeback(mapping)) {
> - __dec_node_state(oldzone->zone_pgdat, NR_FILE_DIRTY);
> + __dec_lruvec_state(old_lruvec, NR_FILE_DIRTY);
> __dec_zone_state(oldzone, NR_ZONE_WRITE_PENDING);
> - __inc_node_state(newzone->zone_pgdat, NR_FILE_DIRTY);
> + __inc_lruvec_state(new_lruvec, NR_FILE_DIRTY);
> __inc_zone_state(newzone, NR_ZONE_WRITE_PENDING);
> }
> }
> --
> 2.29.2.729.g45daf8777d-goog
>
>

2020-12-29 01:18:46

by Yang Shi

Subject: Re: [PATCH 2/2] mm: fix numa stats for thp migration

On Sun, Dec 27, 2020 at 10:16 AM Shakeel Butt <[email protected]> wrote:
>
> On Sun, Dec 27, 2020 at 10:14 AM Shakeel Butt <[email protected]> wrote:
> >
> > Currently the kernel is not correctly updating the numa stats for
> > NR_FILE_PAGES and NR_SHMEM on THP migration. Fix that. For NR_FILE_DIRTY
> > and NR_ZONE_WRITE_PENDING, although at the moment there is no need to
> > handle THP migration as kernel still does not have write support for
> > file THP but to be more future proof, this patch adds the THP support
> > for those stats as well.
> >
> > Fixes: e71769ae52609 ("mm: enable thp migration for shmem thp")
> > Signed-off-by: Shakeel Butt <[email protected]>
> > Cc: <[email protected]>
> > ---
> > mm/migrate.c | 23 ++++++++++++-----------
> > 1 file changed, 12 insertions(+), 11 deletions(-)
> >
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 613794f6a433..ade163c6ecdf 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -402,6 +402,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
> > struct zone *oldzone, *newzone;
> > int dirty;
> > int expected_count = expected_page_refs(mapping, page) + extra_count;
> > + int nr = thp_nr_pages(page);
> >
> > if (!mapping) {
> > /* Anonymous page without mapping */
> > @@ -437,7 +438,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
> > */
> > newpage->index = page->index;
> > newpage->mapping = page->mapping;
> > - page_ref_add(newpage, thp_nr_pages(page)); /* add cache reference */
> > + page_ref_add(newpage, nr); /* add cache reference */
> > if (PageSwapBacked(page)) {
> > __SetPageSwapBacked(newpage);
> > if (PageSwapCache(page)) {
> > @@ -459,7 +460,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
> > if (PageTransHuge(page)) {
> > int i;
> >
> > - for (i = 1; i < HPAGE_PMD_NR; i++) {
> > + for (i = 1; i < nr; i++) {
> > xas_next(&xas);
> > xas_store(&xas, newpage);
> > }
> > @@ -470,7 +471,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
> > * to one less reference.
> > * We know this isn't the last reference.
> > */
> > - page_ref_unfreeze(page, expected_count - thp_nr_pages(page));
> > + page_ref_unfreeze(page, expected_count - nr);
> >
> > xas_unlock(&xas);
> > /* Leave irq disabled to prevent preemption while updating stats */
> > @@ -493,17 +494,17 @@ int migrate_page_move_mapping(struct address_space *mapping,
> > old_lruvec = mem_cgroup_lruvec(memcg, oldzone->zone_pgdat);
> > new_lruvec = mem_cgroup_lruvec(memcg, newzone->zone_pgdat);
> >
> > - __dec_lruvec_state(old_lruvec, NR_FILE_PAGES);
> > - __inc_lruvec_state(new_lruvec, NR_FILE_PAGES);
> > + __mod_lruvec_state(old_lruvec, NR_FILE_PAGES, -nr);
> > + __mod_lruvec_state(new_lruvec, NR_FILE_PAGES, nr);
> > if (PageSwapBacked(page) && !PageSwapCache(page)) {
> > - __dec_lruvec_state(old_lruvec, NR_SHMEM);
> > - __inc_lruvec_state(new_lruvec, NR_SHMEM);
> > + __mod_lruvec_state(old_lruvec, NR_SHMEM, -nr);
> > + __mod_lruvec_state(new_lruvec, NR_SHMEM, nr);
> > }
> > if (dirty && mapping_can_writeback(mapping)) {
> > - __dec_lruvec_state(old_lruvec, NR_FILE_DIRTY);
> > - __dec_zone_state(oldzone, NR_ZONE_WRITE_PENDING);
> > - __inc_lruvec_state(new_lruvec, NR_FILE_DIRTY);
> > - __inc_zone_state(newzone, NR_ZONE_WRITE_PENDING);
> > + __mod_lruvec_state(old_lruvec, NR_FILE_DIRTY, -nr);
> > + __mod_zone_page_tate(oldzone, NR_ZONE_WRITE_PENDING, -nr);
>
> This should be __mod_zone_page_state(). I fixed locally but sent the
> older patch by mistake.

Acked-by: Yang Shi <[email protected]>

>
> > + __mod_lruvec_state(new_lruvec, NR_FILE_DIRTY, nr);
> > + __mod_zone_page_state(newzone, NR_ZONE_WRITE_PENDING, nr);
> > }
> > }
> > local_irq_enable();
> > --
> > 2.29.2.729.g45daf8777d-goog
> >
>

2020-12-29 01:44:04

by Roman Gushchin

Subject: Re: [PATCH 2/2] mm: fix numa stats for thp migration

On Sun, Dec 27, 2020 at 10:13:10AM -0800, Shakeel Butt wrote:
> Currently the kernel is not correctly updating the numa stats for
> NR_FILE_PAGES and NR_SHMEM on THP migration. Fix that. For NR_FILE_DIRTY
> and NR_ZONE_WRITE_PENDING, although at the moment there is no need to
> handle THP migration as kernel still does not have write support for
> file THP but to be more future proof, this patch adds the THP support
> for those stats as well.
>
> Fixes: e71769ae52609 ("mm: enable thp migration for shmem thp")
> Signed-off-by: Shakeel Butt <[email protected]>
> Cc: <[email protected]>

With the typo fix ("__mod_zone_page_tate")

Reviewed-by: Roman Gushchin <[email protected]>

Thanks!

2020-12-29 01:47:09

by Roman Gushchin

Subject: Re: [PATCH 1/2] mm: memcg: fix memcg file_dirty numa stat

On Sun, Dec 27, 2020 at 10:13:09AM -0800, Shakeel Butt wrote:
> The kernel updates the per-node NR_FILE_DIRTY stats on page migration
> but not the memcg numa stats. That was not an issue until recently the
> commit 5f9a4f4a7096 ("mm: memcontrol: add the missing numa_stat interface
> for cgroup v2") exposed numa stats for the memcg. So fixing the
> file_dirty per-memcg numa stat.
>
> Fixes: 5f9a4f4a7096 ("mm: memcontrol: add the missing numa_stat interface for cgroup v2")
> Signed-off-by: Shakeel Butt <[email protected]>
> Cc: <[email protected]>

Reviewed-by: Roman Gushchin <[email protected]>

Thanks!

> ---
> mm/migrate.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index ee5e612b4cd8..613794f6a433 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -500,9 +500,9 @@ int migrate_page_move_mapping(struct address_space *mapping,
> __inc_lruvec_state(new_lruvec, NR_SHMEM);
> }
> if (dirty && mapping_can_writeback(mapping)) {
> - __dec_node_state(oldzone->zone_pgdat, NR_FILE_DIRTY);
> + __dec_lruvec_state(old_lruvec, NR_FILE_DIRTY);
> __dec_zone_state(oldzone, NR_ZONE_WRITE_PENDING);
> - __inc_node_state(newzone->zone_pgdat, NR_FILE_DIRTY);
> + __inc_lruvec_state(new_lruvec, NR_FILE_DIRTY);
> __inc_zone_state(newzone, NR_ZONE_WRITE_PENDING);
> }
> }
> --
> 2.29.2.729.g45daf8777d-goog
>