2021-03-01 06:30:34

by Muchun Song

Subject: [PATCH 2/5] mm: memcontrol: make page_memcg{_rcu} only applicable for non-kmem page

We want to reuse the obj_cgroup APIs to reparent the kmem pages when
the memcg is offlined. To do this, we need to store an object cgroup
pointer in page->memcg_data for the kmem pages.

As a result, page->memcg_data can have 3 different meanings.

1) For the slab pages, page->memcg_data points to an object cgroups
vector.

2) For the kmem pages (excluding the slab pages), page->memcg_data
points to an object cgroup.

3) For the user pages (e.g. the LRU pages), page->memcg_data points
to a memory cgroup.
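
The three meanings above can be sketched in user space as a tagged
word. This is only an illustration under stated assumptions: the flag
values are modelled on MEMCG_DATA_OBJCGS and MEMCG_DATA_KMEM but are
not copied from the kernel, and encode(), page_kind() and
data_pointer() are hypothetical names that do not exist in the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed flag layout: low two bits of the word carry the type. */
#define DATA_OBJCGS     ((uintptr_t)1 << 0) /* slab: obj_cgroup vector */
#define DATA_KMEM       ((uintptr_t)1 << 1) /* non-slab kmem page */
#define DATA_FLAGS_MASK ((uintptr_t)3)

enum page_kind { KIND_UNCHARGED, KIND_SLAB, KIND_KMEM, KIND_USER };

/* Pack a pointer and its type flags into one word. */
static uintptr_t encode(const void *ptr, uintptr_t flags)
{
	uintptr_t word = (uintptr_t)ptr;

	/* The pointer must be aligned so the flag bits are free. */
	assert((word & DATA_FLAGS_MASK) == 0);
	return word | flags;
}

/* Classify a memcg_data-style word by its low flag bits. */
static enum page_kind page_kind(uintptr_t memcg_data)
{
	if (!memcg_data)
		return KIND_UNCHARGED;
	if (memcg_data & DATA_OBJCGS)
		return KIND_SLAB;
	if (memcg_data & DATA_KMEM)
		return KIND_KMEM;
	return KIND_USER;
}

/* Strip the flag bits to recover the stored pointer. */
static void *data_pointer(uintptr_t memcg_data)
{
	return (void *)(memcg_data & ~DATA_FLAGS_MASK);
}
```

The point of the sketch is that the same word must be interpreted
differently depending on its flag bits, which is why a single accessor
cannot serve all three page types.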

Currently we always get the memcg associated with a page via page_memcg
or page_memcg_rcu. page_memcg_check is special: it has to be used in
cases when it's not known whether a page has an associated memory
cgroup pointer or an object cgroups vector. Because a later patch makes
the page->memcg_data of a kmem page no longer point to a memory cgroup,
page_memcg and page_memcg_rcu can no longer be applied to kmem pages.
In this patch, we introduce page_memcg_kmem to get the memcg associated
with a kmem page, and make page_memcg and page_memcg_rcu no longer
apply to kmem pages.

In the end, there are 4 helpers to get the memcg associated with a
page. The usage is as follows.

1) Get the memory cgroup associated with a non-kmem page (e.g. the LRU
pages).

- page_memcg()
- page_memcg_rcu()

2) Get the memory cgroup associated with a kmem page (excluding the
slab pages).

- page_memcg_kmem()

3) Get the memory cgroup associated with a page. It has to be used in
cases when it's not known whether a page has an associated memory
cgroup pointer or an object cgroups vector. Returns NULL for slab pages
and uncharged pages; otherwise returns the memory cgroup for charged
pages (e.g. kmem pages, LRU pages).

- page_memcg_check()

In some places, we use page_memcg to check whether the page is charged.
Now we introduce the page_memcg_charged helper to do this.
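
How a caller picks between the helpers can be sketched in user space as
follows. This is a simplified model, not the kernel code: struct
page_model, struct memcg and the model_*() functions are hypothetical
stand-ins, the flag values are assumed, and assert() stands in for
VM_BUG_ON_PAGE().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DATA_OBJCGS     ((uintptr_t)1 << 0) /* assumed: slab vector */
#define DATA_KMEM       ((uintptr_t)1 << 1) /* assumed: kmem page */
#define DATA_FLAGS_MASK ((uintptr_t)3)

struct memcg { int id; };                    /* stand-in for mem_cgroup */
struct page_model { uintptr_t memcg_data; }; /* stand-in for struct page */

/* Modelled on page_memcg_charged(): a non-zero word means charged. */
static bool model_charged(const struct page_model *p)
{
	assert(!(p->memcg_data & DATA_OBJCGS));
	return p->memcg_data != 0;
}

/* Modelled on page_memcg(): valid only when no flag bit is set. */
static struct memcg *model_memcg(const struct page_model *p)
{
	assert(!(p->memcg_data & DATA_FLAGS_MASK));
	return (struct memcg *)p->memcg_data;
}

/* Modelled on page_memcg_kmem(): needs the kmem flag, strips flags. */
static struct memcg *model_memcg_kmem(const struct page_model *p)
{
	assert(!(p->memcg_data & DATA_OBJCGS));
	assert(p->memcg_data & DATA_KMEM);
	return (struct memcg *)(p->memcg_data & ~DATA_FLAGS_MASK);
}

/* Caller-side selection: check the charge first, then the page type. */
static struct memcg *model_lookup(const struct page_model *p)
{
	if (!model_charged(p))
		return NULL;
	return (p->memcg_data & DATA_KMEM) ? model_memcg_kmem(p)
					   : model_memcg(p);
}
```

In this model, checking the charge first lets the caller avoid decoding
the word before it knows which helper is valid for the page.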

This is a preparation for reparenting the kmem pages. To support
reparenting kmem pages, we only need to adjust page_memcg_kmem and
page_memcg_check in a later patch.

Signed-off-by: Muchun Song <[email protected]>
---
include/linux/memcontrol.h | 56 +++++++++++++++++++++++++++++++++++++++-------
mm/memcontrol.c | 23 ++++++++++---------
mm/page_alloc.c | 4 ++--
3 files changed, 63 insertions(+), 20 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index e6dc793d587d..1d2c82464c8c 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -358,14 +358,46 @@ enum page_memcg_data_flags {

#define MEMCG_DATA_FLAGS_MASK (__NR_MEMCG_DATA_FLAGS - 1)

+/* Return true for charged page, otherwise false. */
+static inline bool page_memcg_charged(struct page *page)
+{
+ unsigned long memcg_data = page->memcg_data;
+
+ VM_BUG_ON_PAGE(PageSlab(page), page);
+ VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_OBJCGS, page);
+
+ return !!memcg_data;
+}
+
/*
- * page_memcg - get the memory cgroup associated with a page
+ * page_memcg_kmem - get the memory cgroup associated with a kmem page.
+ * @page: a pointer to the page struct
+ *
+ * Returns a pointer to the memory cgroup associated with the kmem page,
+ * or NULL. This function assumes that the page is known to have a proper
+ * memory cgroup pointer. It is only suitable for kmem pages which means
+ * PageMemcgKmem() returns true for this page.
+ */
+static inline struct mem_cgroup *page_memcg_kmem(struct page *page)
+{
+ unsigned long memcg_data = page->memcg_data;
+
+ VM_BUG_ON_PAGE(PageSlab(page), page);
+ VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_OBJCGS, page);
+ VM_BUG_ON_PAGE(!(memcg_data & MEMCG_DATA_KMEM), page);
+
+ return (struct mem_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);
+}
+
+/*
+ * page_memcg - get the memory cgroup associated with a non-kmem page
* @page: a pointer to the page struct
*
* Returns a pointer to the memory cgroup associated with the page,
* or NULL. This function assumes that the page is known to have a
* proper memory cgroup pointer. It's not safe to call this function
- * against some type of pages, e.g. slab pages or ex-slab pages.
+ * against some type of pages, e.g. slab pages, kmem pages or ex-slab
+ * pages.
*
* Any of the following ensures page and memcg binding stability:
* - the page lock
@@ -378,27 +410,30 @@ static inline struct mem_cgroup *page_memcg(struct page *page)
unsigned long memcg_data = page->memcg_data;

VM_BUG_ON_PAGE(PageSlab(page), page);
- VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_OBJCGS, page);
+ VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_FLAGS_MASK, page);

- return (struct mem_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);
+ return (struct mem_cgroup *)memcg_data;
}

/*
- * page_memcg_rcu - locklessly get the memory cgroup associated with a page
+ * page_memcg_rcu - locklessly get the memory cgroup associated with a non-kmem page
* @page: a pointer to the page struct
*
* Returns a pointer to the memory cgroup associated with the page,
* or NULL. This function assumes that the page is known to have a
* proper memory cgroup pointer. It's not safe to call this function
- * against some type of pages, e.g. slab pages or ex-slab pages.
+ * against some type of pages, e.g. slab pages, kmem pages or ex-slab
+ * pages.
*/
static inline struct mem_cgroup *page_memcg_rcu(struct page *page)
{
+ unsigned long memcg_data = READ_ONCE(page->memcg_data);
+
VM_BUG_ON_PAGE(PageSlab(page), page);
+ VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_FLAGS_MASK, page);
WARN_ON_ONCE(!rcu_read_lock_held());

- return (struct mem_cgroup *)(READ_ONCE(page->memcg_data) &
- ~MEMCG_DATA_FLAGS_MASK);
+ return (struct mem_cgroup *)memcg_data;
}

/*
@@ -1072,6 +1107,11 @@ void mem_cgroup_split_huge_fixup(struct page *head);

struct mem_cgroup;

+static inline bool page_memcg_charged(struct page *page)
+{
+ return false;
+}
+
static inline struct mem_cgroup *page_memcg(struct page *page)
{
return NULL;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2eafbae504ac..bfd6efe1e196 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -855,10 +855,11 @@ void __mod_lruvec_page_state(struct page *page, enum node_stat_item idx,
int val)
{
struct page *head = compound_head(page); /* rmap on tail pages */
- struct mem_cgroup *memcg = page_memcg(head);
+ struct mem_cgroup *memcg;
pg_data_t *pgdat = page_pgdat(page);
struct lruvec *lruvec;

+ memcg = PageMemcgKmem(head) ? page_memcg_kmem(head) : page_memcg(head);
/* Untracked pages have no memcg, no lruvec. Update only the node */
if (!memcg) {
__mod_node_page_state(pgdat, idx, val);
@@ -3170,12 +3171,13 @@ int __memcg_kmem_charge_page(struct page *page, gfp_t gfp, int order)
*/
void __memcg_kmem_uncharge_page(struct page *page, int order)
{
- struct mem_cgroup *memcg = page_memcg(page);
+ struct mem_cgroup *memcg;
unsigned int nr_pages = 1 << order;

- if (!memcg)
+ if (!page_memcg_charged(page))
return;

+ memcg = page_memcg_kmem(page);
VM_BUG_ON_PAGE(mem_cgroup_is_root(memcg), page);
__memcg_kmem_uncharge(memcg, nr_pages);
page->memcg_data = 0;
@@ -6831,24 +6833,25 @@ static void uncharge_batch(const struct uncharge_gather *ug)
static void uncharge_page(struct page *page, struct uncharge_gather *ug)
{
unsigned long nr_pages;
+ struct mem_cgroup *memcg;

VM_BUG_ON_PAGE(PageLRU(page), page);

- if (!page_memcg(page))
+ if (!page_memcg_charged(page))
return;

/*
* Nobody should be changing or seriously looking at
- * page_memcg(page) at this point, we have fully
- * exclusive access to the page.
+ * page memcg at this point, we have fully exclusive
+ * access to the page.
*/
-
- if (ug->memcg != page_memcg(page)) {
+ memcg = PageMemcgKmem(page) ? page_memcg_kmem(page) : page_memcg(page);
+ if (ug->memcg != memcg) {
if (ug->memcg) {
uncharge_batch(ug);
uncharge_gather_clear(ug);
}
- ug->memcg = page_memcg(page);
+ ug->memcg = memcg;

/* pairs with css_put in uncharge_batch */
css_get(&ug->memcg->css);
@@ -6881,7 +6884,7 @@ void mem_cgroup_uncharge(struct page *page)
return;

/* Don't touch page->lru of any random page, pre-check: */
- if (!page_memcg(page))
+ if (!page_memcg_charged(page))
return;

uncharge_gather_clear(&ug);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f10966e3b4a5..bcb58ae15e24 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1124,7 +1124,7 @@ static inline bool page_expected_state(struct page *page,
if (unlikely((unsigned long)page->mapping |
page_ref_count(page) |
#ifdef CONFIG_MEMCG
- (unsigned long)page_memcg(page) |
+ page_memcg_charged(page) |
#endif
(page->flags & check_flags)))
return false;
@@ -1149,7 +1149,7 @@ static const char *page_bad_reason(struct page *page, unsigned long flags)
bad_reason = "PAGE_FLAGS_CHECK_AT_FREE flag(s) set";
}
#ifdef CONFIG_MEMCG
- if (unlikely(page_memcg(page)))
+ if (unlikely(page_memcg_charged(page)))
bad_reason = "page still charged to cgroup";
#endif
return bad_reason;
--
2.11.0


2021-03-02 10:00:06

by Johannes Weiner

Subject: Re: [PATCH 2/5] mm: memcontrol: make page_memcg{_rcu} only applicable for non-kmem page

Muchun, can you please reduce the CC list to mm/memcg folks only for
the next submission? I think probably 80% of the current recipients
don't care ;-)

On Mon, Mar 01, 2021 at 10:11:45AM -0800, Shakeel Butt wrote:
> On Sun, Feb 28, 2021 at 10:25 PM Muchun Song <[email protected]> wrote:
> >
> > We want to reuse the obj_cgroup APIs to reparent the kmem pages when
> > the memcg offlined. If we do this, we should store an object cgroup
> > pointer to page->memcg_data for the kmem pages.
> >
> > Finally, page->memcg_data can have 3 different meanings.
> >
> > 1) For the slab pages, page->memcg_data points to an object cgroups
> > vector.
> >
> > 2) For the kmem pages (exclude the slab pages), page->memcg_data
> > points to an object cgroup.
> >
> > 3) For the user pages (e.g. the LRU pages), page->memcg_data points
> > to a memory cgroup.
> >
> > Currently we always get the memcg associated with a page via page_memcg
> > or page_memcg_rcu. page_memcg_check is special, it has to be used in
> > cases when it's not known if a page has an associated memory cgroup
> > pointer or an object cgroups vector. Because the page->memcg_data of
> > the kmem page is not pointing to a memory cgroup in the later patch,
> > the page_memcg and page_memcg_rcu cannot be applicable for the kmem
> > pages. In this patch, we introduce page_memcg_kmem to get the memcg
> > associated with the kmem pages. And make page_memcg and page_memcg_rcu
> > no longer apply to the kmem pages.
> >
> > In the end, there are 4 helpers to get the memcg associated with a
> > page. The usage is as follows.
> >
> > 1) Get the memory cgroup associated with a non-kmem page (e.g. the LRU
> > pages).
> >
> > - page_memcg()
> > - page_memcg_rcu()
>
> Can you rename these to page_memcg_lru[_rcu] to make them explicitly
> for LRU pages?

The next patch removes page_memcg_kmem() again to replace it with
page_objcg(). That should (luckily) remove the need for this
distinction and keep page_memcg() simple and obvious.

It would be better to not introduce page_memcg_kmem() in the first
place in this patch, IMO.

2021-03-03 16:21:16

by Shakeel Butt

Subject: Re: [PATCH 2/5] mm: memcontrol: make page_memcg{_rcu} only applicable for non-kmem page

On Sun, Feb 28, 2021 at 10:25 PM Muchun Song <[email protected]> wrote:
>
> We want to reuse the obj_cgroup APIs to reparent the kmem pages when
> the memcg offlined. If we do this, we should store an object cgroup
> pointer to page->memcg_data for the kmem pages.
>
> Finally, page->memcg_data can have 3 different meanings.
>
> 1) For the slab pages, page->memcg_data points to an object cgroups
> vector.
>
> 2) For the kmem pages (exclude the slab pages), page->memcg_data
> points to an object cgroup.
>
> 3) For the user pages (e.g. the LRU pages), page->memcg_data points
> to a memory cgroup.
>
> Currently we always get the memcg associated with a page via page_memcg
> or page_memcg_rcu. page_memcg_check is special, it has to be used in
> cases when it's not known if a page has an associated memory cgroup
> pointer or an object cgroups vector. Because the page->memcg_data of
> the kmem page is not pointing to a memory cgroup in the later patch,
> the page_memcg and page_memcg_rcu cannot be applicable for the kmem
> pages. In this patch, we introduce page_memcg_kmem to get the memcg
> associated with the kmem pages. And make page_memcg and page_memcg_rcu
> no longer apply to the kmem pages.
>
> In the end, there are 4 helpers to get the memcg associated with a
> page. The usage is as follows.
>
> 1) Get the memory cgroup associated with a non-kmem page (e.g. the LRU
> pages).
>
> - page_memcg()
> - page_memcg_rcu()

Can you rename these to page_memcg_lru[_rcu] to make them explicitly
for LRU pages?

>
> 2) Get the memory cgroup associated with a kmem page (exclude the slab
> pages).
>
> - page_memcg_kmem()
>
> 3) Get the memory cgroup associated with a page. It has to be used in
> cases when it's not known if a page has an associated memory cgroup
> pointer or an object cgroups vector. Returns NULL for slab pages or
> uncharged pages, otherwise, returns memory cgroup for charged pages
> (e.g. kmem pages, LRU pages).
>
> - page_memcg_check()
>
> In some place, we use page_memcg to check whether the page is charged.
> Now we introduce page_memcg_charged helper to do this.
>
> This is a preparation for reparenting the kmem pages. To support reparent
> kmem pages, we just need to adjust page_memcg_kmem and page_memcg_check in
> the later patch.
>
> Signed-off-by: Muchun Song <[email protected]>
> ---
[snip]
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -855,10 +855,11 @@ void __mod_lruvec_page_state(struct page *page, enum node_stat_item idx,
> int val)
> {
> struct page *head = compound_head(page); /* rmap on tail pages */
> - struct mem_cgroup *memcg = page_memcg(head);
> + struct mem_cgroup *memcg;
> pg_data_t *pgdat = page_pgdat(page);
> struct lruvec *lruvec;
>
> + memcg = PageMemcgKmem(head) ? page_memcg_kmem(head) : page_memcg(head);

Should page_memcg_check() be used here?

> /* Untracked pages have no memcg, no lruvec. Update only the node */
> if (!memcg) {
> __mod_node_page_state(pgdat, idx, val);
> @@ -3170,12 +3171,13 @@ int __memcg_kmem_charge_page(struct page *page, gfp_t gfp, int order)
> */
> void __memcg_kmem_uncharge_page(struct page *page, int order)
> {
> - struct mem_cgroup *memcg = page_memcg(page);
> + struct mem_cgroup *memcg;
> unsigned int nr_pages = 1 << order;
>
> - if (!memcg)
> + if (!page_memcg_charged(page))
> return;
>
> + memcg = page_memcg_kmem(page);
> VM_BUG_ON_PAGE(mem_cgroup_is_root(memcg), page);
> __memcg_kmem_uncharge(memcg, nr_pages);
> page->memcg_data = 0;
> @@ -6831,24 +6833,25 @@ static void uncharge_batch(const struct uncharge_gather *ug)
> static void uncharge_page(struct page *page, struct uncharge_gather *ug)
> {
> unsigned long nr_pages;
> + struct mem_cgroup *memcg;
>
> VM_BUG_ON_PAGE(PageLRU(page), page);
>
> - if (!page_memcg(page))
> + if (!page_memcg_charged(page))
> return;
>
> /*
> * Nobody should be changing or seriously looking at
> - * page_memcg(page) at this point, we have fully
> - * exclusive access to the page.
> + * page memcg at this point, we have fully exclusive
> + * access to the page.
> */
> -
> - if (ug->memcg != page_memcg(page)) {
> + memcg = PageMemcgKmem(page) ? page_memcg_kmem(page) : page_memcg(page);

Same, should page_memcg_check() be used here?
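
For reference, the semantics the commit message gives for
page_memcg_check() (NULL for slab or uncharged pages, the memcg
otherwise) can be sketched in user space as below. This is a model
under assumptions, not the kernel helper: the flag values, struct
page_model, struct memcg and model_memcg_check() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DATA_OBJCGS     ((uintptr_t)1 << 0) /* assumed: slab vector */
#define DATA_KMEM       ((uintptr_t)1 << 1) /* assumed: kmem page */
#define DATA_FLAGS_MASK ((uintptr_t)3)

struct memcg { int id; };                    /* stand-in for mem_cgroup */
struct page_model { uintptr_t memcg_data; }; /* stand-in for struct page */

/*
 * Check-style lookup: tolerate any page type. A slab word holds an
 * obj_cgroup vector rather than a memcg, so it yields NULL. An
 * uncharged word is zero and therefore also decodes to NULL.
 */
static struct memcg *model_memcg_check(const struct page_model *p)
{
	uintptr_t data = p->memcg_data;

	if (data & DATA_OBJCGS)
		return NULL;
	return (struct memcg *)(data & ~DATA_FLAGS_MASK);
}
```

Under these assumed semantics, one call covers both arms of the
PageMemcgKmem() ternary quoted above. Whether that is appropriate at
these two call sites is the question being raised.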

2021-03-04 05:44:29

by Muchun Song

Subject: Re: [External] Re: [PATCH 2/5] mm: memcontrol: make page_memcg{_rcu} only applicable for non-kmem page

On Tue, Mar 2, 2021 at 3:09 AM Johannes Weiner <[email protected]> wrote:
>
> Muchun, can you please reduce the CC list to mm/memcg folks only for
> the next submission? I think probably 80% of the current recipients
> don't care ;-)

At first, I just used scripts/get_maintainer.pl to get the
CC list. I will reduce the CC list in the next version.
Thanks.

>
> On Mon, Mar 01, 2021 at 10:11:45AM -0800, Shakeel Butt wrote:
> > On Sun, Feb 28, 2021 at 10:25 PM Muchun Song <[email protected]> wrote:
> > >
> > > We want to reuse the obj_cgroup APIs to reparent the kmem pages when
> > > the memcg offlined. If we do this, we should store an object cgroup
> > > pointer to page->memcg_data for the kmem pages.
> > >
> > > Finally, page->memcg_data can have 3 different meanings.
> > >
> > > 1) For the slab pages, page->memcg_data points to an object cgroups
> > > vector.
> > >
> > > 2) For the kmem pages (exclude the slab pages), page->memcg_data
> > > points to an object cgroup.
> > >
> > > 3) For the user pages (e.g. the LRU pages), page->memcg_data points
> > > to a memory cgroup.
> > >
> > > Currently we always get the memcg associated with a page via page_memcg
> > > or page_memcg_rcu. page_memcg_check is special, it has to be used in
> > > cases when it's not known if a page has an associated memory cgroup
> > > pointer or an object cgroups vector. Because the page->memcg_data of
> > > the kmem page is not pointing to a memory cgroup in the later patch,
> > > the page_memcg and page_memcg_rcu cannot be applicable for the kmem
> > > pages. In this patch, we introduce page_memcg_kmem to get the memcg
> > > associated with the kmem pages. And make page_memcg and page_memcg_rcu
> > > no longer apply to the kmem pages.
> > >
> > > In the end, there are 4 helpers to get the memcg associated with a
> > > page. The usage is as follows.
> > >
> > > 1) Get the memory cgroup associated with a non-kmem page (e.g. the LRU
> > > pages).
> > >
> > > - page_memcg()
> > > - page_memcg_rcu()
> >
> > Can you rename these to page_memcg_lru[_rcu] to make them explicitly
> > for LRU pages?
>
> The next patch removes page_memcg_kmem() again to replace it with
> page_objcg(). That should (luckily) remove the need for this
> distinction and keep page_memcg() simple and obvious.
>
> It would be better to not introduce page_memcg_kmem() in the first
> place in this patch, IMO.

OK. I will follow your suggestion. Thanks.