This is fall-out from the memcg lifetime rework in 3.17, where
writeback statistics can get lost against a migrating page.
The (-stable) fix is to adapt the page stat side to the new lifetime
rules, rather than to make an exception specifically for migrating pages,
which seems less error-prone and generally the right way forward.
include/linux/memcontrol.h | 58 ++++++++++++++--------------------------------
include/linux/mm.h | 1 -
mm/memcontrol.c | 54 ++++++++++++++++++------------------------
mm/page-writeback.c | 43 ++++++++++++----------------------
mm/rmap.c | 20 ++++++++--------
5 files changed, 64 insertions(+), 112 deletions(-)
0a31bc97c80c ("mm: memcontrol: rewrite uncharge API") changed page
migration to uncharge the old page right away. The page is locked,
unmapped, truncated, and off the LRU, but it could race with writeback
ending, which then doesn't unaccount the page properly:
test_clear_page_writeback() migration
acquire pc->mem_cgroup->move_lock
wait_on_page_writeback()
TestClearPageWriteback()
mem_cgroup_migrate()
clear PCG_USED
if (PageCgroupUsed(pc))
decrease memcg pages under writeback
release pc->mem_cgroup->move_lock
The per-page statistics interface is heavily optimized to avoid a
function call and a lookup_page_cgroup() in the file unmap fast path,
which means it doesn't verify whether a page is still charged before
clearing PageWriteback() and it has to do it in the stat update later.
Rework it so that it looks up the page's memcg once at the beginning
of the transaction and then uses it throughout. The charge will be
verified before clearing PageWriteback() and migration can't uncharge
the page as long as that is still set. The RCU lock will protect the
memcg past uncharge.
As far as losing the optimization goes, the following test results are
from a microbenchmark that maps, faults, and unmaps a 4GB sparse file
three times in a nested fashion, so that there are two negative passes
that don't account but still go through the new transaction overhead.
There is no actual difference:
old: 33.195102545 seconds time elapsed ( +- 0.01% )
new: 33.199231369 seconds time elapsed ( +- 0.03% )
The time spent in page_remove_rmap()'s callees still adds up to the
same, but the time spent in the function itself seems reduced:
# Children Self Command Shared Object Symbol
old: 0.12% 0.11% filemapstress [kernel.kallsyms] [k] page_remove_rmap
new: 0.12% 0.08% filemapstress [kernel.kallsyms] [k] page_remove_rmap
Signed-off-by: Johannes Weiner <[email protected]>
Cc: "3.17" <[email protected]>
---
include/linux/memcontrol.h | 58 ++++++++++++++--------------------------------
mm/memcontrol.c | 54 ++++++++++++++++++------------------------
mm/page-writeback.c | 22 ++++++++++--------
mm/rmap.c | 20 ++++++++--------
4 files changed, 61 insertions(+), 93 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 0daf383f8f1c..ea007615e8f9 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -139,48 +139,23 @@ static inline bool mem_cgroup_disabled(void)
return false;
}
-void __mem_cgroup_begin_update_page_stat(struct page *page, bool *locked,
- unsigned long *flags);
-
-extern atomic_t memcg_moving;
-
-static inline void mem_cgroup_begin_update_page_stat(struct page *page,
- bool *locked, unsigned long *flags)
-{
- if (mem_cgroup_disabled())
- return;
- rcu_read_lock();
- *locked = false;
- if (atomic_read(&memcg_moving))
- __mem_cgroup_begin_update_page_stat(page, locked, flags);
-}
-
-void __mem_cgroup_end_update_page_stat(struct page *page,
- unsigned long *flags);
-static inline void mem_cgroup_end_update_page_stat(struct page *page,
- bool *locked, unsigned long *flags)
-{
- if (mem_cgroup_disabled())
- return;
- if (*locked)
- __mem_cgroup_end_update_page_stat(page, flags);
- rcu_read_unlock();
-}
-
-void mem_cgroup_update_page_stat(struct page *page,
- enum mem_cgroup_stat_index idx,
- int val);
-
-static inline void mem_cgroup_inc_page_stat(struct page *page,
+struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page, bool *locked,
+ unsigned long *flags);
+void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool locked,
+ unsigned long flags);
+void mem_cgroup_update_page_stat(struct mem_cgroup *memcg,
+ enum mem_cgroup_stat_index idx, int val);
+
+static inline void mem_cgroup_inc_page_stat(struct mem_cgroup *memcg,
enum mem_cgroup_stat_index idx)
{
- mem_cgroup_update_page_stat(page, idx, 1);
+ mem_cgroup_update_page_stat(memcg, idx, 1);
}
-static inline void mem_cgroup_dec_page_stat(struct page *page,
+static inline void mem_cgroup_dec_page_stat(struct mem_cgroup *memcg,
enum mem_cgroup_stat_index idx)
{
- mem_cgroup_update_page_stat(page, idx, -1);
+ mem_cgroup_update_page_stat(memcg, idx, -1);
}
unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
@@ -315,13 +290,14 @@ mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
{
}
-static inline void mem_cgroup_begin_update_page_stat(struct page *page,
+static inline struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page,
bool *locked, unsigned long *flags)
{
+ return NULL;
}
-static inline void mem_cgroup_end_update_page_stat(struct page *page,
- bool *locked, unsigned long *flags)
+static inline void mem_cgroup_end_page_stat(struct mem_cgroup *memcg,
+ bool locked, unsigned long flags)
{
}
@@ -343,12 +319,12 @@ static inline bool mem_cgroup_oom_synchronize(bool wait)
return false;
}
-static inline void mem_cgroup_inc_page_stat(struct page *page,
+static inline void mem_cgroup_inc_page_stat(struct mem_cgroup *memcg,
enum mem_cgroup_stat_index idx)
{
}
-static inline void mem_cgroup_dec_page_stat(struct page *page,
+static inline void mem_cgroup_dec_page_stat(struct mem_cgroup *memcg,
enum mem_cgroup_stat_index idx)
{
}
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3a203c7ec6c7..d84f316ac901 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1463,12 +1463,8 @@ int mem_cgroup_swappiness(struct mem_cgroup *memcg)
* start move here.
*/
-/* for quick checking without looking up memcg */
-atomic_t memcg_moving __read_mostly;
-
static void mem_cgroup_start_move(struct mem_cgroup *memcg)
{
- atomic_inc(&memcg_moving);
atomic_inc(&memcg->moving_account);
synchronize_rcu();
}
@@ -1479,10 +1475,8 @@ static void mem_cgroup_end_move(struct mem_cgroup *memcg)
* Now, mem_cgroup_clear_mc() may call this function with NULL.
* We check NULL in callee rather than caller.
*/
- if (memcg) {
- atomic_dec(&memcg_moving);
+ if (memcg)
atomic_dec(&memcg->moving_account);
- }
}
/*
@@ -2132,26 +2126,32 @@ cleanup:
* account and taking the move_lock in the slowpath.
*/
-void __mem_cgroup_begin_update_page_stat(struct page *page,
- bool *locked, unsigned long *flags)
+struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page,
+ bool *locked,
+ unsigned long *flags)
{
struct mem_cgroup *memcg;
struct page_cgroup *pc;
+ rcu_read_lock();
+
+ if (mem_cgroup_disabled())
+ return NULL;
+
pc = lookup_page_cgroup(page);
again:
memcg = pc->mem_cgroup;
if (unlikely(!memcg || !PageCgroupUsed(pc)))
- return;
+ return NULL;
/*
* If this memory cgroup is not under account moving, we don't
* need to take move_lock_mem_cgroup(). Because we already hold
* rcu_read_lock(), any calls to move_account will be delayed until
* rcu_read_unlock().
*/
- VM_BUG_ON(!rcu_read_lock_held());
+ *locked = false;
if (atomic_read(&memcg->moving_account) <= 0)
- return;
+ return memcg;
move_lock_mem_cgroup(memcg, flags);
if (memcg != pc->mem_cgroup || !PageCgroupUsed(pc)) {
@@ -2159,36 +2159,26 @@ again:
goto again;
}
*locked = true;
+
+ return memcg;
}
-void __mem_cgroup_end_update_page_stat(struct page *page, unsigned long *flags)
+void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool locked,
+ unsigned long flags)
{
- struct page_cgroup *pc = lookup_page_cgroup(page);
+ if (memcg && locked)
+ move_unlock_mem_cgroup(memcg, &flags);
- /*
- * It's guaranteed that pc->mem_cgroup never changes while
- * lock is held because a routine modifies pc->mem_cgroup
- * should take move_lock_mem_cgroup().
- */
- move_unlock_mem_cgroup(pc->mem_cgroup, flags);
+ rcu_read_unlock();
}
-void mem_cgroup_update_page_stat(struct page *page,
+void mem_cgroup_update_page_stat(struct mem_cgroup *memcg,
enum mem_cgroup_stat_index idx, int val)
{
- struct mem_cgroup *memcg;
- struct page_cgroup *pc = lookup_page_cgroup(page);
- unsigned long uninitialized_var(flags);
-
- if (mem_cgroup_disabled())
- return;
-
VM_BUG_ON(!rcu_read_lock_held());
- memcg = pc->mem_cgroup;
- if (unlikely(!memcg || !PageCgroupUsed(pc)))
- return;
- this_cpu_add(memcg->stat->count[idx], val);
+ if (memcg)
+ this_cpu_add(memcg->stat->count[idx], val);
}
/*
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index ff6a5b07211e..19ceae87522d 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2327,11 +2327,12 @@ EXPORT_SYMBOL(clear_page_dirty_for_io);
int test_clear_page_writeback(struct page *page)
{
struct address_space *mapping = page_mapping(page);
- int ret;
- bool locked;
unsigned long memcg_flags;
+ struct mem_cgroup *memcg;
+ bool locked;
+ int ret;
- mem_cgroup_begin_update_page_stat(page, &locked, &memcg_flags);
+ memcg = mem_cgroup_begin_page_stat(page, &locked, &memcg_flags);
if (mapping) {
struct backing_dev_info *bdi = mapping->backing_dev_info;
unsigned long flags;
@@ -2352,22 +2353,23 @@ int test_clear_page_writeback(struct page *page)
ret = TestClearPageWriteback(page);
}
if (ret) {
- mem_cgroup_dec_page_stat(page, MEM_CGROUP_STAT_WRITEBACK);
+ mem_cgroup_dec_page_stat(memcg, MEM_CGROUP_STAT_WRITEBACK);
dec_zone_page_state(page, NR_WRITEBACK);
inc_zone_page_state(page, NR_WRITTEN);
}
- mem_cgroup_end_update_page_stat(page, &locked, &memcg_flags);
+ mem_cgroup_end_page_stat(memcg, locked, memcg_flags);
return ret;
}
int __test_set_page_writeback(struct page *page, bool keep_write)
{
struct address_space *mapping = page_mapping(page);
- int ret;
- bool locked;
unsigned long memcg_flags;
+ struct mem_cgroup *memcg;
+ bool locked;
+ int ret;
- mem_cgroup_begin_update_page_stat(page, &locked, &memcg_flags);
+ memcg = mem_cgroup_begin_page_stat(page, &locked, &memcg_flags);
if (mapping) {
struct backing_dev_info *bdi = mapping->backing_dev_info;
unsigned long flags;
@@ -2394,10 +2396,10 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
ret = TestSetPageWriteback(page);
}
if (!ret) {
- mem_cgroup_inc_page_stat(page, MEM_CGROUP_STAT_WRITEBACK);
+ mem_cgroup_inc_page_stat(memcg, MEM_CGROUP_STAT_WRITEBACK);
inc_zone_page_state(page, NR_WRITEBACK);
}
- mem_cgroup_end_update_page_stat(page, &locked, &memcg_flags);
+ mem_cgroup_end_page_stat(memcg, locked, memcg_flags);
return ret;
}
diff --git a/mm/rmap.c b/mm/rmap.c
index 116a5053415b..f574046f77d4 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1042,15 +1042,16 @@ void page_add_new_anon_rmap(struct page *page,
*/
void page_add_file_rmap(struct page *page)
{
- bool locked;
+ struct mem_cgroup *memcg;
unsigned long flags;
+ bool locked;
- mem_cgroup_begin_update_page_stat(page, &locked, &flags);
+ memcg = mem_cgroup_begin_page_stat(page, &locked, &flags);
if (atomic_inc_and_test(&page->_mapcount)) {
__inc_zone_page_state(page, NR_FILE_MAPPED);
- mem_cgroup_inc_page_stat(page, MEM_CGROUP_STAT_FILE_MAPPED);
+ mem_cgroup_inc_page_stat(memcg, MEM_CGROUP_STAT_FILE_MAPPED);
}
- mem_cgroup_end_update_page_stat(page, &locked, &flags);
+ mem_cgroup_end_page_stat(memcg, locked, flags);
}
/**
@@ -1061,9 +1062,10 @@ void page_add_file_rmap(struct page *page)
*/
void page_remove_rmap(struct page *page)
{
+ struct mem_cgroup *uninitialized_var(memcg);
bool anon = PageAnon(page);
- bool locked;
unsigned long flags;
+ bool locked;
/*
* The anon case has no mem_cgroup page_stat to update; but may
@@ -1071,7 +1073,7 @@ void page_remove_rmap(struct page *page)
* we hold the lock against page_stat move: so avoid it on anon.
*/
if (!anon)
- mem_cgroup_begin_update_page_stat(page, &locked, &flags);
+ memcg = mem_cgroup_begin_page_stat(page, &locked, &flags);
/* page still mapped by someone else? */
if (!atomic_add_negative(-1, &page->_mapcount))
@@ -1096,8 +1098,7 @@ void page_remove_rmap(struct page *page)
-hpage_nr_pages(page));
} else {
__dec_zone_page_state(page, NR_FILE_MAPPED);
- mem_cgroup_dec_page_stat(page, MEM_CGROUP_STAT_FILE_MAPPED);
- mem_cgroup_end_update_page_stat(page, &locked, &flags);
+ mem_cgroup_dec_page_stat(memcg, MEM_CGROUP_STAT_FILE_MAPPED);
}
if (unlikely(PageMlocked(page)))
clear_page_mlock(page);
@@ -1110,10 +1111,9 @@ void page_remove_rmap(struct page *page)
* Leaving it set also helps swapoff to reinstate ptes
* faster for those pages still in swapcache.
*/
- return;
out:
if (!anon)
- mem_cgroup_end_update_page_stat(page, &locked, &flags);
+ mem_cgroup_end_page_stat(memcg, locked, flags);
}
/*
--
2.1.2
A follow-up patch would have changed the call signature. To save the
trouble, just fold it instead.
Signed-off-by: Johannes Weiner <[email protected]>
Cc: "3.17" <[email protected]>
---
include/linux/mm.h | 1 -
mm/page-writeback.c | 23 ++++-------------------
2 files changed, 4 insertions(+), 20 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 27eb1bfbe704..b46461116cd2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1235,7 +1235,6 @@ int __set_page_dirty_no_writeback(struct page *page);
int redirty_page_for_writepage(struct writeback_control *wbc,
struct page *page);
void account_page_dirtied(struct page *page, struct address_space *mapping);
-void account_page_writeback(struct page *page);
int set_page_dirty(struct page *page);
int set_page_dirty_lock(struct page *page);
int clear_page_dirty_for_io(struct page *page);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index ff24c9d83112..ff6a5b07211e 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2116,23 +2116,6 @@ void account_page_dirtied(struct page *page, struct address_space *mapping)
EXPORT_SYMBOL(account_page_dirtied);
/*
- * Helper function for set_page_writeback family.
- *
- * The caller must hold mem_cgroup_begin/end_update_page_stat() lock
- * while calling this function.
- * See test_set_page_writeback for example.
- *
- * NOTE: Unlike account_page_dirtied this does not rely on being atomic
- * wrt interrupts.
- */
-void account_page_writeback(struct page *page)
-{
- mem_cgroup_inc_page_stat(page, MEM_CGROUP_STAT_WRITEBACK);
- inc_zone_page_state(page, NR_WRITEBACK);
-}
-EXPORT_SYMBOL(account_page_writeback);
-
-/*
* For address_spaces which do not use buffers. Just tag the page as dirty in
* its radix tree.
*
@@ -2410,8 +2393,10 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
} else {
ret = TestSetPageWriteback(page);
}
- if (!ret)
- account_page_writeback(page);
+ if (!ret) {
+ mem_cgroup_inc_page_stat(page, MEM_CGROUP_STAT_WRITEBACK);
+ inc_zone_page_state(page, NR_WRITEBACK);
+ }
mem_cgroup_end_update_page_stat(page, &locked, &memcg_flags);
return ret;
--
2.1.2
On Wed, 22 Oct 2014 14:29:28 -0400 Johannes Weiner <[email protected]> wrote:
> 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API") changed page
> migration to uncharge the old page right away. The page is locked,
> unmapped, truncated, and off the LRU, but it could race with writeback
> ending, which then doesn't unaccount the page properly:
>
> test_clear_page_writeback() migration
> acquire pc->mem_cgroup->move_lock
> wait_on_page_writeback()
> TestClearPageWriteback()
> mem_cgroup_migrate()
> clear PCG_USED
> if (PageCgroupUsed(pc))
> decrease memcg pages under writeback
> release pc->mem_cgroup->move_lock
>
> The per-page statistics interface is heavily optimized to avoid a
> function call and a lookup_page_cgroup() in the file unmap fast path,
> which means it doesn't verify whether a page is still charged before
> clearing PageWriteback() and it has to do it in the stat update later.
>
> Rework it so that it looks up the page's memcg once at the beginning
> of the transaction and then uses it throughout. The charge will be
> verified before clearing PageWriteback() and migration can't uncharge
> the page as long as that is still set. The RCU lock will protect the
> memcg past uncharge.
>
> As far as losing the optimization goes, the following test results are
> from a microbenchmark that maps, faults, and unmaps a 4GB sparse file
> three times in a nested fashion, so that there are two negative passes
> that don't account but still go through the new transaction overhead.
> There is no actual difference:
>
> old: 33.195102545 seconds time elapsed ( +- 0.01% )
> new: 33.199231369 seconds time elapsed ( +- 0.03% )
>
> The time spent in page_remove_rmap()'s callees still adds up to the
> same, but the time spent in the function itself seems reduced:
>
> # Children Self Command Shared Object Symbol
> old: 0.12% 0.11% filemapstress [kernel.kallsyms] [k] page_remove_rmap
> new: 0.12% 0.08% filemapstress [kernel.kallsyms] [k] page_remove_rmap
>
> ...
>
> @@ -2132,26 +2126,32 @@ cleanup:
> * account and taking the move_lock in the slowpath.
> */
>
> -void __mem_cgroup_begin_update_page_stat(struct page *page,
> - bool *locked, unsigned long *flags)
> +struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page,
> + bool *locked,
> + unsigned long *flags)
It would be useful to document the args here (especially `locked').
Also the new rcu_read_locking protocol is worth a mention: that it
exists, what it does, why it persists as long as it does.
> {
> struct mem_cgroup *memcg;
> struct page_cgroup *pc;
>
> + rcu_read_lock();
> +
> + if (mem_cgroup_disabled())
> + return NULL;
> +
> pc = lookup_page_cgroup(page);
> again:
> memcg = pc->mem_cgroup;
> if (unlikely(!memcg || !PageCgroupUsed(pc)))
> - return;
> + return NULL;
> /*
> * If this memory cgroup is not under account moving, we don't
> * need to take move_lock_mem_cgroup(). Because we already hold
> * rcu_read_lock(), any calls to move_account will be delayed until
> * rcu_read_unlock().
> */
> - VM_BUG_ON(!rcu_read_lock_held());
> + *locked = false;
> if (atomic_read(&memcg->moving_account) <= 0)
> - return;
> + return memcg;
>
> move_lock_mem_cgroup(memcg, flags);
> if (memcg != pc->mem_cgroup || !PageCgroupUsed(pc)) {
> @@ -2159,36 +2159,26 @@ again:
> goto again;
> }
> *locked = true;
> +
> + return memcg;
> }
>
>
> ...
>
> @@ -1061,9 +1062,10 @@ void page_add_file_rmap(struct page *page)
> */
> void page_remove_rmap(struct page *page)
> {
> + struct mem_cgroup *uninitialized_var(memcg);
> bool anon = PageAnon(page);
> - bool locked;
> unsigned long flags;
> + bool locked;
>
> /*
> * The anon case has no mem_cgroup page_stat to update; but may
> @@ -1071,7 +1073,7 @@ void page_remove_rmap(struct page *page)
> * we hold the lock against page_stat move: so avoid it on anon.
> */
> if (!anon)
> - mem_cgroup_begin_update_page_stat(page, &locked, &flags);
> + memcg = mem_cgroup_begin_page_stat(page, &locked, &flags);
>
> /* page still mapped by someone else? */
> if (!atomic_add_negative(-1, &page->_mapcount))
> @@ -1096,8 +1098,7 @@ void page_remove_rmap(struct page *page)
> -hpage_nr_pages(page));
> } else {
> __dec_zone_page_state(page, NR_FILE_MAPPED);
> - mem_cgroup_dec_page_stat(page, MEM_CGROUP_STAT_FILE_MAPPED);
> - mem_cgroup_end_update_page_stat(page, &locked, &flags);
> + mem_cgroup_dec_page_stat(memcg, MEM_CGROUP_STAT_FILE_MAPPED);
> }
> if (unlikely(PageMlocked(page)))
> clear_page_mlock(page);
> @@ -1110,10 +1111,9 @@ void page_remove_rmap(struct page *page)
> * Leaving it set also helps swapoff to reinstate ptes
> * faster for those pages still in swapcache.
> */
> - return;
> out:
> if (!anon)
> - mem_cgroup_end_update_page_stat(page, &locked, &flags);
> + mem_cgroup_end_page_stat(memcg, locked, flags);
> }
The anon and file paths have as much unique code as they do common
code. I wonder if page_remove_rmap() would come out better if split
into two functions? I gave that a quick try and it came out OK-looking.
On Wed 22-10-14 14:29:27, Johannes Weiner wrote:
> A follow-up patch would have changed the call signature. To save the
> trouble, just fold it instead.
>
> Signed-off-by: Johannes Weiner <[email protected]>
> Cc: "3.17" <[email protected]>
It seems that the function was added just for nilfs, but nilfs wasn't using
the symbol at the time the memcg part went in. Funny...
Acked-by: Michal Hocko <[email protected]>
> ---
> include/linux/mm.h | 1 -
> mm/page-writeback.c | 23 ++++-------------------
> 2 files changed, 4 insertions(+), 20 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 27eb1bfbe704..b46461116cd2 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1235,7 +1235,6 @@ int __set_page_dirty_no_writeback(struct page *page);
> int redirty_page_for_writepage(struct writeback_control *wbc,
> struct page *page);
> void account_page_dirtied(struct page *page, struct address_space *mapping);
> -void account_page_writeback(struct page *page);
> int set_page_dirty(struct page *page);
> int set_page_dirty_lock(struct page *page);
> int clear_page_dirty_for_io(struct page *page);
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index ff24c9d83112..ff6a5b07211e 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2116,23 +2116,6 @@ void account_page_dirtied(struct page *page, struct address_space *mapping)
> EXPORT_SYMBOL(account_page_dirtied);
>
> /*
> - * Helper function for set_page_writeback family.
> - *
> - * The caller must hold mem_cgroup_begin/end_update_page_stat() lock
> - * while calling this function.
> - * See test_set_page_writeback for example.
> - *
> - * NOTE: Unlike account_page_dirtied this does not rely on being atomic
> - * wrt interrupts.
> - */
> -void account_page_writeback(struct page *page)
> -{
> - mem_cgroup_inc_page_stat(page, MEM_CGROUP_STAT_WRITEBACK);
> - inc_zone_page_state(page, NR_WRITEBACK);
> -}
> -EXPORT_SYMBOL(account_page_writeback);
> -
> -/*
> * For address_spaces which do not use buffers. Just tag the page as dirty in
> * its radix tree.
> *
> @@ -2410,8 +2393,10 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
> } else {
> ret = TestSetPageWriteback(page);
> }
> - if (!ret)
> - account_page_writeback(page);
> + if (!ret) {
> + mem_cgroup_inc_page_stat(page, MEM_CGROUP_STAT_WRITEBACK);
> + inc_zone_page_state(page, NR_WRITEBACK);
> + }
> mem_cgroup_end_update_page_stat(page, &locked, &memcg_flags);
> return ret;
>
> --
> 2.1.2
>
--
Michal Hocko
SUSE Labs
On Wed 22-10-14 14:29:28, Johannes Weiner wrote:
> 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API") changed page
> migration to uncharge the old page right away. The page is locked,
> unmapped, truncated, and off the LRU, but it could race with writeback
> ending, which then doesn't unaccount the page properly:
>
> test_clear_page_writeback() migration
> acquire pc->mem_cgroup->move_lock
I do not think that mentioning move_lock is important/helpful here,
because the hot path, which is taken all the time (except when there is a
task move in progress), doesn't take it.
Besides that, it is not even relevant for the race.
> wait_on_page_writeback()
> TestClearPageWriteback()
> mem_cgroup_migrate()
> clear PCG_USED
> if (PageCgroupUsed(pc))
> decrease memcg pages under writeback
> release pc->mem_cgroup->move_lock
>
> The per-page statistics interface is heavily optimized to avoid a
> function call and a lookup_page_cgroup() in the file unmap fast path,
> which means it doesn't verify whether a page is still charged before
> clearing PageWriteback() and it has to do it in the stat update later.
>
> Rework it so that it looks up the page's memcg once at the beginning
> of the transaction and then uses it throughout. The charge will be
> verified before clearing PageWriteback() and migration can't uncharge
> the page as long as that is still set. The RCU lock will protect the
> memcg past uncharge.
>
> As far as losing the optimization goes, the following test results are
> from a microbenchmark that maps, faults, and unmaps a 4GB sparse file
> three times in a nested fashion, so that there are two negative passes
> that don't account but still go through the new transaction overhead.
> There is no actual difference:
>
> old: 33.195102545 seconds time elapsed ( +- 0.01% )
> new: 33.199231369 seconds time elapsed ( +- 0.03% )
>
> The time spent in page_remove_rmap()'s callees still adds up to the
> same, but the time spent in the function itself seems reduced:
>
> # Children Self Command Shared Object Symbol
> old: 0.12% 0.11% filemapstress [kernel.kallsyms] [k] page_remove_rmap
> new: 0.12% 0.08% filemapstress [kernel.kallsyms] [k] page_remove_rmap
>
>
> Signed-off-by: Johannes Weiner <[email protected]>
> Cc: "3.17" <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Thanks!
> ---
> include/linux/memcontrol.h | 58 ++++++++++++++--------------------------------
> mm/memcontrol.c | 54 ++++++++++++++++++------------------------
> mm/page-writeback.c | 22 ++++++++++--------
> mm/rmap.c | 20 ++++++++--------
> 4 files changed, 61 insertions(+), 93 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 0daf383f8f1c..ea007615e8f9 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -139,48 +139,23 @@ static inline bool mem_cgroup_disabled(void)
> return false;
> }
>
> -void __mem_cgroup_begin_update_page_stat(struct page *page, bool *locked,
> - unsigned long *flags);
> -
> -extern atomic_t memcg_moving;
> -
> -static inline void mem_cgroup_begin_update_page_stat(struct page *page,
> - bool *locked, unsigned long *flags)
> -{
> - if (mem_cgroup_disabled())
> - return;
> - rcu_read_lock();
> - *locked = false;
> - if (atomic_read(&memcg_moving))
> - __mem_cgroup_begin_update_page_stat(page, locked, flags);
> -}
> -
> -void __mem_cgroup_end_update_page_stat(struct page *page,
> - unsigned long *flags);
> -static inline void mem_cgroup_end_update_page_stat(struct page *page,
> - bool *locked, unsigned long *flags)
> -{
> - if (mem_cgroup_disabled())
> - return;
> - if (*locked)
> - __mem_cgroup_end_update_page_stat(page, flags);
> - rcu_read_unlock();
> -}
> -
> -void mem_cgroup_update_page_stat(struct page *page,
> - enum mem_cgroup_stat_index idx,
> - int val);
> -
> -static inline void mem_cgroup_inc_page_stat(struct page *page,
> +struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page, bool *locked,
> + unsigned long *flags);
> +void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool locked,
> + unsigned long flags);
> +void mem_cgroup_update_page_stat(struct mem_cgroup *memcg,
> + enum mem_cgroup_stat_index idx, int val);
> +
> +static inline void mem_cgroup_inc_page_stat(struct mem_cgroup *memcg,
> enum mem_cgroup_stat_index idx)
> {
> - mem_cgroup_update_page_stat(page, idx, 1);
> + mem_cgroup_update_page_stat(memcg, idx, 1);
> }
>
> -static inline void mem_cgroup_dec_page_stat(struct page *page,
> +static inline void mem_cgroup_dec_page_stat(struct mem_cgroup *memcg,
> enum mem_cgroup_stat_index idx)
> {
> - mem_cgroup_update_page_stat(page, idx, -1);
> + mem_cgroup_update_page_stat(memcg, idx, -1);
> }
>
> unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
> @@ -315,13 +290,14 @@ mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
> {
> }
>
> -static inline void mem_cgroup_begin_update_page_stat(struct page *page,
> +static inline struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page,
> bool *locked, unsigned long *flags)
> {
> + return NULL;
> }
>
> -static inline void mem_cgroup_end_update_page_stat(struct page *page,
> - bool *locked, unsigned long *flags)
> +static inline void mem_cgroup_end_page_stat(struct mem_cgroup *memcg,
> + bool locked, unsigned long flags)
> {
> }
>
> @@ -343,12 +319,12 @@ static inline bool mem_cgroup_oom_synchronize(bool wait)
> return false;
> }
>
> -static inline void mem_cgroup_inc_page_stat(struct page *page,
> +static inline void mem_cgroup_inc_page_stat(struct mem_cgroup *memcg,
> enum mem_cgroup_stat_index idx)
> {
> }
>
> -static inline void mem_cgroup_dec_page_stat(struct page *page,
> +static inline void mem_cgroup_dec_page_stat(struct mem_cgroup *memcg,
> enum mem_cgroup_stat_index idx)
> {
> }
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 3a203c7ec6c7..d84f316ac901 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1463,12 +1463,8 @@ int mem_cgroup_swappiness(struct mem_cgroup *memcg)
> * start move here.
> */
>
> -/* for quick checking without looking up memcg */
> -atomic_t memcg_moving __read_mostly;
> -
> static void mem_cgroup_start_move(struct mem_cgroup *memcg)
> {
> - atomic_inc(&memcg_moving);
> atomic_inc(&memcg->moving_account);
> synchronize_rcu();
> }
> @@ -1479,10 +1475,8 @@ static void mem_cgroup_end_move(struct mem_cgroup *memcg)
> * Now, mem_cgroup_clear_mc() may call this function with NULL.
> * We check NULL in callee rather than caller.
> */
> - if (memcg) {
> - atomic_dec(&memcg_moving);
> + if (memcg)
> atomic_dec(&memcg->moving_account);
> - }
> }
>
> /*
> @@ -2132,26 +2126,32 @@ cleanup:
> * account and taking the move_lock in the slowpath.
> */
>
> -void __mem_cgroup_begin_update_page_stat(struct page *page,
> - bool *locked, unsigned long *flags)
> +struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page,
> + bool *locked,
> + unsigned long *flags)
> {
> struct mem_cgroup *memcg;
> struct page_cgroup *pc;
>
> + rcu_read_lock();
> +
> + if (mem_cgroup_disabled())
> + return NULL;
> +
> pc = lookup_page_cgroup(page);
> again:
> memcg = pc->mem_cgroup;
> if (unlikely(!memcg || !PageCgroupUsed(pc)))
> - return;
> + return NULL;
> /*
> * If this memory cgroup is not under account moving, we don't
> * need to take move_lock_mem_cgroup(). Because we already hold
> * rcu_read_lock(), any calls to move_account will be delayed until
> * rcu_read_unlock().
> */
> - VM_BUG_ON(!rcu_read_lock_held());
> + *locked = false;
> if (atomic_read(&memcg->moving_account) <= 0)
> - return;
> + return memcg;
>
> move_lock_mem_cgroup(memcg, flags);
> if (memcg != pc->mem_cgroup || !PageCgroupUsed(pc)) {
> @@ -2159,36 +2159,26 @@ again:
> goto again;
> }
> *locked = true;
> +
> + return memcg;
> }
>
> -void __mem_cgroup_end_update_page_stat(struct page *page, unsigned long *flags)
> +void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool locked,
> + unsigned long flags)
> {
> - struct page_cgroup *pc = lookup_page_cgroup(page);
> + if (memcg && locked)
> + move_unlock_mem_cgroup(memcg, &flags);
>
> - /*
> - * It's guaranteed that pc->mem_cgroup never changes while
> - * lock is held because a routine modifies pc->mem_cgroup
> - * should take move_lock_mem_cgroup().
> - */
> - move_unlock_mem_cgroup(pc->mem_cgroup, flags);
> + rcu_read_unlock();
> }
>
> -void mem_cgroup_update_page_stat(struct page *page,
> +void mem_cgroup_update_page_stat(struct mem_cgroup *memcg,
> enum mem_cgroup_stat_index idx, int val)
> {
> - struct mem_cgroup *memcg;
> - struct page_cgroup *pc = lookup_page_cgroup(page);
> - unsigned long uninitialized_var(flags);
> -
> - if (mem_cgroup_disabled())
> - return;
> -
> VM_BUG_ON(!rcu_read_lock_held());
> - memcg = pc->mem_cgroup;
> - if (unlikely(!memcg || !PageCgroupUsed(pc)))
> - return;
>
> - this_cpu_add(memcg->stat->count[idx], val);
> + if (memcg)
> + this_cpu_add(memcg->stat->count[idx], val);
> }
>
> /*
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index ff6a5b07211e..19ceae87522d 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2327,11 +2327,12 @@ EXPORT_SYMBOL(clear_page_dirty_for_io);
> int test_clear_page_writeback(struct page *page)
> {
> struct address_space *mapping = page_mapping(page);
> - int ret;
> - bool locked;
> unsigned long memcg_flags;
> + struct mem_cgroup *memcg;
> + bool locked;
> + int ret;
>
> - mem_cgroup_begin_update_page_stat(page, &locked, &memcg_flags);
> + memcg = mem_cgroup_begin_page_stat(page, &locked, &memcg_flags);
> if (mapping) {
> struct backing_dev_info *bdi = mapping->backing_dev_info;
> unsigned long flags;
> @@ -2352,22 +2353,23 @@ int test_clear_page_writeback(struct page *page)
> ret = TestClearPageWriteback(page);
> }
> if (ret) {
> - mem_cgroup_dec_page_stat(page, MEM_CGROUP_STAT_WRITEBACK);
> + mem_cgroup_dec_page_stat(memcg, MEM_CGROUP_STAT_WRITEBACK);
> dec_zone_page_state(page, NR_WRITEBACK);
> inc_zone_page_state(page, NR_WRITTEN);
> }
> - mem_cgroup_end_update_page_stat(page, &locked, &memcg_flags);
> + mem_cgroup_end_page_stat(memcg, locked, memcg_flags);
> return ret;
> }
>
> int __test_set_page_writeback(struct page *page, bool keep_write)
> {
> struct address_space *mapping = page_mapping(page);
> - int ret;
> - bool locked;
> unsigned long memcg_flags;
> + struct mem_cgroup *memcg;
> + bool locked;
> + int ret;
>
> - mem_cgroup_begin_update_page_stat(page, &locked, &memcg_flags);
> + memcg = mem_cgroup_begin_page_stat(page, &locked, &memcg_flags);
> if (mapping) {
> struct backing_dev_info *bdi = mapping->backing_dev_info;
> unsigned long flags;
> @@ -2394,10 +2396,10 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
> ret = TestSetPageWriteback(page);
> }
> if (!ret) {
> - mem_cgroup_inc_page_stat(page, MEM_CGROUP_STAT_WRITEBACK);
> + mem_cgroup_inc_page_stat(memcg, MEM_CGROUP_STAT_WRITEBACK);
> inc_zone_page_state(page, NR_WRITEBACK);
> }
> - mem_cgroup_end_update_page_stat(page, &locked, &memcg_flags);
> + mem_cgroup_end_page_stat(memcg, locked, memcg_flags);
> return ret;
>
> }
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 116a5053415b..f574046f77d4 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1042,15 +1042,16 @@ void page_add_new_anon_rmap(struct page *page,
> */
> void page_add_file_rmap(struct page *page)
> {
> - bool locked;
> + struct mem_cgroup *memcg;
> unsigned long flags;
> + bool locked;
>
> - mem_cgroup_begin_update_page_stat(page, &locked, &flags);
> + memcg = mem_cgroup_begin_page_stat(page, &locked, &flags);
> if (atomic_inc_and_test(&page->_mapcount)) {
> __inc_zone_page_state(page, NR_FILE_MAPPED);
> - mem_cgroup_inc_page_stat(page, MEM_CGROUP_STAT_FILE_MAPPED);
> + mem_cgroup_inc_page_stat(memcg, MEM_CGROUP_STAT_FILE_MAPPED);
> }
> - mem_cgroup_end_update_page_stat(page, &locked, &flags);
> + mem_cgroup_end_page_stat(memcg, locked, flags);
> }
>
> /**
> @@ -1061,9 +1062,10 @@ void page_add_file_rmap(struct page *page)
> */
> void page_remove_rmap(struct page *page)
> {
> + struct mem_cgroup *uninitialized_var(memcg);
> bool anon = PageAnon(page);
> - bool locked;
> unsigned long flags;
> + bool locked;
>
> /*
> * The anon case has no mem_cgroup page_stat to update; but may
> @@ -1071,7 +1073,7 @@ void page_remove_rmap(struct page *page)
> * we hold the lock against page_stat move: so avoid it on anon.
> */
> if (!anon)
> - mem_cgroup_begin_update_page_stat(page, &locked, &flags);
> + memcg = mem_cgroup_begin_page_stat(page, &locked, &flags);
>
> /* page still mapped by someone else? */
> if (!atomic_add_negative(-1, &page->_mapcount))
> @@ -1096,8 +1098,7 @@ void page_remove_rmap(struct page *page)
> -hpage_nr_pages(page));
> } else {
> __dec_zone_page_state(page, NR_FILE_MAPPED);
> - mem_cgroup_dec_page_stat(page, MEM_CGROUP_STAT_FILE_MAPPED);
> - mem_cgroup_end_update_page_stat(page, &locked, &flags);
> + mem_cgroup_dec_page_stat(memcg, MEM_CGROUP_STAT_FILE_MAPPED);
> }
> if (unlikely(PageMlocked(page)))
> clear_page_mlock(page);
> @@ -1110,10 +1111,9 @@ void page_remove_rmap(struct page *page)
> * Leaving it set also helps swapoff to reinstate ptes
> * faster for those pages still in swapcache.
> */
> - return;
> out:
> if (!anon)
> - mem_cgroup_end_update_page_stat(page, &locked, &flags);
> + mem_cgroup_end_page_stat(memcg, locked, flags);
> }
>
> /*
> --
> 2.1.2
>
--
Michal Hocko
SUSE Labs
On Wed, Oct 22, 2014 at 01:39:36PM -0700, Andrew Morton wrote:
> On Wed, 22 Oct 2014 14:29:28 -0400 Johannes Weiner <[email protected]> wrote:
>
> > 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API") changed page
> > migration to uncharge the old page right away. The page is locked,
> > unmapped, truncated, and off the LRU, but it could race with writeback
> > ending, which then doesn't unaccount the page properly:
> >
> > test_clear_page_writeback() migration
> > acquire pc->mem_cgroup->move_lock
> > wait_on_page_writeback()
> > TestClearPageWriteback()
> > mem_cgroup_migrate()
> > clear PCG_USED
> > if (PageCgroupUsed(pc))
> > decrease memcg pages under writeback
> > release pc->mem_cgroup->move_lock
> >
> > The per-page statistics interface is heavily optimized to avoid a
> > function call and a lookup_page_cgroup() in the file unmap fast path,
> > which means it doesn't verify whether a page is still charged before
> > clearing PageWriteback() and it has to do it in the stat update later.
> >
> > Rework it so that it looks up the page's memcg once at the beginning
> > of the transaction and then uses it throughout. The charge will be
> > verified before clearing PageWriteback() and migration can't uncharge
> > the page as long as that is still set. The RCU lock will protect the
> > memcg past uncharge.
> >
> > As far as losing the optimization goes, the following test results are
> > from a microbenchmark that maps, faults, and unmaps a 4GB sparse file
> > three times in a nested fashion, so that there are two negative passes
> > that don't account but still go through the new transaction overhead.
> > There is no actual difference:
> >
> > old: 33.195102545 seconds time elapsed ( +- 0.01% )
> > new: 33.199231369 seconds time elapsed ( +- 0.03% )
> >
> > The time spent in page_remove_rmap()'s callees still adds up to the
> > same, but the time spent in the function itself seems reduced:
> >
> > # Children Self Command Shared Object Symbol
> > old: 0.12% 0.11% filemapstress [kernel.kallsyms] [k] page_remove_rmap
> > new: 0.12% 0.08% filemapstress [kernel.kallsyms] [k] page_remove_rmap
> >
> > ...
> >
> > @@ -2132,26 +2126,32 @@ cleanup:
> > * account and taking the move_lock in the slowpath.
> > */
> >
> > -void __mem_cgroup_begin_update_page_stat(struct page *page,
> > - bool *locked, unsigned long *flags)
> > +struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page,
> > + bool *locked,
> > + unsigned long *flags)
>
> It would be useful to document the args here (especially `locked').
> Also the new rcu_read_locking protocol is worth a mention: that it
> exists, what it does, why it persists as long as it does.
Okay, I added full kernel docs that explain the RCU fast path, the
memcg->move_lock slow path, and the lifetime guarantee of RCU in cases
where the page state that is about to change is the only thing pinning
the charge, like in end-writeback.
---
>From 1808b8e2114a7d3cc6a0a52be2fe568ff6e1457e Mon Sep 17 00:00:00 2001
From: Johannes Weiner <[email protected]>
Date: Thu, 23 Oct 2014 09:12:01 -0400
Subject: [patch] mm: memcontrol: fix missed end-writeback page accounting fix
Add kernel-doc to page state accounting functions.
Signed-off-by: Johannes Weiner <[email protected]>
---
mm/memcontrol.c | 51 +++++++++++++++++++++++++++++++++++----------------
1 file changed, 35 insertions(+), 16 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 024177df7aae..ae9b630e928b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2109,21 +2109,31 @@ cleanup:
return true;
}
-/*
- * Used to update mapped file or writeback or other statistics.
+/**
+ * mem_cgroup_begin_page_stat - begin a page state statistics transaction
+ * @page: page that is going to change accounted state
+ * @locked: &memcg->move_lock slowpath was taken
+ * @flags: IRQ-state flags for &memcg->move_lock
*
- * Notes: Race condition
+ * This function must mark the beginning of an accounted page state
+ * change to prevent double accounting when the page is concurrently
+ * being moved to another memcg:
*
- * Charging occurs during page instantiation, while the page is
- * unmapped and locked in page migration, or while the page table is
- * locked in THP migration. No race is possible.
+ * memcg = mem_cgroup_begin_page_stat(page, &locked, &flags);
+ * if (TestClearPageState(page))
+ * mem_cgroup_update_page_stat(memcg, state, -1);
+ * mem_cgroup_end_page_stat(memcg, locked, flags);
*
- * Uncharge happens to pages with zero references, no race possible.
+ * The RCU lock is held throughout the transaction. The fast path can
+ * get away without acquiring the memcg->move_lock (@locked is false)
+ * because page moving starts with an RCU grace period.
*
- * Charge moving between groups is protected by checking mm->moving
- * account and taking the move_lock in the slowpath.
+ * The RCU lock also protects the memcg from being freed when the page
+ * state that is going to change is the only thing preventing the page
+ * from being uncharged. E.g. end-writeback clearing PageWriteback(),
+ * which allows migration to go ahead and uncharge the page before the
+ * account transaction might be complete.
*/
-
struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page,
bool *locked,
unsigned long *flags)
@@ -2141,12 +2151,7 @@ again:
memcg = pc->mem_cgroup;
if (unlikely(!memcg))
return NULL;
- /*
- * If this memory cgroup is not under account moving, we don't
- * need to take move_lock_mem_cgroup(). Because we already hold
- * rcu_read_lock(), any calls to move_account will be delayed until
- * rcu_read_unlock().
- */
+
*locked = false;
if (atomic_read(&memcg->moving_account) <= 0)
return memcg;
@@ -2161,6 +2166,12 @@ again:
return memcg;
}
+/**
+ * mem_cgroup_end_page_stat - finish a page state statistics transaction
+ * @memcg: the memcg that was accounted against
+ * @locked: value received from mem_cgroup_begin_page_stat()
+ * @flags: value received from mem_cgroup_begin_page_stat()
+ */
void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool locked,
unsigned long flags)
{
@@ -2170,6 +2181,14 @@ void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool locked,
rcu_read_unlock();
}
+/**
+ * mem_cgroup_update_page_stat - update page state statistics
+ * @memcg: memcg to account against
+ * @idx: page state item to account
+ * @val: number of pages (positive or negative)
+ *
+ * See mem_cgroup_begin_page_stat() for locking requirements.
+ */
void mem_cgroup_update_page_stat(struct mem_cgroup *memcg,
enum mem_cgroup_stat_index idx, int val)
{
--
2.1.2
On Wed, Oct 22, 2014 at 01:39:36PM -0700, Andrew Morton wrote:
> On Wed, 22 Oct 2014 14:29:28 -0400 Johannes Weiner <[email protected]> wrote:
> > @@ -1061,9 +1062,10 @@ void page_add_file_rmap(struct page *page)
> > */
> > void page_remove_rmap(struct page *page)
> > {
> > + struct mem_cgroup *uninitialized_var(memcg);
> > bool anon = PageAnon(page);
> > - bool locked;
> > unsigned long flags;
> > + bool locked;
> >
> > /*
> > * The anon case has no mem_cgroup page_stat to update; but may
> > @@ -1071,7 +1073,7 @@ void page_remove_rmap(struct page *page)
> > * we hold the lock against page_stat move: so avoid it on anon.
> > */
> > if (!anon)
> > - mem_cgroup_begin_update_page_stat(page, &locked, &flags);
> > + memcg = mem_cgroup_begin_page_stat(page, &locked, &flags);
> >
> > /* page still mapped by someone else? */
> > if (!atomic_add_negative(-1, &page->_mapcount))
> > @@ -1096,8 +1098,7 @@ void page_remove_rmap(struct page *page)
> > -hpage_nr_pages(page));
> > } else {
> > __dec_zone_page_state(page, NR_FILE_MAPPED);
> > - mem_cgroup_dec_page_stat(page, MEM_CGROUP_STAT_FILE_MAPPED);
> > - mem_cgroup_end_update_page_stat(page, &locked, &flags);
> > + mem_cgroup_dec_page_stat(memcg, MEM_CGROUP_STAT_FILE_MAPPED);
> > }
> > if (unlikely(PageMlocked(page)))
> > clear_page_mlock(page);
> > @@ -1110,10 +1111,9 @@ void page_remove_rmap(struct page *page)
> > * Leaving it set also helps swapoff to reinstate ptes
> > * faster for those pages still in swapcache.
> > */
> > - return;
> > out:
> > if (!anon)
> > - mem_cgroup_end_update_page_stat(page, &locked, &flags);
> > + mem_cgroup_end_page_stat(memcg, locked, flags);
> > }
>
> The anon and file paths have as much unique code as they do common
> code. I wonder if page_remove_rmap() would come out better if split
> into two functions? I gave that a quick try and it came out OK-looking.
I agree, that looks better. How about this?
---
>From b518d88254b01be8c6c0c4a496d9f311f0c71b4a Mon Sep 17 00:00:00 2001
From: Johannes Weiner <[email protected]>
Date: Thu, 23 Oct 2014 09:29:06 -0400
Subject: [patch] mm: rmap: split out page_remove_file_rmap()
page_remove_rmap() has too many branches on PageAnon() and is hard to
follow. Move the file part into a separate function.
Signed-off-by: Johannes Weiner <[email protected]>
---
mm/rmap.c | 78 +++++++++++++++++++++++++++++++++++++--------------------------
1 file changed, 46 insertions(+), 32 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index f574046f77d4..19886fb2f13a 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1054,6 +1054,36 @@ void page_add_file_rmap(struct page *page)
mem_cgroup_end_page_stat(memcg, locked, flags);
}
+static void page_remove_file_rmap(struct page *page)
+{
+ struct mem_cgroup *memcg;
+ unsigned long flags;
+ bool locked;
+
+ memcg = mem_cgroup_begin_page_stat(page, &locked, &flags);
+
+ /* page still mapped by someone else? */
+ if (!atomic_add_negative(-1, &page->_mapcount))
+ goto out;
+
+ /* Hugepages are not counted in NR_FILE_MAPPED for now. */
+ if (unlikely(PageHuge(page)))
+ goto out;
+
+ /*
+ * We use the irq-unsafe __{inc|mod}_zone_page_stat because
+ * these counters are not modified in interrupt context, and
+ * pte lock(a spinlock) is held, which implies preemption disabled.
+ */
+ __dec_zone_page_state(page, NR_FILE_MAPPED);
+ mem_cgroup_dec_page_stat(memcg, MEM_CGROUP_STAT_FILE_MAPPED);
+
+ if (unlikely(PageMlocked(page)))
+ clear_page_mlock(page);
+out:
+ mem_cgroup_end_page_stat(memcg, locked, flags);
+}
+
/**
* page_remove_rmap - take down pte mapping from a page
* @page: page to remove mapping from
@@ -1062,46 +1092,33 @@ void page_add_file_rmap(struct page *page)
*/
void page_remove_rmap(struct page *page)
{
- struct mem_cgroup *uninitialized_var(memcg);
- bool anon = PageAnon(page);
- unsigned long flags;
- bool locked;
-
- /*
- * The anon case has no mem_cgroup page_stat to update; but may
- * uncharge_page() below, where the lock ordering can deadlock if
- * we hold the lock against page_stat move: so avoid it on anon.
- */
- if (!anon)
- memcg = mem_cgroup_begin_page_stat(page, &locked, &flags);
+ if (!PageAnon(page)) {
+ page_remove_file_rmap(page);
+ return;
+ }
/* page still mapped by someone else? */
if (!atomic_add_negative(-1, &page->_mapcount))
- goto out;
+ return;
+
+ /* Hugepages are not counted in NR_ANON_PAGES for now. */
+ if (unlikely(PageHuge(page)))
+ return;
/*
- * Hugepages are not counted in NR_ANON_PAGES nor NR_FILE_MAPPED
- * and not charged by memcg for now.
- *
* We use the irq-unsafe __{inc|mod}_zone_page_stat because
* these counters are not modified in interrupt context, and
- * these counters are not modified in interrupt context, and
* pte lock(a spinlock) is held, which implies preemption disabled.
*/
- if (unlikely(PageHuge(page)))
- goto out;
- if (anon) {
- if (PageTransHuge(page))
- __dec_zone_page_state(page,
- NR_ANON_TRANSPARENT_HUGEPAGES);
- __mod_zone_page_state(page_zone(page), NR_ANON_PAGES,
- -hpage_nr_pages(page));
- } else {
- __dec_zone_page_state(page, NR_FILE_MAPPED);
- mem_cgroup_dec_page_stat(memcg, MEM_CGROUP_STAT_FILE_MAPPED);
- }
+ if (PageTransHuge(page))
+ __dec_zone_page_state(page, NR_ANON_TRANSPARENT_HUGEPAGES);
+
+ __mod_zone_page_state(page_zone(page), NR_ANON_PAGES,
+ -hpage_nr_pages(page));
+
if (unlikely(PageMlocked(page)))
clear_page_mlock(page);
+
/*
* It would be tidy to reset the PageAnon mapping here,
* but that might overwrite a racing page_add_anon_rmap
@@ -1111,9 +1128,6 @@ void page_remove_rmap(struct page *page)
* Leaving it set also helps swapoff to reinstate ptes
* faster for those pages still in swapcache.
*/
-out:
- if (!anon)
- mem_cgroup_end_page_stat(memcg, locked, flags);
}
/*
--
2.1.2
On Thu, Oct 23, 2014 at 03:03:31PM +0200, Michal Hocko wrote:
> On Wed 22-10-14 14:29:28, Johannes Weiner wrote:
> > 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API") changed page
> > migration to uncharge the old page right away. The page is locked,
> > unmapped, truncated, and off the LRU, but it could race with writeback
> > ending, which then doesn't unaccount the page properly:
> >
> > test_clear_page_writeback() migration
> > acquire pc->mem_cgroup->move_lock
>
> I do not think that mentioning move_lock is important/helpful here
> because the hot path which is taken all the time (except when there is a
> task move in progress) doesn't take it.
> Besides that it is not even relevant for the race.
You're right. It's not worth mentioning the transaction setup/finish
at all, because migration does not participate in that protocol. How
about this? Andrew, could you please copy-paste this into the patch?
test_clear_page_writeback() migration
wait_on_page_writeback()
TestClearPageWriteback()
mem_cgroup_migrate()
clear PCG_USED
mem_cgroup_update_page_stat()
if (PageCgroupUsed(pc))
decrease memcg pages under writeback
On Thu 23-10-14 10:14:43, Johannes Weiner wrote:
> On Thu, Oct 23, 2014 at 03:03:31PM +0200, Michal Hocko wrote:
> > On Wed 22-10-14 14:29:28, Johannes Weiner wrote:
> > > 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API") changed page
> > > migration to uncharge the old page right away. The page is locked,
> > > unmapped, truncated, and off the LRU, but it could race with writeback
> > > ending, which then doesn't unaccount the page properly:
> > >
> > > test_clear_page_writeback() migration
> > > acquire pc->mem_cgroup->move_lock
> >
> > I do not think that mentioning move_lock is important/helpful here
> > because the hot path which is taken all the time (except when there is a
> > task move in progress) doesn't take it.
> > Besides that it is not even relevant for the race.
>
> You're right. It's not worth mentioning the transaction setup/finish
> at all, because migration does not participate in that protocol. How
> about this? Andrew, could you please copy-paste this into the patch?
>
> test_clear_page_writeback() migration
> wait_on_page_writeback()
> TestClearPageWriteback()
> mem_cgroup_migrate()
> clear PCG_USED
> mem_cgroup_update_page_stat()
> if (PageCgroupUsed(pc))
> decrease memcg pages under writeback
Yes, much better! Thanks!
--
Michal Hocko
SUSE Labs
On Thu 23-10-14 09:54:12, Johannes Weiner wrote:
[...]
> From 1808b8e2114a7d3cc6a0a52be2fe568ff6e1457e Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <[email protected]>
> Date: Thu, 23 Oct 2014 09:12:01 -0400
> Subject: [patch] mm: memcontrol: fix missed end-writeback page accounting fix
>
> Add kernel-doc to page state accounting functions.
>
> Signed-off-by: Johannes Weiner <[email protected]>
Nice!
Acked-by: Michal Hocko <[email protected]>
> ---
> mm/memcontrol.c | 51 +++++++++++++++++++++++++++++++++++----------------
> 1 file changed, 35 insertions(+), 16 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 024177df7aae..ae9b630e928b 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2109,21 +2109,31 @@ cleanup:
> return true;
> }
>
> -/*
> - * Used to update mapped file or writeback or other statistics.
> +/**
> + * mem_cgroup_begin_page_stat - begin a page state statistics transaction
> + * @page: page that is going to change accounted state
> + * @locked: &memcg->move_lock slowpath was taken
> + * @flags: IRQ-state flags for &memcg->move_lock
> *
> - * Notes: Race condition
> + * This function must mark the beginning of an accounted page state
> + * change to prevent double accounting when the page is concurrently
> + * being moved to another memcg:
> *
> - * Charging occurs during page instantiation, while the page is
> - * unmapped and locked in page migration, or while the page table is
> - * locked in THP migration. No race is possible.
> + * memcg = mem_cgroup_begin_page_stat(page, &locked, &flags);
> + * if (TestClearPageState(page))
> + * mem_cgroup_update_page_stat(memcg, state, -1);
> + * mem_cgroup_end_page_stat(memcg, locked, flags);
> *
> - * Uncharge happens to pages with zero references, no race possible.
> + * The RCU lock is held throughout the transaction. The fast path can
> + * get away without acquiring the memcg->move_lock (@locked is false)
> + * because page moving starts with an RCU grace period.
> *
> - * Charge moving between groups is protected by checking mm->moving
> - * account and taking the move_lock in the slowpath.
> + * The RCU lock also protects the memcg from being freed when the page
> + * state that is going to change is the only thing preventing the page
> + * from being uncharged. E.g. end-writeback clearing PageWriteback(),
> + * which allows migration to go ahead and uncharge the page before the
> + * account transaction might be complete.
> */
> -
> struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page,
> bool *locked,
> unsigned long *flags)
> @@ -2141,12 +2151,7 @@ again:
> memcg = pc->mem_cgroup;
> if (unlikely(!memcg))
> return NULL;
> - /*
> - * If this memory cgroup is not under account moving, we don't
> - * need to take move_lock_mem_cgroup(). Because we already hold
> - * rcu_read_lock(), any calls to move_account will be delayed until
> - * rcu_read_unlock().
> - */
> +
> *locked = false;
> if (atomic_read(&memcg->moving_account) <= 0)
> return memcg;
> @@ -2161,6 +2166,12 @@ again:
> return memcg;
> }
>
> +/**
> + * mem_cgroup_end_page_stat - finish a page state statistics transaction
> + * @memcg: the memcg that was accounted against
> + * @locked: value received from mem_cgroup_begin_page_stat()
> + * @flags: value received from mem_cgroup_begin_page_stat()
> + */
> void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool locked,
> unsigned long flags)
> {
> @@ -2170,6 +2181,14 @@ void mem_cgroup_end_page_stat(struct mem_cgroup *memcg, bool locked,
> rcu_read_unlock();
> }
>
> +/**
> + * mem_cgroup_update_page_stat - update page state statistics
> + * @memcg: memcg to account against
> + * @idx: page state item to account
> + * @val: number of pages (positive or negative)
> + *
> + * See mem_cgroup_begin_page_stat() for locking requirements.
> + */
> void mem_cgroup_update_page_stat(struct mem_cgroup *memcg,
> enum mem_cgroup_stat_index idx, int val)
> {
> --
> 2.1.2
>
--
Michal Hocko
SUSE Labs
On Thu 23-10-14 09:57:29, Johannes Weiner wrote:
[...]
> From b518d88254b01be8c6c0c4a496d9f311f0c71b4a Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <[email protected]>
> Date: Thu, 23 Oct 2014 09:29:06 -0400
> Subject: [patch] mm: rmap: split out page_remove_file_rmap()
>
> page_remove_rmap() has too many branches on PageAnon() and is hard to
> follow. Move the file part into a separate function.
>
> Signed-off-by: Johannes Weiner <[email protected]>
Reviewed-by: Michal Hocko <[email protected]>
> ---
> mm/rmap.c | 78 +++++++++++++++++++++++++++++++++++++--------------------------
> 1 file changed, 46 insertions(+), 32 deletions(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index f574046f77d4..19886fb2f13a 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1054,6 +1054,36 @@ void page_add_file_rmap(struct page *page)
> mem_cgroup_end_page_stat(memcg, locked, flags);
> }
>
> +static void page_remove_file_rmap(struct page *page)
> +{
> + struct mem_cgroup *memcg;
> + unsigned long flags;
> + bool locked;
> +
> + memcg = mem_cgroup_begin_page_stat(page, &locked, &flags);
> +
> + /* page still mapped by someone else? */
> + if (!atomic_add_negative(-1, &page->_mapcount))
> + goto out;
> +
> + /* Hugepages are not counted in NR_FILE_MAPPED for now. */
> + if (unlikely(PageHuge(page)))
> + goto out;
> +
> + /*
> + * We use the irq-unsafe __{inc|mod}_zone_page_stat because
> + * these counters are not modified in interrupt context, and
> + * pte lock(a spinlock) is held, which implies preemption disabled.
> + */
> + __dec_zone_page_state(page, NR_FILE_MAPPED);
> + mem_cgroup_dec_page_stat(memcg, MEM_CGROUP_STAT_FILE_MAPPED);
> +
> + if (unlikely(PageMlocked(page)))
> + clear_page_mlock(page);
> +out:
> + mem_cgroup_end_page_stat(memcg, locked, flags);
> +}
> +
> /**
> * page_remove_rmap - take down pte mapping from a page
> * @page: page to remove mapping from
> @@ -1062,46 +1092,33 @@ void page_add_file_rmap(struct page *page)
> */
> void page_remove_rmap(struct page *page)
> {
> - struct mem_cgroup *uninitialized_var(memcg);
> - bool anon = PageAnon(page);
> - unsigned long flags;
> - bool locked;
> -
> - /*
> - * The anon case has no mem_cgroup page_stat to update; but may
> - * uncharge_page() below, where the lock ordering can deadlock if
> - * we hold the lock against page_stat move: so avoid it on anon.
> - */
> - if (!anon)
> - memcg = mem_cgroup_begin_page_stat(page, &locked, &flags);
> + if (!PageAnon(page)) {
> + page_remove_file_rmap(page);
> + return;
> + }
>
> /* page still mapped by someone else? */
> if (!atomic_add_negative(-1, &page->_mapcount))
> - goto out;
> + return;
> +
> + /* Hugepages are not counted in NR_ANON_PAGES for now. */
> + if (unlikely(PageHuge(page)))
> + return;
>
> /*
> - * Hugepages are not counted in NR_ANON_PAGES nor NR_FILE_MAPPED
> - * and not charged by memcg for now.
> - *
> * We use the irq-unsafe __{inc|mod}_zone_page_stat because
> * these counters are not modified in interrupt context, and
> - * these counters are not modified in interrupt context, and
> * pte lock(a spinlock) is held, which implies preemption disabled.
> */
> - if (unlikely(PageHuge(page)))
> - goto out;
> - if (anon) {
> - if (PageTransHuge(page))
> - __dec_zone_page_state(page,
> - NR_ANON_TRANSPARENT_HUGEPAGES);
> - __mod_zone_page_state(page_zone(page), NR_ANON_PAGES,
> - -hpage_nr_pages(page));
> - } else {
> - __dec_zone_page_state(page, NR_FILE_MAPPED);
> - mem_cgroup_dec_page_stat(memcg, MEM_CGROUP_STAT_FILE_MAPPED);
> - }
> + if (PageTransHuge(page))
> + __dec_zone_page_state(page, NR_ANON_TRANSPARENT_HUGEPAGES);
> +
> + __mod_zone_page_state(page_zone(page), NR_ANON_PAGES,
> + -hpage_nr_pages(page));
> +
> if (unlikely(PageMlocked(page)))
> clear_page_mlock(page);
> +
> /*
> * It would be tidy to reset the PageAnon mapping here,
> * but that might overwrite a racing page_add_anon_rmap
> @@ -1111,9 +1128,6 @@ void page_remove_rmap(struct page *page)
> * Leaving it set also helps swapoff to reinstate ptes
> * faster for those pages still in swapcache.
> */
> -out:
> - if (!anon)
> - mem_cgroup_end_page_stat(memcg, locked, flags);
> }
>
> /*
> --
> 2.1.2
>
--
Michal Hocko
SUSE Labs