7512102cf64d ("memcg: fix GPF when cgroup removal races with last
exit") added a pc->mem_cgroup reset into mem_cgroup_page_lruvec() to
prevent a crash where an anon page gets uncharged on unmap, the memcg
is released, and then the final LRU isolation on free dereferences the
stale pc->mem_cgroup pointer.

But since 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API"), pages
are only uncharged AFTER that final LRU isolation, which guarantees
the memcg's lifetime until then. pc->mem_cgroup now only needs to be
reset for swapcache readahead pages.

Update the comment and callsite requirements accordingly.

Signed-off-by: Johannes Weiner <[email protected]>
---
mm/memcontrol.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3a203c7ec6c7..fc1d7ca96b9d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1262,9 +1262,13 @@ struct lruvec *mem_cgroup_zone_lruvec(struct zone *zone,
}

/**
- * mem_cgroup_page_lruvec - return lruvec for adding an lru page
+ * mem_cgroup_page_lruvec - return lruvec for isolating/putting an LRU page
* @page: the page
* @zone: zone of the page
+ *
+ * This function is only safe when following the LRU page isolation
+ * and putback protocol: the LRU lock must be held, and the page must
+ * either be PageLRU() or the caller must have isolated/allocated it.
*/
struct lruvec *mem_cgroup_page_lruvec(struct page *page, struct zone *zone)
{
@@ -1282,13 +1286,9 @@ struct lruvec *mem_cgroup_page_lruvec(struct page *page, struct zone *zone)
memcg = pc->mem_cgroup;

/*
- * Surreptitiously switch any uncharged offlist page to root:
- * an uncharged page off lru does nothing to secure
- * its former mem_cgroup from sudden removal.
- *
- * Our caller holds lru_lock, and PageCgroupUsed is updated
- * under page_cgroup lock: between them, they make all uses
- * of pc->mem_cgroup safe.
+ * Swapcache readahead pages are added to the LRU - and
+ * possibly migrated - before they are charged. Ensure
+ * pc->mem_cgroup is sane.
*/
if (!PageLRU(page) && !PageCgroupUsed(pc) && memcg != root_mem_cgroup)
pc->mem_cgroup = memcg = root_mem_cgroup;
--
2.1.2
On Sun 19-10-14 11:30:16, Johannes Weiner wrote:
> [...]
>
> But since 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API"), pages
> are only uncharged AFTER that final LRU isolation, which guarantees
> the memcg's lifetime until then. pc->mem_cgroup now only needs to be
> reset for swapcache readahead pages.
Do we want a VM_BUG_ON_PAGE(!PageSwapCache(page), page) in the fixup path?
> Update the comment and callsite requirements accordingly.
>
> Signed-off-by: Johannes Weiner <[email protected]>
Acked-by: Michal Hocko <[email protected]>
--
Michal Hocko
SUSE Labs
On Mon, Oct 20, 2014 at 09:12:56PM +0200, Michal Hocko wrote:
> On Sun 19-10-14 11:30:16, Johannes Weiner wrote:
> > [...]
> >
> > But since 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API"), pages
> > are only uncharged AFTER that final LRU isolation, which guarantees
> > the memcg's lifetime until then. pc->mem_cgroup now only needs to be
> > reset for swapcache readahead pages.
>
> Do we want a VM_BUG_ON_PAGE(!PageSwapCache(page), page) in the fixup path?
While that is what we expect as of right now, it's not really a
requirement of this function.  Should somebody later add other page
types, they might trigger this assertion and scratch their heads
wondering whether they're missing some non-obvious dependency.
> > Update the comment and callsite requirements accordingly.
> >
> > Signed-off-by: Johannes Weiner <[email protected]>
>
> Acked-by: Michal Hocko <[email protected]>
Thanks!