LinuxLists.cc - Re: [patch 108/192] mm: zram: amend SLAB_RECLAIM_ACCOUNT on zspage

2021-07-02 02:50:04

Subject: Re: [patch 108/192] mm: zram: amend SLAB_RECLAIM_ACCOUNT on zspage_cachep

On Thu, Jul 1, 2021 at 10:56 PM Minchan Kim <[email protected]> wrote:
>
> On Wed, Jun 30, 2021 at 06:52:58PM -0700, Andrew Morton wrote:
> > From: Zhaoyang Huang <[email protected]>
> > Subject: mm: zram: amend SLAB_RECLAIM_ACCOUNT on zspage_cachep
> >
> > Zspage_cachep is found be merged with other kmem cache during test, which
> > is not good for debug things (zs_pool->zspage_cachep present to be another
> > kmem cache in memory dumpfile). It is also neccessary to do so as
> > shrinker has been registered for zspage.
> >
> > Amending this flag can help kernel to calculate SLAB_RECLAIMBLE correctly.
> >
> > Link: https://lkml.kernel.org/r/[email protected]
> > Signed-off-by: Zhaoyang Huang <[email protected]>
> > Cc: Minchan Kim <[email protected]>
> > Cc: Sergey Senozhatsky <[email protected]>
> > Signed-off-by: Andrew Morton <[email protected]>
>
> Sorry for the late. I don't think this is correct.
>
> It's true "struct zspage" can be freed by zsmalloc's compaction registerred
> by slab shrinker so tempted to make it SLAB_RECLAIM_ACCOUNT. However, it's
> quite limited to work only when objects in the zspage are heavily fragmented.
> Once the compaction is done, zspage are never discardable until objects are
> fragmented again. It means it could hurt other reclaimable slab page reclaiming
> since the zspage slab object pins the page.
IMHO, kmem cache's reclaiming is NOT affected by SLAB_RECLAIM_ACCOUNT
. This flag just affects kmem cache merge[1], the slab page's migrate
type[2] and the page's statistics. Actually, zspage's cache DO merged
with others even without SLAB_RECLAIM_ACCOUNT currently, which maybe
cause zspage's object will NEVER be discarded.(SLAB_MERGE_SAME
introduce confusions as people believe the cache will merge with
others when it set and vice versa)

[1]
struct kmem_cache *find_mergeable(size_t size, size_t align, unsigned
long flags, const char *name, void (*ctor)(void *))
...
if ((flags & SLAB_MERGE_SAME) != (s->flags & SLAB_MERGE_SAME))
continue;

[2]
if (s->flags & SLAB_RECLAIM_ACCOUNT)
s->allocflags |= __GFP_RECLAIMABLE;

>
> > ---
> >
> > mm/zsmalloc.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > --- a/mm/zsmalloc.c~mm-zram-amend-slab_reclaim_account-on-zspage_cachep
> > +++ a/mm/zsmalloc.c
> > @@ -328,7 +328,7 @@ static int create_cache(struct zs_pool *
> > return 1;
> >
> > pool->zspage_cachep = kmem_cache_create("zspage", sizeof(struct zspage),
> > - 0, 0, NULL);
> > + 0, SLAB_RECLAIM_ACCOUNT, NULL);
> > if (!pool->zspage_cachep) {
> > kmem_cache_destroy(pool->handle_cachep);
> > pool->handle_cachep = NULL;
> > _

2021-07-02 06:02:46

by Minchan Kim

[permalink] [raw]

Subject: Re: [patch 108/192] mm: zram: amend SLAB_RECLAIM_ACCOUNT on zspage_cachep

On Fri, Jul 02, 2021 at 10:45:09AM +0800, Zhaoyang Huang wrote:
> On Thu, Jul 1, 2021 at 10:56 PM Minchan Kim <[email protected]> wrote:
> >
> > On Wed, Jun 30, 2021 at 06:52:58PM -0700, Andrew Morton wrote:
> > > From: Zhaoyang Huang <[email protected]>
> > > Subject: mm: zram: amend SLAB_RECLAIM_ACCOUNT on zspage_cachep
> > >
> > > Zspage_cachep is found be merged with other kmem cache during test, which
> > > is not good for debug things (zs_pool->zspage_cachep present to be another
> > > kmem cache in memory dumpfile). It is also neccessary to do so as
> > > shrinker has been registered for zspage.
> > >
> > > Amending this flag can help kernel to calculate SLAB_RECLAIMBLE correctly.
> > >
> > > Link: https://lkml.kernel.org/r/[email protected]
> > > Signed-off-by: Zhaoyang Huang <[email protected]>
> > > Cc: Minchan Kim <[email protected]>
> > > Cc: Sergey Senozhatsky <[email protected]>
> > > Signed-off-by: Andrew Morton <[email protected]>
> >
> > Sorry for the late. I don't think this is correct.
> >
> > It's true "struct zspage" can be freed by zsmalloc's compaction registerred
> > by slab shrinker so tempted to make it SLAB_RECLAIM_ACCOUNT. However, it's
> > quite limited to work only when objects in the zspage are heavily fragmented.
> > Once the compaction is done, zspage are never discardable until objects are
> > fragmented again. It means it could hurt other reclaimable slab page reclaiming
> > since the zspage slab object pins the page.
> IMHO, kmem cache's reclaiming is NOT affected by SLAB_RECLAIM_ACCOUNT
> . This flag just affects kmem cache merge[1], the slab page's migrate
> type[2] and the page's statistics. Actually, zspage's cache DO merged
> with others even without SLAB_RECLAIM_ACCOUNT currently, which maybe
> cause zspage's object will NEVER be discarded.(SLAB_MERGE_SAME
> introduce confusions as people believe the cache will merge with
> others when it set and vice versa)
>
> [1]
> struct kmem_cache *find_mergeable(size_t size, size_t align, unsigned
> long flags, const char *name, void (*ctor)(void *))
> ...
> if ((flags & SLAB_MERGE_SAME) != (s->flags & SLAB_MERGE_SAME))
> continue;
>
> [2]
> if (s->flags & SLAB_RECLAIM_ACCOUNT)
> s->allocflags |= __GFP_RECLAIMABLE;

That's the point here. With SLAB_RECLAIM_ACCOUNT, page allocator
try to allocate pages from MIGRATE_RECLAIMABLE with belief those
objects are easily reclaimable. Say a page has object A, B, C, D
and E. A-D are easily reclaimable but E is hard. What happens is
VM couldn't reclaim the page in the end due to E even though it
already reclaimed A-D. And the such fragmenation could be spread
out entire MIGRATE_RECLAIMABLE pageblocks over time.
That's why I'd like to put zspage into MIGRATE_UNMOVALBE from the
beginning since I don't think it's easily reclaimble once compaction
is done.

2021-07-02 06:25:51

by Zhaoyang Huang

[permalink] [raw]

Subject: Re: [patch 108/192] mm: zram: amend SLAB_RECLAIM_ACCOUNT on zspage_cachep

On Fri, Jul 2, 2021 at 1:47 PM Minchan Kim <[email protected]> wrote:
>
> On Fri, Jul 02, 2021 at 10:45:09AM +0800, Zhaoyang Huang wrote:
> > On Thu, Jul 1, 2021 at 10:56 PM Minchan Kim <[email protected]> wrote:
> > >
> > > On Wed, Jun 30, 2021 at 06:52:58PM -0700, Andrew Morton wrote:
> > > > From: Zhaoyang Huang <[email protected]>
> > > > Subject: mm: zram: amend SLAB_RECLAIM_ACCOUNT on zspage_cachep
> > > >
> > > > Zspage_cachep is found be merged with other kmem cache during test, which
> > > > is not good for debug things (zs_pool->zspage_cachep present to be another
> > > > kmem cache in memory dumpfile). It is also neccessary to do so as
> > > > shrinker has been registered for zspage.
> > > >
> > > > Amending this flag can help kernel to calculate SLAB_RECLAIMBLE correctly.
> > > >
> > > > Link: https://lkml.kernel.org/r/[email protected]
> > > > Signed-off-by: Zhaoyang Huang <[email protected]>
> > > > Cc: Minchan Kim <[email protected]>
> > > > Cc: Sergey Senozhatsky <[email protected]>
> > > > Signed-off-by: Andrew Morton <[email protected]>
> > >
> > > Sorry for the late. I don't think this is correct.
> > >
> > > It's true "struct zspage" can be freed by zsmalloc's compaction registerred
> > > by slab shrinker so tempted to make it SLAB_RECLAIM_ACCOUNT. However, it's
> > > quite limited to work only when objects in the zspage are heavily fragmented.
> > > Once the compaction is done, zspage are never discardable until objects are
> > > fragmented again. It means it could hurt other reclaimable slab page reclaiming
> > > since the zspage slab object pins the page.
> > IMHO, kmem cache's reclaiming is NOT affected by SLAB_RECLAIM_ACCOUNT
> > . This flag just affects kmem cache merge[1], the slab page's migrate
> > type[2] and the page's statistics. Actually, zspage's cache DO merged
> > with others even without SLAB_RECLAIM_ACCOUNT currently, which maybe
> > cause zspage's object will NEVER be discarded.(SLAB_MERGE_SAME
> > introduce confusions as people believe the cache will merge with
> > others when it set and vice versa)
> >
> > [1]
> > struct kmem_cache *find_mergeable(size_t size, size_t align, unsigned
> > long flags, const char *name, void (*ctor)(void *))
> > ...
> > if ((flags & SLAB_MERGE_SAME) != (s->flags & SLAB_MERGE_SAME))
> > continue;
> >
> > [2]
> > if (s->flags & SLAB_RECLAIM_ACCOUNT)
> > s->allocflags |= __GFP_RECLAIMABLE;
>
> That's the point here. With SLAB_RECLAIM_ACCOUNT, page allocator
> try to allocate pages from MIGRATE_RECLAIMABLE with belief those
> objects are easily reclaimable. Say a page has object A, B, C, D
> and E. A-D are easily reclaimable but E is hard. What happens is
> VM couldn't reclaim the page in the end due to E even though it
> already reclaimed A-D. And the such fragmenation could be spread
> out entire MIGRATE_RECLAIMABLE pageblocks over time.
> That's why I'd like to put zspage into MIGRATE_UNMOVALBE from the
> beginning since I don't think it's easily reclaimble once compaction
> is done.
The slab page could fallback to any migrate type even allocating with
__GFP_RECLAIMABLE, and there is only one page per slab within zspage's
cache, which will not be affected by compaction, so I think that
doesn't make sense.

2021-07-02 07:36:02

by Minchan Kim

[permalink] [raw]

Subject: Re: [patch 108/192] mm: zram: amend SLAB_RECLAIM_ACCOUNT on zspage_cachep

On Fri, Jul 02, 2021 at 02:20:42PM +0800, Zhaoyang Huang wrote:
> On Fri, Jul 2, 2021 at 1:47 PM Minchan Kim <[email protected]> wrote:
> >
> > On Fri, Jul 02, 2021 at 10:45:09AM +0800, Zhaoyang Huang wrote:
> > > On Thu, Jul 1, 2021 at 10:56 PM Minchan Kim <[email protected]> wrote:
> > > >
> > > > On Wed, Jun 30, 2021 at 06:52:58PM -0700, Andrew Morton wrote:
> > > > > From: Zhaoyang Huang <[email protected]>
> > > > > Subject: mm: zram: amend SLAB_RECLAIM_ACCOUNT on zspage_cachep
> > > > >
> > > > > Zspage_cachep is found be merged with other kmem cache during test, which
> > > > > is not good for debug things (zs_pool->zspage_cachep present to be another
> > > > > kmem cache in memory dumpfile). It is also neccessary to do so as
> > > > > shrinker has been registered for zspage.
> > > > >
> > > > > Amending this flag can help kernel to calculate SLAB_RECLAIMBLE correctly.
> > > > >
> > > > > Link: https://lkml.kernel.org/r/[email protected]
> > > > > Signed-off-by: Zhaoyang Huang <[email protected]>
> > > > > Cc: Minchan Kim <[email protected]>
> > > > > Cc: Sergey Senozhatsky <[email protected]>
> > > > > Signed-off-by: Andrew Morton <[email protected]>
> > > >
> > > > Sorry for the late. I don't think this is correct.
> > > >
> > > > It's true "struct zspage" can be freed by zsmalloc's compaction registerred
> > > > by slab shrinker so tempted to make it SLAB_RECLAIM_ACCOUNT. However, it's
> > > > quite limited to work only when objects in the zspage are heavily fragmented.
> > > > Once the compaction is done, zspage are never discardable until objects are
> > > > fragmented again. It means it could hurt other reclaimable slab page reclaiming
> > > > since the zspage slab object pins the page.
> > > IMHO, kmem cache's reclaiming is NOT affected by SLAB_RECLAIM_ACCOUNT
> > > . This flag just affects kmem cache merge[1], the slab page's migrate
> > > type[2] and the page's statistics. Actually, zspage's cache DO merged
> > > with others even without SLAB_RECLAIM_ACCOUNT currently, which maybe
> > > cause zspage's object will NEVER be discarded.(SLAB_MERGE_SAME
> > > introduce confusions as people believe the cache will merge with
> > > others when it set and vice versa)
> > >
> > > [1]
> > > struct kmem_cache *find_mergeable(size_t size, size_t align, unsigned
> > > long flags, const char *name, void (*ctor)(void *))
> > > ...
> > > if ((flags & SLAB_MERGE_SAME) != (s->flags & SLAB_MERGE_SAME))
> > > continue;
> > >
> > > [2]
> > > if (s->flags & SLAB_RECLAIM_ACCOUNT)
> > > s->allocflags |= __GFP_RECLAIMABLE;
> >
> > That's the point here. With SLAB_RECLAIM_ACCOUNT, page allocator
> > try to allocate pages from MIGRATE_RECLAIMABLE with belief those
> > objects are easily reclaimable. Say a page has object A, B, C, D
> > and E. A-D are easily reclaimable but E is hard. What happens is
> > VM couldn't reclaim the page in the end due to E even though it
> > already reclaimed A-D. And the such fragmenation could be spread
> > out entire MIGRATE_RECLAIMABLE pageblocks over time.
> > That's why I'd like to put zspage into MIGRATE_UNMOVALBE from the
> > beginning since I don't think it's easily reclaimble once compaction
> > is done.
> The slab page could fallback to any migrate type even allocating with

It's true but it couldn't be justication to allocate objects from any
migration type. We should try to select right type. Please see below.

> __GFP_RECLAIMABLE, and there is only one page per slab within zspage's
> cache, which will not be affected by compaction, so I think that
> doesn't make sense.

You shouldn't rely on how many pages the slab has since it's internal
implemenation and zspage size also could be changed in the future.
And please think about external fragmentaion as well as internal one.

What we want to try with allocation type is to group similar lifetime
objects together in a pageblock group to help external fragmentation
for high-order allocation. Think what happens if the unreclaimable
object is located in a reclaimable pageblock. The block couldn't be
merged into high-order page in the end so it causes more compaction
and smaller available high-order pages in the system.