2018-03-13 16:57:07

by Shakeel Butt

[permalink] [raw]
Subject: [PATCH] slab, slub: remove size disparity on debug kernel

I have noticed on debug kernel with SLAB, the size of some non-root
slabs were larger than their corresponding root slabs.

e.g. for radix_tree_node:
$cat /proc/slabinfo | grep radix
name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> ...
radix_tree_node 15052 15075 4096 1 1 ...

$cat /cgroup/memory/temp/memory.kmem.slabinfo | grep radix
name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> ...
radix_tree_node 1581 158 4120 1 2 ...

However for SLUB in debug kernel, the sizes were same. On further
inspection it is found that SLUB always use kmem_cache.object_size to
measure the kmem_cache.size while SLAB use the given kmem_cache.size. In
the debug kernel the slab's size can be larger than its object_size.
Thus in the creation of non-root slab, the SLAB uses the root's size as
base to calculate the non-root slab's size and thus non-root slab's size
can be larger than the root slab's size. For SLUB, the non-root slab's
size is measured based on the root's object_size and thus the size will
remain same for root and non-root slab.

This patch makes slab's object_size the default base to measure the
slab's size.

Signed-off-by: Shakeel Butt <[email protected]>
---
mm/slab_common.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index e41cbc18c57d..61ab2ca8bea7 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -379,7 +379,7 @@ struct kmem_cache *find_mergeable(unsigned int size, unsigned int align,
}

static struct kmem_cache *create_cache(const char *name,
- unsigned int object_size, unsigned int size, unsigned int align,
+ unsigned int object_size, unsigned int align,
slab_flags_t flags, unsigned int useroffset,
unsigned int usersize, void (*ctor)(void *),
struct mem_cgroup *memcg, struct kmem_cache *root_cache)
@@ -396,8 +396,7 @@ static struct kmem_cache *create_cache(const char *name,
goto out;

s->name = name;
- s->object_size = object_size;
- s->size = size;
+ s->size = s->object_size = object_size;
s->align = align;
s->ctor = ctor;
s->useroffset = useroffset;
@@ -503,7 +502,7 @@ kmem_cache_create_usercopy(const char *name,
goto out_unlock;
}

- s = create_cache(cache_name, size, size,
+ s = create_cache(cache_name, size,
calculate_alignment(flags, align, size),
flags, useroffset, usersize, ctor, NULL, NULL);
if (IS_ERR(s)) {
@@ -650,7 +649,7 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
goto out_unlock;

s = create_cache(cache_name, root_cache->object_size,
- root_cache->size, root_cache->align,
+ root_cache->align,
root_cache->flags & CACHE_CREATE_MASK,
root_cache->useroffset, root_cache->usersize,
root_cache->ctor, memcg, root_cache);
--
2.16.2.660.g709887971b-goog



Subject: Re: [PATCH] slab, slub: remove size disparity on debug kernel

On Tue, 13 Mar 2018, Shakeel Butt wrote:

> However for SLUB in debug kernel, the sizes were same. On further
> inspection it is found that SLUB always use kmem_cache.object_size to
> measure the kmem_cache.size while SLAB use the given kmem_cache.size. In
> the debug kernel the slab's size can be larger than its object_size.
> Thus in the creation of non-root slab, the SLAB uses the root's size as
> base to calculate the non-root slab's size and thus non-root slab's size
> can be larger than the root slab's size. For SLUB, the non-root slab's
> size is measured based on the root's object_size and thus the size will
> remain same for root and non-root slab.

Note that the object_size and size may differ for SLUB based on kernel
parameters and slab configuration. For SLAB these are compilation options.

> @@ -379,7 +379,7 @@ struct kmem_cache *find_mergeable(unsigned int size, unsigned int align,
> }
>
> static struct kmem_cache *create_cache(const char *name,
> - unsigned int object_size, unsigned int size, unsigned int align,
> + unsigned int object_size, unsigned int align,
> slab_flags_t flags, unsigned int useroffset,

Why was both the size and object_size passed during cache creation in the
first place? From the flags etc the slab logic should be able to compute
the actual bytes required for each object and its metadata.


2018-03-13 17:38:16

by Shakeel Butt

[permalink] [raw]
Subject: Re: [PATCH] slab, slub: remove size disparity on debug kernel

On Tue, Mar 13, 2018 at 10:19 AM, Christopher Lameter <[email protected]> wrote:
> On Tue, 13 Mar 2018, Shakeel Butt wrote:
>
>> However for SLUB in debug kernel, the sizes were same. On further
>> inspection it is found that SLUB always use kmem_cache.object_size to
>> measure the kmem_cache.size while SLAB use the given kmem_cache.size. In
>> the debug kernel the slab's size can be larger than its object_size.
>> Thus in the creation of non-root slab, the SLAB uses the root's size as
>> base to calculate the non-root slab's size and thus non-root slab's size
>> can be larger than the root slab's size. For SLUB, the non-root slab's
>> size is measured based on the root's object_size and thus the size will
>> remain same for root and non-root slab.
>
> Note that the object_size and size may differ for SLUB based on kernel
> parameters and slab configuration. For SLAB these are compilation options.
>

Thanks for the explanation.

>> @@ -379,7 +379,7 @@ struct kmem_cache *find_mergeable(unsigned int size, unsigned int align,
>> }
>>
>> static struct kmem_cache *create_cache(const char *name,
>> - unsigned int object_size, unsigned int size, unsigned int align,
>> + unsigned int object_size, unsigned int align,
>> slab_flags_t flags, unsigned int useroffset,
>
> Why was both the size and object_size passed during cache creation in the
> first place? From the flags etc the slab logic should be able to compute
> the actual bytes required for each object and its metadata.
>

+Vladimir

I think it was introduced by 794b1248be4e7 ("memcg, slab: separate
memcg vs root cache creation paths") but I could not find out the
reason.

2018-03-14 08:44:41

by Vladimir Davydov

[permalink] [raw]
Subject: Re: [PATCH] slab, slub: remove size disparity on debug kernel

On Tue, Mar 13, 2018 at 10:36:52AM -0700, Shakeel Butt wrote:
> On Tue, Mar 13, 2018 at 10:19 AM, Christopher Lameter <[email protected]> wrote:
> > On Tue, 13 Mar 2018, Shakeel Butt wrote:
> >
> >> However for SLUB in debug kernel, the sizes were same. On further
> >> inspection it is found that SLUB always use kmem_cache.object_size to
> >> measure the kmem_cache.size while SLAB use the given kmem_cache.size. In
> >> the debug kernel the slab's size can be larger than its object_size.
> >> Thus in the creation of non-root slab, the SLAB uses the root's size as
> >> base to calculate the non-root slab's size and thus non-root slab's size
> >> can be larger than the root slab's size. For SLUB, the non-root slab's
> >> size is measured based on the root's object_size and thus the size will
> >> remain same for root and non-root slab.
> >
> > Note that the object_size and size may differ for SLUB based on kernel
> > parameters and slab configuration. For SLAB these are compilation options.
> >
>
> Thanks for the explanation.
>
> >> @@ -379,7 +379,7 @@ struct kmem_cache *find_mergeable(unsigned int size, unsigned int align,
> >> }
> >>
> >> static struct kmem_cache *create_cache(const char *name,
> >> - unsigned int object_size, unsigned int size, unsigned int align,
> >> + unsigned int object_size, unsigned int align,
> >> slab_flags_t flags, unsigned int useroffset,
> >
> > Why was both the size and object_size passed during cache creation in the
> > first place? From the flags etc the slab logic should be able to compute
> > the actual bytes required for each object and its metadata.
> >
>
> +Vladimir
>
> I think it was introduced by 794b1248be4e7 ("memcg, slab: separate
> memcg vs root cache creation paths") but I could not find out the
> reason.

This was a mistake - I missed that __kmem_cache_create() overwrites
kmem_cache->size. Thanks for fixing this.

2018-03-14 17:06:22

by Shakeel Butt

[permalink] [raw]
Subject: Re: [PATCH] slab, slub: remove size disparity on debug kernel

On Wed, Mar 14, 2018 at 1:43 AM, Vladimir Davydov
<[email protected]> wrote:
> On Tue, Mar 13, 2018 at 10:36:52AM -0700, Shakeel Butt wrote:
>> On Tue, Mar 13, 2018 at 10:19 AM, Christopher Lameter <[email protected]> wrote:
>> > On Tue, 13 Mar 2018, Shakeel Butt wrote:
>> >
>> >> However for SLUB in debug kernel, the sizes were same. On further
>> >> inspection it is found that SLUB always use kmem_cache.object_size to
>> >> measure the kmem_cache.size while SLAB use the given kmem_cache.size. In
>> >> the debug kernel the slab's size can be larger than its object_size.
>> >> Thus in the creation of non-root slab, the SLAB uses the root's size as
>> >> base to calculate the non-root slab's size and thus non-root slab's size
>> >> can be larger than the root slab's size. For SLUB, the non-root slab's
>> >> size is measured based on the root's object_size and thus the size will
>> >> remain same for root and non-root slab.
>> >
>> > Note that the object_size and size may differ for SLUB based on kernel
>> > parameters and slab configuration. For SLAB these are compilation options.
>> >
>>
>> Thanks for the explanation.
>>
>> >> @@ -379,7 +379,7 @@ struct kmem_cache *find_mergeable(unsigned int size, unsigned int align,
>> >> }
>> >>
>> >> static struct kmem_cache *create_cache(const char *name,
>> >> - unsigned int object_size, unsigned int size, unsigned int align,
>> >> + unsigned int object_size, unsigned int align,
>> >> slab_flags_t flags, unsigned int useroffset,
>> >
>> > Why was both the size and object_size passed during cache creation in the
>> > first place? From the flags etc the slab logic should be able to compute
>> > the actual bytes required for each object and its metadata.
>> >
>>
>> +Vladimir
>>
>> I think it was introduced by 794b1248be4e7 ("memcg, slab: separate
>> memcg vs root cache creation paths") but I could not find out the
>> reason.
>
> This was a mistake - I missed that __kmem_cache_create() overwrites
> kmem_cache->size. Thanks for fixing this.

Thanks for confirming.

Andrew, can you please add following line to the patch commit message.

Fixes: 794b1248be4e ("memcg, slab: separate memcg vs root cache creation paths")