2016-10-31 18:37:25

by Thomas Garnier

Subject: [PATCH v2] memcg: Prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB

While testing OBJFREELIST_SLAB integration with pagealloc, we found a
bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
CFLGS_OBJFREELIST_SLAB.

The original kmem_cache is created early, which makes OFF_SLAB
impossible. When kmem_cache(sys) is created, OFF_SLAB is possible, and
if pagealloc is enabled the allocator will try to enable it first under
certain conditions. Because kmem_cache(sys) reuses the original flags,
both flags can end up set at the same time, resulting in allocation
failures and odd behavior.

This fix discards allocator-specific flags from memcg and ensures
create_cache cannot be called with them.

Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, OBJFREELIST_SLAB")
Signed-off-by: Thomas Garnier <[email protected]>
Signed-off-by: Greg Thelen <[email protected]>
---
Based on next-20161025
---
mm/slab.h | 3 +++
mm/slab_common.c | 10 ++++++++--
2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/mm/slab.h b/mm/slab.h
index 9653f2e..58be647 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -144,6 +144,9 @@ static inline unsigned long kmem_cache_flags(unsigned long object_size,

#define CACHE_CREATE_MASK (SLAB_CORE_FLAGS | SLAB_DEBUG_FLAGS | SLAB_CACHE_FLAGS)

+/* Common allocator flags allowed for cache_create. */
+#define SLAB_FLAGS_PERMITTED (CACHE_CREATE_MASK | SLAB_KASAN)
+
int __kmem_cache_shutdown(struct kmem_cache *);
void __kmem_cache_release(struct kmem_cache *);
int __kmem_cache_shrink(struct kmem_cache *, bool);
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 71f0b28..01d067c 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -329,6 +329,12 @@ static struct kmem_cache *create_cache(const char *name,
 	struct kmem_cache *s;
 	int err;

+	/* Do not allow allocator specific flags */
+	if (flags & ~SLAB_FLAGS_PERMITTED) {
+		err = -EINVAL;
+		goto out;
+	}
+
 	err = -ENOMEM;
 	s = kmem_cache_zalloc(kmem_cache, GFP_KERNEL);
 	if (!s)
@@ -533,8 +539,8 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,

 	s = create_cache(cache_name, root_cache->object_size,
 			 root_cache->size, root_cache->align,
-			 root_cache->flags, root_cache->ctor,
-			 memcg, root_cache);
+			 root_cache->flags & SLAB_FLAGS_PERMITTED,
+			 root_cache->ctor, memcg, root_cache);
 	/*
 	 * If we could not create a memcg cache, do not complain, because
 	 * that's not critical at all as we can always proceed with the root
--
2.8.0.rc3.226.g39d4020
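
Editorial illustration, not part of the patch: the flag collision described
in the commit message can be modelled in plain userspace C. This is only a
sketch under stated assumptions. Every DEMO_ name and bit value below is
hypothetical and is not taken from the kernel headers; the helper
demo_pick_layout() merely stands in for the allocator choosing a slab
layout.

```c
#include <stdbool.h>
#include <errno.h>

/* Illustrative bit values only; NOT the kernel's definitions. */
#define DEMO_SLAB_HWCACHE_ALIGN  0x00002000UL  /* stands in for a public flag */
#define DEMO_CFLGS_OBJFREELIST   0x40000000UL  /* stands in for an internal flag */
#define DEMO_CFLGS_OFF_SLAB      0x80000000UL  /* stands in for an internal flag */

/* Hypothetical counterpart of SLAB_FLAGS_PERMITTED: public flags only. */
#define DEMO_FLAGS_PERMITTED     (DEMO_SLAB_HWCACHE_ALIGN)

/* Models the allocator picking a layout: OFF_SLAB when it is possible. */
static unsigned long demo_pick_layout(unsigned long flags, bool off_slab_possible)
{
	if (off_slab_possible)
		return flags | DEMO_CFLGS_OFF_SLAB;
	return flags | DEMO_CFLGS_OBJFREELIST;
}

/* True when both mutually exclusive layout flags are set. */
static bool demo_layout_conflict(unsigned long flags)
{
	return (flags & DEMO_CFLGS_OFF_SLAB) && (flags & DEMO_CFLGS_OBJFREELIST);
}

/* Models the check added to create_cache(): refuse internal flags. */
static int demo_create_cache(unsigned long flags, unsigned long *out_flags)
{
	if (flags & ~DEMO_FLAGS_PERMITTED)
		return -EINVAL;
	*out_flags = demo_pick_layout(flags, true);
	return 0;
}
```

In this model, a root cache created early ends up with the OBJFREELIST bit.
Passing its flags on unmasked lets the child acquire the OFF_SLAB bit as
well, while masking with DEMO_FLAGS_PERMITTED first leaves the child with a
single layout flag.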


2016-10-31 23:38:24

by David Rientjes

Subject: Re: [PATCH v2] memcg: Prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB

On Mon, 31 Oct 2016, Thomas Garnier wrote:

> While testing OBJFREELIST_SLAB integration with pagealloc, we found a
> bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
> CFLGS_OBJFREELIST_SLAB.
>
> The original kmem_cache is created early making OFF_SLAB not possible.
> When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
> is enabled it will try to enable it first under certain conditions.
> Given kmem_cache(sys) reuses the original flag, you can have both flags
> at the same time resulting in allocation failures and odd behaviors.
>
> This fix discards allocator specific flags from memcg and ensure
> cache_create cannot be called with them.
>
> Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, OBJFREELIST_SLAB")
> Signed-off-by: Thomas Garnier <[email protected]>
> Signed-off-by: Greg Thelen <[email protected]>

Order of the signoffs is strange, should this have a

From: Greg Thelen <[email protected]>

in the first line or is this your patch?

> ---
> Based on next-20161025
> ---
> mm/slab.h | 3 +++
> mm/slab_common.c | 10 ++++++++--
> 2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/mm/slab.h b/mm/slab.h
> index 9653f2e..58be647 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -144,6 +144,9 @@ static inline unsigned long kmem_cache_flags(unsigned long object_size,
>
> #define CACHE_CREATE_MASK (SLAB_CORE_FLAGS | SLAB_DEBUG_FLAGS | SLAB_CACHE_FLAGS)
>
> +/* Common allocator flags allowed for cache_create. */
> +#define SLAB_FLAGS_PERMITTED (CACHE_CREATE_MASK | SLAB_KASAN)
> +
> int __kmem_cache_shutdown(struct kmem_cache *);
> void __kmem_cache_release(struct kmem_cache *);
> int __kmem_cache_shrink(struct kmem_cache *, bool);
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 71f0b28..01d067c 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -329,6 +329,12 @@ static struct kmem_cache *create_cache(const char *name,
> struct kmem_cache *s;
> int err;
>
> + /* Do not allow allocator specific flags */
> + if (flags & ~SLAB_FLAGS_PERMITTED) {
> + err = -EINVAL;
> + goto out;
> + }
> +

Why not just flags &= SLAB_FLAGS_PERMITTED if we're concerned about this
like kmem_cache_create does &= CACHE_CREATE_MASK?

> err = -ENOMEM;
> s = kmem_cache_zalloc(kmem_cache, GFP_KERNEL);
> if (!s)
> @@ -533,8 +539,8 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
>
> s = create_cache(cache_name, root_cache->object_size,
> root_cache->size, root_cache->align,
> - root_cache->flags, root_cache->ctor,
> - memcg, root_cache);
> + root_cache->flags & SLAB_FLAGS_PERMITTED,
> + root_cache->ctor, memcg, root_cache);
> /*
> * If we could not create a memcg cache, do not complain, because
> * that's not critical at all as we can always proceed with the root

This introduces an inconsistency that isn't explained: why is SLAB_KASAN,
the only reason why SLAB_FLAGS_PERMITTED needs to be defined, permitted
for memcg_create_kmem_cache() but not kmem_cache_create()? (If we need to
keep SLAB_FLAGS_PERMITTED around, I think it needs a new name since it's a
restriction on the cache, not slab.)
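
Editorial illustration, not part of the review: the two options being
weighed here (rejecting unexpected flags versus silently clearing them in
the callee) can be compared side by side in a small userspace sketch. The
sketch_ names and the mask value are hypothetical and chosen only for
illustration; neither function is a kernel API.

```c
#include <errno.h>

/* Hypothetical mask of flags a caller may pass; the value is illustrative. */
#define SKETCH_FLAGS_PERMITTED 0x0000ffffUL

/* Strict variant, as in the patch: any other bit is an error. */
static int sketch_create_strict(unsigned long flags, unsigned long *accepted)
{
	if (flags & ~SKETCH_FLAGS_PERMITTED)
		return -EINVAL;
	*accepted = flags;
	return 0;
}

/* Lenient variant, as suggested in review: silently drop other bits. */
static int sketch_create_masking(unsigned long flags, unsigned long *accepted)
{
	*accepted = flags & SKETCH_FLAGS_PERMITTED;
	return 0;
}
```

The strict variant surfaces a misbehaving caller as an error, whereas the
masking variant always succeeds and hides it. Which trade-off suits the
slab code is exactly what the thread goes on to discuss.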

2016-11-02 15:59:11

by Thomas Garnier

Subject: Re: [PATCH v2] memcg: Prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB

On Mon, Oct 31, 2016 at 4:38 PM, David Rientjes <[email protected]> wrote:
> On Mon, 31 Oct 2016, Thomas Garnier wrote:
>
>> While testing OBJFREELIST_SLAB integration with pagealloc, we found a
>> bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
>> CFLGS_OBJFREELIST_SLAB.
>>
>> The original kmem_cache is created early making OFF_SLAB not possible.
>> When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
>> is enabled it will try to enable it first under certain conditions.
>> Given kmem_cache(sys) reuses the original flag, you can have both flags
>> at the same time resulting in allocation failures and odd behaviors.
>>
>> This fix discards allocator specific flags from memcg and ensure
>> cache_create cannot be called with them.
>>
>> Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, OBJFREELIST_SLAB")
>> Signed-off-by: Thomas Garnier <[email protected]>
>> Signed-off-by: Greg Thelen <[email protected]>
>
> Order of the signoffs is strange, should this have a
>
> From: Greg Thelen <[email protected]>
>
> in the first line or is this your patch?
>

Yes, thanks for pointing that out. I will put Greg as owner and myself
as tester. That makes more sense for this patch.

>> ---
>> Based on next-20161025
>> ---
>> mm/slab.h | 3 +++
>> mm/slab_common.c | 10 ++++++++--
>> 2 files changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/slab.h b/mm/slab.h
>> index 9653f2e..58be647 100644
>> --- a/mm/slab.h
>> +++ b/mm/slab.h
>> @@ -144,6 +144,9 @@ static inline unsigned long kmem_cache_flags(unsigned long object_size,
>>
>> #define CACHE_CREATE_MASK (SLAB_CORE_FLAGS | SLAB_DEBUG_FLAGS | SLAB_CACHE_FLAGS)
>>
>> +/* Common allocator flags allowed for cache_create. */
>> +#define SLAB_FLAGS_PERMITTED (CACHE_CREATE_MASK | SLAB_KASAN)
>> +
>> int __kmem_cache_shutdown(struct kmem_cache *);
>> void __kmem_cache_release(struct kmem_cache *);
>> int __kmem_cache_shrink(struct kmem_cache *, bool);
>> diff --git a/mm/slab_common.c b/mm/slab_common.c
>> index 71f0b28..01d067c 100644
>> --- a/mm/slab_common.c
>> +++ b/mm/slab_common.c
>> @@ -329,6 +329,12 @@ static struct kmem_cache *create_cache(const char *name,
>> struct kmem_cache *s;
>> int err;
>>
>> + /* Do not allow allocator specific flags */
>> + if (flags & ~SLAB_FLAGS_PERMITTED) {
>> + err = -EINVAL;
>> + goto out;
>> + }
>> +
>
> Why not just flags &= SLAB_FLAGS_PERMITTED if we're concerned about this
> like kmem_cache_create does &= CACHE_CREATE_MASK?
>

On the first version, Christoph advised removing invalid flags in the
caller and checking that they are correct in kmem_cache_create. The
memcg path that passes the wrong flags goes through create_cache, but I
still used this approach.

>> err = -ENOMEM;
>> s = kmem_cache_zalloc(kmem_cache, GFP_KERNEL);
>> if (!s)
>> @@ -533,8 +539,8 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
>>
>> s = create_cache(cache_name, root_cache->object_size,
>> root_cache->size, root_cache->align,
>> - root_cache->flags, root_cache->ctor,
>> - memcg, root_cache);
>> + root_cache->flags & SLAB_FLAGS_PERMITTED,
>> + root_cache->ctor, memcg, root_cache);
>> /*
>> * If we could not create a memcg cache, do not complain, because
>> * that's not critical at all as we can always proceed with the root
>
> This introduces an inconsistency that isn't explained: why is SLAB_KASAN,
> the only reason why SLAB_FLAGS_PERMITTED needs to be defined, permitted
> for memcg_create_kmem_cache() but not kmem_cache_create()? (If we need to
> keep SLAB_FLAGS_PERMITTED around, I think it needs a new name since its a
> restriction on the cache, not slab.)

The idea was that SLAB_FLAGS_PERMITTED would cover all the common flags.
SLAB_KASAN was the only one not in CACHE_CREATE_MASK.

Christoph: Which approach do you prefer?


--
Thomas

2016-11-03 00:46:11

by David Rientjes

Subject: Re: [PATCH v2] memcg: Prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB

On Wed, 2 Nov 2016, Thomas Garnier wrote:

> >> diff --git a/mm/slab.h b/mm/slab.h
> >> index 9653f2e..58be647 100644
> >> --- a/mm/slab.h
> >> +++ b/mm/slab.h
> >> @@ -144,6 +144,9 @@ static inline unsigned long kmem_cache_flags(unsigned long object_size,
> >>
> >> #define CACHE_CREATE_MASK (SLAB_CORE_FLAGS | SLAB_DEBUG_FLAGS | SLAB_CACHE_FLAGS)
> >>
> >> +/* Common allocator flags allowed for cache_create. */
> >> +#define SLAB_FLAGS_PERMITTED (CACHE_CREATE_MASK | SLAB_KASAN)
> >> +
> >> int __kmem_cache_shutdown(struct kmem_cache *);
> >> void __kmem_cache_release(struct kmem_cache *);
> >> int __kmem_cache_shrink(struct kmem_cache *, bool);
> >> diff --git a/mm/slab_common.c b/mm/slab_common.c
> >> index 71f0b28..01d067c 100644
> >> --- a/mm/slab_common.c
> >> +++ b/mm/slab_common.c
> >> @@ -329,6 +329,12 @@ static struct kmem_cache *create_cache(const char *name,
> >> struct kmem_cache *s;
> >> int err;
> >>
> >> + /* Do not allow allocator specific flags */
> >> + if (flags & ~SLAB_FLAGS_PERMITTED) {
> >> + err = -EINVAL;
> >> + goto out;
> >> + }
> >> +
> >
> > Why not just flags &= SLAB_FLAGS_PERMITTED if we're concerned about this
> > like kmem_cache_create does &= CACHE_CREATE_MASK?
> >
>
> Christoph on the first version advised removing invalid flags on the
> caller and checking they are correct in kmem_cache_create. The memcg
> path putting the wrong flags is through create_cache but I still used
> this approach.
>

I think this is a rather trivial point since it doesn't matter if we clear
invalid flags on the caller or in the callee and obviously
kmem_cache_create() does it in the callee.

Subject: Re: [PATCH v2] memcg: Prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB

On Wed, 2 Nov 2016, David Rientjes wrote:

> > Christoph on the first version advised removing invalid flags on the
> > caller and checking they are correct in kmem_cache_create. The memcg
> > path putting the wrong flags is through create_cache but I still used
> > this approach.
> >
>
> I think this is a rather trivial point since it doesn't matter if we clear
> invalid flags on the caller or in the callee and obviously
> kmem_cache_create() does it in the callee.

In order to be correct we need to do the following:

kmem_cache_create should check for invalid flags (and that includes
internal allocator flags) being set and refuse to create the slab cache.

memcg needs to call kmem_cache_create without any internal flags.

I also want to make sure that there are no other callers that specify
extraneous flags while we are at it.
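
Editorial illustration, not part of the message: one way to find callers
that pass extraneous flags is to compute the offending bits and name them
when a request is refused. The sketch below is a userspace model under
stated assumptions. The audit_ helpers and the AUDIT_PERMITTED value are
hypothetical and do not exist in the kernel.

```c
#include <stdio.h>

/* Hypothetical mask of permitted flags; the value is illustrative. */
#define AUDIT_PERMITTED 0x0000ffffUL

/* Returns only the bits that fall outside the permitted mask. */
static unsigned long audit_extraneous_flags(unsigned long flags)
{
	return flags & ~AUDIT_PERMITTED;
}

/*
 * Formats a diagnostic naming the offending bits. Returns 0 and leaves
 * buf untouched when the flags are clean; otherwise returns the
 * snprintf() result.
 */
static int audit_format(char *buf, size_t len, const char *name,
			unsigned long flags)
{
	unsigned long bad = audit_extraneous_flags(flags);

	if (!bad)
		return 0;
	return snprintf(buf, len, "cache %s: extraneous flags %#lx", name, bad);
}
```

A message of this kind would let a refused cache creation point directly at
the caller and the bits that need fixing.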

2016-11-07 18:52:11

by Thomas Garnier

Subject: Re: [PATCH v2] memcg: Prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB

On Thu, Nov 3, 2016 at 1:33 PM, Christoph Lameter <[email protected]> wrote:
> On Wed, 2 Nov 2016, David Rientjes wrote:
>
>> > Christoph on the first version advised removing invalid flags on the
>> > caller and checking they are correct in kmem_cache_create. The memcg
>> > path putting the wrong flags is through create_cache but I still used
>> > this approach.
>> >
>>
>> I think this is a rather trivial point since it doesn't matter if we clear
>> invalid flags on the caller or in the callee and obviously
>> kmem_cache_create() does it in the callee.
>
> In order to be correct we need to do the following:
>
> kmem_cache_create should check for invalid flags (and that includes
> internal allocator flags) being set and refuse to create the slab cache.
>
> memcg needs to call kmem_cache_create without any internal flags.
>

I am not sure that is possible. kmem_cache_create currently checks for
possible aliases; I assume that goes against what memcg tries to do.

Separating the changes into two patches might make sense:

1) Fix the original bug by masking the flags passed to create_cache
2) Add flags check in kmem_cache_create.

Does it make sense?

> I also want to make sure that there are no other callers that specify
> extraneous flags while we are at it.
>

I will review as many as I can but we might run into surprises (quick
boot on defconfig didn't show anything). That's why having two
different patches might be useful.

--
Thomas

Subject: Re: [PATCH v2] memcg: Prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB

On Mon, 7 Nov 2016, Thomas Garnier wrote:

> I am not sure that is possible. kmem_cache_create currently check for
> possible alias, I assume that it goes against what memcg tries to do.

What does aliasing have to do with this? The aliases must have the same
flags otherwise the caches would not have been merged.

> Separate the changes in two patches might make sense:
>
> 1) Fix the original bug by masking the flags passed to create_cache
> 2) Add flags check in kmem_cache_create.
>
> Does it make sense?

Sure.

> > I also want to make sure that there are no other callers that specify
> > extraneous flags while we are at it.
> I will review as many as I can but we might run into surprises (quick
> boot on defconfig didn't show anything). That's why having two
> different patches might be useful.

These surprises can be caught later ... Just make sure that the core works
fine with this. You cannot audit all drivers.

2016-11-07 19:52:49

by Thomas Garnier

Subject: Re: [PATCH v2] memcg: Prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB

On Mon, Nov 7, 2016 at 11:28 AM, Christoph Lameter <[email protected]> wrote:
> On Mon, 7 Nov 2016, Thomas Garnier wrote:
>
>> I am not sure that is possible. kmem_cache_create currently check for
>> possible alias, I assume that it goes against what memcg tries to do.
>
> What does aliasing have to do with this? The aliases must have the same
> flags otherwise the caches would not have been merged.
>

I assume there might be cases where the parent cache and the new memcg
cache are compatible for merging (same flags and size). We can bypass
that by adding SLAB_NEVER_MERGE, but I am not sure what the
consequences of that would be.

>> Separate the changes in two patches might make sense:
>>
>> 1) Fix the original bug by masking the flags passed to create_cache
>> 2) Add flags check in kmem_cache_create.
>>
>> Does it make sense?
>
> Sure.
>

Great, I will send both patches.

>> > I also want to make sure that there are no other callers that specify
>> > extraneous flags while we are at it.
>> I will review as many as I can but we might run into surprises (quick
>> boot on defconfig didn't show anything). That's why having two
>> different patches might be useful.
>
> These surprises can be caught later ... Just make sure that the core works
> fine with this. You cannot audit all drivers.
>

Okay, I will.



--
Thomas