From: Greg Thelen <[email protected]>
While testing OBJFREELIST_SLAB integration with pagealloc, we found a
bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
CFLGS_OBJFREELIST_SLAB.
The original kmem_cache is created early making OFF_SLAB not possible.
When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
is enabled it will try to enable it first under certain conditions.
Given kmem_cache(sys) reuses the original flag, you can have both flags
at the same time resulting in allocation failures and odd behaviors.
This fix discards allocator specific flags from memcg before calling
create_cache.
Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, OBJFREELIST_SLAB")
Signed-off-by: Greg Thelen <[email protected]>
Tested-by: Thomas Garnier <[email protected]>
---
Based on next-20161027
---
mm/slab_common.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 71f0b28..329b038 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -533,8 +533,8 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
s = create_cache(cache_name, root_cache->object_size,
root_cache->size, root_cache->align,
- root_cache->flags, root_cache->ctor,
- memcg, root_cache);
+ root_cache->flags & CACHE_CREATE_MASK,
+ root_cache->ctor, memcg, root_cache);
/*
* If we could not create a memcg cache, do not complain, because
* that's not critical at all as we can always proceed with the root
--
2.8.0.rc3.226.g39d4020
Verify that kmem_create_cache flags are not allocator specific. It is
done before removing flags that are not available with the current
configuration.
Signed-off-by: Thomas Garnier <[email protected]>
---
Based on next-20161027
---
mm/slab.h | 15 +++++++++++++++
mm/slab_common.c | 6 ++++++
2 files changed, 21 insertions(+)
diff --git a/mm/slab.h b/mm/slab.h
index 9653f2e..3b11896 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -142,8 +142,23 @@ static inline unsigned long kmem_cache_flags(unsigned long object_size,
#define SLAB_CACHE_FLAGS (0)
#endif
+/* Common flags available with current configuration */
#define CACHE_CREATE_MASK (SLAB_CORE_FLAGS | SLAB_DEBUG_FLAGS | SLAB_CACHE_FLAGS)
+/* Common flags permitted for kmem_cache_create */
+#define SLAB_FLAGS_PERMITTED (SLAB_CORE_FLAGS | \
+ SLAB_RED_ZONE | \
+ SLAB_POISON | \
+ SLAB_STORE_USER | \
+ SLAB_TRACE | \
+ SLAB_CONSISTENCY_CHECKS | \
+ SLAB_MEM_SPREAD | \
+ SLAB_NOLEAKTRACE | \
+ SLAB_RECLAIM_ACCOUNT | \
+ SLAB_TEMPORARY | \
+ SLAB_NOTRACK | \
+ SLAB_ACCOUNT)
+
int __kmem_cache_shutdown(struct kmem_cache *);
void __kmem_cache_release(struct kmem_cache *);
int __kmem_cache_shrink(struct kmem_cache *, bool);
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 329b038..5e01994 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -404,6 +404,12 @@ kmem_cache_create(const char *name, size_t size, size_t align,
goto out_unlock;
}
+ /* Refuse requests with allocator specific flags */
+ if (flags & ~SLAB_FLAGS_PERMITTED) {
+ err = -EINVAL;
+ goto out_unlock;
+ }
+
/*
* Some allocators will constraint the set of valid flags to a subset
* of all flags. We expect them to define CACHE_CREATE_MASK in this
--
2.8.0.rc3.226.g39d4020
On Mon, 7 Nov 2016 13:11:14 -0800 Thomas Garnier <[email protected]> wrote:
> From: Greg Thelen <[email protected]>
>
> While testing OBJFREELIST_SLAB integration with pagealloc, we found a
> bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
> CFLGS_OBJFREELIST_SLAB.
>
> The original kmem_cache is created early making OFF_SLAB not possible.
> When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
> is enabled it will try to enable it first under certain conditions.
> Given kmem_cache(sys) reuses the original flag, you can have both flags
> at the same time resulting in allocation failures and odd behaviors.
Can we please have a better description of the problems which this bug
causes? Without this info it's unclear to me which kernel version(s)
need the fix.
Given that the bug is 6 months old I'm assuming "not very urgent".
> This fix discards allocator specific flags from memcg before calling
> create_cache.
>
> Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, OBJFREELIST_SLAB")
> Signed-off-by: Greg Thelen <[email protected]>
> Tested-by: Thomas Garnier <[email protected]>
This should have had your signed-off-by, as you were on the delivery
path. I've made that change.
On Mon, Nov 7, 2016 at 2:19 PM, Andrew Morton <[email protected]> wrote:
> On Mon, 7 Nov 2016 13:11:14 -0800 Thomas Garnier <[email protected]> wrote:
>
>> From: Greg Thelen <[email protected]>
>>
>> While testing OBJFREELIST_SLAB integration with pagealloc, we found a
>> bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
>> CFLGS_OBJFREELIST_SLAB.
>>
>> The original kmem_cache is created early making OFF_SLAB not possible.
>> When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
>> is enabled it will try to enable it first under certain conditions.
>> Given kmem_cache(sys) reuses the original flag, you can have both flags
>> at the same time resulting in allocation failures and odd behaviors.
>
> Can we please have a better description of the problems which this bug
> causes? Without this info it's unclear to me which kernel version(s)
> need the fix.
>
> Given that the bug is 6 months old I'm assuming "not very urgent".
>
I will add more details and send another round.
>> This fix discards allocator specific flags from memcg before calling
>> create_cache.
>>
>> Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, OBJFREELIST_SLAB")
>> Signed-off-by: Greg Thelen <[email protected]>
>> Tested-by: Thomas Garnier <[email protected]>
>
> This should have had your signed-off-by, as you were on the delivery
> path. I've made that change.
Thanks Andrew.
--
Thomas
On Mon, 7 Nov 2016 14:32:56 -0800 Thomas Garnier <[email protected]> wrote:
> On Mon, Nov 7, 2016 at 2:19 PM, Andrew Morton <[email protected]> wrote:
> > On Mon, 7 Nov 2016 13:11:14 -0800 Thomas Garnier <[email protected]> wrote:
> >
> >> From: Greg Thelen <[email protected]>
> >>
> >> While testing OBJFREELIST_SLAB integration with pagealloc, we found a
> >> bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
> >> CFLGS_OBJFREELIST_SLAB.
> >>
> >> The original kmem_cache is created early making OFF_SLAB not possible.
> >> When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
> >> is enabled it will try to enable it first under certain conditions.
> >> Given kmem_cache(sys) reuses the original flag, you can have both flags
> >> at the same time resulting in allocation failures and odd behaviors.
> >
> > Can we please have a better description of the problems which this bug
> > causes? Without this info it's unclear to me which kernel version(s)
> > need the fix.
> >
> > Given that the bug is 6 months old I'm assuming "not very urgent".
> >
>
> I will add more details and send another round.
Please simply send the additional changelog text in this thread -
processing an entire v4 patch just for a changelog fiddle is rather
heavyweight.
On Mon, 7 Nov 2016 13:11:15 -0800 Thomas Garnier <[email protected]> wrote:
> Verify that kmem_create_cache flags are not allocator specific. It is
> done before removing flags that are not available with the current
> configuration.
What is the reason for this change?
On Mon, Nov 7, 2016 at 3:07 PM, Andrew Morton <[email protected]> wrote:
> On Mon, 7 Nov 2016 13:11:15 -0800 Thomas Garnier <[email protected]> wrote:
>
>> Verify that kmem_create_cache flags are not allocator specific. It is
>> done before removing flags that are not available with the current
>> configuration.
>
> What is the reason for this change?
The current kmem_cache_create removes incorrect flags but do not
validate the callers are using them right. This change will ensure
that callers are not trying to create caches with flags that won't be
used because allocator specific.
It was Christoph's suggestion on the previous versions of the original
patch (the memcg bug fix).
--
Thomas
On Mon, Nov 7, 2016 at 2:49 PM, Andrew Morton <[email protected]> wrote:
> On Mon, 7 Nov 2016 14:32:56 -0800 Thomas Garnier <[email protected]> wrote:
>
>> On Mon, Nov 7, 2016 at 2:19 PM, Andrew Morton <[email protected]> wrote:
>> > On Mon, 7 Nov 2016 13:11:14 -0800 Thomas Garnier <[email protected]> wrote:
>> >
>> >> From: Greg Thelen <[email protected]>
>> >>
>> >> While testing OBJFREELIST_SLAB integration with pagealloc, we found a
>> >> bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
>> >> CFLGS_OBJFREELIST_SLAB.
>> >>
>> >> The original kmem_cache is created early making OFF_SLAB not possible.
>> >> When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
>> >> is enabled it will try to enable it first under certain conditions.
>> >> Given kmem_cache(sys) reuses the original flag, you can have both flags
>> >> at the same time resulting in allocation failures and odd behaviors.
>> >
>> > Can we please have a better description of the problems which this bug
>> > causes? Without this info it's unclear to me which kernel version(s)
>> > need the fix.
>> >
>> > Given that the bug is 6 months old I'm assuming "not very urgent".
>> >
>>
>> I will add more details and send another round.
>
> Please simply send the additional changelog text in this thread -
> processing an entire v4 patch just for a changelog fiddle is rather
> heavyweight.
>
Got it, here is the diff of the previous commit message:
9,10c9
< CFLGS_OBJFREELIST_SLAB. When it happened, critical allocations needed
< for loading drivers or creating new caches will fail.
---
> CFLGS_OBJFREELIST_SLAB.
16c15
< at the same time.
---
> at the same time resulting in allocation failures and odd behaviors.
21,23d19
< The bug exists since 4.6-rc1 and affects testing debug pagealloc
< configurations.
<
26d21
< Signed-off-by: Thomas Garnier <[email protected]>
--
Thomas
On Mon, 7 Nov 2016, Andrew Morton wrote:
> > I will add more details and send another round.
>
> Please simply send the additional changelog text in this thread -
> processing an entire v4 patch just for a changelog fiddle is rather
> heavyweight.
I think this patch is good for future cleanup. We have had a case here
where an internal flag was passed to kmem_cache_create that caused issues
later. This should not happen. We need to guard against this in the
future.
Acked-by: Christoph Lameter <[email protected]>