2023-12-04 19:35:03

by Vlastimil Babka

Subject: [PATCH 0/4] SLUB: cleanup hook processing

This is a spin-off of preparatory patches from the percpu array series
[1], as they are IMHO useful on their own and simple enough to target
6.8, while the percpu array itself is still an RFC.

To avoid non-trivial conflicts, the series is also rebased on top of the
SLAB removal branch [2].

[1] https://lore.kernel.org/all/[email protected]/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/log/?h=slab/for-6.8/slab-removal

---
Vlastimil Babka (4):
mm/slub: fix bulk alloc and free stats
mm/slub: introduce __kmem_cache_free_bulk() without free hooks
mm/slub: handle bulk and single object freeing separately
mm/slub: free KFENCE objects in slab_free_hook()

mm/slub.c | 109 +++++++++++++++++++++++++++++++++++++++++---------------------
1 file changed, 73 insertions(+), 36 deletions(-)
---
base-commit: 4a38e93b3a7e6669c44929fed918b1494e902dd7
change-id: 20231204-slub-cleanup-hooks-f5f54132a61c

Best regards,
--
Vlastimil Babka <[email protected]>


2023-12-04 19:35:12

by Vlastimil Babka

Subject: [PATCH 1/4] mm/slub: fix bulk alloc and free stats

The SLUB sysfs stats enabled by CONFIG_SLUB_STATS have two deficiencies
identified wrt bulk alloc/free operations:

- Bulk allocations from cpu freelist are not counted. Add the
ALLOC_FASTPATH counter there.

- Bulk fastpath freeing will count a list of multiple objects with a
single FREE_FASTPATH inc. Add a stat_add() variant to count them all.
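
As an illustration (not part of the patch; "my_cache" and the object
count below are made up), a bulk cycle like the following sketch
previously bumped FREE_FASTPATH only once per freed list and did not
count cpu-freelist bulk allocations at all; with this patch both
counters advance by the number of objects that actually took the
fastpath:

#include <linux/kernel.h>
#include <linux/slab.h>

/* illustrative only */
static void bulk_stats_example(struct kmem_cache *my_cache)
{
	void *objs[8];
	int allocated;

	allocated = kmem_cache_alloc_bulk(my_cache, GFP_KERNEL,
					  ARRAY_SIZE(objs), objs);
	if (allocated) {
		/* ALLOC_FASTPATH now counts each object taken from the
		 * cpu freelist */
		kmem_cache_free_bulk(my_cache, allocated, objs);
		/* on the free fastpath, FREE_FASTPATH now grows by
		 * 'allocated' instead of by 1 */
	}
}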

Signed-off-by: Vlastimil Babka <[email protected]>
---
mm/slub.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
index 3f8b95757106..d7b0ca6012e0 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -396,6 +396,14 @@ static inline void stat(const struct kmem_cache *s, enum stat_item si)
#endif
}

+static inline
+void stat_add(const struct kmem_cache *s, enum stat_item si, int v)
+{
+#ifdef CONFIG_SLUB_STATS
+ raw_cpu_add(s->cpu_slab->stat[si], v);
+#endif
+}
+
/*
* The slab lists for all objects.
*/
@@ -4268,7 +4276,7 @@ static __always_inline void do_slab_free(struct kmem_cache *s,

local_unlock(&s->cpu_slab->lock);
}
- stat(s, FREE_FASTPATH);
+ stat_add(s, FREE_FASTPATH, cnt);
}
#else /* CONFIG_SLUB_TINY */
static void do_slab_free(struct kmem_cache *s,
@@ -4545,6 +4553,7 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
c->freelist = get_freepointer(s, object);
p[i] = object;
maybe_wipe_obj_freeptr(s, p[i]);
+ stat(s, ALLOC_FASTPATH);
}
c->tid = next_tid(c->tid);
local_unlock_irqrestore(&s->cpu_slab->lock, irqflags);

--
2.43.0

2023-12-04 19:35:42

by Vlastimil Babka

Subject: [PATCH 3/4] mm/slub: handle bulk and single object freeing separately

Currently we have a single function slab_free() handling both single
object freeing and bulk freeing with the necessary hooks, the latter
case requiring slab_free_freelist_hook(). It would, however, be better
to distinguish the two use cases, for the following reasons:

- code simpler to follow for the single object case

- better code generation - although inlining should eliminate
slab_free_freelist_hook() for single object freeing when no debugging
options are enabled, it seems it's not perfect. When e.g. KASAN is
enabled, we're imposing additional unnecessary overhead on single
object freeing.

- preparation to add percpu array caches in near future

Therefore, simplify slab_free() for the single object case by dropping
unnecessary parameters and calling only slab_free_hook() instead of
slab_free_freelist_hook(). Rename the bulk variant to slab_free_bulk()
and adjust callers accordingly.

While at it, flip (and document) the slab_free_hook() return value so
that it returns true when the freeing can proceed. This matches the
logic of slab_free_freelist_hook() and is no longer confusingly the
opposite.
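
For reference, the single-object caller pattern after the flip (as in
the hunk below) reads:

	if (likely(slab_free_hook(s, object, init)))
		do_slab_free(s, slab, object, object, 1, addr);

i.e. "proceed when true", matching slab_free_freelist_hook().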

Additionally we can simplify a bit by changing the tail parameter of
do_slab_free() when freeing a single object - instead of NULL we can set
it equal to head.

bloat-o-meter shows small code reduction with a .config that has KASAN
etc disabled:

add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-118 (-118)
Function                   old     new   delta
kmem_cache_alloc_bulk     1203    1196      -7
kmem_cache_free            861     835     -26
__kmem_cache_free          741     704     -37
kmem_cache_free_bulk       911     863     -48

Signed-off-by: Vlastimil Babka <[email protected]>
---
mm/slub.c | 59 +++++++++++++++++++++++++++++++++++------------------------
1 file changed, 35 insertions(+), 24 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 0742564c4538..ed2fa92e914c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2037,9 +2037,12 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
/*
* Hooks for other subsystems that check memory allocations. In a typical
* production configuration these hooks all should produce no code at all.
+ *
+ * Returns true if freeing of the object can proceed, false if its reuse
+ * was delayed by KASAN quarantine.
*/
-static __always_inline bool slab_free_hook(struct kmem_cache *s,
- void *x, bool init)
+static __always_inline
+bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
{
kmemleak_free_recursive(x, s->flags);
kmsan_slab_free(s, x);
@@ -2072,7 +2075,7 @@ static __always_inline bool slab_free_hook(struct kmem_cache *s,
s->size - s->inuse - rsize);
}
/* KASAN might put x into memory quarantine, delaying its reuse. */
- return kasan_slab_free(s, x, init);
+ return !kasan_slab_free(s, x, init);
}

static inline bool slab_free_freelist_hook(struct kmem_cache *s,
@@ -2082,7 +2085,7 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,

void *object;
void *next = *head;
- void *old_tail = *tail ? *tail : *head;
+ void *old_tail = *tail;

if (is_kfence_address(next)) {
slab_free_hook(s, next, false);
@@ -2098,8 +2101,8 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
next = get_freepointer(s, object);

/* If object's reuse doesn't have to be delayed */
- if (likely(!slab_free_hook(s, object,
- slab_want_init_on_free(s)))) {
+ if (likely(slab_free_hook(s, object,
+ slab_want_init_on_free(s)))) {
/* Move object to the new freelist */
set_freepointer(s, object, *head);
*head = object;
@@ -2114,9 +2117,6 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
}
} while (object != old_tail);

- if (*head == *tail)
- *tail = NULL;
-
return *head != NULL;
}

@@ -4227,7 +4227,6 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
struct slab *slab, void *head, void *tail,
int cnt, unsigned long addr)
{
- void *tail_obj = tail ? : head;
struct kmem_cache_cpu *c;
unsigned long tid;
void **freelist;
@@ -4246,14 +4245,14 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
barrier();

if (unlikely(slab != c->slab)) {
- __slab_free(s, slab, head, tail_obj, cnt, addr);
+ __slab_free(s, slab, head, tail, cnt, addr);
return;
}

if (USE_LOCKLESS_FAST_PATH()) {
freelist = READ_ONCE(c->freelist);

- set_freepointer(s, tail_obj, freelist);
+ set_freepointer(s, tail, freelist);

if (unlikely(!__update_cpu_freelist_fast(s, freelist, head, tid))) {
note_cmpxchg_failure("slab_free", s, tid);
@@ -4270,7 +4269,7 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
tid = c->tid;
freelist = c->freelist;

- set_freepointer(s, tail_obj, freelist);
+ set_freepointer(s, tail, freelist);
c->freelist = head;
c->tid = next_tid(tid);

@@ -4283,15 +4282,27 @@ static void do_slab_free(struct kmem_cache *s,
struct slab *slab, void *head, void *tail,
int cnt, unsigned long addr)
{
- void *tail_obj = tail ? : head;
-
- __slab_free(s, slab, head, tail_obj, cnt, addr);
+ __slab_free(s, slab, head, tail, cnt, addr);
}
#endif /* CONFIG_SLUB_TINY */

-static __fastpath_inline void slab_free(struct kmem_cache *s, struct slab *slab,
- void *head, void *tail, void **p, int cnt,
- unsigned long addr)
+static __fastpath_inline
+void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
+ unsigned long addr)
+{
+ bool init;
+
+ memcg_slab_free_hook(s, slab, &object, 1);
+
+ init = !is_kfence_address(object) && slab_want_init_on_free(s);
+
+ if (likely(slab_free_hook(s, object, init)))
+ do_slab_free(s, slab, object, object, 1, addr);
+}
+
+static __fastpath_inline
+void slab_free_bulk(struct kmem_cache *s, struct slab *slab, void *head,
+ void *tail, void **p, int cnt, unsigned long addr)
{
memcg_slab_free_hook(s, slab, p, cnt);
/*
@@ -4305,7 +4316,7 @@ static __fastpath_inline void slab_free(struct kmem_cache *s, struct slab *slab,
#ifdef CONFIG_KASAN_GENERIC
void ___cache_free(struct kmem_cache *cache, void *x, unsigned long addr)
{
- do_slab_free(cache, virt_to_slab(x), x, NULL, 1, addr);
+ do_slab_free(cache, virt_to_slab(x), x, x, 1, addr);
}
#endif

@@ -4349,7 +4360,7 @@ void kmem_cache_free(struct kmem_cache *s, void *x)
if (!s)
return;
trace_kmem_cache_free(_RET_IP_, x, s);
- slab_free(s, virt_to_slab(x), x, NULL, &x, 1, _RET_IP_);
+ slab_free(s, virt_to_slab(x), x, _RET_IP_);
}
EXPORT_SYMBOL(kmem_cache_free);

@@ -4395,7 +4406,7 @@ void kfree(const void *object)

slab = folio_slab(folio);
s = slab->slab_cache;
- slab_free(s, slab, x, NULL, &x, 1, _RET_IP_);
+ slab_free(s, slab, x, _RET_IP_);
}
EXPORT_SYMBOL(kfree);

@@ -4512,8 +4523,8 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
if (!df.slab)
continue;

- slab_free(df.s, df.slab, df.freelist, df.tail, &p[size], df.cnt,
- _RET_IP_);
+ slab_free_bulk(df.s, df.slab, df.freelist, df.tail, &p[size],
+ df.cnt, _RET_IP_);
} while (likely(size));
}
EXPORT_SYMBOL(kmem_cache_free_bulk);

--
2.43.0

2023-12-04 19:35:42

by Vlastimil Babka

Subject: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()

When freeing an object that was allocated from KFENCE, we do that in the
slowpath __slab_free(), relying on the fact that KFENCE "slab" cannot be
the cpu slab, so the fastpath has to fall back to the slowpath.

This optimization doesn't help much though, because is_kfence_address()
is checked earlier anyway during the free hook processing or detached
freelist building. Thus we can simplify the code by making the
slab_free_hook() free the KFENCE object immediately, similarly to KASAN
quarantine.

In slab_free_hook() we can place kfence_free() above the init
processing, as callers have been making sure to set init to false for
KFENCE objects. This simplifies slab_free(). It also places it above
kasan_slab_free(), which is OK as that skips KFENCE objects anyway.

While at it, also determine the init value in slab_free_freelist_hook()
outside of the loop.

This change will also make introducing per cpu array caches easier.

Tested-by: Marco Elver <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>
---
mm/slub.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index ed2fa92e914c..e38c2b712f6c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2039,7 +2039,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
* production configuration these hooks all should produce no code at all.
*
* Returns true if freeing of the object can proceed, false if its reuse
- * was delayed by KASAN quarantine.
+ * was delayed by KASAN quarantine, or it was returned to KFENCE.
*/
static __always_inline
bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
@@ -2057,6 +2057,9 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
__kcsan_check_access(x, s->object_size,
KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);

+ if (kfence_free(kasan_reset_tag(x)))
+ return false;
+
/*
* As memory initialization might be integrated into KASAN,
* kasan_slab_free and initialization memset's must be
@@ -2086,23 +2089,25 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
void *object;
void *next = *head;
void *old_tail = *tail;
+ bool init;

if (is_kfence_address(next)) {
slab_free_hook(s, next, false);
- return true;
+ return false;
}

/* Head and tail of the reconstructed freelist */
*head = NULL;
*tail = NULL;

+ init = slab_want_init_on_free(s);
+
do {
object = next;
next = get_freepointer(s, object);

/* If object's reuse doesn't have to be delayed */
- if (likely(slab_free_hook(s, object,
- slab_want_init_on_free(s)))) {
+ if (likely(slab_free_hook(s, object, init))) {
/* Move object to the new freelist */
set_freepointer(s, object, *head);
*head = object;
@@ -4103,9 +4108,6 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,

stat(s, FREE_SLOWPATH);

- if (kfence_free(head))
- return;
-
if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
free_to_partial_list(s, slab, head, tail, cnt, addr);
return;
@@ -4290,13 +4292,9 @@ static __fastpath_inline
void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
unsigned long addr)
{
- bool init;
-
memcg_slab_free_hook(s, slab, &object, 1);

- init = !is_kfence_address(object) && slab_want_init_on_free(s);
-
- if (likely(slab_free_hook(s, object, init)))
+ if (likely(slab_free_hook(s, object, slab_want_init_on_free(s))))
do_slab_free(s, slab, object, object, 1, addr);
}


--
2.43.0

2023-12-04 19:36:16

by Vlastimil Babka

Subject: [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free hooks

Currently, when __kmem_cache_alloc_bulk() fails, it frees back the
objects that were allocated before the failure, using
kmem_cache_free_bulk(). Because kmem_cache_free_bulk() calls the free
hooks (KASAN etc.) and those expect objects that were processed by the
post alloc hooks, slab_post_alloc_hook() is called before
kmem_cache_free_bulk().

This is wasteful, although not a big concern in practice for the rare
error path. But in order to efficiently handle percpu array batch refill
and free in the near future, we will also need a variant of
kmem_cache_free_bulk() that avoids the free hooks. So introduce it now
and use it for the failure path.

As a consequence, __kmem_cache_alloc_bulk() no longer needs the objcg
parameter, so remove it.

Signed-off-by: Vlastimil Babka <[email protected]>
---
mm/slub.c | 33 ++++++++++++++++++++++++++-------
1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index d7b0ca6012e0..0742564c4538 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4478,6 +4478,27 @@ int build_detached_freelist(struct kmem_cache *s, size_t size,
return same;
}

+/*
+ * Internal bulk free of objects that were not initialised by the post alloc
+ * hooks and thus should not be processed by the free hooks
+ */
+static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
+{
+ if (!size)
+ return;
+
+ do {
+ struct detached_freelist df;
+
+ size = build_detached_freelist(s, size, p, &df);
+ if (!df.slab)
+ continue;
+
+ do_slab_free(df.s, df.slab, df.freelist, df.tail, df.cnt,
+ _RET_IP_);
+ } while (likely(size));
+}
+
/* Note that interrupts must be enabled when calling this function. */
void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
{
@@ -4499,7 +4520,7 @@ EXPORT_SYMBOL(kmem_cache_free_bulk);

#ifndef CONFIG_SLUB_TINY
static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
- size_t size, void **p, struct obj_cgroup *objcg)
+ size_t size, void **p)
{
struct kmem_cache_cpu *c;
unsigned long irqflags;
@@ -4563,14 +4584,13 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,

error:
slub_put_cpu_ptr(s->cpu_slab);
- slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
- kmem_cache_free_bulk(s, i, p);
+ __kmem_cache_free_bulk(s, i, p);
return 0;

}
#else /* CONFIG_SLUB_TINY */
static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
- size_t size, void **p, struct obj_cgroup *objcg)
+ size_t size, void **p)
{
int i;

@@ -4593,8 +4613,7 @@ static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
return i;

error:
- slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
- kmem_cache_free_bulk(s, i, p);
+ __kmem_cache_free_bulk(s, i, p);
return 0;
}
#endif /* CONFIG_SLUB_TINY */
@@ -4614,7 +4633,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
if (unlikely(!s))
return 0;

- i = __kmem_cache_alloc_bulk(s, flags, size, p, objcg);
+ i = __kmem_cache_alloc_bulk(s, flags, size, p);

/*
* memcg and kmem_cache debug support and memory initialization.

--
2.43.0

2023-12-05 08:13:13

by Chengming Zhou

Subject: Re: [PATCH 1/4] mm/slub: fix bulk alloc and free stats

On 2023/12/5 03:34, Vlastimil Babka wrote:
> The SLUB sysfs stats enabled CONFIG_SLUB_STATS have two deficiencies
> identified wrt bulk alloc/free operations:
>
> - Bulk allocations from cpu freelist are not counted. Add the
> ALLOC_FASTPATH counter there.
>
> - Bulk fastpath freeing will count a list of multiple objects with a
> single FREE_FASTPATH inc. Add a stat_add() variant to count them all.
>
> Signed-off-by: Vlastimil Babka <[email protected]>

Looks good to me!

Reviewed-by: Chengming Zhou <[email protected]>

> ---
> mm/slub.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 3f8b95757106..d7b0ca6012e0 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -396,6 +396,14 @@ static inline void stat(const struct kmem_cache *s, enum stat_item si)
> #endif
> }
>
> +static inline
> +void stat_add(const struct kmem_cache *s, enum stat_item si, int v)
> +{
> +#ifdef CONFIG_SLUB_STATS
> + raw_cpu_add(s->cpu_slab->stat[si], v);
> +#endif
> +}
> +
> /*
> * The slab lists for all objects.
> */
> @@ -4268,7 +4276,7 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
>
> local_unlock(&s->cpu_slab->lock);
> }
> - stat(s, FREE_FASTPATH);
> + stat_add(s, FREE_FASTPATH, cnt);
> }
> #else /* CONFIG_SLUB_TINY */
> static void do_slab_free(struct kmem_cache *s,
> @@ -4545,6 +4553,7 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
> c->freelist = get_freepointer(s, object);
> p[i] = object;
> maybe_wipe_obj_freeptr(s, p[i]);
> + stat(s, ALLOC_FASTPATH);
> }
> c->tid = next_tid(c->tid);
> local_unlock_irqrestore(&s->cpu_slab->lock, irqflags);
>

2023-12-05 08:19:46

by Chengming Zhou

Subject: Re: [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free hooks

On 2023/12/5 03:34, Vlastimil Babka wrote:
> Currently, when __kmem_cache_alloc_bulk() fails, it frees back the
> objects that were allocated before the failure, using
> kmem_cache_free_bulk(). Because kmem_cache_free_bulk() calls the free
> hooks (KASAN etc.) and those expect objects that were processed by the
> post alloc hooks, slab_post_alloc_hook() is called before
> kmem_cache_free_bulk().
>
> This is wasteful, although not a big concern in practice for the rare
> error path. But in order to efficiently handle percpu array batch refill
> and free in the near future, we will also need a variant of
> kmem_cache_free_bulk() that avoids the free hooks. So introduce it now
> and use it for the failure path.
>
> As a consequence, __kmem_cache_alloc_bulk() no longer needs the objcg
> parameter, remove it.

The objects may have been charged before, but it seems __kmem_cache_alloc_bulk()
forgets to uncharge them? I can't find "uncharge" in do_slab_free(), or maybe
the bulk interface won't be used on chargeable slabs?

Thanks.

>
> Signed-off-by: Vlastimil Babka <[email protected]>
> ---
> mm/slub.c | 33 ++++++++++++++++++++++++++-------
> 1 file changed, 26 insertions(+), 7 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index d7b0ca6012e0..0742564c4538 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4478,6 +4478,27 @@ int build_detached_freelist(struct kmem_cache *s, size_t size,
> return same;
> }
>
> +/*
> + * Internal bulk free of objects that were not initialised by the post alloc
> + * hooks and thus should not be processed by the free hooks
> + */
> +static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
> +{
> + if (!size)
> + return;
> +
> + do {
> + struct detached_freelist df;
> +
> + size = build_detached_freelist(s, size, p, &df);
> + if (!df.slab)
> + continue;
> +
> + do_slab_free(df.s, df.slab, df.freelist, df.tail, df.cnt,
> + _RET_IP_);
> + } while (likely(size));
> +}
> +
> /* Note that interrupts must be enabled when calling this function. */
> void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
> {
> @@ -4499,7 +4520,7 @@ EXPORT_SYMBOL(kmem_cache_free_bulk);
>
> #ifndef CONFIG_SLUB_TINY
> static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
> - size_t size, void **p, struct obj_cgroup *objcg)
> + size_t size, void **p)
> {
> struct kmem_cache_cpu *c;
> unsigned long irqflags;
> @@ -4563,14 +4584,13 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
>
> error:
> slub_put_cpu_ptr(s->cpu_slab);
> - slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
> - kmem_cache_free_bulk(s, i, p);
> + __kmem_cache_free_bulk(s, i, p);
> return 0;
>
> }
> #else /* CONFIG_SLUB_TINY */
> static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
> - size_t size, void **p, struct obj_cgroup *objcg)
> + size_t size, void **p)
> {
> int i;
>
> @@ -4593,8 +4613,7 @@ static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
> return i;
>
> error:
> - slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
> - kmem_cache_free_bulk(s, i, p);
> + __kmem_cache_free_bulk(s, i, p);
> return 0;
> }
> #endif /* CONFIG_SLUB_TINY */
> @@ -4614,7 +4633,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
> if (unlikely(!s))
> return 0;
>
> - i = __kmem_cache_alloc_bulk(s, flags, size, p, objcg);
> + i = __kmem_cache_alloc_bulk(s, flags, size, p);
>
> /*
> * memcg and kmem_cache debug support and memory initialization.
>

2023-12-05 13:24:05

by Chengming Zhou

Subject: Re: [PATCH 3/4] mm/slub: handle bulk and single object freeing separately

On 2023/12/5 03:34, Vlastimil Babka wrote:
> Currently we have a single function slab_free() handling both single
> object freeing and bulk freeing with necessary hooks, the latter case
> requiring slab_free_freelist_hook(). It should be however better to
> distinguish the two use cases for the following reasons:
>
> - code simpler to follow for the single object case
>
> - better code generation - although inlining should eliminate the
> slab_free_freelist_hook() for single object freeing in case no
> debugging options are enabled, it seems it's not perfect. When e.g.
> KASAN is enabled, we're imposing additional unnecessary overhead for
> single object freeing.
>
> - preparation to add percpu array caches in near future
>
> Therefore, simplify slab_free() for the single object case by dropping
> unnecessary parameters and calling only slab_free_hook() instead of
> slab_free_freelist_hook(). Rename the bulk variant to slab_free_bulk()
> and adjust callers accordingly.
>
> While at it, flip (and document) slab_free_hook() return value so that
> it returns true when the freeing can proceed, which matches the logic of
> slab_free_freelist_hook() and is not confusingly the opposite.
>
> Additionally we can simplify a bit by changing the tail parameter of
> do_slab_free() when freeing a single object - instead of NULL we can set
> it equal to head.
>
> bloat-o-meter shows small code reduction with a .config that has KASAN
> etc disabled:
>
> add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-118 (-118)
> Function old new delta
> kmem_cache_alloc_bulk 1203 1196 -7
> kmem_cache_free 861 835 -26
> __kmem_cache_free 741 704 -37
> kmem_cache_free_bulk 911 863 -48
>
> Signed-off-by: Vlastimil Babka <[email protected]>

Looks good to me.

Reviewed-by: Chengming Zhou <[email protected]>

Thanks!

> ---
> mm/slub.c | 59 +++++++++++++++++++++++++++++++++++------------------------
> 1 file changed, 35 insertions(+), 24 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 0742564c4538..ed2fa92e914c 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2037,9 +2037,12 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
> /*
> * Hooks for other subsystems that check memory allocations. In a typical
> * production configuration these hooks all should produce no code at all.
> + *
> + * Returns true if freeing of the object can proceed, false if its reuse
> + * was delayed by KASAN quarantine.
> */
> -static __always_inline bool slab_free_hook(struct kmem_cache *s,
> - void *x, bool init)
> +static __always_inline
> +bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
> {
> kmemleak_free_recursive(x, s->flags);
> kmsan_slab_free(s, x);
> @@ -2072,7 +2075,7 @@ static __always_inline bool slab_free_hook(struct kmem_cache *s,
> s->size - s->inuse - rsize);
> }
> /* KASAN might put x into memory quarantine, delaying its reuse. */
> - return kasan_slab_free(s, x, init);
> + return !kasan_slab_free(s, x, init);
> }
>
> static inline bool slab_free_freelist_hook(struct kmem_cache *s,
> @@ -2082,7 +2085,7 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
>
> void *object;
> void *next = *head;
> - void *old_tail = *tail ? *tail : *head;
> + void *old_tail = *tail;
>
> if (is_kfence_address(next)) {
> slab_free_hook(s, next, false);
> @@ -2098,8 +2101,8 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
> next = get_freepointer(s, object);
>
> /* If object's reuse doesn't have to be delayed */
> - if (likely(!slab_free_hook(s, object,
> - slab_want_init_on_free(s)))) {
> + if (likely(slab_free_hook(s, object,
> + slab_want_init_on_free(s)))) {
> /* Move object to the new freelist */
> set_freepointer(s, object, *head);
> *head = object;
> @@ -2114,9 +2117,6 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
> }
> } while (object != old_tail);
>
> - if (*head == *tail)
> - *tail = NULL;
> -
> return *head != NULL;
> }
>
> @@ -4227,7 +4227,6 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
> struct slab *slab, void *head, void *tail,
> int cnt, unsigned long addr)
> {
> - void *tail_obj = tail ? : head;
> struct kmem_cache_cpu *c;
> unsigned long tid;
> void **freelist;
> @@ -4246,14 +4245,14 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
> barrier();
>
> if (unlikely(slab != c->slab)) {
> - __slab_free(s, slab, head, tail_obj, cnt, addr);
> + __slab_free(s, slab, head, tail, cnt, addr);
> return;
> }
>
> if (USE_LOCKLESS_FAST_PATH()) {
> freelist = READ_ONCE(c->freelist);
>
> - set_freepointer(s, tail_obj, freelist);
> + set_freepointer(s, tail, freelist);
>
> if (unlikely(!__update_cpu_freelist_fast(s, freelist, head, tid))) {
> note_cmpxchg_failure("slab_free", s, tid);
> @@ -4270,7 +4269,7 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
> tid = c->tid;
> freelist = c->freelist;
>
> - set_freepointer(s, tail_obj, freelist);
> + set_freepointer(s, tail, freelist);
> c->freelist = head;
> c->tid = next_tid(tid);
>
> @@ -4283,15 +4282,27 @@ static void do_slab_free(struct kmem_cache *s,
> struct slab *slab, void *head, void *tail,
> int cnt, unsigned long addr)
> {
> - void *tail_obj = tail ? : head;
> -
> - __slab_free(s, slab, head, tail_obj, cnt, addr);
> + __slab_free(s, slab, head, tail, cnt, addr);
> }
> #endif /* CONFIG_SLUB_TINY */
>
> -static __fastpath_inline void slab_free(struct kmem_cache *s, struct slab *slab,
> - void *head, void *tail, void **p, int cnt,
> - unsigned long addr)
> +static __fastpath_inline
> +void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
> + unsigned long addr)
> +{
> + bool init;
> +
> + memcg_slab_free_hook(s, slab, &object, 1);
> +
> + init = !is_kfence_address(object) && slab_want_init_on_free(s);
> +
> + if (likely(slab_free_hook(s, object, init)))
> + do_slab_free(s, slab, object, object, 1, addr);
> +}
> +
> +static __fastpath_inline
> +void slab_free_bulk(struct kmem_cache *s, struct slab *slab, void *head,
> + void *tail, void **p, int cnt, unsigned long addr)
> {
> memcg_slab_free_hook(s, slab, p, cnt);
> /*
> @@ -4305,7 +4316,7 @@ static __fastpath_inline void slab_free(struct kmem_cache *s, struct slab *slab,
> #ifdef CONFIG_KASAN_GENERIC
> void ___cache_free(struct kmem_cache *cache, void *x, unsigned long addr)
> {
> - do_slab_free(cache, virt_to_slab(x), x, NULL, 1, addr);
> + do_slab_free(cache, virt_to_slab(x), x, x, 1, addr);
> }
> #endif
>
> @@ -4349,7 +4360,7 @@ void kmem_cache_free(struct kmem_cache *s, void *x)
> if (!s)
> return;
> trace_kmem_cache_free(_RET_IP_, x, s);
> - slab_free(s, virt_to_slab(x), x, NULL, &x, 1, _RET_IP_);
> + slab_free(s, virt_to_slab(x), x, _RET_IP_);
> }
> EXPORT_SYMBOL(kmem_cache_free);
>
> @@ -4395,7 +4406,7 @@ void kfree(const void *object)
>
> slab = folio_slab(folio);
> s = slab->slab_cache;
> - slab_free(s, slab, x, NULL, &x, 1, _RET_IP_);
> + slab_free(s, slab, x, _RET_IP_);
> }
> EXPORT_SYMBOL(kfree);
>
> @@ -4512,8 +4523,8 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
> if (!df.slab)
> continue;
>
> - slab_free(df.s, df.slab, df.freelist, df.tail, &p[size], df.cnt,
> - _RET_IP_);
> + slab_free_bulk(df.s, df.slab, df.freelist, df.tail, &p[size],
> + df.cnt, _RET_IP_);
> } while (likely(size));
> }
> EXPORT_SYMBOL(kmem_cache_free_bulk);
>

2023-12-05 13:27:47

by Chengming Zhou

Subject: Re: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()

On 2023/12/5 03:34, Vlastimil Babka wrote:
> When freeing an object that was allocated from KFENCE, we do that in the
> slowpath __slab_free(), relying on the fact that KFENCE "slab" cannot be
> the cpu slab, so the fastpath has to fallback to the slowpath.
>
> This optimization doesn't help much though, because is_kfence_address()
> is checked earlier anyway during the free hook processing or detached
> freelist building. Thus we can simplify the code by making the
> slab_free_hook() free the KFENCE object immediately, similarly to KASAN
> quarantine.
>
> In slab_free_hook() we can place kfence_free() above init processing, as
> callers have been making sure to set init to false for KFENCE objects.
> This simplifies slab_free(). This places it also above kasan_slab_free()
> which is ok as that skips KFENCE objects anyway.
>
> While at it also determine the init value in slab_free_freelist_hook()
> outside of the loop.
>
> This change will also make introducing per cpu array caches easier.
>
> Tested-by: Marco Elver <[email protected]>
> Signed-off-by: Vlastimil Babka <[email protected]>
> ---
> mm/slub.c | 22 ++++++++++------------
> 1 file changed, 10 insertions(+), 12 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index ed2fa92e914c..e38c2b712f6c 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2039,7 +2039,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
> * production configuration these hooks all should produce no code at all.
> *
> * Returns true if freeing of the object can proceed, false if its reuse
> - * was delayed by KASAN quarantine.
> + * was delayed by KASAN quarantine, or it was returned to KFENCE.
> */
> static __always_inline
> bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
> @@ -2057,6 +2057,9 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
> __kcsan_check_access(x, s->object_size,
> KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);
>
> + if (kfence_free(kasan_reset_tag(x)))

I'm wondering if "kasan_reset_tag()" is needed here?

The patch looks good to me!

Reviewed-by: Chengming Zhou <[email protected]>

Thanks.

> + return false;
> +
> /*
> * As memory initialization might be integrated into KASAN,
> * kasan_slab_free and initialization memset's must be
> @@ -2086,23 +2089,25 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
> void *object;
> void *next = *head;
> void *old_tail = *tail;
> + bool init;
>
> if (is_kfence_address(next)) {
> slab_free_hook(s, next, false);
> - return true;
> + return false;
> }
>
> /* Head and tail of the reconstructed freelist */
> *head = NULL;
> *tail = NULL;
>
> + init = slab_want_init_on_free(s);
> +
> do {
> object = next;
> next = get_freepointer(s, object);
>
> /* If object's reuse doesn't have to be delayed */
> - if (likely(slab_free_hook(s, object,
> - slab_want_init_on_free(s)))) {
> + if (likely(slab_free_hook(s, object, init))) {
> /* Move object to the new freelist */
> set_freepointer(s, object, *head);
> *head = object;
> @@ -4103,9 +4108,6 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
>
> stat(s, FREE_SLOWPATH);
>
> - if (kfence_free(head))
> - return;
> -
> if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
> free_to_partial_list(s, slab, head, tail, cnt, addr);
> return;
> @@ -4290,13 +4292,9 @@ static __fastpath_inline
> void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
> unsigned long addr)
> {
> - bool init;
> -
> memcg_slab_free_hook(s, slab, &object, 1);
>
> - init = !is_kfence_address(object) && slab_want_init_on_free(s);
> -
> - if (likely(slab_free_hook(s, object, init)))
> + if (likely(slab_free_hook(s, object, slab_want_init_on_free(s))))
> do_slab_free(s, slab, object, object, 1, addr);
> }
>
>

2023-12-05 19:57:49

by Vlastimil Babka

Subject: Re: [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free hooks

On 12/5/23 09:19, Chengming Zhou wrote:
> On 2023/12/5 03:34, Vlastimil Babka wrote:
>> Currently, when __kmem_cache_alloc_bulk() fails, it frees back the
>> objects that were allocated before the failure, using
>> kmem_cache_free_bulk(). Because kmem_cache_free_bulk() calls the free
>> hooks (KASAN etc.) and those expect objects that were processed by the
>> post alloc hooks, slab_post_alloc_hook() is called before
>> kmem_cache_free_bulk().
>>
>> This is wasteful, although not a big concern in practice for the rare
>> error path. But in order to efficiently handle percpu array batch refill
>> and free in the near future, we will also need a variant of
>> kmem_cache_free_bulk() that avoids the free hooks. So introduce it now
>> and use it for the failure path.
>>
>> As a consequence, __kmem_cache_alloc_bulk() no longer needs the objcg
>> parameter, remove it.
>
> The objects may have been charged before, but it seems __kmem_cache_alloc_bulk()
> forget to uncharge them? I can't find "uncharge" in do_slab_free(), or maybe
> the bulk interface won't be used on chargeable slab?

You're right! I missed that the memcg_pre_alloc_hook() already does the
charging, so we need to uncharge. How does this look? Thanks for noticing!

----8<----
From 52f8e77fdfeabffffdce6b761ba5508e940df3be Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <[email protected]>
Date: Thu, 2 Nov 2023 16:34:39 +0100
Subject: [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free
hooks

Currently, when __kmem_cache_alloc_bulk() fails, it frees back the
objects that were allocated before the failure, using
kmem_cache_free_bulk(). Because kmem_cache_free_bulk() calls the free
hooks (KASAN etc.) and those expect objects that were processed by the
post alloc hooks, slab_post_alloc_hook() is called before
kmem_cache_free_bulk().

This is wasteful, although not a big concern in practice for the rare
error path. But in order to efficiently handle percpu array batch refill
and free in the near future, we will also need a variant of
kmem_cache_free_bulk() that avoids the free hooks. So introduce it now
and use it for the failure path.

In case of failure, however, we still need to perform the memcg
uncharge, so handle that in a new memcg_slab_alloc_error_hook(). Thanks
to Chengming Zhou for noticing the missing uncharge.

As a consequence, __kmem_cache_alloc_bulk() no longer needs the objcg
parameter, so remove it.

Signed-off-by: Vlastimil Babka <[email protected]>
---
mm/slub.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 47 insertions(+), 9 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index d7b0ca6012e0..0a9e4bd0dd68 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2003,6 +2003,14 @@ void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,

__memcg_slab_free_hook(s, slab, p, objects, objcgs);
}
+
+static inline
+void memcg_slab_alloc_error_hook(struct kmem_cache *s, int objects,
+ struct obj_cgroup *objcg)
+{
+ if (objcg)
+ obj_cgroup_uncharge(objcg, objects * obj_full_size(s));
+}
#else /* CONFIG_MEMCG_KMEM */
static inline struct mem_cgroup *memcg_from_slab_obj(void *ptr)
{
@@ -2032,6 +2040,12 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
void **p, int objects)
{
}
+
+static inline
+void memcg_slab_alloc_error_hook(struct kmem_cache *s, int objects,
+ struct obj_cgroup *objcg)
+{
+}
#endif /* CONFIG_MEMCG_KMEM */

/*
@@ -4478,6 +4492,27 @@ int build_detached_freelist(struct kmem_cache *s, size_t size,
return same;
}

+/*
+ * Internal bulk free of objects that were not initialised by the post alloc
+ * hooks and thus should not be processed by the free hooks
+ */
+static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
+{
+ if (!size)
+ return;
+
+ do {
+ struct detached_freelist df;
+
+ size = build_detached_freelist(s, size, p, &df);
+ if (!df.slab)
+ continue;
+
+ do_slab_free(df.s, df.slab, df.freelist, df.tail, df.cnt,
+ _RET_IP_);
+ } while (likely(size));
+}
+
/* Note that interrupts must be enabled when calling this function. */
void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
{
@@ -4498,8 +4533,9 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
EXPORT_SYMBOL(kmem_cache_free_bulk);

#ifndef CONFIG_SLUB_TINY
-static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
- size_t size, void **p, struct obj_cgroup *objcg)
+static inline
+int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
+ void **p)
{
struct kmem_cache_cpu *c;
unsigned long irqflags;
@@ -4563,14 +4599,13 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,

error:
slub_put_cpu_ptr(s->cpu_slab);
- slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
- kmem_cache_free_bulk(s, i, p);
+ __kmem_cache_free_bulk(s, i, p);
return 0;

}
#else /* CONFIG_SLUB_TINY */
static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
- size_t size, void **p, struct obj_cgroup *objcg)
+ size_t size, void **p)
{
int i;

@@ -4593,8 +4628,7 @@ static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
return i;

error:
- slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
- kmem_cache_free_bulk(s, i, p);
+ __kmem_cache_free_bulk(s, i, p);
return 0;
}
#endif /* CONFIG_SLUB_TINY */
@@ -4614,15 +4648,19 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
if (unlikely(!s))
return 0;

- i = __kmem_cache_alloc_bulk(s, flags, size, p, objcg);
+ i = __kmem_cache_alloc_bulk(s, flags, size, p);

/*
* memcg and kmem_cache debug support and memory initialization.
* Done outside of the IRQ disabled fastpath loop.
*/
- if (i != 0)
+ if (likely(i != 0)) {
slab_post_alloc_hook(s, objcg, flags, size, p,
slab_want_init_on_alloc(flags, s), s->object_size);
+ } else {
+ memcg_slab_alloc_error_hook(s, size, objcg);
+ }
+
return i;
}
EXPORT_SYMBOL(kmem_cache_alloc_bulk);
--
2.43.0


2023-12-06 00:33:20

by Chengming Zhou

Subject: Re: [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free hooks

On 2023/12/6 03:57, Vlastimil Babka wrote:
> On 12/5/23 09:19, Chengming Zhou wrote:
>> On 2023/12/5 03:34, Vlastimil Babka wrote:
>>> Currently, when __kmem_cache_alloc_bulk() fails, it frees back the
>>> objects that were allocated before the failure, using
>>> kmem_cache_free_bulk(). Because kmem_cache_free_bulk() calls the free
>>> hooks (KASAN etc.) and those expect objects that were processed by the
>>> post alloc hooks, slab_post_alloc_hook() is called before
>>> kmem_cache_free_bulk().
>>>
>>> This is wasteful, although not a big concern in practice for the rare
>>> error path. But in order to efficiently handle percpu array batch refill
>>> and free in the near future, we will also need a variant of
>>> kmem_cache_free_bulk() that avoids the free hooks. So introduce it now
>>> and use it for the failure path.
>>>
>>> As a consequence, __kmem_cache_alloc_bulk() no longer needs the objcg
>>> parameter, remove it.
>>
>> The objects may have been charged before, but it seems __kmem_cache_alloc_bulk()
>> forget to uncharge them? I can't find "uncharge" in do_slab_free(), or maybe
>> the bulk interface won't be used on chargeable slab?
>
> You're right! I missed that the memcg_pre_alloc_hook() already does the
> charging, so we need to uncharge. How does this look? Thanks for noticing!
>
> ----8<----
> From 52f8e77fdfeabffffdce6b761ba5508e940df3be Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka <[email protected]>
> Date: Thu, 2 Nov 2023 16:34:39 +0100
> Subject: [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free
> hooks
>
> Currently, when __kmem_cache_alloc_bulk() fails, it frees back the
> objects that were allocated before the failure, using
> kmem_cache_free_bulk(). Because kmem_cache_free_bulk() calls the free
> hooks (KASAN etc.) and those expect objects that were processed by the
> post alloc hooks, slab_post_alloc_hook() is called before
> kmem_cache_free_bulk().
>
> This is wasteful, although not a big concern in practice for the rare
> error path. But in order to efficiently handle percpu array batch refill
> and free in the near future, we will also need a variant of
> kmem_cache_free_bulk() that avoids the free hooks. So introduce it now
> and use it for the failure path.
>
> In case of failure we however still need to perform memcg uncharge so
> handle that in a new memcg_slab_alloc_error_hook(). Thanks to Chengming
> Zhou for noticing the missing uncharge.
>
> As a consequence, __kmem_cache_alloc_bulk() no longer needs the objcg
> parameter, remove it.
>
> Signed-off-by: Vlastimil Babka <[email protected]>

Looks good to me!

Reviewed-by: Chengming Zhou <[email protected]>

Thanks!

> ---
> mm/slub.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++---------
> 1 file changed, 47 insertions(+), 9 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index d7b0ca6012e0..0a9e4bd0dd68 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2003,6 +2003,14 @@ void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
>
> __memcg_slab_free_hook(s, slab, p, objects, objcgs);
> }
> +
> +static inline
> +void memcg_slab_alloc_error_hook(struct kmem_cache *s, int objects,
> + struct obj_cgroup *objcg)
> +{
> + if (objcg)
> + obj_cgroup_uncharge(objcg, objects * obj_full_size(s));
> +}
> #else /* CONFIG_MEMCG_KMEM */
> static inline struct mem_cgroup *memcg_from_slab_obj(void *ptr)
> {
> @@ -2032,6 +2040,12 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
> void **p, int objects)
> {
> }
> +
> +static inline
> +void memcg_slab_alloc_error_hook(struct kmem_cache *s, int objects,
> + struct obj_cgroup *objcg)
> +{
> +}
> #endif /* CONFIG_MEMCG_KMEM */
>
> /*
> @@ -4478,6 +4492,27 @@ int build_detached_freelist(struct kmem_cache *s, size_t size,
> return same;
> }
>
> +/*
> + * Internal bulk free of objects that were not initialised by the post alloc
> + * hooks and thus should not be processed by the free hooks
> + */
> +static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
> +{
> + if (!size)
> + return;
> +
> + do {
> + struct detached_freelist df;
> +
> + size = build_detached_freelist(s, size, p, &df);
> + if (!df.slab)
> + continue;
> +
> + do_slab_free(df.s, df.slab, df.freelist, df.tail, df.cnt,
> + _RET_IP_);
> + } while (likely(size));
> +}
> +
> /* Note that interrupts must be enabled when calling this function. */
> void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
> {
> @@ -4498,8 +4533,9 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
> EXPORT_SYMBOL(kmem_cache_free_bulk);
>
> #ifndef CONFIG_SLUB_TINY
> -static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
> - size_t size, void **p, struct obj_cgroup *objcg)
> +static inline
> +int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
> + void **p)
> {
> struct kmem_cache_cpu *c;
> unsigned long irqflags;
> @@ -4563,14 +4599,13 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
>
> error:
> slub_put_cpu_ptr(s->cpu_slab);
> - slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
> - kmem_cache_free_bulk(s, i, p);
> + __kmem_cache_free_bulk(s, i, p);
> return 0;
>
> }
> #else /* CONFIG_SLUB_TINY */
> static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
> - size_t size, void **p, struct obj_cgroup *objcg)
> + size_t size, void **p)
> {
> int i;
>
> @@ -4593,8 +4628,7 @@ static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
> return i;
>
> error:
> - slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
> - kmem_cache_free_bulk(s, i, p);
> + __kmem_cache_free_bulk(s, i, p);
> return 0;
> }
> #endif /* CONFIG_SLUB_TINY */
> @@ -4614,15 +4648,19 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
> if (unlikely(!s))
> return 0;
>
> - i = __kmem_cache_alloc_bulk(s, flags, size, p, objcg);
> + i = __kmem_cache_alloc_bulk(s, flags, size, p);
>
> /*
> * memcg and kmem_cache debug support and memory initialization.
> * Done outside of the IRQ disabled fastpath loop.
> */
> - if (i != 0)
> + if (likely(i != 0)) {
> slab_post_alloc_hook(s, objcg, flags, size, p,
> slab_want_init_on_alloc(flags, s), s->object_size);
> + } else {
> + memcg_slab_alloc_error_hook(s, size, objcg);
> + }
> +
> return i;
> }
> EXPORT_SYMBOL(kmem_cache_alloc_bulk);

2023-12-06 09:58:36

by Vlastimil Babka

Subject: Re: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()

On 12/5/23 14:27, Chengming Zhou wrote:
> On 2023/12/5 03:34, Vlastimil Babka wrote:
>> When freeing an object that was allocated from KFENCE, we do that in the
>> slowpath __slab_free(), relying on the fact that KFENCE "slab" cannot be
>> the cpu slab, so the fastpath has to fallback to the slowpath.
>>
>> This optimization doesn't help much though, because is_kfence_address()
>> is checked earlier anyway during the free hook processing or detached
>> freelist building. Thus we can simplify the code by making the
>> slab_free_hook() free the KFENCE object immediately, similarly to KASAN
>> quarantine.
>>
>> In slab_free_hook() we can place kfence_free() above init processing, as
>> callers have been making sure to set init to false for KFENCE objects.
>> This simplifies slab_free(). This places it also above kasan_slab_free()
>> which is ok as that skips KFENCE objects anyway.
>>
>> While at it also determine the init value in slab_free_freelist_hook()
>> outside of the loop.
>>
>> This change will also make introducing per cpu array caches easier.
>>
>> Tested-by: Marco Elver <[email protected]>
>> Signed-off-by: Vlastimil Babka <[email protected]>
>> ---
>> mm/slub.c | 22 ++++++++++------------
>> 1 file changed, 10 insertions(+), 12 deletions(-)
>>
>> diff --git a/mm/slub.c b/mm/slub.c
>> index ed2fa92e914c..e38c2b712f6c 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -2039,7 +2039,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
>> * production configuration these hooks all should produce no code at all.
>> *
>> * Returns true if freeing of the object can proceed, false if its reuse
>> - * was delayed by KASAN quarantine.
>> + * was delayed by KASAN quarantine, or it was returned to KFENCE.
>> */
>> static __always_inline
>> bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
>> @@ -2057,6 +2057,9 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
>> __kcsan_check_access(x, s->object_size,
>> KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);
>>
>> + if (kfence_free(kasan_reset_tag(x)))
>
> I'm wondering if "kasan_reset_tag()" is needed here?

I think so, because AFAICS the is_kfence_address() check in kfence_free()
could be a false negative otherwise. In fact now I even question some of the
other is_kfence_address() checks in mm/slub.c, mainly
build_detached_freelist() which starts from pointers coming directly from
slab users. Insight from KASAN/KFENCE folks appreciated :)
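
For reference, is_kfence_address() is roughly the following (simplified
sketch based on include/linux/kfence.h, from memory):

	static __always_inline bool is_kfence_address(const void *addr)
	{
		/*
		 * Unsigned pointer arithmetic: an address whose top byte
		 * carries a KASAN_HW_TAGS tag ends up far outside
		 * [__kfence_pool, __kfence_pool + KFENCE_POOL_SIZE), so a
		 * genuine KFENCE object pointer that still carried a tag
		 * would compare as "not a kfence address" unless the tag
		 * is reset first.
		 */
		return unlikely((unsigned long)((char *)addr - __kfence_pool) <
				KFENCE_POOL_SIZE && __kfence_pool);
	}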

> The patch looks good to me!
>
> Reviewed-by: Chengming Zhou <[email protected]>

Thanks!

> Thanks.
>
>> + return false;
>> +
>> /*
>> * As memory initialization might be integrated into KASAN,
>> * kasan_slab_free and initialization memset's must be
>> @@ -2086,23 +2089,25 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
>> void *object;
>> void *next = *head;
>> void *old_tail = *tail;
>> + bool init;
>>
>> if (is_kfence_address(next)) {
>> slab_free_hook(s, next, false);
>> - return true;
>> + return false;
>> }
>>
>> /* Head and tail of the reconstructed freelist */
>> *head = NULL;
>> *tail = NULL;
>>
>> + init = slab_want_init_on_free(s);
>> +
>> do {
>> object = next;
>> next = get_freepointer(s, object);
>>
>> /* If object's reuse doesn't have to be delayed */
>> - if (likely(slab_free_hook(s, object,
>> - slab_want_init_on_free(s)))) {
>> + if (likely(slab_free_hook(s, object, init))) {
>> /* Move object to the new freelist */
>> set_freepointer(s, object, *head);
>> *head = object;
>> @@ -4103,9 +4108,6 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
>>
>> stat(s, FREE_SLOWPATH);
>>
>> - if (kfence_free(head))
>> - return;
>> -
>> if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
>> free_to_partial_list(s, slab, head, tail, cnt, addr);
>> return;
>> @@ -4290,13 +4292,9 @@ static __fastpath_inline
>> void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
>> unsigned long addr)
>> {
>> - bool init;
>> -
>> memcg_slab_free_hook(s, slab, &object, 1);
>>
>> - init = !is_kfence_address(object) && slab_want_init_on_free(s);
>> -
>> - if (likely(slab_free_hook(s, object, init)))
>> + if (likely(slab_free_hook(s, object, slab_want_init_on_free(s))))
>> do_slab_free(s, slab, object, object, 1, addr);
>> }
>>
>>

2023-12-06 13:02:53

by Chengming Zhou

Subject: Re: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()

On 2023/12/6 17:58, Vlastimil Babka wrote:
> On 12/5/23 14:27, Chengming Zhou wrote:
>> On 2023/12/5 03:34, Vlastimil Babka wrote:
>>> When freeing an object that was allocated from KFENCE, we do that in the
>>> slowpath __slab_free(), relying on the fact that KFENCE "slab" cannot be
>>> the cpu slab, so the fastpath has to fallback to the slowpath.
>>>
>>> This optimization doesn't help much though, because is_kfence_address()
>>> is checked earlier anyway during the free hook processing or detached
>>> freelist building. Thus we can simplify the code by making the
>>> slab_free_hook() free the KFENCE object immediately, similarly to KASAN
>>> quarantine.
>>>
>>> In slab_free_hook() we can place kfence_free() above init processing, as
>>> callers have been making sure to set init to false for KFENCE objects.
>>> This simplifies slab_free(). This places it also above kasan_slab_free()
>>> which is ok as that skips KFENCE objects anyway.
>>>
>>> While at it also determine the init value in slab_free_freelist_hook()
>>> outside of the loop.
>>>
>>> This change will also make introducing per cpu array caches easier.
>>>
>>> Tested-by: Marco Elver <[email protected]>
>>> Signed-off-by: Vlastimil Babka <[email protected]>
>>> ---
>>> mm/slub.c | 22 ++++++++++------------
>>> 1 file changed, 10 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/mm/slub.c b/mm/slub.c
>>> index ed2fa92e914c..e38c2b712f6c 100644
>>> --- a/mm/slub.c
>>> +++ b/mm/slub.c
>>> @@ -2039,7 +2039,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
>>> * production configuration these hooks all should produce no code at all.
>>> *
>>> * Returns true if freeing of the object can proceed, false if its reuse
>>> - * was delayed by KASAN quarantine.
>>> + * was delayed by KASAN quarantine, or it was returned to KFENCE.
>>> */
>>> static __always_inline
>>> bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
>>> @@ -2057,6 +2057,9 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
>>> __kcsan_check_access(x, s->object_size,
>>> KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);
>>>
>>> + if (kfence_free(kasan_reset_tag(x)))
>>
>> I'm wondering if "kasan_reset_tag()" is needed here?
>
> I think so, because AFAICS the is_kfence_address() check in kfence_free()
> could be a false negative otherwise. In fact now I even question some of the

Ok.

> other is_kfence_address() checks in mm/slub.c, mainly
> build_detached_freelist() which starts from pointers coming directly from
> slab users. Insight from KASAN/KFENCE folks appreciated :)
>
I know very little about KASAN/KFENCE, so I'm looking forward to their insight. :)

Just saw a check in __kasan_slab_alloc():

	if (is_kfence_address(object))
		return (void *)object;

So thought it seems that a kfence object would be skipped by KASAN.

Thanks!

2023-12-06 14:47:21

by Marco Elver

Subject: Re: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()

On Wed, 6 Dec 2023 at 14:02, Chengming Zhou <[email protected]> wrote:
>
> On 2023/12/6 17:58, Vlastimil Babka wrote:
> > On 12/5/23 14:27, Chengming Zhou wrote:
> >> On 2023/12/5 03:34, Vlastimil Babka wrote:
> >>> When freeing an object that was allocated from KFENCE, we do that in the
> >>> slowpath __slab_free(), relying on the fact that KFENCE "slab" cannot be
> >>> the cpu slab, so the fastpath has to fallback to the slowpath.
> >>>
> >>> This optimization doesn't help much though, because is_kfence_address()
> >>> is checked earlier anyway during the free hook processing or detached
> >>> freelist building. Thus we can simplify the code by making the
> >>> slab_free_hook() free the KFENCE object immediately, similarly to KASAN
> >>> quarantine.
> >>>
> >>> In slab_free_hook() we can place kfence_free() above init processing, as
> >>> callers have been making sure to set init to false for KFENCE objects.
> >>> This simplifies slab_free(). This places it also above kasan_slab_free()
> >>> which is ok as that skips KFENCE objects anyway.
> >>>
> >>> While at it also determine the init value in slab_free_freelist_hook()
> >>> outside of the loop.
> >>>
> >>> This change will also make introducing per cpu array caches easier.
> >>>
> >>> Tested-by: Marco Elver <[email protected]>
> >>> Signed-off-by: Vlastimil Babka <[email protected]>
> >>> ---
> >>> mm/slub.c | 22 ++++++++++------------
> >>> 1 file changed, 10 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/mm/slub.c b/mm/slub.c
> >>> index ed2fa92e914c..e38c2b712f6c 100644
> >>> --- a/mm/slub.c
> >>> +++ b/mm/slub.c
> >>> @@ -2039,7 +2039,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
> >>> * production configuration these hooks all should produce no code at all.
> >>> *
> >>> * Returns true if freeing of the object can proceed, false if its reuse
> >>> - * was delayed by KASAN quarantine.
> >>> + * was delayed by KASAN quarantine, or it was returned to KFENCE.
> >>> */
> >>> static __always_inline
> >>> bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
> >>> @@ -2057,6 +2057,9 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
> >>> __kcsan_check_access(x, s->object_size,
> >>> KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);
> >>>
> >>> + if (kfence_free(kasan_reset_tag(x)))
> >>
> >> I'm wondering if "kasan_reset_tag()" is needed here?
> >
> > I think so, because AFAICS the is_kfence_address() check in kfence_free()
> > could be a false negative otherwise. In fact now I even question some of the
>
> Ok.
>
> > other is_kfence_address() checks in mm/slub.c, mainly
> > build_detached_freelist() which starts from pointers coming directly from
> > slab users. Insight from KASAN/KFENCE folks appreciated :)
> >
> I know very little about KASAN/KFENCE, looking forward to their insight. :)
>
> Just saw a check in __kasan_slab_alloc():
>
> if (is_kfence_address(object))
> return (void *)object;
>
> So thought it seems that a kfence object would be skipped by KASAN.

The is_kfence_address() implementation tolerates tagged addresses,
i.e. if it receives a tagged non-kfence address, it will never return
true.

The KASAN_HW_TAGS patches and KFENCE patches were in development
concurrently, and at the time there was some conflict resolution that
happened when both were merged. The
is_kfence_address(kasan_reset_tag(..)) initially came from [1] but was
squashed into 2b8305260fb.

[1] https://lore.kernel.org/all/9dc196006921b191d25d10f6e611316db7da2efc.1611946152.git.andreyknvl@google.com/

Andrey, do you recall what issue you encountered that needed kasan_reset_tag()?

2023-12-11 22:12:25

by Andrey Konovalov

Subject: Re: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()

On Wed, Dec 6, 2023 at 3:45 PM Marco Elver <[email protected]> wrote:
>
> The is_kfence_address() implementation tolerates tagged addresses,
> i.e. if it receives a tagged non-kfence address, it will never return
> true.
>
> The KASAN_HW_TAGS patches and KFENCE patches were in development
> concurrently, and at the time there was some conflict resolution that
> happened when both were merged. The
> is_kfence_address(kasan_reset_tag(..)) initially came from [1] but was
> squashed into 2b8305260fb.
>
> [1] https://lore.kernel.org/all/9dc196006921b191d25d10f6e611316db7da2efc.1611946152.git.andreyknvl@google.com/
>
> Andrey, do you recall what issue you encountered that needed kasan_reset_tag()?

I don't remember at this point, but this could have been just a safety measure.

If is_kfence_address tolerates tagged addresses, we should be able to
drop these kasan_reset_tag calls.

2023-12-12 11:42:57

by Vlastimil Babka

Subject: Re: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()

On 12/11/23 23:11, Andrey Konovalov wrote:
> On Wed, Dec 6, 2023 at 3:45 PM Marco Elver <[email protected]> wrote:
>>
>> The is_kfence_address() implementation tolerates tagged addresses,
>> i.e. if it receives a tagged non-kfence address, it will never return
>> true.

So just to be sure, it can't happen that a genuine kfence address would then
become KASAN tagged and handed out, and thus when tested by
is_kfence_address() it would be a false negative?

>> The KASAN_HW_TAGS patches and KFENCE patches were in development
>> concurrently, and at the time there was some conflict resolution that
>> happened when both were merged. The
>> is_kfence_address(kasan_reset_tag(..)) initially came from [1] but was
>> squashed into 2b8305260fb.
>>
>> [1] https://lore.kernel.org/all/9dc196006921b191d25d10f6e611316db7da2efc.1611946152.git.andreyknvl@google.com/
>>
>> Andrey, do you recall what issue you encountered that needed kasan_reset_tag()?
>
> I don't remember at this point, but this could have been just a safety measure.
>
> If is_kfence_address tolerates tagged addresses, we should be able to
> drop these kasan_reset_tag calls.

Will drop it once the above is confirmed. Thanks!

2023-12-20 23:45:26

by Andrey Konovalov

Subject: Re: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()

On Tue, Dec 12, 2023 at 12:42 PM Vlastimil Babka <[email protected]> wrote:
>
> On 12/11/23 23:11, Andrey Konovalov wrote:
> > On Wed, Dec 6, 2023 at 3:45 PM Marco Elver <[email protected]> wrote:
> >>
> >> The is_kfence_address() implementation tolerates tagged addresses,
> >> i.e. if it receives a tagged non-kfence address, it will never return
> >> true.
>
> So just to be sure, it can't happen that a genuine kfence address would then
> become KASAN tagged and handed out, and thus when tested by
> is_kfence_address() it would be a false negative?

No, this should not happen. KFENCE objects never get tags assigned to them.