Recent times kvmalloc() begun widely be used in kernel.
Some of such memory allocations have to be freed after
rcu grace period, and this patchset introduces a generic
primitive for doing this.
Actually, everything is made in [1/2]. Patch [2/2] is just
added to make new kvfree_rcu() have the first user.
The patch [1/2] transforms kfree_rcu(), its sub definitions
and its sub functions into kvfree_rcu() form. The most
significant change is in __rcu_reclaim(), where kvfree()
is used instead of kfree(). Since kvfree() is able to
have a deal with memory allocated via kmalloc(), vmalloc()
and kvmalloc(); kfree_rcu() and vfree_rcu() may simply
be defined through this new kvfree_rcu().
---
Kirill Tkhai (2):
rcu: Transform kfree_rcu() into kvfree_rcu()
mm: Use kvfree_rcu() in update_memcg_params()
include/linux/rcupdate.h | 31 +++++++++++++++++--------------
include/linux/rcutiny.h | 4 ++--
include/linux/rcutree.h | 2 +-
include/trace/events/rcu.h | 12 ++++++------
kernel/rcu/rcu.h | 8 ++++----
kernel/rcu/tree.c | 14 +++++++-------
kernel/rcu/tree_plugin.h | 10 +++++-----
mm/slab_common.c | 10 +---------
8 files changed, 43 insertions(+), 48 deletions(-)
--
Signed-off-by: Kirill Tkhai <[email protected]>
Make update_memcg_params() to use generic kvfree_rcu()
helper and remove free_memcg_params() code.
Signed-off-by: Kirill Tkhai <[email protected]>
---
mm/slab_common.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 10f127b2de7c..92d4a3a9471d 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -190,14 +190,6 @@ static void destroy_memcg_params(struct kmem_cache *s)
kvfree(rcu_access_pointer(s->memcg_params.memcg_caches));
}
-static void free_memcg_params(struct rcu_head *rcu)
-{
- struct memcg_cache_array *old;
-
- old = container_of(rcu, struct memcg_cache_array, rcu);
- kvfree(old);
-}
-
static int update_memcg_params(struct kmem_cache *s, int new_array_size)
{
struct memcg_cache_array *old, *new;
@@ -215,7 +207,7 @@ static int update_memcg_params(struct kmem_cache *s, int new_array_size)
rcu_assign_pointer(s->memcg_params.memcg_caches, new);
if (old)
- call_rcu(&old->rcu, free_memcg_params);
+ kvfree_rcu(old, rcu);
return 0;
}
Recent times kvmalloc() begun widely be used in kernel.
Some of such memory allocations have to be freed after
rcu grace period, and this patch introduces a generic
primitive for doing this.
Currently, there is kfree_rcu() primitive in kernel,
which encodes rcu_head offset inside freed structure
on place of callback function. We can simply reuse it
and replace final kfree() in __rcu_reclaim() with
kvfree(). Since this primitive is able to free memory
allocated via kmalloc(), vmalloc() and kvmalloc(),
we may have single kvfree_rcu(), and define kfree_rcu()
and vfree_rcu() through it.
This allows users to avoid to implement custom functions
for destruction kvmalloc()'ed and vmalloc()'ed memory.
The new primitive kvfree_rcu() are used since next patch.
Signed-off-by: Kirill Tkhai <[email protected]>
---
include/linux/rcupdate.h | 31 +++++++++++++++++--------------
include/linux/rcutiny.h | 4 ++--
include/linux/rcutree.h | 2 +-
include/trace/events/rcu.h | 12 ++++++------
kernel/rcu/rcu.h | 8 ++++----
kernel/rcu/tree.c | 14 +++++++-------
kernel/rcu/tree_plugin.h | 10 +++++-----
7 files changed, 42 insertions(+), 39 deletions(-)
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 043d04784675..22d4086f50b2 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -832,36 +832,36 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
/*
* Does the specified offset indicate that the corresponding rcu_head
- * structure can be handled by kfree_rcu()?
+ * structure can be handled by kvfree_rcu()?
*/
-#define __is_kfree_rcu_offset(offset) ((offset) < 4096)
+#define __is_kvfree_rcu_offset(offset) ((offset) < 4096)
/*
- * Helper macro for kfree_rcu() to prevent argument-expansion eyestrain.
+ * Helper macro for kvfree_rcu() to prevent argument-expansion eyestrain.
*/
-#define __kfree_rcu(head, offset) \
+#define __kvfree_rcu(head, offset) \
do { \
- BUILD_BUG_ON(!__is_kfree_rcu_offset(offset)); \
- kfree_call_rcu(head, (rcu_callback_t)(unsigned long)(offset)); \
+ BUILD_BUG_ON(!__is_kvfree_rcu_offset(offset)); \
+ kvfree_call_rcu(head, (rcu_callback_t)(unsigned long)(offset)); \
} while (0)
/**
- * kfree_rcu() - kfree an object after a grace period.
- * @ptr: pointer to kfree
+ * kvfree_rcu() - kvfree an object after a grace period.
+ * @ptr: pointer to kvfree
* @rcu_head: the name of the struct rcu_head within the type of @ptr.
*
- * Many rcu callbacks functions just call kfree() on the base structure.
+ * Many rcu callbacks functions just call kvfree() on the base structure.
* These functions are trivial, but their size adds up, and furthermore
* when they are used in a kernel module, that module must invoke the
* high-latency rcu_barrier() function at module-unload time.
*
- * The kfree_rcu() function handles this issue. Rather than encoding a
- * function address in the embedded rcu_head structure, kfree_rcu() instead
+ * The kvfree_rcu() function handles this issue. Rather than encoding a
+ * function address in the embedded rcu_head structure, kvfree_rcu() instead
* encodes the offset of the rcu_head structure within the base structure.
* Because the functions are not allowed in the low-order 4096 bytes of
* kernel virtual memory, offsets up to 4095 bytes can be accommodated.
* If the offset is larger than 4095 bytes, a compile-time error will
- * be generated in __kfree_rcu(). If this error is triggered, you can
+ * be generated in __kvfree_rcu(). If this error is triggered, you can
* either fall back to use of call_rcu() or rearrange the structure to
* position the rcu_head structure into the first 4096 bytes.
*
@@ -871,9 +871,12 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
* The BUILD_BUG_ON check must not involve any function calls, hence the
* checks are done in macros here.
*/
-#define kfree_rcu(ptr, rcu_head) \
- __kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))
+#define kvfree_rcu(ptr, rcu_head) \
+ __kvfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))
+#define kfree_rcu(ptr, rcu_head) kvfree_rcu(ptr, rcu_head)
+
+#define vfree_rcu(ptr, rcu_head) kvfree_rcu(ptr, rcu_head)
/*
* Place this after a lock-acquisition primitive to guarantee that
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index ce9beec35e34..2e484aaa534f 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -84,8 +84,8 @@ static inline void synchronize_sched_expedited(void)
synchronize_sched();
}
-static inline void kfree_call_rcu(struct rcu_head *head,
- rcu_callback_t func)
+static inline void kvfree_call_rcu(struct rcu_head *head,
+ rcu_callback_t func)
{
call_rcu(head, func);
}
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index fd996cdf1833..4d6365be4504 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -48,7 +48,7 @@ void synchronize_rcu_bh(void);
void synchronize_sched_expedited(void);
void synchronize_rcu_expedited(void);
-void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func);
+void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func);
/**
* synchronize_rcu_bh_expedited - Brute-force RCU-bh grace period
diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 0b50fda80db0..9507264fa8f8 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -496,13 +496,13 @@ TRACE_EVENT(rcu_callback,
/*
* Tracepoint for the registration of a single RCU callback of the special
- * kfree() form. The first argument is the RCU type, the second argument
+ * kvfree() form. The first argument is the RCU type, the second argument
* is a pointer to the RCU callback, the third argument is the offset
* of the callback within the enclosing RCU-protected data structure,
* the fourth argument is the number of lazy callbacks queued, and the
* fifth argument is the total number of callbacks queued.
*/
-TRACE_EVENT(rcu_kfree_callback,
+TRACE_EVENT(rcu_kvfree_callback,
TP_PROTO(const char *rcuname, struct rcu_head *rhp, unsigned long offset,
long qlen_lazy, long qlen),
@@ -591,12 +591,12 @@ TRACE_EVENT(rcu_invoke_callback,
/*
* Tracepoint for the invocation of a single RCU callback of the special
- * kfree() form. The first argument is the RCU flavor, the second
+ * kvfree() form. The first argument is the RCU flavor, the second
* argument is a pointer to the RCU callback, and the third argument
* is the offset of the callback within the enclosing RCU-protected
* data structure.
*/
-TRACE_EVENT(rcu_invoke_kfree_callback,
+TRACE_EVENT(rcu_invoke_kvfree_callback,
TP_PROTO(const char *rcuname, struct rcu_head *rhp, unsigned long offset),
@@ -767,12 +767,12 @@ TRACE_EVENT(rcu_barrier,
#define trace_rcu_fqs(rcuname, gpnum, cpu, qsevent) do { } while (0)
#define trace_rcu_dyntick(polarity, oldnesting, newnesting, dyntick) do { } while (0)
#define trace_rcu_callback(rcuname, rhp, qlen_lazy, qlen) do { } while (0)
-#define trace_rcu_kfree_callback(rcuname, rhp, offset, qlen_lazy, qlen) \
+#define trace_rcu_kvfree_callback(rcuname, rhp, offset, qlen_lazy, qlen) \
do { } while (0)
#define trace_rcu_batch_start(rcuname, qlen_lazy, qlen, blimit) \
do { } while (0)
#define trace_rcu_invoke_callback(rcuname, rhp) do { } while (0)
-#define trace_rcu_invoke_kfree_callback(rcuname, rhp, offset) do { } while (0)
+#define trace_rcu_invoke_kvfree_callback(rcuname, rhp, offset) do { } while (0)
#define trace_rcu_batch_end(rcuname, callbacks_invoked, cb, nr, iit, risk) \
do { } while (0)
#define trace_rcu_torture_read(rcutorturename, rhp, secs, c_old, c) \
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 6334f2c1abd0..696200886098 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -151,7 +151,7 @@ static inline void debug_rcu_head_unqueue(struct rcu_head *head)
}
#endif /* #else !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
-void kfree(const void *);
+void kvfree(const void *);
/*
* Reclaim the specified callback, either by invoking it (non-lazy case)
@@ -162,9 +162,9 @@ static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head)
unsigned long offset = (unsigned long)head->func;
rcu_lock_acquire(&rcu_callback_map);
- if (__is_kfree_rcu_offset(offset)) {
- RCU_TRACE(trace_rcu_invoke_kfree_callback(rn, head, offset);)
- kfree((void *)head - offset);
+ if (__is_kvfree_rcu_offset(offset)) {
+ RCU_TRACE(trace_rcu_invoke_kvfree_callback(rn, head, offset);)
+ kvfree((void *)head - offset);
rcu_lock_release(&rcu_callback_map);
return true;
} else {
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 491bdf39f276..8e736aa11a46 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3061,10 +3061,10 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func,
if (!lazy)
rcu_idle_count_callbacks_posted();
- if (__is_kfree_rcu_offset((unsigned long)func))
- trace_rcu_kfree_callback(rsp->name, head, (unsigned long)func,
- rcu_segcblist_n_lazy_cbs(&rdp->cblist),
- rcu_segcblist_n_cbs(&rdp->cblist));
+ if (__is_kvfree_rcu_offset((unsigned long)func))
+ trace_rcu_kvfree_callback(rsp->name, head, (unsigned long)func,
+ rcu_segcblist_n_lazy_cbs(&rdp->cblist),
+ rcu_segcblist_n_cbs(&rdp->cblist));
else
trace_rcu_callback(rsp->name, head,
rcu_segcblist_n_lazy_cbs(&rdp->cblist),
@@ -3134,14 +3134,14 @@ EXPORT_SYMBOL_GPL(call_rcu_bh);
* This will likely be later named something like "call_rcu_lazy()",
* but this change will require some way of tagging the lazy RCU
* callbacks in the list of pending callbacks. Until then, this
- * function may only be called from __kfree_rcu().
+ * function may only be called from __kvfree_rcu().
*/
-void kfree_call_rcu(struct rcu_head *head,
+void kvfree_call_rcu(struct rcu_head *head,
rcu_callback_t func)
{
__call_rcu(head, func, rcu_state_p, -1, 1);
}
-EXPORT_SYMBOL_GPL(kfree_call_rcu);
+EXPORT_SYMBOL_GPL(kvfree_call_rcu);
/*
* Because a context switch is a grace period for RCU-sched and RCU-bh,
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index fb88a028deec..85715963658e 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1984,11 +1984,11 @@ static bool __call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *rhp,
if (!rcu_is_nocb_cpu(rdp->cpu))
return false;
__call_rcu_nocb_enqueue(rdp, rhp, &rhp->next, 1, lazy, flags);
- if (__is_kfree_rcu_offset((unsigned long)rhp->func))
- trace_rcu_kfree_callback(rdp->rsp->name, rhp,
- (unsigned long)rhp->func,
- -atomic_long_read(&rdp->nocb_q_count_lazy),
- -atomic_long_read(&rdp->nocb_q_count));
+ if (__is_kvfree_rcu_offset((unsigned long)rhp->func))
+ trace_rcu_kvfree_callback(rdp->rsp->name, rhp,
+ (unsigned long)rhp->func,
+ -atomic_long_read(&rdp->nocb_q_count_lazy),
+ -atomic_long_read(&rdp->nocb_q_count));
else
trace_rcu_callback(rdp->rsp->name, rhp,
-atomic_long_read(&rdp->nocb_q_count_lazy),
On Tue, 06 Feb 2018 13:19:45 +0300
Kirill Tkhai <[email protected]> wrote:
> /**
> - * kfree_rcu() - kfree an object after a grace period.
> - * @ptr: pointer to kfree
> + * kvfree_rcu() - kvfree an object after a grace period.
> + * @ptr: pointer to kvfree
> * @rcu_head: the name of the struct rcu_head within the type of @ptr.
> *
You may want to add a big comment here that states this works for both
free vmalloc and kmalloc data. Because if I saw this, I would think it
only works for vmalloc, and start implementing a custom one for kmalloc
data.
-- Steve
> - * Many rcu callbacks functions just call kfree() on the base structure.
> + * Many rcu callbacks functions just call kvfree() on the base structure.
> * These functions are trivial, but their size adds up, and furthermore
> * when they are used in a kernel module, that module must invoke the
> * high-latency rcu_barrier() function at module-unload time.
> *
> - * The kfree_rcu() function handles this issue. Rather than encoding a
> - * function address in the embedded rcu_head structure, kfree_rcu() instead
> + * The kvfree_rcu() function handles this issue. Rather than encoding a
> + * function address in the embedded rcu_head structure, kvfree_rcu() instead
> * encodes the offset of the rcu_head structure within the base structure.
> * Because the functions are not allowed in the low-order 4096 bytes of
> * kernel virtual memory, offsets up to 4095 bytes can be accommodated.
> * If the offset is larger than 4095 bytes, a compile-time error will
> - * be generated in __kfree_rcu(). If this error is triggered, you can
> + * be generated in __kvfree_rcu(). If this error is triggered, you can
> * either fall back to use of call_rcu() or rearrange the structure to
> * position the rcu_head structure into the first 4096 bytes.
> *
> @@ -871,9 +871,12 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
> * The BUILD_BUG_ON check must not involve any function calls, hence the
> * checks are done in macros here.
> */
> -#define kfree_rcu(ptr, rcu_head) \
> - __kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))
> +#define kvfree_rcu(ptr, rcu_head) \
> + __kvfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))
>
> +#define kfree_rcu(ptr, rcu_head) kvfree_rcu(ptr, rcu_head)
> +
> +#define vfree_rcu(ptr, rcu_head) kvfree_rcu(ptr, rcu_head)
>
> /*
> * Place this after a lock-acquisition primitive to guarantee that
> diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
> index ce9beec35e34..2e484aaa534f 100644
> --- a/include/linux/rcutiny.h
> +++ b/include/linux/rcutiny.h
> @@ -84,8 +84,8 @@ static inline void synchronize_sched_expedited(void)
> synchronize_sched();
> }
>
> -static inline void kfree_call_rcu(struct rcu_head *head,
> - rcu_callback_t func)
> +static inline void kvfree_call_rcu(struct rcu_head *head,
> + rcu_callback_t func)
> {
> call_rcu(head, func);
> }
On 06.02.2018 17:34, Steven Rostedt wrote:
> On Tue, 06 Feb 2018 13:19:45 +0300
> Kirill Tkhai <[email protected]> wrote:
>
>> /**
>> - * kfree_rcu() - kfree an object after a grace period.
>> - * @ptr: pointer to kfree
>> + * kvfree_rcu() - kvfree an object after a grace period.
>> + * @ptr: pointer to kvfree
>> * @rcu_head: the name of the struct rcu_head within the type of @ptr.
>> *
>
> You may want to add a big comment here that states this works for both
> free vmalloc and kmalloc data. Because if I saw this, I would think it
> only works for vmalloc, and start implementing a custom one for kmalloc
> data.
There are kfree_rcu() and vfree_rcu() defined below, and they will give
compilation error if someone tries to implement one more primitive with
the same name.
We may add a comment, but I'm not sure it will be good if people will use
unpaired brackets like:
obj = kmalloc(..)
kvfree_rcu(obj,..)
after they read such a commentary that it works for both vmalloc and kmalloc.
After this unpaired behavior distribute over the kernel, we won't be able
to implement some debug on top of this defines (I'm not sure it will be really
need in the future, but anyway).
Though, we may add a comment forcing use of paired bracket. Something like:
/**
* kvfree_rcu() - kvfree an object after a grace period.
This is a primitive for objects allocated via kvmalloc*() family primitives.
Do not use it to free kmalloc() and vmalloc() allocated objects, use kfree_rcu()
and vfree_rcu() wrappers instead.
How are you about this?
Kirill
>> - * Many rcu callbacks functions just call kfree() on the base structure.
>> + * Many rcu callbacks functions just call kvfree() on the base structure.
>> * These functions are trivial, but their size adds up, and furthermore
>> * when they are used in a kernel module, that module must invoke the
>> * high-latency rcu_barrier() function at module-unload time.
>> *
>> - * The kfree_rcu() function handles this issue. Rather than encoding a
>> - * function address in the embedded rcu_head structure, kfree_rcu() instead
>> + * The kvfree_rcu() function handles this issue. Rather than encoding a
>> + * function address in the embedded rcu_head structure, kvfree_rcu() instead
>> * encodes the offset of the rcu_head structure within the base structure.
>> * Because the functions are not allowed in the low-order 4096 bytes of
>> * kernel virtual memory, offsets up to 4095 bytes can be accommodated.
>> * If the offset is larger than 4095 bytes, a compile-time error will
>> - * be generated in __kfree_rcu(). If this error is triggered, you can
>> + * be generated in __kvfree_rcu(). If this error is triggered, you can
>> * either fall back to use of call_rcu() or rearrange the structure to
>> * position the rcu_head structure into the first 4096 bytes.
>> *
>> @@ -871,9 +871,12 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
>> * The BUILD_BUG_ON check must not involve any function calls, hence the
>> * checks are done in macros here.
>> */
>> -#define kfree_rcu(ptr, rcu_head) \
>> - __kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))
>> +#define kvfree_rcu(ptr, rcu_head) \
>> + __kvfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))
>>
>> +#define kfree_rcu(ptr, rcu_head) kvfree_rcu(ptr, rcu_head)
>> +
>> +#define vfree_rcu(ptr, rcu_head) kvfree_rcu(ptr, rcu_head)
>>
>> /*
>> * Place this after a lock-acquisition primitive to guarantee that
>> diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
>> index ce9beec35e34..2e484aaa534f 100644
>> --- a/include/linux/rcutiny.h
>> +++ b/include/linux/rcutiny.h
>> @@ -84,8 +84,8 @@ static inline void synchronize_sched_expedited(void)
>> synchronize_sched();
>> }
>>
>> -static inline void kfree_call_rcu(struct rcu_head *head,
>> - rcu_callback_t func)
>> +static inline void kvfree_call_rcu(struct rcu_head *head,
>> + rcu_callback_t func)
>> {
>> call_rcu(head, func);
>> }
On Tue, 6 Feb 2018 18:06:33 +0300
Kirill Tkhai <[email protected]> wrote:
> There are kfree_rcu() and vfree_rcu() defined below, and they will give
> compilation error if someone tries to implement one more primitive with
> the same name.
Ah, I misread the patch. I was thinking you were simply replacing
kfree_rcu() with kvfree_rcu(), but now see the macros added below it.
>
> We may add a comment, but I'm not sure it will be good if people will use
> unpaired brackets like:
>
> obj = kmalloc(..)
> kvfree_rcu(obj,..)
>
> after they read such a commentary that it works for both vmalloc and kmalloc.
> After this unpaired behavior distribute over the kernel, we won't be able
> to implement some debug on top of this defines (I'm not sure it will be really
> need in the future, but anyway).
>
> Though, we may add a comment forcing use of paired bracket. Something like:
>
> /**
> * kvfree_rcu() - kvfree an object after a grace period.
> This is a primitive for objects allocated via kvmalloc*() family primitives.
> Do not use it to free kmalloc() and vmalloc() allocated objects, use kfree_rcu()
> and vfree_rcu() wrappers instead.
>
> How are you about this?
Never mind, I missed the adding of kfree_rcu() at the bottom, and was
thinking that we were just using kvfree_rcu() for everything.
That's what I get for looking at patches before my first cup of
coffee ;-)
If you want to add a comment, feel free, but taking a second look, I
don't feel it is necessary.
-- Steve
On Tue, Feb 06, 2018 at 01:19:29PM +0300, Kirill Tkhai wrote:
> Recent times kvmalloc() begun widely be used in kernel.
> Some of such memory allocations have to be freed after
> rcu grace period, and this patchset introduces a generic
> primitive for doing this.
>
> Actually, everything is made in [1/2]. Patch [2/2] is just
> added to make new kvfree_rcu() have the first user.
>
> The patch [1/2] transforms kfree_rcu(), its sub definitions
> and its sub functions into kvfree_rcu() form. The most
> significant change is in __rcu_reclaim(), where kvfree()
> is used instead of kfree(). Since kvfree() is able to
> have a deal with memory allocated via kmalloc(), vmalloc()
> and kvmalloc(); kfree_rcu() and vfree_rcu() may simply
> be defined through this new kvfree_rcu().
Interesting.
So it is OK to kvmalloc() something and pass it to either kfree() or
kvfree(), and it had better be OK to kvmalloc() something and pass it
to kvfree().
Is it OK to kmalloc() something and pass it to kvfree()?
If so, is it really useful to have two different names here, that is,
both kfree_rcu() and kvfree_rcu()?
Also adding Jesper and Rao on CC for their awareness.
Thanx, Paul
> ---
>
> Kirill Tkhai (2):
> rcu: Transform kfree_rcu() into kvfree_rcu()
> mm: Use kvfree_rcu() in update_memcg_params()
>
>
> include/linux/rcupdate.h | 31 +++++++++++++++++--------------
> include/linux/rcutiny.h | 4 ++--
> include/linux/rcutree.h | 2 +-
> include/trace/events/rcu.h | 12 ++++++------
> kernel/rcu/rcu.h | 8 ++++----
> kernel/rcu/tree.c | 14 +++++++-------
> kernel/rcu/tree_plugin.h | 10 +++++-----
> mm/slab_common.c | 10 +---------
> 8 files changed, 43 insertions(+), 48 deletions(-)
>
> --
> Signed-off-by: Kirill Tkhai <[email protected]>
>
On Tue, Feb 06, 2018 at 06:17:03PM -0800, Paul E. McKenney wrote:
> So it is OK to kvmalloc() something and pass it to either kfree() or
> kvfree(), and it had better be OK to kvmalloc() something and pass it
> to kvfree().
>
> Is it OK to kmalloc() something and pass it to kvfree()?
Yes, it absolutely is.
void kvfree(const void *addr)
{
if (is_vmalloc_addr(addr))
vfree(addr);
else
kfree(addr);
}
> If so, is it really useful to have two different names here, that is,
> both kfree_rcu() and kvfree_rcu()?
I think it's handy to have all three of kvfree_rcu(), kfree_rcu() and
vfree_rcu() available in the API for the symmetry of calling kmalloc()
/ kfree_rcu().
Personally, I would like us to rename kvfree() to just free(), and have
malloc(x) be an alias to kvmalloc(x, GFP_KERNEL), but I haven't won that
fight yet.
On Tue, Feb 06, 2018 at 08:23:34PM -0800, Matthew Wilcox wrote:
> On Tue, Feb 06, 2018 at 06:17:03PM -0800, Paul E. McKenney wrote:
> > So it is OK to kvmalloc() something and pass it to either kfree() or
> > kvfree(), and it had better be OK to kvmalloc() something and pass it
> > to kvfree().
> >
> > Is it OK to kmalloc() something and pass it to kvfree()?
>
> Yes, it absolutely is.
>
> void kvfree(const void *addr)
> {
> if (is_vmalloc_addr(addr))
> vfree(addr);
> else
> kfree(addr);
> }
>
> > If so, is it really useful to have two different names here, that is,
> > both kfree_rcu() and kvfree_rcu()?
>
> I think it's handy to have all three of kvfree_rcu(), kfree_rcu() and
> vfree_rcu() available in the API for the symmetry of calling kmalloc()
> / kfree_rcu().
>
> Personally, I would like us to rename kvfree() to just free(), and have
> malloc(x) be an alias to kvmalloc(x, GFP_KERNEL), but I haven't won that
> fight yet.
But why not just have the existing kfree_rcu() API cover both kmalloc()
and kvmalloc()? Perhaps I am not in the right forums, but I am not hearing
anyone arguing that the RCU API has too few members. ;-)
Thanx, Paul
On Tue, Feb 06, 2018 at 09:02:00PM -0800, Paul E. McKenney wrote:
> On Tue, Feb 06, 2018 at 08:23:34PM -0800, Matthew Wilcox wrote:
> > On Tue, Feb 06, 2018 at 06:17:03PM -0800, Paul E. McKenney wrote:
> > > So it is OK to kvmalloc() something and pass it to either kfree() or
> > > kvfree(), and it had better be OK to kvmalloc() something and pass it
> > > to kvfree().
> > >
> > > Is it OK to kmalloc() something and pass it to kvfree()?
> >
> > Yes, it absolutely is.
> >
> > void kvfree(const void *addr)
> > {
> > if (is_vmalloc_addr(addr))
> > vfree(addr);
> > else
> > kfree(addr);
> > }
> >
> > > If so, is it really useful to have two different names here, that is,
> > > both kfree_rcu() and kvfree_rcu()?
> >
> > I think it's handy to have all three of kvfree_rcu(), kfree_rcu() and
> > vfree_rcu() available in the API for the symmetry of calling kmalloc()
> > / kfree_rcu().
> >
> > Personally, I would like us to rename kvfree() to just free(), and have
> > malloc(x) be an alias to kvmalloc(x, GFP_KERNEL), but I haven't won that
> > fight yet.
>
> But why not just have the existing kfree_rcu() API cover both kmalloc()
> and kvmalloc()? Perhaps I am not in the right forums, but I am not hearing
> anyone arguing that the RCU API has too few members. ;-)
I don't have any problem with having just `kvfree_rcu`, but having just
`kfree_rcu` seems confusingly asymmetric.
(Also, count me in favor of having just one "free" function, too.)
On 07.02.2018 08:02, Paul E. McKenney wrote:
> On Tue, Feb 06, 2018 at 08:23:34PM -0800, Matthew Wilcox wrote:
>> On Tue, Feb 06, 2018 at 06:17:03PM -0800, Paul E. McKenney wrote:
>>> So it is OK to kvmalloc() something and pass it to either kfree() or
>>> kvfree(), and it had better be OK to kvmalloc() something and pass it
>>> to kvfree().
>>>
>>> Is it OK to kmalloc() something and pass it to kvfree()?
>>
>> Yes, it absolutely is.
>>
>> void kvfree(const void *addr)
>> {
>> if (is_vmalloc_addr(addr))
>> vfree(addr);
>> else
>> kfree(addr);
>> }
>>
>>> If so, is it really useful to have two different names here, that is,
>>> both kfree_rcu() and kvfree_rcu()?
>>
>> I think it's handy to have all three of kvfree_rcu(), kfree_rcu() and
>> vfree_rcu() available in the API for the symmetry of calling kmalloc()
>> / kfree_rcu().
>>
>> Personally, I would like us to rename kvfree() to just free(), and have
>> malloc(x) be an alias to kvmalloc(x, GFP_KERNEL), but I haven't won that
>> fight yet.
>
> But why not just have the existing kfree_rcu() API cover both kmalloc()
> and kvmalloc()? Perhaps I am not in the right forums, but I am not hearing
> anyone arguing that the RCU API has too few members. ;-)
People, far from RCU internals, consider kfree_rcu() like an extension
of kfree(). And it's not clear it's need to dive into kfree_rcu() comments,
when someone is looking a primitive to free vmalloc'ed memory.
Also, construction like
obj = kvmalloc();
kfree_rcu(obj);
makes me think it's legitimately to use plain kfree() as pair bracket to kvmalloc().
So the significant change of kfree_rcu() behavior will complicate stable backporters
life, because they will need to keep in mind such differences between different
kernel versions.
It seems if we are going to use the single primitive for both kmalloc()
and kvmalloc() memory, it has to have another name. But I don't see problems
with having both kfree_rcu() and kvfree_rcu().
Kirill
On Tue, Feb 06, 2018 at 11:54:09PM -0800, Josh Triplett wrote:
> On Tue, Feb 06, 2018 at 09:02:00PM -0800, Paul E. McKenney wrote:
> > On Tue, Feb 06, 2018 at 08:23:34PM -0800, Matthew Wilcox wrote:
> > > On Tue, Feb 06, 2018 at 06:17:03PM -0800, Paul E. McKenney wrote:
> > > > So it is OK to kvmalloc() something and pass it to either kfree() or
> > > > kvfree(), and it had better be OK to kvmalloc() something and pass it
> > > > to kvfree().
> > > >
> > > > Is it OK to kmalloc() something and pass it to kvfree()?
> > >
> > > Yes, it absolutely is.
> > >
> > > void kvfree(const void *addr)
> > > {
> > > if (is_vmalloc_addr(addr))
> > > vfree(addr);
> > > else
> > > kfree(addr);
> > > }
> > >
> > > > If so, is it really useful to have two different names here, that is,
> > > > both kfree_rcu() and kvfree_rcu()?
> > >
> > > I think it's handy to have all three of kvfree_rcu(), kfree_rcu() and
> > > vfree_rcu() available in the API for the symmetry of calling kmalloc()
> > > / kfree_rcu().
> > >
> > > Personally, I would like us to rename kvfree() to just free(), and have
> > > malloc(x) be an alias to kvmalloc(x, GFP_KERNEL), but I haven't won that
> > > fight yet.
> >
> > But why not just have the existing kfree_rcu() API cover both kmalloc()
> > and kvmalloc()? Perhaps I am not in the right forums, but I am not hearing
> > anyone arguing that the RCU API has too few members. ;-)
>
> I don't have any problem with having just `kvfree_rcu`, but having just
> `kfree_rcu` seems confusingly asymmetric.
I don't understand why kmalloc()/kvfree_rcu() would seem any less
asymmetric than kvmalloc()/kfree_rcu().
> (Also, count me in favor of having just one "free" function, too.)
We agree on that much, anyway. ;-)
Thanx, Paul
On Wed, Feb 07, 2018 at 10:57:28AM +0300, Kirill Tkhai wrote:
> On 07.02.2018 08:02, Paul E. McKenney wrote:
> > On Tue, Feb 06, 2018 at 08:23:34PM -0800, Matthew Wilcox wrote:
> >> On Tue, Feb 06, 2018 at 06:17:03PM -0800, Paul E. McKenney wrote:
> >>> So it is OK to kvmalloc() something and pass it to either kfree() or
> >>> kvfree(), and it had better be OK to kvmalloc() something and pass it
> >>> to kvfree().
> >>>
> >>> Is it OK to kmalloc() something and pass it to kvfree()?
> >>
> >> Yes, it absolutely is.
> >>
> >> void kvfree(const void *addr)
> >> {
> >> if (is_vmalloc_addr(addr))
> >> vfree(addr);
> >> else
> >> kfree(addr);
> >> }
> >>
> >>> If so, is it really useful to have two different names here, that is,
> >>> both kfree_rcu() and kvfree_rcu()?
> >>
> >> I think it's handy to have all three of kvfree_rcu(), kfree_rcu() and
> >> vfree_rcu() available in the API for the symmetry of calling kmalloc()
> >> / kfree_rcu().
> >>
> >> Personally, I would like us to rename kvfree() to just free(), and have
> >> malloc(x) be an alias to kvmalloc(x, GFP_KERNEL), but I haven't won that
> >> fight yet.
> >
> > But why not just have the existing kfree_rcu() API cover both kmalloc()
> > and kvmalloc()? Perhaps I am not in the right forums, but I am not hearing
> > anyone arguing that the RCU API has too few members. ;-)
>
> People, far from RCU internals, consider kfree_rcu() like an extension
> of kfree(). And it's not clear it's need to dive into kfree_rcu() comments,
> when someone is looking a primitive to free vmalloc'ed memory.
Seems like a relatively simple lesson to teach.
> Also, construction like
>
> obj = kvmalloc();
> kfree_rcu(obj);
>
> makes me think it's legitimately to use plain kfree() as pair bracket to kvmalloc().
So it all works as is, then.
> So the significant change of kfree_rcu() behavior will complicate stable backporters
> life, because they will need to keep in mind such differences between different
> kernel versions.
If I understood your construction above, that significant change in
kfree_rcu() behavior has already happened.
> It seems if we are going to use the single primitive for both kmalloc()
> and kvmalloc() memory, it has to have another name. But I don't see problems
> with having both kfree_rcu() and kvfree_rcu().
I see problems. We would then have two different names for exactly the
same thing.
Seems like it would be a lot easier to simply document the existing
kfree_rcu() behavior, especially given that it apparently already works.
The really doesn't seem to me to be worth a name change.
Thanx, Paul
On Wed, 7 Feb 2018 00:31:04 -0800
"Paul E. McKenney" <[email protected]> wrote:
> I see problems. We would then have two different names for exactly the
> same thing.
>
> Seems like it would be a lot easier to simply document the existing
> kfree_rcu() behavior, especially given that it apparently already works.
> The really doesn't seem to me to be worth a name change.
Honestly, I don't believe this is an RCU sub-system decision. This is a
memory management decision.
If we have kmalloc(), vmalloc(), kfree(), vfree() and kvfree(), and we
want kmalloc() to be freed with kfree(), and vmalloc() to be freed with
vfree(), and for strange reasons, we don't know how the data was
allocated we have kvfree(). That's an mm decision not an rcu one. We
should have kfree_rcu(), vfree_rcu() and kvfree_rcu(), and honestly,
they should not depend on kvfree() doing the same thing for everything.
Each should call the corresponding member that they represent. Which
would change this patch set.
Why? Too much coupling between RCU and MM. What if in the future
something changes and kvfree() goes away or changes drastically. We
would then have to go through all the users of RCU to change them too.
To me kvfree() is a special case and should not be used by RCU as a
generic function. That would make RCU and MM much more coupled than
necessary.
-- Steve
On Tue, 6 Feb 2018, Paul E. McKenney wrote:
> So it is OK to kvmalloc() something and pass it to either kfree() or
> kvfree(), and it had better be OK to kvmalloc() something and pass it
> to kvfree().
kvfree() is fine but not kfree().
On Wed, Feb 07, 2018 at 08:57:00AM -0500, Steven Rostedt wrote:
> On Wed, 7 Feb 2018 00:31:04 -0800
> "Paul E. McKenney" <[email protected]> wrote:
>
> > I see problems. We would then have two different names for exactly the
> > same thing.
> >
> > Seems like it would be a lot easier to simply document the existing
> > kfree_rcu() behavior, especially given that it apparently already works.
> > The really doesn't seem to me to be worth a name change.
>
> Honestly, I don't believe this is an RCU sub-system decision. This is a
> memory management decision.
>
> If we have kmalloc(), vmalloc(), kfree(), vfree() and kvfree(), and we
You missed kvmalloc() ...
> want kmalloc() to be freed with kfree(), and vmalloc() to be freed with
> vfree(), and for strange reasons, we don't know how the data was
> allocated we have kvfree(). That's an mm decision not an rcu one. We
> should have kfree_rcu(), vfree_rcu() and kvfree_rcu(), and honestly,
> they should not depend on kvfree() doing the same thing for everything.
> Each should call the corresponding member that they represent. Which
> would change this patch set.
>
> Why? Too much coupling between RCU and MM. What if in the future
> something changes and kvfree() goes away or changes drastically. We
> would then have to go through all the users of RCU to change them too.
>
> To me kvfree() is a special case and should not be used by RCU as a
> generic function. That would make RCU and MM much more coupled than
> necessary.
I'd still like it to be called free_rcu() ... so let's turn it around.
What memory can you allocate and then *not* free by calling kvfree()?
kvfree() can free memory allocated by kmalloc(), vmalloc(), any slab
allocation (is that guaranteed, or just something that happens to work?)
I think it can't free per-cpu allocations, bootmem, DMA allocations, or
alloc_page/get_free_page.
Do we need to be able to free any of those objects in order to rename
kfree_rcu() to just free_rcu()?
On Wed, 7 Feb 2018 08:18:46 -0800
Matthew Wilcox <[email protected]> wrote:
> Do we need to be able to free any of those objects in order to rename
> kfree_rcu() to just free_rcu()?
I'm just nervous about tightly coupling free_rcu() with all the *free()
from the memory management system. I've been burnt in the past by doing
such things.
What's the down side of having a way of matching *free_rcu() with all
the *free()s? I think it's easier to understand, and rcu doesn't need
to worry about changes of *free() code.
To me:
kfree_rcu(x);
is just a quick way of doing 'kfree(x)' after a synchronize_rcu() call.
But a "free_rcu(x)", is something I have to think about, because I
don't know from the name exactly what it is doing.
I know this may sound a bit bike shedding, but the less I need to think
about how other sub systems work, the easier it is to concentrate on my
own sub system.
-- Steve
On Wed, 7 Feb 2018 08:57:00 -0500
Steven Rostedt <[email protected]> wrote:
> On Wed, 7 Feb 2018 00:31:04 -0800
> "Paul E. McKenney" <[email protected]> wrote:
>
> > I see problems. We would then have two different names for exactly the
> > same thing.
> >
> > Seems like it would be a lot easier to simply document the existing
> > kfree_rcu() behavior, especially given that it apparently already works.
> > The really doesn't seem to me to be worth a name change.
>
> Honestly, I don't believe this is an RCU sub-system decision. This is a
> memory management decision.
>
> If we have kmalloc(), vmalloc(), kfree(), vfree() and kvfree(), and we
> want kmalloc() to be freed with kfree(), and vmalloc() to be freed with
> vfree(), and for strange reasons, we don't know how the data was
> allocated we have kvfree(). That's an mm decision not an rcu one. We
> should have kfree_rcu(), vfree_rcu() and kvfree_rcu(), and honestly,
> they should not depend on kvfree() doing the same thing for everything.
> Each should call the corresponding member that they represent. Which
> would change this patch set.
>
> Why? Too much coupling between RCU and MM. What if in the future
> something changes and kvfree() goes away or changes drastically. We
> would then have to go through all the users of RCU to change them too.
>
> To me kvfree() is a special case and should not be used by RCU as a
> generic function. That would make RCU and MM much more coupled than
> necessary.
For the record, I fully agree with Steve here.
And being a performance "fanatic" I don't like to have the extra branch
(and compares) in the free code path... but it's a MM-decision (and
sometimes you should not listen to "fanatics" ;-))
void kvfree(const void *addr)
{
if (is_vmalloc_addr(addr))
vfree(addr);
else
kfree(addr);
}
EXPORT_SYMBOL(kvfree);
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
static inline bool is_vmalloc_addr(const void *x)
{
#ifdef CONFIG_MMU
unsigned long addr = (unsigned long)x;
return addr >= VMALLOC_START && addr < VMALLOC_END;
#else
return false;
#endif
}
On Tue, 6 Feb 2018, Matthew Wilcox wrote:
> Personally, I would like us to rename kvfree() to just free(), and have
> malloc(x) be an alias to kvmalloc(x, GFP_KERNEL), but I haven't won that
> fight yet.
Maybe lets implement malloc(), free() and realloc() in the kernel to be
consistent with user space use as possible? Only use the others
allocation variants for special cases.
So malloc would check allocation sizes and if < 2* PAGE_SIZE use kmalloc()
otherwise vmalloc().
free() would free anything you give it.
On Wed, 7 Feb 2018 10:47:02 -0600 (CST)
Christopher Lameter <[email protected]> wrote:
> On Tue, 6 Feb 2018, Matthew Wilcox wrote:
>
> > Personally, I would like us to rename kvfree() to just free(), and have
> > malloc(x) be an alias to kvmalloc(x, GFP_KERNEL), but I haven't won that
> > fight yet.
>
> Maybe lets implement malloc(), free() and realloc() in the kernel to be
> consistent with user space use as possible? Only use the others
> allocation variants for special cases.
They would need to drop the GFP part and default to GFP_KERNEL.
>
> So malloc would check allocation sizes and if < 2* PAGE_SIZE use kmalloc()
> otherwise vmalloc().
Please no, I hate subtle internal decisions like this. It makes
debugging much more difficult, when allocating dynamic sized variables.
When something works at one size but not the other.
-- Steve
>
> free() would free anything you give it.
On Wed, Feb 07, 2018 at 12:09:49PM -0500, Steven Rostedt wrote:
> > Maybe lets implement malloc(), free() and realloc() in the kernel to be
> > consistent with user space use as possible? Only use the others
> > allocation variants for special cases.
>
> They would need to drop the GFP part and default to GFP_KERNEL.
Yes, exactly.
> > So malloc would check allocation sizes and if < 2* PAGE_SIZE use kmalloc()
> > otherwise vmalloc().
>
> Please no, I hate subtle internal decisions like this. It makes
> debugging much more difficult, when allocating dynamic sized variables.
> When something works at one size but not the other.
You know we already have kvmalloc()?
On Wed, 7 Feb 2018 09:19:36 -0800
Matthew Wilcox <[email protected]> wrote:
> > Please no, I hate subtle internal decisions like this. It makes
> > debugging much more difficult, when allocating dynamic sized variables.
> > When something works at one size but not the other.
>
> You know we already have kvmalloc()?
Yes, and the name suggests exactly what it does. It has both "k" and
"v" which tells me that if I use it it could be one or the other.
But a generic "malloc" or "free" that does things differently depending
on the size is a different story.
-- Steve
On Wed, 7 Feb 2018, Steven Rostedt wrote:
> But a generic "malloc" or "free" that does things differently depending
> on the size is a different story.
They would not be used for cases with special requirements but for the
throwaway allows where noone cares about these details. Its just a
convenience for the developers that do not need to be bothered with too
much detail because they are not dealing with codepaths that have special
requirements.
On Wed, Feb 07, 2018 at 05:45:13PM +0100, Jesper Dangaard Brouer wrote:
> On Wed, 7 Feb 2018 08:57:00 -0500
> Steven Rostedt <[email protected]> wrote:
> > To me kvfree() is a special case and should not be used by RCU as a
> > generic function. That would make RCU and MM much more coupled than
> > necessary.
>
> For the record, I fully agree with Steve here.
>
> And being a performance "fanatic" I don't like to have the extra branch
> (and compares) in the free code path... but it's a MM-decision (and
> sometimes you should not listen to "fanatics" ;-))
While free_rcu() is not withut its performance requirements, I think it's
currently dominated by cache misses and not by branches. By the time RCU
gets to run callbacks, memory is certainly L1/L2 cache-cold and probably
L3 cache-cold. Also calling the callback functions is utterly impossible
for the branch predictor.
On Wed, 7 Feb 2018 10:10:55 -0800
Matthew Wilcox <[email protected]> wrote:
> > For the record, I fully agree with Steve here.
Thanks, but...
> >
> > And being a performance "fanatic" I don't like to have the extra branch
> > (and compares) in the free code path... but it's a MM-decision (and
> > sometimes you should not listen to "fanatics" ;-))
>
> While free_rcu() is not withut its performance requirements, I think it's
> currently dominated by cache misses and not by branches. By the time RCU
> gets to run callbacks, memory is certainly L1/L2 cache-cold and probably
> L3 cache-cold. Also calling the callback functions is utterly impossible
> for the branch predictor.
I agree with Matthew.
This is far from any fast path. A few extra branches isn't going to
hurt anything here as it's mostly just garbage collection. With or
without the Spectre fixes.
-- Steve
On Wed, Feb 07, 2018 at 08:57:00AM -0500, Steven Rostedt wrote:
> On Wed, 7 Feb 2018 00:31:04 -0800
> "Paul E. McKenney" <[email protected]> wrote:
>
> > I see problems. We would then have two different names for exactly the
> > same thing.
> >
> > Seems like it would be a lot easier to simply document the existing
> > kfree_rcu() behavior, especially given that it apparently already works.
> > The really doesn't seem to me to be worth a name change.
>
> Honestly, I don't believe this is an RCU sub-system decision. This is a
> memory management decision.
I couldn't agree more!
To that end, what are your thoughts on this patch?
https://lkml.kernel.org/r/[email protected]
Advantages include the ability to optimize based on sl[aou]b state,
getting rid of the 4K offset hack in __is_kfree_rcu_offset(), better
cache localite, and, as you say, putting the naming responsibility
in the memory-management domain.
> If we have kmalloc(), vmalloc(), kfree(), vfree() and kvfree(), and we
> want kmalloc() to be freed with kfree(), and vmalloc() to be freed with
> vfree(), and for strange reasons, we don't know how the data was
> allocated we have kvfree(). That's an mm decision not an rcu one. We
> should have kfree_rcu(), vfree_rcu() and kvfree_rcu(), and honestly,
> they should not depend on kvfree() doing the same thing for everything.
> Each should call the corresponding member that they represent. Which
> would change this patch set.
>
> Why? Too much coupling between RCU and MM. What if in the future
> something changes and kvfree() goes away or changes drastically. We
> would then have to go through all the users of RCU to change them too.
>
> To me kvfree() is a special case and should not be used by RCU as a
> generic function. That would make RCU and MM much more coupled than
> necessary.
And that is one reason I am viewing the name-change patch with great
suspicion, especially given that there seems to be some controversy
among the memory-management folks as to exactly what the names should be.
Thanx, Paul
On Wed, Feb 07, 2018 at 08:55:47AM -0600, Christopher Lameter wrote:
> On Tue, 6 Feb 2018, Paul E. McKenney wrote:
>
> > So it is OK to kvmalloc() something and pass it to either kfree() or
> > kvfree(), and it had better be OK to kvmalloc() something and pass it
> > to kvfree().
>
> kvfree() is fine but not kfree().
Ah, even more fun, then! ;-)
Thanx, Paul
On Wed, Feb 07, 2018 at 01:26:19PM -0500, Steven Rostedt wrote:
> On Wed, 7 Feb 2018 10:10:55 -0800
> Matthew Wilcox <[email protected]> wrote:
>
> > > For the record, I fully agree with Steve here.
>
> Thanks, but...
>
> > >
> > > And being a performance "fanatic" I don't like to have the extra branch
> > > (and compares) in the free code path... but it's a MM-decision (and
> > > sometimes you should not listen to "fanatics" ;-))
> >
> > While free_rcu() is not withut its performance requirements, I think it's
> > currently dominated by cache misses and not by branches. By the time RCU
> > gets to run callbacks, memory is certainly L1/L2 cache-cold and probably
> > L3 cache-cold. Also calling the callback functions is utterly impossible
> > for the branch predictor.
>
> I agree with Matthew.
>
> This is far from any fast path. A few extra branches isn't going to
> hurt anything here as it's mostly just garbage collection. With or
> without the Spectre fixes.
What Steve said!
Thanx, Paul
On Wed, Feb 07, 2018 at 10:10:55AM -0800, Matthew Wilcox wrote:
> On Wed, Feb 07, 2018 at 05:45:13PM +0100, Jesper Dangaard Brouer wrote:
> > On Wed, 7 Feb 2018 08:57:00 -0500
> > Steven Rostedt <[email protected]> wrote:
> > > To me kvfree() is a special case and should not be used by RCU as a
> > > generic function. That would make RCU and MM much more coupled than
> > > necessary.
> >
> > For the record, I fully agree with Steve here.
> >
> > And being a performance "fanatic" I don't like to have the extra branch
> > (and compares) in the free code path... but it's a MM-decision (and
> > sometimes you should not listen to "fanatics" ;-))
>
> While free_rcu() is not withut its performance requirements, I think it's
> currently dominated by cache misses and not by branches. By the time RCU
> gets to run callbacks, memory is certainly L1/L2 cache-cold and probably
> L3 cache-cold. Also calling the callback functions is utterly impossible
> for the branch predictor.
This seems to have fallen by the wayside.
To get things going again, I suggest starting out by simply replacing
the kfree() in __rcu_reclaim() with kvfree(). If desired, a kvfree_rcu()
can also be defined as a synonym for kfree_rcu().
This gets us a very simple and small patch which provides the ability
to dispose of kvmalloc() memory after a grace period.
Thanx, Paul