2020-08-20 03:28:48

by Zhang, Qiang

[permalink] [raw]
Subject: [PATCH] debugobjects: install cpu hotplug callback

From: Zqiang <[email protected]>

When a cpu going offline, we should free objects in "percpu_obj_pool"
free_objs list which corresponding to this cpu.

Signed-off-by: Zqiang <[email protected]>
---
include/linux/cpuhotplug.h | 1 +
lib/debugobjects.c | 23 +++++++++++++++++++++++
2 files changed, 24 insertions(+)

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index a2710e654b64..2e77db655cfa 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -36,6 +36,7 @@ enum cpuhp_state {
CPUHP_X86_MCE_DEAD,
CPUHP_VIRT_NET_DEAD,
CPUHP_SLUB_DEAD,
+ CPUHP_DEBUG_OBJ_DEAD,
CPUHP_MM_WRITEBACK_DEAD,
CPUHP_MM_VMSTAT_DEAD,
CPUHP_SOFTIRQ_DEAD,
diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index fe4557955d97..50e21ed0519e 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -19,6 +19,7 @@
#include <linux/slab.h>
#include <linux/hash.h>
#include <linux/kmemleak.h>
+#include <linux/cpu.h>

#define ODEBUG_HASH_BITS 14
#define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)
@@ -433,6 +434,23 @@ static void free_object(struct debug_obj *obj)
}
}

+#if defined(CONFIG_HOTPLUG_CPU)
+static int object_cpu_offline(unsigned int cpu)
+{
+ struct debug_percpu_free *percpu_pool;
+ struct hlist_node *tmp;
+ struct debug_obj *obj;
+
+ percpu_pool = per_cpu_ptr(&percpu_obj_pool, cpu);
+ hlist_for_each_entry_safe(obj, tmp, &percpu_pool->free_objs, node) {
+ hlist_del(&obj->node);
+ kmem_cache_free(obj_cache, obj);
+ }
+
+ return 0;
+}
+#endif
+
/*
* We run out of memory. That means we probably have tons of objects
* allocated.
@@ -1367,6 +1385,11 @@ void __init debug_objects_mem_init(void)
} else
debug_objects_selftest();

+#if defined(CONFIG_HOTPLUG_CPU)
+ cpuhp_setup_state_nocalls(CPUHP_DEBUG_OBJ_DEAD, "object:offline", NULL,
+ object_cpu_offline);
+#endif
+
/*
* Increase the thresholds for allocating and freeing objects
* according to the number of possible CPUs available in the system.
--
2.17.1


2020-08-25 05:06:48

by Zhang, Qiang

[permalink] [raw]
Subject: 回复: [PATCH] debugobjects: install cpu hotpl ug callback


________________________________________
??????: [email protected] <[email protected]> ???? [email protected] <[email protected]>
????ʱ??: 2020??8??20?? 11:24
?ռ???: [email protected]; [email protected]; [email protected]
????: [email protected]
????: [PATCH] debugobjects: install cpu hotplug callback

From: Zqiang <[email protected]>

When a cpu going offline, we should free objects in "percpu_obj_pool"
free_objs list which corresponding to this cpu.

Signed-off-by: Zqiang <[email protected]>
---
include/linux/cpuhotplug.h | 1 +
lib/debugobjects.c | 23 +++++++++++++++++++++++
2 files changed, 24 insertions(+)

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index a2710e654b64..2e77db655cfa 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -36,6 +36,7 @@ enum cpuhp_state {
CPUHP_X86_MCE_DEAD,
CPUHP_VIRT_NET_DEAD,
CPUHP_SLUB_DEAD,
+ CPUHP_DEBUG_OBJ_DEAD,
CPUHP_MM_WRITEBACK_DEAD,
CPUHP_MM_VMSTAT_DEAD,
CPUHP_SOFTIRQ_DEAD,
diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index fe4557955d97..50e21ed0519e 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -19,6 +19,7 @@
#include <linux/slab.h>
#include <linux/hash.h>
#include <linux/kmemleak.h>
+#include <linux/cpu.h>

#define ODEBUG_HASH_BITS 14
#define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)
@@ -433,6 +434,23 @@ static void free_object(struct debug_obj *obj)
}
}

+#if defined(CONFIG_HOTPLUG_CPU)
+static int object_cpu_offline(unsigned int cpu)
+{
+ struct debug_percpu_free *percpu_pool;
+ struct hlist_node *tmp;
+ struct debug_obj *obj;
+
+ percpu_pool = per_cpu_ptr(&percpu_obj_pool, cpu);
+ hlist_for_each_entry_safe(obj, tmp, &percpu_pool->free_objs, node) {
+ hlist_del(&obj->node);
+ kmem_cache_free(obj_cache, obj);
+ }
+
+ return 0;
+}
+#endif
+
/*
* We run out of memory. That means we probably have tons of objects
* allocated.
@@ -1367,6 +1385,11 @@ void __init debug_objects_mem_init(void)
} else
debug_objects_selftest();

+#if defined(CONFIG_HOTPLUG_CPU)
+ cpuhp_setup_state_nocalls(CPUHP_DEBUG_OBJ_DEAD, "object:offline", NULL,
+ object_cpu_offline);
+#endif
+
/*
* Increase the thresholds for allocating and freeing objects
* according to the number of possible CPUs available in the system.
--
2.17.1

2020-08-25 22:27:54

by Waiman Long

[permalink] [raw]
Subject: Re: 回复: [PATCH] debugobjects: install cpu hotplug callback

On 8/25/20 12:53 AM, Zhang, Qiang wrote:
> ________________________________________
> ??????: [email protected] <[email protected]> ???? [email protected] <[email protected]>
> ????ʱ??: 2020??8??20?? 11:24
> ?ռ???: [email protected]; [email protected]; [email protected]
> ????: [email protected]
> ????: [PATCH] debugobjects: install cpu hotplug callback
>
> From: Zqiang <[email protected]>
>
> When a cpu going offline, we should free objects in "percpu_obj_pool"
> free_objs list which corresponding to this cpu.

The percpu free object pool is supposed to be accessed only by that
particular cpu without any lock. Trying to access it from another cpu
can cause a race condition unless one can make sure that the offline cpu
won't become online in the mean time. There shouldn't be too many free
objects in the percpu pool. Is it worth the effort to free them?

Cheers,
Longman

2020-08-25 23:34:35

by Waiman Long

[permalink] [raw]
Subject: Re: 回复: [PATCH] debugobjects: install cpu hotplug callback

On 8/25/20 6:26 PM, Waiman Long wrote:
> On 8/25/20 12:53 AM, Zhang, Qiang wrote:
>> ________________________________________
>> ??????: [email protected]
>> <[email protected]> ???? [email protected]
>> <[email protected]>
>> ????ʱ??: 2020??8??20?? 11:24
>> ?ռ???: [email protected]; [email protected]; [email protected]
>> ????: [email protected]
>> ????: [PATCH] debugobjects: install cpu hotplug callback
>>
>> From: Zqiang <[email protected]>
>>
>> When a cpu going offline, we should free objects in "percpu_obj_pool"
>> free_objs list which corresponding to this cpu.
>
> The percpu free object pool is supposed to be accessed only by that
> particular cpu without any lock. Trying to access it from another cpu
> can cause a race condition unless one can make sure that the offline
> cpu won't become online in the mean time. There shouldn't be too many
> free objects in the percpu pool. Is it worth the effort to free them?

Or if you can make the to-be-offlined cpu free the debugobjs before it
is offlined. That will work too.

Cheers,
Longman

2020-08-25 23:56:19

by Thomas Gleixner

[permalink] [raw]
Subject: Re: 回复: [PATCH] debugobjects: install cpu hotplug callback

On Tue, Aug 25 2020 at 18:26, Waiman Long wrote:
> On 8/25/20 12:53 AM, Zhang, Qiang wrote:
>>
>> When a cpu going offline, we should free objects in "percpu_obj_pool"
>> free_objs list which corresponding to this cpu.
>
> The percpu free object pool is supposed to be accessed only by that
> particular cpu without any lock. Trying to access it from another cpu
> can cause a race condition unless one can make sure that the offline cpu
> won't become online in the mean time.

It is actually safe because CPU hotplug is globally serialized and there
is no way that an offline CPU will come back from death valley
magically. If such a zombie ever surfaces then we have surely more
serious problems than accessing that pool :)

> There shouldn't be too many free objects in the percpu pool. Is it
> worth the effort to free them?

That's a really good question nevertheless. The only case where this
ever matters is physical hotplug. All other CPU hotplug stuff is
temporarily or in case of a late (post boottime) SMT disable it's going
to be a handful of free objects on that pool. As debugobjects is as the
name says a debug facility the benefit is questionable unless there is a
good reason to do so.

Thanks,

tglx


2020-08-26 01:04:34

by Waiman Long

[permalink] [raw]
Subject: Re: 回复: [PATCH] debugobjects: install cpu hotplug callback

On 8/25/20 7:53 PM, Thomas Gleixner wrote:
> On Tue, Aug 25 2020 at 18:26, Waiman Long wrote:
>> On 8/25/20 12:53 AM, Zhang, Qiang wrote:
>>> When a cpu going offline, we should free objects in "percpu_obj_pool"
>>> free_objs list which corresponding to this cpu.
>> The percpu free object pool is supposed to be accessed only by that
>> particular cpu without any lock. Trying to access it from another cpu
>> can cause a race condition unless one can make sure that the offline cpu
>> won't become online in the mean time.
> It is actually safe because CPU hotplug is globally serialized and there
> is no way that an offline CPU will come back from death valley
> magically. If such a zombie ever surfaces then we have surely more
> serious problems than accessing that pool :)

Thanks for the clarification. I haven't looked into where the callback
functions are being called so I am not 100% sure.

Cheers,
Longman

2020-08-26 08:38:02

by Zhang, Qiang

[permalink] [raw]
Subject: 回复: 回复: [PATCH] debugobjects: install cpu hotplug callback



________________________________________
??????: [email protected] <[email protected]> ???? Thomas Gleixner <[email protected]>
????ʱ??: 2020??8??26?? 7:53
?ռ???: Waiman Long; Zhang, Qiang; [email protected]
????: [email protected]; [email protected]
????: Re: ?ظ?: [PATCH] debugobjects: install cpu hotplug callback

On Tue, Aug 25 2020 at 18:26, Waiman Long wrote:
> On 8/25/20 12:53 AM, Zhang, Qiang wrote:
>>
>> When a cpu going offline, we should free objects in "percpu_obj_pool"
>> free_objs list which corresponding to this cpu.
>
> The percpu free object pool is supposed to be accessed only by that
> particular cpu without any lock. Trying to access it from another cpu
> can cause a race condition unless one can make sure that the offline cpu
> won't become online in the mean time.

>It is actually safe because CPU hotplug is globally serialized and there
>is no way that an offline CPU will come back from death valley
>magically. If such a zombie ever surfaces then we have surely more
>serious problems than accessing that pool :)

> There shouldn't be too many free objects in the percpu pool. Is it
> worth the effort to free them?

>That's a really good question nevertheless. The only case where this
>ever matters is physical hotplug. All other CPU hotplug stuff is
>temporarily or in case of a late (post boottime) SMT disable it's going
>to be a handful of free objects on that pool. As debugobjects is as the
>name says a debug facility the benefit is questionable unless there is a
>good reason to do so.

I don't know there may not be too many objects in the percpu pool,
but that doesn't mean they no need to be free, a CPU may never be online after it is offline. some objects in percpu pool is never free.



>Thanks,

> tglx


2020-08-26 09:44:05

by Thomas Gleixner

[permalink] [raw]
Subject: Re: 回复: 回复: [PATCH] debugobjects: install cpu hotplug callback

On Wed, Aug 26 2020 at 08:34, Qiang Zhang wrote:

> ________________________________________
> 发件人: [email protected] <[email protected]> 代表 Thomas Gleixner <[email protected]>
> 发送时间: 2020年8月26日 7:53
> 收件人: Waiman Long; Zhang, Qiang; [email protected]
> 抄送: [email protected]; [email protected]
> 主题: Re: 回复: [PATCH] debugobjects: install cpu hotplug callback

Can you please fix your mail client not to copy the headers into the
mail body? The headers are already in the mail itself.

> On Tue, Aug 25 2020 at 18:26, Waiman Long wrote:

Something like this is completely sufficient.

>>That's a really good question nevertheless. The only case where this
>>ever matters is physical hotplug. All other CPU hotplug stuff is
>>temporarily or in case of a late (post boottime) SMT disable it's going
>>to be a handful of free objects on that pool. As debugobjects is as the
>>name says a debug facility the benefit is questionable unless there is a
>>good reason to do so.
>
> I don't know there may not be too many objects in the percpu pool,
> but that doesn't mean they no need to be free, a CPU may never be
> online after it is offline. some objects in percpu pool is never
> free.

And this matters because? Because your fully debug enabled kernel will
have an uptime of years after disabling the CPU?

That said, I'm not opposed against this patch, but

'we should free objects'

is not a convincing technical argument for doing this. If we want to
have that then please add proper technical arguments to the changelog.

Thanks,

tglx