2018-11-20 21:37:48

by Qian Cai

Subject: [PATCH v2] debugobjects: scale the static pool size

The current value of the early boot static pool size is not big enough
for systems with a large number of CPUs when timer and/or workqueue
debug objects are selected. As a result, systems with 60+ CPUs and both
timer and workqueue objects enabled can trigger "ODEBUG: Out of memory.
ODEBUG disabled". Hence, fix it by computing the pool size according to
CONFIG_NR_CPUS and the CONFIG_DEBUG_OBJECTS_* options.

Signed-off-by: Qian Cai <[email protected]>
---

Changes since v1:
* Removed the CONFIG_NR_CPUS check since it is always defined.
* Introduced a minimal default pool size to start with. Otherwise, the pool
size would be too low for systems with only a small number of CPUs.
* Adjusted the scale factors according to data.

lib/debugobjects.c | 81 ++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 79 insertions(+), 2 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 70935ed91125..a79f02c29617 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -23,9 +23,81 @@
#define ODEBUG_HASH_BITS 14
#define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)

-#define ODEBUG_POOL_SIZE 1024
+#define ODEBUG_DEFAULT_POOL 512
#define ODEBUG_POOL_MIN_LEVEL 256

+/*
+ * Some debug objects are allocated during early boot. Enabling some
+ * options like timers or workqueue objects may increase the size required
+ * significantly with a large number of CPUs. For example (as of today,
+ * 20 Nov. 2018),
+ *
+ * No. CPUs x 2 (worker pool) objects:
+ *
+ * start_kernel
+ * workqueue_init_early
+ * init_worker_pool
+ * init_timer_key
+ * debug_object_init
+ *
+ * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
+ *
+ * sched_init
+ * hrtick_rq_init
+ * hrtimer_init
+ *
+ * CONFIG_DEBUG_OBJECTS_WORK:
+ * No. CPUs x 6 (workqueue) objects:
+ *
+ * workqueue_init_early
+ * alloc_workqueue
+ * __alloc_workqueue_key
+ * alloc_and_link_pwqs
+ * init_pwq
+ *
+ * Also, plus No. CPUs objects:
+ *
+ * perf_event_init
+ * __init_srcu_struct
+ * init_srcu_struct_fields
+ * init_srcu_struct_nodes
+ * __init_work
+ *
+ * Increase the number a bit more in case the implementations are changed in
+ * the future, as it is better to avoid OOM than to spend a bit more kernel
+ * memory that will/can be freed.
+ */
+#if defined(CONFIG_DEBUG_OBJECTS_TIMERS) && \
+!defined(CONFIG_DEBUG_OBJECTS_WORK)
+/*
+ * With all debug objects config options selected except workqueue, kernel
+ * reports,
+ * 64-CPU: ODEBUG: 466 of 466 active objects replaced
+ * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced
+ */
+#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * 10)
+
+#elif defined(CONFIG_DEBUG_OBJECTS_WORK) && \
+!defined(CONFIG_DEBUG_OBJECTS_TIMERS)
+/*
+ * all the options except the timers:
+ * 64-CPU: ODEBUG: 652 of 652 active objects replaced
+ * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced
+ */
+#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * 10)
+
+#elif defined(CONFIG_DEBUG_OBJECTS_WORK) && \
+defined(CONFIG_DEBUG_OBJECTS_TIMERS)
+/*
+ * all the options:
+ * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced
+ * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced
+ */
+#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * 20)
+#else
+#define ODEBUG_POOL_SIZE ODEBUG_DEFAULT_POOL
+#endif
+
#define ODEBUG_CHUNK_SHIFT PAGE_SHIFT
#define ODEBUG_CHUNK_SIZE (1 << ODEBUG_CHUNK_SHIFT)
#define ODEBUG_CHUNK_MASK (~(ODEBUG_CHUNK_SIZE - 1))
@@ -58,8 +130,13 @@ static int debug_objects_fixups __read_mostly;
static int debug_objects_warnings __read_mostly;
static int debug_objects_enabled __read_mostly
= CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT;
+/*
+ * This is only used after the static objects have been replaced, so there is
+ * no need to scale it to the early boot static pool size; it has already been
+ * scaled according to the actual number of CPUs in the box within
+ * debug_objects_mem_init().
+ */
static int debug_objects_pool_size __read_mostly
- = ODEBUG_POOL_SIZE;
+ = ODEBUG_DEFAULT_POOL;
static int debug_objects_pool_min_level __read_mostly
= ODEBUG_POOL_MIN_LEVEL;
static struct debug_obj_descr *descr_test __read_mostly;
--
2.17.2 (Apple Git-113)



2018-11-20 22:09:08

by Thomas Gleixner

Subject: Re: [PATCH v2] debugobjects: scale the static pool size

On Tue, 20 Nov 2018, Waiman Long wrote:
> On 11/20/2018 03:14 PM, Qian Cai wrote:
> > static struct debug_obj_descr *descr_test __read_mostly;
>
> The calculation for ODEBUG_POOL_SIZE is somewhat hard to read. Maybe you
> can do something like
>
> #ifdef CONFIG_DEBUG_OBJECTS_WORK
> #define ODEBUG_WORK_PCPU_CNT    10
> #else
> #define ODEBUG_WORK_PCPU_CNT    0
> #endif
>
> #ifdef CONFIG_DEBUG_OBJECTS_TIMERS
> #define ODEBUG_TIMERS_PCPU_CNT 10
> #else
> #define ODEBUG_TIMERS_PCPU_CNT 0
> #endif
>
> #define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \
>                          (ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT))

Yes please.

Thanks,

tglx

2018-11-20 22:09:08

by Waiman Long

Subject: Re: [PATCH v2] debugobjects: scale the static pool size

On 11/20/2018 03:14 PM, Qian Cai wrote:
> The current value of the early boot static pool size is not big enough
> for systems with large number of CPUs with timer or/and workqueue
> objects selected. As the results, systems have 60+ CPUs with both timer
> and workqueue objects enabled could trigger "ODEBUG: Out of memory.
> ODEBUG disabled". Hence, fixed it by computing it according to
> CONFIG_NR_CPUS and CONFIG_DEBUG_OBJECTS_* options.
>
> Signed-off-by: Qian Cai <[email protected]>
> ---
>
> Changes since v1:
> * Removed the CONFIG_NR_CPUS check since it is always defined.
> * Introduced a minimal default pool size to start with. Otherwise, the pool
> size would be too low for systems only with a small number of CPUs.
> * Adjusted the scale factors according to data.
>
> lib/debugobjects.c | 81 ++++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 79 insertions(+), 2 deletions(-)
>
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index 70935ed91125..a79f02c29617 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -23,9 +23,81 @@
> #define ODEBUG_HASH_BITS 14
> #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)
>
> -#define ODEBUG_POOL_SIZE 1024
> +#define ODEBUG_DEFAULT_POOL 512
> #define ODEBUG_POOL_MIN_LEVEL 256
>
> +/*
> + * Some debug objects are allocated during the early boot. Enabling some
> + * options like timers or workqueue objects may increase the size required
> + * significantly with large number of CPUs. For example (as today, 20 Nov.
> + * 2018),
> + *
> + * No. CPUs x 2 (worker pool) objects:
> + *
> + * start_kernel
> + * workqueue_init_early
> + * init_worker_pool
> + * init_timer_key
> + * debug_object_init
> + *
> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
> + *
> + * sched_init
> + * hrtick_rq_init
> + * hrtimer_init
> + *
> + * CONFIG_DEBUG_OBJECTS_WORK:
> + * No. CPUs x 6 (workqueue) objects:
> + *
> + * workqueue_init_early
> + * alloc_workqueue
> + * __alloc_workqueue_key
> + * alloc_and_link_pwqs
> + * init_pwq
> + *
> + * Also, plus No. CPUs objects:
> + *
> + * perf_event_init
> + * __init_srcu_struct
> + * init_srcu_struct_fields
> + * init_srcu_struct_nodes
> + * __init_work
> + *
> + * Increase the number a bit more in case the implmentatins are changed in
> + * the future, as it is better to avoid OOM than spending a bit more kernel
> + * memory that will/can be freed.
> + */
> +#if defined(CONFIG_DEBUG_OBJECTS_TIMERS) && \
> +!defined(CONFIG_DEBUG_OBJECTS_WORK)
> +/*
> + * With all debug objects config options selected except workqueue, kernel
> + * reports,
> + * 64-CPU: ODEBUG: 466 of 466 active objects replaced
> + * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced
> + */
> +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * 10)
> +
> +#elif defined(CONFIG_DEBUG_OBJECTS_WORK) && \
> +!defined(CONFIG_DEBUG_OBJECTS_TIMERS)
> +/*
> + * all the options except the timers:
> + * 64-CPU: ODEBUG: 652 of 652 active objects replaced
> + * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced
> + */
> +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * 10)
> +
> +#elif defined(CONFIG_DEBUG_OBJECTS_WORK) && \
> +defined(CONFIG_DEBUG_OBJECTS_TIMERS)
> +/*
> + * all the options:
> + * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced
> + * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced
> + */
> +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * 20)
> +#else
> +#define ODEBUG_POOL_SIZE ODEBUG_DEFAULT_POOL
> +#endif
> +
> #define ODEBUG_CHUNK_SHIFT PAGE_SHIFT
> #define ODEBUG_CHUNK_SIZE (1 << ODEBUG_CHUNK_SHIFT)
> #define ODEBUG_CHUNK_MASK (~(ODEBUG_CHUNK_SIZE - 1))
> @@ -58,8 +130,13 @@ static int debug_objects_fixups __read_mostly;
> static int debug_objects_warnings __read_mostly;
> static int debug_objects_enabled __read_mostly
> = CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT;
> +/*
> + * This is only used after replaced static objects, so no need to scale it
> + * to use the early boot static pool size and it has already been scaled
> + * according to actual No. CPUs in the box within debug_objects_mem_init().
> + */
> static int debug_objects_pool_size __read_mostly
> - = ODEBUG_POOL_SIZE;
> + = ODEBUG_DEFAULT_POOL;
> static int debug_objects_pool_min_level __read_mostly
> = ODEBUG_POOL_MIN_LEVEL;
> static struct debug_obj_descr *descr_test __read_mostly;

The calculation for ODEBUG_POOL_SIZE is somewhat hard to read. Maybe you
can do something like

#ifdef CONFIG_DEBUG_OBJECTS_WORK
#define ODEBUG_WORK_PCPU_CNT    10
#else
#define ODEBUG_WORK_PCPU_CNT    0
#endif

#ifdef CONFIG_DEBUG_OBJECTS_TIMERS
#define ODEBUG_TIMERS_PCPU_CNT 10
#else
#define ODEBUG_TIMERS_PCPU_CNT 0
#endif

#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \
                         (ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT))

Cheers,
Longman


2018-11-20 23:46:27

by Waiman Long

Subject: Re: [PATCH v3] debugobjects: scale the static pool size

On 11/20/2018 06:28 PM, Qian Cai wrote:
> The current value of the early boot static pool size is not big enough
> for systems with large number of CPUs with timer or/and workqueue
> objects selected. As the results, systems have 60+ CPUs with both timer
> and workqueue objects enabled could trigger "ODEBUG: Out of memory.
> ODEBUG disabled". Hence, fixed it by computing it according to
> CONFIG_NR_CPUS and CONFIG_DEBUG_OBJECTS_* options.
>
> Signed-off-by: Qian Cai <[email protected]>
> ---
> Changes since v2:
> * Made the calculation easier to read suggested by [email protected].
>
> Changes since v1:
> * Removed the CONFIG_NR_CPUS check since it is always defined.
> * Introduced a minimal default pool size to start with. Otherwise, the pool
> size would be too low for systems only with a small number of CPUs.
> * Adjusted the scale factors according to data.
>
> lib/debugobjects.c | 82 ++++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 80 insertions(+), 2 deletions(-)
>
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index 70935ed91125..02456a1e57cb 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -23,9 +23,82 @@
> #define ODEBUG_HASH_BITS 14
> #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)
>
> -#define ODEBUG_POOL_SIZE 1024
> +#define ODEBUG_DEFAULT_POOL 512
> #define ODEBUG_POOL_MIN_LEVEL 256
>
> +/*
> + * Some debug objects are allocated during the early boot. Enabling some
> + * options like timers or workqueue objects may increase the size required
> + * significantly with large number of CPUs. For example (as today, 20 Nov.
> + * 2018),
> + *
> + * No. CPUs x 2 (worker pool) objects:
> + *
> + * start_kernel
> + * workqueue_init_early
> + * init_worker_pool
> + * init_timer_key
> + * debug_object_init
> + *
> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
> + *
> + * sched_init
> + * hrtick_rq_init
> + * hrtimer_init
> + *
> + * CONFIG_DEBUG_OBJECTS_WORK:
> + * No. CPUs x 6 (workqueue) objects:
> + *
> + * workqueue_init_early
> + * alloc_workqueue
> + * __alloc_workqueue_key
> + * alloc_and_link_pwqs
> + * init_pwq
> + *
> + * Also, plus No. CPUs objects:
> + *
> + * perf_event_init
> + * __init_srcu_struct
> + * init_srcu_struct_fields
> + * init_srcu_struct_nodes
> + * __init_work
> + *
> + * Increase the number a bit more in case the implmentatins are changed in
> + * the future, as it is better to avoid OOM than spending a bit more kernel
> + * memory that will/can be freed.
> + *
> + * With all debug objects config options selected except the workqueue and
> + * the timers, kernel reports,
> + * 64-CPU: ODEBUG: 4 of 4 active objects replaced
> + * 256-CPU: ODEBUG: 4 of 4 active objects replaced
> + *
> + * all the options except the workqueue:
> + * 64-CPU: ODEBUG: 466 of 466 active objects replaced
> + * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced
> + *
> + * all the options except the timers:
> + * 64-CPU: ODEBUG: 652 of 652 active objects replaced
> + * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced
> + *
> + * all the options:
> + * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced
> + * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced
> + */
> +#ifdef CONFIG_DEBUG_OBJECTS_WORK
> +#define ODEBUG_WORK_PCPU_CNT 10
> +#else
> +#define ODEBUG_WORK_PCPU_CNT 0
> +#endif /* CONFIG_DEBUG_OBJECTS_WORK */
> +
> +#ifdef CONFIG_DEBUG_OBJECTS_TIMERS
> +#define ODEBUG_TIMERS_PCPU_CNT 10
> +#else
> +#define ODEBUG_TIMERS_PCPU_CNT 0
> +#endif /* CONFIG_DEBUG_OBJECTS_TIMERS */
> +
> +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \
> +(ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT))

Just a nit about indentation. There should be some indentation in the
continued line.

> +
> #define ODEBUG_CHUNK_SHIFT PAGE_SHIFT
> #define ODEBUG_CHUNK_SIZE (1 << ODEBUG_CHUNK_SHIFT)
> #define ODEBUG_CHUNK_MASK (~(ODEBUG_CHUNK_SIZE - 1))
> @@ -58,8 +131,13 @@ static int debug_objects_fixups __read_mostly;
> static int debug_objects_warnings __read_mostly;
> static int debug_objects_enabled __read_mostly
> = CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT;
> +/*
> + * This is only used after replaced static objects, so no need to scale it
> + * to use the early boot static pool size and it has already been scaled
> + * according to actual No. CPUs in the box within debug_objects_mem_init().
> + */
> static int debug_objects_pool_size __read_mostly
> - = ODEBUG_POOL_SIZE;
> + = ODEBUG_DEFAULT_POOL;
> static int debug_objects_pool_min_level __read_mostly
> = ODEBUG_POOL_MIN_LEVEL;
> static struct debug_obj_descr *descr_test __read_mostly;

Other than that, the patch looks OK to me.

Cheers,
Longman


2018-11-20 23:59:55

by Joe Perches

Subject: Re: [PATCH v3] debugobjects: scale the static pool size

On Wed, 2018-11-21 at 00:54 +0100, Qian Cai wrote:
> On 11/20/18 at 6:38 PM, Waiman Long wrote:
> > On 11/20/2018 06:28 PM, Qian Cai wrote:
> > > The current value of the early boot static pool size is not big enough
> > > for systems with large number of CPUs with timer or/and workqueue
> > > objects selected. As the results, systems have 60+ CPUs with both timer
> > > and workqueue objects enabled could trigger "ODEBUG: Out of memory.
> > > ODEBUG disabled". Hence, fixed it by computing it according to
> > > CONFIG_NR_CPUS and CONFIG_DEBUG_OBJECTS_* options.
[]
> > > +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \
> > > +(ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT))
> >
> > Just a nit about indentation. There should be some indentation in the
> > continued line.
>
> I am fine to add that, but checkpatch.pl complained
> that there should no spaces at the beginning of a
> newline. Guess we just ignore the warning?
>
> “please, no spaces at the start of a line”

It wants tabs '0x09' and not spaces '0x20'
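
For example, the macro continuation would then be indented with a tab rather
than spaces, which is what the eventual v4 posting below does:

#define ODEBUG_POOL_SIZE	(ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \
				 (ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT))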


2018-11-21 00:16:41

by Qian Cai

Subject: [PATCH v3] debugobjects: scale the static pool size

The current value of the early boot static pool size is not big enough
for systems with a large number of CPUs when timer and/or workqueue
debug objects are selected. As a result, systems with 60+ CPUs and both
timer and workqueue objects enabled can trigger "ODEBUG: Out of memory.
ODEBUG disabled". Hence, fix it by computing the pool size according to
CONFIG_NR_CPUS and the CONFIG_DEBUG_OBJECTS_* options.

Signed-off-by: Qian Cai <[email protected]>
---
Changes since v2:
* Made the calculation easier to read, as suggested by [email protected].

Changes since v1:
* Removed the CONFIG_NR_CPUS check since it is always defined.
* Introduced a minimal default pool size to start with. Otherwise, the pool
size would be too low for systems with only a small number of CPUs.
* Adjusted the scale factors according to data.

lib/debugobjects.c | 82 ++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 80 insertions(+), 2 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 70935ed91125..02456a1e57cb 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -23,9 +23,82 @@
#define ODEBUG_HASH_BITS 14
#define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)

-#define ODEBUG_POOL_SIZE 1024
+#define ODEBUG_DEFAULT_POOL 512
#define ODEBUG_POOL_MIN_LEVEL 256

+/*
+ * Some debug objects are allocated during early boot. Enabling some
+ * options like timers or workqueue objects may increase the size required
+ * significantly with a large number of CPUs. For example (as of today,
+ * 20 Nov. 2018),
+ *
+ * No. CPUs x 2 (worker pool) objects:
+ *
+ * start_kernel
+ * workqueue_init_early
+ * init_worker_pool
+ * init_timer_key
+ * debug_object_init
+ *
+ * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
+ *
+ * sched_init
+ * hrtick_rq_init
+ * hrtimer_init
+ *
+ * CONFIG_DEBUG_OBJECTS_WORK:
+ * No. CPUs x 6 (workqueue) objects:
+ *
+ * workqueue_init_early
+ * alloc_workqueue
+ * __alloc_workqueue_key
+ * alloc_and_link_pwqs
+ * init_pwq
+ *
+ * Also, plus No. CPUs objects:
+ *
+ * perf_event_init
+ * __init_srcu_struct
+ * init_srcu_struct_fields
+ * init_srcu_struct_nodes
+ * __init_work
+ *
+ * Increase the number a bit more in case the implementations are changed in
+ * the future, as it is better to avoid OOM than to spend a bit more kernel
+ * memory that will/can be freed.
+ *
+ * With all debug objects config options selected except the workqueue and
+ * the timers, kernel reports,
+ * 64-CPU: ODEBUG: 4 of 4 active objects replaced
+ * 256-CPU: ODEBUG: 4 of 4 active objects replaced
+ *
+ * all the options except the workqueue:
+ * 64-CPU: ODEBUG: 466 of 466 active objects replaced
+ * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced
+ *
+ * all the options except the timers:
+ * 64-CPU: ODEBUG: 652 of 652 active objects replaced
+ * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced
+ *
+ * all the options:
+ * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced
+ * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced
+ */
+#ifdef CONFIG_DEBUG_OBJECTS_WORK
+#define ODEBUG_WORK_PCPU_CNT 10
+#else
+#define ODEBUG_WORK_PCPU_CNT 0
+#endif /* CONFIG_DEBUG_OBJECTS_WORK */
+
+#ifdef CONFIG_DEBUG_OBJECTS_TIMERS
+#define ODEBUG_TIMERS_PCPU_CNT 10
+#else
+#define ODEBUG_TIMERS_PCPU_CNT 0
+#endif /* CONFIG_DEBUG_OBJECTS_TIMERS */
+
+#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \
+(ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT))
+
#define ODEBUG_CHUNK_SHIFT PAGE_SHIFT
#define ODEBUG_CHUNK_SIZE (1 << ODEBUG_CHUNK_SHIFT)
#define ODEBUG_CHUNK_MASK (~(ODEBUG_CHUNK_SIZE - 1))
@@ -58,8 +131,13 @@ static int debug_objects_fixups __read_mostly;
static int debug_objects_warnings __read_mostly;
static int debug_objects_enabled __read_mostly
= CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT;
+/*
+ * This is only used after the static objects have been replaced, so there is
+ * no need to scale it to the early boot static pool size; it has already been
+ * scaled according to the actual number of CPUs in the box within
+ * debug_objects_mem_init().
+ */
static int debug_objects_pool_size __read_mostly
- = ODEBUG_POOL_SIZE;
+ = ODEBUG_DEFAULT_POOL;
static int debug_objects_pool_min_level __read_mostly
= ODEBUG_POOL_MIN_LEVEL;
static struct debug_obj_descr *descr_test __read_mostly;
--
2.17.2 (Apple Git-113)


2018-11-21 00:26:27

by Qian Cai

Subject: Re: [PATCH v3] debugobjects: scale the static pool size

On 11/20/18 at 6:38 PM, Waiman Long wrote:

> On 11/20/2018 06:28 PM, Qian Cai wrote:
> > The current value of the early boot static pool size is not big enough
> > for systems with large number of CPUs with timer or/and workqueue
> > objects selected. As the results, systems have 60+ CPUs with both timer
> > and workqueue objects enabled could trigger "ODEBUG: Out of memory.
> > ODEBUG disabled". Hence, fixed it by computing it according to
> > CONFIG_NR_CPUS and CONFIG_DEBUG_OBJECTS_* options.
> >
> > Signed-off-by: Qian Cai <[email protected]>
> > ---
> > Changes since v2:
> > * Made the calculation easier to read suggested by [email protected].
> >
> > Changes since v1:
> > * Removed the CONFIG_NR_CPUS check since it is always defined.
> > * Introduced a minimal default pool size to start with. Otherwise, the pool
> > size would be too low for systems only with a small number of CPUs.
> > * Adjusted the scale factors according to data.
> >
> > lib/debugobjects.c | 82 ++++++++++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 80 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> > index 70935ed91125..02456a1e57cb 100644
> > --- a/lib/debugobjects.c
> > +++ b/lib/debugobjects.c
> > @@ -23,9 +23,82 @@
> > #define ODEBUG_HASH_BITS 14
> > #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)
> >
> > -#define ODEBUG_POOL_SIZE 1024
> > +#define ODEBUG_DEFAULT_POOL 512
> > #define ODEBUG_POOL_MIN_LEVEL 256
> >
> > +/*
> > + * Some debug objects are allocated during the early boot. Enabling some
> > + * options like timers or workqueue objects may increase the size required
> > + * significantly with large number of CPUs. For example (as today, 20 Nov.
> > + * 2018),
> > + *
> > + * No. CPUs x 2 (worker pool) objects:
> > + *
> > + * start_kernel
> > + * workqueue_init_early
> > + * init_worker_pool
> > + * init_timer_key
> > + * debug_object_init
> > + *
> > + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
> > + *
> > + * sched_init
> > + * hrtick_rq_init
> > + * hrtimer_init
> > + *
> > + * CONFIG_DEBUG_OBJECTS_WORK:
> > + * No. CPUs x 6 (workqueue) objects:
> > + *
> > + * workqueue_init_early
> > + * alloc_workqueue
> > + * __alloc_workqueue_key
> > + * alloc_and_link_pwqs
> > + * init_pwq
> > + *
> > + * Also, plus No. CPUs objects:
> > + *
> > + * perf_event_init
> > + * __init_srcu_struct
> > + * init_srcu_struct_fields
> > + * init_srcu_struct_nodes
> > + * __init_work
> > + *
> > + * Increase the number a bit more in case the implmentatins are changed in
> > + * the future, as it is better to avoid OOM than spending a bit more kernel
> > + * memory that will/can be freed.
> > + *
> > + * With all debug objects config options selected except the workqueue and
> > + * the timers, kernel reports,
> > + * 64-CPU: ODEBUG: 4 of 4 active objects replaced
> > + * 256-CPU: ODEBUG: 4 of 4 active objects replaced
> > + *
> > + * all the options except the workqueue:
> > + * 64-CPU: ODEBUG: 466 of 466 active objects replaced
> > + * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced
> > + *
> > + * all the options except the timers:
> > + * 64-CPU: ODEBUG: 652 of 652 active objects replaced
> > + * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced
> > + *
> > + * all the options:
> > + * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced
> > + * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced
> > + */
> > +#ifdef CONFIG_DEBUG_OBJECTS_WORK
> > +#define ODEBUG_WORK_PCPU_CNT 10
> > +#else
> > +#define ODEBUG_WORK_PCPU_CNT 0
> > +#endif /* CONFIG_DEBUG_OBJECTS_WORK */
> > +
> > +#ifdef CONFIG_DEBUG_OBJECTS_TIMERS
> > +#define ODEBUG_TIMERS_PCPU_CNT 10
> > +#else
> > +#define ODEBUG_TIMERS_PCPU_CNT 0
> > +#endif /* CONFIG_DEBUG_OBJECTS_TIMERS */
> > +
> > +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \
> > +(ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT))
>
> Just a nit about indentation. There should be some indentation in the
> continued line.

I am fine with adding that, but checkpatch.pl complained
that there should be no spaces at the beginning of a
new line. Guess we just ignore the warning?

“please, no spaces at the start of a line”

2018-11-21 02:14:30

by Qian Cai

Subject: [PATCH v4] debugobjects: scale the static pool size

The current value of the early boot static pool size is not big enough
for systems with a large number of CPUs when timer and/or workqueue
debug objects are selected. As a result, systems with 60+ CPUs and both
timer and workqueue objects enabled can trigger "ODEBUG: Out of memory.
ODEBUG disabled". Hence, fix it by computing the pool size according to
CONFIG_NR_CPUS and the CONFIG_DEBUG_OBJECTS_* options.

Signed-off-by: Qian Cai <[email protected]>
---

Changes since v3:
* style fixes

Changes since v2:
* Made the calculation easier to read, as suggested by [email protected].

Changes since v1:
* Removed the CONFIG_NR_CPUS check since it is always defined.
* Introduced a minimal default pool size to start with. Otherwise, the pool
size would be too low for systems with only a small number of CPUs.
* Adjusted the scale factors according to data.

lib/debugobjects.c | 81 ++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 79 insertions(+), 2 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 70935ed91125..140571aa483c 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -23,9 +23,81 @@
#define ODEBUG_HASH_BITS 14
#define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)

-#define ODEBUG_POOL_SIZE 1024
+#define ODEBUG_DEFAULT_POOL 512
#define ODEBUG_POOL_MIN_LEVEL 256

+/*
+ * Some debug objects are allocated during early boot. Enabling some options
+ * like timers or workqueue objects may increase the size required significantly
+ * with a large number of CPUs. For example (as of today, 20 Nov. 2018),
+ *
+ * No. CPUs x 2 (worker pool) objects:
+ *
+ * start_kernel
+ * workqueue_init_early
+ * init_worker_pool
+ * init_timer_key
+ * debug_object_init
+ *
+ * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
+ *
+ * sched_init
+ * hrtick_rq_init
+ * hrtimer_init
+ *
+ * CONFIG_DEBUG_OBJECTS_WORK:
+ * No. CPUs x 6 (workqueue) objects:
+ *
+ * workqueue_init_early
+ * alloc_workqueue
+ * __alloc_workqueue_key
+ * alloc_and_link_pwqs
+ * init_pwq
+ *
+ * Also, plus No. CPUs objects:
+ *
+ * perf_event_init
+ * __init_srcu_struct
+ * init_srcu_struct_fields
+ * init_srcu_struct_nodes
+ * __init_work
+ *
+ * Increase the number a bit more in case the implementations are changed in
+ * the future, as it is better to avoid OOM than to spend a bit more kernel
+ * memory that will/can be freed.
+ *
+ * With all debug objects config options selected except the workqueue and the
+ * timers, kernel reports,
+ * 64-CPU: ODEBUG: 4 of 4 active objects replaced
+ * 256-CPU: ODEBUG: 4 of 4 active objects replaced
+ *
+ * all the options except the workqueue:
+ * 64-CPU: ODEBUG: 466 of 466 active objects replaced
+ * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced
+ *
+ * all the options except the timers:
+ * 64-CPU: ODEBUG: 652 of 652 active objects replaced
+ * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced
+ *
+ * all the options:
+ * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced
+ * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced
+ */
+#ifdef CONFIG_DEBUG_OBJECTS_WORK
+#define ODEBUG_WORK_PCPU_CNT 10
+#else
+#define ODEBUG_WORK_PCPU_CNT 0
+#endif /* CONFIG_DEBUG_OBJECTS_WORK */
+
+#ifdef CONFIG_DEBUG_OBJECTS_TIMERS
+#define ODEBUG_TIMERS_PCPU_CNT 10
+#else
+#define ODEBUG_TIMERS_PCPU_CNT 0
+#endif /* CONFIG_DEBUG_OBJECTS_TIMERS */
+
+#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \
+ (ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT))
+
#define ODEBUG_CHUNK_SHIFT PAGE_SHIFT
#define ODEBUG_CHUNK_SIZE (1 << ODEBUG_CHUNK_SHIFT)
#define ODEBUG_CHUNK_MASK (~(ODEBUG_CHUNK_SIZE - 1))
@@ -58,8 +130,13 @@ static int debug_objects_fixups __read_mostly;
static int debug_objects_warnings __read_mostly;
static int debug_objects_enabled __read_mostly
= CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT;
+/*
+ * This is only used after the static objects have been replaced, so there is
+ * no need to scale it to the early boot static pool size; it has already been
+ * scaled according to the actual number of CPUs in the box within
+ * debug_objects_mem_init().
+ */
static int debug_objects_pool_size __read_mostly
- = ODEBUG_POOL_SIZE;
+ = ODEBUG_DEFAULT_POOL;
static int debug_objects_pool_min_level __read_mostly
= ODEBUG_POOL_MIN_LEVEL;
static struct debug_obj_descr *descr_test __read_mostly;
--
2.17.2 (Apple Git-113)


2018-11-21 14:56:31

by Waiman Long

Subject: Re: [PATCH v4] debugobjects: scale the static pool size

On 11/20/2018 09:11 PM, Qian Cai wrote:
> The current value of the early boot static pool size is not big enough
> for systems with large number of CPUs with timer or/and workqueue
> objects selected. As the results, systems have 60+ CPUs with both timer
> and workqueue objects enabled could trigger "ODEBUG: Out of memory.
> ODEBUG disabled". Hence, fixed it by computing it according to
> CONFIG_NR_CPUS and CONFIG_DEBUG_OBJECTS_* options.
>
> Signed-off-by: Qian Cai <[email protected]>
> ---
>
> Changes since v3:
> * style fixes
>
> Changes since v2:
> * Made the calculation easier to read suggested by [email protected].
>
> Changes since v1:
> * Removed the CONFIG_NR_CPUS check since it is always defined.
> * Introduced a minimal default pool size to start with. Otherwise, the pool
> size would be too low for systems only with a small number of CPUs.
> * Adjusted the scale factors according to data.
>
> lib/debugobjects.c | 81 ++++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 79 insertions(+), 2 deletions(-)
>
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index 70935ed91125..140571aa483c 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -23,9 +23,81 @@
> #define ODEBUG_HASH_BITS 14
> #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)
>
> -#define ODEBUG_POOL_SIZE 1024
> +#define ODEBUG_DEFAULT_POOL 512
> #define ODEBUG_POOL_MIN_LEVEL 256
>
> +/*
> + * Some debug objects are allocated during the early boot. Enabling some options
> + * like timers or workqueue objects may increase the size required significantly
> + * with large number of CPUs. For example (as today, 20 Nov. 2018),
> + *
> + * No. CPUs x 2 (worker pool) objects:
> + *
> + * start_kernel
> + * workqueue_init_early
> + * init_worker_pool
> + * init_timer_key
> + * debug_object_init
> + *
> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
> + *
> + * sched_init
> + * hrtick_rq_init
> + * hrtimer_init
> + *
> + * CONFIG_DEBUG_OBJECTS_WORK:
> + * No. CPUs x 6 (workqueue) objects:
> + *
> + * workqueue_init_early
> + * alloc_workqueue
> + * __alloc_workqueue_key
> + * alloc_and_link_pwqs
> + * init_pwq
> + *
> + * Also, plus No. CPUs objects:
> + *
> + * perf_event_init
> + * __init_srcu_struct
> + * init_srcu_struct_fields
> + * init_srcu_struct_nodes
> + * __init_work
> + *
> + * Increase the number a bit more in case the implmentatins are changed in the
> + * future, as it is better to avoid OOM than spending a bit more kernel memory
> + * that will/can be freed.
> + *
> + * With all debug objects config options selected except the workqueue and the
> + * timers, kernel reports,
> + * 64-CPU: ODEBUG: 4 of 4 active objects replaced
> + * 256-CPU: ODEBUG: 4 of 4 active objects replaced
> + *
> + * all the options except the workqueue:
> + * 64-CPU: ODEBUG: 466 of 466 active objects replaced
> + * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced
> + *
> + * all the options except the timers:
> + * 64-CPU: ODEBUG: 652 of 652 active objects replaced
> + * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced
> + *
> + * all the options:
> + * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced
> + * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced
> + */
> +#ifdef CONFIG_DEBUG_OBJECTS_WORK
> +#define ODEBUG_WORK_PCPU_CNT 10
> +#else
> +#define ODEBUG_WORK_PCPU_CNT 0
> +#endif /* CONFIG_DEBUG_OBJECTS_WORK */
> +
> +#ifdef CONFIG_DEBUG_OBJECTS_TIMERS
> +#define ODEBUG_TIMERS_PCPU_CNT 10
> +#else
> +#define ODEBUG_TIMERS_PCPU_CNT 0
> +#endif /* CONFIG_DEBUG_OBJECTS_TIMERS */
> +
> +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \
> + (ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT))
> +
> #define ODEBUG_CHUNK_SHIFT PAGE_SHIFT
> #define ODEBUG_CHUNK_SIZE (1 << ODEBUG_CHUNK_SHIFT)
> #define ODEBUG_CHUNK_MASK (~(ODEBUG_CHUNK_SIZE - 1))
> @@ -58,8 +130,13 @@ static int debug_objects_fixups __read_mostly;
> static int debug_objects_warnings __read_mostly;
> static int debug_objects_enabled __read_mostly
> = CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT;
> +/*
> + * This is only used after replaced static objects, so no need to scale it to
> + * use the early boot static pool size and it has already been scaled according
> + * to actual No. CPUs in the box within debug_objects_mem_init().
> + */
> static int debug_objects_pool_size __read_mostly
> - = ODEBUG_POOL_SIZE;
> + = ODEBUG_DEFAULT_POOL;
> static int debug_objects_pool_min_level __read_mostly
> = ODEBUG_POOL_MIN_LEVEL;
> static struct debug_obj_descr *descr_test __read_mostly;

Acked-by: Waiman Long <[email protected]>


2018-11-24 07:45:19

by Thomas Gleixner

Subject: Re: [PATCH v4] debugobjects: scale the static pool size

On Tue, 20 Nov 2018, Qian Cai wrote:

Looking deeper at that.

> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index 70935ed91125..140571aa483c 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -23,9 +23,81 @@
> #define ODEBUG_HASH_BITS 14
> #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)
>
> -#define ODEBUG_POOL_SIZE 1024
> +#define ODEBUG_DEFAULT_POOL 512
> #define ODEBUG_POOL_MIN_LEVEL 256
>
> +/*
> + * Some debug objects are allocated during the early boot. Enabling some options
> + * like timers or workqueue objects may increase the size required significantly
> + * with large number of CPUs. For example (as today, 20 Nov. 2018),
> + *
> + * No. CPUs x 2 (worker pool) objects:
> + *
> + * start_kernel
> + * workqueue_init_early
> + * init_worker_pool
> + * init_timer_key
> + * debug_object_init
> + *
> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
> + *
> + * sched_init
> + * hrtick_rq_init
> + * hrtimer_init
> + *
> + * CONFIG_DEBUG_OBJECTS_WORK:
> + * No. CPUs x 6 (workqueue) objects:
> + *
> + * workqueue_init_early
> + * alloc_workqueue
> + * __alloc_workqueue_key
> + * alloc_and_link_pwqs
> + * init_pwq
> + *
> + * Also, plus No. CPUs objects:
> + *
> + * perf_event_init
> + * __init_srcu_struct
> + * init_srcu_struct_fields
> + * init_srcu_struct_nodes
> + * __init_work

None of the things are actually used or required _BEFORE_
debug_objects_mem_init() is invoked.

The reason why the call is at this place in start_kernel() is
historical. It's because back in the days when debugobjects were added the
memory allocator was enabled way later than today. So we can just move the
debug_objects_mem_init() call right before sched_init() I think.

> + * Increase the number a bit more in case the implmentatins are changed in the
> + * future, as it is better to avoid OOM than spending a bit more kernel memory
> + * that will/can be freed.
> + *
> + * With all debug objects config options selected except the workqueue and the
> + * timers, kernel reports,
> + * 64-CPU: ODEBUG: 4 of 4 active objects replaced
> + * 256-CPU: ODEBUG: 4 of 4 active objects replaced
> + *
> + * all the options except the workqueue:
> + * 64-CPU: ODEBUG: 466 of 466 active objects replaced
> + * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced
> + *
> + * all the options except the timers:
> + * 64-CPU: ODEBUG: 652 of 652 active objects replaced
> + * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced
> + *
> + * all the options:
> + * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced
> + * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced
> + */
> +#ifdef CONFIG_DEBUG_OBJECTS_WORK
> +#define ODEBUG_WORK_PCPU_CNT 10
> +#else
> +#define ODEBUG_WORK_PCPU_CNT 0
> +#endif /* CONFIG_DEBUG_OBJECTS_WORK */
> +
> +#ifdef CONFIG_DEBUG_OBJECTS_TIMERS
> +#define ODEBUG_TIMERS_PCPU_CNT 10
> +#else
> +#define ODEBUG_TIMERS_PCPU_CNT 0
> +#endif /* CONFIG_DEBUG_OBJECTS_TIMERS */

Btw, the scaling here is way off.

> +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \
> + (ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT))

 CONFIG_NR_CPUS      Pool size     Storage size

 256                 256512        4M

According to your list above it uses 4378 objects for 256 CPUs. That's off
by a factor of ~58.

But we can spare all that and just move the init call.

Thanks,

tglx


2018-11-24 07:58:29

by Waiman Long

Subject: Re: [PATCH] debugobjects: call debug_objects_mem_init earlier

On 11/22/2018 11:31 PM, Qian Cai wrote:
> The current value of the early boot static pool size, 1024 is not big
> enough for systems with large number of CPUs with timer or/and workqueue
> objects selected. As the results, systems have 60+ CPUs with both timer
> and workqueue objects enabled could trigger "ODEBUG: Out of memory.
> ODEBUG disabled".
>
> Some debug objects are allocated during the early boot. Enabling some
> options like timers or workqueue objects may increase the size required
> significantly with large number of CPUs. For example,
>
> CONFIG_DEBUG_OBJECTS_TIMERS:
> No. CPUs x 2 (worker pool) objects:
> start_kernel
> workqueue_init_early
> init_worker_pool
> init_timer_key
> debug_object_init
>
> No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
> sched_init
> hrtick_rq_init
> hrtimer_init
>
> CONFIG_DEBUG_OBJECTS_WORK:
> No. CPUs x 6 (workqueue) objects:
> workqueue_init_early
> alloc_workqueue
> __alloc_workqueue_key
> alloc_and_link_pwqs
> init_pwq
>
> Also, plus No. CPUs objects:
> perf_event_init
> __init_srcu_struct
> init_srcu_struct_fields
> init_srcu_struct_nodes
> __init_work
>
> However, none of the things are actually used or required beofre
> debug_objects_mem_init() is invoked.
>
> According to tglx,
> "the reason why the call is at this place in start_kernel() is
> historical. It's because back in the days when debugobjects were added
> the memory allocator was enabled way later than today. So we can just
> move the debug_objects_mem_init() call right before sched_init()."
>
> Afterwards, when calling debug_objects_mem_init(), interrupts have
> already been disabled and lockdep_init() will only be called later, so
> no need to worry about interrupts in
> debug_objects_replace_static_objects().
>
> Suggested-by: Thomas Gleixner <[email protected]>
> Signed-off-by: Qian Cai <[email protected]>
> ---
> init/main.c | 3 ++-
> lib/debugobjects.c | 8 --------
> 2 files changed, 2 insertions(+), 9 deletions(-)
>
> diff --git a/init/main.c b/init/main.c
> index ee147103ba1b..f2c35dc50851 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -600,6 +600,8 @@ asmlinkage __visible void __init start_kernel(void)
> /* trace_printk can be enabled here */
> early_trace_init();
>
> + debug_objects_mem_init();
> +
> /*
> * Set up the scheduler prior starting any interrupts (such as the
> * timer interrupt). Full topology setup happens at smp_init()
> @@ -697,7 +699,6 @@ asmlinkage __visible void __init start_kernel(void)
> #endif
> page_ext_init();
> kmemleak_init();
> - debug_objects_mem_init();
> setup_per_cpu_pageset();
> numa_policy_init();
> acpi_early_init();
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index 70935ed91125..cc5818ced652 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -1132,13 +1132,6 @@ static int __init debug_objects_replace_static_objects(void)
> hlist_add_head(&obj->node, &objects);
> }
>
> - /*
> - * When debug_objects_mem_init() is called we know that only
> - * one CPU is up, so disabling interrupts is enough
> - * protection. This avoids the lockdep hell of lock ordering.
> - */
> - local_irq_disable();

I think you should have a comment saying that debug_objects_mem_init() is
called early with only one CPU up and interrupts disabled. So it is safe
to replace static objects without any protection.
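
Something along these lines would do (just a sketch of the wording, not the
final comment):

	/*
	 * debug_objects_mem_init() is called early during boot, while only one
	 * CPU is up and interrupts are disabled, so the static objects can be
	 * replaced without any extra protection.
	 */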

> -
> /* Remove the statically allocated objects from the pool */
> hlist_for_each_entry_safe(obj, tmp, &obj_pool, node)
> hlist_del(&obj->node);
> @@ -1158,7 +1151,6 @@ static int __init debug_objects_replace_static_objects(void)
> cnt++;
> }
> }
> - local_irq_enable();
>
> pr_debug("%d of %d active objects replaced\n",
> cnt, obj_pool_used);

-Longman


2018-11-24 07:59:09

by Qian Cai

Subject: [PATCH] debugobjects: call debug_objects_mem_init earlier

The current value of the early boot static pool size, 1024, is not big
enough for systems with a large number of CPUs when timer and/or
workqueue debug objects are selected. As a result, systems with 60+ CPUs
and both timer and workqueue objects enabled can trigger "ODEBUG: Out of
memory. ODEBUG disabled".

Some debug objects are allocated during early boot. Enabling some
options like timers or workqueue objects may increase the size required
significantly with a large number of CPUs. For example,

CONFIG_DEBUG_OBJECTS_TIMERS:
No. CPUs x 2 (worker pool) objects:
start_kernel
workqueue_init_early
init_worker_pool
init_timer_key
debug_object_init

No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
sched_init
hrtick_rq_init
hrtimer_init

CONFIG_DEBUG_OBJECTS_WORK:
No. CPUs x 6 (workqueue) objects:
workqueue_init_early
alloc_workqueue
__alloc_workqueue_key
alloc_and_link_pwqs
init_pwq

Also, plus No. CPUs objects:
perf_event_init
__init_srcu_struct
init_srcu_struct_fields
init_srcu_struct_nodes
__init_work

However, none of the things are actually used or required beofre
debug_objects_mem_init() is invoked.

According to tglx,
"the reason why the call is at this place in start_kernel() is
historical. It's because back in the days when debugobjects were added
the memory allocator was enabled way later than today. So we can just
move the debug_objects_mem_init() call right before sched_init()."

Afterwards, when calling debug_objects_mem_init(), interrupts have
already been disabled and lockdep_init() will only be called later, so
no need to worry about interrupts in
debug_objects_replace_static_objects().

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Qian Cai <[email protected]>
---
init/main.c | 3 ++-
lib/debugobjects.c | 8 --------
2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/init/main.c b/init/main.c
index ee147103ba1b..f2c35dc50851 100644
--- a/init/main.c
+++ b/init/main.c
@@ -600,6 +600,8 @@ asmlinkage __visible void __init start_kernel(void)
/* trace_printk can be enabled here */
early_trace_init();

+ debug_objects_mem_init();
+
/*
* Set up the scheduler prior starting any interrupts (such as the
* timer interrupt). Full topology setup happens at smp_init()
@@ -697,7 +699,6 @@ asmlinkage __visible void __init start_kernel(void)
#endif
page_ext_init();
kmemleak_init();
- debug_objects_mem_init();
setup_per_cpu_pageset();
numa_policy_init();
acpi_early_init();
diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 70935ed91125..cc5818ced652 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -1132,13 +1132,6 @@ static int __init debug_objects_replace_static_objects(void)
hlist_add_head(&obj->node, &objects);
}

- /*
- * When debug_objects_mem_init() is called we know that only
- * one CPU is up, so disabling interrupts is enough
- * protection. This avoids the lockdep hell of lock ordering.
- */
- local_irq_disable();
-
/* Remove the statically allocated objects from the pool */
hlist_for_each_entry_safe(obj, tmp, &obj_pool, node)
hlist_del(&obj->node);
@@ -1158,7 +1151,6 @@ static int __init debug_objects_replace_static_objects(void)
cnt++;
}
}
- local_irq_enable();

pr_debug("%d of %d active objects replaced\n",
cnt, obj_pool_used);
--
2.17.2 (Apple Git-113)


2018-11-24 08:54:00

by Thomas Gleixner

Subject: Re: [PATCH] debugobjects: call debug_objects_mem_init earlier

On Thu, 22 Nov 2018, Waiman Long wrote:
> On 11/22/2018 11:31 PM, Qian Cai wrote:
> > The current value of the early boot static pool size, 1024 is not big
> > enough for systems with large number of CPUs with timer or/and workqueue
> > objects selected. As the results, systems have 60+ CPUs with both timer
> > and workqueue objects enabled could trigger "ODEBUG: Out of memory.
> > ODEBUG disabled".
> >
> > However, none of the things are actually used or required beofre

before

> > debug_objects_mem_init() is invoked.
> >
> > According to tglx,
> > "the reason why the call is at this place in start_kernel() is
> > historical. It's because back in the days when debugobjects were added
> > the memory allocator was enabled way later than today. So we can just
> > move the debug_objects_mem_init() call right before sched_init()."
> >
> > Afterwards, when calling debug_objects_mem_init(), interrupts have
> > already been disabled and lockdep_init() will only be called later, so
> > no need to worry about interrupts in
> > debug_objects_replace_static_objects().

Just out of curiosity. How many objects are allocated between early and mem
init?

> > diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> > index 70935ed91125..cc5818ced652 100644
> > --- a/lib/debugobjects.c
> > +++ b/lib/debugobjects.c
> > @@ -1132,13 +1132,6 @@ static int __init debug_objects_replace_static_objects(void)
> > hlist_add_head(&obj->node, &objects);
> > }
> >
> > - /*
> > - * When debug_objects_mem_init() is called we know that only
> > - * one CPU is up, so disabling interrupts is enough
> > - * protection. This avoids the lockdep hell of lock ordering.
> > - */
> > - local_irq_disable();
>
> I think you should have a comment saying that debug_objects_mm_init() is
> called early with only one CPU up and interrupt disabled. So it is safe
> to replace static objects without any protection.

Yes please.

Thanks,

tglx

2018-11-24 09:00:08

by Qian Cai

Subject: Re: [PATCH] debugobjects: call debug_objects_mem_init earlier



> On Nov 23, 2018, at 4:46 PM, Thomas Gleixner <[email protected]> wrote:
>
> On Thu, 22 Nov 2018, Waiman Long wrote:
>> On 11/22/2018 11:31 PM, Qian Cai wrote:
>>> The current value of the early boot static pool size, 1024 is not big
>>> enough for systems with large number of CPUs with timer or/and workqueue
>>> objects selected. As the results, systems have 60+ CPUs with both timer
>>> and workqueue objects enabled could trigger "ODEBUG: Out of memory.
>>> ODEBUG disabled".
>>>
>>> However, none of the things are actually used or required beofre
>
> before
>
>>> debug_objects_mem_init() is invoked.
>>>
>>> According to tglx,
>>> "the reason why the call is at this place in start_kernel() is
>>> historical. It's because back in the days when debugobjects were added
>>> the memory allocator was enabled way later than today. So we can just
>>> move the debug_objects_mem_init() call right before sched_init()."
>>>
>>> Afterwards, when calling debug_objects_mem_init(), interrupts have
>>> already been disabled and lockdep_init() will only be called later, so
>>> no need to worry about interrupts in
>>> debug_objects_replace_static_objects().
>
> Just out of curiosity. How many objects are allocated between early and mem
> init?

64-CPU: 68
160-CPU: 164
256-CPU: 260

INIT_WORK(&p->wq, free_work) is called per CPU:

start_kernel
vmalloc_init
__init_work
__debug_object_init

Once debug_objects_mem_init() is moved just before vmalloc_init(), there
is only 1 object.

ODEBUG: 1 of 1 active objects replaced
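
(For reference, the per-CPU INIT_WORK above comes from the per-CPU setup loop
in vmalloc_init(); this is only a rough sketch of mm/vmalloc.c of that era,
details may differ:)

	for_each_possible_cpu(i) {
		struct vfree_deferred *p = &per_cpu(vfree_deferred, i);

		init_llist_head(&p->list);
		INIT_WORK(&p->wq, free_work);	/* one debug object per CPU */
	}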

>
>>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>>> index 70935ed91125..cc5818ced652 100644
>>> --- a/lib/debugobjects.c
>>> +++ b/lib/debugobjects.c
>>> @@ -1132,13 +1132,6 @@ static int __init debug_objects_replace_static_objects(void)
>>> hlist_add_head(&obj->node, &objects);
>>> }
>>>
>>> - /*
>>> - * When debug_objects_mem_init() is called we know that only
>>> - * one CPU is up, so disabling interrupts is enough
>>> - * protection. This avoids the lockdep hell of lock ordering.
>>> - */
>>> - local_irq_disable();
>>
>> I think you should have a comment saying that debug_objects_mm_init() is
>> called early with only one CPU up and interrupt disabled. So it is safe
>> to replace static objects without any protection.
>
> Yes please.
>
> Thanks,
>
> tglx


2018-11-24 09:00:20

by Qian Cai

Subject: Re: [PATCH v4] debugobjects: scale the static pool size



> On Nov 22, 2018, at 4:56 PM, Thomas Gleixner <[email protected]> wrote:
>
> On Tue, 20 Nov 2018, Qian Cai wrote:
>
> Looking deeper at that.
>
>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>> index 70935ed91125..140571aa483c 100644
>> --- a/lib/debugobjects.c
>> +++ b/lib/debugobjects.c
>> @@ -23,9 +23,81 @@
>> #define ODEBUG_HASH_BITS 14
>> #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)
>>
>> -#define ODEBUG_POOL_SIZE 1024
>> +#define ODEBUG_DEFAULT_POOL 512
>> #define ODEBUG_POOL_MIN_LEVEL 256
>>
>> +/*
>> + * Some debug objects are allocated during the early boot. Enabling some options
>> + * like timers or workqueue objects may increase the size required significantly
>> + * with large number of CPUs. For example (as today, 20 Nov. 2018),
>> + *
>> + * No. CPUs x 2 (worker pool) objects:
>> + *
>> + * start_kernel
>> + * workqueue_init_early
>> + * init_worker_pool
>> + * init_timer_key
>> + * debug_object_init
>> + *
>> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
>> + *
>> + * sched_init
>> + * hrtick_rq_init
>> + * hrtimer_init
>> + *
>> + * CONFIG_DEBUG_OBJECTS_WORK:
>> + * No. CPUs x 6 (workqueue) objects:
>> + *
>> + * workqueue_init_early
>> + * alloc_workqueue
>> + * __alloc_workqueue_key
>> + * alloc_and_link_pwqs
>> + * init_pwq
>> + *
>> + * Also, plus No. CPUs objects:
>> + *
>> + * perf_event_init
>> + * __init_srcu_struct
>> + * init_srcu_struct_fields
>> + * init_srcu_struct_nodes
>> + * __init_work
>
> None of the things are actually used or required _BEFORE_
> debug_objects_mem_init() is invoked.
>
> The reason why the call is at this place in start_kernel() is
> historical. It's because back in the days when debugobjects were added the
> memory allocator was enabled way later than today. So we can just move the
> debug_objects_mem_init() call right before sched_init() I think.

Well, now it seems kmemleak_init() complains that debug_objects_mem_init()
is called before it.

[ 0.078805] kmemleak: Cannot insert 0xc000000dff930000 into the object search tree (overlaps existing)
[ 0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3
[ 0.078883] Call Trace:
[ 0.078904] [c000000001c8fcd0] [c000000000c96b34] dump_stack+0xe8/0x164 (unreliable)
[ 0.078935] [c000000001c8fd20] [c000000000486e84] create_object+0x344/0x380
[ 0.078962] [c000000001c8fde0] [c000000000489544] early_alloc+0x108/0x1f8
[ 0.078989] [c000000001c8fe20] [c00000000109738c] kmemleak_init+0x1d8/0x3d4
[ 0.079016] [c000000001c8ff00] [c000000001054028] start_kernel+0x5c0/0x6f8
[ 0.079043] [c000000001c8ff90] [c00000000000ae7c] start_here_common+0x1c/0x520
[ 0.079070] kmemleak: Kernel memory leak detector disabled
[ 0.079091] kmemleak: Object 0xc000000ffd587b68 (size 40):
[ 0.079112] kmemleak: comm "swapper/0", pid 0, jiffies 4294937299
[ 0.079135] kmemleak: min_count = -1
[ 0.079153] kmemleak: count = 0
[ 0.079170] kmemleak: flags = 0x5
[ 0.079188] kmemleak: checksum = 0
[ 0.079206] kmemleak: backtrace:
[ 0.079227] __debug_object_init+0x688/0x700
[ 0.079250] debug_object_activate+0x1e0/0x350
[ 0.079272] __call_rcu+0x60/0x430
[ 0.079292] put_object+0x60/0x80
[ 0.079311] kmemleak_init+0x2cc/0x3d4
[ 0.079331] start_kernel+0x5c0/0x6f8
[ 0.079351] start_here_common+0x1c/0x520
[ 0.079380] kmemleak: Early log backtrace:
[ 0.079399] memblock_alloc_try_nid_raw+0x90/0xcc
[ 0.079421] sparse_init_nid+0x144/0x51c
[ 0.079440] sparse_init+0x1a0/0x238
[ 0.079459] initmem_init+0x1d8/0x25c
[ 0.079498] setup_arch+0x3e0/0x464
[ 0.079517] start_kernel+0xa4/0x6f8
[ 0.079536] start_here_common+0x1c/0x520


2018-11-25 20:43:21

by Qian Cai

Subject: Re: [PATCH v4] debugobjects: scale the static pool size



On 11/23/18 10:01 PM, Qian Cai wrote:
>
>
>> On Nov 22, 2018, at 4:56 PM, Thomas Gleixner <[email protected]> wrote:
>>
>> On Tue, 20 Nov 2018, Qian Cai wrote:
>>
>> Looking deeper at that.
>>
>>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>>> index 70935ed91125..140571aa483c 100644
>>> --- a/lib/debugobjects.c
>>> +++ b/lib/debugobjects.c
>>> @@ -23,9 +23,81 @@
>>> #define ODEBUG_HASH_BITS 14
>>> #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)
>>>
>>> -#define ODEBUG_POOL_SIZE 1024
>>> +#define ODEBUG_DEFAULT_POOL 512
>>> #define ODEBUG_POOL_MIN_LEVEL 256
>>>
>>> +/*
>>> + * Some debug objects are allocated during the early boot. Enabling some options
>>> + * like timers or workqueue objects may increase the size required significantly
>>> + * with large number of CPUs. For example (as today, 20 Nov. 2018),
>>> + *
>>> + * No. CPUs x 2 (worker pool) objects:
>>> + *
>>> + * start_kernel
>>> + * workqueue_init_early
>>> + * init_worker_pool
>>> + * init_timer_key
>>> + * debug_object_init
>>> + *
>>> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
>>> + *
>>> + * sched_init
>>> + * hrtick_rq_init
>>> + * hrtimer_init
>>> + *
>>> + * CONFIG_DEBUG_OBJECTS_WORK:
>>> + * No. CPUs x 6 (workqueue) objects:
>>> + *
>>> + * workqueue_init_early
>>> + * alloc_workqueue
>>> + * __alloc_workqueue_key
>>> + * alloc_and_link_pwqs
>>> + * init_pwq
>>> + *
>>> + * Also, plus No. CPUs objects:
>>> + *
>>> + * perf_event_init
>>> + * __init_srcu_struct
>>> + * init_srcu_struct_fields
>>> + * init_srcu_struct_nodes
>>> + * __init_work
>>
>> None of the things are actually used or required _BEFORE_
>> debug_objects_mem_init() is invoked.
>>
>> The reason why the call is at this place in start_kernel() is
>> historical. It's because back in the days when debugobjects were added the
>> memory allocator was enabled way later than today. So we can just move the
>> debug_objects_mem_init() call right before sched_init() I think.
>
> Well, now that kmemleak_init() seems complains that debug_objects_mem_init()
> is called before it.
>
> [ 0.078805] kmemleak: Cannot insert 0xc000000dff930000 into the object search tree (overlaps existing)
> [ 0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3
> [ 0.078883] Call Trace:
> [ 0.078904] [c000000001c8fcd0] [c000000000c96b34] dump_stack+0xe8/0x164 (unreliable)
> [ 0.078935] [c000000001c8fd20] [c000000000486e84] create_object+0x344/0x380
> [ 0.078962] [c000000001c8fde0] [c000000000489544] early_alloc+0x108/0x1f8
> [ 0.078989] [c000000001c8fe20] [c00000000109738c] kmemleak_init+0x1d8/0x3d4
> [ 0.079016] [c000000001c8ff00] [c000000001054028] start_kernel+0x5c0/0x6f8
> [ 0.079043] [c000000001c8ff90] [c00000000000ae7c] start_here_common+0x1c/0x520
> [ 0.079070] kmemleak: Kernel memory leak detector disabled
> [ 0.079091] kmemleak: Object 0xc000000ffd587b68 (size 40):
> [ 0.079112] kmemleak: comm "swapper/0", pid 0, jiffies 4294937299
> [ 0.079135] kmemleak: min_count = -1
> [ 0.079153] kmemleak: count = 0
> [ 0.079170] kmemleak: flags = 0x5
> [ 0.079188] kmemleak: checksum = 0
> [ 0.079206] kmemleak: backtrace:
> [ 0.079227] __debug_object_init+0x688/0x700
> [ 0.079250] debug_object_activate+0x1e0/0x350
> [ 0.079272] __call_rcu+0x60/0x430
> [ 0.079292] put_object+0x60/0x80
> [ 0.079311] kmemleak_init+0x2cc/0x3d4
> [ 0.079331] start_kernel+0x5c0/0x6f8
> [ 0.079351] start_here_common+0x1c/0x520
> [ 0.079380] kmemleak: Early log backtrace:
> [ 0.079399] memblock_alloc_try_nid_raw+0x90/0xcc
> [ 0.079421] sparse_init_nid+0x144/0x51c
> [ 0.079440] sparse_init+0x1a0/0x238
> [ 0.079459] initmem_init+0x1d8/0x25c
> [ 0.079498] setup_arch+0x3e0/0x464
> [ 0.079517] start_kernel+0xa4/0x6f8
> [ 0.079536] start_here_common+0x1c/0x520
>

So this is a chicken-and-egg problem. Debug objects need kmemleak_init() to run
first, so that kmemleak_ignore() can be used for all debug objects in order to
avoid overlaps like the one above.

	while (obj_pool_free < debug_objects_pool_min_level) {

		new = kmem_cache_zalloc(obj_cache, gfp);
		if (!new)
			return;

		kmemleak_ignore(new);

However, there seems to be no way to move kmemleak_init() this early in
start_kernel(), just before vmalloc_init() [1], because it looks like it depends
on things like workqueues (schedule_work(&cleanup_work)) and RCU. Hence, it
needs to be after workqueue_init_early() and rcu_init().

Given that, maybe the best outcome is to stick to the alternative approach that
works [1] rather than messing with the order of debug_objects_mem_init() in
start_kernel(), which seems tricky. What do you think?

[1] https://goo.gl/18N78g
[2] https://goo.gl/My6ig6

2018-11-25 20:49:21

by Qian Cai

[permalink] [raw]
Subject: Re: [PATCH v4] debugobjects: scale the static pool size



On 11/22/18 4:56 PM, Thomas Gleixner wrote:
> On Tue, 20 Nov 2018, Qian Cai wrote:
>
> Looking deeper at that.
>
>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>> index 70935ed91125..140571aa483c 100644
>> --- a/lib/debugobjects.c
>> +++ b/lib/debugobjects.c
>> @@ -23,9 +23,81 @@
>> #define ODEBUG_HASH_BITS 14
>> #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)
>>
>> -#define ODEBUG_POOL_SIZE 1024
>> +#define ODEBUG_DEFAULT_POOL 512
>> #define ODEBUG_POOL_MIN_LEVEL 256
>>
>> +/*
>> + * Some debug objects are allocated during the early boot. Enabling some options
>> + * like timers or workqueue objects may increase the size required significantly
>> + * with large number of CPUs. For example (as today, 20 Nov. 2018),
>> + *
>> + * No. CPUs x 2 (worker pool) objects:
>> + *
>> + * start_kernel
>> + * workqueue_init_early
>> + * init_worker_pool
>> + * init_timer_key
>> + * debug_object_init
>> + *
>> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
>> + *
>> + * sched_init
>> + * hrtick_rq_init
>> + * hrtimer_init
>> + *
>> + * CONFIG_DEBUG_OBJECTS_WORK:
>> + * No. CPUs x 6 (workqueue) objects:
>> + *
>> + * workqueue_init_early
>> + * alloc_workqueue
>> + * __alloc_workqueue_key
>> + * alloc_and_link_pwqs
>> + * init_pwq
>> + *
>> + * Also, plus No. CPUs objects:
>> + *
>> + * perf_event_init
>> + * __init_srcu_struct
>> + * init_srcu_struct_fields
>> + * init_srcu_struct_nodes
>> + * __init_work
>
> None of the things are actually used or required _BEFORE_
> debug_objects_mem_init() is invoked.
>
> The reason why the call is at this place in start_kernel() is
> historical. It's because back in the days when debugobjects were added the
> memory allocator was enabled way later than today. So we can just move the
> debug_objects_mem_init() call right before sched_init() I think.
>
>> + * Increase the number a bit more in case the implmentatins are changed in the
>> + * future, as it is better to avoid OOM than spending a bit more kernel memory
>> + * that will/can be freed.
>> + *
>> + * With all debug objects config options selected except the workqueue and the
>> + * timers, kernel reports,
>> + * 64-CPU: ODEBUG: 4 of 4 active objects replaced
>> + * 256-CPU: ODEBUG: 4 of 4 active objects replaced
>> + *
>> + * all the options except the workqueue:
>> + * 64-CPU: ODEBUG: 466 of 466 active objects replaced
>> + * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced
>> + *
>> + * all the options except the timers:
>> + * 64-CPU: ODEBUG: 652 of 652 active objects replaced
>> + * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced
>> + *
>> + * all the options:
>> + * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced
>> + * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced
>> + */
>> +#ifdef CONFIG_DEBUG_OBJECTS_WORK
>> +#define ODEBUG_WORK_PCPU_CNT 10
>> +#else
>> +#define ODEBUG_WORK_PCPU_CNT 0
>> +#endif /* CONFIG_DEBUG_OBJECTS_WORK */
>> +
>> +#ifdef CONFIG_DEBUG_OBJECTS_TIMERS
>> +#define ODEBUG_TIMERS_PCPU_CNT 10
>> +#else
>> +#define ODEBUG_TIMERS_PCPU_CNT 0
>> +#endif /* CONFIG_DEBUG_OBJECTS_TIMERS */
>
> Btw, the scaling here is way off.
>
>> +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \
>> + (ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT))
>
> CONFIG_NR_CPUS Pool size Storage size
>
> 256 256512 4M
>
> According to your list above it uses 4378 object for 256 CPUs. That's off
> by an factor of ~58.
>

Hmm, it is not clear to me why this is off, or where the pool size of 256512
came from.

CONFIG_NR_CPUS: 256
Pool size: 1024 + 256 * (10 + 10) = 6144

Am I missing anything?

2018-11-26 01:32:49

by Waiman Long

[permalink] [raw]
Subject: Re: [PATCH v4] debugobjects: scale the static pool size

On 11/25/2018 03:42 PM, Qian Cai wrote:
>
>
> On 11/23/18 10:01 PM, Qian Cai wrote:
>>
>>
>>> On Nov 22, 2018, at 4:56 PM, Thomas Gleixner <[email protected]>
>>> wrote:
>>>
>>> On Tue, 20 Nov 2018, Qian Cai wrote:
>>>
>>> Looking deeper at that.
>>>
>>>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>>>> index 70935ed91125..140571aa483c 100644
>>>> --- a/lib/debugobjects.c
>>>> +++ b/lib/debugobjects.c
>>>> @@ -23,9 +23,81 @@
>>>> #define ODEBUG_HASH_BITS    14
>>>> #define ODEBUG_HASH_SIZE    (1 << ODEBUG_HASH_BITS)
>>>>
>>>> -#define ODEBUG_POOL_SIZE    1024
>>>> +#define ODEBUG_DEFAULT_POOL    512
>>>> #define ODEBUG_POOL_MIN_LEVEL    256
>>>>
>>>> +/*
>>>> + * Some debug objects are allocated during the early boot.
>>>> Enabling some options
>>>> + * like timers or workqueue objects may increase the size required
>>>> significantly
>>>> + * with large number of CPUs. For example (as today, 20 Nov. 2018),
>>>> + *
>>>> + * No. CPUs x 2 (worker pool) objects:
>>>> + *
>>>> + * start_kernel
>>>> + *   workqueue_init_early
>>>> + *     init_worker_pool
>>>> + *       init_timer_key
>>>> + *         debug_object_init
>>>> + *
>>>> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
>>>> + *
>>>> + * sched_init
>>>> + *   hrtick_rq_init
>>>> + *     hrtimer_init
>>>> + *
>>>> + * CONFIG_DEBUG_OBJECTS_WORK:
>>>> + * No. CPUs x 6 (workqueue) objects:
>>>> + *
>>>> + * workqueue_init_early
>>>> + *   alloc_workqueue
>>>> + *     __alloc_workqueue_key
>>>> + *       alloc_and_link_pwqs
>>>> + *         init_pwq
>>>> + *
>>>> + * Also, plus No. CPUs objects:
>>>> + *
>>>> + * perf_event_init
>>>> + *    __init_srcu_struct
>>>> + *      init_srcu_struct_fields
>>>> + *        init_srcu_struct_nodes
>>>> + *          __init_work
>>>
>>> None of the things are actually used or required _BEFORE_
>>> debug_objects_mem_init() is invoked.
>>>
>>> The reason why the call is at this place in start_kernel() is
>>> historical. It's because back in the days when debugobjects were
>>> added the
>>> memory allocator was enabled way later than today. So we can just
>>> move the
>>> debug_objects_mem_init() call right before sched_init() I think.
>>
>> Well, now that kmemleak_init() seems complains that
>> debug_objects_mem_init()
>> is called before it.
>>
>> [    0.078805] kmemleak: Cannot insert 0xc000000dff930000 into the
>> object search tree (overlaps existing)
>> [    0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3
>> [    0.078883] Call Trace:
>> [    0.078904] [c000000001c8fcd0] [c000000000c96b34]
>> dump_stack+0xe8/0x164 (unreliable)
>> [    0.078935] [c000000001c8fd20] [c000000000486e84]
>> create_object+0x344/0x380
>> [    0.078962] [c000000001c8fde0] [c000000000489544]
>> early_alloc+0x108/0x1f8
>> [    0.078989] [c000000001c8fe20] [c00000000109738c]
>> kmemleak_init+0x1d8/0x3d4
>> [    0.079016] [c000000001c8ff00] [c000000001054028]
>> start_kernel+0x5c0/0x6f8
>> [    0.079043] [c000000001c8ff90] [c00000000000ae7c]
>> start_here_common+0x1c/0x520
>> [    0.079070] kmemleak: Kernel memory leak detector disabled
>> [    0.079091] kmemleak: Object 0xc000000ffd587b68 (size 40):
>> [    0.079112] kmemleak:   comm "swapper/0", pid 0, jiffies 4294937299
>> [    0.079135] kmemleak:   min_count = -1
>> [    0.079153] kmemleak:   count = 0
>> [    0.079170] kmemleak:   flags = 0x5
>> [    0.079188] kmemleak:   checksum = 0
>> [    0.079206] kmemleak:   backtrace:
>> [    0.079227]      __debug_object_init+0x688/0x700
>> [    0.079250]      debug_object_activate+0x1e0/0x350
>> [    0.079272]      __call_rcu+0x60/0x430
>> [    0.079292]      put_object+0x60/0x80
>> [    0.079311]      kmemleak_init+0x2cc/0x3d4
>> [    0.079331]      start_kernel+0x5c0/0x6f8
>> [    0.079351]      start_here_common+0x1c/0x520
>> [    0.079380] kmemleak: Early log backtrace:
>> [    0.079399]    memblock_alloc_try_nid_raw+0x90/0xcc
>> [    0.079421]    sparse_init_nid+0x144/0x51c
>> [    0.079440]    sparse_init+0x1a0/0x238
>> [    0.079459]    initmem_init+0x1d8/0x25c
>> [    0.079498]    setup_arch+0x3e0/0x464
>> [    0.079517]    start_kernel+0xa4/0x6f8
>> [    0.079536]    start_here_common+0x1c/0x520
>>
>
> So this is an chicken-egg problem. Debug objects need kmemleak_init()
> first, so it can make use of kmemleak_ignore() for all debug objects
> in order to avoid the overlapping like the above.
>
> while (obj_pool_free < debug_objects_pool_min_level) {
>
>     new = kmem_cache_zalloc(obj_cache, gfp);
>     if (!new)
>         return;
>
>     kmemleak_ignore(new);
>
> However, there seems no way to move kmemleak_init() together this
> early in start_kernel() just before vmalloc_init() [1] because it
> looks like it depends on things like workqueue
> (schedule_work(&cleanup_work)) and rcu. Hence, it needs to be after
> workqueue_init_early() and rcu_init()
>
> Given that, maybe the best outcome is to stick to the alternative
> approach that works [1] rather messing up with the order of
> debug_objects_mem_init() in start_kernel() which seems tricky. What do
> you think?
>
> [1] https://goo.gl/18N78g
> [2] https://goo.gl/My6ig6

Could you move kmemleak_init() and debug_objects_mem_init() as far up as
possible, e.g. before hrtimer_init(), to at least make the static count
calculation as simple as possible?

Cheers,
Longman


2018-11-26 04:56:20

by Qian Cai

[permalink] [raw]
Subject: Re: [PATCH v4] debugobjects: scale the static pool size



On 11/25/18 8:31 PM, Waiman Long wrote:
> On 11/25/2018 03:42 PM, Qian Cai wrote:
>>
>>
>> On 11/23/18 10:01 PM, Qian Cai wrote:
>>>
>>>
>>>> On Nov 22, 2018, at 4:56 PM, Thomas Gleixner <[email protected]>
>>>> wrote:
>>>>
>>>> On Tue, 20 Nov 2018, Qian Cai wrote:
>>>>
>>>> Looking deeper at that.
>>>>
>>>>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>>>>> index 70935ed91125..140571aa483c 100644
>>>>> --- a/lib/debugobjects.c
>>>>> +++ b/lib/debugobjects.c
>>>>> @@ -23,9 +23,81 @@
>>>>> #define ODEBUG_HASH_BITS    14
>>>>> #define ODEBUG_HASH_SIZE    (1 << ODEBUG_HASH_BITS)
>>>>>
>>>>> -#define ODEBUG_POOL_SIZE    1024
>>>>> +#define ODEBUG_DEFAULT_POOL    512
>>>>> #define ODEBUG_POOL_MIN_LEVEL    256
>>>>>
>>>>> +/*
>>>>> + * Some debug objects are allocated during the early boot.
>>>>> Enabling some options
>>>>> + * like timers or workqueue objects may increase the size required
>>>>> significantly
>>>>> + * with large number of CPUs. For example (as today, 20 Nov. 2018),
>>>>> + *
>>>>> + * No. CPUs x 2 (worker pool) objects:
>>>>> + *
>>>>> + * start_kernel
>>>>> + *   workqueue_init_early
>>>>> + *     init_worker_pool
>>>>> + *       init_timer_key
>>>>> + *         debug_object_init
>>>>> + *
>>>>> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
>>>>> + *
>>>>> + * sched_init
>>>>> + *   hrtick_rq_init
>>>>> + *     hrtimer_init
>>>>> + *
>>>>> + * CONFIG_DEBUG_OBJECTS_WORK:
>>>>> + * No. CPUs x 6 (workqueue) objects:
>>>>> + *
>>>>> + * workqueue_init_early
>>>>> + *   alloc_workqueue
>>>>> + *     __alloc_workqueue_key
>>>>> + *       alloc_and_link_pwqs
>>>>> + *         init_pwq
>>>>> + *
>>>>> + * Also, plus No. CPUs objects:
>>>>> + *
>>>>> + * perf_event_init
>>>>> + *    __init_srcu_struct
>>>>> + *      init_srcu_struct_fields
>>>>> + *        init_srcu_struct_nodes
>>>>> + *          __init_work
>>>>
>>>> None of the things are actually used or required _BEFORE_
>>>> debug_objects_mem_init() is invoked.
>>>>
>>>> The reason why the call is at this place in start_kernel() is
>>>> historical. It's because back in the days when debugobjects were
>>>> added the
>>>> memory allocator was enabled way later than today. So we can just
>>>> move the
>>>> debug_objects_mem_init() call right before sched_init() I think.
>>>
>>> Well, now that kmemleak_init() seems complains that
>>> debug_objects_mem_init()
>>> is called before it.
>>>
>>> [    0.078805] kmemleak: Cannot insert 0xc000000dff930000 into the
>>> object search tree (overlaps existing)
>>> [    0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3
>>> [    0.078883] Call Trace:
>>> [    0.078904] [c000000001c8fcd0] [c000000000c96b34]
>>> dump_stack+0xe8/0x164 (unreliable)
>>> [    0.078935] [c000000001c8fd20] [c000000000486e84]
>>> create_object+0x344/0x380
>>> [    0.078962] [c000000001c8fde0] [c000000000489544]
>>> early_alloc+0x108/0x1f8
>>> [    0.078989] [c000000001c8fe20] [c00000000109738c]
>>> kmemleak_init+0x1d8/0x3d4
>>> [    0.079016] [c000000001c8ff00] [c000000001054028]
>>> start_kernel+0x5c0/0x6f8
>>> [    0.079043] [c000000001c8ff90] [c00000000000ae7c]
>>> start_here_common+0x1c/0x520
>>> [    0.079070] kmemleak: Kernel memory leak detector disabled
>>> [    0.079091] kmemleak: Object 0xc000000ffd587b68 (size 40):
>>> [    0.079112] kmemleak:   comm "swapper/0", pid 0, jiffies 4294937299
>>> [    0.079135] kmemleak:   min_count = -1
>>> [    0.079153] kmemleak:   count = 0
>>> [    0.079170] kmemleak:   flags = 0x5
>>> [    0.079188] kmemleak:   checksum = 0
>>> [    0.079206] kmemleak:   backtrace:
>>> [    0.079227]      __debug_object_init+0x688/0x700
>>> [    0.079250]      debug_object_activate+0x1e0/0x350
>>> [    0.079272]      __call_rcu+0x60/0x430
>>> [    0.079292]      put_object+0x60/0x80
>>> [    0.079311]      kmemleak_init+0x2cc/0x3d4
>>> [    0.079331]      start_kernel+0x5c0/0x6f8
>>> [    0.079351]      start_here_common+0x1c/0x520
>>> [    0.079380] kmemleak: Early log backtrace:
>>> [    0.079399]    memblock_alloc_try_nid_raw+0x90/0xcc
>>> [    0.079421]    sparse_init_nid+0x144/0x51c
>>> [    0.079440]    sparse_init+0x1a0/0x238
>>> [    0.079459]    initmem_init+0x1d8/0x25c
>>> [    0.079498]    setup_arch+0x3e0/0x464
>>> [    0.079517]    start_kernel+0xa4/0x6f8
>>> [    0.079536]    start_here_common+0x1c/0x520
>>>
>>
>> So this is an chicken-egg problem. Debug objects need kmemleak_init()
>> first, so it can make use of kmemleak_ignore() for all debug objects
>> in order to avoid the overlapping like the above.
>>
>> while (obj_pool_free < debug_objects_pool_min_level) {
>>
>>     new = kmem_cache_zalloc(obj_cache, gfp);
>>     if (!new)
>>         return;
>>
>>     kmemleak_ignore(new);
>>
>> However, there seems no way to move kmemleak_init() together this
>> early in start_kernel() just before vmalloc_init() [1] because it
>> looks like it depends on things like workqueue
>> (schedule_work(&cleanup_work)) and rcu. Hence, it needs to be after
>> workqueue_init_early() and rcu_init()
>>
>> Given that, maybe the best outcome is to stick to the alternative
>> approach that works [1] rather messing up with the order of
>> debug_objects_mem_init() in start_kernel() which seems tricky. What do
>> you think?
>>
>> [1] https://goo.gl/18N78g
>> [2] https://goo.gl/My6ig6
>
> Could you move kmemleak_init() and debug_objects_mem_init() as far up as
> possible, like before the hrtimer_init() to at least make static count
> calculation as simple as possible?
>

Well, there is only a 2 x NR_CPUS difference after moving both calls to just
after rcu_init().

         Before After
64-CPU:  1114   974
160-CPU: 2774   2429
256-CPU: 3853   4378

I suppose it is possible that the timers only need a scale factor of 5 instead
of 10. However, that would need to be retested for all the configurations to be
sure, and it would likely also require removing the IRQ-enabling calls in
kmemleak_init() and its subroutines, because it would now be called with
interrupts disabled. Given that the initdata will be freed anyway, is it really
worth doing?
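
For what it is worth, a quick back-of-the-envelope check of the current
factor-10 scaling against the counts above, using ODEBUG_DEFAULT_POOL = 512 and
the two per-CPU counts of 10 from the v4 patch (my arithmetic only, not
retested):

	 64 CPUs, all options: 512 +  64 * (10 + 10) = 1792 static objects (1114 used)
	256 CPUs, all options: 512 + 256 * (10 + 10) = 5632 static objects (4378 used)

so the factor of 10 already leaves reasonable headroom even without moving any
calls around.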

BTW, calling debug_objects_mem_init() before kmemleak_init() could actually
trigger a loop on machines with 160+ CPUs until the pool is filled up:

debug_objects_pool_min_level += num_possible_cpus() * 4;

[1] while (obj_pool_free < debug_objects_pool_min_level)

kmemleak_init
  kmemleak_ignore (from replaced static debug objects)
    make_black_object
      put_object
        __call_rcu (kernel/rcu/tree.c)
          debug_rcu_head_queue
            debug_object_activate
              debug_object_init
                fill_pool
                  kmemleak_ignore (looping in [1])
                    make_black_object
                      ...
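
For scale: with the debug_objects_pool_min_level bump quoted above, and assuming
the base level is still ODEBUG_POOL_MIN_LEVEL (256), a 160-CPU machine has to
refill up to roughly 256 + 160 * 4 = 896 objects before the loop at [1] can
terminate, and each refilled object takes another trip through kmemleak_ignore()
and the RCU debug object path.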

I think until this is resolved, there is no way to move debug_objects_mem_init()
before kmemleak_init().

2018-11-26 09:47:09

by Qian Cai

[permalink] [raw]
Subject: Re: [PATCH v4] debugobjects: scale the static pool size



On 11/25/18 11:52 PM, Qian Cai wrote:
>
>
> On 11/25/18 8:31 PM, Waiman Long wrote:
>> On 11/25/2018 03:42 PM, Qian Cai wrote:
>>>
>>>
>>> On 11/23/18 10:01 PM, Qian Cai wrote:
>>>>
>>>>
>>>>> On Nov 22, 2018, at 4:56 PM, Thomas Gleixner <[email protected]>
>>>>> wrote:
>>>>>
>>>>> On Tue, 20 Nov 2018, Qian Cai wrote:
>>>>>
>>>>> Looking deeper at that.
>>>>>
>>>>>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>>>>>> index 70935ed91125..140571aa483c 100644
>>>>>> --- a/lib/debugobjects.c
>>>>>> +++ b/lib/debugobjects.c
>>>>>> @@ -23,9 +23,81 @@
>>>>>> #define ODEBUG_HASH_BITS    14
>>>>>> #define ODEBUG_HASH_SIZE    (1 << ODEBUG_HASH_BITS)
>>>>>>
>>>>>> -#define ODEBUG_POOL_SIZE    1024
>>>>>> +#define ODEBUG_DEFAULT_POOL    512
>>>>>> #define ODEBUG_POOL_MIN_LEVEL    256
>>>>>>
>>>>>> +/*
>>>>>> + * Some debug objects are allocated during the early boot.
>>>>>> Enabling some options
>>>>>> + * like timers or workqueue objects may increase the size required
>>>>>> significantly
>>>>>> + * with large number of CPUs. For example (as today, 20 Nov. 2018),
>>>>>> + *
>>>>>> + * No. CPUs x 2 (worker pool) objects:
>>>>>> + *
>>>>>> + * start_kernel
>>>>>> + *   workqueue_init_early
>>>>>> + *     init_worker_pool
>>>>>> + *       init_timer_key
>>>>>> + *         debug_object_init
>>>>>> + *
>>>>>> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
>>>>>> + *
>>>>>> + * sched_init
>>>>>> + *   hrtick_rq_init
>>>>>> + *     hrtimer_init
>>>>>> + *
>>>>>> + * CONFIG_DEBUG_OBJECTS_WORK:
>>>>>> + * No. CPUs x 6 (workqueue) objects:
>>>>>> + *
>>>>>> + * workqueue_init_early
>>>>>> + *   alloc_workqueue
>>>>>> + *     __alloc_workqueue_key
>>>>>> + *       alloc_and_link_pwqs
>>>>>> + *         init_pwq
>>>>>> + *
>>>>>> + * Also, plus No. CPUs objects:
>>>>>> + *
>>>>>> + * perf_event_init
>>>>>> + *    __init_srcu_struct
>>>>>> + *      init_srcu_struct_fields
>>>>>> + *        init_srcu_struct_nodes
>>>>>> + *          __init_work
>>>>>
>>>>> None of the things are actually used or required _BEFORE_
>>>>> debug_objects_mem_init() is invoked.
>>>>>
>>>>> The reason why the call is at this place in start_kernel() is
>>>>> historical. It's because back in the days when debugobjects were
>>>>> added the
>>>>> memory allocator was enabled way later than today. So we can just
>>>>> move the
>>>>> debug_objects_mem_init() call right before sched_init() I think.
>>>>
>>>> Well, now that kmemleak_init() seems complains that
>>>> debug_objects_mem_init()
>>>> is called before it.
>>>>
>>>> [    0.078805] kmemleak: Cannot insert 0xc000000dff930000 into the
>>>> object search tree (overlaps existing)
>>>> [    0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3
>>>> [    0.078883] Call Trace:
>>>> [    0.078904] [c000000001c8fcd0] [c000000000c96b34]
>>>> dump_stack+0xe8/0x164 (unreliable)
>>>> [    0.078935] [c000000001c8fd20] [c000000000486e84]
>>>> create_object+0x344/0x380
>>>> [    0.078962] [c000000001c8fde0] [c000000000489544]
>>>> early_alloc+0x108/0x1f8
>>>> [    0.078989] [c000000001c8fe20] [c00000000109738c]
>>>> kmemleak_init+0x1d8/0x3d4
>>>> [    0.079016] [c000000001c8ff00] [c000000001054028]
>>>> start_kernel+0x5c0/0x6f8
>>>> [    0.079043] [c000000001c8ff90] [c00000000000ae7c]
>>>> start_here_common+0x1c/0x520
>>>> [    0.079070] kmemleak: Kernel memory leak detector disabled
>>>> [    0.079091] kmemleak: Object 0xc000000ffd587b68 (size 40):
>>>> [    0.079112] kmemleak:   comm "swapper/0", pid 0, jiffies 4294937299
>>>> [    0.079135] kmemleak:   min_count = -1
>>>> [    0.079153] kmemleak:   count = 0
>>>> [    0.079170] kmemleak:   flags = 0x5
>>>> [    0.079188] kmemleak:   checksum = 0
>>>> [    0.079206] kmemleak:   backtrace:
>>>> [    0.079227]      __debug_object_init+0x688/0x700
>>>> [    0.079250]      debug_object_activate+0x1e0/0x350
>>>> [    0.079272]      __call_rcu+0x60/0x430
>>>> [    0.079292]      put_object+0x60/0x80
>>>> [    0.079311]      kmemleak_init+0x2cc/0x3d4
>>>> [    0.079331]      start_kernel+0x5c0/0x6f8
>>>> [    0.079351]      start_here_common+0x1c/0x520
>>>> [    0.079380] kmemleak: Early log backtrace:
>>>> [    0.079399]    memblock_alloc_try_nid_raw+0x90/0xcc
>>>> [    0.079421]    sparse_init_nid+0x144/0x51c
>>>> [    0.079440]    sparse_init+0x1a0/0x238
>>>> [    0.079459]    initmem_init+0x1d8/0x25c
>>>> [    0.079498]    setup_arch+0x3e0/0x464
>>>> [    0.079517]    start_kernel+0xa4/0x6f8
>>>> [    0.079536]    start_here_common+0x1c/0x520
>>>>
>>>
>>> So this is an chicken-egg problem. Debug objects need kmemleak_init()
>>> first, so it can make use of kmemleak_ignore() for all debug objects
>>> in order to avoid the overlapping like the above.
>>>
>>> while (obj_pool_free < debug_objects_pool_min_level) {
>>>
>>>      new = kmem_cache_zalloc(obj_cache, gfp);
>>>      if (!new)
>>>          return;
>>>
>>>      kmemleak_ignore(new);
>>>
>>> However, there seems no way to move kmemleak_init() together this
>>> early in start_kernel() just before vmalloc_init() [1] because it
>>> looks like it depends on things like workqueue
>>> (schedule_work(&cleanup_work)) and rcu. Hence, it needs to be after
>>> workqueue_init_early() and rcu_init()
>>>
>>> Given that, maybe the best outcome is to stick to the alternative
>>> approach that works [1] rather messing up with the order of
>>> debug_objects_mem_init() in start_kernel() which seems tricky. What do
>>> you think?
>>>
>>> [1] https://goo.gl/18N78g
>>> [2] https://goo.gl/My6ig6
>>
>> Could you move kmemleak_init() and debug_objects_mem_init() as far up as
>> possible, like before the hrtimer_init() to at least make static count
>> calculation as simple as possible?
>>
>
> Well, there is only 2 x NR_CPUS difference after moved both calls just after
> rcu_init().
>
>          Before After
> 64-CPU:  1114   974
> 160-CPU: 2774   2429
> 256-CPU: 3853   4378
>
> I suppose it is possible that the timers only need the scale factor 5 instead of
> 10. However, it needs to be retested for all the configurations to be sure, and
> likely need to remove all irqs calls in kmemleak_init() and subroutines because
> it is now called with irq disabled. Given the initdata will be freed anyway,
> does it really worth doing?
>
> BTW, calling debug_objects_mem_init() before kmemleak_init() actually could
> trigger a loop on machines with 160+ CPUs until the pool is filled up,
>
> debug_objects_pool_min_level += num_possible_cpus() * 4;
>
> [1] while (obj_pool_free < debug_objects_pool_min_level)
>
> kmemleak_init
>   kmemleak_ignore (from replaced static debug objects)
>     make_black_object
>       put_object
>         __call_rcu (kernel/rcu/tree.c)
>           debug_rcu_head_queue
>             debug_object_activate
>               debug_object_init
>                 fill_pool
>                   kmemleak_ignore (looping in [1])
>                     make_black_object
>                       ...
>
> I think until this is resolved, there is no way to move debug_objects_mem_init()
> before kmemleak_init().

I believe this is a separate issue: kmemleak is broken with
CONFIG_DEBUG_OBJECTS_RCU_HEAD anyway, because the infinite loop above can be
triggered in the existing code as well. Once the pool needs to be refilled
(fill_pool()) after the system has booted, creating a debug object calls
kmemleak_ignore(), which in turn creates a new RCU debug object via
debug_object_init(), which calls fill_pool() again, and so on. As a result, the
system locks up during kernel compilations.

Hence, I'll send out a patch for scaling the debug objects pool on systems with
many CPUs anyway, and deal with the kmemleak + CONFIG_DEBUG_OBJECTS_RCU_HEAD
issue later.

2018-11-26 10:51:00

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v4] debugobjects: scale the static pool size

On Mon, Nov 26, 2018 at 04:45:54AM -0500, Qian Cai wrote:
> On 11/25/18 11:52 PM, Qian Cai wrote:
> > BTW, calling debug_objects_mem_init() before kmemleak_init() actually
> > could trigger a loop on machines with 160+ CPUs until the pool is filled
> > up,
> >
> > debug_objects_pool_min_level += num_possible_cpus() * 4;
> >
> > [1] while (obj_pool_free < debug_objects_pool_min_level)
> >
> > kmemleak_init
> >   kmemleak_ignore (from replaced static debug objects)
> >     make_black_object
> >       put_object
> >         __call_rcu (kernel/rcu/tree.c)
> >           debug_rcu_head_queue
> >             debug_object_activate
> >               debug_object_init
> >                 fill_pool
> >                   kmemleak_ignore (looping in [1])
> >                     make_black_object
> >                       ...
> >
> > I think until this is resolved, there is no way to move
> > debug_objects_mem_init() before kmemleak_init().
>
> I believe this is a separate issue that kmemleak is broken with
> CONFIG_DEBUG_OBJECTS_RCU_HEAD anyway where the infinite loop above could be
> triggered in the existing code as well, i.e., once the pool need be refilled
> (fill_pool()) after the system boot up, debug object creation will call
> kmemleak_ignore() and it will create a new rcu debug_object_init(), and then
> it will call fill_pool() again and again. As the results, the system is
> locking up during kernel compilations.
>
> Hence, I'll send out a patch for debug objects with large CPUs anyway and
> deal with kmemleak + CONFIG_DEBUG_OBJECTS_RCU_HEAD issue later.

I haven't hit this before but I can see why it happens. Kmemleak uses
RCU for freeing its own data structures to avoid a recursive call to
sl*b (kmem_cache_free()). Since we already tell kmemleak to ignore the
debug_objects_cache allocations, I think we could just as well add
SLAB_NOLEAKTRACE to the kmem_cache_create() call (I haven't tried it yet;
this was never a documented kmemleak API, it is mainly used to keep
kmemleak from tracking its own allocations).
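
A minimal sketch of that change, assuming the obj_cache creation in
debug_objects_mem_init() still only passes SLAB_DEBUG_OBJECTS today (untested;
SLAB_NOLEAKTRACE is the flag from include/linux/slab.h):

	obj_cache = kmem_cache_create("debug_objects_cache",
				      sizeof (struct debug_obj), 0,
				      SLAB_DEBUG_OBJECTS | SLAB_NOLEAKTRACE,
				      NULL);

With SLAB_NOLEAKTRACE, kmemleak would not create metadata for these objects in
the first place, so the explicit kmemleak_ignore() in fill_pool() (and the
put_object()/call_rcu() chain it triggers) could presumably be dropped as well.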

--
Catalin