2016-12-12 20:48:45

by Dmitry Safonov

Subject: [PATCH] cpumask: avoid WARN in prefill_possible_map()

With CONFIG_DEBUG_PER_CPU_MAPS and CONFIG_CPUMASK_OFFSTACK enabled
fixes the following WARN_ON_ONCE() for booting with nr_cpus=1:

[ 0.000000] Linux version 4.9.0 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #36 SMP Mon Dec 12 18:05:46 MSK 2016
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.9.0 root=/dev/mapper/vz-root ro crashkernel=auto rd.lvm.lv=vz/root rd.lvm.lv=vz/swap console=ttyS0,115200 vsyscall=none nr_cpus=1
[ 0.000000] smpboot: 4 Processors exceeds NR_CPUS limit of 1
[ 0.000000] smpboot: Allowing 1 CPUs, 0 hotplug CPUs
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: CPU: 0 PID: 0 at ./include/linux/cpumask.h:121 cpumask_check.part.2+0x1c/0x1e
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0 #36
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff8136664b>] dump_stack+0x67/0x9c
[ 0.000000] [<ffffffff81063e21>] __warn+0xd1/0xf0
[ 0.000000] [<ffffffff81063f0d>] warn_slowpath_null+0x1d/0x20
[ 0.000000] [<ffffffff8105f4ca>] cpumask_check.part.2+0x1c/0x1e
[ 0.000000] [<ffffffff8104008e>] cpumask_clear_cpu+0x2e/0x40
[ 0.000000] [<ffffffff81f88a62>] prefill_possible_map+0x15c/0x16a
[ 0.000000] [<ffffffff81f802c6>] setup_arch+0xba7/0xc33
[ 0.000000] [<ffffffff81f78c59>] start_kernel+0x63/0x448
[ 0.000000] [<ffffffff81f7858c>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81f78678>] x86_64_start_kernel+0xea/0xed
[ 0.000000] ---[ end trace 5876da8d2ace83fb ]---

nr_cpu_ids is set to possible two lines futher - omit checking in
set_cpu_possible() cycles.

Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Jan Beulich <[email protected]>
Cc: [email protected]
Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/x86/kernel/smpboot.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 42f5eb7b4f6c..17167bec7c61 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1459,6 +1459,9 @@ __init void prefill_possible_map(void)
pr_info("Allowing %d CPUs, %d hotplug CPUs\n",
possible, max_t(int, possible - num_processors, 0));

+ /* Avoid WARN() in set_cpu_possible()=>cpumask_check() */
+ nr_cpu_ids = NR_CPUS;
+
for (i = 0; i < possible; i++)
set_cpu_possible(i, true);
for (; i < NR_CPUS; i++)
--
2.10.2


2016-12-13 18:35:22

by Thomas Gleixner

Subject: Re: [PATCH] cpumask: avoid WARN in prefill_possible_map()

On Mon, 12 Dec 2016, Dmitry Safonov wrote:

> Subject : [PATCH] cpumask: avoid WARN in prefill_possible_map()

'cpumask' is hardly the proper prefix for x86/smpboot related issues.

> With CONFIG_DEBUG_PER_CPU_MAPS and CONFIG_CPUMASK_OFFSTACK enabled
> fixes the following WARN_ON_ONCE() for booting with nr_cpus=1:

This sentence, aside from not qualifying as a sentence, makes no sense.

What has this to do with nr_cpus=1? If I boot with nr_cpus=2 then this
won't fail or what?

> [ 0.000000] Linux version 4.9.0 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #36 SMP Mon Dec 12 18:05:46 MSK 2016
> [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.9.0 root=/dev/mapper/vz-root ro crashkernel=auto rd.lvm.lv=vz/root rd.lvm.lv=vz/swap console=ttyS0,115200 vsyscall=none nr_cpus=1
> [ 0.000000] smpboot: 4 Processors exceeds NR_CPUS limit of 1
> [ 0.000000] smpboot: Allowing 1 CPUs, 0 hotplug CPUs
> [ 0.000000] ------------[ cut here ]------------
> [ 0.000000] WARNING: CPU: 0 PID: 0 at ./include/linux/cpumask.h:121 cpumask_check.part.2+0x1c/0x1e
> [ 0.000000] Modules linked in:
> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0 #36
> [ 0.000000] Call Trace:
> [ 0.000000] [<ffffffff8136664b>] dump_stack+0x67/0x9c
> [ 0.000000] [<ffffffff81063e21>] __warn+0xd1/0xf0
> [ 0.000000] [<ffffffff81063f0d>] warn_slowpath_null+0x1d/0x20
> [ 0.000000] [<ffffffff8105f4ca>] cpumask_check.part.2+0x1c/0x1e
> [ 0.000000] [<ffffffff8104008e>] cpumask_clear_cpu+0x2e/0x40
> [ 0.000000] [<ffffffff81f88a62>] prefill_possible_map+0x15c/0x16a
> [ 0.000000] [<ffffffff81f802c6>] setup_arch+0xba7/0xc33
> [ 0.000000] [<ffffffff81f78c59>] start_kernel+0x63/0x448
> [ 0.000000] [<ffffffff81f7858c>] x86_64_start_reservations+0x2a/0x2c
> [ 0.000000] [<ffffffff81f78678>] x86_64_start_kernel+0xea/0xed
> [ 0.000000] ---[ end trace 5876da8d2ace83fb ]---

And that non-trimmed backtrace is useful because it takes so much room in
the changelog and looks nice? The callchain leading to
prefill_possible_map() is pretty much well known, i.e. the backtrace is
pointless.

> nr_cpu_ids is set to possible two lines futher - omit checking in
> set_cpu_possible() cycles.

-ENOPARSE.

You completely fail to explain the problem, i.e. how nr_cpu_ids gets
overwritten from its initial compile time value NR_CPUS.

And of course you fail to explain why the "solution" is correct or
whatever you consider it to be.

> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 42f5eb7b4f6c..17167bec7c61 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1459,6 +1459,9 @@ __init void prefill_possible_map(void)
> pr_info("Allowing %d CPUs, %d hotplug CPUs\n",
> possible, max_t(int, possible - num_processors, 0));
>
> + /* Avoid WARN() in set_cpu_possible()=>cpumask_check() */
> + nr_cpu_ids = NR_CPUS;
> +

If anything then this qualifies as a quick hack.

> for (i = 0; i < possible; i++)
> set_cpu_possible(i, true);
> for (; i < NR_CPUS; i++)

The underlying issue is not restricted to nr_cpus=1 at all. The problem
comes from the early_param setting nr_cpu_ids to the command line
parameter. If that one is smaller than NR_CPUS then the access to the
possible mask with a cpu number >= nr_cpu_ids will trigger the warning.
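
A minimal user-space sketch of the failure mode described above, not kernel
code. NR_CPUS is fixed at 4 for illustration, and parse_nr_cpus(), check_cpu(),
set_possible() and prefill_old() are hypothetical stand-ins for the "nr_cpus="
early_param handler, cpumask_check(), set_cpu_possible() and the old loop in
prefill_possible_map(). Unlike WARN_ON_ONCE(), the model counts every
out-of-range access.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_CPUS 4	/* compile-time mask size; value chosen for illustration */

static unsigned int nr_cpu_ids = NR_CPUS;	/* starts at the compile-time value */
static bool possible_mask[NR_CPUS];
static int warnings;

/* Stand-in for the "nr_cpus=" early parameter: it can only lower nr_cpu_ids. */
static void parse_nr_cpus(const char *arg)
{
	long n = strtol(arg, NULL, 10);

	if (n > 0 && (unsigned long)n < nr_cpu_ids)
		nr_cpu_ids = (unsigned int)n;
}

/* Stand-in for the debug bounds check: complain when cpu >= nr_cpu_ids. */
static void check_cpu(unsigned int cpu)
{
	if (cpu >= nr_cpu_ids) {
		warnings++;
		fprintf(stderr, "WARN: cpu %u >= nr_cpu_ids %u\n", cpu, nr_cpu_ids);
	}
}

static void set_possible(unsigned int cpu, bool possible)
{
	check_cpu(cpu);
	possible_mask[cpu] = possible;	/* in bounds of the NR_CPUS sized array */
}

/* Old loop structure: set the possible bits, then clear up to NR_CPUS. */
static int prefill_old(unsigned int possible)
{
	unsigned int i;

	warnings = 0;
	for (i = 0; i < possible; i++)
		set_possible(i, true);
	for (; i < NR_CPUS; i++)
		set_possible(i, false);	/* warns whenever nr_cpu_ids < NR_CPUS */
	nr_cpu_ids = possible;
	return warnings;
}
```

Calling parse_nr_cpus("1") and then prefill_old(1) makes the clearing loop
touch cpus 1..3, all of which fail the check, although the array access itself
is still within the NR_CPUS sized storage.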

So instead of playing completely non-obvious hackery with nr_cpu_ids, the
proper solution is to have a function which clears the underlying
__cpu_possible_mask, which is sized NR_CPUS because it is compile time
allocated, and then only set the possible bits. Does the untested patch
below fix the issue for you?

Thanks,

tglx
8<--------------------
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1476,15 +1476,15 @@ early_param("possible_cpus", _setup_poss
possible = i;
}

+ nr_cpu_ids = possible;
+
pr_info("Allowing %d CPUs, %d hotplug CPUs\n",
possible, max_t(int, possible - num_processors, 0));

+ reset_cpu_possible_mask();
+
for (i = 0; i < possible; i++)
set_cpu_possible(i, true);
- for (; i < NR_CPUS; i++)
- set_cpu_possible(i, false);
-
- nr_cpu_ids = possible;
}

#ifdef CONFIG_HOTPLUG_CPU
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -112,6 +112,7 @@ extern struct cpumask __cpu_active_mask;
#define cpu_possible(cpu) ((cpu) == 0)
#define cpu_present(cpu) ((cpu) == 0)
#define cpu_active(cpu) ((cpu) == 0)
+static inline void cpumask_reset_possible_mask(void) { }
#endif

/* verify cpu argument to cpumask_* operators */
@@ -722,6 +723,11 @@ void init_cpu_present(const struct cpuma
void init_cpu_possible(const struct cpumask *src);
void init_cpu_online(const struct cpumask *src);

+static inline void reset_cpu_possible_mask(void)
+{
+ bitmap_zero(cpumask_bits(&__cpu_possible_mask), NR_CPUS);
+}
+
static inline void
set_cpu_possible(unsigned int cpu, bool possible)
{






2016-12-13 20:46:40

by Thomas Gleixner

Subject: Re: [PATCH] cpumask: avoid WARN in prefill_possible_map()

On Tue, 13 Dec 2016, Dmitry Safonov wrote:
> > +static inline void cpumask_reset_possible_mask(void) { }
> ...
> > +static inline void reset_cpu_possible_mask(void)
>
> Is this intentional?

No, that was a leftover and a missing refresh before sending it out
quickly.

Thanks,

tglx

2016-12-14 10:33:48

by Dmitry Safonov

Subject: Re: [PATCH] cpumask: avoid WARN in prefill_possible_map()

On 12/13/2016 09:32 PM, Thomas Gleixner wrote:
> On Mon, 12 Dec 2016, Dmitry Safonov wrote:
>
>> Subject : [PATCH] cpumask: avoid WARN in prefill_possible_map()
>
> 'cpumask' is hardly the proper prefix for x86/smpboot related issues.
>
>> With CONFIG_DEBUG_PER_CPU_MAPS and CONFIG_CPUMASK_OFFSTACK enabled
>> fixes the following WARN_ON_ONCE() for booting with nr_cpus=1:
>
> This sentence, aside from not qualifying as a sentence, makes no sense.
>
> What has this to do with nr_cpus=1? If I boot with nr_cpus=2 then this
> won't fail or what?
>
>> [ 0.000000] Linux version 4.9.0 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #36 SMP Mon Dec 12 18:05:46 MSK 2016
>> [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.9.0 root=/dev/mapper/vz-root ro crashkernel=auto rd.lvm.lv=vz/root rd.lvm.lv=vz/swap console=ttyS0,115200 vsyscall=none nr_cpus=1
>> [ 0.000000] smpboot: 4 Processors exceeds NR_CPUS limit of 1
>> [ 0.000000] smpboot: Allowing 1 CPUs, 0 hotplug CPUs
>> [ 0.000000] ------------[ cut here ]------------
>> [ 0.000000] WARNING: CPU: 0 PID: 0 at ./include/linux/cpumask.h:121 cpumask_check.part.2+0x1c/0x1e
>> [ 0.000000] Modules linked in:
>> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0 #36
>> [ 0.000000] Call Trace:
>> [ 0.000000] [<ffffffff8136664b>] dump_stack+0x67/0x9c
>> [ 0.000000] [<ffffffff81063e21>] __warn+0xd1/0xf0
>> [ 0.000000] [<ffffffff81063f0d>] warn_slowpath_null+0x1d/0x20
>> [ 0.000000] [<ffffffff8105f4ca>] cpumask_check.part.2+0x1c/0x1e
>> [ 0.000000] [<ffffffff8104008e>] cpumask_clear_cpu+0x2e/0x40
>> [ 0.000000] [<ffffffff81f88a62>] prefill_possible_map+0x15c/0x16a
>> [ 0.000000] [<ffffffff81f802c6>] setup_arch+0xba7/0xc33
>> [ 0.000000] [<ffffffff81f78c59>] start_kernel+0x63/0x448
>> [ 0.000000] [<ffffffff81f7858c>] x86_64_start_reservations+0x2a/0x2c
>> [ 0.000000] [<ffffffff81f78678>] x86_64_start_kernel+0xea/0xed
>> [ 0.000000] ---[ end trace 5876da8d2ace83fb ]---
>
> And that non-trimmed backtrace is useful because it takes so much room in
> the changelog and looks nice? The callchain leading to
> prefill_possible_map() is pretty much well known, i.e. the backtrace is
> pointless.
>
>> nr_cpu_ids is set to possible two lines futher - omit checking in
>> set_cpu_possible() cycles.
>
> -ENOPARSE.
>
> You completely fail to explain the problem, i.e. how nr_cpu_ids gets
> overwritten from its initial compile time value NR_CPUS.
>
> And of course you fail to explain why the "solution" is correct or
> whatever you consider it to be.
>
>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>> index 42f5eb7b4f6c..17167bec7c61 100644
>> --- a/arch/x86/kernel/smpboot.c
>> +++ b/arch/x86/kernel/smpboot.c
>> @@ -1459,6 +1459,9 @@ __init void prefill_possible_map(void)
>> pr_info("Allowing %d CPUs, %d hotplug CPUs\n",
>> possible, max_t(int, possible - num_processors, 0));
>>
>> + /* Avoid WARN() in set_cpu_possible()=>cpumask_check() */
>> + nr_cpu_ids = NR_CPUS;
>> +
>
> If anything then this qualifies as a quick hack.
>
>> for (i = 0; i < possible; i++)
>> set_cpu_possible(i, true);
>> for (; i < NR_CPUS; i++)
>
> The underlying issue is not restricted to nr_cpus=1 at all. The problem
> comes from the early_param setting nr_cpu_ids to the command line
> parameter. If that one is smaller than NR_CPUS then the access to the
> possible mask with a cpu number >= nr_cpu_ids will trigger the warning.
>
> So instead of playing completely non-obvious hackery with nr_cpu_ids, the
> proper solution is to have a function which clears the underlying
> __cpu_possible_mask, which is sized NR_CPUS because it is compile time
> allocated, and then only set the possible bits. Does the untested patch
> below fix the issue for you?

Hi Thomas,

Well, my solution looks like a quick hack because I didn't want to
introduce a new function in a header which is used in only one place.
And you did it...

> +static inline void cpumask_reset_possible_mask(void) { }
...
> +static inline void reset_cpu_possible_mask(void)

Is this intentional?

I don't mind your version with the function names fixed, and sorry for
the bad changelog.

>
> Thanks,
>
> tglx
> 8<--------------------
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1476,15 +1476,15 @@ early_param("possible_cpus", _setup_poss
> possible = i;
> }
>
> + nr_cpu_ids = possible;
> +
> pr_info("Allowing %d CPUs, %d hotplug CPUs\n",
> possible, max_t(int, possible - num_processors, 0));
>
> + reset_cpu_possible_mask();
> +
> for (i = 0; i < possible; i++)
> set_cpu_possible(i, true);
> - for (; i < NR_CPUS; i++)
> - set_cpu_possible(i, false);
> -
> - nr_cpu_ids = possible;
> }
>
> #ifdef CONFIG_HOTPLUG_CPU
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -112,6 +112,7 @@ extern struct cpumask __cpu_active_mask;
> #define cpu_possible(cpu) ((cpu) == 0)
> #define cpu_present(cpu) ((cpu) == 0)
> #define cpu_active(cpu) ((cpu) == 0)
> +static inline void cpumask_reset_possible_mask(void) { }
> #endif
>
> /* verify cpu argument to cpumask_* operators */
> @@ -722,6 +723,11 @@ void init_cpu_present(const struct cpuma
> void init_cpu_possible(const struct cpumask *src);
> void init_cpu_online(const struct cpumask *src);
>
> +static inline void reset_cpu_possible_mask(void)
> +{
> + bitmap_zero(cpumask_bits(&__cpu_possible_mask), NR_CPUS);
> +}
> +
> static inline void
> set_cpu_possible(unsigned int cpu, bool possible)
> {


--
Dmitry

Subject: [tip:x86/urgent] x86/smpboot: Prevent false positive out of bounds cpumask access warning

Commit-ID: 427d77a32365d5f942d335248305a5c237baf63a
Gitweb: http://git.kernel.org/tip/427d77a32365d5f942d335248305a5c237baf63a
Author: Thomas Gleixner <[email protected]>
AuthorDate: Tue, 13 Dec 2016 19:32:28 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Thu, 15 Dec 2016 11:32:31 +0100

x86/smpboot: Prevent false positive out of bounds cpumask access warning

prefill_possible_map() reinitializes the cpu_possible_map by setting the
possible cpu bits and clearing all other bits up to NR_CPUS.

This is technically always correct because cpu_possible_map is statically
allocated and sized NR_CPUS. With CPUMASK_OFFSTACK and DEBUG_PER_CPU_MAPS
enabled the bounds check of cpu masks happens on nr_cpu_ids. nr_cpu_ids is
initialized to NR_CPUS and only limited after the set/clear bit loops have
been executed.

But if the system was booted with "nr_cpus=N" on the command line, where N
is < NR_CPUS then nr_cpu_ids is limited in the parameter parsing function
before prefill_possible_map() is invoked. As a consequence the cpumask
bounds check triggers when clearing the bits past nr_cpu_ids.

Add a helper which allows resetting cpu_possible_map without the bounds check
and then set only the possible bits which are well inside bounds.

Reported-by: Dmitry Safonov <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: [email protected]
Cc: Jan Beulich <[email protected]>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1612131836050.3415@nanos
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/kernel/smpboot.c | 8 ++++----
include/linux/cpumask.h | 5 +++++
2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index e09aa58..46732dc 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1463,15 +1463,15 @@ __init void prefill_possible_map(void)
 		possible = i;
 	}

+	nr_cpu_ids = possible;
+
 	pr_info("Allowing %d CPUs, %d hotplug CPUs\n",
 		possible, max_t(int, possible - num_processors, 0));

+	reset_cpu_possible_mask();
+
 	for (i = 0; i < possible; i++)
 		set_cpu_possible(i, true);
-	for (; i < NR_CPUS; i++)
-		set_cpu_possible(i, false);
-
-	nr_cpu_ids = possible;
 }

 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index da7fbf1..c717f5e 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -722,6 +722,11 @@ void init_cpu_present(const struct cpumask *src);
 void init_cpu_possible(const struct cpumask *src);
 void init_cpu_online(const struct cpumask *src);

+static inline void reset_cpu_possible_mask(void)
+{
+	bitmap_zero(cpumask_bits(&__cpu_possible_mask), NR_CPUS);
+}
+
 static inline void
 set_cpu_possible(unsigned int cpu, bool possible)
 {