2022-07-29 07:17:11

by Sander Vanheule

[permalink] [raw]
Subject: [PATCH v5 0/5] cpumask: fix invalid uniprocessor assumptions

On uniprocessor builds, it is currently assumed that any cpumask will
contain the single CPU: cpu0. This assumption is used to provide
optimised implementations.

The current assumption also appears to be wrong, by ignoring the fact
that users can provide empty cpumasks. This can result in bugs as
explained in [1] - for_each_cpu() will run one iteration of the loop
even when passed an empty cpumask.

This series introduces some basic tests, and updates the optimisations
for uniprocessor builds.

The x86 patch was written after the kernel test robot [2] ran into a
failed build. I have tried to list the files potentially affected by the
changes to cpumask.h, in an attempt to find any other cases that fail on
!SMP. I've gone through some of the files manually, and ran a few cross
builds, but nothing else popped up. I (build) checked about half of the
potientally affected files, but I do not have the resources to do them
all. I hope we can fix other issues if/when they pop up later.

[1] https://lore.kernel.org/all/[email protected]/
[2] https://lore.kernel.org/all/[email protected]/

Changes since v4:
Link: https://lore.kernel.org/all/[email protected]/
- Move new for_each_*_cpu() optimisations ahead, so they come before
the fixes.
- Update test cases for cpu_possible_mask for nr_cpu_ids < CONFIG_NR_CPUS
- Improve KUnit style compliance on tests
- Collect tags and add Cc: tags

Changes since v3:
Link: https://lore.kernel.org/all/[email protected]/
- Guard against CPU hotplugging while testing cpu online/present masks
- Add fix for cpu_llc_shared_map on x86

Changes since v2:
Link: https://lore.kernel.org/all/[email protected]/
- Put new tests after patch fixes
- Update for_each_* macros

Changes since v1:
Link: https://lore.kernel.org/all/[email protected]/
- Place tests in lib/test_cpumask.c
- Drop the modified UP code in favor of the generic SMP implementation
- Update declaration of cpumask_next_wrap()

Sander Vanheule (5):
x86/cacheinfo: move shared cache map definitions
cpumask: add UP optimised for_each_*_cpu versions
cpumask: fix invalid uniprocessor mask assumption
lib/test: introduce cpumask KUnit test suite
cpumask: update cpumask_next_wrap() signature

arch/x86/kernel/cpu/cacheinfo.c | 6 ++
arch/x86/kernel/smpboot.c | 4 -
include/linux/cpumask.h | 108 ++++++-----------------
lib/Kconfig.debug | 12 +++
lib/Makefile | 4 +-
lib/cpumask.c | 2 +
lib/cpumask_test.c | 147 ++++++++++++++++++++++++++++++++
7 files changed, 196 insertions(+), 87 deletions(-)
create mode 100644 lib/cpumask_test.c

--
2.37.1


2022-07-29 07:19:56

by Sander Vanheule

[permalink] [raw]
Subject: [PATCH v5 1/5] x86/cacheinfo: move shared cache map definitions

The maps to keep track of shared caches between CPUs on SMP systems are
declared in asm/smp.h, among them specifically cpu_llc_shared_map. These
maps are externally defined in cpu/smpboot.c. The latter is only compiled
on CONFIG_SMP=y, which means the declared extern symbols from asm/smp.h do
not have a corresponding definition on uniprocessor builds.

The inline cpu_llc_shared_mask() function from asm/smp.h refers to the map
declaration mentioned above. This function is referenced in cacheinfo.c
inside for_each_cpu() loop macros, to provide cpumask for the loop. On
uniprocessor builds, the symbol for the cpu_llc_shared_map does not exist.
However, the current implementation of for_each_cpu() also (wrongly)
ignores the provided mask.

By sheer luck, the compiler thus optimises out this unused reference to
cpu_llc_shared_map, and the linker therefore does not require the
cpu_llc_shared_mask to actually exist on uniprocessor builds. Only on SMP
bulids does smpboot.o exist to provide the required symbols.

To no longer rely on compiler optimisations for successful uniprocessor
builds, move the definitions of cpu_llc_shared_map and cpu_l2c_shared_map
from smpboot.c to cacheinfo.c.

Signed-off-by: Sander Vanheule <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Yury Norov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
---
Changes since v3:
- New patch

arch/x86/kernel/cpu/cacheinfo.c | 6 ++++++
arch/x86/kernel/smpboot.c | 4 ----
2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index fe98a1465be6..66556833d7af 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -29,6 +29,12 @@
#define LVL_3 4
#define LVL_TRACE 5

+/* Shared last level cache maps */
+DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_llc_shared_map);
+
+/* Shared L2 cache maps */
+DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_l2c_shared_map);
+
struct _cache_table {
unsigned char descriptor;
char cache_type;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 5e7f9532a10d..f24227bc3220 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -95,10 +95,6 @@ EXPORT_PER_CPU_SYMBOL(cpu_core_map);
DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_die_map);
EXPORT_PER_CPU_SYMBOL(cpu_die_map);

-DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_llc_shared_map);
-
-DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_l2c_shared_map);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
--
2.37.1

2022-07-29 07:21:07

by Sander Vanheule

[permalink] [raw]
Subject: [PATCH v5 3/5] cpumask: fix invalid uniprocessor mask assumption

On uniprocessor builds, any CPU mask is assumed to contain exactly one CPU
(cpu0). This assumption ignores the existence of empty masks, potentially
resulting in incorrect behaviour. For example, for_each_cpu() will run
one iteration of the loop even when passed an empty cpumask.

cpumask_first_zero(), cpumask_next_zero(), and for_each_cpu_not() don't
provide behaviour matching the assumption that a UP mask is always "1",
but instead provide behaviour matching the empty mask.

Drop the incorrectly optimised code and use the generic implementations in
all cases.

Signed-off-by: Sander Vanheule <[email protected]>
Suggested-by: Yury Norov <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Valentin Schneider <[email protected]>
---
Changes since v4:
- Reflow commit message

Changes since v3:
- Add back UP-optimised cpumask_local_spread, cpumask_any_distribute,
cpumask_any_and_distribute

Changes since v1:
- Drop UP implementations instead of trying to fix them

include/linux/cpumask.h | 99 ++++++++---------------------------------
lib/Makefile | 3 +-
lib/cpumask.c | 2 +
3 files changed, 22 insertions(+), 82 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 533612770bc0..6c5b4ee000f2 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -116,85 +116,6 @@ static __always_inline unsigned int cpumask_check(unsigned int cpu)
return cpu;
}

-#if NR_CPUS == 1
-/* Uniprocessor. Assume all masks are "1". */
-static inline unsigned int cpumask_first(const struct cpumask *srcp)
-{
- return 0;
-}
-
-static inline unsigned int cpumask_first_zero(const struct cpumask *srcp)
-{
- return 0;
-}
-
-static inline unsigned int cpumask_first_and(const struct cpumask *srcp1,
- const struct cpumask *srcp2)
-{
- return 0;
-}
-
-static inline unsigned int cpumask_last(const struct cpumask *srcp)
-{
- return 0;
-}
-
-/* Valid inputs for n are -1 and 0. */
-static inline unsigned int cpumask_next(int n, const struct cpumask *srcp)
-{
- return n+1;
-}
-
-static inline unsigned int cpumask_next_zero(int n, const struct cpumask *srcp)
-{
- return n+1;
-}
-
-static inline unsigned int cpumask_next_and(int n,
- const struct cpumask *srcp,
- const struct cpumask *andp)
-{
- return n+1;
-}
-
-static inline unsigned int cpumask_next_wrap(int n, const struct cpumask *mask,
- int start, bool wrap)
-{
- /* cpu0 unless stop condition, wrap and at cpu0, then nr_cpumask_bits */
- return (wrap && n == 0);
-}
-
-/* cpu must be a valid cpu, ie 0, so there's no other choice. */
-static inline unsigned int cpumask_any_but(const struct cpumask *mask,
- unsigned int cpu)
-{
- return 1;
-}
-
-static inline unsigned int cpumask_local_spread(unsigned int i, int node)
-{
- return 0;
-}
-
-static inline int cpumask_any_and_distribute(const struct cpumask *src1p,
- const struct cpumask *src2p) {
- return cpumask_first_and(src1p, src2p);
-}
-
-static inline int cpumask_any_distribute(const struct cpumask *srcp)
-{
- return cpumask_first(srcp);
-}
-
-#define for_each_cpu(cpu, mask) \
- for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)
-#define for_each_cpu_not(cpu, mask) \
- for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)
-#define for_each_cpu_wrap(cpu, mask, start) \
- for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask, (void)(start))
-#define for_each_cpu_and(cpu, mask1, mask2) \
- for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask1, (void)mask2)
-#else
/**
* cpumask_first - get the first cpu in a cpumask
* @srcp: the cpumask pointer
@@ -260,10 +181,29 @@ static inline unsigned int cpumask_next_zero(int n, const struct cpumask *srcp)

int __pure cpumask_next_and(int n, const struct cpumask *, const struct cpumask *);
int __pure cpumask_any_but(const struct cpumask *mask, unsigned int cpu);
+
+#if NR_CPUS == 1
+/* Uniprocessor: there is only one valid CPU */
+static inline unsigned int cpumask_local_spread(unsigned int i, int node)
+{
+ return 0;
+}
+
+static inline int cpumask_any_and_distribute(const struct cpumask *src1p,
+ const struct cpumask *src2p) {
+ return cpumask_first_and(src1p, src2p);
+}
+
+static inline int cpumask_any_distribute(const struct cpumask *srcp)
+{
+ return cpumask_first(srcp);
+}
+#else
unsigned int cpumask_local_spread(unsigned int i, int node);
int cpumask_any_and_distribute(const struct cpumask *src1p,
const struct cpumask *src2p);
int cpumask_any_distribute(const struct cpumask *srcp);
+#endif /* NR_CPUS */

/**
* for_each_cpu - iterate over every cpu in a mask
@@ -324,7 +264,6 @@ extern int cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool
for ((cpu) = -1; \
(cpu) = cpumask_next_and((cpu), (mask1), (mask2)), \
(cpu) < nr_cpu_ids;)
-#endif /* SMP */

#define CPU_BITS_NONE \
{ \
diff --git a/lib/Makefile b/lib/Makefile
index f99bf61f8bbc..bcc7e8ea0cde 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -34,10 +34,9 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
is_single_threaded.o plist.o decompress.o kobject_uevent.o \
earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
nmi_backtrace.o nodemask.o win_minmax.o memcat_p.o \
- buildid.o
+ buildid.o cpumask.o

lib-$(CONFIG_PRINTK) += dump_stack.o
-lib-$(CONFIG_SMP) += cpumask.o

lib-y += kobject.o klist.o
obj-y += lockref.o
diff --git a/lib/cpumask.c b/lib/cpumask.c
index a971a82d2f43..b9728513a4d4 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -192,6 +192,7 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
}
#endif

+#if NR_CPUS > 1
/**
* cpumask_local_spread - select the i'th cpu with local numa cpu's first
* @i: index number
@@ -279,3 +280,4 @@ int cpumask_any_distribute(const struct cpumask *srcp)
return next;
}
EXPORT_SYMBOL(cpumask_any_distribute);
+#endif /* NR_CPUS */
--
2.37.1

2022-07-30 18:20:36

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] cpumask: fix invalid uniprocessor assumptions

On Fri, Jul 29, 2022 at 09:01:17AM +0200, Sander Vanheule wrote:
> On uniprocessor builds, it is currently assumed that any cpumask will
> contain the single CPU: cpu0. This assumption is used to provide
> optimised implementations.
>
> The current assumption also appears to be wrong, by ignoring the fact
> that users can provide empty cpumasks. This can result in bugs as
> explained in [1] - for_each_cpu() will run one iteration of the loop
> even when passed an empty cpumask.
>
> This series introduces some basic tests, and updates the optimisations
> for uniprocessor builds.
>
> The x86 patch was written after the kernel test robot [2] ran into a
> failed build. I have tried to list the files potentially affected by the
> changes to cpumask.h, in an attempt to find any other cases that fail on
> !SMP. I've gone through some of the files manually, and ran a few cross
> builds, but nothing else popped up. I (build) checked about half of the
> potientally affected files, but I do not have the resources to do them
> all. I hope we can fix other issues if/when they pop up later.
>
> [1] https://lore.kernel.org/all/[email protected]/
> [2] https://lore.kernel.org/all/[email protected]/

Hi Sander,

I tried to apply it on top of bitmap-for next, and there are many conflicts
with already pulled patches. There's nothing really scary, just functions
changed their prototypes and locations. Can you try your series on top of
bitmap-for-next from [email protected]:/norov/linux.git (or just -next)?

I'm asking you to do it instead of doing myself because I don't want to
screwup your code accidentally and because many cpumask functions in -next
are moved to the header, and it would be probably possible to avoid building
cpumask.o in UP case.

Briefly looking into the -next code, cpumask.c hosts only cpumask_next_wrap()
that is not overwritten by UP code, and in UP case it can be simplified.

Thanks,
Yury

2022-07-31 13:45:00

by Sander Vanheule

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] cpumask: fix invalid uniprocessor assumptions

On Sat, 2022-07-30 at 11:15 -0700, Yury Norov wrote:
> On Fri, Jul 29, 2022 at 09:01:17AM +0200, Sander Vanheule wrote:
> > On uniprocessor builds, it is currently assumed that any cpumask will
> > contain the single CPU: cpu0. This assumption is used to provide
> > optimised implementations.
> >
> > The current assumption also appears to be wrong, by ignoring the fact
> > that users can provide empty cpumasks. This can result in bugs as
> > explained in [1] - for_each_cpu() will run one iteration of the loop
> > even when passed an empty cpumask.
> >
> > This series introduces some basic tests, and updates the optimisations
> > for uniprocessor builds.
> >
> > The x86 patch was written after the kernel test robot [2] ran into a
> > failed build. I have tried to list the files potentially affected by the
> > changes to cpumask.h, in an attempt to find any other cases that fail on
> > !SMP. I've gone through some of the files manually, and ran a few cross
> > builds, but nothing else popped up. I (build) checked about half of the
> > potientally affected files, but I do not have the resources to do them
> > all. I hope we can fix other issues if/when they pop up later.
> >
> > [1] https://lore.kernel.org/all/[email protected]/
> > [2] https://lore.kernel.org/all/[email protected]/
>  
> Hi Sander,
>
> I tried to apply it on top of bitmap-for next, and there are many conflicts
> with already pulled patches. There's nothing really scary, just functions
> changed their prototypes and locations. Can you try your series on top of
> bitmap-for-next from [email protected]:/norov/linux.git (or just -next)?
>
> I'm asking you to do it instead of doing myself because I don't want to
> screwup your code accidentally and because many cpumask functions in -next
> are moved to the header, and it would be probably possible to avoid building
> cpumask.o in UP case.
>
> Briefly looking into the -next code, cpumask.c hosts  only cpumask_next_wrap()
> that is not overwritten by UP code, and in UP case it can be simplified.

Sure. I've rebased my patches and added a UP-version for cpumask_next_wrap(), so
cpumask.o doesn't have to be built anymore in that case.

How would you like to proceed with these patches? It's fine by me if you take
them through your tree to avoid more merge conflicts with your changes, but then
Andrew woud have to drop the series from mm-nonmm-stable.

Best,
Sander

2022-07-31 15:34:13

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] cpumask: fix invalid uniprocessor assumptions

On Sun, Jul 31, 2022 at 03:02:55PM +0200, Sander Vanheule wrote:
> On Sat, 2022-07-30 at 11:15 -0700, Yury Norov wrote:
> > On Fri, Jul 29, 2022 at 09:01:17AM +0200, Sander Vanheule wrote:
> > > On uniprocessor builds, it is currently assumed that any cpumask will
> > > contain the single CPU: cpu0. This assumption is used to provide
> > > optimised implementations.
> > >
> > > The current assumption also appears to be wrong, by ignoring the fact
> > > that users can provide empty cpumasks. This can result in bugs as
> > > explained in [1] - for_each_cpu() will run one iteration of the loop
> > > even when passed an empty cpumask.
> > >
> > > This series introduces some basic tests, and updates the optimisations
> > > for uniprocessor builds.
> > >
> > > The x86 patch was written after the kernel test robot [2] ran into a
> > > failed build. I have tried to list the files potentially affected by the
> > > changes to cpumask.h, in an attempt to find any other cases that fail on
> > > !SMP. I've gone through some of the files manually, and ran a few cross
> > > builds, but nothing else popped up. I (build) checked about half of the
> > > potientally affected files, but I do not have the resources to do them
> > > all. I hope we can fix other issues if/when they pop up later.
> > >
> > > [1] https://lore.kernel.org/all/[email protected]/
> > > [2] https://lore.kernel.org/all/[email protected]/
> > ?
> > Hi Sander,
> >
> > I tried to apply it on top of bitmap-for next, and there are many conflicts
> > with already pulled patches. There's nothing really scary, just functions
> > changed their prototypes and locations. Can you try your series on top of
> > bitmap-for-next from [email protected]:/norov/linux.git (or just -next)?
> >
> > I'm asking you to do it instead of doing myself because I don't want to
> > screwup your code accidentally and because many cpumask functions in -next
> > are moved to the header, and it would be probably possible to avoid building
> > cpumask.o in UP case.
> >
> > Briefly looking into the -next code, cpumask.c hosts? only cpumask_next_wrap()
> > that is not overwritten by UP code, and in UP case it can be simplified.
>
> Sure. I've rebased my patches and added a UP-version for cpumask_next_wrap(), so
> cpumask.o doesn't have to be built anymore in that case.

Thanks!

> How would you like to proceed with these patches? It's fine by me if you take
> them through your tree to avoid more merge conflicts with your changes, but then
> Andrew woud have to drop the series from mm-nonmm-stable.

I also thing that it should go through bitmap thee. Andrew, are you OK with
that?

2022-08-02 11:18:19

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v5 1/5] x86/cacheinfo: move shared cache map definitions


* Sander Vanheule <[email protected]> wrote:

> The maps to keep track of shared caches between CPUs on SMP systems are
> declared in asm/smp.h, among them specifically cpu_llc_shared_map. These
> maps are externally defined in cpu/smpboot.c. The latter is only compiled
> on CONFIG_SMP=y, which means the declared extern symbols from asm/smp.h do
> not have a corresponding definition on uniprocessor builds.
>
> The inline cpu_llc_shared_mask() function from asm/smp.h refers to the map
> declaration mentioned above. This function is referenced in cacheinfo.c
> inside for_each_cpu() loop macros, to provide cpumask for the loop. On
> uniprocessor builds, the symbol for the cpu_llc_shared_map does not exist.
> However, the current implementation of for_each_cpu() also (wrongly)
> ignores the provided mask.
>
> By sheer luck, the compiler thus optimises out this unused reference to
> cpu_llc_shared_map, and the linker therefore does not require the
> cpu_llc_shared_mask to actually exist on uniprocessor builds. Only on SMP
> bulids does smpboot.o exist to provide the required symbols.
>
> To no longer rely on compiler optimisations for successful uniprocessor
> builds, move the definitions of cpu_llc_shared_map and cpu_l2c_shared_map
> from smpboot.c to cacheinfo.c.
>
> Signed-off-by: Sander Vanheule <[email protected]>
> Cc: Andy Shevchenko <[email protected]>
> Cc: Marco Elver <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Valentin Schneider <[email protected]>
> Cc: Yury Norov <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: "H. Peter Anvin" <[email protected]>
> ---
> Changes since v3:
> - New patch
>
> arch/x86/kernel/cpu/cacheinfo.c | 6 ++++++
> arch/x86/kernel/smpboot.c | 4 ----
> 2 files changed, 6 insertions(+), 4 deletions(-)

Reviewed-by: Ingo Molnar <[email protected]>

I'm also fine with you guys carrying this in the bitmap tree - the x86
impact is incidental.

Thanks,

Ingo

2022-08-08 17:06:51

by Saurabh Singh Sengar

[permalink] [raw]
Subject: [RFC] issue with cpumask for UniProcessor


Hi,

I am working on a UniProcessor system with latest linux-next kernel (20220803).
I observed two files "shared_cpu_map” and “shared_cpu_list” are missing
for L3 cache (/sys/devices/system/cpu/cpu0/cache/index3). This causes lscpu
version 2.34 to segfault. On further digging I figured below is the commit
which introduced this problem.

https://lore.kernel.org/lkml/e78c55ecb98172356248a7a89da501479ead6ae0.1659077534.git.sander@svanheule.net/


I am not 100% certain what the proper fix for it is, but below changes fix
this issue. I understand above patch is already confirmed for linux kernel
6.0, please suggest if we need fixing this in 6.0.

Regards,
Saurabh



diff --git a/lib/cpumask.c b/lib/cpumask.c
index b9728513a4d4..81fc2e35b5b1 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -16,10 +16,14 @@
*/
unsigned int cpumask_next(int n, const struct cpumask *srcp)
{
+#if NR_CPUS == 1
+ return n+1;
+#else
/* -1 is a legal arg here. */
if (n != -1)
cpumask_check(n);
return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
+#endif
}
EXPORT_SYMBOL(cpumask_next);

2022-08-08 17:37:40

by Sander Vanheule

[permalink] [raw]
Subject: Re: [RFC] issue with cpumask for UniProcessor

Hi Saurabh,

On Mon, 2022-08-08 at 09:23 -0700, Saurabh Sengar wrote:
>
> Hi,
>
> I am working on a UniProcessor system with latest linux-next kernel
> (20220803).
> I observed two files "shared_cpu_map” and “shared_cpu_list” are missing
> for L3 cache (/sys/devices/system/cpu/cpu0/cache/index3). This causes lscpu
> version 2.34 to segfault. On further digging I figured below is the commit
> which introduced this problem.
>
> https://lore.kernel.org/lkml/e78c55ecb98172356248a7a89da501479ead6ae0.1659077534.git.sander@svanheule.net/
>

This is the v5 of the patch, which sadly isn't the version that got merged. The
commit that's triggering your issue is b81dce77cedc ("cpumask: Fix invalid
uniprocessor mask assumption"), which is patch v4.

https://lore.kernel.org/lkml/86bf3f005abba2d92120ddd0809235cab4f759a6.1656777646.git.sander@svanheule.net/

>
> I am not 100% certain what the proper fix for it is, but below changes fix
> this issue. I understand above patch is already confirmed for linux kernel
> 6.0, please suggest if we need fixing this in 6.0.
>
> Regards,
> Saurabh
>
>
>
> diff --git a/lib/cpumask.c b/lib/cpumask.c
> index b9728513a4d4..81fc2e35b5b1 100644
> --- a/lib/cpumask.c
> +++ b/lib/cpumask.c
> @@ -16,10 +16,14 @@
>   */
>  unsigned int cpumask_next(int n, const struct cpumask *srcp)
>  {
> +#if NR_CPUS == 1
> +       return n+1;
> +#else

This is ignoring the provided cpumask again, which was exactly what my patch
fixed. If the mask is empty, then cpumask_next(-1, mask) should return (at
least) 1, not 0.

I think the problem could be caused by cpumask_next() getting an empty mask.
Then the real issue is would be that a certain mask is empty when it shouldn't
be, which was compensated by the old code's built-in assumption that a cpumask
couldn't be empty.

My MIPS testing system doesn't have these L3 maps, and "shared_cpu_map" and
"shared_cpu_list" are present for index0 and index1. I would propose that you
look for the point where the files should be created, and check how
cpumask_next() is involved, to find the actual cause of this problem.

Best,
Sander

>         /* -1 is a legal arg here. */
>         if (n != -1)
>                 cpumask_check(n);
>         return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
> +#endif
>  }
>  EXPORT_SYMBOL(cpumask_next);