2022-04-13 02:34:56

by Libo Chen

[permalink] [raw]
Subject: [PATCH RESEND 0/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

We encountered an issue related to NR_CPUMASK_BITS on ARM64. It turned
out CPUMASK_OFFSTACK was not enabled and NR_CPUS is set to 4096 in our
aarch64 config, so each cpumask operation is unnecessarily expensive
esp. on small VMs. There are decent number of cpumask operations in the
scheduler. Many of them are on the hot paths. The cumulative effect is
quite siginificant. With CONFIG_CPUMASK_OFFSTACKT set, we saw ~10% gain on
ApacheBench and a few percentage on redis in a 32-core ARM64 VM with 5.16
kernel.
One may argue why not just set NR_CPUS to smaller values, the thing is,
for example, we support VMs with flexible shapes. The customer can
specify the number of CPUs ranging from 1 to 160. You cannot just prepare
160 kernel images for each release with one for each shape. And kernel
with NR_CPUS=160 doesn't perform well on a 1-CPU VM. It's important that
we can set CPUMASK_OFFSTACK=y so that the kernel can dynamically allocate
cpumasks based on the actual number of CPUs in the system.

The problem I am trying to address in this patch is currently
CPUMASK_OFFSTACK depends on DEBUG_PER_CPU_MAPS except for x86 and it
doesn't even show up in kconfig menu as well as kconfig
without DEBUG_PER_CPU_MAPS=y. We should remove such outdated,
unnecessary dependency.

Of course, I am open to other ideas. And people who are more familar with
this matter can shed a light on why this dependency has been kept for so
long. My goal here is to determine if DEBUG_PER_CPU_MAPS is absoultely
necessary for CPUMASK_OFFSTACK.

RESEND:
- Correct the subject line of cover letter
- Send to more people, esp. gkh for more attention since no one seems to
be responsible for this part of code

Libo Chen (1):
lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

lib/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--
2.27.0


2022-04-13 02:49:37

by Libo Chen

[permalink] [raw]
Subject: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
make a lot of sense nowaday. Even the original patch dating back to 2008,
aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
rationale for such dependency.

Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
There is no reason other architectures cannot given the fact that they
have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
x86.

Signed-off-by: Libo Chen <[email protected]>
---
lib/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/Kconfig b/lib/Kconfig
index 087e06b4cdfd..7209039dfb59 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -511,7 +511,7 @@ config CHECK_SIGNATURE
bool

config CPUMASK_OFFSTACK
- bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
+ bool "Force CPU masks off stack"
help
Use dynamic allocation for cpumask_var_t, instead of putting
them on the stack. This is a bit more expensive, but avoids
--
2.27.0

2022-04-13 03:02:05

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

Hi--

On 4/12/22 16:15, Libo Chen wrote:
> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
> make a lot of sense nowaday. Even the original patch dating back to 2008,
> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
> rationale for such dependency.
>
> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
> There is no reason other architectures cannot given the fact that they
> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
> x86.
>
> Signed-off-by: Libo Chen <[email protected]>
> ---
> lib/Kconfig | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/Kconfig b/lib/Kconfig
> index 087e06b4cdfd..7209039dfb59 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
> bool
>
> config CPUMASK_OFFSTACK
> - bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS

This "if" dependency only controls whether the Kconfig symbol's prompt is
displayed (presented) in kconfig tools. Removing it makes the prompt always
be displayed.

Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
of DEBUG_PER_CPU_MAPS.

Is there another problem here?

> + bool "Force CPU masks off stack"
> help
> Use dynamic allocation for cpumask_var_t, instead of putting
> them on the stack. This is a bit more expensive, but avoids

--
~Randy

2022-04-13 03:34:03

by Libo Chen

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK



On 4/12/22 19:13, Randy Dunlap wrote:
> Hi,
>
> On 4/12/22 18:35, Libo Chen wrote:
>> Hi Randy,
>>
>> On 4/12/22 17:18, Randy Dunlap wrote:
>>> Hi--
>>>
>>> On 4/12/22 16:15, Libo Chen wrote:
>>>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
>>>> make a lot of sense nowaday. Even the original patch dating back to 2008,
>>>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
>>>> rationale for such dependency.
>>>>
>>>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
>>>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
>>>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
>>>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
>>>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
>>>> There is no reason other architectures cannot given the fact that they
>>>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
>>>> x86.
>>>>
>>>> Signed-off-by: Libo Chen <[email protected]>
>>>> ---
>>>>   lib/Kconfig | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/lib/Kconfig b/lib/Kconfig
>>>> index 087e06b4cdfd..7209039dfb59 100644
>>>> --- a/lib/Kconfig
>>>> +++ b/lib/Kconfig
>>>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>>>>       bool
>>>>     config CPUMASK_OFFSTACK
>>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>> This "if" dependency only controls whether the Kconfig symbol's prompt is
>>> displayed (presented) in kconfig tools. Removing it makes the prompt always
>>> be displayed.
>>>
>>> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
>>> of DEBUG_PER_CPU_MAPS.
>> Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under some config xxx? That will work but it requires code changes for each architecture.
>> But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it doesn't work.
> I'm just talking about the Kconfig change below. Not talking about whatever else
> it might require per architecture.
>
> But you say you have tried that and it doesn't work. What part of it doesn't work?
> The Kconfig part or some code execution?
oh the Kconfig part. For example, make olddefconfig on a config file
with CPUMASK_OFFSTACK=y only turns off CPUMASK_OFFSTACK unless I
explicitly set DEBUG_PER_CPU_MAPS=y

Libo
> I'll test the Kconfig part of it later (in a few hours).
>
>> Libo
>>> Is there another problem here?
>>>
>>>> +    bool "Force CPU masks off stack"
>>>>       help
>>>>         Use dynamic allocation for cpumask_var_t, instead of putting
>>>>         them on the stack.  This is a bit more expensive, but avoids

2022-04-13 05:38:18

by Libo Chen

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

Hi Randy,

On 4/12/22 17:18, Randy Dunlap wrote:
> Hi--
>
> On 4/12/22 16:15, Libo Chen wrote:
>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
>> make a lot of sense nowaday. Even the original patch dating back to 2008,
>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
>> rationale for such dependency.
>>
>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
>> There is no reason other architectures cannot given the fact that they
>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
>> x86.
>>
>> Signed-off-by: Libo Chen <[email protected]>
>> ---
>> lib/Kconfig | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/lib/Kconfig b/lib/Kconfig
>> index 087e06b4cdfd..7209039dfb59 100644
>> --- a/lib/Kconfig
>> +++ b/lib/Kconfig
>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>> bool
>>
>> config CPUMASK_OFFSTACK
>> - bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
> This "if" dependency only controls whether the Kconfig symbol's prompt is
> displayed (presented) in kconfig tools. Removing it makes the prompt always
> be displayed.
>
> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
> of DEBUG_PER_CPU_MAPS.
Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under
some config xxx? That will work but it requires code changes for each
architecture.
But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without
CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it
doesn't work.

Libo
> Is there another problem here?
>
>> + bool "Force CPU masks off stack"
>> help
>> Use dynamic allocation for cpumask_var_t, instead of putting
>> them on the stack. This is a bit more expensive, but avoids

2022-04-13 06:55:12

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

Hi Libo,

On 4/12/22 19:34, Libo Chen wrote:
>
>
> On 4/12/22 19:13, Randy Dunlap wrote:
>> Hi,
>>
>> On 4/12/22 18:35, Libo Chen wrote:
>>> Hi Randy,
>>>
>>> On 4/12/22 17:18, Randy Dunlap wrote:
>>>> Hi--
>>>>
>>>> On 4/12/22 16:15, Libo Chen wrote:
>>>>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
>>>>> make a lot of sense nowaday. Even the original patch dating back to 2008,
>>>>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
>>>>> rationale for such dependency.
>>>>>
>>>>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
>>>>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
>>>>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
>>>>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
>>>>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
>>>>> There is no reason other architectures cannot given the fact that they
>>>>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
>>>>> x86.
>>>>>
>>>>> Signed-off-by: Libo Chen <[email protected]>
>>>>> ---
>>>>>    lib/Kconfig | 2 +-
>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/lib/Kconfig b/lib/Kconfig
>>>>> index 087e06b4cdfd..7209039dfb59 100644
>>>>> --- a/lib/Kconfig
>>>>> +++ b/lib/Kconfig
>>>>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>>>>>        bool
>>>>>      config CPUMASK_OFFSTACK
>>>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>>> This "if" dependency only controls whether the Kconfig symbol's prompt is
>>>> displayed (presented) in kconfig tools. Removing it makes the prompt always
>>>> be displayed.
>>>>
>>>> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
>>>> of DEBUG_PER_CPU_MAPS.
>>> Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under some config xxx? That will work but it requires code changes for each architecture.
>>> But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it doesn't work.
>> I'm just talking about the Kconfig change below.  Not talking about whatever else
>> it might require per architecture.
>>
>> But you say you have tried that and it doesn't work. What part of it doesn't work?
>> The Kconfig part or some code execution?
> oh the Kconfig part. For example, make olddefconfig on a config file with CPUMASK_OFFSTACK=y only turns off CPUMASK_OFFSTACK unless I explicitly set DEBUG_PER_CPU_MAPS=y

I can enable CPUMASK_OFFSTACK for arm64 without having DEBUG_PER_CPU_MAPS enabled.
(with a patch, of course.)
It builds OK. I don't know if it will run OK.

I think that you are arguing for a patch like this:

--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -511,7 +511,8 @@ config CHECK_SIGNATURE
bool

config CPUMASK_OFFSTACK
- bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
+ bool "Force CPU masks off stack"
+ depends on DEBUG_PER_CPU_MAPS
help
Use dynamic allocation for cpumask_var_t, instead of putting
them on the stack. This is a bit more expensive, but avoids


As I said earlier, the "if" on the "bool" line just controls the prompt message.
This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.


> Libo
>> I'll test the Kconfig part of it later (in a few hours).
>>
>>> Libo
>>>> Is there another problem here?
>>>>
>>>>> +    bool "Force CPU masks off stack"
>>>>>        help
>>>>>          Use dynamic allocation for cpumask_var_t, instead of putting
>>>>>          them on the stack.  This is a bit more expensive, but avoids
>

--
~Randy

2022-04-13 06:57:29

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

Hi,

On 4/12/22 18:35, Libo Chen wrote:
> Hi Randy,
>
> On 4/12/22 17:18, Randy Dunlap wrote:
>> Hi--
>>
>> On 4/12/22 16:15, Libo Chen wrote:
>>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
>>> make a lot of sense nowaday. Even the original patch dating back to 2008,
>>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
>>> rationale for such dependency.
>>>
>>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
>>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
>>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
>>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
>>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
>>> There is no reason other architectures cannot given the fact that they
>>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
>>> x86.
>>>
>>> Signed-off-by: Libo Chen <[email protected]>
>>> ---
>>>   lib/Kconfig | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/lib/Kconfig b/lib/Kconfig
>>> index 087e06b4cdfd..7209039dfb59 100644
>>> --- a/lib/Kconfig
>>> +++ b/lib/Kconfig
>>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>>>       bool
>>>     config CPUMASK_OFFSTACK
>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>> This "if" dependency only controls whether the Kconfig symbol's prompt is
>> displayed (presented) in kconfig tools. Removing it makes the prompt always
>> be displayed.
>>
>> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
>> of DEBUG_PER_CPU_MAPS.
> Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under some config xxx? That will work but it requires code changes for each architecture.
> But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it doesn't work.

I'm just talking about the Kconfig change below. Not talking about whatever else
it might require per architecture.

But you say you have tried that and it doesn't work. What part of it doesn't work?
The Kconfig part or some code execution?

I'll test the Kconfig part of it later (in a few hours).

> Libo
>> Is there another problem here?
>>
>>> +    bool "Force CPU masks off stack"
>>>       help
>>>         Use dynamic allocation for cpumask_var_t, instead of putting
>>>         them on the stack.  This is a bit more expensive, but avoids
>

--
~Randy

2022-04-13 08:30:41

by Libo Chen

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

Hi Randy

On 4/12/22 22:54, Randy Dunlap wrote:
> Hi Libo,
>
> On 4/12/22 19:34, Libo Chen wrote:
>>
>> On 4/12/22 19:13, Randy Dunlap wrote:
>>> Hi,
>>>
>>> On 4/12/22 18:35, Libo Chen wrote:
>>>> Hi Randy,
>>>>
>>>> On 4/12/22 17:18, Randy Dunlap wrote:
>>>>> Hi--
>>>>>
>>>>> On 4/12/22 16:15, Libo Chen wrote:
>>>>>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
>>>>>> make a lot of sense nowaday. Even the original patch dating back to 2008,
>>>>>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
>>>>>> rationale for such dependency.
>>>>>>
>>>>>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
>>>>>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
>>>>>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
>>>>>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
>>>>>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
>>>>>> There is no reason other architectures cannot given the fact that they
>>>>>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
>>>>>> x86.
>>>>>>
>>>>>> Signed-off-by: Libo Chen <[email protected]>
>>>>>> ---
>>>>>>    lib/Kconfig | 2 +-
>>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/lib/Kconfig b/lib/Kconfig
>>>>>> index 087e06b4cdfd..7209039dfb59 100644
>>>>>> --- a/lib/Kconfig
>>>>>> +++ b/lib/Kconfig
>>>>>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>>>>>>        bool
>>>>>>      config CPUMASK_OFFSTACK
>>>>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>>>> This "if" dependency only controls whether the Kconfig symbol's prompt is
>>>>> displayed (presented) in kconfig tools. Removing it makes the prompt always
>>>>> be displayed.
>>>>>
>>>>> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
>>>>> of DEBUG_PER_CPU_MAPS.
>>>> Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under some config xxx? That will work but it requires code changes for each architecture.
>>>> But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it doesn't work.
>>> I'm just talking about the Kconfig change below.  Not talking about whatever else
>>> it might require per architecture.
>>>
>>> But you say you have tried that and it doesn't work. What part of it doesn't work?
>>> The Kconfig part or some code execution?
>> oh the Kconfig part. For example, make olddefconfig on a config file with CPUMASK_OFFSTACK=y only turns off CPUMASK_OFFSTACK unless I explicitly set DEBUG_PER_CPU_MAPS=y
> I can enable CPUMASK_OFFSTACK for arm64 without having DEBUG_PER_CPU_MAPS enabled.
> (with a patch, of course.)
> It builds OK. I don't know if it will run OK.

I am a little confused, did you succeed with your patch (replacing "if"
with "depends on") or my patch (removing "if")? Because I definitely
cannot enable CPUMASK_OFFSTACK for arm64 without DEBUG_PER_CPUMAPS
enabled using your change.
> I think that you are arguing for a patch like this:

I am actually arguing for the opposite, I don't think CPUMASK_OFFSTACK
should require DEBUG_PER_CPU_MAPS. They should be separate and
independent to each other. So removing "if ..." should be enough in my
opinion.
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -511,7 +511,8 @@ config CHECK_SIGNATURE
> bool
>
> config CPUMASK_OFFSTACK
> - bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
> + bool "Force CPU masks off stack"
> + depends on DEBUG_PER_CPU_MAPS

This forces every arch to enable DEBUG_PER_CPU_MAPS if they want to
enable CPUMASK_OFFSTACK, it's even stronger than "if". My whole argument
is CPUMASK_OFFSTACK should be enable/disabled independent of
DEBUG_PER_CPU_MASK
> help
> Use dynamic allocation for cpumask_var_t, instead of putting
> them on the stack. This is a bit more expensive, but avoids
>
>
> As I said earlier, the "if" on the "bool" line just controls the prompt message.
> This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.
>

Okay I understand now "if" on the "boot" is not a dependency and it only
controls the prompt message, then the question is why we cannot enable
CPUMASK_OFFSTACK without DEBUG_PER_CPU_MAPS if it only controls prompt
message? Is it not the behavior we expect?

Libo

>> Libo
>>> I'll test the Kconfig part of it later (in a few hours).
>>>
>>>> Libo
>>>>> Is there another problem here?
>>>>>
>>>>>> +    bool "Force CPU masks off stack"
>>>>>>        help
>>>>>>          Use dynamic allocation for cpumask_var_t, instead of putting
>>>>>>          them on the stack.  This is a bit more expensive, but avoids

2022-04-13 12:27:44

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

On Wed, Apr 13, 2022 at 3:56 PM Libo Chen <[email protected]> wrote:
>
> Hi Randy
>
> On 4/12/22 22:54, Randy Dunlap wrote:
> > Hi Libo,
> >
> > On 4/12/22 19:34, Libo Chen wrote:
> >>
> >> On 4/12/22 19:13, Randy Dunlap wrote:
> >>> Hi,
> >>>
> >>> On 4/12/22 18:35, Libo Chen wrote:
> >>>> Hi Randy,
> >>>>
> >>>> On 4/12/22 17:18, Randy Dunlap wrote:
> >>>>> Hi--
> >>>>>
> >>>>> On 4/12/22 16:15, Libo Chen wrote:
> >>>>>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
> >>>>>> make a lot of sense nowaday. Even the original patch dating back to 2008,
> >>>>>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
> >>>>>> rationale for such dependency.
> >>>>>>
> >>>>>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
> >>>>>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
> >>>>>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
> >>>>>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
> >>>>>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
> >>>>>> There is no reason other architectures cannot given the fact that they
> >>>>>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
> >>>>>> x86.
> >>>>>>
> >>>>>> Signed-off-by: Libo Chen <[email protected]>
> >>>>>> ---
> >>>>>> lib/Kconfig | 2 +-
> >>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/lib/Kconfig b/lib/Kconfig
> >>>>>> index 087e06b4cdfd..7209039dfb59 100644
> >>>>>> --- a/lib/Kconfig
> >>>>>> +++ b/lib/Kconfig
> >>>>>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
> >>>>>> bool
> >>>>>> config CPUMASK_OFFSTACK
> >>>>>> - bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
> >>>>> This "if" dependency only controls whether the Kconfig symbol's prompt is
> >>>>> displayed (presented) in kconfig tools. Removing it makes the prompt always
> >>>>> be displayed.
> >>>>>
> >>>>> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
> >>>>> of DEBUG_PER_CPU_MAPS.
> >>>> Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under some config xxx? That will work but it requires code changes for each architecture.
> >>>> But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it doesn't work.
> >>> I'm just talking about the Kconfig change below. Not talking about whatever else
> >>> it might require per architecture.
> >>>
> >>> But you say you have tried that and it doesn't work. What part of it doesn't work?
> >>> The Kconfig part or some code execution?
> >> oh the Kconfig part. For example, make olddefconfig on a config file with CPUMASK_OFFSTACK=y only turns off CPUMASK_OFFSTACK unless I explicitly set DEBUG_PER_CPU_MAPS=y
> > I can enable CPUMASK_OFFSTACK for arm64 without having DEBUG_PER_CPU_MAPS enabled.
> > (with a patch, of course.)
> > It builds OK. I don't know if it will run OK.
>
> I am a little confused, did you succeed with your patch (replacing "if"
> with "depends on") or my patch (removing "if")? Because I definitely
> cannot enable CPUMASK_OFFSTACK for arm64 without DEBUG_PER_CPUMAPS
> enabled using your change.
> > I think that you are arguing for a patch like this:
>
> I am actually arguing for the opposite, I don't think CPUMASK_OFFSTACK
> should require DEBUG_PER_CPU_MAPS. They should be separate and
> independent to each other. So removing "if ..." should be enough in my
> opinion.
> > --- a/lib/Kconfig
> > +++ b/lib/Kconfig
> > @@ -511,7 +511,8 @@ config CHECK_SIGNATURE
> > bool
> >
> > config CPUMASK_OFFSTACK
> > - bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
> > + bool "Force CPU masks off stack"
> > + depends on DEBUG_PER_CPU_MAPS
>
> This forces every arch to enable DEBUG_PER_CPU_MAPS if they want to
> enable CPUMASK_OFFSTACK, it's even stronger than "if". My whole argument
> is CPUMASK_OFFSTACK should be enable/disabled independent of
> DEBUG_PER_CPU_MASK
> > help
> > Use dynamic allocation for cpumask_var_t, instead of putting
> > them on the stack. This is a bit more expensive, but avoids
> >
> >
> > As I said earlier, the "if" on the "bool" line just controls the prompt message.
> > This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.
> >
>
> Okay I understand now "if" on the "boot" is not a dependency and it only
> controls the prompt message, then the question is why we cannot enable
> CPUMASK_OFFSTACK without DEBUG_PER_CPU_MAPS if it only controls prompt
> message? Is it not the behavior we expect?
>


config CPUMASK_OFFSTACK
bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS

... is equivalent to this:

config CPUMASK_OFFSTACK
bool
prompt "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS



When DEBUG_PER_CPU_MAPS is disabled, the prompt line is ignored,
and CPUMASK_OFFSTACK becomes a user-unconfigurable option.


Other options still can select it,
but users cannot enable it directly from the prompt.

I see x86 and powerpc do this.

$ kgrep 'select CPUMASK_OFFSTACK'
./arch/x86/Kconfig:946: select CPUMASK_OFFSTACK
./arch/powerpc/Kconfig:164: select CPUMASK_OFFSTACK if NR_CPUS >= 8192







> Libo
>
> >> Libo
> >>> I'll test the Kconfig part of it later (in a few hours).
> >>>
> >>>> Libo
> >>>>> Is there another problem here?
> >>>>>
> >>>>>> + bool "Force CPU masks off stack"
> >>>>>> help
> >>>>>> Use dynamic allocation for cpumask_var_t, instead of putting
> >>>>>> them on the stack. This is a bit more expensive, but avoids
>


--
Best Regards
Masahiro Yamada

2022-04-14 12:01:37

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

Hi Libo,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.18-rc2 next-20220413]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/intel-lab-lkp/linux/commits/Libo-Chen/lib-Kconfig-remove-DEBUG_PER_CPU_MAPS-dependency-for-CPUMASK_OFFSTACK/20220413-073657
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ce522ba9ef7e2d9fb22a39eb3371c0c64e2a433e
config: nios2-randconfig-r023-20220413 (https://download.01.org/0day-ci/archive/20220413/[email protected]/config)
compiler: nios2-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/6636f7cf28d2a79cde937c0f212e8a87080da06d
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Libo-Chen/lib-Kconfig-remove-DEBUG_PER_CPU_MAPS-dependency-for-CPUMASK_OFFSTACK/20220413-073657
git checkout 6636f7cf28d2a79cde937c0f212e8a87080da06d
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=nios2 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 71
kernel/workqueue.o: in function `free_workqueue_attrs':
workqueue.c:(.text+0x3fb8): undefined reference to `free_cpumask_var'
workqueue.c:(.text+0x3fb8): relocation truncated to fit: R_NIOS2_CALL26 against `free_cpumask_var'
nios2-linux-ld: kernel/workqueue.o: in function `alloc_workqueue_attrs':
>> workqueue.c:(.text+0x40d4): undefined reference to `alloc_cpumask_var'
workqueue.c:(.text+0x40d4): relocation truncated to fit: R_NIOS2_CALL26 against `alloc_cpumask_var'
nios2-linux-ld: kernel/workqueue.o: in function `workqueue_set_unbound_cpumask':
workqueue.c:(.text+0x5584): undefined reference to `zalloc_cpumask_var'
workqueue.c:(.text+0x5584): relocation truncated to fit: R_NIOS2_CALL26 against `zalloc_cpumask_var'
nios2-linux-ld: workqueue.c:(.text+0x5680): undefined reference to `free_cpumask_var'
workqueue.c:(.text+0x5680): relocation truncated to fit: R_NIOS2_CALL26 against `free_cpumask_var'
nios2-linux-ld: kernel/workqueue.o: in function `wq_unbound_cpumask_store':
workqueue.c:(.text+0x56e0): undefined reference to `zalloc_cpumask_var'
workqueue.c:(.text+0x56e0): relocation truncated to fit: R_NIOS2_CALL26 against `zalloc_cpumask_var'
nios2-linux-ld: workqueue.c:(.text+0x5718): undefined reference to `free_cpumask_var'
workqueue.c:(.text+0x5718): relocation truncated to fit: R_NIOS2_CALL26 against `free_cpumask_var'
nios2-linux-ld: kernel/workqueue.o: in function `workqueue_init_early':
>> workqueue.c:(.init.text+0x1e8): undefined reference to `alloc_cpumask_var'
workqueue.c:(.init.text+0x1e8): relocation truncated to fit: R_NIOS2_CALL26 against `alloc_cpumask_var'
nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 15310
kernel/sched/core.o: in function `sched_setaffinity':
>> core.c:(.text+0x3b40): undefined reference to `alloc_cpumask_var'
core.c:(.text+0x3b40): relocation truncated to fit: R_NIOS2_CALL26 against `alloc_cpumask_var'
>> nios2-linux-ld: core.c:(.text+0x3b54): undefined reference to `alloc_cpumask_var'
core.c:(.text+0x3b54): relocation truncated to fit: R_NIOS2_CALL26 against `alloc_cpumask_var'
>> nios2-linux-ld: core.c:(.text+0x3be0): undefined reference to `free_cpumask_var'
core.c:(.text+0x3be0): relocation truncated to fit: R_NIOS2_CALL26 against `free_cpumask_var'
nios2-linux-ld: core.c:(.text+0x3bf0): undefined reference to `free_cpumask_var'
core.c:(.text+0x3bf0): additional relocation overflows omitted from the output
nios2-linux-ld: kernel/sched/core.o: in function `__se_sys_sched_setaffinity':
core.c:(.text+0x3c64): undefined reference to `alloc_cpumask_var'
nios2-linux-ld: core.c:(.text+0x3cc4): undefined reference to `free_cpumask_var'
nios2-linux-ld: kernel/sched/core.o: in function `__se_sys_sched_getaffinity':
core.c:(.text+0x3dd4): undefined reference to `alloc_cpumask_var'
nios2-linux-ld: core.c:(.text+0x3e20): undefined reference to `free_cpumask_var'
nios2-linux-ld: kernel/sched/core.o: in function `sched_init':
>> core.c:(.init.text+0x27c): undefined reference to `load_balance_mask'
>> nios2-linux-ld: core.c:(.init.text+0x284): undefined reference to `load_balance_mask'
>> nios2-linux-ld: core.c:(.init.text+0x28c): undefined reference to `select_idle_mask'
nios2-linux-ld: core.c:(.init.text+0x290): undefined reference to `select_idle_mask'
nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 18
kernel/sched/build_utility.o: in function `housekeeping_setup_type':
build_utility.c:(.init.text+0x130): undefined reference to `alloc_bootmem_cpumask_var'
nios2-linux-ld: kernel/sched/build_utility.o: in function `housekeeping_setup':
build_utility.c:(.init.text+0x198): undefined reference to `alloc_bootmem_cpumask_var'
nios2-linux-ld: build_utility.c:(.init.text+0x1b4): undefined reference to `alloc_bootmem_cpumask_var'
nios2-linux-ld: build_utility.c:(.init.text+0x2f4): undefined reference to `free_bootmem_cpumask_var'
nios2-linux-ld: build_utility.c:(.init.text+0x304): undefined reference to `free_bootmem_cpumask_var'
nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 91
kernel/profile.o: in function `prof_cpu_mask_proc_write':
profile.c:(.text+0x40c): undefined reference to `zalloc_cpumask_var'
nios2-linux-ld: profile.c:(.text+0x450): undefined reference to `free_cpumask_var'
nios2-linux-ld: kernel/profile.o: in function `profile_init':
>> profile.c:(.ref.text+0xc0): undefined reference to `alloc_cpumask_var'
>> nios2-linux-ld: profile.c:(.ref.text+0x134): undefined reference to `free_cpumask_var'
nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 700424681
kernel/time/hrtimer.o: in function `clock_was_set':
>> hrtimer.c:(.text+0xc48): undefined reference to `zalloc_cpumask_var'
>> nios2-linux-ld: hrtimer.c:(.text+0xd50): undefined reference to `free_cpumask_var'
nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 361628
kernel/torture.o: in function `torture_cleanup_begin':
>> torture.c:(.text+0x758): undefined reference to `free_cpumask_var'
nios2-linux-ld: kernel/torture.o: in function `torture_shuffle_init':
>> torture.c:(.text+0xbe4): undefined reference to `alloc_cpumask_var'
nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 13757
block/blk-mq.o: in function `blk_mq_alloc_hctx':
blk-mq.c:(.text+0x1550): undefined reference to `zalloc_cpumask_var_node'
nios2-linux-ld: blk-mq.c:(.text+0x16d4): undefined reference to `free_cpumask_var'
nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 264202560
block/blk-mq-sysfs.o: in function `blk_mq_hw_sysfs_release':
blk-mq-sysfs.c:(.text+0x2c0): undefined reference to `free_cpumask_var'
nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 3666
drivers/base/cpu.o: in function `print_cpus_offline':
cpu.c:(.text+0x94): undefined reference to `alloc_cpumask_var'
nios2-linux-ld: cpu.c:(.text+0xec): undefined reference to `free_cpumask_var'
nios2-linux-ld: drivers/base/cpu.o: in function `print_cpus_isolated':
cpu.c:(.text+0x1b8): undefined reference to `alloc_cpumask_var'
nios2-linux-ld: cpu.c:(.text+0x20c): undefined reference to `free_cpumask_var'

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-04-14 13:57:26

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

On Wed, Apr 13, 2022 at 9:28 PM Libo Chen <[email protected]> wrote:
> On 4/13/22 08:41, Randy Dunlap wrote:
> > On 4/12/22 23:56, Libo Chen wrote:
> >>> --- a/lib/Kconfig
> >>> +++ b/lib/Kconfig
> >>> @@ -511,7 +511,8 @@ config CHECK_SIGNATURE
> >>> bool
> >>> config CPUMASK_OFFSTACK
> >>> - bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
> >>> + bool "Force CPU masks off stack"
> >>> + depends on DEBUG_PER_CPU_MAPS
> >> This forces every arch to enable DEBUG_PER_CPU_MAPS if they want to enable CPUMASK_OFFSTACK, it's even stronger than "if". My whole argument is CPUMASK_OFFSTACK should be enable/disabled independent of DEBUG_PER_CPU_MASK
> >>> help
> >>> Use dynamic allocation for cpumask_var_t, instead of putting
> >>> them on the stack. This is a bit more expensive, but avoids
> >>>
> >>>
> >>> As I said earlier, the "if" on the "bool" line just controls the prompt message.
> >>> This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.
> >>>
> >> Okay I understand now "if" on the "boot" is not a dependency and it only controls the prompt message, then the question is why we cannot enable CPUMASK_OFFSTACK without DEBUG_PER_CPU_MAPS if it only controls prompt message? Is it not the behavior we expect?
> > Yes, it is. I don't know that the problem is...
> Masahiro explained that CPUMASK_OFFSTACK can only be configured by
> options not users if DEBUG_PER_CPU_MASK is not enabled. This doesn't
> seem to be what we want.

I think the correct way to do it is to follow x86 and powerpc, and tying
CPUMASK_OFFSTACK to "large" values of CONFIG_NR_CPUS.
For smaller values of NR_CPUS, the onstack masks are obviously
cheaper, we just need to decide what the cut-off point is.

In x86, the onstack masks can be selected for normal SMP builds with
up to 512 CPUs, while CONFIG_MAXSMP=y raises the limit to 8192
CPUs while selecting CPUMASK_OFFSTACK.
PowerPC does it the other way round, selecting CPUMASK_OFFSTACK
implicitly whenever NR_CPUS is set to 8192 or more.

I think we can easily do the same as powerpc on arm64. With the
ApacheBench test you cite in the patch description, what is the
value of NR_CPUS at which you start seeing a noticeable
benefit for offstack masks? Can you do the same test for
NR_CPUS=1024 or 2048?

Arnd

2022-04-14 14:01:26

by Libo Chen

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK



On 4/13/22 08:41, Randy Dunlap wrote:
>
> On 4/12/22 23:56, Libo Chen wrote:
>> Hi Randy
>>
>> On 4/12/22 22:54, Randy Dunlap wrote:
>>> Hi Libo,
>>>
>>> On 4/12/22 19:34, Libo Chen wrote:
>>>> On 4/12/22 19:13, Randy Dunlap wrote:
>>>>> Hi,
>>>>>
>>>>> On 4/12/22 18:35, Libo Chen wrote:
>>>>>> Hi Randy,
>>>>>>
>>>>>> On 4/12/22 17:18, Randy Dunlap wrote:
>>>>>>> Hi--
>>>>>>>
>>>>>>> On 4/12/22 16:15, Libo Chen wrote:
>>>>>>>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
>>>>>>>> make a lot of sense nowaday. Even the original patch dating back to 2008,
>>>>>>>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
>>>>>>>> rationale for such dependency.
>>>>>>>>
>>>>>>>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
>>>>>>>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
>>>>>>>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
>>>>>>>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
>>>>>>>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
>>>>>>>> There is no reason other architectures cannot given the fact that they
>>>>>>>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
>>>>>>>> x86.
>>>>>>>>
>>>>>>>> Signed-off-by: Libo Chen <[email protected]>
>>>>>>>> ---
>>>>>>>>     lib/Kconfig | 2 +-
>>>>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/lib/Kconfig b/lib/Kconfig
>>>>>>>> index 087e06b4cdfd..7209039dfb59 100644
>>>>>>>> --- a/lib/Kconfig
>>>>>>>> +++ b/lib/Kconfig
>>>>>>>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>>>>>>>>         bool
>>>>>>>>       config CPUMASK_OFFSTACK
>>>>>>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>>>>>> This "if" dependency only controls whether the Kconfig symbol's prompt is
>>>>>>> displayed (presented) in kconfig tools. Removing it makes the prompt always
>>>>>>> be displayed.
>>>>>>>
>>>>>>> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
>>>>>>> of DEBUG_PER_CPU_MAPS.
>>>>>> Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under some config xxx? That will work but it requires code changes for each architecture.
>>>>>> But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it doesn't work.
>>>>> I'm just talking about the Kconfig change below.  Not talking about whatever else
>>>>> it might require per architecture.
>>>>>
>>>>> But you say you have tried that and it doesn't work. What part of it doesn't work?
>>>>> The Kconfig part or some code execution?
>>>> oh the Kconfig part. For example, make olddefconfig on a config file with CPUMASK_OFFSTACK=y only turns off CPUMASK_OFFSTACK unless I explicitly set DEBUG_PER_CPU_MAPS=y
>>> I can enable CPUMASK_OFFSTACK for arm64 without having DEBUG_PER_CPU_MAPS enabled.
>>> (with a patch, of course.)
>>> It builds OK. I don't know if it will run OK.
>> I am a little confused, did you succeed with your patch (replacing "if" with "depends on") or my patch (removing "if")? Because I definitely cannot enable CPUMASK_OFFSTACK for arm64 without DEBUG_PER_CPUMAPS enabled using your change.
> This patch builds cleanly for me:
>
> ---
> arch/arm64/Kconfig | 1 +
> lib/Kconfig | 2 +-
> 2 files changed, 2 insertions(+), 1 deletion(-)
>
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
> bool
>
> config CPUMASK_OFFSTACK
> - bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
> + bool "Force CPU masks off stack"
> help
> Use dynamic allocation for cpumask_var_t, instead of putting
> them on the stack. This is a bit more expensive, but avoids
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -316,6 +316,7 @@ config ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE
>
> config SMP
> def_bool y
> + select CPUMASK_OFFSTACK
>
> config KERNEL_MODE_NEON
> def_bool y
>
> along with:
> # CONFIG_DEBUG_PER_CPU_MAPS is not set
>
>
>>> I think that you are arguing for a patch like this:
>> I am actually arguing for the opposite, I don't think CPUMASK_OFFSTACK should require DEBUG_PER_CPU_MAPS. They should be separate and independent to each other. So removing "if ..." should be enough in my opinion.
> I agree.
>
>>> --- a/lib/Kconfig
>>> +++ b/lib/Kconfig
>>> @@ -511,7 +511,8 @@ config CHECK_SIGNATURE
>>>       bool
>>>     config CPUMASK_OFFSTACK
>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>> +    bool "Force CPU masks off stack"
>>> +    depends on DEBUG_PER_CPU_MAPS
>> This forces every arch to enable DEBUG_PER_CPU_MAPS if they want to enable CPUMASK_OFFSTACK, it's even stronger than "if". My whole argument is CPUMASK_OFFSTACK should be enable/disabled independent of DEBUG_PER_CPU_MASK
>>>       help
>>>         Use dynamic allocation for cpumask_var_t, instead of putting
>>>         them on the stack.  This is a bit more expensive, but avoids
>>>
>>>
>>> As I said earlier, the "if" on the "bool" line just controls the prompt message.
>>> This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.
>>>
>> Okay I understand now "if" on the "boot" is not a dependency and it only controls the prompt message, then the question is why we cannot enable CPUMASK_OFFSTACK without DEBUG_PER_CPU_MAPS if it only controls prompt message? Is it not the behavior we expect?
> Yes, it is. I don't know that the problem is...
Masahiro explained that CPUMASK_OFFSTACK can only be configured by
options not users if DEBUG_PER_CPU_MASK is not enabled. This doesn't
seem to be what we want.

Libo

2022-04-14 14:45:33

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

Hi,

On 4/13/22 14:50, Libo Chen wrote:
>
>
> On 4/13/22 13:52, Arnd Bergmann wrote:
>> On Wed, Apr 13, 2022 at 9:28 PM Libo Chen <[email protected]> wrote:
>>> On 4/13/22 08:41, Randy Dunlap wrote:
>>>> On 4/12/22 23:56, Libo Chen wrote:
>>>>>> --- a/lib/Kconfig
>>>>>> +++ b/lib/Kconfig
>>>>>> @@ -511,7 +511,8 @@ config CHECK_SIGNATURE
>>>>>>         bool
>>>>>>       config CPUMASK_OFFSTACK
>>>>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>>>>> +    bool "Force CPU masks off stack"
>>>>>> +    depends on DEBUG_PER_CPU_MAPS
>>>>> This forces every arch to enable DEBUG_PER_CPU_MAPS if they want to enable CPUMASK_OFFSTACK, it's even stronger than "if". My whole argument is CPUMASK_OFFSTACK should be enable/disabled independent of DEBUG_PER_CPU_MASK
>>>>>>         help
>>>>>>           Use dynamic allocation for cpumask_var_t, instead of putting
>>>>>>           them on the stack.  This is a bit more expensive, but avoids
>>>>>>
>>>>>>
>>>>>> As I said earlier, the "if" on the "bool" line just controls the prompt message.
>>>>>> This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.
>>>>>>
>>>>> Okay I understand now "if" on the "boot" is not a dependency and it only controls the prompt message, then the question is why we cannot enable CPUMASK_OFFSTACK without DEBUG_PER_CPU_MAPS if it only controls prompt message? Is it not the behavior we expect?
>>>> Yes, it is. I don't know that the problem is...
>>> Masahiro explained that CPUMASK_OFFSTACK can only be configured by
>>> options not users if DEBUG_PER_CPU_MASK is not enabled. This doesn't
>>> seem to be what we want.
>> I think the correct way to do it is to follow x86 and powerpc, and tying
>> CPUMASK_OFFSTACK to "large" values of CONFIG_NR_CPUS.

Sure. Just FTR, I was just trying to see if an arch (arm64) would build OK or not
when CPUMASK_OFFSTACK was enabled. and it does.
My patch wasn't meant to have a very long life.

>> For smaller values of NR_CPUS, the onstack masks are obviously
>> cheaper, we just need to decide what the cut-off point is.
> I agree. It appears enabling CPUMASK_OFFSTACK breaks kernel builds on some architectures such as parisc and nios2 as reported by kernel test robot. Maybe it makes sense to use DEBUG_PER_CPU_MAPS as some kind of guard on CPUMASK_OFFSTACK.
>> In x86, the onstack masks can be selected for normal SMP builds with
>> up to 512 CPUs, while CONFIG_MAXSMP=y raises the limit to 8192
>> CPUs while selecting CPUMASK_OFFSTACK.
>> PowerPC does it the other way round, selecting CPUMASK_OFFSTACK
>> implicitly whenever NR_CPUS is set to 8192 or more.
>>
>> I think we can easily do the same as powerpc on arm64. With the
> I am leaning more towards x86's way because even NR_CPUS=160 is too expensive for 4-core arm64 VMs according to apachebench. I highly doubt that there is a good cut-off point to make everybody happy (or not unhappy).
>> ApacheBench test you cite in the patch description, what is the
>> value of NR_CPUS at which you start seeing a noticeable
>> benefit for offstack masks? Can you do the same test for
>> NR_CPUS=1024 or 2048?
> As mentioned above, a good cut-off point moves depends on the actual number of CPUs. But yeah I can do the same test for 1024 or even smaller NR_CPUs values on the same 64-core arm64 VM setup.


--
~Randy

2022-04-14 15:40:50

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK



On 4/12/22 23:56, Libo Chen wrote:
> Hi Randy
>
> On 4/12/22 22:54, Randy Dunlap wrote:
>> Hi Libo,
>>
>> On 4/12/22 19:34, Libo Chen wrote:
>>>
>>> On 4/12/22 19:13, Randy Dunlap wrote:
>>>> Hi,
>>>>
>>>> On 4/12/22 18:35, Libo Chen wrote:
>>>>> Hi Randy,
>>>>>
>>>>> On 4/12/22 17:18, Randy Dunlap wrote:
>>>>>> Hi--
>>>>>>
>>>>>> On 4/12/22 16:15, Libo Chen wrote:
>>>>>>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
>>>>>>> make a lot of sense nowaday. Even the original patch dating back to 2008,
>>>>>>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
>>>>>>> rationale for such dependency.
>>>>>>>
>>>>>>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
>>>>>>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
>>>>>>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
>>>>>>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
>>>>>>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
>>>>>>> There is no reason other architectures cannot given the fact that they
>>>>>>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
>>>>>>> x86.
>>>>>>>
>>>>>>> Signed-off-by: Libo Chen <[email protected]>
>>>>>>> ---
>>>>>>>     lib/Kconfig | 2 +-
>>>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/lib/Kconfig b/lib/Kconfig
>>>>>>> index 087e06b4cdfd..7209039dfb59 100644
>>>>>>> --- a/lib/Kconfig
>>>>>>> +++ b/lib/Kconfig
>>>>>>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>>>>>>>         bool
>>>>>>>       config CPUMASK_OFFSTACK
>>>>>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>>>>> This "if" dependency only controls whether the Kconfig symbol's prompt is
>>>>>> displayed (presented) in kconfig tools. Removing it makes the prompt always
>>>>>> be displayed.
>>>>>>
>>>>>> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
>>>>>> of DEBUG_PER_CPU_MAPS.
>>>>> Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under some config xxx? That will work but it requires code changes for each architecture.
>>>>> But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it doesn't work.
>>>> I'm just talking about the Kconfig change below.  Not talking about whatever else
>>>> it might require per architecture.
>>>>
>>>> But you say you have tried that and it doesn't work. What part of it doesn't work?
>>>> The Kconfig part or some code execution?
>>> oh the Kconfig part. For example, make olddefconfig on a config file with CPUMASK_OFFSTACK=y only turns off CPUMASK_OFFSTACK unless I explicitly set DEBUG_PER_CPU_MAPS=y
>> I can enable CPUMASK_OFFSTACK for arm64 without having DEBUG_PER_CPU_MAPS enabled.
>> (with a patch, of course.)
>> It builds OK. I don't know if it will run OK.
>
> I am a little confused, did you succeed with your patch (replacing "if" with "depends on") or my patch (removing "if")? Because I definitely cannot enable CPUMASK_OFFSTACK for arm64 without DEBUG_PER_CPUMAPS enabled using your change.

This patch builds cleanly for me:

---
arch/arm64/Kconfig | 1 +
lib/Kconfig | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)

--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -511,7 +511,7 @@ config CHECK_SIGNATURE
bool

config CPUMASK_OFFSTACK
- bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
+ bool "Force CPU masks off stack"
help
Use dynamic allocation for cpumask_var_t, instead of putting
them on the stack. This is a bit more expensive, but avoids
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -316,6 +316,7 @@ config ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE

config SMP
def_bool y
+ select CPUMASK_OFFSTACK

config KERNEL_MODE_NEON
def_bool y

along with:
# CONFIG_DEBUG_PER_CPU_MAPS is not set


>> I think that you are arguing for a patch like this:
>
> I am actually arguing for the opposite, I don't think CPUMASK_OFFSTACK should require DEBUG_PER_CPU_MAPS. They should be separate and independent to each other. So removing "if ..." should be enough in my opinion.

I agree.

>> --- a/lib/Kconfig
>> +++ b/lib/Kconfig
>> @@ -511,7 +511,8 @@ config CHECK_SIGNATURE
>>       bool
>>     config CPUMASK_OFFSTACK
>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>> +    bool "Force CPU masks off stack"
>> +    depends on DEBUG_PER_CPU_MAPS
>
> This forces every arch to enable DEBUG_PER_CPU_MAPS if they want to enable CPUMASK_OFFSTACK, it's even stronger than "if". My whole argument is CPUMASK_OFFSTACK should be enable/disabled independent of DEBUG_PER_CPU_MASK
>>       help
>>         Use dynamic allocation for cpumask_var_t, instead of putting
>>         them on the stack.  This is a bit more expensive, but avoids
>>
>>
>> As I said earlier, the "if" on the "bool" line just controls the prompt message.
>> This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.
>>
>
> Okay I understand now "if" on the "boot" is not a dependency and it only controls the prompt message, then the question is why we cannot enable CPUMASK_OFFSTACK without DEBUG_PER_CPU_MAPS if it only controls prompt message? Is it not the behavior we expect?

Yes, it is. I don't know that the problem is...

--
~Randy

2022-04-15 05:08:49

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

Hi Libo,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.18-rc2 next-20220413]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/intel-lab-lkp/linux/commits/Libo-Chen/lib-Kconfig-remove-DEBUG_PER_CPU_MAPS-dependency-for-CPUMASK_OFFSTACK/20220413-073657
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ce522ba9ef7e2d9fb22a39eb3371c0c64e2a433e
config: parisc-randconfig-r014-20220413 (https://download.01.org/0day-ci/archive/20220413/[email protected]/config)
compiler: hppa-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/6636f7cf28d2a79cde937c0f212e8a87080da06d
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Libo-Chen/lib-Kconfig-remove-DEBUG_PER_CPU_MAPS-dependency-for-CPUMASK_OFFSTACK/20220413-073657
git checkout 6636f7cf28d2a79cde937c0f212e8a87080da06d
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=parisc SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

hppa-linux-ld: kernel/workqueue.o: in function `free_workqueue_attrs':
>> kernel/workqueue.c:3370: undefined reference to `free_cpumask_var'
hppa-linux-ld: kernel/workqueue.o: in function `alloc_workqueue_attrs':
>> kernel/workqueue.c:3390: undefined reference to `alloc_cpumask_var'
hppa-linux-ld: kernel/workqueue.o: in function `workqueue_set_unbound_cpumask':
>> kernel/workqueue.c:5390: undefined reference to `zalloc_cpumask_var'
>> hppa-linux-ld: kernel/workqueue.c:5406: undefined reference to `free_cpumask_var'
hppa-linux-ld: kernel/workqueue.o: in function `wq_unbound_cpumask_store':
kernel/workqueue.c:5664: undefined reference to `zalloc_cpumask_var'
hppa-linux-ld: kernel/workqueue.c:5671: undefined reference to `free_cpumask_var'
hppa-linux-ld: kernel/workqueue.o: in function `workqueue_init_early':
kernel/workqueue.c:5995: undefined reference to `alloc_cpumask_var'
hppa-linux-ld: kernel/sched/core.o: in function `sched_setaffinity':
>> kernel/sched/core.c:7948: undefined reference to `alloc_cpumask_var'
>> hppa-linux-ld: kernel/sched/core.c:7951: undefined reference to `alloc_cpumask_var'
>> hppa-linux-ld: kernel/sched/core.c:7978: undefined reference to `free_cpumask_var'
hppa-linux-ld: kernel/sched/core.c:7980: undefined reference to `free_cpumask_var'
hppa-linux-ld: kernel/sched/core.o: in function `__se_sys_sched_setaffinity':
kernel/sched/core.c:8051: undefined reference to `alloc_cpumask_var'
hppa-linux-ld: kernel/sched/core.c:8057: undefined reference to `free_cpumask_var'
hppa-linux-ld: kernel/sched/core.o: in function `__se_sys_sched_getaffinity':
kernel/sched/core.c:8108: undefined reference to `alloc_cpumask_var'
hppa-linux-ld: kernel/sched/core.c:8120: undefined reference to `free_cpumask_var'
hppa-linux-ld: kernel/sched/core.o: in function `sched_init':
>> kernel/sched/core.c:9499: undefined reference to `load_balance_mask'
>> hppa-linux-ld: kernel/sched/core.c:9499: undefined reference to `load_balance_mask'
>> hppa-linux-ld: kernel/sched/core.c:9501: undefined reference to `select_idle_mask'
>> hppa-linux-ld: kernel/sched/core.c:9501: undefined reference to `select_idle_mask'
hppa-linux-ld: kernel/sched/build_utility.o: in function `housekeeping_setup_type':
>> kernel/sched/isolation.c:104: undefined reference to `alloc_bootmem_cpumask_var'
hppa-linux-ld: kernel/sched/build_utility.o: in function `housekeeping_setup':
kernel/sched/isolation.c:122: undefined reference to `alloc_bootmem_cpumask_var'
>> hppa-linux-ld: kernel/sched/isolation.c:128: undefined reference to `alloc_bootmem_cpumask_var'
>> hppa-linux-ld: kernel/sched/isolation.c:173: undefined reference to `free_bootmem_cpumask_var'
hppa-linux-ld: kernel/sched/isolation.c:175: undefined reference to `free_bootmem_cpumask_var'
hppa-linux-ld: kernel/taskstats.o: in function `taskstats_user_cmd':
>> kernel/taskstats.c:441: undefined reference to `alloc_cpumask_var'
>> hppa-linux-ld: kernel/taskstats.c:457: undefined reference to `alloc_cpumask_var'
>> hppa-linux-ld: kernel/taskstats.c:464: undefined reference to `free_cpumask_var'
hppa-linux-ld: kernel/events/core.o: in function `perf_event_init':
>> kernel/events/core.c:13237: undefined reference to `zalloc_cpumask_var'
hppa-linux-ld: fs/io_uring.o: in function `__io_uring_register':
>> fs/io_uring.c:11472: undefined reference to `alloc_cpumask_var'
>> hppa-linux-ld: fs/io_uring.c:11488: undefined reference to `free_cpumask_var'
hppa-linux-ld: fs/io_uring.c:11493: undefined reference to `free_cpumask_var'
hppa-linux-ld: fs/io-wq.o: in function `io_wq_create':
fs/io-wq.c:1180: undefined reference to `alloc_cpumask_var'
hppa-linux-ld: fs/io-wq.c:1214: undefined reference to `free_cpumask_var'
hppa-linux-ld: fs/io-wq.o: in function `io_wq_put_and_exit':
fs/io-wq.c:1290: undefined reference to `free_cpumask_var'
hppa-linux-ld: block/blk-mq.o: in function `blk_mq_alloc_hctx':
block/blk-mq.c:3528: undefined reference to `zalloc_cpumask_var_node'
hppa-linux-ld: block/blk-mq.c:3575: undefined reference to `free_cpumask_var'
hppa-linux-ld: drivers/base/cpu.o: in function `print_cpus_offline':
drivers/base/cpu.c:245: undefined reference to `alloc_cpumask_var'
hppa-linux-ld: drivers/base/cpu.c:249: undefined reference to `free_cpumask_var'
hppa-linux-ld: drivers/base/cpu.o: in function `print_cpus_isolated':
drivers/base/cpu.c:274: undefined reference to `alloc_cpumask_var'
hppa-linux-ld: drivers/base/cpu.c:281: undefined reference to `free_cpumask_var'
hppa-linux-ld: drivers/net/ethernet/emulex/benet/be_main.o: in function `be_clear_queues':
drivers/net/ethernet/emulex/benet/be_main.c:2943: undefined reference to `free_cpumask_var'
hppa-linux-ld: drivers/net/ethernet/emulex/benet/be_main.o: in function `be_setup_queues':
drivers/net/ethernet/emulex/benet/be_main.c:2981: undefined reference to `zalloc_cpumask_var'
hppa-linux-ld: drivers/net/ethernet/sfc/falcon/efx.o: in function `ef4_probe_nic':
drivers/net/ethernet/sfc/falcon/efx.c:1329: undefined reference to `zalloc_cpumask_var'
hppa-linux-ld: drivers/net/ethernet/sfc/falcon/efx.c:1344: undefined reference to `free_cpumask_var'
hppa-linux-ld: net/core/dev.o: in function `netif_get_num_default_rss_queues':
net/core/dev.c:3001: undefined reference to `zalloc_cpumask_var'
hppa-linux-ld: net/core/dev.c:3009: undefined reference to `free_cpumask_var'
hppa-linux-ld: kernel/profile.o: in function `prof_cpu_mask_proc_write':
kernel/profile.c:361: undefined reference to `zalloc_cpumask_var'
hppa-linux-ld: kernel/profile.c:369: undefined reference to `free_cpumask_var'
hppa-linux-ld: kernel/profile.o: in function `profile_init':
kernel/profile.c:114: undefined reference to `alloc_cpumask_var'
hppa-linux-ld: kernel/profile.c:132: undefined reference to `free_cpumask_var'
hppa-linux-ld: kernel/torture.o: in function `torture_cleanup_begin':
kernel/torture.c:591: undefined reference to `free_cpumask_var'
hppa-linux-ld: kernel/torture.o: in function `torture_shuffle_init':
kernel/torture.c:572: undefined reference to `alloc_cpumask_var'
hppa-linux-ld: block/blk-mq-sysfs.o: in function `blk_mq_hw_sysfs_release':
block/blk-mq-sysfs.c:41: undefined reference to `free_cpumask_var'


vim +3370 kernel/workqueue.c

1fa44ecad2b864 James Bottomley 2006-02-23 3360
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3361 /**
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3362 * free_workqueue_attrs - free a workqueue_attrs
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3363 * @attrs: workqueue_attrs to free
226223ab3c4118 Tejun Heo 2013-03-12 3364 *
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3365 * Undo alloc_workqueue_attrs().
226223ab3c4118 Tejun Heo 2013-03-12 3366 */
513c98d0868295 Daniel Jordan 2019-09-05 3367 void free_workqueue_attrs(struct workqueue_attrs *attrs)
226223ab3c4118 Tejun Heo 2013-03-12 3368 {
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3369 if (attrs) {
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 @3370 free_cpumask_var(attrs->cpumask);
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3371 kfree(attrs);
226223ab3c4118 Tejun Heo 2013-03-12 3372 }
226223ab3c4118 Tejun Heo 2013-03-12 3373 }
226223ab3c4118 Tejun Heo 2013-03-12 3374
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3375 /**
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3376 * alloc_workqueue_attrs - allocate a workqueue_attrs
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3377 *
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3378 * Allocate a new workqueue_attrs, initialize with default settings and
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3379 * return it.
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3380 *
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3381 * Return: The allocated new workqueue_attr on success. %NULL on failure.
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3382 */
513c98d0868295 Daniel Jordan 2019-09-05 3383 struct workqueue_attrs *alloc_workqueue_attrs(void)
226223ab3c4118 Tejun Heo 2013-03-12 3384 {
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3385 struct workqueue_attrs *attrs;
226223ab3c4118 Tejun Heo 2013-03-12 3386
be69d00d976957 Thomas Gleixner 2019-06-26 3387 attrs = kzalloc(sizeof(*attrs), GFP_KERNEL);
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3388 if (!attrs)
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3389 goto fail;
be69d00d976957 Thomas Gleixner 2019-06-26 @3390 if (!alloc_cpumask_var(&attrs->cpumask, GFP_KERNEL))
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3391 goto fail;
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3392
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3393 cpumask_copy(attrs->cpumask, cpu_possible_mask);
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3394 return attrs;
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3395 fail:
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3396 free_workqueue_attrs(attrs);
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 3397 return NULL;
226223ab3c4118 Tejun Heo 2013-03-12 3398 }
226223ab3c4118 Tejun Heo 2013-03-12 3399

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-04-15 12:55:07

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

On Wed, Apr 13, 2022 at 11:50 PM Libo Chen <[email protected]> wrote:
> On 4/13/22 13:52, Arnd Bergmann wrote:
> >>> Yes, it is. I don't know that the problem is...
> >> Masahiro explained that CPUMASK_OFFSTACK can only be configured by
> >> options not users if DEBUG_PER_CPU_MASK is not enabled. This doesn't
> >> seem to be what we want.
> > I think the correct way to do it is to follow x86 and powerpc, and tying
> > CPUMASK_OFFSTACK to "large" values of CONFIG_NR_CPUS.
> > For smaller values of NR_CPUS, the onstack masks are obviously
> > cheaper, we just need to decide what the cut-off point is.
>
> I agree. It appears enabling CPUMASK_OFFSTACK breaks kernel builds on
> some architectures such as parisc and nios2 as reported by kernel test
> robot. Maybe it makes sense to use DEBUG_PER_CPU_MAPS as some kind of
> guard on CPUMASK_OFFSTACK.

NIOS2 does not support SMP builds at all, so it should never be possible to
select CPUMASK_OFFSTACK there. We may want to guard
DEBUG_PER_CPU_MAPS by adding a 'depends on SMP' in order to
prevent it from getting selected.

For PARISC, the largest configuration is 32-way SMP, so CPUMASK_OFFSTACK
is clearly pointless there as well, even though it should technically
be possible
to support. What is the build error on parisc?

> > In x86, the onstack masks can be selected for normal SMP builds with
> > up to 512 CPUs, while CONFIG_MAXSMP=y raises the limit to 8192
> > CPUs while selecting CPUMASK_OFFSTACK.
> > PowerPC does it the other way round, selecting CPUMASK_OFFSTACK
> > implicitly whenever NR_CPUS is set to 8192 or more.
> >
> > I think we can easily do the same as powerpc on arm64. With the
> I am leaning more towards x86's way because even NR_CPUS=160 is too
> expensive for 4-core arm64 VMs according to apachebench. I highly doubt
> that there is a good cut-off point to make everybody happy (or not unhappy).

It seems surprising that you would see any improvement for offstack masks
when using NR_CPUS=160, that is just three 64-bit words worth of data, but
it requires allocating the mask dynamically, which takes way more memory
to initialize.

> > ApacheBench test you cite in the patch description, what is the
> > value of NR_CPUS at which you start seeing a noticeable
> > benefit for offstack masks? Can you do the same test for
> > NR_CPUS=1024 or 2048?
>
> As mentioned above, a good cut-off point moves depends on the actual
> number of CPUs. But yeah I can do the same test for 1024 or even smaller
> NR_CPUs values on the same 64-core arm64 VM setup.

If you see an improvement for small NR_CPUS values using offstack masks,
it's possible that the actual difference is something completely
different and we
can just make the on-stack case faster, possibly the cause is something about
cacheline alignment or inlining decisions using your specific kernel config.

Are you able to compare the 'perf report' output between runs with either
size to see where the extra time gets spent?

Arnd

2022-04-16 00:25:09

by Libo Chen

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK



On 4/13/22 13:52, Arnd Bergmann wrote:
> On Wed, Apr 13, 2022 at 9:28 PM Libo Chen <[email protected]> wrote:
>> On 4/13/22 08:41, Randy Dunlap wrote:
>>> On 4/12/22 23:56, Libo Chen wrote:
>>>>> --- a/lib/Kconfig
>>>>> +++ b/lib/Kconfig
>>>>> @@ -511,7 +511,8 @@ config CHECK_SIGNATURE
>>>>> bool
>>>>> config CPUMASK_OFFSTACK
>>>>> - bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>>>> + bool "Force CPU masks off stack"
>>>>> + depends on DEBUG_PER_CPU_MAPS
>>>> This forces every arch to enable DEBUG_PER_CPU_MAPS if they want to enable CPUMASK_OFFSTACK, it's even stronger than "if". My whole argument is CPUMASK_OFFSTACK should be enable/disabled independent of DEBUG_PER_CPU_MASK
>>>>> help
>>>>> Use dynamic allocation for cpumask_var_t, instead of putting
>>>>> them on the stack. This is a bit more expensive, but avoids
>>>>>
>>>>>
>>>>> As I said earlier, the "if" on the "bool" line just controls the prompt message.
>>>>> This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.
>>>>>
>>>> Okay I understand now "if" on the "boot" is not a dependency and it only controls the prompt message, then the question is why we cannot enable CPUMASK_OFFSTACK without DEBUG_PER_CPU_MAPS if it only controls prompt message? Is it not the behavior we expect?
>>> Yes, it is. I don't know that the problem is...
>> Masahiro explained that CPUMASK_OFFSTACK can only be configured by
>> options not users if DEBUG_PER_CPU_MASK is not enabled. This doesn't
>> seem to be what we want.
> I think the correct way to do it is to follow x86 and powerpc, and tying
> CPUMASK_OFFSTACK to "large" values of CONFIG_NR_CPUS.
> For smaller values of NR_CPUS, the onstack masks are obviously
> cheaper, we just need to decide what the cut-off point is.
I agree. It appears enabling CPUMASK_OFFSTACK breaks kernel builds on
some architectures such as parisc and nios2 as reported by kernel test
robot. Maybe it makes sense to use DEBUG_PER_CPU_MAPS as some kind of
guard on CPUMASK_OFFSTACK.
> In x86, the onstack masks can be selected for normal SMP builds with
> up to 512 CPUs, while CONFIG_MAXSMP=y raises the limit to 8192
> CPUs while selecting CPUMASK_OFFSTACK.
> PowerPC does it the other way round, selecting CPUMASK_OFFSTACK
> implicitly whenever NR_CPUS is set to 8192 or more.
>
> I think we can easily do the same as powerpc on arm64. With the
I am leaning more towards x86's way because even NR_CPUS=160 is too
expensive for 4-core arm64 VMs according to apachebench. I highly doubt
that there is a good cut-off point to make everybody happy (or not unhappy).
> ApacheBench test you cite in the patch description, what is the
> value of NR_CPUS at which you start seeing a noticeable
> benefit for offstack masks? Can you do the same test for
> NR_CPUS=1024 or 2048?
As mentioned above, a good cut-off point moves depends on the actual
number of CPUs. But yeah I can do the same test for 1024 or even smaller
NR_CPUs values on the same 64-core arm64 VM setup.

Libo

>
> Arnd

2022-04-16 02:11:14

by Libo Chen

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK



On 4/14/22 04:41, Arnd Bergmann wrote:
> On Wed, Apr 13, 2022 at 11:50 PM Libo Chen <[email protected]> wrote:
>> On 4/13/22 13:52, Arnd Bergmann wrote:
>>>>> Yes, it is. I don't know that the problem is...
>>>> Masahiro explained that CPUMASK_OFFSTACK can only be configured by
>>>> options not users if DEBUG_PER_CPU_MASK is not enabled. This doesn't
>>>> seem to be what we want.
>>> I think the correct way to do it is to follow x86 and powerpc, and tying
>>> CPUMASK_OFFSTACK to "large" values of CONFIG_NR_CPUS.
>>> For smaller values of NR_CPUS, the onstack masks are obviously
>>> cheaper, we just need to decide what the cut-off point is.
>> I agree. It appears enabling CPUMASK_OFFSTACK breaks kernel builds on
>> some architectures such as parisc and nios2 as reported by kernel test
>> robot. Maybe it makes sense to use DEBUG_PER_CPU_MAPS as some kind of
>> guard on CPUMASK_OFFSTACK.
> NIOS2 does not support SMP builds at all, so it should never be possible to
> select CPUMASK_OFFSTACK there. We may want to guard
> DEBUG_PER_CPU_MAPS by adding a 'depends on SMP' in order to
> prevent it from getting selected.
>
> For PARISC, the largest configuration is 32-way SMP, so CPUMASK_OFFSTACK
> is clearly pointless there as well, even though it should technically
> be possible
> to support. What is the build error on parisc?
Similar to NIOS2, A bunch of undefined references to *_cpumask_var()
calls.  It seems that PARISC doesn't support the cpumask offstack API at all

>>> In x86, the onstack masks can be selected for normal SMP builds with
>>> up to 512 CPUs, while CONFIG_MAXSMP=y raises the limit to 8192
>>> CPUs while selecting CPUMASK_OFFSTACK.
>>> PowerPC does it the other way round, selecting CPUMASK_OFFSTACK
>>> implicitly whenever NR_CPUS is set to 8192 or more.
>>>
>>> I think we can easily do the same as powerpc on arm64. With the
>> I am leaning more towards x86's way because even NR_CPUS=160 is too
>> expensive for 4-core arm64 VMs according to apachebench. I highly doubt
>> that there is a good cut-off point to make everybody happy (or not unhappy).
> It seems surprising that you would see any improvement for offstack masks
> when using NR_CPUS=160, that is just three 64-bit words worth of data, but
> it requires allocating the mask dynamically, which takes way more memory
> to initialize.
>
>>> ApacheBench test you cite in the patch description, what is the
>>> value of NR_CPUS at which you start seeing a noticeable
>>> benefit for offstack masks? Can you do the same test for
>>> NR_CPUS=1024 or 2048?
>> As mentioned above, a good cut-off point moves depends on the actual
>> number of CPUs. But yeah I can do the same test for 1024 or even smaller
>> NR_CPUs values on the same 64-core arm64 VM setup.
> If you see an improvement for small NR_CPUS values using offstack masks,
> it's possible that the actual difference is something completely
> different and we
> can just make the on-stack case faster, possibly the cause is something about
> cacheline alignment or inlining decisions using your specific kernel config.
>
> Are you able to compare the 'perf report' output between runs with either
> size to see where the extra time gets spent?
Okay yeah I will take some time to gather more data. It does appear that
something else may also contribute to the performance difference.

Thanks
Libo
> Arnd