Good afternoon maintainers and subscribers to the lkml,
I'm a bit new to kernel development but I had a question with respect to the kernel parameters: isolcpus, nohz_full, and rcu_nocbs.
Basically the question is this, am I able to modify the three parameters I mentioned above at runtime after the kernel has already started/booted? Doing some reading online it seems that it’s not possible but I wanted to double check with the maintainers if there wasn’t some sort of change in the works that might make it possible. If not, what would be required to make the change after boot-time through some kind of patch or something like that? Would that be something that might be valuable upstream?
At the moment we are running an application that might see some benefit from being able to isolate cpus on the fly without having to reboot everything every time we want to modify the parameters mentioned above.
Thanks,
Gianfranco
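For anyone wanting to see what a running system was actually booted with, here is a minimal Python sketch that pulls the three parameters out of a kernel command line. The helper names parse_cpulist and boot_isolation_params are made up for illustration, and parse_cpulist only understands the plain "a-b,c" form, not isolcpus flags such as "domain" or stride syntax.

```python
def parse_cpulist(text):
    """Expand a plain CPU list such as "2-5,8" into a sorted list of ints.

    Only the simple "a-b,c" form is handled; flags and stride syntax are not.
    """
    cpus = set()
    for chunk in text.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        if "-" in chunk:
            first, last = chunk.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(chunk))
    return sorted(cpus)


def boot_isolation_params(cmdline):
    """Return the raw values of isolcpus, nohz_full and rcu_nocbs, if present."""
    wanted = ("isolcpus", "nohz_full", "rcu_nocbs")
    found = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in wanted:
            found[key] = value
    return found
```

On a live system the input would typically be the contents of /proc/cmdline. Since these are boot-time parameters, the result only changes across a reboot.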
Hi Gianfranco,
+ Adding all scheduler maintainers and a few other people who are
working on similar things
On Thu, 7 Dec 2023 at 16:07, Gianfranco Dutka
<[email protected]> wrote:
>
> Good afternoon maintainers and subscribers to the lkml,
>
> I'm a bit new to kernel development but I had a question with respect to the kernel parameters: isolcpus, nohz_full, and rcu_nocbs.
>
> Basically the question is this, am I able to modify the three parameters I mentioned above at runtime after the kernel has already started/booted? Doing some reading online it seems that it’s not possible but I wanted to double check with the maintainers if there wasn’t some sort of change in the works that might make it possible. If not, what would be required to make the change after boot-time through some kind of patch or something like that? Would that be something that might be valuable upstream?
It's not possible, but you can achieve something close with cgroups,
although you will still have some housekeeping activities happening in
your partition.
This thread tries to do something similar:
https://lore.kernel.org/lkml/[email protected]/
>
> At the moment we are running an application that might see some benefit from being able to isolate cpus on the fly without having to reboot everything every time we want to modify the parameters mentioned above.
>
> Thanks,
> Gianfranco
(cc'ing Waiman)
On Thu, Dec 07, 2023 at 05:32:15PM +0100, Vincent Guittot wrote:
> Hi Gianfranco,
>
> + Adding all scheduler maintainers and few other people that are
> working on similar things
>
> On Thu, 7 Dec 2023 at 16:07, Gianfranco Dutka
> <[email protected]> wrote:
...
> > I'm a bit new to kernel development but I had a question with respect to the kernel parameters: isolcpus, nohz_full, and rcu_nocbs.
> >
> > Basically the question is this, am I able to modify the three parameters I mentioned above at runtime after the kernel has already started/booted? Doing some reading online it seems that it’s not possible but I wanted to double check with the maintainers if there wasn’t some sort of change in the works that might make it possible. If not, what would be required to make the change after boot-time through some kind of patch or something like that? Would that be something that might be valuable upstream?
>
> It's not possible but you can achieve something close with cgroup
> although you will still have some housekeeping activities happening in
> your partition.
FWIW, Waiman has been improving both the usability and level of isolation
with cpuset, so it should be better now.
Thanks.
--
tejun
> The isolcpus, nohz_full and rcu_nocbs are boot-time kernel parameters. I am in the process of improving dynamic CPU isolation at runtime. Right now, we are able to do isolcpus=domain with the isolated cpuset partition functionality. Other aspects of CPU isolation are being looked at with the goal of reducing the gap of what one can do at boot time versus what can be done at run time. It will certainly take time to reach that goal.
>
> Cheers,
> Longman
Thank you Waiman for the response. It would seem that getting similar functionality through cgroups/cpusets is the only option at the moment. Is it completely out of the question to possibly patch the kernel to modify these parameters at runtime? Or would that entail a significant change that might not be so trivial to accomplish? For instance, the solution wouldn’t be as simple as patching the kernel to make these writeable and then calling the same functions which run at boot-time when these parameters are originally written?
Thanks,
Gianfranco
On 12/8/23 09:18, Gianfranco Dutka wrote:
>
>> The isolcpus, nohz_full and rcu_nocbs are boot-time kernel
>> parameters. I am in the process of improving dynamic CPU isolation at
>> runtime. Right now, we are able to do isolcpus=domain with the
>> isolated cpuset partition functionality. Other aspects of CPU
>> isolation are being looked at with the goal of reducing the gap of
>> what one can do at boot time versus what can be done at run time. It
>> will certain take time to reach that goal.
>>
>> Cheers, Longman
>>
>
> Thank you Waiman for the response. It would seem that getting similar
> functionality through cgroups/cpusets is the only option at the
> moment. Is it completely out of the question to possibly patch the
> kernel to modify these parameters at runtime? Or would that entail a
> significant change that might not be so trivial to accomplish? For
> instance, the solution wouldn’t be as simple as patching the kernel to
> make these writeable and then calling the same functions which run at
> boot-time when these parameters are originally written?
I would say that using cgroup/cpusets is probably the most you can do
with dynamic CPU isolation at the moment. OTOH, it may not be a good
idea to have more than one way of doing the same thing as it will lead
to code duplication and inconsistency. I don't think it is that easy to
make CPU isolation fully dynamic and we must be careful not to introduce
any regression.
Cheers,
Longman
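As a rough illustration of the isolated cpuset partition approach discussed above, the following Python sketch writes the cgroup v2 control files involved. It assumes a cgroup2 hierarchy whose root is passed in as cgroup_root (commonly /sys/fs/cgroup, which needs root privileges); the function name and the "rt" child name are hypothetical, and the exact constraints on which CPUs may be isolated depend on the kernel version.

```python
import os


def create_isolated_partition(cgroup_root, name, cpus):
    """Create child cgroup `name` and request an isolated partition for `cpus`."""
    # The cpuset controller must be enabled for children of the parent cgroup.
    with open(os.path.join(cgroup_root, "cgroup.subtree_control"), "w") as f:
        f.write("+cpuset")

    child = os.path.join(cgroup_root, name)
    os.makedirs(child, exist_ok=True)

    # Assign the CPUs first, then ask for the isolated partition type.
    with open(os.path.join(child, "cpuset.cpus"), "w") as f:
        f.write(cpus)
    with open(os.path.join(child, "cpuset.cpus.partition"), "w") as f:
        f.write("isolated")

    # On a real cgroup2 mount, reading the file back shows whether the
    # request took effect; a rejected request is reported as invalid.
    with open(os.path.join(child, "cpuset.cpus.partition")) as f:
        return f.read().strip()
```

Calling create_isolated_partition("/sys/fs/cgroup", "rt", "2-5") and then moving the workload into the new cgroup gives roughly the isolcpus=domain behaviour, without the nohz_full or rcu_nocbs parts.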
On Fri, Dec 08, 2023 at 09:18:53AM -0500, Gianfranco Dutka wrote:
>
> > The isolcpus, nohz_full and rcu_nocbs are boot-time kernel parameters. I am in the process of improving dynamic CPU isolation at runtime. Right now, we are able to do isolcpus=domain with the isolated cpuset partition functionality. Other aspects of CPU isolation are being looked at with the goal of reducing the gap of what one can do at boot time versus what can be done at run time. It will certain take time to reach that goal.
> >
> > Cheers,
> > Longman
> >
>
> Thank you Waiman for the response. It would seem that getting similar
> functionality through cgroups/cpusets is the only option at the moment. Is it
> completely out of the question to possibly patch the kernel to modify these
> parameters at runtime? Or would that entail a significant change that might
> not be so trivial to accomplish? For instance, the solution wouldn’t be as
> simple as patching the kernel to make these writeable and then calling the
> same functions which run at boot-time when these parameters are originally
> written?
As for nohz_full (which implies rcu_nocb), it's certainly possible to make it
tunable at runtime via cpusets. If people really want it, I'm willing to help.
Thanks.
Hi Frederic,
On Tue, 2023-12-12 at 14:27 +0100, Frederic Weisbecker wrote:
> On Fri, Dec 08, 2023 at 09:18:53AM -0500, Gianfranco Dutka wrote:
> >
> > > The isolcpus, nohz_full and rcu_nocbs are boot-time kernel
> > > parameters. I am in the process of improving dynamic CPU
> > > isolation at runtime. Right now, we are able to do
> > > isolcpus=domain with the isolated cpuset partition functionality.
> > > Other aspects of CPU isolation are being looked at with the goal
> > > of reducing the gap of what one can do at boot time versus what
> > > can be done at run time. It will certain take time to reach that
> > > goal.
> > >
> > > Cheers,
> > > Longman
> > >
> >
> > Thank you Waiman for the response. It would seem that getting
> > similar
> > functionality through cgroups/cpusets is the only option at the
> > moment. Is it
> > completely out of the question to possibly patch the kernel to
> > modify these
> > parameters at runtime? Or would that entail a significant change
> > that might
> > not be so trivial to accomplish? For instance, the solution
> > wouldn’t be as
> > simple as patching the kernel to make these writeable and then
> > calling the
> > same functions which run at boot-time when these parameters are
> > originally
> > written?
>
> As for nohz_full (which implies rcu_nocb), it's certainly possible to
> make it
> tunable at runtime via cpusets. If people really want it, I'm willing
> to help.
>
We have a case for dynamically isolating CPUs at run time.
https://lore.kernel.org/lkml/ZNM5qoUSCdBwNTuH@chenyu5-mobl2/
Vincent suggested using the housekeeping cpumask to avoid unnecessary
wakeups on isolated CPUs. Can this housekeeping cpumask and type be
updated at runtime?
Thanks,
Srinivas
> Thanks.
On Tue, Dec 12, 2023 at 02:27:56PM +0100 Frederic Weisbecker wrote:
> On Fri, Dec 08, 2023 at 09:18:53AM -0500, Gianfranco Dutka wrote:
> >
> > > The isolcpus, nohz_full and rcu_nocbs are boot-time kernel parameters. I am in the process of improving dynamic CPU isolation at runtime. Right now, we are able to do isolcpus=domain with the isolated cpuset partition functionality. Other aspects of CPU isolation are being looked at with the goal of reducing the gap of what one can do at boot time versus what can be done at run time. It will certain take time to reach that goal.
> > >
> > > Cheers,
> > > Longman
> > >
> >
> > Thank you Waiman for the response. It would seem that getting similar
> > functionality through cgroups/cpusets is the only option at the moment. Is it
> > completely out of the question to possibly patch the kernel to modify these
> > parameters at runtime? Or would that entail a significant change that might
> > not be so trivial to accomplish? For instance, the solution wouldn’t be as
> > simple as patching the kernel to make these writeable and then calling the
> > same functions which run at boot-time when these parameters are originally
> > written?
>
> As for nohz_full (which implies rcu_nocb), it's certainly possible to make it
> tunable at runtime via cpusets. If people really want it, I'm willing to help.
That is certainly where we'd like to end up. The ask for this is coming from
our container management side so cpusets would fit that well (along with the
isolation stuff Waiman has been working on).
Thanks,
Phil
>
> Thanks.
>
--
On 12/12/23 09:04, Pandruvada, Srinivas wrote:
> Hi Fredric,
> On Tue, 2023-12-12 at 14:27 +0100, Frederic Weisbecker wrote:
>> On Fri, Dec 08, 2023 at 09:18:53AM -0500, Gianfranco Dutka wrote:
>>>> The isolcpus, nohz_full and rcu_nocbs are boot-time kernel
>>>> parameters. I am in the process of improving dynamic CPU
>>>> isolation at runtime. Right now, we are able to do
>>>> isolcpus=domain with the isolated cpuset partition functionality.
>>>> Other aspects of CPU isolation are being looked at with the goal
>>>> of reducing the gap of what one can do at boot time versus what
>>>> can be done at run time. It will certain take time to reach that
>>>> goal.
>>>>
>>>> Cheers,
>>>> Longman
>>>>
>>> Thank you Waiman for the response. It would seem that getting
>>> similar
>>> functionality through cgroups/cpusets is the only option at the
>>> moment. Is it
>>> completely out of the question to possibly patch the kernel to
>>> modify these
>>> parameters at runtime? Or would that entail a significant change
>>> that might
>>> not be so trivial to accomplish? For instance, the solution
>>> wouldn’t be as
>>> simple as patching the kernel to make these writeable and then
>>> calling the
>>> same functions which run at boot-time when these parameters are
>>> originally
>>> written?
>> As for nohz_full (which implies rcu_nocb), it's certainly possible to
>> make it
>> tunable at runtime via cpusets. If people really want it, I'm willing
>> to help.
>>
> We have a case for dynamically isolating CPUs at run time.
>
> https://lore.kernel.org/lkml/ZNM5qoUSCdBwNTuH@chenyu5-mobl2/
>
> It was suggested by Vincent to use house keeping cpumask for solving
> unnecessary wake ups on isolated CPUs. Can this house keeping cpu mask
> and type be updated at runtime?
The housekeeping cpumasks are statically set up at boot time. The
cpuset code will have an internal isolated_cpus cpumask that lists all
the CPUs that are in isolated partitions. It will also expose it through
a new cpuset_cpu_is_isolated() function. The code is currently in the
cgroup tree and will be merged into the upcoming v6.8 kernel. As the new
isolated_cpus mask can be changed at run time, some synchronization may
be needed depending on how it is being used. For that reason the cpumask
itself will not be exported, but a new API can be provided to copy out
the cpumask if needed.
Cheers,
Longman
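To make the synchronization point concrete, here is a small user-space analogy in Python. It is not kernel code: the class and method names are hypothetical, and a threading.Lock merely stands in for whatever locking the cpuset code uses. It only shows why handing out a copy is safer than exporting a mask that can change underneath the caller.

```python
import threading


class IsolatedCpus:
    """User-space stand-in for a CPU mask that can change at run time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cpus = set()  # private: never handed out directly

    def update(self, cpus):
        """Replace the mask, as a partition change would."""
        with self._lock:
            self._cpus = set(cpus)

    def cpu_is_isolated(self, cpu):
        """Point query, in the spirit of cpuset_cpu_is_isolated()."""
        with self._lock:
            return cpu in self._cpus

    def copy_out(self):
        """Return a snapshot the caller owns; later updates do not affect it."""
        with self._lock:
            return frozenset(self._cpus)
```

A caller that needs a consistent view across several CPUs takes one snapshot with copy_out() instead of issuing repeated point queries that could straddle an update.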
On 12/12/23 08:27, Frederic Weisbecker wrote:
> On Fri, Dec 08, 2023 at 09:18:53AM -0500, Gianfranco Dutka wrote:
>>> The isolcpus, nohz_full and rcu_nocbs are boot-time kernel parameters. I am in the process of improving dynamic CPU isolation at runtime. Right now, we are able to do isolcpus=domain with the isolated cpuset partition functionality. Other aspects of CPU isolation are being looked at with the goal of reducing the gap of what one can do at boot time versus what can be done at run time. It will certain take time to reach that goal.
>>>
>>> Cheers,
>>> Longman
>>>
>> Thank you Waiman for the response. It would seem that getting similar
>> functionality through cgroups/cpusets is the only option at the moment. Is it
>> completely out of the question to possibly patch the kernel to modify these
>> parameters at runtime? Or would that entail a significant change that might
>> not be so trivial to accomplish? For instance, the solution wouldn’t be as
>> simple as patching the kernel to make these writeable and then calling the
>> same functions which run at boot-time when these parameters are originally
>> written?
> As for nohz_full (which implies rcu_nocb), it's certainly possible to make it
> tunable at runtime via cpusets. If people really want it, I'm willing to help.
As Phil said, your help in enabling dynamic rcu_nocb will be
greatly appreciated. My current thought is to have a root-level
cpuset.cpus.isolation_control file to enable additional CPU isolation
like rcu_nocb to be applied to CPUs in isolated partitions.
Cheers,
Longman
On Tue, Dec 12, 2023 at 03:18:43PM -0500, Waiman Long wrote:
>
> On 12/12/23 08:27, Frederic Weisbecker wrote:
> > On Fri, Dec 08, 2023 at 09:18:53AM -0500, Gianfranco Dutka wrote:
> > > > The isolcpus, nohz_full and rcu_nocbs are boot-time kernel parameters. I am in the process of improving dynamic CPU isolation at runtime. Right now, we are able to do isolcpus=domain with the isolated cpuset partition functionality. Other aspects of CPU isolation are being looked at with the goal of reducing the gap of what one can do at boot time versus what can be done at run time. It will certain take time to reach that goal.
> > > >
> > > > Cheers,
> > > > Longman
> > > >
> > > Thank you Waiman for the response. It would seem that getting similar
> > > functionality through cgroups/cpusets is the only option at the moment. Is it
> > > completely out of the question to possibly patch the kernel to modify these
> > > parameters at runtime? Or would that entail a significant change that might
> > > not be so trivial to accomplish? For instance, the solution wouldn’t be as
> > > simple as patching the kernel to make these writeable and then calling the
> > > same functions which run at boot-time when these parameters are originally
> > > written?
> > As for nohz_full (which implies rcu_nocb), it's certainly possible to make it
> > tunable at runtime via cpusets. If people really want it, I'm willing to help.
>
> As said by Phil, your help in in enabling dynamic rcu_nocb will be greatly
> appreciated.
rcu_nocb is already ready for that. The part that is not yet ready is nohz_full and its
several components (tick, remote tick, [hr-]timers affinity, workqueues affinity, kthreads
affinity, vmstat, buffer head, etc.). The last debate at Plumbers suggested that
nohz_full should be dynamically turned on/off only on offline CPUs. That will
indeed simplify the problem.
> My current thought is to have a root level
> cpuset.cpus.isolation_control file to enable additional CPU isolation like
> rcu_nocb to be applied to CPUs in isolated partitions.
Last time I tried that, Peter Zijlstra was more in favour of an all-or-nothing isolation
switch by default for nohz_full that would include rcu_nocb. Then, if people
are interested in something more fine-grained, such a file could be introduced to control
individual features (see
https://lore.kernel.org/lkml/[email protected]/ )
But so far I have never heard of a need for such fine-grained isolation. Users of
nohz_full= seem to want to isolate everything out.
Thanks.
On 12/12/23 18:57, Frederic Weisbecker wrote:
> On Tue, Dec 12, 2023 at 03:18:43PM -0500, Waiman Long wrote:
>> On 12/12/23 08:27, Frederic Weisbecker wrote:
>>> On Fri, Dec 08, 2023 at 09:18:53AM -0500, Gianfranco Dutka wrote:
>>>>> The isolcpus, nohz_full and rcu_nocbs are boot-time kernel parameters. I am in the process of improving dynamic CPU isolation at runtime. Right now, we are able to do isolcpus=domain with the isolated cpuset partition functionality. Other aspects of CPU isolation are being looked at with the goal of reducing the gap of what one can do at boot time versus what can be done at run time. It will certain take time to reach that goal.
>>>>>
>>>>> Cheers,
>>>>> Longman
>>>>>
>>>> Thank you Waiman for the response. It would seem that getting similar
>>>> functionality through cgroups/cpusets is the only option at the moment. Is it
>>>> completely out of the question to possibly patch the kernel to modify these
>>>> parameters at runtime? Or would that entail a significant change that might
>>>> not be so trivial to accomplish? For instance, the solution wouldn’t be as
>>>> simple as patching the kernel to make these writeable and then calling the
>>>> same functions which run at boot-time when these parameters are originally
>>>> written?
>>> As for nohz_full (which implies rcu_nocb), it's certainly possible to make it
>>> tunable at runtime via cpusets. If people really want it, I'm willing to help.
>> As said by Phil, your help in in enabling dynamic rcu_nocb will be greatly
>> appreciated.
> rcu_nocb is already ready for that. The not yet ready part is nohz_full and its
> several components (tick, remote tick, [hr-]timers affinity, workqueues affinity, kthreads
> affinity, vmstat, buffer head, etc...). Last debate on plumbers suggested that
> nohz_full should be dynamically turned on/off only on offline CPUs. That will
> indeed simplify the problem.
So rcu_nocb can be changed dynamically without too much additional
work. That is good to know, as I haven't looked into that myself.
The other pieces will still need additional work. I already have a patch
in the cgroup tree that updates the unbound workqueue affinity to
exclude isolated cpuset CPUs, though there may still be some further
fine-tuning that can be done.
>
>> My current thought is to have a root level
>> cpuset.cpus.isolation_control file to enable additional CPU isolation like
>> rcu_nocb to be applied to CPUs in isolated partitions.
> Last time I tried that, Peter Zijlstra was more in favour of an isolate all or nothing
> switch by default for nohz_full that would include rcu_nocb. And then if people
> are interested in something more finegrained, introduce such a file to control
> individual features (see
> https://lore.kernel.org/lkml/[email protected]/ )
>
> But so far I never heard about the need for such a finegrained isolation. Users of
> nohz_full= seem to want to isolate everything out.
Yes, I recall some of the discussion now. I am fine with a single on/off
switch. That will likely simplify the process, as we can add additional
isolation features over time once the code is ready, maybe via a
cpuset.cpus.isolation_full boolean flag.
Cheers,
Longman
>
> Thanks.
>