2021-09-09 16:29:58

by Chris Friesen

Subject: question about isolcpus and nohz_full

Hi,

I'm finally getting around to moving to a newer kernel, and I'm running
into a warning "Housekeeping: nohz_full= must match isolcpus=".

In my environment I had a number of "fully-isolated" CPUs with both
"isolcpus" and "nohz_full", then a number of less-isolated CPUs that had
"nohz_full" enabled but not "isolcpus", then the housekeeping CPUs.  It
appears this is no longer supported via the boot args.  (The
"less-isolated" CPUs were used for applications that expected the usual
load balancing from the scheduler but didn't want to be interrupted by
timer ticks.)
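
For illustration, the three-tier layout described above can be written down
as CPU lists and compared the way the warning text suggests. This is only a
minimal sketch: the CPU numbers and the helper names parse_cpulist and
classify are hypothetical, and the parser handles only the plain "a,b-c" list
form, not every syntax the kernel accepts.

```python
def parse_cpulist(text):
    """Parse a simple CPU list such as "1-3,5" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def classify(all_cpus, isolcpus, nohz_full):
    """Split CPUs into the tiers described in the message."""
    iso = parse_cpulist(isolcpus)
    nohz = parse_cpulist(nohz_full)
    return {
        "fully_isolated": sorted(iso & nohz),   # isolcpus and nohz_full
        "less_isolated": sorted(nohz - iso),    # nohz_full only
        "isolated_only": sorted(iso - nohz),    # isolcpus only
        "housekeeping": sorted(set(all_cpus) - iso - nohz),
        "lists_match": iso == nohz,
    }


# Hypothetical 8-CPU box: 4-7 fully isolated, 2-3 tickless only, 0-1 housekeeping.
layout = classify(range(8), isolcpus="4-7", nohz_full="2-7")
print(layout)
```

With these made-up lists, "lists_match" is False, which is the kind of
mismatch the warning message complains about.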

1) The kernel-parameters documentation for "isolcpus=" doesn't say
anything about needing to match "nohz_full", nor does the documentation
for "nohz_full" mention that it needs to align with "isolcpus".  Maybe
this would be a good thing to add?

2) Is it allowed to specify "nohz_full" for some CPUs at boot time
without specifying any isolcpus?  If so, what happens if I later isolate
a subset of those CPUs using "cpuset.sched_load_balance" in cgroups?  Is
that allowed when the equivalent boot args are not?

Thanks,

Chris


2021-09-10 04:16:35

by Mike Galbraith

Subject: Re: question about isolcpus and nohz_full

On Thu, 2021-09-09 at 10:26 -0600, Chris Friesen wrote:
>
> 2) Is it allowed to specify  "nohz_full" for some CPUs at boot time
> without specifying any isolcpus?

Yup (IM[not the least bit;]HO the proper way to partition a box).
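
One way to see what a given boot actually ended up with is to read the CPU
lists the kernel exports under /sys/devices/system/cpu. A minimal sketch,
assuming those sysfs files are present on the running kernel (the helper name
read_cpulist is hypothetical, and the exact contents can vary with kernel
configuration):

```python
from pathlib import Path

CPU_SYSFS = Path("/sys/devices/system/cpu")


def read_cpulist(name, base=CPU_SYSFS):
    """Return the CPU list string from a sysfs file, or None if unreadable."""
    path = Path(base) / name
    try:
        return path.read_text().strip()
    except OSError:
        # File missing (e.g. not Linux, or feature not built in) or not readable.
        return None


# With nohz_full= given and no isolcpus=, "isolated" would be expected to be empty.
for name in ("nohz_full", "isolated", "online"):
    print(f"{name}: {read_cpulist(name)!r}")
```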

>   If so, what happens if I later isolate
> a subset of those CPUs using "cpuset.sched_load_balance" in cgroups?  Is
> that allowed when the equivalent boot args are not?

That's what an old shield script I still have lying around does. I
booted master on my little desktop box with nohz_full=1,2,3,5,6,7 and
shielded cores 2 and 3, after taking down cpus 4-7 (smt), and it still
seems to work fine.
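
The shield script itself is not shown in this thread. As a rough sketch of
the general approach only, the following builds the list of writes for a
two-cpuset layout on a cgroup v1 cpuset hierarchy. The mount point, the
"system" and "shield" directory names, the single memory node "0", and the
choice to turn load balancing off inside the shield are all assumptions made
for illustration.

```python
import os


def shield_plan(root, shield_cpus, system_cpus, mems="0"):
    """Return (path, value) writes for a two-cpuset shield layout (cgroup v1)."""
    system = os.path.join(root, "system")
    shield = os.path.join(root, "shield")
    return [
        # Child cpusets need CPUs and memory nodes before tasks can be attached.
        (os.path.join(system, "cpuset.cpus"), system_cpus),
        (os.path.join(system, "cpuset.mems"), mems),
        (os.path.join(shield, "cpuset.cpus"), shield_cpus),
        (os.path.join(shield, "cpuset.mems"), mems),
        # Stop balancing across the whole box at the root ...
        (os.path.join(root, "cpuset.sched_load_balance"), "0"),
        # ... keep it within the system set, and switch it off in the shield.
        (os.path.join(system, "cpuset.sched_load_balance"), "1"),
        (os.path.join(shield, "cpuset.sched_load_balance"), "0"),
    ]


def apply_plan(plan):
    """Perform the writes; needs root and a mounted cpuset hierarchy."""
    for path, value in plan:
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(value)


# Mirrors the test above: shield cores 2 and 3, leave 0-1 for everything else.
plan = shield_plan("/sys/fs/cgroup/cpuset", shield_cpus="2-3", system_cpus="0-1")
for path, value in plan:
    print(f"{value} > {path}")
```

Only the plan is printed here; apply_plan is left uncalled because it changes
system state. Moving existing tasks into the system set is not covered.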

I used to also override nohz dynamically (via an ugly.. maybe even fugly,
hack), turning the tick on/off for subsets. Tick on proved best for the
jitter of a heftily threaded RT app spread across many isolated cores,
so at need I could even partition a box into a mixture of ticked, nohz
idle, and tickless sets, albeit in a rather limited fashion due to the
nohz_full preallocation requirement. It would be nice for some
situations if nohz mode were to become a fully dynamic set attribute.

-Mike