2022-12-20 07:01:58

by Leonardo Bras

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] sched/isolation: Add HK_TYPE_WQ to isolcpus=domain

On Tue, 2022-11-29 at 13:10 +0100, Frederic Weisbecker wrote:
> On Fri, Oct 14, 2022 at 01:27:25PM -0300, Leonardo Brás wrote:
> > Hello Frederic,
> >
> > So, IIUC you are removing all flags composing nohz_full= parameter in favor of a
> > unified NOHZ_FULL flag.
> >
> > I am very new to the code, and I am probably missing the whole picture, but I
> > actually think it's a good approach to keep them split for a couple reasons:
> > 1 - They are easier to understand in code (IMHO): 
> > "This cpu should not do this, because it's not able to do WQ housekeeping" looks
> > better than "because it's not in DOMAIN or NOHZ_FULL housekeeping"
>
> A comment above each site may solve that.

Sure, but not having to leave comments would be better. Or am I missing
something?

>
> >
> > 2 - They are simpler for using: 
> > Suppose we have this function that should run at a WQ, but we want to keep them
> > out of the isolated cpus. If we have the unified flags, we need to combine both
> > DOMAIN and NOHZ_FULL bitmasks, and then combine it again with something like
> > cpu_online_mask. It usually means allocating a new cpumask_t, and also freeing
> > it afterwards.
> > If we have a single WQ flag, we can avoid the allocation altogether by using
> > for_each_cpu_and(), making the code much simpler.
>
> I guess having a specific function for workqueues would arrange for it.

You mean keeping a WQ housekeeping bitmap? This could be a solution, but it
would affect only the WQ example.

>
> >
> > 3 - It makes easier to compose new isolation modes:
> > In case the future requires a new isolation mode that also uses the types of
> > isolation we currently have implemented, it would be much easier to just compose
> > it with the current HK flags, instead of having to go through all usages and do
> > a cpumask_and() there. Also, new isolation modes would make (2) worse.
>
> Actually having a new feature merged in HK_NOHZ_FULL would make it easier to
> handle as it avoids spreading cpumasks. I'm not sure I understand what you
> mean.

IIUC, your queued patch merges the housekeeping types HK_TYPE_TIMER,
HK_TYPE_RCU, HK_TYPE_MISC, HK_TYPE_TICK, HK_TYPE_WQ and HK_TYPE_KTHREAD in a
single HK_TYPE_NOHZ_FULL.

Suppose in future we need a new isolation feature in cmdline, say 
isol_new=<cpulist>, and it works exactly like nohz_full=<cpulist>, but also
needs to isolate cpulist against something else, say doing X.

How would this get implemented? IIUC, following the same pattern:
- A new type HK_TYPE_ISOL_NEW would be created together with a cpumask,
- The new cpumask would be used to keep cpulist from doing X
- All places that use HK_TYPE_NOHZ_FULL bitmap for isolation would need to also
bitmask_and() the new cpumask. (sometimes needing a local cpumask_t)

Ok, there may be shortcuts for this, like keeping an intermediary bitmap, but
that can become tricky.

Other more complex example: New isolation feature isol_new2=<cpulist> behaves
like nohz_full=<cpulist>, keeps cpulist from doing X, but allows unbound RCU
work. Now it's even harder to have shortcuts from previous implementation.

What I am trying to defend here is that keeping the HK_type with the idea of
"things to get cpulist isolated from" works better for future implementations
than a single flag with a lot of responsibilities:
- A new type HK_TYPE_X would be created together with a cpumask,
- The new cpumask would be used to keep cpulist from doing X
- isol_new=<cpulist> is composed with the flags for what cpulist is getting
isolated.
- (No need to touch already implemented isolations.)

In fact, I propose that it works better for current implementations also:
The current patch (3/4) takes the WQ isolation responsibility from
HK_TYPE_DOMAIN and focus it in HK_TYPE_WQ, adding it to isolcpus=<cpulist>
flags. This avoids some cpumask_and()s, and a cpumask_t kzalloc, and makes the
code less complex to implement when we need to put isolation in further parts of
the code. (patch 4/4)

I am not sure if I am missing some important point here. 
Please let me know if it's the case.

>
> Thanks.
>

Thank you for replying!
Leo