2024-01-24 08:52:23

by Leonardo Bras

[permalink] [raw]
Subject: [RFC PATCH v1 1/1] wq: Avoid using isolated cpus' timers on unbounded queue_delayed_work

When __queue_delayed_work() is called with WORK_CPU_UNBOUND, it means any
cpu is able to run the work, as well as any cpu timer is able to be used.

This is not good if a system does use CPU isolation, because it can take
away some valuable cpu time to:
1 - deal with the timer interrupt,
2 - schedule-out the desired task,
3 - queue work on a random workqueue, and
4 - schedule the desired task back to the cpu.

So to fix this, during __queue_delayed_work(), if both:
- Work is not cpu-bounded,
- CPU isolation is in place,
then pick a random non-isolated cpu to use both the timer and the
system per-cpu workqueue.

Signed-off-by: Leonardo Bras <[email protected]>
---
kernel/workqueue.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 76e60faed8923..0c50f41d9f95e 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1954,6 +1954,14 @@ static void __queue_delayed_work(int cpu, struct workqueue_struct *wq,
return;
}

+ /*
+ * If the work is cpu-unbound, and cpu isolation is in place, only
+ * schedule use timers from housekeeping cpus. In favor of avoiding
+ * cacheline bouncing, run the WQ in the same cpu as the timer.
+ */
+ if (cpu == WORK_CPU_UNBOUND && housekeeping_enabled(HK_TYPE_TIMER))
+ cpu = housekeeping_any_cpu(HK_TYPE_TIMER);
+
dwork->wq = wq;
dwork->cpu = cpu;
timer->expires = jiffies + delay;
--
2.43.0



2024-01-24 21:47:41

by Tejun Heo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 1/1] wq: Avoid using isolated cpus' timers on unbounded queue_delayed_work

On Wed, Jan 24, 2024 at 05:29:37AM -0300, Leonardo Bras wrote:
> + /*
> + * If the work is cpu-unbound, and cpu isolation is in place, only
> + * schedule use timers from housekeeping cpus. In favor of avoiding
> + * cacheline bouncing, run the WQ in the same cpu as the timer.
> + */
> + if (cpu == WORK_CPU_UNBOUND && housekeeping_enabled(HK_TYPE_TIMER))
> + cpu = housekeeping_any_cpu(HK_TYPE_TIMER);

Would it make more sense to use wq_unbound_cpumask?

Thanks.

--
tejun

2024-01-25 01:46:25

by Leonardo Bras

[permalink] [raw]
Subject: Re: [RFC PATCH v1 1/1] wq: Avoid using isolated cpus' timers on unbounded queue_delayed_work

On Wed, Jan 24, 2024 at 11:47:29AM -1000, Tejun Heo wrote:
> On Wed, Jan 24, 2024 at 05:29:37AM -0300, Leonardo Bras wrote:
> > + /*
> > + * If the work is cpu-unbound, and cpu isolation is in place, only
> > + * schedule use timers from housekeeping cpus. In favor of avoiding
> > + * cacheline bouncing, run the WQ in the same cpu as the timer.
> > + */
> > + if (cpu == WORK_CPU_UNBOUND && housekeeping_enabled(HK_TYPE_TIMER))
> > + cpu = housekeeping_any_cpu(HK_TYPE_TIMER);
>
> Would it make more sense to use wq_unbound_cpumask?

Hello Tejun, thank you for this reply!

That's a good suggestion, but looking at workqueue_init_early() I see that,
in short:
wq_unbound_cpumask = cpu_possible_mask &
housekeeping_cpumask(HK_TYPE_WQ) &
housekeeping_cpumask(HK_TYPE_DOMAIN) &
wq_cmdline_cpumask

So wq_unbound_cpumask relates to domain and workqueue cpu isolation.

In our case, we are using this to choose in which cpu is the timer we want
to use, so it makes sense to use timer-related cpu isolation, instead.

As of today, your suggestion would work the same, as the only way to enable
WQ cpu isolation is to use nohz_full, which also enables TIMER cpu
isolation. But since that can change in the future, for any reason, I would
suggest that we stick to using the HK_TYPE_TIMER cpumask.

I can now notice that this can end up introducing an issue: possibly
running on a workqueue on a cpu outside of a valid wq_cmdline_cpumask.

I would suggest fixing this in a couple ways:
1 - We introduce a new cpumask which is basically
housekeeping_cpumask(HK_TYPE_DOMAIN) & wq_cmdline_cpumask, allowing us
to keep the timer interrupt in the same cpu as the scheduled function,
2- We use the resulting cpu only to pick the right timer.

What are your thouhts on that?

Thank you!
Leo


2024-01-25 18:38:36

by Tejun Heo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 1/1] wq: Avoid using isolated cpus' timers on unbounded queue_delayed_work

Hello,

On Wed, Jan 24, 2024 at 10:45:50PM -0300, Leonardo Bras wrote:
> That's a good suggestion, but looking at workqueue_init_early() I see that,
> in short:
> wq_unbound_cpumask = cpu_possible_mask &
> housekeeping_cpumask(HK_TYPE_WQ) &
> housekeeping_cpumask(HK_TYPE_DOMAIN) &
> wq_cmdline_cpumask
>
> So wq_unbound_cpumask relates to domain and workqueue cpu isolation.
>
> In our case, we are using this to choose in which cpu is the timer we want
> to use, so it makes sense to use timer-related cpu isolation, instead.

- In the proposed code, when cpu == WORK_CPU_UNBOUND, it's always setting
cpu to housekeeping_any_cpu(HK_TYPE_TIMER). This may unnecessarily move
the timer and task away from local CPU. Preferring the local CPU would
likely make sense.

- If HK_TYPE_TIMER and workqueue masks may not agree, setting dwork->cpu to
the one returned from HK_TYPE_TIMER is likely problematic. That would
force __queue_work() to use that CPU instead of picking one from
wq_unbound_cpumask.

> As of today, your suggestion would work the same, as the only way to enable
> WQ cpu isolation is to use nohz_full, which also enables TIMER cpu
> isolation. But since that can change in the future, for any reason, I would
> suggest that we stick to using the HK_TYPE_TIMER cpumask.
>
> I can now notice that this can end up introducing an issue: possibly
> running on a workqueue on a cpu outside of a valid wq_cmdline_cpumask.

Yeap.

> I would suggest fixing this in a couple ways:
> 1 - We introduce a new cpumask which is basically
> housekeeping_cpumask(HK_TYPE_DOMAIN) & wq_cmdline_cpumask, allowing us
> to keep the timer interrupt in the same cpu as the scheduled function,
> 2- We use the resulting cpu only to pick the right timer.
>
> What are your thouhts on that?

How about something like the following instead?

- If current CPU is in HK_TYPE_TIMER, pick that CPU.

- If not, pick a CPU from HK_TYPE_TIMER.

- Do add_timer_on() on the selected CPU but leave dwork->cpu as
WORK_CPU_UNBOUND and leave that part to __queue_work().

Thanks.

--
tejun

2024-01-25 21:56:33

by Leonardo Bras

[permalink] [raw]
Subject: Re: [RFC PATCH v1 1/1] wq: Avoid using isolated cpus' timers on unbounded queue_delayed_work

On Thu, Jan 25, 2024 at 08:38:25AM -1000, Tejun Heo wrote:
> Hello,
>
> On Wed, Jan 24, 2024 at 10:45:50PM -0300, Leonardo Bras wrote:
> > That's a good suggestion, but looking at workqueue_init_early() I see that,
> > in short:
> > wq_unbound_cpumask = cpu_possible_mask &
> > housekeeping_cpumask(HK_TYPE_WQ) &
> > housekeeping_cpumask(HK_TYPE_DOMAIN) &
> > wq_cmdline_cpumask
> >
> > So wq_unbound_cpumask relates to domain and workqueue cpu isolation.
> >
> > In our case, we are using this to choose in which cpu is the timer we want
> > to use, so it makes sense to use timer-related cpu isolation, instead.
>
> - In the proposed code, when cpu == WORK_CPU_UNBOUND, it's always setting
> cpu to housekeeping_any_cpu(HK_TYPE_TIMER). This may unnecessarily move
> the timer and task away from local CPU. Preferring the local CPU would
> likely make sense.
>
> - If HK_TYPE_TIMER and workqueue masks may not agree, setting dwork->cpu to
> the one returned from HK_TYPE_TIMER is likely problematic. That would
> force __queue_work() to use that CPU instead of picking one from
> wq_unbound_cpumask.
>
> > As of today, your suggestion would work the same, as the only way to enable
> > WQ cpu isolation is to use nohz_full, which also enables TIMER cpu
> > isolation. But since that can change in the future, for any reason, I would
> > suggest that we stick to using the HK_TYPE_TIMER cpumask.
> >
> > I can now notice that this can end up introducing an issue: possibly
> > running on a workqueue on a cpu outside of a valid wq_cmdline_cpumask.
>
> Yeap.
>
> > I would suggest fixing this in a couple ways:
> > 1 - We introduce a new cpumask which is basically
> > housekeeping_cpumask(HK_TYPE_DOMAIN) & wq_cmdline_cpumask, allowing us
> > to keep the timer interrupt in the same cpu as the scheduled function,
> > 2- We use the resulting cpu only to pick the right timer.
> >
> > What are your thouhts on that?
>
> How about something like the following instead?
>
> - If current CPU is in HK_TYPE_TIMER, pick that CPU.
>
> - If not, pick a CPU from HK_TYPE_TIMER.
>
> - Do add_timer_on() on the selected CPU but leave dwork->cpu as
> WORK_CPU_UNBOUND and leave that part to __queue_work().
>
> Thanks.

It looks like a good idea to me.

It's basicaly (2) with "keep the timer in this cpu if it's not isolated",
which seems the right thing to do.

Thanks!
Leo


>
> --
> tejun
>