2024-02-14 19:06:17

by Tejun Heo

[permalink] [raw]
Subject: [PATCH wq/for-6.9] workqueue: Fix queue_work_on() with BH workqueues

From 2f34d7337d98f3eae7bd3d1270efaf9d8a17cfc6 Mon Sep 17 00:00:00 2001
From: Tejun Heo <[email protected]>
Date: Wed, 14 Feb 2024 08:33:55 -1000

When queue_work_on() is used to queue a BH work item on a remote CPU, the
work item is queued on that CPU but kick_pool() raises softirq on the local
CPU. This leads to stalls as the work item won't be executed until something
else on the remote CPU schedules a BH work item or tasklet locally.

Fix it by bouncing raising softirq to the target CPU using per-cpu irq_work.

Signed-off-by: Tejun Heo <[email protected]>
Fixes: 4cb1ef64609f ("workqueue: Implement BH workqueues to eventually replace tasklets")
---
Oops, I just found out that I was raising softirq on the wrong CPU when BH
work items are queued on remote CPUs. This fixes it. Applied to wq/for-6.9.

Thanks.

kernel/workqueue.c | 41 ++++++++++++++++++++++++++++++++++++-----
1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 4950bfc2cdcc..04e35dbe6799 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -54,6 +54,7 @@
#include <linux/nmi.h>
#include <linux/kvm_para.h>
#include <linux/delay.h>
+#include <linux/irq_work.h>

#include "workqueue_internal.h"

@@ -457,6 +458,10 @@ static bool wq_debug_force_rr_cpu = false;
#endif
module_param_named(debug_force_rr_cpu, wq_debug_force_rr_cpu, bool, 0644);

+/* to raise softirq for the BH worker pools on other CPUs */
+static DEFINE_PER_CPU_SHARED_ALIGNED(struct irq_work [NR_STD_WORKER_POOLS],
+ bh_pool_irq_works);
+
/* the BH worker pools */
static DEFINE_PER_CPU_SHARED_ALIGNED(struct worker_pool [NR_STD_WORKER_POOLS],
bh_worker_pools);
@@ -1197,6 +1202,13 @@ static bool assign_work(struct work_struct *work, struct worker *worker,
return true;
}

+static struct irq_work *bh_pool_irq_work(struct worker_pool *pool)
+{
+ int high = pool->attrs->nice == HIGHPRI_NICE_LEVEL ? 1 : 0;
+
+ return &per_cpu(bh_pool_irq_works, pool->cpu)[high];
+}
+
/**
* kick_pool - wake up an idle worker if necessary
* @pool: pool to kick
@@ -1215,10 +1227,15 @@ static bool kick_pool(struct worker_pool *pool)
return false;

if (pool->flags & POOL_BH) {
- if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
- raise_softirq_irqoff(HI_SOFTIRQ);
- else
- raise_softirq_irqoff(TASKLET_SOFTIRQ);
+ if (likely(pool->cpu == smp_processor_id())) {
+ if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
+ raise_softirq_irqoff(HI_SOFTIRQ);
+ else
+ raise_softirq_irqoff(TASKLET_SOFTIRQ);
+ } else {
+ irq_work_queue_on(bh_pool_irq_work(pool), pool->cpu);
+ }
+
return true;
}

@@ -7367,6 +7384,16 @@ static inline void wq_watchdog_init(void) { }

#endif /* CONFIG_WQ_WATCHDOG */

+static void bh_pool_kick_normal(struct irq_work *irq_work)
+{
+ raise_softirq_irqoff(TASKLET_SOFTIRQ);
+}
+
+static void bh_pool_kick_highpri(struct irq_work *irq_work)
+{
+ raise_softirq_irqoff(HI_SOFTIRQ);
+}
+
static void __init restrict_unbound_cpumask(const char *name, const struct cpumask *mask)
{
if (!cpumask_intersects(wq_unbound_cpumask, mask)) {
@@ -7408,6 +7435,8 @@ void __init workqueue_init_early(void)
{
struct wq_pod_type *pt = &wq_pod_types[WQ_AFFN_SYSTEM];
int std_nice[NR_STD_WORKER_POOLS] = { 0, HIGHPRI_NICE_LEVEL };
+ void (*irq_work_fns[2])(struct irq_work *) = { bh_pool_kick_normal,
+ bh_pool_kick_highpri };
int i, cpu;

BUILD_BUG_ON(__alignof__(struct pool_workqueue) < __alignof__(long long));
@@ -7455,8 +7484,10 @@ void __init workqueue_init_early(void)

i = 0;
for_each_bh_worker_pool(pool, cpu) {
- init_cpu_worker_pool(pool, cpu, std_nice[i++]);
+ init_cpu_worker_pool(pool, cpu, std_nice[i]);
pool->flags |= POOL_BH;
+ init_irq_work(bh_pool_irq_work(pool), irq_work_fns[i]);
+ i++;
}

i = 0;
--
2.43.0



2024-02-14 19:07:56

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH wq/for-6.9] workqueue: Fix queue_work_on() with BH workqueues

On Wed, 14 Feb 2024 at 10:39, Tejun Heo <[email protected]> wrote:
>
> When queue_work_on() is used to queue a BH work item on a remote CPU, the
> work item is queued on that CPU but kick_pool() raises softirq on the local
> CPU.

Now, does it make a lot of sense to ask to queue a BH work on another
CPU in the first place?

I don't think tasklets supported that. And while the workqueues
obviously do - and you fix that case - I wonder if we shouldn't say
"that operation makes no sense, please don't do it" rather than
actually support it?

What made you notice this issue?

Linus

2024-02-14 19:17:12

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH wq/for-6.9] workqueue: Fix queue_work_on() with BH workqueues

Hello,

On Wed, Feb 14, 2024 at 11:03:46AM -0800, Linus Torvalds wrote:
> On Wed, 14 Feb 2024 at 10:39, Tejun Heo <[email protected]> wrote:
> >
> > When queue_work_on() is used to queue a BH work item on a remote CPU, the
> > work item is queued on that CPU but kick_pool() raises softirq on the local
> > CPU.
>
> Now, does it make a lot of sense to ask to queue a BH work on another
> CPU in the first place?
>
> I don't think tasklets supported that. And while the workqueues
> obviously do - and you fix that case - I wonder if we shouldn't say
> "that operation makes no sense, please don't do it" rather than
> actually support it?
>
> What made you notice this issue?

It's test code I'm using to verify new features to cover
tasklet_disable/enable(), so not a real use case. For tasklet migration,
this isn't necessary but at the same time all the mechanics are already
there, so it's nice to keep the API orthogonal and one can conceive
plausible use cases - e.g. "I'm using per-cpu work items to collect per-cpu
states but that comes with really high tail latencies, lemme switch to
per-cpu BH work items".

Thanks.

--
tejun

2024-02-16 01:30:15

by Tejun Heo

[permalink] [raw]
Subject: [PATCH wq/for-6.9] workqueue, irq_work: Build fix for !CONFIG_IRQ_WORK

From 9d6efa8d0dd012db22958a225246812441b25405 Mon Sep 17 00:00:00 2001
From: Tejun Heo <[email protected]>
Date: Thu, 15 Feb 2024 15:02:30 -1000

2f34d7337d98 ("workqueue: Fix queue_work_on() with BH workqueues") added
irq_work usage to workqueue; however, it turns out irq_work is actually
optional and the change breaks build on configuration which doesn't have
CONFIG_IRQ_WORK enabled.

Fix build by making workqueue use irq_work only when CONFIG_SMP and enabling
CONFIG_IRQ_WORK when CONFIG_SMP is set. It's reasonable to argue that it may
be better to just always enable it. However, this still saves a small bit of
memory for tiny UP configs and also the least amount of change, so, for now,
let's keep it conditional.

Signed-off-by: Tejun Heo <[email protected]>
Fixes: 2f34d7337d98 ("workqueue: Fix queue_work_on() with BH workqueues")
Reported-by: Naresh Kamboju <[email protected]>
Tested-by: Anders Roxell <[email protected]>
---
Hello,

As this is breaking build for some configs, I'll apply this to wq/for-6.9.
If there's any objection, please let me know. I'll back out and work on a
different approach.

Thanks.

init/Kconfig | 2 ++
kernel/workqueue.c | 24 +++++++++++++++---------
2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 8df18f3a9748..41be05a8ba5e 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -107,6 +107,8 @@ config CONSTRUCTORS

config IRQ_WORK
bool
+ depends on SMP
+ default y

config BUILDTIME_TABLE_SORT
bool
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 04e35dbe6799..6ae441e13804 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1209,6 +1209,20 @@ static struct irq_work *bh_pool_irq_work(struct worker_pool *pool)
return &per_cpu(bh_pool_irq_works, pool->cpu)[high];
}

+static void kick_bh_pool(struct worker_pool *pool)
+{
+#ifdef CONFIG_SMP
+ if (unlikely(pool->cpu != smp_processor_id())) {
+ irq_work_queue_on(bh_pool_irq_work(pool), pool->cpu);
+ return;
+ }
+#endif
+ if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
+ raise_softirq_irqoff(HI_SOFTIRQ);
+ else
+ raise_softirq_irqoff(TASKLET_SOFTIRQ);
+}
+
/**
* kick_pool - wake up an idle worker if necessary
* @pool: pool to kick
@@ -1227,15 +1241,7 @@ static bool kick_pool(struct worker_pool *pool)
return false;

if (pool->flags & POOL_BH) {
- if (likely(pool->cpu == smp_processor_id())) {
- if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
- raise_softirq_irqoff(HI_SOFTIRQ);
- else
- raise_softirq_irqoff(TASKLET_SOFTIRQ);
- } else {
- irq_work_queue_on(bh_pool_irq_work(pool), pool->cpu);
- }
-
+ kick_bh_pool(pool);
return true;
}

--
2.43.2


2024-02-16 05:12:07

by Tejun Heo

[permalink] [raw]
Subject: [PATCH v2 wq/for-6.9] workqueue, irq_work: Build fix for !CONFIG_IRQ_WORK

2f34d7337d98 ("workqueue: Fix queue_work_on() with BH workqueues") added
irq_work usage to workqueue; however, it turns out irq_work is actually
optional and the change breaks build on configuration which doesn't have
CONFIG_IRQ_WORK enabled.

Fix build by making workqueue use irq_work only when CONFIG_SMP and enabling
CONFIG_IRQ_WORK when CONFIG_SMP is set. It's reasonable to argue that it may
be better to just always enable it. However, this still saves a small bit of
memory for tiny UP configs and also the least amount of change, so, for now,
let's keep it conditional.

Verified to do the right thing for x86_64 allnoconfig and defconfig, and
aarch64 allnoconfig, allnoconfig + prink disable (SMP but nothing selects
IRQ_WORK) and a modified aarch64 Kconfig where !SMP and nothing selects
IRQ_WORK.

v2: `depends on SMP` leads to Kconfig warnings when CONFIG_IRQ_WORK is
selected by something else when !CONFIG_SMP. Use `def_bool y if SMP`
instead.

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Naresh Kamboju <[email protected]>
Fixes: 2f34d7337d98 ("workqueue: Fix queue_work_on() with BH workqueues")
Cc: Stephen Rothwell <[email protected]>
---
Hello,

Unfortunately, the previous patch triggers Kconfig warnings when IRQ_WORK is
selected by something else but !CONFIG_SMP. This one seems to do the right
thing in all cases.

Naresh, Anders, can you please test it again?

Thanks.

init/Kconfig | 2 +-
kernel/workqueue.c | 24 +++++++++++++++---------
2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 8df18f3a9748..0d21c9e0398f 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -106,7 +106,7 @@ config CONSTRUCTORS
bool

config IRQ_WORK
- bool
+ def_bool y if SMP

config BUILDTIME_TABLE_SORT
bool
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 04e35dbe6799..6ae441e13804 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1209,6 +1209,20 @@ static struct irq_work *bh_pool_irq_work(struct worker_pool *pool)
return &per_cpu(bh_pool_irq_works, pool->cpu)[high];
}

+static void kick_bh_pool(struct worker_pool *pool)
+{
+#ifdef CONFIG_SMP
+ if (unlikely(pool->cpu != smp_processor_id())) {
+ irq_work_queue_on(bh_pool_irq_work(pool), pool->cpu);
+ return;
+ }
+#endif
+ if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
+ raise_softirq_irqoff(HI_SOFTIRQ);
+ else
+ raise_softirq_irqoff(TASKLET_SOFTIRQ);
+}
+
/**
* kick_pool - wake up an idle worker if necessary
* @pool: pool to kick
@@ -1227,15 +1241,7 @@ static bool kick_pool(struct worker_pool *pool)
return false;

if (pool->flags & POOL_BH) {
- if (likely(pool->cpu == smp_processor_id())) {
- if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
- raise_softirq_irqoff(HI_SOFTIRQ);
- else
- raise_softirq_irqoff(TASKLET_SOFTIRQ);
- } else {
- irq_work_queue_on(bh_pool_irq_work(pool), pool->cpu);
- }
-
+ kick_bh_pool(pool);
return true;
}

--
2.43.2


2024-02-16 10:24:48

by Anders Roxell

[permalink] [raw]
Subject: Re: [PATCH v2 wq/for-6.9] workqueue, irq_work: Build fix for !CONFIG_IRQ_WORK

On Fri, 16 Feb 2024 at 06:10, Tejun Heo <[email protected]> wrote:
>
> 2f34d7337d98 ("workqueue: Fix queue_work_on() with BH workqueues") added
> irq_work usage to workqueue; however, it turns out irq_work is actually
> optional and the change breaks build on configuration which doesn't have
> CONFIG_IRQ_WORK enabled.
>
> Fix build by making workqueue use irq_work only when CONFIG_SMP and enabling
> CONFIG_IRQ_WORK when CONFIG_SMP is set. It's reasonable to argue that it may
> be better to just always enable it. However, this still saves a small bit of
> memory for tiny UP configs and also the least amount of change, so, for now,
> let's keep it conditional.
>
> Verified to do the right thing for x86_64 allnoconfig and defconfig, and
> aarch64 allnoconfig, allnoconfig + prink disable (SMP but nothing selects
> IRQ_WORK) and a modified aarch64 Kconfig where !SMP and nothing selects
> IRQ_WORK.
>
> v2: `depends on SMP` leads to Kconfig warnings when CONFIG_IRQ_WORK is
> selected by something else when !CONFIG_SMP. Use `def_bool y if SMP`
> instead.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Reported-by: Naresh Kamboju <[email protected]>
> Fixes: 2f34d7337d98 ("workqueue: Fix queue_work_on() with BH workqueues")
> Cc: Stephen Rothwell <[email protected]>
> ---
> Hello,
>
> Unfortunately, the previous patch triggers Kconfig warnings when IRQ_WORK is
> selected by something else but !CONFIG_SMP. This one seems to do the right
> thing in all cases.
>
> Naresh, Anders, can you please test it again?

I applied this patch on yesterdays next tag next-20240215 and I was able to
build the arm64 tinyconfig.

Tested-by: Anders Roxell <[email protected]>

>
> Thanks.
>
> init/Kconfig | 2 +-
> kernel/workqueue.c | 24 +++++++++++++++---------
> 2 files changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/init/Kconfig b/init/Kconfig
> index 8df18f3a9748..0d21c9e0398f 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -106,7 +106,7 @@ config CONSTRUCTORS
> bool
>
> config IRQ_WORK
> - bool
> + def_bool y if SMP
>
> config BUILDTIME_TABLE_SORT
> bool
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 04e35dbe6799..6ae441e13804 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1209,6 +1209,20 @@ static struct irq_work *bh_pool_irq_work(struct worker_pool *pool)
> return &per_cpu(bh_pool_irq_works, pool->cpu)[high];
> }
>
> +static void kick_bh_pool(struct worker_pool *pool)
> +{
> +#ifdef CONFIG_SMP
> + if (unlikely(pool->cpu != smp_processor_id())) {
> + irq_work_queue_on(bh_pool_irq_work(pool), pool->cpu);
> + return;
> + }
> +#endif
> + if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
> + raise_softirq_irqoff(HI_SOFTIRQ);
> + else
> + raise_softirq_irqoff(TASKLET_SOFTIRQ);
> +}
> +
> /**
> * kick_pool - wake up an idle worker if necessary
> * @pool: pool to kick
> @@ -1227,15 +1241,7 @@ static bool kick_pool(struct worker_pool *pool)
> return false;
>
> if (pool->flags & POOL_BH) {
> - if (likely(pool->cpu == smp_processor_id())) {
> - if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
> - raise_softirq_irqoff(HI_SOFTIRQ);
> - else
> - raise_softirq_irqoff(TASKLET_SOFTIRQ);
> - } else {
> - irq_work_queue_on(bh_pool_irq_work(pool), pool->cpu);
> - }
> -
> + kick_bh_pool(pool);
> return true;
> }
>
> --
> 2.43.2
>

2024-02-16 16:35:35

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH v2 wq/for-6.9] workqueue, irq_work: Build fix for !CONFIG_IRQ_WORK

On Fri, Feb 16, 2024 at 11:24:26AM +0100, Anders Roxell wrote:
> I applied this patch on yesterdays next tag next-20240215 and I was able to
> build the arm64 tinyconfig.
>
> Tested-by: Anders Roxell <[email protected]>

Thanks for confirming. Applying to wq/for-6.9. Hopefully, this won't cause
any new breakages.

Thanks.

--
tejun