2023-04-21 02:51:47

by Tejun Heo

[permalink] [raw]
Subject: [PATCHSET wq/for-6.5] workqueue: Ordered workqueue creation cleanup

Hello,

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary
and contains the following 22 patches.

0001-powerpc-workqueue-Use-alloc_ordered_workqueue-to-cre.patch
0002-greybus-Use-alloc_ordered_workqueue-to-create-ordere.patch
0003-IB-hfi1-Use-alloc_ordered_workqueue-to-create-ordere.patch
0004-dm-integrity-Use-alloc_ordered_workqueue-to-create-o.patch
0005-media-amphion-Use-alloc_ordered_workqueue-to-create-.patch
0006-net-thunderx-Use-alloc_ordered_workqueue-to-create-o.patch
0007-net-octeontx2-Use-alloc_ordered_workqueue-to-create-.patch
0008-wifi-ath10-11-12k-Use-alloc_ordered_workqueue-to-cre.patch
0009-wifi-iwlwifi-Use-alloc_ordered_workqueue-to-create-o.patch
0010-wifi-mwifiex-Use-alloc_ordered_workqueue-to-create-o.patch
0011-net-wwan-t7xx-Use-alloc_ordered_workqueue-to-create-.patch
0012-scsi-Use-alloc_ordered_workqueue-to-create-ordered-w.patch
0013-virt-acrn-Use-alloc_ordered_workqueue-to-create-orde.patch
0014-soc-qcom-qmi-Use-alloc_ordered_workqueue-to-create-o.patch
0015-xen-pvcalls-Use-alloc_ordered_workqueue-to-create-or.patch
0016-btrfs-Use-alloc_ordered_workqueue-to-create-ordered-.patch
0017-cifs-Use-alloc_ordered_workqueue-to-create-ordered-w.patch
0018-net-qrtr-Use-alloc_ordered_workqueue-to-create-order.patch
0019-rxrpc-Use-alloc_ordered_workqueue-to-create-ordered-.patch
0020-crypto-octeontx2-Use-alloc_ordered_workqueue-to-crea.patch
0021-media-coda-Use-alloc_ordered_workqueue-to-create-ord.patch
0022-workqueue-Don-t-implicitly-make-UNBOUND-workqueues-w.patch

0001-0021 convert the existing users and 0022 drops the implicit ordered
promotion logic from alloc_workqueue(). I'll keep an eye out for a while
after merging 0022. Thankfully, these are pretty easy to grep for. The
patches can also be found in the following git branch.

git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git ordered-cleanup

diffstat follows. Thanks.

arch/powerpc/kernel/tau_6xx.c | 2 -
arch/powerpc/platforms/pseries/dlpar.c | 3 --
drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c | 12 ++++-----
drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c | 6 ++--
drivers/greybus/connection.c | 4 +--
drivers/greybus/svc.c | 2 -
drivers/infiniband/hw/hfi1/init.c | 7 ++---
drivers/md/dm-integrity.c | 4 +--
drivers/md/dm.c | 2 -
drivers/media/platform/amphion/vpu_core.c | 2 -
drivers/media/platform/amphion/vpu_v4l2.c | 2 -
drivers/media/platform/chips-media/coda-common.c | 2 -
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 3 --
drivers/net/ethernet/marvell/octeontx2/af/rvu.c | 5 +---
drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c | 13 ++++------
drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c | 5 +---
drivers/net/wireless/ath/ath10k/qmi.c | 3 --
drivers/net/wireless/ath/ath11k/qmi.c | 3 --
drivers/net/wireless/ath/ath12k/qmi.c | 3 --
drivers/net/wireless/intel/iwlwifi/pcie/trans.c | 4 +--
drivers/net/wireless/marvell/mwifiex/cfg80211.c | 13 ++++------
drivers/net/wireless/marvell/mwifiex/main.c | 22 ++++++++----------
drivers/net/wwan/t7xx/t7xx_hif_cldma.c | 13 +++++-----
drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c | 5 ++--
drivers/scsi/NCR5380.c | 5 +---
drivers/scsi/hosts.c | 12 ++++-----
drivers/scsi/libiscsi.c | 5 +---
drivers/soc/qcom/qmi_interface.c | 2 -
drivers/virt/acrn/ioreq.c | 4 +--
drivers/xen/pvcalls-back.c | 4 +--
fs/btrfs/disk-io.c | 2 -
fs/btrfs/scrub.c | 6 +++-
fs/cifs/dfs_cache.c | 2 -
include/linux/workqueue.h | 4 ---
kernel/workqueue.c | 23 +++----------------
net/qrtr/ns.c | 2 -
net/rxrpc/af_rxrpc.c | 2 -
37 files changed, 92 insertions(+), 121 deletions(-)


2023-04-21 02:51:57

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 06/22] net: thunderx: Use alloc_ordered_workqueue() to create ordered workqueues

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

alloc_workqueue(WQ_UNBOUND | flags, 1, args..)

to

alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not ncessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while conversion
is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Sunil Goutham <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 7eb2ddbe9bad..a317feb8decb 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -1126,8 +1126,7 @@ static int bgx_lmac_enable(struct bgx *bgx, u8 lmacid)
}

poll:
- lmac->check_link = alloc_workqueue("check_link", WQ_UNBOUND |
- WQ_MEM_RECLAIM, 1);
+ lmac->check_link = alloc_ordered_workqueue("check_link", WQ_MEM_RECLAIM);
if (!lmac->check_link)
return -ENOMEM;
INIT_DELAYED_WORK(&lmac->dwork, bgx_poll_for_link);
--
2.40.0

2023-04-21 02:52:02

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 07/22] net: octeontx2: Use alloc_ordered_workqueue() to create ordered workqueues

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

alloc_workqueue(WQ_UNBOUND | flags, 1, args..)

to

alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not ncessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while conversion
is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Sunil Goutham <[email protected]>
Cc: Ratheesh Kannoth <[email protected]>
Cc: Srujana Challa <[email protected]>
Cc: Geetha sowjanya <[email protected]>
Cc: [email protected]
---
drivers/net/ethernet/marvell/octeontx2/af/rvu.c | 5 ++---
.../net/ethernet/marvell/octeontx2/nic/otx2_pf.c | 13 +++++--------
.../net/ethernet/marvell/octeontx2/nic/otx2_vf.c | 5 ++---
3 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
index 8683ce57ed3f..207041c81184 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
@@ -3013,9 +3013,8 @@ static int rvu_flr_init(struct rvu *rvu)
cfg | BIT_ULL(22));
}

- rvu->flr_wq = alloc_workqueue("rvu_afpf_flr",
- WQ_UNBOUND | WQ_HIGHPRI | WQ_MEM_RECLAIM,
- 1);
+ rvu->flr_wq = alloc_ordered_workqueue("rvu_afpf_flr",
+ WQ_HIGHPRI | WQ_MEM_RECLAIM);
if (!rvu->flr_wq)
return -ENOMEM;

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
index 179433d0a54a..7b3114105757 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
@@ -271,8 +271,7 @@ static int otx2_pf_flr_init(struct otx2_nic *pf, int num_vfs)
{
int vf;

- pf->flr_wq = alloc_workqueue("otx2_pf_flr_wq",
- WQ_UNBOUND | WQ_HIGHPRI, 1);
+ pf->flr_wq = alloc_ordered_workqueue("otx2_pf_flr_wq", WQ_HIGHPRI);
if (!pf->flr_wq)
return -ENOMEM;

@@ -593,9 +592,8 @@ static int otx2_pfvf_mbox_init(struct otx2_nic *pf, int numvfs)
if (!pf->mbox_pfvf)
return -ENOMEM;

- pf->mbox_pfvf_wq = alloc_workqueue("otx2_pfvf_mailbox",
- WQ_UNBOUND | WQ_HIGHPRI |
- WQ_MEM_RECLAIM, 1);
+ pf->mbox_pfvf_wq = alloc_ordered_workqueue("otx2_pfvf_mailbox",
+ WQ_HIGHPRI | WQ_MEM_RECLAIM);
if (!pf->mbox_pfvf_wq)
return -ENOMEM;

@@ -1063,9 +1061,8 @@ static int otx2_pfaf_mbox_init(struct otx2_nic *pf)
int err;

mbox->pfvf = pf;
- pf->mbox_wq = alloc_workqueue("otx2_pfaf_mailbox",
- WQ_UNBOUND | WQ_HIGHPRI |
- WQ_MEM_RECLAIM, 1);
+ pf->mbox_wq = alloc_ordered_workqueue("otx2_pfaf_mailbox",
+ WQ_HIGHPRI | WQ_MEM_RECLAIM);
if (!pf->mbox_wq)
return -ENOMEM;

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c
index ab126f8706c7..1f16e0dcbb3e 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c
@@ -297,9 +297,8 @@ static int otx2vf_vfaf_mbox_init(struct otx2_nic *vf)
int err;

mbox->pfvf = vf;
- vf->mbox_wq = alloc_workqueue("otx2_vfaf_mailbox",
- WQ_UNBOUND | WQ_HIGHPRI |
- WQ_MEM_RECLAIM, 1);
+ vf->mbox_wq = alloc_ordered_workqueue("otx2_vfaf_mailbox",
+ WQ_HIGHPRI | WQ_MEM_RECLAIM);
if (!vf->mbox_wq)
return -ENOMEM;

--
2.40.0

2023-04-21 02:52:17

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 11/22] net: wwan: t7xx: Use alloc_ordered_workqueue() to create ordered workqueues

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

alloc_workqueue(WQ_UNBOUND | flags, 1, args..)

to

alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not ncessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while conversion
is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Chandrashekar Devegowda <[email protected]>
Cc: Intel Corporation <[email protected]>
Cc: Chiranjeevi Rapolu <[email protected]>
Cc: Liu Haijun <[email protected]>
Cc: M Chetan Kumar <[email protected]>
Cc: Ricardo Martinez <[email protected]>
Cc: Loic Poulain <[email protected]>
Cc: Sergey Ryazanov <[email protected]>
Cc: Johannes Berg <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: [email protected]
---
drivers/net/wwan/t7xx/t7xx_hif_cldma.c | 13 +++++++------
drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c | 5 +++--
2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wwan/t7xx/t7xx_hif_cldma.c b/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
index aec3a18d44bd..7162bf38a8c9 100644
--- a/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
+++ b/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
@@ -1293,9 +1293,9 @@ int t7xx_cldma_init(struct cldma_ctrl *md_ctrl)
for (i = 0; i < CLDMA_TXQ_NUM; i++) {
md_cd_queue_struct_init(&md_ctrl->txq[i], md_ctrl, MTK_TX, i);
md_ctrl->txq[i].worker =
- alloc_workqueue("md_hif%d_tx%d_worker",
- WQ_UNBOUND | WQ_MEM_RECLAIM | (i ? 0 : WQ_HIGHPRI),
- 1, md_ctrl->hif_id, i);
+ alloc_ordered_workqueue("md_hif%d_tx%d_worker",
+ WQ_MEM_RECLAIM | (i ? 0 : WQ_HIGHPRI),
+ md_ctrl->hif_id, i);
if (!md_ctrl->txq[i].worker)
goto err_workqueue;

@@ -1306,9 +1306,10 @@ int t7xx_cldma_init(struct cldma_ctrl *md_ctrl)
md_cd_queue_struct_init(&md_ctrl->rxq[i], md_ctrl, MTK_RX, i);
INIT_WORK(&md_ctrl->rxq[i].cldma_work, t7xx_cldma_rx_done);

- md_ctrl->rxq[i].worker = alloc_workqueue("md_hif%d_rx%d_worker",
- WQ_UNBOUND | WQ_MEM_RECLAIM,
- 1, md_ctrl->hif_id, i);
+ md_ctrl->rxq[i].worker =
+ alloc_ordered_workqueue("md_hif%d_rx%d_worker",
+ WQ_MEM_RECLAIM,
+ md_ctrl->hif_id, i);
if (!md_ctrl->rxq[i].worker)
goto err_workqueue;
}
diff --git a/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c b/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c
index 46514208d4f9..8dab025a088a 100644
--- a/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c
+++ b/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c
@@ -618,8 +618,9 @@ int t7xx_dpmaif_txq_init(struct dpmaif_tx_queue *txq)
return ret;
}

- txq->worker = alloc_workqueue("md_dpmaif_tx%d_worker", WQ_UNBOUND | WQ_MEM_RECLAIM |
- (txq->index ? 0 : WQ_HIGHPRI), 1, txq->index);
+ txq->worker = alloc_ordered_workqueue("md_dpmaif_tx%d_worker",
+ WQ_MEM_RECLAIM | (txq->index ? 0 : WQ_HIGHPRI),
+ txq->index);
if (!txq->worker)
return -ENOMEM;

--
2.40.0

2023-04-21 02:52:24

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 12/22] scsi: Use alloc_ordered_workqueue() to create ordered workqueues

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

alloc_workqueue(WQ_UNBOUND | flags, 1, args..)

to

alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not ncessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while conversion
is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: [email protected]
---
drivers/scsi/NCR5380.c | 5 ++---
drivers/scsi/hosts.c | 12 ++++++------
drivers/scsi/libiscsi.c | 5 ++---
3 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index ca85bddb582b..b18dd4591492 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -415,9 +415,8 @@ static int NCR5380_init(struct Scsi_Host *instance, int flags)
hostdata->flags = flags;

INIT_WORK(&hostdata->main_task, NCR5380_main);
- hostdata->work_q = alloc_workqueue("ncr5380_%d",
- WQ_UNBOUND | WQ_MEM_RECLAIM,
- 1, instance->host_no);
+ hostdata->work_q = alloc_ordered_workqueue("ncr5380_%d",
+ WQ_MEM_RECLAIM, instance->host_no);
if (!hostdata->work_q)
return -ENOMEM;

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 9b6fbbe15d92..30bf9f49ca6c 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -294,9 +294,9 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
if (shost->transportt->create_work_queue) {
snprintf(shost->work_q_name, sizeof(shost->work_q_name),
"scsi_wq_%d", shost->host_no);
- shost->work_q = alloc_workqueue("%s",
- WQ_SYSFS | __WQ_LEGACY | WQ_MEM_RECLAIM | WQ_UNBOUND,
- 1, shost->work_q_name);
+ shost->work_q = alloc_ordered_workqueue("%s",
+ WQ_SYSFS | __WQ_LEGACY | WQ_MEM_RECLAIM,
+ shost->work_q_name);

if (!shost->work_q) {
error = -EINVAL;
@@ -510,9 +510,9 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
goto fail;
}

- shost->tmf_work_q = alloc_workqueue("scsi_tmf_%d",
- WQ_UNBOUND | WQ_MEM_RECLAIM | WQ_SYSFS,
- 1, shost->host_no);
+ shost->tmf_work_q = alloc_ordered_workqueue("scsi_tmf_%d",
+ WQ_MEM_RECLAIM | WQ_SYSFS,
+ shost->host_no);
if (!shost->tmf_work_q) {
shost_printk(KERN_WARNING, shost,
"failed to create tmf workq\n");
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 127f3d7f19dc..d0eba590dc69 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -2907,9 +2907,8 @@ struct Scsi_Host *iscsi_host_alloc(struct scsi_host_template *sht,
ihost = shost_priv(shost);

if (xmit_can_sleep) {
- ihost->workq = alloc_workqueue("iscsi_q_%d",
- WQ_SYSFS | __WQ_LEGACY | WQ_MEM_RECLAIM | WQ_UNBOUND,
- 1, shost->host_no);
+ ihost->workq = alloc_ordered_workqueue("iscsi_q_%d",
+ WQ_SYSFS | __WQ_LEGACY | WQ_MEM_RECLAIM, shost->host_no);
if (!ihost->workq)
goto free_host;
}
--
2.40.0

2023-04-21 02:52:48

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 13/22] virt: acrn: Use alloc_ordered_workqueue() to create ordered workqueues

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

alloc_workqueue(WQ_UNBOUND | flags, 1, args..)

to

alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not ncessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while conversion
is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Fei Li <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
---
drivers/virt/acrn/ioreq.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/virt/acrn/ioreq.c b/drivers/virt/acrn/ioreq.c
index d75ab3f66da4..cecdc1c13af7 100644
--- a/drivers/virt/acrn/ioreq.c
+++ b/drivers/virt/acrn/ioreq.c
@@ -576,8 +576,8 @@ static void ioreq_resume(void)
int acrn_ioreq_intr_setup(void)
{
acrn_setup_intr_handler(ioreq_intr_handler);
- ioreq_wq = alloc_workqueue("ioreq_wq",
- WQ_HIGHPRI | WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
+ ioreq_wq = alloc_ordered_workqueue("ioreq_wq",
+ WQ_HIGHPRI | WQ_MEM_RECLAIM);
if (!ioreq_wq) {
dev_err(acrn_dev.this_device, "Failed to alloc workqueue!\n");
acrn_remove_intr_handler();
--
2.40.0

2023-04-21 02:52:49

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 16/22] btrfs: Use alloc_ordered_workqueue() to create ordered workqueues

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

alloc_workqueue(WQ_UNBOUND | flags, 1, args..)

to

alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not ncessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while conversion
is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Chris Mason <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: David Sterba <[email protected]>
Cc: [email protected]
---
fs/btrfs/disk-io.c | 2 +-
fs/btrfs/scrub.c | 6 ++++--
2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 9e1596bb208d..b1f6ff69dbe1 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2218,7 +2218,7 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
fs_info->qgroup_rescan_workers =
btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);
fs_info->discard_ctl.discard_workers =
- alloc_workqueue("btrfs_discard", WQ_UNBOUND | WQ_FREEZABLE, 1);
+ alloc_ordered_workqueue("btrfs_discard", WQ_FREEZABLE);

if (!(fs_info->workers && fs_info->hipri_workers &&
fs_info->delalloc_workers && fs_info->flush_workers &&
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 69c93ae333f6..70882358bdb0 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -4245,8 +4245,10 @@ static noinline_for_stack int scrub_workers_get(struct btrfs_fs_info *fs_info,
if (refcount_inc_not_zero(&fs_info->scrub_workers_refcnt))
return 0;

- scrub_workers = alloc_workqueue("btrfs-scrub", flags,
- is_dev_replace ? 1 : max_active);
+ if (is_dev_replace)
+ scrub_workers = alloc_ordered_workqueue("btrfs-scrub", flags);
+ else
+ scrub_workers = alloc_workqueue("btrfs-scrub", flags, max_active);
if (!scrub_workers)
goto fail_scrub_workers;

--
2.40.0

2023-04-21 02:52:53

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 15/22] xen/pvcalls: Use alloc_ordered_workqueue() to create ordered workqueues

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

alloc_workqueue(WQ_UNBOUND | flags, 1, args..)

to

alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not ncessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while conversion
is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Oleksandr Tyshchenko <[email protected]>
Cc: [email protected]
---
drivers/xen/pvcalls-back.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c
index 1f5219e12cc3..b41516f3f84a 100644
--- a/drivers/xen/pvcalls-back.c
+++ b/drivers/xen/pvcalls-back.c
@@ -361,7 +361,7 @@ static struct sock_mapping *pvcalls_new_active_socket(
map->data.in = map->bytes;
map->data.out = map->bytes + XEN_FLEX_RING_SIZE(map->ring_order);

- map->ioworker.wq = alloc_workqueue("pvcalls_io", WQ_UNBOUND, 1);
+ map->ioworker.wq = alloc_ordered_workqueue("pvcalls_io", 0);
if (!map->ioworker.wq)
goto out;
atomic_set(&map->io, 1);
@@ -637,7 +637,7 @@ static int pvcalls_back_bind(struct xenbus_device *dev,

INIT_WORK(&map->register_work, __pvcalls_back_accept);
spin_lock_init(&map->copy_lock);
- map->wq = alloc_workqueue("pvcalls_wq", WQ_UNBOUND, 1);
+ map->wq = alloc_ordered_workqueue("pvcalls_wq", 0);
if (!map->wq) {
ret = -ENOMEM;
goto out;
--
2.40.0

2023-04-21 02:52:56

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 14/22] soc: qcom: qmi: Use alloc_ordered_workqueue() to create ordered workqueues

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

alloc_workqueue(WQ_UNBOUND | flags, 1, args..)

to

alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not ncessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while conversion
is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Andy Gross <[email protected]>
Cc: Bjorn Andersson <[email protected]>
Cc: Konrad Dybcio <[email protected]>
Cc: [email protected]
---
drivers/soc/qcom/qmi_interface.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/qcom/qmi_interface.c b/drivers/soc/qcom/qmi_interface.c
index 820bdd9f8e46..78d7361fdcf2 100644
--- a/drivers/soc/qcom/qmi_interface.c
+++ b/drivers/soc/qcom/qmi_interface.c
@@ -650,7 +650,7 @@ int qmi_handle_init(struct qmi_handle *qmi, size_t recv_buf_size,
if (!qmi->recv_buf)
return -ENOMEM;

- qmi->wq = alloc_workqueue("qmi_msg_handler", WQ_UNBOUND, 1);
+ qmi->wq = alloc_ordered_workqueue("qmi_msg_handler", 0);
if (!qmi->wq) {
ret = -ENOMEM;
goto err_free_recv_buf;
--
2.40.0

2023-04-21 02:52:58

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 17/22] cifs: Use alloc_ordered_workqueue() to create ordered workqueues

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

alloc_workqueue(WQ_UNBOUND | flags, 1, args..)

to

alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not ncessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while conversion
is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Steve French <[email protected]>
Cc: Paulo Alcantara <[email protected]>
Cc: Ronnie Sahlberg <[email protected]>
Cc: Shyam Prasad N <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
fs/cifs/dfs_cache.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index 30cbdf8514a5..aece6d51bce7 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -290,7 +290,7 @@ int dfs_cache_init(void)
int rc;
int i;

- dfscache_wq = alloc_workqueue("cifs-dfscache", WQ_FREEZABLE | WQ_UNBOUND, 1);
+ dfscache_wq = alloc_ordered_workqueue("cifs-dfscache", WQ_FREEZABLE);
if (!dfscache_wq)
return -ENOMEM;

--
2.40.0

2023-04-21 02:53:02

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 18/22] net: qrtr: Use alloc_ordered_workqueue() to create ordered workqueues

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

alloc_workqueue(WQ_UNBOUND | flags, 1, args..)

to

alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not ncessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while conversion
is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Manivannan Sadhasivam <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
net/qrtr/ns.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/qrtr/ns.c b/net/qrtr/ns.c
index 0f25a386138c..0f7a729f1a1f 100644
--- a/net/qrtr/ns.c
+++ b/net/qrtr/ns.c
@@ -783,7 +783,7 @@ int qrtr_ns_init(void)
goto err_sock;
}

- qrtr_ns.workqueue = alloc_workqueue("qrtr_ns_handler", WQ_UNBOUND, 1);
+ qrtr_ns.workqueue = alloc_ordered_workqueue("qrtr_ns_handler", 0);
if (!qrtr_ns.workqueue) {
ret = -ENOMEM;
goto err_sock;
--
2.40.0

2023-04-21 03:01:41

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 21/22] media: coda: Use alloc_ordered_workqueue() to create ordered workqueues

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

alloc_workqueue(WQ_UNBOUND | flags, 1, args..)

to

alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not ncessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while conversion
is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Philipp Zabel <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: [email protected]
---
drivers/media/platform/chips-media/coda-common.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/platform/chips-media/coda-common.c b/drivers/media/platform/chips-media/coda-common.c
index af71eea04dbd..c8ecfe760028 100644
--- a/drivers/media/platform/chips-media/coda-common.c
+++ b/drivers/media/platform/chips-media/coda-common.c
@@ -3268,7 +3268,7 @@ static int coda_probe(struct platform_device *pdev)
&dev->iram.blob);
}

- dev->workqueue = alloc_workqueue("coda", WQ_UNBOUND | WQ_MEM_RECLAIM, 1);
+ dev->workqueue = alloc_ordered_workqueue("coda", WQ_MEM_RECLAIM);
if (!dev->workqueue) {
dev_err(&pdev->dev, "unable to alloc workqueue\n");
ret = -ENOMEM;
--
2.40.0

2023-04-21 03:02:29

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 20/22] crypto: octeontx2: Use alloc_ordered_workqueue() to create ordered workqueues

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

alloc_workqueue(WQ_UNBOUND | flags, 1, args..)

to

alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not ncessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while conversion
is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Boris Brezillon <[email protected]>
Cc: Arnaud Ebalard <[email protected]>
Cc: Srujana Challa <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Shijith Thotton <[email protected]>
Cc: Vladis Dronov <[email protected]>
Cc: Vincent Mailhol <[email protected]>
Cc: Wolfram Sang <[email protected]>
Cc: Alexander Lobakin <[email protected]>
Cc: Minghao Chi <[email protected]>
Cc: ye xingchen <[email protected]>
Cc: [email protected]
---
drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c | 12 ++++++------
drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c | 6 +++---
2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c b/drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c
index ddf6e913c1c4..30e6acfc93d9 100644
--- a/drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c
+++ b/drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c
@@ -357,9 +357,9 @@ static int cptpf_vfpf_mbox_init(struct otx2_cptpf_dev *cptpf, int num_vfs)
u64 vfpf_mbox_base;
int err, i;

- cptpf->vfpf_mbox_wq = alloc_workqueue("cpt_vfpf_mailbox",
- WQ_UNBOUND | WQ_HIGHPRI |
- WQ_MEM_RECLAIM, 1);
+ cptpf->vfpf_mbox_wq =
+ alloc_ordered_workqueue("cpt_vfpf_mailbox",
+ WQ_HIGHPRI | WQ_MEM_RECLAIM);
if (!cptpf->vfpf_mbox_wq)
return -ENOMEM;

@@ -453,9 +453,9 @@ static int cptpf_afpf_mbox_init(struct otx2_cptpf_dev *cptpf)
resource_size_t offset;
int err;

- cptpf->afpf_mbox_wq = alloc_workqueue("cpt_afpf_mailbox",
- WQ_UNBOUND | WQ_HIGHPRI |
- WQ_MEM_RECLAIM, 1);
+ cptpf->afpf_mbox_wq =
+ alloc_ordered_workqueue("cpt_afpf_mailbox",
+ WQ_HIGHPRI | WQ_MEM_RECLAIM);
if (!cptpf->afpf_mbox_wq)
return -ENOMEM;

diff --git a/drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c b/drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c
index 392e9fee05e8..6023a7adb70c 100644
--- a/drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c
+++ b/drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c
@@ -75,9 +75,9 @@ static int cptvf_pfvf_mbox_init(struct otx2_cptvf_dev *cptvf)
resource_size_t offset, size;
int ret;

- cptvf->pfvf_mbox_wq = alloc_workqueue("cpt_pfvf_mailbox",
- WQ_UNBOUND | WQ_HIGHPRI |
- WQ_MEM_RECLAIM, 1);
+ cptvf->pfvf_mbox_wq =
+ alloc_ordered_workqueue("cpt_pfvf_mailbox",
+ WQ_HIGHPRI | WQ_MEM_RECLAIM);
if (!cptvf->pfvf_mbox_wq)
return -ENOMEM;

--
2.40.0

2023-04-21 03:04:33

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 22/22] workqueue: Don't implicitly make UNBOUND workqueues w/ @max_active==1 ordered

5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
automoatically promoted UNBOUND workqueues w/ @max_active==1 to ordered
workqueues because UNBOUND workqueues w/ @max_active==1 used to be the way
to create ordered workqueues and the new NUMA support broke it. These
problems can be subtle and the fact that they can only trigger on NUMA
machines made them even more difficult to debug.

However, overloading the UNBOUND allocation interface this way creates other
issues. It's difficult to tell whether a given workqueue actually needs to
be ordered and users that legitimately want a min concurrency level wq
unexpectedly gets an ordered one instead. With planned UNBOUND workqueue
udpates to improve execution locality and more prevalence of chiplet designs
which can benefit from such improvements, this isn't a state we wanna be in
forever.

There aren't that many UNBOUND w/ @max_active==1 users in the tree and the
preceding patches audited all and converted them to
alloc_ordered_workqueue() as appropriate. This patch removes the implicit
promotion of UNBOUND w/ @max_active==1 workqueues to ordered ones.

Workqueue will also add a debug option to make all unordered UNBOUND
workqueues to use per-cpu pool_workqueues so that these problems can be
surfaced easier on most machines.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Christoph Hellwig <[email protected]>
---
include/linux/workqueue.h | 4 +---
kernel/workqueue.c | 23 ++++-------------------
2 files changed, 5 insertions(+), 22 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index ac551b8ee7d9..e547a90f0328 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -339,7 +339,6 @@ enum {
__WQ_DRAINING = 1 << 16, /* internal: workqueue is draining */
__WQ_ORDERED = 1 << 17, /* internal: workqueue is ordered */
__WQ_LEGACY = 1 << 18, /* internal: create*_workqueue() */
- __WQ_ORDERED_EXPLICIT = 1 << 19, /* internal: alloc_ordered_workqueue() */

WQ_MAX_ACTIVE = 512, /* I like 512, better ideas? */
WQ_MAX_UNBOUND_PER_CPU = 4, /* 4 * #cpus for unbound wq */
@@ -417,8 +416,7 @@ alloc_workqueue(const char *fmt, unsigned int flags, int max_active, ...);
* Pointer to the allocated workqueue on success, %NULL on failure.
*/
#define alloc_ordered_workqueue(fmt, flags, args...) \
- alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | \
- __WQ_ORDERED_EXPLICIT | (flags), 1, ##args)
+ alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | (flags), 1, ##args)

#define create_workqueue(name) \
alloc_workqueue("%s", __WQ_LEGACY | WQ_MEM_RECLAIM, 1, (name))
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index b8b541caed48..00bdcc3c5b36 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4180,12 +4180,8 @@ static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
return -EINVAL;

/* creating multiple pwqs breaks ordering guarantee */
- if (!list_empty(&wq->pwqs)) {
- if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))
- return -EINVAL;
-
- wq->flags &= ~__WQ_ORDERED;
- }
+ if (WARN_ON(wq->flags & __WQ_ORDERED))
+ return -EINVAL;

ctx = apply_wqattrs_prepare(wq, attrs, wq_unbound_cpumask);
if (!ctx)
@@ -4408,16 +4404,6 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
struct workqueue_struct *wq;
struct pool_workqueue *pwq;

- /*
- * Unbound && max_active == 1 used to imply ordered, which is no
- * longer the case on NUMA machines due to per-node pools. While
- * alloc_ordered_workqueue() is the right way to create an ordered
- * workqueue, keep the previous behavior to avoid subtle breakages
- * on NUMA.
- */
- if ((flags & WQ_UNBOUND) && max_active == 1)
- flags |= __WQ_ORDERED;
-
/* see the comment above the definition of WQ_POWER_EFFICIENT */
if ((flags & WQ_POWER_EFFICIENT) && wq_power_efficient)
flags |= WQ_UNBOUND;
@@ -4625,14 +4611,13 @@ void workqueue_set_max_active(struct workqueue_struct *wq, int max_active)
struct pool_workqueue *pwq;

/* disallow meddling with max_active for ordered workqueues */
- if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))
+ if (WARN_ON(wq->flags & __WQ_ORDERED))
return;

max_active = wq_clamp_max_active(max_active, wq->flags, wq->name);

mutex_lock(&wq->mutex);

- wq->flags &= ~__WQ_ORDERED;
wq->saved_max_active = max_active;

for_each_pwq(pwq, wq)
@@ -5868,7 +5853,7 @@ int workqueue_sysfs_register(struct workqueue_struct *wq)
* attributes breaks ordering guarantee. Disallow exposing ordered
* workqueues.
*/
- if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))
+ if (WARN_ON(wq->flags & __WQ_ORDERED))
return -EINVAL;

wq->wq_dev = wq_dev = kzalloc(sizeof(*wq_dev), GFP_KERNEL);
--
2.40.0

2023-04-21 03:08:17

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 19/22] rxrpc: Use alloc_ordered_workqueue() to create ordered workqueues

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

alloc_workqueue(WQ_UNBOUND | flags, 1, args..)

to

alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not ncessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while conversion
is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <[email protected]>
Cc: David Howells <[email protected]>
Cc: Marc Dionne <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
net/rxrpc/af_rxrpc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 102f5cbff91a..e1822c12990d 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -972,7 +972,7 @@ static int __init af_rxrpc_init(void)
goto error_call_jar;
}

- rxrpc_workqueue = alloc_workqueue("krxrpcd", WQ_HIGHPRI | WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
+ rxrpc_workqueue = alloc_ordered_workqueue("krxrpcd", WQ_HIGHPRI | WQ_MEM_RECLAIM);
if (!rxrpc_workqueue) {
pr_notice("Failed to allocate work queue\n");
goto error_work_queue;
--
2.40.0

2023-04-21 05:10:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 13/22] virt: acrn: Use alloc_ordered_workqueue() to create ordered workqueues

On Thu, Apr 20, 2023 at 04:50:37PM -1000, Tejun Heo wrote:
> BACKGROUND
> ==========
>
> When multiple work items are queued to a workqueue, their execution order
> doesn't match the queueing order. They may get executed in any order and
> simultaneously. When fully serialized execution - one by one in the queueing
> order - is needed, an ordered workqueue should be used which can be created
> with alloc_ordered_workqueue().
>
> However, alloc_ordered_workqueue() was a later addition. Before it, an
> ordered workqueue could be obtained by creating an UNBOUND workqueue with
> @max_active==1. This originally was an implementation side-effect which was
> broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
> ordered"). Because there were users that depended on the ordered execution,
> 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
> made workqueue allocation path to implicitly promote UNBOUND workqueues w/
> @max_active==1 to ordered workqueues.
>
> While this has worked okay, overloading the UNBOUND allocation interface
> this way creates other issues. It's difficult to tell whether a given
> workqueue actually needs to be ordered and users that legitimately want a
> min concurrency level wq unexpectedly gets an ordered one instead. With
> planned UNBOUND workqueue updates to improve execution locality and more
> prevalence of chiplet designs which can benefit from such improvements, this
> isn't a state we wanna be in forever.
>
> This patch series audits all callsites that create an UNBOUND workqueue w/
> @max_active==1 and converts them to alloc_ordered_workqueue() as necessary.
>
> WHAT TO LOOK FOR
> ================
>
> The conversions are from
>
> alloc_workqueue(WQ_UNBOUND | flags, 1, args..)
>
> to
>
> alloc_ordered_workqueue(flags, args...)
>
> which don't cause any functional changes. If you know that fully ordered
> execution is not ncessary, please let me know. I'll drop the conversion and
> instead add a comment noting the fact to reduce confusion while conversion
> is in progress.
>
> If you aren't fully sure, it's completely fine to let the conversion
> through. The behavior will stay exactly the same and we can always
> reconsider later.
>
> As there are follow-up workqueue core changes, I'd really appreciate if the
> patch can be routed through the workqueue tree w/ your acks. Thanks.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Fei Li <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> ---
> drivers/virt/acrn/ioreq.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)

Acked-by: Greg Kroah-Hartman <[email protected]>

2023-04-21 06:20:34

by Sunil Kovvuri Goutham

[permalink] [raw]
Subject: RE: [EXT] [PATCH 07/22] net: octeontx2: Use alloc_ordered_workqueue() to create ordered workqueues



> -----Original Message-----
> From: Tejun Heo <[email protected]> On Behalf Of Tejun Heo
> Sent: Friday, April 21, 2023 8:21 AM
> To: [email protected]
> Cc: [email protected]; [email protected]; Tejun Heo
> <[email protected]>; David S. Miller <[email protected]>; Eric Dumazet
> <[email protected]>; Jakub Kicinski <[email protected]>; Paolo Abeni
> <[email protected]>; Sunil Kovvuri Goutham <[email protected]>;
> Ratheesh Kannoth <[email protected]>; Srujana Challa
> <[email protected]>; Geethasowjanya Akula <[email protected]>;
> [email protected]
> Subject: [EXT] [PATCH 07/22] net: octeontx2: Use alloc_ordered_workqueue()
> to create ordered workqueues
>
> External Email
>
> ----------------------------------------------------------------------
> BACKGROUND
> ==========
>
> When multiple work items are queued to a workqueue, their execution order
> doesn't match the queueing order. They may get executed in any order and
> simultaneously. When fully serialized execution - one by one in the queueing
> order - is needed, an ordered workqueue should be used which can be created
> with alloc_ordered_workqueue().
>
> However, alloc_ordered_workqueue() was a later addition. Before it, an ordered
> workqueue could be obtained by creating an UNBOUND workqueue with
> @max_active==1. This originally was an implementation side-effect which was
> broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1
> to be ordered"). Because there were users that depended on the ordered
> execution,
> 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
> ordered") made workqueue allocation path to implicitly promote UNBOUND
> workqueues w/
> @max_active==1 to ordered workqueues.
>
> While this has worked okay, overloading the UNBOUND allocation interface this
> way creates other issues. It's difficult to tell whether a given workqueue actually
> needs to be ordered and users that legitimately want a min concurrency level wq
> unexpectedly gets an ordered one instead. With planned UNBOUND workqueue
> updates to improve execution locality and more prevalence of chiplet designs
> which can benefit from such improvements, this isn't a state we wanna be in
> forever.
>
> This patch series audits all callsites that create an UNBOUND workqueue w/
> @max_active==1 and converts them to alloc_ordered_workqueue() as
> necessary.
>
> WHAT TO LOOK FOR
> ================
>
> The conversions are from
>
> alloc_workqueue(WQ_UNBOUND | flags, 1, args..)
>
> to
>
> alloc_ordered_workqueue(flags, args...)
>
> which don't cause any functional changes. If you know that fully ordered
> execution is not ncessary, please let me know. I'll drop the conversion and
> instead add a comment noting the fact to reduce confusion while conversion is
> in progress.
>
> If you aren't fully sure, it's completely fine to let the conversion through. The
> behavior will stay exactly the same and we can always reconsider later.
>
> As there are follow-up workqueue core changes, I'd really appreciate if the
> patch can be routed through the workqueue tree w/ your acks. Thanks.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Cc: Eric Dumazet <[email protected]>
> Cc: Jakub Kicinski <[email protected]>
> Cc: Paolo Abeni <[email protected]>
> Cc: Sunil Goutham <[email protected]>
> Cc: Ratheesh Kannoth <[email protected]>
> Cc: Srujana Challa <[email protected]>
> Cc: Geetha sowjanya <[email protected]>
> Cc: [email protected]
> ---
> drivers/net/ethernet/marvell/octeontx2/af/rvu.c | 5 ++---
> .../net/ethernet/marvell/octeontx2/nic/otx2_pf.c | 13 +++++--------
> .../net/ethernet/marvell/octeontx2/nic/otx2_vf.c | 5 ++---
> 3 files changed, 9 insertions(+), 14 deletions(-)
>
Reviewed-by: Sunil Goutham <[email protected]>

2023-04-21 06:24:01

by Sunil Kovvuri Goutham

[permalink] [raw]
Subject: RE: [EXT] [PATCH 06/22] net: thunderx: Use alloc_ordered_workqueue() to create ordered workqueues



> -----Original Message-----
> From: Tejun Heo <[email protected]> On Behalf Of Tejun Heo
> Sent: Friday, April 21, 2023 8:21 AM
> To: [email protected]
> Cc: [email protected]; [email protected]; Tejun Heo
> <[email protected]>; Sunil Kovvuri Goutham <[email protected]>; David S.
> Miller <[email protected]>; Eric Dumazet <[email protected]>; Jakub
> Kicinski <[email protected]>; Paolo Abeni <[email protected]>; linux-arm-
> [email protected]; [email protected]
> Subject: [EXT] [PATCH 06/22] net: thunderx: Use alloc_ordered_workqueue() to
> create ordered workqueues
>
> External Email
>
> ----------------------------------------------------------------------
> BACKGROUND
> ==========
>
> When multiple work items are queued to a workqueue, their execution order
> doesn't match the queueing order. They may get executed in any order and
> simultaneously. When fully serialized execution - one by one in the queueing
> order - is needed, an ordered workqueue should be used which can be created
> with alloc_ordered_workqueue().
>
> However, alloc_ordered_workqueue() was a later addition. Before it, an ordered
> workqueue could be obtained by creating an UNBOUND workqueue with
> @max_active==1. This originally was an implementation side-effect which was
> broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1
> to be ordered"). Because there were users that depended on the ordered
> execution,
> 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
> ordered") made workqueue allocation path to implicitly promote UNBOUND
> workqueues w/
> @max_active==1 to ordered workqueues.
>
> While this has worked okay, overloading the UNBOUND allocation interface this
> way creates other issues. It's difficult to tell whether a given workqueue actually
> needs to be ordered and users that legitimately want a min concurrency level wq
> unexpectedly gets an ordered one instead. With planned UNBOUND workqueue
> updates to improve execution locality and more prevalence of chiplet designs
> which can benefit from such improvements, this isn't a state we wanna be in
> forever.
>
> This patch series audits all callsites that create an UNBOUND workqueue w/
> @max_active==1 and converts them to alloc_ordered_workqueue() as
> necessary.
>
> WHAT TO LOOK FOR
> ================
>
> The conversions are from
>
> alloc_workqueue(WQ_UNBOUND | flags, 1, args..)
>
> to
>
> alloc_ordered_workqueue(flags, args...)
>
> which don't cause any functional changes. If you know that fully ordered
> execution is not ncessary, please let me know. I'll drop the conversion and
> instead add a comment noting the fact to reduce confusion while conversion is
> in progress.
>
> If you aren't fully sure, it's completely fine to let the conversion through. The
> behavior will stay exactly the same and we can always reconsider later.
>
> As there are follow-up workqueue core changes, I'd really appreciate if the
> patch can be routed through the workqueue tree w/ your acks. Thanks.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Sunil Goutham <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Cc: Eric Dumazet <[email protected]>
> Cc: Jakub Kicinski <[email protected]>
> Cc: Paolo Abeni <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
> b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
> index 7eb2ddbe9bad..a317feb8decb 100644
> --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
> +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
> @@ -1126,8 +1126,7 @@ static int bgx_lmac_enable(struct bgx *bgx, u8
> lmacid)
> }
>
> poll:
> - lmac->check_link = alloc_workqueue("check_link", WQ_UNBOUND |
> - WQ_MEM_RECLAIM, 1);
> + lmac->check_link = alloc_ordered_workqueue("check_link",
> +WQ_MEM_RECLAIM);
> if (!lmac->check_link)
> return -ENOMEM;
> INIT_DELAYED_WORK(&lmac->dwork, bgx_poll_for_link);
> --
> 2.40.0

Reviewed-by: Sunil Goutham <[email protected]>

2023-04-21 14:03:56

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH 06/22] net: thunderx: Use alloc_ordered_workqueue() to create ordered workqueues

On Thu, 20 Apr 2023 16:50:30 -1000 Tejun Heo wrote:
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Sunil Goutham <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Cc: Eric Dumazet <[email protected]>
> Cc: Jakub Kicinski <[email protected]>
> Cc: Paolo Abeni <[email protected]>
> Cc: [email protected]
> Cc: [email protected]

You take this via your tree directly to Linus T?

2023-04-21 14:14:35

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 06/22] net: thunderx: Use alloc_ordered_workqueue() to create ordered workqueues

On Fri, Apr 21, 2023 at 07:01:08AM -0700, Jakub Kicinski wrote:
> On Thu, 20 Apr 2023 16:50:30 -1000 Tejun Heo wrote:
> > Signed-off-by: Tejun Heo <[email protected]>
> > Cc: Sunil Goutham <[email protected]>
> > Cc: "David S. Miller" <[email protected]>
> > Cc: Eric Dumazet <[email protected]>
> > Cc: Jakub Kicinski <[email protected]>
> > Cc: Paolo Abeni <[email protected]>
> > Cc: [email protected]
> > Cc: [email protected]
>
> You take this via your tree directly to Linus T?

Yeah, that'd be my preference unless someone is really against it.

Thanks.

--
tejun

2023-04-21 14:28:02

by Benjamin Block

[permalink] [raw]
Subject: Re: [PATCH 12/22] scsi: Use alloc_ordered_workqueue() to create ordered workqueues

On Thu, Apr 20, 2023 at 04:50:36PM -1000, Tejun Heo wrote:
> BACKGROUND
> ==========
>
> When multiple work items are queued to a workqueue, their execution order
> doesn't match the queueing order. They may get executed in any order and
> simultaneously. When fully serialized execution - one by one in the queueing
> order - is needed, an ordered workqueue should be used which can be created
> with alloc_ordered_workqueue().
>
> However, alloc_ordered_workqueue() was a later addition. Before it, an
> ordered workqueue could be obtained by creating an UNBOUND workqueue with
> @max_active==1. This originally was an implementation side-effect which was
> broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
> ordered"). Because there were users that depended on the ordered execution,
> 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
> made workqueue allocation path to implicitly promote UNBOUND workqueues w/
> @max_active==1 to ordered workqueues.
>
> While this has worked okay, overloading the UNBOUND allocation interface
> this way creates other issues. It's difficult to tell whether a given
> workqueue actually needs to be ordered and users that legitimately want a
> min concurrency level wq unexpectedly gets an ordered one instead. With
> planned UNBOUND workqueue updates to improve execution locality and more
> prevalence of chiplet designs which can benefit from such improvements, this
> isn't a state we wanna be in forever.
>
> This patch series audits all callsites that create an UNBOUND workqueue w/
> @max_active==1 and converts them to alloc_ordered_workqueue() as necessary.
>
> WHAT TO LOOK FOR
> ================
>
> The conversions are from
>
> alloc_workqueue(WQ_UNBOUND | flags, 1, args..)
>
> to
>
> alloc_ordered_workqueue(flags, args...)
>
> which don't cause any functional changes. If you know that fully ordered
> execution is not ncessary, please let me know. I'll drop the conversion and
> instead add a comment noting the fact to reduce confusion while conversion
> is in progress.
>
> If you aren't fully sure, it's completely fine to let the conversion
> through. The behavior will stay exactly the same and we can always
> reconsider later.
>
> As there are follow-up workqueue core changes, I'd really appreciate if the
> patch can be routed through the workqueue tree w/ your acks. Thanks.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: "James E.J. Bottomley" <[email protected]>
> Cc: "Martin K. Petersen" <[email protected]>
> Cc: [email protected]
> ---
> drivers/scsi/NCR5380.c | 5 ++---
> drivers/scsi/hosts.c | 12 ++++++------
> drivers/scsi/libiscsi.c | 5 ++---
> 3 files changed, 10 insertions(+), 12 deletions(-)
>

The conversions look good to me.


Reviewed-by: Benjamin Block <[email protected]>

--
Best Regards, Benjamin Block / Linux on IBM Z Kernel Development
IBM Deutschland Research & Development GmbH / https://www.ibm.com/privacy
Vors. Aufs.-R.: Gregor Pillen / Gesch?ftsf?hrung: David Faller
Sitz der Ges.: B?blingen / Registergericht: AmtsG Stuttgart, HRB 243294

2023-04-21 14:30:15

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH 06/22] net: thunderx: Use alloc_ordered_workqueue() to create ordered workqueues

On Fri, 21 Apr 2023 04:13:20 -1000 Tejun Heo wrote:
> On Fri, Apr 21, 2023 at 07:01:08AM -0700, Jakub Kicinski wrote:
> > On Thu, 20 Apr 2023 16:50:30 -1000 Tejun Heo wrote:
> > > Signed-off-by: Tejun Heo <[email protected]>
> > > Cc: Sunil Goutham <[email protected]>
> > > Cc: "David S. Miller" <[email protected]>
> > > Cc: Eric Dumazet <[email protected]>
> > > Cc: Jakub Kicinski <[email protected]>
> > > Cc: Paolo Abeni <[email protected]>
> > > Cc: [email protected]
> > > Cc: [email protected]
> >
> > You take this via your tree directly to Linus T?
>
> Yeah, that'd be my preference unless someone is really against it.

Acked-by: Jakub Kicinski <[email protected]>

2023-04-21 18:51:54

by Paulo Alcantara

[permalink] [raw]
Subject: Re: [PATCH 17/22] cifs: Use alloc_ordered_workqueue() to create ordered workqueues

Tejun Heo <[email protected]> writes:

> BACKGROUND
> ==========
>
> When multiple work items are queued to a workqueue, their execution order
> doesn't match the queueing order. They may get executed in any order and
> simultaneously. When fully serialized execution - one by one in the queueing
> order - is needed, an ordered workqueue should be used which can be created
> with alloc_ordered_workqueue().
>
> However, alloc_ordered_workqueue() was a later addition. Before it, an
> ordered workqueue could be obtained by creating an UNBOUND workqueue with
> @max_active==1. This originally was an implementation side-effect which was
> broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
> ordered"). Because there were users that depended on the ordered execution,
> 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
> made workqueue allocation path to implicitly promote UNBOUND workqueues w/
> @max_active==1 to ordered workqueues.
>
> While this has worked okay, overloading the UNBOUND allocation interface
> this way creates other issues. It's difficult to tell whether a given
> workqueue actually needs to be ordered and users that legitimately want a
> min concurrency level wq unexpectedly gets an ordered one instead. With
> planned UNBOUND workqueue updates to improve execution locality and more
> prevalence of chiplet designs which can benefit from such improvements, this
> isn't a state we wanna be in forever.
>
> This patch series audits all callsites that create an UNBOUND workqueue w/
> @max_active==1 and converts them to alloc_ordered_workqueue() as necessary.
>
> WHAT TO LOOK FOR
> ================
>
> The conversions are from
>
> alloc_workqueue(WQ_UNBOUND | flags, 1, args..)
>
> to
>
> alloc_ordered_workqueue(flags, args...)
>
> which don't cause any functional changes. If you know that fully ordered
> execution is not ncessary, please let me know. I'll drop the conversion and
> instead add a comment noting the fact to reduce confusion while conversion
> is in progress.
>
> If you aren't fully sure, it's completely fine to let the conversion
> through. The behavior will stay exactly the same and we can always
> reconsider later.
>
> As there are follow-up workqueue core changes, I'd really appreciate if the
> patch can be routed through the workqueue tree w/ your acks. Thanks.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Steve French <[email protected]>
> Cc: Paulo Alcantara <[email protected]>
> Cc: Ronnie Sahlberg <[email protected]>
> Cc: Shyam Prasad N <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> fs/cifs/dfs_cache.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Paulo Alcantara (SUSE) <[email protected]>

2023-04-24 05:18:59

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 22/22] workqueue: Don't implicitly make UNBOUND workqueues w/ @max_active==1 ordered

On Thu, Apr 20, 2023 at 04:50:46PM -1000, Tejun Heo wrote:
> There aren't that many UNBOUND w/ @max_active==1 users in the tree and the
> preceding patches audited all and converted them to
> alloc_ordered_workqueue() as appropriate. This patch removes the implicit
> promotion of UNBOUND w/ @max_active==1 workqueues to ordered ones.

And because you only Cc'ed me on this patch but not the rest of the
series there is no way to make me see it. Either cc me everywhere
because it's actually important, or preferably just drop fringe Ccs.

2023-04-30 04:56:30

by Wang Yugui

[permalink] [raw]
Subject: Re: [PATCH 16/22] btrfs: Use alloc_ordered_workqueue() to create ordered workqueues

Hi,

> BACKGROUND
> ==========
>
> When multiple work items are queued to a workqueue, their execution order
> doesn't match the queueing order. They may get executed in any order and
> simultaneously. When fully serialized execution - one by one in the queueing
> order - is needed, an ordered workqueue should be used which can be created
> with alloc_ordered_workqueue().
>
> However, alloc_ordered_workqueue() was a later addition. Before it, an
> ordered workqueue could be obtained by creating an UNBOUND workqueue with
> @max_active==1. This originally was an implementation side-effect which was
> broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
> ordered"). Because there were users that depended on the ordered execution,
> 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
> made workqueue allocation path to implicitly promote UNBOUND workqueues w/
> @max_active==1 to ordered workqueues.
>
> While this has worked okay, overloading the UNBOUND allocation interface
> this way creates other issues. It's difficult to tell whether a given
> workqueue actually needs to be ordered and users that legitimately want a
> min concurrency level wq unexpectedly gets an ordered one instead. With
> planned UNBOUND workqueue updates to improve execution locality and more
> prevalence of chiplet designs which can benefit from such improvements, this
> isn't a state we wanna be in forever.
>
> This patch series audits all callsites that create an UNBOUND workqueue w/
> @max_active==1 and converts them to alloc_ordered_workqueue() as necessary.
>
> WHAT TO LOOK FOR
> ================
>
> The conversions are from
>
> alloc_workqueue(WQ_UNBOUND | flags, 1, args..)
>
> to
>
> alloc_ordered_workqueue(flags, args...)
>
> which don't cause any functional changes. If you know that fully ordered
> execution is not ncessary, please let me know. I'll drop the conversion and
> instead add a comment noting the fact to reduce confusion while conversion
> is in progress.
>
> If you aren't fully sure, it's completely fine to let the conversion
> through. The behavior will stay exactly the same and we can always
> reconsider later.
>
> As there are follow-up workqueue core changes, I'd really appreciate if the
> patch can be routed through the workqueue tree w/ your acks. Thanks.

Should we add alloc_ordered_workqueue to
btrfs_alloc_workqueue() of fs/btrfs/async-thread.c too?

Best Regards
Wang Yugui ([email protected])
2023/04/30

>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Chris Mason <[email protected]>
> Cc: Josef Bacik <[email protected]>
> Cc: David Sterba <[email protected]>
> Cc: [email protected]
> ---
> fs/btrfs/disk-io.c | 2 +-
> fs/btrfs/scrub.c | 6 ++++--
> 2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 9e1596bb208d..b1f6ff69dbe1 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2218,7 +2218,7 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
> fs_info->qgroup_rescan_workers =
> btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);
> fs_info->discard_ctl.discard_workers =
> - alloc_workqueue("btrfs_discard", WQ_UNBOUND | WQ_FREEZABLE, 1);
> + alloc_ordered_workqueue("btrfs_discard", WQ_FREEZABLE);
>
> if (!(fs_info->workers && fs_info->hipri_workers &&
> fs_info->delalloc_workers && fs_info->flush_workers &&
> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> index 69c93ae333f6..70882358bdb0 100644
> --- a/fs/btrfs/scrub.c
> +++ b/fs/btrfs/scrub.c
> @@ -4245,8 +4245,10 @@ static noinline_for_stack int scrub_workers_get(struct btrfs_fs_info *fs_info,
> if (refcount_inc_not_zero(&fs_info->scrub_workers_refcnt))
> return 0;
>
> - scrub_workers = alloc_workqueue("btrfs-scrub", flags,
> - is_dev_replace ? 1 : max_active);
> + if (is_dev_replace)
> + scrub_workers = alloc_ordered_workqueue("btrfs-scrub", flags);
> + else
> + scrub_workers = alloc_workqueue("btrfs-scrub", flags, max_active);
> if (!scrub_workers)
> goto fail_scrub_workers;
>
> --
> 2.40.0


2023-05-05 23:25:07

by Tejun Heo

[permalink] [raw]
Subject: [PATCH v2 16/22] btrfs: Use alloc_ordered_workqueue() to create ordered workqueues

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered"). Because there were users that depended on the ordered execution,
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
made workqueue allocation path to implicitly promote UNBOUND workqueues w/
@max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered and users that legitimately want a
min concurrency level wq unexpectedly gets an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

alloc_workqueue(WQ_UNBOUND | flags, 1, args..)

to

alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not ncessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while conversion
is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

v2: btrfs_alloc_workqueue() updated too as suggested by Wang.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Wang Yugui <[email protected]>
Cc: Chris Mason <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: David Sterba <[email protected]>
Cc: [email protected]
---
Hello,

Wang, yeah, that's a helper that can't tell whether the caller wants an
ordered wq or not, so it needs to be updated too. How does this look?

Thanks.

fs/btrfs/async-thread.c | 7 +++++--
fs/btrfs/disk-io.c | 2 +-
fs/btrfs/scrub.c | 6 ++++--
3 files changed, 10 insertions(+), 5 deletions(-)

--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -99,8 +99,11 @@ struct btrfs_workqueue *btrfs_alloc_work
ret->thresh = thresh;
}

- ret->normal_wq = alloc_workqueue("btrfs-%s", flags, ret->current_active,
- name);
+ if (ret->current_active == 1)
+ ret->normal_wq = alloc_ordered_workqueue("btrfs-%s", flags, name);
+ else
+ ret->normal_wq = alloc_workqueue("btrfs-%s", flags,
+ ret->current_active, name);
if (!ret->normal_wq) {
kfree(ret);
return NULL;
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2218,7 +2218,7 @@ static int btrfs_init_workqueues(struct
fs_info->qgroup_rescan_workers =
btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);
fs_info->discard_ctl.discard_workers =
- alloc_workqueue("btrfs_discard", WQ_UNBOUND | WQ_FREEZABLE, 1);
+ alloc_ordered_workqueue("btrfs_discard", WQ_FREEZABLE);

if (!(fs_info->workers && fs_info->hipri_workers &&
fs_info->delalloc_workers && fs_info->flush_workers &&
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -4245,8 +4245,10 @@ static noinline_for_stack int scrub_work
if (refcount_inc_not_zero(&fs_info->scrub_workers_refcnt))
return 0;

- scrub_workers = alloc_workqueue("btrfs-scrub", flags,
- is_dev_replace ? 1 : max_active);
+ if (is_dev_replace)
+ scrub_workers = alloc_ordered_workqueue("btrfs-scrub", flags);
+ else
+ scrub_workers = alloc_workqueue("btrfs-scrub", flags, max_active);
if (!scrub_workers)
goto fail_scrub_workers;

2023-05-06 01:44:15

by Wang Yugui

[permalink] [raw]
Subject: Re: [PATCH v2 16/22] btrfs: Use alloc_ordered_workqueue() to create ordered workqueues

Hi,

> BACKGROUND
> ==========
>
> When multiple work items are queued to a workqueue, their execution order
> doesn't match the queueing order. They may get executed in any order and
> simultaneously. When fully serialized execution - one by one in the queueing
> order - is needed, an ordered workqueue should be used which can be created
> with alloc_ordered_workqueue().
>
> However, alloc_ordered_workqueue() was a later addition. Before it, an
> ordered workqueue could be obtained by creating an UNBOUND workqueue with
> @max_active==1. This originally was an implementation side-effect which was
> broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
> ordered"). Because there were users that depended on the ordered execution,
> 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
> made workqueue allocation path to implicitly promote UNBOUND workqueues w/
> @max_active==1 to ordered workqueues.
>
> While this has worked okay, overloading the UNBOUND allocation interface
> this way creates other issues. It's difficult to tell whether a given
> workqueue actually needs to be ordered and users that legitimately want a
> min concurrency level wq unexpectedly gets an ordered one instead. With
> planned UNBOUND workqueue updates to improve execution locality and more
> prevalence of chiplet designs which can benefit from such improvements, this
> isn't a state we wanna be in forever.
>
> This patch series audits all callsites that create an UNBOUND workqueue w/
> @max_active==1 and converts them to alloc_ordered_workqueue() as necessary.
>
> WHAT TO LOOK FOR
> ================
>
> The conversions are from
>
> alloc_workqueue(WQ_UNBOUND | flags, 1, args..)
>
> to
>
> alloc_ordered_workqueue(flags, args...)
>
> which don't cause any functional changes. If you know that fully ordered
> execution is not ncessary, please let me know. I'll drop the conversion and
> instead add a comment noting the fact to reduce confusion while conversion
> is in progress.
>
> If you aren't fully sure, it's completely fine to let the conversion
> through. The behavior will stay exactly the same and we can always
> reconsider later.
>
> As there are follow-up workqueue core changes, I'd really appreciate if the
> patch can be routed through the workqueue tree w/ your acks. Thanks.
>
> v2: btrfs_alloc_workqueue() updated too as suggested by Wang.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Wang Yugui <[email protected]>
> Cc: Chris Mason <[email protected]>
> Cc: Josef Bacik <[email protected]>
> Cc: David Sterba <[email protected]>
> Cc: [email protected]
> ---
> Hello,
>
> Wang, yeah, that's a helper that can't tell whether the caller wants an
> ordered wq or not, so it needs to be updated too. How does this look?
>
> Thanks.
>
> fs/btrfs/async-thread.c | 7 +++++--
> fs/btrfs/disk-io.c | 2 +-
> fs/btrfs/scrub.c | 6 ++++--
> 3 files changed, 10 insertions(+), 5 deletions(-)
>
> --- a/fs/btrfs/async-thread.c
> +++ b/fs/btrfs/async-thread.c
> @@ -99,8 +99,11 @@ struct btrfs_workqueue *btrfs_alloc_work
> ret->thresh = thresh;
> }
>
> - ret->normal_wq = alloc_workqueue("btrfs-%s", flags, ret->current_active,
> - name);
> + if (ret->current_active == 1)
> + ret->normal_wq = alloc_ordered_workqueue("btrfs-%s", flags, name);
> + else
> + ret->normal_wq = alloc_workqueue("btrfs-%s", flags,
> + ret->current_active, name);
> if (!ret->normal_wq) {
> kfree(ret);
> return NULL;

by test, I noticed some warning caused by
void workqueue_set_max_active(struct workqueue_struct *wq, int max_active)
if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))
return;

so I tested again with the flowing fix

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index 43c8995..e4b68e9 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -99,8 +99,11 @@ struct btrfs_workqueue *btrfs_alloc_workqueue(struct btrfs_fs_info *fs_info,
ret->thresh = thresh;
}

- ret->normal_wq = alloc_workqueue("btrfs-%s", flags, ret->current_active,
- name);
+ if(limit_active == 1)
+ ret->normal_wq = alloc_ordered_workqueue("btrfs-%s", flags, name);
+ else
+ ret->normal_wq = alloc_workqueue("btrfs-%s", flags,
+ ret->current_active, name);
if (!ret->normal_wq) {
kfree(ret);
return NULL;
@@ -139,7 +139,7 @@ static inline void thresh_exec_hook(struct btrfs_workqueue *wq)
long pending;
int need_change = 0;

- if (wq->thresh == NO_THRESHOLD)
+ if (wq->thresh == NO_THRESHOLD || wq->limit_active == 1)
return;

atomic_dec(&wq->pending);

we need 'limit_active' at 2nd postition, so I used 'limit_active' and 1st
postition too.

Best Regards
Wang Yugui ([email protected])
2023/05/06



> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2218,7 +2218,7 @@ static int btrfs_init_workqueues(struct
> fs_info->qgroup_rescan_workers =
> btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);
> fs_info->discard_ctl.discard_workers =
> - alloc_workqueue("btrfs_discard", WQ_UNBOUND | WQ_FREEZABLE, 1);
> + alloc_ordered_workqueue("btrfs_discard", WQ_FREEZABLE);
>
> if (!(fs_info->workers && fs_info->hipri_workers &&
> fs_info->delalloc_workers && fs_info->flush_workers &&
> --- a/fs/btrfs/scrub.c
> +++ b/fs/btrfs/scrub.c
> @@ -4245,8 +4245,10 @@ static noinline_for_stack int scrub_work
> if (refcount_inc_not_zero(&fs_info->scrub_workers_refcnt))
> return 0;
>
> - scrub_workers = alloc_workqueue("btrfs-scrub", flags,
> - is_dev_replace ? 1 : max_active);
> + if (is_dev_replace)
> + scrub_workers = alloc_ordered_workqueue("btrfs-scrub", flags);
> + else
> + scrub_workers = alloc_workqueue("btrfs-scrub", flags, max_active);
> if (!scrub_workers)
> goto fail_scrub_workers;
>


2023-05-08 12:43:13

by Juergen Gross

[permalink] [raw]
Subject: Re: [PATCH 15/22] xen/pvcalls: Use alloc_ordered_workqueue() to create ordered workqueues

On 21.04.23 04:50, Tejun Heo wrote:
> BACKGROUND
> ==========
>
> When multiple work items are queued to a workqueue, their execution order
> doesn't match the queueing order. They may get executed in any order and
> simultaneously. When fully serialized execution - one by one in the queueing
> order - is needed, an ordered workqueue should be used which can be created
> with alloc_ordered_workqueue().
>
> However, alloc_ordered_workqueue() was a later addition. Before it, an
> ordered workqueue could be obtained by creating an UNBOUND workqueue with
> @max_active==1. This originally was an implementation side-effect which was
> broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
> ordered"). Because there were users that depended on the ordered execution,
> 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
> made workqueue allocation path to implicitly promote UNBOUND workqueues w/
> @max_active==1 to ordered workqueues.
>
> While this has worked okay, overloading the UNBOUND allocation interface
> this way creates other issues. It's difficult to tell whether a given
> workqueue actually needs to be ordered and users that legitimately want a
> min concurrency level wq unexpectedly gets an ordered one instead. With
> planned UNBOUND workqueue updates to improve execution locality and more
> prevalence of chiplet designs which can benefit from such improvements, this
> isn't a state we wanna be in forever.
>
> This patch series audits all callsites that create an UNBOUND workqueue w/
> @max_active==1 and converts them to alloc_ordered_workqueue() as necessary.
>
> WHAT TO LOOK FOR
> ================
>
> The conversions are from
>
> alloc_workqueue(WQ_UNBOUND | flags, 1, args..)
>
> to
>
> alloc_ordered_workqueue(flags, args...)
>
> which don't cause any functional changes. If you know that fully ordered
> execution is not ncessary, please let me know. I'll drop the conversion and
> instead add a comment noting the fact to reduce confusion while conversion
> is in progress.
>
> If you aren't fully sure, it's completely fine to let the conversion
> through. The behavior will stay exactly the same and we can always
> reconsider later.
>
> As there are follow-up workqueue core changes, I'd really appreciate if the
> patch can be routed through the workqueue tree w/ your acks. Thanks.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Juergen Gross <[email protected]>
> Cc: Stefano Stabellini <[email protected]>
> Cc: Oleksandr Tyshchenko <[email protected]>
> Cc: [email protected]

Acked-by: Juergen Gross <[email protected]>


Juergen


Attachments:
OpenPGP_0xB0DE9DD628BF132F.asc (3.08 kB)
OpenPGP public key
OpenPGP_signature (505.00 B)
OpenPGP digital signature
Download all attachments

2023-05-09 00:05:18

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 13/22] virt: acrn: Use alloc_ordered_workqueue() to create ordered workqueues

Applied to wq/for-6.5-cleanup-ordered.

Thanks.

--
tejun

2023-05-09 00:07:48

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 06/22] net: thunderx: Use alloc_ordered_workqueue() to create ordered workqueues

Applied to wq/for-6.5-cleanup-ordered.

Thanks.

--
tejun

2023-05-09 00:07:48

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 12/22] scsi: Use alloc_ordered_workqueue() to create ordered workqueues

Applied to wq/for-6.5-cleanup-ordered.

Thanks.

--
tejun

2023-05-09 00:12:01

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 07/22] net: octeontx2: Use alloc_ordered_workqueue() to create ordered workqueues

Applied to wq/for-6.5-cleanup-ordered.

Thanks.

--
tejun

2023-05-09 00:15:05

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH v2 16/22] btrfs: Use alloc_ordered_workqueue() to create ordered workqueues

On Sat, May 06, 2023 at 09:40:14AM +0800, Wang Yugui wrote:
> by test, I noticed some warning caused by
> void workqueue_set_max_active(struct workqueue_struct *wq, int max_active)
> if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))
> return;
>
> so I tested again with the flowing fix
>
> diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
> index 43c8995..e4b68e9 100644
> --- a/fs/btrfs/async-thread.c
> +++ b/fs/btrfs/async-thread.c
> @@ -99,8 +99,11 @@ struct btrfs_workqueue *btrfs_alloc_workqueue(struct btrfs_fs_info *fs_info,
> ret->thresh = thresh;
> }
>
> - ret->normal_wq = alloc_workqueue("btrfs-%s", flags, ret->current_active,
> - name);
> + if(limit_active == 1)
> + ret->normal_wq = alloc_ordered_workqueue("btrfs-%s", flags, name);
> + else
> + ret->normal_wq = alloc_workqueue("btrfs-%s", flags,
> + ret->current_active, name);
> if (!ret->normal_wq) {
> kfree(ret);
> return NULL;
> @@ -139,7 +139,7 @@ static inline void thresh_exec_hook(struct btrfs_workqueue *wq)
> long pending;
> int need_change = 0;
>
> - if (wq->thresh == NO_THRESHOLD)
> + if (wq->thresh == NO_THRESHOLD || wq->limit_active == 1)
> return;
>
> atomic_dec(&wq->pending);
>
> we need 'limit_active' at 2nd postition, so I used 'limit_active' and 1st
> postition too.

Oh, that most likely means that these workqueues don't need to and shouldn't
be ordered. Will update the patch.

Thanks.

--
tejun

2023-05-09 00:20:33

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 17/22] cifs: Use alloc_ordered_workqueue() to create ordered workqueues

On Fri, Apr 21, 2023 at 03:38:57PM -0300, Paulo Alcantara wrote:
...
> > fs/cifs/dfs_cache.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Acked-by: Paulo Alcantara (SUSE) <[email protected]>

6be2ea33a409 ("cifs: avoid potential races when handling multiple dfs
tcons") made this patch unnecessary. Dropped.

Thanks.

--
tejun

2023-05-09 00:20:39

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 15/22] xen/pvcalls: Use alloc_ordered_workqueue() to create ordered workqueues

Applied to wq/for-6.5-cleanup-ordered.

Thanks.

--
tejun

2023-05-09 01:32:11

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 12/22] scsi: Use alloc_ordered_workqueue() to create ordered workqueues

On Mon, May 08, 2023 at 01:57:30PM -1000, Tejun Heo wrote:
> Applied to wq/for-6.5-cleanup-ordered.

Oops, strike that. All scsi core workqueues have WQ_SYSFS set which means
that their max_active could be adjusted upwards through sysfs. The shouldn't
be ordered workqueues. This only leaves NCR5380 the only remaining
conversion candidate; however, that one only uses a single work item, so the
better thing to do there is using the default @max_active instead.

I'm dropping this patch and will add a patch for NCR5380 in the next round.

Thanks.

--
tejun

2023-05-25 05:45:25

by Bjorn Andersson

[permalink] [raw]
Subject: Re: (subset) [PATCHSET wq/for-6.5] workqueue: Ordered workqueue creation cleanup

On Thu, 20 Apr 2023 16:50:24 -1000, Tejun Heo wrote:
> When multiple work items are queued to a workqueue, their execution order
> doesn't match the queueing order. They may get executed in any order and
> simultaneously. When fully serialized execution - one by one in the queueing
> order - is needed, an ordered workqueue should be used which can be created
> with alloc_ordered_workqueue().
>
> However, alloc_ordered_workqueue() was a later addition. Before it, an
> ordered workqueue could be obtained by creating an UNBOUND workqueue with
> @max_active==1. This originally was an implementation side-effect which was
> broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
> ordered"). Because there were users that depended on the ordered execution,
> 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
> made workqueue allocation path to implicitly promote UNBOUND workqueues w/
> @max_active==1 to ordered workqueues.
>
> [...]

Applied, thanks!

[14/22] soc: qcom: qmi: Use alloc_ordered_workqueue() to create ordered workqueues
commit: 56310520308ab863030e9baa9a8f63bb31c94e27

Best regards,
--
Bjorn Andersson <[email protected]>