2012-08-14 01:49:55

by Tejun Heo

Subject: [PATCHSET] workqueue: make all workqueues non-reentrant

Hello,

Unless explicitly specified otherwise, a workqueue is reentrant, which
might not even be the correct term. If a work item is executing on a
CPU but not pending anywhere, the work item can be queued on a
different CPU, and that CPU might start executing it whether or not
the previous execution on the first CPU has finished. In short, while
a work item can be queued on only one CPU, it may be executing
concurrently on any subset of the CPUs in the system.

This behavior in itself is already confusing and error-prone; worse,
it severely complicates flush operations. Because a given work item
could be executing on any number of CPUs, even if no further queueing
happens after invocation, flush_work() can't guarantee that the work
item is idle on return without checking all CPUs. As flush_work() is
a relatively common operation, checking all CPUs on each invocation
was deemed too costly; instead, it only verifies the instance queued
last, and there's a separate version which checks all CPUs -
flush_work_sync().

Also, due to the way the last CPU is tracked, if there are parallel
queueing operations, the following sequence

schedule_work(A);
flush_work(A);

can't guarantee that the instance queued right above is finished. If
someone else queues A on another CPU in between, flush_work() either
becomes a noop or flushes that instance, which may finish earlier than
the one queued above.

In the end, workqueue is a pain in the ass to get completely correct
and the breakages are very subtle. Depending on queueing pattern,
assuming non-reentrancy might work fine most of the time. The effect
of using flush_work() where flush_work_sync() should be used could be
a lot more subtle: flush_work() becoming a noop would happen extremely
rarely for most users, but the possibility definitely is there.

A tool which is used as widely as workqueue shouldn't be this
difficult and error-prone. This is just silly.

Timer does the right thing. It *always* checks whether a timer is
still executing on the previous CPU and, if so, queues it there, much
like how WQ_NON_REENTRANT makes workqueue behave. WQ_NON_REENTRANT
guarantees that any given work item is executing on only one CPU at
maximum and if both pending and executing, both are on the same CPU.
This makes the behaviors of work item execution and flush_work() much
more sane and flush_work_sync() unnecessary.

This patchset enforces WQ_NON_REENTRANT behavior on all workqueues and
simplifies workqueue API accordingly.

This adds an additional busy_hash lookup if the work item was
previously queued on a different CPU. This shouldn't be noticeable
under any sane workload. Work item queueing isn't a very
high-frequency operation and work items don't jump across CPUs all the
time. In a micro benchmark constructed to exaggerate this difference -
measuring the time it takes for two work items to repeatedly jump
between two CPUs 10M times with the busy_hash table densely populated -
the difference was around 3%.

While the overhead is measurable, it is only visible in pathological
cases and the difference isn't huge. This change brings much needed
sanity to workqueue and makes its behavior consistent with timer. I
think this is the right tradeoff to make.

This patchset contains the following six patches.

0001-workqueue-make-all-workqueues-non-reentrant.patch
0002-workqueue-gut-flush-_delayed-_work_sync.patch
0003-workqueue-gut-system_nrt-_freezable-_wq.patch
0004-workqueue-deprecate-flush-_delayed-_work_sync.patch
0005-workqueue-deprecate-system_nrt-_freezable-_wq.patch
0006-workqueue-deprecate-WQ_NON_REENTRANT.patch

0001 makes all workqueues non-reentrant.

0002-0003 simplify workqueue API accordingly.

0004-0006 deprecate now unnecessary parts of workqueue API and convert
their users. Most conversions are straightforward but 0006 contains
some non-trivial ones. If you work on NFC and drivers/staging/nvec,
please review them.

This patchset is on top of

wq/for-3.7 1265057fa02c7bed3b6d9ddc8a2048065a370364
+ [1] "timer: clean up initializers and implement irqsafe" patchset
+ [2] "workqueue: use irqsafe timer in delayed_work" patchset

and available in the following git branch.

git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git review-wq-always-nrt

Thanks. diffstat follows.

Documentation/workqueue.txt | 14 --
arch/arm/mach-pxa/sharpsl_pm.c | 4
arch/arm/plat-omap/mailbox.c | 2
arch/sh/drivers/push-switch.c | 2
block/blk-throttle.c | 7 -
block/genhd.c | 10 -
drivers/block/xen-blkfront.c | 4
drivers/cdrom/gdrom.c | 2
drivers/char/sonypi.c | 2
drivers/char/tpm/tpm.c | 4
drivers/firewire/core-transaction.c | 3
drivers/gpu/drm/drm_crtc_helper.c | 6
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2
drivers/gpu/drm/i915/i915_dma.c | 6
drivers/gpu/drm/nouveau/nouveau_gpio.c | 2
drivers/gpu/drm/radeon/radeon_irq_kms.c | 2
drivers/gpu/drm/vmwgfx/vmwgfx_fb.c | 2
drivers/hid/hid-picolcd.c | 2
drivers/hid/hid-wiimote-ext.c | 2
drivers/input/touchscreen/wm831x-ts.c | 2
drivers/isdn/mISDN/hwchannel.c | 4
drivers/leds/leds-lm3533.c | 6
drivers/leds/leds-lp8788.c | 2
drivers/leds/leds-wm8350.c | 2
drivers/macintosh/ams/ams-core.c | 2
drivers/md/dm-crypt.c | 10 -
drivers/md/dm-kcopyd.c | 3
drivers/md/dm-mpath.c | 2
drivers/md/dm-raid1.c | 5
drivers/md/dm-stripe.c | 2
drivers/md/dm.c | 3
drivers/media/dvb/dvb-core/dvb_net.c | 4
drivers/media/dvb/mantis/mantis_evm.c | 2
drivers/media/dvb/mantis/mantis_uart.c | 2
drivers/media/video/bt8xx/bttv-driver.c | 2
drivers/media/video/cx18/cx18-driver.c | 2
drivers/media/video/cx231xx/cx231xx-cards.c | 2
drivers/media/video/cx23885/cx23885-input.c | 6
drivers/media/video/cx88/cx88-mpeg.c | 2
drivers/media/video/em28xx/em28xx-cards.c | 2
drivers/media/video/omap24xxcam.c | 6
drivers/media/video/saa7134/saa7134-core.c | 2
drivers/media/video/saa7134/saa7134-empress.c | 2
drivers/media/video/tm6000/tm6000-cards.c | 2
drivers/mfd/menelaus.c | 4
drivers/misc/ioc4.c | 2
drivers/mmc/core/host.c | 4
drivers/mmc/host/dw_mmc.c | 3
drivers/mtd/mtdoops.c | 4
drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c | 2
drivers/net/ethernet/neterion/vxge/vxge-main.c | 2
drivers/net/ethernet/sun/cassini.c | 2
drivers/net/ethernet/sun/niu.c | 2
drivers/net/virtio_net.c | 12 -
drivers/net/wireless/hostap/hostap_ap.c | 4
drivers/net/wireless/hostap/hostap_hw.c | 10 -
drivers/nfc/pn533.c | 5
drivers/power/collie_battery.c | 2
drivers/power/tosa_battery.c | 2
drivers/power/wm97xx_battery.c | 2
drivers/power/z2_battery.c | 2
drivers/regulator/core.c | 2
drivers/scsi/arcmsr/arcmsr_hba.c | 4
drivers/scsi/ipr.c | 2
drivers/scsi/pmcraid.c | 2
drivers/scsi/qla2xxx/qla_target.c | 2
drivers/staging/nvec/nvec.c | 10 -
drivers/staging/nvec/nvec.h | 2
drivers/staging/omapdrm/omap_drv.c | 3
drivers/tty/hvc/hvsi.c | 2
drivers/tty/ipwireless/hardware.c | 2
drivers/tty/ipwireless/network.c | 4
drivers/tty/serial/kgdboc.c | 2
drivers/tty/serial/omap-serial.c | 2
drivers/tty/tty_ldisc.c | 6
drivers/usb/atm/speedtch.c | 2
drivers/usb/atm/ueagle-atm.c | 2
drivers/usb/gadget/u_ether.c | 2
drivers/usb/host/ohci-hcd.c | 2
drivers/usb/otg/isp1301_omap.c | 2
fs/affs/super.c | 2
fs/dlm/ast.c | 5
fs/gfs2/lock_dlm.c | 2
fs/gfs2/main.c | 2
fs/gfs2/super.c | 2
fs/hfs/inode.c | 2
fs/ncpfs/inode.c | 6
fs/ocfs2/cluster/quorum.c | 2
fs/xfs/xfs_super.c | 4
fs/xfs/xfs_sync.c | 2
include/linux/workqueue.h | 37 ++++-
kernel/srcu.c | 4
kernel/workqueue.c | 150 +++---------------------
net/9p/trans_fd.c | 2
net/ceph/messenger.c | 2
net/dsa/dsa.c | 2
net/nfc/core.c | 6
net/nfc/hci/core.c | 8 -
net/nfc/hci/shdlc.c | 4
net/nfc/llcp/llcp.c | 18 --
security/keys/gc.c | 8 -
security/keys/key.c | 2
sound/i2c/other/ak4113.c | 2
sound/i2c/other/ak4114.c | 2
sound/pci/oxygen/oxygen_lib.c | 8 -
sound/soc/codecs/wm8350.c | 2
sound/soc/codecs/wm8753.c | 2
sound/soc/soc-core.c | 6
virt/kvm/eventfd.c | 2
109 files changed, 221 insertions(+), 351 deletions(-)

--
tejun

[1] http://thread.gmane.org/gmane.linux.kernel/1340224
[2] http://thread.gmane.org/gmane.linux.kernel/1340333


2012-08-14 01:50:00

by Tejun Heo

Subject: [PATCH 2/6] workqueue: gut flush[_delayed]_work_sync()

Now that all workqueues are non-reentrant, flush[_delayed]_work_sync()
are equivalent to flush[_delayed]_work(). Drop the separate
implementation and make them thin wrappers around
flush[_delayed]_work().

* start_flush_work() no longer takes @wait_executing, as the only
remaining user - flush_work() - always sets it to %true.

* __cancel_work_timer() uses flush_work() instead of wait_on_work().

Signed-off-by: Tejun Heo <[email protected]>
---
include/linux/workqueue.h | 14 +++++-
kernel/workqueue.c | 122 ++++-----------------------------------------
2 files changed, 22 insertions(+), 114 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 6cd8f91..b2716b6 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -412,11 +412,9 @@ extern int keventd_up(void);
int execute_in_process_context(work_func_t fn, struct execute_work *);

extern bool flush_work(struct work_struct *work);
-extern bool flush_work_sync(struct work_struct *work);
extern bool cancel_work_sync(struct work_struct *work);

extern bool flush_delayed_work(struct delayed_work *dwork);
-extern bool flush_delayed_work_sync(struct delayed_work *work);
extern bool cancel_delayed_work(struct delayed_work *dwork);
extern bool cancel_delayed_work_sync(struct delayed_work *dwork);

@@ -441,6 +439,18 @@ static inline bool __deprecated __cancel_delayed_work(struct delayed_work *work)
return ret;
}

+/* used to be different but now identical to flush_work(), deprecated */
+static inline bool flush_work_sync(struct work_struct *work)
+{
+ return flush_work(work);
+}
+
+/* used to be different but now identical to flush_delayed_work(), deprecated */
+static inline bool flush_delayed_work_sync(struct delayed_work *dwork)
+{
+ return flush_delayed_work(dwork);
+}
+
#ifndef CONFIG_SMP
static inline long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
{
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 26f048b..938adc0 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2780,8 +2780,7 @@ reflush:
}
EXPORT_SYMBOL_GPL(drain_workqueue);

-static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
- bool wait_executing)
+static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
{
struct worker *worker = NULL;
struct global_cwq *gcwq;
@@ -2803,13 +2802,12 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
cwq = get_work_cwq(work);
if (unlikely(!cwq || gcwq != cwq->pool->gcwq))
goto already_gone;
- } else if (wait_executing) {
+ } else {
worker = find_worker_executing_work(gcwq, work);
if (!worker)
goto already_gone;
cwq = worker->current_cwq;
- } else
- goto already_gone;
+ }

insert_wq_barrier(cwq, barr, work, worker);
spin_unlock_irq(&gcwq->lock);
@@ -2836,15 +2834,8 @@ already_gone:
* flush_work - wait for a work to finish executing the last queueing instance
* @work: the work to flush
*
- * Wait until @work has finished execution. This function considers
- * only the last queueing instance of @work. If @work has been
- * enqueued across different CPUs on a non-reentrant workqueue or on
- * multiple workqueues, @work might still be executing on return on
- * some of the CPUs from earlier queueing.
- *
- * If @work was queued only on a non-reentrant, ordered or unbound
- * workqueue, @work is guaranteed to be idle on return if it hasn't
- * been requeued since flush started.
+ * Wait until @work has finished execution. @work is guaranteed to be idle
+ * on return if it hasn't been requeued since flush started.
*
* RETURNS:
* %true if flush_work() waited for the work to finish execution,
@@ -2857,85 +2848,15 @@ bool flush_work(struct work_struct *work)
lock_map_acquire(&work->lockdep_map);
lock_map_release(&work->lockdep_map);

- if (start_flush_work(work, &barr, true)) {
+ if (start_flush_work(work, &barr)) {
wait_for_completion(&barr.done);
destroy_work_on_stack(&barr.work);
return true;
- } else
- return false;
-}
-EXPORT_SYMBOL_GPL(flush_work);
-
-static bool wait_on_cpu_work(struct global_cwq *gcwq, struct work_struct *work)
-{
- struct wq_barrier barr;
- struct worker *worker;
-
- spin_lock_irq(&gcwq->lock);
-
- worker = find_worker_executing_work(gcwq, work);
- if (unlikely(worker))
- insert_wq_barrier(worker->current_cwq, &barr, work, worker);
-
- spin_unlock_irq(&gcwq->lock);
-
- if (unlikely(worker)) {
- wait_for_completion(&barr.done);
- destroy_work_on_stack(&barr.work);
- return true;
- } else
+ } else {
return false;
-}
-
-static bool wait_on_work(struct work_struct *work)
-{
- bool ret = false;
- int cpu;
-
- might_sleep();
-
- lock_map_acquire(&work->lockdep_map);
- lock_map_release(&work->lockdep_map);
-
- for_each_gcwq_cpu(cpu)
- ret |= wait_on_cpu_work(get_gcwq(cpu), work);
- return ret;
-}
-
-/**
- * flush_work_sync - wait until a work has finished execution
- * @work: the work to flush
- *
- * Wait until @work has finished execution. On return, it's
- * guaranteed that all queueing instances of @work which happened
- * before this function is called are finished. In other words, if
- * @work hasn't been requeued since this function was called, @work is
- * guaranteed to be idle on return.
- *
- * RETURNS:
- * %true if flush_work_sync() waited for the work to finish execution,
- * %false if it was already idle.
- */
-bool flush_work_sync(struct work_struct *work)
-{
- struct wq_barrier barr;
- bool pending, waited;
-
- /* we'll wait for executions separately, queue barr only if pending */
- pending = start_flush_work(work, &barr, false);
-
- /* wait for executions to finish */
- waited = wait_on_work(work);
-
- /* wait for the pending one */
- if (pending) {
- wait_for_completion(&barr.done);
- destroy_work_on_stack(&barr.work);
}
-
- return pending || waited;
}
-EXPORT_SYMBOL_GPL(flush_work_sync);
+EXPORT_SYMBOL_GPL(flush_work);

static bool __cancel_work_timer(struct work_struct *work, bool is_dwork)
{
@@ -2949,14 +2870,14 @@ static bool __cancel_work_timer(struct work_struct *work, bool is_dwork)
* would be waiting for before retrying.
*/
if (unlikely(ret == -ENOENT))
- wait_on_work(work);
+ flush_work(work);
} while (unlikely(ret < 0));

/* tell other tasks trying to grab @work to back off */
mark_work_canceling(work);
local_irq_restore(flags);

- wait_on_work(work);
+ flush_work(work);
clear_work_data(work);
return ret;
}
@@ -3009,29 +2930,6 @@ bool flush_delayed_work(struct delayed_work *dwork)
EXPORT_SYMBOL(flush_delayed_work);

/**
- * flush_delayed_work_sync - wait for a dwork to finish
- * @dwork: the delayed work to flush
- *
- * Delayed timer is cancelled and the pending work is queued for
- * execution immediately. Other than timer handling, its behavior
- * is identical to flush_work_sync().
- *
- * RETURNS:
- * %true if flush_work_sync() waited for the work to finish execution,
- * %false if it was already idle.
- */
-bool flush_delayed_work_sync(struct delayed_work *dwork)
-{
- local_irq_disable();
- if (del_timer_sync(&dwork->timer))
- __queue_work(dwork->cpu,
- get_work_cwq(&dwork->work)->wq, &dwork->work);
- local_irq_enable();
- return flush_work_sync(&dwork->work);
-}
-EXPORT_SYMBOL(flush_delayed_work_sync);
-
-/**
* cancel_delayed_work - cancel a delayed work
* @dwork: delayed_work to cancel
*
--
1.7.7.3

2012-08-14 01:50:14

by Tejun Heo

Subject: [PATCH 5/6] workqueue: deprecate system_nrt[_freezable]_wq

system_nrt[_freezable]_wq are now spurious. Mark them deprecated and
convert all users to system[_freezable]_wq.

If you're cc'd and wondering what's going on: Now all workqueues are
non-reentrant, so there's no reason to use system_nrt[_freezable]_wq.
Please use system[_freezable]_wq instead.

This patch doesn't make any functional difference.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: David Howells <[email protected]>
---
block/blk-throttle.c | 7 +++----
block/genhd.c | 10 +++++-----
drivers/gpu/drm/drm_crtc_helper.c | 6 +++---
drivers/hid/hid-wiimote-ext.c | 2 +-
drivers/mmc/core/host.c | 4 ++--
drivers/net/virtio_net.c | 12 ++++++------
include/linux/workqueue.h | 4 ++--
kernel/srcu.c | 4 ++--
security/keys/gc.c | 8 ++++----
security/keys/key.c | 2 +-
10 files changed, 29 insertions(+), 30 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 3d3dcae..a9664fa 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -180,7 +180,7 @@ static inline unsigned int total_nr_queued(struct throtl_data *td)

/*
* Worker for allocating per cpu stat for tgs. This is scheduled on the
- * system_nrt_wq once there are some groups on the alloc_list waiting for
+ * system_wq once there are some groups on the alloc_list waiting for
* allocation.
*/
static void tg_stats_alloc_fn(struct work_struct *work)
@@ -194,8 +194,7 @@ alloc_stats:
stats_cpu = alloc_percpu(struct tg_stats_cpu);
if (!stats_cpu) {
/* allocation failed, try again after some time */
- queue_delayed_work(system_nrt_wq, dwork,
- msecs_to_jiffies(10));
+ schedule_delayed_work(dwork, msecs_to_jiffies(10));
return;
}
}
@@ -238,7 +237,7 @@ static void throtl_pd_init(struct blkcg_gq *blkg)
*/
spin_lock_irqsave(&tg_stats_alloc_lock, flags);
list_add(&tg->stats_alloc_node, &tg_stats_alloc_list);
- queue_delayed_work(system_nrt_wq, &tg_stats_alloc_work, 0);
+ schedule_delayed_work(&tg_stats_alloc_work, 0);
spin_unlock_irqrestore(&tg_stats_alloc_lock, flags);
}

diff --git a/block/genhd.c b/block/genhd.c
index 5d8b44a..a2f3d6a 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1490,9 +1490,9 @@ static void __disk_unblock_events(struct gendisk *disk, bool check_now)
intv = disk_events_poll_jiffies(disk);
set_timer_slack(&ev->dwork.timer, intv / 4);
if (check_now)
- queue_delayed_work(system_nrt_freezable_wq, &ev->dwork, 0);
+ queue_delayed_work(system_freezable_wq, &ev->dwork, 0);
else if (intv)
- queue_delayed_work(system_nrt_freezable_wq, &ev->dwork, intv);
+ queue_delayed_work(system_freezable_wq, &ev->dwork, intv);
out_unlock:
spin_unlock_irqrestore(&ev->lock, flags);
}
@@ -1535,7 +1535,7 @@ void disk_flush_events(struct gendisk *disk, unsigned int mask)
spin_lock_irq(&ev->lock);
ev->clearing |= mask;
if (!ev->block)
- mod_delayed_work(system_nrt_freezable_wq, &ev->dwork, 0);
+ mod_delayed_work(system_freezable_wq, &ev->dwork, 0);
spin_unlock_irq(&ev->lock);
}

@@ -1571,7 +1571,7 @@ unsigned int disk_clear_events(struct gendisk *disk, unsigned int mask)

/* uncondtionally schedule event check and wait for it to finish */
disk_block_events(disk);
- queue_delayed_work(system_nrt_freezable_wq, &ev->dwork, 0);
+ queue_delayed_work(system_freezable_wq, &ev->dwork, 0);
flush_delayed_work(&ev->dwork);
__disk_unblock_events(disk, false);

@@ -1608,7 +1608,7 @@ static void disk_events_workfn(struct work_struct *work)

intv = disk_events_poll_jiffies(disk);
if (!ev->block && intv)
- queue_delayed_work(system_nrt_freezable_wq, &ev->dwork, intv);
+ queue_delayed_work(system_freezable_wq, &ev->dwork, intv);

spin_unlock_irq(&ev->lock);

diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
index 3252e70..8fa9d52 100644
--- a/drivers/gpu/drm/drm_crtc_helper.c
+++ b/drivers/gpu/drm/drm_crtc_helper.c
@@ -968,7 +968,7 @@ static void output_poll_execute(struct work_struct *work)
}

if (repoll)
- queue_delayed_work(system_nrt_wq, delayed_work, DRM_OUTPUT_POLL_PERIOD);
+ schedule_delayed_work(delayed_work, DRM_OUTPUT_POLL_PERIOD);
}

void drm_kms_helper_poll_disable(struct drm_device *dev)
@@ -993,7 +993,7 @@ void drm_kms_helper_poll_enable(struct drm_device *dev)
}

if (poll)
- queue_delayed_work(system_nrt_wq, &dev->mode_config.output_poll_work, DRM_OUTPUT_POLL_PERIOD);
+ schedule_delayed_work(&dev->mode_config.output_poll_work, DRM_OUTPUT_POLL_PERIOD);
}
EXPORT_SYMBOL(drm_kms_helper_poll_enable);

@@ -1020,6 +1020,6 @@ void drm_helper_hpd_irq_event(struct drm_device *dev)
/* kill timer and schedule immediate execution, this doesn't block */
cancel_delayed_work(&dev->mode_config.output_poll_work);
if (drm_kms_helper_poll)
- queue_delayed_work(system_nrt_wq, &dev->mode_config.output_poll_work, 0);
+ schedule_delayed_work(&dev->mode_config.output_poll_work, 0);
}
EXPORT_SYMBOL(drm_helper_hpd_irq_event);
diff --git a/drivers/hid/hid-wiimote-ext.c b/drivers/hid/hid-wiimote-ext.c
index 0a1805c..d37cd09 100644
--- a/drivers/hid/hid-wiimote-ext.c
+++ b/drivers/hid/hid-wiimote-ext.c
@@ -204,7 +204,7 @@ static void wiiext_worker(struct work_struct *work)
/* schedule work only once, otherwise mark for reschedule */
static void wiiext_schedule(struct wiimote_ext *ext)
{
- queue_work(system_nrt_wq, &ext->worker);
+ schedule_work(&ext->worker);
}

/*
diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
index 597f189..ee2e16b 100644
--- a/drivers/mmc/core/host.c
+++ b/drivers/mmc/core/host.c
@@ -204,8 +204,8 @@ void mmc_host_clk_release(struct mmc_host *host)
host->clk_requests--;
if (mmc_host_may_gate_card(host->card) &&
!host->clk_requests)
- queue_delayed_work(system_nrt_wq, &host->clk_gate_work,
- msecs_to_jiffies(host->clkgate_delay));
+ schedule_delayed_work(&host->clk_gate_work,
+ msecs_to_jiffies(host->clkgate_delay));
spin_unlock_irqrestore(&host->clk_lock, flags);
}

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 83d2b0c..9650c41 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -521,7 +521,7 @@ static void refill_work(struct work_struct *work)
/* In theory, this can happen: if we don't get any buffers in
* we will *never* try to fill again. */
if (still_empty)
- queue_delayed_work(system_nrt_wq, &vi->refill, HZ/2);
+ schedule_delayed_work(&vi->refill, HZ/2);
}

static int virtnet_poll(struct napi_struct *napi, int budget)
@@ -540,7 +540,7 @@ again:

if (vi->num < vi->max / 2) {
if (!try_fill_recv(vi, GFP_ATOMIC))
- queue_delayed_work(system_nrt_wq, &vi->refill, 0);
+ schedule_delayed_work(&vi->refill, 0);
}

/* Out of packets? */
@@ -745,7 +745,7 @@ static int virtnet_open(struct net_device *dev)

/* Make sure we have some buffers: if oom use wq. */
if (!try_fill_recv(vi, GFP_KERNEL))
- queue_delayed_work(system_nrt_wq, &vi->refill, 0);
+ schedule_delayed_work(&vi->refill, 0);

virtnet_napi_enable(vi);
return 0;
@@ -1020,7 +1020,7 @@ static void virtnet_config_changed(struct virtio_device *vdev)
{
struct virtnet_info *vi = vdev->priv;

- queue_work(system_nrt_wq, &vi->config_work);
+ schedule_work(&vi->config_work);
}

static int init_vqs(struct virtnet_info *vi)
@@ -1152,7 +1152,7 @@ static int virtnet_probe(struct virtio_device *vdev)
otherwise get link status from config. */
if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) {
netif_carrier_off(dev);
- queue_work(system_nrt_wq, &vi->config_work);
+ schedule_work(&vi->config_work);
} else {
vi->status = VIRTIO_NET_S_LINK_UP;
netif_carrier_on(dev);
@@ -1264,7 +1264,7 @@ static int virtnet_restore(struct virtio_device *vdev)
netif_device_attach(vi->dev);

if (!try_fill_recv(vi, GFP_KERNEL))
- queue_delayed_work(system_nrt_wq, &vi->refill, 0);
+ schedule_delayed_work(&vi->refill, 0);

mutex_lock(&vi->config_lock);
vi->config_enable = true;
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index ffaaec0..2b58905 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -310,12 +310,12 @@ extern struct workqueue_struct *system_long_wq;
extern struct workqueue_struct *system_unbound_wq;
extern struct workqueue_struct *system_freezable_wq;

-static inline struct workqueue_struct *__system_nrt_wq(void)
+static inline struct workqueue_struct * __deprecated __system_nrt_wq(void)
{
return system_wq;
}

-static inline struct workqueue_struct *__system_nrt_freezable_wq(void)
+static inline struct workqueue_struct * __deprecated __system_nrt_freezable_wq(void)
{
return system_freezable_wq;
}
diff --git a/kernel/srcu.c b/kernel/srcu.c
index 2095be3..97c465e 100644
--- a/kernel/srcu.c
+++ b/kernel/srcu.c
@@ -379,7 +379,7 @@ void call_srcu(struct srcu_struct *sp, struct rcu_head *head,
rcu_batch_queue(&sp->batch_queue, head);
if (!sp->running) {
sp->running = true;
- queue_delayed_work(system_nrt_wq, &sp->work, 0);
+ schedule_delayed_work(&sp->work, 0);
}
spin_unlock_irqrestore(&sp->queue_lock, flags);
}
@@ -631,7 +631,7 @@ static void srcu_reschedule(struct srcu_struct *sp)
}

if (pending)
- queue_delayed_work(system_nrt_wq, &sp->work, SRCU_INTERVAL);
+ schedule_delayed_work(&sp->work, SRCU_INTERVAL);
}

/*
diff --git a/security/keys/gc.c b/security/keys/gc.c
index 61ab7c8..d67c97b 100644
--- a/security/keys/gc.c
+++ b/security/keys/gc.c
@@ -62,7 +62,7 @@ void key_schedule_gc(time_t gc_at)

if (gc_at <= now || test_bit(KEY_GC_REAP_KEYTYPE, &key_gc_flags)) {
kdebug("IMMEDIATE");
- queue_work(system_nrt_wq, &key_gc_work);
+ schedule_work(&key_gc_work);
} else if (gc_at < key_gc_next_run) {
kdebug("DEFERRED");
key_gc_next_run = gc_at;
@@ -77,7 +77,7 @@ void key_schedule_gc(time_t gc_at)
void key_schedule_gc_links(void)
{
set_bit(KEY_GC_KEY_EXPIRED, &key_gc_flags);
- queue_work(system_nrt_wq, &key_gc_work);
+ schedule_work(&key_gc_work);
}

/*
@@ -120,7 +120,7 @@ void key_gc_keytype(struct key_type *ktype)
set_bit(KEY_GC_REAP_KEYTYPE, &key_gc_flags);

kdebug("schedule");
- queue_work(system_nrt_wq, &key_gc_work);
+ schedule_work(&key_gc_work);

kdebug("sleep");
wait_on_bit(&key_gc_flags, KEY_GC_REAPING_KEYTYPE, key_gc_wait_bit,
@@ -369,7 +369,7 @@ maybe_resched:
}

if (gc_state & KEY_GC_REAP_AGAIN)
- queue_work(system_nrt_wq, &key_gc_work);
+ schedule_work(&key_gc_work);
kleave(" [end %x]", gc_state);
return;

diff --git a/security/keys/key.c b/security/keys/key.c
index 50d96d4..3cbe352 100644
--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -598,7 +598,7 @@ void key_put(struct key *key)
key_check(key);

if (atomic_dec_and_test(&key->usage))
- queue_work(system_nrt_wq, &key_gc_work);
+ schedule_work(&key_gc_work);
}
}
EXPORT_SYMBOL(key_put);
--
1.7.7.3

2012-08-14 01:50:33

by Tejun Heo

Subject: [PATCH 6/6] workqueue: deprecate WQ_NON_REENTRANT

WQ_NON_REENTRANT is now spurious. Make alloc_workqueue() trigger
WARN_ON() if the flag is specified and drop it from all in-kernel
users.

If you're cc'd and wondering what's going on: Now all workqueues are
non-reentrant, so there's no reason to use WQ_NON_REENTRANT.

Some of the conversions in this patch aren't trivial.

* WQ_UNBOUND | WQ_NON_REENTRANT workqueues w/ max_active == 1 are
converted to alloc_ordered_workqueue().

* NFC was using a lot of WQ_UNBOUND | WQ_NON_REENTRANT |
WQ_MEM_RECLAIM workqueues w/ max_active == 1. I converted them to
alloc_ordered_workqueue() but can't understand why NFC would need
the reclaim flag at all. I added comments above all such
occurrences. If I'm missing something, please let me know;
otherwise, please drop the flag. Better, please try to use
system_wq instead.

* drivers/staging/nvec doesn't seem to need its own workqueue.
Converted to use system_wq and cancel work items on unload. Please
scream if it's wrong.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Stefan Richter <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Alasdair Kergon <[email protected]>
Cc: Lauro Ramos Venancio <[email protected]>
Cc: Aloisio Almeida Jr <[email protected]>
Cc: Samuel Ortiz <[email protected]>
Cc: Julian Andres Klode <[email protected]>
Cc: Marc Dietrich <[email protected]>
Cc: Rob Clark <[email protected]>
Cc: Christine Caulfield <[email protected]>
Cc: Steven Whitehouse <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Sage Weil <[email protected]>
---
Documentation/workqueue.txt | 14 ++------------
drivers/firewire/core-transaction.c | 3 +--
drivers/gpu/drm/i915/i915_dma.c | 6 ++----
drivers/md/dm-crypt.c | 10 ++--------
drivers/md/dm-kcopyd.c | 3 +--
drivers/md/dm-raid1.c | 3 +--
drivers/md/dm.c | 3 +--
drivers/mmc/host/dw_mmc.c | 3 +--
drivers/nfc/pn533.c | 5 ++---
drivers/staging/nvec/nvec.c | 10 ++++------
drivers/staging/nvec/nvec.h | 2 --
drivers/staging/omapdrm/omap_drv.c | 3 +--
fs/dlm/ast.c | 5 +----
fs/gfs2/main.c | 2 +-
fs/xfs/xfs_super.c | 2 +-
kernel/workqueue.c | 3 +++
net/ceph/messenger.c | 2 +-
net/nfc/core.c | 6 +++---
net/nfc/hci/core.c | 8 ++++----
net/nfc/hci/shdlc.c | 4 ++--
net/nfc/llcp/llcp.c | 18 ++++++------------
21 files changed, 40 insertions(+), 75 deletions(-)

diff --git a/Documentation/workqueue.txt b/Documentation/workqueue.txt
index a6ab4b6..1fb8813 100644
--- a/Documentation/workqueue.txt
+++ b/Documentation/workqueue.txt
@@ -100,8 +100,8 @@ Subsystems and drivers can create and queue work items through special
workqueue API functions as they see fit. They can influence some
aspects of the way the work items are executed by setting flags on the
workqueue they are putting the work item on. These flags include
-things like CPU locality, reentrancy, concurrency limits, priority and
-more. To get a detailed overview refer to the API description of
+things like CPU locality, concurrency limits, priority and more. To
+get a detailed overview refer to the API description of
alloc_workqueue() below.

When a work item is queued to a workqueue, the target gcwq and
@@ -166,16 +166,6 @@ resources, scheduled and executed.

@flags:

- WQ_NON_REENTRANT
-
- By default, a wq guarantees non-reentrance only on the same
- CPU. A work item may not be executed concurrently on the same
- CPU by multiple workers but is allowed to be executed
- concurrently on multiple CPUs. This flag makes sure
- non-reentrance is enforced across all CPUs. Work items queued
- to a non-reentrant wq are guaranteed to be executed by at most
- one worker system-wide at any given time.
-
WQ_UNBOUND

Work items queued to an unbound wq are served by a special
diff --git a/drivers/firewire/core-transaction.c b/drivers/firewire/core-transaction.c
index 87d6f2d..45acc0f 100644
--- a/drivers/firewire/core-transaction.c
+++ b/drivers/firewire/core-transaction.c
@@ -1257,8 +1257,7 @@ static int __init fw_core_init(void)
{
int ret;

- fw_workqueue = alloc_workqueue("firewire",
- WQ_NON_REENTRANT | WQ_MEM_RECLAIM, 0);
+ fw_workqueue = alloc_workqueue("firewire", WQ_MEM_RECLAIM, 0);
if (!fw_workqueue)
return -ENOMEM;

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 9cf7dfe..a55ca7a 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1536,11 +1536,9 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
*
* All tasks on the workqueue are expected to acquire the dev mutex
* so there is no point in running more than one instance of the
- * workqueue at any time: max_active = 1 and NON_REENTRANT.
+ * workqueue at any time. Use an ordered one.
*/
- dev_priv->wq = alloc_workqueue("i915",
- WQ_UNBOUND | WQ_NON_REENTRANT,
- 1);
+ dev_priv->wq = alloc_ordered_workqueue("i915", 0);
if (dev_priv->wq == NULL) {
DRM_ERROR("Failed to create our workqueue.\n");
ret = -ENOMEM;
diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 664743d..a7472d6 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1674,20 +1674,14 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv)
}

ret = -ENOMEM;
- cc->io_queue = alloc_workqueue("kcryptd_io",
- WQ_NON_REENTRANT|
- WQ_MEM_RECLAIM,
- 1);
+ cc->io_queue = alloc_workqueue("kcryptd_io", WQ_MEM_RECLAIM, 1);
if (!cc->io_queue) {
ti->error = "Couldn't create kcryptd io queue";
goto bad;
}

cc->crypt_queue = alloc_workqueue("kcryptd",
- WQ_NON_REENTRANT|
- WQ_CPU_INTENSIVE|
- WQ_MEM_RECLAIM,
- 1);
+ WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM, 1);
if (!cc->crypt_queue) {
ti->error = "Couldn't create kcryptd queue";
goto bad;
diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c
index bed444c..7234f71 100644
--- a/drivers/md/dm-kcopyd.c
+++ b/drivers/md/dm-kcopyd.c
@@ -704,8 +704,7 @@ struct dm_kcopyd_client *dm_kcopyd_client_create(void)
goto bad_slab;

INIT_WORK(&kc->kcopyd_work, do_work);
- kc->kcopyd_wq = alloc_workqueue("kcopyd",
- WQ_NON_REENTRANT | WQ_MEM_RECLAIM, 0);
+ kc->kcopyd_wq = alloc_workqueue("kcopyd", WQ_MEM_RECLAIM, 0);
if (!kc->kcopyd_wq)
goto bad_workqueue;

diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index fd61f98..c282fc7 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -1090,8 +1090,7 @@ static int mirror_ctr(struct dm_target *ti, unsigned int argc, char **argv)
ti->num_discard_requests = 1;
ti->discard_zeroes_data_unsupported = true;

- ms->kmirrord_wq = alloc_workqueue("kmirrord",
- WQ_NON_REENTRANT | WQ_MEM_RECLAIM, 0);
+ ms->kmirrord_wq = alloc_workqueue("kmirrord", WQ_MEM_RECLAIM, 0);
if (!ms->kmirrord_wq) {
DMERR("couldn't start kmirrord");
r = -ENOMEM;
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 4e09b6f..7c5ef85 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1897,8 +1897,7 @@ static struct mapped_device *alloc_dev(int minor)
add_disk(md->disk);
format_dev_t(md->name, MKDEV(_major, minor));

- md->wq = alloc_workqueue("kdmflush",
- WQ_NON_REENTRANT | WQ_MEM_RECLAIM, 0);
+ md->wq = alloc_workqueue("kdmflush", WQ_MEM_RECLAIM, 0);
if (!md->wq)
goto bad_thread;

diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
index 72dc3cd..119e441 100644
--- a/drivers/mmc/host/dw_mmc.c
+++ b/drivers/mmc/host/dw_mmc.c
@@ -2030,8 +2030,7 @@ int dw_mci_probe(struct dw_mci *host)
mci_writel(host, CLKSRC, 0);

tasklet_init(&host->tasklet, dw_mci_tasklet_func, (unsigned long)host);
- host->card_workqueue = alloc_workqueue("dw-mci-card",
- WQ_MEM_RECLAIM | WQ_NON_REENTRANT, 1);
+ host->card_workqueue = alloc_workqueue("dw-mci-card", WQ_MEM_RECLAIM, 1);
if (!host->card_workqueue)
goto err_dmaunmap;
INIT_WORK(&host->card_work, dw_mci_work_routine_card);
diff --git a/drivers/nfc/pn533.c b/drivers/nfc/pn533.c
index d606f52..7dadcb8 100644
--- a/drivers/nfc/pn533.c
+++ b/drivers/nfc/pn533.c
@@ -2334,9 +2334,8 @@ static int pn533_probe(struct usb_interface *interface,
INIT_WORK(&dev->mi_work, pn533_wq_mi_recv);
INIT_WORK(&dev->tg_work, pn533_wq_tg_get_data);
INIT_WORK(&dev->poll_work, pn533_wq_poll);
- dev->wq = alloc_workqueue("pn533",
- WQ_NON_REENTRANT | WQ_UNBOUND | WQ_MEM_RECLAIM,
- 1);
+ /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
+ dev->wq = alloc_ordered_workqueue("pn533", WQ_MEM_RECLAIM);
if (dev->wq == NULL)
goto error;

diff --git a/drivers/staging/nvec/nvec.c b/drivers/staging/nvec/nvec.c
index 695ea35..9c39d53 100644
--- a/drivers/staging/nvec/nvec.c
+++ b/drivers/staging/nvec/nvec.c
@@ -264,7 +264,7 @@ int nvec_write_async(struct nvec_chip *nvec, const unsigned char *data,
list_add_tail(&msg->node, &nvec->tx_data);
spin_unlock_irqrestore(&nvec->tx_lock, flags);

- queue_work(nvec->wq, &nvec->tx_work);
+ schedule_work(&nvec->tx_work);

return 0;
}
@@ -470,7 +470,7 @@ static void nvec_rx_completed(struct nvec_chip *nvec)
if (!nvec_msg_is_event(nvec->rx))
complete(&nvec->ec_transfer);

- queue_work(nvec->wq, &nvec->rx_work);
+ schedule_work(&nvec->rx_work);
}

/**
@@ -791,13 +791,11 @@ static int __devinit tegra_nvec_probe(struct platform_device *pdev)
INIT_LIST_HEAD(&nvec->tx_data);
INIT_WORK(&nvec->rx_work, nvec_dispatch);
INIT_WORK(&nvec->tx_work, nvec_request_master);
- nvec->wq = alloc_workqueue("nvec", WQ_NON_REENTRANT, 2);

err = devm_gpio_request_one(&pdev->dev, nvec->gpio, GPIOF_OUT_INIT_HIGH,
"nvec gpio");
if (err < 0) {
dev_err(nvec->dev, "couldn't request gpio\n");
- destroy_workqueue(nvec->wq);
return -ENODEV;
}

@@ -805,7 +803,6 @@ static int __devinit tegra_nvec_probe(struct platform_device *pdev)
"nvec", nvec);
if (err) {
dev_err(nvec->dev, "couldn't request irq\n");
- destroy_workqueue(nvec->wq);
return -ENODEV;
}
disable_irq(nvec->irq);
@@ -859,7 +856,8 @@ static int __devexit tegra_nvec_remove(struct platform_device *pdev)

nvec_write_async(nvec, EC_DISABLE_EVENT_REPORTING, 3);
mfd_remove_devices(nvec->dev);
- destroy_workqueue(nvec->wq);
+ cancel_work_sync(&nvec->rx_work);
+ cancel_work_sync(&nvec->tx_work);

return 0;
}
diff --git a/drivers/staging/nvec/nvec.h b/drivers/staging/nvec/nvec.h
index ba6ed8f..f522dbd 100644
--- a/drivers/staging/nvec/nvec.h
+++ b/drivers/staging/nvec/nvec.h
@@ -139,7 +139,6 @@ struct nvec_platform_data {
* @nvec_status_notifier: Internal notifier (see nvec_status_notifier())
* @rx_work: A work structure for the RX worker nvec_dispatch()
* @tx_work: A work structure for the TX worker nvec_request_master()
- * @wq: The work queue in which @rx_work and @tx_work are executed
* @rx: The message currently being retrieved or %NULL
* @msg_pool: A pool of messages for allocation
* @tx: The message currently being transferred
@@ -165,7 +164,6 @@ struct nvec_chip {
struct list_head rx_data, tx_data;
struct notifier_block nvec_status_notifier;
struct work_struct rx_work, tx_work;
- struct workqueue_struct *wq;
struct nvec_msg msg_pool[NVEC_POOL_SIZE];
struct nvec_msg *rx;

diff --git a/drivers/staging/omapdrm/omap_drv.c b/drivers/staging/omapdrm/omap_drv.c
index 4beab94..672cb3a 100644
--- a/drivers/staging/omapdrm/omap_drv.c
+++ b/drivers/staging/omapdrm/omap_drv.c
@@ -571,8 +571,7 @@ static int dev_load(struct drm_device *dev, unsigned long flags)

dev->dev_private = priv;

- priv->wq = alloc_workqueue("omapdrm",
- WQ_UNBOUND | WQ_NON_REENTRANT, 1);
+ priv->wq = alloc_ordered_workqueue("omapdrm", 0);

INIT_LIST_HEAD(&priv->obj_list);

diff --git a/fs/dlm/ast.c b/fs/dlm/ast.c
index 63dc19c..cd24afd 100644
--- a/fs/dlm/ast.c
+++ b/fs/dlm/ast.c
@@ -267,10 +267,7 @@ void dlm_callback_work(struct work_struct *work)
int dlm_callback_start(struct dlm_ls *ls)
{
ls->ls_callback_wq = alloc_workqueue("dlm_callback",
- WQ_UNBOUND |
- WQ_MEM_RECLAIM |
- WQ_NON_REENTRANT,
- 0);
+ WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
if (!ls->ls_callback_wq) {
log_print("can't start dlm_callback workqueue");
return -ENOMEM;
diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index e04d0e0..7b0f504 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -155,7 +155,7 @@ static int __init init_gfs2_fs(void)
goto fail_wq;

gfs2_control_wq = alloc_workqueue("gfs2_control",
- WQ_NON_REENTRANT | WQ_UNBOUND | WQ_FREEZABLE, 0);
+ WQ_UNBOUND | WQ_FREEZABLE, 0);
if (!gfs2_control_wq)
goto fail_recovery;

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index e8e6be4..c7199a4 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1531,7 +1531,7 @@ xfs_init_workqueues(void)
* competing for ressources. Use the default large max_active value
* so that even lots of filesystems can perform these task in parallel.
*/
- xfs_syncd_wq = alloc_workqueue("xfssyncd", WQ_NON_REENTRANT, 0);
+ xfs_syncd_wq = alloc_workqueue("xfssyncd", 0, 0);
if (!xfs_syncd_wq)
return -ENOMEM;

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index ce39467..910b1ee 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3195,6 +3195,9 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
unsigned int cpu;
size_t namelen;

+ WARN(flags & WQ_NON_REENTRANT,
+ "workqueue: WQ_NON_REENTRANT is deprecated, all workqueues are non-reentrant\n");
+
/* determine namelen, allocate wq and format name */
va_start(args, lock_name);
va_copy(args1, args);
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index b979675..fe1bd47 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -190,7 +190,7 @@ int ceph_msgr_init(void)
zero_page = ZERO_PAGE(0);
page_cache_get(zero_page);

- ceph_msgr_wq = alloc_workqueue("ceph-msgr", WQ_NON_REENTRANT, 0);
+ ceph_msgr_wq = alloc_workqueue("ceph-msgr", 0, 0);
if (ceph_msgr_wq)
return 0;

diff --git a/net/nfc/core.c b/net/nfc/core.c
index ff74979..b071f97 100644
--- a/net/nfc/core.c
+++ b/net/nfc/core.c
@@ -791,9 +791,9 @@ struct nfc_dev *nfc_allocate_device(struct nfc_ops *ops,

INIT_WORK(&dev->check_pres_work, nfc_check_pres_work);
snprintf(name, sizeof(name), "nfc%d_check_pres_wq", dev->idx);
- dev->check_pres_wq = alloc_workqueue(name, WQ_NON_REENTRANT |
- WQ_UNBOUND |
- WQ_MEM_RECLAIM, 1);
+ /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
+ dev->check_pres_wq = alloc_ordered_workqueue(name,
+ WQ_MEM_RECLAIM);
if (dev->check_pres_wq == NULL) {
kfree(dev);
return NULL;
diff --git a/net/nfc/hci/core.c b/net/nfc/hci/core.c
index 1ac7b3f..5e48792 100644
--- a/net/nfc/hci/core.c
+++ b/net/nfc/hci/core.c
@@ -670,8 +670,8 @@ int nfc_hci_register_device(struct nfc_hci_dev *hdev)

INIT_WORK(&hdev->msg_tx_work, nfc_hci_msg_tx_work);
snprintf(name, sizeof(name), "%s_hci_msg_tx_wq", devname);
- hdev->msg_tx_wq = alloc_workqueue(name, WQ_NON_REENTRANT | WQ_UNBOUND |
- WQ_MEM_RECLAIM, 1);
+ /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
+ hdev->msg_tx_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
if (hdev->msg_tx_wq == NULL) {
r = -ENOMEM;
goto exit;
@@ -685,8 +685,8 @@ int nfc_hci_register_device(struct nfc_hci_dev *hdev)

INIT_WORK(&hdev->msg_rx_work, nfc_hci_msg_rx_work);
snprintf(name, sizeof(name), "%s_hci_msg_rx_wq", devname);
- hdev->msg_rx_wq = alloc_workqueue(name, WQ_NON_REENTRANT | WQ_UNBOUND |
- WQ_MEM_RECLAIM, 1);
+ /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
+ hdev->msg_rx_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
if (hdev->msg_rx_wq == NULL) {
r = -ENOMEM;
goto exit;
diff --git a/net/nfc/hci/shdlc.c b/net/nfc/hci/shdlc.c
index 6f840c1..04405101 100644
--- a/net/nfc/hci/shdlc.c
+++ b/net/nfc/hci/shdlc.c
@@ -877,8 +877,8 @@ struct nfc_shdlc *nfc_shdlc_allocate(struct nfc_shdlc_ops *ops,

INIT_WORK(&shdlc->sm_work, nfc_shdlc_sm_work);
snprintf(name, sizeof(name), "%s_shdlc_sm_wq", devname);
- shdlc->sm_wq = alloc_workqueue(name, WQ_NON_REENTRANT | WQ_UNBOUND |
- WQ_MEM_RECLAIM, 1);
+ /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
+ shdlc->sm_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
if (shdlc->sm_wq == NULL)
goto err_allocwq;

diff --git a/net/nfc/llcp/llcp.c b/net/nfc/llcp/llcp.c
index 82f0f75..57d5df2 100644
--- a/net/nfc/llcp/llcp.c
+++ b/net/nfc/llcp/llcp.c
@@ -1150,10 +1150,8 @@ int nfc_llcp_register_device(struct nfc_dev *ndev)
skb_queue_head_init(&local->tx_queue);
INIT_WORK(&local->tx_work, nfc_llcp_tx_work);
snprintf(name, sizeof(name), "%s_llcp_tx_wq", dev_name(dev));
- local->tx_wq =
- alloc_workqueue(name,
- WQ_NON_REENTRANT | WQ_UNBOUND | WQ_MEM_RECLAIM,
- 1);
+ /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
+ local->tx_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
if (local->tx_wq == NULL) {
err = -ENOMEM;
goto err_local;
@@ -1162,10 +1160,8 @@ int nfc_llcp_register_device(struct nfc_dev *ndev)
local->rx_pending = NULL;
INIT_WORK(&local->rx_work, nfc_llcp_rx_work);
snprintf(name, sizeof(name), "%s_llcp_rx_wq", dev_name(dev));
- local->rx_wq =
- alloc_workqueue(name,
- WQ_NON_REENTRANT | WQ_UNBOUND | WQ_MEM_RECLAIM,
- 1);
+ /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
+ local->rx_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
if (local->rx_wq == NULL) {
err = -ENOMEM;
goto err_tx_wq;
@@ -1173,10 +1169,8 @@ int nfc_llcp_register_device(struct nfc_dev *ndev)

INIT_WORK(&local->timeout_work, nfc_llcp_timeout_work);
snprintf(name, sizeof(name), "%s_llcp_timeout_wq", dev_name(dev));
- local->timeout_wq =
- alloc_workqueue(name,
- WQ_NON_REENTRANT | WQ_UNBOUND | WQ_MEM_RECLAIM,
- 1);
+ /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
+ local->timeout_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
if (local->timeout_wq == NULL) {
err = -ENOMEM;
goto err_rx_wq;
--
1.7.7.3

2012-08-14 01:50:58

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 3/6] workqueue: gut system_nrt[_freezable]_wq()

Now that all workqueues are non-reentrant, system[_freezable]_wq() are
equivalent to system_nrt[_freezable]_wq(). Replace the latter with
wrappers around system[_freezable]_wq(). The wrapping goes through
inline functions so that __deprecated can be added easily.

Signed-off-by: Tejun Heo <[email protected]>
---
include/linux/workqueue.h | 23 ++++++++++++++---------
kernel/workqueue.c | 12 ++----------
2 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index b2716b6..b0a678b 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -297,10 +297,6 @@ enum {
* system_long_wq is similar to system_wq but may host long running
* works. Queue flushing might take relatively long.
*
- * system_nrt_wq is non-reentrant and guarantees that any given work
- * item is never executed in parallel by multiple CPUs. Queue
- * flushing might take relatively long.
- *
* system_unbound_wq is unbound workqueue. Workers are not bound to
* any specific CPU, not concurrency managed, and all queued works are
* executed immediately as long as max_active limit is not reached and
@@ -308,16 +304,25 @@ enum {
*
* system_freezable_wq is equivalent to system_wq except that it's
* freezable.
- *
- * system_nrt_freezable_wq is equivalent to system_nrt_wq except that
- * it's freezable.
*/
extern struct workqueue_struct *system_wq;
extern struct workqueue_struct *system_long_wq;
-extern struct workqueue_struct *system_nrt_wq;
extern struct workqueue_struct *system_unbound_wq;
extern struct workqueue_struct *system_freezable_wq;
-extern struct workqueue_struct *system_nrt_freezable_wq;
+
+static inline struct workqueue_struct *__system_nrt_wq(void)
+{
+ return system_wq;
+}
+
+static inline struct workqueue_struct *__system_nrt_freezable_wq(void)
+{
+ return system_freezable_wq;
+}
+
+/* equivlalent to system_wq and system_freezable_wq, deprecated */
+#define system_nrt_wq __system_nrt_wq()
+#define system_nrt_freezable_wq __system_nrt_freezable_wq()

extern struct workqueue_struct *
__alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 938adc0..ce39467 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -269,16 +269,12 @@ struct workqueue_struct {

struct workqueue_struct *system_wq __read_mostly;
struct workqueue_struct *system_long_wq __read_mostly;
-struct workqueue_struct *system_nrt_wq __read_mostly;
struct workqueue_struct *system_unbound_wq __read_mostly;
struct workqueue_struct *system_freezable_wq __read_mostly;
-struct workqueue_struct *system_nrt_freezable_wq __read_mostly;
EXPORT_SYMBOL_GPL(system_wq);
EXPORT_SYMBOL_GPL(system_long_wq);
-EXPORT_SYMBOL_GPL(system_nrt_wq);
EXPORT_SYMBOL_GPL(system_unbound_wq);
EXPORT_SYMBOL_GPL(system_freezable_wq);
-EXPORT_SYMBOL_GPL(system_nrt_freezable_wq);

#define CREATE_TRACE_POINTS
#include <trace/events/workqueue.h>
@@ -3847,16 +3843,12 @@ static int __init init_workqueues(void)

system_wq = alloc_workqueue("events", 0, 0);
system_long_wq = alloc_workqueue("events_long", 0, 0);
- system_nrt_wq = alloc_workqueue("events_nrt", WQ_NON_REENTRANT, 0);
system_unbound_wq = alloc_workqueue("events_unbound", WQ_UNBOUND,
WQ_UNBOUND_MAX_ACTIVE);
system_freezable_wq = alloc_workqueue("events_freezable",
WQ_FREEZABLE, 0);
- system_nrt_freezable_wq = alloc_workqueue("events_nrt_freezable",
- WQ_NON_REENTRANT | WQ_FREEZABLE, 0);
- BUG_ON(!system_wq || !system_long_wq || !system_nrt_wq ||
- !system_unbound_wq || !system_freezable_wq ||
- !system_nrt_freezable_wq);
+ BUG_ON(!system_wq || !system_long_wq || !system_unbound_wq ||
+ !system_freezable_wq);
return 0;
}
early_initcall(init_workqueues);
--
1.7.7.3

2012-08-14 01:51:18

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 1/6] workqueue: make all workqueues non-reentrant

By default, each per-cpu part of a bound workqueue operates separately
and a work item may be executing concurrently on different CPUs. The
behavior avoids some cross-cpu traffic but leads to subtle weirdness
and not-so-subtle contortions in the API.

* There's no sane use case for allowing a single work item to be
executed concurrently on multiple CPUs. People just get the
behavior unintentionally and are surprised when they learn about it.
Most either explicitly synchronize or use non-reentrant/ordered
workqueues, but this is error-prone.

* flush_work() can't wait for multiple instances of the same work item
on different CPUs. If a work item is executing on cpu0 and then
queued on cpu1, flush_work() can only wait for the one on cpu1.

Unfortunately, work items can easily cross CPU boundaries
unintentionally when the queueing thread gets migrated. This means
that if multiple queuers compete, flush_work() can't even guarantee
that the instance queued immediately before it has finished by the
time it returns.

* flush_work_sync() was added to work around some of the deficiencies
of flush_work(). In addition to the usual flushing, it ensures that
all currently executing instances are finished before returning.
This operation is expensive as it has to walk all CPUs and at the
same time fails to address the competing-queuer case.

Incorrectly using flush_work() when flush_work_sync() is necessary
is an easy error to make and can lead to bugs which are difficult to
reproduce.

* Similar problems exist for flush_delayed_work[_sync]().

Other than the cross-cpu access concern, there's no benefit in
allowing parallel execution and it's plain silly to have this level of
contortion for workqueue, which is widely used from core code to
extremely obscure drivers.

This patch makes all workqueues non-reentrant. If a work item is
executing on a different CPU when queueing is requested, it is always
queued to that CPU. This guarantees that any given work item can be
executing on at most one CPU and that, if a work item is both queued
and executing, both are on the same CPU.

The only behavior change which may affect workqueue users negatively
is that non-reentrancy overrides the affinity specified by
queue_work_on(). On a reentrant workqueue, the affinity specified by
queue_work_on() is always followed. Now, if the work item is
executing on one of the CPUs, the work item will be queued there
regardless of the requested affinity. I've reviewed all workqueue
users which request explicit affinity, and, fortunately, none seems to
be crazy enough to exploit parallel execution of the same work item.

This adds an additional busy_hash lookup if the work item was
previously queued on a different CPU. This shouldn't be noticeable
under any sane workload. Work item queueing isn't a very
high-frequency operation and work items don't jump across CPUs all
the time. In a microbenchmark designed to exaggerate this difference -
measuring the time it takes for two work items to repeatedly jump
between two CPUs 10M times with the busy_hash table densely
populated - the difference was around 3%.

While the overhead is measurable, it is only visible in pathological
cases and the difference isn't huge. This change brings much needed
sanity to workqueue and makes its behavior consistent with timer. I
think this is the right tradeoff to make.

This enables significant simplification of workqueue API.
Simplification patches will follow.

Signed-off-by: Tejun Heo <[email protected]>
---
kernel/workqueue.c | 13 +++++++------
1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 7413242..26f048b 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1216,14 +1216,15 @@ static void __queue_work(unsigned int cpu, struct workqueue_struct *wq,
cpu = raw_smp_processor_id();

/*
- * It's multi cpu. If @wq is non-reentrant and @work
- * was previously on a different cpu, it might still
- * be running there, in which case the work needs to
- * be queued on that cpu to guarantee non-reentrance.
+ * It's multi cpu. If @work was previously on a different
+ * cpu, it might still be running there, in which case the
+ * work needs to be queued on that cpu to guarantee
+ * non-reentrancy.
*/
gcwq = get_gcwq(cpu);
- if (wq->flags & WQ_NON_REENTRANT &&
- (last_gcwq = get_work_gcwq(work)) && last_gcwq != gcwq) {
+ last_gcwq = get_work_gcwq(work);
+
+ if (last_gcwq && last_gcwq != gcwq) {
struct worker *worker;

spin_lock(&last_gcwq->lock);
--
1.7.7.3

2012-08-14 17:25:18

by Marc Dietrich

[permalink] [raw]
Subject: Re: [PATCH 6/6] workqueue: deprecate WQ_NON_REENTRANT

Hi Tejun,

On Monday 13 August 2012 18:49:46 Tejun Heo wrote:
> WQ_NON_REENTRANT is now spurious. Make alloc_workqueue() trigger
> WARN_ON() if the flag is specified and drop it from all in-kernel
> users.
>
> If you're cc'd and wondering what's going on: Now all workqueues are
> non-reentrant, so there's no reason to use WQ_NON_REENTRANT.
>
> Some of the conversions in this patch aren't trivial.
>
> * WQ_UNBOUND | WQ_NON_REENTRANT workqueues w/ max_active == 1 are
> converted to alloc_ordered_workqueue().
>
> * NFC was using a lot of WQ_UNBOUND | WQ_NON_REENTRANT |
> WQ_MEM_RECLAIM workqueues w/ max_active == 1. I converted them to
> alloc_ordered_workqueue() but can't understand why NFC would need
> the reclaim flag at all. I added comments above all such
> occurrences. If I'm missing something, please let me know;
> otherwise, please drop the flag. Better, please try to use
> system_wq instead.
>
> * drivers/staging/nvec doesn't seem to need its own workqueue.
> Converted to use system_wq and cancel work items on unload. Please
> scream if it's wrong.

the nvec driver had its own workqueue because we found that it reduced some
timing-induced protocol errors. I just tested your patch which uses the system
workqueue and I failed to reproduce these protocol errors under stress
testing, so I think it's ok.

Because this is a single patch affecting many drivers, is this going through
linux-next or some subsystem tree? Maybe it would have been better to split it
into one patch per driver. Otherwise ...

Acked-By: Marc Dietrich <[email protected]>

> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Stefan Richter <[email protected]>
> Cc: Daniel Vetter <[email protected]>
> Cc: Alasdair Kergon <[email protected]>
> Cc: Lauro Ramos Venancio <[email protected]>
> Cc: Aloisio Almeida Jr <[email protected]>
> Cc: Samuel Ortiz <[email protected]>
> Cc: Julian Andres Klode <[email protected]>
> Cc: Marc Dietrich <[email protected]>
> Cc: Rob Clark <[email protected]>
> Cc: Christine Caulfield <[email protected]>
> Cc: Steven Whitehouse <[email protected]>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Sage Weil <[email protected]>
> ---
> Documentation/workqueue.txt | 14 ++------------
> drivers/firewire/core-transaction.c | 3 +--
> drivers/gpu/drm/i915/i915_dma.c | 6 ++----
> drivers/md/dm-crypt.c | 10 ++--------
> drivers/md/dm-kcopyd.c | 3 +--
> drivers/md/dm-raid1.c | 3 +--
> drivers/md/dm.c | 3 +--
> drivers/mmc/host/dw_mmc.c | 3 +--
> drivers/nfc/pn533.c | 5 ++---
> drivers/staging/nvec/nvec.c | 10 ++++------
> drivers/staging/nvec/nvec.h | 2 --
> drivers/staging/omapdrm/omap_drv.c | 3 +--
> fs/dlm/ast.c | 5 +----
> fs/gfs2/main.c | 2 +-
> fs/xfs/xfs_super.c | 2 +-
> kernel/workqueue.c | 3 +++
> net/ceph/messenger.c | 2 +-
> net/nfc/core.c | 6 +++---
> net/nfc/hci/core.c | 8 ++++----
> net/nfc/hci/shdlc.c | 4 ++--
> net/nfc/llcp/llcp.c | 18 ++++++------------
> 21 files changed, 40 insertions(+), 75 deletions(-)
>
> diff --git a/Documentation/workqueue.txt b/Documentation/workqueue.txt
> index a6ab4b6..1fb8813 100644
> --- a/Documentation/workqueue.txt
> +++ b/Documentation/workqueue.txt
> @@ -100,8 +100,8 @@ Subsystems and drivers can create and queue work items
> through special workqueue API functions as they see fit. They can influence
> some
> aspects of the way the work items are executed by setting flags on the
> workqueue they are putting the work item on. These flags include
> -things like CPU locality, reentrancy, concurrency limits, priority and
> -more. To get a detailed overview refer to the API description of
> +things like CPU locality, concurrency limits, priority and more. To
> +get a detailed overview refer to the API description of
> alloc_workqueue() below.
>
> When a work item is queued to a workqueue, the target gcwq and
> @@ -166,16 +166,6 @@ resources, scheduled and executed.
>
> @flags:
>
> - WQ_NON_REENTRANT
> -
> - By default, a wq guarantees non-reentrance only on the same
> - CPU. A work item may not be executed concurrently on the same
> - CPU by multiple workers but is allowed to be executed
> - concurrently on multiple CPUs. This flag makes sure
> - non-reentrance is enforced across all CPUs. Work items queued
> - to a non-reentrant wq are guaranteed to be executed by at most
> - one worker system-wide at any given time.
> -
> WQ_UNBOUND
>
> Work items queued to an unbound wq are served by a special
> diff --git a/drivers/firewire/core-transaction.c
> b/drivers/firewire/core-transaction.c index 87d6f2d..45acc0f 100644
> --- a/drivers/firewire/core-transaction.c
> +++ b/drivers/firewire/core-transaction.c
> @@ -1257,8 +1257,7 @@ static int __init fw_core_init(void)
> {
> int ret;
>
> - fw_workqueue = alloc_workqueue("firewire",
> - WQ_NON_REENTRANT | WQ_MEM_RECLAIM, 0);
> + fw_workqueue = alloc_workqueue("firewire", WQ_MEM_RECLAIM, 0);
> if (!fw_workqueue)
> return -ENOMEM;
>
> diff --git a/drivers/gpu/drm/i915/i915_dma.c
> b/drivers/gpu/drm/i915/i915_dma.c index 9cf7dfe..a55ca7a 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1536,11 +1536,9 @@ int i915_driver_load(struct drm_device *dev, unsigned
> long flags) *
> * All tasks on the workqueue are expected to acquire the dev mutex
> * so there is no point in running more than one instance of the
> - * workqueue at any time: max_active = 1 and NON_REENTRANT.
> + * workqueue at any time. Use an ordered one.
> */
> - dev_priv->wq = alloc_workqueue("i915",
> - WQ_UNBOUND | WQ_NON_REENTRANT,
> - 1);
> + dev_priv->wq = alloc_ordered_workqueue("i915", 0);
> if (dev_priv->wq == NULL) {
> DRM_ERROR("Failed to create our workqueue.\n");
> ret = -ENOMEM;
> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
> index 664743d..a7472d6 100644
> --- a/drivers/md/dm-crypt.c
> +++ b/drivers/md/dm-crypt.c
> @@ -1674,20 +1674,14 @@ static int crypt_ctr(struct dm_target *ti, unsigned
> int argc, char **argv) }
>
> ret = -ENOMEM;
> - cc->io_queue = alloc_workqueue("kcryptd_io",
> - WQ_NON_REENTRANT|
> - WQ_MEM_RECLAIM,
> - 1);
> + cc->io_queue = alloc_workqueue("kcryptd_io", WQ_MEM_RECLAIM, 1);
> if (!cc->io_queue) {
> ti->error = "Couldn't create kcryptd io queue";
> goto bad;
> }
>
> cc->crypt_queue = alloc_workqueue("kcryptd",
> - WQ_NON_REENTRANT|
> - WQ_CPU_INTENSIVE|
> - WQ_MEM_RECLAIM,
> - 1);
> + WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM, 1);
> if (!cc->crypt_queue) {
> ti->error = "Couldn't create kcryptd queue";
> goto bad;
> diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c
> index bed444c..7234f71 100644
> --- a/drivers/md/dm-kcopyd.c
> +++ b/drivers/md/dm-kcopyd.c
> @@ -704,8 +704,7 @@ struct dm_kcopyd_client *dm_kcopyd_client_create(void)
> goto bad_slab;
>
> INIT_WORK(&kc->kcopyd_work, do_work);
> - kc->kcopyd_wq = alloc_workqueue("kcopyd",
> - WQ_NON_REENTRANT | WQ_MEM_RECLAIM, 0);
> + kc->kcopyd_wq = alloc_workqueue("kcopyd", WQ_MEM_RECLAIM, 0);
> if (!kc->kcopyd_wq)
> goto bad_workqueue;
>
> diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
> index fd61f98..c282fc7 100644
> --- a/drivers/md/dm-raid1.c
> +++ b/drivers/md/dm-raid1.c
> @@ -1090,8 +1090,7 @@ static int mirror_ctr(struct dm_target *ti, unsigned
> int argc, char **argv) ti->num_discard_requests = 1;
> ti->discard_zeroes_data_unsupported = true;
>
> - ms->kmirrord_wq = alloc_workqueue("kmirrord",
> - WQ_NON_REENTRANT | WQ_MEM_RECLAIM, 0);
> + ms->kmirrord_wq = alloc_workqueue("kmirrord", WQ_MEM_RECLAIM, 0);
> if (!ms->kmirrord_wq) {
> DMERR("couldn't start kmirrord");
> r = -ENOMEM;
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 4e09b6f..7c5ef85 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1897,8 +1897,7 @@ static struct mapped_device *alloc_dev(int minor)
> add_disk(md->disk);
> format_dev_t(md->name, MKDEV(_major, minor));
>
> - md->wq = alloc_workqueue("kdmflush",
> - WQ_NON_REENTRANT | WQ_MEM_RECLAIM, 0);
> + md->wq = alloc_workqueue("kdmflush", WQ_MEM_RECLAIM, 0);
> if (!md->wq)
> goto bad_thread;
>
> diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
> index 72dc3cd..119e441 100644
> --- a/drivers/mmc/host/dw_mmc.c
> +++ b/drivers/mmc/host/dw_mmc.c
> @@ -2030,8 +2030,7 @@ int dw_mci_probe(struct dw_mci *host)
> mci_writel(host, CLKSRC, 0);
>
> tasklet_init(&host->tasklet, dw_mci_tasklet_func, (unsigned long)host);
> - host->card_workqueue = alloc_workqueue("dw-mci-card",
> - WQ_MEM_RECLAIM | WQ_NON_REENTRANT, 1);
> + host->card_workqueue = alloc_workqueue("dw-mci-card", WQ_MEM_RECLAIM, 1);
> if (!host->card_workqueue)
> goto err_dmaunmap;
> INIT_WORK(&host->card_work, dw_mci_work_routine_card);
> diff --git a/drivers/nfc/pn533.c b/drivers/nfc/pn533.c
> index d606f52..7dadcb8 100644
> --- a/drivers/nfc/pn533.c
> +++ b/drivers/nfc/pn533.c
> @@ -2334,9 +2334,8 @@ static int pn533_probe(struct usb_interface
> *interface, INIT_WORK(&dev->mi_work, pn533_wq_mi_recv);
> INIT_WORK(&dev->tg_work, pn533_wq_tg_get_data);
> INIT_WORK(&dev->poll_work, pn533_wq_poll);
> - dev->wq = alloc_workqueue("pn533",
> - WQ_NON_REENTRANT | WQ_UNBOUND | WQ_MEM_RECLAIM,
> - 1);
> + /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
> + dev->wq = alloc_ordered_workqueue("pn533", WQ_MEM_RECLAIM);
> if (dev->wq == NULL)
> goto error;
>
> diff --git a/drivers/staging/nvec/nvec.c b/drivers/staging/nvec/nvec.c
> index 695ea35..9c39d53 100644
> --- a/drivers/staging/nvec/nvec.c
> +++ b/drivers/staging/nvec/nvec.c
> @@ -264,7 +264,7 @@ int nvec_write_async(struct nvec_chip *nvec, const
> unsigned char *data, list_add_tail(&msg->node, &nvec->tx_data);
> spin_unlock_irqrestore(&nvec->tx_lock, flags);
>
> - queue_work(nvec->wq, &nvec->tx_work);
> + schedule_work(&nvec->tx_work);
>
> return 0;
> }
> @@ -470,7 +470,7 @@ static void nvec_rx_completed(struct nvec_chip *nvec)
> if (!nvec_msg_is_event(nvec->rx))
> complete(&nvec->ec_transfer);
>
> - queue_work(nvec->wq, &nvec->rx_work);
> + schedule_work(&nvec->rx_work);
> }
>
> /**
> @@ -791,13 +791,11 @@ static int __devinit tegra_nvec_probe(struct
> platform_device *pdev) INIT_LIST_HEAD(&nvec->tx_data);
> INIT_WORK(&nvec->rx_work, nvec_dispatch);
> INIT_WORK(&nvec->tx_work, nvec_request_master);
> - nvec->wq = alloc_workqueue("nvec", WQ_NON_REENTRANT, 2);
>
> err = devm_gpio_request_one(&pdev->dev, nvec->gpio, GPIOF_OUT_INIT_HIGH,
> "nvec gpio");
> if (err < 0) {
> dev_err(nvec->dev, "couldn't request gpio\n");
> - destroy_workqueue(nvec->wq);
> return -ENODEV;
> }
>
> @@ -805,7 +803,6 @@ static int __devinit tegra_nvec_probe(struct
> platform_device *pdev) "nvec", nvec);
> if (err) {
> dev_err(nvec->dev, "couldn't request irq\n");
> - destroy_workqueue(nvec->wq);
> return -ENODEV;
> }
> disable_irq(nvec->irq);
> @@ -859,7 +856,8 @@ static int __devexit tegra_nvec_remove(struct
> platform_device *pdev)
>
> nvec_write_async(nvec, EC_DISABLE_EVENT_REPORTING, 3);
> mfd_remove_devices(nvec->dev);
> - destroy_workqueue(nvec->wq);
> + cancel_work_sync(&nvec->rx_work);
> + cancel_work_sync(&nvec->tx_work);
>
> return 0;
> }
> diff --git a/drivers/staging/nvec/nvec.h b/drivers/staging/nvec/nvec.h
> index ba6ed8f..f522dbd 100644
> --- a/drivers/staging/nvec/nvec.h
> +++ b/drivers/staging/nvec/nvec.h
> @@ -139,7 +139,6 @@ struct nvec_platform_data {
> * @nvec_status_notifier: Internal notifier (see nvec_status_notifier())
> * @rx_work: A work structure for the RX worker nvec_dispatch()
> * @tx_work: A work structure for the TX worker nvec_request_master()
> - * @wq: The work queue in which @rx_work and @tx_work are executed
> * @rx: The message currently being retrieved or %NULL
> * @msg_pool: A pool of messages for allocation
> * @tx: The message currently being transferred
> @@ -165,7 +164,6 @@ struct nvec_chip {
> struct list_head rx_data, tx_data;
> struct notifier_block nvec_status_notifier;
> struct work_struct rx_work, tx_work;
> - struct workqueue_struct *wq;
> struct nvec_msg msg_pool[NVEC_POOL_SIZE];
> struct nvec_msg *rx;
>
> diff --git a/drivers/staging/omapdrm/omap_drv.c b/drivers/staging/omapdrm/omap_drv.c
> index 4beab94..672cb3a 100644
> --- a/drivers/staging/omapdrm/omap_drv.c
> +++ b/drivers/staging/omapdrm/omap_drv.c
> @@ -571,8 +571,7 @@ static int dev_load(struct drm_device *dev, unsigned long flags)
>
> dev->dev_private = priv;
>
> - priv->wq = alloc_workqueue("omapdrm",
> - WQ_UNBOUND | WQ_NON_REENTRANT, 1);
> + priv->wq = alloc_ordered_workqueue("omapdrm", 0);
>
> INIT_LIST_HEAD(&priv->obj_list);
>
> diff --git a/fs/dlm/ast.c b/fs/dlm/ast.c
> index 63dc19c..cd24afd 100644
> --- a/fs/dlm/ast.c
> +++ b/fs/dlm/ast.c
> @@ -267,10 +267,7 @@ void dlm_callback_work(struct work_struct *work)
> int dlm_callback_start(struct dlm_ls *ls)
> {
> ls->ls_callback_wq = alloc_workqueue("dlm_callback",
> - WQ_UNBOUND |
> - WQ_MEM_RECLAIM |
> - WQ_NON_REENTRANT,
> - 0);
> + WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
> if (!ls->ls_callback_wq) {
> log_print("can't start dlm_callback workqueue");
> return -ENOMEM;
> diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
> index e04d0e0..7b0f504 100644
> --- a/fs/gfs2/main.c
> +++ b/fs/gfs2/main.c
> @@ -155,7 +155,7 @@ static int __init init_gfs2_fs(void)
> goto fail_wq;
>
> gfs2_control_wq = alloc_workqueue("gfs2_control",
> - WQ_NON_REENTRANT | WQ_UNBOUND | WQ_FREEZABLE, 0);
> + WQ_UNBOUND | WQ_FREEZABLE, 0);
> if (!gfs2_control_wq)
> goto fail_recovery;
>
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index e8e6be4..c7199a4 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1531,7 +1531,7 @@ xfs_init_workqueues(void)
> * competing for ressources. Use the default large max_active value
> * so that even lots of filesystems can perform these task in parallel.
> */
> - xfs_syncd_wq = alloc_workqueue("xfssyncd", WQ_NON_REENTRANT, 0);
> + xfs_syncd_wq = alloc_workqueue("xfssyncd", 0, 0);
> if (!xfs_syncd_wq)
> return -ENOMEM;
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index ce39467..910b1ee 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3195,6 +3195,9 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
> unsigned int cpu;
> size_t namelen;
>
> + WARN(flags & WQ_NON_REENTRANT,
> + "workqueue: WQ_NON_REENTRANT is deprecated, all workqueues are non-reentrant\n");
> +
> /* determine namelen, allocate wq and format name */
> va_start(args, lock_name);
> va_copy(args1, args);
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index b979675..fe1bd47 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -190,7 +190,7 @@ int ceph_msgr_init(void)
> zero_page = ZERO_PAGE(0);
> page_cache_get(zero_page);
>
> - ceph_msgr_wq = alloc_workqueue("ceph-msgr", WQ_NON_REENTRANT, 0);
> + ceph_msgr_wq = alloc_workqueue("ceph-msgr", 0, 0);
> if (ceph_msgr_wq)
> return 0;
>
> diff --git a/net/nfc/core.c b/net/nfc/core.c
> index ff74979..b071f97 100644
> --- a/net/nfc/core.c
> +++ b/net/nfc/core.c
> @@ -791,9 +791,9 @@ struct nfc_dev *nfc_allocate_device(struct nfc_ops *ops,
>
> INIT_WORK(&dev->check_pres_work, nfc_check_pres_work);
> snprintf(name, sizeof(name), "nfc%d_check_pres_wq", dev->idx);
> - dev->check_pres_wq = alloc_workqueue(name, WQ_NON_REENTRANT |
> - WQ_UNBOUND |
> - WQ_MEM_RECLAIM, 1);
> + /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
> + dev->check_pres_wq = alloc_ordered_workqueue(name,
> + WQ_MEM_RECLAIM);
> if (dev->check_pres_wq == NULL) {
> kfree(dev);
> return NULL;
> diff --git a/net/nfc/hci/core.c b/net/nfc/hci/core.c
> index 1ac7b3f..5e48792 100644
> --- a/net/nfc/hci/core.c
> +++ b/net/nfc/hci/core.c
> @@ -670,8 +670,8 @@ int nfc_hci_register_device(struct nfc_hci_dev *hdev)
>
> INIT_WORK(&hdev->msg_tx_work, nfc_hci_msg_tx_work);
> snprintf(name, sizeof(name), "%s_hci_msg_tx_wq", devname);
> - hdev->msg_tx_wq = alloc_workqueue(name, WQ_NON_REENTRANT | WQ_UNBOUND |
> - WQ_MEM_RECLAIM, 1);
> + /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
> + hdev->msg_tx_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
> if (hdev->msg_tx_wq == NULL) {
> r = -ENOMEM;
> goto exit;
> @@ -685,8 +685,8 @@ int nfc_hci_register_device(struct nfc_hci_dev *hdev)
>
> INIT_WORK(&hdev->msg_rx_work, nfc_hci_msg_rx_work);
> snprintf(name, sizeof(name), "%s_hci_msg_rx_wq", devname);
> - hdev->msg_rx_wq = alloc_workqueue(name, WQ_NON_REENTRANT | WQ_UNBOUND |
> - WQ_MEM_RECLAIM, 1);
> + /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
> + hdev->msg_rx_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
> if (hdev->msg_rx_wq == NULL) {
> r = -ENOMEM;
> goto exit;
> diff --git a/net/nfc/hci/shdlc.c b/net/nfc/hci/shdlc.c
> index 6f840c1..04405101 100644
> --- a/net/nfc/hci/shdlc.c
> +++ b/net/nfc/hci/shdlc.c
> @@ -877,8 +877,8 @@ struct nfc_shdlc *nfc_shdlc_allocate(struct nfc_shdlc_ops *ops,
>
> INIT_WORK(&shdlc->sm_work, nfc_shdlc_sm_work);
> snprintf(name, sizeof(name), "%s_shdlc_sm_wq", devname);
> - shdlc->sm_wq = alloc_workqueue(name, WQ_NON_REENTRANT | WQ_UNBOUND |
> - WQ_MEM_RECLAIM, 1);
> + /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
> + shdlc->sm_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
> if (shdlc->sm_wq == NULL)
> goto err_allocwq;
>
> diff --git a/net/nfc/llcp/llcp.c b/net/nfc/llcp/llcp.c
> index 82f0f75..57d5df2 100644
> --- a/net/nfc/llcp/llcp.c
> +++ b/net/nfc/llcp/llcp.c
> @@ -1150,10 +1150,8 @@ int nfc_llcp_register_device(struct nfc_dev *ndev)
> skb_queue_head_init(&local->tx_queue);
> INIT_WORK(&local->tx_work, nfc_llcp_tx_work);
> snprintf(name, sizeof(name), "%s_llcp_tx_wq", dev_name(dev));
> - local->tx_wq =
> - alloc_workqueue(name,
> - WQ_NON_REENTRANT | WQ_UNBOUND | WQ_MEM_RECLAIM,
> - 1);
> + /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
> + local->tx_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
> if (local->tx_wq == NULL) {
> err = -ENOMEM;
> goto err_local;
> @@ -1162,10 +1160,8 @@ int nfc_llcp_register_device(struct nfc_dev *ndev)
> local->rx_pending = NULL;
> INIT_WORK(&local->rx_work, nfc_llcp_rx_work);
> snprintf(name, sizeof(name), "%s_llcp_rx_wq", dev_name(dev));
> - local->rx_wq =
> - alloc_workqueue(name,
> - WQ_NON_REENTRANT | WQ_UNBOUND | WQ_MEM_RECLAIM,
> - 1);
> + /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
> + local->rx_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
> if (local->rx_wq == NULL) {
> err = -ENOMEM;
> goto err_tx_wq;
> @@ -1173,10 +1169,8 @@ int nfc_llcp_register_device(struct nfc_dev *ndev)
>
> INIT_WORK(&local->timeout_work, nfc_llcp_timeout_work);
> snprintf(name, sizeof(name), "%s_llcp_timeout_wq", dev_name(dev));
> - local->timeout_wq =
> - alloc_workqueue(name,
> - WQ_NON_REENTRANT | WQ_UNBOUND | WQ_MEM_RECLAIM,
> - 1);
> + /* XXX tj - why does NFC need WQ_MEM_RECLAIM? */
> + local->timeout_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
> if (local->timeout_wq == NULL) {
> err = -ENOMEM;
> goto err_rx_wq;

2012-08-14 17:35:38

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 6/6] workqueue: deprecate WQ_NON_REENTRANT

Hello,

On Tue, Aug 14, 2012 at 07:24:59PM +0200, Marc Dietrich wrote:
> the nvec driver had its own workqueue because we found that it reduced some
> timing-induced protocol errors. I just tested your patch, which uses the system
> workqueue and I failed to reproduce these protocol errors under stress
> testing, so I think it's ok.

Thanks, there shouldn't be any difference whether you use system_wq or
your own.
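For readers following along, the guarantee being relied on here can be illustrated with a self-contained userspace analogy (pthreads standing in for the worker pool; all names are illustrative stubs, not kernel API):

```c
#include <pthread.h>
#include <assert.h>

/* Userspace sketch of the post-conversion driver pattern: a work item
 * is queued asynchronously, and teardown does a synchronous cancel so
 * it is guaranteed idle afterwards.  This only illustrates the
 * semantics; it is not the nvec code. */

static int runs;
static pthread_t worker;

static void *work_fn(void *arg)
{
	(void)arg;
	runs++;			/* the "work" executes exactly once */
	return NULL;
}

/* like schedule_work(): hand the item to an asynchronous worker */
static void fake_schedule_work(void)
{
	pthread_create(&worker, NULL, work_fn, NULL);
}

/* like cancel_work_sync(): block until the running instance finishes */
static void fake_cancel_work_sync(void)
{
	pthread_join(worker, NULL);
}

static int run_demo(void)
{
	fake_schedule_work();
	fake_cancel_work_sync();
	return runs;		/* teardown observes the completed work */
}
```

The point is that once every workqueue is non-reentrant, the private queue buys the driver nothing: schedule_work() plus cancel_work_sync() on remove gives the same single-instance and idle-on-teardown guarantees.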

> Because this is a single patch affecting many drivers, is this going through
> linux-next or some subsystem tree? Maybe it would have been better to split it
into one patch per driver. Otherwise ...

I generally think it's better to do a quick one-time conversion for
things like this, but this patch is rather borderline. It might be
better to separate out the non-trivial ones and route them separately.
I'll think more about it.

Thanks.

--
tejun

2012-08-20 02:55:29

by Lai Jiangshan

[permalink] [raw]
Subject: Re: [PATCH 5/6] workqueue: deprecate system_nrt[_freezable]_wq

On 08/14/2012 09:49 AM, Tejun Heo wrote:
> system_nrt[_freezable]_wq are now spurious. Mark them deprecated and
> convert all users to system[_freezable]_wq.
>
> If you're cc'd and wondering what's going on: Now all workqueues are
> non-reentrant, so there's no reason to use system_nrt[_freezable]_wq.
> Please use system[_freezable]_wq instead.
>
> This patch doesn't make any functional difference.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Jens Axboe <[email protected]>
> Cc: David Airlie <[email protected]>
> Cc: Jiri Kosina <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Cc: Rusty Russell <[email protected]>
> Cc: "Paul E. McKenney" <[email protected]>
> Cc: David Howells <[email protected]>
> ---
> block/blk-throttle.c | 7 +++----
> block/genhd.c | 10 +++++-----
> drivers/gpu/drm/drm_crtc_helper.c | 6 +++---
> drivers/hid/hid-wiimote-ext.c | 2 +-
> drivers/mmc/core/host.c | 4 ++--
> drivers/net/virtio_net.c | 12 ++++++------
> include/linux/workqueue.h | 4 ++--
> kernel/srcu.c | 4 ++--
> security/keys/gc.c | 8 ++++----
> security/keys/key.c | 2 +-
> 10 files changed, 29 insertions(+), 30 deletions(-)



> diff --git a/kernel/srcu.c b/kernel/srcu.c
> index 2095be3..97c465e 100644
> --- a/kernel/srcu.c
> +++ b/kernel/srcu.c
> @@ -379,7 +379,7 @@ void call_srcu(struct srcu_struct *sp, struct rcu_head *head,
> rcu_batch_queue(&sp->batch_queue, head);
> if (!sp->running) {
> sp->running = true;
> - queue_delayed_work(system_nrt_wq, &sp->work, 0);
> + schedule_delayed_work(&sp->work, 0);
> }
> spin_unlock_irqrestore(&sp->queue_lock, flags);
> }
> @@ -631,7 +631,7 @@ static void srcu_reschedule(struct srcu_struct *sp)
> }
>
> if (pending)
> - queue_delayed_work(system_nrt_wq, &sp->work, SRCU_INTERVAL);
> + schedule_delayed_work(&sp->work, SRCU_INTERVAL);
> }
>
> /*


Acked-By: Lai Jiangshan <[email protected]>

2012-08-20 21:55:05

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCHSET] workqueue: make all workqueues non-reentrant

On Mon, Aug 13, 2012 at 06:49:40PM -0700, Tejun Heo wrote:
> This patchset enforces WQ_NON_REENTRANT behavior on all workqueues and
> simplifies workqueue API accordingly.

Applied 1-5 to wq/for-3.7. I'll break up the last patch further and
route it separately.

Thanks.

--
tejun