2023-05-10 16:07:57

by Johannes Berg

Subject: [RFC PATCH 0/4] wifi locking simplification start

Hi,

So maybe this outlines my thinking a bit better.

I'm adding two new bits of infrastructure to workqueues,
with the end result that a 'special' workqueue can be
completely serialized with other code even as far as
execution of a work is concerned.

The real advantage of this isn't the locking simplification
you see in patch 4; the real advantage is that mac80211
will later be able to use that workqueue, and even if it's
called from cfg80211 where the wiphy_lock() is already held,
it can still properly clean up its own work structs. This
is a thing that in the past caused the huge proliferation
of locks in mac80211, which really aren't needed since (a)
it also uses an ordered workqueue, and (b) all the drivers
have just their own underlying single lock to access the
device. Which makes sense, really, there's only one device.

As a consequence, this will allow us to radically simplify
locking, with drivers not even needing any locks of their
own, since all (control!) paths will hold one mutex.

The second patch in this series isn't strictly necessary;
we could just build our own worker infrastructure in cfg80211
and hold the lock there. But it seemed simple enough to at
least throw it out there; if you don't like it, I can just
rework this without it, though other subsystems may have
ideas along the same lines in the future.

Even the first patch isn't strictly needed; I previously
posted a version of the third patch without it, but doing
it this way seemed pretty simple and adds less overhead.

An alternative overall might be to just get rid of both of
those patches, put a separate work list into cfg80211, and
execute those work structs off a single "real" work struct
on the workqueue, with appropriate locking and exclusion;
but that builds very local infrastructure, whereas this
might be useful to others too?

Anyway, just an RFC right now. As described above, the real
meat would only kick in later, when we push this through to
mac80211 and even into drivers.

johannes




2023-05-10 16:08:03

by Johannes Berg

Subject: [RFC PATCH 1/4] workqueue: support pausing ordered workqueues

From: Johannes Berg <[email protected]>

Add some infrastructure to support pausing ordered
workqueues, so that no work items are executing nor
can execute while the workqueue is paused.

This can be used to simplify locking between work
structs and other code (e.g. userspace calls): while
the workqueue is paused, no work can be executing, so
code paths that need to cancel works can avoid the
usual synchronization issues.
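
For illustration, a minimal usage sketch (not part of this
patch; the workqueue and work struct names are made up):

static struct workqueue_struct *example_wq; /* ordered workqueue */
static struct work_struct example_work;

static void example_teardown(void)
{
	/* after this, no work on example_wq is running or can start */
	workqueue_pause(example_wq);

	/* plain cancel_work() is enough now, no _sync() required */
	cancel_work(&example_work);

	workqueue_resume(example_wq);
}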

Signed-off-by: Johannes Berg <[email protected]>
---
include/linux/workqueue.h | 26 ++++++++++++++++++++++++++
kernel/workqueue.c | 27 ++++++++++++++++++++++++++-
2 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index ac551b8ee7d9..5e2413017a89 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -340,6 +340,7 @@ enum {
__WQ_ORDERED = 1 << 17, /* internal: workqueue is ordered */
__WQ_LEGACY = 1 << 18, /* internal: create*_workqueue() */
__WQ_ORDERED_EXPLICIT = 1 << 19, /* internal: alloc_ordered_workqueue() */
+ __WQ_PAUSED = 1 << 20, /* internal: workqueue_pause() */

WQ_MAX_ACTIVE = 512, /* I like 512, better ideas? */
WQ_MAX_UNBOUND_PER_CPU = 4, /* 4 * #cpus for unbound wq */
@@ -474,6 +475,31 @@ extern void print_worker_info(const char *log_lvl, struct task_struct *task);
extern void show_all_workqueues(void);
extern void show_one_workqueue(struct workqueue_struct *wq);
extern void wq_worker_comm(char *buf, size_t size, struct task_struct *task);
+extern void __workqueue_pause_resume(struct workqueue_struct *wq, bool pause);
+
+/**
+ * workqueue_pause - pause a workqueue
+ * @wq: workqueue to pause
+ *
+ * Pause (and flush) the given workqueue so it's not executing any
+ * work structs and won't until workqueue_resume() is called.
+ */
+static inline void workqueue_pause(struct workqueue_struct *wq)
+{
+ __workqueue_pause_resume(wq, true);
+}
+
+/**
+ * workqueue_resume - resume a paused workqueue
+ * @wq: workqueue to resume
+ *
+ * Resume the given workqueue that was paused previously to
+ * make it run work structs again.
+ */
+static inline void workqueue_resume(struct workqueue_struct *wq)
+{
+ __workqueue_pause_resume(wq, false);
+}

/**
* queue_work - queue work on a workqueue
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index b8b541caed48..418d99ff8325 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3863,10 +3863,16 @@ static void pwq_adjust_max_active(struct pool_workqueue *pwq)
struct workqueue_struct *wq = pwq->wq;
bool freezable = wq->flags & WQ_FREEZABLE;
unsigned long flags;
+ int new_max_active;

- /* for @wq->saved_max_active */
+ /* for @wq->saved_max_active and @wq->flags */
lockdep_assert_held(&wq->mutex);

+ if (wq->flags & __WQ_PAUSED)
+ new_max_active = 0;
+ else
+ new_max_active = wq->saved_max_active;
+
/* fast exit for non-freezable wqs */
if (!freezable && pwq->max_active == wq->saved_max_active)
return;
@@ -4642,6 +4648,25 @@ void workqueue_set_max_active(struct workqueue_struct *wq, int max_active)
}
EXPORT_SYMBOL_GPL(workqueue_set_max_active);

+void __workqueue_pause_resume(struct workqueue_struct *wq, bool pause)
+{
+ struct pool_workqueue *pwq;
+
+ mutex_lock(&wq->mutex);
+ if (pause)
+ wq->flags |= __WQ_PAUSED;
+ else
+ wq->flags &= ~__WQ_PAUSED;
+
+ for_each_pwq(pwq, wq)
+ pwq_adjust_max_active(pwq);
+ mutex_unlock(&wq->mutex);
+
+ if (pause)
+ flush_workqueue(wq);
+}
+EXPORT_SYMBOL_GPL(__workqueue_pause_resume);
+
/**
* current_work - retrieve %current task's work struct
*
--
2.40.1


2023-05-10 16:08:05

by Johannes Berg

Subject: [RFC PATCH 4/4] wifi: cfg80211: move scan done work to cfg80211 workqueue

From: Johannes Berg <[email protected]>

Now that we have the cfg80211 workqueue, move the scan_done work
to it as an example.

Signed-off-by: Johannes Berg <[email protected]>
---
net/wireless/core.c | 1 -
net/wireless/scan.c | 5 ++---
2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/wireless/core.c b/net/wireless/core.c
index 11e600c71fb6..2908cc4f102e 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -1082,7 +1082,6 @@ void wiphy_unregister(struct wiphy *wiphy)
*/
flush_workqueue(rdev->wq);

- flush_work(&rdev->scan_done_wk);
cancel_work_sync(&rdev->conn_work);
flush_work(&rdev->event_work);
cancel_delayed_work_sync(&rdev->dfs_update_channels_wk);
diff --git a/net/wireless/scan.c b/net/wireless/scan.c
index 790bc31cf82e..6bd919352f55 100644
--- a/net/wireless/scan.c
+++ b/net/wireless/scan.c
@@ -1007,9 +1007,7 @@ void __cfg80211_scan_done(struct work_struct *wk)
rdev = container_of(wk, struct cfg80211_registered_device,
scan_done_wk);

- wiphy_lock(&rdev->wiphy);
___cfg80211_scan_done(rdev, true);
- wiphy_unlock(&rdev->wiphy);
}

void cfg80211_scan_done(struct cfg80211_scan_request *request,
@@ -1035,7 +1033,8 @@ void cfg80211_scan_done(struct cfg80211_scan_request *request,
}

request->notified = true;
- queue_work(cfg80211_wq, &wiphy_to_rdev(request->wiphy)->scan_done_wk);
+ wiphy_queue_work(request->wiphy,
+ &wiphy_to_rdev(request->wiphy)->scan_done_wk);
}
EXPORT_SYMBOL(cfg80211_scan_done);

--
2.40.1


2023-05-10 16:08:15

by Johannes Berg

Subject: [RFC PATCH 3/4] wifi: cfg80211: add a workqueue with special semantics

From: Johannes Berg <[email protected]>

The special semantics are that the workqueue is paused while
wiphy_lock() is held, so nothing can run, or even start to
run, on it while the wiphy is locked.
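
To illustrate the intended driver-side pattern (a sketch only,
with hypothetical drv_* names, not code from this series):

static void drv_scan_work(struct work_struct *work)
{
	/* runs with wiphy->mtx held, via the work mutex from patch 2 */
}

static void drv_stop(struct wiphy *wiphy, struct work_struct *work)
{
	wiphy_lock(wiphy);
	/* the work cannot be running here, so no _sync() is needed */
	wiphy_cancel_work(wiphy, work);
	wiphy_unlock(wiphy);
}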

Signed-off-by: Johannes Berg <[email protected]>
---
include/net/cfg80211.h | 93 +++++++++++++++++++++++++++++++++++-------
net/wireless/core.c | 68 ++++++++++++++++++++++++++++++
net/wireless/core.h | 2 +
net/wireless/nl80211.c | 8 +++-
4 files changed, 155 insertions(+), 16 deletions(-)

diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index f115b2550309..f30ecbac7490 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -5142,6 +5142,7 @@ struct wiphy_iftype_akm_suites {
/**
* struct wiphy - wireless hardware description
* @mtx: mutex for the data (structures) of this device
+ * @mtx_fully_held: the mutex was held with workqueue pause/flush
* @reg_notifier: the driver's regulatory notification callback,
* note that if your driver uses wiphy_apply_custom_regulatory()
* the reg_notifier's request can be passed as NULL
@@ -5351,6 +5352,9 @@ struct wiphy_iftype_akm_suites {
*/
struct wiphy {
struct mutex mtx;
+#ifdef CONFIG_LOCKDEP
+ bool mtx_fully_held;
+#endif

/* assign these fields before you register the wiphy */

@@ -5669,29 +5673,90 @@ struct cfg80211_cqm_config;
* wiphy_lock - lock the wiphy
* @wiphy: the wiphy to lock
*
- * This is mostly exposed so it can be done around registering and
- * unregistering netdevs that aren't created through cfg80211 calls,
- * since that requires locking in cfg80211 when the notifiers is
- * called, but that cannot differentiate which way it's called.
+ * This is needed around registering and unregistering netdevs that
+ * aren't created through cfg80211 calls, since that requires locking
+ * in cfg80211 when the notifier is called, but that cannot
+ * differentiate which way it's called.
+ *
+ * It can also be used by drivers for their own purposes.
*
* When cfg80211 ops are called, the wiphy is already locked.
+ *
+ * Note that this makes sure that no workers that have been queued
+ * with wiphy_queue_work() are running.
*/
-static inline void wiphy_lock(struct wiphy *wiphy)
- __acquires(&wiphy->mtx)
-{
- mutex_lock(&wiphy->mtx);
- __acquire(&wiphy->mtx);
-}
+void wiphy_lock(struct wiphy *wiphy) __acquires(&wiphy->mtx);

/**
* wiphy_unlock - unlock the wiphy again
* @wiphy: the wiphy to unlock
*/
-static inline void wiphy_unlock(struct wiphy *wiphy)
- __releases(&wiphy->mtx)
+void wiphy_unlock(struct wiphy *wiphy) __releases(&wiphy->mtx);
+
+/**
+ * wiphy_queue_work - queue work for the wiphy
+ * @wiphy: the wiphy to queue for
+ * @work: the worker
+ *
+ * This is useful for work that must be done asynchronously, and work
+ * queued here has the special property that the wiphy mutex will be
+ * held as if wiphy_lock() was called, and that it cannot be running
+ * after wiphy_lock() was called. Therefore, wiphy_cancel_work() can
+ * use just cancel_work() instead of cancel_work_sync(); it requires
+ * being in a section protected by wiphy_lock().
+ */
+void wiphy_queue_work(struct wiphy *wiphy, struct work_struct *work);
+
+/**
+ * wiphy_cancel_work - cancel previously queued work
+ * @wiphy: the wiphy, for debug purposes
+ * @work: the work to cancel
+ *
+ * Cancel the work *without* waiting for it; this assumes being
+ * called under the wiphy mutex acquired by wiphy_lock().
+ */
+static inline void wiphy_cancel_work(struct wiphy *wiphy, struct work_struct *work)
{
- __release(&wiphy->mtx);
- mutex_unlock(&wiphy->mtx);
+#ifdef CONFIG_LOCKDEP
+ lockdep_assert_held(&wiphy->mtx);
+ WARN_ON_ONCE(!wiphy->mtx_fully_held);
+#endif
+ cancel_work(work);
+}
+
+/**
+ * wiphy_queue_delayed_work - queue delayed work for the wiphy
+ * @wiphy: the wiphy to queue for
+ * @dwork: the delayable worker
+ * @delay: number of jiffies to wait before queueing
+ *
+ * This is useful for work that must be done asynchronously, and work
+ * queued here has the special property that the wiphy mutex will be
+ * held as if wiphy_lock() was called, and that it cannot be running
+ * after wiphy_lock() was called. Therefore, wiphy_cancel_delayed_work()
+ * can use just cancel_delayed_work() instead of cancel_delayed_work_sync();
+ * it requires being in a section protected by wiphy_lock().
+ */
+void wiphy_queue_delayed_work(struct wiphy *wiphy,
+ struct delayed_work *dwork,
+ unsigned long delay);
+
+/**
+ * wiphy_cancel_delayed_work - cancel previously queued delayed work
+ * @wiphy: the wiphy, for debug purposes
+ * @dwork: the delayed work to cancel
+ *
+ * Cancel the work *without* waiting for it; this assumes being
+ * called under the wiphy mutex acquired by wiphy_lock().
+ */
+static inline void wiphy_cancel_delayed_work(struct wiphy *wiphy,
+ struct delayed_work *dwork)
+{
+#ifdef CONFIG_LOCKDEP
+ lockdep_assert_held(&wiphy->mtx);
+ WARN_ON_ONCE(!wiphy->mtx_fully_held);
+#endif
+ cancel_delayed_work(dwork);
}

/**
diff --git a/net/wireless/core.c b/net/wireless/core.c
index 5b0c4d5b80cf..11e600c71fb6 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -533,6 +533,13 @@ struct wiphy *wiphy_new_nm(const struct cfg80211_ops *ops, int sizeof_priv,
return NULL;
}

+ rdev->wq = alloc_ordered_workqueue_mtx("%s", 0, &rdev->wiphy.mtx,
+ dev_name(&rdev->wiphy.dev));
+ if (!rdev->wq) {
+ wiphy_free(&rdev->wiphy);
+ return NULL;
+ }
+
INIT_WORK(&rdev->rfkill_block, cfg80211_rfkill_block_work);
INIT_WORK(&rdev->conn_work, cfg80211_conn_work);
INIT_WORK(&rdev->event_work, cfg80211_event_work);
@@ -1068,6 +1075,13 @@ void wiphy_unregister(struct wiphy *wiphy)
wiphy_unlock(&rdev->wiphy);
rtnl_unlock();

+ /*
+ * flush again: even though wiphy_lock() flushed above, something
+ * might still have queued work while the code above was running,
+ * e.g. via debugfs.
+ */
+ flush_workqueue(rdev->wq);
+
flush_work(&rdev->scan_done_wk);
cancel_work_sync(&rdev->conn_work);
flush_work(&rdev->event_work);
@@ -1093,6 +1107,10 @@ void cfg80211_dev_free(struct cfg80211_registered_device *rdev)
{
struct cfg80211_internal_bss *scan, *tmp;
struct cfg80211_beacon_registration *reg, *treg;
+
+ if (rdev->wq) /* might be NULL in error cases */
+ destroy_workqueue(rdev->wq);
+
rfkill_destroy(rdev->wiphy.rfkill);
list_for_each_entry_safe(reg, treg, &rdev->beacon_registrations, list) {
list_del(&reg->list);
@@ -1564,6 +1582,56 @@ static struct pernet_operations cfg80211_pernet_ops = {
.exit = cfg80211_pernet_exit,
};

+void wiphy_lock(struct wiphy *wiphy)
+{
+ struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy);
+
+ workqueue_pause(rdev->wq);
+
+ mutex_lock(&wiphy->mtx);
+
+#ifdef CONFIG_LOCKDEP
+ wiphy->mtx_fully_held = true;
+#endif
+}
+EXPORT_SYMBOL(wiphy_lock);
+
+void wiphy_unlock(struct wiphy *wiphy)
+{
+ struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy);
+
+#ifdef CONFIG_LOCKDEP
+ WARN_ON_ONCE(!wiphy->mtx_fully_held);
+#endif
+
+ workqueue_resume(rdev->wq);
+
+#ifdef CONFIG_LOCKDEP
+ wiphy->mtx_fully_held = false;
+#endif
+
+ mutex_unlock(&wiphy->mtx);
+}
+EXPORT_SYMBOL(wiphy_unlock);
+
+void wiphy_queue_work(struct wiphy *wiphy, struct work_struct *work)
+{
+ struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy);
+
+ queue_work(rdev->wq, work);
+}
+EXPORT_SYMBOL(wiphy_queue_work);
+
+void wiphy_queue_delayed_work(struct wiphy *wiphy,
+ struct delayed_work *dwork,
+ unsigned long delay)
+{
+ struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy);
+
+ queue_delayed_work(rdev->wq, dwork, delay);
+}
+EXPORT_SYMBOL(wiphy_queue_delayed_work);
+
static int __init cfg80211_init(void)
{
int err;
diff --git a/net/wireless/core.h b/net/wireless/core.h
index 7c61752f6d83..0ab79cc28adb 100644
--- a/net/wireless/core.h
+++ b/net/wireless/core.h
@@ -108,6 +108,8 @@ struct cfg80211_registered_device {
/* lock for all wdev lists */
spinlock_t mgmt_registrations_lock;

+ struct workqueue_struct *wq;
+
/* must be last because of the way we do wiphy_priv(),
* and it should at least be aligned to NETDEV_ALIGN */
struct wiphy wiphy __aligned(NETDEV_ALIGN);
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 112b4bb009c8..1fb4978f7649 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -1002,7 +1002,9 @@ static int nl80211_prepare_wdev_dump(struct netlink_callback *cb,
return PTR_ERR(*wdev);
}
*rdev = wiphy_to_rdev((*wdev)->wiphy);
- mutex_lock(&(*rdev)->wiphy.mtx);
+ wiphy_lock(&(*rdev)->wiphy);
+ /* the conditional locking is too hard for sparse */
+ __release(&(*rdev)->wiphy.mtx);
rtnl_unlock();
/* 0 is the first index - add 1 to parse only once */
cb->args[0] = (*rdev)->wiphy_idx + 1;
@@ -1032,7 +1034,9 @@ static int nl80211_prepare_wdev_dump(struct netlink_callback *cb,
rtnl_unlock();
return -ENODEV;
}
- mutex_lock(&(*rdev)->wiphy.mtx);
+ wiphy_lock(&(*rdev)->wiphy);
+ /* the conditional locking is too hard for sparse */
+ __release(&(*rdev)->wiphy.mtx);
rtnl_unlock();
}

--
2.40.1


2023-05-10 16:09:10

by Johannes Berg

Subject: [RFC PATCH 2/4] workqueue: support holding a mutex for each work

From: Johannes Berg <[email protected]>

Add a bit of new infrastructure so that on ordered
workqueues, work will be executed with a specified
mutex held. This can be used to simplify subsystem
implementations, and together with the pause() and
resume() APIs can significantly simplify things in
subsystems using both.

The alternative to this is to take the lock manually
in each work item, but there can be many of them, and
this may require a special locking API inside the
work items vs. outside, since the outside might also
need to pause the workqueue.

For example, in wifi, I imagine using this for all
control paths so that wiphy_lock() will also pause
the workqueue, and all works will execute with the
same mutex that is used to implement wiphy_lock()
held. Then, work structs only need to be canceled, without
_sync(), which removes many potential causes of locking
problems.
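
A sketch of how a subsystem might use this (hypothetical
names, not taken from this series):

static DEFINE_MUTEX(subsys_mutex);
static struct workqueue_struct *subsys_wq;

static void subsys_work_fn(struct work_struct *work)
{
	/* the workqueue core acquired subsys_mutex for us */
	lockdep_assert_held(&subsys_mutex);

	/* ... modify state protected by subsys_mutex ... */
}

static int __init subsys_init(void)
{
	subsys_wq = alloc_ordered_workqueue_mtx("subsys", 0,
						&subsys_mutex);
	return subsys_wq ? 0 : -ENOMEM;
}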

Signed-off-by: Johannes Berg <[email protected]>
---
include/linux/workqueue.h | 30 +++++++++++++++++++++++++++---
kernel/workqueue.c | 30 +++++++++++++++++++++++-------
2 files changed, 50 insertions(+), 10 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 5e2413017a89..9d0a1bf4d5f7 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -387,12 +387,16 @@ extern struct workqueue_struct *system_freezable_wq;
extern struct workqueue_struct *system_power_efficient_wq;
extern struct workqueue_struct *system_freezable_power_efficient_wq;

+__printf(1, 5) struct workqueue_struct *
+__alloc_workqueue(const char *fmt, unsigned int flags, int max_active,
+ struct mutex *work_mutex, ...);
+
/**
* alloc_workqueue - allocate a workqueue
* @fmt: printf format for the name of the workqueue
* @flags: WQ_* flags
* @max_active: max in-flight work items, 0 for default
- * remaining args: args for @fmt
+ * args: args for @fmt
*
* Allocate a workqueue with the specified parameters. For detailed
* information on WQ_* flags, please refer to
@@ -401,8 +405,8 @@ extern struct workqueue_struct *system_freezable_power_efficient_wq;
* RETURNS:
* Pointer to the allocated workqueue on success, %NULL on failure.
*/
-__printf(1, 4) struct workqueue_struct *
-alloc_workqueue(const char *fmt, unsigned int flags, int max_active, ...);
+#define alloc_workqueue(fmt, flags, max_active, args...) \
+ __alloc_workqueue(fmt, flags, max_active, NULL, ##args)

/**
* alloc_ordered_workqueue - allocate an ordered workqueue
@@ -421,6 +425,26 @@ alloc_workqueue(const char *fmt, unsigned int flags, int max_active, ...);
alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | \
__WQ_ORDERED_EXPLICIT | (flags), 1, ##args)

+/**
+ * alloc_ordered_workqueue_mtx - allocate an ordered workqueue with work mutex
+ * @fmt: printf format for the name of the workqueue
+ * @flags: WQ_* flags (only WQ_FREEZABLE and WQ_MEM_RECLAIM are meaningful)
+ * @work_mutex: mutex to hold for each work execution
+ * @args: args for @fmt
+ *
+ * Allocate an ordered workqueue. An ordered workqueue executes at
+ * most one work item at any given time in the queued order.
+ *
+ * The work mutex will be held for each work execution.
+ *
+ * RETURNS:
+ * Pointer to the allocated workqueue on success, %NULL on failure.
+ */
+#define alloc_ordered_workqueue_mtx(fmt, flags, work_mutex, args...) \
+ __alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | \
+ __WQ_ORDERED_EXPLICIT | (flags), 1, \
+ work_mutex, ##args)
+
#define create_workqueue(name) \
alloc_workqueue("%s", __WQ_LEGACY | WQ_MEM_RECLAIM, 1, (name))
#define create_freezable_workqueue(name) \
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 418d99ff8325..2c573e25690c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -278,6 +278,8 @@ struct workqueue_struct {
struct workqueue_attrs *unbound_attrs; /* PW: only for unbound wqs */
struct pool_workqueue *dfl_pwq; /* PW: only for unbound wqs */

+ struct mutex *work_mutex; /* user mutex held for work */
+
#ifdef CONFIG_SYSFS
struct wq_device *wq_dev; /* I: for sysfs interface */
#endif
@@ -2387,7 +2389,13 @@ __acquires(&pool->lock)
*/
lockdep_invariant_state(true);
trace_workqueue_execute_start(work);
- worker->current_func(work);
+ if (unlikely(pwq->wq->work_mutex)) {
+ mutex_lock(pwq->wq->work_mutex);
+ worker->current_func(work);
+ mutex_unlock(pwq->wq->work_mutex);
+ } else {
+ worker->current_func(work);
+ }
/*
* While we must be careful to not use "work" after this, the trace
* point will only record its address.
@@ -4404,10 +4412,12 @@ static int init_rescuer(struct workqueue_struct *wq)
return 0;
}

-__printf(1, 4)
-struct workqueue_struct *alloc_workqueue(const char *fmt,
- unsigned int flags,
- int max_active, ...)
+__printf(1, 5)
+struct workqueue_struct *__alloc_workqueue(const char *fmt,
+ unsigned int flags,
+ int max_active,
+ struct mutex *work_mutex,
+ ...)
{
size_t tbl_size = 0;
va_list args;
@@ -4432,6 +4442,10 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
if (flags & WQ_UNBOUND)
tbl_size = nr_node_ids * sizeof(wq->numa_pwq_tbl[0]);

+ /* can only reach this by calling the internal API */
+ if (WARN_ON(!(flags & __WQ_ORDERED_EXPLICIT) && work_mutex))
+ return NULL;
+
wq = kzalloc(sizeof(*wq) + tbl_size, GFP_KERNEL);
if (!wq)
return NULL;
@@ -4442,7 +4456,9 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
goto err_free_wq;
}

- va_start(args, max_active);
+ wq->work_mutex = work_mutex;
+
+ va_start(args, work_mutex);
vsnprintf(wq->name, sizeof(wq->name), fmt, args);
va_end(args);

@@ -4500,7 +4516,7 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
destroy_workqueue(wq);
return NULL;
}
-EXPORT_SYMBOL_GPL(alloc_workqueue);
+EXPORT_SYMBOL_GPL(__alloc_workqueue);

static bool pwq_busy(struct pool_workqueue *pwq)
{
--
2.40.1


2023-05-10 18:48:44

by Tejun Heo

Subject: Re: [RFC PATCH 2/4] workqueue: support holding a mutex for each work

On Wed, May 10, 2023 at 06:04:26PM +0200, Johannes Berg wrote:
> @@ -2387,7 +2389,13 @@ __acquires(&pool->lock)
> */
> lockdep_invariant_state(true);
> trace_workqueue_execute_start(work);
> - worker->current_func(work);
> + if (unlikely(pwq->wq->work_mutex)) {
> + mutex_lock(pwq->wq->work_mutex);
> + worker->current_func(work);
> + mutex_unlock(pwq->wq->work_mutex);
> + } else {
> + worker->current_func(work);
> + }

Ah, I don't know about this. This can't be that difficult to do from the
callee side, right?

Thanks.

--
tejun

2023-05-10 18:49:12

by Tejun Heo

Subject: Re: [RFC PATCH 1/4] workqueue: support pausing ordered workqueues

Hello,

This looks great to me in general.

On Wed, May 10, 2023 at 06:04:25PM +0200, Johannes Berg wrote:
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index b8b541caed48..418d99ff8325 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3863,10 +3863,16 @@ static void pwq_adjust_max_active(struct pool_workqueue *pwq)
> struct workqueue_struct *wq = pwq->wq;
> bool freezable = wq->flags & WQ_FREEZABLE;
> unsigned long flags;
> + int new_max_active;
>
> - /* for @wq->saved_max_active */
> + /* for @wq->saved_max_active and @wq->flags */
> lockdep_assert_held(&wq->mutex);
>
> + if (wq->flags & __WQ_PAUSED)
> + new_max_active = 0;
> + else
> + new_max_active = wq->saved_max_active;

Nothing is using new_max_active and I think we can probably combine this
with the freezing test.
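
One way that combination might look - an untested sketch, with
max_active then used below instead of wq->saved_max_active:

	int max_active = (wq->flags & __WQ_PAUSED) ?
				0 : wq->saved_max_active;

	/* fast exit for non-freezable wqs; also covers paused */
	if (!freezable && pwq->max_active == max_active)
		return;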

> +void __workqueue_pause_resume(struct workqueue_struct *wq, bool pause)
> +{
> + struct pool_workqueue *pwq;
> +
> + mutex_lock(&wq->mutex);
> + if (pause)
> + wq->flags |= __WQ_PAUSED;
> + else
> + wq->flags &= ~__WQ_PAUSED;
> +
> + for_each_pwq(pwq, wq)
> + pwq_adjust_max_active(pwq);
> + mutex_unlock(&wq->mutex);
> +
> + if (pause)
> + flush_workqueue(wq);
> +}
> +EXPORT_SYMBOL_GPL(__workqueue_pause_resume);

I'd just make pause and resume separate functions. The sharing ratio doesn't
seem that high.

Thanks.

--
tejun

2023-05-10 19:25:03

by Johannes Berg

Subject: Re: [RFC PATCH 2/4] workqueue: support holding a mutex for each work

On Wed, 2023-05-10 at 08:34 -1000, Tejun Heo wrote:
> On Wed, May 10, 2023 at 06:04:26PM +0200, Johannes Berg wrote:
> > @@ -2387,7 +2389,13 @@ __acquires(&pool->lock)
> > */
> > lockdep_invariant_state(true);
> > trace_workqueue_execute_start(work);
> > - worker->current_func(work);
> > + if (unlikely(pwq->wq->work_mutex)) {
> > + mutex_lock(pwq->wq->work_mutex);
> > + worker->current_func(work);
> > + mutex_unlock(pwq->wq->work_mutex);
> > + } else {
> > + worker->current_func(work);
> > + }
>
> Ah, I don't know about this. This can't be that difficult to do from the
> callee side, right?
>

Yeah I thought you'd say that :)

It isn't difficult; the issue is just that in the case I'm envisioning,
you can't just call wiphy_lock() since that would attempt to pause the
workqueue, which can't work from the workqueue itself. So you need
wiphy_lock_from_work()/wiphy_unlock_from_work() or remember to use the
mutex directly there, which all seemed more error-prone and harder to
maintain.

But anyway I could easily implement _both_ of these in cfg80211
directly, with just a linked list of works and a single struct
work_struct to execute things on the list, with the right locking. That
might be easier overall, just at the expense of more churn while
converting, but that's not even necessarily _bad_; it would really
guarantee that we can tell immediately that the work is properly done...
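
Roughly something like this (a sketch only, with made-up names,
to show the shape of that alternative; list locking omitted):

struct wiphy_work {
	struct list_head entry;
	void (*func)(struct wiphy *wiphy, struct wiphy_work *work);
};

/* one "real" work struct executes everything on the list */
static void cfg80211_wiphy_work(struct work_struct *work)
{
	struct cfg80211_registered_device *rdev =
		container_of(work, struct cfg80211_registered_device,
			     wiphy_work); /* hypothetical field */
	struct wiphy_work *wk;

	wiphy_lock(&rdev->wiphy);
	while ((wk = list_first_entry_or_null(&rdev->wiphy_work_list,
					      struct wiphy_work, entry))) {
		list_del_init(&wk->entry);
		wk->func(&rdev->wiphy, wk);
	}
	wiphy_unlock(&rdev->wiphy);
}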

I'll play with that idea some, I guess. Would you still want the
pause/resume patch anyway, even if I end up not using it then?

johannes

2023-05-10 19:58:26

by Tejun Heo

Subject: Re: [RFC PATCH 2/4] workqueue: support holding a mutex for each work

Hello,

On Wed, May 10, 2023 at 09:16:09PM +0200, Johannes Berg wrote:
> Yeah I thought you'd say that :)

Sorry about being so predictable. :)

> It isn't difficult; the issue is just that in the case I'm envisioning,
> you can't just call wiphy_lock() since that would attempt to pause the
> workqueue, which can't work from the workqueue itself. So you need
> wiphy_lock_from_work()/wiphy_unlock_from_work() or remember to use the
> mutex directly there, which all seemed more error-prone and harder to
> maintain.
>
> But anyway I could easily implement _both_ of these in cfg80211
> directly, with just a linked list of works and a single struct
> work_struct to execute things on the list, with the right locking. That
> might be easier overall, just at the expense of more churn while
> converting, but that's not even necessarily _bad_; it would really
> guarantee that we can tell immediately that the work is properly done...
>
> I'll play with that idea some, I guess. Would you still want the
> pause/resume patch anyway, even if I end up not using it then?

I think it's something inherently useful (along with the ability to do the
same thing to a work item - i.e. cancel a work item and inhibit it from
being queued); however, it's probably not a good idea to merge without an
in-tree user. Would you mind posting a fixed patch nonetheless for future
reference, if it's not too much hassle?

Thanks.

--
tejun