2015-04-27 09:55:47

by Lai Jiangshan

Subject: [PATCH 0/3 V8] workqueue: Introduce low-level unbound wq sysfs cpumask

The patchset is based on 4.1-rc1, and only [PATCH 3/3]
has changed:

1) Rename again. wq_unbound_global_cpumask -> wq_unbound_cpumask
workqueue_set_unbound_global_cpumask() -> workqueue_set_unbound_cpumask()

2) The code sets wq_unbound_cpumask first, before applying it to
all the wqs, which simplifies the code that applies it to the wqs.

3) When the user-configured cpumask doesn't overlap with the low-level
cpumask, the V7 patchset forced the per-node pwqs to use the
default pwq, which caused some complications. This patch instead
uses the default pwq's attrs to calculate the per-node attrs,
avoiding any extra branch or special handling.

Frederic Weisbecker (1):
workqueue: Create low-level unbound workqueues cpumask

Lai Jiangshan (2):
workqueue: split apply_workqueue_attrs() into 3 stages
workqueue: Allow modifying low level unbound workqueue cpumask

Cc: Christoph Lameter <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Viresh Kumar <[email protected]>
Cc: Frederic Weisbecker <[email protected]>

include/linux/workqueue.h | 1 +
kernel/workqueue.c | 343 ++++++++++++++++++++++++++++++++++------------
2 files changed, 253 insertions(+), 91 deletions(-)

--
2.1.0


2015-04-27 09:56:11

by Lai Jiangshan

Subject: [PATCH 1/3 V8] workqueue: split apply_workqueue_attrs() into 3 stages

The current apply_workqueue_attrs() combines pwq allocation and pwq
installation, so when we batch multiple apply_workqueue_attrs() calls as a
transaction, we can't ensure that the transaction succeeds or fails as a
complete unit.

To solve this, we split apply_workqueue_attrs() into three stages.
The first stage does the preparation: allocating memory and pwqs.
The second stage does the attrs installation and pwqs installation.
The third stage frees the allocated memory and the (old or unused) pwqs.

As a result, batching multiple apply_workqueue_attrs() calls can
succeed or fail as a complete unit (see the sketch below):
1) do the first stage for all the workqueues in a batch
2) commit them all only when all of the above succeed.
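
A rough sketch of the batched usage (illustration only, not part of the
patch; it assumes the three helpers introduced below plus the ctx->list
member added later in this series, with wq_pool_mutex held):

	struct apply_wqattrs_ctx *ctx, *n;
	struct workqueue_struct *wq;
	LIST_HEAD(ctxs);
	int ret = 0;

	/* stage 1: prepare all wqs; fail the whole batch on any error */
	list_for_each_entry(wq, &workqueues, list) {
		ctx = apply_wqattrs_prepare(wq, wq->unbound_attrs);
		if (!ctx) {
			ret = -ENOMEM;
			break;
		}
		list_add_tail(&ctx->list, &ctxs);
	}

	/* stage 2: commit only if every preparation succeeded;
	 * stage 3: free the contexts and the old or unused pwqs */
	list_for_each_entry_safe(ctx, n, &ctxs, list) {
		if (!ret)
			apply_wqattrs_commit(ctx);
		apply_wqattrs_cleanup(ctx);
	}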

This patch is a preparation for the next patch ("Allow modifying low level
unbound workqueue cpumask") which will batch multiple apply_workqueue_attrs()
calls.

The patch has no functional change except for two minor adjustments:
1) free_unbound_pwq() for the error path is removed; we use the
heavier put_pwq_unlocked() instead, since the error path is rare.
This adjustment simplifies the code.
2) The memory allocation is also moved inside wq_pool_mutex.
This is needed to avoid further splitting.

Suggested-by: Tejun Heo <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Viresh Kumar <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Signed-off-by: Lai Jiangshan <[email protected]>
---
kernel/workqueue.c | 200 +++++++++++++++++++++++++++++++----------------------
1 file changed, 116 insertions(+), 84 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 586ad91..b13753a 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3425,17 +3425,6 @@ static struct pool_workqueue *alloc_unbound_pwq(struct workqueue_struct *wq,
return pwq;
}

-/* undo alloc_unbound_pwq(), used only in the error path */
-static void free_unbound_pwq(struct pool_workqueue *pwq)
-{
- lockdep_assert_held(&wq_pool_mutex);
-
- if (pwq) {
- put_unbound_pool(pwq->pool);
- kmem_cache_free(pwq_cache, pwq);
- }
-}
-
/**
* wq_calc_node_mask - calculate a wq_attrs' cpumask for the specified node
* @attrs: the wq_attrs of interest
@@ -3498,42 +3487,49 @@ static struct pool_workqueue *numa_pwq_tbl_install(struct workqueue_struct *wq,
return old_pwq;
}

-/**
- * apply_workqueue_attrs - apply new workqueue_attrs to an unbound workqueue
- * @wq: the target workqueue
- * @attrs: the workqueue_attrs to apply, allocated with alloc_workqueue_attrs()
- *
- * Apply @attrs to an unbound workqueue @wq. Unless disabled, on NUMA
- * machines, this function maps a separate pwq to each NUMA node with
- * possibles CPUs in @attrs->cpumask so that work items are affine to the
- * NUMA node it was issued on. Older pwqs are released as in-flight work
- * items finish. Note that a work item which repeatedly requeues itself
- * back-to-back will stay on its current pwq.
- *
- * Performs GFP_KERNEL allocations.
- *
- * Return: 0 on success and -errno on failure.
- */
-int apply_workqueue_attrs(struct workqueue_struct *wq,
- const struct workqueue_attrs *attrs)
+/* Context to store the prepared attrs & pwqs before applied */
+struct apply_wqattrs_ctx {
+ struct workqueue_struct *wq; /* target to be applied */
+ struct workqueue_attrs *attrs; /* configured attrs */
+ struct pool_workqueue *dfl_pwq;
+ struct pool_workqueue *pwq_tbl[];
+};
+
+/* Free the resources after success or abort */
+static void apply_wqattrs_cleanup(struct apply_wqattrs_ctx *ctx)
+{
+ if (ctx) {
+ int node;
+
+ /* put the pwqs */
+ for_each_node(node)
+ put_pwq_unlocked(ctx->pwq_tbl[node]);
+ put_pwq_unlocked(ctx->dfl_pwq);
+
+ free_workqueue_attrs(ctx->attrs);
+
+ kfree(ctx);
+ }
+}
+
+/* Allocates the attrs and pwqs for later installment */
+static struct apply_wqattrs_ctx *
+apply_wqattrs_prepare(struct workqueue_struct *wq,
+ const struct workqueue_attrs *attrs)
{
+ struct apply_wqattrs_ctx *ctx;
struct workqueue_attrs *new_attrs, *tmp_attrs;
- struct pool_workqueue **pwq_tbl, *dfl_pwq;
- int node, ret;
+ int node;

- /* only unbound workqueues can change attributes */
- if (WARN_ON(!(wq->flags & WQ_UNBOUND)))
- return -EINVAL;
+ lockdep_assert_held(&wq_pool_mutex);

- /* creating multiple pwqs breaks ordering guarantee */
- if (WARN_ON((wq->flags & __WQ_ORDERED) && !list_empty(&wq->pwqs)))
- return -EINVAL;
+ ctx = kzalloc(sizeof(*ctx) + nr_node_ids * sizeof(ctx->pwq_tbl[0]),
+ GFP_KERNEL);

- pwq_tbl = kzalloc(nr_node_ids * sizeof(pwq_tbl[0]), GFP_KERNEL);
new_attrs = alloc_workqueue_attrs(GFP_KERNEL);
tmp_attrs = alloc_workqueue_attrs(GFP_KERNEL);
- if (!pwq_tbl || !new_attrs || !tmp_attrs)
- goto enomem;
+ if (!ctx || !new_attrs || !tmp_attrs)
+ goto out_free;

/* make a copy of @attrs and sanitize it */
copy_workqueue_attrs(new_attrs, attrs);
@@ -3547,75 +3543,111 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
copy_workqueue_attrs(tmp_attrs, new_attrs);

/*
- * CPUs should stay stable across pwq creations and installations.
- * Pin CPUs, determine the target cpumask for each node and create
- * pwqs accordingly.
- */
- get_online_cpus();
-
- mutex_lock(&wq_pool_mutex);
-
- /*
* If something goes wrong during CPU up/down, we'll fall back to
* the default pwq covering whole @attrs->cpumask. Always create
* it even if we don't use it immediately.
*/
- dfl_pwq = alloc_unbound_pwq(wq, new_attrs);
- if (!dfl_pwq)
- goto enomem_pwq;
+ ctx->dfl_pwq = alloc_unbound_pwq(wq, new_attrs);
+ if (!ctx->dfl_pwq)
+ goto out_free;

for_each_node(node) {
if (wq_calc_node_cpumask(attrs, node, -1, tmp_attrs->cpumask)) {
- pwq_tbl[node] = alloc_unbound_pwq(wq, tmp_attrs);
- if (!pwq_tbl[node])
- goto enomem_pwq;
+ ctx->pwq_tbl[node] = alloc_unbound_pwq(wq, tmp_attrs);
+ if (!ctx->pwq_tbl[node])
+ goto out_free;
} else {
- dfl_pwq->refcnt++;
- pwq_tbl[node] = dfl_pwq;
+ ctx->dfl_pwq->refcnt++;
+ ctx->pwq_tbl[node] = ctx->dfl_pwq;
}
}

- mutex_unlock(&wq_pool_mutex);
+ ctx->attrs = new_attrs;
+ ctx->wq = wq;
+ free_workqueue_attrs(tmp_attrs);
+ return ctx;
+
+out_free:
+ free_workqueue_attrs(tmp_attrs);
+ free_workqueue_attrs(new_attrs);
+ apply_wqattrs_cleanup(ctx);
+ return NULL;
+}
+
+/* Set the unbound_attr and install the prepared pwqs. Should not fail */
+static void apply_wqattrs_commit(struct apply_wqattrs_ctx *ctx)
+{
+ int node;

/* all pwqs have been created successfully, let's install'em */
- mutex_lock(&wq->mutex);
+ mutex_lock(&ctx->wq->mutex);

- copy_workqueue_attrs(wq->unbound_attrs, new_attrs);
+ copy_workqueue_attrs(ctx->wq->unbound_attrs, ctx->attrs);

/* save the previous pwq and install the new one */
for_each_node(node)
- pwq_tbl[node] = numa_pwq_tbl_install(wq, node, pwq_tbl[node]);
+ ctx->pwq_tbl[node] = numa_pwq_tbl_install(ctx->wq, node,
+ ctx->pwq_tbl[node]);

/* @dfl_pwq might not have been used, ensure it's linked */
- link_pwq(dfl_pwq);
- swap(wq->dfl_pwq, dfl_pwq);
+ link_pwq(ctx->dfl_pwq);
+ swap(ctx->wq->dfl_pwq, ctx->dfl_pwq);

- mutex_unlock(&wq->mutex);
+ mutex_unlock(&ctx->wq->mutex);
+}

- /* put the old pwqs */
- for_each_node(node)
- put_pwq_unlocked(pwq_tbl[node]);
- put_pwq_unlocked(dfl_pwq);
+/**
+ * apply_workqueue_attrs - apply new workqueue_attrs to an unbound workqueue
+ * @wq: the target workqueue
+ * @attrs: the workqueue_attrs to apply, allocated with alloc_workqueue_attrs()
+ *
+ * Apply @attrs to an unbound workqueue @wq. Unless disabled, on NUMA
+ * machines, this function maps a separate pwq to each NUMA node with
+ * possibles CPUs in @attrs->cpumask so that work items are affine to the
+ * NUMA node it was issued on. Older pwqs are released as in-flight work
+ * items finish. Note that a work item which repeatedly requeues itself
+ * back-to-back will stay on its current pwq.
+ *
+ * Performs GFP_KERNEL allocations.
+ *
+ * Return: 0 on success and -errno on failure.
+ */
+int apply_workqueue_attrs(struct workqueue_struct *wq,
+ const struct workqueue_attrs *attrs)
+{
+ struct apply_wqattrs_ctx *ctx;
+ int ret = -ENOMEM;

- put_online_cpus();
- ret = 0;
- /* fall through */
-out_free:
- free_workqueue_attrs(tmp_attrs);
- free_workqueue_attrs(new_attrs);
- kfree(pwq_tbl);
- return ret;
+ /* only unbound workqueues can change attributes */
+ if (WARN_ON(!(wq->flags & WQ_UNBOUND)))
+ return -EINVAL;

-enomem_pwq:
- free_unbound_pwq(dfl_pwq);
- for_each_node(node)
- if (pwq_tbl && pwq_tbl[node] != dfl_pwq)
- free_unbound_pwq(pwq_tbl[node]);
+ /* creating multiple pwqs breaks ordering guarantee */
+ if (WARN_ON((wq->flags & __WQ_ORDERED) && !list_empty(&wq->pwqs)))
+ return -EINVAL;
+
+ /*
+ * CPUs should stay stable across pwq creations and installations.
+ * Pin CPUs, determine the target cpumask for each node and create
+ * pwqs accordingly.
+ */
+ get_online_cpus();
+
+ mutex_lock(&wq_pool_mutex);
+ ctx = apply_wqattrs_prepare(wq, attrs);
mutex_unlock(&wq_pool_mutex);
+
+ /* the ctx has been prepared successfully, let's commit it */
+ if (ctx) {
+ apply_wqattrs_commit(ctx);
+ ret = 0;
+ }
+
put_online_cpus();
-enomem:
- ret = -ENOMEM;
- goto out_free;
+
+ apply_wqattrs_cleanup(ctx);
+
+ return ret;
}

/**
--
2.1.0

2015-04-27 09:56:09

by Lai Jiangshan

Subject: [PATCH 2/3 V8] workqueue: Create low-level unbound workqueues cpumask

From: Frederic Weisbecker <[email protected]>

Create a cpumask that limits the affinity of all unbound workqueues.
This cpumask is controlled through a file at the root of the workqueue
sysfs directory.

It works at a lower level than the per-WQ_SYSFS workqueue cpumask files,
such that the effective cpumask applied for a given unbound workqueue is
the intersection of /sys/devices/virtual/workqueue/$WORKQUEUE/cpumask and
the new /sys/devices/virtual/workqueue/cpumask file.
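
As an illustrative sketch (not part of the patch; the helper name is
hypothetical), the effective affinity amounts to:

	/* Hypothetical helper: effective affinity of an unbound wq. */
	static void wq_effective_cpumask(struct workqueue_struct *wq,
					 struct cpumask *effective)
	{
		/* per-wq mask (/sys/.../$WORKQUEUE/cpumask) intersected
		 * with the global /sys/devices/virtual/workqueue/cpumask */
		cpumask_and(effective, wq->unbound_attrs->cpumask,
			    wq_unbound_cpumask);
	}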

This patch implements the basic infrastructure and the read interface.
wq_unbound_cpumask is initially set to cpu_possible_mask.

Cc: Christoph Lameter <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Viresh Kumar <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Lai Jiangshan <[email protected]>
---
kernel/workqueue.c | 29 +++++++++++++++++++++++++++--
1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index b13753a..66ea37c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -299,6 +299,8 @@ static DEFINE_SPINLOCK(wq_mayday_lock); /* protects wq->maydays list */
static LIST_HEAD(workqueues); /* PR: list of all workqueues */
static bool workqueue_freezing; /* PL: have wqs started freezing? */

+static cpumask_var_t wq_unbound_cpumask;
+
/* the per-cpu worker pools */
static DEFINE_PER_CPU_SHARED_ALIGNED(struct worker_pool [NR_STD_WORKER_POOLS],
cpu_worker_pools);
@@ -3533,7 +3535,7 @@ apply_wqattrs_prepare(struct workqueue_struct *wq,

/* make a copy of @attrs and sanitize it */
copy_workqueue_attrs(new_attrs, attrs);
- cpumask_and(new_attrs->cpumask, new_attrs->cpumask, cpu_possible_mask);
+ cpumask_and(new_attrs->cpumask, new_attrs->cpumask, wq_unbound_cpumask);

/*
* We may create multiple pwqs with differing cpumasks. Make a
@@ -4946,9 +4948,29 @@ static struct bus_type wq_subsys = {
.dev_groups = wq_sysfs_groups,
};

+static ssize_t wq_unbound_cpumask_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ int written;
+
+ written = scnprintf(buf, PAGE_SIZE, "%*pb\n",
+ cpumask_pr_args(wq_unbound_cpumask));
+
+ return written;
+}
+
+static struct device_attribute wq_sysfs_cpumask_attr =
+ __ATTR(cpumask, 0444, wq_unbound_cpumask_show, NULL);
+
static int __init wq_sysfs_init(void)
{
- return subsys_virtual_register(&wq_subsys, NULL);
+ int err;
+
+ err = subsys_virtual_register(&wq_subsys, NULL);
+ if (err)
+ return err;
+
+ return device_create_file(wq_subsys.dev_root, &wq_sysfs_cpumask_attr);
}
core_initcall(wq_sysfs_init);

@@ -5096,6 +5118,9 @@ static int __init init_workqueues(void)

WARN_ON(__alignof__(struct pool_workqueue) < __alignof__(long long));

+ BUG_ON(!alloc_cpumask_var(&wq_unbound_cpumask, GFP_KERNEL));
+ cpumask_copy(wq_unbound_cpumask, cpu_possible_mask);
+
pwq_cache = KMEM_CACHE(pool_workqueue, SLAB_PANIC);

cpu_notifier(workqueue_cpu_up_callback, CPU_PRI_WORKQUEUE_UP);
--
2.1.0

2015-04-27 09:55:53

by Lai Jiangshan

Subject: [PATCH 3/3 V8] workqueue: Allow modifying low level unbound workqueue cpumask

Allow modifying the low-level unbound workqueue cpumask through
sysfs. This is performed by traversing the entire workqueue list
and calling apply_wqattrs_prepare() on the unbound workqueues
with the new low-level mask. Only after all the preparations are
done do we commit them all together.

The oreder-workquue is ignore from the low level unbound workqueue
cpumask, it will be handled in near future.

All the (default & per-nodes') pwqs are mandatorily controlled by
the low level cpumask. If the user configured cpumask doesn't overlap
with the low level cpumask, the low level cpumask will be used for the
wq instead.
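
Conceptually, the sanitization in apply_wqattrs_prepare() then behaves
roughly like this sketch:

	/* mask the user configured cpumask with the low level cpumask */
	cpumask_and(new_attrs->cpumask, attrs->cpumask, wq_unbound_cpumask);
	if (unlikely(cpumask_empty(new_attrs->cpumask)))
		/* no overlap: fall back to the low level cpumask */
		cpumask_copy(new_attrs->cpumask, wq_unbound_cpumask);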

The default wq_unbound_cpumask is still cpu_possible_mask due to the workqueue
subsystem doesn't know what is the best default value for the runtime, the
system manager or other subsystem which knows the sufficient information should set
it when needed.

Cc: Christoph Lameter <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Viresh Kumar <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Original-patch-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Lai Jiangshan <[email protected]>
---
include/linux/workqueue.h | 1 +
kernel/workqueue.c | 118 +++++++++++++++++++++++++++++++++++++++++++---
2 files changed, 112 insertions(+), 7 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index deee212..4618dd6 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -424,6 +424,7 @@ struct workqueue_attrs *alloc_workqueue_attrs(gfp_t gfp_mask);
void free_workqueue_attrs(struct workqueue_attrs *attrs);
int apply_workqueue_attrs(struct workqueue_struct *wq,
const struct workqueue_attrs *attrs);
+int workqueue_set_unbound_cpumask(cpumask_var_t cpumask);

extern bool queue_work_on(int cpu, struct workqueue_struct *wq,
struct work_struct *work);
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 66ea37c..9dd965a 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -299,7 +299,7 @@ static DEFINE_SPINLOCK(wq_mayday_lock); /* protects wq->maydays list */
static LIST_HEAD(workqueues); /* PR: list of all workqueues */
static bool workqueue_freezing; /* PL: have wqs started freezing? */

-static cpumask_var_t wq_unbound_cpumask;
+static cpumask_var_t wq_unbound_cpumask; /* PL: low level cpumask for all unbound wqs */

/* the per-cpu worker pools */
static DEFINE_PER_CPU_SHARED_ALIGNED(struct worker_pool [NR_STD_WORKER_POOLS],
@@ -3493,6 +3493,7 @@ static struct pool_workqueue *numa_pwq_tbl_install(struct workqueue_struct *wq,
struct apply_wqattrs_ctx {
struct workqueue_struct *wq; /* target to be applied */
struct workqueue_attrs *attrs; /* configured attrs */
+ struct list_head list; /* queued for batching commit */
struct pool_workqueue *dfl_pwq;
struct pool_workqueue *pwq_tbl[];
};
@@ -3548,13 +3549,18 @@ apply_wqattrs_prepare(struct workqueue_struct *wq,
* If something goes wrong during CPU up/down, we'll fall back to
* the default pwq covering whole @attrs->cpumask. Always create
* it even if we don't use it immediately.
+ *
+ * If the user configured cpumask doesn't overlap with the
+ * wq_unbound_cpumask, we fallback to the wq_unbound_cpumask.
*/
+ if (unlikely(cpumask_empty(new_attrs->cpumask)))
+ cpumask_copy(new_attrs->cpumask, wq_unbound_cpumask);
ctx->dfl_pwq = alloc_unbound_pwq(wq, new_attrs);
if (!ctx->dfl_pwq)
goto out_free;

for_each_node(node) {
- if (wq_calc_node_cpumask(attrs, node, -1, tmp_attrs->cpumask)) {
+ if (wq_calc_node_cpumask(new_attrs, node, -1, tmp_attrs->cpumask)) {
ctx->pwq_tbl[node] = alloc_unbound_pwq(wq, tmp_attrs);
if (!ctx->pwq_tbl[node])
goto out_free;
@@ -3564,7 +3570,10 @@ apply_wqattrs_prepare(struct workqueue_struct *wq,
}
}

+ /* save the user configured attrs */
+ cpumask_and(new_attrs->cpumask, attrs->cpumask, cpu_possible_mask);
ctx->attrs = new_attrs;
+
ctx->wq = wq;
free_workqueue_attrs(tmp_attrs);
return ctx;
@@ -3705,11 +3714,11 @@ static void wq_update_unbound_numa(struct workqueue_struct *wq, int cpu,

/*
* Let's determine what needs to be done. If the target cpumask is
- * different from wq's, we need to compare it to @pwq's and create
- * a new one if they don't match. If the target cpumask equals
- * wq's, the default pwq should be used.
+ * different from the default pwq's, we need to compare it to @pwq's
+ * and create a new one if they don't match. If the target cpumask
+ * equals the default pwq's, the default pwq should be used.
*/
- if (wq_calc_node_cpumask(wq->unbound_attrs, node, cpu_off, cpumask)) {
+ if (wq_calc_node_cpumask(wq->dfl_pwq->pool->attrs, node, cpu_off, cpumask)) {
if (cpumask_equal(cpumask, pwq->pool->attrs->cpumask))
goto out_unlock;
} else {
@@ -4732,6 +4741,81 @@ out_unlock:
}
#endif /* CONFIG_FREEZER */

+static int workqueue_apply_unbound_cpumask(void)
+{
+ LIST_HEAD(ctxs);
+ int ret = 0;
+ struct workqueue_struct *wq;
+ struct apply_wqattrs_ctx *ctx, *n;
+
+ lockdep_assert_held(&wq_pool_mutex);
+
+ list_for_each_entry(wq, &workqueues, list) {
+ if (!(wq->flags & WQ_UNBOUND))
+ continue;
+ /* creating multiple pwqs breaks ordering guarantee */
+ if (wq->flags & __WQ_ORDERED)
+ continue;
+
+ ctx = apply_wqattrs_prepare(wq, wq->unbound_attrs);
+ if (!ctx) {
+ ret = -ENOMEM;
+ break;
+ }
+
+ list_add_tail(&ctx->list, &ctxs);
+ }
+
+ list_for_each_entry_safe(ctx, n, &ctxs, list) {
+ list_del(&ctx->list);
+ if (!ret)
+ apply_wqattrs_commit(ctx);
+ apply_wqattrs_cleanup(ctx);
+ }
+
+ return ret;
+}
+
+/**
+ * workqueue_set_unbound_cpumask - Set the low-level unbound cpumask
+ * @cpumask: the cpumask to set
+ *
+ * The low-level workqueues cpumask is a global cpumask that limits
+ * the affinity of all unbound workqueues. This function checks @cpumask
+ * and applies it to all unbound workqueues and updates all their pwqs.
+ * When all succeed, it saves @cpumask to the global low-level unbound
+ * cpumask.
+ *
+ * Return: 0 - Success
+ * -EINVAL - No online cpu in the @cpumask
+ * -ENOMEM - Failed to allocate memory for attrs or pwqs.
+ */
+int workqueue_set_unbound_cpumask(cpumask_var_t cpumask)
+{
+ int ret = -EINVAL;
+ cpumask_var_t saved_cpumask;
+
+ if (!zalloc_cpumask_var(&saved_cpumask, GFP_KERNEL))
+ return -ENOMEM;
+
+ get_online_cpus();
+ cpumask_and(cpumask, cpumask, cpu_possible_mask);
+ if (!cpumask_empty(cpumask)) {
+ mutex_lock(&wq_pool_mutex);
+ cpumask_copy(saved_cpumask, wq_unbound_cpumask);
+ cpumask_copy(wq_unbound_cpumask, cpumask);
+ ret = workqueue_apply_unbound_cpumask();
+ if (ret < 0)
+ cpumask_copy(wq_unbound_cpumask, saved_cpumask);
+ mutex_unlock(&wq_pool_mutex);
+ }
+ put_online_cpus();
+
+ free_cpumask_var(saved_cpumask);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(workqueue_set_unbound_cpumask);
+
#ifdef CONFIG_SYSFS
/*
* Workqueues with WQ_SYSFS flag set is visible to userland via
@@ -4953,14 +5037,34 @@ static ssize_t wq_unbound_cpumask_show(struct device *dev,
{
int written;

+ mutex_lock(&wq_pool_mutex);
written = scnprintf(buf, PAGE_SIZE, "%*pb\n",
cpumask_pr_args(wq_unbound_cpumask));
+ mutex_unlock(&wq_pool_mutex);

return written;
}

+static ssize_t wq_unbound_cpumask_store(struct device *dev,
+ struct device_attribute *attr, const char *buf, size_t count)
+{
+ cpumask_var_t cpumask;
+ int ret;
+
+ if (!zalloc_cpumask_var(&cpumask, GFP_KERNEL))
+ return -ENOMEM;
+
+ ret = cpumask_parse(buf, cpumask);
+ if (!ret)
+ ret = workqueue_set_unbound_cpumask(cpumask);
+
+ free_cpumask_var(cpumask);
+ return ret ? ret : count;
+}
+
static struct device_attribute wq_sysfs_cpumask_attr =
- __ATTR(cpumask, 0444, wq_unbound_cpumask_show, NULL);
+ __ATTR(cpumask, 0644, wq_unbound_cpumask_show,
+ wq_unbound_cpumask_store);

static int __init wq_sysfs_init(void)
{
--
2.1.0

2015-04-27 15:45:52

by Tejun Heo

Subject: Re: [PATCH 2/3 V8] workqueue: Create low-level unbound workqueues cpumask

On Mon, Apr 27, 2015 at 05:58:39PM +0800, Lai Jiangshan wrote:
> From: Frederic Weisbecker <[email protected]>
>
> Create a cpumask that limits the affinity of all unbound workqueues.
> This cpumask is controlled through a file at the root of the workqueue
> sysfs directory.
>
> It works at a lower level than the per-WQ_SYSFS workqueue cpumask files,
> such that the effective cpumask applied for a given unbound workqueue is
> the intersection of /sys/devices/virtual/workqueue/$WORKQUEUE/cpumask and
> the new /sys/devices/virtual/workqueue/cpumask file.
>
> This patch implements the basic infrastructure and the read interface.
> wq_unbound_cpumask is initially set to cpu_possible_mask.
>
> Cc: Christoph Lameter <[email protected]>
> Cc: Kevin Hilman <[email protected]>
> Cc: Lai Jiangshan <[email protected]>
> Cc: Mike Galbraith <[email protected]>
> Cc: Paul E. McKenney <[email protected]>
> Cc: Tejun Heo <[email protected]>
> Cc: Viresh Kumar <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Signed-off-by: Lai Jiangshan <[email protected]>

Applied 1-2 to wq/for-4.2 w/ minor comment updates.

Thanks.

--
tejun

2015-04-27 16:08:02

by Tejun Heo

Subject: Re: [PATCH 3/3 V8] workqueue: Allow modifying low level unbound workqueue cpumask

Hello, Lai.

Overall, it looks good, just a couple more nits.

On Mon, Apr 27, 2015 at 05:58:40PM +0800, Lai Jiangshan wrote:
> The oreder-workquue is ignore from the low level unbound workqueue

Ordered workqueues are ignored

> cpumask, it will be handled in near future.
>
> All the (default & per-nodes') pwqs are mandatorily controlled by
default & per-node
> the low level cpumask. If the user configured cpumask doesn't overlap
> with the low level cpumask, the low level cpumask will be used for the
> wq instead.
>
> The default wq_unbound_cpumask is still cpu_possible_mask due to the workqueue
> subsystem doesn't know what is the best default value for the runtime, the
> system manager or other subsystem which knows the sufficient information should set
> it when needed.

Please re-flow the paragraph. Also, ultimately, we want this to
consider isolcpus, right?

> --- a/include/linux/workqueue.h
> +++ b/include/linux/workqueue.h
> @@ -424,6 +424,7 @@ struct workqueue_attrs *alloc_workqueue_attrs(gfp_t gfp_mask);
> void free_workqueue_attrs(struct workqueue_attrs *attrs);
> int apply_workqueue_attrs(struct workqueue_struct *wq,
> const struct workqueue_attrs *attrs);
> +int workqueue_set_unbound_cpumask(cpumask_var_t cpumask);

Why is this a public function?

> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3548,13 +3549,18 @@ apply_wqattrs_prepare(struct workqueue_struct *wq,
> * If something goes wrong during CPU up/down, we'll fall back to
> * the default pwq covering whole @attrs->cpumask. Always create
> * it even if we don't use it immediately.
> + *
> + * If the user configured cpumask doesn't overlap with the
> + * wq_unbound_cpumask, we fallback to the wq_unbound_cpumask.
> */
> + if (unlikely(cpumask_empty(new_attrs->cpumask)))
> + cpumask_copy(new_attrs->cpumask, wq_unbound_cpumask);

Please see below.

> ctx->dfl_pwq = alloc_unbound_pwq(wq, new_attrs);
> if (!ctx->dfl_pwq)
> goto out_free;
>
> for_each_node(node) {
> - if (wq_calc_node_cpumask(attrs, node, -1, tmp_attrs->cpumask)) {
> + if (wq_calc_node_cpumask(new_attrs, node, -1, tmp_attrs->cpumask)) {
> ctx->pwq_tbl[node] = alloc_unbound_pwq(wq, tmp_attrs);
> if (!ctx->pwq_tbl[node])
> goto out_free;
> @@ -3564,7 +3570,10 @@ apply_wqattrs_prepare(struct workqueue_struct *wq,
> }
> }
>
> + /* save the user configured attrs */
> + cpumask_and(new_attrs->cpumask, attrs->cpumask, cpu_possible_mask);

Wouldn't this make a lot more sense above when copying @attrs into
@new_attrs? The comment there even says "make a copy of @attrs and
sanitize it". Copy to @new_attrs, mask with wq_unbound_cpumask and
fall back to wq_unbound_cpumask if empty.

> +static int workqueue_apply_unbound_cpumask(void)
> +{
...
> + list_for_each_entry_safe(ctx, n, &ctxs, list) {

Is the following list_del() necessary? The list is never used again,
right?

> + list_del(&ctx->list);
> + if (!ret)
> + apply_wqattrs_commit(ctx);
> + apply_wqattrs_cleanup(ctx);
> + }
> +
> + return ret;
> +}
...
> +int workqueue_set_unbound_cpumask(cpumask_var_t cpumask)
> +{
...
> +}
> +EXPORT_SYMBOL_GPL(workqueue_set_unbound_cpumask);

Again, why is this exported? Who's the expected user?

Thanks.

--
tejun

2015-04-28 02:01:24

by Lai Jiangshan

Subject: Re: [PATCH 3/3 V8] workqueue: Allow modifying low level unbound workqueue cpumask

Hello

>
>> --- a/include/linux/workqueue.h
>> +++ b/include/linux/workqueue.h
>> @@ -424,6 +424,7 @@ struct workqueue_attrs *alloc_workqueue_attrs(gfp_t gfp_mask);
>> void free_workqueue_attrs(struct workqueue_attrs *attrs);
>> int apply_workqueue_attrs(struct workqueue_struct *wq,
>> const struct workqueue_attrs *attrs);
>> +int workqueue_set_unbound_cpumask(cpumask_var_t cpumask);
>
> Why is this a public function?


In the V4 patchset, Kevin Hilman had requested that wq_unbound_cpumask
be "cpumask_complement(wq_unbound_cpumask, tick_nohz_full_mask);".

I argued against it and suggested that wq_unbound_cpumask can be
re-set after the workqueue code has initialized it.

And Frederic Weisbecker seemed to be on my side:
"""
If it should be the default on NO_HZ_FULL, maybe we should do this from the
tick nohz code. Some late or fs initcall that will do the workqueue affinity,
timer affinity, etc...
"""

So, we need an API to modify the wq_unbound_cpumask, and I provided
this public function. Otherwise, other code can't modify it.
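
For example, a NO_HZ_FULL user could do something like the following
from a late initcall (a purely hypothetical sketch along the lines
Frederic suggested; isolate_unbound_wqs() is made up here):

	#include <linux/tick.h>
	#include <linux/workqueue.h>

	/* Hypothetical: steer unbound work off nohz_full CPUs at boot. */
	static int __init isolate_unbound_wqs(void)
	{
		cpumask_var_t mask;

		if (!tick_nohz_full_enabled())
			return 0;
		if (!alloc_cpumask_var(&mask, GFP_KERNEL))
			return -ENOMEM;

		cpumask_andnot(mask, cpu_possible_mask, tick_nohz_full_mask);
		workqueue_set_unbound_cpumask(mask);
		free_cpumask_var(mask);
		return 0;
	}
	late_initcall(isolate_unbound_wqs);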

>
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -3548,13 +3549,18 @@ apply_wqattrs_prepare(struct workqueue_struct *wq,
>> * If something goes wrong during CPU up/down, we'll fall back to
>> * the default pwq covering whole @attrs->cpumask. Always create
>> * it even if we don't use it immediately.
>> + *
>> + * If the user configured cpumask doesn't overlap with the
>> + * wq_unbound_cpumask, we fallback to the wq_unbound_cpumask.
>> */
>> + if (unlikely(cpumask_empty(new_attrs->cpumask)))
>> + cpumask_copy(new_attrs->cpumask, wq_unbound_cpumask);
>
> Please see below.
>
>> ctx->dfl_pwq = alloc_unbound_pwq(wq, new_attrs);
>> if (!ctx->dfl_pwq)
>> goto out_free;
>>
>> for_each_node(node) {
>> - if (wq_calc_node_cpumask(attrs, node, -1, tmp_attrs->cpumask)) {
>> + if (wq_calc_node_cpumask(new_attrs, node, -1, tmp_attrs->cpumask)) {
>> ctx->pwq_tbl[node] = alloc_unbound_pwq(wq, tmp_attrs);
>> if (!ctx->pwq_tbl[node])
>> goto out_free;
>> @@ -3564,7 +3570,10 @@ apply_wqattrs_prepare(struct workqueue_struct *wq,
>> }
>> }
>>
>> + /* save the user configured attrs */
>> + cpumask_and(new_attrs->cpumask, attrs->cpumask, cpu_possible_mask);
>
> Wouldn't this make a lot more sense above when copying @attrs into
> @new_attrs? The comment there even says "make a copy of @attrs and
> sanitize it". Copy to @new_attrs, mask with wq_unbound_cpumask and
> fall back to wq_unbound_cpumask if empty.

It should be:

+ copy_workqueue_attrs(new_attrs, attrs);
+ cpumask_and(new_attrs->cpumask, new_attrs->cpumask, cpu_possible_mask);

>
>> +static int workqueue_apply_unbound_cpumask(void)
>> +{
> ...
>> + list_for_each_entry_safe(ctx, n, &ctxs, list) {
>
> Is the following list_del() necessary? The list is never used again,
> right?

It isn't necessary. It was added in V7. I thought it could make
the code more regular.


Thanks
Lai

2015-04-28 02:21:16

by Lai Jiangshan

Subject: Re: [PATCH 3/3 V8] workqueue: Allow modifying low level unbound workqueue cpumask

On 04/28/2015 09:44 AM, Lai Jiangshan wrote:

>>>
>>> + /* save the user configured attrs */
>>> + cpumask_and(new_attrs->cpumask, attrs->cpumask, cpu_possible_mask);
>>
>> Wouldn't this make a lot more sense above when copying @attrs into
>> @new_attrs? The comment there even says "make a copy of @attrs and
>> sanitize it". Copy to @new_attrs, mask with wq_unbound_cpumask and
>> fall back to wq_unbound_cpumask if empty.


We need to save the user's originally configured attrs.
Whenever wq_unbound_cpumask is changed, we should use
the originally configured attrs (cpumask) to re-calculate
the pwqs, to avoid losing any information.
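
An illustrative sketch of why (the variable names are made up): suppose
the user configured cpus 0-7 and wq_unbound_cpumask shrinks to 0-3,
then grows back to 0-7. If only the masked result were stored, the
bits would be lost for good:

	/* cumulative masking loses bits permanently */
	cpumask_and(saved, saved, wq_unbound_cpumask); /* 0-7 & 0-3 -> 0-3 */
	cpumask_and(saved, saved, wq_unbound_cpumask); /* 0-3 & 0-7 -> 0-3 */

	/* keeping the original user mask avoids this */
	cpumask_and(effective, user_mask, wq_unbound_cpumask);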

>
> It should be:
>
> + copy_workqueue_attrs(new_attrs, attrs);
> + cpumask_and(new_attrs->cpumask, new_attrs->cpumask, cpu_possible_mask);
>

2015-04-28 03:44:46

by Tejun Heo

Subject: Re: [PATCH 3/3 V8] workqueue: Allow modifying low level unbound workqueue cpumask

Hello,

On Tue, Apr 28, 2015 at 09:44:44AM +0800, Lai Jiangshan wrote:
> >> +int workqueue_set_unbound_cpumask(cpumask_var_t cpumask);
> >
> > Why is this a public function?
>
>
> In the V4 patchset, Kevin Hilman had requested that wq_unbound_cpumask
> be "cpumask_complement(wq_unbound_cpumask, tick_nohz_full_mask);".
>
> I argued against it and suggested that wq_unbound_cpumask can be
> re-set after the workqueue code has initialized it.
>
> And Frederic Weisbecker seemed to be on my side:
> """
> If it should be the default on NO_HZ_FULL, maybe we should do this from the
> tick nohz code. Some late or fs initcall that will do the workqueue affinity,
> timer affinity, etc...
> """
>
> So, we need an API to modify the wq_unbound_cpumask, and I provided
> this public function. Otherwise, other code can't modify it.

I see. I don't have too strong an opinion; however, changing the mask
is a fairly heavy operation. Are there specific reasons why we don't
want to follow the nohz config right away? Also, even if we do it
this way, the function doesn't need to be EXPORT_SYMBOL_GPL()'d,
right?

> > Is the following list_del() necessary? The list is never used again,
> > right?
>
> It isn't necessary. It was added in V7. I thought it could make
> the code more regular.

The problem with doing unnecessary stuff is that it's bound to be
inconsistent and makes the reader wonder whether something else that
requires such an extra operation is going on when there's none. It tends
to mislead more than anything else.

Thanks.

--
tejun

2015-04-28 03:49:29

by Tejun Heo

Subject: Re: [PATCH 3/3 V8] workqueue: Allow modifying low level unbound workqueue cpumask

Hello,

On Tue, Apr 28, 2015 at 10:24:31AM +0800, Lai Jiangshan wrote:
> >> Wouldn't this make a lot more sense above when copying @attrs into
> >> @new_attrs? The comment there even says "make a copy of @attrs and
> >> sanitize it". Copy to @new_attrs, mask with wq_unbound_cpumask and
> >> fall back to wq_unbound_cpumask if empty.
>
> We need to save the user original configured attrs.
> When any time wq_unbound_cpumask is changed, we should use
> the user original configured attrs (cpumask) to re-calculate
> the pwqs and avoid losing any information.

Sure, we can do that for new_attrs and then mask tmp_attrs further w/
wq_unbound_cpumask, no?

Thanks.

--
tejun

2015-04-28 04:36:49

by Mike Galbraith

Subject: Re: [PATCH 3/3 V8] workqueue: Allow modifying low level unbound workqueue cpumask

On Mon, 2015-04-27 at 23:44 -0400, Tejun Heo wrote:

> > So, we need an API to modify the wq_unbound_cpumask, and I provided
> > this public function. Otherwise, other code can't modify it.
>
> I see. I don't have too strong an opinion; however, changing the mask
> is a fairly heavy operation. Are there specific reasons why we don't
> want to follow the nohz config right away?

Isolation is not only applicable to nohz_full. Many loads are
unsuitable for nohz_full, yet require maximum isolation.

ATM, nohz_full is not dynamic, but hopefully one day will be. In the
here and now, we can isolate cores from the scheduler on the fly via
cpusets, a prime API user candidate.

-Mike

2015-04-28 10:12:55

by Lai Jiangshan

Subject: Re: [PATCH 3/3 V8] workqueue: Allow modifying low level unbound workqueue cpumask

On 04/28/2015 11:49 AM, Tejun Heo wrote:
> Hello,
>
> On Tue, Apr 28, 2015 at 10:24:31AM +0800, Lai Jiangshan wrote:
>>>> Wouldn't this make a lot more sense above when copying @attrs into
>>>> @new_attrs? The comment there even says "make a copy of @attrs and
>>>> sanitize it". Copy to @new_attrs, mask with wq_unbound_cpumask and
>>>> fall back to wq_unbound_cpumask if empty.
>>
>> We need to save the user's originally configured attrs.
>> Whenever wq_unbound_cpumask is changed, we should use
>> the originally configured attrs (cpumask) to re-calculate
>> the pwqs, to avoid losing any information.
>
> Sure, we can do that for new_attrs and then mask tmp_attrs further w/
> wq_unbound_cpumask, no?
>
> Thanks.
>

We need to pass new_attrs to wq_calc_node_cpumask().

If new_attrs (the first argument of wq_calc_node_cpumask()) is not masked
with wq_unbound_cpumask when passed in, wq_calc_node_cpumask()
will be much more complicated (I tried coding it yesterday).

Quote:
static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
int cpu_going_down, cpumask_t *cpumask)
{
if (!wq_numa_enabled || attrs->no_numa)
goto use_dfl;

/* does @node have any online CPUs @attrs wants? */
cpumask_and(cpumask, cpumask_of_node(node), attrs->cpumask); [1]
if (cpu_going_down >= 0)
cpumask_clear_cpu(cpu_going_down, cpumask);

if (cpumask_empty(cpumask))
goto use_dfl;

/* yeap, return possible CPUs in @node that @attrs wants */
cpumask_and(cpumask, attrs->cpumask, wq_numa_possible_cpumask[node]); [2]
return !cpumask_equal(cpumask, attrs->cpumask); [3]

use_dfl:
cpumask_copy(cpumask, attrs->cpumask); [4]
return false;
}


If @attrs is not masked with wq_unbound_cpumask when passed in, the code
needs to add two maskings (with wq_unbound_cpumask) at [1] and [2].

And the code needs the cpumask of the default pwq at [3] & [4]; thus the
code would need to (re-)calculate the default pwq's attrs here, doubling
the code (this calculation is already done before this function).

It keeps everything simple and avoids complicating wq_calc_node_cpumask()
if wq_calc_node_cpumask() is kept unchanged and accepts only the default
pwq's attrs as its first argument.

The call-site in wq_update_unbound_numa() is changed in V8 to meet this requirement.

@@ -3705,11 +3714,11 @@ static void wq_update_unbound_numa(struct workqueue_struct *wq, int cpu,

/*
* Let's determine what needs to be done. If the target cpumask is
- * different from wq's, we need to compare it to @pwq's and create
- * a new one if they don't match. If the target cpumask equals
- * wq's, the default pwq should be used.
+ * different from the default pwq's, we need to compare it to @pwq's
+ * and create a new one if they don't match. If the target cpumask
+ * equals the default pwq's, the default pwq should be used.
*/
- if (wq_calc_node_cpumask(wq->unbound_attrs, node, cpu_off, cpumask)) {
+ if (wq_calc_node_cpumask(wq->dfl_pwq->pool->attrs, node, cpu_off, cpumask)) {
if (cpumask_equal(cpumask, pwq->pool->attrs->cpumask))
goto out_unlock;
} else {


This is not a new requirement. In the code before this patch, the
argument @attrs for wq_calc_node_cpumask() is expected to be the default
pwq's attrs, which happens to be wq->unbound_attrs all the time.

In the code after this patch, the argument @attrs for wq_calc_node_cpumask()
is still expected to be the default pwq's attrs which may not be
wq->unbound_attrs.

So the requirement is not new and wq_calc_node_cpumask() is untouched,
but the comment for wq_calc_node_cpumask() needs to be updated, which
I should have done; forgive me.

Thanks,
Lai.


2015-04-28 10:28:20

by Lai Jiangshan

Subject: Re: [PATCH 3/3 V8] workqueue: Allow modifying low level unbound workqueue cpumask

On 04/28/2015 12:36 PM, Mike Galbraith wrote:
> On Mon, 2015-04-27 at 23:44 -0400, Tejun Heo wrote:
>
>>> So, we need an API to modify the wq_unbound_cpumask, and I provided
>>> this public function. Otherwise, other code can't modify it.
>>
>> I see. I don't have too strong an opinion; however, changing the mask
>> is a fairly heavy operation. Are there specific reasons why we don't
>> want to follow the nohz config right away?
>
> Isolation is not only applicable to nohz_full. Many loads are
> unsuitable for nohz_full, yet require maximum isolation.
>
> ATM, nohz_full is not dynamic, but hopefully one day will be. In the
> here and now, we can isolate cores from the scheduler on the fly via
> cpusets, a prime API user candidate.
>
> -Mike
>


So, the public function needs to be kept and the EXPORT_SYMBOL_GPL()
is killed?

2015-04-28 12:15:32

by Mike Galbraith

Subject: Re: [PATCH 3/3 V8] workqueue: Allow modifying low level unbound workqueue cpumask

On Tue, 2015-04-28 at 18:31 +0800, Lai Jiangshan wrote:

> So, the public function needs to be kept and the EXPORT_SYMBOL_GPL()
> is killed?

IMO yes.

-Mike

2015-04-30 09:20:39

by Lai Jiangshan

Subject: Re: [PATCH 3/3 V8] workqueue: Allow modifying low level unbound workqueue cpumask

On 04/28/2015 06:16 PM, Lai Jiangshan wrote:
> On 04/28/2015 11:49 AM, Tejun Heo wrote:
>> Hello,
>>
>> On Tue, Apr 28, 2015 at 10:24:31AM +0800, Lai Jiangshan wrote:
>>>>> Wouldn't this make a lot more sense above when copying @attrs into
>>>>> @new_attrs? The comment there even says "make a copy of @attrs and
>>>>> sanitize it". Copy to @new_attrs, mask with wq_unbound_cpumask and
>>>>> fall back to wq_unbound_cpumask if empty.
>>>
>>> We need to save the user's originally configured attrs.
>>> Whenever wq_unbound_cpumask is changed, we should use
>>> the originally configured attrs (cpumask) to re-calculate
>>> the pwqs, to avoid losing any information.
>>
>> Sure, we can do that for new_attrs and then mask tmp_attrs further w/
>> wq_unbound_cpumask, no?

Hello, TJ,

I didn't accept this comment of yours in the V9 patch.

I explained it in another long email (embedded here).
I will be away for several days, so I sent the V9 patch with an
unsettled comment.

Thanks,
Lai

>>
>> Thanks.
>>
>
> We need to pass new_attrs to wq_calc_node_cpumask().
>
> If new_attrs (the first argument of wq_calc_node_cpumask()) is not masked
> with wq_unbound_cpumask when passed in, wq_calc_node_cpumask()
> will be much more complicated (I tried coding it yesterday).
>
> Quote:
> static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
> int cpu_going_down, cpumask_t *cpumask)
> {
> if (!wq_numa_enabled || attrs->no_numa)
> goto use_dfl;
>
> /* does @node have any online CPUs @attrs wants? */
> cpumask_and(cpumask, cpumask_of_node(node), attrs->cpumask); [1]
> if (cpu_going_down >= 0)
> cpumask_clear_cpu(cpu_going_down, cpumask);
>
> if (cpumask_empty(cpumask))
> goto use_dfl;
>
> /* yeap, return possible CPUs in @node that @attrs wants */
> cpumask_and(cpumask, attrs->cpumask, wq_numa_possible_cpumask[node]); [2]
> return !cpumask_equal(cpumask, attrs->cpumask); [3]
>
> use_dfl:
> cpumask_copy(cpumask, attrs->cpumask); [4]
> return false;
> }
>
>
> If @attrs is not masked with wq_unbound_cpumask when passed in, the code
> needs to add two maskings (with wq_unbound_cpumask) at [1] and [2].
>
> And the code needs the cpumask of the default pwq at [3] & [4]; thus the
> code would need to (re-)calculate the default pwq's attrs here, doubling
> the code (this calculation is already done before this function).
>
> It keeps everything simple and avoids complicating wq_calc_node_cpumask()
> if wq_calc_node_cpumask() is kept unchanged and accepts only the default
> pwq's attrs as its first argument.
>
> The call-site in wq_update_unbound_numa() is changed in V8 to meet this requirement.
>
> @@ -3705,11 +3714,11 @@ static void wq_update_unbound_numa(struct workqueue_struct *wq, int cpu,
>
> /*
> * Let's determine what needs to be done. If the target cpumask is
> - * different from wq's, we need to compare it to @pwq's and create
> - * a new one if they don't match. If the target cpumask equals
> - * wq's, the default pwq should be used.
> + * different from the default pwq's, we need to compare it to @pwq's
> + * and create a new one if they don't match. If the target cpumask
> + * equals the default pwq's, the default pwq should be used.
> */
> - if (wq_calc_node_cpumask(wq->unbound_attrs, node, cpu_off, cpumask)) {
> + if (wq_calc_node_cpumask(wq->dfl_pwq->pool->attrs, node, cpu_off, cpumask)) {
> if (cpumask_equal(cpumask, pwq->pool->attrs->cpumask))
> goto out_unlock;
> } else {
>
>
> This is not a new requirement. In the code before this patch, the
> argument @attrs for wq_calc_node_cpumask() is expected to be the default
> pwq's attrs, which happens to be wq->unbound_attrs all the time.
>
> In the code after this patch, the argument @attrs for wq_calc_node_cpumask()
> is still expected to be the default pwq's attrs which may not be
> wq->unbound_attrs.
>
> So the requirement is not new and wq_calc_node_cpumask() is untouched,
> but the comment for wq_calc_node_cpumask() needs to be updated, which
> I should have done; forgive me.
>
> Thanks,
> Lai.