2022-07-28 12:28:02

by Tetsuo Handa

Subject: [PATCH] workqueue: don't skip lockdep wq dependency in cancel_work_sync()

As Hillf Danton mentioned in [1]:

syzbot should have been able to catch cancel_work_sync() in work context
by checking lockdep_map in __flush_work() for both flush and cancel.

Being unable to report the obvious deadlock scenario shown below is
broken behavior. From a locking dependency perspective, the sync version
of a cancel request should behave like a flush request, for it waits for
the work to complete if that work has already started executing.

----------
#include <linux/module.h>
#include <linux/sched.h>
#include <linux/mutex.h>
#include <linux/workqueue.h>

static DEFINE_MUTEX(mutex);

/* Sleeps long enough for test_init() to take the mutex, then blocks on it. */
static void work_fn(struct work_struct *work)
{
        schedule_timeout_uninterruptible(HZ / 5);
        mutex_lock(&mutex);
        mutex_unlock(&mutex);
}
static DECLARE_WORK(work, work_fn);

static int __init test_init(void)
{
        schedule_work(&work);
        schedule_timeout_uninterruptible(HZ / 10);
        mutex_lock(&mutex);
        /* Waits for work_fn(), which is waiting for mutex: deadlock. */
        cancel_work_sync(&work);
        mutex_unlock(&mutex);
        return -EINVAL;
}
module_init(test_init);
MODULE_LICENSE("GPL");
----------

Link: https://lkml.kernel.org/r/[email protected] [1]
Reported-by: Hillf Danton <[email protected]>
Fixes: d6e89786bed977f3 ("workqueue: skip lockdep wq dependency in cancel_work_sync()")
Cc: Johannes Berg <[email protected]>
Signed-off-by: Tetsuo Handa <[email protected]>
---
kernel/workqueue.c | 45 ++++++++++++++++++---------------------------
1 file changed, 18 insertions(+), 27 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 1ea50f6be843..e6df688f84db 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3000,8 +3000,7 @@ void drain_workqueue(struct workqueue_struct *wq)
}
EXPORT_SYMBOL_GPL(drain_workqueue);

-static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
- bool from_cancel)
+static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
{
struct worker *worker = NULL;
struct worker_pool *pool;
@@ -3043,8 +3042,7 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
* workqueues the deadlock happens when the rescuer stalls, blocking
* forward progress.
*/
- if (!from_cancel &&
- (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)) {
+ if (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer) {
lock_map_acquire(&pwq->wq->lockdep_map);
lock_map_release(&pwq->wq->lockdep_map);
}
@@ -3056,7 +3054,18 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
return false;
}

-static bool __flush_work(struct work_struct *work, bool from_cancel)
+/**
+ * flush_work - wait for a work to finish executing the last queueing instance
+ * @work: the work to flush
+ *
+ * Wait until @work has finished execution. @work is guaranteed to be idle
+ * on return if it hasn't been requeued since flush started.
+ *
+ * Return:
+ * %true if flush_work() waited for the work to finish execution,
+ * %false if it was already idle.
+ */
+bool flush_work(struct work_struct *work)
{
struct wq_barrier barr;

@@ -3066,12 +3075,10 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
if (WARN_ON(!work->func))
return false;

- if (!from_cancel) {
- lock_map_acquire(&work->lockdep_map);
- lock_map_release(&work->lockdep_map);
- }
+ lock_map_acquire(&work->lockdep_map);
+ lock_map_release(&work->lockdep_map);

- if (start_flush_work(work, &barr, from_cancel)) {
+ if (start_flush_work(work, &barr)) {
wait_for_completion(&barr.done);
destroy_work_on_stack(&barr.work);
return true;
@@ -3079,22 +3086,6 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
return false;
}
}
-
-/**
- * flush_work - wait for a work to finish executing the last queueing instance
- * @work: the work to flush
- *
- * Wait until @work has finished execution. @work is guaranteed to be idle
- * on return if it hasn't been requeued since flush started.
- *
- * Return:
- * %true if flush_work() waited for the work to finish execution,
- * %false if it was already idle.
- */
-bool flush_work(struct work_struct *work)
-{
- return __flush_work(work, false);
-}
EXPORT_SYMBOL_GPL(flush_work);

struct cwt_wait {
@@ -3159,7 +3150,7 @@ static bool __cancel_work_timer(struct work_struct *work, bool is_dwork)
* isn't executing.
*/
if (wq_online)
- __flush_work(work, true);
+ flush_work(work);

clear_work_data(work);

--
2.18.4


2022-07-28 16:49:16

by Tejun Heo

Subject: Re: [PATCH] workqueue: don't skip lockdep wq dependency in cancel_work_sync()

On Thu, Jul 28, 2022 at 09:23:25PM +0900, Tetsuo Handa wrote:
> As Hillf Danton mentioned in [1]:
>
> syzbot should have been able to catch cancel_work_sync() in work context
> by checking lockdep_map in __flush_work() for both flush and cancel.
>
> Being unable to report the obvious deadlock scenario shown below is
> broken behavior. From a locking dependency perspective, the sync version
> of a cancel request should behave like a flush request, for it waits for
> the work to complete if that work has already started executing.
>
> ----------
> #include <linux/module.h>
> #include <linux/sched.h>
> #include <linux/mutex.h>
> #include <linux/workqueue.h>
>
> static DEFINE_MUTEX(mutex);
>
> /* Sleeps long enough for test_init() to take the mutex, then blocks on it. */
> static void work_fn(struct work_struct *work)
> {
>         schedule_timeout_uninterruptible(HZ / 5);
>         mutex_lock(&mutex);
>         mutex_unlock(&mutex);
> }
> static DECLARE_WORK(work, work_fn);
>
> static int __init test_init(void)
> {
>         schedule_work(&work);
>         schedule_timeout_uninterruptible(HZ / 10);
>         mutex_lock(&mutex);
>         /* Waits for work_fn(), which is waiting for mutex: deadlock. */
>         cancel_work_sync(&work);
>         mutex_unlock(&mutex);
>         return -EINVAL;
> }
> module_init(test_init);
> MODULE_LICENSE("GPL");
> ----------
>
> Link: https://lkml.kernel.org/r/[email protected] [1]
> Reported-by: Hillf Danton <[email protected]>
> Fixes: d6e89786bed977f3 ("workqueue: skip lockdep wq dependency in cancel_work_sync()")

Tetsuo, you gotta explain why this is okay w.r.t. the spurious warnings that
the above commit addressed. You can't just state that there are cases which
are missed and then revert it.

Thanks.

--
tejun

2022-07-28 23:34:43

by Tetsuo Handa

Subject: Re: [PATCH] workqueue: don't skip lockdep wq dependency in cancel_work_sync()

On 2022/07/29 1:40, Tejun Heo wrote:
> Tetsuo, you gotta explain why this is okay w.r.t. the spurious warnings that
> the above commit addressed. You can't just state that there are cases which
> are missed and then revert it.

There are four commits related to this change.

commit 87915adc3f0acdf03c776df42e308e5a155c19af
Author: Johannes Berg <[email protected]>
Date: Wed Aug 22 11:49:04 2018 +0200

workqueue: re-add lockdep dependencies for flushing

commit d6e89786bed977f37f55ffca11e563f6d2b1e3b5
Author: Johannes Berg <[email protected]>
Date: Wed Aug 22 11:49:03 2018 +0200

workqueue: skip lockdep wq dependency in cancel_work_sync()

commit fd1a5b04dfb899f84ddeb8acdaea6b98283df1e5
Author: Byungchul Park <[email protected]>
Date: Wed Oct 25 17:56:04 2017 +0900

workqueue: Remove now redundant lock acquisitions wrt. workqueue flushes

commit 0976dfc1d0cd80a4e9dfaf87bd8744612bde475a
Author: Stephen Boyd <[email protected]>
Date: Fri Apr 20 17:28:50 2012 -0700

workqueue: Catch more locking problems with flush_work()

Commit 87915adc3f0acdf0 ("workqueue: re-add lockdep dependencies for
flushing"), which says

These were removed after cross-release partially caught these
problems, but now cross-release was reverted anyway. IMHO the
removal was erroneous anyway though, since lockdep should be
able to catch potential problems, not just actual ones, and
cross-release would only have caught the problem when actually
invoking wait_for_completion().

is the answer, and commit 0976dfc1d0cd80a4 ("workqueue: Catch more locking
problems with flush_work()"), which says

Add a lockdep hint by acquiring and releasing the work item
lockdep_map in flush_work() so that we always catch this
potential deadlock scenario.

is what this patch restores.
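
For reference, the hint in question is nothing more than an empty
acquire/release pair on the work item's lockdep_map, taken on every
flush (a sketch of the pattern, not the complete __flush_work()):

----------
        /*
         * Record a "currently held locks => work" dependency in lockdep
         * before waiting, even if the work turns out to be idle and
         * nothing actually blocks here, so that potential deadlocks are
         * caught and not just actual ones.
         */
        lock_map_acquire(&work->lockdep_map);
        lock_map_release(&work->lockdep_map);
----------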

2022-07-29 01:52:15

by Tetsuo Handa

Subject: Re: [PATCH] workqueue: don't skip lockdep wq dependency in cancel_work_sync()

Johannes, why did you think that flagging cancel_work_sync() as if it
were flush_work() is a problem?

Unconditionally recording

"struct mutex" mutex->lockdep_map => "struct work_struct" work1->lockdep_map
"struct mutex" mutex->lockdep_map => "struct work_struct" work2->lockdep_map

chains poses no problem.

Unconditionally recording

"struct mutex" mutex->lockdep_map => "struct workqueue_struct" ordered_wq->lockdep_map

chain when ordered_wq can process only one work item at a time, in order
to indicate that ordered_wq is currently unable to process other work
items, poses no problem either.

The example shown in commit d6e89786bed977f3 ("workqueue: skip lockdep wq
dependency in cancel_work_sync()") is nothing but a violation of the rule
"Do not take a lock from a work callback function (i.e. do not record the

"struct work_struct" work1->lockdep_map => "struct mutex" mutex->lockdep_map
"struct workqueue_struct" ordered_wq->lockdep_map => "struct mutex" mutex->lockdep_map

chains) if somebody might wait for completion of that callback function
with that lock held (i.e. might record the

"struct mutex" mutex->lockdep_map => "struct work_struct" work1->lockdep_map
"struct mutex" mutex->lockdep_map => "struct workqueue_struct" ordered_wq->lockdep_map

chains)."

Which in-tree ordered workqueue instance is hitting this problem?

2022-07-29 02:33:21

by Tetsuo Handa

Subject: Re: [PATCH] workqueue: don't skip lockdep wq dependency in cancel_work_sync()

Thinking about it more, I might be confused by the difference between

the lockdep "struct work_struct" dependency handling

and

the lockdep "struct work" dependency handling

In an example like

work3_function() { /* nothing */ }
work4_function() { mutex_lock(&mutex); ... }

other_function() {
        queue_work(ordered_wq, &work3);
        queue_work(ordered_wq, &work4);
        mutex_lock(&mutex);
        cancel_work_sync(&work4);
}

, the possibility of deadlock must be reported by lockdep via
the lockdep "struct work" dependency handling.

Then, prior to commit d6e89786bed977f3, lockdep flagging

the lockdep "struct work_struct" dependency handling

as if cancel_work_sync() were flush_work() was a problem, and
lockdep not flagging

the lockdep "struct work" dependency handling

from cancel_work_sync() as if it were flush_work() is also a problem.

Then, commit d6e89786bed977f3 might be OK, but commit 87915adc3f0acdf0
was wrong in that it preserved lockdep not flagging

the lockdep "struct work" dependency handling

from cancel_work_sync() as if it were flush_work().

2022-07-29 02:53:58

by Lai Jiangshan

Subject: Re: [PATCH] workqueue: don't skip lockdep wq dependency in cancel_work_sync()

On Thu, Jul 28, 2022 at 8:23 PM Tetsuo Handa
<[email protected]> wrote:
>
> As Hillf Danton mentioned in [1]:
>
> syzbot should have been able to catch cancel_work_sync() in work context
> by checking lockdep_map in __flush_work() for both flush and cancel.
>
> Being unable to report the obvious deadlock scenario shown below is
> broken behavior. From a locking dependency perspective, the sync version
> of a cancel request should behave like a flush request, for it waits for
> the work to complete if that work has already started executing.
>
> ----------
> #include <linux/module.h>
> #include <linux/sched.h>
> #include <linux/mutex.h>
> #include <linux/workqueue.h>
>
> static DEFINE_MUTEX(mutex);
>
> /* Sleeps long enough for test_init() to take the mutex, then blocks on it. */
> static void work_fn(struct work_struct *work)
> {
>         schedule_timeout_uninterruptible(HZ / 5);
>         mutex_lock(&mutex);
>         mutex_unlock(&mutex);
> }
> static DECLARE_WORK(work, work_fn);
>
> static int __init test_init(void)
> {
>         schedule_work(&work);
>         schedule_timeout_uninterruptible(HZ / 10);
>         mutex_lock(&mutex);
>         /* Waits for work_fn(), which is waiting for mutex: deadlock. */
>         cancel_work_sync(&work);
>         mutex_unlock(&mutex);
>         return -EINVAL;
> }
> module_init(test_init);
> MODULE_LICENSE("GPL");
> ----------
>
> Link: https://lkml.kernel.org/r/[email protected] [1]
> Reported-by: Hillf Danton <[email protected]>
> Fixes: d6e89786bed977f3 ("workqueue: skip lockdep wq dependency in cancel_work_sync()")
> Cc: Johannes Berg <[email protected]>
> Signed-off-by: Tetsuo Handa <[email protected]>
> ---
> kernel/workqueue.c | 45 ++++++++++++++++++---------------------------
> 1 file changed, 18 insertions(+), 27 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 1ea50f6be843..e6df688f84db 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3000,8 +3000,7 @@ void drain_workqueue(struct workqueue_struct *wq)
> }
> EXPORT_SYMBOL_GPL(drain_workqueue);
>
> -static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
> - bool from_cancel)
> +static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
> {
> struct worker *worker = NULL;
> struct worker_pool *pool;
> @@ -3043,8 +3042,7 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
> * workqueues the deadlock happens when the rescuer stalls, blocking
> * forward progress.
> */
> - if (!from_cancel &&
> - (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)) {
> + if (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer) {
> lock_map_acquire(&pwq->wq->lockdep_map);
> lock_map_release(&pwq->wq->lockdep_map);
> }
> @@ -3056,7 +3054,18 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
> return false;
> }
>
> -static bool __flush_work(struct work_struct *work, bool from_cancel)
> +/**
> + * flush_work - wait for a work to finish executing the last queueing instance
> + * @work: the work to flush
> + *
> + * Wait until @work has finished execution. @work is guaranteed to be idle
> + * on return if it hasn't been requeued since flush started.
> + *
> + * Return:
> + * %true if flush_work() waited for the work to finish execution,
> + * %false if it was already idle.
> + */
> +bool flush_work(struct work_struct *work)
> {
> struct wq_barrier barr;
>
> @@ -3066,12 +3075,10 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
> if (WARN_ON(!work->func))
> return false;
>
> - if (!from_cancel) {
> - lock_map_acquire(&work->lockdep_map);
> - lock_map_release(&work->lockdep_map);
> - }
> + lock_map_acquire(&work->lockdep_map);
> + lock_map_release(&work->lockdep_map);


IIUC, I think the change to these 5 lines of code (-3+2) is enough
to fix the problem described in the changelog.

If so, could you make a minimal patch?

I believe the problem that commit d6e89786bed977f3 ("workqueue: skip
lockdep wq dependency in cancel_work_sync()") fixes is real. It is not
a good idea to revert it.

P.S.

The commit fd1a5b04dfb8 ("workqueue: Remove now redundant lock
acquisitions wrt. workqueue flushes") removed this lockdep check.

And the commit 87915adc3f0a ("workqueue: re-add lockdep
dependencies for flushing") added it back for non-canceling cases only.

It seems commit fd1a5b04dfb8 is the culprit and 87915adc3f0a
didn't fix all of the problems it introduced.

So it is better to complete 87915adc3f0a by making __flush_work()
do lock_map_acquire(&work->lockdep_map) for both canceling and
non-canceling cases.
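
Concretely, that would mean keeping only this hunk of the posted patch
in __flush_work(), with start_flush_work() and its from_cancel parameter
left untouched (a sketch of the minimal change, not a tested patch):

-	if (!from_cancel) {
-		lock_map_acquire(&work->lockdep_map);
-		lock_map_release(&work->lockdep_map);
-	}
+	lock_map_acquire(&work->lockdep_map);
+	lock_map_release(&work->lockdep_map);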

2022-07-29 03:06:36

by Tetsuo Handa

Subject: Re: [PATCH] workqueue: don't skip lockdep wq dependency in cancel_work_sync()

On 2022/07/29 11:28, Tetsuo Handa wrote:
> Thinking about it more, I might be confused by the difference between
>
> the lockdep "struct work_struct" dependency handling

the lockdep "struct workqueue_struct" dependency handling

>
> and
>
> the lockdep "struct work" dependency handling

the lockdep "struct work_struct" dependency handling


On 2022/07/29 11:38, Lai Jiangshan wrote:
> The commit fd1a5b04dfb8 ("workqueue: Remove now redundant lock
> acquisitions wrt. workqueue flushes") removed this lockdep check.
>
> And the commit 87915adc3f0a ("workqueue: re-add lockdep
> dependencies for flushing") added it back for non-canceling cases only.
>
> It seems commit fd1a5b04dfb8 is the culprit and 87915adc3f0a
> didn't fix all of the problems it introduced.
>
> So it is better to complete 87915adc3f0a by making __flush_work()
> do lock_map_acquire(&work->lockdep_map) for both canceling and
> non-canceling cases.

Yes, commit 87915adc3f0acdf0 was incomplete.

2022-07-29 03:39:33

by Lai Jiangshan

Subject: Re: [PATCH] workqueue: don't skip lockdep wq dependency in cancel_work_sync()

+CC Byungchul Park <[email protected]>

On Fri, Jul 29, 2022 at 10:38 AM Lai Jiangshan <[email protected]> wrote:

> > +bool flush_work(struct work_struct *work)
> > {
> > struct wq_barrier barr;
> >
> > @@ -3066,12 +3075,10 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
> > if (WARN_ON(!work->func))
> > return false;
> >
> > - if (!from_cancel) {
> > - lock_map_acquire(&work->lockdep_map);
> > - lock_map_release(&work->lockdep_map);
> > - }
> > + lock_map_acquire(&work->lockdep_map);
> > + lock_map_release(&work->lockdep_map);
>
>
> IIUC, I think the change to these 5 lines of code (-3+2) is enough
> to fix the problem described in the changelog.
>
> If so, could you make a minimal patch?
>
> I believe the problem that commit d6e89786bed977f3 ("workqueue: skip
> lockdep wq dependency in cancel_work_sync()") fixes is real. It is not
> a good idea to revert it.
>
> P.S.
>
> The commit fd1a5b04dfb8 ("workqueue: Remove now redundant lock
> acquisitions wrt. workqueue flushes") removed this lockdep check.
>
> And the commit 87915adc3f0a ("workqueue: re-add lockdep
> dependencies for flushing") added it back for non-canceling cases only.
>
> It seems commit fd1a5b04dfb8 is the culprit and 87915adc3f0a
> didn't fix all of the problems it introduced.
>
> So it is better to complete 87915adc3f0a by making __flush_work()
> do lock_map_acquire(&work->lockdep_map) for both canceling and
> non-canceling cases.

The cross-release locking checks were reverted by commit e966eaeeb623
("locking/lockdep: Remove the cross-release locking checks").

So fd1a5b04dfb8 was a bit hasty. What it changed should be added back.