2012-10-18 23:00:12

by Dan Magenheimer

[permalink] [raw]
Subject: cancel_delayed_work() semantics broken?

Hi Tejun --

Forgive me if I am missing something, but it appears
that your commit: 57b30ae77bf00d2318df711ef9a4d2a9be0a3a2a
(workqueue: reimplement cancel_delayed_work() using try_to_grab_pending())
has subtly broken the semantics of the function. If
work was idle, according to the comment, it should
return false, correct?

It appears that very few callsites check the return value,
but ramster does, as does ocfs2 from whence the code at the
ramster callsite was derived. They both decrement
a kref count based on the return value.

I am still looking at try_to_grab_pending as I am not sure
if its semantics have also changed.

Thanks,
Dan

Signed-off-by: Dan Magenheimer <[email protected]>

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index d951daa..042d221 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2982,7 +2982,7 @@ bool cancel_delayed_work(struct delayed_work *dwork)

set_work_cpu_and_clear_pending(&dwork->work, work_cpu(&dwork->work));
local_irq_restore(flags);
- return true;
+ return ret;
}
EXPORT_SYMBOL(cancel_delayed_work);


2012-10-18 23:25:56

by Tejun Heo

[permalink] [raw]
Subject: Re: cancel_delayed_work() semantics broken?

Hello, Dan.

On Thu, Oct 18, 2012 at 04:00:03PM -0700, Dan Magenheimer wrote:
> Forgive me if I am missing something, but it appears
> that your commit: 57b30ae77bf00d2318df711ef9a4d2a9be0a3a2a
> (workqueue: reimplement cancel_delayed_work() using try_to_grab_pending())
> has subtly broken the semantics of the function. If
> work was idle, according to the comment, it should
> return false, correct?
>
> It appears that very few callsites check the return value,
> but ramster does, as does ocfs2 from whence the code at the
> ramster callsite was derived. They both decrement
> a kref count based on the return value.
>
> I am still looking at try_to_grab_pending as I am not sure
> if its semantics have also changed.
>
> Thanks,
> Dan
>
> Signed-off-by: Dan Magenheimer <[email protected]>
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index d951daa..042d221 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2982,7 +2982,7 @@ bool cancel_delayed_work(struct delayed_work *dwork)
>
> set_work_cpu_and_clear_pending(&dwork->work, work_cpu(&dwork->work));
> local_irq_restore(flags);
> - return true;
> + return ret;

Ah, you're right. Applying to wq/for-3.7-fixes w/ patch description
updated. Thanks!

--
tejun

2012-10-18 23:39:33

by Tejun Heo

[permalink] [raw]
Subject: [PATCH] workqueue: cancel_delayed_work() should return %NULL if work item is idle

>From e65120fcfc1cb9697655d29ecd7982451c05d3c2 Mon Sep 17 00:00:00 2001
From: Dan Magenheimer <[email protected]>
Date: Thu, 18 Oct 2012 16:31:37 -0700

57b30ae77b ("workqueue: reimplement cancel_delayed_work() using
try_to_grab_pending()") made cancel_delayed_work() always return %true
unless someone else is also trying to cancel the work item, which is
broken - if the target work item is idle, the return value should be
%false.

try_to_grab_pending() indicates that the target work item was idle by
zero return value. Use it for return. Note that this brings
cancel_delayed_work() in line with __cancel_work_timer() in return
value handling.

Signed-off-by: Dan Magenheimer <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
LKML-Reference: <444a6439-b1a4-4740-9e7e-bc37267cfe73@default>
---
kernel/workqueue.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index d951daa..042d221 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2982,7 +2982,7 @@ bool cancel_delayed_work(struct delayed_work *dwork)

set_work_cpu_and_clear_pending(&dwork->work, work_cpu(&dwork->work));
local_irq_restore(flags);
- return true;
+ return ret;
}
EXPORT_SYMBOL(cancel_delayed_work);

--
1.7.7.3

2012-10-29 09:47:19

by Andrei Emeltchenko

[permalink] [raw]
Subject: Re: [PATCH] workqueue: cancel_delayed_work() should return %NULL if work item is idle

Gustavo, could you apply the patch below to bluetooth-next tree, otherwise
it is deadly broken.

Best regards
Andrei Emeltchenko

On Thu, Oct 18, 2012 at 04:39:28PM -0700, Tejun Heo wrote:
> From e65120fcfc1cb9697655d29ecd7982451c05d3c2 Mon Sep 17 00:00:00 2001
> From: Dan Magenheimer <[email protected]>
> Date: Thu, 18 Oct 2012 16:31:37 -0700
>
> 57b30ae77b ("workqueue: reimplement cancel_delayed_work() using
> try_to_grab_pending()") made cancel_delayed_work() always return %true
> unless someone else is also trying to cancel the work item, which is
> broken - if the target work item is idle, the return value should be
> %false.
>
> try_to_grab_pending() indicates that the target work item was idle by
> zero return value. Use it for return. Note that this brings
> cancel_delayed_work() in line with __cancel_work_timer() in return
> value handling.
>
> Signed-off-by: Dan Magenheimer <[email protected]>
> Signed-off-by: Tejun Heo <[email protected]>
> LKML-Reference: <444a6439-b1a4-4740-9e7e-bc37267cfe73@default>
> ---
> kernel/workqueue.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index d951daa..042d221 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2982,7 +2982,7 @@ bool cancel_delayed_work(struct delayed_work *dwork)
>
> set_work_cpu_and_clear_pending(&dwork->work, work_cpu(&dwork->work));
> local_irq_restore(flags);
> - return true;
> + return ret;
> }
> EXPORT_SYMBOL(cancel_delayed_work);
>
> --
> 1.7.7.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2012-10-29 15:14:54

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] workqueue: cancel_delayed_work() should return %NULL if work item is idle

On Mon, Oct 29, 2012 at 11:47:13AM +0200, Andrei Emeltchenko wrote:
> Gustavo, could you apply the patch below to bluetooth-next tree, otherwise
> it is deadly broken.

Already applied to mainline (c0158ca64da5732dfb86a3f28944e9626776692f).

If pulling mainline is too much, please feel free to pull

git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-3.7-fixes

Sorry about the trouble.

--
tejun