2012-11-28 15:22:36

by Anders Kaseorg

[permalink] [raw]
Subject: Re: Wireless regression in workqueue: use mod_delayed_work() instead of __cancel + queue

On Wed, 28 Nov 2012, Anders Kaseorg wrote:
> My Intel 6250 wireless card (iwldvm) can no longer associate with a
> WPA-Enterprise network (PEAP-MSCHAPv2). To my surprise, I bisected this
> regression to commit e7c2f967445dd2041f0f8e3179cca22bb8bb7f79,
> workqueue: use mod_delayed_work() instead of __cancel + queue.
>
> A bunch of logs collected by Ubuntu apport are in this bug report:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1083980
>
> How can I help to debug this?
>
> I see that someone else reported another regression with the same commit
> last week, although this looks unrelated at first glance:
> http://thread.gmane.org/gmane.linux.kernel/1395938
>
> Anders

(Oops, re-sending to fix a typo in the LKML address.)

Anders


2012-11-30 21:14:43

by Tejun Heo

[permalink] [raw]
Subject: Re: Wireless regression in workqueue: use mod_delayed_work() instead of __cancel + queue

Hello, Anders.

Sorry about the delay.

On Wed, Nov 28, 2012 at 10:17:28AM -0500, Anders Kaseorg wrote:
> On Wed, 28 Nov 2012, Anders Kaseorg wrote:
> > My Intel 6250 wireless card (iwldvm) can no longer associate with a
> > WPA-Enterprise network (PEAP-MSCHAPv2). To my surprise, I bisected this
> > regression to commit e7c2f967445dd2041f0f8e3179cca22bb8bb7f79,
> > workqueue: use mod_delayed_work() instead of __cancel + queue.

I see.

> > A bunch of logs collected by Ubuntu apport are in this bug report:
> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1083980
> >
> > How can I help to debug this?
> >
> > I see that someone else reported another regression with the same commit
> > last week, although this looks unrelated at first glance:
> > http://thread.gmane.org/gmane.linux.kernel/1395938

Urgh... that one was in my spam folder probably due to the mimed
content. Nothing rings a bell yet. Will keep looking into it.

Thanks.

--
tejun

2012-11-30 22:56:26

by Tejun Heo

[permalink] [raw]
Subject: Re: Wireless regression in workqueue: use mod_delayed_work() instead of __cancel + queue

Hey, again.

Can you please test whether the following patch makes any difference?

Thanks!

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 042d221..26368ef 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1477,7 +1477,10 @@ bool mod_delayed_work_on(int cpu, struct workqueue_struct *wq,
} while (unlikely(ret == -EAGAIN));

if (likely(ret >= 0)) {
- __queue_delayed_work(cpu, wq, dwork, delay);
+ if (!delay)
+ __queue_work(cpu, wq, &dwork->work);
+ else
+ __queue_delayed_work(cpu, wq, dwork, delay);
local_irq_restore(flags);
}

2012-12-01 04:15:58

by Anders Kaseorg

[permalink] [raw]
Subject: Re: Wireless regression in workqueue: use mod_delayed_work() instead of __cancel + queue

On Fri, 30 Nov 2012, Tejun Heo wrote:
> Hey, again.
>
> Can you please test whether the following patch makes any difference?
>
> Thanks!
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 042d221..26368ef 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1477,7 +1477,10 @@ bool mod_delayed_work_on(int cpu, struct workqueue_struct *wq,
> } while (unlikely(ret == -EAGAIN));
>
> if (likely(ret >= 0)) {
> - __queue_delayed_work(cpu, wq, dwork, delay);
> + if (!delay)
> + __queue_work(cpu, wq, &dwork->work);
> + else
> + __queue_delayed_work(cpu, wq, dwork, delay);
> local_irq_restore(flags);
> }
>

Yes. I tested that both directly on top of the bad commit, and on
v3.7-rc7, and it fixes the bug in both cases.

Thanks,
Anders