2007-05-08 08:57:42

by Jiri Slaby

[permalink] [raw]
Subject: 2.6.21-mm1 hwsusp: BUG at workqueue.c:106

Hi,

this occured in dmesg during resuming from hwsusp in 2.6.21-mm1 (captured
through netconsole). Perfectly reproducible, it simply happens each time I
try it.

usb_endpoint usbdev5.1_ep00: PM: resume from 0, parent usb5 still 2
------------[ cut here ]------------
kernel BUG at /home/l/latest/xxx/kernel/workqueue.c:106!
invalid opcode: 0000 [#1]
SMP
Modules linked in: ipv6 floppy ohci1394 ieee1394 parport_pc parport usbhid
ehci_hcd pata_acpi ff_memless sr_mod cdrom
CPU: 1
EIP: 0060:[<c0132161>] Not tainted VLI
EFLAGS: 00010046 (2.6.21-mm1 #272)
EIP is at insert_work+0x6d/0x71
eax: c1c3b3c0 ebx: c1814aa0 ecx: 00000001 edx: c1814aa0
esi: c1c3b340 edi: 00000282 ebp: c04d2f68 esp: c04d2f50
ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
Process swapper (pid: 0, ti=c04d2000 task=c1c26030 task.ti=c1c20000)
Stack: c4685f54 c18148ac c04d2f98 c013816f c1c3b340 c1814aa0 c04d2f88 c013256c
00000066 c1814880 c1c5e000 c04d2fc4 c01325a1
c1c5e000 00000100 c012ba35 00000000 c04d2fb8 c01333f4 Call Trace:
[<c0104f27>] [<c0104fe2>] show_stack_log_lvl+0xa5/0xca
show_registers+0x1e2/0x2da
[<c01053f5>] [<c010559a>] do_trap+0x84/0xaa
do_invalid_op+0x88/0x92
[<c0378662>] [<c013256c>] __queue_work+0x22/0x33
delayed_work_timer_fn+0x24/0x2a
[<c012ba35>] [<c01288a9>] __do_softirq+0x75/0xe6
do_softirq+0x63/0xac
[<c0128713>] [<c0116d7e>] smp_apic_timer_interrupt+0x5c/0x88
apic_timer_interrupt+0x28/0x30
[<c010238d>] [<c0115ad0>] start_secondary+0x25e/0x37a
0x0
=======================
00 00 ba 03 00 00 40 a3 ff 83 10 5b 5d c3 4e 04 56 04 43 04 42 04 53 04 48
04 46 04 c9 <0f> eb fe 89 e5 ec 0c

regards,
--
http://www.fi.muni.cz/~xslaby/ Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E


2007-05-08 09:14:35

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.21-mm1 hwsusp: BUG at workqueue.c:106

On Tue, 08 May 2007 10:57:35 +0200 Jiri Slaby <[email protected]> wrote:

>
> this occured in dmesg during resuming from hwsusp in 2.6.21-mm1 (captured
> through netconsole). Perfectly reproducible, it simply happens each time I
> try it.

Let's cc Oleg.

> usb_endpoint usbdev5.1_ep00: PM: resume from 0, parent usb5 still 2
> ------------[ cut here ]------------
> kernel BUG at /home/l/latest/xxx/kernel/workqueue.c:106!
> invalid opcode: 0000 [#1]
> SMP
> Modules linked in: ipv6 floppy ohci1394 ieee1394 parport_pc parport usbhid
> ehci_hcd pata_acpi ff_memless sr_mod cdrom
> CPU: 1
> EIP: 0060:[<c0132161>] Not tainted VLI
> EFLAGS: 00010046 (2.6.21-mm1 #272)
> EIP is at insert_work+0x6d/0x71
> eax: c1c3b3c0 ebx: c1814aa0 ecx: 00000001 edx: c1814aa0
> esi: c1c3b340 edi: 00000282 ebp: c04d2f68 esp: c04d2f50
> ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
> Process swapper (pid: 0, ti=c04d2000 task=c1c26030 task.ti=c1c20000)
> Stack: c4685f54 c18148ac c04d2f98 c013816f c1c3b340 c1814aa0 c04d2f88 c013256c
> 00000066 c1814880 c1c5e000 c04d2fc4 c01325a1
> c1c5e000 00000100 c012ba35 00000000 c04d2fb8 c01333f4 Call Trace:
> [<c0104f27>] [<c0104fe2>] show_stack_log_lvl+0xa5/0xca
> show_registers+0x1e2/0x2da
> [<c01053f5>] [<c010559a>] do_trap+0x84/0xaa
> do_invalid_op+0x88/0x92
> [<c0378662>] [<c013256c>] __queue_work+0x22/0x33
> delayed_work_timer_fn+0x24/0x2a
> [<c012ba35>] [<c01288a9>] __do_softirq+0x75/0xe6
> do_softirq+0x63/0xac
> [<c0128713>] [<c0116d7e>] smp_apic_timer_interrupt+0x5c/0x88
> apic_timer_interrupt+0x28/0x30
> [<c010238d>] [<c0115ad0>] start_secondary+0x25e/0x37a
> 0x0
> =======================
> 00 00 ba 03 00 00 40 a3 ff 83 10 5b 5d c3 4e 04 56 04 43 04 42 04 53 04 48
> 04 46 04 c9 <0f> eb fe 89 e5 ec 0c
>

hm, how come it's so messy?

2007-05-08 10:45:44

by Jarek Poplawski

[permalink] [raw]
Subject: Re: 2.6.21-mm1 hwsusp: BUG at workqueue.c:106

On 08-05-2007 10:57, Jiri Slaby wrote:
...
> [...] Perfectly reproducible, it simply happens each time I
> try it.

...so, maybe, only subjectively reproducible?

Regards,
Jarek P.

2007-05-08 10:56:05

by Oleg Nesterov

[permalink] [raw]
Subject: Re: 2.6.21-mm1 hwsusp: BUG at workqueue.c:106

On 05/08, Andrew Morton wrote:
>
> On Tue, 08 May 2007 10:57:35 +0200 Jiri Slaby <[email protected]> wrote:
>
> >
> > this occured in dmesg during resuming from hwsusp in 2.6.21-mm1 (captured
> > through netconsole). Perfectly reproducible, it simply happens each time I
> > try it.
>
> Let's cc Oleg.
>
> > usb_endpoint usbdev5.1_ep00: PM: resume from 0, parent usb5 still 2
> > ------------[ cut here ]------------
> > kernel BUG at /home/l/latest/xxx/kernel/workqueue.c:106!
> > invalid opcode: 0000 [#1]
> > SMP
> > Modules linked in: ipv6 floppy ohci1394 ieee1394 parport_pc parport usbhid
> > ehci_hcd pata_acpi ff_memless sr_mod cdrom
> > CPU: 1
> > EIP: 0060:[<c0132161>] Not tainted VLI
> > EFLAGS: 00010046 (2.6.21-mm1 #272)
> > EIP is at insert_work+0x6d/0x71
> > eax: c1c3b3c0 ebx: c1814aa0 ecx: 00000001 edx: c1814aa0
> > esi: c1c3b340 edi: 00000282 ebp: c04d2f68 esp: c04d2f50
> > ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
> > Process swapper (pid: 0, ti=c04d2000 task=c1c26030 task.ti=c1c20000)
> > Stack: c4685f54 c18148ac c04d2f98 c013816f c1c3b340 c1814aa0 c04d2f88 c013256c
> > 00000066 c1814880 c1c5e000 c04d2fc4 c01325a1
> > c1c5e000 00000100 c012ba35 00000000 c04d2fb8 c01333f4 Call Trace:
> > [<c0104f27>] [<c0104fe2>] show_stack_log_lvl+0xa5/0xca
> > show_registers+0x1e2/0x2da
> > [<c01053f5>] [<c010559a>] do_trap+0x84/0xaa
> > do_invalid_op+0x88/0x92
> > [<c0378662>] [<c013256c>] __queue_work+0x22/0x33
> > delayed_work_timer_fn+0x24/0x2a
> > [<c012ba35>] [<c01288a9>] __do_softirq+0x75/0xe6
> > do_softirq+0x63/0xac
> > [<c0128713>] [<c0116d7e>] smp_apic_timer_interrupt+0x5c/0x88
> > apic_timer_interrupt+0x28/0x30

queue_delayed_work().

Probably, cancel_delayed_work(&delayed_work->work) was called with the ->timer
pending. This is wrong, cancel_delayed_work() clears _PENDING unconditionally,
that is why the comment says

it is expected that, prior to calling cancel_work_sync(), the caller has
arranged for the work to not be requeued.

(Just in case, after make-cancel_rearming_delayed_work-reliable.patch this is still
wrong (as documented) to do cancel_delayed_work() before cancel_delayed_timer(),
but should work correctly).

ata_port_flush_task() and ata_port_detach() do this, I sent the patch to fix this
twice. The last one is

[PATCH -mm] libata-core: convert to use cancel_rearming_delayed_work()
http://marc.info/?l=linux-kernel&m=117840349108505


Jiri, any chance you can re-test with the patch below?

--- OLD/kernel/workqueue.c~ 2007-05-06 00:01:06.000000000 +0400
+++ OLD/kernel/workqueue.c 2007-05-08 14:50:39.000000000 +0400
@@ -103,7 +103,10 @@ static inline void set_wq_data(struct wo
{
unsigned long new;

- BUG_ON(!work_pending(work));
+ if (!work_pending(work)) {
+ printk(KERN_ERR "BUG: set_wq_data ");
+ print_symbol("%s\n", (unsigned long) work->func);
+ }

new = (unsigned long) cwq | (1UL << WORK_STRUCT_PENDING);
new |= WORK_STRUCT_FLAG_MASK & *work_data_bits(work);

2007-05-08 11:09:38

by Jiri Slaby

[permalink] [raw]
Subject: Re: 2.6.21-mm1 hwsusp: BUG at workqueue.c:106

Oleg Nesterov napsal(a):
> On 05/08, Andrew Morton wrote:
>> On Tue, 08 May 2007 10:57:35 +0200 Jiri Slaby <[email protected]> wrote:
>>
>>> this occured in dmesg during resuming from hwsusp in 2.6.21-mm1 (captured
>>> through netconsole). Perfectly reproducible, it simply happens each time I
>>> try it.
>> Let's cc Oleg.
>>
>>> usb_endpoint usbdev5.1_ep00: PM: resume from 0, parent usb5 still 2
>>> ------------[ cut here ]------------
>>> kernel BUG at /home/l/latest/xxx/kernel/workqueue.c:106!
>>> invalid opcode: 0000 [#1]
>>> SMP
>>> Modules linked in: ipv6 floppy ohci1394 ieee1394 parport_pc parport usbhid
>>> ehci_hcd pata_acpi ff_memless sr_mod cdrom
>>> CPU: 1
>>> EIP: 0060:[<c0132161>] Not tainted VLI
>>> EFLAGS: 00010046 (2.6.21-mm1 #272)
>>> EIP is at insert_work+0x6d/0x71
>>> eax: c1c3b3c0 ebx: c1814aa0 ecx: 00000001 edx: c1814aa0
>>> esi: c1c3b340 edi: 00000282 ebp: c04d2f68 esp: c04d2f50
>>> ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
>>> Process swapper (pid: 0, ti=c04d2000 task=c1c26030 task.ti=c1c20000)
>>> Stack: c4685f54 c18148ac c04d2f98 c013816f c1c3b340 c1814aa0 c04d2f88 c013256c
>>> 00000066 c1814880 c1c5e000 c04d2fc4 c01325a1
>>> c1c5e000 00000100 c012ba35 00000000 c04d2fb8 c01333f4 Call Trace:
>>> [<c0104f27>] [<c0104fe2>] show_stack_log_lvl+0xa5/0xca
>>> show_registers+0x1e2/0x2da
>>> [<c01053f5>] [<c010559a>] do_trap+0x84/0xaa
>>> do_invalid_op+0x88/0x92
>>> [<c0378662>] [<c013256c>] __queue_work+0x22/0x33
>>> delayed_work_timer_fn+0x24/0x2a
>>> [<c012ba35>] [<c01288a9>] __do_softirq+0x75/0xe6
>>> do_softirq+0x63/0xac
>>> [<c0128713>] [<c0116d7e>] smp_apic_timer_interrupt+0x5c/0x88
>>> apic_timer_interrupt+0x28/0x30

>> hm, how come it's so messy?

I have no idea, this is how it appeared in the `nc -ul' output...

> queue_delayed_work().
>
> Probably, cancel_delayed_work(&delayed_work->work) was called with the ->timer
> pending. This is wrong, cancel_delayed_work() clears _PENDING unconditionally,
> that is why the comment says
>
> it is expected that, prior to calling cancel_work_sync(), the caller has
> arranged for the work to not be requeued.
>
> (Just in case, after make-cancel_rearming_delayed_work-reliable.patch this is still
> wrong (as documented) to do cancel_delayed_work() before cancel_delayed_timer(),
> but should work correctly).
>
> ata_port_flush_task() and ata_port_detach() do this, I sent the patch to fix this
> twice. The last one is
>
> [PATCH -mm] libata-core: convert to use cancel_rearming_delayed_work()
> http://marc.info/?l=linux-kernel&m=117840349108505
>
>
> Jiri, any chance you can re-test with the patch below?

Yes. In the meantime I investigated, that the regression is between
broken-out-2007-04-28-05-06
and
special -js edition:
I guess it's time to end the staircase experiment in -mm.
http://userweb.kernel.org/~akpm/js.bz2 is my current rollup (against
2.6.21) minus staircase and related things.

I don't know if it's possible to dig out the patch list from it, otherwise,
2.6.21-mm1 list may be used...

> --- OLD/kernel/workqueue.c~ 2007-05-06 00:01:06.000000000 +0400
> +++ OLD/kernel/workqueue.c 2007-05-08 14:50:39.000000000 +0400
> @@ -103,7 +103,10 @@ static inline void set_wq_data(struct wo
> {
> unsigned long new;
>
> - BUG_ON(!work_pending(work));
> + if (!work_pending(work)) {
> + printk(KERN_ERR "BUG: set_wq_data ");
> + print_symbol("%s\n", (unsigned long) work->func);
> + }
>
> new = (unsigned long) cwq | (1UL << WORK_STRUCT_PENDING);
> new |= WORK_STRUCT_FLAG_MASK & *work_data_bits(work);

building and will report,
--
http://www.fi.muni.cz/~xslaby/ Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E

2007-05-08 12:06:22

by Jarek Poplawski

[permalink] [raw]
Subject: Re: 2.6.21-mm1 hwsusp: BUG at workqueue.c:106

On 08-05-2007 12:55, Oleg Nesterov wrote:
> On 05/08, Andrew Morton wrote:
>> On Tue, 08 May 2007 10:57:35 +0200 Jiri Slaby <[email protected]> wrote:
>>
>>> this occured in dmesg during resuming from hwsusp in 2.6.21-mm1 (captured
>>> through netconsole). Perfectly reproducible, it simply happens each time I
>>> try it.
>> Let's cc Oleg.
>>
>>> usb_endpoint usbdev5.1_ep00: PM: resume from 0, parent usb5 still 2
>>> ------------[ cut here ]------------
>>> kernel BUG at /home/l/latest/xxx/kernel/workqueue.c:106!
>>> invalid opcode: 0000 [#1]
>>> SMP
>>> Modules linked in: ipv6 floppy ohci1394 ieee1394 parport_pc parport usbhid
>>> ehci_hcd pata_acpi ff_memless sr_mod cdrom
...
> queue_delayed_work().
>
> Probably, cancel_delayed_work(&delayed_work->work) was called with the ->timer
> pending. This is wrong, cancel_delayed_work() clears _PENDING unconditionally,

Maybe I miss your point, but clearing is conditional: on timer delete...

I think more suspicious is calling cancel_work_sync() for a delayed work
(with timer pending). Or maybe some race profits from _PENDING cleared
without locking?

BTW, it seems some debugging is needed to show, whose work is doing the
mess.

Regards,
Jarek P.

2007-05-08 12:34:47

by Jiri Slaby

[permalink] [raw]
Subject: Re: 2.6.21-mm1 hwsusp: BUG at workqueue.c:106

Jiri Slaby napsal(a):
> Oleg Nesterov napsal(a):
>> On 05/08, Andrew Morton wrote:
>>> On Tue, 08 May 2007 10:57:35 +0200 Jiri Slaby <[email protected]> wrote:
>>>
>>>> this occured in dmesg during resuming from hwsusp in 2.6.21-mm1 (captured
>>>> through netconsole). Perfectly reproducible, it simply happens each time I
>>>> try it.
>>> Let's cc Oleg.
>>>
>>>> usb_endpoint usbdev5.1_ep00: PM: resume from 0, parent usb5 still 2
>>>> ------------[ cut here ]------------
>>>> kernel BUG at /home/l/latest/xxx/kernel/workqueue.c:106!
>>>> invalid opcode: 0000 [#1]
>>>> SMP
>>>> Modules linked in: ipv6 floppy ohci1394 ieee1394 parport_pc parport usbhid
>>>> ehci_hcd pata_acpi ff_memless sr_mod cdrom
>>>> CPU: 1
>>>> EIP: 0060:[<c0132161>] Not tainted VLI
>>>> EFLAGS: 00010046 (2.6.21-mm1 #272)
>>>> EIP is at insert_work+0x6d/0x71
>>>> eax: c1c3b3c0 ebx: c1814aa0 ecx: 00000001 edx: c1814aa0
>>>> esi: c1c3b340 edi: 00000282 ebp: c04d2f68 esp: c04d2f50
>>>> ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
>>>> Process swapper (pid: 0, ti=c04d2000 task=c1c26030 task.ti=c1c20000)
>>>> Stack: c4685f54 c18148ac c04d2f98 c013816f c1c3b340 c1814aa0 c04d2f88 c013256c
>>>> 00000066 c1814880 c1c5e000 c04d2fc4 c01325a1
>>>> c1c5e000 00000100 c012ba35 00000000 c04d2fb8 c01333f4 Call Trace:
>>>> [<c0104f27>] [<c0104fe2>] show_stack_log_lvl+0xa5/0xca
>>>> show_registers+0x1e2/0x2da
>>>> [<c01053f5>] [<c010559a>] do_trap+0x84/0xaa
>>>> do_invalid_op+0x88/0x92
>>>> [<c0378662>] [<c013256c>] __queue_work+0x22/0x33
>>>> delayed_work_timer_fn+0x24/0x2a
>>>> [<c012ba35>] [<c01288a9>] __do_softirq+0x75/0xe6
>>>> do_softirq+0x63/0xac
>>>> [<c0128713>] [<c0116d7e>] smp_apic_timer_interrupt+0x5c/0x88
>>>> apic_timer_interrupt+0x28/0x30
[...]
>> queue_delayed_work().
>>
>> Probably, cancel_delayed_work(&delayed_work->work) was called with the ->timer
>> pending. This is wrong, cancel_delayed_work() clears _PENDING unconditionally,
>> that is why the comment says
>>
>> it is expected that, prior to calling cancel_work_sync(), the caller has
>> arranged for the work to not be requeued.
>>
>> (Just in case, after make-cancel_rearming_delayed_work-reliable.patch this is still
>> wrong (as documented) to do cancel_delayed_work() before cancel_delayed_timer(),
>> but should work correctly).
[...]
>> --- OLD/kernel/workqueue.c~ 2007-05-06 00:01:06.000000000 +0400
>> +++ OLD/kernel/workqueue.c 2007-05-08 14:50:39.000000000 +0400
>> @@ -103,7 +103,10 @@ static inline void set_wq_data(struct wo
>> {
>> unsigned long new;
>>
>> - BUG_ON(!work_pending(work));
>> + if (!work_pending(work)) {
>> + printk(KERN_ERR "BUG: set_wq_data ");
>> + print_symbol("%s\n", (unsigned long) work->func);
>> + }
>>
>> new = (unsigned long) cwq | (1UL << WORK_STRUCT_PENDING);
>> new |= WORK_STRUCT_FLAG_MASK & *work_data_bits(work);

vmstat_update+0x0/0x2b

Ccing Christoph.

regards,
--
http://www.fi.muni.cz/~xslaby/ Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E

2007-05-08 12:40:58

by Oleg Nesterov

[permalink] [raw]
Subject: Re: 2.6.21-mm1 hwsusp: BUG at workqueue.c:106

On 05/08, Jarek Poplawski wrote:
>
> On 08-05-2007 12:55, Oleg Nesterov wrote:
> > On 05/08, Andrew Morton wrote:
> >> On Tue, 08 May 2007 10:57:35 +0200 Jiri Slaby <[email protected]> wrote:
> >>
> >>> this occured in dmesg during resuming from hwsusp in 2.6.21-mm1 (captured
> >>> through netconsole). Perfectly reproducible, it simply happens each time I
> >>> try it.
> >> Let's cc Oleg.
> >>
> >>> usb_endpoint usbdev5.1_ep00: PM: resume from 0, parent usb5 still 2
> >>> ------------[ cut here ]------------
> >>> kernel BUG at /home/l/latest/xxx/kernel/workqueue.c:106!
> >>> invalid opcode: 0000 [#1]
> >>> SMP
> >>> Modules linked in: ipv6 floppy ohci1394 ieee1394 parport_pc parport usbhid
> >>> ehci_hcd pata_acpi ff_memless sr_mod cdrom
> ...
> > queue_delayed_work().
> >
> > Probably, cancel_delayed_work(&delayed_work->work) was called with the ->timer
> > pending. This is wrong, cancel_delayed_work() clears _PENDING unconditionally,
>
> Maybe I miss your point, but clearing is conditional: on timer delete...
>
> I think more suspicious is calling cancel_work_sync() for a delayed work
> (with timer pending). Or maybe some race profits from _PENDING cleared
> without locking?

Yes, of course, I meant cancel_work_sync(), sorry for the confusion.

Thanks!

So, once again, cancel_work_sync(&dwork->work) is wrong unless the timer
was stopped.

Before make-cancel_rearming_delayed_work-reliable.patch

it requires that the @work can't be re-queued.

After
works, but waits for the timer expiration in a busy-wait loop.

Oleg.

2007-05-08 13:48:28

by Oleg Nesterov

[permalink] [raw]
Subject: Re: 2.6.21-mm1 hwsusp: BUG at workqueue.c:106

On 05/08, Jiri Slaby wrote:
>
> > Oleg Nesterov napsal(a):
> >>>
> >>>> kernel BUG at /home/l/latest/xxx/kernel/workqueue.c:106!
> >>>> invalid opcode: 0000 [#1]
> >>>> SMP
> >>>> Modules linked in: ipv6 floppy ohci1394 ieee1394 parport_pc parport usbhid
> >>>> ehci_hcd pata_acpi ff_memless sr_mod cdrom
> >>>> CPU: 1
> >>>> EIP: 0060:[<c0132161>] Not tainted VLI
> >>>> EFLAGS: 00010046 (2.6.21-mm1 #272)
> >>>> EIP is at insert_work+0x6d/0x71
> [...]
> >> --- OLD/kernel/workqueue.c~ 2007-05-06 00:01:06.000000000 +0400
> >> +++ OLD/kernel/workqueue.c 2007-05-08 14:50:39.000000000 +0400
> >> @@ -103,7 +103,10 @@ static inline void set_wq_data(struct wo
> >> {
> >> unsigned long new;
> >>
> >> - BUG_ON(!work_pending(work));
> >> + if (!work_pending(work)) {
> >> + printk(KERN_ERR "BUG: set_wq_data ");
> >> + print_symbol("%s\n", (unsigned long) work->func);
> >> + }
> >>
> >> new = (unsigned long) cwq | (1UL << WORK_STRUCT_PENDING);
> >> new |= WORK_STRUCT_FLAG_MASK & *work_data_bits(work);
>
> vmstat_update+0x0/0x2b

Thanks a lot.

I know nothing about hwsusp, to the point I don't even know what it does.
I'll try to do some reading tomorrow.

Right now,

> +static void vmstat_update(struct work_struct *w)
> +{
> + refresh_cpu_vm_stats(smp_processor_id());
> + schedule_delayed_work(&__get_cpu_var(vmstat_work),
> + sysctl_stat_interval);
> +}

This is not precisely correct. We cam schedule the wrong vmstat_work
if this timer/work migrates to another CPU. I'd suggest

schedule_delayed_work(container_of(w, struct delayed_work, work))

This should not happen because we are doing cancel_rearming_delayed_work()
below, however:

> + case CPU_DOWN_PREPARE:
> + case CPU_DOWN_PREPARE_FROZEN:
> + cancel_rearming_delayed_work(&per_cpu(vmstat_work, cpu));
> + per_cpu(vmstat_work, cpu).work.func = NULL;
> + case CPU_DOWN_FAILED:
> + case CPU_DOWN_FAILED_FROZEN:
> + start_cpu_timer(cpu);

we need a "break;" before "case CPU_DOWN_FAILED", otherwise we re-start
vmstat_update() immediately.

This is a bug, but I am not sure is this the only problem.

Oleg.

2007-05-08 14:20:31

by Oleg Nesterov

[permalink] [raw]
Subject: Re: 2.6.21-mm1 hwsusp: BUG at workqueue.c:106

On 05/08, Oleg Nesterov wrote:
>
> On 05/08, Jiri Slaby wrote:
> >
> > vmstat_update+0x0/0x2b
>
> Thanks a lot.
>
> Right now,
>
> > +static void vmstat_update(struct work_struct *w)
> > +{
> > + refresh_cpu_vm_stats(smp_processor_id());
> > + schedule_delayed_work(&__get_cpu_var(vmstat_work),
> > + sysctl_stat_interval);
> > +}
>
> This is not precisely correct. We cam schedule the wrong vmstat_work
> if this timer/work migrates to another CPU. I'd suggest
>
> schedule_delayed_work(container_of(w, struct delayed_work, work))
>
> This should not happen because we are doing cancel_rearming_delayed_work()
> below, however:
>
> > + case CPU_DOWN_PREPARE:
> > + case CPU_DOWN_PREPARE_FROZEN:
> > + cancel_rearming_delayed_work(&per_cpu(vmstat_work, cpu));
> > + per_cpu(vmstat_work, cpu).work.func = NULL;
> > + case CPU_DOWN_FAILED:
> > + case CPU_DOWN_FAILED_FROZEN:
> > + start_cpu_timer(cpu);
>
> we need a "break;" before "case CPU_DOWN_FAILED", otherwise we re-start
> vmstat_update() immediately.
>
> This is a bug, but I am not sure is this the only problem.

In case I was not clear, this _can_ explain the problem. Because an extra
start_cpu_timer() (due to missed "break;") re-initializes dwork, and clears
_PENDING.

Oleg.

2007-05-08 14:30:59

by Jiri Slaby

[permalink] [raw]
Subject: [PATCH 1/1] vmstat-use-our-own-timer-events-fix

Oleg Nesterov wrote:
> On 05/08, Oleg Nesterov wrote:
> >
> > On 05/08, Jiri Slaby wrote:
> >
> > This should not happen because we are doing cancel_rearming_delayed_work()
> > below, however:
> >
> > > + case CPU_DOWN_PREPARE:
> > > + case CPU_DOWN_PREPARE_FROZEN:
> > > + cancel_rearming_delayed_work(&per_cpu(vmstat_work, cpu));
> > > + per_cpu(vmstat_work, cpu).work.func = NULL;
> > > + case CPU_DOWN_FAILED:
> > > + case CPU_DOWN_FAILED_FROZEN:
> > > + start_cpu_timer(cpu);
> >
> > we need a "break;" before "case CPU_DOWN_FAILED", otherwise we re-start
> > vmstat_update() immediately.
> >
> > This is a bug, but I am not sure is this the only problem.
>
> In case I was not clear, this _can_ explain the problem. Because an extra
> start_cpu_timer() (due to missed "break;") re-initializes dwork, and clears
> _PENDING.

Yes, it helped, thanks. If no objections, please apply the patch below.

--

vmstat-use-our-own-timer-events-fix

add missing break

Cc: Christoph Lameter <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Signed-off-by: Jiri Slaby <[email protected]>

---
commit e0fe3cf4776a3c16ac7625ce9eae75ea42eb9b4e
tree c9a4277860e2cea03911b3d73c9386c9284d991b
parent b1034323f7e57642eee59a0f5e77c50bc26982c5
author Jiri Slaby <[email protected]> Tue, 08 May 2007 16:25:28 +0200
committer Jiri Slaby <[email protected]> Tue, 08 May 2007 16:25:28 +0200

mm/vmstat.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index 709e2f0..fac05e8 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -677,6 +677,7 @@ static int __cpuinit vmstat_cpuup_callback(struct notifier_block *nfb,
case CPU_DOWN_PREPARE_FROZEN:
cancel_rearming_delayed_work(&per_cpu(vmstat_work, cpu));
per_cpu(vmstat_work, cpu).work.func = NULL;
+ break;
case CPU_DOWN_FAILED:
case CPU_DOWN_FAILED_FROZEN:
start_cpu_timer(cpu);

2007-05-08 14:42:53

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 1/1] vmstat-use-our-own-timer-events-fix

On 05/08, Jiri Slaby wrote:
>
> Oleg Nesterov wrote:
> > On 05/08, Oleg Nesterov wrote:
> > >
> > > On 05/08, Jiri Slaby wrote:
> > >
> > > This should not happen because we are doing cancel_rearming_delayed_work()
> > > below, however:
> > >
> > > > + case CPU_DOWN_PREPARE:
> > > > + case CPU_DOWN_PREPARE_FROZEN:
> > > > + cancel_rearming_delayed_work(&per_cpu(vmstat_work, cpu));
> > > > + per_cpu(vmstat_work, cpu).work.func = NULL;
> > > > + case CPU_DOWN_FAILED:
> > > > + case CPU_DOWN_FAILED_FROZEN:
> > > > + start_cpu_timer(cpu);
> > >
> > > we need a "break;" before "case CPU_DOWN_FAILED", otherwise we re-start
> > > vmstat_update() immediately.
> > >
> > > This is a bug, but I am not sure is this the only problem.
> >
> > In case I was not clear, this _can_ explain the problem. Because an extra
> > start_cpu_timer() (due to missed "break;") re-initializes dwork, and clears
> > _PENDING.
>
> Yes, it helped, thanks. If no objections, please apply the patch below.

Great! Thanks.

>
> vmstat-use-our-own-timer-events-fix
>
> add missing break
>
> Cc: Christoph Lameter <[email protected]>
> Cc: Arjan van de Ven <[email protected]>
> Signed-off-by: Jiri Slaby <[email protected]>
>
> ---
> commit e0fe3cf4776a3c16ac7625ce9eae75ea42eb9b4e
> tree c9a4277860e2cea03911b3d73c9386c9284d991b
> parent b1034323f7e57642eee59a0f5e77c50bc26982c5
> author Jiri Slaby <[email protected]> Tue, 08 May 2007 16:25:28 +0200
> committer Jiri Slaby <[email protected]> Tue, 08 May 2007 16:25:28 +0200
>
> mm/vmstat.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 709e2f0..fac05e8 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -677,6 +677,7 @@ static int __cpuinit vmstat_cpuup_callback(struct notifier_block *nfb,
> case CPU_DOWN_PREPARE_FROZEN:
> cancel_rearming_delayed_work(&per_cpu(vmstat_work, cpu));
> per_cpu(vmstat_work, cpu).work.func = NULL;
> + break;
> case CPU_DOWN_FAILED:
> case CPU_DOWN_FAILED_FROZEN:
> start_cpu_timer(cpu);

2007-05-08 16:08:31

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 1/1] vmstat-use-our-own-timer-events-fix

On Tue, 8 May 2007, Jiri Slaby wrote:

> Yes, it helped, thanks. If no objections, please apply the patch below.

Acked-by: Christoph Lameter <[email protected]>