2017-12-19 12:25:08

by syzbot

[permalink] [raw]
Subject: BUG: workqueue lockup (2)

syzkaller has found reproducer for the following crash on
f3b5ad89de16f5d42e8ad36fbdf85f705c1ae051
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
compiler: gcc (GCC) 7.1.1 20170620
.config is attached
Raw console output is attached.
C reproducer is attached
syzkaller reproducer is attached. See https://goo.gl/kgGztJ
for information about syzkaller reproducers


BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 37s!
BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=-20 stuck for 32s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
pending: cache_reap
workqueue events_power_efficient: flags=0x80
pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256
pending: neigh_periodic_work, do_cache_clean
workqueue mm_percpu_wq: flags=0x8
pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
pending: vmstat_update
workqueue kblockd: flags=0x18
pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
pending: blk_timeout_work


Attachments:
config.txt (123.51 kB)
raw.log (2.31 kB)
repro.txt (1.35 kB)
repro.c (11.54 kB)
Download all attachments

2017-12-19 14:27:27

by Tetsuo Handa

[permalink] [raw]
Subject: Re: BUG: workqueue lockup (2)

syzbot wrote:
>
> syzkaller has found reproducer for the following crash on
> f3b5ad89de16f5d42e8ad36fbdf85f705c1ae051

"BUG: workqueue lockup" is not a crash.

> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers
>
>
> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 37s!
> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=-20 stuck for 32s!
> Showing busy workqueues and worker pools:
> workqueue events: flags=0x0
> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
> pending: cache_reap
> workqueue events_power_efficient: flags=0x80
> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256
> pending: neigh_periodic_work, do_cache_clean
> workqueue mm_percpu_wq: flags=0x8
> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
> pending: vmstat_update
> workqueue kblockd: flags=0x18
> pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
> pending: blk_timeout_work

You gave up too early. There is no hint for understanding what was going on.
While we can observe "BUG: workqueue lockup" under memory pressure, there is
no hint like SysRq-t and SysRq-m. Thus, I can't tell something is wrong.

At least you need to confirm that lockup lasts for a few minutes. Otherwise,
this might be just overstressing. (According to repro.c , 12 threads are
created and soon SEGV follows? According to above message, only 2 CPUs?
Triggering SEGV suggests memory was low due to saving coredump?)

2017-12-19 14:41:10

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: BUG: workqueue lockup (2)

On Tue, Dec 19, 2017 at 3:27 PM, Tetsuo Handa
<[email protected]> wrote:
> syzbot wrote:
>>
>> syzkaller has found reproducer for the following crash on
>> f3b5ad89de16f5d42e8ad36fbdf85f705c1ae051
>
> "BUG: workqueue lockup" is not a crash.

Hi Tetsuo,

What is the proper name for all of these collectively?


>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>> C reproducer is attached
>> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
>> for information about syzkaller reproducers
>>
>>
>> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 37s!
>> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=-20 stuck for 32s!
>> Showing busy workqueues and worker pools:
>> workqueue events: flags=0x0
>> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
>> pending: cache_reap
>> workqueue events_power_efficient: flags=0x80
>> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256
>> pending: neigh_periodic_work, do_cache_clean
>> workqueue mm_percpu_wq: flags=0x8
>> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
>> pending: vmstat_update
>> workqueue kblockd: flags=0x18
>> pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
>> pending: blk_timeout_work
>
> You gave up too early. There is no hint for understanding what was going on.
> While we can observe "BUG: workqueue lockup" under memory pressure, there is
> no hint like SysRq-t and SysRq-m. Thus, I can't tell something is wrong.

Do you know how to send them programmatically? I tried to find a way
several times, but failed. Articles that I've found talk about
pressing some keys that don't translate directly to us-ascii.

But you can also run the reproducer. No report can possible provide
all possible useful information, sometimes debugging boils down to
manually adding printfs. That's why syzbot aims at providing a
reproducer as the ultimate source of details. Also since a developer
needs to test a proposed fix, it's easier to start with the reproducer
right away.


> At least you need to confirm that lockup lasts for a few minutes. Otherwise,

Is it possible to increase the timeout? How? We could bump it up to 2 minutes.


> this might be just overstressing. (According to repro.c , 12 threads are
> created and soon SEGV follows? According to above message, only 2 CPUs?
> Triggering SEGV suggests memory was low due to saving coredump?)

2017-12-19 16:38:39

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: BUG: workqueue lockup (2)

On Tue, Dec 19, 2017 at 3:40 PM, Dmitry Vyukov <[email protected]> wrote:
> On Tue, Dec 19, 2017 at 3:27 PM, Tetsuo Handa
> <[email protected]> wrote:
>> syzbot wrote:
>>>
>>> syzkaller has found reproducer for the following crash on
>>> f3b5ad89de16f5d42e8ad36fbdf85f705c1ae051
>>
>> "BUG: workqueue lockup" is not a crash.
>
> Hi Tetsuo,
>
> What is the proper name for all of these collectively?
>
>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>>> compiler: gcc (GCC) 7.1.1 20170620
>>> .config is attached
>>> Raw console output is attached.
>>> C reproducer is attached
>>> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
>>> for information about syzkaller reproducers
>>>
>>>
>>> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 37s!
>>> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=-20 stuck for 32s!
>>> Showing busy workqueues and worker pools:
>>> workqueue events: flags=0x0
>>> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
>>> pending: cache_reap
>>> workqueue events_power_efficient: flags=0x80
>>> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256
>>> pending: neigh_periodic_work, do_cache_clean
>>> workqueue mm_percpu_wq: flags=0x8
>>> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
>>> pending: vmstat_update
>>> workqueue kblockd: flags=0x18
>>> pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
>>> pending: blk_timeout_work
>>
>> You gave up too early. There is no hint for understanding what was going on.
>> While we can observe "BUG: workqueue lockup" under memory pressure, there is
>> no hint like SysRq-t and SysRq-m. Thus, I can't tell something is wrong.
>
> Do you know how to send them programmatically? I tried to find a way
> several times, but failed. Articles that I've found talk about
> pressing some keys that don't translate directly to us-ascii.

On second though, some oopses automatically dump locks/tasks. Should
we do the same for this oops?

> But you can also run the reproducer. No report can possible provide
> all possible useful information, sometimes debugging boils down to
> manually adding printfs. That's why syzbot aims at providing a
> reproducer as the ultimate source of details. Also since a developer
> needs to test a proposed fix, it's easier to start with the reproducer
> right away.
>
>
>> At least you need to confirm that lockup lasts for a few minutes. Otherwise,
>
> Is it possible to increase the timeout? How? We could bump it up to 2 minutes.
>
>
>> this might be just overstressing. (According to repro.c , 12 threads are
>> created and soon SEGV follows? According to above message, only 2 CPUs?
>> Triggering SEGV suggests memory was low due to saving coredump?)

2017-12-20 10:55:17

by Tetsuo Handa

[permalink] [raw]
Subject: Re: BUG: workqueue lockup (2)

Dmitry Vyukov wrote:
> On Tue, Dec 19, 2017 at 3:27 PM, Tetsuo Handa
> <[email protected]> wrote:
> > syzbot wrote:
> >>
> >> syzkaller has found reproducer for the following crash on
> >> f3b5ad89de16f5d42e8ad36fbdf85f705c1ae051
> >
> > "BUG: workqueue lockup" is not a crash.
>
> Hi Tetsuo,
>
> What is the proper name for all of these collectively?

I think that things which lead to kernel panic when /proc/sys/kernel/panic_on_oops
was set to 1 are called an "oops" (or a "kerneloops").

Speak of "BUG: workqueue lockup", this is not an "oops". This message was
added by 82607adcf9cdf40f ("workqueue: implement lockup detector"), and
this message does not always indicate a fatal problem. This message can be
printed when the system is really out of CPU and memory. As far as I tested,
I think that workqueue was not able to run on specific CPU due to a soft
lockup bug.

>
>
> >> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> >> compiler: gcc (GCC) 7.1.1 20170620
> >> .config is attached
> >> Raw console output is attached.
> >> C reproducer is attached
> >> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> >> for information about syzkaller reproducers
> >>
> >>
> >> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 37s!
> >> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=-20 stuck for 32s!
> >> Showing busy workqueues and worker pools:
> >> workqueue events: flags=0x0
> >> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
> >> pending: cache_reap
> >> workqueue events_power_efficient: flags=0x80
> >> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256
> >> pending: neigh_periodic_work, do_cache_clean
> >> workqueue mm_percpu_wq: flags=0x8
> >> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
> >> pending: vmstat_update
> >> workqueue kblockd: flags=0x18
> >> pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
> >> pending: blk_timeout_work
> >
> > You gave up too early. There is no hint for understanding what was going on.
> > While we can observe "BUG: workqueue lockup" under memory pressure, there is
> > no hint like SysRq-t and SysRq-m. Thus, I can't tell something is wrong.
>
> Do you know how to send them programmatically? I tried to find a way
> several times, but failed. Articles that I've found talk about
> pressing some keys that don't translate directly to us-ascii.

# echo t > /proc/sysrq-trigger
# echo m > /proc/sysrq-trigger

>
> But you can also run the reproducer. No report can possible provide
> all possible useful information, sometimes debugging boils down to
> manually adding printfs. That's why syzbot aims at providing a
> reproducer as the ultimate source of details. Also since a developer
> needs to test a proposed fix, it's easier to start with the reproducer
> right away.

I don't have information about how to run the reproducer (e.g. how many
CPUs, how much memory, what network configuration is needed).

Also, please explain how to interpret raw.log file. The raw.log in
[email protected] had a lot of code output and kernel
messages but did not contain "BUG: workqueue lockup" message. On the other
hand, the raw.log in [email protected] has only kernel
messages and contains "BUG: workqueue lockup" message. Why they are
significantly different?

Also, can you add timestamp to all messages?
When each message was printed is a clue for understanding relationship.

>
>
> > At least you need to confirm that lockup lasts for a few minutes. Otherwise,
>
> Is it possible to increase the timeout? How? We could bump it up to 2 minutes.

# echo 120 > /sys/module/workqueue/parameters/watchdog_thresh

But generally, reporting multiple times rather than only once gives me
better clue, for the former would tell me whether situation was changing.

Can you try not to give up as soon as "BUG: workqueue lockup" was printed
for the first time?

>
>
> > this might be just overstressing. (According to repro.c , 12 threads are
> > created and soon SEGV follows? According to above message, only 2 CPUs?
> > Triggering SEGV suggests memory was low due to saving coredump?)
>

2017-12-21 10:19:34

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: BUG: workqueue lockup (2)

On Wed, Dec 20, 2017 at 11:55 AM, Tetsuo Handa
<[email protected]> wrote:
> Dmitry Vyukov wrote:
>> On Tue, Dec 19, 2017 at 3:27 PM, Tetsuo Handa
>> <[email protected]> wrote:
>> > syzbot wrote:
>> >>
>> >> syzkaller has found reproducer for the following crash on
>> >> f3b5ad89de16f5d42e8ad36fbdf85f705c1ae051
>> >
>> > "BUG: workqueue lockup" is not a crash.
>>
>> Hi Tetsuo,
>>
>> What is the proper name for all of these collectively?
>
> I think that things which lead to kernel panic when /proc/sys/kernel/panic_on_oops
> was set to 1 are called an "oops" (or a "kerneloops").
>
> Speak of "BUG: workqueue lockup", this is not an "oops". This message was
> added by 82607adcf9cdf40f ("workqueue: implement lockup detector"), and
> this message does not always indicate a fatal problem. This message can be
> printed when the system is really out of CPU and memory. As far as I tested,
> I think that workqueue was not able to run on specific CPU due to a soft
> lockup bug.


There are also warnings which don't panic normally, unless
panic_on_warn is set. There are also cases when we suddenly lost a
machine and have no idea what happened with it. And also cases when we
are kind-a connected, and nothing bad is printed on console, but it's
still un-operable.
The only collective name I can think of is bug. We could change it to
bug. Otherwise since there are multiple names, I don't think it's
worth spending more time on this.

2017-12-21 10:23:09

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: BUG: workqueue lockup (2)

On Wed, Dec 20, 2017 at 11:55 AM, Tetsuo Handa
<[email protected]> wrote:
> Dmitry Vyukov wrote:
>> On Tue, Dec 19, 2017 at 3:27 PM, Tetsuo Handa
>> <[email protected]> wrote:
>> > syzbot wrote:
>> >>
>> >> syzkaller has found reproducer for the following crash on
>> >> f3b5ad89de16f5d42e8ad36fbdf85f705c1ae051
>> >
>> > "BUG: workqueue lockup" is not a crash.
>>
>> Hi Tetsuo,
>>
>> What is the proper name for all of these collectively?
>
> I think that things which lead to kernel panic when /proc/sys/kernel/panic_on_oops
> was set to 1 are called an "oops" (or a "kerneloops").
>
> Speak of "BUG: workqueue lockup", this is not an "oops". This message was
> added by 82607adcf9cdf40f ("workqueue: implement lockup detector"), and
> this message does not always indicate a fatal problem. This message can be
> printed when the system is really out of CPU and memory. As far as I tested,
> I think that workqueue was not able to run on specific CPU due to a soft
> lockup bug.
>
>>
>>
>> >> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>> >> compiler: gcc (GCC) 7.1.1 20170620
>> >> .config is attached
>> >> Raw console output is attached.
>> >> C reproducer is attached
>> >> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
>> >> for information about syzkaller reproducers
>> >>
>> >>
>> >> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 37s!
>> >> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=-20 stuck for 32s!
>> >> Showing busy workqueues and worker pools:
>> >> workqueue events: flags=0x0
>> >> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
>> >> pending: cache_reap
>> >> workqueue events_power_efficient: flags=0x80
>> >> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256
>> >> pending: neigh_periodic_work, do_cache_clean
>> >> workqueue mm_percpu_wq: flags=0x8
>> >> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
>> >> pending: vmstat_update
>> >> workqueue kblockd: flags=0x18
>> >> pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
>> >> pending: blk_timeout_work
>> >
>> > You gave up too early. There is no hint for understanding what was going on.
>> > While we can observe "BUG: workqueue lockup" under memory pressure, there is
>> > no hint like SysRq-t and SysRq-m. Thus, I can't tell something is wrong.
>>
>> Do you know how to send them programmatically? I tried to find a way
>> several times, but failed. Articles that I've found talk about
>> pressing some keys that don't translate directly to us-ascii.
>
> # echo t > /proc/sysrq-trigger
> # echo m > /proc/sysrq-trigger


This requires working ssh connection, but we routinely deal with
half-dead kernels. I think that sysrq over console is as reliable as
we can get in this context. But I don't know how to send them.

But thinking more about this, I am leaning towards the direction that
kernel just need to do the right thing and print that info.
In lots of cases we get a panic and as far as I understand kernel
won't react on sysrq in that state. Console is still unreliable too.
If a message is not useful, the right direction is to make it useful.

2017-12-21 11:04:34

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: BUG: workqueue lockup (2)

On Wed, Dec 20, 2017 at 11:55 AM, Tetsuo Handa
<[email protected]> wrote:
> Dmitry Vyukov wrote:
>> On Tue, Dec 19, 2017 at 3:27 PM, Tetsuo Handa
>> <[email protected]> wrote:
>> > syzbot wrote:
>> >>
>> >> syzkaller has found reproducer for the following crash on
>> >> f3b5ad89de16f5d42e8ad36fbdf85f705c1ae051
>> >
>> > "BUG: workqueue lockup" is not a crash.
>>
>> Hi Tetsuo,
>>
>> What is the proper name for all of these collectively?
>
> I think that things which lead to kernel panic when /proc/sys/kernel/panic_on_oops
> was set to 1 are called an "oops" (or a "kerneloops").
>
> Speak of "BUG: workqueue lockup", this is not an "oops". This message was
> added by 82607adcf9cdf40f ("workqueue: implement lockup detector"), and
> this message does not always indicate a fatal problem. This message can be
> printed when the system is really out of CPU and memory. As far as I tested,
> I think that workqueue was not able to run on specific CPU due to a soft
> lockup bug.
>
>>
>>
>> >> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>> >> compiler: gcc (GCC) 7.1.1 20170620
>> >> .config is attached
>> >> Raw console output is attached.
>> >> C reproducer is attached
>> >> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
>> >> for information about syzkaller reproducers
>> >>
>> >>
>> >> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 37s!
>> >> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=-20 stuck for 32s!
>> >> Showing busy workqueues and worker pools:
>> >> workqueue events: flags=0x0
>> >> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
>> >> pending: cache_reap
>> >> workqueue events_power_efficient: flags=0x80
>> >> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256
>> >> pending: neigh_periodic_work, do_cache_clean
>> >> workqueue mm_percpu_wq: flags=0x8
>> >> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
>> >> pending: vmstat_update
>> >> workqueue kblockd: flags=0x18
>> >> pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
>> >> pending: blk_timeout_work
>> >
>> > You gave up too early. There is no hint for understanding what was going on.
>> > While we can observe "BUG: workqueue lockup" under memory pressure, there is
>> > no hint like SysRq-t and SysRq-m. Thus, I can't tell something is wrong.
>>
>> Do you know how to send them programmatically? I tried to find a way
>> several times, but failed. Articles that I've found talk about
>> pressing some keys that don't translate directly to us-ascii.
>
> # echo t > /proc/sysrq-trigger
> # echo m > /proc/sysrq-trigger
>
>>
>> But you can also run the reproducer. No report can possible provide
>> all possible useful information, sometimes debugging boils down to
>> manually adding printfs. That's why syzbot aims at providing a
>> reproducer as the ultimate source of details. Also since a developer
>> needs to test a proposed fix, it's easier to start with the reproducer
>> right away.
>
> I don't have information about how to run the reproducer (e.g. how many
> CPUs, how much memory, what network configuration is needed).

Usually all of that is irrelevant and these reproduce well on any machine.
FWIW, there were 2 CPUs and 2 GBs of memory. Network -- whatever GCE
provides as default network.


> Also, please explain how to interpret raw.log file. The raw.log in
> [email protected] had a lot of code output and kernel
> messages but did not contain "BUG: workqueue lockup" message. On the other
> hand, the raw.log in [email protected] has only kernel
> messages and contains "BUG: workqueue lockup" message. Why they are
> significantly different?


The first raw.log does contain "BUG: workqueue lockup", I see it right there:

[ 120.799119] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0
nice=0 stuck for 48s!
[ 120.807313] BUG: workqueue lockup - pool cpus=0-1 flags=0x4 nice=0
stuck for 47s!
[ 120.815024] Showing busy workqueues and worker pools:
[ 120.820369] workqueue events: flags=0x0
[ 120.824536] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=4/256
[ 120.830803] pending: perf_sched_delayed, vmstat_shepherd,
jump_label_update_timeout, cache_reap
[ 120.840149] workqueue events_power_efficient: flags=0x80
[ 120.845651] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=4/256
[ 120.851822] pending: neigh_periodic_work, neigh_periodic_work,
do_cache_clean, reg_check_chans_work
[ 120.861447] workqueue mm_percpu_wq: flags=0x8
[ 120.865947] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 120.872082] pending: vmstat_update
[ 120.875994] workqueue writeback: flags=0x4e
[ 120.880416] pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/256
[ 120.886164] in-flight: 3401:wb_workfn
[ 120.890358] workqueue kblockd: flags=0x18

The difference is cause by the fact that the first one was obtained
from fuzzing session when fuzzer executed lots of random programs,
while the second one was an attempt to localize a reproducer, so the
system run programs one-by-one on freshly booted machines.



> Also, can you add timestamp to all messages?
> When each message was printed is a clue for understanding relationship.

There are timestamps. each program is prefixed with timestamps:

2017/12/03 08:51:30 executing program 6:

these things allow to tie kernel and real time:

[ 71.240837] QAT: Invalid ioctl
2017/12/03 08:51:30 executing program 3:



>> > At least you need to confirm that lockup lasts for a few minutes. Otherwise,
>>
>> Is it possible to increase the timeout? How? We could bump it up to 2 minutes.
>
> # echo 120 > /sys/module/workqueue/parameters/watchdog_thresh
>
> But generally, reporting multiple times rather than only once gives me
> better clue, for the former would tell me whether situation was changing.
>
> Can you try not to give up as soon as "BUG: workqueue lockup" was printed
> for the first time?


I've bumped timeout to 120 seconds with workqueue.watchdog_thresh=120
command line arg. Let's see if it still leaves any false positives, I
think 2 minutes should be enough, a CPU stalled for 2+ minutes
suggests something to fix anyway(even if just slowness somewhere). And
in the end this wasn't a false positive either, right?
Not giving up after an oops message will be hard and problematic for
several reasons.

2017-12-21 13:07:45

by Tetsuo Handa

[permalink] [raw]
Subject: Re: BUG: workqueue lockup (2)

Dmitry Vyukov wrote:
> On Wed, Dec 20, 2017 at 11:55 AM, Tetsuo Handa
> <[email protected]> wrote:
> > Dmitry Vyukov wrote:
> >> On Tue, Dec 19, 2017 at 3:27 PM, Tetsuo Handa
> >> <[email protected]> wrote:
> >> > syzbot wrote:
> >> >>
> >> >> syzkaller has found reproducer for the following crash on
> >> >> f3b5ad89de16f5d42e8ad36fbdf85f705c1ae051
> >> >
> >> > "BUG: workqueue lockup" is not a crash.
> >>
> >> Hi Tetsuo,
> >>
> >> What is the proper name for all of these collectively?
> >
> > I think that things which lead to kernel panic when /proc/sys/kernel/panic_on_oops
> > was set to 1 are called an "oops" (or a "kerneloops").
> >
> > Speak of "BUG: workqueue lockup", this is not an "oops". This message was
> > added by 82607adcf9cdf40f ("workqueue: implement lockup detector"), and
> > this message does not always indicate a fatal problem. This message can be
> > printed when the system is really out of CPU and memory. As far as I tested,
> > I think that workqueue was not able to run on specific CPU due to a soft
> > lockup bug.
>
> There are also warnings which don't panic normally, unless
> panic_on_warn is set. There are also cases when we suddenly lost a
> machine and have no idea what happened with it. And also cases when we
> are kind-a connected, and nothing bad is printed on console, but it's
> still un-operable.

Configuring netconsole might be helpful, for I use udplogger at
https://osdn.net/projects/akari/scm/svn/tree/head/branches/udplogger/
in order to collect all messages (not only kernel messages but also
any text messages which can be sent as UDP packets) with timestamp added.

An example of timestamp added to each line is
http://I-love.SAKURA.ne.jp/tmp/20171018-deflate.log.xz .

You can combine kernel messages from netconsole and output from shell
session using bash's

$ (command1; command2; command3) > /dev/udp/$remote_ip/$remote_port

syntax.

> The only collective name I can think of is bug. We could change it to
> bug. Otherwise since there are multiple names, I don't think it's
> worth spending more time on this.

What I care is whether the report is useful.

>
> >> >
> >> > You gave up too early. There is no hint for understanding what was going on.
> >> > While we can observe "BUG: workqueue lockup" under memory pressure, there is
> >> > no hint like SysRq-t and SysRq-m. Thus, I can't tell something is wrong.
> >>
> >> Do you know how to send them programmatically? I tried to find a way
> >> several times, but failed. Articles that I've found talk about
> >> pressing some keys that don't translate directly to us-ascii.
> >
> > # echo t > /proc/sysrq-trigger
> > # echo m > /proc/sysrq-trigger
>
>
> This requires working ssh connection, but we routinely deal with
> half-dead kernels. I think that sysrq over console is as reliable as
> we can get in this context. But I don't know how to send them.

I can't understand your question. If the machine is running in a
virtualized environment, doesn't hypervisor provide a mean to send
SysRq commands to a guest remotely (e.g. "virsh send-keys sysrq") ?

If no means available, running

----------
#/bin/sh

while :
do
echo t > /proc/sysrq-trigger
echo m > /proc/sysrq-trigger
sleep 60
done
----------

in the background might be used.


>
> But thinking more about this, I am leaning towards the direction that
> kernel just need to do the right thing and print that info.
> In lots of cases we get a panic and as far as I understand kernel
> won't react on sysrq in that state. Console is still unreliable too.
> If a message is not useful, the right direction is to make it useful.
>

Then, configure kdump and analyze the vmcore. Kernel panic message
alone is not so helpful. You can feed commands to crash utility from
stdin and save stdout to a file. Then, the result file will provide
more information than SysRq-t + SysRq-m (apart from lack of ability to
understand whether situation has changed over time).

> >>
> >> But you can also run the reproducer. No report can possible provide
> >> all possible useful information, sometimes debugging boils down to
> >> manually adding printfs. That's why syzbot aims at providing a
> >> reproducer as the ultimate source of details. Also since a developer
> >> needs to test a proposed fix, it's easier to start with the reproducer
> >> right away.
> >
> > I don't have information about how to run the reproducer (e.g. how many
> > CPUs, how much memory, what network configuration is needed).
>
> Usually all of that is irrelevant and these reproduce well on any machine.
> FWIW, there were 2 CPUs and 2 GBs of memory. Network -- whatever GCE
> provides as default network.

The reproducer contained network addresses.
If the bug depends on network, how to configure network is important.

>
>
> > Also, please explain how to interpret raw.log file. The raw.log in
> > [email protected] had a lot of code output and kernel
> > messages but did not contain "BUG: workqueue lockup" message. On the other
> > hand, the raw.log in [email protected] has only kernel
> > messages and contains "BUG: workqueue lockup" message. Why they are
> > significantly different?
>
>
> The first raw.log does contain "BUG: workqueue lockup", I see it right there:
>
> [ 120.799119] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0
> nice=0 stuck for 48s!
> [ 120.807313] BUG: workqueue lockup - pool cpus=0-1 flags=0x4 nice=0
> stuck for 47s!
> [ 120.815024] Showing busy workqueues and worker pools:
> [ 120.820369] workqueue events: flags=0x0
> [ 120.824536] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=4/256
> [ 120.830803] pending: perf_sched_delayed, vmstat_shepherd,
> jump_label_update_timeout, cache_reap
> [ 120.840149] workqueue events_power_efficient: flags=0x80
> [ 120.845651] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=4/256
> [ 120.851822] pending: neigh_periodic_work, neigh_periodic_work,
> do_cache_clean, reg_check_chans_work
> [ 120.861447] workqueue mm_percpu_wq: flags=0x8
> [ 120.865947] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
> [ 120.872082] pending: vmstat_update
> [ 120.875994] workqueue writeback: flags=0x4e
> [ 120.880416] pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/256
> [ 120.886164] in-flight: 3401:wb_workfn
> [ 120.890358] workqueue kblockd: flags=0x18

Where?

I'm talking about https://marc.info/?l=linux-mm&m=151231146619948&q=p4
at http://lkml.kernel.org/r/[email protected] .

>
> The difference is cause by the fact that the first one was obtained
> from fuzzing session when fuzzer executed lots of random programs,
> while the second one was an attempt to localize a reproducer, so the
> system run programs one-by-one on freshly booted machines.
>

I see. But context is too limited to know that.

>
>
> > Also, can you add timestamp to all messages?
> > When each message was printed is a clue for understanding relationship.
>
> There are timestamps. each program is prefixed with timestamps:
>
> 2017/12/03 08:51:30 executing program 6:
>
> these things allow to tie kernel and real time:
>
> [ 71.240837] QAT: Invalid ioctl
> 2017/12/03 08:51:30 executing program 3:
>

What I want is something like

timestamp kernel message 1
timestamp kernel message 2
timestamp kernel message 3
timestamp shell session message 1
timestamp kernel message 4
timestamp kernel message 5
timestamp shell session message 2
timestamp shell session message 3
timestamp kernel message 6
timestamp kernel message 7

which can be done using udplogger above.

>
>
> >> > At least you need to confirm that lockup lasts for a few minutes. Otherwise,
> >>
> >> Is it possible to increase the timeout? How? We could bump it up to 2 minutes.
> >
> > # echo 120 > /sys/module/workqueue/parameters/watchdog_thresh
> >
> > But generally, reporting multiple times rather than only once gives me
> > better clue, for the former would tell me whether situation was changing.
> >
> > Can you try not to give up as soon as "BUG: workqueue lockup" was printed
> > for the first time?
>
>
> I've bumped timeout to 120 seconds with workqueue.watchdog_thresh=120
> command line arg. Let's see if it still leaves any false positives, I
> think 2 minutes should be enough, a CPU stalled for 2+ minutes
> suggests something to fix anyway(even if just slowness somewhere). And
> in the end this wasn't a false positive either, right?

Regarding this bug, the report should include soft lockups rather than
workqueue lockups, for workqueue was not able to run for long due to
soft lockup in progress.

> Not giving up after an oops message will be hard and problematic for
> several reasons.
>

But reports which cannot understand what was happening is not actionable.
Again, "BUG: workqueue lockup" is not an "oops".

2017-12-28 13:43:30

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: BUG: workqueue lockup (2)

On Thu, Dec 21, 2017 at 2:07 PM, Tetsuo Handa
<[email protected]> wrote:
> Dmitry Vyukov wrote:
>> On Wed, Dec 20, 2017 at 11:55 AM, Tetsuo Handa
>> <[email protected]> wrote:
>> > Dmitry Vyukov wrote:
>> >> On Tue, Dec 19, 2017 at 3:27 PM, Tetsuo Handa
>> >> <[email protected]> wrote:
>> >> > syzbot wrote:
>> >> >>
>> >> >> syzkaller has found reproducer for the following crash on
>> >> >> f3b5ad89de16f5d42e8ad36fbdf85f705c1ae051
>> >> >
>> >> > "BUG: workqueue lockup" is not a crash.
>> >>
>> >> Hi Tetsuo,
>> >>
>> >> What is the proper name for all of these collectively?
>> >
>> > I think that things which lead to kernel panic when /proc/sys/kernel/panic_on_oops
>> > was set to 1 are called an "oops" (or a "kerneloops").
>> >
>> > Speak of "BUG: workqueue lockup", this is not an "oops". This message was
>> > added by 82607adcf9cdf40f ("workqueue: implement lockup detector"), and
>> > this message does not always indicate a fatal problem. This message can be
>> > printed when the system is really out of CPU and memory. As far as I tested,
>> > I think that workqueue was not able to run on specific CPU due to a soft
>> > lockup bug.
>>
>> There are also warnings which don't panic normally, unless
>> panic_on_warn is set. There are also cases when we suddenly lost a
>> machine and have no idea what happened with it. And also cases when we
>> are kind-a connected, and nothing bad is printed on console, but it's
>> still un-operable.
>
> Configuring netconsole might be helpful, for I use udplogger at
> https://osdn.net/projects/akari/scm/svn/tree/head/branches/udplogger/
> in order to collect all messages (not only kernel messages but also
> any text messages which can be sent as UDP packets) with timestamp added.
>
> An example of timestamp added to each line is
> http://I-love.SAKURA.ne.jp/tmp/20171018-deflate.log.xz .
>
> You can combine kernel messages from netconsole and output from shell
> session using bash's
>
> $ (command1; command2; command3) > /dev/udp/$remote_ip/$remote_port
>
> syntax.

syzkaller already sends everything over network to a reliable host. So
this part is already working.



>> The only collective name I can think of is bug. We could change it to
>> bug. Otherwise since there are multiple names, I don't think it's
>> worth spending more time on this.
>
> What I care is whether the report is useful.
>
>>
>> >> >
>> >> > You gave up too early. There is no hint for understanding what was going on.
>> >> > While we can observe "BUG: workqueue lockup" under memory pressure, there is
>> >> > no hint like SysRq-t and SysRq-m. Thus, I can't tell something is wrong.
>> >>
>> >> Do you know how to send them programmatically? I tried to find a way
>> >> several times, but failed. Articles that I've found talk about
>> >> pressing some keys that don't translate directly to us-ascii.
>> >
>> > # echo t > /proc/sysrq-trigger
>> > # echo m > /proc/sysrq-trigger
>>
>>
>> This requires working ssh connection, but we routinely deal with
>> half-dead kernels. I think that sysrq over console is as reliable as
>> we can get in this context. But I don't know how to send them.
>
> I can't understand your question. If the machine is running in a
> virtualized environment, doesn't hypervisor provide a mean to send
> SysRq commands to a guest remotely (e.g. "virsh send-keys sysrq") ?

These particular machines were GCE instances. I can't find any info
about special GCE capabilities to send sysrqs.


> If no means available, running
>
> ----------
> #/bin/sh
>
> while :
> do
> echo t > /proc/sysrq-trigger
> echo m > /proc/sysrq-trigger
> sleep 60
> done
> ----------
>
> in the background might be used.

This has good chances of missing the interesting stacks. Thinking of
this more, I think kernel should dump that info on bugs. The current
"BUG: workqueue lockup" report is not actionable, it's not directly
related to syzbot, it's related to kernel.


>> But thinking more about this, I am leaning towards the direction that
>> kernel just need to do the right thing and print that info.
>> In lots of cases we get a panic and as far as I understand kernel
>> won't react on sysrq in that state. Console is still unreliable too.
>> If a message is not useful, the right direction is to make it useful.
>>
>
> Then, configure kdump and analyze the vmcore. Kernel panic message
> alone is not so helpful. You can feed commands to crash utility from
> stdin and save stdout to a file. Then, the result file will provide
> more information than SysRq-t + SysRq-m (apart from lack of ability to
> understand whether situation has changed over time).

I've filed https://github.com/google/syzkaller/issues/491 for kdump
cores. But there are lots to learn. And this also needs to be done not
once by an intelligent human, but programmed to work fully
automatically, which is usually much harder to do.
The general idea, is that the reproducer is the ultimate source of
details. kdump can well be not helpful as well. Lots of people won't
look at them at all for various reasons. Sometimes you need to add
additional printf's and re-run and then repeat this multiple times. I
don't think there a magical piece of information that will shed light
on just any kernel issue.


>> >> But you can also run the reproducer. No report can possible provide
>> >> all possible useful information, sometimes debugging boils down to
>> >> manually adding printfs. That's why syzbot aims at providing a
>> >> reproducer as the ultimate source of details. Also since a developer
>> >> needs to test a proposed fix, it's easier to start with the reproducer
>> >> right away.
>> >
>> > I don't have information about how to run the reproducer (e.g. how many
>> > CPUs, how much memory, what network configuration is needed).
>>
>> Usually all of that is irrelevant and these reproduce well on any machine.
>> FWIW, there were 2 CPUs and 2 GBs of memory. Network -- whatever GCE
>> provides as default network.
>
> The reproducer contained network addresses.
> If the bug depends on network, how to configure network is important.

Do you mean getsockopt$inet_sctp6_SCTP_GET_LOCAL_ADDRS call? But it
only obtains addresses and I think it fails, because it's called on a
local file. Generally, network communication of these programs is
self-contained. If they use network, they bring up interfaces.
There are lots of bits to full reproducibility. For example, you would
also need to use GCE VMs. As I said, in 95% of cases these are
reproducible without any special measures (.config obviously matters,
but it's supplied).




>> > Also, please explain how to interpret raw.log file. The raw.log in
>> > [email protected] had a lot of code output and kernel
>> > messages but did not contain "BUG: workqueue lockup" message. On the other
>> > hand, the raw.log in [email protected] has only kernel
>> > messages and contains "BUG: workqueue lockup" message. Why they are
>> > significantly different?
>>
>>
>> The first raw.log does contain "BUG: workqueue lockup", I see it right there:
>>
>> [ 120.799119] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0
>> nice=0 stuck for 48s!
>> [ 120.807313] BUG: workqueue lockup - pool cpus=0-1 flags=0x4 nice=0
>> stuck for 47s!
>> [ 120.815024] Showing busy workqueues and worker pools:
>> [ 120.820369] workqueue events: flags=0x0
>> [ 120.824536] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=4/256
>> [ 120.830803] pending: perf_sched_delayed, vmstat_shepherd,
>> jump_label_update_timeout, cache_reap
>> [ 120.840149] workqueue events_power_efficient: flags=0x80
>> [ 120.845651] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=4/256
>> [ 120.851822] pending: neigh_periodic_work, neigh_periodic_work,
>> do_cache_clean, reg_check_chans_work
>> [ 120.861447] workqueue mm_percpu_wq: flags=0x8
>> [ 120.865947] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
>> [ 120.872082] pending: vmstat_update
>> [ 120.875994] workqueue writeback: flags=0x4e
>> [ 120.880416] pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/256
>> [ 120.886164] in-flight: 3401:wb_workfn
>> [ 120.890358] workqueue kblockd: flags=0x18
>
> Where?
>
> I'm talking about https://marc.info/?l=linux-mm&m=151231146619948&q=p4
> at http://lkml.kernel.org/r/[email protected] .

Interesting. Looks like LKML bug, the file is truncated half way. You
can see the full raw.log here:
https://groups.google.com/d/msg/syzkaller-bugs/vwcINLkXTVQ/fuzYSNeXAwAJ

I've tried to find LKML and kernel bugzilla admins, but can't find any
real people. If you know how to contact them, we can talk to them.



>> The difference is cause by the fact that the first one was obtained
>> from fuzzing session when fuzzer executed lots of random programs,
>> while the second one was an attempt to localize a reproducer, so the
>> system run programs one-by-one on freshly booted machines.
>>
>
> I see. But context is too limited to know that.

Yes. But there is also a problem of too much context. We have hard
time making some people read even the minimal amount of concentrated
information. Having a 100-page [outdated] manual won't be helpful
either, and as it usually happens these manuals tend to contain
everything but the bit of information you are actually looking for.
That's why I an answering questions.


>> > Also, can you add timestamp to all messages?
>> > When each message was printed is a clue for understanding relationship.
>>
>> There are timestamps. each program is prefixed with timestamps:
>>
>> 2017/12/03 08:51:30 executing program 6:
>>
>> these things allow to tie kernel and real time:
>>
>> [ 71.240837] QAT: Invalid ioctl
>> 2017/12/03 08:51:30 executing program 3:
>>
>
> What I want is something like
>
> timestamp kernel message 1
> timestamp kernel message 2
> timestamp kernel message 3
> timestamp shell session message 1
> timestamp kernel message 4
> timestamp kernel message 5
> timestamp shell session message 2
> timestamp shell session message 3
> timestamp kernel message 6
> timestamp kernel message 7
>
> which can be done using udplogger above.
>
>>
>>
>> >> > At least you need to confirm that lockup lasts for a few minutes. Otherwise,
>> >>
>> >> Is it possible to increase the timeout? How? We could bump it up to 2 minutes.
>> >
>> > # echo 120 > /sys/module/workqueue/parameters/watchdog_thresh
>> >
>> > But generally, reporting multiple times rather than only once gives me
>> > better clue, for the former would tell me whether situation was changing.
>> >
>> > Can you try not to give up as soon as "BUG: workqueue lockup" was printed
>> > for the first time?
>>
>>
>> I've bumped timeout to 120 seconds with workqueue.watchdog_thresh=120
>> command line arg. Let's see if it still leaves any false positives, I
>> think 2 minutes should be enough, a CPU stalled for 2+ minutes
>> suggests something to fix anyway(even if just slowness somewhere). And
>> in the end this wasn't a false positive either, right?
>
> Regarding this bug, the report should include soft lockups rather than
> workqueue lockups, for workqueue was not able to run for long due to
> soft lockup in progress.
>
>> Not giving up after an oops message will be hard and problematic for
>> several reasons.
>>
>
> But reports which cannot understand what was happening is not actionable.
> Again, "BUG: workqueue lockup" is not an "oops".
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/201712212207.GHD30218.MtFFVSOOQLHFJO%40I-love.SAKURA.ne.jp.
> For more options, visit https://groups.google.com/d/optout.

2018-05-12 21:50:46

by Eric Biggers

[permalink] [raw]
Subject: Re: BUG: workqueue lockup (2)

On Tue, Dec 19, 2017 at 04:25:01AM -0800, syzbot wrote:
> syzkaller has found reproducer for the following crash on
> f3b5ad89de16f5d42e8ad36fbdf85f705c1ae051
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers
>
>
> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 37s!
> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=-20 stuck for 32s!
> Showing busy workqueues and worker pools:
> workqueue events: flags=0x0
> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
> pending: cache_reap
> workqueue events_power_efficient: flags=0x80
> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256
> pending: neigh_periodic_work, do_cache_clean
> workqueue mm_percpu_wq: flags=0x8
> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
> pending: vmstat_update
> workqueue kblockd: flags=0x18
> pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
> pending: blk_timeout_work
>

The bug that this reproducer reproduces was fixed a while ago by commit
966031f340185e, so I'm marking this bug report fixed by it:

#syz fix: n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)

Note that the error message was not always "BUG: workqueue lockup"; it was also
sometimes like "watchdog: BUG: soft lockup - CPU#5 stuck for 22s!".

syzbot still is hitting the "BUG: workqueue lockup" error sometimes, but it must
be for other reasons. None has a reproducer currently.

- Eric

2018-05-13 02:06:46

by Tetsuo Handa

[permalink] [raw]
Subject: Re: BUG: workqueue lockup (2)

Eric Biggers wrote:
> The bug that this reproducer reproduces was fixed a while ago by commit
> 966031f340185e, so I'm marking this bug report fixed by it:
>
> #syz fix: n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)

Nope. Commit 966031f340185edd ("n_tty: fix EXTPROC vs ICANON interaction with
TIOCINQ (aka FIONREAD)") is "Wed Dec 20 17:57:06 2017 -0800" but the last
occurrence on linux.git (commit 008464a9360e31b1 ("Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid")) is only a few days ago
("Wed May 9 10:49:52 2018 -1000").

>
> Note that the error message was not always "BUG: workqueue lockup"; it was also
> sometimes like "watchdog: BUG: soft lockup - CPU#5 stuck for 22s!".
>
> syzbot still is hitting the "BUG: workqueue lockup" error sometimes, but it must
> be for other reasons. None has a reproducer currently.

The last occurrence on linux.git is considered as a duplicate of

[upstream] INFO: rcu detected stall in n_tty_receive_char_special
https://syzkaller.appspot.com/bug?id=3d7481a346958d9469bebbeb0537d5f056bdd6e8

which we already have a reproducer at
https://groups.google.com/d/msg/syzkaller-bugs/O4DbPiJZFBY/YCVPocx3AgAJ
and debug output is available at
https://groups.google.com/d/msg/syzkaller-bugs/O4DbPiJZFBY/TxQ7WS5ZAwAJ .

We are currently waiting for comments from Peter Hurley who added that code.

2018-05-13 03:30:41

by Eric Biggers

[permalink] [raw]
Subject: Re: BUG: workqueue lockup (2)

Hi Tetsuo,

On Sun, May 13, 2018 at 11:06:17AM +0900, Tetsuo Handa wrote:
> Eric Biggers wrote:
> > The bug that this reproducer reproduces was fixed a while ago by commit
> > 966031f340185e, so I'm marking this bug report fixed by it:
> >
> > #syz fix: n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)
>
> Nope. Commit 966031f340185edd ("n_tty: fix EXTPROC vs ICANON interaction with
> TIOCINQ (aka FIONREAD)") is "Wed Dec 20 17:57:06 2017 -0800" but the last
> occurrence on linux.git (commit 008464a9360e31b1 ("Merge branch 'for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid")) is only a few days ago
> ("Wed May 9 10:49:52 2018 -1000").
>
> >
> > Note that the error message was not always "BUG: workqueue lockup"; it was also
> > sometimes like "watchdog: BUG: soft lockup - CPU#5 stuck for 22s!".
> >
> > syzbot still is hitting the "BUG: workqueue lockup" error sometimes, but it must
> > be for other reasons. None has a reproducer currently.
>
> The last occurrence on linux.git is considered as a duplicate of
>
> [upstream] INFO: rcu detected stall in n_tty_receive_char_special
> https://syzkaller.appspot.com/bug?id=3d7481a346958d9469bebbeb0537d5f056bdd6e8
>
> which we already have a reproducer at
> https://groups.google.com/d/msg/syzkaller-bugs/O4DbPiJZFBY/YCVPocx3AgAJ
> and debug output is available at
> https://groups.google.com/d/msg/syzkaller-bugs/O4DbPiJZFBY/TxQ7WS5ZAwAJ .
>
> We are currently waiting for comments from Peter Hurley who added that code.
>

Actually I did verify that the C reproducer is fixed by the commit I said, and I
also simplified the reproducer and turned it into an LTP test
(http://lists.linux.it/pipermail/ltp/2018-May/008071.html). Like I said, syzbot
is still occasionally hitting the same "BUG: workqueue lockup" error, but
apparently for other reasons. The one on 008464a9360e31b even looks like it's
in the TTY layer too, and it very well could be a very similar bug, but based on
what I observed it's not the same bug that syzbot reproduced on f3b5ad89de16f5d.
Generally it's best to close syzbot bug reports once the original cause is
fixed, so that syzbot can continue to report other bugs with the same signature.
Otherwise they sit on the syzbot dashboard where few people are looking at them.
Though of course, if you are up to it, you're certainly free to look into any of
the crashes already there even before a new bug report gets created.

Note also that a "workqueue lockup" can be caused by almost anything in the
kernel, I think. This one for example is probably in the sound subsystem:
https://syzkaller.appspot.com/text?tag=CrashReport&x=1767232b800000

Thanks!

Eric

2018-05-13 14:30:13

by Tetsuo Handa

[permalink] [raw]
Subject: Re: BUG: workqueue lockup (2)

Eric Biggers wrote:
> Generally it's best to close syzbot bug reports once the original cause is
> fixed, so that syzbot can continue to report other bugs with the same signature.

That's difficult to judge. Closing as soon as the original cause is fixed allows
syzbot to try to report different reproducer for different bugs. But at the same time,
different/similar bugs which were reported in that report (or comments in the discussion
for that report) will become almost invisible from users (because users unlikely check
other reports in already fixed bugs).

An example is

general protection fault in kernfs_kill_sb (2)
https://syzkaller.appspot.com/bug?id=903af3e08fc7ec60e57d9c9b93b035f4fb038d9a

where the cause of above report was already pointed out in the discussion for
the below report.

general protection fault in kernfs_kill_sb
https://syzkaller.appspot.com/bug?id=d7db6ecf34f099248e4ff404cd381a19a4075653

Since the latter is marked as "fixed on May 08 18:30", I worry that quite few
users would check the relationship.

> Note also that a "workqueue lockup" can be caused by almost anything in the
> kernel, I think. This one for example is probably in the sound subsystem:
> https://syzkaller.appspot.com/text?tag=CrashReport&x=1767232b800000
>

Right. Maybe we should not stop the test upon "workqueue lockup" message, for
it is likely that the cause of lockup is that somebody is busy looping which
should have been reported shortly as "rcu detected stall".

Of course, there is possibility that "workqueue lockup" is reported because
cond_resched() was used when explicit schedule_timeout_*() is required, which
was the reason commit 82607adcf9cdf40f ("workqueue: implement lockup detector")
was added.

If we stop the test upon "workqueue lockup" message, maybe longer timeout (e.g.
300 seconds) is better so that rcu stall or hung task messages are reported
if rcu stall or hung task is occurring.

2018-05-13 14:36:21

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: BUG: workqueue lockup (2)

On Sun, May 13, 2018 at 4:29 PM, Tetsuo Handa
<[email protected]> wrote:
> Eric Biggers wrote:
>> Generally it's best to close syzbot bug reports once the original cause is
>> fixed, so that syzbot can continue to report other bugs with the same signature.
>
> That's difficult to judge. Closing as soon as the original cause is fixed allows
> syzbot to try to report different reproducer for different bugs. But at the same time,
> different/similar bugs which were reported in that report (or comments in the discussion
> for that report) will become almost invisible from users (because users unlikely check
> other reports in already fixed bugs).
>
> An example is
>
> general protection fault in kernfs_kill_sb (2)
> https://syzkaller.appspot.com/bug?id=903af3e08fc7ec60e57d9c9b93b035f4fb038d9a
>
> where the cause of above report was already pointed out in the discussion for
> the below report.
>
> general protection fault in kernfs_kill_sb
> https://syzkaller.appspot.com/bug?id=d7db6ecf34f099248e4ff404cd381a19a4075653
>
> Since the latter is marked as "fixed on May 08 18:30", I worry that quite few
> users would check the relationship.
>
>> Note also that a "workqueue lockup" can be caused by almost anything in the
>> kernel, I think. This one for example is probably in the sound subsystem:
>> https://syzkaller.appspot.com/text?tag=CrashReport&x=1767232b800000
>>
>
> Right. Maybe we should not stop the test upon "workqueue lockup" message, for
> it is likely that the cause of lockup is that somebody is busy looping which
> should have been reported shortly as "rcu detected stall".
>
> Of course, there is possibility that "workqueue lockup" is reported because
> cond_resched() was used when explicit schedule_timeout_*() is required, which
> was the reason commit 82607adcf9cdf40f ("workqueue: implement lockup detector")
> was added.
>
> If we stop the test upon "workqueue lockup" message, maybe longer timeout (e.g.
> 300 seconds) is better so that rcu stall or hung task messages are reported
> if rcu stall or hung task is occurring.

Yes, we need order different stalls/lockups/hangs/etc according to
what can trigger what. E.g. rcu stall can trigger task hung and
workqueue lockup, but not the other way around.
There is https://github.com/google/syzkaller/issues/516 to track this.
But I did not yet have time to figure out all required changes.
If you have additional details, please add them there.