Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S965458AbdLSO11 (ORCPT ); Tue, 19 Dec 2017 09:27:27 -0500
Received: from www262.sakura.ne.jp ([202.181.97.72]:32660 "EHLO
        www262.sakura.ne.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S934660AbdLSO10 (ORCPT );
        Tue, 19 Dec 2017 09:27:26 -0500
To: bot+e38be687a2450270a3b593bacb6b5795a7a74edb@syzkaller.appspotmail.com,
        syzkaller-bugs@googlegroups.com
Cc: dvyukov@google.com, gregkh@linuxfoundation.org,
        kstewart@linuxfoundation.org, linux-kernel@vger.kernel.org,
        linux-mm@kvack.org, pombredanne@nexb.com, tglx@linutronix.de
Subject: Re: BUG: workqueue lockup (2)
From: Tetsuo Handa
References: <94eb2c03c9bc75aff2055f70734c@google.com>
        <001a113f711a528a3f0560b08e76@google.com>
In-Reply-To: <001a113f711a528a3f0560b08e76@google.com>
Message-Id: <201712192327.FIJ64026.tMQFOOVFFLHOSJ@I-love.SAKURA.ne.jp>
X-Mailer: Winbiff [Version 2.51 PL2]
X-Accept-Language: ja,en,zh
Date: Tue, 19 Dec 2017 23:27:24 +0900
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1701
Lines: 40

syzbot wrote:
>
> syzkaller has found reproducer for the following crash on
> f3b5ad89de16f5d42e8ad36fbdf85f705c1ae051

"BUG: workqueue lockup" is not a crash.

> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers
>
>
> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 37s!
> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=-20 stuck for 32s!
> Showing busy workqueues and worker pools:
> workqueue events: flags=0x0
>   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
>     pending: cache_reap
> workqueue events_power_efficient: flags=0x80
>   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256
>     pending: neigh_periodic_work, do_cache_clean
> workqueue mm_percpu_wq: flags=0x8
>   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
>     pending: vmstat_update
> workqueue kblockd: flags=0x18
>   pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
>     pending: blk_timeout_work

You gave up too early. There is no hint for understanding what was going on.
While we can observe "BUG: workqueue lockup" under memory pressure, there is
no hint here such as SysRq-t or SysRq-m output. Thus, I can't tell whether
something is wrong.

At least you need to confirm that the lockup lasts for a few minutes.
Otherwise, this might just be overstressing.

(According to repro.c, 12 threads are created and soon SEGV follows?
According to the above message, only 2 CPUs? Triggering SEGV suggests memory
was low due to saving coredump?)
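
A minimal sketch of how the "lasts for a few minutes" check could be done
against the raw console output, assuming the watchdog lines keep the format
quoted above. The function names and the 180-second threshold are
illustrative assumptions, not part of syzkaller or the kernel.

```python
import re

# Matches watchdog lines in the format quoted above, e.g.
# "BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 37s!"
LOCKUP_RE = re.compile(
    r"BUG: workqueue lockup - pool (?P<pool>.+?) stuck for (?P<secs>\d+)s!"
)


def longest_stall(console_lines):
    """Return {pool description: longest reported stall in seconds}."""
    longest = {}
    for line in console_lines:
        match = LOCKUP_RE.search(line)
        if match is None:
            continue  # not a workqueue watchdog line
        pool = match.group("pool")
        secs = int(match.group("secs"))
        longest[pool] = max(secs, longest.get(pool, 0))
    return longest


def persistent_pools(console_lines, threshold_secs=180):
    """List pools whose longest stall reached threshold_secs.

    threshold_secs=180 is an assumed reading of "a few minutes".
    """
    stalls = longest_stall(console_lines)
    return sorted(pool for pool, secs in stalls.items() if secs >= threshold_secs)
```

With only the two lines quoted in this report (37s and 32s), no pool reaches
the assumed threshold, so the report would not count as a persistent lockup
under this check.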