From: Dmitry Vyukov
Date: Tue, 8 Mar 2016 17:59:55 +0100
Subject: Re: [RESEND PATCH 0/5] perf core: Support overwrite ring buffer
To: Ingo Molnar
Cc: Peter Zijlstra, Wang Nan, Ingo Molnar, LKML, He Kuang, Alexei Starovoitov, Arnaldo Carvalho de Melo, Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Zefan Li, pi3orama@163.com

On Tue, Mar 8, 2016 at 5:48 PM, Ingo Molnar wrote:
>
> * Ingo Molnar wrote:
>
>> It only had a couple of seconds of runtime:
>>
>> 49652 mingo 20 0 1434276 52144 11344 S 0.0 0.0 0:00.54 syz-manager
>> 49661 mingo 20 0 2196672 43948 10448 S 0.0 0.0 0:05.59 syz-fuzzer
>
> Ah, so it appears to making some progress:
>
> 49652 mingo 20 0 1581740 47600 11344 S 0.0 0.0 0:00.58 syz-manager
> 49661 mingo 20 0 2204868 43720 10448 S 0.0 0.0 0:07.49 syz-fuzzer
>
> 49652 mingo 20 0 1598132 31512 11344 S 0.0 0.0 0:00.61 syz-manager
> 49661 mingo 20 0 2204868 44252 10448 S 0.0 0.0 0:09.09 syz-fuzzer
>
> but only about +1 second runtime added every minute or so. Is that expected?

The main work is done by child syz-executor processes.
syz-manager/syz-fuzzer only guide the process. You can set the "procs"
param in the config to a higher value to increase CPU utilization. To get
more bugs you want to saturate all CPUs to trigger more unusual thread
interleavings.

> There's no progress mark anywhere suggesting that the tool thinks it is going
> fine. You might want to emit periodic (once a minute or so) 'I am still OK!'
> messages or so.

Will do.

Regarding the zombie processes, it may or may not be OK. If they hang
for minutes and you can't kill them, then it is a kernel bug. If they
hang for minutes and you can kill them, then it is either a kernel bug
or my bug. If they are recycled eventually, then it is OK.

The first thing I do in such cases is:

$ cat /proc/$PID/task/**/stack

If there is a second unfinished thread hanging on a kernel spinlock or
mutex, then it's definitely bad.

It also helps to enable CONFIG_RCU_STALL_COMMON=y,
CONFIG_DEBUG_ATOMIC_SLEEP=y, CONFIG_WQ_WATCHDOG=y and spinlock/mutex
debugging. These can detect various stalls.
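
The stack check above prints every thread's stack back to back, which makes it hard to tell which thread is the stuck one. A minimal sketch that labels each stack with its thread ID could look like the following. The function name dump_task_stacks and its optional second argument (an alternate proc root, there only so the sketch can be tried on a fake directory tree) are hypothetical and not part of syzkaller. It assumes a POSIX shell and a kernel that exposes /proc/PID/task/TID/stack, which typically requires root to read.

```shell
# Hypothetical helper: print the kernel stack of each thread of a process,
# one labelled section per thread ID.
# Usage: dump_task_stacks PID [PROC_ROOT]
dump_task_stacks() {
    pid=$1
    proc_root=${2:-/proc}   # assumption: overridable only for dry runs

    for task in "$proc_root/$pid"/task/*; do
        # Skip the unexpanded glob if the process has already exited.
        [ -d "$task" ] || continue
        printf '== tid %s ==\n' "${task##*/}"
        # Reading the stack file usually needs root; report and keep going.
        cat "$task/stack" 2>/dev/null || echo "(stack not readable)"
    done
}
```

Running `dump_task_stacks "$PID"` as root on a hung syz-executor then shows, per thread, whether it sits in a spinlock or mutex path.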