Date: Mon, 2 Nov 2015 16:13:33 +0100
From: Oleg Nesterov
To: Dmitry Vyukov
Cc: Roland McGrath, Andrew Morton, amanieu@gmail.com, pmoore@redhat.com,
    Ingo Molnar, vdavydov@parallels.com, qiaowei.ren@intel.com,
    dave@stgolabs.net, palmer@dabbelt.com, LKML, syzkaller,
    Kostya Serebryany, Alexander Potapenko, Sasha Levin
Subject: Re: WARNING in task_participate_group_stop
Message-ID: <20151102151333.GA17152@redhat.com>

Hi Dmitry,

On 11/02, Dmitry Vyukov wrote:
>
> WARNING: CPU: 1 PID: 1 at kernel/signal.c:334 task_participate_group_stop+0x157/0x1d0()
> Modules linked in:
> CPU: 1 PID: 1 Comm: init Not tainted 4.3.0 #48
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>  ffffffff82e40280 ffff88003eb0fae0 ffffffff819efe55 0000000000000000
>  ffff88003eb0fb20 ffffffff810ec871 ffffffff8110f4d7 ffff88003eb00000
>  ffff88003eb20000 0000000000000000 ffff88003eb0fbf8 ffff88003eb20000
> Call Trace:
>  [] warn_slowpath_null+0x15/0x20 kernel/panic.c:480
>  [] task_participate_group_stop+0x157/0x1d0 kernel/signal.c:334
>  [] do_signal_stop+0x1e7/0x6e0 kernel/signal.c:2060
>  [] get_signal+0x387/0x11b0 kernel/signal.c:2316
>  [] do_signal+0x8d/0x19e0 arch/x86/kernel/signal.c:707
>  [] prepare_exit_to_usermode+0x11d/0x170 arch/x86/entry/common.c:251
>  [] syscall_return_slowpath+0xa3/0x2b0 arch/x86/entry/common.c:317
>  [] int_ret_from_sys_call+0x25/0x8f arch/x86/entry/entry_64.S:281
> ---[ end trace f6697fd630b7c361 ]---
>
>
> The reproducer is (needs to be run as root):
>
> // autogenerated by syzkaller (http://github.com/google/syzkaller)
> #include
> #include
>
> int main()
> {
>   int pid = 1;
>   ptrace(PTRACE_ATTACH, pid, 0, 0);
>   ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_EXITKILL);
>   sleep(1);
>   return 0;
> }

Thanks. Can't reproduce, but at first glance the problem looks clear...

> Yes, it is weird and it kills init right afterwards.

Could you confirm that this WARN_ON() happens _after_ the reproducer exits?

> But I wasn't able
> to figure out what's the root cause (why task does not have
> JOBCTL_STOP_PENDING) and maybe the same WARNING can be triggered
> without root and/or with other than init process. So still posting it
> here.

Yes I think you are right. SIGSTOP can race with SIGKILL which (unlike
SIGCONT) doesn't clear JOBCTL_STOP_DEQUEUED/PENDING/etc. This is mostly
fine, the task won't block in TASK_STOPPED if SIGKILL is pending, but
still is not right and leads to the warning above: JOBCTL_STOP_PENDING
was not set because do_signal_stop()->task_set_jobctl_pending() checks
fatal_signal_pending().

Probably the patch below should fix the problem, but I'd like to think
more before I send the fix.

Oleg.

--- x/kernel/signal.c
+++ x/kernel/signal.c
@@ -2002,7 +2002,7 @@ static bool do_signal_stop(int signr)
 		WARN_ON_ONCE(signr & ~JOBCTL_STOP_SIGMASK);
 
 		if (!likely(current->jobctl & JOBCTL_STOP_DEQUEUED) ||
-		    unlikely(signal_group_exit(sig)))
+		    unlikely(fatal_signal_pending(current)))
 			return false;
 		/*
 		 * There is no group stop already in progress.  We must