Date: Mon, 6 Nov 2017 12:26:05 +0000
From: Jamie Iles
To: Dmitry Vyukov
Cc: Jamie Iles, Oleg Nesterov, syzbot, Andrew Morton, Arvind Yadav,
	Mark Brown, "Eric W. Biederman", Frédéric Weisbecker, LKML,
	"Martin K. Petersen", mchehab@kernel.org, Ingo Molnar,
	mpe@ellerman.id.au, syzkaller-bugs@googlegroups.com, Al Viro,
	Kyle Huey, Kees Cook
Subject: Re: WARNING in task_participate_group_stop
Message-ID: <20171106122605.zdqjodn3wtcjqevh@cedar>
References: <94eb2c058c80ea49ed055cc8695e@google.com>
	<20171031163451.GA30223@redhat.com>
	<20171102170138.GA13663@redhat.com>
	<20171106112508.lun6eftpj5icnvdy@cedar>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 06, 2017 at 12:56:09PM +0100, Dmitry Vyukov wrote:
> On Mon, Nov 6, 2017 at 12:25 PM, Jamie Iles wrote:
> > Hi Dmitry,
> >
> > On Mon, Nov 06, 2017 at 12:02:19PM +0100, Dmitry Vyukov wrote:
> >> On Thu, Nov 2, 2017 at 6:01 PM, Oleg Nesterov wrote:
> >> > On 11/01, Dmitry Vyukov wrote:
> >> >>
> >> >> On Tue, Oct 31, 2017 at 7:34 PM, Oleg Nesterov wrote:
> >> >> > Hmm. I do not see reproducer in this email...
> >> >>
> >> >> Ah, sorry. You can see full thread with attachments here:
> >> >> https://groups.google.com/forum/#!topic/syzkaller-bugs/EUmYZU4m5gU
> >> >
> >> > Heh. I can't say I enjoyed reading the reproducer ;)
> >> >
> >> >> >> > WARNING: CPU: 0 PID: 1 at kernel/signal.c:340
> >> >> >> > task_participate_group_stop+0x1ce/0x230 kernel/signal.c:340
> >> >> >> > Kernel panic - not syncing: panic_on_warn set ...
> >> >> >> >
> >> >> >> > CPU: 0 PID: 1 Comm: init Not tainted 4.13.0-mm1+ #5
> >> >> >
> >> >> > So this is init process with SIGNAL_UNKILLABLE flag set. And I hope it has
> >> >> > the pending SIGKILL, otherwise there is something else.
> >> >
> >> > From repro.c
> >> >
> >> > line 111 r[8] = syscall(__NR_ptrace, 0x10ul, r[7]);
> >> >
> >> > this is PTRACE_ATTACH
> >> >
> >> > line 115 syscall(__NR_ptrace, 0x4200ul, r[7], 0x40000012ul, 0x100012ul);
> >> >
> >> > this is PTRACE_SETOPTIONS and "data" includes PTRACE_O_EXITKILL.
> >> >
> >> > r[7] is initialized at
> >> >
> >> > line 110 r[7] = *(uint32_t*)0x20f9cffc;
> >> >
> >> > so if it is eq to 1 then it can attach to init and in this case the problem
> >> > can be explained by the wrong SIGNAL_UNKILLABLE/SIGKILL logic.
> >> >
> >> > But how *(uint32_t*)0x20f9cffc can be 1 ?
> >> >
> >> > line 108 r[6] = syscall(__NR_fcntl, r[1], 0x10ul, 0x20f9cff8ul);
> >> >
> >> > this is F_GETOWN_EX, addr = 0x20f9cff8 == 0x20f9cffc - 4, so if fcntl()
> >> > actually succeeds then r[7] == f_owner_ex->pid.
> >> >
> >> > It _can_ be 1, but the reproducer doesn't work for me. If you can reproduce,
> >> > could you try the patch below?
> >>
> >> Hi,
> >>
> >> I would like to understand why you were not able to reproduce it. I
> >> won't be sitting here all the time, and we are tracking hundreds of
> >> bugs across different linux kernels and other OSes, so it's
> >> problematic to do any extensive work on all of them. That's why we try
> >> to provide reproducers.
> >>
> >> I've just tried the repro on the latest upstream
> >> (39dae59d66acd86d1de24294bd2f343fd5e7a625) and it triggered the
> >> WARNING within a second.
> >> Did you use the config provided? Did you use qemu or real hardware?
> >> Can you try in qemu (with -smp>1)?
> >
> > I'm unable to reproduce the warning in qemu with SMP (on a 32 CPU VM).
> > Instead I get the following instant traceback which is different to what
> > you report when run as root:
>
>
> Uh, it seems to be racy. I am getting either the WARNING or "attempt
> to kill init" in ~1/5 proportion.
> Please try this simplified program, it triggers the WARNING all the time for me:

I can't reproduce the warning with that reproducer, but I do see that the
first run exits and leaves init in state T; the second run then ends up
killing init, and the VM crashes. If I run it once and then do
'kill -CONT 1', init goes back to sleeping and won't accept a SIGSTOP,
so it looks like SIGNAL_UNKILLABLE isn't gone, though.

Jamie
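
For reference, the constants that Oleg decodes from repro.c in the quoted analysis can be checked with a small sketch. The numeric values are hard-coded here from the Linux uapi headers rather than taken from this thread, and the helper names decode_ptrace_options() and f_owner_ex_pid_addr() are hypothetical, introduced only for illustration. It shows that the "data" argument 0x100012 includes PTRACE_O_EXITKILL, and that the pid field of the struct f_owner_ex written at 0x20f9cff8 sits at 0x20f9cffc, the address r[7] is read from.

```python
import struct

# Request numbers hard-coded from the Linux uapi headers
# (assumption of this sketch: a Linux target such as x86-64).
PTRACE_ATTACH = 0x10       # request used at repro.c line 111
PTRACE_SETOPTIONS = 0x4200  # request used at repro.c line 115
F_GETOWN_EX = 0x10          # fcntl command used at repro.c line 108

# PTRACE_O_* option bits (subset) from include/uapi/linux/ptrace.h.
PTRACE_OPTIONS = {
    "PTRACE_O_TRACESYSGOOD": 1 << 0,
    "PTRACE_O_TRACEFORK": 1 << 1,
    "PTRACE_O_TRACEVFORK": 1 << 2,
    "PTRACE_O_TRACECLONE": 1 << 3,
    "PTRACE_O_TRACEEXEC": 1 << 4,
    "PTRACE_O_TRACEVFORKDONE": 1 << 5,
    "PTRACE_O_TRACEEXIT": 1 << 6,
    "PTRACE_O_TRACESECCOMP": 1 << 7,
    "PTRACE_O_EXITKILL": 1 << 20,
}


def decode_ptrace_options(data):
    """Return the names of the PTRACE_O_* bits set in data (hypothetical helper)."""
    return [name for name, bit in PTRACE_OPTIONS.items() if data & bit]


def f_owner_ex_pid_addr(base):
    """Address of the pid field in struct f_owner_ex { int type; pid_t pid; }.

    The pid field follows the leading int 'type', so it sits one native
    int past the base address passed to fcntl(F_GETOWN_EX).
    """
    return base + struct.calcsize("i")


if __name__ == "__main__":
    # "data" argument of the PTRACE_SETOPTIONS call in repro.c.
    print(decode_ptrace_options(0x100012))
    # Buffer address passed to fcntl(F_GETOWN_EX) in repro.c.
    print(hex(f_owner_ex_pid_addr(0x20f9cff8)))
```

This only decodes numbers; it does not call ptrace() or fcntl() and says nothing about whether the fcntl() call in the reproducer actually succeeds.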