From: Dmitry Vyukov
Date: Wed, 20 Dec 2017 09:00:11 +0100
Subject: Re: proc_flush_task oops
To: "Eric W. Biederman"
Cc: Dave Jones, Linus Torvalds, Al Viro, Linux Kernel, syzkaller-bugs@googlegroups.com
In-Reply-To: <87mv2e17vz.fsf@xmission.com>
References: <20171218214438.GA32728@codemonkey.org.uk>
 <20171218221541.GP21978@ZenIV.linux.org.uk>
 <20171218231013.GA9481@codemonkey.org.uk>
 <20171219033926.GA26981@codemonkey.org.uk>
 <87lghy7eul.fsf@xmission.com>
 <20171219193020.GA9237@codemonkey.org.uk>
 <878tdy5r5t.fsf@xmission.com>
 <87mv2e17vz.fsf@xmission.com>

On Wed, Dec 20, 2017 at 2:54 AM, Eric W. Biederman wrote:
> ebiederm@xmission.com (Eric W. Biederman) writes:
>
>> Dave Jones writes:
>>
>>> On Tue, Dec 19, 2017 at 12:27:30PM -0600, Eric W. Biederman wrote:
>>>  > Dave Jones writes:
>>>  >
>>>  > > On Mon, Dec 18, 2017 at 03:50:52PM -0800, Linus Torvalds wrote:
>>>  > >
>>>  > >  > But I don't see what would have changed in this area recently.
>>>  > >  >
>>>  > >  > Do you end up saving the seeds that cause crashes? Is this
>>>  > >  > reproducible? (Other than seeing it twice, of course)
>>>  > >
>>>  > > Only clue so far, is every time I'm able to trigger it, the last thing
>>>  > > the child process that triggers it did, was an execveat.
>>>  >
>>>  > Is there any chance the execveat might be called from a child thread?
>>>
>>> If trinity chooses one of the exec syscalls, it forks off an extra child
>>> to do it in, on the off-chance that it succeeds, and we never return.
>>> https://github.com/kernelslacker/trinity/blob/master/syscall.c#L139
>>
>>         extrapid = fork();
>>         if (extrapid == 0) {
>>                 /* grand-child */
>>                 char childname[]="trinity-subchild";
>>                 prctl(PR_SET_NAME, (unsigned long) &childname);
>>
>>                 __do_syscall(rec, GOING_AWAY);
>>                 /* if this was for eg. an successful execve, we should never get here.
>>                  * if it failed though... */
>>                 _exit(EXIT_SUCCESS);
>>         }
>>
>> That is interesting.
>>
>>
>> So the system call sequence is a fork which just succeeded and then an
>> exec.  That reduces the possibilities quite a lot.
>>
>> With pids there was a recent change that just replaced the pid hash
>> table and the pid bitmap with an idr.  It changes the locking somewhat
>> and probably changes the timing so that might be the culprit.
>>
>> I am trying to figure out if there is an interface that would let
>> ns_last_pid for a pid namespace be accessed before the first pid is
>> allocated and I am not seeing it.  It does not appear to be possible
>> to mount a proc for a pid namespace you are not currently in.
>>
>> *Scratches my head*  I am not seeing anything obvious.
>
> Can you try this patch as you reproduce this issue?
>
> diff --git a/kernel/pid.c b/kernel/pid.c
> index b13b624e2c49..df9e5d4d8f83 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -210,6 +210,7 @@ struct pid *alloc_pid(struct pid_namespace *ns)
>                 goto out_unlock;
>         for ( ; upid >= pid->numbers; --upid) {
>                 /* Make the PID visible to find_pid_ns. */
> +               WARN_ON(!upid->ns->proc_mnt);
>                 idr_replace(&upid->ns->idr, pid, upid->nr);
>                 upid->ns->pid_allocated++;
>         }
>
>
> If the warning triggers it means the bug is in alloc_pid and somehow
> something has gotten past the is_child_reaper check.
>
> If the warning does not trigger it means something is stomping proc_mnt.
>
> In the entire kernel there are exactly two assignments to proc_mnt.
> - kmem_cache_zalloc in create_pid_namespace.
> - In pid_ns_prepare_proc where proc_mnt is set to a non-zero value.
>
> On the 29th of Nov syzkaller also hit this and gave me this reproducer
> that I can't figure out heads or tails of.

You can ask syzbot to test a patch if there is a reproducer.
Instructions are at the bottom of the report email.

> #{Threaded:true Collide:true Repeat:true Procs:8 Sandbox:namespace Fault:false FaultCall:-1 FaultNth:0 EnableTun:true UseTmpDir:true HandleSegv:true WaitRepeat:true Debug:false Repro:false}
> mmap(&(0x7f0000000000/0xfff000)=nil, 0xfff000, 0x3, 0x32, 0xffffffffffffffff, 0x0)
> perf_event_open(&(0x7f000025c000)={0x2, 0x78, 0x3e3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xf72, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, 0x0, 0xffffffffffffffff, 0xffffffffffffffff, 0x0)
> socket$inet6_dccp(0xa, 0x6, 0x0)
> unshare(0x20000400)
> sendmsg$unix(0xffffffffffffffff, &(0x7f0000001000-0x38)={&(0x7f0000239000-0x8)=@abs={0x0, 0x0, 0x0}, 0x8, &(0x7f0000008000)=[], 0x0, &(0x7f0000001000-0x10)=[@rights={0x200, 0x1, 0x1, [0xffffffffffffffff]}], 0x1, 0x0}, 0x0)
> process_vm_writev(0x0, &(0x7f0000699000-0x70)=[{&(0x7f00006a5000)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0x4c}, {&(0x7f00007b9000-0x54)="", 0x0}, {&(0x7f00004f3000)="", 0x0},
{&(0x7f00002e3000-0xd6)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0xd6}, {&(0x7f0000f2e000)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0x52}, {&(0x7f00008e5000-0x10)="00000000000000000000000000000000", 0x10}, {&(0x7f0000a3a000)="", 0x0}], 0x7, &(0x7f0000d05000)=[{&(0x7f0000d64000)="", 0x0}, {&(0x7f0000062000-0x93)="", 0x0}, {&(0x7f0000a16000-0x7e)="000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0x7e}, {&(0x7f00003dc000-0x9a)="", 0x0}, {&(0x7f0000fe3000-0xc7)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0xc7}], 0x5, 0x0) > pselect6(0x40, &(0x7f0000cc9000-0x40)={0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x20000, 0x0}, &(0x7f0000cc9000-0x40)={0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, &(0x7f00000de000-0x40)={0xffffffffffffffe1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1}, &(0x7f00008e6000-0x10)={0x0, 0x989680}, &(0x7f0000205000-0x10)={&(0x7f00006e4000-0x8)={0x0}, 0x8}) > clone(0x20900, &(0x7f0000a94000-0x1)="6f", 
&(0x7f00002b8000-0x4)=0x0, &(0x7f000029e000)=0x0, &(0x7f00006fe000)="") > ioctl$KVM_ENABLE_CAP_CPU(0xffffffffffffffff, 0x4068aea3, &(0x7f0000e48000)={0x7b, 0x0, [0x1, 0x1, 0x800, 0x1], [0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0]}) > epoll_ctl$EPOLL_CTL_DEL(0xffffffffffffffff, 0x2, 0xffffffffffffffff) > > #{Threaded:true Collide:true Repeat:true Procs:8 Sandbox:namespace Fault:false FaultCall:-1 FaultNth:0 EnableTun:true UseTmpDir:true HandleSegv:true WaitRepeat:true Debug:false Repro:false} > mmap(&(0x7f0000000000/0xfff000)=nil, 0xfff000, 0x3, 0x32, 0xffffffffffffffff, 0x0) > perf_event_open(&(0x7f000025c000)={0x2, 0x78, 0x3e3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xf72, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, 0x0, 0xffffffffffffffff, 0xffffffffffffffff, 0x0) > socket$inet6_dccp(0xa, 0x6, 0x0) > unshare(0x20000400) > sendmsg$unix(0xffffffffffffffff, &(0x7f0000001000-0x38)={&(0x7f0000239000-0x8)=@abs={0x0, 0x0, 0x0}, 0x8, &(0x7f0000008000)=[], 0x0, &(0x7f0000001000-0x10)=[@rights={0x200, 0x1, 0x1, [0xffffffffffffffff]}], 0x1, 0x0}, 0x0) > process_vm_writev(0x0, &(0x7f0000699000-0x70)=[{&(0x7f00006a5000)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0x4c}, {&(0x7f00007b9000-0x54)="", 0x0}, {&(0x7f00004f3000)="", 0x0}, 
{&(0x7f00002e3000-0xd6)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0xd6}, {&(0x7f0000f2e000)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0x52}, {&(0x7f00008e5000-0x10)="00000000000000000000000000000000", 0x10}, {&(0x7f0000a3a000)="", 0x0}], 0x7, &(0x7f0000d05000)=[{&(0x7f0000d64000)="", 0x0}, {&(0x7f0000062000-0x93)="", 0x0}, {&(0x7f0000a16000-0x7e)="000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0x7e}, {&(0x7f00003dc000-0x9a)="", 0x0}, {&(0x7f0000fe3000-0xc7)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0xc7}], 0x5, 0x0) > pselect6(0x40, &(0x7f0000cc9000-0x40)={0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x20000, 0x0}, &(0x7f0000cc9000-0x40)={0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, &(0x7f00000de000-0x40)={0xffffffffffffffe1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1}, &(0x7f00008e6000-0x10)={0x0, 0x989680}, &(0x7f0000205000-0x10)={&(0x7f00006e4000-0x8)={0x0}, 0x8}) > clone(0x20900, &(0x7f0000a94000-0x1)="6f", 
&(0x7f00002b8000-0x4)=0x0, &(0x7f000029e000)=0x0, &(0x7f00006fe000)="") > ioctl$KVM_ENABLE_CAP_CPU(0xffffffffffffffff, 0x4068aea3, &(0x7f0000e48000)={0x7b, 0x0, [0x1, 0x1, 0x800, 0x1], [0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0]}) > epoll_ctl$EPOLL_CTL_DEL(0xffffffffffffffff, 0x2, 0xffffffffffffffff) > > > Eric > > -- > You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group. > To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/87mv2e17vz.fsf%40xmission.com. > For more options, visit https://groups.google.com/d/optout.