From: Dmitry Vyukov
Date: Wed, 26 Dec 2018 10:03:53 +0100
Subject: Re: general protection fault in put_pid
To: Manfred Spraul, Shakeel Butt
Cc: syzbot+1145ec2e23165570c3ac@syzkaller.appspotmail.com, Andrew Morton, David Howells, "Eric W. Biederman", ktsanaktsidis@zendesk.com, LKML, Michal Hocko, Mike Rapoport, Stephen Rothwell, syzkaller-bugs, Matthew Wilcox, Davidlohr Bueso
References: <00000000000051ee78057cc4d98f@google.com> <87614226-e895-c3a3-3626-b0fbe7e191be@colorfullife.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 25, 2018 at 10:35 AM Dmitry Vyukov wrote:
>
> On Sun, Dec 23, 2018 at 7:38 PM Manfred Spraul wrote:
> >
> > Hello Dmitry,
> >
> > On 12/23/18 11:42 AM, Dmitry Vyukov wrote:
> > > Actually was able to reproduce this with a syzkaller program:
> > > ./syz-execprog -repeat=0 -procs=10 prog
> > > ...
> > > kasan: CONFIG_KASAN_INLINE enabled
> > > kasan: GPF could be caused by NULL-ptr deref or user memory access
> > > general protection fault: 0000 [#1] PREEMPT SMP KASAN
> > > CPU: 1 PID: 8788 Comm: syz-executor8 Not tainted 4.20.0-rc7+ #6
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> > > RIP: 0010:__list_del_entry_valid+0x7e/0x150 lib/list_debug.c:51
> > > Code: ad de 4c 8b 26 49 39 c4 74 66 48 b8 00 02 00 00 00 00 ad de 48
> > > 89 da 48 39 c3 74 65 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c
> > > 02 00 75 7b 48 8b 13 48 39 f2 75 57 49 8d 7c 24 08 48 b8 00
> > > RSP: 0018:ffff88804faef210 EFLAGS: 00010a02
> > > RAX: dffffc0000000000 RBX: f817edba555e1f00 RCX: ffffffff831bad5f
> > > RDX: 1f02fdb74aabc3e0 RSI: ffff88801b8a0720 RDI: ffff88801b8a0728
> > > RBP: ffff88804faef228 R08: fffff52001055401 R09: fffff52001055401
> > > R10: 0000000000000001 R11: fffff52001055400 R12: ffff88802d52cc98
> > > R13: ffff88801b8a0728 R14: ffff88801b8a0720 R15: dffffc0000000000
> > > FS: 0000000000d24940(0000) GS:ffff88802d500000(0000) knlGS:0000000000000000
> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 00000000004bb580 CR3: 0000000011177005 CR4: 00000000003606e0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > Call Trace:
> > > __list_del_entry include/linux/list.h:117 [inline]
> > > list_del include/linux/list.h:125 [inline]
> > > unlink_queue ipc/sem.c:786 [inline]
> > > freeary+0xddb/0x1c90 ipc/sem.c:1164
> > > free_ipcs+0xf0/0x160 ipc/namespace.c:112
> > > sem_exit_ns+0x20/0x40 ipc/sem.c:237
> > > free_ipc_ns ipc/namespace.c:120 [inline]
> > > put_ipc_ns+0x55/0x160 ipc/namespace.c:152
> > > free_nsproxy+0xc0/0x1f0 kernel/nsproxy.c:180
> > > switch_task_namespaces+0xa5/0xc0 kernel/nsproxy.c:229
> > > exit_task_namespaces+0x17/0x20 kernel/nsproxy.c:234
> > > do_exit+0x19e5/0x27d0 kernel/exit.c:866
> > > do_group_exit+0x151/0x410 kernel/exit.c:970
> > > __do_sys_exit_group kernel/exit.c:981 [inline]
> > > __se_sys_exit_group kernel/exit.c:979 [inline]
> > > __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:979
> > > do_syscall_64+0x192/0x770 arch/x86/entry/common.c:290
> > > entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > > RIP: 0033:0x4570e9
> > > Code: 5d af fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48
> > > 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
> > > 01 f0 ff ff 0f 83 2b af fb ff c3 66 2e 0f 1f 84 00 00 00 00
> > > RSP: 002b:00007ffe35f12018 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> > > RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00000000004570e9
> > > RDX: 0000000000410540 RSI: 0000000000a34c00 RDI: 0000000000000045
> > > RBP: 00000000004a43a4 R08: 000000000000000c R09: 0000000000000000
> > > R10: 0000000000d24940 R11: 0000000000000246 R12: 0000000000000000
> > > R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000008
> > > Modules linked in:
> > > Dumping ftrace buffer:
> > > (ftrace buffer empty)
> > > ---[ end trace 17829b0f00569a59 ]---
> > > RIP: 0010:__list_del_entry_valid+0x7e/0x150 lib/list_debug.c:51
> > > Code: ad de 4c 8b 26 49 39 c4 74 66 48 b8 00 02 00 00 00 00 ad de 48
> > > 89 da 48 39 c3 74 65 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c
> > > 02 00 75 7b 48 8b 13 48 39 f2 75 57 49 8d 7c 24 08 48 b8 00
> > > RSP: 0018:ffff88804faef210 EFLAGS: 00010a02
> > > RAX: dffffc0000000000 RBX: f817edba555e1f00 RCX: ffffffff831bad5f
> > > RDX: 1f02fdb74aabc3e0 RSI: ffff88801b8a0720 RDI: ffff88801b8a0728
> > > RBP: ffff88804faef228 R08: fffff52001055401 R09: fffff52001055401
> > > R10: 0000000000000001 R11: fffff52001055400 R12: ffff88802d52cc98
> > > R13: ffff88801b8a0728 R14: ffff88801b8a0720 R15: dffffc0000000000
> > > FS: 0000000000d24940(0000) GS:ffff88802d500000(0000) knlGS:0000000000000000
> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 00000000004bb580 CR3: 0000000011177005 CR4: 00000000003606e0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > >
> > >
> > > The prog is:
> > > unshare(0x8020000)
> > > semget$private(0x0, 0x4007, 0x0)
> > >
> > > kernel is on 9105b8aa50c182371533fc97db64fc8f26f051b3
> > >
> > > and again it involved lots of oom kills, the repro eats all memory, a
> > > process getting killed, frees some memory and the process repeats.
> >
> > I was too fast: I can't reproduce the memory leak.
> >
> > Can you send me the source for prog?
>
>
> Here is the program:
> https://gist.githubusercontent.com/dvyukov/03ec54b3429ade16fa07bf8b2379aff3/raw/ae4f654e279810de2505e8fa41b73dc1d77778e6/gistfile1.txt
>
> But we concluded this is not a leak, right?
> It just creates large semaphores tied to a persistent ipcns. Once the
> process is killed, all memory is released. When this program runs, it
> eats all memory, then one of the subprocesses is oom-killed, part of
> memory is released, then all memory is consumed again by a new
> subprocess and this repeats. If all processes are killed, all memory
> is released back. It seems to be working as intended.
>
> However, what you said about kernel.sem sysctl is useful and I think
> we need to use it for additional sandboxing of syzkaller test
> processes. I am thinking of applying:
>
> kernel.shmmax = 16777216
> kernel.shmall = 536870912
> kernel.shmmni = 1024
> kernel.msgmax = 8192
> kernel.msgmni = 1024
> kernel.msgmnb = 1024
> kernel.sem = 1024 1048576 500 1024
>
> It should be enough to trigger bugs of any complexity (oom's aside),
> but should prevent uncontrolled memory consumption.
> Looking at the code I figured that these sysctls are
> per-ipc-namespace, right? I.e. if I do sysctl from an ipcns, the
> limits will be set only for that ns. I won't use this initially,
> but something to keep in mind if the global limits will fail in some
> way.

+Shakeel, who was interested in memory isolation problems.

Setting these sysctls globally does not help, because as far as I can
tell they are reset to the defaults for every new ipc namespace.
Setting them inside the test process namespace does not help either,
because it is trivial for the test to do unshare(CLONE_NEWIPC) and get
a fresh namespace (which the repro in fact does).

The limits still seem to make things somewhat better for syzkaller,
because any namespaces that a test creates are short-lived. But this
looks like a general resource isolation issue for containers.
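
To make the limit arithmetic concrete, here is a rough sketch (not
kernel code) of which semget() requests a given kernel.sem setting
would admit. The checks follow my reading of ipc/sem.c and should be
treated as an assumption. SemLimits, parse_kernel_sem and
semget_result are made-up names, and the "stock" value for a fresh
ipcns is what I believe the defaults are, not something verified
against this tree.

```python
# Sketch only: models the admission checks for creating a new semaphore
# array, under the ASSUMPTION that they are:
#   nsems <= 0 or nsems > SEMMSL       -> EINVAL
#   used_sems + nsems > SEMMNS         -> ENOSPC
#   number of arrays already at SEMMNI -> ENOSPC
# All names below are hypothetical helpers, not kernel or libc APIs.
from dataclasses import dataclass


@dataclass
class SemLimits:
    semmsl: int  # max semaphores per array
    semmns: int  # max semaphores in the whole ipc namespace
    semopm: int  # max operations per semop() call
    semmni: int  # max semaphore arrays in the ipc namespace


def parse_kernel_sem(value: str) -> SemLimits:
    """Parse the four whitespace-separated fields of kernel.sem."""
    fields = [int(f) for f in value.split()]
    if len(fields) != 4:
        raise ValueError("kernel.sem takes exactly four fields")
    return SemLimits(*fields)


def semget_result(limits: SemLimits, nsems: int,
                  used_sems: int = 0, used_arrays: int = 0) -> str:
    """Return 'ok' or the errno name a new array of nsems would likely hit."""
    if nsems <= 0 or nsems > limits.semmsl:
        return "EINVAL"
    if used_sems + nsems > limits.semmns:
        return "ENOSPC"
    if used_arrays >= limits.semmni:
        return "ENOSPC"
    return "ok"


# Limits proposed in this thread.
proposed = parse_kernel_sem("1024 1048576 500 1024")
# ASSUMED defaults that a freshly unshared ipcns starts with.
stock = parse_kernel_sem("32000 1024000000 500 32000")

# The repro asks for 0x4007 semaphores per array.
print(semget_result(proposed, 0x4007))
print(semget_result(stock, 0x4007))
```

If that reading is right, the repro's 0x4007-semaphore request fails
with EINVAL under the proposed limits, but passes again after
unshare() when the new namespace starts from the stock values, which
is the isolation problem described above.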