From: Kees Cook
Date: Fri, 9 Nov 2018 17:21:56 -0600
Subject: Re: afaef01c00 ("x86/entry: Add STACKLEAK erasing the kernel stack .."): double fault: 0000 [#1]
To: Alexander Popov
Cc: Andy Lutomirski, Jann Horn, Joerg Roedel, Andy Lutomirski, Ingo Molnar, Thomas Gleixner, LKP, kbuild test robot, Kernel Hardening, "open list:DOCUMENTATION", kernel list, Dave Hansen
References: <5be58a6e.w0IbLdKsiRknTygq%lkp@intel.com> <2B681F10-752C-4327-9960-3987CE17A619@amacapital.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Nov 9, 2018 at 5:09 PM, Alexander Popov wrote:
>
> On 09.11.2018 23:46, Andy Lutomirski wrote:
>>> On Nov 9, 2018, at 12:06 PM, Jann Horn wrote:
>>>
>>> +Andy, Thomas, Ingo
>>>
>>>> On Fri, Nov 9, 2018 at 2:24 PM kernel test robot wrote:
>>>> 0day kernel testing robot got the below dmesg and the first bad commit is
>>>>
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>>>>
>>>> commit afaef01c001537fa97a25092d7f54d764dc7d8c1
>>>> Author: Alexander Popov
>>>> AuthorDate: Fri Aug 17 01:16:58 2018 +0300
>>>> Commit: Kees Cook
>>>> CommitDate: Tue Sep 4 10:35:47 2018 -0700
>>>>
>>>> x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
>>> [...]
>>>> [ 127.808225] double fault: 0000 [#1]
>>>> [ 127.808695] CPU: 0 PID: 414 Comm: trinity-main Tainted: G T 4.19.0-rc2-00001-gafaef01 #1
>>>> [ 127.809799] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>>>> [ 127.810760] RIP: 0010:ftrace_ops_test+0x27/0xa0
>>>> [ 127.811289] Code: eb 9a 90 41 54 55 49 89 f4 53 48 89 d3 48 89 fd 48 81 ec b0 00 00 00 65 48 8b 04 25 28 00 00 00 48 89 84 24 a8 00 00 00 31 c0 54 df ff ff 48 85 db 74 57 e8 4a df ff ff 48 8b 85 d0 00 00 00
>>>> [ 127.813385] RSP: 0018:fffffe0000001fb8 EFLAGS: 00010046
>>> [...]
>>>> [ 127.819762] CR2: fffffe0000001fa8 CR3: 000000001579a000 CR4: 00000000000006b0
>>> [...]
>>>> [ 127.822234] Call Trace:
>>>> [ 127.822530]
>>>> [ 127.822914] ? __ia32_sys_rseq+0x2f0/0x2f0
>>>> [ 127.823395] ftrace_ops_list_func+0xa5/0x1b0
>>>> [ 127.823922] ftrace_call+0x5/0x34
>>>> [ 127.824318] ? stackleak_erase+0x5/0xf0
>>>> [ 127.824789] ? stackleak_erase+0x43/0xf0
>>>> [ 127.825260] stackleak_erase+0x5/0xf0
>>>> [ 127.825699] syscall_return_via_sysret+0x61/0x81
>>>> [ 127.826238] WARNING: stack recursion on stack type 4
>>>> [ 127.826243] WARNING: can't dereference registers at (____ptrval____) for ip syscall_return_via_sysret+0x61/0x81
>>>> [ 127.826246]
>>>> [ 127.828342] ---[ end trace e9f96d3f45575499 ]---
>>>> [ 127.828911] RIP: 0010:ftrace_ops_test+0x27/0xa0
>>>
>>> CR2: fffffe0000001fa8, RSP: 0018:fffffe0000001fb8; this is a pagefault
>>> on the stack. fffffe0000000000 is CPU_ENTRY_AREA_RO_IDT;
>>> fffffe0000001000 is CPU_ENTRY_AREA_PER_CPU; so fffffe0000002000 is the
>>> page with the entry stack for cpu 0, and you overflowed from that into
>>> the readonly gdt at fffffe0000001000, which doubles as a guard page
>>> for the entry stack:
>>>
>>> struct cpu_entry_area {
>>>         char gdt[PAGE_SIZE];
>>>
>>>         /*
>>>          * The GDT is just below entry_stack and thus serves (on x86_64) as
>>>          * a a read-only guard page.
>>>          */
>>>         struct entry_stack_page entry_stack_page;
>>> [...]
>>> };
>>>
>>> In other words: You're calling C code on the entry trampoline stack;
>>> this C code can call into ftrace; and the entry trampoline stack isn't
>>> big enough for ftrace shenanigans. I think you probably shouldn't be
>>> calling C code on the entry stack, but maybe one of the X86 folks has
>>> a different opinion?
>>
>> My opinion was that, on x86_32, the entry stack ought to be fairly large so
>> that NMIs could execute on the entry stack. I don’t remember what the code
>> actually does, though.
>>
>> But stackleak_erase should probably not run on the entry stack. That seems
>> like it’s just asking for trouble.
>
> Hello Jann and Andy,
>
>
> The stackleak_erase() function is called on the trampoline stack at the end of
> syscall, it erases the used part of the kernel thread stack after the syscall is
> handled.
>
>
> I've reproduced such a double fault with function tracing for stackleak_erase():
>
> # mount -t tracefs nodev /sys/kernel/tracing
> # echo 'p:myprobe stackleak_erase' > /sys/kernel/debug/tracing/kprobe_events
> # echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable
>
>
> I think we should simply not allow function tracing for stackleak_*() functions:
>
> diff --git a/kernel/Makefile b/kernel/Makefile
> index 7343b3a..0906f6d 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -18,6 +18,7 @@ obj-$(CONFIG_MULTIUSER) += groups.o
>  ifdef CONFIG_FUNCTION_TRACER
>  # Do not trace internal ftrace files
>  CFLAGS_REMOVE_irq_work.o = $(CC_FLAGS_FTRACE)
> +CFLAGS_REMOVE_stackleak.o = $(CC_FLAGS_FTRACE)
>  endif

Yeah, that's what I was suspecting on IRC. This looks like the right
fix. Can you send that to me as a "regular" patch with changelog, etc,
and I'll send it up to Linus.

Reviewed-by: Kees Cook

Thanks for everyone's attention on this! I've been travelling this
week, so I've been a little slow. :)

-Kees

>
>
> With this patch setting kprobe event for stackleak_erase() is not allowed. This
> is the corresponding dmesg output:
> [ 75.660478] trace_kprobe: Could not probe notrace function stackleak_erase
>
>
> If you agree, I'll prepare the patch for LKML.
>
> Best regards,
> Alexander

-- 
Kees Cook