Subject: Re: [RFC PATCH v2 17/27] x86/cet/shstk: User-mode shadow stack support
From: Andy Lutomirski
Date: Wed, 11 Jul 2018 14:34:10 -0700
Cc: yu-cheng.yu@intel.com, the arch/x86 maintainers, "H .
 Peter Anvin", Thomas Gleixner, Ingo Molnar, kernel list,
 linux-doc@vger.kernel.org, Linux-MM, linux-arch, Linux API, Arnd Bergmann,
 bsingharora@gmail.com, Cyrill Gorcunov, Dave Hansen, Florian Weimer,
 hjl.tools@gmail.com, Jonathan Corbet, keescook@chromiun.org, Mike Kravetz,
 Nadav Amit, Oleg Nesterov, Pavel Machek, Peter Zijlstra,
 ravi.v.shankar@intel.com, vedvyas.shanbhogue@intel.com
Message-Id: <6F5FEFFD-0A9A-4181-8D15-5FC323632BA6@amacapital.net>
References: <20180710222639.8241-1-yu-cheng.yu@intel.com>
 <20180710222639.8241-18-yu-cheng.yu@intel.com>
To: Jann Horn
X-Mailing-List: linux-kernel@vger.kernel.org

> On Jul 11, 2018, at 2:10 PM, Jann Horn wrote:
>
>> On Tue, Jul 10, 2018 at 3:31 PM Yu-cheng Yu wrote:
>>
>> This patch adds basic shadow stack enabling/disabling routines.
>> A task's shadow stack is allocated from memory with VM_SHSTK
>> flag set and read-only protection. The shadow stack is
>> allocated to a fixed size.
>>
>> Signed-off-by: Yu-cheng Yu
> [...]
>> diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
>> new file mode 100644
>> index 000000000000..96bf69db7da7
>> --- /dev/null
>> +++ b/arch/x86/kernel/cet.c
> [...]
>> +static unsigned long shstk_mmap(unsigned long addr, unsigned long len)
>> +{
>> +	struct mm_struct *mm = current->mm;
>> +	unsigned long populate;
>> +
>> +	down_write(&mm->mmap_sem);
>> +	addr = do_mmap(NULL, addr, len, PROT_READ,
>> +		       MAP_ANONYMOUS | MAP_PRIVATE, VM_SHSTK,
>> +		       0, &populate, NULL);
>> +	up_write(&mm->mmap_sem);
>> +
>> +	if (populate)
>> +		mm_populate(addr, populate);
>> +
>> +	return addr;
>> +}
>
> How does this interact with UFFDIO_REGISTER?
>
> Is there an explicit design decision on whether FOLL_FORCE should be
> able to write to shadow stacks? I'm guessing the answer is "yes,
> FOLL_FORCE should be able to write to shadow stacks"?
> It might make sense to add documentation for this.

FOLL_FORCE should be able to write them, IMO. Otherwise we’ll need a whole new debugging API.

By the time an attacker can do FOLL_FORCE writes, the attacker can directly modify *text*, and CET is useless. We should probably audit all uses of FOLL_FORCE and remove as many as we can get away with.

>
> Should the kernel enforce that two shadow stacks must have a guard
> page between them so that they can not be directly adjacent, so that
> if you have too much recursion, you can't end up corrupting an
> adjacent shadow stack?

I think the answer is a qualified “no”. I would like to instead enforce a general guard page on all mmaps that don’t use MAP_FORCE. We *might* need to exempt any mmap with an address hint for compatibility.

My commercial software has been manually adding guard pages on every single mmap done by tcmalloc for years, and it has caught a couple bugs and costs essentially nothing.

Hmm. Linux should maybe add something like Windows’ “reserved” virtual memory. It’s basically a way to ask for a VA range that explicitly contains nothing and can subsequently be turned into something useful with the equivalent of MAP_FORCE.

>
>> +int cet_setup_shstk(void)
>> +{
>> +	unsigned long addr, size;
>> +
>> +	if (!cpu_feature_enabled(X86_FEATURE_SHSTK))
>> +		return -EOPNOTSUPP;
>> +
>> +	size = in_ia32_syscall() ? SHSTK_SIZE_32:SHSTK_SIZE_64;
>> +	addr = shstk_mmap(0, size);
>> +
>> +	/*
>> +	 * Return actual error from do_mmap().
>> +	 */
>> +	if (addr >= TASK_SIZE_MAX)
>> +		return addr;
>> +
>> +	set_shstk_ptr(addr + size - sizeof(u64));
>> +	current->thread.cet.shstk_base = addr;
>> +	current->thread.cet.shstk_size = size;
>> +	current->thread.cet.shstk_enabled = 1;
>> +	return 0;
>> +}
> [...]
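For illustration, manually adding guard pages around an mmap from user space could look roughly like the sketch below. This is an assumption-laden example, not the code from the software mentioned above and not part of the patch: the helper names mmap_with_guards() and munmap_with_guards() are hypothetical, and the layout (one PROT_NONE page on each side of the usable region) is just one reasonable choice.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round len up to a whole number of pages. */
static size_t round_up_to_page(size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);

	return (len + page - 1) & ~(page - 1);
}

/*
 * Hypothetical helper: map len usable bytes with one inaccessible
 * guard page directly below and one directly above them.
 * Returns the usable address, or MAP_FAILED on error.
 */
static void *mmap_with_guards(size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t usable = round_up_to_page(len);
	char *base;

	assert(len > 0);

	/* Map the whole range, guards included, with no access at all. */
	base = mmap(NULL, usable + 2 * page, PROT_NONE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return MAP_FAILED;

	/* Open up only the middle; the first and last page stay PROT_NONE. */
	if (mprotect(base + page, usable, PROT_READ | PROT_WRITE) != 0) {
		munmap(base, usable + 2 * page);
		return MAP_FAILED;
	}
	return base + page;
}

/* Hypothetical helper: unmap a region from mmap_with_guards(), guards included. */
static int munmap_with_guards(void *addr, size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);

	return munmap((char *)addr - page, round_up_to_page(len) + 2 * page);
}
```

With this layout, an overrun or underrun that walks off either end of the usable region faults on the guard page instead of silently landing in a neighbouring mapping. The cost is two extra pages of address space per mapping, with no physical memory behind them.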
>> +void cet_disable_free_shstk(struct task_struct *tsk)
>> +{
>> +	if (!cpu_feature_enabled(X86_FEATURE_SHSTK) ||
>> +	    !tsk->thread.cet.shstk_enabled)
>> +		return;
>> +
>> +	if (tsk == current)
>> +		cet_disable_shstk();
>> +
>> +	/*
>> +	 * Free only when tsk is current or shares mm
>> +	 * with current but has its own shstk.
>> +	 */
>> +	if (tsk->mm && (tsk->mm == current->mm) &&
>> +	    (tsk->thread.cet.shstk_base)) {
>> +		vm_munmap(tsk->thread.cet.shstk_base,
>> +			  tsk->thread.cet.shstk_size);
>> +		tsk->thread.cet.shstk_base = 0;
>> +		tsk->thread.cet.shstk_size = 0;
>> +	}
>> +
>> +	tsk->thread.cet.shstk_enabled = 0;
>> +}
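
As a rough illustration of the "reserved" virtual memory idea above, the closest thing user space can do on Linux today is probably something like the sketch below. It assumes that the MAP_FORCE mentioned in the reply corresponds to mmap's MAP_FIXED flag, which is an interpretation and not something the message states. The helper names va_reserve() and va_commit() are hypothetical, and a PROT_NONE mapping only approximates a range that "explicitly contains nothing": it is still an ordinary mapping as far as the kernel is concerned.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: reserve len bytes of address space.
 * The range is inaccessible and, with MAP_NORESERVE, is not
 * supposed to be charged against swap or overcommit accounting.
 */
static void *va_reserve(size_t len)
{
	return mmap(NULL, len, PROT_NONE,
		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
}

/*
 * Hypothetical helper: turn part of a reserved range into usable
 * memory by mapping over it with MAP_FIXED. The caller must make
 * sure [addr, addr + len) is page-aligned and lies inside a range
 * obtained from va_reserve(); MAP_FIXED silently replaces whatever
 * was mapped there before.
 */
static void *va_commit(void *addr, size_t len)
{
	assert(addr != NULL && len > 0);

	return mmap(addr, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
}
```

A caller would reserve a large range once, then commit pieces of it as needed. Because nothing else can be placed inside the reserved range by an ordinary mmap without MAP_FIXED, the later MAP_FIXED calls cannot clobber an unrelated mapping as long as they stay within that range.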