In-Reply-To: <45c62101c065ed7e728fadac7207866bf8c36ec4.camel@intel.com>
References: <086c73d8-9b06-f074-e315-9964eb666db9@intel.com>
 <0e9996bc-4c1b-cc99-9616-c721b546f857@intel.com>
 <4f2dfefc-b55e-bf73-f254-7d95f9c67e5c@intel.com>
 <20200901102758.GY6642@arm.com>
 <32005d57-e51a-7c7f-4e86-612c2ff067f3@intel.com>
 <46dffdfd-92f8-0f05-6164-945f217b0958@intel.com>
 <6e1e22a5-1b7f-2783-351e-c8ed2d4893b8@intel.com>
 <5979c58d-a6e3-d14d-df92-72cdeb97298d@intel.com>
 <08c91835-8486-9da5-a7d1-75e716fc5d36@intel.com>
 <41aa5e8f-ad88-2934-6d10-6a78fcbe019b@intel.com>
 <45c62101c065ed7e728fadac7207866bf8c36ec4.camel@intel.com>
Date: Mon, 20 Sep 2021 09:48:10 -0700
From: "Andy Lutomirski"
To: "Rick P Edgecombe", "Dave Hansen"
Cc: "Balbir Singh", "H. Peter Anvin", "Eugene Syromiatnikov",
 "Peter Zijlstra (Intel)", "Randy Dunlap", "Kees Cook", "Yu-cheng Yu",
 "Dave Hansen", "linux-mm@kvack.org", "Florian Weimer", "Nadav Amit",
 "Jann Horn", "linux-arch@vger.kernel.org", "kcc@google.com",
 "Borislav Petkov", "Oleg Nesterov", "H.J. Lu", "Pavel Machek",
 "linux-doc@vger.kernel.org", "Weijiang Yang", "Arnd Bergmann",
 "Moreira, Joao", "Thomas Gleixner", "Mike Kravetz",
 "the arch/x86 maintainers", "tarasmadan@google.com", "Dave Martin",
 "vedvyas.shanbhogue@intel.com", "Ingo Molnar", "Shankar, Ravi V",
 "Jonathan Corbet", "Linux Kernel Mailing List", "Linux API",
 "Cyrill Gorcunov"
Subject: Re: [NEEDS-REVIEW] Re: [PATCH v11 25/25] x86/cet/shstk: Add arch_prctl functions for shadow stack
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Sep 13, 2021, at 6:33 PM, Edgecombe, Rick P wrote:
> On Mon, 2020-09-14 at 11:31 -0700, Andy Lutomirski wrote:
> > > On Sep 14, 2020, at 7:50 AM, Dave Hansen wrote:
> > >
> > > On 9/11/20 3:59 PM, Yu-cheng Yu wrote:
> > > ...
> > > > Here are the changes if we take the mprotect(PROT_SHSTK)
> > > > approach.
> > > > Any comments/suggestions?
> > >
> > > I still don't like it. :)
> > >
> > > I'll also be much happier when there's a proper changelog to
> > > accompany this which also spells out the alternatives and why
> > > they suck so much.
> > >
> >
> > Let’s take a step back here.  Ignoring the precise API, what
> > exactly is a shadow stack from the perspective of a Linux user
> > program?
> >
> > The simplest answer is that it’s just memory that happens to have
> > certain protections.  This enables all kinds of shenanigans.  A
> > program could map a memfd twice, once as shadow stack and once as
> > non-shadow-stack, and change its control flow.  Similarly, a program
> > could mprotect its shadow stack, modify it, and mprotect it back.  In
> > some threat models, those could be seen as a WRSS bypass.  (Although
> > if an attacker can coerce a process to call mprotect(), the game is
> > likely mostly over anyway.)
> >
> > But we could be more restrictive, or perhaps we could allow user code
> > to opt into more restrictions.  For example, we could have shadow
> > stacks be special memory that cannot be written from usermode by any
> > means other than ptrace() and friends, WRSS, and actual shadow stack
> > usage.
> >
> > What is the goal?
> >
> > No matter what we do, the effects of calling vfork() are going to be
> > a bit odd with SHSTK enabled.  I suppose we could disallow this, but
> > that seems likely to cause its own issues.
>
> Hi,
>
> Resurrecting this old thread to highlight a consequence of the design
> change that came out of it. I am going to be taking over this series
> from Yu-cheng, and wanted to check if people would be interested in
> revisiting this interface.
>
> The consequence I wanted to highlight is that making userspace
> responsible for mapping memory as shadow stack also requires moving
> the writing of the restore token to userspace for glibc ucontext
> operations. Since these operations involve creating/pivoting to new
> stacks in userspace, ucontext CET support also involves creating a new
> shadow stack. For normal thread stacks, the kernel has always done the
> shadow stack allocation and so it is never writable (in the normal
> sense) from userspace. But after this change makecontext() now first
> has to mmap() writable memory, then write the restore token, then
> mprotect() it as shadow stack. See the glibc changes to support
> PROT_SHADOW_STACK here[0].
>
> The writable window leaves an opening for an attacker to create an
> arbitrary shadow stack that could be pivoted to later by tweaking the
> ucontext_t structure. To try to see how much this matters, we have done
> a small test that uses this window to ROP from writes in another
> thread during the makecontext()/setcontext() window. (Offensive work
> credit to Joao on CC.)
> This would require a real app to already be using ucontext in the
> course of normal runtime.

My general opinion here (take this with a grain of salt -- I haven't
paged back in every single detail) is that the kernel should make it
straightforward for a libc to do the right thing without nasty races,
cross-thread coordination, or unnecessary permission to write to the
stack.  I *also* think that it should be possible for userspace to
manage its own shadow stack allocation if it wants to, since I'm sure
there will be JIT or green thread or other use cases that want to do
crazy things that we fail to anticipate with in-kernel magic.

So perhaps we should keep the explicit allocation and free operations,
have a way to opt in to WRSS being flipped on, but also do our best to
have APIs that handle the known cases well.

Does that make sense?  Can we have both approaches work in the same
kernel?
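
For anyone skimming, the three-step makecontext() sequence quoted above
(mmap() writable, write the restore token, mprotect() as shadow stack)
can be sketched roughly as below.  This is only an illustration under
stated assumptions, not the glibc code: sketch_alloc_shstk() and
SKETCH_PROT_SEALED are made-up names, PROT_READ stands in for the
proposed PROT_SHADOW_STACK (which mainline headers do not define), and
the token layout (address just above the token, with bit 0 set for
64-bit mode) is my reading of the restore token format.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Stand-in only: PROT_SHADOW_STACK comes from the proposed series and
 * is not in mainline headers, so this sketch seals with PROT_READ.
 */
#define SKETCH_PROT_SEALED PROT_READ

/*
 * Hypothetical helper, not a real glibc or kernel interface.
 * Returns the base of the new region, or NULL on failure.  On success
 * *token_out points at the slot holding the restore token.
 */
static void *sketch_alloc_shstk(size_t size, uint64_t **token_out)
{
	uint8_t *base;
	uint64_t *token;

	assert(token_out != NULL);

	if (size < sizeof(uint64_t) || size % sizeof(uint64_t))
		return NULL;

	/* Step 1: ordinary writable memory.  The racy window opens here. */
	base = mmap(NULL, size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return NULL;

	/*
	 * Step 2: write the restore token into the top slot.  Assumed
	 * layout: the address just above the token, with bit 0 set to
	 * mark 64-bit mode.  Another thread can still scribble on the
	 * region at this point.
	 */
	token = (uint64_t *)(base + size - sizeof(uint64_t));
	*token = (uint64_t)(uintptr_t)(base + size) | 1;

	/* Step 3: seal the region.  Only now does the window close. */
	if (mprotect(base, size, SKETCH_PROT_SEALED)) {
		munmap(base, size);
		return NULL;
	}

	*token_out = token;
	return base;
}
```

Everything between step 1 and step 3 is the window Rick is describing.
An in-kernel allocation that writes the token itself would collapse
those steps so that userspace never sees the memory as writable.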