Date: Thu, 30 Nov 2023 21:51:04 +0000
From: Mark Brown
To: Catalin Marinas
Cc: "Rick P. Edgecombe", Deepak Gupta, Szabolcs Nagy, "H.J. Lu",
 Florian Weimer, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, x86@kernel.org, "H. Peter Anvin", Peter Zijlstra,
 Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
 Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
 Valentin Schneider, Christian Brauner, Shuah Khan,
 linux-kernel@vger.kernel.org, Will Deacon, Kees Cook,
 jannh@google.com, linux-kselftest@vger.kernel.org,
 linux-api@vger.kernel.org, David Hildenbrand
Subject: Re: [PATCH RFT v4 0/5] fork: Support shadow stacks in clone3()
References: <20231128-clone3-shadow-stack-v4-0-8b28ffe4f676@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 30, 2023 at 07:00:58PM +0000, Catalin Marinas wrote:

> My hope when looking at the arm64 patches was that we can completely
> avoid the kernel allocation/deallocation of the shadow stack since it
> doesn't need to do this for the normal stack either. Could someone
> please summarise why we dropped the shadow stack pointer after v1? IIUC
> there was a potential security argument but I don't think it was a very
> strong one. Also what's the threat model for this feature? I thought
> it's mainly mitigating stack corruption. If some rogue code can do
> syscalls, we have bigger problems than clone3() taking a shadow stack
> pointer.

As well as preventing/detecting corruption of the in-memory stack,
shadow stacks also ensure that any return instruction is unwinding a
prior call instruction, and that returns are done in the opposite order
to the calls. This forces usage of the stack - any value we attempt to
RET to is going to be checked against the top of the shadow stack, which
makes chaining returns together as a substitute for branches harder.

The concern Rick raised was that allowing userspace to pick the exact
shadow stack pointer would allow it to corrupt or reuse the stack of an
existing thread by starting a new thread with the shadow stack pointing
into the existing shadow stack of that thread. While in isolation that's
not too much more than what userspace could just do directly anyway, it
might compose with other issues into something more "interesting" (eg,
I'd be a bit concerned about overlap with pkeys/POE though I've not
thought through potential uses in detail).

> I'm not against clone3() getting a shadow_stack_size argument but asking
> some more questions. If we won't pass a pointer as well, is there any
> advantage in expanding this syscall vs a specific prctl() option? Do we
> need a different size per thread or do all threads have the same shadow
> stack size? A new RLIMIT doesn't seem to map well though, it is more
> like an upper limit rather than a fixed/default size (glibc I think uses
> it for thread stacks but bionic or musl don't AFAIK).

I don't know what the userspace patterns are likely to be here. It's
possible a single value for each process might be fine but I couldn't
say that confidently. I agree that an RLIMIT does seem like a poor fit.

As well as the actual configuration of the size, the other thing we gain
is that, in addition to relying on heuristics to determine whether we
need to allocate a new shadow stack for the new thread, we allow
userspace to explicitly request a new shadow stack. For example, there
was a corner case mentioned (IIRC with posix_spawn()) where the
heuristics aren't what we want.

> Another dumb question on arm64 - is GCSPR_EL0 writeable by the user? If
> yes, can the libc wrapper for threads allocate a shadow stack via
> map_shadow_stack() and set it up in the thread initialisation handler
> before invoking the thread function?

No, GCSPR_EL0 can only be changed by EL0 through BL, RET and the new GCS
instructions (push/pop and stack switch). Push is optional - userspace
has to explicitly request that it be enabled and this could be prevented
through seccomp or some other LSM. The stack switch instructions require
a token at the destination address, which must either be written by a
higher EL or will be written in the process of switching away from a
stack so you can switch back. Unless I've missed one, every mechanism
for userspace to update GCSPR_EL0 will do a GCS memory access so,
provided guard pages have been allocated, wrapping to a different stack
will be prevented. We would need a syscall to allow GCSPR_EL0 to be
written.
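
The return check and the aliasing concern discussed above can be
illustrated with a toy model. This is only a sketch under stated
assumptions, not how the hardware or the kernel implements anything:
the names Thread, ShadowStackFault, overwritten_return_faults() and
aliasing_demo() are made up for illustration, the shadow stack is a
plain list that grows upwards, and "starting a thread with a chosen
shadow stack pointer" is modelled by passing an index into shared
memory.

```python
class ShadowStackFault(Exception):
    """Raised when a return address does not match the shadow stack."""


class Thread:
    """Toy thread with an ordinary stack and a shadow stack.

    `shadow` stands in for shadow stack memory and `sp` is the index of
    the next free slot. Only call() writes to the shadow stack.
    """

    def __init__(self, shadow, sp):
        self.stack = []       # ordinary stack, writable by any store
        self.shadow = shadow  # shadow stack memory (may be shared)
        self.sp = sp          # shadow stack pointer

    def call(self, return_addr):
        # A call records the return address on both stacks.
        self.stack.append(return_addr)
        self.shadow[self.sp] = return_addr
        self.sp += 1

    def ret(self):
        # A return pops both stacks and faults if they disagree.
        addr = self.stack.pop()
        self.sp -= 1
        if self.shadow[self.sp] != addr:
            raise ShadowStackFault(hex(addr))
        return addr


def overwritten_return_faults():
    """Corrupt the in-memory return address and report whether RET faults."""
    t = Thread([None] * 8, 0)
    t.call(0x1000)
    t.stack[-1] = 0xBAD
    try:
        t.ret()
    except ShadowStackFault:
        return True
    return False


def aliasing_demo():
    """Model a new thread whose shadow stack pointer aliases a victim's."""
    memory = [None] * 8
    victim = Thread(memory, 0)
    victim.call(0x1000)

    # Hypothetical: the new thread starts with its shadow stack pointer
    # inside the victim's shadow stack.
    intruder = Thread(memory, 1)
    intruder.stack.append(0x1000)  # forged frame on the ordinary stack
    reused = intruder.ret()        # check passes using the victim's entry

    intruder.call(0x2000)          # overwrites the victim's entry
    try:
        victim.ret()
        victim_faulted = False
    except ShadowStackFault:
        victim_faulted = True
    return reused, victim_faulted
```

In this model aliasing_demo() returns the address the second thread was
able to return to without ever calling, together with a flag saying
whether the first thread's legitimate return then faulted. These
correspond to the "reuse" and "corrupt" cases respectively.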