Date: Tue, 21 Nov 2023 11:17:45 +0100
From: Christian Brauner
To: Mark Brown
Cc: "Rick P. Edgecombe", Deepak Gupta, Szabolcs Nagy, "H.J. Lu",
    Florian Weimer, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    Dave Hansen, x86@kernel.org, "H. Peter Anvin", Peter Zijlstra,
    Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
    Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
    Valentin Schneider, Shuah Khan, linux-kernel@vger.kernel.org,
    Catalin Marinas, Will Deacon, Kees Cook, jannh@google.com,
    linux-kselftest@vger.kernel.org, linux-api@vger.kernel.org,
    David Hildenbrand
Subject: Re: [PATCH RFT v3 0/5] fork: Support shadow stacks in clone3()
Message-ID: <20231121-urlaub-motivieren-c9d7ee1a6058@brauner>
In-Reply-To: <20231120-clone3-shadow-stack-v3-0-a7b8ed3e2acc@kernel.org>

On Mon, Nov 20, 2023 at 11:54:28PM +0000, Mark Brown wrote:
> The kernel has recently added support for shadow stacks, currently
> x86 only using their CET feature but both arm64 and RISC-V have
> equivalent features (GCS and Zicfiss respectively), I am actively
> working on GCS[1].
> With shadow stacks the hardware maintains an
> additional stack containing only the return addresses for branch
> instructions which is not generally writeable by userspace and ensures
> that any returns are to the recorded addresses.  This provides some
> protection against ROP attacks and making it easier to collect call
> stacks.  These shadow stacks are allocated in the address space of the
> userspace process.
>
> Our API for shadow stacks does not currently offer userspace any
> flexibility for managing the allocation of shadow stacks for newly
> created threads, instead the kernel allocates a new shadow stack with
> the same size as the normal stack whenever a thread is created with the
> feature enabled.  The stacks allocated in this way are freed by the
> kernel when the thread exits or shadow stacks are disabled for the
> thread.  This lack of flexibility and control isn't ideal, in the vast
> majority of cases the shadow stack will be over allocated and the
> implicit allocation and deallocation is not consistent with other
> interfaces.  As far as I can tell the interface is done in this manner
> mainly because the shadow stack patches were in development since before
> clone3() was implemented.
>
> Since clone3() is readily extensible let's add support for specifying a
> shadow stack when creating a new thread or process in a similar manner

So while I made clone3() readily extensible I don't want it to ever
devolve into a fancier version of prctl().

I would really like to see a strong reason for allowing userspace to
configure the shadow stack size at this point in time.

I have a few questions that probably stem from me not knowing much
about shadow stacks, so hopefully I'm not asking you to write a thesis
by accident:

(1) What does it mean for a shadow stack to be over-allocated, and is
    over-allocation really that much of a problem out in the wild that
    we need to give userspace a knob to control a kernel security
    feature?
(2) With what other interfaces is implicit allocation and deallocation
    not consistent? I don't understand this argument. The kernel creates
    a shadow stack as a security measure to store return addresses. It
    seems to me that this is exactly why the kernel should implicitly
    allocate and deallocate the shadow stack and not have userspace
    muck around with its size?
(3) Why is it safe for userspace to request the shadow stack size? What
    if they request a tiny shadow stack size? Should this interface
    require any privilege?
(4) Why isn't the @stack_size argument I added for clone3() enough?
    If it is specified, can't the size of the shadow stack be derived
    from it?

And my current main objection is that shadow stacks were just released
to userspace. There can't be a massive number of users yet, outside of
maybe early adopters. The fact that there are other architectures that
bring in a similar feature makes me even more hesitant. If they have all
agreed on _and_ implemented shadow stacks with unified semantics, then
we can consider exposing control knobs to userspace that aren't
implicitly architecture-specific, as they currently would be.

So I don't have anything against the patches per se, obviously, but
rather with the wider context.

Thanks!
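
To make question (4) concrete, here is a minimal sketch of how a shadow
stack size could be derived from the clone3() stack_size argument instead
of adding a separate knob. It is an illustration only and is not taken
from the patches: the helper names, the 4096-byte page size, the 4 GiB
cap and the fallback to the stack rlimit when no stack_size is given are
all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for this sketch; real code would query the system. */
#define SKETCH_PAGE_SIZE 4096ULL
/* Assumed cap on an implicitly sized shadow stack (4 GiB). */
#define SKETCH_MAX_IMPLICIT (4ULL << 30)

/* Round size up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t size, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: choose a shadow stack size for a new thread.
 *
 * stack_size   - the value userspace passed in clone_args.stack_size,
 *                or 0 if none was given
 * stack_rlimit - the caller's stack rlimit, used only as a fallback
 *                (an assumption of this sketch)
 */
static uint64_t derive_shadow_stack_size(uint64_t stack_size,
					 uint64_t stack_rlimit)
{
	uint64_t size = stack_size;

	if (size == 0) {
		/* No explicit stack size: fall back to the capped rlimit. */
		size = stack_rlimit < SKETCH_MAX_IMPLICIT ?
		       stack_rlimit : SKETCH_MAX_IMPLICIT;
	}

	/* Shadow stacks are mapped in whole pages. */
	return round_up_pow2(size, SKETCH_PAGE_SIZE);
}
```

With a scheme like this a caller that passes a small stack_size gets a
correspondingly small shadow stack, which is the behaviour question (4)
asks about, without userspace specifying the shadow stack size directly.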