Date: Fri, 27 Oct 2023 16:24:18 -0700
From: Deepak Gupta
To: "Szabolcs.Nagy@arm.com"
Cc: Mark Brown, "Edgecombe, Rick P", "dietmar.eggemann@arm.com",
 "keescook@chromium.org", "brauner@kernel.org", "shuah@kernel.org",
 "mgorman@suse.de", "dave.hansen@linux.intel.com", "fweimer@redhat.com",
 "linux-kernel@vger.kernel.org", "vincent.guittot@linaro.org",
 "hjl.tools@gmail.com", "rostedt@goodmis.org", "mingo@redhat.com",
 "tglx@linutronix.de", "vschneid@redhat.com", "catalin.marinas@arm.com",
 "bristot@redhat.com", "will@kernel.org", "hpa@zytor.com",
 "peterz@infradead.org", "jannh@google.com", "bp@alien8.de",
 "bsegall@google.com", "linux-kselftest@vger.kernel.org",
 "linux-api@vger.kernel.org", "x86@kernel.org", "juri.lelli@redhat.com"
Subject: Re: [PATCH RFC RFT 2/5] fork: Add shadow stack support to clone3()
References: <20231023-clone3-shadow-stack-v1-0-d867d0b5d4d0@kernel.org>
 <20231023-clone3-shadow-stack-v1-2-d867d0b5d4d0@kernel.org>
 <8b0c9332-ba56-4259-a71f-9789d28391f1@sirena.org.uk>
 <2ec0be71ade109873445a95f3f3c107711bb0943.camel@intel.com>
 <807a8142-7a8e-4563-9859-8e928156d7e5@sirena.org.uk>

On Fri, Oct 27, 2023 at 12:49:59PM +0100, Szabolcs.Nagy@arm.com wrote:
>The 10/26/2023 13:40, Deepak Gupta wrote:
>> On Thu, Oct 26, 2023 at 06:53:37PM +0100, Mark Brown wrote:
>> > I'm not sure placement control is essential but the other bit of it is
>> > the freeing of the shadow stack, especially if userspace is doing stack
>> > switches the current behaviour where we free the stack when the thread
>> > is exiting doesn't feel great exactly. It's mainly an issue for
>> > programs that pivot stacks which isn't the common case but it is a
>> > general sharp edge.
>>
>> In general, I am assuming such placement requirements emanate because
>> regular stack holds data (local args, etc) as well and thus software may
>> make assumptions about how stack frame is prepared and may worry about
>> layout and such. In case of shadow stack, it can only hold return
>
>no. the lifetime is the issue: a stack in principle can outlive
>a thread and resumed even after the original thread exited.
>for that to work the shadow stack has to outlive the thread too.
>

I understand that an application can pre-allocate a pool of stacks and
re-use them whenever it spawns new threads using the clone3 system call.

However, once a new thread has been spawned, how can it resume? By
resume I mean consume the callstack context from an earlier thread. Or
did you mean something else by `resume` here?

Can you give an example of an application or runtime where a newly
created thread consumes callstack context created by a going-away
thread?

>(or the other way around: a stack can be freed before the thread
>exits, if the thread pivots away from that stack.)

This is simply a thread saying that it is moving to a different stack.
Again, I am interested in learning why a thread would do that. If I have
to speculate on reasons, I could think of a user runtime managing its
own pool of worker items (some people call them green threads), or the
current stack becoming too small.

JIT runtimes (and things like goroutines) do such things, but in those
cases the kernel has no idea about it. From the kernel's perspective
there is a main thread stack (the hosting thread for the JIT), and the
main thread can then decide to switch stacks to execute JITted code. But
in that case all it needs is a shadow stack; managing the lifetime of
such a shadow stack using `clone` wouldn't be helpful, and perhaps
`map_shadow_stack` should be used to create a shadow stack on the fly.

Another case I can think of is a thread moving to a different stack
because the current stack is too small and it wants more memory.
In such cases as well, I imagine that particular thread would be issuing
`mmap` to allocate the larger memory, and thus that particular thread
can very well issue `map_shadow_stack` too.

In both of these cases, a stack free actually means the thread
(application) issuing a system call to free the going-away stack memory.
It can free the going-away shadow stack memory in the same way using
`unmap_shadow_stack`.

Let me know if I misunderstood something or am missing some other use
case of a stack being freed before the thread exits.

>
>posix threads etc. don't allow this, but the linux syscall abi
>(clone) does allow it.
>
>i think it is reasonable to tie the shadow stack lifetime to the
>thread lifetime, but this clearly introduces a limitation on how
>the clone api can be used. such constraint on the userspace
>programming model is normally a bad decision, but given that most
>software (including all posix conforming code) is not affected,
>i think it is acceptable for an opt-in feature like shadow stack.
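
To make the "thread allocates and frees its own stacks" case above
concrete, here is a rough sketch of userspace pairing an `mmap`ed data
stack with a shadow stack from `map_shadow_stack`, and later releasing
both itself. Several things are assumptions for illustration only:
`struct stack_pair`, `stack_pair_alloc()` and `stack_pair_free()` are
hypothetical names, the fallback syscall number 453 and the
`SHADOW_STACK_SET_TOKEN` value are only used when the installed headers
lack them, and the free path uses plain `munmap()`, which is how
shadow stack mappings are released on mainline x86 as far as I know,
since there is no dedicated unmap call there. On kernels or CPUs without
shadow stack support the sketch simply leaves the shadow stack pointer
NULL.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 453 is the map_shadow_stack syscall number; only used
 * when the installed headers do not define it. */
#ifndef __NR_map_shadow_stack
#define __NR_map_shadow_stack 453
#endif

/* Assumption: x86 UAPI flag asking the kernel to write a restore token
 * at the top of the new shadow stack. */
#ifndef SHADOW_STACK_SET_TOKEN
#define SHADOW_STACK_SET_TOKEN (1ULL << 0)
#endif

/* Hypothetical bookkeeping for one data stack plus its shadow stack. */
struct stack_pair {
    void *stack;   /* regular stack, from mmap() */
    void *shstk;   /* shadow stack, NULL if unsupported */
    size_t size;   /* size requested for both mappings */
};

/* Allocate both stacks. Returns 0 if the data stack was mapped; the
 * shadow stack is best effort and stays NULL without kernel support. */
static int stack_pair_alloc(struct stack_pair *sp, size_t size)
{
    assert(sp != NULL && size > 0);
    sp->stack = NULL;
    sp->shstk = NULL;
    sp->size = size;

    void *stack = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED)
        return -1;
    sp->stack = stack;

    /* addr = 0 lets the kernel pick the placement. */
    long ret = syscall(__NR_map_shadow_stack, 0UL, (unsigned long)size,
                       (unsigned int)SHADOW_STACK_SET_TOKEN);
    if (ret != -1)
        sp->shstk = (void *)ret;
    return 0;
}

/* Free both stacks from userspace, independent of thread exit. */
static int stack_pair_free(struct stack_pair *sp)
{
    assert(sp != NULL);
    int rc = 0;

    if (sp->shstk != NULL && munmap(sp->shstk, sp->size) != 0)
        rc = -1;
    if (sp->stack != NULL && munmap(sp->stack, sp->size) != 0)
        rc = -1;
    sp->stack = NULL;
    sp->shstk = NULL;
    return rc;
}
```

The actual pivot onto the new pair is architecture specific and is not
shown. The point is only that allocation and release are both explicit
userspace calls here, so the kernel never has to tie the lifetime of
these mappings to a thread.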