Date: Thu, 20 Jul 2023 10:03:57 +0200
From: Peter Zijlstra
To: Pankaj Gupta
Cc: Sean Christopherson, Weijiang Yang, pbonzini@redhat.com, kvm@vger.kernel.org,
    linux-kernel@vger.kernel.org, rppt@kernel.org, binbin.wu@linux.intel.com,
    rick.p.edgecombe@intel.com, john.allen@amd.com, Chao Gao, Andrew Cooper
Subject: Re: [PATCH v3 00/21] Enable CET Virtualization
Message-ID: <20230720080357.GA3569127@hirez.programming.kicks-ass.net>
References: <20230511040857.6094-1-weijiang.yang@intel.com>
    <147246fc-79a2-3bb5-f51f-93dfc1cffcc0@intel.com>
    <20230719203658.GE3529734@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 20, 2023 at 07:26:04AM +0200, Pankaj Gupta wrote:
> > > My understanding is that PL[0-2]_SSP are used only on transitions to the
> > > corresponding privilege level from a *different* privilege level.  That means
> > > KVM should be able to utilize the user_return_msr framework to load the host
> > > values.
> > > Though if Linux ever supports SSS, I'm guessing the core kernel will
> > > have some sort of mechanism to defer loading MSR_IA32_PL0_SSP until an exit to
> > > userspace, e.g. to avoid having to write PL0_SSP, which will presumably be
> > > per-task, on every context switch.
> > >
> > > But note my original wording: **If that's necessary**
> > >
> > > If nothing in the host ever consumes those MSRs, i.e. if SSS is NOT enabled in
> > > IA32_S_CET, then running host stuff with guest values should be ok.  KVM only
> > > needs to guarantee that it doesn't leak values between guests.  But that should
> > > Just Work, e.g. KVM should load the new vCPU's values if SHSTK is exposed to the
> > > guest, and intercept (to inject #GP) if SHSTK is not exposed to the guest.
> > >
> > > And regardless of what the mechanism ends up managing SSP MSRs, it should only
> > > ever touch PL0_SSP, because Linux never runs anything at CPL1 or CPL2, i.e. will
> > > never consume PL{1,2}_SSP.
> >
> > To clarify, Linux will only use SSS in FRED mode -- FRED removes CPL1,2.
>
> Trying to understand more what prevents SSS to enable in pre FRED, Is
> it better #CP exception
> handling with other nested exceptions?

SSS took the syscall gap and made it worse -- as in *way* worse.

To top it off, the whole SSS busy bit thing is fundamentally
incompatible with how we manage to survive nested exceptions in NMI
context.

Basically, the whole x86 exception / stack switching logic was already
borderline impossible (consider taking an MCE in the early NMI path
where we set up, but have not finished, the re-entrancy stuff), and
SSS pushed it over the edge and set it on fire.

And NMI isn't the only problem, the various new virt exceptions #VC and
#HV are on their own already near impossible, adding SSS again pushes
the whole thing into clear insanity.

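The busy-bit conflict with nested NMI handling described above can be
illustrated with a toy model. This is a sketch in Python, not kernel
code, and it rests on a simplified reading of the CET rules: switching
to a supervisor shadow stack requires the token's busy bit to be clear
and then sets it, and the bit is released only when the handler
returns. All names (`ShadowStack`, `TokenBusy`, `deliver`, `iret`,
`nested_nmi`, `sequential_nmi`) are hypothetical and exist only for
this illustration.

```python
class TokenBusy(Exception):
    """Stands in for the fault raised when the token is already busy."""


class ShadowStack:
    """Toy supervisor shadow stack: only the token's busy bit is modelled."""

    def __init__(self, name):
        self.name = name
        self.busy = False

    def deliver(self):
        # Event delivery switches to this shadow stack: the token must be free.
        if self.busy:
            raise TokenBusy(f"{self.name}: token already busy")
        self.busy = True

    def iret(self):
        # Returning from the handler releases the token.
        self.busy = False


def nested_nmi(sss):
    """Deliver an NMI, then a second one before the first handler returns."""
    sss.deliver()          # first NMI enters its handler
    try:
        sss.deliver()      # second NMI arrives before the first has returned
    except TokenBusy:
        return "fault"
    return "ok"


def sequential_nmi(sss):
    """Deliver two NMIs back to back, each returning before the next."""
    for _ in range(2):
        sss.deliver()
        sss.iret()
    return "ok"
```

In this model, re-entering a handler whose shadow stack is still in use
cannot succeed, whereas the regular stack side of legacy NMI handling
relies on being able to re-enter and sort things out in software.
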
There's a good exposition of the whole trainwreck by Andrew here:

  https://www.youtube.com/watch?v=qcORS8CN0ow

(that is, sorry for the youtube link, but Google is failing me in
finding the actual Google Doc that talk is based on, or even the slide
deck :/)

FRED solves all that by:

 - removing the stack gap, cs/ip/ss/sp/ssp/gs will all be switched
   atomically and consistently for every transition.

 - removing the non-reentrant IST mechanism and replacing it with stack
   levels

 - adding an explicit NMI latch

 - re-organising the actual shadow stacks and doing away with that busy
   bit thing (I need to re-read the FRED spec on this detail again).

Crazy as we are, we're not touching legacy/IDT SSS with a ten foot pole,
sorry.
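
The difference between IST and stack levels in the list above can be
sketched as a toy comparison. It is a simplified model in Python, based
on one reading of the FRED spec rather than on anything stated in this
thread: an IST entry always restarts at the same fixed stack top, while
FRED takes the maximum of the current and the event's stack level and
switches stacks only when that moves to a higher level. The names
`IST_TOP`, `ist_entry_sp`, `fred_entry`, and `level_tops` are
hypothetical, and the rule for events arriving from ring 3 is left out.

```python
IST_TOP = 0x9000  # hypothetical fixed top-of-stack for one IST entry


def ist_entry_sp(current_sp):
    """IST-style delivery: always restart at the same fixed stack top.

    The current stack pointer is ignored, so a nested event of the same
    kind lands on top of the frame the first one is still using.
    """
    return IST_TOP


def fred_entry(current_level, current_sp, event_level, level_tops):
    """FRED-style delivery (simplified): returns (new_level, new_sp).

    The stack is switched only when the event raises the stack level;
    otherwise delivery continues on the current stack, below any frames
    that are already there.
    """
    new_level = max(current_level, event_level)
    if new_level > current_level:
        return new_level, level_tops[new_level]
    return new_level, current_sp
```

With `level_tops = {0: 0x1000, 1: 0x2000, 2: 0x3000, 3: 0x4000}`, a
level-2 event taken at level 0 moves to `0x3000`, while a second level-2
event taken while already at level 2 keeps the current stack pointer.
The IST variant returns `IST_TOP` in both cases.
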