Date: Wed, 19 Jul 2023 12:41:47 -0700
References: <20230511040857.6094-1-weijiang.yang@intel.com> <147246fc-79a2-3bb5-f51f-93dfc1cffcc0@intel.com>
Subject: Re: [PATCH v3 00/21] Enable CET Virtualization
From: Sean Christopherson
To: Weijiang Yang
Cc: pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com,
    rick.p.edgecombe@intel.com, john.allen@amd.com, Chao Gao
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 17, 2023, Weijiang Yang wrote:
>
> On 6/24/2023 4:51 AM, Sean Christopherson wrote:
> > > 1)Add Supervisor Shadow Stack state support(i.e., XSS.bit12(CET_S)) into
> > > kernel so that host can support guest Supervisor Shadow Stack MSRs in g/h FPU
> > > context switch.
> > If that's necessary for correct functionality, yes.

...

> the Pros:
>  - Super easy to implement for KVM.
>  - Automatically avoids saving and restoring this data when the vmexit
>    is handled within KVM.
>
> the Cons:
>  - Unnecessarily restores XFEATURE_CET_KERNEL when switching to
>    non-KVM task's userspace.
>  - Forces allocating space for this state on all tasks, whether or not
>    they use KVM, and with likely zero users today and the near future.
>  - Complicates the FPU optimization thinking by including things that
>    can have no affect on userspace in the FPU
>
> Given above reasons, I implemented guest CET supervisor states management
> in KVM instead of adding a kernel patch for it.
>
> Below are 3 KVM patches to support it:
>
> Patch 1: Save/reload guest CET supervisor states when necessary:
>
> =======================================================================
>
> commit 16147ede75dee29583b7d42a6621d10d55b63595
> Author: Yang Weijiang
> Date:   Tue Jul 11 02:26:17 2023 -0400
>
>     KVM:x86: Make guest supervisor states as non-XSAVE managed
>
>     Save and reload guest CET supervisor states, i.e.,PL{0,1,2}_SSP,
>     when vCPU context is being swapped before and after userspace
>     <->kernel entry, also do the same operation when vCPU is sched-in
>     or sched-out.

...

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e2c549f147a5..7d9cfb7e2fe8 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11212,6 +11212,31 @@ static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
>         trace_kvm_fpu(0);
>  }
>
> +static void kvm_save_cet_supervisor_ssp(struct kvm_vcpu *vcpu)
> +{
> +       preempt_disable();
> +       if (unlikely(guest_can_use(vcpu, X86_FEATURE_SHSTK))) {
> +               rdmsrl(MSR_IA32_PL0_SSP, vcpu->arch.cet_s_ssp[0]);
> +               rdmsrl(MSR_IA32_PL1_SSP, vcpu->arch.cet_s_ssp[1]);
> +               rdmsrl(MSR_IA32_PL2_SSP, vcpu->arch.cet_s_ssp[2]);
> +               wrmsrl(MSR_IA32_PL0_SSP, 0);
> +               wrmsrl(MSR_IA32_PL1_SSP, 0);
> +               wrmsrl(MSR_IA32_PL2_SSP, 0);
> +       }
> +       preempt_enable();
> +}
> +
> +static void kvm_reload_cet_supervisor_ssp(struct kvm_vcpu *vcpu)
> +{
> +       preempt_disable();
> +       if (unlikely(guest_can_use(vcpu, X86_FEATURE_SHSTK))) {
> +               wrmsrl(MSR_IA32_PL0_SSP, vcpu->arch.cet_s_ssp[0]);
> +               wrmsrl(MSR_IA32_PL1_SSP, vcpu->arch.cet_s_ssp[1]);
> +               wrmsrl(MSR_IA32_PL2_SSP, vcpu->arch.cet_s_ssp[2]);
> +       }
> +       preempt_enable();
> +}

My understanding is that PL[0-2]_SSP are used only on transitions to the
corresponding privilege level from a *different* privilege level.  That means
KVM should be able to utilize the user_return_msr framework to load the host
values.
Though if Linux ever supports SSS, I'm guessing the core kernel will have some
sort of mechanism to defer loading MSR_IA32_PL0_SSP until an exit to userspace,
e.g. to avoid having to write PL0_SSP, which will presumably be per-task, on
every context switch.

But note my original wording: **If that's necessary**

If nothing in the host ever consumes those MSRs, i.e. if SSS is NOT enabled in
IA32_S_CET, then running host stuff with guest values should be ok.  KVM only
needs to guarantee that it doesn't leak values between guests.  But that should
Just Work, e.g. KVM should load the new vCPU's values if SHSTK is exposed to the
guest, and intercept (to inject #GP) if SHSTK is not exposed to the guest.

And regardless of which mechanism ends up managing the SSP MSRs, it should only
ever touch PL0_SSP, because Linux never runs anything at CPL1 or CPL2, i.e. will
never consume PL{1,2}_SSP.

Am I missing something?
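
To make the "don't leak values between guests" rule concrete, here is a rough
userspace model of it.  This is only a sketch, not KVM code: model_vcpu, the
model_*() helpers and fake_pl0_ssp_msr are hypothetical names invented for
illustration, and no real MSR is touched.  It assumes the host never consumes
PL0_SSP (SSS not enabled), and it simplifies guest writes by updating the
per-vCPU copy directly, whereas a passthrough setup would presumably have to
read the MSR back when the vCPU is unloaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware PL0_SSP MSR (model only). */
static uint64_t fake_pl0_ssp_msr;

/* Hypothetical per-vCPU state; this is not KVM's struct kvm_vcpu. */
struct model_vcpu {
	bool shstk_exposed;	/* is SHSTK exposed to this guest? */
	uint64_t pl0_ssp;	/* this guest's PL0_SSP value */
};

enum model_access { MODEL_ACCESS_OK, MODEL_ACCESS_GP };

/*
 * Switching to a vCPU: load its value only if the guest can see the MSR.
 * PL1_SSP and PL2_SSP are deliberately not modelled, since the host never
 * runs at CPL1 or CPL2.  The host value is never restored here because the
 * model assumes the host does not consume it.
 */
static void model_load_guest_pl0_ssp(const struct model_vcpu *vcpu)
{
	assert(vcpu != NULL);
	if (vcpu->shstk_exposed)
		fake_pl0_ssp_msr = vcpu->pl0_ssp;
	/* Otherwise a stale value may remain, but the guest cannot read it. */
}

/* Guest read: intercepted (modelled as #GP) when SHSTK is not exposed. */
static enum model_access model_guest_rdmsr_pl0_ssp(const struct model_vcpu *vcpu,
						   uint64_t *val)
{
	assert(vcpu != NULL && val != NULL);
	if (!vcpu->shstk_exposed)
		return MODEL_ACCESS_GP;
	*val = fake_pl0_ssp_msr;
	return MODEL_ACCESS_OK;
}

/* Guest write: same interception rule; updates both copies (simplification). */
static enum model_access model_guest_wrmsr_pl0_ssp(struct model_vcpu *vcpu,
						   uint64_t val)
{
	assert(vcpu != NULL);
	if (!vcpu->shstk_exposed)
		return MODEL_ACCESS_GP;
	vcpu->pl0_ssp = val;
	fake_pl0_ssp_msr = val;
	return MODEL_ACCESS_OK;
}
```

In this model a guest with SHSTK always has its own value loaded before it can
read the MSR, and a guest without SHSTK takes a #GP on any access, so neither
can observe the value left behind by the previously running vCPU.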