From: Jim Mattson
Date: Wed, 5 Oct 2022 16:24:54 -0700
Subject: Re: [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR
To: Suraj Jitindar Singh
Cc: kvm@vger.kernel.org, sjitindarsingh@gmail.com, linux-kernel@vger.kernel.org, x86@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@suse.de, dave.hansen@linux.intel.com, seanjc@google.com, pbonzini@redhat.com, peterz@infradead.org, jpoimboe@kernel.org, daniel.sneddon@linux.intel.com, pawan.kumar.gupta@linux.intel.com, benh@kernel.crashing.org, stable@vger.kernel.org
References: <20221005220227.1959-1-surajjs@amazon.com>
In-Reply-To: <20221005220227.1959-1-surajjs@amazon.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 5, 2022 at 3:03 PM Suraj Jitindar Singh wrote:
>
> tl;dr: The existing mitigation for eIBRS PBRSB predictions uses an INT3 to
> ensure a call instruction retires before a following unbalanced RET. Replace
> this with a WRMSR serialising instruction which has a lower performance
> penalty.
>
> == Background ==
>
> eIBRS (enhanced indirect branch restricted speculation) is used to prevent
> predictor addresses from one privilege domain from being used for prediction
> in a higher privilege domain.
>
> == Problem ==
>
> On processors with eIBRS protections there can be a case where upon VM exit
> a guest address may be used as an RSB prediction for an unbalanced RET if a
> CALL instruction hasn't yet been retired. This is termed PBRSB (Post-Barrier
> Return Stack Buffer).
>
> A mitigation for this was introduced in:
> (2b1299322016731d56807aa49254a5ea3080b6b3 x86/speculation: Add RSB VM Exit protections)
>
> This mitigation [1] has a ~1% performance impact on VM exit compared to without
> it [2].
>
> == Solution ==
>
> The WRMSR instruction can be used as a speculation barrier and a serialising
> instruction. Use this on the VM exit path instead to ensure that a CALL
> instruction (in this case the call to vmx_spec_ctrl_restore_host) has retired
> before the prediction of a following unbalanced RET.
>
> This mitigation [3] has a negligible performance impact.
>
> == Testing ==
>
> Run the outl_to_kernel kvm-unit-tests test 200 times per configuration which
> counts the cycles for an exit to kernel mode.
>
> [1] With existing mitigation:
> Average: 2026 cycles
> [2] With no mitigation:
> Average: 2008 cycles
> [3] With proposed mitigation:
> Average: 2008 cycles
>
> Signed-off-by: Suraj Jitindar Singh
> Cc: stable@vger.kernel.org
> ---
>  arch/x86/include/asm/nospec-branch.h | 7 +++----
>  arch/x86/kvm/vmx/vmenter.S           | 3 +--
>  arch/x86/kvm/vmx/vmx.c               | 5 +++++
>  3 files changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> index c936ce9f0c47..e5723e024b47 100644
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -159,10 +159,9 @@
>   * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
>   * monstrosity above, manually.
>   */
> -.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req ftr2=ALT_NOT(X86_FEATURE_ALWAYS)
> -        ALTERNATIVE_2 "jmp .Lskip_rsb_\@", \
> -                __stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr, \
> -                __stringify(__FILL_ONE_RETURN), \ftr2
> +.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
> +        ALTERNATIVE "jmp .Lskip_rsb_\@", \
> +                __stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr
>
>  .Lskip_rsb_\@:
>  .endm
> diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> index 6de96b943804..eb82797bd7bf 100644
> --- a/arch/x86/kvm/vmx/vmenter.S
> +++ b/arch/x86/kvm/vmx/vmenter.S
> @@ -231,8 +231,7 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
>           * single call to retire, before the first unbalanced RET.
>           */
>
> -        FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT,\
> -                           X86_FEATURE_RSB_VMEXIT_LITE
> +        FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
>
>
>          pop %_ASM_ARG2 /* @flags */
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index c9b49a09e6b5..fdcd8e10c2ab 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7049,8 +7049,13 @@ void noinstr vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx,
>           * For legacy IBRS, the IBRS bit always needs to be written after
>           * transitioning from a less privileged predictor mode, regardless of
>           * whether the guest/host values differ.
> +         *
> +         * For eIBRS affected by Post Barrier RSB Predictions a serialising
> +         * instruction (wrmsr) must be executed to ensure a call instruction has
> +         * retired before the prediction of a following unbalanced ret.
>           */
>          if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) ||
> +            cpu_feature_enabled(X86_FEATURE_RSB_VMEXIT_LITE) ||
>              vmx->spec_ctrl != hostval)
>                  native_wrmsrl(MSR_IA32_SPEC_CTRL, hostval);

Okay. I see how this almost meets the requirements. But this WRMSR is
conditional, which means that there's a speculative path through this
code that ends up at the unbalanced RET without executing the WRMSR.

Also, for your timings of "no mitigation" and this proposed mitigation
to be the same, I assume that the guest in your timing test has a
different IA32_SPEC_CTRL value than the host, which isn't always going
to be the case in practice. How much does this WRMSR cost if the guest
and the host have the same IA32_SPEC_CTRL value?

> --
> 2.17.1
>
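
For reference, the two points raised in this reply can be checked with a small userspace sketch. This is only an illustration under stated assumptions: overhead_percent() and wrmsr_executed() are hypothetical helper names, not functions from the patch or the kernel. The first reproduces the "~1%" figure from the quoted cycle counts; the second models only the architectural outcome of the patched condition in vmx_spec_ctrl_restore_host(). It cannot model a mispredicted branch, which is exactly the speculative path the reply is worried about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): relative overhead, in percent,
 * of a mitigated VM exit over a baseline VM exit.
 */
double overhead_percent(uint64_t mitigated_cycles, uint64_t baseline_cycles)
{
	assert(baseline_cycles > 0);
	return 100.0 * ((double)mitigated_cycles - (double)baseline_cycles)
		/ (double)baseline_cycles;
}

/*
 * Hypothetical model of the patched condition: true when the WRMSR to
 * IA32_SPEC_CTRL is architecturally executed. Without rsb_vmexit_lite this
 * is the pre-patch condition. Speculative execution past a mispredicted
 * branch is NOT modelled here.
 */
bool wrmsr_executed(bool kernel_ibrs, bool rsb_vmexit_lite,
		    uint64_t guest_spec_ctrl, uint64_t host_spec_ctrl)
{
	return kernel_ibrs || rsb_vmexit_lite ||
	       guest_spec_ctrl != host_spec_ctrl;
}
```

With the quoted averages, overhead_percent(2026, 2008) is a little under 0.9, which matches the "~1%" claim. With equal guest and host values and no legacy IBRS, wrmsr_executed() is false before the patch and true with X86_FEATURE_RSB_VMEXIT_LITE set, so in that configuration the patched path pays for a WRMSR that the "no mitigation" path does not execute.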