Message-ID: <112c2108-7548-f5bd-493d-19b944701f1b@maciej.szmigiero.name>
Date: Mon, 4 Apr 2022 18:50:54 +0200
From: "Maciej S. Szmigiero"
To: Sean Christopherson
Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
    kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/8] KVM: nSVM: Sync next_rip field from vmcb12 to vmcb02
In-Reply-To: <20220402010903.727604-2-seanjc@google.com>
References: <20220402010903.727604-1-seanjc@google.com>
    <20220402010903.727604-2-seanjc@google.com>

On 2.04.2022 03:08, Sean Christopherson wrote:
> From: Maciej S. Szmigiero
>
> The next_rip field of a VMCB is *not* an output-only field for a VMRUN.
> This field value (instead of the saved guest RIP) is used by the CPU for
> the return address pushed on stack when injecting a software interrupt or
> INT3 or INTO exception.
>
> Make sure this field gets synced from vmcb12 to vmcb02 when entering L2 or
> loading a nested state and NRIPS is exposed to L1. If NRIPS is supported
> in hardware but not exposed to L1 (nrips=0 or hidden by userspace), stuff
> vmcb02's next_rip from the new L2 RIP to emulate a !NRIPS CPU (which
> saves RIP on the stack as-is).
>
> Signed-off-by: Maciej S. Szmigiero
> Co-developed-by: Sean Christopherson
> Signed-off-by: Sean Christopherson
> ---
>  arch/x86/kvm/svm/nested.c | 22 +++++++++++++++++++---
>  arch/x86/kvm/svm/svm.h    |  1 +
>  2 files changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index 73b545278f5f..9a6dc2b38fcf 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -369,6 +369,7 @@ void __nested_copy_vmcb_control_to_cache(struct kvm_vcpu *vcpu,
>  	to->nested_ctl = from->nested_ctl;
>  	to->event_inj = from->event_inj;
>  	to->event_inj_err = from->event_inj_err;
> +	to->next_rip = from->next_rip;
>  	to->nested_cr3 = from->nested_cr3;
>  	to->virt_ext = from->virt_ext;
>  	to->pause_filter_count = from->pause_filter_count;
> @@ -606,7 +607,8 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
>  	}
>  }
>
> -static void nested_vmcb02_prepare_control(struct vcpu_svm *svm)
> +static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
> +					  unsigned long vmcb12_rip)
>  {
>  	u32 int_ctl_vmcb01_bits = V_INTR_MASKING_MASK;
>  	u32 int_ctl_vmcb12_bits = V_TPR_MASK | V_IRQ_INJECTION_BITS_MASK;
> @@ -660,6 +662,19 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm)
>  	vmcb02->control.event_inj = svm->nested.ctl.event_inj;
>  	vmcb02->control.event_inj_err = svm->nested.ctl.event_inj_err;
>
> +	/*
> +	 * next_rip is consumed on VMRUN as the return address pushed on the
> +	 * stack for injected soft exceptions/interrupts. If nrips is exposed
> +	 * to L1, take it verbatim from vmcb12. If nrips is supported in
> +	 * hardware but not exposed to L1, stuff the actual L2 RIP to emulate
> +	 * what a nrips=0 CPU would do (L1 is responsible for advancing RIP
> +	 * prior to injecting the event).
> +	 */
> +	if (svm->nrips_enabled)
> +		vmcb02->control.next_rip = svm->nested.ctl.next_rip;
> +	else if (boot_cpu_has(X86_FEATURE_NRIPS))
> +		vmcb02->control.next_rip = vmcb12_rip;
> +
>  	vmcb02->control.virt_ext = vmcb01->control.virt_ext &
>  				   LBR_CTL_ENABLE_MASK;
>  	if (svm->lbrv_enabled)
> @@ -743,7 +758,7 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa,
>  	nested_svm_copy_common_state(svm->vmcb01.ptr, svm->nested.vmcb02.ptr);
>
>  	svm_switch_vmcb(svm, &svm->nested.vmcb02);
> -	nested_vmcb02_prepare_control(svm);
> +	nested_vmcb02_prepare_control(svm, vmcb12->save.rip);
>  	nested_vmcb02_prepare_save(svm, vmcb12);
>
>  	ret = nested_svm_load_cr3(&svm->vcpu, svm->nested.save.cr3,
> @@ -1422,6 +1437,7 @@ static void nested_copy_vmcb_cache_to_control(struct vmcb_control_area *dst,
>  	dst->nested_ctl = from->nested_ctl;
>  	dst->event_inj = from->event_inj;
>  	dst->event_inj_err = from->event_inj_err;
> +	dst->next_rip = from->next_rip;
>  	dst->nested_cr3 = from->nested_cr3;
>  	dst->virt_ext = from->virt_ext;
>  	dst->pause_filter_count = from->pause_filter_count;
> @@ -1606,7 +1622,7 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
>  	nested_copy_vmcb_control_to_cache(svm, ctl);
>
>  	svm_switch_vmcb(svm, &svm->nested.vmcb02);
> -	nested_vmcb02_prepare_control(svm);
> +	nested_vmcb02_prepare_control(svm, save->rip);
                                           ^
I guess this should be "svm->vmcb->save.rip", since the
KVM_{GET,SET}_NESTED_STATE "save" field contains vmcb01 data,
not vmcb{0,1}2 (in contrast to the "control" field).

Thanks,
Maciej
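
For reference, the next_rip selection in the quoted hunk can be modeled in
isolation. This is only a minimal sketch under stated assumptions: the struct
and function names are hypothetical stand-ins rather than KVM's types, the two
booleans model svm->nrips_enabled and boot_cpu_has(X86_FEATURE_NRIPS), and the
"keep the current value" fallback models the hunk leaving vmcb02's field
untouched when neither condition holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for KVM state; not the kernel's types. */
struct nrip_inputs {
	bool nrips_exposed_to_l1;  /* models svm->nrips_enabled */
	bool nrips_in_hardware;    /* models boot_cpu_has(X86_FEATURE_NRIPS) */
	uint64_t vmcb12_next_rip;  /* models svm->nested.ctl.next_rip */
	uint64_t l2_rip;           /* RIP that L2 will actually resume at */
	uint64_t current_next_rip; /* models the existing vmcb02 next_rip */
};

/* Returns the value that vmcb02's next_rip would hold after the hunk runs. */
static uint64_t pick_vmcb02_next_rip(const struct nrip_inputs *in)
{
	/* NRIPS exposed to L1: take L1's value verbatim from vmcb12. */
	if (in->nrips_exposed_to_l1)
		return in->vmcb12_next_rip;

	/* NRIPS in hardware but hidden from L1: emulate a !NRIPS CPU. */
	if (in->nrips_in_hardware)
		return in->l2_rip;

	/* No NRIPS anywhere: the field is left as it was. */
	return in->current_next_rip;
}

/* Small usage example exercising the middle branch. */
static void pick_vmcb02_next_rip_example(void)
{
	struct nrip_inputs in = {
		.nrips_exposed_to_l1 = false,
		.nrips_in_hardware = true,
		.vmcb12_next_rip = 0x1002,
		.l2_rip = 0x1000,
		.current_next_rip = 0,
	};

	assert(pick_vmcb02_next_rip(&in) == 0x1000);
}
```

In this model the concern raised above is about which value gets passed as
l2_rip on the set-nested-state path: it has to be the RIP L2 resumes at, not
a value taken from vmcb01 data.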