Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp1793361pxb; Fri, 22 Oct 2021 07:47:41 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyN+8sKXmXBRW3+iUf+wWdrp7JSuAqDCXLrRDSmtdHPbS4rVu4u37Qf/Vg3OQIXxljE3dME X-Received: by 2002:a63:7c41:: with SMTP id l1mr137575pgn.372.1634914060857; Fri, 22 Oct 2021 07:47:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1634914060; cv=none; d=google.com; s=arc-20160816; b=gT7YrRLKJfydBAkhiqaW8UqaAuo8QjZXuKr4Iux+6HV3+/Oqfq6NvA2fTE7AtBxqZQ FP2kTv7rUCNcUHqMxq0IZFTh8Ag1AuuFZDpYjp/Pk7pCFU9oWeaP7sSEZCG7AAfoU5Oc PKzyW/GAyyVboaAD8Sdo40oGmchKGokdS4RvubBpv/MDH3W27F8FQIuNY3wBb7V8wLet eeXjHmxy59sFF3pBISwU+lIUEK308G30BJexb/o0X1Sy8lIrxE8WqkHkrEwNv4g/OQpj 1QsnZOMxyJf9Dwo9gW/Q4IOhA/SaNHw3GthnpMyk0/TDOWHFHQZjEdVMMSqaiw2VyxEB +p/w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:date:cc:to:from:subject :message-id:dkim-signature; bh=ztPsqRcoIIa+CSgH5e9hCCBkHfLlFSvImxfgvOSy78Y=; b=eUTa13n5JWM6w2EQuA+mdY0+NJxFSxjudwCbDS6QH+JyBpUDVfOOfWLHFglshBVDCr eO0Wc3zZ9gwNV218UFOWLB+9Xx8YyNDYv3Imx1kERe33GL5R8sytiF+Sh82KUjhbcxyj t+BF8n8IAjCRiVEXQWELWgqynaO8x7hRmfhW51eDAs6T+Zueu+GIQtR0a4t8S7WOXAmI owxx33HkXVJAD1d3T0zin4amEieq1dJGQJlHlZElMN+jAbWIScQ0p7y8MtsJC2t2DCyE PGki8IU0qFQ2SiGylQ60/rlMProAL+mUdY3x9wdxPrh4IIsaZkz1WSqrGbKyp4u+hv0J t0AQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Wb7EAV54; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id i9si11507654pgv.311.2021.10.22.07.47.27; Fri, 22 Oct 2021 07:47:40 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Wb7EAV54; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232999AbhJVOsr (ORCPT + 99 others); Fri, 22 Oct 2021 10:48:47 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:43332 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232994AbhJVOsq (ORCPT ); Fri, 22 Oct 2021 10:48:46 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1634913989; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ztPsqRcoIIa+CSgH5e9hCCBkHfLlFSvImxfgvOSy78Y=; b=Wb7EAV54f0rKn12pwCSB/F32AzRogbJrPF9sxgs2bFv5DBW2iZRvBXQDuq0L11Poy1Uz4G 4uq2NzLRj1Ng2+l6dhcMMbc2jJJ42lZkZyS4/pEgFXSobn3sddU0CR1t01H5SvBA8UeKm0 /jdKQ6Op3bfUeC1+R1x6EfGkrO6xiio= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-604-SHObPM6rPgaJCYZBDK8ZCA-1; Fri, 22 Oct 2021 10:46:26 -0400 X-MC-Unique: SHObPM6rPgaJCYZBDK8ZCA-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 8EC0581C464; Fri, 22 Oct 2021 14:46:24 +0000 (UTC) Received: from starship (unknown [10.40.192.246]) by smtp.corp.redhat.com (Postfix) with ESMTP id 58D621017E28; Fri, 22 Oct 2021 14:46:15 +0000 (UTC) Message-ID: <4a92e23fcdf9959d6f9bf2c8b6ef69a0c8662ee1.camel@redhat.com> Subject: Re: [PATCH v3 2/8] nSVM: introduce smv->nested.save to cache save area fields From: Maxim Levitsky To: Emanuele Giuseppe Esposito , kvm@vger.kernel.org Cc: Paolo Bonzini , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H. Peter Anvin" , linux-kernel@vger.kernel.org Date: Fri, 22 Oct 2021 17:46:14 +0300 In-Reply-To: <20211011143702.1786568-3-eesposit@redhat.com> References: <20211011143702.1786568-1-eesposit@redhat.com> <20211011143702.1786568-3-eesposit@redhat.com> Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.36.5 (3.36.5-2.fc32) MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 2021-10-11 at 10:36 -0400, Emanuele Giuseppe Esposito wrote: > This is useful in next patch, to avoid having temporary > copies of vmcb12 registers and passing them manually. > > Right now, instead of blindly copying everything, > we just copy EFER, CR0, CR3, CR4, DR6 and DR7. If more fields > will need to be added, it will be more obvious to see > that they must be added in struct vmcb_save_area_cached and > in nested_copy_vmcb_save_to_cache(). > > Signed-off-by: Emanuele Giuseppe Esposito > --- > arch/x86/kvm/svm/nested.c | 18 ++++++++++++++++++ > arch/x86/kvm/svm/svm.c | 1 + > arch/x86/kvm/svm/svm.h | 16 ++++++++++++++++ > 3 files changed, 35 insertions(+) > > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c > index d2fe65e2a7a4..c4959da8aec0 100644 > --- a/arch/x86/kvm/svm/nested.c > +++ b/arch/x86/kvm/svm/nested.c > @@ -313,6 +313,22 @@ void nested_load_control_from_vmcb12(struct vcpu_svm *svm, > svm->nested.ctl.iopm_base_pa &= ~0x0fffULL; > } > > +void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm, > + struct vmcb_save_area *save) > +{ > + /* > + * Copy only fields that are validated, as we need them > + * to avoid TOC/TOU races. > + */ > + svm->nested.save.efer = save->efer; > + svm->nested.save.cr0 = save->cr0; > + svm->nested.save.cr3 = save->cr3; > + svm->nested.save.cr4 = save->cr4; > + > + svm->nested.save.dr6 = save->dr6; > + svm->nested.save.dr7 = save->dr7; > +} > + > /* > * Synchronize fields that are written by the processor, so that > * they can be copied back into the vmcb12. > @@ -647,6 +663,7 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu) > return -EINVAL; > > nested_load_control_from_vmcb12(svm, &vmcb12->control); > + nested_copy_vmcb_save_to_cache(svm, &vmcb12->save); > > if (!nested_vmcb_valid_sregs(vcpu, &vmcb12->save) || > !nested_vmcb_check_controls(vcpu, &svm->nested.ctl)) { > @@ -1385,6 +1402,7 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu, > > svm_copy_vmrun_state(&svm->vmcb01.ptr->save, save); > nested_load_control_from_vmcb12(svm, ctl); > + nested_copy_vmcb_save_to_cache(svm, save); This is wrong (and this is due to a very non obivous way SVM restores the nested state which I would like to rewrite.) On SVM we migrate L2's control area, and *L1's save* area inside a single vmcb while L2's save area is supposed to be migrated with regular KVM_SET_{S}REGS ioctls. So I think you want something like this instead: diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 4cd53a56f3ebc..fcb38b220e706 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -1409,8 +1409,10 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu, if (is_guest_mode(vcpu)) svm_leave_nested(svm); - else + else { svm->nested.vmcb02.ptr->save = svm->vmcb01.ptr->save; + nested_copy_vmcb_save_to_cache(svm, svm->nested.vmcb02.ptr->save); + } svm_set_gif(svm, !!(kvm_state->flags & KVM_STATE_NESTED_GIF_SET)); Note two branches here. If we are already in guest mode (qemu currently doesn't set nested state while already in guest mode, but I added this to make this code more robust, then we leave it for a while, to make the following code assume correctly that we are not in guest mode, then it returns us back to the guest mode and never touches the save area of L2, thus its cache shouldn't update either. If we are in normal non guest mode, then as you see we just copy the whole L1 save area to L2, because the assumption is that at least some L2 state is there, restored earlier (currently the result of KVM_SET_REGS), and after nested state is set, qemu then restores the regular L2 registers with KVM_SET_REGS, and other stuff like msrs. So at the point of this copying, we need to update the cache. > > svm_switch_vmcb(svm, &svm->nested.vmcb02); > nested_vmcb02_prepare_control(svm); > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > index 69639f9624f5..bf171f5f6158 100644 > --- a/arch/x86/kvm/svm/svm.c > +++ b/arch/x86/kvm/svm/svm.c > @@ -4386,6 +4386,7 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const char *smstate) > vmcb12 = map.hva; > > nested_load_control_from_vmcb12(svm, &vmcb12->control); > + nested_copy_vmcb_save_to_cache(svm, &vmcb12->save); This looks correct. > > ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12); > kvm_vcpu_unmap(vcpu, &map, true); > diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h > index bd0fe94c2920..f0195bc263e9 100644 > --- a/arch/x86/kvm/svm/svm.h > +++ b/arch/x86/kvm/svm/svm.h > @@ -103,6 +103,19 @@ struct kvm_vmcb_info { > uint64_t asid_generation; > }; > > +/* > + * This struct is not kept up-to-date, and it is only valid within > + * svm_set_nested_state and nested_svm_vmrun. > + */ > +struct vmcb_save_area_cached { > + u64 efer; > + u64 cr4; > + u64 cr3; > + u64 cr0; > + u64 dr7; > + u64 dr6; > +}; > + > struct svm_nested_state { > struct kvm_vmcb_info vmcb02; > u64 hsave_msr; > @@ -119,6 +132,7 @@ struct svm_nested_state { > > /* cache for control fields of the guest */ > struct vmcb_control_area ctl; > + struct vmcb_save_area_cached save; > > bool initialized; > }; > @@ -484,6 +498,8 @@ int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr, > int nested_svm_exit_special(struct vcpu_svm *svm); > void nested_load_control_from_vmcb12(struct vcpu_svm *svm, > struct vmcb_control_area *control); > +void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm, > + struct vmcb_save_area *save); > void nested_sync_control_from_vmcb02(struct vcpu_svm *svm); > void nested_vmcb02_compute_g_pat(struct vcpu_svm *svm); > void svm_switch_vmcb(struct vcpu_svm *svm, struct kvm_vmcb_info *target_vmcb); Note that you need to rebase this and most other patches, due to changes (mostly mine, sorry) in the kvm/queue. Conflicts are mostly trivial though. Best regards, Maxim Levitsky