Subject: Re: [PATCH v2 2/4] KVM: SVM: Add emulation support for #GP
 triggered by SVM instructions
From: Maxim Levitsky <mlevitsk@redhat.com>
To: Wei Huang <wei.huang2@amd.com>, kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, pbonzini@redhat.com, vkuznets@redhat.com,
 seanjc@google.com, joro@8bytes.org, bp@alien8.de, tglx@linutronix.de,
 mingo@redhat.com, x86@kernel.org, jmattson@google.com,
 wanpengli@tencent.com, bsd@redhat.com, dgilbert@redhat.com,
 luto@amacapital.net
Date: Thu, 21 Jan 2021 16:07:45 +0200
In-Reply-To: <20210121065508.1169585-3-wei.huang2@amd.com>
References: <20210121065508.1169585-1-wei.huang2@amd.com>
 <20210121065508.1169585-3-wei.huang2@amd.com>

On Thu, 2021-01-21 at 01:55 -0500, Wei Huang wrote:
> From: Bandan Das <bsd@redhat.com>
> 
> While running SVM related instructions (VMRUN/VMSAVE/VMLOAD), some AMD
> CPUs check EAX against reserved memory regions (e.g. SMM memory on host)
> before checking VMCB's instruction intercept. If EAX falls into such
> memory areas, #GP is triggered before VMEXIT. This causes a problem under
> nested virtualization.
> To solve this problem, KVM needs to trap #GP and
> check the instructions triggering #GP. For VM execution instructions,
> KVM emulates these instructions.
> 
> Co-developed-by: Wei Huang <wei.huang2@amd.com>
> Signed-off-by: Wei Huang <wei.huang2@amd.com>
> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
>  arch/x86/kvm/svm/svm.c | 99 ++++++++++++++++++++++++++++++++++--------
>  1 file changed, 81 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 7ef171790d02..6ed523cab068 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -288,6 +288,9 @@ int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
>  	if (!(efer & EFER_SVME)) {
>  		svm_leave_nested(svm);
>  		svm_set_gif(svm, true);
> +		/* #GP intercept is still needed in vmware_backdoor */
> +		if (!enable_vmware_backdoor)
> +			clr_exception_intercept(svm, GP_VECTOR);

Again I would prefer a flag for the errata workaround, but this is
still better.

> 
>  		/*
>  		 * Free the nested guest state, unless we are in SMM.
> @@ -309,6 +312,9 @@ int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
> 
>  	svm->vmcb->save.efer = efer | EFER_SVME;
>  	vmcb_mark_dirty(svm->vmcb, VMCB_CR);
> +	/* Enable #GP interception for SVM instructions */
> +	set_exception_intercept(svm, GP_VECTOR);
> +
>  	return 0;
>  }
> 
> @@ -1957,24 +1963,6 @@ static int ac_interception(struct vcpu_svm *svm)
>  	return 1;
>  }
> 
> -static int gp_interception(struct vcpu_svm *svm)
> -{
> -	struct kvm_vcpu *vcpu = &svm->vcpu;
> -	u32 error_code = svm->vmcb->control.exit_info_1;
> -
> -	WARN_ON_ONCE(!enable_vmware_backdoor);
> -
> -	/*
> -	 * VMware backdoor emulation on #GP interception only handles IN{S},
> -	 * OUT{S}, and RDPMC, none of which generate a non-zero error code.
> -	 */
> -	if (error_code) {
> -		kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
> -		return 1;
> -	}
> -	return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP);
> -}
> -
>  static bool is_erratum_383(void)
>  {
>  	int err, i;
> @@ -2173,6 +2161,81 @@ static int vmrun_interception(struct vcpu_svm *svm)
>  	return nested_svm_vmrun(svm);
>  }
> 
> +enum {
> +	NOT_SVM_INSTR,
> +	SVM_INSTR_VMRUN,
> +	SVM_INSTR_VMLOAD,
> +	SVM_INSTR_VMSAVE,
> +};
> +
> +/* Return NOT_SVM_INSTR if not SVM instrs, otherwise return decode result */
> +static int svm_instr_opcode(struct kvm_vcpu *vcpu)
> +{
> +	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
> +
> +	if (ctxt->b != 0x1 || ctxt->opcode_len != 2)
> +		return NOT_SVM_INSTR;
> +
> +	switch (ctxt->modrm) {
> +	case 0xd8: /* VMRUN */
> +		return SVM_INSTR_VMRUN;
> +	case 0xda: /* VMLOAD */
> +		return SVM_INSTR_VMLOAD;
> +	case 0xdb: /* VMSAVE */
> +		return SVM_INSTR_VMSAVE;
> +	default:
> +		break;
> +	}
> +
> +	return NOT_SVM_INSTR;
> +}
> +
> +static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
> +{
> +	int (*const svm_instr_handlers[])(struct vcpu_svm *svm) = {
> +		[SVM_INSTR_VMRUN] = vmrun_interception,
> +		[SVM_INSTR_VMLOAD] = vmload_interception,
> +		[SVM_INSTR_VMSAVE] = vmsave_interception,
> +	};
> +	struct vcpu_svm *svm = to_svm(vcpu);
> +
> +	return svm_instr_handlers[opcode](svm);
> +}
> +
> +/*
> + * #GP handling code. Note that #GP can be triggered under the following two
> + * cases:
> + *   1) SVM VM-related instructions (VMRUN/VMSAVE/VMLOAD) that trigger #GP on
> + *      some AMD CPUs when EAX of these instructions are in the reserved memory
> + *      regions (e.g. SMM memory on host).
> + *   2) VMware backdoor
> + */
> +static int gp_interception(struct vcpu_svm *svm)
> +{
> +	struct kvm_vcpu *vcpu = &svm->vcpu;
> +	u32 error_code = svm->vmcb->control.exit_info_1;
> +	int opcode;
> +
> +	/* Both #GP cases have zero error_code */

I would have kept the original description of the possible #GP reasons
for the VMWARE backdoor, and that WARN_ON_ONCE that was removed.

> +	if (error_code)
> +		goto reinject;
> +
> +	/* Decode the instruction for usage later */
> +	if (x86_emulate_decoded_instruction(vcpu, 0, NULL, 0) != EMULATION_OK)
> +		goto reinject;
> +
> +	opcode = svm_instr_opcode(vcpu);
> +	if (opcode)

I prefer opcode != NOT_SVM_INSTR.

> +		return emulate_svm_instr(vcpu, opcode);
> +	else

I think 'WARN_ON_ONCE(!enable_vmware_backdoor)' can be placed here.

> +		return kvm_emulate_instruction(vcpu,
> +				EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);

I tested the vmware backdoor a bit (using the kvm unit tests) and I
found a tiny pre-existing bug there: we shouldn't emulate the vmware
backdoor for a nested guest, but rather let the nested hypervisor
handle it.

The below patch (on top of your patches) works for me and allows the
vmware backdoor test to pass when the kvm unit tests run in a guest.

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index fe97b0e41824a..4557fdc9c3e1b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2243,7 +2243,7 @@ static int gp_interception(struct vcpu_svm *svm)
 
 	opcode = svm_instr_opcode(vcpu);
 	if (opcode)
 		return emulate_svm_instr(vcpu, opcode);
-	else
+	else if (!is_guest_mode(vcpu))
 		return kvm_emulate_instruction(vcpu,
 			EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);

Best regards,
	Maxim Levitsky

> +
> +reinject:
> +	kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
> +	return 1;
> +}
> +
>  void svm_set_gif(struct vcpu_svm *svm, bool value)
>  {
>  	if (value) {
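P.S. For anyone reading along, a note on the decode check in
svm_instr_opcode() above: VMRUN, VMLOAD and VMSAVE all share the
two-byte opcode 0f 01 and differ only in the ModRM byte (d8/da/db),
which is exactly what the ctxt->b == 0x1 / ctxt->opcode_len == 2 /
ctxt->modrm tests key on. A minimal standalone sketch of the same
classification, written as plain user-space C with a hypothetical
helper name (not KVM code):

#include <stddef.h>
#include <stdio.h>

enum { NOT_SVM_INSTR, SVM_INSTR_VMRUN, SVM_INSTR_VMLOAD, SVM_INSTR_VMSAVE };

/* Classify raw instruction bytes: all three SVM instrs are "0f 01 <modrm>". */
static int classify_svm_instr(const unsigned char *insn, size_t len)
{
	if (len < 3 || insn[0] != 0x0f || insn[1] != 0x01)
		return NOT_SVM_INSTR;

	switch (insn[2]) {
	case 0xd8:
		return SVM_INSTR_VMRUN;
	case 0xda:
		return SVM_INSTR_VMLOAD;
	case 0xdb:
		return SVM_INSTR_VMSAVE;
	default:
		return NOT_SVM_INSTR;
	}
}

int main(void)
{
	const unsigned char vmrun[]   = { 0x0f, 0x01, 0xd8 };
	const unsigned char vmmcall[] = { 0x0f, 0x01, 0xd9 };

	/* VMMCALL shares 0f 01 but has modrm d9, so it must not match. */
	printf("%d %d\n", classify_svm_instr(vmrun, sizeof(vmrun)),
	       classify_svm_instr(vmmcall, sizeof(vmmcall))); /* prints "1 0" */
	return 0;
}

The VMMCALL case is why the switch has to be exact: other 0f 01
encodings must fall through to NOT_SVM_INSTR and take the VMware
backdoor / reinject path instead.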
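And to make the errata scenario from the commit message concrete: the
failing case is VMRUN executed while RAX points into a reserved region.
A trigger sketch, assuming a kvm-unit-tests-style environment running
at CPL0 with EFER.SVME already set, and with reserved_pa standing in
for a hypothetical reserved physical address (illustrative only, not a
test from this series):

/*
 * Illustrative only. On an affected AMD CPU this raises #GP before the
 * VMRUN intercept can produce a #VMEXIT, so in a nested setup the L0
 * hypervisor sees a #GP vmexit instead of the VMRUN intercept; the
 * gp_interception() changes above recover by decoding and emulating
 * the instruction. Assumes CPL0, EFER.SVME=1, and an exception
 * handler installed to catch the #GP.
 */
static void vmrun_reserved_rax(unsigned long reserved_pa)
{
	asm volatile("vmrun" : : "a" (reserved_pa) : "memory");
}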