Date: Tue, 26 Oct 2021 16:37:50 +0000
From: Sean Christopherson
To: Hou Wenlong
Cc: kvm@vger.kernel.org, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86@kernel.org, "H. Peter Anvin",
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] KVM: VMX: fix instruction skipping when handling UD exception
References: <8ad4de9dae77ee3690ee9bd3c5a51d235d619eb6.1634870747.git.houwenlong93@linux.alibaba.com>
In-Reply-To: <8ad4de9dae77ee3690ee9bd3c5a51d235d619eb6.1634870747.git.houwenlong93@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Oct 22, 2021, Hou Wenlong wrote:
> When kvm.force_emulation_prefix is enabled, instruction with
> kvm prefix would trigger an UD exception and do instruction
> emulation.
> The emulation may need to exit to userspace due
> to userspace io, and the complete_userspace_io callback may
> skip instruction, i.e. MSR accesses emulation would exit to
> userspace if userspace wanted to know about the MSR fault.
> However, VM_EXIT_INSTRUCTION_LEN in vmcs is invalid now, it
> should use kvm_emulate_instruction() to skip instruction.
>
> Signed-off-by: Hou Wenlong
> ---
>  arch/x86/kvm/vmx/vmx.c | 4 ++--
>  arch/x86/kvm/vmx/vmx.h | 9 +++++++++
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 1c8b2b6e7ed9..01049d65da26 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1501,8 +1501,8 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
>  	 * (namely Hyper-V) don't set it due to it being undefined behavior,
>  	 * i.e. we end up advancing IP with some random value.
>  	 */
> -	if (!static_cpu_has(X86_FEATURE_HYPERVISOR) ||
> -	    exit_reason.basic != EXIT_REASON_EPT_MISCONFIG) {
> +	if (!is_ud_exit(vcpu) && (!static_cpu_has(X86_FEATURE_HYPERVISOR) ||

This is incomplete and is just a workaround for the underlying bug.  The same
mess can occur if the emulator triggers an exit to userspace during "normal"
emulation, e.g. if unrestricted guest is disabled and the guest accesses an
MSR while in Big RM.  In that case, there's no #UD to key off of.

The correct way to fix this is to attach a different callback when the MSR
access comes from the emulator.  I'd say rename the existing
complete_emulated_{rd,wr}msr() callbacks to complete_fast_{rd,wr}msr() to
match the port I/O nomenclature.

Something like this (which also has some opportunistic simplification of the
error handling in kvm_emulate_{rd,wr}msr()).
---
 arch/x86/kvm/x86.c | 82 +++++++++++++++++++++++++---------------------
 1 file changed, 45 insertions(+), 37 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ac83d873d65b..7ff5b8d58ca3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1814,18 +1814,44 @@ int kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data)
 }
 EXPORT_SYMBOL_GPL(kvm_set_msr);

-static int complete_emulated_rdmsr(struct kvm_vcpu *vcpu)
+static void __complete_emulated_rdmsr(struct kvm_vcpu *vcpu)
 {
-	int err = vcpu->run->msr.error;
-
-	if (!err) {
+	if (!vcpu->run->msr.error) {
 		kvm_rax_write(vcpu, (u32)vcpu->run->msr.data);
 		kvm_rdx_write(vcpu, vcpu->run->msr.data >> 32);
 	}
+}

-	return static_call(kvm_x86_complete_emulated_msr)(vcpu, err);
+static int complete_emulated_msr_access(struct kvm_vcpu *vcpu)
+{
+	if (vcpu->run->msr.error) {
+		kvm_inject_gp(vcpu, 0);
+		return 1;
+	}
+
+	return kvm_emulate_instruction(vcpu, EMULTYPE_SKIP);
+}
+
+static int complete_emulated_rdmsr(struct kvm_vcpu *vcpu)
+{
+	__complete_emulated_rdmsr(vcpu);
+
+	return complete_emulated_msr_access(vcpu);
 }

 static int complete_emulated_wrmsr(struct kvm_vcpu *vcpu)
+{
+	return complete_emulated_msr_access(vcpu);
+}
+
+static int complete_fast_rdmsr(struct kvm_vcpu *vcpu)
+{
+	__complete_emulated_rdmsr(vcpu);
+
+	return static_call(kvm_x86_complete_emulated_msr)(vcpu, vcpu->run->msr.error);
+}
+
+static int complete_fast_wrmsr(struct kvm_vcpu *vcpu)
 {
 	return static_call(kvm_x86_complete_emulated_msr)(vcpu, vcpu->run->msr.error);
 }
@@ -1864,18 +1890,6 @@ static int kvm_msr_user_space(struct kvm_vcpu *vcpu, u32 index,
 	return 1;
 }

-static int kvm_get_msr_user_space(struct kvm_vcpu *vcpu, u32 index, int r)
-{
-	return kvm_msr_user_space(vcpu, index, KVM_EXIT_X86_RDMSR, 0,
-				  complete_emulated_rdmsr, r);
-}
-
-static int kvm_set_msr_user_space(struct kvm_vcpu *vcpu, u32 index, u64 data, int r)
-{
-	return kvm_msr_user_space(vcpu, index, KVM_EXIT_X86_WRMSR, data,
-				  complete_emulated_wrmsr, r);
-}
-
 int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu)
 {
 	u32 ecx = kvm_rcx_read(vcpu);
@@ -1883,19 +1897,15 @@ int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu)
 	int r;

 	r = kvm_get_msr(vcpu, ecx, &data);
-
-	/* MSR read failed? See if we should ask user space */
-	if (r && kvm_get_msr_user_space(vcpu, ecx, r)) {
-		/* Bounce to user space */
-		return 0;
-	}
-
 	if (!r) {
 		trace_kvm_msr_read(ecx, data);

 		kvm_rax_write(vcpu, data & -1u);
 		kvm_rdx_write(vcpu, (data >> 32) & -1u);
 	} else {
+		if (kvm_msr_user_space(vcpu, ecx, KVM_EXIT_X86_RDMSR, 0,
+				       complete_fast_rdmsr, r))
+			return 0;
 		trace_kvm_msr_read_ex(ecx);
 	}

@@ -1910,20 +1920,16 @@ int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu)
 	int r;

 	r = kvm_set_msr(vcpu, ecx, data);
-
-	/* MSR write failed? See if we should ask user space */
-	if (r && kvm_set_msr_user_space(vcpu, ecx, data, r))
-		/* Bounce to user space */
-		return 0;
-
-	/* Signal all other negative errors to userspace */
-	if (r < 0)
-		return r;
-
-	if (!r)
+	if (!r) {
 		trace_kvm_msr_write(ecx, data);
-	else
+	} else {
+		if (kvm_msr_user_space(vcpu, ecx, KVM_EXIT_X86_WRMSR, data,
+				       complete_fast_wrmsr, r))
+			return 0;
+		if (r < 0)
+			return r;
 		trace_kvm_msr_write_ex(ecx, data);
+	}

 	return static_call(kvm_x86_complete_emulated_msr)(vcpu, r);
 }
@@ -7387,7 +7393,8 @@ static int emulator_get_msr(struct x86_emulate_ctxt *ctxt,

 	r = kvm_get_msr(vcpu, msr_index, pdata);

-	if (r && kvm_get_msr_user_space(vcpu, msr_index, r)) {
+	if (r && kvm_msr_user_space(vcpu, msr_index, KVM_EXIT_X86_RDMSR, 0,
+				    complete_emulated_rdmsr, r)) {
 		/* Bounce to user space */
 		return X86EMUL_IO_NEEDED;
 	}
@@ -7403,7 +7410,8 @@ static int emulator_set_msr(struct x86_emulate_ctxt *ctxt,

 	r = kvm_set_msr(vcpu, msr_index, data);

-	if (r && kvm_set_msr_user_space(vcpu, msr_index, data, r)) {
+	if (r && kvm_msr_user_space(vcpu, msr_index, KVM_EXIT_X86_WRMSR, data,
+				    complete_emulated_wrmsr, r)) {
 		/* Bounce to user space */
 		return X86EMUL_IO_NEEDED;
 	}
--
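As an aside, the callback split above can be modeled outside the kernel.  The
sketch below uses made-up names and a toy vcpu struct (none of it is the real
KVM API); it only illustrates why the completion handler has to be recorded at
the point the MSR access is bounced to userspace, so that the reentry path
knows whether the hardware-reported instruction length can be trusted.

```c
/*
 * User-space model of the callback split described above.  All names
 * are illustrative, NOT real KVM symbols.  Return convention mimics
 * KVM: 0 = exit to userspace, 1 = handled, continue running the guest.
 */

enum skip_method { SKIP_NONE, SKIP_HW_INSN_LEN, SKIP_EMULATOR };

struct vcpu {
	int (*complete_userspace_io)(struct vcpu *vcpu);
	enum skip_method last_skip;	/* observable for the example */
};

/* Fast path (WRMSR intercepted directly): VM_EXIT_INSTRUCTION_LEN is
 * valid, so RIP can be advanced by the hardware-reported length. */
static int complete_fast_wrmsr(struct vcpu *vcpu)
{
	vcpu->last_skip = SKIP_HW_INSN_LEN;
	return 1;
}

/* Emulator path (#UD => in-kernel emulator): the reported length is
 * stale, so the instruction must be skipped by re-decoding it, i.e.
 * the moral equivalent of kvm_emulate_instruction(EMULTYPE_SKIP). */
static int complete_emulated_wrmsr(struct vcpu *vcpu)
{
	vcpu->last_skip = SKIP_EMULATOR;
	return 1;
}

/* Bounce to userspace, recording which completion to run on reentry. */
static int msr_user_space(struct vcpu *vcpu,
			  int (*completion)(struct vcpu *vcpu))
{
	vcpu->complete_userspace_io = completion;
	return 0;
}
```

The diff realizes the same idea by passing complete_fast_{rd,wr}msr() from
kvm_emulate_{rd,wr}msr() and complete_emulated_{rd,wr}msr() from the
emulator_{get,set}_msr() hooks.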
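The "opportunistic simplification" of the kvm_emulate_wrmsr() error handling
can likewise be followed in isolation.  This is a hedged user-space model of
the restructured control flow; emulate_wrmsr() and bounce_to_user_space are
illustrative stand-ins, not kernel symbols.

```c
#include <stdbool.h>

/*
 * Model of the restructured error handling in kvm_emulate_wrmsr()
 * from the diff above.  Return convention mimics KVM: 0 = exit to
 * userspace so it can handle the MSR access, negative = fatal error
 * signaled to the caller, 1 = completed in kernel, continue the guest.
 */

static bool bounce_to_user_space;	/* stands in for: userspace wants MSR exits */

static int emulate_wrmsr(int r /* result of kvm_set_msr() */)
{
	if (!r) {
		/* success: trace_kvm_msr_write(...) */
	} else {
		/* failure: first see if userspace wants to handle it */
		if (bounce_to_user_space)
			return 0;
		/* signal all other negative errors to the caller */
		if (r < 0)
			return r;
		/* benign failure: trace_kvm_msr_write_ex(...) */
	}
	return 1;	/* completed in kernel */
}
```

Note that, as in the diff, the userspace bounce is attempted before the
`r < 0` check, so a negative error only propagates when userspace has not
claimed the exit.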