Received: by 2002:a25:1985:0:0:0:0:0 with SMTP id 127csp32225ybz; Thu, 16 Apr 2020 18:41:48 -0700 (PDT) X-Google-Smtp-Source: APiQypKOCvx8oekdlQWo4kTjxVDKFUeVYqO5nnSFiEw0f+8cvfBxXEIv3oJ1prAxUXu3Xiu61ie+ X-Received: by 2002:a17:906:400a:: with SMTP id v10mr786044ejj.300.1587087708131; Thu, 16 Apr 2020 18:41:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1587087708; cv=none; d=google.com; s=arc-20160816; b=OS32xZ55JXPgYcW9GYkqaiOhKgVu8/4ylMMsENKwu+Xu7nEi0lSBBogTvO+P2qQKY3 PsIfhpLa9fYSqR4PIwuE7dUfjbRnrfn+aKHPLYzLfrBfpyOqCo9MbOCvlAPYYCHiFyhg 96CQztRUBllr8UG+CV62QbcRnsTkYpx+GW0MYrHbICdsxyQ5DE+4CJE35dW1eBKFB8OG cnV+9QiOpaMv0z+A9i5qaICAIp8gPgKmJ+kLs848GDS4gTgJMy6V0c/TC3iosV9gKdFC zYOgNIY8sHWuHuV7KhQslkpsjh9kGz4bwYN2xD1dXkg9agfmuJcYoyUSQxguIyXkLVqJ ldXQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature; bh=a6G0OEaJwRfYQ/vo87FSt169XCe+YA0xBy9A9/TJQNI=; b=IFuuN5k1e8YjtzbIc7FrGN+oXlJAirfx54tWwshC2dDchtn96K38bHgTuwVALURmO/ yRVNVDvPnhcfW69j+J2ac3PT2jICmOkr+zaGW8JxfcgygT5YS0h8WtJr34cSreWAu3mP fZb+YQPUiCH80JtvsjiM65w5D7fSf0zm2YcdYRN7rwlakcjuUAMWHlsSwMgs08uOqGnt Oz4YglJwrLZP6D857w7naDiymFJFI3eymFvoWw97jdw4W4ao1MMou3nRYvDTVxBbr1HG B8rD8l//m8Y8I1YqKVy2vmcklPeBY0lb8srr/w/XClYkKFRPc7+8jIZEPy/sbdYdNXWH G8xA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2020-01-29 header.b="C/XFIH5d"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id i18si12862241ejx.488.2020.04.16.18.41.24; Thu, 16 Apr 2020 18:41:48 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2020-01-29 header.b="C/XFIH5d"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728800AbgDQBhh (ORCPT + 99 others); Thu, 16 Apr 2020 21:37:37 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:45096 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725800AbgDQBhh (ORCPT ); Thu, 16 Apr 2020 21:37:37 -0400 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 03H1SCcn036015; Fri, 17 Apr 2020 01:36:29 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : to : cc : references : from : message-id : date : mime-version : in-reply-to : content-type : content-transfer-encoding; s=corp-2020-01-29; bh=a6G0OEaJwRfYQ/vo87FSt169XCe+YA0xBy9A9/TJQNI=; b=C/XFIH5dylBwMo99SoC/EjpugtIuSqw1odMVXiQ8238Ij8EfF4ZhRKmJSgsSOZwnfyGM jyNjXPq9HE9UM2mWNnFvjSRoN6gNahgX8TEAvVuw/y9sTkeBQ4Ty74lmfvdLcd0KlEBd Gu+xlSD+Noza9mDa6j2iExzzCmBe1ZDT9Z8gb/b0Kt+YsYDWPW45pHi2XN+uiu+89ghh oPYxPwPqr85iuwOkWVt7T11Ue/ZkkTC+DCh+XN/1mRZS5TVeNf5EhTg2ELZ0LYEHCFch L1NlJCL2ql4SAKY/wbcdecKraLt45owpCIp9p5q1H+COzzSnxUipJqcxEnuVG4TXHdEo Zg== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by aserp2120.oracle.com with ESMTP id 30dn95vmbs-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 17 Apr 2020 01:36:29 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 03H1W92r037660; Fri, 17 Apr 2020 01:34:29 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by aserp3020.oracle.com with ESMTP id 30dn9hbp57-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 17 Apr 2020 01:34:29 +0000 Received: from abhmp0009.oracle.com (abhmp0009.oracle.com [141.146.116.15]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id 03H1YOL5026960; Fri, 17 Apr 2020 01:34:24 GMT Received: from localhost.localdomain (/10.159.142.247) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Thu, 16 Apr 2020 18:34:23 -0700 Subject: Re: [PATCH] KVM: nVMX: Remove non-functional "support" for CR3 target values To: Sean Christopherson , Paolo Bonzini Cc: Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , kvm@vger.kernel.org, linux-kernel@vger.kernel.org References: <20200416000739.9012-1-sean.j.christopherson@intel.com> From: Krish Sadhukhan Message-ID: <2490337e-dc45-2363-3bab-fb9572dc9403@oracle.com> Date: Thu, 16 Apr 2020 18:34:13 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.0 MIME-Version: 1.0 In-Reply-To: <20200416000739.9012-1-sean.j.christopherson@intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9593 signatures=668686 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 mlxscore=0 adultscore=0 spamscore=0 phishscore=0 bulkscore=0 suspectscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2003020000 definitions=main-2004170010 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9593 signatures=668686 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 clxscore=1015 malwarescore=0 bulkscore=0 priorityscore=1501 lowpriorityscore=0 mlxscore=0 phishscore=0 spamscore=0 impostorscore=0 suspectscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2003020000 definitions=main-2004170009 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 4/15/20 5:07 PM, Sean Christopherson wrote: > Remove all references to cr3_target_value[0-3] and replace the fields > in vmcs12 with "dead_space" to preserve the vmcs12 layout. KVM doesn't > support emulating CR3-target values, despite a variety of code that > implies otherwise, as KVM unconditionally reports '0' for the number of > supported CR3-target values. > > This technically fixes a bug where KVM would incorrectly allow VMREAD > and VMWRITE to nonexistent fields, i.e. cr3_target_value[0-3]. Per > Intel's SDM, the number of supported CR3-target values reported in > VMX_MISC also enumerates the existence of the associated VMCS fields: > > If a future implementation supports more than 4 CR3-target values, they > will be encoded consecutively following the 4 encodings given here. > > Alternatively, the "bug" could be fixed by actually advertisting support > for 4 CR3-target values, but that'd likely just enable kvm-unit-tests > given that no one has complained about lack of support for going on ten > years, e.g. KVM, Xen and HyperV don't use CR3-target values. > > Signed-off-by: Sean Christopherson > --- > > Krish's patch to allow L1 to disable unrestricted guest got me poking > around the related code. That led me into nested_vmx_exit_handled_cr() > and handle_cr(). handle_cr() does WARN_ON_ONCE(enable_unrestricted_guest) > in the MOV-to-CR3 path, as KVM should never enable CR3-exiting for L1. > But the CR3-target matching nested_vmx_exit_handled_cr() meant that it > would be possible to trigger the warn by enabling CR3-exiting in L1 with > a valid list of CR3-targets. I got so far as to write a patch and a unit > test, and then I checked MSR_IA32_VMX_MISC under KVM... > > AFAICT, properly emulating CR3-targets would be a trivial patch. But, I > don't think there would be any meaningful users, e.g. Xen is the one > hypervisor (that I am aware of) that runs with EPT disabled for certain > use cases, and even Xen doesn't bother with CR3-targets. On the other > hand, the odds of introducing a bug of one form or another seem high. > > Amusing side note, Xen's only reference to vmcs.CR3_TARGET_VALUE* is in > its vmcs_dump_vcpu(), which explains why KVM's dump_vmcs() bothers to spit > out the unused CR3-target fields :-). > > Documentation/virt/kvm/nested-vmx.rst | 5 +---- > arch/x86/kvm/vmx/evmcs.c | 8 -------- > arch/x86/kvm/vmx/nested.c | 17 ----------------- > arch/x86/kvm/vmx/vmcs12.c | 4 ---- > arch/x86/kvm/vmx/vmcs12.h | 10 ++-------- > arch/x86/kvm/vmx/vmx.c | 9 --------- > 6 files changed, 3 insertions(+), 50 deletions(-) > > diff --git a/Documentation/virt/kvm/nested-vmx.rst b/Documentation/virt/kvm/nested-vmx.rst > index 592b0ab6970b..89851cbb7df9 100644 > --- a/Documentation/virt/kvm/nested-vmx.rst > +++ b/Documentation/virt/kvm/nested-vmx.rst > @@ -116,10 +116,7 @@ struct shadow_vmcs is ever changed. > natural_width cr4_guest_host_mask; > natural_width cr0_read_shadow; > natural_width cr4_read_shadow; > - natural_width cr3_target_value0; > - natural_width cr3_target_value1; > - natural_width cr3_target_value2; > - natural_width cr3_target_value3; > + natural_width dead_space[4]; /* Last remnants of cr3_target_value[0-3]. */ > natural_width exit_qualification; > natural_width guest_linear_address; > natural_width guest_cr0; > diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c > index 303813423c3e..73f3e07c1852 100644 > --- a/arch/x86/kvm/vmx/evmcs.c > +++ b/arch/x86/kvm/vmx/evmcs.c > @@ -160,14 +160,6 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = { > HV_VMX_ENLIGHTENED_CLEAN_FIELD_ALL), > EVMCS1_FIELD(VM_ENTRY_MSR_LOAD_ADDR, vm_entry_msr_load_addr, > HV_VMX_ENLIGHTENED_CLEAN_FIELD_ALL), > - EVMCS1_FIELD(CR3_TARGET_VALUE0, cr3_target_value0, > - HV_VMX_ENLIGHTENED_CLEAN_FIELD_ALL), > - EVMCS1_FIELD(CR3_TARGET_VALUE1, cr3_target_value1, > - HV_VMX_ENLIGHTENED_CLEAN_FIELD_ALL), > - EVMCS1_FIELD(CR3_TARGET_VALUE2, cr3_target_value2, > - HV_VMX_ENLIGHTENED_CLEAN_FIELD_ALL), > - EVMCS1_FIELD(CR3_TARGET_VALUE3, cr3_target_value3, > - HV_VMX_ENLIGHTENED_CLEAN_FIELD_ALL), > > /* 32 bit rw */ > EVMCS1_FIELD(TPR_THRESHOLD, tpr_threshold, > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c > index aca57d8da400..aeb56b515c5c 100644 > --- a/arch/x86/kvm/vmx/nested.c > +++ b/arch/x86/kvm/vmx/nested.c > @@ -1785,10 +1785,6 @@ static int copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx) > * vmcs12->vm_exit_msr_store_addr = evmcs->vm_exit_msr_store_addr; > * vmcs12->vm_exit_msr_load_addr = evmcs->vm_exit_msr_load_addr; > * vmcs12->vm_entry_msr_load_addr = evmcs->vm_entry_msr_load_addr; > - * vmcs12->cr3_target_value0 = evmcs->cr3_target_value0; > - * vmcs12->cr3_target_value1 = evmcs->cr3_target_value1; > - * vmcs12->cr3_target_value2 = evmcs->cr3_target_value2; > - * vmcs12->cr3_target_value3 = evmcs->cr3_target_value3; > * vmcs12->page_fault_error_code_mask = > * evmcs->page_fault_error_code_mask; > * vmcs12->page_fault_error_code_match = > @@ -1862,10 +1858,6 @@ static int copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx) > * evmcs->vm_exit_msr_store_addr = vmcs12->vm_exit_msr_store_addr; > * evmcs->vm_exit_msr_load_addr = vmcs12->vm_exit_msr_load_addr; > * evmcs->vm_entry_msr_load_addr = vmcs12->vm_entry_msr_load_addr; > - * evmcs->cr3_target_value0 = vmcs12->cr3_target_value0; > - * evmcs->cr3_target_value1 = vmcs12->cr3_target_value1; > - * evmcs->cr3_target_value2 = vmcs12->cr3_target_value2; > - * evmcs->cr3_target_value3 = vmcs12->cr3_target_value3; > * evmcs->tpr_threshold = vmcs12->tpr_threshold; > * evmcs->virtual_processor_id = vmcs12->virtual_processor_id; > * evmcs->exception_bitmap = vmcs12->exception_bitmap; > @@ -5540,15 +5532,6 @@ static bool nested_vmx_exit_handled_cr(struct kvm_vcpu *vcpu, > return true; > break; > case 3: > - if ((vmcs12->cr3_target_count >= 1 && > - vmcs12->cr3_target_value0 == val) || > - (vmcs12->cr3_target_count >= 2 && > - vmcs12->cr3_target_value1 == val) || > - (vmcs12->cr3_target_count >= 3 && > - vmcs12->cr3_target_value2 == val) || > - (vmcs12->cr3_target_count >= 4 && > - vmcs12->cr3_target_value3 == val)) > - return false; > if (nested_cpu_has(vmcs12, CPU_BASED_CR3_LOAD_EXITING)) > return true; > break; > diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c > index 53dfb401316d..c8e51c004f78 100644 > --- a/arch/x86/kvm/vmx/vmcs12.c > +++ b/arch/x86/kvm/vmx/vmcs12.c > @@ -115,10 +115,6 @@ const unsigned short vmcs_field_to_offset_table[] = { > FIELD(CR4_GUEST_HOST_MASK, cr4_guest_host_mask), > FIELD(CR0_READ_SHADOW, cr0_read_shadow), > FIELD(CR4_READ_SHADOW, cr4_read_shadow), > - FIELD(CR3_TARGET_VALUE0, cr3_target_value0), > - FIELD(CR3_TARGET_VALUE1, cr3_target_value1), > - FIELD(CR3_TARGET_VALUE2, cr3_target_value2), > - FIELD(CR3_TARGET_VALUE3, cr3_target_value3), > FIELD(EXIT_QUALIFICATION, exit_qualification), > FIELD(GUEST_LINEAR_ADDRESS, guest_linear_address), > FIELD(GUEST_CR0, guest_cr0), > diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h > index d0c6df373f67..80232daf00ff 100644 > --- a/arch/x86/kvm/vmx/vmcs12.h > +++ b/arch/x86/kvm/vmx/vmcs12.h > @@ -80,10 +80,7 @@ struct __packed vmcs12 { > natural_width cr4_guest_host_mask; > natural_width cr0_read_shadow; > natural_width cr4_read_shadow; > - natural_width cr3_target_value0; > - natural_width cr3_target_value1; > - natural_width cr3_target_value2; > - natural_width cr3_target_value3; > + natural_width dead_space[4]; /* Last remnants of cr3_target_value[0-3]. */ > natural_width exit_qualification; > natural_width guest_linear_address; > natural_width guest_cr0; > @@ -263,10 +260,7 @@ static inline void vmx_check_vmcs12_offsets(void) > CHECK_OFFSET(cr4_guest_host_mask, 352); > CHECK_OFFSET(cr0_read_shadow, 360); > CHECK_OFFSET(cr4_read_shadow, 368); > - CHECK_OFFSET(cr3_target_value0, 376); > - CHECK_OFFSET(cr3_target_value1, 384); > - CHECK_OFFSET(cr3_target_value2, 392); > - CHECK_OFFSET(cr3_target_value3, 400); > + CHECK_OFFSET(dead_space, 376); > CHECK_OFFSET(exit_qualification, 408); > CHECK_OFFSET(guest_linear_address, 416); > CHECK_OFFSET(guest_cr0, 424); > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index aa1b8cf7c915..1ba7722445cc 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -5711,7 +5711,6 @@ void dump_vmcs(void) > u32 cpu_based_exec_ctrl, pin_based_exec_ctrl, secondary_exec_control; > unsigned long cr4; > u64 efer; > - int i, n; > > if (!dump_invalid_vmcs) { > pr_warn_ratelimited("set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.\n"); > @@ -5848,14 +5847,6 @@ void dump_vmcs(void) > pr_err("PostedIntrVec = 0x%02x\n", vmcs_read16(POSTED_INTR_NV)); > if ((secondary_exec_control & SECONDARY_EXEC_ENABLE_EPT)) > pr_err("EPT pointer = 0x%016llx\n", vmcs_read64(EPT_POINTER)); > - n = vmcs_read32(CR3_TARGET_COUNT); > - for (i = 0; i + 1 < n; i += 4) > - pr_err("CR3 target%u=%016lx target%u=%016lx\n", > - i, vmcs_readl(CR3_TARGET_VALUE0 + i * 2), > - i + 1, vmcs_readl(CR3_TARGET_VALUE0 + i * 2 + 2)); > - if (i < n) > - pr_err("CR3 target%u=%016lx\n", > - i, vmcs_readl(CR3_TARGET_VALUE0 + i * 2)); > if (secondary_exec_control & SECONDARY_EXEC_PAUSE_LOOP_EXITING) > pr_err("PLE Gap=%08x Window=%08x\n", > vmcs_read32(PLE_GAP), vmcs_read32(PLE_WINDOW)); Reviewed-by: Krish Sadhukhan