From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Sean Christopherson, Jim Mattson, Liran Alon, Paolo Bonzini, Sasha Levin, kvm@vger.kernel.org
Subject: [PATCH AUTOSEL 5.1 056/141] KVM: nVMX: Intercept VMWRITEs to GUEST_{CS,SS}_AR_BYTES
Date: Fri, 19 Jul 2019 00:01:21 -0400
Message-Id:
 <20190719040246.15945-56-sashal@kernel.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190719040246.15945-1-sashal@kernel.org>
References: <20190719040246.15945-1-sashal@kernel.org>
MIME-Version: 1.0
X-stable: review
X-Patchwork-Hint: Ignore
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Sean Christopherson

[ Upstream commit b643780562af5378ef7fe731c65b8f93e49c59c6 ]

VMMs frequently read the guest's CS and SS AR bytes to detect 64-bit
mode and CPL respectively, but effectively never write said fields once
the VM is initialized.  Intercepting VMWRITEs for the two fields saves
~55 cycles in copy_shadow_to_vmcs12().

Because some Intel CPUs, e.g. Haswell, drop the reserved bits of the
guest access rights fields on VMWRITE, exposing the fields to L1 for
VMREAD but not VMWRITE leads to inconsistent behavior between L1 and L2.
On hardware that drops the bits, L1 will see the stripped down value due
to reading the value from hardware, while L2 will see the full original
value as stored by KVM.  To avoid such an inconsistency, emulate the
behavior on all CPUs, but only for intercepted VMWRITEs so as to avoid
introducing pointless latency into copy_shadow_to_vmcs12(), e.g. if the
emulation were added to vmcs12_write_any().

Since the AR_BYTES emulation is done only for intercepted VMWRITE, if a
future patch (re)exposed AR_BYTES for both VMWRITE and VMREAD, then KVM
would end up with inconsistent behavior on pre-Haswell hardware, e.g. KVM
would drop the reserved bits on intercepted VMWRITE, but direct VMWRITE
to the shadow VMCS would not drop the bits.  Add a WARN in the shadow
field initialization to detect any attempt to expose an AR_BYTES field
without updating vmcs12_write_any().

Note, emulation of the AR_BYTES reserved bit behavior is based on a
patch[1] from Jim Mattson that applied the emulation to all writes to
vmcs12 so that live migration across different generations of hardware
would not introduce divergent behavior.  But given that live migration
of nested state has already been enabled, that ship has sailed (not to
mention that no sane VMM will be affected by this behavior).

[1] https://patchwork.kernel.org/patch/10483321/

Cc: Jim Mattson
Cc: Liran Alon
Signed-off-by: Sean Christopherson
Signed-off-by: Paolo Bonzini
Signed-off-by: Sasha Levin
---
 arch/x86/kvm/vmx/nested.c             | 15 +++++++++++++++
 arch/x86/kvm/vmx/vmcs_shadow_fields.h |  4 ++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 897ae4b62980..79c76318bcb8 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -91,6 +91,10 @@ static void init_vmcs_shadow_fields(void)
 			pr_err("Missing field from shadow_read_write_field %x\n",
 			       field + 1);
 
+		WARN_ONCE(field >= GUEST_ES_AR_BYTES &&
+			  field <= GUEST_TR_AR_BYTES,
+			  "Update vmcs12_write_any() to expose AR_BYTES RW");
+
 		/*
 		 * PML and the preemption timer can be emulated, but the
 		 * processor cannot vmwrite to fields that don't exist
@@ -4532,6 +4536,17 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
 		vmcs12 = get_shadow_vmcs12(vcpu);
 	}
 
+	/*
+	 * Some Intel CPUs intentionally drop the reserved bits of the AR byte
+	 * fields on VMWRITE.  Emulate this behavior to ensure consistent KVM
+	 * behavior regardless of the underlying hardware, e.g. if an AR_BYTE
+	 * field is intercepted for VMWRITE but not VMREAD (in L1), then VMREAD
+	 * from L1 will return a different value than VMREAD from L2 (L1 sees
+	 * the stripped down value, L2 sees the full value as stored by KVM).
+	 */
+	if (field >= GUEST_ES_AR_BYTES && field <= GUEST_TR_AR_BYTES)
+		field_value &= 0x1f0ff;
+
 	if (vmcs12_write_any(vmcs12, field, field_value) < 0)
 		return nested_vmx_failValid(vcpu,
 			VMXERR_UNSUPPORTED_VMCS_COMPONENT);
diff --git a/arch/x86/kvm/vmx/vmcs_shadow_fields.h b/arch/x86/kvm/vmx/vmcs_shadow_fields.h
index 132432f375c2..97dd5295be31 100644
--- a/arch/x86/kvm/vmx/vmcs_shadow_fields.h
+++ b/arch/x86/kvm/vmx/vmcs_shadow_fields.h
@@ -40,14 +40,14 @@ SHADOW_FIELD_RO(VM_EXIT_INSTRUCTION_LEN)
 SHADOW_FIELD_RO(IDT_VECTORING_INFO_FIELD)
 SHADOW_FIELD_RO(IDT_VECTORING_ERROR_CODE)
 SHADOW_FIELD_RO(VM_EXIT_INTR_ERROR_CODE)
+SHADOW_FIELD_RO(GUEST_CS_AR_BYTES)
+SHADOW_FIELD_RO(GUEST_SS_AR_BYTES)
 SHADOW_FIELD_RW(CPU_BASED_VM_EXEC_CONTROL)
 SHADOW_FIELD_RW(EXCEPTION_BITMAP)
 SHADOW_FIELD_RW(VM_ENTRY_EXCEPTION_ERROR_CODE)
 SHADOW_FIELD_RW(VM_ENTRY_INTR_INFO_FIELD)
 SHADOW_FIELD_RW(VM_ENTRY_INSTRUCTION_LEN)
 SHADOW_FIELD_RW(TPR_THRESHOLD)
-SHADOW_FIELD_RW(GUEST_CS_AR_BYTES)
-SHADOW_FIELD_RW(GUEST_SS_AR_BYTES)
 SHADOW_FIELD_RW(GUEST_INTERRUPTIBILITY_INFO)
 SHADOW_FIELD_RW(VMX_PREEMPTION_TIMER_VALUE)
 
-- 
2.20.1