From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Daan De Meyer, linux-kernel@vger.kernel.org
Subject: [PATCH RFC not-to-be-merged] KVM: SVM: Workaround overly strict CR3 check by Hyper-V
Date: Tue, 19 Mar 2024 17:34:56 +0100
Message-ID: <20240319163456.133942-1-vkuznets@redhat.com>

Failing VMRUNs (immediate #VMEXIT with error code VMEXIT_INVALID) for KVM
guests on top of Hyper-V are observed when KVM does SMM emulation. The root
cause of the problem appears to be an overly strict CR3 VMCB check done by
Hyper-V. Here's an example of a CR state which triggers the failure:

 kvm_amd: vmpl: 0 cpl: 0 efer: 0000000000001000
 kvm_amd: cr0: 0000000000050032 cr2: ffff92dcf8601000
 kvm_amd: cr3: 0000000100232003 cr4: 0000000000000040

The CR3 value may look a bit weird: it has non-zero PCID bits set as well as
non-zero bits in the upper half, yet the processor is not in long mode.
This, however, is a valid state upon entering SMM from a long mode context
with PCID enabled and should not cause VMEXIT_INVALID. The APM says that
VMEXIT_INVALID is triggered when "Any MBZ bit of CR3 is set.". In the CR3
format, the only MBZ bits are those above MAXPHYADDR; the rest are merely
"Reserved".

Place a temporary workaround in KVM to avoid putting problematic CR3 values
into the VMCB when KVM runs on top of Hyper-V. Enable CR3 READ/WRITE
intercepts to make sure the guest does not observe side effects of the
mangling. Also, do not overwrite 'vcpu->arch.cr3' with the mangled 'save.cr3'
value when CR3 intercepts are enabled (and thus a possible CR3 update from
the guest would change 'vcpu->arch.cr3' instantly).

The workaround is only needed until Hyper-V gets fixed.

Reported-by: Daan De Meyer
Signed-off-by: Vitaly Kuznetsov
---
- The patch serves mostly documentation purposes; I don't expect it to be
  merged into mainline. Hyper-V *is* supposed to get fixed, but the timeline
  is unclear at this point. As Azure is a fairly popular platform for running
  nested KVM, it is possible that the bug will be discovered again (running
  an OVMF-based guest is a good starting point!).
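- For illustration, the CR3 masking described above can be modelled in user
  space. This is a minimal sketch under stated assumptions, not kernel code:
  bits_range() is a stand-in modelled on KVM's rsvd_bits(), while
  sanitize_cr3() and CR3_PCID_MASK are hypothetical names chosen for this
  example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits 11:0 of CR3 carry the PCID when CR4.PCIDE is set. */
#define CR3_PCID_MASK 0xFFFULL

/*
 * User-space stand-in for KVM's rsvd_bits(): returns a mask with
 * bits s..e (inclusive) set, or 0 when the range is empty.
 */
static uint64_t bits_range(unsigned int s, unsigned int e)
{
	if (e < s)
		return 0;
	return ((2ULL << (e - s)) - 1) << s;
}

/*
 * Hypothetical helper mirroring the masking done by the workaround:
 * outside long mode, clear bits 32..MAXPHYADDR-1; with CR4.PCIDE clear,
 * clear the PCID bits. Bits at or above MAXPHYADDR are left untouched,
 * as they are genuinely MBZ.
 */
static uint64_t sanitize_cr3(uint64_t cr3, unsigned int maxphyaddr,
			     bool long_mode, bool pcide)
{
	/* Assumed sane range for this sketch only. */
	assert(maxphyaddr > 32 && maxphyaddr <= 52);

	if (!long_mode)
		cr3 &= ~bits_range(32, maxphyaddr - 1);
	if (!pcide)
		cr3 &= ~CR3_PCID_MASK;
	return cr3;
}
```

  With the CR3 value from the dump above (EFER has LMA clear, CR4 has PCIDE
  clear) and an assumed MAXPHYADDR of 48,
  sanitize_cr3(0x0000000100232003, 48, false, false) yields 0x232000: bit 32
  and PCID bits 1:0 are cleared. When the result differs from the input, the
  patch additionally enables the CR3 read/write intercepts so that the guest
  keeps seeing the unmodified value.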
---
 arch/x86/kvm/svm/svm.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 272d5ed37ce7..6ff7cbcb5cac 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -41,6 +41,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
@@ -3497,7 +3498,7 @@ static int svm_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 	if (!sev_es_guest(vcpu->kvm)) {
 		if (!svm_is_intercept(svm, INTERCEPT_CR0_WRITE))
 			vcpu->arch.cr0 = svm->vmcb->save.cr0;
-		if (npt_enabled)
+		if (npt_enabled && !svm_is_intercept(svm, INTERCEPT_CR3_WRITE))
 			vcpu->arch.cr3 = svm->vmcb->save.cr3;
 	}
 
@@ -4264,6 +4265,33 @@ static void svm_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa,
 		cr3 = root_hpa;
 	}
 
+#if IS_ENABLED(CONFIG_HYPERV)
+	/*
+	 * Workaround an issue in Hyper-V hypervisor where 'reserved' bits are treated
+	 * as MBZ failing VMRUN.
+	 */
+	if (hypervisor_is_type(X86_HYPER_MS_HYPERV) && likely(npt_enabled)) {
+		unsigned long cr3_unmod = cr3;
+
+		/*
+		 * Bits MAXPHYADDR:63 are MBZ but bits 32:MAXPHYADDR-1 are just 'reserved'
+		 * in !long mode.
+		 */
+		if (!is_long_mode(vcpu))
+			cr3 &= ~rsvd_bits(32, cpuid_maxphyaddr(vcpu) - 1);
+
+		if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_PCIDE))
+			cr3 &= ~X86_CR3_PCID_MASK;
+
+		if (cr3 != cr3_unmod && !svm_is_intercept(svm, INTERCEPT_CR3_READ)) {
+			svm_set_intercept(svm, INTERCEPT_CR3_READ);
+			svm_set_intercept(svm, INTERCEPT_CR3_WRITE);
+		} else if (cr3 == cr3_unmod && svm_is_intercept(svm, INTERCEPT_CR3_READ)) {
+			svm_clr_intercept(svm, INTERCEPT_CR3_READ);
+			svm_clr_intercept(svm, INTERCEPT_CR3_WRITE);
+		}
+	}
+#endif
 	svm->vmcb->save.cr3 = cr3;
 	vmcb_mark_dirty(svm->vmcb, VMCB_CR);
 }
-- 
2.44.0