Date: Tue, 26 Jul 2022 16:57:48 +0200
From: Stoiko Ivanov
To: Paolo Bonzini
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, bgardon@google.com
Subject: Re: [PATCH] KVM: x86: enable TDP MMU by default
Message-ID: <20220726165748.76db5284@rosa.proxmox.com>
In-Reply-To: <20210726163106.1433600-1-pbonzini@redhat.com>
References: <20210726163106.1433600-1-pbonzini@redhat.com>

Hi,

Proxmox[0] recently switched to the 5.15 kernel series (based on the one for Ubuntu 22.04), which includes this commit. While it works well on most installations, a few users have reported that some of their guests shut down with `KVM: entry failed, hardware error 0x80000021` being logged under certain conditions and environments[1]:

* The issue is not deterministically reproducible and only shows up eventually under certain loads (e.g. only one system in our office exhibits it, and only when repeatedly installing Windows 2k22: roughly one out of 10 installs crashes the guest)
* While most reports refer to (newer) Windows guests, some users run into the issue with Linux VMs as well
* The affected systems cover quite a wide range - our affected machine is an old Ivy Bridge Xeon with an outdated BIOS (an equivalent system with the latest available BIOS is not affected), but we have reports for all kinds of Intel CPUs (up to an i5-12400). AMD CPUs do not seem to be affected.

Disabling tdp_mmu seems to mitigate the issue, but I still thought you might want to know that tdp_mmu causes problems in some cases. Perhaps you even have an idea of how to fix the issue without explicitly disabling tdp_mmu?

While trying to find the cause, we also tested a 5.18 kernel (still affected).
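
For anyone who wants to confirm whether a given host is actually running with the TDP MMU before or after applying the mitigation, here is a minimal sketch. It assumes the setting is exposed as the `tdp_mmu` parameter of the `kvm` module under the usual sysfs path and that it reads back as `Y` or `N`; the function name `tdp_mmu_enabled` is made up for this example.

```python
from pathlib import Path
from typing import Optional

# Assumed sysfs location of the kvm module parameter; it only exists
# while the kvm module is loaded.
TDP_MMU_PARAM = Path("/sys/module/kvm/parameters/tdp_mmu")


def tdp_mmu_enabled(param: Path = TDP_MMU_PARAM) -> Optional[bool]:
    """Return True/False for the tdp_mmu setting, or None if unreadable."""
    try:
        value = param.read_text().strip()
    except OSError:
        # Module not loaded, parameter missing, or no permission.
        return None
    # Boolean module parameters are normally reported as "Y" or "N".
    return value in ("Y", "1")
```

Calling `tdp_mmu_enabled()` on a host returns `None` when the parameter cannot be read, so a caller can tell "disabled" apart from "unknown". How the parameter gets disabled (for example via a module option at load time) depends on the distribution and is not covered here.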
The logs of the hypervisor after a guest crash:

```
Jun 24 17:25:51 testhost kernel: VMCS 000000006afb1754, last attempted VM-entry on CPU 12
Jun 24 17:25:51 testhost kernel: *** Guest State ***
Jun 24 17:25:51 testhost kernel: CR0: actual=0x0000000000050032, shadow=0x0000000000050032, gh_mask=fffffffffffffff7
Jun 24 17:25:51 testhost kernel: CR4: actual=0x0000000000002040, shadow=0x0000000000000000, gh_mask=fffffffffffef871
Jun 24 17:25:51 testhost kernel: CR3 = 0x000000013cbf4002
Jun 24 17:25:51 testhost kernel: PDPTR0 = 0x0000003300050011 PDPTR1 = 0x0000000000000000
Jun 24 17:25:51 testhost kernel: PDPTR2 = 0x0000000000000000 PDPTR3 = 0x0000010000000000
Jun 24 17:25:51 testhost kernel: RSP = 0xffff898cacda2c90 RIP = 0x0000000000008000
Jun 24 17:25:51 testhost kernel: RFLAGS=0x00000002 DR7 = 0x0000000000000400
Jun 24 17:25:51 testhost kernel: Sysenter RSP=0000000000000000 CS:RIP=0000:0000000000000000
Jun 24 17:25:51 testhost kernel: CS: sel=0xc200, attr=0x08093, limit=0xffffffff, base=0x000000007ffc2000
Jun 24 17:25:51 testhost kernel: DS: sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
Jun 24 17:25:51 testhost kernel: SS: sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
Jun 24 17:25:51 testhost kernel: ES: sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
Jun 24 17:25:51 testhost kernel: FS: sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
Jun 24 17:25:51 testhost kernel: GS: sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
Jun 24 17:25:51 testhost kernel: GDTR: limit=0x00000057, base=0xfffff8024e652fb0
Jun 24 17:25:51 testhost kernel: LDTR: sel=0x0000, attr=0x10000, limit=0x000fffff, base=0x0000000000000000
Jun 24 17:25:51 testhost kernel: IDTR: limit=0x00000000, base=0x0000000000000000
Jun 24 17:25:51 testhost kernel: TR: sel=0x0040, attr=0x0008b, limit=0x00000067, base=0xfffff8024e651000
Jun 24 17:25:51 testhost kernel: EFER= 0x0000000000000000
Jun 24 17:25:51 testhost kernel: PAT = 0x0007010600070106
Jun 24 17:25:51 testhost kernel: DebugCtl = 0x0000000000000000 DebugExceptions = 0x0000000000000000
Jun 24 17:25:51 testhost kernel: Interruptibility = 00000009 ActivityState = 00000000
Jun 24 17:25:51 testhost kernel: InterruptStatus = 002f
Jun 24 17:25:51 testhost kernel: *** Host State ***
Jun 24 17:25:51 testhost kernel: RIP = 0xffffffffc119a0a0 RSP = 0xffffa6a24a52bc20
Jun 24 17:25:51 testhost kernel: CS=0010 SS=0018 DS=0000 ES=0000 FS=0000 GS=0000 TR=0040
Jun 24 17:25:51 testhost kernel: FSBase=00007f1bf7fff700 GSBase=ffff97df5ed80000 TRBase=fffffe00002c7000
Jun 24 17:25:51 testhost kernel: GDTBase=fffffe00002c5000 IDTBase=fffffe0000000000
Jun 24 17:25:51 testhost kernel: CR0=0000000080050033 CR3=00000001226c8004 CR4=00000000001726e0
Jun 24 17:25:51 testhost kernel: Sysenter RSP=fffffe00002c7000 CS:RIP=0010:ffffffffbd201d90
Jun 24 17:25:51 testhost kernel: EFER= 0x0000000000000d01
Jun 24 17:25:51 testhost kernel: PAT = 0x0407050600070106
Jun 24 17:25:51 testhost kernel: *** Control State ***
Jun 24 17:25:51 testhost kernel: PinBased=000000ff CPUBased=b5a06dfa SecondaryExec=000007eb
Jun 24 17:25:51 testhost kernel: EntryControls=0000d1ff ExitControls=002befff
Jun 24 17:25:51 testhost kernel: ExceptionBitmap=00060042 PFECmask=00000000 PFECmatch=00000000
Jun 24 17:25:51 testhost kernel: VMEntry: intr_info=00000000 errcode=00000004 ilen=00000000
Jun 24 17:25:51 testhost kernel: VMExit: intr_info=00000000 errcode=00000000 ilen=00000001
Jun 24 17:25:51 testhost kernel: reason=80000021 qualification=0000000000000000
Jun 24 17:25:51 testhost kernel: IDTVectoring: info=00000000 errcode=00000000
Jun 24 17:25:51 testhost kernel: TSC Offset = 0xff96fad07396b5f8
Jun 24 17:25:51 testhost kernel: SVI|RVI = 00|2f TPR Threshold = 0x00
Jun 24 17:25:51 testhost kernel: APIC-access addr = 0x000000014516c000 virt-APIC addr = 0x000000014afe7000
Jun 24 17:25:51 testhost kernel: PostedIntrVec = 0xf2
Jun 24 17:25:51 testhost kernel: EPT pointer = 0x000000011aa2d01e
Jun 24 17:25:51 testhost kernel: PLE Gap=00000080 Window=00020000
Jun 24 17:25:51 testhost kernel: Virtual processor ID = 0x0003
Jun 24 17:25:51 testhost QEMU[2997]: KVM: entry failed, hardware error 0x80000021
Jun 24 17:25:51 testhost QEMU[2997]: If you're running a guest on an Intel machine without unrestricted mode
Jun 24 17:25:51 testhost QEMU[2997]: support, the failure can be most likely due to the guest entering an invalid
Jun 24 17:25:51 testhost QEMU[2997]: state for Intel VT. For example, the guest maybe running in big real mode
Jun 24 17:25:51 testhost QEMU[2997]: which is not supported on less recent Intel processors.
Jun 24 17:25:51 testhost QEMU[2997]: EAX=00001e30 EBX=4e364180 ECX=00000001 EDX=00000000
Jun 24 17:25:51 testhost QEMU[2997]: ESI=df291040 EDI=e0d82080 EBP=acda2ea0 ESP=acda2c90
Jun 24 17:25:51 testhost QEMU[2997]: EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT=0
Jun 24 17:25:51 testhost QEMU[2997]: ES =0000 00000000 ffffffff 00809300
Jun 24 17:25:51 testhost QEMU[2997]: CS =c200 7ffc2000 ffffffff 00809300
Jun 24 17:25:51 testhost QEMU[2997]: SS =0000 00000000 ffffffff 00809300
Jun 24 17:25:51 testhost QEMU[2997]: DS =0000 00000000 ffffffff 00809300
Jun 24 17:25:51 testhost QEMU[2997]: FS =0000 00000000 ffffffff 00809300
Jun 24 17:25:51 testhost QEMU[2997]: GS =0000 00000000 ffffffff 00809300
Jun 24 17:25:51 testhost QEMU[2997]: LDT=0000 00000000 000fffff 00000000
Jun 24 17:25:51 testhost QEMU[2997]: TR =0040 4e651000 00000067 00008b00
Jun 24 17:25:51 testhost QEMU[2997]: GDT= 4e652fb0 00000057
Jun 24 17:25:51 testhost QEMU[2997]: IDT= 00000000 00000000
Jun 24 17:25:51 testhost QEMU[2997]: CR0=00050032 CR2=826c6000 CR3=3cbf4002 CR4=00000000
Jun 24 17:25:51 testhost QEMU[2997]: DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
Jun 24 17:25:51 testhost QEMU[2997]: DR6=00000000ffff0ff0 DR7=0000000000000400
Jun 24 17:25:51 testhost QEMU[2997]: EFER=0000000000000000
Jun 24 17:25:51 testhost QEMU[2997]: Code=kvm: ../hw/core/cpu-sysemu.c:77: cpu_asidx_from_attrs: Assertion `ret < cpu->num_ases && ret >= 0' failed.
```

Should you need any further information from my side or want me to test some potential fix - please don't hesitate to ask!

Kind Regards,
stoiko

[0] https://www.proxmox.com/
[1] https://forum.proxmox.com/threads/.109410

On Mon, 26 Jul 2021 12:31:06 -0400
Paolo Bonzini wrote:

> With the addition of fast page fault support, the TDP-specific MMU has reached
> feature parity with the original MMU. All my testing in the last few months
> has been done with the TDP MMU; switch the default on 64-bit machines.
>
> Signed-off-by: Paolo Bonzini
> ..snip..
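
P.S. For readers of the log, the `reason=80000021` / `hardware error 0x80000021` value can be split into its fields. The sketch below is only an illustration: it assumes the Intel VMX exit-reason layout (basic reason in bits 15:0, bit 31 set when the VM entry itself failed) and lists only the three entry-failure reasons from the Intel SDM table of basic exit reasons. The helper name `decode_exit_reason` is made up for this example.

```python
# Basic exit reasons that describe a failed VM entry, per the Intel SDM
# list of VMX basic exit reasons. Deliberately not a complete table.
ENTRY_FAILURE_REASONS = {
    33: "VM-entry failure due to invalid guest state",
    34: "VM-entry failure due to MSR loading",
    41: "VM-entry failure due to machine-check event",
}


def decode_exit_reason(raw: int) -> dict:
    """Split a raw VMX exit reason into basic reason and entry-failure flag."""
    basic = raw & 0xFFFF                      # bits 15:0
    entry_failure = bool((raw >> 31) & 1)     # bit 31
    description = ENTRY_FAILURE_REASONS.get(basic, "not in this sketch's table")
    return {
        "basic": basic,
        "entry_failure": entry_failure,
        "description": description,
    }


print(decode_exit_reason(0x80000021))
```

For the value from the dump, this yields basic reason 33 with the entry-failure bit set, i.e. the CPU rejected the guest state at VM entry. That matches the hint QEMU prints, but it does not by itself say which field of the guest state was invalid.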