Date: Mon, 8 Mar 2021 13:49:31 -0800
From: Sean Christopherson
To: Tom Lendacky
Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gardon
Subject: Re: [PATCH 20/24] KVM: x86/mmu: Use a dedicated bit to track shadow/MMU-present SPTEs
References: <20210225204749.1512652-1-seanjc@google.com> <20210225204749.1512652-21-seanjc@google.com> <42917119-b43a-062b-6c09-13b988f7194b@amd.com>

On Mon, Mar 08, 2021, Sean Christopherson wrote:
> On Mon, Mar 08, 2021, Tom Lendacky wrote:
> > On the hypervisor, I see the following:
> >
> > [ 55.886136] get_mmio_spte: detect reserved bits on spte, addr 0xffc12792, dump hierarchy:
> > [ 55.895284] ------ spte 0x1344a0827 level 4.
> > [ 55.900059] ------ spte 0x134499827 level 3.
> > [ 55.904877] ------ spte 0x165bf0827 level 2.
> > [ 55.909651] ------ spte 0xff800ffc12817 level 1.
>
> Ah fudge. I know what's wrong. The MMIO generation uses bit 11, which means
> is_shadow_present_pte() can get a false positive on high MMIO generations. This
> particular error can be squashed by explicitly checking for MMIO sptes in
> get_mmio_spte(), but I'm guessing there are other flows where a false positive
> is fatal (probably the __pte_list_remove bug below...). The safe thing to do is
> to steal bit 11 from MMIO SPTEs so that shadow present PTEs are the only thing
> that sets that bit.
>
> I'll reproduce by stuffing the MMIO generation and get a patch posted. Sorry :-/
>
> > When I kill the guest, I get a kernel panic:
> >
> > [ 95.539683] __pte_list_remove: 0000000040567a6a 0->BUG
> > [ 95.545481] kernel BUG at arch/x86/kvm/mmu/mmu.c:896!

Fudging get_mmio_spte() did allow the guest to boot, but as expected KVM panicked
during guest shutdown. Initial testing of the changes below looks good; I'll
test more thoroughly and (hopefully) post later today.
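
For illustration only, here is a minimal user-space sketch of the collision described in the quoted explanation. It is not kernel code: the helper names mmio_gen_low_to_spte() and looks_mmu_present() are hypothetical, and the assumption that bit 11 is the shadow/MMU-present bit is taken from the discussion above. It shows that with the low generation bits mapped to spte bits 3-11, any generation with bit 8 set also sets spte bit 11, while ending the low range at bit 10 avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption from the thread: bit 11 marks shadow/MMU-present SPTEs. */
#define SPTE_MMU_PRESENT_MASK (UINT64_C(1) << 11)

/* First spte bit used by the low part of the MMIO generation (per the patch). */
#define GEN_LOW_START 3

/*
 * Hypothetical helper: place the low generation bits into spte bits
 * GEN_LOW_START..low_end and discard generation bits that do not fit.
 */
static uint64_t mmio_gen_low_to_spte(uint64_t gen, unsigned int low_end)
{
	unsigned int low_bits = low_end - GEN_LOW_START + 1;
	uint64_t low_mask = (UINT64_C(1) << low_bits) - 1;

	return (gen & low_mask) << GEN_LOW_START;
}

/* Hypothetical stand-in for a check that only tests the present bit. */
static bool looks_mmu_present(uint64_t spte)
{
	return (spte & SPTE_MMU_PRESENT_MASK) != 0;
}

/* Self-check: old layout (low_end = 11) collides, new layout (10) does not. */
void check_present_bit_collision(void)
{
	assert(looks_mmu_present(mmio_gen_low_to_spte(0x100, 11)));
	assert(!looks_mmu_present(mmio_gen_low_to_spte(0x100, 10)));
}
```

The level 1 spte from the log, 0xff800ffc12817, has bit 11 set, which is consistent with this kind of false positive.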
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index b53036d9ddf3..e6e683e0fdcd 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -101,11 +101,11 @@ static_assert(!(EPT_SPTE_MMU_WRITABLE & SHADOW_ACC_TRACK_SAVED_MASK));
 #undef SHADOW_ACC_TRACK_SAVED_MASK
 
 /*
- * Due to limited space in PTEs, the MMIO generation is a 20 bit subset of
+ * Due to limited space in PTEs, the MMIO generation is a 19 bit subset of
  * the memslots generation and is derived as follows:
  *
- * Bits 0-8 of the MMIO generation are propagated to spte bits 3-11
- * Bits 9-19 of the MMIO generation are propagated to spte bits 52-62
+ * Bits 0-7 of the MMIO generation are propagated to spte bits 3-10
+ * Bits 8-18 of the MMIO generation are propagated to spte bits 52-62
  *
  * The KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS flag is intentionally not included in
  * the MMIO generation number, as doing so would require stealing a bit from
@@ -116,7 +116,7 @@ static_assert(!(EPT_SPTE_MMU_WRITABLE & SHADOW_ACC_TRACK_SAVED_MASK));
  */
 
 #define MMIO_SPTE_GEN_LOW_START		3
-#define MMIO_SPTE_GEN_LOW_END		11
+#define MMIO_SPTE_GEN_LOW_END		10
 
 #define MMIO_SPTE_GEN_HIGH_START	52
 #define MMIO_SPTE_GEN_HIGH_END		62
@@ -125,12 +125,14 @@ static_assert(!(EPT_SPTE_MMU_WRITABLE & SHADOW_ACC_TRACK_SAVED_MASK));
 						    MMIO_SPTE_GEN_LOW_START)
 #define MMIO_SPTE_GEN_HIGH_MASK		GENMASK_ULL(MMIO_SPTE_GEN_HIGH_END, \
 						    MMIO_SPTE_GEN_HIGH_START)
+static_assert(!(SPTE_MMU_PRESENT_MASK &
+		(MMIO_SPTE_GEN_LOW_MASK | MMIO_SPTE_GEN_HIGH_MASK)));
 
 #define MMIO_SPTE_GEN_LOW_BITS		(MMIO_SPTE_GEN_LOW_END - MMIO_SPTE_GEN_LOW_START + 1)
 #define MMIO_SPTE_GEN_HIGH_BITS		(MMIO_SPTE_GEN_HIGH_END - MMIO_SPTE_GEN_HIGH_START + 1)
 
 /* remember to adjust the comment above as well if you change these */
-static_assert(MMIO_SPTE_GEN_LOW_BITS == 9 && MMIO_SPTE_GEN_HIGH_BITS == 11);
+static_assert(MMIO_SPTE_GEN_LOW_BITS == 8 && MMIO_SPTE_GEN_HIGH_BITS == 11);
 
 #define MMIO_SPTE_GEN_LOW_SHIFT		(MMIO_SPTE_GEN_LOW_START - 0)
 #define MMIO_SPTE_GEN_HIGH_SHIFT	(MMIO_SPTE_GEN_HIGH_START - MMIO_SPTE_GEN_LOW_BITS)
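
As a rough sanity check of the 8 + 11 bit layout in the diff above, the following user-space sketch packs a 19-bit generation into spte bits 3-10 and 52-62 and unpacks it again. It is an assumption-laden illustration, not the kernel's implementation: GENMASK64() is a local stand-in for GENMASK_ULL(), the helper names gen_to_spte_bits() and spte_bits_to_gen() are made up for this example, and the present bit is assumed to be bit 11 as discussed in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's GENMASK_ULL(high, low). */
#define GENMASK64(h, l) \
	((~UINT64_C(0) >> (63 - (h))) & (~UINT64_C(0) << (l)))

/* Values copied from the patched macros. */
#define GEN_LOW_START   3
#define GEN_LOW_END     10
#define GEN_HIGH_START  52
#define GEN_HIGH_END    62

#define GEN_LOW_MASK    GENMASK64(GEN_LOW_END, GEN_LOW_START)
#define GEN_HIGH_MASK   GENMASK64(GEN_HIGH_END, GEN_HIGH_START)
#define GEN_LOW_BITS    (GEN_LOW_END - GEN_LOW_START + 1)
#define GEN_LOW_SHIFT   (GEN_LOW_START - 0)
#define GEN_HIGH_SHIFT  (GEN_HIGH_START - GEN_LOW_BITS)

/* Assumption from the thread: bit 11 is the shadow/MMU-present bit. */
#define PRESENT_MASK    (UINT64_C(1) << 11)

/* Mirrors the intent of the static_assert added by the patch. */
_Static_assert(!(PRESENT_MASK & (GEN_LOW_MASK | GEN_HIGH_MASK)),
	       "generation bits must not overlap the present bit");

/* Hypothetical helper: scatter a generation into its two spte bit ranges. */
static uint64_t gen_to_spte_bits(uint64_t gen)
{
	return ((gen << GEN_LOW_SHIFT) & GEN_LOW_MASK) |
	       ((gen << GEN_HIGH_SHIFT) & GEN_HIGH_MASK);
}

/* Hypothetical helper: gather the generation back out of an spte value. */
static uint64_t spte_bits_to_gen(uint64_t spte)
{
	return ((spte & GEN_LOW_MASK) >> GEN_LOW_SHIFT) |
	       ((spte & GEN_HIGH_MASK) >> GEN_HIGH_SHIFT);
}
```

With this layout, generation bit 8 lands on spte bit 52 rather than bit 11, so the largest 19-bit generation, 0x7ffff, round-trips without touching the present bit.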