Date: Thu, 21 Oct 2021 14:52:20 +0000
From: Sean Christopherson
To: Lai Jiangshan
Cc: Lai Jiangshan, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
    Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org,
    "H. Peter Anvin"
Subject: Re: [PATCH 1/4] KVM: X86: Fix tlb flush for tdp in kvm_invalidate_pcid()
References: <20211019110154.4091-1-jiangshanlai@gmail.com>
    <20211019110154.4091-2-jiangshanlai@gmail.com>
    <55abc519-b528-ddaa-120d-8d157b520623@linux.alibaba.com>
In-Reply-To: <55abc519-b528-ddaa-120d-8d157b520623@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 21, 2021, Lai Jiangshan wrote:
>
>
> On 2021/10/21 02:26, Sean Christopherson wrote:
> > On Wed, Oct 20, 2021, Lai Jiangshan wrote:
> > > On 2021/10/19 23:25, Sean Christopherson wrote:
> > > I just read some interception policy in vmx.c, if EPT=1 but vmx_need_pf_intercept()
> > > return true for some reasons/configs, #PF is intercepted. But CR3 write is not
> > > intercepted, which means there will be an EPT fault _after_ (IIUC) the CR3 write if
> > > the GPA of the new CR3 exceeds the guest maxphyaddr limit. And kvm queues a fault to
> > > the guest which is also _after_ the CR3 write, but the guest expects the fault before
> > > the write.
> > >
> > > IIUC, it can be fixed by intercepting CR3 write or reversing the CR3 write in EPT
> > > violation handler.
> >
> > KVM implicitly does the latter by emulating the faulting instruction.
> >
> >   static int handle_ept_violation(struct kvm_vcpu *vcpu)
> >   {
> >       ...
> >
> >       /*
> >        * Check that the GPA doesn't exceed physical memory limits, as that is
> >        * a guest page fault. We have to emulate the instruction here, because
> >        * if the illegal address is that of a paging structure, then
> >        * EPT_VIOLATION_ACC_WRITE bit is set. Alternatively, if supported we
> >        * would also use advanced VM-exit information for EPT violations to
> >        * reconstruct the page fault error code.
> >        */
> >       if (unlikely(allow_smaller_maxphyaddr && kvm_vcpu_is_illegal_gpa(vcpu, gpa)))
> >           return kvm_emulate_instruction(vcpu, 0);
> >
> >       return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);
> >   }
> >
> > and injecting a #GP when kvm_set_cr3() fails.
>
> I think the EPT violation happens *after* the cr3 write. So the instruction to be
> emulated is not "cr3 write". The emulation will queue fault into guest though,
> recursive EPT violation happens since the cr3 exceeds maxphyaddr limit.

Doh, you're correct. I think my mind wandered into thinking about what would
happen with PDPTRs and forgot to get back to normal MOV CR3.

So yeah, the only way to correctly handle this would be to intercept CR3 loads.
I'm guessing that would have a noticeable impact on guest performance.

Paolo, I'll leave this one for you to decide, we have pretty much written off
allow_smaller_maxphyaddr :-)
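
To make the ordering problem concrete, here is a toy model in plain C. It is
only a sketch, not KVM code: struct toy_vcpu and every toy_* helper are
hypothetical names made up for illustration, and the "illegal GPA" test is
assumed to be a simple "any bit at or above the guest's MAXPHYADDR is set"
check. It contrasts an intercepted CR3 load, which can fault before the
register changes, with a non-intercepted one, where the bad value is already
in CR3 by the time anything notices.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy vCPU state. None of these names exist in KVM. */
struct toy_vcpu {
    uint64_t cr3;         /* value the guest currently has in CR3 */
    unsigned maxphyaddr;  /* guest-visible physical address width, in bits */
    bool gp_pending;      /* a #GP is queued for injection into the guest */
};

/* Assumption: a GPA is illegal if any bit at or above maxphyaddr is set. */
static inline bool toy_is_illegal_gpa(const struct toy_vcpu *v, uint64_t gpa)
{
    return v->maxphyaddr < 64 && (gpa >> v->maxphyaddr) != 0;
}

/*
 * Intercepted MOV to CR3: the hypervisor sees the value first, so it can
 * queue the fault and leave CR3 untouched, which is what the guest expects.
 */
static inline void toy_mov_cr3_intercepted(struct toy_vcpu *v, uint64_t val)
{
    if (toy_is_illegal_gpa(v, val)) {
        v->gp_pending = true;
        return;
    }
    v->cr3 = val;
}

/*
 * Non-intercepted MOV to CR3: nothing checks the value against the guest's
 * smaller MAXPHYADDR, so the write completes. The problem only surfaces
 * later, as an EPT violation on some other instruction.
 */
static inline void toy_mov_cr3_passthrough(struct toy_vcpu *v, uint64_t val)
{
    v->cr3 = val;
}
```

With a 36-bit guest and a CR3 value that has bit 40 set, the intercepted path
leaves CR3 at its old value with a #GP pending, while the passthrough path
ends up with the illegal value already loaded, so emulating whatever
instruction takes the later EPT violation cannot undo the write.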