Date: Thu, 11 May 2023 15:39:44 -0700
References: <20230311002258.852397-1-seanjc@google.com> <20230311002258.852397-26-seanjc@google.com>
Subject: Re: [PATCH v2 25/27] KVM: x86/mmu: Drop @slot param from exported/external page-track APIs
From: Sean Christopherson
To: Yan Zhao
Cc: kvm@vger.kernel.org, intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org, Zhenyu Wang, Ben Gardon, Paolo Bonzini, intel-gvt-dev@lists.freedesktop.org, Zhi Wang
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, May 08,
2023, Yan Zhao wrote:
> On Thu, May 04, 2023 at 10:17:20AM +0800, Yan Zhao wrote:
> > On Wed, May 03, 2023 at 04:16:10PM -0700, Sean Christopherson wrote:
> > > Finally getting back to this series...
> > >
> > > On Thu, Mar 23, 2023, Yan Zhao wrote:
> > > > On Fri, Mar 17, 2023 at 04:28:56PM +0800, Yan Zhao wrote:
> > > > > On Fri, Mar 10, 2023 at 04:22:56PM -0800, Sean Christopherson wrote:
> > > > > ...
> > > > > > +int kvm_write_track_add_gfn(struct kvm *kvm, gfn_t gfn)
> > > > > > +{
> > > > > > +        struct kvm_memory_slot *slot;
> > > > > > +        int idx;
> > > > > > +
> > > > > > +        idx = srcu_read_lock(&kvm->srcu);
> > > > > > +
> > > > > > +        slot = gfn_to_memslot(kvm, gfn);
> > > > > > +        if (!slot) {
> > > > > > +                srcu_read_unlock(&kvm->srcu, idx);
> > > > > > +                return -EINVAL;
> > > > > > +        }
> > > > > > +
> > > > > Also fail if slot->flags & KVM_MEMSLOT_INVALID is true?
> > > > > There should exist a window for external users to see an invalid slot
> > > > > when a slot is about to get deleted/moved.
> > > > > (It happens before MOVE is rejected in kvm_arch_prepare_memory_region()).
> > > >
> > > > Or using
> > > >         if (!kvm_is_visible_memslot(slot)) {
> > > >                 srcu_read_unlock(&kvm->srcu, idx);
> > > >                 return -EINVAL;
> > > >         }
> > >
> Hi Sean,
> After more thoughts, do you think checking KVM internal memslot is necessary?

I don't think it's necessary per se, but I also can't think of any reason to
allow it.

>         slot = gfn_to_memslot(kvm, gfn);
>         if (!slot || slot->id >= KVM_USER_MEM_SLOTS) {
>                 srcu_read_unlock(&kvm->srcu, idx);
>                 return -EINVAL;
>         }
>
> Do we allow write tracking to APIC access page when APIC-write VM exit
> is not desired?

Allow?  Yes.

But KVM doesn't use write-tracking for anything APICv related, e.g. to disable
APICv, KVM instead zaps the SPTEs for the APIC access page and on page fault
goes straight to MMIO emulation.

Theoretically, the guest could create an intermediate PTE in the APIC access
page and AFAICT KVM would shadow the access and write-protect the APIC access
page.  But that's benign as the resulting emulation would be handled just like
emulated APIC MMIO.

FWIW, the other internal memslots, TSS and identity mapped page tables, are
used if and only if paging is disabled in the guest, i.e. there are no guest
PTEs for KVM to shadow (and paging must be enabled to enable VMX, so nested EPT
is also ruled out).  So this is theoretically possible only for the APIC access
page.  That changes with KVMGT, but that again should not be problematic.  KVM
will emulate in response to the write-protected page and things go on.  E.g.
it's arguably much weirder that the guest can read/write the identity mapped
page tables that are used for EPT without unrestricted guest.

There's no sane reason to allow creating PTEs in the APIC page, but I'm also
not all that motivated to "fix" things.  account_shadowed() isn't expected to
fail, so KVM would need to check further up the stack, e.g. in
walk_addr_generic() by open coding a form of kvm_vcpu_gfn_to_hva_prot().  I
_think_ that's the only place KVM would need to add a check, as KVM already
checks that the root, i.e. CR3, is in a "visible" memslot.  I suppose KVM could
just synthesize triple fault, like it does for the root/CR3 case, but I don't
like making up behavior.

In other words, I'm not opposed to disallowing write-tracking internal
memslots, but I can't think of anything that will break, and so for me
personally at least, the ROI isn't sufficient to justify writing tests and
dealing with any fallout.
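
For readers comparing the three candidate checks raised in this thread, here is
a rough user-space sketch of how they differ.  It is not kernel code: struct
mock_memslot, the MOCK_* constants and the check_*() helpers are hypothetical
stand-ins invented for illustration, and the constant values are placeholders
rather than the kernel's.  check_visible() follows what I understand
kvm_is_visible_memslot() to do (slot exists, is a user slot, and is not
flagged invalid), but treat that as an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder values for this mock; not the kernel's definitions. */
#define MOCK_USER_MEM_SLOTS  512
#define MOCK_MEMSLOT_INVALID (1u << 16)

/* Hypothetical, stripped-down stand-in for struct kvm_memory_slot. */
struct mock_memslot {
	int id;              /* ids >= MOCK_USER_MEM_SLOTS model internal slots */
	unsigned int flags;  /* MOCK_MEMSLOT_INVALID models a slot being deleted/moved */
};

/* Check as posted in the patch: only a missing slot is rejected. */
static inline bool check_exists(const struct mock_memslot *slot)
{
	return slot != NULL;
}

/* Variant that additionally rejects internal memslots. */
static inline bool check_user_slot(const struct mock_memslot *slot)
{
	return slot && slot->id < MOCK_USER_MEM_SLOTS;
}

/* Variant that rejects missing, internal and invalid memslots. */
static inline bool check_visible(const struct mock_memslot *slot)
{
	return slot && slot->id < MOCK_USER_MEM_SLOTS &&
	       !(slot->flags & MOCK_MEMSLOT_INVALID);
}
```

In this model an internal slot passes only check_exists(), and a user slot
flagged invalid passes check_exists() and check_user_slot() but fails
check_visible().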