Date: Thu, 4 May 2023 13:41:16 -0700
References: <20230311002258.852397-1-seanjc@google.com> <20230311002258.852397-6-seanjc@google.com>
Subject: Re: [PATCH v2 05/27] drm/i915/gvt: Verify VFIO-pinned page is THP when shadowing 2M gtt entry
From: Sean Christopherson
To: Yan Zhao
Cc: Paolo Bonzini, Zhenyu Wang, Zhi Wang, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org, Ben Gardon

On Fri,
Mar 17, 2023, Yan Zhao wrote:
> On Fri, Mar 10, 2023 at 04:22:36PM -0800, Sean Christopherson wrote:
> > When shadowing a GTT entry with a 2M page, explicitly verify that the
> > first page pinned by VFIO is a transparent hugepage instead of assuming
> > that page observed by is_2MB_gtt_possible() is the same page pinned by
> > vfio_pin_pages(). E.g. if userspace is doing something funky with the
> > guest's memslots, or if the page is demoted between is_2MB_gtt_possible()
> > and vfio_pin_pages().
> >
> > This is more of a performance optimization than a bug fix as the check
> > for contiguous struct pages should guard against incorrect mapping (even
> > though assuming struct pages are virtually contiguous is wrong).
> >
> > The real motivation for explicitly checking for a transparent hugepage
> > after pinning is that it will reduce the risk of introducing a bug in a
> > future fix for a page refcount leak (KVMGT doesn't put the reference
> > acquired by gfn_to_pfn()), and eventually will allow KVMGT to stop using
> > KVM's gfn_to_pfn() altogether.
> >
> > Signed-off-by: Sean Christopherson
> > ---
> >  drivers/gpu/drm/i915/gvt/kvmgt.c | 18 ++++++++++++++++--
> >  1 file changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
> > index 8ae7039b3683..90997cc385b4 100644
> > --- a/drivers/gpu/drm/i915/gvt/kvmgt.c
> > +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
> > @@ -159,11 +159,25 @@ static int gvt_pin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn,
> >  			goto err;
> >  		}
> >
> > -		if (npage == 0)
> > -			base_page = cur_page;
> > +		if (npage == 0) {
> > +			/*
> > +			 * Bail immediately to avoid unnecessary pinning when
> > +			 * trying to shadow a 2M page and the host page isn't
> > +			 * a transparent hugepage.
> > +			 *
> > +			 * TODO: support other type hugepages, e.g. HugeTLB.
> > +			 */
> > +			if (size == I915_GTT_PAGE_SIZE_2M &&
> > +			    !PageTransHuge(cur_page))
> Maybe the checking of PageTransHuge(cur_page) and bailing out is not necessary.
> If a page is not transparent huge, but there are 512 contiguous 4K
> pages, I think it's still good to map them in IOMMU in 2M.
> See vfio_pin_map_dma() who does similar things.

I agree that bailing isn't strictly necessary, and processing "blindly" should
Just Work for HugeTLB and other hugepage types. I was going to argue that it
would be safer to add this and then drop it at the end, but I think that's a
specious argument. If not checking the page type is unsafe, then the existing
code is buggy, and this changelog literally states that the check for contiguous
pages guards against any such problems.

I do think there's a (very, very theoretical) issue though. For
"CONFIG_SPARSEMEM=y && CONFIG_SPARSEMEM_VMEMMAP=n", struct pages aren't
virtually contiguous with respect to their pfns, i.e. it's possible (again, very
theoretically) that two struct pages could be virtually contiguous but
physically discontiguous. I suspect I'm being ridiculously paranoid, but for the
efficient cases where pages are guaranteed to be contiguous, the extra
page_to_pfn() checks should be optimized away by the compiler, i.e. there's no
meaningful downside to the paranoia.

TL;DR: My plan is to drop this patch and instead harden the continuity check.
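
For illustration only, the difference between comparing struct page pointers and comparing pfns can be modeled in user-space C. This is a rough sketch, not the KVMGT code and not necessarily the eventual fix: the toy struct page, its pfn field, and the helpers toy_page_to_pfn(), contiguous_by_struct_page() and contiguous_by_pfn() are all hypothetical stand-ins invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy stand-in for the kernel's struct page. The real structure has no
 * such field; it exists here only so the toy page_to_pfn() has something
 * to return.
 */
struct page {
	unsigned long pfn;
};

/*
 * Hypothetical model of page_to_pfn(). In the kernel the real macro
 * depends on the configured memory model.
 */
static unsigned long toy_page_to_pfn(const struct page *page)
{
	return page->pfn;
}

/*
 * Fragile check: treats "struct pages are adjacent in virtual memory" as
 * proof that the underlying physical pages are adjacent.
 */
static bool contiguous_by_struct_page(const struct page *base,
				      const struct page *cur,
				      unsigned long npage)
{
	return cur == base + npage;
}

/*
 * Hardened check: compares the pfns themselves, so it holds even when
 * adjacent struct pages describe physically discontiguous memory.
 */
static bool contiguous_by_pfn(const struct page *base,
			      const struct page *cur,
			      unsigned long npage)
{
	return toy_page_to_pfn(cur) == toy_page_to_pfn(base) + npage;
}
```

With two adjacent toy struct pages whose pfns are 100 and 200, the pointer-based check reports contiguity while the pfn-based check does not, which is the theoretical case described above. When the pfns are 100 and 101, both checks agree.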