Received: by 2002:a05:6a10:7420:0:0:0:0 with SMTP id hk32csp2615181pxb; Sat, 19 Feb 2022 17:35:19 -0800 (PST) X-Google-Smtp-Source: ABdhPJxsOnJLD99t2aGCqXFO2N2oku1gABAWh9ev5OcRGOxHdpOlNDM3dQD1xTjg36FL2qxz85mg X-Received: by 2002:a05:6402:4415:b0:410:d28b:1e14 with SMTP id y21-20020a056402441500b00410d28b1e14mr14894072eda.211.1645320919178; Sat, 19 Feb 2022 17:35:19 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1645320919; cv=none; d=google.com; s=arc-20160816; b=MEH6yTuuTRWioaKC4YT+OZ8hoiDGW+UHspSf4luiz88yqqLG7AFg6QNr5qTi/5rogr +kBSqP018ar1G55DRqtoMf1CWtN0aksTzq4PZU4Vx+AmPj440hj7G3aMZfkDrW0hTaLX o6GT0Eq1lZRENmY7Mvd7yUHfF2b/XzEwMplb29pF3AlnA9T/GlElv9qgMwK8TBXhtEIJ nEEykZ+Fw3tMT8ASH8Uj6Y2UwiM03IqSe0aQz4Zudfz3n0rJx05x7dgKFb3hxGMzhgIF chrPMoyk1IBwJJJsC2jiiTlCcW2GwxHCRERE6RzyZwb6LSUy881qqRnFcIR4fTKB7dod hdIg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=H+CIB0cPm7NO71xlY+PieFfNzY6kbqTYnunsSuVd7Mk=; b=GHTfQrXJ5/aqJ1plvdsuSygUyulmteSEWbYcbrfGKs4F/6OwLqTVrTmft7G72/h6hx MWcM+/nFtFihnfWN6aKWC/9g4hNY1wyjWvhZfMulhst19Eesl5Nsai1nUcySe8Oa7WPw ZyT4zk4ypVdJGo9fRDjXmPGOADQMJj6S9VZffubPrKUgykkBiO+1cAgVdyuj8bbApyjK QGhllTYYHMhLCk7CFD6QZpoU6RmE1agkk0fjyVomThgDpUSZsnTCopvXvnOmU0OCzFDO CZbh513n47GEkDq2qWCEkgkfdgG982wGcgfbLSHomR8zD+DK1IfSVzO2oB4zWUmENmly HXqQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=EGx13HkN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id v9si1716468edb.583.2022.02.19.17.34.26; Sat, 19 Feb 2022 17:35:19 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=EGx13HkN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235907AbiBRSz5 (ORCPT + 99 others); Fri, 18 Feb 2022 13:55:57 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:47008 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235952AbiBRSzs (ORCPT ); Fri, 18 Feb 2022 13:55:48 -0500 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BE2D7245FFD for ; Fri, 18 Feb 2022 10:55:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1645210530; x=1676746530; h=date:from:to:cc:subject:message-id:references: mime-version:content-transfer-encoding:in-reply-to; bh=A/pt0JouG2A35RjAzS61fMx6Y80ODkG2+c2DnZHED/8=; b=EGx13HkNH73r6iDypX1icXNUi0i/HuISX2UzrUKCBVrDtniHapsjEJdM 0U7rDyIt/RftRF2JQryeBjxFXmgKRS5h/UXi03EJzgjVjgtdcw4P0eDjP kerWkODWT9QTvqpqUvUOmyjhSdsYOBiUjinnO5VLs3CeNOJlyNyvopm+O IeNA22NBf4f7SvfD7w+Iv6QyzVyixHKpeONVycRXQD1BLXbWbp5CovA/F nDFXuylKg9zG8qbrDFRgFUVThPGKtMcrgxG/LrlxJKDa1Uc0DaxVeLpcd yAxDPOyuisDHKoQaiODMtq7kh3oVuL23RcahH6IBW+ExO6nNeJpMO6Vh0 w==; X-IronPort-AV: E=McAfee;i="6200,9189,10262"; a="231824998" X-IronPort-AV: E=Sophos;i="5.88,379,1635231600"; d="scan'208";a="231824998" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2022 10:55:30 -0800 X-IronPort-AV: E=Sophos;i="5.88,379,1635231600"; d="scan'208";a="637830201" Received: from ramaling-i9x.iind.intel.com (HELO intel.com) ([10.203.144.108]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2022 10:55:25 -0800 Date: Sat, 19 Feb 2022 00:25:40 +0530 From: Ramalingam C To: Robert Beckett Cc: Jordan Justen , Jani Nikula , Joonas Lahtinen , Rodrigo Vivi , Tvrtko Ursulin , David Airlie , Daniel Vetter , Matthew Auld , Thomas =?utf-8?Q?Hellstr=C3=B6m?= , Simon Ser , Pekka Paalanen , Kenneth Graunke , mesa-dev@lists.freedesktop.org, Tony Ye , Slawomir Milczarek , intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v8 5/5] drm/i915/uapi: document behaviour for DG2 64K support Message-ID: <20220218185540.GA7762@intel.com> References: <20220208203419.1094362-1-bob.beckett@collabora.com> <20220208203419.1094362-6-bob.beckett@collabora.com> <87ee40ojpc.fsf@jljusten-skl> <20220218134735.GB3646@intel.com> <78df4b73-9b2d-670b-a6b0-a45b476f1f0a@collabora.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <78df4b73-9b2d-670b-a6b0-a45b476f1f0a@collabora.com> User-Agent: Mutt/1.10.1 (2018-07-13) X-Spam-Status: No, score=-4.5 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2022-02-18 at 18:06:00 +0000, Robert Beckett wrote: > > > On 18/02/2022 13:47, Ramalingam C wrote: > > On 2022-02-17 at 20:57:35 -0800, Jordan Justen wrote: > > > Robert Beckett writes: > > > > > > > From: Matthew Auld > > > > > > > > On discrete platforms like DG2, we need to support a minimum page size > > > > of 64K when dealing with device local-memory. This is quite tricky for > > > > various reasons, so try to document the new implicit uapi for this. > > > > > > > > v3: fix typos and less emphasis > > > > v2: Fixed suggestions on formatting [Daniel] > > > > > > > > Signed-off-by: Matthew Auld > > > > Signed-off-by: Ramalingam C > > > > Signed-off-by: Robert Beckett > > > > Acked-by: Jordan Justen > > > > Reviewed-by: Ramalingam C > > > > Reviewed-by: Thomas Hellström > > > > cc: Simon Ser > > > > cc: Pekka Paalanen > > > > Cc: Jordan Justen > > > > Cc: Kenneth Graunke > > > > Cc: mesa-dev@lists.freedesktop.org > > > > Cc: Tony Ye > > > > Cc: Slawomir Milczarek > > > > --- > > > > include/uapi/drm/i915_drm.h | 44 ++++++++++++++++++++++++++++++++----- > > > > 1 file changed, 39 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h > > > > index 5e678917da70..77e5e74c32c1 100644 > > > > --- a/include/uapi/drm/i915_drm.h > > > > +++ b/include/uapi/drm/i915_drm.h > > > > @@ -1118,10 +1118,16 @@ struct drm_i915_gem_exec_object2 { > > > > /** > > > > * When the EXEC_OBJECT_PINNED flag is specified this is populated by > > > > * the user with the GTT offset at which this object will be pinned. > > > > + * > > > > * When the I915_EXEC_NO_RELOC flag is specified this must contain the > > > > * presumed_offset of the object. > > > > + * > > > > * During execbuffer2 the kernel populates it with the value of the > > > > * current GTT offset of the object, for future presumed_offset writes. > > > > + * > > > > + * See struct drm_i915_gem_create_ext for the rules when dealing with > > > > + * alignment restrictions with I915_MEMORY_CLASS_DEVICE, on devices with > > > > + * minimum page sizes, like DG2. > > > > */ > > > > __u64 offset; > > > > @@ -3145,11 +3151,39 @@ struct drm_i915_gem_create_ext { > > > > * > > > > * The (page-aligned) allocated size for the object will be returned. > > > > * > > > > - * Note that for some devices we have might have further minimum > > > > - * page-size restrictions(larger than 4K), like for device local-memory. > > > > - * However in general the final size here should always reflect any > > > > - * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS > > > > - * extension to place the object in device local-memory. > > > > + * > > > > + * DG2 64K min page size implications: > > > > + * > > > > + * On discrete platforms, starting from DG2, we have to contend with GTT > > > > + * page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE > > > > + * objects. Specifically the hardware only supports 64K or larger GTT > > > > + * page sizes for such memory. The kernel will already ensure that all > > > > + * I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page > > > > + * sizes underneath. > > > > + * > > > > + * Note that the returned size here will always reflect any required > > > > + * rounding up done by the kernel, i.e 4K will now become 64K on devices > > > > + * such as DG2. > > > > + * > > > > + * Special DG2 GTT address alignment requirement: > > > > + * > > > > + * The GTT alignment will also need to be at least 2M for such objects. > > > > + * > > > > + * Note that due to how the hardware implements 64K GTT page support, we > > > > + * have some further complications: > > > > + * > > > > + * 1) The entire PDE (which covers a 2MB virtual address range), must > > > > + * contain only 64K PTEs, i.e mixing 4K and 64K PTEs in the same > > > > + * PDE is forbidden by the hardware. > > > > + * > > > > + * 2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM > > > > + * objects. > > > > + * > > > > + * To keep things simple for userland, we mandate that any GTT mappings > > > > + * must be aligned to and rounded up to 2MB. > > > > > > Could I get a clarification about this "rounded up" part. > > > > > > Currently Mesa is aligning the start of each and every buffer VMA to be > > > 2MiB aligned. But, we are *not* taking any steps to "round up" the size > > > of buffers to 2MiB alignment. > > > > > > Bob's Mesa MR from a while ago, > > > > > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14599 > > > > > > was trying to add this "round up" size for buffers. We didn't accept > > > this MR because we thought if we have ensured that no other buffer will > > > use the same 2MiB VMA range, then it should not be required. > > > > > > If what we are doing is ok, then maybe this "round up" language should > > > be dropped? Or, perhaps the "round up" mentioned here isn't implying we > > > must align the size of buffers that we create, and I'm misinterpreting > > > this. > > Jordan, > > > > as per my understanding this size rounding up to 2MB is for the VMA mapping, > > not for the buffer size. > correct, only the vma is rounded up > > > > > Even if we drop this rounding up of vma size to 2MB but align the VMA > > start to 2MB address then also this should be fine. Becasue the remaining of the > > last PDE(2MB) will never be used for any other GTT mapping as the > > starting addr wont align to 2MB. > The kernel has to handle 4K pages also, which could in theory attempt to be > placed in any remaining space within a 2MB region, which is not supported. > For this reason, the kernel rounds up the vma to next 2MB to ensure no 4K > pages can treat the remaining space as a candidate for placement. > > For mesa, this is not required as they only ever use 2MB alignment for all > buffers, hence the denial of the mesa mr. > > Internally, the kernel will still round up the vma reservations to prevent > any kernel 4K buffers being situated in remaining space. > > If desired, we can make the wording clearer, maybe something like: > > "To keep things simple for userland, we mandate that any GTT mappings > must be aligned to 2MB. The kernel will internally pad them out to the next > 2MB boundary" Added the extra information in next version @ https://patchwork.freedesktop.org/patch/475166/?series=100419&rev=1 Jordan, hope this explanation clears your doubt. Ram. > > > > > > Bob, Is the above understanding is right? if so could we drop the > > requirement of mapping the vma size to 2MB? > > > > Ram > > > > > > -Jordan > > > > > > > As this only wastes virtual > > > > + * address space and avoids userland having to copy any needlessly > > > > + * complicated PDE sharing scheme (coloring) and only affects DG2, this > > > > + * is deemed to be a good compromise. > > > > */ > > > > __u64 size; > > > > /** > > > > -- > > > > 2.25.1