From: Matthew Auld <[email protected]>
On discrete platforms like DG2, we need to support a minimum page size
of 64K when dealing with device local-memory. This is quite tricky for
various reasons, so try to document the new implicit uapi for this.
v3: fix typos and less emphasis
v2: Fixed suggestions on formatting [Daniel]
Signed-off-by: Matthew Auld <[email protected]>
Signed-off-by: Ramalingam C <[email protected]>
Signed-off-by: Robert Beckett <[email protected]>
Acked-by: Jordan Justen <[email protected]>
Reviewed-by: Ramalingam C <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
cc: Simon Ser <[email protected]>
cc: Pekka Paalanen <[email protected]>
Cc: Jordan Justen <[email protected]>
Cc: Kenneth Graunke <[email protected]>
Cc: [email protected]
Cc: Tony Ye <[email protected]>
Cc: Slawomir Milczarek <[email protected]>
---
include/uapi/drm/i915_drm.h | 44 ++++++++++++++++++++++++++++++++-----
1 file changed, 39 insertions(+), 5 deletions(-)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 5e678917da70..77e5e74c32c1 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1118,10 +1118,16 @@ struct drm_i915_gem_exec_object2 {
/**
* When the EXEC_OBJECT_PINNED flag is specified this is populated by
* the user with the GTT offset at which this object will be pinned.
+ *
* When the I915_EXEC_NO_RELOC flag is specified this must contain the
* presumed_offset of the object.
+ *
* During execbuffer2 the kernel populates it with the value of the
* current GTT offset of the object, for future presumed_offset writes.
+ *
+ * See struct drm_i915_gem_create_ext for the rules when dealing with
+ * alignment restrictions with I915_MEMORY_CLASS_DEVICE, on devices with
+ * minimum page sizes, like DG2.
*/
__u64 offset;
@@ -3145,11 +3151,39 @@ struct drm_i915_gem_create_ext {
*
* The (page-aligned) allocated size for the object will be returned.
*
- * Note that for some devices we have might have further minimum
- * page-size restrictions(larger than 4K), like for device local-memory.
- * However in general the final size here should always reflect any
- * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
- * extension to place the object in device local-memory.
+ *
+ * DG2 64K min page size implications:
+ *
+ * On discrete platforms, starting from DG2, we have to contend with GTT
+ * page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
+ * objects. Specifically the hardware only supports 64K or larger GTT
+ * page sizes for such memory. The kernel will already ensure that all
+ * I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
+ * sizes underneath.
+ *
+ * Note that the returned size here will always reflect any required
+ * rounding up done by the kernel, i.e 4K will now become 64K on devices
+ * such as DG2.
+ *
+ * Special DG2 GTT address alignment requirement:
+ *
+ * The GTT alignment will also need to be at least 2M for such objects.
+ *
+ * Note that due to how the hardware implements 64K GTT page support, we
+ * have some further complications:
+ *
+ * 1) The entire PDE (which covers a 2MB virtual address range), must
+ * contain only 64K PTEs, i.e mixing 4K and 64K PTEs in the same
+ * PDE is forbidden by the hardware.
+ *
+ * 2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
+ * objects.
+ *
+ * To keep things simple for userland, we mandate that any GTT mappings
+ * must be aligned to and rounded up to 2MB. As this only wastes virtual
+ * address space and avoids userland having to copy any needlessly
+ * complicated PDE sharing scheme (coloring) and only affects DG2, this
+ * is deemed to be a good compromise.
*/
__u64 size;
/**
--
2.25.1
Robert Beckett <[email protected]> writes:
> From: Matthew Auld <[email protected]>
>
> On discrete platforms like DG2, we need to support a minimum page size
> of 64K when dealing with device local-memory. This is quite tricky for
> various reasons, so try to document the new implicit uapi for this.
>
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 5e678917da70..77e5e74c32c1 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1118,10 +1118,16 @@ struct drm_i915_gem_exec_object2 {
> /**
> * When the EXEC_OBJECT_PINNED flag is specified this is populated by
> * the user with the GTT offset at which this object will be pinned.
> + *
> * When the I915_EXEC_NO_RELOC flag is specified this must contain the
> * presumed_offset of the object.
> + *
> * During execbuffer2 the kernel populates it with the value of the
> * current GTT offset of the object, for future presumed_offset writes.
> + *
> + * See struct drm_i915_gem_create_ext for the rules when dealing with
> + * alignment restrictions with I915_MEMORY_CLASS_DEVICE, on devices with
> + * minimum page sizes, like DG2.
> */
> __u64 offset;
>
> @@ -3145,11 +3151,39 @@ struct drm_i915_gem_create_ext {
> *
> * The (page-aligned) allocated size for the object will be returned.
> *
> - * Note that for some devices we have might have further minimum
> - * page-size restrictions(larger than 4K), like for device local-memory.
> - * However in general the final size here should always reflect any
> - * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
> - * extension to place the object in device local-memory.
> + *
> + * DG2 64K min page size implications:
> + *
> + * On discrete platforms, starting from DG2, we have to contend with GTT
> + * page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
> + * objects. Specifically the hardware only supports 64K or larger GTT
> + * page sizes for such memory. The kernel will already ensure that all
> + * I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
> + * sizes underneath.
> + *
> + * Note that the returned size here will always reflect any required
> + * rounding up done by the kernel, i.e 4K will now become 64K on devices
> + * such as DG2.
> + *
> + * Special DG2 GTT address alignment requirement:
> + *
> + * The GTT alignment will also need to be at least 2M for such objects.
> + *
> + * Note that due to how the hardware implements 64K GTT page support, we
> + * have some further complications:
> + *
> + * 1) The entire PDE (which covers a 2MB virtual address range), must
> + * contain only 64K PTEs, i.e mixing 4K and 64K PTEs in the same
> + * PDE is forbidden by the hardware.
> + *
> + * 2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
> + * objects.
> + *
> + * To keep things simple for userland, we mandate that any GTT mappings
> + * must be aligned to and rounded up to 2MB.
Could I get a clarification about this "rounded up" part?
Currently Mesa aligns the start of each and every buffer VMA to
2MiB. But we are *not* taking any steps to "round up" the size
of buffers to 2MiB alignment.
Bob's Mesa MR from a while ago,
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14599
was trying to add this "round up" size for buffers. We didn't accept
this MR because we thought if we have ensured that no other buffer will
use the same 2MiB VMA range, then it should not be required.
If what we are doing is ok, then maybe this "round up" language should
be dropped? Or, perhaps the "round up" mentioned here isn't implying we
must align the size of buffers that we create, and I'm misinterpreting
this.
-Jordan
> As this only wastes virtual
> + * address space and avoids userland having to copy any needlessly
> + * complicated PDE sharing scheme (coloring) and only affects DG2, this
> + * is deemed to be a good compromise.
> */
> __u64 size;
> /**
> --
> 2.25.1
On 2022-02-17 at 20:57:35 -0800, Jordan Justen wrote:
> Robert Beckett <[email protected]> writes:
>
> > + * DG2 64K min page size implications:
> > + *
> > + * On discrete platforms, starting from DG2, we have to contend with GTT
> > + * page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
> > + * objects. Specifically the hardware only supports 64K or larger GTT
> > + * page sizes for such memory. The kernel will already ensure that all
> > + * I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
> > + * sizes underneath.
> > + *
> > + * Note that the returned size here will always reflect any required
> > + * rounding up done by the kernel, i.e 4K will now become 64K on devices
> > + * such as DG2.
> > + *
> > + * Special DG2 GTT address alignment requirement:
> > + *
> > + * The GTT alignment will also need to be at least 2M for such objects.
> > + *
> > + * Note that due to how the hardware implements 64K GTT page support, we
> > + * have some further complications:
> > + *
> > + * 1) The entire PDE (which covers a 2MB virtual address range), must
> > + * contain only 64K PTEs, i.e mixing 4K and 64K PTEs in the same
> > + * PDE is forbidden by the hardware.
> > + *
> > + * 2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
> > + * objects.
> > + *
> > + * To keep things simple for userland, we mandate that any GTT mappings
> > + * must be aligned to and rounded up to 2MB.
>
> Could I get a clarification about this "rounded up" part.
>
> Currently Mesa is aligning the start of each and every buffer VMA to be
> 2MiB aligned. But, we are *not* taking any steps to "round up" the size
> of buffers to 2MiB alignment.
>
> Bob's Mesa MR from a while ago,
>
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14599
>
> was trying to add this "round up" size for buffers. We didn't accept
> this MR because we thought if we have ensured that no other buffer will
> use the same 2MiB VMA range, then it should not be required.
>
> If what we are doing is ok, then maybe this "round up" language should
> be dropped? Or, perhaps the "round up" mentioned here isn't implying we
> must align the size of buffers that we create, and I'm misinterpreting
> this.
Jordan,
As per my understanding, this rounding up of the size to 2MB applies to the
VMA mapping, not to the buffer size.
Even if we drop the rounding up of the VMA size to 2MB but keep the VMA
start aligned to a 2MB address, this should still be fine, because the
remainder of the last PDE (2MB) will never be used for any other GTT
mapping, as a starting address inside it won't be aligned to 2MB.
Bob, is the above understanding right? If so, could we drop the
requirement of rounding the VMA size up to 2MB?
Ram
>
> -Jordan
>
> > As this only wastes virtual
> > + * address space and avoids userland having to copy any needlessly
> > + * complicated PDE sharing scheme (coloring) and only affects DG2, this
> > + * is deemed to be a good compromise.
> > */
> > __u64 size;
> > /**
> > --
> > 2.25.1
On 2022-02-18 at 18:06:00 +0000, Robert Beckett wrote:
>
>
> On 18/02/2022 13:47, Ramalingam C wrote:
> > On 2022-02-17 at 20:57:35 -0800, Jordan Justen wrote:
> > > Robert Beckett <[email protected]> writes:
> > >
> > > > + * DG2 64K min page size implications:
> > > > + *
> > > > + * On discrete platforms, starting from DG2, we have to contend with GTT
> > > > + * page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
> > > > + * objects. Specifically the hardware only supports 64K or larger GTT
> > > > + * page sizes for such memory. The kernel will already ensure that all
> > > > + * I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
> > > > + * sizes underneath.
> > > > + *
> > > > + * Note that the returned size here will always reflect any required
> > > > + * rounding up done by the kernel, i.e 4K will now become 64K on devices
> > > > + * such as DG2.
> > > > + *
> > > > + * Special DG2 GTT address alignment requirement:
> > > > + *
> > > > + * The GTT alignment will also need to be at least 2M for such objects.
> > > > + *
> > > > + * Note that due to how the hardware implements 64K GTT page support, we
> > > > + * have some further complications:
> > > > + *
> > > > + * 1) The entire PDE (which covers a 2MB virtual address range), must
> > > > + * contain only 64K PTEs, i.e mixing 4K and 64K PTEs in the same
> > > > + * PDE is forbidden by the hardware.
> > > > + *
> > > > + * 2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
> > > > + * objects.
> > > > + *
> > > > + * To keep things simple for userland, we mandate that any GTT mappings
> > > > + * must be aligned to and rounded up to 2MB.
> > >
> > > Could I get a clarification about this "rounded up" part.
> > >
> > > Currently Mesa is aligning the start of each and every buffer VMA to be
> > > 2MiB aligned. But, we are *not* taking any steps to "round up" the size
> > > of buffers to 2MiB alignment.
> > >
> > > Bob's Mesa MR from a while ago,
> > >
> > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14599
> > >
> > > was trying to add this "round up" size for buffers. We didn't accept
> > > this MR because we thought if we have ensured that no other buffer will
> > > use the same 2MiB VMA range, then it should not be required.
> > >
> > > If what we are doing is ok, then maybe this "round up" language should
> > > be dropped? Or, perhaps the "round up" mentioned here isn't implying we
> > > must align the size of buffers that we create, and I'm misinterpreting
> > > this.
> > Jordan,
> >
> > As per my understanding, this rounding up of the size to 2MB applies to the
> > VMA mapping, not to the buffer size.
> Correct, only the VMA is rounded up.
>
> >
> > Even if we drop the rounding up of the VMA size to 2MB but keep the VMA
> > start aligned to a 2MB address, this should still be fine, because the
> > remainder of the last PDE (2MB) will never be used for any other GTT
> > mapping, as a starting address inside it won't be aligned to 2MB.
> The kernel also has to handle 4K pages, which could in theory be placed in
> any space remaining within a 2MB region; that is not supported. For this
> reason, the kernel rounds the VMA up to the next 2MB boundary, so that no
> 4K pages can treat the remaining space as a candidate for placement.
>
> For Mesa, this is not required, as it only ever uses 2MB alignment for all
> buffers; hence the rejection of the Mesa MR.
>
> Internally, the kernel will still round up the VMA reservations to prevent
> any kernel 4K buffers from being placed in the remaining space.
>
> If desired, we can make the wording clearer, maybe something like:
>
> "To keep things simple for userland, we mandate that any GTT mappings
> must be aligned to 2MB. The kernel will internally pad them out to the next
> 2MB boundary"
I have added the extra information in the next version: https://patchwork.freedesktop.org/patch/475166/?series=100419&rev=1
Jordan, I hope this explanation clears up your doubt.
Ram.
>
>
> >
> > Bob, is the above understanding right? If so, could we drop the
> > requirement of rounding the VMA size up to 2MB?
> >
> > Ram
> > >
> > > -Jordan
> > >
> > > > As this only wastes virtual
> > > > + * address space and avoids userland having to copy any needlessly
> > > > + * complicated PDE sharing scheme (coloring) and only affects DG2, this
> > > > + * is deemed to be a good compromise.
> > > > */
> > > > __u64 size;
> > > > /**
> > > > --
> > > > 2.25.1
On 18/02/2022 13:47, Ramalingam C wrote:
> On 2022-02-17 at 20:57:35 -0800, Jordan Justen wrote:
>> Robert Beckett <[email protected]> writes:
>>
>>> + * DG2 64K min page size implications:
>>> + *
>>> + * On discrete platforms, starting from DG2, we have to contend with GTT
>>> + * page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
>>> + * objects. Specifically the hardware only supports 64K or larger GTT
>>> + * page sizes for such memory. The kernel will already ensure that all
>>> + * I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
>>> + * sizes underneath.
>>> + *
>>> + * Note that the returned size here will always reflect any required
>>> + * rounding up done by the kernel, i.e 4K will now become 64K on devices
>>> + * such as DG2.
>>> + *
>>> + * Special DG2 GTT address alignment requirement:
>>> + *
>>> + * The GTT alignment will also need to be at least 2M for such objects.
>>> + *
>>> + * Note that due to how the hardware implements 64K GTT page support, we
>>> + * have some further complications:
>>> + *
>>> + * 1) The entire PDE (which covers a 2MB virtual address range), must
>>> + * contain only 64K PTEs, i.e mixing 4K and 64K PTEs in the same
>>> + * PDE is forbidden by the hardware.
>>> + *
>>> + * 2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
>>> + * objects.
>>> + *
>>> + * To keep things simple for userland, we mandate that any GTT mappings
>>> + * must be aligned to and rounded up to 2MB.
>>
>> Could I get a clarification about this "rounded up" part.
>>
>> Currently Mesa is aligning the start of each and every buffer VMA to be
>> 2MiB aligned. But, we are *not* taking any steps to "round up" the size
>> of buffers to 2MiB alignment.
>>
>> Bob's Mesa MR from a while ago,
>>
>> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14599
>>
>> was trying to add this "round up" size for buffers. We didn't accept
>> this MR because we thought if we have ensured that no other buffer will
>> use the same 2MiB VMA range, then it should not be required.
>>
>> If what we are doing is ok, then maybe this "round up" language should
>> be dropped? Or, perhaps the "round up" mentioned here isn't implying we
>> must align the size of buffers that we create, and I'm misinterpreting
>> this.
> Jordan,
>
> As per my understanding, this rounding up of the size to 2MB applies to the
> VMA mapping, not to the buffer size.
Correct, only the VMA is rounded up.
>
> Even if we drop the rounding up of the VMA size to 2MB but keep the VMA
> start aligned to a 2MB address, this should still be fine, because the
> remainder of the last PDE (2MB) will never be used for any other GTT
> mapping, as a starting address inside it won't be aligned to 2MB.
The kernel also has to handle 4K pages, which could in theory be placed
in any space remaining within a 2MB region; that is not supported. For
this reason, the kernel rounds the VMA up to the next 2MB boundary, so
that no 4K pages can treat the remaining space as a candidate for
placement.
For Mesa, this is not required, as it only ever uses 2MB alignment for
all buffers; hence the rejection of the Mesa MR.
Internally, the kernel will still round up the VMA reservations to
prevent any kernel 4K buffers from being placed in the remaining space.
If desired, we can make the wording clearer, maybe something like:
"To keep things simple for userland, we mandate that any GTT mappings
must be aligned to 2MB. The kernel will internally pad them out to the
next 2MB boundary"
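A minimal sketch of the arithmetic described above, written as plain C
rather than actual i915 code. The helper names (round_up_2m,
is_aligned_2m, padded_vma_end) are hypothetical and exist only for
illustration; the 2MB PDE range is taken from the patch text:

```c
#include <assert.h>
#include <stdint.h>

/* 2MiB: the virtual address range covered by one PDE, per the patch text. */
#define PDE_RANGE (UINT64_C(2) << 20)

/* Round v up to the next multiple of PDE_RANGE (a power of two). */
static inline uint64_t round_up_2m(uint64_t v)
{
	return (v + PDE_RANGE - 1) & ~(PDE_RANGE - 1);
}

/* Non-zero if v is a multiple of PDE_RANGE. */
static inline int is_aligned_2m(uint64_t v)
{
	return (v & (PDE_RANGE - 1)) == 0;
}

/*
 * Hypothetical helper: end of the range that would be reserved for a
 * device-memory VMA that starts at gtt_offset and backs an object of
 * obj_size bytes, assuming the reservation is padded to the next 2MiB
 * boundary as described in this thread. The object size is unchanged.
 */
static inline uint64_t padded_vma_end(uint64_t gtt_offset, uint64_t obj_size)
{
	assert(is_aligned_2m(gtt_offset)); /* userland must align the start */
	return gtt_offset + round_up_2m(obj_size);
}
```

Under this sketch, a 64K object pinned at GTT offset 0x400000 has its
reservation end at 0x600000, so no 4K mapping can be placed in the rest
of that PDE.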
>
> Bob, is the above understanding right? If so, could we drop the
> requirement of rounding the VMA size up to 2MB?
>
> Ram
>>
>> -Jordan
>>
>>> As this only wastes virtual
>>> + * address space and avoids userland having to copy any needlessly
>>> + * complicated PDE sharing scheme (coloring) and only affects DG2, this
>>> + * is deemed to be a good compromise.
>>> */
>>> __u64 size;
>>> /**
>>> --
>>> 2.25.1
Ramalingam C <[email protected]> writes:
> On 2022-02-18 at 18:06:00 +0000, Robert Beckett wrote:
>>
>> If desired, we can make the wording clearer, maybe something like:
>>
>> "To keep things simple for userland, we mandate that any GTT mappings
>> must be aligned to 2MB. The kernel will internally pad them out to the next
>> 2MB boundary"
>
> I have added the extra information in the next version:
> https://patchwork.freedesktop.org/patch/475166/?series=100419&rev=1
>
> Jordan, I hope this explanation clears up your doubt.
Ok. It sounds like what we are doing in Mesa matches what is required by
hardware and the kernel. Thanks.
-Jordan
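For reference, the userspace scheme discussed in this thread (2MiB-aligned
starts, sizes left unrounded) can be sketched as follows. This is a toy
bump allocator, not Mesa's actual VMA allocator; va_bump, va_alloc_2m,
pde_index and shares_pde are hypothetical names used only here. It
illustrates that when every start is 2MiB-aligned and ranges do not
overlap, no two buffers end up in the same 2MiB PDE range:

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M (UINT64_C(1) << 21)

/* Hypothetical toy allocator state: next free virtual address. */
struct va_bump {
	uint64_t next;
};

/*
 * Hand out a 2MiB-aligned start address for a buffer of `size` bytes.
 * The size itself is deliberately NOT rounded up to 2MiB.
 */
static uint64_t va_alloc_2m(struct va_bump *a, uint64_t size)
{
	uint64_t start = (a->next + SZ_2M - 1) & ~(SZ_2M - 1);

	a->next = start + size;
	return start;
}

/* Index of the 2MiB PDE range that covers addr. */
static uint64_t pde_index(uint64_t addr)
{
	return addr / SZ_2M;
}

/* Non-zero if [a, a + a_size) and [b, b + b_size) touch a common PDE. */
static int shares_pde(uint64_t a, uint64_t a_size, uint64_t b, uint64_t b_size)
{
	assert(a_size > 0 && b_size > 0);
	return pde_index(a) <= pde_index(b + b_size - 1) &&
	       pde_index(b) <= pde_index(a + a_size - 1);
}
```

The gap between the end of one buffer and the next 2MiB boundary is simply
left unused by this allocator, which costs only virtual address space.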