2023-09-12 12:41:51

by Thomas Hellström

[permalink] [raw]
Subject: Re: [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination

Hi, Danilo

On 9/11/23 19:49, Danilo Krummrich wrote:
> Hi Thomas,
>
> On 9/11/23 19:19, Thomas Hellström wrote:
>> Hi, Danilo
>>
>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>> This patch adds an abstraction layer between the drm_gpuva mappings of
>>> a particular drm_gem_object and this GEM object itself. The abstraction
>>> represents a combination of a drm_gem_object and drm_gpuvm. The
>>> drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
>>> representing this abstraction), while each drm_gpuvm_bo contains
>>> list of
>>> mappings of this GEM object.
>>>
>>> This has multiple advantages:
>>>
>>> 1) We can use the drm_gpuvm_bo structure to attach it to various lists
>>>     of the drm_gpuvm. This is useful for tracking external and evicted
>>>     objects per VM, which is introduced in subsequent patches.
>>>
>>> 2) Finding mappings of a certain drm_gem_object mapped in a certain
>>>     drm_gpuvm becomes much cheaper.
>>>
>>> 3) Drivers can derive and extend the structure to easily represent
>>>     driver specific states of a BO for a certain GPUVM.
>>>
>>> The idea of this abstraction was taken from amdgpu, hence the credit
>>> for
>>> this idea goes to the developers of amdgpu.
>>>
>>> Cc: Christian König <[email protected]>
>>> Signed-off-by: Danilo Krummrich <[email protected]>
>>
>> Did you consider having the drivers embed the struct drm_gpuvm_bo in
>> their own bo definition? I figure that would mean using the gem bo's
>> refcounting and providing a helper to call from the driver's bo
>> release. Looks like that could potentially save a lot of code? Or is
>> there something that won't work with that approach?
>
> There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free
> callback for drivers to register for that purpose.
>
> - Danilo

Now after looking a bit deeper, I think actually the question could be
rephrased as, why don't we just use the
struct drm_gem_object::gpuva struct as the drm_gpuvm_bo in the spirit of
keeping things simple? Drivers would then just embed it in their bo
subclass and we'd avoid unnecessary fields in the struct drm_gem_object
for drivers that don't do VM_BIND yet.

Sure, this won't be per bo and per vm, but it'd really only make a
slight difference where we have multiple VMAs per bo, where per-vm
per-bo state either needs to be duplicated or attached to a single vma
(as in the case of the external bo list).

To me that looks like a substantial amount of less code / complexity?

/Thomas


>
>>
>> Thanks,
>>
>> Thomas
>>
>>
>


2023-09-12 15:18:30

by Danilo Krummrich

[permalink] [raw]
Subject: Re: [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination

On Tue, Sep 12, 2023 at 09:42:44AM +0200, Thomas Hellstr?m wrote:
> Hi, Danilo
>
> On 9/11/23 19:49, Danilo Krummrich wrote:
> > Hi Thomas,
> >
> > On 9/11/23 19:19, Thomas Hellstr?m wrote:
> > > Hi, Danilo
> > >
> > > On 9/9/23 17:31, Danilo Krummrich wrote:
> > > > This patch adds an abstraction layer between the drm_gpuva mappings of
> > > > a particular drm_gem_object and this GEM object itself. The abstraction
> > > > represents a combination of a drm_gem_object and drm_gpuvm. The
> > > > drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
> > > > representing this abstraction), while each drm_gpuvm_bo contains
> > > > list of
> > > > mappings of this GEM object.
> > > >
> > > > This has multiple advantages:
> > > >
> > > > 1) We can use the drm_gpuvm_bo structure to attach it to various lists
> > > > ??? of the drm_gpuvm. This is useful for tracking external and evicted
> > > > ??? objects per VM, which is introduced in subsequent patches.
> > > >
> > > > 2) Finding mappings of a certain drm_gem_object mapped in a certain
> > > > ??? drm_gpuvm becomes much cheaper.
> > > >
> > > > 3) Drivers can derive and extend the structure to easily represent
> > > > ??? driver specific states of a BO for a certain GPUVM.
> > > >
> > > > The idea of this abstraction was taken from amdgpu, hence the
> > > > credit for
> > > > this idea goes to the developers of amdgpu.
> > > >
> > > > Cc: Christian K?nig <[email protected]>
> > > > Signed-off-by: Danilo Krummrich <[email protected]>
> > >
> > > Did you consider having the drivers embed the struct drm_gpuvm_bo in
> > > their own bo definition? I figure that would mean using the gem bo's
> > > refcounting and providing a helper to call from the driver's bo
> > > release. Looks like that could potentially save a lot of code? Or is
> > > there something that won't work with that approach?
> >
> > There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free
> > callback for drivers to register for that purpose.
> >
> > - Danilo
>
> Now after looking a bit deeper, I think actually the question could be
> rephrased as, why don't we just use the
> struct drm_gem_object::gpuva struct as the drm_gpuvm_bo in the spirit of
> keeping things simple? Drivers would then just embed it in their bo subclass
> and we'd avoid unnecessary fields in the struct drm_gem_object for drivers
> that don't do VM_BIND yet.

struct drm_gem_object::gpuva is just a container containing a list in order to
(currently) attach drm_gpuva structs to it and with this patch attach
drm_gpuvm_bo structs (combination of BO + VM) to it. Doing the above basically
means "leave everything as it is, but move the list_head of drm_gpuvs per GEM to
the driver specific BO structure". Having a common connection between GEM
objects and drm_gpuva structs was one of the goals of the initial GPUVA manager
patch series however.

>
> Sure, this won't be per bo and per vm, but it'd really only make a slight
> difference where we have multiple VMAs per bo, where per-vm per-bo state
> either needs to be duplicated or attached to a single vma (as in the case of
> the external bo list).


Correct, one implication is that we don't get a per VM and BO abstraction, and
hence are left with a list of all drm_gpuva structs having the same backing BO,
regardless of the VM.

For amdgpu this was always a concern. Now that we want to keep track of external
and evicted objects it's going to be a concern for most drivers I guess. Because
the only structure we could use for tracking external and evicted objects we are
left with (without having a VM_BO abstraction) is struct drm_gpuva. But this
structure isn't unique and we need to consider cases where userspace just
allocates rather huge BOs and creates tons of mappings from it. Running the full
list of drm_gpuva structs (with even the ones from other VMs included) for
adding an external or evicted object isn't very efficient. Not to mention that
the maintenance when the mapping we've (randomly) picked as an entry for the
external/evicted object list is unmapped, but there are still mappings left in
the VM with the same backing BO.

Now, a way to get rid of the VM_BO abstraction would be to use maple trees
instead, since then we can store drm_gem_object structs directly for each VM.
However, Xe had concerns about using maple trees and preferred lists, plus
having maple trees wouldn't get rid of the concerns of amdgpu not having a VM_BO
abstraction for cases with tons of VMs and tons of mappings per BO. Hence,
having a VM_BO abstraction enabling us to track external/evicted objects with
lists seems to satisfy everyone's needs.

- Danilo

>
> To me that looks like a substantial amount of less code / complexity?
>
> /Thomas
>
>
> >
> > >
> > > Thanks,
> > >
> > > Thomas
> > >
> > >
> >
>

2023-09-12 19:34:20

by Thomas Hellström

[permalink] [raw]
Subject: Re: [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination


On 9/12/23 12:06, Danilo Krummrich wrote:
> On Tue, Sep 12, 2023 at 09:42:44AM +0200, Thomas Hellström wrote:
>> Hi, Danilo
>>
>> On 9/11/23 19:49, Danilo Krummrich wrote:
>>> Hi Thomas,
>>>
>>> On 9/11/23 19:19, Thomas Hellström wrote:
>>>> Hi, Danilo
>>>>
>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>> This patch adds an abstraction layer between the drm_gpuva mappings of
>>>>> a particular drm_gem_object and this GEM object itself. The abstraction
>>>>> represents a combination of a drm_gem_object and drm_gpuvm. The
>>>>> drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
>>>>> representing this abstraction), while each drm_gpuvm_bo contains
>>>>> list of
>>>>> mappings of this GEM object.
>>>>>
>>>>> This has multiple advantages:
>>>>>
>>>>> 1) We can use the drm_gpuvm_bo structure to attach it to various lists
>>>>>     of the drm_gpuvm. This is useful for tracking external and evicted
>>>>>     objects per VM, which is introduced in subsequent patches.
>>>>>
>>>>> 2) Finding mappings of a certain drm_gem_object mapped in a certain
>>>>>     drm_gpuvm becomes much cheaper.
>>>>>
>>>>> 3) Drivers can derive and extend the structure to easily represent
>>>>>     driver specific states of a BO for a certain GPUVM.
>>>>>
>>>>> The idea of this abstraction was taken from amdgpu, hence the
>>>>> credit for
>>>>> this idea goes to the developers of amdgpu.
>>>>>
>>>>> Cc: Christian König <[email protected]>
>>>>> Signed-off-by: Danilo Krummrich <[email protected]>
>>>> Did you consider having the drivers embed the struct drm_gpuvm_bo in
>>>> their own bo definition? I figure that would mean using the gem bo's
>>>> refcounting and providing a helper to call from the driver's bo
>>>> release. Looks like that could potentially save a lot of code? Or is
>>>> there something that won't work with that approach?
>>> There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free
>>> callback for drivers to register for that purpose.
>>>
>>> - Danilo
>> Now after looking a bit deeper, I think actually the question could be
>> rephrased as, why don't we just use the
>> struct drm_gem_object::gpuva struct as the drm_gpuvm_bo in the spirit of
>> keeping things simple? Drivers would then just embed it in their bo subclass
>> and we'd avoid unnecessary fields in the struct drm_gem_object for drivers
>> that don't do VM_BIND yet.
> struct drm_gem_object::gpuva is just a container containing a list in order to
> (currently) attach drm_gpuva structs to it and with this patch attach
> drm_gpuvm_bo structs (combination of BO + VM) to it. Doing the above basically
> means "leave everything as it is, but move the list_head of drm_gpuvs per GEM to
> the driver specific BO structure". Having a common connection between GEM
> objects and drm_gpuva structs was one of the goals of the initial GPUVA manager
> patch series however.
>
>> Sure, this won't be per bo and per vm, but it'd really only make a slight
>> difference where we have multiple VMAs per bo, where per-vm per-bo state
>> either needs to be duplicated or attached to a single vma (as in the case of
>> the external bo list).
>
> Correct, one implication is that we don't get a per VM and BO abstraction, and
> hence are left with a list of all drm_gpuva structs having the same backing BO,
> regardless of the VM.
>
> For amdgpu this was always a concern. Now that we want to keep track of external
> and evicted objects it's going to be a concern for most drivers I guess. Because
> the only structure we could use for tracking external and evicted objects we are
> left with (without having a VM_BO abstraction) is struct drm_gpuva. But this
> structure isn't unique and we need to consider cases where userspace just
> allocates rather huge BOs and creates tons of mappings from it. Running the full
> list of drm_gpuva structs (with even the ones from other VMs included) for
> adding an external or evicted object isn't very efficient. Not to mention that
> the maintenance when the mapping we've (randomly) picked as an entry for the
> external/evicted object list is unmapped, but there are still mappings left in
> the VM with the same backing BO.
For the evicted object it's not much of an issue; we maintain a list of
vmas needing rebinding for each VM rather than objects evicted, so there
is no or very little additional overhead there. The extobj list is
indeed a problem if many VMAs are bound to the same bo. Not that the
code snippets are complicated, but the list traversals would be excessive.
>
> Now, a way to get rid of the VM_BO abstraction would be to use maple trees
> instead, since then we can store drm_gem_object structs directly for each VM.
> However, Xe had concerns about using maple trees and preferred lists, plus
> having maple trees wouldn't get rid of the concerns of amdgpu not having a VM_BO
> abstraction for cases with tons of VMs and tons of mappings per BO. Hence,
> having a VM_BO abstraction enabling us to track external/evicted objects with
> lists seems to satisfy everyone's needs.

Indeed this is a tradeoff between a simple implementation that is OK for
situations with not many VMs nor VMAs per bo vs a more complex
implementation that optimizes for the opposite case.

So if this latter is a case we need to optimize for at this point then I
guess it's the way to go.
(I'm in the process of adapting the xe driver to this, so I just wanted
to bring up areas where the implementations differ quite a lot and make
sure options are discussed).

Thanks,

Thomas


>
> - Danilo
>
>> To me that looks like a substantial amount of less code / complexity?
>>
>> /Thomas
>>
>>
>>>> Thanks,
>>>>
>>>> Thomas
>>>>
>>>>

2023-09-13 03:01:09

by Danilo Krummrich

[permalink] [raw]
Subject: Re: [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination

On Tue, Sep 12, 2023 at 12:33:14PM +0200, Thomas Hellstr?m wrote:
>
> On 9/12/23 12:06, Danilo Krummrich wrote:
> > On Tue, Sep 12, 2023 at 09:42:44AM +0200, Thomas Hellstr?m wrote:
> > > Hi, Danilo
> > >
> > > On 9/11/23 19:49, Danilo Krummrich wrote:
> > > > Hi Thomas,
> > > >
> > > > On 9/11/23 19:19, Thomas Hellstr?m wrote:
> > > > > Hi, Danilo
> > > > >
> > > > > On 9/9/23 17:31, Danilo Krummrich wrote:
> > > > > > This patch adds an abstraction layer between the drm_gpuva mappings of
> > > > > > a particular drm_gem_object and this GEM object itself. The abstraction
> > > > > > represents a combination of a drm_gem_object and drm_gpuvm. The
> > > > > > drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
> > > > > > representing this abstraction), while each drm_gpuvm_bo contains
> > > > > > list of
> > > > > > mappings of this GEM object.
> > > > > >
> > > > > > This has multiple advantages:
> > > > > >
> > > > > > 1) We can use the drm_gpuvm_bo structure to attach it to various lists
> > > > > > ??? of the drm_gpuvm. This is useful for tracking external and evicted
> > > > > > ??? objects per VM, which is introduced in subsequent patches.
> > > > > >
> > > > > > 2) Finding mappings of a certain drm_gem_object mapped in a certain
> > > > > > ??? drm_gpuvm becomes much cheaper.
> > > > > >
> > > > > > 3) Drivers can derive and extend the structure to easily represent
> > > > > > ??? driver specific states of a BO for a certain GPUVM.
> > > > > >
> > > > > > The idea of this abstraction was taken from amdgpu, hence the
> > > > > > credit for
> > > > > > this idea goes to the developers of amdgpu.
> > > > > >
> > > > > > Cc: Christian K?nig <[email protected]>
> > > > > > Signed-off-by: Danilo Krummrich <[email protected]>
> > > > > Did you consider having the drivers embed the struct drm_gpuvm_bo in
> > > > > their own bo definition? I figure that would mean using the gem bo's
> > > > > refcounting and providing a helper to call from the driver's bo
> > > > > release. Looks like that could potentially save a lot of code? Or is
> > > > > there something that won't work with that approach?
> > > > There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free
> > > > callback for drivers to register for that purpose.
> > > >
> > > > - Danilo
> > > Now after looking a bit deeper, I think actually the question could be
> > > rephrased as, why don't we just use the
> > > struct drm_gem_object::gpuva struct as the drm_gpuvm_bo in the spirit of
> > > keeping things simple? Drivers would then just embed it in their bo subclass
> > > and we'd avoid unnecessary fields in the struct drm_gem_object for drivers
> > > that don't do VM_BIND yet.
> > struct drm_gem_object::gpuva is just a container containing a list in order to
> > (currently) attach drm_gpuva structs to it and with this patch attach
> > drm_gpuvm_bo structs (combination of BO + VM) to it. Doing the above basically
> > means "leave everything as it is, but move the list_head of drm_gpuvs per GEM to
> > the driver specific BO structure". Having a common connection between GEM
> > objects and drm_gpuva structs was one of the goals of the initial GPUVA manager
> > patch series however.
> >
> > > Sure, this won't be per bo and per vm, but it'd really only make a slight
> > > difference where we have multiple VMAs per bo, where per-vm per-bo state
> > > either needs to be duplicated or attached to a single vma (as in the case of
> > > the external bo list).
> >
> > Correct, one implication is that we don't get a per VM and BO abstraction, and
> > hence are left with a list of all drm_gpuva structs having the same backing BO,
> > regardless of the VM.
> >
> > For amdgpu this was always a concern. Now that we want to keep track of external
> > and evicted objects it's going to be a concern for most drivers I guess. Because
> > the only structure we could use for tracking external and evicted objects we are
> > left with (without having a VM_BO abstraction) is struct drm_gpuva. But this
> > structure isn't unique and we need to consider cases where userspace just
> > allocates rather huge BOs and creates tons of mappings from it. Running the full
> > list of drm_gpuva structs (with even the ones from other VMs included) for
> > adding an external or evicted object isn't very efficient. Not to mention that
> > the maintenance when the mapping we've (randomly) picked as an entry for the
> > external/evicted object list is unmapped, but there are still mappings left in
> > the VM with the same backing BO.
> For the evicted object it's not much of an issue; we maintain a list of vmas
> needing rebinding for each VM rather than objects evicted, so there is no or
> very little additional overhead there. The extobj list is indeed a problem
> if many VMAs are bound to the same bo. Not that the code snippets are
> complicated, but the list traversals would be excessive.
> >
> > Now, a way to get rid of the VM_BO abstraction would be to use maple trees
> > instead, since then we can store drm_gem_object structs directly for each VM.
> > However, Xe had concerns about using maple trees and preferred lists, plus
> > having maple trees wouldn't get rid of the concerns of amdgpu not having a VM_BO
> > abstraction for cases with tons of VMs and tons of mappings per BO. Hence,
> > having a VM_BO abstraction enabling us to track external/evicted objects with
> > lists seems to satisfy everyone's needs.
>
> Indeed this is a tradeoff between a simple implementation that is OK for
> situations with not many VMs nor VMAs per bo vs a more complex
> implementation that optimizes for the opposite case.
>
> So if this latter is a case we need to optimize for at this point then I
> guess it's the way to go.
> (I'm in the process of adapting the xe driver to this, so I just wanted to
> bring up areas where the implementations differ quite a lot and make sure
> options are discussed).

Thanks, I appreciate that. Just be aware of the locking issue in V3 that Boris
has pointed out. I don't know if I will get to sending out a V4 today to fix
that, but I'll try to do it by tomorrow.

- Danilo

>
> Thanks,
>
> Thomas
>
>
> >
> > - Danilo
> >
> > > To me that looks like a substantial amount of less code / complexity?
> > >
> > > /Thomas
> > >
> > >
> > > > > Thanks,
> > > > >
> > > > > Thomas
> > > > >
> > > > >
>