Hi Christian,
(Your email software is configured for HTML btw)
On Tuesday, 30 January 2024 at 10:23 +0100, Christian König wrote:
> On 30.01.24 at 10:01, Daniel Vetter wrote:
>
> >
> > On Fri, Jan 26, 2024 at 05:42:50PM +0100, Christian König wrote:
> >
> > > [SNIP]
> > > Well I think we should have some solution, but I'm not sure if it
> > > should be part of DMA-buf.
> > >
> > > Essentially an exporter exports the buffers the way it uses them,
> > > and the importer (or the DMA-buf subsystem) is then the one who
> > > says "ok, I can use this", "I can't use this", or "I need to call
> > > extra functions to use this".
> > >
> > > It's not the job of the exporter to provide coherency for the
> > > importer, because otherwise we would have a lot of code in the
> > > exporter which can only be tested when you have the right importer
> > > around. And I strongly think that this is a no-go for a reliable
> > > solution.
> > >
> >
> > The trouble is that if you have memory other than stuff allocated or
> > mapped using the dma-api, then by necessity the exporter has to deal
> > with this.
> >
>
> Yes, I was thinking about that as well.
>
>
> >
> > Which is the exact same reason we also force the exporters to deal
> > with CPU cache flushing - your argument that it's not great to
> > replicate this everywhere holds equally there.
> >
>
> And I'm not really happy with that either.
>
>
> >
> > The other thing is that right now the exporter is the only one who
> > actually knows what kind of dma coherency rules apply to a certain
> > piece of memory. E.g. on i915-gem, even if it's dma_map_sg mapped,
> > the underlying i915-gem buffer might be non-coherent, and i915-gem
> > makes it all work by doing the appropriate amount of clflush.
> >
>
> Yeah, exactly. That's the reason why I think this stuff doesn't
> belong in exporters/drivers.
>
> Looking at what kind of hacks and workarounds we have in both amdgpu
> and i915, it's pretty clear that we need to improve this design
> somehow.
>
>
> >
> > Similar funky things happen in other cases.
> >
> > So unless we add an interface which allows importers to figure out
> > how much flushing is needed, currently the exporter is the only one
> > who knows (because it can inspect the struct device at
> > dma_buf_attach time).
> >
> > We could flip this around, but it would be a rather serious
> > departure from the dma-buf design approach thus far.
> >
>
> Well, clients already give the DMA direction to exporters when
> creating the mapping and get an appropriate sg_table in return.
>
> All we need to do is get the information about what flushing is
> needed into the object returned here, so that the DMA API can work
> with it.
>
> Christoph Hellwig pretty much nailed it when he said that the
> problem with the sg_table is that it mixes input and output
> parameters of the DMA-API.
>
> I would extend that and say that we need a mapping object the
> DMA-API can work with, so that it knows what needs to be done when
> devices request that data is made coherent between them or with the
> CPU.
>
>
> >
> > >
> > > That's why I think the approach of having DMA-buf callbacks is
> > > most likely the wrong thing to do.
> > >
> > > What should happen instead is that the DMA subsystem provides
> > > functionality for devices which don't support coherency through
> > > their connection to say "I want to access this data, please make
> > > sure to flush the appropriate caches". But that's just a very,
> > > very rough design idea.
> > >
> > > This will become more pressing with CXL on the horizon, I think.
> > >
> >
> > Yeah, CXL will make this all even more fun, but we are firmly there
> > already with devices deciding per-buffer (or sometimes even
> > per-access, with Intel's MOCS stuff) what coherency mode to use for
> > a buffer.
> >
> > Also, ARM SoCs generally have both coherent and non-coherent device
> > interconnects, and I think some devices can even switch with runtime
> > flags which mode they use for a specific transition.
> >
> > CXL just extends this to PCIe devices.
> >
> > So the mess is here; how do we deal with it?
> >
>
> I would say we start with the DMA-API by getting away from sg_tables
> to something cleaner and state oriented.
FYI I am already adding a 'dma_vec' object in my IIO DMABUF patchset,
which is just a dead simple

struct dma_vec {
	dma_addr_t addr;
	size_t len;
};

(The rationale for introducing it in the IIO DMABUF patchset was that
the "scatterlist" wouldn't allow me to change the transfer size.)

So I believe a new "sg_table"-like object could just be an array of
struct dma_vec + flags.
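
Just to illustrate the idea (the names below are made up on the spot
and not part of the patchset), such a mapping object could look
roughly like this:

struct dma_mapping {
	struct dma_vec *vecs;	/* array of (device address, length) entries */
	unsigned int num_vecs;	/* number of entries in the array */
	unsigned long flags;	/* e.g. "coherent" / "needs CPU cache sync" */
};

The flags would be where the DMA API records what kind of cache
maintenance, if any, the importer has to do around its accesses.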
Cheers,
-Paul
> >
> >
> > My take is that the opt-in callback addition is far from great, but
> > it's in line with how we extended dma-buf the past decade plus too.
> > So unless someone's volunteering to pour some serious time into
> > re-engineering this all (including testing all the different
> > device/driver<->device/driver interactions) I think there's only
> > really one other option: to not support these cases at all. And I
> > don't really like that, because it means people will hack together
> > something even worse in their drivers.
> >
> > By adding it to dma-buf it'll stare us in our faces at least :-/
> >
>
> Yeah, it's the path of least resistance. But with CXL on the horizon
> and more and more drivers using it, I think it's predictable that
> this will blow up sooner or later.
>
> Cheers,
> Christian.
>
>
> >
> >
> > Cheers, Sima
> >
> >
> > >
> > > Regards,
> > > Christian.
> > >
> > >
> > > >
> > > > Cheers, Sima
> > > >
> > >
> >
>
>
On Tue, Jan 30, 2024 at 10:48:23AM +0100, Paul Cercueil wrote:
> On Tuesday, 30 January 2024 at 10:23 +0100, Christian König wrote:
> > I would say we start with the DMA-API by getting away from sg_tables
> > to something cleaner and state oriented.
>
> FYI I am already adding a 'dma_vec' object in my IIO DMABUF patchset,
> which is just a dead simple
>
> struct dma_vec {
> 	dma_addr_t addr;
> 	size_t len;
> };
>
> (The rationale for introducing it in the IIO DMABUF patchset was that
> the "scatterlist" wouldn't allow me to change the transfer size.)
>
> So I believe a new "sg_table"-like object could just be an array of
> struct dma_vec + flags.
Yeah, that's pretty much the proposal I've seen: split the sg_table
into input data (struct page + len) and output data (which is the
dma_addr_t + len you have above).
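
Very roughly, and with completely made-up names just to sketch that
split, I'd picture something like:

/* input side: what the caller hands to the dma-api */
struct dma_page_vec {
	struct page *page;
	unsigned int offset;
	unsigned int len;
};

/* output side: what the dma-api hands back, i.e. your dma_vec */
struct dma_vec {
	dma_addr_t addr;
	size_t len;
};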
The part I don't expect to ever happen, because it hasn't happened in
the past 20 or so years, is that the dma-api will give us information
about what is needed to keep a buffer coherent between various devices
and the CPU.
-Sima
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch