Hello,
is there the possibility of using a DMA engine channel from userspace?
Something like:
- configure DMA using ioctl() (or whatever configuration mechanism)
- read() or write() to trigger the transfer
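A hypothetical sketch of what I mean (all names made up; nothing like this exists today):

    /* hypothetical uapi, for illustration only */
    struct dma_user_xfer {
            __u64 src;      /* source user address */
            __u64 dst;      /* destination user address */
            __u64 length;   /* bytes to transfer */
    };

    #define DMA_USER_CONFIG _IOW('D', 0, struct dma_user_xfer)

    /* after the ioctl(), a read() or write() on the fd would trigger the transfer */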
--
Federico Vaga [CERN BE-CO-HT]
On 6/19/2020 3:47 PM, Federico Vaga wrote:
> Hello,
>
> is there the possibility of using a DMA engine channel from userspace?
>
> Something like:
> - configure DMA using ioctl() (or whatever configuration mechanism)
> - read() or write() to trigger the transfer
>
I may have supposedly promised Vinod to look into possibly providing something
like this in the future. But I have not gotten around to doing that yet.
Currently, there is no such support.
On 19-06-20, 16:31, Dave Jiang wrote:
> I may have supposedly promised Vinod to look into possibly providing
> something like this in the future. But I have not gotten around to doing that
> yet. Currently, there is no such support.
And I do still have serious reservations about this topic :) Opening up
userspace access to DMA does not sound very great from a security point of
view.
Federico, what use case do you have in mind?
We should keep in mind dmaengine is an in-kernel interface providing
services to various subsystems, so you go through the respective subsystem
kernel interface (network, display, spi, audio, etc.) which would in
turn use dmaengine.
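For reference, a minimal in-kernel memcpy user of dmaengine looks roughly like
this (a sketch; a real driver would use a completion callback instead of polling):

    #include <linux/dmaengine.h>

    static int dma_copy(dma_addr_t dst, dma_addr_t src, size_t len)
    {
            dma_cap_mask_t mask;
            struct dma_chan *chan;
            struct dma_async_tx_descriptor *tx;
            dma_cookie_t cookie;

            dma_cap_zero(mask);
            dma_cap_set(DMA_MEMCPY, mask);
            chan = dma_request_chan_by_mask(&mask); /* any memcpy-capable channel */
            if (IS_ERR(chan))
                    return PTR_ERR(chan);

            tx = dmaengine_prep_dma_memcpy(chan, dst, src, len, DMA_PREP_INTERRUPT);
            if (!tx) {
                    dma_release_channel(chan);
                    return -ENOMEM;
            }
            cookie = dmaengine_submit(tx);
            dma_async_issue_pending(chan);

            /* poll for completion; simplest possible wait */
            while (dma_async_is_tx_complete(chan, cookie, NULL, NULL) != DMA_COMPLETE)
                    cpu_relax();

            dma_release_channel(chan);
            return 0;
    }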
--
~Vinod
On Sun, Jun 21, 2020 at 12:54:57PM +0530, Vinod Koul wrote:
>And I do still have serious reservations about this topic :) Opening up
>userspace access to DMA does not sound very great from a security point of
>view.
I was thinking about a dedicated module, and not something that the DMA engine
offers directly. You load the module only if you need it (like the test module)
>Federico, what use case do you have in mind?
Userspace drivers
On Sun, Jun 21, 2020 at 10:37 PM Federico Vaga <[email protected]> wrote:
> >Federico, what use case do you have in mind?
>
> Userspace drivers
Is using vfio an option?
--
Thanks,
//richard
On Sun, Jun 21, 2020 at 10:45:04PM +0200, Richard Weinberger wrote:
>On Sun, Jun 21, 2020 at 10:37 PM Federico Vaga <[email protected]> wrote:
>> >Federico, what use case do you have in mind?
>>
>> Userspace drivers
>
>Is using vfio an option?
I do not know that subsystem. Could be; thanks for the suggestion, I will have
a look.
On 21-06-20, 22:36, Federico Vaga wrote:
> I was thinking about a dedicated module, and not something that the DMA engine
> offers directly. You load the module only if you need it (like the test module)
But loading that module would expose dma to userspace.
>
> > Federico, what use case do you have in mind?
>
> Userspace drivers
All the more reason not to do so. Why can't a kernel driver be added for your
usage?
--
~Vinod
On Mon, Jun 22, 2020 at 10:17:33AM +0530, Vinod Koul wrote:
>> I was thinking about a dedicated module, and not something that the DMA engine
>> offers directly. You load the module only if you need it (like the test module)
>
>But loading that module would expose dma to userspace.
Of course, but users *should* know what they are doing ... right? ^_^'
>>
>> > Federico, what use case do you have in mind?
>>
>> Userspace drivers
>
>All the more reason not to do so. Why can't a kernel driver be added for your
>usage?
Yes, of course. I was just wondering if there was a kernel API.
On Sat, Jun 20, 2020 at 12:47:16AM +0200, Federico Vaga wrote:
>Hello,
>
>is there the possibility of using a DMA engine channel from userspace?
>
>Something like:
>- configure DMA using ioctl() (or whatever configuration mechanism)
>- read() or write() to trigger the transfer
Let me add one more question related to my case. The dmatest module does not
perform tests on SLAVE channels. Why?
Thanks
On 22-06-20, 11:25, Federico Vaga wrote:
> Let me add one more question related to my case. The dmatest module does not
> perform tests on SLAVE channels. Why?
For slaves, we need some driver to do the peripheral configuration;
dmatest cannot do that and is suited only for memcpy operations.
Thanks
--
~Vinod
> On 22 June 2020 at 06:47 Vinod Koul <[email protected]> wrote:
> All the more reason not to do so. Why can't a kernel driver be added for your
> usage?
By chance I have written a driver allowing DMA from user space using a memcpy-like interface ;-)
Now I am trying to get this code upstream, but was hit by the fact that DMA_SG has been gone since Aug 2017 :-(
Just let me introduce myself and the project:
- coding in C since '91
- coding in C++ since '98
- a lot of stuff not relevant for this ;-)
- working as a freelancer since Nov '19
- implemented a "dma-sg-proxy" driver for my client in Mar/Apr '20 to copy camera frames from uncached memory to cached memory using a second DMA on a Zynq platform
- last week we figured out that we cannot upgrade from "Xilinx 2019.2" (kernel 4.19.x) to "2020.1" (kernel 5.4.x) because the DMA_SG interface is gone
- subscribed to dmaengine on Friday, saw the start of this discussion on Saturday
- asked my client today if it is OK to try to revive DMA_SG and get our driver upstream to avoid such problems in the future
Here is the struct for the ioctl:
typedef struct {
unsigned int struct_size;
const void *src_user_ptr;
void *dst_user_ptr;
unsigned long length;
unsigned int timeout_in_ms;
} dma_sg_proxy_arg_t;
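User space then does something like this (a usage sketch; the device node name, the header name and the DMA_SG_PROXY_XFER request number are placeholders for whatever the driver actually registers):

    #include <fcntl.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include "dma-sg-proxy.h"       /* the driver's uapi header (struct above) */

    static int dma_copy(int fd, void *dst, const void *src, unsigned long len)
    {
            dma_sg_proxy_arg_t arg;

            memset(&arg, 0, sizeof(arg));
            arg.struct_size = sizeof(arg);  /* lets the driver version-check the ABI */
            arg.src_user_ptr = src;
            arg.dst_user_ptr = dst;
            arg.length = len;
            arg.timeout_in_ms = 1000;

            /* blocks until the transfer is done or the timeout expires */
            return ioctl(fd, DMA_SG_PROXY_XFER, &arg);
    }

with fd obtained via open("/dev/dma-sg-proxy", O_RDWR).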
best regards,
Thomas
On Mon, Jun 22, 2020 at 2:02 PM Thomas Ruf <[email protected]> wrote:
> By chance I have written a driver allowing DMA from user space using a memcpy-like interface ;-)
> Now I am trying to get this code upstream, but was hit by the fact that DMA_SG has been gone since Aug 2017 :-(
How do you preserve bounds? This is the main reason why vfio requires an iommu.
> Here is the struct for the ioctl:
>
> typedef struct {
> unsigned int struct_size;
> const void *src_user_ptr;
> void *dst_user_ptr;
> unsigned long length;
> unsigned int timeout_in_ms;
> } dma_sg_proxy_arg_t;
Is this on top of uio or a completely new subsystem?
--
Thanks,
//richard
On Mon, Jun 22, 2020 at 02:01:12PM +0200, Thomas Ruf wrote:
>
>By chance I have written a driver allowing DMA from user space using a memcpy-like interface ;-)
>Now I am trying to get this code upstream, but was hit by the fact that DMA_SG has been gone since Aug 2017 :-(
Not sure I get what you mean by "DMA_SG is gone". Can I have a reference?
>
>Here is the struct for the ioctl:
>
>typedef struct {
> unsigned int struct_size;
> const void *src_user_ptr;
> void *dst_user_ptr;
> unsigned long length;
> unsigned int timeout_in_ms;
>} dma_sg_proxy_arg_t;
Yes, roughly this is what I was thinking about.
> On 22 June 2020 at 14:27 Richard Weinberger <[email protected]> wrote:
> How do you preserve bounds? This is the main reason why vfio requires an iommu.
Depends where the pointer "points to"; I can detect:
- virtually allocated user memory: the generated scatterlist is split on page boundaries
- contiguous physical memory, in our case allocated by v4l2 (based on a DMA without SG support): the generated scatterlist has just one entry
Sorry, I am not really familiar with vfio :-(
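For the user-memory case, the pinning and scatterlist generation is roughly this (a condensed sketch of the idea, not the actual driver code):

    /* Condensed sketch: pin a user buffer and build a scatterlist from it,
     * split on page boundaries; the pages array is provided by the caller. */
    static int map_user_buffer(const void __user *uaddr, size_t len,
                               struct sg_table *sgt, struct page **pages)
    {
            unsigned long start = (unsigned long)uaddr;
            unsigned int npages =
                    (offset_in_page(start) + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
            int pinned;

            pinned = get_user_pages_fast(start, npages, FOLL_WRITE, pages);
            if (pinned < 0)
                    return pinned;
            if (pinned != npages) {
                    while (pinned--)
                            put_page(pages[pinned]);    /* partial pin: release */
                    return -EFAULT;
            }

            /* one sg entry per contiguous run, split on page boundaries */
            return sg_alloc_table_from_pages(sgt, pages, npages,
                                             offset_in_page(start), len,
                                             GFP_KERNEL);
    }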
> > Here is the struct for the ioctl:
> >
> > typedef struct {
> > unsigned int struct_size;
> > const void *src_user_ptr;
> > void *dst_user_ptr;
> > unsigned long length;
> > unsigned int timeout_in_ms;
> > } dma_sg_proxy_arg_t;
>
> Is this on top of uio or a completely new subsystem?
Completely independent, just my own idea for a simple uapi.
Best regards,
Thomas
> On 22 June 2020 at 14:30 Federico Vaga <[email protected]> wrote:
> Not sure I get what you mean by "DMA_SG is gone". Can I have a reference?
Here is the link to the mailing list thread where DMA_SG was removed:
https://www.spinics.net/lists/dmaengine/msg13778.html
>
> Yes, roughly this is what I was thinking about.
Cool, I really hope I get my stuff upstream!
Best regards,
Thomas
On 22-06-20, 14:01, Thomas Ruf wrote:
> By chance I have written a driver allowing DMA from user space using a memcpy-like interface ;-)
> Now I am trying to get this code upstream, but was hit by the fact that DMA_SG has been gone since Aug 2017 :-(
>
> - asked my client today if it is OK to try to revive DMA_SG and get our driver upstream to avoid such problems in the future
DMA_SG was removed as it had no users; if we have an (in-kernel) user we
can certainly revert that removal patch.
>
> Here is the struct for the ioctl:
>
> typedef struct {
> unsigned int struct_size;
> const void *src_user_ptr;
> void *dst_user_ptr;
> unsigned long length;
> unsigned int timeout_in_ms;
> } dma_sg_proxy_arg_t;
Again, I am not convinced opening up DMA to userspace like this is a great
idea. Why not have the Xilinx camera driver invoke dmaengine and do
DMA_SG?
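For reference, the removed in-kernel interface looked roughly like this (signature as it was in dmaengine.h before the removal, quoted from memory), and an in-kernel user would simply do:

    struct dma_async_tx_descriptor *
    dmaengine_prep_dma_sg(struct dma_chan *chan,
                          struct scatterlist *dst_sg, unsigned int dst_nents,
                          struct scatterlist *src_sg, unsigned int src_nents,
                          unsigned long flags);

    /* typical in-kernel usage, error handling omitted */
    struct dma_async_tx_descriptor *tx;

    tx = dmaengine_prep_dma_sg(chan, dst_sg, dst_nents,
                               src_sg, src_nents, DMA_PREP_INTERRUPT);
    dmaengine_submit(tx);
    dma_async_issue_pending(chan);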
--
~Vinod
> On 22 June 2020 at 17:54 Vinod Koul <[email protected]> wrote:
> DMA_SG was removed as it had no users; if we have an (in-kernel) user we
> can certainly revert that removal patch.
Yeah, already understood that.
> Again, I am not convinced opening up DMA to userspace like this is a great
> idea. Why not have the Xilinx camera driver invoke dmaengine and do
> DMA_SG?
In our case we have several camera pipelines. In some cases uncached memory is okay (e.g. the image goes directly to the display framebuffer); in some cases it is not, because we need to process the images on the CPU or GPU and for that we need to copy to ordinary user memory first. This seems easier to do by decoupling the driver code.
And one more thing: in case we engage the DMA memcpy, we want to copy to target memory which is prepared for IPC, because we want to share these images with another process. The v4l2 interface did not look to be made for such cases, but this is possible with the "memcpy" approach.
Best regards,
Thomas
> On 22 June 2020 at 18:34 Thomas Ruf <[email protected]> wrote:
To make it short, I have two questions:
- what are the chances to revive DMA_SG?
- what are the chances to get my driver for memcpy-like transfers from user space using DMA_SG upstream? ("dma-sg-proxy")
Best regards,
Thomas
On 24-06-20, 11:30, Thomas Ruf wrote:
> To make it short, I have two questions:
> - what are the chances to revive DMA_SG?
100%, if we have an in-kernel user
> - what are the chances to get my driver for memcpy-like transfers from
> user space using DMA_SG upstream? ("dma-sg-proxy")
pretty bleak IMHO.
--
~Vinod
On 24/06/2020 12.38, Vinod Koul wrote:
> On 24-06-20, 11:30, Thomas Ruf wrote:
>
>> To make it short, I have two questions:
>> - what are the chances to revive DMA_SG?
>
> 100%, if we have an in-kernel user
Most DMAs cannot handle differently provisioned sg_lists for src and dst.
Even if they could handle a non-symmetric SG setup, it requires an entirely
different setup (two independent channels sending the data to each
other, one reads, the other writes?).
>> - what are the chances to get my driver for memcpy-like transfers from
>> user space using DMA_SG upstream? ("dma-sg-proxy")
>
> pretty bleak IMHO.
FWIW, I also get requests from time to time for DMA memcpy support from user
space, from companies trying to move from bare-metal code to Linux.
What could be plausible is a generic dmabuf-to-dmabuf copy driver (V4L2
can provide dma-buf, DRM can also).
If there is a DMA memcpy channel available, use that; otherwise use some
other method to do the copy. User space should not care how it is done.
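The import side of such a driver would be along these lines (a sketch only; error handling omitted, and the attachment must of course be unmapped/detached again after the copy):

    #include <linux/dma-buf.h>
    #include <linux/dma-direction.h>
    #include <linux/err.h>

    /* import a dma-buf fd and get an sg_table a DMA memcpy channel can consume */
    static struct sg_table *import_dmabuf(struct device *dev, int fd,
                                          struct dma_buf_attachment **att)
    {
            struct dma_buf *buf = dma_buf_get(fd);  /* takes a ref on the buffer */

            if (IS_ERR(buf))
                    return ERR_CAST(buf);

            *att = dma_buf_attach(buf, dev);        /* attach the copying device */
            return dma_buf_map_attachment(*att, DMA_BIDIRECTIONAL);
    }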
Where things are going to get a bit trickier is when the copy needs
to be triggered by another DMA channel (completion of a frame reception
triggering an interleaved sub-frame extraction copy).
You don't want to extract from a buffer which can be modified while the
other channel is writing to it.
In Linux, DMA is used by the kernel, and user space can only use it
implicitly via standard subsystems.
A misused DMA can be very dangerous, and giving full access to program a
transfer can open a can of worms.
- Péter
> On 24 June 2020 at 14:07 Peter Ujfalusi <[email protected]> wrote:
> Most DMAs cannot handle differently provisioned sg_lists for src and dst.
> Even if they could handle a non-symmetric SG setup, it requires an entirely
> different setup (two independent channels sending the data to each
> other, one reads, the other writes?).
OK, I implemented that using zynqmp_dma on a Xilinx Zynq platform (obviously ;-) and it works nicely for us.
I don't think that it uses two channels, from what I saw in their implementation.
Of course that was on kernel 4.19.x where DMA_SG was still available.
> >> - what are the chances to get my driver for memcpy-like transfers from
> >> user space using DMA_SG upstream? ("dma-sg-proxy")
> >
> > pretty bleak IMHO.
>
> FWIW, I also get requests from time to time for DMA memcpy support from user
> space, from companies trying to move from bare-metal code to Linux.
>
> What could be plausible is a generic dmabuf-to-dmabuf copy driver (V4L2
> can provide dma-buf, DRM can also).
> If there is a DMA memcpy channel available, use that; otherwise use some
> other method to do the copy. User space should not care how it is done.
Yes, I'm using it together with a v4l2 capture driver and also saw the dma-buf thing, but did not find a way to bring this together with "ordinary user memory". For me, the root of my problem seems to be that dma_alloc_coherent() leads to uncached memory on ARM platforms. But maybe I am doing it all wrong ;-)
> Where things are going to get a bit trickier is when the copy needs
> to be triggered by another DMA channel (completion of a frame reception
> triggering an interleaved sub-frame extraction copy).
> You don't want to extract from a buffer which can be modified while the
> other channel is writing to it.
I think that would be no problem in the case of our v4l2 capture driver doing both DMAs:
framebuffer DMA for streaming, and ZynqMP DMA (using DMA_SG) to get it to "ordinary user memory".
But as I wrote before, I prefer to do the "logic and management" in userspace, so the capture driver is just using the first DMA and the "dma-sg-proxy" driver is only used as a memcpy replacement.
As said, this is all working fine with kernel 4.19.x, but now we are stuck :-(
> In Linux, DMA is used by the kernel, and user space can only use it
> implicitly via standard subsystems.
> A misused DMA can be very dangerous, and giving full access to program a
> transfer can open a can of worms.
Fully understand that!
But I also hope you understand that we are developing a "closed system" and do not have a problem with that at all.
We are also willing to bring that driver upstream for anyone doing the same, but of course this should not affect the security of any desktop or server systems.
Maybe we just need the right place for that driver?!
Not sure if staging would change your concerns.
Thanks and best regards,
Thomas
On 6/21/2020 12:24 AM, Vinod Koul wrote:
> And I do still have serious reservations about this topic :) Opening up
> userspace access to DMA does not sound very great from a security point of
> view.
What about doing it with a DMA engine that supports PASID? That way the user can
really only trash its own address space, and the kernel is protected.
> On 25 June 2020 at 02:42 Dave Jiang <[email protected]> wrote:
> What about doing it with a DMA engine that supports PASID? That way the user can
> really only trash its own address space, and the kernel is protected.
Sounds interesting! Not sure if this is really needed in that case...
I have already implemented checks of the vm_area_struct for contiguous memory, or even do a get_user_pages_fast() for user memory to pin it (hope that is the correct term here). Of course I have to do that for every involved page.
But I will do some checks to see if my code is really suitable to avoid misuse.
Best regards,
Thomas
On 24/06/2020 16.58, Thomas Ruf wrote:
> OK, I implemented that using zynqmp_dma on a Xilinx Zynq platform (obviously ;-) and it works nicely for us.
I see. If the HW does not support it, then something along the lines of
what atc_prep_dma_sg() did can be implemented for most engines.
In essence: create a new set of sg_lists which is symmetric.
> I don't think that it uses two channels, from what I saw in their implementation.
I believe it was breaking it up like atc_prep_dma_sg did.
> Of course that was on kernel 4.19.x where DMA_SG was still available.
>
> Yes, I'm using it together with a v4l2 capture driver and also saw the dma-buf thing, but did not find a way to bring this together with "ordinary user memory".
One of the aims of dma-buf is to share buffers among drivers and/or between
drivers and user space, but I might be missing something.
> For me, the root of my problem seems to be that dma_alloc_coherent() leads to uncached memory on ARM platforms.
It depends, but in most cases that is true.
> But maybe I am doing it all wrong ;-)
>
> Fully understand that!
> But I also hope you understand that we are developing a "closed system" and do not have a problem with that at all.
> We are also willing to bring that driver upstream for anyone doing the same, but of course this should not affect the security of any desktop or server systems.
> Maybe we just need the right place for that driver?!
What might be plausible is to introduce HW offloading support for memcpy-type
operations in a similar fashion to how, for example, crypto does it.
The issue with user-space-implemented logic is that it is not portable
between systems with different DMAs. It might be that on one DMA the
setup takes longer than doing a CPU copy of X bytes; on another DMA it
might be significantly less or higher.
Using CPU vs DMA for a copy in certain lengths and setups should not be
a concern of user space.
Yes, you have a closed system with controlled parameters, but a generic
mem2mem_offload framework should be usable on other setups, and the same
binary should work on different DMAs, where one is not efficient
for <512 bytes while the other shows benefits under 128 bytes.
- Péter
On Thu, Jun 25, 2020 at 10:11:28AM +0200, Thomas Ruf wrote:
> Sounds interesting! Not sure if this is really needed in that case...
> I have already implemented checks of the vm_area_struct for contiguous memory, or even do a get_user_pages_fast() for user memory to pin it (hope that is the correct term here). Of course I have to do that for every involved page.
FWIW there is a new pin_user_pages_fast()/unpin_user_page() interface now.
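The pattern is the same as with get_user_pages_fast(), e.g. (sketch):

    /* pin_user_pages_fast() is available from v5.6 onwards */
    int pinned = pin_user_pages_fast(start, npages,
                                     FOLL_WRITE | FOLL_LONGTERM, pages);

    if (pinned > 0 && pinned != npages) {
            unpin_user_pages(pages, pinned);    /* partial pin: release, bail */
            pinned = -EFAULT;
    }

    /* ... do the DMA ... then: unpin_user_pages(pages, npages); */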
Ira
> On 26 June 2020 at 12:29 Peter Ujfalusi <[email protected]> wrote:
> I see. If the HW does not support it, then something along the lines of
> what atc_prep_dma_sg() did can be implemented for most engines.
>
> In essence: create a new set of sg_lists which is symmetric.
Sorry, not sure if I understand you right.
You suggest that in case DMA_SG gets revived, we should restrict the support to symmetric sg_lists?
Just had a glance at the deleted code, and the *_prep_dma_sg of these drivers had code to support asymmetric lists and, by that, "unaligned" memory (relative to the page start):
at_hdmac.c
dmaengine.c
dmatest.c
fsldma.c
mv_xor.c
nbpfaxi.c
ste_dma40.c
xgene-dma.c
xilinx/zynqmp_dma.c
Why not just revive that and keep this nice functionality? ;-)
> > I don't think that it uses two channels, from what I saw in their implementation.
>
> I believe it was breaking it up like atc_prep_dma_sg did.
>
> What might be plausible is to introduce HW offloading support for memcpy-type
> operations in a similar fashion to how, for example, crypto does it.
Sounds good to me, my proxy driver implementation could be a good start for that, too!
> The issue with user-space-implemented logic is that it is not portable
> between systems with different DMAs. It might be that on one DMA the
> setup takes longer than doing a CPU copy of X bytes; on another DMA it
> might be significantly less or higher.
Fully agree with that!
I was also unsure how my approach would perform, but in our case the latency increased by ~20% while CPU load roughly stayed the same; of course, this was the benchmark from user memory to user memory.
From uncached memory to user memory, the DMA was around 15 times faster.
> Using CPU vs DMA for a copy in certain lengths and setups should not be
> a concern of user space.
Also fully agree with that!
> Yes, you have a closed system with controlled parameters, but a generic
> mem2mem_offload framework should be usable on other setups, and the same
> binary should work on different DMAs, where one is not efficient
> for <512 bytes while the other shows benefits under 128 bytes.
Usable: of course
"Faster": not necessarily as long as it is an option
Thanks for your valuable input and suggestions!
best regards,
Thomas
> On 26 June 2020 at 22:08 Ira Weiny <[email protected]> wrote:
> FWIW there is a new pin_user_pages_fast()/unpin_user_page() interface now.
Thanks for that info. But at the moment we are mainly interested in a solution which can easily be backported to Xilinx Release 2020.1 with kernel 5.4.x, where I could not find that new functionality.
> > But I will do some checks to see if my code is really suitable to avoid misuse.
Did some basic tests today and was not able to break out of my own checks, done via follow_pfn() and get_user_pages_fast() respectively. If this withstands "advanced attacks", my proxy driver shouldn't be more dangerous than an ordinary memcpy; I know that there will always remain some doubts ;-)
best regards,
Thomas
On 29/06/2020 18.18, Thomas Ruf wrote:
> Sorry, not sure if I understand you right.
> You suggest that in case DMA_SG gets revived, we should restrict the support to symmetric sg_lists?
No, not at all. That would not make much sense.
> Just had a glance at the deleted code, and the *_prep_dma_sg of these drivers had code to support asymmetric lists and, by that, "unaligned" memory (relative to the page start):
> at_hdmac.c
> dmaengine.c
> dmatest.c
> fsldma.c
> mv_xor.c
> nbpfaxi.c
> ste_dma40.c
> xgene-dma.c
> xilinx/zynqmp_dma.c
>
> Why not just revive that and keep this nice functionality? ;-)
What I'm saying is that the drivers (at least at_hdmac) in essence
create an aligned sg_list out of the received non-aligned ones.
They do this without actually creating the sg_list itself, but that's just a
small detail.
In the longer run, what might make sense is to have a helper function to
convert two non-symmetric sg_lists into two symmetric ones, so drivers
will not have to re-implement the same code and will only need to
care about symmetric sg lists.
Note that some DMAs can actually handle non-symmetric src and dst lists, but
I believe that is rare.
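The core of such a helper would be a walk that computes the common chunk lengths; something like this (hypothetical, no such helper exists today; allocating the two new tables and filling them with sg_set_page() is omitted):

    /* count the entries two symmetric lists would need to cover both inputs;
     * assumes src and dst describe the same total number of bytes */
    static unsigned int sg_count_symmetric(struct scatterlist *src,
                                           struct scatterlist *dst)
    {
            unsigned int n = 0;
            unsigned int s_rem = src->length, d_rem = dst->length;

            while (src && dst) {
                    unsigned int len = min(s_rem, d_rem);   /* next common chunk */

                    n++;
                    s_rem -= len;
                    d_rem -= len;
                    if (!s_rem && (src = sg_next(src)))
                            s_rem = src->length;
                    if (!d_rem && (dst = sg_next(dst)))
                            d_rem = dst->length;
            }
            return n;
    }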
>> What might be plausible is to introduce HW offloading support for memcpy-type
>> operations in a similar fashion to how, for example, crypto does it.
>
> Sounds good to me, my proxy driver implementation could be a good start for that, too!
It needs to find its place as well... I'm not sure where that would be.
Simple block-copy offload, sg copy offload, interleaved (frame
extraction) offload, and dmabuf copy offload come to mind as candidates.
>> The issue with user-space-implemented logic is that it is not portable
>> between systems with different DMAs. It might be that on one DMA the
>> setup takes longer than doing a CPU copy of X bytes; on another DMA it
>> might be significantly less or higher.
>
> Fully agree with that!
> I was also unsure how my approach would perform, but in our case the latency increased by ~20% while CPU load roughly stayed the same; of course, this was the benchmark from user memory to user memory.
> From uncached memory to user memory, the DMA was around 15 times faster.
It depends on the size of the transfer. Lots of small individual
transfers might be worse via DMA due to the setup time, completion
handling, etc.
>> Using CPU vs DMA for a copy in certain lengths and setups should not be
>> a concern of user space.
>
> Also fully agree with that!
There is one big issue with the fallback to CPU copy... If you used
DMA, then you might need to do cache operations to get things in their
right place.
If you have done it with the CPU, then you most likely do not need to care
about it.
Handling this should be done at a level where we are aware of which path is
taken.
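In other words, the two paths differ like this (sketch; the dma_map/dma_unmap calls are where the cache maintenance happens on a non-coherent system):

    /* DMA path: explicit cache maintenance via the DMA-mapping API */
    dma_addr_t s = dma_map_single(dev, src, len, DMA_TO_DEVICE);    /* clean */
    dma_addr_t d = dma_map_single(dev, dst, len, DMA_FROM_DEVICE);
    /* ... run the DMA memcpy d <- s and wait for completion ... */
    dma_unmap_single(dev, d, len, DMA_FROM_DEVICE);             /* invalidate */
    dma_unmap_single(dev, s, len, DMA_TO_DEVICE);

    /* CPU path: caches stay coherent by definition, nothing extra to do */
    memcpy(dst, src, len);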
>> Yes, you have a closed system with controlled parameters, but a generic
>> mem2mem_offload framework should be usable on other setups, and the same
>> binary should work on different DMAs, where one is not efficient
>> for <512 bytes while the other shows benefits under 128 bytes.
>
> Usable: of course
> "Faster": not necessarily as long as it is an option
- Péter
> On 30 June 2020 at 14:31 Peter Ujfalusi <[email protected]> wrote:
> No, not at all. That would not make much sense.
Glad that this was just a misunderstanding.
> What I'm saying is that the drivers (at least at_hdmac) in essence
> create an aligned sg_list out of the received non-aligned ones.
> They do this without actually creating the sg_list itself, but that's just a
> small detail.
>
> In the longer run, what might make sense is to have a helper function to
> convert two non-symmetric sg_lists into two symmetric ones, so drivers
> will not have to re-implement the same code and will only need to
> care about symmetric sg lists.
Sounds like a superb idea!
> Note that some DMAs can actually handle non-symmetric src and dst lists, but
> I believe that is rare.
So I was a bit lucky that the zynqmp_dma is one of them.
> >> What might be plausible is to introduce HW offloading support for memcpy-type
> >> operations in a similar fashion to how, for example, crypto does it.
> >
> > Sounds good to me, my proxy driver implementation could be a good start for that, too!
>
> It needs to find its place as well... I'm not sure where that would be.
> Simple block-copy offload, sg copy offload, interleaved (frame
> extraction) offload, and dmabuf copy offload come to mind as candidates.
And who would decide that...
>
> It depends on the size of the transfer. Lots of small individual
> transfers might be worse via DMA due to the setup time, completion
> handling, etc.
Yes, exactly.
Thanks again for your great input!
best regards,
Thomas
PS: I am on vacation for the next two weeks and probably will not check this mailing list until 20.7, but I will catch up later.