2024-01-19 14:14:32

by Paul Cercueil

Subject: [PATCH v5 0/6] usb: gadget: functionfs: DMABUF import interface

Hi,

This is v5 of my patchset, which adds a new DMABUF import interface to
FunctionFS.

Daniel / Sima suggested that I should cache the dma_buf_attachment while
the DMABUF is attached to the interface, instead of mapping/unmapping
the DMABUF for every transfer (also because unmapping is not possible in
the dma_fence's critical section). This meant having to add new
dma_buf_begin_access() / dma_buf_end_access() functions that the driver
can call to ensure cache coherency. These two functions are provided by
the new patch [1/6], and an implementation for udmabuf was added in
[2/6] - see the changelog below.
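
For context, here is a minimal importer-side sketch of that flow (hypothetical
names such as my_importer, error unwinding omitted), mirroring what the
FunctionFS code in [5/6] does: the DMABUF is attached and mapped once, and only
the new begin/end access calls happen around each transfer.

/* Hypothetical importer: attach + map once, begin/end access per transfer. */
#include <linux/dma-buf.h>
#include <linux/dma-resv.h>

struct my_importer {
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;
	enum dma_data_direction dir;
};

static int my_importer_attach(struct my_importer *imp, struct dma_buf *dmabuf,
			      struct device *dev, enum dma_data_direction dir)
{
	imp->attach = dma_buf_attach(dmabuf, dev);
	if (IS_ERR(imp->attach))
		return PTR_ERR(imp->attach);

	/* Map once and cache the sg_table for the lifetime of the attachment */
	dma_resv_lock(dmabuf->resv, NULL);
	imp->sgt = dma_buf_map_attachment(imp->attach, dir);
	dma_resv_unlock(dmabuf->resv);
	imp->dir = dir;

	return PTR_ERR_OR_ZERO(imp->sgt);
}

static int my_importer_transfer(struct my_importer *imp)
{
	int ret;

	/* New API from patch [1/6]: make the data coherent for the device */
	ret = dma_buf_begin_access(imp->attach, imp->sgt, imp->dir);
	if (ret)
		return ret;

	/* ... program and run the DMA transfer using imp->sgt ... */

	/*
	 * In an asynchronous driver this call would happen in the completion
	 * handler, still inside the dma_fence signalling critical section.
	 */
	return dma_buf_end_access(imp->attach, imp->sgt, imp->dir);
}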

This patchset was successfully tested with CONFIG_LOCKDEP, no errors
were reported in dmesg while using the interface.

This interface is used at Analog Devices to transfer data from
high-speed transceivers to USB in a zero-copy fashion, together with the
DMABUF import interface to the IIO subsystem, which is being upstreamed
in parallel [1]. The two are used by the Libiio software [2].

On a ZCU102 board with an FMComms3 daughter board, combining these two
new interfaces yields a drastic throughput improvement: from about
127 MiB/s using IIO's buffer read/write interface plus read/write on the
FunctionFS endpoints, to about 274 MiB/s when passing DMABUFs around,
with lower CPU usage (0.85 load average before vs. 0.65 after).

Right now, *technically* there are no users of this interface, as
Analog Devices wants to wait until both interfaces are accepted upstream
to merge the DMABUF code in Libiio into the main branch, and Jonathan
wants to wait and see if this patchset is accepted to greenlight the
DMABUF interface in IIO as well. I think this isn't really a problem;
once everybody is happy with their part of the cake, we can merge them all
at once.

This is obviously for 6.9, and based on next-20240119.

Changelog:

- [1/6]: New patch
- [2/6]: New patch
- [5/6]:
- Cache the dma_buf_attachment while the DMABUF is attached.
- Use dma_buf_begin/end_access() to ensure that the DMABUF data will be
coherent to the hardware.
- Remove comment about cache-management and dma_buf_unmap_attachment(),
since we now use dma_buf_begin/end_access().
- Select DMA_SHARED_BUFFER in Kconfig entry
- Add Christian's ACK

Cheers,
-Paul

[1] https://lore.kernel.org/linux-iio/[email protected]/T/
[2] https://github.com/analogdevicesinc/libiio/tree/pcercuei/dev-new-dmabuf-api

Paul Cercueil (6):
dma-buf: Add dma_buf_{begin,end}_access()
dma-buf: udmabuf: Implement .{begin,end}_access
usb: gadget: Support already-mapped DMA SGs
usb: gadget: functionfs: Factorize wait-for-endpoint code
usb: gadget: functionfs: Add DMABUF import interface
Documentation: usb: Document FunctionFS DMABUF API

Documentation/usb/functionfs.rst | 36 ++
drivers/dma-buf/dma-buf.c | 66 ++++
drivers/dma-buf/udmabuf.c | 27 ++
drivers/usb/gadget/Kconfig | 1 +
drivers/usb/gadget/function/f_fs.c | 502 ++++++++++++++++++++++++++--
drivers/usb/gadget/udc/core.c | 7 +-
include/linux/dma-buf.h | 37 ++
include/linux/usb/gadget.h | 2 +
include/uapi/linux/usb/functionfs.h | 41 +++
9 files changed, 698 insertions(+), 21 deletions(-)

--
2.43.0



2024-01-19 14:15:27

by Paul Cercueil

Subject: [PATCH v5 2/6] dma-buf: udmabuf: Implement .{begin,end}_access

Implement .begin_access() and .end_access() callbacks.

For now these functions will simply sync/flush the CPU cache when
needed.

Signed-off-by: Paul Cercueil <[email protected]>

---
v5: New patch
---
drivers/dma-buf/udmabuf.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index c40645999648..a87d89b58816 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -179,6 +179,31 @@ static int end_cpu_udmabuf(struct dma_buf *buf,
return 0;
}

+static int begin_udmabuf(struct dma_buf_attachment *attach,
+ struct sg_table *sgt,
+ enum dma_data_direction dir)
+{
+ struct dma_buf *buf = attach->dmabuf;
+ struct udmabuf *ubuf = buf->priv;
+ struct device *dev = ubuf->device->this_device;
+
+ dma_sync_sg_for_device(dev, sgt->sgl, sg_nents(sgt->sgl), dir);
+ return 0;
+}
+
+static int end_udmabuf(struct dma_buf_attachment *attach,
+ struct sg_table *sgt,
+ enum dma_data_direction dir)
+{
+ struct dma_buf *buf = attach->dmabuf;
+ struct udmabuf *ubuf = buf->priv;
+ struct device *dev = ubuf->device->this_device;
+
+ if (dir != DMA_TO_DEVICE)
+ dma_sync_sg_for_cpu(dev, sgt->sgl, sg_nents(sgt->sgl), dir);
+ return 0;
+}
+
static const struct dma_buf_ops udmabuf_ops = {
.cache_sgt_mapping = true,
.map_dma_buf = map_udmabuf,
@@ -189,6 +214,8 @@ static const struct dma_buf_ops udmabuf_ops = {
.vunmap = vunmap_udmabuf,
.begin_cpu_access = begin_cpu_udmabuf,
.end_cpu_access = end_cpu_udmabuf,
+ .begin_access = begin_udmabuf,
+ .end_access = end_udmabuf,
};

#define SEALS_WANTED (F_SEAL_SHRINK)
--
2.43.0


2024-01-19 14:15:52

by Paul Cercueil

Subject: [PATCH v5 3/6] usb: gadget: Support already-mapped DMA SGs

Add a new 'sg_was_mapped' field to struct usb_request. This field can be
used to indicate that the scatterlist associated with the USB transfer
has already been DMA-mapped, so the UDC core does not need to map it
internally.
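
For illustration (this mirrors how patch [5/6] of this series uses the new
flag; usb_req, sgt, len and ep are placeholders), a function driver that
already holds a DMA-mapped scatterlist would fill the request like this before
calling usb_ep_queue(), and the UDC core then skips its own mapping/unmapping:

	usb_req->buf = NULL;
	usb_req->sg = sgt->sgl;		/* scatterlist already mapped by the caller */
	usb_req->num_sgs = sg_nents_for_len(sgt->sgl, len);
	usb_req->length = len;
	usb_req->sg_was_mapped = true;	/* tell the UDC core not to map/unmap it */
	ret = usb_ep_queue(ep, usb_req, GFP_ATOMIC);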

Signed-off-by: Paul Cercueil <[email protected]>
---
drivers/usb/gadget/udc/core.c | 7 ++++++-
include/linux/usb/gadget.h | 2 ++
2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
index d59f94464b87..9d4150124fdb 100644
--- a/drivers/usb/gadget/udc/core.c
+++ b/drivers/usb/gadget/udc/core.c
@@ -903,6 +903,11 @@ int usb_gadget_map_request_by_dev(struct device *dev,
if (req->length == 0)
return 0;

+ if (req->sg_was_mapped) {
+ req->num_mapped_sgs = req->num_sgs;
+ return 0;
+ }
+
if (req->num_sgs) {
int mapped;

@@ -948,7 +953,7 @@ EXPORT_SYMBOL_GPL(usb_gadget_map_request);
void usb_gadget_unmap_request_by_dev(struct device *dev,
struct usb_request *req, int is_in)
{
- if (req->length == 0)
+ if (req->length == 0 || req->sg_was_mapped)
return;

if (req->num_mapped_sgs) {
diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h
index a771ccc038ac..c529e4e06997 100644
--- a/include/linux/usb/gadget.h
+++ b/include/linux/usb/gadget.h
@@ -52,6 +52,7 @@ struct usb_ep;
* @short_not_ok: When reading data, makes short packets be
* treated as errors (queue stops advancing till cleanup).
* @dma_mapped: Indicates if request has been mapped to DMA (internal)
+ * @sg_was_mapped: Set if the scatterlist has been mapped before the request
* @complete: Function called when request completes, so this request and
* its buffer may be re-used. The function will always be called with
* interrupts disabled, and it must not sleep.
@@ -111,6 +112,7 @@ struct usb_request {
unsigned zero:1;
unsigned short_not_ok:1;
unsigned dma_mapped:1;
+ unsigned sg_was_mapped:1;

void (*complete)(struct usb_ep *ep,
struct usb_request *req);
--
2.43.0


2024-01-19 14:16:18

by Paul Cercueil

Subject: [PATCH v5 4/6] usb: gadget: functionfs: Factorize wait-for-endpoint code

This exact same code was duplicated in two different places.

Signed-off-by: Paul Cercueil <[email protected]>
---
drivers/usb/gadget/function/f_fs.c | 48 +++++++++++++++++-------------
1 file changed, 27 insertions(+), 21 deletions(-)

diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 6bff6cb93789..ed2a6d5fcef7 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -934,31 +934,44 @@ static ssize_t __ffs_epfile_read_data(struct ffs_epfile *epfile,
return ret;
}

-static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
+static struct ffs_ep *ffs_epfile_wait_ep(struct file *file)
{
struct ffs_epfile *epfile = file->private_data;
- struct usb_request *req;
struct ffs_ep *ep;
- char *data = NULL;
- ssize_t ret, data_len = -EINVAL;
- int halt;
-
- /* Are we still active? */
- if (WARN_ON(epfile->ffs->state != FFS_ACTIVE))
- return -ENODEV;
+ int ret;

/* Wait for endpoint to be enabled */
ep = epfile->ep;
if (!ep) {
if (file->f_flags & O_NONBLOCK)
- return -EAGAIN;
+ return ERR_PTR(-EAGAIN);

ret = wait_event_interruptible(
epfile->ffs->wait, (ep = epfile->ep));
if (ret)
- return -EINTR;
+ return ERR_PTR(-EINTR);
}

+ return ep;
+}
+
+static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
+{
+ struct ffs_epfile *epfile = file->private_data;
+ struct usb_request *req;
+ struct ffs_ep *ep;
+ char *data = NULL;
+ ssize_t ret, data_len = -EINVAL;
+ int halt;
+
+ /* Are we still active? */
+ if (WARN_ON(epfile->ffs->state != FFS_ACTIVE))
+ return -ENODEV;
+
+ ep = ffs_epfile_wait_ep(file);
+ if (IS_ERR(ep))
+ return PTR_ERR(ep);
+
/* Do we halt? */
halt = (!io_data->read == !epfile->in);
if (halt && epfile->isoc)
@@ -1280,16 +1293,9 @@ static long ffs_epfile_ioctl(struct file *file, unsigned code,
return -ENODEV;

/* Wait for endpoint to be enabled */
- ep = epfile->ep;
- if (!ep) {
- if (file->f_flags & O_NONBLOCK)
- return -EAGAIN;
-
- ret = wait_event_interruptible(
- epfile->ffs->wait, (ep = epfile->ep));
- if (ret)
- return -EINTR;
- }
+ ep = ffs_epfile_wait_ep(file);
+ if (IS_ERR(ep))
+ return PTR_ERR(ep);

spin_lock_irq(&epfile->ffs->eps_lock);

--
2.43.0


2024-01-19 14:17:12

by Paul Cercueil

Subject: [PATCH v5 1/6] dma-buf: Add dma_buf_{begin,end}_access()

These functions should be used by device drivers when they start and
stop accessing the data of DMABUF. It allows DMABUF importers to cache
the dma_buf_attachment while ensuring that the data they want to access
is available for their device when the DMA transfers take place.

Signed-off-by: Paul Cercueil <[email protected]>

---
v5: New patch
---
drivers/dma-buf/dma-buf.c | 66 +++++++++++++++++++++++++++++++++++++++
include/linux/dma-buf.h | 37 ++++++++++++++++++++++
2 files changed, 103 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 8fe5aa67b167..a8bab6c18fcd 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -830,6 +830,8 @@ static struct sg_table * __map_dma_buf(struct dma_buf_attachment *attach,
* - dma_buf_mmap()
* - dma_buf_begin_cpu_access()
* - dma_buf_end_cpu_access()
+ * - dma_buf_begin_access()
+ * - dma_buf_end_access()
* - dma_buf_map_attachment_unlocked()
* - dma_buf_unmap_attachment_unlocked()
* - dma_buf_vmap_unlocked()
@@ -1602,6 +1604,70 @@ void dma_buf_vunmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map)
}
EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap_unlocked, DMA_BUF);

+/**
+ * @dma_buf_begin_access - Call before any hardware access from/to the DMABUF
+ * @attach: [in] attachment used for hardware access
+ * @sg_table: [in] scatterlist used for the DMA transfer
+ * @direction: [in] direction of DMA transfer
+ */
+int dma_buf_begin_access(struct dma_buf_attachment *attach,
+ struct sg_table *sgt, enum dma_data_direction dir)
+{
+ struct dma_buf *dmabuf;
+ bool cookie;
+ int ret;
+
+ if (WARN_ON(!attach))
+ return -EINVAL;
+
+ dmabuf = attach->dmabuf;
+
+ if (!dmabuf->ops->begin_access)
+ return 0;
+
+ cookie = dma_fence_begin_signalling();
+ ret = dmabuf->ops->begin_access(attach, sgt, dir);
+ dma_fence_end_signalling(cookie);
+
+ if (WARN_ON_ONCE(ret))
+ return ret;
+
+ return 0;
+}
+EXPORT_SYMBOL_NS_GPL(dma_buf_begin_access, DMA_BUF);
+
+/**
+ * @dma_buf_end_access - Call after any hardware access from/to the DMABUF
+ * @attach: [in] attachment used for hardware access
+ * @sg_table: [in] scatterlist used for the DMA transfer
+ * @direction: [in] direction of DMA transfer
+ */
+int dma_buf_end_access(struct dma_buf_attachment *attach,
+ struct sg_table *sgt, enum dma_data_direction dir)
+{
+ struct dma_buf *dmabuf;
+ bool cookie;
+ int ret;
+
+ if (WARN_ON(!attach))
+ return -EINVAL;
+
+ dmabuf = attach->dmabuf;
+
+ if (!dmabuf->ops->end_access)
+ return 0;
+
+ cookie = dma_fence_begin_signalling();
+ ret = dmabuf->ops->end_access(attach, sgt, dir);
+ dma_fence_end_signalling(cookie);
+
+ if (WARN_ON_ONCE(ret))
+ return ret;
+
+ return 0;
+}
+EXPORT_SYMBOL_NS_GPL(dma_buf_end_access, DMA_BUF);
+
#ifdef CONFIG_DEBUG_FS
static int dma_buf_debug_show(struct seq_file *s, void *unused)
{
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 8ff4add71f88..8ba612c7cc16 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -246,6 +246,38 @@ struct dma_buf_ops {
*/
int (*end_cpu_access)(struct dma_buf *, enum dma_data_direction);

+ /**
+ * @begin_access:
+ *
+ * This is called from dma_buf_begin_access() when a device driver
+ * wants to access the data of the DMABUF. The exporter can use this
+ * to flush/sync the caches if needed.
+ *
+ * This callback is optional.
+ *
+ * Returns:
+ *
+ * 0 on success or a negative error code on failure.
+ */
+ int (*begin_access)(struct dma_buf_attachment *, struct sg_table *,
+ enum dma_data_direction);
+
+ /**
+ * @end_access:
+ *
+ * This is called from dma_buf_end_access() when a device driver is
+ * done accessing the data of the DMABUF. The exporter can use this
+ * to flush/sync the caches if needed.
+ *
+ * This callback is optional.
+ *
+ * Returns:
+ *
+ * 0 on success or a negative error code on failure.
+ */
+ int (*end_access)(struct dma_buf_attachment *, struct sg_table *,
+ enum dma_data_direction);
+
/**
* @mmap:
*
@@ -606,6 +638,11 @@ void dma_buf_detach(struct dma_buf *dmabuf,
int dma_buf_pin(struct dma_buf_attachment *attach);
void dma_buf_unpin(struct dma_buf_attachment *attach);

+int dma_buf_begin_access(struct dma_buf_attachment *attach,
+ struct sg_table *sgt, enum dma_data_direction dir);
+int dma_buf_end_access(struct dma_buf_attachment *attach,
+ struct sg_table *sgt, enum dma_data_direction dir);
+
struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info);

int dma_buf_fd(struct dma_buf *dmabuf, int flags);
--
2.43.0


2024-01-19 14:17:30

by Paul Cercueil

Subject: [PATCH v5 6/6] Documentation: usb: Document FunctionFS DMABUF API

Add documentation for the three ioctls used to attach or detach
externally-created DMABUFs, and to request transfers from/to previously
attached DMABUFs.
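
For illustration only (not part of this patch; error unwinding omitted), a
minimal userspace sketch of the flow described above could look like this,
where ep_fd is an already-open FunctionFS data endpoint and dmabuf_fd a DMABUF
obtained elsewhere, e.g. from IIO or a DMA-BUF heap:

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/usb/functionfs.h>

static int transfer_dmabuf(int ep_fd, int dmabuf_fd, uint64_t len)
{
	struct usb_ffs_dmabuf_transfer_req req = {
		.fd = dmabuf_fd,
		.flags = 0,
		.length = len,
	};

	/* Attach once; the endpoint keeps the mapping until detach/close */
	if (ioctl(ep_fd, FUNCTIONFS_DMABUF_ATTACH, &dmabuf_fd) < 0)
		return -1;

	/* Enqueue the transfer; completion is signalled via a dma_fence */
	if (ioctl(ep_fd, FUNCTIONFS_DMABUF_TRANSFER, &req) < 0)
		return -1;

	/* ... wait for completion, e.g. by poll()ing dmabuf_fd ... */

	return ioctl(ep_fd, FUNCTIONFS_DMABUF_DETACH, &dmabuf_fd);
}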

Signed-off-by: Paul Cercueil <[email protected]>

---
v3: New patch
---
Documentation/usb/functionfs.rst | 36 ++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)

diff --git a/Documentation/usb/functionfs.rst b/Documentation/usb/functionfs.rst
index a3054bea38f3..d05a775bc45b 100644
--- a/Documentation/usb/functionfs.rst
+++ b/Documentation/usb/functionfs.rst
@@ -2,6 +2,9 @@
How FunctionFS works
====================

+Overview
+========
+
From kernel point of view it is just a composite function with some
unique behaviour. It may be added to an USB configuration only after
the user space driver has registered by writing descriptors and
@@ -66,3 +69,36 @@ have been written to their ep0's.

Conversely, the gadget is unregistered after the first USB function
closes its endpoints.
+
+DMABUF interface
+================
+
+FunctionFS additionally supports a DMABUF based interface, where the
+userspace can attach DMABUF objects (externally created) to an endpoint,
+and subsequently use them for data transfers.
+
+A userspace application can then use this interface to share DMABUF
+objects between several interfaces, allowing it to transfer data in a
+zero-copy fashion, for instance between IIO and the USB stack.
+
+As part of this interface, three new IOCTLs have been added. These three
+IOCTLs have to be performed on a data endpoint (ie. not ep0). They are:
+
+ ``FUNCTIONFS_DMABUF_ATTACH(int)``
+ Attach the DMABUF object, identified by its file descriptor, to the
+ data endpoint. Returns zero on success, and a negative errno value
+ on error.
+
+ ``FUNCTIONFS_DMABUF_DETACH(int)``
+ Detach the given DMABUF object, identified by its file descriptor,
+ from the data endpoint. Returns zero on success, and a negative
+ errno value on error. Note that closing the endpoint's file
+ descriptor will automatically detach all attached DMABUFs.
+
+ ``FUNCTIONFS_DMABUF_TRANSFER(struct usb_ffs_dmabuf_transfer_req *)``
+ Enqueue the previously attached DMABUF to the transfer queue.
+ The argument is a structure that packs the DMABUF's file descriptor,
+ the size in bytes to transfer (which should generally correspond to
+ the size of the DMABUF), and a 'flags' field which is unused
+ for now. Returns zero on success, and a negative errno value on
+ error.
--
2.43.0


2024-01-19 14:19:57

by Paul Cercueil

Subject: [PATCH v5 5/6] usb: gadget: functionfs: Add DMABUF import interface

This patch introduces three new ioctls. They must all be called on a
data endpoint (i.e. not ep0). They are:

- FUNCTIONFS_DMABUF_ATTACH, which takes the file descriptor of a DMABUF
object to attach to the endpoint.

- FUNCTIONFS_DMABUF_DETACH, which takes the file descriptor of the
DMABUF to detach from the endpoint. Note that closing the endpoint's
file descriptor will automatically detach all attached DMABUFs.

- FUNCTIONFS_DMABUF_TRANSFER, which requests a data transfer from / to
the given DMABUF. Its argument is a structure that packs the DMABUF's
file descriptor, the size in bytes to transfer (which should generally
be set to the size of the DMABUF), and a 'flags' field which is unused
for now.
Before this ioctl can be used, the related DMABUF must be attached
with FUNCTIONFS_DMABUF_ATTACH.

These three ioctls enable the FunctionFS code to transfer data between
the USB stack and a DMABUF object, which can be provided by a driver
from a completely different subsystem, in a zero-copy fashion.

Signed-off-by: Paul Cercueil <[email protected]>
Acked-by: Daniel Vetter <[email protected]>
Acked-by: Christian König <[email protected]>

---
v2:
- Make ffs_dma_resv_lock() static
- Add MODULE_IMPORT_NS(DMA_BUF);
- The attach/detach functions are now performed without locking the
eps_lock spinlock. The transfer function starts with the spinlock
unlocked, then locks it before allocating and queueing the USB
transfer.

v3:
- Inline to_ffs_dma_fence() which was called only once.
- Simplify ffs_dma_resv_lock()
- Add comment explaining why we unref twice in ffs_dmabuf_detach()
- Document uapi struct usb_ffs_dmabuf_transfer_req and IOCTLs

v4:
- Protect the DMABUF list with a mutex
- Use incremental sequence number for the dma_fences
- Unref attachments and DMABUFs in workers
- Remove dead code in ffs_dma_resv_lock()
- Fix non-block actually blocking
- Use dma_fence_begin/end_signalling()
- Add comment about cache-management and dma_buf_unmap_attachment()
- Make sure dma_buf_map_attachment() is called with the dma-resv locked

v5:
- Cache the dma_buf_attachment while the DMABUF is attached.
- Use dma_buf_begin/end_access() to ensure that the DMABUF data will be
coherent to the hardware.
- Remove comment about cache-management and dma_buf_unmap_attachment(),
since we now use dma_buf_begin/end_access().
- Added Christian's ACK
- Select DMA_SHARED_BUFFER in Kconfig entry
---
drivers/usb/gadget/Kconfig | 1 +
drivers/usb/gadget/function/f_fs.c | 456 ++++++++++++++++++++++++++++
include/uapi/linux/usb/functionfs.h | 41 +++
3 files changed, 498 insertions(+)

diff --git a/drivers/usb/gadget/Kconfig b/drivers/usb/gadget/Kconfig
index b3592bcb0f96..566ff0b1282a 100644
--- a/drivers/usb/gadget/Kconfig
+++ b/drivers/usb/gadget/Kconfig
@@ -190,6 +190,7 @@ config USB_F_MASS_STORAGE
tristate

config USB_F_FS
+ select DMA_SHARED_BUFFER
tristate

config USB_F_UAC1
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index ed2a6d5fcef7..82cc449a4d7e 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -15,6 +15,9 @@
/* #define VERBOSE_DEBUG */

#include <linux/blkdev.h>
+#include <linux/dma-buf.h>
+#include <linux/dma-fence.h>
+#include <linux/dma-resv.h>
#include <linux/pagemap.h>
#include <linux/export.h>
#include <linux/fs_parser.h>
@@ -43,6 +46,8 @@

#define FUNCTIONFS_MAGIC 0xa647361 /* Chosen by a honest dice roll ;) */

+MODULE_IMPORT_NS(DMA_BUF);
+
/* Reference counter handling */
static void ffs_data_get(struct ffs_data *ffs);
static void ffs_data_put(struct ffs_data *ffs);
@@ -124,6 +129,23 @@ struct ffs_ep {
u8 num;
};

+struct ffs_dmabuf_priv {
+ struct list_head entry;
+ struct kref ref;
+ struct ffs_data *ffs;
+ struct dma_buf_attachment *attach;
+ struct sg_table *sgt;
+ enum dma_data_direction dir;
+ spinlock_t lock;
+ u64 context;
+};
+
+struct ffs_dma_fence {
+ struct dma_fence base;
+ struct ffs_dmabuf_priv *priv;
+ struct work_struct work;
+};
+
struct ffs_epfile {
/* Protects ep->ep and ep->req. */
struct mutex mutex;
@@ -197,6 +219,11 @@ struct ffs_epfile {
unsigned char isoc; /* P: ffs->eps_lock */

unsigned char _pad;
+
+ /* Protects dmabufs */
+ struct mutex dmabufs_mutex;
+ struct list_head dmabufs; /* P: dmabufs_mutex */
+ atomic_t seqno;
};

struct ffs_buffer {
@@ -1271,10 +1298,51 @@ static ssize_t ffs_epfile_read_iter(struct kiocb *kiocb, struct iov_iter *to)
return res;
}

+static void ffs_dmabuf_release(struct kref *ref)
+{
+ struct ffs_dmabuf_priv *priv = container_of(ref, struct ffs_dmabuf_priv, ref);
+ struct dma_buf_attachment *attach = priv->attach;
+ struct dma_buf *dmabuf = attach->dmabuf;
+
+ pr_debug("FFS DMABUF release\n");
+ dma_resv_lock(dmabuf->resv, NULL);
+ dma_buf_unmap_attachment(attach, priv->sgt, priv->dir);
+ dma_resv_unlock(dmabuf->resv);
+
+ dma_buf_detach(attach->dmabuf, attach);
+ dma_buf_put(dmabuf);
+ kfree(priv);
+}
+
+static void ffs_dmabuf_get(struct dma_buf_attachment *attach)
+{
+ struct ffs_dmabuf_priv *priv = attach->importer_priv;
+
+ kref_get(&priv->ref);
+}
+
+static void ffs_dmabuf_put(struct dma_buf_attachment *attach)
+{
+ struct ffs_dmabuf_priv *priv = attach->importer_priv;
+
+ kref_put(&priv->ref, ffs_dmabuf_release);
+}
+
static int
ffs_epfile_release(struct inode *inode, struct file *file)
{
struct ffs_epfile *epfile = inode->i_private;
+ struct ffs_dmabuf_priv *priv, *tmp;
+
+ mutex_lock(&epfile->dmabufs_mutex);
+
+ /* Close all attached DMABUFs */
+ list_for_each_entry_safe(priv, tmp, &epfile->dmabufs, entry) {
+ list_del(&priv->entry);
+ ffs_dmabuf_put(priv->attach);
+ }
+
+ mutex_unlock(&epfile->dmabufs_mutex);

__ffs_epfile_read_buffer_free(epfile);
ffs_data_closed(epfile->ffs);
@@ -1282,6 +1350,354 @@ ffs_epfile_release(struct inode *inode, struct file *file)
return 0;
}

+static void ffs_dmabuf_unmap_work(struct work_struct *work)
+{
+ struct ffs_dma_fence *dma_fence =
+ container_of(work, struct ffs_dma_fence, work);
+ struct ffs_dmabuf_priv *priv = dma_fence->priv;
+ struct dma_buf_attachment *attach = priv->attach;
+ struct dma_fence *fence = &dma_fence->base;
+
+ ffs_dmabuf_put(attach);
+ dma_fence_put(fence);
+}
+
+static void ffs_dmabuf_signal_done(struct ffs_dma_fence *dma_fence, int ret)
+{
+ struct ffs_dmabuf_priv *priv = dma_fence->priv;
+ struct dma_fence *fence = &dma_fence->base;
+ bool cookie = dma_fence_begin_signalling();
+
+ dma_fence_get(fence);
+ dma_buf_end_access(priv->attach, priv->sgt, priv->dir);
+
+ fence->error = ret;
+ dma_fence_signal(fence);
+ dma_fence_end_signalling(cookie);
+
+ /*
+ * The fence will be unref'd in ffs_dmabuf_unmap_work.
+ * It can't be done here, as the unref functions might try to lock
+ * the resv object, which would deadlock.
+ */
+ INIT_WORK(&dma_fence->work, ffs_dmabuf_unmap_work);
+ queue_work(priv->ffs->io_completion_wq, &dma_fence->work);
+}
+
+static void ffs_epfile_dmabuf_io_complete(struct usb_ep *ep,
+ struct usb_request *req)
+{
+ pr_debug("FFS: DMABUF transfer complete, status=%d\n", req->status);
+ ffs_dmabuf_signal_done(req->context, req->status);
+ usb_ep_free_request(ep, req);
+}
+
+static const char *ffs_dmabuf_get_driver_name(struct dma_fence *fence)
+{
+ return "functionfs";
+}
+
+static const char *ffs_dmabuf_get_timeline_name(struct dma_fence *fence)
+{
+ return "";
+}
+
+static void ffs_dmabuf_fence_release(struct dma_fence *fence)
+{
+ struct ffs_dma_fence *dma_fence =
+ container_of(fence, struct ffs_dma_fence, base);
+
+ kfree(dma_fence);
+}
+
+static const struct dma_fence_ops ffs_dmabuf_fence_ops = {
+ .get_driver_name = ffs_dmabuf_get_driver_name,
+ .get_timeline_name = ffs_dmabuf_get_timeline_name,
+ .release = ffs_dmabuf_fence_release,
+};
+
+static int ffs_dma_resv_lock(struct dma_buf *dmabuf, bool nonblock)
+{
+ if (!nonblock)
+ return dma_resv_lock_interruptible(dmabuf->resv, NULL);
+
+ if (!dma_resv_trylock(dmabuf->resv))
+ return -EBUSY;
+
+ return 0;
+}
+
+static struct dma_buf_attachment *
+ffs_dmabuf_find_attachment(struct ffs_epfile *epfile, struct dma_buf *dmabuf)
+{
+ struct device *dev = epfile->ffs->gadget->dev.parent;
+ struct dma_buf_attachment *attach = NULL;
+ struct ffs_dmabuf_priv *priv;
+
+ mutex_lock(&epfile->dmabufs_mutex);
+
+ list_for_each_entry(priv, &epfile->dmabufs, entry) {
+ if (priv->attach->dev == dev
+ && priv->attach->dmabuf == dmabuf) {
+ attach = priv->attach;
+ break;
+ }
+ }
+
+ if (attach)
+ ffs_dmabuf_get(attach);
+
+ mutex_unlock(&epfile->dmabufs_mutex);
+
+ return attach ?: ERR_PTR(-EPERM);
+}
+
+static int ffs_dmabuf_attach(struct file *file, int fd)
+{
+ bool nonblock = file->f_flags & O_NONBLOCK;
+ struct ffs_epfile *epfile = file->private_data;
+ struct usb_gadget *gadget = epfile->ffs->gadget;
+ struct dma_buf_attachment *attach;
+ struct ffs_dmabuf_priv *priv;
+ enum dma_data_direction dir;
+ struct sg_table *sg_table;
+ struct dma_buf *dmabuf;
+ int err;
+
+ if (!gadget || !gadget->sg_supported)
+ return -EPERM;
+
+ dmabuf = dma_buf_get(fd);
+ if (IS_ERR(dmabuf))
+ return PTR_ERR(dmabuf);
+
+ attach = dma_buf_attach(dmabuf, gadget->dev.parent);
+ if (IS_ERR(attach)) {
+ err = PTR_ERR(attach);
+ goto err_dmabuf_put;
+ }
+
+ priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+ if (!priv) {
+ err = -ENOMEM;
+ goto err_dmabuf_detach;
+ }
+
+ dir = epfile->in ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
+
+ err = ffs_dma_resv_lock(dmabuf, nonblock);
+ if (err)
+ goto err_free_priv;
+
+ sg_table = dma_buf_map_attachment(attach, dir);
+ dma_resv_unlock(dmabuf->resv);
+
+ if (IS_ERR(sg_table)) {
+ err = PTR_ERR(sg_table);
+ goto err_free_priv;
+ }
+
+ attach->importer_priv = priv;
+
+ priv->sgt = sg_table;
+ priv->dir = dir;
+ priv->ffs = epfile->ffs;
+ priv->attach = attach;
+ spin_lock_init(&priv->lock);
+ kref_init(&priv->ref);
+ priv->context = dma_fence_context_alloc(1);
+
+ mutex_lock(&epfile->dmabufs_mutex);
+ list_add(&priv->entry, &epfile->dmabufs);
+ mutex_unlock(&epfile->dmabufs_mutex);
+
+ return 0;
+
+err_free_priv:
+ kfree(priv);
+err_dmabuf_detach:
+ dma_buf_detach(dmabuf, attach);
+err_dmabuf_put:
+ dma_buf_put(dmabuf);
+
+ return err;
+}
+
+static int ffs_dmabuf_detach(struct file *file, int fd)
+{
+ struct ffs_epfile *epfile = file->private_data;
+ struct device *dev = epfile->ffs->gadget->dev.parent;
+ struct ffs_dmabuf_priv *priv;
+ struct dma_buf *dmabuf;
+ int ret = -EPERM;
+
+ dmabuf = dma_buf_get(fd);
+ if (IS_ERR(dmabuf))
+ return PTR_ERR(dmabuf);
+
+ mutex_lock(&epfile->dmabufs_mutex);
+
+ list_for_each_entry(priv, &epfile->dmabufs, entry) {
+ if (priv->attach->dev == dev
+ && priv->attach->dmabuf == dmabuf) {
+ list_del(&priv->entry);
+
+ /* Unref the reference from ffs_dmabuf_attach() */
+ ffs_dmabuf_put(priv->attach);
+ ret = 0;
+ break;
+ }
+ }
+
+ mutex_unlock(&epfile->dmabufs_mutex);
+ dma_buf_put(dmabuf);
+
+ return ret;
+}
+
+static int ffs_dmabuf_transfer(struct file *file,
+ const struct usb_ffs_dmabuf_transfer_req *req)
+{
+ bool nonblock = file->f_flags & O_NONBLOCK;
+ struct ffs_epfile *epfile = file->private_data;
+ struct dma_buf_attachment *attach;
+ struct ffs_dmabuf_priv *priv;
+ struct ffs_dma_fence *fence;
+ struct usb_request *usb_req;
+ struct dma_buf *dmabuf;
+ struct ffs_ep *ep;
+ bool cookie;
+ u32 seqno;
+ int ret;
+
+ if (req->flags & ~USB_FFS_DMABUF_TRANSFER_MASK)
+ return -EINVAL;
+
+ dmabuf = dma_buf_get(req->fd);
+ if (IS_ERR(dmabuf))
+ return PTR_ERR(dmabuf);
+
+ if (req->length > dmabuf->size || req->length == 0) {
+ ret = -EINVAL;
+ goto err_dmabuf_put;
+ }
+
+ attach = ffs_dmabuf_find_attachment(epfile, dmabuf);
+ if (IS_ERR(attach)) {
+ ret = PTR_ERR(attach);
+ goto err_dmabuf_put;
+ }
+
+ priv = attach->importer_priv;
+
+ ep = ffs_epfile_wait_ep(file);
+ if (IS_ERR(ep)) {
+ ret = PTR_ERR(ep);
+ goto err_attachment_put;
+ }
+
+ ret = dma_buf_begin_access(attach, priv->sgt, priv->dir);
+ if (ret)
+ goto err_attachment_put;
+
+ ret = ffs_dma_resv_lock(dmabuf, nonblock);
+ if (ret)
+ goto err_end_access;
+
+ /* Make sure we don't have writers */
+ if (!dma_resv_test_signaled(dmabuf->resv, DMA_RESV_USAGE_WRITE)) {
+ pr_debug("FFS WRITE fence is not signaled\n");
+ ret = -EBUSY;
+ goto err_resv_unlock;
+ }
+
+ /* If we're writing to the DMABUF, make sure we don't have readers */
+ if (epfile->in &&
+ !dma_resv_test_signaled(dmabuf->resv, DMA_RESV_USAGE_READ)) {
+ pr_debug("FFS READ fence is not signaled\n");
+ ret = -EBUSY;
+ goto err_resv_unlock;
+ }
+
+ ret = dma_resv_reserve_fences(dmabuf->resv, 1);
+ if (ret)
+ goto err_resv_unlock;
+
+ fence = kmalloc(sizeof(*fence), GFP_KERNEL);
+ if (!fence) {
+ ret = -ENOMEM;
+ goto err_resv_unlock;
+ }
+
+ fence->priv = priv;
+
+ spin_lock_irq(&epfile->ffs->eps_lock);
+
+ /* In the meantime, endpoint got disabled or changed. */
+ if (epfile->ep != ep) {
+ ret = -ESHUTDOWN;
+ goto err_fence_put;
+ }
+
+ usb_req = usb_ep_alloc_request(ep->ep, GFP_ATOMIC);
+ if (!usb_req) {
+ ret = -ENOMEM;
+ goto err_fence_put;
+ }
+
+ /*
+ * usb_ep_queue() guarantees that all transfers are processed in the
+ * order they are enqueued, so we can use a simple incrementing
+ * sequence number for the dma_fence.
+ */
+ seqno = atomic_add_return(1, &epfile->seqno);
+
+ dma_fence_init(&fence->base, &ffs_dmabuf_fence_ops,
+ &priv->lock, priv->context, seqno);
+
+ dma_resv_add_fence(dmabuf->resv, &fence->base,
+ dma_resv_usage_rw(epfile->in));
+ dma_resv_unlock(dmabuf->resv);
+
+ /* Now that the dma_fence is in place, queue the transfer. */
+
+ usb_req->length = req->length;
+ usb_req->buf = NULL;
+ usb_req->sg = priv->sgt->sgl;
+ usb_req->num_sgs = sg_nents_for_len(priv->sgt->sgl, req->length);
+ usb_req->sg_was_mapped = true;
+ usb_req->context = fence;
+ usb_req->complete = ffs_epfile_dmabuf_io_complete;
+
+ cookie = dma_fence_begin_signalling();
+ ret = usb_ep_queue(ep->ep, usb_req, GFP_ATOMIC);
+ dma_fence_end_signalling(cookie);
+ if (ret) {
+ pr_warn("FFS: Failed to queue DMABUF: %d\n", ret);
+ ffs_dmabuf_signal_done(fence, ret);
+ usb_ep_free_request(ep->ep, usb_req);
+ }
+
+ spin_unlock_irq(&epfile->ffs->eps_lock);
+ dma_buf_put(dmabuf);
+
+ return ret;
+
+err_fence_put:
+ spin_unlock_irq(&epfile->ffs->eps_lock);
+ dma_fence_put(&fence->base);
+err_resv_unlock:
+ dma_resv_unlock(dmabuf->resv);
+err_end_access:
+ dma_buf_end_access(attach, priv->sgt, priv->dir);
+err_attachment_put:
+ ffs_dmabuf_put(attach);
+err_dmabuf_put:
+ dma_buf_put(dmabuf);
+
+ return ret;
+}
+
static long ffs_epfile_ioctl(struct file *file, unsigned code,
unsigned long value)
{
@@ -1292,6 +1708,44 @@ static long ffs_epfile_ioctl(struct file *file, unsigned code,
if (WARN_ON(epfile->ffs->state != FFS_ACTIVE))
return -ENODEV;

+ switch (code) {
+ case FUNCTIONFS_DMABUF_ATTACH:
+ {
+ int fd;
+
+ if (copy_from_user(&fd, (void __user *)value, sizeof(fd))) {
+ ret = -EFAULT;
+ break;
+ }
+
+ return ffs_dmabuf_attach(file, fd);
+ }
+ case FUNCTIONFS_DMABUF_DETACH:
+ {
+ int fd;
+
+ if (copy_from_user(&fd, (void __user *)value, sizeof(fd))) {
+ ret = -EFAULT;
+ break;
+ }
+
+ return ffs_dmabuf_detach(file, fd);
+ }
+ case FUNCTIONFS_DMABUF_TRANSFER:
+ {
+ struct usb_ffs_dmabuf_transfer_req req;
+
+ if (copy_from_user(&req, (void __user *)value, sizeof(req))) {
+ ret = -EFAULT;
+ break;
+ }
+
+ return ffs_dmabuf_transfer(file, &req);
+ }
+ default:
+ break;
+ }
+
/* Wait for endpoint to be enabled */
ep = ffs_epfile_wait_ep(file);
if (IS_ERR(ep))
@@ -1869,6 +2323,8 @@ static int ffs_epfiles_create(struct ffs_data *ffs)
for (i = 1; i <= count; ++i, ++epfile) {
epfile->ffs = ffs;
mutex_init(&epfile->mutex);
+ mutex_init(&epfile->dmabufs_mutex);
+ INIT_LIST_HEAD(&epfile->dmabufs);
if (ffs->user_flags & FUNCTIONFS_VIRTUAL_ADDR)
sprintf(epfile->name, "ep%02x", ffs->eps_addrmap[i]);
else
diff --git a/include/uapi/linux/usb/functionfs.h b/include/uapi/linux/usb/functionfs.h
index 078098e73fd3..9f88de9c3d66 100644
--- a/include/uapi/linux/usb/functionfs.h
+++ b/include/uapi/linux/usb/functionfs.h
@@ -86,6 +86,22 @@ struct usb_ext_prop_desc {
__le16 wPropertyNameLength;
} __attribute__((packed));

+/* Flags for usb_ffs_dmabuf_transfer_req->flags (none for now) */
+#define USB_FFS_DMABUF_TRANSFER_MASK 0x0
+
+/**
+ * struct usb_ffs_dmabuf_transfer_req - Transfer request for a DMABUF object
+ * @fd: file descriptor of the DMABUF object
+ * @flags: one or more USB_FFS_DMABUF_TRANSFER_* flags
+ * @length: number of bytes used in this DMABUF for the data transfer.
+ * Should generally be set to the DMABUF's size.
+ */
+struct usb_ffs_dmabuf_transfer_req {
+ int fd;
+ __u32 flags;
+ __u64 length;
+} __attribute__((packed));
+
#ifndef __KERNEL__

/*
@@ -290,6 +306,31 @@ struct usb_functionfs_event {
#define FUNCTIONFS_ENDPOINT_DESC _IOR('g', 130, \
struct usb_endpoint_descriptor)

+/*
+ * Attach the DMABUF object, identified by its file descriptor, to the
+ * data endpoint. Returns zero on success, and a negative errno value
+ * on error.
+ */
+#define FUNCTIONFS_DMABUF_ATTACH _IOW('g', 131, int)
+

+/*
+ * Detach the given DMABUF object, identified by its file descriptor,
+ * from the data endpoint. Returns zero on success, and a negative
+ * errno value on error. Note that closing the endpoint's file
+ * descriptor will automatically detach all attached DMABUFs.
+ */
+#define FUNCTIONFS_DMABUF_DETACH _IOW('g', 132, int)
+
+/*
+ * Enqueue the previously attached DMABUF to the transfer queue.
+ * The argument is a structure that packs the DMABUF's file descriptor,
+ * the size in bytes to transfer (which should generally correspond to
+ * the size of the DMABUF), and a 'flags' field which is unused
+ * for now. Returns zero on success, and a negative errno value on
+ * error.
+ */
+#define FUNCTIONFS_DMABUF_TRANSFER _IOW('g', 133, \
+ struct usb_ffs_dmabuf_transfer_req)

#endif /* _UAPI__LINUX_FUNCTIONFS_H__ */
--
2.43.0


2024-01-20 20:21:14

by kernel test robot

Subject: Re: [PATCH v5 1/6] dma-buf: Add dma_buf_{begin,end}_access()

Hi Paul,

kernel test robot noticed the following build warnings:

[auto build test WARNING on usb/usb-testing]
[also build test WARNING on usb/usb-next usb/usb-linus drm-misc/drm-misc-next lwn/docs-next linus/master v6.7 next-20240119]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Paul-Cercueil/dma-buf-Add-dma_buf_-begin-end-_access/20240119-221604
base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing
patch link: https://lore.kernel.org/r/20240119141402.44262-2-paul%40crapouillou.net
patch subject: [PATCH v5 1/6] dma-buf: Add dma_buf_{begin,end}_access()
config: arm-randconfig-001-20240120 (https://download.01.org/0day-ci/archive/20240121/[email protected]/config)
compiler: clang version 18.0.0git (https://github.com/llvm/llvm-project d92ce344bf641e6bb025b41b3f1a77dd25e2b3e9)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240121/[email protected]/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All warnings (new ones prefixed by >>):

>> drivers/dma-buf/dma-buf.c:1608: warning: Cannot understand * @dma_buf_begin_access - Call before any hardware access from/to the DMABUF
on line 1608 - I thought it was a doc line
>> drivers/dma-buf/dma-buf.c:1640: warning: Cannot understand * @dma_buf_end_access - Call after any hardware access from/to the DMABUF
on line 1640 - I thought it was a doc line


vim +1608 drivers/dma-buf/dma-buf.c

1606
1607 /**
> 1608 * @dma_buf_begin_access - Call before any hardware access from/to the DMABUF
1609 * @attach: [in] attachment used for hardware access
1610 * @sg_table: [in] scatterlist used for the DMA transfer
1611 * @direction: [in] direction of DMA transfer
1612 */
1613 int dma_buf_begin_access(struct dma_buf_attachment *attach,
1614 struct sg_table *sgt, enum dma_data_direction dir)
1615 {
1616 struct dma_buf *dmabuf;
1617 bool cookie;
1618 int ret;
1619
1620 if (WARN_ON(!attach))
1621 return -EINVAL;
1622
1623 dmabuf = attach->dmabuf;
1624
1625 if (!dmabuf->ops->begin_access)
1626 return 0;
1627
1628 cookie = dma_fence_begin_signalling();
1629 ret = dmabuf->ops->begin_access(attach, sgt, dir);
1630 dma_fence_end_signalling(cookie);
1631
1632 if (WARN_ON_ONCE(ret))
1633 return ret;
1634
1635 return 0;
1636 }
1637 EXPORT_SYMBOL_NS_GPL(dma_buf_begin_access, DMA_BUF);
1638
1639 /**
> 1640 * @dma_buf_end_access - Call after any hardware access from/to the DMABUF
1641 * @attach: [in] attachment used for hardware access
1642 * @sg_table: [in] scatterlist used for the DMA transfer
1643 * @direction: [in] direction of DMA transfer
1644 */
1645 int dma_buf_end_access(struct dma_buf_attachment *attach,
1646 struct sg_table *sgt, enum dma_data_direction dir)
1647 {
1648 struct dma_buf *dmabuf;
1649 bool cookie;
1650 int ret;
1651
1652 if (WARN_ON(!attach))
1653 return -EINVAL;
1654
1655 dmabuf = attach->dmabuf;
1656
1657 if (!dmabuf->ops->end_access)
1658 return 0;
1659
1660 cookie = dma_fence_begin_signalling();
1661 ret = dmabuf->ops->end_access(attach, sgt, dir);
1662 dma_fence_end_signalling(cookie);
1663
1664 if (WARN_ON_ONCE(ret))
1665 return ret;
1666
1667 return 0;
1668 }
1669 EXPORT_SYMBOL_NS_GPL(dma_buf_end_access, DMA_BUF);
1670

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

2024-01-22 10:35:51

by Christian König

Subject: Re: [Linaro-mm-sig] [PATCH v5 1/6] dma-buf: Add dma_buf_{begin,end}_access()

On 19.01.24 15:13, Paul Cercueil wrote:
> These functions should be used by device drivers when they start and
> stop accessing the data of DMABUF. It allows DMABUF importers to cache
> the dma_buf_attachment while ensuring that the data they want to access
> is available for their device when the DMA transfers take place.

As Daniel already noted as well this is a complete no-go from the
DMA-buf design point of view.

Regards,
Christian.

>
> Signed-off-by: Paul Cercueil <[email protected]>
>
> ---
> v5: New patch
> ---
> drivers/dma-buf/dma-buf.c | 66 +++++++++++++++++++++++++++++++++++++++
> include/linux/dma-buf.h | 37 ++++++++++++++++++++++
> 2 files changed, 103 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 8fe5aa67b167..a8bab6c18fcd 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -830,6 +830,8 @@ static struct sg_table * __map_dma_buf(struct dma_buf_attachment *attach,
> * - dma_buf_mmap()
> * - dma_buf_begin_cpu_access()
> * - dma_buf_end_cpu_access()
> + * - dma_buf_begin_access()
> + * - dma_buf_end_access()
> * - dma_buf_map_attachment_unlocked()
> * - dma_buf_unmap_attachment_unlocked()
> * - dma_buf_vmap_unlocked()
> @@ -1602,6 +1604,70 @@ void dma_buf_vunmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map)
> }
> EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap_unlocked, DMA_BUF);
>
> +/**
> + * @dma_buf_begin_access - Call before any hardware access from/to the DMABUF
> + * @attach: [in] attachment used for hardware access
> + * @sg_table: [in] scatterlist used for the DMA transfer
> + * @direction: [in] direction of DMA transfer
> + */
> +int dma_buf_begin_access(struct dma_buf_attachment *attach,
> + struct sg_table *sgt, enum dma_data_direction dir)
> +{
> + struct dma_buf *dmabuf;
> + bool cookie;
> + int ret;
> +
> + if (WARN_ON(!attach))
> + return -EINVAL;
> +
> + dmabuf = attach->dmabuf;
> +
> + if (!dmabuf->ops->begin_access)
> + return 0;
> +
> + cookie = dma_fence_begin_signalling();
> + ret = dmabuf->ops->begin_access(attach, sgt, dir);
> + dma_fence_end_signalling(cookie);
> +
> + if (WARN_ON_ONCE(ret))
> + return ret;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(dma_buf_begin_access, DMA_BUF);
> +
> +/**
> + * @dma_buf_end_access - Call after any hardware access from/to the DMABUF
> + * @attach: [in] attachment used for hardware access
> + * @sg_table: [in] scatterlist used for the DMA transfer
> + * @direction: [in] direction of DMA transfer
> + */
> +int dma_buf_end_access(struct dma_buf_attachment *attach,
> + struct sg_table *sgt, enum dma_data_direction dir)
> +{
> + struct dma_buf *dmabuf;
> + bool cookie;
> + int ret;
> +
> + if (WARN_ON(!attach))
> + return -EINVAL;
> +
> + dmabuf = attach->dmabuf;
> +
> + if (!dmabuf->ops->end_access)
> + return 0;
> +
> + cookie = dma_fence_begin_signalling();
> + ret = dmabuf->ops->end_access(attach, sgt, dir);
> + dma_fence_end_signalling(cookie);
> +
> + if (WARN_ON_ONCE(ret))
> + return ret;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(dma_buf_end_access, DMA_BUF);
> +
> #ifdef CONFIG_DEBUG_FS
> static int dma_buf_debug_show(struct seq_file *s, void *unused)
> {
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 8ff4add71f88..8ba612c7cc16 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -246,6 +246,38 @@ struct dma_buf_ops {
> */
> int (*end_cpu_access)(struct dma_buf *, enum dma_data_direction);
>
> + /**
> + * @begin_access:
> + *
> + * This is called from dma_buf_begin_access() when a device driver
> + * wants to access the data of the DMABUF. The exporter can use this
> + * to flush/sync the caches if needed.
> + *
> + * This callback is optional.
> + *
> + * Returns:
> + *
> + * 0 on success or a negative error code on failure.
> + */
> + int (*begin_access)(struct dma_buf_attachment *, struct sg_table *,
> + enum dma_data_direction);
> +
> + /**
> + * @end_access:
> + *
> + * This is called from dma_buf_end_access() when a device driver is
> + * done accessing the data of the DMABUF. The exporter can use this
> + * to flush/sync the caches if needed.
> + *
> + * This callback is optional.
> + *
> + * Returns:
> + *
> + * 0 on success or a negative error code on failure.
> + */
> + int (*end_access)(struct dma_buf_attachment *, struct sg_table *,
> + enum dma_data_direction);
> +
> /**
> * @mmap:
> *
> @@ -606,6 +638,11 @@ void dma_buf_detach(struct dma_buf *dmabuf,
> int dma_buf_pin(struct dma_buf_attachment *attach);
> void dma_buf_unpin(struct dma_buf_attachment *attach);
>
> +int dma_buf_begin_access(struct dma_buf_attachment *attach,
> + struct sg_table *sgt, enum dma_data_direction dir);
> +int dma_buf_end_access(struct dma_buf_attachment *attach,
> + struct sg_table *sgt, enum dma_data_direction dir);
> +
> struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info);
>
> int dma_buf_fd(struct dma_buf *dmabuf, int flags);


2024-01-22 11:04:55

by Paul Cercueil

Subject: Re: [Linaro-mm-sig] [PATCH v5 1/6] dma-buf: Add dma_buf_{begin,end}_access()

Hi Christian,

On Monday, 22 January 2024 at 11:35 +0100, Christian König wrote:
> On 19.01.24 15:13, Paul Cercueil wrote:
> > These functions should be used by device drivers when they start
> > and
> > stop accessing the data of DMABUF. It allows DMABUF importers to
> > cache
> > the dma_buf_attachment while ensuring that the data they want to
> > access
> > is available for their device when the DMA transfers take place.
>
> As Daniel already noted as well this is a complete no-go from the
> DMA-buf design point of view.

What do you mean "as Daniel already noted"? It was him who suggested
this.

>
> Regards,
> Christian.

Cheers,
-Paul

>
> >
> > Signed-off-by: Paul Cercueil <[email protected]>
> >
> > ---
> > v5: New patch
> > ---
> >   drivers/dma-buf/dma-buf.c | 66
> > +++++++++++++++++++++++++++++++++++++++
> >   include/linux/dma-buf.h   | 37 ++++++++++++++++++++++
> >   2 files changed, 103 insertions(+)
> >
> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index 8fe5aa67b167..a8bab6c18fcd 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -830,6 +830,8 @@ static struct sg_table * __map_dma_buf(struct
> > dma_buf_attachment *attach,
> >    *     - dma_buf_mmap()
> >    *     - dma_buf_begin_cpu_access()
> >    *     - dma_buf_end_cpu_access()
> > + *     - dma_buf_begin_access()
> > + *     - dma_buf_end_access()
> >    *     - dma_buf_map_attachment_unlocked()
> >    *     - dma_buf_unmap_attachment_unlocked()
> >    *     - dma_buf_vmap_unlocked()
> > @@ -1602,6 +1604,70 @@ void dma_buf_vunmap_unlocked(struct dma_buf
> > *dmabuf, struct iosys_map *map)
> >   }
> >   EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap_unlocked, DMA_BUF);
> >  
> > +/**
> > + * @dma_buf_begin_access - Call before any hardware access from/to
> > the DMABUF
> > + * @attach: [in] attachment used for hardware access
> > + * @sg_table: [in] scatterlist used for the DMA transfer
> > + * @direction:  [in]    direction of DMA transfer
> > + */
> > +int dma_buf_begin_access(struct dma_buf_attachment *attach,
> > + struct sg_table *sgt, enum
> > dma_data_direction dir)
> > +{
> > + struct dma_buf *dmabuf;
> > + bool cookie;
> > + int ret;
> > +
> > + if (WARN_ON(!attach))
> > + return -EINVAL;
> > +
> > + dmabuf = attach->dmabuf;
> > +
> > + if (!dmabuf->ops->begin_access)
> > + return 0;
> > +
> > + cookie = dma_fence_begin_signalling();
> > + ret = dmabuf->ops->begin_access(attach, sgt, dir);
> > + dma_fence_end_signalling(cookie);
> > +
> > + if (WARN_ON_ONCE(ret))
> > + return ret;
> > +
> > + return 0;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(dma_buf_begin_access, DMA_BUF);
> > +
> > +/**
> > + * @dma_buf_end_access - Call after any hardware access from/to
> > the DMABUF
> > + * @attach: [in] attachment used for hardware access
> > + * @sg_table: [in] scatterlist used for the DMA transfer
> > + * @direction:  [in]    direction of DMA transfer
> > + */
> > +int dma_buf_end_access(struct dma_buf_attachment *attach,
> > +        struct sg_table *sgt, enum
> > dma_data_direction dir)
> > +{
> > + struct dma_buf *dmabuf;
> > + bool cookie;
> > + int ret;
> > +
> > + if (WARN_ON(!attach))
> > + return -EINVAL;
> > +
> > + dmabuf = attach->dmabuf;
> > +
> > + if (!dmabuf->ops->end_access)
> > + return 0;
> > +
> > + cookie = dma_fence_begin_signalling();
> > + ret = dmabuf->ops->end_access(attach, sgt, dir);
> > + dma_fence_end_signalling(cookie);
> > +
> > + if (WARN_ON_ONCE(ret))
> > + return ret;
> > +
> > + return 0;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(dma_buf_end_access, DMA_BUF);
> > +
> >   #ifdef CONFIG_DEBUG_FS
> >   static int dma_buf_debug_show(struct seq_file *s, void *unused)
> >   {
> > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> > index 8ff4add71f88..8ba612c7cc16 100644
> > --- a/include/linux/dma-buf.h
> > +++ b/include/linux/dma-buf.h
> > @@ -246,6 +246,38 @@ struct dma_buf_ops {
> >    */
> >    int (*end_cpu_access)(struct dma_buf *, enum
> > dma_data_direction);
> >  
> > + /**
> > + * @begin_access:
> > + *
> > + * This is called from dma_buf_begin_access() when a
> > device driver
> > + * wants to access the data of the DMABUF. The exporter
> > can use this
> > + * to flush/sync the caches if needed.
> > + *
> > + * This callback is optional.
> > + *
> > + * Returns:
> > + *
> > + * 0 on success or a negative error code on failure.
> > + */
> > + int (*begin_access)(struct dma_buf_attachment *, struct
> > sg_table *,
> > +     enum dma_data_direction);
> > +
> > + /**
> > + * @end_access:
> > + *
> > + * This is called from dma_buf_end_access() when a device
> > driver is
> > + * done accessing the data of the DMABUF. The exporter can
> > use this
> > + * to flush/sync the caches if needed.
> > + *
> > + * This callback is optional.
> > + *
> > + * Returns:
> > + *
> > + * 0 on success or a negative error code on failure.
> > + */
> > + int (*end_access)(struct dma_buf_attachment *, struct
> > sg_table *,
> > +   enum dma_data_direction);
> > +
> >    /**
> >    * @mmap:
> >    *
> > @@ -606,6 +638,11 @@ void dma_buf_detach(struct dma_buf *dmabuf,
> >   int dma_buf_pin(struct dma_buf_attachment *attach);
> >   void dma_buf_unpin(struct dma_buf_attachment *attach);
> >  
> > +int dma_buf_begin_access(struct dma_buf_attachment *attach,
> > + struct sg_table *sgt, enum
> > dma_data_direction dir);
> > +int dma_buf_end_access(struct dma_buf_attachment *attach,
> > +        struct sg_table *sgt, enum
> > dma_data_direction dir);
> > +
> >   struct dma_buf *dma_buf_export(const struct dma_buf_export_info
> > *exp_info);
> >  
> >   int dma_buf_fd(struct dma_buf *dmabuf, int flags);
>


2024-01-22 14:17:33

by Christian König

Subject: Re: [Linaro-mm-sig] [PATCH v5 1/6] dma-buf: Add dma_buf_{begin,end}_access()

On 22.01.24 12:01, Paul Cercueil wrote:
> Hi Christian,
>
> On Monday, 22 January 2024 at 11:35 +0100, Christian König wrote:
>> On 19.01.24 15:13, Paul Cercueil wrote:
>>> These functions should be used by device drivers when they start
>>> and
>>> stop accessing the data of DMABUF. It allows DMABUF importers to
>>> cache
>>> the dma_buf_attachment while ensuring that the data they want to
>>> access
>>> is available for their device when the DMA transfers take place.
>> As Daniel already noted as well this is a complete no-go from the
>> DMA-buf design point of view.
> What do you mean "as Daniel already noted"? It was him who suggested
> this.

Sorry, I hadn't fully caught up on the discussion then.

In general, DMA-buf is built around the idea that the data can be
accessed coherently by the involved devices.

Having a begin/end of access for devices was brought up multiple times
but so far rejected for good reasons.

That an exporter has to call extra functions to access its own buffers
is a complete no-go for the design, since this forces exporters into
doing extra steps to allow importers to access their data.

That in turn is pretty much un-testable unless you have every possible
importer around while testing the exporter.

Regards,
Christian.

>
>> Regards,
>> Christian.
> Cheers,
> -Paul
>
>>> Signed-off-by: Paul Cercueil <[email protected]>
>>>
>>> ---
>>> v5: New patch
>>> ---
>>>   drivers/dma-buf/dma-buf.c | 66
>>> +++++++++++++++++++++++++++++++++++++++
>>>   include/linux/dma-buf.h   | 37 ++++++++++++++++++++++
>>>   2 files changed, 103 insertions(+)
>>>
>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>> index 8fe5aa67b167..a8bab6c18fcd 100644
>>> --- a/drivers/dma-buf/dma-buf.c
>>> +++ b/drivers/dma-buf/dma-buf.c
>>> @@ -830,6 +830,8 @@ static struct sg_table * __map_dma_buf(struct
>>> dma_buf_attachment *attach,
>>>    *     - dma_buf_mmap()
>>>    *     - dma_buf_begin_cpu_access()
>>>    *     - dma_buf_end_cpu_access()
>>> + *     - dma_buf_begin_access()
>>> + *     - dma_buf_end_access()
>>>    *     - dma_buf_map_attachment_unlocked()
>>>    *     - dma_buf_unmap_attachment_unlocked()
>>>    *     - dma_buf_vmap_unlocked()
>>> @@ -1602,6 +1604,70 @@ void dma_buf_vunmap_unlocked(struct dma_buf
>>> *dmabuf, struct iosys_map *map)
>>>   }
>>>   EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap_unlocked, DMA_BUF);
>>>
>>> +/**
>>> + * @dma_buf_begin_access - Call before any hardware access from/to
>>> the DMABUF
>>> + * @attach: [in] attachment used for hardware access
>>> + * @sg_table: [in] scatterlist used for the DMA transfer
>>> + * @direction:  [in]    direction of DMA transfer
>>> + */
>>> +int dma_buf_begin_access(struct dma_buf_attachment *attach,
>>> + struct sg_table *sgt, enum
>>> dma_data_direction dir)
>>> +{
>>> + struct dma_buf *dmabuf;
>>> + bool cookie;
>>> + int ret;
>>> +
>>> + if (WARN_ON(!attach))
>>> + return -EINVAL;
>>> +
>>> + dmabuf = attach->dmabuf;
>>> +
>>> + if (!dmabuf->ops->begin_access)
>>> + return 0;
>>> +
>>> + cookie = dma_fence_begin_signalling();
>>> + ret = dmabuf->ops->begin_access(attach, sgt, dir);
>>> + dma_fence_end_signalling(cookie);
>>> +
>>> + if (WARN_ON_ONCE(ret))
>>> + return ret;
>>> +
>>> + return 0;
>>> +}
>>> +EXPORT_SYMBOL_NS_GPL(dma_buf_begin_access, DMA_BUF);
>>> +
>>> +/**
>>> + * @dma_buf_end_access - Call after any hardware access from/to
>>> the DMABUF
>>> + * @attach: [in] attachment used for hardware access
>>> + * @sg_table: [in] scatterlist used for the DMA transfer
>>> + * @direction:  [in]    direction of DMA transfer
>>> + */
>>> +int dma_buf_end_access(struct dma_buf_attachment *attach,
>>> +        struct sg_table *sgt, enum
>>> dma_data_direction dir)
>>> +{
>>> + struct dma_buf *dmabuf;
>>> + bool cookie;
>>> + int ret;
>>> +
>>> + if (WARN_ON(!attach))
>>> + return -EINVAL;
>>> +
>>> + dmabuf = attach->dmabuf;
>>> +
>>> + if (!dmabuf->ops->end_access)
>>> + return 0;
>>> +
>>> + cookie = dma_fence_begin_signalling();
>>> + ret = dmabuf->ops->end_access(attach, sgt, dir);
>>> + dma_fence_end_signalling(cookie);
>>> +
>>> + if (WARN_ON_ONCE(ret))
>>> + return ret;
>>> +
>>> + return 0;
>>> +}
>>> +EXPORT_SYMBOL_NS_GPL(dma_buf_end_access, DMA_BUF);
>>> +
>>>   #ifdef CONFIG_DEBUG_FS
>>>   static int dma_buf_debug_show(struct seq_file *s, void *unused)
>>>   {
>>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
>>> index 8ff4add71f88..8ba612c7cc16 100644
>>> --- a/include/linux/dma-buf.h
>>> +++ b/include/linux/dma-buf.h
>>> @@ -246,6 +246,38 @@ struct dma_buf_ops {
>>>    */
>>>    int (*end_cpu_access)(struct dma_buf *, enum
>>> dma_data_direction);
>>>
>>> + /**
>>> + * @begin_access:
>>> + *
>>> + * This is called from dma_buf_begin_access() when a
>>> device driver
>>> + * wants to access the data of the DMABUF. The exporter
>>> can use this
>>> + * to flush/sync the caches if needed.
>>> + *
>>> + * This callback is optional.
>>> + *
>>> + * Returns:
>>> + *
>>> + * 0 on success or a negative error code on failure.
>>> + */
>>> + int (*begin_access)(struct dma_buf_attachment *, struct
>>> sg_table *,
>>> +     enum dma_data_direction);
>>> +
>>> + /**
>>> + * @end_access:
>>> + *
>>> + * This is called from dma_buf_end_access() when a device
>>> driver is
>>> + * done accessing the data of the DMABUF. The exporter can
>>> use this
>>> + * to flush/sync the caches if needed.
>>> + *
>>> + * This callback is optional.
>>> + *
>>> + * Returns:
>>> + *
>>> + * 0 on success or a negative error code on failure.
>>> + */
>>> + int (*end_access)(struct dma_buf_attachment *, struct
>>> sg_table *,
>>> +   enum dma_data_direction);
>>> +
>>>    /**
>>>    * @mmap:
>>>    *
>>> @@ -606,6 +638,11 @@ void dma_buf_detach(struct dma_buf *dmabuf,
>>>   int dma_buf_pin(struct dma_buf_attachment *attach);
>>>   void dma_buf_unpin(struct dma_buf_attachment *attach);
>>>
>>> +int dma_buf_begin_access(struct dma_buf_attachment *attach,
>>> + struct sg_table *sgt, enum
>>> dma_data_direction dir);
>>> +int dma_buf_end_access(struct dma_buf_attachment *attach,
>>> +        struct sg_table *sgt, enum
>>> dma_data_direction dir);
>>> +
>>>   struct dma_buf *dma_buf_export(const struct dma_buf_export_info
>>> *exp_info);
>>>
>>>   int dma_buf_fd(struct dma_buf *dmabuf, int flags);


2024-01-23 10:10:58

by Paul Cercueil

Subject: Re: [Linaro-mm-sig] [PATCH v5 1/6] dma-buf: Add dma_buf_{begin,end}_access()

Hi Christian,

On Monday, 22 January 2024 at 14:41 +0100, Christian König wrote:
> On 22.01.24 12:01, Paul Cercueil wrote:
> > Hi Christian,
> >
> > On Monday, 22 January 2024 at 11:35 +0100, Christian König wrote:
> > > On 19.01.24 15:13, Paul Cercueil wrote:
> > > > These functions should be used by device drivers when they
> > > > start
> > > > and
> > > > stop accessing the data of DMABUF. It allows DMABUF importers
> > > > to
> > > > cache
> > > > the dma_buf_attachment while ensuring that the data they want
> > > > to
> > > > access
> > > > is available for their device when the DMA transfers take
> > > > place.
> > > As Daniel already noted as well this is a complete no-go from the
> > > DMA-buf design point of view.
> > What do you mean "as Daniel already noted"? It was him who
> > suggested
> > this.
>
> Sorry, I hadn't fully caught up on the discussion then.
>
> In general, DMA-buf is built around the idea that the data can be
> accessed coherently by the involved devices.
>
> Having a begin/end of access for devices was brought up multiple times
> but so far rejected for good reasons.

I would argue that if it was brought up multiple times, then there are
also good reasons to support such a mechanism.

> That an exporter has to call extra functions to access its own
> buffers
> is a complete no-go for the design since this forces exporters into
> doing extra steps for allowing importers to access their data.

Then what about adding these dma_buf_{begin,end}_access(), with
implementations only for "dumb" exporters, e.g. udmabuf or the dmabuf
heaps? Only importers (which cache the mapping and actually care about
non-coherency) would have to call these.

At the very least, is there a way to check that "the data can be
accessed coherently by the involved devices"? That way my importer could
return -EPERM if there is no coherency with a device that's already
attached.
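
(To illustrate what I mean - a very rough sketch, assuming that
dev_is_dma_coherent() is an acceptable way to test this and that walking
the attachment list under the reservation lock is OK; the function name
is made up:)

#include <linux/dma-buf.h>
#include <linux/dma-map-ops.h>
#include <linux/dma-resv.h>

/* Refuse to attach if the new device's coherency does not match the
 * coherency of the devices already attached to the buffer.
 */
static int ffs_dmabuf_check_coherency(struct dma_buf *dmabuf,
				      struct device *dev)
{
	struct dma_buf_attachment *attach;
	int ret = 0;

	dma_resv_lock(dmabuf->resv, NULL);

	list_for_each_entry(attach, &dmabuf->attachments, node) {
		if (dev_is_dma_coherent(attach->dev) !=
		    dev_is_dma_coherent(dev)) {
			ret = -EPERM;
			break;
		}
	}

	dma_resv_unlock(dmabuf->resv);

	return ret;
}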

Cheers,
-Paul

> That in turn is pretty much un-testable unless you have every
> possible
> importer around while testing the exporter.
>
> Regards,
> Christian.
>
> >
> > > Regards,
> > > Christian.
> > Cheers,
> > -Paul
> >
> > > > Signed-off-by: Paul Cercueil <[email protected]>
> > > >
> > > > ---
> > > > v5: New patch
> > > > ---
> > > >    drivers/dma-buf/dma-buf.c | 66
> > > > +++++++++++++++++++++++++++++++++++++++
> > > >    include/linux/dma-buf.h   | 37 ++++++++++++++++++++++
> > > >    2 files changed, 103 insertions(+)
> > > >
> > > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-
> > > > buf.c
> > > > index 8fe5aa67b167..a8bab6c18fcd 100644
> > > > --- a/drivers/dma-buf/dma-buf.c
> > > > +++ b/drivers/dma-buf/dma-buf.c
> > > > @@ -830,6 +830,8 @@ static struct sg_table *
> > > > __map_dma_buf(struct
> > > > dma_buf_attachment *attach,
> > > >     *     - dma_buf_mmap()
> > > >     *     - dma_buf_begin_cpu_access()
> > > >     *     - dma_buf_end_cpu_access()
> > > > + *     - dma_buf_begin_access()
> > > > + *     - dma_buf_end_access()
> > > >     *     - dma_buf_map_attachment_unlocked()
> > > >     *     - dma_buf_unmap_attachment_unlocked()
> > > >     *     - dma_buf_vmap_unlocked()
> > > > @@ -1602,6 +1604,70 @@ void dma_buf_vunmap_unlocked(struct
> > > > dma_buf
> > > > *dmabuf, struct iosys_map *map)
> > > >    }
> > > >    EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap_unlocked, DMA_BUF);
> > > >   
> > > > +/**
> > > > + * @dma_buf_begin_access - Call before any hardware access
> > > > from/to
> > > > the DMABUF
> > > > + * @attach: [in] attachment used for hardware access
> > > > + * @sg_table: [in] scatterlist used for the DMA transfer
> > > > + * @direction:  [in]    direction of DMA transfer
> > > > + */
> > > > +int dma_buf_begin_access(struct dma_buf_attachment *attach,
> > > > + struct sg_table *sgt, enum
> > > > dma_data_direction dir)
> > > > +{
> > > > + struct dma_buf *dmabuf;
> > > > + bool cookie;
> > > > + int ret;
> > > > +
> > > > + if (WARN_ON(!attach))
> > > > + return -EINVAL;
> > > > +
> > > > + dmabuf = attach->dmabuf;
> > > > +
> > > > + if (!dmabuf->ops->begin_access)
> > > > + return 0;
> > > > +
> > > > + cookie = dma_fence_begin_signalling();
> > > > + ret = dmabuf->ops->begin_access(attach, sgt, dir);
> > > > + dma_fence_end_signalling(cookie);
> > > > +
> > > > + if (WARN_ON_ONCE(ret))
> > > > + return ret;
> > > > +
> > > > + return 0;
> > > > +}
> > > > +EXPORT_SYMBOL_NS_GPL(dma_buf_begin_access, DMA_BUF);
> > > > +
> > > > +/**
> > > > + * @dma_buf_end_access - Call after any hardware access
> > > > from/to
> > > > the DMABUF
> > > > + * @attach: [in] attachment used for hardware access
> > > > + * @sg_table: [in] scatterlist used for the DMA transfer
> > > > + * @direction:  [in]    direction of DMA transfer
> > > > + */
> > > > +int dma_buf_end_access(struct dma_buf_attachment *attach,
> > > > +        struct sg_table *sgt, enum
> > > > dma_data_direction dir)
> > > > +{
> > > > + struct dma_buf *dmabuf;
> > > > + bool cookie;
> > > > + int ret;
> > > > +
> > > > + if (WARN_ON(!attach))
> > > > + return -EINVAL;
> > > > +
> > > > + dmabuf = attach->dmabuf;
> > > > +
> > > > + if (!dmabuf->ops->end_access)
> > > > + return 0;
> > > > +
> > > > + cookie = dma_fence_begin_signalling();
> > > > + ret = dmabuf->ops->end_access(attach, sgt, dir);
> > > > + dma_fence_end_signalling(cookie);
> > > > +
> > > > + if (WARN_ON_ONCE(ret))
> > > > + return ret;
> > > > +
> > > > + return 0;
> > > > +}
> > > > +EXPORT_SYMBOL_NS_GPL(dma_buf_end_access, DMA_BUF);
> > > > +
> > > >    #ifdef CONFIG_DEBUG_FS
> > > >    static int dma_buf_debug_show(struct seq_file *s, void
> > > > *unused)
> > > >    {
> > > > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> > > > index 8ff4add71f88..8ba612c7cc16 100644
> > > > --- a/include/linux/dma-buf.h
> > > > +++ b/include/linux/dma-buf.h
> > > > @@ -246,6 +246,38 @@ struct dma_buf_ops {
> > > >     */
> > > >     int (*end_cpu_access)(struct dma_buf *, enum
> > > > dma_data_direction);
> > > >   
> > > > + /**
> > > > + * @begin_access:
> > > > + *
> > > > + * This is called from dma_buf_begin_access() when a
> > > > device driver
> > > > + * wants to access the data of the DMABUF. The
> > > > exporter
> > > > can use this
> > > > + * to flush/sync the caches if needed.
> > > > + *
> > > > + * This callback is optional.
> > > > + *
> > > > + * Returns:
> > > > + *
> > > > + * 0 on success or a negative error code on failure.
> > > > + */
> > > > + int (*begin_access)(struct dma_buf_attachment *,
> > > > struct
> > > > sg_table *,
> > > > +     enum dma_data_direction);
> > > > +
> > > > + /**
> > > > + * @end_access:
> > > > + *
> > > > + * This is called from dma_buf_end_access() when a
> > > > device
> > > > driver is
> > > > + * done accessing the data of the DMABUF. The exporter
> > > > can
> > > > use this
> > > > + * to flush/sync the caches if needed.
> > > > + *
> > > > + * This callback is optional.
> > > > + *
> > > > + * Returns:
> > > > + *
> > > > + * 0 on success or a negative error code on failure.
> > > > + */
> > > > + int (*end_access)(struct dma_buf_attachment *, struct
> > > > sg_table *,
> > > > +   enum dma_data_direction);
> > > > +
> > > >     /**
> > > >     * @mmap:
> > > >     *
> > > > @@ -606,6 +638,11 @@ void dma_buf_detach(struct dma_buf
> > > > *dmabuf,
> > > >    int dma_buf_pin(struct dma_buf_attachment *attach);
> > > >    void dma_buf_unpin(struct dma_buf_attachment *attach);
> > > >   
> > > > +int dma_buf_begin_access(struct dma_buf_attachment *attach,
> > > > + struct sg_table *sgt, enum
> > > > dma_data_direction dir);
> > > > +int dma_buf_end_access(struct dma_buf_attachment *attach,
> > > > +        struct sg_table *sgt, enum
> > > > dma_data_direction dir);
> > > > +
> > > >    struct dma_buf *dma_buf_export(const struct
> > > > dma_buf_export_info
> > > > *exp_info);
> > > >   
> > > >    int dma_buf_fd(struct dma_buf *dmabuf, int flags);
>


2024-01-23 11:52:57

by Christian König

[permalink] [raw]
Subject: Re: [Linaro-mm-sig] [PATCH v5 1/6] dma-buf: Add dma_buf_{begin,end}_access()

Am 23.01.24 um 11:10 schrieb Paul Cercueil:
> Hi Christian,
>
> Le lundi 22 janvier 2024 à 14:41 +0100, Christian König a écrit :
>> Am 22.01.24 um 12:01 schrieb Paul Cercueil:
>>> Hi Christian,
>>>
>>> Le lundi 22 janvier 2024 à 11:35 +0100, Christian König a écrit :
>>>> Am 19.01.24 um 15:13 schrieb Paul Cercueil:
>>>>> These functions should be used by device drivers when they
>>>>> start
>>>>> and
>>>>> stop accessing the data of DMABUF. It allows DMABUF importers
>>>>> to
>>>>> cache
>>>>> the dma_buf_attachment while ensuring that the data they want
>>>>> to
>>>>> access
>>>>> is available for their device when the DMA transfers take
>>>>> place.
>>>> As Daniel already noted as well this is a complete no-go from the
>>>> DMA-buf design point of view.
>>> What do you mean "as Daniel already noted"? It was him who
>>> suggested
>>> this.
>> Sorry, I haven't fully catched up to the discussion then.
>>
>> In general DMA-buf is build around the idea that the data can be
>> accessed coherently by the involved devices.
>>
>> Having a begin/end of access for devices was brought up multiple
>> times
>> but so far rejected for good reasons.
> I would argue that if it was brought up multiple times, then there are
> also good reasons to support such a mechanism.
>
>> That an exporter has to call extra functions to access his own
>> buffers
>> is a complete no-go for the design since this forces exporters into
>> doing extra steps for allowing importers to access their data.
> Then what about we add these dma_buf_{begin,end}_access(), with only
> implementations for "dumb" exporters e.g. udmabuf or the dmabuf heaps?
> And only importers (who cache the mapping and actually care about non-
> coherency) would have to call these.

No, the problem is still that you would have to change all importers to
mandatorily use dma_buf_begin/end().

But going a step back, caching the mapping is irrelevant for coherency.
Even if you don't cache the mapping, you don't get coherency.

In other words, exporters are not required to call sync_to_cpu or
sync_to_device when you create a mapping.

What exactly is your use case here? And why does coherency matter?

> At the very least, is there a way to check that "the data can be
> accessed coherently by the involved devices"? So that my importer can
> EPERM if there is no coherency vs. a device that's already attached.

Yeah, there is functionality for this in the DMA subsystem. I once
created prototype patches for enforcing the same coherency approach
between importer and exporter, but we never got around to upstreaming them.



>
> Cheers,
> -Paul
>
>> That in turn is pretty much un-testable unless you have every
>> possible
>> importer around while testing the exporter.
>>
>> Regards,
>> Christian.
>>
>>>> Regards,
>>>> Christian.
>>> Cheers,
>>> -Paul
>>>
>>>>> Signed-off-by: Paul Cercueil <[email protected]>
>>>>>
>>>>> ---
>>>>> v5: New patch
>>>>> ---
>>>>>    drivers/dma-buf/dma-buf.c | 66
>>>>> +++++++++++++++++++++++++++++++++++++++
>>>>>    include/linux/dma-buf.h   | 37 ++++++++++++++++++++++
>>>>>    2 files changed, 103 insertions(+)
>>>>>
>>>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-
>>>>> buf.c
>>>>> index 8fe5aa67b167..a8bab6c18fcd 100644
>>>>> --- a/drivers/dma-buf/dma-buf.c
>>>>> +++ b/drivers/dma-buf/dma-buf.c
>>>>> @@ -830,6 +830,8 @@ static struct sg_table *
>>>>> __map_dma_buf(struct
>>>>> dma_buf_attachment *attach,
>>>>>     *     - dma_buf_mmap()
>>>>>     *     - dma_buf_begin_cpu_access()
>>>>>     *     - dma_buf_end_cpu_access()
>>>>> + *     - dma_buf_begin_access()
>>>>> + *     - dma_buf_end_access()
>>>>>     *     - dma_buf_map_attachment_unlocked()
>>>>>     *     - dma_buf_unmap_attachment_unlocked()
>>>>>     *     - dma_buf_vmap_unlocked()
>>>>> @@ -1602,6 +1604,70 @@ void dma_buf_vunmap_unlocked(struct
>>>>> dma_buf
>>>>> *dmabuf, struct iosys_map *map)
>>>>>    }
>>>>>    EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap_unlocked, DMA_BUF);
>>>>>
>>>>> +/**
>>>>> + * @dma_buf_begin_access - Call before any hardware access
>>>>> from/to
>>>>> the DMABUF
>>>>> + * @attach: [in] attachment used for hardware access
>>>>> + * @sg_table: [in] scatterlist used for the DMA transfer
>>>>> + * @direction:  [in]    direction of DMA transfer
>>>>> + */
>>>>> +int dma_buf_begin_access(struct dma_buf_attachment *attach,
>>>>> + struct sg_table *sgt, enum
>>>>> dma_data_direction dir)
>>>>> +{
>>>>> + struct dma_buf *dmabuf;
>>>>> + bool cookie;
>>>>> + int ret;
>>>>> +
>>>>> + if (WARN_ON(!attach))
>>>>> + return -EINVAL;
>>>>> +
>>>>> + dmabuf = attach->dmabuf;
>>>>> +
>>>>> + if (!dmabuf->ops->begin_access)
>>>>> + return 0;
>>>>> +
>>>>> + cookie = dma_fence_begin_signalling();
>>>>> + ret = dmabuf->ops->begin_access(attach, sgt, dir);
>>>>> + dma_fence_end_signalling(cookie);
>>>>> +
>>>>> + if (WARN_ON_ONCE(ret))
>>>>> + return ret;
>>>>> +
>>>>> + return 0;
>>>>> +}
>>>>> +EXPORT_SYMBOL_NS_GPL(dma_buf_begin_access, DMA_BUF);
>>>>> +
>>>>> +/**
>>>>> + * @dma_buf_end_access - Call after any hardware access
>>>>> from/to
>>>>> the DMABUF
>>>>> + * @attach: [in] attachment used for hardware access
>>>>> + * @sg_table: [in] scatterlist used for the DMA transfer
>>>>> + * @direction:  [in]    direction of DMA transfer
>>>>> + */
>>>>> +int dma_buf_end_access(struct dma_buf_attachment *attach,
>>>>> +        struct sg_table *sgt, enum
>>>>> dma_data_direction dir)
>>>>> +{
>>>>> + struct dma_buf *dmabuf;
>>>>> + bool cookie;
>>>>> + int ret;
>>>>> +
>>>>> + if (WARN_ON(!attach))
>>>>> + return -EINVAL;
>>>>> +
>>>>> + dmabuf = attach->dmabuf;
>>>>> +
>>>>> + if (!dmabuf->ops->end_access)
>>>>> + return 0;
>>>>> +
>>>>> + cookie = dma_fence_begin_signalling();
>>>>> + ret = dmabuf->ops->end_access(attach, sgt, dir);
>>>>> + dma_fence_end_signalling(cookie);
>>>>> +
>>>>> + if (WARN_ON_ONCE(ret))
>>>>> + return ret;
>>>>> +
>>>>> + return 0;
>>>>> +}
>>>>> +EXPORT_SYMBOL_NS_GPL(dma_buf_end_access, DMA_BUF);
>>>>> +
>>>>>    #ifdef CONFIG_DEBUG_FS
>>>>>    static int dma_buf_debug_show(struct seq_file *s, void
>>>>> *unused)
>>>>>    {
>>>>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
>>>>> index 8ff4add71f88..8ba612c7cc16 100644
>>>>> --- a/include/linux/dma-buf.h
>>>>> +++ b/include/linux/dma-buf.h
>>>>> @@ -246,6 +246,38 @@ struct dma_buf_ops {
>>>>>     */
>>>>>     int (*end_cpu_access)(struct dma_buf *, enum
>>>>> dma_data_direction);
>>>>>
>>>>> + /**
>>>>> + * @begin_access:
>>>>> + *
>>>>> + * This is called from dma_buf_begin_access() when a
>>>>> device driver
>>>>> + * wants to access the data of the DMABUF. The
>>>>> exporter
>>>>> can use this
>>>>> + * to flush/sync the caches if needed.
>>>>> + *
>>>>> + * This callback is optional.
>>>>> + *
>>>>> + * Returns:
>>>>> + *
>>>>> + * 0 on success or a negative error code on failure.
>>>>> + */
>>>>> + int (*begin_access)(struct dma_buf_attachment *,
>>>>> struct
>>>>> sg_table *,
>>>>> +     enum dma_data_direction);
>>>>> +
>>>>> + /**
>>>>> + * @end_access:
>>>>> + *
>>>>> + * This is called from dma_buf_end_access() when a
>>>>> device
>>>>> driver is
>>>>> + * done accessing the data of the DMABUF. The exporter
>>>>> can
>>>>> use this
>>>>> + * to flush/sync the caches if needed.
>>>>> + *
>>>>> + * This callback is optional.
>>>>> + *
>>>>> + * Returns:
>>>>> + *
>>>>> + * 0 on success or a negative error code on failure.
>>>>> + */
>>>>> + int (*end_access)(struct dma_buf_attachment *, struct
>>>>> sg_table *,
>>>>> +   enum dma_data_direction);
>>>>> +
>>>>>     /**
>>>>>     * @mmap:
>>>>>     *
>>>>> @@ -606,6 +638,11 @@ void dma_buf_detach(struct dma_buf
>>>>> *dmabuf,
>>>>>    int dma_buf_pin(struct dma_buf_attachment *attach);
>>>>>    void dma_buf_unpin(struct dma_buf_attachment *attach);
>>>>>
>>>>> +int dma_buf_begin_access(struct dma_buf_attachment *attach,
>>>>> + struct sg_table *sgt, enum
>>>>> dma_data_direction dir);
>>>>> +int dma_buf_end_access(struct dma_buf_attachment *attach,
>>>>> +        struct sg_table *sgt, enum
>>>>> dma_data_direction dir);
>>>>> +
>>>>>    struct dma_buf *dma_buf_export(const struct
>>>>> dma_buf_export_info
>>>>> *exp_info);
>>>>>
>>>>>    int dma_buf_fd(struct dma_buf *dmabuf, int flags);


2024-01-23 13:21:57

by Paul Cercueil

[permalink] [raw]
Subject: Re: [Linaro-mm-sig] [PATCH v5 1/6] dma-buf: Add dma_buf_{begin,end}_access()

Le mardi 23 janvier 2024 à 12:52 +0100, Christian König a écrit :
> Am 23.01.24 um 11:10 schrieb Paul Cercueil:
> > Hi Christian,
> >
> > Le lundi 22 janvier 2024 à 14:41 +0100, Christian König a écrit :
> > > Am 22.01.24 um 12:01 schrieb Paul Cercueil:
> > > > Hi Christian,
> > > >
> > > > Le lundi 22 janvier 2024 à 11:35 +0100, Christian König a
> > > > écrit :
> > > > > Am 19.01.24 um 15:13 schrieb Paul Cercueil:
> > > > > > These functions should be used by device drivers when they
> > > > > > start
> > > > > > and
> > > > > > stop accessing the data of DMABUF. It allows DMABUF
> > > > > > importers
> > > > > > to
> > > > > > cache
> > > > > > the dma_buf_attachment while ensuring that the data they
> > > > > > want
> > > > > > to
> > > > > > access
> > > > > > is available for their device when the DMA transfers take
> > > > > > place.
> > > > > As Daniel already noted as well this is a complete no-go from
> > > > > the
> > > > > DMA-buf design point of view.
> > > > What do you mean "as Daniel already noted"? It was him who
> > > > suggested
> > > > this.
> > > Sorry, I haven't fully catched up to the discussion then.
> > >
> > > In general DMA-buf is build around the idea that the data can be
> > > accessed coherently by the involved devices.
> > >
> > > Having a begin/end of access for devices was brought up multiple
> > > times
> > > but so far rejected for good reasons.
> > I would argue that if it was brought up multiple times, then there
> > are
> > also good reasons to support such a mechanism.
> >
> > > That an exporter has to call extra functions to access his own
> > > buffers
> > > is a complete no-go for the design since this forces exporters
> > > into
> > > doing extra steps for allowing importers to access their data.
> > Then what about we add these dma_buf_{begin,end}_access(), with
> > only
> > implementations for "dumb" exporters e.g. udmabuf or the dmabuf
> > heaps?
> > And only importers (who cache the mapping and actually care about
> > non-
> > coherency) would have to call these.
>
> No, the problem is still that you would have to change all importers
> to
> mandatorily use dma_buf_begin/end().
>
> But going a step back caching the mapping is irrelevant for
> coherency.
> Even if you don't cache the mapping you don't get coherency.

You actually do - at least with udmabuf, as in that case
dma_buf_map_attachment() / dma_buf_unmap_attachment() will handle cache
coherency when the SGs are mapped/unmapped.

The problem was then that dma_buf_unmap_attachment cannot be called
before the dma_fence is signaled, and calling it after is already too
late (because the fence would be signaled before the data is sync'd).

Daniel / Sima suggested then that I cache the mapping and add new
functions to ensure cache coherency, which is what these patches are
about.
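
To make the flow concrete, here is roughly what the importer ends up
doing with these patches (heavily simplified sketch; error handling,
includes and the surrounding driver structure are only illustrative):

static int example_importer_flow(struct dma_buf *dmabuf, struct device *dev,
				 enum dma_data_direction dir,
				 struct dma_fence *fence)
{
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;
	int ret;

	/* Once, at attach time: map and cache the sg table. */
	attach = dma_buf_attach(dmabuf, dev);
	if (IS_ERR(attach))
		return PTR_ERR(attach);

	sgt = dma_buf_map_attachment_unlocked(attach, dir);
	if (IS_ERR(sgt))
		return PTR_ERR(sgt);

	/* For every transfer, inside the dma_fence signalling critical
	 * section - where dma_buf_unmap_attachment() cannot be called,
	 * hence the cached mapping plus the begin/end access pair:
	 */
	ret = dma_buf_begin_access(attach, sgt, dir);
	if (!ret) {
		/* ... program and run the DMA transfer described by sgt ... */
		ret = dma_buf_end_access(attach, sgt, dir);
	}
	dma_fence_signal(fence);

	/* Once, at detach time: */
	dma_buf_unmap_attachment_unlocked(attach, sgt, dir);
	dma_buf_detach(dmabuf, attach);

	return ret;
}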

> In other words, exporters are not required to call sync_to_cpu or
> sync_to_device when you create a mapping.
>
> What exactly is your use case here? And why does coherency matter?

My use case is: I create DMABUFs with udmabuf, attach them to
USB/FunctionFS with the interface introduced by this patchset, attach
them to IIO with a similar interface (being upstreamed in parallel),
and transfer data from USB to IIO and vice-versa in a zero-copy
fashion.

This works perfectly fine as long as the USB and IIO hardware are
coherent between themselves, which is the case on most of our boards.
However, I do have a board (with a Xilinx UltraScale SoC) where that is
not the case, and cache flushes/syncs are needed. So I was trying to
rework these new interfaces to work on that system too.

If this really is a no-no, then I am fine with the assumption that
devices sharing a DMABUF must be coherent between themselves; but
that's something that should probably be enforced rather than assumed.

(and I *think* there is a way to force coherency in the UltraScale's
interconnect - we're investigating it)

Cheers,
-Paul

> > At the very least, is there a way to check that "the data can be
> > accessed coherently by the involved devices"? So that my importer
> > can
> > EPERM if there is no coherency vs. a device that's already
> > attached.
>
> Yeah, there is functionality for this in the DMA subsystem. I've once
> created prototype patches for enforcing the same coherency approach
> between importer and exporter, but we never got around to upstreaming
> them.
>
>
>
> >
> > Cheers,
> > -Paul
> >
> > > That in turn is pretty much un-testable unless you have every
> > > possible
> > > importer around while testing the exporter.
> > >
> > > Regards,
> > > Christian.
> > >
> > > > > Regards,
> > > > > Christian.
> > > > Cheers,
> > > > -Paul
> > > >
> > > > > > Signed-off-by: Paul Cercueil <[email protected]>
> > > > > >
> > > > > > ---
> > > > > > v5: New patch
> > > > > > ---
> > > > > >     drivers/dma-buf/dma-buf.c | 66
> > > > > > +++++++++++++++++++++++++++++++++++++++
> > > > > >     include/linux/dma-buf.h   | 37 ++++++++++++++++++++++
> > > > > >     2 files changed, 103 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-
> > > > > > buf/dma-
> > > > > > buf.c
> > > > > > index 8fe5aa67b167..a8bab6c18fcd 100644
> > > > > > --- a/drivers/dma-buf/dma-buf.c
> > > > > > +++ b/drivers/dma-buf/dma-buf.c
> > > > > > @@ -830,6 +830,8 @@ static struct sg_table *
> > > > > > __map_dma_buf(struct
> > > > > > dma_buf_attachment *attach,
> > > > > >      *     - dma_buf_mmap()
> > > > > >      *     - dma_buf_begin_cpu_access()
> > > > > >      *     - dma_buf_end_cpu_access()
> > > > > > + *     - dma_buf_begin_access()
> > > > > > + *     - dma_buf_end_access()
> > > > > >      *     - dma_buf_map_attachment_unlocked()
> > > > > >      *     - dma_buf_unmap_attachment_unlocked()
> > > > > >      *     - dma_buf_vmap_unlocked()
> > > > > > @@ -1602,6 +1604,70 @@ void dma_buf_vunmap_unlocked(struct
> > > > > > dma_buf
> > > > > > *dmabuf, struct iosys_map *map)
> > > > > >     }
> > > > > >     EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap_unlocked, DMA_BUF);
> > > > > >    
> > > > > > +/**
> > > > > > + * @dma_buf_begin_access - Call before any hardware access
> > > > > > from/to
> > > > > > the DMABUF
> > > > > > + * @attach: [in] attachment used for hardware
> > > > > > access
> > > > > > + * @sg_table: [in] scatterlist used for the DMA
> > > > > > transfer
> > > > > > + * @direction:  [in]    direction of DMA transfer
> > > > > > + */
> > > > > > +int dma_buf_begin_access(struct dma_buf_attachment
> > > > > > *attach,
> > > > > > + struct sg_table *sgt, enum
> > > > > > dma_data_direction dir)
> > > > > > +{
> > > > > > + struct dma_buf *dmabuf;
> > > > > > + bool cookie;
> > > > > > + int ret;
> > > > > > +
> > > > > > + if (WARN_ON(!attach))
> > > > > > + return -EINVAL;
> > > > > > +
> > > > > > + dmabuf = attach->dmabuf;
> > > > > > +
> > > > > > + if (!dmabuf->ops->begin_access)
> > > > > > + return 0;
> > > > > > +
> > > > > > + cookie = dma_fence_begin_signalling();
> > > > > > + ret = dmabuf->ops->begin_access(attach, sgt, dir);
> > > > > > + dma_fence_end_signalling(cookie);
> > > > > > +
> > > > > > + if (WARN_ON_ONCE(ret))
> > > > > > + return ret;
> > > > > > +
> > > > > > + return 0;
> > > > > > +}
> > > > > > +EXPORT_SYMBOL_NS_GPL(dma_buf_begin_access, DMA_BUF);
> > > > > > +
> > > > > > +/**
> > > > > > + * @dma_buf_end_access - Call after any hardware access
> > > > > > from/to
> > > > > > the DMABUF
> > > > > > + * @attach: [in] attachment used for hardware
> > > > > > access
> > > > > > + * @sg_table: [in] scatterlist used for the DMA
> > > > > > transfer
> > > > > > + * @direction:  [in]    direction of DMA transfer
> > > > > > + */
> > > > > > +int dma_buf_end_access(struct dma_buf_attachment *attach,
> > > > > > +        struct sg_table *sgt, enum
> > > > > > dma_data_direction dir)
> > > > > > +{
> > > > > > + struct dma_buf *dmabuf;
> > > > > > + bool cookie;
> > > > > > + int ret;
> > > > > > +
> > > > > > + if (WARN_ON(!attach))
> > > > > > + return -EINVAL;
> > > > > > +
> > > > > > + dmabuf = attach->dmabuf;
> > > > > > +
> > > > > > + if (!dmabuf->ops->end_access)
> > > > > > + return 0;
> > > > > > +
> > > > > > + cookie = dma_fence_begin_signalling();
> > > > > > + ret = dmabuf->ops->end_access(attach, sgt, dir);
> > > > > > + dma_fence_end_signalling(cookie);
> > > > > > +
> > > > > > + if (WARN_ON_ONCE(ret))
> > > > > > + return ret;
> > > > > > +
> > > > > > + return 0;
> > > > > > +}
> > > > > > +EXPORT_SYMBOL_NS_GPL(dma_buf_end_access, DMA_BUF);
> > > > > > +
> > > > > >     #ifdef CONFIG_DEBUG_FS
> > > > > >     static int dma_buf_debug_show(struct seq_file *s, void
> > > > > > *unused)
> > > > > >     {
> > > > > > diff --git a/include/linux/dma-buf.h b/include/linux/dma-
> > > > > > buf.h
> > > > > > index 8ff4add71f88..8ba612c7cc16 100644
> > > > > > --- a/include/linux/dma-buf.h
> > > > > > +++ b/include/linux/dma-buf.h
> > > > > > @@ -246,6 +246,38 @@ struct dma_buf_ops {
> > > > > >      */
> > > > > >      int (*end_cpu_access)(struct dma_buf *, enum
> > > > > > dma_data_direction);
> > > > > >    
> > > > > > + /**
> > > > > > + * @begin_access:
> > > > > > + *
> > > > > > + * This is called from dma_buf_begin_access() when
> > > > > > a
> > > > > > device driver
> > > > > > + * wants to access the data of the DMABUF. The
> > > > > > exporter
> > > > > > can use this
> > > > > > + * to flush/sync the caches if needed.
> > > > > > + *
> > > > > > + * This callback is optional.
> > > > > > + *
> > > > > > + * Returns:
> > > > > > + *
> > > > > > + * 0 on success or a negative error code on
> > > > > > failure.
> > > > > > + */
> > > > > > + int (*begin_access)(struct dma_buf_attachment *,
> > > > > > struct
> > > > > > sg_table *,
> > > > > > +     enum dma_data_direction);
> > > > > > +
> > > > > > + /**
> > > > > > + * @end_access:
> > > > > > + *
> > > > > > + * This is called from dma_buf_end_access() when a
> > > > > > device
> > > > > > driver is
> > > > > > + * done accessing the data of the DMABUF. The
> > > > > > exporter
> > > > > > can
> > > > > > use this
> > > > > > + * to flush/sync the caches if needed.
> > > > > > + *
> > > > > > + * This callback is optional.
> > > > > > + *
> > > > > > + * Returns:
> > > > > > + *
> > > > > > + * 0 on success or a negative error code on
> > > > > > failure.
> > > > > > + */
> > > > > > + int (*end_access)(struct dma_buf_attachment *,
> > > > > > struct
> > > > > > sg_table *,
> > > > > > +   enum dma_data_direction);
> > > > > > +
> > > > > >      /**
> > > > > >      * @mmap:
> > > > > >      *
> > > > > > @@ -606,6 +638,11 @@ void dma_buf_detach(struct dma_buf
> > > > > > *dmabuf,
> > > > > >     int dma_buf_pin(struct dma_buf_attachment *attach);
> > > > > >     void dma_buf_unpin(struct dma_buf_attachment *attach);
> > > > > >    
> > > > > > +int dma_buf_begin_access(struct dma_buf_attachment
> > > > > > *attach,
> > > > > > + struct sg_table *sgt, enum
> > > > > > dma_data_direction dir);
> > > > > > +int dma_buf_end_access(struct dma_buf_attachment *attach,
> > > > > > +        struct sg_table *sgt, enum
> > > > > > dma_data_direction dir);
> > > > > > +
> > > > > >     struct dma_buf *dma_buf_export(const struct
> > > > > > dma_buf_export_info
> > > > > > *exp_info);
> > > > > >    
> > > > > >     int dma_buf_fd(struct dma_buf *dmabuf, int flags);
>


2024-01-25 22:25:12

by Daniel Vetter

[permalink] [raw]
Subject: Re: [PATCH v5 1/6] dma-buf: Add dma_buf_{begin,end}_access()

On Fri, Jan 19, 2024 at 03:13:57PM +0100, Paul Cercueil wrote:
> These functions should be used by device drivers when they start and
> stop accessing the data of DMABUF. It allows DMABUF importers to cache
> the dma_buf_attachment while ensuring that the data they want to access
> is available for their device when the DMA transfers take place.
>
> Signed-off-by: Paul Cercueil <[email protected]>

Putting my detailed review comments here just so I don't have to remember
them any longer. We need to reach consensus on the big picture direction
first.

>
> ---
> v5: New patch
> ---
> drivers/dma-buf/dma-buf.c | 66 +++++++++++++++++++++++++++++++++++++++
> include/linux/dma-buf.h | 37 ++++++++++++++++++++++
> 2 files changed, 103 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 8fe5aa67b167..a8bab6c18fcd 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -830,6 +830,8 @@ static struct sg_table * __map_dma_buf(struct dma_buf_attachment *attach,
> * - dma_buf_mmap()
> * - dma_buf_begin_cpu_access()
> * - dma_buf_end_cpu_access()
> + * - dma_buf_begin_access()
> + * - dma_buf_end_access()
> * - dma_buf_map_attachment_unlocked()
> * - dma_buf_unmap_attachment_unlocked()
> * - dma_buf_vmap_unlocked()
> @@ -1602,6 +1604,70 @@ void dma_buf_vunmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map)
> }
> EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap_unlocked, DMA_BUF);
>
> +/**
> + * @dma_buf_begin_access - Call before any hardware access from/to the DMABUF
> + * @attach: [in] attachment used for hardware access
> + * @sg_table: [in] scatterlist used for the DMA transfer
> + * @direction: [in] direction of DMA transfer

I think for the kerneldoc it would be good to point at the other function
here, explain why this might be needed (and that for most reasonable
devices it's probably not), and cross-link the two functions of the pair.

Also we need to document that dma_buf_map_attachment() does an implied
dma_buf_begin_access() (because dma_map_sg() does an implied
dma_sync_sg_for_device()) and vice versa for dma_buf_end_access(), which
also means that dma_buf_map/unmap_attachment() should link to these
functions in their kerneldoc too.

Finally I think we should document here that it's ok to call these from a
dma_fence signalling critical section, and link to the relevant discussion
in the dma_fence docs for that.
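
Roughly something like this for the kerneldoc, just to sketch the
direction (wording is illustrative only, and the same applies to the
end_access side):

/**
 * dma_buf_begin_access - Call before any hardware access from/to the DMABUF
 * @attach:	[in]	attachment used for hardware access
 * @sgt:	[in]	scatterlist used for the DMA transfer
 * @dir:	[in]	direction of DMA transfer
 *
 * Importers that cache the mapping returned by dma_buf_map_attachment()
 * must bracket every hardware access with dma_buf_begin_access() and
 * dma_buf_end_access(), so that exporters of non-coherent buffers get a
 * chance to flush/invalidate CPU caches. For coherent setups both calls
 * are no-ops.
 *
 * Note that dma_buf_map_attachment() already implies dma_buf_begin_access()
 * and dma_buf_unmap_attachment() implies dma_buf_end_access(), so importers
 * that map/unmap around every transfer do not need these calls.
 *
 * This function may be called from a dma_fence signalling critical section.
 */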

> + */
> +int dma_buf_begin_access(struct dma_buf_attachment *attach,
> + struct sg_table *sgt, enum dma_data_direction dir)
> +{
> + struct dma_buf *dmabuf;
> + bool cookie;
> + int ret;
> +
> + if (WARN_ON(!attach))
> + return -EINVAL;
> +
> + dmabuf = attach->dmabuf;
> +
> + if (!dmabuf->ops->begin_access)
> + return 0;
> +
> + cookie = dma_fence_begin_signalling();
> + ret = dmabuf->ops->begin_access(attach, sgt, dir);
> + dma_fence_end_signalling(cookie);
> +
> + if (WARN_ON_ONCE(ret))
> + return ret;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(dma_buf_begin_access, DMA_BUF);

So explicit device-side coherency management is not going to be very
compatible with dynamic buffer management, where the exporter can move the
buffer around. The reason is that for a dynamic exporter we cache the sg
mapping, which means any device-side coherency management that
dma_buf_map/unmap_attachment() would do will not happen (since it's cached),
potentially breaking importers that rely on the assumption that
dma_buf_map/unmap already implies dma_buf_begin/end_device_access.

I think for now it's sufficient to put a WARN_ON(dma_buf_is_dynamic() &&
ops->begin|end_access) or similar into dma_buf_export() and bail out with an
error to catch that.
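
I.e. something along these lines in dma_buf_export(), assuming the
presence of the @pin callback is what makes an exporter "dynamic"
(sketch only, not a tested hunk):

	/* Device-side begin/end access does not mix with dynamic exporters
	 * that may move the buffer around, since the cached sg mapping skips
	 * the coherency work normally implied by dma_buf_map/unmap_attachment().
	 */
	if (WARN_ON(exp_info->ops->pin &&
		    (exp_info->ops->begin_access || exp_info->ops->end_access)))
		return ERR_PTR(-EINVAL);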

Aside from the nits I do think this is roughly what we briefly discussed
well over a decade ago in the original dma-buf kickoff meeting at a Linaro
Connect in Budapest :-)

Cheers, Sima

> +
> +/**
> + * @dma_buf_end_access - Call after any hardware access from/to the DMABUF
> + * @attach: [in] attachment used for hardware access
> + * @sg_table: [in] scatterlist used for the DMA transfer
> + * @direction: [in] direction of DMA transfer
> + */
> +int dma_buf_end_access(struct dma_buf_attachment *attach,
> + struct sg_table *sgt, enum dma_data_direction dir)
> +{
> + struct dma_buf *dmabuf;
> + bool cookie;
> + int ret;
> +
> + if (WARN_ON(!attach))
> + return -EINVAL;
> +
> + dmabuf = attach->dmabuf;
> +
> + if (!dmabuf->ops->end_access)
> + return 0;
> +
> + cookie = dma_fence_begin_signalling();
> + ret = dmabuf->ops->end_access(attach, sgt, dir);
> + dma_fence_end_signalling(cookie);
> +
> + if (WARN_ON_ONCE(ret))
> + return ret;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(dma_buf_end_access, DMA_BUF);
> +
> #ifdef CONFIG_DEBUG_FS
> static int dma_buf_debug_show(struct seq_file *s, void *unused)
> {
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 8ff4add71f88..8ba612c7cc16 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -246,6 +246,38 @@ struct dma_buf_ops {
> */
> int (*end_cpu_access)(struct dma_buf *, enum dma_data_direction);
>
> + /**
> + * @begin_access:
> + *
> + * This is called from dma_buf_begin_access() when a device driver
> + * wants to access the data of the DMABUF. The exporter can use this
> + * to flush/sync the caches if needed.
> + *
> + * This callback is optional.
> + *
> + * Returns:
> + *
> + * 0 on success or a negative error code on failure.
> + */
> + int (*begin_access)(struct dma_buf_attachment *, struct sg_table *,
> + enum dma_data_direction);
> +
> + /**
> + * @end_access:
> + *
> + * This is called from dma_buf_end_access() when a device driver is
> + * done accessing the data of the DMABUF. The exporter can use this
> + * to flush/sync the caches if needed.
> + *
> + * This callback is optional.
> + *
> + * Returns:
> + *
> + * 0 on success or a negative error code on failure.
> + */
> + int (*end_access)(struct dma_buf_attachment *, struct sg_table *,
> + enum dma_data_direction);
> +
> /**
> * @mmap:
> *
> @@ -606,6 +638,11 @@ void dma_buf_detach(struct dma_buf *dmabuf,
> int dma_buf_pin(struct dma_buf_attachment *attach);
> void dma_buf_unpin(struct dma_buf_attachment *attach);
>
> +int dma_buf_begin_access(struct dma_buf_attachment *attach,
> + struct sg_table *sgt, enum dma_data_direction dir);
> +int dma_buf_end_access(struct dma_buf_attachment *attach,
> + struct sg_table *sgt, enum dma_data_direction dir);
> +
> struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info);
>
> int dma_buf_fd(struct dma_buf *dmabuf, int flags);
> --
> 2.43.0
>
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2024-02-07 20:46:16

by Daniel Vetter

[permalink] [raw]
Subject: Re: [Linaro-mm-sig] [PATCH v5 2/6] dma-buf: udmabuf: Implement .{begin,end}_access

On Fri, Jan 19, 2024 at 03:13:58PM +0100, Paul Cercueil wrote:
> Implement .begin_access() and .end_access() callbacks.
>
> For now these functions will simply sync/flush the CPU cache when
> needed.
>
> Signed-off-by: Paul Cercueil <[email protected]>
>
> ---
> v5: New patch
> ---
> drivers/dma-buf/udmabuf.c | 27 +++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
> diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
> index c40645999648..a87d89b58816 100644
> --- a/drivers/dma-buf/udmabuf.c
> +++ b/drivers/dma-buf/udmabuf.c
> @@ -179,6 +179,31 @@ static int end_cpu_udmabuf(struct dma_buf *buf,
> return 0;
> }
>
> +static int begin_udmabuf(struct dma_buf_attachment *attach,
> + struct sg_table *sgt,
> + enum dma_data_direction dir)
> +{
> + struct dma_buf *buf = attach->dmabuf;
> + struct udmabuf *ubuf = buf->priv;
> + struct device *dev = ubuf->device->this_device;
> +
> + dma_sync_sg_for_device(dev, sgt->sgl, sg_nents(sgt->sgl), dir);

So one thing I've just wondered is whether we've made sure that this is
only doing cache coherency maintenance, and not swiotlb bounce buffer
copying. The latter would really not be suitable for dma-buf anymore I
think.

Not sure how best to check for that, since it's all in the depths of the
DMA API code, but I guess the simplest way to really make sure is to
disable CONFIG_SWIOTLB. Otherwise the way to be absolutely sure is to
trace swiotlb_sync_single_for_device/cpu.

It would be kinda neat if the dma-buf.c code could make sure you never get
a swiotlb entry from a dma_buf_map_attachment() call, but I don't think we
can enforce that. There's sg_dma_is_swiotlb(), but that won't catch all
implementations, only the generic dma-iommu.c one.
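
Something like the below could at least catch the dma-iommu.c case, e.g.
called right after mapping the attachment (sketch only; as noted, other
DMA API implementations won't set the flag, so a missing warning proves
nothing):

#include <linux/scatterlist.h>

/* Best-effort debug aid: warn if any segment of the mapped sg table
 * ended up in a swiotlb bounce buffer. Only the generic dma-iommu.c
 * path marks its entries with SG_DMA_SWIOTLB.
 */
static void dma_buf_warn_on_swiotlb(struct sg_table *sgt)
{
	struct scatterlist *sg;
	int i;

	for_each_sgtable_dma_sg(sgt, sg, i)
		WARN_ON_ONCE(sg_dma_is_swiotlb(sg));
}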

Cheers, Sima

> + return 0;
> +}
> +
> +static int end_udmabuf(struct dma_buf_attachment *attach,
> + struct sg_table *sgt,
> + enum dma_data_direction dir)
> +{
> + struct dma_buf *buf = attach->dmabuf;
> + struct udmabuf *ubuf = buf->priv;
> + struct device *dev = ubuf->device->this_device;
> +
> + if (dir != DMA_TO_DEVICE)
> + dma_sync_sg_for_cpu(dev, sgt->sgl, sg_nents(sgt->sgl), dir);
> + return 0;
> +}
> +
> static const struct dma_buf_ops udmabuf_ops = {
> .cache_sgt_mapping = true,
> .map_dma_buf = map_udmabuf,
> @@ -189,6 +214,8 @@ static const struct dma_buf_ops udmabuf_ops = {
> .vunmap = vunmap_udmabuf,
> .begin_cpu_access = begin_cpu_udmabuf,
> .end_cpu_access = end_cpu_udmabuf,
> + .begin_access = begin_udmabuf,
> + .end_access = end_udmabuf,
> };
>
> #define SEALS_WANTED (F_SEAL_SHRINK)
> --
> 2.43.0
>
> _______________________________________________
> Linaro-mm-sig mailing list -- [email protected]
> To unsubscribe send an email to [email protected]

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch