2021-07-27 07:07:40

by Sergey Senozhatsky

[permalink] [raw]
Subject: [PATCHv4 0/8] videobuf2: support new noncontiguous DMA API

Hello,

The series adds support for noncontiguous DMA API and
V4L2_MEMORY_FLAG_NON_COHERENT UAPI.

v4:
-- addressed Dafna's feedback

V3: https://lore.kernel.org/lkml/[email protected]

Sergey Senozhatsky (8):
videobuf2: rework vb2_mem_ops API
videobuf2: inverse buffer cache_hints flags
videobuf2: split buffer cache_hints initialisation
videobuf2: move cache_hints handling to allocators
videobuf2: add V4L2_MEMORY_FLAG_NON_COHERENT flag
videobuf2: add queue memory coherency parameter
videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag
videobuf2: handle non-contiguous DMA allocations

.../userspace-api/media/v4l/buffer.rst | 40 +++-
.../media/v4l/vidioc-create-bufs.rst | 7 +-
.../media/v4l/vidioc-reqbufs.rst | 16 +-
.../media/common/videobuf2/videobuf2-core.c | 127 ++++++++-----
.../common/videobuf2/videobuf2-dma-contig.c | 176 ++++++++++++++----
.../media/common/videobuf2/videobuf2-dma-sg.c | 39 ++--
.../media/common/videobuf2/videobuf2-v4l2.c | 59 +++---
.../common/videobuf2/videobuf2-vmalloc.c | 30 +--
drivers/media/dvb-core/dvb_vb2.c | 2 +-
drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 9 +-
drivers/media/v4l2-core/v4l2-ioctl.c | 4 +-
include/media/videobuf2-core.h | 59 +++---
include/uapi/linux/videodev2.h | 11 +-
13 files changed, 399 insertions(+), 180 deletions(-)

--
2.32.0.432.gabb21c7263-goog



2021-07-27 07:07:58

by Sergey Senozhatsky

[permalink] [raw]
Subject: [PATCHv4 4/8] videobuf2: move cache_hints handling to allocators

This moves cache hints handling from videobuf2 core down
to allocators level, because allocators do the sync/flush
caches eventually and may take better decisions. Besides,
allocators already decide whether cache sync/flush should
be done or can be skipped. This patch moves the scattered
buffer cache sync logic to one common place.

Signed-off-by: Sergey Senozhatsky <[email protected]>
---
drivers/media/common/videobuf2/videobuf2-core.c | 6 ------
drivers/media/common/videobuf2/videobuf2-dma-contig.c | 6 ++++++
drivers/media/common/videobuf2/videobuf2-dma-sg.c | 6 ++++++
3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index 76210c006958..55af63d54f23 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -328,9 +328,6 @@ static void __vb2_buf_mem_prepare(struct vb2_buffer *vb)
return;

vb->synced = 1;
- if (vb->skip_cache_sync_on_prepare)
- return;
-
for (plane = 0; plane < vb->num_planes; ++plane)
call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
}
@@ -347,9 +344,6 @@ static void __vb2_buf_mem_finish(struct vb2_buffer *vb)
return;

vb->synced = 0;
- if (vb->skip_cache_sync_on_finish)
- return;
-
for (plane = 0; plane < vb->num_planes; ++plane)
call_void_memop(vb, finish, vb->planes[plane].mem_priv);
}
diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index 019c3843dc6d..1e218bc440c6 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -101,6 +101,9 @@ static void vb2_dc_prepare(void *buf_priv)
struct vb2_dc_buf *buf = buf_priv;
struct sg_table *sgt = buf->dma_sgt;

+ if (buf->vb->skip_cache_sync_on_prepare)
+ return;
+
if (!sgt)
return;

@@ -112,6 +115,9 @@ static void vb2_dc_finish(void *buf_priv)
struct vb2_dc_buf *buf = buf_priv;
struct sg_table *sgt = buf->dma_sgt;

+ if (buf->vb->skip_cache_sync_on_finish)
+ return;
+
if (!sgt)
return;

diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
index 50265080cfc8..33ee63a99139 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
@@ -204,6 +204,9 @@ static void vb2_dma_sg_prepare(void *buf_priv)
struct vb2_dma_sg_buf *buf = buf_priv;
struct sg_table *sgt = buf->dma_sgt;

+ if (buf->vb->skip_cache_sync_on_prepare)
+ return;
+
dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
}

@@ -212,6 +215,9 @@ static void vb2_dma_sg_finish(void *buf_priv)
struct vb2_dma_sg_buf *buf = buf_priv;
struct sg_table *sgt = buf->dma_sgt;

+ if (buf->vb->skip_cache_sync_on_finish)
+ return;
+
dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
}

--
2.32.0.432.gabb21c7263-goog


2021-07-27 07:08:08

by Sergey Senozhatsky

[permalink] [raw]
Subject: [PATCHv4 5/8] videobuf2: add V4L2_MEMORY_FLAG_NON_COHERENT flag

By setting or clearing V4L2_MEMORY_FLAG_NON_COHERENT flag
user-space should be able to hint vb2 that either a non-coherent
(if supported) or coherent memory should be used for the buffer
allocation.

Signed-off-by: Sergey Senozhatsky <[email protected]>
---
.../userspace-api/media/v4l/buffer.rst | 40 ++++++++++++++++++-
.../media/v4l/vidioc-reqbufs.rst | 5 ++-
include/uapi/linux/videodev2.h | 2 +
3 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/Documentation/userspace-api/media/v4l/buffer.rst b/Documentation/userspace-api/media/v4l/buffer.rst
index e991ba73d873..4638ec64db00 100644
--- a/Documentation/userspace-api/media/v4l/buffer.rst
+++ b/Documentation/userspace-api/media/v4l/buffer.rst
@@ -676,8 +676,6 @@ Buffer Flags

\normalsize

-.. _memory-flags:
-
enum v4l2_memory
================

@@ -701,6 +699,44 @@ enum v4l2_memory
- 4
- The buffer is used for :ref:`DMA shared buffer <dmabuf>` I/O.

+.. _memory-flags:
+
+Memory Consistency Flags
+------------------------
+
+.. raw:: latex
+
+ \small
+
+.. tabularcolumns:: |p{7.0cm}|p{2.1cm}|p{8.4cm}|
+
+.. cssclass:: longtable
+
+.. flat-table::
+ :header-rows: 0
+ :stub-columns: 0
+ :widths: 3 1 4
+
+ * .. _`V4L2-MEMORY-FLAG-NON-COHERENT`:
+
+ - ``V4L2_MEMORY_FLAG_NON_COHERENT``
+ - 0x00000001
+ - A buffer is allocated either in coherent (it will be automatically
+ coherent between the CPU and the bus) or non-coherent memory. The
+ latter can provide performance gains, for instance the CPU cache
+ sync/flush operations can be avoided if the buffer is accessed by the
+ corresponding device only and the CPU does not read/write to/from that
+ buffer. However, this requires extra care from the driver -- it must
+ guarantee memory consistency by issuing a cache flush/sync when
+ consistency is needed. If this flag is set V4L2 will attempt to
+ allocate the buffer in non-coherent memory. The flag takes effect
+ only if the buffer is used for :ref:`memory mapping <mmap>` I/O and the
+ queue reports the :ref:`V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS
+ <V4L2-BUF-CAP-SUPPORTS-MMAP-CACHE-HINTS>` capability.
+
+.. raw:: latex
+
+ \normalsize

Timecodes
=========
diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
index 50ea72043bb0..e59306aba2b0 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
@@ -158,8 +158,9 @@ aborting or finishing any DMA in progress, an implicit
- This capability is set by the driver to indicate that the queue supports
cache and memory management hints. However, it's only valid when the
queue is used for :ref:`memory mapping <mmap>` streaming I/O. See
- :ref:`V4L2_BUF_FLAG_NO_CACHE_INVALIDATE <V4L2-BUF-FLAG-NO-CACHE-INVALIDATE>` and
- :ref:`V4L2_BUF_FLAG_NO_CACHE_CLEAN <V4L2-BUF-FLAG-NO-CACHE-CLEAN>`.
+ :ref:`V4L2_BUF_FLAG_NO_CACHE_INVALIDATE <V4L2-BUF-FLAG-NO-CACHE-INVALIDATE>`,
+ :ref:`V4L2_BUF_FLAG_NO_CACHE_CLEAN <V4L2-BUF-FLAG-NO-CACHE-CLEAN>` and
+ :ref:`V4L2_MEMORY_FLAG_NON_COHERENT <V4L2-MEMORY-FLAG-NON-COHERENT>`.

.. raw:: latex

diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 9260791b8438..9d11e1d9c934 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -956,6 +956,8 @@ struct v4l2_requestbuffers {
__u32 reserved[1];
};

+#define V4L2_MEMORY_FLAG_NON_COHERENT (1 << 0)
+
/* capabilities for struct v4l2_requestbuffers and v4l2_create_buffers */
#define V4L2_BUF_CAP_SUPPORTS_MMAP (1 << 0)
#define V4L2_BUF_CAP_SUPPORTS_USERPTR (1 << 1)
--
2.32.0.432.gabb21c7263-goog


2021-07-27 07:08:30

by Sergey Senozhatsky

[permalink] [raw]
Subject: [PATCHv4 6/8] videobuf2: add queue memory coherency parameter

Preparations for future V4L2_MEMORY_FLAG_NON_COHERENT support.

Extend vb2_core_reqbufs() parameters list to accept requests'
->flags, which will be used for memory coherency configuration.

An attempt to allocate a buffer with coherency requirements
which don't match queue's consistency model will fail.

Signed-off-by: Sergey Senozhatsky <[email protected]>
---
.../media/common/videobuf2/videobuf2-core.c | 38 ++++++++++++++++---
.../media/common/videobuf2/videobuf2-v4l2.c | 5 ++-
drivers/media/dvb-core/dvb_vb2.c | 2 +-
include/media/videobuf2-core.h | 10 ++++-
4 files changed, 44 insertions(+), 11 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index 55af63d54f23..af4db310cf5e 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -738,11 +738,31 @@ int vb2_verify_memory_type(struct vb2_queue *q,
}
EXPORT_SYMBOL(vb2_verify_memory_type);

+static void set_queue_coherency(struct vb2_queue *q, bool coherent_mem)
+{
+ q->coherent_mem = 1;
+
+ if (!vb2_queue_allows_cache_hints(q))
+ return;
+ if (!coherent_mem)
+ q->coherent_mem = 0;
+}
+
+static bool verify_coherency_flags(struct vb2_queue *q, bool coherent_mem)
+{
+ if (coherent_mem != q->coherent_mem) {
+ dprintk(q, 1, "memory coherency model mismatch\n");
+ return false;
+ }
+ return true;
+}
+
int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
- unsigned int *count)
+ unsigned int flags, unsigned int *count)
{
unsigned int num_buffers, allocated_buffers, num_planes = 0;
unsigned plane_sizes[VB2_MAX_PLANES] = { };
+ bool coherent_mem = true;
unsigned int i;
int ret;

@@ -757,7 +777,8 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
}

if (*count == 0 || q->num_buffers != 0 ||
- (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory)) {
+ (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory) ||
+ !verify_coherency_flags(q, coherent_mem)) {
/*
* We already have buffers allocated, so first check if they
* are not in use and can be freed.
@@ -794,6 +815,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
num_buffers = min_t(unsigned int, num_buffers, VB2_MAX_FRAME);
memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
q->memory = memory;
+ set_queue_coherency(q, coherent_mem);

/*
* Ask the driver how many buffers and planes per buffer it requires.
@@ -878,12 +900,13 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
EXPORT_SYMBOL_GPL(vb2_core_reqbufs);

int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
- unsigned int *count,
+ unsigned int flags, unsigned int *count,
unsigned int requested_planes,
const unsigned int requested_sizes[])
{
unsigned int num_planes = 0, num_buffers, allocated_buffers;
unsigned plane_sizes[VB2_MAX_PLANES] = { };
+ bool coherent_mem = true;
int ret;

if (q->num_buffers == VB2_MAX_FRAME) {
@@ -899,11 +922,14 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
q->memory = memory;
q->waiting_for_buffers = !q->is_output;
+ set_queue_coherency(q, coherent_mem);
} else {
if (q->memory != memory) {
dprintk(q, 1, "memory model mismatch\n");
return -EINVAL;
}
+ if (!verify_coherency_flags(q, coherent_mem))
+ return -EINVAL;
}

num_buffers = min(*count, VB2_MAX_FRAME - q->num_buffers);
@@ -2576,7 +2602,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
fileio->memory = VB2_MEMORY_MMAP;
fileio->type = q->type;
q->fileio = fileio;
- ret = vb2_core_reqbufs(q, fileio->memory, &fileio->count);
+ ret = vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
if (ret)
goto err_kfree;

@@ -2633,7 +2659,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)

err_reqbufs:
fileio->count = 0;
- vb2_core_reqbufs(q, fileio->memory, &fileio->count);
+ vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);

err_kfree:
q->fileio = NULL;
@@ -2653,7 +2679,7 @@ static int __vb2_cleanup_fileio(struct vb2_queue *q)
vb2_core_streamoff(q, q->type);
q->fileio = NULL;
fileio->count = 0;
- vb2_core_reqbufs(q, fileio->memory, &fileio->count);
+ vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
kfree(fileio);
dprintk(q, 3, "file io emulator closed\n");
}
diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index 2fbae9bd7b52..b4f70ddb09b0 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -697,7 +697,7 @@ int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req)
int ret = vb2_verify_memory_type(q, req->memory, req->type);

fill_buf_caps(q, &req->capabilities);
- return ret ? ret : vb2_core_reqbufs(q, req->memory, &req->count);
+ return ret ? ret : vb2_core_reqbufs(q, req->memory, 0, &req->count);
}
EXPORT_SYMBOL_GPL(vb2_reqbufs);

@@ -772,6 +772,7 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
if (requested_sizes[i] == 0)
return -EINVAL;
return ret ? ret : vb2_core_create_bufs(q, create->memory,
+ 0,
&create->count,
requested_planes,
requested_sizes);
@@ -974,7 +975,7 @@ int vb2_ioctl_reqbufs(struct file *file, void *priv,
return res;
if (vb2_queue_is_busy(vdev, file))
return -EBUSY;
- res = vb2_core_reqbufs(vdev->queue, p->memory, &p->count);
+ res = vb2_core_reqbufs(vdev->queue, p->memory, 0, &p->count);
/* If count == 0, then the owner has released all buffers and he
is no longer owner of the queue. Otherwise we have a new owner. */
if (res == 0)
diff --git a/drivers/media/dvb-core/dvb_vb2.c b/drivers/media/dvb-core/dvb_vb2.c
index 6974f1731529..959d110407a4 100644
--- a/drivers/media/dvb-core/dvb_vb2.c
+++ b/drivers/media/dvb-core/dvb_vb2.c
@@ -342,7 +342,7 @@ int dvb_vb2_reqbufs(struct dvb_vb2_ctx *ctx, struct dmx_requestbuffers *req)

ctx->buf_siz = req->size;
ctx->buf_cnt = req->count;
- ret = vb2_core_reqbufs(&ctx->vb_q, VB2_MEMORY_MMAP, &req->count);
+ ret = vb2_core_reqbufs(&ctx->vb_q, VB2_MEMORY_MMAP, 0, &req->count);
if (ret) {
ctx->state = DVB_VB2_STATE_NONE;
dprintk(1, "[%s] count=%d size=%d errno=%d\n", ctx->name,
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index 66e548268242..7e748cd09b7a 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -504,6 +504,8 @@ struct vb2_buf_ops {
* @allow_cache_hints: when set user-space can pass cache management hints in
* order to skip cache flush/invalidation on ->prepare() or/and
* ->finish().
+ * @coherent_mem: when cleared queue will attempt to allocate buffers using
+ * non-coherent memory.
* @lock: pointer to a mutex that protects the &struct vb2_queue. The
* driver can set this to a mutex to let the v4l2 core serialize
* the queuing ioctls. If the driver wants to handle locking
@@ -583,6 +585,7 @@ struct vb2_queue {
unsigned int uses_qbuf:1;
unsigned int uses_requests:1;
unsigned int allow_cache_hints:1;
+ unsigned int coherent_mem:1;

struct mutex *lock;
void *owner;
@@ -748,6 +751,8 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
* vb2_core_reqbufs() - Initiate streaming.
* @q: pointer to &struct vb2_queue with videobuf2 queue.
* @memory: memory type, as defined by &enum vb2_memory.
+ * @flags: auxiliary queue/buffer management flags. Currently, the only
+ * used flag is %V4L2_MEMORY_FLAG_NON_COHERENT.
* @count: requested buffer count.
*
* Videobuf2 core helper to implement VIDIOC_REQBUF() operation. It is called
@@ -772,12 +777,13 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
* Return: returns zero on success; an error code otherwise.
*/
int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
- unsigned int *count);
+ unsigned int flags, unsigned int *count);

/**
* vb2_core_create_bufs() - Allocate buffers and any required auxiliary structs
* @q: pointer to &struct vb2_queue with videobuf2 queue.
* @memory: memory type, as defined by &enum vb2_memory.
+ * @flags: auxiliary queue/buffer management flags.
* @count: requested buffer count.
* @requested_planes: number of planes requested.
* @requested_sizes: array with the size of the planes.
@@ -795,7 +801,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
* Return: returns zero on success; an error code otherwise.
*/
int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
- unsigned int *count,
+ unsigned int flags, unsigned int *count,
unsigned int requested_planes,
const unsigned int requested_sizes[]);

--
2.32.0.432.gabb21c7263-goog


2021-07-27 07:08:50

by Sergey Senozhatsky

[permalink] [raw]
Subject: [PATCHv4 7/8] videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag

This patch lets user-space to request a non-coherent memory
allocation during CREATE_BUFS and REQBUFS ioctl calls.

= CREATE_BUFS

struct v4l2_create_buffers has seven 4-byte reserved areas,
so reserved[0] is renamed to ->flags. The struct, thus, now
has six reserved 4-byte regions.

= CREATE_BUFS32

struct v4l2_create_buffers32 has seven 4-byte reserved areas,
so reserved[0] is renamed to ->flags. The struct, thus, now
has six reserved 4-byte regions.

= REQBUFS

We use one byte of a 4 byte ->reserved[1] member of struct
v4l2_requestbuffers. The struct, thus, now has reserved 3 bytes.

Signed-off-by: Sergey Senozhatsky <[email protected]>
---
.../media/v4l/vidioc-create-bufs.rst | 7 ++++-
.../media/v4l/vidioc-reqbufs.rst | 11 ++++---
.../media/common/videobuf2/videobuf2-core.c | 4 +--
.../media/common/videobuf2/videobuf2-v4l2.c | 31 +++++++++++++++++--
drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 9 +++++-
drivers/media/v4l2-core/v4l2-ioctl.c | 4 +--
include/uapi/linux/videodev2.h | 9 ++++--
7 files changed, 60 insertions(+), 15 deletions(-)

diff --git a/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst b/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
index f98f18c9e91c..a048a9f6b7b6 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
@@ -113,7 +113,12 @@ than the number requested.
``V4L2_MEMORY_MMAP`` and ``format.type`` to the buffer type.

* - __u32
- - ``reserved``\ [7]
+ - ``flags``
+ - Specifies additional buffer management attributes.
+ See :ref:`memory-flags`.
+
+ * - __u32
+ - ``reserved``\ [6]
- A place holder for future extensions. Drivers and applications
must set the array to zero.

diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
index e59306aba2b0..099fa6695167 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
@@ -104,10 +104,13 @@ aborting or finishing any DMA in progress, an implicit
``V4L2_MEMORY_MMAP`` and ``type`` set to the buffer type. This will
free any previously allocated buffers, so this is typically something
that will be done at the start of the application.
- * - __u32
- - ``reserved``\ [1]
- - A place holder for future extensions. Drivers and applications
- must set the array to zero.
+ * - __u8
+ - ``flags``
+ - Specifies additional buffer management attributes.
+ See :ref:`memory-flags`.
+ * - __u8
+ - ``reserved``\ [3]
+ - Reserved for future extensions.

.. _v4l2-buf-capabilities:
.. _V4L2-BUF-CAP-SUPPORTS-MMAP:
diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index af4db310cf5e..38505783247e 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -762,7 +762,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
{
unsigned int num_buffers, allocated_buffers, num_planes = 0;
unsigned plane_sizes[VB2_MAX_PLANES] = { };
- bool coherent_mem = true;
+ bool coherent_mem = !(flags & V4L2_MEMORY_FLAG_NON_COHERENT);
unsigned int i;
int ret;

@@ -906,7 +906,7 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
{
unsigned int num_planes = 0, num_buffers, allocated_buffers;
unsigned plane_sizes[VB2_MAX_PLANES] = { };
- bool coherent_mem = true;
+ bool coherent_mem = !(flags & V4L2_MEMORY_FLAG_NON_COHERENT);
int ret;

if (q->num_buffers == VB2_MAX_FRAME) {
diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index b4f70ddb09b0..6edf4508c636 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -692,12 +692,32 @@ static void fill_buf_caps(struct vb2_queue *q, u32 *caps)
#endif
}

+static void validate_memory_flags(struct vb2_queue *q,
+ int memory,
+ u32 *flags)
+{
+ if (!q->allow_cache_hints || memory != V4L2_MEMORY_MMAP) {
+ /*
+ * This needs to clear V4L2_MEMORY_FLAG_NON_COHERENT only,
+ * but in order to avoid bugs we zero out all bits.
+ */
+ *flags = 0;
+ } else {
+ /* Clear all unknown flags. */
+ *flags &= V4L2_MEMORY_FLAG_NON_COHERENT;
+ }
+}
+
int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req)
{
int ret = vb2_verify_memory_type(q, req->memory, req->type);
+ u32 flags = req->flags;

fill_buf_caps(q, &req->capabilities);
- return ret ? ret : vb2_core_reqbufs(q, req->memory, 0, &req->count);
+ validate_memory_flags(q, req->memory, &flags);
+ req->flags = flags;
+ return ret ? ret : vb2_core_reqbufs(q, req->memory,
+ req->flags, &req->count);
}
EXPORT_SYMBOL_GPL(vb2_reqbufs);

@@ -729,6 +749,7 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
unsigned i;

fill_buf_caps(q, &create->capabilities);
+ validate_memory_flags(q, create->memory, &create->flags);
create->index = q->num_buffers;
if (create->count == 0)
return ret != -EBUSY ? ret : 0;
@@ -772,7 +793,7 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
if (requested_sizes[i] == 0)
return -EINVAL;
return ret ? ret : vb2_core_create_bufs(q, create->memory,
- 0,
+ create->flags,
&create->count,
requested_planes,
requested_sizes);
@@ -969,13 +990,16 @@ int vb2_ioctl_reqbufs(struct file *file, void *priv,
{
struct video_device *vdev = video_devdata(file);
int res = vb2_verify_memory_type(vdev->queue, p->memory, p->type);
+ u32 flags = p->flags;

fill_buf_caps(vdev->queue, &p->capabilities);
+ validate_memory_flags(vdev->queue, p->memory, &flags);
+ p->flags = flags;
if (res)
return res;
if (vb2_queue_is_busy(vdev, file))
return -EBUSY;
- res = vb2_core_reqbufs(vdev->queue, p->memory, 0, &p->count);
+ res = vb2_core_reqbufs(vdev->queue, p->memory, p->flags, &p->count);
/* If count == 0, then the owner has released all buffers and he
is no longer owner of the queue. Otherwise we have a new owner. */
if (res == 0)
@@ -993,6 +1017,7 @@ int vb2_ioctl_create_bufs(struct file *file, void *priv,

p->index = vdev->queue->num_buffers;
fill_buf_caps(vdev->queue, &p->capabilities);
+ validate_memory_flags(vdev->queue, p->memory, &p->flags);
/*
* If count == 0, then just check if memory and type are valid.
* Any -EBUSY result from vb2_verify_memory_type can be mapped to 0.
diff --git a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
index 47aff3b19742..8176769a89fa 100644
--- a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
+++ b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
@@ -126,6 +126,9 @@ struct v4l2_format32 {
* @memory: buffer memory type
* @format: frame format, for which buffers are requested
* @capabilities: capabilities of this buffer type.
+ * @flags: additional buffer management attributes (ignored unless the
+ * queue has V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS capability and
+ * configured for MMAP streaming I/O).
* @reserved: future extensions
*/
struct v4l2_create_buffers32 {
@@ -134,7 +137,8 @@ struct v4l2_create_buffers32 {
__u32 memory; /* enum v4l2_memory */
struct v4l2_format32 format;
__u32 capabilities;
- __u32 reserved[7];
+ __u32 flags;
+ __u32 reserved[6];
};

static int get_v4l2_format32(struct v4l2_format *p64,
@@ -182,6 +186,8 @@ static int get_v4l2_create32(struct v4l2_create_buffers *p64,
if (copy_from_user(p64, p32,
offsetof(struct v4l2_create_buffers32, format)))
return -EFAULT;
+ if (copy_from_user(&p64->flags, &p32->flags, sizeof(p32->flags)))
+ return -EFAULT;
return get_v4l2_format32(&p64->format, &p32->format);
}

@@ -227,6 +233,7 @@ static int put_v4l2_create32(struct v4l2_create_buffers *p64,
if (copy_to_user(p32, p64,
offsetof(struct v4l2_create_buffers32, format)) ||
put_user(p64->capabilities, &p32->capabilities) ||
+ put_user(p64->flags, &p32->flags) ||
copy_to_user(p32->reserved, p64->reserved, sizeof(p64->reserved)))
return -EFAULT;
return put_v4l2_format32(&p64->format, &p32->format);
diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
index 05d5db3d85e5..6a941da33998 100644
--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -2004,7 +2004,7 @@ static int v4l_reqbufs(const struct v4l2_ioctl_ops *ops,
if (ret)
return ret;

- CLEAR_AFTER_FIELD(p, capabilities);
+ CLEAR_AFTER_FIELD(p, flags);

return ops->vidioc_reqbufs(file, fh, p);
}
@@ -2045,7 +2045,7 @@ static int v4l_create_bufs(const struct v4l2_ioctl_ops *ops,
if (ret)
return ret;

- CLEAR_AFTER_FIELD(create, capabilities);
+ CLEAR_AFTER_FIELD(create, flags);

v4l_sanitize_format(&create->format);

diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 9d11e1d9c934..7973aa0465d2 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -953,7 +953,8 @@ struct v4l2_requestbuffers {
__u32 type; /* enum v4l2_buf_type */
__u32 memory; /* enum v4l2_memory */
__u32 capabilities;
- __u32 reserved[1];
+ __u8 flags;
+ __u8 reserved[3];
};

#define V4L2_MEMORY_FLAG_NON_COHERENT (1 << 0)
@@ -2501,6 +2502,9 @@ struct v4l2_dbg_chip_info {
* @memory: enum v4l2_memory; buffer memory type
* @format: frame format, for which buffers are requested
* @capabilities: capabilities of this buffer type.
+ * @flags: additional buffer management attributes (ignored unless the
+ * queue has V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS capability
+ * and configured for MMAP streaming I/O).
* @reserved: future extensions
*/
struct v4l2_create_buffers {
@@ -2509,7 +2513,8 @@ struct v4l2_create_buffers {
__u32 memory;
struct v4l2_format format;
__u32 capabilities;
- __u32 reserved[7];
+ __u32 flags;
+ __u32 reserved[6];
};

/*
--
2.32.0.432.gabb21c7263-goog


2021-07-27 07:09:15

by Sergey Senozhatsky

[permalink] [raw]
Subject: [PATCHv4 8/8] videobuf2: handle non-contiguous DMA allocations

This adds support for new noncontiguous DMA API, which
requires allocators to have two execution branches: one
for the current API, and one for the new one.

Signed-off-by: Sergey Senozhatsky <[email protected]>
Acked-by: Christoph Hellwig <[email protected]>
---
.../common/videobuf2/videobuf2-dma-contig.c | 142 +++++++++++++++---
1 file changed, 117 insertions(+), 25 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index 1e218bc440c6..10f73e27d694 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -17,6 +17,7 @@
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/dma-mapping.h>
+#include <linux/highmem.h>

#include <media/videobuf2-v4l2.h>
#include <media/videobuf2-dma-contig.h>
@@ -42,6 +43,7 @@ struct vb2_dc_buf {
struct dma_buf_attachment *db_attach;

struct vb2_buffer *vb;
+ bool coherent_mem;
};

/*********************************************/
@@ -78,14 +80,22 @@ static void *vb2_dc_cookie(struct vb2_buffer *vb, void *buf_priv)
static void *vb2_dc_vaddr(struct vb2_buffer *vb, void *buf_priv)
{
struct vb2_dc_buf *buf = buf_priv;
- struct dma_buf_map map;
- int ret;

- if (!buf->vaddr && buf->db_attach) {
- ret = dma_buf_vmap(buf->db_attach->dmabuf, &map);
- buf->vaddr = ret ? NULL : map.vaddr;
+ if (buf->vaddr)
+ return buf->vaddr;
+
+ if (buf->db_attach) {
+ struct dma_buf_map map;
+
+ if (!dma_buf_vmap(buf->db_attach->dmabuf, &map))
+ buf->vaddr = map.vaddr;
+
+ return buf->vaddr;
}

+ if (!buf->coherent_mem)
+ buf->vaddr = dma_vmap_noncontiguous(buf->dev, buf->size,
+ buf->dma_sgt);
return buf->vaddr;
}

@@ -101,13 +111,26 @@ static void vb2_dc_prepare(void *buf_priv)
struct vb2_dc_buf *buf = buf_priv;
struct sg_table *sgt = buf->dma_sgt;

+ /* This takes care of DMABUF and user-enforced cache sync hint */
if (buf->vb->skip_cache_sync_on_prepare)
return;

+ /*
+ * Coherent MMAP buffers do not need to be synced, unlike USERPTR
+ * and non-coherent MMAP buffers.
+ */
+ if (buf->vb->memory == V4L2_MEMORY_MMAP && buf->coherent_mem)
+ return;
+
if (!sgt)
return;

+ /* For both USERPTR and non-coherent MMAP */
dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
+
+ /* Non-coherent MMAP only */
+ if (!buf->coherent_mem && buf->vaddr)
+ flush_kernel_vmap_range(buf->vaddr, buf->size);
}

static void vb2_dc_finish(void *buf_priv)
@@ -115,13 +138,26 @@ static void vb2_dc_finish(void *buf_priv)
struct vb2_dc_buf *buf = buf_priv;
struct sg_table *sgt = buf->dma_sgt;

+ /* This takes care of DMABUF and user-enforced cache sync hint */
if (buf->vb->skip_cache_sync_on_finish)
return;

+ /*
+ * Coherent MMAP buffers do not need to be synced, unlike USERPTR
+ * and non-coherent MMAP buffers.
+ */
+ if (buf->vb->memory == V4L2_MEMORY_MMAP && buf->coherent_mem)
+ return;
+
if (!sgt)
return;

+ /* For both USERPTR and non-coherent MMAP */
dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
+
+ /* Non-coherent MMAP only */
+ if (!buf->coherent_mem && buf->vaddr)
+ invalidate_kernel_vmap_range(buf->vaddr, buf->size);
}

/*********************************************/
@@ -139,17 +175,66 @@ static void vb2_dc_put(void *buf_priv)
sg_free_table(buf->sgt_base);
kfree(buf->sgt_base);
}
- dma_free_attrs(buf->dev, buf->size, buf->cookie, buf->dma_addr,
- buf->attrs);
+
+ if (buf->coherent_mem) {
+ dma_free_attrs(buf->dev, buf->size, buf->cookie,
+ buf->dma_addr, buf->attrs);
+ } else {
+ if (buf->vaddr)
+ dma_vunmap_noncontiguous(buf->dev, buf->vaddr);
+ dma_free_noncontiguous(buf->dev, buf->size,
+ buf->dma_sgt, buf->dma_dir);
+ }
put_device(buf->dev);
kfree(buf);
}

+static int vb2_dc_alloc_coherent(struct vb2_dc_buf *buf)
+{
+ struct vb2_queue *q = buf->vb->vb2_queue;
+
+ buf->cookie = dma_alloc_attrs(buf->dev,
+ buf->size,
+ &buf->dma_addr,
+ GFP_KERNEL | q->gfp_flags,
+ buf->attrs);
+ if (!buf->cookie)
+ return -ENOMEM;
+
+ if (q->dma_attrs & DMA_ATTR_NO_KERNEL_MAPPING)
+ return 0;
+
+ buf->vaddr = buf->cookie;
+ return 0;
+}
+
+static int vb2_dc_alloc_non_coherent(struct vb2_dc_buf *buf)
+{
+ struct vb2_queue *q = buf->vb->vb2_queue;
+
+ buf->dma_sgt = dma_alloc_noncontiguous(buf->dev,
+ buf->size,
+ buf->dma_dir,
+ GFP_KERNEL | q->gfp_flags,
+ buf->attrs);
+ if (!buf->dma_sgt)
+ return -ENOMEM;
+
+ buf->dma_addr = sg_dma_address(buf->dma_sgt->sgl);
+
+ /*
+ * For requests that need kernel mapping (DMA_ATTR_NO_KERNEL_MAPPING
+ * bit is cleared) we perform dma_vmap_noncontiguous() in vb2_dc_vadd().
+ */
+ return 0;
+}
+
static void *vb2_dc_alloc(struct vb2_buffer *vb,
struct device *dev,
unsigned long size)
{
struct vb2_dc_buf *buf;
+ int ret;

if (WARN_ON(!dev))
return ERR_PTR(-EINVAL);
@@ -159,27 +244,28 @@ static void *vb2_dc_alloc(struct vb2_buffer *vb,
return ERR_PTR(-ENOMEM);

buf->attrs = vb->vb2_queue->dma_attrs;
- buf->cookie = dma_alloc_attrs(dev, size, &buf->dma_addr,
- GFP_KERNEL | vb->vb2_queue->gfp_flags,
- buf->attrs);
- if (!buf->cookie) {
- dev_err(dev, "dma_alloc_coherent of size %ld failed\n", size);
- kfree(buf);
- return ERR_PTR(-ENOMEM);
- }
-
- if ((buf->attrs & DMA_ATTR_NO_KERNEL_MAPPING) == 0)
- buf->vaddr = buf->cookie;
+ buf->dma_dir = vb->vb2_queue->dma_dir;
+ buf->vb = vb;
+ buf->coherent_mem = vb->vb2_queue->coherent_mem;

+ buf->size = size;
/* Prevent the device from being released while the buffer is used */
buf->dev = get_device(dev);
- buf->size = size;
- buf->dma_dir = vb->vb2_queue->dma_dir;
+
+ if (buf->coherent_mem)
+ ret = vb2_dc_alloc_coherent(buf);
+ else
+ ret = vb2_dc_alloc_non_coherent(buf);
+
+ if (ret) {
+ dev_err(dev, "dma alloc of size %ld failed\n", size);
+ kfree(buf);
+ return ERR_PTR(-ENOMEM);
+ }

buf->handler.refcount = &buf->refcount;
buf->handler.put = vb2_dc_put;
buf->handler.arg = buf;
- buf->vb = vb;

refcount_set(&buf->refcount, 1);

@@ -196,9 +282,12 @@ static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma)
return -EINVAL;
}

- ret = dma_mmap_attrs(buf->dev, vma, buf->cookie,
- buf->dma_addr, buf->size, buf->attrs);
-
+ if (buf->coherent_mem)
+ ret = dma_mmap_attrs(buf->dev, vma, buf->cookie, buf->dma_addr,
+ buf->size, buf->attrs);
+ else
+ ret = dma_mmap_noncontiguous(buf->dev, vma, buf->size,
+ buf->dma_sgt);
if (ret) {
pr_err("Remapping memory failed, error: %d\n", ret);
return ret;
@@ -362,7 +451,7 @@ static int vb2_dc_dmabuf_ops_vmap(struct dma_buf *dbuf, struct dma_buf_map *map)
{
struct vb2_dc_buf *buf = dbuf->priv;

- dma_buf_map_set_vaddr(map, buf->vaddr);
+ dma_buf_map_set_vaddr(map, vb2_dc_vaddr(buf->vb, buf));

return 0;
}
@@ -390,6 +479,9 @@ static struct sg_table *vb2_dc_get_base_sgt(struct vb2_dc_buf *buf)
int ret;
struct sg_table *sgt;

+ if (!buf->coherent_mem)
+ return buf->dma_sgt;
+
sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
if (!sgt) {
dev_err(buf->dev, "failed to alloc sg table\n");
--
2.32.0.432.gabb21c7263-goog


2021-08-03 08:12:32

by Hans Verkuil

[permalink] [raw]
Subject: Re: [PATCHv4 4/8] videobuf2: move cache_hints handling to allocators

On 27/07/2021 09:05, Sergey Senozhatsky wrote:
> This moves cache hints handling from videobuf2 core down

from -> from the

> to allocators level, because allocators do the sync/flush

to allocators -> to the allocator's

> caches eventually and may take better decisions. Besides,
> allocators already decide whether cache sync/flush should
> be done or can be skipped. This patch moves the scattered
> buffer cache sync logic to one common place.

Regards,

Hans

>
> Signed-off-by: Sergey Senozhatsky <[email protected]>
> ---
> drivers/media/common/videobuf2/videobuf2-core.c | 6 ------
> drivers/media/common/videobuf2/videobuf2-dma-contig.c | 6 ++++++
> drivers/media/common/videobuf2/videobuf2-dma-sg.c | 6 ++++++
> 3 files changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> index 76210c006958..55af63d54f23 100644
> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> @@ -328,9 +328,6 @@ static void __vb2_buf_mem_prepare(struct vb2_buffer *vb)
> return;
>
> vb->synced = 1;
> - if (vb->skip_cache_sync_on_prepare)
> - return;
> -
> for (plane = 0; plane < vb->num_planes; ++plane)
> call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
> }
> @@ -347,9 +344,6 @@ static void __vb2_buf_mem_finish(struct vb2_buffer *vb)
> return;
>
> vb->synced = 0;
> - if (vb->skip_cache_sync_on_finish)
> - return;
> -
> for (plane = 0; plane < vb->num_planes; ++plane)
> call_void_memop(vb, finish, vb->planes[plane].mem_priv);
> }
> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> index 019c3843dc6d..1e218bc440c6 100644
> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> @@ -101,6 +101,9 @@ static void vb2_dc_prepare(void *buf_priv)
> struct vb2_dc_buf *buf = buf_priv;
> struct sg_table *sgt = buf->dma_sgt;
>
> + if (buf->vb->skip_cache_sync_on_prepare)
> + return;
> +
> if (!sgt)
> return;
>
> @@ -112,6 +115,9 @@ static void vb2_dc_finish(void *buf_priv)
> struct vb2_dc_buf *buf = buf_priv;
> struct sg_table *sgt = buf->dma_sgt;
>
> + if (buf->vb->skip_cache_sync_on_finish)
> + return;
> +
> if (!sgt)
> return;
>
> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> index 50265080cfc8..33ee63a99139 100644
> --- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> +++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> @@ -204,6 +204,9 @@ static void vb2_dma_sg_prepare(void *buf_priv)
> struct vb2_dma_sg_buf *buf = buf_priv;
> struct sg_table *sgt = buf->dma_sgt;
>
> + if (buf->vb->skip_cache_sync_on_prepare)
> + return;
> +
> dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
> }
>
> @@ -212,6 +215,9 @@ static void vb2_dma_sg_finish(void *buf_priv)
> struct vb2_dma_sg_buf *buf = buf_priv;
> struct sg_table *sgt = buf->dma_sgt;
>
> + if (buf->vb->skip_cache_sync_on_finish)
> + return;
> +
> dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
> }
>
>


2021-08-03 08:14:14

by Hans Verkuil

[permalink] [raw]
Subject: Re: [PATCHv4 5/8] videobuf2: add V4L2_MEMORY_FLAG_NON_COHERENT flag

On 27/07/2021 09:05, Sergey Senozhatsky wrote:
> By setting or clearing V4L2_MEMORY_FLAG_NON_COHERENT flag

clearing -> clearing the

> user-space should be able to hint vb2 that either a non-coherent

either a -> either

> (if supported) or coherent memory should be used for the buffer
> allocation.

Regards,

Hans

>
> Signed-off-by: Sergey Senozhatsky <[email protected]>
> ---
> .../userspace-api/media/v4l/buffer.rst | 40 ++++++++++++++++++-
> .../media/v4l/vidioc-reqbufs.rst | 5 ++-
> include/uapi/linux/videodev2.h | 2 +
> 3 files changed, 43 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/userspace-api/media/v4l/buffer.rst b/Documentation/userspace-api/media/v4l/buffer.rst
> index e991ba73d873..4638ec64db00 100644
> --- a/Documentation/userspace-api/media/v4l/buffer.rst
> +++ b/Documentation/userspace-api/media/v4l/buffer.rst
> @@ -676,8 +676,6 @@ Buffer Flags
>
> \normalsize
>
> -.. _memory-flags:
> -
> enum v4l2_memory
> ================
>
> @@ -701,6 +699,44 @@ enum v4l2_memory
> - 4
> - The buffer is used for :ref:`DMA shared buffer <dmabuf>` I/O.
>
> +.. _memory-flags:
> +
> +Memory Consistency Flags
> +------------------------
> +
> +.. raw:: latex
> +
> + \small
> +
> +.. tabularcolumns:: |p{7.0cm}|p{2.1cm}|p{8.4cm}|
> +
> +.. cssclass:: longtable
> +
> +.. flat-table::
> + :header-rows: 0
> + :stub-columns: 0
> + :widths: 3 1 4
> +
> + * .. _`V4L2-MEMORY-FLAG-NON-COHERENT`:
> +
> + - ``V4L2_MEMORY_FLAG_NON_COHERENT``
> + - 0x00000001
> + - A buffer is allocated either in coherent (it will be automatically
> + coherent between the CPU and the bus) or non-coherent memory. The
> + latter can provide performance gains, for instance the CPU cache
> + sync/flush operations can be avoided if the buffer is accessed by the
> + corresponding device only and the CPU does not read/write to/from that
> + buffer. However, this requires extra care from the driver -- it must
> + guarantee memory consistency by issuing a cache flush/sync when
> + consistency is needed. If this flag is set V4L2 will attempt to
> + allocate the buffer in non-coherent memory. The flag takes effect
> + only if the buffer is used for :ref:`memory mapping <mmap>` I/O and the
> + queue reports the :ref:`V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS
> + <V4L2-BUF-CAP-SUPPORTS-MMAP-CACHE-HINTS>` capability.
> +
> +.. raw:: latex
> +
> + \normalsize
>
> Timecodes
> =========
> diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> index 50ea72043bb0..e59306aba2b0 100644
> --- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> +++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> @@ -158,8 +158,9 @@ aborting or finishing any DMA in progress, an implicit
> - This capability is set by the driver to indicate that the queue supports
> cache and memory management hints. However, it's only valid when the
> queue is used for :ref:`memory mapping <mmap>` streaming I/O. See
> - :ref:`V4L2_BUF_FLAG_NO_CACHE_INVALIDATE <V4L2-BUF-FLAG-NO-CACHE-INVALIDATE>` and
> - :ref:`V4L2_BUF_FLAG_NO_CACHE_CLEAN <V4L2-BUF-FLAG-NO-CACHE-CLEAN>`.
> + :ref:`V4L2_BUF_FLAG_NO_CACHE_INVALIDATE <V4L2-BUF-FLAG-NO-CACHE-INVALIDATE>`,
> + :ref:`V4L2_BUF_FLAG_NO_CACHE_CLEAN <V4L2-BUF-FLAG-NO-CACHE-CLEAN>` and
> + :ref:`V4L2_MEMORY_FLAG_NON_COHERENT <V4L2-MEMORY-FLAG-NON-COHERENT>`.
>
> .. raw:: latex
>
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 9260791b8438..9d11e1d9c934 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -956,6 +956,8 @@ struct v4l2_requestbuffers {
> __u32 reserved[1];
> };
>
> +#define V4L2_MEMORY_FLAG_NON_COHERENT (1 << 0)
> +
> /* capabilities for struct v4l2_requestbuffers and v4l2_create_buffers */
> #define V4L2_BUF_CAP_SUPPORTS_MMAP (1 << 0)
> #define V4L2_BUF_CAP_SUPPORTS_USERPTR (1 << 1)
>


2021-08-03 08:31:45

by Hans Verkuil

[permalink] [raw]
Subject: Re: [PATCHv4 6/8] videobuf2: add queue memory coherency parameter

On 27/07/2021 09:05, Sergey Senozhatsky wrote:
> Preparations for future V4L2_MEMORY_FLAG_NON_COHERENT support.
>
> Extend vb2_core_reqbufs() parameters list to accept requests'

Extend -> Extend the

> ->flags, which will be used for memory coherency configuration.
>
> An attempt to allocate a buffer with coherency requirements
> which don't match queue's consistency model will fail.

which don't match -> that do not match the

>
> Signed-off-by: Sergey Senozhatsky <[email protected]>
> ---
> .../media/common/videobuf2/videobuf2-core.c | 38 ++++++++++++++++---
> .../media/common/videobuf2/videobuf2-v4l2.c | 5 ++-
> drivers/media/dvb-core/dvb_vb2.c | 2 +-
> include/media/videobuf2-core.h | 10 ++++-
> 4 files changed, 44 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> index 55af63d54f23..af4db310cf5e 100644
> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> @@ -738,11 +738,31 @@ int vb2_verify_memory_type(struct vb2_queue *q,
> }
> EXPORT_SYMBOL(vb2_verify_memory_type);
>
> +static void set_queue_coherency(struct vb2_queue *q, bool coherent_mem)
> +{
> + q->coherent_mem = 1;

This I do not like: coherent memory is the default, and so I think it will
be more robust if this field is renamed to non_coherent_mem and so coherent
memory will be the default since this field will be cleared initially with
kzalloc.

Basically a similar argument that you used in patch 2/8.

I also think that it improves readability, since non-coherent is the
exceptional case, not the rule, and the field name corresponds with the
V4L2 memory flag name.

I noticed that in v1 of this series it was actually called non_coherent_mem,
and it was changed in v2, actually after some comments from me.

But I changed my mind on that, and I think it makes more sense to go back to
calling this non_coherent_mem.

Regards,

Hans

> +
> + if (!vb2_queue_allows_cache_hints(q))
> + return;
> + if (!coherent_mem)
> + q->coherent_mem = 0;
> +}
> +
> +static bool verify_coherency_flags(struct vb2_queue *q, bool coherent_mem)
> +{
> + if (coherent_mem != q->coherent_mem) {
> + dprintk(q, 1, "memory coherency model mismatch\n");
> + return false;
> + }
> + return true;
> +}
> +
> int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> - unsigned int *count)
> + unsigned int flags, unsigned int *count)
> {
> unsigned int num_buffers, allocated_buffers, num_planes = 0;
> unsigned plane_sizes[VB2_MAX_PLANES] = { };
> + bool coherent_mem = true;
> unsigned int i;
> int ret;
>
> @@ -757,7 +777,8 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> }
>
> if (*count == 0 || q->num_buffers != 0 ||
> - (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory)) {
> + (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory) ||
> + !verify_coherency_flags(q, coherent_mem)) {
> /*
> * We already have buffers allocated, so first check if they
> * are not in use and can be freed.
> @@ -794,6 +815,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> num_buffers = min_t(unsigned int, num_buffers, VB2_MAX_FRAME);
> memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
> q->memory = memory;
> + set_queue_coherency(q, coherent_mem);
>
> /*
> * Ask the driver how many buffers and planes per buffer it requires.
> @@ -878,12 +900,13 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> EXPORT_SYMBOL_GPL(vb2_core_reqbufs);
>
> int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
> - unsigned int *count,
> + unsigned int flags, unsigned int *count,
> unsigned int requested_planes,
> const unsigned int requested_sizes[])
> {
> unsigned int num_planes = 0, num_buffers, allocated_buffers;
> unsigned plane_sizes[VB2_MAX_PLANES] = { };
> + bool coherent_mem = true;
> int ret;
>
> if (q->num_buffers == VB2_MAX_FRAME) {
> @@ -899,11 +922,14 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
> memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
> q->memory = memory;
> q->waiting_for_buffers = !q->is_output;
> + set_queue_coherency(q, coherent_mem);
> } else {
> if (q->memory != memory) {
> dprintk(q, 1, "memory model mismatch\n");
> return -EINVAL;
> }
> + if (!verify_coherency_flags(q, coherent_mem))
> + return -EINVAL;
> }
>
> num_buffers = min(*count, VB2_MAX_FRAME - q->num_buffers);
> @@ -2576,7 +2602,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
> fileio->memory = VB2_MEMORY_MMAP;
> fileio->type = q->type;
> q->fileio = fileio;
> - ret = vb2_core_reqbufs(q, fileio->memory, &fileio->count);
> + ret = vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
> if (ret)
> goto err_kfree;
>
> @@ -2633,7 +2659,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
>
> err_reqbufs:
> fileio->count = 0;
> - vb2_core_reqbufs(q, fileio->memory, &fileio->count);
> + vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
>
> err_kfree:
> q->fileio = NULL;
> @@ -2653,7 +2679,7 @@ static int __vb2_cleanup_fileio(struct vb2_queue *q)
> vb2_core_streamoff(q, q->type);
> q->fileio = NULL;
> fileio->count = 0;
> - vb2_core_reqbufs(q, fileio->memory, &fileio->count);
> + vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
> kfree(fileio);
> dprintk(q, 3, "file io emulator closed\n");
> }
> diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
> index 2fbae9bd7b52..b4f70ddb09b0 100644
> --- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
> +++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
> @@ -697,7 +697,7 @@ int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req)
> int ret = vb2_verify_memory_type(q, req->memory, req->type);
>
> fill_buf_caps(q, &req->capabilities);
> - return ret ? ret : vb2_core_reqbufs(q, req->memory, &req->count);
> + return ret ? ret : vb2_core_reqbufs(q, req->memory, 0, &req->count);
> }
> EXPORT_SYMBOL_GPL(vb2_reqbufs);
>
> @@ -772,6 +772,7 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
> if (requested_sizes[i] == 0)
> return -EINVAL;
> return ret ? ret : vb2_core_create_bufs(q, create->memory,
> + 0,
> &create->count,
> requested_planes,
> requested_sizes);
> @@ -974,7 +975,7 @@ int vb2_ioctl_reqbufs(struct file *file, void *priv,
> return res;
> if (vb2_queue_is_busy(vdev, file))
> return -EBUSY;
> - res = vb2_core_reqbufs(vdev->queue, p->memory, &p->count);
> + res = vb2_core_reqbufs(vdev->queue, p->memory, 0, &p->count);
> /* If count == 0, then the owner has released all buffers and he
> is no longer owner of the queue. Otherwise we have a new owner. */
> if (res == 0)
> diff --git a/drivers/media/dvb-core/dvb_vb2.c b/drivers/media/dvb-core/dvb_vb2.c
> index 6974f1731529..959d110407a4 100644
> --- a/drivers/media/dvb-core/dvb_vb2.c
> +++ b/drivers/media/dvb-core/dvb_vb2.c
> @@ -342,7 +342,7 @@ int dvb_vb2_reqbufs(struct dvb_vb2_ctx *ctx, struct dmx_requestbuffers *req)
>
> ctx->buf_siz = req->size;
> ctx->buf_cnt = req->count;
> - ret = vb2_core_reqbufs(&ctx->vb_q, VB2_MEMORY_MMAP, &req->count);
> + ret = vb2_core_reqbufs(&ctx->vb_q, VB2_MEMORY_MMAP, 0, &req->count);
> if (ret) {
> ctx->state = DVB_VB2_STATE_NONE;
> dprintk(1, "[%s] count=%d size=%d errno=%d\n", ctx->name,
> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> index 66e548268242..7e748cd09b7a 100644
> --- a/include/media/videobuf2-core.h
> +++ b/include/media/videobuf2-core.h
> @@ -504,6 +504,8 @@ struct vb2_buf_ops {
> * @allow_cache_hints: when set user-space can pass cache management hints in
> * order to skip cache flush/invalidation on ->prepare() or/and
> * ->finish().
> + * @coherent_mem: when cleared queue will attempt to allocate buffers using
> + * non-coherent memory.
> * @lock: pointer to a mutex that protects the &struct vb2_queue. The
> * driver can set this to a mutex to let the v4l2 core serialize
> * the queuing ioctls. If the driver wants to handle locking
> @@ -583,6 +585,7 @@ struct vb2_queue {
> unsigned int uses_qbuf:1;
> unsigned int uses_requests:1;
> unsigned int allow_cache_hints:1;
> + unsigned int coherent_mem:1;
>
> struct mutex *lock;
> void *owner;
> @@ -748,6 +751,8 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
> * vb2_core_reqbufs() - Initiate streaming.
> * @q: pointer to &struct vb2_queue with videobuf2 queue.
> * @memory: memory type, as defined by &enum vb2_memory.
> + * @flags: auxiliary queue/buffer management flags. Currently, the only
> + * used flag is %V4L2_MEMORY_FLAG_NON_COHERENT.
> * @count: requested buffer count.
> *
> * Videobuf2 core helper to implement VIDIOC_REQBUF() operation. It is called
> @@ -772,12 +777,13 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
> * Return: returns zero on success; an error code otherwise.
> */
> int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> - unsigned int *count);
> + unsigned int flags, unsigned int *count);
>
> /**
> * vb2_core_create_bufs() - Allocate buffers and any required auxiliary structs
> * @q: pointer to &struct vb2_queue with videobuf2 queue.
> * @memory: memory type, as defined by &enum vb2_memory.
> + * @flags: auxiliary queue/buffer management flags.
> * @count: requested buffer count.
> * @requested_planes: number of planes requested.
> * @requested_sizes: array with the size of the planes.
> @@ -795,7 +801,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> * Return: returns zero on success; an error code otherwise.
> */
> int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
> - unsigned int *count,
> + unsigned int flags, unsigned int *count,
> unsigned int requested_planes,
> const unsigned int requested_sizes[]);
>
>


2021-08-03 08:35:01

by Hans Verkuil

[permalink] [raw]
Subject: Re: [PATCHv4 8/8] videobuf2: handle non-contiguous DMA allocations

On 27/07/2021 09:05, Sergey Senozhatsky wrote:
> This adds support for new noncontiguous DMA API, which

for -> for the

> requires allocators to have two execution branches: one
> for the current API, and one for the new one.
>
> Signed-off-by: Sergey Senozhatsky <[email protected]>
> Acked-by: Christoph Hellwig <[email protected]>
> ---
> .../common/videobuf2/videobuf2-dma-contig.c | 142 +++++++++++++++---
> 1 file changed, 117 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> index 1e218bc440c6..10f73e27d694 100644
> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> @@ -17,6 +17,7 @@
> #include <linux/sched.h>
> #include <linux/slab.h>
> #include <linux/dma-mapping.h>
> +#include <linux/highmem.h>
>
> #include <media/videobuf2-v4l2.h>
> #include <media/videobuf2-dma-contig.h>
> @@ -42,6 +43,7 @@ struct vb2_dc_buf {
> struct dma_buf_attachment *db_attach;
>
> struct vb2_buffer *vb;
> + bool coherent_mem;

I think that this as well should be renamed to non_coherent_mem.

> };
>
> /*********************************************/
> @@ -78,14 +80,22 @@ static void *vb2_dc_cookie(struct vb2_buffer *vb, void *buf_priv)
> static void *vb2_dc_vaddr(struct vb2_buffer *vb, void *buf_priv)
> {
> struct vb2_dc_buf *buf = buf_priv;
> - struct dma_buf_map map;
> - int ret;
>
> - if (!buf->vaddr && buf->db_attach) {
> - ret = dma_buf_vmap(buf->db_attach->dmabuf, &map);
> - buf->vaddr = ret ? NULL : map.vaddr;
> + if (buf->vaddr)
> + return buf->vaddr;
> +
> + if (buf->db_attach) {
> + struct dma_buf_map map;
> +
> + if (!dma_buf_vmap(buf->db_attach->dmabuf, &map))
> + buf->vaddr = map.vaddr;
> +
> + return buf->vaddr;
> }
>
> + if (!buf->coherent_mem)
> + buf->vaddr = dma_vmap_noncontiguous(buf->dev, buf->size,
> + buf->dma_sgt);
> return buf->vaddr;
> }
>
> @@ -101,13 +111,26 @@ static void vb2_dc_prepare(void *buf_priv)
> struct vb2_dc_buf *buf = buf_priv;
> struct sg_table *sgt = buf->dma_sgt;
>
> + /* This takes care of DMABUF and user-enforced cache sync hint */
> if (buf->vb->skip_cache_sync_on_prepare)
> return;
>
> + /*
> + * Coherent MMAP buffers do not need to be synced, unlike USERPTR
> + * and non-coherent MMAP buffers.
> + */
> + if (buf->vb->memory == V4L2_MEMORY_MMAP && buf->coherent_mem)
> + return;
> +
> if (!sgt)
> return;
>
> + /* For both USERPTR and non-coherent MMAP */
> dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
> +
> + /* Non-coherent MMAP only */
> + if (!buf->coherent_mem && buf->vaddr)
> + flush_kernel_vmap_range(buf->vaddr, buf->size);
> }
>
> static void vb2_dc_finish(void *buf_priv)
> @@ -115,13 +138,26 @@ static void vb2_dc_finish(void *buf_priv)
> struct vb2_dc_buf *buf = buf_priv;
> struct sg_table *sgt = buf->dma_sgt;
>
> + /* This takes care of DMABUF and user-enforced cache sync hint */
> if (buf->vb->skip_cache_sync_on_finish)
> return;
>
> + /*
> + * Coherent MMAP buffers do not need to be synced, unlike USERPTR
> + * and non-coherent MMAP buffers.
> + */
> + if (buf->vb->memory == V4L2_MEMORY_MMAP && buf->coherent_mem)
> + return;
> +
> if (!sgt)
> return;
>
> + /* For both USERPTR and non-coherent MMAP */
> dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
> +
> + /* Non-coherent MMAP only */
> + if (!buf->coherent_mem && buf->vaddr)
> + invalidate_kernel_vmap_range(buf->vaddr, buf->size);
> }
>
> /*********************************************/
> @@ -139,17 +175,66 @@ static void vb2_dc_put(void *buf_priv)
> sg_free_table(buf->sgt_base);
> kfree(buf->sgt_base);
> }
> - dma_free_attrs(buf->dev, buf->size, buf->cookie, buf->dma_addr,
> - buf->attrs);
> +
> + if (buf->coherent_mem) {
> + dma_free_attrs(buf->dev, buf->size, buf->cookie,
> + buf->dma_addr, buf->attrs);
> + } else {
> + if (buf->vaddr)
> + dma_vunmap_noncontiguous(buf->dev, buf->vaddr);
> + dma_free_noncontiguous(buf->dev, buf->size,
> + buf->dma_sgt, buf->dma_dir);
> + }
> put_device(buf->dev);
> kfree(buf);
> }
>
> +static int vb2_dc_alloc_coherent(struct vb2_dc_buf *buf)
> +{
> + struct vb2_queue *q = buf->vb->vb2_queue;
> +
> + buf->cookie = dma_alloc_attrs(buf->dev,
> + buf->size,
> + &buf->dma_addr,
> + GFP_KERNEL | q->gfp_flags,
> + buf->attrs);
> + if (!buf->cookie)
> + return -ENOMEM;
> +
> + if (q->dma_attrs & DMA_ATTR_NO_KERNEL_MAPPING)
> + return 0;
> +
> + buf->vaddr = buf->cookie;
> + return 0;
> +}
> +
> +static int vb2_dc_alloc_non_coherent(struct vb2_dc_buf *buf)
> +{
> + struct vb2_queue *q = buf->vb->vb2_queue;
> +
> + buf->dma_sgt = dma_alloc_noncontiguous(buf->dev,
> + buf->size,
> + buf->dma_dir,
> + GFP_KERNEL | q->gfp_flags,
> + buf->attrs);
> + if (!buf->dma_sgt)
> + return -ENOMEM;
> +
> + buf->dma_addr = sg_dma_address(buf->dma_sgt->sgl);
> +
> + /*
> + * For requests that need kernel mapping (DMA_ATTR_NO_KERNEL_MAPPING
> + * bit is cleared) we perform dma_vmap_noncontiguous() in vb2_dc_vadd().
> + */
> + return 0;
> +}
> +
> static void *vb2_dc_alloc(struct vb2_buffer *vb,
> struct device *dev,
> unsigned long size)
> {
> struct vb2_dc_buf *buf;
> + int ret;
>
> if (WARN_ON(!dev))
> return ERR_PTR(-EINVAL);
> @@ -159,27 +244,28 @@ static void *vb2_dc_alloc(struct vb2_buffer *vb,
> return ERR_PTR(-ENOMEM);
>
> buf->attrs = vb->vb2_queue->dma_attrs;
> - buf->cookie = dma_alloc_attrs(dev, size, &buf->dma_addr,
> - GFP_KERNEL | vb->vb2_queue->gfp_flags,
> - buf->attrs);
> - if (!buf->cookie) {
> - dev_err(dev, "dma_alloc_coherent of size %ld failed\n", size);
> - kfree(buf);
> - return ERR_PTR(-ENOMEM);
> - }
> -
> - if ((buf->attrs & DMA_ATTR_NO_KERNEL_MAPPING) == 0)
> - buf->vaddr = buf->cookie;
> + buf->dma_dir = vb->vb2_queue->dma_dir;
> + buf->vb = vb;
> + buf->coherent_mem = vb->vb2_queue->coherent_mem;
>
> + buf->size = size;
> /* Prevent the device from being released while the buffer is used */
> buf->dev = get_device(dev);
> - buf->size = size;
> - buf->dma_dir = vb->vb2_queue->dma_dir;
> +
> + if (buf->coherent_mem)
> + ret = vb2_dc_alloc_coherent(buf);
> + else
> + ret = vb2_dc_alloc_non_coherent(buf);
> +
> + if (ret) {
> + dev_err(dev, "dma alloc of size %ld failed\n", size);
> + kfree(buf);
> + return ERR_PTR(-ENOMEM);
> + }
>
> buf->handler.refcount = &buf->refcount;
> buf->handler.put = vb2_dc_put;
> buf->handler.arg = buf;
> - buf->vb = vb;
>
> refcount_set(&buf->refcount, 1);
>
> @@ -196,9 +282,12 @@ static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma)
> return -EINVAL;
> }
>
> - ret = dma_mmap_attrs(buf->dev, vma, buf->cookie,
> - buf->dma_addr, buf->size, buf->attrs);
> -
> + if (buf->coherent_mem)
> + ret = dma_mmap_attrs(buf->dev, vma, buf->cookie, buf->dma_addr,
> + buf->size, buf->attrs);
> + else
> + ret = dma_mmap_noncontiguous(buf->dev, vma, buf->size,
> + buf->dma_sgt);
> if (ret) {
> pr_err("Remapping memory failed, error: %d\n", ret);
> return ret;
> @@ -362,7 +451,7 @@ static int vb2_dc_dmabuf_ops_vmap(struct dma_buf *dbuf, struct dma_buf_map *map)
> {
> struct vb2_dc_buf *buf = dbuf->priv;
>
> - dma_buf_map_set_vaddr(map, buf->vaddr);
> + dma_buf_map_set_vaddr(map, vb2_dc_vaddr(buf->vb, buf));
>
> return 0;
> }
> @@ -390,6 +479,9 @@ static struct sg_table *vb2_dc_get_base_sgt(struct vb2_dc_buf *buf)
> int ret;
> struct sg_table *sgt;
>
> + if (!buf->coherent_mem)
> + return buf->dma_sgt;
> +
> sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
> if (!sgt) {
> dev_err(buf->dev, "failed to alloc sg table\n");
>

Regards,

Hans

2021-08-03 08:35:37

by Hans Verkuil

[permalink] [raw]
Subject: Re: [PATCHv4 7/8] videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag

On 27/07/2021 09:05, Sergey Senozhatsky wrote:
> This patch lets user-space to request a non-coherent memory

s/to//

Regards,

Hans

> allocation during CREATE_BUFS and REQBUFS ioctl calls.
>
> = CREATE_BUFS
>
> struct v4l2_create_buffers has seven 4-byte reserved areas,
> so reserved[0] is renamed to ->flags. The struct, thus, now
> has six reserved 4-byte regions.
>
> = CREATE_BUFS32
>
> struct v4l2_create_buffers32 has seven 4-byte reserved areas,
> so reserved[0] is renamed to ->flags. The struct, thus, now
> has six reserved 4-byte regions.
>
> = REQBUFS
>
> We use one byte of a 4 byte ->reserved[1] member of struct
> v4l2_requestbuffers. The struct, thus, now has reserved 3 bytes.
>
> Signed-off-by: Sergey Senozhatsky <[email protected]>
> ---
> .../media/v4l/vidioc-create-bufs.rst | 7 ++++-
> .../media/v4l/vidioc-reqbufs.rst | 11 ++++---
> .../media/common/videobuf2/videobuf2-core.c | 4 +--
> .../media/common/videobuf2/videobuf2-v4l2.c | 31 +++++++++++++++++--
> drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 9 +++++-
> drivers/media/v4l2-core/v4l2-ioctl.c | 4 +--
> include/uapi/linux/videodev2.h | 9 ++++--
> 7 files changed, 60 insertions(+), 15 deletions(-)
>
> diff --git a/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst b/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
> index f98f18c9e91c..a048a9f6b7b6 100644
> --- a/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
> +++ b/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
> @@ -113,7 +113,12 @@ than the number requested.
> ``V4L2_MEMORY_MMAP`` and ``format.type`` to the buffer type.
>
> * - __u32
> - - ``reserved``\ [7]
> + - ``flags``
> + - Specifies additional buffer management attributes.
> + See :ref:`memory-flags`.
> +
> + * - __u32
> + - ``reserved``\ [6]
> - A place holder for future extensions. Drivers and applications
> must set the array to zero.
>
> diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> index e59306aba2b0..099fa6695167 100644
> --- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> +++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> @@ -104,10 +104,13 @@ aborting or finishing any DMA in progress, an implicit
> ``V4L2_MEMORY_MMAP`` and ``type`` set to the buffer type. This will
> free any previously allocated buffers, so this is typically something
> that will be done at the start of the application.
> - * - __u32
> - - ``reserved``\ [1]
> - - A place holder for future extensions. Drivers and applications
> - must set the array to zero.
> + * - __u8
> + - ``flags``
> + - Specifies additional buffer management attributes.
> + See :ref:`memory-flags`.
> + * - __u8
> + - ``reserved``\ [3]
> + - Reserved for future extensions.
>
> .. _v4l2-buf-capabilities:
> .. _V4L2-BUF-CAP-SUPPORTS-MMAP:
> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> index af4db310cf5e..38505783247e 100644
> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> @@ -762,7 +762,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> {
> unsigned int num_buffers, allocated_buffers, num_planes = 0;
> unsigned plane_sizes[VB2_MAX_PLANES] = { };
> - bool coherent_mem = true;
> + bool coherent_mem = !(flags & V4L2_MEMORY_FLAG_NON_COHERENT);
> unsigned int i;
> int ret;
>
> @@ -906,7 +906,7 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
> {
> unsigned int num_planes = 0, num_buffers, allocated_buffers;
> unsigned plane_sizes[VB2_MAX_PLANES] = { };
> - bool coherent_mem = true;
> + bool coherent_mem = !(flags & V4L2_MEMORY_FLAG_NON_COHERENT);
> int ret;
>
> if (q->num_buffers == VB2_MAX_FRAME) {
> diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
> index b4f70ddb09b0..6edf4508c636 100644
> --- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
> +++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
> @@ -692,12 +692,32 @@ static void fill_buf_caps(struct vb2_queue *q, u32 *caps)
> #endif
> }
>
> +static void validate_memory_flags(struct vb2_queue *q,
> + int memory,
> + u32 *flags)
> +{
> + if (!q->allow_cache_hints || memory != V4L2_MEMORY_MMAP) {
> + /*
> + * This needs to clear V4L2_MEMORY_FLAG_NON_COHERENT only,
> + * but in order to avoid bugs we zero out all bits.
> + */
> + *flags = 0;
> + } else {
> + /* Clear all unknown flags. */
> + *flags &= V4L2_MEMORY_FLAG_NON_COHERENT;
> + }
> +}
> +
> int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req)
> {
> int ret = vb2_verify_memory_type(q, req->memory, req->type);
> + u32 flags = req->flags;
>
> fill_buf_caps(q, &req->capabilities);
> - return ret ? ret : vb2_core_reqbufs(q, req->memory, 0, &req->count);
> + validate_memory_flags(q, req->memory, &flags);
> + req->flags = flags;
> + return ret ? ret : vb2_core_reqbufs(q, req->memory,
> + req->flags, &req->count);
> }
> EXPORT_SYMBOL_GPL(vb2_reqbufs);
>
> @@ -729,6 +749,7 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
> unsigned i;
>
> fill_buf_caps(q, &create->capabilities);
> + validate_memory_flags(q, create->memory, &create->flags);
> create->index = q->num_buffers;
> if (create->count == 0)
> return ret != -EBUSY ? ret : 0;
> @@ -772,7 +793,7 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
> if (requested_sizes[i] == 0)
> return -EINVAL;
> return ret ? ret : vb2_core_create_bufs(q, create->memory,
> - 0,
> + create->flags,
> &create->count,
> requested_planes,
> requested_sizes);
> @@ -969,13 +990,16 @@ int vb2_ioctl_reqbufs(struct file *file, void *priv,
> {
> struct video_device *vdev = video_devdata(file);
> int res = vb2_verify_memory_type(vdev->queue, p->memory, p->type);
> + u32 flags = p->flags;
>
> fill_buf_caps(vdev->queue, &p->capabilities);
> + validate_memory_flags(vdev->queue, p->memory, &flags);
> + p->flags = flags;
> if (res)
> return res;
> if (vb2_queue_is_busy(vdev, file))
> return -EBUSY;
> - res = vb2_core_reqbufs(vdev->queue, p->memory, 0, &p->count);
> + res = vb2_core_reqbufs(vdev->queue, p->memory, p->flags, &p->count);
> /* If count == 0, then the owner has released all buffers and he
> is no longer owner of the queue. Otherwise we have a new owner. */
> if (res == 0)
> @@ -993,6 +1017,7 @@ int vb2_ioctl_create_bufs(struct file *file, void *priv,
>
> p->index = vdev->queue->num_buffers;
> fill_buf_caps(vdev->queue, &p->capabilities);
> + validate_memory_flags(vdev->queue, p->memory, &p->flags);
> /*
> * If count == 0, then just check if memory and type are valid.
> * Any -EBUSY result from vb2_verify_memory_type can be mapped to 0.
> diff --git a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> index 47aff3b19742..8176769a89fa 100644
> --- a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> +++ b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> @@ -126,6 +126,9 @@ struct v4l2_format32 {
> * @memory: buffer memory type
> * @format: frame format, for which buffers are requested
> * @capabilities: capabilities of this buffer type.
> + * @flags: additional buffer management attributes (ignored unless the
> + * queue has V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS capability and
> + * configured for MMAP streaming I/O).
> * @reserved: future extensions
> */
> struct v4l2_create_buffers32 {
> @@ -134,7 +137,8 @@ struct v4l2_create_buffers32 {
> __u32 memory; /* enum v4l2_memory */
> struct v4l2_format32 format;
> __u32 capabilities;
> - __u32 reserved[7];
> + __u32 flags;
> + __u32 reserved[6];
> };
>
> static int get_v4l2_format32(struct v4l2_format *p64,
> @@ -182,6 +186,8 @@ static int get_v4l2_create32(struct v4l2_create_buffers *p64,
> if (copy_from_user(p64, p32,
> offsetof(struct v4l2_create_buffers32, format)))
> return -EFAULT;
> + if (copy_from_user(&p64->flags, &p32->flags, sizeof(p32->flags)))
> + return -EFAULT;
> return get_v4l2_format32(&p64->format, &p32->format);
> }
>
> @@ -227,6 +233,7 @@ static int put_v4l2_create32(struct v4l2_create_buffers *p64,
> if (copy_to_user(p32, p64,
> offsetof(struct v4l2_create_buffers32, format)) ||
> put_user(p64->capabilities, &p32->capabilities) ||
> + put_user(p64->flags, &p32->flags) ||
> copy_to_user(p32->reserved, p64->reserved, sizeof(p64->reserved)))
> return -EFAULT;
> return put_v4l2_format32(&p64->format, &p32->format);
> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> index 05d5db3d85e5..6a941da33998 100644
> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> @@ -2004,7 +2004,7 @@ static int v4l_reqbufs(const struct v4l2_ioctl_ops *ops,
> if (ret)
> return ret;
>
> - CLEAR_AFTER_FIELD(p, capabilities);
> + CLEAR_AFTER_FIELD(p, flags);
>
> return ops->vidioc_reqbufs(file, fh, p);
> }
> @@ -2045,7 +2045,7 @@ static int v4l_create_bufs(const struct v4l2_ioctl_ops *ops,
> if (ret)
> return ret;
>
> - CLEAR_AFTER_FIELD(create, capabilities);
> + CLEAR_AFTER_FIELD(create, flags);
>
> v4l_sanitize_format(&create->format);
>
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 9d11e1d9c934..7973aa0465d2 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -953,7 +953,8 @@ struct v4l2_requestbuffers {
> __u32 type; /* enum v4l2_buf_type */
> __u32 memory; /* enum v4l2_memory */
> __u32 capabilities;
> - __u32 reserved[1];
> + __u8 flags;
> + __u8 reserved[3];
> };
>
> #define V4L2_MEMORY_FLAG_NON_COHERENT (1 << 0)
> @@ -2501,6 +2502,9 @@ struct v4l2_dbg_chip_info {
> * @memory: enum v4l2_memory; buffer memory type
> * @format: frame format, for which buffers are requested
> * @capabilities: capabilities of this buffer type.
> + * @flags: additional buffer management attributes (ignored unless the
> + * queue has V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS capability
> + * and configured for MMAP streaming I/O).
> * @reserved: future extensions
> */
> struct v4l2_create_buffers {
> @@ -2509,7 +2513,8 @@ struct v4l2_create_buffers {
> __u32 memory;
> struct v4l2_format format;
> __u32 capabilities;
> - __u32 reserved[7];
> + __u32 flags;
> + __u32 reserved[6];
> };
>
> /*
>


2021-08-03 08:40:52

by Hans Verkuil

[permalink] [raw]
Subject: Re: [PATCHv4 8/8] videobuf2: handle non-contiguous DMA allocations

On 03/08/2021 10:33, Hans Verkuil wrote:
> On 27/07/2021 09:05, Sergey Senozhatsky wrote:
>> This adds support for new noncontiguous DMA API, which
>
> for -> for the
>
>> requires allocators to have two execution branches: one
>> for the current API, and one for the new one.
>>
>> Signed-off-by: Sergey Senozhatsky <[email protected]>
>> Acked-by: Christoph Hellwig <[email protected]>
>> ---
>> .../common/videobuf2/videobuf2-dma-contig.c | 142 +++++++++++++++---
>> 1 file changed, 117 insertions(+), 25 deletions(-)
>>
>> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> index 1e218bc440c6..10f73e27d694 100644
>> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> @@ -17,6 +17,7 @@
>> #include <linux/sched.h>
>> #include <linux/slab.h>
>> #include <linux/dma-mapping.h>
>> +#include <linux/highmem.h>
>>
>> #include <media/videobuf2-v4l2.h>
>> #include <media/videobuf2-dma-contig.h>
>> @@ -42,6 +43,7 @@ struct vb2_dc_buf {
>> struct dma_buf_attachment *db_attach;
>>
>> struct vb2_buffer *vb;
>> + bool coherent_mem;
>
> I think that this as well should be renamed to non_coherent_mem.
>
>> };
>>
>> /*********************************************/
>> @@ -78,14 +80,22 @@ static void *vb2_dc_cookie(struct vb2_buffer *vb, void *buf_priv)
>> static void *vb2_dc_vaddr(struct vb2_buffer *vb, void *buf_priv)
>> {
>> struct vb2_dc_buf *buf = buf_priv;
>> - struct dma_buf_map map;
>> - int ret;
>>
>> - if (!buf->vaddr && buf->db_attach) {
>> - ret = dma_buf_vmap(buf->db_attach->dmabuf, &map);
>> - buf->vaddr = ret ? NULL : map.vaddr;
>> + if (buf->vaddr)
>> + return buf->vaddr;
>> +
>> + if (buf->db_attach) {
>> + struct dma_buf_map map;
>> +
>> + if (!dma_buf_vmap(buf->db_attach->dmabuf, &map))
>> + buf->vaddr = map.vaddr;
>> +
>> + return buf->vaddr;
>> }
>>
>> + if (!buf->coherent_mem)
>> + buf->vaddr = dma_vmap_noncontiguous(buf->dev, buf->size,
>> + buf->dma_sgt);
>> return buf->vaddr;
>> }
>>
>> @@ -101,13 +111,26 @@ static void vb2_dc_prepare(void *buf_priv)
>> struct vb2_dc_buf *buf = buf_priv;
>> struct sg_table *sgt = buf->dma_sgt;
>>
>> + /* This takes care of DMABUF and user-enforced cache sync hint */
>> if (buf->vb->skip_cache_sync_on_prepare)
>> return;
>>
>> + /*
>> + * Coherent MMAP buffers do not need to be synced, unlike USERPTR
>> + * and non-coherent MMAP buffers.
>> + */
>> + if (buf->vb->memory == V4L2_MEMORY_MMAP && buf->coherent_mem)
>> + return;
>> +
>> if (!sgt)
>> return;
>>
>> + /* For both USERPTR and non-coherent MMAP */
>> dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
>> +
>> + /* Non-coherent MMAP only */
>> + if (!buf->coherent_mem && buf->vaddr)
>> + flush_kernel_vmap_range(buf->vaddr, buf->size);
>> }
>>
>> static void vb2_dc_finish(void *buf_priv)
>> @@ -115,13 +138,26 @@ static void vb2_dc_finish(void *buf_priv)
>> struct vb2_dc_buf *buf = buf_priv;
>> struct sg_table *sgt = buf->dma_sgt;
>>
>> + /* This takes care of DMABUF and user-enforced cache sync hint */
>> if (buf->vb->skip_cache_sync_on_finish)
>> return;
>>
>> + /*
>> + * Coherent MMAP buffers do not need to be synced, unlike USERPTR
>> + * and non-coherent MMAP buffers.
>> + */
>> + if (buf->vb->memory == V4L2_MEMORY_MMAP && buf->coherent_mem)
>> + return;
>> +
>> if (!sgt)
>> return;
>>
>> + /* For both USERPTR and non-coherent MMAP */
>> dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
>> +
>> + /* Non-coherent MMAP only */
>> + if (!buf->coherent_mem && buf->vaddr)
>> + invalidate_kernel_vmap_range(buf->vaddr, buf->size);
>> }
>>
>> /*********************************************/
>> @@ -139,17 +175,66 @@ static void vb2_dc_put(void *buf_priv)
>> sg_free_table(buf->sgt_base);
>> kfree(buf->sgt_base);
>> }
>> - dma_free_attrs(buf->dev, buf->size, buf->cookie, buf->dma_addr,
>> - buf->attrs);
>> +
>> + if (buf->coherent_mem) {
>> + dma_free_attrs(buf->dev, buf->size, buf->cookie,
>> + buf->dma_addr, buf->attrs);
>> + } else {
>> + if (buf->vaddr)
>> + dma_vunmap_noncontiguous(buf->dev, buf->vaddr);
>> + dma_free_noncontiguous(buf->dev, buf->size,
>> + buf->dma_sgt, buf->dma_dir);
>> + }
>> put_device(buf->dev);
>> kfree(buf);
>> }
>>
>> +static int vb2_dc_alloc_coherent(struct vb2_dc_buf *buf)
>> +{
>> + struct vb2_queue *q = buf->vb->vb2_queue;
>> +
>> + buf->cookie = dma_alloc_attrs(buf->dev,
>> + buf->size,
>> + &buf->dma_addr,
>> + GFP_KERNEL | q->gfp_flags,
>> + buf->attrs);
>> + if (!buf->cookie)
>> + return -ENOMEM;
>> +
>> + if (q->dma_attrs & DMA_ATTR_NO_KERNEL_MAPPING)
>> + return 0;
>> +
>> + buf->vaddr = buf->cookie;
>> + return 0;
>> +}
>> +
>> +static int vb2_dc_alloc_non_coherent(struct vb2_dc_buf *buf)
>> +{
>> + struct vb2_queue *q = buf->vb->vb2_queue;
>> +
>> + buf->dma_sgt = dma_alloc_noncontiguous(buf->dev,
>> + buf->size,
>> + buf->dma_dir,
>> + GFP_KERNEL | q->gfp_flags,
>> + buf->attrs);
>> + if (!buf->dma_sgt)
>> + return -ENOMEM;
>> +
>> + buf->dma_addr = sg_dma_address(buf->dma_sgt->sgl);
>> +
>> + /*
>> + * For requests that need kernel mapping (DMA_ATTR_NO_KERNEL_MAPPING
>> + * bit is cleared) we perform dma_vmap_noncontiguous() in vb2_dc_vadd().

Typo: vb2_dc_vadd -> vb2_dc_vaddr

Regards,

Hans

2021-08-03 10:16:29

by Hans Verkuil

[permalink] [raw]
Subject: Re: [PATCHv4 8/8] videobuf2: handle non-contiguous DMA allocations

Hi Sergey,

On 27/07/2021 09:05, Sergey Senozhatsky wrote:
> This adds support for new noncontiguous DMA API, which
> requires allocators to have two execution branches: one
> for the current API, and one for the new one.
>
> Signed-off-by: Sergey Senozhatsky <[email protected]>
> Acked-by: Christoph Hellwig <[email protected]>
> ---
> .../common/videobuf2/videobuf2-dma-contig.c | 142 +++++++++++++++---
> 1 file changed, 117 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> index 1e218bc440c6..10f73e27d694 100644
> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> @@ -17,6 +17,7 @@
> #include <linux/sched.h>
> #include <linux/slab.h>
> #include <linux/dma-mapping.h>
> +#include <linux/highmem.h>
>
> #include <media/videobuf2-v4l2.h>
> #include <media/videobuf2-dma-contig.h>
> @@ -42,6 +43,7 @@ struct vb2_dc_buf {
> struct dma_buf_attachment *db_attach;
>
> struct vb2_buffer *vb;
> + bool coherent_mem;
> };
>
> /*********************************************/
> @@ -78,14 +80,22 @@ static void *vb2_dc_cookie(struct vb2_buffer *vb, void *buf_priv)
> static void *vb2_dc_vaddr(struct vb2_buffer *vb, void *buf_priv)
> {
> struct vb2_dc_buf *buf = buf_priv;
> - struct dma_buf_map map;
> - int ret;
>
> - if (!buf->vaddr && buf->db_attach) {
> - ret = dma_buf_vmap(buf->db_attach->dmabuf, &map);
> - buf->vaddr = ret ? NULL : map.vaddr;
> + if (buf->vaddr)
> + return buf->vaddr;
> +
> + if (buf->db_attach) {
> + struct dma_buf_map map;
> +
> + if (!dma_buf_vmap(buf->db_attach->dmabuf, &map))
> + buf->vaddr = map.vaddr;
> +
> + return buf->vaddr;
> }
>
> + if (!buf->coherent_mem)
> + buf->vaddr = dma_vmap_noncontiguous(buf->dev, buf->size,
> + buf->dma_sgt);
> return buf->vaddr;
> }

This function really needs a bunch of comments.

What I want to see here specifically is under which circumstances this function
can return NULL.

- dma_buf_vmap returns an error
- for non-coherent memory dma_vmap_noncontiguous returns an error
- coherent memory with DMA_ATTR_NO_KERNEL_MAPPING set.

In the latter case, if a buffer with coherent memory and DMA_ATTR_NO_KERNEL_MAPPING
is exported as a dma_buf, and dma_buf_vmap is called by the importer of this dma-buf,
what happens then? I think that in that case dma_buf_vmap should return an error?

See also my comment below for the vb2_dc_dmabuf_ops_vmap function.

>
> @@ -101,13 +111,26 @@ static void vb2_dc_prepare(void *buf_priv)
> struct vb2_dc_buf *buf = buf_priv;
> struct sg_table *sgt = buf->dma_sgt;
>
> + /* This takes care of DMABUF and user-enforced cache sync hint */
> if (buf->vb->skip_cache_sync_on_prepare)
> return;
>
> + /*
> + * Coherent MMAP buffers do not need to be synced, unlike USERPTR
> + * and non-coherent MMAP buffers.
> + */
> + if (buf->vb->memory == V4L2_MEMORY_MMAP && buf->coherent_mem)
> + return;
> +
> if (!sgt)
> return;
>
> + /* For both USERPTR and non-coherent MMAP */
> dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
> +
> + /* Non-coherent MMAP only */
> + if (!buf->coherent_mem && buf->vaddr)
> + flush_kernel_vmap_range(buf->vaddr, buf->size);
> }
>
> static void vb2_dc_finish(void *buf_priv)
> @@ -115,13 +138,26 @@ static void vb2_dc_finish(void *buf_priv)
> struct vb2_dc_buf *buf = buf_priv;
> struct sg_table *sgt = buf->dma_sgt;
>
> + /* This takes care of DMABUF and user-enforced cache sync hint */
> if (buf->vb->skip_cache_sync_on_finish)
> return;
>
> + /*
> + * Coherent MMAP buffers do not need to be synced, unlike USERPTR
> + * and non-coherent MMAP buffers.
> + */
> + if (buf->vb->memory == V4L2_MEMORY_MMAP && buf->coherent_mem)
> + return;
> +
> if (!sgt)
> return;
>
> + /* For both USERPTR and non-coherent MMAP */
> dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
> +
> + /* Non-coherent MMAP only */
> + if (!buf->coherent_mem && buf->vaddr)
> + invalidate_kernel_vmap_range(buf->vaddr, buf->size);
> }
>
> /*********************************************/
> @@ -139,17 +175,66 @@ static void vb2_dc_put(void *buf_priv)
> sg_free_table(buf->sgt_base);
> kfree(buf->sgt_base);
> }
> - dma_free_attrs(buf->dev, buf->size, buf->cookie, buf->dma_addr,
> - buf->attrs);
> +
> + if (buf->coherent_mem) {
> + dma_free_attrs(buf->dev, buf->size, buf->cookie,
> + buf->dma_addr, buf->attrs);
> + } else {
> + if (buf->vaddr)
> + dma_vunmap_noncontiguous(buf->dev, buf->vaddr);
> + dma_free_noncontiguous(buf->dev, buf->size,
> + buf->dma_sgt, buf->dma_dir);
> + }
> put_device(buf->dev);
> kfree(buf);
> }
>
> +static int vb2_dc_alloc_coherent(struct vb2_dc_buf *buf)
> +{
> + struct vb2_queue *q = buf->vb->vb2_queue;
> +
> + buf->cookie = dma_alloc_attrs(buf->dev,
> + buf->size,
> + &buf->dma_addr,
> + GFP_KERNEL | q->gfp_flags,
> + buf->attrs);
> + if (!buf->cookie)
> + return -ENOMEM;
> +
> + if (q->dma_attrs & DMA_ATTR_NO_KERNEL_MAPPING)
> + return 0;
> +
> + buf->vaddr = buf->cookie;
> + return 0;
> +}
> +
> +static int vb2_dc_alloc_non_coherent(struct vb2_dc_buf *buf)
> +{
> + struct vb2_queue *q = buf->vb->vb2_queue;
> +
> + buf->dma_sgt = dma_alloc_noncontiguous(buf->dev,
> + buf->size,
> + buf->dma_dir,
> + GFP_KERNEL | q->gfp_flags,
> + buf->attrs);
> + if (!buf->dma_sgt)
> + return -ENOMEM;
> +
> + buf->dma_addr = sg_dma_address(buf->dma_sgt->sgl);
> +
> + /*
> + * For requests that need kernel mapping (DMA_ATTR_NO_KERNEL_MAPPING
> + * bit is cleared) we perform dma_vmap_noncontiguous() in vb2_dc_vadd().
> + */
> + return 0;
> +}
> +
> static void *vb2_dc_alloc(struct vb2_buffer *vb,
> struct device *dev,
> unsigned long size)
> {
> struct vb2_dc_buf *buf;
> + int ret;
>
> if (WARN_ON(!dev))
> return ERR_PTR(-EINVAL);
> @@ -159,27 +244,28 @@ static void *vb2_dc_alloc(struct vb2_buffer *vb,
> return ERR_PTR(-ENOMEM);
>
> buf->attrs = vb->vb2_queue->dma_attrs;
> - buf->cookie = dma_alloc_attrs(dev, size, &buf->dma_addr,
> - GFP_KERNEL | vb->vb2_queue->gfp_flags,
> - buf->attrs);
> - if (!buf->cookie) {
> - dev_err(dev, "dma_alloc_coherent of size %ld failed\n", size);
> - kfree(buf);
> - return ERR_PTR(-ENOMEM);
> - }
> -
> - if ((buf->attrs & DMA_ATTR_NO_KERNEL_MAPPING) == 0)
> - buf->vaddr = buf->cookie;
> + buf->dma_dir = vb->vb2_queue->dma_dir;
> + buf->vb = vb;
> + buf->coherent_mem = vb->vb2_queue->coherent_mem;
>
> + buf->size = size;
> /* Prevent the device from being released while the buffer is used */
> buf->dev = get_device(dev);
> - buf->size = size;
> - buf->dma_dir = vb->vb2_queue->dma_dir;
> +
> + if (buf->coherent_mem)
> + ret = vb2_dc_alloc_coherent(buf);
> + else
> + ret = vb2_dc_alloc_non_coherent(buf);
> +
> + if (ret) {
> + dev_err(dev, "dma alloc of size %ld failed\n", size);
> + kfree(buf);
> + return ERR_PTR(-ENOMEM);
> + }
>
> buf->handler.refcount = &buf->refcount;
> buf->handler.put = vb2_dc_put;
> buf->handler.arg = buf;
> - buf->vb = vb;
>
> refcount_set(&buf->refcount, 1);
>
> @@ -196,9 +282,12 @@ static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma)
> return -EINVAL;
> }
>
> - ret = dma_mmap_attrs(buf->dev, vma, buf->cookie,
> - buf->dma_addr, buf->size, buf->attrs);
> -
> + if (buf->coherent_mem)
> + ret = dma_mmap_attrs(buf->dev, vma, buf->cookie, buf->dma_addr,
> + buf->size, buf->attrs);
> + else
> + ret = dma_mmap_noncontiguous(buf->dev, vma, buf->size,
> + buf->dma_sgt);
> if (ret) {
> pr_err("Remapping memory failed, error: %d\n", ret);
> return ret;
> @@ -362,7 +451,7 @@ static int vb2_dc_dmabuf_ops_vmap(struct dma_buf *dbuf, struct dma_buf_map *map)
> {
> struct vb2_dc_buf *buf = dbuf->priv;
>
> - dma_buf_map_set_vaddr(map, buf->vaddr);
> + dma_buf_map_set_vaddr(map, vb2_dc_vaddr(buf->vb, buf));

vb2_dc_vaddr() can return NULL, shouldn't this function return an error in that case?

BTW, looking at where vb2_plane_vaddr() is called in drivers I notice that most (all?)
drivers do not check for NULL. Somewhat scary, to be honest. That's a separate issue, though.

Regards,

Hans

>
> return 0;
> }
> @@ -390,6 +479,9 @@ static struct sg_table *vb2_dc_get_base_sgt(struct vb2_dc_buf *buf)
> int ret;
> struct sg_table *sgt;
>
> + if (!buf->coherent_mem)
> + return buf->dma_sgt;
> +
> sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
> if (!sgt) {
> dev_err(buf->dev, "failed to alloc sg table\n");
>


2021-08-17 10:43:12

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCHv4 4/8] videobuf2: move cache_hints handling to allocators

On (21/08/03 10:11), Hans Verkuil wrote:
> On 27/07/2021 09:05, Sergey Senozhatsky wrote:
> > This moves cache hints handling from videobuf2 core down
>
> from -> from the
>
> > to allocators level, because allocators do the sync/flush
>
> to allocators -> to the allocator's

Done, thanks.

2021-08-17 10:44:33

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCHv4 6/8] videobuf2: add queue memory coherency parameter

On (21/08/03 10:29), Hans Verkuil wrote:
> >
> > +static void set_queue_coherency(struct vb2_queue *q, bool coherent_mem)
> > +{
> > + q->coherent_mem = 1;
>
> This I do not like: coherent memory is the default, and so I think it will
> be more robust if this field is renamed to non_coherent_mem and so coherent
> memory will be the default since this field will be cleared initially with
> kzalloc.
>
> Basically a similar argument that you used in patch 2/8.
>
> I also think that it improves readability, since non-coherent is the
> exceptional case, not the rule, and the field name corresponds with the
> V4L2 memory flag name.
>
> I noticed that in v1 of this series it was actually called non_coherent_mem,
> and it was changed in v2, actually after some comments from me.
>
> But I changed my mind on that, and I think it makes more sense to go back to
> calling this non_coherent_mem.

Ok, done. Hans, is this your final decision? :)

2021-08-17 10:45:48

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCHv4 7/8] videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag

On (21/08/03 10:31), Hans Verkuil wrote:
>
> s/to//
>

Done, thanks.

2021-08-17 10:46:58

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCHv4 5/8] videobuf2: add V4L2_MEMORY_FLAG_NON_COHERENT flag

On (21/08/03 10:12), Hans Verkuil wrote:
> On 27/07/2021 09:05, Sergey Senozhatsky wrote:
> > By setting or clearing V4L2_MEMORY_FLAG_NON_COHERENT flag
>
> clearing -> clearing the
>
> > user-space should be able to hint vb2 that either a non-coherent
>
> either a -> either

Done, thanks.

2021-08-17 11:59:06

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCHv4 8/8] videobuf2: handle non-contiguous DMA allocations

Hi,

On (21/08/03 12:15), Hans Verkuil wrote:
> > static void *vb2_dc_vaddr(struct vb2_buffer *vb, void *buf_priv)
> > {
> > struct vb2_dc_buf *buf = buf_priv;
> > - struct dma_buf_map map;
> > - int ret;
> >
> > - if (!buf->vaddr && buf->db_attach) {
> > - ret = dma_buf_vmap(buf->db_attach->dmabuf, &map);
> > - buf->vaddr = ret ? NULL : map.vaddr;
> > + if (buf->vaddr)
> > + return buf->vaddr;
> > +
> > + if (buf->db_attach) {
> > + struct dma_buf_map map;
> > +
> > + if (!dma_buf_vmap(buf->db_attach->dmabuf, &map))
> > + buf->vaddr = map.vaddr;
> > +
> > + return buf->vaddr;
> > }
> >
> > + if (!buf->coherent_mem)
> > + buf->vaddr = dma_vmap_noncontiguous(buf->dev, buf->size,
> > + buf->dma_sgt);
> > return buf->vaddr;
> > }
>
> This function really needs a bunch of comments.
>
> What I want to see here specifically is under which circumstances this function
> can return NULL.
>
> - dma_buf_vmap returns an error
> - for non-coherent memory dma_vmap_noncontiguous returns an error
> - coherent memory with DMA_ATTR_NO_KERNEL_MAPPING set.

OK, I added some comments.

> In the latter case, if a buffer with coherent memory and DMA_ATTR_NO_KERNEL_MAPPING
> is exported as a dma_buf, and dma_buf_vmap is called by the importer of this dma-buf,
> what happens then? I think that in that case dma_buf_vmap should return an error?

Should we error out in vb2_dc_vaddr() in this case?

---

diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index d4089d0b5ec5..e1d8ae1548fa 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -102,6 +102,9 @@ static void *vb2_dc_vaddr(struct vb2_buffer *vb, void *buf_priv)
if (buf->db_attach) {
struct dma_buf_map map;

+ if (WARN_ON(buf->attrs & DMA_ATTR_NO_KERNEL_MAPPING))
+ return NULL;
+
if (!dma_buf_vmap(buf->db_attach->dmabuf, &map))
buf->vaddr = map.vaddr;


---


[..]
> > @@ -362,7 +451,7 @@ static int vb2_dc_dmabuf_ops_vmap(struct dma_buf *dbuf, struct dma_buf_map *map)
> > {
> > struct vb2_dc_buf *buf = dbuf->priv;
> >
> > - dma_buf_map_set_vaddr(map, buf->vaddr);
> > + dma_buf_map_set_vaddr(map, vb2_dc_vaddr(buf->vb, buf));
>
> vb2_dc_vaddr() can return NULL, shouldn't this function return an error in that case?

Done, thanks.

> BTW, looking at where vb2_plane_vaddr() is called in drivers I notice that most (all?)
> drivers do not check for NULL. Somewhat scary, to be honest. That's a separate issue, though.

I may have some time in the future and can add missing if-s to the
drivers.

2021-08-18 09:23:47

by Tomasz Figa

[permalink] [raw]
Subject: Re: [PATCHv4 8/8] videobuf2: handle non-contiguous DMA allocations

On Tue, Aug 17, 2021 at 8:57 PM Sergey Senozhatsky
<[email protected]> wrote:
>
> Hi,
>
> On (21/08/03 12:15), Hans Verkuil wrote:
> > > static void *vb2_dc_vaddr(struct vb2_buffer *vb, void *buf_priv)
> > > {
> > > struct vb2_dc_buf *buf = buf_priv;
> > > - struct dma_buf_map map;
> > > - int ret;
> > >
> > > - if (!buf->vaddr && buf->db_attach) {
> > > - ret = dma_buf_vmap(buf->db_attach->dmabuf, &map);
> > > - buf->vaddr = ret ? NULL : map.vaddr;
> > > + if (buf->vaddr)
> > > + return buf->vaddr;
> > > +
> > > + if (buf->db_attach) {
> > > + struct dma_buf_map map;
> > > +
> > > + if (!dma_buf_vmap(buf->db_attach->dmabuf, &map))
> > > + buf->vaddr = map.vaddr;
> > > +
> > > + return buf->vaddr;
> > > }
> > >
> > > + if (!buf->coherent_mem)
> > > + buf->vaddr = dma_vmap_noncontiguous(buf->dev, buf->size,
> > > + buf->dma_sgt);
> > > return buf->vaddr;
> > > }
> >
> > This function really needs a bunch of comments.
> >
> > What I want to see here specifically is under which circumstances this function
> > can return NULL.
> >
> > - dma_buf_vmap returns an error
> > - for non-coherent memory dma_vmap_noncontiguous returns an error
> > - coherent memory with DMA_ATTR_NO_KERNEL_MAPPING set.
>
> OK, I added some comments.
>
> > In the latter case, if a buffer with coherent memory and DMA_ATTR_NO_KERNEL_MAPPING
> > is exported as a dma_buf, and dma_buf_vmap is called by the importer of this dma-buf,
> > what happens then? I think that in that case dma_buf_vmap should return an error?
>
> Should we error out in vb2_dc_vaddr() in this case?

Yes, because there is no way to create a kernel mapping for buffer
allocated with dma_alloc_coherent() post-factum. Of course it's not
the case for dma_alloc_noncontiguous() for which we can create the
kernel mapping on demand.

>
> ---
>
> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> index d4089d0b5ec5..e1d8ae1548fa 100644
> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> @@ -102,6 +102,9 @@ static void *vb2_dc_vaddr(struct vb2_buffer *vb, void *buf_priv)
> if (buf->db_attach) {
> struct dma_buf_map map;
>
> + if (WARN_ON(buf->attrs & DMA_ATTR_NO_KERNEL_MAPPING))
> + return NULL;
> +

Why here? It's the case for buffers imported _into_ vb2, not exported
from vb2 and imported to other users.

> if (!dma_buf_vmap(buf->db_attach->dmabuf, &map))
> buf->vaddr = map.vaddr;
>
>
> ---
>
>
> [..]
> > > @@ -362,7 +451,7 @@ static int vb2_dc_dmabuf_ops_vmap(struct dma_buf *dbuf, struct dma_buf_map *map)
> > > {
> > > struct vb2_dc_buf *buf = dbuf->priv;
> > >
> > > - dma_buf_map_set_vaddr(map, buf->vaddr);
> > > + dma_buf_map_set_vaddr(map, vb2_dc_vaddr(buf->vb, buf));
> >
> > vb2_dc_vaddr() can return NULL, shouldn't this function return an error in that case?
>
> Done, thanks.
>
> > BTW, looking at where vb2_plane_vaddr() is called in drivers I notice that most (all?)
> > drivers do not check for NULL. Somewhat scary, to be honest. That's a separate issue, though.
>
> I may have some time in the future and can add missing if-s to the
> drivers.

2021-08-23 10:30:07

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCHv4 8/8] videobuf2: handle non-contiguous DMA allocations

On (21/08/03 10:33), Hans Verkuil wrote:
> > struct dma_buf_attachment *db_attach;
> >
> > struct vb2_buffer *vb;
> > + bool coherent_mem;
>
> I think that this as well should be renamed to non_coherent_mem.
>

Done.

2021-08-23 10:32:08

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCHv4 8/8] videobuf2: handle non-contiguous DMA allocations

On (21/08/03 12:15), Hans Verkuil wrote:
> > static void *vb2_dc_vaddr(struct vb2_buffer *vb, void *buf_priv)
> > {
> > struct vb2_dc_buf *buf = buf_priv;
> > - struct dma_buf_map map;
> > - int ret;
> >
> > - if (!buf->vaddr && buf->db_attach) {
> > - ret = dma_buf_vmap(buf->db_attach->dmabuf, &map);
> > - buf->vaddr = ret ? NULL : map.vaddr;
> > + if (buf->vaddr)
> > + return buf->vaddr;
> > +
> > + if (buf->db_attach) {
> > + struct dma_buf_map map;
> > +
> > + if (!dma_buf_vmap(buf->db_attach->dmabuf, &map))
> > + buf->vaddr = map.vaddr;
> > +
> > + return buf->vaddr;
> > }
> >
> > + if (!buf->coherent_mem)
> > + buf->vaddr = dma_vmap_noncontiguous(buf->dev, buf->size,
> > + buf->dma_sgt);
> > return buf->vaddr;
> > }
>
> This function really needs a bunch of comments.
>
> What I want to see here specifically is under which circumstances this function
> can return NULL.
>
> - dma_buf_vmap returns an error
> - for non-coherent memory dma_vmap_noncontiguous returns an error
> - coherent memory with DMA_ATTR_NO_KERNEL_MAPPING set.

Done.

2021-08-23 10:32:59

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCHv4 8/8] videobuf2: handle non-contiguous DMA allocations

On (21/08/03 10:39), Hans Verkuil wrote:
> >> +
> >> + /*
> >> + * For requests that need kernel mapping (DMA_ATTR_NO_KERNEL_MAPPING
> >> + * bit is cleared) we perform dma_vmap_noncontiguous() in vb2_dc_vadd().
>
> Typo: vb2_dc_vadd -> vb2_dc_vaddr

Done.