2017-12-11 18:28:01

by Gustavo Padovan

[permalink] [raw]
Subject: [PATCH v6 0/6] V4L2 Explicit Synchronization

From: Gustavo Padovan <[email protected]>

Hi,

One more iteration of the explicit fences patches, please refer
to the previous version[1] for more details about the general
mechanism

This version makes the patchset and the implementation much more
simple, to start we are not using a ordered capability anymore,
but instead we have a VIDIOC_ENUM_FMT flag to tell when the
queue in not ordered. Drivers with ordered queues/formats don't
need implement anything. See patches 1 and 2 for more details.

The implementation of in-fences and out-fences were condensed in
just patches 4 and 5, making it more self-contained and easy to
understand. See the patches for detailed changelog.

Please review! Thanks.

Gustavo.

[1] https://lkml.org/lkml/2017/11/15/550

Gustavo Padovan (6):
[media] vb2: add is_unordered callback for drivers
[media] v4l: add 'unordered' flag to format description ioctl
[media] vb2: add explicit fence user API
[media] vb2: add in-fence support to QBUF
[media] vb2: add out-fence support to QBUF
[media] v4l: Document explicit synchronization behavior

Documentation/media/uapi/v4l/buffer.rst | 15 ++
Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 3 +
Documentation/media/uapi/v4l/vidioc-qbuf.rst | 47 ++++-
Documentation/media/uapi/v4l/vidioc-querybuf.rst | 9 +-
drivers/media/usb/cpia2/cpia2_v4l.c | 2 +-
drivers/media/v4l2-core/Kconfig | 1 +
drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 4 +-
drivers/media/v4l2-core/videobuf2-core.c | 239 +++++++++++++++++++++--
drivers/media/v4l2-core/videobuf2-v4l2.c | 52 ++++-
include/media/videobuf2-core.h | 41 +++-
include/uapi/linux/videodev2.h | 8 +-
11 files changed, 393 insertions(+), 28 deletions(-)

--
2.13.6


2017-12-11 18:28:09

by Gustavo Padovan

[permalink] [raw]
Subject: [PATCH v6 2/6] [media] v4l: add 'unordered' flag to format description ioctl

From: Gustavo Padovan <[email protected]>

For explicit synchronization it important for userspace to know if the
format being used by the driver can deliver the buffers back to userspace
in the same order they were queued with QBUF.

Ordered streams fits nicely in a pipeline with DRM for example, where
ordered buffer are expected.

Signed-off-by: Gustavo Padovan <[email protected]>
---
Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 3 +++
include/uapi/linux/videodev2.h | 1 +
2 files changed, 4 insertions(+)

diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
index 019c513df217..368115f44fc0 100644
--- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
+++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
@@ -116,6 +116,9 @@ one until ``EINVAL`` is returned.
- This format is not native to the device but emulated through
software (usually libv4l2), where possible try to use a native
format instead for better performance.
+ * - ``V4L2_FMT_FLAG_UNORDERED``
+ - 0x0004
+ - This is a format that doesn't guarantee timely order of frames.


Return Value
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 1c095b5a99c5..a8ea632c14f0 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -709,6 +709,7 @@ struct v4l2_fmtdesc {

#define V4L2_FMT_FLAG_COMPRESSED 0x0001
#define V4L2_FMT_FLAG_EMULATED 0x0002
+#define V4L2_FMT_FLAG_UNORDERED 0x0004

/* Frame Size and frame rate enumeration */
/*
--
2.13.6

2017-12-11 18:28:16

by Gustavo Padovan

[permalink] [raw]
Subject: [PATCH v6 4/6] [media] vb2: add in-fence support to QBUF

From: Gustavo Padovan <[email protected]>

Receive in-fence from userspace and add support for waiting on them
before queueing the buffer to the driver. Buffers can't be queued to the
driver before its fences signal. And a buffer can't be queue to the driver
out of the order they were queued from userspace. That means that even if
it fence signal it must wait all other buffers, ahead of it in the queue,
to signal first.

If the fence for some buffer fails we do not queue it to the driver,
instead we mark it as error and wait until the previous buffer is done
to notify userspace of the error. We wait here to deliver the buffers back
to userspace in order.

v7:
- get rid of the fence array stuff for ordering and just use
get_num_buffers_ready() (Hans)
- fix issue of queuing the buffer twice (Hans)
- avoid the dma_fence_wait() in core_qbuf() (Alex)
- merge preparation commit in

v6:
- With fences always keep the order userspace queues the buffers.
- Protect in_fence manipulation with a lock (Brian Starkey)
- check if fences have the same context before adding a fence array
- Fix last_fence ref unbalance in __set_in_fence() (Brian Starkey)
- Clean up fence if __set_in_fence() fails (Brian Starkey)
- treat -EINVAL from dma_fence_add_callback() (Brian Starkey)

v5: - use fence_array to keep buffers ordered in vb2 core when
needed (Brian Starkey)
- keep backward compat on the reserved2 field (Brian Starkey)
- protect fence callback removal with lock (Brian Starkey)

v4:
- Add a comment about dma_fence_add_callback() not returning a
error (Hans)
- Call dma_fence_put(vb->in_fence) if fence signaled (Hans)
- select SYNC_FILE under config VIDEOBUF2_CORE (Hans)
- Move dma_fence_is_signaled() check to __enqueue_in_driver() (Hans)
- Remove list_for_each_entry() in __vb2_core_qbuf() (Hans)
- Remove if (vb->state != VB2_BUF_STATE_QUEUED) from
vb2_start_streaming() (Hans)
- set IN_FENCE flags on __fill_v4l2_buffer (Hans)
- Queue buffers to the driver as soon as they are ready (Hans)
- call fill_user_buffer() after queuing the buffer (Hans)
- add err: label to clean up fence
- add dma_fence_wait() before calling vb2_start_streaming()

v3: - document fence parameter
- remove ternary if at vb2_qbuf() return (Mauro)
- do not change if conditions behaviour (Mauro)

v2:
- fix vb2_queue_or_prepare_buf() ret check
- remove check for VB2_MEMORY_DMABUF only (Javier)
- check num of ready buffers to start streaming
- when queueing, start from the first ready buffer
- handle queue cancel

Signed-off-by: Gustavo Padovan <[email protected]>
---
drivers/media/v4l2-core/Kconfig | 1 +
drivers/media/v4l2-core/videobuf2-core.c | 154 +++++++++++++++++++++++++++----
drivers/media/v4l2-core/videobuf2-v4l2.c | 29 +++++-
include/media/videobuf2-core.h | 14 ++-
4 files changed, 177 insertions(+), 21 deletions(-)

diff --git a/drivers/media/v4l2-core/Kconfig b/drivers/media/v4l2-core/Kconfig
index a35c33686abf..3f988c407c80 100644
--- a/drivers/media/v4l2-core/Kconfig
+++ b/drivers/media/v4l2-core/Kconfig
@@ -83,6 +83,7 @@ config VIDEOBUF_DVB
# Used by drivers that need Videobuf2 modules
config VIDEOBUF2_CORE
select DMA_SHARED_BUFFER
+ select SYNC_FILE
tristate

config VIDEOBUF2_MEMOPS
diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index a8589d96ef72..520aa3c7d9f0 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -346,6 +346,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
vb->index = q->num_buffers + buffer;
vb->type = q->type;
vb->memory = memory;
+ spin_lock_init(&vb->fence_cb_lock);
for (plane = 0; plane < num_planes; ++plane) {
vb->planes[plane].length = plane_sizes[plane];
vb->planes[plane].min_length = plane_sizes[plane];
@@ -928,7 +929,7 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)

switch (state) {
case VB2_BUF_STATE_QUEUED:
- return;
+ break;
case VB2_BUF_STATE_REQUEUEING:
if (q->start_streaming_called)
__enqueue_in_driver(vb);
@@ -938,6 +939,16 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
wake_up(&q->done_wq);
break;
}
+
+ /*
+ * If the in-fence fails the buffer is not queued to the driver
+ * and we have to wait until the previous buffer is done before
+ * we notify userspace that the buffer with error can be dequeued.
+ * That way we maintain the ordering from userspace point of view.
+ */
+ vb = list_next_entry(vb, queued_entry);
+ if (vb && vb->state == VB2_BUF_STATE_ERROR)
+ vb2_buffer_done(vb, vb->state);
}
EXPORT_SYMBOL_GPL(vb2_buffer_done);

@@ -1222,6 +1233,9 @@ static void __enqueue_in_driver(struct vb2_buffer *vb)
{
struct vb2_queue *q = vb->vb2_queue;

+ if (vb->in_fence && !dma_fence_is_signaled(vb->in_fence))
+ return;
+
vb->state = VB2_BUF_STATE_ACTIVE;
atomic_inc(&q->owned_by_drv_count);

@@ -1273,6 +1287,24 @@ static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
return 0;
}

+static int __get_num_ready_buffers(struct vb2_queue *q)
+{
+ struct vb2_buffer *vb;
+ int ready_count = 0;
+ unsigned long flags;
+
+ /* count num of buffers ready in front of the queued_list */
+ list_for_each_entry(vb, &q->queued_list, queued_entry) {
+ spin_lock_irqsave(&vb->fence_cb_lock, flags);
+ if (vb->in_fence && !dma_fence_is_signaled(vb->in_fence))
+ break;
+ ready_count++;
+ spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
+ }
+
+ return ready_count;
+}
+
int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
{
struct vb2_buffer *vb;
@@ -1361,9 +1393,34 @@ static int vb2_start_streaming(struct vb2_queue *q)
return ret;
}

-int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
+static void vb2_qbuf_fence_cb(struct dma_fence *f, struct dma_fence_cb *cb)
+{
+ struct vb2_buffer *vb = container_of(cb, struct vb2_buffer, fence_cb);
+ struct vb2_queue *q = vb->vb2_queue;
+ unsigned long flags;
+
+ spin_lock_irqsave(&vb->fence_cb_lock, flags);
+ if (vb->in_fence->error)
+ vb->state = VB2_BUF_STATE_ERROR;
+
+ dma_fence_put(vb->in_fence);
+ vb->in_fence = NULL;
+
+ if (vb->state == VB2_BUF_STATE_ERROR) {
+ spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
+ return;
+ }
+
+ if (q->start_streaming_called)
+ __enqueue_in_driver(vb);
+ spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
+}
+
+int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
+ struct dma_fence *fence)
{
struct vb2_buffer *vb;
+ unsigned long flags;
int ret;

vb = q->bufs[index];
@@ -1372,16 +1429,18 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
case VB2_BUF_STATE_DEQUEUED:
ret = __buf_prepare(vb, pb);
if (ret)
- return ret;
+ goto err;
break;
case VB2_BUF_STATE_PREPARED:
break;
case VB2_BUF_STATE_PREPARING:
dprintk(1, "buffer still being prepared\n");
- return -EINVAL;
+ ret = -EINVAL;
+ goto err;
default:
dprintk(1, "invalid buffer state %d\n", vb->state);
- return -EINVAL;
+ ret = -EINVAL;
+ goto err;
}

/*
@@ -1392,6 +1451,7 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
q->queued_count++;
q->waiting_for_buffers = false;
vb->state = VB2_BUF_STATE_QUEUED;
+ vb->in_fence = fence;

if (pb)
call_void_bufop(q, copy_timestamp, vb, pb);
@@ -1399,15 +1459,42 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
trace_vb2_qbuf(q, vb);

/*
- * If already streaming, give the buffer to driver for processing.
- * If not, the buffer will be given to driver on next streamon.
+ * For explicit synchronization: If the fence didn't signal
+ * yet we setup a callback to queue the buffer once the fence
+ * signals, and then, return successfully. But if the fence
+ * already signaled we lose the reference we held and queue the
+ * buffer to the driver.
*/
- if (q->start_streaming_called)
- __enqueue_in_driver(vb);
+ spin_lock_irqsave(&vb->fence_cb_lock, flags);
+ if (vb->in_fence) {
+ ret = dma_fence_add_callback(vb->in_fence, &vb->fence_cb,
+ vb2_qbuf_fence_cb);
+ if (ret == -EINVAL) {
+ spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
+ goto err;
+ } else if (!ret) {
+ goto fill;
+ }

- /* Fill buffer information for the userspace */
- if (pb)
- call_void_bufop(q, fill_user_buffer, vb, pb);
+ dma_fence_put(vb->in_fence);
+ vb->in_fence = NULL;
+ }
+
+fill:
+ /*
+ * If already streaming and there is no fence to wait on
+ * give the buffer to driver for processing.
+ */
+ if (q->start_streaming_called) {
+ struct vb2_buffer *b;
+ list_for_each_entry(b, &q->queued_list, queued_entry) {
+ if (b->state != VB2_BUF_STATE_QUEUED)
+ continue;
+ if (b->in_fence)
+ break;
+ __enqueue_in_driver(b);
+ }
+ }

/*
* If streamon has been called, and we haven't yet called
@@ -1416,14 +1503,33 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
* then we can finally call start_streaming().
*/
if (q->streaming && !q->start_streaming_called &&
- q->queued_count >= q->min_buffers_needed) {
+ __get_num_ready_buffers(q) >= q->min_buffers_needed) {
ret = vb2_start_streaming(q);
if (ret)
- return ret;
+ goto err;
}

+ spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
+
+ /* Fill buffer information for the userspace */
+ if (pb)
+ call_void_bufop(q, fill_user_buffer, vb, pb);
+
dprintk(2, "qbuf of buffer %d succeeded\n", vb->index);
return 0;
+
+err:
+ /* Fill buffer information for the userspace */
+ if (pb)
+ call_void_bufop(q, fill_user_buffer, vb, pb);
+
+ if (vb->in_fence) {
+ dma_fence_put(vb->in_fence);
+ vb->in_fence = NULL;
+ }
+
+ return ret;
+
}
EXPORT_SYMBOL_GPL(vb2_core_qbuf);

@@ -1634,6 +1740,8 @@ EXPORT_SYMBOL_GPL(vb2_core_dqbuf);
static void __vb2_queue_cancel(struct vb2_queue *q)
{
unsigned int i;
+ struct vb2_buffer *vb;
+ unsigned long flags;

/*
* Tell driver to stop all transactions and release all queued
@@ -1661,6 +1769,16 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
q->queued_count = 0;
q->error = 0;

+ list_for_each_entry(vb, &q->queued_list, queued_entry) {
+ spin_lock_irqsave(&vb->fence_cb_lock, flags);
+ if (vb->in_fence) {
+ dma_fence_remove_callback(vb->in_fence, &vb->fence_cb);
+ dma_fence_put(vb->in_fence);
+ vb->in_fence = NULL;
+ }
+ spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
+ }
+
/*
* Remove all buffers from videobuf's list...
*/
@@ -1722,7 +1840,7 @@ int vb2_core_streamon(struct vb2_queue *q, unsigned int type)
* Tell driver to start streaming provided sufficient buffers
* are available.
*/
- if (q->queued_count >= q->min_buffers_needed) {
+ if (__get_num_ready_buffers(q) >= q->min_buffers_needed) {
ret = v4l_vb2q_enable_media_source(q);
if (ret)
return ret;
@@ -2242,7 +2360,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
* Queue all buffers.
*/
for (i = 0; i < q->num_buffers; i++) {
- ret = vb2_core_qbuf(q, i, NULL);
+ ret = vb2_core_qbuf(q, i, NULL, NULL);
if (ret)
goto err_reqbufs;
fileio->bufs[i].queued = 1;
@@ -2421,7 +2539,7 @@ static size_t __vb2_perform_fileio(struct vb2_queue *q, char __user *data, size_

if (copy_timestamp)
b->timestamp = ktime_get_ns();
- ret = vb2_core_qbuf(q, index, NULL);
+ ret = vb2_core_qbuf(q, index, NULL, NULL);
dprintk(5, "vb2_dbuf result: %d\n", ret);
if (ret)
return ret;
@@ -2524,7 +2642,7 @@ static int vb2_thread(void *data)
if (copy_timestamp)
vb->timestamp = ktime_get_ns();;
if (!threadio->stop)
- ret = vb2_core_qbuf(q, vb->index, NULL);
+ ret = vb2_core_qbuf(q, vb->index, NULL, NULL);
call_void_qop(q, wait_prepare, q);
if (ret || threadio->stop)
break;
diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
index 4a5244ee2c58..fa1df42d7250 100644
--- a/drivers/media/v4l2-core/videobuf2-v4l2.c
+++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
@@ -23,6 +23,7 @@
#include <linux/sched.h>
#include <linux/freezer.h>
#include <linux/kthread.h>
+#include <linux/sync_file.h>

#include <media/v4l2-dev.h>
#include <media/v4l2-fh.h>
@@ -178,6 +179,12 @@ static int vb2_queue_or_prepare_buf(struct vb2_queue *q, struct v4l2_buffer *b,
return -EINVAL;
}

+ if ((b->fence_fd != 0 && b->fence_fd != -1) &&
+ !(b->flags & V4L2_BUF_FLAG_IN_FENCE)) {
+ dprintk(1, "%s: fence_fd set without IN_FENCE flag\n", opname);
+ return -EINVAL;
+ }
+
return __verify_planes_array(q->bufs[b->index], b);
}

@@ -203,9 +210,14 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
b->timestamp = ns_to_timeval(vb->timestamp);
b->timecode = vbuf->timecode;
b->sequence = vbuf->sequence;
- b->fence_fd = -1;
b->reserved = 0;

+ b->fence_fd = -1;
+ if (vb->in_fence)
+ b->flags |= V4L2_BUF_FLAG_IN_FENCE;
+ else
+ b->flags &= ~V4L2_BUF_FLAG_IN_FENCE;
+
if (q->is_multiplanar) {
/*
* Fill in plane-related data if userspace provided an array
@@ -560,6 +572,7 @@ EXPORT_SYMBOL_GPL(vb2_create_bufs);

int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
{
+ struct dma_fence *fence = NULL;
int ret;

if (vb2_fileio_is_active(q)) {
@@ -568,7 +581,19 @@ int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
}

ret = vb2_queue_or_prepare_buf(q, b, "qbuf");
- return ret ? ret : vb2_core_qbuf(q, b->index, b);
+ if (ret)
+ return ret;
+
+ if (b->flags & V4L2_BUF_FLAG_IN_FENCE) {
+ fence = sync_file_get_fence(b->fence_fd);
+ if (!fence) {
+ dprintk(1, "failed to get in-fence from fd %d\n",
+ b->fence_fd);
+ return -EINVAL;
+ }
+ }
+
+ return vb2_core_qbuf(q, b->index, b, fence);
}
EXPORT_SYMBOL_GPL(vb2_qbuf);

diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index eddb38a2a2f3..3c7683630b89 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -16,6 +16,7 @@
#include <linux/mutex.h>
#include <linux/poll.h>
#include <linux/dma-buf.h>
+#include <linux/dma-fence.h>

#define VB2_MAX_FRAME (32)
#define VB2_MAX_PLANES (8)
@@ -254,11 +255,20 @@ struct vb2_buffer {
* all buffers queued from userspace
* done_entry: entry on the list that stores all buffers ready
* to be dequeued to userspace
+ * in_fence: fence receive from vb2 client to wait on before
+ * using the buffer (queueing to the driver)
+ * fence_cb: fence callback information
+ * fence_cb_lock: protect callback signal/remove
*/
enum vb2_buffer_state state;

struct list_head queued_entry;
struct list_head done_entry;
+
+ struct dma_fence *in_fence;
+ struct dma_fence_cb fence_cb;
+ spinlock_t fence_cb_lock;
+
#ifdef CONFIG_VIDEO_ADV_DEBUG
/*
* Counters for how often these buffer-related ops are
@@ -731,6 +741,7 @@ int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb);
* @index: id number of the buffer
* @pb: buffer structure passed from userspace to vidioc_qbuf handler
* in driver
+ * @fence: in-fence to wait on before queueing the buffer
*
* Should be called from vidioc_qbuf ioctl handler of a driver.
* The passed buffer should have been verified.
@@ -745,7 +756,8 @@ int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb);
* The return values from this function are intended to be directly returned
* from vidioc_qbuf handler in driver.
*/
-int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb);
+int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
+ struct dma_fence *fence);

/**
* vb2_core_dqbuf() - Dequeue a buffer to the userspace
--
2.13.6

2017-12-11 18:28:22

by Gustavo Padovan

[permalink] [raw]
Subject: [PATCH v6 6/6] [media] v4l: Document explicit synchronization behavior

From: Gustavo Padovan <[email protected]>

Add section to VIDIOC_QBUF about it

v5:
- Remove V4L2_CAP_ORDERED
- Add doc about V4L2_FMT_FLAG_UNORDERED

v4:
- Document ordering behavior for in-fences
- Document V4L2_CAP_ORDERED capability
- Remove doc about OUT_FENCE event
- Document immediate return of out-fence in QBUF

v3:
- make the out_fence refer to the current buffer (Hans)
- Note what happens when the IN_FENCE is not set (Hans)

v2:
- mention that fences are files (Hans)
- rework for the new API

Signed-off-by: Gustavo Padovan <[email protected]>
---
Documentation/media/uapi/v4l/vidioc-qbuf.rst | 47 +++++++++++++++++++++++-
Documentation/media/uapi/v4l/vidioc-querybuf.rst | 9 ++++-
2 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/Documentation/media/uapi/v4l/vidioc-qbuf.rst b/Documentation/media/uapi/v4l/vidioc-qbuf.rst
index 9e448a4aa3aa..8809397fb110 100644
--- a/Documentation/media/uapi/v4l/vidioc-qbuf.rst
+++ b/Documentation/media/uapi/v4l/vidioc-qbuf.rst
@@ -54,7 +54,7 @@ When the buffer is intended for output (``type`` is
or ``V4L2_BUF_TYPE_VBI_OUTPUT``) applications must also initialize the
``bytesused``, ``field`` and ``timestamp`` fields, see :ref:`buffer`
for details. Applications must also set ``flags`` to 0. The
-``reserved2`` and ``reserved`` fields must be set to 0. When using the
+``reserved`` field must be set to 0. When using the
:ref:`multi-planar API <planar-apis>`, the ``m.planes`` field must
contain a userspace pointer to a filled-in array of struct
:c:type:`v4l2_plane` and the ``length`` field must be set
@@ -118,6 +118,51 @@ immediately with an ``EAGAIN`` error code when no buffer is available.
The struct :c:type:`v4l2_buffer` structure is specified in
:ref:`buffer`.

+Explicit Synchronization
+------------------------
+
+Explicit Synchronization allows us to control the synchronization of
+shared buffers from userspace by passing fences to the kernel and/or
+receiving them from it. Fences passed to the kernel are named in-fences and
+the kernel should wait on them to signal before using the buffer, i.e., queueing
+it to the driver. On the other side, the kernel can create out-fences for the
+buffers it queues to the drivers. Out-fences signal when the driver is
+finished with buffer, i.e., the buffer is ready. The fences are represented
+as a file and passed as a file descriptor to userspace.
+
+The in-fences are communicated to the kernel at the ``VIDIOC_QBUF`` ioctl
+using the ``V4L2_BUF_FLAG_IN_FENCE`` buffer flag and the `fence_fd` field. If
+an in-fence needs to be passed to the kernel, `fence_fd` should be set to the
+fence file descriptor number and the ``V4L2_BUF_FLAG_IN_FENCE`` should be set
+as well. Setting one but not the other will cause ``VIDIOC_QBUF`` to return
+with error. The fence_fd field will be ignored if the
+``V4L2_BUF_FLAG_IN_FENCE`` is not set.
+
+The videobuf2-core will guarantee that all buffers queued with in-fence will
+be queued to the drivers in the same order. Fence may signal out of order, so
+this guarantee at videobuf2 is necessary to not change ordering.
+
+If the in-fence signals with an error the videobuf2 won't queue the buffer to
+the driver, instead it will flag it with an error. And then wait for the
+previous buffer to complete before asking userspace dequeue the buffer with
+error - to make sure we deliver the buffers back in the correct order.
+
+To get an out-fence back from V4L2 the ``V4L2_BUF_FLAG_OUT_FENCE`` flag should
+be set to ask for a fence to be attached to the buffer. The out-fence fd is
+sent to userspace as a ``VIDIOC_QBUF`` return argument on the `fence_fd` field.
+
+Note the the same `fence_fd` field is used for both sending the in-fence as
+input argument to receive the out-fence as a return argument.
+
+At streamoff the out-fences will either signal normally if the driver waits
+for the operations on the buffers to finish or signal with an error if the
+driver cancels the pending operations. Buffers with in-fences won't be queued
+to the driver if their fences signal. It will be cleaned up.
+
+The ``V4L2_FMT_FLAG_UNORDERED`` flag in ``VIDIOC_ENUM_FMT`` tells userspace
+that the current buffer queues is able to keep the ordering of buffers, i.e.,
+the dequeing of buffers will happen at the same order we queue them, with no
+reordering by the driver.

Return Value
============
diff --git a/Documentation/media/uapi/v4l/vidioc-querybuf.rst b/Documentation/media/uapi/v4l/vidioc-querybuf.rst
index dd54747fabc9..df964c4d916b 100644
--- a/Documentation/media/uapi/v4l/vidioc-querybuf.rst
+++ b/Documentation/media/uapi/v4l/vidioc-querybuf.rst
@@ -44,7 +44,7 @@ and the ``index`` field. Valid index numbers range from zero to the
number of buffers allocated with
:ref:`VIDIOC_REQBUFS` (struct
:c:type:`v4l2_requestbuffers` ``count``) minus
-one. The ``reserved`` and ``reserved2`` fields must be set to 0. When
+one. The ``reserved`` field must be set to 0. When
using the :ref:`multi-planar API <planar-apis>`, the ``m.planes``
field must contain a userspace pointer to an array of struct
:c:type:`v4l2_plane` and the ``length`` field has to be set
@@ -64,6 +64,13 @@ elements will be used instead and the ``length`` field of struct
array elements. The driver may or may not set the remaining fields and
flags, they are meaningless in this context.

+When using in-fences, the ``V4L2_BUF_FLAG_IN_FENCE`` will be set if the
+in-fence didn't signal at the time of the
+:ref:`VIDIOC_QUERYBUF`. The ``V4L2_BUF_FLAG_OUT_FENCE`` will be set if
+the user asked for an out-fence for the buffer, but the ``fence_fd``
+field will be set to -1. In case ``V4L2_BUF_FLAG_OUT_FENCE`` is not set
+``fence_fd`` will be set to 0 for backward compatibility.
+
The struct :c:type:`v4l2_buffer` structure is specified in
:ref:`buffer`.

--
2.13.6

2017-12-11 18:28:50

by Gustavo Padovan

[permalink] [raw]
Subject: [PATCH v6 5/6] [media] vb2: add out-fence support to QBUF

From: Gustavo Padovan <[email protected]>

If V4L2_BUF_FLAG_OUT_FENCE flag is present on the QBUF call we create
an out_fence and send its fd to userspace on the fence_fd field as a
return arg for the QBUF call.

The fence is signaled on buffer_done(), when the job on the buffer is
finished.

v7:
- merge patch that add the infrastructure to out-fences into
this one (Alex Courbot)
- Do not install the fd if there is no fence. (Alex Courbot)
- do not report error on requeueing, just WARN_ON_ONCE() (Hans)

v6
- get rid of the V4L2_EVENT_OUT_FENCE event. We always keep the
ordering in vb2 for queueing in the driver, so the event is not
necessary anymore and the out_fence_fd is sent back to userspace
on QBUF call return arg
- do not allow requeueing with out-fences, instead mark the buffer
with an error and wake up to userspace.
- send the out_fence_fd back to userspace on the fence_fd field

v5:
- delay fd_install to DQ_EVENT (Hans)
- if queue is fully ordered send OUT_FENCE event right away
(Brian)
- rename 'q->ordered' to 'q->ordered_in_driver'
- merge change to implement OUT_FENCE event here

v4:
- return the out_fence_fd in the BUF_QUEUED event(Hans)

v3: - add WARN_ON_ONCE(q->ordered) on requeueing (Hans)
- set the OUT_FENCE flag if there is a fence pending (Hans)
- call fd_install() after vb2_core_qbuf() (Hans)
- clean up fence if vb2_core_qbuf() fails (Hans)
- add list to store sync_file and fence for the next queued buffer

v2: check if the queue is ordered.

Signed-off-by: Gustavo Padovan <[email protected]>
---
drivers/media/v4l2-core/videobuf2-core.c | 99 +++++++++++++++++++++++++++++---
drivers/media/v4l2-core/videobuf2-v4l2.c | 29 +++++++++-
include/media/videobuf2-core.h | 22 +++++++
3 files changed, 139 insertions(+), 11 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index 520aa3c7d9f0..cbe115f00736 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -23,6 +23,7 @@
#include <linux/sched.h>
#include <linux/freezer.h>
#include <linux/kthread.h>
+#include <linux/sync_file.h>

#include <media/videobuf2-core.h>
#include <media/v4l2-mc.h>
@@ -351,6 +352,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
vb->planes[plane].length = plane_sizes[plane];
vb->planes[plane].min_length = plane_sizes[plane];
}
+ vb->out_fence_fd = -1;
q->bufs[vb->index] = vb;

/* Allocate video buffer memory for the MMAP type */
@@ -931,10 +933,20 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
case VB2_BUF_STATE_QUEUED:
break;
case VB2_BUF_STATE_REQUEUEING:
+ /* Requeuing with explicit synchronization, spit warning */
+ WARN_ON_ONCE(vb->out_fence);
+
if (q->start_streaming_called)
__enqueue_in_driver(vb);
- return;
+ break;
default:
+ if (state == VB2_BUF_STATE_ERROR)
+ dma_fence_set_error(vb->out_fence, -EFAULT);
+ dma_fence_signal(vb->out_fence);
+ dma_fence_put(vb->out_fence);
+ vb->out_fence = NULL;
+ vb->out_fence_fd = -1;
+
/* Inform any processes that may be waiting for buffers */
wake_up(&q->done_wq);
break;
@@ -1330,6 +1342,65 @@ int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
}
EXPORT_SYMBOL_GPL(vb2_core_prepare_buf);

+static inline const char *vb2_fence_get_driver_name(struct dma_fence *fence)
+{
+ return "vb2_fence";
+}
+
+static inline const char *vb2_fence_get_timeline_name(struct dma_fence *fence)
+{
+ return "vb2_fence_timeline";
+}
+
+static inline bool vb2_fence_enable_signaling(struct dma_fence *fence)
+{
+ return true;
+}
+
+static const struct dma_fence_ops vb2_fence_ops = {
+ .get_driver_name = vb2_fence_get_driver_name,
+ .get_timeline_name = vb2_fence_get_timeline_name,
+ .enable_signaling = vb2_fence_enable_signaling,
+ .wait = dma_fence_default_wait,
+};
+
+int vb2_setup_out_fence(struct vb2_queue *q, unsigned int index)
+{
+ struct vb2_buffer *vb;
+ u64 context;
+
+ vb = q->bufs[index];
+
+ vb->out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
+
+ if (call_qop(q, is_unordered, q))
+ context = dma_fence_context_alloc(1);
+ else
+ context = q->out_fence_context;
+
+ vb->out_fence = kzalloc(sizeof(*vb->out_fence), GFP_KERNEL);
+ if (!vb->out_fence)
+ return -ENOMEM;
+
+ dma_fence_init(vb->out_fence, &vb2_fence_ops, &q->out_fence_lock,
+ context, 1);
+ if (!vb->out_fence) {
+ put_unused_fd(vb->out_fence_fd);
+ return -ENOMEM;
+ }
+
+ vb->sync_file = sync_file_create(vb->out_fence);
+ if (!vb->sync_file) {
+ put_unused_fd(vb->out_fence_fd);
+ dma_fence_put(vb->out_fence);
+ vb->out_fence = NULL;
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(vb2_setup_out_fence);
+
/*
* vb2_start_streaming() - Attempt to start streaming.
* @q: videobuf2 queue
@@ -1469,18 +1540,16 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
if (vb->in_fence) {
ret = dma_fence_add_callback(vb->in_fence, &vb->fence_cb,
vb2_qbuf_fence_cb);
- if (ret == -EINVAL) {
+ /* is the fence signaled? */
+ if (ret == -ENOENT) {
+ dma_fence_put(vb->in_fence);
+ vb->in_fence = NULL;
+ } else if (ret) {
spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
goto err;
- } else if (!ret) {
- goto fill;
}
-
- dma_fence_put(vb->in_fence);
- vb->in_fence = NULL;
}

-fill:
/*
* If already streaming and there is no fence to wait on
* give the buffer to driver for processing.
@@ -1515,6 +1584,11 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
if (pb)
call_void_bufop(q, fill_user_buffer, vb, pb);

+ if (vb->out_fence) {
+ fd_install(vb->out_fence_fd, vb->sync_file->file);
+ vb->sync_file = NULL;
+ }
+
dprintk(2, "qbuf of buffer %d succeeded\n", vb->index);
return 0;

@@ -1780,6 +1854,12 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
}

/*
+ * Renew out-fence context.
+ */
+ if (!call_qop(q, is_unordered, q))
+ q->out_fence_context = dma_fence_context_alloc(1);
+
+ /*
* Remove all buffers from videobuf's list...
*/
INIT_LIST_HEAD(&q->queued_list);
@@ -2111,6 +2191,9 @@ int vb2_core_queue_init(struct vb2_queue *q)
spin_lock_init(&q->done_lock);
mutex_init(&q->mmap_lock);
init_waitqueue_head(&q->done_wq);
+ if (!call_qop(q, is_unordered, q))
+ q->out_fence_context = dma_fence_context_alloc(1);
+ spin_lock_init(&q->out_fence_lock);

if (q->buf_struct_size == 0)
q->buf_struct_size = sizeof(struct vb2_buffer);
diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
index fa1df42d7250..20b902cfe11a 100644
--- a/drivers/media/v4l2-core/videobuf2-v4l2.c
+++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
@@ -179,12 +179,16 @@ static int vb2_queue_or_prepare_buf(struct vb2_queue *q, struct v4l2_buffer *b,
return -EINVAL;
}

- if ((b->fence_fd != 0 && b->fence_fd != -1) &&
- !(b->flags & V4L2_BUF_FLAG_IN_FENCE)) {
+ if (b->fence_fd > 0 && !(b->flags & V4L2_BUF_FLAG_IN_FENCE)) {
dprintk(1, "%s: fence_fd set without IN_FENCE flag\n", opname);
return -EINVAL;
}

+ if (b->fence_fd == -1 && (b->flags & V4L2_BUF_FLAG_IN_FENCE)) {
+ dprintk(1, "%s: IN_FENCE flag set but no fence_fd\n", opname);
+ return -EINVAL;
+ }
+
return __verify_planes_array(q->bufs[b->index], b);
}

@@ -212,7 +216,13 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
b->sequence = vbuf->sequence;
b->reserved = 0;

- b->fence_fd = -1;
+ if (vb->out_fence) {
+ b->flags |= V4L2_BUF_FLAG_OUT_FENCE;
+ b->fence_fd = vb->out_fence_fd;
+ } else {
+ b->fence_fd = -1;
+ }
+
if (vb->in_fence)
b->flags |= V4L2_BUF_FLAG_IN_FENCE;
else
@@ -489,6 +499,10 @@ int vb2_querybuf(struct vb2_queue *q, struct v4l2_buffer *b)
ret = __verify_planes_array(vb, b);
if (!ret)
vb2_core_querybuf(q, b->index, b);
+
+ /* Do not return the out-fence fd on querybuf */
+ if (vb->out_fence)
+ b->fence_fd = -1;
return ret;
}
EXPORT_SYMBOL(vb2_querybuf);
@@ -593,6 +607,15 @@ int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
}
}

+ if (b->flags & V4L2_BUF_FLAG_OUT_FENCE) {
+ ret = vb2_setup_out_fence(q, b->index);
+ if (ret) {
+ dprintk(1, "failed to set up out-fence\n");
+ dma_fence_put(fence);
+ return ret;
+ }
+ }
+
return vb2_core_qbuf(q, b->index, b, fence);
}
EXPORT_SYMBOL_GPL(vb2_qbuf);
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index 3c7683630b89..66291c631cd6 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -259,6 +259,10 @@ struct vb2_buffer {
* using the buffer (queueing to the driver)
* fence_cb: fence callback information
* fence_cb_lock: protect callback signal/remove
+ * out_fence_fd: the out_fence_fd to be shared with userspace.
+ * out_fence: the out-fence associated with the buffer once
+ * it is queued to the driver.
+ * sync_file: the sync file to wrap the out fence
*/
enum vb2_buffer_state state;

@@ -269,6 +273,10 @@ struct vb2_buffer {
struct dma_fence_cb fence_cb;
spinlock_t fence_cb_lock;

+ int out_fence_fd;
+ struct dma_fence *out_fence;
+ struct sync_file *sync_file;
+
#ifdef CONFIG_VIDEO_ADV_DEBUG
/*
* Counters for how often these buffer-related ops are
@@ -512,6 +520,7 @@ struct vb2_buf_ops {
* @last_buffer_dequeued: used in poll() and DQBUF to immediately return if the
* last decoded buffer was already dequeued. Set for capture queues
* when a buffer with the V4L2_BUF_FLAG_LAST is dequeued.
+ * @out_fence_context: the fence context for the out fences
* @fileio: file io emulator internal data, used only if emulator is active
* @threadio: thread io internal data, used only if thread is active
*/
@@ -565,6 +574,9 @@ struct vb2_queue {
unsigned int copy_timestamp:1;
unsigned int last_buffer_dequeued:1;

+ u64 out_fence_context;
+ spinlock_t out_fence_lock;
+
struct vb2_fileio_data *fileio;
struct vb2_threadio_data *threadio;

@@ -735,6 +747,16 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb);

/**
+ * vb2_setup_out_fence() - setup new out-fence
+ * @q: The vb2_queue where to setup it
+ * @index: index of the buffer
+ *
+ * Setup the file descriptor, the fence and the sync_file for the next
+ * buffer to be queued and add everything to the tail of the q->out_fence_list.
+ */
+int vb2_setup_out_fence(struct vb2_queue *q, unsigned int index);
+
+/**
* vb2_core_qbuf() - Queue a buffer from userspace
*
* @q: videobuf2 queue
--
2.13.6

2017-12-11 18:29:16

by Gustavo Padovan

[permalink] [raw]
Subject: [PATCH v6 3/6] [media] vb2: add explicit fence user API

From: Gustavo Padovan <[email protected]>

Turn the reserved2 field into fence_fd that we will use to send
an in-fence to the kernel and return an out-fence from the kernel to
userspace.

Two new flags were added, V4L2_BUF_FLAG_IN_FENCE, that should be used
when sending a fence to the kernel to be waited on, and
V4L2_BUF_FLAG_OUT_FENCE, to ask the kernel to give back an out-fence.

v4:
- make it a union with reserved2 and fence_fd (Hans Verkuil)

v3:
- make the out_fence refer to the current buffer (Hans Verkuil)

v2: add documentation

Signed-off-by: Gustavo Padovan <[email protected]>
---
Documentation/media/uapi/v4l/buffer.rst | 15 +++++++++++++++
drivers/media/usb/cpia2/cpia2_v4l.c | 2 +-
drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 4 ++--
drivers/media/v4l2-core/videobuf2-v4l2.c | 2 +-
include/uapi/linux/videodev2.h | 7 ++++++-
5 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/Documentation/media/uapi/v4l/buffer.rst b/Documentation/media/uapi/v4l/buffer.rst
index ae6ee73f151c..eeefbd2547e7 100644
--- a/Documentation/media/uapi/v4l/buffer.rst
+++ b/Documentation/media/uapi/v4l/buffer.rst
@@ -648,6 +648,21 @@ Buffer Flags
- Start Of Exposure. The buffer timestamp has been taken when the
exposure of the frame has begun. This is only valid for the
``V4L2_BUF_TYPE_VIDEO_CAPTURE`` buffer type.
+ * .. _`V4L2-BUF-FLAG-IN-FENCE`:
+
+ - ``V4L2_BUF_FLAG_IN_FENCE``
+ - 0x00200000
+ - Ask V4L2 to wait on fence passed in ``fence_fd`` field. The buffer
+ won't be queued to the driver until the fence signals.
+
+ * .. _`V4L2-BUF-FLAG-OUT-FENCE`:
+
+ - ``V4L2_BUF_FLAG_OUT_FENCE``
+ - 0x00400000
+ - Request a fence to be attached to the buffer. The ``fence_fd``
+ field on
+ :ref:`VIDIOC_QBUF` is used as a return argument to send the out-fence
+ fd to userspace.



diff --git a/drivers/media/usb/cpia2/cpia2_v4l.c b/drivers/media/usb/cpia2/cpia2_v4l.c
index a1c59f19cf2d..7d7459fa2077 100644
--- a/drivers/media/usb/cpia2/cpia2_v4l.c
+++ b/drivers/media/usb/cpia2/cpia2_v4l.c
@@ -948,7 +948,7 @@ static int cpia2_dqbuf(struct file *file, void *fh, struct v4l2_buffer *buf)
buf->sequence = cam->buffers[buf->index].seq;
buf->m.offset = cam->buffers[buf->index].data - cam->frame_buffer;
buf->length = cam->frame_size;
- buf->reserved2 = 0;
+ buf->fence_fd = -1;
buf->reserved = 0;
memset(&buf->timecode, 0, sizeof(buf->timecode));

diff --git a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
index 821f2aa299ae..3a31d318df2a 100644
--- a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
+++ b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
@@ -370,7 +370,7 @@ struct v4l2_buffer32 {
__s32 fd;
} m;
__u32 length;
- __u32 reserved2;
+ __s32 fence_fd;
__u32 reserved;
};

@@ -533,7 +533,7 @@ static int put_v4l2_buffer32(struct v4l2_buffer *kp, struct v4l2_buffer32 __user
put_user(kp->timestamp.tv_usec, &up->timestamp.tv_usec) ||
copy_to_user(&up->timecode, &kp->timecode, sizeof(struct v4l2_timecode)) ||
put_user(kp->sequence, &up->sequence) ||
- put_user(kp->reserved2, &up->reserved2) ||
+ put_user(kp->fence_fd, &up->fence_fd) ||
put_user(kp->reserved, &up->reserved) ||
put_user(kp->length, &up->length))
return -EFAULT;
diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
index 4075314a6989..4a5244ee2c58 100644
--- a/drivers/media/v4l2-core/videobuf2-v4l2.c
+++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
@@ -203,7 +203,7 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
b->timestamp = ns_to_timeval(vb->timestamp);
b->timecode = vbuf->timecode;
b->sequence = vbuf->sequence;
- b->reserved2 = 0;
+ b->fence_fd = -1;
b->reserved = 0;

if (q->is_multiplanar) {
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index a8ea632c14f0..17e163b5036d 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -926,7 +926,10 @@ struct v4l2_buffer {
__s32 fd;
} m;
__u32 length;
- __u32 reserved2;
+ union {
+ __s32 fence_fd;
+ __u32 reserved2;
+ };
__u32 reserved;
};

@@ -963,6 +966,8 @@ struct v4l2_buffer {
#define V4L2_BUF_FLAG_TSTAMP_SRC_SOE 0x00010000
/* mem2mem encoder/decoder */
#define V4L2_BUF_FLAG_LAST 0x00100000
+#define V4L2_BUF_FLAG_IN_FENCE 0x00200000
+#define V4L2_BUF_FLAG_OUT_FENCE 0x00400000

/**
* struct v4l2_exportbuffer - export of video buffer as DMABUF file descriptor
--
2.13.6

2017-12-11 18:29:41

by Gustavo Padovan

[permalink] [raw]
Subject: [PATCH v6 1/6] [media] vb2: add is_unordered callback for drivers

From: Gustavo Padovan <[email protected]>

Explicit synchronization benefits a lot from ordered queues, they fit
better in a pipeline with DRM for example so create a opt-in way for
drivers notify videobuf2 that the queue is unordered.

Drivers don't need implement it if the queue is ordered.

Signed-off-by: Gustavo Padovan <[email protected]>
---
include/media/videobuf2-core.h | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index ef9b64398c8c..eddb38a2a2f3 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -368,6 +368,9 @@ struct vb2_buffer {
* callback by calling vb2_buffer_done() with either
* %VB2_BUF_STATE_DONE or %VB2_BUF_STATE_ERROR; may use
* vb2_wait_for_all_buffers() function
+ * @is_unordered: tell if the queue format is unordered. The default is
+ * assumed to be ordered and this function only needs to
+ * be implemented for unordered queues.
* @buf_queue: passes buffer vb to the driver; driver may start
* hardware operation on this buffer; driver should give
* the buffer back by calling vb2_buffer_done() function;
@@ -391,6 +394,7 @@ struct vb2_ops {

int (*start_streaming)(struct vb2_queue *q, unsigned int count);
void (*stop_streaming)(struct vb2_queue *q);
+ int (*is_unordered)(struct vb2_queue *q);

void (*buf_queue)(struct vb2_buffer *vb);
};
@@ -564,6 +568,7 @@ struct vb2_queue {
u32 cnt_wait_finish;
u32 cnt_start_streaming;
u32 cnt_stop_streaming;
+ u32 cnt_is_unordered;
#endif
};

--
2.13.6

2017-12-21 18:17:59

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: [PATCH v6 2/6] [media] v4l: add 'unordered' flag to format description ioctl

Em Mon, 11 Dec 2017 16:27:37 -0200
Gustavo Padovan <[email protected]> escreveu:

> From: Gustavo Padovan <[email protected]>
>
> For explicit synchronization it important for userspace to know if the
> format being used by the driver can deliver the buffers back to userspace
> in the same order they were queued with QBUF.
>
> Ordered streams fits nicely in a pipeline with DRM for example, where
> ordered buffer are expected.
>
> Signed-off-by: Gustavo Padovan <[email protected]>

Looks OK to me.

> ---
> Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 3 +++
> include/uapi/linux/videodev2.h | 1 +
> 2 files changed, 4 insertions(+)
>
> diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> index 019c513df217..368115f44fc0 100644
> --- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> +++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> @@ -116,6 +116,9 @@ one until ``EINVAL`` is returned.
> - This format is not native to the device but emulated through
> software (usually libv4l2), where possible try to use a native
> format instead for better performance.
> + * - ``V4L2_FMT_FLAG_UNORDERED``
> + - 0x0004
> + - This is a format that doesn't guarantee timely order of frames.
>
>
> Return Value
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 1c095b5a99c5..a8ea632c14f0 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -709,6 +709,7 @@ struct v4l2_fmtdesc {
>
> #define V4L2_FMT_FLAG_COMPRESSED 0x0001
> #define V4L2_FMT_FLAG_EMULATED 0x0002
> +#define V4L2_FMT_FLAG_UNORDERED 0x0004
>
> /* Frame Size and frame rate enumeration */
> /*


--
Thanks,
Mauro

2017-12-21 18:49:48

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: [PATCH v6 0/6] V4L2 Explicit Synchronization

Em Mon, 11 Dec 2017 16:27:35 -0200
Gustavo Padovan <[email protected]> escreveu:

> From: Gustavo Padovan <[email protected]>
>
> Hi,
>
> One more iteration of the explicit fences patches, please refer
> to the previous version[1] for more details about the general
> mechanism
>
> This version makes the patchset and the implementation much more
> simple, to start we are not using a ordered capability anymore,
> but instead we have a VIDIOC_ENUM_FMT flag to tell when the
> queue in not ordered. Drivers with ordered queues/formats don't
> need implement anything. See patches 1 and 2 for more details.
>
> The implementation of in-fences and out-fences were condensed in
> just patches 4 and 5, making it more self-contained and easy to
> understand. See the patches for detailed changelog.
>
> Please review! Thanks.

Hi Gustavo,

As I was afraid, the changes at the VB2 core makes it non-generic,
breaking support for the DVB VB2 patchset. That's a branch with
both patchsets applied:

https://git.linuxtv.org/mchehab/experimental.git/log/?h=dvb_mmap%2bexplicit_fences


With the explicit fences patchset, the DVB streaming breaks:

$ sudo perf stat dvbv5-zap -c ~/dvb_channel.conf "TV Brasilia RedeTV!" -o /dev/null -t120 -R
Usando demux 'dvb0.demux0'
lendo canais do arquivo '/home/mchehab/dvb_channel.conf'
sintonizando em 557142857 Hz
PID de vídeo 273
dvb_set_pesfilter 273
PID de áudio 274
dvb_set_pesfilter 274
Travado (0x1f) Sinal= -85,22dBm C/N= 18,57dB UCB= 8589933955 pós-BER= 0
Travado (0x1f) Sinal= -85,24dBm C/N= 18,57dB UCB= 8589933955 pós-BER= 0
Gravação iniciada para o arquivo '/dev/null'
ERRO:DMX_REQBUFS failed: error=-1 (Invalid argument)
ERRO:[stream_to_file] Failed to setup buffers!!! (Invalid argument)
start streaming!!!
copied 0 bytes (0 Kbytes/sec)
Travado (0x1f) Sinal= -85,25dBm C/N= 18,57dB UCB= 8589933955 pós-BER= 0

Performance counter stats for 'dvbv5-zap -c /home/mchehab/dvb_channel.conf TV Brasilia RedeTV! -o /dev/null -t120 -R':

7.001647 task-clock-msecs # 0.003 CPUs
251 context-switches # 0.036 M/sec
18 CPU-migrations # 0.003 M/sec
181 page-faults # 0.026 M/sec
17001058 cycles # 2428.151 M/sec
11342660 instructions # 0.667 IPC
349075 cache-references # 49.856 M/sec
70802 cache-misses # 10.112 M/sec

2.133343557 seconds time elapsed

It also breaks support on V4L2, when fences is not used:

$ ./contrib/test/v4l2grab
(kernel crashes)

I don't have a serial console on this machine to print what's
wrong, but clearly there's something not right there :-)

--
Thanks,
Mauro

2017-12-21 18:51:38

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: [PATCH v6 1/6] [media] vb2: add is_unordered callback for drivers

Em Mon, 11 Dec 2017 16:27:36 -0200
Gustavo Padovan <[email protected]> escreveu:

> From: Gustavo Padovan <[email protected]>
>
> Explicit synchronization benefits a lot from ordered queues, they fit
> better in a pipeline with DRM for example so create a opt-in way for
> drivers notify videobuf2 that the queue is unordered.
>
> Drivers don't need implement it if the queue is ordered.
>
> Signed-off-by: Gustavo Padovan <[email protected]>
> ---
> include/media/videobuf2-core.h | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> index ef9b64398c8c..eddb38a2a2f3 100644
> --- a/include/media/videobuf2-core.h
> +++ b/include/media/videobuf2-core.h
> @@ -368,6 +368,9 @@ struct vb2_buffer {
> * callback by calling vb2_buffer_done() with either
> * %VB2_BUF_STATE_DONE or %VB2_BUF_STATE_ERROR; may use
> * vb2_wait_for_all_buffers() function
> + * @is_unordered: tell if the queue format is unordered. The default is
> + * assumed to be ordered and this function only needs to
> + * be implemented for unordered queues.
> * @buf_queue: passes buffer vb to the driver; driver may start
> * hardware operation on this buffer; driver should give
> * the buffer back by calling vb2_buffer_done() function;
> @@ -391,6 +394,7 @@ struct vb2_ops {
>
> int (*start_streaming)(struct vb2_queue *q, unsigned int count);
> void (*stop_streaming)(struct vb2_queue *q);
> + int (*is_unordered)(struct vb2_queue *q);
>
> void (*buf_queue)(struct vb2_buffer *vb);
> };
> @@ -564,6 +568,7 @@ struct vb2_queue {
> u32 cnt_wait_finish;
> u32 cnt_start_streaming;
> u32 cnt_stop_streaming;
> + u32 cnt_is_unordered;

If I understand, this is just a bit, right?

if so, better to declare it as:

u32 cnt_is_unordered : 1;

> #endif
> };
>


--
Thanks,
Mauro

2017-12-21 18:52:23

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: [PATCH v6 3/6] [media] vb2: add explicit fence user API

Em Mon, 11 Dec 2017 16:27:38 -0200
Gustavo Padovan <[email protected]> escreveu:

> From: Gustavo Padovan <[email protected]>
>
> Turn the reserved2 field into fence_fd that we will use to send
> an in-fence to the kernel and return an out-fence from the kernel to
> userspace.
>
> Two new flags were added, V4L2_BUF_FLAG_IN_FENCE, that should be used
> when sending a fence to the kernel to be waited on, and
> V4L2_BUF_FLAG_OUT_FENCE, to ask the kernel to give back an out-fence.
>
> v4:
> - make it a union with reserved2 and fence_fd (Hans Verkuil)
>
> v3:
> - make the out_fence refer to the current buffer (Hans Verkuil)
>
> v2: add documentation
>
> Signed-off-by: Gustavo Padovan <[email protected]>
> ---
> Documentation/media/uapi/v4l/buffer.rst | 15 +++++++++++++++
> drivers/media/usb/cpia2/cpia2_v4l.c | 2 +-
> drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 4 ++--
> drivers/media/v4l2-core/videobuf2-v4l2.c | 2 +-
> include/uapi/linux/videodev2.h | 7 ++++++-
> 5 files changed, 25 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/media/uapi/v4l/buffer.rst b/Documentation/media/uapi/v4l/buffer.rst
> index ae6ee73f151c..eeefbd2547e7 100644
> --- a/Documentation/media/uapi/v4l/buffer.rst
> +++ b/Documentation/media/uapi/v4l/buffer.rst
> @@ -648,6 +648,21 @@ Buffer Flags
> - Start Of Exposure. The buffer timestamp has been taken when the
> exposure of the frame has begun. This is only valid for the
> ``V4L2_BUF_TYPE_VIDEO_CAPTURE`` buffer type.
> + * .. _`V4L2-BUF-FLAG-IN-FENCE`:
> +
> + - ``V4L2_BUF_FLAG_IN_FENCE``
> + - 0x00200000
> + - Ask V4L2 to wait on fence passed in ``fence_fd`` field. The buffer
> + won't be queued to the driver until the fence signals.
> +
> + * .. _`V4L2-BUF-FLAG-OUT-FENCE`:
> +
> + - ``V4L2_BUF_FLAG_OUT_FENCE``
> + - 0x00400000
> + - Request a fence to be attached to the buffer. The ``fence_fd``
> + field on
> + :ref:`VIDIOC_QBUF` is used as a return argument to send the out-fence
> + fd to userspace.
>
>
>
> diff --git a/drivers/media/usb/cpia2/cpia2_v4l.c b/drivers/media/usb/cpia2/cpia2_v4l.c
> index a1c59f19cf2d..7d7459fa2077 100644
> --- a/drivers/media/usb/cpia2/cpia2_v4l.c
> +++ b/drivers/media/usb/cpia2/cpia2_v4l.c
> @@ -948,7 +948,7 @@ static int cpia2_dqbuf(struct file *file, void *fh, struct v4l2_buffer *buf)
> buf->sequence = cam->buffers[buf->index].seq;
> buf->m.offset = cam->buffers[buf->index].data - cam->frame_buffer;
> buf->length = cam->frame_size;
> - buf->reserved2 = 0;
> + buf->fence_fd = -1;
> buf->reserved = 0;
> memset(&buf->timecode, 0, sizeof(buf->timecode));
>
> diff --git a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> index 821f2aa299ae..3a31d318df2a 100644
> --- a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> +++ b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> @@ -370,7 +370,7 @@ struct v4l2_buffer32 {
> __s32 fd;
> } m;
> __u32 length;
> - __u32 reserved2;
> + __s32 fence_fd;
> __u32 reserved;
> };
>
> @@ -533,7 +533,7 @@ static int put_v4l2_buffer32(struct v4l2_buffer *kp, struct v4l2_buffer32 __user
> put_user(kp->timestamp.tv_usec, &up->timestamp.tv_usec) ||
> copy_to_user(&up->timecode, &kp->timecode, sizeof(struct v4l2_timecode)) ||
> put_user(kp->sequence, &up->sequence) ||
> - put_user(kp->reserved2, &up->reserved2) ||
> + put_user(kp->fence_fd, &up->fence_fd) ||
> put_user(kp->reserved, &up->reserved) ||
> put_user(kp->length, &up->length))
> return -EFAULT;
> diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
> index 4075314a6989..4a5244ee2c58 100644
> --- a/drivers/media/v4l2-core/videobuf2-v4l2.c
> +++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
> @@ -203,7 +203,7 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
> b->timestamp = ns_to_timeval(vb->timestamp);
> b->timecode = vbuf->timecode;
> b->sequence = vbuf->sequence;
> - b->reserved2 = 0;
> + b->fence_fd = -1;

The patch itself looks ok. I'm just in doubt here, but it is probably
ok to change its default to -1.


> b->reserved = 0;
>
> if (q->is_multiplanar) {
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index a8ea632c14f0..17e163b5036d 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -926,7 +926,10 @@ struct v4l2_buffer {
> __s32 fd;
> } m;
> __u32 length;
> - __u32 reserved2;
> + union {
> + __s32 fence_fd;
> + __u32 reserved2;
> + };
> __u32 reserved;
> };
>
> @@ -963,6 +966,8 @@ struct v4l2_buffer {
> #define V4L2_BUF_FLAG_TSTAMP_SRC_SOE 0x00010000
> /* mem2mem encoder/decoder */
> #define V4L2_BUF_FLAG_LAST 0x00100000
> +#define V4L2_BUF_FLAG_IN_FENCE 0x00200000
> +#define V4L2_BUF_FLAG_OUT_FENCE 0x00400000
>
> /**
> * struct v4l2_exportbuffer - export of video buffer as DMABUF file descriptor


--
Thanks,
Mauro

2017-12-21 18:58:08

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: [PATCH v6 4/6] [media] vb2: add in-fence support to QBUF

Em Mon, 11 Dec 2017 16:27:39 -0200
Gustavo Padovan <[email protected]> escreveu:

> From: Gustavo Padovan <[email protected]>
>
> Receive in-fence from userspace and add support for waiting on them
> before queueing the buffer to the driver. Buffers can't be queued to the
> driver before its fences signal. And a buffer can't be queue to the driver
> out of the order they were queued from userspace. That means that even if
> it fence signal it must wait all other buffers, ahead of it in the queue,
> to signal first.
>
> If the fence for some buffer fails we do not queue it to the driver,
> instead we mark it as error and wait until the previous buffer is done
> to notify userspace of the error. We wait here to deliver the buffers back
> to userspace in order.
>
> v7:
> - get rid of the fence array stuff for ordering and just use
> get_num_buffers_ready() (Hans)
> - fix issue of queuing the buffer twice (Hans)
> - avoid the dma_fence_wait() in core_qbuf() (Alex)
> - merge preparation commit in
>
> v6:
> - With fences always keep the order userspace queues the buffers.
> - Protect in_fence manipulation with a lock (Brian Starkey)
> - check if fences have the same context before adding a fence array
> - Fix last_fence ref unbalance in __set_in_fence() (Brian Starkey)
> - Clean up fence if __set_in_fence() fails (Brian Starkey)
> - treat -EINVAL from dma_fence_add_callback() (Brian Starkey)
>
> v5: - use fence_array to keep buffers ordered in vb2 core when
> needed (Brian Starkey)
> - keep backward compat on the reserved2 field (Brian Starkey)
> - protect fence callback removal with lock (Brian Starkey)
>
> v4:
> - Add a comment about dma_fence_add_callback() not returning a
> error (Hans)
> - Call dma_fence_put(vb->in_fence) if fence signaled (Hans)
> - select SYNC_FILE under config VIDEOBUF2_CORE (Hans)
> - Move dma_fence_is_signaled() check to __enqueue_in_driver() (Hans)
> - Remove list_for_each_entry() in __vb2_core_qbuf() (Hans)
> - Remove if (vb->state != VB2_BUF_STATE_QUEUED) from
> vb2_start_streaming() (Hans)
> - set IN_FENCE flags on __fill_v4l2_buffer (Hans)
> - Queue buffers to the driver as soon as they are ready (Hans)
> - call fill_user_buffer() after queuing the buffer (Hans)
> - add err: label to clean up fence
> - add dma_fence_wait() before calling vb2_start_streaming()
>
> v3: - document fence parameter
> - remove ternary if at vb2_qbuf() return (Mauro)
> - do not change if conditions behaviour (Mauro)
>
> v2:
> - fix vb2_queue_or_prepare_buf() ret check
> - remove check for VB2_MEMORY_DMABUF only (Javier)
> - check num of ready buffers to start streaming
> - when queueing, start from the first ready buffer
> - handle queue cancel
>
> Signed-off-by: Gustavo Padovan <[email protected]>
> ---
> drivers/media/v4l2-core/Kconfig | 1 +
> drivers/media/v4l2-core/videobuf2-core.c | 154 +++++++++++++++++++++++++++----
> drivers/media/v4l2-core/videobuf2-v4l2.c | 29 +++++-
> include/media/videobuf2-core.h | 14 ++-
> 4 files changed, 177 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/media/v4l2-core/Kconfig b/drivers/media/v4l2-core/Kconfig
> index a35c33686abf..3f988c407c80 100644
> --- a/drivers/media/v4l2-core/Kconfig
> +++ b/drivers/media/v4l2-core/Kconfig
> @@ -83,6 +83,7 @@ config VIDEOBUF_DVB
> # Used by drivers that need Videobuf2 modules
> config VIDEOBUF2_CORE
> select DMA_SHARED_BUFFER
> + select SYNC_FILE
> tristate
>
> config VIDEOBUF2_MEMOPS
> diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
> index a8589d96ef72..520aa3c7d9f0 100644
> --- a/drivers/media/v4l2-core/videobuf2-core.c
> +++ b/drivers/media/v4l2-core/videobuf2-core.c
> @@ -346,6 +346,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
> vb->index = q->num_buffers + buffer;
> vb->type = q->type;
> vb->memory = memory;
> + spin_lock_init(&vb->fence_cb_lock);
> for (plane = 0; plane < num_planes; ++plane) {
> vb->planes[plane].length = plane_sizes[plane];
> vb->planes[plane].min_length = plane_sizes[plane];
> @@ -928,7 +929,7 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
>
> switch (state) {
> case VB2_BUF_STATE_QUEUED:
> - return;
> + break;
> case VB2_BUF_STATE_REQUEUEING:
> if (q->start_streaming_called)
> __enqueue_in_driver(vb);
> @@ -938,6 +939,16 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
> wake_up(&q->done_wq);
> break;
> }
> +
> + /*
> + * If the in-fence fails the buffer is not queued to the driver
> + * and we have to wait until the previous buffer is done before
> + * we notify userspace that the buffer with error can be dequeued.
> + * That way we maintain the ordering from userspace point of view.
> + */
> + vb = list_next_entry(vb, queued_entry);
> + if (vb && vb->state == VB2_BUF_STATE_ERROR)
> + vb2_buffer_done(vb, vb->state);

This is not a comment for this specific hunk itself, but for all
similar occurrences on patches 4 and 5.

I really don't like the idea of unconditionally execute things at the
core without checking if fences will be used or not. It seems risky, and,
from my quick tests with dvb_vb2, at least one of such hunks seem
broken for the case where either the VB2 API-specific interface doesn't
implement fences or fences is not being used by the userspace application.

I'm also wandering about the performance impacts for those things when
fences aren't used/implemented at the API-specific interfaces (currently,
videobuf2-v4l2 and dvb_vb2).

Thanks,
Mauro

2017-12-21 19:07:59

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: [PATCH v6 0/6] V4L2 Explicit Synchronization

Em Thu, 21 Dec 2017 16:49:31 -0200
Mauro Carvalho Chehab <[email protected]> escreveu:

> Em Mon, 11 Dec 2017 16:27:35 -0200
> Gustavo Padovan <[email protected]> escreveu:
>
> > From: Gustavo Padovan <[email protected]>
> >
> > Hi,
> >
> > One more iteration of the explicit fences patches, please refer
> > to the previous version[1] for more details about the general
> > mechanism
> >
> > This version makes the patchset and the implementation much more
> > simple, to start we are not using a ordered capability anymore,
> > but instead we have a VIDIOC_ENUM_FMT flag to tell when the
> > queue in not ordered. Drivers with ordered queues/formats don't
> > need implement anything. See patches 1 and 2 for more details.
> >
> > The implementation of in-fences and out-fences were condensed in
> > just patches 4 and 5, making it more self-contained and easy to
> > understand. See the patches for detailed changelog.
> >
> > Please review! Thanks.
>
> Hi Gustavo,
>
> As I was afraid, the changes at the VB2 core makes it non-generic,
> breaking support for the DVB VB2 patchset. That's a branch with
> both patchsets applied:
>
> https://git.linuxtv.org/mchehab/experimental.git/log/?h=dvb_mmap%2bexplicit_fences
>
>
> With the explicit fences patchset, the DVB streaming breaks:
>
> $ sudo perf stat dvbv5-zap -c ~/dvb_channel.conf "TV Brasilia RedeTV!" -o /dev/null -t120 -R
> Usando demux 'dvb0.demux0'
> lendo canais do arquivo '/home/mchehab/dvb_channel.conf'
> sintonizando em 557142857 Hz
> PID de vídeo 273
> dvb_set_pesfilter 273
> PID de áudio 274
> dvb_set_pesfilter 274
> Travado (0x1f) Sinal= -85,22dBm C/N= 18,57dB UCB= 8589933955 pós-BER= 0
> Travado (0x1f) Sinal= -85,24dBm C/N= 18,57dB UCB= 8589933955 pós-BER= 0
> Gravação iniciada para o arquivo '/dev/null'
> ERRO:DMX_REQBUFS failed: error=-1 (Invalid argument)
> ERRO:[stream_to_file] Failed to setup buffers!!! (Invalid argument)
> start streaming!!!
> copied 0 bytes (0 Kbytes/sec)
> Travado (0x1f) Sinal= -85,25dBm C/N= 18,57dB UCB= 8589933955 pós-BER= 0
>
> Performance counter stats for 'dvbv5-zap -c /home/mchehab/dvb_channel.conf TV Brasilia RedeTV! -o /dev/null -t120 -R':
>
> 7.001647 task-clock-msecs # 0.003 CPUs
> 251 context-switches # 0.036 M/sec
> 18 CPU-migrations # 0.003 M/sec
> 181 page-faults # 0.026 M/sec
> 17001058 cycles # 2428.151 M/sec
> 11342660 instructions # 0.667 IPC
> 349075 cache-references # 49.856 M/sec
> 70802 cache-misses # 10.112 M/sec
>
> 2.133343557 seconds time elapsed

Hmm... this may be unrelated. I'll need to do more tests here to know
for sure.

>
> It also breaks support on V4L2, when fences is not used:
>
> $ ./contrib/test/v4l2grab
> (kernel crashes)
>
> I don't have a serial console on this machine to print what's
> wrong, but clearly there's something not right there :-)
>



Thanks,
Mauro

2017-12-21 20:28:30

by Gustavo Padovan

[permalink] [raw]
Subject: Re: [PATCH v6 0/6] V4L2 Explicit Synchronization

2017-12-21 Mauro Carvalho Chehab <[email protected]>:

> Em Mon, 11 Dec 2017 16:27:35 -0200
> Gustavo Padovan <[email protected]> escreveu:
>
> > From: Gustavo Padovan <[email protected]>
> >
> > Hi,
> >
> > One more iteration of the explicit fences patches, please refer
> > to the previous version[1] for more details about the general
> > mechanism
> >
> > This version makes the patchset and the implementation much more
> > simple, to start we are not using a ordered capability anymore,
> > but instead we have a VIDIOC_ENUM_FMT flag to tell when the
> > queue in not ordered. Drivers with ordered queues/formats don't
> > need implement anything. See patches 1 and 2 for more details.
> >
> > The implementation of in-fences and out-fences were condensed in
> > just patches 4 and 5, making it more self-contained and easy to
> > understand. See the patches for detailed changelog.
> >
> > Please review! Thanks.
>
> Hi Gustavo,
>
> As I was afraid, the changes at the VB2 core makes it non-generic,
> breaking support for the DVB VB2 patchset. That's a branch with
> both patchsets applied:
>
> https://git.linuxtv.org/mchehab/experimental.git/log/?h=dvb_mmap%2bexplicit_fences
>
>
> With the explicit fences patchset, the DVB streaming breaks:
>
> $ sudo perf stat dvbv5-zap -c ~/dvb_channel.conf "TV Brasilia RedeTV!" -o /dev/null -t120 -R
> Usando demux 'dvb0.demux0'
> lendo canais do arquivo '/home/mchehab/dvb_channel.conf'
> sintonizando em 557142857 Hz
> PID de v?deo 273
> dvb_set_pesfilter 273
> PID de ?udio 274
> dvb_set_pesfilter 274
> Travado (0x1f) Sinal= -85,22dBm C/N= 18,57dB UCB= 8589933955 p?s-BER= 0
> Travado (0x1f) Sinal= -85,24dBm C/N= 18,57dB UCB= 8589933955 p?s-BER= 0
> Grava??o iniciada para o arquivo '/dev/null'
> ERRO:DMX_REQBUFS failed: error=-1 (Invalid argument)
> ERRO:[stream_to_file] Failed to setup buffers!!! (Invalid argument)
> start streaming!!!
> copied 0 bytes (0 Kbytes/sec)
> Travado (0x1f) Sinal= -85,25dBm C/N= 18,57dB UCB= 8589933955 p?s-BER= 0

I haven't tried it, not sure what could break it. I'll take a look on
it.

> Performance counter stats for 'dvbv5-zap -c /home/mchehab/dvb_channel.conf TV Brasilia RedeTV! -o /dev/null -t120 -R':
>
> 7.001647 task-clock-msecs # 0.003 CPUs
> 251 context-switches # 0.036 M/sec
> 18 CPU-migrations # 0.003 M/sec
> 181 page-faults # 0.026 M/sec
> 17001058 cycles # 2428.151 M/sec
> 11342660 instructions # 0.667 IPC
> 349075 cache-references # 49.856 M/sec
> 70802 cache-misses # 10.112 M/sec
>
> 2.133343557 seconds time elapsed
>
> It also breaks support on V4L2, when fences is not used:
>
> $ ./contrib/test/v4l2grab
> (kernel crashes)
>
> I don't have a serial console on this machine to print what's
> wrong, but clearly there's something not right there :-)

2 days ago I fixed the a crash when not using fences:

https://git.kernel.org/pub/scm/linux/kernel/git/padovan/linux.git/commit/?h=v4l2-fences&id=704419e59437e9f7bdc369b44a612685e8663880

Gustavo

2017-12-21 20:32:12

by Gustavo Padovan

[permalink] [raw]
Subject: Re: [PATCH v6 1/6] [media] vb2: add is_unordered callback for drivers

2017-12-21 Mauro Carvalho Chehab <[email protected]>:

> Em Mon, 11 Dec 2017 16:27:36 -0200
> Gustavo Padovan <[email protected]> escreveu:
>
> > From: Gustavo Padovan <[email protected]>
> >
> > Explicit synchronization benefits a lot from ordered queues, they fit
> > better in a pipeline with DRM for example so create a opt-in way for
> > drivers notify videobuf2 that the queue is unordered.
> >
> > Drivers don't need implement it if the queue is ordered.
> >
> > Signed-off-by: Gustavo Padovan <[email protected]>
> > ---
> > include/media/videobuf2-core.h | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> > index ef9b64398c8c..eddb38a2a2f3 100644
> > --- a/include/media/videobuf2-core.h
> > +++ b/include/media/videobuf2-core.h
> > @@ -368,6 +368,9 @@ struct vb2_buffer {
> > * callback by calling vb2_buffer_done() with either
> > * %VB2_BUF_STATE_DONE or %VB2_BUF_STATE_ERROR; may use
> > * vb2_wait_for_all_buffers() function
> > + * @is_unordered: tell if the queue format is unordered. The default is
> > + * assumed to be ordered and this function only needs to
> > + * be implemented for unordered queues.
> > * @buf_queue: passes buffer vb to the driver; driver may start
> > * hardware operation on this buffer; driver should give
> > * the buffer back by calling vb2_buffer_done() function;
> > @@ -391,6 +394,7 @@ struct vb2_ops {
> >
> > int (*start_streaming)(struct vb2_queue *q, unsigned int count);
> > void (*stop_streaming)(struct vb2_queue *q);
> > + int (*is_unordered)(struct vb2_queue *q);
> >
> > void (*buf_queue)(struct vb2_buffer *vb);
> > };
> > @@ -564,6 +568,7 @@ struct vb2_queue {
> > u32 cnt_wait_finish;
> > u32 cnt_start_streaming;
> > u32 cnt_stop_streaming;
> > + u32 cnt_is_unordered;
>
> If I understand, this is just a bit, right?
>
> if so, better to declare it as:
>
> u32 cnt_is_unordered : 1;

no, is_unordered() is a vb2 callback to ask drivers if their current
queue is not ordered and cnt_is_unordered is the counter of how many
times this callback was called. This happens for all vb2 callbacks, see
the other counters above, so I decided to add one for is_unordered() as
well.

Gustavo

2017-12-21 20:36:53

by Gustavo Padovan

[permalink] [raw]
Subject: Re: [PATCH v6 3/6] [media] vb2: add explicit fence user API

2017-12-21 Mauro Carvalho Chehab <[email protected]>:

> Em Mon, 11 Dec 2017 16:27:38 -0200
> Gustavo Padovan <[email protected]> escreveu:
>
> > From: Gustavo Padovan <[email protected]>
> >
> > Turn the reserved2 field into fence_fd that we will use to send
> > an in-fence to the kernel and return an out-fence from the kernel to
> > userspace.
> >
> > Two new flags were added, V4L2_BUF_FLAG_IN_FENCE, that should be used
> > when sending a fence to the kernel to be waited on, and
> > V4L2_BUF_FLAG_OUT_FENCE, to ask the kernel to give back an out-fence.
> >
> > v4:
> > - make it a union with reserved2 and fence_fd (Hans Verkuil)
> >
> > v3:
> > - make the out_fence refer to the current buffer (Hans Verkuil)
> >
> > v2: add documentation
> >
> > Signed-off-by: Gustavo Padovan <[email protected]>
> > ---
> > Documentation/media/uapi/v4l/buffer.rst | 15 +++++++++++++++
> > drivers/media/usb/cpia2/cpia2_v4l.c | 2 +-
> > drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 4 ++--
> > drivers/media/v4l2-core/videobuf2-v4l2.c | 2 +-
> > include/uapi/linux/videodev2.h | 7 ++++++-
> > 5 files changed, 25 insertions(+), 5 deletions(-)
> >
> > diff --git a/Documentation/media/uapi/v4l/buffer.rst b/Documentation/media/uapi/v4l/buffer.rst
> > index ae6ee73f151c..eeefbd2547e7 100644
> > --- a/Documentation/media/uapi/v4l/buffer.rst
> > +++ b/Documentation/media/uapi/v4l/buffer.rst
> > @@ -648,6 +648,21 @@ Buffer Flags
> > - Start Of Exposure. The buffer timestamp has been taken when the
> > exposure of the frame has begun. This is only valid for the
> > ``V4L2_BUF_TYPE_VIDEO_CAPTURE`` buffer type.
> > + * .. _`V4L2-BUF-FLAG-IN-FENCE`:
> > +
> > + - ``V4L2_BUF_FLAG_IN_FENCE``
> > + - 0x00200000
> > + - Ask V4L2 to wait on fence passed in ``fence_fd`` field. The buffer
> > + won't be queued to the driver until the fence signals.
> > +
> > + * .. _`V4L2-BUF-FLAG-OUT-FENCE`:
> > +
> > + - ``V4L2_BUF_FLAG_OUT_FENCE``
> > + - 0x00400000
> > + - Request a fence to be attached to the buffer. The ``fence_fd``
> > + field on
> > + :ref:`VIDIOC_QBUF` is used as a return argument to send the out-fence
> > + fd to userspace.
> >
> >
> >
> > diff --git a/drivers/media/usb/cpia2/cpia2_v4l.c b/drivers/media/usb/cpia2/cpia2_v4l.c
> > index a1c59f19cf2d..7d7459fa2077 100644
> > --- a/drivers/media/usb/cpia2/cpia2_v4l.c
> > +++ b/drivers/media/usb/cpia2/cpia2_v4l.c
> > @@ -948,7 +948,7 @@ static int cpia2_dqbuf(struct file *file, void *fh, struct v4l2_buffer *buf)
> > buf->sequence = cam->buffers[buf->index].seq;
> > buf->m.offset = cam->buffers[buf->index].data - cam->frame_buffer;
> > buf->length = cam->frame_size;
> > - buf->reserved2 = 0;
> > + buf->fence_fd = -1;
> > buf->reserved = 0;
> > memset(&buf->timecode, 0, sizeof(buf->timecode));
> >
> > diff --git a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> > index 821f2aa299ae..3a31d318df2a 100644
> > --- a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> > +++ b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> > @@ -370,7 +370,7 @@ struct v4l2_buffer32 {
> > __s32 fd;
> > } m;
> > __u32 length;
> > - __u32 reserved2;
> > + __s32 fence_fd;
> > __u32 reserved;
> > };
> >
> > @@ -533,7 +533,7 @@ static int put_v4l2_buffer32(struct v4l2_buffer *kp, struct v4l2_buffer32 __user
> > put_user(kp->timestamp.tv_usec, &up->timestamp.tv_usec) ||
> > copy_to_user(&up->timecode, &kp->timecode, sizeof(struct v4l2_timecode)) ||
> > put_user(kp->sequence, &up->sequence) ||
> > - put_user(kp->reserved2, &up->reserved2) ||
> > + put_user(kp->fence_fd, &up->fence_fd) ||
> > put_user(kp->reserved, &up->reserved) ||
> > put_user(kp->length, &up->length))
> > return -EFAULT;
> > diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
> > index 4075314a6989..4a5244ee2c58 100644
> > --- a/drivers/media/v4l2-core/videobuf2-v4l2.c
> > +++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
> > @@ -203,7 +203,7 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
> > b->timestamp = ns_to_timeval(vb->timestamp);
> > b->timecode = vbuf->timecode;
> > b->sequence = vbuf->sequence;
> > - b->reserved2 = 0;
> > + b->fence_fd = -1;
>
> The patch itself looks ok. I'm just in doubt here, but it is probably
> ok to change its default to -1.

Right. What we can do is return 0 if the OUT_FENCE flag is not set.

Gustavo

2017-12-21 20:41:59

by Gustavo Padovan

[permalink] [raw]
Subject: Re: [PATCH v6 4/6] [media] vb2: add in-fence support to QBUF

2017-12-21 Mauro Carvalho Chehab <[email protected]>:

> Em Mon, 11 Dec 2017 16:27:39 -0200
> Gustavo Padovan <[email protected]> escreveu:
>
> > From: Gustavo Padovan <[email protected]>
> >
> > Receive in-fence from userspace and add support for waiting on them
> > before queueing the buffer to the driver. Buffers can't be queued to the
> > driver before its fences signal. And a buffer can't be queue to the driver
> > out of the order they were queued from userspace. That means that even if
> > it fence signal it must wait all other buffers, ahead of it in the queue,
> > to signal first.
> >
> > If the fence for some buffer fails we do not queue it to the driver,
> > instead we mark it as error and wait until the previous buffer is done
> > to notify userspace of the error. We wait here to deliver the buffers back
> > to userspace in order.
> >
> > v7:
> > - get rid of the fence array stuff for ordering and just use
> > get_num_buffers_ready() (Hans)
> > - fix issue of queuing the buffer twice (Hans)
> > - avoid the dma_fence_wait() in core_qbuf() (Alex)
> > - merge preparation commit in
> >
> > v6:
> > - With fences always keep the order userspace queues the buffers.
> > - Protect in_fence manipulation with a lock (Brian Starkey)
> > - check if fences have the same context before adding a fence array
> > - Fix last_fence ref unbalance in __set_in_fence() (Brian Starkey)
> > - Clean up fence if __set_in_fence() fails (Brian Starkey)
> > - treat -EINVAL from dma_fence_add_callback() (Brian Starkey)
> >
> > v5: - use fence_array to keep buffers ordered in vb2 core when
> > needed (Brian Starkey)
> > - keep backward compat on the reserved2 field (Brian Starkey)
> > - protect fence callback removal with lock (Brian Starkey)
> >
> > v4:
> > - Add a comment about dma_fence_add_callback() not returning a
> > error (Hans)
> > - Call dma_fence_put(vb->in_fence) if fence signaled (Hans)
> > - select SYNC_FILE under config VIDEOBUF2_CORE (Hans)
> > - Move dma_fence_is_signaled() check to __enqueue_in_driver() (Hans)
> > - Remove list_for_each_entry() in __vb2_core_qbuf() (Hans)
> > - Remove if (vb->state != VB2_BUF_STATE_QUEUED) from
> > vb2_start_streaming() (Hans)
> > - set IN_FENCE flags on __fill_v4l2_buffer (Hans)
> > - Queue buffers to the driver as soon as they are ready (Hans)
> > - call fill_user_buffer() after queuing the buffer (Hans)
> > - add err: label to clean up fence
> > - add dma_fence_wait() before calling vb2_start_streaming()
> >
> > v3: - document fence parameter
> > - remove ternary if at vb2_qbuf() return (Mauro)
> > - do not change if conditions behaviour (Mauro)
> >
> > v2:
> > - fix vb2_queue_or_prepare_buf() ret check
> > - remove check for VB2_MEMORY_DMABUF only (Javier)
> > - check num of ready buffers to start streaming
> > - when queueing, start from the first ready buffer
> > - handle queue cancel
> >
> > Signed-off-by: Gustavo Padovan <[email protected]>
> > ---
> > drivers/media/v4l2-core/Kconfig | 1 +
> > drivers/media/v4l2-core/videobuf2-core.c | 154 +++++++++++++++++++++++++++----
> > drivers/media/v4l2-core/videobuf2-v4l2.c | 29 +++++-
> > include/media/videobuf2-core.h | 14 ++-
> > 4 files changed, 177 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/media/v4l2-core/Kconfig b/drivers/media/v4l2-core/Kconfig
> > index a35c33686abf..3f988c407c80 100644
> > --- a/drivers/media/v4l2-core/Kconfig
> > +++ b/drivers/media/v4l2-core/Kconfig
> > @@ -83,6 +83,7 @@ config VIDEOBUF_DVB
> > # Used by drivers that need Videobuf2 modules
> > config VIDEOBUF2_CORE
> > select DMA_SHARED_BUFFER
> > + select SYNC_FILE
> > tristate
> >
> > config VIDEOBUF2_MEMOPS
> > diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
> > index a8589d96ef72..520aa3c7d9f0 100644
> > --- a/drivers/media/v4l2-core/videobuf2-core.c
> > +++ b/drivers/media/v4l2-core/videobuf2-core.c
> > @@ -346,6 +346,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
> > vb->index = q->num_buffers + buffer;
> > vb->type = q->type;
> > vb->memory = memory;
> > + spin_lock_init(&vb->fence_cb_lock);
> > for (plane = 0; plane < num_planes; ++plane) {
> > vb->planes[plane].length = plane_sizes[plane];
> > vb->planes[plane].min_length = plane_sizes[plane];
> > @@ -928,7 +929,7 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
> >
> > switch (state) {
> > case VB2_BUF_STATE_QUEUED:
> > - return;
> > + break;
> > case VB2_BUF_STATE_REQUEUEING:
> > if (q->start_streaming_called)
> > __enqueue_in_driver(vb);
> > @@ -938,6 +939,16 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
> > wake_up(&q->done_wq);
> > break;
> > }
> > +
> > + /*
> > + * If the in-fence fails the buffer is not queued to the driver
> > + * and we have to wait until the previous buffer is done before
> > + * we notify userspace that the buffer with error can be dequeued.
> > + * That way we maintain the ordering from userspace point of view.
> > + */
> > + vb = list_next_entry(vb, queued_entry);
> > + if (vb && vb->state == VB2_BUF_STATE_ERROR)
> > + vb2_buffer_done(vb, vb->state);
>
> This is not a comment for this specific hunk itself, but for all
> similar occurrences on patches 4 and 5.

It in related to the this piece, but it seems I need to improve the
quality of the comment.

>
> I really don't like the idea of unconditionally execute things at the
> core without checking if fences will be used or not. It seems risky, and,
> from my quick tests with dvb_vb2, at least one of such hunks seem
> broken for the case where either the VB2 API-specific interface doesn't
> implement fences or fences is not being used by the userspace application.

Yeah, I'll have this in mind for the next iteration of this patchset.

>
> I'm also wandering about the performance impacts for those things when
> fences aren't used/implemented at the API-specific interfaces (currently,
> videobuf2-v4l2 and dvb_vb2).

I'm writing a DRM <-> V4L2 tool at the moment to start playing with a
full pipeline and performance.

Gustavo