2023-09-28 00:51:41

by Benjamin Gaignard

[permalink] [raw]
Subject: [PATCH v8 00/53] Add DELETE_BUF ioctl

Unlike when resolution change on keyframes, dynamic resolution change
on inter frames doesn't allow to do a stream off/on sequence because
it is need to keep all previous references alive to decode inter frames.
This constraint have two main problems:
- more memory consumption.
- more buffers in use.
To solve these issue this series introduce DELETE_BUFS ioctl and remove
the 32 buffers limit per queue.

VP9 conformance tests using fluster give a score of 210/305.
The 24 resize inter tests (vp90-2-21-resize_inter_* files) are ok
but require to use postprocessor.

Kernel branch is available here:
https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/commits/remove_vb2_queue_limit_v8

GStreamer branch to use DELETE_BUF ioctl and testing dynamic resolution
change is here:
https://gitlab.freedesktop.org/benjamin.gaignard1/gstreamer/-/commits/VP9_drc

changes in version 8:
- Add V4L2_BUF_CAP_SUPPORTS_SET_MAX_BUFS and new 'max_buffers' field in v4l2_create_buffers
structure to report the maximum number of buffers that a queue could allocate.
- Add V4L2_BUF_CAP_SUPPORTS_DELETE_BUFS to indicate that a queue support
DELETE_BUFS ioctl.
- Make some test drivers use more than 32 buffers and DELETE_BUFS ioctl.
- Fix remarks done by Hans
- Move "media: core: Rework how create_buf index returned value is
computed" patch to the top of the serie.

changes in version 7:
- Use a bitmap to know which entries are valid in queue bufs array.
The number of buffers in the queue could must calculated from the
bitmap so num_buffers becomes useless. This led to add quite few
patches to remove it from all the drivers.
Note: despiste my attention I may have miss some calls to
num_buffers...
- Split patches to make them more readable.
- Run v4l2-compliance with additional delete-bufs tests.
- Run ./test-media -kmemleak vivid and no more failures.
Note: I had to remove USERPTR streaming test because they to much
frequentely hit get_framevec bug. It is not related to my series
since this happens all the time on master branch.
- Fix Hans remarks on v6

changes in version 6:
- Get a patch per driver to use vb2_get_buffer() instead of directly access
to queue buffers array.
- Add lock in vb2_core_delete_buf()
- Use vb2_buffer instead of index
- Fix various comments
- Change buffer index name to BUFFER_INDEX_MASK
- Stop spamming kernel log with unbalanced counters

changes in version 5:
- Rework offset cookie encoding pattern is n ow the first patch of the
serie.
- Use static array instead of allocated one for postprocessor buffers.

changes in version 4:
- Stop using Xarray, instead let queues decide about their own maximum
number of buffer and allocate bufs array given that value.
- Rework offset cookie encoding pattern.
- Change DELETE_BUF to DELETE_BUFS because it now usable for
range of buffer to be symetrical of CREATE_BUFS.
- Add fixes tags on couple of Verisilicon related patches.
- Be smarter in Verisilicon postprocessor buffers management.
- Rebase on top of v6.4

changes in version 3:
- Use Xarray API to store allocated video buffers.
- No module parameter to limit the number of buffer per queue.
- Use Xarray inside Verisilicon driver to store postprocessor buffers
and remove VB2_MAX_FRAME limit.
- Allow Versilicon driver to change of resolution while streaming
- Various fixes the Verisilicon VP9 code to improve fluster score.

changes in version 2:
- Use a dynamic array and not a list to keep trace of allocated buffers.
Not use IDR interface because it is marked as deprecated in kernel
documentation.
- Add a module parameter to limit the number of buffer per queue.
- Add DELETE_BUF ioctl and m2m helpers.

Regards,
Benjamin

Benjamin Gaignard (53):
media: videobuf2: Rework offset 'cookie' encoding pattern
media: videobuf2: Stop spamming kernel log with all queue counter
media: videobuf2: Use vb2_buffer instead of index
media: amphion: Use vb2_get_buffer() instead of directly access to
buffers array
media: mediatek: jpeg: Use vb2_get_buffer() instead of directly access
to buffers array
media: mediatek: vdec: Remove useless loop
media: sti: hva: Use vb2_get_buffer() instead of directly access to
buffers array
media: visl: Use vb2_get_buffer() instead of directly access to
buffers array
media: atomisp: Use vb2_get_buffer() instead of directly access to
buffers array
media: dvb-core: Use vb2_get_buffer() instead of directly access to
buffers array
media: videobuf2: Access vb2_queue bufs array through helper functions
media: videobuf2: Be more flexible on the number of queue stored
buffers
media: Report the maximum possible number of buffers for the queue
media: test-drivers: vivid: Increase max supported buffers for capture
queues
media: test-drivers: vicodec: Increase max supported capture queue
buffers
media: verisilicon: Refactor postprocessor to store more buffers
media: verisilicon: Store chroma and motion vectors offset
media: verisilicon: g2: Use common helpers to compute chroma and mv
offsets
media: verisilicon: vp9: Allow to change resolution while streaming
media: Remove duplicated index vs q->num_buffers check
media: core: Add helper to get queue number of buffers
media: dvb-core: Do not initialize twice queue num_buffer field
media: dvb-frontends: rtl2832_srd: Use queue min_buffers_needed field
media: video-i2c: Set min_buffers_needed to 2
media: pci: cx18: Set correct value to min_buffers_needed field
media: pci: dt3155: Remove useless check
media: pci: netup_unidvb: Remove useless number of buffers check
media: pci: tw68: Stop direct calls to queue num_buffers field
media: pci: tw686x: Set min_buffers_needed to 3
media: amphion: Stop direct calls to queue num_buffers field
media: coda: Stop direct calls to queue num_buffers field
media: mediatek: vcodec: Stop direct calls to queue num_buffers field
media: nxp: Stop direct calls to queue num_buffers field
media: renesas: Set min_buffers_needed to 16
media: ti: Use queue min_buffers_needed field to set the min number of
buffers
media: verisilicon: Stop direct calls to queue num_buffers field
media: test-drivers: Stop direct calls to queue num_buffers field
media: usb: airspy: Set min_buffers_needed to 8
media: usb: cx231xx: Set min_buffers_needed to CX231XX_MIN_BUF
media: usb: hackrf: Set min_buffers_needed to 8
media: usb: usbtv: Set min_buffers_needed to 2
media: atomisp: Stop direct calls to queue num_buffers field
media: imx: Stop direct calls to queue num_buffers field
media: meson: vdec: Stop direct calls to queue num_buffers field
touchscreen: sur40: Stop direct calls to queue num_buffers field
sample: v4l: Stop direct calls to queue num_buffers field
media: cedrus: Stop direct calls to queue num_buffers field
media: core: Rework how create_buf index returned value is computed
media: core: Add bitmap manage bufs array entries
media: core: Free range of buffers
media: v4l2: Add DELETE_BUFS ioctl
media: v4l2: Add mem2mem helpers for DELETE_BUFS ioctl
media: test-drivers: Use helper for DELETE_BUFS ioctl

.../userspace-api/media/v4l/user-func.rst | 1 +
.../media/v4l/vidioc-create-bufs.rst | 8 +-
.../media/v4l/vidioc-delete-bufs.rst | 80 +++
.../media/v4l/vidioc-reqbufs.rst | 2 +
drivers/input/touchscreen/sur40.c | 5 +-
.../media/common/videobuf2/videobuf2-core.c | 556 +++++++++++-------
.../media/common/videobuf2/videobuf2-v4l2.c | 121 +++-
drivers/media/dvb-core/dvb_vb2.c | 17 +-
drivers/media/dvb-frontends/rtl2832_sdr.c | 9 +-
drivers/media/i2c/video-i2c.c | 5 +-
drivers/media/pci/cx18/cx18-streams.c | 13 +-
drivers/media/pci/dt3155/dt3155.c | 2 -
.../pci/netup_unidvb/netup_unidvb_core.c | 4 +-
drivers/media/pci/tw68/tw68-video.c | 6 +-
drivers/media/pci/tw686x/tw686x-video.c | 13 +-
drivers/media/platform/amphion/vpu_dbg.c | 30 +-
drivers/media/platform/amphion/vpu_v4l2.c | 4 +-
.../media/platform/chips-media/coda-common.c | 2 +-
.../platform/mediatek/jpeg/mtk_jpeg_core.c | 7 +-
.../vcodec/decoder/vdec/vdec_vp9_req_lat_if.c | 9 +-
.../mediatek/vcodec/encoder/mtk_vcodec_enc.c | 2 +-
drivers/media/platform/nxp/imx7-media-csi.c | 7 +-
drivers/media/platform/renesas/rcar_drif.c | 8 +-
drivers/media/platform/st/sti/hva/hva-v4l2.c | 9 +-
.../media/platform/ti/am437x/am437x-vpfe.c | 7 +-
drivers/media/platform/ti/cal/cal-video.c | 5 +-
.../media/platform/ti/davinci/vpif_capture.c | 5 +-
.../media/platform/ti/davinci/vpif_display.c | 5 +-
drivers/media/platform/ti/omap/omap_vout.c | 5 +-
drivers/media/platform/verisilicon/hantro.h | 9 +-
.../media/platform/verisilicon/hantro_drv.c | 5 +-
.../media/platform/verisilicon/hantro_g2.c | 14 +
.../platform/verisilicon/hantro_g2_hevc_dec.c | 18 +-
.../platform/verisilicon/hantro_g2_vp9_dec.c | 28 +-
.../media/platform/verisilicon/hantro_hw.h | 7 +-
.../platform/verisilicon/hantro_postproc.c | 93 ++-
.../media/platform/verisilicon/hantro_v4l2.c | 27 +-
.../media/test-drivers/vicodec/vicodec-core.c | 3 +
drivers/media/test-drivers/vim2m.c | 2 +
.../media/test-drivers/vimc/vimc-capture.c | 2 +
drivers/media/test-drivers/visl/visl-dec.c | 32 +-
drivers/media/test-drivers/visl/visl-video.c | 2 +
drivers/media/test-drivers/vivid/vivid-core.c | 14 +
.../media/test-drivers/vivid/vivid-meta-cap.c | 3 -
.../media/test-drivers/vivid/vivid-meta-out.c | 5 +-
.../test-drivers/vivid/vivid-touch-cap.c | 5 +-
.../media/test-drivers/vivid/vivid-vbi-cap.c | 5 +-
.../media/test-drivers/vivid/vivid-vbi-out.c | 5 +-
.../media/test-drivers/vivid/vivid-vid-cap.c | 5 +-
.../media/test-drivers/vivid/vivid-vid-out.c | 5 +-
drivers/media/usb/airspy/airspy.c | 9 +-
drivers/media/usb/cx231xx/cx231xx-417.c | 4 +-
drivers/media/usb/cx231xx/cx231xx-video.c | 4 +-
drivers/media/usb/hackrf/hackrf.c | 9 +-
drivers/media/usb/usbtv/usbtv-video.c | 3 +-
drivers/media/v4l2-core/v4l2-dev.c | 1 +
drivers/media/v4l2-core/v4l2-ioctl.c | 21 +-
drivers/media/v4l2-core/v4l2-mem2mem.c | 20 +
.../staging/media/atomisp/pci/atomisp_ioctl.c | 4 +-
drivers/staging/media/imx/imx-media-capture.c | 7 +-
drivers/staging/media/meson/vdec/vdec.c | 13 +-
.../staging/media/sunxi/cedrus/cedrus_h264.c | 8 +-
.../staging/media/sunxi/cedrus/cedrus_h265.c | 9 +-
include/media/v4l2-ioctl.h | 4 +
include/media/v4l2-mem2mem.h | 12 +
include/media/videobuf2-core.h | 65 +-
include/media/videobuf2-v4l2.h | 13 +
include/uapi/linux/videodev2.h | 24 +-
samples/v4l/v4l2-pci-skeleton.c | 5 +-
69 files changed, 969 insertions(+), 502 deletions(-)
create mode 100644 Documentation/userspace-api/media/v4l/vidioc-delete-bufs.rst

--
2.39.2


2023-09-28 01:23:52

by Benjamin Gaignard

[permalink] [raw]
Subject: [PATCH v8 21/53] media: core: Add helper to get queue number of buffers

In the future a side effect of introducing DELETE_BUFS ioctl is
the create of 'holes' (i.e. unused buffers) in bufs arrays.
To know which entries of the bufs arrays are used a bitmap will
be added in struct vb2_queue. That will also mean that the number
of buffers will be computed given the number of bit set in this bitmap.
To smoothly allow this evolution all drives must stop using
directly num_buffers field from struct vb2_queue.
Let do it in 4 steps:
- Introduce vb2_get_num_buffers() helper
- Rework how create_bufs first buffer index is computed
- Rework all drivers to remove direct calls to queue num_buffers
- Replace num_buffers by a bitmap.

Signed-off-by: Benjamin Gaignard <[email protected]>
---
.../media/common/videobuf2/videobuf2-core.c | 104 ++++++++++--------
.../media/common/videobuf2/videobuf2-v4l2.c | 2 +-
include/media/videobuf2-core.h | 11 +-
3 files changed, 68 insertions(+), 49 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index 023e00aa4848..320f37e46343 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -511,12 +511,12 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
*/
static void __vb2_free_mem(struct vb2_queue *q, unsigned int buffers)
{
- unsigned int buffer;
+ unsigned int buffer = 0;
+ long i = q->max_num_buffers;
struct vb2_buffer *vb;

- for (buffer = q->num_buffers - buffers; buffer < q->num_buffers;
- ++buffer) {
- vb = vb2_get_buffer(q, buffer);
+ for (i = q->max_num_buffers; i >= 0 && buffer < buffers; i--) {
+ vb = vb2_get_buffer(q, i);
if (!vb)
continue;

@@ -527,6 +527,7 @@ static void __vb2_free_mem(struct vb2_queue *q, unsigned int buffers)
__vb2_buf_dmabuf_put(vb);
else
__vb2_buf_userptr_put(vb);
+ buffer++;
}
}

@@ -538,16 +539,20 @@ static void __vb2_free_mem(struct vb2_queue *q, unsigned int buffers)
static void __vb2_queue_free(struct vb2_queue *q, unsigned int buffers)
{
unsigned int buffer;
+ long i = q->max_num_buffers;

lockdep_assert_held(&q->mmap_lock);

/* Call driver-provided cleanup function for each buffer, if provided */
- for (buffer = q->num_buffers - buffers; buffer < q->num_buffers;
- ++buffer) {
- struct vb2_buffer *vb = vb2_get_buffer(q, buffer);
+ for (i = q->max_num_buffers, buffer = 0; i >= 0 && buffer < buffers; i--) {
+ struct vb2_buffer *vb = vb2_get_buffer(q, i);

- if (vb && vb->planes[0].mem_priv)
+ if (!vb)
+ continue;
+ if (vb->planes[0].mem_priv) {
call_void_vb_qop(vb, buf_cleanup, vb);
+ buffer++;
+ }
}

/* Release video buffer memory */
@@ -558,7 +563,7 @@ static void __vb2_queue_free(struct vb2_queue *q, unsigned int buffers)
* Check that all the calls were balanced during the life-time of this
* queue. If not then dump the counters to the kernel log.
*/
- if (q->num_buffers) {
+ if (vb2_get_num_buffers(q)) {
bool unbalanced = q->cnt_start_streaming != q->cnt_stop_streaming ||
q->cnt_prepare_streaming != q->cnt_unprepare_streaming ||
q->cnt_wait_prepare != q->cnt_wait_finish;
@@ -584,7 +589,7 @@ static void __vb2_queue_free(struct vb2_queue *q, unsigned int buffers)
q->cnt_stop_streaming = 0;
q->cnt_unprepare_streaming = 0;
}
- for (buffer = 0; buffer < q->num_buffers; ++buffer) {
+ for (buffer = 0; buffer < q->max_num_buffers; buffer++) {
struct vb2_buffer *vb = vb2_get_buffer(q, buffer);
bool unbalanced;

@@ -636,8 +641,7 @@ static void __vb2_queue_free(struct vb2_queue *q, unsigned int buffers)
#endif

/* Free vb2 buffers */
- for (buffer = q->num_buffers - buffers; buffer < q->num_buffers;
- ++buffer) {
+ for (i = q->max_num_buffers, buffer = 0; i > 0 && buffer < buffers; i--) {
struct vb2_buffer *vb = vb2_get_buffer(q, buffer);

if (!vb)
@@ -645,10 +649,10 @@ static void __vb2_queue_free(struct vb2_queue *q, unsigned int buffers)

vb2_queue_remove_buffer(vb);
kfree(vb);
+ buffer++;
}

- q->num_buffers -= buffers;
- if (!q->num_buffers) {
+ if (!vb2_get_num_buffers(q)) {
q->memory = VB2_MEMORY_UNKNOWN;
INIT_LIST_HEAD(&q->queued_list);
}
@@ -679,7 +683,7 @@ EXPORT_SYMBOL(vb2_buffer_in_use);
static bool __buffers_in_use(struct vb2_queue *q)
{
unsigned int buffer;
- for (buffer = 0; buffer < q->num_buffers; ++buffer) {
+ for (buffer = 0; buffer < q->max_num_buffers; ++buffer) {
struct vb2_buffer *vb = vb2_get_buffer(q, buffer);

if (!vb)
@@ -805,6 +809,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
unsigned int flags, unsigned int *count)
{
unsigned int num_buffers, allocated_buffers, num_planes = 0;
+ unsigned int q_num_bufs = vb2_get_num_buffers(q);
unsigned plane_sizes[VB2_MAX_PLANES] = { };
bool non_coherent_mem = flags & V4L2_MEMORY_FLAG_NON_COHERENT;
unsigned int i;
@@ -820,7 +825,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
return -EBUSY;
}

- if (*count == 0 || q->num_buffers != 0 ||
+ if (*count == 0 || q_num_bufs != 0 ||
(q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory) ||
!verify_coherency_flags(q, non_coherent_mem)) {
/*
@@ -838,7 +843,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
* queued without ever calling STREAMON.
*/
__vb2_queue_cancel(q);
- __vb2_queue_free(q, q->num_buffers);
+ __vb2_queue_free(q, q_num_bufs);
mutex_unlock(&q->mmap_lock);

/*
@@ -938,7 +943,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
if (ret < 0) {
/*
* Note: __vb2_queue_free() will subtract 'allocated_buffers'
- * from q->num_buffers and it will reset q->memory to
+ * from already queued buffers and it will reset q->memory to
* VB2_MEMORY_UNKNOWN.
*/
__vb2_queue_free(q, allocated_buffers);
@@ -972,10 +977,11 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
unsigned int num_planes = 0, num_buffers, allocated_buffers;
unsigned plane_sizes[VB2_MAX_PLANES] = { };
bool non_coherent_mem = flags & V4L2_MEMORY_FLAG_NON_COHERENT;
- bool no_previous_buffers = !q->num_buffers;
+ unsigned int q_num_bufs = vb2_get_num_buffers(q);
+ bool no_previous_buffers = !q_num_bufs;
int ret = 0;

- if (q->num_buffers == q->max_num_buffers) {
+ if (q_num_bufs == q->max_num_buffers) {
dprintk(q, 1, "maximum number of buffers already allocated\n");
return -ENOBUFS;
}
@@ -1010,7 +1016,7 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
return -EINVAL;
}

- num_buffers = min(*count, q->max_num_buffers - q->num_buffers);
+ num_buffers = min(*count, q->max_num_buffers - q_num_bufs);

if (requested_planes && requested_sizes) {
num_planes = requested_planes;
@@ -1042,7 +1048,7 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
num_buffers = allocated_buffers;

/*
- * q->num_buffers contains the total number of buffers, that the
+ * num_buffers contains the total number of buffers, that the
* queue driver has set up
*/
ret = call_qop(q, queue_setup, q, &num_buffers,
@@ -1063,7 +1069,7 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
if (ret < 0) {
/*
* Note: __vb2_queue_free() will subtract 'allocated_buffers'
- * from q->num_buffers and it will reset q->memory to
+ * from already queued buffers and it will reset q->memory to
* VB2_MEMORY_UNKNOWN.
*/
__vb2_queue_free(q, allocated_buffers);
@@ -1680,7 +1686,7 @@ static int vb2_start_streaming(struct vb2_queue *q)
* Forcefully reclaim buffers if the driver did not
* correctly return them to vb2.
*/
- for (i = 0; i < q->num_buffers; ++i) {
+ for (i = 0; i < q->max_num_buffers; ++i) {
vb = vb2_get_buffer(q, i);

if (!vb)
@@ -2086,9 +2092,8 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
* to vb2 in stop_streaming().
*/
if (WARN_ON(atomic_read(&q->owned_by_drv_count))) {
- for (i = 0; i < q->num_buffers; ++i) {
+ for (i = 0; i < q->max_num_buffers; i++) {
struct vb2_buffer *vb = vb2_get_buffer(q, i);
-
if (!vb)
continue;

@@ -2130,10 +2135,9 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
* call to __fill_user_buffer() after buf_finish(). That order can't
* be changed, so we can't move the buf_finish() to __vb2_dqbuf().
*/
- for (i = 0; i < q->num_buffers; ++i) {
+ for (i = 0; i < q->max_num_buffers; i++) {
struct vb2_buffer *vb;
struct media_request *req;
-
vb = vb2_get_buffer(q, i);
if (!vb)
continue;
@@ -2178,6 +2182,7 @@ static void __vb2_queue_cancel(struct vb2_queue *q)

int vb2_core_streamon(struct vb2_queue *q, unsigned int type)
{
+ unsigned int q_num_bufs = vb2_get_num_buffers(q);
int ret;

if (type != q->type) {
@@ -2190,12 +2195,12 @@ int vb2_core_streamon(struct vb2_queue *q, unsigned int type)
return 0;
}

- if (!q->num_buffers) {
+ if (!q_num_bufs) {
dprintk(q, 1, "no buffers have been allocated\n");
return -EINVAL;
}

- if (q->num_buffers < q->min_buffers_needed) {
+ if (q_num_bufs < q->min_buffers_needed) {
dprintk(q, 1, "need at least %u allocated buffers\n",
q->min_buffers_needed);
return -EINVAL;
@@ -2533,7 +2538,7 @@ void vb2_core_queue_release(struct vb2_queue *q)
__vb2_cleanup_fileio(q);
__vb2_queue_cancel(q);
mutex_lock(&q->mmap_lock);
- __vb2_queue_free(q, q->num_buffers);
+ __vb2_queue_free(q, q->max_num_buffers);
kfree(q->bufs);
q->bufs = NULL;
mutex_unlock(&q->mmap_lock);
@@ -2564,7 +2569,7 @@ __poll_t vb2_core_poll(struct vb2_queue *q, struct file *file,
/*
* Start file I/O emulator only if streaming API has not been used yet.
*/
- if (q->num_buffers == 0 && !vb2_fileio_is_active(q)) {
+ if (vb2_get_num_buffers(q) == 0 && !vb2_fileio_is_active(q)) {
if (!q->is_output && (q->io_modes & VB2_READ) &&
(req_events & (EPOLLIN | EPOLLRDNORM))) {
if (__vb2_init_fileio(q, 1))
@@ -2602,7 +2607,7 @@ __poll_t vb2_core_poll(struct vb2_queue *q, struct file *file,
* For output streams you can call write() as long as there are fewer
* buffers queued than there are buffers available.
*/
- if (q->is_output && q->fileio && q->queued_count < q->num_buffers)
+ if (q->is_output && q->fileio && q->queued_count < vb2_get_num_buffers(q))
return EPOLLOUT | EPOLLWRNORM;

if (list_empty(&q->done_list)) {
@@ -2651,8 +2656,8 @@ struct vb2_fileio_buf {
* struct vb2_fileio_data - queue context used by file io emulator
*
* @cur_index: the index of the buffer currently being read from or
- * written to. If equal to q->num_buffers then a new buffer
- * must be dequeued.
+ * written to. If equal to number of already queues buffers
+ * then a new buffer must be dequeued.
* @initial_index: in the read() case all buffers are queued up immediately
* in __vb2_init_fileio() and __vb2_perform_fileio() just cycles
* buffers. However, in the write() case no buffers are initially
@@ -2662,7 +2667,7 @@ struct vb2_fileio_buf {
* buffers. This means that initially __vb2_perform_fileio()
* needs to know what buffer index to use when it is queuing up
* the buffers for the first time. That initial index is stored
- * in this field. Once it is equal to q->num_buffers all
+ * in this field. Once it is equal to num_buffers all
* available buffers have been queued and __vb2_perform_fileio()
* should start the normal dequeue/queue cycle.
*
@@ -2712,7 +2717,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
/*
* Check if streaming api has not been already activated.
*/
- if (q->streaming || q->num_buffers > 0)
+ if (q->streaming || vb2_get_num_buffers(q) > 0)
return -EBUSY;

/*
@@ -2762,7 +2767,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
/*
* Get kernel address of each buffer.
*/
- for (i = 0; i < q->num_buffers; i++) {
+ for (i = 0; i < vb2_get_num_buffers(q); i++) {
vb = vb2_get_buffer(q, i);
WARN_ON_ONCE(!vb);

@@ -2781,18 +2786,23 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
/*
* Queue all buffers.
*/
- for (i = 0; i < q->num_buffers; i++) {
- ret = vb2_core_qbuf(q, q->bufs[i], NULL, NULL);
+ for (i = 0; i < vb2_get_num_buffers(q); i++) {
+ struct vb2_buffer *vb2 = vb2_get_buffer(q, i);
+
+ if (!vb2)
+ continue;
+
+ ret = vb2_core_qbuf(q, vb2, NULL, NULL);
if (ret)
goto err_reqbufs;
fileio->bufs[i].queued = 1;
}
/*
* All buffers have been queued, so mark that by setting
- * initial_index to q->num_buffers
+ * initial_index to num_buffers
*/
- fileio->initial_index = q->num_buffers;
- fileio->cur_index = q->num_buffers;
+ fileio->initial_index = vb2_get_num_buffers(q);
+ fileio->cur_index = fileio->initial_index;
}

/*
@@ -2986,12 +2996,12 @@ static size_t __vb2_perform_fileio(struct vb2_queue *q, char __user *data, size_
* If we are queuing up buffers for the first time, then
* increase initial_index by one.
*/
- if (fileio->initial_index < q->num_buffers)
+ if (fileio->initial_index < vb2_get_num_buffers(q))
fileio->initial_index++;
/*
* The next buffer to use is either a buffer that's going to be
- * queued for the first time (initial_index < q->num_buffers)
- * or it is equal to q->num_buffers, meaning that the next
+ * queued for the first time (initial_index < num_buffers)
+ * or it is equal to num_buffers, meaning that the next
* time we need to dequeue a buffer since we've now queued up
* all the 'first time' buffers.
*/
@@ -3038,7 +3048,7 @@ static int vb2_thread(void *data)
int ret = 0;

if (q->is_output) {
- prequeue = q->num_buffers;
+ prequeue = vb2_get_num_buffers(q);
copy_timestamp = q->copy_timestamp;
}

diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index 9a4b11279fe7..304163e27d10 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -628,7 +628,7 @@ struct vb2_buffer *vb2_find_buffer(struct vb2_queue *q, u64 timestamp)
* This loop doesn't scale if there is a really large number of buffers.
* Maybe something more efficient will be needed in this case.
*/
- for (i = 0; i < q->num_buffers; i++) {
+ for (i = 0; i < q->max_num_buffers; i++) {
vb2 = vb2_get_buffer(q, i);

if (!vb2)
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index 1d6d68e8a711..dffb9647d4d1 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -1141,6 +1141,15 @@ static inline bool vb2_fileio_is_active(struct vb2_queue *q)
return q->fileio;
}

+/**
+ * vb2_get_num_buffers() - get the number of buffer in a queue
+ * @q: pointer to &struct vb2_queue with videobuf2 queue.
+ */
+static inline unsigned int vb2_get_num_buffers(struct vb2_queue *q)
+{
+ return q->num_buffers;
+}
+
/**
* vb2_is_busy() - return busy status of the queue.
* @q: pointer to &struct vb2_queue with videobuf2 queue.
@@ -1149,7 +1158,7 @@ static inline bool vb2_fileio_is_active(struct vb2_queue *q)
*/
static inline bool vb2_is_busy(struct vb2_queue *q)
{
- return (q->num_buffers > 0);
+ return (vb2_get_num_buffers(q) > 0);
}

/**
--
2.39.2

2023-09-28 02:12:41

by Benjamin Gaignard

[permalink] [raw]
Subject: [PATCH v8 41/53] media: usb: usbtv: Set min_buffers_needed to 2

vb2 queue_setup checks for a minimum number of buffers so set
min_buffers_needed to 2 and remove the useless check in
usbtv_queue_setup().

Signed-off-by: Benjamin Gaignard <[email protected]>
---
drivers/media/usb/usbtv/usbtv-video.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/media/usb/usbtv/usbtv-video.c b/drivers/media/usb/usbtv/usbtv-video.c
index 1e30e05953dc..0e9e860be47f 100644
--- a/drivers/media/usb/usbtv/usbtv-video.c
+++ b/drivers/media/usb/usbtv/usbtv-video.c
@@ -727,8 +727,6 @@ static int usbtv_queue_setup(struct vb2_queue *vq,
struct usbtv *usbtv = vb2_get_drv_priv(vq);
unsigned size = USBTV_CHUNK * usbtv->n_chunks * 2 * sizeof(u32);

- if (vq->num_buffers + *nbuffers < 2)
- *nbuffers = 2 - vq->num_buffers;
if (*nplanes)
return sizes[0] < size ? -EINVAL : 0;
*nplanes = 1;
@@ -892,6 +890,7 @@ int usbtv_video_init(struct usbtv *usbtv)
/* videobuf2 structure */
usbtv->vb2q.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
usbtv->vb2q.io_modes = VB2_MMAP | VB2_USERPTR | VB2_READ;
+ usbtv->vb2q.min_buffers_needed = 2;
usbtv->vb2q.drv_priv = usbtv;
usbtv->vb2q.buf_struct_size = sizeof(struct usbtv_buf);
usbtv->vb2q.ops = &usbtv_vb2_ops;
--
2.39.2

2023-09-28 03:09:27

by Benjamin Gaignard

[permalink] [raw]
Subject: [PATCH v8 23/53] media: dvb-frontends: rtl2832_srd: Use queue min_buffers_needed field

queue_setup checks for a minimum number of buffers so use queue
min_buffers_needed field and remove the check done in
rtl2832_sdr_queue_setup().

Signed-off-by: Benjamin Gaignard <[email protected]>
---
drivers/media/dvb-frontends/rtl2832_sdr.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/media/dvb-frontends/rtl2832_sdr.c b/drivers/media/dvb-frontends/rtl2832_sdr.c
index 02c619e51641..597b1548ed8b 100644
--- a/drivers/media/dvb-frontends/rtl2832_sdr.c
+++ b/drivers/media/dvb-frontends/rtl2832_sdr.c
@@ -440,14 +440,9 @@ static int rtl2832_sdr_queue_setup(struct vb2_queue *vq,
struct rtl2832_sdr_dev *dev = vb2_get_drv_priv(vq);
struct platform_device *pdev = dev->pdev;

- dev_dbg(&pdev->dev, "nbuffers=%d\n", *nbuffers);
-
- /* Need at least 8 buffers */
- if (vq->num_buffers + *nbuffers < 8)
- *nbuffers = 8 - vq->num_buffers;
*nplanes = 1;
sizes[0] = PAGE_ALIGN(dev->buffersize);
- dev_dbg(&pdev->dev, "nbuffers=%d sizes[0]=%d\n", *nbuffers, sizes[0]);
+ dev_dbg(&pdev->dev, "nbuffers=%d sizes[0]=%d\n", vb2_get_num_buffers(vq), sizes[0]);
return 0;
}

@@ -1364,6 +1359,8 @@ static int rtl2832_sdr_probe(struct platform_device *pdev)
dev->vb_queue.ops = &rtl2832_sdr_vb2_ops;
dev->vb_queue.mem_ops = &vb2_vmalloc_memops;
dev->vb_queue.timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC;
+ /* Need at least 8 buffers */
+ dev->vb_queue.min_buffers_needed = 8;
ret = vb2_queue_init(&dev->vb_queue);
if (ret) {
dev_err(&pdev->dev, "Could not initialize vb2 queue\n");
--
2.39.2

2023-09-28 04:27:42

by Benjamin Gaignard

[permalink] [raw]
Subject: [PATCH v8 39/53] media: usb: cx231xx: Set min_buffers_needed to CX231XX_MIN_BUF

vb2 queue_setup checks for a minimum number of buffers so set
min_buffers_needed to aCX231XX_MIN_BUFnd remove the useless check in
cx231xx queue_setup().

Signed-off-by: Benjamin Gaignard <[email protected]>
---
drivers/media/usb/cx231xx/cx231xx-417.c | 4 +---
drivers/media/usb/cx231xx/cx231xx-video.c | 4 +---
2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/media/usb/cx231xx/cx231xx-417.c b/drivers/media/usb/cx231xx/cx231xx-417.c
index c5e21785fafe..fecdb12f5ef7 100644
--- a/drivers/media/usb/cx231xx/cx231xx-417.c
+++ b/drivers/media/usb/cx231xx/cx231xx-417.c
@@ -1223,9 +1223,6 @@ static int queue_setup(struct vb2_queue *vq,
dev->ts1.ts_packet_size = mpeglinesize;
dev->ts1.ts_packet_count = mpeglines;

- if (vq->num_buffers + *nbuffers < CX231XX_MIN_BUF)
- *nbuffers = CX231XX_MIN_BUF - vq->num_buffers;
-
if (*nplanes)
return sizes[0] < size ? -EINVAL : 0;
*nplanes = 1;
@@ -1777,6 +1774,7 @@ int cx231xx_417_register(struct cx231xx *dev)
q = &dev->mpegq;
q->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
q->io_modes = VB2_USERPTR | VB2_MMAP | VB2_DMABUF | VB2_READ;
+ q->min_buffers_needed = CX231XX_MIN_BUF;
q->drv_priv = dev;
q->buf_struct_size = sizeof(struct cx231xx_buffer);
q->ops = &cx231xx_video_qops;
diff --git a/drivers/media/usb/cx231xx/cx231xx-video.c b/drivers/media/usb/cx231xx/cx231xx-video.c
index e23b8ccd79d4..26b593844157 100644
--- a/drivers/media/usb/cx231xx/cx231xx-video.c
+++ b/drivers/media/usb/cx231xx/cx231xx-video.c
@@ -717,9 +717,6 @@ static int queue_setup(struct vb2_queue *vq,

dev->size = (dev->width * dev->height * dev->format->depth + 7) >> 3;

- if (vq->num_buffers + *nbuffers < CX231XX_MIN_BUF)
- *nbuffers = CX231XX_MIN_BUF - vq->num_buffers;
-
if (*nplanes)
return sizes[0] < dev->size ? -EINVAL : 0;
*nplanes = 1;
@@ -1805,6 +1802,7 @@ int cx231xx_register_analog_devices(struct cx231xx *dev)
q = &dev->vidq;
q->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
q->io_modes = VB2_USERPTR | VB2_MMAP | VB2_DMABUF | VB2_READ;
+ q->min_buffers_needed = CX231XX_MIN_BUF;
q->drv_priv = dev;
q->buf_struct_size = sizeof(struct cx231xx_buffer);
q->ops = &cx231xx_video_qops;
--
2.39.2

2023-09-28 04:34:09

by Benjamin Gaignard

[permalink] [raw]
Subject: [PATCH v8 18/53] media: verisilicon: g2: Use common helpers to compute chroma and mv offsets

HEVC and VP9 are running on the same hardware and share the same
chroma and motion vectors offset constraint.
Create common helpers functions for these computation.
Source and destination buffer height may not be the same because
alignment constraint are different so use destination height to
compute chroma offset because we target this buffer as hardware
output.
To be able to use the helpers in both VP9 HEVC code remove dec_params
and use context->bit_depth instead.

Signed-off-by: Benjamin Gaignard <[email protected]>
---
.../media/platform/verisilicon/hantro_g2.c | 14 ++++++++++
.../platform/verisilicon/hantro_g2_hevc_dec.c | 18 ++-----------
.../platform/verisilicon/hantro_g2_vp9_dec.c | 26 +++----------------
.../media/platform/verisilicon/hantro_hw.h | 3 +++
4 files changed, 23 insertions(+), 38 deletions(-)

diff --git a/drivers/media/platform/verisilicon/hantro_g2.c b/drivers/media/platform/verisilicon/hantro_g2.c
index ee5f14c5f8f2..b880a6849d58 100644
--- a/drivers/media/platform/verisilicon/hantro_g2.c
+++ b/drivers/media/platform/verisilicon/hantro_g2.c
@@ -8,6 +8,8 @@
#include "hantro_hw.h"
#include "hantro_g2_regs.h"

+#define G2_ALIGN 16
+
void hantro_g2_check_idle(struct hantro_dev *vpu)
{
int i;
@@ -42,3 +44,15 @@ irqreturn_t hantro_g2_irq(int irq, void *dev_id)

return IRQ_HANDLED;
}
+
+size_t hantro_g2_chroma_offset(struct hantro_ctx *ctx)
+{
+ return ctx->dst_fmt.width * ctx->dst_fmt.height * ctx->bit_depth / 8;
+}
+
+size_t hantro_g2_motion_vectors_offset(struct hantro_ctx *ctx)
+{
+ size_t cr_offset = hantro_g2_chroma_offset(ctx);
+
+ return ALIGN((cr_offset * 3) / 2, G2_ALIGN);
+}
diff --git a/drivers/media/platform/verisilicon/hantro_g2_hevc_dec.c b/drivers/media/platform/verisilicon/hantro_g2_hevc_dec.c
index a9d4ac84a8d8..d3f8c33eb16c 100644
--- a/drivers/media/platform/verisilicon/hantro_g2_hevc_dec.c
+++ b/drivers/media/platform/verisilicon/hantro_g2_hevc_dec.c
@@ -8,20 +8,6 @@
#include "hantro_hw.h"
#include "hantro_g2_regs.h"

-#define G2_ALIGN 16
-
-static size_t hantro_hevc_chroma_offset(struct hantro_ctx *ctx)
-{
- return ctx->dst_fmt.width * ctx->dst_fmt.height * ctx->bit_depth / 8;
-}
-
-static size_t hantro_hevc_motion_vectors_offset(struct hantro_ctx *ctx)
-{
- size_t cr_offset = hantro_hevc_chroma_offset(ctx);
-
- return ALIGN((cr_offset * 3) / 2, G2_ALIGN);
-}
-
static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
{
struct hantro_dev *vpu = ctx->dev;
@@ -384,8 +370,8 @@ static int set_ref(struct hantro_ctx *ctx)
struct hantro_dev *vpu = ctx->dev;
struct vb2_v4l2_buffer *vb2_dst;
struct hantro_decoded_buffer *dst;
- size_t cr_offset = hantro_hevc_chroma_offset(ctx);
- size_t mv_offset = hantro_hevc_motion_vectors_offset(ctx);
+ size_t cr_offset = hantro_g2_chroma_offset(ctx);
+ size_t mv_offset = hantro_g2_motion_vectors_offset(ctx);
u32 max_ref_frames;
u16 dpb_longterm_e;
static const struct hantro_reg cur_poc[] = {
diff --git a/drivers/media/platform/verisilicon/hantro_g2_vp9_dec.c b/drivers/media/platform/verisilicon/hantro_g2_vp9_dec.c
index 6db1c32fce4d..342e543dee4c 100644
--- a/drivers/media/platform/verisilicon/hantro_g2_vp9_dec.c
+++ b/drivers/media/platform/verisilicon/hantro_g2_vp9_dec.c
@@ -16,8 +16,6 @@
#include "hantro_vp9.h"
#include "hantro_g2_regs.h"

-#define G2_ALIGN 16
-
enum hantro_ref_frames {
INTRA_FRAME = 0,
LAST_FRAME = 1,
@@ -90,22 +88,6 @@ static int start_prepare_run(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_
return 0;
}

-static size_t chroma_offset(const struct hantro_ctx *ctx,
- const struct v4l2_ctrl_vp9_frame *dec_params)
-{
- int bytes_per_pixel = dec_params->bit_depth == 8 ? 1 : 2;
-
- return ctx->src_fmt.width * ctx->src_fmt.height * bytes_per_pixel;
-}
-
-static size_t mv_offset(const struct hantro_ctx *ctx,
- const struct v4l2_ctrl_vp9_frame *dec_params)
-{
- size_t cr_offset = chroma_offset(ctx, dec_params);
-
- return ALIGN((cr_offset * 3) / 2, G2_ALIGN);
-}
-
static struct hantro_decoded_buffer *
get_ref_buf(struct hantro_ctx *ctx, struct vb2_v4l2_buffer *dst, u64 timestamp)
{
@@ -156,13 +138,13 @@ static void config_output(struct hantro_ctx *ctx,
luma_addr = hantro_get_dec_buf_addr(ctx, &dst->base.vb.vb2_buf);
hantro_write_addr(ctx->dev, G2_OUT_LUMA_ADDR, luma_addr);

- chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
+ chroma_addr = luma_addr + hantro_g2_chroma_offset(ctx);
hantro_write_addr(ctx->dev, G2_OUT_CHROMA_ADDR, chroma_addr);
- dst->vp9.chroma_offset = chroma_offset(ctx, dec_params);
+ dst->vp9.chroma_offset = hantro_g2_chroma_offset(ctx);

- mv_addr = luma_addr + mv_offset(ctx, dec_params);
+ mv_addr = luma_addr + hantro_g2_motion_vectors_offset(ctx);
hantro_write_addr(ctx->dev, G2_OUT_MV_ADDR, mv_addr);
- dst->vp9.mv_offset = mv_offset(ctx, dec_params);
+ dst->vp9.mv_offset = hantro_g2_motion_vectors_offset(ctx);
}

struct hantro_vp9_ref_reg {
diff --git a/drivers/media/platform/verisilicon/hantro_hw.h b/drivers/media/platform/verisilicon/hantro_hw.h
index 292a76ef643e..9aec8a79acdc 100644
--- a/drivers/media/platform/verisilicon/hantro_hw.h
+++ b/drivers/media/platform/verisilicon/hantro_hw.h
@@ -521,6 +521,9 @@ hantro_av1_mv_size(unsigned int width, unsigned int height)
return ALIGN(num_sbs * 384, 16) * 2 + 512;
}

+size_t hantro_g2_chroma_offset(struct hantro_ctx *ctx);
+size_t hantro_g2_motion_vectors_offset(struct hantro_ctx *ctx);
+
int hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx);
int rockchip_vpu2_mpeg2_dec_run(struct hantro_ctx *ctx);
void hantro_mpeg2_dec_copy_qtable(u8 *qtable,
--
2.39.2

2023-09-28 07:58:26

by Benjamin Gaignard

[permalink] [raw]
Subject: [PATCH v8 02/53] media: videobuf2: Stop spamming kernel log with all queue counter

Only report unbalanced queue counters do avoid spamming kernel log
with useless information.

Signed-off-by: Benjamin Gaignard <[email protected]>
---
.../media/common/videobuf2/videobuf2-core.c | 79 +++++++++++--------
1 file changed, 44 insertions(+), 35 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index 6eeddb3a01c7..1c5c84095065 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -532,25 +532,26 @@ static void __vb2_queue_free(struct vb2_queue *q, unsigned int buffers)

#ifdef CONFIG_VIDEO_ADV_DEBUG
/*
- * Check that all the calls were balances during the life-time of this
- * queue. If not (or if the debug level is 1 or up), then dump the
- * counters to the kernel log.
+ * Check that all the calls were balanced during the life-time of this
+ * queue. If not then dump the counters to the kernel log.
*/
if (q->num_buffers) {
bool unbalanced = q->cnt_start_streaming != q->cnt_stop_streaming ||
q->cnt_prepare_streaming != q->cnt_unprepare_streaming ||
q->cnt_wait_prepare != q->cnt_wait_finish;

- if (unbalanced || debug) {
- pr_info("counters for queue %p:%s\n", q,
- unbalanced ? " UNBALANCED!" : "");
- pr_info(" setup: %u start_streaming: %u stop_streaming: %u\n",
- q->cnt_queue_setup, q->cnt_start_streaming,
- q->cnt_stop_streaming);
- pr_info(" prepare_streaming: %u unprepare_streaming: %u\n",
- q->cnt_prepare_streaming, q->cnt_unprepare_streaming);
- pr_info(" wait_prepare: %u wait_finish: %u\n",
- q->cnt_wait_prepare, q->cnt_wait_finish);
+ if (unbalanced) {
+ pr_info("unbalanced counters for queue %p:\n", q);
+ if (q->cnt_start_streaming != q->cnt_stop_streaming)
+ pr_info(" setup: %u start_streaming: %u stop_streaming: %u\n",
+ q->cnt_queue_setup, q->cnt_start_streaming,
+ q->cnt_stop_streaming);
+ if (q->cnt_prepare_streaming != q->cnt_unprepare_streaming)
+ pr_info(" prepare_streaming: %u unprepare_streaming: %u\n",
+ q->cnt_prepare_streaming, q->cnt_unprepare_streaming);
+ if (q->cnt_wait_prepare != q->cnt_wait_finish)
+ pr_info(" wait_prepare: %u wait_finish: %u\n",
+ q->cnt_wait_prepare, q->cnt_wait_finish);
}
q->cnt_queue_setup = 0;
q->cnt_wait_prepare = 0;
@@ -571,29 +572,37 @@ static void __vb2_queue_free(struct vb2_queue *q, unsigned int buffers)
vb->cnt_buf_prepare != vb->cnt_buf_finish ||
vb->cnt_buf_init != vb->cnt_buf_cleanup;

- if (unbalanced || debug) {
- pr_info(" counters for queue %p, buffer %d:%s\n",
- q, buffer, unbalanced ? " UNBALANCED!" : "");
- pr_info(" buf_init: %u buf_cleanup: %u buf_prepare: %u buf_finish: %u\n",
- vb->cnt_buf_init, vb->cnt_buf_cleanup,
- vb->cnt_buf_prepare, vb->cnt_buf_finish);
- pr_info(" buf_out_validate: %u buf_queue: %u buf_done: %u buf_request_complete: %u\n",
- vb->cnt_buf_out_validate, vb->cnt_buf_queue,
- vb->cnt_buf_done, vb->cnt_buf_request_complete);
- pr_info(" alloc: %u put: %u prepare: %u finish: %u mmap: %u\n",
- vb->cnt_mem_alloc, vb->cnt_mem_put,
- vb->cnt_mem_prepare, vb->cnt_mem_finish,
- vb->cnt_mem_mmap);
- pr_info(" get_userptr: %u put_userptr: %u\n",
- vb->cnt_mem_get_userptr, vb->cnt_mem_put_userptr);
- pr_info(" attach_dmabuf: %u detach_dmabuf: %u map_dmabuf: %u unmap_dmabuf: %u\n",
- vb->cnt_mem_attach_dmabuf, vb->cnt_mem_detach_dmabuf,
- vb->cnt_mem_map_dmabuf, vb->cnt_mem_unmap_dmabuf);
- pr_info(" get_dmabuf: %u num_users: %u vaddr: %u cookie: %u\n",
+ if (unbalanced) {
+ pr_info("unbalanced counters for queue %p, buffer %d:\n",
+ q, buffer);
+ if (vb->cnt_buf_init != vb->cnt_buf_cleanup)
+ pr_info(" buf_init: %u buf_cleanup: %u\n",
+ vb->cnt_buf_init, vb->cnt_buf_cleanup);
+ if (vb->cnt_buf_prepare != vb->cnt_buf_finish)
+ pr_info(" buf_prepare: %u buf_finish: %u\n",
+ vb->cnt_buf_prepare, vb->cnt_buf_finish);
+ if (vb->cnt_buf_queue != vb->cnt_buf_done)
+ pr_info(" buf_out_validate: %u buf_queue: %u buf_done: %u buf_request_complete: %u\n",
+ vb->cnt_buf_out_validate, vb->cnt_buf_queue,
+ vb->cnt_buf_done, vb->cnt_buf_request_complete);
+ if (vb->cnt_mem_alloc != vb->cnt_mem_put)
+ pr_info(" alloc: %u put: %u\n",
+ vb->cnt_mem_alloc, vb->cnt_mem_put);
+ if (vb->cnt_mem_prepare != vb->cnt_mem_finish)
+ pr_info(" prepare: %u finish: %u\n",
+ vb->cnt_mem_prepare, vb->cnt_mem_finish);
+ if (vb->cnt_mem_get_userptr != vb->cnt_mem_put_userptr)
+ pr_info(" get_userptr: %u put_userptr: %u\n",
+ vb->cnt_mem_get_userptr, vb->cnt_mem_put_userptr);
+ if (vb->cnt_mem_attach_dmabuf != vb->cnt_mem_detach_dmabuf)
+ pr_info(" attach_dmabuf: %u detach_dmabuf: %u\n",
+ vb->cnt_mem_attach_dmabuf, vb->cnt_mem_detach_dmabuf);
+ if (vb->cnt_mem_map_dmabuf != vb->cnt_mem_unmap_dmabuf)
+ pr_info(" map_dmabuf: %u unmap_dmabuf: %u\n",
+ vb->cnt_mem_map_dmabuf, vb->cnt_mem_unmap_dmabuf);
+ pr_info(" get_dmabuf: %u num_users: %u\n",
vb->cnt_mem_get_dmabuf,
- vb->cnt_mem_num_users,
- vb->cnt_mem_vaddr,
- vb->cnt_mem_cookie);
+ vb->cnt_mem_num_users);
}
}
#endif
--
2.39.2

2023-09-28 08:20:08

by Benjamin Gaignard

[permalink] [raw]
Subject: [PATCH v8 25/53] media: pci: cx18: Set correct value to min_buffers_needed field

Set queue min_buffers_needed field to 3 and remove the useless
check.

Signed-off-by: Benjamin Gaignard <[email protected]>
---
drivers/media/pci/cx18/cx18-streams.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/media/pci/cx18/cx18-streams.c b/drivers/media/pci/cx18/cx18-streams.c
index 597472754c4c..6ed2c9fb882c 100644
--- a/drivers/media/pci/cx18/cx18-streams.c
+++ b/drivers/media/pci/cx18/cx18-streams.c
@@ -117,13 +117,6 @@ static int cx18_queue_setup(struct vb2_queue *vq,
else
szimage = cx->cxhdl.height * 720 * 2;

- /*
- * Let's request at least three buffers: two for the
- * DMA engine and one for userspace.
- */
- if (vq->num_buffers + *nbuffers < 3)
- *nbuffers = 3 - vq->num_buffers;
-
if (*nplanes) {
if (*nplanes != 1 || sizes[0] < szimage)
return -EINVAL;
@@ -286,7 +279,11 @@ static int cx18_stream_init(struct cx18 *cx, int type)
s->vidq.ops = &cx18_vb2_qops;
s->vidq.mem_ops = &vb2_vmalloc_memops;
s->vidq.timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC;
- s->vidq.min_buffers_needed = 2;
+ /*
+ * Let's request at least three buffers: two for the
+ * DMA engine and one for userspace.
+ */
+ s->vidq.min_buffers_needed = 3;
s->vidq.gfp_flags = GFP_DMA32;
s->vidq.dev = &cx->pci_dev->dev;
s->vidq.lock = &cx->serialize_lock;
--
2.39.2

2023-09-28 13:12:15

by Hans Verkuil

[permalink] [raw]
Subject: Re: [PATCH v8 00/53] Add DELETE_BUF ioctl

Hi Benjamin,

On 27/09/2023 17:35, Benjamin Gaignard wrote:
> Unlike when resolution change on keyframes, dynamic resolution change
> on inter frames doesn't allow to do a stream off/on sequence because
> it is need to keep all previous references alive to decode inter frames.
> This constraint have two main problems:
> - more memory consumption.
> - more buffers in use.
> To solve these issue this series introduce DELETE_BUFS ioctl and remove
> the 32 buffers limit per queue.
>
> VP9 conformance tests using fluster give a score of 210/305.
> The 24 resize inter tests (vp90-2-21-resize_inter_* files) are ok
> but require to use postprocessor.
>
> Kernel branch is available here:
> https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/commits/remove_vb2_queue_limit_v8
>
> GStreamer branch to use DELETE_BUF ioctl and testing dynamic resolution
> change is here:
> https://gitlab.freedesktop.org/benjamin.gaignard1/gstreamer/-/commits/VP9_drc

I applied patches 1-46 on top of staging and applied patch 1/2 of v4l-utils. I chose
1-46 since those would be the first batch to be merged.

I had to apply a small patch on top, since amphion was still using max_allowed_buffers:

https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=delbuf&id=fce79aa0239a3a4e5c7634b9b7ffc356057b70b4

With that change everything compiled cleanly.

But running the regression tests (test-media script) failed with a kernel oops:

Buffer ioctls (Input 0):
fail: v4l2-test-buffers.cpp(611): q.reqbufs(node, 1)
test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: FAIL
[ 25.309343] general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[ 25.310067] KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
[ 25.310471] CPU: 0 PID: 187 Comm: v4l2-compliance Not tainted 6.6.0-rc3+ #1
[ 25.310837] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 25.311334] RIP: 0010:__vb2_queue_alloc+0x34c/0xd00 [videobuf2_common]
[ 25.311688] Code: 89 f8 48 c1 e8 03 80 3c 18 00 0f 85 0c 07 00 00 49 8b b7 40 01 00 00 44 89 f0 4c 8d 2c c6 41 39 ce 73 18 4c 89 e8 48 c1 e8 03 <80> 3c 18 00 0f 85 d2 08 00 00 49 83 7d 00 00 74 02
0f 0b 4c 89 e8
[ 25.312670] RSP: 0018:ffff88810afef930 EFLAGS: 00010212
[ 25.312948] RAX: 0000000000000001 RBX: dffffc0000000000 RCX: 0000000000000040
[ 25.313331] RDX: ffff88810ce08814 RSI: 0000000000000000 RDI: ffff88810cfb72e0
[ 25.313708] RBP: ffff88810ce08800 R08: 0000000000001800 R09: 0000000000000001
[ 25.314097] R10: ffffffff91560b0f R11: ffffffff8f40006a R12: ffff88810afefa74
[ 25.314470] R13: 0000000000000008 R14: 0000000000000001 R15: ffff88810cfb71a0
[ 25.314854] FS: 00007fa524908440(0000) GS:ffff88811aa00000(0000) knlGS:0000000000000000
[ 25.315281] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 25.315594] CR2: 0000562855948270 CR3: 0000000109e21000 CR4: 0000000000350ef0
[ 25.315982] Call Trace:
[ 25.316122] <TASK>
[ 25.316243] ? die_addr+0x37/0xa0
[ 25.316433] ? exc_general_protection+0x144/0x210
[ 25.316693] ? asm_exc_general_protection+0x22/0x30
[ 25.316955] ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 25.317235] ? __vb2_queue_alloc+0x34c/0xd00 [videobuf2_common]
[ 25.317555] vb2_core_create_bufs+0x5a0/0xce0 [videobuf2_common]
[ 25.317879] ? __vb2_queue_alloc+0xd00/0xd00 [videobuf2_common]
[ 25.318213] vb2_create_bufs+0x680/0x7f0 [videobuf2_v4l2]
[ 25.318506] ? __kasan_kmalloc+0x83/0x90
[ 25.318729] ? __kmalloc+0x5c/0x150
[ 25.318924] ? __video_do_ioctl+0x3a5/0xc20 [videodev]
[ 25.319223] ? __copy_timestamp+0x270/0x270 [videobuf2_v4l2]
[ 25.319532] ? v4l_sanitize_format+0x121/0x340 [videodev]
[ 25.319839] vb2_ioctl_create_bufs+0x280/0x440 [videobuf2_v4l2]
[ 25.320162] ? check_fmt+0x15/0x5e0 [videodev]
[ 25.320420] v4l_create_bufs+0xa8/0x150 [videodev]
[ 25.320693] __video_do_ioctl+0x8d0/0xc20 [videodev]
[ 25.320975] ? v4l_prepare_buf+0xa0/0xa0 [videodev]
[ 25.321250] ? __might_fault+0x112/0x170
[ 25.321463] ? __might_fault+0x112/0x170
[ 25.321675] video_usercopy+0x48c/0xd00 [videodev]
[ 25.321949] ? v4l_prepare_buf+0xa0/0xa0 [videodev]
[ 25.322228] ? v4l_enumstd+0x60/0x60 [videodev]
[ 25.322497] ? selinux_bprm_creds_for_exec+0xa90/0xa90
[ 25.322779] ? rseq_ip_fixup+0x304/0x480
[ 25.322995] ? __up_read+0x10b/0x750
[ 25.323201] ? rseq_get_rseq_cs+0x680/0x680
[ 25.323430] v4l2_ioctl+0x17f/0x1f0 [videodev]
[ 25.323689] __x64_sys_ioctl+0x127/0x190
[ 25.323905] do_syscall_64+0x35/0x80
[ 25.324105] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 25.324372] RIP: 0033:0x7fa524bf44eb
[ 25.324564] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b
04 25 28 00 00
[ 25.325533] RSP: 002b:00007ffc888efee0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 25.325933] RAX: ffffffffffffffda RBX: 00007ffc888f0350 RCX: 00007fa524bf44eb
[ 25.326314] RDX: 00007ffc888effb0 RSI: 00000000c100565c RDI: 0000000000000023
[ 25.326702] RBP: 00007ffc888f8550 R08: 0000000000000000 R09: 0000000000000001
[ 25.327081] R10: 0000000000000016 R11: 0000000000000246 R12: 0000000000000000
[ 25.327468] R13: 0000000000000001 R14: 00007ffc888f8010 R15: 00007ffc888f1790
[ 25.327852] </TASK>
[ 25.327979] Modules linked in: vivid videobuf2_dma_contig v4l2_tpg v4l2_dv_timings cec videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev videobuf2_common mc rc_core
[ 25.328835] ---[ end trace 0000000000000000 ]---
[ 25.329083] RIP: 0010:__vb2_queue_alloc+0x34c/0xd00 [videobuf2_common]
[ 25.329434] Code: 89 f8 48 c1 e8 03 80 3c 18 00 0f 85 0c 07 00 00 49 8b b7 40 01 00 00 44 89 f0 4c 8d 2c c6 41 39 ce 73 18 4c 89 e8 48 c1 e8 03 <80> 3c 18 00 0f 85 d2 08 00 00 49 83 7d 00 00 74 02
0f 0b 4c 89 e8
[ 25.330417] RSP: 0018:ffff88810afef930 EFLAGS: 00010212
[ 25.330812] RAX: 0000000000000001 RBX: dffffc0000000000 RCX: 0000000000000040
[ 25.331194] RDX: ffff88810ce08814 RSI: 0000000000000000 RDI: ffff88810cfb72e0
[ 25.331594] RBP: ffff88810ce08800 R08: 0000000000001800 R09: 0000000000000001
[ 25.331978] R10: ffffffff91560b0f R11: ffffffff8f40006a R12: ffff88810afefa74
[ 25.332360] R13: 0000000000000008 R14: 0000000000000001 R15: ffff88810cfb71a0
[ 25.332746] FS: 00007fa524908440(0000) GS:ffff88811aa00000(0000) knlGS:0000000000000000
[ 25.333170] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 25.333486] CR2: 0000562855948270 CR3: 0000000109e21000 CR4: 0000000000350ef0
[ 25.333861] Kernel panic - not syncing: Fatal exception
[ 25.334179] Kernel Offset: 0xbe00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 25.334748] Rebooting in 30 seconds..

I use the build scripts as described here:

https://lore.kernel.org/linux-media/[email protected]/

I use this command to run the regression tests:

./build.sh -virtme -virtme-utils-path /usr/local/bin none delbuf

(delbuf is the name of my branch)

I don't really want to spend time reviewing this series until this is fixed...

Regards,

Hans

>
> changes in version 8:
> - Add V4L2_BUF_CAP_SUPPORTS_SET_MAX_BUFS and new 'max_buffers' field in v4l2_create_buffers
> structure to report the maximum number of buffers that a queue could allocate.
> - Add V4L2_BUF_CAP_SUPPORTS_DELETE_BUFS to indicate that a queue support
> DELETE_BUFS ioctl.
> - Make some test drivers use more than 32 buffers and DELETE_BUFS ioctl.
> - Fix remarks done by Hans
> - Move "media: core: Rework how create_buf index returned value is
> computed" patch to the top of the serie.
>
> changes in version 7:
> - Use a bitmap to know which entries are valid in queue bufs array.
> The number of buffers in the queue could must calculated from the
> bitmap so num_buffers becomes useless. This led to add quite few
> patches to remove it from all the drivers.
> Note: despiste my attention I may have miss some calls to
> num_buffers...
> - Split patches to make them more readable.
> - Run v4l2-compliance with additional delete-bufs tests.
> - Run ./test-media -kmemleak vivid and no more failures.
> Note: I had to remove USERPTR streaming test because they to much
> frequentely hit get_framevec bug. It is not related to my series
> since this happens all the time on master branch.
> - Fix Hans remarks on v6
>
> changes in version 6:
> - Get a patch per driver to use vb2_get_buffer() instead of directly access
> to queue buffers array.
> - Add lock in vb2_core_delete_buf()
> - Use vb2_buffer instead of index
> - Fix various comments
> - Change buffer index name to BUFFER_INDEX_MASK
> - Stop spamming kernel log with unbalanced counters
>
> changes in version 5:
> - Rework offset cookie encoding pattern is n ow the first patch of the
> serie.
> - Use static array instead of allocated one for postprocessor buffers.
>
> changes in version 4:
> - Stop using Xarray, instead let queues decide about their own maximum
> number of buffer and allocate bufs array given that value.
> - Rework offset cookie encoding pattern.
> - Change DELETE_BUF to DELETE_BUFS because it now usable for
> range of buffer to be symetrical of CREATE_BUFS.
> - Add fixes tags on couple of Verisilicon related patches.
> - Be smarter in Verisilicon postprocessor buffers management.
> - Rebase on top of v6.4
>
> changes in version 3:
> - Use Xarray API to store allocated video buffers.
> - No module parameter to limit the number of buffer per queue.
> - Use Xarray inside Verisilicon driver to store postprocessor buffers
> and remove VB2_MAX_FRAME limit.
> - Allow Versilicon driver to change of resolution while streaming
> - Various fixes the Verisilicon VP9 code to improve fluster score.
>
> changes in version 2:
> - Use a dynamic array and not a list to keep trace of allocated buffers.
> Not use IDR interface because it is marked as deprecated in kernel
> documentation.
> - Add a module parameter to limit the number of buffer per queue.
> - Add DELETE_BUF ioctl and m2m helpers.
>
> Regards,
> Benjamin
>
> Benjamin Gaignard (53):
> media: videobuf2: Rework offset 'cookie' encoding pattern
> media: videobuf2: Stop spamming kernel log with all queue counter
> media: videobuf2: Use vb2_buffer instead of index
> media: amphion: Use vb2_get_buffer() instead of directly access to
> buffers array
> media: mediatek: jpeg: Use vb2_get_buffer() instead of directly access
> to buffers array
> media: mediatek: vdec: Remove useless loop
> media: sti: hva: Use vb2_get_buffer() instead of directly access to
> buffers array
> media: visl: Use vb2_get_buffer() instead of directly access to
> buffers array
> media: atomisp: Use vb2_get_buffer() instead of directly access to
> buffers array
> media: dvb-core: Use vb2_get_buffer() instead of directly access to
> buffers array
> media: videobuf2: Access vb2_queue bufs array through helper functions
> media: videobuf2: Be more flexible on the number of queue stored
> buffers
> media: Report the maximum possible number of buffers for the queue
> media: test-drivers: vivid: Increase max supported buffers for capture
> queues
> media: test-drivers: vicodec: Increase max supported capture queue
> buffers
> media: verisilicon: Refactor postprocessor to store more buffers
> media: verisilicon: Store chroma and motion vectors offset
> media: verisilicon: g2: Use common helpers to compute chroma and mv
> offsets
> media: verisilicon: vp9: Allow to change resolution while streaming
> media: Remove duplicated index vs q->num_buffers check
> media: core: Add helper to get queue number of buffers
> media: dvb-core: Do not initialize twice queue num_buffer field
> media: dvb-frontends: rtl2832_srd: Use queue min_buffers_needed field
> media: video-i2c: Set min_buffers_needed to 2
> media: pci: cx18: Set correct value to min_buffers_needed field
> media: pci: dt3155: Remove useless check
> media: pci: netup_unidvb: Remove useless number of buffers check
> media: pci: tw68: Stop direct calls to queue num_buffers field
> media: pci: tw686x: Set min_buffers_needed to 3
> media: amphion: Stop direct calls to queue num_buffers field
> media: coda: Stop direct calls to queue num_buffers field
> media: mediatek: vcodec: Stop direct calls to queue num_buffers field
> media: nxp: Stop direct calls to queue num_buffers field
> media: renesas: Set min_buffers_needed to 16
> media: ti: Use queue min_buffers_needed field to set the min number of
> buffers
> media: verisilicon: Stop direct calls to queue num_buffers field
> media: test-drivers: Stop direct calls to queue num_buffers field
> media: usb: airspy: Set min_buffers_needed to 8
> media: usb: cx231xx: Set min_buffers_needed to CX231XX_MIN_BUF
> media: usb: hackrf: Set min_buffers_needed to 8
> media: usb: usbtv: Set min_buffers_needed to 2
> media: atomisp: Stop direct calls to queue num_buffers field
> media: imx: Stop direct calls to queue num_buffers field
> media: meson: vdec: Stop direct calls to queue num_buffers field
> touchscreen: sur40: Stop direct calls to queue num_buffers field
> sample: v4l: Stop direct calls to queue num_buffers field
> media: cedrus: Stop direct calls to queue num_buffers field
> media: core: Rework how create_buf index returned value is computed
> media: core: Add bitmap manage bufs array entries
> media: core: Free range of buffers
> media: v4l2: Add DELETE_BUFS ioctl
> media: v4l2: Add mem2mem helpers for DELETE_BUFS ioctl
> media: test-drivers: Use helper for DELETE_BUFS ioctl
>
> .../userspace-api/media/v4l/user-func.rst | 1 +
> .../media/v4l/vidioc-create-bufs.rst | 8 +-
> .../media/v4l/vidioc-delete-bufs.rst | 80 +++
> .../media/v4l/vidioc-reqbufs.rst | 2 +
> drivers/input/touchscreen/sur40.c | 5 +-
> .../media/common/videobuf2/videobuf2-core.c | 556 +++++++++++-------
> .../media/common/videobuf2/videobuf2-v4l2.c | 121 +++-
> drivers/media/dvb-core/dvb_vb2.c | 17 +-
> drivers/media/dvb-frontends/rtl2832_sdr.c | 9 +-
> drivers/media/i2c/video-i2c.c | 5 +-
> drivers/media/pci/cx18/cx18-streams.c | 13 +-
> drivers/media/pci/dt3155/dt3155.c | 2 -
> .../pci/netup_unidvb/netup_unidvb_core.c | 4 +-
> drivers/media/pci/tw68/tw68-video.c | 6 +-
> drivers/media/pci/tw686x/tw686x-video.c | 13 +-
> drivers/media/platform/amphion/vpu_dbg.c | 30 +-
> drivers/media/platform/amphion/vpu_v4l2.c | 4 +-
> .../media/platform/chips-media/coda-common.c | 2 +-
> .../platform/mediatek/jpeg/mtk_jpeg_core.c | 7 +-
> .../vcodec/decoder/vdec/vdec_vp9_req_lat_if.c | 9 +-
> .../mediatek/vcodec/encoder/mtk_vcodec_enc.c | 2 +-
> drivers/media/platform/nxp/imx7-media-csi.c | 7 +-
> drivers/media/platform/renesas/rcar_drif.c | 8 +-
> drivers/media/platform/st/sti/hva/hva-v4l2.c | 9 +-
> .../media/platform/ti/am437x/am437x-vpfe.c | 7 +-
> drivers/media/platform/ti/cal/cal-video.c | 5 +-
> .../media/platform/ti/davinci/vpif_capture.c | 5 +-
> .../media/platform/ti/davinci/vpif_display.c | 5 +-
> drivers/media/platform/ti/omap/omap_vout.c | 5 +-
> drivers/media/platform/verisilicon/hantro.h | 9 +-
> .../media/platform/verisilicon/hantro_drv.c | 5 +-
> .../media/platform/verisilicon/hantro_g2.c | 14 +
> .../platform/verisilicon/hantro_g2_hevc_dec.c | 18 +-
> .../platform/verisilicon/hantro_g2_vp9_dec.c | 28 +-
> .../media/platform/verisilicon/hantro_hw.h | 7 +-
> .../platform/verisilicon/hantro_postproc.c | 93 ++-
> .../media/platform/verisilicon/hantro_v4l2.c | 27 +-
> .../media/test-drivers/vicodec/vicodec-core.c | 3 +
> drivers/media/test-drivers/vim2m.c | 2 +
> .../media/test-drivers/vimc/vimc-capture.c | 2 +
> drivers/media/test-drivers/visl/visl-dec.c | 32 +-
> drivers/media/test-drivers/visl/visl-video.c | 2 +
> drivers/media/test-drivers/vivid/vivid-core.c | 14 +
> .../media/test-drivers/vivid/vivid-meta-cap.c | 3 -
> .../media/test-drivers/vivid/vivid-meta-out.c | 5 +-
> .../test-drivers/vivid/vivid-touch-cap.c | 5 +-
> .../media/test-drivers/vivid/vivid-vbi-cap.c | 5 +-
> .../media/test-drivers/vivid/vivid-vbi-out.c | 5 +-
> .../media/test-drivers/vivid/vivid-vid-cap.c | 5 +-
> .../media/test-drivers/vivid/vivid-vid-out.c | 5 +-
> drivers/media/usb/airspy/airspy.c | 9 +-
> drivers/media/usb/cx231xx/cx231xx-417.c | 4 +-
> drivers/media/usb/cx231xx/cx231xx-video.c | 4 +-
> drivers/media/usb/hackrf/hackrf.c | 9 +-
> drivers/media/usb/usbtv/usbtv-video.c | 3 +-
> drivers/media/v4l2-core/v4l2-dev.c | 1 +
> drivers/media/v4l2-core/v4l2-ioctl.c | 21 +-
> drivers/media/v4l2-core/v4l2-mem2mem.c | 20 +
> .../staging/media/atomisp/pci/atomisp_ioctl.c | 4 +-
> drivers/staging/media/imx/imx-media-capture.c | 7 +-
> drivers/staging/media/meson/vdec/vdec.c | 13 +-
> .../staging/media/sunxi/cedrus/cedrus_h264.c | 8 +-
> .../staging/media/sunxi/cedrus/cedrus_h265.c | 9 +-
> include/media/v4l2-ioctl.h | 4 +
> include/media/v4l2-mem2mem.h | 12 +
> include/media/videobuf2-core.h | 65 +-
> include/media/videobuf2-v4l2.h | 13 +
> include/uapi/linux/videodev2.h | 24 +-
> samples/v4l/v4l2-pci-skeleton.c | 5 +-
> 69 files changed, 969 insertions(+), 502 deletions(-)
> create mode 100644 Documentation/userspace-api/media/v4l/vidioc-delete-bufs.rst
>

2023-09-28 20:49:25

by Benjamin Gaignard

[permalink] [raw]
Subject: Re: [PATCH v8 00/53] Add DELETE_BUF ioctl


Le 28/09/2023 à 15:04, Hans Verkuil a écrit :
> Hi Benjamin,
>
> On 27/09/2023 17:35, Benjamin Gaignard wrote:
>> Unlike when resolution change on keyframes, dynamic resolution change
>> on inter frames doesn't allow to do a stream off/on sequence because
>> it is need to keep all previous references alive to decode inter frames.
>> This constraint have two main problems:
>> - more memory consumption.
>> - more buffers in use.
>> To solve these issue this series introduce DELETE_BUFS ioctl and remove
>> the 32 buffers limit per queue.
>>
>> VP9 conformance tests using fluster give a score of 210/305.
>> The 24 resize inter tests (vp90-2-21-resize_inter_* files) are ok
>> but require to use postprocessor.
>>
>> Kernel branch is available here:
>> https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/commits/remove_vb2_queue_limit_v8
>>
>> GStreamer branch to use DELETE_BUF ioctl and testing dynamic resolution
>> change is here:
>> https://gitlab.freedesktop.org/benjamin.gaignard1/gstreamer/-/commits/VP9_drc
> I applied patches 1-46 on top of staging and applied patch 1/2 of v4l-utils. I chose
> 1-46 since those would be the first batch to be merged.
>
> I had to apply a small patch on top, since amphion was still using max_allowed_buffers:
>
> https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=delbuf&id=fce79aa0239a3a4e5c7634b9b7ffc356057b70b4
>
> With that change everything compiled cleanly.
>
> But running the regression tests (test-media script) failed with a kernel oops:
>
> Buffer ioctls (Input 0):
> fail: v4l2-test-buffers.cpp(611): q.reqbufs(node, 1)
> test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: FAIL
> [ 25.309343] general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
> [ 25.310067] KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
> [ 25.310471] CPU: 0 PID: 187 Comm: v4l2-compliance Not tainted 6.6.0-rc3+ #1
> [ 25.310837] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 25.311334] RIP: 0010:__vb2_queue_alloc+0x34c/0xd00 [videobuf2_common]
> [ 25.311688] Code: 89 f8 48 c1 e8 03 80 3c 18 00 0f 85 0c 07 00 00 49 8b b7 40 01 00 00 44 89 f0 4c 8d 2c c6 41 39 ce 73 18 4c 89 e8 48 c1 e8 03 <80> 3c 18 00 0f 85 d2 08 00 00 49 83 7d 00 00 74 02
> 0f 0b 4c 89 e8
> [ 25.312670] RSP: 0018:ffff88810afef930 EFLAGS: 00010212
> [ 25.312948] RAX: 0000000000000001 RBX: dffffc0000000000 RCX: 0000000000000040
> [ 25.313331] RDX: ffff88810ce08814 RSI: 0000000000000000 RDI: ffff88810cfb72e0
> [ 25.313708] RBP: ffff88810ce08800 R08: 0000000000001800 R09: 0000000000000001
> [ 25.314097] R10: ffffffff91560b0f R11: ffffffff8f40006a R12: ffff88810afefa74
> [ 25.314470] R13: 0000000000000008 R14: 0000000000000001 R15: ffff88810cfb71a0
> [ 25.314854] FS: 00007fa524908440(0000) GS:ffff88811aa00000(0000) knlGS:0000000000000000
> [ 25.315281] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 25.315594] CR2: 0000562855948270 CR3: 0000000109e21000 CR4: 0000000000350ef0
> [ 25.315982] Call Trace:
> [ 25.316122] <TASK>
> [ 25.316243] ? die_addr+0x37/0xa0
> [ 25.316433] ? exc_general_protection+0x144/0x210
> [ 25.316693] ? asm_exc_general_protection+0x22/0x30
> [ 25.316955] ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [ 25.317235] ? __vb2_queue_alloc+0x34c/0xd00 [videobuf2_common]
> [ 25.317555] vb2_core_create_bufs+0x5a0/0xce0 [videobuf2_common]
> [ 25.317879] ? __vb2_queue_alloc+0xd00/0xd00 [videobuf2_common]
> [ 25.318213] vb2_create_bufs+0x680/0x7f0 [videobuf2_v4l2]
> [ 25.318506] ? __kasan_kmalloc+0x83/0x90
> [ 25.318729] ? __kmalloc+0x5c/0x150
> [ 25.318924] ? __video_do_ioctl+0x3a5/0xc20 [videodev]
> [ 25.319223] ? __copy_timestamp+0x270/0x270 [videobuf2_v4l2]
> [ 25.319532] ? v4l_sanitize_format+0x121/0x340 [videodev]
> [ 25.319839] vb2_ioctl_create_bufs+0x280/0x440 [videobuf2_v4l2]
> [ 25.320162] ? check_fmt+0x15/0x5e0 [videodev]
> [ 25.320420] v4l_create_bufs+0xa8/0x150 [videodev]
> [ 25.320693] __video_do_ioctl+0x8d0/0xc20 [videodev]
> [ 25.320975] ? v4l_prepare_buf+0xa0/0xa0 [videodev]
> [ 25.321250] ? __might_fault+0x112/0x170
> [ 25.321463] ? __might_fault+0x112/0x170
> [ 25.321675] video_usercopy+0x48c/0xd00 [videodev]
> [ 25.321949] ? v4l_prepare_buf+0xa0/0xa0 [videodev]
> [ 25.322228] ? v4l_enumstd+0x60/0x60 [videodev]
> [ 25.322497] ? selinux_bprm_creds_for_exec+0xa90/0xa90
> [ 25.322779] ? rseq_ip_fixup+0x304/0x480
> [ 25.322995] ? __up_read+0x10b/0x750
> [ 25.323201] ? rseq_get_rseq_cs+0x680/0x680
> [ 25.323430] v4l2_ioctl+0x17f/0x1f0 [videodev]
> [ 25.323689] __x64_sys_ioctl+0x127/0x190
> [ 25.323905] do_syscall_64+0x35/0x80
> [ 25.324105] entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [ 25.324372] RIP: 0033:0x7fa524bf44eb
> [ 25.324564] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b
> 04 25 28 00 00
> [ 25.325533] RSP: 002b:00007ffc888efee0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [ 25.325933] RAX: ffffffffffffffda RBX: 00007ffc888f0350 RCX: 00007fa524bf44eb
> [ 25.326314] RDX: 00007ffc888effb0 RSI: 00000000c100565c RDI: 0000000000000023
> [ 25.326702] RBP: 00007ffc888f8550 R08: 0000000000000000 R09: 0000000000000001
> [ 25.327081] R10: 0000000000000016 R11: 0000000000000246 R12: 0000000000000000
> [ 25.327468] R13: 0000000000000001 R14: 00007ffc888f8010 R15: 00007ffc888f1790
> [ 25.327852] </TASK>
> [ 25.327979] Modules linked in: vivid videobuf2_dma_contig v4l2_tpg v4l2_dv_timings cec videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev videobuf2_common mc rc_core
> [ 25.328835] ---[ end trace 0000000000000000 ]---
> [ 25.329083] RIP: 0010:__vb2_queue_alloc+0x34c/0xd00 [videobuf2_common]
> [ 25.329434] Code: 89 f8 48 c1 e8 03 80 3c 18 00 0f 85 0c 07 00 00 49 8b b7 40 01 00 00 44 89 f0 4c 8d 2c c6 41 39 ce 73 18 4c 89 e8 48 c1 e8 03 <80> 3c 18 00 0f 85 d2 08 00 00 49 83 7d 00 00 74 02
> 0f 0b 4c 89 e8
> [ 25.330417] RSP: 0018:ffff88810afef930 EFLAGS: 00010212
> [ 25.330812] RAX: 0000000000000001 RBX: dffffc0000000000 RCX: 0000000000000040
> [ 25.331194] RDX: ffff88810ce08814 RSI: 0000000000000000 RDI: ffff88810cfb72e0
> [ 25.331594] RBP: ffff88810ce08800 R08: 0000000000001800 R09: 0000000000000001
> [ 25.331978] R10: ffffffff91560b0f R11: ffffffff8f40006a R12: ffff88810afefa74
> [ 25.332360] R13: 0000000000000008 R14: 0000000000000001 R15: ffff88810cfb71a0
> [ 25.332746] FS: 00007fa524908440(0000) GS:ffff88811aa00000(0000) knlGS:0000000000000000
> [ 25.333170] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 25.333486] CR2: 0000562855948270 CR3: 0000000109e21000 CR4: 0000000000350ef0
> [ 25.333861] Kernel panic - not syncing: Fatal exception
> [ 25.334179] Kernel Offset: 0xbe00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [ 25.334748] Rebooting in 30 seconds..
>
> I use the build scripts as described here:
>
> https://lore.kernel.org/linux-media/[email protected]/
>
> I use this command to run the regression tests:
>
> ./build.sh -virtme -virtme-utils-path /usr/local/bin none delbuf
>
> (delbuf is the name of my branch)
>
> I don't really want to spend time reviewing this series until this is fixed...

Yes that make sense.
It will work on it and let you now when I had fix it.
Regards,
Benjamin

>
> Regards,
>
> Hans
>
>> changes in version 8:
>> - Add V4L2_BUF_CAP_SUPPORTS_SET_MAX_BUFS and new 'max_buffers' field in v4l2_create_buffers
>> structure to report the maximum number of buffers that a queue could allocate.
>> - Add V4L2_BUF_CAP_SUPPORTS_DELETE_BUFS to indicate that a queue support
>> DELETE_BUFS ioctl.
>> - Make some test drivers use more than 32 buffers and DELETE_BUFS ioctl.
>> - Fix remarks done by Hans
>> - Move "media: core: Rework how create_buf index returned value is
>> computed" patch to the top of the serie.
>>
>> changes in version 7:
>> - Use a bitmap to know which entries are valid in queue bufs array.
>> The number of buffers in the queue could must calculated from the
>> bitmap so num_buffers becomes useless. This led to add quite few
>> patches to remove it from all the drivers.
>> Note: despiste my attention I may have miss some calls to
>> num_buffers...
>> - Split patches to make them more readable.
>> - Run v4l2-compliance with additional delete-bufs tests.
>> - Run ./test-media -kmemleak vivid and no more failures.
>> Note: I had to remove USERPTR streaming test because they to much
>> frequentely hit get_framevec bug. It is not related to my series
>> since this happens all the time on master branch.
>> - Fix Hans remarks on v6
>>
>> changes in version 6:
>> - Get a patch per driver to use vb2_get_buffer() instead of directly access
>> to queue buffers array.
>> - Add lock in vb2_core_delete_buf()
>> - Use vb2_buffer instead of index
>> - Fix various comments
>> - Change buffer index name to BUFFER_INDEX_MASK
>> - Stop spamming kernel log with unbalanced counters
>>
>> changes in version 5:
>> - Rework offset cookie encoding pattern is n ow the first patch of the
>> serie.
>> - Use static array instead of allocated one for postprocessor buffers.
>>
>> changes in version 4:
>> - Stop using Xarray, instead let queues decide about their own maximum
>> number of buffer and allocate bufs array given that value.
>> - Rework offset cookie encoding pattern.
>> - Change DELETE_BUF to DELETE_BUFS because it now usable for
>> range of buffer to be symetrical of CREATE_BUFS.
>> - Add fixes tags on couple of Verisilicon related patches.
>> - Be smarter in Verisilicon postprocessor buffers management.
>> - Rebase on top of v6.4
>>
>> changes in version 3:
>> - Use Xarray API to store allocated video buffers.
>> - No module parameter to limit the number of buffer per queue.
>> - Use Xarray inside Verisilicon driver to store postprocessor buffers
>> and remove VB2_MAX_FRAME limit.
>> - Allow Versilicon driver to change of resolution while streaming
>> - Various fixes the Verisilicon VP9 code to improve fluster score.
>>
>> changes in version 2:
>> - Use a dynamic array and not a list to keep trace of allocated buffers.
>> Not use IDR interface because it is marked as deprecated in kernel
>> documentation.
>> - Add a module parameter to limit the number of buffer per queue.
>> - Add DELETE_BUF ioctl and m2m helpers.
>>
>> Regards,
>> Benjamin
>>
>> Benjamin Gaignard (53):
>> media: videobuf2: Rework offset 'cookie' encoding pattern
>> media: videobuf2: Stop spamming kernel log with all queue counter
>> media: videobuf2: Use vb2_buffer instead of index
>> media: amphion: Use vb2_get_buffer() instead of directly access to
>> buffers array
>> media: mediatek: jpeg: Use vb2_get_buffer() instead of directly access
>> to buffers array
>> media: mediatek: vdec: Remove useless loop
>> media: sti: hva: Use vb2_get_buffer() instead of directly access to
>> buffers array
>> media: visl: Use vb2_get_buffer() instead of directly access to
>> buffers array
>> media: atomisp: Use vb2_get_buffer() instead of directly access to
>> buffers array
>> media: dvb-core: Use vb2_get_buffer() instead of directly access to
>> buffers array
>> media: videobuf2: Access vb2_queue bufs array through helper functions
>> media: videobuf2: Be more flexible on the number of queue stored
>> buffers
>> media: Report the maximum possible number of buffers for the queue
>> media: test-drivers: vivid: Increase max supported buffers for capture
>> queues
>> media: test-drivers: vicodec: Increase max supported capture queue
>> buffers
>> media: verisilicon: Refactor postprocessor to store more buffers
>> media: verisilicon: Store chroma and motion vectors offset
>> media: verisilicon: g2: Use common helpers to compute chroma and mv
>> offsets
>> media: verisilicon: vp9: Allow to change resolution while streaming
>> media: Remove duplicated index vs q->num_buffers check
>> media: core: Add helper to get queue number of buffers
>> media: dvb-core: Do not initialize twice queue num_buffer field
>> media: dvb-frontends: rtl2832_srd: Use queue min_buffers_needed field
>> media: video-i2c: Set min_buffers_needed to 2
>> media: pci: cx18: Set correct value to min_buffers_needed field
>> media: pci: dt3155: Remove useless check
>> media: pci: netup_unidvb: Remove useless number of buffers check
>> media: pci: tw68: Stop direct calls to queue num_buffers field
>> media: pci: tw686x: Set min_buffers_needed to 3
>> media: amphion: Stop direct calls to queue num_buffers field
>> media: coda: Stop direct calls to queue num_buffers field
>> media: mediatek: vcodec: Stop direct calls to queue num_buffers field
>> media: nxp: Stop direct calls to queue num_buffers field
>> media: renesas: Set min_buffers_needed to 16
>> media: ti: Use queue min_buffers_needed field to set the min number of
>> buffers
>> media: verisilicon: Stop direct calls to queue num_buffers field
>> media: test-drivers: Stop direct calls to queue num_buffers field
>> media: usb: airspy: Set min_buffers_needed to 8
>> media: usb: cx231xx: Set min_buffers_needed to CX231XX_MIN_BUF
>> media: usb: hackrf: Set min_buffers_needed to 8
>> media: usb: usbtv: Set min_buffers_needed to 2
>> media: atomisp: Stop direct calls to queue num_buffers field
>> media: imx: Stop direct calls to queue num_buffers field
>> media: meson: vdec: Stop direct calls to queue num_buffers field
>> touchscreen: sur40: Stop direct calls to queue num_buffers field
>> sample: v4l: Stop direct calls to queue num_buffers field
>> media: cedrus: Stop direct calls to queue num_buffers field
>> media: core: Rework how create_buf index returned value is computed
>> media: core: Add bitmap manage bufs array entries
>> media: core: Free range of buffers
>> media: v4l2: Add DELETE_BUFS ioctl
>> media: v4l2: Add mem2mem helpers for DELETE_BUFS ioctl
>> media: test-drivers: Use helper for DELETE_BUFS ioctl
>>
>> .../userspace-api/media/v4l/user-func.rst | 1 +
>> .../media/v4l/vidioc-create-bufs.rst | 8 +-
>> .../media/v4l/vidioc-delete-bufs.rst | 80 +++
>> .../media/v4l/vidioc-reqbufs.rst | 2 +
>> drivers/input/touchscreen/sur40.c | 5 +-
>> .../media/common/videobuf2/videobuf2-core.c | 556 +++++++++++-------
>> .../media/common/videobuf2/videobuf2-v4l2.c | 121 +++-
>> drivers/media/dvb-core/dvb_vb2.c | 17 +-
>> drivers/media/dvb-frontends/rtl2832_sdr.c | 9 +-
>> drivers/media/i2c/video-i2c.c | 5 +-
>> drivers/media/pci/cx18/cx18-streams.c | 13 +-
>> drivers/media/pci/dt3155/dt3155.c | 2 -
>> .../pci/netup_unidvb/netup_unidvb_core.c | 4 +-
>> drivers/media/pci/tw68/tw68-video.c | 6 +-
>> drivers/media/pci/tw686x/tw686x-video.c | 13 +-
>> drivers/media/platform/amphion/vpu_dbg.c | 30 +-
>> drivers/media/platform/amphion/vpu_v4l2.c | 4 +-
>> .../media/platform/chips-media/coda-common.c | 2 +-
>> .../platform/mediatek/jpeg/mtk_jpeg_core.c | 7 +-
>> .../vcodec/decoder/vdec/vdec_vp9_req_lat_if.c | 9 +-
>> .../mediatek/vcodec/encoder/mtk_vcodec_enc.c | 2 +-
>> drivers/media/platform/nxp/imx7-media-csi.c | 7 +-
>> drivers/media/platform/renesas/rcar_drif.c | 8 +-
>> drivers/media/platform/st/sti/hva/hva-v4l2.c | 9 +-
>> .../media/platform/ti/am437x/am437x-vpfe.c | 7 +-
>> drivers/media/platform/ti/cal/cal-video.c | 5 +-
>> .../media/platform/ti/davinci/vpif_capture.c | 5 +-
>> .../media/platform/ti/davinci/vpif_display.c | 5 +-
>> drivers/media/platform/ti/omap/omap_vout.c | 5 +-
>> drivers/media/platform/verisilicon/hantro.h | 9 +-
>> .../media/platform/verisilicon/hantro_drv.c | 5 +-
>> .../media/platform/verisilicon/hantro_g2.c | 14 +
>> .../platform/verisilicon/hantro_g2_hevc_dec.c | 18 +-
>> .../platform/verisilicon/hantro_g2_vp9_dec.c | 28 +-
>> .../media/platform/verisilicon/hantro_hw.h | 7 +-
>> .../platform/verisilicon/hantro_postproc.c | 93 ++-
>> .../media/platform/verisilicon/hantro_v4l2.c | 27 +-
>> .../media/test-drivers/vicodec/vicodec-core.c | 3 +
>> drivers/media/test-drivers/vim2m.c | 2 +
>> .../media/test-drivers/vimc/vimc-capture.c | 2 +
>> drivers/media/test-drivers/visl/visl-dec.c | 32 +-
>> drivers/media/test-drivers/visl/visl-video.c | 2 +
>> drivers/media/test-drivers/vivid/vivid-core.c | 14 +
>> .../media/test-drivers/vivid/vivid-meta-cap.c | 3 -
>> .../media/test-drivers/vivid/vivid-meta-out.c | 5 +-
>> .../test-drivers/vivid/vivid-touch-cap.c | 5 +-
>> .../media/test-drivers/vivid/vivid-vbi-cap.c | 5 +-
>> .../media/test-drivers/vivid/vivid-vbi-out.c | 5 +-
>> .../media/test-drivers/vivid/vivid-vid-cap.c | 5 +-
>> .../media/test-drivers/vivid/vivid-vid-out.c | 5 +-
>> drivers/media/usb/airspy/airspy.c | 9 +-
>> drivers/media/usb/cx231xx/cx231xx-417.c | 4 +-
>> drivers/media/usb/cx231xx/cx231xx-video.c | 4 +-
>> drivers/media/usb/hackrf/hackrf.c | 9 +-
>> drivers/media/usb/usbtv/usbtv-video.c | 3 +-
>> drivers/media/v4l2-core/v4l2-dev.c | 1 +
>> drivers/media/v4l2-core/v4l2-ioctl.c | 21 +-
>> drivers/media/v4l2-core/v4l2-mem2mem.c | 20 +
>> .../staging/media/atomisp/pci/atomisp_ioctl.c | 4 +-
>> drivers/staging/media/imx/imx-media-capture.c | 7 +-
>> drivers/staging/media/meson/vdec/vdec.c | 13 +-
>> .../staging/media/sunxi/cedrus/cedrus_h264.c | 8 +-
>> .../staging/media/sunxi/cedrus/cedrus_h265.c | 9 +-
>> include/media/v4l2-ioctl.h | 4 +
>> include/media/v4l2-mem2mem.h | 12 +
>> include/media/videobuf2-core.h | 65 +-
>> include/media/videobuf2-v4l2.h | 13 +
>> include/uapi/linux/videodev2.h | 24 +-
>> samples/v4l/v4l2-pci-skeleton.c | 5 +-
>> 69 files changed, 969 insertions(+), 502 deletions(-)
>> create mode 100644 Documentation/userspace-api/media/v4l/vidioc-delete-bufs.rst
>>
> _______________________________________________
> Kernel mailing list -- [email protected]
> To unsubscribe send an email to [email protected]

2023-09-29 13:09:15

by Benjamin Gaignard

[permalink] [raw]
Subject: Re: [PATCH v8 00/53] Add DELETE_BUF ioctl


Le 28/09/2023 à 15:16, Benjamin Gaignard a écrit :
>
> Le 28/09/2023 à 15:04, Hans Verkuil a écrit :
>> Hi Benjamin,
>>
>> On 27/09/2023 17:35, Benjamin Gaignard wrote:
>>> Unlike when resolution change on keyframes, dynamic resolution change
>>> on inter frames doesn't allow to do a stream off/on sequence because
>>> it is need to keep all previous references alive to decode inter
>>> frames.
>>> This constraint have two main problems:
>>> - more memory consumption.
>>> - more buffers in use.
>>> To solve these issue this series introduce DELETE_BUFS ioctl and remove
>>> the 32 buffers limit per queue.
>>>
>>> VP9 conformance tests using fluster give a score of 210/305.
>>> The 24 resize inter tests (vp90-2-21-resize_inter_* files) are ok
>>> but require to use postprocessor.
>>>
>>> Kernel branch is available here:
>>> https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/commits/remove_vb2_queue_limit_v8
>>>
>>>
>>> GStreamer branch to use DELETE_BUF ioctl and testing dynamic resolution
>>> change is here:
>>> https://gitlab.freedesktop.org/benjamin.gaignard1/gstreamer/-/commits/VP9_drc
>>>
>> I applied patches 1-46 on top of staging and applied patch 1/2 of
>> v4l-utils. I chose
>> 1-46 since those would be the first batch to be merged.
>>
>> I had to apply a small patch on top, since amphion was still using
>> max_allowed_buffers:
>>
>> https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=delbuf&id=fce79aa0239a3a4e5c7634b9b7ffc356057b70b4
>>
>>
>> With that change everything compiled cleanly.
>>
>> But running the regression tests (test-media script) failed with a
>> kernel oops:
>>
>> Buffer ioctls (Input 0):
>>                  fail: v4l2-test-buffers.cpp(611): q.reqbufs(node, 1)
>>          test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: FAIL
>> [   25.309343] general protection fault, probably for non-canonical
>> address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
>> [   25.310067] KASAN: null-ptr-deref in range
>> [0x0000000000000008-0x000000000000000f]
>> [   25.310471] CPU: 0 PID: 187 Comm: v4l2-compliance Not tainted
>> 6.6.0-rc3+ #1
>> [   25.310837] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS 1.16.2-debian-1.16.2-1 04/01/2014
>> [   25.311334] RIP: 0010:__vb2_queue_alloc+0x34c/0xd00
>> [videobuf2_common]
>> [   25.311688] Code: 89 f8 48 c1 e8 03 80 3c 18 00 0f 85 0c 07 00 00
>> 49 8b b7 40 01 00 00 44 89 f0 4c 8d 2c c6 41 39 ce 73 18 4c 89 e8 48
>> c1 e8 03 <80> 3c 18 00 0f 85 d2 08 00 00 49 83 7d 00 00 74 02
>> 0f 0b 4c 89 e8
>> [   25.312670] RSP: 0018:ffff88810afef930 EFLAGS: 00010212
>> [   25.312948] RAX: 0000000000000001 RBX: dffffc0000000000 RCX:
>> 0000000000000040
>> [   25.313331] RDX: ffff88810ce08814 RSI: 0000000000000000 RDI:
>> ffff88810cfb72e0
>> [   25.313708] RBP: ffff88810ce08800 R08: 0000000000001800 R09:
>> 0000000000000001
>> [   25.314097] R10: ffffffff91560b0f R11: ffffffff8f40006a R12:
>> ffff88810afefa74
>> [   25.314470] R13: 0000000000000008 R14: 0000000000000001 R15:
>> ffff88810cfb71a0
>> [   25.314854] FS:  00007fa524908440(0000) GS:ffff88811aa00000(0000)
>> knlGS:0000000000000000
>> [   25.315281] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   25.315594] CR2: 0000562855948270 CR3: 0000000109e21000 CR4:
>> 0000000000350ef0
>> [   25.315982] Call Trace:
>> [   25.316122]  <TASK>
>> [   25.316243]  ? die_addr+0x37/0xa0
>> [   25.316433]  ? exc_general_protection+0x144/0x210
>> [   25.316693]  ? asm_exc_general_protection+0x22/0x30
>> [   25.316955]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
>> [   25.317235]  ? __vb2_queue_alloc+0x34c/0xd00 [videobuf2_common]
>> [   25.317555]  vb2_core_create_bufs+0x5a0/0xce0 [videobuf2_common]
>> [   25.317879]  ? __vb2_queue_alloc+0xd00/0xd00 [videobuf2_common]
>> [   25.318213]  vb2_create_bufs+0x680/0x7f0 [videobuf2_v4l2]
>> [   25.318506]  ? __kasan_kmalloc+0x83/0x90
>> [   25.318729]  ? __kmalloc+0x5c/0x150
>> [   25.318924]  ? __video_do_ioctl+0x3a5/0xc20 [videodev]
>> [   25.319223]  ? __copy_timestamp+0x270/0x270 [videobuf2_v4l2]
>> [   25.319532]  ? v4l_sanitize_format+0x121/0x340 [videodev]
>> [   25.319839]  vb2_ioctl_create_bufs+0x280/0x440 [videobuf2_v4l2]
>> [   25.320162]  ? check_fmt+0x15/0x5e0 [videodev]
>> [   25.320420]  v4l_create_bufs+0xa8/0x150 [videodev]
>> [   25.320693]  __video_do_ioctl+0x8d0/0xc20 [videodev]
>> [   25.320975]  ? v4l_prepare_buf+0xa0/0xa0 [videodev]
>> [   25.321250]  ? __might_fault+0x112/0x170
>> [   25.321463]  ? __might_fault+0x112/0x170
>> [   25.321675]  video_usercopy+0x48c/0xd00 [videodev]
>> [   25.321949]  ? v4l_prepare_buf+0xa0/0xa0 [videodev]
>> [   25.322228]  ? v4l_enumstd+0x60/0x60 [videodev]
>> [   25.322497]  ? selinux_bprm_creds_for_exec+0xa90/0xa90
>> [   25.322779]  ? rseq_ip_fixup+0x304/0x480
>> [   25.322995]  ? __up_read+0x10b/0x750
>> [   25.323201]  ? rseq_get_rseq_cs+0x680/0x680
>> [   25.323430]  v4l2_ioctl+0x17f/0x1f0 [videodev]
>> [   25.323689]  __x64_sys_ioctl+0x127/0x190
>> [   25.323905]  do_syscall_64+0x35/0x80
>> [   25.324105]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
>> [   25.324372] RIP: 0033:0x7fa524bf44eb
>> [   25.324564] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24
>> 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00
>> 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b
>> 04 25 28 00 00
>> [   25.325533] RSP: 002b:00007ffc888efee0 EFLAGS: 00000246 ORIG_RAX:
>> 0000000000000010
>> [   25.325933] RAX: ffffffffffffffda RBX: 00007ffc888f0350 RCX:
>> 00007fa524bf44eb
>> [   25.326314] RDX: 00007ffc888effb0 RSI: 00000000c100565c RDI:
>> 0000000000000023
>> [   25.326702] RBP: 00007ffc888f8550 R08: 0000000000000000 R09:
>> 0000000000000001
>> [   25.327081] R10: 0000000000000016 R11: 0000000000000246 R12:
>> 0000000000000000
>> [   25.327468] R13: 0000000000000001 R14: 00007ffc888f8010 R15:
>> 00007ffc888f1790
>> [   25.327852]  </TASK>
>> [   25.327979] Modules linked in: vivid videobuf2_dma_contig v4l2_tpg
>> v4l2_dv_timings cec videobuf2_vmalloc videobuf2_memops videobuf2_v4l2
>> videodev videobuf2_common mc rc_core
>> [   25.328835] ---[ end trace 0000000000000000 ]---
>> [   25.329083] RIP: 0010:__vb2_queue_alloc+0x34c/0xd00
>> [videobuf2_common]
>> [   25.329434] Code: 89 f8 48 c1 e8 03 80 3c 18 00 0f 85 0c 07 00 00
>> 49 8b b7 40 01 00 00 44 89 f0 4c 8d 2c c6 41 39 ce 73 18 4c 89 e8 48
>> c1 e8 03 <80> 3c 18 00 0f 85 d2 08 00 00 49 83 7d 00 00 74 02
>> 0f 0b 4c 89 e8
>> [   25.330417] RSP: 0018:ffff88810afef930 EFLAGS: 00010212
>> [   25.330812] RAX: 0000000000000001 RBX: dffffc0000000000 RCX:
>> 0000000000000040
>> [   25.331194] RDX: ffff88810ce08814 RSI: 0000000000000000 RDI:
>> ffff88810cfb72e0
>> [   25.331594] RBP: ffff88810ce08800 R08: 0000000000001800 R09:
>> 0000000000000001
>> [   25.331978] R10: ffffffff91560b0f R11: ffffffff8f40006a R12:
>> ffff88810afefa74
>> [   25.332360] R13: 0000000000000008 R14: 0000000000000001 R15:
>> ffff88810cfb71a0
>> [   25.332746] FS:  00007fa524908440(0000) GS:ffff88811aa00000(0000)
>> knlGS:0000000000000000
>> [   25.333170] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   25.333486] CR2: 0000562855948270 CR3: 0000000109e21000 CR4:
>> 0000000000350ef0
>> [   25.333861] Kernel panic - not syncing: Fatal exception
>> [   25.334179] Kernel Offset: 0xbe00000 from 0xffffffff81000000
>> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>> [   25.334748] Rebooting in 30 seconds..
>>
>> I use the build scripts as described here:
>>
>> https://lore.kernel.org/linux-media/[email protected]/
>>
>>
>> I use this command to run the regression tests:
>>
>> ./build.sh -virtme -virtme-utils-path /usr/local/bin none delbuf
>>
>> (delbuf is the name of my branch)
>>
>> I don't really want to spend time reviewing this series until this is
>> fixed...
>
> Yes that make sense.
> It will work on it and let you now when I had fix it.
> Regards,
> Benjamin

This patch fix the problem:

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c
b/drivers/media/common/videobuf2/videobuf2-core.c
index 320f37e46343..0598c1e4e9b5 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -428,6 +428,8 @@ static void vb2_queue_add_buffer(struct vb2_queue
*q, struct vb2_buffer *vb, uns
  */
 static void vb2_queue_remove_buffer(struct vb2_buffer *vb)
 {
+    if (vb->vb2_queue->num_buffers)
+        vb->vb2_queue->num_buffers--;
     vb->vb2_queue->bufs[vb->index] = NULL;
     vb->vb2_queue = NULL;
 }
@@ -641,8 +643,8 @@ static void __vb2_queue_free(struct vb2_queue *q,
unsigned int buffers)
 #endif

     /* Free vb2 buffers */
-    for (i = q->max_num_buffers, buffer = 0; i > 0 && buffer < buffers;
i--) {
-        struct vb2_buffer *vb = vb2_get_buffer(q, buffer);
+    for (i = q->max_num_buffers, buffer = 0; i >= 0 && buffer <
buffers; i--) {
+        struct vb2_buffer *vb = vb2_get_buffer(q, i);

         if (!vb)
             continue;
@@ -2541,6 +2543,7 @@ void vb2_core_queue_release(struct vb2_queue *q)
     __vb2_queue_free(q, q->max_num_buffers);
     kfree(q->bufs);
     q->bufs = NULL;
+    q->num_buffers = 0;
     mutex_unlock(&q->mmap_lock);
 }
 EXPORT_SYMBOL_GPL(vb2_core_queue_release);

I'm preparing a v9 with this + your patch for Amphion driver:

https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/commits/remove_vb2_queue_limit_v9

Regards,

Benjamin


>
>>
>> Regards,
>>
>>     Hans
>>
>>> changes in version 8:
>>> - Add V4L2_BUF_CAP_SUPPORTS_SET_MAX_BUFS and new 'max_buffers' field
>>> in v4l2_create_buffers
>>>    structure to report the maximum number of buffers that a queue
>>> could allocate.
>>> - Add V4L2_BUF_CAP_SUPPORTS_DELETE_BUFS to indicate that a queue
>>> support
>>>    DELETE_BUFS ioctl.
>>> - Make some test drivers use more than 32 buffers and DELETE_BUFS
>>> ioctl.
>>> - Fix remarks done by Hans
>>> - Move "media: core: Rework how create_buf index returned value is
>>>    computed" patch to the top of the serie.
>>>
>>> changes in version 7:
>>> - Use a bitmap to know which entries are valid in queue bufs array.
>>>    The number of buffers in the queue could must calculated from the
>>>    bitmap so num_buffers becomes useless. This led to add quite few
>>>    patches to remove it from all the drivers.
>>>    Note: despiste my attention I may have miss some calls to
>>>    num_buffers...
>>> - Split patches to make them more readable.
>>> - Run v4l2-compliance with additional delete-bufs tests.
>>> - Run ./test-media -kmemleak vivid and no more failures.
>>>    Note: I had to remove USERPTR streaming test because they to much
>>>    frequentely hit get_framevec bug. It is not related to my series
>>>    since this happens all the time on master branch.
>>> - Fix Hans remarks on v6
>>>
>>> changes in version 6:
>>> - Get a patch per driver to use vb2_get_buffer() instead of directly
>>> access
>>>    to queue buffers array.
>>> - Add lock in vb2_core_delete_buf()
>>> - Use vb2_buffer instead of index
>>> - Fix various comments
>>> - Change buffer index name to BUFFER_INDEX_MASK
>>> - Stop spamming kernel log with unbalanced counters
>>>
>>> changes in version 5:
>>> - Rework offset cookie encoding pattern is n ow the first patch of the
>>>    serie.
>>> - Use static array instead of allocated one for postprocessor buffers.
>>>
>>> changes in version 4:
>>> - Stop using Xarray, instead let queues decide about their own maximum
>>>    number of buffer and allocate bufs array given that value.
>>> - Rework offset cookie encoding pattern.
>>> - Change DELETE_BUF to DELETE_BUFS because it now usable for
>>>    range of buffer to be symetrical of CREATE_BUFS.
>>> - Add fixes tags on couple of Verisilicon related patches.
>>> - Be smarter in Verisilicon postprocessor buffers management.
>>> - Rebase on top of v6.4
>>>
>>> changes in version 3:
>>> - Use Xarray API to store allocated video buffers.
>>> - No module parameter to limit the number of buffer per queue.
>>> - Use Xarray inside Verisilicon driver to store postprocessor buffers
>>>    and remove VB2_MAX_FRAME limit.
>>> - Allow Versilicon driver to change of resolution while streaming
>>> - Various fixes the Verisilicon VP9 code to improve fluster score.
>>>   changes in version 2:
>>> - Use a dynamic array and not a list to keep trace of allocated
>>> buffers.
>>>    Not use IDR interface because it is marked as deprecated in kernel
>>>    documentation.
>>> - Add a module parameter to limit the number of buffer per queue.
>>> - Add DELETE_BUF ioctl and m2m helpers.
>>>
>>> Regards,
>>> Benjamin
>>>
>>> Benjamin Gaignard (53):
>>>    media: videobuf2: Rework offset 'cookie' encoding pattern
>>>    media: videobuf2: Stop spamming kernel log with all queue counter
>>>    media: videobuf2: Use vb2_buffer instead of index
>>>    media: amphion: Use vb2_get_buffer() instead of directly access to
>>>      buffers array
>>>    media: mediatek: jpeg: Use vb2_get_buffer() instead of directly
>>> access
>>>      to buffers array
>>>    media: mediatek: vdec: Remove useless loop
>>>    media: sti: hva: Use vb2_get_buffer() instead of directly access to
>>>      buffers array
>>>    media: visl: Use vb2_get_buffer() instead of directly access to
>>>      buffers array
>>>    media: atomisp: Use vb2_get_buffer() instead of directly access to
>>>      buffers array
>>>    media: dvb-core: Use vb2_get_buffer() instead of directly access to
>>>      buffers array
>>>    media: videobuf2: Access vb2_queue bufs array through helper
>>> functions
>>>    media: videobuf2: Be more flexible on the number of queue stored
>>>      buffers
>>>    media: Report the maximum possible number of buffers for the queue
>>>    media: test-drivers: vivid: Increase max supported buffers for
>>> capture
>>>      queues
>>>    media: test-drivers: vicodec: Increase max supported capture queue
>>>      buffers
>>>    media: verisilicon: Refactor postprocessor to store more buffers
>>>    media: verisilicon: Store chroma and motion vectors offset
>>>    media: verisilicon: g2: Use common helpers to compute chroma and mv
>>>      offsets
>>>    media: verisilicon: vp9: Allow to change resolution while streaming
>>>    media: Remove duplicated index vs q->num_buffers check
>>>    media: core: Add helper to get queue number of buffers
>>>    media: dvb-core: Do not initialize twice queue num_buffer field
>>>    media: dvb-frontends: rtl2832_srd: Use queue min_buffers_needed
>>> field
>>>    media: video-i2c: Set min_buffers_needed to 2
>>>    media: pci: cx18: Set correct value to min_buffers_needed field
>>>    media: pci: dt3155: Remove useless check
>>>    media: pci: netup_unidvb: Remove useless number of buffers check
>>>    media: pci: tw68: Stop direct calls to queue num_buffers field
>>>    media: pci: tw686x: Set min_buffers_needed to 3
>>>    media: amphion: Stop direct calls to queue num_buffers field
>>>    media: coda: Stop direct calls to queue num_buffers field
>>>    media: mediatek: vcodec: Stop direct calls to queue num_buffers
>>> field
>>>    media: nxp: Stop direct calls to queue num_buffers field
>>>    media: renesas: Set min_buffers_needed to 16
>>>    media: ti: Use queue min_buffers_needed field to set the min
>>> number of
>>>      buffers
>>>    media: verisilicon: Stop direct calls to queue num_buffers field
>>>    media: test-drivers: Stop direct calls to queue num_buffers field
>>>    media: usb: airspy: Set min_buffers_needed to 8
>>>    media: usb: cx231xx: Set min_buffers_needed to CX231XX_MIN_BUF
>>>    media: usb: hackrf: Set min_buffers_needed to 8
>>>    media: usb: usbtv: Set min_buffers_needed to 2
>>>    media: atomisp: Stop direct calls to queue num_buffers field
>>>    media: imx: Stop direct calls to queue num_buffers field
>>>    media: meson: vdec: Stop direct calls to queue num_buffers field
>>>    touchscreen: sur40: Stop direct calls to queue num_buffers field
>>>    sample: v4l: Stop direct calls to queue num_buffers field
>>>    media: cedrus: Stop direct calls to queue num_buffers field
>>>    media: core: Rework how create_buf index returned value is computed
>>>    media: core: Add bitmap manage bufs array entries
>>>    media: core: Free range of buffers
>>>    media: v4l2: Add DELETE_BUFS ioctl
>>>    media: v4l2: Add mem2mem helpers for DELETE_BUFS ioctl
>>>    media: test-drivers: Use helper for DELETE_BUFS ioctl
>>>
>>>   .../userspace-api/media/v4l/user-func.rst     |   1 +
>>>   .../media/v4l/vidioc-create-bufs.rst          |   8 +-
>>>   .../media/v4l/vidioc-delete-bufs.rst          |  80 +++
>>>   .../media/v4l/vidioc-reqbufs.rst              |   2 +
>>>   drivers/input/touchscreen/sur40.c             |   5 +-
>>>   .../media/common/videobuf2/videobuf2-core.c   | 556
>>> +++++++++++-------
>>>   .../media/common/videobuf2/videobuf2-v4l2.c   | 121 +++-
>>>   drivers/media/dvb-core/dvb_vb2.c              |  17 +-
>>>   drivers/media/dvb-frontends/rtl2832_sdr.c     |   9 +-
>>>   drivers/media/i2c/video-i2c.c                 |   5 +-
>>>   drivers/media/pci/cx18/cx18-streams.c         |  13 +-
>>>   drivers/media/pci/dt3155/dt3155.c             |   2 -
>>>   .../pci/netup_unidvb/netup_unidvb_core.c      |   4 +-
>>>   drivers/media/pci/tw68/tw68-video.c           |   6 +-
>>>   drivers/media/pci/tw686x/tw686x-video.c       |  13 +-
>>>   drivers/media/platform/amphion/vpu_dbg.c      |  30 +-
>>>   drivers/media/platform/amphion/vpu_v4l2.c     |   4 +-
>>>   .../media/platform/chips-media/coda-common.c  |   2 +-
>>>   .../platform/mediatek/jpeg/mtk_jpeg_core.c    |   7 +-
>>>   .../vcodec/decoder/vdec/vdec_vp9_req_lat_if.c |   9 +-
>>>   .../mediatek/vcodec/encoder/mtk_vcodec_enc.c  |   2 +-
>>>   drivers/media/platform/nxp/imx7-media-csi.c   |   7 +-
>>>   drivers/media/platform/renesas/rcar_drif.c    |   8 +-
>>>   drivers/media/platform/st/sti/hva/hva-v4l2.c  |   9 +-
>>>   .../media/platform/ti/am437x/am437x-vpfe.c    |   7 +-
>>>   drivers/media/platform/ti/cal/cal-video.c     |   5 +-
>>>   .../media/platform/ti/davinci/vpif_capture.c  |   5 +-
>>>   .../media/platform/ti/davinci/vpif_display.c  |   5 +-
>>>   drivers/media/platform/ti/omap/omap_vout.c    |   5 +-
>>>   drivers/media/platform/verisilicon/hantro.h   |   9 +-
>>>   .../media/platform/verisilicon/hantro_drv.c   |   5 +-
>>>   .../media/platform/verisilicon/hantro_g2.c    |  14 +
>>>   .../platform/verisilicon/hantro_g2_hevc_dec.c |  18 +-
>>>   .../platform/verisilicon/hantro_g2_vp9_dec.c  |  28 +-
>>>   .../media/platform/verisilicon/hantro_hw.h    |   7 +-
>>>   .../platform/verisilicon/hantro_postproc.c    |  93 ++-
>>>   .../media/platform/verisilicon/hantro_v4l2.c  |  27 +-
>>>   .../media/test-drivers/vicodec/vicodec-core.c |   3 +
>>>   drivers/media/test-drivers/vim2m.c            |   2 +
>>>   .../media/test-drivers/vimc/vimc-capture.c    |   2 +
>>>   drivers/media/test-drivers/visl/visl-dec.c    |  32 +-
>>>   drivers/media/test-drivers/visl/visl-video.c  |   2 +
>>>   drivers/media/test-drivers/vivid/vivid-core.c |  14 +
>>>   .../media/test-drivers/vivid/vivid-meta-cap.c |   3 -
>>>   .../media/test-drivers/vivid/vivid-meta-out.c |   5 +-
>>>   .../test-drivers/vivid/vivid-touch-cap.c      |   5 +-
>>>   .../media/test-drivers/vivid/vivid-vbi-cap.c  |   5 +-
>>>   .../media/test-drivers/vivid/vivid-vbi-out.c  |   5 +-
>>>   .../media/test-drivers/vivid/vivid-vid-cap.c  |   5 +-
>>>   .../media/test-drivers/vivid/vivid-vid-out.c  |   5 +-
>>>   drivers/media/usb/airspy/airspy.c             |   9 +-
>>>   drivers/media/usb/cx231xx/cx231xx-417.c       |   4 +-
>>>   drivers/media/usb/cx231xx/cx231xx-video.c     |   4 +-
>>>   drivers/media/usb/hackrf/hackrf.c             |   9 +-
>>>   drivers/media/usb/usbtv/usbtv-video.c         |   3 +-
>>>   drivers/media/v4l2-core/v4l2-dev.c            |   1 +
>>>   drivers/media/v4l2-core/v4l2-ioctl.c          |  21 +-
>>>   drivers/media/v4l2-core/v4l2-mem2mem.c        |  20 +
>>>   .../staging/media/atomisp/pci/atomisp_ioctl.c |   4 +-
>>>   drivers/staging/media/imx/imx-media-capture.c |   7 +-
>>>   drivers/staging/media/meson/vdec/vdec.c       |  13 +-
>>>   .../staging/media/sunxi/cedrus/cedrus_h264.c  |   8 +-
>>>   .../staging/media/sunxi/cedrus/cedrus_h265.c  |   9 +-
>>>   include/media/v4l2-ioctl.h                    |   4 +
>>>   include/media/v4l2-mem2mem.h                  |  12 +
>>>   include/media/videobuf2-core.h                |  65 +-
>>>   include/media/videobuf2-v4l2.h                |  13 +
>>>   include/uapi/linux/videodev2.h                |  24 +-
>>>   samples/v4l/v4l2-pci-skeleton.c               |   5 +-
>>>   69 files changed, 969 insertions(+), 502 deletions(-)
>>>   create mode 100644
>>> Documentation/userspace-api/media/v4l/vidioc-delete-bufs.rst
>>>
>> _______________________________________________
>> Kernel mailing list -- [email protected]
>> To unsubscribe send an email to [email protected]

2023-09-29 13:20:41

by Hans Verkuil

[permalink] [raw]
Subject: Re: [PATCH v8 00/53] Add DELETE_BUF ioctl

On 29/09/2023 13:44, Benjamin Gaignard wrote:
>
> Le 28/09/2023 à 15:16, Benjamin Gaignard a écrit :
>>
>> Le 28/09/2023 à 15:04, Hans Verkuil a écrit :
>>> Hi Benjamin,
>>>
>>> On 27/09/2023 17:35, Benjamin Gaignard wrote:
>>>> Unlike when resolution change on keyframes, dynamic resolution change
>>>> on inter frames doesn't allow to do a stream off/on sequence because
>>>> it is need to keep all previous references alive to decode inter frames.
>>>> This constraint have two main problems:
>>>> - more memory consumption.
>>>> - more buffers in use.
>>>> To solve these issue this series introduce DELETE_BUFS ioctl and remove
>>>> the 32 buffers limit per queue.
>>>>
>>>> VP9 conformance tests using fluster give a score of 210/305.
>>>> The 24 resize inter tests (vp90-2-21-resize_inter_* files) are ok
>>>> but require to use postprocessor.
>>>>
>>>> Kernel branch is available here:
>>>> https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/commits/remove_vb2_queue_limit_v8
>>>>
>>>> GStreamer branch to use DELETE_BUF ioctl and testing dynamic resolution
>>>> change is here:
>>>> https://gitlab.freedesktop.org/benjamin.gaignard1/gstreamer/-/commits/VP9_drc
>>> I applied patches 1-46 on top of staging and applied patch 1/2 of v4l-utils. I chose
>>> 1-46 since those would be the first batch to be merged.
>>>
>>> I had to apply a small patch on top, since amphion was still using max_allowed_buffers:
>>>
>>> https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=delbuf&id=fce79aa0239a3a4e5c7634b9b7ffc356057b70b4
>>>
>>> With that change everything compiled cleanly.
>>>
>>> But running the regression tests (test-media script) failed with a kernel oops:
>>>
>>> Buffer ioctls (Input 0):
>>>                  fail: v4l2-test-buffers.cpp(611): q.reqbufs(node, 1)
>>>          test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: FAIL
>>> [   25.309343] general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
>>> [   25.310067] KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
>>> [   25.310471] CPU: 0 PID: 187 Comm: v4l2-compliance Not tainted 6.6.0-rc3+ #1
>>> [   25.310837] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
>>> [   25.311334] RIP: 0010:__vb2_queue_alloc+0x34c/0xd00 [videobuf2_common]
>>> [   25.311688] Code: 89 f8 48 c1 e8 03 80 3c 18 00 0f 85 0c 07 00 00 49 8b b7 40 01 00 00 44 89 f0 4c 8d 2c c6 41 39 ce 73 18 4c 89 e8 48 c1 e8 03 <80> 3c 18 00 0f 85 d2 08 00 00 49 83 7d 00 00 74 02
>>> 0f 0b 4c 89 e8
>>> [   25.312670] RSP: 0018:ffff88810afef930 EFLAGS: 00010212
>>> [   25.312948] RAX: 0000000000000001 RBX: dffffc0000000000 RCX: 0000000000000040
>>> [   25.313331] RDX: ffff88810ce08814 RSI: 0000000000000000 RDI: ffff88810cfb72e0
>>> [   25.313708] RBP: ffff88810ce08800 R08: 0000000000001800 R09: 0000000000000001
>>> [   25.314097] R10: ffffffff91560b0f R11: ffffffff8f40006a R12: ffff88810afefa74
>>> [   25.314470] R13: 0000000000000008 R14: 0000000000000001 R15: ffff88810cfb71a0
>>> [   25.314854] FS:  00007fa524908440(0000) GS:ffff88811aa00000(0000) knlGS:0000000000000000
>>> [   25.315281] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [   25.315594] CR2: 0000562855948270 CR3: 0000000109e21000 CR4: 0000000000350ef0
>>> [   25.315982] Call Trace:
>>> [   25.316122]  <TASK>
>>> [   25.316243]  ? die_addr+0x37/0xa0
>>> [   25.316433]  ? exc_general_protection+0x144/0x210
>>> [   25.316693]  ? asm_exc_general_protection+0x22/0x30
>>> [   25.316955]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
>>> [   25.317235]  ? __vb2_queue_alloc+0x34c/0xd00 [videobuf2_common]
>>> [   25.317555]  vb2_core_create_bufs+0x5a0/0xce0 [videobuf2_common]
>>> [   25.317879]  ? __vb2_queue_alloc+0xd00/0xd00 [videobuf2_common]
>>> [   25.318213]  vb2_create_bufs+0x680/0x7f0 [videobuf2_v4l2]
>>> [   25.318506]  ? __kasan_kmalloc+0x83/0x90
>>> [   25.318729]  ? __kmalloc+0x5c/0x150
>>> [   25.318924]  ? __video_do_ioctl+0x3a5/0xc20 [videodev]
>>> [   25.319223]  ? __copy_timestamp+0x270/0x270 [videobuf2_v4l2]
>>> [   25.319532]  ? v4l_sanitize_format+0x121/0x340 [videodev]
>>> [   25.319839]  vb2_ioctl_create_bufs+0x280/0x440 [videobuf2_v4l2]
>>> [   25.320162]  ? check_fmt+0x15/0x5e0 [videodev]
>>> [   25.320420]  v4l_create_bufs+0xa8/0x150 [videodev]
>>> [   25.320693]  __video_do_ioctl+0x8d0/0xc20 [videodev]
>>> [   25.320975]  ? v4l_prepare_buf+0xa0/0xa0 [videodev]
>>> [   25.321250]  ? __might_fault+0x112/0x170
>>> [   25.321463]  ? __might_fault+0x112/0x170
>>> [   25.321675]  video_usercopy+0x48c/0xd00 [videodev]
>>> [   25.321949]  ? v4l_prepare_buf+0xa0/0xa0 [videodev]
>>> [   25.322228]  ? v4l_enumstd+0x60/0x60 [videodev]
>>> [   25.322497]  ? selinux_bprm_creds_for_exec+0xa90/0xa90
>>> [   25.322779]  ? rseq_ip_fixup+0x304/0x480
>>> [   25.322995]  ? __up_read+0x10b/0x750
>>> [   25.323201]  ? rseq_get_rseq_cs+0x680/0x680
>>> [   25.323430]  v4l2_ioctl+0x17f/0x1f0 [videodev]
>>> [   25.323689]  __x64_sys_ioctl+0x127/0x190
>>> [   25.323905]  do_syscall_64+0x35/0x80
>>> [   25.324105]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
>>> [   25.324372] RIP: 0033:0x7fa524bf44eb
>>> [   25.324564] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b
>>> 04 25 28 00 00
>>> [   25.325533] RSP: 002b:00007ffc888efee0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>>> [   25.325933] RAX: ffffffffffffffda RBX: 00007ffc888f0350 RCX: 00007fa524bf44eb
>>> [   25.326314] RDX: 00007ffc888effb0 RSI: 00000000c100565c RDI: 0000000000000023
>>> [   25.326702] RBP: 00007ffc888f8550 R08: 0000000000000000 R09: 0000000000000001
>>> [   25.327081] R10: 0000000000000016 R11: 0000000000000246 R12: 0000000000000000
>>> [   25.327468] R13: 0000000000000001 R14: 00007ffc888f8010 R15: 00007ffc888f1790
>>> [   25.327852]  </TASK>
>>> [   25.327979] Modules linked in: vivid videobuf2_dma_contig v4l2_tpg v4l2_dv_timings cec videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev videobuf2_common mc rc_core
>>> [   25.328835] ---[ end trace 0000000000000000 ]---
>>> [   25.329083] RIP: 0010:__vb2_queue_alloc+0x34c/0xd00 [videobuf2_common]
>>> [   25.329434] Code: 89 f8 48 c1 e8 03 80 3c 18 00 0f 85 0c 07 00 00 49 8b b7 40 01 00 00 44 89 f0 4c 8d 2c c6 41 39 ce 73 18 4c 89 e8 48 c1 e8 03 <80> 3c 18 00 0f 85 d2 08 00 00 49 83 7d 00 00 74 02
>>> 0f 0b 4c 89 e8
>>> [   25.330417] RSP: 0018:ffff88810afef930 EFLAGS: 00010212
>>> [   25.330812] RAX: 0000000000000001 RBX: dffffc0000000000 RCX: 0000000000000040
>>> [   25.331194] RDX: ffff88810ce08814 RSI: 0000000000000000 RDI: ffff88810cfb72e0
>>> [   25.331594] RBP: ffff88810ce08800 R08: 0000000000001800 R09: 0000000000000001
>>> [   25.331978] R10: ffffffff91560b0f R11: ffffffff8f40006a R12: ffff88810afefa74
>>> [   25.332360] R13: 0000000000000008 R14: 0000000000000001 R15: ffff88810cfb71a0
>>> [   25.332746] FS:  00007fa524908440(0000) GS:ffff88811aa00000(0000) knlGS:0000000000000000
>>> [   25.333170] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [   25.333486] CR2: 0000562855948270 CR3: 0000000109e21000 CR4: 0000000000350ef0
>>> [   25.333861] Kernel panic - not syncing: Fatal exception
>>> [   25.334179] Kernel Offset: 0xbe00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>>> [   25.334748] Rebooting in 30 seconds..
>>>
>>> I use the build scripts as described here:
>>>
>>> https://lore.kernel.org/linux-media/[email protected]/
>>>
>>> I use this command to run the regression tests:
>>>
>>> ./build.sh -virtme -virtme-utils-path /usr/local/bin none delbuf
>>>
>>> (delbuf is the name of my branch)
>>>
>>> I don't really want to spend time reviewing this series until this is fixed...
>>
>> Yes that make sense.
>> It will work on it and let you now when I had fix it.
>> Regards,
>> Benjamin
>
> This patch fix the problem:
>
> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> index 320f37e46343..0598c1e4e9b5 100644
> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> @@ -428,6 +428,8 @@ static void vb2_queue_add_buffer(struct vb2_queue *q, struct vb2_buffer *vb, uns
>   */
>  static void vb2_queue_remove_buffer(struct vb2_buffer *vb)
>  {
> +    if (vb->vb2_queue->num_buffers)
> +        vb->vb2_queue->num_buffers--;
>      vb->vb2_queue->bufs[vb->index] = NULL;
>      vb->vb2_queue = NULL;
>  }
> @@ -641,8 +643,8 @@ static void __vb2_queue_free(struct vb2_queue *q, unsigned int buffers)
>  #endif
>
>      /* Free vb2 buffers */
> -    for (i = q->max_num_buffers, buffer = 0; i > 0 && buffer < buffers; i--) {
> -        struct vb2_buffer *vb = vb2_get_buffer(q, buffer);
> +    for (i = q->max_num_buffers, buffer = 0; i >= 0 && buffer < buffers; i--) {
> +        struct vb2_buffer *vb = vb2_get_buffer(q, i);
>
>          if (!vb)
>              continue;
> @@ -2541,6 +2543,7 @@ void vb2_core_queue_release(struct vb2_queue *q)
>      __vb2_queue_free(q, q->max_num_buffers);
>      kfree(q->bufs);
>      q->bufs = NULL;
> +    q->num_buffers = 0;
>      mutex_unlock(&q->mmap_lock);
>  }
>  EXPORT_SYMBOL_GPL(vb2_core_queue_release);
>
> I'm preparing a v9 with this + your patch for Amphion driver:
>
> https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/commits/remove_vb2_queue_limit_v9

Even with this patch I still get errors. Did you just test vivid?

Run 'test-media mc' to test it for all virtual drivers. I get both compliance
failures, a WARN_ON in drivers/media/common/videobuf2/videobuf2-core.c:2489,
and memory leaks. It currently fails and exists completely with the vimc tests,
but things may already have gone pear-shaped with the vim2m test.

I'll mail you the full output from a test run with the build scripts
separately (it's a bit too big to add here).

I don't know if you can run the build script with the -virtme option, instructions
for that are currently only available for debian 12. If not, then ask Sebastian
Fricke (he is working on a docker for this), alternatively you can take
configs/virtme.config from the build scripts git repo and use that as the kernel
config to test under. This config enables memory leak detection and a whole bunch
of other debugging stuff.

Regards,

Hans