2022-03-28 22:10:45

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 00/24] H.264 Field Decoding Support for Frame-based Decoders

Until now, only Cedrus (a slice base decoder) supported interlaced decoding.
In order to support field decoding in our frame-based decoder, the v4l2-h264
library needed adaptation to produce the appropriate reference lists.

This patch extends the v4l2-h264 library to produce the larger references list
needed to represent fields separately. Hantro, MTK-VCODEC and RKVDEC drivers
have been adapted to accommodate the larger lists. Though, only Hantro and
RKVDEC actually have HW support for field decoding. So only these two
have been updated to make use of the larger lists. All this work has been
done using the H.264 specification, LibreELEC downstream kernel patches,
Rockchip MPP reference software and Hantro reference software.

All this work have been tested using GStreamer mainline implementation but also
with FFMPEG LibreELEC fork using the testing tool fluster running through the
ITU-T H.264 (2016-02) AVCv2 set of bitstream. Before this patch, the scores
were:

Hantro:
FFMPEG:
GSteamer:
RKVDEC:
FFMPEG:
GSteamer:

And after these changes:

Hantro:
FFMPEG: 118/135
GSteamer: 129/135
RKVDEC:
FFMPEG: 118/135
GSteamer: 129/135

Note that a bug in FFMPEG / LibreELEC fork was noticed and fixed with the
following change:

diff --git a/libavcodec/v4l2_request_h264.c b/libavcodec/v4l2_request_h264.c
index 88da8f0a2d..394bae0550 100644
--- a/libavcodec/v4l2_request_h264.c
+++ b/libavcodec/v4l2_request_h264.c
@@ -66,7 +66,7 @@ static void fill_dpb_entry(struct v4l2_h264_dpb_entry *entry, const H264Picture
{
entry->reference_ts = ff_v4l2_request_get_capture_timestamp(pic->f);
entry->pic_num = pic->pic_id;
- entry->frame_num = pic->frame_num;
+ entry->frame_num = pic->long_ref ? pic->pic_id : pic->frame_num;
entry->fields = pic->reference & V4L2_H264_FRAME_REF;
entry->flags = V4L2_H264_DPB_ENTRY_FLAG_VALID;
if (entry->fields)

Some useful links:

Detailed Hantro Results: https://gitlab.freedesktop.org/-/snippets/5189
Detailed RKVDEC Results: https://gitlab.freedesktop.org/-/snippets/5253
ITU-T H.264 (2016-02) AVCv2: https://www.itu.int/net/itu-t/sigdb/spevideo/VideoForm-s.aspx?val=102002641
Fluster: https://github.com/fluendo/fluster
GStreamer: https://gitlab.freedesktop.org/gstreamer/gstreamer/
FFMPEG Fork: https://github.com/jernejsk/FFmpeg/tree/v4l2-request-hwaccel-4.4
Rockchip MPP: https://github.com/rockchip-linux/mpp

Alex Bee (1):
media: rkvdec-h264: Don't hardcode SPS/PPS parameters

Jonas Karlman (5):
media: rkvdec: h264: Fix reference frame_num wrap for second field
media: rkvdec: Ensure decoded resolution fit coded resolution
media: rkvdec: h264: Validate and use pic width and height in mbs
media: rkvdec: h264: Fix bit depth wrap in pps packet
media: hantro: h264: Make dpb entry management more robust

Nicolas Dufresne (17):
media: h264: Increase reference lists size to 32
media: doc: Document dual use of H.264 pic_num/frame_num
media: h264: Avoid wrapping long_term_frame_idx
media: h264: Store current picture fields
media: h264: Store all fields into the unordered list
media: v4l2: Trace calculated p/b0/b1 initial reflist
media: h264: Sort p/b reflist using frame_num
media: v4l2: Reorder field reflist
media: v4l2-mem2mem: Fix typo in trace message
media: v4l2-mem2mem: Trace on implicit un-hold
media: rkvdec: Stop overclocking the decoder
media: rkvdec: h264: Fix dpb_valid implementation
media: rkvdec: Enable capture buffer holding for H264
media: rkvdec-h264: Add field decoding support
media: hantro: Enable HOLD_CAPTURE_BUF for H.264
media: hantro: Stop using H.264 parameter pic_num
media: hantro: Add H.264 field decoding support

Sebastian Fricke (1):
media: videobuf2-v4l2: Warn on holding buffers without support

.../media/v4l/ext-ctrls-codec-stateless.rst | 10 +-
.../media/common/videobuf2/videobuf2-v4l2.c | 7 +-
drivers/media/v4l2-core/v4l2-h264.c | 238 +++++++++++++++---
drivers/media/v4l2-core/v4l2-mem2mem.c | 3 +-
drivers/staging/media/hantro/hantro_h264.c | 119 +++++++--
drivers/staging/media/hantro/hantro_hw.h | 7 +-
drivers/staging/media/hantro/hantro_v4l2.c | 25 ++
drivers/staging/media/rkvdec/rkvdec-h264.c | 104 ++++----
drivers/staging/media/rkvdec/rkvdec.c | 22 +-
drivers/staging/media/rkvdec/rkvdec.h | 1 +
include/media/v4l2-h264.h | 20 +-
11 files changed, 431 insertions(+), 125 deletions(-)

base-commit: 51d86122ff02ac2ceef5c0a1cf28f0b5ed580ddd
--
2.34.1


2022-03-28 22:11:29

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 13/24] media: rkvdec: h264: Fix reference frame_num wrap for second field

From: Jonas Karlman <[email protected]>

When decoding the second field in a complementary field pair the second
field is sharing the same frame_num with the first field.

Currently the frame_num for the first field is wrapped when it matches the
field being decoded, this cause issues to decode the second field in a
complementary field pair.

Fix this by using inclusive comparison, less than or equal.

Signed-off-by: Jonas Karlman <[email protected]>
Reviewed-by: Ezequiel Garcia <[email protected]>
---
drivers/staging/media/rkvdec/rkvdec-h264.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/media/rkvdec/rkvdec-h264.c b/drivers/staging/media/rkvdec/rkvdec-h264.c
index dff89732ddd0..842d8cd80e90 100644
--- a/drivers/staging/media/rkvdec/rkvdec-h264.c
+++ b/drivers/staging/media/rkvdec/rkvdec-h264.c
@@ -752,7 +752,7 @@ static void assemble_hw_rps(struct rkvdec_ctx *ctx,
continue;

if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM ||
- dpb[i].frame_num < dec_params->frame_num) {
+ dpb[i].frame_num <= dec_params->frame_num) {
p[i] = dpb[i].frame_num;
continue;
}
--
2.34.1

2022-03-28 22:11:50

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 19/24] media: rkvdec-h264: Add field decoding support

This makes use of the new feature in the reference builder to program
up to 32 references when doing field decoding. It also signals the
parity (top of bottom) of the field to the hardware.

Signed-off-by: Nicolas Dufresne <[email protected]>
---
drivers/staging/media/rkvdec/rkvdec-h264.c | 48 ++++++++++------------
1 file changed, 21 insertions(+), 27 deletions(-)

diff --git a/drivers/staging/media/rkvdec/rkvdec-h264.c b/drivers/staging/media/rkvdec/rkvdec-h264.c
index ec52b195bbd7..891c48bf6a51 100644
--- a/drivers/staging/media/rkvdec/rkvdec-h264.c
+++ b/drivers/staging/media/rkvdec/rkvdec-h264.c
@@ -97,13 +97,10 @@ struct rkvdec_h264_priv_tbl {
u8 err_info[RKV_ERROR_INFO_SIZE];
};

-#define RKVDEC_H264_DPB_SIZE 16
-
struct rkvdec_h264_reflists {
struct v4l2_h264_reference p[V4L2_H264_REF_LIST_LEN];
struct v4l2_h264_reference b0[V4L2_H264_REF_LIST_LEN];
struct v4l2_h264_reference b1[V4L2_H264_REF_LIST_LEN];
- u8 num_valid;
};

struct rkvdec_h264_run {
@@ -738,23 +735,26 @@ static void lookup_ref_buf_idx(struct rkvdec_ctx *ctx,
struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q;
int buf_idx = -1;

- if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
+ if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE) {
buf_idx = vb2_find_timestamp(cap_q,
dpb[i].reference_ts, 0);
+ if (buf_idx < 0)
+ pr_debug("No buffer for reference_ts %llu",
+ dpb[i].reference_ts);
+ }

run->ref_buf_idx[i] = buf_idx;
}
}

static void assemble_hw_rps(struct rkvdec_ctx *ctx,
+ struct v4l2_h264_reflist_builder *builder,
struct rkvdec_h264_run *run)
{
const struct v4l2_ctrl_h264_decode_params *dec_params = run->decode_params;
const struct v4l2_h264_dpb_entry *dpb = dec_params->dpb;
struct rkvdec_h264_ctx *h264_ctx = ctx->priv;
- const struct v4l2_ctrl_h264_sps *sps = run->sps;
struct rkvdec_h264_priv_tbl *priv_tbl = h264_ctx->priv_tbl.cpu;
- u32 max_frame_num = 1 << (sps->log2_max_frame_num_minus4 + 4);

u32 *hw_rps = priv_tbl->rps;
u32 i, j;
@@ -772,37 +772,36 @@ static void assemble_hw_rps(struct rkvdec_ctx *ctx,
if (!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
continue;

- if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM ||
- dpb[i].frame_num <= dec_params->frame_num) {
- p[i] = dpb[i].frame_num;
- continue;
- }
-
- p[i] = dpb[i].frame_num - max_frame_num;
+ p[i] = builder->refs[i].frame_num;
}

for (j = 0; j < RKVDEC_NUM_REFLIST; j++) {
- for (i = 0; i < h264_ctx->reflists.num_valid; i++) {
- u8 dpb_valid = run->ref_buf_idx[i] >= 0;
- u8 idx = 0;
+ for (i = 0; i < builder->num_valid; i++) {
+ struct v4l2_h264_reference *ref;
+ u8 dpb_valid;
+ u8 bottom;

switch (j) {
case 0:
- idx = h264_ctx->reflists.p[i].index;
+ ref = &h264_ctx->reflists.p[i];
break;
case 1:
- idx = h264_ctx->reflists.b0[i].index;
+ ref = &h264_ctx->reflists.b0[i];
break;
case 2:
- idx = h264_ctx->reflists.b1[i].index;
+ ref = &h264_ctx->reflists.b1[i];
break;
}

- if (idx >= ARRAY_SIZE(dec_params->dpb))
+ if (WARN_ON(ref->index >= ARRAY_SIZE(dec_params->dpb)))
continue;

+ dpb_valid = run->ref_buf_idx[ref->index] >= 0;
+ bottom = ref->fields == V4L2_H264_BOTTOM_FIELD_REF;
+
set_ps_field(hw_rps, DPB_INFO(i, j),
- idx | dpb_valid << 4);
+ ref->index | dpb_valid << 4);
+ set_ps_field(hw_rps, BOTTOM_FLAG(i, j), bottom);
}
}
}
@@ -990,10 +989,6 @@ static void config_registers(struct rkvdec_ctx *ctx,
rkvdec->regs + RKVDEC_REG_H264_BASE_REFER15);
}

- /*
- * Since support frame mode only
- * top_field_order_cnt is the same as bottom_field_order_cnt
- */
reg = RKVDEC_CUR_POC(dec_params->top_field_order_cnt);
writel_relaxed(reg, rkvdec->regs + RKVDEC_REG_CUR_POC0);

@@ -1109,7 +1104,6 @@ static int rkvdec_h264_run(struct rkvdec_ctx *ctx)
/* Build the P/B{0,1} ref lists. */
v4l2_h264_init_reflist_builder(&reflist_builder, run.decode_params,
run.sps, run.decode_params->dpb);
- h264_ctx->reflists.num_valid = reflist_builder.num_valid;
v4l2_h264_build_p_ref_list(&reflist_builder, h264_ctx->reflists.p);
v4l2_h264_build_b_ref_lists(&reflist_builder, h264_ctx->reflists.b0,
h264_ctx->reflists.b1);
@@ -1117,7 +1111,7 @@ static int rkvdec_h264_run(struct rkvdec_ctx *ctx)
assemble_hw_scaling_list(ctx, &run);
assemble_hw_pps(ctx, &run);
lookup_ref_buf_idx(ctx, &run);
- assemble_hw_rps(ctx, &run);
+ assemble_hw_rps(ctx, &reflist_builder, &run);
config_registers(ctx, &run);

rkvdec_run_postamble(ctx, &run.base);
--
2.34.1

2022-03-28 22:15:49

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 18/24] media: rkvdec: h264: Fix bit depth wrap in pps packet

From: Jonas Karlman <[email protected]>

The luma and chroma bit depth fields in the pps packet is 3 bits wide.
8 is wrongly added to the bit depth value written to these 3-bit fields.
Because only the 3 LSB is written the hardware is configured correctly.

Correct this by not adding 8 to the luma and chroma bit depth value.

Signed-off-by: Jonas Karlman <[email protected]>
Reviewed-by: Ezequiel Garcia <[email protected]>
---
drivers/staging/media/rkvdec/rkvdec-h264.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/media/rkvdec/rkvdec-h264.c b/drivers/staging/media/rkvdec/rkvdec-h264.c
index 847b8957dad3..ec52b195bbd7 100644
--- a/drivers/staging/media/rkvdec/rkvdec-h264.c
+++ b/drivers/staging/media/rkvdec/rkvdec-h264.c
@@ -662,8 +662,8 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
WRITE_PPS(0xff, PROFILE_IDC);
WRITE_PPS(1, CONSTRAINT_SET3_FLAG);
WRITE_PPS(sps->chroma_format_idc, CHROMA_FORMAT_IDC);
- WRITE_PPS(sps->bit_depth_luma_minus8 + 8, BIT_DEPTH_LUMA);
- WRITE_PPS(sps->bit_depth_chroma_minus8 + 8, BIT_DEPTH_CHROMA);
+ WRITE_PPS(sps->bit_depth_luma_minus8, BIT_DEPTH_LUMA);
+ WRITE_PPS(sps->bit_depth_chroma_minus8, BIT_DEPTH_CHROMA);
WRITE_PPS(0, QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG);
WRITE_PPS(sps->log2_max_frame_num_minus4, LOG2_MAX_FRAME_NUM_MINUS4);
WRITE_PPS(sps->max_num_ref_frames, MAX_NUM_REF_FRAMES);
--
2.34.1

2022-03-28 22:15:52

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 17/24] media: rkvdec: h264: Validate and use pic width and height in mbs

From: Jonas Karlman <[email protected]>

The width and height in mbs is currently configured based on OUTPUT buffer
resolution, this works for frame pictures but can cause issues for field
pictures.

When frame_mbs_only_flag is 0 the height in mbs should be height of
the field instead of height of frame.

Validate pic_width_in_mbs_minus1 and pic_height_in_map_units_minus1
against OUTPUT buffer resolution and use these values to configure HW.

Signed-off-by: Jonas Karlman <[email protected]>
---
drivers/staging/media/rkvdec/rkvdec-h264.c | 4 ++--
drivers/staging/media/rkvdec/rkvdec.c | 10 ++++++++++
2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/media/rkvdec/rkvdec-h264.c b/drivers/staging/media/rkvdec/rkvdec-h264.c
index db1e762baee5..847b8957dad3 100644
--- a/drivers/staging/media/rkvdec/rkvdec-h264.c
+++ b/drivers/staging/media/rkvdec/rkvdec-h264.c
@@ -672,8 +672,8 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
LOG2_MAX_PIC_ORDER_CNT_LSB_MINUS4);
WRITE_PPS(!!(sps->flags & V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO),
DELTA_PIC_ORDER_ALWAYS_ZERO_FLAG);
- WRITE_PPS(DIV_ROUND_UP(ctx->coded_fmt.fmt.pix_mp.width, 16), PIC_WIDTH_IN_MBS);
- WRITE_PPS(DIV_ROUND_UP(ctx->coded_fmt.fmt.pix_mp.height, 16), PIC_HEIGHT_IN_MBS);
+ WRITE_PPS(sps->pic_width_in_mbs_minus1 + 1, PIC_WIDTH_IN_MBS);
+ WRITE_PPS(sps->pic_height_in_map_units_minus1 + 1, PIC_HEIGHT_IN_MBS);
WRITE_PPS(!!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY),
FRAME_MBS_ONLY_FLAG);
WRITE_PPS(!!(sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD),
diff --git a/drivers/staging/media/rkvdec/rkvdec.c b/drivers/staging/media/rkvdec/rkvdec.c
index 22c0382c579e..67539f4bf382 100644
--- a/drivers/staging/media/rkvdec/rkvdec.c
+++ b/drivers/staging/media/rkvdec/rkvdec.c
@@ -29,8 +29,11 @@

static int rkvdec_try_ctrl(struct v4l2_ctrl *ctrl)
{
+ struct rkvdec_ctx *ctx = container_of(ctrl->handler, struct rkvdec_ctx, ctrl_hdl);
+
if (ctrl->id == V4L2_CID_STATELESS_H264_SPS) {
const struct v4l2_ctrl_h264_sps *sps = ctrl->p_new.p_h264_sps;
+ unsigned int width, height;
/*
* TODO: The hardware supports 10-bit and 4:2:2 profiles,
* but it's currently broken in the driver.
@@ -45,6 +48,13 @@ static int rkvdec_try_ctrl(struct v4l2_ctrl *ctrl)
if (sps->bit_depth_luma_minus8 != 0)
/* Only 8-bit is supported */
return -EINVAL;
+
+ width = (sps->pic_width_in_mbs_minus1 + 1) * 16;
+ height = (sps->pic_height_in_map_units_minus1 + 1) * 16;
+
+ if (width > ctx->coded_fmt.fmt.pix_mp.width ||
+ height > ctx->coded_fmt.fmt.pix_mp.height)
+ return -EINVAL;
}
return 0;
}
--
2.34.1

2022-03-28 22:16:27

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 20/24] media: hantro: Enable HOLD_CAPTURE_BUF for H.264

This is needed to optimizing field decoding. Each field will be
decoded in the same capture buffer, so to make use of the queues
we need to be able to ask the driver to keep the capture buffer.

Signed-off-by: Nicolas Dufresne <[email protected]>
---
drivers/staging/media/hantro/hantro_v4l2.c | 25 ++++++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c
index 67148ba346f5..50d636678ff3 100644
--- a/drivers/staging/media/hantro/hantro_v4l2.c
+++ b/drivers/staging/media/hantro/hantro_v4l2.c
@@ -409,6 +409,30 @@ hantro_update_requires_request(struct hantro_ctx *ctx, u32 fourcc)
}
}

+static void
+hantro_update_requires_hold_capture_buf(struct hantro_ctx *ctx, u32 fourcc)
+{
+ struct vb2_queue *vq;
+
+ vq = v4l2_m2m_get_vq(ctx->fh.m2m_ctx,
+ V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
+
+ switch (fourcc) {
+ case V4L2_PIX_FMT_JPEG:
+ case V4L2_PIX_FMT_MPEG2_SLICE:
+ case V4L2_PIX_FMT_VP8_FRAME:
+ case V4L2_PIX_FMT_HEVC_SLICE:
+ case V4L2_PIX_FMT_VP9_FRAME:
+ vq->subsystem_flags &= ~(VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF);
+ break;
+ case V4L2_PIX_FMT_H264_SLICE:
+ vq->subsystem_flags |= VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF;
+ break;
+ default:
+ break;
+ }
+}
+
static int hantro_set_fmt_out(struct hantro_ctx *ctx,
struct v4l2_pix_format_mplane *pix_mp)
{
@@ -472,6 +496,7 @@ static int hantro_set_fmt_out(struct hantro_ctx *ctx,
ctx->dst_fmt.quantization = pix_mp->quantization;

hantro_update_requires_request(ctx, pix_mp->pixelformat);
+ hantro_update_requires_hold_capture_buf(ctx, pix_mp->pixelformat);

vpu_debug(0, "OUTPUT codec mode: %d\n", ctx->vpu_src_fmt->codec_mode);
vpu_debug(0, "fmt - w: %d, h: %d\n",
--
2.34.1

2022-03-28 22:17:43

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 09/24] media: v4l2-mem2mem: Fix typo in trace message

On -> One

Signed-off-by: Nicolas Dufresne <[email protected]>
---
drivers/media/v4l2-core/v4l2-mem2mem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c
index 675e22895ebe..53c2332d5cbd 100644
--- a/drivers/media/v4l2-core/v4l2-mem2mem.c
+++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
@@ -316,7 +316,7 @@ static void __v4l2_m2m_try_queue(struct v4l2_m2m_dev *m2m_dev,
}

if (m2m_ctx->job_flags & TRANS_QUEUED) {
- dprintk("On job queue already\n");
+ dprintk("One job queue already\n");
goto job_unlock;
}

--
2.34.1

2022-03-28 22:17:55

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 21/24] media: hantro: Stop using H.264 parameter pic_num

The hardware expects FrameNumWrap or long_term_frame_idx. Picture
numbers are per field, and are mostly used during the memory
management process, which is done in userland. This fixes two
ITU conformance tests:

- MR6_BT_B
- MR8_BT_B

Signed-off-by: Nicolas Dufresne <[email protected]>
---
drivers/staging/media/hantro/hantro_h264.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
index 0b4d2491be3b..228629fb3cdf 100644
--- a/drivers/staging/media/hantro/hantro_h264.c
+++ b/drivers/staging/media/hantro/hantro_h264.c
@@ -354,8 +354,6 @@ u16 hantro_h264_get_ref_nbr(struct hantro_ctx *ctx, unsigned int dpb_idx)

if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
return 0;
- if (dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
- return dpb->pic_num;
return dpb->frame_num;
}

--
2.34.1

2022-03-28 22:18:50

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 04/24] media: h264: Store current picture fields

This information, also called picture structure will be needed to construct
reference list when decoding field.

Signed-off-by: Nicolas Dufresne <[email protected]>
---
drivers/media/v4l2-core/v4l2-h264.c | 10 +++++++---
include/media/v4l2-h264.h | 4 ++++
2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/media/v4l2-core/v4l2-h264.c b/drivers/media/v4l2-core/v4l2-h264.c
index aebed1cbe05a..4c6bfb057bda 100644
--- a/drivers/media/v4l2-core/v4l2-h264.c
+++ b/drivers/media/v4l2-core/v4l2-h264.c
@@ -34,13 +34,17 @@ v4l2_h264_init_reflist_builder(struct v4l2_h264_reflist_builder *b,
cur_frame_num = dec_params->frame_num;

memset(b, 0, sizeof(*b));
- if (!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC))
+ if (!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC)) {
b->cur_pic_order_count = min(dec_params->bottom_field_order_cnt,
dec_params->top_field_order_cnt);
- else if (dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_BOTTOM_FIELD)
+ b->cur_pic_fields = V4L2_H264_FRAME_REF;
+ } else if (dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_BOTTOM_FIELD) {
b->cur_pic_order_count = dec_params->bottom_field_order_cnt;
- else
+ b->cur_pic_fields = V4L2_H264_BOTTOM_FIELD_REF;
+ } else {
b->cur_pic_order_count = dec_params->top_field_order_cnt;
+ b->cur_pic_fields = V4L2_H264_TOP_FIELD_REF;
+ }

for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++) {
u32 pic_order_count;
diff --git a/include/media/v4l2-h264.h b/include/media/v4l2-h264.h
index e282fb16ac58..e165a54c68fa 100644
--- a/include/media/v4l2-h264.h
+++ b/include/media/v4l2-h264.h
@@ -21,6 +21,7 @@
* @refs.longterm: set to true for a long term reference
* @refs: array of references
* @cur_pic_order_count: picture order count of the frame being decoded
+ * @cur_pic_fields: fields present in the frame being decoded
* @unordered_reflist: unordered list of references. Will be used to generate
* ordered P/B0/B1 lists
* @num_valid: number of valid references in the refs array
@@ -36,7 +37,10 @@ struct v4l2_h264_reflist_builder {
u32 pic_num;
u16 longterm : 1;
} refs[V4L2_H264_NUM_DPB_ENTRIES];
+
s32 cur_pic_order_count;
+ u8 cur_pic_fields;
+
struct v4l2_h264_reference unordered_reflist[V4L2_H264_REF_LIST_LEN];
u8 num_valid;
};
--
2.34.1

2022-03-28 22:21:28

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 11/24] media: videobuf2-v4l2: Warn on holding buffers without support

From: Sebastian Fricke <[email protected]>

Using V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF flag without specifying the
subsystem flag VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF, results in
silently ignoring it.
Warn the user via a debug print when the flag is requested but ignored
by the videobuf2 framework.

Signed-off-by: Sebastian Fricke <[email protected]>
Signed-off-by: Nicolas Dufresne <[email protected]>
---
drivers/media/common/videobuf2/videobuf2-v4l2.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index 6edf4508c636..812c8d1962e0 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -329,8 +329,13 @@ static int vb2_fill_vb2_v4l2_buffer(struct vb2_buffer *vb, struct v4l2_buffer *b
*/
vbuf->flags &= ~V4L2_BUF_FLAG_TIMECODE;
vbuf->field = b->field;
- if (!(q->subsystem_flags & VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF))
+ if (!(q->subsystem_flags & VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF)) {
+ if (vbuf->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF)
+ dprintk(q, 1,
+ "Request holding buffer (%d), unsupported on output queue\n",
+ b->index);
vbuf->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
+ }
} else {
/* Zero any output buffer flags as this is a capture buffer */
vbuf->flags &= ~V4L2_BUFFER_OUT_FLAGS;
--
2.34.1

2022-03-28 22:34:00

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 08/24] media: v4l2: Reorder field reflist

As per spec, the field refslist requires interleaving top and bottom
field in a specific way that does not fit inside the sort operation.
Reorder in-place the references so that their parity sart with the same
parity as the current picture and do that for both short and longterm
references separately.

Signed-off-by: Nicolas Dufresne <[email protected]>
---
drivers/media/v4l2-core/v4l2-h264.c | 45 +++++++++++++++++++++++++++++
1 file changed, 45 insertions(+)

diff --git a/drivers/media/v4l2-core/v4l2-h264.c b/drivers/media/v4l2-core/v4l2-h264.c
index c3fad382882d..2f7ee8d5479f 100644
--- a/drivers/media/v4l2-core/v4l2-h264.c
+++ b/drivers/media/v4l2-core/v4l2-h264.c
@@ -243,6 +243,43 @@ static int v4l2_h264_b1_ref_list_cmp(const void *ptra, const void *ptrb,
return poca < pocb ? -1 : 1;
}

+/*
+ * The references need to be reorder so that reference are alternating between
+ * top and bottom ref starting with the current picture parity. This have to be
+ * done for short term and long term references separately.
+ */
+static void reorder_field_reflist(const struct v4l2_h264_reflist_builder *b,
+ struct v4l2_h264_reference *reflist)
+{
+ struct v4l2_h264_reference tmplist[V4L2_H264_REF_LIST_LEN];
+ u8 lt, i = 0, j = 0, k = 0;
+
+ memcpy(tmplist, reflist, sizeof(tmplist[0]) * b->num_valid);
+
+ for (lt = 0; lt <= 1; lt++) {
+ do {
+ for (; i < b->num_valid && b->refs[tmplist[i].index].longterm == lt; i++) {
+ if (tmplist[i].fields == b->cur_pic_fields) {
+ reflist[k] = tmplist[i];
+ k++;
+ i++;
+ break;
+ }
+ }
+
+ for (; j < b->num_valid && b->refs[tmplist[j].index].longterm == lt; j++) {
+ if (tmplist[j].fields != b->cur_pic_fields) {
+ reflist[k] = tmplist[j];
+ k++;
+ j++;
+ break;
+ }
+ }
+ } while ((i < b->num_valid && b->refs[tmplist[i].index].longterm == lt) ||
+ (j < b->num_valid && b->refs[tmplist[j].index].longterm == lt));
+ }
+}
+
static char ref_type_to_char (u8 ref_type)
{
switch (ref_type) {
@@ -345,6 +382,9 @@ v4l2_h264_build_p_ref_list(const struct v4l2_h264_reflist_builder *builder,
sort_r(reflist, builder->num_valid, sizeof(*reflist),
v4l2_h264_p_ref_list_cmp, NULL, builder);

+ if (builder->cur_pic_fields != V4L2_H264_FRAME_REF)
+ reorder_field_reflist(builder, reflist);
+
print_ref_list_p(builder, reflist);
}
EXPORT_SYMBOL_GPL(v4l2_h264_build_p_ref_list);
@@ -378,6 +418,11 @@ v4l2_h264_build_b_ref_lists(const struct v4l2_h264_reflist_builder *builder,
sort_r(b1_reflist, builder->num_valid, sizeof(*b1_reflist),
v4l2_h264_b1_ref_list_cmp, NULL, builder);

+ if (builder->cur_pic_fields != V4L2_H264_FRAME_REF) {
+ reorder_field_reflist(builder, b0_reflist);
+ reorder_field_reflist(builder, b1_reflist);
+ }
+
if (builder->num_valid > 1 &&
!memcmp(b1_reflist, b0_reflist, builder->num_valid))
swap(b1_reflist[0], b1_reflist[1]);
--
2.34.1

2022-03-28 22:35:29

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 23/24] media: hantro: Add H.264 field decoding support

This adds the required code to support field decoding. While most of
the code is derived from Rockchip and VSI reference code, the
reduction of the reference list to 16 entries has been found by
trial and errors. The list consist of all the references with the
opposite field parity.

The strategy being to deduplicate the reference picture that points
to the same storage (same index). The choice of opposite parity has
been made to keep the other field or a field pair to be kept in the
list. This method may not be robust if a field was lost.

Signed-off-by: Jonas Karlman <[email protected]>
Signed-off-by: Nicolas Dufresne <[email protected]>
---
drivers/staging/media/hantro/hantro_h264.c | 107 ++++++++++++++++++---
drivers/staging/media/hantro/hantro_hw.h | 1 +
2 files changed, 94 insertions(+), 14 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
index 7377fc26f780..f6fc939aa726 100644
--- a/drivers/staging/media/hantro/hantro_h264.c
+++ b/drivers/staging/media/hantro/hantro_h264.c
@@ -22,6 +22,11 @@
#define POC_BUFFER_SIZE 34
#define SCALING_LIST_SIZE (6 * 16 + 2 * 64)

+/* For valid and long term reference marking, index are reversed, so bit 31
+ * indicates the status of the picture 0.
+ */
+#define REF_BIT(i) BIT(32 - 1 - (i))
+
/* Data structure describing auxiliary buffer format. */
struct hantro_h264_dec_priv_tbl {
u32 cabac_table[CABAC_INIT_BUFFER_SIZE];
@@ -227,6 +232,7 @@ static void prepare_table(struct hantro_ctx *ctx)
{
const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
const struct v4l2_ctrl_h264_decode_params *dec_param = ctrls->decode;
+ const struct v4l2_ctrl_h264_sps *sps = ctrls->sps;
struct hantro_h264_dec_priv_tbl *tbl = ctx->h264_dec.priv.cpu;
const struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
u32 dpb_longterm = 0;
@@ -237,20 +243,45 @@ static void prepare_table(struct hantro_ctx *ctx)
tbl->poc[i * 2] = dpb[i].top_field_order_cnt;
tbl->poc[i * 2 + 1] = dpb[i].bottom_field_order_cnt;

+ if (!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_VALID))
+ continue;
+
/*
* Set up bit maps of valid and long term DPBs.
- * NOTE: The bits are reversed, i.e. MSb is DPB 0.
+ * NOTE: The bits are reversed, i.e. MSb is DPB 0. For frame
+ * decoding, bit 31 to 15 are used, while for field decoding,
+ * all bits are used, with bit 31 being a top field, 30 a bottom
+ * field and so on.
*/
- if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
- dpb_valid |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
- if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
- dpb_longterm |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
+ if (dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC) {
+ if (dpb[i].fields & V4L2_H264_TOP_FIELD_REF)
+ dpb_valid |= REF_BIT(i * 2);
+
+ if (dpb[i].fields & V4L2_H264_BOTTOM_FIELD_REF)
+ dpb_valid |= REF_BIT(i * 2 + 1);
+
+ if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM) {
+ dpb_longterm |= REF_BIT(i * 2);
+ dpb_longterm |= REF_BIT(i * 2 + 1);
+ }
+ } else {
+ dpb_valid |= REF_BIT(i);
+
+ if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+ dpb_longterm |= REF_BIT(i);
+ }
+ }
+ ctx->h264_dec.dpb_valid = dpb_valid;
+ ctx->h264_dec.dpb_longterm = dpb_longterm;
+
+ if ((dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC) ||
+ !(sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD)) {
+ tbl->poc[32] = ctx->h264_dec.cur_poc;
+ tbl->poc[33] = 0;
+ } else {
+ tbl->poc[32] = dec_param->top_field_order_cnt;
+ tbl->poc[33] = dec_param->bottom_field_order_cnt;
}
- ctx->h264_dec.dpb_valid = dpb_valid << 16;
- ctx->h264_dec.dpb_longterm = dpb_longterm << 16;
-
- tbl->poc[32] = dec_param->top_field_order_cnt;
- tbl->poc[33] = dec_param->bottom_field_order_cnt;

assemble_scaling_list(ctx);
}
@@ -326,6 +357,8 @@ dma_addr_t hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
{
struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
dma_addr_t dma_addr = 0;
+ s32 cur_poc = ctx->h264_dec.cur_poc;
+ u32 flags;

if (dpb[dpb_idx].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
dma_addr = hantro_get_ref(ctx, dpb[dpb_idx].reference_ts);
@@ -343,7 +376,12 @@ dma_addr_t hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
dma_addr = hantro_get_dec_buf_addr(ctx, buf);
}

- return dma_addr;
+ flags = dpb[dpb_idx].flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD ? 0x2 : 0;
+ flags |= abs(dpb[dpb_idx].top_field_order_cnt - cur_poc) <
+ abs(dpb[dpb_idx].bottom_field_order_cnt - cur_poc) ?
+ 0x1 : 0;
+
+ return dma_addr | flags;
}

u16 hantro_h264_get_ref_nbr(struct hantro_ctx *ctx, unsigned int dpb_idx)
@@ -355,6 +393,34 @@ u16 hantro_h264_get_ref_nbr(struct hantro_ctx *ctx, unsigned int dpb_idx)
return dpb->frame_num;
}

+static void deduplicate_reflist(struct v4l2_h264_reflist_builder *b,
+ struct v4l2_h264_reference *reflist)
+{
+ int write_idx = 0;
+ int i;
+
+ if (b->cur_pic_fields == V4L2_H264_FRAME_REF) {
+ write_idx = b->num_valid;
+ goto done;
+ }
+
+ for (i = 0; i < b->num_valid; i++) {
+ if (!(b->cur_pic_fields == reflist[i].fields)) {
+ reflist[write_idx++] = reflist[i];
+ continue;
+ }
+ }
+
+done:
+ /* Should not happen unless we have a bug in the reflist builder. */
+ if (WARN_ON(write_idx > 16))
+ write_idx = 16;
+
+ /* Clear the remaining, some streams fails otherwise */
+ for (; write_idx < 16; write_idx++)
+ reflist[write_idx].index = 15;
+}
+
int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx)
{
struct hantro_h264_dec_hw_ctx *h264_ctx = &ctx->h264_dec;
@@ -386,15 +452,28 @@ int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx)
/* Update the DPB with new refs. */
update_dpb(ctx);

- /* Prepare data in memory. */
- prepare_table(ctx);
-
/* Build the P/B{0,1} ref lists. */
v4l2_h264_init_reflist_builder(&reflist_builder, ctrls->decode,
ctrls->sps, ctx->h264_dec.dpb);
+ h264_ctx->cur_poc = reflist_builder.cur_pic_order_count;
+
+ /* Prepare data in memory. */
+ prepare_table(ctx);
+
v4l2_h264_build_p_ref_list(&reflist_builder, h264_ctx->reflists.p);
v4l2_h264_build_b_ref_lists(&reflist_builder, h264_ctx->reflists.b0,
h264_ctx->reflists.b1);
+
+ /* Reduce ref lists to at most 16 entries, Hantro hardware will deduce
+ * the actual picture lists in field through the dpb_valid,
+ * dpb_longterm bitmap along with the current frame parity.
+ */
+ if (reflist_builder.cur_pic_fields != V4L2_H264_FRAME_REF) {
+ deduplicate_reflist(&reflist_builder, h264_ctx->reflists.p);
+ deduplicate_reflist(&reflist_builder, h264_ctx->reflists.b0);
+ deduplicate_reflist(&reflist_builder, h264_ctx->reflists.b1);
+ }
+
return 0;
}

diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 292aaaabaf24..fd869369fb97 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -91,6 +91,7 @@ struct hantro_h264_dec_hw_ctx {
struct hantro_h264_dec_ctrls ctrls;
u32 dpb_longterm;
u32 dpb_valid;
+ s32 cur_poc;
};

/**
--
2.34.1

2022-03-28 22:41:18

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 03/24] media: h264: Avoid wrapping long_term_frame_idx

For long term reference, frame_num is set to long_term_frame_idx which
does not require wrapping. This if fixed by observation, no directly
related issue have been found yet.

Signed-off-by: Nicolas Dufresne <[email protected]>
---
drivers/media/v4l2-core/v4l2-h264.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/media/v4l2-core/v4l2-h264.c b/drivers/media/v4l2-core/v4l2-h264.c
index 8d750ee69e74..aebed1cbe05a 100644
--- a/drivers/media/v4l2-core/v4l2-h264.c
+++ b/drivers/media/v4l2-core/v4l2-h264.c
@@ -57,8 +57,10 @@ v4l2_h264_init_reflist_builder(struct v4l2_h264_reflist_builder *b,
* '8.2.4.1 Decoding process for picture numbers' of the spec.
* TODO: This logic will have to be adjusted when we start
* supporting interlaced content.
+ * For long term reference, frame_num is set to
+ * long_term_frame_idx which requires no wrapping.
*/
- if (dpb[i].frame_num > cur_frame_num)
+ if (!b->refs[i].longterm && dpb[i].frame_num > cur_frame_num)
b->refs[i].frame_num = (int)dpb[i].frame_num -
max_frame_num;
else
--
2.34.1

2022-03-28 22:53:05

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 01/24] media: h264: Increase reference lists size to 32

This is to accommodate support for field decoding, which splits the top
and the bottom reference into the reference list.

Signed-off-by: Nicolas Dufresne <[email protected]>
---
drivers/media/v4l2-core/v4l2-h264.c | 6 +++---
drivers/staging/media/hantro/hantro_hw.h | 6 +++---
drivers/staging/media/rkvdec/rkvdec-h264.c | 6 +++---
include/media/v4l2-h264.h | 8 ++++----
4 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/media/v4l2-core/v4l2-h264.c b/drivers/media/v4l2-core/v4l2-h264.c
index 0618c6f52214..8d750ee69e74 100644
--- a/drivers/media/v4l2-core/v4l2-h264.c
+++ b/drivers/media/v4l2-core/v4l2-h264.c
@@ -210,7 +210,7 @@ static int v4l2_h264_b1_ref_list_cmp(const void *ptra, const void *ptrb,
* v4l2_h264_build_p_ref_list() - Build the P reference list
*
* @builder: reference list builder context
- * @reflist: 16 sized array used to store the P reference list. Each entry
+ * @reflist: 32 sized array used to store the P reference list. Each entry
* is a v4l2_h264_reference structure
*
* This functions builds the P reference lists. This procedure is describe in
@@ -233,9 +233,9 @@ EXPORT_SYMBOL_GPL(v4l2_h264_build_p_ref_list);
* v4l2_h264_build_b_ref_lists() - Build the B0/B1 reference lists
*
* @builder: reference list builder context
- * @b0_reflist: 16 sized array used to store the B0 reference list. Each entry
+ * @b0_reflist: 32 sized array used to store the B0 reference list. Each entry
* is a v4l2_h264_reference structure
- * @b1_reflist: 16 sized array used to store the B1 reference list. Each entry
+ * @b1_reflist: 32 sized array used to store the B1 reference list. Each entry
* is a v4l2_h264_reference structure
*
* This functions builds the B0/B1 reference lists. This procedure is described
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 2bc6b8f088f5..292aaaabaf24 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -69,9 +69,9 @@ struct hantro_h264_dec_ctrls {
* @b1: B1 reflist
*/
struct hantro_h264_dec_reflists {
- struct v4l2_h264_reference p[HANTRO_H264_DPB_SIZE];
- struct v4l2_h264_reference b0[HANTRO_H264_DPB_SIZE];
- struct v4l2_h264_reference b1[HANTRO_H264_DPB_SIZE];
+ struct v4l2_h264_reference p[V4L2_H264_REF_LIST_LEN];
+ struct v4l2_h264_reference b0[V4L2_H264_REF_LIST_LEN];
+ struct v4l2_h264_reference b1[V4L2_H264_REF_LIST_LEN];
};

/**
diff --git a/drivers/staging/media/rkvdec/rkvdec-h264.c b/drivers/staging/media/rkvdec/rkvdec-h264.c
index 3c7f3d87fab4..dff89732ddd0 100644
--- a/drivers/staging/media/rkvdec/rkvdec-h264.c
+++ b/drivers/staging/media/rkvdec/rkvdec-h264.c
@@ -100,9 +100,9 @@ struct rkvdec_h264_priv_tbl {
#define RKVDEC_H264_DPB_SIZE 16

struct rkvdec_h264_reflists {
- struct v4l2_h264_reference p[RKVDEC_H264_DPB_SIZE];
- struct v4l2_h264_reference b0[RKVDEC_H264_DPB_SIZE];
- struct v4l2_h264_reference b1[RKVDEC_H264_DPB_SIZE];
+ struct v4l2_h264_reference p[V4L2_H264_REF_LIST_LEN];
+ struct v4l2_h264_reference b0[V4L2_H264_REF_LIST_LEN];
+ struct v4l2_h264_reference b1[V4L2_H264_REF_LIST_LEN];
u8 num_valid;
};

diff --git a/include/media/v4l2-h264.h b/include/media/v4l2-h264.h
index ef9a894e3c32..e282fb16ac58 100644
--- a/include/media/v4l2-h264.h
+++ b/include/media/v4l2-h264.h
@@ -37,7 +37,7 @@ struct v4l2_h264_reflist_builder {
u16 longterm : 1;
} refs[V4L2_H264_NUM_DPB_ENTRIES];
s32 cur_pic_order_count;
- struct v4l2_h264_reference unordered_reflist[V4L2_H264_NUM_DPB_ENTRIES];
+ struct v4l2_h264_reference unordered_reflist[V4L2_H264_REF_LIST_LEN];
u8 num_valid;
};

@@ -51,9 +51,9 @@ v4l2_h264_init_reflist_builder(struct v4l2_h264_reflist_builder *b,
* v4l2_h264_build_b_ref_lists() - Build the B0/B1 reference lists
*
* @builder: reference list builder context
- * @b0_reflist: 16 sized array used to store the B0 reference list. Each entry
+ * @b0_reflist: 32 sized array used to store the B0 reference list. Each entry
* is a v4l2_h264_reference structure
- * @b1_reflist: 16 sized array used to store the B1 reference list. Each entry
+ * @b1_reflist: 32 sized array used to store the B1 reference list. Each entry
* is a v4l2_h264_reference structure
*
* This functions builds the B0/B1 reference lists. This procedure is described
@@ -70,7 +70,7 @@ v4l2_h264_build_b_ref_lists(const struct v4l2_h264_reflist_builder *builder,
* v4l2_h264_build_p_ref_list() - Build the P reference list
*
* @builder: reference list builder context
- * @reflist: 16 sized array used to store the P reference list. Each entry
+ * @reflist: 32 sized array used to store the P reference list. Each entry
* is a v4l2_h264_reference structure
*
* This functions builds the P reference lists. This procedure is describe in
--
2.34.1

2022-03-28 22:54:19

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 02/24] media: doc: Document dual use of H.264 pic_num/frame_num

These two fields needs documentation as they have dual meaning. It is also
confusing since pic_num is a derived value from frame_num, so this should
help application developpers. If we ever need to make a V2 of this API, I
would suggest to remove pic_num entirely.

Signed-off-by: Nicolas Dufresne <[email protected]>
---
.../media/v4l/ext-ctrls-codec-stateless.rst | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst
index 6541e4c32b26..f634f20bcfbe 100644
--- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst
+++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst
@@ -649,10 +649,16 @@ Stateless Codec Control ID
:c:type:`timeval` in struct :c:type:`v4l2_buffer` to a __u64.
* - __u32
- ``pic_num``
- -
+ - For short term reference, this should match the derived value PicNum
+ (8-28) and for long term references it should match the derived value
+ LongTermPicNum (8-29). Note that pic_num is the same as FrameNumWrap
+ for frame decoding.
* - __u16
- ``frame_num``
- -
+ - For short term references, this should match the frame_num value from
+ the slice header syntax (the driver will wrap the value if neeeded). For
+ long term references, this should be set to the value of
+ long_term_frame_idx describes in the dec_ref_pic_marking() syntax.
* - __u8
- ``fields``
- Specifies how the DPB entry is referenced. See :ref:`Reference Fields <h264_ref_fields>`
--
2.34.1

2022-03-28 22:56:08

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 06/24] media: v4l2: Trace calculated p/b0/b1 initial reflist

This is crucial in verifying that the ordering is right, specially with
more complex sorting needed for field decoding.

Signed-off-by: Nicolas Dufresne <[email protected]>
---
drivers/media/v4l2-core/v4l2-h264.c | 86 +++++++++++++++++++++++++++++
1 file changed, 86 insertions(+)

diff --git a/drivers/media/v4l2-core/v4l2-h264.c b/drivers/media/v4l2-core/v4l2-h264.c
index d5698c981973..c9e18fd51d78 100644
--- a/drivers/media/v4l2-core/v4l2-h264.c
+++ b/drivers/media/v4l2-core/v4l2-h264.c
@@ -241,6 +241,87 @@ static int v4l2_h264_b1_ref_list_cmp(const void *ptra, const void *ptrb,
return poca < pocb ? -1 : 1;
}

+static char ref_type_to_char (u8 ref_type)
+{
+ switch (ref_type) {
+ case V4L2_H264_FRAME_REF:
+ return 'f';
+ case V4L2_H264_TOP_FIELD_REF:
+ return 't';
+ case V4L2_H264_BOTTOM_FIELD_REF:
+ return 'b';
+ }
+
+ return '?';
+}
+
+static const char *format_ref_list_p(const struct v4l2_h264_reflist_builder *builder,
+ struct v4l2_h264_reference *reflist,
+ char *out_str, const int len)
+{
+ int n = 0, i;
+
+ n += snprintf(out_str + n, len - n, "|");
+
+ for (i = 0; i < builder->num_valid; i++) {
+ /* this is pic_num for frame and frame_num (wrapped) for field,
+ * but for frame pic_num is equal to frame_num (wrapped).
+ */
+ int frame_num = builder->refs[reflist[i].index].frame_num;
+ bool longterm = builder->refs[reflist[i].index].longterm;
+
+ n += scnprintf(out_str + n, len - n, "%i%c%c|",
+ frame_num, longterm ? 'l' : 's',
+ ref_type_to_char (reflist[i].fields));
+ }
+
+ return out_str;
+}
+
+static void print_ref_list_p(const struct v4l2_h264_reflist_builder *builder,
+ struct v4l2_h264_reference *reflist)
+{
+ char buf[1024];
+
+ pr_debug("ref_pic_list_p (cur_poc %u%c) %s\n",
+ builder->cur_pic_order_count,
+ ref_type_to_char(builder->cur_pic_fields),
+ format_ref_list_p(builder, reflist, buf, sizeof(buf)));
+}
+
+static const char *format_ref_list_b(const struct v4l2_h264_reflist_builder *builder,
+ struct v4l2_h264_reference *reflist,
+ char *out_str, const int len)
+{
+ int n = 0, i;
+
+ n += snprintf(out_str + n, len - n, "|");
+
+ for (i = 0; i < builder->num_valid; i++) {
+ int frame_num = builder->refs[reflist[i].index].frame_num;
+ u32 poc = v4l2_h264_get_poc(builder, reflist + i);
+ bool longterm = builder->refs[reflist[i].index].longterm;
+
+ n += scnprintf(out_str + n, len - n, "%i%c%c|",
+ longterm ? frame_num : poc,
+ longterm ? 'l' : 's',
+ ref_type_to_char(reflist[i].fields));
+ }
+
+ return out_str;
+}
+
+static void print_ref_list_b(const struct v4l2_h264_reflist_builder *builder,
+ struct v4l2_h264_reference *reflist, u8 list_num)
+{
+ char buf[1024];
+
+ pr_debug("ref_pic_list_b%u (cur_poc %u%c) %s",
+ list_num, builder->cur_pic_order_count,
+ ref_type_to_char (builder->cur_pic_fields),
+ format_ref_list_b(builder, reflist, buf, sizeof(buf)));
+}
+
/**
* v4l2_h264_build_p_ref_list() - Build the P reference list
*
@@ -261,6 +342,8 @@ v4l2_h264_build_p_ref_list(const struct v4l2_h264_reflist_builder *builder,
sizeof(builder->unordered_reflist[0]) * builder->num_valid);
sort_r(reflist, builder->num_valid, sizeof(*reflist),
v4l2_h264_p_ref_list_cmp, NULL, builder);
+
+ print_ref_list_p(builder, reflist);
}
EXPORT_SYMBOL_GPL(v4l2_h264_build_p_ref_list);

@@ -296,6 +379,9 @@ v4l2_h264_build_b_ref_lists(const struct v4l2_h264_reflist_builder *builder,
if (builder->num_valid > 1 &&
!memcmp(b1_reflist, b0_reflist, builder->num_valid))
swap(b1_reflist[0], b1_reflist[1]);
+
+ print_ref_list_b(builder, b0_reflist, 0);
+ print_ref_list_b(builder, b1_reflist, 1);
}
EXPORT_SYMBOL_GPL(v4l2_h264_build_b_ref_lists);

--
2.34.1

2022-03-28 23:00:25

by Nicolas Dufresne

[permalink] [raw]
Subject: [PATCH v1 24/24] media: rkvdec-h264: Don't hardcode SPS/PPS parameters

From: Alex Bee <[email protected]>

Some SPS/PPS parameters are currently hardcoded in the driver
even though so do exist in the uapi which is stable by now.

Use them instead of hardcoding them.

Conformance tests have shown there is no difference, but it might
increase decoder performance.

Signed-off-by: Alex Bee <[email protected]>
---
drivers/staging/media/rkvdec/rkvdec-h264.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/media/rkvdec/rkvdec-h264.c b/drivers/staging/media/rkvdec/rkvdec-h264.c
index 891c48bf6a51..91f65d78e453 100644
--- a/drivers/staging/media/rkvdec/rkvdec-h264.c
+++ b/drivers/staging/media/rkvdec/rkvdec-h264.c
@@ -655,13 +655,14 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,

#define WRITE_PPS(value, field) set_ps_field(hw_ps->info, field, value)
/* write sps */
- WRITE_PPS(0xf, SEQ_PARAMETER_SET_ID);
- WRITE_PPS(0xff, PROFILE_IDC);
- WRITE_PPS(1, CONSTRAINT_SET3_FLAG);
+ WRITE_PPS(sps->seq_parameter_set_id, SEQ_PARAMETER_SET_ID);
+ WRITE_PPS(sps->profile_idc, PROFILE_IDC);
+ WRITE_PPS((sps->constraint_set_flags & 1 << 3) ? 1 : 0, CONSTRAINT_SET3_FLAG);
WRITE_PPS(sps->chroma_format_idc, CHROMA_FORMAT_IDC);
WRITE_PPS(sps->bit_depth_luma_minus8, BIT_DEPTH_LUMA);
WRITE_PPS(sps->bit_depth_chroma_minus8, BIT_DEPTH_CHROMA);
- WRITE_PPS(0, QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG);
+ WRITE_PPS(!!(sps->flags & V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS),
+ QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG);
WRITE_PPS(sps->log2_max_frame_num_minus4, LOG2_MAX_FRAME_NUM_MINUS4);
WRITE_PPS(sps->max_num_ref_frames, MAX_NUM_REF_FRAMES);
WRITE_PPS(sps->pic_order_cnt_type, PIC_ORDER_CNT_TYPE);
@@ -679,8 +680,8 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
DIRECT_8X8_INFERENCE_FLAG);

/* write pps */
- WRITE_PPS(0xff, PIC_PARAMETER_SET_ID);
- WRITE_PPS(0x1f, PPS_SEQ_PARAMETER_SET_ID);
+ WRITE_PPS(pps->pic_parameter_set_id, PIC_PARAMETER_SET_ID);
+ WRITE_PPS(pps->seq_parameter_set_id, PPS_SEQ_PARAMETER_SET_ID);
WRITE_PPS(!!(pps->flags & V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE),
ENTROPY_CODING_MODE_FLAG);
WRITE_PPS(!!(pps->flags & V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT),
--
2.34.1

2022-03-29 10:56:17

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 24/24] media: rkvdec-h264: Don't hardcode SPS/PPS parameters

Hey Nicolas,

The patch series doesn't seem to apply on the latest media tree master
branch. Looking at your tree I can see that you have commit: ba2c670a
"media: nxp: Restrict VIDEO_IMX_MIPI_CSIS to ARCH_MXC or COMPILE_TEST
Laurent Pinchart authored 1 week ago "

But the current head of the media tree is: 71e6d0608e4d
"media: platform: Remove unnecessary print function dev_err()
Yang Li authored 13 days ago"

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>From: Alex Bee <[email protected]>
>
>Some SPS/PPS parameters are currently hardcoded in the driver
>even though so do exist in the uapi which is stable by now.

s/even though so/even though they/
>
>Use them instead of hardcoding them.
>
>Conformance tests have shown there is no difference, but it might
>increase decoder performance.

I think it would be great if we could add some performance metrics to
the commit description to have a metric that following patches could
compare themselves with.

Greetings,
Sebastian

>
>Signed-off-by: Alex Bee <[email protected]>
>---
> drivers/staging/media/rkvdec/rkvdec-h264.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
>diff --git a/drivers/staging/media/rkvdec/rkvdec-h264.c b/drivers/staging/media/rkvdec/rkvdec-h264.c
>index 891c48bf6a51..91f65d78e453 100644
>--- a/drivers/staging/media/rkvdec/rkvdec-h264.c
>+++ b/drivers/staging/media/rkvdec/rkvdec-h264.c
>@@ -655,13 +655,14 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
>
> #define WRITE_PPS(value, field) set_ps_field(hw_ps->info, field, value)
> /* write sps */
>- WRITE_PPS(0xf, SEQ_PARAMETER_SET_ID);
>- WRITE_PPS(0xff, PROFILE_IDC);
>- WRITE_PPS(1, CONSTRAINT_SET3_FLAG);
>+ WRITE_PPS(sps->seq_parameter_set_id, SEQ_PARAMETER_SET_ID);
>+ WRITE_PPS(sps->profile_idc, PROFILE_IDC);
>+ WRITE_PPS((sps->constraint_set_flags & 1 << 3) ? 1 : 0, CONSTRAINT_SET3_FLAG);
> WRITE_PPS(sps->chroma_format_idc, CHROMA_FORMAT_IDC);
> WRITE_PPS(sps->bit_depth_luma_minus8, BIT_DEPTH_LUMA);
> WRITE_PPS(sps->bit_depth_chroma_minus8, BIT_DEPTH_CHROMA);
>- WRITE_PPS(0, QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG);
>+ WRITE_PPS(!!(sps->flags & V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS),
>+ QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG);
> WRITE_PPS(sps->log2_max_frame_num_minus4, LOG2_MAX_FRAME_NUM_MINUS4);
> WRITE_PPS(sps->max_num_ref_frames, MAX_NUM_REF_FRAMES);
> WRITE_PPS(sps->pic_order_cnt_type, PIC_ORDER_CNT_TYPE);
>@@ -679,8 +680,8 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
> DIRECT_8X8_INFERENCE_FLAG);
>
> /* write pps */
>- WRITE_PPS(0xff, PIC_PARAMETER_SET_ID);
>- WRITE_PPS(0x1f, PPS_SEQ_PARAMETER_SET_ID);
>+ WRITE_PPS(pps->pic_parameter_set_id, PIC_PARAMETER_SET_ID);
>+ WRITE_PPS(pps->seq_parameter_set_id, PPS_SEQ_PARAMETER_SET_ID);
> WRITE_PPS(!!(pps->flags & V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE),
> ENTROPY_CODING_MODE_FLAG);
> WRITE_PPS(!!(pps->flags & V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT),
>--
>2.34.1
>

2022-03-29 11:23:37

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH v1 19/24] media: rkvdec-h264: Add field decoding support

On Mon, Mar 28, 2022 at 03:59:31PM -0400, Nicolas Dufresne wrote:
> @@ -738,23 +735,26 @@ static void lookup_ref_buf_idx(struct rkvdec_ctx *ctx,
> struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q;
> int buf_idx = -1;
>
> - if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> + if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE) {
> buf_idx = vb2_find_timestamp(cap_q,
> dpb[i].reference_ts, 0);
> + if (buf_idx < 0)
> + pr_debug("No buffer for reference_ts %llu",
> + dpb[i].reference_ts);

pr_debug() is too quiet. Make it pr_err(). Set buf_idx to zero instead
leaving it as an error code.

> + }
>
> run->ref_buf_idx[i] = buf_idx;
> }
> }
>
> static void assemble_hw_rps(struct rkvdec_ctx *ctx,
> + struct v4l2_h264_reflist_builder *builder,
> struct rkvdec_h264_run *run)
> {
> const struct v4l2_ctrl_h264_decode_params *dec_params = run->decode_params;
> const struct v4l2_h264_dpb_entry *dpb = dec_params->dpb;
> struct rkvdec_h264_ctx *h264_ctx = ctx->priv;
> - const struct v4l2_ctrl_h264_sps *sps = run->sps;
> struct rkvdec_h264_priv_tbl *priv_tbl = h264_ctx->priv_tbl.cpu;
> - u32 max_frame_num = 1 << (sps->log2_max_frame_num_minus4 + 4);
>
> u32 *hw_rps = priv_tbl->rps;
> u32 i, j;
> @@ -772,37 +772,36 @@ static void assemble_hw_rps(struct rkvdec_ctx *ctx,
> if (!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> continue;
>
> - if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM ||
> - dpb[i].frame_num <= dec_params->frame_num) {
> - p[i] = dpb[i].frame_num;
> - continue;
> - }
> -
> - p[i] = dpb[i].frame_num - max_frame_num;
> + p[i] = builder->refs[i].frame_num;
> }
>
> for (j = 0; j < RKVDEC_NUM_REFLIST; j++) {
> - for (i = 0; i < h264_ctx->reflists.num_valid; i++) {
> - u8 dpb_valid = run->ref_buf_idx[i] >= 0;
> - u8 idx = 0;
> + for (i = 0; i < builder->num_valid; i++) {
> + struct v4l2_h264_reference *ref;
> + u8 dpb_valid;
> + u8 bottom;

These would be better as type bool.

regards,
dan carpenter

>
> switch (j) {
> case 0:
> - idx = h264_ctx->reflists.p[i].index;
> + ref = &h264_ctx->reflists.p[i];
> break;
> case 1:
> - idx = h264_ctx->reflists.b0[i].index;
> + ref = &h264_ctx->reflists.b0[i];
> break;
> case 2:
> - idx = h264_ctx->reflists.b1[i].index;
> + ref = &h264_ctx->reflists.b1[i];
> break;
> }
>
> - if (idx >= ARRAY_SIZE(dec_params->dpb))
> + if (WARN_ON(ref->index >= ARRAY_SIZE(dec_params->dpb)))
> continue;
>
> + dpb_valid = run->ref_buf_idx[ref->index] >= 0;
> + bottom = ref->fields == V4L2_H264_BOTTOM_FIELD_REF;
> +
> set_ps_field(hw_rps, DPB_INFO(i, j),
> - idx | dpb_valid << 4);
> + ref->index | dpb_valid << 4);
> + set_ps_field(hw_rps, BOTTOM_FLAG(i, j), bottom);
> }
> }
> }

2022-03-29 12:27:30

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 02/24] media: doc: Document dual use of H.264 pic_num/frame_num

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>These two fields needs documentation as they have dual meaning. It is also

s/needs/need/

>confusing since pic_num is a derived value from frame_num, so this should
>help application developpers. If we ever need to make a V2 of this API, I

s/developpers/developers/
(seems to be a common typo among native French speakers ;))

>would suggest to remove pic_num entirely.

I think that suggestion should be placed as a FIXME/TODO comment into the
source code as it is way easier to find for future developers creating a V2.

>
>Signed-off-by: Nicolas Dufresne <[email protected]>
Reviewed-by: Sebastian Fricke <[email protected]>

>---
> .../media/v4l/ext-ctrls-codec-stateless.rst | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
>diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst
>index 6541e4c32b26..f634f20bcfbe 100644
>--- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst
>+++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst
>@@ -649,10 +649,16 @@ Stateless Codec Control ID
> :c:type:`timeval` in struct :c:type:`v4l2_buffer` to a __u64.
> * - __u32
> - ``pic_num``
>- -
>+ - For short term reference, this should match the derived value PicNum
>+ (8-28) and for long term references it should match the derived value

Minor suggestion, for short term you use singular (reference) and for
long term you use plural (references), I would stick to one of both.
(Below you only use plural so maybe stick to plural here as well)

>+ LongTermPicNum (8-29). Note that pic_num is the same as FrameNumWrap
>+ for frame decoding.
> * - __u16
> - ``frame_num``
>- -
>+ - For short term references, this should match the frame_num value from
>+ the slice header syntax (the driver will wrap the value if neeeded). For
>+ long term references, this should be set to the value of
>+ long_term_frame_idx describes in the dec_ref_pic_marking() syntax.

s/describes/described/

Greetings,
Sebastian

> * - __u8
> - ``fields``
> - Specifies how the DPB entry is referenced. See :ref:`Reference Fields <h264_ref_fields>`
>--
>2.34.1
>

2022-03-30 03:18:04

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 04/24] media: h264: Store current picture fields

Hey Nicolas,

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>This information, also called picture structure will be needed to construct

s/picture structure/picture structure,/
(minor.. but still ;))

>reference list when decoding field.

s/will be needed to construct reference list when decoding field./
is required in field decoding mode to construct reference lists./

>
>Signed-off-by: Nicolas Dufresne <[email protected]>
Reviewed-by: Sebastian Fricke <[email protected]>

>---
> drivers/media/v4l2-core/v4l2-h264.c | 10 +++++++---
> include/media/v4l2-h264.h | 4 ++++
> 2 files changed, 11 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/media/v4l2-core/v4l2-h264.c b/drivers/media/v4l2-core/v4l2-h264.c
>index aebed1cbe05a..4c6bfb057bda 100644
>--- a/drivers/media/v4l2-core/v4l2-h264.c
>+++ b/drivers/media/v4l2-core/v4l2-h264.c
>@@ -34,13 +34,17 @@ v4l2_h264_init_reflist_builder(struct v4l2_h264_reflist_builder *b,
> cur_frame_num = dec_params->frame_num;
>
> memset(b, 0, sizeof(*b));
>- if (!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC))
>+ if (!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC)) {
> b->cur_pic_order_count = min(dec_params->bottom_field_order_cnt,
> dec_params->top_field_order_cnt);
>- else if (dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_BOTTOM_FIELD)
>+ b->cur_pic_fields = V4L2_H264_FRAME_REF;
>+ } else if (dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_BOTTOM_FIELD) {
> b->cur_pic_order_count = dec_params->bottom_field_order_cnt;
>- else
>+ b->cur_pic_fields = V4L2_H264_BOTTOM_FIELD_REF;
>+ } else {
> b->cur_pic_order_count = dec_params->top_field_order_cnt;
>+ b->cur_pic_fields = V4L2_H264_TOP_FIELD_REF;
>+ }
>
> for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++) {
> u32 pic_order_count;
>diff --git a/include/media/v4l2-h264.h b/include/media/v4l2-h264.h
>index e282fb16ac58..e165a54c68fa 100644
>--- a/include/media/v4l2-h264.h
>+++ b/include/media/v4l2-h264.h
>@@ -21,6 +21,7 @@
> * @refs.longterm: set to true for a long term reference
> * @refs: array of references
> * @cur_pic_order_count: picture order count of the frame being decoded
>+ * @cur_pic_fields: fields present in the frame being decoded

As there are only three choices:
```
#define V4L2_H264_TOP_FIELD_REF 0x1
#define V4L2_H264_BOTTOM_FIELD_REF 0x2
#define V4L2_H264_FRAME_REF 0x3
```

Maybe it would help to say:
* @cur_pic_fields: fields present in the frame being decoded (top/bottom/both)

Greetings,
Sebastian

> * @unordered_reflist: unordered list of references. Will be used to generate
> * ordered P/B0/B1 lists
> * @num_valid: number of valid references in the refs array
>@@ -36,7 +37,10 @@ struct v4l2_h264_reflist_builder {
> u32 pic_num;
> u16 longterm : 1;
> } refs[V4L2_H264_NUM_DPB_ENTRIES];
>+
> s32 cur_pic_order_count;
>+ u8 cur_pic_fields;
>+
> struct v4l2_h264_reference unordered_reflist[V4L2_H264_REF_LIST_LEN];
> u8 num_valid;
> };
>--
>2.34.1
>

2022-03-30 08:56:55

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 01/24] media: h264: Increase reference lists size to 32

Hey Nicolas,

As mentioned in patch 24 as well:
The patch series doesn't seem to apply on the latest media tree master
branch. Looking at your tree I can see that you have commit: ba2c670a
"media: nxp: Restrict VIDEO_IMX_MIPI_CSIS to ARCH_MXC or COMPILE_TEST
Laurent Pinchart authored 1 week ago "

But the current head of the media tree is: 71e6d0608e4d
"media: platform: Remove unnecessary print function dev_err()
Yang Li authored 13 days ago"

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>This is to accommodate support for field decoding, which splits the top
>and the bottom reference into the reference list.

s/and the bottom reference/and the bottom references/

I think it would be helpful to describe with a bit more detail why field
decoding requires the 32 entry array instead of the 16 entry array.

How about:
"""
In field decoding mode a slice is a sequence of macroblock pairs, where
each pair contains a top and a bottom macroblock. To accommodate for
this mode the reference list must be able contain references for both
top and bottom macroblocks. Double the size of the reference list array
accordingly.
"""

I got the info from the specification at 6.3.

>
>Signed-off-by: Nicolas Dufresne <[email protected]>
Reviewed-by: Sebastian Fricke <[email protected]>
>---
> drivers/media/v4l2-core/v4l2-h264.c | 6 +++---
> drivers/staging/media/hantro/hantro_hw.h | 6 +++---
> drivers/staging/media/rkvdec/rkvdec-h264.c | 6 +++---
> include/media/v4l2-h264.h | 8 ++++----
> 4 files changed, 13 insertions(+), 13 deletions(-)
>
>diff --git a/drivers/media/v4l2-core/v4l2-h264.c b/drivers/media/v4l2-core/v4l2-h264.c
>index 0618c6f52214..8d750ee69e74 100644
>--- a/drivers/media/v4l2-core/v4l2-h264.c
>+++ b/drivers/media/v4l2-core/v4l2-h264.c
>@@ -210,7 +210,7 @@ static int v4l2_h264_b1_ref_list_cmp(const void *ptra, const void *ptrb,
> * v4l2_h264_build_p_ref_list() - Build the P reference list
> *
> * @builder: reference list builder context
>- * @reflist: 16 sized array used to store the P reference list. Each entry
>+ * @reflist: 32 sized array used to store the P reference list. Each entry
> * is a v4l2_h264_reference structure
> *
> * This functions builds the P reference lists. This procedure is describe in

Oh unrelated to the patch but there is a typo: s/describe/described/

Greetings,
Sebastian

>@@ -233,9 +233,9 @@ EXPORT_SYMBOL_GPL(v4l2_h264_build_p_ref_list);
> * v4l2_h264_build_b_ref_lists() - Build the B0/B1 reference lists
> *
> * @builder: reference list builder context
>- * @b0_reflist: 16 sized array used to store the B0 reference list. Each entry
>+ * @b0_reflist: 32 sized array used to store the B0 reference list. Each entry
> * is a v4l2_h264_reference structure
>- * @b1_reflist: 16 sized array used to store the B1 reference list. Each entry
>+ * @b1_reflist: 32 sized array used to store the B1 reference list. Each entry
> * is a v4l2_h264_reference structure
> *
> * This functions builds the B0/B1 reference lists. This procedure is described
>diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
>index 2bc6b8f088f5..292aaaabaf24 100644
>--- a/drivers/staging/media/hantro/hantro_hw.h
>+++ b/drivers/staging/media/hantro/hantro_hw.h
>@@ -69,9 +69,9 @@ struct hantro_h264_dec_ctrls {
> * @b1: B1 reflist
> */
> struct hantro_h264_dec_reflists {
>- struct v4l2_h264_reference p[HANTRO_H264_DPB_SIZE];
>- struct v4l2_h264_reference b0[HANTRO_H264_DPB_SIZE];
>- struct v4l2_h264_reference b1[HANTRO_H264_DPB_SIZE];
>+ struct v4l2_h264_reference p[V4L2_H264_REF_LIST_LEN];
>+ struct v4l2_h264_reference b0[V4L2_H264_REF_LIST_LEN];
>+ struct v4l2_h264_reference b1[V4L2_H264_REF_LIST_LEN];
> };
>
> /**
>diff --git a/drivers/staging/media/rkvdec/rkvdec-h264.c b/drivers/staging/media/rkvdec/rkvdec-h264.c
>index 3c7f3d87fab4..dff89732ddd0 100644
>--- a/drivers/staging/media/rkvdec/rkvdec-h264.c
>+++ b/drivers/staging/media/rkvdec/rkvdec-h264.c
>@@ -100,9 +100,9 @@ struct rkvdec_h264_priv_tbl {
> #define RKVDEC_H264_DPB_SIZE 16
>
> struct rkvdec_h264_reflists {
>- struct v4l2_h264_reference p[RKVDEC_H264_DPB_SIZE];
>- struct v4l2_h264_reference b0[RKVDEC_H264_DPB_SIZE];
>- struct v4l2_h264_reference b1[RKVDEC_H264_DPB_SIZE];
>+ struct v4l2_h264_reference p[V4L2_H264_REF_LIST_LEN];
>+ struct v4l2_h264_reference b0[V4L2_H264_REF_LIST_LEN];
>+ struct v4l2_h264_reference b1[V4L2_H264_REF_LIST_LEN];
> u8 num_valid;
> };
>
>diff --git a/include/media/v4l2-h264.h b/include/media/v4l2-h264.h
>index ef9a894e3c32..e282fb16ac58 100644
>--- a/include/media/v4l2-h264.h
>+++ b/include/media/v4l2-h264.h
>@@ -37,7 +37,7 @@ struct v4l2_h264_reflist_builder {
> u16 longterm : 1;
> } refs[V4L2_H264_NUM_DPB_ENTRIES];
> s32 cur_pic_order_count;
>- struct v4l2_h264_reference unordered_reflist[V4L2_H264_NUM_DPB_ENTRIES];
>+ struct v4l2_h264_reference unordered_reflist[V4L2_H264_REF_LIST_LEN];
> u8 num_valid;
> };
>
>@@ -51,9 +51,9 @@ v4l2_h264_init_reflist_builder(struct v4l2_h264_reflist_builder *b,
> * v4l2_h264_build_b_ref_lists() - Build the B0/B1 reference lists
> *
> * @builder: reference list builder context
>- * @b0_reflist: 16 sized array used to store the B0 reference list. Each entry
>+ * @b0_reflist: 32 sized array used to store the B0 reference list. Each entry
> * is a v4l2_h264_reference structure
>- * @b1_reflist: 16 sized array used to store the B1 reference list. Each entry
>+ * @b1_reflist: 32 sized array used to store the B1 reference list. Each entry
> * is a v4l2_h264_reference structure
> *
> * This functions builds the B0/B1 reference lists. This procedure is described
>@@ -70,7 +70,7 @@ v4l2_h264_build_b_ref_lists(const struct v4l2_h264_reflist_builder *builder,
> * v4l2_h264_build_p_ref_list() - Build the P reference list
> *
> * @builder: reference list builder context
>- * @reflist: 16 sized array used to store the P reference list. Each entry
>+ * @reflist: 32 sized array used to store the P reference list. Each entry
> * is a v4l2_h264_reference structure
> *
> * This functions builds the P reference lists. This procedure is describe in
>--
>2.34.1
>

2022-03-30 09:22:04

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 06/24] media: v4l2: Trace calculated p/b0/b1 initial reflist

Hey Nicolas,

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>This is crucial in verifying that the ordering is right, specially with
>more complex sorting needed for field decoding.

How about:

"""
Add debug print statements to print the content of P & B reference
lists, to verify that the ordering of the generated reference lists is
correct. This is escpecially important for the field decoding mode,
where sorting is more complex.
"""

>
>Signed-off-by: Nicolas Dufresne <[email protected]>
Tested-by: Sebastian Fricke <[email protected]>
Reviewed-by: Sebastian Fricke <[email protected]>

Greetings,
Sebastian

>---
> drivers/media/v4l2-core/v4l2-h264.c | 86 +++++++++++++++++++++++++++++
> 1 file changed, 86 insertions(+)
>
>diff --git a/drivers/media/v4l2-core/v4l2-h264.c b/drivers/media/v4l2-core/v4l2-h264.c
>index d5698c981973..c9e18fd51d78 100644
>--- a/drivers/media/v4l2-core/v4l2-h264.c
>+++ b/drivers/media/v4l2-core/v4l2-h264.c
>@@ -241,6 +241,87 @@ static int v4l2_h264_b1_ref_list_cmp(const void *ptra, const void *ptrb,
> return poca < pocb ? -1 : 1;
> }
>
>+static char ref_type_to_char (u8 ref_type)
>+{
>+ switch (ref_type) {
>+ case V4L2_H264_FRAME_REF:
>+ return 'f';
>+ case V4L2_H264_TOP_FIELD_REF:
>+ return 't';
>+ case V4L2_H264_BOTTOM_FIELD_REF:
>+ return 'b';
>+ }
>+
>+ return '?';
>+}
>+
>+static const char *format_ref_list_p(const struct v4l2_h264_reflist_builder *builder,
>+ struct v4l2_h264_reference *reflist,
>+ char *out_str, const int len)
>+{
>+ int n = 0, i;
>+
>+ n += snprintf(out_str + n, len - n, "|");
>+
>+ for (i = 0; i < builder->num_valid; i++) {
>+ /* this is pic_num for frame and frame_num (wrapped) for field,
>+ * but for frame pic_num is equal to frame_num (wrapped).
>+ */
>+ int frame_num = builder->refs[reflist[i].index].frame_num;
>+ bool longterm = builder->refs[reflist[i].index].longterm;
>+
>+ n += scnprintf(out_str + n, len - n, "%i%c%c|",
>+ frame_num, longterm ? 'l' : 's',
>+ ref_type_to_char (reflist[i].fields));
>+ }
>+
>+ return out_str;
>+}
>+
>+static void print_ref_list_p(const struct v4l2_h264_reflist_builder *builder,
>+ struct v4l2_h264_reference *reflist)
>+{
>+ char buf[1024];
>+
>+ pr_debug("ref_pic_list_p (cur_poc %u%c) %s\n",
>+ builder->cur_pic_order_count,
>+ ref_type_to_char(builder->cur_pic_fields),
>+ format_ref_list_p(builder, reflist, buf, sizeof(buf)));
>+}
>+
>+static const char *format_ref_list_b(const struct v4l2_h264_reflist_builder *builder,
>+ struct v4l2_h264_reference *reflist,
>+ char *out_str, const int len)
>+{
>+ int n = 0, i;
>+
>+ n += snprintf(out_str + n, len - n, "|");
>+
>+ for (i = 0; i < builder->num_valid; i++) {
>+ int frame_num = builder->refs[reflist[i].index].frame_num;
>+ u32 poc = v4l2_h264_get_poc(builder, reflist + i);
>+ bool longterm = builder->refs[reflist[i].index].longterm;
>+
>+ n += scnprintf(out_str + n, len - n, "%i%c%c|",
>+ longterm ? frame_num : poc,
>+ longterm ? 'l' : 's',
>+ ref_type_to_char(reflist[i].fields));
>+ }
>+
>+ return out_str;
>+}
>+
>+static void print_ref_list_b(const struct v4l2_h264_reflist_builder *builder,
>+ struct v4l2_h264_reference *reflist, u8 list_num)
>+{
>+ char buf[1024];
>+
>+ pr_debug("ref_pic_list_b%u (cur_poc %u%c) %s",
>+ list_num, builder->cur_pic_order_count,
>+ ref_type_to_char (builder->cur_pic_fields),
>+ format_ref_list_b(builder, reflist, buf, sizeof(buf)));
>+}
>+
> /**
> * v4l2_h264_build_p_ref_list() - Build the P reference list
> *
>@@ -261,6 +342,8 @@ v4l2_h264_build_p_ref_list(const struct v4l2_h264_reflist_builder *builder,
> sizeof(builder->unordered_reflist[0]) * builder->num_valid);
> sort_r(reflist, builder->num_valid, sizeof(*reflist),
> v4l2_h264_p_ref_list_cmp, NULL, builder);
>+
>+ print_ref_list_p(builder, reflist);
> }
> EXPORT_SYMBOL_GPL(v4l2_h264_build_p_ref_list);
>
>@@ -296,6 +379,9 @@ v4l2_h264_build_b_ref_lists(const struct v4l2_h264_reflist_builder *builder,
> if (builder->num_valid > 1 &&
> !memcmp(b1_reflist, b0_reflist, builder->num_valid))
> swap(b1_reflist[0], b1_reflist[1]);
>+
>+ print_ref_list_b(builder, b0_reflist, 0);
>+ print_ref_list_b(builder, b1_reflist, 1);
> }
> EXPORT_SYMBOL_GPL(v4l2_h264_build_b_ref_lists);
>
>--
>2.34.1
>

2022-03-30 11:50:44

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 09/24] media: v4l2-mem2mem: Fix typo in trace message

Hey Nicolas,

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>On -> One

Hmmmm the message "One job queue already" doesn't sound correct. I think
the message wants to say that the buffer is already on the queue.

We could maybe enhance the message like:
"Buffer already found on the job queue\n"

But this is not a typo from my POV.

Greetings,
Sebastian

>
>Signed-off-by: Nicolas Dufresne <[email protected]>
>---
> drivers/media/v4l2-core/v4l2-mem2mem.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c
>index 675e22895ebe..53c2332d5cbd 100644
>--- a/drivers/media/v4l2-core/v4l2-mem2mem.c
>+++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
>@@ -316,7 +316,7 @@ static void __v4l2_m2m_try_queue(struct v4l2_m2m_dev *m2m_dev,
> }
>
> if (m2m_ctx->job_flags & TRANS_QUEUED) {
>- dprintk("On job queue already\n");
>+ dprintk("One job queue already\n");
> goto job_unlock;
> }
>
>--
>2.34.1
>

2022-03-30 11:57:54

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 19/24] media: rkvdec-h264: Add field decoding support

Hey Nicolas,

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>This makes use of the new feature in the reference builder to program
>up to 32 references when doing field decoding. It also signals the
>parity (top of bottom) of the field to the hardware.

s/top of bottom/top or bottom/

>
>Signed-off-by: Nicolas Dufresne <[email protected]>
Reviewed-by: Sebastian Fricke <[email protected]>

Greetings,
Sebastian

>---
> drivers/staging/media/rkvdec/rkvdec-h264.c | 48 ++++++++++------------
> 1 file changed, 21 insertions(+), 27 deletions(-)
>
>diff --git a/drivers/staging/media/rkvdec/rkvdec-h264.c b/drivers/staging/media/rkvdec/rkvdec-h264.c
>index ec52b195bbd7..891c48bf6a51 100644
>--- a/drivers/staging/media/rkvdec/rkvdec-h264.c
>+++ b/drivers/staging/media/rkvdec/rkvdec-h264.c
>@@ -97,13 +97,10 @@ struct rkvdec_h264_priv_tbl {
> u8 err_info[RKV_ERROR_INFO_SIZE];
> };
>
>-#define RKVDEC_H264_DPB_SIZE 16
>-
> struct rkvdec_h264_reflists {
> struct v4l2_h264_reference p[V4L2_H264_REF_LIST_LEN];
> struct v4l2_h264_reference b0[V4L2_H264_REF_LIST_LEN];
> struct v4l2_h264_reference b1[V4L2_H264_REF_LIST_LEN];
>- u8 num_valid;
> };
>
> struct rkvdec_h264_run {
>@@ -738,23 +735,26 @@ static void lookup_ref_buf_idx(struct rkvdec_ctx *ctx,
> struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q;
> int buf_idx = -1;
>
>- if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
>+ if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE) {
> buf_idx = vb2_find_timestamp(cap_q,
> dpb[i].reference_ts, 0);
>+ if (buf_idx < 0)
>+ pr_debug("No buffer for reference_ts %llu",
>+ dpb[i].reference_ts);
>+ }
>
> run->ref_buf_idx[i] = buf_idx;
> }
> }
>
> static void assemble_hw_rps(struct rkvdec_ctx *ctx,
>+ struct v4l2_h264_reflist_builder *builder,
> struct rkvdec_h264_run *run)
> {
> const struct v4l2_ctrl_h264_decode_params *dec_params = run->decode_params;
> const struct v4l2_h264_dpb_entry *dpb = dec_params->dpb;
> struct rkvdec_h264_ctx *h264_ctx = ctx->priv;
>- const struct v4l2_ctrl_h264_sps *sps = run->sps;
> struct rkvdec_h264_priv_tbl *priv_tbl = h264_ctx->priv_tbl.cpu;
>- u32 max_frame_num = 1 << (sps->log2_max_frame_num_minus4 + 4);
>
> u32 *hw_rps = priv_tbl->rps;
> u32 i, j;
>@@ -772,37 +772,36 @@ static void assemble_hw_rps(struct rkvdec_ctx *ctx,
> if (!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> continue;
>
>- if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM ||
>- dpb[i].frame_num <= dec_params->frame_num) {
>- p[i] = dpb[i].frame_num;
>- continue;
>- }
>-
>- p[i] = dpb[i].frame_num - max_frame_num;
>+ p[i] = builder->refs[i].frame_num;
> }
>
> for (j = 0; j < RKVDEC_NUM_REFLIST; j++) {
>- for (i = 0; i < h264_ctx->reflists.num_valid; i++) {
>- u8 dpb_valid = run->ref_buf_idx[i] >= 0;
>- u8 idx = 0;
>+ for (i = 0; i < builder->num_valid; i++) {
>+ struct v4l2_h264_reference *ref;
>+ u8 dpb_valid;
>+ u8 bottom;
>
> switch (j) {
> case 0:
>- idx = h264_ctx->reflists.p[i].index;
>+ ref = &h264_ctx->reflists.p[i];
> break;
> case 1:
>- idx = h264_ctx->reflists.b0[i].index;
>+ ref = &h264_ctx->reflists.b0[i];
> break;
> case 2:
>- idx = h264_ctx->reflists.b1[i].index;
>+ ref = &h264_ctx->reflists.b1[i];
> break;
> }
>
>- if (idx >= ARRAY_SIZE(dec_params->dpb))
>+ if (WARN_ON(ref->index >= ARRAY_SIZE(dec_params->dpb)))
> continue;
>
>+ dpb_valid = run->ref_buf_idx[ref->index] >= 0;
>+ bottom = ref->fields == V4L2_H264_BOTTOM_FIELD_REF;
>+
> set_ps_field(hw_rps, DPB_INFO(i, j),
>- idx | dpb_valid << 4);
>+ ref->index | dpb_valid << 4);
>+ set_ps_field(hw_rps, BOTTOM_FLAG(i, j), bottom);
> }
> }
> }
>@@ -990,10 +989,6 @@ static void config_registers(struct rkvdec_ctx *ctx,
> rkvdec->regs + RKVDEC_REG_H264_BASE_REFER15);
> }
>
>- /*
>- * Since support frame mode only
>- * top_field_order_cnt is the same as bottom_field_order_cnt
>- */
> reg = RKVDEC_CUR_POC(dec_params->top_field_order_cnt);
> writel_relaxed(reg, rkvdec->regs + RKVDEC_REG_CUR_POC0);
>
>@@ -1109,7 +1104,6 @@ static int rkvdec_h264_run(struct rkvdec_ctx *ctx)
> /* Build the P/B{0,1} ref lists. */
> v4l2_h264_init_reflist_builder(&reflist_builder, run.decode_params,
> run.sps, run.decode_params->dpb);
>- h264_ctx->reflists.num_valid = reflist_builder.num_valid;
> v4l2_h264_build_p_ref_list(&reflist_builder, h264_ctx->reflists.p);
> v4l2_h264_build_b_ref_lists(&reflist_builder, h264_ctx->reflists.b0,
> h264_ctx->reflists.b1);
>@@ -1117,7 +1111,7 @@ static int rkvdec_h264_run(struct rkvdec_ctx *ctx)
> assemble_hw_scaling_list(ctx, &run);
> assemble_hw_pps(ctx, &run);
> lookup_ref_buf_idx(ctx, &run);
>- assemble_hw_rps(ctx, &run);
>+ assemble_hw_rps(ctx, &reflist_builder, &run);
> config_registers(ctx, &run);
>
> rkvdec_run_postamble(ctx, &run.base);
>--
>2.34.1
>

2022-03-30 11:58:22

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 13/24] media: rkvdec: h264: Fix reference frame_num wrap for second field

Hey Nicolas,

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>From: Jonas Karlman <[email protected]>
>
>When decoding the second field in a complementary field pair the second
>field is sharing the same frame_num with the first field.
>
>Currently the frame_num for the first field is wrapped when it matches the
>field being decoded, this cause issues to decode the second field in a
>complementary field pair.
>
>Fix this by using inclusive comparison, less than or equal.
>
>Signed-off-by: Jonas Karlman <[email protected]>
>Reviewed-by: Ezequiel Garcia <[email protected]>
Reviewed-by: Sebastian Fricke <[email protected]>

Greetings,
Sebastian

>---
> drivers/staging/media/rkvdec/rkvdec-h264.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/staging/media/rkvdec/rkvdec-h264.c b/drivers/staging/media/rkvdec/rkvdec-h264.c
>index dff89732ddd0..842d8cd80e90 100644
>--- a/drivers/staging/media/rkvdec/rkvdec-h264.c
>+++ b/drivers/staging/media/rkvdec/rkvdec-h264.c
>@@ -752,7 +752,7 @@ static void assemble_hw_rps(struct rkvdec_ctx *ctx,
> continue;
>
> if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM ||
>- dpb[i].frame_num < dec_params->frame_num) {
>+ dpb[i].frame_num <= dec_params->frame_num) {
> p[i] = dpb[i].frame_num;
> continue;
> }
>--
>2.34.1
>

2022-03-30 14:14:03

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 08/24] media: v4l2: Reorder field reflist

Hey Nicolas,

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>As per spec, the field refslist requires interleaving top and bottom

In other patches you call it always field reflist, so I'd say let's
stick to that:
s/field refslist/field reflist/

>field in a specific way that does not fit inside the sort operation.
>Reorder in-place the references so that their parity sart with the same
>parity as the current picture and do that for both short and longterm
>references separately.

I find that sentence hard to understand, is this maybe better:

"""
Rearrange the references in place so that their parity matches that of
the current image, and do this separately for both short- and long-term
references.
"""

>
>Signed-off-by: Nicolas Dufresne <[email protected]>
Reviewed-by: Sebastian Fricke <[email protected]>

>---
> drivers/media/v4l2-core/v4l2-h264.c | 45 +++++++++++++++++++++++++++++
> 1 file changed, 45 insertions(+)
>
>diff --git a/drivers/media/v4l2-core/v4l2-h264.c b/drivers/media/v4l2-core/v4l2-h264.c
>index c3fad382882d..2f7ee8d5479f 100644
>--- a/drivers/media/v4l2-core/v4l2-h264.c
>+++ b/drivers/media/v4l2-core/v4l2-h264.c
>@@ -243,6 +243,43 @@ static int v4l2_h264_b1_ref_list_cmp(const void *ptra, const void *ptrb,
> return poca < pocb ? -1 : 1;
> }
>
>+/*
>+ * The references need to be reorder so that reference are alternating between

s/to be reorder/to be reordered/
s/ so that reference/, so that references/

>+ * top and bottom ref starting with the current picture parity. This have to be

s/bottom ref/bottom field references/
s/This have to be/This has to be/

>+ * done for short term and long term references separately.
>+ */
>+static void reorder_field_reflist(const struct v4l2_h264_reflist_builder *b,
>+ struct v4l2_h264_reference *reflist)
>+{
>+ struct v4l2_h264_reference tmplist[V4L2_H264_REF_LIST_LEN];
>+ u8 lt, i = 0, j = 0, k = 0;
>+
>+ memcpy(tmplist, reflist, sizeof(tmplist[0]) * b->num_valid);
>+
>+ for (lt = 0; lt <= 1; lt++) {
>+ do {
>+ for (; i < b->num_valid && b->refs[tmplist[i].index].longterm == lt; i++) {
>+ if (tmplist[i].fields == b->cur_pic_fields) {
>+ reflist[k] = tmplist[i];
>+ k++;
>+ i++;

You can just say: `reflist[k++] = tmplist[i++];`

>+ break;
>+ }
>+ }
>+
>+ for (; j < b->num_valid && b->refs[tmplist[j].index].longterm == lt; j++) {
>+ if (tmplist[j].fields != b->cur_pic_fields) {
>+ reflist[k] = tmplist[j];
>+ k++;
>+ j++;

Same here: `reflist[k++] = tmplist[j++];`

Greetings,
Sebastian

>+ break;
>+ }
>+ }
>+ } while ((i < b->num_valid && b->refs[tmplist[i].index].longterm == lt) ||
>+ (j < b->num_valid && b->refs[tmplist[j].index].longterm == lt));
>+ }
>+}
>+
> static char ref_type_to_char (u8 ref_type)
> {
> switch (ref_type) {
>@@ -345,6 +382,9 @@ v4l2_h264_build_p_ref_list(const struct v4l2_h264_reflist_builder *builder,
> sort_r(reflist, builder->num_valid, sizeof(*reflist),
> v4l2_h264_p_ref_list_cmp, NULL, builder);
>
>+ if (builder->cur_pic_fields != V4L2_H264_FRAME_REF)
>+ reorder_field_reflist(builder, reflist);
>+
> print_ref_list_p(builder, reflist);
> }
> EXPORT_SYMBOL_GPL(v4l2_h264_build_p_ref_list);
>@@ -378,6 +418,11 @@ v4l2_h264_build_b_ref_lists(const struct v4l2_h264_reflist_builder *builder,
> sort_r(b1_reflist, builder->num_valid, sizeof(*b1_reflist),
> v4l2_h264_b1_ref_list_cmp, NULL, builder);
>
>+ if (builder->cur_pic_fields != V4L2_H264_FRAME_REF) {
>+ reorder_field_reflist(builder, b0_reflist);
>+ reorder_field_reflist(builder, b1_reflist);
>+ }
>+
> if (builder->num_valid > 1 &&
> !memcmp(b1_reflist, b0_reflist, builder->num_valid))
> swap(b1_reflist[0], b1_reflist[1]);
>--
>2.34.1
>

2022-03-30 14:58:32

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH v1 19/24] media: rkvdec-h264: Add field decoding support

On Tue, Mar 29, 2022 at 04:54:55PM -0400, Nicolas Dufresne wrote:
> Le mardi 29 mars 2022 ? 11:13 +0300, Dan Carpenter a ?crit?:
> > On Mon, Mar 28, 2022 at 03:59:31PM -0400, Nicolas Dufresne wrote:
> > > @@ -738,23 +735,26 @@ static void lookup_ref_buf_idx(struct rkvdec_ctx *ctx,
> > > struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q;
> > > int buf_idx = -1;
> > >
> > > - if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > > + if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE) {
> > > buf_idx = vb2_find_timestamp(cap_q,
> > > dpb[i].reference_ts, 0);
> > > + if (buf_idx < 0)
> > > + pr_debug("No buffer for reference_ts %llu",
> > > + dpb[i].reference_ts);
> >
> > pr_debug() is too quiet. Make it pr_err(). Set buf_idx to zero instead
> > leaving it as an error code.
>
> Thanks for the suggestion, I'm just a bit uncomfortable using pr_err() for
> something that is not a driver error, but userland error. Perhaps you can
> educate me on the policy in this regard, but malicous userland being able to
> flood the logs very easily is my main concern here.
>
> About the negative idx, it is being used set dpb_valid later on. H.264 error
> resilience requires that these frames should be marked as "unexisting" but still
> occupy space in the DPB, this is more or less what I'm trying to implement here.
> Setting it to 0 would basically mean to refer to DPB index 0, which is
> relatively random pick. I believe your suggestion is not taking into
> consideration what the code is doing, but it would fall in some poor-man
> concealment which I would rather leave to the userland.
>

To be honest, I just saw that it was a negative idx and freaked out. I
didn't look at any context...

You're right that we don't to allow the user to spam the dmesg. Ideally
we would return an error. A second best solution is to do a pr_err_once().

> > > for (j = 0; j < RKVDEC_NUM_REFLIST; j++) {
> > > - for (i = 0; i < h264_ctx->reflists.num_valid; i++) {
> > > - u8 dpb_valid = run->ref_buf_idx[i] >= 0;
> > > - u8 idx = 0;
> > > + for (i = 0; i < builder->num_valid; i++) {
> > > + struct v4l2_h264_reference *ref;
> > > + u8 dpb_valid;
> > > + u8 bottom;
> >
> > These would be better as type bool.
>
> I never used a bool for bit operations before, but I guess that can work, thanks
> for the suggestion. As this deviates from the original code, I suppose I should
> make this a separate patch ?

I just saw the name and wondered why it was a u8. bool does make more
sense and works fine for the bitwise stuff. But I don't really care at
all.

regards,
dan carpenter

2022-03-30 16:06:35

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 18/24] media: rkvdec: h264: Fix bit depth wrap in pps packet

Hey Nicolas,

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>From: Jonas Karlman <[email protected]>
>
>The luma and chroma bit depth fields in the pps packet is 3 bits wide.

s/is 3 bits wide/are 3 bits wide/

>8 is wrongly added to the bit depth value written to these 3-bit fields.

s/bit depth value/bit depth values/

(as we talk about multiple different values)

>Because only the 3 LSB is written the hardware is configured correctly.

s/Because only the 3 LSB is written the hardware is configured correctly./
Because only the three least significant bits are written, the hardware will be configured correctly./

(original sentence is very hard to read, the sentence could also mean
something like this:
'Because only the three least significant bits, that are written to the hardware, are configured correctly.')

>
>Correct this by not adding 8 to the luma and chroma bit depth value.
>
>Signed-off-by: Jonas Karlman <[email protected]>
>Reviewed-by: Ezequiel Garcia <[email protected]>
Reviewed-by: Sebastian Fricke <[email protected]>

Greetings,
Sebastian

>---
> drivers/staging/media/rkvdec/rkvdec-h264.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/staging/media/rkvdec/rkvdec-h264.c b/drivers/staging/media/rkvdec/rkvdec-h264.c
>index 847b8957dad3..ec52b195bbd7 100644
>--- a/drivers/staging/media/rkvdec/rkvdec-h264.c
>+++ b/drivers/staging/media/rkvdec/rkvdec-h264.c
>@@ -662,8 +662,8 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
> WRITE_PPS(0xff, PROFILE_IDC);
> WRITE_PPS(1, CONSTRAINT_SET3_FLAG);
> WRITE_PPS(sps->chroma_format_idc, CHROMA_FORMAT_IDC);
>- WRITE_PPS(sps->bit_depth_luma_minus8 + 8, BIT_DEPTH_LUMA);
>- WRITE_PPS(sps->bit_depth_chroma_minus8 + 8, BIT_DEPTH_CHROMA);
>+ WRITE_PPS(sps->bit_depth_luma_minus8, BIT_DEPTH_LUMA);
>+ WRITE_PPS(sps->bit_depth_chroma_minus8, BIT_DEPTH_CHROMA);
> WRITE_PPS(0, QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG);
> WRITE_PPS(sps->log2_max_frame_num_minus4, LOG2_MAX_FRAME_NUM_MINUS4);
> WRITE_PPS(sps->max_num_ref_frames, MAX_NUM_REF_FRAMES);
>--
>2.34.1
>

2022-03-30 17:52:06

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 17/24] media: rkvdec: h264: Validate and use pic width and height in mbs

Hey Nicolas,

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>From: Jonas Karlman <[email protected]>
>
>The width and height in mbs is currently configured based on OUTPUT buffer
>resolution, this works for frame pictures but can cause issues for field
>pictures.

How about:

"""
The width and height measured in macroblocks (mbs) is currently
configured based on the resolution of the OUTPUT buffer, this works for
frame pictures but can cause issues for field pictures..
"""

(I think it improves readability to explain at least once what mbs means
to anyone not aware)

>
>When frame_mbs_only_flag is 0 the height in mbs should be height of
>the field instead of height of frame.
>
>Validate pic_width_in_mbs_minus1 and pic_height_in_map_units_minus1
>against OUTPUT buffer resolution and use these values to configure HW.

s/OUTPUT buffer resolution/the resolution of the OUTPUT buffer/

>
>Signed-off-by: Jonas Karlman <[email protected]>
Reviewed-by: Sebastian Fricke <[email protected]>

Greetings,
Sebastian
>---
> drivers/staging/media/rkvdec/rkvdec-h264.c | 4 ++--
> drivers/staging/media/rkvdec/rkvdec.c | 10 ++++++++++
> 2 files changed, 12 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/staging/media/rkvdec/rkvdec-h264.c b/drivers/staging/media/rkvdec/rkvdec-h264.c
>index db1e762baee5..847b8957dad3 100644
>--- a/drivers/staging/media/rkvdec/rkvdec-h264.c
>+++ b/drivers/staging/media/rkvdec/rkvdec-h264.c
>@@ -672,8 +672,8 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
> LOG2_MAX_PIC_ORDER_CNT_LSB_MINUS4);
> WRITE_PPS(!!(sps->flags & V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO),
> DELTA_PIC_ORDER_ALWAYS_ZERO_FLAG);
>- WRITE_PPS(DIV_ROUND_UP(ctx->coded_fmt.fmt.pix_mp.width, 16), PIC_WIDTH_IN_MBS);
>- WRITE_PPS(DIV_ROUND_UP(ctx->coded_fmt.fmt.pix_mp.height, 16), PIC_HEIGHT_IN_MBS);
>+ WRITE_PPS(sps->pic_width_in_mbs_minus1 + 1, PIC_WIDTH_IN_MBS);
>+ WRITE_PPS(sps->pic_height_in_map_units_minus1 + 1, PIC_HEIGHT_IN_MBS);
> WRITE_PPS(!!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY),
> FRAME_MBS_ONLY_FLAG);
> WRITE_PPS(!!(sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD),
>diff --git a/drivers/staging/media/rkvdec/rkvdec.c b/drivers/staging/media/rkvdec/rkvdec.c
>index 22c0382c579e..67539f4bf382 100644
>--- a/drivers/staging/media/rkvdec/rkvdec.c
>+++ b/drivers/staging/media/rkvdec/rkvdec.c
>@@ -29,8 +29,11 @@
>
> static int rkvdec_try_ctrl(struct v4l2_ctrl *ctrl)
> {
>+ struct rkvdec_ctx *ctx = container_of(ctrl->handler, struct rkvdec_ctx, ctrl_hdl);
>+
> if (ctrl->id == V4L2_CID_STATELESS_H264_SPS) {
> const struct v4l2_ctrl_h264_sps *sps = ctrl->p_new.p_h264_sps;
>+ unsigned int width, height;
> /*
> * TODO: The hardware supports 10-bit and 4:2:2 profiles,
> * but it's currently broken in the driver.
>@@ -45,6 +48,13 @@ static int rkvdec_try_ctrl(struct v4l2_ctrl *ctrl)
> if (sps->bit_depth_luma_minus8 != 0)
> /* Only 8-bit is supported */
> return -EINVAL;
>+
>+ width = (sps->pic_width_in_mbs_minus1 + 1) * 16;
>+ height = (sps->pic_height_in_map_units_minus1 + 1) * 16;
>+
>+ if (width > ctx->coded_fmt.fmt.pix_mp.width ||
>+ height > ctx->coded_fmt.fmt.pix_mp.height)
>+ return -EINVAL;
> }
> return 0;
> }
>--
>2.34.1
>

2022-03-31 02:36:33

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 23/24] media: hantro: Add H.264 field decoding support

Hey Nicolas,

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>This adds the required code to support field decoding. While most of
>the code is derived from Rockchip and VSI reference code, the
>reduction of the reference list to 16 entries has been found by

s/has been/was/

>trial and errors. The list consist of all the references with the

s/consist/consists/

>opposite field parity.
>
>The strategy being to deduplicate the reference picture that points

s/strategy being to/strategy is to/

>to the same storage (same index). The choice of opposite parity has
>been made to keep the other field or a field pair to be kept in the

Do you mean?

s/keep the other field or a field pair to be kept/
keep the other field of a field pair/

>list. This method may not be robust if a field was lost.
>
>Signed-off-by: Jonas Karlman <[email protected]>
>Signed-off-by: Nicolas Dufresne <[email protected]>
>---
> drivers/staging/media/hantro/hantro_h264.c | 107 ++++++++++++++++++---
> drivers/staging/media/hantro/hantro_hw.h | 1 +
> 2 files changed, 94 insertions(+), 14 deletions(-)
>
>diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
>index 7377fc26f780..f6fc939aa726 100644
>--- a/drivers/staging/media/hantro/hantro_h264.c
>+++ b/drivers/staging/media/hantro/hantro_h264.c
>@@ -22,6 +22,11 @@
> #define POC_BUFFER_SIZE 34
> #define SCALING_LIST_SIZE (6 * 16 + 2 * 64)
>
>+/* For valid and long term reference marking, index are reversed, so bit 31

s/index/indeces/

>+ * indicates the status of the picture 0.
>+ */
>+#define REF_BIT(i) BIT(32 - 1 - (i))
>+
> /* Data structure describing auxiliary buffer format. */
> struct hantro_h264_dec_priv_tbl {
> u32 cabac_table[CABAC_INIT_BUFFER_SIZE];
>@@ -227,6 +232,7 @@ static void prepare_table(struct hantro_ctx *ctx)
> {
> const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
> const struct v4l2_ctrl_h264_decode_params *dec_param = ctrls->decode;
>+ const struct v4l2_ctrl_h264_sps *sps = ctrls->sps;
> struct hantro_h264_dec_priv_tbl *tbl = ctx->h264_dec.priv.cpu;
> const struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
> u32 dpb_longterm = 0;
>@@ -237,20 +243,45 @@ static void prepare_table(struct hantro_ctx *ctx)
> tbl->poc[i * 2] = dpb[i].top_field_order_cnt;
> tbl->poc[i * 2 + 1] = dpb[i].bottom_field_order_cnt;
>
>+ if (!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_VALID))
>+ continue;
>+
> /*
> * Set up bit maps of valid and long term DPBs.
>- * NOTE: The bits are reversed, i.e. MSb is DPB 0.
>+ * NOTE: The bits are reversed, i.e. MSb is DPB 0. For frame
>+ * decoding, bit 31 to 15 are used, while for field decoding,

s/bit 31/bits 31/

>+ * all bits are used, with bit 31 being a top field, 30 a bottom
>+ * field and so on.
> */
>- if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
>- dpb_valid |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
>- if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
>- dpb_longterm |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
>+ if (dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC) {
>+ if (dpb[i].fields & V4L2_H264_TOP_FIELD_REF)
>+ dpb_valid |= REF_BIT(i * 2);
>+
>+ if (dpb[i].fields & V4L2_H264_BOTTOM_FIELD_REF)
>+ dpb_valid |= REF_BIT(i * 2 + 1);
>+
>+ if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM) {
>+ dpb_longterm |= REF_BIT(i * 2);
>+ dpb_longterm |= REF_BIT(i * 2 + 1);
>+ }
>+ } else {
>+ dpb_valid |= REF_BIT(i);
>+
>+ if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
>+ dpb_longterm |= REF_BIT(i);
>+ }
>+ }
>+ ctx->h264_dec.dpb_valid = dpb_valid;
>+ ctx->h264_dec.dpb_longterm = dpb_longterm;
>+
>+ if ((dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC) ||
>+ !(sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD)) {
>+ tbl->poc[32] = ctx->h264_dec.cur_poc;
>+ tbl->poc[33] = 0;
>+ } else {
>+ tbl->poc[32] = dec_param->top_field_order_cnt;
>+ tbl->poc[33] = dec_param->bottom_field_order_cnt;
> }
>- ctx->h264_dec.dpb_valid = dpb_valid << 16;
>- ctx->h264_dec.dpb_longterm = dpb_longterm << 16;
>-
>- tbl->poc[32] = dec_param->top_field_order_cnt;
>- tbl->poc[33] = dec_param->bottom_field_order_cnt;
>
> assemble_scaling_list(ctx);
> }
>@@ -326,6 +357,8 @@ dma_addr_t hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
> {
> struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
> dma_addr_t dma_addr = 0;
>+ s32 cur_poc = ctx->h264_dec.cur_poc;
>+ u32 flags;
>
> if (dpb[dpb_idx].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> dma_addr = hantro_get_ref(ctx, dpb[dpb_idx].reference_ts);
>@@ -343,7 +376,12 @@ dma_addr_t hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
> dma_addr = hantro_get_dec_buf_addr(ctx, buf);
> }
>
>- return dma_addr;
>+ flags = dpb[dpb_idx].flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD ? 0x2 : 0;

>+ flags |= abs(dpb[dpb_idx].top_field_order_cnt - cur_poc) <
>+ abs(dpb[dpb_idx].bottom_field_order_cnt - cur_poc) ?
>+ 0x1 : 0;

You use the magic values `0x1` & `0x2` here, can you replace those with
macros?

It looks 0x2 indicates that we have a field and 0x1 indicates the
distance of the current picture to the bottom field is greater than
the distance of the current picture to the top field. (inidicating that
the order is correct?)

So maybe:
```
#define HANTRO_H264_FIELD_DMA_ADDR 0x1
#define HANTRO_H264_CORRECT_FIELD_ORDER 0x2
```


>+
>+ return dma_addr | flags;
> }
>
> u16 hantro_h264_get_ref_nbr(struct hantro_ctx *ctx, unsigned int dpb_idx)
>@@ -355,6 +393,34 @@ u16 hantro_h264_get_ref_nbr(struct hantro_ctx *ctx, unsigned int dpb_idx)
> return dpb->frame_num;
> }
>
>+static void deduplicate_reflist(struct v4l2_h264_reflist_builder *b,
>+ struct v4l2_h264_reference *reflist)

Can you add a comment describing why we need to deduplicate the
reference list? And maybe also why we get duplications in the first
place? Why must we limit the size to 16?
This would increase the readability of the code a lot.

>+{
>+ int write_idx = 0;
>+ int i;
>+
>+ if (b->cur_pic_fields == V4L2_H264_FRAME_REF) {
>+ write_idx = b->num_valid;
>+ goto done;
>+ }
>+
>+ for (i = 0; i < b->num_valid; i++) {
>+ if (!(b->cur_pic_fields == reflist[i].fields)) {
>+ reflist[write_idx++] = reflist[i];
>+ continue;
>+ }
>+ }
>+
>+done:
>+ /* Should not happen unless we have a bug in the reflist builder. */
>+ if (WARN_ON(write_idx > 16))
>+ write_idx = 16;
>+
>+ /* Clear the remaining, some streams fails otherwise */

s/the remaining/the remaining entries/
s/fails/fail/

>+ for (; write_idx < 16; write_idx++)
>+ reflist[write_idx].index = 15;
>+}
>+
> int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx)
> {
> struct hantro_h264_dec_hw_ctx *h264_ctx = &ctx->h264_dec;
>@@ -386,15 +452,28 @@ int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx)
> /* Update the DPB with new refs. */
> update_dpb(ctx);
>
>- /* Prepare data in memory. */
>- prepare_table(ctx);
>-
> /* Build the P/B{0,1} ref lists. */
> v4l2_h264_init_reflist_builder(&reflist_builder, ctrls->decode,
> ctrls->sps, ctx->h264_dec.dpb);
>+ h264_ctx->cur_poc = reflist_builder.cur_pic_order_count;
>+
>+ /* Prepare data in memory. */
>+ prepare_table(ctx);
>+
> v4l2_h264_build_p_ref_list(&reflist_builder, h264_ctx->reflists.p);
> v4l2_h264_build_b_ref_lists(&reflist_builder, h264_ctx->reflists.b0,
> h264_ctx->reflists.b1);
>+
>+ /* Reduce ref lists to at most 16 entries, Hantro hardware will deduce
>+ * the actual picture lists in field through the dpb_valid,

s/in field/in a field/

>+ * dpb_longterm bitmap along with the current frame parity.

s/bitmap/bitmaps/

Greetings,
Sebastian

>+ */
>+ if (reflist_builder.cur_pic_fields != V4L2_H264_FRAME_REF) {
>+ deduplicate_reflist(&reflist_builder, h264_ctx->reflists.p);
>+ deduplicate_reflist(&reflist_builder, h264_ctx->reflists.b0);
>+ deduplicate_reflist(&reflist_builder, h264_ctx->reflists.b1);
>+ }
>+
> return 0;
> }
>
>diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
>index 292aaaabaf24..fd869369fb97 100644
>--- a/drivers/staging/media/hantro/hantro_hw.h
>+++ b/drivers/staging/media/hantro/hantro_hw.h
>@@ -91,6 +91,7 @@ struct hantro_h264_dec_hw_ctx {
> struct hantro_h264_dec_ctrls ctrls;
> u32 dpb_longterm;
> u32 dpb_valid;
>+ s32 cur_poc;
> };
>
> /**
>--
>2.34.1
>

2022-03-31 02:37:37

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 03/24] media: h264: Avoid wrapping long_term_frame_idx

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>For long term reference, frame_num is set to long_term_frame_idx which

s/reference/references/

>does not require wrapping. This if fixed by observation, no directly

s/This if/This is/

>related issue have been found yet.
>
>Signed-off-by: Nicolas Dufresne <[email protected]>
Reviewed-by: Sebastian Fricke <[email protected]>

>---
> drivers/media/v4l2-core/v4l2-h264.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/media/v4l2-core/v4l2-h264.c b/drivers/media/v4l2-core/v4l2-h264.c
>index 8d750ee69e74..aebed1cbe05a 100644
>--- a/drivers/media/v4l2-core/v4l2-h264.c
>+++ b/drivers/media/v4l2-core/v4l2-h264.c
>@@ -57,8 +57,10 @@ v4l2_h264_init_reflist_builder(struct v4l2_h264_reflist_builder *b,
> * '8.2.4.1 Decoding process for picture numbers' of the spec.
> * TODO: This logic will have to be adjusted when we start
> * supporting interlaced content.

As you change the logic can't you remove the TODO comment now?

>+ * For long term reference, frame_num is set to

s/reference/references/

Greetings,
Sebastian

>+ * long_term_frame_idx which requires no wrapping.
> */
>- if (dpb[i].frame_num > cur_frame_num)
>+ if (!b->refs[i].longterm && dpb[i].frame_num > cur_frame_num)
> b->refs[i].frame_num = (int)dpb[i].frame_num -
> max_frame_num;
> else
>--
>2.34.1
>

2022-03-31 02:44:33

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH v1 19/24] media: rkvdec-h264: Add field decoding support

Le mardi 29 mars 2022 à 11:13 +0300, Dan Carpenter a écrit :
> On Mon, Mar 28, 2022 at 03:59:31PM -0400, Nicolas Dufresne wrote:
> > @@ -738,23 +735,26 @@ static void lookup_ref_buf_idx(struct rkvdec_ctx *ctx,
> > struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q;
> > int buf_idx = -1;
> >
> > - if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > + if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE) {
> > buf_idx = vb2_find_timestamp(cap_q,
> > dpb[i].reference_ts, 0);
> > + if (buf_idx < 0)
> > + pr_debug("No buffer for reference_ts %llu",
> > + dpb[i].reference_ts);
>
> pr_debug() is too quiet. Make it pr_err(). Set buf_idx to zero instead
> leaving it as an error code.

Thanks for the suggestion, I'm just a bit uncomfortable using pr_err() for
something that is not a driver error, but userland error. Perhaps you can
educate me on the policy in this regard, but malicous userland being able to
flood the logs very easily is my main concern here.

About the negative idx, it is being used set dpb_valid later on. H.264 error
resilience requires that these frames should be marked as "unexisting" but still
occupy space in the DPB, this is more or less what I'm trying to implement here.
Setting it to 0 would basically mean to refer to DPB index 0, which is
relatively random pick. I believe your suggestion is not taking into
consideration what the code is doing, but it would fall in some poor-man
concealment which I would rather leave to the userland.

>
> > + }
> >
> > run->ref_buf_idx[i] = buf_idx;
> > }
> > }
> >
> > static void assemble_hw_rps(struct rkvdec_ctx *ctx,
> > + struct v4l2_h264_reflist_builder *builder,
> > struct rkvdec_h264_run *run)
> > {
> > const struct v4l2_ctrl_h264_decode_params *dec_params = run->decode_params;
> > const struct v4l2_h264_dpb_entry *dpb = dec_params->dpb;
> > struct rkvdec_h264_ctx *h264_ctx = ctx->priv;
> > - const struct v4l2_ctrl_h264_sps *sps = run->sps;
> > struct rkvdec_h264_priv_tbl *priv_tbl = h264_ctx->priv_tbl.cpu;
> > - u32 max_frame_num = 1 << (sps->log2_max_frame_num_minus4 + 4);
> >
> > u32 *hw_rps = priv_tbl->rps;
> > u32 i, j;
> > @@ -772,37 +772,36 @@ static void assemble_hw_rps(struct rkvdec_ctx *ctx,
> > if (!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > continue;
> >
> > - if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM ||
> > - dpb[i].frame_num <= dec_params->frame_num) {
> > - p[i] = dpb[i].frame_num;
> > - continue;
> > - }
> > -
> > - p[i] = dpb[i].frame_num - max_frame_num;
> > + p[i] = builder->refs[i].frame_num;
> > }
> >
> > for (j = 0; j < RKVDEC_NUM_REFLIST; j++) {
> > - for (i = 0; i < h264_ctx->reflists.num_valid; i++) {
> > - u8 dpb_valid = run->ref_buf_idx[i] >= 0;
> > - u8 idx = 0;
> > + for (i = 0; i < builder->num_valid; i++) {
> > + struct v4l2_h264_reference *ref;
> > + u8 dpb_valid;
> > + u8 bottom;
>
> These would be better as type bool.

I never used a bool for bit operations before, but I guess that can work, thanks
for the suggestion. As this deviates from the original code, I suppose I should
make this a separate patch ?

>
> regards,
> dan carpenter
>
> >
> > switch (j) {
> > case 0:
> > - idx = h264_ctx->reflists.p[i].index;
> > + ref = &h264_ctx->reflists.p[i];
> > break;
> > case 1:
> > - idx = h264_ctx->reflists.b0[i].index;
> > + ref = &h264_ctx->reflists.b0[i];
> > break;
> > case 2:
> > - idx = h264_ctx->reflists.b1[i].index;
> > + ref = &h264_ctx->reflists.b1[i];
> > break;
> > }
> >
> > - if (idx >= ARRAY_SIZE(dec_params->dpb))
> > + if (WARN_ON(ref->index >= ARRAY_SIZE(dec_params->dpb)))
> > continue;
> >
> > + dpb_valid = run->ref_buf_idx[ref->index] >= 0;
> > + bottom = ref->fields == V4L2_H264_BOTTOM_FIELD_REF;
> > +
> > set_ps_field(hw_rps, DPB_INFO(i, j),
> > - idx | dpb_valid << 4);
> > + ref->index | dpb_valid << 4);
> > + set_ps_field(hw_rps, BOTTOM_FLAG(i, j), bottom);
> > }
> > }
> > }
>

2022-03-31 02:52:55

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH v1 09/24] media: v4l2-mem2mem: Fix typo in trace message

Le mardi 29 mars 2022 à 15:57 +0200, Sebastian Fricke a écrit :
> Hey Nicolas,
>
> On 28.03.2022 15:59, Nicolas Dufresne wrote:
> > On -> One
>
> Hmmmm the message "One job queue already" doesn't sound correct. I think
> the message wants to say that the buffer is already on the queue.
>
> We could maybe enhance the message like:
> "Buffer already found on the job queue\n"

I think I read queue -> queued. The new message would be inaccurate with this
suggestion. I'll just drop that patch in V2, the fact that this message was
miss-leading to me is irrelevant to the patchset.
>
> But this is not a typo from my POV.
>
> Greetings,
> Sebastian
>
> >
> > Signed-off-by: Nicolas Dufresne <[email protected]>
> > ---
> > drivers/media/v4l2-core/v4l2-mem2mem.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c
> > index 675e22895ebe..53c2332d5cbd 100644
> > --- a/drivers/media/v4l2-core/v4l2-mem2mem.c
> > +++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
> > @@ -316,7 +316,7 @@ static void __v4l2_m2m_try_queue(struct v4l2_m2m_dev *m2m_dev,
> > }
> >
> > if (m2m_ctx->job_flags & TRANS_QUEUED) {
> > - dprintk("On job queue already\n");
> > + dprintk("One job queue already\n");
> > goto job_unlock;
> > }
> >
> > --
> > 2.34.1
> >

2022-03-31 03:52:07

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 21/24] media: hantro: Stop using H.264 parameter pic_num

Hey Nicolas,

The term pic_num is now only present in the following files:
```
❯ rg 'pic_num'
staging/media/rkvdec/rkvdec-h264.c
766: * Assign an invalid pic_num if DPB entry at that position is inactive.
768: * reference picture with pic_num 0, triggering output picture

media/platform/amphion/vpu_windsor.c
485: u32 pic_num;

media/platform/mediatek/vcodec/vdec/vdec_h264_req_if.c
97: unsigned short pic_num;
346: dst_entry->pic_num = src_entry->pic_num;

media/v4l2-core/v4l2-h264.c
143: * but with frame_num (wrapped). As for frame the pic_num and frame_num
306: /* this is pic_num for frame and frame_num (wrapped) for field,
307: * but for frame pic_num is equal to frame_num (wrapped).
```

In v4l2-h264 and rkvdec-h264 it is only present as comment and the term
is not part of the specification.
In vpu_windsor it is actually never used.
And for the mediatek driver the same might apply.
It might be worth it to get rid of that term all together while you are
at it.

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>The hardware expects FrameNumWrap or long_term_frame_idx. Picture
>numbers are per field, and are mostly used during the memory
>management process, which is done in userland. This fixes two
>ITU conformance tests:
>
> - MR6_BT_B
> - MR8_BT_B
>
>Signed-off-by: Nicolas Dufresne <[email protected]>
Reviewed-by: Sebastian Fricke <[email protected]>

Greetings,
Sebastian
>---
> drivers/staging/media/hantro/hantro_h264.c | 2 --
> 1 file changed, 2 deletions(-)
>
>diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
>index 0b4d2491be3b..228629fb3cdf 100644
>--- a/drivers/staging/media/hantro/hantro_h264.c
>+++ b/drivers/staging/media/hantro/hantro_h264.c
>@@ -354,8 +354,6 @@ u16 hantro_h264_get_ref_nbr(struct hantro_ctx *ctx, unsigned int dpb_idx)
>
> if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> return 0;
>- if (dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
>- return dpb->pic_num;
> return dpb->frame_num;
> }
>
>--
>2.34.1
>

2022-03-31 03:56:14

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH v1 23/24] media: hantro: Add H.264 field decoding support

Le mercredi 30 mars 2022 à 11:03 +0200, Sebastian Fricke a écrit :
> Hey Nicolas,
>
> On 28.03.2022 15:59, Nicolas Dufresne wrote:
> > This adds the required code to support field decoding. While most of
> > the code is derived from Rockchip and VSI reference code, the
> > reduction of the reference list to 16 entries has been found by
>
> s/has been/was/
>
> > trial and errors. The list consist of all the references with the
>
> s/consist/consists/
>
> > opposite field parity.
> >
> > The strategy being to deduplicate the reference picture that points
>
> s/strategy being to/strategy is to/
>
> > to the same storage (same index). The choice of opposite parity has
> > been made to keep the other field or a field pair to be kept in the
>
> Do you mean?
>
> s/keep the other field or a field pair to be kept/
> keep the other field of a field pair/
>
> > list. This method may not be robust if a field was lost.
> >
> > Signed-off-by: Jonas Karlman <[email protected]>
> > Signed-off-by: Nicolas Dufresne <[email protected]>
> > ---
> > drivers/staging/media/hantro/hantro_h264.c | 107 ++++++++++++++++++---
> > drivers/staging/media/hantro/hantro_hw.h | 1 +
> > 2 files changed, 94 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
> > index 7377fc26f780..f6fc939aa726 100644
> > --- a/drivers/staging/media/hantro/hantro_h264.c
> > +++ b/drivers/staging/media/hantro/hantro_h264.c
> > @@ -22,6 +22,11 @@
> > #define POC_BUFFER_SIZE 34
> > #define SCALING_LIST_SIZE (6 * 16 + 2 * 64)
> >
> > +/* For valid and long term reference marking, index are reversed, so bit 31
>
> s/index/indeces/
>
> > + * indicates the status of the picture 0.
> > + */
> > +#define REF_BIT(i) BIT(32 - 1 - (i))
> > +
> > /* Data structure describing auxiliary buffer format. */
> > struct hantro_h264_dec_priv_tbl {
> > u32 cabac_table[CABAC_INIT_BUFFER_SIZE];
> > @@ -227,6 +232,7 @@ static void prepare_table(struct hantro_ctx *ctx)
> > {
> > const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
> > const struct v4l2_ctrl_h264_decode_params *dec_param = ctrls->decode;
> > + const struct v4l2_ctrl_h264_sps *sps = ctrls->sps;
> > struct hantro_h264_dec_priv_tbl *tbl = ctx->h264_dec.priv.cpu;
> > const struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
> > u32 dpb_longterm = 0;
> > @@ -237,20 +243,45 @@ static void prepare_table(struct hantro_ctx *ctx)
> > tbl->poc[i * 2] = dpb[i].top_field_order_cnt;
> > tbl->poc[i * 2 + 1] = dpb[i].bottom_field_order_cnt;
> >
> > + if (!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_VALID))
> > + continue;
> > +
> > /*
> > * Set up bit maps of valid and long term DPBs.
> > - * NOTE: The bits are reversed, i.e. MSb is DPB 0.
> > + * NOTE: The bits are reversed, i.e. MSb is DPB 0. For frame
> > + * decoding, bit 31 to 15 are used, while for field decoding,
>
> s/bit 31/bits 31/
>
> > + * all bits are used, with bit 31 being a top field, 30 a bottom
> > + * field and so on.
> > */
> > - if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > - dpb_valid |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
> > - if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
> > - dpb_longterm |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
> > + if (dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC) {
> > + if (dpb[i].fields & V4L2_H264_TOP_FIELD_REF)
> > + dpb_valid |= REF_BIT(i * 2);
> > +
> > + if (dpb[i].fields & V4L2_H264_BOTTOM_FIELD_REF)
> > + dpb_valid |= REF_BIT(i * 2 + 1);
> > +
> > + if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM) {
> > + dpb_longterm |= REF_BIT(i * 2);
> > + dpb_longterm |= REF_BIT(i * 2 + 1);
> > + }
> > + } else {
> > + dpb_valid |= REF_BIT(i);
> > +
> > + if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
> > + dpb_longterm |= REF_BIT(i);
> > + }
> > + }
> > + ctx->h264_dec.dpb_valid = dpb_valid;
> > + ctx->h264_dec.dpb_longterm = dpb_longterm;
> > +
> > + if ((dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC) ||
> > + !(sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD)) {
> > + tbl->poc[32] = ctx->h264_dec.cur_poc;
> > + tbl->poc[33] = 0;
> > + } else {
> > + tbl->poc[32] = dec_param->top_field_order_cnt;
> > + tbl->poc[33] = dec_param->bottom_field_order_cnt;
> > }
> > - ctx->h264_dec.dpb_valid = dpb_valid << 16;
> > - ctx->h264_dec.dpb_longterm = dpb_longterm << 16;
> > -
> > - tbl->poc[32] = dec_param->top_field_order_cnt;
> > - tbl->poc[33] = dec_param->bottom_field_order_cnt;
> >
> > assemble_scaling_list(ctx);
> > }
> > @@ -326,6 +357,8 @@ dma_addr_t hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
> > {
> > struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
> > dma_addr_t dma_addr = 0;
> > + s32 cur_poc = ctx->h264_dec.cur_poc;
> > + u32 flags;
> >
> > if (dpb[dpb_idx].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > dma_addr = hantro_get_ref(ctx, dpb[dpb_idx].reference_ts);
> > @@ -343,7 +376,12 @@ dma_addr_t hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
> > dma_addr = hantro_get_dec_buf_addr(ctx, buf);
> > }
> >
> > - return dma_addr;
> > + flags = dpb[dpb_idx].flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD ? 0x2 : 0;
>
> > + flags |= abs(dpb[dpb_idx].top_field_order_cnt - cur_poc) <
> > + abs(dpb[dpb_idx].bottom_field_order_cnt - cur_poc) ?
> > + 0x1 : 0;
>
> You use the magic values `0x1` & `0x2` here, can you replace those with
> macros?
>
> It looks 0x2 indicates that we have a field and 0x1 indicates the
> distance of the current picture to the bottom field is greater than
> the distance of the current picture to the top field. (inidicating that
> the order is correct?)
>
> So maybe:
> ```
> #define HANTRO_H264_FIELD_DMA_ADDR 0x1
> #define HANTRO_H264_CORRECT_FIELD_ORDER 0x2
> ```

Will do, I need to check again in the ref code for the appropriate wording for
the meaning of these bits. I don't like much FIELD_DMA_ADDR, as its not an
address, or an offset to an address. Here the address must be aligned, which
results in these bits always being 0, so they reuse it to pass per-reference
information.

>
>
> > +
> > + return dma_addr | flags;
> > }
> >
> > u16 hantro_h264_get_ref_nbr(struct hantro_ctx *ctx, unsigned int dpb_idx)
> > @@ -355,6 +393,34 @@ u16 hantro_h264_get_ref_nbr(struct hantro_ctx *ctx, unsigned int dpb_idx)
> > return dpb->frame_num;
> > }
> >
> > +static void deduplicate_reflist(struct v4l2_h264_reflist_builder *b,
> > + struct v4l2_h264_reference *reflist)
>
> Can you add a comment describing why we need to deduplicate the
> reference list? And maybe also why we get duplications in the first
> place? Why must we limit the size to 16?
> This would increase the readability of the code a lot.

For readers that did understand the H.264 specification, that a field reflist
has 32 entry should be obvious. That hantro uses 16 entry only is an
undocumented trickery, I'll write something to the best of my knowledge, but
this has been reversed, so don't expect the highest rationale. I will though
document here what I say in the commit, e.g. why this method might not be
robust, so someone can give it a try and make it more robust in the future.

>
> > +{
> > + int write_idx = 0;
> > + int i;
> > +
> > + if (b->cur_pic_fields == V4L2_H264_FRAME_REF) {
> > + write_idx = b->num_valid;
> > + goto done;
> > + }
> > +
> > + for (i = 0; i < b->num_valid; i++) {
> > + if (!(b->cur_pic_fields == reflist[i].fields)) {
> > + reflist[write_idx++] = reflist[i];
> > + continue;
> > + }
> > + }
> > +
> > +done:
> > + /* Should not happen unless we have a bug in the reflist builder. */
> > + if (WARN_ON(write_idx > 16))
> > + write_idx = 16;
> > +
> > + /* Clear the remaining, some streams fails otherwise */
>
> s/the remaining/the remaining entries/
> s/fails/fail/
>
> > + for (; write_idx < 16; write_idx++)
> > + reflist[write_idx].index = 15;
> > +}
> > +
> > int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx)
> > {
> > struct hantro_h264_dec_hw_ctx *h264_ctx = &ctx->h264_dec;
> > @@ -386,15 +452,28 @@ int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx)
> > /* Update the DPB with new refs. */
> > update_dpb(ctx);
> >
> > - /* Prepare data in memory. */
> > - prepare_table(ctx);
> > -
> > /* Build the P/B{0,1} ref lists. */
> > v4l2_h264_init_reflist_builder(&reflist_builder, ctrls->decode,
> > ctrls->sps, ctx->h264_dec.dpb);
> > + h264_ctx->cur_poc = reflist_builder.cur_pic_order_count;
> > +
> > + /* Prepare data in memory. */
> > + prepare_table(ctx);
> > +
> > v4l2_h264_build_p_ref_list(&reflist_builder, h264_ctx->reflists.p);
> > v4l2_h264_build_b_ref_lists(&reflist_builder, h264_ctx->reflists.b0,
> > h264_ctx->reflists.b1);
> > +
> > + /* Reduce ref lists to at most 16 entries, Hantro hardware will deduce
> > + * the actual picture lists in field through the dpb_valid,
>
> s/in field/in a field/
>
> > + * dpb_longterm bitmap along with the current frame parity.
>
> s/bitmap/bitmaps/
>
> Greetings,
> Sebastian
>
> > + */
> > + if (reflist_builder.cur_pic_fields != V4L2_H264_FRAME_REF) {
> > + deduplicate_reflist(&reflist_builder, h264_ctx->reflists.p);
> > + deduplicate_reflist(&reflist_builder, h264_ctx->reflists.b0);
> > + deduplicate_reflist(&reflist_builder, h264_ctx->reflists.b1);
> > + }
> > +
> > return 0;
> > }
> >
> > diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
> > index 292aaaabaf24..fd869369fb97 100644
> > --- a/drivers/staging/media/hantro/hantro_hw.h
> > +++ b/drivers/staging/media/hantro/hantro_hw.h
> > @@ -91,6 +91,7 @@ struct hantro_h264_dec_hw_ctx {
> > struct hantro_h264_dec_ctrls ctrls;
> > u32 dpb_longterm;
> > u32 dpb_valid;
> > + s32 cur_poc;
> > };
> >
> > /**
> > --
> > 2.34.1
> >

2022-03-31 04:40:24

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH v1 24/24] media: rkvdec-h264: Don't hardcode SPS/PPS parameters

Le mardi 29 mars 2022 à 09:22 +0200, Sebastian Fricke a écrit :
> Hey Nicolas,
>
> The patch series doesn't seem to apply on the latest media tree master
> branch. Looking at your tree I can see that you have commit: ba2c670a
> "media: nxp: Restrict VIDEO_IMX_MIPI_CSIS to ARCH_MXC or COMPILE_TEST
> Laurent Pinchart authored 1 week ago "
>
> But the current head of the media tree is: 71e6d0608e4d
> "media: platform: Remove unnecessary print function dev_err()
> Yang Li authored 13 days ago"
>
> On 28.03.2022 15:59, Nicolas Dufresne wrote:
> > From: Alex Bee <[email protected]>
> >
> > Some SPS/PPS parameters are currently hardcoded in the driver
> > even though so do exist in the uapi which is stable by now.
>
> s/even though so/even though they/
> >
> > Use them instead of hardcoding them.
> >
> > Conformance tests have shown there is no difference, but it might
> > increase decoder performance.
>
> I think it would be great if we could add some performance metrics to
> the commit description to have a metric that following patches could
> compare themselves with.

Alex, can you extend on this one? I'm not sure how this can impact performance,
so I doubt any mitric will be significant. Can I just drop that part of the
comment ?
>
> Greetings,
> Sebastian
>
> >
> > Signed-off-by: Alex Bee <[email protected]>
> > ---
> > drivers/staging/media/rkvdec/rkvdec-h264.c | 13 +++++++------
> > 1 file changed, 7 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/staging/media/rkvdec/rkvdec-h264.c b/drivers/staging/media/rkvdec/rkvdec-h264.c
> > index 891c48bf6a51..91f65d78e453 100644
> > --- a/drivers/staging/media/rkvdec/rkvdec-h264.c
> > +++ b/drivers/staging/media/rkvdec/rkvdec-h264.c
> > @@ -655,13 +655,14 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
> >
> > #define WRITE_PPS(value, field) set_ps_field(hw_ps->info, field, value)
> > /* write sps */
> > - WRITE_PPS(0xf, SEQ_PARAMETER_SET_ID);
> > - WRITE_PPS(0xff, PROFILE_IDC);
> > - WRITE_PPS(1, CONSTRAINT_SET3_FLAG);
> > + WRITE_PPS(sps->seq_parameter_set_id, SEQ_PARAMETER_SET_ID);
> > + WRITE_PPS(sps->profile_idc, PROFILE_IDC);
> > + WRITE_PPS((sps->constraint_set_flags & 1 << 3) ? 1 : 0, CONSTRAINT_SET3_FLAG);
> > WRITE_PPS(sps->chroma_format_idc, CHROMA_FORMAT_IDC);
> > WRITE_PPS(sps->bit_depth_luma_minus8, BIT_DEPTH_LUMA);
> > WRITE_PPS(sps->bit_depth_chroma_minus8, BIT_DEPTH_CHROMA);
> > - WRITE_PPS(0, QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG);
> > + WRITE_PPS(!!(sps->flags & V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS),
> > + QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG);
> > WRITE_PPS(sps->log2_max_frame_num_minus4, LOG2_MAX_FRAME_NUM_MINUS4);
> > WRITE_PPS(sps->max_num_ref_frames, MAX_NUM_REF_FRAMES);
> > WRITE_PPS(sps->pic_order_cnt_type, PIC_ORDER_CNT_TYPE);
> > @@ -679,8 +680,8 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
> > DIRECT_8X8_INFERENCE_FLAG);
> >
> > /* write pps */
> > - WRITE_PPS(0xff, PIC_PARAMETER_SET_ID);
> > - WRITE_PPS(0x1f, PPS_SEQ_PARAMETER_SET_ID);
> > + WRITE_PPS(pps->pic_parameter_set_id, PIC_PARAMETER_SET_ID);
> > + WRITE_PPS(pps->seq_parameter_set_id, PPS_SEQ_PARAMETER_SET_ID);
> > WRITE_PPS(!!(pps->flags & V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE),
> > ENTROPY_CODING_MODE_FLAG);
> > WRITE_PPS(!!(pps->flags & V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT),
> > --
> > 2.34.1
> >

2022-03-31 04:42:43

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH v1 19/24] media: rkvdec-h264: Add field decoding support

Yeah. I'm aboslutely fine with whatever you do. Some of the questions
you're asking occurred to me too but I don't have the answers.

> > > > > + for (i = 0; i < builder->num_valid; i++) {
> > > > > + struct v4l2_h264_reference *ref;
> > > > > + u8 dpb_valid;
> > > > > + u8 bottom;
> > > >
> > > > These would be better as type bool.
> > >
> > > I never used a bool for bit operations before, but I guess that can work, thanks
> > > for the suggestion. As this deviates from the original code, I suppose I should
> > > make this a separate patch ?
> >
> > I just saw the name and wondered why it was a u8. bool does make more
> > sense and works fine for the bitwise stuff. But I don't really care at
> > all.
>
> I'll do that in v2, in same patch, looks minor enough. I think if using bool
> could guaranty that only 1 or 0 is possible, it would be even better, but don't
> think C works like this.

I'm not sure I understand. If you assign "bool x = <any non-zero>;"
then x is set to true. Do you want a static checker warning for if
<any non-zero> can be something other than one or zero? The problem is
that people sometimes deliberately do stuff like "bool x = var & 0xf0;".
Smatch will complain if you assign a negative value to x.

test.c:8 test() warn: assigning (-3) to unsigned variable 'x'

It's supposed to print a warning if you used it to save error codes like:

x = some_kernel_function();

But it does not. :/ Something to investigate.

regards,
dan carpenter

2022-03-31 04:50:26

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v1 20/24] media: hantro: Enable HOLD_CAPTURE_BUF for H.264

Hey Nicolas,

On 28.03.2022 15:59, Nicolas Dufresne wrote:
>This is needed to optimizing field decoding. Each field will be

s/is needed to optimizing/is needed to optimize/

>decoded in the same capture buffer, so to make use of the queues

s/in the same/into the same/

>we need to be able to ask the driver to keep the capture buffer.

How about:
"""
During field decoding each field will be decoded into the same capture
buffer. Optimise this mode by asking the driver to hold the buffer until
all fields are written into it.
"""

>
>Signed-off-by: Nicolas Dufresne <[email protected]>
Reviewed-by: Sebastian Fricke <[email protected]>

>---
> drivers/staging/media/hantro/hantro_v4l2.c | 25 ++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
>diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c
>index 67148ba346f5..50d636678ff3 100644
>--- a/drivers/staging/media/hantro/hantro_v4l2.c
>+++ b/drivers/staging/media/hantro/hantro_v4l2.c
>@@ -409,6 +409,30 @@ hantro_update_requires_request(struct hantro_ctx *ctx, u32 fourcc)
> }
> }
>
>+static void
>+hantro_update_requires_hold_capture_buf(struct hantro_ctx *ctx, u32 fourcc)
>+{
>+ struct vb2_queue *vq;
>+
>+ vq = v4l2_m2m_get_vq(ctx->fh.m2m_ctx,
>+ V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
>+
>+ switch (fourcc) {
>+ case V4L2_PIX_FMT_JPEG:
>+ case V4L2_PIX_FMT_MPEG2_SLICE:
>+ case V4L2_PIX_FMT_VP8_FRAME:
>+ case V4L2_PIX_FMT_HEVC_SLICE:
>+ case V4L2_PIX_FMT_VP9_FRAME:
>+ vq->subsystem_flags &= ~(VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF);
>+ break;

Out of curiosity, why would it be bad for the other codecs to have
support for that feature activated? As this doesn't actually hold the
buffers but only makes sure that they could be held.

>+ case V4L2_PIX_FMT_H264_SLICE:
>+ vq->subsystem_flags |= VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF;

I think it is worth it to highlight with a comment why only this one
receives support for holding the buffer. As it is quite confusing
without background info and just the code.

How about:
```
/*
* During field decoding in H264, all fields are written into the
* same capture buffer, thus we need to be able to hold the buffer
* until all fields are written to it
*/
```

>+ break;
>+ default:

The only other decoding formats remaining are:
- V4L2_PIX_FMT_NV12_4L4
- V4L2_PIX_FMT_NV12

Both have codec mode HANTRO_MODE_NONE.

My thought is:
If we don't care for these two, the we might as well disable buffer holding
support for them as well. So, we could make this simplier
(but a bit less descriptive):

```
/*
* During field decoding in H264, all fields are written into the
* same capture buffer, thus we need to be able to hold the buffer
* until all fields are written to it
*/
if (fourcc == V4L2_PIX_FMT_H264_SLICE)
vq->subsystem_flags |= VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF;
else
vq->subsystem_flags &= ~(VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF);
```

Greetings,
Sebastian

>+ break;
>+ }
>+}
>+
> static int hantro_set_fmt_out(struct hantro_ctx *ctx,
> struct v4l2_pix_format_mplane *pix_mp)
> {
>@@ -472,6 +496,7 @@ static int hantro_set_fmt_out(struct hantro_ctx *ctx,
> ctx->dst_fmt.quantization = pix_mp->quantization;
>
> hantro_update_requires_request(ctx, pix_mp->pixelformat);
>+ hantro_update_requires_hold_capture_buf(ctx, pix_mp->pixelformat);
>
> vpu_debug(0, "OUTPUT codec mode: %d\n", ctx->vpu_src_fmt->codec_mode);
> vpu_debug(0, "fmt - w: %d, h: %d\n",
>--
>2.34.1
>

2022-03-31 04:50:57

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH v1 19/24] media: rkvdec-h264: Add field decoding support

Le mercredi 30 mars 2022 à 08:15 +0300, Dan Carpenter a écrit :
> On Tue, Mar 29, 2022 at 04:54:55PM -0400, Nicolas Dufresne wrote:
> > Le mardi 29 mars 2022 à 11:13 +0300, Dan Carpenter a écrit :
> > > On Mon, Mar 28, 2022 at 03:59:31PM -0400, Nicolas Dufresne wrote:
> > > > @@ -738,23 +735,26 @@ static void lookup_ref_buf_idx(struct rkvdec_ctx *ctx,
> > > > struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q;
> > > > int buf_idx = -1;
> > > >
> > > > - if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > > > + if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE) {
> > > > buf_idx = vb2_find_timestamp(cap_q,
> > > > dpb[i].reference_ts, 0);
> > > > + if (buf_idx < 0)
> > > > + pr_debug("No buffer for reference_ts %llu",
> > > > + dpb[i].reference_ts);
> > >
> > > pr_debug() is too quiet. Make it pr_err(). Set buf_idx to zero instead
> > > leaving it as an error code.
> >
> > Thanks for the suggestion, I'm just a bit uncomfortable using pr_err() for
> > something that is not a driver error, but userland error. Perhaps you can
> > educate me on the policy in this regard, but malicous userland being able to
> > flood the logs very easily is my main concern here.
> >
> > About the negative idx, it is being used set dpb_valid later on. H.264 error
> > resilience requires that these frames should be marked as "unexisting" but still
> > occupy space in the DPB, this is more or less what I'm trying to implement here.
> > Setting it to 0 would basically mean to refer to DPB index 0, which is
> > relatively random pick. I believe your suggestion is not taking into
> > consideration what the code is doing, but it would fall in some poor-man
> > concealment which I would rather leave to the userland.
> >
>
> To be honest, I just saw that it was a negative idx and freaked out. I
> didn't look at any context...
>
> You're right that we don't to allow the user to spam the dmesg. Ideally
> we would return an error. A second best solution is to do a pr_err_once().

There is two schools in the context of video stream decoding. I'm not saying
this driver is quite there in term of visual corruption reporting, this is
something I expect we'll improve in the long term. But here's the two schools:

- Freeze on the last non-corrupted image (Apple style)
- Display slightly distorted image with movement (the rest of the world)

In order to give users that choice, I must try decoding as much as I can
regardless if there is a missing a reference or not. That wasn't the goal in
this MR, but in the long run we'll remember this and mark the buffer as
corrupted (using the ERROR but setting a payload size to the picture size). This
leaves the users the option to drop or to keep the visually corrupted image. In
video streaming, corrupted stream could look relatively fine, this cannot be
judge noticing 1 reference frame missing (it could be referenced in only 1
macro-block).

Educational bit behind, I think we should keep going and not "error out". Also,
for debugging purpose, it is nicer if we can get a complete report of non-memory
backed references. As a missing reference in 1 frame, may have implication in
later frame, or may cause other frames to be missed.

One thing I was thinking though, is that through using raw pr_debug() I'm not
giving much context to my trace, so it would be hard to associate it with the
driver instance (m2m are multi-instance) it was running against. But I didn't
want to add a new tracing wrapper for that driver in this patchset as it was
already relatively large patchset. Though, these traces have been previous to
make the driver work (as long as you test with a single instance).

If its all right with everyone, I'd leave it like this for this round, we can
dedicated a patchset on improve this driver tracing in the future.

>
> > > > for (j = 0; j < RKVDEC_NUM_REFLIST; j++) {
> > > > - for (i = 0; i < h264_ctx->reflists.num_valid; i++) {
> > > > - u8 dpb_valid = run->ref_buf_idx[i] >= 0;
> > > > - u8 idx = 0;
> > > > + for (i = 0; i < builder->num_valid; i++) {
> > > > + struct v4l2_h264_reference *ref;
> > > > + u8 dpb_valid;
> > > > + u8 bottom;
> > >
> > > These would be better as type bool.
> >
> > I never used a bool for bit operations before, but I guess that can work, thanks
> > for the suggestion. As this deviates from the original code, I suppose I should
> > make this a separate patch ?
>
> I just saw the name and wondered why it was a u8. bool does make more
> sense and works fine for the bitwise stuff. But I don't really care at
> all.

I'll do that in v2, in same patch, looks minor enough. I think if using bool
could guaranty that only 1 or 0 is possible, it would be even better, but don't
think C works like this.

Thanks again for your comments.

>
> regards,
> dan carpenter
>

2022-03-31 05:01:48

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH v1 21/24] media: hantro: Stop using H.264 parameter pic_num

Le mercredi 30 mars 2022 à 09:42 +0200, Sebastian Fricke a écrit :
> Hey Nicolas,
>
> The term pic_num is now only present in the following files:
> ```
> ❯ rg 'pic_num'
> staging/media/rkvdec/rkvdec-h264.c
> 766: * Assign an invalid pic_num if DPB entry at that position is inactive.
> 768: * reference picture with pic_num 0, triggering output picture

I should probably translate this one, since the HW uses frame_num, not pic_num.

>
> media/platform/amphion/vpu_windsor.c
> 485: u32 pic_num;

Amphion Windsor is a stateful driver, I cannot comment on the user of pic_num
for that type of driver.

>
> media/platform/mediatek/vcodec/vdec/vdec_h264_req_if.c
> 97: unsigned short pic_num;
> 346: dst_entry->pic_num = src_entry->pic_num;

This is being sent to the firmware, so its a difficult change to make without
testing it first. I do have HW to test this, but would prefer doing so in a
seperate patchset. Note that MTK does not support field decoding, so pic_num ==
frame_num. So whatever it does here is likely correct.

>
> media/v4l2-core/v4l2-h264.c
> 143: * but with frame_num (wrapped). As for frame the pic_num and frame_num
> 306: /* this is pic_num for frame and frame_num (wrapped) for field,
> 307: * but for frame pic_num is equal to frame_num (wrapped).
> ```
>
> In v4l2-h264 and rkvdec-h264 it is only present as comment and the term
> is not part of the specification.
> In vpu_windsor it is actually never used.
> And for the mediatek driver the same might apply.
> It might be worth it to get rid of that term all together while you are
> at it.

Amphion Windsor is a stateful driver, I'd leave it to the maintainer to cleanup
unused variables if there is. In general the term is not invalid, its just that
the value can be trivially deduced from frame_num and the value depends on the
current picture parity, which makes it an unstable identifier.

>
> On 28.03.2022 15:59, Nicolas Dufresne wrote:
> > The hardware expects FrameNumWrap or long_term_frame_idx. Picture
> > numbers are per field, and are mostly used during the memory
> > management process, which is done in userland. This fixes two
> > ITU conformance tests:
> >
> > - MR6_BT_B
> > - MR8_BT_B
> >
> > Signed-off-by: Nicolas Dufresne <[email protected]>
> Reviewed-by: Sebastian Fricke <[email protected]>
>
> Greetings,
> Sebastian
> > ---
> > drivers/staging/media/hantro/hantro_h264.c | 2 --
> > 1 file changed, 2 deletions(-)
> >
> > diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
> > index 0b4d2491be3b..228629fb3cdf 100644
> > --- a/drivers/staging/media/hantro/hantro_h264.c
> > +++ b/drivers/staging/media/hantro/hantro_h264.c
> > @@ -354,8 +354,6 @@ u16 hantro_h264_get_ref_nbr(struct hantro_ctx *ctx, unsigned int dpb_idx)
> >
> > if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > return 0;
> > - if (dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
> > - return dpb->pic_num;
> > return dpb->frame_num;
> > }
> >
> > --
> > 2.34.1
> >

2022-04-01 00:22:40

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH v1 20/24] media: hantro: Enable HOLD_CAPTURE_BUF for H.264

Le mercredi 30 mars 2022 à 09:36 +0200, Sebastian Fricke a écrit :
> Hey Nicolas,
>
> On 28.03.2022 15:59, Nicolas Dufresne wrote:
> > This is needed to optimizing field decoding. Each field will be
>
> s/is needed to optimizing/is needed to optimize/
>
> > decoded in the same capture buffer, so to make use of the queues
>
> s/in the same/into the same/
>
> > we need to be able to ask the driver to keep the capture buffer.
>
> How about:
> """
> During field decoding each field will be decoded into the same capture
> buffer. Optimise this mode by asking the driver to hold the buffer until
> all fields are written into it.
> """
>
> >
> > Signed-off-by: Nicolas Dufresne <[email protected]>
> Reviewed-by: Sebastian Fricke <[email protected]>

Perhaps avoid giving a reviewed by if you are to comment around modifying the
code ? I will though keep the code as is, I believe there is more good then bad
around the form.

>
> > ---
> > drivers/staging/media/hantro/hantro_v4l2.c | 25 ++++++++++++++++++++++
> > 1 file changed, 25 insertions(+)
> >
> > diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c
> > index 67148ba346f5..50d636678ff3 100644
> > --- a/drivers/staging/media/hantro/hantro_v4l2.c
> > +++ b/drivers/staging/media/hantro/hantro_v4l2.c
> > @@ -409,6 +409,30 @@ hantro_update_requires_request(struct hantro_ctx *ctx, u32 fourcc)
> > }
> > }
> >
> > +static void
> > +hantro_update_requires_hold_capture_buf(struct hantro_ctx *ctx, u32 fourcc)
> > +{
> > + struct vb2_queue *vq;
> > +
> > + vq = v4l2_m2m_get_vq(ctx->fh.m2m_ctx,
> > + V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
> > +
> > + switch (fourcc) {
> > + case V4L2_PIX_FMT_JPEG:
> > + case V4L2_PIX_FMT_MPEG2_SLICE:
> > + case V4L2_PIX_FMT_VP8_FRAME:
> > + case V4L2_PIX_FMT_HEVC_SLICE:
> > + case V4L2_PIX_FMT_VP9_FRAME:
> > + vq->subsystem_flags &= ~(VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF);
> > + break;
>
> Out of curiosity, why would it be bad for the other codecs to have
> support for that feature activated? As this doesn't actually hold the
> buffers but only makes sure that they could be held.
>
> > + case V4L2_PIX_FMT_H264_SLICE:
> > + vq->subsystem_flags |= VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF;
>
> I think it is worth it to highlight with a comment why only this one
> receives support for holding the buffer. As it is quite confusing
> without background info and just the code.

As this code is quite separate from the actual codec code, I believe it will be
more robust this way. It could happen in the future that code get modified
without taking into account that a buffer may be held. This also mimic how this
was implemented in Cedrus fwiw.

Note that it needs to be added for MPEG2 field decoding too, but I believe this
is unrelated to this patchset, the form is nice for adding more in the future.

>
> How about:
> ```
> /*
> * During field decoding in H264, all fields are written into the
> * same capture buffer, thus we need to be able to hold the buffer
> * until all fields are written to it
> */
> ```
>
> > + break;
> > + default:
>
> The only other decoding formats remaining are:
> - V4L2_PIX_FMT_NV12_4L4
> - V4L2_PIX_FMT_NV12

You'll never get raw formats in that switch. The cases are exhaustive for the
context, yet the compiler does not understand that context.

>
> Both have codec mode HANTRO_MODE_NONE.
>
> My thought is:
> If we don't care for these two, the we might as well disable buffer holding
> support for them as well. So, we could make this simplier
> (but a bit less descriptive):
>
> ```
> /*
> * During field decoding in H264, all fields are written into the
> * same capture buffer, thus we need to be able to hold the buffer
> * until all fields are written to it
> */
> if (fourcc == V4L2_PIX_FMT_H264_SLICE)
> vq->subsystem_flags |= VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF;
> else
> vq->subsystem_flags &= ~(VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF);
> ```
>
> Greetings,
> Sebastian
>
> > + break;
> > + }
> > +}
> > +
> > static int hantro_set_fmt_out(struct hantro_ctx *ctx,
> > struct v4l2_pix_format_mplane *pix_mp)
> > {
> > @@ -472,6 +496,7 @@ static int hantro_set_fmt_out(struct hantro_ctx *ctx,
> > ctx->dst_fmt.quantization = pix_mp->quantization;
> >
> > hantro_update_requires_request(ctx, pix_mp->pixelformat);
> > + hantro_update_requires_hold_capture_buf(ctx, pix_mp->pixelformat);
> >
> > vpu_debug(0, "OUTPUT codec mode: %d\n", ctx->vpu_src_fmt->codec_mode);
> > vpu_debug(0, "fmt - w: %d, h: %d\n",
> > --
> > 2.34.1
> >

2022-04-02 13:37:02

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH v1 19/24] media: rkvdec-h264: Add field decoding support

Le mercredi 30 mars 2022 à 18:16 +0300, Dan Carpenter a écrit :
> Yeah. I'm aboslutely fine with whatever you do. Some of the questions
> you're asking occurred to me too but I don't have the answers.
>
> > > > > > + for (i = 0; i < builder->num_valid; i++) {
> > > > > > + struct v4l2_h264_reference *ref;
> > > > > > + u8 dpb_valid;
> > > > > > + u8 bottom;
> > > > >
> > > > > These would be better as type bool.
> > > >
> > > > I never used a bool for bit operations before, but I guess that can work, thanks
> > > > for the suggestion. As this deviates from the original code, I suppose I should
> > > > make this a separate patch ?
> > >
> > > I just saw the name and wondered why it was a u8. bool does make more
> > > sense and works fine for the bitwise stuff. But I don't really care at
> > > all.
> >
> > I'll do that in v2, in same patch, looks minor enough. I think if using bool
> > could guaranty that only 1 or 0 is possible, it would be even better, but don't
> > think C works like this.
>
> I'm not sure I understand. If you assign "bool x = <any non-zero>;"
> then x is set to true. Do you want a static checker warning for if
> <any non-zero> can be something other than one or zero? The problem is
> that people sometimes deliberately do stuff like "bool x = var & 0xf0;".
> Smatch will complain if you assign a negative value to x.
>
> test.c:8 test() warn: assigning (-3) to unsigned variable 'x'
>
> It's supposed to print a warning if you used it to save error codes like:
>
> x = some_kernel_function();
>
> But it does not. :/ Something to investigate.

That would be an amazing catch, you might have seen a lot of:

x = !!(var & 0xf0)

For branches, it does no matter, but if you use x it like this dpb_valid
variable is used, not having 0 or 1 can lead to very surprising results. In the
end its used like this

set_reg(reg0, val | (x << N))

So using bool type can hint the analyzer that 0 or 1 was likely expected, while
currently an u8 would be ambiguous and lead to false positive if we were to
warn.

>
> regards,
> dan carpenter
>

2022-04-04 10:06:55

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH v1 03/24] media: h264: Avoid wrapping long_term_frame_idx

Le mardi 29 mars 2022 à 10:35 +0200, Sebastian Fricke a écrit :
> On 28.03.2022 15:59, Nicolas Dufresne wrote:
> > For long term reference, frame_num is set to long_term_frame_idx which
>
> s/reference/references/
>
> > does not require wrapping. This if fixed by observation, no directly
>
> s/This if/This is/
>
> > related issue have been found yet.
> >
> > Signed-off-by: Nicolas Dufresne <[email protected]>
> Reviewed-by: Sebastian Fricke <[email protected]>
>
> > ---
> > drivers/media/v4l2-core/v4l2-h264.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/media/v4l2-core/v4l2-h264.c b/drivers/media/v4l2-core/v4l2-h264.c
> > index 8d750ee69e74..aebed1cbe05a 100644
> > --- a/drivers/media/v4l2-core/v4l2-h264.c
> > +++ b/drivers/media/v4l2-core/v4l2-h264.c
> > @@ -57,8 +57,10 @@ v4l2_h264_init_reflist_builder(struct v4l2_h264_reflist_builder *b,
> > * '8.2.4.1 Decoding process for picture numbers' of the spec.
> > * TODO: This logic will have to be adjusted when we start
> > * supporting interlaced content.
>
> As you change the logic can't you remove the TODO comment now?

Not yet, as I'm not fixing it here. Its removed in:

[PATCH v1 05/24] media: h264: Store all fields into the unordered list

>
> > + * For long term reference, frame_num is set to
>
> s/reference/references/
>
> Greetings,
> Sebastian
>
> > + * long_term_frame_idx which requires no wrapping.
> > */
> > - if (dpb[i].frame_num > cur_frame_num)
> > + if (!b->refs[i].longterm && dpb[i].frame_num > cur_frame_num)
> > b->refs[i].frame_num = (int)dpb[i].frame_num -
> > max_frame_num;
> > else
> > --
> > 2.34.1
> >