Received: by 2002:ab2:6816:0:b0:1f9:5764:f03e with SMTP id t22csp143097lqo; Thu, 16 May 2024 01:41:31 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCVpMtl265upVlgVmclnvpvtjJRcYiIjawJK103DnvIkOiG0HlCPxdVL1yICQNP0BIsBK12QfxOHjb106jOATPTgqF3Auy8L9kdr/GKWvQ== X-Google-Smtp-Source: AGHT+IFStVYi54FiChTqMV9/lFoVU8m/dLfcbkAWssFYSajzY6y94PFctuuJqmqmxaZ5smWUbTuk X-Received: by 2002:a17:906:694a:b0:a59:e3b8:ab7b with SMTP id a640c23a62f3a-a5a2d6796bcmr1021735766b.74.1715848891115; Thu, 16 May 2024 01:41:31 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1715848891; cv=pass; d=google.com; s=arc-20160816; b=ArmvNx+4xNQukBNPuz0w6YaS1Wa65xF5YVEOxpQlgB7vqQ0nb763Kaw6R3Zwy9Ig2j Oepp7DN0jL5qQ1mWlxMNcp5V8XpapOHiQ6SYRKGTnx5Lzqrm36SDorQ9vOHtpzTOkIOI My1KTNw3Yy4Qerk/RL7/Jk1x70MPdY2Z2Nnh3GyPxfEd94Q3dww2aoUF445GXZaYhiYe FmU9ndZtr6u7S7nYh9jfw8wg0+nhSyOuuESESDZkk2qm40onjGm/UO3ckoRmuxqhZTr3 mKNgiCe73ddyYh3PvTyh2zaHtEBs8Cdk+93Nns5gDQQtJwcL54Q7hohdR3GvoydLAGXK PAgQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:message-id:date:subject:cc:to :from:dkim-signature; bh=s3UZZ6OPgy/KZsM6IuWuWPSwHkrsDm2cMGZRDvWKcLk=; fh=VPCJk6DEEJDattFCzu41j/po5hyUBB9pXnDLhwN0k9M=; b=X+mOlbv9MYhhHJ1pS/hsmpn7yord2sr6nmlmvE32h4pJrvvF/qyyZkTmxylgJ45Cdk YOSkBKLEgPMfkM7y/xqH/yDxjcw/NcHOaHOBbMD85NLuLnqWbnEj2PPdSntF5qDdoDYp 4WjdpKCkwShzvPS7/Ar9sjB1pZ2X3ywHgfd+haC7OrxbskYdc0zFV8GTs2SFdBT1qiYT DVLxR9LSPsHPeYVcAIAl5r+wwyHL+dOH0Jzj+wS+rLTt38XmDZUfVFRikYOXj7dP+P8s KcCRunV/LruZTA2BK8cCTLjgzMm+ZkMB5+aathH6PmXzTKnPg1BSChQqYJk/ypB6R+2D XNwg==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@collabora.com header.s=mail header.b="l8ZZo/Z4"; arc=pass (i=1 spf=pass spfdomain=collabora.com dkim=pass dkdomain=collabora.com dmarc=pass fromdomain=collabora.com); spf=pass (google.com: domain of linux-kernel+bounces-180786-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-180786-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=collabora.com Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id a640c23a62f3a-a5a52d12d25si585349066b.378.2024.05.16.01.41.31 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 16 May 2024 01:41:31 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-180786-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; dkim=pass header.i=@collabora.com header.s=mail header.b="l8ZZo/Z4"; arc=pass (i=1 spf=pass spfdomain=collabora.com dkim=pass dkdomain=collabora.com dmarc=pass fromdomain=collabora.com); spf=pass (google.com: domain of linux-kernel+bounces-180786-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-180786-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=collabora.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 9CD9F1F228B8 for ; Thu, 16 May 2024 08:41:30 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 95B8B14291D; Thu, 16 May 2024 08:41:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b="l8ZZo/Z4" Received: from madrid.collaboradmins.com (madrid.collaboradmins.com [46.235.227.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 11E681411D8; Thu, 16 May 2024 08:41:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=46.235.227.194 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715848878; cv=none; b=DTJJg/42OCp0YHTpsM9HuMoI/RFVeLtSIC5EdQst4GVydrJ1SNVCs1WBEoylsgYZgFK8kscin8cpc4LCCeqzvHK6gDAgSXXiTLyVB+445BGv2ojYTdNzYldp0XB6zfwYvxNtzeK0HPpXStAHHyQc+zoV+yclxYDQi6nXK2GGzYY= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715848878; c=relaxed/simple; bh=4KLkMCbZUKnHQ9N54Zv0Cup9+GEhrxn6Rra6jaeTtrE=; h=From:To:Cc:Subject:Date:Message-Id:MIME-Version; b=NgBwjaL6WEOxdYNaBb8tyOp9WCiBmW0tiKDLmMTN2/e7DNYJrSEcbP7hiiAJOV+H0nS/qsmnJAXdrIPAMYVv9jRl/sqqqJm3XJZc3WYPE7SrLw7regZCDEBinAQUQhDejKRoJ1jGyhr4/SsA8YVs0rRN2NMEv6gBQCe8ApuX7TM= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b=l8ZZo/Z4; arc=none smtp.client-ip=46.235.227.194 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1715848874; bh=4KLkMCbZUKnHQ9N54Zv0Cup9+GEhrxn6Rra6jaeTtrE=; h=From:To:Cc:Subject:Date:From; b=l8ZZo/Z49ch8G9srlc0FLz/sJXV1RZiKw5sq+1dYXYmU7D8DiBCD0JNLjOc5UsvZu dYYd8lWE9W68K+3bHk7iCd/DSQP7zxGWFHj3lIUfLtJ+og5/WmSqnCc8Sbds3rthwR cFh804GeelUBJ+CnaYcAqUwEyRzPU7Sp7SZNMoftXywxRvvlQdSK3PbvDQD/5k94+Z Qw7kje0aqZ4Lpa4nrzqrB1+XNeZ2PUVBgNo0baSWRxakQ59e0vzxAxweRvkh70IQ2j AHzzRPqQXw3dHgGVckP8DtveZfSrxzby+uwWg2NlExrCvTCAHog+7VzMFDZIyJErGl DLHdtbVO8fKQQ== Received: from benjamin-XPS-13-9310.. (cola.collaboradmins.com [195.201.22.229]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: benjamin.gaignard) by madrid.collaboradmins.com (Postfix) with ESMTPSA id 9FD90378218B; Thu, 16 May 2024 08:41:13 +0000 (UTC) From: Benjamin Gaignard To: ezequiel@vanguardiasur.com.ar, p.zabel@pengutronix.de, mchehab@kernel.org, nicolas.dufresne@collabora.com Cc: linux-media@vger.kernel.org, linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org, kernel@collabora.com, Benjamin Gaignard Subject: [PATCH v2] media: verisilicon: Add reference buffers compression Date: Thu, 16 May 2024 10:41:07 +0200 Message-Id: <20240516084107.37083-1-benjamin.gaignard@collabora.com> X-Mailer: git-send-email 2.40.1 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Reference frame compression is a feature added in G2 decoder to compress frame buffers so that the bandwidth of storing/loading reference frames can be reduced, especially when the resolution of decoded stream is of high definition. The impact of compressed frames is confirmed when using perf to monitor the number of memory accesses with or without compression feature. The following command perf stat -a -e imx8_ddr0/cycles/,imx8_ddr0/read-cycles/,imx8_ddr0/write-cycles/ gst-launch-1.0 filesrc location=Jockey_3840x2160_120fps_420_8bit_HEVC_RAW.hevc ! queue ! h265parse ! v4l2slh265dec ! video/x-raw,format=NV12 ! fakesink give us these results without compression feature: Performance counter stats for 'system wide': 1711300345 imx8_ddr0/cycles/ 892207924 imx8_ddr0/read-cycles/ 1291785864 imx8_ddr0/write-cycles/ 13.760048353 seconds time elapsed with compression feature: Performance counter stats for 'system wide': 274526799 imx8_ddr0/cycles/ 453120194 imx8_ddr0/read-cycles/ 833391434 imx8_ddr0/write-cycles/ 18.257831534 seconds time elapsed As expected the number of read/write cycles are really lower when compression is used. Since storing compression data requires more memory a module parameter named 'hevc_use_compression' is used to enable/disable this feature and, by default, compression isn't used. Enabling compression feature means that decoder output frames are stored with a specific compression pixel format. Since this pixel format is unknown, this patch restrain compression feature usage to the cases where post-processor pixels formats (NV12 or NV15) are selected by the applications. Fluster compliance HEVC test suite score is still 141/147 after this patch. Signed-off-by: Benjamin Gaignard Tested-by: Nicolas Dufresne --- version 2: - Make 'bool hevc_use_compression' static - Add Kconfig to enable/disable the feature - Fix test command in commit message. drivers/media/platform/verisilicon/Kconfig | 8 ++++ .../media/platform/verisilicon/hantro_g2.c | 35 +++++++++++++++++ .../platform/verisilicon/hantro_g2_hevc_dec.c | 20 ++++++++-- .../platform/verisilicon/hantro_g2_regs.h | 4 ++ .../media/platform/verisilicon/hantro_hevc.c | 8 ++++ .../media/platform/verisilicon/hantro_hw.h | 39 +++++++++++++++++++ .../platform/verisilicon/hantro_postproc.c | 6 ++- 7 files changed, 116 insertions(+), 4 deletions(-) diff --git a/drivers/media/platform/verisilicon/Kconfig b/drivers/media/platform/verisilicon/Kconfig index 9a34d14c6e40..9adefb628d73 100644 --- a/drivers/media/platform/verisilicon/Kconfig +++ b/drivers/media/platform/verisilicon/Kconfig @@ -20,6 +20,14 @@ config VIDEO_HANTRO To compile this driver as a module, choose M here: the module will be called hantro-vpu. +config VIDEO_HANTRO_HEVC_RFC + bool "Use reference frame compression for HEVC" + depends on VIDEO_HANTRO + default n + help + Enable reference frame compression feature for HEVC codec. + It will use more memory but save bandwidth on memory bus. + config VIDEO_HANTRO_IMX8M bool "Hantro VPU i.MX8M support" depends on VIDEO_HANTRO diff --git a/drivers/media/platform/verisilicon/hantro_g2.c b/drivers/media/platform/verisilicon/hantro_g2.c index b880a6849d58..62ca91427360 100644 --- a/drivers/media/platform/verisilicon/hantro_g2.c +++ b/drivers/media/platform/verisilicon/hantro_g2.c @@ -9,6 +9,12 @@ #include "hantro_g2_regs.h" #define G2_ALIGN 16 +#define CBS_SIZE 16 /* compression table size in bytes */ +#define CBS_LUMA 8 /* luminance CBS is composed of 1 8x8 coded block */ +#define CBS_CHROMA_W (8 * 2) /* chrominance CBS is composed of two 8x4 coded + * blocks, with Cb CB first then Cr CB following + */ +#define CBS_CHROMA_H 4 void hantro_g2_check_idle(struct hantro_dev *vpu) { @@ -56,3 +62,32 @@ size_t hantro_g2_motion_vectors_offset(struct hantro_ctx *ctx) return ALIGN((cr_offset * 3) / 2, G2_ALIGN); } + +static size_t hantro_g2_mv_size(struct hantro_ctx *ctx) +{ + const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls; + const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps; + unsigned int pic_width_in_ctbs, pic_height_in_ctbs; + unsigned int max_log2_ctb_size; + + max_log2_ctb_size = sps->log2_min_luma_coding_block_size_minus3 + 3 + + sps->log2_diff_max_min_luma_coding_block_size; + pic_width_in_ctbs = (sps->pic_width_in_luma_samples + + (1 << max_log2_ctb_size) - 1) >> max_log2_ctb_size; + pic_height_in_ctbs = (sps->pic_height_in_luma_samples + (1 << max_log2_ctb_size) - 1) + >> max_log2_ctb_size; + + return pic_width_in_ctbs * pic_height_in_ctbs * (1 << (2 * (max_log2_ctb_size - 4))) * 16; +} + +size_t hantro_g2_luma_compress_offset(struct hantro_ctx *ctx) +{ + return hantro_g2_motion_vectors_offset(ctx) + + hantro_g2_mv_size(ctx); +} + +size_t hantro_g2_chroma_compress_offset(struct hantro_ctx *ctx) +{ + return hantro_g2_luma_compress_offset(ctx) + + hantro_hevc_luma_compressed_size(ctx->dst_fmt.width, ctx->dst_fmt.height); +} diff --git a/drivers/media/platform/verisilicon/hantro_g2_hevc_dec.c b/drivers/media/platform/verisilicon/hantro_g2_hevc_dec.c index d3f8c33eb16c..85a44143b378 100644 --- a/drivers/media/platform/verisilicon/hantro_g2_hevc_dec.c +++ b/drivers/media/platform/verisilicon/hantro_g2_hevc_dec.c @@ -367,11 +367,14 @@ static int set_ref(struct hantro_ctx *ctx) const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params; const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb; dma_addr_t luma_addr, chroma_addr, mv_addr = 0; + dma_addr_t compress_luma_addr, compress_chroma_addr = 0; struct hantro_dev *vpu = ctx->dev; struct vb2_v4l2_buffer *vb2_dst; struct hantro_decoded_buffer *dst; size_t cr_offset = hantro_g2_chroma_offset(ctx); size_t mv_offset = hantro_g2_motion_vectors_offset(ctx); + size_t compress_luma_offset = hantro_g2_luma_compress_offset(ctx); + size_t compress_chroma_offset = hantro_g2_chroma_compress_offset(ctx); u32 max_ref_frames; u16 dpb_longterm_e; static const struct hantro_reg cur_poc[] = { @@ -445,6 +448,8 @@ static int set_ref(struct hantro_ctx *ctx) chroma_addr = luma_addr + cr_offset; mv_addr = luma_addr + mv_offset; + compress_luma_addr = luma_addr + compress_luma_offset; + compress_chroma_addr = luma_addr + compress_chroma_offset; if (dpb[i].flags & V4L2_HEVC_DPB_ENTRY_LONG_TERM_REFERENCE) dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i); @@ -452,6 +457,8 @@ static int set_ref(struct hantro_ctx *ctx) hantro_write_addr(vpu, G2_REF_LUMA_ADDR(i), luma_addr); hantro_write_addr(vpu, G2_REF_CHROMA_ADDR(i), chroma_addr); hantro_write_addr(vpu, G2_REF_MV_ADDR(i), mv_addr); + hantro_write_addr(vpu, G2_REF_COMP_LUMA_ADDR(i), compress_luma_addr); + hantro_write_addr(vpu, G2_REF_COMP_CHROMA_ADDR(i), compress_chroma_addr); } vb2_dst = hantro_get_dst_buf(ctx); @@ -465,19 +472,27 @@ static int set_ref(struct hantro_ctx *ctx) chroma_addr = luma_addr + cr_offset; mv_addr = luma_addr + mv_offset; + compress_luma_addr = luma_addr + compress_luma_offset; + compress_chroma_addr = luma_addr + compress_chroma_offset; hantro_write_addr(vpu, G2_REF_LUMA_ADDR(i), luma_addr); hantro_write_addr(vpu, G2_REF_CHROMA_ADDR(i), chroma_addr); - hantro_write_addr(vpu, G2_REF_MV_ADDR(i++), mv_addr); + hantro_write_addr(vpu, G2_REF_MV_ADDR(i), mv_addr); + hantro_write_addr(vpu, G2_REF_COMP_LUMA_ADDR(i), compress_luma_addr); + hantro_write_addr(vpu, G2_REF_COMP_CHROMA_ADDR(i++), compress_chroma_addr); hantro_write_addr(vpu, G2_OUT_LUMA_ADDR, luma_addr); hantro_write_addr(vpu, G2_OUT_CHROMA_ADDR, chroma_addr); hantro_write_addr(vpu, G2_OUT_MV_ADDR, mv_addr); + hantro_write_addr(vpu, G2_OUT_COMP_LUMA_ADDR, compress_luma_addr); + hantro_write_addr(vpu, G2_OUT_COMP_CHROMA_ADDR, compress_chroma_addr); for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) { hantro_write_addr(vpu, G2_REF_LUMA_ADDR(i), 0); hantro_write_addr(vpu, G2_REF_CHROMA_ADDR(i), 0); hantro_write_addr(vpu, G2_REF_MV_ADDR(i), 0); + hantro_write_addr(vpu, G2_REF_COMP_LUMA_ADDR(i), 0); + hantro_write_addr(vpu, G2_REF_COMP_CHROMA_ADDR(i), 0); } hantro_reg_write(vpu, &g2_refer_lterm_e, dpb_longterm_e); @@ -594,8 +609,7 @@ int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx) /* Don't disable output */ hantro_reg_write(vpu, &g2_out_dis, 0); - /* Don't compress buffers */ - hantro_reg_write(vpu, &g2_ref_compress_bypass, 1); + hantro_reg_write(vpu, &g2_ref_compress_bypass, !ctx->hevc_dec.use_compression); /* Bus width and max burst */ hantro_reg_write(vpu, &g2_buswidth, BUS_WIDTH_128); diff --git a/drivers/media/platform/verisilicon/hantro_g2_regs.h b/drivers/media/platform/verisilicon/hantro_g2_regs.h index 82606783591a..b943b1816db7 100644 --- a/drivers/media/platform/verisilicon/hantro_g2_regs.h +++ b/drivers/media/platform/verisilicon/hantro_g2_regs.h @@ -318,6 +318,10 @@ #define G2_TILE_BSD_ADDR (G2_SWREG(183)) #define G2_DS_DST (G2_SWREG(186)) #define G2_DS_DST_CHR (G2_SWREG(188)) +#define G2_OUT_COMP_LUMA_ADDR (G2_SWREG(190)) +#define G2_REF_COMP_LUMA_ADDR(i) (G2_SWREG(192) + ((i) * 0x8)) +#define G2_OUT_COMP_CHROMA_ADDR (G2_SWREG(224)) +#define G2_REF_COMP_CHROMA_ADDR(i) (G2_SWREG(226) + ((i) * 0x8)) #define g2_strm_buffer_len G2_DEC_REG(258, 0, 0xffffffff) #define g2_strm_start_offset G2_DEC_REG(259, 0, 0xffffffff) diff --git a/drivers/media/platform/verisilicon/hantro_hevc.c b/drivers/media/platform/verisilicon/hantro_hevc.c index 2c14330bc562..83cd12b0ddd6 100644 --- a/drivers/media/platform/verisilicon/hantro_hevc.c +++ b/drivers/media/platform/verisilicon/hantro_hevc.c @@ -25,6 +25,11 @@ #define MAX_TILE_COLS 20 #define MAX_TILE_ROWS 22 +static bool hevc_use_compression = IS_ENABLED(CONFIG_VIDEO_HANTRO_HEVC_RFC); +module_param_named(hevc_use_compression, hevc_use_compression, bool, 0644); +MODULE_PARM_DESC(hevc_use_compression, + "Use reference frame compression for HEVC"); + void hantro_hevc_ref_init(struct hantro_ctx *ctx) { struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec; @@ -275,5 +280,8 @@ int hantro_hevc_dec_init(struct hantro_ctx *ctx) hantro_hevc_ref_init(ctx); + hevc_dec->use_compression = + hevc_use_compression & hantro_needs_postproc(ctx, ctx->vpu_dst_fmt); + return 0; } diff --git a/drivers/media/platform/verisilicon/hantro_hw.h b/drivers/media/platform/verisilicon/hantro_hw.h index 7737320cc8cc..43d4ff637376 100644 --- a/drivers/media/platform/verisilicon/hantro_hw.h +++ b/drivers/media/platform/verisilicon/hantro_hw.h @@ -42,6 +42,14 @@ #define MAX_POSTPROC_BUFFERS 64 +#define G2_ALIGN 16 +#define CBS_SIZE 16 /* compression table size in bytes */ +#define CBS_LUMA 8 /* luminance CBS is composed of 1 8x8 coded block */ +#define CBS_CHROMA_W (8 * 2) /* chrominance CBS is composed of two 8x4 coded + * blocks, with Cb CB first then Cr CB following + */ +#define CBS_CHROMA_H 4 + struct hantro_dev; struct hantro_ctx; struct hantro_buf; @@ -144,6 +152,7 @@ struct hantro_hevc_dec_ctrls { * @ref_bufs_used: Bitfield of used reference buffers * @ctrls: V4L2 controls attached to a run * @num_tile_cols_allocated: number of allocated tiles + * @use_compression: use reference buffer compression */ struct hantro_hevc_dec_hw_ctx { struct hantro_aux_buf tile_sizes; @@ -156,6 +165,7 @@ struct hantro_hevc_dec_hw_ctx { u32 ref_bufs_used; struct hantro_hevc_dec_ctrls ctrls; unsigned int num_tile_cols_allocated; + bool use_compression; }; /** @@ -510,6 +520,33 @@ hantro_hevc_mv_size(unsigned int width, unsigned int height) return width * height / 16; } +static inline size_t +hantro_hevc_luma_compressed_size(unsigned int width, unsigned int height) +{ + u32 pic_width_in_cbsy = + round_up((width + CBS_LUMA - 1) / CBS_LUMA, CBS_SIZE); + u32 pic_height_in_cbsy = (height + CBS_LUMA - 1) / CBS_LUMA; + + return round_up(pic_width_in_cbsy * pic_height_in_cbsy, CBS_SIZE); +} + +static inline size_t +hantro_hevc_chroma_compressed_size(unsigned int width, unsigned int height) +{ + u32 pic_width_in_cbsc = + round_up((width + CBS_CHROMA_W - 1) / CBS_CHROMA_W, CBS_SIZE); + u32 pic_height_in_cbsc = (height / 2 + CBS_CHROMA_H - 1) / CBS_CHROMA_H; + + return round_up(pic_width_in_cbsc * pic_height_in_cbsc, CBS_SIZE); +} + +static inline size_t +hantro_hevc_compressed_size(unsigned int width, unsigned int height) +{ + return hantro_hevc_luma_compressed_size(width, height) + + hantro_hevc_chroma_compressed_size(width, height); +} + static inline unsigned short hantro_av1_num_sbs(unsigned short dimension) { return DIV_ROUND_UP(dimension, 64); @@ -525,6 +562,8 @@ hantro_av1_mv_size(unsigned int width, unsigned int height) size_t hantro_g2_chroma_offset(struct hantro_ctx *ctx); size_t hantro_g2_motion_vectors_offset(struct hantro_ctx *ctx); +size_t hantro_g2_luma_compress_offset(struct hantro_ctx *ctx); +size_t hantro_g2_chroma_compress_offset(struct hantro_ctx *ctx); int hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx); int rockchip_vpu2_mpeg2_dec_run(struct hantro_ctx *ctx); diff --git a/drivers/media/platform/verisilicon/hantro_postproc.c b/drivers/media/platform/verisilicon/hantro_postproc.c index 41e93176300b..232c93eea7ee 100644 --- a/drivers/media/platform/verisilicon/hantro_postproc.c +++ b/drivers/media/platform/verisilicon/hantro_postproc.c @@ -213,9 +213,13 @@ static unsigned int hantro_postproc_buffer_size(struct hantro_ctx *ctx) else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_VP9_FRAME) buf_size += hantro_vp9_mv_size(pix_mp.width, pix_mp.height); - else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_HEVC_SLICE) + else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_HEVC_SLICE) { buf_size += hantro_hevc_mv_size(pix_mp.width, pix_mp.height); + if (ctx->hevc_dec.use_compression) + buf_size += hantro_hevc_compressed_size(pix_mp.width, + pix_mp.height); + } else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_AV1_FRAME) buf_size += hantro_av1_mv_size(pix_mp.width, pix_mp.height); -- 2.40.1