LinuxLists.cc - [PATCH v8 0/8] add DSC 1.2 dpu supports

[permalink] [raw]

Subject: [PATCH v8 4/8] drm/msm/dpu: Introduce PINGPONG_NONE to disconnect DSC from PINGPONG

Disabling the crossbar mux between DSC and PINGPONG currently
requires a bogus enum dpu_pingpong value to be passed when calling
dsc_bind_pingpong_blk() with enable=false, even though the register
value written is independent of the current PINGPONG block. Replace
that `bool enable` parameter with a new PINGPONG_NONE dpu_pingpong
flag that triggers the write of the "special" 0xF "crossbar
disabled" value to the register instead.

Changes in v4:
-- more details to commit text

Changes in v5:
-- rewording commit text suggested by Marijn
-- add DRM_DEBUG_KMS for DSC unbinding case

Changes in v8:
-- fix checkpatch warning

Signed-off-by: Kuogee Hsieh <[email protected]>
Reviewed-by: Dmitry Baryshkov <[email protected]>
Reviewed-by: Marijn Suijten <[email protected]>
---
drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 2 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c | 14 +++++++-------
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h | 1 -
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h | 3 ++-
4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index cf1de5d..ffa6f04 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -1850,7 +1850,7 @@ static void dpu_encoder_dsc_pipe_cfg(struct dpu_hw_dsc *hw_dsc,
hw_pp->ops.setup_dsc(hw_pp);

if (hw_dsc->ops.dsc_bind_pingpong_blk)
- hw_dsc->ops.dsc_bind_pingpong_blk(hw_dsc, true, hw_pp->idx);
+ hw_dsc->ops.dsc_bind_pingpong_blk(hw_dsc, hw_pp->idx);

if (hw_pp->ops.enable_dsc)
hw_pp->ops.enable_dsc(hw_pp);
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c
index 4a6bbcc..f134fb0 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c
@@ -157,7 +157,6 @@ static void dpu_hw_dsc_config_thresh(struct dpu_hw_dsc *hw_dsc,

static void dpu_hw_dsc_bind_pingpong_blk(
struct dpu_hw_dsc *hw_dsc,
- bool enable,
const enum dpu_pingpong pp)
{
struct dpu_hw_blk_reg_map *c = &hw_dsc->hw;
@@ -166,14 +165,15 @@ static void dpu_hw_dsc_bind_pingpong_blk(

dsc_ctl_offset = DSC_CTL(hw_dsc->idx);

- if (enable)
+ if (pp)
mux_cfg = (pp - PINGPONG_0) & 0x7;

- DRM_DEBUG_KMS("%s dsc:%d %s pp:%d\n",
- enable ? "Binding" : "Unbinding",
- hw_dsc->idx - DSC_0,
- enable ? "to" : "from",
- pp - PINGPONG_0);
+ if (pp)
+ DRM_DEBUG_KMS("Binding dsc:%d to pp:%d\n",
+ hw_dsc->idx - DSC_0, pp - PINGPONG_0);
+ else
+ DRM_DEBUG_KMS("Unbinding dsc:%d from any pp\n",
+ hw_dsc->idx - DSC_0);

DPU_REG_WRITE(c, dsc_ctl_offset, mux_cfg);
}
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
index 287ec5f..138080a 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
@@ -44,7 +44,6 @@ struct dpu_hw_dsc_ops {
struct drm_dsc_config *dsc);

void (*dsc_bind_pingpong_blk)(struct dpu_hw_dsc *hw_dsc,
- bool enable,
enum dpu_pingpong pp);
};

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
index 1913a19..02a0f48 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
@@ -191,7 +191,8 @@ enum dpu_dsc {
};

enum dpu_pingpong {
- PINGPONG_0 = 1,
+ PINGPONG_NONE,
+ PINGPONG_0,
PINGPONG_1,
PINGPONG_2,
PINGPONG_3,
--
2.7.4

2023-05-12 18:02:44

[permalink] [raw]

Subject: [PATCH v8 5/8] drm/msm/dpu: add support for DSC encoder v1.2 engine

Add support for DSC 1.2 by providing the necessary hooks to program
the DPU DSC 1.2 encoder.

Changes in v3:
-- fixed kernel test rebot report that "__iomem *off" is declared but not
used at dpu_hw_dsc_config_1_2()
-- unrolling thresh loops

Changes in v4:
-- delete DPU_DSC_HW_REV_1_1
-- delete off and used real register name directly

Changes in v7:
-- replace offset with sblk->enc.base
-- replace ss with slice

Changes in v8:
-- fixed checkpatch warning

Signed-off-by: Kuogee Hsieh <[email protected]>
---
drivers/gpu/drm/msm/Makefile | 1 +
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 32 ++-
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h | 14 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c | 382 +++++++++++++++++++++++++
drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 7 +-
5 files changed, 432 insertions(+), 4 deletions(-)
create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index b814fc8..b9af5e4 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -65,6 +65,7 @@ msm-$(CONFIG_DRM_MSM_DPU) += \
disp/dpu1/dpu_hw_catalog.o \
disp/dpu1/dpu_hw_ctl.o \
disp/dpu1/dpu_hw_dsc.o \
+ disp/dpu1/dpu_hw_dsc_1_2.o \
disp/dpu1/dpu_hw_interrupts.o \
disp/dpu1/dpu_hw_intf.o \
disp/dpu1/dpu_hw_lm.o \
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
index dc0a4da..4eda2cc 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
@@ -1,6 +1,6 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
- * Copyright (c) 2022. Qualcomm Innovation Center, Inc. All rights reserved.
+ * Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
* Copyright (c) 2015-2018, 2020 The Linux Foundation. All rights reserved.
*/

@@ -244,12 +244,18 @@ enum {
};

/**
- * DSC features
+ * DSC sub-blocks/features
* @DPU_DSC_OUTPUT_CTRL Configure which PINGPONG block gets
* the pixel output from this DSC.
+ * @DPU_DSC_HW_REV_1_2 DSC block supports dsc 1.1 and 1.2
+ * @DPU_DSC_NATIVE_422_EN Supports native422 and native420 encoding
+ * @DPU_DSC_MAX
*/
enum {
DPU_DSC_OUTPUT_CTRL = 0x1,
+ DPU_DSC_HW_REV_1_2,
+ DPU_DSC_NATIVE_422_EN,
+ DPU_DSC_MAX
};

/**
@@ -306,6 +312,14 @@ struct dpu_pp_blk {
};

/**
+ * struct dpu_dsc_blk - DSC Encoder sub-blk information
+ * @info: HW register and features supported by this sub-blk
+ */
+struct dpu_dsc_blk {
+ DPU_HW_SUBBLK_INFO;
+};
+
+/**
* enum dpu_qos_lut_usage - define QoS LUT use cases
*/
enum dpu_qos_lut_usage {
@@ -452,6 +466,17 @@ struct dpu_pingpong_sub_blks {
};

/**
+ * struct dpu_dsc_sub_blks - DSC sub-blks
+ * @enc: DSC encoder sub block
+ * @ctl: DSC controller sub block
+ *
+ */
+struct dpu_dsc_sub_blks {
+ struct dpu_dsc_blk enc;
+ struct dpu_dsc_blk ctl;
+};
+
+/**
* dpu_clk_ctrl_type - Defines top level clock control signals
*/
enum dpu_clk_ctrl_type {
@@ -605,10 +630,13 @@ struct dpu_merge_3d_cfg {
* struct dpu_dsc_cfg - information of DSC blocks
* @id enum identifying this block
* @base register offset of this block
+ * @len: length of hardware block
* @features bit mask identifying sub-blocks/features
+ * @sblk sub-blocks information
*/
struct dpu_dsc_cfg {
DPU_HW_BLK_INFO;
+ const struct dpu_dsc_sub_blks *sblk;
};

/**
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
index 138080a..44fd624 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
@@ -1,5 +1,8 @@
/* SPDX-License-Identifier: GPL-2.0-only */
-/* Copyright (c) 2020-2022, Linaro Limited */
+/*
+ * Copyright (c) 2020-2022, Linaro Limited
+ * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved
+ */

#ifndef _DPU_HW_DSC_H
#define _DPU_HW_DSC_H
@@ -69,6 +72,15 @@ struct dpu_hw_dsc *dpu_hw_dsc_init(const struct dpu_dsc_cfg *cfg,
void __iomem *addr);

/**
+ * dpu_hw_dsc_init_1_2 - initializes the v1.2 DSC hw driver block
+ * @cfg: DSC catalog entry for which driver object is required
+ * @addr: Mapped register io address of MDP
+ * Returns: Error code or allocated dpu_hw_dsc context
+ */
+struct dpu_hw_dsc *dpu_hw_dsc_init_1_2(const struct dpu_dsc_cfg *cfg,
+ void __iomem *addr);
+
+/**
* dpu_hw_dsc_destroy - destroys dsc driver context
* @dsc: Pointer to dsc driver context returned by dpu_hw_dsc_init
*/
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
new file mode 100644
index 00000000..5bd84bd
--- /dev/null
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
@@ -0,0 +1,382 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2020-2021, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved
+ */
+
+#include <drm/display/drm_dsc_helper.h>
+
+#include "dpu_kms.h"
+#include "dpu_hw_catalog.h"
+#include "dpu_hwio.h"
+#include "dpu_hw_mdss.h"
+#include "dpu_hw_dsc.h"
+
+#define DSC_CMN_MAIN_CNF 0x00
+
+/* DPU_DSC_ENC register offsets */
+#define ENC_DF_CTRL 0x00
+#define ENC_GENERAL_STATUS 0x04
+#define ENC_HSLICE_STATUS 0x08
+#define ENC_OUT_STATUS 0x0C
+#define ENC_INT_STAT 0x10
+#define ENC_INT_CLR 0x14
+#define ENC_INT_MASK 0x18
+#define DSC_MAIN_CONF 0x30
+#define DSC_PICTURE_SIZE 0x34
+#define DSC_SLICE_SIZE 0x38
+#define DSC_MISC_SIZE 0x3C
+#define DSC_HRD_DELAYS 0x40
+#define DSC_RC_SCALE 0x44
+#define DSC_RC_SCALE_INC_DEC 0x48
+#define DSC_RC_OFFSETS_1 0x4C
+#define DSC_RC_OFFSETS_2 0x50
+#define DSC_RC_OFFSETS_3 0x54
+#define DSC_RC_OFFSETS_4 0x58
+#define DSC_FLATNESS_QP 0x5C
+#define DSC_RC_MODEL_SIZE 0x60
+#define DSC_RC_CONFIG 0x64
+#define DSC_RC_BUF_THRESH_0 0x68
+#define DSC_RC_BUF_THRESH_1 0x6C
+#define DSC_RC_BUF_THRESH_2 0x70
+#define DSC_RC_BUF_THRESH_3 0x74
+#define DSC_RC_MIN_QP_0 0x78
+#define DSC_RC_MIN_QP_1 0x7C
+#define DSC_RC_MIN_QP_2 0x80
+#define DSC_RC_MAX_QP_0 0x84
+#define DSC_RC_MAX_QP_1 0x88
+#define DSC_RC_MAX_QP_2 0x8C
+#define DSC_RC_RANGE_BPG_OFFSETS_0 0x90
+#define DSC_RC_RANGE_BPG_OFFSETS_1 0x94
+#define DSC_RC_RANGE_BPG_OFFSETS_2 0x98
+
+/* DPU_DSC_CTL register offsets */
+#define DSC_CTL 0x00
+#define DSC_CFG 0x04
+#define DSC_DATA_IN_SWAP 0x08
+#define DSC_CLK_CTRL 0x0C
+
+static inline int _dsc_calc_ob_max_addr(struct dpu_hw_dsc *hw_dsc, int num_ss)
+{
+ int max_addr = 2400 / num_ss;
+
+ if (hw_dsc->caps->features & BIT(DPU_DSC_NATIVE_422_EN))
+ max_addr /= 2;
+
+ return max_addr - 1;
+};
+
+static inline void dpu_hw_dsc_disable_1_2(struct dpu_hw_dsc *hw_dsc)
+{
+ struct dpu_hw_blk_reg_map *hw;
+ const struct dpu_dsc_sub_blks *sblk;
+
+ if (!hw_dsc)
+ return;
+
+ hw = &hw_dsc->hw;
+ sblk = hw_dsc->caps->sblk;
+ DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CFG, 0);
+
+ DPU_REG_WRITE(hw, sblk->enc.base + ENC_DF_CTRL, 0);
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_MAIN_CONF, 0);
+}
+
+static inline void dpu_hw_dsc_config_1_2(struct dpu_hw_dsc *hw_dsc,
+ struct drm_dsc_config *dsc,
+ u32 mode,
+ u32 initial_lines)
+{
+ struct dpu_hw_blk_reg_map *hw;
+ const struct dpu_dsc_sub_blks *sblk;
+ u32 data = 0;
+ u32 det_thresh_flatness;
+ u32 num_active_slice_per_enc;
+ u32 bpp;
+
+ if (!hw_dsc || !dsc)
+ return;
+
+ hw = &hw_dsc->hw;
+
+ sblk = hw_dsc->caps->sblk;
+
+ if (mode & DSC_MODE_SPLIT_PANEL)
+ data |= BIT(0);
+
+ if (mode & DSC_MODE_MULTIPLEX)
+ data |= BIT(1);
+
+ num_active_slice_per_enc = dsc->slice_count;
+ if (mode & DSC_MODE_MULTIPLEX)
+ num_active_slice_per_enc = dsc->slice_count >> 1;
+
+ data |= (num_active_slice_per_enc & 0x3) << 7;
+
+ DPU_REG_WRITE(hw, DSC_CMN_MAIN_CNF, data);
+
+ data = (initial_lines & 0xff);
+
+ if (mode & DSC_MODE_VIDEO)
+ data |= BIT(9);
+
+ data |= (_dsc_calc_ob_max_addr(hw_dsc, num_active_slice_per_enc) << 18);
+
+ DPU_REG_WRITE(hw, sblk->enc.base + ENC_DF_CTRL, data);
+
+ data = (dsc->dsc_version_minor & 0xf) << 28;
+ if (dsc->dsc_version_minor == 0x2) {
+ if (dsc->native_422)
+ data |= BIT(22);
+ if (dsc->native_420)
+ data |= BIT(21);
+ }
+
+ bpp = dsc->bits_per_pixel;
+ /* as per hw requirement bpp should be programmed
+ * twice the actual value in case of 420 or 422 encoding
+ */
+ if (dsc->native_422 || dsc->native_420)
+ bpp = 2 * bpp;
+ data |= (dsc->block_pred_enable ? 1 : 0) << 20;
+ data |= bpp << 10;
+ data |= (dsc->line_buf_depth & 0xf) << 6;
+ data |= dsc->convert_rgb << 4;
+ data |= dsc->bits_per_component & 0xf;
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_MAIN_CONF, data);
+
+ data = (dsc->pic_width & 0xffff) |
+ ((dsc->pic_height & 0xffff) << 16);
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_PICTURE_SIZE, data);
+
+ data = (dsc->slice_width & 0xffff) |
+ ((dsc->slice_height & 0xffff) << 16);
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_SLICE_SIZE, data);
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_MISC_SIZE,
+ (dsc->slice_chunk_size) & 0xffff);
+
+ data = (dsc->initial_xmit_delay & 0xffff) |
+ ((dsc->initial_dec_delay & 0xffff) << 16);
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_HRD_DELAYS, data);
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_SCALE,
+ dsc->initial_scale_value & 0x3f);
+
+ data = (dsc->scale_increment_interval & 0xffff) |
+ ((dsc->scale_decrement_interval & 0x7ff) << 16);
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_SCALE_INC_DEC, data);
+
+ data = (dsc->first_line_bpg_offset & 0x1f) |
+ ((dsc->second_line_bpg_offset & 0x1f) << 5);
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_1, data);
+
+ data = (dsc->nfl_bpg_offset & 0xffff) |
+ ((dsc->slice_bpg_offset & 0xffff) << 16);
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_2, data);
+
+ data = (dsc->initial_offset & 0xffff) |
+ ((dsc->final_offset & 0xffff) << 16);
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_3, data);
+
+ data = (dsc->nsl_bpg_offset & 0xffff) |
+ ((dsc->second_line_offset_adj & 0xffff) << 16);
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_4, data);
+
+ data = (dsc->flatness_min_qp & 0x1f);
+ data |= (dsc->flatness_max_qp & 0x1f) << 5;
+
+ det_thresh_flatness = drm_dsc_calculate_flatness_det_thresh(dsc);
+ data |= (det_thresh_flatness & 0xff) << 10;
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_FLATNESS_QP, data);
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MODEL_SIZE,
+ (dsc->rc_model_size) & 0xffff);
+
+ data = dsc->rc_edge_factor & 0xf;
+ data |= (dsc->rc_quant_incr_limit0 & 0x1f) << 8;
+ data |= (dsc->rc_quant_incr_limit1 & 0x1f) << 13;
+ data |= (dsc->rc_tgt_offset_high & 0xf) << 20;
+ data |= (dsc->rc_tgt_offset_low & 0xf) << 24;
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_CONFIG, data);
+
+ /* program the dsc wrapper */
+ data = BIT(0); /* encoder enable */
+ if (dsc->native_422)
+ data |= BIT(8);
+ else if (dsc->native_420)
+ data |= BIT(9);
+ if (!dsc->convert_rgb)
+ data |= BIT(10);
+ if (dsc->bits_per_component == 8)
+ data |= BIT(11);
+ if (mode & DSC_MODE_SPLIT_PANEL)
+ data |= BIT(12);
+ if (mode & DSC_MODE_MULTIPLEX)
+ data |= BIT(13);
+ if (!(mode & DSC_MODE_VIDEO))
+ data |= BIT(17);
+
+ DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CFG, data);
+}
+
+static inline void dpu_hw_dsc_config_thresh_1_2(struct dpu_hw_dsc *hw_dsc,
+ struct drm_dsc_config *dsc)
+{
+ struct dpu_hw_blk_reg_map *hw;
+ const struct dpu_dsc_sub_blks *sblk;
+ struct drm_dsc_rc_range_parameters *rc;
+
+ if (!hw_dsc || !dsc)
+ return;
+
+ hw = &hw_dsc->hw;
+
+ sblk = hw_dsc->caps->sblk;
+
+ rc = dsc->rc_range_params;
+
+ /*
+ * With BUF_THRESH -- 14 in total
+ * each register contains 4 thresh values with the last register
+ * containing only 2 thresh values
+ */
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_0,
+ (dsc->rc_buf_thresh[0] << 0) |
+ (dsc->rc_buf_thresh[1] << 8) |
+ (dsc->rc_buf_thresh[2] << 16) |
+ (dsc->rc_buf_thresh[3] << 24));
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_1,
+ (dsc->rc_buf_thresh[4] << 0) |
+ (dsc->rc_buf_thresh[5] << 8) |
+ (dsc->rc_buf_thresh[6] << 16) |
+ (dsc->rc_buf_thresh[7] << 24));
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_2,
+ (dsc->rc_buf_thresh[8] << 0) |
+ (dsc->rc_buf_thresh[9] << 8) |
+ (dsc->rc_buf_thresh[10] << 16) |
+ (dsc->rc_buf_thresh[11] << 24));
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_3,
+ (dsc->rc_buf_thresh[12] << 0) |
+ (dsc->rc_buf_thresh[13] << 8));
+
+ /*
+ * with min/max_QP -- 5 bits
+ * each register contains 5 min_qp or max_qp for total of 15
+ *
+ * With BPG_OFFSET -- 6 bits
+ * each register contains 5 BPG_offset for total of 15
+ */
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_0,
+ (rc[0].range_min_qp << 0) |
+ (rc[1].range_min_qp << 5) |
+ (rc[2].range_min_qp << 10) |
+ (rc[3].range_min_qp << 15) |
+ (rc[4].range_min_qp << 20));
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_0,
+ (rc[0].range_max_qp << 0) |
+ (rc[1].range_max_qp << 5) |
+ (rc[2].range_max_qp << 10) |
+ (rc[3].range_max_qp << 15) |
+ (rc[4].range_max_qp << 20));
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_0,
+ (rc[0].range_bpg_offset << 0) |
+ (rc[1].range_bpg_offset << 6) |
+ (rc[2].range_bpg_offset << 12) |
+ (rc[3].range_bpg_offset << 18) |
+ (rc[4].range_bpg_offset << 24));
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_1,
+ (rc[5].range_min_qp << 0) |
+ (rc[6].range_min_qp << 5) |
+ (rc[7].range_min_qp << 10) |
+ (rc[8].range_min_qp << 15) |
+ (rc[9].range_min_qp << 20));
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_1,
+ (rc[5].range_max_qp << 0) |
+ (rc[6].range_max_qp << 5) |
+ (rc[7].range_max_qp << 10) |
+ (rc[8].range_max_qp << 15) |
+ (rc[9].range_max_qp << 20));
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_1,
+ (rc[5].range_bpg_offset << 0) |
+ (rc[6].range_bpg_offset << 6) |
+ (rc[7].range_bpg_offset << 12) |
+ (rc[8].range_bpg_offset << 18) |
+ (rc[9].range_bpg_offset << 24));
+
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_2,
+ (rc[10].range_min_qp << 0) |
+ (rc[11].range_min_qp << 5) |
+ (rc[12].range_min_qp << 10) |
+ (rc[13].range_min_qp << 15) |
+ (rc[14].range_min_qp << 20));
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_2,
+ (rc[10].range_max_qp << 0) |
+ (rc[11].range_max_qp << 5) |
+ (rc[12].range_max_qp << 10) |
+ (rc[13].range_max_qp << 15) |
+ (rc[14].range_max_qp << 20));
+ DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_2,
+ (rc[10].range_bpg_offset << 0) |
+ (rc[11].range_bpg_offset << 6) |
+ (rc[12].range_bpg_offset << 12) |
+ (rc[13].range_bpg_offset << 18) |
+ (rc[14].range_bpg_offset << 24));
+}
+
+static inline void dpu_hw_dsc_bind_pingpong_blk_1_2(struct dpu_hw_dsc *hw_dsc,
+ const enum dpu_pingpong pp)
+{
+ struct dpu_hw_blk_reg_map *hw;
+ const struct dpu_dsc_sub_blks *sblk;
+ int mux_cfg = 0xf; /* Disabled */
+
+ hw = &hw_dsc->hw;
+
+ sblk = hw_dsc->caps->sblk;
+
+ if (pp)
+ mux_cfg = (pp - PINGPONG_0) & 0x7;
+
+ DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CTL, mux_cfg);
+}
+
+static void _setup_dcs_ops_1_2(struct dpu_hw_dsc_ops *ops,
+ const unsigned long features)
+{
+ ops->dsc_disable = dpu_hw_dsc_disable_1_2;
+ ops->dsc_config = dpu_hw_dsc_config_1_2;
+ ops->dsc_config_thresh = dpu_hw_dsc_config_thresh_1_2;
+ ops->dsc_bind_pingpong_blk = dpu_hw_dsc_bind_pingpong_blk_1_2;
+}
+
+struct dpu_hw_dsc *dpu_hw_dsc_init_1_2(const struct dpu_dsc_cfg *cfg,
+ void __iomem *addr)
+{
+ struct dpu_hw_dsc *c;
+
+ c = kzalloc(sizeof(*c), GFP_KERNEL);
+ if (!c)
+ return ERR_PTR(-ENOMEM);
+
+ c->hw.blk_addr = addr + cfg->base;
+ c->hw.log_mask = DPU_DBG_MASK_DSC;
+
+ c->idx = cfg->id;
+ c->caps = cfg;
+ _setup_dcs_ops_1_2(&c->ops, c->caps->features);
+
+ return c;
+}
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
index f0fc704..502dd60 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
*/

#define pr_fmt(fmt) "[drm:%s] " fmt, __func__
@@ -246,7 +247,11 @@ int dpu_rm_init(struct dpu_rm *rm,
struct dpu_hw_dsc *hw;
const struct dpu_dsc_cfg *dsc = &cat->dsc[i];

- hw = dpu_hw_dsc_init(dsc, mmio);
+ if (test_bit(DPU_DSC_HW_REV_1_2, &dsc->features))
+ hw = dpu_hw_dsc_init_1_2(dsc, mmio);
+ else
+ hw = dpu_hw_dsc_init(dsc, mmio);
+
if (IS_ERR_OR_NULL(hw)) {
rc = PTR_ERR(hw);
DPU_ERROR("failed dsc object creation: err %d\n", rc);
--
2.7.4

2023-05-12 18:03:25

[permalink] [raw]

Subject: [PATCH v8 7/8] drm/msm/dpu: add DSC 1.2 hw blocks for relevant chipsets

From: Abhinav Kumar <[email protected]>

Add DSC 1.2 hardware blocks to the catalog with necessary sub-block and
feature flag information. Each display compression engine (DCE) contains
dual hard slice DSC encoders so both share same base address but with
its own different sub block address.

changes in v4:
-- delete DPU_DSC_HW_REV_1_1
-- re arrange sc8280xp_dsc[]

changes in v4:
-- fix checkpatch warning

Signed-off-by: Abhinav Kumar <[email protected]>
Signed-off-by: Kuogee Hsieh <[email protected]>
Reviewed-by: Dmitry Baryshkov <[email protected]>
---
.../gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h | 14 ++++++++++++
.../gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h | 7 ++++++
.../drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 16 ++++++++++++++
.../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h | 14 ++++++++++++
.../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h | 14 ++++++++++++
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 25 +++++++++++++++++++++-
6 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
index 500cfd0..c4c93c8 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
@@ -153,6 +153,18 @@ static const struct dpu_merge_3d_cfg sm8350_merge_3d[] = {
MERGE_3D_BLK("merge_3d_2", MERGE_3D_2, 0x50000),
};

+/*
+ * NOTE: Each display compression engine (DCE) contains dual hard
+ * slice DSC encoders so both share same base address but with
+ * its own different sub block address.
+ */
+static const struct dpu_dsc_cfg sm8350_dsc[] = {
+ DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
+ DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
+ DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
+ DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
+};
+
static const struct dpu_intf_cfg sm8350_intf[] = {
INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
@@ -215,6 +227,8 @@ const struct dpu_mdss_cfg dpu_sm8350_cfg = {
.dspp = sm8350_dspp,
.pingpong_count = ARRAY_SIZE(sm8350_pp),
.pingpong = sm8350_pp,
+ .dsc = sm8350_dsc,
+ .dsc_count = ARRAY_SIZE(sm8350_dsc),
.merge_3d_count = ARRAY_SIZE(sm8350_merge_3d),
.merge_3d = sm8350_merge_3d,
.intf_count = ARRAY_SIZE(sm8350_intf),
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
index 5646713..42c66fe 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
@@ -93,6 +93,11 @@ static const struct dpu_pingpong_cfg sc7280_pp[] = {
PP_BLK_DITHER("pingpong_3", PINGPONG_3, 0x6c000, 0, sc7280_pp_sblk, -1, -1),
};

+/* NOTE: sc7280 only has one dsc hard slice encoder */
+static const struct dpu_dsc_cfg sc7280_dsc[] = {
+ DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
+};
+
static const struct dpu_intf_cfg sc7280_intf[] = {
INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
@@ -149,6 +154,8 @@ const struct dpu_mdss_cfg dpu_sc7280_cfg = {
.mixer = sc7280_lm,
.pingpong_count = ARRAY_SIZE(sc7280_pp),
.pingpong = sc7280_pp,
+ .dsc_count = ARRAY_SIZE(sc7280_dsc),
+ .dsc = sc7280_dsc,
.intf_count = ARRAY_SIZE(sc7280_intf),
.intf = sc7280_intf,
.vbif_count = ARRAY_SIZE(sdm845_vbif),
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
index 808aacd..1901fff 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
@@ -141,6 +141,20 @@ static const struct dpu_merge_3d_cfg sc8280xp_merge_3d[] = {
MERGE_3D_BLK("merge_3d_2", MERGE_3D_2, 0x50000),
};

+/*
+ * NOTE: Each display compression engine (DCE) contains dual hard
+ * slice DSC encoders so both share same base address but with
+ * its own different sub block address.
+ */
+static const struct dpu_dsc_cfg sc8280xp_dsc[] = {
+ DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
+ DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
+ DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
+ DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
+ DSC_BLK_1_2("dce_2", DSC_4, 0x82000, 0x100, 0, dsc_sblk_0),
+ DSC_BLK_1_2("dce_2", DSC_5, 0x82000, 0x100, 0, dsc_sblk_1),
+};
+
/* TODO: INTF 3, 8 and 7 are used for MST, marked as INTF_NONE for now */
static const struct dpu_intf_cfg sc8280xp_intf[] = {
INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
@@ -216,6 +230,8 @@ const struct dpu_mdss_cfg dpu_sc8280xp_cfg = {
.dspp = sc8280xp_dspp,
.pingpong_count = ARRAY_SIZE(sc8280xp_pp),
.pingpong = sc8280xp_pp,
+ .dsc = sc8280xp_dsc,
+ .dsc_count = ARRAY_SIZE(sc8280xp_dsc),
.merge_3d_count = ARRAY_SIZE(sc8280xp_merge_3d),
.merge_3d = sc8280xp_merge_3d,
.intf_count = ARRAY_SIZE(sc8280xp_intf),
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
index 1a89ff9..741d03f 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
@@ -161,6 +161,18 @@ static const struct dpu_merge_3d_cfg sm8450_merge_3d[] = {
MERGE_3D_BLK("merge_3d_3", MERGE_3D_3, 0x65f00),
};

+/*
+ * NOTE: Each display compression engine (DCE) contains dual hard
+ * slice DSC encoders so both share same base address but with
+ * its own different sub block address.
+ */
+static const struct dpu_dsc_cfg sm8450_dsc[] = {
+ DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
+ DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
+ DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
+ DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
+};
+
static const struct dpu_intf_cfg sm8450_intf[] = {
INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
@@ -223,6 +235,8 @@ const struct dpu_mdss_cfg dpu_sm8450_cfg = {
.dspp = sm8450_dspp,
.pingpong_count = ARRAY_SIZE(sm8450_pp),
.pingpong = sm8450_pp,
+ .dsc = sm8450_dsc,
+ .dsc_count = ARRAY_SIZE(sm8450_dsc),
.merge_3d_count = ARRAY_SIZE(sm8450_merge_3d),
.merge_3d = sm8450_merge_3d,
.intf_count = ARRAY_SIZE(sm8450_intf),
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
index 497b34c..3ee6dc8 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
@@ -165,6 +165,18 @@ static const struct dpu_merge_3d_cfg sm8550_merge_3d[] = {
MERGE_3D_BLK("merge_3d_3", MERGE_3D_3, 0x66700),
};

+/*
+ * NOTE: Each display compression engine (DCE) contains dual hard
+ * slice DSC encoders so both share same base address but with
+ * its own different sub block address.
+ */
+static const struct dpu_dsc_cfg sm8550_dsc[] = {
+ DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
+ DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
+ DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
+ DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
+};
+
static const struct dpu_intf_cfg sm8550_intf[] = {
INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
@@ -227,6 +239,8 @@ const struct dpu_mdss_cfg dpu_sm8550_cfg = {
.dspp = sm8550_dspp,
.pingpong_count = ARRAY_SIZE(sm8550_pp),
.pingpong = sm8550_pp,
+ .dsc = sm8550_dsc,
+ .dsc_count = ARRAY_SIZE(sm8550_dsc),
.merge_3d_count = ARRAY_SIZE(sm8550_merge_3d),
.merge_3d = sm8550_merge_3d,
.intf_count = ARRAY_SIZE(sm8550_intf),
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index 78e4bf6..c1d7338 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -1,6 +1,6 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
- * Copyright (c) 2022. Qualcomm Innovation Center, Inc. All rights reserved.
+ * Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
*/

#define pr_fmt(fmt) "[drm:%s:%d] " fmt, __func__, __LINE__
@@ -522,6 +522,16 @@ static const struct dpu_pingpong_sub_blks sc7280_pp_sblk = {
/*************************************************************
* DSC sub blocks config
*************************************************************/
+static const struct dpu_dsc_sub_blks dsc_sblk_0 = {
+ .enc = {.base = 0x100, .len = 0x100},
+ .ctl = {.base = 0xF00, .len = 0x10},
+};
+
+static const struct dpu_dsc_sub_blks dsc_sblk_1 = {
+ .enc = {.base = 0x200, .len = 0x100},
+ .ctl = {.base = 0xF80, .len = 0x10},
+};
+
#define DSC_BLK(_name, _id, _base, _features) \
{\
.name = _name, .id = _id, \
@@ -529,6 +539,19 @@ static const struct dpu_pingpong_sub_blks sc7280_pp_sblk = {
.features = _features, \
}

+/*
+ * NOTE: Each display compression engine (DCE) contains dual hard
+ * slice DSC encoders so both share same base address but with
+ * its own different sub block address.
+ */
+#define DSC_BLK_1_2(_name, _id, _base, _len, _features, _sblk) \
+ {\
+ .name = _name, .id = _id, \
+ .base = _base, .len = _len, \
+ .features = BIT(DPU_DSC_HW_REV_1_2) | _features, \
+ .sblk = &_sblk, \
+ }
+
/*************************************************************
* INTF sub blocks config
*************************************************************/
--
2.7.4

2023-05-12 18:03:50

[permalink] [raw]

Subject: [PATCH v8 8/8] drm/msm/dpu: tear down DSC data path when DSC disabled

Unset DSC_ACTIVE bit at dpu_hw_ctl_reset_intf_cfg_v1(),
dpu_encoder_unprep_dsc() and dpu_encoder_dsc_pipe_clr() functions
to tear down DSC data path if DSC data path was setup previous.

Signed-off-by: Kuogee Hsieh <[email protected]>
Reviewed-by: Dmitry Baryshkov <[email protected]>
---
drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 43 +++++++++++++++++++++++++++++
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 7 +++++
2 files changed, 50 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index 5cae70e..ee999ce 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -1214,6 +1214,44 @@ static void dpu_encoder_virt_atomic_enable(struct drm_encoder *drm_enc,
mutex_unlock(&dpu_enc->enc_lock);
}

+static void dpu_encoder_dsc_pipe_clr(struct dpu_encoder_virt *dpu_enc,
+ struct dpu_hw_dsc *hw_dsc,
+ struct dpu_hw_pingpong *hw_pp)
+{
+ struct dpu_encoder_phys *cur_master = dpu_enc->cur_master;
+ struct dpu_hw_ctl *ctl;
+
+ ctl = cur_master->hw_ctl;
+
+ if (hw_dsc->ops.dsc_disable)
+ hw_dsc->ops.dsc_disable(hw_dsc);
+
+ if (hw_pp->ops.disable_dsc)
+ hw_pp->ops.disable_dsc(hw_pp);
+
+ if (hw_dsc->ops.dsc_bind_pingpong_blk)
+ hw_dsc->ops.dsc_bind_pingpong_blk(hw_dsc, PINGPONG_NONE);
+
+ if (ctl->ops.update_pending_flush_dsc)
+ ctl->ops.update_pending_flush_dsc(ctl, hw_dsc->idx);
+}
+
+static void dpu_encoder_unprep_dsc(struct dpu_encoder_virt *dpu_enc)
+{
+ /* coding only for 2LM, 2enc, 1 dsc config */
+ struct dpu_hw_dsc *hw_dsc[MAX_CHANNELS_PER_ENC];
+ struct dpu_hw_pingpong *hw_pp[MAX_CHANNELS_PER_ENC];
+ int i;
+
+ for (i = 0; i < MAX_CHANNELS_PER_ENC; i++) {
+ hw_pp[i] = dpu_enc->hw_pp[i];
+ hw_dsc[i] = dpu_enc->hw_dsc[i];
+
+ if (hw_pp[i] && hw_dsc[i])
+ dpu_encoder_dsc_pipe_clr(dpu_enc, hw_dsc[i], hw_pp[i]);
+ }
+}
+
static void dpu_encoder_virt_atomic_disable(struct drm_encoder *drm_enc,
struct drm_atomic_state *state)
{
@@ -2090,6 +2128,9 @@ void dpu_encoder_helper_phys_cleanup(struct dpu_encoder_phys *phys_enc)
phys_enc->hw_pp->merge_3d->idx);
}

+ if (dpu_enc->dsc)
+ dpu_encoder_unprep_dsc(dpu_enc);
+
intf_cfg.stream_sel = 0; /* Don't care value for video mode */
intf_cfg.mode_3d = dpu_encoder_helper_get_3d_blend_mode(phys_enc);

@@ -2101,6 +2142,8 @@ void dpu_encoder_helper_phys_cleanup(struct dpu_encoder_phys *phys_enc)
if (phys_enc->hw_pp->merge_3d)
intf_cfg.merge_3d = phys_enc->hw_pp->merge_3d->idx;

+ intf_cfg.dsc = dpu_encoder_helper_get_dsc(phys_enc);
+
if (ctl->ops.reset_intf_cfg)
ctl->ops.reset_intf_cfg(ctl, &intf_cfg);

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
index f3a50cc..aec3b08 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
@@ -577,6 +577,7 @@ static void dpu_hw_ctl_reset_intf_cfg_v1(struct dpu_hw_ctl *ctx,
u32 intf_active = 0;
u32 wb_active = 0;
u32 merge3d_active = 0;
+ u32 dsc_active;

/*
* This API resets each portion of the CTL path namely,
@@ -606,6 +607,12 @@ static void dpu_hw_ctl_reset_intf_cfg_v1(struct dpu_hw_ctl *ctx,
wb_active &= ~BIT(cfg->wb - WB_0);
DPU_REG_WRITE(c, CTL_WB_ACTIVE, wb_active);
}
+
+ if (cfg->dsc) {
+ dsc_active = DPU_REG_READ(c, CTL_DSC_ACTIVE);
+ dsc_active &= ~cfg->dsc;
+ DPU_REG_WRITE(c, CTL_DSC_ACTIVE, dsc_active);
+ }
}

static void dpu_hw_ctl_set_fetch_pipe_active(struct dpu_hw_ctl *ctx,
--
2.7.4

2023-05-12 18:16:23

[permalink] [raw]

Subject: [PATCH v8 2/8] drm/msm/dpu: add DPU_PINGPONG_DSC feature bit for DPU < 7.0.0

DPU < 7.0.0 requires the PINGPONG block to be involved during
DSC setting up. Since DPU >= 7.0.0, enabling and starting the DSC
encoder engine was moved to INTF with the help of the flush mechanism.
Add a DPU_PINGPONG_DSC feature bit to restrict the availability of
dpu_hw_pp_setup_dsc() and dpu_hw_pp_dsc_{enable,disable}() on the
PINGPONG block to DPU < 7.0.0 hardware, as the registers are not
available [in the PINGPONG block] on DPU 7.0.0 and higher anymore.
Add DPU_PINGPONG_DSC to PINGPONG_SDM845_MASK, PINGPONG_SDM845_TE2_MASK
and PINGPONG_SM8150_MASK which is used for all DPU < 7.0 chipsets.

changes in v6:
-- split patches and rearrange to keep catalog related files at this patch

changes in v7:
-- rewording commit text as suggested at review comments

Signed-off-by: Kuogee Hsieh <[email protected]>
---
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 6 +++---
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 4 +++-
2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index 82b58c6..78e4bf6 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -76,13 +76,13 @@
(BIT(DPU_DIM_LAYER) | BIT(DPU_MIXER_COMBINED_ALPHA))

#define PINGPONG_SDM845_MASK \
- (BIT(DPU_PINGPONG_DITHER) | BIT(DPU_PINGPONG_TE))
+ (BIT(DPU_PINGPONG_DITHER) | BIT(DPU_PINGPONG_TE) | BIT(DPU_PINGPONG_DSC))

#define PINGPONG_SDM845_TE2_MASK \
- (PINGPONG_SDM845_MASK | BIT(DPU_PINGPONG_TE2))
+ (PINGPONG_SDM845_MASK | BIT(DPU_PINGPONG_TE2) | BIT(DPU_PINGPONG_DSC))

#define PINGPONG_SM8150_MASK \
- (BIT(DPU_PINGPONG_DITHER))
+ (BIT(DPU_PINGPONG_DITHER) | BIT(DPU_PINGPONG_DSC))

#define CTL_SC7280_MASK \
(BIT(DPU_CTL_ACTIVE_CFG) | \
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
index 6ee48f0..dc0a4da 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
@@ -144,7 +144,8 @@ enum {
* @DPU_PINGPONG_TE2 Additional tear check block for split pipes
* @DPU_PINGPONG_SPLIT PP block supports split fifo
* @DPU_PINGPONG_SLAVE PP block is a suitable slave for split fifo
- * @DPU_PINGPONG_DITHER, Dither blocks
+ * @DPU_PINGPONG_DITHER Dither blocks
+ * @DPU_PINGPONG_DSC PP ops functions required for DSC
* @DPU_PINGPONG_MAX
*/
enum {
@@ -153,6 +154,7 @@ enum {
DPU_PINGPONG_SPLIT,
DPU_PINGPONG_SLAVE,
DPU_PINGPONG_DITHER,
+ DPU_PINGPONG_DSC,
DPU_PINGPONG_MAX
};

--
2.7.4

2023-05-12 18:16:32

[permalink] [raw]

Subject: [PATCH v8 6/8] drm/msm/dpu: separate DSC flush update out of interface

Current DSC flush update is piggyback inside dpu_hw_ctl_intf_cfg_v1().
This patch separates DSC flush away from dpu_hw_ctl_intf_cfg_v1() by
adding dpu_hw_ctl_update_pending_flush_dsc_v1() to handle both per
DSC engine and DSC flush bits at same time to make it consistent with
the location of flush programming of other dpu sub blocks.

Signed-off-by: Kuogee Hsieh <[email protected]>
Reviewed-by: Dmitry Baryshkov <[email protected]>
---
drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 14 ++++++++++++--
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 22 ++++++++++++++++------
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h | 10 ++++++++++
3 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index ffa6f04..5cae70e 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -1834,12 +1834,18 @@ dpu_encoder_dsc_initial_line_calc(struct drm_dsc_config *dsc,
return DIV_ROUND_UP(total_pixels, dsc->slice_width);
}

-static void dpu_encoder_dsc_pipe_cfg(struct dpu_hw_dsc *hw_dsc,
+static void dpu_encoder_dsc_pipe_cfg(struct dpu_encoder_virt *dpu_enc,
+ struct dpu_hw_dsc *hw_dsc,
struct dpu_hw_pingpong *hw_pp,
struct drm_dsc_config *dsc,
u32 common_mode,
u32 initial_lines)
{
+ struct dpu_encoder_phys *cur_master = dpu_enc->cur_master;
+ struct dpu_hw_ctl *ctl;
+
+ ctl = cur_master->hw_ctl;
+
if (hw_dsc->ops.dsc_config)
hw_dsc->ops.dsc_config(hw_dsc, dsc, common_mode, initial_lines);

@@ -1854,6 +1860,9 @@ static void dpu_encoder_dsc_pipe_cfg(struct dpu_hw_dsc *hw_dsc,

if (hw_pp->ops.enable_dsc)
hw_pp->ops.enable_dsc(hw_pp);
+
+ if (ctl->ops.update_pending_flush_dsc)
+ ctl->ops.update_pending_flush_dsc(ctl, hw_dsc->idx);
}

static void dpu_encoder_prep_dsc(struct dpu_encoder_virt *dpu_enc,
@@ -1898,7 +1907,8 @@ static void dpu_encoder_prep_dsc(struct dpu_encoder_virt *dpu_enc,
initial_lines = dpu_encoder_dsc_initial_line_calc(dsc, enc_ip_w);

for (i = 0; i < MAX_CHANNELS_PER_ENC; i++)
- dpu_encoder_dsc_pipe_cfg(hw_dsc[i], hw_pp[i], dsc, dsc_common_mode, initial_lines);
+ dpu_encoder_dsc_pipe_cfg(dpu_enc, hw_dsc[i], hw_pp[i], dsc,
+ dsc_common_mode, initial_lines);
}

void dpu_encoder_prepare_for_kickoff(struct drm_encoder *drm_enc)
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
index 4f7cfa9..f3a50cc 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
@@ -139,6 +139,11 @@ static inline void dpu_hw_ctl_trigger_flush_v1(struct dpu_hw_ctl *ctx)
CTL_DSPP_n_FLUSH(dspp - DSPP_0),
ctx->pending_dspp_flush_mask[dspp - DSPP_0]);
}
+
+ if (ctx->pending_flush_mask & BIT(DSC_IDX))
+ DPU_REG_WRITE(&ctx->hw, CTL_DSC_FLUSH,
+ ctx->pending_dsc_flush_mask);
+
DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, ctx->pending_flush_mask);
}

@@ -285,6 +290,13 @@ static void dpu_hw_ctl_update_pending_flush_merge_3d_v1(struct dpu_hw_ctl *ctx,
ctx->pending_flush_mask |= BIT(MERGE_3D_IDX);
}

+static void dpu_hw_ctl_update_pending_flush_dsc_v1(struct dpu_hw_ctl *ctx,
+ enum dpu_dsc dsc_num)
+{
+ ctx->pending_dsc_flush_mask |= BIT(dsc_num - DSC_0);
+ ctx->pending_flush_mask |= BIT(DSC_IDX);
+}
+
static void dpu_hw_ctl_update_pending_flush_dspp(struct dpu_hw_ctl *ctx,
enum dpu_dspp dspp, u32 dspp_sub_blk)
{
@@ -502,9 +514,6 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
if ((test_bit(DPU_CTL_VM_CFG, &ctx->caps->features)))
mode_sel = CTL_DEFAULT_GROUP_ID << 28;

- if (cfg->dsc)
- DPU_REG_WRITE(&ctx->hw, CTL_DSC_FLUSH, cfg->dsc);
-
if (cfg->intf_mode_sel == DPU_CTL_MODE_SEL_CMD)
mode_sel |= BIT(17);

@@ -524,10 +533,8 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
if (cfg->merge_3d)
DPU_REG_WRITE(c, CTL_MERGE_3D_ACTIVE,
BIT(cfg->merge_3d - MERGE_3D_0));
- if (cfg->dsc) {
- DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, DSC_IDX);
+ if (cfg->dsc)
DPU_REG_WRITE(c, CTL_DSC_ACTIVE, cfg->dsc);
- }
}

static void dpu_hw_ctl_intf_cfg(struct dpu_hw_ctl *ctx,
@@ -630,6 +637,9 @@ static void _setup_ctl_ops(struct dpu_hw_ctl_ops *ops,
ops->update_pending_flush_merge_3d =
dpu_hw_ctl_update_pending_flush_merge_3d_v1;
ops->update_pending_flush_wb = dpu_hw_ctl_update_pending_flush_wb_v1;
+
+ ops->update_pending_flush_dsc =
+ dpu_hw_ctl_update_pending_flush_dsc_v1;
} else {
ops->trigger_flush = dpu_hw_ctl_trigger_flush;
ops->setup_intf_cfg = dpu_hw_ctl_intf_cfg;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
index 6292002..d4869a0 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
@@ -158,6 +158,15 @@ struct dpu_hw_ctl_ops {
enum dpu_dspp blk, u32 dspp_sub_blk);

/**
+ * OR in the given flushbits to the cached pending_(dsc_)flush_mask
+ * No effect on hardware
+ * @ctx : ctl path ctx pointer
+ * @blk : interface block index
+ */
+ void (*update_pending_flush_dsc)(struct dpu_hw_ctl *ctx,
+ enum dpu_dsc blk);
+
+ /**
* Write the value of the pending_flush_mask to hardware
* @ctx : ctl path ctx pointer
*/
@@ -245,6 +254,7 @@ struct dpu_hw_ctl {
u32 pending_wb_flush_mask;
u32 pending_merge_3d_flush_mask;
u32 pending_dspp_flush_mask[DSPP_MAX - DSPP_0];
+ u32 pending_dsc_flush_mask;

/* ops */
struct dpu_hw_ctl_ops ops;
--
2.7.4

2023-05-12 18:22:33

[permalink] [raw]

Subject: Re: [PATCH v8 2/8] drm/msm/dpu: add DPU_PINGPONG_DSC feature bit for DPU < 7.0.0

On 12/05/2023 21:00, Kuogee Hsieh wrote:
> DPU < 7.0.0 requires the PINGPONG block to be involved during
> DSC setting up. Since DPU >= 7.0.0, enabling and starting the DSC
> encoder engine was moved to INTF with the help of the flush mechanism.
> Add a DPU_PINGPONG_DSC feature bit to restrict the availability of
> dpu_hw_pp_setup_dsc() and dpu_hw_pp_dsc_{enable,disable}() on the
> PINGPONG block to DPU < 7.0.0 hardware, as the registers are not
> available [in the PINGPONG block] on DPU 7.0.0 and higher anymore.
> Add DPU_PINGPONG_DSC to PINGPONG_SDM845_MASK, PINGPONG_SDM845_TE2_MASK
> and PINGPONG_SM8150_MASK which is used for all DPU < 7.0 chipsets.
>
> changes in v6:
> -- split patches and rearrange to keep catalog related files at this patch
>
> changes in v7:
> -- rewording commit text as suggested at review comments
>
> Signed-off-by: Kuogee Hsieh <[email protected]>

Reviewed-by: Dmitry Baryshkov <[email protected]>

Single nit below

> ---
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 6 +++---
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 4 +++-
> 2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> index 82b58c6..78e4bf6 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> @@ -76,13 +76,13 @@
> (BIT(DPU_DIM_LAYER) | BIT(DPU_MIXER_COMBINED_ALPHA))
>
> #define PINGPONG_SDM845_MASK \
> - (BIT(DPU_PINGPONG_DITHER) | BIT(DPU_PINGPONG_TE))
> + (BIT(DPU_PINGPONG_DITHER) | BIT(DPU_PINGPONG_TE) | BIT(DPU_PINGPONG_DSC))
>
> #define PINGPONG_SDM845_TE2_MASK \
> - (PINGPONG_SDM845_MASK | BIT(DPU_PINGPONG_TE2))
> + (PINGPONG_SDM845_MASK | BIT(DPU_PINGPONG_TE2) | BIT(DPU_PINGPONG_DSC))
>
> #define PINGPONG_SM8150_MASK \
> - (BIT(DPU_PINGPONG_DITHER))
> + (BIT(DPU_PINGPONG_DITHER) | BIT(DPU_PINGPONG_DSC))
>
> #define CTL_SC7280_MASK \
> (BIT(DPU_CTL_ACTIVE_CFG) | \
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> index 6ee48f0..dc0a4da 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> @@ -144,7 +144,8 @@ enum {
> * @DPU_PINGPONG_TE2 Additional tear check block for split pipes
> * @DPU_PINGPONG_SPLIT PP block supports split fifo
> * @DPU_PINGPONG_SLAVE PP block is a suitable slave for split fifo
> - * @DPU_PINGPONG_DITHER, Dither blocks
> + * @DPU_PINGPONG_DITHER Dither blocks

Ideally this should be a separate commit. It is irrelevant to
DPU_PINGPONG_DSC

> + * @DPU_PINGPONG_DSC PP ops functions required for DSC
> * @DPU_PINGPONG_MAX
> */
> enum {
> @@ -153,6 +154,7 @@ enum {
> DPU_PINGPONG_SPLIT,
> DPU_PINGPONG_SLAVE,
> DPU_PINGPONG_DITHER,
> + DPU_PINGPONG_DSC,
> DPU_PINGPONG_MAX
> };
>

--
With best wishes
Dmitry

2023-05-12 18:24:57

[permalink] [raw]

Subject: Re: [PATCH v8 5/8] drm/msm/dpu: add support for DSC encoder v1.2 engine

On 12/05/2023 21:00, Kuogee Hsieh wrote:
> Add support for DSC 1.2 by providing the necessary hooks to program
> the DPU DSC 1.2 encoder.
>
> Changes in v3:
> -- fixed kernel test rebot report that "__iomem *off" is declared but not
> used at dpu_hw_dsc_config_1_2()
> -- unrolling thresh loops
>
> Changes in v4:
> -- delete DPU_DSC_HW_REV_1_1
> -- delete off and used real register name directly
>
> Changes in v7:
> -- replace offset with sblk->enc.base
> -- replace ss with slice
>
> Changes in v8:
> -- fixed checkpatch warning
>
> Signed-off-by: Kuogee Hsieh <[email protected]>
> ---
> drivers/gpu/drm/msm/Makefile | 1 +
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 32 ++-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h | 14 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c | 382 +++++++++++++++++++++++++
> drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 7 +-
> 5 files changed, 432 insertions(+), 4 deletions(-)
> create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>
> diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
> index b814fc8..b9af5e4 100644
> --- a/drivers/gpu/drm/msm/Makefile
> +++ b/drivers/gpu/drm/msm/Makefile
> @@ -65,6 +65,7 @@ msm-$(CONFIG_DRM_MSM_DPU) += \
> disp/dpu1/dpu_hw_catalog.o \
> disp/dpu1/dpu_hw_ctl.o \
> disp/dpu1/dpu_hw_dsc.o \
> + disp/dpu1/dpu_hw_dsc_1_2.o \
> disp/dpu1/dpu_hw_interrupts.o \
> disp/dpu1/dpu_hw_intf.o \
> disp/dpu1/dpu_hw_lm.o \
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> index dc0a4da..4eda2cc 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> @@ -1,6 +1,6 @@
> /* SPDX-License-Identifier: GPL-2.0-only */
> /*
> - * Copyright (c) 2022. Qualcomm Innovation Center, Inc. All rights reserved.
> + * Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
> * Copyright (c) 2015-2018, 2020 The Linux Foundation. All rights reserved.
> */
>
> @@ -244,12 +244,18 @@ enum {
> };
>
> /**
> - * DSC features
> + * DSC sub-blocks/features
> * @DPU_DSC_OUTPUT_CTRL Configure which PINGPONG block gets
> * the pixel output from this DSC.
> + * @DPU_DSC_HW_REV_1_2 DSC block supports dsc 1.1 and 1.2
> + * @DPU_DSC_NATIVE_422_EN Supports native422 and native420 encoding
> + * @DPU_DSC_MAX
> */
> enum {
> DPU_DSC_OUTPUT_CTRL = 0x1,
> + DPU_DSC_HW_REV_1_2,
> + DPU_DSC_NATIVE_422_EN,
> + DPU_DSC_MAX
> };
>
> /**
> @@ -306,6 +312,14 @@ struct dpu_pp_blk {
> };
>
> /**
> + * struct dpu_dsc_blk - DSC Encoder sub-blk information
> + * @info: HW register and features supported by this sub-blk
> + */
> +struct dpu_dsc_blk {
> + DPU_HW_SUBBLK_INFO;
> +};
> +
> +/**
> * enum dpu_qos_lut_usage - define QoS LUT use cases
> */
> enum dpu_qos_lut_usage {
> @@ -452,6 +466,17 @@ struct dpu_pingpong_sub_blks {
> };
>
> /**
> + * struct dpu_dsc_sub_blks - DSC sub-blks
> + * @enc: DSC encoder sub block
> + * @ctl: DSC controller sub block
> + *
> + */
> +struct dpu_dsc_sub_blks {
> + struct dpu_dsc_blk enc;
> + struct dpu_dsc_blk ctl;
> +};
> +
> +/**
> * dpu_clk_ctrl_type - Defines top level clock control signals
> */
> enum dpu_clk_ctrl_type {
> @@ -605,10 +630,13 @@ struct dpu_merge_3d_cfg {
> * struct dpu_dsc_cfg - information of DSC blocks
> * @id enum identifying this block
> * @base register offset of this block
> + * @len: length of hardware block
> * @features bit mask identifying sub-blocks/features
> + * @sblk sub-blocks information
> */
> struct dpu_dsc_cfg {
> DPU_HW_BLK_INFO;
> + const struct dpu_dsc_sub_blks *sblk;
> };
>
> /**
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
> index 138080a..44fd624 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
> @@ -1,5 +1,8 @@
> /* SPDX-License-Identifier: GPL-2.0-only */
> -/* Copyright (c) 2020-2022, Linaro Limited */
> +/*
> + * Copyright (c) 2020-2022, Linaro Limited
> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved
> + */
>
> #ifndef _DPU_HW_DSC_H
> #define _DPU_HW_DSC_H
> @@ -69,6 +72,15 @@ struct dpu_hw_dsc *dpu_hw_dsc_init(const struct dpu_dsc_cfg *cfg,
> void __iomem *addr);
>
> /**
> + * dpu_hw_dsc_init_1_2 - initializes the v1.2 DSC hw driver block
> + * @cfg: DSC catalog entry for which driver object is required
> + * @addr: Mapped register io address of MDP
> + * Returns: Error code or allocated dpu_hw_dsc context
> + */
> +struct dpu_hw_dsc *dpu_hw_dsc_init_1_2(const struct dpu_dsc_cfg *cfg,
> + void __iomem *addr);
> +
> +/**
> * dpu_hw_dsc_destroy - destroys dsc driver context
> * @dsc: Pointer to dsc driver context returned by dpu_hw_dsc_init
> */
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
> new file mode 100644
> index 00000000..5bd84bd
> --- /dev/null
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
> @@ -0,0 +1,382 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2020-2021, The Linux Foundation. All rights reserved.
> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved
> + */
> +
> +#include <drm/display/drm_dsc_helper.h>
> +
> +#include "dpu_kms.h"
> +#include "dpu_hw_catalog.h"
> +#include "dpu_hwio.h"
> +#include "dpu_hw_mdss.h"
> +#include "dpu_hw_dsc.h"
> +
> +#define DSC_CMN_MAIN_CNF 0x00
> +
> +/* DPU_DSC_ENC register offsets */
> +#define ENC_DF_CTRL 0x00
> +#define ENC_GENERAL_STATUS 0x04
> +#define ENC_HSLICE_STATUS 0x08
> +#define ENC_OUT_STATUS 0x0C
> +#define ENC_INT_STAT 0x10
> +#define ENC_INT_CLR 0x14
> +#define ENC_INT_MASK 0x18
> +#define DSC_MAIN_CONF 0x30
> +#define DSC_PICTURE_SIZE 0x34
> +#define DSC_SLICE_SIZE 0x38
> +#define DSC_MISC_SIZE 0x3C
> +#define DSC_HRD_DELAYS 0x40
> +#define DSC_RC_SCALE 0x44
> +#define DSC_RC_SCALE_INC_DEC 0x48
> +#define DSC_RC_OFFSETS_1 0x4C
> +#define DSC_RC_OFFSETS_2 0x50
> +#define DSC_RC_OFFSETS_3 0x54
> +#define DSC_RC_OFFSETS_4 0x58
> +#define DSC_FLATNESS_QP 0x5C
> +#define DSC_RC_MODEL_SIZE 0x60
> +#define DSC_RC_CONFIG 0x64
> +#define DSC_RC_BUF_THRESH_0 0x68
> +#define DSC_RC_BUF_THRESH_1 0x6C
> +#define DSC_RC_BUF_THRESH_2 0x70
> +#define DSC_RC_BUF_THRESH_3 0x74
> +#define DSC_RC_MIN_QP_0 0x78
> +#define DSC_RC_MIN_QP_1 0x7C
> +#define DSC_RC_MIN_QP_2 0x80
> +#define DSC_RC_MAX_QP_0 0x84
> +#define DSC_RC_MAX_QP_1 0x88
> +#define DSC_RC_MAX_QP_2 0x8C
> +#define DSC_RC_RANGE_BPG_OFFSETS_0 0x90
> +#define DSC_RC_RANGE_BPG_OFFSETS_1 0x94
> +#define DSC_RC_RANGE_BPG_OFFSETS_2 0x98
> +
> +/* DPU_DSC_CTL register offsets */
> +#define DSC_CTL 0x00
> +#define DSC_CFG 0x04
> +#define DSC_DATA_IN_SWAP 0x08
> +#define DSC_CLK_CTRL 0x0C
> +
> +static inline int _dsc_calc_ob_max_addr(struct dpu_hw_dsc *hw_dsc, int num_ss)
> +{
> + int max_addr = 2400 / num_ss;
> +
> + if (hw_dsc->caps->features & BIT(DPU_DSC_NATIVE_422_EN))
> + max_addr /= 2;
> +
> + return max_addr - 1;
> +};
> +
> +static inline void dpu_hw_dsc_disable_1_2(struct dpu_hw_dsc *hw_dsc)
> +{
> + struct dpu_hw_blk_reg_map *hw;
> + const struct dpu_dsc_sub_blks *sblk;
> +
> + if (!hw_dsc)
> + return;
> +
> + hw = &hw_dsc->hw;
> + sblk = hw_dsc->caps->sblk;
> + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CFG, 0);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + ENC_DF_CTRL, 0);
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_MAIN_CONF, 0);
> +}
> +
> +static inline void dpu_hw_dsc_config_1_2(struct dpu_hw_dsc *hw_dsc,
> + struct drm_dsc_config *dsc,
> + u32 mode,
> + u32 initial_lines)
> +{
> + struct dpu_hw_blk_reg_map *hw;
> + const struct dpu_dsc_sub_blks *sblk;
> + u32 data = 0;
> + u32 det_thresh_flatness;
> + u32 num_active_slice_per_enc;
> + u32 bpp;
> +
> + if (!hw_dsc || !dsc)
> + return;
> +
> + hw = &hw_dsc->hw;
> +
> + sblk = hw_dsc->caps->sblk;
> +
> + if (mode & DSC_MODE_SPLIT_PANEL)
> + data |= BIT(0);
> +
> + if (mode & DSC_MODE_MULTIPLEX)
> + data |= BIT(1);
> +
> + num_active_slice_per_enc = dsc->slice_count;
> + if (mode & DSC_MODE_MULTIPLEX)
> + num_active_slice_per_enc = dsc->slice_count >> 1;
> +
> + data |= (num_active_slice_per_enc & 0x3) << 7;
> +
> + DPU_REG_WRITE(hw, DSC_CMN_MAIN_CNF, data);
> +
> + data = (initial_lines & 0xff);
> +
> + if (mode & DSC_MODE_VIDEO)
> + data |= BIT(9);
> +
> + data |= (_dsc_calc_ob_max_addr(hw_dsc, num_active_slice_per_enc) << 18);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + ENC_DF_CTRL, data);
> +
> + data = (dsc->dsc_version_minor & 0xf) << 28;
> + if (dsc->dsc_version_minor == 0x2) {
> + if (dsc->native_422)
> + data |= BIT(22);
> + if (dsc->native_420)
> + data |= BIT(21);
> + }
> +
> + bpp = dsc->bits_per_pixel;
> + /* as per hw requirement bpp should be programmed
> + * twice the actual value in case of 420 or 422 encoding
> + */
> + if (dsc->native_422 || dsc->native_420)
> + bpp = 2 * bpp;
> + data |= (dsc->block_pred_enable ? 1 : 0) << 20;
> + data |= bpp << 10;
> + data |= (dsc->line_buf_depth & 0xf) << 6;
> + data |= dsc->convert_rgb << 4;
> + data |= dsc->bits_per_component & 0xf;
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_MAIN_CONF, data);
> +
> + data = (dsc->pic_width & 0xffff) |
> + ((dsc->pic_height & 0xffff) << 16);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_PICTURE_SIZE, data);
> +
> + data = (dsc->slice_width & 0xffff) |
> + ((dsc->slice_height & 0xffff) << 16);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_SLICE_SIZE, data);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_MISC_SIZE,
> + (dsc->slice_chunk_size) & 0xffff);
> +
> + data = (dsc->initial_xmit_delay & 0xffff) |
> + ((dsc->initial_dec_delay & 0xffff) << 16);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_HRD_DELAYS, data);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_SCALE,
> + dsc->initial_scale_value & 0x3f);
> +
> + data = (dsc->scale_increment_interval & 0xffff) |
> + ((dsc->scale_decrement_interval & 0x7ff) << 16);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_SCALE_INC_DEC, data);
> +
> + data = (dsc->first_line_bpg_offset & 0x1f) |
> + ((dsc->second_line_bpg_offset & 0x1f) << 5);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_1, data);
> +
> + data = (dsc->nfl_bpg_offset & 0xffff) |
> + ((dsc->slice_bpg_offset & 0xffff) << 16);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_2, data);
> +
> + data = (dsc->initial_offset & 0xffff) |
> + ((dsc->final_offset & 0xffff) << 16);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_3, data);
> +
> + data = (dsc->nsl_bpg_offset & 0xffff) |
> + ((dsc->second_line_offset_adj & 0xffff) << 16);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_4, data);
> +
> + data = (dsc->flatness_min_qp & 0x1f);
> + data |= (dsc->flatness_max_qp & 0x1f) << 5;
> +
> + det_thresh_flatness = drm_dsc_calculate_flatness_det_thresh(dsc);
> + data |= (det_thresh_flatness & 0xff) << 10;
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_FLATNESS_QP, data);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MODEL_SIZE,
> + (dsc->rc_model_size) & 0xffff);
> +
> + data = dsc->rc_edge_factor & 0xf;
> + data |= (dsc->rc_quant_incr_limit0 & 0x1f) << 8;
> + data |= (dsc->rc_quant_incr_limit1 & 0x1f) << 13;
> + data |= (dsc->rc_tgt_offset_high & 0xf) << 20;
> + data |= (dsc->rc_tgt_offset_low & 0xf) << 24;
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_CONFIG, data);
> +
> + /* program the dsc wrapper */
> + data = BIT(0); /* encoder enable */
> + if (dsc->native_422)
> + data |= BIT(8);
> + else if (dsc->native_420)
> + data |= BIT(9);
> + if (!dsc->convert_rgb)
> + data |= BIT(10);
> + if (dsc->bits_per_component == 8)
> + data |= BIT(11);
> + if (mode & DSC_MODE_SPLIT_PANEL)
> + data |= BIT(12);
> + if (mode & DSC_MODE_MULTIPLEX)
> + data |= BIT(13);
> + if (!(mode & DSC_MODE_VIDEO))
> + data |= BIT(17);
> +
> + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CFG, data);
> +}
> +
> +static inline void dpu_hw_dsc_config_thresh_1_2(struct dpu_hw_dsc *hw_dsc,
> + struct drm_dsc_config *dsc)
> +{
> + struct dpu_hw_blk_reg_map *hw;
> + const struct dpu_dsc_sub_blks *sblk;
> + struct drm_dsc_rc_range_parameters *rc;
> +
> + if (!hw_dsc || !dsc)
> + return;
> +
> + hw = &hw_dsc->hw;
> +
> + sblk = hw_dsc->caps->sblk;
> +
> + rc = dsc->rc_range_params;
> +
> + /*
> + * With BUF_THRESH -- 14 in total
> + * each register contains 4 thresh values with the last register
> + * containing only 2 thresh values
> + */
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_0,
> + (dsc->rc_buf_thresh[0] << 0) |
> + (dsc->rc_buf_thresh[1] << 8) |
> + (dsc->rc_buf_thresh[2] << 16) |
> + (dsc->rc_buf_thresh[3] << 24));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_1,
> + (dsc->rc_buf_thresh[4] << 0) |
> + (dsc->rc_buf_thresh[5] << 8) |
> + (dsc->rc_buf_thresh[6] << 16) |
> + (dsc->rc_buf_thresh[7] << 24));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_2,
> + (dsc->rc_buf_thresh[8] << 0) |
> + (dsc->rc_buf_thresh[9] << 8) |
> + (dsc->rc_buf_thresh[10] << 16) |
> + (dsc->rc_buf_thresh[11] << 24));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_3,
> + (dsc->rc_buf_thresh[12] << 0) |
> + (dsc->rc_buf_thresh[13] << 8));
> +
> + /*
> + * with min/max_QP -- 5 bits
> + * each register contains 5 min_qp or max_qp for total of 15
> + *
> + * With BPG_OFFSET -- 6 bits
> + * each register contains 5 BPG_offset for total of 15
> + */
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_0,
> + (rc[0].range_min_qp << 0) |
> + (rc[1].range_min_qp << 5) |
> + (rc[2].range_min_qp << 10) |
> + (rc[3].range_min_qp << 15) |
> + (rc[4].range_min_qp << 20));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_0,
> + (rc[0].range_max_qp << 0) |
> + (rc[1].range_max_qp << 5) |
> + (rc[2].range_max_qp << 10) |
> + (rc[3].range_max_qp << 15) |
> + (rc[4].range_max_qp << 20));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_0,
> + (rc[0].range_bpg_offset << 0) |
> + (rc[1].range_bpg_offset << 6) |
> + (rc[2].range_bpg_offset << 12) |
> + (rc[3].range_bpg_offset << 18) |
> + (rc[4].range_bpg_offset << 24));
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_1,
> + (rc[5].range_min_qp << 0) |
> + (rc[6].range_min_qp << 5) |
> + (rc[7].range_min_qp << 10) |
> + (rc[8].range_min_qp << 15) |
> + (rc[9].range_min_qp << 20));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_1,
> + (rc[5].range_max_qp << 0) |
> + (rc[6].range_max_qp << 5) |
> + (rc[7].range_max_qp << 10) |
> + (rc[8].range_max_qp << 15) |
> + (rc[9].range_max_qp << 20));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_1,
> + (rc[5].range_bpg_offset << 0) |
> + (rc[6].range_bpg_offset << 6) |
> + (rc[7].range_bpg_offset << 12) |
> + (rc[8].range_bpg_offset << 18) |
> + (rc[9].range_bpg_offset << 24));
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_2,
> + (rc[10].range_min_qp << 0) |
> + (rc[11].range_min_qp << 5) |
> + (rc[12].range_min_qp << 10) |
> + (rc[13].range_min_qp << 15) |
> + (rc[14].range_min_qp << 20));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_2,
> + (rc[10].range_max_qp << 0) |
> + (rc[11].range_max_qp << 5) |
> + (rc[12].range_max_qp << 10) |
> + (rc[13].range_max_qp << 15) |
> + (rc[14].range_max_qp << 20));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_2,
> + (rc[10].range_bpg_offset << 0) |
> + (rc[11].range_bpg_offset << 6) |
> + (rc[12].range_bpg_offset << 12) |
> + (rc[13].range_bpg_offset << 18) |
> + (rc[14].range_bpg_offset << 24));
> +}
> +
> +static inline void dpu_hw_dsc_bind_pingpong_blk_1_2(struct dpu_hw_dsc *hw_dsc,
> + const enum dpu_pingpong pp)
> +{
> + struct dpu_hw_blk_reg_map *hw;
> + const struct dpu_dsc_sub_blks *sblk;
> + int mux_cfg = 0xf; /* Disabled */
> +
> + hw = &hw_dsc->hw;
> +
> + sblk = hw_dsc->caps->sblk;
> +
> + if (pp)
> + mux_cfg = (pp - PINGPONG_0) & 0x7;

Do we need an unbind support here like we do for the DSC 1.1?

> +
> + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CTL, mux_cfg);
> +}
> +
> +static void _setup_dcs_ops_1_2(struct dpu_hw_dsc_ops *ops,
> + const unsigned long features)
> +{
> + ops->dsc_disable = dpu_hw_dsc_disable_1_2;
> + ops->dsc_config = dpu_hw_dsc_config_1_2;
> + ops->dsc_config_thresh = dpu_hw_dsc_config_thresh_1_2;
> + ops->dsc_bind_pingpong_blk = dpu_hw_dsc_bind_pingpong_blk_1_2;
> +}
> +
> +struct dpu_hw_dsc *dpu_hw_dsc_init_1_2(const struct dpu_dsc_cfg *cfg,
> + void __iomem *addr)
> +{
> + struct dpu_hw_dsc *c;
> +
> + c = kzalloc(sizeof(*c), GFP_KERNEL);
> + if (!c)
> + return ERR_PTR(-ENOMEM);
> +
> + c->hw.blk_addr = addr + cfg->base;
> + c->hw.log_mask = DPU_DBG_MASK_DSC;
> +
> + c->idx = cfg->id;
> + c->caps = cfg;
> + _setup_dcs_ops_1_2(&c->ops, c->caps->features);
> +
> + return c;
> +}
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> index f0fc704..502dd60 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> @@ -1,6 +1,7 @@
> // SPDX-License-Identifier: GPL-2.0-only
> /*
> * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
> */
>
> #define pr_fmt(fmt) "[drm:%s] " fmt, __func__
> @@ -246,7 +247,11 @@ int dpu_rm_init(struct dpu_rm *rm,
> struct dpu_hw_dsc *hw;
> const struct dpu_dsc_cfg *dsc = &cat->dsc[i];
>
> - hw = dpu_hw_dsc_init(dsc, mmio);
> + if (test_bit(DPU_DSC_HW_REV_1_2, &dsc->features))
> + hw = dpu_hw_dsc_init_1_2(dsc, mmio);
> + else
> + hw = dpu_hw_dsc_init(dsc, mmio);
> +
> if (IS_ERR_OR_NULL(hw)) {
> rc = PTR_ERR(hw);
> DPU_ERROR("failed dsc object creation: err %d\n", rc);

--
With best wishes
Dmitry

2023-05-12 18:46:38

[permalink] [raw]

Subject: Re: [PATCH v8 6/8] drm/msm/dpu: separate DSC flush update out of interface

On 12/05/2023 21:00, Kuogee Hsieh wrote:
> Current DSC flush update is piggyback inside dpu_hw_ctl_intf_cfg_v1().
> This patch separates DSC flush away from dpu_hw_ctl_intf_cfg_v1() by
> adding dpu_hw_ctl_update_pending_flush_dsc_v1() to handle both per
> DSC engine and DSC flush bits at same time to make it consistent with
> the location of flush programming of other dpu sub blocks.
>
> Signed-off-by: Kuogee Hsieh <[email protected]>
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> ---
> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 14 ++++++++++++--
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 22 ++++++++++++++++------
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h | 10 ++++++++++
> 3 files changed, 38 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> index ffa6f04..5cae70e 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> @@ -1834,12 +1834,18 @@ dpu_encoder_dsc_initial_line_calc(struct drm_dsc_config *dsc,
> return DIV_ROUND_UP(total_pixels, dsc->slice_width);
> }
>
> -static void dpu_encoder_dsc_pipe_cfg(struct dpu_hw_dsc *hw_dsc,
> +static void dpu_encoder_dsc_pipe_cfg(struct dpu_encoder_virt *dpu_enc,
> + struct dpu_hw_dsc *hw_dsc,
> struct dpu_hw_pingpong *hw_pp,
> struct drm_dsc_config *dsc,
> u32 common_mode,
> u32 initial_lines)
> {
> + struct dpu_encoder_phys *cur_master = dpu_enc->cur_master;
> + struct dpu_hw_ctl *ctl;
> +
> + ctl = cur_master->hw_ctl;

Just for my understanding: if we have a bonded DSI @ sdm845, should both
flashes go to the master CTL or each flush should go to the
corresponding CTL?

I'm going to send patches that utilize single CTL for sm8150+ after the
DSC lands, so I'd like to understand this part.

> +
> if (hw_dsc->ops.dsc_config)
> hw_dsc->ops.dsc_config(hw_dsc, dsc, common_mode, initial_lines);
>
> @@ -1854,6 +1860,9 @@ static void dpu_encoder_dsc_pipe_cfg(struct dpu_hw_dsc *hw_dsc,
>
> if (hw_pp->ops.enable_dsc)
> hw_pp->ops.enable_dsc(hw_pp);
> +
> + if (ctl->ops.update_pending_flush_dsc)
> + ctl->ops.update_pending_flush_dsc(ctl, hw_dsc->idx);
> }
>
> static void dpu_encoder_prep_dsc(struct dpu_encoder_virt *dpu_enc,
> @@ -1898,7 +1907,8 @@ static void dpu_encoder_prep_dsc(struct dpu_encoder_virt *dpu_enc,
> initial_lines = dpu_encoder_dsc_initial_line_calc(dsc, enc_ip_w);
>
> for (i = 0; i < MAX_CHANNELS_PER_ENC; i++)
> - dpu_encoder_dsc_pipe_cfg(hw_dsc[i], hw_pp[i], dsc, dsc_common_mode, initial_lines);
> + dpu_encoder_dsc_pipe_cfg(dpu_enc, hw_dsc[i], hw_pp[i], dsc,
> + dsc_common_mode, initial_lines);
> }
>
> void dpu_encoder_prepare_for_kickoff(struct drm_encoder *drm_enc)
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
> index 4f7cfa9..f3a50cc 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
> @@ -139,6 +139,11 @@ static inline void dpu_hw_ctl_trigger_flush_v1(struct dpu_hw_ctl *ctx)
> CTL_DSPP_n_FLUSH(dspp - DSPP_0),
> ctx->pending_dspp_flush_mask[dspp - DSPP_0]);
> }
> +
> + if (ctx->pending_flush_mask & BIT(DSC_IDX))
> + DPU_REG_WRITE(&ctx->hw, CTL_DSC_FLUSH,
> + ctx->pending_dsc_flush_mask);
> +
> DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, ctx->pending_flush_mask);
> }
>
> @@ -285,6 +290,13 @@ static void dpu_hw_ctl_update_pending_flush_merge_3d_v1(struct dpu_hw_ctl *ctx,
> ctx->pending_flush_mask |= BIT(MERGE_3D_IDX);
> }
>
> +static void dpu_hw_ctl_update_pending_flush_dsc_v1(struct dpu_hw_ctl *ctx,
> + enum dpu_dsc dsc_num)
> +{
> + ctx->pending_dsc_flush_mask |= BIT(dsc_num - DSC_0);
> + ctx->pending_flush_mask |= BIT(DSC_IDX);
> +}
> +
> static void dpu_hw_ctl_update_pending_flush_dspp(struct dpu_hw_ctl *ctx,
> enum dpu_dspp dspp, u32 dspp_sub_blk)
> {
> @@ -502,9 +514,6 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
> if ((test_bit(DPU_CTL_VM_CFG, &ctx->caps->features)))
> mode_sel = CTL_DEFAULT_GROUP_ID << 28;
>
> - if (cfg->dsc)
> - DPU_REG_WRITE(&ctx->hw, CTL_DSC_FLUSH, cfg->dsc);
> -
> if (cfg->intf_mode_sel == DPU_CTL_MODE_SEL_CMD)
> mode_sel |= BIT(17);
>
> @@ -524,10 +533,8 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
> if (cfg->merge_3d)
> DPU_REG_WRITE(c, CTL_MERGE_3D_ACTIVE,
> BIT(cfg->merge_3d - MERGE_3D_0));
> - if (cfg->dsc) {
> - DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, DSC_IDX);
> + if (cfg->dsc)
> DPU_REG_WRITE(c, CTL_DSC_ACTIVE, cfg->dsc);
> - }
> }
>
> static void dpu_hw_ctl_intf_cfg(struct dpu_hw_ctl *ctx,
> @@ -630,6 +637,9 @@ static void _setup_ctl_ops(struct dpu_hw_ctl_ops *ops,
> ops->update_pending_flush_merge_3d =
> dpu_hw_ctl_update_pending_flush_merge_3d_v1;
> ops->update_pending_flush_wb = dpu_hw_ctl_update_pending_flush_wb_v1;
> +
> + ops->update_pending_flush_dsc =
> + dpu_hw_ctl_update_pending_flush_dsc_v1;
> } else {
> ops->trigger_flush = dpu_hw_ctl_trigger_flush;
> ops->setup_intf_cfg = dpu_hw_ctl_intf_cfg;
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
> index 6292002..d4869a0 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
> @@ -158,6 +158,15 @@ struct dpu_hw_ctl_ops {
> enum dpu_dspp blk, u32 dspp_sub_blk);
>
> /**
> + * OR in the given flushbits to the cached pending_(dsc_)flush_mask
> + * No effect on hardware
> + * @ctx : ctl path ctx pointer
> + * @blk : interface block index
> + */
> + void (*update_pending_flush_dsc)(struct dpu_hw_ctl *ctx,
> + enum dpu_dsc blk);
> +
> + /**
> * Write the value of the pending_flush_mask to hardware
> * @ctx : ctl path ctx pointer
> */
> @@ -245,6 +254,7 @@ struct dpu_hw_ctl {
> u32 pending_wb_flush_mask;
> u32 pending_merge_3d_flush_mask;
> u32 pending_dspp_flush_mask[DSPP_MAX - DSPP_0];
> + u32 pending_dsc_flush_mask;
>
> /* ops */
> struct dpu_hw_ctl_ops ops;

--
With best wishes
Dmitry

2023-05-12 19:18:30

[permalink] [raw]

Subject: Re: [PATCH v8 6/8] drm/msm/dpu: separate DSC flush update out of interface

On 12/05/2023 21:47, Abhinav Kumar wrote:
>
>
> On 5/12/2023 11:21 AM, Dmitry Baryshkov wrote:
>> On 12/05/2023 21:00, Kuogee Hsieh wrote:
>>> Current DSC flush update is piggyback inside dpu_hw_ctl_intf_cfg_v1().
>>> This patch separates DSC flush away from dpu_hw_ctl_intf_cfg_v1() by
>>> adding dpu_hw_ctl_update_pending_flush_dsc_v1() to handle both per
>>> DSC engine and DSC flush bits at same time to make it consistent with
>>> the location of flush programming of other dpu sub blocks.
>>>
>>> Signed-off-by: Kuogee Hsieh <[email protected]>
>>> Reviewed-by: Dmitry Baryshkov <[email protected]>
>>> ---
>>> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 14 ++++++++++++--
>>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 22
>>> ++++++++++++++++------
>>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h | 10 ++++++++++
>>> 3 files changed, 38 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>>> index ffa6f04..5cae70e 100644
>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>>> @@ -1834,12 +1834,18 @@ dpu_encoder_dsc_initial_line_calc(struct
>>> drm_dsc_config *dsc,
>>>       return DIV_ROUND_UP(total_pixels, dsc->slice_width);
>>> }
>>> -static void dpu_encoder_dsc_pipe_cfg(struct dpu_hw_dsc *hw_dsc,
>>> +static void dpu_encoder_dsc_pipe_cfg(struct dpu_encoder_virt *dpu_enc,
>>> +                     struct dpu_hw_dsc *hw_dsc,
>>>                        struct dpu_hw_pingpong *hw_pp,
>>>                        struct drm_dsc_config *dsc,
>>>                        u32 common_mode,
>>>                        u32 initial_lines)
>>> {
>>> +    struct dpu_encoder_phys *cur_master = dpu_enc->cur_master;
>>> +    struct dpu_hw_ctl *ctl;
>>> +
>>> +    ctl = cur_master->hw_ctl;
>>
>> Just for my understanding: if we have a bonded DSI @ sdm845, should
>> both flashes go to the master CTL or each flush should go to the
>> corresponding CTL?
>>
>
> Is this question for DSC or just general question about flush?
>
> I dont see an explicit DSC flush needed in sdm845 at the ctl level.
>
> If the question is about general flush involving two control paths, we
> need to combine the flushes and they goto the master only. Please refer
> to below part in sde_encoder.c
And this is because we have a single CTL to flush on sm8150+, isn't it?

>
> 4243     /* for split flush, combine pending flush masks and send to
> master */
> 4244     if (pending_flush.pending_flush_mask && sde_enc->cur_master) {
> 4245         ctl = sde_enc->cur_master->hw_ctl;
> 4246         if (config_changed && ctl->ops.reg_dma_flush)
> 4247             ctl->ops.reg_dma_flush(ctl, is_regdma_blocking);
> 4248         _sde_encoder_trigger_flush(&sde_enc->base,
> sde_enc->cur_master,
> 4249                         &pending_flush,
> 4250                         config_changed);
> 4251     }

--
With best wishes
Dmitry

2023-05-12 19:20:34

[permalink] [raw]

Subject: Re: [PATCH v8 6/8] drm/msm/dpu: separate DSC flush update out of interface

On 5/12/2023 11:50 AM, Dmitry Baryshkov wrote:
> On 12/05/2023 21:47, Abhinav Kumar wrote:
>>
>>
>> On 5/12/2023 11:21 AM, Dmitry Baryshkov wrote:
>>> On 12/05/2023 21:00, Kuogee Hsieh wrote:
>>>> Current DSC flush update is piggyback inside dpu_hw_ctl_intf_cfg_v1().
>>>> This patch separates DSC flush away from dpu_hw_ctl_intf_cfg_v1() by
>>>> adding dpu_hw_ctl_update_pending_flush_dsc_v1() to handle both per
>>>> DSC engine and DSC flush bits at same time to make it consistent with
>>>> the location of flush programming of other dpu sub blocks.
>>>>
>>>> Signed-off-by: Kuogee Hsieh <[email protected]>
>>>> Reviewed-by: Dmitry Baryshkov <[email protected]>
>>>> ---
>>>> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 14 ++++++++++++--
>>>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 22
>>>> ++++++++++++++++------
>>>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h | 10 ++++++++++
>>>> 3 files changed, 38 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>>>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>>>> index ffa6f04..5cae70e 100644
>>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>>>> @@ -1834,12 +1834,18 @@ dpu_encoder_dsc_initial_line_calc(struct
>>>> drm_dsc_config *dsc,
>>>>       return DIV_ROUND_UP(total_pixels, dsc->slice_width);
>>>> }
>>>> -static void dpu_encoder_dsc_pipe_cfg(struct dpu_hw_dsc *hw_dsc,
>>>> +static void dpu_encoder_dsc_pipe_cfg(struct dpu_encoder_virt *dpu_enc,
>>>> +                     struct dpu_hw_dsc *hw_dsc,
>>>>                        struct dpu_hw_pingpong *hw_pp,
>>>>                        struct drm_dsc_config *dsc,
>>>>                        u32 common_mode,
>>>>                        u32 initial_lines)
>>>> {
>>>> +    struct dpu_encoder_phys *cur_master = dpu_enc->cur_master;
>>>> +    struct dpu_hw_ctl *ctl;
>>>> +
>>>> +    ctl = cur_master->hw_ctl;
>>>
>>> Just for my understanding: if we have a bonded DSI @ sdm845, should
>>> both flashes go to the master CTL or each flush should go to the
>>> corresponding CTL?
>>>
>>
>> Is this question for DSC or just general question about flush?
>>
>> I dont see an explicit DSC flush needed in sdm845 at the ctl level.
>>
>> If the question is about general flush involving two control paths, we
>> need to combine the flushes and they goto the master only. Please
>> refer to below part in sde_encoder.c
> And this is because we have a single CTL to flush on sm8150+, isn't it?
>

For sm8150+, yes there will be only a single CTL to flush even in bonded
DSI mode so only one will be flushed.

So, in general, you can refer to the function
sde_encoder_phys_needs_single_flush() to decide if it needs 2 flushes or
one. That accounts for the DPU rev as well.

>>
>> 4243     /* for split flush, combine pending flush masks and send to
>> master */
>> 4244     if (pending_flush.pending_flush_mask && sde_enc->cur_master) {
>> 4245         ctl = sde_enc->cur_master->hw_ctl;
>> 4246         if (config_changed && ctl->ops.reg_dma_flush)
>> 4247             ctl->ops.reg_dma_flush(ctl, is_regdma_blocking);
>> 4248         _sde_encoder_trigger_flush(&sde_enc->base,
>> sde_enc->cur_master,
>> 4249                         &pending_flush,
>> 4250                         config_changed);
>> 4251     }
>
>

2023-05-12 19:42:38

[permalink] [raw]

Subject: Re: [PATCH v8 6/8] drm/msm/dpu: separate DSC flush update out of interface

On 5/12/2023 11:21 AM, Dmitry Baryshkov wrote:
> On 12/05/2023 21:00, Kuogee Hsieh wrote:
>> Current DSC flush update is piggyback inside dpu_hw_ctl_intf_cfg_v1().
>> This patch separates DSC flush away from dpu_hw_ctl_intf_cfg_v1() by
>> adding dpu_hw_ctl_update_pending_flush_dsc_v1() to handle both per
>> DSC engine and DSC flush bits at same time to make it consistent with
>> the location of flush programming of other dpu sub blocks.
>>
>> Signed-off-by: Kuogee Hsieh <[email protected]>
>> Reviewed-by: Dmitry Baryshkov <[email protected]>
>> ---
>> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 14 ++++++++++++--
>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 22 ++++++++++++++++------
>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h | 10 ++++++++++
>> 3 files changed, 38 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>> index ffa6f04..5cae70e 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>> @@ -1834,12 +1834,18 @@ dpu_encoder_dsc_initial_line_calc(struct
>> drm_dsc_config *dsc,
>>       return DIV_ROUND_UP(total_pixels, dsc->slice_width);
>> }
>> -static void dpu_encoder_dsc_pipe_cfg(struct dpu_hw_dsc *hw_dsc,
>> +static void dpu_encoder_dsc_pipe_cfg(struct dpu_encoder_virt *dpu_enc,
>> +                     struct dpu_hw_dsc *hw_dsc,
>>                        struct dpu_hw_pingpong *hw_pp,
>>                        struct drm_dsc_config *dsc,
>>                        u32 common_mode,
>>                        u32 initial_lines)
>> {
>> +    struct dpu_encoder_phys *cur_master = dpu_enc->cur_master;
>> +    struct dpu_hw_ctl *ctl;
>> +
>> +    ctl = cur_master->hw_ctl;
>
> Just for my understanding: if we have a bonded DSI @ sdm845, should both
> flashes go to the master CTL or each flush should go to the
> corresponding CTL?
>

Is this question for DSC or just general question about flush?

I dont see an explicit DSC flush needed in sdm845 at the ctl level.

If the question is about general flush involving two control paths, we
need to combine the flushes and they goto the master only. Please refer
to below part in sde_encoder.c

4243 /* for split flush, combine pending flush masks and send to master */
4244 if (pending_flush.pending_flush_mask && sde_enc->cur_master) {
4245 ctl = sde_enc->cur_master->hw_ctl;
4246 if (config_changed && ctl->ops.reg_dma_flush)
4247 ctl->ops.reg_dma_flush(ctl, is_regdma_blocking);
4248 _sde_encoder_trigger_flush(&sde_enc->base, sde_enc->cur_master,
4249 &pending_flush,
4250 config_changed);
4251 }

> I'm going to send patches that utilize single CTL for sm8150+ after the
> DSC lands, so I'd like to understand this part.
>
>> +
>>       if (hw_dsc->ops.dsc_config)
>>           hw_dsc->ops.dsc_config(hw_dsc, dsc, common_mode,
>> initial_lines);
>> @@ -1854,6 +1860,9 @@ static void dpu_encoder_dsc_pipe_cfg(struct
>> dpu_hw_dsc *hw_dsc,
>>       if (hw_pp->ops.enable_dsc)
>>           hw_pp->ops.enable_dsc(hw_pp);
>> +
>> +    if (ctl->ops.update_pending_flush_dsc)
>> +        ctl->ops.update_pending_flush_dsc(ctl, hw_dsc->idx);
>> }
>> static void dpu_encoder_prep_dsc(struct dpu_encoder_virt *dpu_enc,
>> @@ -1898,7 +1907,8 @@ static void dpu_encoder_prep_dsc(struct
>> dpu_encoder_virt *dpu_enc,
>>       initial_lines = dpu_encoder_dsc_initial_line_calc(dsc, enc_ip_w);
>>       for (i = 0; i < MAX_CHANNELS_PER_ENC; i++)
>> -        dpu_encoder_dsc_pipe_cfg(hw_dsc[i], hw_pp[i], dsc,
>> dsc_common_mode, initial_lines);
>> +        dpu_encoder_dsc_pipe_cfg(dpu_enc, hw_dsc[i], hw_pp[i], dsc,
>> +                     dsc_common_mode, initial_lines);
>> }
>> void dpu_encoder_prepare_for_kickoff(struct drm_encoder *drm_enc)
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
>> index 4f7cfa9..f3a50cc 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
>> @@ -139,6 +139,11 @@ static inline void
>> dpu_hw_ctl_trigger_flush_v1(struct dpu_hw_ctl *ctx)
>>                   CTL_DSPP_n_FLUSH(dspp - DSPP_0),
>>                   ctx->pending_dspp_flush_mask[dspp - DSPP_0]);
>>           }
>> +
>> +    if (ctx->pending_flush_mask & BIT(DSC_IDX))
>> +        DPU_REG_WRITE(&ctx->hw, CTL_DSC_FLUSH,
>> +                  ctx->pending_dsc_flush_mask);
>> +
>>       DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, ctx->pending_flush_mask);
>> }
>> @@ -285,6 +290,13 @@ static void
>> dpu_hw_ctl_update_pending_flush_merge_3d_v1(struct dpu_hw_ctl *ctx,
>>       ctx->pending_flush_mask |= BIT(MERGE_3D_IDX);
>> }
>> +static void dpu_hw_ctl_update_pending_flush_dsc_v1(struct dpu_hw_ctl
>> *ctx,
>> +                           enum dpu_dsc dsc_num)
>> +{
>> +    ctx->pending_dsc_flush_mask |= BIT(dsc_num - DSC_0);
>> +    ctx->pending_flush_mask |= BIT(DSC_IDX);
>> +}
>> +
>> static void dpu_hw_ctl_update_pending_flush_dspp(struct dpu_hw_ctl
>> *ctx,
>>       enum dpu_dspp dspp, u32 dspp_sub_blk)
>> {
>> @@ -502,9 +514,6 @@ static void dpu_hw_ctl_intf_cfg_v1(struct
>> dpu_hw_ctl *ctx,
>>       if ((test_bit(DPU_CTL_VM_CFG, &ctx->caps->features)))
>>           mode_sel = CTL_DEFAULT_GROUP_ID << 28;
>> -    if (cfg->dsc)
>> -        DPU_REG_WRITE(&ctx->hw, CTL_DSC_FLUSH, cfg->dsc);
>> -
>>       if (cfg->intf_mode_sel == DPU_CTL_MODE_SEL_CMD)
>>           mode_sel |= BIT(17);
>> @@ -524,10 +533,8 @@ static void dpu_hw_ctl_intf_cfg_v1(struct
>> dpu_hw_ctl *ctx,
>>       if (cfg->merge_3d)
>>           DPU_REG_WRITE(c, CTL_MERGE_3D_ACTIVE,
>>                     BIT(cfg->merge_3d - MERGE_3D_0));
>> -    if (cfg->dsc) {
>> -        DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, DSC_IDX);
>> +    if (cfg->dsc)
>>           DPU_REG_WRITE(c, CTL_DSC_ACTIVE, cfg->dsc);
>> -    }
>> }
>> static void dpu_hw_ctl_intf_cfg(struct dpu_hw_ctl *ctx,
>> @@ -630,6 +637,9 @@ static void _setup_ctl_ops(struct dpu_hw_ctl_ops
>> *ops,
>>           ops->update_pending_flush_merge_3d =
>>               dpu_hw_ctl_update_pending_flush_merge_3d_v1;
>>           ops->update_pending_flush_wb =
>> dpu_hw_ctl_update_pending_flush_wb_v1;
>> +
>> +        ops->update_pending_flush_dsc =
>> +            dpu_hw_ctl_update_pending_flush_dsc_v1;
>>       } else {
>>           ops->trigger_flush = dpu_hw_ctl_trigger_flush;
>>           ops->setup_intf_cfg = dpu_hw_ctl_intf_cfg;
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
>> index 6292002..d4869a0 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
>> @@ -158,6 +158,15 @@ struct dpu_hw_ctl_ops {
>>           enum dpu_dspp blk, u32 dspp_sub_blk);
>>       /**
>> +     * OR in the given flushbits to the cached pending_(dsc_)flush_mask
>> +     * No effect on hardware
>> +     * @ctx       : ctl path ctx pointer
>> +     * @blk       : interface block index
>> +     */
>> +    void (*update_pending_flush_dsc)(struct dpu_hw_ctl *ctx,
>> +                     enum dpu_dsc blk);
>> +
>> +    /**
>>        * Write the value of the pending_flush_mask to hardware
>>        * @ctx       : ctl path ctx pointer
>>        */
>> @@ -245,6 +254,7 @@ struct dpu_hw_ctl {
>>       u32 pending_wb_flush_mask;
>>       u32 pending_merge_3d_flush_mask;
>>       u32 pending_dspp_flush_mask[DSPP_MAX - DSPP_0];
>> +    u32 pending_dsc_flush_mask;
>>       /* ops */
>>       struct dpu_hw_ctl_ops ops;
>

2023-05-12 21:07:30

[permalink] [raw]

Subject: Re: [PATCH v8 5/8] drm/msm/dpu: add support for DSC encoder v1.2 engine

On 5/12/2023 11:19 AM, Dmitry Baryshkov wrote:
> On 12/05/2023 21:00, Kuogee Hsieh wrote:
>> Add support for DSC 1.2 by providing the necessary hooks to program
>> the DPU DSC 1.2 encoder.
>>
>> Changes in v3:
>> -- fixed kernel test rebot report that "__iomem *off" is declared but
>> not
>>     used at dpu_hw_dsc_config_1_2()
>> -- unrolling thresh loops
>>
>> Changes in v4:
>> -- delete DPU_DSC_HW_REV_1_1
>> -- delete off and used real register name directly
>>
>> Changes in v7:
>> -- replace offset with sblk->enc.base
>> -- replace ss with slice
>>
>> Changes in v8:
>> -- fixed checkpatch warning
>>
>> Signed-off-by: Kuogee Hsieh <[email protected]>
>> ---
>> drivers/gpu/drm/msm/Makefile                   |   1 +
>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 32 ++-
>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h     | 14 +-
>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c | 382
>> +++++++++++++++++++++++++
>> drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c         |   7 +-
>> 5 files changed, 432 insertions(+), 4 deletions(-)
>> create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>>
>> diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
>> index b814fc8..b9af5e4 100644
>> --- a/drivers/gpu/drm/msm/Makefile
>> +++ b/drivers/gpu/drm/msm/Makefile
>> @@ -65,6 +65,7 @@ msm-$(CONFIG_DRM_MSM_DPU) += \
>>       disp/dpu1/dpu_hw_catalog.o \
>>       disp/dpu1/dpu_hw_ctl.o \
>>       disp/dpu1/dpu_hw_dsc.o \
>> +    disp/dpu1/dpu_hw_dsc_1_2.o \
>>       disp/dpu1/dpu_hw_interrupts.o \
>>       disp/dpu1/dpu_hw_intf.o \
>>       disp/dpu1/dpu_hw_lm.o \
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>> index dc0a4da..4eda2cc 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>> @@ -1,6 +1,6 @@
>> /* SPDX-License-Identifier: GPL-2.0-only */
>> /*
>> - * Copyright (c) 2022. Qualcomm Innovation Center, Inc. All rights
>> reserved.
>> + * Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All
>> rights reserved.
>>    * Copyright (c) 2015-2018, 2020 The Linux Foundation. All rights
>> reserved.
>>    */
>> @@ -244,12 +244,18 @@ enum {
>> };
>> /**
>> - * DSC features
>> + * DSC sub-blocks/features
>>    * @DPU_DSC_OUTPUT_CTRL       Configure which PINGPONG block gets
>>    *                            the pixel output from this DSC.
>> + * @DPU_DSC_HW_REV_1_2        DSC block supports dsc 1.1 and 1.2
>> + * @DPU_DSC_NATIVE_422_EN     Supports native422 and native420 encoding
>> + * @DPU_DSC_MAX
>>    */
>> enum {
>>       DPU_DSC_OUTPUT_CTRL = 0x1,
>> +    DPU_DSC_HW_REV_1_2,
>> +    DPU_DSC_NATIVE_422_EN,
>> +    DPU_DSC_MAX
>> };
>> /**
>> @@ -306,6 +312,14 @@ struct dpu_pp_blk {
>> };
>> /**
>> + * struct dpu_dsc_blk - DSC Encoder sub-blk information
>> + * @info:   HW register and features supported by this sub-blk
>> + */
>> +struct dpu_dsc_blk {
>> +    DPU_HW_SUBBLK_INFO;
>> +};
>> +
>> +/**
>>    * enum dpu_qos_lut_usage - define QoS LUT use cases
>>    */
>> enum dpu_qos_lut_usage {
>> @@ -452,6 +466,17 @@ struct dpu_pingpong_sub_blks {
>> };
>> /**
>> + * struct dpu_dsc_sub_blks - DSC sub-blks
>> + * @enc: DSC encoder sub block
>> + * @ctl: DSC controller sub block
>> + *
>> + */
>> +struct dpu_dsc_sub_blks {
>> +    struct dpu_dsc_blk enc;
>> +    struct dpu_dsc_blk ctl;
>> +};
>> +
>> +/**
>>    * dpu_clk_ctrl_type - Defines top level clock control signals
>>    */
>> enum dpu_clk_ctrl_type {
>> @@ -605,10 +630,13 @@ struct dpu_merge_3d_cfg {
>>    * struct dpu_dsc_cfg - information of DSC blocks
>>    * @id                 enum identifying this block
>>    * @base               register offset of this block
>> + * @len:               length of hardware block
>>    * @features           bit mask identifying sub-blocks/features
>> + * @sblk               sub-blocks information
>>    */
>> struct dpu_dsc_cfg {
>>       DPU_HW_BLK_INFO;
>> +    const struct dpu_dsc_sub_blks *sblk;
>> };
>> /**
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
>> index 138080a..44fd624 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
>> @@ -1,5 +1,8 @@
>> /* SPDX-License-Identifier: GPL-2.0-only */
>> -/* Copyright (c) 2020-2022, Linaro Limited */
>> +/*
>> + * Copyright (c) 2020-2022, Linaro Limited
>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights
>> reserved
>> + */
>> #ifndef _DPU_HW_DSC_H
>> #define _DPU_HW_DSC_H
>> @@ -69,6 +72,15 @@ struct dpu_hw_dsc *dpu_hw_dsc_init(const struct
>> dpu_dsc_cfg *cfg,
>>           void __iomem *addr);
>> /**
>> + * dpu_hw_dsc_init_1_2 - initializes the v1.2 DSC hw driver block
>> + * @cfg: DSC catalog entry for which driver object is required
>> + * @addr: Mapped register io address of MDP
>> + * Returns: Error code or allocated dpu_hw_dsc context
>> + */
>> +struct dpu_hw_dsc *dpu_hw_dsc_init_1_2(const struct dpu_dsc_cfg *cfg,
>> +                       void __iomem *addr);
>> +
>> +/**
>>    * dpu_hw_dsc_destroy - destroys dsc driver context
>>    * @dsc:   Pointer to dsc driver context returned by dpu_hw_dsc_init
>>    */
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>> new file mode 100644
>> index 00000000..5bd84bd
>> --- /dev/null
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>> @@ -0,0 +1,382 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2020-2021, The Linux Foundation. All rights reserved.
>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights
>> reserved
>> + */
>> +
>> +#include <drm/display/drm_dsc_helper.h>
>> +
>> +#include "dpu_kms.h"
>> +#include "dpu_hw_catalog.h"
>> +#include "dpu_hwio.h"
>> +#include "dpu_hw_mdss.h"
>> +#include "dpu_hw_dsc.h"
>> +
>> +#define DSC_CMN_MAIN_CNF           0x00
>> +
>> +/* DPU_DSC_ENC register offsets */
>> +#define ENC_DF_CTRL                0x00
>> +#define ENC_GENERAL_STATUS         0x04
>> +#define ENC_HSLICE_STATUS          0x08
>> +#define ENC_OUT_STATUS             0x0C
>> +#define ENC_INT_STAT               0x10
>> +#define ENC_INT_CLR                0x14
>> +#define ENC_INT_MASK               0x18
>> +#define DSC_MAIN_CONF              0x30
>> +#define DSC_PICTURE_SIZE           0x34
>> +#define DSC_SLICE_SIZE             0x38
>> +#define DSC_MISC_SIZE              0x3C
>> +#define DSC_HRD_DELAYS             0x40
>> +#define DSC_RC_SCALE               0x44
>> +#define DSC_RC_SCALE_INC_DEC       0x48
>> +#define DSC_RC_OFFSETS_1           0x4C
>> +#define DSC_RC_OFFSETS_2           0x50
>> +#define DSC_RC_OFFSETS_3           0x54
>> +#define DSC_RC_OFFSETS_4           0x58
>> +#define DSC_FLATNESS_QP            0x5C
>> +#define DSC_RC_MODEL_SIZE          0x60
>> +#define DSC_RC_CONFIG              0x64
>> +#define DSC_RC_BUF_THRESH_0        0x68
>> +#define DSC_RC_BUF_THRESH_1        0x6C
>> +#define DSC_RC_BUF_THRESH_2        0x70
>> +#define DSC_RC_BUF_THRESH_3        0x74
>> +#define DSC_RC_MIN_QP_0            0x78
>> +#define DSC_RC_MIN_QP_1            0x7C
>> +#define DSC_RC_MIN_QP_2            0x80
>> +#define DSC_RC_MAX_QP_0            0x84
>> +#define DSC_RC_MAX_QP_1            0x88
>> +#define DSC_RC_MAX_QP_2            0x8C
>> +#define DSC_RC_RANGE_BPG_OFFSETS_0 0x90
>> +#define DSC_RC_RANGE_BPG_OFFSETS_1 0x94
>> +#define DSC_RC_RANGE_BPG_OFFSETS_2 0x98
>> +
>> +/* DPU_DSC_CTL register offsets */
>> +#define DSC_CTL                    0x00
>> +#define DSC_CFG                    0x04
>> +#define DSC_DATA_IN_SWAP           0x08
>> +#define DSC_CLK_CTRL               0x0C
>> +
>> +static inline int _dsc_calc_ob_max_addr(struct dpu_hw_dsc *hw_dsc,
>> int num_ss)
>> +{
>> +    int max_addr = 2400 / num_ss;
>> +
>> +    if (hw_dsc->caps->features & BIT(DPU_DSC_NATIVE_422_EN))
>> +        max_addr /= 2;
>> +
>> +    return max_addr - 1;
>> +};
>> +
>> +static inline void dpu_hw_dsc_disable_1_2(struct dpu_hw_dsc *hw_dsc)
>> +{
>> +    struct dpu_hw_blk_reg_map *hw;
>> +    const struct dpu_dsc_sub_blks *sblk;
>> +
>> +    if (!hw_dsc)
>> +        return;
>> +
>> +    hw = &hw_dsc->hw;
>> +    sblk = hw_dsc->caps->sblk;
>> +    DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CFG, 0);
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + ENC_DF_CTRL, 0);
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_MAIN_CONF, 0);
>> +}
>> +
>> +static inline void dpu_hw_dsc_config_1_2(struct dpu_hw_dsc *hw_dsc,
>> +                     struct drm_dsc_config *dsc,
>> +                     u32 mode,
>> +                     u32 initial_lines)
>> +{
>> +    struct dpu_hw_blk_reg_map *hw;
>> +    const struct dpu_dsc_sub_blks *sblk;
>> +    u32 data = 0;
>> +    u32 det_thresh_flatness;
>> +    u32 num_active_slice_per_enc;
>> +    u32 bpp;
>> +
>> +    if (!hw_dsc || !dsc)
>> +        return;
>> +
>> +    hw = &hw_dsc->hw;
>> +
>> +    sblk = hw_dsc->caps->sblk;
>> +
>> +    if (mode & DSC_MODE_SPLIT_PANEL)
>> +        data |= BIT(0);
>> +
>> +    if (mode & DSC_MODE_MULTIPLEX)
>> +        data |= BIT(1);
>> +
>> +    num_active_slice_per_enc = dsc->slice_count;
>> +    if (mode & DSC_MODE_MULTIPLEX)
>> +        num_active_slice_per_enc = dsc->slice_count >> 1;
>> +
>> +    data |= (num_active_slice_per_enc & 0x3) << 7;
>> +
>> +    DPU_REG_WRITE(hw, DSC_CMN_MAIN_CNF, data);
>> +
>> +    data = (initial_lines & 0xff);
>> +
>> +    if (mode & DSC_MODE_VIDEO)
>> +        data |= BIT(9);
>> +
>> +    data |= (_dsc_calc_ob_max_addr(hw_dsc, num_active_slice_per_enc)
>> << 18);
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + ENC_DF_CTRL, data);
>> +
>> +    data = (dsc->dsc_version_minor & 0xf) << 28;
>> +    if (dsc->dsc_version_minor == 0x2) {
>> +        if (dsc->native_422)
>> +            data |= BIT(22);
>> +        if (dsc->native_420)
>> +            data |= BIT(21);
>> +    }
>> +
>> +    bpp = dsc->bits_per_pixel;
>> +    /* as per hw requirement bpp should be programmed
>> +     * twice the actual value in case of 420 or 422 encoding
>> +     */
>> +    if (dsc->native_422 || dsc->native_420)
>> +        bpp = 2 * bpp;
>> +    data |= (dsc->block_pred_enable ? 1 : 0) << 20;
>> +    data |= bpp << 10;
>> +    data |= (dsc->line_buf_depth & 0xf) << 6;
>> +    data |= dsc->convert_rgb << 4;
>> +    data |= dsc->bits_per_component & 0xf;
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_MAIN_CONF, data);
>> +
>> +    data = (dsc->pic_width & 0xffff) |
>> +        ((dsc->pic_height & 0xffff) << 16);
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_PICTURE_SIZE, data);
>> +
>> +    data = (dsc->slice_width & 0xffff) |
>> +        ((dsc->slice_height & 0xffff) << 16);
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_SLICE_SIZE, data);
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_MISC_SIZE,
>> +              (dsc->slice_chunk_size) & 0xffff);
>> +
>> +    data = (dsc->initial_xmit_delay & 0xffff) |
>> +        ((dsc->initial_dec_delay & 0xffff) << 16);
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_HRD_DELAYS, data);
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_SCALE,
>> +              dsc->initial_scale_value & 0x3f);
>> +
>> +    data = (dsc->scale_increment_interval & 0xffff) |
>> +        ((dsc->scale_decrement_interval & 0x7ff) << 16);
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_SCALE_INC_DEC, data);
>> +
>> +    data = (dsc->first_line_bpg_offset & 0x1f) |
>> +        ((dsc->second_line_bpg_offset & 0x1f) << 5);
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_1, data);
>> +
>> +    data = (dsc->nfl_bpg_offset & 0xffff) |
>> +        ((dsc->slice_bpg_offset & 0xffff) << 16);
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_2, data);
>> +
>> +    data = (dsc->initial_offset & 0xffff) |
>> +        ((dsc->final_offset & 0xffff) << 16);
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_3, data);
>> +
>> +    data = (dsc->nsl_bpg_offset & 0xffff) |
>> +        ((dsc->second_line_offset_adj & 0xffff) << 16);
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_4, data);
>> +
>> +    data = (dsc->flatness_min_qp & 0x1f);
>> +    data |= (dsc->flatness_max_qp & 0x1f) << 5;
>> +
>> +    det_thresh_flatness = drm_dsc_calculate_flatness_det_thresh(dsc);
>> +    data |= (det_thresh_flatness & 0xff) << 10;
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_FLATNESS_QP, data);
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MODEL_SIZE,
>> +              (dsc->rc_model_size) & 0xffff);
>> +
>> +    data = dsc->rc_edge_factor & 0xf;
>> +    data |= (dsc->rc_quant_incr_limit0 & 0x1f) << 8;
>> +    data |= (dsc->rc_quant_incr_limit1 & 0x1f) << 13;
>> +    data |= (dsc->rc_tgt_offset_high & 0xf) << 20;
>> +    data |= (dsc->rc_tgt_offset_low & 0xf) << 24;
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_CONFIG, data);
>> +
>> +    /* program the dsc wrapper */
>> +    data = BIT(0); /* encoder enable */
>> +    if (dsc->native_422)
>> +        data |= BIT(8);
>> +    else if (dsc->native_420)
>> +        data |= BIT(9);
>> +    if (!dsc->convert_rgb)
>> +        data |= BIT(10);
>> +    if (dsc->bits_per_component == 8)
>> +        data |= BIT(11);
>> +    if (mode & DSC_MODE_SPLIT_PANEL)
>> +        data |= BIT(12);
>> +    if (mode & DSC_MODE_MULTIPLEX)
>> +        data |= BIT(13);
>> +    if (!(mode & DSC_MODE_VIDEO))
>> +        data |= BIT(17);
>> +
>> +    DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CFG, data);
>> +}
>> +
>> +static inline void dpu_hw_dsc_config_thresh_1_2(struct dpu_hw_dsc
>> *hw_dsc,
>> +                        struct drm_dsc_config *dsc)
>> +{
>> +    struct dpu_hw_blk_reg_map *hw;
>> +    const struct dpu_dsc_sub_blks *sblk;
>> +    struct drm_dsc_rc_range_parameters *rc;
>> +
>> +    if (!hw_dsc || !dsc)
>> +        return;
>> +
>> +    hw = &hw_dsc->hw;
>> +
>> +    sblk = hw_dsc->caps->sblk;
>> +
>> +    rc = dsc->rc_range_params;
>> +
>> +    /*
>> +     * With BUF_THRESH -- 14 in total
>> +     * each register contains 4 thresh values with the last register
>> +     * containing only 2 thresh values
>> +     */
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_0,
>> +              (dsc->rc_buf_thresh[0] << 0) |
>> +              (dsc->rc_buf_thresh[1] << 8) |
>> +              (dsc->rc_buf_thresh[2] << 16) |
>> +              (dsc->rc_buf_thresh[3] << 24));
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_1,
>> +              (dsc->rc_buf_thresh[4] << 0) |
>> +              (dsc->rc_buf_thresh[5] << 8) |
>> +              (dsc->rc_buf_thresh[6] << 16) |
>> +              (dsc->rc_buf_thresh[7] << 24));
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_2,
>> +              (dsc->rc_buf_thresh[8] << 0) |
>> +              (dsc->rc_buf_thresh[9] << 8) |
>> +              (dsc->rc_buf_thresh[10] << 16) |
>> +              (dsc->rc_buf_thresh[11] << 24));
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_3,
>> +              (dsc->rc_buf_thresh[12] << 0) |
>> +              (dsc->rc_buf_thresh[13] << 8));
>> +
>> +    /*
>> +     * with min/max_QP -- 5 bits
>> +     * each register contains 5 min_qp or max_qp for total of 15
>> +     *
>> +     * With BPG_OFFSET -- 6 bits
>> +     * each register contains 5 BPG_offset for total of 15
>> +     */
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_0,
>> +              (rc[0].range_min_qp << 0) |
>> +              (rc[1].range_min_qp << 5) |
>> +              (rc[2].range_min_qp << 10) |
>> +              (rc[3].range_min_qp << 15) |
>> +              (rc[4].range_min_qp << 20));
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_0,
>> +              (rc[0].range_max_qp << 0) |
>> +              (rc[1].range_max_qp << 5) |
>> +              (rc[2].range_max_qp << 10) |
>> +              (rc[3].range_max_qp << 15) |
>> +              (rc[4].range_max_qp << 20));
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_0,
>> +              (rc[0].range_bpg_offset << 0) |
>> +              (rc[1].range_bpg_offset << 6) |
>> +              (rc[2].range_bpg_offset << 12) |
>> +              (rc[3].range_bpg_offset << 18) |
>> +              (rc[4].range_bpg_offset << 24));
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_1,
>> +              (rc[5].range_min_qp << 0) |
>> +              (rc[6].range_min_qp << 5) |
>> +              (rc[7].range_min_qp << 10) |
>> +              (rc[8].range_min_qp << 15) |
>> +              (rc[9].range_min_qp << 20));
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_1,
>> +              (rc[5].range_max_qp << 0) |
>> +              (rc[6].range_max_qp << 5) |
>> +              (rc[7].range_max_qp << 10) |
>> +              (rc[8].range_max_qp << 15) |
>> +              (rc[9].range_max_qp << 20));
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_1,
>> +              (rc[5].range_bpg_offset << 0) |
>> +              (rc[6].range_bpg_offset << 6) |
>> +              (rc[7].range_bpg_offset << 12) |
>> +              (rc[8].range_bpg_offset << 18) |
>> +              (rc[9].range_bpg_offset << 24));
>> +
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_2,
>> +              (rc[10].range_min_qp << 0) |
>> +              (rc[11].range_min_qp << 5) |
>> +              (rc[12].range_min_qp << 10) |
>> +              (rc[13].range_min_qp << 15) |
>> +              (rc[14].range_min_qp << 20));
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_2,
>> +              (rc[10].range_max_qp << 0) |
>> +              (rc[11].range_max_qp << 5) |
>> +              (rc[12].range_max_qp << 10) |
>> +              (rc[13].range_max_qp << 15) |
>> +              (rc[14].range_max_qp << 20));
>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_2,
>> +              (rc[10].range_bpg_offset << 0) |
>> +              (rc[11].range_bpg_offset << 6) |
>> +              (rc[12].range_bpg_offset << 12) |
>> +              (rc[13].range_bpg_offset << 18) |
>> +              (rc[14].range_bpg_offset << 24));
>> +}
>> +
>> +static inline void dpu_hw_dsc_bind_pingpong_blk_1_2(struct
>> dpu_hw_dsc *hw_dsc,
>> +                            const enum dpu_pingpong pp)
>> +{
>> +    struct dpu_hw_blk_reg_map *hw;
>> +    const struct dpu_dsc_sub_blks *sblk;
>> +    int mux_cfg = 0xf; /* Disabled */
>> +
>> +    hw = &hw_dsc->hw;
>> +
>> +    sblk = hw_dsc->caps->sblk;
>> +
>> +    if (pp)
>> +        mux_cfg = (pp - PINGPONG_0) & 0x7;
>
> Do we need an unbind support here like we do for the DSC 1.1?

PINGPONG_NONE is used for unbind. (exactly same as DSC 1.1).

Are you wand DRM_DEBUG_KMS(...) add here same as DSC 1.1?

>
>> +
>> +    DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CTL, mux_cfg);
>> +}
>> +
>> +static void _setup_dcs_ops_1_2(struct dpu_hw_dsc_ops *ops,
>> +                   const unsigned long features)
>> +{
>> +    ops->dsc_disable = dpu_hw_dsc_disable_1_2;
>> +    ops->dsc_config = dpu_hw_dsc_config_1_2;
>> +    ops->dsc_config_thresh = dpu_hw_dsc_config_thresh_1_2;
>> +    ops->dsc_bind_pingpong_blk = dpu_hw_dsc_bind_pingpong_blk_1_2;
>> +}
>> +
>> +struct dpu_hw_dsc *dpu_hw_dsc_init_1_2(const struct dpu_dsc_cfg *cfg,
>> +                       void __iomem *addr)
>> +{
>> +    struct dpu_hw_dsc *c;
>> +
>> +    c = kzalloc(sizeof(*c), GFP_KERNEL);
>> +    if (!c)
>> +        return ERR_PTR(-ENOMEM);
>> +
>> +    c->hw.blk_addr = addr + cfg->base;
>> +    c->hw.log_mask = DPU_DBG_MASK_DSC;
>> +
>> +    c->idx = cfg->id;
>> +    c->caps = cfg;
>> +    _setup_dcs_ops_1_2(&c->ops, c->caps->features);
>> +
>> +    return c;
>> +}
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>> index f0fc704..502dd60 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>> @@ -1,6 +1,7 @@
>> // SPDX-License-Identifier: GPL-2.0-only
>> /*
>>    * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights
>> reserved.
>>    */
>> #define pr_fmt(fmt)    "[drm:%s] " fmt, __func__
>> @@ -246,7 +247,11 @@ int dpu_rm_init(struct dpu_rm *rm,
>>           struct dpu_hw_dsc *hw;
>>           const struct dpu_dsc_cfg *dsc = &cat->dsc[i];
>> -        hw = dpu_hw_dsc_init(dsc, mmio);
>> +        if (test_bit(DPU_DSC_HW_REV_1_2, &dsc->features))
>> +            hw = dpu_hw_dsc_init_1_2(dsc, mmio);
>> +        else
>> +            hw = dpu_hw_dsc_init(dsc, mmio);
>> +
>>           if (IS_ERR_OR_NULL(hw)) {
>>               rc = PTR_ERR(hw);
>>               DPU_ERROR("failed dsc object creation: err %d\n", rc);
>

2023-05-12 21:07:39

[permalink] [raw]

Subject: Re: [PATCH v8 5/8] drm/msm/dpu: add support for DSC encoder v1.2 engine

On 12/05/2023 23:48, Kuogee Hsieh wrote:
>
> On 5/12/2023 11:19 AM, Dmitry Baryshkov wrote:
>> On 12/05/2023 21:00, Kuogee Hsieh wrote:
>>> Add support for DSC 1.2 by providing the necessary hooks to program
>>> the DPU DSC 1.2 encoder.
>>>
>>> Changes in v3:
>>> -- fixed kernel test rebot report that "__iomem *off" is declared but
>>> not
>>>     used at dpu_hw_dsc_config_1_2()
>>> -- unrolling thresh loops
>>>
>>> Changes in v4:
>>> -- delete DPU_DSC_HW_REV_1_1
>>> -- delete off and used real register name directly
>>>
>>> Changes in v7:
>>> -- replace offset with sblk->enc.base
>>> -- replace ss with slice
>>>
>>> Changes in v8:
>>> -- fixed checkpatch warning
>>>
>>> Signed-off-by: Kuogee Hsieh <[email protected]>
>>> ---
>>> drivers/gpu/drm/msm/Makefile                   |   1 +
>>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 32 ++-
>>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h     | 14 +-
>>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c | 382
>>> +++++++++++++++++++++++++
>>> drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c         |   7 +-
>>> 5 files changed, 432 insertions(+), 4 deletions(-)
>>> create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>>>
>>> diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
>>> index b814fc8..b9af5e4 100644
>>> --- a/drivers/gpu/drm/msm/Makefile
>>> +++ b/drivers/gpu/drm/msm/Makefile
>>> @@ -65,6 +65,7 @@ msm-$(CONFIG_DRM_MSM_DPU) += \
>>>       disp/dpu1/dpu_hw_catalog.o \
>>>       disp/dpu1/dpu_hw_ctl.o \
>>>       disp/dpu1/dpu_hw_dsc.o \
>>> +    disp/dpu1/dpu_hw_dsc_1_2.o \
>>>       disp/dpu1/dpu_hw_interrupts.o \
>>>       disp/dpu1/dpu_hw_intf.o \
>>>       disp/dpu1/dpu_hw_lm.o \
>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>>> index dc0a4da..4eda2cc 100644
>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>>> @@ -1,6 +1,6 @@
>>> /* SPDX-License-Identifier: GPL-2.0-only */
>>> /*
>>> - * Copyright (c) 2022. Qualcomm Innovation Center, Inc. All rights
>>> reserved.
>>> + * Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All
>>> rights reserved.
>>>    * Copyright (c) 2015-2018, 2020 The Linux Foundation. All rights
>>> reserved.
>>>    */
>>> @@ -244,12 +244,18 @@ enum {
>>> };
>>> /**
>>> - * DSC features
>>> + * DSC sub-blocks/features
>>>    * @DPU_DSC_OUTPUT_CTRL       Configure which PINGPONG block gets
>>>    *                            the pixel output from this DSC.
>>> + * @DPU_DSC_HW_REV_1_2        DSC block supports dsc 1.1 and 1.2
>>> + * @DPU_DSC_NATIVE_422_EN     Supports native422 and native420 encoding
>>> + * @DPU_DSC_MAX
>>>    */
>>> enum {
>>>       DPU_DSC_OUTPUT_CTRL = 0x1,
>>> +    DPU_DSC_HW_REV_1_2,
>>> +    DPU_DSC_NATIVE_422_EN,
>>> +    DPU_DSC_MAX
>>> };
>>> /**
>>> @@ -306,6 +312,14 @@ struct dpu_pp_blk {
>>> };
>>> /**
>>> + * struct dpu_dsc_blk - DSC Encoder sub-blk information
>>> + * @info:   HW register and features supported by this sub-blk
>>> + */
>>> +struct dpu_dsc_blk {
>>> +    DPU_HW_SUBBLK_INFO;
>>> +};
>>> +
>>> +/**
>>>    * enum dpu_qos_lut_usage - define QoS LUT use cases
>>>    */
>>> enum dpu_qos_lut_usage {
>>> @@ -452,6 +466,17 @@ struct dpu_pingpong_sub_blks {
>>> };
>>> /**
>>> + * struct dpu_dsc_sub_blks - DSC sub-blks
>>> + * @enc: DSC encoder sub block
>>> + * @ctl: DSC controller sub block
>>> + *
>>> + */
>>> +struct dpu_dsc_sub_blks {
>>> +    struct dpu_dsc_blk enc;
>>> +    struct dpu_dsc_blk ctl;
>>> +};
>>> +
>>> +/**
>>>    * dpu_clk_ctrl_type - Defines top level clock control signals
>>>    */
>>> enum dpu_clk_ctrl_type {
>>> @@ -605,10 +630,13 @@ struct dpu_merge_3d_cfg {
>>>    * struct dpu_dsc_cfg - information of DSC blocks
>>>    * @id                 enum identifying this block
>>>    * @base               register offset of this block
>>> + * @len:               length of hardware block
>>>    * @features           bit mask identifying sub-blocks/features
>>> + * @sblk               sub-blocks information
>>>    */
>>> struct dpu_dsc_cfg {
>>>       DPU_HW_BLK_INFO;
>>> +    const struct dpu_dsc_sub_blks *sblk;
>>> };
>>> /**
>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
>>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
>>> index 138080a..44fd624 100644
>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
>>> @@ -1,5 +1,8 @@
>>> /* SPDX-License-Identifier: GPL-2.0-only */
>>> -/* Copyright (c) 2020-2022, Linaro Limited */
>>> +/*
>>> + * Copyright (c) 2020-2022, Linaro Limited
>>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights
>>> reserved
>>> + */
>>> #ifndef _DPU_HW_DSC_H
>>> #define _DPU_HW_DSC_H
>>> @@ -69,6 +72,15 @@ struct dpu_hw_dsc *dpu_hw_dsc_init(const struct
>>> dpu_dsc_cfg *cfg,
>>>           void __iomem *addr);
>>> /**
>>> + * dpu_hw_dsc_init_1_2 - initializes the v1.2 DSC hw driver block
>>> + * @cfg: DSC catalog entry for which driver object is required
>>> + * @addr: Mapped register io address of MDP
>>> + * Returns: Error code or allocated dpu_hw_dsc context
>>> + */
>>> +struct dpu_hw_dsc *dpu_hw_dsc_init_1_2(const struct dpu_dsc_cfg *cfg,
>>> +                       void __iomem *addr);
>>> +
>>> +/**
>>>    * dpu_hw_dsc_destroy - destroys dsc driver context
>>>    * @dsc:   Pointer to dsc driver context returned by dpu_hw_dsc_init
>>>    */
>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>>> new file mode 100644
>>> index 00000000..5bd84bd
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>>> @@ -0,0 +1,382 @@
>>> +// SPDX-License-Identifier: GPL-2.0-only
>>> +/*
>>> + * Copyright (c) 2020-2021, The Linux Foundation. All rights reserved.
>>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights
>>> reserved
>>> + */
>>> +
>>> +#include <drm/display/drm_dsc_helper.h>
>>> +
>>> +#include "dpu_kms.h"
>>> +#include "dpu_hw_catalog.h"
>>> +#include "dpu_hwio.h"
>>> +#include "dpu_hw_mdss.h"
>>> +#include "dpu_hw_dsc.h"
>>> +
>>> +#define DSC_CMN_MAIN_CNF           0x00
>>> +
>>> +/* DPU_DSC_ENC register offsets */
>>> +#define ENC_DF_CTRL                0x00
>>> +#define ENC_GENERAL_STATUS         0x04
>>> +#define ENC_HSLICE_STATUS          0x08
>>> +#define ENC_OUT_STATUS             0x0C
>>> +#define ENC_INT_STAT               0x10
>>> +#define ENC_INT_CLR                0x14
>>> +#define ENC_INT_MASK               0x18
>>> +#define DSC_MAIN_CONF              0x30
>>> +#define DSC_PICTURE_SIZE           0x34
>>> +#define DSC_SLICE_SIZE             0x38
>>> +#define DSC_MISC_SIZE              0x3C
>>> +#define DSC_HRD_DELAYS             0x40
>>> +#define DSC_RC_SCALE               0x44
>>> +#define DSC_RC_SCALE_INC_DEC       0x48
>>> +#define DSC_RC_OFFSETS_1           0x4C
>>> +#define DSC_RC_OFFSETS_2           0x50
>>> +#define DSC_RC_OFFSETS_3           0x54
>>> +#define DSC_RC_OFFSETS_4           0x58
>>> +#define DSC_FLATNESS_QP            0x5C
>>> +#define DSC_RC_MODEL_SIZE          0x60
>>> +#define DSC_RC_CONFIG              0x64
>>> +#define DSC_RC_BUF_THRESH_0        0x68
>>> +#define DSC_RC_BUF_THRESH_1        0x6C
>>> +#define DSC_RC_BUF_THRESH_2        0x70
>>> +#define DSC_RC_BUF_THRESH_3        0x74
>>> +#define DSC_RC_MIN_QP_0            0x78
>>> +#define DSC_RC_MIN_QP_1            0x7C
>>> +#define DSC_RC_MIN_QP_2            0x80
>>> +#define DSC_RC_MAX_QP_0            0x84
>>> +#define DSC_RC_MAX_QP_1            0x88
>>> +#define DSC_RC_MAX_QP_2            0x8C
>>> +#define DSC_RC_RANGE_BPG_OFFSETS_0 0x90
>>> +#define DSC_RC_RANGE_BPG_OFFSETS_1 0x94
>>> +#define DSC_RC_RANGE_BPG_OFFSETS_2 0x98
>>> +
>>> +/* DPU_DSC_CTL register offsets */
>>> +#define DSC_CTL                    0x00
>>> +#define DSC_CFG                    0x04
>>> +#define DSC_DATA_IN_SWAP           0x08
>>> +#define DSC_CLK_CTRL               0x0C
>>> +
>>> +static inline int _dsc_calc_ob_max_addr(struct dpu_hw_dsc *hw_dsc,
>>> int num_ss)
>>> +{
>>> +    int max_addr = 2400 / num_ss;
>>> +
>>> +    if (hw_dsc->caps->features & BIT(DPU_DSC_NATIVE_422_EN))
>>> +        max_addr /= 2;
>>> +
>>> +    return max_addr - 1;
>>> +};
>>> +
>>> +static inline void dpu_hw_dsc_disable_1_2(struct dpu_hw_dsc *hw_dsc)
>>> +{
>>> +    struct dpu_hw_blk_reg_map *hw;
>>> +    const struct dpu_dsc_sub_blks *sblk;
>>> +
>>> +    if (!hw_dsc)
>>> +        return;
>>> +
>>> +    hw = &hw_dsc->hw;
>>> +    sblk = hw_dsc->caps->sblk;
>>> +    DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CFG, 0);
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + ENC_DF_CTRL, 0);
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_MAIN_CONF, 0);
>>> +}
>>> +
>>> +static inline void dpu_hw_dsc_config_1_2(struct dpu_hw_dsc *hw_dsc,
>>> +                     struct drm_dsc_config *dsc,
>>> +                     u32 mode,
>>> +                     u32 initial_lines)
>>> +{
>>> +    struct dpu_hw_blk_reg_map *hw;
>>> +    const struct dpu_dsc_sub_blks *sblk;
>>> +    u32 data = 0;
>>> +    u32 det_thresh_flatness;
>>> +    u32 num_active_slice_per_enc;
>>> +    u32 bpp;
>>> +
>>> +    if (!hw_dsc || !dsc)
>>> +        return;
>>> +
>>> +    hw = &hw_dsc->hw;
>>> +
>>> +    sblk = hw_dsc->caps->sblk;
>>> +
>>> +    if (mode & DSC_MODE_SPLIT_PANEL)
>>> +        data |= BIT(0);
>>> +
>>> +    if (mode & DSC_MODE_MULTIPLEX)
>>> +        data |= BIT(1);
>>> +
>>> +    num_active_slice_per_enc = dsc->slice_count;
>>> +    if (mode & DSC_MODE_MULTIPLEX)
>>> +        num_active_slice_per_enc = dsc->slice_count >> 1;
>>> +
>>> +    data |= (num_active_slice_per_enc & 0x3) << 7;
>>> +
>>> +    DPU_REG_WRITE(hw, DSC_CMN_MAIN_CNF, data);
>>> +
>>> +    data = (initial_lines & 0xff);
>>> +
>>> +    if (mode & DSC_MODE_VIDEO)
>>> +        data |= BIT(9);
>>> +
>>> +    data |= (_dsc_calc_ob_max_addr(hw_dsc, num_active_slice_per_enc)
>>> << 18);
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + ENC_DF_CTRL, data);
>>> +
>>> +    data = (dsc->dsc_version_minor & 0xf) << 28;
>>> +    if (dsc->dsc_version_minor == 0x2) {
>>> +        if (dsc->native_422)
>>> +            data |= BIT(22);
>>> +        if (dsc->native_420)
>>> +            data |= BIT(21);
>>> +    }
>>> +
>>> +    bpp = dsc->bits_per_pixel;
>>> +    /* as per hw requirement bpp should be programmed
>>> +     * twice the actual value in case of 420 or 422 encoding
>>> +     */
>>> +    if (dsc->native_422 || dsc->native_420)
>>> +        bpp = 2 * bpp;
>>> +    data |= (dsc->block_pred_enable ? 1 : 0) << 20;
>>> +    data |= bpp << 10;
>>> +    data |= (dsc->line_buf_depth & 0xf) << 6;
>>> +    data |= dsc->convert_rgb << 4;
>>> +    data |= dsc->bits_per_component & 0xf;
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_MAIN_CONF, data);
>>> +
>>> +    data = (dsc->pic_width & 0xffff) |
>>> +        ((dsc->pic_height & 0xffff) << 16);
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_PICTURE_SIZE, data);
>>> +
>>> +    data = (dsc->slice_width & 0xffff) |
>>> +        ((dsc->slice_height & 0xffff) << 16);
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_SLICE_SIZE, data);
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_MISC_SIZE,
>>> +              (dsc->slice_chunk_size) & 0xffff);
>>> +
>>> +    data = (dsc->initial_xmit_delay & 0xffff) |
>>> +        ((dsc->initial_dec_delay & 0xffff) << 16);
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_HRD_DELAYS, data);
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_SCALE,
>>> +              dsc->initial_scale_value & 0x3f);
>>> +
>>> +    data = (dsc->scale_increment_interval & 0xffff) |
>>> +        ((dsc->scale_decrement_interval & 0x7ff) << 16);
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_SCALE_INC_DEC, data);
>>> +
>>> +    data = (dsc->first_line_bpg_offset & 0x1f) |
>>> +        ((dsc->second_line_bpg_offset & 0x1f) << 5);
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_1, data);
>>> +
>>> +    data = (dsc->nfl_bpg_offset & 0xffff) |
>>> +        ((dsc->slice_bpg_offset & 0xffff) << 16);
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_2, data);
>>> +
>>> +    data = (dsc->initial_offset & 0xffff) |
>>> +        ((dsc->final_offset & 0xffff) << 16);
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_3, data);
>>> +
>>> +    data = (dsc->nsl_bpg_offset & 0xffff) |
>>> +        ((dsc->second_line_offset_adj & 0xffff) << 16);
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_4, data);
>>> +
>>> +    data = (dsc->flatness_min_qp & 0x1f);
>>> +    data |= (dsc->flatness_max_qp & 0x1f) << 5;
>>> +
>>> +    det_thresh_flatness = drm_dsc_calculate_flatness_det_thresh(dsc);
>>> +    data |= (det_thresh_flatness & 0xff) << 10;
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_FLATNESS_QP, data);
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MODEL_SIZE,
>>> +              (dsc->rc_model_size) & 0xffff);
>>> +
>>> +    data = dsc->rc_edge_factor & 0xf;
>>> +    data |= (dsc->rc_quant_incr_limit0 & 0x1f) << 8;
>>> +    data |= (dsc->rc_quant_incr_limit1 & 0x1f) << 13;
>>> +    data |= (dsc->rc_tgt_offset_high & 0xf) << 20;
>>> +    data |= (dsc->rc_tgt_offset_low & 0xf) << 24;
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_CONFIG, data);
>>> +
>>> +    /* program the dsc wrapper */
>>> +    data = BIT(0); /* encoder enable */
>>> +    if (dsc->native_422)
>>> +        data |= BIT(8);
>>> +    else if (dsc->native_420)
>>> +        data |= BIT(9);
>>> +    if (!dsc->convert_rgb)
>>> +        data |= BIT(10);
>>> +    if (dsc->bits_per_component == 8)
>>> +        data |= BIT(11);
>>> +    if (mode & DSC_MODE_SPLIT_PANEL)
>>> +        data |= BIT(12);
>>> +    if (mode & DSC_MODE_MULTIPLEX)
>>> +        data |= BIT(13);
>>> +    if (!(mode & DSC_MODE_VIDEO))
>>> +        data |= BIT(17);
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CFG, data);
>>> +}
>>> +
>>> +static inline void dpu_hw_dsc_config_thresh_1_2(struct dpu_hw_dsc
>>> *hw_dsc,
>>> +                        struct drm_dsc_config *dsc)
>>> +{
>>> +    struct dpu_hw_blk_reg_map *hw;
>>> +    const struct dpu_dsc_sub_blks *sblk;
>>> +    struct drm_dsc_rc_range_parameters *rc;
>>> +
>>> +    if (!hw_dsc || !dsc)
>>> +        return;
>>> +
>>> +    hw = &hw_dsc->hw;
>>> +
>>> +    sblk = hw_dsc->caps->sblk;
>>> +
>>> +    rc = dsc->rc_range_params;
>>> +
>>> +    /*
>>> +     * With BUF_THRESH -- 14 in total
>>> +     * each register contains 4 thresh values with the last register
>>> +     * containing only 2 thresh values
>>> +     */
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_0,
>>> +              (dsc->rc_buf_thresh[0] << 0) |
>>> +              (dsc->rc_buf_thresh[1] << 8) |
>>> +              (dsc->rc_buf_thresh[2] << 16) |
>>> +              (dsc->rc_buf_thresh[3] << 24));
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_1,
>>> +              (dsc->rc_buf_thresh[4] << 0) |
>>> +              (dsc->rc_buf_thresh[5] << 8) |
>>> +              (dsc->rc_buf_thresh[6] << 16) |
>>> +              (dsc->rc_buf_thresh[7] << 24));
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_2,
>>> +              (dsc->rc_buf_thresh[8] << 0) |
>>> +              (dsc->rc_buf_thresh[9] << 8) |
>>> +              (dsc->rc_buf_thresh[10] << 16) |
>>> +              (dsc->rc_buf_thresh[11] << 24));
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_3,
>>> +              (dsc->rc_buf_thresh[12] << 0) |
>>> +              (dsc->rc_buf_thresh[13] << 8));
>>> +
>>> +    /*
>>> +     * with min/max_QP -- 5 bits
>>> +     * each register contains 5 min_qp or max_qp for total of 15
>>> +     *
>>> +     * With BPG_OFFSET -- 6 bits
>>> +     * each register contains 5 BPG_offset for total of 15
>>> +     */
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_0,
>>> +              (rc[0].range_min_qp << 0) |
>>> +              (rc[1].range_min_qp << 5) |
>>> +              (rc[2].range_min_qp << 10) |
>>> +              (rc[3].range_min_qp << 15) |
>>> +              (rc[4].range_min_qp << 20));
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_0,
>>> +              (rc[0].range_max_qp << 0) |
>>> +              (rc[1].range_max_qp << 5) |
>>> +              (rc[2].range_max_qp << 10) |
>>> +              (rc[3].range_max_qp << 15) |
>>> +              (rc[4].range_max_qp << 20));
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_0,
>>> +              (rc[0].range_bpg_offset << 0) |
>>> +              (rc[1].range_bpg_offset << 6) |
>>> +              (rc[2].range_bpg_offset << 12) |
>>> +              (rc[3].range_bpg_offset << 18) |
>>> +              (rc[4].range_bpg_offset << 24));
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_1,
>>> +              (rc[5].range_min_qp << 0) |
>>> +              (rc[6].range_min_qp << 5) |
>>> +              (rc[7].range_min_qp << 10) |
>>> +              (rc[8].range_min_qp << 15) |
>>> +              (rc[9].range_min_qp << 20));
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_1,
>>> +              (rc[5].range_max_qp << 0) |
>>> +              (rc[6].range_max_qp << 5) |
>>> +              (rc[7].range_max_qp << 10) |
>>> +              (rc[8].range_max_qp << 15) |
>>> +              (rc[9].range_max_qp << 20));
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_1,
>>> +              (rc[5].range_bpg_offset << 0) |
>>> +              (rc[6].range_bpg_offset << 6) |
>>> +              (rc[7].range_bpg_offset << 12) |
>>> +              (rc[8].range_bpg_offset << 18) |
>>> +              (rc[9].range_bpg_offset << 24));
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_2,
>>> +              (rc[10].range_min_qp << 0) |
>>> +              (rc[11].range_min_qp << 5) |
>>> +              (rc[12].range_min_qp << 10) |
>>> +              (rc[13].range_min_qp << 15) |
>>> +              (rc[14].range_min_qp << 20));
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_2,
>>> +              (rc[10].range_max_qp << 0) |
>>> +              (rc[11].range_max_qp << 5) |
>>> +              (rc[12].range_max_qp << 10) |
>>> +              (rc[13].range_max_qp << 15) |
>>> +              (rc[14].range_max_qp << 20));
>>> +    DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_2,
>>> +              (rc[10].range_bpg_offset << 0) |
>>> +              (rc[11].range_bpg_offset << 6) |
>>> +              (rc[12].range_bpg_offset << 12) |
>>> +              (rc[13].range_bpg_offset << 18) |
>>> +              (rc[14].range_bpg_offset << 24));
>>> +}
>>> +
>>> +static inline void dpu_hw_dsc_bind_pingpong_blk_1_2(struct
>>> dpu_hw_dsc *hw_dsc,
>>> +                            const enum dpu_pingpong pp)
>>> +{
>>> +    struct dpu_hw_blk_reg_map *hw;
>>> +    const struct dpu_dsc_sub_blks *sblk;
>>> +    int mux_cfg = 0xf; /* Disabled */
>>> +
>>> +    hw = &hw_dsc->hw;
>>> +
>>> +    sblk = hw_dsc->caps->sblk;
>>> +
>>> +    if (pp)
>>> +        mux_cfg = (pp - PINGPONG_0) & 0x7;
>>
>> Do we need an unbind support here like we do for the DSC 1.1?
>
> PINGPONG_NONE is used for unbind. (exactly same as DSC 1.1).
>
> Are you wand DRM_DEBUG_KMS(...) add here same as DSC 1.1?

No, that's fine. I'd have probably preferred an explicit if-else, but
this is also fine.

Reviewed-by: Dmitry Baryshkov <[email protected]>

>
>
>>
>>> +
>>> +    DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CTL, mux_cfg);
>>> +}
>>> +
>>> +static void _setup_dcs_ops_1_2(struct dpu_hw_dsc_ops *ops,
>>> +                   const unsigned long features)
>>> +{
>>> +    ops->dsc_disable = dpu_hw_dsc_disable_1_2;
>>> +    ops->dsc_config = dpu_hw_dsc_config_1_2;
>>> +    ops->dsc_config_thresh = dpu_hw_dsc_config_thresh_1_2;
>>> +    ops->dsc_bind_pingpong_blk = dpu_hw_dsc_bind_pingpong_blk_1_2;
>>> +}
>>> +
>>> +struct dpu_hw_dsc *dpu_hw_dsc_init_1_2(const struct dpu_dsc_cfg *cfg,
>>> +                       void __iomem *addr)
>>> +{
>>> +    struct dpu_hw_dsc *c;
>>> +
>>> +    c = kzalloc(sizeof(*c), GFP_KERNEL);
>>> +    if (!c)
>>> +        return ERR_PTR(-ENOMEM);
>>> +
>>> +    c->hw.blk_addr = addr + cfg->base;
>>> +    c->hw.log_mask = DPU_DBG_MASK_DSC;
>>> +
>>> +    c->idx = cfg->id;
>>> +    c->caps = cfg;
>>> +    _setup_dcs_ops_1_2(&c->ops, c->caps->features);
>>> +
>>> +    return c;
>>> +}
>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>>> index f0fc704..502dd60 100644
>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>>> @@ -1,6 +1,7 @@
>>> // SPDX-License-Identifier: GPL-2.0-only
>>> /*
>>>    * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
>>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights
>>> reserved.
>>>    */
>>> #define pr_fmt(fmt)    "[drm:%s] " fmt, __func__
>>> @@ -246,7 +247,11 @@ int dpu_rm_init(struct dpu_rm *rm,
>>>           struct dpu_hw_dsc *hw;
>>>           const struct dpu_dsc_cfg *dsc = &cat->dsc[i];
>>> -        hw = dpu_hw_dsc_init(dsc, mmio);
>>> +        if (test_bit(DPU_DSC_HW_REV_1_2, &dsc->features))
>>> +            hw = dpu_hw_dsc_init_1_2(dsc, mmio);
>>> +        else
>>> +            hw = dpu_hw_dsc_init(dsc, mmio);
>>> +
>>>           if (IS_ERR_OR_NULL(hw)) {
>>>               rc = PTR_ERR(hw);
>>>               DPU_ERROR("failed dsc object creation: err %d\n", rc);
>>

--
With best wishes
Dmitry

2023-05-13 19:36:58

[permalink] [raw]

Subject: Re: [PATCH v8 2/8] drm/msm/dpu: add DPU_PINGPONG_DSC feature bit for DPU < 7.0.0

On 2023-05-12 11:00:17, Kuogee Hsieh wrote:
>
> DPU < 7.0.0 requires the PINGPONG block to be involved during
> DSC setting up. Since DPU >= 7.0.0, enabling and starting the DSC
> encoder engine was moved to INTF with the help of the flush mechanism.
> Add a DPU_PINGPONG_DSC feature bit to restrict the availability of
> dpu_hw_pp_setup_dsc() and dpu_hw_pp_dsc_{enable,disable}() on the
> PINGPONG block to DPU < 7.0.0 hardware, as the registers are not
> available [in the PINGPONG block] on DPU 7.0.0 and higher anymore.

Fwiw I added the brackets in the suggestion as an "up to you to include
this or not". Drop the brackets if you think this should be part of the
sentence.

> Add DPU_PINGPONG_DSC to PINGPONG_SDM845_MASK, PINGPONG_SDM845_TE2_MASK
> and PINGPONG_SM8150_MASK which is used for all DPU < 7.0 chipsets.
>
> changes in v6:
> -- split patches and rearrange to keep catalog related files at this patch
>
> changes in v7:
> -- rewording commit text as suggested at review comments
>
> Signed-off-by: Kuogee Hsieh <[email protected]>
> ---
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 6 +++---
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 4 +++-
> 2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> index 82b58c6..78e4bf6 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> @@ -76,13 +76,13 @@
> (BIT(DPU_DIM_LAYER) | BIT(DPU_MIXER_COMBINED_ALPHA))
>
> #define PINGPONG_SDM845_MASK \
> - (BIT(DPU_PINGPONG_DITHER) | BIT(DPU_PINGPONG_TE))
> + (BIT(DPU_PINGPONG_DITHER) | BIT(DPU_PINGPONG_TE) | BIT(DPU_PINGPONG_DSC))
>
> #define PINGPONG_SDM845_TE2_MASK \
> - (PINGPONG_SDM845_MASK | BIT(DPU_PINGPONG_TE2))
> + (PINGPONG_SDM845_MASK | BIT(DPU_PINGPONG_TE2) | BIT(DPU_PINGPONG_DSC))

Don't add it here, this is already in PINGPONG_SDM845_MASK.

>
> #define PINGPONG_SM8150_MASK \
> - (BIT(DPU_PINGPONG_DITHER))
> + (BIT(DPU_PINGPONG_DITHER) | BIT(DPU_PINGPONG_DSC))
>
> #define CTL_SC7280_MASK \
> (BIT(DPU_CTL_ACTIVE_CFG) | \
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> index 6ee48f0..dc0a4da 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> @@ -144,7 +144,8 @@ enum {
> * @DPU_PINGPONG_TE2 Additional tear check block for split pipes
> * @DPU_PINGPONG_SPLIT PP block supports split fifo
> * @DPU_PINGPONG_SLAVE PP block is a suitable slave for split fifo
> - * @DPU_PINGPONG_DITHER, Dither blocks
> + * @DPU_PINGPONG_DITHER Dither blocks
> + * @DPU_PINGPONG_DSC PP ops functions required for DSC

Following the other documentation wording:

PP block supports DSC

Or:

PP block has DSC enable/disable registers

- Marijn

> * @DPU_PINGPONG_MAX
> */
> enum {
> @@ -153,6 +154,7 @@ enum {
> DPU_PINGPONG_SPLIT,
> DPU_PINGPONG_SLAVE,
> DPU_PINGPONG_DITHER,
> + DPU_PINGPONG_DSC,
> DPU_PINGPONG_MAX
> };
>
> --
> 2.7.4
>

2023-05-14 21:42:37

[permalink] [raw]

Subject: Re: [PATCH v8 0/8] add DSC 1.2 dpu supports

Asked this before: change the title to "DPU support" (capital "DPU",
singular "support") if this series keeps being resent.

On 2023-05-12 11:00:15, Kuogee Hsieh wrote:
>
> This series adds the DPU side changes to support DSC 1.2 encoder. This
> was validated with both DSI DSC 1.2 panel and DP DSC 1.2 monitor.
> The DSI and DP parts will be pushed later on top of this change.
> This seriel is rebase on [1], [2] and catalog fixes from rev-4 of [3].

series*

rebased*

Also I think it's not just the catalog fixes but everything now, because
we were both touching HW block implementations?

- Marijn

> [1]: https://patchwork.freedesktop.org/series/116851/
> [2]: https://patchwork.freedesktop.org/series/116615/
> [3]: https://patchwork.freedesktop.org/series/112332/
>
> Abhinav Kumar (2):
> drm/msm/dpu: add dsc blocks for remaining chipsets in catalog
> drm/msm/dpu: add DSC 1.2 hw blocks for relevant chipsets
>
> Kuogee Hsieh (6):
> drm/msm/dpu: add DPU_PINGPONG_DSC feature bit for DPU < 7.0.0
> drm/msm/dpu: test DPU_PINGPONG_DSC bit before assign DSC ops to
> PINGPONG
> drm/msm/dpu: Introduce PINGPONG_NONE to disconnect DSC from PINGPONG
> drm/msm/dpu: add support for DSC encoder v1.2 engine
> drm/msm/dpu: separate DSC flush update out of interface
> drm/msm/dpu: tear down DSC data path when DSC disabled
>
> drivers/gpu/drm/msm/Makefile | 1 +
> .../drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h | 7 +
> .../drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h | 11 +
> .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h | 14 +
> .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h | 7 +
> .../drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 16 +
> .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h | 14 +
> .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h | 14 +
> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 59 +++-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 31 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 36 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 29 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h | 10 +
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c | 14 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h | 15 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c | 382 +++++++++++++++++++++
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h | 3 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c | 6 +
> drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 7 +-
> 19 files changed, 649 insertions(+), 27 deletions(-)
> create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>
> --
> 2.7.4%%
>

2023-05-14 22:23:57

[permalink] [raw]

Subject: Re: [PATCH v8 5/8] drm/msm/dpu: add support for DSC encoder v1.2 engine

On 2023-05-12 21:19:19, Dmitry Baryshkov wrote:
<snip.
> > +static inline void dpu_hw_dsc_bind_pingpong_blk_1_2(struct dpu_hw_dsc *hw_dsc,
> > + const enum dpu_pingpong pp)
> > +{
> > + struct dpu_hw_blk_reg_map *hw;
> > + const struct dpu_dsc_sub_blks *sblk;
> > + int mux_cfg = 0xf; /* Disabled */
> > +
> > + hw = &hw_dsc->hw;
> > +
> > + sblk = hw_dsc->caps->sblk;
> > +
> > + if (pp)
> > + mux_cfg = (pp - PINGPONG_0) & 0x7;
>
> Do we need an unbind support here like we do for the DSC 1.1?
>
> > +
> > + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CTL, mux_cfg);
> > +}

<snip>

Friendly request to strip/snip unneeded context (as done in this reply)
to make it easier to spot the conversation, and replies to it.

- Marijn

2023-05-14 22:50:02

[permalink] [raw]

Subject: Re: [PATCH v8 5/8] drm/msm/dpu: add support for DSC encoder v1.2 engine

On 2023-05-12 11:00:20, Kuogee Hsieh wrote:
>
> Add support for DSC 1.2 by providing the necessary hooks to program
> the DPU DSC 1.2 encoder.
>
> Changes in v3:
> -- fixed kernel test rebot report that "__iomem *off" is declared but not
> used at dpu_hw_dsc_config_1_2()
> -- unrolling thresh loops
>
> Changes in v4:
> -- delete DPU_DSC_HW_REV_1_1
> -- delete off and used real register name directly
>
> Changes in v7:
> -- replace offset with sblk->enc.base
> -- replace ss with slice
>
> Changes in v8:
> -- fixed checkpatch warning
>
> Signed-off-by: Kuogee Hsieh <[email protected]>
> ---
> drivers/gpu/drm/msm/Makefile | 1 +
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 32 ++-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h | 14 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c | 382 +++++++++++++++++++++++++
> drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 7 +-
> 5 files changed, 432 insertions(+), 4 deletions(-)
> create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>
> diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
> index b814fc8..b9af5e4 100644
> --- a/drivers/gpu/drm/msm/Makefile
> +++ b/drivers/gpu/drm/msm/Makefile
> @@ -65,6 +65,7 @@ msm-$(CONFIG_DRM_MSM_DPU) += \
> disp/dpu1/dpu_hw_catalog.o \
> disp/dpu1/dpu_hw_ctl.o \
> disp/dpu1/dpu_hw_dsc.o \
> + disp/dpu1/dpu_hw_dsc_1_2.o \
> disp/dpu1/dpu_hw_interrupts.o \
> disp/dpu1/dpu_hw_intf.o \
> disp/dpu1/dpu_hw_lm.o \
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> index dc0a4da..4eda2cc 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> @@ -1,6 +1,6 @@
> /* SPDX-License-Identifier: GPL-2.0-only */
> /*
> - * Copyright (c) 2022. Qualcomm Innovation Center, Inc. All rights reserved.
> + * Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
> * Copyright (c) 2015-2018, 2020 The Linux Foundation. All rights reserved.
> */
>
> @@ -244,12 +244,18 @@ enum {
> };
>
> /**
> - * DSC features
> + * DSC sub-blocks/features
> * @DPU_DSC_OUTPUT_CTRL Configure which PINGPONG block gets
> * the pixel output from this DSC.
> + * @DPU_DSC_HW_REV_1_2 DSC block supports dsc 1.1 and 1.2

Nit: dsc -> DSC

> + * @DPU_DSC_NATIVE_422_EN Supports native422 and native420 encoding

NATIVE_42x_EN?

> + * @DPU_DSC_MAX
> */
> enum {
> DPU_DSC_OUTPUT_CTRL = 0x1,
> + DPU_DSC_HW_REV_1_2,
> + DPU_DSC_NATIVE_422_EN,
> + DPU_DSC_MAX
> };
>
> /**
> @@ -306,6 +312,14 @@ struct dpu_pp_blk {
> };
>
> /**
> + * struct dpu_dsc_blk - DSC Encoder sub-blk information
> + * @info: HW register and features supported by this sub-blk
> + */
> +struct dpu_dsc_blk {
> + DPU_HW_SUBBLK_INFO;
> +};
> +
> +/**
> * enum dpu_qos_lut_usage - define QoS LUT use cases
> */
> enum dpu_qos_lut_usage {
> @@ -452,6 +466,17 @@ struct dpu_pingpong_sub_blks {
> };
>
> /**
> + * struct dpu_dsc_sub_blks - DSC sub-blks
> + * @enc: DSC encoder sub block
> + * @ctl: DSC controller sub block

Nit: sub-block, for both.

> + *

No need for this empty line.

> + */
> +struct dpu_dsc_sub_blks {
> + struct dpu_dsc_blk enc;
> + struct dpu_dsc_blk ctl;
> +};
> +
> +/**
> * dpu_clk_ctrl_type - Defines top level clock control signals
> */
> enum dpu_clk_ctrl_type {
> @@ -605,10 +630,13 @@ struct dpu_merge_3d_cfg {
> * struct dpu_dsc_cfg - information of DSC blocks
> * @id enum identifying this block
> * @base register offset of this block
> + * @len: length of hardware block
> * @features bit mask identifying sub-blocks/features
> + * @sblk sub-blocks information

Add trailing colon like you did for @len (we will add it to every other
field doc in a future fixing pass).

> */
> struct dpu_dsc_cfg {
> DPU_HW_BLK_INFO;
> + const struct dpu_dsc_sub_blks *sblk;
> };
>
> /**
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
> index 138080a..44fd624 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
> @@ -1,5 +1,8 @@
> /* SPDX-License-Identifier: GPL-2.0-only */
> -/* Copyright (c) 2020-2022, Linaro Limited */
> +/*
> + * Copyright (c) 2020-2022, Linaro Limited
> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved
> + */
>
> #ifndef _DPU_HW_DSC_H
> #define _DPU_HW_DSC_H
> @@ -69,6 +72,15 @@ struct dpu_hw_dsc *dpu_hw_dsc_init(const struct dpu_dsc_cfg *cfg,
> void __iomem *addr);
>
> /**
> + * dpu_hw_dsc_init_1_2 - initializes the v1.2 DSC hw driver block

Add ()

We recently normalized this to "DSC hw driver object."

> + * @cfg: DSC catalog entry for which driver object is required
> + * @addr: Mapped register io address of MDP
> + * Returns: Error code or allocated dpu_hw_dsc context
> + */
> +struct dpu_hw_dsc *dpu_hw_dsc_init_1_2(const struct dpu_dsc_cfg *cfg,
> + void __iomem *addr);
> +
> +/**
> * dpu_hw_dsc_destroy - destroys dsc driver context
> * @dsc: Pointer to dsc driver context returned by dpu_hw_dsc_init
> */
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
> new file mode 100644
> index 00000000..5bd84bd
> --- /dev/null
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
> @@ -0,0 +1,382 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2020-2021, The Linux Foundation. All rights reserved.
> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved
> + */
> +
> +#include <drm/display/drm_dsc_helper.h>
> +
> +#include "dpu_kms.h"
> +#include "dpu_hw_catalog.h"
> +#include "dpu_hwio.h"
> +#include "dpu_hw_mdss.h"
> +#include "dpu_hw_dsc.h"
> +
> +#define DSC_CMN_MAIN_CNF 0x00
> +
> +/* DPU_DSC_ENC register offsets */
> +#define ENC_DF_CTRL 0x00
> +#define ENC_GENERAL_STATUS 0x04
> +#define ENC_HSLICE_STATUS 0x08
> +#define ENC_OUT_STATUS 0x0C
> +#define ENC_INT_STAT 0x10
> +#define ENC_INT_CLR 0x14
> +#define ENC_INT_MASK 0x18
> +#define DSC_MAIN_CONF 0x30
> +#define DSC_PICTURE_SIZE 0x34
> +#define DSC_SLICE_SIZE 0x38
> +#define DSC_MISC_SIZE 0x3C
> +#define DSC_HRD_DELAYS 0x40
> +#define DSC_RC_SCALE 0x44
> +#define DSC_RC_SCALE_INC_DEC 0x48
> +#define DSC_RC_OFFSETS_1 0x4C
> +#define DSC_RC_OFFSETS_2 0x50
> +#define DSC_RC_OFFSETS_3 0x54
> +#define DSC_RC_OFFSETS_4 0x58
> +#define DSC_FLATNESS_QP 0x5C
> +#define DSC_RC_MODEL_SIZE 0x60
> +#define DSC_RC_CONFIG 0x64
> +#define DSC_RC_BUF_THRESH_0 0x68
> +#define DSC_RC_BUF_THRESH_1 0x6C
> +#define DSC_RC_BUF_THRESH_2 0x70
> +#define DSC_RC_BUF_THRESH_3 0x74
> +#define DSC_RC_MIN_QP_0 0x78
> +#define DSC_RC_MIN_QP_1 0x7C
> +#define DSC_RC_MIN_QP_2 0x80
> +#define DSC_RC_MAX_QP_0 0x84
> +#define DSC_RC_MAX_QP_1 0x88
> +#define DSC_RC_MAX_QP_2 0x8C
> +#define DSC_RC_RANGE_BPG_OFFSETS_0 0x90
> +#define DSC_RC_RANGE_BPG_OFFSETS_1 0x94
> +#define DSC_RC_RANGE_BPG_OFFSETS_2 0x98
> +
> +/* DPU_DSC_CTL register offsets */
> +#define DSC_CTL 0x00
> +#define DSC_CFG 0x04
> +#define DSC_DATA_IN_SWAP 0x08
> +#define DSC_CLK_CTRL 0x0C
> +
> +static inline int _dsc_calc_ob_max_addr(struct dpu_hw_dsc *hw_dsc, int num_ss)

Can you write out "ob" fully?

These don't need to be marked "inline", same below.

> +{
> + int max_addr = 2400 / num_ss;

ss -> slice (or subslice), right?

> +
> + if (hw_dsc->caps->features & BIT(DPU_DSC_NATIVE_422_EN))
> + max_addr /= 2;
> +
> + return max_addr - 1;
> +};
> +
> +static inline void dpu_hw_dsc_disable_1_2(struct dpu_hw_dsc *hw_dsc)
> +{
> + struct dpu_hw_blk_reg_map *hw;
> + const struct dpu_dsc_sub_blks *sblk;
> +
> + if (!hw_dsc)
> + return;
> +
> + hw = &hw_dsc->hw;
> + sblk = hw_dsc->caps->sblk;
> + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CFG, 0);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + ENC_DF_CTRL, 0);
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_MAIN_CONF, 0);
> +}
> +
> +static inline void dpu_hw_dsc_config_1_2(struct dpu_hw_dsc *hw_dsc,
> + struct drm_dsc_config *dsc,
> + u32 mode,
> + u32 initial_lines)
> +{
> + struct dpu_hw_blk_reg_map *hw;
> + const struct dpu_dsc_sub_blks *sblk;
> + u32 data = 0;
> + u32 det_thresh_flatness;
> + u32 num_active_slice_per_enc;
> + u32 bpp;
> +
> + if (!hw_dsc || !dsc)
> + return;
> +
> + hw = &hw_dsc->hw;
> +
> + sblk = hw_dsc->caps->sblk;
> +
> + if (mode & DSC_MODE_SPLIT_PANEL)
> + data |= BIT(0);
> +
> + if (mode & DSC_MODE_MULTIPLEX)
> + data |= BIT(1);
> +
> + num_active_slice_per_enc = dsc->slice_count;
> + if (mode & DSC_MODE_MULTIPLEX)
> + num_active_slice_per_enc = dsc->slice_count >> 1;

A /2 would be clearer to read, IMO.

> +
> + data |= (num_active_slice_per_enc & 0x3) << 7;
> +
> + DPU_REG_WRITE(hw, DSC_CMN_MAIN_CNF, data);
> +
> + data = (initial_lines & 0xff);
> +
> + if (mode & DSC_MODE_VIDEO)
> + data |= BIT(9);
> +
> + data |= (_dsc_calc_ob_max_addr(hw_dsc, num_active_slice_per_enc) << 18);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + ENC_DF_CTRL, data);
> +
> + data = (dsc->dsc_version_minor & 0xf) << 28;
> + if (dsc->dsc_version_minor == 0x2) {
> + if (dsc->native_422)
> + data |= BIT(22);
> + if (dsc->native_420)
> + data |= BIT(21);
> + }
> +
> + bpp = dsc->bits_per_pixel;
> + /* as per hw requirement bpp should be programmed
> + * twice the actual value in case of 420 or 422 encoding
> + */
> + if (dsc->native_422 || dsc->native_420)
> + bpp = 2 * bpp;
> + data |= (dsc->block_pred_enable ? 1 : 0) << 20;

Below (e.g. convert_rgb) you just shift the bool without ternary if to 0
or 1: why here?

Or rewrite this one and convert_rgb to match the above and below:

if (dsc->block_pred_enable)
data |= BIT(20);
if (dsc->convert_rgb)
data |= BIT(4);

No need to invent multiple different styles within the same file.

> + data |= bpp << 10;

Why not move this right below the calculation of bpp? Seems odd to
intersperse block_pred_enable.

> + data |= (dsc->line_buf_depth & 0xf) << 6;
> + data |= dsc->convert_rgb << 4;
> + data |= dsc->bits_per_component & 0xf;
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_MAIN_CONF, data);
> +
> + data = (dsc->pic_width & 0xffff) |
> + ((dsc->pic_height & 0xffff) << 16);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_PICTURE_SIZE, data);
> +
> + data = (dsc->slice_width & 0xffff) |
> + ((dsc->slice_height & 0xffff) << 16);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_SLICE_SIZE, data);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_MISC_SIZE,
> + (dsc->slice_chunk_size) & 0xffff);
> +
> + data = (dsc->initial_xmit_delay & 0xffff) |
> + ((dsc->initial_dec_delay & 0xffff) << 16);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_HRD_DELAYS, data);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_SCALE,
> + dsc->initial_scale_value & 0x3f);
> +
> + data = (dsc->scale_increment_interval & 0xffff) |
> + ((dsc->scale_decrement_interval & 0x7ff) << 16);

Is it correct that increment and decrement have a different amount of
bits?

> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_SCALE_INC_DEC, data);
> +
> + data = (dsc->first_line_bpg_offset & 0x1f) |
> + ((dsc->second_line_bpg_offset & 0x1f) << 5);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_1, data);
> +
> + data = (dsc->nfl_bpg_offset & 0xffff) |
> + ((dsc->slice_bpg_offset & 0xffff) << 16);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_2, data);
> +
> + data = (dsc->initial_offset & 0xffff) |
> + ((dsc->final_offset & 0xffff) << 16);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_3, data);
> +
> + data = (dsc->nsl_bpg_offset & 0xffff) |
> + ((dsc->second_line_offset_adj & 0xffff) << 16);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_4, data);
> +
> + data = (dsc->flatness_min_qp & 0x1f);
> + data |= (dsc->flatness_max_qp & 0x1f) << 5;

Wrap these as a single data= expression to match the above.

> +
> + det_thresh_flatness = drm_dsc_calculate_flatness_det_thresh(dsc);
> + data |= (det_thresh_flatness & 0xff) << 10;

And fold this line into it, too.

> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_FLATNESS_QP, data);
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MODEL_SIZE,
> + (dsc->rc_model_size) & 0xffff);
> +
> + data = dsc->rc_edge_factor & 0xf;
> + data |= (dsc->rc_quant_incr_limit0 & 0x1f) << 8;
> + data |= (dsc->rc_quant_incr_limit1 & 0x1f) << 13;
> + data |= (dsc->rc_tgt_offset_high & 0xf) << 20;
> + data |= (dsc->rc_tgt_offset_low & 0xf) << 24;
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_CONFIG, data);
> +
> + /* program the dsc wrapper */
> + data = BIT(0); /* encoder enable */
> + if (dsc->native_422)
> + data |= BIT(8);
> + else if (dsc->native_420)
> + data |= BIT(9);
> + if (!dsc->convert_rgb)
> + data |= BIT(10);
> + if (dsc->bits_per_component == 8)
> + data |= BIT(11);
> + if (mode & DSC_MODE_SPLIT_PANEL)
> + data |= BIT(12);
> + if (mode & DSC_MODE_MULTIPLEX)
> + data |= BIT(13);
> + if (!(mode & DSC_MODE_VIDEO))
> + data |= BIT(17);
> +
> + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CFG, data);
> +}
> +
> +static inline void dpu_hw_dsc_config_thresh_1_2(struct dpu_hw_dsc *hw_dsc,
> + struct drm_dsc_config *dsc)
> +{
> + struct dpu_hw_blk_reg_map *hw;
> + const struct dpu_dsc_sub_blks *sblk;
> + struct drm_dsc_rc_range_parameters *rc;
> +
> + if (!hw_dsc || !dsc)
> + return;
> +
> + hw = &hw_dsc->hw;
> +
> + sblk = hw_dsc->caps->sblk;
> +
> + rc = dsc->rc_range_params;
> +
> + /*
> + * With BUF_THRESH -- 14 in total
> + * each register contains 4 thresh values with the last register
> + * containing only 2 thresh values
> + */
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_0,
> + (dsc->rc_buf_thresh[0] << 0) |
> + (dsc->rc_buf_thresh[1] << 8) |
> + (dsc->rc_buf_thresh[2] << 16) |
> + (dsc->rc_buf_thresh[3] << 24));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_1,
> + (dsc->rc_buf_thresh[4] << 0) |
> + (dsc->rc_buf_thresh[5] << 8) |
> + (dsc->rc_buf_thresh[6] << 16) |
> + (dsc->rc_buf_thresh[7] << 24));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_2,
> + (dsc->rc_buf_thresh[8] << 0) |
> + (dsc->rc_buf_thresh[9] << 8) |
> + (dsc->rc_buf_thresh[10] << 16) |
> + (dsc->rc_buf_thresh[11] << 24));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_3,
> + (dsc->rc_buf_thresh[12] << 0) |
> + (dsc->rc_buf_thresh[13] << 8));

I'm still thrown-off by the fact that rc_buf_thresh is u16, but assumed
everywhere to only contain 8 valid bits.

Rest of the patch appears okay, but I'll try to test it on actual
hardware too.

- Marijn

> +
> + /*
> + * with min/max_QP -- 5 bits
> + * each register contains 5 min_qp or max_qp for total of 15
> + *
> + * With BPG_OFFSET -- 6 bits
> + * each register contains 5 BPG_offset for total of 15
> + */
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_0,
> + (rc[0].range_min_qp << 0) |
> + (rc[1].range_min_qp << 5) |
> + (rc[2].range_min_qp << 10) |
> + (rc[3].range_min_qp << 15) |
> + (rc[4].range_min_qp << 20));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_0,
> + (rc[0].range_max_qp << 0) |
> + (rc[1].range_max_qp << 5) |
> + (rc[2].range_max_qp << 10) |
> + (rc[3].range_max_qp << 15) |
> + (rc[4].range_max_qp << 20));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_0,
> + (rc[0].range_bpg_offset << 0) |
> + (rc[1].range_bpg_offset << 6) |
> + (rc[2].range_bpg_offset << 12) |
> + (rc[3].range_bpg_offset << 18) |
> + (rc[4].range_bpg_offset << 24));
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_1,
> + (rc[5].range_min_qp << 0) |
> + (rc[6].range_min_qp << 5) |
> + (rc[7].range_min_qp << 10) |
> + (rc[8].range_min_qp << 15) |
> + (rc[9].range_min_qp << 20));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_1,
> + (rc[5].range_max_qp << 0) |
> + (rc[6].range_max_qp << 5) |
> + (rc[7].range_max_qp << 10) |
> + (rc[8].range_max_qp << 15) |
> + (rc[9].range_max_qp << 20));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_1,
> + (rc[5].range_bpg_offset << 0) |
> + (rc[6].range_bpg_offset << 6) |
> + (rc[7].range_bpg_offset << 12) |
> + (rc[8].range_bpg_offset << 18) |
> + (rc[9].range_bpg_offset << 24));
> +
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_2,
> + (rc[10].range_min_qp << 0) |
> + (rc[11].range_min_qp << 5) |
> + (rc[12].range_min_qp << 10) |
> + (rc[13].range_min_qp << 15) |
> + (rc[14].range_min_qp << 20));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_2,
> + (rc[10].range_max_qp << 0) |
> + (rc[11].range_max_qp << 5) |
> + (rc[12].range_max_qp << 10) |
> + (rc[13].range_max_qp << 15) |
> + (rc[14].range_max_qp << 20));
> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_2,
> + (rc[10].range_bpg_offset << 0) |
> + (rc[11].range_bpg_offset << 6) |
> + (rc[12].range_bpg_offset << 12) |
> + (rc[13].range_bpg_offset << 18) |
> + (rc[14].range_bpg_offset << 24));
> +}
> +
> +static inline void dpu_hw_dsc_bind_pingpong_blk_1_2(struct dpu_hw_dsc *hw_dsc,
> + const enum dpu_pingpong pp)
> +{
> + struct dpu_hw_blk_reg_map *hw;
> + const struct dpu_dsc_sub_blks *sblk;
> + int mux_cfg = 0xf; /* Disabled */
> +
> + hw = &hw_dsc->hw;
> +
> + sblk = hw_dsc->caps->sblk;
> +
> + if (pp)
> + mux_cfg = (pp - PINGPONG_0) & 0x7;
> +
> + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CTL, mux_cfg);
> +}
> +
> +static void _setup_dcs_ops_1_2(struct dpu_hw_dsc_ops *ops,
> + const unsigned long features)
> +{
> + ops->dsc_disable = dpu_hw_dsc_disable_1_2;
> + ops->dsc_config = dpu_hw_dsc_config_1_2;
> + ops->dsc_config_thresh = dpu_hw_dsc_config_thresh_1_2;
> + ops->dsc_bind_pingpong_blk = dpu_hw_dsc_bind_pingpong_blk_1_2;
> +}
> +
> +struct dpu_hw_dsc *dpu_hw_dsc_init_1_2(const struct dpu_dsc_cfg *cfg,
> + void __iomem *addr)
> +{
> + struct dpu_hw_dsc *c;
> +
> + c = kzalloc(sizeof(*c), GFP_KERNEL);
> + if (!c)
> + return ERR_PTR(-ENOMEM);
> +
> + c->hw.blk_addr = addr + cfg->base;
> + c->hw.log_mask = DPU_DBG_MASK_DSC;
> +
> + c->idx = cfg->id;
> + c->caps = cfg;
> + _setup_dcs_ops_1_2(&c->ops, c->caps->features);
> +
> + return c;
> +}
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> index f0fc704..502dd60 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> @@ -1,6 +1,7 @@
> // SPDX-License-Identifier: GPL-2.0-only
> /*
> * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
> */
>
> #define pr_fmt(fmt) "[drm:%s] " fmt, __func__
> @@ -246,7 +247,11 @@ int dpu_rm_init(struct dpu_rm *rm,
> struct dpu_hw_dsc *hw;
> const struct dpu_dsc_cfg *dsc = &cat->dsc[i];
>
> - hw = dpu_hw_dsc_init(dsc, mmio);
> + if (test_bit(DPU_DSC_HW_REV_1_2, &dsc->features))
> + hw = dpu_hw_dsc_init_1_2(dsc, mmio);
> + else
> + hw = dpu_hw_dsc_init(dsc, mmio);
> +
> if (IS_ERR_OR_NULL(hw)) {
> rc = PTR_ERR(hw);
> DPU_ERROR("failed dsc object creation: err %d\n", rc);
> --
> 2.7.4
>

2023-05-15 17:34:42

[permalink] [raw]

Subject: Re: [PATCH v8 5/8] drm/msm/dpu: add support for DSC encoder v1.2 engine

On 5/14/2023 3:18 PM, Marijn Suijten wrote:
> On 2023-05-12 11:00:20, Kuogee Hsieh wrote:
>> Add support for DSC 1.2 by providing the necessary hooks to program
>> the DPU DSC 1.2 encoder.
>>
>> Changes in v3:
>> -- fixed kernel test rebot report that "__iomem *off" is declared but not
>> used at dpu_hw_dsc_config_1_2()
>> -- unrolling thresh loops
>>
>> Changes in v4:
>> -- delete DPU_DSC_HW_REV_1_1
>> -- delete off and used real register name directly
>>
>> Changes in v7:
>> -- replace offset with sblk->enc.base
>> -- replace ss with slice
>>
>> Changes in v8:
>> -- fixed checkpatch warning
>>
>> Signed-off-by: Kuogee Hsieh <[email protected]>
>> ---
>> drivers/gpu/drm/msm/Makefile | 1 +
>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 32 ++-
>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h | 14 +-
>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c | 382 +++++++++++++++++++++++++
>> drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 7 +-
>> 5 files changed, 432 insertions(+), 4 deletions(-)
>> create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>>
>> diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
>> index b814fc8..b9af5e4 100644
>> --- a/drivers/gpu/drm/msm/Makefile
>> +++ b/drivers/gpu/drm/msm/Makefile
>> @@ -65,6 +65,7 @@ msm-$(CONFIG_DRM_MSM_DPU) += \
>> disp/dpu1/dpu_hw_catalog.o \
>> disp/dpu1/dpu_hw_ctl.o \
>> disp/dpu1/dpu_hw_dsc.o \
>> + disp/dpu1/dpu_hw_dsc_1_2.o \
>> disp/dpu1/dpu_hw_interrupts.o \
>> disp/dpu1/dpu_hw_intf.o \
>> disp/dpu1/dpu_hw_lm.o \
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>> index dc0a4da..4eda2cc 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>> @@ -1,6 +1,6 @@
>> /* SPDX-License-Identifier: GPL-2.0-only */
>> /*
>> - * Copyright (c) 2022. Qualcomm Innovation Center, Inc. All rights reserved.
>> + * Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
>> * Copyright (c) 2015-2018, 2020 The Linux Foundation. All rights reserved.
>> */
>>
>> @@ -244,12 +244,18 @@ enum {
>> };
>>
>> /**
>> - * DSC features
>> + * DSC sub-blocks/features
>> * @DPU_DSC_OUTPUT_CTRL Configure which PINGPONG block gets
>> * the pixel output from this DSC.
>> + * @DPU_DSC_HW_REV_1_2 DSC block supports dsc 1.1 and 1.2
> Nit: dsc -> DSC
>
>> + * @DPU_DSC_NATIVE_422_EN Supports native422 and native420 encoding
> NATIVE_42x_EN?
>
>> + * @DPU_DSC_MAX
>> */
>> enum {
>> DPU_DSC_OUTPUT_CTRL = 0x1,
>> + DPU_DSC_HW_REV_1_2,
>> + DPU_DSC_NATIVE_422_EN,
>> + DPU_DSC_MAX
>> };
>>
>> /**
>> @@ -306,6 +312,14 @@ struct dpu_pp_blk {
>> };
>>
>> /**
>> + * struct dpu_dsc_blk - DSC Encoder sub-blk information
>> + * @info: HW register and features supported by this sub-blk
>> + */
>> +struct dpu_dsc_blk {
>> + DPU_HW_SUBBLK_INFO;
>> +};
>> +
>> +/**
>> * enum dpu_qos_lut_usage - define QoS LUT use cases
>> */
>> enum dpu_qos_lut_usage {
>> @@ -452,6 +466,17 @@ struct dpu_pingpong_sub_blks {
>> };
>>
>> /**
>> + * struct dpu_dsc_sub_blks - DSC sub-blks
>> + * @enc: DSC encoder sub block
>> + * @ctl: DSC controller sub block
> Nit: sub-block, for both.
>
>> + *
> No need for this empty line.
>
>> + */
>> +struct dpu_dsc_sub_blks {
>> + struct dpu_dsc_blk enc;
>> + struct dpu_dsc_blk ctl;
>> +};
>> +
>> +/**
>> * dpu_clk_ctrl_type - Defines top level clock control signals
>> */
>> enum dpu_clk_ctrl_type {
>> @@ -605,10 +630,13 @@ struct dpu_merge_3d_cfg {
>> * struct dpu_dsc_cfg - information of DSC blocks
>> * @id enum identifying this block
>> * @base register offset of this block
>> + * @len: length of hardware block
>> * @features bit mask identifying sub-blocks/features
>> + * @sblk sub-blocks information
> Add trailing colon like you did for @len (we will add it to every other
> field doc in a future fixing pass).
>
>> */
>> struct dpu_dsc_cfg {
>> DPU_HW_BLK_INFO;
>> + const struct dpu_dsc_sub_blks *sblk;
>> };
>>
>> /**
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
>> index 138080a..44fd624 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
>> @@ -1,5 +1,8 @@
>> /* SPDX-License-Identifier: GPL-2.0-only */
>> -/* Copyright (c) 2020-2022, Linaro Limited */
>> +/*
>> + * Copyright (c) 2020-2022, Linaro Limited
>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved
>> + */
>>
>> #ifndef _DPU_HW_DSC_H
>> #define _DPU_HW_DSC_H
>> @@ -69,6 +72,15 @@ struct dpu_hw_dsc *dpu_hw_dsc_init(const struct dpu_dsc_cfg *cfg,
>> void __iomem *addr);
>>
>> /**
>> + * dpu_hw_dsc_init_1_2 - initializes the v1.2 DSC hw driver block
> Add ()
>
> We recently normalized this to "DSC hw driver object."
>
>> + * @cfg: DSC catalog entry for which driver object is required
>> + * @addr: Mapped register io address of MDP
>> + * Returns: Error code or allocated dpu_hw_dsc context
>> + */
>> +struct dpu_hw_dsc *dpu_hw_dsc_init_1_2(const struct dpu_dsc_cfg *cfg,
>> + void __iomem *addr);
>> +
>> +/**
>> * dpu_hw_dsc_destroy - destroys dsc driver context
>> * @dsc: Pointer to dsc driver context returned by dpu_hw_dsc_init
>> */
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>> new file mode 100644
>> index 00000000..5bd84bd
>> --- /dev/null
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>> @@ -0,0 +1,382 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2020-2021, The Linux Foundation. All rights reserved.
>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved
>> + */
>> +
>> +#include <drm/display/drm_dsc_helper.h>
>> +
>> +#include "dpu_kms.h"
>> +#include "dpu_hw_catalog.h"
>> +#include "dpu_hwio.h"
>> +#include "dpu_hw_mdss.h"
>> +#include "dpu_hw_dsc.h"
>> +
>> +#define DSC_CMN_MAIN_CNF 0x00
>> +
>> +/* DPU_DSC_ENC register offsets */
>> +#define ENC_DF_CTRL 0x00
>> +#define ENC_GENERAL_STATUS 0x04
>> +#define ENC_HSLICE_STATUS 0x08
>> +#define ENC_OUT_STATUS 0x0C
>> +#define ENC_INT_STAT 0x10
>> +#define ENC_INT_CLR 0x14
>> +#define ENC_INT_MASK 0x18
>> +#define DSC_MAIN_CONF 0x30
>> +#define DSC_PICTURE_SIZE 0x34
>> +#define DSC_SLICE_SIZE 0x38
>> +#define DSC_MISC_SIZE 0x3C
>> +#define DSC_HRD_DELAYS 0x40
>> +#define DSC_RC_SCALE 0x44
>> +#define DSC_RC_SCALE_INC_DEC 0x48
>> +#define DSC_RC_OFFSETS_1 0x4C
>> +#define DSC_RC_OFFSETS_2 0x50
>> +#define DSC_RC_OFFSETS_3 0x54
>> +#define DSC_RC_OFFSETS_4 0x58
>> +#define DSC_FLATNESS_QP 0x5C
>> +#define DSC_RC_MODEL_SIZE 0x60
>> +#define DSC_RC_CONFIG 0x64
>> +#define DSC_RC_BUF_THRESH_0 0x68
>> +#define DSC_RC_BUF_THRESH_1 0x6C
>> +#define DSC_RC_BUF_THRESH_2 0x70
>> +#define DSC_RC_BUF_THRESH_3 0x74
>> +#define DSC_RC_MIN_QP_0 0x78
>> +#define DSC_RC_MIN_QP_1 0x7C
>> +#define DSC_RC_MIN_QP_2 0x80
>> +#define DSC_RC_MAX_QP_0 0x84
>> +#define DSC_RC_MAX_QP_1 0x88
>> +#define DSC_RC_MAX_QP_2 0x8C
>> +#define DSC_RC_RANGE_BPG_OFFSETS_0 0x90
>> +#define DSC_RC_RANGE_BPG_OFFSETS_1 0x94
>> +#define DSC_RC_RANGE_BPG_OFFSETS_2 0x98
>> +
>> +/* DPU_DSC_CTL register offsets */
>> +#define DSC_CTL 0x00
>> +#define DSC_CFG 0x04
>> +#define DSC_DATA_IN_SWAP 0x08
>> +#define DSC_CLK_CTRL 0x0C
>> +
>> +static inline int _dsc_calc_ob_max_addr(struct dpu_hw_dsc *hw_dsc, int num_ss)
> Can you write out "ob" fully?
>
> These don't need to be marked "inline", same below.
are you means all functions in this file doe snot to be marked as inline?
>
>> +{
>> + int max_addr = 2400 / num_ss;
> ss -> slice (or subslice), right?
slice (softslice)
>
>> +
>> + if (hw_dsc->caps->features & BIT(DPU_DSC_NATIVE_422_EN))
>> + max_addr /= 2;
>> +
>> + return max_addr - 1;
>> +};
>> +
>> +static inline void dpu_hw_dsc_disable_1_2(struct dpu_hw_dsc *hw_dsc)
>> +{
>> + struct dpu_hw_blk_reg_map *hw;
>> + const struct dpu_dsc_sub_blks *sblk;
>> +
>> + if (!hw_dsc)
>> + return;
>> +
>> + hw = &hw_dsc->hw;
>> + sblk = hw_dsc->caps->sblk;
>> + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CFG, 0);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + ENC_DF_CTRL, 0);
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_MAIN_CONF, 0);
>> +}
>> +
>> +static inline void dpu_hw_dsc_config_1_2(struct dpu_hw_dsc *hw_dsc,
>> + struct drm_dsc_config *dsc,
>> + u32 mode,
>> + u32 initial_lines)
>> +{
>> + struct dpu_hw_blk_reg_map *hw;
>> + const struct dpu_dsc_sub_blks *sblk;
>> + u32 data = 0;
>> + u32 det_thresh_flatness;
>> + u32 num_active_slice_per_enc;
>> + u32 bpp;
>> +
>> + if (!hw_dsc || !dsc)
>> + return;
>> +
>> + hw = &hw_dsc->hw;
>> +
>> + sblk = hw_dsc->caps->sblk;
>> +
>> + if (mode & DSC_MODE_SPLIT_PANEL)
>> + data |= BIT(0);
>> +
>> + if (mode & DSC_MODE_MULTIPLEX)
>> + data |= BIT(1);
>> +
>> + num_active_slice_per_enc = dsc->slice_count;
>> + if (mode & DSC_MODE_MULTIPLEX)
>> + num_active_slice_per_enc = dsc->slice_count >> 1;
> A /2 would be clearer to read, IMO.
>
>> +
>> + data |= (num_active_slice_per_enc & 0x3) << 7;
>> +
>> + DPU_REG_WRITE(hw, DSC_CMN_MAIN_CNF, data);
>> +
>> + data = (initial_lines & 0xff);
>> +
>> + if (mode & DSC_MODE_VIDEO)
>> + data |= BIT(9);
>> +
>> + data |= (_dsc_calc_ob_max_addr(hw_dsc, num_active_slice_per_enc) << 18);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + ENC_DF_CTRL, data);
>> +
>> + data = (dsc->dsc_version_minor & 0xf) << 28;
>> + if (dsc->dsc_version_minor == 0x2) {
>> + if (dsc->native_422)
>> + data |= BIT(22);
>> + if (dsc->native_420)
>> + data |= BIT(21);
>> + }
>> +
>> + bpp = dsc->bits_per_pixel;
>> + /* as per hw requirement bpp should be programmed
>> + * twice the actual value in case of 420 or 422 encoding
>> + */
>> + if (dsc->native_422 || dsc->native_420)
>> + bpp = 2 * bpp;
>> + data |= (dsc->block_pred_enable ? 1 : 0) << 20;
> Below (e.g. convert_rgb) you just shift the bool without ternary if to 0
> or 1: why here?
>
> Or rewrite this one and convert_rgb to match the above and below:
>
> if (dsc->block_pred_enable)
> data |= BIT(20);
> if (dsc->convert_rgb)
> data |= BIT(4);
>
> No need to invent multiple different styles within the same file.
>
>> + data |= bpp << 10;
> Why not move this right below the calculation of bpp? Seems odd to
> intersperse block_pred_enable.
>
>> + data |= (dsc->line_buf_depth & 0xf) << 6;
>> + data |= dsc->convert_rgb << 4;
>> + data |= dsc->bits_per_component & 0xf;
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_MAIN_CONF, data);
>> +
>> + data = (dsc->pic_width & 0xffff) |
>> + ((dsc->pic_height & 0xffff) << 16);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_PICTURE_SIZE, data);
>> +
>> + data = (dsc->slice_width & 0xffff) |
>> + ((dsc->slice_height & 0xffff) << 16);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_SLICE_SIZE, data);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_MISC_SIZE,
>> + (dsc->slice_chunk_size) & 0xffff);
>> +
>> + data = (dsc->initial_xmit_delay & 0xffff) |
>> + ((dsc->initial_dec_delay & 0xffff) << 16);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_HRD_DELAYS, data);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_SCALE,
>> + dsc->initial_scale_value & 0x3f);
>> +
>> + data = (dsc->scale_increment_interval & 0xffff) |
>> + ((dsc->scale_decrement_interval & 0x7ff) << 16);
> Is it correct that increment and decrement have a different amount of
> bits?
>
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_SCALE_INC_DEC, data);
>> +
>> + data = (dsc->first_line_bpg_offset & 0x1f) |
>> + ((dsc->second_line_bpg_offset & 0x1f) << 5);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_1, data);
>> +
>> + data = (dsc->nfl_bpg_offset & 0xffff) |
>> + ((dsc->slice_bpg_offset & 0xffff) << 16);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_2, data);
>> +
>> + data = (dsc->initial_offset & 0xffff) |
>> + ((dsc->final_offset & 0xffff) << 16);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_3, data);
>> +
>> + data = (dsc->nsl_bpg_offset & 0xffff) |
>> + ((dsc->second_line_offset_adj & 0xffff) << 16);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_4, data);
>> +
>> + data = (dsc->flatness_min_qp & 0x1f);
>> + data |= (dsc->flatness_max_qp & 0x1f) << 5;
> Wrap these as a single data= expression to match the above.
>
>> +
>> + det_thresh_flatness = drm_dsc_calculate_flatness_det_thresh(dsc);
>> + data |= (det_thresh_flatness & 0xff) << 10;
> And fold this line into it, too.
>
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_FLATNESS_QP, data);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MODEL_SIZE,
>> + (dsc->rc_model_size) & 0xffff);
>> +
>> + data = dsc->rc_edge_factor & 0xf;
>> + data |= (dsc->rc_quant_incr_limit0 & 0x1f) << 8;
>> + data |= (dsc->rc_quant_incr_limit1 & 0x1f) << 13;
>> + data |= (dsc->rc_tgt_offset_high & 0xf) << 20;
>> + data |= (dsc->rc_tgt_offset_low & 0xf) << 24;
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_CONFIG, data);
>> +
>> + /* program the dsc wrapper */
>> + data = BIT(0); /* encoder enable */
>> + if (dsc->native_422)
>> + data |= BIT(8);
>> + else if (dsc->native_420)
>> + data |= BIT(9);
>> + if (!dsc->convert_rgb)
>> + data |= BIT(10);
>> + if (dsc->bits_per_component == 8)
>> + data |= BIT(11);
>> + if (mode & DSC_MODE_SPLIT_PANEL)
>> + data |= BIT(12);
>> + if (mode & DSC_MODE_MULTIPLEX)
>> + data |= BIT(13);
>> + if (!(mode & DSC_MODE_VIDEO))
>> + data |= BIT(17);
>> +
>> + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CFG, data);
>> +}
>> +
>> +static inline void dpu_hw_dsc_config_thresh_1_2(struct dpu_hw_dsc *hw_dsc,
>> + struct drm_dsc_config *dsc)
>> +{
>> + struct dpu_hw_blk_reg_map *hw;
>> + const struct dpu_dsc_sub_blks *sblk;
>> + struct drm_dsc_rc_range_parameters *rc;
>> +
>> + if (!hw_dsc || !dsc)
>> + return;
>> +
>> + hw = &hw_dsc->hw;
>> +
>> + sblk = hw_dsc->caps->sblk;
>> +
>> + rc = dsc->rc_range_params;
>> +
>> + /*
>> + * With BUF_THRESH -- 14 in total
>> + * each register contains 4 thresh values with the last register
>> + * containing only 2 thresh values
>> + */
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_0,
>> + (dsc->rc_buf_thresh[0] << 0) |
>> + (dsc->rc_buf_thresh[1] << 8) |
>> + (dsc->rc_buf_thresh[2] << 16) |
>> + (dsc->rc_buf_thresh[3] << 24));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_1,
>> + (dsc->rc_buf_thresh[4] << 0) |
>> + (dsc->rc_buf_thresh[5] << 8) |
>> + (dsc->rc_buf_thresh[6] << 16) |
>> + (dsc->rc_buf_thresh[7] << 24));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_2,
>> + (dsc->rc_buf_thresh[8] << 0) |
>> + (dsc->rc_buf_thresh[9] << 8) |
>> + (dsc->rc_buf_thresh[10] << 16) |
>> + (dsc->rc_buf_thresh[11] << 24));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_3,
>> + (dsc->rc_buf_thresh[12] << 0) |
>> + (dsc->rc_buf_thresh[13] << 8));
> I'm still thrown-off by the fact that rc_buf_thresh is u16, but assumed
> everywhere to only contain 8 valid bits.
>
> Rest of the patch appears okay, but I'll try to test it on actual
> hardware too.
>
> - Marijn
>
>> +
>> + /*
>> + * with min/max_QP -- 5 bits
>> + * each register contains 5 min_qp or max_qp for total of 15
>> + *
>> + * With BPG_OFFSET -- 6 bits
>> + * each register contains 5 BPG_offset for total of 15
>> + */
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_0,
>> + (rc[0].range_min_qp << 0) |
>> + (rc[1].range_min_qp << 5) |
>> + (rc[2].range_min_qp << 10) |
>> + (rc[3].range_min_qp << 15) |
>> + (rc[4].range_min_qp << 20));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_0,
>> + (rc[0].range_max_qp << 0) |
>> + (rc[1].range_max_qp << 5) |
>> + (rc[2].range_max_qp << 10) |
>> + (rc[3].range_max_qp << 15) |
>> + (rc[4].range_max_qp << 20));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_0,
>> + (rc[0].range_bpg_offset << 0) |
>> + (rc[1].range_bpg_offset << 6) |
>> + (rc[2].range_bpg_offset << 12) |
>> + (rc[3].range_bpg_offset << 18) |
>> + (rc[4].range_bpg_offset << 24));
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_1,
>> + (rc[5].range_min_qp << 0) |
>> + (rc[6].range_min_qp << 5) |
>> + (rc[7].range_min_qp << 10) |
>> + (rc[8].range_min_qp << 15) |
>> + (rc[9].range_min_qp << 20));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_1,
>> + (rc[5].range_max_qp << 0) |
>> + (rc[6].range_max_qp << 5) |
>> + (rc[7].range_max_qp << 10) |
>> + (rc[8].range_max_qp << 15) |
>> + (rc[9].range_max_qp << 20));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_1,
>> + (rc[5].range_bpg_offset << 0) |
>> + (rc[6].range_bpg_offset << 6) |
>> + (rc[7].range_bpg_offset << 12) |
>> + (rc[8].range_bpg_offset << 18) |
>> + (rc[9].range_bpg_offset << 24));
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_2,
>> + (rc[10].range_min_qp << 0) |
>> + (rc[11].range_min_qp << 5) |
>> + (rc[12].range_min_qp << 10) |
>> + (rc[13].range_min_qp << 15) |
>> + (rc[14].range_min_qp << 20));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_2,
>> + (rc[10].range_max_qp << 0) |
>> + (rc[11].range_max_qp << 5) |
>> + (rc[12].range_max_qp << 10) |
>> + (rc[13].range_max_qp << 15) |
>> + (rc[14].range_max_qp << 20));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_2,
>> + (rc[10].range_bpg_offset << 0) |
>> + (rc[11].range_bpg_offset << 6) |
>> + (rc[12].range_bpg_offset << 12) |
>> + (rc[13].range_bpg_offset << 18) |
>> + (rc[14].range_bpg_offset << 24));
>> +}
>> +
>> +static inline void dpu_hw_dsc_bind_pingpong_blk_1_2(struct dpu_hw_dsc *hw_dsc,
>> + const enum dpu_pingpong pp)
>> +{
>> + struct dpu_hw_blk_reg_map *hw;
>> + const struct dpu_dsc_sub_blks *sblk;
>> + int mux_cfg = 0xf; /* Disabled */
>> +
>> + hw = &hw_dsc->hw;
>> +
>> + sblk = hw_dsc->caps->sblk;
>> +
>> + if (pp)
>> + mux_cfg = (pp - PINGPONG_0) & 0x7;
>> +
>> + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CTL, mux_cfg);
>> +}
>> +
>> +static void _setup_dcs_ops_1_2(struct dpu_hw_dsc_ops *ops,
>> + const unsigned long features)
>> +{
>> + ops->dsc_disable = dpu_hw_dsc_disable_1_2;
>> + ops->dsc_config = dpu_hw_dsc_config_1_2;
>> + ops->dsc_config_thresh = dpu_hw_dsc_config_thresh_1_2;
>> + ops->dsc_bind_pingpong_blk = dpu_hw_dsc_bind_pingpong_blk_1_2;
>> +}
>> +
>> +struct dpu_hw_dsc *dpu_hw_dsc_init_1_2(const struct dpu_dsc_cfg *cfg,
>> + void __iomem *addr)
>> +{
>> + struct dpu_hw_dsc *c;
>> +
>> + c = kzalloc(sizeof(*c), GFP_KERNEL);
>> + if (!c)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + c->hw.blk_addr = addr + cfg->base;
>> + c->hw.log_mask = DPU_DBG_MASK_DSC;
>> +
>> + c->idx = cfg->id;
>> + c->caps = cfg;
>> + _setup_dcs_ops_1_2(&c->ops, c->caps->features);
>> +
>> + return c;
>> +}
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>> index f0fc704..502dd60 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>> @@ -1,6 +1,7 @@
>> // SPDX-License-Identifier: GPL-2.0-only
>> /*
>> * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
>> */
>>
>> #define pr_fmt(fmt) "[drm:%s] " fmt, __func__
>> @@ -246,7 +247,11 @@ int dpu_rm_init(struct dpu_rm *rm,
>> struct dpu_hw_dsc *hw;
>> const struct dpu_dsc_cfg *dsc = &cat->dsc[i];
>>
>> - hw = dpu_hw_dsc_init(dsc, mmio);
>> + if (test_bit(DPU_DSC_HW_REV_1_2, &dsc->features))
>> + hw = dpu_hw_dsc_init_1_2(dsc, mmio);
>> + else
>> + hw = dpu_hw_dsc_init(dsc, mmio);
>> +
>> if (IS_ERR_OR_NULL(hw)) {
>> rc = PTR_ERR(hw);
>> DPU_ERROR("failed dsc object creation: err %d\n", rc);
>> --
>> 2.7.4
>>

2023-05-15 17:57:26

[permalink] [raw]

Subject: Re: [PATCH v8 5/8] drm/msm/dpu: add support for DSC encoder v1.2 engine

On 5/14/2023 2:46 PM, Marijn Suijten wrote:
> On 2023-05-12 21:19:19, Dmitry Baryshkov wrote:
> <snip.
>>> +static inline void dpu_hw_dsc_bind_pingpong_blk_1_2(struct dpu_hw_dsc *hw_dsc,
>>> + const enum dpu_pingpong pp)
>>> +{
>>> + struct dpu_hw_blk_reg_map *hw;
>>> + const struct dpu_dsc_sub_blks *sblk;
>>> + int mux_cfg = 0xf; /* Disabled */
>>> +
>>> + hw = &hw_dsc->hw;
>>> +
>>> + sblk = hw_dsc->caps->sblk;
>>> +
>>> + if (pp)
>>> + mux_cfg = (pp - PINGPONG_0) & 0x7;
>> Do we need an unbind support here like we do for the DSC 1.1?
>>
>>> +
>>> + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CTL, mux_cfg);
>>> +}
> <snip>
>
> Friendly request to strip/snip unneeded context (as done in this reply)
> to make it easier to spot the conversation, and replies to it.
>
> - Marijn

Thanks for suggestion.

How can I do that?

just manually delete unneeded context?

Or are they other way (tricks) to do it automatically?

2023-05-15 18:02:35

[permalink] [raw]

Subject: Re: [PATCH v8 5/8] drm/msm/dpu: add support for DSC encoder v1.2 engine

On 5/14/2023 3:18 PM, Marijn Suijten wrote:
> On 2023-05-12 11:00:20, Kuogee Hsieh wrote:
>> Add support for DSC 1.2 by providing the necessary hooks to program
>> the DPU DSC 1.2 encoder.
>>
>> Changes in v3:
>> -- fixed kernel test rebot report that "__iomem *off" is declared but not
>> used at dpu_hw_dsc_config_1_2()
>> -- unrolling thresh loops
>>
>> Changes in v4:
>> -- delete DPU_DSC_HW_REV_1_1
>> -- delete off and used real register name directly
>>
>> Changes in v7:
>> -- replace offset with sblk->enc.base
>> -- replace ss with slice
>>
>> Changes in v8:
>> -- fixed checkpatch warning
>>
>> Signed-off-by: Kuogee Hsieh <[email protected]>
>> ---
>> drivers/gpu/drm/msm/Makefile | 1 +
>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 32 ++-
>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h | 14 +-
>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c | 382 +++++++++++++++++++++++++
>> drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 7 +-
>> 5 files changed, 432 insertions(+), 4 deletions(-)
>> create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>>
>> diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
>> index b814fc8..b9af5e4 100644
>> --- a/drivers/gpu/drm/msm/Makefile
>> +++ b/drivers/gpu/drm/msm/Makefile
>> @@ -65,6 +65,7 @@ msm-$(CONFIG_DRM_MSM_DPU) += \
>> disp/dpu1/dpu_hw_catalog.o \
>> disp/dpu1/dpu_hw_ctl.o \
>> disp/dpu1/dpu_hw_dsc.o \
>> + disp/dpu1/dpu_hw_dsc_1_2.o \
>> disp/dpu1/dpu_hw_interrupts.o \
>> disp/dpu1/dpu_hw_intf.o \
>> disp/dpu1/dpu_hw_lm.o \
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>> index dc0a4da..4eda2cc 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>> @@ -1,6 +1,6 @@
>> /* SPDX-License-Identifier: GPL-2.0-only */
>> /*
>> - * Copyright (c) 2022. Qualcomm Innovation Center, Inc. All rights reserved.
>> + * Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
>> * Copyright (c) 2015-2018, 2020 The Linux Foundation. All rights reserved.
>> */
>>
>> @@ -244,12 +244,18 @@ enum {
>> };
>>
>> /**
>> - * DSC features
>> + * DSC sub-blocks/features
>> * @DPU_DSC_OUTPUT_CTRL Configure which PINGPONG block gets
>> * the pixel output from this DSC.
>> + * @DPU_DSC_HW_REV_1_2 DSC block supports dsc 1.1 and 1.2
> Nit: dsc -> DSC
>
>> + * @DPU_DSC_NATIVE_422_EN Supports native422 and native420 encoding
> NATIVE_42x_EN?
>
>> + * @DPU_DSC_MAX
>> */
>> enum {
>> DPU_DSC_OUTPUT_CTRL = 0x1,
>> + DPU_DSC_HW_REV_1_2,
>> + DPU_DSC_NATIVE_422_EN,
>> + DPU_DSC_MAX
>> };
>>
>> /**
>> @@ -306,6 +312,14 @@ struct dpu_pp_blk {
>> };
>>
>> /**
>> + * struct dpu_dsc_blk - DSC Encoder sub-blk information
>> + * @info: HW register and features supported by this sub-blk
>> + */
>> +struct dpu_dsc_blk {
>> + DPU_HW_SUBBLK_INFO;
>> +};
>> +
>> +/**
>> * enum dpu_qos_lut_usage - define QoS LUT use cases
>> */
>> enum dpu_qos_lut_usage {
>> @@ -452,6 +466,17 @@ struct dpu_pingpong_sub_blks {
>> };
>>
>> /**
>> + * struct dpu_dsc_sub_blks - DSC sub-blks
>> + * @enc: DSC encoder sub block
>> + * @ctl: DSC controller sub block
> Nit: sub-block, for both.
>
>> + *
> No need for this empty line.
>
>> + */
>> +struct dpu_dsc_sub_blks {
>> + struct dpu_dsc_blk enc;
>> + struct dpu_dsc_blk ctl;
>> +};
>> +
>> +/**
>> * dpu_clk_ctrl_type - Defines top level clock control signals
>> */
>> enum dpu_clk_ctrl_type {
>> @@ -605,10 +630,13 @@ struct dpu_merge_3d_cfg {
>> * struct dpu_dsc_cfg - information of DSC blocks
>> * @id enum identifying this block
>> * @base register offset of this block
>> + * @len: length of hardware block
>> * @features bit mask identifying sub-blocks/features
>> + * @sblk sub-blocks information
> Add trailing colon like you did for @len (we will add it to every other
> field doc in a future fixing pass).
>
>> */
>> struct dpu_dsc_cfg {
>> DPU_HW_BLK_INFO;
>> + const struct dpu_dsc_sub_blks *sblk;
>> };
>>
>> /**
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
>> index 138080a..44fd624 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h
>> @@ -1,5 +1,8 @@
>> /* SPDX-License-Identifier: GPL-2.0-only */
>> -/* Copyright (c) 2020-2022, Linaro Limited */
>> +/*
>> + * Copyright (c) 2020-2022, Linaro Limited
>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved
>> + */
>>
>> #ifndef _DPU_HW_DSC_H
>> #define _DPU_HW_DSC_H
>> @@ -69,6 +72,15 @@ struct dpu_hw_dsc *dpu_hw_dsc_init(const struct dpu_dsc_cfg *cfg,
>> void __iomem *addr);
>>
>> /**
>> + * dpu_hw_dsc_init_1_2 - initializes the v1.2 DSC hw driver block
> Add ()
>
> We recently normalized this to "DSC hw driver object."
>
>> + * @cfg: DSC catalog entry for which driver object is required
>> + * @addr: Mapped register io address of MDP
>> + * Returns: Error code or allocated dpu_hw_dsc context
>> + */
>> +struct dpu_hw_dsc *dpu_hw_dsc_init_1_2(const struct dpu_dsc_cfg *cfg,
>> + void __iomem *addr);
>> +
>> +/**
>> * dpu_hw_dsc_destroy - destroys dsc driver context
>> * @dsc: Pointer to dsc driver context returned by dpu_hw_dsc_init
>> */
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>> new file mode 100644
>> index 00000000..5bd84bd
>> --- /dev/null
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc_1_2.c
>> @@ -0,0 +1,382 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2020-2021, The Linux Foundation. All rights reserved.
>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved
>> + */
>> +
>> +#include <drm/display/drm_dsc_helper.h>
>> +
>> +#include "dpu_kms.h"
>> +#include "dpu_hw_catalog.h"
>> +#include "dpu_hwio.h"
>> +#include "dpu_hw_mdss.h"
>> +#include "dpu_hw_dsc.h"
>> +
>> +#define DSC_CMN_MAIN_CNF 0x00
>> +
>> +/* DPU_DSC_ENC register offsets */
>> +#define ENC_DF_CTRL 0x00
>> +#define ENC_GENERAL_STATUS 0x04
>> +#define ENC_HSLICE_STATUS 0x08
>> +#define ENC_OUT_STATUS 0x0C
>> +#define ENC_INT_STAT 0x10
>> +#define ENC_INT_CLR 0x14
>> +#define ENC_INT_MASK 0x18
>> +#define DSC_MAIN_CONF 0x30
>> +#define DSC_PICTURE_SIZE 0x34
>> +#define DSC_SLICE_SIZE 0x38
>> +#define DSC_MISC_SIZE 0x3C
>> +#define DSC_HRD_DELAYS 0x40
>> +#define DSC_RC_SCALE 0x44
>> +#define DSC_RC_SCALE_INC_DEC 0x48
>> +#define DSC_RC_OFFSETS_1 0x4C
>> +#define DSC_RC_OFFSETS_2 0x50
>> +#define DSC_RC_OFFSETS_3 0x54
>> +#define DSC_RC_OFFSETS_4 0x58
>> +#define DSC_FLATNESS_QP 0x5C
>> +#define DSC_RC_MODEL_SIZE 0x60
>> +#define DSC_RC_CONFIG 0x64
>> +#define DSC_RC_BUF_THRESH_0 0x68
>> +#define DSC_RC_BUF_THRESH_1 0x6C
>> +#define DSC_RC_BUF_THRESH_2 0x70
>> +#define DSC_RC_BUF_THRESH_3 0x74
>> +#define DSC_RC_MIN_QP_0 0x78
>> +#define DSC_RC_MIN_QP_1 0x7C
>> +#define DSC_RC_MIN_QP_2 0x80
>> +#define DSC_RC_MAX_QP_0 0x84
>> +#define DSC_RC_MAX_QP_1 0x88
>> +#define DSC_RC_MAX_QP_2 0x8C
>> +#define DSC_RC_RANGE_BPG_OFFSETS_0 0x90
>> +#define DSC_RC_RANGE_BPG_OFFSETS_1 0x94
>> +#define DSC_RC_RANGE_BPG_OFFSETS_2 0x98
>> +
>> +/* DPU_DSC_CTL register offsets */
>> +#define DSC_CTL 0x00
>> +#define DSC_CFG 0x04
>> +#define DSC_DATA_IN_SWAP 0x08
>> +#define DSC_CLK_CTRL 0x0C
>> +
>> +static inline int _dsc_calc_ob_max_addr(struct dpu_hw_dsc *hw_dsc, int num_ss)
> Can you write out "ob" fully?
>
> These don't need to be marked "inline", same below.
>
>> +{
>> + int max_addr = 2400 / num_ss;
> ss -> slice (or subslice), right?
>
>> +
>> + if (hw_dsc->caps->features & BIT(DPU_DSC_NATIVE_422_EN))
>> + max_addr /= 2;
>> +
>> + return max_addr - 1;
>> +};
>> +
>> +static inline void dpu_hw_dsc_disable_1_2(struct dpu_hw_dsc *hw_dsc)
>> +{
>> + struct dpu_hw_blk_reg_map *hw;
>> + const struct dpu_dsc_sub_blks *sblk;
>> +
>> + if (!hw_dsc)
>> + return;
>> +
>> + hw = &hw_dsc->hw;
>> + sblk = hw_dsc->caps->sblk;
>> + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CFG, 0);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + ENC_DF_CTRL, 0);
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_MAIN_CONF, 0);
>> +}
>> +
>> +static inline void dpu_hw_dsc_config_1_2(struct dpu_hw_dsc *hw_dsc,
>> + struct drm_dsc_config *dsc,
>> + u32 mode,
>> + u32 initial_lines)
>> +{
>> + struct dpu_hw_blk_reg_map *hw;
>> + const struct dpu_dsc_sub_blks *sblk;
>> + u32 data = 0;
>> + u32 det_thresh_flatness;
>> + u32 num_active_slice_per_enc;
>> + u32 bpp;
>> +
>> + if (!hw_dsc || !dsc)
>> + return;
>> +
>> + hw = &hw_dsc->hw;
>> +
>> + sblk = hw_dsc->caps->sblk;
>> +
>> + if (mode & DSC_MODE_SPLIT_PANEL)
>> + data |= BIT(0);
>> +
>> + if (mode & DSC_MODE_MULTIPLEX)
>> + data |= BIT(1);
>> +
>> + num_active_slice_per_enc = dsc->slice_count;
>> + if (mode & DSC_MODE_MULTIPLEX)
>> + num_active_slice_per_enc = dsc->slice_count >> 1;
> A /2 would be clearer to read, IMO.
>
>> +
>> + data |= (num_active_slice_per_enc & 0x3) << 7;
>> +
>> + DPU_REG_WRITE(hw, DSC_CMN_MAIN_CNF, data);
>> +
>> + data = (initial_lines & 0xff);
>> +
>> + if (mode & DSC_MODE_VIDEO)
>> + data |= BIT(9);
>> +
>> + data |= (_dsc_calc_ob_max_addr(hw_dsc, num_active_slice_per_enc) << 18);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + ENC_DF_CTRL, data);
>> +
>> + data = (dsc->dsc_version_minor & 0xf) << 28;
>> + if (dsc->dsc_version_minor == 0x2) {
>> + if (dsc->native_422)
>> + data |= BIT(22);
>> + if (dsc->native_420)
>> + data |= BIT(21);
>> + }
>> +
>> + bpp = dsc->bits_per_pixel;
>> + /* as per hw requirement bpp should be programmed
>> + * twice the actual value in case of 420 or 422 encoding
>> + */
>> + if (dsc->native_422 || dsc->native_420)
>> + bpp = 2 * bpp;
>> + data |= (dsc->block_pred_enable ? 1 : 0) << 20;
> Below (e.g. convert_rgb) you just shift the bool without ternary if to 0
> or 1: why here?
>
> Or rewrite this one and convert_rgb to match the above and below:
>
> if (dsc->block_pred_enable)
> data |= BIT(20);
> if (dsc->convert_rgb)
> data |= BIT(4);
>
> No need to invent multiple different styles within the same file.
>
>> + data |= bpp << 10;
> Why not move this right below the calculation of bpp? Seems odd to
> intersperse block_pred_enable.
>
>> + data |= (dsc->line_buf_depth & 0xf) << 6;
>> + data |= dsc->convert_rgb << 4;
>> + data |= dsc->bits_per_component & 0xf;
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_MAIN_CONF, data);
>> +
>> + data = (dsc->pic_width & 0xffff) |
>> + ((dsc->pic_height & 0xffff) << 16);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_PICTURE_SIZE, data);
>> +
>> + data = (dsc->slice_width & 0xffff) |
>> + ((dsc->slice_height & 0xffff) << 16);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_SLICE_SIZE, data);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_MISC_SIZE,
>> + (dsc->slice_chunk_size) & 0xffff);
>> +
>> + data = (dsc->initial_xmit_delay & 0xffff) |
>> + ((dsc->initial_dec_delay & 0xffff) << 16);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_HRD_DELAYS, data);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_SCALE,
>> + dsc->initial_scale_value & 0x3f);
>> +
>> + data = (dsc->scale_increment_interval & 0xffff) |
>> + ((dsc->scale_decrement_interval & 0x7ff) << 16);
> Is it correct that increment and decrement have a different amount of
yes, 10 bits for decrement and 16 bits for increment
> bits?
>
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_SCALE_INC_DEC, data);
>> +
>> + data = (dsc->first_line_bpg_offset & 0x1f) |
>> + ((dsc->second_line_bpg_offset & 0x1f) << 5);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_1, data);
>> +
>> + data = (dsc->nfl_bpg_offset & 0xffff) |
>> + ((dsc->slice_bpg_offset & 0xffff) << 16);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_2, data);
>> +
>> + data = (dsc->initial_offset & 0xffff) |
>> + ((dsc->final_offset & 0xffff) << 16);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_3, data);
>> +
>> + data = (dsc->nsl_bpg_offset & 0xffff) |
>> + ((dsc->second_line_offset_adj & 0xffff) << 16);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_OFFSETS_4, data);
>> +
>> + data = (dsc->flatness_min_qp & 0x1f);
>> + data |= (dsc->flatness_max_qp & 0x1f) << 5;
> Wrap these as a single data= expression to match the above.
>
>> +
>> + det_thresh_flatness = drm_dsc_calculate_flatness_det_thresh(dsc);
>> + data |= (det_thresh_flatness & 0xff) << 10;
> And fold this line into it, too.
>
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_FLATNESS_QP, data);
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MODEL_SIZE,
>> + (dsc->rc_model_size) & 0xffff);
>> +
>> + data = dsc->rc_edge_factor & 0xf;
>> + data |= (dsc->rc_quant_incr_limit0 & 0x1f) << 8;
>> + data |= (dsc->rc_quant_incr_limit1 & 0x1f) << 13;
>> + data |= (dsc->rc_tgt_offset_high & 0xf) << 20;
>> + data |= (dsc->rc_tgt_offset_low & 0xf) << 24;
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_CONFIG, data);
>> +
>> + /* program the dsc wrapper */
>> + data = BIT(0); /* encoder enable */
>> + if (dsc->native_422)
>> + data |= BIT(8);
>> + else if (dsc->native_420)
>> + data |= BIT(9);
>> + if (!dsc->convert_rgb)
>> + data |= BIT(10);
>> + if (dsc->bits_per_component == 8)
>> + data |= BIT(11);
>> + if (mode & DSC_MODE_SPLIT_PANEL)
>> + data |= BIT(12);
>> + if (mode & DSC_MODE_MULTIPLEX)
>> + data |= BIT(13);
>> + if (!(mode & DSC_MODE_VIDEO))
>> + data |= BIT(17);
>> +
>> + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CFG, data);
>> +}
>> +
>> +static inline void dpu_hw_dsc_config_thresh_1_2(struct dpu_hw_dsc *hw_dsc,
>> + struct drm_dsc_config *dsc)
>> +{
>> + struct dpu_hw_blk_reg_map *hw;
>> + const struct dpu_dsc_sub_blks *sblk;
>> + struct drm_dsc_rc_range_parameters *rc;
>> +
>> + if (!hw_dsc || !dsc)
>> + return;
>> +
>> + hw = &hw_dsc->hw;
>> +
>> + sblk = hw_dsc->caps->sblk;
>> +
>> + rc = dsc->rc_range_params;
>> +
>> + /*
>> + * With BUF_THRESH -- 14 in total
>> + * each register contains 4 thresh values with the last register
>> + * containing only 2 thresh values
>> + */
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_0,
>> + (dsc->rc_buf_thresh[0] << 0) |
>> + (dsc->rc_buf_thresh[1] << 8) |
>> + (dsc->rc_buf_thresh[2] << 16) |
>> + (dsc->rc_buf_thresh[3] << 24));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_1,
>> + (dsc->rc_buf_thresh[4] << 0) |
>> + (dsc->rc_buf_thresh[5] << 8) |
>> + (dsc->rc_buf_thresh[6] << 16) |
>> + (dsc->rc_buf_thresh[7] << 24));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_2,
>> + (dsc->rc_buf_thresh[8] << 0) |
>> + (dsc->rc_buf_thresh[9] << 8) |
>> + (dsc->rc_buf_thresh[10] << 16) |
>> + (dsc->rc_buf_thresh[11] << 24));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_BUF_THRESH_3,
>> + (dsc->rc_buf_thresh[12] << 0) |
>> + (dsc->rc_buf_thresh[13] << 8));
> I'm still thrown-off by the fact that rc_buf_thresh is u16, but assumed
> everywhere to only contain 8 valid bits.
>
> Rest of the patch appears okay, but I'll try to test it on actual
> hardware too.
>
> - Marijn
>
>> +
>> + /*
>> + * with min/max_QP -- 5 bits
>> + * each register contains 5 min_qp or max_qp for total of 15
>> + *
>> + * With BPG_OFFSET -- 6 bits
>> + * each register contains 5 BPG_offset for total of 15
>> + */
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_0,
>> + (rc[0].range_min_qp << 0) |
>> + (rc[1].range_min_qp << 5) |
>> + (rc[2].range_min_qp << 10) |
>> + (rc[3].range_min_qp << 15) |
>> + (rc[4].range_min_qp << 20));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_0,
>> + (rc[0].range_max_qp << 0) |
>> + (rc[1].range_max_qp << 5) |
>> + (rc[2].range_max_qp << 10) |
>> + (rc[3].range_max_qp << 15) |
>> + (rc[4].range_max_qp << 20));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_0,
>> + (rc[0].range_bpg_offset << 0) |
>> + (rc[1].range_bpg_offset << 6) |
>> + (rc[2].range_bpg_offset << 12) |
>> + (rc[3].range_bpg_offset << 18) |
>> + (rc[4].range_bpg_offset << 24));
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_1,
>> + (rc[5].range_min_qp << 0) |
>> + (rc[6].range_min_qp << 5) |
>> + (rc[7].range_min_qp << 10) |
>> + (rc[8].range_min_qp << 15) |
>> + (rc[9].range_min_qp << 20));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_1,
>> + (rc[5].range_max_qp << 0) |
>> + (rc[6].range_max_qp << 5) |
>> + (rc[7].range_max_qp << 10) |
>> + (rc[8].range_max_qp << 15) |
>> + (rc[9].range_max_qp << 20));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_1,
>> + (rc[5].range_bpg_offset << 0) |
>> + (rc[6].range_bpg_offset << 6) |
>> + (rc[7].range_bpg_offset << 12) |
>> + (rc[8].range_bpg_offset << 18) |
>> + (rc[9].range_bpg_offset << 24));
>> +
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MIN_QP_2,
>> + (rc[10].range_min_qp << 0) |
>> + (rc[11].range_min_qp << 5) |
>> + (rc[12].range_min_qp << 10) |
>> + (rc[13].range_min_qp << 15) |
>> + (rc[14].range_min_qp << 20));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_MAX_QP_2,
>> + (rc[10].range_max_qp << 0) |
>> + (rc[11].range_max_qp << 5) |
>> + (rc[12].range_max_qp << 10) |
>> + (rc[13].range_max_qp << 15) |
>> + (rc[14].range_max_qp << 20));
>> + DPU_REG_WRITE(hw, sblk->enc.base + DSC_RC_RANGE_BPG_OFFSETS_2,
>> + (rc[10].range_bpg_offset << 0) |
>> + (rc[11].range_bpg_offset << 6) |
>> + (rc[12].range_bpg_offset << 12) |
>> + (rc[13].range_bpg_offset << 18) |
>> + (rc[14].range_bpg_offset << 24));
>> +}
>> +
>> +static inline void dpu_hw_dsc_bind_pingpong_blk_1_2(struct dpu_hw_dsc *hw_dsc,
>> + const enum dpu_pingpong pp)
>> +{
>> + struct dpu_hw_blk_reg_map *hw;
>> + const struct dpu_dsc_sub_blks *sblk;
>> + int mux_cfg = 0xf; /* Disabled */
>> +
>> + hw = &hw_dsc->hw;
>> +
>> + sblk = hw_dsc->caps->sblk;
>> +
>> + if (pp)
>> + mux_cfg = (pp - PINGPONG_0) & 0x7;
>> +
>> + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CTL, mux_cfg);
>> +}
>> +
>> +static void _setup_dcs_ops_1_2(struct dpu_hw_dsc_ops *ops,
>> + const unsigned long features)
>> +{
>> + ops->dsc_disable = dpu_hw_dsc_disable_1_2;
>> + ops->dsc_config = dpu_hw_dsc_config_1_2;
>> + ops->dsc_config_thresh = dpu_hw_dsc_config_thresh_1_2;
>> + ops->dsc_bind_pingpong_blk = dpu_hw_dsc_bind_pingpong_blk_1_2;
>> +}
>> +
>> +struct dpu_hw_dsc *dpu_hw_dsc_init_1_2(const struct dpu_dsc_cfg *cfg,
>> + void __iomem *addr)
>> +{
>> + struct dpu_hw_dsc *c;
>> +
>> + c = kzalloc(sizeof(*c), GFP_KERNEL);
>> + if (!c)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + c->hw.blk_addr = addr + cfg->base;
>> + c->hw.log_mask = DPU_DBG_MASK_DSC;
>> +
>> + c->idx = cfg->id;
>> + c->caps = cfg;
>> + _setup_dcs_ops_1_2(&c->ops, c->caps->features);
>> +
>> + return c;
>> +}
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>> index f0fc704..502dd60 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>> @@ -1,6 +1,7 @@
>> // SPDX-License-Identifier: GPL-2.0-only
>> /*
>> * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
>> */
>>
>> #define pr_fmt(fmt) "[drm:%s] " fmt, __func__
>> @@ -246,7 +247,11 @@ int dpu_rm_init(struct dpu_rm *rm,
>> struct dpu_hw_dsc *hw;
>> const struct dpu_dsc_cfg *dsc = &cat->dsc[i];
>>
>> - hw = dpu_hw_dsc_init(dsc, mmio);
>> + if (test_bit(DPU_DSC_HW_REV_1_2, &dsc->features))
>> + hw = dpu_hw_dsc_init_1_2(dsc, mmio);
>> + else
>> + hw = dpu_hw_dsc_init(dsc, mmio);
>> +
>> if (IS_ERR_OR_NULL(hw)) {
>> rc = PTR_ERR(hw);
>> DPU_ERROR("failed dsc object creation: err %d\n", rc);
>> --
>> 2.7.4
>>

2023-05-15 18:28:22

[permalink] [raw]

Subject: Re: [PATCH v8 5/8] drm/msm/dpu: add support for DSC encoder v1.2 engine

On Mon, 15 May 2023 at 20:47, Kuogee Hsieh <[email protected]> wrote:
>
>
> On 5/14/2023 2:46 PM, Marijn Suijten wrote:
> > On 2023-05-12 21:19:19, Dmitry Baryshkov wrote:
> > <snip.
> >>> +static inline void dpu_hw_dsc_bind_pingpong_blk_1_2(struct dpu_hw_dsc *hw_dsc,
> >>> + const enum dpu_pingpong pp)
> >>> +{
> >>> + struct dpu_hw_blk_reg_map *hw;
> >>> + const struct dpu_dsc_sub_blks *sblk;
> >>> + int mux_cfg = 0xf; /* Disabled */
> >>> +
> >>> + hw = &hw_dsc->hw;
> >>> +
> >>> + sblk = hw_dsc->caps->sblk;
> >>> +
> >>> + if (pp)
> >>> + mux_cfg = (pp - PINGPONG_0) & 0x7;
> >> Do we need an unbind support here like we do for the DSC 1.1?
> >>
> >>> +
> >>> + DPU_REG_WRITE(hw, sblk->ctl.base + DSC_CTL, mux_cfg);
> >>> +}
> > <snip>
> >
> > Friendly request to strip/snip unneeded context (as done in this reply)
> > to make it easier to spot the conversation, and replies to it.
> >
> > - Marijn
>
> Thanks for suggestion.
>
> How can I do that?
>
> just manually delete unneeded context?
>
> Or are they other way (tricks) to do it automatically?

No automation can deduce what is irrelevant to the mail. So we do this manually.

--
With best wishes
Dmitry

2023-05-15 20:13:33

[permalink] [raw]

Subject: Re: [PATCH v8 5/8] drm/msm/dpu: add support for DSC encoder v1.2 engine

On 2023-05-15 10:46:48, Kuogee Hsieh wrote:
<snip>
> > Friendly request to strip/snip unneeded context (as done in this reply)
> > to make it easier to spot the conversation, and replies to it.
> >
> > - Marijn
>
> Thanks for suggestion.
>
> How can I do that?
>
> just manually delete unneeded context?
>
> Or are they other way (tricks) to do it automatically?

Fortunately, no: it is up to you to decide what context is relevant,
which is typically the text (and/or diff snippets) you are replying to.

- Marijn

2023-05-15 20:22:45

[permalink] [raw]

Subject: Re: [PATCH v8 5/8] drm/msm/dpu: add support for DSC encoder v1.2 engine

On 2023-05-15 10:06:33, Kuogee Hsieh wrote:
<snip>
> >> +static inline int _dsc_calc_ob_max_addr(struct dpu_hw_dsc *hw_dsc, int num_ss)
> > Can you write out "ob" fully?
> >
> > These don't need to be marked "inline", same below.

Please add newlines around your reply, like I did here, to make it
easier to spot them in the context. As asked in another thread, shorten
the original message around it if it's not relevant for your reply
message (see <snip> bits).

> are you means all functions in this file doe snot to be marked as inline?

https://www.kernel.org/doc/local/inline.html

In general, inline is fine for math functions that are small and useful
to be inlined (and functions in headers that get compiled multiple times
but need to be deduplicated when all the objects are linked together).
It has no sensible meaning on callback functions (of which their pointer
get assigned to a struct member), however.

In DPU1 for example, only dpu_hw_ctl.c erroneously does this for
callbacks (and this patch, but I presume you'll fix this up in v9).

> >> +{
> >> + int max_addr = 2400 / num_ss;
> > ss -> slice (or subslice), right?
> ...
> slice (softslice)

Thanks if you can fix this up in v9!

- Marijn

<snip>

2023-05-15 21:26:30

[permalink] [raw]

Subject: Re: [PATCH v8 7/8] drm/msm/dpu: add DSC 1.2 hw blocks for relevant chipsets

On 2023-05-12 11:00:22, Kuogee Hsieh wrote:
>
> From: Abhinav Kumar <[email protected]>
>
> Add DSC 1.2 hardware blocks to the catalog with necessary sub-block and
> feature flag information. Each display compression engine (DCE) contains
> dual hard slice DSC encoders so both share same base address but with
> its own different sub block address.

Can we have an explanation of hard vs soft slices in some commit message
and/or code documentation?

>
> changes in v4:
> -- delete DPU_DSC_HW_REV_1_1
> -- re arrange sc8280xp_dsc[]
>
> changes in v4:
> -- fix checkpatch warning
>
> Signed-off-by: Abhinav Kumar <[email protected]>
> Signed-off-by: Kuogee Hsieh <[email protected]>
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> ---
> .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h | 14 ++++++++++++
> .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h | 7 ++++++
> .../drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 16 ++++++++++++++
> .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h | 14 ++++++++++++
> .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h | 14 ++++++++++++
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 25 +++++++++++++++++++++-
> 6 files changed, 89 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
> index 500cfd0..c4c93c8 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
> @@ -153,6 +153,18 @@ static const struct dpu_merge_3d_cfg sm8350_merge_3d[] = {
> MERGE_3D_BLK("merge_3d_2", MERGE_3D_2, 0x50000),
> };
>
> +/*
> + * NOTE: Each display compression engine (DCE) contains dual hard
> + * slice DSC encoders so both share same base address but with
> + * its own different sub block address.
> + */
> +static const struct dpu_dsc_cfg sm8350_dsc[] = {
> + DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),

Downstream says that the size is 0x10 (and 0x100 for the enc sblk, 0x10
for the ctl sblk). This simply fills it up to the start of the enc sblk
so that we can see all registers in the dump? After all only
DSC_CMN_MAIN_CNF is defined in the main register space, so 0x10 is
adequate.

> + DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),

Should we add an extra suffix to the name to indicate which hard-slice
DSC encoder it is? i.e. "dce_0_0" and "dce_0_1" etc?

> + DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
> + DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),

See comment below about loose BIT() in features.

> +};
> +
> static const struct dpu_intf_cfg sm8350_intf[] = {
> INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
> DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
> @@ -215,6 +227,8 @@ const struct dpu_mdss_cfg dpu_sm8350_cfg = {
> .dspp = sm8350_dspp,
> .pingpong_count = ARRAY_SIZE(sm8350_pp),
> .pingpong = sm8350_pp,
> + .dsc = sm8350_dsc,
> + .dsc_count = ARRAY_SIZE(sm8350_dsc),

Count goes first **everywhere else**, let's not break consistency here.

> .merge_3d_count = ARRAY_SIZE(sm8350_merge_3d),
> .merge_3d = sm8350_merge_3d,
> .intf_count = ARRAY_SIZE(sm8350_intf),
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
> index 5646713..42c66fe 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
> @@ -93,6 +93,11 @@ static const struct dpu_pingpong_cfg sc7280_pp[] = {
> PP_BLK_DITHER("pingpong_3", PINGPONG_3, 0x6c000, 0, sc7280_pp_sblk, -1, -1),
> };
>
> +/* NOTE: sc7280 only has one dsc hard slice encoder */

DSC

> +static const struct dpu_dsc_cfg sc7280_dsc[] = {
> + DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
> +};
> +
> static const struct dpu_intf_cfg sc7280_intf[] = {
> INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
> DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
> @@ -149,6 +154,8 @@ const struct dpu_mdss_cfg dpu_sc7280_cfg = {
> .mixer = sc7280_lm,
> .pingpong_count = ARRAY_SIZE(sc7280_pp),
> .pingpong = sc7280_pp,
> + .dsc_count = ARRAY_SIZE(sc7280_dsc),
> + .dsc = sc7280_dsc,
> .intf_count = ARRAY_SIZE(sc7280_intf),
> .intf = sc7280_intf,
> .vbif_count = ARRAY_SIZE(sdm845_vbif),
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
> index 808aacd..1901fff 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
> @@ -141,6 +141,20 @@ static const struct dpu_merge_3d_cfg sc8280xp_merge_3d[] = {
> MERGE_3D_BLK("merge_3d_2", MERGE_3D_2, 0x50000),
> };
>
> +/*
> + * NOTE: Each display compression engine (DCE) contains dual hard
> + * slice DSC encoders so both share same base address but with
> + * its own different sub block address.
> + */
> +static const struct dpu_dsc_cfg sc8280xp_dsc[] = {
> + DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
> + DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
> + DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
> + DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
> + DSC_BLK_1_2("dce_2", DSC_4, 0x82000, 0x100, 0, dsc_sblk_0),
> + DSC_BLK_1_2("dce_2", DSC_5, 0x82000, 0x100, 0, dsc_sblk_1),
> +};
> +
> /* TODO: INTF 3, 8 and 7 are used for MST, marked as INTF_NONE for now */
> static const struct dpu_intf_cfg sc8280xp_intf[] = {
> INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
> @@ -216,6 +230,8 @@ const struct dpu_mdss_cfg dpu_sc8280xp_cfg = {
> .dspp = sc8280xp_dspp,
> .pingpong_count = ARRAY_SIZE(sc8280xp_pp),
> .pingpong = sc8280xp_pp,
> + .dsc = sc8280xp_dsc,
> + .dsc_count = ARRAY_SIZE(sc8280xp_dsc),

Swap the two lines.

> .merge_3d_count = ARRAY_SIZE(sc8280xp_merge_3d),
> .merge_3d = sc8280xp_merge_3d,
> .intf_count = ARRAY_SIZE(sc8280xp_intf),
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
> index 1a89ff9..741d03f 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
> @@ -161,6 +161,18 @@ static const struct dpu_merge_3d_cfg sm8450_merge_3d[] = {
> MERGE_3D_BLK("merge_3d_3", MERGE_3D_3, 0x65f00),
> };
>
> +/*
> + * NOTE: Each display compression engine (DCE) contains dual hard
> + * slice DSC encoders so both share same base address but with
> + * its own different sub block address.
> + */
> +static const struct dpu_dsc_cfg sm8450_dsc[] = {
> + DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
> + DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
> + DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
> + DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
> +};
> +
> static const struct dpu_intf_cfg sm8450_intf[] = {
> INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
> DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
> @@ -223,6 +235,8 @@ const struct dpu_mdss_cfg dpu_sm8450_cfg = {
> .dspp = sm8450_dspp,
> .pingpong_count = ARRAY_SIZE(sm8450_pp),
> .pingpong = sm8450_pp,
> + .dsc = sm8450_dsc,
> + .dsc_count = ARRAY_SIZE(sm8450_dsc),

Another swap.

> .merge_3d_count = ARRAY_SIZE(sm8450_merge_3d),
> .merge_3d = sm8450_merge_3d,
> .intf_count = ARRAY_SIZE(sm8450_intf),
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
> index 497b34c..3ee6dc8 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
> @@ -165,6 +165,18 @@ static const struct dpu_merge_3d_cfg sm8550_merge_3d[] = {
> MERGE_3D_BLK("merge_3d_3", MERGE_3D_3, 0x66700),
> };
>
> +/*
> + * NOTE: Each display compression engine (DCE) contains dual hard
> + * slice DSC encoders so both share same base address but with
> + * its own different sub block address.
> + */
> +static const struct dpu_dsc_cfg sm8550_dsc[] = {
> + DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
> + DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
> + DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
> + DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
> +};
> +
> static const struct dpu_intf_cfg sm8550_intf[] = {
> INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
> DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
> @@ -227,6 +239,8 @@ const struct dpu_mdss_cfg dpu_sm8550_cfg = {
> .dspp = sm8550_dspp,
> .pingpong_count = ARRAY_SIZE(sm8550_pp),
> .pingpong = sm8550_pp,
> + .dsc = sm8550_dsc,
> + .dsc_count = ARRAY_SIZE(sm8550_dsc),

Swap.

> .merge_3d_count = ARRAY_SIZE(sm8550_merge_3d),
> .merge_3d = sm8550_merge_3d,
> .intf_count = ARRAY_SIZE(sm8550_intf),
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> index 78e4bf6..c1d7338 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> @@ -1,6 +1,6 @@
> // SPDX-License-Identifier: GPL-2.0-only
> /* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
> - * Copyright (c) 2022. Qualcomm Innovation Center, Inc. All rights reserved.
> + * Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
> */
>
> #define pr_fmt(fmt) "[drm:%s:%d] " fmt, __func__, __LINE__
> @@ -522,6 +522,16 @@ static const struct dpu_pingpong_sub_blks sc7280_pp_sblk = {
> /*************************************************************
> * DSC sub blocks config
> *************************************************************/
> +static const struct dpu_dsc_sub_blks dsc_sblk_0 = {
> + .enc = {.base = 0x100, .len = 0x100},
> + .ctl = {.base = 0xF00, .len = 0x10},
> +};
> +
> +static const struct dpu_dsc_sub_blks dsc_sblk_1 = {
> + .enc = {.base = 0x200, .len = 0x100},
> + .ctl = {.base = 0xF80, .len = 0x10},
> +};
> +
> #define DSC_BLK(_name, _id, _base, _features) \
> {\
> .name = _name, .id = _id, \
> @@ -529,6 +539,19 @@ static const struct dpu_pingpong_sub_blks sc7280_pp_sblk = {
> .features = _features, \
> }
>
> +/*
> + * NOTE: Each display compression engine (DCE) contains dual hard
> + * slice DSC encoders so both share same base address but with
> + * its own different sub block address.
> + */
> +#define DSC_BLK_1_2(_name, _id, _base, _len, _features, _sblk) \

There are no address values here so this comment doesn't seem very
useful, and it is already duplicated on every DSC block array, where the
duplication is more visible. Drop the comment here?

> + {\
> + .name = _name, .id = _id, \
> + .base = _base, .len = _len, \

The len is always 0x100 (downstream says 0x10), should we hardcode it
here and drop _len? We can always add it back if a future revision
starts changing it, but that's not the case currently.

> + .features = BIT(DPU_DSC_HW_REV_1_2) | _features, \

We don't willy-nilly append bits like that: should there be global
feature flags?

Or is this the start of a new era where we expand those defines in-line
and drop them altogether? I'd much prefer that but we should first
align on this direction (and then also make the switch globally in a
followup).

- Marijn

> + .sblk = &_sblk, \
> + }
> +
> /*************************************************************
> * INTF sub blocks config
> *************************************************************/
> --
> 2.7.4
>

2023-05-15 21:32:30

[permalink] [raw]

Subject: Re: [PATCH v8 7/8] drm/msm/dpu: add DSC 1.2 hw blocks for relevant chipsets

By the way, can we replace "relevant chipsets" in the title with
"DPU >= 7.0" like the other titles?

- Marijn

On 2023-05-12 11:00:22, Kuogee Hsieh wrote:
<snip>

2023-05-15 22:00:30

[permalink] [raw]

Subject: Re: [PATCH v8 6/8] drm/msm/dpu: separate DSC flush update out of interface

On 2023-05-12 11:00:21, Kuogee Hsieh wrote:
>
> Current DSC flush update is piggyback inside dpu_hw_ctl_intf_cfg_v1().

Can you rewrite "is piggyback"? Something like "Currently DSC flushing
happens during interface configuration". And it's intf configuration
**on the CTL**, which makes this extra confusing.

> This patch separates DSC flush away from dpu_hw_ctl_intf_cfg_v1() by

Drop "This patch". Then, separates -> Separate

> adding dpu_hw_ctl_update_pending_flush_dsc_v1() to handle both per
> DSC engine and DSC flush bits at same time to make it consistent with

Make that per-DSC with a hyphen.

> the location of flush programming of other dpu sub blocks.

DPU sub-blocks.

>
> Signed-off-by: Kuogee Hsieh <[email protected]>
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> ---
> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 14 ++++++++++++--
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 22 ++++++++++++++++------
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h | 10 ++++++++++
> 3 files changed, 38 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> index ffa6f04..5cae70e 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> @@ -1834,12 +1834,18 @@ dpu_encoder_dsc_initial_line_calc(struct drm_dsc_config *dsc,
> return DIV_ROUND_UP(total_pixels, dsc->slice_width);
> }
>
> -static void dpu_encoder_dsc_pipe_cfg(struct dpu_hw_dsc *hw_dsc,
> +static void dpu_encoder_dsc_pipe_cfg(struct dpu_encoder_virt *dpu_enc,

Why not pass hw_ctl directly? The other blocks are directly passed as
well, and the caller already has cur_master. Otherwise we might as well
inline the for loops. Same question for the new _clr call added in
patch 8/8.

> + struct dpu_hw_dsc *hw_dsc,
> struct dpu_hw_pingpong *hw_pp,
> struct drm_dsc_config *dsc,
> u32 common_mode,
> u32 initial_lines)
> {
> + struct dpu_encoder_phys *cur_master = dpu_enc->cur_master;
> + struct dpu_hw_ctl *ctl;
> +
> + ctl = cur_master->hw_ctl;

Assign this directly at declaration, just like cur_master (but
irrelevant if you pass this directly instead).

> +
> if (hw_dsc->ops.dsc_config)
> hw_dsc->ops.dsc_config(hw_dsc, dsc, common_mode, initial_lines);
>
> @@ -1854,6 +1860,9 @@ static void dpu_encoder_dsc_pipe_cfg(struct dpu_hw_dsc *hw_dsc,
>
> if (hw_pp->ops.enable_dsc)
> hw_pp->ops.enable_dsc(hw_pp);
> +
> + if (ctl->ops.update_pending_flush_dsc)
> + ctl->ops.update_pending_flush_dsc(ctl, hw_dsc->idx);
> }
>
> static void dpu_encoder_prep_dsc(struct dpu_encoder_virt *dpu_enc,
> @@ -1898,7 +1907,8 @@ static void dpu_encoder_prep_dsc(struct dpu_encoder_virt *dpu_enc,
> initial_lines = dpu_encoder_dsc_initial_line_calc(dsc, enc_ip_w);
>
> for (i = 0; i < MAX_CHANNELS_PER_ENC; i++)
> - dpu_encoder_dsc_pipe_cfg(hw_dsc[i], hw_pp[i], dsc, dsc_common_mode, initial_lines);
> + dpu_encoder_dsc_pipe_cfg(dpu_enc, hw_dsc[i], hw_pp[i], dsc,
> + dsc_common_mode, initial_lines);
> }
>
> void dpu_encoder_prepare_for_kickoff(struct drm_encoder *drm_enc)
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
> index 4f7cfa9..f3a50cc 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
> @@ -139,6 +139,11 @@ static inline void dpu_hw_ctl_trigger_flush_v1(struct dpu_hw_ctl *ctx)
> CTL_DSPP_n_FLUSH(dspp - DSPP_0),
> ctx->pending_dspp_flush_mask[dspp - DSPP_0]);
> }
> +
> + if (ctx->pending_flush_mask & BIT(DSC_IDX))
> + DPU_REG_WRITE(&ctx->hw, CTL_DSC_FLUSH,
> + ctx->pending_dsc_flush_mask);

When are we setting this to zero again?

Same question for the other masks, only the global pending_flush_mask
and pending_dspp_flush_mask are reset in dpu_hw_ctl_clear_pending_flush.

> +
> DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, ctx->pending_flush_mask);
> }
>
> @@ -285,6 +290,13 @@ static void dpu_hw_ctl_update_pending_flush_merge_3d_v1(struct dpu_hw_ctl *ctx,
> ctx->pending_flush_mask |= BIT(MERGE_3D_IDX);
> }
>
> +static void dpu_hw_ctl_update_pending_flush_dsc_v1(struct dpu_hw_ctl *ctx,
> + enum dpu_dsc dsc_num)
> +{
> + ctx->pending_dsc_flush_mask |= BIT(dsc_num - DSC_0);
> + ctx->pending_flush_mask |= BIT(DSC_IDX);
> +}
> +
> static void dpu_hw_ctl_update_pending_flush_dspp(struct dpu_hw_ctl *ctx,
> enum dpu_dspp dspp, u32 dspp_sub_blk)
> {
> @@ -502,9 +514,6 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
> if ((test_bit(DPU_CTL_VM_CFG, &ctx->caps->features)))
> mode_sel = CTL_DEFAULT_GROUP_ID << 28;
>
> - if (cfg->dsc)
> - DPU_REG_WRITE(&ctx->hw, CTL_DSC_FLUSH, cfg->dsc);
> -
> if (cfg->intf_mode_sel == DPU_CTL_MODE_SEL_CMD)
> mode_sel |= BIT(17);
>
> @@ -524,10 +533,8 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
> if (cfg->merge_3d)
> DPU_REG_WRITE(c, CTL_MERGE_3D_ACTIVE,
> BIT(cfg->merge_3d - MERGE_3D_0));

Can we have a newline here?

> - if (cfg->dsc) {
> - DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, DSC_IDX);

Found the reason why this patch (as one of the few) is needed to get
display working on my SM8150/SM8250 devices: the semantic change is that
BIT() was missing around DSC_IDX here.
(It wasn't hampering SDM845 because it doesn't have a configurable
crossbar, i.e. DPU_CTL_ACTIVE_CFG)

Manually reverting this patch and adding BIT() indeed also fixes the
issue.

This semantic change should be documented in the description and with a
Fixes: (and Reported-by:?), or as a separate preliminary patch for
clarity.

> + if (cfg->dsc)
> DPU_REG_WRITE(c, CTL_DSC_ACTIVE, cfg->dsc);
> - }
> }
>
> static void dpu_hw_ctl_intf_cfg(struct dpu_hw_ctl *ctx,
> @@ -630,6 +637,9 @@ static void _setup_ctl_ops(struct dpu_hw_ctl_ops *ops,
> ops->update_pending_flush_merge_3d =
> dpu_hw_ctl_update_pending_flush_merge_3d_v1;
> ops->update_pending_flush_wb = dpu_hw_ctl_update_pending_flush_wb_v1;
> +

And while adding a newline above, drop the one here.

> + ops->update_pending_flush_dsc =
> + dpu_hw_ctl_update_pending_flush_dsc_v1;
> } else {
> ops->trigger_flush = dpu_hw_ctl_trigger_flush;
> ops->setup_intf_cfg = dpu_hw_ctl_intf_cfg;
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
> index 6292002..d4869a0 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
> @@ -158,6 +158,15 @@ struct dpu_hw_ctl_ops {
> enum dpu_dspp blk, u32 dspp_sub_blk);
>
> /**
> + * OR in the given flushbits to the cached pending_(dsc_)flush_mask
> + * No effect on hardware
> + * @ctx : ctl path ctx pointer
> + * @blk : interface block index

Can you drop the spaces before the colon (:)? That's wrong and will be
fixed elsewhere later.

> + */
> + void (*update_pending_flush_dsc)(struct dpu_hw_ctl *ctx,
> + enum dpu_dsc blk);

Indent with a single tab to match the rest.

> +
> + /**
> * Write the value of the pending_flush_mask to hardware
> * @ctx : ctl path ctx pointer
> */
> @@ -245,6 +254,7 @@ struct dpu_hw_ctl {
> u32 pending_wb_flush_mask;
> u32 pending_merge_3d_flush_mask;
> u32 pending_dspp_flush_mask[DSPP_MAX - DSPP_0];
> + u32 pending_dsc_flush_mask;

Don't forget to add this to the doc-comment, or did you skip it by
intention because pending_merge_3d_flush_mask and
pending_dspp_flush_mask are missing as well?

- Marijn

>
> /* ops */
> struct dpu_hw_ctl_ops ops;
> --
> 2.7.4
>

2023-05-15 22:53:07

[permalink] [raw]

Subject: Re: [PATCH v8 7/8] drm/msm/dpu: add DSC 1.2 hw blocks for relevant chipsets

On 5/15/2023 2:21 PM, Marijn Suijten wrote:
> On 2023-05-12 11:00:22, Kuogee Hsieh wrote:
>>
>> From: Abhinav Kumar <[email protected]>
>>
>> Add DSC 1.2 hardware blocks to the catalog with necessary sub-block and
>> feature flag information. Each display compression engine (DCE) contains
>> dual hard slice DSC encoders so both share same base address but with
>> its own different sub block address.
>
> Can we have an explanation of hard vs soft slices in some commit message
> and/or code documentation?
>

Not in this one. It wont look appropriate. I would rather remove "hard"
to avoid confusion.

>>
>> changes in v4:
>> -- delete DPU_DSC_HW_REV_1_1
>> -- re arrange sc8280xp_dsc[]
>>
>> changes in v4:
>> -- fix checkpatch warning
>>
>> Signed-off-by: Abhinav Kumar <[email protected]>
>> Signed-off-by: Kuogee Hsieh <[email protected]>
>> Reviewed-by: Dmitry Baryshkov <[email protected]>
>> ---
>> .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h | 14 ++++++++++++
>> .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h | 7 ++++++
>> .../drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 16 ++++++++++++++
>> .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h | 14 ++++++++++++
>> .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h | 14 ++++++++++++
>> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 25 +++++++++++++++++++++-
>> 6 files changed, 89 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
>> index 500cfd0..c4c93c8 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
>> @@ -153,6 +153,18 @@ static const struct dpu_merge_3d_cfg sm8350_merge_3d[] = {
>> MERGE_3D_BLK("merge_3d_2", MERGE_3D_2, 0x50000),
>> };
>>
>> +/*
>> + * NOTE: Each display compression engine (DCE) contains dual hard
>> + * slice DSC encoders so both share same base address but with
>> + * its own different sub block address.
>> + */
>> +static const struct dpu_dsc_cfg sm8350_dsc[] = {
>> + DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
>
> Downstream says that the size is 0x10 (and 0x100 for the enc sblk, 0x10
> for the ctl sblk). This simply fills it up to the start of the enc sblk
> so that we can see all registers in the dump? After all only
> DSC_CMN_MAIN_CNF is defined in the main register space, so 0x10 is
> adequate.
>

.len today is always only for the dump. and yes even here we have only
0x100 for the enc and 0x10 for the ctl.

+static const struct dpu_dsc_sub_blks dsc_sblk_0 = {
+ .enc = {.base = 0x100, .len = 0x100},
+ .ctl = {.base = 0xF00, .len = 0x10},
+};

The issue here is that, the dpu snapshot does not handle sub_blk parsing
today. Its a to-do item. So for that reason, 0x100 was used here to
atleast get the full encoder dumps.

>> + DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
>
> Should we add an extra suffix to the name to indicate which hard-slice
> DSC encoder it is? i.e. "dce_0_0" and "dce_0_1" etc?

Ok, that should be fine. We can add it.

>
>> + DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>> + DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
>

> See comment below about loose BIT() in features.

Responded below.
>
>> +};
>> +
>> static const struct dpu_intf_cfg sm8350_intf[] = {
>> INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>> DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>> @@ -215,6 +227,8 @@ const struct dpu_mdss_cfg dpu_sm8350_cfg = {
>> .dspp = sm8350_dspp,
>> .pingpong_count = ARRAY_SIZE(sm8350_pp),
>> .pingpong = sm8350_pp,
>> + .dsc = sm8350_dsc,
>> + .dsc_count = ARRAY_SIZE(sm8350_dsc),
>
> Count goes first **everywhere else**, let's not break consistency here.
>

the order of DSC entries is swapped for all chipsets. Please refer to
dpu_sc8180x_cfg, dpu_sm8250_cfg etc.

So if you are talking about consistency, this is actually consistent
with whats present in other chipsets.

If you are very particular about this, then once this lands, you can
change the order for all of them in another change.

Same answer for all swap comments.

>> .merge_3d_count = ARRAY_SIZE(sm8350_merge_3d),
>> .merge_3d = sm8350_merge_3d,
>> .intf_count = ARRAY_SIZE(sm8350_intf),
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
>> index 5646713..42c66fe 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
>> @@ -93,6 +93,11 @@ static const struct dpu_pingpong_cfg sc7280_pp[] = {
>> PP_BLK_DITHER("pingpong_3", PINGPONG_3, 0x6c000, 0, sc7280_pp_sblk, -1, -1),
>> };
>>
>> +/* NOTE: sc7280 only has one dsc hard slice encoder */
>
> DSC
>
>> +static const struct dpu_dsc_cfg sc7280_dsc[] = {
>> + DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>> +};
>> +
>> static const struct dpu_intf_cfg sc7280_intf[] = {
>> INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>> DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>> @@ -149,6 +154,8 @@ const struct dpu_mdss_cfg dpu_sc7280_cfg = {
>> .mixer = sc7280_lm,
>> .pingpong_count = ARRAY_SIZE(sc7280_pp),
>> .pingpong = sc7280_pp,
>> + .dsc_count = ARRAY_SIZE(sc7280_dsc),
>> + .dsc = sc7280_dsc,
>> .intf_count = ARRAY_SIZE(sc7280_intf),
>> .intf = sc7280_intf,
>> .vbif_count = ARRAY_SIZE(sdm845_vbif),
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
>> index 808aacd..1901fff 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
>> @@ -141,6 +141,20 @@ static const struct dpu_merge_3d_cfg sc8280xp_merge_3d[] = {
>> MERGE_3D_BLK("merge_3d_2", MERGE_3D_2, 0x50000),
>> };
>>
>> +/*
>> + * NOTE: Each display compression engine (DCE) contains dual hard
>> + * slice DSC encoders so both share same base address but with
>> + * its own different sub block address.
>> + */
>> +static const struct dpu_dsc_cfg sc8280xp_dsc[] = {
>> + DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
>> + DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
>> + DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>> + DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
>> + DSC_BLK_1_2("dce_2", DSC_4, 0x82000, 0x100, 0, dsc_sblk_0),
>> + DSC_BLK_1_2("dce_2", DSC_5, 0x82000, 0x100, 0, dsc_sblk_1),
>> +};
>> +
>> /* TODO: INTF 3, 8 and 7 are used for MST, marked as INTF_NONE for now */
>> static const struct dpu_intf_cfg sc8280xp_intf[] = {
>> INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>> @@ -216,6 +230,8 @@ const struct dpu_mdss_cfg dpu_sc8280xp_cfg = {
>> .dspp = sc8280xp_dspp,
>> .pingpong_count = ARRAY_SIZE(sc8280xp_pp),
>> .pingpong = sc8280xp_pp,
>> + .dsc = sc8280xp_dsc,
>> + .dsc_count = ARRAY_SIZE(sc8280xp_dsc),
>
> Swap the two lines.
>
>> .merge_3d_count = ARRAY_SIZE(sc8280xp_merge_3d),
>> .merge_3d = sc8280xp_merge_3d,
>> .intf_count = ARRAY_SIZE(sc8280xp_intf),
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
>> index 1a89ff9..741d03f 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
>> @@ -161,6 +161,18 @@ static const struct dpu_merge_3d_cfg sm8450_merge_3d[] = {
>> MERGE_3D_BLK("merge_3d_3", MERGE_3D_3, 0x65f00),
>> };
>>
>> +/*
>> + * NOTE: Each display compression engine (DCE) contains dual hard
>> + * slice DSC encoders so both share same base address but with
>> + * its own different sub block address.
>> + */
>> +static const struct dpu_dsc_cfg sm8450_dsc[] = {
>> + DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
>> + DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
>> + DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>> + DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
>> +};
>> +
>> static const struct dpu_intf_cfg sm8450_intf[] = {
>> INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>> DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>> @@ -223,6 +235,8 @@ const struct dpu_mdss_cfg dpu_sm8450_cfg = {
>> .dspp = sm8450_dspp,
>> .pingpong_count = ARRAY_SIZE(sm8450_pp),
>> .pingpong = sm8450_pp,
>> + .dsc = sm8450_dsc,
>> + .dsc_count = ARRAY_SIZE(sm8450_dsc),
>
> Another swap.
>
>> .merge_3d_count = ARRAY_SIZE(sm8450_merge_3d),
>> .merge_3d = sm8450_merge_3d,
>> .intf_count = ARRAY_SIZE(sm8450_intf),
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
>> index 497b34c..3ee6dc8 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
>> @@ -165,6 +165,18 @@ static const struct dpu_merge_3d_cfg sm8550_merge_3d[] = {
>> MERGE_3D_BLK("merge_3d_3", MERGE_3D_3, 0x66700),
>> };
>>
>> +/*
>> + * NOTE: Each display compression engine (DCE) contains dual hard
>> + * slice DSC encoders so both share same base address but with
>> + * its own different sub block address.
>> + */
>> +static const struct dpu_dsc_cfg sm8550_dsc[] = {
>> + DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
>> + DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
>> + DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
>> + DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
>> +};
>> +
>> static const struct dpu_intf_cfg sm8550_intf[] = {
>> INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
>> DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>> @@ -227,6 +239,8 @@ const struct dpu_mdss_cfg dpu_sm8550_cfg = {
>> .dspp = sm8550_dspp,
>> .pingpong_count = ARRAY_SIZE(sm8550_pp),
>> .pingpong = sm8550_pp,
>> + .dsc = sm8550_dsc,
>> + .dsc_count = ARRAY_SIZE(sm8550_dsc),
>
> Swap.
>
>> .merge_3d_count = ARRAY_SIZE(sm8550_merge_3d),
>> .merge_3d = sm8550_merge_3d,
>> .intf_count = ARRAY_SIZE(sm8550_intf),
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
>> index 78e4bf6..c1d7338 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
>> @@ -1,6 +1,6 @@
>> // SPDX-License-Identifier: GPL-2.0-only
>> /* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
>> - * Copyright (c) 2022. Qualcomm Innovation Center, Inc. All rights reserved.
>> + * Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
>> */
>>
>> #define pr_fmt(fmt) "[drm:%s:%d] " fmt, __func__, __LINE__
>> @@ -522,6 +522,16 @@ static const struct dpu_pingpong_sub_blks sc7280_pp_sblk = {
>> /*************************************************************
>> * DSC sub blocks config
>> *************************************************************/
>> +static const struct dpu_dsc_sub_blks dsc_sblk_0 = {
>> + .enc = {.base = 0x100, .len = 0x100},
>> + .ctl = {.base = 0xF00, .len = 0x10},
>> +};
>> +
>> +static const struct dpu_dsc_sub_blks dsc_sblk_1 = {
>> + .enc = {.base = 0x200, .len = 0x100},
>> + .ctl = {.base = 0xF80, .len = 0x10},
>> +};
>> +
>> #define DSC_BLK(_name, _id, _base, _features) \
>> {\
>> .name = _name, .id = _id, \
>> @@ -529,6 +539,19 @@ static const struct dpu_pingpong_sub_blks sc7280_pp_sblk = {
>> .features = _features, \
>> }
>>
>> +/*
>> + * NOTE: Each display compression engine (DCE) contains dual hard
>> + * slice DSC encoders so both share same base address but with
>> + * its own different sub block address.
>> + */
>> +#define DSC_BLK_1_2(_name, _id, _base, _len, _features, _sblk) \
>
> There are no address values here so this comment doesn't seem very
> useful, and it is already duplicated on every DSC block array, where the
> duplication is more visible. Drop the comment here?
>

_base is the address. So base address. Does that clarify things?

>> + {\
>> + .name = _name, .id = _id, \
>> + .base = _base, .len = _len, \
>
> The len is always 0x100 (downstream says 0x10), should we hardcode it
> here and drop _len? We can always add it back if a future revision
> starts changing it, but that's not the case currently.
>
>> + .features = BIT(DPU_DSC_HW_REV_1_2) | _features, \
>
> We don't willy-nilly append bits like that: should there be global
> feature flags?

So this approach is actually better. This macro is a DSC_1_2 macro so it
will have the 1.2 feature flag and other features like native_422
support of that encoder are ORed on top of it. Nothing wrong with this.

>
> Or is this the start of a new era where we expand those defines in-line
> and drop them altogether? I'd much prefer that but we should first
> align on this direction (and then also make the switch globally in a
> followup).
>

Its case by case. No need to generalize.

In this the feature flag ORed with the base feature flag of DSC_1_2
makes it more clear.

> - Marijn
>
>> + .sblk = &_sblk, \
>> + }
>> +
>> /*************************************************************
>> * INTF sub blocks config
>> *************************************************************/
>> --
>> 2.7.4
>>

2023-05-15 22:55:06

[permalink] [raw]

Subject: Re: [PATCH v8 7/8] drm/msm/dpu: add DSC 1.2 hw blocks for relevant chipsets

On 2023-05-15 15:03:46, Abhinav Kumar wrote:
> On 5/15/2023 2:21 PM, Marijn Suijten wrote:
> > On 2023-05-12 11:00:22, Kuogee Hsieh wrote:
> >>
> >> From: Abhinav Kumar <[email protected]>
> >>
> >> Add DSC 1.2 hardware blocks to the catalog with necessary sub-block and
> >> feature flag information. Each display compression engine (DCE) contains
> >> dual hard slice DSC encoders so both share same base address but with
> >> its own different sub block address.
> >
> > Can we have an explanation of hard vs soft slices in some commit message
> > and/or code documentation?
> >
>
> Not in this one. It wont look appropriate. I would rather remove "hard"
> to avoid confusion.

That is totally fine, let's remove it instead.

<snip>
> >> + DSC_BLK_1_2("dce_0", DSC_0, 0x80000, 0x100, 0, dsc_sblk_0),
> >
> > Downstream says that the size is 0x10 (and 0x100 for the enc sblk, 0x10
> > for the ctl sblk). This simply fills it up to the start of the enc sblk
> > so that we can see all registers in the dump? After all only
> > DSC_CMN_MAIN_CNF is defined in the main register space, so 0x10 is
> > adequate.
> >
>
> .len today is always only for the dump. and yes even here we have only
> 0x100 for the enc and 0x10 for the ctl.
>
> +static const struct dpu_dsc_sub_blks dsc_sblk_0 = {
> + .enc = {.base = 0x100, .len = 0x100},
> + .ctl = {.base = 0xF00, .len = 0x10},
> +};
>
> The issue here is that, the dpu snapshot does not handle sub_blk parsing
> today. Its a to-do item. So for that reason, 0x100 was used here to
> atleast get the full encoder dumps.

But then you don't see the ENC block? It starts at 0x100 (or 0x200) so
then the length should be longer... it should in fact depend on even/odd
DCE then?

>
> >> + DSC_BLK_1_2("dce_0", DSC_1, 0x80000, 0x100, 0, dsc_sblk_1),
> >
> > Should we add an extra suffix to the name to indicate which hard-slice
> > DSC encoder it is? i.e. "dce_0_0" and "dce_0_1" etc?
>
> Ok, that should be fine. We can add it.

Great, thanks!

> >> + DSC_BLK_1_2("dce_1", DSC_2, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_0),
> >> + DSC_BLK_1_2("dce_1", DSC_3, 0x81000, 0x100, BIT(DPU_DSC_NATIVE_422_EN), dsc_sblk_1),
> >
>
> > See comment below about loose BIT() in features.
>
> Responded below.
> >
> >> +};
> >> +
> >> static const struct dpu_intf_cfg sm8350_intf[] = {
> >> INTF_BLK("intf_0", INTF_0, 0x34000, 0x280, INTF_DP, MSM_DP_CONTROLLER_0, 24, INTF_SC7280_MASK,
> >> DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
> >> @@ -215,6 +227,8 @@ const struct dpu_mdss_cfg dpu_sm8350_cfg = {
> >> .dspp = sm8350_dspp,
> >> .pingpong_count = ARRAY_SIZE(sm8350_pp),
> >> .pingpong = sm8350_pp,
> >> + .dsc = sm8350_dsc,
> >> + .dsc_count = ARRAY_SIZE(sm8350_dsc),
> >
> > Count goes first **everywhere else**, let's not break consistency here.
> >
>
> the order of DSC entries is swapped for all chipsets. Please refer to
> dpu_sc8180x_cfg, dpu_sm8250_cfg etc.

Thanks for confirming that this is not the case in a followup mail :)

> So if you are talking about consistency, this is actually consistent
> with whats present in other chipsets.
>
> If you are very particular about this, then once this lands, you can
> change the order for all of them in another change.
>
> Same answer for all swap comments.
<snip>
> >> +/*
> >> + * NOTE: Each display compression engine (DCE) contains dual hard
> >> + * slice DSC encoders so both share same base address but with
> >> + * its own different sub block address.
> >> + */
> >> +#define DSC_BLK_1_2(_name, _id, _base, _len, _features, _sblk) \
> >
> > There are no address values here so this comment doesn't seem very
> > useful, and it is already duplicated on every DSC block array, where the
> > duplication is more visible. Drop the comment here?
> >
>
> _base is the address. So base address. Does that clarify things?

This is referring to the NOTE: comment above. There's _base as address
here, yes, but there's no context here that it'll be used in duplicate
fashion, unlike the SoC catalog files. The request is to just drop it
here as it adds no value.

> >> + {\
> >> + .name = _name, .id = _id, \
> >> + .base = _base, .len = _len, \
> >
> > The len is always 0x100 (downstream says 0x10), should we hardcode it
> > here and drop _len? We can always add it back if a future revision
> > starts changing it, but that's not the case currently.
> >
> >> + .features = BIT(DPU_DSC_HW_REV_1_2) | _features, \
> >
> > We don't willy-nilly append bits like that: should there be global
> > feature flags?
>
> So this approach is actually better. This macro is a DSC_1_2 macro so it
> will have the 1.2 feature flag and other features like native_422
> support of that encoder are ORed on top of it. Nothing wrong with this.

I agree it is better, but we seem to be very selective in whether to
stick to the "old" principles in DPU versus applying a new pattern that
isn't used elsewhere yet (i.e. your request to _not_ shuffle the order
of .dsc and .dsc_count assignment to match other .x and .x_count, and do
that in a future patch instead). If we want to be consistent
everywhere, these should be #defines that we'll flatten out in a
followup if at all.

> > Or is this the start of a new era where we expand those defines in-line
> > and drop them altogether? I'd much prefer that but we should first
> > align on this direction (and then also make the switch globally in a
> > followup).
> >
>
> Its case by case. No need to generalize.
>
> In this the feature flag ORed with the base feature flag of DSC_1_2
> makes it more clear.

- Marijn

2023-05-16 00:07:10