2024-02-23 21:22:00

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v2 0/7] A702 support

Bit of a megaseries, bunched together for your testing convenience..
Needs mesa!27665 [1] on the userland part, kmscube happily spins.

I'm feeling quite lukewarm about the memory barriers in patch 3..

Patch 1 for Will/smmu, 5-6 for drm/msm, rest for qcom

[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27665

Signed-off-by: Konrad Dybcio <[email protected]>
---
Changes in v2:
- Drop applied smmu-bindings patch
- Fix the gpucc bindings patch to be even better
- Reorder HUAYRA_2290 definitions near HUAYRA (..Add HUAYRA_2290
support..)
- Replace weird memory barriers copypasted from msm-5.4 with readback to
ensure timely write completion (..Add HUAYRA_2290 support..)
- Keep my super amazing commit message referencing the 3D accelerator
official naming (dts)
- Pick up tags
- Link to v1: https://lore.kernel.org/r/[email protected]

---
Konrad Dybcio (7):
dt-bindings: clock: Add Qcom QCM2290 GPUCC
clk: qcom: clk-alpha-pll: Add HUAYRA_2290 support
clk: qcom: Add QCM2290 GPU clock controller driver
drm/msm/adreno: Add missing defines for A702
drm/msm/adreno: Add A702 support
arm64: dts: qcom: qcm2290: Add GPU nodes
arm64: dts: qcom: qrb2210-rb1: Enable the GPU

.../bindings/clock/qcom,qcm2290-gpucc.yaml | 77 ++++
arch/arm64/boot/dts/qcom/qcm2290.dtsi | 154 ++++++++
arch/arm64/boot/dts/qcom/qrb2210-rb1.dts | 8 +
drivers/clk/qcom/Kconfig | 9 +
drivers/clk/qcom/Makefile | 1 +
drivers/clk/qcom/clk-alpha-pll.c | 47 +++
drivers/clk/qcom/clk-alpha-pll.h | 3 +
drivers/clk/qcom/gpucc-qcm2290.c | 423 +++++++++++++++++++++
drivers/gpu/drm/msm/adreno/a6xx.xml.h | 18 +
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 92 ++++-
drivers/gpu/drm/msm/adreno/adreno_device.c | 18 +
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 16 +-
include/dt-bindings/clock/qcom,qcm2290-gpucc.h | 32 ++
13 files changed, 889 insertions(+), 9 deletions(-)
---
base-commit: 26d7d52b6253574d5b6fec16a93e1110d1489cef
change-id: 20240219-topic-rb1_gpu-3ec8c6830384

Best regards,
--
Konrad Dybcio <[email protected]>



2024-02-23 21:22:13

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v2 1/7] dt-bindings: clock: Add Qcom QCM2290 GPUCC

Add device tree bindings for graphics clock controller for Qualcomm
Technology Inc's QCM2290 SoCs.

Signed-off-by: Konrad Dybcio <[email protected]>
---
.../bindings/clock/qcom,qcm2290-gpucc.yaml | 77 ++++++++++++++++++++++
include/dt-bindings/clock/qcom,qcm2290-gpucc.h | 32 +++++++++
2 files changed, 109 insertions(+)

diff --git a/Documentation/devicetree/bindings/clock/qcom,qcm2290-gpucc.yaml b/Documentation/devicetree/bindings/clock/qcom,qcm2290-gpucc.yaml
new file mode 100644
index 000000000000..734880805c1b
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/qcom,qcm2290-gpucc.yaml
@@ -0,0 +1,77 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/qcom,qcm2290-gpucc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm Graphics Clock & Reset Controller on QCM2290
+
+maintainers:
+ - Konrad Dybcio <[email protected]>
+
+description: |
+ Qualcomm graphics clock control module provides the clocks, resets and power
+ domains on Qualcomm SoCs.
+
+ See also::
+ include/dt-bindings/clock/qcom,qcm2290-gpucc.h
+
+properties:
+ compatible:
+ const: qcom,qcm2290-gpucc
+
+ reg:
+ maxItems: 1
+
+ clocks:
+ items:
+ - description: AHB interface clock,
+ - description: SoC CXO clock
+ - description: GPLL0 main branch source
+ - description: GPLL0 div branch source
+
+ power-domains:
+ description:
+ A phandle and PM domain specifier for the CX power domain.
+ maxItems: 1
+
+ required-opps:
+ description:
+ A phandle to an OPP node describing required CX performance point.
+ maxItems: 1
+
+required:
+ - compatible
+ - clocks
+ - power-domains
+
+allOf:
+ - $ref: qcom,gcc.yaml#
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/qcom,gcc-qcm2290.h>
+ #include <dt-bindings/clock/qcom,rpmcc.h>
+ #include <dt-bindings/power/qcom-rpmpd.h>
+
+ soc {
+ #address-cells = <2>;
+ #size-cells = <2>;
+
+ clock-controller@5990000 {
+ compatible = "qcom,qcm2290-gpucc";
+ reg = <0x0 0x05990000 0x0 0x9000>;
+ clocks = <&gcc GCC_GPU_CFG_AHB_CLK>,
+ <&rpmcc RPM_SMD_XO_CLK_SRC>,
+ <&gcc GCC_GPU_GPLL0_CLK_SRC>,
+ <&gcc GCC_GPU_GPLL0_DIV_CLK_SRC>;
+ power-domains = <&rpmpd QCM2290_VDDCX>;
+ required-opps = <&rpmpd_opp_low_svs>;
+ #clock-cells = <1>;
+ #reset-cells = <1>;
+ #power-domain-cells = <1>;
+ };
+ };
+...
diff --git a/include/dt-bindings/clock/qcom,qcm2290-gpucc.h b/include/dt-bindings/clock/qcom,qcm2290-gpucc.h
new file mode 100644
index 000000000000..7c76dd05278f
--- /dev/null
+++ b/include/dt-bindings/clock/qcom,qcm2290-gpucc.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
+/*
+ * Copyright (c) 2018-2019, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2024, Linaro Limited
+ */
+
+#ifndef _DT_BINDINGS_CLK_QCOM_GPU_CC_QCM2290_H
+#define _DT_BINDINGS_CLK_QCOM_GPU_CC_QCM2290_H
+
+/* GPU_CC clocks */
+#define GPU_CC_AHB_CLK 0
+#define GPU_CC_CRC_AHB_CLK 1
+#define GPU_CC_CX_GFX3D_CLK 2
+#define GPU_CC_CX_GMU_CLK 3
+#define GPU_CC_CX_SNOC_DVM_CLK 4
+#define GPU_CC_CXO_AON_CLK 5
+#define GPU_CC_CXO_CLK 6
+#define GPU_CC_GMU_CLK_SRC 7
+#define GPU_CC_GX_GFX3D_CLK 8
+#define GPU_CC_GX_GFX3D_CLK_SRC 9
+#define GPU_CC_PLL0 10
+#define GPU_CC_SLEEP_CLK 11
+#define GPU_CC_HLOS1_VOTE_GPU_SMMU_CLK 12
+
+/* Resets */
+#define GPU_GX_BCR 0
+
+/* GDSCs */
+#define GPU_CX_GDSC 0
+#define GPU_GX_GDSC 1
+
+#endif

--
2.43.2


2024-02-23 21:22:29

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v2 2/7] clk: qcom: clk-alpha-pll: Add HUAYRA_2290 support

Commit 134b55b7e19f ("clk: qcom: support Huayra type Alpha PLL")
introduced an entry to the alpha offsets array, but diving into QCM2290
downstream and some documentation, it turned out that the name Huayra
apparently has been used quite liberally across many chips, even with
noticeably different hardware.

Introduce another set of offsets and a new configure function for the
Huayra PLL found on QCM2290. This is required e.g. for the consumers
of GPUCC_PLL0 to properly start.

Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/clk/qcom/clk-alpha-pll.c | 47 ++++++++++++++++++++++++++++++++++++++++
drivers/clk/qcom/clk-alpha-pll.h | 3 +++
2 files changed, 50 insertions(+)

diff --git a/drivers/clk/qcom/clk-alpha-pll.c b/drivers/clk/qcom/clk-alpha-pll.c
index 8a412ef47e16..82b71f24ee7d 100644
--- a/drivers/clk/qcom/clk-alpha-pll.c
+++ b/drivers/clk/qcom/clk-alpha-pll.c
@@ -83,6 +83,19 @@ const u8 clk_alpha_pll_regs[][PLL_OFF_MAX_REGS] = {
[PLL_OFF_TEST_CTL_U] = 0x20,
[PLL_OFF_STATUS] = 0x24,
},
+ [CLK_ALPHA_PLL_TYPE_HUAYRA_2290] = {
+ [PLL_OFF_L_VAL] = 0x04,
+ [PLL_OFF_ALPHA_VAL] = 0x08,
+ [PLL_OFF_USER_CTL] = 0x0c,
+ [PLL_OFF_CONFIG_CTL] = 0x10,
+ [PLL_OFF_CONFIG_CTL_U] = 0x14,
+ [PLL_OFF_CONFIG_CTL_U1] = 0x18,
+ [PLL_OFF_TEST_CTL] = 0x1c,
+ [PLL_OFF_TEST_CTL_U] = 0x20,
+ [PLL_OFF_TEST_CTL_U1] = 0x24,
+ [PLL_OFF_OPMODE] = 0x28,
+ [PLL_OFF_STATUS] = 0x38,
+ },
[CLK_ALPHA_PLL_TYPE_BRAMMO] = {
[PLL_OFF_L_VAL] = 0x04,
[PLL_OFF_ALPHA_VAL] = 0x08,
@@ -779,6 +792,40 @@ static long clk_alpha_pll_round_rate(struct clk_hw *hw, unsigned long rate,
return clamp(rate, min_freq, max_freq);
}

+void clk_huayra_2290_pll_configure(struct clk_alpha_pll *pll, struct regmap *regmap,
+ const struct alpha_pll_config *config)
+{
+ u32 val;
+
+ clk_alpha_pll_write_config(regmap, PLL_CONFIG_CTL(pll), config->config_ctl_val);
+ clk_alpha_pll_write_config(regmap, PLL_CONFIG_CTL_U(pll), config->config_ctl_hi_val);
+ clk_alpha_pll_write_config(regmap, PLL_CONFIG_CTL_U1(pll), config->config_ctl_hi1_val);
+ clk_alpha_pll_write_config(regmap, PLL_TEST_CTL(pll), config->test_ctl_val);
+ clk_alpha_pll_write_config(regmap, PLL_TEST_CTL_U(pll), config->test_ctl_hi_val);
+ clk_alpha_pll_write_config(regmap, PLL_TEST_CTL_U1(pll), config->test_ctl_hi1_val);
+ clk_alpha_pll_write_config(regmap, PLL_L_VAL(pll), config->l);
+ clk_alpha_pll_write_config(regmap, PLL_ALPHA_VAL(pll), config->alpha);
+ clk_alpha_pll_write_config(regmap, PLL_USER_CTL(pll), config->user_ctl_val);
+
+ /* Set PLL_BYPASSNL */
+ regmap_update_bits(regmap, PLL_MODE(pll), PLL_BYPASSNL, PLL_BYPASSNL);
+ regmap_read(regmap, PLL_MODE(pll), &val);
+
+ /* Wait 5 us between setting BYPASS and deasserting reset */
+ udelay(5);
+
+ /* Take PLL out from reset state */
+ regmap_update_bits(regmap, PLL_MODE(pll), PLL_RESET_N, PLL_RESET_N);
+ regmap_read(regmap, PLL_MODE(pll), &val);
+
+ /* Wait 50us for PLL_LOCK_DET bit to go high */
+ usleep_range(50, 55);
+
+ /* Enable PLL output */
+ regmap_update_bits(regmap, PLL_MODE(pll), PLL_OUTCTRL, PLL_OUTCTRL);
+}
+EXPORT_SYMBOL(clk_huayra_2290_pll_configure);
+
static unsigned long
alpha_huayra_pll_calc_rate(u64 prate, u32 l, u32 a)
{
diff --git a/drivers/clk/qcom/clk-alpha-pll.h b/drivers/clk/qcom/clk-alpha-pll.h
index fb6d50263bb9..d1cd52158c17 100644
--- a/drivers/clk/qcom/clk-alpha-pll.h
+++ b/drivers/clk/qcom/clk-alpha-pll.h
@@ -15,6 +15,7 @@
enum {
CLK_ALPHA_PLL_TYPE_DEFAULT,
CLK_ALPHA_PLL_TYPE_HUAYRA,
+ CLK_ALPHA_PLL_TYPE_HUAYRA_2290,
CLK_ALPHA_PLL_TYPE_BRAMMO,
CLK_ALPHA_PLL_TYPE_FABIA,
CLK_ALPHA_PLL_TYPE_TRION,
@@ -191,6 +192,8 @@ extern const struct clk_ops clk_alpha_pll_rivian_evo_ops;

void clk_alpha_pll_configure(struct clk_alpha_pll *pll, struct regmap *regmap,
const struct alpha_pll_config *config);
+void clk_huayra_2290_pll_configure(struct clk_alpha_pll *pll, struct regmap *regmap,
+ const struct alpha_pll_config *config);
void clk_fabia_pll_configure(struct clk_alpha_pll *pll, struct regmap *regmap,
const struct alpha_pll_config *config);
void clk_trion_pll_configure(struct clk_alpha_pll *pll, struct regmap *regmap,

--
2.43.2


2024-02-23 21:22:55

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v2 3/7] clk: qcom: Add QCM2290 GPU clock controller driver

Add a driver for the GPU clock controller block found on the QCM2290 SoC.

Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/clk/qcom/Kconfig | 9 +
drivers/clk/qcom/Makefile | 1 +
drivers/clk/qcom/gpucc-qcm2290.c | 423 +++++++++++++++++++++++++++++++++++++++
3 files changed, 433 insertions(+)

diff --git a/drivers/clk/qcom/Kconfig b/drivers/clk/qcom/Kconfig
index 4580edbd13ea..d70ea4548755 100644
--- a/drivers/clk/qcom/Kconfig
+++ b/drivers/clk/qcom/Kconfig
@@ -65,6 +65,15 @@ config CLK_X1E80100_TCSRCC
Support for the TCSR clock controller on X1E80100 devices.
Say Y if you want to use peripheral devices such as SD/UFS.

+config CLK_QCM2290_GPUCC
+ tristate "QCM2290 Graphics Clock Controller"
+ depends on ARM64 || COMPILE_TEST
+ select CLK_QCM2290_GCC
+ help
+ Support for the graphics clock controller on QCM2290 devices.
+ Say Y if you want to support graphics controller devices and
+ functionality such as 3D graphics.
+
config QCOM_A53PLL
tristate "MSM8916 A53 PLL"
help
diff --git a/drivers/clk/qcom/Makefile b/drivers/clk/qcom/Makefile
index 1da65ca78e24..b8d49c054558 100644
--- a/drivers/clk/qcom/Makefile
+++ b/drivers/clk/qcom/Makefile
@@ -26,6 +26,7 @@ obj-$(CONFIG_CLK_X1E80100_DISPCC) += dispcc-x1e80100.o
obj-$(CONFIG_CLK_X1E80100_GCC) += gcc-x1e80100.o
obj-$(CONFIG_CLK_X1E80100_GPUCC) += gpucc-x1e80100.o
obj-$(CONFIG_CLK_X1E80100_TCSRCC) += tcsrcc-x1e80100.o
+obj-$(CONFIG_CLK_QCM2290_GPUCC) += gpucc-qcm2290.o
obj-$(CONFIG_IPQ_APSS_PLL) += apss-ipq-pll.o
obj-$(CONFIG_IPQ_APSS_6018) += apss-ipq6018.o
obj-$(CONFIG_IPQ_GCC_4019) += gcc-ipq4019.o
diff --git a/drivers/clk/qcom/gpucc-qcm2290.c b/drivers/clk/qcom/gpucc-qcm2290.c
new file mode 100644
index 000000000000..b6e20d63ac85
--- /dev/null
+++ b/drivers/clk/qcom/gpucc-qcm2290.c
@@ -0,0 +1,423 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2020, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2024, Linaro Limited
+ */
+
+#include <linux/clk-provider.h>
+#include <linux/mod_devicetable.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/pm_clock.h>
+#include <linux/pm_runtime.h>
+#include <linux/regmap.h>
+
+#include <dt-bindings/clock/qcom,qcm2290-gpucc.h>
+
+#include "clk-alpha-pll.h"
+#include "clk-branch.h"
+#include "clk-rcg.h"
+#include "clk-regmap.h"
+#include "clk-regmap-divider.h"
+#include "clk-regmap-mux.h"
+#include "clk-regmap-phy-mux.h"
+#include "gdsc.h"
+#include "reset.h"
+
+enum {
+ DT_GCC_AHB_CLK,
+ DT_BI_TCXO,
+ DT_GCC_GPU_GPLL0_CLK_SRC,
+ DT_GCC_GPU_GPLL0_DIV_CLK_SRC,
+};
+
+enum {
+ P_BI_TCXO,
+ P_GPLL0_OUT_MAIN,
+ P_GPLL0_OUT_MAIN_DIV,
+ P_GPU_CC_PLL0_2X_DIV_CLK_SRC,
+ P_GPU_CC_PLL0_OUT_AUX,
+ P_GPU_CC_PLL0_OUT_AUX2,
+ P_GPU_CC_PLL0_OUT_MAIN,
+};
+
+static const struct pll_vco huayra_vco[] = {
+ { 600000000, 3300000000, 0 },
+ { 600000000, 2200000000, 1 },
+};
+
+static const struct alpha_pll_config gpu_cc_pll0_config = {
+ .l = 0x25,
+ .config_ctl_val = 0x200d4828,
+ .config_ctl_hi_val = 0x6,
+ .test_ctl_val = GENMASK(28, 26),
+ .test_ctl_hi_val = BIT(14),
+ .user_ctl_val = 0xf,
+};
+
+static struct clk_alpha_pll gpu_cc_pll0 = {
+ .offset = 0x0,
+ .vco_table = huayra_vco,
+ .num_vco = ARRAY_SIZE(huayra_vco),
+ .regs = clk_alpha_pll_regs[CLK_ALPHA_PLL_TYPE_HUAYRA_2290],
+ .clkr = {
+ .hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_pll0",
+ .parent_data = &(const struct clk_parent_data) {
+ .index = DT_BI_TCXO,
+ },
+ .num_parents = 1,
+ .ops = &clk_alpha_pll_huayra_ops,
+ },
+ },
+};
+
+static const struct parent_map gpu_cc_parent_map_0[] = {
+ { P_BI_TCXO, 0 },
+ { P_GPU_CC_PLL0_OUT_MAIN, 1 },
+ { P_GPLL0_OUT_MAIN, 5 },
+ { P_GPLL0_OUT_MAIN_DIV, 6 },
+};
+
+static const struct clk_parent_data gpu_cc_parent_data_0[] = {
+ { .index = DT_BI_TCXO, },
+ { .hw = &gpu_cc_pll0.clkr.hw, },
+ { .index = DT_GCC_GPU_GPLL0_CLK_SRC, },
+ { .index = DT_GCC_GPU_GPLL0_DIV_CLK_SRC, },
+};
+
+static const struct parent_map gpu_cc_parent_map_1[] = {
+ { P_BI_TCXO, 0 },
+ { P_GPU_CC_PLL0_2X_DIV_CLK_SRC, 1 },
+ { P_GPU_CC_PLL0_OUT_AUX2, 2 },
+ { P_GPU_CC_PLL0_OUT_AUX, 3 },
+ { P_GPLL0_OUT_MAIN, 5 },
+};
+
+static const struct clk_parent_data gpu_cc_parent_data_1[] = {
+ { .index = DT_BI_TCXO, },
+ { .hw = &gpu_cc_pll0.clkr.hw, },
+ { .hw = &gpu_cc_pll0.clkr.hw, },
+ { .hw = &gpu_cc_pll0.clkr.hw, },
+ { .index = DT_GCC_GPU_GPLL0_CLK_SRC, },
+};
+
+static const struct freq_tbl ftbl_gpu_cc_gmu_clk_src[] = {
+ F(200000000, P_GPLL0_OUT_MAIN, 3, 0, 0),
+ { }
+};
+
+static struct clk_rcg2 gpu_cc_gmu_clk_src = {
+ .cmd_rcgr = 0x1120,
+ .mnd_width = 0,
+ .hid_width = 5,
+ .parent_map = gpu_cc_parent_map_0,
+ .freq_tbl = ftbl_gpu_cc_gmu_clk_src,
+ .clkr.hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_gmu_clk_src",
+ .parent_data = gpu_cc_parent_data_0,
+ .num_parents = ARRAY_SIZE(gpu_cc_parent_data_0),
+ .flags = CLK_SET_RATE_PARENT,
+ .ops = &clk_rcg2_shared_ops,
+ },
+};
+
+static const struct freq_tbl ftbl_gpu_cc_gx_gfx3d_clk_src[] = {
+ F(355200000, P_GPU_CC_PLL0_OUT_AUX, 2, 0, 0),
+ F(537600000, P_GPU_CC_PLL0_OUT_AUX2, 2, 0, 0),
+ F(672000000, P_GPU_CC_PLL0_OUT_AUX2, 2, 0, 0),
+ F(844800000, P_GPU_CC_PLL0_OUT_AUX2, 2, 0, 0),
+ F(921600000, P_GPU_CC_PLL0_OUT_AUX2, 2, 0, 0),
+ F(1017600000, P_GPU_CC_PLL0_OUT_AUX2, 2, 0, 0),
+ F(1123200000, P_GPU_CC_PLL0_OUT_AUX2, 2, 0, 0),
+ { }
+};
+
+static struct clk_rcg2 gpu_cc_gx_gfx3d_clk_src = {
+ .cmd_rcgr = 0x101c,
+ .mnd_width = 0,
+ .hid_width = 5,
+ .parent_map = gpu_cc_parent_map_1,
+ .freq_tbl = ftbl_gpu_cc_gx_gfx3d_clk_src,
+ .clkr.hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_gx_gfx3d_clk_src",
+ .parent_data = gpu_cc_parent_data_1,
+ .num_parents = ARRAY_SIZE(gpu_cc_parent_data_1),
+ .flags = CLK_SET_RATE_PARENT,
+ .ops = &clk_rcg2_ops,
+ },
+};
+
+static struct clk_branch gpu_cc_ahb_clk = {
+ .halt_reg = 0x1078,
+ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x1078,
+ .enable_mask = BIT(0),
+ .hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_ahb_clk",
+ .flags = CLK_IS_CRITICAL,
+ .ops = &clk_branch2_ops,
+ },
+ },
+};
+
+static struct clk_branch gpu_cc_crc_ahb_clk = {
+ .halt_reg = 0x107c,
+ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x107c,
+ .enable_mask = BIT(0),
+ .hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_crc_ahb_clk",
+ .ops = &clk_branch2_ops,
+ },
+ },
+};
+
+static struct clk_branch gpu_cc_cx_gfx3d_clk = {
+ .halt_reg = 0x10a4,
+ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x10a4,
+ .enable_mask = BIT(0),
+ .hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_cx_gfx3d_clk",
+ .parent_data = &(const struct clk_parent_data){
+ .hw = &gpu_cc_gx_gfx3d_clk_src.clkr.hw,
+ },
+ .num_parents = 1,
+ .flags = CLK_SET_RATE_PARENT,
+ .ops = &clk_branch2_ops,
+ },
+ },
+};
+
+static struct clk_branch gpu_cc_cx_gmu_clk = {
+ .halt_reg = 0x1098,
+ .halt_check = BRANCH_HALT,
+ .clkr = {
+ .enable_reg = 0x1098,
+ .enable_mask = BIT(0),
+ .hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_cx_gmu_clk",
+ .parent_data = &(const struct clk_parent_data){
+ .hw = &gpu_cc_gmu_clk_src.clkr.hw,
+ },
+ .num_parents = 1,
+ .flags = CLK_SET_RATE_PARENT,
+ .ops = &clk_branch2_ops,
+ },
+ },
+};
+
+static struct clk_branch gpu_cc_cx_snoc_dvm_clk = {
+ .halt_reg = 0x108c,
+ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x108c,
+ .enable_mask = BIT(0),
+ .hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_cx_snoc_dvm_clk",
+ .ops = &clk_branch2_ops,
+ },
+ },
+};
+
+static struct clk_branch gpu_cc_cxo_aon_clk = {
+ .halt_reg = 0x1004,
+ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x1004,
+ .enable_mask = BIT(0),
+ .hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_cxo_aon_clk",
+ .ops = &clk_branch2_ops,
+ },
+ },
+};
+
+static struct clk_branch gpu_cc_cxo_clk = {
+ .halt_reg = 0x109c,
+ .halt_check = BRANCH_HALT,
+ .clkr = {
+ .enable_reg = 0x109c,
+ .enable_mask = BIT(0),
+ .hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_cxo_clk",
+ .ops = &clk_branch2_ops,
+ },
+ },
+};
+
+static struct clk_branch gpu_cc_gx_gfx3d_clk = {
+ .halt_reg = 0x1054,
+ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x1054,
+ .enable_mask = BIT(0),
+ .hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_gx_gfx3d_clk",
+ .parent_data = &(const struct clk_parent_data){
+ .hw = &gpu_cc_gx_gfx3d_clk_src.clkr.hw,
+ },
+ .num_parents = 1,
+ .flags = CLK_SET_RATE_PARENT,
+ .ops = &clk_branch2_ops,
+ },
+ },
+};
+
+static struct clk_branch gpu_cc_sleep_clk = {
+ .halt_reg = 0x1090,
+ .halt_check = BRANCH_VOTED,
+ .clkr = {
+ .enable_reg = 0x1090,
+ .enable_mask = BIT(0),
+ .hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_sleep_clk",
+ .ops = &clk_branch2_ops,
+ },
+ },
+};
+
+static struct clk_branch gpu_cc_hlos1_vote_gpu_smmu_clk = {
+ .halt_reg = 0x5000,
+ .halt_check = BRANCH_VOTED,
+ .clkr = {
+ .enable_reg = 0x5000,
+ .enable_mask = BIT(0),
+ .hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_hlos1_vote_gpu_smmu_clk",
+ .ops = &clk_branch2_ops,
+ },
+ },
+};
+
+static struct gdsc gpu_cx_gdsc = {
+ .gdscr = 0x106c,
+ .gds_hw_ctrl = 0x1540,
+ .pd = {
+ .name = "gpu_cx_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
+ .flags = VOTABLE,
+};
+
+static struct gdsc gpu_gx_gdsc = {
+ .gdscr = 0x100c,
+ .clamp_io_ctrl = 0x1508,
+ .resets = (unsigned int []){ GPU_GX_BCR },
+ .reset_count = 1,
+ .pd = {
+ .name = "gpu_gx_gdsc",
+ },
+ .parent = &gpu_cx_gdsc.pd,
+ .pwrsts = PWRSTS_OFF_ON,
+ .flags = CLAMP_IO | AON_RESET | SW_RESET,
+};
+
+static struct clk_regmap *gpu_cc_qcm2290_clocks[] = {
+ [GPU_CC_AHB_CLK] = &gpu_cc_ahb_clk.clkr,
+ [GPU_CC_CRC_AHB_CLK] = &gpu_cc_crc_ahb_clk.clkr,
+ [GPU_CC_CX_GFX3D_CLK] = &gpu_cc_cx_gfx3d_clk.clkr,
+ [GPU_CC_CX_GMU_CLK] = &gpu_cc_cx_gmu_clk.clkr,
+ [GPU_CC_CX_SNOC_DVM_CLK] = &gpu_cc_cx_snoc_dvm_clk.clkr,
+ [GPU_CC_CXO_AON_CLK] = &gpu_cc_cxo_aon_clk.clkr,
+ [GPU_CC_CXO_CLK] = &gpu_cc_cxo_clk.clkr,
+ [GPU_CC_GMU_CLK_SRC] = &gpu_cc_gmu_clk_src.clkr,
+ [GPU_CC_GX_GFX3D_CLK] = &gpu_cc_gx_gfx3d_clk.clkr,
+ [GPU_CC_GX_GFX3D_CLK_SRC] = &gpu_cc_gx_gfx3d_clk_src.clkr,
+ [GPU_CC_PLL0] = &gpu_cc_pll0.clkr,
+ [GPU_CC_SLEEP_CLK] = &gpu_cc_sleep_clk.clkr,
+ [GPU_CC_HLOS1_VOTE_GPU_SMMU_CLK] = &gpu_cc_hlos1_vote_gpu_smmu_clk.clkr,
+};
+
+static const struct qcom_reset_map gpu_cc_qcm2290_resets[] = {
+ [GPU_GX_BCR] = { 0x1008 },
+};
+
+static struct gdsc *gpu_cc_qcm2290_gdscs[] = {
+ [GPU_CX_GDSC] = &gpu_cx_gdsc,
+ [GPU_GX_GDSC] = &gpu_gx_gdsc,
+};
+
+static const struct regmap_config gpu_cc_qcm2290_regmap_config = {
+ .reg_bits = 32,
+ .reg_stride = 4,
+ .val_bits = 32,
+ .max_register = 0x9000,
+ .fast_io = true,
+};
+
+
+static const struct qcom_cc_desc gpu_cc_qcm2290_desc = {
+ .config = &gpu_cc_qcm2290_regmap_config,
+ .clks = gpu_cc_qcm2290_clocks,
+ .num_clks = ARRAY_SIZE(gpu_cc_qcm2290_clocks),
+ .resets = gpu_cc_qcm2290_resets,
+ .num_resets = ARRAY_SIZE(gpu_cc_qcm2290_resets),
+ .gdscs = gpu_cc_qcm2290_gdscs,
+ .num_gdscs = ARRAY_SIZE(gpu_cc_qcm2290_gdscs),
+};
+
+static const struct of_device_id gpu_cc_qcm2290_match_table[] = {
+ { .compatible = "qcom,qcm2290-gpucc" },
+ { }
+};
+MODULE_DEVICE_TABLE(of, gpu_cc_qcm2290_match_table);
+
+static int gpu_cc_qcm2290_probe(struct platform_device *pdev)
+{
+ struct regmap *regmap;
+ int ret;
+
+ regmap = qcom_cc_map(pdev, &gpu_cc_qcm2290_desc);
+ if (IS_ERR(regmap))
+ return PTR_ERR(regmap);
+
+ ret = devm_pm_runtime_enable(&pdev->dev);
+ if (ret)
+ return ret;
+
+ ret = devm_pm_clk_create(&pdev->dev);
+ if (ret)
+ return ret;
+
+ ret = pm_clk_add(&pdev->dev, NULL);
+ if (ret < 0) {
+ dev_err(&pdev->dev, "failed to acquire ahb clock\n");
+ return ret;
+ }
+
+ ret = pm_runtime_resume_and_get(&pdev->dev);
+ if (ret)
+ return ret;
+
+ clk_huayra_2290_pll_configure(&gpu_cc_pll0, regmap, &gpu_cc_pll0_config);
+
+ regmap_update_bits(regmap, 0x1060, BIT(0), BIT(0)); /* GPU_CC_GX_CXO_CLK */
+
+ ret = qcom_cc_really_probe(pdev, &gpu_cc_qcm2290_desc, regmap);
+ if (ret) {
+ dev_err(&pdev->dev, "Failed to register display clock controller\n");
+ goto out_pm_runtime_put;
+ }
+
+out_pm_runtime_put:
+ pm_runtime_put_sync(&pdev->dev);
+
+ return 0;
+}
+
+static struct platform_driver gpu_cc_qcm2290_driver = {
+ .probe = gpu_cc_qcm2290_probe,
+ .driver = {
+ .name = "gpucc-qcm2290",
+ .of_match_table = gpu_cc_qcm2290_match_table,
+ },
+};
+module_platform_driver(gpu_cc_qcm2290_driver);
+
+MODULE_DESCRIPTION("QTI QCM2290 GPU clock controller driver");
+MODULE_LICENSE("GPL");

--
2.43.2


2024-02-23 21:23:02

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v2 4/7] drm/msm/adreno: Add missing defines for A702

Add some defines required for A702. Can be substituted with a header
sync after merging mesa!27665 [1].

[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27665
Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx.xml.h | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx.xml.h b/drivers/gpu/drm/msm/adreno/a6xx.xml.h
index 863b5e3b0e67..1ec4dbc0e746 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx.xml.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx.xml.h
@@ -1945,6 +1945,24 @@ static inline uint32_t REG_A6XX_RBBM_PERFCTR_RBBM_SEL(uint32_t i0) { return 0x00

#define REG_A6XX_RBBM_CLOCK_HYST_TEX_FCHE 0x00000122

+#define REG_A6XX_RBBM_CLOCK_CNTL_FCHE 0x00000123
+
+#define REG_A6XX_RBBM_CLOCK_DELAY_FCHE 0x00000124
+
+#define REG_A6XX_RBBM_CLOCK_HYST_FCHE 0x00000125
+
+#define REG_A6XX_RBBM_CLOCK_CNTL_MHUB 0x00000126
+
+#define REG_A6XX_RBBM_CLOCK_DELAY_MHUB 0x00000127
+
+#define REG_A6XX_RBBM_CLOCK_HYST_MHUB 0x00000128
+
+#define REG_A6XX_RBBM_CLOCK_DELAY_GLC 0x00000129
+
+#define REG_A6XX_RBBM_CLOCK_HYST_GLC 0x0000012a
+
+#define REG_A6XX_RBBM_CLOCK_CNTL_GLC 0x0000012b
+
#define REG_A7XX_RBBM_CLOCK_HYST2_VFD 0x0000012f

#define REG_A6XX_RBBM_LPAC_GBIF_CLIENT_QOS_CNTL 0x000005ff

--
2.43.2


2024-02-23 21:23:19

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v2 5/7] drm/msm/adreno: Add A702 support

The A702 is a weird mix of 600 and 700 series.. Perhaps even a
testing ground for some A7xx features with good ol' A6xx silicon.
It's basically A610 that's been beefed up with some new registers
and hw features (like APRIV!), that was then cut back in size,
memory bus and some other ways.

Add support for it, tested with QCM2290 / RB1.

Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 92 +++++++++++++++++++++++++++---
drivers/gpu/drm/msm/adreno/adreno_device.c | 18 ++++++
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 16 +++++-
3 files changed, 117 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index c9c55e2ea584..2a491a486ca1 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -837,6 +837,65 @@ const struct adreno_reglist a690_hwcg[] = {
{}
};

+const struct adreno_reglist a702_hwcg[] = {
+ { REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x22222222 },
+ { REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x02222220 },
+ { REG_A6XX_RBBM_CLOCK_DELAY_SP0, 0x00000081 },
+ { REG_A6XX_RBBM_CLOCK_HYST_SP0, 0x0000f3cf },
+ { REG_A6XX_RBBM_CLOCK_CNTL_TP0, 0x22222222 },
+ { REG_A6XX_RBBM_CLOCK_CNTL2_TP0, 0x22222222 },
+ { REG_A6XX_RBBM_CLOCK_CNTL3_TP0, 0x22222222 },
+ { REG_A6XX_RBBM_CLOCK_CNTL4_TP0, 0x00022222 },
+ { REG_A6XX_RBBM_CLOCK_DELAY_TP0, 0x11111111 },
+ { REG_A6XX_RBBM_CLOCK_DELAY2_TP0, 0x11111111 },
+ { REG_A6XX_RBBM_CLOCK_DELAY3_TP0, 0x11111111 },
+ { REG_A6XX_RBBM_CLOCK_DELAY4_TP0, 0x00011111 },
+ { REG_A6XX_RBBM_CLOCK_HYST_TP0, 0x77777777 },
+ { REG_A6XX_RBBM_CLOCK_HYST2_TP0, 0x77777777 },
+ { REG_A6XX_RBBM_CLOCK_HYST3_TP0, 0x77777777 },
+ { REG_A6XX_RBBM_CLOCK_HYST4_TP0, 0x00077777 },
+ { REG_A6XX_RBBM_CLOCK_CNTL_RB0, 0x22222222 },
+ { REG_A6XX_RBBM_CLOCK_CNTL2_RB0, 0x01202222 },
+ { REG_A6XX_RBBM_CLOCK_CNTL_CCU0, 0x00002220 },
+ { REG_A6XX_RBBM_CLOCK_HYST_RB_CCU0, 0x00040f00 },
+ { REG_A6XX_RBBM_CLOCK_CNTL_RAC, 0x05522022 },
+ { REG_A6XX_RBBM_CLOCK_CNTL2_RAC, 0x00005555 },
+ { REG_A6XX_RBBM_CLOCK_DELAY_RAC, 0x00000011 },
+ { REG_A6XX_RBBM_CLOCK_HYST_RAC, 0x00445044 },
+ { REG_A6XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM, 0x04222222 },
+ { REG_A6XX_RBBM_CLOCK_MODE_VFD, 0x00002222 },
+ { REG_A6XX_RBBM_CLOCK_MODE_GPC, 0x02222222 },
+ { REG_A6XX_RBBM_CLOCK_DELAY_HLSQ_2, 0x00000002 },
+ { REG_A6XX_RBBM_CLOCK_MODE_HLSQ, 0x00002222 },
+ { REG_A6XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM, 0x00004000 },
+ { REG_A6XX_RBBM_CLOCK_DELAY_VFD, 0x00002222 },
+ { REG_A6XX_RBBM_CLOCK_DELAY_GPC, 0x00000200 },
+ { REG_A6XX_RBBM_CLOCK_DELAY_HLSQ, 0x00000000 },
+ { REG_A6XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM, 0x00000000 },
+ { REG_A6XX_RBBM_CLOCK_HYST_VFD, 0x00000000 },
+ { REG_A6XX_RBBM_CLOCK_HYST_GPC, 0x04104004 },
+ { REG_A6XX_RBBM_CLOCK_HYST_HLSQ, 0x00000000 },
+ { REG_A6XX_RBBM_CLOCK_CNTL_UCHE, 0x22222222 },
+ { REG_A6XX_RBBM_CLOCK_HYST_UCHE, 0x00000004 },
+ { REG_A6XX_RBBM_CLOCK_DELAY_UCHE, 0x00000002 },
+ { REG_A6XX_RBBM_ISDB_CNT, 0x00000182 },
+ { REG_A6XX_RBBM_RAC_THRESHOLD_CNT, 0x00000000 },
+ { REG_A6XX_RBBM_SP_HYST_CNT, 0x00000000 },
+ { REG_A6XX_RBBM_CLOCK_CNTL_GMU_GX, 0x00000222 },
+ { REG_A6XX_RBBM_CLOCK_DELAY_GMU_GX, 0x00000111 },
+ { REG_A6XX_RBBM_CLOCK_HYST_GMU_GX, 0x00000555 },
+ { REG_A6XX_RBBM_CLOCK_CNTL_FCHE, 0x00000222 },
+ { REG_A6XX_RBBM_CLOCK_DELAY_FCHE, 0x00000000 },
+ { REG_A6XX_RBBM_CLOCK_HYST_FCHE, 0x00000000 },
+ { REG_A6XX_RBBM_CLOCK_CNTL_GLC, 0x00222222 },
+ { REG_A6XX_RBBM_CLOCK_DELAY_GLC, 0x00000000 },
+ { REG_A6XX_RBBM_CLOCK_HYST_GLC, 0x00000000 },
+ { REG_A6XX_RBBM_CLOCK_CNTL_MHUB, 0x00000002 },
+ { REG_A6XX_RBBM_CLOCK_DELAY_MHUB, 0x00000000 },
+ { REG_A6XX_RBBM_CLOCK_HYST_MHUB, 0x00000000 },
+ {}
+};
+
const struct adreno_reglist a730_hwcg[] = {
{ REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x02222222 },
{ REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x02022222 },
@@ -968,6 +1027,8 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)
clock_cntl_on = 0x8aa8aa02;
else if (adreno_is_a610(adreno_gpu))
clock_cntl_on = 0xaaa8aa82;
+ else if (adreno_is_a702(adreno_gpu))
+ clock_cntl_on = 0xaaaaaa82;
else
clock_cntl_on = 0x8aa8aa82;

@@ -989,14 +1050,14 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)
return;

/* Disable SP clock before programming HWCG registers */
- if (!adreno_is_a610(adreno_gpu) && !adreno_is_a7xx(adreno_gpu))
+ if (!adreno_is_a610_family(adreno_gpu) && !adreno_is_a7xx(adreno_gpu))
gmu_rmw(gmu, REG_A6XX_GPU_GMU_GX_SPTPRAC_CLOCK_CONTROL, 1, 0);

for (i = 0; (reg = &adreno_gpu->info->hwcg[i], reg->offset); i++)
gpu_write(gpu, reg->offset, state ? reg->value : 0);

/* Enable SP clock */
- if (!adreno_is_a610(adreno_gpu) && !adreno_is_a7xx(adreno_gpu))
+ if (!adreno_is_a610_family(adreno_gpu) && !adreno_is_a7xx(adreno_gpu))
gmu_rmw(gmu, REG_A6XX_GPU_GMU_GX_SPTPRAC_CLOCK_CONTROL, 0, 1);

gpu_write(gpu, REG_A6XX_RBBM_CLOCK_CNTL, state ? clock_cntl_on : 0);
@@ -1224,7 +1285,7 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
const u32 *regs = a6xx_protect;
unsigned i, count, count_max;

- if (adreno_is_a650(adreno_gpu)) {
+ if (adreno_is_a650(adreno_gpu) || adreno_is_a702(adreno_gpu)) {
regs = a650_protect;
count = ARRAY_SIZE(a650_protect);
count_max = 48;
@@ -1320,6 +1381,12 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu *gpu)
gpu->ubwc_config.rgb565_predicator = 1;
gpu->ubwc_config.uavflagprd_inv = 2;
}
+
+ if (adreno_is_a702(gpu)) {
+ gpu->ubwc_config.highest_bank_bit = 14;
+ gpu->ubwc_config.min_acc_len = 1;
+ gpu->ubwc_config.ubwc_mode = 2;
+ }
}

static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
@@ -1453,7 +1520,7 @@ static bool a6xx_ucode_check_version(struct a6xx_gpu *a6xx_gpu,
return false;

/* A7xx is safe! */
- if (adreno_is_a7xx(adreno_gpu))
+ if (adreno_is_a7xx(adreno_gpu) || adreno_is_a702(adreno_gpu))
return true;

/*
@@ -1671,7 +1738,7 @@ static int hw_init(struct msm_gpu *gpu)
a6xx_set_hwcg(gpu, true);

/* VBIF/GBIF start*/
- if (adreno_is_a610(adreno_gpu) ||
+ if (adreno_is_a610_family(adreno_gpu) ||
adreno_is_a640_family(adreno_gpu) ||
adreno_is_a650_family(adreno_gpu) ||
adreno_is_a7xx(adreno_gpu)) {
@@ -1705,6 +1772,7 @@ static int hw_init(struct msm_gpu *gpu)
}

if (!(adreno_is_a650_family(adreno_gpu) ||
+ adreno_is_a702(adreno_gpu) ||
adreno_is_a730(adreno_gpu))) {
gmem_range_min = adreno_is_a740_family(adreno_gpu) ? SZ_16M : SZ_1M;

@@ -1725,7 +1793,7 @@ static int hw_init(struct msm_gpu *gpu)
if (adreno_is_a640_family(adreno_gpu) || adreno_is_a650_family(adreno_gpu)) {
gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_2, 0x02000140);
gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_1, 0x8040362c);
- } else if (adreno_is_a610(adreno_gpu)) {
+ } else if (adreno_is_a610_family(adreno_gpu)) {
gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_2, 0x00800060);
gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_1, 0x40201b16);
} else if (!adreno_is_a7xx(adreno_gpu)) {
@@ -1740,13 +1808,18 @@ static int hw_init(struct msm_gpu *gpu)
if (adreno_is_a610(adreno_gpu)) {
gpu_write(gpu, REG_A6XX_CP_MEM_POOL_SIZE, 48);
gpu_write(gpu, REG_A6XX_CP_MEM_POOL_DBG_ADDR, 47);
+ } else if (adreno_is_a702(adreno_gpu)) {
+ gpu_write(gpu, REG_A6XX_CP_MEM_POOL_SIZE, 64);
+ gpu_write(gpu, REG_A6XX_CP_MEM_POOL_DBG_ADDR, 63);
} else if (!adreno_is_a7xx(adreno_gpu))
gpu_write(gpu, REG_A6XX_CP_MEM_POOL_SIZE, 128);

/* Setting the primFifo thresholds default values,
* and vccCacheSkipDis=1 bit (0x200) for A640 and newer
*/
- if (adreno_is_a690(adreno_gpu))
+ if (adreno_is_a702(adreno_gpu))
+ gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x0000c000);
+ else if (adreno_is_a690(adreno_gpu))
gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00800200);
else if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu))
gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00300200);
@@ -1786,7 +1859,7 @@ static int hw_init(struct msm_gpu *gpu)
gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) | 0x4fffff);
else if (adreno_is_a619(adreno_gpu))
gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) | 0x3fffff);
- else if (adreno_is_a610(adreno_gpu))
+ else if (adreno_is_a610(adreno_gpu) || adreno_is_a702(adreno_gpu))
gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) | 0x3ffff);
else
gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) | 0x1fffff);
@@ -1822,6 +1895,9 @@ static int hw_init(struct msm_gpu *gpu)
else
gpu_write(gpu, REG_A6XX_CP_CHICKEN_DBG, 0x1);
gpu_write(gpu, REG_A6XX_RBBM_GBIF_CLIENT_QOS_CNTL, 0x0);
+ } else if (adreno_is_a702(adreno_gpu)) {
+ /* Something to do with the HLSQ cluster */
+ gpu_write(gpu, REG_A6XX_CP_CHICKEN_DBG, BIT(24));
}

if (adreno_is_a690(adreno_gpu))
diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 2ce7d7b1690d..b121abc71338 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -492,6 +492,24 @@ static const struct adreno_info gpulist[] = {
.zapfw = "a690_zap.mdt",
.hwcg = a690_hwcg,
.address_space_size = SZ_16G,
+ }, {
+ .chip_ids = ADRENO_CHIP_IDS(0x07000200),
+ .family = ADRENO_6XX_GEN1, /* NOT a mistake! */
+ .fw = {
+ [ADRENO_FW_SQE] = "a702_sqe.fw",
+ },
+ .gmem = SZ_128K,
+ .inactive_period = DRM_MSM_INACTIVE_PERIOD,
+ .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
+ .init = a6xx_gpu_init,
+ .zapfw = "a702_zap.mbn",
+ .hwcg = a702_hwcg,
+ .speedbins = ADRENO_SPEEDBINS(
+ { 0, 0 },
+ { 236, 1 },
+ { 178, 2 },
+ { 142, 3 },
+ ),
}, {
.chip_ids = ADRENO_CHIP_IDS(0x07030001),
.family = ADRENO_7XX_GEN1,
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index bc14df96feb0..f451881a6ddf 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -77,7 +77,7 @@ struct adreno_reglist {
};

extern const struct adreno_reglist a612_hwcg[], a615_hwcg[], a630_hwcg[], a640_hwcg[], a650_hwcg[];
-extern const struct adreno_reglist a660_hwcg[], a690_hwcg[], a730_hwcg[], a740_hwcg[];
+extern const struct adreno_reglist a660_hwcg[], a690_hwcg[], a702_hwcg[], a730_hwcg[], a740_hwcg[];

struct adreno_speedbin {
uint16_t fuse;
@@ -382,6 +382,20 @@ static inline int adreno_is_a690(const struct adreno_gpu *gpu)
return gpu->info->chip_ids[0] == 0x06090000;
}

+static inline int adreno_is_a702(const struct adreno_gpu *gpu)
+{
+ return gpu->info->chip_ids[0] == 0x07000200;
+}
+
+static inline int adreno_is_a610_family(const struct adreno_gpu *gpu)
+{
+ if (WARN_ON_ONCE(!gpu->info))
+ return false;
+
+ /* TODO: A612 */
+ return adreno_is_a610(gpu) || adreno_is_a702(gpu);
+}
+
/* check for a615, a616, a618, a619 or any a630 derivatives */
static inline int adreno_is_a630_family(const struct adreno_gpu *gpu)
{

--
2.43.2


2024-02-23 21:23:34

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v2 6/7] arm64: dts: qcom: qcm2290: Add GPU nodes

Describe the GPU hardware on the QCM2290.

Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Konrad Dybcio <[email protected]>
---
arch/arm64/boot/dts/qcom/qcm2290.dtsi | 154 ++++++++++++++++++++++++++++++++++
1 file changed, 154 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/qcm2290.dtsi b/arch/arm64/boot/dts/qcom/qcm2290.dtsi
index 89beac833d43..ec5aef5d9c69 100644
--- a/arch/arm64/boot/dts/qcom/qcm2290.dtsi
+++ b/arch/arm64/boot/dts/qcom/qcm2290.dtsi
@@ -7,6 +7,7 @@

#include <dt-bindings/clock/qcom,dispcc-qcm2290.h>
#include <dt-bindings/clock/qcom,gcc-qcm2290.h>
+#include <dt-bindings/clock/qcom,qcm2290-gpucc.h>
#include <dt-bindings/clock/qcom,rpmcc.h>
#include <dt-bindings/dma/qcom-gpi.h>
#include <dt-bindings/firmware/qcom,scm.h>
@@ -737,6 +738,11 @@ qusb2_hstx_trim: hstx-trim@25b {
reg = <0x25b 0x1>;
bits = <1 4>;
};
+
+ gpu_speed_bin: gpu-speed-bin@2006 {
+ reg = <0x2006 0x2>;
+ bits = <5 8>;
+ };
};

pmu@1b8e300 {
@@ -1383,6 +1389,154 @@ usb_dwc3: usb@4e00000 {
};
};

+ gpu: gpu@5900000 {
+ compatible = "qcom,adreno-07000200", "qcom,adreno";
+ reg = <0x0 0x05900000 0x0 0x40000>;
+ reg-names = "kgsl_3d0_reg_memory";
+
+ interrupts = <GIC_SPI 177 IRQ_TYPE_LEVEL_HIGH>;
+
+ clocks = <&gpucc GPU_CC_GX_GFX3D_CLK>,
+ <&gpucc GPU_CC_AHB_CLK>,
+ <&gcc GCC_BIMC_GPU_AXI_CLK>,
+ <&gcc GCC_GPU_MEMNOC_GFX_CLK>,
+ <&gpucc GPU_CC_CX_GMU_CLK>,
+ <&gpucc GPU_CC_CXO_CLK>;
+ clock-names = "core",
+ "iface",
+ "mem_iface",
+ "alt_mem_iface",
+ "gmu",
+ "xo";
+
+ interconnects = <&bimc MASTER_GFX3D RPM_ALWAYS_TAG
+ &bimc SLAVE_EBI1 RPM_ALWAYS_TAG>;
+ interconnect-names = "gfx-mem";
+
+ iommus = <&adreno_smmu 0 1>,
+ <&adreno_smmu 2 0>;
+ operating-points-v2 = <&gpu_opp_table>;
+ power-domains = <&rpmpd QCM2290_VDDCX>;
+ qcom,gmu = <&gmu_wrapper>;
+
+ nvmem-cells = <&gpu_speed_bin>;
+ nvmem-cell-names = "speed_bin";
+ #cooling-cells = <2>;
+
+ status = "disabled";
+
+ zap-shader {
+ memory-region = <&pil_gpu_mem>;
+ };
+
+ gpu_opp_table: opp-table {
+ compatible = "operating-points-v2";
+
+ /* TODO: Scale RPM_SMD_BIMC_GPU_CLK w/ turbo freqs */
+ opp-1123200000 {
+ opp-hz = /bits/ 64 <1123200000>;
+ required-opps = <&rpmpd_opp_turbo_plus>;
+ opp-peak-kBps = <6881000>;
+ opp-supported-hw = <0x3>;
+ turbo-mode;
+ };
+
+ opp-1017600000 {
+ opp-hz = /bits/ 64 <1017600000>;
+ required-opps = <&rpmpd_opp_turbo>;
+ opp-peak-kBps = <6881000>;
+ opp-supported-hw = <0x3>;
+ turbo-mode;
+ };
+
+ opp-921600000 {
+ opp-hz = /bits/ 64 <921600000>;
+ required-opps = <&rpmpd_opp_nom_plus>;
+ opp-peak-kBps = <6881000>;
+ opp-supported-hw = <0x3>;
+ };
+
+ opp-844800000 {
+ opp-hz = /bits/ 64 <844800000>;
+ required-opps = <&rpmpd_opp_nom>;
+ opp-peak-kBps = <6881000>;
+ opp-supported-hw = <0x7>;
+ };
+
+ opp-672000000 {
+ opp-hz = /bits/ 64 <672000000>;
+ required-opps = <&rpmpd_opp_svs_plus>;
+ opp-peak-kBps = <3879000>;
+ opp-supported-hw = <0xf>;
+ };
+
+ opp-537600000 {
+ opp-hz = /bits/ 64 <537600000>;
+ required-opps = <&rpmpd_opp_svs>;
+ opp-peak-kBps = <2929000>;
+ opp-supported-hw = <0xf>;
+ };
+
+ opp-355200000 {
+ opp-hz = /bits/ 64 <355200000>;
+ required-opps = <&rpmpd_opp_low_svs>;
+ opp-peak-kBps = <1720000>;
+ opp-supported-hw = <0xf>;
+ };
+ };
+ };
+
+ gmu_wrapper: gmu@596a000 {
+ compatible = "qcom,adreno-gmu-wrapper";
+ reg = <0x0 0x0596a000 0x0 0x30000>;
+ reg-names = "gmu";
+ power-domains = <&gpucc GPU_CX_GDSC>,
+ <&gpucc GPU_GX_GDSC>;
+ power-domain-names = "cx",
+ "gx";
+ };
+
+ gpucc: clock-controller@5990000 {
+ compatible = "qcom,qcm2290-gpucc";
+ reg = <0x0 0x05990000 0x0 0x9000>;
+ clocks = <&gcc GCC_GPU_CFG_AHB_CLK>,
+ <&rpmcc RPM_SMD_XO_CLK_SRC>,
+ <&gcc GCC_GPU_GPLL0_CLK_SRC>,
+ <&gcc GCC_GPU_GPLL0_DIV_CLK_SRC>;
+ power-domains = <&rpmpd QCM2290_VDDCX>;
+ required-opps = <&rpmpd_opp_low_svs>;
+ #clock-cells = <1>;
+ #reset-cells = <1>;
+ #power-domain-cells = <1>;
+ };
+
+ adreno_smmu: iommu@59a0000 {
+ compatible = "qcom,qcm2290-smmu-500", "qcom,adreno-smmu",
+ "qcom,smmu-500", "arm,mmu-500";
+ reg = <0x0 0x059a0000 0x0 0x10000>;
+ interrupts = <GIC_SPI 163 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 167 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 168 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 169 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 171 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 172 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 173 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 174 IRQ_TYPE_LEVEL_HIGH>;
+
+ clocks = <&gcc GCC_GPU_MEMNOC_GFX_CLK>,
+ <&gpucc GPU_CC_HLOS1_VOTE_GPU_SMMU_CLK>,
+ <&gcc GCC_GPU_SNOC_DVM_GFX_CLK>;
+ clock-names = "mem",
+ "hlos",
+ "iface";
+
+ power-domains = <&gpucc GPU_CX_GDSC>;
+
+ #global-interrupts = <1>;
+ #iommu-cells = <2>;
+ };
+
mdss: display-subsystem@5e00000 {
compatible = "qcom,qcm2290-mdss";
reg = <0x0 0x05e00000 0x0 0x1000>;

--
2.43.2


2024-02-23 21:23:47

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v2 7/7] arm64: dts: qcom: qrb2210-rb1: Enable the GPU

Enable the A702 GPU (also marketed as "3D accelerator by qcom [1], lol).

[1] https://docs.qualcomm.com/bundle/publicresource/87-61720-1_REV_A_QUALCOMM_ROBOTICS_RB1_PLATFORM__QUALCOMM_QRB2210__PRODUCT_BRIEF.pdf

Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Konrad Dybcio <[email protected]>
---
arch/arm64/boot/dts/qcom/qrb2210-rb1.dts | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/qrb2210-rb1.dts b/arch/arm64/boot/dts/qcom/qrb2210-rb1.dts
index 6e9dd0312adc..c9abca5a7e39 100644
--- a/arch/arm64/boot/dts/qcom/qrb2210-rb1.dts
+++ b/arch/arm64/boot/dts/qcom/qrb2210-rb1.dts
@@ -199,6 +199,14 @@ &gpi_dma0 {
status = "okay";
};

+&gpu {
+ status = "okay";
+
+ zap-shader {
+ firmware-name = "qcom/qcm2290/a702_zap.mbn";
+ };
+};
+
&i2c2 {
clock-frequency = <400000>;
status = "okay";

--
2.43.2


2024-02-23 21:24:48

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH v2 2/7] clk: qcom: clk-alpha-pll: Add HUAYRA_2290 support

On Fri, 23 Feb 2024 at 23:21, Konrad Dybcio <[email protected]> wrote:
>
> Commit 134b55b7e19f ("clk: qcom: support Huayra type Alpha PLL")
> introduced an entry to the alpha offsets array, but diving into QCM2290
> downstream and some documentation, it turned out that the name Huayra
> apparently has been used quite liberally across many chips, even with
> noticeably different hardware.
>
> Introduce another set of offsets and a new configure function for the
> Huayra PLL found on QCM2290. This is required e.g. for the consumers
> of GPUCC_PLL0 to properly start.
>
> Signed-off-by: Konrad Dybcio <[email protected]>
> ---
> drivers/clk/qcom/clk-alpha-pll.c | 47 ++++++++++++++++++++++++++++++++++++++++
> drivers/clk/qcom/clk-alpha-pll.h | 3 +++
> 2 files changed, 50 insertions(+)

Reviewed-by: Dmitry Baryshkov <[email protected]>

--
With best wishes
Dmitry

2024-02-23 21:25:04

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH v2 4/7] drm/msm/adreno: Add missing defines for A702

On Fri, 23 Feb 2024 at 23:21, Konrad Dybcio <[email protected]> wrote:
>
> Add some defines required for A702. Can be substituted with a header
> sync after merging mesa!27665 [1].
>
> [1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27665
> Signed-off-by: Konrad Dybcio <[email protected]>
> ---
> drivers/gpu/drm/msm/adreno/a6xx.xml.h | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)

Reviewed-by: Dmitry Baryshkov <[email protected]>

--
With best wishes
Dmitry

2024-02-23 22:49:38

by Trilok Soni

[permalink] [raw]
Subject: Re: [PATCH v2 2/7] clk: qcom: clk-alpha-pll: Add HUAYRA_2290 support

On 2/23/2024 1:21 PM, Konrad Dybcio wrote:
> + /* Wait 50us for PLL_LOCK_DET bit to go high */
> + usleep_range(50, 55);
> +
> + /* Enable PLL output */
> + regmap_update_bits(regmap, PLL_MODE(pll), PLL_OUTCTRL, PLL_OUTCTRL);
> +}
> +EXPORT_SYMBOL(clk_huayra_2290_pll_configure);

Please use EXPORT_SYMBOL_GPL.

--
---Trilok Soni


2024-02-24 10:20:47

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v2 1/7] dt-bindings: clock: Add Qcom QCM2290 GPUCC

On 23/02/2024 22:21, Konrad Dybcio wrote:
> Add device tree bindings for graphics clock controller for Qualcomm
> Technology Inc's QCM2290 SoCs.
>
> Signed-off-by: Konrad Dybcio <[email protected]>
> ---

Reviewed-by: Krzysztof Kozlowski <[email protected]>

Best regards,
Krzysztof


2024-02-27 10:12:12

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2 0/7] A702 support

On Fri, Feb 23, 2024 at 10:21:36PM +0100, Konrad Dybcio wrote:
> Bit of a megaseries, bunched together for your testing convenience..
> Needs mesa!27665 [1] on the userland part, kmscube happily spins.
>
> I'm feeling quite lukewarm about the memory barriers in patch 3..
>
> Patch 1 for Will/smmu, 5-6 for drm/msm, rest for qcom

I'm guessing you don't really expect me to take the clock bindings?!

Will

2024-02-27 10:12:56

by Konrad Dybcio

[permalink] [raw]
Subject: Re: [PATCH v2 0/7] A702 support



On 2/27/24 11:10, Will Deacon wrote:
> On Fri, Feb 23, 2024 at 10:21:36PM +0100, Konrad Dybcio wrote:
>> Bit of a megaseries, bunched together for your testing convenience..
>> Needs mesa!27665 [1] on the userland part, kmscube happily spins.
>>
>> I'm feeling quite lukewarm about the memory barriers in patch 3..
>>
>> Patch 1 for Will/smmu, 5-6 for drm/msm, rest for qcom
>
> I'm guessing you don't really expect me to take the clock bindings?!

Sorry, I didn't remove this hunk from v1 (where it was smmu changes
that you already took)!

Konrad

2024-03-12 16:14:53

by Konrad Dybcio

[permalink] [raw]
Subject: Re: [PATCH v2 2/7] clk: qcom: clk-alpha-pll: Add HUAYRA_2290 support



On 2/23/24 23:48, Trilok Soni wrote:
> On 2/23/2024 1:21 PM, Konrad Dybcio wrote:
>> + /* Wait 50us for PLL_LOCK_DET bit to go high */
>> + usleep_range(50, 55);
>> +
>> + /* Enable PLL output */
>> + regmap_update_bits(regmap, PLL_MODE(pll), PLL_OUTCTRL, PLL_OUTCTRL);
>> +}
>> +EXPORT_SYMBOL(clk_huayra_2290_pll_configure);
>
> Please use EXPORT_SYMBOL_GPL.

Sure, I glanced over this!

I've also noticed that it's a very common oversight.. would you be
interested in extending scripts/checkpatch.pl to suggest the _GPL
variant?

Konrad

2024-05-23 12:25:58

by Connor Abbott

[permalink] [raw]
Subject: Re: [PATCH v2 5/7] drm/msm/adreno: Add A702 support

On Fri, Feb 23, 2024 at 9:28 PM Konrad Dybcio <konrad.dybcio@linaroorg> wrote:
>
> The A702 is a weird mix of 600 and 700 series.. Perhaps even a
> testing ground for some A7xx features with good ol' A6xx silicon.
> It's basically A610 that's been beefed up with some new registers
> and hw features (like APRIV!), that was then cut back in size,
> memory bus and some other ways.
>
> Add support for it, tested with QCM2290 / RB1.
>
> Signed-off-by: Konrad Dybcio <[email protected]>
> ---
> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 92 +++++++++++++++++++++++++++---
> drivers/gpu/drm/msm/adreno/adreno_device.c | 18 ++++++
> drivers/gpu/drm/msm/adreno/adreno_gpu.h | 16 +++++-
> 3 files changed, 117 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c9c55e2ea584..2a491a486ca1 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -837,6 +837,65 @@ const struct adreno_reglist a690_hwcg[] = {
> {}
> };
>
> +const struct adreno_reglist a702_hwcg[] = {
> + { REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x22222222 },
> + { REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x02222220 },
> + { REG_A6XX_RBBM_CLOCK_DELAY_SP0, 0x00000081 },
> + { REG_A6XX_RBBM_CLOCK_HYST_SP0, 0x0000f3cf },
> + { REG_A6XX_RBBM_CLOCK_CNTL_TP0, 0x22222222 },
> + { REG_A6XX_RBBM_CLOCK_CNTL2_TP0, 0x22222222 },
> + { REG_A6XX_RBBM_CLOCK_CNTL3_TP0, 0x22222222 },
> + { REG_A6XX_RBBM_CLOCK_CNTL4_TP0, 0x00022222 },
> + { REG_A6XX_RBBM_CLOCK_DELAY_TP0, 0x11111111 },
> + { REG_A6XX_RBBM_CLOCK_DELAY2_TP0, 0x11111111 },
> + { REG_A6XX_RBBM_CLOCK_DELAY3_TP0, 0x11111111 },
> + { REG_A6XX_RBBM_CLOCK_DELAY4_TP0, 0x00011111 },
> + { REG_A6XX_RBBM_CLOCK_HYST_TP0, 0x77777777 },
> + { REG_A6XX_RBBM_CLOCK_HYST2_TP0, 0x77777777 },
> + { REG_A6XX_RBBM_CLOCK_HYST3_TP0, 0x77777777 },
> + { REG_A6XX_RBBM_CLOCK_HYST4_TP0, 0x00077777 },
> + { REG_A6XX_RBBM_CLOCK_CNTL_RB0, 0x22222222 },
> + { REG_A6XX_RBBM_CLOCK_CNTL2_RB0, 0x01202222 },
> + { REG_A6XX_RBBM_CLOCK_CNTL_CCU0, 0x00002220 },
> + { REG_A6XX_RBBM_CLOCK_HYST_RB_CCU0, 0x00040f00 },
> + { REG_A6XX_RBBM_CLOCK_CNTL_RAC, 0x05522022 },
> + { REG_A6XX_RBBM_CLOCK_CNTL2_RAC, 0x00005555 },
> + { REG_A6XX_RBBM_CLOCK_DELAY_RAC, 0x00000011 },
> + { REG_A6XX_RBBM_CLOCK_HYST_RAC, 0x00445044 },
> + { REG_A6XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM, 0x04222222 },
> + { REG_A6XX_RBBM_CLOCK_MODE_VFD, 0x00002222 },
> + { REG_A6XX_RBBM_CLOCK_MODE_GPC, 0x02222222 },
> + { REG_A6XX_RBBM_CLOCK_DELAY_HLSQ_2, 0x00000002 },
> + { REG_A6XX_RBBM_CLOCK_MODE_HLSQ, 0x00002222 },
> + { REG_A6XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM, 0x00004000 },
> + { REG_A6XX_RBBM_CLOCK_DELAY_VFD, 0x00002222 },
> + { REG_A6XX_RBBM_CLOCK_DELAY_GPC, 0x00000200 },
> + { REG_A6XX_RBBM_CLOCK_DELAY_HLSQ, 0x00000000 },
> + { REG_A6XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM, 0x00000000 },
> + { REG_A6XX_RBBM_CLOCK_HYST_VFD, 0x00000000 },
> + { REG_A6XX_RBBM_CLOCK_HYST_GPC, 0x04104004 },
> + { REG_A6XX_RBBM_CLOCK_HYST_HLSQ, 0x00000000 },
> + { REG_A6XX_RBBM_CLOCK_CNTL_UCHE, 0x22222222 },
> + { REG_A6XX_RBBM_CLOCK_HYST_UCHE, 0x00000004 },
> + { REG_A6XX_RBBM_CLOCK_DELAY_UCHE, 0x00000002 },
> + { REG_A6XX_RBBM_ISDB_CNT, 0x00000182 },
> + { REG_A6XX_RBBM_RAC_THRESHOLD_CNT, 0x00000000 },
> + { REG_A6XX_RBBM_SP_HYST_CNT, 0x00000000 },
> + { REG_A6XX_RBBM_CLOCK_CNTL_GMU_GX, 0x00000222 },
> + { REG_A6XX_RBBM_CLOCK_DELAY_GMU_GX, 0x00000111 },
> + { REG_A6XX_RBBM_CLOCK_HYST_GMU_GX, 0x00000555 },
> + { REG_A6XX_RBBM_CLOCK_CNTL_FCHE, 0x00000222 },
> + { REG_A6XX_RBBM_CLOCK_DELAY_FCHE, 0x00000000 },
> + { REG_A6XX_RBBM_CLOCK_HYST_FCHE, 0x00000000 },
> + { REG_A6XX_RBBM_CLOCK_CNTL_GLC, 0x00222222 },
> + { REG_A6XX_RBBM_CLOCK_DELAY_GLC, 0x00000000 },
> + { REG_A6XX_RBBM_CLOCK_HYST_GLC, 0x00000000 },
> + { REG_A6XX_RBBM_CLOCK_CNTL_MHUB, 0x00000002 },
> + { REG_A6XX_RBBM_CLOCK_DELAY_MHUB, 0x00000000 },
> + { REG_A6XX_RBBM_CLOCK_HYST_MHUB, 0x00000000 },
> + {}
> +};
> +
> const struct adreno_reglist a730_hwcg[] = {
> { REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x02222222 },
> { REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x02022222 },
> @@ -968,6 +1027,8 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)
> clock_cntl_on = 0x8aa8aa02;
> else if (adreno_is_a610(adreno_gpu))
> clock_cntl_on = 0xaaa8aa82;
> + else if (adreno_is_a702(adreno_gpu))
> + clock_cntl_on = 0xaaaaaa82;
> else
> clock_cntl_on = 0x8aa8aa82;
>
> @@ -989,14 +1050,14 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)
> return;
>
> /* Disable SP clock before programming HWCG registers */
> - if (!adreno_is_a610(adreno_gpu) && !adreno_is_a7xx(adreno_gpu))
> + if (!adreno_is_a610_family(adreno_gpu) && !adreno_is_a7xx(adreno_gpu))
> gmu_rmw(gmu, REG_A6XX_GPU_GMU_GX_SPTPRAC_CLOCK_CONTROL, 1, 0);
>
> for (i = 0; (reg = &adreno_gpu->info->hwcg[i], reg->offset); i++)
> gpu_write(gpu, reg->offset, state ? reg->value : 0);
>
> /* Enable SP clock */
> - if (!adreno_is_a610(adreno_gpu) && !adreno_is_a7xx(adreno_gpu))
> + if (!adreno_is_a610_family(adreno_gpu) && !adreno_is_a7xx(adreno_gpu))
> gmu_rmw(gmu, REG_A6XX_GPU_GMU_GX_SPTPRAC_CLOCK_CONTROL, 0, 1);
>
> gpu_write(gpu, REG_A6XX_RBBM_CLOCK_CNTL, state ? clock_cntl_on : 0);
> @@ -1224,7 +1285,7 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
> const u32 *regs = a6xx_protect;
> unsigned i, count, count_max;
>
> - if (adreno_is_a650(adreno_gpu)) {
> + if (adreno_is_a650(adreno_gpu) || adreno_is_a702(adreno_gpu)) {
> regs = a650_protect;
> count = ARRAY_SIZE(a650_protect);
> count_max = 48;
> @@ -1320,6 +1381,12 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu *gpu)
> gpu->ubwc_config.rgb565_predicator = 1;
> gpu->ubwc_config.uavflagprd_inv = 2;
> }
> +
> + if (adreno_is_a702(gpu)) {
> + gpu->ubwc_config.highest_bank_bit = 14;
> + gpu->ubwc_config.min_acc_len = 1;
> + gpu->ubwc_config.ubwc_mode = 2;

I just noticed, but this is wrong. ubwc_mode is a 1 bit field and what
this is actually doing is overwriting hbb_lo, making the highest bank
bit 15 instead of 14.

> + }
> }
>
> static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
> @@ -1453,7 +1520,7 @@ static bool a6xx_ucode_check_version(struct a6xx_gpu *a6xx_gpu,
> return false;
>
> /* A7xx is safe! */
> - if (adreno_is_a7xx(adreno_gpu))
> + if (adreno_is_a7xx(adreno_gpu) || adreno_is_a702(adreno_gpu))
> return true;
>
> /*
> @@ -1671,7 +1738,7 @@ static int hw_init(struct msm_gpu *gpu)
> a6xx_set_hwcg(gpu, true);
>
> /* VBIF/GBIF start*/
> - if (adreno_is_a610(adreno_gpu) ||
> + if (adreno_is_a610_family(adreno_gpu) ||
> adreno_is_a640_family(adreno_gpu) ||
> adreno_is_a650_family(adreno_gpu) ||
> adreno_is_a7xx(adreno_gpu)) {
> @@ -1705,6 +1772,7 @@ static int hw_init(struct msm_gpu *gpu)
> }
>
> if (!(adreno_is_a650_family(adreno_gpu) ||
> + adreno_is_a702(adreno_gpu) ||
> adreno_is_a730(adreno_gpu))) {
> gmem_range_min = adreno_is_a740_family(adreno_gpu) ? SZ_16M : SZ_1M;
>
> @@ -1725,7 +1793,7 @@ static int hw_init(struct msm_gpu *gpu)
> if (adreno_is_a640_family(adreno_gpu) || adreno_is_a650_family(adreno_gpu)) {
> gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_2, 0x02000140);
> gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_1, 0x8040362c);
> - } else if (adreno_is_a610(adreno_gpu)) {
> + } else if (adreno_is_a610_family(adreno_gpu)) {
> gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_2, 0x00800060);
> gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_1, 0x40201b16);
> } else if (!adreno_is_a7xx(adreno_gpu)) {
> @@ -1740,13 +1808,18 @@ static int hw_init(struct msm_gpu *gpu)
> if (adreno_is_a610(adreno_gpu)) {
> gpu_write(gpu, REG_A6XX_CP_MEM_POOL_SIZE, 48);
> gpu_write(gpu, REG_A6XX_CP_MEM_POOL_DBG_ADDR, 47);
> + } else if (adreno_is_a702(adreno_gpu)) {
> + gpu_write(gpu, REG_A6XX_CP_MEM_POOL_SIZE, 64);
> + gpu_write(gpu, REG_A6XX_CP_MEM_POOL_DBG_ADDR, 63);
> } else if (!adreno_is_a7xx(adreno_gpu))
> gpu_write(gpu, REG_A6XX_CP_MEM_POOL_SIZE, 128);
>
> /* Setting the primFifo thresholds default values,
> * and vccCacheSkipDis=1 bit (0x200) for A640 and newer
> */
> - if (adreno_is_a690(adreno_gpu))
> + if (adreno_is_a702(adreno_gpu))
> + gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x0000c000);
> + else if (adreno_is_a690(adreno_gpu))
> gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00800200);
> else if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu))
> gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00300200);
> @@ -1786,7 +1859,7 @@ static int hw_init(struct msm_gpu *gpu)
> gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) | 0x4fffff);
> else if (adreno_is_a619(adreno_gpu))
> gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) | 0x3fffff);
> - else if (adreno_is_a610(adreno_gpu))
> + else if (adreno_is_a610(adreno_gpu) || adreno_is_a702(adreno_gpu))
> gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) | 0x3ffff);
> else
> gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) | 0x1fffff);
> @@ -1822,6 +1895,9 @@ static int hw_init(struct msm_gpu *gpu)
> else
> gpu_write(gpu, REG_A6XX_CP_CHICKEN_DBG, 0x1);
> gpu_write(gpu, REG_A6XX_RBBM_GBIF_CLIENT_QOS_CNTL, 0x0);
> + } else if (adreno_is_a702(adreno_gpu)) {
> + /* Something to do with the HLSQ cluster */
> + gpu_write(gpu, REG_A6XX_CP_CHICKEN_DBG, BIT(24));
> }
>
> if (adreno_is_a690(adreno_gpu))
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index 2ce7d7b1690d..b121abc71338 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -492,6 +492,24 @@ static const struct adreno_info gpulist[] = {
> .zapfw = "a690_zap.mdt",
> .hwcg = a690_hwcg,
> .address_space_size = SZ_16G,
> + }, {
> + .chip_ids = ADRENO_CHIP_IDS(0x07000200),
> + .family = ADRENO_6XX_GEN1, /* NOT a mistake! */
> + .fw = {
> + [ADRENO_FW_SQE] = "a702_sqe.fw",
> + },
> + .gmem = SZ_128K,
> + .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> + .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> + .init = a6xx_gpu_init,
> + .zapfw = "a702_zap.mbn",
> + .hwcg = a702_hwcg,
> + .speedbins = ADRENO_SPEEDBINS(
> + { 0, 0 },
> + { 236, 1 },
> + { 178, 2 },
> + { 142, 3 },
> + ),
> }, {
> .chip_ids = ADRENO_CHIP_IDS(0x07030001),
> .family = ADRENO_7XX_GEN1,
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> index bc14df96feb0..f451881a6ddf 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> @@ -77,7 +77,7 @@ struct adreno_reglist {
> };
>
> extern const struct adreno_reglist a612_hwcg[], a615_hwcg[], a630_hwcg[], a640_hwcg[], a650_hwcg[];
> -extern const struct adreno_reglist a660_hwcg[], a690_hwcg[], a730_hwcg[], a740_hwcg[];
> +extern const struct adreno_reglist a660_hwcg[], a690_hwcg[], a702_hwcg[], a730_hwcg[], a740_hwcg[];
>
> struct adreno_speedbin {
> uint16_t fuse;
> @@ -382,6 +382,20 @@ static inline int adreno_is_a690(const struct adreno_gpu *gpu)
> return gpu->info->chip_ids[0] == 0x06090000;
> }
>
> +static inline int adreno_is_a702(const struct adreno_gpu *gpu)
> +{
> + return gpu->info->chip_ids[0] == 0x07000200;
> +}
> +
> +static inline int adreno_is_a610_family(const struct adreno_gpu *gpu)
> +{
> + if (WARN_ON_ONCE(!gpu->info))
> + return false;
> +
> + /* TODO: A612 */
> + return adreno_is_a610(gpu) || adreno_is_a702(gpu);
> +}
> +
> /* check for a615, a616, a618, a619 or any a630 derivatives */
> static inline int adreno_is_a630_family(const struct adreno_gpu *gpu)
> {
>
> --
> 2.43.2
>

2024-06-06 09:30:56

by Konrad Dybcio

[permalink] [raw]
Subject: Re: [PATCH v2 5/7] drm/msm/adreno: Add A702 support

On 23.05.2024 2:14 PM, Connor Abbott wrote:
> On Fri, Feb 23, 2024 at 9:28 PM Konrad Dybcio <[email protected]> wrote:
>>
>> The A702 is a weird mix of 600 and 700 series.. Perhaps even a
>> testing ground for some A7xx features with good ol' A6xx silicon.
>> It's basically A610 that's been beefed up with some new registers
>> and hw features (like APRIV!), that was then cut back in size,
>> memory bus and some other ways.
>>
>> Add support for it, tested with QCM2290 / RB1.
>>
>> Signed-off-by: Konrad Dybcio <[email protected]>
>> ---

[...]

>> +
>> + if (adreno_is_a702(gpu)) {
>> + gpu->ubwc_config.highest_bank_bit = 14;
>> + gpu->ubwc_config.min_acc_len = 1;
>> + gpu->ubwc_config.ubwc_mode = 2;
>
> I just noticed, but this is wrong. ubwc_mode is a 1 bit field and what
> this is actually doing is overwriting hbb_lo, making the highest bank
> bit 15 instead of 14.

You're right, this should be a 0. Thanks!

Konrad