2023-02-23 12:06:52

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 00/15] GMU-less A6xx support (A610, A619_holi)

v2 -> v3:
New dependencies:
- https://lore.kernel.org/linux-arm-msm/[email protected]/T/#t
- https://lore.kernel.org/linux-arm-msm/[email protected]/

Sidenote: A speedbin rework is in progress, the of_machine_is_compatible
calls in A619_holi are ugly (but well, necessary..) but they'll be
replaced with socid matching in this or the next kernel cycle.

Due to the new way of identifying GMU wrapper GPUs, configuring 6350
to use wrapper would cause the wrong fuse values to be checked, but that
will be solved by the conversion + the ultimate goal is to use the GMU
whenever possible with the wrapper left for GMU-less Adrenos and early
bringup debugging of GMU-equipped ones.

- Ship dt-bindings in this series as we're referencing the compatible now

- "De-staticize" -> "remove static keyword" [3/15]

- Track down all the values in [4/15]

- Add many comments and explanations in [4/15]

- Fix possible return-before-mutex-unlock [5/15]

- Explain the GMU wrapper a bit more in the commit msg [5/15]

- Separate out pm_resume/suspend for GMU-wrapper GPUs to make things
cleaner [5/15]

- Don't check if `info` exists, it has to at this point [5/15]

- Assign gpu->info early and clean up following if statements in
a6xx_gpu_init [5/15]

- Determine whether we use GMU wrapper based on the GMU compatible
instead of a quirk [5/15]

- Use a struct field to annotate whether we're using gmu wrapper so
that it can be assigned at runtime (turns out a619 holi-ness cannot
be determined by patchid + that will make it easier to test out GMU
GPUs without actually turning on the GMU if anybody wants to do so)
[5/15]

- Unconditionally hook up gx to the gmu wrapper (otherwise our gpu
will not get power) [5/15]

- Don't check for gx domain presence in gmu_wrapper paths, it's
guaranteed [5/15]

- Use opp set rate in the gmuwrapper suspend path [5/15]

- Call opp functions on the GPU device and not on the DRM device of
mdp4/5/DPU1 half the time (WHOOOOPS!) [5/15]

- Disable the memory clock in a6xx_pm_suspend instead of enabling it
(moderate oops) [5/15]

- Call the forgotten clk_bulk_disable_unprepare in a6xx_pm_suspend [5/15]

- Set rate to FMIN (a6xx really doesn't like rate=0 + that's what
msm-5.x does anyway) before disabling core clock [5/15]

- pm_runtime_get_sync -> pm_runtime_resume_and_get [5/15]

- Don't annotate no cached BO support with a quirk, as A619_holi is
merged into the A619 entry in the big const struct - this means
that all GPUs operating in gmu wrapper configuration will be
implicitly treated as if they didn't have this feature [7/15]

- Drop OPP rate & icc related patches, they're a part of a separate
series now; rebase on it

- Clean up extra parentheses [8/15]

- Identify A619_holi by checking the compatible of its GMU instead
of patchlevel [8/15]

- Drop "Fix up A6XX protected registers" - unnecessary, Rob will add
a comment explaining why

- Fix existing UBWC values for A680, new patch [10/15]

- Use adreno_is_aXYZ macros in speedbin matching [13/15] - new patch

v2: https://lore.kernel.org/linux-arm-msm/[email protected]/

v1 -> v2:
- Fix A630 values in [2/14]
- Fix [6/14] for GMU-equipped GPUs

Link to v1: https://lore.kernel.org/linux-arm-msm/[email protected]/

This series concludes my couple-weeks-long suffering of figuring out
the ins and outs of the "non-standard" A6xx GPUs which feature no GMU.

The GMU functionality is essentially emulated by parting out a
"GMU wrapper" region, which is essentially just a register space
within the GPU. It's modeled to be as similar to the actual GMU
as possible while staying as unnecessary as we can make it - there's
no IRQs, communicating with a microcontroller, no RPMh communication
etc. etc. I tried to reuse as much code as possible without making
a mess where every even line is used for GMU and every odd line is
used for GMU wrapper..

This series contains:
- plumbing for non-GMU operation, if-ing out GMU calls based on
GMU presence
- GMU wrapper support
- A610 support (w/ speedbin)
- A619 support (w/ speedbin)
- couple of minor fixes and improvements
- VDDCX/VDDGX scaling fix for non-GMU GPUs (concerns more than just
A6xx)
- Enablement of opp interconnect properties

A619_holi works perfectly fine using the already-present A619 support
in mesa. A610 needs more work on that front, but can already replay
command traces captures on downstream.

NOTE: the "drm/msm/a6xx: Add support for A619_holi" patch contains
two occurences of 0x18 used in place of a register #define, as it's
supposed to be RBBM_GPR0_CNTL, but that will only be present after
mesa-side changes are merged and headers are synced from there.

Speedbin patches depend on:
https://lore.kernel.org/linux-arm-msm/[email protected]/

Signed-off-by: Konrad Dybcio <[email protected]>
---
Konrad Dybcio (15):
dt-bindings: display/msm: gpu: Document GMU wrapper-equipped A6xx
dt-bindings: display/msm/gmu: Add GMU wrapper
drm/msm/a6xx: Remove static keyword from sptprac en/disable functions
drm/msm/a6xx: Extend and explain UBWC config
drm/msm/a6xx: Introduce GMU wrapper support
drm/msm/a6xx: Remove both GBIF and RBBM GBIF halt on hw init
drm/msm/adreno: Disable has_cached_coherent in GMU wrapper configurations
drm/msm/a6xx: Add support for A619_holi
drm/msm/a6xx: Add A610 support
drm/msm/a6xx: Fix A680 highest bank bit value
drm/msm/a6xx: Fix some A619 tunables
drm/msm/a6xx: Use "else if" in GPU speedbin rev matching
drm/msm/a6xx: Use adreno_is_aXYZ macros in speedbin matching
drm/msm/a6xx: Add A619_holi speedbin support
drm/msm/a6xx: Add A610 speedbin support

.../devicetree/bindings/display/msm/gmu.yaml | 49 +-
.../devicetree/bindings/display/msm/gpu.yaml | 63 ++-
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 57 ++-
drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 +
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 513 ++++++++++++++++++---
drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 1 +
drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 14 +-
drivers/gpu/drm/msm/adreno/adreno_device.c | 17 +-
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 33 +-
9 files changed, 655 insertions(+), 94 deletions(-)
---
base-commit: f122501715b5bb8ea340e077401257795b6638a1
change-id: 20230223-topic-gmuwrapper-b4fff5fd7789

Best regards,
--
Konrad Dybcio <[email protected]>



2023-02-23 12:06:56

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 01/15] dt-bindings: display/msm: gpu: Document GMU wrapper-equipped A6xx

GMU wrapper-equipped A6xx GPUs require clocks and clock-names to be
specified under the GPU node, just like their older cousins.
Account for that.

Signed-off-by: Konrad Dybcio <[email protected]>
---
.../devicetree/bindings/display/msm/gpu.yaml | 63 ++++++++++++++++++----
1 file changed, 53 insertions(+), 10 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/msm/gpu.yaml b/Documentation/devicetree/bindings/display/msm/gpu.yaml
index d4191cca71fb..e6d3160601bc 100644
--- a/Documentation/devicetree/bindings/display/msm/gpu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/gpu.yaml
@@ -36,10 +36,7 @@ properties:

reg-names:
minItems: 1
- items:
- - const: kgsl_3d0_reg_memory
- - const: cx_mem
- - const: cx_dbgc
+ maxItems: 3

interrupts:
maxItems: 1
@@ -147,26 +144,72 @@ allOf:
description: GPU Alternative Memory Interface clock
- const: gfx3d
description: GPU 3D engine clock
+ - const: gmu
+ description: CX GMU clock
- const: rbbmtimer
description: GPU RBBM Timer for Adreno 5xx series
- const: rbcpr
description: GPU RB Core Power Reduction clock
+ - const: xo
+ description: GPUCC clocksource clock
minItems: 2
- maxItems: 7
+ maxItems: 9

required:
- clocks
- clock-names
+
- if:
properties:
compatible:
contains:
- pattern: '^qcom,adreno-6[0-9][0-9]\.[0-9]$'
-
- then: # Since Adreno 6xx series clocks should be defined in GMU
+ enum:
+ - qcom,adreno-610.0
+ - qcom,adreno-619.1
+ then:
properties:
- clocks: false
- clock-names: false
+ clock-names:
+ items:
+ - const: core
+ description: GPU Core clock
+ - const: iface
+ description: GPU Interface clock
+ - const: mem_iface
+ description: GPU Memory Interface clock
+ - const: alt_mem_iface
+ description: GPU Alternative Memory Interface clock
+ - const: gmu
+ description: CX GMU clock
+ - const: xo
+ description: GPUCC clocksource clock
+
+ reg-names:
+ minItems: 1
+ items:
+ - const: kgsl_3d0_reg_memory
+ - const: cx_dbgc
+
+ required:
+ - clocks
+ - clock-names
+ else:
+ if:
+ properties:
+ compatible:
+ contains:
+ pattern: '^qcom,adreno-6[0-9][0-9]\.[0-9]$'
+
+ then: # Starting with A6xx, the clocks are usually defined in the GMU node
+ properties:
+ clocks: false
+ clock-names: false
+
+ reg-names:
+ minItems: 1
+ items:
+ - const: kgsl_3d0_reg_memory
+ - const: cx_mem
+ - const: cx_dbgc

examples:
- |

--
2.39.2


2023-02-23 12:06:59

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 02/15] dt-bindings: display/msm/gmu: Add GMU wrapper

GMU wrapper is essentially a register space within the GPU, which
Linux sees as a dumbed-down regular GMU: there's no clocks,
interrupts, multiple regs, iommus and OPP. Document it.

Signed-off-by: Konrad Dybcio <[email protected]>
---
.../devicetree/bindings/display/msm/gmu.yaml | 49 ++++++++++++++++------
1 file changed, 37 insertions(+), 12 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/msm/gmu.yaml b/Documentation/devicetree/bindings/display/msm/gmu.yaml
index ab14e81cb050..021373e686e1 100644
--- a/Documentation/devicetree/bindings/display/msm/gmu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/gmu.yaml
@@ -19,16 +19,18 @@ description: |

properties:
compatible:
- items:
- - pattern: '^qcom,adreno-gmu-6[0-9][0-9]\.[0-9]$'
- - const: qcom,adreno-gmu
+ oneOf:
+ - items:
+ - pattern: '^qcom,adreno-gmu-6[0-9][0-9]\.[0-9]$'
+ - const: qcom,adreno-gmu
+ - const: qcom,adreno-gmu-wrapper

reg:
- minItems: 3
+ minItems: 1
maxItems: 4

reg-names:
- minItems: 3
+ minItems: 1
maxItems: 4

clocks:
@@ -44,7 +46,6 @@ properties:
- description: GMU HFI interrupt
- description: GMU interrupt

-
interrupt-names:
items:
- const: hfi
@@ -72,14 +73,8 @@ required:
- compatible
- reg
- reg-names
- - clocks
- - clock-names
- - interrupts
- - interrupt-names
- power-domains
- power-domain-names
- - iommus
- - operating-points-v2

additionalProperties: false

@@ -216,6 +211,27 @@ allOf:
- const: cxo
- const: axi
- const: memnoc
+ - if:
+ properties:
+ compatible:
+ contains:
+ const: qcom,adreno-gmu-wrapper
+ then:
+ properties:
+ reg:
+ items:
+ - description: GMU wrapper register space
+ reg-names:
+ items:
+ - const: gmu
+ else:
+ required:
+ - clocks
+ - clock-names
+ - interrupts
+ - interrupt-names
+ - iommus
+ - operating-points-v2

examples:
- |
@@ -249,3 +265,12 @@ examples:
iommus = <&adreno_smmu 5>;
operating-points-v2 = <&gmu_opp_table>;
};
+
+ gmu_wrapper: gmu@596a000 {
+ compatible = "qcom,adreno-gmu-wrapper";
+ reg = <0x0596a000 0x30000>;
+ reg-names = "gmu";
+ power-domains = <&gpucc GPU_CX_GDSC>,
+ <&gpucc GPU_GX_GDSC>;
+ power-domain-names = "cx", "gx";
+ };

--
2.39.2


2023-02-23 12:07:01

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 03/15] drm/msm/a6xx: Remove static keyword from sptprac en/disable functions

These two will be reused by at least A619_holi in the non-gmu
paths. Turn them non-static them to make it possible.

Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 4 ++--
drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 ++
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index f3c9600221d4..90e636dcdd5b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -354,7 +354,7 @@ void a6xx_gmu_clear_oob(struct a6xx_gmu *gmu, enum a6xx_gmu_oob_state state)
}

/* Enable CPU control of SPTP power power collapse */
-static int a6xx_sptprac_enable(struct a6xx_gmu *gmu)
+int a6xx_sptprac_enable(struct a6xx_gmu *gmu)
{
int ret;
u32 val;
@@ -376,7 +376,7 @@ static int a6xx_sptprac_enable(struct a6xx_gmu *gmu)
}

/* Disable CPU control of SPTP power power collapse */
-static void a6xx_sptprac_disable(struct a6xx_gmu *gmu)
+void a6xx_sptprac_disable(struct a6xx_gmu *gmu)
{
u32 val;
int ret;
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index e034935b3986..ec28abdd327b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -186,5 +186,7 @@ int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, int index);

bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu);
bool a6xx_gmu_sptprac_is_on(struct a6xx_gmu *gmu);
+void a6xx_sptprac_disable(struct a6xx_gmu *gmu);
+int a6xx_sptprac_enable(struct a6xx_gmu *gmu);

#endif

--
2.39.2


2023-02-23 12:07:09

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

Rename lower_bit to hbb_lo and explain what it signifies.
Add explanations (wherever possible to other tunables).

Sort the variable definition and assignment alphabetically.

Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
Set default values for all of the tunables to zero, as they should be.

Values were validated against downstream and will be fixed up in
separate commits so as not to make this one even more messy.

A618 remains untouched (left at hw defaults) in this patch.

Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 ++++++++++++++++++++++++++++-------
1 file changed, 45 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index c5f5d0bb3fdc..bdae341e0a7c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
{
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
- u32 lower_bit = 2;
+ /* Unknown, introduced with A640/680 */
u32 amsbc = 0;
+ /*
+ * The Highest Bank Bit value represents the bit of the highest DDR bank.
+ * We then subtract 13 from it (13 is the minimum value allowed by hw) and
+ * write the lowest two bits of the remaining value as hbb_lo and the
+ * one above it as hbb_hi to the hardware. The default values (when HBB is
+ * not specified) are 0, 0.
+ */
+ u32 hbb_hi = 0;
+ u32 hbb_lo = 0;
+ /* Whether the minimum access length is 64 bits */
+ u32 min_acc_len = 0;
+ /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
u32 rgb565_predicator = 0;
+ /* Unknown, introduced with A650 family */
u32 uavflagprd_inv = 0;
+ /* Entirely magic, per-GPU-gen value */
+ u32 ubwc_mode = 0;

/* a618 is using the hw default values */
if (adreno_is_a618(adreno_gpu))
return;

- if (adreno_is_a640_family(adreno_gpu))
+ if (adreno_is_a619(adreno_gpu)) {
+ /* HBB = 14 */
+ hbb_lo = 1;
+ }
+
+ if (adreno_is_a630(adreno_gpu)) {
+ /* HBB = 15 */
+ hbb_lo = 2;
+ }
+
+ if (adreno_is_a640_family(adreno_gpu)) {
amsbc = 1;
+ /* HBB = 15 */
+ hbb_lo = 2;
+ }

if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
- /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
- lower_bit = 3;
amsbc = 1;
+ /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
+ /* HBB = 16 */
+ hbb_lo = 3;
rgb565_predicator = 1;
uavflagprd_inv = 2;
}

if (adreno_is_7c3(adreno_gpu)) {
- lower_bit = 1;
amsbc = 1;
+ /* HBB is unset in downstream DTS, defaulting to 0 */
rgb565_predicator = 1;
uavflagprd_inv = 2;
}

gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
- rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
- gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
- gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
- uavflagprd_inv << 4 | lower_bit << 1);
- gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
+ rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
+ min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
+
+ gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
+ min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
+
+ gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
+ uavflagprd_inv << 4 | min_acc_len << 3 |
+ hbb_lo << 1 | ubwc_mode);
+
+ gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | hbb_lo << 21);
}

static int a6xx_cp_init(struct msm_gpu *gpu)

--
2.39.2


2023-02-23 12:07:13

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 06/15] drm/msm/a6xx: Remove both GBIF and RBBM GBIF halt on hw init

Currently we're only deasserting REG_A6XX_RBBM_GBIF_HALT, but we also
need REG_A6XX_GBIF_HALT to be set to 0. For GMU-equipped GPUs this is
done in a6xx_bus_clear_pending_transactions(), but for the GMU-less
ones we have to do it *somewhere*. Unhalting both side by side sounds
like a good plan and it won't cause any issues if it's unnecessary.

Also, add a memory barrier to ensure it's gone through.

Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index d8e7ef181e39..a8b727b82389 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1029,8 +1029,12 @@ static int hw_init(struct msm_gpu *gpu)
}

/* Clear GBIF halt in case GX domain was not collapsed */
- if (a6xx_has_gbif(adreno_gpu))
+ if (a6xx_has_gbif(adreno_gpu)) {
+ gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0);
+ /* Let's make extra sure that the GPU can access the memory.. */
+ mb();
+ }

gpu_write(gpu, REG_A6XX_RBBM_SECVID_TSB_CNTL, 0);


--
2.39.2


2023-02-23 12:07:15

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 05/15] drm/msm/a6xx: Introduce GMU wrapper support

Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
but don't implement the associated GMUs. This is due to the fact that
the GMU directly pokes at RPMh. Sadly, this means we have to take care
of enabling & scaling power rails, clocks and bandwidth ourselves.

Reuse existing Adreno-common code and modify the deeply-GMU-infused
A6XX code to facilitate these GPUs. This involves if-ing out lots
of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
the actual name that Qualcomm uses in their downstream kernels).

This is essentially a register region which is convenient to model
as a device. We'll use it for managing the GDSCs. The register
layout matches the actual GMU_CX/GX regions on the "real GMU" devices
and lets us reuse quite a bit of gmu_read/write/rmw calls.

Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 53 +++++-
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 244 +++++++++++++++++++++++++---
drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 1 +
drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 14 +-
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 6 +
5 files changed, 282 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 90e636dcdd5b..b2c56561cde6 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1474,6 +1474,7 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, struct platform_device *pdev,

void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
{
+ struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
struct platform_device *pdev = to_platform_device(gmu->dev);

@@ -1493,10 +1494,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
gmu->mmio = NULL;
gmu->rscc = NULL;

- a6xx_gmu_memory_free(gmu);
+ if (!adreno_has_gmu_wrapper(adreno_gpu)) {
+ a6xx_gmu_memory_free(gmu);

- free_irq(gmu->gmu_irq, gmu);
- free_irq(gmu->hfi_irq, gmu);
+ free_irq(gmu->gmu_irq, gmu);
+ free_irq(gmu->hfi_irq, gmu);
+ }

/* Drop reference taken in of_find_device_by_node */
put_device(gmu->dev);
@@ -1504,6 +1507,50 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
gmu->initialized = false;
}

+int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
+{
+ struct platform_device *pdev = of_find_device_by_node(node);
+ struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
+ int ret;
+
+ if (!pdev)
+ return -ENODEV;
+
+ gmu->dev = &pdev->dev;
+
+ of_dma_configure(gmu->dev, node, true);
+
+ pm_runtime_enable(gmu->dev);
+
+ /* Mark legacy for manual SPTPRAC control */
+ gmu->legacy = true;
+
+ /* Map the GMU registers */
+ gmu->mmio = a6xx_gmu_get_mmio(pdev, "gmu");
+ if (IS_ERR(gmu->mmio)) {
+ ret = PTR_ERR(gmu->mmio);
+ goto err_mmio;
+ }
+
+ /* Get a link to the GX power domain to reset the GPU */
+ gmu->gxpd = dev_pm_domain_attach_by_name(gmu->dev, "gx");
+ if (IS_ERR(gmu->gxpd))
+ goto err_mmio;
+
+ gmu->initialized = true;
+
+ return 0;
+
+err_mmio:
+ iounmap(gmu->mmio);
+ ret = -ENODEV;
+
+ /* Drop reference taken in of_find_device_by_node */
+ put_device(gmu->dev);
+
+ return ret;
+}
+
int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
{
struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index bdae341e0a7c..d8e7ef181e39 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -20,9 +20,11 @@ static inline bool _a6xx_check_idle(struct msm_gpu *gpu)
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);

- /* Check that the GMU is idle */
- if (!a6xx_gmu_isidle(&a6xx_gpu->gmu))
- return false;
+ if (!adreno_has_gmu_wrapper(adreno_gpu)) {
+ /* Check that the GMU is idle */
+ if (!a6xx_gmu_isidle(&a6xx_gpu->gmu))
+ return false;
+ }

/* Check tha the CX master is idle */
if (gpu_read(gpu, REG_A6XX_RBBM_STATUS) &
@@ -612,13 +614,15 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)
return;

/* Disable SP clock before programming HWCG registers */
- gmu_rmw(gmu, REG_A6XX_GPU_GMU_GX_SPTPRAC_CLOCK_CONTROL, 1, 0);
+ if (!adreno_has_gmu_wrapper(adreno_gpu))
+ gmu_rmw(gmu, REG_A6XX_GPU_GMU_GX_SPTPRAC_CLOCK_CONTROL, 1, 0);

for (i = 0; (reg = &adreno_gpu->info->hwcg[i], reg->offset); i++)
gpu_write(gpu, reg->offset, state ? reg->value : 0);

/* Enable SP clock */
- gmu_rmw(gmu, REG_A6XX_GPU_GMU_GX_SPTPRAC_CLOCK_CONTROL, 0, 1);
+ if (!adreno_has_gmu_wrapper(adreno_gpu))
+ gmu_rmw(gmu, REG_A6XX_GPU_GMU_GX_SPTPRAC_CLOCK_CONTROL, 0, 1);

gpu_write(gpu, REG_A6XX_RBBM_CLOCK_CNTL, state ? clock_cntl_on : 0);
}
@@ -1016,10 +1020,13 @@ static int hw_init(struct msm_gpu *gpu)
{
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+ struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
int ret;

- /* Make sure the GMU keeps the GPU on while we set it up */
- a6xx_gmu_set_oob(&a6xx_gpu->gmu, GMU_OOB_GPU_SET);
+ if (!adreno_has_gmu_wrapper(adreno_gpu)) {
+ /* Make sure the GMU keeps the GPU on while we set it up */
+ a6xx_gmu_set_oob(&a6xx_gpu->gmu, GMU_OOB_GPU_SET);
+ }

/* Clear GBIF halt in case GX domain was not collapsed */
if (a6xx_has_gbif(adreno_gpu))
@@ -1145,6 +1152,17 @@ static int hw_init(struct msm_gpu *gpu)
0x3f0243f0);
}

+ if (adreno_has_gmu_wrapper(adreno_gpu)) {
+ /* Do it here, as GMU wrapper only inits the GMU for memory reservation etc. */
+
+ /* Set up the CX GMU counter 0 to count busy ticks */
+ gmu_write(gmu, REG_A6XX_GPU_GMU_AO_GPU_CX_BUSY_MASK, 0xff000000);
+
+ /* Enable power counter 0 */
+ gmu_rmw(gmu, REG_A6XX_GMU_CX_GMU_POWER_COUNTER_SELECT_0, 0xff, BIT(5));
+ gmu_write(gmu, REG_A6XX_GMU_CX_GMU_POWER_COUNTER_ENABLE, 1);
+ }
+
/* Protect registers from the CP */
a6xx_set_cp_protect(gpu);

@@ -1253,6 +1271,8 @@ static int hw_init(struct msm_gpu *gpu)
}

out:
+ if (adreno_has_gmu_wrapper(adreno_gpu))
+ return ret;
/*
* Tell the GMU that we are done touching the GPU and it can start power
* management
@@ -1287,6 +1307,9 @@ static void a6xx_dump(struct msm_gpu *gpu)
adreno_dump(gpu);
}

+#define GBIF_GX_HALT_MASK BIT(0)
+#define GBIF_CLIENT_HALT_MASK BIT(0)
+#define GBIF_ARB_HALT_MASK BIT(1)
#define VBIF_RESET_ACK_TIMEOUT 100
#define VBIF_RESET_ACK_MASK 0x00f0

@@ -1318,7 +1341,8 @@ static void a6xx_recover(struct msm_gpu *gpu)
* Turn off keep alive that might have been enabled by the hang
* interrupt
*/
- gmu_write(&a6xx_gpu->gmu, REG_A6XX_GMU_GMU_PWR_COL_KEEPALIVE, 0);
+ if (!adreno_has_gmu_wrapper(adreno_gpu))
+ gmu_write(&a6xx_gpu->gmu, REG_A6XX_GMU_GMU_PWR_COL_KEEPALIVE, 0);

pm_runtime_dont_use_autosuspend(&gpu->pdev->dev);

@@ -1342,6 +1366,35 @@ static void a6xx_recover(struct msm_gpu *gpu)
/* Call into gpucc driver to poll for cx gdsc collapse */
reset_control_reset(gpu->cx_collapse);

+ /* Software-reset the GPU */
+ if (adreno_has_gmu_wrapper(adreno_gpu)) {
+ /* Halt the GX side of GBIF */
+ gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, GBIF_GX_HALT_MASK);
+ spin_until(gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT_ACK) &
+ GBIF_GX_HALT_MASK);
+
+ /* Halt new client requests on GBIF */
+ gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_CLIENT_HALT_MASK);
+ spin_until((gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK) &
+ (GBIF_CLIENT_HALT_MASK)) == GBIF_CLIENT_HALT_MASK);
+
+ /* Halt all AXI requests on GBIF */
+ gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_ARB_HALT_MASK);
+ spin_until((gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK) &
+ (GBIF_ARB_HALT_MASK)) == GBIF_ARB_HALT_MASK);
+
+ /* Clear the halts */
+ gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
+
+ if (adreno_is_a619_holi(adreno_gpu))
+ gpu_write(gpu, 0x18, 0);
+ else
+ gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0);
+
+ /* This *really* needs to go through before we do anything else! */
+ mb();
+ }
+
pm_runtime_use_autosuspend(&gpu->pdev->dev);

if (active_submits)
@@ -1526,7 +1579,8 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
* Force the GPU to stay on until after we finish
* collecting information
*/
- gmu_write(&a6xx_gpu->gmu, REG_A6XX_GMU_GMU_PWR_COL_KEEPALIVE, 1);
+ if (!adreno_has_gmu_wrapper(adreno_gpu))
+ gmu_write(&a6xx_gpu->gmu, REG_A6XX_GMU_GMU_PWR_COL_KEEPALIVE, 1);

DRM_DEV_ERROR(&gpu->pdev->dev,
"gpu fault ring %d fence %x status %8.8X rb %4.4x/%4.4x ib1 %16.16llX/%4.4x ib2 %16.16llX/%4.4x\n",
@@ -1687,7 +1741,7 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
}

-static int a6xx_pm_resume(struct msm_gpu *gpu)
+static int a6xx_gmu_pm_resume(struct msm_gpu *gpu)
{
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
@@ -1707,10 +1761,48 @@ static int a6xx_pm_resume(struct msm_gpu *gpu)

a6xx_llc_activate(a6xx_gpu);

- return 0;
+ return ret;
}

-static int a6xx_pm_suspend(struct msm_gpu *gpu)
+static int a6xx_pm_resume(struct msm_gpu *gpu)
+{
+ struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+ struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+ struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
+ int ret;
+
+ gpu->needs_hw_init = true;
+
+ trace_msm_gpu_resume(0);
+
+ mutex_lock(&a6xx_gpu->gmu.lock);
+
+ pm_runtime_resume_and_get(gmu->dev);
+ pm_runtime_resume_and_get(gmu->gxpd);
+
+ /* Set the core clock, having VDD scaling in mind */
+ ret = dev_pm_opp_set_rate(&gpu->pdev->dev, gpu->fast_rate);
+ if (ret)
+ goto err;
+
+ ret = clk_bulk_prepare_enable(gpu->nr_clocks, gpu->grp_clks);
+ if (ret)
+ goto err;
+
+ ret = clk_prepare_enable(gpu->ebi1_clk);
+ if (ret)
+ goto err;
+
+err:
+ mutex_unlock(&a6xx_gpu->gmu.lock);
+
+ if (!ret)
+ msm_devfreq_resume(gpu);
+
+ return ret;
+}
+
+static int a6xx_gmu_pm_suspend(struct msm_gpu *gpu)
{
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
@@ -1737,11 +1829,62 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)
return 0;
}

+static int a6xx_pm_suspend(struct msm_gpu *gpu)
+{
+ struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+ struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+ struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
+ unsigned long freq = 0;
+ struct dev_pm_opp *opp;
+ int i, ret;
+
+ trace_msm_gpu_suspend(0);
+
+ opp = dev_pm_opp_find_freq_ceil(&gpu->pdev->dev, &freq);
+ dev_pm_opp_put(opp);
+
+ msm_devfreq_suspend(gpu);
+
+ mutex_lock(&a6xx_gpu->gmu.lock);
+
+ clk_disable_unprepare(gpu->ebi1_clk);
+
+ clk_bulk_disable_unprepare(gpu->nr_clocks, gpu->grp_clks);
+
+ /* Set frequency to the minimum supported level (no 27MHz on A6xx!) */
+ ret = dev_pm_opp_set_rate(&gpu->pdev->dev, freq);
+ if (ret)
+ goto err;
+
+ pm_runtime_put_sync(gmu->gxpd);
+ pm_runtime_put_sync(gmu->dev);
+
+ mutex_unlock(&a6xx_gpu->gmu.lock);
+
+ if (a6xx_gpu->shadow_bo)
+ for (i = 0; i < gpu->nr_rings; i++)
+ a6xx_gpu->shadow[i] = 0;
+
+ gpu->suspend_count++;
+
+ return 0;
+
+err:
+ mutex_unlock(&a6xx_gpu->gmu.lock);
+
+ return ret;
+}
+
static int a6xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
{
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);

+ if (adreno_has_gmu_wrapper(adreno_gpu)) {
+ *value = gpu_read64(gpu, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO);
+ return 0;
+ }
+
mutex_lock(&a6xx_gpu->gmu.lock);

/* Force the GPU power on so we can read this register */
@@ -1779,7 +1922,8 @@ static void a6xx_destroy(struct msm_gpu *gpu)
drm_gem_object_put(a6xx_gpu->shadow_bo);
}

- a6xx_llc_slices_destroy(a6xx_gpu);
+ if (!adreno_has_gmu_wrapper(adreno_gpu))
+ a6xx_llc_slices_destroy(a6xx_gpu);

a6xx_gmu_remove(a6xx_gpu);

@@ -2017,8 +2161,8 @@ static const struct adreno_gpu_funcs funcs = {
.get_param = adreno_get_param,
.set_param = adreno_set_param,
.hw_init = a6xx_hw_init,
- .pm_suspend = a6xx_pm_suspend,
- .pm_resume = a6xx_pm_resume,
+ .pm_suspend = a6xx_gmu_pm_suspend,
+ .pm_resume = a6xx_gmu_pm_resume,
.recover = a6xx_recover,
.submit = a6xx_submit,
.active_ring = a6xx_active_ring,
@@ -2042,6 +2186,34 @@ static const struct adreno_gpu_funcs funcs = {
.get_timestamp = a6xx_get_timestamp,
};

+static const struct adreno_gpu_funcs funcs_gmuwrapper = {
+ .base = {
+ .get_param = adreno_get_param,
+ .set_param = adreno_set_param,
+ .hw_init = a6xx_hw_init,
+ .pm_suspend = a6xx_pm_suspend,
+ .pm_resume = a6xx_pm_resume,
+ .recover = a6xx_recover,
+ .submit = a6xx_submit,
+ .active_ring = a6xx_active_ring,
+ .irq = a6xx_irq,
+ .destroy = a6xx_destroy,
+#if defined(CONFIG_DRM_MSM_GPU_STATE)
+ .show = a6xx_show,
+#endif
+ .gpu_busy = a6xx_gpu_busy,
+#if defined(CONFIG_DRM_MSM_GPU_STATE)
+ .gpu_state_get = a6xx_gpu_state_get,
+ .gpu_state_put = a6xx_gpu_state_put,
+#endif
+ .create_address_space = a6xx_create_address_space,
+ .create_private_address_space = a6xx_create_private_address_space,
+ .get_rptr = a6xx_get_rptr,
+ .progress = a6xx_progress,
+ },
+ .get_timestamp = a6xx_get_timestamp,
+};
+
struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
{
struct msm_drm_private *priv = dev->dev_private;
@@ -2063,18 +2235,36 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)

adreno_gpu->registers = NULL;

+ /* Check if there is a GMU phandle and set it up */
+ node = of_parse_phandle(pdev->dev.of_node, "qcom,gmu", 0);
+ /* FIXME: How do we gracefully handle this? */
+ BUG_ON(!node);
+
+ adreno_gpu->gmu_is_wrapper = of_device_is_compatible(node, "qcom,adreno-gmu-wrapper");
+
/*
* We need to know the platform type before calling into adreno_gpu_init
* so that the hw_apriv flag can be correctly set. Snoop into the info
* and grab the revision number
*/
info = adreno_info(config->rev);
-
- if (info && (info->revn == 650 || info->revn == 660 ||
- adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), info->rev)))
+ if (!info)
+ return ERR_PTR(-EINVAL);
+
+ /* Assign these early so that we can use the is_aXYZ helpers */
+ /* Numeric revision IDs (e.g. 630) */
+ adreno_gpu->revn = info->revn;
+ /* New-style ADRENO_REV()-only */
+ adreno_gpu->rev = info->rev;
+ /* Quirk data */
+ adreno_gpu->info = info;
+
+ if (adreno_is_a650(adreno_gpu) || adreno_is_a660_family(adreno_gpu))
adreno_gpu->base.hw_apriv = true;

- a6xx_llc_slices_init(pdev, a6xx_gpu);
+ /* No LLCC on non-RPMh (and by extension, non-GMU) SoCs */
+ if (!adreno_has_gmu_wrapper(adreno_gpu))
+ a6xx_llc_slices_init(pdev, a6xx_gpu);

ret = a6xx_set_supported_hw(&pdev->dev, config->rev);
if (ret) {
@@ -2082,7 +2272,10 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
return ERR_PTR(ret);
}

- ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, 1);
+ if (adreno_has_gmu_wrapper(adreno_gpu))
+ ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs_gmuwrapper, 1);
+ else
+ ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, 1);
if (ret) {
a6xx_destroy(&(a6xx_gpu->base.base));
return ERR_PTR(ret);
@@ -2095,13 +2288,10 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
if (adreno_is_a618(adreno_gpu) || adreno_is_7c3(adreno_gpu))
priv->gpu_clamp_to_idle = true;

- /* Check if there is a GMU phandle and set it up */
- node = of_parse_phandle(pdev->dev.of_node, "qcom,gmu", 0);
-
- /* FIXME: How do we gracefully handle this? */
- BUG_ON(!node);
-
- ret = a6xx_gmu_init(a6xx_gpu, node);
+ if (adreno_has_gmu_wrapper(adreno_gpu))
+ ret = a6xx_gmu_wrapper_init(a6xx_gpu, node);
+ else
+ ret = a6xx_gmu_init(a6xx_gpu, node);
of_node_put(node);
if (ret) {
a6xx_destroy(&(a6xx_gpu->base.base));
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index eea2e60ce3b7..51a7656072fa 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -76,6 +76,7 @@ int a6xx_gmu_set_oob(struct a6xx_gmu *gmu, enum a6xx_gmu_oob_state state);
void a6xx_gmu_clear_oob(struct a6xx_gmu *gmu, enum a6xx_gmu_oob_state state);

int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node);
+int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node);
void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu);

void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index b7e217d00a22..e11e8a02ac22 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -1041,16 +1041,18 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu)
/* Get the generic state from the adreno core */
adreno_gpu_state_get(gpu, &a6xx_state->base);

- a6xx_get_gmu_registers(gpu, a6xx_state);
+ if (!adreno_has_gmu_wrapper(adreno_gpu)) {
+ a6xx_get_gmu_registers(gpu, a6xx_state);

- a6xx_state->gmu_log = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.log);
- a6xx_state->gmu_hfi = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.hfi);
- a6xx_state->gmu_debug = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.debug);
+ a6xx_state->gmu_log = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.log);
+ a6xx_state->gmu_hfi = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.hfi);
+ a6xx_state->gmu_debug = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.debug);

- a6xx_snapshot_gmu_hfi_history(gpu, a6xx_state);
+ a6xx_snapshot_gmu_hfi_history(gpu, a6xx_state);
+ }

/* If GX isn't on the rest of the data isn't going to be accessible */
- if (!a6xx_gmu_gx_is_on(&a6xx_gpu->gmu))
+ if (!adreno_has_gmu_wrapper(adreno_gpu) && !a6xx_gmu_gx_is_on(&a6xx_gpu->gmu))
return &a6xx_state->base;

/* Get the banks of indexed registers */
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index b4f9b1343d63..2c0f0ef094cb 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -115,6 +115,7 @@ struct adreno_gpu {
* code (a3xx_gpu.c) and stored in this common location.
*/
const unsigned int *reg_offsets;
+ bool gmu_is_wrapper;
};
#define to_adreno_gpu(x) container_of(x, struct adreno_gpu, base)

@@ -145,6 +146,11 @@ struct adreno_platform_config {

bool adreno_cmp_rev(struct adreno_rev rev1, struct adreno_rev rev2);

+static inline bool adreno_has_gmu_wrapper(struct adreno_gpu *gpu)
+{
+ return gpu->gmu_is_wrapper;
+}
+
static inline bool adreno_is_a2xx(struct adreno_gpu *gpu)
{
return (gpu->revn < 300);

--
2.39.2


2023-02-23 12:07:18

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 07/15] drm/msm/adreno: Disable has_cached_coherent in GMU wrapper configurations

A610 and A619_holi don't support the feature. Disable it to make the GPU stop
crashing after almost each and every submission - the received data on
the GPU end was simply incomplete in garbled, resulting in almost nothing
being executed properly. Extend the disablement to adreno_has_gmu_wrapper,
as none of the GMU wrapper Adrenos that don't support yet seem to feature it.

Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/adreno_device.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 5142a4c72cfc..dfb43741ea32 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -540,7 +540,6 @@ static int adreno_bind(struct device *dev, struct device *master, void *data)
config.rev.minor, config.rev.patchid);

priv->is_a2xx = config.rev.core == 2;
- priv->has_cached_coherent = config.rev.core >= 6;

gpu = info->init(drm);
if (IS_ERR(gpu)) {
@@ -552,6 +551,10 @@ static int adreno_bind(struct device *dev, struct device *master, void *data)
if (ret)
return ret;

+ if (config.rev.core >= 6)
+ if (!adreno_has_gmu_wrapper(to_adreno_gpu(gpu)))
+ priv->has_cached_coherent = true;
+
return 0;
}


--
2.39.2


2023-02-23 12:07:31

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 08/15] drm/msm/a6xx: Add support for A619_holi

A619_holi is a GMU-less variant of the already-supported A619 GPU.
It's present on at least SM4350 (holi) and SM6375 (blair). No mesa
changes are required. Add the required kernel-side support for it.

Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 43 +++++++++++++++++++++++++--------
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 5 ++++
2 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index a8b727b82389..0859a6f463f9 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -614,14 +614,16 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)
return;

/* Disable SP clock before programming HWCG registers */
- if (!adreno_has_gmu_wrapper(adreno_gpu))
+ if (!adreno_has_gmu_wrapper(adreno_gpu) ||
+ adreno_is_a619_holi(adreno_gpu))
gmu_rmw(gmu, REG_A6XX_GPU_GMU_GX_SPTPRAC_CLOCK_CONTROL, 1, 0);

for (i = 0; (reg = &adreno_gpu->info->hwcg[i], reg->offset); i++)
gpu_write(gpu, reg->offset, state ? reg->value : 0);

/* Enable SP clock */
- if (!adreno_has_gmu_wrapper(adreno_gpu))
+ if (!adreno_has_gmu_wrapper(adreno_gpu) ||
+ adreno_is_a619_holi(adreno_gpu))
gmu_rmw(gmu, REG_A6XX_GPU_GMU_GX_SPTPRAC_CLOCK_CONTROL, 0, 1);

gpu_write(gpu, REG_A6XX_RBBM_CLOCK_CNTL, state ? clock_cntl_on : 0);
@@ -814,8 +816,8 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
if (adreno_is_a618(adreno_gpu))
return;

- if (adreno_is_a619(adreno_gpu)) {
- /* HBB = 14 */
+ if (adreno_is_a619(adreno_gpu) && !adreno_is_a619_holi(adreno_gpu)) {
+ /* HBB = 14 on A619, default 0 on A619_holi */
hbb_lo = 1;
}

@@ -1029,7 +1031,12 @@ static int hw_init(struct msm_gpu *gpu)
}

/* Clear GBIF halt in case GX domain was not collapsed */
- if (a6xx_has_gbif(adreno_gpu)) {
+ if (adreno_is_a619_holi(adreno_gpu)) {
+ gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
+ gpu_write(gpu, 0x18, 0);
+ /* Let's make extra sure that the GPU can access the memory.. */
+ mb();
+ } else if (a6xx_has_gbif(adreno_gpu)) {
gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0);
/* Let's make extra sure that the GPU can access the memory.. */
@@ -1038,6 +1045,9 @@ static int hw_init(struct msm_gpu *gpu)

gpu_write(gpu, REG_A6XX_RBBM_SECVID_TSB_CNTL, 0);

+ if (adreno_is_a619_holi(adreno_gpu))
+ a6xx_sptprac_enable(gmu);
+
/*
* Disable the trusted memory range - we don't actually supported secure
* memory rendering at this point in time and we don't want to block off
@@ -1315,7 +1325,8 @@ static void a6xx_dump(struct msm_gpu *gpu)
#define GBIF_CLIENT_HALT_MASK BIT(0)
#define GBIF_ARB_HALT_MASK BIT(1)
#define VBIF_RESET_ACK_TIMEOUT 100
-#define VBIF_RESET_ACK_MASK 0x00f0
+#define VBIF_RESET_ACK_MASK 0xF0
+#define GPR0_GBIF_HALT_REQUEST 0x1E0

static void a6xx_recover(struct msm_gpu *gpu)
{
@@ -1372,10 +1383,16 @@ static void a6xx_recover(struct msm_gpu *gpu)

/* Software-reset the GPU */
if (adreno_has_gmu_wrapper(adreno_gpu)) {
- /* Halt the GX side of GBIF */
- gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, GBIF_GX_HALT_MASK);
- spin_until(gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT_ACK) &
- GBIF_GX_HALT_MASK);
+ if (adreno_is_a619_holi(adreno_gpu)) {
+ gpu_write(gpu, 0x18, GPR0_GBIF_HALT_REQUEST);
+ spin_until((gpu_read(gpu, REG_A6XX_RBBM_VBIF_GX_RESET_STATUS) &
+ (VBIF_RESET_ACK_MASK)) == VBIF_RESET_ACK_MASK);
+ } else {
+ /* Halt the GX side of GBIF */
+ gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, GBIF_GX_HALT_MASK);
+ spin_until(gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT_ACK) &
+ GBIF_GX_HALT_MASK);
+ }

/* Halt new client requests on GBIF */
gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_CLIENT_HALT_MASK);
@@ -1797,6 +1814,9 @@ static int a6xx_pm_resume(struct msm_gpu *gpu)
if (ret)
goto err;

+ if (adreno_is_a619_holi(adreno_gpu))
+ a6xx_sptprac_enable(gmu);
+
err:
mutex_unlock(&a6xx_gpu->gmu.lock);

@@ -1851,6 +1871,9 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)

mutex_lock(&a6xx_gpu->gmu.lock);

+ if (adreno_is_a619_holi(adreno_gpu))
+ a6xx_sptprac_disable(gmu);
+
clk_disable_unprepare(gpu->ebi1_clk);

clk_bulk_disable_unprepare(gpu->nr_clocks, gpu->grp_clks);
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 2c0f0ef094cb..92ece15ec7d8 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -252,6 +252,11 @@ static inline int adreno_is_a619(struct adreno_gpu *gpu)
return gpu->revn == 619;
}

+static inline int adreno_is_a619_holi(struct adreno_gpu *gpu)
+{
+ return adreno_is_a619(gpu) && adreno_has_gmu_wrapper(gpu);
+}
+
static inline int adreno_is_a630(struct adreno_gpu *gpu)
{
return gpu->revn == 630;

--
2.39.2


2023-02-23 12:07:34

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 09/15] drm/msm/a6xx: Add A610 support

A610 is one of (if not the) lowest-tier SKUs in the A6XX family. It
features no GMU, as it's implemented solely on SoCs with SMD_RPM.
What's more interesting is that it does not feature a VDDGX line
either, being powered solely by VDDCX and has an unfortunate hardware
quirk that makes its reset line broken - after a couple of assert/
deassert cycles, it will hang for good and will not wake up again.

This GPU requires mesa changes for proper rendering, and lots of them
at that. The command streams are quite far away from any other A6XX
GPU and hence it needs special care. This patch was validated both
by running an (incomplete) downstream mesa with some hacks (frames
rendered correctly, though some instructions made the GPU hangcheck
which is expected - garbage in, garbage out) and by replaying RD
traces captured with the downstream KGSL driver - no crashes there,
ever.

Add support for this GPU on the kernel side, which comes down to
pretty simply adding A612 HWCG tables, altering a few values and
adding a special case for handling the reset line.

Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 97 +++++++++++++++++++++++++++---
drivers/gpu/drm/msm/adreno/adreno_device.c | 12 ++++
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 8 ++-
3 files changed, 107 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 0859a6f463f9..b5017c56fa1b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -254,6 +254,56 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
a6xx_flush(gpu, ring);
}

+const struct adreno_reglist a612_hwcg[] = {
+ {REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x22222222},
+ {REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x02222220},
+ {REG_A6XX_RBBM_CLOCK_DELAY_SP0, 0x00000081},
+ {REG_A6XX_RBBM_CLOCK_HYST_SP0, 0x0000f3cf},
+ {REG_A6XX_RBBM_CLOCK_CNTL_TP0, 0x22222222},
+ {REG_A6XX_RBBM_CLOCK_CNTL2_TP0, 0x22222222},
+ {REG_A6XX_RBBM_CLOCK_CNTL3_TP0, 0x22222222},
+ {REG_A6XX_RBBM_CLOCK_CNTL4_TP0, 0x00022222},
+ {REG_A6XX_RBBM_CLOCK_DELAY_TP0, 0x11111111},
+ {REG_A6XX_RBBM_CLOCK_DELAY2_TP0, 0x11111111},
+ {REG_A6XX_RBBM_CLOCK_DELAY3_TP0, 0x11111111},
+ {REG_A6XX_RBBM_CLOCK_DELAY4_TP0, 0x00011111},
+ {REG_A6XX_RBBM_CLOCK_HYST_TP0, 0x77777777},
+ {REG_A6XX_RBBM_CLOCK_HYST2_TP0, 0x77777777},
+ {REG_A6XX_RBBM_CLOCK_HYST3_TP0, 0x77777777},
+ {REG_A6XX_RBBM_CLOCK_HYST4_TP0, 0x00077777},
+ {REG_A6XX_RBBM_CLOCK_CNTL_RB0, 0x22222222},
+ {REG_A6XX_RBBM_CLOCK_CNTL2_RB0, 0x01202222},
+ {REG_A6XX_RBBM_CLOCK_CNTL_CCU0, 0x00002220},
+ {REG_A6XX_RBBM_CLOCK_HYST_RB_CCU0, 0x00040f00},
+ {REG_A6XX_RBBM_CLOCK_CNTL_RAC, 0x05522022},
+ {REG_A6XX_RBBM_CLOCK_CNTL2_RAC, 0x00005555},
+ {REG_A6XX_RBBM_CLOCK_DELAY_RAC, 0x00000011},
+ {REG_A6XX_RBBM_CLOCK_HYST_RAC, 0x00445044},
+ {REG_A6XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM, 0x04222222},
+ {REG_A6XX_RBBM_CLOCK_MODE_VFD, 0x00002222},
+ {REG_A6XX_RBBM_CLOCK_MODE_GPC, 0x02222222},
+ {REG_A6XX_RBBM_CLOCK_DELAY_HLSQ_2, 0x00000002},
+ {REG_A6XX_RBBM_CLOCK_MODE_HLSQ, 0x00002222},
+ {REG_A6XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM, 0x00004000},
+ {REG_A6XX_RBBM_CLOCK_DELAY_VFD, 0x00002222},
+ {REG_A6XX_RBBM_CLOCK_DELAY_GPC, 0x00000200},
+ {REG_A6XX_RBBM_CLOCK_DELAY_HLSQ, 0x00000000},
+ {REG_A6XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM, 0x00000000},
+ {REG_A6XX_RBBM_CLOCK_HYST_VFD, 0x00000000},
+ {REG_A6XX_RBBM_CLOCK_HYST_GPC, 0x04104004},
+ {REG_A6XX_RBBM_CLOCK_HYST_HLSQ, 0x00000000},
+ {REG_A6XX_RBBM_CLOCK_CNTL_UCHE, 0x22222222},
+ {REG_A6XX_RBBM_CLOCK_HYST_UCHE, 0x00000004},
+ {REG_A6XX_RBBM_CLOCK_DELAY_UCHE, 0x00000002},
+ {REG_A6XX_RBBM_ISDB_CNT, 0x00000182},
+ {REG_A6XX_RBBM_RAC_THRESHOLD_CNT, 0x00000000},
+ {REG_A6XX_RBBM_SP_HYST_CNT, 0x00000000},
+ {REG_A6XX_RBBM_CLOCK_CNTL_GMU_GX, 0x00000222},
+ {REG_A6XX_RBBM_CLOCK_DELAY_GMU_GX, 0x00000111},
+ {REG_A6XX_RBBM_CLOCK_HYST_GMU_GX, 0x00000555},
+ {},
+};
+
/* For a615 family (a615, a616, a618 and a619) */
const struct adreno_reglist a615_hwcg[] = {
{REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x02222222},
@@ -604,6 +654,8 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)

if (adreno_is_a630(adreno_gpu))
clock_cntl_on = 0x8aa8aa02;
+ else if (adreno_is_a610(adreno_gpu))
+ clock_cntl_on = 0xaaa8aa82;
else
clock_cntl_on = 0x8aa8aa82;

@@ -812,6 +864,13 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
/* Entirely magic, per-GPU-gen value */
u32 ubwc_mode = 0;

+ if (adreno_is_a610(adreno_gpu)) {
+ /* HBB = 14 */
+ hbb_lo = 1;
+ min_acc_len = 1;
+ ubwc_mode = 1;
+ }
+
/* a618 is using the hw default values */
if (adreno_is_a618(adreno_gpu))
return;
@@ -1074,13 +1133,13 @@ static int hw_init(struct msm_gpu *gpu)
a6xx_set_hwcg(gpu, true);

/* VBIF/GBIF start*/
- if (adreno_is_a640_family(adreno_gpu) ||
+ if (adreno_is_a610(adreno_gpu) ||
+ adreno_is_a640_family(adreno_gpu) ||
adreno_is_a650_family(adreno_gpu)) {
gpu_write(gpu, REG_A6XX_GBIF_QSB_SIDE0, 0x00071620);
gpu_write(gpu, REG_A6XX_GBIF_QSB_SIDE1, 0x00071620);
gpu_write(gpu, REG_A6XX_GBIF_QSB_SIDE2, 0x00071620);
gpu_write(gpu, REG_A6XX_GBIF_QSB_SIDE3, 0x00071620);
- gpu_write(gpu, REG_A6XX_GBIF_QSB_SIDE3, 0x00071620);
gpu_write(gpu, REG_A6XX_RBBM_GBIF_CLIENT_QOS_CNTL, 0x3);
} else {
gpu_write(gpu, REG_A6XX_RBBM_VBIF_CLIENT_QOS_CNTL, 0x3);
@@ -1111,18 +1170,26 @@ static int hw_init(struct msm_gpu *gpu)
gpu_write(gpu, REG_A6XX_UCHE_FILTER_CNTL, 0x804);
gpu_write(gpu, REG_A6XX_UCHE_CACHE_WAYS, 0x4);

- if (adreno_is_a640_family(adreno_gpu) ||
- adreno_is_a650_family(adreno_gpu))
+ if (adreno_is_a640_family(adreno_gpu) || adreno_is_a650_family(adreno_gpu)) {
gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_2, 0x02000140);
- else
+ gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_1, 0x8040362c);
+ } else if (adreno_is_a610(adreno_gpu)) {
+ gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_2, 0x00800060);
+ gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_1, 0x40201b16);
+ } else {
gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_2, 0x010000c0);
- gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_1, 0x8040362c);
+ gpu_write(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_1, 0x8040362c);
+ }

if (adreno_is_a660_family(adreno_gpu))
gpu_write(gpu, REG_A6XX_CP_LPAC_PROG_FIFO_SIZE, 0x00000020);

/* Setting the mem pool size */
- gpu_write(gpu, REG_A6XX_CP_MEM_POOL_SIZE, 128);
+ if (adreno_is_a610(adreno_gpu)) {
+ gpu_write(gpu, REG_A6XX_CP_MEM_POOL_SIZE, 48);
+ gpu_write(gpu, REG_A6XX_CP_MEM_POOL_DBG_ADDR, 47);
+ } else
+ gpu_write(gpu, REG_A6XX_CP_MEM_POOL_SIZE, 128);

/* Setting the primFifo thresholds default values,
* and vccCacheSkipDis=1 bit (0x200) for A640 and newer
@@ -1133,6 +1200,8 @@ static int hw_init(struct msm_gpu *gpu)
gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00200200);
else if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu))
gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00300200);
+ else if (adreno_is_a610(adreno_gpu))
+ gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00080000);
else
gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00180000);

@@ -1148,8 +1217,10 @@ static int hw_init(struct msm_gpu *gpu)
a6xx_set_ubwc_config(gpu);

/* Enable fault detection */
- gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL,
- (1 << 30) | 0x1fffff);
+ if (adreno_is_a610(adreno_gpu))
+ gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) | 0x3ffff);
+ else
+ gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) | 0x1fffff);

gpu_write(gpu, REG_A6XX_UCHE_CLIENT_PF, 1);

@@ -1383,6 +1454,14 @@ static void a6xx_recover(struct msm_gpu *gpu)

/* Software-reset the GPU */
if (adreno_has_gmu_wrapper(adreno_gpu)) {
+ /* 11nm chips (i.e. A610-hosting ones) have HW issues with the reset line */
+ if (!adreno_is_a610(adreno_gpu)) {
+ gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, 1);
+ gpu_read(gpu, REG_A6XX_RBBM_SW_RESET_CMD);
+ udelay(100);
+ gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, 0);
+ }
+
if (adreno_is_a619_holi(adreno_gpu)) {
gpu_write(gpu, 0x18, GPR0_GBIF_HALT_REQUEST);
spin_until((gpu_read(gpu, REG_A6XX_RBBM_VBIF_GX_RESET_STATUS) &
diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index dfb43741ea32..95053ac29398 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -253,6 +253,18 @@ static const struct adreno_info gpulist[] = {
.quirks = ADRENO_QUIRK_LMLOADKILL_DISABLE,
.init = a5xx_gpu_init,
.zapfw = "a540_zap.mdt",
+ }, {
+ .rev = ADRENO_REV(6, 1, 0, ANY_ID),
+ .revn = 610,
+ .name = "A610",
+ .fw = {
+ [ADRENO_FW_SQE] = "a630_sqe.fw",
+ },
+ .gmem = (SZ_128K + SZ_4K),
+ .inactive_period = 500,
+ .init = a6xx_gpu_init,
+ .zapfw = "a610_zap.mdt",
+ .hwcg = a612_hwcg,
}, {
.rev = ADRENO_REV(6, 1, 8, ANY_ID),
.revn = 618,
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 92ece15ec7d8..27c30a7694f4 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -55,7 +55,8 @@ struct adreno_reglist {
u32 value;
};

-extern const struct adreno_reglist a615_hwcg[], a630_hwcg[], a640_hwcg[], a650_hwcg[], a660_hwcg[];
+extern const struct adreno_reglist a612_hwcg[], a615_hwcg[], a630_hwcg[], a640_hwcg[], a650_hwcg[];
+extern const struct adreno_reglist a660_hwcg[];

struct adreno_info {
struct adreno_rev rev;
@@ -242,6 +243,11 @@ static inline int adreno_is_a540(struct adreno_gpu *gpu)
return gpu->revn == 540;
}

+static inline int adreno_is_a610(struct adreno_gpu *gpu)
+{
+ return gpu->revn == 610;
+}
+
static inline int adreno_is_a618(struct adreno_gpu *gpu)
{
return gpu->revn == 618;

--
2.39.2


2023-02-23 12:07:38

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 11/15] drm/msm/a6xx: Fix some A619 tunables

Adreno 619 expects some tunables to be set differently. Make up for it.

Fixes: b7616b5c69e6 ("drm/msm/adreno: Add A619 support")
Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 2c4afecdd213..eb24be772934 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1206,6 +1206,8 @@ static int hw_init(struct msm_gpu *gpu)
gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00200200);
else if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu))
gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00300200);
+ else if (adreno_is_a619(adreno_gpu))
+ gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00018000);
else if (adreno_is_a610(adreno_gpu))
gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00080000);
else
@@ -1223,7 +1225,9 @@ static int hw_init(struct msm_gpu *gpu)
a6xx_set_ubwc_config(gpu);

/* Enable fault detection */
- if (adreno_is_a610(adreno_gpu))
+ if (adreno_is_a619(adreno_gpu))
+ gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) | 0x3fffff);
+ else if (adreno_is_a610(adreno_gpu))
gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) | 0x3ffff);
else
gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) | 0x1fffff);

--
2.39.2


2023-02-23 12:07:42

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 10/15] drm/msm/a6xx: Fix A680 highest bank bit value

According to the vendor sources, it's equal to 16, which makes hbb_lo
equal to 3.

Fixes: 840d10b64dad ("drm: msm: Add 680 gpu to the adreno gpu list")
Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index b5017c56fa1b..2c4afecdd213 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -885,12 +885,18 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
hbb_lo = 2;
}

- if (adreno_is_a640_family(adreno_gpu)) {
+ if (adreno_is_a640(adreno_gpu)) {
amsbc = 1;
/* HBB = 15 */
hbb_lo = 2;
}

+ if (adreno_is_a680(adreno_gpu)) {
+ amsbc = 1;
+ /* HBB = 16 */
+ hbb_lo = 3;
+ }
+
if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
amsbc = 1;
/* TODO: get ddr type from bootloader and use 2 for LPDDR4 */

--
2.39.2


2023-02-23 12:07:43

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 12/15] drm/msm/a6xx: Use "else if" in GPU speedbin rev matching

The GPU can only be one at a time. Turn a series of ifs into if +
elseifs to save some CPU cycles.

Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index eb24be772934..f694acca931c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -2222,16 +2222,16 @@ static u32 fuse_to_supp_hw(struct device *dev, struct adreno_rev rev, u32 fuse)
if (adreno_cmp_rev(ADRENO_REV(6, 1, 8, ANY_ID), rev))
val = a618_get_speed_bin(fuse);

- if (adreno_cmp_rev(ADRENO_REV(6, 1, 9, ANY_ID), rev))
+ else if (adreno_cmp_rev(ADRENO_REV(6, 1, 9, ANY_ID), rev))
val = a619_get_speed_bin(fuse);

- if (adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), rev))
+ else if (adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), rev))
val = adreno_7c3_get_speed_bin(fuse);

- if (adreno_cmp_rev(ADRENO_REV(6, 4, 0, ANY_ID), rev))
+ else if (adreno_cmp_rev(ADRENO_REV(6, 4, 0, ANY_ID), rev))
val = a640_get_speed_bin(fuse);

- if (adreno_cmp_rev(ADRENO_REV(6, 5, 0, ANY_ID), rev))
+ else if (adreno_cmp_rev(ADRENO_REV(6, 5, 0, ANY_ID), rev))
val = a650_get_speed_bin(fuse);

if (val == UINT_MAX) {

--
2.39.2


2023-02-23 12:07:53

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 14/15] drm/msm/a6xx: Add A619_holi speedbin support

A619_holi is implemented on at least two SoCs: SM4350 (holi) and SM6375
(blair). This is what seems to be a first occurrence of this happening,
but it's easy to overcome by guarding the SoC-specific fuse values with
of_machine_is_compatible(). Do just that to enable frequency limiting
on these SoCs.

Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index d49b649ebecf..81f99f8d1978 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -2163,6 +2163,34 @@ static u32 a618_get_speed_bin(u32 fuse)
return UINT_MAX;
}

+static u32 a619_holi_get_speed_bin(u32 fuse)
+{
+ /*
+ * There are (at least) two SoCs implementing A619_holi: SM4350 (holi)
+ * and SM6375 (blair). Limit the fuse matching to the corresponding
+ * SoC to prevent bogus frequency setting (as improbable as it may be,
+ * given unexpected fuse values are.. unexpected! But still possible.)
+ */
+
+ if (fuse == 0)
+ return 0;
+
+ if (of_machine_is_compatible("qcom,sm4350")) {
+ if (fuse == 138)
+ return 1;
+ else if (fuse == 92)
+ return 2;
+ } else if (of_machine_is_compatible("qcom,sm6375")) {
+ if (fuse == 190)
+ return 1;
+ else if (fuse == 177)
+ return 2;
+ } else
+ pr_warn("Unknown SoC implementing A619_holi!\n");
+
+ return UINT_MAX;
+}
+
static u32 a619_get_speed_bin(u32 fuse)
{
if (fuse == 0)
@@ -2222,6 +2250,9 @@ static u32 fuse_to_supp_hw(struct device *dev, struct adreno_gpu *adreno_gpu, u3
if (adreno_is_a618(adreno_gpu))
val = a618_get_speed_bin(fuse);

+ else if (adreno_is_a619_holi(adreno_gpu))
+ val = a619_holi_get_speed_bin(fuse);
+
else if (adreno_is_a619(adreno_gpu))
val = a619_get_speed_bin(fuse);


--
2.39.2


2023-02-23 12:07:55

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 15/15] drm/msm/a6xx: Add A610 speedbin support

A610 is implemented on at least three SoCs: SM6115 (bengal), SM6125
(trinket) and SM6225 (khaje). Trinket does not support speed binning
(only a single SKU exists) and we don't yet support khaje upstream.
Hence, add a fuse mapping table for bengal to allow for per-chip
frequency limiting.

Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 81f99f8d1978..f78077abb886 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -2151,6 +2151,30 @@ static bool a6xx_progress(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
return progress;
}

+static u32 a610_get_speed_bin(u32 fuse)
+{
+ /*
+ * There are (at least) three SoCs implementing A610: SM6125 (trinket),
+ * SM6115 (bengal) and SM6225 (khaje). Trinket does not have speedbinning,
+ * as only a single SKU exists and we don't support khaje upstream yet.
+ * Hence, this matching table is only valid for bengal and can be easily
+ * expanded if need be.
+ */
+
+ if (fuse == 0)
+ return 0;
+ else if (fuse == 206)
+ return 1;
+ else if (fuse == 200)
+ return 2;
+ else if (fuse == 157)
+ return 3;
+ else if (fuse == 127)
+ return 4;
+
+ return UINT_MAX;
+}
+
static u32 a618_get_speed_bin(u32 fuse)
{
if (fuse == 0)
@@ -2247,6 +2271,9 @@ static u32 fuse_to_supp_hw(struct device *dev, struct adreno_gpu *adreno_gpu, u3
{
u32 val = UINT_MAX;

+ if (adreno_is_a610(adreno_gpu))
+ val = a610_get_speed_bin(fuse);
+
if (adreno_is_a618(adreno_gpu))
val = a618_get_speed_bin(fuse);


--
2.39.2


2023-02-23 12:07:59

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH v3 13/15] drm/msm/a6xx: Use adreno_is_aXYZ macros in speedbin matching

Before transitioning to using per-SoC and not per-Adreno speedbin
fuse values (need another patchset to land elsewhere), a good
improvement/stopgap solution is to use adreno_is_aXYZ macros in
place of explicit revision matching. Do so to allow differentiating
between A619 and A619_holi.

Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 18 +++++++++---------
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 14 ++++++++++++--
2 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index f694acca931c..d49b649ebecf 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -2215,23 +2215,23 @@ static u32 adreno_7c3_get_speed_bin(u32 fuse)
return UINT_MAX;
}

-static u32 fuse_to_supp_hw(struct device *dev, struct adreno_rev rev, u32 fuse)
+static u32 fuse_to_supp_hw(struct device *dev, struct adreno_gpu *adreno_gpu, u32 fuse)
{
u32 val = UINT_MAX;

- if (adreno_cmp_rev(ADRENO_REV(6, 1, 8, ANY_ID), rev))
+ if (adreno_is_a618(adreno_gpu))
val = a618_get_speed_bin(fuse);

- else if (adreno_cmp_rev(ADRENO_REV(6, 1, 9, ANY_ID), rev))
+ else if (adreno_is_a619(adreno_gpu))
val = a619_get_speed_bin(fuse);

- else if (adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), rev))
+ else if (adreno_is_7c3(adreno_gpu))
val = adreno_7c3_get_speed_bin(fuse);

- else if (adreno_cmp_rev(ADRENO_REV(6, 4, 0, ANY_ID), rev))
+ else if (adreno_is_a640(adreno_gpu))
val = a640_get_speed_bin(fuse);

- else if (adreno_cmp_rev(ADRENO_REV(6, 5, 0, ANY_ID), rev))
+ else if (adreno_is_a650(adreno_gpu))
val = a650_get_speed_bin(fuse);

if (val == UINT_MAX) {
@@ -2244,7 +2244,7 @@ static u32 fuse_to_supp_hw(struct device *dev, struct adreno_rev rev, u32 fuse)
return (1 << val);
}

-static int a6xx_set_supported_hw(struct device *dev, struct adreno_rev rev)
+static int a6xx_set_supported_hw(struct device *dev, struct adreno_gpu *adreno_gpu)
{
u32 supp_hw;
u32 speedbin;
@@ -2263,7 +2263,7 @@ static int a6xx_set_supported_hw(struct device *dev, struct adreno_rev rev)
return ret;
}

- supp_hw = fuse_to_supp_hw(dev, rev, speedbin);
+ supp_hw = fuse_to_supp_hw(dev, adreno_gpu, speedbin);

ret = devm_pm_opp_set_supported_hw(dev, &supp_hw, 1);
if (ret)
@@ -2382,7 +2382,7 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
if (!adreno_has_gmu_wrapper(adreno_gpu))
a6xx_llc_slices_init(pdev, a6xx_gpu);

- ret = a6xx_set_supported_hw(&pdev->dev, config->rev);
+ ret = a6xx_set_supported_hw(&pdev->dev, adreno_gpu);
if (ret) {
a6xx_destroy(&(a6xx_gpu->base.base));
return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 27c30a7694f4..da9f45a13b5d 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -268,9 +268,9 @@ static inline int adreno_is_a630(struct adreno_gpu *gpu)
return gpu->revn == 630;
}

-static inline int adreno_is_a640_family(struct adreno_gpu *gpu)
+static inline int adreno_is_a640(struct adreno_gpu *gpu)
{
- return (gpu->revn == 640) || (gpu->revn == 680);
+ return gpu->revn == 640;
}

static inline int adreno_is_a650(struct adreno_gpu *gpu)
@@ -289,6 +289,11 @@ static inline int adreno_is_a660(struct adreno_gpu *gpu)
return gpu->revn == 660;
}

+static inline int adreno_is_a680(struct adreno_gpu *gpu)
+{
+ return gpu->revn == 680;
+}
+
/* check for a615, a616, a618, a619 or any derivatives */
static inline int adreno_is_a615_family(struct adreno_gpu *gpu)
{
@@ -306,6 +311,11 @@ static inline int adreno_is_a650_family(struct adreno_gpu *gpu)
return gpu->revn == 650 || gpu->revn == 620 || adreno_is_a660_family(gpu);
}

+static inline int adreno_is_a640_family(struct adreno_gpu *gpu)
+{
+ return adreno_is_a640(gpu) || adreno_is_a680(gpu);
+}
+
u64 adreno_private_address_space_size(struct msm_gpu *gpu);
int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
uint32_t param, uint64_t *value, uint32_t *len);

--
2.39.2


2023-02-23 12:10:44

by Konrad Dybcio

[permalink] [raw]
Subject: Re: [PATCH v3 01/15] dt-bindings: display/msm: gpu: Document GMU wrapper-equipped A6xx



On 23.02.2023 13:06, Konrad Dybcio wrote:
> GMU wrapper-equipped A6xx GPUs require clocks and clock-names to be
> specified under the GPU node, just like their older cousins.
> Account for that.
>
> Signed-off-by: Konrad Dybcio <[email protected]>
> ---
[...]
> - then: # Since Adreno 6xx series clocks should be defined in GMU
> + enum:
> + - qcom,adreno-610.0
> + - qcom,adreno-619.1
Immediate comment: this could be improved by checking
the compatible of the node referenced in qcom,gmu, but
frankly - I have no idea how to do this / whether it's
possible with schema.

Konrad
> + then:
> properties:
> - clocks: false
> - clock-names: false
> + clock-names:
> + items:
> + - const: core
> + description: GPU Core clock
> + - const: iface
> + description: GPU Interface clock
> + - const: mem_iface
> + description: GPU Memory Interface clock
> + - const: alt_mem_iface
> + description: GPU Alternative Memory Interface clock
> + - const: gmu
> + description: CX GMU clock
> + - const: xo
> + description: GPUCC clocksource clock
> +
> + reg-names:
> + minItems: 1
> + items:
> + - const: kgsl_3d0_reg_memory
> + - const: cx_dbgc
> +
> + required:
> + - clocks
> + - clock-names
> + else:
> + if:
> + properties:
> + compatible:
> + contains:
> + pattern: '^qcom,adreno-6[0-9][0-9]\.[0-9]$'
> +
> + then: # Starting with A6xx, the clocks are usually defined in the GMU node
> + properties:
> + clocks: false
> + clock-names: false
> +
> + reg-names:
> + minItems: 1
> + items:
> + - const: kgsl_3d0_reg_memory
> + - const: cx_mem
> + - const: cx_dbgc
>
> examples:
> - |
>

2023-02-23 13:06:52

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH v3 10/15] drm/msm/a6xx: Fix A680 highest bank bit value

On Thu, 23 Feb 2023 at 14:07, Konrad Dybcio <[email protected]> wrote:
>
> According to the vendor sources, it's equal to 16, which makes hbb_lo
> equal to 3.

I think we might be stricken with the ddr kind difference here, but I
would not bet on it.

>
> Fixes: 840d10b64dad ("drm: msm: Add 680 gpu to the adreno gpu list")
> Signed-off-by: Konrad Dybcio <[email protected]>
> ---
> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index b5017c56fa1b..2c4afecdd213 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -885,12 +885,18 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
> hbb_lo = 2;
> }
>
> - if (adreno_is_a640_family(adreno_gpu)) {
> + if (adreno_is_a640(adreno_gpu)) {
> amsbc = 1;
> /* HBB = 15 */
> hbb_lo = 2;
> }
>
> + if (adreno_is_a680(adreno_gpu)) {
> + amsbc = 1;
> + /* HBB = 16 */
> + hbb_lo = 3;
> + }
> +
> if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
> amsbc = 1;
> /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>
> --
> 2.39.2
>


--
With best wishes
Dmitry

2023-02-23 13:49:28

by Konrad Dybcio

[permalink] [raw]
Subject: Re: [PATCH v3 10/15] drm/msm/a6xx: Fix A680 highest bank bit value



On 23.02.2023 14:06, Dmitry Baryshkov wrote:
> On Thu, 23 Feb 2023 at 14:07, Konrad Dybcio <[email protected]> wrote:
>>
>> According to the vendor sources, it's equal to 16, which makes hbb_lo
>> equal to 3.
>
> I think we might be stricken with the ddr kind difference here, but I
> would not bet on it.
It totally is, but it also seems to be SoC-dependent..
I think all 8180x devices shipped with LPDDR4X FWIW

Konrad
>
>>
>> Fixes: 840d10b64dad ("drm: msm: Add 680 gpu to the adreno gpu list")
>> Signed-off-by: Konrad Dybcio <[email protected]>
>> ---
>> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> index b5017c56fa1b..2c4afecdd213 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> @@ -885,12 +885,18 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>> hbb_lo = 2;
>> }
>>
>> - if (adreno_is_a640_family(adreno_gpu)) {
>> + if (adreno_is_a640(adreno_gpu)) {
>> amsbc = 1;
>> /* HBB = 15 */
>> hbb_lo = 2;
>> }
>>
>> + if (adreno_is_a680(adreno_gpu)) {
>> + amsbc = 1;
>> + /* HBB = 16 */
>> + hbb_lo = 3;
>> + }
>> +
>> if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>> amsbc = 1;
>> /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>
>> --
>> 2.39.2
>>
>
>

2023-02-23 14:44:16

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH v3 05/15] drm/msm/a6xx: Introduce GMU wrapper support

On Thu, 23 Feb 2023 at 14:06, Konrad Dybcio <[email protected]> wrote:
>
> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> but don't implement the associated GMUs. This is due to the fact that
> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> of enabling & scaling power rails, clocks and bandwidth ourselves.
>
> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> A6XX code to facilitate these GPUs. This involves if-ing out lots
> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> the actual name that Qualcomm uses in their downstream kernels).
>
> This is essentially a register region which is convenient to model
> as a device. We'll use it for managing the GDSCs. The register
> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> and lets us reuse quite a bit of gmu_read/write/rmw calls.
>
> Signed-off-by: Konrad Dybcio <[email protected]>
> ---
> drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 53 +++++-
> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 244 +++++++++++++++++++++++++---
> drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 1 +
> drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 14 +-
> drivers/gpu/drm/msm/adreno/adreno_gpu.h | 6 +
> 5 files changed, 282 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 90e636dcdd5b..b2c56561cde6 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c

[skipped]

> struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
> {
> struct msm_drm_private *priv = dev->dev_private;
> @@ -2063,18 +2235,36 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
>
> adreno_gpu->registers = NULL;
>
> + /* Check if there is a GMU phandle and set it up */
> + node = of_parse_phandle(pdev->dev.of_node, "qcom,gmu", 0);
> + /* FIXME: How do we gracefully handle this? */
> + BUG_ON(!node);

I thought that we should fix it, but then I noticed that this code was
just moved from the part below.

> +
> + adreno_gpu->gmu_is_wrapper = of_device_is_compatible(node, "qcom,adreno-gmu-wrapper");
> +
> /*
> * We need to know the platform type before calling into adreno_gpu_init
> * so that the hw_apriv flag can be correctly set. Snoop into the info
> * and grab the revision number
> */
> info = adreno_info(config->rev);
> -
> - if (info && (info->revn == 650 || info->revn == 660 ||
> - adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), info->rev)))

Are we losing A635 here? I don't see it in the condition below.

> + if (!info)
> + return ERR_PTR(-EINVAL);
> +
> + /* Assign these early so that we can use the is_aXYZ helpers */
> + /* Numeric revision IDs (e.g. 630) */
> + adreno_gpu->revn = info->revn;
> + /* New-style ADRENO_REV()-only */
> + adreno_gpu->rev = info->rev;
> + /* Quirk data */
> + adreno_gpu->info = info;
> +
> + if (adreno_is_a650(adreno_gpu) || adreno_is_a660_family(adreno_gpu))
> adreno_gpu->base.hw_apriv = true;
>
> - a6xx_llc_slices_init(pdev, a6xx_gpu);
> + /* No LLCC on non-RPMh (and by extension, non-GMU) SoCs */
> + if (!adreno_has_gmu_wrapper(adreno_gpu))
> + a6xx_llc_slices_init(pdev, a6xx_gpu);
>
> ret = a6xx_set_supported_hw(&pdev->dev, config->rev);
> if (ret) {
> @@ -2082,7 +2272,10 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
> return ERR_PTR(ret);
> }
>
> - ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, 1);
> + if (adreno_has_gmu_wrapper(adreno_gpu))
> + ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs_gmuwrapper, 1);
> + else
> + ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, 1);
> if (ret) {
> a6xx_destroy(&(a6xx_gpu->base.base));
> return ERR_PTR(ret);
> @@ -2095,13 +2288,10 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
> if (adreno_is_a618(adreno_gpu) || adreno_is_7c3(adreno_gpu))
> priv->gpu_clamp_to_idle = true;
>
> - /* Check if there is a GMU phandle and set it up */
> - node = of_parse_phandle(pdev->dev.of_node, "qcom,gmu", 0);
> -
> - /* FIXME: How do we gracefully handle this? */
> - BUG_ON(!node);
> -
> - ret = a6xx_gmu_init(a6xx_gpu, node);
> + if (adreno_has_gmu_wrapper(adreno_gpu))
> + ret = a6xx_gmu_wrapper_init(a6xx_gpu, node);
> + else
> + ret = a6xx_gmu_init(a6xx_gpu, node);
> of_node_put(node);
> if (ret) {
> a6xx_destroy(&(a6xx_gpu->base.base));
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> index eea2e60ce3b7..51a7656072fa 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> @@ -76,6 +76,7 @@ int a6xx_gmu_set_oob(struct a6xx_gmu *gmu, enum a6xx_gmu_oob_state state);
> void a6xx_gmu_clear_oob(struct a6xx_gmu *gmu, enum a6xx_gmu_oob_state state);
>
> int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node);
> +int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node);
> void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu);
>
> void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> index b7e217d00a22..e11e8a02ac22 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> @@ -1041,16 +1041,18 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu)
> /* Get the generic state from the adreno core */
> adreno_gpu_state_get(gpu, &a6xx_state->base);
>
> - a6xx_get_gmu_registers(gpu, a6xx_state);
> + if (!adreno_has_gmu_wrapper(adreno_gpu)) {
> + a6xx_get_gmu_registers(gpu, a6xx_state);
>
> - a6xx_state->gmu_log = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.log);
> - a6xx_state->gmu_hfi = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.hfi);
> - a6xx_state->gmu_debug = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.debug);
> + a6xx_state->gmu_log = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.log);
> + a6xx_state->gmu_hfi = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.hfi);
> + a6xx_state->gmu_debug = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.debug);
>
> - a6xx_snapshot_gmu_hfi_history(gpu, a6xx_state);
> + a6xx_snapshot_gmu_hfi_history(gpu, a6xx_state);
> + }
>
> /* If GX isn't on the rest of the data isn't going to be accessible */
> - if (!a6xx_gmu_gx_is_on(&a6xx_gpu->gmu))
> + if (!adreno_has_gmu_wrapper(adreno_gpu) && !a6xx_gmu_gx_is_on(&a6xx_gpu->gmu))
> return &a6xx_state->base;
>
> /* Get the banks of indexed registers */
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> index b4f9b1343d63..2c0f0ef094cb 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> @@ -115,6 +115,7 @@ struct adreno_gpu {
> * code (a3xx_gpu.c) and stored in this common location.
> */
> const unsigned int *reg_offsets;
> + bool gmu_is_wrapper;
> };
> #define to_adreno_gpu(x) container_of(x, struct adreno_gpu, base)
>
> @@ -145,6 +146,11 @@ struct adreno_platform_config {
>
> bool adreno_cmp_rev(struct adreno_rev rev1, struct adreno_rev rev2);
>
> +static inline bool adreno_has_gmu_wrapper(struct adreno_gpu *gpu)
> +{
> + return gpu->gmu_is_wrapper;
> +}
> +
> static inline bool adreno_is_a2xx(struct adreno_gpu *gpu)
> {
> return (gpu->revn < 300);
>
> --
> 2.39.2
>


--
With best wishes
Dmitry

2023-02-23 14:46:48

by Konrad Dybcio

[permalink] [raw]
Subject: Re: [PATCH v3 05/15] drm/msm/a6xx: Introduce GMU wrapper support



On 23.02.2023 15:43, Dmitry Baryshkov wrote:
> On Thu, 23 Feb 2023 at 14:06, Konrad Dybcio <[email protected]> wrote:
>>
>> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
>> but don't implement the associated GMUs. This is due to the fact that
>> the GMU directly pokes at RPMh. Sadly, this means we have to take care
>> of enabling & scaling power rails, clocks and bandwidth ourselves.
>>
>> Reuse existing Adreno-common code and modify the deeply-GMU-infused
>> A6XX code to facilitate these GPUs. This involves if-ing out lots
>> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
>> the actual name that Qualcomm uses in their downstream kernels).
>>
>> This is essentially a register region which is convenient to model
>> as a device. We'll use it for managing the GDSCs. The register
>> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
>> and lets us reuse quite a bit of gmu_read/write/rmw calls.
>>
>> Signed-off-by: Konrad Dybcio <[email protected]>
>> ---
>> drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 53 +++++-
>> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 244 +++++++++++++++++++++++++---
>> drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 1 +
>> drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 14 +-
>> drivers/gpu/drm/msm/adreno/adreno_gpu.h | 6 +
>> 5 files changed, 282 insertions(+), 36 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> index 90e636dcdd5b..b2c56561cde6 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>
> [skipped]
>
>> struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
>> {
>> struct msm_drm_private *priv = dev->dev_private;
>> @@ -2063,18 +2235,36 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
>>
>> adreno_gpu->registers = NULL;
>>
>> + /* Check if there is a GMU phandle and set it up */
>> + node = of_parse_phandle(pdev->dev.of_node, "qcom,gmu", 0);
>> + /* FIXME: How do we gracefully handle this? */
>> + BUG_ON(!node);
>
> I thought that we should fix it, but then I noticed that this code was
> just moved from the part below.
I suppose we could return einval instead of calling BUG_ON,
but this would belong to a separate patch (possibly outside
this series).

>
>> +
>> + adreno_gpu->gmu_is_wrapper = of_device_is_compatible(node, "qcom,adreno-gmu-wrapper");
>> +
>> /*
>> * We need to know the platform type before calling into adreno_gpu_init
>> * so that the hw_apriv flag can be correctly set. Snoop into the info
>> * and grab the revision number
>> */
>> info = adreno_info(config->rev);
>> -
>> - if (info && (info->revn == 650 || info->revn == 660 ||
>> - adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), info->rev)))
>
> Are we losing A635 here? I don't see it in the condition below.
a660_family takes care of it

Konrad

>
>> + if (!info)
>> + return ERR_PTR(-EINVAL);
>> +
>> + /* Assign these early so that we can use the is_aXYZ helpers */
>> + /* Numeric revision IDs (e.g. 630) */
>> + adreno_gpu->revn = info->revn;
>> + /* New-style ADRENO_REV()-only */
>> + adreno_gpu->rev = info->rev;
>> + /* Quirk data */
>> + adreno_gpu->info = info;
>> +
>> + if (adreno_is_a650(adreno_gpu) || adreno_is_a660_family(adreno_gpu))
>> adreno_gpu->base.hw_apriv = true;
>>
>> - a6xx_llc_slices_init(pdev, a6xx_gpu);
>> + /* No LLCC on non-RPMh (and by extension, non-GMU) SoCs */
>> + if (!adreno_has_gmu_wrapper(adreno_gpu))
>> + a6xx_llc_slices_init(pdev, a6xx_gpu);
>>
>> ret = a6xx_set_supported_hw(&pdev->dev, config->rev);
>> if (ret) {
>> @@ -2082,7 +2272,10 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
>> return ERR_PTR(ret);
>> }
>>
>> - ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, 1);
>> + if (adreno_has_gmu_wrapper(adreno_gpu))
>> + ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs_gmuwrapper, 1);
>> + else
>> + ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, 1);
>> if (ret) {
>> a6xx_destroy(&(a6xx_gpu->base.base));
>> return ERR_PTR(ret);
>> @@ -2095,13 +2288,10 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
>> if (adreno_is_a618(adreno_gpu) || adreno_is_7c3(adreno_gpu))
>> priv->gpu_clamp_to_idle = true;
>>
>> - /* Check if there is a GMU phandle and set it up */
>> - node = of_parse_phandle(pdev->dev.of_node, "qcom,gmu", 0);
>> -
>> - /* FIXME: How do we gracefully handle this? */
>> - BUG_ON(!node);
>> -
>> - ret = a6xx_gmu_init(a6xx_gpu, node);
>> + if (adreno_has_gmu_wrapper(adreno_gpu))
>> + ret = a6xx_gmu_wrapper_init(a6xx_gpu, node);
>> + else
>> + ret = a6xx_gmu_init(a6xx_gpu, node);
>> of_node_put(node);
>> if (ret) {
>> a6xx_destroy(&(a6xx_gpu->base.base));
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
>> index eea2e60ce3b7..51a7656072fa 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
>> @@ -76,6 +76,7 @@ int a6xx_gmu_set_oob(struct a6xx_gmu *gmu, enum a6xx_gmu_oob_state state);
>> void a6xx_gmu_clear_oob(struct a6xx_gmu *gmu, enum a6xx_gmu_oob_state state);
>>
>> int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node);
>> +int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node);
>> void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu);
>>
>> void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
>> index b7e217d00a22..e11e8a02ac22 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
>> @@ -1041,16 +1041,18 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu)
>> /* Get the generic state from the adreno core */
>> adreno_gpu_state_get(gpu, &a6xx_state->base);
>>
>> - a6xx_get_gmu_registers(gpu, a6xx_state);
>> + if (!adreno_has_gmu_wrapper(adreno_gpu)) {
>> + a6xx_get_gmu_registers(gpu, a6xx_state);
>>
>> - a6xx_state->gmu_log = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.log);
>> - a6xx_state->gmu_hfi = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.hfi);
>> - a6xx_state->gmu_debug = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.debug);
>> + a6xx_state->gmu_log = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.log);
>> + a6xx_state->gmu_hfi = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.hfi);
>> + a6xx_state->gmu_debug = a6xx_snapshot_gmu_bo(a6xx_state, &a6xx_gpu->gmu.debug);
>>
>> - a6xx_snapshot_gmu_hfi_history(gpu, a6xx_state);
>> + a6xx_snapshot_gmu_hfi_history(gpu, a6xx_state);
>> + }
>>
>> /* If GX isn't on the rest of the data isn't going to be accessible */
>> - if (!a6xx_gmu_gx_is_on(&a6xx_gpu->gmu))
>> + if (!adreno_has_gmu_wrapper(adreno_gpu) && !a6xx_gmu_gx_is_on(&a6xx_gpu->gmu))
>> return &a6xx_state->base;
>>
>> /* Get the banks of indexed registers */
>> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
>> index b4f9b1343d63..2c0f0ef094cb 100644
>> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
>> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
>> @@ -115,6 +115,7 @@ struct adreno_gpu {
>> * code (a3xx_gpu.c) and stored in this common location.
>> */
>> const unsigned int *reg_offsets;
>> + bool gmu_is_wrapper;
>> };
>> #define to_adreno_gpu(x) container_of(x, struct adreno_gpu, base)
>>
>> @@ -145,6 +146,11 @@ struct adreno_platform_config {
>>
>> bool adreno_cmp_rev(struct adreno_rev rev1, struct adreno_rev rev2);
>>
>> +static inline bool adreno_has_gmu_wrapper(struct adreno_gpu *gpu)
>> +{
>> + return gpu->gmu_is_wrapper;
>> +}
>> +
>> static inline bool adreno_is_a2xx(struct adreno_gpu *gpu)
>> {
>> return (gpu->revn < 300);
>>
>> --
>> 2.39.2
>>
>
>

2023-02-23 14:49:11

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH v3 10/15] drm/msm/a6xx: Fix A680 highest bank bit value

On Thu, 23 Feb 2023 at 15:49, Konrad Dybcio <[email protected]> wrote:
>
>
>
> On 23.02.2023 14:06, Dmitry Baryshkov wrote:
> > On Thu, 23 Feb 2023 at 14:07, Konrad Dybcio <[email protected]> wrote:
> >>
> >> According to the vendor sources, it's equal to 16, which makes hbb_lo
> >> equal to 3.
> >
> > I think we might be stricken with the ddr kind difference here, but I
> > would not bet on it.
> It totally is, but it also seems to be SoC-dependent..
> I think all 8180x devices shipped with LPDDR4X FWIW

I think so too. However sdmshrike dts uses LPDDR5.

>
> Konrad
> >
> >>
> >> Fixes: 840d10b64dad ("drm: msm: Add 680 gpu to the adreno gpu list")
> >> Signed-off-by: Konrad Dybcio <[email protected]>
> >> ---
> >> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 8 +++++++-
> >> 1 file changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >> index b5017c56fa1b..2c4afecdd213 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >> @@ -885,12 +885,18 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
> >> hbb_lo = 2;
> >> }
> >>
> >> - if (adreno_is_a640_family(adreno_gpu)) {
> >> + if (adreno_is_a640(adreno_gpu)) {
> >> amsbc = 1;
> >> /* HBB = 15 */
> >> hbb_lo = 2;
> >> }
> >>
> >> + if (adreno_is_a680(adreno_gpu)) {
> >> + amsbc = 1;
> >> + /* HBB = 16 */
> >> + hbb_lo = 3;
> >> + }
> >> +
> >> if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
> >> amsbc = 1;
> >> /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> >>
> >> --
> >> 2.39.2
> >>
> >
> >



--
With best wishes
Dmitry

2023-02-23 14:51:42

by Konrad Dybcio

[permalink] [raw]
Subject: Re: [PATCH v3 10/15] drm/msm/a6xx: Fix A680 highest bank bit value



On 23.02.2023 15:48, Dmitry Baryshkov wrote:
> On Thu, 23 Feb 2023 at 15:49, Konrad Dybcio <[email protected]> wrote:
>>
>>
>>
>> On 23.02.2023 14:06, Dmitry Baryshkov wrote:
>>> On Thu, 23 Feb 2023 at 14:07, Konrad Dybcio <[email protected]> wrote:
>>>>
>>>> According to the vendor sources, it's equal to 16, which makes hbb_lo
>>>> equal to 3.
>>>
>>> I think we might be stricken with the ddr kind difference here, but I
>>> would not bet on it.
>> It totally is, but it also seems to be SoC-dependent..
>> I think all 8180x devices shipped with LPDDR4X FWIW
>
> I think so too. However sdmshrike dts uses LPDDR5.
Yeah.. it may be better to skip this patch; it should be
possible to apply this series without this one.

Konrad
>
>>
>> Konrad
>>>
>>>>
>>>> Fixes: 840d10b64dad ("drm: msm: Add 680 gpu to the adreno gpu list")
>>>> Signed-off-by: Konrad Dybcio <[email protected]>
>>>> ---
>>>> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 8 +++++++-
>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> index b5017c56fa1b..2c4afecdd213 100644
>>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> @@ -885,12 +885,18 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>>>> hbb_lo = 2;
>>>> }
>>>>
>>>> - if (adreno_is_a640_family(adreno_gpu)) {
>>>> + if (adreno_is_a640(adreno_gpu)) {
>>>> amsbc = 1;
>>>> /* HBB = 15 */
>>>> hbb_lo = 2;
>>>> }
>>>>
>>>> + if (adreno_is_a680(adreno_gpu)) {
>>>> + amsbc = 1;
>>>> + /* HBB = 16 */
>>>> + hbb_lo = 3;
>>>> + }
>>>> +
>>>> if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>>>> amsbc = 1;
>>>> /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>>>
>>>> --
>>>> 2.39.2
>>>>
>>>
>>>
>
>
>

2023-02-24 11:18:05

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v3 01/15] dt-bindings: display/msm: gpu: Document GMU wrapper-equipped A6xx

On 23/02/2023 13:06, Konrad Dybcio wrote:
> GMU wrapper-equipped A6xx GPUs require clocks and clock-names to be
> specified under the GPU node, just like their older cousins.
> Account for that.
>
> Signed-off-by: Konrad Dybcio <[email protected]>
> ---
> .../devicetree/bindings/display/msm/gpu.yaml | 63 ++++++++++++++++++----
> 1 file changed, 53 insertions(+), 10 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/display/msm/gpu.yaml b/Documentation/devicetree/bindings/display/msm/gpu.yaml
> index d4191cca71fb..e6d3160601bc 100644
> --- a/Documentation/devicetree/bindings/display/msm/gpu.yaml
> +++ b/Documentation/devicetree/bindings/display/msm/gpu.yaml
> @@ -36,10 +36,7 @@ properties:
>
> reg-names:
> minItems: 1
> - items:
> - - const: kgsl_3d0_reg_memory
> - - const: cx_mem
> - - const: cx_dbgc
> + maxItems: 3
>
> interrupts:
> maxItems: 1
> @@ -147,26 +144,72 @@ allOf:
> description: GPU Alternative Memory Interface clock
> - const: gfx3d
> description: GPU 3D engine clock
> + - const: gmu
> + description: CX GMU clock
> - const: rbbmtimer
> description: GPU RBBM Timer for Adreno 5xx series
> - const: rbcpr
> description: GPU RB Core Power Reduction clock
> + - const: xo
> + description: GPUCC clocksource clock
> minItems: 2
> - maxItems: 7
> + maxItems: 9

Your commit says A6xx but this is a3-5xx. I don't understand this change.

>
> required:
> - clocks
> - clock-names
> +
> - if:
> properties:
> compatible:
> contains:
> - pattern: '^qcom,adreno-6[0-9][0-9]\.[0-9]$'
> -
> - then: # Since Adreno 6xx series clocks should be defined in GMU
> + enum:
> + - qcom,adreno-610.0
> + - qcom,adreno-619.1
> + then:
> properties:
> - clocks: false
> - clock-names: false
> + clock-names:
> + items:
> + - const: core
> + description: GPU Core clock
> + - const: iface
> + description: GPU Interface clock
> + - const: mem_iface
> + description: GPU Memory Interface clock
> + - const: alt_mem_iface
> + description: GPU Alternative Memory Interface clock
> + - const: gmu
> + description: CX GMU clock
> + - const: xo
> + description: GPUCC clocksource clock
> +
> + reg-names:
> + minItems: 1
> + items:
> + - const: kgsl_3d0_reg_memory
> + - const: cx_dbgc
> +
> + required:
> + - clocks
> + - clock-names
> + else:
> + if:
> + properties:
> + compatible:
> + contains:
> + pattern: '^qcom,adreno-6[0-9][0-9]\.[0-9]$'
> +
> + then: # Starting with A6xx, the clocks are usually defined in the GMU node

The comment is not accurate anymore.


Best regards,
Krzysztof


2023-02-24 11:20:07

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v3 02/15] dt-bindings: display/msm/gmu: Add GMU wrapper

On 23/02/2023 13:06, Konrad Dybcio wrote:
> GMU wrapper is essentially a register space within the GPU, which
> Linux sees as a dumbed-down regular GMU: there's no clocks,
> interrupts, multiple regs, iommus and OPP. Document it.
>
> Signed-off-by: Konrad Dybcio <[email protected]>
> ---
> .../devicetree/bindings/display/msm/gmu.yaml | 49 ++++++++++++++++------
> 1 file changed, 37 insertions(+), 12 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/display/msm/gmu.yaml b/Documentation/devicetree/bindings/display/msm/gmu.yaml
> index ab14e81cb050..021373e686e1 100644
> --- a/Documentation/devicetree/bindings/display/msm/gmu.yaml
> +++ b/Documentation/devicetree/bindings/display/msm/gmu.yaml
> @@ -19,16 +19,18 @@ description: |
>
> properties:
> compatible:
> - items:
> - - pattern: '^qcom,adreno-gmu-6[0-9][0-9]\.[0-9]$'
> - - const: qcom,adreno-gmu
> + oneOf:
> + - items:
> + - pattern: '^qcom,adreno-gmu-6[0-9][0-9]\.[0-9]$'
> + - const: qcom,adreno-gmu
> + - const: qcom,adreno-gmu-wrapper

Why wrapper is part of this binding then? Usually wrapper means there is
wrapper node with a GMU child (at least this is what we call for all
wrappers of custom IP blocks like USB DWC). Where is the child?


Best regards,
Krzysztof


2023-02-24 11:50:25

by Konrad Dybcio

[permalink] [raw]
Subject: Re: [PATCH v3 02/15] dt-bindings: display/msm/gmu: Add GMU wrapper



On 24.02.2023 12:19, Krzysztof Kozlowski wrote:
> On 23/02/2023 13:06, Konrad Dybcio wrote:
>> GMU wrapper is essentially a register space within the GPU, which
>> Linux sees as a dumbed-down regular GMU: there's no clocks,
>> interrupts, multiple regs, iommus and OPP. Document it.
>>
>> Signed-off-by: Konrad Dybcio <[email protected]>
>> ---
>> .../devicetree/bindings/display/msm/gmu.yaml | 49 ++++++++++++++++------
>> 1 file changed, 37 insertions(+), 12 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/display/msm/gmu.yaml b/Documentation/devicetree/bindings/display/msm/gmu.yaml
>> index ab14e81cb050..021373e686e1 100644
>> --- a/Documentation/devicetree/bindings/display/msm/gmu.yaml
>> +++ b/Documentation/devicetree/bindings/display/msm/gmu.yaml
>> @@ -19,16 +19,18 @@ description: |
>>
>> properties:
>> compatible:
>> - items:
>> - - pattern: '^qcom,adreno-gmu-6[0-9][0-9]\.[0-9]$'
>> - - const: qcom,adreno-gmu
>> + oneOf:
>> + - items:
>> + - pattern: '^qcom,adreno-gmu-6[0-9][0-9]\.[0-9]$'
>> + - const: qcom,adreno-gmu
>> + - const: qcom,adreno-gmu-wrapper
>
> Why wrapper is part of this binding then? Usually wrapper means there is
> wrapper node with a GMU child (at least this is what we call for all
> wrappers of custom IP blocks like USB DWC). Where is the child?
"GMU wrapper" is a sorta confusing name that Qualcomm chose for
the "fake GMU" which has the GMU_CX and GMU_GX registers responsible
for things like powering up some GPU things internally and some
perf/pwr counters. It is _not_ a wrapper in the sense of a parent-child
relationship. The GMU wrapper has no HFI (Hardware Firmware Interface)
to communicate through crafted messages, but relies on plain register
accesses.

Konrad
>
>
> Best regards,
> Krzysztof
>

2023-02-24 11:51:31

by Konrad Dybcio

[permalink] [raw]
Subject: Re: [PATCH v3 01/15] dt-bindings: display/msm: gpu: Document GMU wrapper-equipped A6xx



On 24.02.2023 12:17, Krzysztof Kozlowski wrote:
> On 23/02/2023 13:06, Konrad Dybcio wrote:
>> GMU wrapper-equipped A6xx GPUs require clocks and clock-names to be
>> specified under the GPU node, just like their older cousins.
>> Account for that.
>>
>> Signed-off-by: Konrad Dybcio <[email protected]>
>> ---
>> .../devicetree/bindings/display/msm/gpu.yaml | 63 ++++++++++++++++++----
>> 1 file changed, 53 insertions(+), 10 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/display/msm/gpu.yaml b/Documentation/devicetree/bindings/display/msm/gpu.yaml
>> index d4191cca71fb..e6d3160601bc 100644
>> --- a/Documentation/devicetree/bindings/display/msm/gpu.yaml
>> +++ b/Documentation/devicetree/bindings/display/msm/gpu.yaml
>> @@ -36,10 +36,7 @@ properties:
>>
>> reg-names:
>> minItems: 1
>> - items:
>> - - const: kgsl_3d0_reg_memory
>> - - const: cx_mem
>> - - const: cx_dbgc
>> + maxItems: 3
>>
>> interrupts:
>> maxItems: 1
>> @@ -147,26 +144,72 @@ allOf:
>> description: GPU Alternative Memory Interface clock
>> - const: gfx3d
>> description: GPU 3D engine clock
>> + - const: gmu
>> + description: CX GMU clock
>> - const: rbbmtimer
>> description: GPU RBBM Timer for Adreno 5xx series
>> - const: rbcpr
>> description: GPU RB Core Power Reduction clock
>> + - const: xo
>> + description: GPUCC clocksource clock
>> minItems: 2
>> - maxItems: 7
>> + maxItems: 9
>
> Your commit says A6xx but this is a3-5xx. I don't understand this change.
Right, it's a leftover unrelated hunk. I'll remove it.

>
>>
>> required:
>> - clocks
>> - clock-names
>> +
>> - if:
>> properties:
>> compatible:
>> contains:
>> - pattern: '^qcom,adreno-6[0-9][0-9]\.[0-9]$'
>> -
>> - then: # Since Adreno 6xx series clocks should be defined in GMU
>> + enum:
>> + - qcom,adreno-610.0
>> + - qcom,adreno-619.1
>> + then:
>> properties:
>> - clocks: false
>> - clock-names: false
>> + clock-names:
>> + items:
>> + - const: core
>> + description: GPU Core clock
>> + - const: iface
>> + description: GPU Interface clock
>> + - const: mem_iface
>> + description: GPU Memory Interface clock
>> + - const: alt_mem_iface
>> + description: GPU Alternative Memory Interface clock
>> + - const: gmu
>> + description: CX GMU clock
>> + - const: xo
>> + description: GPUCC clocksource clock
>> +
>> + reg-names:
>> + minItems: 1
>> + items:
>> + - const: kgsl_3d0_reg_memory
>> + - const: cx_dbgc
>> +
>> + required:
>> + - clocks
>> + - clock-names
>> + else:
>> + if:
>> + properties:
>> + compatible:
>> + contains:
>> + pattern: '^qcom,adreno-6[0-9][0-9]\.[0-9]$'
>> +
>> + then: # Starting with A6xx, the clocks are usually defined in the GMU node
>
> The comment is not accurate anymore.
I'll argue the semantics, they are still "usually" defined
in the GMU node..

Konrad
>
>
> Best regards,
> Krzysztof
>

2023-02-24 12:54:40

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v3 01/15] dt-bindings: display/msm: gpu: Document GMU wrapper-equipped A6xx

On 24/02/2023 12:51, Konrad Dybcio wrote:
>>> + else:
>>> + if:
>>> + properties:
>>> + compatible:
>>> + contains:
>>> + pattern: '^qcom,adreno-6[0-9][0-9]\.[0-9]$'
>>> +
>>> + then: # Starting with A6xx, the clocks are usually defined in the GMU node
>>
>> The comment is not accurate anymore.
> I'll argue the semantics, they are still "usually" defined
> in the GMU node..

Ah, usually. It's fine then.

Best regards,
Krzysztof


2023-02-28 20:24:08

by Akhil P Oommen

[permalink] [raw]
Subject: Re: [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
> Rename lower_bit to hbb_lo and explain what it signifies.
> Add explanations (wherever possible to other tunables).
>
> Sort the variable definition and assignment alphabetically.
Sorting based on decreasing order of line length is more readable, isn't it?
>
> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
> Set default values for all of the tunables to zero, as they should be.
>
> Values were validated against downstream and will be fixed up in
> separate commits so as not to make this one even more messy.
>
> A618 remains untouched (left at hw defaults) in this patch.
>
> Signed-off-by: Konrad Dybcio <[email protected]>
> ---
> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 ++++++++++++++++++++++++++++-------
> 1 file changed, 45 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c5f5d0bb3fdc..bdae341e0a7c 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
> static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
> {
> struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> - u32 lower_bit = 2;
> + /* Unknown, introduced with A640/680 */
> u32 amsbc = 0;
> + /*
> + * The Highest Bank Bit value represents the bit of the highest DDR bank.
> + * We then subtract 13 from it (13 is the minimum value allowed by hw) and
> + * write the lowest two bits of the remaining value as hbb_lo and the
> + * one above it as hbb_hi to the hardware. The default values (when HBB is
> + * not specified) are 0, 0.
> + */
> + u32 hbb_hi = 0;
> + u32 hbb_lo = 0;
> + /* Whether the minimum access length is 64 bits */
> + u32 min_acc_len = 0;
> + /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
> u32 rgb565_predicator = 0;
> + /* Unknown, introduced with A650 family */
> u32 uavflagprd_inv = 0;
> + /* Entirely magic, per-GPU-gen value */
> + u32 ubwc_mode = 0;
>
> /* a618 is using the hw default values */
> if (adreno_is_a618(adreno_gpu))
> return;
>
> - if (adreno_is_a640_family(adreno_gpu))
> + if (adreno_is_a619(adreno_gpu)) {
> + /* HBB = 14 */
> + hbb_lo = 1;
> + }
> +
> + if (adreno_is_a630(adreno_gpu)) {
> + /* HBB = 15 */
> + hbb_lo = 2;
> + }
> +
> + if (adreno_is_a640_family(adreno_gpu)) {
> amsbc = 1;
> + /* HBB = 15 */
> + hbb_lo = 2;
> + }
>
> if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
> - /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> - lower_bit = 3;
> amsbc = 1;
> + /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> + /* HBB = 16 */
> + hbb_lo = 3;
> rgb565_predicator = 1;
> uavflagprd_inv = 2;
> }
>
> if (adreno_is_7c3(adreno_gpu)) {
> - lower_bit = 1;
> amsbc = 1;
> + /* HBB is unset in downstream DTS, defaulting to 0 */
This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, hbb configurations were moved to the driver from DT in recent downstream kernels.

-Akhil.
> rgb565_predicator = 1;
> uavflagprd_inv = 2;
> }
>
> gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
> - rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
> - uavflagprd_inv << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
> + rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
> + min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
> + min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
> + uavflagprd_inv << 4 | min_acc_len << 3 |
> + hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | hbb_lo << 21);
> }
>
> static int a6xx_cp_init(struct msm_gpu *gpu)
>


2023-02-28 20:40:33

by Konrad Dybcio

[permalink] [raw]
Subject: Re: [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config



On 28.02.2023 21:23, Akhil P Oommen wrote:
> On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
>> Rename lower_bit to hbb_lo and explain what it signifies.
>> Add explanations (wherever possible to other tunables).
>>
>> Sort the variable definition and assignment alphabetically.
> Sorting based on decreasing order of line length is more readable, isn't it?
I can do that.

>>
>> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
>> Set default values for all of the tunables to zero, as they should be.
>>
>> Values were validated against downstream and will be fixed up in
>> separate commits so as not to make this one even more messy.
>>
>> A618 remains untouched (left at hw defaults) in this patch.
>>
>> Signed-off-by: Konrad Dybcio <[email protected]>
>> ---
>> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 ++++++++++++++++++++++++++++-------
>> 1 file changed, 45 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> index c5f5d0bb3fdc..bdae341e0a7c 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>> static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>> {
>> struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>> - u32 lower_bit = 2;
>> + /* Unknown, introduced with A640/680 */
>> u32 amsbc = 0;
>> + /*
>> + * The Highest Bank Bit value represents the bit of the highest DDR bank.
>> + * We then subtract 13 from it (13 is the minimum value allowed by hw) and
>> + * write the lowest two bits of the remaining value as hbb_lo and the
>> + * one above it as hbb_hi to the hardware. The default values (when HBB is
>> + * not specified) are 0, 0.
>> + */
>> + u32 hbb_hi = 0;
>> + u32 hbb_lo = 0;
>> + /* Whether the minimum access length is 64 bits */
>> + u32 min_acc_len = 0;
>> + /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>> u32 rgb565_predicator = 0;
>> + /* Unknown, introduced with A650 family */
>> u32 uavflagprd_inv = 0;
>> + /* Entirely magic, per-GPU-gen value */
>> + u32 ubwc_mode = 0;
>>
>> /* a618 is using the hw default values */
>> if (adreno_is_a618(adreno_gpu))
>> return;
>>
>> - if (adreno_is_a640_family(adreno_gpu))
>> + if (adreno_is_a619(adreno_gpu)) {
>> + /* HBB = 14 */
>> + hbb_lo = 1;
>> + }
>> +
>> + if (adreno_is_a630(adreno_gpu)) {
>> + /* HBB = 15 */
>> + hbb_lo = 2;
>> + }
>> +
>> + if (adreno_is_a640_family(adreno_gpu)) {
>> amsbc = 1;
>> + /* HBB = 15 */
>> + hbb_lo = 2;
>> + }
>>
>> if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>> - /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>> - lower_bit = 3;
>> amsbc = 1;
>> + /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>> + /* HBB = 16 */
>> + hbb_lo = 3;
>> rgb565_predicator = 1;
>> uavflagprd_inv = 2;
>> }
>>
>> if (adreno_is_7c3(adreno_gpu)) {
>> - lower_bit = 1;
>> amsbc = 1;
>> + /* HBB is unset in downstream DTS, defaulting to 0 */
> This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, hbb configurations were moved to the driver from DT in recent downstream kernels.
Right, seems to have happened with msm-5.10. Though a random kernel I
grabbed seems to suggest it's 15 and not 14?

https://github.com/sonyxperiadev/kernel/blob/aosp/K.P.1.0.r1/drivers/gpu/msm/adreno-gpulist.h#L1710

Konrad
>
> -Akhil.
>> rgb565_predicator = 1;
>> uavflagprd_inv = 2;
>> }
>>
>> gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
>> - rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
>> - gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
>> - gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
>> - uavflagprd_inv << 4 | lower_bit << 1);
>> - gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
>> + rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
>> + min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
>> +
>> + gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
>> + min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
>> +
>> + gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
>> + uavflagprd_inv << 4 | min_acc_len << 3 |
>> + hbb_lo << 1 | ubwc_mode);
>> +
>> + gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | hbb_lo << 21);
>> }
>>
>> static int a6xx_cp_init(struct msm_gpu *gpu)
>>
>

2023-02-28 20:45:42

by Akhil P Oommen

[permalink] [raw]
Subject: Re: [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

On 3/1/2023 2:10 AM, Konrad Dybcio wrote:
>
> On 28.02.2023 21:23, Akhil P Oommen wrote:
>> On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
>>> Rename lower_bit to hbb_lo and explain what it signifies.
>>> Add explanations (wherever possible to other tunables).
>>>
>>> Sort the variable definition and assignment alphabetically.
>> Sorting based on decreasing order of line length is more readable, isn't it?
> I can do that.
>
>>> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
>>> Set default values for all of the tunables to zero, as they should be.
>>>
>>> Values were validated against downstream and will be fixed up in
>>> separate commits so as not to make this one even more messy.
>>>
>>> A618 remains untouched (left at hw defaults) in this patch.
>>>
>>> Signed-off-by: Konrad Dybcio <[email protected]>
>>> ---
>>> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 ++++++++++++++++++++++++++++-------
>>> 1 file changed, 45 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>> index c5f5d0bb3fdc..bdae341e0a7c 100644
>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>>> static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>>> {
>>> struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>>> - u32 lower_bit = 2;
>>> + /* Unknown, introduced with A640/680 */
>>> u32 amsbc = 0;
>>> + /*
>>> + * The Highest Bank Bit value represents the bit of the highest DDR bank.
>>> + * We then subtract 13 from it (13 is the minimum value allowed by hw) and
>>> + * write the lowest two bits of the remaining value as hbb_lo and the
>>> + * one above it as hbb_hi to the hardware. The default values (when HBB is
>>> + * not specified) are 0, 0.
>>> + */
>>> + u32 hbb_hi = 0;
>>> + u32 hbb_lo = 0;
>>> + /* Whether the minimum access length is 64 bits */
>>> + u32 min_acc_len = 0;
>>> + /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>>> u32 rgb565_predicator = 0;
>>> + /* Unknown, introduced with A650 family */
>>> u32 uavflagprd_inv = 0;
>>> + /* Entirely magic, per-GPU-gen value */
>>> + u32 ubwc_mode = 0;
>>>
>>> /* a618 is using the hw default values */
>>> if (adreno_is_a618(adreno_gpu))
>>> return;
>>>
>>> - if (adreno_is_a640_family(adreno_gpu))
>>> + if (adreno_is_a619(adreno_gpu)) {
>>> + /* HBB = 14 */
>>> + hbb_lo = 1;
>>> + }
>>> +
>>> + if (adreno_is_a630(adreno_gpu)) {
>>> + /* HBB = 15 */
>>> + hbb_lo = 2;
>>> + }
>>> +
>>> + if (adreno_is_a640_family(adreno_gpu)) {
>>> amsbc = 1;
>>> + /* HBB = 15 */
>>> + hbb_lo = 2;
>>> + }
>>>
>>> if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>>> - /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>> - lower_bit = 3;
>>> amsbc = 1;
>>> + /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>> + /* HBB = 16 */
>>> + hbb_lo = 3;
>>> rgb565_predicator = 1;
>>> uavflagprd_inv = 2;
>>> }
>>>
>>> if (adreno_is_7c3(adreno_gpu)) {
>>> - lower_bit = 1;
>>> amsbc = 1;
>>> + /* HBB is unset in downstream DTS, defaulting to 0 */
>> This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, hbb configurations were moved to the driver from DT in recent downstream kernels.
> Right, seems to have happened with msm-5.10. Though a random kernel I
> grabbed seems to suggest it's 15 and not 14?
>
> https://github.com/sonyxperiadev/kernel/blob/aosp/K.P.1.0.r1/drivers/gpu/msm/adreno-gpulist.h#L1710
We override that with 14 in a6xx_init() for LP4 platforms dynamically. Since 7c3 is only supported on LP4, we can hardcode 14 here.
In the downstream kernel, there is an api (of_fdt_get_ddrtype()) to detect ddrtype. If we can get something like that in upstream, we should implement a similar logic here.

-Akhil.
>
> Konrad
>> -Akhil.
>>> rgb565_predicator = 1;
>>> uavflagprd_inv = 2;
>>> }
>>>
>>> gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
>>> - rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
>>> - gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
>>> - gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
>>> - uavflagprd_inv << 4 | lower_bit << 1);
>>> - gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
>>> + rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
>>> + min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
>>> +
>>> + gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
>>> + min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
>>> +
>>> + gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
>>> + uavflagprd_inv << 4 | min_acc_len << 3 |
>>> + hbb_lo << 1 | ubwc_mode);
>>> +
>>> + gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | hbb_lo << 21);
>>> }
>>>
>>> static int a6xx_cp_init(struct msm_gpu *gpu)
>>>


2023-02-28 20:48:59

by Akhil P Oommen

[permalink] [raw]
Subject: Re: [Freedreno] [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

On 3/1/2023 2:14 AM, Akhil P Oommen wrote:
> On 3/1/2023 2:10 AM, Konrad Dybcio wrote:
>> On 28.02.2023 21:23, Akhil P Oommen wrote:
>>> On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
>>>> Rename lower_bit to hbb_lo and explain what it signifies.
>>>> Add explanations (wherever possible to other tunables).
>>>>
>>>> Sort the variable definition and assignment alphabetically.
>>> Sorting based on decreasing order of line length is more readable, isn't it?
>> I can do that.
>>
>>>> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
>>>> Set default values for all of the tunables to zero, as they should be.
>>>>
>>>> Values were validated against downstream and will be fixed up in
>>>> separate commits so as not to make this one even more messy.
>>>>
>>>> A618 remains untouched (left at hw defaults) in this patch.
>>>>
>>>> Signed-off-by: Konrad Dybcio <[email protected]>
>>>> ---
>>>> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 ++++++++++++++++++++++++++++-------
>>>> 1 file changed, 45 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> index c5f5d0bb3fdc..bdae341e0a7c 100644
>>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>>>> static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>>>> {
>>>> struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>>>> - u32 lower_bit = 2;
>>>> + /* Unknown, introduced with A640/680 */
>>>> u32 amsbc = 0;
>>>> + /*
>>>> + * The Highest Bank Bit value represents the bit of the highest DDR bank.
>>>> + * We then subtract 13 from it (13 is the minimum value allowed by hw) and
>>>> + * write the lowest two bits of the remaining value as hbb_lo and the
>>>> + * one above it as hbb_hi to the hardware. The default values (when HBB is
>>>> + * not specified) are 0, 0.
>>>> + */
>>>> + u32 hbb_hi = 0;
>>>> + u32 hbb_lo = 0;
>>>> + /* Whether the minimum access length is 64 bits */
>>>> + u32 min_acc_len = 0;
>>>> + /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>>>> u32 rgb565_predicator = 0;
>>>> + /* Unknown, introduced with A650 family */
>>>> u32 uavflagprd_inv = 0;
>>>> + /* Entirely magic, per-GPU-gen value */
>>>> + u32 ubwc_mode = 0;
>>>>
>>>> /* a618 is using the hw default values */
>>>> if (adreno_is_a618(adreno_gpu))
>>>> return;
>>>>
>>>> - if (adreno_is_a640_family(adreno_gpu))
>>>> + if (adreno_is_a619(adreno_gpu)) {
>>>> + /* HBB = 14 */
>>>> + hbb_lo = 1;
>>>> + }
>>>> +
>>>> + if (adreno_is_a630(adreno_gpu)) {
>>>> + /* HBB = 15 */
>>>> + hbb_lo = 2;
>>>> + }
>>>> +
>>>> + if (adreno_is_a640_family(adreno_gpu)) {
>>>> amsbc = 1;
>>>> + /* HBB = 15 */
>>>> + hbb_lo = 2;
>>>> + }
>>>>
>>>> if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>>>> - /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>>> - lower_bit = 3;
>>>> amsbc = 1;
>>>> + /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>>> + /* HBB = 16 */
>>>> + hbb_lo = 3;
>>>> rgb565_predicator = 1;
>>>> uavflagprd_inv = 2;
>>>> }
>>>>
>>>> if (adreno_is_7c3(adreno_gpu)) {
>>>> - lower_bit = 1;
>>>> amsbc = 1;
>>>> + /* HBB is unset in downstream DTS, defaulting to 0 */
>>> This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, hbb configurations were moved to the driver from DT in recent downstream kernels.
>> Right, seems to have happened with msm-5.10. Though a random kernel I
>> grabbed seems to suggest it's 15 and not 14?
>>
>> https://github.com/sonyxperiadev/kernel/blob/aosp/K.P.1.0.r1/drivers/gpu/msm/adreno-gpulist.h#L1710
> We override that with 14 in a6xx_init() for LP4 platforms dynamically. Since 7c3 is only supported on LP4, we can hardcode 14 here.
> In the downstream kernel, there is an api (of_fdt_get_ddrtype()) to detect ddrtype. If we can get something like that in upstream, we should implement a similar logic here.
>
> -Akhil.
Also, I haven't closely reviewed other targets configuration you updated, but it is a good idea to leave the existing configurations here as it in this refactor patch. Any update should be a separate patch.

-Akhil.
>> Konrad
>>> -Akhil.
>>>> rgb565_predicator = 1;
>>>> uavflagprd_inv = 2;
>>>> }
>>>>
>>>> gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
>>>> - rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
>>>> - gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
>>>> - gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
>>>> - uavflagprd_inv << 4 | lower_bit << 1);
>>>> - gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
>>>> + rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
>>>> + min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
>>>> +
>>>> + gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
>>>> + min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
>>>> +
>>>> + gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
>>>> + uavflagprd_inv << 4 | min_acc_len << 3 |
>>>> + hbb_lo << 1 | ubwc_mode);
>>>> +
>>>> + gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | hbb_lo << 21);
>>>> }
>>>>
>>>> static int a6xx_cp_init(struct msm_gpu *gpu)
>>>>


2023-02-28 21:22:33

by Konrad Dybcio

[permalink] [raw]
Subject: Re: [Freedreno] [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config



On 28.02.2023 21:48, Akhil P Oommen wrote:
> On 3/1/2023 2:14 AM, Akhil P Oommen wrote:
>> On 3/1/2023 2:10 AM, Konrad Dybcio wrote:
>>> On 28.02.2023 21:23, Akhil P Oommen wrote:
>>>> On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
>>>>> Rename lower_bit to hbb_lo and explain what it signifies.
>>>>> Add explanations (wherever possible to other tunables).
>>>>>
>>>>> Sort the variable definition and assignment alphabetically.
>>>> Sorting based on decreasing order of line length is more readable, isn't it?
>>> I can do that.
>>>
>>>>> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
>>>>> Set default values for all of the tunables to zero, as they should be.
>>>>>
>>>>> Values were validated against downstream and will be fixed up in
>>>>> separate commits so as not to make this one even more messy.
>>>>>
>>>>> A618 remains untouched (left at hw defaults) in this patch.
>>>>>
>>>>> Signed-off-by: Konrad Dybcio <[email protected]>
>>>>> ---
>>>>> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 ++++++++++++++++++++++++++++-------
>>>>> 1 file changed, 45 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>>> index c5f5d0bb3fdc..bdae341e0a7c 100644
>>>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>>> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>>>>> static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>>>>> {
>>>>> struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>>>>> - u32 lower_bit = 2;
>>>>> + /* Unknown, introduced with A640/680 */
>>>>> u32 amsbc = 0;
>>>>> + /*
>>>>> + * The Highest Bank Bit value represents the bit of the highest DDR bank.
>>>>> + * We then subtract 13 from it (13 is the minimum value allowed by hw) and
>>>>> + * write the lowest two bits of the remaining value as hbb_lo and the
>>>>> + * one above it as hbb_hi to the hardware. The default values (when HBB is
>>>>> + * not specified) are 0, 0.
>>>>> + */
>>>>> + u32 hbb_hi = 0;
>>>>> + u32 hbb_lo = 0;
>>>>> + /* Whether the minimum access length is 64 bits */
>>>>> + u32 min_acc_len = 0;
>>>>> + /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>>>>> u32 rgb565_predicator = 0;
>>>>> + /* Unknown, introduced with A650 family */
>>>>> u32 uavflagprd_inv = 0;
>>>>> + /* Entirely magic, per-GPU-gen value */
>>>>> + u32 ubwc_mode = 0;
>>>>>
>>>>> /* a618 is using the hw default values */
>>>>> if (adreno_is_a618(adreno_gpu))
>>>>> return;
>>>>>
>>>>> - if (adreno_is_a640_family(adreno_gpu))
>>>>> + if (adreno_is_a619(adreno_gpu)) {
>>>>> + /* HBB = 14 */
>>>>> + hbb_lo = 1;
>>>>> + }
>>>>> +
>>>>> + if (adreno_is_a630(adreno_gpu)) {
>>>>> + /* HBB = 15 */
>>>>> + hbb_lo = 2;
>>>>> + }
>>>>> +
>>>>> + if (adreno_is_a640_family(adreno_gpu)) {
>>>>> amsbc = 1;
>>>>> + /* HBB = 15 */
>>>>> + hbb_lo = 2;
>>>>> + }
>>>>>
>>>>> if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>>>>> - /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>>>> - lower_bit = 3;
>>>>> amsbc = 1;
>>>>> + /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>>>> + /* HBB = 16 */
>>>>> + hbb_lo = 3;
>>>>> rgb565_predicator = 1;
>>>>> uavflagprd_inv = 2;
>>>>> }
>>>>>
>>>>> if (adreno_is_7c3(adreno_gpu)) {
>>>>> - lower_bit = 1;
>>>>> amsbc = 1;
>>>>> + /* HBB is unset in downstream DTS, defaulting to 0 */
>>>> This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, hbb configurations were moved to the driver from DT in recent downstream kernels.
>>> Right, seems to have happened with msm-5.10. Though a random kernel I
>>> grabbed seems to suggest it's 15 and not 14?
>>>
>>> https://github.com/sonyxperiadev/kernel/blob/aosp/K.P.1.0.r1/drivers/gpu/msm/adreno-gpulist.h#L1710
>> We override that with 14 in a6xx_init() for LP4 platforms dynamically. Since 7c3 is only supported on LP4, we can hardcode 14 here.
Okay, I see.

>> In the downstream kernel, there is an api (of_fdt_get_ddrtype()) to detect ddrtype. If we can get something like that in upstream, we should implement a similar logic here.
Yeah, I mentioned it here [1], but I doubt it'd be implemented,
given what Krzysztof pointed out.

>>
>> -Akhil.
> Also, I haven't closely reviewed other targets configuration you updated, but it is a good idea to leave the existing configurations here as it in this refactor patch. Any update should be a separate patch.
Sure, will do.

Konrad

[1] https://github.com/devicetree-org/devicetree-specification/issues/62
>
> -Akhil.
>>> Konrad
>>>> -Akhil.
>>>>> rgb565_predicator = 1;
>>>>> uavflagprd_inv = 2;
>>>>> }
>>>>>
>>>>> gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
>>>>> - rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
>>>>> - gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
>>>>> - gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
>>>>> - uavflagprd_inv << 4 | lower_bit << 1);
>>>>> - gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
>>>>> + rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
>>>>> + min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
>>>>> +
>>>>> + gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
>>>>> + min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
>>>>> +
>>>>> + gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
>>>>> + uavflagprd_inv << 4 | min_acc_len << 3 |
>>>>> + hbb_lo << 1 | ubwc_mode);
>>>>> +
>>>>> + gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | hbb_lo << 21);
>>>>> }
>>>>>
>>>>> static int a6xx_cp_init(struct msm_gpu *gpu)
>>>>>
>