This patchset mainly adds support for the mt8183 IOMMU and SMI.
mt8183 has only one M4U, like mt8173, and it is also an MTK IOMMU gen2
which uses the ARM Short-Descriptor translation table format.
The mt8183 M4U-SMI HW diagram is as below:
                         EMI
                          |
                         M4U
                          |
                      ----------
                      |        |
                  gals0-rx  gals1-rx
                      |        |
                      |        |
                  gals0-tx  gals1-tx
                      |        |
                     ------------
                      SMI Common
                     ------------
                          |
  +-----+-----+--------+-----+-----+-------+-------+
  |     |     |        |     |     |       |       |
  |     |  gals-rx  gals-rx  |  gals-rx gals-rx gals-rx
  |     |     |        |     |     |       |       |
  |     |     |        |     |     |       |       |
  |     |  gals-tx  gals-tx  |  gals-tx gals-tx gals-tx
  |     |     |        |     |     |       |       |
larb0 larb1  IPU0    IPU1  larb4 larb5   larb6    CCU
disp  vdec   img     cam   venc  img     cam
All the connections are fixed in HW; SW can NOT adjust them.
Compared with mt8173, we add a GALS (Global Async Local Sync) module
between SMI-common and M4U, and additional GALS modules between
larb2/3/5/6 and SMI-common. GALS helps synchronize modules that run at
different clock frequencies; it can be seen as an "asynchronous fifo".
GALS only transfers the command/data and has no configuration
registers, thus it has the special "smi" clock but no "apb" clock.
Following the diagram above, we add "gals0" and "gals1" clocks for
smi-common and a "gals" clock for smi-larb.
Also from the diagram above, IPU0/IPU1 (Image Processor Unit) and CCU
(Camera Control Unit) are connected to smi-common directly. We can treat
them as "larb2", "larb3" and "larb7", and their register spaces are
different from the normal larb.
This is the general purpose of each patch in this patchset:
patches 1..13 add the iommu/smi support for mt8183;
patches 14..16 add mmu1 support;
the last patches contain some minor changes:
-patch 17 cleans up some smi code (delete need_larbid).
-patch 18 fixes an issue (fix vld_pa_rng).
-patch 19 improves the code flow (add shutdown).
-patch 20 switches to the SPDX license identifier.
The dtsi is included at [1] since it depends on the power-domain
and ccf nodes.
[1] http://lists.infradead.org/pipermail/linux-mediatek/2018-December/016539.html
Change notes:
v5: 1) Remove the patch "iommu/mediatek: Constify iommu_ops" from here as it
was applied for v4.21.
2) Again, add 3 preparatory patches. Move two properties into the plat_data.
iommu/mediatek: Move vld_pa_rng into plat_data
iommu/mediatek: Move reset_axi into plat_data
iommu/mediatek: Refine protect memory definition
3) Change the string "larb_special_mask" to "larb_direct_to_common_mask".
4) Add shutdown callback for mtk_iommu_v1 in patch[19/20].
v4: http://lists.infradead.org/pipermail/linux-mediatek/2018-December/016205.html
1) Add 3 preparatory patches. Separate some minor meaningful code into
a new patch according to Matthias's suggestion.
memory: mtk-smi: Add gals support
iommu/mediatek: Add larb-id remapped support
iommu/mediatek: Add bclk can be supported optionally
2) rebase on "iommu/mediatek: Make it explicitly non-modular"
which was applied.
https://lore.kernel.org/patchwork/patch/1020125/
3) add some comment about "mediatek,larb-id" in the commit message of
the patch "mtk-smi: Get rid of need_larbid".
4) Fix bus_sel value.
v3: https://lists.linuxfoundation.org/pipermail/iommu/2018-November/031121.html
1) rebase on v4.20-rc1.
2) In the dt-binding, add a minor string "mt7623" which also uses gen1
since Matthias added it in v4.20.
3) About v7s:
a) for paddr_to_pte, change the param from "arm_v7s_io_pgtable" to
"arm_pgtable_cfg", according to Robin's suggestion.
b) Don't use CONFIG_PHYS_ADDR_T_64BIT.
c) add a short comment (the pgtable address still doesn't go over 4GB) in
the commit message of the patch "Extend MediaTek 4GB Mode".
4) add "iommu/mediatek: Constify iommu_ops" into this patchset. This may
be helpful for review and merge.
https://lists.linuxfoundation.org/pipermail/iommu/2018-October/030637.html
v2: https://lists.linuxfoundation.org/pipermail/iommu/2018-September/030164.html
1) Fix typo in the commit message of dt-binding.
2) Change larb2/larb3 to the special larbs.
3) Refactor the larb-id remapped array (larbid_remapped), so that we
don't need to add the new function (mtk_iommu_get_larbid).
4) Add a new patch for v7s two helpers(paddr_to_iopte and
iopte_to_paddr).
5) Change some comment for MTK 4GB mode.
v1: based on v4.19-rc1.
http://lists.infradead.org/pipermail/linux-mediatek/2018-September/014881.html
Yong Wu (20):
dt-bindings: mediatek: Add binding for mt8183 IOMMU and SMI
iommu/mediatek: Use a struct as the platform data
memory: mtk-smi: Use a general config_port interface
memory: mtk-smi: Use a struct for the platform data for smi-common
iommu/io-pgtable-arm-v7s: Add paddr_to_iopte and iopte_to_paddr
helpers
iommu/io-pgtable-arm-v7s: Extend MediaTek 4GB Mode
iommu/mediatek: Add bclk can be supported optionally
iommu/mediatek: Add larb-id remapped support
iommu/mediatek: Refine protect memory definition
iommu/mediatek: Move reset_axi into plat_data
iommu/mediatek: Move vld_pa_rng into plat_data
memory: mtk-smi: Add gals support
iommu/mediatek: Add mt8183 IOMMU support
iommu/mediatek: Add mmu1 support
memory: mtk-smi: Invoke pm runtime_callback to enable clocks
memory: mtk-smi: Add bus_sel for mt8183
memory: mtk-smi: Get rid of need_larbid
iommu/mediatek: Fix VLD_PA_RANGE register backup when suspend
iommu/mediatek: Add shutdown callback
iommu/mediatek: Switch to SPDX license identifier
.../devicetree/bindings/iommu/mediatek,iommu.txt | 15 +-
.../memory-controllers/mediatek,smi-common.txt | 11 +-
.../memory-controllers/mediatek,smi-larb.txt | 3 +
drivers/iommu/io-pgtable-arm-v7s.c | 72 ++++--
drivers/iommu/io-pgtable.h | 7 +-
drivers/iommu/mtk_iommu.c | 145 ++++++-----
drivers/iommu/mtk_iommu.h | 25 +-
drivers/iommu/mtk_iommu_v1.c | 16 +-
drivers/memory/mtk-smi.c | 270 ++++++++++++++-------
include/dt-bindings/memory/mt2701-larb-port.h | 10 +-
include/dt-bindings/memory/mt8173-larb-port.h | 10 +-
include/dt-bindings/memory/mt8183-larb-port.h | 130 ++++++++++
include/soc/mediatek/smi.h | 10 +-
13 files changed, 505 insertions(+), 219 deletions(-)
create mode 100644 include/dt-bindings/memory/mt8183-larb-port.h
--
1.9.1
This patch adds descriptions for the mt8183 IOMMU and SMI.
mt8183 has only one M4U, like mt8173, and it is also an MTK IOMMU gen2
which uses the ARM Short-Descriptor translation table format.
The mt8183 M4U-SMI HW diagram is as below:
                         EMI
                          |
                         M4U
                          |
                      ----------
                      |        |
                  gals0-rx  gals1-rx
                      |        |
                      |        |
                  gals0-tx  gals1-tx
                      |        |
                     ------------
                      SMI Common
                     ------------
                          |
  +-----+-----+--------+-----+-----+-------+-------+
  |     |     |        |     |     |       |       |
  |     |  gals-rx  gals-rx  |  gals-rx gals-rx gals-rx
  |     |     |        |     |     |       |       |
  |     |     |        |     |     |       |       |
  |     |  gals-tx  gals-tx  |  gals-tx gals-tx gals-tx
  |     |     |        |     |     |       |       |
larb0 larb1  IPU0    IPU1  larb4 larb5   larb6    CCU
disp  vdec   img     cam   venc  img     cam
All the connections are fixed in HW; SW can NOT adjust them.
Compared with mt8173, we add a GALS (Global Async Local Sync) module
between SMI-common and M4U, and additional GALS modules between
larb2/3/5/6 and SMI-common. GALS helps synchronize modules that run at
different clock frequencies; it can be seen as an "asynchronous fifo".
GALS only transfers the command/data and has no configuration
registers, thus it has the special "smi" clock but no "apb" clock.
Following the diagram above, we add "gals0" and "gals1" clocks for
smi-common and a "gals" clock for smi-larb.
Also from the diagram above, IPU0/IPU1 (Image Processor Unit) and CCU
(Camera Control Unit) are connected to smi-common directly. We can treat
them as "larb2", "larb3" and "larb7", and their register spaces are
different from the normal larb.
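
For readers of the new mt8183-larb-port.h, the port IDs pack the larb index
and the port index into one number. The following is only a user-space
sketch of that encoding: MTK_M4U_ID is copied from the header, while
m4u_id_make(), m4u_id_to_larb() and m4u_id_to_port() are hypothetical
helper names used here for illustration and are not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Same encoding as in mt8183-larb-port.h: 5 bits of port, larb above. */
#define MTK_M4U_ID(larb, port)	(((larb) << 5) | (port))

/* Hypothetical helper: build an ID, rejecting ports that overflow 5 bits. */
static uint32_t m4u_id_make(uint32_t larb, uint32_t port)
{
	assert(port < 32);
	return MTK_M4U_ID(larb, port);
}

/* Hypothetical helpers: split an ID back into its two fields. */
static uint32_t m4u_id_to_larb(uint32_t id)
{
	return id >> 5;
}

static uint32_t m4u_id_to_port(uint32_t id)
{
	return id & 0x1f;
}
```

With this encoding, M4U_PORT_CCU1 (larb7, port 1) is 225, and the highest
larb6 port in the header (port 30) still fits in the 5-bit port field.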
Signed-off-by: Yong Wu <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
---
.../devicetree/bindings/iommu/mediatek,iommu.txt | 15 ++-
.../memory-controllers/mediatek,smi-common.txt | 11 +-
.../memory-controllers/mediatek,smi-larb.txt | 3 +
include/dt-bindings/memory/mt8183-larb-port.h | 130 +++++++++++++++++++++
4 files changed, 153 insertions(+), 6 deletions(-)
create mode 100644 include/dt-bindings/memory/mt8183-larb-port.h
diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
index 6922db5..6e758996 100644
--- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
+++ b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
@@ -36,6 +36,10 @@ each local arbiter.
like display, video decode, and camera. And there are different ports
in each larb. Take a example, There are many ports like MC, PP, VLD in the
video decode local arbiter, all these ports are according to the video HW.
+ In some SoCs, there may be a GALS(Global Async Local Sync) module between
+smi-common and m4u, and additional GALS module between smi-larb and
+smi-common. GALS can be seen as an "asynchronous fifo" which could help
+synchronize modules running at different clock frequencies.
Required properties:
- compatible : must be one of the following string:
@@ -44,18 +48,23 @@ Required properties:
"mediatek,mt7623-m4u", "mediatek,mt2701-m4u" for mt7623 which uses
generation one m4u HW.
"mediatek,mt8173-m4u" for mt8173 which uses generation two m4u HW.
+ "mediatek,mt8183-m4u" for mt8183 which uses generation two m4u HW.
- reg : m4u register base and size.
- interrupts : the interrupt of m4u.
- clocks : must contain one entry for each clock-names.
-- clock-names : must be "bclk", It is the block clock of m4u.
+- clock-names : Only 1 optional clock:
+ - "bclk": the block clock of m4u.
+ Note that m4u uses the EMI clock, which has always been enabled before the
+ kernel starts, if this "bclk" is absent.
- mediatek,larbs : List of phandle to the local arbiters in the current Socs.
Refer to bindings/memory-controllers/mediatek,smi-larb.txt. It must sort
according to the local arbiter index, like larb0, larb1, larb2...
- iommu-cells : must be 1. This is the mtk_m4u_id according to the HW.
Specifies the mtk_m4u_id as defined in
dt-binding/memory/mt2701-larb-port.h for mt2701, mt7623
- dt-binding/memory/mt2712-larb-port.h for mt2712, and
- dt-binding/memory/mt8173-larb-port.h for mt8173.
+ dt-binding/memory/mt2712-larb-port.h for mt2712,
+ dt-binding/memory/mt8173-larb-port.h for mt8173, and
+ dt-binding/memory/mt8183-larb-port.h for mt8183.
Example:
iommu: iommu@10205000 {
diff --git a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
index e937ddd..8d3240a 100644
--- a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
+++ b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
@@ -2,9 +2,10 @@ SMI (Smart Multimedia Interface) Common
The hardware block diagram please check bindings/iommu/mediatek,iommu.txt
-Mediatek SMI have two generations of HW architecture, mt2712 and mt8173 use
-the second generation of SMI HW while mt2701 uses the first generation HW of
-SMI.
+Mediatek SMI have two generations of HW architecture, here is the list
+which generation the Socs use:
+generation 1: mt2701 and mt7623.
+generation 2: mt2712, mt8173 and mt8183.
There's slight differences between the two SMI, for generation 2, the
register which control the iommu port is at each larb's register base. But
@@ -19,6 +20,7 @@ Required properties:
"mediatek,mt2712-smi-common"
"mediatek,mt7623-smi-common", "mediatek,mt2701-smi-common"
"mediatek,mt8173-smi-common"
+ "mediatek,mt8183-smi-common"
- reg : the register and size of the SMI block.
- power-domains : a phandle to the power domain of this local arbiter.
- clocks : Must contain an entry for each entry in clock-names.
@@ -30,6 +32,9 @@ Required properties:
They may be the same if both source clocks are the same.
- "async" : asynchronous clock, it help transform the smi clock into the emi
clock domain, this clock is only needed by generation 1 smi HW.
+ and these 2 optional clocks for generation 2 smi HW:
+ - "gals0": the path0 clock of GALS(Global Async Local Sync).
+ - "gals1": the path1 clock of GALS(Global Async Local Sync).
Example:
smi_common: smi@14022000 {
diff --git a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.txt b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.txt
index 94eddca..69266c9 100644
--- a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.txt
+++ b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.txt
@@ -8,6 +8,7 @@ Required properties:
"mediatek,mt2712-smi-larb"
"mediatek,mt7623-smi-larb", "mediatek,mt2701-smi-larb"
"mediatek,mt8173-smi-larb"
+ "mediatek,mt8183-smi-larb"
- reg : the register and size of this local arbiter.
- mediatek,smi : a phandle to the smi_common node.
- power-domains : a phandle to the power domain of this local arbiter.
@@ -16,6 +17,8 @@ Required properties:
- "apb" : Advanced Peripheral Bus clock, It's the clock for setting
the register.
- "smi" : It's the clock for transfer data and command.
+ and this optional clock name:
+ - "gals": the clock for gals(Global Async Local Sync).
Required property for mt2701, mt2712 and mt7623:
- mediatek,larb-id :the hardware id of this larb.
diff --git a/include/dt-bindings/memory/mt8183-larb-port.h b/include/dt-bindings/memory/mt8183-larb-port.h
new file mode 100644
index 0000000..2c579f3
--- /dev/null
+++ b/include/dt-bindings/memory/mt8183-larb-port.h
@@ -0,0 +1,130 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2018 MediaTek Inc.
+ * Author: Yong Wu <[email protected]>
+ */
+#ifndef __DTS_IOMMU_PORT_MT8183_H
+#define __DTS_IOMMU_PORT_MT8183_H
+
+#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port))
+
+#define M4U_LARB0_ID 0
+#define M4U_LARB1_ID 1
+#define M4U_LARB2_ID 2
+#define M4U_LARB3_ID 3
+#define M4U_LARB4_ID 4
+#define M4U_LARB5_ID 5
+#define M4U_LARB6_ID 6
+#define M4U_LARB7_ID 7
+
+/* larb0 */
+#define M4U_PORT_DISP_OVL0 MTK_M4U_ID(M4U_LARB0_ID, 0)
+#define M4U_PORT_DISP_2L_OVL0_LARB0 MTK_M4U_ID(M4U_LARB0_ID, 1)
+#define M4U_PORT_DISP_2L_OVL1_LARB0 MTK_M4U_ID(M4U_LARB0_ID, 2)
+#define M4U_PORT_DISP_RDMA0 MTK_M4U_ID(M4U_LARB0_ID, 3)
+#define M4U_PORT_DISP_RDMA1 MTK_M4U_ID(M4U_LARB0_ID, 4)
+#define M4U_PORT_DISP_WDMA0 MTK_M4U_ID(M4U_LARB0_ID, 5)
+#define M4U_PORT_MDP_RDMA0 MTK_M4U_ID(M4U_LARB0_ID, 6)
+#define M4U_PORT_MDP_WROT0 MTK_M4U_ID(M4U_LARB0_ID, 7)
+#define M4U_PORT_MDP_WDMA0 MTK_M4U_ID(M4U_LARB0_ID, 8)
+#define M4U_PORT_DISP_FAKE0 MTK_M4U_ID(M4U_LARB0_ID, 9)
+
+/* larb1 */
+#define M4U_PORT_HW_VDEC_MC_EXT MTK_M4U_ID(M4U_LARB1_ID, 0)
+#define M4U_PORT_HW_VDEC_PP_EXT MTK_M4U_ID(M4U_LARB1_ID, 1)
+#define M4U_PORT_HW_VDEC_VLD_EXT MTK_M4U_ID(M4U_LARB1_ID, 2)
+#define M4U_PORT_HW_VDEC_AVC_MV_EXT MTK_M4U_ID(M4U_LARB1_ID, 3)
+#define M4U_PORT_HW_VDEC_PRED_RD_EXT MTK_M4U_ID(M4U_LARB1_ID, 4)
+#define M4U_PORT_HW_VDEC_PRED_WR_EXT MTK_M4U_ID(M4U_LARB1_ID, 5)
+#define M4U_PORT_HW_VDEC_PPWRAP_EXT MTK_M4U_ID(M4U_LARB1_ID, 6)
+
+/* larb2 VPU0 */
+#define M4U_PORT_IMG_IPUO MTK_M4U_ID(M4U_LARB2_ID, 0)
+#define M4U_PORT_IMG_IPU3O MTK_M4U_ID(M4U_LARB2_ID, 1)
+#define M4U_PORT_IMG_IPUI MTK_M4U_ID(M4U_LARB2_ID, 2)
+
+/* larb3 VPU1 */
+#define M4U_PORT_CAM_IPUO MTK_M4U_ID(M4U_LARB3_ID, 0)
+#define M4U_PORT_CAM_IPU2O MTK_M4U_ID(M4U_LARB3_ID, 1)
+#define M4U_PORT_CAM_IPU3O MTK_M4U_ID(M4U_LARB3_ID, 2)
+#define M4U_PORT_CAM_IPUI MTK_M4U_ID(M4U_LARB3_ID, 3)
+#define M4U_PORT_CAM_IPU2I MTK_M4U_ID(M4U_LARB3_ID, 4)
+
+/* larb4 */
+#define M4U_PORT_VENC_RCPU MTK_M4U_ID(M4U_LARB4_ID, 0)
+#define M4U_PORT_VENC_REC MTK_M4U_ID(M4U_LARB4_ID, 1)
+#define M4U_PORT_VENC_BSDMA MTK_M4U_ID(M4U_LARB4_ID, 2)
+#define M4U_PORT_VENC_SV_COMV MTK_M4U_ID(M4U_LARB4_ID, 3)
+#define M4U_PORT_VENC_RD_COMV MTK_M4U_ID(M4U_LARB4_ID, 4)
+#define M4U_PORT_JPGENC_RDMA MTK_M4U_ID(M4U_LARB4_ID, 5)
+#define M4U_PORT_JPGENC_BSDMA MTK_M4U_ID(M4U_LARB4_ID, 6)
+#define M4U_PORT_VENC_CUR_LUMA MTK_M4U_ID(M4U_LARB4_ID, 7)
+#define M4U_PORT_VENC_CUR_CHROMA MTK_M4U_ID(M4U_LARB4_ID, 8)
+#define M4U_PORT_VENC_REF_LUMA MTK_M4U_ID(M4U_LARB4_ID, 9)
+#define M4U_PORT_VENC_REF_CHROMA MTK_M4U_ID(M4U_LARB4_ID, 10)
+
+/* larb5 */
+#define M4U_PORT_CAM_IMGI MTK_M4U_ID(M4U_LARB5_ID, 0)
+#define M4U_PORT_CAM_IMG2O MTK_M4U_ID(M4U_LARB5_ID, 1)
+#define M4U_PORT_CAM_IMG3O MTK_M4U_ID(M4U_LARB5_ID, 2)
+#define M4U_PORT_CAM_VIPI MTK_M4U_ID(M4U_LARB5_ID, 3)
+#define M4U_PORT_CAM_LCEI MTK_M4U_ID(M4U_LARB5_ID, 4)
+#define M4U_PORT_CAM_SMXI MTK_M4U_ID(M4U_LARB5_ID, 5)
+#define M4U_PORT_CAM_SMXO MTK_M4U_ID(M4U_LARB5_ID, 6)
+#define M4U_PORT_CAM_WPE0_RDMA1 MTK_M4U_ID(M4U_LARB5_ID, 7)
+#define M4U_PORT_CAM_WPE0_RDMA0 MTK_M4U_ID(M4U_LARB5_ID, 8)
+#define M4U_PORT_CAM_WPE0_WDMA MTK_M4U_ID(M4U_LARB5_ID, 9)
+#define M4U_PORT_CAM_FDVT_RP MTK_M4U_ID(M4U_LARB5_ID, 10)
+#define M4U_PORT_CAM_FDVT_WR MTK_M4U_ID(M4U_LARB5_ID, 11)
+#define M4U_PORT_CAM_FDVT_RB MTK_M4U_ID(M4U_LARB5_ID, 12)
+#define M4U_PORT_CAM_WPE1_RDMA0 MTK_M4U_ID(M4U_LARB5_ID, 13)
+#define M4U_PORT_CAM_WPE1_RDMA1 MTK_M4U_ID(M4U_LARB5_ID, 14)
+#define M4U_PORT_CAM_WPE1_WDMA MTK_M4U_ID(M4U_LARB5_ID, 15)
+#define M4U_PORT_CAM_DPE_RDMA MTK_M4U_ID(M4U_LARB5_ID, 16)
+#define M4U_PORT_CAM_DPE_WDMA MTK_M4U_ID(M4U_LARB5_ID, 17)
+#define M4U_PORT_CAM_MFB_RDMA0 MTK_M4U_ID(M4U_LARB5_ID, 18)
+#define M4U_PORT_CAM_MFB_RDMA1 MTK_M4U_ID(M4U_LARB5_ID, 19)
+#define M4U_PORT_CAM_MFB_WDMA MTK_M4U_ID(M4U_LARB5_ID, 20)
+#define M4U_PORT_CAM_RSC_RDMA0 MTK_M4U_ID(M4U_LARB5_ID, 21)
+#define M4U_PORT_CAM_RSC_WDMA MTK_M4U_ID(M4U_LARB5_ID, 22)
+#define M4U_PORT_CAM_OWE_RDMA MTK_M4U_ID(M4U_LARB5_ID, 23)
+#define M4U_PORT_CAM_OWE_WDMA MTK_M4U_ID(M4U_LARB5_ID, 24)
+
+/* larb6 */
+#define M4U_PORT_CAM_IMGO MTK_M4U_ID(M4U_LARB6_ID, 0)
+#define M4U_PORT_CAM_RRZO MTK_M4U_ID(M4U_LARB6_ID, 1)
+#define M4U_PORT_CAM_AAO MTK_M4U_ID(M4U_LARB6_ID, 2)
+#define M4U_PORT_CAM_AFO MTK_M4U_ID(M4U_LARB6_ID, 3)
+#define M4U_PORT_CAM_LSCI0 MTK_M4U_ID(M4U_LARB6_ID, 4)
+#define M4U_PORT_CAM_LSCI1 MTK_M4U_ID(M4U_LARB6_ID, 5)
+#define M4U_PORT_CAM_PDO MTK_M4U_ID(M4U_LARB6_ID, 6)
+#define M4U_PORT_CAM_BPCI MTK_M4U_ID(M4U_LARB6_ID, 7)
+#define M4U_PORT_CAM_LCSO MTK_M4U_ID(M4U_LARB6_ID, 8)
+#define M4U_PORT_CAM_CAM_RSSO_A MTK_M4U_ID(M4U_LARB6_ID, 9)
+#define M4U_PORT_CAM_UFEO MTK_M4U_ID(M4U_LARB6_ID, 10)
+#define M4U_PORT_CAM_SOCO MTK_M4U_ID(M4U_LARB6_ID, 11)
+#define M4U_PORT_CAM_SOC1 MTK_M4U_ID(M4U_LARB6_ID, 12)
+#define M4U_PORT_CAM_SOC2 MTK_M4U_ID(M4U_LARB6_ID, 13)
+#define M4U_PORT_CAM_CCUI MTK_M4U_ID(M4U_LARB6_ID, 14)
+#define M4U_PORT_CAM_CCUO MTK_M4U_ID(M4U_LARB6_ID, 15)
+#define M4U_PORT_CAM_RAWI_A MTK_M4U_ID(M4U_LARB6_ID, 16)
+#define M4U_PORT_CAM_CCUG MTK_M4U_ID(M4U_LARB6_ID, 17)
+#define M4U_PORT_CAM_PSO MTK_M4U_ID(M4U_LARB6_ID, 18)
+#define M4U_PORT_CAM_AFO_1 MTK_M4U_ID(M4U_LARB6_ID, 19)
+#define M4U_PORT_CAM_LSCI_2 MTK_M4U_ID(M4U_LARB6_ID, 20)
+#define M4U_PORT_CAM_PDI MTK_M4U_ID(M4U_LARB6_ID, 21)
+#define M4U_PORT_CAM_FLKO MTK_M4U_ID(M4U_LARB6_ID, 22)
+#define M4U_PORT_CAM_LMVO MTK_M4U_ID(M4U_LARB6_ID, 23)
+#define M4U_PORT_CAM_UFGO MTK_M4U_ID(M4U_LARB6_ID, 24)
+#define M4U_PORT_CAM_SPARE MTK_M4U_ID(M4U_LARB6_ID, 25)
+#define M4U_PORT_CAM_SPARE_2 MTK_M4U_ID(M4U_LARB6_ID, 26)
+#define M4U_PORT_CAM_SPARE_3 MTK_M4U_ID(M4U_LARB6_ID, 27)
+#define M4U_PORT_CAM_SPARE_4 MTK_M4U_ID(M4U_LARB6_ID, 28)
+#define M4U_PORT_CAM_SPARE_5 MTK_M4U_ID(M4U_LARB6_ID, 29)
+#define M4U_PORT_CAM_SPARE_6 MTK_M4U_ID(M4U_LARB6_ID, 30)
+
+/* CCU */
+#define M4U_PORT_CCU0 MTK_M4U_ID(M4U_LARB7_ID, 0)
+#define M4U_PORT_CCU1 MTK_M4U_ID(M4U_LARB7_ID, 1)
+
+#endif
--
1.9.1
The config_port of mt2712 and mt8183 are the same. Use a general
config_port interface instead.
In addition, in mt2712, larb8 and larb9 are the bdpsys larbs, which
are not normal larbs; their register spaces are different from the
normal one. Thus, we can not call the general config_port for them. In
mt8183, IPU0/1 and CCU connect to smi-common directly, so they are not
normal larbs either. Hence, we add a "larb_direct_to_common_mask" for
these larbs which connect to smi-common directly.
This is also a preparatory patch for adding mt8183 SMI support.
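
The mask test can be sketched in user space as below. This is an
illustration only: larb_skips_config_port() is a hypothetical name, the
mt2712 mask value is the one set in this patch, and the mt8183 value is an
assumption derived from the cover letter (IPU0/IPU1/CCU treated as
larb2/larb3/larb7), not something this patch defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)	(1U << (n))

/* From this patch: the bdpsys larbs of mt2712. */
static const uint32_t mt2712_direct_mask = BIT(8) | BIT(9);

/* Assumption for illustration: IPU0, IPU1 and CCU on mt8183. */
static const uint32_t mt8183_direct_mask_guess = BIT(2) | BIT(3) | BIT(7);

/*
 * Hypothetical helper mirroring the check in the general config_port:
 * returns true when the larb is in the mask and must be left alone.
 */
static bool larb_skips_config_port(unsigned int larbid, uint32_t mask)
{
	assert(larbid < 32);
	return (BIT(larbid) & mask) != 0;
}
```

Keeping the special larbs in a per-SoC mask means the config_port body
needs no SoC-specific larb numbers.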
Signed-off-by: Yong Wu <[email protected]>
Reviewed-by: Matthias Brugger <[email protected]>
---
drivers/memory/mtk-smi.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 8f2d152..9fd6b3d 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -53,6 +53,7 @@ struct mtk_smi_larb_gen {
bool need_larbid;
int port_in_larb[MTK_LARB_NR_MAX + 1];
void (*config_port)(struct device *);
+ unsigned int larb_direct_to_common_mask;
};
struct mtk_smi {
@@ -176,17 +177,13 @@ void mtk_smi_larb_put(struct device *larbdev)
return -ENODEV;
}
-static void mtk_smi_larb_config_port_mt2712(struct device *dev)
+static void mtk_smi_larb_config_port_gen2_general(struct device *dev)
{
struct mtk_smi_larb *larb = dev_get_drvdata(dev);
u32 reg;
int i;
- /*
- * larb 8/9 is the bdpsys larb, the iommu_en is enabled defaultly.
- * Don't need to set it again.
- */
- if (larb->larbid == 8 || larb->larbid == 9)
+ if (BIT(larb->larbid) & larb->larb_gen->larb_direct_to_common_mask)
return;
for_each_set_bit(i, (unsigned long *)larb->mmu, 32) {
@@ -261,7 +258,8 @@ static void mtk_smi_larb_config_port_gen1(struct device *dev)
static const struct mtk_smi_larb_gen mtk_smi_larb_mt2712 = {
.need_larbid = true,
- .config_port = mtk_smi_larb_config_port_mt2712,
+ .config_port = mtk_smi_larb_config_port_gen2_general,
+ .larb_direct_to_common_mask = BIT(8) | BIT(9), /* bdpsys */
};
static const struct of_device_id mtk_smi_larb_of_ids[] = {
--
1.9.1
Add two helper functions: paddr_to_iopte and iopte_to_paddr.
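
As a rough illustration of the masks the two helpers apply, here is a
simplified user-space sketch. It assumes 32-bit addresses, a level-1 shift
of 20, a level-2 shift of 12, a table mask starting at bit 10 and 16
contiguous pages, which is my reading of the short-descriptor layout. The
*_sketch names and the is_table/is_cont flags are hypothetical; the real
helpers derive those from the PTE bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CONT_PAGES	16u	/* assumed: 16 entries per contiguous group */

/* Address mask of a block/page at the given level (1 or 2). */
static uint32_t lvl_mask(int lvl)
{
	assert(lvl == 1 || lvl == 2);
	return ~0u << (32 - (4 + 8 * lvl));
}

/* Keep only the address bits that the PTE of this level can hold. */
static uint32_t paddr_to_iopte_sketch(uint32_t paddr, int lvl)
{
	return paddr & lvl_mask(lvl);
}

/* Recover the address bits, choosing the mask by entry type. */
static uint32_t iopte_to_paddr_sketch(uint32_t pte, int lvl,
				      bool is_table, bool is_cont)
{
	uint32_t mask;

	if (is_table)
		mask = ~0u << 10;	/* next-level table pointer */
	else if (is_cont)
		mask = lvl_mask(lvl) * CONT_PAGES;	/* drops 4 more low bits */
	else
		mask = lvl_mask(lvl);

	return pte & mask;
}
```

Having a single place for both conversions is what lets a later patch
change how the address bits are stored without touching every caller.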
Signed-off-by: Yong Wu <[email protected]>
Reviewed-by: Robin Murphy <[email protected]>
---
drivers/iommu/io-pgtable-arm-v7s.c | 45 ++++++++++++++++++++++++++++----------
1 file changed, 33 insertions(+), 12 deletions(-)
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index cec29bf..11d8505 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -173,18 +173,38 @@ struct arm_v7s_io_pgtable {
spinlock_t split_lock;
};
+static bool arm_v7s_pte_is_cont(arm_v7s_iopte pte, int lvl);
+
static dma_addr_t __arm_v7s_dma_addr(void *pages)
{
return (dma_addr_t)virt_to_phys(pages);
}
-static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl)
+static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
+ struct io_pgtable_cfg *cfg)
{
+ return paddr & ARM_V7S_LVL_MASK(lvl);
+}
+
+static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
+ struct io_pgtable_cfg *cfg)
+{
+ arm_v7s_iopte mask;
+
if (ARM_V7S_PTE_IS_TABLE(pte, lvl))
- pte &= ARM_V7S_TABLE_MASK;
+ mask = ARM_V7S_TABLE_MASK;
+ else if (arm_v7s_pte_is_cont(pte, lvl))
+ mask = ARM_V7S_LVL_MASK(lvl) * ARM_V7S_CONT_PAGES;
else
- pte &= ARM_V7S_LVL_MASK(lvl);
- return phys_to_virt(pte);
+ mask = ARM_V7S_LVL_MASK(lvl);
+
+ return pte & mask;
+}
+
+static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl,
+ struct arm_v7s_io_pgtable *data)
+{
+ return phys_to_virt(iopte_to_paddr(pte, lvl, &data->iop.cfg));
}
static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
@@ -396,7 +416,7 @@ static int arm_v7s_init_pte(struct arm_v7s_io_pgtable *data,
if (num_entries > 1)
pte = arm_v7s_pte_to_cont(pte, lvl);
- pte |= paddr & ARM_V7S_LVL_MASK(lvl);
+ pte |= paddr_to_iopte(paddr, lvl, cfg);
__arm_v7s_set_pte(ptep, pte, num_entries, cfg);
return 0;
@@ -462,7 +482,7 @@ static int __arm_v7s_map(struct arm_v7s_io_pgtable *data, unsigned long iova,
}
if (ARM_V7S_PTE_IS_TABLE(pte, lvl)) {
- cptep = iopte_deref(pte, lvl);
+ cptep = iopte_deref(pte, lvl, data);
} else if (pte) {
/* We require an unmap first */
WARN_ON(!selftest_running);
@@ -512,7 +532,8 @@ static void arm_v7s_free_pgtable(struct io_pgtable *iop)
arm_v7s_iopte pte = data->pgd[i];
if (ARM_V7S_PTE_IS_TABLE(pte, 1))
- __arm_v7s_free_table(iopte_deref(pte, 1), 2, data);
+ __arm_v7s_free_table(iopte_deref(pte, 1, data),
+ 2, data);
}
__arm_v7s_free_table(data->pgd, 1, data);
kmem_cache_destroy(data->l2_tables);
@@ -582,7 +603,7 @@ static size_t arm_v7s_split_blk_unmap(struct arm_v7s_io_pgtable *data,
if (!ARM_V7S_PTE_IS_TABLE(pte, 1))
return 0;
- tablep = iopte_deref(pte, 1);
+ tablep = iopte_deref(pte, 1, data);
return __arm_v7s_unmap(data, iova, size, 2, tablep);
}
@@ -641,7 +662,7 @@ static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable *data,
io_pgtable_tlb_add_flush(iop, iova, blk_size,
ARM_V7S_BLOCK_SIZE(lvl + 1), false);
io_pgtable_tlb_sync(iop);
- ptep = iopte_deref(pte[i], lvl);
+ ptep = iopte_deref(pte[i], lvl, data);
__arm_v7s_free_table(ptep, lvl + 1, data);
} else if (iop->cfg.quirks & IO_PGTABLE_QUIRK_NON_STRICT) {
/*
@@ -666,7 +687,7 @@ static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable *data,
}
/* Keep on walkin' */
- ptep = iopte_deref(pte[0], lvl);
+ ptep = iopte_deref(pte[0], lvl, data);
return __arm_v7s_unmap(data, iova, size, lvl + 1, ptep);
}
@@ -692,7 +713,7 @@ static phys_addr_t arm_v7s_iova_to_phys(struct io_pgtable_ops *ops,
do {
ptep += ARM_V7S_LVL_IDX(iova, ++lvl);
pte = READ_ONCE(*ptep);
- ptep = iopte_deref(pte, lvl);
+ ptep = iopte_deref(pte, lvl, data);
} while (ARM_V7S_PTE_IS_TABLE(pte, lvl));
if (!ARM_V7S_PTE_IS_VALID(pte))
@@ -701,7 +722,7 @@ static phys_addr_t arm_v7s_iova_to_phys(struct io_pgtable_ops *ops,
mask = ARM_V7S_LVL_MASK(lvl);
if (arm_v7s_pte_is_cont(pte, lvl))
mask *= ARM_V7S_CONT_PAGES;
- return (pte & mask) | (iova & ~mask);
+ return iopte_to_paddr(pte, lvl, &data->iop.cfg) | (iova & ~mask);
}
static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
--
1.9.1
Use a struct as the platform-specific data instead of the enumeration.
There is also a minor change that moves the "enum mtk_smi_gen"
definition, because "struct mtk_smi_common_plat", which uses the enum,
has to be defined before it is referred to.
This is a preparatory patch for mt8183.
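
The idea of pointing the match data at a struct can be sketched in user
space as below. The enum, the struct and the two gen1/gen2 instances follow
this patch; struct match_entry and lookup_plat() are hypothetical stand-ins
for of_device_id and of_device_get_match_data(), used only to keep the
sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum mtk_smi_gen {
	MTK_SMI_GEN1,
	MTK_SMI_GEN2
};

struct mtk_smi_common_plat {
	enum mtk_smi_gen gen;
	/* later patches can add more per-SoC fields here */
};

static const struct mtk_smi_common_plat mtk_smi_common_gen1 = {
	.gen = MTK_SMI_GEN1,
};

static const struct mtk_smi_common_plat mtk_smi_common_gen2 = {
	.gen = MTK_SMI_GEN2,
};

/* Hypothetical stand-in for struct of_device_id. */
struct match_entry {
	const char *compatible;
	const void *data;
};

static const struct match_entry smi_common_ids[] = {
	{ "mediatek,mt8173-smi-common", &mtk_smi_common_gen2 },
	{ "mediatek,mt2701-smi-common", &mtk_smi_common_gen1 },
	{ "mediatek,mt2712-smi-common", &mtk_smi_common_gen2 },
	{ NULL, NULL }
};

/* Hypothetical stand-in for of_device_get_match_data(). */
static const struct mtk_smi_common_plat *lookup_plat(const char *compatible)
{
	const struct match_entry *e;

	assert(compatible);
	for (e = smi_common_ids; e->compatible; e++)
		if (strcmp(e->compatible, compatible) == 0)
			return e->data;
	return NULL;
}
```

Compared with casting an enum value into the data pointer, a struct
pointer can grow new fields without changing the match table entries.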
Signed-off-by: Yong Wu <[email protected]>
Reviewed-by: Matthias Brugger <[email protected]>
---
drivers/memory/mtk-smi.c | 35 ++++++++++++++++++++++++-----------
1 file changed, 24 insertions(+), 11 deletions(-)
diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 9fd6b3d..8a2f968 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -49,6 +49,15 @@
#define SMI_LARB_NONSEC_CON(id) (0x380 + ((id) * 4))
#define F_MMU_EN BIT(0)
+enum mtk_smi_gen {
+ MTK_SMI_GEN1,
+ MTK_SMI_GEN2
+};
+
+struct mtk_smi_common_plat {
+ enum mtk_smi_gen gen;
+};
+
struct mtk_smi_larb_gen {
bool need_larbid;
int port_in_larb[MTK_LARB_NR_MAX + 1];
@@ -61,6 +70,8 @@ struct mtk_smi {
struct clk *clk_apb, *clk_smi;
struct clk *clk_async; /*only needed by mt2701*/
void __iomem *smi_ao_base;
+
+ const struct mtk_smi_common_plat *plat;
};
struct mtk_smi_larb { /* larb: local arbiter */
@@ -72,11 +83,6 @@ struct mtk_smi_larb { /* larb: local arbiter */
u32 *mmu;
};
-enum mtk_smi_gen {
- MTK_SMI_GEN1,
- MTK_SMI_GEN2
-};
-
static int mtk_smi_enable(const struct mtk_smi *smi)
{
int ret;
@@ -351,18 +357,26 @@ static int mtk_smi_larb_remove(struct platform_device *pdev)
}
};
+static const struct mtk_smi_common_plat mtk_smi_common_gen1 = {
+ .gen = MTK_SMI_GEN1,
+};
+
+static const struct mtk_smi_common_plat mtk_smi_common_gen2 = {
+ .gen = MTK_SMI_GEN2,
+};
+
static const struct of_device_id mtk_smi_common_of_ids[] = {
{
.compatible = "mediatek,mt8173-smi-common",
- .data = (void *)MTK_SMI_GEN2
+ .data = &mtk_smi_common_gen2,
},
{
.compatible = "mediatek,mt2701-smi-common",
- .data = (void *)MTK_SMI_GEN1
+ .data = &mtk_smi_common_gen1,
},
{
.compatible = "mediatek,mt2712-smi-common",
- .data = (void *)MTK_SMI_GEN2
+ .data = &mtk_smi_common_gen2,
},
{}
};
@@ -372,13 +386,13 @@ static int mtk_smi_common_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
struct mtk_smi *common;
struct resource *res;
- enum mtk_smi_gen smi_gen;
int ret;
common = devm_kzalloc(dev, sizeof(*common), GFP_KERNEL);
if (!common)
return -ENOMEM;
common->dev = dev;
+ common->plat = of_device_get_match_data(dev);
common->clk_apb = devm_clk_get(dev, "apb");
if (IS_ERR(common->clk_apb))
@@ -394,8 +408,7 @@ static int mtk_smi_common_probe(struct platform_device *pdev)
* clock into emi clock domain, but for mtk smi gen2, there's no smi ao
* base.
*/
- smi_gen = (enum mtk_smi_gen)of_device_get_match_data(dev);
- if (smi_gen == MTK_SMI_GEN1) {
+ if (common->plat->gen == MTK_SMI_GEN1) {
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
common->smi_ao_base = devm_ioremap_resource(dev, res);
if (IS_ERR(common->smi_ao_base))
--
1.9.1
Use a struct as the platform-specific data instead of the enumeration.
This is a preparatory patch for adding mt8183 iommu support.
Signed-off-by: Yong Wu <[email protected]>
Reviewed-by: Matthias Brugger <[email protected]>
---
drivers/iommu/mtk_iommu.c | 24 ++++++++++++++++--------
drivers/iommu/mtk_iommu.h | 6 +++++-
2 files changed, 21 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index de3e022..189d1b5 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -54,7 +54,7 @@
#define REG_MMU_CTRL_REG 0x110
#define F_MMU_PREFETCH_RT_REPLACE_MOD BIT(4)
#define F_MMU_TF_PROTECT_SEL_SHIFT(data) \
- ((data)->m4u_plat == M4U_MT2712 ? 4 : 5)
+ ((data)->plat_data->m4u_plat == M4U_MT2712 ? 4 : 5)
/* It's named by F_MMU_TF_PROT_SEL in mt2712. */
#define F_MMU_TF_PROTECT_SEL(prot, data) \
(((prot) & 0x3) << F_MMU_TF_PROTECT_SEL_SHIFT(data))
@@ -520,7 +520,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
}
regval = F_MMU_TF_PROTECT_SEL(2, data);
- if (data->m4u_plat == M4U_MT8173)
+ if (data->plat_data->m4u_plat == M4U_MT8173)
regval |= F_MMU_PREFETCH_RT_REPLACE_MOD;
writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
@@ -541,14 +541,14 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
F_INT_PRETETCH_TRANSATION_FIFO_FAULT;
writel_relaxed(regval, data->base + REG_MMU_INT_MAIN_CONTROL);
- if (data->m4u_plat == M4U_MT8173)
+ if (data->plat_data->m4u_plat == M4U_MT8173)
regval = (data->protect_base >> 1) | (data->enable_4GB << 31);
else
regval = lower_32_bits(data->protect_base) |
upper_32_bits(data->protect_base);
writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
- if (data->enable_4GB && data->m4u_plat != M4U_MT8173) {
+ if (data->enable_4GB && data->plat_data->m4u_plat != M4U_MT8173) {
/*
* If 4GB mode is enabled, the validate PA range is from
* 0x1_0000_0000 to 0x1_ffff_ffff. here record bit[32:30].
@@ -559,7 +559,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
/* It's MISC control register whose default value is ok except mt8173.*/
- if (data->m4u_plat == M4U_MT8173)
+ if (data->plat_data->m4u_plat == M4U_MT8173)
writel_relaxed(0, data->base + REG_MMU_STANDARD_AXI_MODE);
if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0,
@@ -592,7 +592,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
if (!data)
return -ENOMEM;
data->dev = dev;
- data->m4u_plat = (enum mtk_iommu_plat)of_device_get_match_data(dev);
+ data->plat_data = of_device_get_match_data(dev);
/* Protect memory. HW will access here while translation fault.*/
protect = devm_kzalloc(dev, MTK_PROTECT_PA_ALIGN * 2, GFP_KERNEL);
@@ -736,9 +736,17 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(mtk_iommu_suspend, mtk_iommu_resume)
};
+static const struct mtk_iommu_plat_data mt2712_data = {
+ .m4u_plat = M4U_MT2712,
+};
+
+static const struct mtk_iommu_plat_data mt8173_data = {
+ .m4u_plat = M4U_MT8173,
+};
+
static const struct of_device_id mtk_iommu_of_ids[] = {
- { .compatible = "mediatek,mt2712-m4u", .data = (void *)M4U_MT2712},
- { .compatible = "mediatek,mt8173-m4u", .data = (void *)M4U_MT8173},
+ { .compatible = "mediatek,mt2712-m4u", .data = &mt2712_data},
+ { .compatible = "mediatek,mt8173-m4u", .data = &mt8173_data},
{}
};
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 778498b..333a0ef 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -41,6 +41,10 @@ enum mtk_iommu_plat {
M4U_MT8173,
};
+struct mtk_iommu_plat_data {
+ enum mtk_iommu_plat m4u_plat;
+};
+
struct mtk_iommu_domain;
struct mtk_iommu_data {
@@ -57,7 +61,7 @@ struct mtk_iommu_data {
bool tlb_flush_active;
struct iommu_device iommu;
- enum mtk_iommu_plat m4u_plat;
+ const struct mtk_iommu_plat_data *plat_data;
struct list_head list;
};
--
1.9.1
In some SoCs, the M4U doesn't have its own "bclk"; it uses the EMI
clock instead, which is always enabled by the time the kernel starts.
This is also a preparatory patch for mt8183.
Signed-off-by: Yong Wu <[email protected]>
---
drivers/iommu/mtk_iommu.c | 10 +++++++---
drivers/iommu/mtk_iommu.h | 3 +++
2 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index ae1aa5a..847082c 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -613,9 +613,11 @@ static int mtk_iommu_probe(struct platform_device *pdev)
if (data->irq < 0)
return data->irq;
- data->bclk = devm_clk_get(dev, "bclk");
- if (IS_ERR(data->bclk))
- return PTR_ERR(data->bclk);
+ if (data->plat_data->has_bclk) {
+ data->bclk = devm_clk_get(dev, "bclk");
+ if (IS_ERR(data->bclk))
+ return PTR_ERR(data->bclk);
+ }
larb_nr = of_count_phandle_with_args(dev->of_node,
"mediatek,larbs", NULL);
@@ -739,11 +741,13 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
static const struct mtk_iommu_plat_data mt2712_data = {
.m4u_plat = M4U_MT2712,
.has_4gb_mode = true,
+ .has_bclk = true,
};
static const struct mtk_iommu_plat_data mt8173_data = {
.m4u_plat = M4U_MT8173,
.has_4gb_mode = true,
+ .has_bclk = true,
};
static const struct of_device_id mtk_iommu_of_ids[] = {
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 5890e55..b8749ac 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -44,6 +44,9 @@ enum mtk_iommu_plat {
struct mtk_iommu_plat_data {
enum mtk_iommu_plat m4u_plat;
bool has_4gb_mode;
+
+ /* HW will use the EMI clock if there isn't the "bclk". */
+ bool has_bclk;
};
struct mtk_iommu_domain;
--
1.9.1
MediaTek extends the ARM v7s descriptor to support DRAM over 4GB.
On mt2712 and mt8173 this is called "4GB mode": the physical address
ranges from 0x4000_0000 to 0x1_3fff_ffff, but from the EMI point of
view it is remapped to the high range 0x1_0000_0000 to 0x1_ffff_ffff,
so bit32 is always set. Thus, in the M4U, we always set bit9 in all
PTEs, which enables bit32 of the physical address.
On mt8183, the M4U supports DRAM from 0x4000_0000 to 0x3_ffff_ffff,
which isn't remapped. We extend the PTEs: bit9 represents bit32 of the
PA and bit4 represents bit33 of the PA. The iova is still 32 bits.
In order to unify the code, in "4GB mode" we set bit32 of the
physical address manually in our driver.
Correspondingly, adding bit32 and bit33 to the PA in iova_to_phys
has to be moved into v7s.
Regarding whether the pagetable address can be over 4GB: mt8183
supports it while the previous mt8173 doesn't, so keep it as is.
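The PTE extension described above can be sketched in plain userspace C.
This is only an illustration, not the driver code: the helper names
(pa_to_pte, pte_to_pa, pa_4gb_mode) are hypothetical, and a fixed 4KB
small-page mask is assumed in place of ARM_V7S_LVL_MASK(lvl).

```c
#include <stdint.h>

/* PTE bits, named after ARM_V7S_ATTR_MTK_PA_BIT32/33 in the patch. */
#define PTE_PA_BIT32    (UINT32_C(1) << 9)
#define PTE_PA_BIT33    (UINT32_C(1) << 4)
/* Assumption: a 4KB small-page mask stands in for ARM_V7S_LVL_MASK(lvl). */
#define SMALL_PAGE_MASK UINT32_C(0xfffff000)

/* Fold a physical address of up to 34 bits into a 32-bit PTE. */
static uint32_t pa_to_pte(uint64_t pa)
{
	uint32_t pte = (uint32_t)pa & SMALL_PAGE_MASK;

	if (pa & (UINT64_C(1) << 32))
		pte |= PTE_PA_BIT32;
	if (pa & (UINT64_C(1) << 33))
		pte |= PTE_PA_BIT33;
	return pte;
}

/* Recover the physical address from the PTE (inverse of pa_to_pte). */
static uint64_t pte_to_pa(uint32_t pte)
{
	uint64_t pa = pte & SMALL_PAGE_MASK;

	if (pte & PTE_PA_BIT32)
		pa |= UINT64_C(1) << 32;
	if (pte & PTE_PA_BIT33)
		pa |= UINT64_C(1) << 33;
	return pa;
}

/* "4GB mode" (mt2712/mt8173): bit32 is set before the PA is mapped. */
static uint64_t pa_4gb_mode(uint64_t pa)
{
	return pa | (UINT64_C(1) << 32);
}
```

For instance, pa_to_pte(0x340001000) gives 0x40001210 (bit9 and bit4
set), and pte_to_pa() maps that value back to 0x340001000.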
Signed-off-by: Yong Wu <[email protected]>
Reviewed-by: Robin Murphy <[email protected]>
---
drivers/iommu/io-pgtable-arm-v7s.c | 31 ++++++++++++++++++++++++-------
drivers/iommu/io-pgtable.h | 7 +++----
drivers/iommu/mtk_iommu.c | 14 ++++++++------
drivers/iommu/mtk_iommu.h | 1 +
4 files changed, 36 insertions(+), 17 deletions(-)
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 11d8505..8803a35 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -124,7 +124,9 @@
#define ARM_V7S_TEX_MASK 0x7
#define ARM_V7S_ATTR_TEX(val) (((val) & ARM_V7S_TEX_MASK) << ARM_V7S_TEX_SHIFT)
-#define ARM_V7S_ATTR_MTK_4GB BIT(9) /* MTK extend it for 4GB mode */
+/* MediaTek extend the two bits below for over 4GB mode */
+#define ARM_V7S_ATTR_MTK_PA_BIT32 BIT(9)
+#define ARM_V7S_ATTR_MTK_PA_BIT33 BIT(4)
/* *well, except for TEX on level 2 large pages, of course :( */
#define ARM_V7S_CONT_PAGE_TEX_SHIFT 6
@@ -183,13 +185,22 @@ static dma_addr_t __arm_v7s_dma_addr(void *pages)
static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
struct io_pgtable_cfg *cfg)
{
- return paddr & ARM_V7S_LVL_MASK(lvl);
+ arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
+
+ if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
+ if (paddr & BIT_ULL(32))
+ pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
+ if (paddr & BIT_ULL(33))
+ pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
+ }
+ return pte;
}
static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
struct io_pgtable_cfg *cfg)
{
arm_v7s_iopte mask;
+ phys_addr_t paddr;
if (ARM_V7S_PTE_IS_TABLE(pte, lvl))
mask = ARM_V7S_TABLE_MASK;
@@ -198,7 +209,14 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
else
mask = ARM_V7S_LVL_MASK(lvl);
- return pte & mask;
+ paddr = pte & mask;
+ if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
+ if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
+ paddr |= BIT_ULL(32);
+ if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
+ paddr |= BIT_ULL(33);
+ }
+ return paddr;
}
static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl,
@@ -315,9 +333,6 @@ static arm_v7s_iopte arm_v7s_prot_to_pte(int prot, int lvl,
if (lvl == 1 && (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS))
pte |= ARM_V7S_ATTR_NS_SECTION;
- if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)
- pte |= ARM_V7S_ATTR_MTK_4GB;
-
return pte;
}
@@ -504,7 +519,9 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
if (!(prot & (IOMMU_READ | IOMMU_WRITE)))
return 0;
- if (WARN_ON(upper_32_bits(iova) || upper_32_bits(paddr)))
+ if (WARN_ON(upper_32_bits(iova)) ||
+ WARN_ON(upper_32_bits(paddr) &&
+ !(iop->cfg.quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)))
return -ERANGE;
ret = __arm_v7s_map(data, iova, paddr, size, prot, 1, data->pgd);
diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
index 47d5ae5..69db115 100644
--- a/drivers/iommu/io-pgtable.h
+++ b/drivers/iommu/io-pgtable.h
@@ -62,10 +62,9 @@ struct io_pgtable_cfg {
* (unmapped) entries but the hardware might do so anyway, perform
* TLB maintenance when mapping as well as when unmapping.
*
- * IO_PGTABLE_QUIRK_ARM_MTK_4GB: (ARM v7s format) Set bit 9 in all
- * PTEs, for Mediatek IOMMUs which treat it as a 33rd address bit
- * when the SoC is in "4GB mode" and they can only access the high
- * remap of DRAM (0x1_00000000 to 0x1_ffffffff).
+ * IO_PGTABLE_QUIRK_ARM_MTK_4GB: (ARM v7s format) MediaTek IOMMUs extend
+ * to support up to 34 bits PA where the bit32 and bit33 are
+ * encoded in the bit9 and bit4 of the PTE respectively.
*
* IO_PGTABLE_QUIRK_NO_DMA: Guarantees that the tables will only ever
* be accessed by a fully cache-coherent IOMMU or CPU (e.g. for a
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 189d1b5..ae1aa5a 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -367,12 +367,16 @@ static int mtk_iommu_map(struct iommu_domain *domain, unsigned long iova,
phys_addr_t paddr, size_t size, int prot)
{
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
+ struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
unsigned long flags;
int ret;
+ /* The "4GB mode" M4U physically can not use the lower remap of Dram. */
+ if (data->plat_data->has_4gb_mode && data->enable_4GB)
+ paddr |= BIT_ULL(32);
+
spin_lock_irqsave(&dom->pgtlock, flags);
- ret = dom->iop->map(dom->iop, iova, paddr & DMA_BIT_MASK(32),
- size, prot);
+ ret = dom->iop->map(dom->iop, iova, paddr, size, prot);
spin_unlock_irqrestore(&dom->pgtlock, flags);
return ret;
@@ -401,7 +405,6 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
dma_addr_t iova)
{
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
- struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
unsigned long flags;
phys_addr_t pa;
@@ -409,9 +412,6 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
pa = dom->iop->iova_to_phys(dom->iop, iova);
spin_unlock_irqrestore(&dom->pgtlock, flags);
- if (data->enable_4GB)
- pa |= BIT_ULL(32);
-
return pa;
}
@@ -738,10 +738,12 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
static const struct mtk_iommu_plat_data mt2712_data = {
.m4u_plat = M4U_MT2712,
+ .has_4gb_mode = true,
};
static const struct mtk_iommu_plat_data mt8173_data = {
.m4u_plat = M4U_MT8173,
+ .has_4gb_mode = true,
};
static const struct of_device_id mtk_iommu_of_ids[] = {
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 333a0ef..5890e55 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -43,6 +43,7 @@ enum mtk_iommu_plat {
struct mtk_iommu_plat_data {
enum mtk_iommu_plat m4u_plat;
+ bool has_4gb_mode;
};
struct mtk_iommu_domain;
--
1.9.1
The larb-id may be remapped in the smi-common, which means the
larb-id reported in mtk_iommu_isr isn't the real larb-id.
Take mt8183 as an example:
                     M4U
                      |
----------------------------------------------
|                SMI common                  |
-0-----7-----5-----6-----1-----2------3-----4- <- Id remapped
 |     |     |     |     |     |      |     |
larb0 larb1 IPU0  IPU1  larb4 larb5  larb6 CCU
disp  vdec  img   cam   venc  img    cam
As shown above, larb0 connects to id 0 in smi-common.
larb1 connects to id 7 in smi-common.
...
If the larb-id reported in the isr is 7, it is actually larb1 (vdec).
In order to output the right larb-id in the isr, we add a larb-id
remapping relationship in this patch.
SoCs that don't have this larb-id remapping use a linear mapping
array instead.
This is also a preparatory patch for mt8183.
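As a rough userspace sketch of the lookup, the table below is indexed
by the id reported by the HW and yields the real larb-id. The values
follow the mt8183 diagram above, with IPU0, IPU1 and CCU counted as
larb2, larb3 and larb7 as in the cover letter. The names LARB_NR and
real_larb_id are hypothetical, and the bounds check is an addition for
the sketch only.

```c
#include <stddef.h>

#define LARB_NR 8

/* Index: id seen in smi-common / the isr. Value: real larb-id. */
static const unsigned char mt8183_larbid_remap[LARB_NR] = {
	0, 4, 5, 6, 7, 2, 3, 1
};

/* Return the real larb-id, or -1 if the reported id is out of range. */
static int real_larb_id(unsigned int reported_id)
{
	if (reported_id >= LARB_NR)
		return -1;
	return mt8183_larbid_remap[reported_id];
}
```

With this table a reported id of 7 resolves to larb1 (vdec), and a
reported id of 5 resolves to larb2 (IPU0).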
Signed-off-by: Yong Wu <[email protected]>
---
drivers/iommu/mtk_iommu.c | 4 ++++
drivers/iommu/mtk_iommu.h | 2 ++
2 files changed, 6 insertions(+)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 847082c..eca1536 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -220,6 +220,8 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
fault_larb = F_MMU0_INT_ID_LARB_ID(regval);
fault_port = F_MMU0_INT_ID_PORT_ID(regval);
+ fault_larb = data->plat_data->larbid_remap[fault_larb];
+
if (report_iommu_fault(&dom->domain, data->dev, fault_iova,
write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) {
dev_err_ratelimited(
@@ -742,12 +744,14 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
.m4u_plat = M4U_MT2712,
.has_4gb_mode = true,
.has_bclk = true,
+ .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
};
static const struct mtk_iommu_plat_data mt8173_data = {
.m4u_plat = M4U_MT8173,
.has_4gb_mode = true,
.has_bclk = true,
+ .larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
};
static const struct of_device_id mtk_iommu_of_ids[] = {
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index b8749ac..eec19a6 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -47,6 +47,8 @@ struct mtk_iommu_plat_data {
/* HW will use the EMI clock if there isn't the "bclk". */
bool has_bclk;
+
+ unsigned char larbid_remap[MTK_LARB_NR_MAX];
};
struct mtk_iommu_domain;
--
1.9.1
The protect memory setting differs slightly between SoCs.
In the register REG_MMU_CTRL_REG(0x110), the TF_PROT (translation fault
protect) field is normally shifted by 4 bits, while it is shifted by
5 bits only on mt8173. This patch deletes the complex macro and uses a
plain if-else instead.
Also, use "F_MMU_TF_PROT_TO_PROGRAM_ADDR" instead of the hard-coded
value 2, which means that when a translation fault occurs the M4U
outputs the dirty data to the programmed address that we allocated
dynamically.
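To make the two layouts concrete, here is a small standalone sketch of
the value written to REG_MMU_CTRL_REG. The function name
mmu_ctrl_regval and its boolean parameter are hypothetical; the bit
definitions are taken from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define F_MMU_PREFETCH_RT_REPLACE_MOD  (UINT32_C(1) << 4)
#define F_MMU_TF_PROT_TO_PROGRAM_ADDR  UINT32_C(2)

/* Compute the REG_MMU_CTRL_REG value for the two TF_PROT layouts. */
static uint32_t mmu_ctrl_regval(bool is_mt8173)
{
	if (is_mt8173)
		/* mt8173: TF_PROT starts at bit 5, bit 4 is the prefetch mode. */
		return F_MMU_PREFETCH_RT_REPLACE_MOD |
		       (F_MMU_TF_PROT_TO_PROGRAM_ADDR << 5);

	/* Other SoCs (e.g. mt2712, mt8183): TF_PROT starts at bit 4. */
	return F_MMU_TF_PROT_TO_PROGRAM_ADDR << 4;
}
```

Under these definitions the register value is 0x50 on mt8173 and 0x20
on the other SoCs.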
Signed-off-by: Yong Wu <[email protected]>
---
@Nicalos, I didn't put it in the plat_data since only the previous
mt8173 shifts by 5. As far as I know, the latest SoCs always use the
new setting, like mt2712 and mt8183. Thus, I think it is unnecessary to
put it in plat_data and make all the latest SoCs set it. Hence, I still
keep "== mt8173" for this, like the register REG_MMU_CTRL_REG.
---
drivers/iommu/mtk_iommu.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index eca1536..35a1263 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -53,11 +53,7 @@
#define REG_MMU_CTRL_REG 0x110
#define F_MMU_PREFETCH_RT_REPLACE_MOD BIT(4)
-#define F_MMU_TF_PROTECT_SEL_SHIFT(data) \
- ((data)->plat_data->m4u_plat == M4U_MT2712 ? 4 : 5)
-/* It's named by F_MMU_TF_PROT_SEL in mt2712. */
-#define F_MMU_TF_PROTECT_SEL(prot, data) \
- (((prot) & 0x3) << F_MMU_TF_PROTECT_SEL_SHIFT(data))
+#define F_MMU_TF_PROT_TO_PROGRAM_ADDR 2
#define REG_MMU_IVRP_PADDR 0x114
@@ -521,9 +517,11 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
return ret;
}
- regval = F_MMU_TF_PROTECT_SEL(2, data);
if (data->plat_data->m4u_plat == M4U_MT8173)
- regval |= F_MMU_PREFETCH_RT_REPLACE_MOD;
+ regval = F_MMU_PREFETCH_RT_REPLACE_MOD |
+ (F_MMU_TF_PROT_TO_PROGRAM_ADDR << 5);
+ else
+ regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR << 4;
writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
regval = F_L2_MULIT_HIT_EN |
--
1.9.1
In mt8173 and mt8183, 0x48 is REG_MMU_STANDARD_AXI_MODE, while in the
other SoCs it is extended to REG_MMU_CTRL, which contains
_STANDARD_AXI_MODE. Move this property to plat_data since both mt8173
and mt8183 use it.
This is a preparatory patch for mt8183.
Signed-off-by: Yong Wu <[email protected]>
---
drivers/iommu/mtk_iommu.c | 4 ++--
drivers/iommu/mtk_iommu.h | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 35a1263..8d8ab21 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -558,8 +558,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
}
writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
- /* It's MISC control register whose default value is ok except mt8173.*/
- if (data->plat_data->m4u_plat == M4U_MT8173)
+ if (data->plat_data->reset_axi)
writel_relaxed(0, data->base + REG_MMU_STANDARD_AXI_MODE);
if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0,
@@ -749,6 +748,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
.m4u_plat = M4U_MT8173,
.has_4gb_mode = true,
.has_bclk = true,
+ .reset_axi = true,
.larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
};
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index eec19a6..b46aeaa 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -47,7 +47,7 @@ struct mtk_iommu_plat_data {
/* HW will use the EMI clock if there isn't the "bclk". */
bool has_bclk;
-
+ bool reset_axi;
unsigned char larbid_remap[MTK_LARB_NR_MAX];
};
--
1.9.1
In some SoCs like mt8183, SMI adds a GALS (Global Async Local Sync)
module which helps synchronize modules running at different clock
frequencies. It can be seen as an "asynchronous fifo". This is an
example diagram:
           M4U
            |
        ----------
        |        |
    gals0-rx gals1-rx
        |        |
        |        |
    gals0-tx gals1-tx
        |        |
       ------------
        SMI Common
       ------------
            |
  +-----+--------+-----+- ...
  |     |        |     |
  |  gals-rx  gals-rx  |
  |     |        |     |
  |     |        |     |
  |  gals-tx  gals-tx  |
  |     |        |     |
larb1 larb2   larb3  larb4
GALS only helps transfer the command/data and doesn't have any
configuration registers, thus it has the special "smi" clock and
doesn't have the "apb" clock. From the diagram above, we add "gals0"
and "gals1" clocks for smi-common and add a "gals" clock for smi-larb.
This patch adds gals clock support in the SMI. Note that some larbs
may still not have the "gals" clock, like larb1 and larb4 above.
This is also a preparatory patch for mt8183, which has GALS.
Signed-off-by: Yong Wu <[email protected]>
---
drivers/memory/mtk-smi.c | 36 ++++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 8a2f968..91634d7 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -56,6 +56,7 @@ enum mtk_smi_gen {
struct mtk_smi_common_plat {
enum mtk_smi_gen gen;
+ bool has_gals;
};
struct mtk_smi_larb_gen {
@@ -63,11 +64,13 @@ struct mtk_smi_larb_gen {
int port_in_larb[MTK_LARB_NR_MAX + 1];
void (*config_port)(struct device *);
unsigned int larb_direct_to_common_mask;
+ bool has_gals;
};
struct mtk_smi {
struct device *dev;
struct clk *clk_apb, *clk_smi;
+ struct clk *clk_gals0, *clk_gals1;
struct clk *clk_async; /*only needed by mt2701*/
void __iomem *smi_ao_base;
@@ -99,8 +102,20 @@ static int mtk_smi_enable(const struct mtk_smi *smi)
if (ret)
goto err_disable_apb;
+ ret = clk_prepare_enable(smi->clk_gals0);
+ if (ret)
+ goto err_disable_smi;
+
+ ret = clk_prepare_enable(smi->clk_gals1);
+ if (ret)
+ goto err_disable_gals0;
+
return 0;
+err_disable_gals0:
+ clk_disable_unprepare(smi->clk_gals0);
+err_disable_smi:
+ clk_disable_unprepare(smi->clk_smi);
err_disable_apb:
clk_disable_unprepare(smi->clk_apb);
err_put_pm:
@@ -110,6 +125,8 @@ static int mtk_smi_enable(const struct mtk_smi *smi)
static void mtk_smi_disable(const struct mtk_smi *smi)
{
+ clk_disable_unprepare(smi->clk_gals1);
+ clk_disable_unprepare(smi->clk_gals0);
clk_disable_unprepare(smi->clk_smi);
clk_disable_unprepare(smi->clk_apb);
pm_runtime_put_sync(smi->dev);
@@ -310,6 +327,15 @@ static int mtk_smi_larb_probe(struct platform_device *pdev)
larb->smi.clk_smi = devm_clk_get(dev, "smi");
if (IS_ERR(larb->smi.clk_smi))
return PTR_ERR(larb->smi.clk_smi);
+
+ if (larb->larb_gen->has_gals) {
+ /* The larbs may still haven't gals even if the SoC support.*/
+ larb->smi.clk_gals0 = devm_clk_get(dev, "gals");
+ if (PTR_ERR(larb->smi.clk_gals0) == -ENOENT)
+ larb->smi.clk_gals0 = NULL;
+ else if (IS_ERR(larb->smi.clk_gals0))
+ return PTR_ERR(larb->smi.clk_gals0);
+ }
larb->smi.dev = dev;
if (larb->larb_gen->need_larbid) {
@@ -402,6 +428,16 @@ static int mtk_smi_common_probe(struct platform_device *pdev)
if (IS_ERR(common->clk_smi))
return PTR_ERR(common->clk_smi);
+ if (common->plat->has_gals) {
+ common->clk_gals0 = devm_clk_get(dev, "gals0");
+ if (IS_ERR(common->clk_gals0))
+ return PTR_ERR(common->clk_gals0);
+
+ common->clk_gals1 = devm_clk_get(dev, "gals1");
+ if (IS_ERR(common->clk_gals1))
+ return PTR_ERR(common->clk_gals1);
+ }
+
/*
* for mtk smi gen 1, we need to get the ao(always on) base to config
* m4u port, and we need to enable the aync clock for transform the smi
--
1.9.1
Neither mt8173 nor mt8183 has the vld_pa_rng (valid physical address
range) register, while mt2712 has it. Move it into the plat_data.
Signed-off-by: Yong Wu <[email protected]>
---
drivers/iommu/mtk_iommu.c | 3 ++-
drivers/iommu/mtk_iommu.h | 1 +
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 8d8ab21..2913ddb 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -548,7 +548,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
upper_32_bits(data->protect_base);
writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
- if (data->enable_4GB && data->plat_data->m4u_plat != M4U_MT8173) {
+ if (data->enable_4GB && data->plat_data->vld_pa_rng) {
/*
* If 4GB mode is enabled, the validate PA range is from
* 0x1_0000_0000 to 0x1_ffff_ffff. here record bit[32:30].
@@ -741,6 +741,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
.m4u_plat = M4U_MT2712,
.has_4gb_mode = true,
.has_bclk = true,
+ .vld_pa_rng = true,
.larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
};
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index b46aeaa..a8c5d1e 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -48,6 +48,7 @@ struct mtk_iommu_plat_data {
/* HW will use the EMI clock if there isn't the "bclk". */
bool has_bclk;
bool reset_axi;
+ bool vld_pa_rng;
unsigned char larbid_remap[MTK_LARB_NR_MAX];
};
--
1.9.1
The M4U IP block in mt8183 is MediaTek's generation2 M4U, which uses
the ARM Short-descriptor format like mt8173, and most of the HW
registers are the same.
The main differences between mt8183 and mt8173/mt2712 are:
1) mt8183 has only one M4U HW like mt8173, while mt2712 has two.
2) mt8183 doesn't have the "bclk" clock; it uses the EMI clock instead.
3) mt8183 can support DRAM over 4GB, but it doesn't call this "4GB
mode".
4) The mt8183 pgtable base register (0x0) extends bit[1:0] to represent
bit[33:32] of the physical address of the pgtable base. But in the
standard ttbr0, bit 1 is the S bit, which is enabled by default. Hence,
we add a mask.
5) mt8183 HW has GALS modules; SMI should enable "has_gals" support.
6) mt8183 needs reset_axi like mt8173.
7) The larb-id in smi-common is remapped. M4U should add its larbid_remap.
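Item 4) can be illustrated with a tiny standalone sketch. The mask
matches GENMASK(31, 7) from this patch; the helper name pt_base_regval
and the sample ttbr0 value in the note below are made up for
illustration.

```c
#include <stdint.h>

/* Same value as GENMASK(31, 7): keep only the pgtable base bits. */
#define MMU_PT_ADDR_MASK UINT32_C(0xffffff80)

/*
 * Strip the low ttbr0 attribute bits (such as the S bit, bit 1) so they
 * are not misread by mt8183 as bit[33:32] of the pgtable base address.
 */
static uint32_t pt_base_regval(uint32_t ttbr0)
{
	return ttbr0 & MMU_PT_ADDR_MASK;
}
```

For a hypothetical ttbr0 of 0x4000406a, the value written to
REG_MMU_PT_BASE_ADDR would be 0x40004000.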
Signed-off-by: Yong Wu <[email protected]>
---
drivers/iommu/mtk_iommu.c | 15 ++++++++++++---
drivers/iommu/mtk_iommu.h | 1 +
drivers/memory/mtk-smi.c | 20 ++++++++++++++++++++
3 files changed, 33 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 2913ddb..66e3615 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -36,6 +36,7 @@
#include "mtk_iommu.h"
#define REG_MMU_PT_BASE_ADDR 0x000
+#define MMU_PT_ADDR_MASK GENMASK(31, 7)
#define REG_MMU_INVALIDATE 0x020
#define F_ALL_INVLD 0x2
@@ -342,7 +343,7 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
/* Update the pgtable base address register of the M4U HW */
if (!data->m4u_dom) {
data->m4u_dom = dom;
- writel(dom->cfg.arm_v7s_cfg.ttbr[0],
+ writel(dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
data->base + REG_MMU_PT_BASE_ADDR);
}
@@ -712,6 +713,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
{
struct mtk_iommu_data *data = dev_get_drvdata(dev);
struct mtk_iommu_suspend_reg *reg = &data->reg;
+ struct mtk_iommu_domain *m4u_dom = data->m4u_dom;
void __iomem *base = data->base;
int ret;
@@ -727,8 +729,8 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
writel_relaxed(reg->int_control0, base + REG_MMU_INT_CONTROL0);
writel_relaxed(reg->int_main_control, base + REG_MMU_INT_MAIN_CONTROL);
writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
- if (data->m4u_dom)
- writel(data->m4u_dom->cfg.arm_v7s_cfg.ttbr[0],
+ if (m4u_dom)
+ writel(m4u_dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
base + REG_MMU_PT_BASE_ADDR);
return 0;
}
@@ -753,9 +755,16 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
.larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
};
+static const struct mtk_iommu_plat_data mt8183_data = {
+ .m4u_plat = M4U_MT8183,
+ .reset_axi = true,
+ .larbid_remap = {0, 4, 5, 6, 7, 2, 3, 1},
+};
+
static const struct of_device_id mtk_iommu_of_ids[] = {
{ .compatible = "mediatek,mt2712-m4u", .data = &mt2712_data},
{ .compatible = "mediatek,mt8173-m4u", .data = &mt8173_data},
+ { .compatible = "mediatek,mt8183-m4u", .data = &mt8183_data},
{}
};
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index a8c5d1e..0a7c463 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -39,6 +39,7 @@ enum mtk_iommu_plat {
M4U_MT2701,
M4U_MT2712,
M4U_MT8173,
+ M4U_MT8183,
};
struct mtk_iommu_plat_data {
diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 91634d7..a430721 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -285,6 +285,13 @@ static void mtk_smi_larb_config_port_gen1(struct device *dev)
.larb_direct_to_common_mask = BIT(8) | BIT(9), /* bdpsys */
};
+static const struct mtk_smi_larb_gen mtk_smi_larb_mt8183 = {
+ .has_gals = true,
+ .config_port = mtk_smi_larb_config_port_gen2_general,
+ .larb_direct_to_common_mask = BIT(2) | BIT(3) | BIT(7),
+ /* IPU0 | IPU1 | CCU */
+};
+
static const struct of_device_id mtk_smi_larb_of_ids[] = {
{
.compatible = "mediatek,mt8173-smi-larb",
@@ -298,6 +305,10 @@ static void mtk_smi_larb_config_port_gen1(struct device *dev)
.compatible = "mediatek,mt2712-smi-larb",
.data = &mtk_smi_larb_mt2712
},
+ {
+ .compatible = "mediatek,mt8183-smi-larb",
+ .data = &mtk_smi_larb_mt8183
+ },
{}
};
@@ -391,6 +402,11 @@ static int mtk_smi_larb_remove(struct platform_device *pdev)
.gen = MTK_SMI_GEN2,
};
+static const struct mtk_smi_common_plat mtk_smi_common_mt8183 = {
+ .gen = MTK_SMI_GEN2,
+ .has_gals = true,
+};
+
static const struct of_device_id mtk_smi_common_of_ids[] = {
{
.compatible = "mediatek,mt8173-smi-common",
@@ -404,6 +420,10 @@ static int mtk_smi_larb_remove(struct platform_device *pdev)
.compatible = "mediatek,mt2712-smi-common",
.data = &mtk_smi_common_gen2,
},
+ {
+ .compatible = "mediatek,mt8183-smi-common",
+ .data = &mtk_smi_common_mt8183,
+ },
{}
};
--
1.9.1
This patch only moves clk_prepare_enable and config_port into the
runtime suspend/resume callbacks. It doesn't change the code content
or sequence.
This is a preparatory patch for adjusting SMI_BUS_SEL for mt8183.
(SMI_BUS_SEL needs to be restored every time smi-common resumes.)
It also gives a chance to get rid of mtk_smi_larb_get/put, which could
be a follow-up topic.
CC: Matthias Brugger <[email protected]>
Signed-off-by: Yong Wu <[email protected]>
---
drivers/memory/mtk-smi.c | 113 ++++++++++++++++++++++++++++++-----------------
1 file changed, 72 insertions(+), 41 deletions(-)
diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index a430721..9790801 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -86,17 +86,13 @@ struct mtk_smi_larb { /* larb: local arbiter */
u32 *mmu;
};
-static int mtk_smi_enable(const struct mtk_smi *smi)
+static int mtk_smi_clk_enable(const struct mtk_smi *smi)
{
int ret;
- ret = pm_runtime_get_sync(smi->dev);
- if (ret < 0)
- return ret;
-
ret = clk_prepare_enable(smi->clk_apb);
if (ret)
- goto err_put_pm;
+ return ret;
ret = clk_prepare_enable(smi->clk_smi);
if (ret)
@@ -118,59 +114,28 @@ static int mtk_smi_enable(const struct mtk_smi *smi)
clk_disable_unprepare(smi->clk_smi);
err_disable_apb:
clk_disable_unprepare(smi->clk_apb);
-err_put_pm:
- pm_runtime_put_sync(smi->dev);
return ret;
}
-static void mtk_smi_disable(const struct mtk_smi *smi)
+static void mtk_smi_clk_disable(const struct mtk_smi *smi)
{
clk_disable_unprepare(smi->clk_gals1);
clk_disable_unprepare(smi->clk_gals0);
clk_disable_unprepare(smi->clk_smi);
clk_disable_unprepare(smi->clk_apb);
- pm_runtime_put_sync(smi->dev);
}
int mtk_smi_larb_get(struct device *larbdev)
{
- struct mtk_smi_larb *larb = dev_get_drvdata(larbdev);
- const struct mtk_smi_larb_gen *larb_gen = larb->larb_gen;
- struct mtk_smi *common = dev_get_drvdata(larb->smi_common_dev);
- int ret;
+ int ret = pm_runtime_get_sync(larbdev);
- /* Enable the smi-common's power and clocks */
- ret = mtk_smi_enable(common);
- if (ret)
- return ret;
-
- /* Enable the larb's power and clocks */
- ret = mtk_smi_enable(&larb->smi);
- if (ret) {
- mtk_smi_disable(common);
- return ret;
- }
-
- /* Configure the iommu info for this larb */
- larb_gen->config_port(larbdev);
-
- return 0;
+ return (ret < 0) ? ret : 0;
}
EXPORT_SYMBOL_GPL(mtk_smi_larb_get);
void mtk_smi_larb_put(struct device *larbdev)
{
- struct mtk_smi_larb *larb = dev_get_drvdata(larbdev);
- struct mtk_smi *common = dev_get_drvdata(larb->smi_common_dev);
-
- /*
- * Don't de-configure the iommu info for this larb since there may be
- * several modules in this larb.
- * The iommu info will be reset after power off.
- */
-
- mtk_smi_disable(&larb->smi);
- mtk_smi_disable(common);
+ pm_runtime_put_sync(larbdev);
}
EXPORT_SYMBOL_GPL(mtk_smi_larb_put);
@@ -385,12 +350,52 @@ static int mtk_smi_larb_remove(struct platform_device *pdev)
return 0;
}
+static int __maybe_unused mtk_smi_larb_resume(struct device *dev)
+{
+ struct mtk_smi_larb *larb = dev_get_drvdata(dev);
+ const struct mtk_smi_larb_gen *larb_gen = larb->larb_gen;
+ int ret;
+
+ /* Power on smi-common. */
+ ret = pm_runtime_get_sync(larb->smi_common_dev);
+ if (ret < 0) {
+ dev_err(dev, "Failed to pm get for smi-common(%d).\n", ret);
+ return ret;
+ }
+
+ ret = mtk_smi_clk_enable(&larb->smi);
+ if (ret < 0) {
+ dev_err(dev, "Failed to enable clock(%d).\n", ret);
+ pm_runtime_put_sync(larb->smi_common_dev);
+ return ret;
+ }
+
+ /* Configure the basic setting for this larb */
+ larb_gen->config_port(dev);
+
+ return 0;
+}
+
+static int __maybe_unused mtk_smi_larb_suspend(struct device *dev)
+{
+ struct mtk_smi_larb *larb = dev_get_drvdata(dev);
+
+ mtk_smi_clk_disable(&larb->smi);
+ pm_runtime_put_sync(larb->smi_common_dev);
+ return 0;
+}
+
+static const struct dev_pm_ops smi_larb_pm_ops = {
+ SET_RUNTIME_PM_OPS(mtk_smi_larb_suspend, mtk_smi_larb_resume, NULL)
+};
+
static struct platform_driver mtk_smi_larb_driver = {
.probe = mtk_smi_larb_probe,
.remove = mtk_smi_larb_remove,
.driver = {
.name = "mtk-smi-larb",
.of_match_table = mtk_smi_larb_of_ids,
+ .pm = &smi_larb_pm_ops,
}
};
@@ -489,12 +494,38 @@ static int mtk_smi_common_remove(struct platform_device *pdev)
return 0;
}
+static int __maybe_unused mtk_smi_common_resume(struct device *dev)
+{
+ struct mtk_smi *common = dev_get_drvdata(dev);
+ int ret;
+
+ ret = mtk_smi_clk_enable(common);
+ if (ret) {
+ dev_err(common->dev, "Failed to enable clock(%d).\n", ret);
+ return ret;
+ }
+ return 0;
+}
+
+static int __maybe_unused mtk_smi_common_suspend(struct device *dev)
+{
+ struct mtk_smi *common = dev_get_drvdata(dev);
+
+ mtk_smi_clk_disable(common);
+ return 0;
+}
+
+static const struct dev_pm_ops smi_common_pm_ops = {
+ SET_RUNTIME_PM_OPS(mtk_smi_common_suspend, mtk_smi_common_resume, NULL)
+};
+
static struct platform_driver mtk_smi_common_driver = {
.probe = mtk_smi_common_probe,
.remove = mtk_smi_common_remove,
.driver = {
.name = "mtk-smi-common",
.of_match_table = mtk_smi_common_of_ids,
+ .pm = &smi_common_pm_ops,
}
};
--
1.9.1
There are 2 mmu cells in an M4U HW. We can route some larbs into
mmu0 or mmu1 to balance the bandwidth via the smi-common register
SMI_BUS_SEL(0x220) (each larb occupies 2 bits).
In mt8183, for better performance, we switch larb1/2/5/7 to enter
mmu1 while the others still enter mmu0.
In mt8173 and mt2712, we don't see the performance issue, so keep the
default value (0x0), which means all the larbs enter mmu0.
Note: smi gen1 (mt2701/mt7623) doesn't have this bus_sel.
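The resulting SMI_BUS_SEL value for mt8183 can be worked out with a
short standalone sketch. The two macros mirror the ones added by this
patch; the helpers mt8183_bus_sel and larb_enters_mmu1 are
hypothetical, and the decode assumes that a field value of 1 selects
mmu1, as the F_MMU1_LARB macro suggests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Each larb occupies 2 bits in SMI_BUS_SEL. */
#define SMI_BUS_LARB_SHIFT(larbid) ((larbid) << 1)
#define F_MMU1_LARB(larbid)        (UINT32_C(1) << SMI_BUS_LARB_SHIFT(larbid))

/* mt8183 setting: larb1/2/5/7 enter mmu1, the others stay on mmu0. */
static uint32_t mt8183_bus_sel(void)
{
	return F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(5) |
	       F_MMU1_LARB(7);
}

/* Assumption: a 2-bit field value of 1 means the larb enters mmu1. */
static bool larb_enters_mmu1(uint32_t bus_sel, unsigned int larbid)
{
	return ((bus_sel >> SMI_BUS_LARB_SHIFT(larbid)) & 0x3) == 0x1;
}
```

With these definitions the mt8183 value is 0x4414, that is bits 2, 4,
10 and 14 set.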
CC: Matthias Brugger <[email protected]>
Signed-off-by: Yong Wu <[email protected]>
---
drivers/memory/mtk-smi.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 9790801..08cf40d 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -49,6 +49,12 @@
#define SMI_LARB_NONSEC_CON(id) (0x380 + ((id) * 4))
#define F_MMU_EN BIT(0)
+/* SMI COMMON */
+#define SMI_BUS_SEL 0x220
+#define SMI_BUS_LARB_SHIFT(larbid) ((larbid) << 1)
+/* All are MMU0 defaultly. Only specialize mmu1 here. */
+#define F_MMU1_LARB(larbid) (0x1 << SMI_BUS_LARB_SHIFT(larbid))
+
enum mtk_smi_gen {
MTK_SMI_GEN1,
MTK_SMI_GEN2
@@ -57,6 +63,7 @@ enum mtk_smi_gen {
struct mtk_smi_common_plat {
enum mtk_smi_gen gen;
bool has_gals;
+ u32 bus_sel; /* Balance some larbs to enter mmu0 or mmu1 */
};
struct mtk_smi_larb_gen {
@@ -72,8 +79,8 @@ struct mtk_smi {
struct clk *clk_apb, *clk_smi;
struct clk *clk_gals0, *clk_gals1;
struct clk *clk_async; /*only needed by mt2701*/
- void __iomem *smi_ao_base;
-
+ void __iomem *smi_ao_base; /* only for gen1 */
+ void __iomem *base; /* only for gen2 */
const struct mtk_smi_common_plat *plat;
};
@@ -410,6 +417,8 @@ static int __maybe_unused mtk_smi_larb_suspend(struct device *dev)
static const struct mtk_smi_common_plat mtk_smi_common_mt8183 = {
.gen = MTK_SMI_GEN2,
.has_gals = true,
+ .bus_sel = F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(5) |
+ F_MMU1_LARB(7),
};
static const struct of_device_id mtk_smi_common_of_ids[] = {
@@ -482,6 +491,11 @@ static int mtk_smi_common_probe(struct platform_device *pdev)
ret = clk_prepare_enable(common->clk_async);
if (ret)
return ret;
+ } else {
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ common->base = devm_ioremap_resource(dev, res);
+ if (IS_ERR(common->base))
+ return PTR_ERR(common->base);
}
pm_runtime_enable(dev);
platform_set_drvdata(pdev, common);
@@ -497,6 +511,7 @@ static int mtk_smi_common_remove(struct platform_device *pdev)
static int __maybe_unused mtk_smi_common_resume(struct device *dev)
{
struct mtk_smi *common = dev_get_drvdata(dev);
+ u32 bus_sel = common->plat->bus_sel;
int ret;
ret = mtk_smi_clk_enable(common);
@@ -504,6 +519,9 @@ static int __maybe_unused mtk_smi_common_resume(struct device *dev)
dev_err(common->dev, "Failed to enable clock(%d).\n", ret);
return ret;
}
+
+ if (common->plat->gen == MTK_SMI_GEN2 && bus_sel)
+ writel(bus_sel, common->base + SMI_BUS_SEL);
return 0;
}
--
1.9.1
The "mediatek,larb-id" property has already been parsed in the MTK IOMMU
driver, so there is no need to parse it again in the SMI driver. This
patch only cleans up that code. It applies to all the current SoCs:
mt2701, mt2712, mt7623, mt8173 and mt8183.
After this patch, "mediatek,larb-id" is only needed for mt2712, which
has 2 M4Us. In the other SoCs, we can get the larb-id from the M4U,
since the larbs in "mediatek,larbs" are always ordered.
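The lookup that replaces the property read can be illustrated outside the
kernel. The following is only a sketch with mock types: "struct device",
"struct smi_iommu" and LARB_NR_MAX here are simplified stand-ins (assumptions,
not the kernel definitions), and larb_find_id() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

#define LARB_NR_MAX 16	/* hypothetical stand-in for MTK_LARB_NR_MAX */

/* Mock types for illustration only; not the kernel structures. */
struct device { const char *name; };
struct larb_iommu { struct device *dev; unsigned int mmu; };
struct smi_iommu { struct larb_iommu larb_imu[LARB_NR_MAX]; };

/*
 * Return the index of @dev in the ordered larb table, or -1 if it is
 * not listed. The index doubles as the larb-id, so no separate
 * "mediatek,larb-id" property has to be read.
 */
static int larb_find_id(const struct smi_iommu *smi, const struct device *dev)
{
	int i;

	assert(smi && dev);
	for (i = 0; i < LARB_NR_MAX; i++)
		if (smi->larb_imu[i].dev == dev)
			return i;
	return -1;
}
```

The sketch relies on the same premise as the commit message: the table is
filled in the order of "mediatek,larbs", so the matching index is the larb-id.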
CC: Matthias Brugger <[email protected]>
Signed-off-by: Yong Wu <[email protected]>
---
drivers/memory/mtk-smi.c | 26 ++------------------------
1 file changed, 2 insertions(+), 24 deletions(-)
diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 08cf40d..10e6493 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -67,7 +67,6 @@ struct mtk_smi_common_plat {
};
struct mtk_smi_larb_gen {
- bool need_larbid;
int port_in_larb[MTK_LARB_NR_MAX + 1];
void (*config_port)(struct device *);
unsigned int larb_direct_to_common_mask;
@@ -153,18 +152,9 @@ void mtk_smi_larb_put(struct device *larbdev)
struct mtk_smi_iommu *smi_iommu = data;
unsigned int i;
- if (larb->larb_gen->need_larbid) {
- larb->mmu = &smi_iommu->larb_imu[larb->larbid].mmu;
- return 0;
- }
-
- /*
- * If there is no larbid property, Loop to find the corresponding
- * iommu information.
- */
- for (i = 0; i < smi_iommu->larb_nr; i++) {
+ for (i = 0; i < MTK_LARB_NR_MAX; i++) {
if (dev == smi_iommu->larb_imu[i].dev) {
- /* The 'mmu' may be updated in iommu-attach/detach. */
+ larb->larbid = i;
larb->mmu = &smi_iommu->larb_imu[i].mmu;
return 0;
}
@@ -243,7 +233,6 @@ static void mtk_smi_larb_config_port_gen1(struct device *dev)
};
static const struct mtk_smi_larb_gen mtk_smi_larb_mt2701 = {
- .need_larbid = true,
.port_in_larb = {
LARB0_PORT_OFFSET, LARB1_PORT_OFFSET,
LARB2_PORT_OFFSET, LARB3_PORT_OFFSET
@@ -252,7 +241,6 @@ static void mtk_smi_larb_config_port_gen1(struct device *dev)
};
static const struct mtk_smi_larb_gen mtk_smi_larb_mt2712 = {
- .need_larbid = true,
.config_port = mtk_smi_larb_config_port_gen2_general,
.larb_direct_to_common_mask = BIT(8) | BIT(9), /* bdpsys */
};
@@ -291,7 +279,6 @@ static int mtk_smi_larb_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
struct device_node *smi_node;
struct platform_device *smi_pdev;
- int err;
larb = devm_kzalloc(dev, sizeof(*larb), GFP_KERNEL);
if (!larb)
@@ -321,15 +308,6 @@ static int mtk_smi_larb_probe(struct platform_device *pdev)
}
larb->smi.dev = dev;
- if (larb->larb_gen->need_larbid) {
- err = of_property_read_u32(dev->of_node, "mediatek,larb-id",
- &larb->larbid);
- if (err) {
- dev_err(dev, "missing larbid property\n");
- return err;
- }
- }
-
smi_node = of_parse_phandle(dev->of_node, "mediatek,smi", 0);
if (!smi_node)
return -EINVAL;
--
1.9.1
Normally the M4U HW connects the EMI with the SMI. The diagram is as below:
EMI
|
M4U
|
smi-common
|
-----------------
| | | | ...
larb0 larb1 larb2 larb3
Actually there are 2 mmu cells in the M4U HW, like this diagram:
EMI
---------
| |
mmu0 mmu1 <- M4U
| |
---------
|
smi-common
|
-----------------
| | | | ...
larb0 larb1 larb2 larb3
This patch adds support for mmu1. In order to get better performance,
we can route some larbs to mmu1 while the others still go to mmu0.
This is controlled by the SMI COMMON register SMI_BUS_SEL(0x220).
The mt2712, mt8173 and mt8183 M4U HW all have 2 mmu cells. The default
value of that register is 0, which means all the larbs go to mmu0
by default.
This is a preparatory patch for adjusting SMI_BUS_SEL for mt8183.
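As an illustration of how a SMI_BUS_SEL value is composed, here is a small
user-space sketch. The macros follow the SMI patch in this series (a 2-bit
field per larb, with bit 0 of the field selecting mmu1); smi_bus_sel() is a
hypothetical helper, and the 16-larb bound is an assumption derived from a
32-bit register.

```c
#include <assert.h>
#include <stdint.h>

#define SMI_BUS_SEL			0x220
#define SMI_BUS_LARB_SHIFT(larbid)	((larbid) << 1)
/* All larbs default to mmu0; only the mmu1 selection is encoded. */
#define F_MMU1_LARB(larbid)		(0x1u << SMI_BUS_LARB_SHIFT(larbid))

/* Build a SMI_BUS_SEL value that routes every larb in @larbs to mmu1. */
static uint32_t smi_bus_sel(const unsigned int *larbs, unsigned int n)
{
	uint32_t val = 0;
	unsigned int i;

	for (i = 0; i < n; i++) {
		assert(larbs[i] < 16);	/* 2 bits per larb, 32-bit register */
		val |= F_MMU1_LARB(larbs[i]);
	}
	return val;
}
```

With the larbs that the mt8183 platform data in this series sends to mmu1
(1, 2, 5 and 7), the helper yields 0x4414; an empty list yields 0, which
matches the reset value where every larb goes to mmu0.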
Signed-off-by: Yong Wu <[email protected]>
---
drivers/iommu/mtk_iommu.c | 47 +++++++++++++++++++++++++++++------------------
1 file changed, 29 insertions(+), 18 deletions(-)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 66e3615..7fcef19 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -70,27 +70,32 @@
#define F_MISS_FIFO_ERR_INT_EN BIT(6)
#define F_INT_CLR_BIT BIT(12)
-#define REG_MMU_INT_MAIN_CONTROL 0x124
-#define F_INT_TRANSLATION_FAULT BIT(0)
-#define F_INT_MAIN_MULTI_HIT_FAULT BIT(1)
-#define F_INT_INVALID_PA_FAULT BIT(2)
-#define F_INT_ENTRY_REPLACEMENT_FAULT BIT(3)
-#define F_INT_TLB_MISS_FAULT BIT(4)
-#define F_INT_MISS_TRANSACTION_FIFO_FAULT BIT(5)
-#define F_INT_PRETETCH_TRANSATION_FIFO_FAULT BIT(6)
+#define REG_MMU_INT_MAIN_CONTROL 0x124 /* mmu0 | mmu1 */
+#define F_INT_TRANSLATION_FAULT (BIT(0) | BIT(7))
+#define F_INT_MAIN_MULTI_HIT_FAULT (BIT(1) | BIT(8))
+#define F_INT_INVALID_PA_FAULT (BIT(2) | BIT(9))
+#define F_INT_ENTRY_REPLACEMENT_FAULT (BIT(3) | BIT(10))
+#define F_INT_TLB_MISS_FAULT (BIT(4) | BIT(11))
+#define F_INT_MISS_TRANSACTION_FIFO_FAULT (BIT(5) | BIT(12))
+#define F_INT_PRETETCH_TRANSATION_FIFO_FAULT (BIT(6) | BIT(13))
#define REG_MMU_CPE_DONE 0x12C
#define REG_MMU_FAULT_ST1 0x134
+#define F_REG_MMU0_FAULT_MASK GENMASK(6, 0)
+#define F_REG_MMU1_FAULT_MASK GENMASK(13, 7)
-#define REG_MMU_FAULT_VA 0x13c
+#define REG_MMU0_FAULT_VA 0x13c
#define F_MMU_FAULT_VA_WRITE_BIT BIT(1)
#define F_MMU_FAULT_VA_LAYER_BIT BIT(0)
-#define REG_MMU_INVLD_PA 0x140
-#define REG_MMU_INT_ID 0x150
-#define F_MMU0_INT_ID_LARB_ID(a) (((a) >> 7) & 0x7)
-#define F_MMU0_INT_ID_PORT_ID(a) (((a) >> 2) & 0x1f)
+#define REG_MMU0_INVLD_PA 0x140
+#define REG_MMU1_FAULT_VA 0x144
+#define REG_MMU1_INVLD_PA 0x148
+#define REG_MMU0_INT_ID 0x150
+#define REG_MMU1_INT_ID 0x154
+#define F_MMU_INT_ID_LARB_ID(a) (((a) >> 7) & 0x7)
+#define F_MMU_INT_ID_PORT_ID(a) (((a) >> 2) & 0x1f)
#define MTK_PROTECT_PA_ALIGN 128
@@ -209,13 +214,19 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
/* Read error info from registers */
int_state = readl_relaxed(data->base + REG_MMU_FAULT_ST1);
- fault_iova = readl_relaxed(data->base + REG_MMU_FAULT_VA);
+ if (int_state & F_REG_MMU0_FAULT_MASK) {
+ regval = readl_relaxed(data->base + REG_MMU0_INT_ID);
+ fault_iova = readl_relaxed(data->base + REG_MMU0_FAULT_VA);
+ fault_pa = readl_relaxed(data->base + REG_MMU0_INVLD_PA);
+ } else {
+ regval = readl_relaxed(data->base + REG_MMU1_INT_ID);
+ fault_iova = readl_relaxed(data->base + REG_MMU1_FAULT_VA);
+ fault_pa = readl_relaxed(data->base + REG_MMU1_INVLD_PA);
+ }
layer = fault_iova & F_MMU_FAULT_VA_LAYER_BIT;
write = fault_iova & F_MMU_FAULT_VA_WRITE_BIT;
- fault_pa = readl_relaxed(data->base + REG_MMU_INVLD_PA);
- regval = readl_relaxed(data->base + REG_MMU_INT_ID);
- fault_larb = F_MMU0_INT_ID_LARB_ID(regval);
- fault_port = F_MMU0_INT_ID_PORT_ID(regval);
+ fault_larb = F_MMU_INT_ID_LARB_ID(regval);
+ fault_port = F_MMU_INT_ID_PORT_ID(regval);
fault_larb = data->plat_data->larbid_remap[fault_larb];
--
1.9.1
The register VLD_PA_RNG(0x118) was not backed up and restored across
suspend/resume when 4GB mode support was added for mt2712. This patch
adds it.
Fixes: 30e2fccf9512 ("iommu/mediatek: Enlarge the validate PA range
for 4GB mode")
Signed-off-by: Yong Wu <[email protected]>
---
drivers/iommu/mtk_iommu.c | 2 ++
drivers/iommu/mtk_iommu.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 7fcef19..ddf1969 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -716,6 +716,7 @@ static int __maybe_unused mtk_iommu_suspend(struct device *dev)
reg->int_control0 = readl_relaxed(base + REG_MMU_INT_CONTROL0);
reg->int_main_control = readl_relaxed(base + REG_MMU_INT_MAIN_CONTROL);
reg->ivrp_paddr = readl_relaxed(base + REG_MMU_IVRP_PADDR);
+ reg->vld_pa_range = readl_relaxed(base + REG_MMU_VLD_PA_RNG);
clk_disable_unprepare(data->bclk);
return 0;
}
@@ -740,6 +741,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
writel_relaxed(reg->int_control0, base + REG_MMU_INT_CONTROL0);
writel_relaxed(reg->int_main_control, base + REG_MMU_INT_MAIN_CONTROL);
writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
+ writel_relaxed(reg->vld_pa_range, base + REG_MMU_VLD_PA_RNG);
if (m4u_dom)
writel(m4u_dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
base + REG_MMU_PT_BASE_ADDR);
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 0a7c463..c500bfd 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -33,6 +33,7 @@ struct mtk_iommu_suspend_reg {
u32 int_control0;
u32 int_main_control;
u32 ivrp_paddr;
+ u32 vld_pa_range;
};
enum mtk_iommu_plat {
--
1.9.1
In the reboot burning test, if some multimedia HW malfunctions, it may
keep sending invalid requests to the IOMMU. To avoid affecting the
reboot flow, add a shutdown callback that disables the M4U HW at
shutdown.
Signed-off-by: Yong Wu <[email protected]>
---
drivers/iommu/mtk_iommu.c | 6 ++++++
drivers/iommu/mtk_iommu_v1.c | 6 ++++++
2 files changed, 12 insertions(+)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index ddf1969..dcb02e3 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -703,6 +703,11 @@ static int mtk_iommu_remove(struct platform_device *pdev)
return 0;
}
+static void mtk_iommu_shutdown(struct platform_device *pdev)
+{
+ mtk_iommu_remove(pdev);
+}
+
static int __maybe_unused mtk_iommu_suspend(struct device *dev)
{
struct mtk_iommu_data *data = dev_get_drvdata(dev);
@@ -784,6 +789,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
static struct platform_driver mtk_iommu_driver = {
.probe = mtk_iommu_probe,
.remove = mtk_iommu_remove,
+ .shutdown = mtk_iommu_shutdown,
.driver = {
.name = "mtk-iommu",
.of_match_table = of_match_ptr(mtk_iommu_of_ids),
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 6ede428..517dfbd 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -662,6 +662,11 @@ static int mtk_iommu_remove(struct platform_device *pdev)
return 0;
}
+static void mtk_iommu_shutdown(struct platform_device *pdev)
+{
+ mtk_iommu_remove(pdev);
+}
+
static int __maybe_unused mtk_iommu_suspend(struct device *dev)
{
struct mtk_iommu_data *data = dev_get_drvdata(dev);
@@ -699,6 +704,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
static struct platform_driver mtk_iommu_driver = {
.probe = mtk_iommu_probe,
.remove = mtk_iommu_remove,
+ .shutdown = mtk_iommu_shutdown,
.driver = {
.name = "mtk-iommu-v1",
.of_match_table = mtk_iommu_of_ids,
--
1.9.1
Switch to SPDX license identifier for MediaTek iommu/smi and their
header files.
Signed-off-by: Yong Wu <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
---
drivers/iommu/mtk_iommu.c | 10 +---------
drivers/iommu/mtk_iommu.h | 10 +---------
drivers/iommu/mtk_iommu_v1.c | 10 +---------
drivers/memory/mtk-smi.c | 10 +---------
include/dt-bindings/memory/mt2701-larb-port.h | 10 +---------
include/dt-bindings/memory/mt8173-larb-port.h | 10 +---------
include/soc/mediatek/smi.h | 10 +---------
7 files changed, 7 insertions(+), 63 deletions(-)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index dcb02e3..36526c9 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -1,15 +1,7 @@
+// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2015-2016 MediaTek Inc.
* Author: Yong Wu <[email protected]>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
*/
#include <linux/memblock.h>
#include <linux/bug.h>
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index c500bfd..e09f2220 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -1,15 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2015-2016 MediaTek Inc.
* Author: Honghui Zhang <[email protected]>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
*/
#ifndef _MTK_IOMMU_H_
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 517dfbd..f8b8275 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
/*
* IOMMU API for MTK architected m4u v1 implementations
*
@@ -5,15 +6,6 @@
* Author: Honghui Zhang <[email protected]>
*
* Based on driver/iommu/mtk_iommu.c
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
*/
#include <linux/memblock.h>
#include <linux/bug.h>
diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 10e6493..9688341 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -1,15 +1,7 @@
+// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2015-2016 MediaTek Inc.
* Author: Yong Wu <[email protected]>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
*/
#include <linux/clk.h>
#include <linux/component.h>
diff --git a/include/dt-bindings/memory/mt2701-larb-port.h b/include/dt-bindings/memory/mt2701-larb-port.h
index 6764d74..c511f0f 100644
--- a/include/dt-bindings/memory/mt2701-larb-port.h
+++ b/include/dt-bindings/memory/mt2701-larb-port.h
@@ -1,15 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2015 MediaTek Inc.
* Author: Honghui Zhang <[email protected]>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
*/
#ifndef _MT2701_LARB_PORT_H_
diff --git a/include/dt-bindings/memory/mt8173-larb-port.h b/include/dt-bindings/memory/mt8173-larb-port.h
index 111b4b0..a62bfeb 100644
--- a/include/dt-bindings/memory/mt8173-larb-port.h
+++ b/include/dt-bindings/memory/mt8173-larb-port.h
@@ -1,15 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2015-2016 MediaTek Inc.
* Author: Yong Wu <[email protected]>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
*/
#ifndef __DTS_IOMMU_PORT_MT8173_H
#define __DTS_IOMMU_PORT_MT8173_H
diff --git a/include/soc/mediatek/smi.h b/include/soc/mediatek/smi.h
index 5201e90..2b410d2 100644
--- a/include/soc/mediatek/smi.h
+++ b/include/soc/mediatek/smi.h
@@ -1,15 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2015-2016 MediaTek Inc.
* Author: Yong Wu <[email protected]>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
*/
#ifndef MTK_IOMMU_SMI_H
#define MTK_IOMMU_SMI_H
--
1.9.1
On Tue, Jan 1, 2019 at 11:58 AM Yong Wu <[email protected]> wrote:
>
> The larb-id may be remapped in the smi-common, this means the
> larb-id reported in the mtk_iommu_isr isn't the real larb-id,
>
> Take mt8183 as a example:
> M4U
> |
> ---------------------------------------------
> | SMI common |
> -0-----7-----5-----6-----1-----2------3-----4- <- Id remapped
> | | | | | | | |
> larb0 larb1 IPU0 IPU1 larb4 larb5 larb6 CCU
> disp vdec img cam venc img cam
> As above, larb0 connects with the id 0 in smi-common.
> larb1 connects with the id 7 in smi-common.
> ...
> If the larb-id reported in the isr is 7, actually it's larb1(vdec).
> In order to output the right larb-id in the isr, we add a larb-id
> remapping relationship in this patch.
>
> If there is no this larb-id remapping in some SoCs, use the linear
> mapping array instead.
>
> This also is a preparing patch for mt8183.
>
> Signed-off-by: Yong Wu <[email protected]>
I think it's a little cleaner this way, thanks.
Reviewed-by: Nicolas Boichat <[email protected]>
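For reference, the remapping described in the quoted diagram can be sketched
as a plain table lookup. The array below is derived from that diagram (index
is the id seen in smi-common, value is the real larb-id, with IPU0/IPU1/CCU
counted as larb2/larb3/larb7); the names mt8183_larbid_remap and
remap_fault_larb() are hypothetical and not taken from the patch.

```c
#include <assert.h>

#define LARB_NR 8	/* number of larbs in the mt8183 diagram */

/*
 * Index: larb id as reported by the HW (position in smi-common).
 * Value: the real larb-id. Derived from the diagram, so treat it as an
 * assumption rather than the final platform data.
 */
static const unsigned char mt8183_larbid_remap[LARB_NR] = {
	0, 4, 5, 6, 7, 2, 3, 1
};

/* Translate the id read in the ISR into the real larb-id. */
static unsigned int remap_fault_larb(unsigned int hw_id)
{
	assert(hw_id < LARB_NR);
	return mt8183_larbid_remap[hw_id];
}
```

A reported id of 7 maps to larb1 (vdec), as in the commit message; SoCs
without remapping would simply use the identity table.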
> ---
> drivers/iommu/mtk_iommu.c | 4 ++++
> drivers/iommu/mtk_iommu.h | 2 ++
> 2 files changed, 6 insertions(+)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 847082c..eca1536 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -220,6 +220,8 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
> fault_larb = F_MMU0_INT_ID_LARB_ID(regval);
> fault_port = F_MMU0_INT_ID_PORT_ID(regval);
>
> + fault_larb = data->plat_data->larbid_remap[fault_larb];
> +
> if (report_iommu_fault(&dom->domain, data->dev, fault_iova,
> write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) {
> dev_err_ratelimited(
> @@ -742,12 +744,14 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
> .m4u_plat = M4U_MT2712,
> .has_4gb_mode = true,
> .has_bclk = true,
> + .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
> };
>
> static const struct mtk_iommu_plat_data mt8173_data = {
> .m4u_plat = M4U_MT8173,
> .has_4gb_mode = true,
> .has_bclk = true,
> + .larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
> };
>
> static const struct of_device_id mtk_iommu_of_ids[] = {
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index b8749ac..eec19a6 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -47,6 +47,8 @@ struct mtk_iommu_plat_data {
>
> /* HW will use the EMI clock if there isn't the "bclk". */
> bool has_bclk;
> +
> + unsigned char larbid_remap[MTK_LARB_NR_MAX];
> };
>
> struct mtk_iommu_domain;
> --
> 1.9.1
>
On Tue, Jan 1, 2019 at 11:58 AM Yong Wu <[email protected]> wrote:
>
> The protect memory setting is a little different in the different SoCs.
> In the register REG_MMU_CTRL_REG(0x110), the TF_PROT(translation fault
> protect) shift bit is normally 4 while it shift 5 bits only in the
> mt8173. This patch delete the complex MACRO and use a common if-else
> instead.
>
> Also, use "F_MMU_TF_PROT_TO_PROGRAM_ADDR" instead of the hard code(2)
> which means the M4U will output the dirty data to the programmed
> address that we allocated dynamically when translation fault occurs.
>
> Signed-off-by: Yong Wu <[email protected]>
> ---
> @Nicalos, I don't put it in the plat_data since only the previous mt8173
> shift 5. As I know, the latest SoC always use the new setting like mt2712
> and mt8183. Thus, I think it is unnecessary to put it in plat_data and
> let all the latest SoC set it. Hence, I still keep "== mt8173" for this
> like the reg REG_MMU_CTRL_REG.
Should be ok this way. But maybe one way to avoid hard-coding 4/5
below is to have 2 macros:
#define F_MMU_TF_PROT_TO_PROGRAM_ADDR (2 << 4)
#define F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173 (2 << 5)
And still use the if below?
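A rough sketch of that idea, compilable outside the kernel: BIT() is
redefined locally, and mmu_ctrl_reg_value() with its is_mt8173 flag is a
hypothetical stand-in for the m4u_plat check in mtk_iommu_hw_init().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)					(1u << (n))
#define F_MMU_PREFETCH_RT_REPLACE_MOD		BIT(4)
#define F_MMU_TF_PROT_TO_PROGRAM_ADDR		(2u << 4)
#define F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173	(2u << 5)

/* Compute the value to write to REG_MMU_CTRL_REG(0x110). */
static uint32_t mmu_ctrl_reg_value(bool is_mt8173)
{
	/* On mt8173 bit 4 is taken, so the TF_PROT field must not overlap it. */
	assert((F_MMU_PREFETCH_RT_REPLACE_MOD &
		F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173) == 0);

	if (is_mt8173)
		return F_MMU_PREFETCH_RT_REPLACE_MOD |
		       F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173;
	return F_MMU_TF_PROT_TO_PROGRAM_ADDR;
}
```

This keeps the if-else but moves the 4/5 shift into the macro names, so the
call site no longer carries a bare shift count.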
> ---
> drivers/iommu/mtk_iommu.c | 12 +++++-------
> 1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index eca1536..35a1263 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -53,11 +53,7 @@
>
> #define REG_MMU_CTRL_REG 0x110
> #define F_MMU_PREFETCH_RT_REPLACE_MOD BIT(4)
> -#define F_MMU_TF_PROTECT_SEL_SHIFT(data) \
> - ((data)->plat_data->m4u_plat == M4U_MT2712 ? 4 : 5)
> -/* It's named by F_MMU_TF_PROT_SEL in mt2712. */
> -#define F_MMU_TF_PROTECT_SEL(prot, data) \
> - (((prot) & 0x3) << F_MMU_TF_PROTECT_SEL_SHIFT(data))
> +#define F_MMU_TF_PROT_TO_PROGRAM_ADDR 2
>
> #define REG_MMU_IVRP_PADDR 0x114
>
> @@ -521,9 +517,11 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> return ret;
> }
>
> - regval = F_MMU_TF_PROTECT_SEL(2, data);
> if (data->plat_data->m4u_plat == M4U_MT8173)
> - regval |= F_MMU_PREFETCH_RT_REPLACE_MOD;
> + regval = F_MMU_PREFETCH_RT_REPLACE_MOD |
> + (F_MMU_TF_PROT_TO_PROGRAM_ADDR << 5);
> + else
> + regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR << 4;
> writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
>
> regval = F_L2_MULIT_HIT_EN |
> --
> 1.9.1
>
On Tue, Jan 1, 2019 at 11:58 AM Yong Wu <[email protected]> wrote:
>
> In mt8173 and mt8183, 0x48 is REG_MMU_STANDARD_AXI_MODE while
> it is extended to REG_MMU_CTRL which contains _STANDARD_AXI_MODE in
> the other SoCs. I move this property to plat_data since both mt8173
> and mt8183 use this property.
>
> It is a preparing patch for mt8183.
>
> Signed-off-by: Yong Wu <[email protected]>
Reviewed-by: Nicolas Boichat <[email protected]>
> ---
> drivers/iommu/mtk_iommu.c | 4 ++--
> drivers/iommu/mtk_iommu.h | 2 +-
> 2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 35a1263..8d8ab21 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -558,8 +558,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> }
> writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
>
> - /* It's MISC control register whose default value is ok except mt8173.*/
> - if (data->plat_data->m4u_plat == M4U_MT8173)
> + if (data->plat_data->reset_axi)
> writel_relaxed(0, data->base + REG_MMU_STANDARD_AXI_MODE);
>
> if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0,
> @@ -749,6 +748,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
> .m4u_plat = M4U_MT8173,
> .has_4gb_mode = true,
> .has_bclk = true,
> + .reset_axi = true,
> .larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
> };
>
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index eec19a6..b46aeaa 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -47,7 +47,7 @@ struct mtk_iommu_plat_data {
>
> /* HW will use the EMI clock if there isn't the "bclk". */
> bool has_bclk;
> -
> + bool reset_axi;
> unsigned char larbid_remap[MTK_LARB_NR_MAX];
> };
>
> --
> 1.9.1
>
On Tue, Jan 1, 2019 at 11:58 AM Yong Wu <[email protected]> wrote:
>
> Both mt8173 and mt8183 don't have this vld_pa_rng(valid physical address
> range) register while mt2712 have. Move it into the plat_data.
>
> Signed-off-by: Yong Wu <[email protected]>
> ---
> drivers/iommu/mtk_iommu.c | 3 ++-
> drivers/iommu/mtk_iommu.h | 1 +
> 2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 8d8ab21..2913ddb 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -548,7 +548,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> upper_32_bits(data->protect_base);
> writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
>
> - if (data->enable_4GB && data->plat_data->m4u_plat != M4U_MT8173) {
> + if (data->enable_4GB && data->plat_data->vld_pa_rng) {
> /*
> * If 4GB mode is enabled, the validate PA range is from
> * 0x1_0000_0000 to 0x1_ffff_ffff. here record bit[32:30].
> @@ -741,6 +741,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
> .m4u_plat = M4U_MT2712,
> .has_4gb_mode = true,
> .has_bclk = true,
> + .vld_pa_rng = true,
> .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
> };
>
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index b46aeaa..a8c5d1e 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -48,6 +48,7 @@ struct mtk_iommu_plat_data {
> /* HW will use the EMI clock if there isn't the "bclk". */
> bool has_bclk;
> bool reset_axi;
> + bool vld_pa_rng;
Since this is not a register name, maybe we can use something more
readable, like valid_pa_range?
(or at the very least describe it in a comment in the struct?)
> unsigned char larbid_remap[MTK_LARB_NR_MAX];
> };
>
> --
> 1.9.1
>
On Tue, Jan 1, 2019 at 11:59 AM Yong Wu <[email protected]> wrote:
>
> The register VLD_PA_RNG(0x118) was forgot to backup while adding 4GB
> mode support for mt2712. this patch add it.
>
> Fixes: 30e2fccf9512 ("iommu/mediatek: Enlarge the validate PA range
> for 4GB mode")
> Signed-off-by: Yong Wu <[email protected]>
> ---
> drivers/iommu/mtk_iommu.c | 2 ++
> drivers/iommu/mtk_iommu.h | 1 +
> 2 files changed, 3 insertions(+)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 7fcef19..ddf1969 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -716,6 +716,7 @@ static int __maybe_unused mtk_iommu_suspend(struct device *dev)
> reg->int_control0 = readl_relaxed(base + REG_MMU_INT_CONTROL0);
> reg->int_main_control = readl_relaxed(base + REG_MMU_INT_MAIN_CONTROL);
> reg->ivrp_paddr = readl_relaxed(base + REG_MMU_IVRP_PADDR);
> + reg->vld_pa_range = readl_relaxed(base + REG_MMU_VLD_PA_RNG);
Don't we want to add:
if (data->plat_data->vld_pa_rng)
before this save/restore operation? Or it doesn't matter?
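Something like the following is what I have in mind, shown as a user-space
sketch with mock types: the structs are cut down to the one field that
matters, the register is modelled as a plain pointer instead of
readl_relaxed(), and save_vld_pa_rng() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reduced mock-ups of the driver structures, for illustration only. */
struct plat_data { bool vld_pa_rng; };
struct suspend_reg { uint32_t vld_pa_range; };

/* Back up VLD_PA_RNG only on SoCs whose plat_data says it exists. */
static void save_vld_pa_rng(const struct plat_data *plat,
			    struct suspend_reg *reg,
			    const volatile uint32_t *hw_reg)
{
	assert(plat && reg && hw_reg);
	if (plat->vld_pa_rng)
		reg->vld_pa_range = *hw_reg;
}
```

The restore path would mirror this with the same condition around the write.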
> clk_disable_unprepare(data->bclk);
> return 0;
> }
> @@ -740,6 +741,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
> writel_relaxed(reg->int_control0, base + REG_MMU_INT_CONTROL0);
> writel_relaxed(reg->int_main_control, base + REG_MMU_INT_MAIN_CONTROL);
> writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
> + writel_relaxed(reg->vld_pa_range, base + REG_MMU_VLD_PA_RNG);
> if (m4u_dom)
> writel(m4u_dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
> base + REG_MMU_PT_BASE_ADDR);
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index 0a7c463..c500bfd 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -33,6 +33,7 @@ struct mtk_iommu_suspend_reg {
> u32 int_control0;
> u32 int_main_control;
> u32 ivrp_paddr;
> + u32 vld_pa_range;
Well, please be consistent ,-) Either vld_pa_rng, or valid_pa_range ,-)
> };
>
> enum mtk_iommu_plat {
> --
> 1.9.1
>
On Wed, 2019-01-02 at 14:45 +0800, Nicolas Boichat wrote:
> On Tue, Jan 1, 2019 at 11:58 AM Yong Wu <[email protected]> wrote:
> >
> > Both mt8173 and mt8183 don't have this vld_pa_rng(valid physical address
> > range) register while mt2712 have. Move it into the plat_data.
> >
> > Signed-off-by: Yong Wu <[email protected]>
> > ---
> > drivers/iommu/mtk_iommu.c | 3 ++-
> > drivers/iommu/mtk_iommu.h | 1 +
> > 2 files changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 8d8ab21..2913ddb 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -548,7 +548,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> > upper_32_bits(data->protect_base);
> > writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
> >
> > - if (data->enable_4GB && data->plat_data->m4u_plat != M4U_MT8173) {
> > + if (data->enable_4GB && data->plat_data->vld_pa_rng) {
> > /*
> > * If 4GB mode is enabled, the validate PA range is from
> > * 0x1_0000_0000 to 0x1_ffff_ffff. here record bit[32:30].
> > @@ -741,6 +741,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
> > .m4u_plat = M4U_MT2712,
> > .has_4gb_mode = true,
> > .has_bclk = true,
> > + .vld_pa_rng = true,
> > .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
> > };
> >
> > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> > index b46aeaa..a8c5d1e 100644
> > --- a/drivers/iommu/mtk_iommu.h
> > +++ b/drivers/iommu/mtk_iommu.h
> > @@ -48,6 +48,7 @@ struct mtk_iommu_plat_data {
> > /* HW will use the EMI clock if there isn't the "bclk". */
> > bool has_bclk;
> > bool reset_axi;
> > + bool vld_pa_rng;
>
> Since this is not a register name, maybe we can use something more
> readable, like valid_pa_range?
>
> (or at the very least describe it in a comment in the struct?)
I will add a comment about it, like:
bool vld_pa_rng; /* valid pa range */
>
> > unsigned char larbid_remap[MTK_LARB_NR_MAX];
> > };
> >
> > --
> > 1.9.1
> >
On Wed, 2019-01-02 at 14:54 +0800, Nicolas Boichat wrote:
> On Tue, Jan 1, 2019 at 11:59 AM Yong Wu <[email protected]> wrote:
> >
> > The register VLD_PA_RNG(0x118) was forgot to backup while adding 4GB
> > mode support for mt2712. this patch add it.
> >
> > Fixes: 30e2fccf9512 ("iommu/mediatek: Enlarge the validate PA range
> > for 4GB mode")
> > Signed-off-by: Yong Wu <[email protected]>
> > ---
> > drivers/iommu/mtk_iommu.c | 2 ++
> > drivers/iommu/mtk_iommu.h | 1 +
> > 2 files changed, 3 insertions(+)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 7fcef19..ddf1969 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -716,6 +716,7 @@ static int __maybe_unused mtk_iommu_suspend(struct device *dev)
> > reg->int_control0 = readl_relaxed(base + REG_MMU_INT_CONTROL0);
> > reg->int_main_control = readl_relaxed(base + REG_MMU_INT_MAIN_CONTROL);
> > reg->ivrp_paddr = readl_relaxed(base + REG_MMU_IVRP_PADDR);
> > + reg->vld_pa_range = readl_relaxed(base + REG_MMU_VLD_PA_RNG);
>
> Don't we want to add:
> if (data->plat_data->vld_pa_rng)
> before this save/restore operation? Or it doesn't matter?
It doesn't matter. If some SoCs don't have it, the register doesn't
conflict with the others. Reading it will return 0, and writing 0 will
have no effect.
>
> > clk_disable_unprepare(data->bclk);
> > return 0;
> > }
> > @@ -740,6 +741,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
> > writel_relaxed(reg->int_control0, base + REG_MMU_INT_CONTROL0);
> > writel_relaxed(reg->int_main_control, base + REG_MMU_INT_MAIN_CONTROL);
> > writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
> > + writel_relaxed(reg->vld_pa_range, base + REG_MMU_VLD_PA_RNG);
> > if (m4u_dom)
> > writel(m4u_dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
> > base + REG_MMU_PT_BASE_ADDR);
> > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> > index 0a7c463..c500bfd 100644
> > --- a/drivers/iommu/mtk_iommu.h
> > +++ b/drivers/iommu/mtk_iommu.h
> > @@ -33,6 +33,7 @@ struct mtk_iommu_suspend_reg {
> > u32 int_control0;
> > u32 int_main_control;
> > u32 ivrp_paddr;
> > + u32 vld_pa_range;
>
> Well, please be consistent ,-) Either vld_pa_rng, or valid_pa_range ,-)
Thanks. I will use "vld_pa_rng", to keep it the same as the register
name from CODA.
>
> > };
> >
> > enum mtk_iommu_plat {
> > --
> > 1.9.1
> >
On Wed, 2019-01-02 at 14:23 +0800, Nicolas Boichat wrote:
> On Tue, Jan 1, 2019 at 11:58 AM Yong Wu <[email protected]> wrote:
> >
> > The protect memory setting is a little different in the different SoCs.
> > In the register REG_MMU_CTRL_REG(0x110), the TF_PROT(translation fault
> > protect) shift bit is normally 4 while it shift 5 bits only in the
> > mt8173. This patch delete the complex MACRO and use a common if-else
> > instead.
> >
> > Also, use "F_MMU_TF_PROT_TO_PROGRAM_ADDR" instead of the hard code(2)
> > which means the M4U will output the dirty data to the programmed
> > address that we allocated dynamically when translation fault occurs.
> >
> > Signed-off-by: Yong Wu <[email protected]>
> > ---
> > @Nicolas, I don't put it in the plat_data since only the previous mt8173
> > shift 5. As I know, the latest SoC always use the new setting like mt2712
> > and mt8183. Thus, I think it is unnecessary to put it in plat_data and
> > let all the latest SoC set it. Hence, I still keep "== mt8173" for this
> > like the reg REG_MMU_CTRL_REG.
>
> Should be ok this way. But maybe one way to avoid hard-coding 4/5
> below is to have 2 macros:
>
> #define F_MMU_TF_PROT_TO_PROGRAM_ADDR (2 << 4)
> #define F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173 (2 << 5)
>
> And still use the if below?
Thanks for your quick review.
OK for me.
I will wait for Matthias's review of the memory/ part, then send the next
version.
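For reference, the two-macro variant suggested above could look roughly like
the standalone sketch below. The helper name tf_prot_regval() and the
is_mt8173 flag are made up for illustration; in the driver the check is the
existing "m4u_plat == M4U_MT8173" comparison, and BIT() comes from the kernel
headers rather than a local define.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)					(1u << (n))
#define F_MMU_PREFETCH_RT_REPLACE_MOD		BIT(4)
/* Value 2 means "output to the programmed address"; only the shift differs. */
#define F_MMU_TF_PROT_TO_PROGRAM_ADDR		(2u << 4)
#define F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173	(2u << 5)

/* Hypothetical helper: is_mt8173 stands in for the m4u_plat comparison. */
static uint32_t tf_prot_regval(bool is_mt8173)
{
	if (is_mt8173)
		return F_MMU_PREFETCH_RT_REPLACE_MOD |
		       F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173;

	return F_MMU_TF_PROT_TO_PROGRAM_ADDR;
}
```

With this shape the if/else stays as in the patch, but the literal 4 and 5 no
longer appear in mtk_iommu_hw_init().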
>
> > ---
> > drivers/iommu/mtk_iommu.c | 12 +++++-------
> > 1 file changed, 5 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index eca1536..35a1263 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -53,11 +53,7 @@
> >
> > #define REG_MMU_CTRL_REG 0x110
> > #define F_MMU_PREFETCH_RT_REPLACE_MOD BIT(4)
> > -#define F_MMU_TF_PROTECT_SEL_SHIFT(data) \
> > - ((data)->plat_data->m4u_plat == M4U_MT2712 ? 4 : 5)
> > -/* It's named by F_MMU_TF_PROT_SEL in mt2712. */
> > -#define F_MMU_TF_PROTECT_SEL(prot, data) \
> > - (((prot) & 0x3) << F_MMU_TF_PROTECT_SEL_SHIFT(data))
> > +#define F_MMU_TF_PROT_TO_PROGRAM_ADDR 2
> >
> > #define REG_MMU_IVRP_PADDR 0x114
> >
> > @@ -521,9 +517,11 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> > return ret;
> > }
> >
> > - regval = F_MMU_TF_PROTECT_SEL(2, data);
> > if (data->plat_data->m4u_plat == M4U_MT8173)
> > - regval |= F_MMU_PREFETCH_RT_REPLACE_MOD;
> > + regval = F_MMU_PREFETCH_RT_REPLACE_MOD |
> > + (F_MMU_TF_PROT_TO_PROGRAM_ADDR << 5);
> > + else
> > + regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR << 4;
> > writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
> >
> > regval = F_L2_MULIT_HIT_EN |
> > --
> > 1.9.1
> >
Hi Matthias,
A gentle ping about this and the other mtk-smi parts in this patchset.
The memory part doesn't have its own maintainer, so we normally need the
SoC maintainer's help. Thus, for this file (memory/mtk-smi.c), your review
is needed before Joerg accepts it.
Thanks in advance.
On Tue, 2019-01-01 at 11:55 +0800, Yong Wu wrote:
> In some SoCs like mt8183, SMI add GALS(Global Async Local Sync) module
> which can help synchronize for the modules in different clock frequency.
> It can be seen as a "asynchronous fifo". This is a example diagram:
>
> M4U
> |
> ----------
> | |
> gals0-rx gals1-rx
> | |
> | |
> gals0-tx gals1-tx
> | |
> ------------
> SMI Common
> ------------
> |
> +-----+--------+-----+- ...
> | | | |
> | gals-rx gals-rx |
> | | | |
> | | | |
> | gals-tx gals-tx |
> | | | |
> larb1 larb2 larb3 larb4
>
> GALS only help transfer the command/data while it doesn't have the
> configuring register, thus it has the special "smi" clock and doesn't
> have the "apb" clock. From the diagram above, we add "gals0" and
> "gals1" clocks for smi-common and add a "gals" clock for smi-larb.
>
> This patch adds gals clock supporting in the SMI. Note that some larbs
> may still don't have the "gals" clock like larb1 and larb4 above.
>
> This is also a preparing patch for mt8183 which has GALS.
>
> Signed-off-by: Yong Wu <[email protected]>
> ---
> drivers/memory/mtk-smi.c | 36 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
[...]
On Mon, Dec 31, 2018 at 7:57 PM Yong Wu <[email protected]> wrote:
>
> Add two helper functions: paddr_to_iopte and iopte_to_paddr.
>
> Signed-off-by: Yong Wu <[email protected]>
> Reviewed-by: Robin Murphy <[email protected]>
Reviewed-by: Evan Green <[email protected]>
On Mon, Dec 31, 2018 at 7:57 PM Yong Wu <[email protected]> wrote:
>
> The config_port of mt2712 and mt8183 are the same. Use a general
> config_port interface instead.
>
> In addition, in mt2712, larb8 and larb9 are the bdpsys larbs which
> are not the normal larb, their register space are different from the
> normal one. thus, we can not call the general config_port. In mt8183,
> IPU0/1 and CCU connect with smi-common directly, they also are not
> the normal larb. Hence, we add a "larb_direct_to_common_mask" for these
> larbs which connect to smi-common directly.
>
> This is also a preparing patch for adding mt8183 SMI support.
>
> Signed-off-by: Yong Wu <[email protected]>
> Reviewed-by: Matthias Brugger <[email protected]>
Reviewed-by: Evan Green <[email protected]>
On Mon, Dec 31, 2018 at 7:57 PM Yong Wu <[email protected]> wrote:
>
> Use a struct as the platform special data instead of the enumeration.
>
> Also there is a minor change that moving the position of
> "enum mtk_smi_gen" definition, this is because we expect define
> "struct mtk_smi_common_plat" before it is referred.
>
> This is a preparing patch for mt8183.
>
> Signed-off-by: Yong Wu <[email protected]>
> Reviewed-by: Matthias Brugger <[email protected]>
Reviewed-by: Evan Green <[email protected]>
On Mon, Dec 31, 2018 at 7:57 PM Yong Wu <[email protected]> wrote:
>
> MediaTek extend the arm v7s descriptor to support the dram over 4GB.
>
> In the mt2712 and mt8173, it's called "4GB mode", the physical address
> is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> bit32 is always enabled. thus, in the M4U, we always enable the bit9
> for all PTEs which means to enable bit32 of physical address.
I got a little lost here. I get that you're trying to explain why you
always used to set bit32 of the physical address. But I don't totally
get the part about physical addresses being from 0x4000_0000 -
0x1_3fff_ffff, but also from 0x1_0000_0000-0x1_ffff_ffff. Are you
saying that the physical addresses from the iommu's perspective were
always >0x1_0000_0000? But then from whose perspective is it
0x4000_0000? ... oh, or you're saying there was some sort of remapping
facility that moved the physical addresses around?
>
> but in mt8183, M4U support the dram from 0x4000_0000 to 0x3_ffff_ffff
> which isn't remapped. We extend the PTEs: the bit9 represent bit32 of
> PA and the bit4 represent bit33 of PA. Meanwhile the iova still is
> 32bits.
>
> In order to unify code, in the "4GB mode", we add the bit32 for the
> physical address manually in our driver.
>
> Correspondingly, Adding bit32 and bit33 for the PA in the iova_to_phys
> has to be moved into v7s.
>
> Regarding whether the pagetable address could be over 4GB, the mt8183
> support it while the previous mt8173 don't. thus keep it as is.
>
> Signed-off-by: Yong Wu <[email protected]>
> Reviewed-by: Robin Murphy <[email protected]>
> ---
> drivers/iommu/io-pgtable-arm-v7s.c | 31 ++++++++++++++++++++++++-------
> drivers/iommu/io-pgtable.h | 7 +++----
> drivers/iommu/mtk_iommu.c | 14 ++++++++------
> drivers/iommu/mtk_iommu.h | 1 +
> 4 files changed, 36 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> index 11d8505..8803a35 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -124,7 +124,9 @@
> #define ARM_V7S_TEX_MASK 0x7
> #define ARM_V7S_ATTR_TEX(val) (((val) & ARM_V7S_TEX_MASK) << ARM_V7S_TEX_SHIFT)
>
> -#define ARM_V7S_ATTR_MTK_4GB BIT(9) /* MTK extend it for 4GB mode */
> +/* MediaTek extend the two bits below for over 4GB mode */
> +#define ARM_V7S_ATTR_MTK_PA_BIT32 BIT(9)
> +#define ARM_V7S_ATTR_MTK_PA_BIT33 BIT(4)
If other vendors start doing stuff like this we'll need a more generic
way to handle this... but I guess until we see a pattern this is okay.
>
> /* *well, except for TEX on level 2 large pages, of course :( */
> #define ARM_V7S_CONT_PAGE_TEX_SHIFT 6
> @@ -183,13 +185,22 @@ static dma_addr_t __arm_v7s_dma_addr(void *pages)
> static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
> struct io_pgtable_cfg *cfg)
> {
> - return paddr & ARM_V7S_LVL_MASK(lvl);
> + arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
> +
> + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> + if (paddr & BIT_ULL(32))
> + pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
> + if (paddr & BIT_ULL(33))
> + pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
> + }
> + return pte;
> }
>
> static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> struct io_pgtable_cfg *cfg)
> {
> arm_v7s_iopte mask;
> + phys_addr_t paddr;
>
> if (ARM_V7S_PTE_IS_TABLE(pte, lvl))
> mask = ARM_V7S_TABLE_MASK;
> @@ -198,7 +209,14 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> else
> mask = ARM_V7S_LVL_MASK(lvl);
>
> - return pte & mask;
> + paddr = pte & mask;
> + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> + if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> + paddr |= BIT_ULL(32);
> + if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
> + paddr |= BIT_ULL(33);
> + }
> + return paddr;
> }
>
> static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl,
> @@ -315,9 +333,6 @@ static arm_v7s_iopte arm_v7s_prot_to_pte(int prot, int lvl,
> if (lvl == 1 && (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS))
> pte |= ARM_V7S_ATTR_NS_SECTION;
>
> - if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)
> - pte |= ARM_V7S_ATTR_MTK_4GB;
> -
So despite getting lost in the details, I guess the reason it's okay
that this goes from unconditional to conditional on bit32 is that
before, with the older chips, all physical addresses were above 4GB,
so we'll always see PA's above 4GB?
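To make the conditional encoding concrete, here is a small userspace sketch of
the round trip between a 34-bit PA and a 32-bit PTE, based only on the two
helpers quoted above. The sketch_ names and the fixed 4KB page mask are
simplifications of mine: the real helpers take lvl and cfg, use
ARM_V7S_LVL_MASK(lvl), and only do this under the MTK_4GB quirk.

```c
#include <stdint.h>

#define BIT(n)				(1u << (n))
#define BIT_ULL(n)			(1ull << (n))

#define ARM_V7S_ATTR_MTK_PA_BIT32	BIT(9)
#define ARM_V7S_ATTR_MTK_PA_BIT33	BIT(4)

/* Simplification: a 4KB small-page mask stands in for ARM_V7S_LVL_MASK(lvl). */
#define SKETCH_PAGE_MASK		0xfffff000u

/* Fold PA bits 32 and 33 into PTE bits 9 and 4. */
static uint32_t sketch_paddr_to_iopte(uint64_t paddr)
{
	uint32_t pte = (uint32_t)paddr & SKETCH_PAGE_MASK;

	if (paddr & BIT_ULL(32))
		pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
	if (paddr & BIT_ULL(33))
		pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
	return pte;
}

/* Recover the 34-bit PA from the PTE. */
static uint64_t sketch_iopte_to_paddr(uint32_t pte)
{
	uint64_t paddr = pte & SKETCH_PAGE_MASK;

	if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
		paddr |= BIT_ULL(32);
	if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
		paddr |= BIT_ULL(33);
	return paddr;
}
```

A PA below 4GB produces a PTE with bit9 clear, so bit9 is only set
unconditionally if the caller has already ORed bit32 into the PA, which is
what the mtk_iommu_map() hunk further down does for "4GB mode".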
> return pte;
> }
>
> @@ -504,7 +519,9 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
> if (!(prot & (IOMMU_READ | IOMMU_WRITE)))
> return 0;
>
> - if (WARN_ON(upper_32_bits(iova) || upper_32_bits(paddr)))
> + if (WARN_ON(upper_32_bits(iova)) ||
> + WARN_ON(upper_32_bits(paddr) &&
> + !(iop->cfg.quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)))
> return -ERANGE;
>
> ret = __arm_v7s_map(data, iova, paddr, size, prot, 1, data->pgd);
> diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
> index 47d5ae5..69db115 100644
> --- a/drivers/iommu/io-pgtable.h
> +++ b/drivers/iommu/io-pgtable.h
> @@ -62,10 +62,9 @@ struct io_pgtable_cfg {
> * (unmapped) entries but the hardware might do so anyway, perform
> * TLB maintenance when mapping as well as when unmapping.
> *
> - * IO_PGTABLE_QUIRK_ARM_MTK_4GB: (ARM v7s format) Set bit 9 in all
> - * PTEs, for Mediatek IOMMUs which treat it as a 33rd address bit
> - * when the SoC is in "4GB mode" and they can only access the high
> - * remap of DRAM (0x1_00000000 to 0x1_ffffffff).
> + * IO_PGTABLE_QUIRK_ARM_MTK_4GB: (ARM v7s format) MediaTek IOMMUs extend
> + * to support up to 34 bits PA where the bit32 and bit33 are
> + * encoded in the bit9 and bit4 of the PTE respectively.
> *
> * IO_PGTABLE_QUIRK_NO_DMA: Guarantees that the tables will only ever
> * be accessed by a fully cache-coherent IOMMU or CPU (e.g. for a
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 189d1b5..ae1aa5a 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -367,12 +367,16 @@ static int mtk_iommu_map(struct iommu_domain *domain, unsigned long iova,
> phys_addr_t paddr, size_t size, int prot)
> {
> struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> + struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
> unsigned long flags;
> int ret;
>
> + /* The "4GB mode" M4U physically can not use the lower remap of Dram. */
> + if (data->plat_data->has_4gb_mode && data->enable_4GB)
> + paddr |= BIT_ULL(32);
> +
Ok here's where I get lost. How is this okay? Is the same physical RAM
accessible at multiple locations in the physical address space? Won't
this map an iova to a different pa than the one requested?
Also, you could have rolled the has_4gb_mode check into whether or not
you set enable_4GB. Then you're doing the check for has_4gb_mode once,
rather than on every map call.
-Evan
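Going only by the ranges given in the commit message, the arithmetic of that
"paddr |= BIT_ULL(32)" line can be sketched as below. This does not say
whether the hardware really aliases the same DRAM at both addresses; it only
shows which address ends up being handed to the page table code. The function
name is hypothetical.

```c
#include <stdint.h>

#define BIT_ULL(n)	(1ull << (n))

/* Hypothetical helper mirroring "paddr |= BIT_ULL(32)" from the quoted hunk. */
static uint64_t sketch_4gb_mode_pa(uint64_t cpu_pa)
{
	return cpu_pa | BIT_ULL(32);
}
```

With the CPU-side range 0x4000_0000..0x1_3fff_ffff from the commit message,
the part below 4GB moves up to 0x1_4000_0000..0x1_ffff_ffff and the part at or
above 4GB is unchanged, so every result lands in 0x1_0000_0000..0x1_ffff_ffff.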
On Mon, Dec 31, 2018 at 7:56 PM Yong Wu <[email protected]> wrote:
>
> Use a struct as the platform special data instead of the enumeration.
> This is a prepare patch for adding mt8183 iommu support.
>
> Signed-off-by: Yong Wu <[email protected]>
> Reviewed-by: Matthias Brugger <[email protected]>
Reviewed-by: Evan Green <[email protected]>
On Mon, Dec 31, 2018 at 7:57 PM Yong Wu <[email protected]> wrote:
>
> In some SoCs, M4U doesn't have its "bclk", it will use the EMI
> clock instead which has always been enabled when entering kernel.
>
> This also is a preparing patch for mt8183.
>
> Signed-off-by: Yong Wu <[email protected]>
Reviewed-by: Evan Green <[email protected]>
On Mon, Dec 31, 2018 at 7:58 PM Yong Wu <[email protected]> wrote:
>
> The larb-id may be remapped in the smi-common, this means the
> larb-id reported in the mtk_iommu_isr isn't the real larb-id,
>
> Take mt8183 as a example:
> M4U
> |
> ---------------------------------------------
> | SMI common |
> -0-----7-----5-----6-----1-----2------3-----4- <- Id remapped
> | | | | | | | |
> larb0 larb1 IPU0 IPU1 larb4 larb5 larb6 CCU
> disp vdec img cam venc img cam
> As above, larb0 connects with the id 0 in smi-common.
> larb1 connects with the id 7 in smi-common.
> ...
> If the larb-id reported in the isr is 7, actually it's larb1(vdec).
> In order to output the right larb-id in the isr, we add a larb-id
> remapping relationship in this patch.
>
> If there is no this larb-id remapping in some SoCs, use the linear
> mapping array instead.
>
> This also is a preparing patch for mt8183.
>
> Signed-off-by: Yong Wu <[email protected]>
> ---
> drivers/iommu/mtk_iommu.c | 4 ++++
> drivers/iommu/mtk_iommu.h | 2 ++
> 2 files changed, 6 insertions(+)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 847082c..eca1536 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -220,6 +220,8 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
> fault_larb = F_MMU0_INT_ID_LARB_ID(regval);
> fault_port = F_MMU0_INT_ID_PORT_ID(regval);
>
> + fault_larb = data->plat_data->larbid_remap[fault_larb];
> +
> if (report_iommu_fault(&dom->domain, data->dev, fault_iova,
> write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) {
> dev_err_ratelimited(
> @@ -742,12 +744,14 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
> .m4u_plat = M4U_MT2712,
> .has_4gb_mode = true,
> .has_bclk = true,
> + .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
> };
>
> static const struct mtk_iommu_plat_data mt8173_data = {
> .m4u_plat = M4U_MT8173,
> .has_4gb_mode = true,
> .has_bclk = true,
> + .larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
I briefly considered bikeshedding about how to define these arrays in
a way that might save memory for linear-map devices, but then decided
this is fine.
Reviewed-by: Evan Green <[email protected]>
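For readers following along, a non-linear table would work as in the sketch
below. The array contents are derived from the "Id remapped" row of the
diagram in the commit message (with IPU0, IPU1 and CCU counted as larb2, larb3
and larb7), so treat them as illustrative rather than as the values in the
mt8183 patch. The bounds check and the sketch_ names are my additions.

```c
#include <stdint.h>

#define SKETCH_LARB_NR	8

/*
 * Indexed by the larb id reported in the ISR; the value is the real larb
 * index. Derived from the diagram, so illustrative only.
 */
static const uint8_t sketch_mt8183_larbid_remap[SKETCH_LARB_NR] = {
	0, 4, 5, 6, 7, 2, 3, 1
};

/* Returns the real larb index, or -1 if the reported id is out of range. */
static int sketch_real_larb(unsigned int reported_id)
{
	if (reported_id >= SKETCH_LARB_NR)
		return -1;
	return sketch_mt8183_larbid_remap[reported_id];
}
```

This matches the example in the commit message: a reported id of 7 resolves
to larb1 (vdec).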
On Mon, Dec 31, 2018 at 7:58 PM Yong Wu <[email protected]> wrote:
>
> Both mt8173 and mt8183 don't have this vld_pa_rng(valid physical address
> range) register while mt2712 have. Move it into the plat_data.
>
> Signed-off-by: Yong Wu <[email protected]>
> ---
> drivers/iommu/mtk_iommu.c | 3 ++-
> drivers/iommu/mtk_iommu.h | 1 +
> 2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 8d8ab21..2913ddb 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -548,7 +548,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> upper_32_bits(data->protect_base);
> writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
>
> - if (data->enable_4GB && data->plat_data->m4u_plat != M4U_MT8173) {
> + if (data->enable_4GB && data->plat_data->vld_pa_rng) {
> /*
> * If 4GB mode is enabled, the validate PA range is from
> * 0x1_0000_0000 to 0x1_ffff_ffff. here record bit[32:30].
> @@ -741,6 +741,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
> .m4u_plat = M4U_MT2712,
> .has_4gb_mode = true,
> .has_bclk = true,
> + .vld_pa_rng = true,
> .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
> };
>
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index b46aeaa..a8c5d1e 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -48,6 +48,7 @@ struct mtk_iommu_plat_data {
> /* HW will use the EMI clock if there isn't the "bclk". */
> bool has_bclk;
> bool reset_axi;
> + bool vld_pa_rng;
I agree with Nicolas that valid_pa_range would be much clearer...
although, even now that I know what it's supposed to mean, I don't get
what it represents. What is this saying?
-Evan
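My reading of the comment in the quoted hunk, which may well be incomplete, is
that the register records bit[32:30] of the first and last valid physical
addresses. A tiny sketch of just that extraction follows. How the two values
are packed into the register is not shown in this hunk, so it is left out, and
the helper name is hypothetical.

```c
#include <stdint.h>

/* Extract bit[32:30] of a physical address, as the quoted comment describes. */
static unsigned int sketch_pa_bits_32_30(uint64_t pa)
{
	return (unsigned int)((pa >> 30) & 0x7);
}
```

For the 4GB-mode range 0x1_0000_0000 to 0x1_ffff_ffff this gives 4 for the
start and 7 for the end.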
On Mon, Dec 31, 2018 at 7:56 PM Yong Wu <[email protected]> wrote:
>
> This patch adds descriptions for mt8183 IOMMU and SMI.
>
> mt8183 has only one M4U like mt8173 and is also MTK IOMMU gen2 which
> uses ARM Short-Descriptor translation table format.
>
> The mt8183 M4U-SMI HW diagram is as below:
>
> EMI
> |
> M4U
> |
> ----------
> | |
> gals0-rx gals1-rx
> | |
> | |
> gals0-tx gals1-tx
> | |
> ------------
> SMI Common
> ------------
> |
> +-----+-----+--------+-----+-----+-------+-------+
> | | | | | | | |
> | | gals-rx gals-rx | gals-rx gals-rx gals-rx
> | | | | | | | |
> | | | | | | | |
> | | gals-tx gals-tx | gals-tx gals-tx gals-tx
> | | | | | | | |
> larb0 larb1 IPU0 IPU1 larb4 larb5 larb6 CCU
> disp vdec img cam venc img cam
It might be cool to put the gals in the picture in the bindings. Not a
big deal though.
>
> All the connections are HW fixed, SW can NOT adjust it.
>
> Compared with mt8173, we add a GALS(Global Async Local Sync) module
> between SMI-common and M4U, and additional GALS between larb2/3/5/6
> and SMI-common. GALS can help synchronize for the modules in different
> clock frequency, it can be seen as a "asynchronous fifo".
>
> GALS can only help transfer the command/data while it doesn't have
> the configuring register, thus it has the special "smi" clock and it
> doesn't have the "apb" clock. From the diagram above, we add "gals0"
> and "gals1" clocks for smi-common and add a "gals" clock for smi-larb.
>
> From the diagram above, IPU0/IPU1(Image Processor Unit) and CCU(Camera
> Control Unit) is connected with smi-common directly, we can take them
> as "larb2", "larb3" and "larb7", and their register spaces are
> different with the normal larb.
>
> Signed-off-by: Yong Wu <[email protected]>
> Reviewed-by: Rob Herring <[email protected]>
> ---
> .../devicetree/bindings/iommu/mediatek,iommu.txt | 15 ++-
> .../memory-controllers/mediatek,smi-common.txt | 11 +-
> .../memory-controllers/mediatek,smi-larb.txt | 3 +
> include/dt-bindings/memory/mt8183-larb-port.h | 130 +++++++++++++++++++++
> 4 files changed, 153 insertions(+), 6 deletions(-)
> create mode 100644 include/dt-bindings/memory/mt8183-larb-port.h
>
> diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
> index 6922db5..6e758996 100644
> --- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
> +++ b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
> @@ -36,6 +36,10 @@ each local arbiter.
> like display, video decode, and camera. And there are different ports
> in each larb. Take a example, There are many ports like MC, PP, VLD in the
> video decode local arbiter, all these ports are according to the video HW.
> + In some SoCs, there may be a GALS(Global Async Local Sync) module between
> +smi-common and m4u, and additional GALS module between smi-larb and
> +smi-common. GALS can been seen as a "asynchronous fifo" which could help
> +synchronize for the modules in different clock frequency.
>
> Required properties:
> - compatible : must be one of the following string:
> @@ -44,18 +48,23 @@ Required properties:
> "mediatek,mt7623-m4u", "mediatek,mt2701-m4u" for mt7623 which uses
> generation one m4u HW.
> "mediatek,mt8173-m4u" for mt8173 which uses generation two m4u HW.
> + "mediatek,mt8183-m4u" for mt8183 which uses generation two m4u HW.
> - reg : m4u register base and size.
> - interrupts : the interrupt of m4u.
> - clocks : must contain one entry for each clock-names.
> -- clock-names : must be "bclk", It is the block clock of m4u.
> +- clock-names : Only 1 optional clock:
> + - "bclk": the block clock of m4u.
> + Note that m4u use the EMI clock which always has been enabled before kernel
> + if there is no this "bclk".
Ideally bclk could be specified a little more crisply, as this is
actually required for some SoCs and not used at all on others (as in
patch 7).
> - mediatek,larbs : List of phandle to the local arbiters in the current Socs.
> Refer to bindings/memory-controllers/mediatek,smi-larb.txt. It must sort
> according to the local arbiter index, like larb0, larb1, larb2...
> - iommu-cells : must be 1. This is the mtk_m4u_id according to the HW.
> Specifies the mtk_m4u_id as defined in
> dt-binding/memory/mt2701-larb-port.h for mt2701, mt7623
> - dt-binding/memory/mt2712-larb-port.h for mt2712, and
> - dt-binding/memory/mt8173-larb-port.h for mt8173.
> + dt-binding/memory/mt2712-larb-port.h for mt2712,
> + dt-binding/memory/mt8173-larb-port.h for mt8173, and
> + dt-binding/memory/mt8183-larb-port.h for mt8183.
>
> Example:
> iommu: iommu@10205000 {
> diff --git a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
> index e937ddd..8d3240a 100644
> --- a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
> +++ b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
> @@ -2,9 +2,10 @@ SMI (Smart Multimedia Interface) Common
>
> The hardware block diagram please check bindings/iommu/mediatek,iommu.txt
>
> -Mediatek SMI have two generations of HW architecture, mt2712 and mt8173 use
> -the second generation of SMI HW while mt2701 uses the first generation HW of
> -SMI.
> +Mediatek SMI have two generations of HW architecture, here is the list
> +which generation the Socs use:
> +generation 1: mt2701 and mt7623.
> +generation 2: mt2712, mt8173 and mt8183.
>
> There's slight differences between the two SMI, for generation 2, the
> register which control the iommu port is at each larb's register base. But
> @@ -19,6 +20,7 @@ Required properties:
> "mediatek,mt2712-smi-common"
> "mediatek,mt7623-smi-common", "mediatek,mt2701-smi-common"
> "mediatek,mt8173-smi-common"
> + "mediatek,mt8183-smi-common"
> - reg : the register and size of the SMI block.
> - power-domains : a phandle to the power domain of this local arbiter.
> - clocks : Must contain an entry for each entry in clock-names.
> @@ -30,6 +32,9 @@ Required properties:
> They may be the same if both source clocks are the same.
> - "async" : asynchronous clock, it help transform the smi clock into the emi
> clock domain, this clock is only needed by generation 1 smi HW.
> + and these 2 option clocks for generation 2 smi HW:
> + - "gals0": the path0 clock of GALS(Global Async Local Sync).
> + - "gals1": the path1 clock of GALS(Global Async Local Sync).
It would be nice to specify these more clearly too, since these are also
required for some SoCs and not used on others (as in patch 12).
On Mon, Dec 31, 2018 at 7:58 PM Yong Wu <[email protected]> wrote:
>
> In mt8173 and mt8183, 0x48 is REG_MMU_STANDARD_AXI_MODE while
> it is extended to REG_MMU_CTRL which contains _STANDARD_AXI_MODE in
> the other SoCs. I move this property to plat_data since both mt8173
> and mt8183 use this property.
>
> It is a preparing patch for mt8183.
>
> Signed-off-by: Yong Wu <[email protected]>
> ---
> drivers/iommu/mtk_iommu.c | 4 ++--
> drivers/iommu/mtk_iommu.h | 2 +-
> 2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 35a1263..8d8ab21 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -558,8 +558,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> }
> writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
>
> - /* It's MISC control register whose default value is ok except mt8173.*/
> - if (data->plat_data->m4u_plat == M4U_MT8173)
> + if (data->plat_data->reset_axi)
> writel_relaxed(0, data->base + REG_MMU_STANDARD_AXI_MODE);
The commit description makes it sound like the overall format of the
register is the same, but the "other SoCs" have some extra bits they'd
like to leave alone. Would it be easier to do a read-modify-write to
always clear some bits in the register, instead of doing something
based on the SoC? Or do the bits mean completely different things in
the different versions (in which case what you've got makes sense to
me)?
-Evan
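If the layouts do turn out to be compatible, the read-modify-write alternative
would be along these lines. Everything here is a stand-in: the bit position is
invented because the thread does not say where STANDARD_AXI_MODE sits inside
the wider register, and the fake register replaces readl_relaxed() and
writel_relaxed() so the sketch can run in userspace.

```c
#include <stdint.h>

/* Hypothetical bit position; the thread does not give the real one. */
#define SKETCH_F_STANDARD_AXI_MODE	(1u << 3)

/* Fake register standing in for the MMIO location at offset 0x48. */
static uint32_t sketch_reg;

static uint32_t sketch_readl(void)
{
	return sketch_reg;
}

static void sketch_writel(uint32_t val)
{
	sketch_reg = val;
}

/* Clear only the AXI-mode bit and leave the other bits untouched. */
static void sketch_clear_axi_mode(void)
{
	uint32_t val = sketch_readl();

	val &= ~SKETCH_F_STANDARD_AXI_MODE;
	sketch_writel(val);
}
```

If the bits mean different things per SoC, this does not apply and the
plat_data flag from the patch is the simpler choice.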
On Mon, Dec 31, 2018 at 7:58 PM Yong Wu <[email protected]> wrote:
>
> The protect memory setting is a little different in the different SoCs.
> In the register REG_MMU_CTRL_REG(0x110), the TF_PROT(translation fault
> protect) shift bit is normally 4 while it shift 5 bits only in the
> mt8173. This patch delete the complex MACRO and use a common if-else
> instead.
>
> Also, use "F_MMU_TF_PROT_TO_PROGRAM_ADDR" instead of the hard code(2)
> which means the M4U will output the dirty data to the programmed
> address that we allocated dynamically when translation fault occurs.
>
> Signed-off-by: Yong Wu <[email protected]>
> ---
> @Nicolas, I don't put it in the plat_data since only the previous mt8173
> shift 5. As I know, the latest SoC always use the new setting like mt2712
> and mt8183. Thus, I think it is unnecessary to put it in plat_data and
> let all the latest SoC set it. Hence, I still keep "== mt8173" for this
> like the reg REG_MMU_CTRL_REG.
> ---
> drivers/iommu/mtk_iommu.c | 12 +++++-------
> 1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index eca1536..35a1263 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -53,11 +53,7 @@
>
> #define REG_MMU_CTRL_REG 0x110
> #define F_MMU_PREFETCH_RT_REPLACE_MOD BIT(4)
> -#define F_MMU_TF_PROTECT_SEL_SHIFT(data) \
> - ((data)->plat_data->m4u_plat == M4U_MT2712 ? 4 : 5)
> -/* It's named by F_MMU_TF_PROT_SEL in mt2712. */
> -#define F_MMU_TF_PROTECT_SEL(prot, data) \
> - (((prot) & 0x3) << F_MMU_TF_PROTECT_SEL_SHIFT(data))
> +#define F_MMU_TF_PROT_TO_PROGRAM_ADDR 2
>
> #define REG_MMU_IVRP_PADDR 0x114
>
> @@ -521,9 +517,11 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> return ret;
> }
>
> - regval = F_MMU_TF_PROTECT_SEL(2, data);
> if (data->plat_data->m4u_plat == M4U_MT8173)
> - regval |= F_MMU_PREFETCH_RT_REPLACE_MOD;
> + regval = F_MMU_PREFETCH_RT_REPLACE_MOD |
> + (F_MMU_TF_PROT_TO_PROGRAM_ADDR << 5);
> + else
> + regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR << 4;
I agree with Nicolas with regard to not having the random 4 and 5
sprinkled in the function.
-Evan
On Mon, Dec 31, 2018 at 7:59 PM Yong Wu <[email protected]> wrote:
>
> The M4U IP blocks in mt8183 is MediaTek's generation2 M4U which use
> the ARM Short-descriptor like mt8173, and most of the HW registers
> are the same.
>
> Here list main differences between mt8183 and mt8173/mt2712:
> 1) mt8183 has only one M4U HW like mt8173 while mt2712 has two.
> 2) mt8183 don't have the "bclk" clock, it use the EMI clock instead.
> 3) mt8183 can support the dram over 4GB, but it doesn't call this "4GB
> mode".
> 4) mt8183 pgtable base register(0x0) extend bit[1:0] which represent
> the bit[33:32] in the physical address of the pgtable base, But the
> standard ttbr0[1] means the S bit which is enabled defaultly, Hence,
> we add a mask.
> 5) mt8183 HW has a GALS modules, SMI should enable "has_gals" support.
> 6) mt8183 need reset_axi like mt8173.
> 7) the larb-id in smi-common is remapped. M4U should add its larbid_remap.
>
> Signed-off-by: Yong Wu <[email protected]>
> ---
> drivers/iommu/mtk_iommu.c | 15 ++++++++++++---
> drivers/iommu/mtk_iommu.h | 1 +
> drivers/memory/mtk-smi.c | 20 ++++++++++++++++++++
> 3 files changed, 33 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 2913ddb..66e3615 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -36,6 +36,7 @@
> #include "mtk_iommu.h"
>
> #define REG_MMU_PT_BASE_ADDR 0x000
> +#define MMU_PT_ADDR_MASK GENMASK(31, 7)
>
> #define REG_MMU_INVALIDATE 0x020
> #define F_ALL_INVLD 0x2
> @@ -342,7 +343,7 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
> /* Update the pgtable base address register of the M4U HW */
> if (!data->m4u_dom) {
> data->m4u_dom = dom;
> - writel(dom->cfg.arm_v7s_cfg.ttbr[0],
> + writel(dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
So there aren't any other bits down below 7 that you need, like the
shareable bits?
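To show what the mask actually drops, here is a userspace sketch. GENMASK()
is a kernel macro, so a local 32-bit equivalent is defined under a sketch_
name, and it assumes a 32-bit unsigned int. The sample ttbr value in the usage
note is arbitrary.

```c
#include <stdint.h>

/* Local 32-bit stand-in for the kernel's GENMASK(h, l). */
#define SKETCH_GENMASK32(h, l) \
	((~0u << (l)) & (~0u >> (31 - (h))))

#define MMU_PT_ADDR_MASK	SKETCH_GENMASK32(31, 7)

/* Keep only bits [31:7] of the value written to REG_MMU_PT_BASE_ADDR. */
static uint32_t sketch_pt_base(uint32_t ttbr0)
{
	return ttbr0 & MMU_PT_ADDR_MASK;
}
```

For example, a ttbr0 of 0x4000406a becomes 0x40004000, so all of bits [6:0],
including the S bit mentioned in the commit message, are cleared before the
write.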
On Mon, Dec 31, 2018 at 7:58 PM Yong Wu <[email protected]> wrote:
>
> In some SoCs like mt8183, SMI add GALS(Global Async Local Sync) module
> which can help synchronize for the modules in different clock frequency.
> It can be seen as a "asynchronous fifo". This is a example diagram:
>
> M4U
> |
> ----------
> | |
> gals0-rx gals1-rx
> | |
> | |
> gals0-tx gals1-tx
> | |
> ------------
> SMI Common
> ------------
> |
> +-----+--------+-----+- ...
> | | | |
> | gals-rx gals-rx |
> | | | |
> | | | |
> | gals-tx gals-tx |
> | | | |
> larb1 larb2 larb3 larb4
>
> GALS only help transfer the command/data while it doesn't have the
> configuring register, thus it has the special "smi" clock and doesn't
> have the "apb" clock. From the diagram above, we add "gals0" and
> "gals1" clocks for smi-common and add a "gals" clock for smi-larb.
>
> This patch adds gals clock supporting in the SMI. Note that some larbs
> may still don't have the "gals" clock like larb1 and larb4 above.
>
> This is also a preparing patch for mt8183 which has GALS.
>
> Signed-off-by: Yong Wu <[email protected]>
So really from a software perspective the gals are just a couple of
extra clocks that need to be turned on for certain larbs. Seems fine
to me.
Reviewed-by: Evan Green <[email protected]>
On Mon, Dec 31, 2018 at 7:59 PM Yong Wu <[email protected]> wrote:
>
> Normally the M4U HW connect EMI with smi. the diagram is like below:
> EMI
> |
> M4U
> |
> smi-common
> |
> -----------------
> | | | | ...
> larb0 larb1 larb2 larb3
>
> Actually there are 2 mmu cells in the M4U HW, like this diagram:
>
> EMI
> ---------
> | |
> mmu0 mmu1 <- M4U
> | |
> ---------
> |
> smi-common
> |
> -----------------
> | | | | ...
> larb0 larb1 larb2 larb3
>
> This patch add support for mmu1. In order to get better performance,
> we could adjust some larbs go to mmu1 while the others still go to
> mmu0. This is controlled by a SMI COMMON register SMI_BUS_SEL(0x220).
>
> mt2712, mt8173 and mt8183 M4U HW all have 2 mmu cells. the default
> value of that register is 0 which means all the larbs go to mmu0
> defaultly.
>
> This is a preparing patch for adjusting SMI_BUS_SEL for mt8183.
>
> Signed-off-by: Yong Wu <[email protected]>
> ---
> drivers/iommu/mtk_iommu.c | 47 +++++++++++++++++++++++++++++------------------
> 1 file changed, 29 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 66e3615..7fcef19 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -70,27 +70,32 @@
> #define F_MISS_FIFO_ERR_INT_EN BIT(6)
> #define F_INT_CLR_BIT BIT(12)
>
> -#define REG_MMU_INT_MAIN_CONTROL 0x124
> -#define F_INT_TRANSLATION_FAULT BIT(0)
> -#define F_INT_MAIN_MULTI_HIT_FAULT BIT(1)
> -#define F_INT_INVALID_PA_FAULT BIT(2)
> -#define F_INT_ENTRY_REPLACEMENT_FAULT BIT(3)
> -#define F_INT_TLB_MISS_FAULT BIT(4)
> -#define F_INT_MISS_TRANSACTION_FIFO_FAULT BIT(5)
> -#define F_INT_PRETETCH_TRANSATION_FIFO_FAULT BIT(6)
> +#define REG_MMU_INT_MAIN_CONTROL 0x124 /* mmu0 | mmu1 */
The comment being on that line is kind of weird, since the comment
really applies to the lines below it. Maybe the comment should be on
its own line, or on the TRANSLATION_FAULT line.
Other than that,
Reviewed-by: Evan Green <[email protected]>
On Mon, Dec 31, 2018 at 7:59 PM Yong Wu <[email protected]> wrote:
>
> This patch only move the clk_prepare_enable and config_port into the
> runtime suspend/resume callback. It doesn't change the code content
> and sequence.
>
> This is a preparing patch for adjusting SMI_BUS_SEL for mt8183.
> (SMI_BUS_SEL need to be restored after smi-common resume every time.)
> Also it gives a chance to get rid of mtk_smi_larb_get/put which could
> be a next topic.
>
> CC: Matthias Brugger <[email protected]>
> Signed-off-by: Yong Wu <[email protected]>
I believe this refactoring is a no-op as described, because the order is still:
1) mtk_smi_clk_enable(common)
2) mtk_smi_clk_enable(larb)
3) larb_gen->config_port()
And teardown still happens in the opposite order, except for
config_port, which they seem not to do in suspend.
So, looks good to me.
Reviewed-by: Evan Green <[email protected]>
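For reference, the ordering described in this review can be modelled with a small userspace sketch. It is not driver code: the `step` names, the `trace` struct and the helpers are hypothetical, and it assumes that taking the runtime-PM reference on smi-common is what turns on the common clocks.

```c
#include <assert.h>

/* Hypothetical labels for the steps listed in the review above. */
enum step {
	COMMON_CLK_ON,   /* 1) mtk_smi_clk_enable(common), via runtime PM */
	LARB_CLK_ON,     /* 2) mtk_smi_clk_enable(larb) */
	CONFIG_PORT,     /* 3) larb_gen->config_port() */
	LARB_CLK_OFF,
	COMMON_CLK_OFF
};

#define TRACE_MAX 8

struct trace {
	enum step steps[TRACE_MAX];
	unsigned int n;
};

static void record(struct trace *t, enum step s)
{
	assert(t->n < TRACE_MAX);
	t->steps[t->n++] = s;
}

/* Models the larb runtime-resume path: common first, then larb, then ports. */
static void larb_resume(struct trace *t)
{
	record(t, COMMON_CLK_ON);
	record(t, LARB_CLK_ON);
	record(t, CONFIG_PORT);
}

/* Models the larb runtime-suspend path: reverse order, no port de-config. */
static void larb_suspend(struct trace *t)
{
	record(t, LARB_CLK_OFF);
	record(t, COMMON_CLK_OFF);
}
```

Calling larb_resume() followed by larb_suspend() records five steps, with the teardown mirroring the bring-up except for config_port.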
> ---
> drivers/memory/mtk-smi.c | 113 ++++++++++++++++++++++++++++++-----------------
> 1 file changed, 72 insertions(+), 41 deletions(-)
>
> diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
> index a430721..9790801 100644
> --- a/drivers/memory/mtk-smi.c
> +++ b/drivers/memory/mtk-smi.c
> @@ -86,17 +86,13 @@ struct mtk_smi_larb { /* larb: local arbiter */
> u32 *mmu;
> };
>
> -static int mtk_smi_enable(const struct mtk_smi *smi)
> +static int mtk_smi_clk_enable(const struct mtk_smi *smi)
> {
> int ret;
>
> - ret = pm_runtime_get_sync(smi->dev);
> - if (ret < 0)
> - return ret;
> -
> ret = clk_prepare_enable(smi->clk_apb);
> if (ret)
> - goto err_put_pm;
> + return ret;
>
> ret = clk_prepare_enable(smi->clk_smi);
> if (ret)
> @@ -118,59 +114,28 @@ static int mtk_smi_enable(const struct mtk_smi *smi)
> clk_disable_unprepare(smi->clk_smi);
> err_disable_apb:
> clk_disable_unprepare(smi->clk_apb);
> -err_put_pm:
> - pm_runtime_put_sync(smi->dev);
> return ret;
> }
>
> -static void mtk_smi_disable(const struct mtk_smi *smi)
> +static void mtk_smi_clk_disable(const struct mtk_smi *smi)
> {
> clk_disable_unprepare(smi->clk_gals1);
> clk_disable_unprepare(smi->clk_gals0);
> clk_disable_unprepare(smi->clk_smi);
> clk_disable_unprepare(smi->clk_apb);
> - pm_runtime_put_sync(smi->dev);
> }
>
> int mtk_smi_larb_get(struct device *larbdev)
> {
> - struct mtk_smi_larb *larb = dev_get_drvdata(larbdev);
> - const struct mtk_smi_larb_gen *larb_gen = larb->larb_gen;
> - struct mtk_smi *common = dev_get_drvdata(larb->smi_common_dev);
> - int ret;
> + int ret = pm_runtime_get_sync(larbdev);
>
> - /* Enable the smi-common's power and clocks */
> - ret = mtk_smi_enable(common);
> - if (ret)
> - return ret;
> -
> - /* Enable the larb's power and clocks */
> - ret = mtk_smi_enable(&larb->smi);
> - if (ret) {
> - mtk_smi_disable(common);
> - return ret;
> - }
> -
> - /* Configure the iommu info for this larb */
> - larb_gen->config_port(larbdev);
> -
> - return 0;
> + return (ret < 0) ? ret : 0;
> }
> EXPORT_SYMBOL_GPL(mtk_smi_larb_get);
>
> void mtk_smi_larb_put(struct device *larbdev)
> {
> - struct mtk_smi_larb *larb = dev_get_drvdata(larbdev);
> - struct mtk_smi *common = dev_get_drvdata(larb->smi_common_dev);
> -
> - /*
> - * Don't de-configure the iommu info for this larb since there may be
> - * several modules in this larb.
> - * The iommu info will be reset after power off.
> - */
> -
> - mtk_smi_disable(&larb->smi);
> - mtk_smi_disable(common);
> + pm_runtime_put_sync(larbdev);
> }
> EXPORT_SYMBOL_GPL(mtk_smi_larb_put);
>
> @@ -385,12 +350,52 @@ static int mtk_smi_larb_remove(struct platform_device *pdev)
> return 0;
> }
>
> +static int __maybe_unused mtk_smi_larb_resume(struct device *dev)
> +{
> + struct mtk_smi_larb *larb = dev_get_drvdata(dev);
> + const struct mtk_smi_larb_gen *larb_gen = larb->larb_gen;
> + int ret;
> +
> + /* Power on smi-common. */
> + ret = pm_runtime_get_sync(larb->smi_common_dev);
> + if (ret < 0) {
> + dev_err(dev, "Failed to pm get for smi-common(%d).\n", ret);
> + return ret;
> + }
> +
> + ret = mtk_smi_clk_enable(&larb->smi);
> + if (ret < 0) {
> + dev_err(dev, "Failed to enable clock(%d).\n", ret);
> + pm_runtime_put_sync(larb->smi_common_dev);
> + return ret;
> + }
> +
> + /* Configure the basic setting for this larb */
> + larb_gen->config_port(dev);
> +
> + return 0;
> +}
> +
> +static int __maybe_unused mtk_smi_larb_suspend(struct device *dev)
> +{
> + struct mtk_smi_larb *larb = dev_get_drvdata(dev);
> +
> + mtk_smi_clk_disable(&larb->smi);
> + pm_runtime_put_sync(larb->smi_common_dev);
> + return 0;
> +}
> +
> +static const struct dev_pm_ops smi_larb_pm_ops = {
> + SET_RUNTIME_PM_OPS(mtk_smi_larb_suspend, mtk_smi_larb_resume, NULL)
> +};
> +
> static struct platform_driver mtk_smi_larb_driver = {
> .probe = mtk_smi_larb_probe,
> .remove = mtk_smi_larb_remove,
> .driver = {
> .name = "mtk-smi-larb",
> .of_match_table = mtk_smi_larb_of_ids,
> + .pm = &smi_larb_pm_ops,
> }
> };
>
> @@ -489,12 +494,38 @@ static int mtk_smi_common_remove(struct platform_device *pdev)
> return 0;
> }
>
> +static int __maybe_unused mtk_smi_common_resume(struct device *dev)
> +{
> + struct mtk_smi *common = dev_get_drvdata(dev);
> + int ret;
> +
> + ret = mtk_smi_clk_enable(common);
> + if (ret) {
> + dev_err(common->dev, "Failed to enable clock(%d).\n", ret);
> + return ret;
> + }
> + return 0;
> +}
> +
> +static int __maybe_unused mtk_smi_common_suspend(struct device *dev)
> +{
> + struct mtk_smi *common = dev_get_drvdata(dev);
> +
> + mtk_smi_clk_disable(common);
> + return 0;
> +}
> +
> +static const struct dev_pm_ops smi_common_pm_ops = {
> + SET_RUNTIME_PM_OPS(mtk_smi_common_suspend, mtk_smi_common_resume, NULL)
> +};
> +
> static struct platform_driver mtk_smi_common_driver = {
> .probe = mtk_smi_common_probe,
> .remove = mtk_smi_common_remove,
> .driver = {
> .name = "mtk-smi-common",
> .of_match_table = mtk_smi_common_of_ids,
> + .pm = &smi_common_pm_ops,
> }
> };
>
> --
> 1.9.1
>
On Mon, Dec 31, 2018 at 7:59 PM Yong Wu <[email protected]> wrote:
>
> There are 2 mmu cells in a M4U HW. we could adjust some larbs entering
> mmu0 or mmu1 to balance the bandwidth via the smi-common register
> SMI_BUS_SEL(0x220)(Each larb occupy 2 bits).
>
> In mt8183, For better performance, we switch larb1/2/5/7 to enter
> mmu1 while the others still keep enter mmu0.
>
> In mt8173 and mt2712, we don't get the performance issue,
> Keep its default value(0x0), that means all the larbs enter mmu0.
>
> Note: smi gen1(mt2701/mt7623) don't have this bus_sel.
>
> CC: Matthias Brugger <[email protected]>
> Signed-off-by: Yong Wu <[email protected]>
> ---
> drivers/memory/mtk-smi.c | 22 ++++++++++++++++++++--
> 1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
> index 9790801..08cf40d 100644
> --- a/drivers/memory/mtk-smi.c
> +++ b/drivers/memory/mtk-smi.c
> @@ -49,6 +49,12 @@
> #define SMI_LARB_NONSEC_CON(id) (0x380 + ((id) * 4))
> #define F_MMU_EN BIT(0)
>
> +/* SMI COMMON */
> +#define SMI_BUS_SEL 0x220
> +#define SMI_BUS_LARB_SHIFT(larbid) ((larbid) << 1)
> +/* All are MMU0 defaultly. Only specialize mmu1 here. */
> +#define F_MMU1_LARB(larbid) (0x1 << SMI_BUS_LARB_SHIFT(larbid))
> +
> enum mtk_smi_gen {
> MTK_SMI_GEN1,
> MTK_SMI_GEN2
> @@ -57,6 +63,7 @@ enum mtk_smi_gen {
> struct mtk_smi_common_plat {
> enum mtk_smi_gen gen;
> bool has_gals;
> + u32 bus_sel; /* Balance some larbs to enter mmu0 or mmu1 */
> };
>
> struct mtk_smi_larb_gen {
> @@ -72,8 +79,8 @@ struct mtk_smi {
> struct clk *clk_apb, *clk_smi;
> struct clk *clk_gals0, *clk_gals1;
> struct clk *clk_async; /*only needed by mt2701*/
> - void __iomem *smi_ao_base;
> -
> + void __iomem *smi_ao_base; /* only for gen1 */
> + void __iomem *base; /* only for gen2 */
> const struct mtk_smi_common_plat *plat;
> };
>
> @@ -410,6 +417,8 @@ static int __maybe_unused mtk_smi_larb_suspend(struct device *dev)
> static const struct mtk_smi_common_plat mtk_smi_common_mt8183 = {
> .gen = MTK_SMI_GEN2,
> .has_gals = true,
> + .bus_sel = F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(5) |
> + F_MMU1_LARB(7),
> };
>
> static const struct of_device_id mtk_smi_common_of_ids[] = {
> @@ -482,6 +491,11 @@ static int mtk_smi_common_probe(struct platform_device *pdev)
> ret = clk_prepare_enable(common->clk_async);
> if (ret)
> return ret;
> + } else {
> + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> + common->base = devm_ioremap_resource(dev, res);
> + if (IS_ERR(common->base))
> + return PTR_ERR(common->base);
So you split base and smi_ao_base because they're completely different
register regions, or because ->base is no longer "always on"? It's
tempting to recombine them because they appear to be mutually
exclusive, but if they're truly different register regions then I
understand.
On Mon, Dec 31, 2018 at 8:00 PM Yong Wu <[email protected]> wrote:
>
> Switch to SPDX license identifier for MediaTek iommu/smi and their
> header files.
>
> Signed-off-by: Yong Wu <[email protected]>
> Reviewed-by: Rob Herring <[email protected]>
Reviewed-by: Evan Green <[email protected]>
On Mon, Dec 31, 2018 at 8:00 PM Yong Wu <[email protected]> wrote:
>
> The register VLD_PA_RNG(0x118) was forgot to backup while adding 4GB
> mode support for mt2712. this patch add it.
>
> Fixes: 30e2fccf9512 ("iommu/mediatek: Enlarge the validate PA range
> for 4GB mode")
> Signed-off-by: Yong Wu <[email protected]>
Reviewed-by: Evan Green <[email protected]>
On Mon, Dec 31, 2018 at 7:59 PM Yong Wu <[email protected]> wrote:
>
> The "mediatek,larb-id" has already been parsed in MTK IOMMU driver.
> It's no need to parse it again in SMI driver. Only clean some codes.
> This patch is fit for all the current mt2701, mt2712, mt7623, mt8173
> and mt8183.
>
> After this patch, the "mediatek,larb-id" only be needed for mt2712
> which have 2 M4Us. In the other SoCs, we can get the larb-id from M4U
> in which the larbs in the "mediatek,larbs" always are ordered.
>
> CC: Matthias Brugger <[email protected]>
> Signed-off-by: Yong Wu <[email protected]>
> ---
> drivers/memory/mtk-smi.c | 26 ++------------------------
> 1 file changed, 2 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
> index 08cf40d..10e6493 100644
> --- a/drivers/memory/mtk-smi.c
> +++ b/drivers/memory/mtk-smi.c
> @@ -67,7 +67,6 @@ struct mtk_smi_common_plat {
> };
>
> struct mtk_smi_larb_gen {
> - bool need_larbid;
> int port_in_larb[MTK_LARB_NR_MAX + 1];
> void (*config_port)(struct device *);
> unsigned int larb_direct_to_common_mask;
> @@ -153,18 +152,9 @@ void mtk_smi_larb_put(struct device *larbdev)
> struct mtk_smi_iommu *smi_iommu = data;
> unsigned int i;
>
> - if (larb->larb_gen->need_larbid) {
> - larb->mmu = &smi_iommu->larb_imu[larb->larbid].mmu;
> - return 0;
> - }
> -
> - /*
> - * If there is no larbid property, Loop to find the corresponding
> - * iommu information.
> - */
> - for (i = 0; i < smi_iommu->larb_nr; i++) {
> + for (i = 0; i < MTK_LARB_NR_MAX; i++) {
Looks like this was the only use of mtk_smi_iommu.larb_nr. Should we
remove that now?
On Mon, Dec 31, 2018 at 8:00 PM Yong Wu <[email protected]> wrote:
>
> In the reboot burning test, if some Multimedia HW has something wrong,
> It may keep send the invalid request to IOMMU. In order to avoid
> affect the reboot flow, we add the shutdown callback to disable
> M4U HW when shutdown.
Sounds unpleasant. Hopefully the reboot flow still continues properly
even in that case, since this shutdown code may not run during some
rougher resets.
Reviewed-by: Evan Green <[email protected]>
Hi Evan,
Thanks very much for reviewing this patchset.
On Wed, 2019-01-30 at 10:27 -0800, Evan Green wrote:
> On Mon, Dec 31, 2018 at 7:56 PM Yong Wu <[email protected]> wrote:
> >
> > This patch adds decriptions for mt8183 IOMMU and SMI.
> >
> > mt8183 has only one M4U like mt8173 and is also MTK IOMMU gen2 which
> > uses ARM Short-Descriptor translation table format.
> >
> > The mt8183 M4U-SMI HW diagram is as below:
> >
> > EMI
> > |
> > M4U
> > |
> > ----------
> > | |
> > gals0-rx gals1-rx
> > | |
> > | |
> > gals0-tx gals1-tx
> > | |
> > ------------
> > SMI Common
> > ------------
> > |
> > +-----+-----+--------+-----+-----+-------+-------+
> > | | | | | | | |
> > | | gals-rx gals-rx | gals-rx gals-rx gals-rx
> > | | | | | | | |
> > | | | | | | | |
> > | | gals-tx gals-tx | gals-tx gals-tx gals-tx
> > | | | | | | | |
> > larb0 larb1 IPU0 IPU1 larb4 larb5 larb6 CCU
> > disp vdec img cam venc img cam
>
> It might be cool to put the gals in the picture in the bindings. Not a
> big deal though.
OK. The picture in the binding should be more generic; I will try to add
the gals to the binding picture in the next version.
>
> >
> > All the connections are HW fixed, SW can NOT adjust it.
> >
> > Compared with mt8173, we add a GALS(Global Async Local Sync) module
> > between SMI-common and M4U, and additional GALS between larb2/3/5/6
> > and SMI-common. GALS can help synchronize for the modules in different
> > clock frequency, it can be seen as a "asynchronous fifo".
> >
> > GALS can only help transfer the command/data while it doesn't have
> > the configuring register, thus it has the special "smi" clock and it
> > doesn't have the "apb" clock. From the diagram above, we add "gals0"
> > and "gals1" clocks for smi-common and add a "gals" clock for smi-larb.
> >
> > From the diagram above, IPU0/IPU1(Image Processor Unit) and CCU(Camera
> > Control Unit) is connected with smi-common directly, we can take them
> > as "larb2", "larb3" and "larb7", and their register spaces are
> > different with the normal larb.
> >
> > Signed-off-by: Yong Wu <[email protected]>
> > Reviewed-by: Rob Herring <[email protected]>
> > ---
> > .../devicetree/bindings/iommu/mediatek,iommu.txt | 15 ++-
> > .../memory-controllers/mediatek,smi-common.txt | 11 +-
> > .../memory-controllers/mediatek,smi-larb.txt | 3 +
> > include/dt-bindings/memory/mt8183-larb-port.h | 130 +++++++++++++++++++++
> > 4 files changed, 153 insertions(+), 6 deletions(-)
> > create mode 100644 include/dt-bindings/memory/mt8183-larb-port.h
> >
> > diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
> > index 6922db5..6e758996 100644
> > --- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
> > +++ b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
> > @@ -36,6 +36,10 @@ each local arbiter.
> > like display, video decode, and camera. And there are different ports
> > in each larb. Take a example, There are many ports like MC, PP, VLD in the
> > video decode local arbiter, all these ports are according to the video HW.
> > + In some SoCs, there may be a GALS(Global Async Local Sync) module between
> > +smi-common and m4u, and additional GALS module between smi-larb and
> > +smi-common. GALS can been seen as a "asynchronous fifo" which could help
> > +synchronize for the modules in different clock frequency.
> >
> > Required properties:
> > - compatible : must be one of the following string:
> > @@ -44,18 +48,23 @@ Required properties:
> > "mediatek,mt7623-m4u", "mediatek,mt2701-m4u" for mt7623 which uses
> > generation one m4u HW.
> > "mediatek,mt8173-m4u" for mt8173 which uses generation two m4u HW.
> > + "mediatek,mt8183-m4u" for mt8183 which uses generation two m4u HW.
> > - reg : m4u register base and size.
> > - interrupts : the interrupt of m4u.
> > - clocks : must contain one entry for each clock-names.
> > -- clock-names : must be "bclk", It is the block clock of m4u.
> > +- clock-names : Only 1 optional clock:
> > + - "bclk": the block clock of m4u.
> > + Note that m4u use the EMI clock which always has been enabled before kernel
> > + if there is no this "bclk".
>
> Ideally bclk could be specified a little more crisply, as this is
> actually required for some SoCs and not used at all on others (as in
> patch 7).
OK. I will add something like this:
+ Here is the list which require the "bclk":
+ - mt2701, mt2712, mt7623 and mt8173.
>
> > - mediatek,larbs : List of phandle to the local arbiters in the current Socs.
> > Refer to bindings/memory-controllers/mediatek,smi-larb.txt. It must sort
> > according to the local arbiter index, like larb0, larb1, larb2...
> > - iommu-cells : must be 1. This is the mtk_m4u_id according to the HW.
> > Specifies the mtk_m4u_id as defined in
> > dt-binding/memory/mt2701-larb-port.h for mt2701, mt7623
> > - dt-binding/memory/mt2712-larb-port.h for mt2712, and
> > - dt-binding/memory/mt8173-larb-port.h for mt8173.
> > + dt-binding/memory/mt2712-larb-port.h for mt2712,
> > + dt-binding/memory/mt8173-larb-port.h for mt8173, and
> > + dt-binding/memory/mt8183-larb-port.h for mt8183.
> >
> > Example:
> > iommu: iommu@10205000 {
> > diff --git a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
> > index e937ddd..8d3240a 100644
> > --- a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
> > +++ b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
> > @@ -2,9 +2,10 @@ SMI (Smart Multimedia Interface) Common
> >
> > The hardware block diagram please check bindings/iommu/mediatek,iommu.txt
> >
> > -Mediatek SMI have two generations of HW architecture, mt2712 and mt8173 use
> > -the second generation of SMI HW while mt2701 uses the first generation HW of
> > -SMI.
> > +Mediatek SMI have two generations of HW architecture, here is the list
> > +which generation the Socs use:
> > +generation 1: mt2701 and mt7623.
> > +generation 2: mt2712, mt8173 and mt8183.
> >
> > There's slight differences between the two SMI, for generation 2, the
> > register which control the iommu port is at each larb's register base. But
> > @@ -19,6 +20,7 @@ Required properties:
> > "mediatek,mt2712-smi-common"
> > "mediatek,mt7623-smi-common", "mediatek,mt2701-smi-common"
> > "mediatek,mt8173-smi-common"
> > + "mediatek,mt8183-smi-common"
> > - reg : the register and size of the SMI block.
> > - power-domains : a phandle to the power domain of this local arbiter.
> > - clocks : Must contain an entry for each entry in clock-names.
> > @@ -30,6 +32,9 @@ Required properties:
> > They may be the same if both source clocks are the same.
> > - "async" : asynchronous clock, it help transform the smi clock into the emi
> > clock domain, this clock is only needed by generation 1 smi HW.
> > + and these 2 option clocks for generation 2 smi HW:
> > + - "gals0": the path0 clock of GALS(Global Async Local Sync).
> > + - "gals1": the path1 clock of GALS(Global Async Local Sync).
>
> It would also be nice to specify these more clearly too, since these
> are also required for some SoCs and not used on others (as in patch
> 12).
I will add:
+ Here is the list which has the "gals": mt8183.
On Wed, 2019-01-30 at 10:31 -0800, Evan Green wrote:
> On Mon, Dec 31, 2018 at 7:59 PM Yong Wu <[email protected]> wrote:
> >
> > The M4U IP blocks in mt8183 is MediaTek's generation2 M4U which use
> > the ARM Short-descriptor like mt8173, and most of the HW registers
> > are the same.
> >
> > Here list main differences between mt8183 and mt8173/mt2712:
> > 1) mt8183 has only one M4U HW like mt8173 while mt2712 has two.
> > 2) mt8183 don't have the "bclk" clock, it use the EMI clock instead.
> > 3) mt8183 can support the dram over 4GB, but it doesn't call this "4GB
> > mode".
> > 4) mt8183 pgtable base register(0x0) extend bit[1:0] which represent
> > the bit[33:32] in the physical address of the pgtable base, But the
> > standard ttbr0[1] means the S bit which is enabled defaultly, Hence,
> > we add a mask.
> > 5) mt8183 HW has a GALS modules, SMI should enable "has_gals" support.
> > 6) mt8183 need reset_axi like mt8173.
> > 7) the larb-id in smi-common is remapped. M4U should add its larbid_remap.
> >
> > Signed-off-by: Yong Wu <[email protected]>
> > ---
> > drivers/iommu/mtk_iommu.c | 15 ++++++++++++---
> > drivers/iommu/mtk_iommu.h | 1 +
> > drivers/memory/mtk-smi.c | 20 ++++++++++++++++++++
> > 3 files changed, 33 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 2913ddb..66e3615 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -36,6 +36,7 @@
> > #include "mtk_iommu.h"
> >
> > #define REG_MMU_PT_BASE_ADDR 0x000
> > +#define MMU_PT_ADDR_MASK GENMASK(31, 7)
> >
> > #define REG_MMU_INVALIDATE 0x020
> > #define F_ALL_INVLD 0x2
> > @@ -342,7 +343,7 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
> > /* Update the pgtable base address register of the M4U HW */
> > if (!data->m4u_dom) {
> > data->m4u_dom = dom;
> > - writel(dom->cfg.arm_v7s_cfg.ttbr[0],
> > + writel(dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
>
> So there aren't any other bits down below 7 that you need, like the
> shareable bits?
Yes. We don't need any of the bits below bit 7. As item 4) above says,
we mask them only because of the S bit.
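A minimal userspace sketch of the masking discussed here, assuming a 32-bit register value. GENMASK32 is a stand-in written for this example (the kernel has its own GENMASK), and TTBR0_S_BIT and pt_base_reg_value() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(h, l), 32-bit only. */
#define GENMASK32(h, l)   ((~0u << (l)) & (~0u >> (31 - (h))))

/* Same bit range as the MMU_PT_ADDR_MASK added by the patch. */
#define MMU_PT_ADDR_MASK  GENMASK32(31, 7)

/* ttbr0[1] is the S bit, per item 4) of the commit message. */
#define TTBR0_S_BIT       (1u << 1)

/* Value that would be written to the pgtable base register (0x0). */
static uint32_t pt_base_reg_value(uint32_t ttbr0)
{
	uint32_t val = ttbr0 & MMU_PT_ADDR_MASK;

	/* Bits [6:0], including the S bit, must be gone after masking. */
	assert((val & 0x7fu) == 0);
	return val;
}
```

With a page-table base of 0x5f000000 and the S bit set, the masked value is 0x5f000000 again, so the S bit can no longer be misread as an address bit.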
On Wed, 2019-01-30 at 11:07 -0800, Evan Green wrote:
> On Mon, Dec 31, 2018 at 7:59 PM Yong Wu <[email protected]> wrote:
> >
> > There are 2 mmu cells in a M4U HW. we could adjust some larbs entering
> > mmu0 or mmu1 to balance the bandwidth via the smi-common register
> > SMI_BUS_SEL(0x220)(Each larb occupy 2 bits).
> >
> > In mt8183, For better performance, we switch larb1/2/5/7 to enter
> > mmu1 while the others still keep enter mmu0.
> >
> > In mt8173 and mt2712, we don't get the performance issue,
> > Keep its default value(0x0), that means all the larbs enter mmu0.
> >
> > Note: smi gen1(mt2701/mt7623) don't have this bus_sel.
> >
> > CC: Matthias Brugger <[email protected]>
> > Signed-off-by: Yong Wu <[email protected]>
> > ---
> > drivers/memory/mtk-smi.c | 22 ++++++++++++++++++++--
> > 1 file changed, 20 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
> > index 9790801..08cf40d 100644
> > --- a/drivers/memory/mtk-smi.c
> > +++ b/drivers/memory/mtk-smi.c
> > @@ -49,6 +49,12 @@
> > #define SMI_LARB_NONSEC_CON(id) (0x380 + ((id) * 4))
> > #define F_MMU_EN BIT(0)
> >
> > +/* SMI COMMON */
> > +#define SMI_BUS_SEL 0x220
> > +#define SMI_BUS_LARB_SHIFT(larbid) ((larbid) << 1)
> > +/* All are MMU0 defaultly. Only specialize mmu1 here. */
> > +#define F_MMU1_LARB(larbid) (0x1 << SMI_BUS_LARB_SHIFT(larbid))
> > +
> > enum mtk_smi_gen {
> > MTK_SMI_GEN1,
> > MTK_SMI_GEN2
> > @@ -57,6 +63,7 @@ enum mtk_smi_gen {
> > struct mtk_smi_common_plat {
> > enum mtk_smi_gen gen;
> > bool has_gals;
> > + u32 bus_sel; /* Balance some larbs to enter mmu0 or mmu1 */
> > };
> >
> > struct mtk_smi_larb_gen {
> > @@ -72,8 +79,8 @@ struct mtk_smi {
> > struct clk *clk_apb, *clk_smi;
> > struct clk *clk_gals0, *clk_gals1;
> > struct clk *clk_async; /*only needed by mt2701*/
> > - void __iomem *smi_ao_base;
> > -
> > + void __iomem *smi_ao_base; /* only for gen1 */
> > + void __iomem *base; /* only for gen2 */
> > const struct mtk_smi_common_plat *plat;
> > };
> >
> > @@ -410,6 +417,8 @@ static int __maybe_unused mtk_smi_larb_suspend(struct device *dev)
> > static const struct mtk_smi_common_plat mtk_smi_common_mt8183 = {
> > .gen = MTK_SMI_GEN2,
> > .has_gals = true,
> > + .bus_sel = F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(5) |
> > + F_MMU1_LARB(7),
> > };
> >
> > static const struct of_device_id mtk_smi_common_of_ids[] = {
> > @@ -482,6 +491,11 @@ static int mtk_smi_common_probe(struct platform_device *pdev)
> > ret = clk_prepare_enable(common->clk_async);
> > if (ret)
> > return ret;
> > + } else {
> > + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > + common->base = devm_ioremap_resource(dev, res);
> > + if (IS_ERR(common->base))
> > + return PTR_ERR(common->base);
>
> So you split base and smi_ao_base because they're completely different
> register regions, or because ->base is no longer "always on"? It's
> tempting to recombine them because they appear to be mutually
> exclusive, but if they're truly different register regions then I
> understand.
They are completely different. common->base is the normal smi-common
register base, while common->smi_ao_base only exists in smi gen1.
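As a worked example of the bus_sel value in this patch, here is a small userspace sketch. The two macros follow the quoted diff (with an unsigned constant); bus_sel_for_mmu1() and larb_mmu_cell() are hypothetical helpers added only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Each larb owns 2 bits in SMI_BUS_SEL, as in the quoted patch. */
#define SMI_BUS_LARB_SHIFT(larbid)  ((larbid) << 1)
#define F_MMU1_LARB(larbid)         (0x1u << SMI_BUS_LARB_SHIFT(larbid))

/* Hypothetical helper: OR together the mmu1 bits for a list of larb ids. */
static uint32_t bus_sel_for_mmu1(const unsigned int *larbs, unsigned int n)
{
	uint32_t val = 0;

	for (unsigned int i = 0; i < n; i++) {
		assert(larbs[i] < 16); /* 16 two-bit fields fit in 32 bits */
		val |= F_MMU1_LARB(larbs[i]);
	}
	return val;
}

/* Hypothetical helper: 1 if the larb is routed to mmu1, else 0 (mmu0). */
static unsigned int larb_mmu_cell(uint32_t bus_sel, unsigned int larbid)
{
	return (bus_sel >> SMI_BUS_LARB_SHIFT(larbid)) & 0x1u;
}
```

For the mt8183 setting (larb1/2/5/7 to mmu1) this sets bits 2, 4, 10 and 14, i.e. 0x4414; every other larb keeps the default of mmu0.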
On Wed, 2019-01-30 at 10:30 -0800, Evan Green wrote:
> On Mon, Dec 31, 2018 at 7:58 PM Yong Wu <[email protected]> wrote:
> >
> > Both mt8173 and mt8183 don't have this vld_pa_rng(valid physical address
> > range) register while mt2712 have. Move it into the plat_data.
> >
> > Signed-off-by: Yong Wu <[email protected]>
> > ---
> > drivers/iommu/mtk_iommu.c | 3 ++-
> > drivers/iommu/mtk_iommu.h | 1 +
> > 2 files changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 8d8ab21..2913ddb 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -548,7 +548,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> > upper_32_bits(data->protect_base);
> > writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
> >
> > - if (data->enable_4GB && data->plat_data->m4u_plat != M4U_MT8173) {
> > + if (data->enable_4GB && data->plat_data->vld_pa_rng) {
> > /*
> > * If 4GB mode is enabled, the validate PA range is from
> > * 0x1_0000_0000 to 0x1_ffff_ffff. here record bit[32:30].
> > @@ -741,6 +741,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
> > .m4u_plat = M4U_MT2712,
> > .has_4gb_mode = true,
> > .has_bclk = true,
> > + .vld_pa_rng = true,
> > .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
> > };
> >
> > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> > index b46aeaa..a8c5d1e 100644
> > --- a/drivers/iommu/mtk_iommu.h
> > +++ b/drivers/iommu/mtk_iommu.h
> > @@ -48,6 +48,7 @@ struct mtk_iommu_plat_data {
> > /* HW will use the EMI clock if there isn't the "bclk". */
> > bool has_bclk;
> > bool reset_axi;
> > + bool vld_pa_rng;
>
> I agree with Nicolas that valid_pa_range would be much clearer...
> although, even now that I know what it's supposed to mean, I don't get
> what it represents. What is this saying?
This register is called "vld_pa_rng" in the coda.
How about I change it to "has_vld_pa_rng"? In the comment above, I have
explained the meaning (valid physical address range).
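To make the "record bit[32:30]" comment in the quoted hunk concrete, here is a small sketch that only extracts those bits from the two range limits. It deliberately says nothing about how the register packs them; the struct and helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* bit[32:30] of a physical address, i.e. its 1GB-granule index. */
static unsigned int pa_bits_32_30(uint64_t pa)
{
	return (unsigned int)((pa >> 30) & 0x7);
}

/* Hypothetical container for the start/end fields of the valid range. */
struct pa_range_field {
	unsigned int start;
	unsigned int end;
};

static struct pa_range_field vld_pa_rng_field(uint64_t start, uint64_t end)
{
	struct pa_range_field f;

	assert(start <= end);
	f.start = pa_bits_32_30(start);
	f.end = pa_bits_32_30(end);
	return f;
}
```

For the 4GB-mode range 0x1_0000_0000 to 0x1_ffff_ffff, the start field is 4 and the end field is 7.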
>
> -Evan
On Wed, 2019-01-30 at 11:12 -0800, Evan Green wrote:
> On Mon, Dec 31, 2018 at 8:00 PM Yong Wu <[email protected]> wrote:
> >
> > In the reboot burning test, if some Multimedia HW has something wrong,
> > It may keep send the invalid request to IOMMU. In order to avoid
> > affect the reboot flow, we add the shutdown callback to disable
> > M4U HW when shutdown.
>
> Sounds unpleasant. Hopefully the reboot flow still continues properly
> even in that case, since this shutdown code may not run during some
> rougher resets.
Thanks for the hint. I will reword the comment to avoid writing "reboot".
>
> Reviewed-by: Evan Green <[email protected]>
Thanks.
On Wed, 2019-01-30 at 11:11 -0800, Evan Green wrote:
> On Mon, Dec 31, 2018 at 7:59 PM Yong Wu <[email protected]> wrote:
> >
> > The "mediatek,larb-id" has already been parsed in MTK IOMMU driver.
> > It's no need to parse it again in SMI driver. Only clean some codes.
> > This patch is fit for all the current mt2701, mt2712, mt7623, mt8173
> > and mt8183.
> >
> > After this patch, the "mediatek,larb-id" only be needed for mt2712
> > which have 2 M4Us. In the other SoCs, we can get the larb-id from M4U
> > in which the larbs in the "mediatek,larbs" always are ordered.
> >
> > CC: Matthias Brugger <[email protected]>
> > Signed-off-by: Yong Wu <[email protected]>
> > ---
> > drivers/memory/mtk-smi.c | 26 ++------------------------
> > 1 file changed, 2 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
> > index 08cf40d..10e6493 100644
> > --- a/drivers/memory/mtk-smi.c
> > +++ b/drivers/memory/mtk-smi.c
> > @@ -67,7 +67,6 @@ struct mtk_smi_common_plat {
> > };
> >
> > struct mtk_smi_larb_gen {
> > - bool need_larbid;
> > int port_in_larb[MTK_LARB_NR_MAX + 1];
> > void (*config_port)(struct device *);
> > unsigned int larb_direct_to_common_mask;
> > @@ -153,18 +152,9 @@ void mtk_smi_larb_put(struct device *larbdev)
> > struct mtk_smi_iommu *smi_iommu = data;
> > unsigned int i;
> >
> > - if (larb->larb_gen->need_larbid) {
> > - larb->mmu = &smi_iommu->larb_imu[larb->larbid].mmu;
> > - return 0;
> > - }
> > -
> > - /*
> > - * If there is no larbid property, Loop to find the corresponding
> > - * iommu information.
> > - */
> > - for (i = 0; i < smi_iommu->larb_nr; i++) {
> > + for (i = 0; i < MTK_LARB_NR_MAX; i++) {
>
> Looks like this was the only use of mtk_smi_iommu.larb_nr. Should we
> remove that now?
This is necessary because mt2712 has two M4U HW.
From its dtsi[1], taking iommu1 as an example, its larb_nr is only 3, but
we need to scan all the larbs.
[1]
http://lists.infradead.org/pipermail/linux-mediatek/2018-December/016119.html
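The point about larb_nr can be illustrated with a userspace sketch. It assumes the table is indexed by larb id (as the removed larb_imu[larb->larbid] lookup suggests); the table size, the slot positions 2, 3 and 6, and all names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define LARB_NR_MAX 16 /* assumed table size for this sketch */

/* Hypothetical stand-in for one larb_imu[] entry. */
struct larb_slot {
	const void *dev;
	unsigned int mmu;
};

/* Scan the first 'limit' slots; return the index of 'dev' or -1. */
static int find_larb(const struct larb_slot *tbl, unsigned int limit,
		     const void *dev)
{
	assert(limit <= LARB_NR_MAX);
	for (unsigned int i = 0; i < limit; i++) {
		if (tbl[i].dev == dev)
			return (int)i;
	}
	return -1;
}
```

If an M4U owns three larbs that sit at indices 2, 3 and 6, a loop bounded by larb_nr (3) stops before reaching index 6, while a loop bounded by the table size finds it.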
On Wed, 2019-01-30 at 10:30 -0800, Evan Green wrote:
> On Mon, Dec 31, 2018 at 7:58 PM Yong Wu <[email protected]> wrote:
> >
> > In mt8173 and mt8183, 0x48 is REG_MMU_STANDARD_AXI_MODE while
> > it is extended to REG_MMU_CTRL which contains _STANDARD_AXI_MODE in
> > the other SoCs. I move this property to plat_data since both mt8173
> > and mt8183 use this property.
> >
> > It is a preparing patch for mt8183.
> >
> > Signed-off-by: Yong Wu <[email protected]>
> > ---
> > drivers/iommu/mtk_iommu.c | 4 ++--
> > drivers/iommu/mtk_iommu.h | 2 +-
> > 2 files changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 35a1263..8d8ab21 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -558,8 +558,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> > }
> > writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
> >
> > - /* It's MISC control register whose default value is ok except mt8173.*/
> > - if (data->plat_data->m4u_plat == M4U_MT8173)
> > + if (data->plat_data->reset_axi)
> > writel_relaxed(0, data->base + REG_MMU_STANDARD_AXI_MODE);
>
> The commit description makes it sound like the overall format of the
> register is the same, but the "other SoCs" have some extra bits they'd
> like to leave alone. Would it be easier to do a read-modify-write to
> always clear some bits in the register, instead of doing something
> based on the SoC? Or do the bits mean completely different things in
> the different versions (in which case what you've got makes sense to
> me)?
The bits mean completely different things. (The axi bit position is also
different. I will add this to the comment of this patch.)
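A minimal sketch of the per-SoC flag approach chosen here, as opposed to a read-modify-write. The register file is a plain array and hw_init() is a hypothetical stand-in for the relevant part of mtk_iommu_hw_init(); only the 0x48 offset and the reset_axi flag come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define REG_MMU_STANDARD_AXI_MODE 0x48

/* Cut-down stand-in for mtk_iommu_plat_data. */
struct plat_data {
	bool reset_axi;
};

/* mt8173 resets the register; mt2712 leaves its 0x48 register alone. */
static const struct plat_data mt8173_plat = { .reset_axi = true };
static const struct plat_data mt2712_plat = { .reset_axi = false };

/* 'regs' models the MMIO window as an array of 32-bit words. */
static void hw_init(const struct plat_data *plat, uint32_t *regs)
{
	assert(plat && regs);
	if (plat->reset_axi)
		regs[REG_MMU_STANDARD_AXI_MODE / 4] = 0;
}
```

Because the bits at 0x48 mean different things on the other SoCs, the flag keeps the write away from them entirely rather than clearing selected bits everywhere.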
> -Evan
On Wed, 2019-01-30 at 10:55 -0800, Evan Green wrote:
> On Mon, Dec 31, 2018 at 7:59 PM Yong Wu <[email protected]> wrote:
> >
> > Normally the M4U HW connect EMI with smi. the diagram is like below:
> > EMI
> > |
> > M4U
> > |
> > smi-common
> > |
> > -----------------
> > | | | | ...
> > larb0 larb1 larb2 larb3
> >
> > Actually there are 2 mmu cells in the M4U HW, like this diagram:
> >
> > EMI
> > ---------
> > | |
> > mmu0 mmu1 <- M4U
> > | |
> > ---------
> > |
> > smi-common
> > |
> > -----------------
> > | | | | ...
> > larb0 larb1 larb2 larb3
> >
> > This patch add support for mmu1. In order to get better performance,
> > we could adjust some larbs go to mmu1 while the others still go to
> > mmu0. This is controlled by a SMI COMMON register SMI_BUS_SEL(0x220).
> >
> > mt2712, mt8173 and mt8183 M4U HW all have 2 mmu cells. the default
> > value of that register is 0 which means all the larbs go to mmu0
> > defaultly.
> >
> > This is a preparing patch for adjusting SMI_BUS_SEL for mt8183.
> >
> > Signed-off-by: Yong Wu <[email protected]>
> > ---
> > drivers/iommu/mtk_iommu.c | 47 +++++++++++++++++++++++++++++------------------
> > 1 file changed, 29 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 66e3615..7fcef19 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -70,27 +70,32 @@
> > #define F_MISS_FIFO_ERR_INT_EN BIT(6)
> > #define F_INT_CLR_BIT BIT(12)
> >
> > -#define REG_MMU_INT_MAIN_CONTROL 0x124
> > -#define F_INT_TRANSLATION_FAULT BIT(0)
> > -#define F_INT_MAIN_MULTI_HIT_FAULT BIT(1)
> > -#define F_INT_INVALID_PA_FAULT BIT(2)
> > -#define F_INT_ENTRY_REPLACEMENT_FAULT BIT(3)
> > -#define F_INT_TLB_MISS_FAULT BIT(4)
> > -#define F_INT_MISS_TRANSACTION_FIFO_FAULT BIT(5)
> > -#define F_INT_PRETETCH_TRANSATION_FIFO_FAULT BIT(6)
> > +#define REG_MMU_INT_MAIN_CONTROL 0x124 /* mmu0 | mmu1 */
>
> The comment being on that line is kind of weird, since the comment
> really applies to the lines below it. Maybe the comment should be on
> its own line, or on the TRANSLATION_FAULT line.
Sharp eye. You are right, this comment applies to the lines below.
But if I move it below, the next line will be over 80 chars.
How about I add a "below:" like this:
> +#define REG_MMU_INT_MAIN_CONTROL 0x124 /* below: mmu0 | mmu1 */
>
> Other than that,
> Reviewed-by: Evan Green <[email protected]>
On Wed, 2019-01-30 at 11:05 -0800, Evan Green wrote:
> On Mon, Dec 31, 2018 at 7:59 PM Yong Wu <[email protected]> wrote:
> >
> > This patch only move the clk_prepare_enable and config_port into the
> > runtime suspend/resume callback. It doesn't change the code content
> > and sequence.
> >
> > This is a preparing patch for adjusting SMI_BUS_SEL for mt8183.
> > (SMI_BUS_SEL need to be restored after smi-common resume every time.)
> > Also it gives a chance to get rid of mtk_smi_larb_get/put which could
> > be a next topic.
> >
> > CC: Matthias Brugger <[email protected]>
> > Signed-off-by: Yong Wu <[email protected]>
>
> I believe this refactoring is a no-op as described, because the order is still:
> 1) mtk_smi_clk_enable(common)
> 2) mtk_smi_clk_enable(larb)
> 3) larb_gen->config_port()
>
> And teardown still happens in the opposite order, except for
> config_port, which they seem not to do in suspend.
Thanks for confirming.
After suspend, the HW engine is not expected to be working, so
config_port would not help there. At that point the HW default values
are used.
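For illustration only (this is not code from the patch), the ordering discussed here can be modeled as a small trace. The enum, struct and function names are hypothetical; they only record the sequence described in the review: smi-common clocks, then larb clocks, then config_port on resume, and the reverse on suspend without config_port.

```c
#include <stddef.h>

/* Hypothetical step labels; they mirror the order listed in the review. */
enum smi_step {
	COMMON_CLK_ON,
	LARB_CLK_ON,
	CONFIG_PORT,
	LARB_CLK_OFF,
	COMMON_CLK_OFF,
};

struct smi_trace {
	enum smi_step steps[8];
	size_t n;
};

/* Append one step to the trace, ignoring it if the trace is full. */
static void record(struct smi_trace *t, enum smi_step s)
{
	if (t->n < sizeof(t->steps) / sizeof(t->steps[0]))
		t->steps[t->n++] = s;
}

/* Resume: power up smi-common first, then the larb, then configure ports. */
void model_larb_resume(struct smi_trace *t)
{
	record(t, COMMON_CLK_ON);
	record(t, LARB_CLK_ON);
	record(t, CONFIG_PORT);
}

/* Suspend: reverse order, with no counterpart to config_port. */
void model_larb_suspend(struct smi_trace *t)
{
	record(t, LARB_CLK_OFF);
	record(t, COMMON_CLK_OFF);
}
```

The asymmetry is intentional in this model: nothing undoes config_port, matching the statement that the HW default values apply after power off.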
> So, looks good to me.
>
> Reviewed-by: Evan Green <[email protected]>
Thanks.
>
> > ---
> > drivers/memory/mtk-smi.c | 113 ++++++++++++++++++++++++++++++-----------------
> > 1 file changed, 72 insertions(+), 41 deletions(-)
> >
> > diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
> > index a430721..9790801 100644
> > --- a/drivers/memory/mtk-smi.c
> > +++ b/drivers/memory/mtk-smi.c
> > @@ -86,17 +86,13 @@ struct mtk_smi_larb { /* larb: local arbiter */
> > u32 *mmu;
> > };
> >
> > -static int mtk_smi_enable(const struct mtk_smi *smi)
> > +static int mtk_smi_clk_enable(const struct mtk_smi *smi)
> > {
> > int ret;
> >
> > - ret = pm_runtime_get_sync(smi->dev);
> > - if (ret < 0)
> > - return ret;
> > -
> > ret = clk_prepare_enable(smi->clk_apb);
> > if (ret)
> > - goto err_put_pm;
> > + return ret;
> >
> > ret = clk_prepare_enable(smi->clk_smi);
> > if (ret)
> > @@ -118,59 +114,28 @@ static int mtk_smi_enable(const struct mtk_smi *smi)
> > clk_disable_unprepare(smi->clk_smi);
> > err_disable_apb:
> > clk_disable_unprepare(smi->clk_apb);
> > -err_put_pm:
> > - pm_runtime_put_sync(smi->dev);
> > return ret;
> > }
> >
> > -static void mtk_smi_disable(const struct mtk_smi *smi)
> > +static void mtk_smi_clk_disable(const struct mtk_smi *smi)
> > {
> > clk_disable_unprepare(smi->clk_gals1);
> > clk_disable_unprepare(smi->clk_gals0);
> > clk_disable_unprepare(smi->clk_smi);
> > clk_disable_unprepare(smi->clk_apb);
> > - pm_runtime_put_sync(smi->dev);
> > }
> >
> > int mtk_smi_larb_get(struct device *larbdev)
> > {
> > - struct mtk_smi_larb *larb = dev_get_drvdata(larbdev);
> > - const struct mtk_smi_larb_gen *larb_gen = larb->larb_gen;
> > - struct mtk_smi *common = dev_get_drvdata(larb->smi_common_dev);
> > - int ret;
> > + int ret = pm_runtime_get_sync(larbdev);
> >
> > - /* Enable the smi-common's power and clocks */
> > - ret = mtk_smi_enable(common);
> > - if (ret)
> > - return ret;
> > -
> > - /* Enable the larb's power and clocks */
> > - ret = mtk_smi_enable(&larb->smi);
> > - if (ret) {
> > - mtk_smi_disable(common);
> > - return ret;
> > - }
> > -
> > - /* Configure the iommu info for this larb */
> > - larb_gen->config_port(larbdev);
> > -
> > - return 0;
> > + return (ret < 0) ? ret : 0;
> > }
> > EXPORT_SYMBOL_GPL(mtk_smi_larb_get);
> >
> > void mtk_smi_larb_put(struct device *larbdev)
> > {
> > - struct mtk_smi_larb *larb = dev_get_drvdata(larbdev);
> > - struct mtk_smi *common = dev_get_drvdata(larb->smi_common_dev);
> > -
> > - /*
> > - * Don't de-configure the iommu info for this larb since there may be
> > - * several modules in this larb.
> > - * The iommu info will be reset after power off.
> > - */
> > -
> > - mtk_smi_disable(&larb->smi);
> > - mtk_smi_disable(common);
> > + pm_runtime_put_sync(larbdev);
> > }
> > EXPORT_SYMBOL_GPL(mtk_smi_larb_put);
> >
> > @@ -385,12 +350,52 @@ static int mtk_smi_larb_remove(struct platform_device *pdev)
> > return 0;
> > }
> >
> > +static int __maybe_unused mtk_smi_larb_resume(struct device *dev)
> > +{
> > + struct mtk_smi_larb *larb = dev_get_drvdata(dev);
> > + const struct mtk_smi_larb_gen *larb_gen = larb->larb_gen;
> > + int ret;
> > +
> > + /* Power on smi-common. */
> > + ret = pm_runtime_get_sync(larb->smi_common_dev);
> > + if (ret < 0) {
> > + dev_err(dev, "Failed to pm get for smi-common(%d).\n", ret);
> > + return ret;
> > + }
> > +
> > + ret = mtk_smi_clk_enable(&larb->smi);
> > + if (ret < 0) {
> > + dev_err(dev, "Failed to enable clock(%d).\n", ret);
> > + pm_runtime_put_sync(larb->smi_common_dev);
> > + return ret;
> > + }
> > +
> > + /* Configure the basic setting for this larb */
> > + larb_gen->config_port(dev);
> > +
> > + return 0;
> > +}
> > +
> > +static int __maybe_unused mtk_smi_larb_suspend(struct device *dev)
> > +{
> > + struct mtk_smi_larb *larb = dev_get_drvdata(dev);
> > +
> > + mtk_smi_clk_disable(&larb->smi);
> > + pm_runtime_put_sync(larb->smi_common_dev);
> > + return 0;
> > +}
> > +
> > +static const struct dev_pm_ops smi_larb_pm_ops = {
> > + SET_RUNTIME_PM_OPS(mtk_smi_larb_suspend, mtk_smi_larb_resume, NULL)
> > +};
> > +
> > static struct platform_driver mtk_smi_larb_driver = {
> > .probe = mtk_smi_larb_probe,
> > .remove = mtk_smi_larb_remove,
> > .driver = {
> > .name = "mtk-smi-larb",
> > .of_match_table = mtk_smi_larb_of_ids,
> > + .pm = &smi_larb_pm_ops,
> > }
> > };
> >
> > @@ -489,12 +494,38 @@ static int mtk_smi_common_remove(struct platform_device *pdev)
> > return 0;
> > }
> >
> > +static int __maybe_unused mtk_smi_common_resume(struct device *dev)
> > +{
> > + struct mtk_smi *common = dev_get_drvdata(dev);
> > + int ret;
> > +
> > + ret = mtk_smi_clk_enable(common);
> > + if (ret) {
> > + dev_err(common->dev, "Failed to enable clock(%d).\n", ret);
> > + return ret;
> > + }
> > + return 0;
> > +}
> > +
> > +static int __maybe_unused mtk_smi_common_suspend(struct device *dev)
> > +{
> > + struct mtk_smi *common = dev_get_drvdata(dev);
> > +
> > + mtk_smi_clk_disable(common);
> > + return 0;
> > +}
> > +
> > +static const struct dev_pm_ops smi_common_pm_ops = {
> > + SET_RUNTIME_PM_OPS(mtk_smi_common_suspend, mtk_smi_common_resume, NULL)
> > +};
> > +
> > static struct platform_driver mtk_smi_common_driver = {
> > .probe = mtk_smi_common_probe,
> > .remove = mtk_smi_common_remove,
> > .driver = {
> > .name = "mtk-smi-common",
> > .of_match_table = mtk_smi_common_of_ids,
> > + .pm = &smi_common_pm_ops,
> > }
> > };
> >
> > --
> > 1.9.1
> >
On Wed, 2019-01-30 at 10:28 -0800, Evan Green wrote:
> On Mon, Dec 31, 2018 at 7:57 PM Yong Wu <[email protected]> wrote:
> >
> > MediaTek extend the arm v7s descriptor to support the dram over 4GB.
> >
> > In the mt2712 and mt8173, it's called "4GB mode", the physical address
> > is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> > is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> > bit32 is always enabled. thus, in the M4U, we always enable the bit9
> > for all PTEs which means to enable bit32 of physical address.
>
> I got a little lost here. I get that you're trying to explain why you
> always used to set bit32 of the physical address. But I don't totally
> get the part about physical addresses being from 0x4000_0000 -
> 0x1_3fff_ffff, but also from 0x1_0000_0000-0x1_ffff_ffff. Are you
> saying that the physical addresses from the iommu's perspective were
> always >0x1_0000_0000?
Yes. From the IOMMU's perspective, the physical address range is
0x1_0000_0000 to 0x1_ffff_ffff.
> But then from whose perspective is it 0x4000_0000? ...
I guess that is from the SW point of view, where it is from 0x4000_0000
to 0x1_3fff_ffff.
If 4GB mode is enabled, the memory property in the dts looks like this:
memory@40000000 {
	device_type = "memory";
	reg = <0 0x40000000 0x00000001 0x00000000>;
};
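As a rough cross-check of the numbers, the four "reg" cells can be decoded by hand. The sketch below assumes #address-cells = <2> and #size-cells = <2> (which the four-cell property suggests but the thread does not state); the struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Illustrative container for one decoded "reg" entry. */
struct mem_range {
	uint64_t base;
	uint64_t size;
};

/* Assumes two address cells followed by two size cells, high cell first. */
struct mem_range decode_reg(const uint32_t cells[4])
{
	struct mem_range r;

	r.base = ((uint64_t)cells[0] << 32) | cells[1];
	r.size = ((uint64_t)cells[2] << 32) | cells[3];
	return r;
}

/* Last byte address covered by the range. */
uint64_t mem_range_last(struct mem_range r)
{
	return r.base + r.size - 1;
}
```

With the cells from the property above, this gives a base of 0x4000_0000, a size of 0x1_0000_0000 (4GB) and a last address of 0x1_3fff_ffff, which is the SW-visible range mentioned in the reply.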
> oh, or you're saying there was some sort of remapping
> facility that moved the physical addresses around?
>
> >
> > but in mt8183, M4U support the dram from 0x4000_0000 to 0x3_ffff_ffff
> > which isn't remaped. We extend the PTEs: the bit9 represent bit32 of
> > PA and the bit4 represent bit33 of PA. Meanwhile the iova still is
> > 32bits.
> >
> > In order to unify code, in the "4GB mode", we add the bit32 for the
> > physical address manually in our driver.
> >
> > Correspondingly, Adding bit32 and bit33 for the PA in the iova_to_phys
> > has to been moved into v7s.
> >
> > Regarding whether the pagetable address could be over 4GB, the mt8183
> > support it while the previous mt8173 don't. thus keep it as is.
> >
> > Signed-off-by: Yong Wu <[email protected]>
> > Reviewed-by: Robin Murphy <[email protected]>
> > ---
> > drivers/iommu/io-pgtable-arm-v7s.c | 31 ++++++++++++++++++++++++-------
> > drivers/iommu/io-pgtable.h | 7 +++----
> > drivers/iommu/mtk_iommu.c | 14 ++++++++------
> > drivers/iommu/mtk_iommu.h | 1 +
> > 4 files changed, 36 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> > index 11d8505..8803a35 100644
> > --- a/drivers/iommu/io-pgtable-arm-v7s.c
> > +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> > @@ -124,7 +124,9 @@
> > #define ARM_V7S_TEX_MASK 0x7
> > #define ARM_V7S_ATTR_TEX(val) (((val) & ARM_V7S_TEX_MASK) << ARM_V7S_TEX_SHIFT)
> >
> > -#define ARM_V7S_ATTR_MTK_4GB BIT(9) /* MTK extend it for 4GB mode */
> > +/* MediaTek extend the two bits below for over 4GB mode */
> > +#define ARM_V7S_ATTR_MTK_PA_BIT32 BIT(9)
> > +#define ARM_V7S_ATTR_MTK_PA_BIT33 BIT(4)
>
> If other vendors start doing stuff like this we'll need a more generic
> way to handle this... but I guess until we see a pattern this is okay.
>
> >
> > /* *well, except for TEX on level 2 large pages, of course :( */
> > #define ARM_V7S_CONT_PAGE_TEX_SHIFT 6
> > @@ -183,13 +185,22 @@ static dma_addr_t __arm_v7s_dma_addr(void *pages)
> > static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
> > struct io_pgtable_cfg *cfg)
> > {
> > - return paddr & ARM_V7S_LVL_MASK(lvl);
> > + arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
> > +
> > + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> > + if (paddr & BIT_ULL(32))
> > + pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
> > + if (paddr & BIT_ULL(33))
> > + pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
> > + }
> > + return pte;
> > }
> >
> > static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> > struct io_pgtable_cfg *cfg)
> > {
> > arm_v7s_iopte mask;
> > + phys_addr_t paddr;
> >
> > if (ARM_V7S_PTE_IS_TABLE(pte, lvl))
> > mask = ARM_V7S_TABLE_MASK;
> > @@ -198,7 +209,14 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> > else
> > mask = ARM_V7S_LVL_MASK(lvl);
> >
> > - return pte & mask;
> > + paddr = pte & mask;
> > + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> > + if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> > + paddr |= BIT_ULL(32);
> > + if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
> > + paddr |= BIT_ULL(33);
> > + }
> > + return paddr;
> > }
> >
> > static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl,
> > @@ -315,9 +333,6 @@ static arm_v7s_iopte arm_v7s_prot_to_pte(int prot, int lvl,
> > if (lvl == 1 && (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS))
> > pte |= ARM_V7S_ATTR_NS_SECTION;
> >
> > - if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)
> > - pte |= ARM_V7S_ATTR_MTK_4GB;
> > -
>
> So despite getting lost in the details, I guess the reason it's okay
> that this goes from unconditional to conditional on bit32 is that
> before, with the older chips, all physical addresses were above 4GB,
> so we'll always see PA's above 4GB?
>
> > return pte;
> > }
> >
> > @@ -504,7 +519,9 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
> > if (!(prot & (IOMMU_READ | IOMMU_WRITE)))
> > return 0;
> >
> > - if (WARN_ON(upper_32_bits(iova) || upper_32_bits(paddr)))
> > + if (WARN_ON(upper_32_bits(iova)) ||
> > + WARN_ON(upper_32_bits(paddr) &&
> > + !(iop->cfg.quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)))
> > return -ERANGE;
> >
> > ret = __arm_v7s_map(data, iova, paddr, size, prot, 1, data->pgd);
> > diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
> > index 47d5ae5..69db115 100644
> > --- a/drivers/iommu/io-pgtable.h
> > +++ b/drivers/iommu/io-pgtable.h
> > @@ -62,10 +62,9 @@ struct io_pgtable_cfg {
> > * (unmapped) entries but the hardware might do so anyway, perform
> > * TLB maintenance when mapping as well as when unmapping.
> > *
> > - * IO_PGTABLE_QUIRK_ARM_MTK_4GB: (ARM v7s format) Set bit 9 in all
> > - * PTEs, for Mediatek IOMMUs which treat it as a 33rd address bit
> > - * when the SoC is in "4GB mode" and they can only access the high
> > - * remap of DRAM (0x1_00000000 to 0x1_ffffffff).
> > + * IO_PGTABLE_QUIRK_ARM_MTK_4GB: (ARM v7s format) MediaTek IOMMUs extend
> > + * to support up to 34 bits PA where the bit32 and bit33 are
> > + * encoded in the bit9 and bit4 of the PTE respectively.
> > *
> > * IO_PGTABLE_QUIRK_NO_DMA: Guarantees that the tables will only ever
> > * be accessed by a fully cache-coherent IOMMU or CPU (e.g. for a
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 189d1b5..ae1aa5a 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -367,12 +367,16 @@ static int mtk_iommu_map(struct iommu_domain *domain, unsigned long iova,
> > phys_addr_t paddr, size_t size, int prot)
> > {
> > struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> > + struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
> > unsigned long flags;
> > int ret;
> >
> > + /* The "4GB mode" M4U physically can not use the lower remap of Dram. */
> > + if (data->plat_data->has_4gb_mode && data->enable_4GB)
> > + paddr |= BIT_ULL(32);
> > +
>
> Ok here's where I get lost. How is this okay? Is the same physical RAM
> accessible at multiple locations in the physical address space? Won't
> this map an iova to a different pa than the one requested?
In 4GB mode, the HW remaps 0x4000_0000-0x1_3fff_ffff to
0x1_0000_0000-0x1_ffff_ffff. The M4U helps the multimedia HW access
DRAM, so from the M4U's point of view the DRAM is always at
0x1_0000_0000 to 0x1_ffff_ffff.
The detailed mapping relationship is like this:
0x4000_0000   - 0xffff_ffff   maps to 0x1_4000_0000 - 0x1_ffff_ffff.
0x1_0000_0000 - 0x1_3fff_ffff maps to 0x1_0000_0000 - 0x1_3fff_ffff.
Thus, in 4GB mode we can simply add bit32 to the PA.
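A minimal sketch of that mapping, assuming it really is nothing more than forcing bit32 as described in this reply. The helper name is hypothetical and does not exist in the driver.

```c
#include <stdint.h>

#define PA_BIT32 (UINT64_C(1) << 32)

/*
 * Hypothetical helper: returns the address the M4U would use in "4GB mode"
 * for a given SW-visible physical address.
 *
 * 0x4000_0000 - 0xffff_ffff gains bit32; 0x1_0000_0000 - 0x1_3fff_ffff
 * already has bit32 set, so the OR leaves it unchanged.
 */
uint64_t m4u_view_of_pa(uint64_t cpu_pa)
{
	return cpu_pa | PA_BIT32;
}
```

Because both halves of the SW-visible range end up correct under a plain OR, no range check is needed in this model.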
>
> Also, you could have rolled the has_4gb_mode check into whether or not
> you set enable_4GB. Then you're doing the check for has_4gb_mode once,
> rather than on every map call.
"has_4gb_mode" means this SoC supports 4GB mode.
"enable_4GB" indicates whether the current DRAM size is 4GB.
> -Evan
On Wed, Jan 30, 2019 at 7:20 PM Yong Wu <[email protected]> wrote:
>
> On Wed, 2019-01-30 at 10:30 -0800, Evan Green wrote:
> > On Mon, Dec 31, 2018 at 7:58 PM Yong Wu <[email protected]> wrote:
> > >
> > > Both mt8173 and mt8183 don't have this vld_pa_rng(valid physical address
> > > range) register while mt2712 have. Move it into the plat_data.
> > >
> > > Signed-off-by: Yong Wu <[email protected]>
> > > ---
> > > drivers/iommu/mtk_iommu.c | 3 ++-
> > > drivers/iommu/mtk_iommu.h | 1 +
> > > 2 files changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > > index 8d8ab21..2913ddb 100644
> > > --- a/drivers/iommu/mtk_iommu.c
> > > +++ b/drivers/iommu/mtk_iommu.c
> > > @@ -548,7 +548,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> > > upper_32_bits(data->protect_base);
> > > writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
> > >
> > > - if (data->enable_4GB && data->plat_data->m4u_plat != M4U_MT8173) {
> > > + if (data->enable_4GB && data->plat_data->vld_pa_rng) {
> > > /*
> > > * If 4GB mode is enabled, the validate PA range is from
> > > * 0x1_0000_0000 to 0x1_ffff_ffff. here record bit[32:30].
> > > @@ -741,6 +741,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
> > > .m4u_plat = M4U_MT2712,
> > > .has_4gb_mode = true,
> > > .has_bclk = true,
> > > + .vld_pa_rng = true,
> > > .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
> > > };
> > >
> > > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> > > index b46aeaa..a8c5d1e 100644
> > > --- a/drivers/iommu/mtk_iommu.h
> > > +++ b/drivers/iommu/mtk_iommu.h
> > > @@ -48,6 +48,7 @@ struct mtk_iommu_plat_data {
> > > /* HW will use the EMI clock if there isn't the "bclk". */
> > > bool has_bclk;
> > > bool reset_axi;
> > > + bool vld_pa_rng;
> >
> > I agree with Nicolas that valid_pa_range would be much clearer...
> > although, even now that I know what it's supposed to mean, I don't get
> > what it represents. What is this saying?
>
> This register in the coda is called "vld_pa_rng".
>
> How about I change it to "has_vld_pa_rng"?. In the comment above, I have
> explained the meaning(valid physical address range).
>
Ok, that sounds fine.
-Evan
On Wed, Jan 30, 2019 at 7:22 PM Yong Wu <[email protected]> wrote:
>
> On Wed, 2019-01-30 at 11:11 -0800, Evan Green wrote:
> > On Mon, Dec 31, 2018 at 7:59 PM Yong Wu <[email protected]> wrote:
> > >
> > > The "mediatek,larb-id" has already been parsed in MTK IOMMU driver.
> > > It's no need to parse it again in SMI driver. Only clean some codes.
> > > This patch is fit for all the current mt2701, mt2712, mt7623, mt8173
> > > and mt8183.
> > >
> > > After this patch, the "mediatek,larb-id" only be needed for mt2712
> > > which have 2 M4Us. In the other SoCs, we can get the larb-id from M4U
> > > in which the larbs in the "mediatek,larbs" always are ordered.
> > >
> > > CC: Matthias Brugger <[email protected]>
> > > Signed-off-by: Yong Wu <[email protected]>
> > > ---
> > > drivers/memory/mtk-smi.c | 26 ++------------------------
> > > 1 file changed, 2 insertions(+), 24 deletions(-)
> > >
> > > diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
> > > index 08cf40d..10e6493 100644
> > > --- a/drivers/memory/mtk-smi.c
> > > +++ b/drivers/memory/mtk-smi.c
> > > @@ -67,7 +67,6 @@ struct mtk_smi_common_plat {
> > > };
> > >
> > > struct mtk_smi_larb_gen {
> > > - bool need_larbid;
> > > int port_in_larb[MTK_LARB_NR_MAX + 1];
> > > void (*config_port)(struct device *);
> > > unsigned int larb_direct_to_common_mask;
> > > @@ -153,18 +152,9 @@ void mtk_smi_larb_put(struct device *larbdev)
> > > struct mtk_smi_iommu *smi_iommu = data;
> > > unsigned int i;
> > >
> > > - if (larb->larb_gen->need_larbid) {
> > > - larb->mmu = &smi_iommu->larb_imu[larb->larbid].mmu;
> > > - return 0;
> > > - }
> > > -
> > > - /*
> > > - * If there is no larbid property, Loop to find the corresponding
> > > - * iommu information.
> > > - */
> > > - for (i = 0; i < smi_iommu->larb_nr; i++) {
> > > + for (i = 0; i < MTK_LARB_NR_MAX; i++) {
> >
> > Looks like this was the only use of mtk_smi_iommu.larb_nr. Should we
> > remove that now?
>
> This is necessary since the mt2712 which has two M4U HW.
>
> From its dtsi[1], take iommu1 as a example, its larb_nr only is 3, but
> we need scan all the larb.
>
> [1]
> http://lists.infradead.org/pipermail/linux-mediatek/2018-December/016119.html
I'm not sure I follow. My point was that this structure member is only
ever set but never read:
$ git grep '[.>]larb_nr'
drivers/iommu/mtk_iommu.c: data->smi_imu.larb_nr = larb_nr;
drivers/iommu/mtk_iommu_v1.c: data->smi_imu.larb_nr = larb_nr;
Maybe I've applied the patches to the wrong tree, and there is a use
of this member I'm not seeing?
-Evan
On Wed, Jan 30, 2019 at 10:59 PM Yong Wu <[email protected]> wrote:
>
> On Wed, 2019-01-30 at 10:28 -0800, Evan Green wrote:
> > On Mon, Dec 31, 2018 at 7:57 PM Yong Wu <[email protected]> wrote:
> > >
> > > MediaTek extend the arm v7s descriptor to support the dram over 4GB.
> > >
> > > In the mt2712 and mt8173, it's called "4GB mode", the physical address
> > > is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> > > is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> > > bit32 is always enabled. thus, in the M4U, we always enable the bit9
> > > for all PTEs which means to enable bit32 of physical address.
> >
> > I got a little lost here. I get that you're trying to explain why you
> > always used to set bit32 of the physical address. But I don't totally
> > get the part about physical addresses being from 0x4000_0000 -
> > 0x1_3fff_ffff, but also from 0x1_0000_0000-0x1_ffff_ffff. Are you
> > saying that the physical addresses from the iommu's perspective were
> > always >0x1_0000_0000?
>
> Yes. From the IOMMU's perspective, the Physical address is from
> 0x1_0000_0000 to 0x1_ffff_ffff.
>
> > But then from whose perspective is it 0x4000_0000? ...
>
> I guess from SW point view. it is from 0x4000_0000 to 0x1_3fff_ffff.
>
> If 4GB mode is enabled, the memory property in dts like this:
>
> memory@40000000 {
> device_type = "memory";
> reg = <0 0x40000000 0x00000001 0x00000000>;
> };
>
> > oh, or you're saying there was some sort of remapping
> > facility that moved the physical addresses around?
> >
> > >
> > > but in mt8183, M4U support the dram from 0x4000_0000 to 0x3_ffff_ffff
> > > which isn't remaped. We extend the PTEs: the bit9 represent bit32 of
> > > PA and the bit4 represent bit33 of PA. Meanwhile the iova still is
> > > 32bits.
> > >
> > > In order to unify code, in the "4GB mode", we add the bit32 for the
> > > physical address manually in our driver.
> > >
> > > Correspondingly, Adding bit32 and bit33 for the PA in the iova_to_phys
> > > has to been moved into v7s.
> > >
> > > Regarding whether the pagetable address could be over 4GB, the mt8183
> > > support it while the previous mt8173 don't. thus keep it as is.
> > >
> > > Signed-off-by: Yong Wu <[email protected]>
> > > Reviewed-by: Robin Murphy <[email protected]>
> > > ---
> > > drivers/iommu/io-pgtable-arm-v7s.c | 31 ++++++++++++++++++++++++-------
> > > drivers/iommu/io-pgtable.h | 7 +++----
> > > drivers/iommu/mtk_iommu.c | 14 ++++++++------
> > > drivers/iommu/mtk_iommu.h | 1 +
> > > 4 files changed, 36 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> > > index 11d8505..8803a35 100644
> > > --- a/drivers/iommu/io-pgtable-arm-v7s.c
> > > +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> > > @@ -124,7 +124,9 @@
> > > #define ARM_V7S_TEX_MASK 0x7
> > > #define ARM_V7S_ATTR_TEX(val) (((val) & ARM_V7S_TEX_MASK) << ARM_V7S_TEX_SHIFT)
> > >
> > > -#define ARM_V7S_ATTR_MTK_4GB BIT(9) /* MTK extend it for 4GB mode */
> > > +/* MediaTek extend the two bits below for over 4GB mode */
> > > +#define ARM_V7S_ATTR_MTK_PA_BIT32 BIT(9)
> > > +#define ARM_V7S_ATTR_MTK_PA_BIT33 BIT(4)
> >
> > If other vendors start doing stuff like this we'll need a more generic
> > way to handle this... but I guess until we see a pattern this is okay.
> >
> > >
> > > /* *well, except for TEX on level 2 large pages, of course :( */
> > > #define ARM_V7S_CONT_PAGE_TEX_SHIFT 6
> > > @@ -183,13 +185,22 @@ static dma_addr_t __arm_v7s_dma_addr(void *pages)
> > > static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
> > > struct io_pgtable_cfg *cfg)
> > > {
> > > - return paddr & ARM_V7S_LVL_MASK(lvl);
> > > + arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
> > > +
> > > + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> > > + if (paddr & BIT_ULL(32))
> > > + pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
> > > + if (paddr & BIT_ULL(33))
> > > + pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
> > > + }
> > > + return pte;
> > > }
> > >
> > > static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> > > struct io_pgtable_cfg *cfg)
> > > {
> > > arm_v7s_iopte mask;
> > > + phys_addr_t paddr;
> > >
> > > if (ARM_V7S_PTE_IS_TABLE(pte, lvl))
> > > mask = ARM_V7S_TABLE_MASK;
> > > @@ -198,7 +209,14 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> > > else
> > > mask = ARM_V7S_LVL_MASK(lvl);
> > >
> > > - return pte & mask;
> > > + paddr = pte & mask;
> > > + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> > > + if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> > > + paddr |= BIT_ULL(32);
> > > + if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
> > > + paddr |= BIT_ULL(33);
> > > + }
> > > + return paddr;
> > > }
> > >
> > > static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl,
> > > @@ -315,9 +333,6 @@ static arm_v7s_iopte arm_v7s_prot_to_pte(int prot, int lvl,
> > > if (lvl == 1 && (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS))
> > > pte |= ARM_V7S_ATTR_NS_SECTION;
> > >
> > > - if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)
> > > - pte |= ARM_V7S_ATTR_MTK_4GB;
> > > -
> >
> > So despite getting lost in the details, I guess the reason it's okay
> > that this goes from unconditional to conditional on bit32 is that
> > before, with the older chips, all physical addresses were above 4GB,
> > so we'll always see PA's above 4GB?
> >
> > > return pte;
> > > }
> > >
> > > @@ -504,7 +519,9 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
> > > if (!(prot & (IOMMU_READ | IOMMU_WRITE)))
> > > return 0;
> > >
> > > - if (WARN_ON(upper_32_bits(iova) || upper_32_bits(paddr)))
> > > + if (WARN_ON(upper_32_bits(iova)) ||
> > > + WARN_ON(upper_32_bits(paddr) &&
> > > + !(iop->cfg.quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)))
> > > return -ERANGE;
> > >
> > > ret = __arm_v7s_map(data, iova, paddr, size, prot, 1, data->pgd);
> > > diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
> > > index 47d5ae5..69db115 100644
> > > --- a/drivers/iommu/io-pgtable.h
> > > +++ b/drivers/iommu/io-pgtable.h
> > > @@ -62,10 +62,9 @@ struct io_pgtable_cfg {
> > > * (unmapped) entries but the hardware might do so anyway, perform
> > > * TLB maintenance when mapping as well as when unmapping.
> > > *
> > > - * IO_PGTABLE_QUIRK_ARM_MTK_4GB: (ARM v7s format) Set bit 9 in all
> > > - * PTEs, for Mediatek IOMMUs which treat it as a 33rd address bit
> > > - * when the SoC is in "4GB mode" and they can only access the high
> > > - * remap of DRAM (0x1_00000000 to 0x1_ffffffff).
> > > + * IO_PGTABLE_QUIRK_ARM_MTK_4GB: (ARM v7s format) MediaTek IOMMUs extend
> > > + * to support up to 34 bits PA where the bit32 and bit33 are
> > > + * encoded in the bit9 and bit4 of the PTE respectively.
> > > *
> > > * IO_PGTABLE_QUIRK_NO_DMA: Guarantees that the tables will only ever
> > > * be accessed by a fully cache-coherent IOMMU or CPU (e.g. for a
> > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > > index 189d1b5..ae1aa5a 100644
> > > --- a/drivers/iommu/mtk_iommu.c
> > > +++ b/drivers/iommu/mtk_iommu.c
> > > @@ -367,12 +367,16 @@ static int mtk_iommu_map(struct iommu_domain *domain, unsigned long iova,
> > > phys_addr_t paddr, size_t size, int prot)
> > > {
> > > struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> > > + struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
> > > unsigned long flags;
> > > int ret;
> > >
> > > + /* The "4GB mode" M4U physically can not use the lower remap of Dram. */
> > > + if (data->plat_data->has_4gb_mode && data->enable_4GB)
> > > + paddr |= BIT_ULL(32);
> > > +
> >
> > Ok here's where I get lost. How is this okay? Is the same physical RAM
> > accessible at multiple locations in the physical address space? Won't
> > this map an iova to a different pa than the one requested?
>
> In 4GB mode, HW will remap 0x4000_0000-0x1_3fff_ffff to 0x1_0000_0000-
> 0x1_ffff_ffff. M4U help multimedia HW access dram, thus from M4U point
> of view, the dram always is 0x1_0000_0000 to 0x1_ffff_ffff.
>
> The detailed mapping relationship is like this:
>
> 0x4000_0000 -0xffff_ffff map to 0x1_4000_0000 - 0x1_ffff_ffff.
> 0x1_0000_0000-0x1_3fff_ffff map to 0x1_0000_0000 - 0x1_3fff_ffff.
>
> Thus, we can only add bit32 for the PA in the 4GB mode.
Ok, I think I get it now. This thread really helped:
https://patchwork.kernel.org/patch/8402211/
So from what I understand basically the same DRAM exists in two places:
0000_0000 - ffff_ffff, and is also available in
1_0000_0000 - 1_ffff_ffff
...except that the peripherals are located in 0000_0000 - 3fff_ffff,
so that first GB of RAM is not visible at the lower address. I'm
gathering this was in fact the motivation for 4GB mode. The important
part is that address 4000_0000 == 1_4000_0000.
Then there was also some quirk of the IOMMU where it refused to access
addresses below 4GB. But those same addresses are accessible by ORing
in bit 32, so you just always do that and you're good to go.
Ok so now I can use that to understand this refactoring:
* You used to always return an address above 4GB in
mtk_iommu_iova_to_phys. I don't fully get how that worked, since it
seems like you'd start returning PAs to the rest of the system that
were outside of the range 4000_0000 - 1_3fff_ffff, but okay, you're no
longer doing that there, so I won't worry about it.
* Now, if you're in the 4GB mode, you just slam the bit in the PTE in
mtk_iommu_map, which seems like the right thing to do.
* The general functions in io-pgtable-arm-v7s.c now carefully reflect
bits 32 & 33 in the PTE, since the new IOMMUs don't have the weird
restriction of staying above 4GB, and there's not this weird 4GB
aliasing mode going on (which I think would be a clearer name for this
feature: has_4gb_alias).
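To make the PTE encoding concrete, here is a simplified stand-alone sketch. The bit positions (bit9 for PA bit32, bit4 for PA bit33) come from the quoted patch; the function names and the fixed 4KB-page address mask are assumptions for illustration and are not the kernel's paddr_to_iopte()/iopte_to_paddr().

```c
#include <stdint.h>

/* Bit positions as in the quoted patch. */
#define PTE_PA_BIT32   (UINT32_C(1) << 9)
#define PTE_PA_BIT33   (UINT32_C(1) << 4)
/* Assumed: small (4KB) page, so the PTE keeps PA bits [31:12]. */
#define PAGE_ADDR_MASK UINT32_C(0xfffff000)

/* Fold a PA of up to 34 bits into a 32-bit PTE. */
uint32_t pa_to_pte(uint64_t pa)
{
	uint32_t pte = (uint32_t)pa & PAGE_ADDR_MASK;

	if (pa & (UINT64_C(1) << 32))
		pte |= PTE_PA_BIT32;
	if (pa & (UINT64_C(1) << 33))
		pte |= PTE_PA_BIT33;
	return pte;
}

/* Recover the page-aligned PA from the PTE. */
uint64_t pte_to_pa(uint32_t pte)
{
	uint64_t pa = pte & PAGE_ADDR_MASK;

	if (pte & PTE_PA_BIT32)
		pa |= UINT64_C(1) << 32;
	if (pte & PTE_PA_BIT33)
		pa |= UINT64_C(1) << 33;
	return pa;
}
```

In this model a PA below 4GB leaves both extra bits clear, which is why setting the quirk is harmless when no high addresses are ever mapped.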
>
> >
> > Also, you could have rolled the has_4gb_mode check into whether or not
> > you set enable_4GB. Then you're doing the check for has_4gb_mode once,
> > rather than on every map call.
>
> "has_4gb_mode" means this SoC support 4GB mode.
> "enable_4GB" means whether the current dram size is 4GB.
Right. But your use of the variable as well as its name suggest that
it really means "is 4GB aliasing mode on", not "does the system have
>=4GB of RAM". You could reduce the map function to one conditional if
you treated the variable that way. Then the only things that would
need to change would be:
* Add an extra conditional in probe that would only set enable_4GB if
has_4gb_mode is set.
* in mtk_iommu_domain_finalize, you could just always set the MTK
quirk, since if you have <4GB of RAM, those bits will never get set in
the PTEs anyway.
* I suspect mtk_iommu_hw_init would continue to work as-is, since
everything that has vld_pa_rng also has has_4gb_mode.
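A rough sketch of the first suggestion, using simplified stand-ins for the driver structures. Only the two flags under discussion are modeled; probe_set_4gb(), map_fixup_pa() and the dram_is_4gb parameter are hypothetical names, not existing driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins; the real structures have many more members. */
struct plat_data {
	bool has_4gb_mode;
};

struct iommu_data {
	const struct plat_data *plat_data;
	bool enable_4GB;
};

/* Probe time: fold the SoC capability into the runtime flag once. */
void probe_set_4gb(struct iommu_data *data, bool dram_is_4gb)
{
	data->enable_4GB = data->plat_data->has_4gb_mode && dram_is_4gb;
}

/* Map time: a single conditional is then enough. */
uint64_t map_fixup_pa(const struct iommu_data *data, uint64_t paddr)
{
	if (data->enable_4GB)
		paddr |= UINT64_C(1) << 32;
	return paddr;
}
```

Under this scheme enable_4GB effectively means "4GB aliasing mode is on", so a SoC without has_4gb_mode never gets bit32 forced, regardless of its DRAM size.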
-Evan
On Thu, 2019-01-31 at 11:23 -0800, Evan Green wrote:
> On Wed, Jan 30, 2019 at 10:59 PM Yong Wu <[email protected]> wrote:
> >
> > On Wed, 2019-01-30 at 10:28 -0800, Evan Green wrote:
> > > On Mon, Dec 31, 2018 at 7:57 PM Yong Wu <[email protected]> wrote:
> > > >
> > > > MediaTek extend the arm v7s descriptor to support the dram over 4GB.
> > > >
> > > > In the mt2712 and mt8173, it's called "4GB mode", the physical address
> > > > is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> > > > is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> > > > bit32 is always enabled. thus, in the M4U, we always enable the bit9
> > > > for all PTEs which means to enable bit32 of physical address.
> > >
> > > I got a little lost here. I get that you're trying to explain why you
> > > always used to set bit32 of the physical address. But I don't totally
> > > get the part about physical addresses being from 0x4000_0000 -
> > > 0x1_3fff_ffff, but also from 0x1_0000_0000-0x1_ffff_ffff. Are you
> > > saying that the physical addresses from the iommu's perspective were
> > > always >0x1_0000_0000?
> >
> > Yes. From the IOMMU's perspective, the Physical address is from
> > 0x1_0000_0000 to 0x1_ffff_ffff.
> >
> > > But then from whose perspective is it 0x4000_0000? ...
> >
> > I guess from SW point view. it is from 0x4000_0000 to 0x1_3fff_ffff.
> >
> > If 4GB mode is enabled, the memory property in dts like this:
> >
> > memory@40000000 {
> > device_type = "memory";
> > reg = <0 0x40000000 0x00000001 0x00000000>;
> > };
> >
> > > oh, or you're saying there was some sort of remapping
> > > facility that moved the physical addresses around?
> > >
> > > >
> > > > but in mt8183, M4U support the dram from 0x4000_0000 to 0x3_ffff_ffff
> > > > which isn't remaped. We extend the PTEs: the bit9 represent bit32 of
> > > > PA and the bit4 represent bit33 of PA. Meanwhile the iova still is
> > > > 32bits.
> > > >
> > > > In order to unify code, in the "4GB mode", we add the bit32 for the
> > > > physical address manually in our driver.
> > > >
> > > > Correspondingly, Adding bit32 and bit33 for the PA in the iova_to_phys
> > > > has to been moved into v7s.
> > > >
> > > > Regarding whether the pagetable address could be over 4GB, the mt8183
> > > > support it while the previous mt8173 don't. thus keep it as is.
> > > >
> > > > Signed-off-by: Yong Wu <[email protected]>
> > > > Reviewed-by: Robin Murphy <[email protected]>
> > > > ---
> > > > drivers/iommu/io-pgtable-arm-v7s.c | 31 ++++++++++++++++++++++++-------
> > > > drivers/iommu/io-pgtable.h | 7 +++----
> > > > drivers/iommu/mtk_iommu.c | 14 ++++++++------
> > > > drivers/iommu/mtk_iommu.h | 1 +
> > > > 4 files changed, 36 insertions(+), 17 deletions(-)
> > > >
> > > > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> > > > index 11d8505..8803a35 100644
> > > > --- a/drivers/iommu/io-pgtable-arm-v7s.c
> > > > +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> > > > @@ -124,7 +124,9 @@
> > > > #define ARM_V7S_TEX_MASK 0x7
> > > > #define ARM_V7S_ATTR_TEX(val) (((val) & ARM_V7S_TEX_MASK) << ARM_V7S_TEX_SHIFT)
> > > >
> > > > -#define ARM_V7S_ATTR_MTK_4GB BIT(9) /* MTK extend it for 4GB mode */
> > > > +/* MediaTek extend the two bits below for over 4GB mode */
> > > > +#define ARM_V7S_ATTR_MTK_PA_BIT32 BIT(9)
> > > > +#define ARM_V7S_ATTR_MTK_PA_BIT33 BIT(4)
> > >
> > > If other vendors start doing stuff like this we'll need a more generic
> > > way to handle this... but I guess until we see a pattern this is okay.
> > >
> > > >
> > > > /* *well, except for TEX on level 2 large pages, of course :( */
> > > > #define ARM_V7S_CONT_PAGE_TEX_SHIFT 6
> > > > @@ -183,13 +185,22 @@ static dma_addr_t __arm_v7s_dma_addr(void *pages)
> > > > static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
> > > > struct io_pgtable_cfg *cfg)
> > > > {
> > > > - return paddr & ARM_V7S_LVL_MASK(lvl);
> > > > + arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
> > > > +
> > > > + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> > > > + if (paddr & BIT_ULL(32))
> > > > + pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
> > > > + if (paddr & BIT_ULL(33))
> > > > + pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
> > > > + }
> > > > + return pte;
> > > > }
> > > >
> > > > static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> > > > struct io_pgtable_cfg *cfg)
> > > > {
> > > > arm_v7s_iopte mask;
> > > > + phys_addr_t paddr;
> > > >
> > > > if (ARM_V7S_PTE_IS_TABLE(pte, lvl))
> > > > mask = ARM_V7S_TABLE_MASK;
> > > > @@ -198,7 +209,14 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> > > > else
> > > > mask = ARM_V7S_LVL_MASK(lvl);
> > > >
> > > > - return pte & mask;
> > > > + paddr = pte & mask;
> > > > + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> > > > + if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> > > > + paddr |= BIT_ULL(32);
> > > > + if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
> > > > + paddr |= BIT_ULL(33);
> > > > + }
> > > > + return paddr;
> > > > }
> > > >
> > > > static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl,
> > > > @@ -315,9 +333,6 @@ static arm_v7s_iopte arm_v7s_prot_to_pte(int prot, int lvl,
> > > > if (lvl == 1 && (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS))
> > > > pte |= ARM_V7S_ATTR_NS_SECTION;
> > > >
> > > > - if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)
> > > > - pte |= ARM_V7S_ATTR_MTK_4GB;
> > > > -
> > >
> > > So despite getting lost in the details, I guess the reason it's okay
> > > that this goes from unconditional to conditional on bit32 is that
> > > before, with the older chips, all physical addresses were above 4GB,
> > > so we'll always see PA's above 4GB?
> > >
> > > > return pte;
> > > > }
> > > >
> > > > @@ -504,7 +519,9 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
> > > > if (!(prot & (IOMMU_READ | IOMMU_WRITE)))
> > > > return 0;
> > > >
> > > > - if (WARN_ON(upper_32_bits(iova) || upper_32_bits(paddr)))
> > > > + if (WARN_ON(upper_32_bits(iova)) ||
> > > > + WARN_ON(upper_32_bits(paddr) &&
> > > > + !(iop->cfg.quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)))
> > > > return -ERANGE;
> > > >
> > > > ret = __arm_v7s_map(data, iova, paddr, size, prot, 1, data->pgd);
> > > > diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
> > > > index 47d5ae5..69db115 100644
> > > > --- a/drivers/iommu/io-pgtable.h
> > > > +++ b/drivers/iommu/io-pgtable.h
> > > > @@ -62,10 +62,9 @@ struct io_pgtable_cfg {
> > > > * (unmapped) entries but the hardware might do so anyway, perform
> > > > * TLB maintenance when mapping as well as when unmapping.
> > > > *
> > > > - * IO_PGTABLE_QUIRK_ARM_MTK_4GB: (ARM v7s format) Set bit 9 in all
> > > > - * PTEs, for Mediatek IOMMUs which treat it as a 33rd address bit
> > > > - * when the SoC is in "4GB mode" and they can only access the high
> > > > - * remap of DRAM (0x1_00000000 to 0x1_ffffffff).
> > > > + * IO_PGTABLE_QUIRK_ARM_MTK_4GB: (ARM v7s format) MediaTek IOMMUs extend
> > > > + * to support up to 34 bits PA where the bit32 and bit33 are
> > > > + * encoded in the bit9 and bit4 of the PTE respectively.
> > > > *
> > > > * IO_PGTABLE_QUIRK_NO_DMA: Guarantees that the tables will only ever
> > > > * be accessed by a fully cache-coherent IOMMU or CPU (e.g. for a
> > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > > > index 189d1b5..ae1aa5a 100644
> > > > --- a/drivers/iommu/mtk_iommu.c
> > > > +++ b/drivers/iommu/mtk_iommu.c
> > > > @@ -367,12 +367,16 @@ static int mtk_iommu_map(struct iommu_domain *domain, unsigned long iova,
> > > > phys_addr_t paddr, size_t size, int prot)
> > > > {
> > > > struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> > > > + struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
> > > > unsigned long flags;
> > > > int ret;
> > > >
> > > > + /* The "4GB mode" M4U physically can not use the lower remap of Dram. */
> > > > + if (data->plat_data->has_4gb_mode && data->enable_4GB)
> > > > + paddr |= BIT_ULL(32);
> > > > +
> > >
> > > Ok here's where I get lost. How is this okay? Is the same physical RAM
> > > accessible at multiple locations in the physical address space? Won't
> > > this map an iova to a different pa than the one requested?
> >
> > In 4GB mode, HW will remap 0x4000_0000-0x1_3fff_ffff to 0x1_0000_0000-
> > 0x1_ffff_ffff. M4U help multimedia HW access dram, thus from M4U point
> > of view, the dram always is 0x1_0000_0000 to 0x1_ffff_ffff.
> >
> > The detailed mapping relationship is like this:
> >
> > 0x4000_0000 -0xffff_ffff map to 0x1_4000_0000 - 0x1_ffff_ffff.
> > 0x1_0000_0000-0x1_3fff_ffff map to 0x1_0000_0000 - 0x1_3fff_ffff.
> >
> > Thus, we can only add bit32 for the PA in the 4GB mode.
>
> Ok, I think I get it now. This thread really helped:
> https://patchwork.kernel.org/patch/8402211/
>
> So from what I understand basically the same DRAM exists in two places:
> 0000_0000 - ffff_ffff, and is also available in
> 1_0000_0000 - 1_ffff_ffff
>
> ...except that the peripherals are located in 0000_0000 - 3fff_ffff,
> so that first GB of RAM is not visible at the lower address. I'm
> gathering this was in fact the motivation for 4GB mode. The important
> part is that address 4000_0000 == 1_4000_0000.
>
> Then there was also some quirk of the IOMMU where it refused to access
> addresses below 4GB. But those same addresses are accessible by ORing
> in bit 32, so you just always do that and you're good to go.
>
> Ok so now I can use that to understand this refactoring:
> * You used to always return an address above 4GB in
> mtk_iommu_iova_to_phys. I don't fully get how that worked, since it
> seems like you'd start returning PAs to the rest of the system that
> were outside of the range 4000_0000 - 1_3fff_ffff, but okay, you're no
I'm not sure I follow this. From the SW point of view, the dram is
0x4000_0000 - 0x1_3fff_ffff; there is no memory outside it.
But there really is an issue in mtk_iommu_iova_to_phys in the
4gb_mode.
Currently in the 4gb mode, I always add BIT32 for all the memory, so
the PA returned by mtk_iommu_iova_to_phys (via v7s) is always
from 0x1_0000_0000 to 0x1_ffff_ffff. But the SW still expects the PA
to be from 0x4000_0000 - 0x1_3fff_ffff. Thus, I guess I will add a new
patch like this:
@@ -418,6 +418,7 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct
iommu_domain *domain,
dma_addr_t iova)
{
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
+ struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
unsigned long flags;
phys_addr_t pa;
@@ -425,6 +426,11 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct
iommu_domain *domain,
pa = dom->iop->iova_to_phys(dom->iop, iova);
spin_unlock_irqrestore(&dom->pgtlock, flags);
+ /* Discard bit32 if pa is 0x1_4000_0000 - 0x1_ffff_ffff in 4GB mode. */
+ if (data->plat_data->has_4gb_mode && data->enable_4GB &&
+ pa >= 0x140000000)
+ pa &= ~BIT_ULL(32);
+
return pa;
}
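A minimal sketch of the round trip this fixup is meant to give, assuming the mapping described earlier in the thread (CPU-visible dram at 0x4000_0000 - 0x1_3fff_ffff, M4U-visible dram at 0x1_0000_0000 - 0x1_ffff_ffff). The sketch_* helpers are hypothetical user-space stand-ins, not driver functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BIT32 (UINT64_C(1) << 32)

/* Map side: in 4GB mode the M4U only uses the high alias, so OR in bit32. */
static uint64_t sketch_cpu_to_m4u_pa(uint64_t cpu_pa, bool enable_4gb)
{
	if (!enable_4gb)
		return cpu_pa;
	/* Assumed CPU-visible dram window in 4GB mode. */
	assert(cpu_pa >= UINT64_C(0x40000000) &&
	       cpu_pa <= UINT64_C(0x13fffffff));
	return cpu_pa | SKETCH_BIT32;
}

/* iova_to_phys side: undo bit32 only for 0x1_4000_0000 - 0x1_ffff_ffff. */
static uint64_t sketch_m4u_to_cpu_pa(uint64_t m4u_pa, bool enable_4gb)
{
	if (enable_4gb && m4u_pa >= UINT64_C(0x140000000))
		m4u_pa &= ~SKETCH_BIT32;
	return m4u_pa;
}
```

With both helpers, any CPU PA in the dram window comes back unchanged. Addresses in 0x1_0000_0000 - 0x1_3fff_ffff already carry bit32 and are left alone in both directions.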
> longer doing that there, so I won't worry about it.
> * Now, if you're in the 4GB mode, you just slam the bit in the PTE in
> mtk_iommu_map, which seems like the right thing to do.
> * The general functions in io-pgtable-arm-v7s.c now carefully reflect
> bits 32 & 33 in the PTE, since the new IOMMUs don't have the weird
> restriction of staying above 4GB, and there's not this weird 4GB
> aliasing mode going on (which I think would be a clearer name for this
> feature: has_4gb_alias).
That is a nicer name. But internally, and in all the CODA, this is called
"4GB mode", so I'd like to keep it.
>
> >
> > >
> > > Also, you could have rolled the has_4gb_mode check into whether or not
> > > you set enable_4GB. Then you're doing the check for has_4gb_mode once,
> > > rather than on every map call.
> >
> > "has_4gb_mode" means this SoC support 4GB mode.
> > "enable_4GB" means whether the current dram size is 4GB.
>
> Right. But your use of the variable as well as its name suggest that
> it really means "is 4GB aliasing mode on", not "does the system have
> >=4GB of RAM". You could reduce the map function to one conditional if
> you treated the variable that way. Then the only things that would
> need to change would be:
> * Add an extra conditional in probe that would only set enable_4GB if
> has_4gb_mode is set.
I guess I still don't get this. enable_4GB and has_4gb_mode are not
the same. Take mt8173 as an example when its dram size is 2G: it
has_4gb_mode, but we cannot enable_4GB in that case. (If the dram size is
2G, the HW will not remap the PA, so we cannot add BIT32.)
> * in mtk_iommu_domain_finalize, you could just always set the MTK
> quirk, since if you have <4GB of RAM, those bits will never get set in
> the PTEs anyway.
oh. Yes. this looks right.
> * I suspect mtk_iommu_hw_init would continue to work as-is, since
> everything that has vld_pa_rng also has has_4gb_mode.
mt8173 has 4gb_mode but it doesn't have vld_pa_rng.
>
> -Evan
On Thu, 2019-01-31 at 09:45 -0800, Evan Green wrote:
> On Wed, Jan 30, 2019 at 7:22 PM Yong Wu <[email protected]> wrote:
> >
> > On Wed, 2019-01-30 at 11:11 -0800, Evan Green wrote:
> > > On Mon, Dec 31, 2018 at 7:59 PM Yong Wu <[email protected]> wrote:
> > > >
> > > > The "mediatek,larb-id" has already been parsed in MTK IOMMU driver.
> > > > It's no need to parse it again in SMI driver. Only clean some codes.
> > > > This patch is fit for all the current mt2701, mt2712, mt7623, mt8173
> > > > and mt8183.
> > > >
> > > > After this patch, the "mediatek,larb-id" only be needed for mt2712
> > > > which have 2 M4Us. In the other SoCs, we can get the larb-id from M4U
> > > > in which the larbs in the "mediatek,larbs" always are ordered.
> > > >
> > > > CC: Matthias Brugger <[email protected]>
> > > > Signed-off-by: Yong Wu <[email protected]>
> > > > ---
> > > > drivers/memory/mtk-smi.c | 26 ++------------------------
> > > > 1 file changed, 2 insertions(+), 24 deletions(-)
> > > >
> > > > diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
> > > > index 08cf40d..10e6493 100644
> > > > --- a/drivers/memory/mtk-smi.c
> > > > +++ b/drivers/memory/mtk-smi.c
> > > > @@ -67,7 +67,6 @@ struct mtk_smi_common_plat {
> > > > };
> > > >
> > > > struct mtk_smi_larb_gen {
> > > > - bool need_larbid;
> > > > int port_in_larb[MTK_LARB_NR_MAX + 1];
> > > > void (*config_port)(struct device *);
> > > > unsigned int larb_direct_to_common_mask;
> > > > @@ -153,18 +152,9 @@ void mtk_smi_larb_put(struct device *larbdev)
> > > > struct mtk_smi_iommu *smi_iommu = data;
> > > > unsigned int i;
> > > >
> > > > - if (larb->larb_gen->need_larbid) {
> > > > - larb->mmu = &smi_iommu->larb_imu[larb->larbid].mmu;
> > > > - return 0;
> > > > - }
> > > > -
> > > > - /*
> > > > - * If there is no larbid property, Loop to find the corresponding
> > > > - * iommu information.
> > > > - */
> > > > - for (i = 0; i < smi_iommu->larb_nr; i++) {
> > > > + for (i = 0; i < MTK_LARB_NR_MAX; i++) {
> > >
> > > Looks like this was the only use of mtk_smi_iommu.larb_nr. Should we
> > > remove that now?
> >
> > This is necessary since the mt2712 which has two M4U HW.
> >
> > From its dtsi[1], take iommu1 as a example, its larb_nr only is 3, but
> > we need scan all the larb.
> >
> > [1]
> > http://lists.infradead.org/pipermail/linux-mediatek/2018-December/016119.html
>
> I'm not sure I follow. My point was that this structure member is only
> ever set but never read:
> $ git grep '[.>]larb_nr'
> drivers/iommu/mtk_iommu.c: data->smi_imu.larb_nr = larb_nr;
> drivers/iommu/mtk_iommu_v1.c: data->smi_imu.larb_nr = larb_nr;
>
> Maybe I've applied the patches to the wrong tree, and there is a use
> of this member I'm not seeing?
Thanks. I misunderstood. You are right, this variable can be deleted; I
didn't realize that. Maybe I need to use a new patch for this.
> -Evan
On Fri, Feb 1, 2019 at 1:42 AM Yong Wu <[email protected]> wrote:
>
> On Thu, 2019-01-31 at 11:23 -0800, Evan Green wrote:
> > On Wed, Jan 30, 2019 at 10:59 PM Yong Wu <[email protected]> wrote:
> > >
> > > On Wed, 2019-01-30 at 10:28 -0800, Evan Green wrote:
> > > > On Mon, Dec 31, 2018 at 7:57 PM Yong Wu <[email protected]> wrote:
> > > > >
> > > > > MediaTek extend the arm v7s descriptor to support the dram over 4GB.
> > > > >
> > > > > In the mt2712 and mt8173, it's called "4GB mode", the physical address
> > > > > is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> > > > > is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> > > > > bit32 is always enabled. thus, in the M4U, we always enable the bit9
> > > > > for all PTEs which means to enable bit32 of physical address.
> > > >
> > > > I got a little lost here. I get that you're trying to explain why you
> > > > always used to set bit32 of the physical address. But I don't totally
> > > > get the part about physical addresses being from 0x4000_0000 -
> > > > 0x1_3fff_ffff, but also from 0x1_0000_0000-0x1_ffff_ffff. Are you
> > > > saying that the physical addresses from the iommu's perspective were
> > > > always >0x1_0000_0000?
> > >
> > > Yes. From the IOMMU's perspective, the Physical address is from
> > > 0x1_0000_0000 to 0x1_ffff_ffff.
> > >
> > > > But then from whose perspective is it 0x4000_0000? ...
> > >
> > > I guess from SW point view. it is from 0x4000_0000 to 0x1_3fff_ffff.
> > >
> > > If 4GB mode is enabled, the memory property in dts like this:
> > >
> > > memory@40000000 {
> > > device_type = "memory";
> > > reg = <0 0x40000000 0x00000001 0x00000000>;
> > > };
> > >
> > > > oh, or you're saying there was some sort of remapping
> > > > facility that moved the physical addresses around?
> > > >
> > > > >
> > > > > but in mt8183, M4U support the dram from 0x4000_0000 to 0x3_ffff_ffff
> > > > > which isn't remaped. We extend the PTEs: the bit9 represent bit32 of
> > > > > PA and the bit4 represent bit33 of PA. Meanwhile the iova still is
> > > > > 32bits.
> > > > >
> > > > > In order to unify code, in the "4GB mode", we add the bit32 for the
> > > > > physical address manually in our driver.
> > > > >
> > > > > Correspondingly, Adding bit32 and bit33 for the PA in the iova_to_phys
> > > > > has to been moved into v7s.
> > > > >
> > > > > Regarding whether the pagetable address could be over 4GB, the mt8183
> > > > > support it while the previous mt8173 don't. thus keep it as is.
> > > > >
> > > > > Signed-off-by: Yong Wu <[email protected]>
> > > > > Reviewed-by: Robin Murphy <[email protected]>
> > > > > ---
> > > > > drivers/iommu/io-pgtable-arm-v7s.c | 31 ++++++++++++++++++++++++-------
> > > > > drivers/iommu/io-pgtable.h | 7 +++----
> > > > > drivers/iommu/mtk_iommu.c | 14 ++++++++------
> > > > > drivers/iommu/mtk_iommu.h | 1 +
> > > > > 4 files changed, 36 insertions(+), 17 deletions(-)
> > > > >
> > > > > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> > > > > index 11d8505..8803a35 100644
> > > > > --- a/drivers/iommu/io-pgtable-arm-v7s.c
> > > > > +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> > > > > @@ -124,7 +124,9 @@
> > > > > #define ARM_V7S_TEX_MASK 0x7
> > > > > #define ARM_V7S_ATTR_TEX(val) (((val) & ARM_V7S_TEX_MASK) << ARM_V7S_TEX_SHIFT)
> > > > >
> > > > > -#define ARM_V7S_ATTR_MTK_4GB BIT(9) /* MTK extend it for 4GB mode */
> > > > > +/* MediaTek extend the two bits below for over 4GB mode */
> > > > > +#define ARM_V7S_ATTR_MTK_PA_BIT32 BIT(9)
> > > > > +#define ARM_V7S_ATTR_MTK_PA_BIT33 BIT(4)
> > > >
> > > > If other vendors start doing stuff like this we'll need a more generic
> > > > way to handle this... but I guess until we see a pattern this is okay.
> > > >
> > > > >
> > > > > /* *well, except for TEX on level 2 large pages, of course :( */
> > > > > #define ARM_V7S_CONT_PAGE_TEX_SHIFT 6
> > > > > @@ -183,13 +185,22 @@ static dma_addr_t __arm_v7s_dma_addr(void *pages)
> > > > > static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
> > > > > struct io_pgtable_cfg *cfg)
> > > > > {
> > > > > - return paddr & ARM_V7S_LVL_MASK(lvl);
> > > > > + arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
> > > > > +
> > > > > + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> > > > > + if (paddr & BIT_ULL(32))
> > > > > + pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
> > > > > + if (paddr & BIT_ULL(33))
> > > > > + pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
> > > > > + }
> > > > > + return pte;
> > > > > }
> > > > >
> > > > > static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> > > > > struct io_pgtable_cfg *cfg)
> > > > > {
> > > > > arm_v7s_iopte mask;
> > > > > + phys_addr_t paddr;
> > > > >
> > > > > if (ARM_V7S_PTE_IS_TABLE(pte, lvl))
> > > > > mask = ARM_V7S_TABLE_MASK;
> > > > > @@ -198,7 +209,14 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> > > > > else
> > > > > mask = ARM_V7S_LVL_MASK(lvl);
> > > > >
> > > > > - return pte & mask;
> > > > > + paddr = pte & mask;
> > > > > + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> > > > > + if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> > > > > + paddr |= BIT_ULL(32);
> > > > > + if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
> > > > > + paddr |= BIT_ULL(33);
> > > > > + }
> > > > > + return paddr;
> > > > > }
> > > > >
> > > > > static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl,
> > > > > @@ -315,9 +333,6 @@ static arm_v7s_iopte arm_v7s_prot_to_pte(int prot, int lvl,
> > > > > if (lvl == 1 && (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS))
> > > > > pte |= ARM_V7S_ATTR_NS_SECTION;
> > > > >
> > > > > - if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)
> > > > > - pte |= ARM_V7S_ATTR_MTK_4GB;
> > > > > -
> > > >
> > > > So despite getting lost in the details, I guess the reason it's okay
> > > > that this goes from unconditional to conditional on bit32 is that
> > > > before, with the older chips, all physical addresses were above 4GB,
> > > > so we'll always see PA's above 4GB?
> > > >
> > > > > return pte;
> > > > > }
> > > > >
> > > > > @@ -504,7 +519,9 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
> > > > > if (!(prot & (IOMMU_READ | IOMMU_WRITE)))
> > > > > return 0;
> > > > >
> > > > > - if (WARN_ON(upper_32_bits(iova) || upper_32_bits(paddr)))
> > > > > + if (WARN_ON(upper_32_bits(iova)) ||
> > > > > + WARN_ON(upper_32_bits(paddr) &&
> > > > > + !(iop->cfg.quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)))
> > > > > return -ERANGE;
> > > > >
> > > > > ret = __arm_v7s_map(data, iova, paddr, size, prot, 1, data->pgd);
> > > > > diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
> > > > > index 47d5ae5..69db115 100644
> > > > > --- a/drivers/iommu/io-pgtable.h
> > > > > +++ b/drivers/iommu/io-pgtable.h
> > > > > @@ -62,10 +62,9 @@ struct io_pgtable_cfg {
> > > > > * (unmapped) entries but the hardware might do so anyway, perform
> > > > > * TLB maintenance when mapping as well as when unmapping.
> > > > > *
> > > > > - * IO_PGTABLE_QUIRK_ARM_MTK_4GB: (ARM v7s format) Set bit 9 in all
> > > > > - * PTEs, for Mediatek IOMMUs which treat it as a 33rd address bit
> > > > > - * when the SoC is in "4GB mode" and they can only access the high
> > > > > - * remap of DRAM (0x1_00000000 to 0x1_ffffffff).
> > > > > + * IO_PGTABLE_QUIRK_ARM_MTK_4GB: (ARM v7s format) MediaTek IOMMUs extend
> > > > > + * to support up to 34 bits PA where the bit32 and bit33 are
> > > > > + * encoded in the bit9 and bit4 of the PTE respectively.
> > > > > *
> > > > > * IO_PGTABLE_QUIRK_NO_DMA: Guarantees that the tables will only ever
> > > > > * be accessed by a fully cache-coherent IOMMU or CPU (e.g. for a
> > > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > > > > index 189d1b5..ae1aa5a 100644
> > > > > --- a/drivers/iommu/mtk_iommu.c
> > > > > +++ b/drivers/iommu/mtk_iommu.c
> > > > > @@ -367,12 +367,16 @@ static int mtk_iommu_map(struct iommu_domain *domain, unsigned long iova,
> > > > > phys_addr_t paddr, size_t size, int prot)
> > > > > {
> > > > > struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> > > > > + struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
> > > > > unsigned long flags;
> > > > > int ret;
> > > > >
> > > > > + /* The "4GB mode" M4U physically can not use the lower remap of Dram. */
> > > > > + if (data->plat_data->has_4gb_mode && data->enable_4GB)
> > > > > + paddr |= BIT_ULL(32);
> > > > > +
> > > >
> > > > Ok here's where I get lost. How is this okay? Is the same physical RAM
> > > > accessible at multiple locations in the physical address space? Won't
> > > > this map an iova to a different pa than the one requested?
> > >
> > > In 4GB mode, HW will remap 0x4000_0000-0x1_3fff_ffff to 0x1_0000_0000-
> > > 0x1_ffff_ffff. M4U help multimedia HW access dram, thus from M4U point
> > > of view, the dram always is 0x1_0000_0000 to 0x1_ffff_ffff.
> > >
> > > The detailed mapping relationship is like this:
> > >
> > > 0x4000_0000 -0xffff_ffff map to 0x1_4000_0000 - 0x1_ffff_ffff.
> > > 0x1_0000_0000-0x1_3fff_ffff map to 0x1_0000_0000 - 0x1_3fff_ffff.
> > >
> > > Thus, we can only add bit32 for the PA in the 4GB mode.
> >
> > Ok, I think I get it now. This thread really helped:
> > https://patchwork.kernel.org/patch/8402211/
> >
> > So from what I understand basically the same DRAM exists in two places:
> > 0000_0000 - ffff_ffff, and is also available in
> > 1_0000_0000 - 1_ffff_ffff
> >
> > ...except that the peripherals are located in 0000_0000 - 3fff_ffff,
> > so that first GB of RAM is not visible at the lower address. I'm
> > gathering this was in fact the motivation for 4GB mode. The important
> > part is that address 4000_0000 == 1_4000_0000.
> >
> > Then there was also some quirk of the IOMMU where it refused to access
> > addresses below 4GB. But those same addresses are accessible by ORing
> > in bit 32, so you just always do that and you're good to go.
> >
> > Ok so now I can use that to understand this refactoring:
> > * You used to always return an address above 4GB in
> > mtk_iommu_iova_to_phys. I don't fully get how that worked, since it
> > seems like you'd start returning PAs to the rest of the system that
> > were outside of the range 4000_0000 - 1_3fff_ffff, but okay, you're no
>
> I'm not sure I follow this. From the SW point view, the dram is
> 0x4000_0000 - 0x1_3fff_ffff. there is no memory outside it.
>
> But there is really a issue in the mtk_iommu_iova_to_phys in the
> 4gb_mode.
I guess I'm still struggling to understand what the "remapping" means.
From what you've described, it seems like it means that the physical
addresses seen by the CPU and IOMMU are different. I can picture two
possibilities:
First variant:
CPU PA == IOMMU PA
0x4000_0000 == 0x1_4000_0000
0x8000_0000 == 0x1_8000_0000
0xC000_0000 == 0x1_C000_0000
0x1_0000_0000 == 0x1_0000_0000
Or, maybe second variant:
CPU PA == IOMMU PA
0x4000_0000 == 0x1_0000_0000
0x8000_0000 == 0x1_4000_0000
0xC000_0000 == 0x1_8000_0000
0x1_0000_0000 == 0x1_C000_0000
My only point in trying to understand this about 4GB mode is that I'm
trying to figure out if the equation CPU PA | 0x1_0000_0000 == IOMMU
PA holds. In the first variant above, that equation works. But in the
second variant, I'd expect to see a +/- 0x4000_0000, as simply ORing
in 0x1_0000_0000 would get you the wrong PA as seen by the IOMMU.
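To check the two pictures mechanically, here is a small sketch. It models the first variant as an OR with bit32 and the second as a flat shift by 0xC000_0000, both read off the tables above. The sketch_* names are made up for illustration, and neither function claims to describe the real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BIT32 (UINT64_C(1) << 32)

/* CPU-visible dram window in 4GB mode, as described in this thread. */
static bool sketch_in_dram(uint64_t cpu_pa)
{
	return cpu_pa >= UINT64_C(0x40000000) &&
	       cpu_pa <= UINT64_C(0x13fffffff);
}

/* First variant: the IOMMU sees the same dram through the high alias. */
static uint64_t sketch_variant1(uint64_t cpu_pa)
{
	assert(sketch_in_dram(cpu_pa));
	return cpu_pa | SKETCH_BIT32;
}

/* Second variant: the whole window is shifted up by 0xC000_0000. */
static uint64_t sketch_variant2(uint64_t cpu_pa)
{
	assert(sketch_in_dram(cpu_pa));
	return cpu_pa + UINT64_C(0xC0000000);
}

/* Does "CPU PA | 0x1_0000_0000 == IOMMU PA" hold for this pair? */
static bool sketch_or_rule_holds(uint64_t cpu_pa, uint64_t iommu_pa)
{
	return (cpu_pa | SKETCH_BIT32) == iommu_pa;
}
```

Running the four table rows through sketch_or_rule_holds() passes for the first variant and fails on every row for the second, which is why the plain OR in mtk_iommu_map is only right if the first picture is the real one.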
>
> Currently in the 4gb mode, I always add BIT32 for all the memory, then
> the PA returned by the mtk_iommu_iova_to_phys(in v7s) always
> is from 0x1_0000_0000 to 0x1_ffff_ffff. But the SW still expect the PA
> is from 0x4000_0000 - 0x1_3fff_ffff. Thus, I guess I will add a new
> patch like this:
>
> @@ -418,6 +418,7 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct
> iommu_domain *domain,
> dma_addr_t iova)
> {
> struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> + struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
> unsigned long flags;
> phys_addr_t pa;
>
> @@ -425,6 +426,11 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct
> iommu_domain *domain,
> pa = dom->iop->iova_to_phys(dom->iop, iova);
> spin_unlock_irqrestore(&dom->pgtlock, flags);
>
> + /* Discard bit32 if pa is 0x1_4000_0000 -0x1_ffff_ffff in 4GB mode. */
> + if (data->plat_data->has_4gb_mode && data->enable_4GB &&
> + pa >= 0x140000000)
> + pa &= ~BIT_ULL(32);
> +
Right. I had noticed this in my previous reply about the old code, but
forgot about the place where we just jam in that BIT32 in the new code
for enable_4GB, which would lead to returning PAs to the rest of the
system outside of the valid range of 0x4000_0000 - 0x1_3fff_ffff. Good
catch.
The hardcoded PA is horribly ugly, I'm trying to think of a better way
to do this. I've got nothing at the moment...
I guess this also lends another point towards #1 of my two variants
being the correct picture of things.
> return pa;
> }
>
>
> > longer doing that there, so I won't worry about it.
> > * Now, if you're in the 4GB mode, you just slam the bit in the PTE in
> > mtk_iommu_map, which seems like the right thing to do.
> > * The general functions in io-pgtable-arm-v7s.c now carefully reflect
> > bits 32 & 33 in the PTE, since the new IOMMUs don't have the weird
> > restriction of staying above 4GB, and there's not this weird 4GB
> > aliasing mode going on (which I think would be a clearer name for this
> > feature: has_4gb_alias).
>
> A more beautiful name. But our internal and all the CODA call this "4GB
> mode"..thus I'd like to keep it....
Sigh.
>
> >
> > >
> > > >
> > > > Also, you could have rolled the has_4gb_mode check into whether or not
> > > > you set enable_4GB. Then you're doing the check for has_4gb_mode once,
> > > > rather than on every map call.
> > >
> > > "has_4gb_mode" means this SoC support 4GB mode.
> > > "enable_4GB" means whether the current dram size is 4GB.
> >
> > Right. But your use of the variable as well as its name suggest that
> > it really means "is 4GB aliasing mode on", not "does the system have
> > >=4GB of RAM". You could reduce the map function to one conditional if
> > you treated the variable that way. Then the only things that would
> > need to change would be:
> > * Add an extra conditional in probe that would only set enable_4GB if
> > has_4gb_mode is set.
>
> I guess I still don't get this. the enable_4GB and has_4gb_mode are not
> the same. Take mt8173 as a example when its dram size is 2G. it
> has_4gb_mode, but we can not enable_4GB at that time.(if dram size is
> 2G, the HW will not remap the PA address, we can not add BIT32 at that
> time.)
Right. So enable_4GB would be false there, since your code in probe
would look like:
data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT));
if (!data->plat_data->has_4gb_mode)
data->enable_4GB = false;
Then mtk_iommu_map would only have:
if (data->enable_4GB)
paddr |= BIT_ULL(32);
Said differently: right now every place enable_4GB is read, there is
(or could be with no change in behavior) a check just before it for
has_4gb_mode, so roll that check into enable_4GB.
Anyway, this isn't a huge deal, it just seemed nice to save the extra
conditional in the map function, which I imagine might be a hot
function.
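A small user-space sketch of why the two shapes behave the same. Here dram_over_4gb stands in for the max_pfn comparison in probe, and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BIT32 (UINT64_C(1) << 32)

/* Current shape: both flags are tested on every map call. */
static uint64_t sketch_map_two_checks(uint64_t paddr, bool has_4gb_mode,
				      bool dram_over_4gb)
{
	if (has_4gb_mode && dram_over_4gb)
		paddr |= SKETCH_BIT32;
	return paddr;
}

/* Suggested shape: probe folds has_4gb_mode into enable_4GB once. */
static bool sketch_probe_enable_4gb(bool has_4gb_mode, bool dram_over_4gb)
{
	bool enable_4gb = dram_over_4gb;

	if (!has_4gb_mode)
		enable_4gb = false;
	return enable_4gb;
}

/* ...so that map only needs a single conditional. */
static uint64_t sketch_map_one_check(uint64_t paddr, bool enable_4gb)
{
	if (enable_4gb)
		paddr |= SKETCH_BIT32;
	return paddr;
}

/* Walk all four flag combinations and compare the two shapes. */
static void sketch_check_equivalence(uint64_t paddr)
{
	for (int has = 0; has <= 1; has++) {
		for (int big = 0; big <= 1; big++) {
			bool en = sketch_probe_enable_4gb(has, big);

			assert(sketch_map_two_checks(paddr, has, big) ==
			       sketch_map_one_check(paddr, en));
		}
	}
}
```

The mt8173-with-2G case from above is the has=1, big=0 row: enable_4gb comes out false and bit32 is not added in either shape.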
>
> > * in mtk_iommu_domain_finalize, you could just always set the MTK
> > quirk, since if you have <4GB of RAM, those bits will never get set in
> > the PTEs anyway.
>
> oh. Yes. this looks right.
>
> > * I suspect mtk_iommu_hw_init would continue to work as-is, since
> > everything that has vld_pa_rng also has has_4gb_mode.
>
> mt8173 has 4gb_mode but it doesn't has vld_pa_rng.
Right, so that conditional would continue to stay false, as it should.
Put differently, that conditional in mtk_iommu_hw_init() could be
replaced with no functional difference by:
if ((data->plat_data->has_4gb_mode && data->enable_4GB) && data->plat_data->vld_pa_rng)
since everything that has vld_pa_rng also has has_4gb_mode.
-Evan
On Tue, 2019-02-05 at 15:11 -0800, Evan Green wrote:
> On Fri, Feb 1, 2019 at 1:42 AM Yong Wu <[email protected]> wrote:
> >
> > On Thu, 2019-01-31 at 11:23 -0800, Evan Green wrote:
> > > On Wed, Jan 30, 2019 at 10:59 PM Yong Wu <[email protected]> wrote:
> > > >
> > > > On Wed, 2019-01-30 at 10:28 -0800, Evan Green wrote:
> > > > > On Mon, Dec 31, 2018 at 7:57 PM Yong Wu <[email protected]> wrote:
> > > > > >
> > > > > > MediaTek extend the arm v7s descriptor to support the dram over 4GB.
> > > > > >
> > > > > > In the mt2712 and mt8173, it's called "4GB mode", the physical address
> > > > > > is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> > > > > > is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> > > > > > bit32 is always enabled. thus, in the M4U, we always enable the bit9
> > > > > > for all PTEs which means to enable bit32 of physical address.
> > > > >
> > > > > I got a little lost here. I get that you're trying to explain why you
> > > > > always used to set bit32 of the physical address. But I don't totally
> > > > > get the part about physical addresses being from 0x4000_0000 -
> > > > > 0x1_3fff_ffff, but also from 0x1_0000_0000-0x1_ffff_ffff. Are you
> > > > > saying that the physical addresses from the iommu's perspective were
> > > > > always >0x1_0000_0000?
> > > >
> > > > Yes. From the IOMMU's perspective, the Physical address is from
> > > > 0x1_0000_0000 to 0x1_ffff_ffff.
> > > >
> > > > > But then from whose perspective is it 0x4000_0000? ...
> > > >
> > > > I guess from SW point view. it is from 0x4000_0000 to 0x1_3fff_ffff.
> > > >
> > > > If 4GB mode is enabled, the memory property in dts like this:
> > > >
> > > > memory@40000000 {
> > > > device_type = "memory";
> > > > reg = <0 0x40000000 0x00000001 0x00000000>;
> > > > };
> > > >
> > > > > oh, or you're saying there was some sort of remapping
> > > > > facility that moved the physical addresses around?
> > > > >
> > > > > >
> > > > > > but in mt8183, M4U support the dram from 0x4000_0000 to 0x3_ffff_ffff
> > > > > > which isn't remaped. We extend the PTEs: the bit9 represent bit32 of
> > > > > > PA and the bit4 represent bit33 of PA. Meanwhile the iova still is
> > > > > > 32bits.
> > > > > >
> > > > > > In order to unify code, in the "4GB mode", we add the bit32 for the
> > > > > > physical address manually in our driver.
> > > > > >
> > > > > > Correspondingly, Adding bit32 and bit33 for the PA in the iova_to_phys
> > > > > > has to been moved into v7s.
> > > > > >
> > > > > > Regarding whether the pagetable address could be over 4GB, the mt8183
> > > > > > support it while the previous mt8173 don't. thus keep it as is.
> > > > > >
> > > > > > Signed-off-by: Yong Wu <[email protected]>
> > > > > > Reviewed-by: Robin Murphy <[email protected]>
> > > > > > ---
> > > > > > drivers/iommu/io-pgtable-arm-v7s.c | 31 ++++++++++++++++++++++++-------
> > > > > > drivers/iommu/io-pgtable.h | 7 +++----
> > > > > > drivers/iommu/mtk_iommu.c | 14 ++++++++------
> > > > > > drivers/iommu/mtk_iommu.h | 1 +
> > > > > > 4 files changed, 36 insertions(+), 17 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> > > > > > index 11d8505..8803a35 100644
> > > > > > --- a/drivers/iommu/io-pgtable-arm-v7s.c
> > > > > > +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> > > > > > @@ -124,7 +124,9 @@
> > > > > > #define ARM_V7S_TEX_MASK 0x7
> > > > > > #define ARM_V7S_ATTR_TEX(val) (((val) & ARM_V7S_TEX_MASK) << ARM_V7S_TEX_SHIFT)
> > > > > >
> > > > > > -#define ARM_V7S_ATTR_MTK_4GB BIT(9) /* MTK extend it for 4GB mode */
> > > > > > +/* MediaTek extend the two bits below for over 4GB mode */
> > > > > > +#define ARM_V7S_ATTR_MTK_PA_BIT32 BIT(9)
> > > > > > +#define ARM_V7S_ATTR_MTK_PA_BIT33 BIT(4)
> > > > >
> > > > > If other vendors start doing stuff like this we'll need a more generic
> > > > > way to handle this... but I guess until we see a pattern this is okay.
> > > > >
> > > > > >
> > > > > > /* *well, except for TEX on level 2 large pages, of course :( */
> > > > > > #define ARM_V7S_CONT_PAGE_TEX_SHIFT 6
> > > > > > @@ -183,13 +185,22 @@ static dma_addr_t __arm_v7s_dma_addr(void *pages)
> > > > > > static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
> > > > > > struct io_pgtable_cfg *cfg)
> > > > > > {
> > > > > > - return paddr & ARM_V7S_LVL_MASK(lvl);
> > > > > > + arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
> > > > > > +
> > > > > > + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> > > > > > + if (paddr & BIT_ULL(32))
> > > > > > + pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
> > > > > > + if (paddr & BIT_ULL(33))
> > > > > > + pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
> > > > > > + }
> > > > > > + return pte;
> > > > > > }
> > > > > >
> > > > > > static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> > > > > > struct io_pgtable_cfg *cfg)
> > > > > > {
> > > > > > arm_v7s_iopte mask;
> > > > > > + phys_addr_t paddr;
> > > > > >
> > > > > > if (ARM_V7S_PTE_IS_TABLE(pte, lvl))
> > > > > > mask = ARM_V7S_TABLE_MASK;
> > > > > > @@ -198,7 +209,14 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> > > > > > else
> > > > > > mask = ARM_V7S_LVL_MASK(lvl);
> > > > > >
> > > > > > - return pte & mask;
> > > > > > + paddr = pte & mask;
> > > > > > + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> > > > > > + if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> > > > > > + paddr |= BIT_ULL(32);
> > > > > > + if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
> > > > > > + paddr |= BIT_ULL(33);
> > > > > > + }
> > > > > > + return paddr;
> > > > > > }
> > > > > >
> > > > > > static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl,
> > > > > > @@ -315,9 +333,6 @@ static arm_v7s_iopte arm_v7s_prot_to_pte(int prot, int lvl,
> > > > > > if (lvl == 1 && (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS))
> > > > > > pte |= ARM_V7S_ATTR_NS_SECTION;
> > > > > >
> > > > > > - if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)
> > > > > > - pte |= ARM_V7S_ATTR_MTK_4GB;
> > > > > > -
> > > > >
> > > > > So despite getting lost in the details, I guess the reason it's okay
> > > > > that this goes from unconditional to conditional on bit32 is that
> > > > > before, with the older chips, all physical addresses were above 4GB,
> > > > > so we'll always see PA's above 4GB?
> > > > >
> > > > > > return pte;
> > > > > > }
> > > > > >
> > > > > > @@ -504,7 +519,9 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
> > > > > > if (!(prot & (IOMMU_READ | IOMMU_WRITE)))
> > > > > > return 0;
> > > > > >
> > > > > > - if (WARN_ON(upper_32_bits(iova) || upper_32_bits(paddr)))
> > > > > > + if (WARN_ON(upper_32_bits(iova)) ||
> > > > > > + WARN_ON(upper_32_bits(paddr) &&
> > > > > > + !(iop->cfg.quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)))
> > > > > > return -ERANGE;
> > > > > >
> > > > > > ret = __arm_v7s_map(data, iova, paddr, size, prot, 1, data->pgd);
> > > > > > diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
> > > > > > index 47d5ae5..69db115 100644
> > > > > > --- a/drivers/iommu/io-pgtable.h
> > > > > > +++ b/drivers/iommu/io-pgtable.h
> > > > > > @@ -62,10 +62,9 @@ struct io_pgtable_cfg {
> > > > > > * (unmapped) entries but the hardware might do so anyway, perform
> > > > > > * TLB maintenance when mapping as well as when unmapping.
> > > > > > *
> > > > > > - * IO_PGTABLE_QUIRK_ARM_MTK_4GB: (ARM v7s format) Set bit 9 in all
> > > > > > - * PTEs, for Mediatek IOMMUs which treat it as a 33rd address bit
> > > > > > - * when the SoC is in "4GB mode" and they can only access the high
> > > > > > - * remap of DRAM (0x1_00000000 to 0x1_ffffffff).
> > > > > > + * IO_PGTABLE_QUIRK_ARM_MTK_4GB: (ARM v7s format) MediaTek IOMMUs extend
> > > > > > + * to support up to 34 bits PA where the bit32 and bit33 are
> > > > > > + * encoded in the bit9 and bit4 of the PTE respectively.
> > > > > > *
> > > > > > * IO_PGTABLE_QUIRK_NO_DMA: Guarantees that the tables will only ever
> > > > > > * be accessed by a fully cache-coherent IOMMU or CPU (e.g. for a
> > > > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > > > > > index 189d1b5..ae1aa5a 100644
> > > > > > --- a/drivers/iommu/mtk_iommu.c
> > > > > > +++ b/drivers/iommu/mtk_iommu.c
> > > > > > @@ -367,12 +367,16 @@ static int mtk_iommu_map(struct iommu_domain *domain, unsigned long iova,
> > > > > > phys_addr_t paddr, size_t size, int prot)
> > > > > > {
> > > > > > struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> > > > > > + struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
> > > > > > unsigned long flags;
> > > > > > int ret;
> > > > > >
> > > > > > + /* The "4GB mode" M4U physically can not use the lower remap of Dram. */
> > > > > > + if (data->plat_data->has_4gb_mode && data->enable_4GB)
> > > > > > + paddr |= BIT_ULL(32);
> > > > > > +
> > > > >
> > > > > Ok here's where I get lost. How is this okay? Is the same physical RAM
> > > > > accessible at multiple locations in the physical address space? Won't
> > > > > this map an iova to a different pa than the one requested?
> > > >
> > > > In 4GB mode, HW will remap 0x4000_0000-0x1_3fff_ffff to 0x1_0000_0000-
> > > > 0x1_ffff_ffff. M4U help multimedia HW access dram, thus from M4U point
> > > > of view, the dram always is 0x1_0000_0000 to 0x1_ffff_ffff.
> > > >
> > > > The detailed mapping relationship is like this:
> > > >
> > > > 0x4000_0000 -0xffff_ffff map to 0x1_4000_0000 - 0x1_ffff_ffff.
> > > > 0x1_0000_0000-0x1_3fff_ffff map to 0x1_0000_0000 - 0x1_3fff_ffff.
> > > >
> > > > Thus, we can only add bit32 for the PA in the 4GB mode.
> > >
> > > Ok, I think I get it now. This thread really helped:
> > > https://patchwork.kernel.org/patch/8402211/
> > >
> > > So from what I understand basically the same DRAM exists in two places:
> > > 0000_0000 - ffff_ffff, and is also available in
> > > 1_0000_0000 - 1_ffff_ffff
> > >
> > > ...except that the peripherals are located in 0000_0000 - 3ffff_ffff,
> > > so that first GB of RAM is not visible at the lower address. I'm
> > > gathering this was in fact the motivation for 4GB mode. The important
> > > part is that address 4000_0000 == 1_4000_0000.
> > >
> > > Then there was also some quirk of the IOMMU where it refused to access
> > > addresses below 4GB. But those same addresses are accessible by ORing
> > > in bit 32, so you just always do that and you're good to go.
> > >
> > > Ok so now I can use that to understand this refactoring:
> > > * You used to always return an address above 4GB in
> > > mtk_iommu_iova_to_phys. I don't fully get how that worked, since it
> > > seems like you'd start returning PAs to the rest of the system that
> > > were outside of the range 4000_0000 - 1_3fff_ffff, but okay, you're no
> >
> > I'm not sure I follow this. From the SW point view, the dram is
> > 0x4000_0000 - 0x1_3fff_ffff. there is no memory outside it.
> >
> > But there is really a issue in the mtk_iommu_iova_to_phys in the
> > 4gb_mode.
>
> I guess I'm still struggling to understand what the "remapping" means.
> From what you've described, it seems like it means that the physical
> addresses seen by the CPU and IOMMU are different. I can picture two
> possibilities:
>
> First variant:
> CPU PA == IOMMU PA
> 0x4000_0000 == 0x1_4000_0000
> 0x8000_0000 == 0x1_8000_0000
> 0xC000_0000 == 0x1_C000_0000
> 0x1_0000_0000 == 0x1_0000_0000
This one is right. The 4GB mode remap is a little complex. In the new
version I explain it in the code, so that nobody has to dig it out of
the git log or search the web for it. Please see [v6 21/22].
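
As a rough illustration only (this is not taken from the patch), the
first variant can be written as a tiny standalone helper. The function
name and the two range macros are hypothetical; the addresses are the
ones quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define BIT_ULL(n)	(1ULL << (n))

/* DRAM window as seen by software in 4GB mode (values from this thread). */
#define CPU_DRAM_START	0x40000000ULL
#define CPU_DRAM_END	0x13fffffffULL

/*
 * First variant: the PA seen by the IOMMU is the CPU PA with bit32 set.
 * Addresses that already have bit32 set are left unchanged by the OR.
 */
static inline uint64_t cpu_pa_to_iommu_pa(uint64_t cpu_pa)
{
	assert(cpu_pa >= CPU_DRAM_START && cpu_pa <= CPU_DRAM_END);
	return cpu_pa | BIT_ULL(32);
}
```

So 0x4000_0000 becomes 0x1_4000_0000, while 0x1_0000_0000 stays where it
is, and no +/- 0x4000_0000 offset is involved.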
>
> Or, maybe second variant:
> CPU PA == IOMMU PA
> 0x4000_0000 == 0x1_0000_0000
> 0x8000_0000 == 0x1_4000_0000
> 0xC000_0000 == 0x1_8000_0000
> 0x1_0000_0000 == 0x1_C000_0000
>
> My only point in trying to understand this about 4GB mode is that I'm
> trying to figure out if the equation CPU PA | 0x1_0000_0000 == IOMMU
> PA holds. In the first variant above, that equation works. But in the
> second equation, I'd expect to see a +/- 0x4000_0000, as simply ORing
> in 0x1_0000_0000 would get you the wrong PA as seen by the IOMMU.
>
> >
> > Currently in the 4gb mode, I always add BIT32 for all the memory, then
> > the PA returned by the mtk_iommu_iova_to_phys(in v7s) always
> > is from 0x1_0000_0000 to 0x1_ffff_ffff. But the SW still expect the PA
> > is from 0x4000_0000 - 0x1_3fff_ffff. Thus, I guess I will add a new
> > patch like this:
> >
> > @@ -418,6 +418,7 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct
> > iommu_domain *domain,
> > dma_addr_t iova)
> > {
> > struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> > + struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
> > unsigned long flags;
> > phys_addr_t pa;
> >
> > @@ -425,6 +426,11 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct
> > iommu_domain *domain,
> > pa = dom->iop->iova_to_phys(dom->iop, iova);
> > spin_unlock_irqrestore(&dom->pgtlock, flags);
> >
> > + /* Discard bit32 if pa is 0x1_4000_0000 -0x1_ffff_ffff in 4GB mode. */
> > + if (data->plat_data->has_4gb_mode && data->enable_4GB &&
> > + pa >= 0x140000000)
> > + pa &= ~BIT_ULL(32);
> > +
>
> Right. I had noticed this in my previous reply about the old code, but
> forgot about the place where we just jam in that BIT32 in the new code
> for enable_4GB, which would lead to returning PAs to the rest of the
> system outside of the valid range of 0x4000_0000 - 0x1_3fff_ffff. Good
> catch.
>
> The hardcoded PA is horribly ugly, I'm trying to think of a better way
> to do this. I've got nothing at the moment...
Yes, the hard-coded address is not good. I also couldn't come up with a
better name for it, so I just moved the address into a macro. See [v6 21/22].
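
A minimal standalone sketch of the reverse direction, with the address
behind a macro, could look like the following. The macro name and the
helper are hypothetical and chosen only for this illustration; they are
not necessarily what v6 uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT_ULL(n)	(1ULL << (n))

/* Hypothetical name: first IOMMU-side PA that aliases CPU PA 0x4000_0000. */
#define REMAP_BASE_4GB_MODE	0x140000000ULL

/*
 * Convert the PA stored in the page table back to the PA software
 * expects. Only 0x1_4000_0000 - 0x1_ffff_ffff had bit32 added by the
 * map path; 0x1_0000_0000 - 0x1_3fff_ffff is real DRAM above 4GB and
 * must keep bit32.
 */
static inline uint64_t iommu_pa_to_cpu_pa(uint64_t pa, bool enable_4gb)
{
	if (enable_4gb && pa >= REMAP_BASE_4GB_MODE)
		pa &= ~BIT_ULL(32);
	return pa;
}
```

The result then stays inside 0x4000_0000 - 0x1_3fff_ffff, which is the
range the rest of the system expects.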
>
> I guess this also lends another point towards #1 of my two variants
> being the correct picture of things.
>
> > return pa;
> > }
> >
> >
> > > longer doing that there, so I won't worry about it.
> > > * Now, if you're in the 4GB mode, you just slam the bit in the PTE in
> > > mtk_iommu_map, which seems like the right thing to do.
> > > * The general functions in io-pgtable-arm-v7s.c now carefully reflect
> > > bits 32 & 33 in the PTE, since the new IOMMUs don't have the weird
> > > restriction of staying above 4GB, and there's not this weird 4GB
> > > aliasing mode going on (which I think would be a clearer name for this
> > > feature: has_4gb_alias).
> >
> > A more beautiful name. But our internal and all the CODA call this "4GB
> > mode"..thus I'd like to keep it....
>
> Sigh.
>
> >
> > >
> > > >
> > > > >
> > > > > Also, you could have rolled the has_4gb_mode check into whether or not
> > > > > you set enable_4GB. Then you're doing the check for has_4gb_mode once,
> > > > > rather than on every map call.
> > > >
> > > > "has_4gb_mode" means this SoC support 4GB mode.
> > > > "enable_4GB" means whether the current dram size is 4GB.
> > >
> > > Right. But your use of the variable as well as it's name suggest that
> > > it really means "is 4GB aliasing mode on", not "does the system have
> > > >=4GB of RAM". You could reduce the map function to one conditional if
> > > you treated the variable that way. Then the only things that would
> > > need to change would be:
> > > * Add an extra conditional in probe that would only set enable_4GB if
> > > has_4gb_mode is set.
> >
> > I guess I still don't get this. the enable_4GB and has_4gb_mode are not
> > the same. Take mt8173 as a example when its dram size is 2G. it
> > has_4gb_mode, but we can not enable_4GB at that time.(if dram size is
> > 2G, the HW will not remap the PA address, we can not add BIT32 at that
> > time.)
>
> Right. So enable_4GB would be false there, since your code in probe
> would look like:
> data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT));
> if (!data->plat_data->has_4gb_mode)
> data->enable_4GB = false;
>
> Then mtk_iommu_map would only have:
> if (data->enable_4GB)
> paddr |= BIT_ULL(32);
>
> Said differently: right now every place enable_4GB is read, there is
> (or could be with no change in behavior) a check just before it for
> has_4gb_mode, so roll that check into enable_4GB.
>
> Anyway, this isn't a huge deal, it just seemed nice to save the extra
> conditional in the map function, which I imagine might be a hot
> function.
Thanks for this explanation with the code. I guess I get it now.
I have to apologize: I misread this when I prepared v6. I thought it
might not work when mt8183 uses 4GB, but while replying to this mail I
realized that it is fine there as well. Embarrassed!
The logic is a little complex and the name "enable_4GB" confused me.
Thus, I added a new patch [v6 20/22] that renames it to "dram_is_4gb" for
readability.
If you still prefer the solution above, I can send v7.
>
> >
> > > * in mtk_iommu_domain_finalize, you could just always set the MTK
> > > quirk, since if you have <4GB of RAM, those bits will never get set in
> > > the PTEs anyway.
> >
> > oh. Yes. this looks right.
> >
> > > * I suspect mtk_iommu_hw_init would continue to work as-is, since
> > > everything that has vld_pa_rng also has has_4gb_mode.
> >
> > mt8173 has 4gb_mode but it doesn't has vld_pa_rng.
>
> Right, so that conditional would continue to stay false, as it should.
> Put differently, that conditional in mtk_iommu_hw_init() could be
> replaced with no functional difference by:
>
> if ((data->plat_data->has_4gb_mode && data->enable_4GB) && data->plat_data->vld_pa_rng)
>
> since everything that has vld_pa_rng also has has_4gb_mode.
> -Evan