Add a user guide showing how to use the DDR PMU to
monitor DDR bandwidth on Amlogic G12 SoCs.
Signed-off-by: Jiucheng Xu <[email protected]>
Reviewed-by: Chris Healy <[email protected]>
---
Changes v9 -> v10:
- Rebase code
- Add "Reviewed-by" tag
Changes v8 -> v9:
- No change
Changes v7 -> v8:
- No change
Changes v6 -> v7:
- Drop the Reported-by tag
- Fix spelling error
Changes v5 -> v6:
- No change
Changes v4 -> v5:
- Fix build warning
Changes v3 -> v4:
- No change
Changes v2 -> v3:
- Rename doc name from aml-ddr-pmu.rst to meson-ddr-pmu.rst
Changes v1 -> v2:
- No change
---
Documentation/admin-guide/perf/index.rst | 1 +
.../admin-guide/perf/meson-ddr-pmu.rst | 70 +++++++++++++++++++
MAINTAINERS | 1 +
3 files changed, 72 insertions(+)
create mode 100644 Documentation/admin-guide/perf/meson-ddr-pmu.rst
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index 793e1970bc05..c767e03e4d34 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -19,3 +19,4 @@ Performance monitor support
arm_dsu_pmu
thunderx2-pmu
alibaba_pmu
+ meson-ddr-pmu
diff --git a/Documentation/admin-guide/perf/meson-ddr-pmu.rst b/Documentation/admin-guide/perf/meson-ddr-pmu.rst
new file mode 100644
index 000000000000..15e93a751ced
--- /dev/null
+++ b/Documentation/admin-guide/perf/meson-ddr-pmu.rst
@@ -0,0 +1,70 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================================================
+Amlogic SoC DDR Bandwidth Performance Monitoring Unit (PMU)
+===========================================================
+
+There is a bandwidth monitor inside the DRAM controller. The monitor includes
+4 channels which can count the read/write request of accessing DRAM individually.
+It can be helpful to show if the performance bottleneck is on DDR bandwidth.
+
+Currently, this driver supports the following 5 Perf events:
+
++ meson_ddr_bw/total_rw_bytes/
++ meson_ddr_bw/chan_1_rw_bytes/
++ meson_ddr_bw/chan_2_rw_bytes/
++ meson_ddr_bw/chan_3_rw_bytes/
++ meson_ddr_bw/chan_4_rw_bytes/
+
+meson_ddr_bw/chan_{1,2,3,4}_rw_bytes/ events are the channel related events.
+Each channel support using keywords as filter, which can let the channel
+to monitor the individual IP module in SoC.
+
+The following keywords are the filter:
+
++ arm - DDR access request from CPU
++ vpu_read1 - DDR access request from OSD + VPP read
++ gpu - DDR access request from 3D GPU
++ pcie - DDR access request from PCIe controller
++ hdcp - DDR access request from HDCP controller
++ hevc_front - DDR access request from HEVC codec front end
++ usb3_0 - DDR access request from USB3.0 controller
++ hevc_back - DDR access request from HEVC codec back end
++ h265enc - DDR access request from HEVC encoder
++ vpu_read2 - DDR access request from DI read
++ vpu_write1 - DDR access request from VDIN write
++ vpu_write2 - DDR access request from di write
++ vdec - DDR access request from legacy codec video decoder
++ hcodec - DDR access request from H264 encoder
++ ge2d - DDR access request from ge2d
++ spicc1 - DDR access request from SPI controller 1
++ usb0 - DDR access request from USB2.0 controller 0
++ dma - DDR access request from system DMA controller 1
++ arb0 - DDR access request from arb0
++ sd_emmc_b - DDR access request from SD eMMC b controller
++ usb1 - DDR access request from USB2.0 controller 1
++ audio - DDR access request from Audio module
++ sd_emmc_c - DDR access request from SD eMMC c controller
++ spicc2 - DDR access request from SPI controller 2
++ ethernet - DDR access request from Ethernet controller
+
+
+The following command is to show the total DDR bandwidth:
+
+ .. code-block:: bash
+
+ perf stat -a -e meson_ddr_bw/total_rw_bytes/ -I 1000 sleep 10
+
+This command will print the total DDR bandwidth per second.
+
+The following commands are to show how to use filter parameters:
+
+ .. code-block:: bash
+
+ perf stat -a -e meson_ddr_bw/chan_1_rw_bytes,arm=1/ -I 1000 sleep 10
+ perf stat -a -e meson_ddr_bw/chan_2_rw_bytes,gpu=1/ -I 1000 sleep 10
+ perf stat -a -e meson_ddr_bw/chan_3_rw_bytes,arm=1,gpu=1/ -I 1000 sleep 10
+
+The 1st command show how to use channel 1 to monitor the DDR bandwidth from ARM.
+The 2nd command show using channel 2 to get the DDR bandwidth of GPU.
+The 3rd command show using channel 3 to monitor the sum of ARM and GPU.
diff --git a/MAINTAINERS b/MAINTAINERS
index 415eaa30c523..b76c4deddf22 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1098,6 +1098,7 @@ M: Jiucheng Xu <[email protected]>
L: [email protected]
S: Supported
W: http://www.amlogic.com
+F: Documentation/admin-guide/perf/meson-ddr-pmu.rst
F: drivers/perf/amlogic/
F: include/soc/amlogic/
--
2.25.1
On Wed, Nov 16, 2022 at 08:31:33AM +0800, Jiucheng Xu wrote:
> diff --git a/Documentation/admin-guide/perf/meson-ddr-pmu.rst b/Documentation/admin-guide/perf/meson-ddr-pmu.rst
> new file mode 100644
> index 000000000000..15e93a751ced
> --- /dev/null
> +++ b/Documentation/admin-guide/perf/meson-ddr-pmu.rst
> @@ -0,0 +1,70 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===========================================================
> +Amlogic SoC DDR Bandwidth Performance Monitoring Unit (PMU)
> +===========================================================
> +
> +There is a bandwidth monitor inside the DRAM controller. The monitor includes
> +4 channels which can count the read/write request of accessing DRAM individually.
> +It can be helpful to show if the performance bottleneck is on DDR bandwidth.
> +
> +Currently, this driver supports the following 5 Perf events:
> +
> ++ meson_ddr_bw/total_rw_bytes/
> ++ meson_ddr_bw/chan_1_rw_bytes/
> ++ meson_ddr_bw/chan_2_rw_bytes/
> ++ meson_ddr_bw/chan_3_rw_bytes/
> ++ meson_ddr_bw/chan_4_rw_bytes/
> +
> +meson_ddr_bw/chan_{1,2,3,4}_rw_bytes/ events are the channel related events.
> +Each channel support using keywords as filter, which can let the channel
> +to monitor the individual IP module in SoC.
> +
> +The following keywords are the filter:
> +
> ++ arm - DDR access request from CPU
> ++ vpu_read1 - DDR access request from OSD + VPP read
> ++ gpu - DDR access request from 3D GPU
> ++ pcie - DDR access request from PCIe controller
> ++ hdcp - DDR access request from HDCP controller
> ++ hevc_front - DDR access request from HEVC codec front end
> ++ usb3_0 - DDR access request from USB3.0 controller
> ++ hevc_back - DDR access request from HEVC codec back end
> ++ h265enc - DDR access request from HEVC encoder
> ++ vpu_read2 - DDR access request from DI read
> ++ vpu_write1 - DDR access request from VDIN write
> ++ vpu_write2 - DDR access request from di write
> ++ vdec - DDR access request from legacy codec video decoder
> ++ hcodec - DDR access request from H264 encoder
> ++ ge2d - DDR access request from ge2d
> ++ spicc1 - DDR access request from SPI controller 1
> ++ usb0 - DDR access request from USB2.0 controller 0
> ++ dma - DDR access request from system DMA controller 1
> ++ arb0 - DDR access request from arb0
> ++ sd_emmc_b - DDR access request from SD eMMC b controller
> ++ usb1 - DDR access request from USB2.0 controller 1
> ++ audio - DDR access request from Audio module
> ++ sd_emmc_c - DDR access request from SD eMMC c controller
> ++ spicc2 - DDR access request from SPI controller 2
> ++ ethernet - DDR access request from Ethernet controller
> +
> +
> +The following command is to show the total DDR bandwidth:
> +
> + .. code-block:: bash
> +
> + perf stat -a -e meson_ddr_bw/total_rw_bytes/ -I 1000 sleep 10
> +
> +This command will print the total DDR bandwidth per second.
> +
> +The following commands are to show how to use filter parameters:
> +
> + .. code-block:: bash
> +
> + perf stat -a -e meson_ddr_bw/chan_1_rw_bytes,arm=1/ -I 1000 sleep 10
> + perf stat -a -e meson_ddr_bw/chan_2_rw_bytes,gpu=1/ -I 1000 sleep 10
> + perf stat -a -e meson_ddr_bw/chan_3_rw_bytes,arm=1,gpu=1/ -I 1000 sleep 10
> +
> +The 1st command show how to use channel 1 to monitor the DDR bandwidth from ARM.
> +The 2nd command show using channel 2 to get the DDR bandwidth of GPU.
> +The 3rd command show using channel 3 to monitor the sum of ARM and GPU.
The wording is rather awkward, so I'd suggest improving the doc as follows:
---- >8 ----
diff --git a/Documentation/admin-guide/perf/meson-ddr-pmu.rst b/Documentation/admin-guide/perf/meson-ddr-pmu.rst
index 15e93a751ced8a..4a1fdb5aba4b24 100644
--- a/Documentation/admin-guide/perf/meson-ddr-pmu.rst
+++ b/Documentation/admin-guide/perf/meson-ddr-pmu.rst
@@ -4,11 +4,12 @@
Amlogic SoC DDR Bandwidth Performance Monitoring Unit (PMU)
===========================================================
-There is a bandwidth monitor inside the DRAM controller. The monitor includes
-4 channels which can count the read/write request of accessing DRAM individually.
-It can be helpful to show if the performance bottleneck is on DDR bandwidth.
+The Amlogic Meson G12 SoC contains a bandwidth monitor inside the DRAM
+controller. The monitor includes 4 channels, each of which can count DRAM
+read/write requests individually. It can help show whether a performance
+bottleneck is due to DDR bandwidth.
-Currently, this driver supports the following 5 Perf events:
+Currently, this driver supports the following 5 perf events:
+ meson_ddr_bw/total_rw_bytes/
+ meson_ddr_bw/chan_1_rw_bytes/
@@ -16,55 +17,54 @@ Currently, this driver supports the following 5 Perf events:
+ meson_ddr_bw/chan_3_rw_bytes/
+ meson_ddr_bw/chan_4_rw_bytes/
-meson_ddr_bw/chan_{1,2,3,4}_rw_bytes/ events are the channel related events.
-Each channel support using keywords as filter, which can let the channel
-to monitor the individual IP module in SoC.
+meson_ddr_bw/chan_{1,2,3,4}_rw_bytes/ events are channel-specific events.
+Each channel supports filtering, which lets the channel monitor an
+individual IP module in the SoC.
-The following keywords are the filter:
+Below are DDR access request event filter keywords:
-+ arm - DDR access request from CPU
-+ vpu_read1 - DDR access request from OSD + VPP read
-+ gpu - DDR access request from 3D GPU
-+ pcie - DDR access request from PCIe controller
-+ hdcp - DDR access request from HDCP controller
-+ hevc_front - DDR access request from HEVC codec front end
-+ usb3_0 - DDR access request from USB3.0 controller
-+ hevc_back - DDR access request from HEVC codec back end
-+ h265enc - DDR access request from HEVC encoder
-+ vpu_read2 - DDR access request from DI read
-+ vpu_write1 - DDR access request from VDIN write
-+ vpu_write2 - DDR access request from di write
-+ vdec - DDR access request from legacy codec video decoder
-+ hcodec - DDR access request from H264 encoder
-+ ge2d - DDR access request from ge2d
-+ spicc1 - DDR access request from SPI controller 1
-+ usb0 - DDR access request from USB2.0 controller 0
-+ dma - DDR access request from system DMA controller 1
-+ arb0 - DDR access request from arb0
-+ sd_emmc_b - DDR access request from SD eMMC b controller
-+ usb1 - DDR access request from USB2.0 controller 1
-+ audio - DDR access request from Audio module
-+ sd_emmc_c - DDR access request from SD eMMC c controller
-+ spicc2 - DDR access request from SPI controller 2
-+ ethernet - DDR access request from Ethernet controller
++ arm - from CPU
++ vpu_read1 - from OSD + VPP read
++ gpu - from 3D GPU
++ pcie - from PCIe controller
++ hdcp - from HDCP controller
++ hevc_front - from HEVC codec front end
++ usb3_0 - from USB3.0 controller
++ hevc_back - from HEVC codec back end
++ h265enc - from HEVC encoder
++ vpu_read2 - from DI read
++ vpu_write1 - from VDIN write
++ vpu_write2 - from di write
++ vdec - from legacy codec video decoder
++ hcodec - from H264 encoder
++ ge2d - from ge2d
++ spicc1 - from SPI controller 1
++ usb0 - from USB2.0 controller 0
++ dma - from system DMA controller 1
++ arb0 - from arb0
++ sd_emmc_b - from SD eMMC b controller
++ usb1 - from USB2.0 controller 1
++ audio - from Audio module
++ sd_emmc_c - from SD eMMC c controller
++ spicc2 - from SPI controller 2
++ ethernet - from Ethernet controller
-The following command is to show the total DDR bandwidth:
+Examples:
- .. code-block:: bash
+ + Show the total DDR bandwidth per second:
- perf stat -a -e meson_ddr_bw/total_rw_bytes/ -I 1000 sleep 10
+ .. code-block:: bash
-This command will print the total DDR bandwidth per second.
+ perf stat -a -e meson_ddr_bw/total_rw_bytes/ -I 1000 sleep 10
-The following commands are to show how to use filter parameters:
- .. code-block:: bash
+ + Show the DDR bandwidth from the CPU and the GPU individually, as well as
+ their sum:
- perf stat -a -e meson_ddr_bw/chan_1_rw_bytes,arm=1/ -I 1000 sleep 10
- perf stat -a -e meson_ddr_bw/chan_2_rw_bytes,gpu=1/ -I 1000 sleep 10
- perf stat -a -e meson_ddr_bw/chan_3_rw_bytes,arm=1,gpu=1/ -I 1000 sleep 10
+ .. code-block:: bash
+
+ perf stat -a -e meson_ddr_bw/chan_1_rw_bytes,arm=1/ -I 1000 sleep 10
+ perf stat -a -e meson_ddr_bw/chan_2_rw_bytes,gpu=1/ -I 1000 sleep 10
+ perf stat -a -e meson_ddr_bw/chan_3_rw_bytes,arm=1,gpu=1/ -I 1000 sleep 10
-The 1st command show how to use channel 1 to monitor the DDR bandwidth from ARM.
-The 2nd command show using channel 2 to get the DDR bandwidth of GPU.
-The 3rd command show using channel 3 to monitor the sum of ARM and GPU.
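As a side note, the filter syntax in the examples above is regular enough
to be generated by a small shell helper. The following is only a sketch:
ddr_event is a hypothetical name (neither perf nor the driver provides it),
and it assumes every filter keyword is enabled by appending "keyword=1",
as in the examples. It prints the perf command instead of running it, so it
can be tried on a machine without the PMU:

```shell
#!/bin/sh
# ddr_event CHANNEL [KEYWORD...]
# Hypothetical helper: print a meson_ddr_bw event string for one channel,
# with each given filter keyword enabled as "keyword=1".
ddr_event() {
    chan=$1
    shift
    event="meson_ddr_bw/chan_${chan}_rw_bytes"
    for keyword in "$@"; do
        event="${event},${keyword}=1"
    done
    printf '%s/\n' "$event"
}

# Print (do not run) the command that sums CPU and GPU traffic on channel 3.
echo "perf stat -a -e $(ddr_event 3 arm gpu) -I 1000 sleep 10"
```

With no keywords, ddr_event emits the bare channel event, e.g.
"meson_ddr_bw/chan_4_rw_bytes/".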
Thanks.
--
An old man doll... just what I always wanted! - Clara
Sorry for my poor English. Your writing looks very elegant. I will apply
your modifications in the next version.
Thanks,
Jiucheng
On 2022/11/16 17:40, Bagas Sanjaya wrote:
>
> The wordings are rather weird, so I need to improve the doc:
>
> ---- >8 ----
>
> diff --git a/Documentation/admin-guide/perf/meson-ddr-pmu.rst b/Documentation/admin-guide/perf/meson-ddr-pmu.rst
> index 15e93a751ced8a..4a1fdb5aba4b24 100644
> --- a/Documentation/admin-guide/perf/meson-ddr-pmu.rst
> +++ b/Documentation/admin-guide/perf/meson-ddr-pmu.rst
> @@ -4,11 +4,12 @@
> Amlogic SoC DDR Bandwidth Performance Monitoring Unit (PMU)
> ===========================================================
>
> -There is a bandwidth monitor inside the DRAM controller. The monitor includes
> -4 channels which can count the read/write request of accessing DRAM individually.
> -It can be helpful to show if the performance bottleneck is on DDR bandwidth.
> +The Amlogic Meson G12 SoC contains a bandwidth monitor inside the DRAM
> +controller. The monitor includes 4 channels, each of which can count DRAM
> +read/write requests individually. It can help show whether a performance
> +bottleneck is due to DDR bandwidth.
>
> -Currently, this driver supports the following 5 Perf events:
> +Currently, this driver supports the following 5 perf events:
>
> + meson_ddr_bw/total_rw_bytes/
> + meson_ddr_bw/chan_1_rw_bytes/
> @@ -16,55 +17,54 @@ Currently, this driver supports the following 5 Perf events:
> + meson_ddr_bw/chan_3_rw_bytes/
> + meson_ddr_bw/chan_4_rw_bytes/
>
> -meson_ddr_bw/chan_{1,2,3,4}_rw_bytes/ events are the channel related events.
> -Each channel support using keywords as filter, which can let the channel
> -to monitor the individual IP module in SoC.
> +meson_ddr_bw/chan_{1,2,3,4}_rw_bytes/ events are channel-specific events.
> +Each channel supports filtering, which lets the channel monitor an
> +individual IP module in the SoC.
>
> -The following keywords are the filter:
> +Below are DDR access request event filter keywords:
>
> -+ arm - DDR access request from CPU
> -+ vpu_read1 - DDR access request from OSD + VPP read
> -+ gpu - DDR access request from 3D GPU
> -+ pcie - DDR access request from PCIe controller
> -+ hdcp - DDR access request from HDCP controller
> -+ hevc_front - DDR access request from HEVC codec front end
> -+ usb3_0 - DDR access request from USB3.0 controller
> -+ hevc_back - DDR access request from HEVC codec back end
> -+ h265enc - DDR access request from HEVC encoder
> -+ vpu_read2 - DDR access request from DI read
> -+ vpu_write1 - DDR access request from VDIN write
> -+ vpu_write2 - DDR access request from di write
> -+ vdec - DDR access request from legacy codec video decoder
> -+ hcodec - DDR access request from H264 encoder
> -+ ge2d - DDR access request from ge2d
> -+ spicc1 - DDR access request from SPI controller 1
> -+ usb0 - DDR access request from USB2.0 controller 0
> -+ dma - DDR access request from system DMA controller 1
> -+ arb0 - DDR access request from arb0
> -+ sd_emmc_b - DDR access request from SD eMMC b controller
> -+ usb1 - DDR access request from USB2.0 controller 1
> -+ audio - DDR access request from Audio module
> -+ sd_emmc_c - DDR access request from SD eMMC c controller
> -+ spicc2 - DDR access request from SPI controller 2
> -+ ethernet - DDR access request from Ethernet controller
> ++ arm - from CPU
> ++ vpu_read1 - from OSD + VPP read
> ++ gpu - from 3D GPU
> ++ pcie - from PCIe controller
> ++ hdcp - from HDCP controller
> ++ hevc_front - from HEVC codec front end
> ++ usb3_0 - from USB3.0 controller
> ++ hevc_back - from HEVC codec back end
> ++ h265enc - from HEVC encoder
> ++ vpu_read2 - from DI read
> ++ vpu_write1 - from VDIN write
> ++ vpu_write2 - from di write
> ++ vdec - from legacy codec video decoder
> ++ hcodec - from H264 encoder
> ++ ge2d - from ge2d
> ++ spicc1 - from SPI controller 1
> ++ usb0 - from USB2.0 controller 0
> ++ dma - from system DMA controller 1
> ++ arb0 - from arb0
> ++ sd_emmc_b - from SD eMMC b controller
> ++ usb1 - from USB2.0 controller 1
> ++ audio - from Audio module
> ++ sd_emmc_c - from SD eMMC c controller
> ++ spicc2 - from SPI controller 2
> ++ ethernet - from Ethernet controller
>
>
> -The following command is to show the total DDR bandwidth:
> +Examples:
>
> - .. code-block:: bash
> + + Show the total DDR bandwidth per second:
>
> - perf stat -a -e meson_ddr_bw/total_rw_bytes/ -I 1000 sleep 10
> + .. code-block:: bash
>
> -This command will print the total DDR bandwidth per second.
> + perf stat -a -e meson_ddr_bw/total_rw_bytes/ -I 1000 sleep 10
>
> -The following commands are to show how to use filter parameters:
>
> - .. code-block:: bash
> + + Show the DDR bandwidth from the CPU and the GPU individually, as well as
> + their sum:
>
> - perf stat -a -e meson_ddr_bw/chan_1_rw_bytes,arm=1/ -I 1000 sleep 10
> - perf stat -a -e meson_ddr_bw/chan_2_rw_bytes,gpu=1/ -I 1000 sleep 10
> - perf stat -a -e meson_ddr_bw/chan_3_rw_bytes,arm=1,gpu=1/ -I 1000 sleep 10
> + .. code-block:: bash
> +
> + perf stat -a -e meson_ddr_bw/chan_1_rw_bytes,arm=1/ -I 1000 sleep 10
> + perf stat -a -e meson_ddr_bw/chan_2_rw_bytes,gpu=1/ -I 1000 sleep 10
> + perf stat -a -e meson_ddr_bw/chan_3_rw_bytes,arm=1,gpu=1/ -I 1000 sleep 10
>
> -The 1st command show how to use channel 1 to monitor the DDR bandwidth from ARM.
> -The 2nd command show using channel 2 to get the DDR bandwidth of GPU.
> -The 3rd command show using channel 3 to monitor the sum of ARM and GPU.
>
> Thanks.
>
--
Thanks,
Jiucheng
On 11/16/22 16:56, Jiucheng Xu wrote:
> Sorry for my poor English. Your writing looks very elegant. I will apply your modification in next version.
>
Please don't top-post; reply inline with appropriate context
instead.
Also, wrap your message text at 72-80 characters.
Thanks.