Date: Wed, 20 Jul 2022 09:41:03 +0800
From: Shuai Xue
Subject: Re: [RESEND PATCH v2 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver
To: Jonathan Cameron
Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    will@kernel.org, mark.rutland@arm.com, baolin.wang@linux.alibaba.com,
    yaohongbo@linux.alibaba.com, nengchen@linux.alibaba.com,
    zhuo.song@linux.alibaba.com
References: <20220617111825.92911-1-xueshuai@linux.alibaba.com>
    <20220715151310.90091-2-xueshuai@linux.alibaba.com>
    <20220719133509.00006ee9@Huawei.com>
In-Reply-To: <20220719133509.00006ee9@Huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2022/7/19 8:35 PM, Jonathan Cameron wrote:
> On Fri, 15 Jul 2022 23:13:08 +0800
> Shuai Xue wrote:
>
>> Alibaba's T-Head SoC implements uncore PMU for performance and functional
>> debugging to facilitate system maintenance. Document it to provide guidance
>> on how to use it.
>>
>> Signed-off-by: Shuai Xue
> I'm far from an expert on this, but looks good to me.
>
> Reviewed-by: Jonathan Cameron

Thanks for the review.

Cheers,
Shuai

>> ---
>>  .../admin-guide/perf/alibaba_pmu.rst          | 100 ++++++++++++++++++
>>  Documentation/admin-guide/perf/index.rst      |   1 +
>>  2 files changed, 101 insertions(+)
>>  create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst
>>
>> diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation/admin-guide/perf/alibaba_pmu.rst
>> new file mode 100644
>> index 000000000000..11de998bb480
>> --- /dev/null
>> +++ b/Documentation/admin-guide/perf/alibaba_pmu.rst
>> @@ -0,0 +1,100 @@
>> +=============================================================
>> +Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU)
>> +=============================================================
>> +
>> +The Yitian 710, custom-built by Alibaba Group's chip development business,
>> +T-Head, implements an uncore PMU for performance and functional debugging to
>> +facilitate system maintenance.
>> +
>> +DDR Sub-System Driveway (DRW) PMU Driver
>> +=========================================
>> +
>> +Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 channel
>> +services system memory requests independently of the others, and each DDR5
>> +channel is split into two independent sub-channels. The DDR Sub-System Driveway
>> +implements a separate PMU for each sub-channel to monitor various performance
>> +metrics.
>> +
>> +The Driveway PMU devices are named as ali_drw_ with perf.
>> +For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
>> +sub-channels of the same channel in die 0. PMU devices on die 1 are
>> +prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
>> +
>> +Each sub-channel has 36 PMU counters in total, which are classified into
>> +four groups:
>> +
>> +- Group 0: PMU Cycle Counter. This group has one pair of counters,
>> +  pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count
>> +  based on the DDRC core clock.
>> +
>> +- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used
>> +  to count the total number of accesses to either the eight bank groups in a
>> +  selected rank, or the four ranks separately in the first 4 counters. The base
>> +  transfer unit is 64B.
>> +
>> +- Group 2: PMU Retry Counters. This group has 10 counters that count the
>> +  total number of retries for each type of uncorrectable error.
>> +
>> +- Group 3: PMU Common Counters. This group has 16 counters that are used
>> +  to count common events.
>> +
>> +For now, the Driveway PMU driver only uses counters in group 0 and group 3.
>> +
>> +The DDR Controller (DDRCTL) and DDR PHY combine to create a complete solution
>> +for connecting an SoC application bus to DDR memory devices. The DDRCTL
>> +receives transactions via the Host Interface (HIF), which is custom-defined by
>> +Synopsys. These transactions are queued internally and scheduled for access
>> +while satisfying the SDRAM protocol timing requirements, transaction
>> +priorities, and dependencies between the transactions. The DDRCTL in turn
>> +issues commands on the DDR PHY Interface (DFI) to the PHY module, which launches
>> +and captures data to and from the SDRAM. The Driveway PMUs have hardware logic
>> +to gather statistics and performance logging signals on HIF, DFI, etc.
>> +
>> +By counting the READ, WRITE and RMW commands sent to the DDRC through the HIF
>> +interface, we can calculate the bandwidth. Example usage of counting memory
>> +data bandwidth::
>> +
>> +  perf stat \
>> +    -e ali_drw_21000/hif_wr/ \
>> +    -e ali_drw_21000/hif_rd/ \
>> +    -e ali_drw_21000/hif_rmw/ \
>> +    -e ali_drw_21000/cycle/ \
>> +    -e ali_drw_21080/hif_wr/ \
>> +    -e ali_drw_21080/hif_rd/ \
>> +    -e ali_drw_21080/hif_rmw/ \
>> +    -e ali_drw_21080/cycle/ \
>> +    -e ali_drw_23000/hif_wr/ \
>> +    -e ali_drw_23000/hif_rd/ \
>> +    -e ali_drw_23000/hif_rmw/ \
>> +    -e ali_drw_23000/cycle/ \
>> +    -e ali_drw_23080/hif_wr/ \
>> +    -e ali_drw_23080/hif_rd/ \
>> +    -e ali_drw_23080/hif_rmw/ \
>> +    -e ali_drw_23080/cycle/ \
>> +    -e ali_drw_25000/hif_wr/ \
>> +    -e ali_drw_25000/hif_rd/ \
>> +    -e ali_drw_25000/hif_rmw/ \
>> +    -e ali_drw_25000/cycle/ \
>> +    -e ali_drw_25080/hif_wr/ \
>> +    -e ali_drw_25080/hif_rd/ \
>> +    -e ali_drw_25080/hif_rmw/ \
>> +    -e ali_drw_25080/cycle/ \
>> +    -e ali_drw_27000/hif_wr/ \
>> +    -e ali_drw_27000/hif_rd/ \
>> +    -e ali_drw_27000/hif_rmw/ \
>> +    -e ali_drw_27000/cycle/ \
>> +    -e ali_drw_27080/hif_wr/ \
>> +    -e ali_drw_27080/hif_rd/ \
>> +    -e ali_drw_27080/hif_rmw/ \
>> +    -e ali_drw_27080/cycle/ -- sleep 10
>> +
>> +The average DRAM bandwidth can be calculated as follows:
>> +
>> +- Read Bandwidth = perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
>> +- Write Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
>> +
>> +Here, DDRC_WIDTH = 64 bytes.
>> +
>> +The current driver does not support sampling, so "perf record" is
>> +unsupported. Attaching to a task is also unsupported, as the events are all
>> +uncore.
>> diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
>> index 69b23f087c05..823db08863db 100644
>> --- a/Documentation/admin-guide/perf/index.rst
>> +++ b/Documentation/admin-guide/perf/index.rst
>> @@ -17,3 +17,4 @@ Performance monitor support
>>     xgene-pmu
>>     arm_dsu_pmu
>>     thunderx2-pmu
>> +   alibaba_pmu
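
For readers who want to try the bandwidth formulas from the quoted document, here is a minimal Python sketch. It is not part of the patch. The function name drw_bandwidth, its parameters, and the counter values and 1 GHz clock in the example call are hypothetical; only DDRC_WIDTH = 64 bytes and the two formulas come from the document. The sketch assumes DDRC_Freq is the DDRC core clock frequency in Hz and DDRC_Cycle is the cycle event count reported by the same perf stat run for that sub-channel.

```python
DDRC_WIDTH_BYTES = 64  # from the quoted document: DDRC_WIDTH = 64 bytes


def drw_bandwidth(hif_rd, hif_wr, hif_rmw, cycles, ddrc_freq_hz):
    """Return (read, write) bandwidth in bytes/second for one sub-channel.

    hif_rd, hif_wr, hif_rmw and cycles are the perf stat counts of the
    hif_rd, hif_wr, hif_rmw and cycle events on one ali_drw_* device.
    ddrc_freq_hz is an assumed input: the document's DDRC_Freq, taken
    here to be the DDRC core clock in Hz.
    """
    if cycles <= 0:
        raise ValueError("cycle count must be positive")
    # DDRC_WIDTH * DDRC_Freq / DDRC_Cycle, shared by both formulas.
    scale = DDRC_WIDTH_BYTES * ddrc_freq_hz / cycles
    read_bw = hif_rd * scale
    # Read-modify-write commands are counted on the write side.
    write_bw = (hif_wr + hif_rmw) * scale
    return read_bw, write_bw


if __name__ == "__main__":
    # Hypothetical counts and clock, chosen only to illustrate the arithmetic.
    rd, wr = drw_bandwidth(
        hif_rd=1_000_000,
        hif_wr=400_000,
        hif_rmw=100_000,
        cycles=2_000_000_000,
        ddrc_freq_hz=1_000_000_000,
    )
    print(f"read:  {rd / 1e6:.1f} MB/s")
    print(f"write: {wr / 1e6:.1f} MB/s")
```

Because each sub-channel has its own PMU device, a total for the system would be the sum of the per-device results over all ali_drw_* devices measured in the same run.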