Message-ID: <2d366bef-a891-6ee7-28bf-2091e0b78dbc@hisilicon.com>
Date: Fri, 24 Mar 2023 17:32:15 +0800
Subject: Re: [RFC PATCH v1 1/4] docs: perf: Add documentation for HiSilicon PMCU
To: Jonathan Cameron
References: <20230206065146.645505-1-zhanjie9@hisilicon.com>
 <20230206065146.645505-2-zhanjie9@hisilicon.com>
 <20230317133710.00007d48@Huawei.com>
From: Jie Zhan
In-Reply-To: <20230317133710.00007d48@Huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 17/03/2023 21:37, Jonathan Cameron wrote:
> On Mon, 6 Feb 2023 14:51:43 +0800
> Jie Zhan wrote:
>
>> Document the overview and usage of HiSilicon PMCU.
>>
>> HiSilicon Performance Monitor Control Unit (PMCU) is a device that offloads
>> PMU accesses from CPUs, handling the configuration, event switching, and
>> counter reading of core PMUs on Kunpeng SoC. It facilitates fine-grained
>> and multi-PMU-event CPU profiling, in which scenario the current 'perf'
>> scheme may lose events or drop sampling frequency. With PMCU, users can
>> reliably obtain the data of up to 240 PMU events with the sample interval
>> of events down to 1ms, while the software overhead of accessing PMUs, as
>> well as its impact on target workloads, is reduced.
>>
>> Signed-off-by: Jie Zhan
> Nice documentation. I've read this a few times before, but on this read
> through wondered if we could say anything about the skew between capture
> of the counters. Not that important though so I'm happy to add
>
> Reviewed-by: Jonathan Cameron
>
> though this may of course need updating significantly as the interface
> is refined (the RFC question you raised for example in the cover letter).
>
> Thanks
>
> Jonathan
>
>> ---
>>  Documentation/admin-guide/perf/hisi-pmcu.rst | 183 +++++++++++++++++++
>>  Documentation/admin-guide/perf/index.rst     |   1 +
>>  2 files changed, 184 insertions(+)
>>  create mode 100644 Documentation/admin-guide/perf/hisi-pmcu.rst
>>
>> diff --git a/Documentation/admin-guide/perf/hisi-pmcu.rst b/Documentation/admin-guide/perf/hisi-pmcu.rst
>> new file mode 100644
>> index 000000000000..50d17cbd0049
>> --- /dev/null
>> +++ b/Documentation/admin-guide/perf/hisi-pmcu.rst
>> @@ -0,0 +1,183 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +==========================================
>> +HiSilicon Performance Monitor Control Unit
>> +==========================================
>> +
>> +Introduction
>> +============
>> +
>> +HiSilicon Performance Monitor Control Unit (PMCU) is a device that offloads
>> +PMU accesses from CPUs, handling the configuration, event switching, and
>> +counter reading of core PMUs on Kunpeng SoC. It facilitates fine-grained
>> +and multi-PMU-event CPU profiling, in which scenario the current ``perf``
>> +scheme may lose events or drop sampling frequency. With PMCU, users can
>> +reliably obtain the data of up to 240 PMU events with the sample interval
>> +of events down to 1ms, while the software overhead of accessing PMUs, as
>> +well as its impact on target workloads, is reduced.
>> +
>> +Each CPU die is equipped with a PMCU device. The PMCU driver registers it as a
>> +PMU device, named as ``hisi_pmcu_sccl<N>``, where ``<N>`` is the corresponding
>> +CPU die ID. When triggered, PMCU reads event IDs and pass them to PMUs in all
>> +CPUs on the CPU die it is on. PMCU then starts the counters for counting
>> +events, waits for a time interval, and stops them. The PMU counter readings are
>> +dumped from hardware to memory, i.e. perf AUX buffers, and further copied to
>> +the ``perf.data`` file in the user space.
>> +PMCU automatically switches events
>> +(when there are more events than available PMU counters) and completes multiple
>> +rounds of PMU event counting in one trigger.
>> +
>> +Hardware overview
>> +=================
>> +
>> +On Kunpeng SoC, each CPU die is equipped with a PMCU device. PMCU acts like an
>> +assistant to access the core PMUs on its die and move the counter readings to
>> +memory. An overview of PMCU's hardware organization is shown below::
>> +
>> +                          +--------------------+
>> +                          |       Memory       |
>> +                          | +------+ +-------+ |
>> +             +--------+   | |Events| |Samples| |
>> +             |  PMCU  |   | +------+ +-------+ |
>> +             +---|----+   +---------|----------+
>> +                 |                  |
>> +  =======================================================  Bus
>> +             |                         |              |
>> +  +----------|----------+   +----------|----------+   |
>> +  | +------+ | +------+ |   | +------+ | +------+ |   |
>> +  | |Core 0| | |Core 1| |   | |Core 0| | |Core 1| |   |
>> +  | +--|---+ | +--|---+ |   | +--|---+ | +--|---+ |  (More
>> +  |    +-----+----+     |   |    +-----+----+     |  clusters
>> +  | +--|---+   +--|---+ |   | +--|---+   +--|---+ |  ...)
>> +  | |Core 2|   |Core 3| |   | |Core 2|   |Core 3| |
>> +  | +------+   +------+ |   | +------+   +------+ |
>> +  |    CPU Cluster 0    |   |    CPU Cluster 1    |
>> +  +---------------------+   +---------------------+
>> +
>> +On Kunpeng SoC, a CPU die is formed of several CPU clusters and several
>> +CPUs per cluster. PMCU is able to access the core PMUs in these CPUs.
>> +The main job of PMCU is to fetch PMU event IDs from memory, make PMUs count the
>> +events for a while, and move the counter readings back to memory.
>> +
>> +Once triggered, PMCU performs a number of loops and processes a number of
>> +events in each loop. It fetches ``nr_pmu`` events from memory at a time, where
>> +``nr_pmu`` denotes the number of PMU counters to be used in each CPU. The
>> +``nr_pmu`` events are passed to the PMU counters of all CPUs on the CPU die
>> +where PMCU resides.
>> +Then, PMCU starts all the counters, waits for a period,
>> +stops all the counters, and moves the counter readings to memory, before
>> +handling the next ``nr_pmu`` events if there are more events to process in this
>> +loop. The number of loops and ``nr_pmu`` are determined by the driver, whereas
>> +the number of events to process depends on user inputs. The counters are
>> +stopped when PMCU reads counters and switches events, so there is a tiny time
>> +window during which the events are not counted.
> I'm not clear from this description whether there is 'skew' between the counters
> (beyond the normal issues from uarch). Does the PMCU stop all counters
> then read them all (minimizing skew) or does it stop each CPUs set of counters
> and read those, or stop each individual counter before reading?
>
> My impression is that this feature is meant to be left running over timescales
> much longer than the sampling period so it may not be necessary to align the
> different lines on the resulting graphs perfectly. Hence maybe this doesn't matter.
>

Thanks for pointing this out.

The PMCU stops all the counters before reading any of them (i.e. the first
case you mentioned).

The basic procedure is:
    start counters -> wait -> stop counters -> read and reset counters ->
    switch events -> start counters -> ...
where each step applies to all CPUs and counters.

The counters don't count during the tiny stop-start window.

I guess a small improvement would be:
    reset -> read -> switch -> reset -> ...,
while the counters keep running, but we would still lose some event counts
between read and reset, so there would be no fundamental difference.

Regards,
Jie
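
As a rough illustration of the procedure described above, here is a toy Python model. It is only a sketch under stated assumptions, not the driver or the hardware logic: the `Counter` class, the `pmcu_trigger` function, the tick-based clock (one event per tick per running counter), the `wait_ticks` and `gap_ticks` parameters, and the event IDs in the usage lines are all hypothetical. Only `nr_pmu`, the loop structure, and the "stop all counters before reading any" ordering come from the text.

```python
from dataclasses import dataclass


@dataclass
class Counter:
    """Toy PMU counter (assumption): counts one event per tick while running."""
    event: int = 0
    value: int = 0
    running: bool = False

    def tick(self) -> None:
        if self.running:
            self.value += 1


def pmcu_trigger(events, nr_cpus, nr_pmu, nr_loops, wait_ticks, gap_ticks):
    """Model one trigger; returns a list of (loop, cpu, event, count) samples."""
    counters = [[Counter() for _ in range(nr_pmu)] for _ in range(nr_cpus)]
    flat = [c for cpu in counters for c in cpu]
    samples = []

    def advance(ticks):
        # Let simulated time pass; only running counters accumulate.
        for _ in range(ticks):
            for c in flat:
                c.tick()

    for loop in range(nr_loops):
        # Process the event list nr_pmu events at a time.
        for first in range(0, len(events), nr_pmu):
            batch = events[first:first + nr_pmu]
            # Switch events: the same batch goes to every CPU.
            for cpu in counters:
                for c, ev in zip(cpu, batch):
                    c.event = ev
            # Start the counters in use on all CPUs, then wait.
            for cpu in counters:
                for c in cpu[:len(batch)]:
                    c.running = True
            advance(wait_ticks)
            # Stop ALL counters before reading ANY of them (no skew).
            for c in flat:
                c.running = False
            # Time spent reading and switching is not counted.
            advance(gap_ticks)
            # Read and reset.
            for cpu_id, cpu in enumerate(counters):
                for c in cpu[:len(batch)]:
                    samples.append((loop, cpu_id, c.event, c.value))
                    c.value = 0
    return samples


# Hypothetical event IDs, chosen only for the example.
samples = pmcu_trigger([0x11, 0x08, 0x10], nr_cpus=2, nr_pmu=2,
                       nr_loops=2, wait_ticks=5, gap_ticks=1)
for sample in samples:
    print(sample)
```

In this model every sample covers exactly `wait_ticks` ticks, and the `gap_ticks` spent between stop and start are never counted, which corresponds to the small stop-start window mentioned above.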