Date: Fri, 24 Mar 2023 12:14:31 +0000
From: Jonathan Cameron
To: Jie Zhan
Subject: Re: [RFC PATCH v1 1/4] docs: perf: Add documentation for HiSilicon PMCU
Message-ID: <20230324121431.000034c4@Huawei.com>
In-Reply-To: <2d366bef-a891-6ee7-28bf-2091e0b78dbc@hisilicon.com>
References: <20230206065146.645505-1-zhanjie9@hisilicon.com>
 <20230206065146.645505-2-zhanjie9@hisilicon.com>
 <20230317133710.00007d48@Huawei.com>
 <2d366bef-a891-6ee7-28bf-2091e0b78dbc@hisilicon.com>
Organization: Huawei Technologies Research and Development (UK) Ltd.
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 24 Mar 2023 17:32:15 +0800
Jie Zhan wrote:

> On 17/03/2023 21:37, Jonathan Cameron wrote:
> > On Mon, 6 Feb 2023 14:51:43 +0800
> > Jie Zhan wrote:
> >
> >> Document the overview and usage of HiSilicon PMCU.
> >>
> >> HiSilicon Performance Monitor Control Unit (PMCU) is a device that offloads
> >> PMU accesses from CPUs, handling the configuration, event switching, and
> >> counter reading of core PMUs on Kunpeng SoC. It facilitates fine-grained
> >> and multi-PMU-event CPU profiling, in which scenario the current 'perf'
> >> scheme may lose events or drop sampling frequency. With PMCU, users can
> >> reliably obtain the data of up to 240 PMU events with the sample interval
> >> of events down to 1ms, while the software overhead of accessing PMUs, as
> >> well as its impact on target workloads, is reduced.
> >>
> >> Signed-off-by: Jie Zhan
> > Nice documentation. I've read this a few times before, but on this read
> > through wondered if we could say anything about the skew between capture
> > of the counters.
> > Not that important though so I'm happy to add
> >
> > Reviewed-by: Jonathan Cameron
> >
> > though this may of course need updating significantly as the interface
> > is refined (the RFC question you raised for example in the cover letter).
> >
> > Thanks
> >
> > Jonathan
> >
> >> ---
> >> Documentation/admin-guide/perf/hisi-pmcu.rst | 183 +++++++++++++++++++
> >> Documentation/admin-guide/perf/index.rst | 1 +
> >> 2 files changed, 184 insertions(+)
> >> create mode 100644 Documentation/admin-guide/perf/hisi-pmcu.rst
> >>
> >> diff --git a/Documentation/admin-guide/perf/hisi-pmcu.rst b/Documentation/admin-guide/perf/hisi-pmcu.rst
> >> new file mode 100644
> >> index 000000000000..50d17cbd0049
> >> --- /dev/null
> >> +++ b/Documentation/admin-guide/perf/hisi-pmcu.rst
> >> @@ -0,0 +1,183 @@
> >> +.. SPDX-License-Identifier: GPL-2.0
> >> +
> >> +==========================================
> >> +HiSilicon Performance Monitor Control Unit
> >> +==========================================
> >> +
> >> +Introduction
> >> +============
> >> +
> >> +HiSilicon Performance Monitor Control Unit (PMCU) is a device that offloads
> >> +PMU accesses from CPUs, handling the configuration, event switching, and
> >> +counter reading of core PMUs on Kunpeng SoC. It facilitates fine-grained
> >> +and multi-PMU-event CPU profiling, in which scenario the current ``perf``
> >> +scheme may lose events or drop sampling frequency. With PMCU, users can
> >> +reliably obtain the data of up to 240 PMU events with the sample interval
> >> +of events down to 1ms, while the software overhead of accessing PMUs, as
> >> +well as its impact on target workloads, is reduced.
> >> +
> >> +Each CPU die is equipped with a PMCU device. The PMCU driver registers it as a
> >> +PMU device, named as ``hisi_pmcu_sccl``, where ```` is the corresponding
> >> +CPU die ID. When triggered, PMCU reads event IDs and passes them to PMUs in all
> >> +CPUs on the CPU die it is on.
> >> +PMCU then starts the counters for counting
> >> +events, waits for a time interval, and stops them. The PMU counter readings are
> >> +dumped from hardware to memory, i.e. perf AUX buffers, and further copied to
> >> +the ``perf.data`` file in the user space. PMCU automatically switches events
> >> +(when there are more events than available PMU counters) and completes multiple
> >> +rounds of PMU event counting in one trigger.
> >> +
> >> +Hardware overview
> >> +=================
> >> +
> >> +On Kunpeng SoC, each CPU die is equipped with a PMCU device. PMCU acts like an
> >> +assistant to access the core PMUs on its die and move the counter readings to
> >> +memory. An overview of PMCU's hardware organization is shown below::
> >> +
> >> +                          +--------------------+
> >> +                          |       Memory       |
> >> +                          | +------+ +-------+ |
> >> +            +--------+    | |Events| |Samples| |
> >> +            |  PMCU  |    | +------+ +-------+ |
> >> +            +---|----+    +---------|----------+
> >> +                |                   |
> >> +  =======================================================  Bus
> >> +             |                         |               |
> >> +  +----------|----------+   +----------|----------+    |
> >> +  | +------+ | +------+ |   | +------+ | +------+ |    |
> >> +  | |Core 0| | |Core 1| |   | |Core 0| | |Core 1| |    |
> >> +  | +--|---+ | +--|---+ |   | +--|---+ | +--|---+ |  (More
> >> +  |    +-----+----+     |   |    +-----+----+     |  clusters
> >> +  | +--|---+   +--|---+ |   | +--|---+   +--|---+ |  ...)
> >> +  | |Core 2|   |Core 3| |   | |Core 2|   |Core 3| |
> >> +  | +------+   +------+ |   | +------+   +------+ |
> >> +  |    CPU Cluster 0    |   |    CPU Cluster 1    |
> >> +  +---------------------+   +---------------------+
> >> +
> >> +On Kunpeng SoC, a CPU die is formed of several CPU clusters and several
> >> +CPUs per cluster. PMCU is able to access the core PMUs in these CPUs.
> >> +The main job of PMCU is to fetch PMU event IDs from memory, make PMUs count the
> >> +events for a while, and move the counter readings back to memory.
> >> +
> >> +Once triggered, PMCU performs a number of loops and processes a number of
> >> +events in each loop. It fetches ``nr_pmu`` events from memory at a time, where
> >> +``nr_pmu`` denotes the number of PMU counters to be used in each CPU. The
> >> +``nr_pmu`` events are passed to the PMU counters of all CPUs on the CPU die
> >> +where PMCU resides. Then, PMCU starts all the counters, waits for a period,
> >> +stops all the counters, and moves the counter readings to memory, before
> >> +handling the next ``nr_pmu`` events if there are more events to process in this
> >> +loop. The number of loops and ``nr_pmu`` are determined by the driver, whereas
> >> +the number of events to process depends on user inputs. The counters are
> >> +stopped when PMCU reads counters and switches events, so there is a tiny time
> >> +window during which the events are not counted.
> > I'm not clear from this description whether there is 'skew' between the counters
> > (beyond the normal issues from uarch). Does the PMCU stop all counters
> > then read them all (minimizing skew) or does it stop each CPU's set of counters
> > and read those, or stop each individual counter before reading?
> >
> > My impression is that this feature is meant to be left running over timescales
> > much longer than the sampling period so it may not be necessary to align the
> > different lines on the resulting graphs perfectly. Hence maybe this doesn't matter.
> >
> Thanks for pointing this out.
>
> The PMCU stops all the counters before reading any counters (i.e. the
> first case you said).
>
> The basic procedure is:
>     start counters -> wait -> stop counters -> read and reset counters
>     -> switch events -> start counters -> ...
> where each step applies to all CPUs and counters.

Great. So this is across all cores on a die so skew should be minimized
(at a cost of missing more events than a skew-heavy approach).

>
> The counters don't count during the tiny stop-start window.
> I guess a small improvement would be: reset -> read -> switch -> reset
> -> ..., while the counters keep running,
> but we still lose some event counts between read and reset, and thus, no
> fundamental difference.

Lots of ways to reduce both skew and missed counts, but I think you are right
in that none of them matter for the intended long-term monitoring use case.

Jonathan

>
> Regards,
> Jie
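
For reference, the procedure discussed in this thread can be sketched in a
few lines of Python. This is only an illustration under stated assumptions,
not the driver or the hardware interface: the function name
pmcu_trigger_steps and the (step, events) tuple layout are made up here.
It models one trigger as batches of nr_pmu events and lists the steps in
the order given in the thread, with every step applying to all CPUs and
counters on the die.

```python
def pmcu_trigger_steps(event_ids, nr_pmu):
    """List the steps of one trigger, in order (hypothetical model).

    event_ids: event IDs supplied by the user.
    nr_pmu:    number of PMU counters used per CPU, i.e. the batch size.
    """
    if nr_pmu <= 0:
        raise ValueError("nr_pmu must be positive")

    # Split the events into batches of at most nr_pmu events each.
    batches = [event_ids[i:i + nr_pmu]
               for i in range(0, len(event_ids), nr_pmu)]

    steps = []
    for index, batch in enumerate(batches):
        if index > 0:
            # Counters are stopped here, so events in this window are lost.
            steps.append(("switch events", batch))
        steps.append(("start counters", batch))
        steps.append(("wait", batch))
        # All counters are stopped before any of them is read.
        steps.append(("stop counters", batch))
        steps.append(("read and reset counters", batch))
    return steps


if __name__ == "__main__":
    for step, events in pmcu_trigger_steps([1, 2, 3, 4, 5], nr_pmu=2):
        print(step, events)
```

With five events and nr_pmu=2, the sketch yields three batches, with a
"switch events" step between consecutive batches. Each such step marks a
stop-start window during which nothing is counted.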