Received: by 2002:a05:7412:8d10:b0:f3:1519:9f41 with SMTP id bj16csp4080199rdb; Mon, 11 Dec 2023 08:17:11 -0800 (PST) X-Google-Smtp-Source: AGHT+IEhI7l5cfkr50l44QiIiqqH60KVLBgKpcv/PuQ8yJlXR52+7j6FGT8LUTM13e94VOiFWycd X-Received: by 2002:a05:6a20:3829:b0:18f:97c:8a3a with SMTP id p41-20020a056a20382900b0018f097c8a3amr5243867pzf.101.1702311431408; Mon, 11 Dec 2023 08:17:11 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1702311431; cv=none; d=google.com; s=arc-20160816; b=e0y0A1ApvO0Sfk24NvRpvYvWNxGDgThqU1uSfaGrnj40S7+hpehJXCytiiMtvyqm3P GTmAsjPjmXRwkG0HP23AWTB1XOEBn2OsSF2ZBEcNQCjEoZUmRmCQOP2cGH2YY6pWMmRS 9uQRVjQzwd58VNVTND70X2b6vY+gLeemQO6qwfv8S8OnD2o63shLBr3yTzcNNxNTA1iy HUg9u0Kl51gTQWKsMS4jUPHXqzUzol56qPFv6js9U45u54PCxhTFEy6sEuJwkaHZeICm AK7HBfEkkrLpSzLCZQpQkT+mWQ1iQAlqyYJyrnLXZkPijU2t2rQAQptEXc5Me/pMT/Gs N8eg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=SsAt91amLeybov6riEYx6BM2qFZrSAtbJ9XAzd8W2Z0=; fh=oRTr1d7mCDMKQCV8DFHHjee6VGmKsUMMuSV4+yK9gas=; b=ITniMcIJ+7tvkDbNNduqtn+WVIKGvs2OOGRr3aD2yoahATMbqpCp01lzXUyuQ4foiN dcKh3+lkfskMDzUoDDbulYM7oXAeFFBXq511leUaLsH1TPO3e7E9fXoshmVFO7j0SMyB +TGRx3KYyG9BR9fCwcAp8rRSDBUe4tFlNKRmiqCpXnm+jsNxd3Cbi/i6bJNFJhdEeZii 8Yuh5heg79b49+sCh+v82LIiY4Btvjgm7c0kg2/7V7MYFZk2p8pT8AReZDrAKz4wn6Yq VGcFDs7Bu6X8fZr0StAtkc+bKv31r0Rr0u5Vh0smhU5D0U2z6+H9pHfV5nXqan3mg3Qp PdxA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.38 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from fry.vger.email (fry.vger.email. [23.128.96.38]) by mx.google.com with ESMTPS id z2-20020a63c042000000b005b8ef498e2fsi6246647pgi.181.2023.12.11.08.17.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 11 Dec 2023 08:17:11 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.38 as permitted sender) client-ip=23.128.96.38; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.38 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by fry.vger.email (Postfix) with ESMTP id 5501B80613A1; Mon, 11 Dec 2023 08:17:07 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at fry.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344616AbjLKQQp (ORCPT + 99 others); Mon, 11 Dec 2023 11:16:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45202 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344623AbjLKQQI (ORCPT ); Mon, 11 Dec 2023 11:16:08 -0500 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 0078511F; Mon, 11 Dec 2023 08:15:56 -0800 (PST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4F97F16F2; Mon, 11 Dec 2023 08:16:43 -0800 (PST) Received: from e127643.broadband (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id CC74C3F738; Mon, 11 Dec 2023 08:15:52 -0800 (PST) From: James Clark To: linux-arm-kernel@lists.infradead.org, linux-perf-users@vger.kernel.org, suzuki.poulose@arm.com, will@kernel.org, mark.rutland@arm.com, anshuman.khandual@arm.com Cc: namhyung@gmail.com, James Clark , Catalin Marinas , Jonathan Corbet , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , Russell King , Marc Zyngier , Oliver Upton , James Morse , Zenghui Yu , Paolo Bonzini , Shuah Khan , Zaid Al-Bassam , Raghavendra Rao Ananta , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: [PATCH v7 10/11] arm64: perf: Add support for event counting threshold Date: Mon, 11 Dec 2023 16:13:22 +0000 Message-Id: <20231211161331.1277825-11-james.clark@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231211161331.1277825-1-james.clark@arm.com> References: <20231211161331.1277825-1-james.clark@arm.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-0.8 required=5.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on fry.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (fry.vger.email [0.0.0.0]); Mon, 11 Dec 2023 08:17:07 -0800 (PST) FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on events whose count meets a specified threshold condition. For example if PMEVTYPERn.TC (Threshold Control) is set to 0b101 (Greater than or equal, count), and the threshold is set to 2, then the PMU counter will now only increment by 1 when an event would have previously incremented the PMU counter by 2 or more on a single processor cycle. Three new Perf event config fields, 'threshold', 'threshold_compare' and 'threshold_count' have been added to control the feature. threshold_compare maps to the upper two bits of PMEVTYPERn.TC and threshold_count maps to the first bit of TC. These separate attributes have been picked rather than enumerating all the possible combinations of the TC field as in the Arm ARM. The attributes would be used on a Perf command line like this: $ perf stat -e stall_slot/threshold=2,threshold_compare=2/ A new capability for reading out the maximum supported threshold value has also been added: $ cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max 0x000000ff If a threshold higher than threshold_max is provided, then an error is generated. If FEAT_PMUv3_TH isn't implemented or a 32 bit kernel is running, then threshold_max reads zero, and attempting to set a threshold value will also result in an error. The threshold is per PMU counter, and there are potentially different threshold_max values per PMU type on heterogeneous systems. Bits higher than 32 now need to be written into PMEVTYPER, so armv8pmu_write_evtype() has to be updated to take an unsigned long value rather than u32 which gives the correct behavior on both aarch32 and 64. Signed-off-by: James Clark --- drivers/perf/arm_pmuv3.c | 79 +++++++++++++++++++++++++++++++++- include/linux/perf/arm_pmuv3.h | 1 + 2 files changed, 79 insertions(+), 1 deletion(-) diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c index d33378a198cc..0bd6565eb401 100644 --- a/drivers/perf/arm_pmuv3.c +++ b/drivers/perf/arm_pmuv3.c @@ -305,10 +305,22 @@ static const struct attribute_group armv8_pmuv3_events_attr_group = { #define ATTR_CFG_FLD_rdpmc_CFG config1 #define ATTR_CFG_FLD_rdpmc_LO 1 #define ATTR_CFG_FLD_rdpmc_HI 1 +#define ATTR_CFG_FLD_threshold_count_CFG config1 /* PMEVTYPER.TC[0] */ +#define ATTR_CFG_FLD_threshold_count_LO 2 +#define ATTR_CFG_FLD_threshold_count_HI 2 +#define ATTR_CFG_FLD_threshold_compare_CFG config1 /* PMEVTYPER.TC[2:1] */ +#define ATTR_CFG_FLD_threshold_compare_LO 3 +#define ATTR_CFG_FLD_threshold_compare_HI 4 +#define ATTR_CFG_FLD_threshold_CFG config1 /* PMEVTYPER.TH */ +#define ATTR_CFG_FLD_threshold_LO 5 +#define ATTR_CFG_FLD_threshold_HI 16 GEN_PMU_FORMAT_ATTR(event); GEN_PMU_FORMAT_ATTR(long); GEN_PMU_FORMAT_ATTR(rdpmc); +GEN_PMU_FORMAT_ATTR(threshold_count); +GEN_PMU_FORMAT_ATTR(threshold_compare); +GEN_PMU_FORMAT_ATTR(threshold); static int sysctl_perf_user_access __read_mostly; @@ -322,10 +334,27 @@ static bool armv8pmu_event_want_user_access(struct perf_event *event) return ATTR_CFG_GET_FLD(&event->attr, rdpmc); } +static u8 armv8pmu_event_threshold_control(struct perf_event_attr *attr) +{ + u8 th_compare = ATTR_CFG_GET_FLD(attr, threshold_compare); + u8 th_count = ATTR_CFG_GET_FLD(attr, threshold_count); + + /* + * The count bit is always the bottom bit of the full control field, and + * the comparison is the upper two bits, but it's not explicitly + * labelled in the Arm ARM. For the Perf interface we split it into two + * fields, so reconstruct it here. + */ + return (th_compare << 1) | th_count; +} + static struct attribute *armv8_pmuv3_format_attrs[] = { &format_attr_event.attr, &format_attr_long.attr, &format_attr_rdpmc.attr, + &format_attr_threshold.attr, + &format_attr_threshold_compare.attr, + &format_attr_threshold_count.attr, NULL, }; @@ -375,10 +404,38 @@ static ssize_t bus_width_show(struct device *dev, struct device_attribute *attr, static DEVICE_ATTR_RO(bus_width); +static u32 threshold_max(struct arm_pmu *cpu_pmu) +{ + /* + * PMMIR.THWIDTH is readable and non-zero on aarch32, but it would be + * impossible to write the threshold in the upper 32 bits of PMEVTYPER. + */ + if (IS_ENABLED(CONFIG_ARM)) + return 0; + + /* + * The largest value that can be written to PMEVTYPER_EL0.TH is + * (2 ^ PMMIR.THWIDTH) - 1. + */ + return (1 << FIELD_GET(ARMV8_PMU_THWIDTH, cpu_pmu->reg_pmmir)) - 1; +} + +static ssize_t threshold_max_show(struct device *dev, + struct device_attribute *attr, char *page) +{ + struct pmu *pmu = dev_get_drvdata(dev); + struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu); + + return sysfs_emit(page, "0x%08x\n", threshold_max(cpu_pmu)); +} + +static DEVICE_ATTR_RO(threshold_max); + static struct attribute *armv8_pmuv3_caps_attrs[] = { &dev_attr_slots.attr, &dev_attr_bus_slots.attr, &dev_attr_bus_width.attr, + &dev_attr_threshold_max.attr, NULL, }; @@ -562,7 +619,7 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value) armv8pmu_write_hw_counter(event, value); } -static void armv8pmu_write_evtype(int idx, u32 val) +static void armv8pmu_write_evtype(int idx, unsigned long val) { u32 counter = ARMV8_IDX_TO_COUNTER(idx); unsigned long mask = ARMV8_PMU_EVTYPE_EVENT | @@ -931,6 +988,10 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event, struct perf_event_attr *attr) { unsigned long config_base = 0; + struct perf_event *perf_event = container_of(attr, struct perf_event, + attr); + struct arm_pmu *cpu_pmu = to_arm_pmu(perf_event->pmu); + u32 th; if (attr->exclude_idle) { pr_debug("ARM performance counters do not support mode exclusion\n"); @@ -964,6 +1025,22 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event, if (attr->exclude_user) config_base |= ARMV8_PMU_EXCLUDE_EL0; + /* + * If FEAT_PMUv3_TH isn't implemented, then THWIDTH (threshold_max) will + * be 0 and will also trigger this check, preventing it from being used. + */ + th = ATTR_CFG_GET_FLD(attr, threshold); + if (th > threshold_max(cpu_pmu)) { + pr_debug("PMU event threshold exceeds max value\n"); + return -EINVAL; + } + + if (IS_ENABLED(CONFIG_ARM64) && th) { + config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TH, th); + config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TC, + armv8pmu_event_threshold_control(attr)); + } + /* * Install the filter into config_base as this is used to * construct the event type. diff --git a/include/linux/perf/arm_pmuv3.h b/include/linux/perf/arm_pmuv3.h index 91957b3468e9..0f4d62ef3a9a 100644 --- a/include/linux/perf/arm_pmuv3.h +++ b/include/linux/perf/arm_pmuv3.h @@ -262,6 +262,7 @@ #define ARMV8_PMU_SLOTS GENMASK(7, 0) #define ARMV8_PMU_BUS_SLOTS GENMASK(15, 8) #define ARMV8_PMU_BUS_WIDTH GENMASK(19, 16) +#define ARMV8_PMU_THWIDTH GENMASK(23, 20) /* * This code is really good -- 2.34.1