Date: Tue, 12 Dec 2023 14:17:25 +0000
Subject: Re: [PATCH v1 07/14] perf arm-spe/cs-etm: Directly iterate CPU maps
To: Ian Rogers, Leo Yan
References: <20231129060211.1890454-1-irogers@google.com>
 <20231129060211.1890454-8-irogers@google.com>
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
 Alexander Shishkin, Jiri Olsa, Namhyung Kim, Adrian Hunter,
 Suzuki K Poulose, Mike Leach, Leo Yan, John Garry, Will Deacon,
 Thomas Gleixner, Darren Hart, Davidlohr Bueso, André Almeida, Kan Liang,
 K Prateek Nayak, Sean Christopherson, Paolo Bonzini, Kajol Jain,
 Athira Rajeev, Andrew Jones, Alexandre Ghiti, Atish Patra,
 "Steinar H. Gunderson", Yang Jihong, Yang Li, Changbin Du, Sandipan Das,
 Ravi Bangoria, Paran Lee, Nick Desaulniers, Huacai Chen, Yanteng Si,
 linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
 coresight@lists.linaro.org, linux-arm-kernel@lists.infradead.org,
 bpf@vger.kernel.org
From: James Clark
In-Reply-To: <20231129060211.1890454-8-irogers@google.com>

On 29/11/2023 06:02, Ian Rogers wrote:
> Rather than iterate all CPUs and see if they are in CPU maps, directly
> iterate the CPU map. Similarly make use of the intersect
> function. Switch perf_cpu_map__has_any_cpu_or_is_empty to more
> appropriate alternatives.
>
> Signed-off-by: Ian Rogers
> ---
>  tools/perf/arch/arm/util/cs-etm.c    | 77 ++++++++++++----------------
>  tools/perf/arch/arm64/util/arm-spe.c |  4 +-
>  2 files changed, 34 insertions(+), 47 deletions(-)
>
> diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
> index 77e6663c1703..a68a72f2f668 100644
> --- a/tools/perf/arch/arm/util/cs-etm.c
> +++ b/tools/perf/arch/arm/util/cs-etm.c
> @@ -197,38 +197,32 @@ static int cs_etm_validate_timestamp(struct auxtrace_record *itr,
>  static int cs_etm_validate_config(struct auxtrace_record *itr,
>  				  struct evsel *evsel)
>  {
> -	int i, err = -EINVAL;
> +	int idx, err = -EINVAL;
>  	struct perf_cpu_map *event_cpus = evsel->evlist->core.user_requested_cpus;
>  	struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus();
> +	struct perf_cpu_map *intersect_cpus = perf_cpu_map__intersect(event_cpus, online_cpus);
> +	struct perf_cpu cpu;
>
> -	/* Set option of each CPU we have */
> -	for (i = 0; i < cpu__max_cpu().cpu; i++) {
> -		struct perf_cpu cpu = { .cpu = i, };
> -
> -		/*
> -		 * In per-cpu case, do the validation for CPUs to work with.
> -		 * In per-thread case, the CPU map is empty. Since the traced
> -		 * program can run on any CPUs in this case, thus don't skip
> -		 * validation.
> -		 */
> -		if (!perf_cpu_map__has_any_cpu_or_is_empty(event_cpus) &&
> -		    !perf_cpu_map__has(event_cpus, cpu))
> -			continue;

This has broken validation for per-thread sessions.
perf_cpu_map__intersect() doesn't seem to handle the case where an 'any'
map is intersected with an online map: that should return the online
map. At least it needs to for this to work, and it seems sensible for it
to behave that way.

At least that was my initial impression; I only debugged it far enough
to see that the loop is now skipped entirely.

> -
> -		if (!perf_cpu_map__has(online_cpus, cpu))
> -			continue;
> +	perf_cpu_map__put(online_cpus);
>
> -		err = cs_etm_validate_context_id(itr, evsel, i);
> +	/*
> +	 * Set option of each CPU we have. In per-cpu case, do the validation
> +	 * for CPUs to work with. In per-thread case, the CPU map is empty.
> +	 * Since the traced program can run on any CPUs in this case, thus don't
> +	 * skip validation.
> +	 */
> +	perf_cpu_map__for_each_cpu_skip_any(cpu, idx, intersect_cpus) {
> +		err = cs_etm_validate_context_id(itr, evsel, cpu.cpu);
>  		if (err)
>  			goto out;
> -		err = cs_etm_validate_timestamp(itr, evsel, i);
> +		err = cs_etm_validate_timestamp(itr, evsel, idx);
>  		if (err)
>  			goto out;
>  	}
>
>  	err = 0;
>  out:
> -	perf_cpu_map__put(online_cpus);
> +	perf_cpu_map__put(intersect_cpus);
>  	return err;
>  }
>
> @@ -435,7 +429,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
>  	 * Also the case of per-cpu mmaps, need the contextID in order to be notified
>  	 * when a context switch happened.
>  	 */
> -	if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) {
> +	if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) {
>  		evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
>  					   "timestamp", 1);
>  		evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
> @@ -461,7 +455,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
>  	evsel->core.attr.sample_period = 1;
>
>  	/* In per-cpu case, always need the time of mmap events etc */
> -	if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus))
> +	if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus))
>  		evsel__set_sample_bit(evsel, TIME);
>
>  	err = cs_etm_validate_config(itr, cs_etm_evsel);
> @@ -533,38 +527,32 @@ static size_t
>  cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused,
>  		      struct evlist *evlist __maybe_unused)
>  {
> -	int i;
> +	int idx;
>  	int etmv3 = 0, etmv4 = 0, ete = 0;
>  	struct perf_cpu_map *event_cpus = evlist->core.user_requested_cpus;
>  	struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus();
> +	struct perf_cpu cpu;
>
>  	/* cpu map is not empty, we have specific CPUs to work with */
> -	if (!perf_cpu_map__has_any_cpu_or_is_empty(event_cpus)) {
> -		for (i = 0; i < cpu__max_cpu().cpu; i++) {
> -			struct perf_cpu cpu = { .cpu = i, };
> -
> -			if (!perf_cpu_map__has(event_cpus, cpu) ||
> -			    !perf_cpu_map__has(online_cpus, cpu))
> -				continue;
> +	if (!perf_cpu_map__is_empty(event_cpus)) {
> +		struct perf_cpu_map *intersect_cpus =
> +			perf_cpu_map__intersect(event_cpus, online_cpus);
>
> -			if (cs_etm_is_ete(itr, i))
> +		perf_cpu_map__for_each_cpu_skip_any(cpu, idx, intersect_cpus) {
> +			if (cs_etm_is_ete(itr, cpu.cpu))
>  				ete++;
> -			else if (cs_etm_is_etmv4(itr, i))
> +			else if (cs_etm_is_etmv4(itr, cpu.cpu))
>  				etmv4++;
>  			else
>  				etmv3++;
>  		}
> +		perf_cpu_map__put(intersect_cpus);
>  	} else {
>  		/* get configuration for all CPUs in the system */
> -		for (i = 0; i < cpu__max_cpu().cpu; i++) {
> -			struct perf_cpu cpu = { .cpu = i, };
> -
> -			if (!perf_cpu_map__has(online_cpus, cpu))
> -				continue;
> -
> -			if (cs_etm_is_ete(itr, i))
> +		perf_cpu_map__for_each_cpu(cpu, idx, online_cpus) {
> +			if (cs_etm_is_ete(itr, cpu.cpu))
>  				ete++;
> -			else if (cs_etm_is_etmv4(itr, i))
> +			else if (cs_etm_is_etmv4(itr, cpu.cpu))
>  				etmv4++;
>  			else
>  				etmv3++;
> @@ -814,15 +802,14 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
>  		return -EINVAL;
>
>  	/* If the cpu_map is empty all online CPUs are involved */
> -	if (perf_cpu_map__has_any_cpu_or_is_empty(event_cpus)) {
> +	if (perf_cpu_map__is_empty(event_cpus)) {
>  		cpu_map = online_cpus;
>  	} else {
>  		/* Make sure all specified CPUs are online */
> -		for (i = 0; i < perf_cpu_map__nr(event_cpus); i++) {
> -			struct perf_cpu cpu = { .cpu = i, };
> +		struct perf_cpu cpu;
>
> -			if (perf_cpu_map__has(event_cpus, cpu) &&
> -			    !perf_cpu_map__has(online_cpus, cpu))
> +		perf_cpu_map__for_each_cpu(cpu, i, event_cpus) {
> +			if (!perf_cpu_map__has(online_cpus, cpu))
>  				return -EINVAL;
>  		}
>
> diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
> index 51ccbfd3d246..0b52e67edb3b 100644
> --- a/tools/perf/arch/arm64/util/arm-spe.c
> +++ b/tools/perf/arch/arm64/util/arm-spe.c
> @@ -232,7 +232,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
>  	 * In the case of per-cpu mmaps, sample CPU for AUX event;
>  	 * also enable the timestamp tracing for samples correlation.
>  	 */
> -	if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) {
> +	if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) {
>  		evsel__set_sample_bit(arm_spe_evsel, CPU);
>  		evsel__set_config_if_unset(arm_spe_pmu, arm_spe_evsel,
>  					   "ts_enable", 1);
> @@ -265,7 +265,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
>  	tracking_evsel->core.attr.sample_period = 1;
>
>  	/* In per-cpu case, always need the time of mmap events etc */
> -	if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) {
> +	if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) {
>  		evsel__set_sample_bit(tracking_evsel, TIME);
>  		evsel__set_sample_bit(tracking_evsel, CPU);
>
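
For illustration only, here is a minimal sketch of the per-thread problem raised in the reply to cs_etm_validate_config(). It is a toy model and not libperf code: struct toy_cpu_map, TOY_ANY_CPU and every toy_* function are hypothetical stand-ins. It assumes that a per-thread session's map holds a single 'any' (-1) entry and that the intersection is a plain set intersection. Under those assumptions the intersection with the online map comes out empty, so a loop that skips 'any' entries visits nothing. The any-aware variant shows the behaviour the reply suggests would make sense.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_CPUS 8
#define TOY_ANY_CPU (-1)	/* assumption: 'any' is modelled as CPU -1 */

/* Hypothetical stand-in for struct perf_cpu_map: a small array of CPU numbers. */
struct toy_cpu_map {
	int nr;
	int cpus[TOY_MAX_CPUS];
};

static bool toy_map_has(const struct toy_cpu_map *map, int cpu)
{
	for (int i = 0; i < map->nr; i++)
		if (map->cpus[i] == cpu)
			return true;
	return false;
}

static bool toy_map_is_any(const struct toy_cpu_map *map)
{
	return map->nr == 1 && map->cpus[0] == TOY_ANY_CPU;
}

/* Plain set intersection: the 'any' entry is treated like any other value. */
static void toy_intersect(const struct toy_cpu_map *a,
			  const struct toy_cpu_map *b,
			  struct toy_cpu_map *out)
{
	assert(out != a && out != b);
	out->nr = 0;
	for (int i = 0; i < a->nr; i++)
		if (toy_map_has(b, a->cpus[i]))
			out->cpus[out->nr++] = a->cpus[i];
}

/* Any-aware variant: 'any' intersected with a map yields that map. */
static void toy_intersect_any_aware(const struct toy_cpu_map *a,
				    const struct toy_cpu_map *b,
				    struct toy_cpu_map *out)
{
	if (toy_map_is_any(a)) {
		*out = *b;
		return;
	}
	if (toy_map_is_any(b)) {
		*out = *a;
		return;
	}
	toy_intersect(a, b, out);
}

/* Counts the CPUs a "for each CPU, skipping 'any'" loop would visit. */
static int toy_count_validated(const struct toy_cpu_map *map)
{
	int visited = 0;

	for (int i = 0; i < map->nr; i++) {
		if (map->cpus[i] == TOY_ANY_CPU)
			continue;
		visited++;
	}
	return visited;
}
```

With an 'any' map and an online map of CPUs 0-3, toy_intersect() leaves nothing for the validation loop to visit, while toy_intersect_any_aware() returns all four online CPUs. Whether the real fix belongs in the intersect function or in the caller is a question for the patch author.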