Message-ID: <4763358a-91fa-3607-dd54-749b40f977fc@intel.com>
Date: Wed, 7 Sep 2022 08:05:27 +0300
Subject: Re: [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets
From: Adrian Hunter
Organization: Intel Finland Oy, Registered Address: PL 281, 00181 Helsinki, Business Identity Code: 0357606 - 4, Domiciled in Helsinki
To: Jiri Olsa
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Peter Zijlstra, Ingo Molnar
References: <20220905114209.8389-1-adrian.hunter@intel.com> <60b5c9bf-4ec9-957e-17dd-aa0a50411ff9@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 6/09/22 22:45, Jiri Olsa wrote:
> On Tue, Sep 06, 2022 at 05:04:45PM +0300, Adrian Hunter wrote:
>> On 6/09/22 15:59, Jiri Olsa wrote:
>>> On Mon, Sep 05, 2022 at 02:42:09PM +0300, Adrian Hunter wrote:
>>>
>>> SNIP
>>>
>>>> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
>>>> index e6c98a6e3908..6b1bafe267a4 100644
>>>> --- a/tools/lib/perf/evlist.c
>>>> +++ b/tools/lib/perf/evlist.c
>>>> @@ -486,6 +486,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>>>>  			if (ops->idx)
>>>>  				ops->idx(evlist, evsel, mp, idx);
>>>>
>>>> +			pr_debug("idx %d: mmapping fd %d\n", idx, *output);
>>>>  			if (ops->mmap(map, mp, *output, evlist_cpu) < 0)
>>>>  				return -1;
>>>>
>>>> @@ -494,6 +495,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>>>>  			if (!idx)
>>>>  				perf_evlist__set_mmap_first(evlist, map, overwrite);
>>>>  		} else {
>>>> +			pr_debug("idx %d: set output fd %d -> %d\n", idx, fd, *output);
>>>>  			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
>>>>  				return -1;
>>>>
>>>> @@ -519,6 +521,48 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>>>>  	return 0;
>>>>  }
>>>>
>>>> +static int
>>>> +mmap_per_thread(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>>>> +		struct perf_mmap_param *mp)
>>>> +{
>>>> +	int nr_threads = perf_thread_map__nr(evlist->threads);
>>>> +	int nr_cpus    = perf_cpu_map__nr(evlist->all_cpus);
>>>> +	int cpu, thread, idx = 0;
>>>> +	int nr_mmaps = 0;
>>>> +
>>>> +	pr_debug("%s: nr cpu values (may include -1) %d nr threads %d\n",
>>>> +		 __func__, nr_cpus, nr_threads);
>>>
>>> -1 as cpu value is only for 'empty' perf_cpu_map, right?
>>
>> The cpu map is a map of valid 3rd arguments to perf_event_open, so -1
>> means all CPUs which is per-thread by necessity.
>>
>>>
>>>> +
>>>> +	/* per-thread mmaps */
>>>> +	for (thread = 0; thread < nr_threads; thread++, idx++) {
>>>> +		int output = -1;
>>>> +		int output_overwrite = -1;
>>>> +
>>>> +		if (mmap_per_evsel(evlist, ops, idx, mp, 0, thread, &output,
>>>> +				   &output_overwrite, &nr_mmaps))
>>>> +			goto out_unmap;
>>>> +	}
>>>> +
>>>> +	/* system-wide mmaps i.e. per-cpu */
>>>> +	for (cpu = 1; cpu < nr_cpus; cpu++, idx++) {
>>>> +		int output = -1;
>>>> +		int output_overwrite = -1;
>>>> +
>>>> +		if (mmap_per_evsel(evlist, ops, idx, mp, cpu, 0, &output,
>>>> +				   &output_overwrite, &nr_mmaps))
>>>> +			goto out_unmap;
>>>> +	}
>>>
>>> will this loop be executed? we are in here because all_cpus is empty, right?
>>
>> Yes it is executed. I put back the code that was there before ae4f8ae16a07
>> ("libperf evlist: Allow mixing per-thread and per-cpu mmaps"), which uses
>
> hm, but commit ae4f8ae16a07 does not have similar cpu loop

It was calling mmap_per_cpu() for that case. The 2 cases:
mmap_per_cpu() and mmap_per_thread() could still be
combined into a single function.

>
>> perf_cpu_map__empty() which only checks the first entry is -1:
>>
>> bool perf_cpu_map__empty(const struct perf_cpu_map *map)
>> {
>> 	return map ? map->map[0].cpu == -1 : true;
>> }
>>
>> But there can be more CPUs in the map, so perf_cpu_map__empty()
>> returns true for the per-thread case, as desired, even if there
>> are also system-wide CPUs.
>
> I don't see how, if I'd see -1 together with other cpu values in
> perf_cpu_map I'd think it's a bug, but I might be missing some
> auxtrace usage,

Yes, it is for system-wide collection of events that can affect
every CPU. Currently text_poke is always system-wide - see the
Intel PT example.

>
> I thought we use -1 just for empty cpu map, so in per-thread case
> -1 is properly passed to perf_event_open syscall

Yes, but it does not need to be limited to that case.

>
> jirka
>
>>
>> I guess perf_cpu_map__empty() needs renaming.
>>
>>>
>>> thanks,
>>> jirka
>>>
>>>> +
>>>> +	if (nr_mmaps != evlist->nr_mmaps)
>>>> +		pr_err("Miscounted nr_mmaps %d vs %d\n", nr_mmaps, evlist->nr_mmaps);
>>>> +
>>>> +	return 0;
>>>> +
>>>> +out_unmap:
>>>> +	perf_evlist__munmap(evlist);
>>>> +	return -1;
>>>> +}
>>>
>>> SNIP
>>
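
To make the point about perf_cpu_map__empty() concrete, here is a minimal
standalone sketch. It is not libperf code: struct toy_cpu_map and the toy_*
helpers are hypothetical stand-ins that only mirror the first-entry check
and the two loops quoted above, and they assume that a leading -1, when
present, is always the first entry of the map.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for libperf's struct perf_cpu_map. */
struct toy_cpu_map {
	int nr;		/* number of valid entries in cpu[] */
	int cpu[8];	/* -1 means "any CPU", i.e. per-thread */
};

/* Mirrors the quoted check: only the first entry is examined. */
static bool toy_cpu_map_empty(const struct toy_cpu_map *map)
{
	return map ? map->cpu[0] == -1 : true;
}

/*
 * Number of iterations of the two quoted loops taken together:
 * one per thread, plus one per CPU entry after the leading -1
 * (the second loop starts at index 1).
 * Assumption: the first entry of the map is -1.
 */
static int toy_nr_mmap_slots(const struct toy_cpu_map *map, int nr_threads)
{
	assert(map && map->nr >= 1 && map->cpu[0] == -1);
	return nr_threads + (map->nr - 1);
}
```

With a map of { -1, 0, 1, 2 } and 3 threads, toy_cpu_map_empty() still
returns true, and toy_nr_mmap_slots() gives 3 + 3 = 6, so the per-cpu loop
does run even though the map is reported as "empty".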