Received: by 2002:a05:7412:37c9:b0:e2:908c:2ebd with SMTP id jz9csp485845rdb; Tue, 19 Sep 2023 01:02:35 -0700 (PDT) X-Google-Smtp-Source: AGHT+IG+eShovmr7SzAp9gz1hDffVCBIw8l0hZ7s6cfXgFrAnyBZFGhcqJrim8nYskiC2uOdZAGe X-Received: by 2002:a05:6870:548f:b0:1d0:da52:4235 with SMTP id f15-20020a056870548f00b001d0da524235mr13443753oan.26.1695110555321; Tue, 19 Sep 2023 01:02:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695110555; cv=none; d=google.com; s=arc-20160816; b=BpRgZuBrUf785i7lOwDj/otdtA37fMK+unLVvrxSUkKvyJKp6oVhcJgEMmW8fM4Nes uJ3tLVPvAeuWVtSOFvWnnxOouZavDZrToJkCdi26hJ/dL5AeL4d6sDVRbl6whrSclNJz 751TixwdrMrRcy8MA9JhQn8JwC8gmuS9Pkn3ZEuA28aTuvCCDDUMVbBKOpc+TJJ8n3Mo DtKXO2znAlCTTyI501+qD5pdblsDIPngaoR1OjiPbgan7wJHC0Ig7hVgchJhpvK80Oy5 PrwMkPYrHFgx7AFIW+kJKE9CZD/dxKoAn3AuW+LIBLYLe9CF5n8Cq/etjRbtmXTkB6Bd mlOw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to :organization:from:content-language:references:cc:to:subject :user-agent:mime-version:date:message-id:dkim-signature; bh=/jxxF1Xnuep9wndIZiTwMjVdFqeSIJbssDjfIiip2BM=; fh=teTP+krP51f93KGYj9Xunu/14zJf8A/KWM4/LtgJLNc=; b=RPJTaOJTwmPW716kjWjo0jxro2ebFW8PsRXlOkHQ3PektRbSfCOuzUTqTmAU6j5Iy9 O3SKHoWq6gtH/uX7dTP95o1KJUetYcDGN5F9BN/z/I8QA4pBaWJOBBlbLS2yTG09X7sg TwhnbRX/kABWF2E7nBpsy07DB048wYEdmIWdKrcySuzcZULevLI8yDxe62V0QVbpOSFK WqMnnXSbVpnWQRS/FeHHoT9Z/4oPLzT16sL5Eqa65RZ65X5oQMOQCb/cpG0fwwxYcwrH p0z1CeUUfd/khtulcV1wrGtMg7p34UZmeGUeK0rW6WdYtAhNfNRVdxNyHWdHpH9kEEMw oTAg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=nKQoFg++; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from fry.vger.email (fry.vger.email. [2620:137:e000::3:8]) by mx.google.com with ESMTPS id a14-20020a63e84e000000b00577461b010fsi9474423pgk.685.2023.09.19.01.02.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 19 Sep 2023 01:02:35 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) client-ip=2620:137:e000::3:8; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=nKQoFg++; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by fry.vger.email (Postfix) with ESMTP id E39E480942F6; Mon, 18 Sep 2023 23:00:21 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at fry.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231362AbjISGAR (ORCPT + 99 others); Tue, 19 Sep 2023 02:00:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53708 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229960AbjISGAQ (ORCPT ); Tue, 19 Sep 2023 02:00:16 -0400 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CEC43102; Mon, 18 Sep 2023 23:00:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695103210; x=1726639210; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=+0+BOZGajX3/f4YT5PIAXOxOK+0h04eUBpzJ0BA8lWM=; b=nKQoFg++BpnfT7dE5z69j2XkwoUu1m+z+r52QFmdfHVy4TsH9D7sNzoN HV2uPmaLGzZiJMPpGOX9gxwwF0gdnK9X/rNsJypT5kIjDQJMLW8pcf/+M h02wZxJvTr+wNM5u10iHQBP30aMxDGg8Gwxi0skRYTcb3FmnbNInwK+3j k2wlPHREfWxPjoHTASN1CxvTkmykdfozdYcPy1aKBZEHq9drWLw3GxNr6 yO+0Q+hYt8dfoYgsoP/u92zpMr0U0tipTKcUHA2ibERnTdAE74P+yBLSY PTtJqekAraiN+EXDDut34XWyj1mK1Dt0X/zgqI0VKeD36Jp0A1yQA+emI Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10837"; a="443942603" X-IronPort-AV: E=Sophos;i="6.02,158,1688454000"; d="scan'208";a="443942603" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Sep 2023 22:59:19 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10837"; a="836305448" X-IronPort-AV: E=Sophos;i="6.02,158,1688454000"; d="scan'208";a="836305448" Received: from ahunter6-mobl1.ger.corp.intel.com (HELO [10.0.2.15]) ([10.252.49.132]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Sep 2023 22:59:14 -0700 Message-ID: <8086150e-3c6d-4001-b3cc-bb12ebe15928@intel.com> Date: Tue, 19 Sep 2023 08:59:10 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v1] perf evlist: Avoid frequency mode for the dummy event To: Ian Rogers Cc: Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Kan Liang , linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, Yang Jihong , Stephane Eranian References: <20230916040915.1075620-1-irogers@google.com> <08995efc-d5af-6ba6-831b-1a483daabef4@intel.com> Content-Language: en-US From: Adrian Hunter Organization: Intel Finland Oy, Registered Address: PL 281, 00181 Helsinki, Business Identity Code: 0357606 - 4, Domiciled in Helsinki In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on fry.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (fry.vger.email [0.0.0.0]); Mon, 18 Sep 2023 23:00:22 -0700 (PDT) On 19/09/23 00:48, Ian Rogers wrote: > On Mon, Sep 18, 2023 at 1:14 AM Adrian Hunter wrote: >> >> On 16/09/23 07:09, Ian Rogers wrote: >>> Dummy events are created with an attribute where the period and freq >>> are zero. evsel__config will then see the uninitialized values and >>> initialize them in evsel__default_freq_period. As fequency mode is >>> used by default the dummy event would be set to use frequency >>> mode. However, this has no effect on the dummy event but does cause >>> unnecessary timers/interrupts. Avoid this overhead by setting the >>> period to 1 for dummy events. >>> >>> evlist__add_aux_dummy calls evlist__add_dummy then sets freq=0 and >>> period=1. This isn't necessary after this change and so the setting is >>> removed. >>> >>> From Stephane: >>> >>> The dummy event is not counting anything. It is used to collect mmap >>> records and avoid a race condition during the synthesize mmap phase of >>> perf record. As such, it should not cause any overhead during active >>> profiling. Yet, it did. Because of a bug the dummy event was >>> programmed as a sampling event in frequency mode. Events in that mode >>> incur more kernel overheads because on timer tick, the kernel has to >>> look at the number of samples for each event and potentially adjust >>> the sampling period to achieve the desired frequency. The dummy event >>> was therefore adding a frequency event to task and ctx contexts we may >>> otherwise not have any, e.g., perf record -a -e >>> cpu/event=0x3c,period=10000000/. On each timer tick the >>> perf_adjust_freq_unthr_context() is invoked and if ctx->nr_freq is >>> non-zero, then the kernel will loop over ALL the events of the context >>> looking for frequency mode ones. In doing, so it locks the context, >>> and enable/disable the PMU of each hw event. If all the events of the >>> context are in period mode, the kernel will have to traverse the list for >>> nothing incurring overhead. The overhead is multiplied by a very large >>> factor when this happens in a guest kernel. There is no need for the >>> dummy event to be in frequency mode, it does not count anything and >>> therefore should not cause extra overhead for no reason. >>> >>> Fixes: 5bae0250237f ("perf evlist: Introduce perf_evlist__new_dummy constructor") >>> Reported-by: Stephane Eranian >>> Signed-off-by: Ian Rogers >>> --- >>> tools/perf/util/evlist.c | 5 +++-- >>> 1 file changed, 3 insertions(+), 2 deletions(-) >>> >>> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c >>> index 25c3ebe2c2f5..e36da58522ef 100644 >>> --- a/tools/perf/util/evlist.c >>> +++ b/tools/perf/util/evlist.c >>> @@ -251,6 +251,9 @@ static struct evsel *evlist__dummy_event(struct evlist *evlist) >>> .type = PERF_TYPE_SOFTWARE, >>> .config = PERF_COUNT_SW_DUMMY, >>> .size = sizeof(attr), /* to capture ABI version */ >>> + /* Avoid frequency mode for dummy events to avoid associated timers. */ >>> + .freq = 0, >>> + .sample_period = 1, >>> }; >>> >>> return evsel__new_idx(&attr, evlist->core.nr_entries); >>> @@ -277,8 +280,6 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide) >>> evsel->core.attr.exclude_kernel = 1; >>> evsel->core.attr.exclude_guest = 1; >>> evsel->core.attr.exclude_hv = 1; >>> - evsel->core.attr.freq = 0; >>> - evsel->core.attr.sample_period = 1; >>> evsel->core.system_wide = system_wide; >>> evsel->no_aux_samples = true; >>> evsel->name = strdup("dummy:u"); >> >> Note that evsel__config() will put it back to freq if -F is used. > > Right, I was looking for a minimal fix in part for the sake of back > porting. For the -F we could do: > > ``` > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c > index d5363d23f5d3..806185a39e17 100644 > --- a/tools/perf/util/evsel.c > +++ b/tools/perf/util/evsel.c > @@ -1083,11 +1083,15 @@ void __weak arch__post_evsel_config(struct > evsel *evsel __maybe_unused, > static void evsel__set_default_freq_period(struct record_opts *opts, > struct perf_event_attr *attr) > { > - if (opts->freq) { > + bool is_dummy = attr->type == PERF_TYPE_SOFTWARE && > + attr->config == PERF_COUNT_SW_DUMMY; > + > + if (opts->freq && !is_dummy) { > attr->freq = 1; > attr->sample_freq = opts->freq; > } else { > - attr->sample_period = opts->default_interval; > + attr->freq = 0; > + attr->sample_period = is_dummy ? 1 : opts->default_interval; > } > } > ``` > > But this felt like it could potentially have other side-effects. Perhaps leave it alone, if the period has already been defined: diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index d5363d23f5d3..ad3e12f5ec88 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -1166,7 +1166,8 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts, if ((evsel->is_libpfm_event && !attr->sample_period) || (!evsel->is_libpfm_event && (!attr->sample_period || opts->user_freq != UINT_MAX || - opts->user_interval != ULLONG_MAX))) + opts->user_interval != ULLONG_MAX) && + !(is_dummy && attr->sample_period))) evsel__set_default_freq_period(opts, attr); /* > Software events like page faults have a period, but the > dummy/side-band events appear never to check the period - but I could > be wrong. We can add the patch above but perhaps without a fixes tag. > > Thanks, > Ian > >> Nevertheless: >> >> Acked-by: Adrian Hunter >>