Date: Tue, 26 Sep 2023 12:48:41 -0400
Subject: Re: [RFC PATCH 00/25] Perf stat metric grouping with hardware information
From: "Liang, Kan"
To: weilin.wang@intel.com, Ian Rogers, Peter Zijlstra, Ingo Molnar,
 Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
 Adrian Hunter
Cc: linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
 Perry Taylor, Samantha Alt, Caleb Biggers, Mark Rutland
References: <20230925061824.3818631-1-weilin.wang@intel.com>

On 2023-09-26 10:43 a.m., Liang, Kan wrote:
>
>
> On 2023-09-25 2:17 a.m., weilin.wang@intel.com wrote:
>> From: Weilin Wang
>>
>> Perf stat metric grouping generates event groups that are provided to the
>> kernel for data collection using the hardware counters. Sometimes the
>> grouping fails and the kernel has to retry the groups because the
>> generated groups do not fit into the hardware counters correctly. In
>> other cases the groups are collected correctly, but they leave some
>> hardware counters unused.
>>
>> To improve these inefficiencies, we would like to propose a
>> hardware-aware grouping method that does metric/event grouping based on
>> event counter restriction rules and the availability of hardware counters
>> in the system. This method is generic as long as all the restriction
>> rules can be provided in the pmu-event JSON files.
>
> This method assumes that it's the only user (except the NMI watchdog) and
> all the HW resources are available. Right?

It's better to give more details about the algorithm of the method: how to
decide when to create a new group, how to decide which group an event will
be added into, etc.

Thanks,
Kan
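For illustration, below is a minimal sketch of one greedy policy of the kind
these questions are about. All types, names, and the policy itself are
hypothetical and not taken from the patch set; each event carries a bitmap of
the counters it may be scheduled on (the sort of restriction the counter.json
files encode), and each open group tracks which counters are still free.

/*
 * Hypothetical greedy grouping sketch; not the patch set's actual policy.
 */
#include <stdbool.h>
#include <stdint.h>

struct event_spec {
	const char *name;
	uint64_t counter_mask;	/* counters this event may use */
};

struct group {
	uint64_t free_mask;	/* counters still unused in this group */
};

/* An event fits if at least one free counter is also allowed for it. */
static bool event_fits(const struct group *g, const struct event_spec *e)
{
	return (g->free_mask & e->counter_mask) != 0;
}

/* Claim the lowest-numbered allowed free counter for the event. */
static void claim_counter(struct group *g, const struct event_spec *e)
{
	uint64_t usable = g->free_mask & e->counter_mask;

	g->free_mask &= ~(usable & -usable);	/* clear lowest set bit */
}

/*
 * Place the event into the first existing group with a compatible free
 * counter; only when no group fits, open a new group.  Returns the group
 * index, or -1 when no more groups may be opened.  Assumes the event fits
 * into an empty group at all.
 */
static int assign_event(struct group *groups, int *ngroups, int max_groups,
			uint64_t all_counters_mask, const struct event_spec *e)
{
	for (int i = 0; i < *ngroups; i++) {
		if (event_fits(&groups[i], e)) {
			claim_counter(&groups[i], e);
			return i;
		}
	}
	if (*ngroups >= max_groups)
		return -1;
	groups[*ngroups].free_mask = all_counters_mask;
	claim_counter(&groups[*ngroups], e);
	return (*ngroups)++;
}

Under such a policy a new group is opened only when no existing group can
take the event, which favors fewer groups (less multiplexing) over keeping
each metric's events together; question 3) below is essentially about that
trade-off.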

>
>>
>> This patch set includes code that does hardware-aware grouping and
>> updated pmu-event JSON files for four platforms (SapphireRapids,
>> Icelakex, Cascadelakex, and Tigerlake) for your testing and
>> experimenting. We've successfully tested these patches on three platforms
>> (SapphireRapids, Icelakex, and Cascadelakex) with topdown metrics from
>> TopdownL1 to TopdownL6.
>>
>> There are some optimization opportunities that we might implement in the
>> future:
>> 1) Better NMI handling: when the NMI watchdog is enabled, we reduce the
>> default_core total counter size by one (a sketch of this adjustment
>> follows the questions below). This could be improved to better utilize
>> the counter.
>> 2) Fill important events into unused counters for better counter
>> utilization: there might be some unused counters scattered across the
>> groups. We could consider adding important events into these slots if
>> necessary. This could help increase the multiplexing percentage and
>> improve accuracy if the event is critical.
>>
>> Remaining questions for discussion:
>> 3) Where to start grouping from? The current implementation starts
>> grouping by combining all the events into a single list. This step
>> deduplicates events, but it does not maintain the relationship of events
>> according to the metrics, i.e. events required by one metric may not be
>> collected at the same time. Another type of starting point would be
>> grouping each individual metric and then trying to merge the groups.
>
> Maybe you can add a new flag to tag the metrics which should/better be
> scheduled together, e.g., IPC.
>
> Thanks,
> Kan

>> 4) Any comments, suggestions, new ideas?
>> 5) If you are interested in testing the patch out and the pmu-event JSON
>> files of your testing platform are not provided here, please let me know
>> so that I can provide you the files.
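To make optimization point 1) above concrete: when the NMI watchdog is
active it pins one generic core counter, so the grouping pass sees one fewer
slot per group. A minimal sketch of that adjustment follows; the function
names are hypothetical, and only the /proc/sys/kernel/nmi_watchdog sysctl is
real.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: check whether the NMI watchdog is enabled. */
static bool nmi_watchdog_enabled(void)
{
	FILE *f = fopen("/proc/sys/kernel/nmi_watchdog", "r");
	int val = 0;

	if (!f)
		return false;
	if (fscanf(f, "%d", &val) != 1)
		val = 0;
	fclose(f);
	return val != 0;
}

/*
 * The watchdog occupies one generic counter, so expose one less to the
 * grouping algorithm, mirroring the cover letter's "default_core total
 * counter size" minus one.
 */
static int usable_core_counters(int total_core_counters)
{
	if (nmi_watchdog_enabled())
		return total_core_counters - 1;
	return total_core_counters;
}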
>>
>>
>> Weilin Wang (25):
>>   perf stat: Add hardware-grouping cmd option to perf stat
>>   perf stat: Add basic functions for the hardware-grouping stat cmd option
>>   perf pmu-events: Add functions in jevent.py
>>   perf pmu-events: Add counter info into JSON files for SapphireRapids
>>   perf pmu-events: Add event counter data for Cascadelakex
>>   perf pmu-events: Add event counter data for Icelakex
>>   perf stat: Add helper functions for hardware-grouping method
>>   perf stat: Add functions to get counter info
>>   perf stat: Add helper functions for hardware-grouping method
>>   perf stat: Add helper functions to hardware-grouping method
>>   perf stat: Add utility functions to hardware-grouping method
>>   perf stat: Add more functions for hardware-grouping method
>>   perf stat: Add functions to hardware-grouping method
>>   perf stat: Add build string function and topdown events handling in hardware-grouping
>>   perf stat: Add function to combine metrics for hardware-grouping
>>   perf stat: Update keyword core to default_core to adjust to the changes for events with no unit
>>   perf stat: Handle taken alone in hardware-grouping
>>   perf stat: Handle NMI in hardware-grouping
>>   perf stat: Handle grouping method fall back in hardware-grouping
>>   perf stat: Code refactoring in hardware-grouping
>>   perf stat: Add tool events support in hardware-grouping
>>   perf stat: Add TSC support in hardware-grouping
>>   perf stat: Fix a return error issue in hardware-grouping
>>   perf stat: Add check to ensure correctness in platform that does not support hardware-grouping
>>   perf pmu-events: Add event counter data for Tigerlake
>>
>>  tools/lib/bitmap.c | 20 +
>>  tools/perf/builtin-stat.c | 7 +
>>  .../arch/x86/cascadelakex/cache.json | 1237 ++++++++++++
>>  .../arch/x86/cascadelakex/counter.json | 17 +
>>  .../arch/x86/cascadelakex/floating-point.json | 16 +
>>  .../arch/x86/cascadelakex/frontend.json | 68 +
>>  .../arch/x86/cascadelakex/memory.json | 751 ++++++++
>>  .../arch/x86/cascadelakex/other.json | 168 ++
>>  .../arch/x86/cascadelakex/pipeline.json | 102 +
>>  .../arch/x86/cascadelakex/uncore-cache.json | 1138 +++++++++++
>>  .../x86/cascadelakex/uncore-interconnect.json | 1272 +++++++++++++
>>  .../arch/x86/cascadelakex/uncore-io.json | 394 ++++
>>  .../arch/x86/cascadelakex/uncore-memory.json | 509 +++++
>>  .../arch/x86/cascadelakex/uncore-power.json | 25 +
>>  .../arch/x86/cascadelakex/virtual-memory.json | 28 +
>>  .../pmu-events/arch/x86/icelakex/cache.json | 98 +
>>  .../pmu-events/arch/x86/icelakex/counter.json | 17 +
>>  .../arch/x86/icelakex/floating-point.json | 13 +
>>  .../arch/x86/icelakex/frontend.json | 55 +
>>  .../pmu-events/arch/x86/icelakex/memory.json | 53 +
>>  .../pmu-events/arch/x86/icelakex/other.json | 52 +
>>  .../arch/x86/icelakex/pipeline.json | 92 +
>>  .../arch/x86/icelakex/uncore-cache.json | 965 ++++++++++
>>  .../x86/icelakex/uncore-interconnect.json | 1667 +++++++++++++++++
>>  .../arch/x86/icelakex/uncore-io.json | 966 ++++++++++
>>  .../arch/x86/icelakex/uncore-memory.json | 186 ++
>>  .../arch/x86/icelakex/uncore-power.json | 26 +
>>  .../arch/x86/icelakex/virtual-memory.json | 22 +
>>  .../arch/x86/sapphirerapids/cache.json | 104 +
>>  .../arch/x86/sapphirerapids/counter.json | 17 +
>>  .../x86/sapphirerapids/floating-point.json | 25 +
>>  .../arch/x86/sapphirerapids/frontend.json | 98 +-
>>  .../arch/x86/sapphirerapids/memory.json | 44 +
>>  .../arch/x86/sapphirerapids/other.json | 40 +
>>  .../arch/x86/sapphirerapids/pipeline.json | 118 ++
>>  .../arch/x86/sapphirerapids/uncore-cache.json | 534 +++++-
>>  .../arch/x86/sapphirerapids/uncore-cxl.json | 56 +
>>  .../sapphirerapids/uncore-interconnect.json | 476 +++++
>>  .../arch/x86/sapphirerapids/uncore-io.json | 373 ++++
>>  .../x86/sapphirerapids/uncore-memory.json | 391 ++++
>>  .../arch/x86/sapphirerapids/uncore-power.json | 24 +
>>  .../x86/sapphirerapids/virtual-memory.json | 20 +
>>  .../pmu-events/arch/x86/tigerlake/cache.json | 65 +
>>  .../arch/x86/tigerlake/counter.json | 7 +
>>  .../arch/x86/tigerlake/floating-point.json | 13 +
>>  .../arch/x86/tigerlake/frontend.json | 56 +
>>  .../pmu-events/arch/x86/tigerlake/memory.json | 31 +
>>  .../pmu-events/arch/x86/tigerlake/other.json | 4 +
>>  .../arch/x86/tigerlake/pipeline.json | 96 +
>>  .../x86/tigerlake/uncore-interconnect.json | 11 +
>>  .../arch/x86/tigerlake/uncore-memory.json | 6 +
>>  .../arch/x86/tigerlake/uncore-other.json | 1 +
>>  .../arch/x86/tigerlake/virtual-memory.json | 20 +
>>  tools/perf/pmu-events/jevents.py | 179 +-
>>  tools/perf/pmu-events/pmu-events.h | 26 +-
>>  tools/perf/util/metricgroup.c | 927 +++++++++
>>  tools/perf/util/metricgroup.h | 82 +
>>  tools/perf/util/pmu.c | 5 +
>>  tools/perf/util/pmu.h | 1 +
>>  tools/perf/util/stat.h | 1 +
>>  60 files changed, 13790 insertions(+), 25 deletions(-)
>>  create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/counter.json
>>  create mode 100644 tools/perf/pmu-events/arch/x86/icelakex/counter.json
>>  create mode 100644 tools/perf/pmu-events/arch/x86/sapphirerapids/counter.json
>>  create mode 100644 tools/perf/pmu-events/arch/x86/tigerlake/counter.json
>>
>> --
>> 2.39.3
>>