From: Ian Rogers
Date: Mon, 25 Sep 2023 11:29:34 -0700
Subject: Re: [RFC PATCH 00/25] Perf stat metric grouping with hardware information
To: weilin.wang@intel.com
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Adrian Hunter, Kan Liang, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, Perry Taylor, Samantha Alt, Caleb Biggers, Mark Rutland
References: <20230925061824.3818631-1-weilin.wang@intel.com>
In-Reply-To: <20230925061824.3818631-1-weilin.wang@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Sep 24, 2023 at 11:19 PM wrote:
>
> From: Weilin Wang
>
> Perf stat metric grouping generates event groups that are provided to the
> kernel for data collection using the hardware counters. Sometimes, the
> grouping might fail and the kernel has to retry the groups because the
> generated groups do not fit in the hardware counters correctly. In some
> other cases, the groupings are collected correctly; however, they leave
> some hardware counters unused.
>
> To address these inefficiencies, we would like to propose a hardware-aware
> grouping method that does metric/event grouping based on event counter
> restriction rules and the availability of hardware counters in the system.
> This method is generic as long as all the restriction rules can be provided
> from the pmu-event JSON files.
>
> This patch set includes code that does hardware-aware grouping and updated
> pmu-event JSON files for four platforms (SapphireRapids, Icelakex,
> Cascadelakex, and Tigerlake) for your testing and experimenting. We've
> successfully tested these patches on three platforms (SapphireRapids,
> Icelakex, and Cascadelakex) with topdown metrics from TopdownL1 to
> TopdownL6.
>
> There are some optimization opportunities that we might implement in the
> future:
> 1) Better NMI handling: when NMI watchdog is enabled, we reduce the
> default_core total counter size by one. This could be improved to better
> utilize the counter.

Thanks Weilin!
I'm checking out the series. Hopefully the NMI watchdog perf event
can go away soon with the buddy scheme:
https://lore.kernel.org/lkml/20230527014153.2793931-1-dianders@chromium.org/
But better NMI handling would still be worthwhile for people without the
latest kernel.

Thanks,
Ian

> 2) Fill important events into unused counters for better counter
> utilization: there might be some unused counters scattered in the groups.
> We could consider adding important events to these slots if necessary.
> This could help increase the multiplexing percentage and improve accuracy
> if the event is critical.
>
> Remaining questions for discussion:
> 3) Where to start grouping from? The current implementation starts
> grouping by combining all the events into a single list. This step
> deduplicates events. But it does not maintain the relationship of events
> according to the metrics, i.e. events required by one metric may not be
> collected at the same time. Another type of starting point would be
> grouping each individual metric and then trying to merge the groups.
> 4) Any comments, suggestions, new ideas?
> 5) If you are interested in testing the patches and the pmu-event JSON
> files for your testing platform are not provided here, please let me know
> so that I can provide you the files.
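
To make the trade-off in question 3 concrete, here is a small Python sketch. It is not code from the series; the metric names, event names, and the counter count of 4 are all made up for illustration. It compares the two starting points: flattening every metric's events into one deduplicated list and chunking it, versus starting from one group per metric.

```python
def flatten_and_dedup(metrics):
    """Combine the events of all metrics into one list, dropping duplicates."""
    # dict.fromkeys keeps first-seen order while removing repeats.
    return list(dict.fromkeys(ev for events in metrics.values() for ev in events))


def chunk(events, num_counters):
    """Split a flat event list into groups of at most num_counters events."""
    return [events[i:i + num_counters] for i in range(0, len(events), num_counters)]


def metrics_kept_together(metrics, groups):
    """Return the metrics whose events all landed in one single group."""
    return [name for name, events in metrics.items()
            if any(set(events) <= set(group) for group in groups)]


# Hypothetical metrics and events; a PMU with 4 counters is assumed.
metrics = {
    "m1": ["A", "B", "C"],
    "m2": ["C", "D", "E"],
    "m3": ["E", "F"],
}

flat = flatten_and_dedup(metrics)
flat_groups = chunk(flat, 4)
per_metric_groups = [list(events) for events in metrics.values()]

print("flat groups:      ", flat_groups,
      "kept together:", metrics_kept_together(metrics, flat_groups))
print("per-metric groups:", per_metric_groups,
      "kept together:", metrics_kept_together(metrics, per_metric_groups))
```

In this toy case the flat list needs fewer groups and no duplicated events, but the events of m2 end up split across two groups, so they would not be counted at the same time. Starting per metric keeps every metric intact at the cost of scheduling C and E twice, which is where a later merge step would have to recover the savings.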
>
>
> Weilin Wang (25):
>   perf stat: Add hardware-grouping cmd option to perf stat
>   perf stat: Add basic functions for the hardware-grouping stat cmd
>     option
>   perf pmu-events: Add functions in jevent.py
>   perf pmu-events: Add counter info into JSON files for SapphireRapids
>   perf pmu-events: Add event counter data for Cascadelakex
>   perf pmu-events: Add event counter data for Icelakex
>   perf stat: Add helper functions for hardware-grouping method
>   perf stat: Add functions to get counter info
>   perf stat: Add helper functions for hardware-grouping method
>   perf stat: Add helper functions to hardware-grouping method
>   perf stat: Add utility functions to hardware-grouping method
>   perf stat: Add more functions for hardware-grouping method
>   perf stat: Add functions to hardware-grouping method
>   perf stat: Add build string function and topdown events handling in
>     hardware-grouping
>   perf stat: Add function to combine metrics for hardware-grouping
>   perf stat: Update keyword core to default_core to adjust to the
>     changes for events with no unit
>   perf stat: Handle taken alone in hardware-grouping
>   perf stat: Handle NMI in hardware-grouping
>   perf stat: Handle grouping method fall back in hardware-grouping
>   perf stat: Code refactoring in hardware-grouping
>   perf stat: Add tool events support in hardware-grouping
>   perf stat: Add TSC support in hardware-grouping
>   perf stat: Fix a return error issue in hardware-grouping
>   perf stat: Add check to ensure correctness in platform that does not
>     support hardware-grouping
>   perf pmu-events: Add event counter data for Tigerlake
>
>  tools/lib/bitmap.c                            |   20 +
>  tools/perf/builtin-stat.c                     |    7 +
>  .../arch/x86/cascadelakex/cache.json          | 1237 ++++++++++++
>  .../arch/x86/cascadelakex/counter.json        |   17 +
>  .../arch/x86/cascadelakex/floating-point.json |   16 +
>  .../arch/x86/cascadelakex/frontend.json       |   68 +
>  .../arch/x86/cascadelakex/memory.json         |  751 ++++++++
>  .../arch/x86/cascadelakex/other.json          |  168 ++
>  .../arch/x86/cascadelakex/pipeline.json       |  102 +
>  .../arch/x86/cascadelakex/uncore-cache.json   | 1138 +++++++++++
>  .../x86/cascadelakex/uncore-interconnect.json | 1272 +++++++++++++
>  .../arch/x86/cascadelakex/uncore-io.json      |  394 ++++
>  .../arch/x86/cascadelakex/uncore-memory.json  |  509 +++++
>  .../arch/x86/cascadelakex/uncore-power.json   |   25 +
>  .../arch/x86/cascadelakex/virtual-memory.json |   28 +
>  .../pmu-events/arch/x86/icelakex/cache.json   |   98 +
>  .../pmu-events/arch/x86/icelakex/counter.json |   17 +
>  .../arch/x86/icelakex/floating-point.json     |   13 +
>  .../arch/x86/icelakex/frontend.json           |   55 +
>  .../pmu-events/arch/x86/icelakex/memory.json  |   53 +
>  .../pmu-events/arch/x86/icelakex/other.json   |   52 +
>  .../arch/x86/icelakex/pipeline.json           |   92 +
>  .../arch/x86/icelakex/uncore-cache.json       |  965 ++++++++++
>  .../x86/icelakex/uncore-interconnect.json     | 1667 +++++++++++++++++
>  .../arch/x86/icelakex/uncore-io.json          |  966 ++++++++++
>  .../arch/x86/icelakex/uncore-memory.json      |  186 ++
>  .../arch/x86/icelakex/uncore-power.json       |   26 +
>  .../arch/x86/icelakex/virtual-memory.json     |   22 +
>  .../arch/x86/sapphirerapids/cache.json        |  104 +
>  .../arch/x86/sapphirerapids/counter.json      |   17 +
>  .../x86/sapphirerapids/floating-point.json    |   25 +
>  .../arch/x86/sapphirerapids/frontend.json     |   98 +-
>  .../arch/x86/sapphirerapids/memory.json       |   44 +
>  .../arch/x86/sapphirerapids/other.json        |   40 +
>  .../arch/x86/sapphirerapids/pipeline.json     |  118 ++
>  .../arch/x86/sapphirerapids/uncore-cache.json |  534 +++++-
>  .../arch/x86/sapphirerapids/uncore-cxl.json   |   56 +
>  .../sapphirerapids/uncore-interconnect.json   |  476 +++++
>  .../arch/x86/sapphirerapids/uncore-io.json    |  373 ++++
>  .../x86/sapphirerapids/uncore-memory.json     |  391 ++++
>  .../arch/x86/sapphirerapids/uncore-power.json |   24 +
>  .../x86/sapphirerapids/virtual-memory.json    |   20 +
>  .../pmu-events/arch/x86/tigerlake/cache.json  |   65 +
>  .../arch/x86/tigerlake/counter.json           |    7 +
>  .../arch/x86/tigerlake/floating-point.json    |   13 +
>  .../arch/x86/tigerlake/frontend.json          |   56 +
>  .../pmu-events/arch/x86/tigerlake/memory.json |   31 +
>  .../pmu-events/arch/x86/tigerlake/other.json  |    4 +
>  .../arch/x86/tigerlake/pipeline.json          |   96 +
>  .../x86/tigerlake/uncore-interconnect.json    |   11 +
>  .../arch/x86/tigerlake/uncore-memory.json     |    6 +
>  .../arch/x86/tigerlake/uncore-other.json      |    1 +
>  .../arch/x86/tigerlake/virtual-memory.json    |   20 +
>  tools/perf/pmu-events/jevents.py              |  179 +-
>  tools/perf/pmu-events/pmu-events.h            |   26 +-
>  tools/perf/util/metricgroup.c                 |  927 +++++++++
>  tools/perf/util/metricgroup.h                 |   82 +
>  tools/perf/util/pmu.c                         |    5 +
>  tools/perf/util/pmu.h                         |    1 +
>  tools/perf/util/stat.h                        |    1 +
>  60 files changed, 13790 insertions(+), 25 deletions(-)
>  create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/counter.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/icelakex/counter.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/sapphirerapids/counter.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/tigerlake/counter.json
>
> --
> 2.39.3
>
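
For readers who want a feel for what grouping "based on event counter restriction rules and the availability of hardware counters" can look like, the following Python sketch may help. It is only a greedy first-fit illustration under stated assumptions, not the algorithm in metricgroup.c: the event names and counter bitmasks are hypothetical (they are not taken from the counter.json files), and the NMI watchdog is modelled simply as one fewer usable counter, dropping the highest-numbered one.

```python
def first_free(allowed, group, usable):
    """Return the lowest allowed counter index not yet used in group, or None."""
    for ctr in range(usable):
        if (allowed >> ctr) & 1 and ctr not in group:
            return ctr
    return None


def place_events(event_masks, num_counters, nmi_watchdog=False):
    """Greedy first-fit placement of events into counter groups.

    event_masks maps an event name to a bitmask of the counters it may use
    (bit i set means counter i is allowed). Each returned group is a dict
    mapping counter index to event name.
    """
    # Assumption: the NMI watchdog costs exactly one counter (the last one).
    usable = num_counters - 1 if nmi_watchdog else num_counters
    usable_mask = (1 << usable) - 1

    # Place the most constrained events first so they are not crowded out.
    order = sorted(event_masks,
                   key=lambda ev: bin(event_masks[ev] & usable_mask).count("1"))

    groups = []
    for ev in order:
        allowed = event_masks[ev] & usable_mask
        if allowed == 0:
            raise ValueError(f"{ev} has no usable counter")
        for group in groups:
            ctr = first_free(allowed, group, usable)
            if ctr is not None:
                group[ctr] = ev
                break
        else:
            # No existing group has a free allowed counter: open a new group.
            groups.append({first_free(allowed, {}, usable): ev})
    return groups


# Hypothetical restrictions on a 4-counter PMU.
masks = {
    "EVT_A": 0b0001,  # counter 0 only
    "EVT_B": 0b0011,  # counter 0 or 1
    "EVT_C": 0b1111,  # any counter
    "EVT_D": 0b1111,  # any counter
    "EVT_E": 0b0001,  # counter 0 only
}

groups_default = place_events(masks, 4)
groups_nmi = place_events(masks, 4, nmi_watchdog=True)
print(groups_default)
print(groups_nmi)
```

EVT_A and EVT_E both require counter 0, so no grouping can put them together, and two groups are the minimum here. With the watchdog modelled as taking a counter, EVT_D no longer fits in the first group and moves to the second. A greedy pass like this can miss placements that a proper matching would find, so it should be read as a picture of the constraints rather than a recommended algorithm.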