From: Stephane Eranian
Date: Wed, 5 Oct 2022 15:50:41 -0700
Subject: Re: Invalid event (cycles:pp) in per-thread mode, enable system wide with '-a'.
References: <76CB17D0-5A66-4D49-A389-8F40EC830DC0@sladewatkins.net>
To: Nick Desaulniers
Cc: Ravi Bangoria, Slade Watkins, linux-perf-users, LKML, Ian Rogers, Namhyung Kim, Kees Cook, sandipan.das@amd.com, Bill Wendling, clang-built-linux, Yonghong Song
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 5, 2022 at 2:56 PM Nick Desaulniers wrote:
>
> + Stephane, Kees, Sandipan, Bill, ClangBuiltLinux mailing list, Yonghong
> https://www.spinics.net/lists/linux-perf-users/msg23103.html
> starts the thread, for context.
>
> On Thu, Sep 29, 2022 at 9:32 PM Ravi Bangoria wrote:
> >
> > On 30-Sep-22 9:56 AM, Namhyung Kim wrote:
> > > Hello,
> > >
> > > On Thu, Sep 29, 2022 at 8:49 PM Ian Rogers wrote:
> > >>
> > >> On Thu, Sep 29, 2022 at 3:10 PM Slade Watkins wrote:
> > >>>
> > >>> Hey Nick,
> > >>>
> > >>>> On Sep 29, 2022, at 5:54 PM, Nick Desaulniers wrote:
> > >>>>
> > >>>> I remember hearing rumblings about issues with zen 2, LBR, vs zen 3.
> > >>>> Is this a known issue, or am I holding it wrong?
> > >>>
> > >>> Hm… I also remember this. I have a Zen 2 based system that I can do testing on, so I will do so when I’m able.
> > >>>
> > >>> If I discover something of note, I’ll get back to you.
> > >>>
> > >>> Cheers,
> > >>> -srw
> > >>>
> > >>
> > >> LBR isn't yet supported for Zen but is coming:
> > >> https://lore.kernel.org/lkml/166155216401.401.5809694678609694438.tip-bot2@tip-bot2/
> > >> I'd recommend frame-pointers.
>
> Having to recompile is less than ideal for my workflow. I have added a note to
> https://github.com/ClangBuiltLinux/profiling/tree/main/perf#errors
> Please let me know how I might improve the documentation.
>
> > >>
> > >> +Ravi who may be able to say if there are any issues with the precise
> > >> sampling on AMD.
> > >
> > > Afaik cycles:pp will use IBS but it doesn't support per-task profiling
> > > since it has no task context. Ravi is working on it..
> >
> > Right.
> > https://lore.kernel.org/lkml/20220829113347.295-1-ravi.bangoria@amd.com
>
> Cool, thanks for working on this Ravi.
>
> I'm not sure yet whether I may replace the kernel on my corporate
> provided workstation, so I'm not sure yet I can help test that patch.
>
> Can you confirm that
> $ perf record -e cycles:pp --freq=128 --call-graph lbr --
>
> works with just that patch applied? Or is there more work required?
> What is the status of that patch?
>
> For context, we had difficulty upstreaming support for instrumentation
> based profile guided optimizations in the Linux kernel.
> https://lore.kernel.org/lkml/CAHk-=whqCT0BeqBQhW8D-YoLLgp_eFY=8Y=9ieREM5xx0ef08w@mail.gmail.com/
> We'd like to be able to use either instrumentation or sampling to
> optimize our builds. The major barrier to sample based approaches are
> architecture / micro architecture issues with sample based profile
> data collection, and bitrot of data processing utilities.
> https://github.com/google/autofdo/issues/144

On existing AMD Zen2 and Zen3, the following cmdline:

$ perf record -e cycles:pp --freq=128 --call-graph lbr --

does not work. I see two reasons:

1. cycles:pp is likely converted into an IBS op event in cycle mode.
   Current kernels do not support IBS in per-thread mode.
   This is purely a kernel limitation.
2. --call-graph lbr is not supported on AMD because these parts do not
   have LBR and therefore have no LBR callstack mode.

The best way to get what you want here today on AMD Zen2 and Zen3:

$ perf record -e cycles --freq=128 -g --

On AMD Zen3, there is a precursor to LBR called Branch Sampling (BRS).
You can use it to sample taken branches, but not for callstacks.
I mention the cmdline here for reference:

$ perf record -e cpu/branch-brs/ -c 1000037 -b --

Note that AMD Zen3 BRS is enough to get the autoFDO usage of an LBR
working, as per the cmdline above.

Hope this helps.
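
As a rough summary of the options above, here is a small shell sketch that
prints the perf invocation for a given target. The function name
suggest_perf_cmd and the labels zen2, zen3 and zen3-brs are made up for
illustration; only the perf command lines themselves come from this message.
The workload after "--" is left for the caller to supply.

```shell
#!/bin/sh
# Sketch only: map a (hypothetical) target label to the perf command
# line suggested in this message. Nothing here probes the hardware.
suggest_perf_cmd() {
    case "$1" in
        zen2|zen3)
            # Plain cycles event with -g callchains, because cycles:pp
            # (IBS) does not work per-thread on current kernels and
            # --call-graph lbr is not available on these parts.
            echo "perf record -e cycles --freq=128 -g --"
            ;;
        zen3-brs)
            # Branch Sampling: taken branches only, no callstacks.
            echo "perf record -e cpu/branch-brs/ -c 1000037 -b --"
            ;;
        *)
            echo "unknown target: $1" >&2
            return 1
            ;;
    esac
}
```

For example, `$(suggest_perf_cmd zen2) ./my_workload` would run the
frame-pointer variant on a placeholder binary named ./my_workload.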