From: Namhyung Kim
Date: Thu, 12 Oct 2023 09:19:04 -0700
Subject: Re: [RFC 00/48] perf tools: Introduce data type profiling (v1)
To: Ingo Molnar
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ian Rogers, Adrian Hunter, LKML, linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian, Masami Hiramatsu, linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org, Ben Woodard, Joe Mario, Kees Cook, David Blaikie, Xu Liu, Kan Liang, Ravi Bangoria
References: <20231012035111.676789-1-namhyung@kernel.org>

Hi Ingo,

On Wed, Oct 11, 2023 at 11:03 PM Ingo Molnar wrote:
>
>
> * Namhyung Kim wrote:
>
> > * How to use it
> >
> > To get precise memory access samples, users can use `perf mem record`
> > command to utilize those events supported by their architecture.  Intel
> > machines would work best as they have dedicated memory access events but
> > they would have a filter to ignore low latency loads like less than 30
> > cycles (use --ldlat option to change the default value).
> >
> >     # To get memory access samples in kernel for 1 second (on Intel)
> >     $ sudo perf mem record -a -K --ldlat=4 -- sleep 1
> >
> >     # Similar for the AMD (but it requires 6.3+ kernel for BPF filters)
> >     $ sudo perf mem record -a --filter 'mem_op == load, ip > 0x8000000000000000' -- sleep 1
>
> BTW., it would be nice for 'perf mem record' to just do the right thing on
> whatever machine it is running on.
>
> Also, why are BPF filters required - due to the IP filtering of mem-load
> events?

Right, because AMD uses IBS for precise events and it doesn't
have a filtering feature.

>
> Could we perhaps add an IP filter to perf events to get this built-in?
> Perhaps attr->exclude_user would achieve something similar?

Unfortunately IBS doesn't support privilege filters IIUC.  Maybe
we could add a general filtering logic in the NMI handler but I'm
afraid it can complicate the code and maybe slow it down a bit.
Probably it's ok to have only a simple privilege filter by IP range.

>
> > In perf report, it's just a matter of selecting new sort keys: 'type'
> > and 'typeoff'.  The 'type' shows name of the data type as a whole while
> > 'typeoff' shows name of the field in the data type.  I found it useful
> > to use it with --hierarchy option to group relevant entries in the same
> > level.
> >
> >     $ sudo perf report -s type,typeoff --hierarchy --stdio
> >     ...
> >     #
> >     #    Overhead  Data Type / Data Type Offset
> >     # ...........  ............................
> >     #
> >         23.95%     (stack operation)
> >            23.95%     (stack operation) +0 (no field)
> >         23.43%     (unknown)
> >            23.43%     (unknown) +0 (no field)
> >         10.30%     struct pcpu_hot
> >             4.80%     struct pcpu_hot +0 (current_task)
> >             3.53%     struct pcpu_hot +8 (preempt_count)
> >             1.88%     struct pcpu_hot +12 (cpu_number)
> >             0.07%     struct pcpu_hot +24 (top_of_stack)
> >             0.01%     struct pcpu_hot +40 (softirq_pending)
> >          4.25%     struct task_struct
> >             1.48%     struct task_struct +2036 (rcu_read_lock_nesting)
> >             0.53%     struct task_struct +2040 (rcu_read_unlock_special.b.blocked)
> >             0.49%     struct task_struct +2936 (cred)
> >             0.35%     struct task_struct +3144 (audit_context)
> >             0.19%     struct task_struct +46 (flags)
> >             0.17%     struct task_struct +972 (policy)
> >             0.15%     struct task_struct +32 (stack)
> >             0.15%     struct task_struct +8 (thread_info.syscall_work)
> >             0.10%     struct task_struct +976 (nr_cpus_allowed)
> >             0.09%     struct task_struct +2272 (mm)
> >     ...
>
> This looks really useful! :)

Thanks,
Namhyung
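
As an aside on the "simple privilege filter by IP range" idea discussed above: below is a minimal Python sketch, not taken from perf or the kernel, of what the AMD example filter 'mem_op == load, ip > 0x8000000000000000' selects. It assumes the comma in the filter expression acts as a logical AND and that kernel code lives in the upper half of the 64-bit address space, as on x86-64. The sample dictionaries and the names is_kernel_ip and filter_kernel_loads are hypothetical and exist only for illustration.

```python
# Threshold copied from the filter expression in the AMD example.
KERNEL_IP_THRESHOLD = 0x8000000000000000


def is_kernel_ip(ip: int) -> bool:
    """Return True if the 64-bit instruction pointer is strictly above the threshold."""
    # Mask to 64 bits so that negative or oversized Python ints behave like u64.
    return (ip & 0xFFFFFFFFFFFFFFFF) > KERNEL_IP_THRESHOLD


def filter_kernel_loads(samples):
    """Keep samples that are loads AND whose IP is in the kernel range.

    Each sample is a hypothetical dict with 'mem_op' and 'ip' keys; real perf
    samples are binary records, not dicts.
    """
    return [s for s in samples if s["mem_op"] == "load" and is_kernel_ip(s["ip"])]


if __name__ == "__main__":
    samples = [
        {"mem_op": "load", "ip": 0xFFFFFFFF81000000},   # kernel load: kept
        {"mem_op": "store", "ip": 0xFFFFFFFF81000000},  # kernel store: dropped
        {"mem_op": "load", "ip": 0x0000000000400000},   # user load: dropped
    ]
    for s in filter_kernel_loads(samples):
        print(f"{s['mem_op']} at {s['ip']:#x}")
```

The point of the sketch is only that a single comparison against the IP is enough to separate kernel from user samples on such a layout, which is why an IP-range check can stand in for a privilege filter when the hardware does not provide one.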