Subject: Re: [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object
To: Andi Kleen, Arnaldo Carvalho de Melo
Cc: Jiri Olsa, Peter Zijlstra, Ingo Molnar, Mark Rutland, Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20210627131818.810-1-adrian.hunter@intel.com>
From: Adrian Hunter
Organization: Intel Finland Oy, Registered Address: PL 281, 00181 Helsinki, Business Identity Code: 0357606 - 4, Domiciled in Helsinki
Date: Mon, 28 Jun 2021 10:23:18 +0300
X-Mailing-List: linux-kernel@vger.kernel.org

On 27/06/21 7:13 pm, Andi Kleen wrote:
>
> On 6/27/2021 6:18 AM, Adrian Hunter wrote:
>> Hi
>>
>> In some cases, users want to filter very large amounts of data
>> (e.g. from AUX area tracing like Intel PT) looking for something
>> specific. While scripting such as Python can be used, Python is 10
>> to 20 times slower than C. So define a C API so that custom filters
>> can be written and loaded.
>
> While I appreciate this for complex cases, in my experience filtering
> is usually just a simple expression. It would be nice to also have a
> way to do this reasonably fast without having to write a custom C

I do not agree that writing C filters is a hassle, e.g. a minimal
do-nothing filter is only a few lines:

#include <perf/perf_dlfilter.h>

int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ctx)
{
	return 0;
}

(Actually, the filter program does not need to have any LOC at all, but
that is not much of an example.)

Additionally, a script to do the build is fairly trivial, e.g. I use this:

$ cat `which make-dlfilter.sh`
#!/bin/bash

set -ex

if test -z "${1}" ; then
	echo "Name required"
	exit 1
fi

# Accept the name with or without a .c or .so suffix
name="${1%.c}"
if test "${name}" = "${1}" ; then
	name="${1%.so}"
fi

gcc -c -I ~/include -fpic "${name}.c"
gcc -shared -o "${name}.so" "${name}.o"

> file. Is the 10x-20x overhead just the python interpreter, or is it
> related to perf?

AFAICT the Python C API used to interface to Python performs fairly
similarly to the Python interpreter.

> Maybe we could have some kind of python fast path
> just for filters?

I expect there are ways to make it more efficient, but I doubt it would
ever come close to C.
> Or maybe the alternative would be to have a
> frontend in perf that can automatically generate/compile such a C
> filter based on a simple expression, but I'm not sure if that would
> be much simpler.

If gcc is available, perf script could, in fact, build the .so on the
fly, since the compile time is very short.

Another point is that filters can be used for more than just filtering.
Here is an example which sums cycles per-cpu and prints them, together
with the difference from the previous print, at the beginning of each
line. I think this was something you were interested in doing?

#include <perf/perf_dlfilter.h>
#include <stdio.h>

#define MAX_CPU 4096

__u64 cycles[MAX_CPU];		/* running total of cycles per cpu */
__u64 cycles_rpt[MAX_CPU];	/* total at the time of the last print */

/* Accumulate cycles for every sample */
int filter_event_early(void *data, const struct perf_dlfilter_sample *sample, void *ctx)
{
	__s32 cpu = sample->cpu;

	if (cpu >= 0 && cpu < MAX_CPU)
		cycles[cpu] += sample->cyc_cnt;
	return 0;
}

/* Print the total and the delta since the last print, then keep the sample */
int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ctx)
{
	__s32 cpu = sample->cpu;

	if (cpu >= 0 && cpu < MAX_CPU) {
		printf("%10llu %10llu ", cycles[cpu], cycles[cpu] - cycles_rpt[cpu]);
		cycles_rpt[cpu] = cycles[cpu];
	} else {
		/* Unknown cpu: pad to the same width as the two columns */
		printf("%22s", "");
	}
	return 0;
}

const char *filter_description(const char **long_description)
{
	return "Print the number of cycles at the start of each line";
}
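
For the "simple expression" case raised above, a filter that keeps only
the samples from one cpu might look like the sketch below. It is only an
illustration under stated assumptions: it assumes that returning 0 keeps
a sample and returning 1 filters it out (this mail only shows filters
that return 0), wanted_cpu is a made-up setting, and the struct is a
cut-down stand-in with just the two fields used in this thread so that
the snippet compiles outside a perf tree. A real filter would include
the perf dlfilter header instead of defining the struct itself.

```c
#include <stdint.h>

/*
 * Stand-in for illustration only: just the fields referenced in this
 * thread. A real filter would use the struct from perf's dlfilter
 * header rather than this definition.
 */
struct perf_dlfilter_sample {
	int32_t  cpu;
	uint64_t cyc_cnt;
};

/* Hypothetical setting: the only cpu whose samples are kept */
static int32_t wanted_cpu = 2;

/*
 * Assumed convention: return 0 to keep the sample, 1 to filter it out.
 * The whole "expression" is the comparison in the return statement.
 */
int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ctx)
{
	(void)data;	/* unused in this sketch */
	(void)ctx;	/* unused in this sketch */

	return sample->cpu == wanted_cpu ? 0 : 1;
}
```

Swapping the comparison for any other condition on the sample fields
gives a different filter with the same amount of boilerplate, which is
the point being made about the effort involved.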