Date: Thu, 4 May 2023 16:07:29 -0300
From: Arnaldo Carvalho de Melo
To: Andrii Nakryiko, Linus Torvalds
Cc: Song Liu, Andrii Nakryiko, Ingo Molnar, Thomas Gleixner, Jiri Olsa,
    Namhyung Kim, Clark Williams, Kate Carcia, linux-kernel@vger.kernel.org,
    linux-perf-users@vger.kernel.org, Adrian Hunter, Changbin Du, Hao Luo,
    Ian Rogers, James Clark, Kan Liang, Roman Lozko, Stephane Eranian,
    Thomas Richter, Arnaldo Carvalho de Melo, bpf
Subject: Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4
References: <20230503211801.897735-1-acme@kernel.org>

On Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko wrote:
> On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo wrote:
> > Andrii, can you add some more information about the usage of vmlinux.h
> > instead of using kernel headers?
>
> I'll just say that vmlinux.h is not a hard requirement to build BPF
> programs, it's more a convenience allowing easy access to definitions
> of both UAPI and kernel-internal structures for tracing needs and
> marking them relocatable using BPF CO-RE machinery.
> Lots of real-world
> applications just check-in pregenerated vmlinux.h to avoid build-time
> dependency on up-to-date host kernel and such.
>
> If vmlinux.h generation and usage is causing issues, though, given
> that perf's BPF programs don't seem to be using many different kernel
> types, it might be a better option to just use UAPI headers for public
> kernel type definitions, and just define CO-RE-relocatable minimal
> definitions locally in perf's BPF code for the other types necessary.
> E.g., if perf needs only pid and tgid from task_struct, this would
> suffice:
>
> struct task_struct {
>     int pid;
>     int tgid;
> } __attribute__((preserve_access_index));

Yeah, that seems like a way better approach: no vmlinux involved, libbpf
CO-RE notices that task_struct changed from this two-integer version (of
course) and relocates the accesses to where the fields are in the running
kernel by using /sys/kernel/btf/vmlinux.

I looked and the creation of vmlinux.h was introduced in:

  commit 944138f048f7d7591ec7568c94b21de8df2724d4
  Author: Namhyung Kim
  Date:   Thu Jul 1 14:12:27 2021 -0700

      perf stat: Enable BPF counter with --for-each-cgroup

      Recently bperf was added to use BPF to count perf events for various
      purposes. This is an extension for the approach and targetting to
      cgroup usages.

      Unlike the other bperf, it doesn't share the events with other
      processes but it'd reduces unnecessary events (and the overhead of
      multiplexing) for each monitored cgroup within the perf session.

      When --for-each-cgroup is used with --bpf-counters, it will open
      cgroup-switches event per cpu internally and attach the new BPF
      program to read given perf_events and to aggregate the results for
      cgroups. It's only called when task is switched to a task in a
      different cgroup.

      Signed-off-by: Namhyung Kim
      Acked-by: Song Liu
      Cc: Andi Kleen
      Cc: Ian Rogers
      Cc: Jiri Olsa
      Cc: Peter Zijlstra
      Cc: Stephane Eranian
      Link: http://lore.kernel.org/lkml/20210701211227.1403788-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo

Which I think was the first BPF skel to access a kernel data structure,
yeah:

  tools/perf/util/bpf_skel/bperf_cgroup.bpf.c

For things like:

+static inline int get_cgroup_v1_idx(__u32 *cgrps, int size)
+{
+        struct task_struct *p = (void *)bpf_get_current_task();
+        struct cgroup *cgrp;
+        register int i = 0;
+        __u32 *elem;
+        int level;
+        int cnt;
+
+        cgrp = BPF_CORE_READ(p, cgroups, subsys[perf_event_cgrp_id], cgroup);
+        level = BPF_CORE_READ(cgrp, level);

So we can completely remove touching vmlinux from the perf building
process.

If we can get the revert of the patches making BPF skels build by
default for v6.4, then we would do this work, test it thoroughly and have
it available for v6.5.

Linus, would that be a way forward?

- Arnaldo

For reference, here is the definition of BPF_CORE_READ() from
tools/lib/bpf/bpf_core_read.h:

/*
 * BPF_CORE_READ() is used to simplify BPF CO-RE relocatable read, especially
 * when there are few pointer chasing steps.
 * E.g., what in non-BPF world (or in BPF w/ BCC) would be something like:
 *      int x = s->a.b.c->d.e->f->g;
 * can be succinctly achieved using BPF_CORE_READ as:
 *      int x = BPF_CORE_READ(s, a.b.c, d.e, f, g);
 *
 * BPF_CORE_READ will decompose above statement into 4 bpf_core_read (BPF
 * CO-RE relocatable bpf_probe_read_kernel() wrapper) calls, logically
 * equivalent to:
 * 1. const void *__t = s->a.b.c;
 * 2. __t = __t->d.e;
 * 3. __t = __t->f;
 * 4. return __t->g;
 *
 * Equivalence is logical, because there is a heavy type casting/preservation
 * involved, as well as all the reads are happening through
 * bpf_probe_read_kernel() calls using __builtin_preserve_access_index() to
 * emit CO-RE relocations.
 *
 * N.B. Only up to 9 "field accessors" are supported, which should be more
 * than enough for any practical purpose.
 */
#define BPF_CORE_READ(src, a, ...) ({                                       \
        ___type((src), a, ##__VA_ARGS__) __r;                               \
        BPF_CORE_READ_INTO(&__r, (src), a, ##__VA_ARGS__);                  \
        __r;                                                                \
})
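
To make the relocation idea discussed above a bit more concrete, here is a
toy userspace model, not libbpf code and not how libbpf is implemented.
It only illustrates the principle: the program is written against a
minimal local struct, and field offsets are looked up by name in a
description of the target layout before each read. All names in it
(local_task, target_task, target_btf, relocate_offset, read_int_field)
are made up for this sketch, and the target layout is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal local definition: only the fields the program cares about. */
struct local_task {
	int pid;
	int tgid;
};

/* Hypothetical stand-in for the running kernel's larger structure. */
struct target_task {
	long state;
	char comm[16];
	int pid;
	int tgid;
};

/* Toy stand-in for BTF: maps a field name to its offset in the target. */
struct field_info {
	const char *name;
	size_t offset;
};

static const struct field_info target_btf[] = {
	{ "pid",  offsetof(struct target_task, pid)  },
	{ "tgid", offsetof(struct target_task, tgid) },
};

/* Look a field up by name; return its target offset, or -1 if absent. */
static long relocate_offset(const char *field)
{
	for (size_t i = 0; i < sizeof(target_btf) / sizeof(target_btf[0]); i++)
		if (strcmp(target_btf[i].name, field) == 0)
			return (long)target_btf[i].offset;
	return -1;
}

/* Read an int field from a target object at its relocated offset. */
static int read_int_field(const void *obj, const char *field, int *out)
{
	long off = relocate_offset(field);

	if (off < 0)
		return -1;
	memcpy(out, (const char *)obj + off, sizeof(*out));
	return 0;
}
```

In this model, reading "pid" from a struct target_task returns the right
value even though pid sits at offset 0 in struct local_task and further
into the target struct. In the real thing the lookup is done once at load
time from the kernel's BTF rather than by string comparison on every read.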