From: Song Liu
Date: Tue, 25 Jan 2022 16:16:06 -0800
Subject: Re: [Question] How to reliably get BuildIDs from bpf prog
To: Hao Luo
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Song Liu,
    Yonghong Song, Martin KaFai Lau, KP Singh, bpf, open list, Jiri Olsa,
    Blake Jones, Alexey Alexandrov, Namhyung Kim, Ian Rogers,
    "pasha.tatashin@soleen.com"

On Tue, Jan 25, 2022 at 3:54 PM Hao Luo wrote:
>
> Thanks Song for your suggestion.
>
> On Mon, Jan 24, 2022 at 11:08 PM Song Liu wrote:
> >
> > On Mon, Jan 24, 2022 at 2:43 PM Hao Luo wrote:
> > >
> > > Dear BPF experts,
> > >
> > > I'm working on collecting some kernel performance data using BPF
> > > tracing prog. Our performance profiling team wants to associate the
> > > data with user stack information. One of the requirements is to
> > > reliably get BuildIDs from bpf_get_stackid() and other similar helpers
> > > [1].
> > >
> > > As part of an early investigation, we found that there are a couple
> > > issues that make bpf_get_stackid() much less reliable than we'd like
> > > for our use:
> > >
> > > 1. The first page of many binaries (which contains the ELF headers and
> > > thus the BuildID that we need) is often not in memory. The failure of
> > > find_get_page() (called from build_id_parse()) is higher than we would
> > > want.
> >
> > Our top use case of bpf_get_stack() is called from NMI, so there isn't
> > much we can do. Maybe it is possible to improve it by changing the
> > layout of the binary and the libraries? Specifically, if the text is
> > also in the first page, it is likely to stay in memory?
> >
>
> We are seeing 30-40% of stack frames not able to get build ids due to
> this. This is a place where we could improve the reliability of build
> id.
>
> There were a few proposals coming up when we found this issue. One of
> them is to have userspace mlock the first page. This would be the
> easiest fix, if it works. Another proposal from Ian Rogers (cc'ed) is
> to embed build id in vma. This is an idea similar to [1], but it's
> unclear (at least to me) where to store the string. I'm wondering if
> we can introduce a sleepable version of bpf_get_stack() if it helps.
> When a page is not present, sleepable bpf_get_stack() can bring in the
> page.

I guess it is possible to have different flavors of bpf_get_stack().
However, I am not sure whether the actual use case could use sleepable
BPF programs. Our user of bpf_get_stack() is a profiler. The BPF program
is triggered by a perf_event from NMI, where we really cannot sleep. If
we have a target use case that could sleep, a sleepable bpf_get_stack()
sounds reasonable to me.

>
> [1] https://lwn.net/Articles/867818/
>
> > > 2. When anonymous huge pages are used to hold some regions of process
> > > text, build_id_parse() also fails to get a BuildID because
> > > vma->vm_file is NULL.
> >
> > How did the text get in anonymous memory? I guess it is NOT from JIT?
> > We had a hack to use transparent huge page for application text. The
> > hack looks like:
> >
> > "At run time, the application creates an 8MB temporary buffer and the
> > hot section of the executable memory is copied to it. The 8MB region in
> > the executable memory is then converted to a huge page (by way of an
> > mmap() to anonymous pages and an madvise() to create a huge page), the
> > data is copied back to it, and it is made executable again using
> > mprotect()."
> >
> > If your case is the same (or similar), it can probably be fixed with
> > CONFIG_READ_ONLY_THP_FOR_FS, and modified user space.
> >
>
> In our use cases, we have text mapped to huge pages that are not
> backed by files. vma->vm_file could be null or points some fake file.
> This causes challenges for us on getting build id for these code text.

So, what is the ideal output in these cases? If there isn't a backing
file, we don't really have a good build-id for it, right?

Thanks,
Song
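
For reference, here is a rough userspace sketch of what a build-id lookup
has to read from the start of a binary: it walks ELF note records and
returns the descriptor of the first NT_GNU_BUILD_ID note. It is only an
illustration, not the kernel's build_id_parse(). The names find_build_id()
and make_note() are made up for this example, the sample notes are
synthetic, and it assumes little-endian notes with 4-byte padding.

```python
import struct

NT_GNU_BUILD_ID = 3  # ELF note type that carries the GNU build ID


def align4(n: int) -> int:
    """Round n up to a multiple of 4 (assumed note padding)."""
    return (n + 3) & ~3


def find_build_id(notes: bytes):
    """Return the build ID as a hex string, or None if no such note exists.

    `notes` is assumed to be the raw bytes of a little-endian note segment.
    """
    off = 0
    while off + 12 <= len(notes):
        # Each note starts with three 32-bit words: namesz, descsz, type.
        namesz, descsz, ntype = struct.unpack_from("<III", notes, off)
        off += 12
        name = notes[off:off + namesz]
        off += align4(namesz)
        desc = notes[off:off + descsz]
        off += align4(descsz)
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\0" and len(desc) == descsz:
            return desc.hex()
    return None


def make_note(name: bytes, ntype: int, desc: bytes) -> bytes:
    """Build one synthetic note record (hypothetical helper for the demo)."""
    def pad(b: bytes) -> bytes:
        return b + b"\0" * (align4(len(b)) - len(b))
    return struct.pack("<III", len(name), len(desc), ntype) + pad(name) + pad(desc)


# Synthetic input: an unrelated note followed by a 20-byte build ID note.
sample = make_note(b"GNU\0", 1, b"\0" * 16) + make_note(
    b"GNU\0", NT_GNU_BUILD_ID, bytes(range(20))
)
print(find_build_id(sample))
```

If the page holding these notes is not resident, or there is no backing
file to read them from, there is nothing for a lookup like this to walk,
which is the failure discussed in both cases above.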