From: Hao Luo
Date: Tue, 25 Jan 2022 15:54:26 -0800
Subject: Re: [Question] How to reliably get BuildIDs from bpf prog
To: Song Liu
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Song Liu,
    Yonghong Song, Martin KaFai Lau, KP Singh, bpf, open list, Jiri Olsa,
    Blake Jones, Alexey Alexandrov, Namhyung Kim, Ian Rogers,
    "pasha.tatashin@soleen.com"
X-Mailing-List: linux-kernel@vger.kernel.org

Thanks Song for your suggestion.

On Mon, Jan 24, 2022 at 11:08 PM Song Liu wrote:
>
> On Mon, Jan 24, 2022 at 2:43 PM Hao Luo wrote:
> >
> > Dear BPF experts,
> >
> > I'm working on collecting some kernel performance data using BPF
> > tracing prog. Our performance profiling team wants to associate the
> > data with user stack information. One of the requirements is to
> > reliably get BuildIDs from bpf_get_stackid() and other similar helpers
> > [1].
> >
> > As part of an early investigation, we found that there are a couple
> > issues that make bpf_get_stackid() much less reliable than we'd like
> > for our use:
> >
> > 1. The first page of many binaries (which contains the ELF headers and
> > thus the BuildID that we need) is often not in memory.
> > The failure of find_get_page() (called from build_id_parse()) is
> > higher than we would want.
>
> Our top use case of bpf_get_stack() is called from NMI, so there isn't
> much we can do. Maybe it is possible to improve it by changing the
> layout of the binary and the libraries? Specifically, if the text is
> also in the first page, it is likely to stay in memory?
>

We are seeing 30-40% of stack frames fail to get build IDs because of
this, so this is a place where we could improve build ID reliability.

A few proposals came up when we found this issue. One of them is to have
userspace mlock the first page. This would be the easiest fix, if it
works. Another proposal, from Ian Rogers (cc'ed), is to embed the build
ID in the vma. This is an idea similar to [1], but it's unclear (at
least to me) where to store the string.

I'm wondering whether introducing a sleepable version of bpf_get_stack()
would help. When a page is not present, a sleepable bpf_get_stack()
could bring the page in.

[1] https://lwn.net/Articles/867818/

> > 2. When anonymous huge pages are used to hold some regions of process
> > text, build_id_parse() also fails to get a BuildID because
> > vma->vm_file is NULL.
>
> How did the text get in anonymous memory? I guess it is NOT from JIT?
> We had a hack to use transparent huge page for application text. The
> hack looks like:
>
> "At run time, the application creates an 8MB temporary buffer and the
> hot section of the executable memory is copied to it. The 8MB region in
> the executable memory is then converted to a huge page (by way of an
> mmap() to anonymous pages and an madvise() to create a huge page), the
> data is copied back to it, and it is made executable again using
> mprotect()."
>
> If your case is the same (or similar), it can probably be fixed with
> CONFIG_READ_ONLY_THP_FOR_FS, and modified user space.
>

In our use cases, we have text mapped to huge pages that are not backed
by files. vma->vm_file could be NULL or point to some fake file, which
makes it hard to get build IDs for this code text.

> Thanks,
> Song
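
For reference, the "userspace mlock the first page" proposal above could be sketched roughly as below. This is only an illustration under several assumptions, not a tested fix: it relies on the linker-provided __ehdr_start symbol (GNU ld and lld define it when the ELF header is part of a loaded segment), it assumes the build ID note sits in the same page as the ELF header, and it covers only the main executable, not shared libraries. The helper names elf_first_page() and pin_elf_first_page() are made up for this sketch. Whether a locked page actually makes find_get_page() succeed in build_id_parse() is exactly the open question in the thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Linker-provided symbol marking the in-memory ELF header of the main
 * executable. Assumption: the toolchain defines it (GNU ld and lld do).
 */
extern const unsigned char __ehdr_start[]
	__attribute__((visibility("hidden")));

/* Hypothetical helper: page-aligned address of the page holding the ELF header. */
static const void *elf_first_page(void)
{
	uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);

	return (const void *)((uintptr_t)__ehdr_start & ~(page - 1));
}

/*
 * Hypothetical helper: ask the kernel to keep that page resident.
 * Returns 0 on success, -1 with errno set (for example when
 * RLIMIT_MEMLOCK is exceeded).
 */
static int pin_elf_first_page(void)
{
	return mlock(elf_first_page(), (size_t)sysconf(_SC_PAGESIZE));
}
```

A process would call pin_elf_first_page() once early in startup and treat a failure as non-fatal. Shared libraries would need the same treatment per loaded object, which this sketch does not attempt.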