From: Namhyung Kim
To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML,
    linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian,
    Masami Hiramatsu, linux-toolchains@vger.kernel.org,
    linux-trace-devel@vger.kernel.org
Subject: [PATCH 03/17] perf annotate-data: Add find_data_type()
Date: Tue, 12 Dec 2023 16:13:09 -0800
Message-ID: <20231213001323.718046-4-namhyung@kernel.org>
X-Mailer: git-send-email 2.43.0.472.g3155946c3a-goog
In-Reply-To: <20231213001323.718046-1-namhyung@kernel.org>
References: <20231213001323.718046-1-namhyung@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

find_data_type() returns the data type of the memory access at the given
address (IP), described by a register and an offset.  It requires DWARF
debug info in the DSO and searches the list of variables and function
parameters in the scopes that contain the address.

In pseudo code, it basically does the following:

  find_data_type(dso, ip, reg, offset)
  {
      pc = map__rip_2objdump(ip);
      CU = dwarf_addrdie(dso->dwarf, pc);
      scopes = die_get_scopes(CU, pc);
      for_each_scope(S, scopes) {
          V = die_find_variable_by_reg(S, pc, reg);
          if (V && V.type == pointer_type) {
              T = die_get_real_type(V);
              if (offset < T.size)
                  return T;
          }
      }
      return NULL;
  }

Signed-off-by: Namhyung Kim
---
 tools/perf/util/Build           |   1 +
 tools/perf/util/annotate-data.c | 163 ++++++++++++++++++++++++++++++++
 tools/perf/util/annotate-data.h |  40 ++++++++
 3 files changed, 204 insertions(+)
 create mode 100644 tools/perf/util/annotate-data.c
 create mode 100644 tools/perf/util/annotate-data.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 73e3f194f949..5cf000302080 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -196,6 +196,7 @@ perf-$(CONFIG_DWARF) += probe-finder.o
 perf-$(CONFIG_DWARF) += dwarf-aux.o
 perf-$(CONFIG_DWARF) += dwarf-regs.o
 perf-$(CONFIG_DWARF) += debuginfo.o
+perf-$(CONFIG_DWARF) += annotate-data.o
 perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
 perf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind-local.o
 
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
new file mode 100644
index 000000000000..1ddec786721c
--- /dev/null
+++ b/tools/perf/util/annotate-data.c
@@ -0,0 +1,163 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Convert sample address to data type using DWARF debug info.
+ *
+ * Written by Namhyung Kim
+ */
+
+#include
+#include
+
+#include "annotate-data.h"
+#include "debuginfo.h"
+#include "debug.h"
+#include "dso.h"
+#include "map.h"
+#include "map_symbol.h"
+#include "strbuf.h"
+#include "symbol.h"
+
+static bool find_cu_die(struct debuginfo *di, u64 pc, Dwarf_Die *cu_die)
+{
+	Dwarf_Off off, next_off;
+	size_t header_size;
+
+	if (dwarf_addrdie(di->dbg, pc, cu_die) != NULL)
+		return true;
+
+	/*
+	 * Some kernels don't have full aranges and contain only a few aranges
+	 * entries.  Fall back to iterating over all CU entries in .debug_info
+	 * in case the address is missing.
+	 */
+	off = 0;
+	while (dwarf_nextcu(di->dbg, off, &next_off, &header_size,
+			    NULL, NULL, NULL) == 0) {
+		if (dwarf_offdie(di->dbg, off + header_size, cu_die) &&
+		    dwarf_haspc(cu_die, pc))
+			return true;
+
+		off = next_off;
+	}
+	return false;
+}
+
+/* The type info will be saved in @type_die */
+static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
+{
+	Dwarf_Word size;
+
+	/* Get the type of the variable */
+	if (die_get_real_type(var_die, type_die) == NULL) {
+		pr_debug("variable has no type\n");
+		return -1;
+	}
+
+	/*
+	 * It expects a pointer type for a memory access.
+	 * Convert to a real type it points to.
+	 */
+	if (dwarf_tag(type_die) != DW_TAG_pointer_type ||
+	    die_get_real_type(type_die, type_die) == NULL) {
+		pr_debug("no pointer or no type\n");
+		return -1;
+	}
+
+	/* Get the size of the actual type */
+	if (dwarf_aggregate_size(type_die, &size) < 0) {
+		pr_debug("type size is unknown\n");
+		return -1;
+	}
+
+	/* Minimal sanity check */
+	if ((unsigned)offset >= size) {
+		pr_debug("offset: %d is bigger than size: %lu\n", offset, size);
+		return -1;
+	}
+
+	return 0;
+}
+
+/* The result will be saved in @type_die */
+static int find_data_type_die(struct debuginfo *di, u64 pc,
+			      int reg, int offset, Dwarf_Die *type_die)
+{
+	Dwarf_Die cu_die, var_die;
+	Dwarf_Die *scopes = NULL;
+	int ret = -1;
+	int i, nr_scopes;
+
+	/* Get a compile_unit for this address */
+	if (!find_cu_die(di, pc, &cu_die)) {
+		pr_debug("cannot find CU for address %lx\n", pc);
+		return -1;
+	}
+
+	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
+	nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
+
+	/* Search from the inner-most scope to the outer */
+	for (i = nr_scopes - 1; i >= 0; i--) {
+		/* Look up variables/parameters in this scope */
+		if (!die_find_variable_by_reg(&scopes[i], pc, reg, &var_die))
+			continue;
+
+		/* Found a variable, see if it's correct */
+		ret = check_variable(&var_die, type_die, offset);
+		break;
+	}
+
+	free(scopes);
+	return ret;
+}
+
+/**
+ * find_data_type - Return a data type at the location
+ * @ms: map and symbol at the location
+ * @ip: instruction address of the memory access
+ * @reg: register that holds the base address
+ * @offset: offset from the base address
+ *
+ * This function searches the debug information of the binary to get the data
+ * type it accesses.  The exact location is expressed by (ip, reg, offset).
+ * It returns %NULL if not found.
+ */
+struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
+					    int reg, int offset)
+{
+	struct annotated_data_type *result = NULL;
+	struct dso *dso = map__dso(ms->map);
+	struct debuginfo *di;
+	Dwarf_Die type_die;
+	struct strbuf sb;
+	u64 pc;
+
+	di = debuginfo__new(dso->long_name);
+	if (di == NULL) {
+		pr_debug("cannot get the debug info\n");
+		return NULL;
+	}
+
+	/*
+	 * IP is an instruction address relative to the start of the map.  As
+	 * the map can be randomized/relocated, IP needs to be translated to a
+	 * PC, which is a file address for DWARF processing.
+	 */
+	pc = map__rip_2objdump(ms->map, ip);
+	if (find_data_type_die(di, pc, reg, offset, &type_die) < 0)
+		goto out;
+
+	result = zalloc(sizeof(*result));
+	if (result == NULL)
+		goto out;
+
+	strbuf_init(&sb, 32);
+	if (die_get_typename_from_type(&type_die, &sb) < 0)
+		strbuf_add(&sb, "(unknown type)", 14);
+
+	result->type_name = strbuf_detach(&sb, NULL);
+
+out:
+	debuginfo__delete(di);
+	return result;
+}
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
new file mode 100644
index 000000000000..633147f78ca5
--- /dev/null
+++ b/tools/perf/util/annotate-data.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _PERF_ANNOTATE_DATA_H
+#define _PERF_ANNOTATE_DATA_H
+
+#include
+#include
+#include
+
+struct map_symbol;
+
+/**
+ * struct annotated_data_type - Data type to profile
+ * @type_name: Name of the data type
+ * @type_size: Size of the data type
+ *
+ * This represents a data type accessed by samples in the profile data.
+ */
+struct annotated_data_type {
+	char *type_name;
+	int type_size;
+};
+
+#ifdef HAVE_DWARF_SUPPORT
+
+/* Returns data type at the location (ip, reg, offset) */
+struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
+					    int reg, int offset);
+
+#else /* HAVE_DWARF_SUPPORT */
+
+static inline struct annotated_data_type *
+find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused,
+	       int reg __maybe_unused, int offset __maybe_unused)
+{
+	return NULL;
+}
+
+#endif /* HAVE_DWARF_SUPPORT */
+
+#endif /* _PERF_ANNOTATE_DATA_H */
-- 
2.43.0.472.g3155946c3a-goog
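
For readers who want to try the lookup order without libdw, the scope
search in find_data_type_die() and the checks in check_variable() can be
sketched roughly as below.  This is only a minimal sketch on mock data,
not code from the patch: struct mock_type, struct mock_var, struct
mock_scope, both mock_*() functions and the example tables are
hypothetical stand-ins for the DWARF DIEs and the die_*() helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for DWARF DIEs; none of this is the libdw API. */
struct mock_type {
	const char *name;
	size_t size;                     /* size of the type in bytes */
	const struct mock_type *pointee; /* non-NULL only for pointer types */
};

struct mock_var {
	int reg;                         /* register that holds the variable */
	const struct mock_type *type;
};

struct mock_scope {
	const struct mock_var *vars;
	int nr_vars;
};

/* Return the first variable in @scope that lives in @reg, or NULL. */
const struct mock_var *mock_find_var_by_reg(const struct mock_scope *scope,
					    int reg)
{
	for (int i = 0; i < scope->nr_vars; i++) {
		if (scope->vars[i].reg == reg)
			return &scope->vars[i];
	}
	return NULL;
}

/*
 * scopes[0] is the outermost scope.  Search from the innermost scope outward
 * and return the pointed-to type, or NULL when the lookup or a check fails.
 */
const struct mock_type *mock_find_data_type(const struct mock_scope *scopes,
					    int nr_scopes, int reg, int offset)
{
	assert(scopes != NULL || nr_scopes <= 0);

	for (int i = nr_scopes - 1; i >= 0; i--) {
		const struct mock_var *var;

		var = mock_find_var_by_reg(&scopes[i], reg);
		if (var == NULL)
			continue;

		/* As in the patch, the first match decides the result. */
		if (var->type->pointee == NULL)
			return NULL;	/* not a pointer type */

		/* A negative offset converts to a huge value and is rejected. */
		if ((size_t)offset >= var->type->pointee->size)
			return NULL;

		return var->type->pointee;
	}
	return NULL;
}

/* Example data: an outer function scope and an inner block scope. */
const struct mock_type mock_long = { "long", 8, NULL };
const struct mock_type mock_task = { "struct task", 64, NULL };
const struct mock_type mock_task_ptr = { "struct task *", 8, &mock_task };
const struct mock_var mock_outer_vars[] = {
	{ 3, &mock_task_ptr },
	{ 5, &mock_long },
};
const struct mock_var mock_inner_vars[] = {
	{ 5, &mock_task_ptr },
};
const struct mock_scope mock_scopes[] = {
	{ mock_outer_vars, 2 },
	{ mock_inner_vars, 1 },
};
```

With the example tables, register 5 resolves to the pointer in the inner
scope and hides the non-pointer variable of the outer scope, while an
offset at or past the size of the pointed-to type (or a negative one)
yields NULL, matching the "minimal sanity check" in check_variable().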