From: Masami Hiramatsu
To: Jiri Olsa, Alexei Starovoitov
Cc: Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu, netdev@vger.kernel.org, bpf@vger.kernel.org, lkml, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend, KP Singh, Steven Rostedt, "Naveen N . Rao", Anil S Keshavamurthy, "David S . Miller"
Subject: [PATCH v7 02/10] fprobe: Add ftrace based probe APIs
Date: Mon, 31 Jan 2022 14:00:48 +0900
Message-Id: <164360524789.65877.4689863820905138928.stgit@devnote2>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <164360522462.65877.1891020292202285106.stgit@devnote2>
References: <164360522462.65877.1891020292202285106.stgit@devnote2>
User-Agent: StGit/0.19
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

fprobe is a wrapper API for the ftrace function tracer. Unlike kprobes,
an fprobe supports only function entry, but a single fprobe can probe
multiple functions. Usage is similar to kprobes: the user sets a
callback in fprobe::entry_handler and calls register_fprobe*() with the
functions to probe.

There are 3 registration interfaces:

 - register_fprobe() takes filtering patterns of the function names.
 - register_fprobe_ips() takes an array of ftrace-location addresses.
 - register_fprobe_syms() takes an array of function names.

A registered fprobe can be unregistered with unregister_fprobe().
e.g.

 struct fprobe fp = { .entry_handler = user_handler };
 const char *targets[] = { "func1", "func2", "func3"};
 ...

 ret = register_fprobe_syms(&fp, targets, ARRAY_SIZE(targets));

 ...

 unregister_fprobe(&fp);

Signed-off-by: Masami Hiramatsu
---
 Changes in v7:
  - Fix kerneldoc for the APIs.
 Changes in v6:
  - Remove syms, addrs, and nentry fields from struct fprobe.
  - Introduce 3 variants of registration functions.
  - Call ftrace_free_filter() at unregistration.
 Changes in v4:
  - Fix a memory leak when symbol lookup failed.
  - Use ftrace location address instead of symbol address.
  - Convert the given symbol address to ftrace location automatically.
  - Rename fprobe::ftrace to fprobe::ops.
  - Update the Kconfig description.
---
 include/linux/fprobe.h |   79 +++++++++++++++++
 kernel/trace/Kconfig   |   12 +++
 kernel/trace/Makefile  |    1
 kernel/trace/fprobe.c  |  218 ++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 310 insertions(+)
 create mode 100644 include/linux/fprobe.h
 create mode 100644 kernel/trace/fprobe.c

diff --git a/include/linux/fprobe.h b/include/linux/fprobe.h
new file mode 100644
index 000000000000..b920dc1b2969
--- /dev/null
+++ b/include/linux/fprobe.h
@@ -0,0 +1,79 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Simple ftrace probe wrapper */
+#ifndef _LINUX_FPROBE_H
+#define _LINUX_FPROBE_H
+
+#include
+#include
+
+/**
+ * struct fprobe - ftrace based probe.
+ * @ops: The ftrace_ops.
+ * @nmissed: The counter for missing events.
+ * @flags: The status flag.
+ * @entry_handler: The callback function for function entry.
+ */
+struct fprobe {
+	struct ftrace_ops ops;
+	unsigned long nmissed;
+	unsigned int flags;
+	void (*entry_handler)(struct fprobe *fp, unsigned long entry_ip, struct pt_regs *regs);
+};
+
+#define FPROBE_FL_DISABLED 1
+
+static inline bool fprobe_disabled(struct fprobe *fp)
+{
+	return (fp) ? fp->flags & FPROBE_FL_DISABLED : false;
+}
+
+#ifdef CONFIG_FPROBE
+int register_fprobe(struct fprobe *fp, const char *filter, const char *notfilter);
+int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs, int num);
+int register_fprobe_syms(struct fprobe *fp, const char **syms, int num);
+int unregister_fprobe(struct fprobe *fp);
+#else
+static inline int register_fprobe(struct fprobe *fp, const char *filter, const char *notfilter)
+{
+	return -EOPNOTSUPP;
+}
+static inline int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs, int num)
+{
+	return -EOPNOTSUPP;
+}
+static inline int register_fprobe_syms(struct fprobe *fp, const char **syms, int num)
+{
+	return -EOPNOTSUPP;
+}
+static inline int unregister_fprobe(struct fprobe *fp)
+{
+	return -EOPNOTSUPP;
+}
+#endif
+
+/**
+ * disable_fprobe() - Disable fprobe
+ * @fp: The fprobe to be disabled.
+ *
+ * This will soft-disable @fp. Note that this doesn't remove the ftrace
+ * hooks from the function entry.
+ */
+static inline void disable_fprobe(struct fprobe *fp)
+{
+	if (fp)
+		fp->flags |= FPROBE_FL_DISABLED;
+}
+
+/**
+ * enable_fprobe() - Enable fprobe
+ * @fp: The fprobe to be enabled.
+ *
+ * This will soft-enable @fp.
+ */
+static inline void enable_fprobe(struct fprobe *fp)
+{
+	if (fp)
+		fp->flags &= ~FPROBE_FL_DISABLED;
+}
+
+#endif
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 752ed89a293b..043c8f6c4075 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -230,6 +230,18 @@ config DYNAMIC_FTRACE_WITH_ARGS
 	depends on DYNAMIC_FTRACE
 	depends on HAVE_DYNAMIC_FTRACE_WITH_ARGS
 
+config FPROBE
+	bool "Kernel Function Probe (fprobe)"
+	depends on FUNCTION_TRACER
+	depends on DYNAMIC_FTRACE_WITH_REGS
+	default n
+	help
+	  This option enables kernel function probe (fprobe) based on ftrace,
+	  which is similar to kprobes, but probes only for kernel function
+	  entries and it can probe multiple functions by one fprobe.
+
+	  If unsure, say N.
+
 config FUNCTION_PROFILER
 	bool "Kernel function profiler"
 	depends on FUNCTION_TRACER
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index bedc5caceec7..79255f9de9a4 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -97,6 +97,7 @@ obj-$(CONFIG_PROBE_EVENTS) += trace_probe.o
 obj-$(CONFIG_UPROBE_EVENTS) += trace_uprobe.o
 obj-$(CONFIG_BOOTTIME_TRACING) += trace_boot.o
 obj-$(CONFIG_FTRACE_RECORD_RECURSION) += trace_recursion_record.o
+obj-$(CONFIG_FPROBE) += fprobe.o
 
 obj-$(CONFIG_TRACEPOINT_BENCHMARK) += trace_benchmark.o
 
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
new file mode 100644
index 000000000000..b5d4f8baaf43
--- /dev/null
+++ b/kernel/trace/fprobe.c
@@ -0,0 +1,218 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * fprobe - Simple ftrace probe wrapper for function entry.
+ */
+#define pr_fmt(fmt) "fprobe: " fmt
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+static void fprobe_handler(unsigned long ip, unsigned long parent_ip,
+			   struct ftrace_ops *ops, struct ftrace_regs *fregs)
+{
+	struct fprobe *fp;
+	int bit;
+
+	fp = container_of(ops, struct fprobe, ops);
+	if (fprobe_disabled(fp))
+		return;
+
+	bit = ftrace_test_recursion_trylock(ip, parent_ip);
+	if (bit < 0) {
+		fp->nmissed++;
+		return;
+	}
+
+	if (fp->entry_handler)
+		fp->entry_handler(fp, ip, ftrace_get_regs(fregs));
+
+	ftrace_test_recursion_unlock(bit);
+}
+NOKPROBE_SYMBOL(fprobe_handler);
+
+/* Convert symbols to ftrace location addresses */
+static unsigned long *get_ftrace_locations(const char **syms, int num)
+{
+	unsigned long *addrs, addr, size;
+	int i;
+
+	/* Convert symbols to symbol address */
+	addrs = kcalloc(num, sizeof(*addrs), GFP_KERNEL);
+	if (!addrs)
+		return ERR_PTR(-ENOMEM);
+
+	for (i = 0; i < num; i++) {
+		addrs[i] = kallsyms_lookup_name(syms[i]);
+		if (!addrs[i])	/* Maybe wrong symbol */
+			goto error;
+	}
+
+	/* Convert symbol address to ftrace location. */
+	for (i = 0; i < num; i++) {
+		if (!kallsyms_lookup_size_offset(addrs[i], &size, NULL))
+			size = MCOUNT_INSN_SIZE;
+		addr = ftrace_location_range(addrs[i], addrs[i] + size - 1);
+		if (!addr) /* No dynamic ftrace there. */
+			goto error;
+		addrs[i] = addr;
+	}
+
+	return addrs;
+
+error:
+	kfree(addrs);
+
+	return ERR_PTR(-ENOENT);
+}
+
+static void fprobe_init(struct fprobe *fp)
+{
+	fp->nmissed = 0;
+	fp->ops.func = fprobe_handler;
+	fp->ops.flags |= FTRACE_OPS_FL_SAVE_REGS;
+}
+
+/**
+ * register_fprobe() - Register fprobe to ftrace by pattern.
+ * @fp: A fprobe data structure to be registered.
+ * @filter: A wildcard pattern of probed symbols.
+ * @notfilter: A wildcard pattern of NOT probed symbols.
+ *
+ * Register @fp to ftrace for enabling the probe on the symbols matching @filter.
+ * If @notfilter is not NULL, the symbols matching @notfilter are not probed.
+ *
+ * Return 0 if @fp is registered successfully, -errno if not.
+ */
+int register_fprobe(struct fprobe *fp, const char *filter, const char *notfilter)
+{
+	unsigned char *str;
+	int ret, len;
+
+	if (!fp || !filter)
+		return -EINVAL;
+
+	fprobe_init(fp);
+
+	len = strlen(filter);
+	str = kstrdup(filter, GFP_KERNEL);
+	ret = ftrace_set_filter(&fp->ops, str, len, 0);
+	kfree(str);
+	if (ret)
+		return ret;
+
+	if (notfilter) {
+		len = strlen(notfilter);
+		str = kstrdup(notfilter, GFP_KERNEL);
+		ret = ftrace_set_notrace(&fp->ops, str, len, 0);
+		kfree(str);
+		if (ret)
+			goto out;
+	}
+
+	ret = register_ftrace_function(&fp->ops);
+out:
+	if (ret)
+		ftrace_free_filter(&fp->ops);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(register_fprobe);
+
+/**
+ * register_fprobe_ips() - Register fprobe to ftrace by address.
+ * @fp: A fprobe data structure to be registered.
+ * @addrs: An array of target ftrace location addresses.
+ * @num: The number of entries of @addrs.
+ *
+ * Register @fp to ftrace for enabling the probe on the address given by @addrs.
+ * The @addrs must be ftrace location addresses, each of which may be
+ * the symbol address + an arch-dependent offset.
+ * If you are unsure what this means, please use the other registration functions.
+ *
+ * Return 0 if @fp is registered successfully, -errno if not.
+ */
+int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs, int num)
+{
+	int ret;
+
+	if (!fp || !addrs || num <= 0)
+		return -EINVAL;
+
+	fprobe_init(fp);
+
+	ret = ftrace_set_filter_ips(&fp->ops, addrs, num, 0, 0);
+	if (!ret)
+		ret = register_ftrace_function(&fp->ops);
+
+	if (ret)
+		ftrace_free_filter(&fp->ops);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(register_fprobe_ips);
+
+/**
+ * register_fprobe_syms() - Register fprobe to ftrace by symbols.
+ * @fp: A fprobe data structure to be registered.
+ * @syms: An array of target symbols.
+ * @num: The number of entries of @syms.
+ *
+ * Register @fp to the symbols given by @syms array. This will be useful if
+ * you are sure the symbols exist in the kernel.
+ *
+ * Return 0 if @fp is registered successfully, -errno if not.
+ */
+int register_fprobe_syms(struct fprobe *fp, const char **syms, int num)
+{
+	unsigned long *addrs;
+	int ret;
+
+	if (!fp || !syms || num <= 0)
+		return -EINVAL;
+
+	fprobe_init(fp);
+
+	addrs = get_ftrace_locations(syms, num);
+	if (IS_ERR(addrs))
+		return PTR_ERR(addrs);
+
+	ret = ftrace_set_filter_ips(&fp->ops, addrs, num, 0, 0);
+	if (ret)
+		goto out;
+	ret = register_ftrace_function(&fp->ops);
+	if (ret)
+		ftrace_free_filter(&fp->ops);
+
+out:
+	kfree(addrs);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(register_fprobe_syms);
+
+/**
+ * unregister_fprobe() - Unregister fprobe from ftrace
+ * @fp: A fprobe data structure to be unregistered.
+ *
+ * Unregister fprobe (and remove ftrace hooks from the function entries).
+ *
+ * Return 0 if @fp is unregistered successfully, -errno if not.
+ */
+int unregister_fprobe(struct fprobe *fp)
+{
+	int ret;
+
+	if (!fp || fp->ops.func != fprobe_handler)
+		return -EINVAL;
+
+	ret = unregister_ftrace_function(&fp->ops);
+
+	if (!ret)
+		ftrace_free_filter(&fp->ops);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(unregister_fprobe);
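
For readers who want to see the dispatch pattern in isolation, the following is a rough userspace sketch of what fprobe_handler() does with the embedded ops member and the soft-disable flag. It is not kernel code and not part of this patch: model_ops, model_fprobe, model_container_of, model_handler, count_hit, model_fire, MODEL_FL_DISABLED and the nhits field are all hypothetical names made up for illustration, and the recursion guard and pt_regs argument are left out.

```c
/* Userspace model only: none of these names exist in the kernel. */
#include <assert.h>
#include <stddef.h>

#define MODEL_FL_DISABLED 1u	/* plays the role of FPROBE_FL_DISABLED */

/* Reduced stand-in for struct ftrace_ops: just the callback pointer. */
struct model_ops {
	void (*func)(unsigned long ip, struct model_ops *ops);
};

/* Stand-in for struct fprobe; ops is embedded so the handler can recover fp. */
struct model_fprobe {
	struct model_ops ops;
	unsigned long nhits;	/* hypothetical counter, bumped by count_hit() */
	unsigned int flags;
	void (*entry_handler)(struct model_fprobe *fp, unsigned long ip);
};

/* Same idea as the kernel's container_of(): member pointer -> enclosing struct. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Mirrors the shape of fprobe_handler(): recover fp, honor soft-disable, dispatch. */
void model_handler(unsigned long ip, struct model_ops *ops)
{
	struct model_fprobe *fp = model_container_of(ops, struct model_fprobe, ops);

	if (fp->flags & MODEL_FL_DISABLED)
		return;		/* hook stays installed, callback is skipped */
	if (fp->entry_handler)
		fp->entry_handler(fp, ip);
}

/* Example user callback: count every delivered entry event. */
void count_hit(struct model_fprobe *fp, unsigned long ip)
{
	(void)ip;
	fp->nhits++;
}

/* Simulate n function entries arriving through the embedded ops pointer. */
void model_fire(struct model_fprobe *fp, int n)
{
	assert(fp && fp->ops.func);
	for (int i = 0; i < n; i++)
		fp->ops.func(0x1000ul + i, &fp->ops);
}
```

In this model, setting MODEL_FL_DISABLED stops count_hit() from running while model_fire() still reaches model_handler(), which is the same distinction the disable_fprobe() kerneldoc draws between soft-disabling and removing the ftrace hooks.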