2015-11-13 12:29:47

by Wang Nan

[permalink] [raw]
Subject: [PATCH 00/12] perf tools: bpf: Improve BPF program ability

This patchset is combined by 3 parts:

Part 1: Bugfix (1/12 - 2/12).

Part 2: Improve BPF configuration syntax, allow BPF program set
probing options originally belong to 'perf probe'
(3/12 - 5/12)

Part 3: Allow BPF program read kernel data using variable names
and do structure dereferencing automatically.
(6/12 - 12/12)

Although not obvious, the order is need to be kept if one want to
try the testcase shown in 12/12's commit message. Otherwise he/she
should find another example, and deal with inlining problems and bugs
in perf probe.

Compare with previous patches I posted before, the main difference in
this patchset is improving commit messages and utilizing new testing
and error reporting code.

He Kuang (1):
perf tools: Add prologue for BPF programs for fetching arguments

Masami Hiramatsu (1):
perf probe: Fix memory leaking on faiulre by clearing all
probe_trace_events

Wang Nan (10):
perf probe: Clear probe_trace_event when add_probe_trace_event() fails
perf tools: Allow BPF program attach to uprobe events
perf tools: Allow BPF program attach to modules
perf tools: Allow BPF program config probing options
bpf tools: Load a program with different instances using preprocessor
perf tools: Add BPF_PROLOGUE config options for further patches
perf tools: Compile dwarf-regs.c if CONFIG_BPF_PROLOGUE is on
perf tools: Generate prologue for BPF programs
perf test: Test BPF prologue
perf tools: Use same BPF program if arguments are identical

tools/lib/bpf/libbpf.c | 145 ++++++++-
tools/lib/bpf/libbpf.h | 64 ++++
tools/perf/arch/x86/util/Build | 1 +
tools/perf/config/Makefile | 12 +
tools/perf/tests/Build | 9 +-
tools/perf/tests/bpf-script-test-prologue.c | 35 +++
tools/perf/tests/bpf.c | 34 +++
tools/perf/tests/llvm.c | 4 +
tools/perf/tests/llvm.h | 2 +
tools/perf/util/Build | 1 +
tools/perf/util/bpf-loader.c | 429 +++++++++++++++++++++++++-
tools/perf/util/bpf-loader.h | 4 +
tools/perf/util/bpf-prologue.c | 455 ++++++++++++++++++++++++++++
tools/perf/util/bpf-prologue.h | 34 +++
tools/perf/util/config.c | 9 +-
tools/perf/util/probe-finder.c | 24 +-
tools/perf/util/util.c | 18 ++
tools/perf/util/util.h | 2 +
18 files changed, 1254 insertions(+), 28 deletions(-)
create mode 100644 tools/perf/tests/bpf-script-test-prologue.c
create mode 100644 tools/perf/util/bpf-prologue.c
create mode 100644 tools/perf/util/bpf-prologue.h

--
1.8.3.4


2015-11-13 12:31:58

by Wang Nan

[permalink] [raw]
Subject: [PATCH 01/12] perf probe: Fix memory leaking on faiulre by clearing all probe_trace_events

From: Masami Hiramatsu <[email protected]>

Fix memory leaking on failure path in debuginfo__find_trace_events()
which frees an array of probe_trace_events but doesn't clear all
allocated sub-structures and strings.
So, before doing zfree(tevs), clear all the array elements which
can have allocated resources.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reported-by: Wang Nan <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
---
tools/perf/util/probe-finder.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index bd8f03d..63993d7 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1246,7 +1246,7 @@ int debuginfo__find_trace_events(struct debuginfo *dbg,
struct trace_event_finder tf = {
.pf = {.pev = pev, .callback = add_probe_trace_event},
.max_tevs = probe_conf.max_probes, .mod = dbg->mod};
- int ret;
+ int ret, i;

/* Allocate result tevs array */
*tevs = zalloc(sizeof(struct probe_trace_event) * tf.max_tevs);
@@ -1258,6 +1258,8 @@ int debuginfo__find_trace_events(struct debuginfo *dbg,

ret = debuginfo__find_probes(dbg, &tf.pf);
if (ret < 0) {
+ for (i = 0; i < tf.ntevs; i++)
+ clear_probe_trace_event(&tf.tevs[i]);
zfree(tevs);
return ret;
}
--
1.8.3.4

2015-11-13 12:31:53

by Wang Nan

[permalink] [raw]
Subject: [PATCH 02/12] perf probe: Clear probe_trace_event when add_probe_trace_event() fails

When probe with glob, error in add_probe_trace_event() won't be passed
to debuginfo__find_trace_events() because it whould be modified by
probe_point_search_cb(). It causes segfault if perf failed to find
argument for one probing point matched by the glob. For example:

# ./perf probe -v -n 'SyS_dup? oldfd'
probe-definition(0): SyS_dup? oldfd
symbol:SyS_dup? file:(null) line:0 offset:0 return:0 lazy:(null)
parsing arg: oldfd into oldfd
1 arguments
Looking at the vmlinux_path (7 entries long)
Using /lib/modules/4.3.0-rc4+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.3.0-rc4+/build/vmlinux
Try to find probe point from debuginfo.
Matched function: SyS_dup3
found inline addr: 0xffffffff812095c0
Probe point found: SyS_dup3+0
Searching 'oldfd' variable in context.
Converting variable oldfd into trace event.
oldfd type is long int.
found inline addr: 0xffffffff812096d4
Probe point found: SyS_dup2+36
Searching 'oldfd' variable in context.
Failed to find 'oldfd' in this function.
Matched function: SyS_dup3
Probe point found: SyS_dup3+0
Searching 'oldfd' variable in context.
Converting variable oldfd into trace event.
oldfd type is long int.
Matched function: SyS_dup2
Probe point found: SyS_dup2+0
Searching 'oldfd' variable in context.
Converting variable oldfd into trace event.
oldfd type is long int.
Found 4 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:probe/SyS_dup3 _text+2135488 oldfd=%di:s64
Segmentation fault (core dumped)

This patch ensures add_probe_trace_event() not touch tf->ntevs and
tf->tevs if it returns failure.

Here is testing result:

# perf probe 'SyS_dup? oldfd'
Failed to find 'oldfd' in this function.
Added new events:
probe:SyS_dup3 (on SyS_dup? with oldfd)
probe:SyS_dup3_1 (on SyS_dup? with oldfd)
probe:SyS_dup2 (on SyS_dup? with oldfd)

You can now use it in all perf tools, such as:

perf record -e probe:SyS_dup2 -aR sleep 1

Signed-off-by: Wang Nan <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
---
tools/perf/util/probe-finder.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 63993d7..05012bb 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1183,7 +1183,7 @@ static int add_probe_trace_event(Dwarf_Die *sc_die, struct probe_finder *pf)
container_of(pf, struct trace_event_finder, pf);
struct perf_probe_point *pp = &pf->pev->point;
struct probe_trace_event *tev;
- struct perf_probe_arg *args;
+ struct perf_probe_arg *args = NULL;
int ret, i;

/* Check number of tevs */
@@ -1198,19 +1198,23 @@ static int add_probe_trace_event(Dwarf_Die *sc_die, struct probe_finder *pf)
ret = convert_to_trace_point(&pf->sp_die, tf->mod, pf->addr,
pp->retprobe, pp->function, &tev->point);
if (ret < 0)
- return ret;
+ goto end;

tev->point.realname = strdup(dwarf_diename(sc_die));
- if (!tev->point.realname)
- return -ENOMEM;
+ if (!tev->point.realname) {
+ ret = -ENOMEM;
+ goto end;
+ }

pr_debug("Probe point found: %s+%lu\n", tev->point.symbol,
tev->point.offset);

/* Expand special probe argument if exist */
args = zalloc(sizeof(struct perf_probe_arg) * MAX_PROBE_ARGS);
- if (args == NULL)
- return -ENOMEM;
+ if (args == NULL) {
+ ret = -ENOMEM;
+ goto end;
+ }

ret = expand_probe_args(sc_die, pf, args);
if (ret < 0)
@@ -1234,6 +1238,10 @@ static int add_probe_trace_event(Dwarf_Die *sc_die, struct probe_finder *pf)
}

end:
+ if (ret) {
+ clear_probe_trace_event(tev);
+ tf->ntevs--;
+ }
free(args);
return ret;
}
--
1.8.3.4

2015-11-13 12:34:31

by Wang Nan

[permalink] [raw]
Subject: [PATCH 03/12] perf tools: Allow BPF program attach to uprobe events

This patch appends new syntax to BPF object section name to support
probing at uprobe event. Now we can use BPF program like this:

SEC(
"exec=/lib64/libc.so.6\n"
"libcwrite=__write"
)
int libcwrite(void *ctx)
{
return 1;
}

Where, in section name of a program, before the main config string,
we can use 'key=value' style options. Now the only option key "exec"
is for uprobe probing.

Signed-off-by: Wang Nan <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
---
tools/perf/util/bpf-loader.c | 122 ++++++++++++++++++++++++++++++++++++++++---
tools/perf/util/bpf-loader.h | 1 +
2 files changed, 117 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 4c50411..5f5505d 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -110,6 +110,115 @@ bpf_prog_priv__clear(struct bpf_program *prog __maybe_unused,
}

static int
+config__exec(const char *value, struct perf_probe_event *pev)
+{
+ pev->uprobes = true;
+ pev->target = strdup(value);
+ return 0;
+}
+
+static struct {
+ const char *key;
+ const char *usage;
+ const char *desc;
+ int (*func)(const char *, struct perf_probe_event *);
+} bpf_config_terms[] = {
+ {
+ "exec",
+ "exec=<full path of file>",
+ "Set uprobe target",
+ config__exec,
+ },
+};
+
+static int
+do_config(const char *key, const char *value,
+ struct perf_probe_event *pev)
+{
+ unsigned int i;
+
+ pr_debug("config bpf program: %s=%s\n", key, value);
+ for (i = 0; i < ARRAY_SIZE(bpf_config_terms); i++)
+ if (strcmp(key, bpf_config_terms[i].key) == 0)
+ return bpf_config_terms[i].func(value, pev);
+
+ pr_debug("BPF: ERROR: invalid config option in object: %s=%s\n",
+ key, value);
+
+ pr_debug("\nHint: Currently valid options are:\n");
+ for (i = 0; i < ARRAY_SIZE(bpf_config_terms); i++)
+ pr_debug("\t%s:\t%s\n", bpf_config_terms[i].usage,
+ bpf_config_terms[i].desc);
+ pr_debug("\n");
+
+ return -BPF_LOADER_ERRNO__CONFIG_TERM;
+}
+
+static const char *
+parse_config_kvpair(const char *config_str, struct perf_probe_event *pev)
+{
+ char *text = strdup(config_str);
+ char *sep, *line;
+ const char *main_str = NULL;
+ int err = 0;
+
+ if (!text) {
+ pr_debug("No enough memory: dup config_str failed\n");
+ return ERR_PTR(-ENOMEM);
+ }
+
+ line = text;
+ while ((sep = strchr(line, '\n'))) {
+ char *equ;
+
+ *sep = '\0';
+ equ = strchr(line, '=');
+ if (!equ) {
+ pr_warning("WARNING: invalid config in BPF object: %s\n",
+ line);
+ pr_warning("\tShould be 'key=value'.\n");
+ goto nextline;
+ }
+ *equ = '\0';
+
+ err = do_config(line, equ + 1, pev);
+ if (err)
+ break;
+nextline:
+ line = sep + 1;
+ }
+
+ if (!err)
+ main_str = config_str + (line - text);
+ free(text);
+
+ if (err)
+ return ERR_PTR(err);
+
+ return main_str;
+}
+
+static int
+parse_config(const char *config_str, struct perf_probe_event *pev)
+{
+ const char *main_str;
+ int err;
+
+ main_str = parse_config_kvpair(config_str, pev);
+ if (IS_ERR(main_str))
+ return PTR_ERR(main_str);
+
+ err = parse_perf_probe_command(main_str, pev);
+ if (err < 0) {
+ pr_debug("bpf: '%s' is not a valid config string\n",
+ config_str);
+ /* parse failed, don't need clear pev. */
+ return -BPF_LOADER_ERRNO__CONFIG;
+ }
+ return 0;
+}
+
+static int
config_bpf_program(struct bpf_program *prog)
{
struct perf_probe_event *pev = NULL;
@@ -131,13 +240,9 @@ config_bpf_program(struct bpf_program *prog)
pev = &priv->pev;

pr_debug("bpf: config program '%s'\n", config_str);
- err = parse_perf_probe_command(config_str, pev);
- if (err < 0) {
- pr_debug("bpf: '%s' is not a valid config string\n",
- config_str);
- err = -BPF_LOADER_ERRNO__CONFIG;
+ err = parse_config(config_str, pev);
+ if (err)
goto errout;
- }

if (pev->group && strcmp(pev->group, PERF_BPF_PROBE_GROUP)) {
pr_debug("bpf: '%s': group for event is set and not '%s'.\n",
@@ -340,6 +445,7 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
[ERRCODE_OFFSET(EVENTNAME)] = "No event name found in config string",
[ERRCODE_OFFSET(INTERNAL)] = "BPF loader internal error",
[ERRCODE_OFFSET(COMPILE)] = "Error when compiling BPF scriptlet",
+ [ERRCODE_OFFSET(CONFIG_TERM)] = "Invalid config term in config string",
};

static int
@@ -420,6 +526,10 @@ int bpf__strerror_probe(struct bpf_object *obj __maybe_unused,
int err, char *buf, size_t size)
{
bpf__strerror_head(err, buf, size);
+ case BPF_LOADER_ERRNO__CONFIG_TERM: {
+ scnprintf(buf, size, "%s (add -v to see detail)", emsg);
+ break;
+ }
bpf__strerror_entry(EEXIST, "Probe point exist. Try use 'perf probe -d \"*\"'");
bpf__strerror_entry(EACCES, "You need to be root");
bpf__strerror_entry(EPERM, "You need to be root, and /proc/sys/kernel/kptr_restrict should be 0");
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 9caf3ae..d19f5c5 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -20,6 +20,7 @@ enum bpf_loader_errno {
BPF_LOADER_ERRNO__EVENTNAME, /* Event name is missing */
BPF_LOADER_ERRNO__INTERNAL, /* BPF loader internal error */
BPF_LOADER_ERRNO__COMPILE, /* Error when compiling BPF scriptlet */
+ BPF_LOADER_ERRNO__CONFIG_TERM, /* Invalid config term in config term */
__BPF_LOADER_ERRNO__END,
};

--
1.8.3.4

2015-11-13 12:32:51

by Wang Nan

[permalink] [raw]
Subject: [PATCH 04/12] perf tools: Allow BPF program attach to modules

By extending the syntax of BPF object section names, this patch allows
user to attach BPF programs to symbol in modules. For example:

SEC("module=i915\n"
"parse_cmds=i915_parse_cmds")
int parse_cmds(void *ctx)
{
return 1;
}

Implementation is very simple: like what 'perf probe' does, for module,
fill 'uprobe' field in 'struct perf_probe_event'. Other parts would be
done automatically.

Signed-off-by: Wang Nan <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Brendan Gregg <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: David Ahern <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kaixu Xia <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Cc: Arnaldo Carvalho de Melo <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
---
tools/perf/util/bpf-loader.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 5f5505d..8d78785 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -117,6 +117,14 @@ config__exec(const char *value, struct perf_probe_event *pev)
return 0;
}

+static int
+config__module(const char *value, struct perf_probe_event *pev)
+{
+ pev->uprobes = false;
+ pev->target = strdup(value);
+ return 0;
+}
+
static struct {
const char *key;
const char *usage;
@@ -129,6 +137,12 @@ static struct {
"Set uprobe target",
config__exec,
},
+ {
+ "module",
+ "module=<module name> ",
+ "Set kprobe module",
+ config__module,
+ }
};

static int
--
1.8.3.4

2015-11-13 12:32:07

by Wang Nan

[permalink] [raw]
Subject: [PATCH 05/12] perf tools: Allow BPF program config probing options

By extending the syntax of BPF object section names, this patch allows
user to config probing options like what they can do in 'perf probe'.

Test result:

For following BPF file bpf.c:

SEC("inlines=no\n"
"func=SyS_dup?")
int func(void *ctx)
{
return 1;
}

Cmdline:

# ./perf record -e ./test_probe_glob.c ls /
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data ]
# ./perf evlist
perf_bpf_probe:func_1
perf_bpf_probe:func

Change "inlines=no" to "inlines=yes":

Cmdline:

# ./perf record -e ./test_probe_glob.c ls /
...
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data ]
# ./perf evlist
perf_bpf_probe:func_3
perf_bpf_probe:func_2
perf_bpf_probe:func_1
perf_bpf_probe:func

Signed-off-by: Wang Nan <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
---
tools/perf/util/bpf-loader.c | 50 +++++++++++++++++++++++++++++++++++++++++++-
tools/perf/util/config.c | 9 ++++----
tools/perf/util/util.c | 18 ++++++++++++++++
tools/perf/util/util.h | 2 ++
4 files changed, 74 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 8d78785..a368ead 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -125,6 +125,38 @@ config__module(const char *value, struct perf_probe_event *pev)
return 0;
}

+static int
+config__bool(const char *value,
+ bool *pbool, bool invert)
+{
+ int err;
+ bool bool_value;
+
+ if (!pbool)
+ return -EINVAL;
+
+ err = convert_str_to_bool(value, &bool_value);
+ if (err)
+ return err;
+
+ *pbool = invert ? !bool_value : bool_value;
+ return 0;
+}
+
+static int
+config__inlines(const char *value,
+ struct perf_probe_event *pev __maybe_unused)
+{
+ return config__bool(value, &probe_conf.no_inlines, true);
+}
+
+static int
+config__force(const char *value,
+ struct perf_probe_event *pev __maybe_unused)
+{
+ return config__bool(value, &probe_conf.force_add, false);
+}
+
static struct {
const char *key;
const char *usage;
@@ -142,7 +174,19 @@ static struct {
"module=<module name> ",
"Set kprobe module",
config__module,
- }
+ },
+ {
+ "inlines",
+ "inlines=[yes|no] ",
+ "Probe at inline symbol",
+ config__inlines,
+ },
+ {
+ "force",
+ "force=[yes|no] ",
+ "Forcibly add events with existing name",
+ config__force,
+ },
};

static int
@@ -240,6 +284,10 @@ config_bpf_program(struct bpf_program *prog)
const char *config_str;
int err;

+ /* Initialize per-program probing setting */
+ probe_conf.no_inlines = false;
+ probe_conf.force_add = false;
+
config_str = bpf_program__title(prog, false);
if (IS_ERR(config_str)) {
pr_debug("bpf: unable to get title for program\n");
diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index 2e452ac..8219798 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -351,15 +351,16 @@ int perf_config_int(const char *name, const char *value)

static int perf_config_bool_or_int(const char *name, const char *value, int *is_bool)
{
+ bool str_bool;
+
*is_bool = 1;
if (!value)
return 1;
if (!*value)
return 0;
- if (!strcasecmp(value, "true") || !strcasecmp(value, "yes") || !strcasecmp(value, "on"))
- return 1;
- if (!strcasecmp(value, "false") || !strcasecmp(value, "no") || !strcasecmp(value, "off"))
- return 0;
+
+ if (convert_str_to_bool(value, &str_bool) == 0)
+ return str_bool ? 1 : 0;
*is_bool = 0;
return perf_config_int(name, value);
}
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 47b1e36..55785d5 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -695,3 +695,21 @@ fetch_kernel_version(unsigned int *puint, char *str,
*puint = (version << 16) + (patchlevel << 8) + sublevel;
return 0;
}
+
+int convert_str_to_bool(const char *str, bool *result)
+{
+ if (!result || !str)
+ return -EINVAL;
+
+ if (!strcasecmp(str, "true") || !strcasecmp(str, "yes") || !strcasecmp(str, "on")) {
+ *result = true;
+ return 0;
+ }
+
+ if (!strcasecmp(str, "false") || !strcasecmp(str, "no") || !strcasecmp(str, "off")) {
+ *result = false;
+ return 0;
+ }
+
+ return -EINVAL;
+}
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index dcc6590..be90932 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -358,4 +358,6 @@ int fetch_kernel_version(unsigned int *puint,
#define KVER_FMT "%d.%d.%d"
#define KVER_PARAM(x) KVER_VERSION(x), KVER_PATCHLEVEL(x), KVER_SUBLEVEL(x)

+int convert_str_to_bool(const char *str, bool *result);
+
#endif /* GIT_COMPAT_UTIL_H */
--
1.8.3.4

2015-11-13 12:32:37

by Wang Nan

[permalink] [raw]
Subject: [PATCH 06/12] bpf tools: Load a program with different instances using preprocessor

This patch is a preparation for BPF prologue support which allows
generating a series of BPF bytecode for fetching kernel data before
calling program code. With the newly introduced multiple instances
support, perf is able to create different prologues for different
kprobe points.

Before this patch, a bpf_program can be loaded into kernel only once,
and get the only resuling fd. What this patch done is to allow creating
and loading different variants of one bpf_program, then fetching their
fds.

Here describe the basic idea in this patch. The detail description of
the newly introduced APIs can be found in comment in the patch body.

The key of this patch is the new mechanism in bpf_program__load().
Instead of loading BPF program into kernel directly, it calls a
'pre-processor' to generate program instances which would be final
loaded into kernel based on the original code. To enable multiple
instances generation, libbpf passes an index to the pre-processor
so it know which instance is being loaded.

Pre-processor should be passed from libbpf's user (perf) using
bpf_program__set_prep(). The number of instances and the relationship
between indics and the target instance should be clear when calling
bpf_program__set_prep().

To retrive fd for a specific instance of a program,
bpf_program__nth_fd() is introduced. It return the resuling fd
according to index.

Signed-off-by: Wang Nan <[email protected]>
Signed-off-by: He Kuang <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
---
tools/lib/bpf/libbpf.c | 145 ++++++++++++++++++++++++++++++++++++++++++++++---
tools/lib/bpf/libbpf.h | 64 ++++++++++++++++++++++
2 files changed, 200 insertions(+), 9 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index e176bad..fcfa39f 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -152,7 +152,11 @@ struct bpf_program {
} *reloc_desc;
int nr_reloc;

- int fd;
+ struct {
+ int nr;
+ int *fds;
+ } instances;
+ bpf_program_prep_t preprocessor;

struct bpf_object *obj;
void *priv;
@@ -206,10 +210,24 @@ struct bpf_object {

static void bpf_program__unload(struct bpf_program *prog)
{
+ int i;
+
if (!prog)
return;

- zclose(prog->fd);
+ /*
+ * If the object is opened but the program is never loaded,
+ * it is possible that prog->instances.nr == -1.
+ */
+ if (prog->instances.nr > 0) {
+ for (i = 0; i < prog->instances.nr; i++)
+ zclose(prog->instances.fds[i]);
+ } else if (prog->instances.nr != -1)
+ pr_warning("Internal error: instances.nr is %d\n",
+ prog->instances.nr);
+
+ prog->instances.nr = -1;
+ zfree(&prog->instances.fds);
}

static void bpf_program__exit(struct bpf_program *prog)
@@ -260,7 +278,8 @@ bpf_program__init(void *data, size_t size, char *name, int idx,
memcpy(prog->insns, data,
prog->insns_cnt * sizeof(struct bpf_insn));
prog->idx = idx;
- prog->fd = -1;
+ prog->instances.fds = NULL;
+ prog->instances.nr = -1;

return 0;
errout:
@@ -860,13 +879,73 @@ static int
bpf_program__load(struct bpf_program *prog,
char *license, u32 kern_version)
{
- int err, fd;
+ int err = 0, fd, i;
+
+ if (prog->instances.nr < 0 || !prog->instances.fds) {
+ if (prog->preprocessor) {
+ pr_warning("Internal error: can't load program '%s'\n",
+ prog->section_name);
+ return -LIBBPF_ERRNO__INTERNAL;
+ }
+
+ prog->instances.fds = malloc(sizeof(int));
+ if (!prog->instances.fds) {
+ pr_warning("No enough memory for fds\n");
+ return -ENOMEM;
+ }
+ prog->instances.nr = 1;
+ prog->instances.fds[0] = -1;
+ }
+
+ if (!prog->preprocessor) {
+ if (prog->instances.nr != 1)
+ pr_warning("Program '%s' inconsistent: nr(%d) not 1\n",
+ prog->section_name, prog->instances.nr);

- err = load_program(prog->insns, prog->insns_cnt,
- license, kern_version, &fd);
- if (!err)
- prog->fd = fd;
+ err = load_program(prog->insns, prog->insns_cnt,
+ license, kern_version, &fd);
+ if (!err)
+ prog->instances.fds[0] = fd;
+ goto out;
+ }
+
+ for (i = 0; i < prog->instances.nr; i++) {
+ struct bpf_prog_prep_result result;
+ bpf_program_prep_t preprocessor = prog->preprocessor;
+
+ bzero(&result, sizeof(result));
+ err = (*preprocessor)(prog, i, prog->insns,
+ prog->insns_cnt, &result);
+ if (err) {
+ pr_warning("Preprocessing %dth instance of program '%s' failed\n",
+ i, prog->section_name);
+ goto out;
+ }
+
+ if (!result.new_insn_ptr || !result.new_insn_cnt) {
+ pr_debug("Skip loading %dth instance of program '%s'\n",
+ i, prog->section_name);
+ prog->instances.fds[i] = -1;
+ if (result.pfd)
+ *result.pfd = -1;
+ continue;
+ }
+
+ err = load_program(result.new_insn_ptr,
+ result.new_insn_cnt,
+ license, kern_version, &fd);
+
+ if (err) {
+ pr_warning("Loading %dth instance of program '%s' failed\n",
+ i, prog->section_name);
+ goto out;
+ }

+ if (result.pfd)
+ *result.pfd = fd;
+ prog->instances.fds[i] = fd;
+ }
+out:
if (err)
pr_warning("failed to load program '%s'\n",
prog->section_name);
@@ -1121,5 +1200,53 @@ const char *bpf_program__title(struct bpf_program *prog, bool needs_copy)

int bpf_program__fd(struct bpf_program *prog)
{
- return prog->fd;
+ return bpf_program__nth_fd(prog, 0);
+}
+
+int bpf_program__set_prep(struct bpf_program *prog, int nr_instances,
+ bpf_program_prep_t prep)
+{
+ int *instances_fds;
+
+ if (nr_instances <= 0 || !prep)
+ return -EINVAL;
+
+ if (prog->instances.nr > 0 || prog->instances.fds) {
+ pr_warning("Can't set pre-processor after loading\n");
+ return -EINVAL;
+ }
+
+ instances_fds = malloc(sizeof(int) * nr_instances);
+ if (!instances_fds) {
+ pr_warning("alloc memory failed for fds\n");
+ return -ENOMEM;
+ }
+
+ /* fill all fd with -1 */
+ memset(instances_fds, 0xff, sizeof(int) * nr_instances);
+
+ prog->instances.nr = nr_instances;
+ prog->instances.fds = instances_fds;
+ prog->preprocessor = prep;
+ return 0;
+}
+
+int bpf_program__nth_fd(struct bpf_program *prog, int n)
+{
+ int fd;
+
+ if (n >= prog->instances.nr || n < 0) {
+ pr_warning("Can't get the %dth fd from program %s: only %d instances\n",
+ n, prog->section_name, prog->instances.nr);
+ return -EINVAL;
+ }
+
+ fd = prog->instances.fds[n];
+ if (fd < 0) {
+ pr_warning("%dth instance of program '%s' is invalid\n",
+ n, prog->section_name);
+ return -ENOENT;
+ }
+
+ return fd;
}
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index c9a9aef..949df4b 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -88,6 +88,70 @@ const char *bpf_program__title(struct bpf_program *prog, bool needs_copy);

int bpf_program__fd(struct bpf_program *prog);

+struct bpf_insn;
+
+/*
+ * Libbpf allows callers to adjust BPF programs before being loaded
+ * into kernel. One program in an object file can be transform into
+ * multiple variants to be attached to different code.
+ *
+ * bpf_program_prep_t, bpf_program__set_prep and bpf_program__nth_fd
+ * are APIs for this propose.
+ *
+ * - bpf_program_prep_t:
+ * It defines 'preprocessor', which is a caller defined function
+ * passed to libbpf through bpf_program__set_prep(), and will be
+ * called before program is loaded. The processor should adjust
+ * the program one time for each instances according to the number
+ * passed to it.
+ *
+ * - bpf_program__set_prep:
+ * Attachs a preprocessor to a BPF program. The number of instances
+ * whould be created is also passed through this function.
+ *
+ * - bpf_program__nth_fd:
+ * After the program is loaded, get resuling fds from bpf program for
+ * each instances.
+ *
+ * If bpf_program__set_prep() is not used, the program whould be loaded
+ * without adjustment during bpf_object__load(). The program has only
+ * one instance. In this case bpf_program__fd(prog) is equal to
+ * bpf_program__nth_fd(prog, 0).
+ */
+
+struct bpf_prog_prep_result {
+ /*
+ * If not NULL, load new instruction array.
+ * If set to NULL, don't load this instance.
+ */
+ struct bpf_insn *new_insn_ptr;
+ int new_insn_cnt;
+
+ /* If not NULL, result fd is set to it */
+ int *pfd;
+};
+
+/*
+ * Parameters of bpf_program_prep_t:
+ * - prog: The bpf_program being loaded.
+ * - n: Index of instance being generated.
+ * - insns: BPF instructions array.
+ * - insns_cnt:Number of instructions in insns.
+ * - res: Output parameter, result of transformation.
+ *
+ * Return value:
+ * - Zero: pre-processing success.
+ * - Non-zero: pre-processing, stop loading.
+ */
+typedef int (*bpf_program_prep_t)(struct bpf_program *prog, int n,
+ struct bpf_insn *insns, int insns_cnt,
+ struct bpf_prog_prep_result *res);
+
+int bpf_program__set_prep(struct bpf_program *prog, int nr_instance,
+ bpf_program_prep_t prep);
+
+int bpf_program__nth_fd(struct bpf_program *prog, int n);
+
/*
* We don't need __attribute__((packed)) now since it is
* unnecessary for 'bpf_map_def' because they are all aligned.
--
1.8.3.4

2015-11-13 12:32:34

by Wang Nan

[permalink] [raw]
Subject: [PATCH 07/12] perf tools: Add BPF_PROLOGUE config options for further patches

If both LIBBPF and DWARF are detected, it is possible to create prologue
for eBPF programs to help them accessing kernel data. HAVE_BPF_PROLOGUE
and CONFIG_BPF_PROLOGUE is added as flags for this feature.

PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET is introduced in commit
63ab024a5b6f295ca17a293ad81b7c728f49a89a ("perf tools:
regs_query_register_offset() infrastructure"), which indicates an
architecture supports converting name of a register to its offset in
'struct pt_regs'. Without this support, BPF_PROLOGUE should be turned off.

Signed-off-by: Wang Nan <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
---
tools/perf/config/Makefile | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index de89ec5..6eb9a95 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -318,6 +318,18 @@ ifndef NO_LIBELF
CFLAGS += -DHAVE_LIBBPF_SUPPORT
$(call detected,CONFIG_LIBBPF)
endif
+
+ ifndef NO_DWARF
+ ifdef PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
+ CFLAGS += -DHAVE_BPF_PROLOGUE
+ $(call detected,CONFIG_BPF_PROLOGUE)
+ else
+ msg := $(warning BPF prologue is not supported by architecture $(ARCH), missing regs_query_register_offset());
+ endif
+ else
+ msg := $(warning DWARF support is off, BPF prologue is disabled);
+ endif
+
endif # NO_LIBBPF
endif # NO_LIBELF

--
1.8.3.4

2015-11-13 12:32:23

by Wang Nan

[permalink] [raw]
Subject: [PATCH 08/12] perf tools: Compile dwarf-regs.c if CONFIG_BPF_PROLOGUE is on

regs_query_register_offset() in dwarf-regs.c is required by BPF prologue.
This patch compiles it if CONFIG_BPF_PROLOGUE is on to avoid buildin
failure when CONFIG_BPF_PROLOGUE is on but CONFIG_DWARF is not set.

Signed-off-by: Wang Nan <[email protected]>
Signed-off-by: He Kuang <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
---
tools/perf/arch/x86/util/Build | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index ff63649..4659703 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -5,6 +5,7 @@ libperf-y += kvm-stat.o
libperf-y += perf_regs.o

libperf-$(CONFIG_DWARF) += dwarf-regs.o
+libperf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o

libperf-$(CONFIG_LIBUNWIND) += unwind-libunwind.o
libperf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
--
1.8.3.4

2015-11-13 12:32:35

by Wang Nan

[permalink] [raw]
Subject: [PATCH 09/12] perf tools: Add prologue for BPF programs for fetching arguments

From: He Kuang <[email protected]>

This patch generates prologue for a BPF program which fetch arguments
for it. With this patch, the program can have arguments as follow:

SEC("lock_page=__lock_page page->flags")
int lock_page(struct pt_regs *ctx, int err, unsigned long flags)
{
return 1;
}

This patch passes at most 3 arguments from r3, r4 and r5. r1 is still
the ctx pointer. r2 is used to indicate the successfulness of
dereferencing.

This patch uses r6 to hold ctx (struct pt_regs) and r7 to hold stack
pointer for result. Result of each arguments first store on stack:

low address
BPF_REG_FP - 24 ARG3
BPF_REG_FP - 16 ARG2
BPF_REG_FP - 8 ARG1
BPF_REG_FP
high address

Then loaded into r3, r4 and r5.

The output prologue for offn(...off2(off1(reg)))) should be:

r6 <- r1 // save ctx into a callee saved register
r7 <- fp
r7 <- r7 - stack_offset // pointer to result slot
/* load r3 with the offset in pt_regs of 'reg' */
(r7) <- r3 // make slot valid
r3 <- r3 + off1 // prepare to read unsafe pointer
r2 <- 8
r1 <- r7 // result put onto stack
call probe_read // read unsafe pointer
jnei r0, 0, err // error checking
r3 <- (r7) // read result
r3 <- r3 + off2 // prepare to read unsafe pointer
r2 <- 8
r1 <- r7
call probe_read
jnei r0, 0, err
...
/* load r2, r3, r4 from stack */
goto success
err:
r2 <- 1
/* load r3, r4, r5 with 0 */
goto usercode
success:
r2 <- 0
usercode:
r1 <- r6 // restore ctx
// original user code

If all of arguments reside in register (dereferencing is not
required), gen_prologue_fastpath() will be used to create
fast prologue:

r3 <- (r1 + offset of reg1)
r4 <- (r1 + offset of reg2)
r5 <- (r1 + offset of reg3)
r2 <- 0

P.S.

eBPF calling convention is defined as:

* r0 - return value from in-kernel function, and exit value
for eBPF program
* r1 - r5 - arguments from eBPF program to in-kernel function
* r6 - r9 - callee saved registers that in-kernel function will
preserve
* r10 - read-only frame pointer to access stack

Signed-off-by: He Kuang <[email protected]>
Signed-off-by: Wang Nan <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
---
tools/perf/util/Build | 1 +
tools/perf/util/bpf-loader.c | 3 +
tools/perf/util/bpf-loader.h | 3 +
tools/perf/util/bpf-prologue.c | 455 +++++++++++++++++++++++++++++++++++++++++
tools/perf/util/bpf-prologue.h | 34 +++
5 files changed, 496 insertions(+)
create mode 100644 tools/perf/util/bpf-prologue.c
create mode 100644 tools/perf/util/bpf-prologue.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 591b3fe..b9d56f2 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -88,6 +88,7 @@ libperf-y += parse-branch-options.o
libperf-y += parse-regs-options.o

libperf-$(CONFIG_LIBBPF) += bpf-loader.o
+libperf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
libperf-$(CONFIG_LIBELF) += symbol-elf.o
libperf-$(CONFIG_LIBELF) += probe-file.o
libperf-$(CONFIG_LIBELF) += probe-event.o
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index a368ead..99e14d1 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -508,6 +508,9 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
[ERRCODE_OFFSET(INTERNAL)] = "BPF loader internal error",
[ERRCODE_OFFSET(COMPILE)] = "Error when compiling BPF scriptlet",
[ERRCODE_OFFSET(CONFIG_TERM)] = "Invalid config term in config string",
+ [ERRCODE_OFFSET(PROLOGUE)] = "Failed to generate prologue",
+ [ERRCODE_OFFSET(PROLOGUE2BIG)] = "Prologue too big for program",
+ [ERRCODE_OFFSET(PROLOGUEOOB)] = "Offset out of bound for prologue",
};

static int
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index d19f5c5..a58740b 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -21,6 +21,9 @@ enum bpf_loader_errno {
BPF_LOADER_ERRNO__INTERNAL, /* BPF loader internal error */
BPF_LOADER_ERRNO__COMPILE, /* Error when compiling BPF scriptlet */
BPF_LOADER_ERRNO__CONFIG_TERM, /* Invalid config term in config term */
+ BPF_LOADER_ERRNO__PROLOGUE, /* Failed to generate prologue */
+ BPF_LOADER_ERRNO__PROLOGUE2BIG, /* Prologue too big for program */
+ BPF_LOADER_ERRNO__PROLOGUEOOB, /* Offset out of bound for prologue */
__BPF_LOADER_ERRNO__END,
};

diff --git a/tools/perf/util/bpf-prologue.c b/tools/perf/util/bpf-prologue.c
new file mode 100644
index 0000000..6cdbee1
--- /dev/null
+++ b/tools/perf/util/bpf-prologue.c
@@ -0,0 +1,455 @@
+/*
+ * bpf-prologue.c
+ *
+ * Copyright (C) 2015 He Kuang <[email protected]>
+ * Copyright (C) 2015 Wang Nan <[email protected]>
+ * Copyright (C) 2015 Huawei Inc.
+ */
+
+#include <bpf/libbpf.h>
+#include "perf.h"
+#include "debug.h"
+#include "bpf-loader.h"
+#include "bpf-prologue.h"
+#include "probe-finder.h"
+#include <dwarf-regs.h>
+#include <linux/filter.h>
+
+#define BPF_REG_SIZE 8
+
+#define JMP_TO_ERROR_CODE -1
+#define JMP_TO_SUCCESS_CODE -2
+#define JMP_TO_USER_CODE -3
+
+struct bpf_insn_pos {
+ struct bpf_insn *begin;
+ struct bpf_insn *end;
+ struct bpf_insn *pos;
+};
+
+static inline int
+pos_get_cnt(struct bpf_insn_pos *pos)
+{
+ return pos->pos - pos->begin;
+}
+
+static int
+append_insn(struct bpf_insn new_insn, struct bpf_insn_pos *pos)
+{
+ if (!pos->pos)
+ return -BPF_LOADER_ERRNO__PROLOGUE2BIG;
+
+ if (pos->pos + 1 >= pos->end) {
+ pr_err("bpf prologue: prologue too long\n");
+ pos->pos = NULL;
+ return -BPF_LOADER_ERRNO__PROLOGUE2BIG;
+ }
+
+ *(pos->pos)++ = new_insn;
+ return 0;
+}
+
+static int
+check_pos(struct bpf_insn_pos *pos)
+{
+ if (!pos->pos || pos->pos >= pos->end)
+ return -BPF_LOADER_ERRNO__PROLOGUE2BIG;
+ return 0;
+}
+
+/* Give it a shorter name */
+#define ins(i, p) append_insn((i), (p))
+
+/*
+ * Give a register name (in 'reg'), generate instruction to
+ * load register into an eBPF register rd:
+ * 'ldd target_reg, offset(ctx_reg)', where:
+ * ctx_reg is pre initialized to pointer of 'struct pt_regs'.
+ */
+static int
+gen_ldx_reg_from_ctx(struct bpf_insn_pos *pos, int ctx_reg,
+ const char *reg, int target_reg)
+{
+ int offset = regs_query_register_offset(reg);
+
+ if (offset < 0) {
+ pr_err("bpf: prologue: failed to get register %s\n",
+ reg);
+ return offset;
+ }
+ ins(BPF_LDX_MEM(BPF_DW, target_reg, ctx_reg, offset), pos);
+
+ return check_pos(pos);
+}
+
+/*
+ * Generate a BPF_FUNC_probe_read function call.
+ *
+ * src_base_addr_reg is a register holding base address,
+ * dst_addr_reg is a register holding dest address (on stack),
+ * result is:
+ *
+ * *[dst_addr_reg] = *([src_base_addr_reg] + offset)
+ *
+ * Arguments of BPF_FUNC_probe_read:
+ * ARG1: ptr to stack (dest)
+ * ARG2: size (8)
+ * ARG3: unsafe ptr (src)
+ */
+static int
+gen_read_mem(struct bpf_insn_pos *pos,
+ int src_base_addr_reg,
+ int dst_addr_reg,
+ long offset)
+{
+ /* mov arg3, src_base_addr_reg */
+ if (src_base_addr_reg != BPF_REG_ARG3)
+ ins(BPF_MOV64_REG(BPF_REG_ARG3, src_base_addr_reg), pos);
+ /* add arg3, #offset */
+ if (offset)
+ ins(BPF_ALU64_IMM(BPF_ADD, BPF_REG_ARG3, offset), pos);
+
+ /* mov arg2, #reg_size */
+ ins(BPF_ALU64_IMM(BPF_MOV, BPF_REG_ARG2, BPF_REG_SIZE), pos);
+
+ /* mov arg1, dst_addr_reg */
+ if (dst_addr_reg != BPF_REG_ARG1)
+ ins(BPF_MOV64_REG(BPF_REG_ARG1, dst_addr_reg), pos);
+
+ /* Call probe_read */
+ ins(BPF_EMIT_CALL(BPF_FUNC_probe_read), pos);
+ /*
+ * Error processing: if read fail, goto error code,
+ * will be relocated. Target should be the start of
+ * error processing code.
+ */
+ ins(BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, JMP_TO_ERROR_CODE),
+ pos);
+
+ return check_pos(pos);
+}
+
+/*
+ * Each arg should be bare register. Fetch and save them into argument
+ * registers (r3 - r5).
+ *
+ * BPF_REG_1 should have been initialized with pointer to
+ * 'struct pt_regs'.
+ */
+static int
+gen_prologue_fastpath(struct bpf_insn_pos *pos,
+ struct probe_trace_arg *args, int nargs)
+{
+ int i, err = 0;
+
+ for (i = 0; i < nargs; i++) {
+ err = gen_ldx_reg_from_ctx(pos, BPF_REG_1, args[i].value,
+ BPF_PROLOGUE_START_ARG_REG + i);
+ if (err)
+ goto errout;
+ }
+
+ return check_pos(pos);
+errout:
+ return err;
+}
+
+/*
+ * Slow path:
+ * At least one argument has the form of 'offset($rx)'.
+ *
+ * Following code first stores them into stack, then loads all of then
+ * to r2 - r5.
+ * Before final loading, the final result should be:
+ *
+ * low address
+ * BPF_REG_FP - 24 ARG3
+ * BPF_REG_FP - 16 ARG2
+ * BPF_REG_FP - 8 ARG1
+ * BPF_REG_FP
+ * high address
+ *
+ * For each argument (described as: offn(...off2(off1(reg)))),
+ * generates following code:
+ *
+ * r7 <- fp
+ * r7 <- r7 - stack_offset // Ideal code should initialize r7 using
+ * // fp before generating args. However,
+ * // eBPF won't regard r7 as stack pointer
+ * // if it is generated by minus 8 from
+ * // another stack pointer except fp.
+ * // This is why we have to set r7
+ * // to fp for each variable.
+ * r3 <- value of 'reg'-> generated using gen_ldx_reg_from_ctx()
+ * (r7) <- r3 // skip following instructions for bare reg
+ * r3 <- r3 + off1 . // skip if off1 == 0
+ * r2 <- 8 \
+ * r1 <- r7 |-> generated by gen_read_mem()
+ * call probe_read /
+ * jnei r0, 0, err ./
+ * r3 <- (r7)
+ * r3 <- r3 + off2 . // skip if off2 == 0
+ * r2 <- 8 \ // r2 may be broken by probe_read, so set again
+ * r1 <- r7 |-> generated by gen_read_mem()
+ * call probe_read /
+ * jnei r0, 0, err ./
+ * ...
+ */
+static int
+gen_prologue_slowpath(struct bpf_insn_pos *pos,
+ struct probe_trace_arg *args, int nargs)
+{
+ int err, i;
+
+ for (i = 0; i < nargs; i++) {
+ struct probe_trace_arg *arg = &args[i];
+ const char *reg = arg->value;
+ struct probe_trace_arg_ref *ref = NULL;
+ int stack_offset = (i + 1) * -8;
+
+ pr_debug("prologue: fetch arg %d, base reg is %s\n",
+ i, reg);
+
+ /* value of base register is stored into ARG3 */
+ err = gen_ldx_reg_from_ctx(pos, BPF_REG_CTX, reg,
+ BPF_REG_ARG3);
+ if (err) {
+ pr_err("prologue: failed to get offset of register %s\n",
+ reg);
+ goto errout;
+ }
+
+ /* Make r7 the stack pointer. */
+ ins(BPF_MOV64_REG(BPF_REG_7, BPF_REG_FP), pos);
+ /* r7 += -8 */
+ ins(BPF_ALU64_IMM(BPF_ADD, BPF_REG_7, stack_offset), pos);
+ /*
+ * Store r3 (base register) onto stack
+ * Ensure fp[offset] is set.
+ * fp is the only valid base register when storing
+ * into stack. We are not allowed to use r7 as base
+ * register here.
+ */
+ ins(BPF_STX_MEM(BPF_DW, BPF_REG_FP, BPF_REG_ARG3,
+ stack_offset), pos);
+
+ ref = arg->ref;
+ while (ref) {
+ pr_debug("prologue: arg %d: offset %ld\n",
+ i, ref->offset);
+ err = gen_read_mem(pos, BPF_REG_3, BPF_REG_7,
+ ref->offset);
+ if (err) {
+ pr_err("prologue: failed to generate probe_read function call\n");
+ goto errout;
+ }
+
+ ref = ref->next;
+ /*
+ * Load previous result into ARG3. Use
+ * BPF_REG_FP instead of r7 because verifier
+ * allows FP based addressing only.
+ */
+ if (ref)
+ ins(BPF_LDX_MEM(BPF_DW, BPF_REG_ARG3,
+ BPF_REG_FP, stack_offset), pos);
+ }
+ }
+
+ /* Final pass: read to registers */
+ for (i = 0; i < nargs; i++)
+ ins(BPF_LDX_MEM(BPF_DW, BPF_PROLOGUE_START_ARG_REG + i,
+ BPF_REG_FP, -BPF_REG_SIZE * (i + 1)), pos);
+
+ ins(BPF_JMP_IMM(BPF_JA, BPF_REG_0, 0, JMP_TO_SUCCESS_CODE), pos);
+
+ return check_pos(pos);
+errout:
+ return err;
+}
+
+static int
+prologue_relocate(struct bpf_insn_pos *pos, struct bpf_insn *error_code,
+ struct bpf_insn *success_code, struct bpf_insn *user_code)
+{
+ struct bpf_insn *insn;
+
+ if (check_pos(pos))
+ return -BPF_LOADER_ERRNO__PROLOGUE2BIG;
+
+ for (insn = pos->begin; insn < pos->pos; insn++) {
+ struct bpf_insn *target;
+ u8 class = BPF_CLASS(insn->code);
+ u8 opcode;
+
+ if (class != BPF_JMP)
+ continue;
+ opcode = BPF_OP(insn->code);
+ if (opcode == BPF_CALL)
+ continue;
+
+ switch (insn->off) {
+ case JMP_TO_ERROR_CODE:
+ target = error_code;
+ break;
+ case JMP_TO_SUCCESS_CODE:
+ target = success_code;
+ break;
+ case JMP_TO_USER_CODE:
+ target = user_code;
+ break;
+ default:
+ pr_err("bpf prologue: internal error: relocation failed\n");
+ return -BPF_LOADER_ERRNO__PROLOGUE;
+ }
+
+ insn->off = target - (insn + 1);
+ }
+ return 0;
+}
+
+int bpf__gen_prologue(struct probe_trace_arg *args, int nargs,
+ struct bpf_insn *new_prog, size_t *new_cnt,
+ size_t cnt_space)
+{
+ struct bpf_insn *success_code = NULL;
+ struct bpf_insn *error_code = NULL;
+ struct bpf_insn *user_code = NULL;
+ struct bpf_insn_pos pos;
+ bool fastpath = true;
+ int err = 0, i;
+
+ if (!new_prog || !new_cnt)
+ return -EINVAL;
+
+ if (cnt_space > BPF_MAXINSNS)
+ cnt_space = BPF_MAXINSNS;
+
+ pos.begin = new_prog;
+ pos.end = new_prog + cnt_space;
+ pos.pos = new_prog;
+
+ if (!nargs) {
+ ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 0),
+ &pos);
+
+ if (check_pos(&pos))
+ goto errout;
+
+ *new_cnt = pos_get_cnt(&pos);
+ return 0;
+ }
+
+ if (nargs > BPF_PROLOGUE_MAX_ARGS) {
+ pr_warning("bpf: prologue: %d arguments are dropped\n",
+ nargs - BPF_PROLOGUE_MAX_ARGS);
+ nargs = BPF_PROLOGUE_MAX_ARGS;
+ }
+
+ /* First pass: validation */
+ for (i = 0; i < nargs; i++) {
+ struct probe_trace_arg_ref *ref = args[i].ref;
+
+ if (args[i].value[0] == '@') {
+ /* TODO: fetch global variable */
+ pr_err("bpf: prologue: global %s%+ld not support\n",
+ args[i].value, ref ? ref->offset : 0);
+ return -ENOTSUP;
+ }
+
+ while (ref) {
+ /* fastpath is true if all args has ref == NULL */
+ fastpath = false;
+
+ /*
+ * Instruction encodes immediate value using
+ * s32, ref->offset is long. On systems which
+ * can't fill long in s32, refuse to process if
+ * ref->offset too large (or small).
+ */
+#ifdef __LP64__
+#define OFFSET_MAX ((1LL << 31) - 1)
+#define OFFSET_MIN ((1LL << 31) * -1)
+ if (ref->offset > OFFSET_MAX ||
+ ref->offset < OFFSET_MIN) {
+ pr_err("bpf: prologue: offset out of bound: %ld\n",
+ ref->offset);
+ return -BPF_LOADER_ERRNO__PROLOGUEOOB;
+ }
+#endif
+ ref = ref->next;
+ }
+ }
+ pr_debug("prologue: pass validation\n");
+
+ if (fastpath) {
+ /* If all variables are registers... */
+ pr_debug("prologue: fast path\n");
+ err = gen_prologue_fastpath(&pos, args, nargs);
+ if (err)
+ goto errout;
+ } else {
+ pr_debug("prologue: slow path\n");
+
+ /* Initialization: move ctx to a callee saved register. */
+ ins(BPF_MOV64_REG(BPF_REG_CTX, BPF_REG_ARG1), &pos);
+
+ err = gen_prologue_slowpath(&pos, args, nargs);
+ if (err)
+ goto errout;
+ /*
+ * start of ERROR_CODE (only slow pass needs error code)
+ * mov r2 <- 1 // r2 is error number
+ * mov r3 <- 0 // r3, r4... should be touched or
+ * // verifier would complain
+ * mov r4 <- 0
+ * ...
+ * goto usercode
+ */
+ error_code = pos.pos;
+ ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 1),
+ &pos);
+
+ for (i = 0; i < nargs; i++)
+ ins(BPF_ALU64_IMM(BPF_MOV,
+ BPF_PROLOGUE_START_ARG_REG + i,
+ 0),
+ &pos);
+ ins(BPF_JMP_IMM(BPF_JA, BPF_REG_0, 0, JMP_TO_USER_CODE),
+ &pos);
+ }
+
+ /*
+ * start of SUCCESS_CODE:
+ * mov r2 <- 0
+ * goto usercode // skip
+ */
+ success_code = pos.pos;
+ ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 0), &pos);
+
+ /*
+ * start of USER_CODE:
+ * Restore ctx to r1
+ */
+ user_code = pos.pos;
+ if (!fastpath) {
+ /*
+ * Only slow path needs restoring of ctx. In fast path,
+ * register are loaded directly from r1.
+ */
+ ins(BPF_MOV64_REG(BPF_REG_ARG1, BPF_REG_CTX), &pos);
+ err = prologue_relocate(&pos, error_code, success_code,
+ user_code);
+ if (err)
+ goto errout;
+ }
+
+ err = check_pos(&pos);
+ if (err)
+ goto errout;
+
+ *new_cnt = pos_get_cnt(&pos);
+ return 0;
+errout:
+ return err;
+}
diff --git a/tools/perf/util/bpf-prologue.h b/tools/perf/util/bpf-prologue.h
new file mode 100644
index 0000000..d94cbea
--- /dev/null
+++ b/tools/perf/util/bpf-prologue.h
@@ -0,0 +1,34 @@
+/*
+ * Copyright (C) 2015, He Kuang <[email protected]>
+ * Copyright (C) 2015, Huawei Inc.
+ */
+#ifndef __BPF_PROLOGUE_H
+#define __BPF_PROLOGUE_H
+
+#include <linux/compiler.h>
+#include <linux/filter.h>
+#include "probe-event.h"
+
+#define BPF_PROLOGUE_MAX_ARGS 3
+#define BPF_PROLOGUE_START_ARG_REG BPF_REG_3
+#define BPF_PROLOGUE_FETCH_RESULT_REG BPF_REG_2
+
+#ifdef HAVE_BPF_PROLOGUE
+int bpf__gen_prologue(struct probe_trace_arg *args, int nargs,
+ struct bpf_insn *new_prog, size_t *new_cnt,
+ size_t cnt_space);
+#else
+static inline int
+bpf__gen_prologue(struct probe_trace_arg *args __maybe_unused,
+ int nargs __maybe_unused,
+ struct bpf_insn *new_prog __maybe_unused,
+ size_t *new_cnt,
+ size_t cnt_space __maybe_unused)
+{
+ if (!new_cnt)
+ return -EINVAL;
+ *new_cnt = 0;
+ return -ENOTSUP;
+}
+#endif
+#endif /* __BPF_PROLOGUE_H */
--
1.8.3.4

2015-11-13 12:33:57

by Wang Nan

[permalink] [raw]
Subject: [PATCH 10/12] perf tools: Generate prologue for BPF programs

This patch generates prologue for each 'struct probe_trace_event' for
fetching arguments for BPF programs.

After bpf__probe(), iterate over each programs to check whether
prologue is required. If none of 'struct perf_probe_event' a program
will attach to has at least one argument, simply skip preprocessor
hooking. For those who prologue is required, calls bpf__gen_prologue()
and paste original instruction after prologue.

Signed-off-by: Wang Nan <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
---
tools/perf/util/bpf-loader.c | 120 ++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 119 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 99e14d1..b7587c7 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -5,11 +5,14 @@
* Copyright (C) 2015 Huawei Inc.
*/

+#include <linux/bpf.h>
#include <bpf/libbpf.h>
#include <linux/err.h>
#include "perf.h"
#include "debug.h"
#include "bpf-loader.h"
+#include "bpf-prologue.h"
+#include "llvm-utils.h"
#include "probe-event.h"
#include "probe-finder.h" // for MAX_PROBES
#include "llvm-utils.h"
@@ -32,6 +35,8 @@ DEFINE_PRINT_FN(debug, 1)

struct bpf_prog_priv {
struct perf_probe_event pev;
+ bool need_prologue;
+ struct bpf_insn *insns_buf;
};

static bool libbpf_initialized;
@@ -106,6 +111,7 @@ bpf_prog_priv__clear(struct bpf_program *prog __maybe_unused,
struct bpf_prog_priv *priv = _priv;

cleanup_perf_probe_events(&priv->pev, 1);
+ zfree(&priv->insns_buf);
free(priv);
}

@@ -364,6 +370,102 @@ static int bpf__prepare_probe(void)
return err;
}

+static int
+preproc_gen_prologue(struct bpf_program *prog, int n,
+ struct bpf_insn *orig_insns, int orig_insns_cnt,
+ struct bpf_prog_prep_result *res)
+{
+ struct probe_trace_event *tev;
+ struct perf_probe_event *pev;
+ struct bpf_prog_priv *priv;
+ struct bpf_insn *buf;
+ size_t prologue_cnt = 0;
+ int err;
+
+ err = bpf_program__get_private(prog, (void **)&priv);
+ if (err || !priv)
+ goto errout;
+
+ pev = &priv->pev;
+
+ if (n < 0 || n >= pev->ntevs)
+ goto errout;
+
+ tev = &pev->tevs[n];
+
+ buf = priv->insns_buf;
+ err = bpf__gen_prologue(tev->args, tev->nargs,
+ buf, &prologue_cnt,
+ BPF_MAXINSNS - orig_insns_cnt);
+ if (err) {
+ const char *title;
+
+ title = bpf_program__title(prog, false);
+ if (!title)
+ title = "[unknown]";
+
+ pr_debug("Failed to generate prologue for program %s\n",
+ title);
+ return err;
+ }
+
+ memcpy(&buf[prologue_cnt], orig_insns,
+ sizeof(struct bpf_insn) * orig_insns_cnt);
+
+ res->new_insn_ptr = buf;
+ res->new_insn_cnt = prologue_cnt + orig_insns_cnt;
+ res->pfd = NULL;
+ return 0;
+
+errout:
+ pr_debug("Internal error in preproc_gen_prologue\n");
+ return -BPF_LOADER_ERRNO__PROLOGUE;
+}
+
+static int hook_load_preprocessor(struct bpf_program *prog)
+{
+ struct perf_probe_event *pev;
+ struct bpf_prog_priv *priv;
+ bool need_prologue = false;
+ int err, i;
+
+ err = bpf_program__get_private(prog, (void **)&priv);
+ if (err || !priv) {
+ pr_debug("Internal error when hook preprocessor\n");
+ return -BPF_LOADER_ERRNO__INTERNAL;
+ }
+
+ pev = &priv->pev;
+ for (i = 0; i < pev->ntevs; i++) {
+ struct probe_trace_event *tev = &pev->tevs[i];
+
+ if (tev->nargs > 0) {
+ need_prologue = true;
+ break;
+ }
+ }
+
+ /*
+ * Since all tevs don't have argument, we don't need generate
+ * prologue.
+ */
+ if (!need_prologue) {
+ priv->need_prologue = false;
+ return 0;
+ }
+
+ priv->need_prologue = true;
+ priv->insns_buf = malloc(sizeof(struct bpf_insn) * BPF_MAXINSNS);
+ if (!priv->insns_buf) {
+ pr_debug("No enough memory: alloc insns_buf failed\n");
+ return -ENOMEM;
+ }
+
+ err = bpf_program__set_prep(prog, pev->ntevs,
+ preproc_gen_prologue);
+ return err;
+}
+
int bpf__probe(struct bpf_object *obj)
{
int err = 0;
@@ -398,6 +500,18 @@ int bpf__probe(struct bpf_object *obj)
pr_debug("bpf_probe: failed to apply perf probe events");
goto out;
}
+
+ /*
+ * After probing, let's consider prologue, which
+ * adds program fetcher to BPF programs.
+ *
+ * hook_load_preprocessorr() hooks pre-processor
+ * to bpf_program, let it generate prologue
+ * dynamically during loading.
+ */
+ err = hook_load_preprocessor(prog);
+ if (err)
+ goto out;
}
out:
return err < 0 ? err : 0;
@@ -481,7 +595,11 @@ int bpf__foreach_tev(struct bpf_object *obj,
for (i = 0; i < pev->ntevs; i++) {
tev = &pev->tevs[i];

- fd = bpf_program__fd(prog);
+ if (priv->need_prologue)
+ fd = bpf_program__nth_fd(prog, i);
+ else
+ fd = bpf_program__fd(prog);
+
if (fd < 0) {
pr_debug("bpf: failed to get file descriptor\n");
return fd;
--
1.8.3.4

2015-11-13 12:30:50

by Wang Nan

[permalink] [raw]
Subject: [PATCH 11/12] perf test: Test BPF prologue

This patch introduces a new BPF script to test BPF prologue. The new
script probes at null_lseek, which is the function pointer when we try
to lseek on '/dev/null'.

null_lseek is chosen because it is a function pointer, so we don't need
to consider inlining and LTO.

By extracting file->f_mode, bpf-script-test-prologue.c should know whether
the file is writable or readonly. According to llseek_loop() and
bpf-script-test-prologue.c, one forth of total lseeks should be collected.

Signed-off-by: Wang Nan <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Brendan Gregg <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: David Ahern <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kaixu Xia <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/n/[email protected]
---
tools/perf/tests/Build | 9 +++++++-
tools/perf/tests/bpf-script-test-prologue.c | 35 +++++++++++++++++++++++++++++
tools/perf/tests/bpf.c | 34 ++++++++++++++++++++++++++++
tools/perf/tests/llvm.c | 4 ++++
tools/perf/tests/llvm.h | 2 ++
5 files changed, 83 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/tests/bpf-script-test-prologue.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index f41ebf8..0ff8a97 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -31,7 +31,7 @@ perf-y += sample-parsing.o
perf-y += parse-no-sample-id-all.o
perf-y += kmod-path.o
perf-y += thread-map.o
-perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o
+perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o llvm-src-prologue.o
perf-y += bpf.o
perf-y += topology.o

@@ -49,6 +49,13 @@ $(OUTPUT)tests/llvm-src-kbuild.c: tests/bpf-script-test-kbuild.c
$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
$(Q)echo ';' >> $@

+$(OUTPUT)tests/llvm-src-prologue.c: tests/bpf-script-test-prologue.c
+ $(call rule_mkdir)
+ $(Q)echo '#include <tests/llvm.h>' > $@
+ $(Q)echo 'const char test_llvm__bpf_test_prologue_prog[] =' >> $@
+ $(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
+ $(Q)echo ';' >> $@
+
ifeq ($(ARCH),$(filter $(ARCH),x86 arm arm64))
perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
endif
diff --git a/tools/perf/tests/bpf-script-test-prologue.c b/tools/perf/tests/bpf-script-test-prologue.c
new file mode 100644
index 0000000..7230e62
--- /dev/null
+++ b/tools/perf/tests/bpf-script-test-prologue.c
@@ -0,0 +1,35 @@
+/*
+ * bpf-script-test-prologue.c
+ * Test BPF prologue
+ */
+#ifndef LINUX_VERSION_CODE
+# error Need LINUX_VERSION_CODE
+# error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
+#endif
+#define SEC(NAME) __attribute__((section(NAME), used))
+
+#include <uapi/linux/fs.h>
+
+#define FMODE_READ 0x1
+#define FMODE_WRITE 0x2
+
+static void (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
+ (void *) 6;
+
+SEC("func=null_lseek file->f_mode offset orig")
+int bpf_func__null_lseek(void *ctx, int err, unsigned long f_mode,
+ unsigned long offset, unsigned long orig)
+{
+ if (err)
+ return 0;
+ if (f_mode & FMODE_WRITE)
+ return 0;
+ if (offset & 1)
+ return 0;
+ if (orig == SEEK_CUR)
+ return 0;
+ return 1;
+}
+
+char _license[] SEC("license") = "GPL";
+int _version SEC("version") = LINUX_VERSION_CODE;
diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index ec16f78..c7131fa 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -19,6 +19,29 @@ static int epoll_pwait_loop(void)
return 0;
}

+#ifdef HAVE_BPF_PROLOGUE
+
+static int llseek_loop(void)
+{
+ int fds[2], i;
+
+ fds[0] = open("/dev/null", O_RDONLY);
+ fds[1] = open("/dev/null", O_RDWR);
+
+ if (fds[0] < 0 || fds[1] < 0)
+ return -1;
+
+ for (i = 0; i < NR_ITERS; i++) {
+ lseek(fds[i % 2], i, (i / 2) % 2 ? SEEK_CUR : SEEK_SET);
+ lseek(fds[(i + 1) % 2], i, (i / 2) % 2 ? SEEK_CUR : SEEK_SET);
+ }
+ close(fds[0]);
+ close(fds[1]);
+ return 0;
+}
+
+#endif
+
static struct {
enum test_llvm__testcase prog_id;
const char *desc;
@@ -37,6 +60,17 @@ static struct {
&epoll_pwait_loop,
(NR_ITERS + 1) / 2,
},
+#ifdef HAVE_BPF_PROLOGUE
+ {
+ LLVM_TESTCASE_BPF_PROLOGUE,
+ "Test BPF prologue generation",
+ "[bpf_prologue_test]",
+ "fix kbuild first",
+ "check your vmlinux setting?",
+ &llseek_loop,
+ (NR_ITERS + 1) / 4,
+ },
+#endif
};

static int do_test(struct bpf_object *obj, int (*func)(void),
diff --git a/tools/perf/tests/llvm.c b/tools/perf/tests/llvm.c
index bc4cf50..7f4f7f7 100644
--- a/tools/perf/tests/llvm.c
+++ b/tools/perf/tests/llvm.c
@@ -44,6 +44,10 @@ static struct {
.source = test_llvm__bpf_test_kbuild_prog,
.desc = "Test kbuild searching",
},
+ [LLVM_TESTCASE_BPF_PROLOGUE] = {
+ .source = test_llvm__bpf_test_prologue_prog,
+ .desc = "Test BPF prologue generation",
+ },
};


diff --git a/tools/perf/tests/llvm.h b/tools/perf/tests/llvm.h
index d91d8f4..5150b4d 100644
--- a/tools/perf/tests/llvm.h
+++ b/tools/perf/tests/llvm.h
@@ -6,10 +6,12 @@

extern const char test_llvm__bpf_base_prog[];
extern const char test_llvm__bpf_test_kbuild_prog[];
+extern const char test_llvm__bpf_test_prologue_prog[];

enum test_llvm__testcase {
LLVM_TESTCASE_BASE,
LLVM_TESTCASE_KBUILD,
+ LLVM_TESTCASE_BPF_PROLOGUE,
__LLVM_TESTCASE_MAX,
};

--
1.8.3.4

2015-11-13 12:30:26

by Wang Nan

[permalink] [raw]
Subject: [PATCH 12/12] perf tools: Use same BPF program if arguments are identical

This patch allows creating only one BPF program for different
'probe_trace_event'(tev) generated by one 'perf_probe_event'(pev), if
their prologues are identical.

This is done by comparing argument list of different tev, and maps type
of prologue and tev using a mapping array. This patch utilizes qsort to
sort tevs. After sorting, tevs with identical argument list will be
grouped together.

Test result:

Sample BPF program:

SEC("inlines=no\n"
"func=SyS_dup? oldfd")
int func(void *ctx)
{
return 1;
}

It would probe at SyS_dup2 and SyS_dup3, extracts oldfd as its argument.

Following cmdline shows BPF program loaded into kernel by perf:

# ./perf record -e ./test_bpf_arg.c sleep 4 & sleep 1 && ls /proc/$!/fd/ -l | grep bpf-prog

Before this patch:

# ./perf record -e ./test_bpf_arg.c sleep 4 & sleep 1 && ls /proc/$!/fd/ -l | grep bpf-prog
[1] 24858
lrwx------ 1 root root 64 Nov 14 04:09 3 -> anon_inode:bpf-prog
lrwx------ 1 root root 64 Nov 14 04:09 4 -> anon_inode:bpf-prog
...

After this patch:

# ./perf record -e ./test_bpf_arg.c sleep 4 & sleep 1 && ls /proc/$!/fd/ -l | grep bpf-prog
[1] 25699
lrwx------ 1 root root 64 Nov 14 04:10 3 -> anon_inode:bpf-prog
...

Signed-off-by: Wang Nan <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
---
tools/perf/util/bpf-loader.c | 136 ++++++++++++++++++++++++++++++++++++++++---
1 file changed, 129 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index b7587c7..924d5ea 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -37,6 +37,8 @@ struct bpf_prog_priv {
struct perf_probe_event pev;
bool need_prologue;
struct bpf_insn *insns_buf;
+ int nr_types;
+ int *type_mapping;
};

static bool libbpf_initialized;
@@ -112,6 +114,7 @@ bpf_prog_priv__clear(struct bpf_program *prog __maybe_unused,

cleanup_perf_probe_events(&priv->pev, 1);
zfree(&priv->insns_buf);
+ zfree(&priv->type_mapping);
free(priv);
}

@@ -380,7 +383,7 @@ preproc_gen_prologue(struct bpf_program *prog, int n,
struct bpf_prog_priv *priv;
struct bpf_insn *buf;
size_t prologue_cnt = 0;
- int err;
+ int i, err;

err = bpf_program__get_private(prog, (void **)&priv);
if (err || !priv)
@@ -388,10 +391,20 @@ preproc_gen_prologue(struct bpf_program *prog, int n,

pev = &priv->pev;

- if (n < 0 || n >= pev->ntevs)
+ if (n < 0 || n >= priv->nr_types)
goto errout;

- tev = &pev->tevs[n];
+ /* Find a tev belongs to that type */
+ for (i = 0; i < pev->ntevs; i++)
+ if (priv->type_mapping[i] == n)
+ break;
+
+ if (i >= pev->ntevs) {
+ pr_debug("Internal error: prologue type %d not found\n", n);
+ return -BPF_LOADER_ERRNO__PROLOGUE;
+ }
+
+ tev = &pev->tevs[i];

buf = priv->insns_buf;
err = bpf__gen_prologue(tev->args, tev->nargs,
@@ -422,6 +435,101 @@ errout:
return -BPF_LOADER_ERRNO__PROLOGUE;
}

+/*
+ * compare_tev_args is reflexive, transitive and antisymmetric.
+ * I can proof it but this margin is too narrow to contain.
+ */
+static int compare_tev_args(const void *ptev1, const void *ptev2)
+{
+ int i, ret;
+ const struct probe_trace_event *tev1 =
+ *(const struct probe_trace_event **)ptev1;
+ const struct probe_trace_event *tev2 =
+ *(const struct probe_trace_event **)ptev2;
+
+ ret = tev2->nargs - tev1->nargs;
+ if (ret)
+ return ret;
+
+ for (i = 0; i < tev1->nargs; i++) {
+ struct probe_trace_arg *arg1, *arg2;
+ struct probe_trace_arg_ref *ref1, *ref2;
+
+ arg1 = &tev1->args[i];
+ arg2 = &tev2->args[i];
+
+ ret = strcmp(arg1->value, arg2->value);
+ if (ret)
+ return ret;
+
+ ref1 = arg1->ref;
+ ref2 = arg2->ref;
+
+ while (ref1 && ref2) {
+ ret = ref2->offset - ref1->offset;
+ if (ret)
+ return ret;
+
+ ref1 = ref1->next;
+ ref2 = ref2->next;
+ }
+
+ if (ref1 || ref2)
+ return ref2 ? 1 : -1;
+ }
+
+ return 0;
+}
+
+/*
+ * Assign a type number to each tevs in a pev.
+ * mapping is an array with same slots as tevs in that pev.
+ * nr_types will be set to number of types.
+ */
+static int map_prologue(struct perf_probe_event *pev, int *mapping,
+ int *nr_types)
+{
+ int i, type = 0;
+ struct probe_trace_event **ptevs;
+
+ size_t array_sz = sizeof(*ptevs) * pev->ntevs;
+
+ ptevs = malloc(array_sz);
+ if (!ptevs) {
+ pr_debug("No ehough memory: alloc ptevs failed\n");
+ return -ENOMEM;
+ }
+
+ pr_debug("In map_prologue, ntevs=%d\n", pev->ntevs);
+ for (i = 0; i < pev->ntevs; i++)
+ ptevs[i] = &pev->tevs[i];
+
+ qsort(ptevs, pev->ntevs, sizeof(*ptevs),
+ compare_tev_args);
+
+ for (i = 0; i < pev->ntevs; i++) {
+ int n;
+
+ n = ptevs[i] - pev->tevs;
+ if (i == 0) {
+ mapping[n] = type;
+ pr_debug("mapping[%d]=%d\n", n, type);
+ continue;
+ }
+
+ if (compare_tev_args(ptevs + i, ptevs + i - 1) == 0)
+ mapping[n] = type;
+ else
+ mapping[n] = ++type;
+
+ pr_debug("mapping[%d]=%d\n", n, mapping[n]);
+ }
+ free(ptevs);
+ *nr_types = type + 1;
+
+ return 0;
+}
+
static int hook_load_preprocessor(struct bpf_program *prog)
{
struct perf_probe_event *pev;
@@ -461,7 +569,19 @@ static int hook_load_preprocessor(struct bpf_program *prog)
return -ENOMEM;
}

- err = bpf_program__set_prep(prog, pev->ntevs,
+ priv->type_mapping = malloc(sizeof(int) * pev->ntevs);
+ if (!priv->type_mapping) {
+ pr_debug("No enough memory: alloc type_mapping failed\n");
+ return -ENOMEM;
+ }
+ memset(priv->type_mapping, 0xff,
+ sizeof(int) * pev->ntevs);
+
+ err = map_prologue(pev, priv->type_mapping, &priv->nr_types);
+ if (err)
+ return err;
+
+ err = bpf_program__set_prep(prog, priv->nr_types,
preproc_gen_prologue);
return err;
}
@@ -595,9 +715,11 @@ int bpf__foreach_tev(struct bpf_object *obj,
for (i = 0; i < pev->ntevs; i++) {
tev = &pev->tevs[i];

- if (priv->need_prologue)
- fd = bpf_program__nth_fd(prog, i);
- else
+ if (priv->need_prologue) {
+ int type = priv->type_mapping[i];
+
+ fd = bpf_program__nth_fd(prog, type);
+ } else
fd = bpf_program__fd(prog);

if (fd < 0) {
--
1.8.3.4

2015-11-13 15:39:48

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 03/12] perf tools: Allow BPF program attach to uprobe events

Em Fri, Nov 13, 2015 at 12:29:12PM +0000, Wang Nan escreveu:
> This patch appends new syntax to BPF object section name to support
> probing at uprobe event. Now we can use BPF program like this:
>
> SEC(
> "exec=/lib64/libc.so.6\n"
> "libcwrite=__write"
> )
> int libcwrite(void *ctx)
> {
> return 1;
> }
>
> Where, in section name of a program, before the main config string,
> we can use 'key=value' style options. Now the only option key "exec"
> is for uprobe probing.
>
> Signed-off-by: Wang Nan <[email protected]>
> Cc: Alexei Starovoitov <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Masami Hiramatsu <[email protected]>
> Cc: Zefan Li <[email protected]>
> Cc: [email protected]
> ---
> tools/perf/util/bpf-loader.c | 122 ++++++++++++++++++++++++++++++++++++++++---
> tools/perf/util/bpf-loader.h | 1 +
> 2 files changed, 117 insertions(+), 6 deletions(-)
>
> diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
> index 4c50411..5f5505d 100644
> --- a/tools/perf/util/bpf-loader.c
> +++ b/tools/perf/util/bpf-loader.c
> @@ -110,6 +110,115 @@ bpf_prog_priv__clear(struct bpf_program *prog __maybe_unused,
> }
>
> static int
> +config__exec(const char *value, struct perf_probe_event *pev)
> +{
> + pev->uprobes = true;
> + pev->target = strdup(value);

Shouldn't we check if this fails and return some failure error
accordingly?

> + return 0;
> +}
> +
> +static struct {
> + const char *key;
> + const char *usage;
> + const char *desc;
> + int (*func)(const char *, struct perf_probe_event *);
> +} bpf_config_terms[] = {
> + {
> + "exec",
> + "exec=<full path of file>",
> + "Set uprobe target",
> + config__exec,
> + },

Even with the definition of the struct right above it, please use named
initializers, i.e. turn the avove into:

+ {
+ .key = "exec",
+ .usage = "exec=<full path of file>",
+ .desc = "Set uprobe target",
+ .func = config__exec,
+ },

> +};
> +
> +static int
> +do_config(const char *key, const char *value,
> + struct perf_probe_event *pev)
> +{
> + unsigned int i;
> +
> + pr_debug("config bpf program: %s=%s\n", key, value);
> + for (i = 0; i < ARRAY_SIZE(bpf_config_terms); i++)
> + if (strcmp(key, bpf_config_terms[i].key) == 0)
> + return bpf_config_terms[i].func(value, pev);
> +
> + pr_debug("BPF: ERROR: invalid config option in object: %s=%s\n",
> + key, value);
> +
> + pr_debug("\nHint: Currently valid options are:\n");
> + for (i = 0; i < ARRAY_SIZE(bpf_config_terms); i++)
> + pr_debug("\t%s:\t%s\n", bpf_config_terms[i].usage,
> + bpf_config_terms[i].desc);
> + pr_debug("\n");
> +
> + return -BPF_LOADER_ERRNO__CONFIG_TERM;
> +}
> +
> +static const char *
> +parse_config_kvpair(const char *config_str, struct perf_probe_event *pev)
> +{
> + char *text = strdup(config_str);
> + char *sep, *line;
> + const char *main_str = NULL;
> + int err = 0;
> +
> + if (!text) {

See, here you do it correctly, checking the strdup() return, providing
some debugging for -v usage _and_ the right ERR_PTR() return, well done!

> + pr_debug("No enough memory: dup config_str failed\n");
> + return ERR_PTR(-ENOMEM);
> + }
> +
> + line = text;
> + while ((sep = strchr(line, '\n'))) {
> + char *equ;
> +
> + *sep = '\0';
> + equ = strchr(line, '=');
> + if (!equ) {
> + pr_warning("WARNING: invalid config in BPF object: %s\n",
> + line);
> + pr_warning("\tShould be 'key=value'.\n");
> + goto nextline;
> + }
> + *equ = '\0';
> +
> + err = do_config(line, equ + 1, pev);
> + if (err)
> + break;
> +nextline:
> + line = sep + 1;
> + }
> +
> + if (!err)
> + main_str = config_str + (line - text);
> + free(text);
> +
> + if (err)
> + return ERR_PTR(err);
> +
> + return main_str;


No biggie, but doing:

return err ? ERR_PTR(err) : main_str;

Is more compact and clear :-)

> +}
> +
> +static int
> +parse_config(const char *config_str, struct perf_probe_event *pev)
> +{
> + const char *main_str;
> + int err;
> +
> + main_str = parse_config_kvpair(config_str, pev);

Also just for compactness, just a suggestion, doing:

const char *main_str = parse_config_kvpair(config_str, pev);

Saves screen real state :-)

> + if (IS_ERR(main_str))
> + return PTR_ERR(main_str);
> +
> + err = parse_perf_probe_command(main_str, pev);
> + if (err < 0) {
> + pr_debug("bpf: '%s' is not a valid config string\n",
> + config_str);
> + /* parse failed, don't need clear pev. */
> + return -BPF_LOADER_ERRNO__CONFIG;
> + }
> + return 0;
> +}
> +
> +static int
> config_bpf_program(struct bpf_program *prog)
> {
> struct perf_probe_event *pev = NULL;
> @@ -131,13 +240,9 @@ config_bpf_program(struct bpf_program *prog)
> pev = &priv->pev;
>
> pr_debug("bpf: config program '%s'\n", config_str);
> - err = parse_perf_probe_command(config_str, pev);
> - if (err < 0) {
> - pr_debug("bpf: '%s' is not a valid config string\n",
> - config_str);
> - err = -BPF_LOADER_ERRNO__CONFIG;
> + err = parse_config(config_str, pev);
> + if (err)
> goto errout;
> - }
>
> if (pev->group && strcmp(pev->group, PERF_BPF_PROBE_GROUP)) {
> pr_debug("bpf: '%s': group for event is set and not '%s'.\n",
> @@ -340,6 +445,7 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
> [ERRCODE_OFFSET(EVENTNAME)] = "No event name found in config string",
> [ERRCODE_OFFSET(INTERNAL)] = "BPF loader internal error",
> [ERRCODE_OFFSET(COMPILE)] = "Error when compiling BPF scriptlet",
> + [ERRCODE_OFFSET(CONFIG_TERM)] = "Invalid config term in config string",
> };
>
> static int
> @@ -420,6 +526,10 @@ int bpf__strerror_probe(struct bpf_object *obj __maybe_unused,
> int err, char *buf, size_t size)
> {
> bpf__strerror_head(err, buf, size);
> + case BPF_LOADER_ERRNO__CONFIG_TERM: {
> + scnprintf(buf, size, "%s (add -v to see detail)", emsg);
> + break;
> + }
> bpf__strerror_entry(EEXIST, "Probe point exist. Try use 'perf probe -d \"*\"'");
> bpf__strerror_entry(EACCES, "You need to be root");
> bpf__strerror_entry(EPERM, "You need to be root, and /proc/sys/kernel/kptr_restrict should be 0");
> diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
> index 9caf3ae..d19f5c5 100644
> --- a/tools/perf/util/bpf-loader.h
> +++ b/tools/perf/util/bpf-loader.h
> @@ -20,6 +20,7 @@ enum bpf_loader_errno {
> BPF_LOADER_ERRNO__EVENTNAME, /* Event name is missing */
> BPF_LOADER_ERRNO__INTERNAL, /* BPF loader internal error */
> BPF_LOADER_ERRNO__COMPILE, /* Error when compiling BPF scriptlet */
> + BPF_LOADER_ERRNO__CONFIG_TERM, /* Invalid config term in config term */
> __BPF_LOADER_ERRNO__END,
> };
>
> --
> 1.8.3.4

2015-11-13 15:40:20

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 04/12] perf tools: Allow BPF program attach to modules

Em Fri, Nov 13, 2015 at 12:29:13PM +0000, Wang Nan escreveu:
> By extending the syntax of BPF object section names, this patch allows
> user to attach BPF programs to symbol in modules. For example:
>
> SEC("module=i915\n"
> "parse_cmds=i915_parse_cmds")
> int parse_cmds(void *ctx)
> {
> return 1;
> }
>
> Implementation is very simple: like what 'perf probe' does, for module,
> fill 'uprobe' field in 'struct perf_probe_event'. Other parts would be
> done automatically.
>
> Signed-off-by: Wang Nan <[email protected]>
> Cc: Alexei Starovoitov <[email protected]>
> Cc: Brendan Gregg <[email protected]>
> Cc: Daniel Borkmann <[email protected]>
> Cc: David Ahern <[email protected]>
> Cc: He Kuang <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Kaixu Xia <[email protected]>
> Cc: Masami Hiramatsu <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Zefan Li <[email protected]>
> Cc: [email protected]
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Link: http://lkml.kernel.org/n/[email protected]
> ---
> tools/perf/util/bpf-loader.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
> index 5f5505d..8d78785 100644
> --- a/tools/perf/util/bpf-loader.c
> +++ b/tools/perf/util/bpf-loader.c
> @@ -117,6 +117,14 @@ config__exec(const char *value, struct perf_probe_event *pev)
> return 0;
> }
>
> +static int
> +config__module(const char *value, struct perf_probe_event *pev)
> +{
> + pev->uprobes = false;
> + pev->target = strdup(value);
> + return 0;
> +}

check strdup return

> +
> static struct {
> const char *key;
> const char *usage;
> @@ -129,6 +137,12 @@ static struct {
> "Set uprobe target",
> config__exec,
> },
> + {
> + "module",
> + "module=<module name> ",
> + "Set kprobe module",
> + config__module,
> + }

Named initializers, please.

> };
>
> static int
> --
> 1.8.3.4

2015-11-13 15:46:36

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 05/12] perf tools: Allow BPF program config probing options

Em Fri, Nov 13, 2015 at 12:29:14PM +0000, Wang Nan escreveu:
> By extending the syntax of BPF object section names, this patch allows
> user to config probing options like what they can do in 'perf probe'.
>
> Test result:
>
> For following BPF file bpf.c:
>
> SEC("inlines=no\n"
> "func=SyS_dup?")
> int func(void *ctx)
> {
> return 1;
> }
>
> Cmdline:
>
> # ./perf record -e ./test_probe_glob.c ls /
> ...
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.013 MB perf.data ]
> # ./perf evlist
> perf_bpf_probe:func_1
> perf_bpf_probe:func
>
> Change "inlines=no" to "inlines=yes":
>
> Cmdline:
>
> # ./perf record -e ./test_probe_glob.c ls /
> ...
> [ perf record: Woken up 2 times to write data ]
> [ perf record: Captured and wrote 0.013 MB perf.data ]
> # ./perf evlist
> perf_bpf_probe:func_3
> perf_bpf_probe:func_2
> perf_bpf_probe:func_1
> perf_bpf_probe:func
>
> Signed-off-by: Wang Nan <[email protected]>
> Cc: Alexei Starovoitov <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Masami Hiramatsu <[email protected]>
> Cc: Zefan Li <[email protected]>
> Cc: [email protected]
> ---
> tools/perf/util/bpf-loader.c | 50 +++++++++++++++++++++++++++++++++++++++++++-
> tools/perf/util/config.c | 9 ++++----
> tools/perf/util/util.c | 18 ++++++++++++++++
> tools/perf/util/util.h | 2 ++
> 4 files changed, 74 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
> index 8d78785..a368ead 100644
> --- a/tools/perf/util/bpf-loader.c
> +++ b/tools/perf/util/bpf-loader.c
> @@ -125,6 +125,38 @@ config__module(const char *value, struct perf_probe_event *pev)
> return 0;
> }
>
> +static int
> +config__bool(const char *value,
> + bool *pbool, bool invert)
> +{
> + int err;
> + bool bool_value;
> +
> + if (!pbool)
> + return -EINVAL;
> +
> + err = convert_str_to_bool(value, &bool_value);
> + if (err)
> + return err;
> +
> + *pbool = invert ? !bool_value : bool_value;
> + return 0;


if (!err)
*pbool = invert ? !bool_value : bool_value;

return err;

But again, no strict requirement, just a more compact form :)

> +}
> +
> +static int
> +config__inlines(const char *value,
> + struct perf_probe_event *pev __maybe_unused)
> +{
> + return config__bool(value, &probe_conf.no_inlines, true);
> +}
> +
> +static int
> +config__force(const char *value,
> + struct perf_probe_event *pev __maybe_unused)
> +{
> + return config__bool(value, &probe_conf.force_add, false);
> +}
> +
> static struct {
> const char *key;
> const char *usage;
> @@ -142,7 +174,19 @@ static struct {
> "module=<module name> ",
> "Set kprobe module",
> config__module,
> - }
> + },
> + {
> + "inlines",
> + "inlines=[yes|no] ",
> + "Probe at inline symbol",
> + config__inlines,
> + },
> + {
> + "force",
> + "force=[yes|no] ",
> + "Forcibly add events with existing name",
> + config__force,
> + },

Named initializers for both, please

> };
>
> static int
> @@ -240,6 +284,10 @@ config_bpf_program(struct bpf_program *prog)
> const char *config_str;
> int err;
>
> + /* Initialize per-program probing setting */
> + probe_conf.no_inlines = false;
> + probe_conf.force_add = false;
> +
> config_str = bpf_program__title(prog, false);
> if (IS_ERR(config_str)) {
> pr_debug("bpf: unable to get title for program\n");
> diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
> index 2e452ac..8219798 100644
> --- a/tools/perf/util/config.c
> +++ b/tools/perf/util/config.c
> @@ -351,15 +351,16 @@ int perf_config_int(const char *name, const char *value)
>
> static int perf_config_bool_or_int(const char *name, const char *value, int *is_bool)
> {
> + bool str_bool;
> +
> *is_bool = 1;
> if (!value)
> return 1;
> if (!*value)
> return 0;
> - if (!strcasecmp(value, "true") || !strcasecmp(value, "yes") || !strcasecmp(value, "on"))
> - return 1;
> - if (!strcasecmp(value, "false") || !strcasecmp(value, "no") || !strcasecmp(value, "off"))
> - return 0;
> +
> + if (convert_str_to_bool(value, &str_bool) == 0)
> + return str_bool ? 1 : 0;
> *is_bool = 0;
> return perf_config_int(name, value);
> }
> diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
> index 47b1e36..55785d5 100644
> --- a/tools/perf/util/util.c
> +++ b/tools/perf/util/util.c
> @@ -695,3 +695,21 @@ fetch_kernel_version(unsigned int *puint, char *str,
> *puint = (version << 16) + (patchlevel << 8) + sublevel;
> return 0;
> }
> +
> +int convert_str_to_bool(const char *str, bool *result)

strtobool() should be more compact and convey the same idea....

Hey, I googled for that name and guess what, the kernel has exactly this
function:

lib/string.c

/**
* strtobool - convert common user inputs into boolean values
* @s: input string
* @res: result
*
* This routine returns 0 iff the first character is one of 'Yy1Nn0'.
* Otherwise it will return -EINVAL. Value pointed to by res is
* updated upon finding a match.
*/

include/linux/string.h

So, please add it to tools/include/linux/string.h and
tools/lib/util/string.c, this way we use the same code as the kernel,
with the same function signature, etc.

Also this looks like a good thing to have on a separate patch, one that
introduces strtobool(), then the rest of this patch.

- Arnaldo

> +{
> + if (!result || !str)
> + return -EINVAL;
> +
> + if (!strcasecmp(str, "true") || !strcasecmp(str, "yes") || !strcasecmp(str, "on")) {
> + *result = true;
> + return 0;
> + }
> +
> + if (!strcasecmp(str, "false") || !strcasecmp(str, "no") || !strcasecmp(str, "off")) {
> + *result = false;
> + return 0;
> + }
> +
> + return -EINVAL;
> +}
> diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
> index dcc6590..be90932 100644
> --- a/tools/perf/util/util.h
> +++ b/tools/perf/util/util.h
> @@ -358,4 +358,6 @@ int fetch_kernel_version(unsigned int *puint,
> #define KVER_FMT "%d.%d.%d"
> #define KVER_PARAM(x) KVER_VERSION(x), KVER_PATCHLEVEL(x), KVER_SUBLEVEL(x)
>
> +int convert_str_to_bool(const char *str, bool *result);
> +
> #endif /* GIT_COMPAT_UTIL_H */
> --
> 1.8.3.4

2015-11-13 15:50:11

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 02/12] perf probe: Clear probe_trace_event when add_probe_trace_event() fails

Em Fri, Nov 13, 2015 at 12:29:11PM +0000, Wang Nan escreveu:
> When probe with glob, error in add_probe_trace_event() won't be passed
> to debuginfo__find_trace_events() because it whould be modified by
> probe_point_search_cb(). It causes segfault if perf failed to find
> argument for one probing point matched by the glob. For example:

I applied the first two patches in this series, after testing them by
following the before/after instructions, thank you for providing it!

I'm now going from patch 6 onwards, just FYI.

- Arnaldo

2015-11-16 09:13:09

by Wang Nan

[permalink] [raw]
Subject: Re: [PATCH 05/12] perf tools: Allow BPF program config probing options



On 2015/11/13 23:46, Arnaldo Carvalho de Melo wrote:
> Em Fri, Nov 13, 2015 at 12:29:14PM +0000, Wang Nan escreveu:
>> By extending the syntax of BPF object section names, this patch allows
>> user to config probing options like what they can do in 'perf probe'.
>>
>> Test result:
>>
>> For following BPF file bpf.c:
>>
>> SEC("inlines=no\n"
>> "func=SyS_dup?")
>> int func(void *ctx)
>> {
>> return 1;
>> }
>>
>> Cmdline:
>>
>> # ./perf record -e ./test_probe_glob.c ls /
>> ...
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.013 MB perf.data ]
>> # ./perf evlist
>> perf_bpf_probe:func_1
>> perf_bpf_probe:func
>>
>> Change "inlines=no" to "inlines=yes":
>>
>> Cmdline:
>>
>> # ./perf record -e ./test_probe_glob.c ls /
>> ...
>> [ perf record: Woken up 2 times to write data ]
>> [ perf record: Captured and wrote 0.013 MB perf.data ]
>> # ./perf evlist
>> perf_bpf_probe:func_3
>> perf_bpf_probe:func_2
>> perf_bpf_probe:func_1
>> perf_bpf_probe:func
>>
>> Signed-off-by: Wang Nan <[email protected]>
>> Cc: Alexei Starovoitov <[email protected]>
>> Cc: Arnaldo Carvalho de Melo <[email protected]>
>> Cc: Masami Hiramatsu <[email protected]>
>> Cc: Zefan Li <[email protected]>
>> Cc: [email protected]
>> ---

[SNIP]

>> diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
>> index 47b1e36..55785d5 100644
>> --- a/tools/perf/util/util.c
>> +++ b/tools/perf/util/util.c
>> @@ -695,3 +695,21 @@ fetch_kernel_version(unsigned int *puint, char *str,
>> *puint = (version << 16) + (patchlevel << 8) + sublevel;
>> return 0;
>> }
>> +
>> +int convert_str_to_bool(const char *str, bool *result)
> strtobool() should be more compact and convey the same idea....
>
> Hey, I googled for that name and guess what, the kernel has exactly this
> function:
>
> lib/string.c
>
> /**
> * strtobool - convert common user inputs into boolean values
> * @s: input string
> * @res: result
> *
> * This routine returns 0 iff the first character is one of 'Yy1Nn0'.
> * Otherwise it will return -EINVAL. Value pointed to by res is
> * updated upon finding a match.
> */
>
> include/linux/string.h
>
> So, please add it to tools/include/linux/string.h and
> tools/lib/util/string.c, this way we use the same code as the kernel,
> with the same function signature, etc.

Good suggestion. One small inconvenience: kernel's strtobool
only support 'yes|no', it doesn't match perf's config, which
requires parsing yes/no/on/off/true/false. In addition, it
accept '0/1' but perf config not.

It is very easy for me to change strtobool() to match them.
However, do you think we can provide a function with same
name with kernel but have different behavior?

However I think we can first clone that function and use it in
BPF without touching perf config, then decide whether we need
to change strtobool() or not.

Thank you.

Subject: [tip:perf/urgent] perf probe: Fix memory leaking on failure by clearing all probe_trace_events

Commit-ID: 0196e787ceb58cdfea822482ec70019bc16cbd51
Gitweb: http://git.kernel.org/tip/0196e787ceb58cdfea822482ec70019bc16cbd51
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Fri, 13 Nov 2015 12:29:10 +0000
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Fri, 13 Nov 2015 12:24:32 -0300

perf probe: Fix memory leaking on failure by clearing all probe_trace_events

Fix memory leaking on the debuginfo__find_trace_events() failure path
which frees an array of probe_trace_events but doesn't clears all the
allocated sub-structures and strings.

So, before doing zfree(tevs), clear all the array elements which may
have allocated resources.

Reported-by: Wang Nan <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/probe-finder.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index bd8f03d..63993d7 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1246,7 +1246,7 @@ int debuginfo__find_trace_events(struct debuginfo *dbg,
struct trace_event_finder tf = {
.pf = {.pev = pev, .callback = add_probe_trace_event},
.max_tevs = probe_conf.max_probes, .mod = dbg->mod};
- int ret;
+ int ret, i;

/* Allocate result tevs array */
*tevs = zalloc(sizeof(struct probe_trace_event) * tf.max_tevs);
@@ -1258,6 +1258,8 @@ int debuginfo__find_trace_events(struct debuginfo *dbg,

ret = debuginfo__find_probes(dbg, &tf.pf);
if (ret < 0) {
+ for (i = 0; i < tf.ntevs; i++)
+ clear_probe_trace_event(&tf.tevs[i]);
zfree(tevs);
return ret;
}

Subject: [tip:perf/urgent] perf probe: Clear probe_trace_event when add_probe_trace_event() fails

Commit-ID: 092b1f0b5f9f797812da0de927c3aa26acbe8762
Gitweb: http://git.kernel.org/tip/092b1f0b5f9f797812da0de927c3aa26acbe8762
Author: Wang Nan <[email protected]>
AuthorDate: Fri, 13 Nov 2015 12:29:11 +0000
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Fri, 13 Nov 2015 12:28:09 -0300

perf probe: Clear probe_trace_event when add_probe_trace_event() fails

When probing with a glob, errors in add_probe_trace_event() won't be
passed to debuginfo__find_trace_events() because it would be modified by
probe_point_search_cb(). It causes a segfault if perf fails to find an
argument for a probe point matched by the glob. For example:

# ./perf probe -v -n 'SyS_dup? oldfd'
probe-definition(0): SyS_dup? oldfd
symbol:SyS_dup? file:(null) line:0 offset:0 return:0 lazy:(null)
parsing arg: oldfd into oldfd
1 arguments
Looking at the vmlinux_path (7 entries long)
Using /lib/modules/4.3.0-rc4+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.3.0-rc4+/build/vmlinux
Try to find probe point from debuginfo.
Matched function: SyS_dup3
found inline addr: 0xffffffff812095c0
Probe point found: SyS_dup3+0
Searching 'oldfd' variable in context.
Converting variable oldfd into trace event.
oldfd type is long int.
found inline addr: 0xffffffff812096d4
Probe point found: SyS_dup2+36
Searching 'oldfd' variable in context.
Failed to find 'oldfd' in this function.
Matched function: SyS_dup3
Probe point found: SyS_dup3+0
Searching 'oldfd' variable in context.
Converting variable oldfd into trace event.
oldfd type is long int.
Matched function: SyS_dup2
Probe point found: SyS_dup2+0
Searching 'oldfd' variable in context.
Converting variable oldfd into trace event.
oldfd type is long int.
Found 4 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:probe/SyS_dup3 _text+2135488 oldfd=%di:s64
Segmentation fault (core dumped)
#

This patch ensures that add_probe_trace_event() doesn't touches
tf->ntevs and tf->tevs if those functions fail.

After the patch:

# perf probe 'SyS_dup? oldfd'
Failed to find 'oldfd' in this function.
Added new events:
probe:SyS_dup3 (on SyS_dup? with oldfd)
probe:SyS_dup3_1 (on SyS_dup? with oldfd)
probe:SyS_dup2 (on SyS_dup? with oldfd)

You can now use it in all perf tools, such as:

perf record -e probe:SyS_dup2 -aR sleep 1

Signed-off-by: Wang Nan <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/probe-finder.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 63993d7..05012bb 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1183,7 +1183,7 @@ static int add_probe_trace_event(Dwarf_Die *sc_die, struct probe_finder *pf)
container_of(pf, struct trace_event_finder, pf);
struct perf_probe_point *pp = &pf->pev->point;
struct probe_trace_event *tev;
- struct perf_probe_arg *args;
+ struct perf_probe_arg *args = NULL;
int ret, i;

/* Check number of tevs */
@@ -1198,19 +1198,23 @@ static int add_probe_trace_event(Dwarf_Die *sc_die, struct probe_finder *pf)
ret = convert_to_trace_point(&pf->sp_die, tf->mod, pf->addr,
pp->retprobe, pp->function, &tev->point);
if (ret < 0)
- return ret;
+ goto end;

tev->point.realname = strdup(dwarf_diename(sc_die));
- if (!tev->point.realname)
- return -ENOMEM;
+ if (!tev->point.realname) {
+ ret = -ENOMEM;
+ goto end;
+ }

pr_debug("Probe point found: %s+%lu\n", tev->point.symbol,
tev->point.offset);

/* Expand special probe argument if exist */
args = zalloc(sizeof(struct perf_probe_arg) * MAX_PROBE_ARGS);
- if (args == NULL)
- return -ENOMEM;
+ if (args == NULL) {
+ ret = -ENOMEM;
+ goto end;
+ }

ret = expand_probe_args(sc_die, pf, args);
if (ret < 0)
@@ -1234,6 +1238,10 @@ static int add_probe_trace_event(Dwarf_Die *sc_die, struct probe_finder *pf)
}

end:
+ if (ret) {
+ clear_probe_trace_event(tev);
+ tf->ntevs--;
+ }
free(args);
return ret;
}