2014-12-09 17:34:53

by Naveen N. Rao

[permalink] [raw]
Subject: [RFC PATCH 0/8] Fix perf probe issues on powerpc

This patchset fixes various issues with perf probe on powerpc
across ABIv1 and ABIv2:
- in the presence of DWARF debug-info,
- in the absence of DWARF, but with the symbol table, and
- in the absence of debug-info, but with kallsyms.

Applies cleanly on v3.18 and on -tip with minor changes to patch 6.
Tested on ppc64 BE and LE.

- Naveen

Naveen N. Rao (8):
kprobes: Fix kallsyms lookup across powerpc ABIv1 and ABIv2
perf probe powerpc: Fix symbol fixup issues due to ELF type
perf probe: Improve detection of file/function name in the probe
pattern
perf probe powerpc: Handle powerpc dot symbols
perf probe powerpc: Allow matching against dot symbols
perf tools powerpc: Fix PPC64 ELF ABIv2 symbol decoding
perf probe powerpc: Use DWARF info only if necessary
perf probe powerpc: Fixup function entry if using kallsyms lookup

arch/powerpc/include/asm/code-patching.h | 26 ++++++++----
arch/powerpc/include/asm/kprobes.h | 58 ++++++++++++++++++---------
tools/perf/arch/powerpc/Makefile | 1 +
tools/perf/arch/powerpc/util/elf-sym-decode.c | 27 +++++++++++++
tools/perf/config/Makefile | 1 +
tools/perf/util/elf_sym.h | 13 ++++++
tools/perf/util/probe-event.c | 57 ++++++++++++++++++++++++--
tools/perf/util/symbol-elf.c | 11 ++++-
tools/perf/util/symbol.c | 6 +++
9 files changed, 170 insertions(+), 30 deletions(-)
create mode 100644 tools/perf/arch/powerpc/util/elf-sym-decode.c
create mode 100644 tools/perf/util/elf_sym.h

--
2.1.3


2014-12-09 17:34:57

by Naveen N. Rao

[permalink] [raw]
Subject: [RFC PATCH 1/8] kprobes: Fix kallsyms lookup across powerpc ABIv1 and ABIv2

Currently, all non-dot symbols are being treated as function descriptors
in ABIv1. This is incorrect and is resulting in perf probe not working:

# perf probe do_fork
Added new event:
Failed to write event: Invalid argument
Error: Failed to add events.
# dmesg | tail -1
[192268.073063] Could not insert probe at _text+768432: -22

_text is being resolved incorrectly and is resulting in the above error.
Fix this by changing how we lookup symbol addresses on ppc64. We first
check for the dot variant of a symbol and look at the non-dot variant
only if that fails. In this manner, we avoid having to look at the
function descriptor.

Signed-off-by: Naveen N. Rao <[email protected]>
---
arch/powerpc/include/asm/code-patching.h | 26 +++++++++-----
arch/powerpc/include/asm/kprobes.h | 58 ++++++++++++++++++++++----------
2 files changed, 58 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h
index 840a550..19c5bab 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -47,18 +47,12 @@ void __patch_exception(int exc, unsigned long addr);
#define ADDIS_R2_R12 0x3c4c0000UL
#define ADDI_R2_R2 0x38420000UL

-static inline unsigned long ppc_function_entry(void *func)
+static inline unsigned long ppc_local_function_entry(void *func)
{
-#if defined(CONFIG_PPC64)
-#if defined(_CALL_ELF) && _CALL_ELF == 2
+#if defined(CONFIG_PPC64) && defined(_CALL_ELF) && _CALL_ELF == 2
u32 *insn = func;

/*
- * A PPC64 ABIv2 function may have a local and a global entry
- * point. We need to use the local entry point when patching
- * functions, so identify and step over the global entry point
- * sequence.
- *
* The global entry point sequence is always of the form:
*
* addis r2,r12,XXXX
@@ -76,6 +70,22 @@ static inline unsigned long ppc_function_entry(void *func)
else
return (unsigned long)func;
#else
+ return (unsigned long)func;
+#endif
+}
+
+static inline unsigned long ppc_function_entry(void *func)
+{
+#if defined(CONFIG_PPC64)
+#if defined(_CALL_ELF) && _CALL_ELF == 2
+ /*
+ * A PPC64 ABIv2 function may have a local and a global entry
+ * point. We need to use the local entry point when patching
+ * functions, so identify and step over the global entry point
+ * sequence.
+ */
+ return ppc_local_function_entry(func);
+#else
/*
* On PPC64 ABIv1 the function pointer actually points to the
* function's descriptor. The first entry in the descriptor is the
diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h
index af15d4d..060bdea 100644
--- a/arch/powerpc/include/asm/kprobes.h
+++ b/arch/powerpc/include/asm/kprobes.h
@@ -42,30 +42,52 @@ typedef ppc_opcode_t kprobe_opcode_t;

#ifdef CONFIG_PPC64
/*
- * 64bit powerpc uses function descriptors.
- * Handle cases where:
- * - User passes a <.symbol> or <module:.symbol>
- * - User passes a <symbol> or <module:symbol>
- * - User passes a non-existent symbol, kallsyms_lookup_name
- * returns 0. Don't deref the NULL pointer in that case
+ * ppc64[le] uses function descriptors with ABIv1 and global/local
+ * entry points for ABIv2:
+ * - Check for the dot variant of the symbol first. If that exists, then
+ * we know this is ABIv1 and we have the symbol and not the descriptor.
+ * - If that fails, try looking up the symbol provided. If that works,
+ * then we either have ABIv1 symbol (not the descriptor) or ABIv2
+ * global entry point.
+ *
+ * Also handle <module:symbol> format.
*/
#define kprobe_lookup_name(name, addr) \
{ \
- addr = (kprobe_opcode_t *)kallsyms_lookup_name(name); \
- if (addr) { \
- char *colon; \
- if ((colon = strchr(name, ':')) != NULL) { \
- colon++; \
- if (*colon != '\0' && *colon != '.') \
- addr = (kprobe_opcode_t *)ppc_function_entry(addr); \
- } else if (name[0] != '.') \
- addr = (kprobe_opcode_t *)ppc_function_entry(addr); \
- } else { \
- char dot_name[KSYM_NAME_LEN]; \
+ char dot_name[MODULE_NAME_LEN + 1 + KSYM_NAME_LEN]; \
+ char *modsym; \
+ bool dot_appended = false; \
+ if ((modsym = strchr(name, ':')) != NULL) { \
+ modsym++; \
+ if (*modsym != '\0' && *modsym != '.') { \
+ /* Convert to <module:.symbol> */ \
+ strncpy(dot_name, name, modsym - name); \
+ dot_name[modsym - name] = '.'; \
+ dot_name[modsym - name + 1] = '\0'; \
+ strncat(dot_name, modsym, sizeof(dot_name) - (modsym - name) - 2); \
+ dot_appended = true; \
+ } else { \
+ dot_name[0] = '\0'; \
+ strncat(dot_name, name, sizeof(dot_name) - 1); \
+ } \
+ } else if (name[0] != '.') { \
dot_name[0] = '.'; \
dot_name[1] = '\0'; \
strncat(dot_name, name, KSYM_NAME_LEN - 2); \
- addr = (kprobe_opcode_t *)kallsyms_lookup_name(dot_name); \
+ dot_appended = true; \
+ } else { \
+ dot_name[0] = '\0'; \
+ strncat(dot_name, name, KSYM_NAME_LEN - 1); \
+ } \
+ addr = (kprobe_opcode_t *)kallsyms_lookup_name(dot_name); \
+ if (!addr && dot_appended) { \
+ /* Let's try the original symbol lookup */ \
+ addr = (kprobe_opcode_t *)kallsyms_lookup_name(name); \
+ if (addr) { \
+ /* We know this isn't a function descriptor */ \
+ /* But, this could be the global entry point */ \
+ addr = (kprobe_opcode_t *)ppc_local_function_entry(addr); \
+ } \
} \
}
#endif
--
2.1.3

2014-12-09 17:35:00

by Naveen N. Rao

[permalink] [raw]
Subject: [RFC PATCH 3/8] perf probe: Improve detection of file/function name in the probe pattern

Currently, perf probe considers patterns including a '.' to be a file.
However, this causes problems on powerpc ABIv1 where all functions have
a leading '.':

$ perf probe -F | grep schedule_timeout_interruptible
.schedule_timeout_interruptible
$ perf probe .schedule_timeout_interruptible
Semantic error :File always requires line number or lazy pattern.
Error: Command Parse Error.

Fix this by checking the probe pattern in more detail.

Signed-off-by: Naveen N. Rao <[email protected]>
---
tools/perf/util/probe-event.c | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index c150ca4..c7e01ef 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -999,6 +999,24 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
arg = tmp;
}

+ /*
+ * Check arg is function or file name and copy it.
+ *
+ * We consider arg to be a file spec if and only if it satisfies
+ * all of the below criteria::
+ * - it does not include any of "+@%",
+ * - it includes one of ":;", and
+ * - it has a period '.' in the name.
+ *
+ * Otherwise, we consider arg to be a function specification.
+ */
+ c = 0;
+ if (!strpbrk(arg, "+@%") && (ptr = strpbrk(arg, ";:")) != NULL) {
+ /* This is a file spec if it includes a '.' before ; or : */
+ if (memchr(arg, '.', ptr-arg))
+ c = 1;
+ }
+
ptr = strpbrk(arg, ";:+@%");
if (ptr) {
nc = *ptr;
@@ -1009,10 +1027,9 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
if (tmp == NULL)
return -ENOMEM;

- /* Check arg is function or file and copy it */
- if (strchr(tmp, '.')) /* File */
+ if (c == 1)
pp->file = tmp;
- else /* Function */
+ else
pp->function = tmp;

/* Parse other options */
--
2.1.3

2014-12-09 17:35:08

by Naveen N. Rao

[permalink] [raw]
Subject: [RFC PATCH 4/8] perf probe powerpc: Handle powerpc dot symbols

Fix up various perf aspects related to ppc64's usage of dot functions:
- ignore leading '.' when generating event names and when looking for
existing events.
- use the proper prefix when ignoring SyS symbol lookups.

Signed-off-by: Naveen N. Rao <[email protected]>
---
tools/perf/util/probe-event.c | 8 ++++++++
tools/perf/util/symbol.c | 6 ++++++
2 files changed, 14 insertions(+)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index c7e01ef..d465f7c 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2080,6 +2080,10 @@ static int get_new_event_name(char *buf, size_t len, const char *base,
{
int i, ret;

+ /* Skip the leading dot on powerpc */
+ if (*base == '.')
+ base++;
+
/* Try no suffix */
ret = e_snprintf(buf, len, "%s", base);
if (ret < 0) {
@@ -2538,6 +2542,10 @@ int del_perf_probe_events(struct strlist *dellist)
event = str;
}

+ /* Skip the leading dot on powerpc */
+ if (event && *event == '.')
+ event++;
+
ret = e_snprintf(buf, 128, "%s:%s", group, event);
if (ret < 0) {
pr_err("Failed to copy event.");
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 0783311..cc04475 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -137,6 +137,12 @@ static int choose_best_symbol(struct symbol *syma, struct symbol *symb)
if (na >= 10 && !strncmp(syma->name, "compat_SyS", 10))
return SYMBOL_B;

+ /* On powerpc, ignore the dot variants */
+ if (na >= 4 && !strncmp(syma->name, ".SyS", 4))
+ return SYMBOL_B;
+ if (na >= 11 && !strncmp(syma->name, ".compat_SyS", 11))
+ return SYMBOL_B;
+
return SYMBOL_A;
}

--
2.1.3

2014-12-09 17:35:15

by Naveen N. Rao

[permalink] [raw]
Subject: [RFC PATCH 8/8] perf probe powerpc: Fixup function entry if using kallsyms lookup

On powerpc ABIv2, if no debug-info is found and we use kallsyms,
we need to fixup the function entry to point to the local entry point.
Use offset of 8 since current toolchains always generate 2
instructions (8 bytes).

Signed-off-by: Naveen N. Rao <[email protected]>
---
tools/perf/util/probe-event.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index adcdbd2..0970e2a 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2318,6 +2318,15 @@ static int find_probe_trace_events_from_map(struct perf_probe_event *pev,
}
/* Add one probe point */
tp->address = map->unmap_ip(map, sym->start) + pp->offset;
+#if defined(__powerpc64__) && defined(_CALL_ELF) && _CALL_ELF == 2
+ /*
+ * If we used kallsyms, we should fixup the function entry address here.
+ * ppc64le ABIv2 local entry point is currently always 2 instructions (8 bytes)
+ * after the global entry point. Fix this if it ever changes.
+ */
+ if (!pev->uprobes && map->dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS)
+ tp->address += 8;
+#endif
if (reloc_sym) {
tp->symbol = strdup_or_goto(reloc_sym->name, nomem_out);
tp->offset = tp->address - reloc_sym->addr;
--
2.1.3

2014-12-09 17:35:53

by Naveen N. Rao

[permalink] [raw]
Subject: [RFC PATCH 6/8] perf tools powerpc: Fix PPC64 ELF ABIv2 symbol decoding

PPC64 ELF ABIv2 has a Global Entry Point (GEP) and a Local Entry Point
(LEP). For purposes of probing, we need the LEP. Offset to the LEP is
encoded in st_other.

Signed-off-by: Ananth N Mavinakayanahalli <[email protected]>
Signed-off-by: Naveen N. Rao <[email protected]>
---
tools/perf/arch/powerpc/Makefile | 1 +
tools/perf/arch/powerpc/util/elf-sym-decode.c | 27 +++++++++++++++++++++++++++
tools/perf/config/Makefile | 1 +
tools/perf/util/elf_sym.h | 13 +++++++++++++
tools/perf/util/symbol-elf.c | 8 ++++++++
5 files changed, 50 insertions(+)
create mode 100644 tools/perf/arch/powerpc/util/elf-sym-decode.c
create mode 100644 tools/perf/util/elf_sym.h

diff --git a/tools/perf/arch/powerpc/Makefile b/tools/perf/arch/powerpc/Makefile
index 6f7782b..8621439 100644
--- a/tools/perf/arch/powerpc/Makefile
+++ b/tools/perf/arch/powerpc/Makefile
@@ -3,4 +3,5 @@ PERF_HAVE_DWARF_REGS := 1
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/skip-callchain-idx.o
endif
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/elf-sym-decode.o
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/header.o
diff --git a/tools/perf/arch/powerpc/util/elf-sym-decode.c b/tools/perf/arch/powerpc/util/elf-sym-decode.c
new file mode 100644
index 0000000..7434656
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/elf-sym-decode.c
@@ -0,0 +1,27 @@
+/*
+ * Decode offset from Global Entry Point to Local Entry Point on PPC64
+ * ELF ABIv2.
+ *
+ * Derived from definitions in arch/powerpc/kernel/module_64.c
+ *
+ * Copyright (C) 2014 Ananth N Mavinakayanahalli, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <gelf.h>
+#include "elf_sym.h"
+
+/* PowerPC64 ABIv2 specific values for the ELF64_Sym st_other field. */
+#define STO_PPC64_LOCAL_BIT 5
+#define STO_PPC64_LOCAL_MASK (7 << STO_PPC64_LOCAL_BIT)
+#define PPC64_LOCAL_ENTRY_OFFSET(other) \
+ (((1 << (((other) & STO_PPC64_LOCAL_MASK) >> STO_PPC64_LOCAL_BIT)) >> 2) << 2)
+
+unsigned int arch_elf_sym_decode_offset(GElf_Sym *sym)
+{
+ return PPC64_LOCAL_ENTRY_OFFSET(sym->st_other);
+}
diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index 58f6091..8f64557 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -378,6 +378,7 @@ ifeq ($(ARCH),powerpc)
ifndef NO_DWARF
CFLAGS += -DHAVE_SKIP_CALLCHAIN_IDX
endif
+ CFLAGS += -DHAVE_ELF_SYM_DECODE
endif

ifndef NO_LIBUNWIND
diff --git a/tools/perf/util/elf_sym.h b/tools/perf/util/elf_sym.h
new file mode 100644
index 0000000..0176f21
--- /dev/null
+++ b/tools/perf/util/elf_sym.h
@@ -0,0 +1,13 @@
+#ifndef __PERF_ELF_SYM_H
+#define __PERF_ELF_SYM_H
+
+#ifdef HAVE_ELF_SYM_DECODE
+extern unsigned int arch_elf_sym_decode_offset(GElf_Sym *sym);
+#else
+static inline unsigned int arch_elf_sym_decode_offset(GElf_Sym *sym __maybe_unused)
+{
+ return 0;
+}
+#endif
+
+#endif /* __PERF_ELF_SYM_H */
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 67e4392..92a424e 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -10,6 +10,7 @@
#include "vdso.h"
#include <symbol/kallsyms.h>
#include "debug.h"
+#include "elf_sym.h"

#ifndef HAVE_ELF_GETPHDRNUM_SUPPORT
static int elf_getphdrnum(Elf *elf, size_t *dst)
@@ -848,6 +849,13 @@ int dso__load_sym(struct dso *dso, struct map *map,
(sym.st_value & 1))
--sym.st_value;

+ /*
+ * PPC64 ELF ABIv2 encodes Local Entry Point offset in
+ * the st_other field
+ */
+ if ((map->type == MAP__FUNCTION) && sym.st_other)
+ sym.st_value += arch_elf_sym_decode_offset(&sym);
+
if (dso->kernel || kmodule) {
char dso_name[PATH_MAX];

--
2.1.3

2014-12-09 17:35:51

by Naveen N. Rao

[permalink] [raw]
Subject: [RFC PATCH 7/8] perf probe powerpc: Use DWARF info only if necessary

Use symbol table lookups by default if DWARF is not necessary, since
powerpc ABIv2 encodes local entry points in the symbol table and the
function entry address in DWARF may not be appropriate for kprobes,
as described here:
https://sourceware.org/bugzilla/show_bug.cgi?id=17638

Signed-off-by: Naveen N. Rao <[email protected]>
---
tools/perf/util/probe-event.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 174c22e..adcdbd2 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2382,6 +2382,14 @@ static int convert_to_probe_trace_events(struct perf_probe_event *pev,
}
}

+#if defined(__powerpc64__) && defined(_CALL_ELF) && _CALL_ELF == 2
+ if (!perf_probe_event_need_dwarf(pev)) {
+ ret = find_probe_trace_events_from_map(pev, tevs, max_tevs, target);
+ if (ret > 0)
+ return ret; /* Found in symbol table */
+ }
+#endif
+
/* Convert perf_probe_event with debuginfo */
ret = try_to_find_probe_trace_events(pev, tevs, max_tevs, target);
if (ret != 0)
--
2.1.3

2014-12-09 17:35:06

by Naveen N. Rao

[permalink] [raw]
Subject: [RFC PATCH 5/8] perf probe powerpc: Allow matching against dot symbols

Allow perf probe to work on powerpc ABIv1 without the need to specify the
leading dot '.' for functions. 'perf probe do_fork' works with this patch.

Signed-off-by: Naveen N. Rao <[email protected]>
---
tools/perf/util/probe-event.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index d465f7c..174c22e 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2221,6 +2221,15 @@ static int probe_function_filter(struct map *map __maybe_unused,
num_matched_functions++;
return 0;
}
+#ifdef __powerpc64__
+ /* Allow matching against the dot variant */
+ if (sym->name[0] == '.' && looking_function_name[0] != '.' &&
+ (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) &&
+ strcmp(looking_function_name, sym->name+1) == 0) {
+ num_matched_functions++;
+ return 0;
+ }
+#endif
return 1;
}

--
2.1.3

2014-12-09 17:37:14

by Naveen N. Rao

[permalink] [raw]
Subject: [RFC PATCH 2/8] perf probe powerpc: Fix symbol fixup issues due to ELF type

If using the symbol table, symbol addresses are not being fixed up
properly, resulting in probes being placed at wrong addresses:

# perf probe do_fork
Added new event:
probe:do_fork (on do_fork)

You can now use it in all perf tools, such as:

perf record -e probe:do_fork -aR sleep 1

# cat /sys/kernel/debug/tracing/kprobe_events
p:probe/do_fork _text+635952
# printf "%x" 635952
9b430
# grep do_fork /boot/System.map
c0000000000ab430 T .do_fork

Fix by checking for ELF type ET_DYN used by ppc64 kernels.

Signed-off-by: Naveen N. Rao <[email protected]>
---
tools/perf/util/symbol-elf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 1e23a5b..67e4392 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -629,7 +629,8 @@ int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
NULL) != NULL);
} else {
ss->adjust_symbols = ehdr.e_type == ET_EXEC ||
- ehdr.e_type == ET_REL;
+ ehdr.e_type == ET_REL ||
+ ehdr.e_type == ET_DYN;
}

ss->name = strdup(name);
--
2.1.3

2014-12-09 21:07:26

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH 2/8] perf probe powerpc: Fix symbol fixup issues due to ELF type

Em Tue, Dec 09, 2014 at 11:04:00PM +0530, Naveen N. Rao escreveu:
> If using the symbol table, symbol addresses are not being fixed up
> properly, resulting in probes being placed at wrong addresses:
>
> # perf probe do_fork
> Added new event:
> probe:do_fork (on do_fork)
>
> You can now use it in all perf tools, such as:
>
> perf record -e probe:do_fork -aR sleep 1
>
> # cat /sys/kernel/debug/tracing/kprobe_events
> p:probe/do_fork _text+635952
> # printf "%x" 635952
> 9b430
> # grep do_fork /boot/System.map
> c0000000000ab430 T .do_fork
>
> Fix by checking for ELF type ET_DYN used by ppc64 kernels.

Are you sure this doesn't need to be enclosed in ifdef PPC64?

- Arnaldo

> Signed-off-by: Naveen N. Rao <[email protected]>
> ---
> tools/perf/util/symbol-elf.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 1e23a5b..67e4392 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -629,7 +629,8 @@ int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
> NULL) != NULL);
> } else {
> ss->adjust_symbols = ehdr.e_type == ET_EXEC ||
> - ehdr.e_type == ET_REL;
> + ehdr.e_type == ET_REL ||
> + ehdr.e_type == ET_DYN;
> }
>
> ss->name = strdup(name);
> --
> 2.1.3

2014-12-09 21:14:07

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH 8/8] perf probe powerpc: Fixup function entry if using kallsyms lookup

Em Tue, Dec 09, 2014 at 11:04:06PM +0530, Naveen N. Rao escreveu:
> On powerpc ABIv2, if no debug-info is found and we use kallsyms,
> we need to fixup the function entry to point to the local entry point.
> Use offset of 8 since current toolchains always generate 2
> instructions (8 bytes).

Hi Michael and Ananth, may I have your Acked-by or Reviewed-by for these
patches?

The ones, like this, that are affects only ppc I'm can assume was tested
and applying it won't break other arch users, but having a/rev-by from
ppc developers should speed up this process.

Thanks,

- Arnaldo

> Signed-off-by: Naveen N. Rao <[email protected]>
> ---
> tools/perf/util/probe-event.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index adcdbd2..0970e2a 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -2318,6 +2318,15 @@ static int find_probe_trace_events_from_map(struct perf_probe_event *pev,
> }
> /* Add one probe point */
> tp->address = map->unmap_ip(map, sym->start) + pp->offset;
> +#if defined(__powerpc64__) && defined(_CALL_ELF) && _CALL_ELF == 2
> + /*
> + * If we used kallsyms, we should fixup the function entry address here.
> + * ppc64le ABIv2 local entry point is currently always 2 instructions (8 bytes)
> + * after the global entry point. Fix this if it ever changes.
> + */
> + if (!pev->uprobes && map->dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS)
> + tp->address += 8;
> +#endif
> if (reloc_sym) {
> tp->symbol = strdup_or_goto(reloc_sym->name, nomem_out);
> tp->offset = tp->address - reloc_sym->addr;
> --
> 2.1.3

Subject: Re: [RFC PATCH 8/8] perf probe powerpc: Fixup function entry if using kallsyms lookup

On Tue, Dec 09, 2014 at 06:14:00PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Dec 09, 2014 at 11:04:06PM +0530, Naveen N. Rao escreveu:
> > On powerpc ABIv2, if no debug-info is found and we use kallsyms,
> > we need to fixup the function entry to point to the local entry point.
> > Use offset of 8 since current toolchains always generate 2
> > instructions (8 bytes).
>
> Hi Michael and Ananth, may I have your Acked-by or Reviewed-by for these
> patches?
>
> The ones, like this, that are affects only ppc I'm can assume was tested
> and applying it won't break other arch users, but having a/rev-by from
> ppc developers should speed up this process.

Hi Arnaldo,

Yes, I have reviewed the patches. So, for all patches...

Reviewed-by: Ananth N Mavinakayanahalli <[email protected]>

Subject: Re: [RFC PATCH 0/8] Fix perf probe issues on powerpc

On Tue, Dec 09, 2014 at 11:03:58PM +0530, Naveen N. Rao wrote:
> This patchset fixes various issues with perf probe on powerpc
> across ABIv1 and ABIv2:
> - in the presence of DWARF debug-info,
> - in the absence of DWARF, but with the symbol table, and
> - in the absence of debug-info, but with kallsyms.
>
> Applies cleanly on v3.18 and on -tip with minor changes to patch 6.
> Tested on ppc64 BE and LE.
>
> - Naveen
>
> Naveen N. Rao (8):
> kprobes: Fix kallsyms lookup across powerpc ABIv1 and ABIv2
> perf probe powerpc: Fix symbol fixup issues due to ELF type
> perf probe: Improve detection of file/function name in the probe
> pattern
> perf probe powerpc: Handle powerpc dot symbols
> perf probe powerpc: Allow matching against dot symbols
> perf tools powerpc: Fix PPC64 ELF ABIv2 symbol decoding
> perf probe powerpc: Use DWARF info only if necessary
> perf probe powerpc: Fixup function entry if using kallsyms lookup
>
> arch/powerpc/include/asm/code-patching.h | 26 ++++++++----
> arch/powerpc/include/asm/kprobes.h | 58 ++++++++++++++++++---------
> tools/perf/arch/powerpc/Makefile | 1 +
> tools/perf/arch/powerpc/util/elf-sym-decode.c | 27 +++++++++++++
> tools/perf/config/Makefile | 1 +
> tools/perf/util/elf_sym.h | 13 ++++++
> tools/perf/util/probe-event.c | 57 ++++++++++++++++++++++++--
> tools/perf/util/symbol-elf.c | 11 ++++-
> tools/perf/util/symbol.c | 6 +++
> 9 files changed, 170 insertions(+), 30 deletions(-)
> create mode 100644 tools/perf/arch/powerpc/util/elf-sym-decode.c
> create mode 100644 tools/perf/util/elf_sym.h

For the full patchset...

Reviewed-by: Ananth N Mavinakayanahalli <[email protected]>

2014-12-10 09:36:08

by Naveen N. Rao

[permalink] [raw]
Subject: Re: [RFC PATCH 2/8] perf probe powerpc: Fix symbol fixup issues due to ELF type

On 2014/12/09 06:07PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Dec 09, 2014 at 11:04:00PM +0530, Naveen N. Rao escreveu:
> > If using the symbol table, symbol addresses are not being fixed up
> > properly, resulting in probes being placed at wrong addresses:
> >
> > # perf probe do_fork
> > Added new event:
> > probe:do_fork (on do_fork)
> >
> > You can now use it in all perf tools, such as:
> >
> > perf record -e probe:do_fork -aR sleep 1
> >
> > # cat /sys/kernel/debug/tracing/kprobe_events
> > p:probe/do_fork _text+635952
> > # printf "%x" 635952
> > 9b430
> > # grep do_fork /boot/System.map
> > c0000000000ab430 T .do_fork
> >
> > Fix by checking for ELF type ET_DYN used by ppc64 kernels.
>
> Are you sure this doesn't need to be enclosed in ifdef PPC64?

I felt this change is architecture-independent, though I'm actually not
sure if there are other architectures using ET_DYN for their kernel. I
can restrict this to ppc64 if you think that would be better.

- Naveen

>
> - Arnaldo
>
> > Signed-off-by: Naveen N. Rao <[email protected]>
> > ---
> > tools/perf/util/symbol-elf.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> > index 1e23a5b..67e4392 100644
> > --- a/tools/perf/util/symbol-elf.c
> > +++ b/tools/perf/util/symbol-elf.c
> > @@ -629,7 +629,8 @@ int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
> > NULL) != NULL);
> > } else {
> > ss->adjust_symbols = ehdr.e_type == ET_EXEC ||
> > - ehdr.e_type == ET_REL;
> > + ehdr.e_type == ET_REL ||
> > + ehdr.e_type == ET_DYN;
> > }
> >
> > ss->name = strdup(name);
> > --
> > 2.1.3
>

2014-12-10 09:37:27

by Michael Ellerman

[permalink] [raw]
Subject: Re: [RFC PATCH 1/8] kprobes: Fix kallsyms lookup across powerpc ABIv1 and ABIv2

On Tue, 2014-12-09 at 23:03 +0530, Naveen N. Rao wrote:
> Currently, all non-dot symbols are being treated as function descriptors
> in ABIv1. This is incorrect and is resulting in perf probe not working:

I don't understand that first sentence. With ABIv1 non-dot symbols *are*
function descriptors?

> # perf probe do_fork
> Added new event:
> Failed to write event: Invalid argument
> Error: Failed to add events.
> # dmesg | tail -1
> [192268.073063] Could not insert probe at _text+768432: -22
>
> _text is being resolved incorrectly and is resulting in the above error.
> Fix this by changing how we lookup symbol addresses on ppc64. We first
> check for the dot variant of a symbol and look at the non-dot variant
> only if that fails. In this manner, we avoid having to look at the
> function descriptor.

I'm not clear that ppc_local_function_entry() makes sense. On ABIv2 you return
the local entry point, which is fine. But on ABIv1 you just return the
unmodified address, which will be the descriptor if you actually passed it a
function pointer. I think you're assuming that you're passed the text address,
but if that's the case the function is badly named at least.

I also don't understand why we need to ever guess which ABI we're using. We
know which ABI we're built with, so there should be no guess work required.

So at the very least this needs much more explanation.

But to be honest I'm not clear why it even needs a kernel change, don't we just
need perf to understand dot symbols?

cheers

2014-12-10 09:50:58

by Michael Ellerman

[permalink] [raw]
Subject: Re: [RFC PATCH 2/8] perf probe powerpc: Fix symbol fixup issues due to ELF type

On Tue, 2014-12-09 at 23:04 +0530, Naveen N. Rao wrote:
> If using the symbol table, symbol addresses are not being fixed up
> properly, resulting in probes being placed at wrong addresses:
>
> # perf probe do_fork
> Added new event:
> probe:do_fork (on do_fork)
>
> You can now use it in all perf tools, such as:
>
> perf record -e probe:do_fork -aR sleep 1
>
> # cat /sys/kernel/debug/tracing/kprobe_events
> p:probe/do_fork _text+635952
> # printf "%x" 635952
> 9b430
> # grep do_fork /boot/System.map
> c0000000000ab430 T .do_fork

OK, but why is that happening? And why does checking for ET_DYN fix it?

> Fix by checking for ELF type ET_DYN used by ppc64 kernels.

We sometimes produce ET_DYN kernels, but only if CONFIG_RELOCATABLE=y.

cheers

2014-12-10 10:00:13

by Michael Ellerman

[permalink] [raw]
Subject: Re: [RFC PATCH 3/8] perf probe: Improve detection of file/function name in the probe pattern

On Tue, 2014-12-09 at 23:04 +0530, Naveen N. Rao wrote:
> Currently, perf probe considers patterns including a '.' to be a file.
> However, this causes problems on powerpc ABIv1 where all functions have
> a leading '.':
>
> $ perf probe -F | grep schedule_timeout_interruptible
> .schedule_timeout_interruptible
> $ perf probe .schedule_timeout_interruptible
> Semantic error :File always requires line number or lazy pattern.
> Error: Command Parse Error.
>
> Fix this by checking the probe pattern in more detail.
>
> Signed-off-by: Naveen N. Rao <[email protected]>
> ---
> tools/perf/util/probe-event.c | 23 ++++++++++++++++++++---
> 1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index c150ca4..c7e01ef 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -999,6 +999,24 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
> arg = tmp;
> }
>
> + /*
> + * Check arg is function or file name and copy it.
> + *
> + * We consider arg to be a file spec if and only if it satisfies
> + * all of the below criteria::
> + * - it does not include any of "+@%",
> + * - it includes one of ":;", and
> + * - it has a period '.' in the name.

I don't think we need to be this elaborate.

AFAIK there are no source files in the kernel that start with '.'

So if the arg starts with '.' it must be a function?

cheers

2014-12-10 10:01:59

by Michael Ellerman

[permalink] [raw]
Subject: Re: [RFC PATCH 4/8] perf probe powerpc: Handle powerpc dot symbols

On Tue, 2014-12-09 at 23:04 +0530, Naveen N. Rao wrote:
> Fix up various perf aspects related to ppc64's usage of dot functions:
> - ignore leading '.' when generating event names and when looking for
> existing events.
> - use the proper prefix when ignoring SyS symbol lookups.
>
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index c7e01ef..d465f7c 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -2080,6 +2080,10 @@ static int get_new_event_name(char *buf, size_t len, const char *base,
> {
> int i, ret;
>
> + /* Skip the leading dot on powerpc */
> + if (*base == '.')
> + base++;
> +
> /* Try no suffix */
> ret = e_snprintf(buf, len, "%s", base);
> if (ret < 0) {
> @@ -2538,6 +2542,10 @@ int del_perf_probe_events(struct strlist *dellist)
> event = str;
> }
>
> + /* Skip the leading dot on powerpc */
> + if (event && *event == '.')
> + event++;

I'll defer to the perf guys, but I think you want these abstracted in an
architecture specific helper.

> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index 0783311..cc04475 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -137,6 +137,12 @@ static int choose_best_symbol(struct symbol *syma, struct symbol *symb)
> if (na >= 10 && !strncmp(syma->name, "compat_SyS", 10))
> return SYMBOL_B;
>
> + /* On powerpc, ignore the dot variants */
> + if (na >= 4 && !strncmp(syma->name, ".SyS", 4))
> + return SYMBOL_B;
> + if (na >= 11 && !strncmp(syma->name, ".compat_SyS", 11))
> + return SYMBOL_B;

And possibly this too.

cheers




2014-12-10 10:03:46

by Michael Ellerman

[permalink] [raw]
Subject: Re: [RFC PATCH 5/8] perf probe powerpc: Allow matching against dot symbols

On Tue, 2014-12-09 at 23:04 +0530, Naveen N. Rao wrote:
> Allow perf probe to work on powerpc ABIv1 without the need to specify the
> leading dot '.' for functions. 'perf probe do_fork' works with this patch.
>
> Signed-off-by: Naveen N. Rao <[email protected]>
> ---
> tools/perf/util/probe-event.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index d465f7c..174c22e 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -2221,6 +2221,15 @@ static int probe_function_filter(struct map *map __maybe_unused,
> num_matched_functions++;
> return 0;
> }
> +#ifdef __powerpc64__
> + /* Allow matching against the dot variant */
> + if (sym->name[0] == '.' && looking_function_name[0] != '.' &&
> + (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) &&
> + strcmp(looking_function_name, sym->name+1) == 0) {
> + num_matched_functions++;
> + return 0;
> + }
> +#endif

As for the previous patch, I think this should be in an arch helper.

cheers

2014-12-10 10:13:36

by Michael Ellerman

[permalink] [raw]
Subject: Re: [RFC PATCH 6/8] perf tools powerpc: Fix PPC64 ELF ABIv2 symbol decoding

On Tue, 2014-12-09 at 23:04 +0530, Naveen N. Rao wrote:
> PPC64 ELF ABIv2 has a Global Entry Point (GEP) and a Local Entry Point
> (LEP). For purposes of probing, we need the LEP. Offset to the LEP is
> encoded in st_other.
>
> diff --git a/tools/perf/arch/powerpc/util/elf-sym-decode.c b/tools/perf/arch/powerpc/util/elf-sym-decode.c
> new file mode 100644
> index 0000000..7434656
> --- /dev/null
> +++ b/tools/perf/arch/powerpc/util/elf-sym-decode.c
> @@ -0,0 +1,27 @@
> +/*
> + * Decode offset from Global Entry Point to Local Entry Point on PPC64
> + * ELF ABIv2.
> + *
> + * Derived from definitions in arch/powerpc/kernel/module_64.c
> + *
> + * Copyright (C) 2014 Ananth N Mavinakayanahalli, IBM Corporation.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version
> + * 2 of the License, or (at your option) any later version.
> + */
> +
> +#include <gelf.h>
> +#include "elf_sym.h"
> +
> +/* PowerPC64 ABIv2 specific values for the ELF64_Sym st_other field. */
> +#define STO_PPC64_LOCAL_BIT 5
> +#define STO_PPC64_LOCAL_MASK (7 << STO_PPC64_LOCAL_BIT)
> +#define PPC64_LOCAL_ENTRY_OFFSET(other) \
> + (((1 << (((other) & STO_PPC64_LOCAL_MASK) >> STO_PPC64_LOCAL_BIT)) >> 2) << 2)

You're in userspace, you should be able to get these from elf.h

> +unsigned int arch_elf_sym_decode_offset(GElf_Sym *sym)
> +{
> + return PPC64_LOCAL_ENTRY_OFFSET(sym->st_other);

What happens on ABIv1 ? We hope st_other is zero?

> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 67e4392..92a424e 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -10,6 +10,7 @@
> #include "vdso.h"
> #include <symbol/kallsyms.h>
> #include "debug.h"
> +#include "elf_sym.h"
>
> #ifndef HAVE_ELF_GETPHDRNUM_SUPPORT
> static int elf_getphdrnum(Elf *elf, size_t *dst)
> @@ -848,6 +849,13 @@ int dso__load_sym(struct dso *dso, struct map *map,
> (sym.st_value & 1))
> --sym.st_value;
>
> + /*
> + * PPC64 ELF ABIv2 encodes Local Entry Point offset in
> + * the st_other field
> + */
> + if ((map->type == MAP__FUNCTION) && sym.st_other)
> + sym.st_value += arch_elf_sym_decode_offset(&sym);

I guess no other arch has needed to do anything like this.

But if they did it's unlikely they'll want to do the exact same logic, ie.
check st_other and add some value to st_value. To make it more generically
useful you could just make it:

> + if (map->type == MAP__FUNCTION)
> + arch_elf_sym_decode(&sym);

And do any other checks in the arch routine.

cheers

2014-12-10 10:17:23

by Michael Ellerman

[permalink] [raw]
Subject: Re: [RFC PATCH 7/8] perf probe powerpc: Use DWARF info only if necessary

On Tue, 2014-12-09 at 23:04 +0530, Naveen N. Rao wrote:
> Use symbol table lookups by default if DWARF is not necessary, since
> powerpc ABIv2 encodes local entry points in the symbol table and the
> function entry address in DWARF may not be appropriate for kprobes,
> as described here:
> https://sourceware.org/bugzilla/show_bug.cgi?id=17638

Needs a better changelog.

> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index 174c22e..adcdbd2 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -2382,6 +2382,14 @@ static int convert_to_probe_trace_events(struct perf_probe_event *pev,
> }
> }
>
> +#if defined(__powerpc64__) && defined(_CALL_ELF) && _CALL_ELF == 2
> + if (!perf_probe_event_need_dwarf(pev)) {
> + ret = find_probe_trace_events_from_map(pev, tevs, max_tevs, target);
> + if (ret > 0)
> + return ret; /* Found in symbol table */
> + }
> +#endif

And should be in an arch helper, not a big powerpc wart dropped in the middle
of the generic code.

cheers

2014-12-10 10:27:18

by Naveen N. Rao

[permalink] [raw]
Subject: Re: [RFC PATCH 1/8] kprobes: Fix kallsyms lookup across powerpc ABIv1 and ABIv2

On 2014/12/10 08:37PM, Michael Ellerman wrote:
> On Tue, 2014-12-09 at 23:03 +0530, Naveen N. Rao wrote:
> > Currently, all non-dot symbols are being treated as function descriptors
> > in ABIv1. This is incorrect and is resulting in perf probe not working:
>
> I don't understand that first sentence. With ABIv1 non-dot symbols *are*
> function descriptors?

Not always. '_text' is an example of a symbol that is not a function
descriptor. However, most functions have a dot variant constituting the
actual entry point and a non-dot variant constituting the function
descriptor.

>
> > # perf probe do_fork
> > Added new event:
> > Failed to write event: Invalid argument
> > Error: Failed to add events.
> > # dmesg | tail -1
> > [192268.073063] Could not insert probe at _text+768432: -22
> >
> > _text is being resolved incorrectly and is resulting in the above error.
> > Fix this by changing how we lookup symbol addresses on ppc64. We first
> > check for the dot variant of a symbol and look at the non-dot variant
> > only if that fails. In this manner, we avoid having to look at the
> > function descriptor.
>
> I'm not clear that ppc_local_function_entry() makes sense. On ABIv2 you return
> the local entry point, which is fine. But on ABIv1 you just return the
> unmodified address, which will be the descriptor if you actually passed it a
> function pointer. I think you're assuming that you're passed the text address,
> but if that's the case the function is badly named at least.
>
> I also don't understand why we need to ever guess which ABI we're using. We
> know which ABI we're built with, so there should be no guess work required.
>
> So at the very least this needs much more explanation.
>
> But to be honest I'm not clear why it even needs a kernel change, don't we just
> need perf to understand dot symbols?

The problem in this case is in the kernel. perf probe is now basing all
probe addresses on _text and writes, for example, "p:probe/do_fork
_text+768432" to /sys/kernel/debug/tracing/kprobe_events.

This ends up in kprobe_lookup_name() for resolving address of _text,
which invokes ppc_function_entry(), which ends up thinking _text is a
function descriptor.

Even though we know we are compiled for ABIv1, there is no easy way to
identify if a given symbol is the actual entry point or if it is a
function descriptor. To address this, my approach is to always check for
a dot symbol first and if that exists, we know we have the actual
function entry. If not, we know this isn't a function descriptor (since
there is no related dot symbol).

I agree that the function is named badly though. The real problem is
that kprobe_lookup_name is a macro and I can't have a #ifdef to call
ppc_function_entry() only for ABIv2.

Thoughts? Suggestions?

Thanks,
Naveen

2014-12-10 10:42:06

by Naveen N. Rao

[permalink] [raw]
Subject: Re: [RFC PATCH 2/8] perf probe powerpc: Fix symbol fixup issues due to ELF type

On 2014/12/10 08:50PM, Michael Ellerman wrote:
> On Tue, 2014-12-09 at 23:04 +0530, Naveen N. Rao wrote:
> > If using the symbol table, symbol addresses are not being fixed up
> > properly, resulting in probes being placed at wrong addresses:
> >
> > # perf probe do_fork
> > Added new event:
> > probe:do_fork (on do_fork)
> >
> > You can now use it in all perf tools, such as:
> >
> > perf record -e probe:do_fork -aR sleep 1
> >
> > # cat /sys/kernel/debug/tracing/kprobe_events
> > p:probe/do_fork _text+635952
> > # printf "%x" 635952
> > 9b430
> > # grep do_fork /boot/System.map
> > c0000000000ab430 T .do_fork
>
> OK, but why is that happening? And why does checking for ET_DYN fix it?

The section header indicates 0x10000 as the offset:

Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS c000000000000000 00010000
0000000000806678 0000000000000000 AX 0 0 256

This is used during fixup and perf only expects this to be needed for
ET_EXEC, though we use ET_DYN on ppc64.

> > Fix by checking for ELF type ET_DYN used by ppc64 kernels.
>
> We sometimes produce ET_DYN kernels, but only if CONFIG_RELOCATABLE=y.

Ok.

- Naveen

2014-12-10 10:59:45

by Naveen N. Rao

[permalink] [raw]
Subject: Re: [RFC PATCH 3/8] perf probe: Improve detection of file/function name in the probe pattern

On 2014/12/10 09:00PM, Michael Ellerman wrote:
> On Tue, 2014-12-09 at 23:04 +0530, Naveen N. Rao wrote:
> > Currently, perf probe considers patterns including a '.' to be a file.
> > However, this causes problems on powerpc ABIv1 where all functions have
> > a leading '.':
> >
> > $ perf probe -F | grep schedule_timeout_interruptible
> > .schedule_timeout_interruptible
> > $ perf probe .schedule_timeout_interruptible
> > Semantic error :File always requires line number or lazy pattern.
> > Error: Command Parse Error.
> >
> > Fix this by checking the probe pattern in more detail.
> >
> > Signed-off-by: Naveen N. Rao <[email protected]>
> > ---
> > tools/perf/util/probe-event.c | 23 ++++++++++++++++++++---
> > 1 file changed, 20 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> > index c150ca4..c7e01ef 100644
> > --- a/tools/perf/util/probe-event.c
> > +++ b/tools/perf/util/probe-event.c
> > @@ -999,6 +999,24 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
> > arg = tmp;
> > }
> >
> > + /*
> > + * Check arg is function or file name and copy it.
> > + *
> > + * We consider arg to be a file spec if and only if it satisfies
> > + * all of the below criteria::
> > + * - it does not include any of "+@%",
> > + * - it includes one of ":;", and
> > + * - it has a period '.' in the name.
>
> I don't think we need to be this elaborate.
>
> AFAIK there are no source files in the kernel that start with '.'
>
> So if the arg starts with '.' it must be a function?

Indeed, but this is also used for parsing uprobes. So, I coded this
based on the spec for the probe pattern.

- Naveen

2014-12-10 11:12:35

by Michael Ellerman

[permalink] [raw]
Subject: Re: [RFC PATCH 3/8] perf probe: Improve detection of file/function name in the probe pattern

On Wed, 2014-12-10 at 16:29 +0530, Naveen N. Rao wrote:
> On 2014/12/10 09:00PM, Michael Ellerman wrote:
> > On Tue, 2014-12-09 at 23:04 +0530, Naveen N. Rao wrote:
> > > Currently, perf probe considers patterns including a '.' to be a file.
> > > However, this causes problems on powerpc ABIv1 where all functions have
> > > a leading '.':
> > >
> > > $ perf probe -F | grep schedule_timeout_interruptible
> > > .schedule_timeout_interruptible
> > > $ perf probe .schedule_timeout_interruptible
> > > Semantic error :File always requires line number or lazy pattern.
> > > Error: Command Parse Error.
> > >
> > > Fix this by checking the probe pattern in more detail.
> > >
> > > Signed-off-by: Naveen N. Rao <[email protected]>
> > > ---
> > > tools/perf/util/probe-event.c | 23 ++++++++++++++++++++---
> > > 1 file changed, 20 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> > > index c150ca4..c7e01ef 100644
> > > --- a/tools/perf/util/probe-event.c
> > > +++ b/tools/perf/util/probe-event.c
> > > @@ -999,6 +999,24 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
> > > arg = tmp;
> > > }
> > >
> > > + /*
> > > + * Check arg is function or file name and copy it.
> > > + *
> > > + * We consider arg to be a file spec if and only if it satisfies
> > > + * all of the below criteria::
> > > + * - it does not include any of "+@%",
> > > + * - it includes one of ":;", and
> > > + * - it has a period '.' in the name.
> >
> > I don't think we need to be this elaborate.
> >
> > AFAIK there are no source files in the kernel that start with '.'
> >
> > So if the arg starts with '.' it must be a function?
>
> Indeed, but this is also used for parsing uprobes. So, I coded this
> based on the spec for the probe pattern.

OK. It also seems unlikely you'll want a uprobe on a file that starts with a . ?

I'll leave it up to the perf guys to decide if they're happy with it.

cheers

2014-12-10 11:21:49

by Naveen N. Rao

[permalink] [raw]
Subject: Re: [RFC PATCH 6/8] perf tools powerpc: Fix PPC64 ELF ABIv2 symbol decoding

On 2014/12/10 09:13PM, Michael Ellerman wrote:
> On Tue, 2014-12-09 at 23:04 +0530, Naveen N. Rao wrote:
> > PPC64 ELF ABIv2 has a Global Entry Point (GEP) and a Local Entry Point
> > (LEP). For purposes of probing, we need the LEP. Offset to the LEP is
> > encoded in st_other.
> >
> > diff --git a/tools/perf/arch/powerpc/util/elf-sym-decode.c b/tools/perf/arch/powerpc/util/elf-sym-decode.c
> > new file mode 100644
> > index 0000000..7434656
> > --- /dev/null
> > +++ b/tools/perf/arch/powerpc/util/elf-sym-decode.c
> > @@ -0,0 +1,27 @@
> > +/*
> > + * Decode offset from Global Entry Point to Local Entry Point on PPC64
> > + * ELF ABIv2.
> > + *
> > + * Derived from definitions in arch/powerpc/kernel/module_64.c
> > + *
> > + * Copyright (C) 2014 Ananth N Mavinakayanahalli, IBM Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > + * as published by the Free Software Foundation; either version
> > + * 2 of the License, or (at your option) any later version.
> > + */
> > +
> > +#include <gelf.h>
> > +#include "elf_sym.h"
> > +
> > +/* PowerPC64 ABIv2 specific values for the ELF64_Sym st_other field. */
> > +#define STO_PPC64_LOCAL_BIT 5
> > +#define STO_PPC64_LOCAL_MASK (7 << STO_PPC64_LOCAL_BIT)
> > +#define PPC64_LOCAL_ENTRY_OFFSET(other) \
> > + (((1 << (((other) & STO_PPC64_LOCAL_MASK) >> STO_PPC64_LOCAL_BIT)) >> 2) << 2)
>
> You're in userspace, you should be able to get these from elf.h

Ah, ok.

>
> > +unsigned int arch_elf_sym_decode_offset(GElf_Sym *sym)
> > +{
> > + return PPC64_LOCAL_ENTRY_OFFSET(sym->st_other);
>
> What happens on ABIv1 ? We hope st_other is zero?

Yes, st_other is zero in ABIv1.

>
> > diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> > index 67e4392..92a424e 100644
> > --- a/tools/perf/util/symbol-elf.c
> > +++ b/tools/perf/util/symbol-elf.c
> > @@ -10,6 +10,7 @@
> > #include "vdso.h"
> > #include <symbol/kallsyms.h>
> > #include "debug.h"
> > +#include "elf_sym.h"
> >
> > #ifndef HAVE_ELF_GETPHDRNUM_SUPPORT
> > static int elf_getphdrnum(Elf *elf, size_t *dst)
> > @@ -848,6 +849,13 @@ int dso__load_sym(struct dso *dso, struct map *map,
> > (sym.st_value & 1))
> > --sym.st_value;
> >
> > + /*
> > + * PPC64 ELF ABIv2 encodes Local Entry Point offset in
> > + * the st_other field
> > + */
> > + if ((map->type == MAP__FUNCTION) && sym.st_other)
> > + sym.st_value += arch_elf_sym_decode_offset(&sym);
>
> I guess no other arch has needed to do anything like this.
>
> But if they did it's unlikely they'll want to do the exact same logic, ie.
> check st_other and add some value to st_value. To make it more generically
> useful you could just make it:
>
> > + if (map->type == MAP__FUNCTION)
> > + arch_elf_sym_decode(&sym);
>
> And do any other checks in the arch routine.

Sure. Makes sense.

- Naveen

2014-12-10 11:49:24

by Naveen N. Rao

[permalink] [raw]
Subject: Re: [RFC PATCH 7/8] perf probe powerpc: Use DWARF info only if necessary

On 2014/12/10 09:17PM, Michael Ellerman wrote:
> On Tue, 2014-12-09 at 23:04 +0530, Naveen N. Rao wrote:
> > Use symbol table lookups by default if DWARF is not necessary, since
> > powerpc ABIv2 encodes local entry points in the symbol table and the
> > function entry address in DWARF may not be appropriate for kprobes,
> > as described here:
> > https://sourceware.org/bugzilla/show_bug.cgi?id=17638
>
> Needs a better changelog.

Ok. Will add, but to elaborate quickly: DWARF will only include the
entire function in low_pc/entry_pc and high_pc. It can't indicate the
local entry point. Hence, we need to use the symbol table instead of
DWARF on ABIv2.

>
> > diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> > index 174c22e..adcdbd2 100644
> > --- a/tools/perf/util/probe-event.c
> > +++ b/tools/perf/util/probe-event.c
> > @@ -2382,6 +2382,14 @@ static int convert_to_probe_trace_events(struct perf_probe_event *pev,
> > }
> > }
> >
> > +#if defined(__powerpc64__) && defined(_CALL_ELF) && _CALL_ELF == 2
> > + if (!perf_probe_event_need_dwarf(pev)) {
> > + ret = find_probe_trace_events_from_map(pev, tevs, max_tevs, target);
> > + if (ret > 0)
> > + return ret; /* Found in symbol table */
> > + }
> > +#endif
>
> And should be in an arch helper, not a big powerpc wart dropped in the middle
> of the generic code.

Sure - will change.

Thanks for the review!
- Naveen