2014-02-03 18:18:45

by Jean Pihet

[permalink] [raw]
Subject: [PATCH v6 0/3] perf: AARCH64 arch support

Add AARCH64 specific support. This includes the following:
- AARCH64 perf registers definition and hooks,
- compat mode registers use, i.e. profiling a 32-bit binary on
a 64-bit system,
- unwinding using the dwarf information from the .debug_frame
section of the ELF binary,
- unwinding using the frame pointer information; in 64-bit and
compat modes.

Notes:
- the tools/perf change is submitted separately on LKML,
- support for unwinding using the dwarf information in compat
mode requires some changes to the libunwind code. Those changes
have been submitted on the libunwind ML and are in discussion.

Tested on ARMv7, ARMv8 and x86_64 platforms. The compat mode has been
tested on ARMv8 using statically built 32-bit binaries.


Jean Pihet (3):
ARM64: perf: add support for perf registers API
ARM64: perf: add support for frame pointer unwinding in compat mode
ARM64: perf: support dwarf unwinding in compat mode

arch/arm64/Kconfig | 2 +
arch/arm64/include/asm/compat.h | 2 +-
arch/arm64/include/asm/ptrace.h | 3 +-
arch/arm64/include/uapi/asm/Kbuild | 1 +
arch/arm64/include/uapi/asm/perf_regs.h | 40 ++++++++++++++++++
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/perf_event.c | 75 +++++++++++++++++++++++++++++----
arch/arm64/kernel/perf_regs.c | 44 +++++++++++++++++++
8 files changed, 158 insertions(+), 10 deletions(-)
create mode 100644 arch/arm64/include/uapi/asm/perf_regs.h
create mode 100644 arch/arm64/kernel/perf_regs.c

--
1.7.11.7


2014-02-03 18:18:49

by Jean Pihet

[permalink] [raw]
Subject: [PATCH 2/3] ARM64: perf: add support for frame pointer unwinding in compat mode

When profiling a 32-bit application, user space callchain unwinding
using the frame pointer is performed in compat mode. The code is taken
over from the AARCH32 code and adapted to work on AARCH64.

Signed-off-by: Jean Pihet <[email protected]>
Acked-by: Will Deacon <[email protected]>
---
arch/arm64/kernel/perf_event.c | 75 +++++++++++++++++++++++++++++++++++++-----
1 file changed, 67 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 5b1cd79..e868c72 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -1348,8 +1348,8 @@ early_initcall(init_hw_perf_events);
* Callchain handling code.
*/
struct frame_tail {
- struct frame_tail __user *fp;
- unsigned long lr;
+ struct frame_tail __user *fp;
+ unsigned long lr;
} __attribute__((packed));

/*
@@ -1386,22 +1386,80 @@ user_backtrace(struct frame_tail __user *tail,
return buftail.fp;
}

+/*
+ * The registers we're interested in are at the end of the variable
+ * length saved register structure. The fp points at the end of this
+ * structure so the address of this struct is:
+ * (struct compat_frame_tail *)(xxx->fp)-1
+ *
+ * This code has been adapted from the ARM OProfile support.
+ */
+struct compat_frame_tail {
+ compat_uptr_t fp; /* a (struct compat_frame_tail *) in compat mode */
+ u32 sp;
+ u32 lr;
+} __attribute__((packed));
+
+static struct compat_frame_tail __user *
+compat_user_backtrace(struct compat_frame_tail __user *tail,
+ struct perf_callchain_entry *entry)
+{
+ struct compat_frame_tail buftail;
+ unsigned long err;
+
+ /* Also check accessibility of one struct frame_tail beyond */
+ if (!access_ok(VERIFY_READ, tail, sizeof(buftail)))
+ return NULL;
+
+ pagefault_disable();
+ err = __copy_from_user_inatomic(&buftail, tail, sizeof(buftail));
+ pagefault_enable();
+
+ if (err)
+ return NULL;
+
+ perf_callchain_store(entry, buftail.lr);
+
+ /*
+ * Frame pointers should strictly progress back up the stack
+ * (towards higher addresses).
+ */
+ if (tail + 1 >= (struct compat_frame_tail __user *)
+ compat_ptr(buftail.fp))
+ return NULL;
+
+ return (struct compat_frame_tail __user *)compat_ptr(buftail.fp) - 1;
+}
+
void perf_callchain_user(struct perf_callchain_entry *entry,
struct pt_regs *regs)
{
- struct frame_tail __user *tail;
-
if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
/* We don't support guest os callchain now */
return;
}

perf_callchain_store(entry, regs->pc);
- tail = (struct frame_tail __user *)regs->regs[29];

- while (entry->nr < PERF_MAX_STACK_DEPTH &&
- tail && !((unsigned long)tail & 0xf))
- tail = user_backtrace(tail, entry);
+ if (!compat_user_mode(regs)) {
+ /* AARCH64 mode */
+ struct frame_tail __user *tail;
+
+ tail = (struct frame_tail __user *)regs->regs[29];
+
+ while (entry->nr < PERF_MAX_STACK_DEPTH &&
+ tail && !((unsigned long)tail & 0xf))
+ tail = user_backtrace(tail, entry);
+ } else {
+ /* AARCH32 compat mode */
+ struct compat_frame_tail __user *tail;
+
+ tail = (struct compat_frame_tail __user *)regs->compat_fp - 1;
+
+ while ((entry->nr < PERF_MAX_STACK_DEPTH) &&
+ tail && !((unsigned long)tail & 0x3))
+ tail = compat_user_backtrace(tail, entry);
+ }
}

/*
@@ -1429,6 +1487,7 @@ void perf_callchain_kernel(struct perf_callchain_entry *entry,
frame.fp = regs->regs[29];
frame.sp = regs->sp;
frame.pc = regs->pc;
+
walk_stackframe(&frame, callchain_trace, entry);
}

--
1.7.11.7

2014-02-03 18:18:53

by Jean Pihet

[permalink] [raw]
Subject: [PATCH 3/3] ARM64: perf: support dwarf unwinding in compat mode

Add support for unwinding using the dwarf information in compat
mode. Using the correct user stack pointer allows perf to record
the frames correctly in the native and compat modes.

Note that although the dwarf frame unwinding works ok using
libunwind in native mode (on ARMv7 & ARMv8), some changes are
required to the libunwind code for the compat mode. Those changes
are posted separately on the libunwind mailing list.

Tested on ARMv8 platform with v8 and compat v7 binaries, the latter
are statically built.

Signed-off-by: Jean Pihet <[email protected]>
Acked-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/compat.h | 2 +-
arch/arm64/include/asm/ptrace.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h
index fda2704..e71f81f 100644
--- a/arch/arm64/include/asm/compat.h
+++ b/arch/arm64/include/asm/compat.h
@@ -228,7 +228,7 @@ static inline compat_uptr_t ptr_to_compat(void __user *uptr)
return (u32)(unsigned long)uptr;
}

-#define compat_user_stack_pointer() (current_pt_regs()->compat_sp)
+#define compat_user_stack_pointer() (user_stack_pointer(current_pt_regs()))

static inline void __user *arch_compat_alloc_user_space(long len)
{
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index fbb0020..86d5b54 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -133,7 +133,7 @@ struct pt_regs {
(!((regs)->pstate & PSR_F_BIT))

#define user_stack_pointer(regs) \
- ((regs)->sp)
+ (!compat_user_mode(regs)) ? ((regs)->sp) : ((regs)->compat_sp)

/*
* Are the current registers suitable for user mode? (used to maintain
--
1.7.11.7

2014-02-03 18:19:23

by Jean Pihet

[permalink] [raw]
Subject: [PATCH 1/3] ARM64: perf: add support for perf registers API

This patch implements the functions required for the perf registers API,
allowing the perf tool to interface kernel register dumps with libunwind
in order to provide userspace backtracing.
Compat mode is also supported.

Only the general purpose user space registers are exported, i.e.:
PERF_REG_ARM_X0,
...
PERF_REG_ARM_X28,
PERF_REG_ARM_FP,
PERF_REG_ARM_LR,
PERF_REG_ARM_SP,
PERF_REG_ARM_PC
and not the PERF_REG_ARM_V* registers.

Signed-off-by: Jean Pihet <[email protected]>
Acked-by: Will Deacon <[email protected]>
---
arch/arm64/Kconfig | 2 ++
arch/arm64/include/asm/ptrace.h | 1 +
arch/arm64/include/uapi/asm/Kbuild | 1 +
arch/arm64/include/uapi/asm/perf_regs.h | 40 ++++++++++++++++++++++++++++++
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/perf_regs.c | 44 +++++++++++++++++++++++++++++++++
6 files changed, 89 insertions(+)
create mode 100644 arch/arm64/include/uapi/asm/perf_regs.h
create mode 100644 arch/arm64/kernel/perf_regs.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index dd4327f..e9899bb 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -37,6 +37,8 @@ config ARM64
select HAVE_HW_BREAKPOINT if PERF_EVENTS
select HAVE_MEMBLOCK
select HAVE_PERF_EVENTS
+ select HAVE_PERF_REGS
+ select HAVE_PERF_USER_STACK_DUMP
select IRQ_DOMAIN
select MODULES_USE_ELF_RELA
select NO_BOOTMEM
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 0e7fa49..fbb0020 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -68,6 +68,7 @@

/* Architecturally defined mapping between AArch32 and AArch64 registers */
#define compat_usr(x) regs[(x)]
+#define compat_fp regs[11]
#define compat_sp regs[13]
#define compat_lr regs[14]
#define compat_sp_hyp regs[15]
diff --git a/arch/arm64/include/uapi/asm/Kbuild b/arch/arm64/include/uapi/asm/Kbuild
index e4b78bd..942376d 100644
--- a/arch/arm64/include/uapi/asm/Kbuild
+++ b/arch/arm64/include/uapi/asm/Kbuild
@@ -9,6 +9,7 @@ header-y += byteorder.h
header-y += fcntl.h
header-y += hwcap.h
header-y += kvm_para.h
+header-y += perf_regs.h
header-y += param.h
header-y += ptrace.h
header-y += setup.h
diff --git a/arch/arm64/include/uapi/asm/perf_regs.h b/arch/arm64/include/uapi/asm/perf_regs.h
new file mode 100644
index 0000000..172b831
--- /dev/null
+++ b/arch/arm64/include/uapi/asm/perf_regs.h
@@ -0,0 +1,40 @@
+#ifndef _ASM_ARM64_PERF_REGS_H
+#define _ASM_ARM64_PERF_REGS_H
+
+enum perf_event_arm_regs {
+ PERF_REG_ARM64_X0,
+ PERF_REG_ARM64_X1,
+ PERF_REG_ARM64_X2,
+ PERF_REG_ARM64_X3,
+ PERF_REG_ARM64_X4,
+ PERF_REG_ARM64_X5,
+ PERF_REG_ARM64_X6,
+ PERF_REG_ARM64_X7,
+ PERF_REG_ARM64_X8,
+ PERF_REG_ARM64_X9,
+ PERF_REG_ARM64_X10,
+ PERF_REG_ARM64_X11,
+ PERF_REG_ARM64_X12,
+ PERF_REG_ARM64_X13,
+ PERF_REG_ARM64_X14,
+ PERF_REG_ARM64_X15,
+ PERF_REG_ARM64_X16,
+ PERF_REG_ARM64_X17,
+ PERF_REG_ARM64_X18,
+ PERF_REG_ARM64_X19,
+ PERF_REG_ARM64_X20,
+ PERF_REG_ARM64_X21,
+ PERF_REG_ARM64_X22,
+ PERF_REG_ARM64_X23,
+ PERF_REG_ARM64_X24,
+ PERF_REG_ARM64_X25,
+ PERF_REG_ARM64_X26,
+ PERF_REG_ARM64_X27,
+ PERF_REG_ARM64_X28,
+ PERF_REG_ARM64_X29,
+ PERF_REG_ARM64_LR,
+ PERF_REG_ARM64_SP,
+ PERF_REG_ARM64_PC,
+ PERF_REG_ARM64_MAX,
+};
+#endif /* _ASM_ARM64_PERF_REGS_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 2d4554b..9a5d592 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -15,6 +15,7 @@ arm64-obj-$(CONFIG_COMPAT) += sys32.o kuser32.o signal32.o \
sys_compat.o
arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
arm64-obj-$(CONFIG_SMP) += smp.o smp_spin_table.o
+arm64-obj-$(CONFIG_PERF_EVENTS) += perf_regs.o
arm64-obj-$(CONFIG_HW_PERF_EVENTS) += perf_event.o
arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT)+= hw_breakpoint.o
arm64-obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
diff --git a/arch/arm64/kernel/perf_regs.c b/arch/arm64/kernel/perf_regs.c
new file mode 100644
index 0000000..f2d6f0a
--- /dev/null
+++ b/arch/arm64/kernel/perf_regs.c
@@ -0,0 +1,44 @@
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/perf_event.h>
+#include <linux/bug.h>
+#include <asm/perf_regs.h>
+#include <asm/ptrace.h>
+
+u64 perf_reg_value(struct pt_regs *regs, int idx)
+{
+ if (WARN_ON_ONCE((u32)idx >= PERF_REG_ARM64_MAX))
+ return 0;
+
+ /*
+ * Compat (i.e. 32 bit) mode:
+ * - PC has been set in the pt_regs struct in kernel_entry,
+ * - Handle SP and LR here.
+ */
+ if (compat_user_mode(regs)) {
+ if ((u32)idx == PERF_REG_ARM64_SP)
+ return regs->compat_sp;
+ if ((u32)idx == PERF_REG_ARM64_LR)
+ return regs->compat_lr;
+ }
+
+ return regs->regs[idx];
+}
+
+#define REG_RESERVED (~((1ULL << PERF_REG_ARM64_MAX) - 1))
+
+int perf_reg_validate(u64 mask)
+{
+ if (!mask || mask & REG_RESERVED)
+ return -EINVAL;
+
+ return 0;
+}
+
+u64 perf_reg_abi(struct task_struct *task)
+{
+ if (is_compat_thread(task_thread_info(task)))
+ return PERF_SAMPLE_REGS_ABI_32;
+ else
+ return PERF_SAMPLE_REGS_ABI_64;
+}
--
1.7.11.7

2014-02-12 08:47:19

by Jean Pihet

[permalink] [raw]
Subject: Re: [PATCH v6 0/3] perf: AARCH64 arch support

Hi Will,

Ping on the series. Is this OK for inclusion?

Regards,
Jean

On 3 February 2014 19:18, Jean Pihet <[email protected]> wrote:
> Add AARCH64 specific support. This includes the following:
> - AARCH64 perf registers definition and hooks,
> - compat mode registers use, i.e. profiling a 32-bit binary on
> a 64-bit system,
> - unwinding using the dwarf information from the .debug_frame
> section of the ELF binary,
> - unwinding using the frame pointer information; in 64-bit and
> compat modes.
>
> Notes:
> - the tools/perf change is submitted separately on LKML,
> - support for unwinding using the dwarf information in compat
> mode requires some changes to the libunwind code. Those changes
> have been submitted on the libunwind ML and are in discussion.
>
> Tested on ARMv7, ARMv8 and x86_64 platforms. The compat mode has been
> tested on ARMv8 using statically built 32-bit binaries.
>
>
> Jean Pihet (3):
> ARM64: perf: add support for perf registers API
> ARM64: perf: add support for frame pointer unwinding in compat mode
> ARM64: perf: support dwarf unwinding in compat mode
>
> arch/arm64/Kconfig | 2 +
> arch/arm64/include/asm/compat.h | 2 +-
> arch/arm64/include/asm/ptrace.h | 3 +-
> arch/arm64/include/uapi/asm/Kbuild | 1 +
> arch/arm64/include/uapi/asm/perf_regs.h | 40 ++++++++++++++++++
> arch/arm64/kernel/Makefile | 1 +
> arch/arm64/kernel/perf_event.c | 75 +++++++++++++++++++++++++++++----
> arch/arm64/kernel/perf_regs.c | 44 +++++++++++++++++++
> 8 files changed, 158 insertions(+), 10 deletions(-)
> create mode 100644 arch/arm64/include/uapi/asm/perf_regs.h
> create mode 100644 arch/arm64/kernel/perf_regs.c
>
> --
> 1.7.11.7
>

2014-02-12 10:11:54

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v6 0/3] perf: AARCH64 arch support

On Wed, Feb 12, 2014 at 08:47:17AM +0000, Jean Pihet wrote:
> Ping on the series. Is this OK for inclusion?

I already acked it all, but you'll need to wait until the next merge window
since this isn't a fix.

Will