2014-07-07 13:45:33

by Jean Pihet

[permalink] [raw]
Subject: [PATCH 0/3] ARM: perf: allow tracing with kernel tracepoints events

- Robustify the user backtrace code, as done on other architectures.
- Provide the symbols resolution when triggering from tracepoints.

Big thanks to Steve Capper for the help in debugging and rephrasing the
commits descriptions.

Stress tested with perf record and tracepoints triggering (-e <tracepoint>),
with unwinding using fp (--call-graph fp) and dwarf info (--call-graph dwarf).


Jean Pihet (3):
ARM: perf: Check that current->mm is alive before getting user
callchain
ARM: perf: disable the pagefault handler when reading from user space
ARM: perf: allow tracing with kernel tracepoints events

arch/arm/include/asm/perf_event.h | 19 +++++++++++++++++++
arch/arm/kernel/perf_event.c | 13 +++++++++++--
2 files changed, 30 insertions(+), 2 deletions(-)

--
1.8.1.2


2014-07-07 13:45:35

by Jean Pihet

[permalink] [raw]
Subject: [PATCH 1/3] ARM: perf: Check that current->mm is alive before getting user callchain

An event may occur when an mm is already released.

As per commit 20afc60f892d285fde179ead4b24e6a7938c2f1b
'x86, perf: Check that current->mm is alive before getting user callchain'

Signed-off-by: Jean Pihet <[email protected]>
Cc: Will Deacon <[email protected]>
Acked-by: Will Deacon <[email protected]>
---
arch/arm/kernel/perf_event.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 4238bcb..6493c4c 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -590,6 +590,10 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
}

perf_callchain_store(entry, regs->ARM_pc);
+
+ if (!current->mm)
+ return;
+
tail = (struct frame_tail __user *)regs->ARM_fp - 1;

while ((entry->nr < PERF_MAX_STACK_DEPTH) &&
--
1.8.1.2

2014-07-07 13:45:39

by Jean Pihet

[permalink] [raw]
Subject: [PATCH 3/3] ARM: perf: allow tracing with kernel tracepoints events

When tracing with tracepoints events the IP and CPSR are set to 0,
preventing the perf code to resolve the symbols:

./perf record -e kmem:kmalloc cal
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.007 MB perf.data (~321 samples) ]

./perf report
Overhead Command Shared Object Symbol
........ ....... ............. ...........
40.78% cal [unknown] [.]00000000
31.6% cal [unknown] [.]00000000

The examination of the gathered samples (perf report -D) shows the IP
is set to 0 and that the samples are considered as user space samples,
while the IP should be set from the registers and the samples should be
considered as kernel samples.

The fix is to implement perf_arch_fetch_caller_regs for ARM, which
fills the necessary registers used for the callchain unwinding and
to determine the user/kernel space property of the samples:
- program counter for PERF_SAMPLE_IP, as used by dwarf unwinding,
- sp & fp for fp based callchain unwinding,
- cpsr for user_mode() tests.

Tested with perf record and tracepoints triggering (-e <tracepoint>), with
unwinding using fp (--call-graph fp) and dwarf info (--call-graph dwarf).

Reported by Sneha Priya on linaro-dev, cf.
http://lists.linaro.org/pipermail/linaro-dev/2014-May/017151.html

Signed-off-by: Jean Pihet <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Steve Capper <[email protected]>
Reported-by: Sneha Priya <[email protected]>
---
arch/arm/include/asm/perf_event.h | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)

diff --git a/arch/arm/include/asm/perf_event.h b/arch/arm/include/asm/perf_event.h
index 7558775..b02b5d3 100644
--- a/arch/arm/include/asm/perf_event.h
+++ b/arch/arm/include/asm/perf_event.h
@@ -26,6 +26,25 @@ struct pt_regs;
extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
extern unsigned long perf_misc_flags(struct pt_regs *regs);
#define perf_misc_flags(regs) perf_misc_flags(regs)
+
+/*
+ * Take a snapshot of the regs.
+ * We only need a few of the regs:
+ * - program counter for PERF_SAMPLE_IP, as used by dwarf unwinding,
+ * - sp & fp for fp based callchain unwinding,
+ * - cpsr for user_mode() tests.
+ */
+#define perf_arch_fetch_caller_regs(regs, __ip) { \
+ instruction_pointer(regs)= (__ip); \
+ __asm__ ( \
+ "str sp, %[_ARM_sp] \n\t" \
+ "str fp, %[_ARM_fp] \n\t" \
+ "mrs %[_ARM_cpsr], cpsr \n\t" \
+ : [_ARM_sp] "=m" (regs->ARM_sp), \
+ [_ARM_fp] "=m" (regs->ARM_fp), \
+ [_ARM_cpsr] "=r" (regs->ARM_cpsr) \
+ ); \
+}
#endif

#endif /* __ARM_PERF_EVENT_H__ */
--
1.8.1.2

2014-07-07 13:45:54

by Jean Pihet

[permalink] [raw]
Subject: [PATCH 2/3] ARM: perf: disable the pagefault handler when reading from user space

Under perf, the fp unwinding scheme requires access to user space memory
and can provoke a pagefault via call to __copy_from_user_inatomic from
user_backtrace. This unwinding can take place in response to an interrupt
(__perf_event_overflow). This is undesirable as we may already have
mmap_sem held for write. One example being a process that calls mprotect
just as a the PMU counters overflow.

An example that can provoke this behaviour:
perf record -e event:tocapture --call-graph fp ./application_to_test

This patch addresses this issue by disabling pagefaults briefly in
user_backtrace (as is done in the other architectures: ARM64, x86, Sparc etc.).

Without the patch a deadlock occurs when __perf_event_overflow is called
while reading the data from the user space:

[ INFO: possible recursive locking detected ]
3.16.0-rc2-00038-g0ed7ff6 #46 Not tainted
---------------------------------------------
stress/1634 is trying to acquire lock:
(&mm->mmap_sem){++++++}, at: [<c001dc04>] do_page_fault+0xa8/0x428

but task is already holding lock:
(&mm->mmap_sem){++++++}, at: [<c00f4098>] SyS_mprotect+0xa8/0x1c8

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&mm->mmap_sem);
lock(&mm->mmap_sem);

*** DEADLOCK ***

May be due to missing lock nesting notation

2 locks held by stress/1634:
#0: (&mm->mmap_sem){++++++}, at: [<c00f4098>] SyS_mprotect+0xa8/0x1c8
#1: (rcu_read_lock){......}, at: [<c00c29dc>] __perf_event_overflow+0x120/0x294

stack backtrace:
CPU: 1 PID: 1634 Comm: stress Not tainted 3.16.0-rc2-00038-g0ed7ff6 #46
[<c0017c8c>] (unwind_backtrace) from [<c0012eec>] (show_stack+0x20/0x24)
[<c0012eec>] (show_stack) from [<c04de914>] (dump_stack+0x7c/0x98)
[<c04de914>] (dump_stack) from [<c006a360>] (__lock_acquire+0x1484/0x1cf0)
[<c006a360>] (__lock_acquire) from [<c006b14c>] (lock_acquire+0xa4/0x11c)
[<c006b14c>] (lock_acquire) from [<c04e3880>] (down_read+0x40/0x7c)
[<c04e3880>] (down_read) from [<c001dc04>] (do_page_fault+0xa8/0x428)
[<c001dc04>] (do_page_fault) from [<c00084ec>] (do_DataAbort+0x44/0xa8)
[<c00084ec>] (do_DataAbort) from [<c0013a1c>] (__dabt_svc+0x3c/0x60)
Exception stack(0xed7c5ae0 to 0xed7c5b28)
5ae0: ed7c5b5c b6dadff4 ffffffec 00000000 b6dadff4 ebc08000 00000000 ebc08000
5b00: 0000007e 00000000 ed7c4000 ed7c5b94 00000014 ed7c5b2c c001a438 c0236c60
5b20: 00000013 ffffffff
[<c0013a1c>] (__dabt_svc) from [<c0236c60>] (__copy_from_user+0xa4/0x3a4)

Signed-off-by: Jean Pihet <[email protected]>
Cc: Will Deacon <[email protected]>
Acked-by: Steve Capper <[email protected]>
---
arch/arm/kernel/perf_event.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 6493c4c..f5aeca2 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -560,11 +560,16 @@ user_backtrace(struct frame_tail __user *tail,
struct perf_callchain_entry *entry)
{
struct frame_tail buftail;
+ unsigned long err;

- /* Also check accessibility of one struct frame_tail beyond */
if (!access_ok(VERIFY_READ, tail, sizeof(buftail)))
return NULL;
- if (__copy_from_user_inatomic(&buftail, tail, sizeof(buftail)))
+
+ pagefault_disable();
+ err = __copy_from_user_inatomic(&buftail, tail, sizeof(buftail));
+ pagefault_enable();
+
+ if (err)
return NULL;

perf_callchain_store(entry, buftail.lr);
--
1.8.1.2

2014-07-09 14:05:21

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH 0/3] ARM: perf: allow tracing with kernel tracepoints events

Hi Jean,

On Mon, Jul 07, 2014 at 02:45:07PM +0100, Jean Pihet wrote:
> - Robustify the user backtrace code, as done on other architectures.
> - Provide the symbols resolution when triggering from tracepoints.
>
> Big thanks to Steve Capper for the help in debugging and rephrasing the
> commits descriptions.
>
> Stress tested with perf record and tracepoints triggering (-e <tracepoint>),
> with unwinding using fp (--call-graph fp) and dwarf info (--call-graph dwarf).

I've taken the first two in this series, but patch 3 still doesn't address
my concerns about saving the required register state for unwinding.

Thanks!

Will