2021-05-21 09:59:25

by Naveen N. Rao

Subject: [RFC PATCH 0/6] powerpc: Stack tracer fixes

While the stack tracer worked fine with ppc64 ELFv1 ABI, it was never
ported/enabled to run properly with ppc64 ELFv2 ABI, -mprofile-kernel in
particular. The primary issue is that the call to ftrace happens before
a function sets up its own stackframe. Due to this, the traced function
never shows up in the stack trace. This produces confusing results,
especially with the stack trace filter, and is evident from the selftest
failure. This is also an issue on ppc32.
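The failure mode can be sketched with a small user-space model. All
addresses, slot offsets and names below are made up for illustration; they
are not the real kernel layout:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of the problem: with ppc64 ELFv2 -mprofile-kernel (and on
 * ppc32), the call into ftrace happens before the traced function sets up
 * its own stack frame. A naive back-chain walk therefore sees the ftrace
 * trampoline's frame, but the traced function itself never appears; its
 * address survives only in the nip slot of the pt_regs the trampoline
 * saved. Every constant here is illustrative, not the real layout.
 */
#define FTRACE_ENTRY_IP 0x1000UL /* stands in for ftrace_call+4 */
#define TRACED_FUNC_IP  0x2000UL /* the function that "vanishes" */
#define LR_SAVE_SLOT    2        /* saved LR within a frame (illustrative) */
#define FAKE_NIP_SLOT   4        /* pretend pt_regs->nip slot (illustrative) */

/* If the frame's saved LR points at the ftrace entry, recover the traced
 * function from the fake pt_regs area; otherwise report nothing. */
static unsigned long traced_func_if_ftrace(const unsigned long *frame)
{
	if (frame[LR_SAVE_SLOT] != FTRACE_ENTRY_IP)
		return 0;
	return frame[FAKE_NIP_SLOT];
}
```

A frame whose saved LR is FTRACE_ENTRY_IP yields TRACED_FUNC_IP; any other
frame yields 0, which mirrors the check the later patches bolt onto the
various stack walkers.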

The first two commits add support to the stack tracer for powerpc. This
support utilizes the powerpc ABI to produce reliable stack traces, as
well as to determine stackframe sizes.
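The frame-size side of this can be modeled in a few lines: every powerpc
frame stores a back chain to the previous frame at offset 0, so exact frame
sizes fall out of the difference of successive frame pointers. A user-space
sketch with a fabricated stack (indices stand in for stack addresses):

```c
#include <assert.h>

/* Walk a fabricated back-chained stack: stack[sp] holds the back chain as
 * an index into the same array, and each frame's size is the distance to
 * the next frame. This mirrors how the powerpc ABI lets the stack tracer
 * compute per-frame sizes reliably; the stack contents are made up. */
static int frame_sizes(const unsigned long *stack, unsigned long sp,
		       unsigned long top, unsigned long *sizes, int max)
{
	int n = 0;

	while (sp < top && n < max) {
		unsigned long newsp = stack[sp]; /* back chain */

		if (newsp <= sp || newsp > top)  /* corrupt or final frame */
			break;
		sizes[n++] = newsp - sp;         /* frame size, in words */
		sp = newsp;
	}
	return n;
}
```

With frames chained 0 -> 4 -> 12 -> 32 on a 32-word stack, the walk reports
sizes 4, 8 and 20.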

Patches 3, 4 and 6 add support for decoding the traced function name in
various stack traces and for showing it alongside the ftrace entry.

Patch 5 makes the same change for livepatching, though I am a bit unsure
of it. More details in patch 5.

- Naveen


Naveen N. Rao (6):
trace/stack: Move code to save the stack trace into a separate
function
powerpc/trace: Add support for stack tracer
powerpc: Indicate traced function name in show_stack()
powerpc/perf: Include traced function in the callchain
powerpc/stacktrace: Include ftraced function in
arch_stack_walk_reliable()
powerpc/stacktrace: Include ftraced function in arch_stack_walk()

arch/powerpc/include/asm/ftrace.h | 18 ++++++
arch/powerpc/kernel/process.c | 3 +
arch/powerpc/kernel/stacktrace.c | 8 +++
arch/powerpc/kernel/trace/ftrace.c | 70 +++++++++++++++++++++
arch/powerpc/perf/callchain.c | 5 ++
include/linux/ftrace.h | 8 +++
kernel/trace/trace_stack.c | 98 ++++++++++++++++--------------
7 files changed, 164 insertions(+), 46 deletions(-)


base-commit: 258eb1f3aaa9face35e613c229c1337263491ea0
--
2.30.2


2021-05-21 09:59:38

by Naveen N. Rao

Subject: [RFC PATCH 3/6] powerpc: Indicate traced function name in show_stack()

With -mprofile-kernel and ppc32, the function tracer is invoked before a
function sets up its own stackframe. This results in the traced function
not appearing in stack traces. Fix this by checking for ftrace entry and
including the traced function in the stack trace.

With this patch, stack traces show the traced function next to the
ftrace entry:
NIP [c0080000003700d4] handler_pre+0x1c/0x5c [kprobe_example]
LR [c00000000006c118] kprobe_ftrace_handler+0x1b8/0x270
Call Trace:
[c00000001ed7fa50] [c00000001ed7faa0] 0xc00000001ed7faa0 (unreliable)
[c00000001ed7fab0] [c00000000006c118] kprobe_ftrace_handler+0x1b8/0x270
[c00000001ed7fb00] [c000000000076604] ftrace_regs_call+0x4/0xa4 (kernel_clone+0xc/0x600)
^^^^^^^^^^^^^^^^^^^^^^^^
[c00000001ed7fcf0] [c000000000139e08] __do_sys_clone+0x88/0xd0
[c00000001ed7fdb0] [c00000000002b6c4] system_call_exception+0xf4/0x200
[c00000001ed7fe10] [c00000000000ca5c] system_call_common+0xec/0x278

Signed-off-by: Naveen N. Rao <[email protected]>
---
arch/powerpc/kernel/process.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 89e34aa273e21a..9df1d44939bec1 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -2160,6 +2160,9 @@ void show_stack(struct task_struct *tsk, unsigned long *stack,
&ftrace_idx, ip, stack);
if (ret_addr != ip)
pr_cont(" (%pS)", (void *)ret_addr);
+ ip = ftrace_get_traced_func_if_no_stackframe(ip, stack);
+ if (ip)
+ pr_cont(" (%pS)", (void *)ip);
if (firstframe)
pr_cont(" (unreliable)");
pr_cont("\n");
--
2.30.2

2021-05-21 10:02:01

by Naveen N. Rao

Subject: [RFC PATCH 5/6] powerpc/stacktrace: Include ftraced function in arch_stack_walk_reliable()

With -mprofile-kernel and ppc32, the function tracer is invoked before a
function sets up its own stackframe. This results in the traced function
not appearing in stack traces. Fix this by checking for ftrace entry and
including the traced function in the stack trace.

Signed-off-by: Naveen N. Rao <[email protected]>
---
While I don't think we expect to see ftrace show up in the trace for a
non-running task, I think it is good to cover this scenario. I also
think it is ok to consider such traces to be reliable. But I'm not sure
of the implications w.r.t. livepatching.

- Naveen


arch/powerpc/kernel/stacktrace.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index 1deb1bf331ddbd..1e1be297c5d6c7 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -160,6 +160,10 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,

if (!consume_entry(cookie, ip))
return -EINVAL;
+
+ ip = ftrace_get_traced_func_if_no_stackframe(ip, stack);
+ if (ip && !consume_entry(cookie, ip))
+ return -EINVAL;
}
return 0;
}
--
2.30.2

2021-05-21 18:25:47

by Naveen N. Rao

Subject: [RFC PATCH 6/6] powerpc/stacktrace: Include ftraced function in arch_stack_walk()

With -mprofile-kernel and ppc32, the function tracer is invoked before a
function sets up its own stackframe. This results in the traced function
not appearing in stack traces. Fix this by checking for ftrace entry and
including the traced function in the stack trace.

Signed-off-by: Naveen N. Rao <[email protected]>
---
arch/powerpc/kernel/stacktrace.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index 1e1be297c5d6c7..f884c5d21fa7e7 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -51,6 +51,10 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
if (!consume_entry(cookie, ip))
return;

+ ip = ftrace_get_traced_func_if_no_stackframe(ip, stack);
+ if (ip && !consume_entry(cookie, ip))
+ return;
+
sp = newsp;
}
}
--
2.30.2

2021-05-21 20:08:21

by Naveen N. Rao

Subject: [RFC PATCH 4/6] powerpc/perf: Include traced function in the callchain

With -mprofile-kernel and ppc32, the function tracer is invoked before a
function sets up its own stackframe. This results in the traced function
not appearing in stack traces. Fix this by checking for ftrace entry and
including the traced function in the perf callchain.

Signed-off-by: Naveen N. Rao <[email protected]>
---
arch/powerpc/perf/callchain.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/perf/callchain.c b/arch/powerpc/perf/callchain.c
index 6c028ee513c0d7..ed4dd4ab6621f6 100644
--- a/arch/powerpc/perf/callchain.c
+++ b/arch/powerpc/perf/callchain.c
@@ -93,6 +93,11 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re
}

perf_callchain_store(entry, next_ip);
+ if (level && next_ip) {
+ next_ip = ftrace_get_traced_func_if_no_stackframe(next_ip, fp);
+ if (next_ip)
+ perf_callchain_store(entry, next_ip);
+ }
if (!valid_next_sp(next_sp, sp))
return;
sp = next_sp;
--
2.30.2

2021-05-21 20:08:21

by Naveen N. Rao

Subject: [RFC PATCH 1/6] trace/stack: Move code to save the stack trace into a separate function

In preparation to add support for stack tracer to powerpc, move code to
save stack trace and to calculate the frame sizes into a separate weak
function. Also provide access to some of the data structures used by the
stack trace code so that architectures can update those.

Signed-off-by: Naveen N. Rao <[email protected]>
---
include/linux/ftrace.h | 8 ++++
kernel/trace/trace_stack.c | 98 ++++++++++++++++++++------------------
2 files changed, 60 insertions(+), 46 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index a69f363b61bf73..8263427379f05c 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -368,10 +368,18 @@ static inline void arch_ftrace_set_direct_caller(struct pt_regs *regs,

#ifdef CONFIG_STACK_TRACER

+#define STACK_TRACE_ENTRIES 500
+
+extern unsigned long stack_dump_trace[STACK_TRACE_ENTRIES];
+extern unsigned stack_trace_index[STACK_TRACE_ENTRIES];
+extern unsigned int stack_trace_nr_entries;
+extern unsigned long stack_trace_max_size;
extern int stack_tracer_enabled;

int stack_trace_sysctl(struct ctl_table *table, int write, void *buffer,
size_t *lenp, loff_t *ppos);
+void stack_get_trace(unsigned long traced_ip, unsigned long *stack_ref,
+ unsigned long stack_size, int *tracer_frame);

/* DO NOT MODIFY THIS VARIABLE DIRECTLY! */
DECLARE_PER_CPU(int, disable_stack_tracer);
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 63c28504205162..5b63dbd37c8c25 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -19,13 +19,11 @@

#include "trace.h"

-#define STACK_TRACE_ENTRIES 500
+unsigned long stack_dump_trace[STACK_TRACE_ENTRIES];
+unsigned stack_trace_index[STACK_TRACE_ENTRIES];

-static unsigned long stack_dump_trace[STACK_TRACE_ENTRIES];
-static unsigned stack_trace_index[STACK_TRACE_ENTRIES];
-
-static unsigned int stack_trace_nr_entries;
-static unsigned long stack_trace_max_size;
+unsigned int stack_trace_nr_entries;
+unsigned long stack_trace_max_size;
static arch_spinlock_t stack_trace_max_lock =
(arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;

@@ -152,49 +150,19 @@ static void print_max_stack(void)
* Although the entry function is not displayed, the first function (sys_foo)
* will still include the stack size of it.
*/
-static void check_stack(unsigned long ip, unsigned long *stack)
+void __weak stack_get_trace(unsigned long traced_ip, unsigned long *stack_ref,
+ unsigned long stack_size, int *tracer_frame)
{
- unsigned long this_size, flags; unsigned long *p, *top, *start;
- static int tracer_frame;
- int frame_size = READ_ONCE(tracer_frame);
+ unsigned long *p, *top, *start;
int i, x;

- this_size = ((unsigned long)stack) & (THREAD_SIZE-1);
- this_size = THREAD_SIZE - this_size;
- /* Remove the frame of the tracer */
- this_size -= frame_size;
-
- if (this_size <= stack_trace_max_size)
- return;
-
- /* we do not handle interrupt stacks yet */
- if (!object_is_on_stack(stack))
- return;
-
- /* Can't do this from NMI context (can cause deadlocks) */
- if (in_nmi())
- return;
-
- local_irq_save(flags);
- arch_spin_lock(&stack_trace_max_lock);
-
- /* In case another CPU set the tracer_frame on us */
- if (unlikely(!frame_size))
- this_size -= tracer_frame;
-
- /* a race could have already updated it */
- if (this_size <= stack_trace_max_size)
- goto out;
-
- stack_trace_max_size = this_size;
-
stack_trace_nr_entries = stack_trace_save(stack_dump_trace,
ARRAY_SIZE(stack_dump_trace) - 1,
0);

/* Skip over the overhead of the stack tracer itself */
for (i = 0; i < stack_trace_nr_entries; i++) {
- if (stack_dump_trace[i] == ip)
+ if (stack_dump_trace[i] == traced_ip)
break;
}

@@ -209,7 +177,7 @@ static void check_stack(unsigned long ip, unsigned long *stack)
* Now find where in the stack these are.
*/
x = 0;
- start = stack;
+ start = stack_ref;
top = (unsigned long *)
(((unsigned long)start & ~(THREAD_SIZE-1)) + THREAD_SIZE);

@@ -223,7 +191,7 @@ static void check_stack(unsigned long ip, unsigned long *stack)
while (i < stack_trace_nr_entries) {
int found = 0;

- stack_trace_index[x] = this_size;
+ stack_trace_index[x] = stack_size;
p = start;

for (; p < top && i < stack_trace_nr_entries; p++) {
@@ -233,7 +201,7 @@ static void check_stack(unsigned long ip, unsigned long *stack)
*/
if ((READ_ONCE_NOCHECK(*p)) == stack_dump_trace[i]) {
stack_dump_trace[x] = stack_dump_trace[i++];
- this_size = stack_trace_index[x++] =
+ stack_size = stack_trace_index[x++] =
(top - p) * sizeof(unsigned long);
found = 1;
/* Start the search from here */
@@ -245,10 +213,10 @@ static void check_stack(unsigned long ip, unsigned long *stack)
* out what that is, then figure it out
* now.
*/
- if (unlikely(!tracer_frame)) {
- tracer_frame = (p - stack) *
+ if (unlikely(!*tracer_frame)) {
+ *tracer_frame = (p - stack_ref) *
sizeof(unsigned long);
- stack_trace_max_size -= tracer_frame;
+ stack_trace_max_size -= *tracer_frame;
}
}
}
@@ -272,6 +240,44 @@ static void check_stack(unsigned long ip, unsigned long *stack)
#endif

stack_trace_nr_entries = x;
+}
+
+static void check_stack(unsigned long ip, unsigned long *stack)
+{
+ unsigned long this_size, flags;
+ static int tracer_frame;
+ int frame_size = READ_ONCE(tracer_frame);
+
+ this_size = ((unsigned long)stack) & (THREAD_SIZE-1);
+ this_size = THREAD_SIZE - this_size;
+ /* Remove the frame of the tracer */
+ this_size -= frame_size;
+
+ if (this_size <= stack_trace_max_size)
+ return;
+
+ /* we do not handle interrupt stacks yet */
+ if (!object_is_on_stack(stack))
+ return;
+
+ /* Can't do this from NMI context (can cause deadlocks) */
+ if (in_nmi())
+ return;
+
+ local_irq_save(flags);
+ arch_spin_lock(&stack_trace_max_lock);
+
+ /* In case another CPU set the tracer_frame on us */
+ if (unlikely(!frame_size))
+ this_size -= tracer_frame;
+
+ /* a race could have already updated it */
+ if (this_size <= stack_trace_max_size)
+ goto out;
+
+ stack_trace_max_size = this_size;
+
+ stack_get_trace(ip, stack, this_size, &tracer_frame);

if (task_stack_end_corrupted(current)) {
print_max_stack();
--
2.30.2

2021-05-21 20:08:53

by Naveen N. Rao

Subject: [RFC PATCH 2/6] powerpc/trace: Add support for stack tracer

With -mprofile-kernel and ppc32, we call into ftrace at function entry
before the function can establish its own stack frame. This breaks the
ABI, since functions are expected to set up a stack frame before calling
into another function. As a consequence of this, when walking the stack,
the ftraced function does not show up in the stack trace.

Fix this by checking for the ftrace entry points (ftrace_[regs_]call+4) in
the stack trace and looking up the nip stored in the pt_regs on its stack
frame.
Use the back chain from the stack frame headers to accurately determine
the stack frame sizes, except for the ftraced function on
-mprofile-kernel and ppc32 where we set the frame size to 0.
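The nip lookup used by this patch can be sanity-checked with a stand-alone
sketch. The struct pt_regs and STACK_FRAME_OVERHEAD below are trimmed
stand-ins, not the real powerpc definitions, and the expected slot index
assumes an LP64 target:

```c
#include <assert.h>
#include <stddef.h>

#define STACK_FRAME_OVERHEAD 112 /* illustrative, not the real ABI value */

struct pt_regs { /* trimmed stand-in for the real struct */
	unsigned long gpr[32];
	unsigned long nip;
};

/* Word index of the saved nip within the ftrace entry's stack frame,
 * computed the same way as in the patch:
 * (STACK_FRAME_OVERHEAD + offsetof(struct pt_regs, nip)) / sizeof(long) */
static size_t nip_slot(void)
{
	return (STACK_FRAME_OVERHEAD + offsetof(struct pt_regs, nip)) /
	       sizeof(unsigned long);
}

/* Given a frame laid out as [overhead][pt_regs], indexing the frame at
 * nip_slot() recovers the traced function's address. */
static unsigned long read_nip(const unsigned long *stack)
{
	return stack[nip_slot()];
}
```

With these stand-in sizes (32 GPRs of 8 bytes plus 112 bytes of overhead),
the nip lands at word 46 of the frame.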

The max stack tracer ftrace selftest (ftrace/func_stack_tracer.tc)
passes on -mprofile-kernel with this patch.

Before this patch, top of a stack trace with the stack tracer:
Depth Size Location (44 entries)
----- ---- --------
0) 7616 496 ftrace_call+0x4/0x44
1) 7120 64 __mod_lruvec_page_state+0x90/0x110
2) 7056 96 test_clear_page_writeback+0xe4/0x480
3) 6960 48 end_page_writeback+0xa0/0x1c0
4) 6912 256 ext4_finish_bio+0x2c0/0x350
5) 6656 176 ext4_end_bio+0x74/0x280
6) 6480 64 bio_endio+0x1cc/0x240
7) 6416 176 blk_update_request+0x2b8/0x640
8) 6240 64 blk_mq_end_request+0x3c/0x1e0
9) 6176 48 virtblk_request_done+0x48/0xd0
10) 6128 48 blk_complete_reqs+0x80/0xa0
11) 6080 240 __do_softirq+0x150/0x408
12) 5840 32 irq_exit+0x144/0x150
13) 5808 80 do_IRQ+0xc8/0x140
14) 5728 32 hardware_interrupt_common_virt+0x1a4/0x1b0
15) 5696 64 0x0
16) 5632 768 virtqueue_notify+0x40/0x80
17) 4864 240 virtio_queue_rq+0x568/0x610
18) 4624 256 blk_mq_dispatch_rq_list+0x190/0xbc0
19) 4368 160 __blk_mq_do_dispatch_sched+0x1f0/0x3d0
20) 4208 96 __blk_mq_sched_dispatch_requests+0x238/0x2c0
...

After this patch:
Depth Size Location (44 entries)
----- ---- --------
0) 7136 0 rcu_read_unlock_strict+0x8/0x10
1) 7136 64 __mod_lruvec_page_state+0x90/0x110
2) 7072 96 test_clear_page_writeback+0xe4/0x480
3) 6976 48 end_page_writeback+0xa0/0x1c0
4) 6928 256 ext4_finish_bio+0x2c0/0x350
5) 6672 176 ext4_end_bio+0x74/0x280
6) 6496 64 bio_endio+0x1cc/0x240
7) 6432 176 blk_update_request+0x2b8/0x640
8) 6256 64 blk_mq_end_request+0x3c/0x1e0
9) 6192 48 virtblk_request_done+0x48/0xd0
10) 6144 48 blk_complete_reqs+0x80/0xa0
11) 6096 240 __do_softirq+0x150/0x408
12) 5856 32 irq_exit+0x144/0x150
13) 5824 80 do_IRQ+0xc8/0x140
14) 5744 784 hardware_interrupt_common_virt+0x1a4/0x1b0
15) 4960 32 0x0
16) 4928 48 virtqueue_notify+0x40/0x80
17) 4880 240 virtio_queue_rq+0x568/0x610
18) 4640 256 blk_mq_dispatch_rq_list+0x190/0xbc0
19) 4384 160 __blk_mq_do_dispatch_sched+0x1f0/0x3d0
20) 4224 96 __blk_mq_sched_dispatch_requests+0x238/0x2c0
...

Signed-off-by: Naveen N. Rao <[email protected]>
---
arch/powerpc/include/asm/ftrace.h | 18 ++++++++
arch/powerpc/kernel/trace/ftrace.c | 70 ++++++++++++++++++++++++++++++
2 files changed, 88 insertions(+)

diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
index debe8c4f706260..392296df70e96c 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -126,6 +126,24 @@ static inline void this_cpu_enable_ftrace(void) { }
static inline void this_cpu_set_ftrace_enabled(u8 ftrace_enabled) { }
static inline u8 this_cpu_get_ftrace_enabled(void) { return 1; }
#endif /* CONFIG_PPC64 */
+
+#ifdef CONFIG_FUNCTION_TRACER
+/*
+ * With ppc64 -mprofile-kernel and ppc32, mcount call is made before a function
+ * establishes its own stack frame. While unwinding the stack, such functions
+ * do not appear in the trace. This helper returns the traced function if ip in
+ * the stack frame points to ftrace_[regs_]call.
+ *
+ * In ppc64 ELFv1, the mcount call is made after a function establishes its own
+ * stackframe. So, this always returns 0.
+ */
+unsigned long ftrace_get_traced_func_if_no_stackframe(unsigned long ip, unsigned long *stack);
+#else
+static inline unsigned long ftrace_get_traced_func_if_no_stackframe(unsigned long ip, unsigned long *stack)
+{
+ return 0;
+}
+#endif /* FUNCTION_TRACER */
#endif /* !__ASSEMBLY__ */

#endif /* _ASM_POWERPC_FTRACE */
diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
index ffe9537195aa33..ec1072d9a858d0 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -21,6 +21,7 @@
#include <linux/percpu.h>
#include <linux/init.h>
#include <linux/list.h>
+#include <linux/sched/task_stack.h>

#include <asm/asm-prototypes.h>
#include <asm/cacheflush.h>
@@ -987,3 +988,72 @@ char *arch_ftrace_match_adjust(char *str, const char *search)
return str;
}
#endif /* PPC64_ELF_ABI_v1 */
+
+static int is_ftrace_entry(unsigned long ip)
+{
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+ if (ip == (unsigned long)&ftrace_call + 4 || ip == (unsigned long)&ftrace_regs_call + 4)
+#else
+ if (ip == (unsigned long)&ftrace_call + 4)
+#endif
+ return 1;
+
+ return 0;
+}
+
+unsigned long ftrace_get_traced_func_if_no_stackframe(unsigned long ip, unsigned long *stack)
+{
+ if (!is_ftrace_entry(ip))
+ return 0;
+
+ if (IS_ENABLED(CONFIG_PPC32))
+ return stack[11]; /* see MCOUNT_SAVE_FRAME */
+
+ if (!IS_ENABLED(CONFIG_MPROFILE_KERNEL))
+ return 0;
+
+ return stack[(STACK_FRAME_OVERHEAD + offsetof(struct pt_regs, nip)) / sizeof(unsigned long)];
+}
+
+#ifdef CONFIG_STACK_TRACER
+void stack_get_trace(unsigned long traced_ip,
+ unsigned long *stack_ref __maybe_unused,
+ unsigned long stack_size __maybe_unused,
+ int *tracer_frame)
+{
+ unsigned long sp, newsp, top, ip;
+ int ftrace_call_found = 0;
+ unsigned long *stack;
+ int i = 0;
+
+ sp = current_stack_frame();
+ top = (unsigned long)task_stack_page(current) + THREAD_SIZE;
+
+ while (validate_sp(sp, current, STACK_FRAME_OVERHEAD) && i < STACK_TRACE_ENTRIES) {
+ stack = (unsigned long *) sp;
+ newsp = stack[0];
+ ip = stack[STACK_FRAME_LR_SAVE];
+
+ if (ftrace_call_found) {
+ stack_dump_trace[i] = ip;
+ stack_trace_index[i++] = top - sp;
+ }
+
+ if (is_ftrace_entry(ip)) {
+ if (IS_ENABLED(CONFIG_MPROFILE_KERNEL) || IS_ENABLED(CONFIG_PPC32)) {
+ stack_dump_trace[i] = ftrace_get_traced_func_if_no_stackframe(ip, stack);
+ stack_trace_index[i++] = top - newsp;
+ }
+ if (unlikely(!*tracer_frame)) {
+ *tracer_frame = newsp - (unsigned long)stack_ref;
+ stack_trace_max_size -= *tracer_frame;
+ }
+ ftrace_call_found = 1;
+ }
+
+ sp = newsp;
+ }
+
+ stack_trace_nr_entries = i;
+}
+#endif
--
2.30.2

2021-06-01 13:53:47

by Naveen N. Rao

Subject: Re: [RFC PATCH 2/6] powerpc/trace: Add support for stack tracer

Naveen N. Rao wrote:
> +
> +unsigned long ftrace_get_traced_func_if_no_stackframe(unsigned long ip, unsigned long *stack)
> +{
> + if (!is_ftrace_entry(ip))
> + return 0;
> +
> + if (IS_ENABLED(CONFIG_PPC32))
> + return stack[11]; /* see MCOUNT_SAVE_FRAME */
> +
> + if (!IS_ENABLED(CONFIG_MPROFILE_KERNEL))
> + return 0;
> +
> + return stack[(STACK_FRAME_OVERHEAD + offsetof(struct pt_regs, nip)) / sizeof(unsigned long)];

Looking at Daniel's patch to address KASAN errors with our stack walk
code in show_stack() [*], I realized that I am not validating the stack
pointer here for the above accesses...

[*] http://lkml.kernel.org/r/[email protected]

> +}
> +
> +#ifdef CONFIG_STACK_TRACER
> +void stack_get_trace(unsigned long traced_ip,
> + unsigned long *stack_ref __maybe_unused,
> + unsigned long stack_size __maybe_unused,
> + int *tracer_frame)
> +{
> + unsigned long sp, newsp, top, ip;
> + int ftrace_call_found = 0;
> + unsigned long *stack;
> + int i = 0;
> +
> + sp = current_stack_frame();
> + top = (unsigned long)task_stack_page(current) + THREAD_SIZE;
> +
> + while (validate_sp(sp, current, STACK_FRAME_OVERHEAD) && i < STACK_TRACE_ENTRIES) {
> + stack = (unsigned long *) sp;
> + newsp = stack[0];
> + ip = stack[STACK_FRAME_LR_SAVE];
> +
> + if (ftrace_call_found) {
> + stack_dump_trace[i] = ip;
> + stack_trace_index[i++] = top - sp;
> + }

And I need to make the above accesses bypass KASAN as well.


- Naveen

2021-06-01 15:29:53

by Steven Rostedt

Subject: Re: [RFC PATCH 1/6] trace/stack: Move code to save the stack trace into a separate function

On Fri, 21 May 2021 12:18:36 +0530
"Naveen N. Rao" <[email protected]> wrote:

> In preparation to add support for stack tracer to powerpc, move code to
> save stack trace and to calculate the frame sizes into a separate weak
> function. Also provide access to some of the data structures used by the
> stack trace code so that architectures can update those.
>
> Signed-off-by: Naveen N. Rao <[email protected]>
> ---
> include/linux/ftrace.h | 8 ++++
> kernel/trace/trace_stack.c | 98 ++++++++++++++++++++------------------
> 2 files changed, 60 insertions(+), 46 deletions(-)
>
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index a69f363b61bf73..8263427379f05c 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -368,10 +368,18 @@ static inline void arch_ftrace_set_direct_caller(struct pt_regs *regs,
>
> #ifdef CONFIG_STACK_TRACER
>
> +#define STACK_TRACE_ENTRIES 500
> +
> +extern unsigned long stack_dump_trace[STACK_TRACE_ENTRIES];
> +extern unsigned stack_trace_index[STACK_TRACE_ENTRIES];
> +extern unsigned int stack_trace_nr_entries;
> +extern unsigned long stack_trace_max_size;
> extern int stack_tracer_enabled;
>
> int stack_trace_sysctl(struct ctl_table *table, int write, void *buffer,
> size_t *lenp, loff_t *ppos);
> +void stack_get_trace(unsigned long traced_ip, unsigned long *stack_ref,
> + unsigned long stack_size, int *tracer_frame);
>
> /* DO NOT MODIFY THIS VARIABLE DIRECTLY! */
> DECLARE_PER_CPU(int, disable_stack_tracer);
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index 63c28504205162..5b63dbd37c8c25 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -19,13 +19,11 @@
>
> #include "trace.h"
>
> -#define STACK_TRACE_ENTRIES 500
> +unsigned long stack_dump_trace[STACK_TRACE_ENTRIES];
> +unsigned stack_trace_index[STACK_TRACE_ENTRIES];
>
> -static unsigned long stack_dump_trace[STACK_TRACE_ENTRIES];
> -static unsigned stack_trace_index[STACK_TRACE_ENTRIES];
> -
> -static unsigned int stack_trace_nr_entries;
> -static unsigned long stack_trace_max_size;
> +unsigned int stack_trace_nr_entries;
> +unsigned long stack_trace_max_size;
> static arch_spinlock_t stack_trace_max_lock =
> (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
>
> @@ -152,49 +150,19 @@ static void print_max_stack(void)
> * Although the entry function is not displayed, the first function (sys_foo)
> * will still include the stack size of it.
> */
> -static void check_stack(unsigned long ip, unsigned long *stack)

I just got back from PTO and have a ton of other obligations to attend
to before I can dig deeper into this. I'm not opposed to this change,
but the stack_tracer has not been getting the love that it deserves and
I think you hit one of the issues that needs to be addressed. I'm not
sure this is a PPC only issue, and would like to see if I can get more
time (or someone else can) to reevaluate the way stack tracer works,
and see if it can be made a bit more robust.

-- Steve

2021-06-02 10:37:26

by Naveen N. Rao

Subject: Re: [RFC PATCH 1/6] trace/stack: Move code to save the stack trace into a separate function

Steven Rostedt wrote:
> On Fri, 21 May 2021 12:18:36 +0530
> "Naveen N. Rao" <[email protected]> wrote:
>
>> In preparation to add support for stack tracer to powerpc, move code to
>> save stack trace and to calculate the frame sizes into a separate weak
>> function. Also provide access to some of the data structures used by the
>> stack trace code so that architectures can update those.
>>
>> Signed-off-by: Naveen N. Rao <[email protected]>
>> ---
>> include/linux/ftrace.h | 8 ++++
>> kernel/trace/trace_stack.c | 98 ++++++++++++++++++++------------------
>> 2 files changed, 60 insertions(+), 46 deletions(-)
>>
>> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
>> index a69f363b61bf73..8263427379f05c 100644
>> --- a/include/linux/ftrace.h
>> +++ b/include/linux/ftrace.h
>> @@ -368,10 +368,18 @@ static inline void arch_ftrace_set_direct_caller(struct pt_regs *regs,
>>
>> #ifdef CONFIG_STACK_TRACER
>>
>> +#define STACK_TRACE_ENTRIES 500
>> +
>> +extern unsigned long stack_dump_trace[STACK_TRACE_ENTRIES];
>> +extern unsigned stack_trace_index[STACK_TRACE_ENTRIES];
>> +extern unsigned int stack_trace_nr_entries;
>> +extern unsigned long stack_trace_max_size;
>> extern int stack_tracer_enabled;
>>
>> int stack_trace_sysctl(struct ctl_table *table, int write, void *buffer,
>> size_t *lenp, loff_t *ppos);
>> +void stack_get_trace(unsigned long traced_ip, unsigned long *stack_ref,
>> + unsigned long stack_size, int *tracer_frame);
>>
>> /* DO NOT MODIFY THIS VARIABLE DIRECTLY! */
>> DECLARE_PER_CPU(int, disable_stack_tracer);
>> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
>> index 63c28504205162..5b63dbd37c8c25 100644
>> --- a/kernel/trace/trace_stack.c
>> +++ b/kernel/trace/trace_stack.c
>> @@ -19,13 +19,11 @@
>>
>> #include "trace.h"
>>
>> -#define STACK_TRACE_ENTRIES 500
>> +unsigned long stack_dump_trace[STACK_TRACE_ENTRIES];
>> +unsigned stack_trace_index[STACK_TRACE_ENTRIES];
>>
>> -static unsigned long stack_dump_trace[STACK_TRACE_ENTRIES];
>> -static unsigned stack_trace_index[STACK_TRACE_ENTRIES];
>> -
>> -static unsigned int stack_trace_nr_entries;
>> -static unsigned long stack_trace_max_size;
>> +unsigned int stack_trace_nr_entries;
>> +unsigned long stack_trace_max_size;
>> static arch_spinlock_t stack_trace_max_lock =
>> (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
>>
>> @@ -152,49 +150,19 @@ static void print_max_stack(void)
>> * Although the entry function is not displayed, the first function (sys_foo)
>> * will still include the stack size of it.
>> */
>> -static void check_stack(unsigned long ip, unsigned long *stack)
>
> I just got back from PTO and have a ton of other obligations to attend
> to before I can dig deeper into this.

Thanks for the heads up.

> I'm not opposed to this change,
> but the stack_tracer has not been getting the love that it deserves and
> I think you hit one of the issues that needs to be addressed.

Sure. I started off by just updating arch_stack_walk() to return the
traced function, but soon realized that the frame size determination can
be made more robust on powerpc due to the ABI.

> I'm not
> sure this is a PPC only issue, and would like to see if I can get more
> time (or someone else can) to reevaluate the way stack tracer works,
> and see if it can be made a bit more robust.

Sure. If you have specific issues to be looked at, please do elaborate.

It seems to be working fine otherwise. The one limitation though is down
to how ftrace works on powerpc -- the mcount call is before a function
sets up its own stackframe. Due to this, we won't ever be able to
account for the stackframe from a leaf function -- but, that's a fairly
minor limitation.


Thanks,
Naveen

2021-06-02 14:12:16

by Steven Rostedt

Subject: Re: [RFC PATCH 1/6] trace/stack: Move code to save the stack trace into a separate function

On Wed, 02 Jun 2021 16:05:18 +0530
"Naveen N. Rao" <[email protected]> wrote:

> It seems to be working fine otherwise. The one limitation though is down
> to how ftrace works on powerpc -- the mcount call is before a function
> sets up its own stackframe. Due to this, we won't ever be able to
> account for the stackframe from a leaf function -- but, that's a fairly
> minor limitation.

And this is true for x86 as well, because it no longer uses mcount but
fentry instead (called before stack setup); I figured there's not much we
could do about it.

-- Steve