2022-09-21 13:37:31

by Chen Zhongjin

[permalink] [raw]
Subject: [PATCH for-next v2 0/4] riscv: Improvments for stacktrace

Currently, the stacktrace with FRAME_POINTER on riscv has some problem:

1. stacktrace will stop at irq so it can't get the stack frames before
irq entry.
2. stacktrace can't unwind all the real stack frames when there is
k{ret}probes or ftrace.

These are mainly becase when there is a pt_regs on stack, we can't unwind
the stack frame as normal function.

Some architectures (e.g. arm64) create a extra stackframe inside pt_regs.
However this doesn't work for riscv because the ra is not ensured to be
pushed to stack. As explained in:
commit f766f77a74f5("riscv/stacktrace: Fix stack output without ra on the stack top")

So, I choosed the method of x86 that, if there is a pt_regs on stack,
we encoded the frame pointer and save it. When unwinding stack frame,
we can get pt_regs and registers required for unwinding stacks.

In addition, the patch set contains some refactoring of stacktrace.c to
keep the stacktrace code on riscv consistent with other architectures.

Stacktrace before for kretprobes:

Call Trace:
...
[<ffffffff800d5d48>] __kretprobe_trampoline_handler+0xc2/0x13e
[<ffffffff808b766c>] trampoline_probe_handler+0x30/0x46
[<ffffffff800070de>] __kretprobe_trampoline+0x52/0x92
[<ffffffff0163809c>] kprobe_init+0x9c/0x1000 [kprobe_unwind]
[<ffffffff800027c8>] do_one_initcall+0x4c/0x1f2
...

Stacktrace after:

Call Trace:
...
[<ffffffff800d5d48>] __kretprobe_trampoline_handler+0xc2/0x13e
[<ffffffff808b766c>] trampoline_probe_handler+0x30/0x46
[<ffffffff800070de>] __kretprobe_trampoline+0x52/0x92
+ [<ffffffff01633076>] the_caller+0x2c/0x38 [kprobe_unwind]
[<ffffffff0163809c>] kprobe_init+0x9c/0x1000 [kprobe_unwind]
[<ffffffff800027c8>] do_one_initcall+0x4c/0x1f2
...

Stacktrace before for ftrace:

Call Trace:
...
[<ffffffff80006df0>] kprobe_ftrace_handler+0x13e/0x188
[<ffffffff80008e7e>] ftrace_regs_call+0x8/0x10
[<ffffffff80002540>] do_one_initcall+0x4c/0x1f2
[<ffffffff8008a4e6>] do_init_module+0x56/0x210
...

Stacktrace after:

Call Trace:
...
[<ffffffff016150e0>] handler_pre+0x30/0x4a [kprobe_unwind]
[<ffffffff800bce96>] aggr_pre_handler+0x60/0x94
[<ffffffff80006df0>] kprobe_ftrace_handler+0x13e/0x188
[<ffffffff80008e82>] ftrace_regs_call+0x8/0x10
+ [<ffffffff01615000>] traced_func+0x0/0x1e [kprobe_unwind]
[<ffffffff80002540>] do_one_initcall+0x4c/0x1f2
[<ffffffff8008a4ea>] do_init_module+0x56/0x210
...

Noticed that the caller of ftrace and probed func of kretprobe
cannot be unwind because they are inside function pro/epilogue.

---
v1 -> v2:
- Merge three patches which add ENCODE_FRAME_POINTER together
- Update commit message
- Delete the KRETPORBES stuff added in unwind_state, we don't need them
to recover the kretporbes ret_addr because we can get it in pt_regs
---
Chen Zhongjin (4):
riscv: stacktrace: Replace walk_stackframe with arch_stack_walk
riscv: stacktrace: Introduce unwind functions
riscv: stacktrace: Save pt_regs in ENCODE_FRAME_POINTER
riscv: stacktrace: Implement stacktrace for irq

arch/riscv/include/asm/frame.h | 45 ++++++
arch/riscv/include/asm/stacktrace.h | 9 +-
arch/riscv/kernel/entry.S | 3 +
arch/riscv/kernel/mcount-dyn.S | 7 +
arch/riscv/kernel/perf_callchain.c | 2 +-
arch/riscv/kernel/probes/kprobes_trampoline.S | 7 +
arch/riscv/kernel/stacktrace.c | 150 ++++++++++++------
7 files changed, 174 insertions(+), 49 deletions(-)
create mode 100644 arch/riscv/include/asm/frame.h

--
2.17.1


2022-09-21 13:45:43

by Chen Zhongjin

[permalink] [raw]
Subject: [PATCH for-next v2 4/4] riscv: stacktrace: Implement stacktrace for irq

After adding encoded fp onto stack to record pt_regs, now the
unwinder have ability to unwind frame through irq.

There is two steps to unwind irq frame and the interrupted frame:

1. When there is an encoded fp on stack, we can get the pt_regs
and unwind frame by (regs->epc) and (regs->s0).

2. To unwind the interrupted frame, there is two possibilities,
we can determine the situation by checking whether the value in
frame->ra position is a fp value.

If there is a fp in ra position:
We are inside a leaf frame and there is only fp on ra position.
Get fp from ra position and get next pc from pt_regs.
Else:
Just get fp and next pc from stack frame.

Stacktrace before this patch:

Call Trace:
...
[<ffffffff800aa692>] __flush_smp_call_function_queue+0xde/0x1fa
[<ffffffff800ab404>] generic_smp_call_function_single_interrupt+0x22/0x2a
[<ffffffff800077b2>] handle_IPI+0xaa/0x108
[<ffffffff803f827e>] riscv_intc_irq+0x56/0x6e
[<ffffffff808d94b6>] generic_handle_arch_irq+0x4c/0x76
[<ffffffff80003ad0>] ret_from_exception+0x0/0xc

Stacktrace after this patch:

Call Trace:
...
[<ffffffff800aa6da>] __flush_smp_call_function_queue+0xde/0x1fa
[<ffffffff800ab44c>] generic_smp_call_function_single_interrupt+0x22/0x2a
[<ffffffff800077fa>] handle_IPI+0xaa/0x108
[<ffffffff803f82c6>] riscv_intc_irq+0x56/0x6e
[<ffffffff808d94fe>] generic_handle_arch_irq+0x4c/0x76
[<ffffffff80003ad0>] ret_from_exception+0x0/0xc
+ [<ffffffff80003d52>] arch_cpu_idle+0x22/0x28
+ [<ffffffff808e23a8>] default_idle_call+0x44/0xee
+ [<ffffffff80056ece>] do_idle+0x116/0x126
+ [<ffffffff8005706e>] cpu_startup_entry+0x36/0x38
+ [<ffffffff808d99ae>] kernel_init+0x0/0x15a
+ [<ffffffff80a007a0>] arch_post_acpi_subsys_init+0x0/0x38
+ [<ffffffff80a0100c>] start_kernel+0x7c4/0x7f2

Signed-off-by: Chen Zhongjin <[email protected]>
---
arch/riscv/kernel/stacktrace.c | 45 ++++++++++++++++++++++++++++------
1 file changed, 38 insertions(+), 7 deletions(-)

diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
index e84e21868a3e..976dc298ab3b 100644
--- a/arch/riscv/kernel/stacktrace.c
+++ b/arch/riscv/kernel/stacktrace.c
@@ -16,29 +16,60 @@

#ifdef CONFIG_FRAME_POINTER

+static struct pt_regs *decode_frame_pointer(unsigned long fp)
+{
+ if (!(fp & 0x1))
+ return NULL;
+
+ return (struct pt_regs *)(fp & ~0x1);
+}
+
static int notrace unwind_next(struct unwind_state *state)
{
unsigned long low, high, fp;
struct stackframe *frame;
+ struct pt_regs *regs;

- fp = state->fp;
+ regs = decode_frame_pointer(state->fp);

/* Validate frame pointer */
- low = state->sp + sizeof(struct stackframe);
+ if (regs) {
+ if user_mode(regs)
+ return -1;
+
+ fp = (unsigned long)regs;
+ low = state->sp;
+ } else {
+ fp = state->fp;
+ low = state->sp + sizeof(struct stackframe);
+ }
high = ALIGN(low, THREAD_SIZE);

if (fp < low || fp > high || fp & 0x7)
return -EINVAL;

- /* Unwind stack frame */
frame = (struct stackframe *)fp - 1;
state->sp = fp;

- if (state->regs && state->regs->epc == state->pc &&
- fp & 0x7) {
- state->fp = frame->ra;
- state->pc = state->regs->ra;
+ if (regs) {
+ /* Unwind from irq to interrupted function */
+ state->fp = regs->s0;
+ state->pc = regs->epc;
+ state->regs = regs;
+ } else if (state->regs && state->regs->epc == state->pc) {
+ /* Unwind from interrupted function to caller*/
+ if (frame->ra < low || frame->ra > high) {
+ /* normal function */
+ state->fp = frame->fp;
+ state->pc = frame->ra;
+ } else {
+ /* leaf function */
+ state->fp = frame->ra;
+ state->pc = state->regs->ra;
+ }
+ state->regs = NULL;
} else {
+ /* Unwind from normal stack frame */
state->fp = frame->fp;
state->pc = ftrace_graph_ret_addr(current, NULL, frame->ra,
(unsigned long *)fp - 1);
--
2.17.1

2022-09-21 14:27:11

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH for-next v2 0/4] riscv: Improvments for stacktrace

On Wed, Sep 21, 2022 at 08:51:23PM +0800, Chen Zhongjin wrote:
> Currently, the stacktrace with FRAME_POINTER on riscv has some problem:
>
> 1. stacktrace will stop at irq so it can't get the stack frames before
> irq entry.
> 2. stacktrace can't unwind all the real stack frames when there is
> k{ret}probes or ftrace.
>
> These are mainly becase when there is a pt_regs on stack, we can't unwind
> the stack frame as normal function.
>
> Some architectures (e.g. arm64) create a extra stackframe inside pt_regs.
> However this doesn't work for riscv because the ra is not ensured to be
> pushed to stack. As explained in:
> commit f766f77a74f5("riscv/stacktrace: Fix stack output without ra on the stack top")

FWIW, this is also a latent problem on arm64, since we don't know whether the
LR is live at an exception boundary (and we currently always ignore it).

My plan to fix that on arm64 is to push an empty frame record to the stack upon
an exception, and use that to find the pt_regs, from which we can get the PC
and LR (and then we can report the later as unreliable). That should be roughly
equivalent to what you do in this series (where use use the LSB to identify
that the pointer is actually a pt_regs).

One important thing to note is that when crossing an exception boundary you
won't know whether RA is live, and so you might try to consume an address twice
(if it has also been pushed to the stack). That could be problematic for
unwinding ftrace or kretprobes. On arm64 we had to implement
HAVE_FUNCTION_GRAPH_RET_ADDR_PTR so that we could reliably unwind ftrace. See
commit:

c6d3cd32fd0064af ("arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR")

... and we haven't yet come up with something similar for kretprobes (though I
suspect we'll need to).

Thanks,
Mark.

> So, I choosed the method of x86 that, if there is a pt_regs on stack,
> we encoded the frame pointer and save it. When unwinding stack frame,
> we can get pt_regs and registers required for unwinding stacks.
>
> In addition, the patch set contains some refactoring of stacktrace.c to
> keep the stacktrace code on riscv consistent with other architectures.
>
> Stacktrace before for kretprobes:
>
> Call Trace:
> ...
> [<ffffffff800d5d48>] __kretprobe_trampoline_handler+0xc2/0x13e
> [<ffffffff808b766c>] trampoline_probe_handler+0x30/0x46
> [<ffffffff800070de>] __kretprobe_trampoline+0x52/0x92
> [<ffffffff0163809c>] kprobe_init+0x9c/0x1000 [kprobe_unwind]
> [<ffffffff800027c8>] do_one_initcall+0x4c/0x1f2
> ...
>
> Stacktrace after:
>
> Call Trace:
> ...
> [<ffffffff800d5d48>] __kretprobe_trampoline_handler+0xc2/0x13e
> [<ffffffff808b766c>] trampoline_probe_handler+0x30/0x46
> [<ffffffff800070de>] __kretprobe_trampoline+0x52/0x92
> + [<ffffffff01633076>] the_caller+0x2c/0x38 [kprobe_unwind]
> [<ffffffff0163809c>] kprobe_init+0x9c/0x1000 [kprobe_unwind]
> [<ffffffff800027c8>] do_one_initcall+0x4c/0x1f2
> ...
>
> Stacktrace before for ftrace:
>
> Call Trace:
> ...
> [<ffffffff80006df0>] kprobe_ftrace_handler+0x13e/0x188
> [<ffffffff80008e7e>] ftrace_regs_call+0x8/0x10
> [<ffffffff80002540>] do_one_initcall+0x4c/0x1f2
> [<ffffffff8008a4e6>] do_init_module+0x56/0x210
> ...
>
> Stacktrace after:
>
> Call Trace:
> ...
> [<ffffffff016150e0>] handler_pre+0x30/0x4a [kprobe_unwind]
> [<ffffffff800bce96>] aggr_pre_handler+0x60/0x94
> [<ffffffff80006df0>] kprobe_ftrace_handler+0x13e/0x188
> [<ffffffff80008e82>] ftrace_regs_call+0x8/0x10
> + [<ffffffff01615000>] traced_func+0x0/0x1e [kprobe_unwind]
> [<ffffffff80002540>] do_one_initcall+0x4c/0x1f2
> [<ffffffff8008a4ea>] do_init_module+0x56/0x210
> ...
>
> Noticed that the caller of ftrace and probed func of kretprobe
> cannot be unwind because they are inside function pro/epilogue.
>
> ---
> v1 -> v2:
> - Merge three patches which add ENCODE_FRAME_POINTER together
> - Update commit message
> - Delete the KRETPORBES stuff added in unwind_state, we don't need them
> to recover the kretporbes ret_addr because we can get it in pt_regs
> ---
> Chen Zhongjin (4):
> riscv: stacktrace: Replace walk_stackframe with arch_stack_walk
> riscv: stacktrace: Introduce unwind functions
> riscv: stacktrace: Save pt_regs in ENCODE_FRAME_POINTER
> riscv: stacktrace: Implement stacktrace for irq
>
> arch/riscv/include/asm/frame.h | 45 ++++++
> arch/riscv/include/asm/stacktrace.h | 9 +-
> arch/riscv/kernel/entry.S | 3 +
> arch/riscv/kernel/mcount-dyn.S | 7 +
> arch/riscv/kernel/perf_callchain.c | 2 +-
> arch/riscv/kernel/probes/kprobes_trampoline.S | 7 +
> arch/riscv/kernel/stacktrace.c | 150 ++++++++++++------
> 7 files changed, 174 insertions(+), 49 deletions(-)
> create mode 100644 arch/riscv/include/asm/frame.h
>
> --
> 2.17.1
>

2022-09-22 08:25:45

by Chen Zhongjin

[permalink] [raw]
Subject: Re: [PATCH for-next v2 0/4] riscv: Improvments for stacktrace

Hi,

On 2022/9/21 22:02, Mark Rutland wrote:
> On Wed, Sep 21, 2022 at 08:51:23PM +0800, Chen Zhongjin wrote:
>> Currently, the stacktrace with FRAME_POINTER on riscv has some problem:
>>
>> 1. stacktrace will stop at irq so it can't get the stack frames before
>> irq entry.
>> 2. stacktrace can't unwind all the real stack frames when there is
>> k{ret}probes or ftrace.
>>
>> These are mainly becase when there is a pt_regs on stack, we can't unwind
>> the stack frame as normal function.
>>
>> Some architectures (e.g. arm64) create a extra stackframe inside pt_regs.
>> However this doesn't work for riscv because the ra is not ensured to be
>> pushed to stack. As explained in:
>> commit f766f77a74f5("riscv/stacktrace: Fix stack output without ra on the stack top")
> FWIW, this is also a latent problem on arm64, since we don't know whether the
> LR is live at an exception boundary (and we currently always ignore it).

In fact this is still a problem for riscv, I didn't unwind the caller of
ftrace or probed func

of kretprobe because they are inside function pro/epilogue.

The problem on riscv is a little more complex, for leaf function, ra
will always not be pushed

to stack and fp will be pushed to ra slot. This patch solved this problem.

> My plan to fix that on arm64 is to push an empty frame record to the stack upon
> an exception, and use that to find the pt_regs, from which we can get the PC
> and LR (and then we can report the later as unreliable). That should be roughly
> equivalent to what you do in this series (where use use the LSB to identify
> that the pointer is actually a pt_regs).

IIRC, this solution looks like:

=====

Func1     { lr, fp } or { nothing }

irq entry { pt_regs & empty stackframe[2] }

handler   { lr, fp }

======

handler->fp points to stackframe, and when we find stackframe is empty,
we know we are inside

an pt_regs and can find registers by offset.

I think it's available, there no other scenario will cause the frame
list contains zero (unless stack is clobbered).

And it seems only fp slot should be zero, lr will be clobbered after
function call, we cannot use lr inside pt_regs.

> One important thing to note is that when crossing an exception boundary you
> won't know whether RA is live, and so you might try to consume an address twice
> (if it has also been pushed to the stack). That could be problematic for
> unwinding ftrace or kretprobes. On arm64 we had to implement
> HAVE_FUNCTION_GRAPH_RET_ADDR_PTR so that we could reliably unwind ftrace. See
> commit:
>
> c6d3cd32fd0064af ("arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR")
>
> ... and we haven't yet come up with something similar for kretprobes (though I
> suspect we'll need to).

Can we test the sp and fp inside pt_regs? Because luckily arm64 saves
lr, fp and moves sp together.

Before sp is moved we will have fp == sp from last frame 'mov x29, sp'.
So if they are same, we uses the lr in pt_regs,

conversely we use lr on stack.

That's what I planned to use for orc unwinder (or we need to track lr as
sp/fp). Seems it also works for this

solution. Haven't thought it in detail through.


Best,

Chen

> Thanks,
> Mark.
>
>> So, I choosed the method of x86 that, if there is a pt_regs on stack,
>> we encoded the frame pointer and save it. When unwinding stack frame,
>> we can get pt_regs and registers required for unwinding stacks.
>>
>> In addition, the patch set contains some refactoring of stacktrace.c to
>> keep the stacktrace code on riscv consistent with other architectures.
>>
>> Stacktrace before for kretprobes:
>>
>> Call Trace:
>> ...
>> [<ffffffff800d5d48>] __kretprobe_trampoline_handler+0xc2/0x13e
>> [<ffffffff808b766c>] trampoline_probe_handler+0x30/0x46
>> [<ffffffff800070de>] __kretprobe_trampoline+0x52/0x92
>> [<ffffffff0163809c>] kprobe_init+0x9c/0x1000 [kprobe_unwind]
>> [<ffffffff800027c8>] do_one_initcall+0x4c/0x1f2
>> ...
>>
>> Stacktrace after:
>>
>> Call Trace:
>> ...
>> [<ffffffff800d5d48>] __kretprobe_trampoline_handler+0xc2/0x13e
>> [<ffffffff808b766c>] trampoline_probe_handler+0x30/0x46
>> [<ffffffff800070de>] __kretprobe_trampoline+0x52/0x92
>> + [<ffffffff01633076>] the_caller+0x2c/0x38 [kprobe_unwind]
>> [<ffffffff0163809c>] kprobe_init+0x9c/0x1000 [kprobe_unwind]
>> [<ffffffff800027c8>] do_one_initcall+0x4c/0x1f2
>> ...
>>
>> Stacktrace before for ftrace:
>>
>> Call Trace:
>> ...
>> [<ffffffff80006df0>] kprobe_ftrace_handler+0x13e/0x188
>> [<ffffffff80008e7e>] ftrace_regs_call+0x8/0x10
>> [<ffffffff80002540>] do_one_initcall+0x4c/0x1f2
>> [<ffffffff8008a4e6>] do_init_module+0x56/0x210
>> ...
>>
>> Stacktrace after:
>>
>> Call Trace:
>> ...
>> [<ffffffff016150e0>] handler_pre+0x30/0x4a [kprobe_unwind]
>> [<ffffffff800bce96>] aggr_pre_handler+0x60/0x94
>> [<ffffffff80006df0>] kprobe_ftrace_handler+0x13e/0x188
>> [<ffffffff80008e82>] ftrace_regs_call+0x8/0x10
>> + [<ffffffff01615000>] traced_func+0x0/0x1e [kprobe_unwind]
>> [<ffffffff80002540>] do_one_initcall+0x4c/0x1f2
>> [<ffffffff8008a4ea>] do_init_module+0x56/0x210
>> ...
>>
>> Noticed that the caller of ftrace and probed func of kretprobe
>> cannot be unwind because they are inside function pro/epilogue.
>>
>> ---
>> v1 -> v2:
>> - Merge three patches which add ENCODE_FRAME_POINTER together
>> - Update commit message
>> - Delete the KRETPORBES stuff added in unwind_state, we don't need them
>> to recover the kretporbes ret_addr because we can get it in pt_regs
>> ---
>> Chen Zhongjin (4):
>> riscv: stacktrace: Replace walk_stackframe with arch_stack_walk
>> riscv: stacktrace: Introduce unwind functions
>> riscv: stacktrace: Save pt_regs in ENCODE_FRAME_POINTER
>> riscv: stacktrace: Implement stacktrace for irq
>>
>> arch/riscv/include/asm/frame.h | 45 ++++++
>> arch/riscv/include/asm/stacktrace.h | 9 +-
>> arch/riscv/kernel/entry.S | 3 +
>> arch/riscv/kernel/mcount-dyn.S | 7 +
>> arch/riscv/kernel/perf_callchain.c | 2 +-
>> arch/riscv/kernel/probes/kprobes_trampoline.S | 7 +
>> arch/riscv/kernel/stacktrace.c | 150 ++++++++++++------
>> 7 files changed, 174 insertions(+), 49 deletions(-)
>> create mode 100644 arch/riscv/include/asm/frame.h
>>
>> --
>> 2.17.1
>>

2022-09-23 13:57:40

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH for-next v2 0/4] riscv: Improvments for stacktrace

On Thu, Sep 22, 2022 at 04:22:23PM +0800, Chen Zhongjin wrote:
> Hi,
>
> On 2022/9/21 22:02, Mark Rutland wrote:
> > On Wed, Sep 21, 2022 at 08:51:23PM +0800, Chen Zhongjin wrote:
> > > Currently, the stacktrace with FRAME_POINTER on riscv has some problem:
> > >
> > > 1. stacktrace will stop at irq so it can't get the stack frames before
> > > irq entry.
> > > 2. stacktrace can't unwind all the real stack frames when there is
> > > k{ret}probes or ftrace.
> > >
> > > These are mainly becase when there is a pt_regs on stack, we can't unwind
> > > the stack frame as normal function.
> > >
> > > Some architectures (e.g. arm64) create a extra stackframe inside pt_regs.
> > > However this doesn't work for riscv because the ra is not ensured to be
> > > pushed to stack. As explained in:
> > > commit f766f77a74f5("riscv/stacktrace: Fix stack output without ra on the stack top")
> > FWIW, this is also a latent problem on arm64, since we don't know whether the
> > LR is live at an exception boundary (and we currently always ignore it).
>
> In fact this is still a problem for riscv, I didn't unwind the caller of
> ftrace or probed func
>
> of kretprobe because they are inside function pro/epilogue.
>
> The problem on riscv is a little more complex, for leaf function, ra will
> always not be pushed to stack and fp will be pushed to ra slot. This patch
> solved this problem.

That's the same on arm64, our `LR` is equivalent to your `RA`.

I have a series at:

https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/stacktrace/metadata

... which explciitly indicates pt_regs::pc and pt_regs::lr in the stack trace
output (and for RELIABLE_STACKTRACE we'd mark the LR entries as unreliable).

> > My plan to fix that on arm64 is to push an empty frame record to the stack upon
> > an exception, and use that to find the pt_regs, from which we can get the PC
> > and LR (and then we can report the later as unreliable). That should be roughly
> > equivalent to what you do in this series (where use use the LSB to identify
> > that the pointer is actually a pt_regs).
>
> IIRC, this solution looks like:
>
> =====
>
> Func1     { lr, fp } or { nothing }
>
> irq entry { pt_regs & empty stackframe[2] }
>
> handler   { lr, fp }
>
> ======
>
> handler->fp points to stackframe, and when we find stackframe is empty, we
> know we are inside
>
> an pt_regs and can find registers by offset.
>
> I think it's available, there no other scenario will cause the frame list
> contains zero (unless stack is clobbered).

That's basically what I do in my series; see:

https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=arm64/stacktrace/metadata&id=7394a825bc4f8243b28261517999793acb6f11cd

... but I haven't posted that to the list yet as I *also* want to add code to
check that we were at an exception boundary when we saw this (based on the
prior PC/LR being in the entry text, using some metadata I have yet to add).

> And it seems only fp slot should be zero, lr will be clobbered after
> function call, we cannot use lr inside pt_regs.

I'm not sure what you mean here.

I agree that the LR in the pt_regs isn't *reliable*, and won't *always* be
valid, but it will sometimes be valid.

> > One important thing to note is that when crossing an exception boundary you
> > won't know whether RA is live, and so you might try to consume an address twice
> > (if it has also been pushed to the stack). That could be problematic for
> > unwinding ftrace or kretprobes. On arm64 we had to implement
> > HAVE_FUNCTION_GRAPH_RET_ADDR_PTR so that we could reliably unwind ftrace. See
> > commit:
> >
> > c6d3cd32fd0064af ("arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR")
> >
> > ... and we haven't yet come up with something similar for kretprobes (though I
> > suspect we'll need to).
>
> Can we test the sp and fp inside pt_regs? Because luckily arm64 saves lr, fp
> and moves sp together.
>
> Before sp is moved we will have fp == sp from last frame 'mov x29, sp'. So
> if they are same, we uses the lr in pt_regs, conversely we use lr on stack.

The frame record (which FP points to) can live anywhere in a stack frame, so
comparing aginst SP doesn't tell us anything. Regardless of whether FP == SP,
the LTR might be valid or invalid.

There is no way of knowing if LR is live without information from the compiler
that we don't have today (though from LPC 2022, it seems like CTF *might* be
sufficient -- that's on my list of things to look at).

Thanks,
Mark.