From: "Madhavan T. Venkataraman" <[email protected]>
There are a number of places in kernel code where the stack trace is not
reliable. Enhance the unwinder to check for those cases and mark the
stack trace as unreliable. Once all of the checks are in place, the unwinder
can provide a reliable stack trace. But before this can be used for livepatch,
some other entity needs to guarantee that the frame pointers are all set up
correctly in kernel functions. objtool is currently being worked on to
address that need.
Return address check
====================
Check the return PC of every stack frame to make sure that it is a valid
kernel text address (and not some generated code, for example). If it is
not a valid kernel text address, mark the stack trace as unreliable.
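In essence, the check is the following (a minimal sketch; patch 1 below
folds it into unwind_frame()):

	/* A return PC outside kernel text taints the whole trace. */
	if (!__kernel_text_address(frame->pc))
		frame->reliable = false;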
Assembly functions
==================
There are a number of assembly functions in arm64. Except for a couple of
them, these functions do not have a frame pointer prolog or epilog. Also,
many of them manipulate low-level state such as registers. These functions
are, by definition, unreliable from a stack unwinding perspective. That is,
when these functions occur in a stack trace, the unwinder would not be able
to unwind through them reliably.
Assembly functions are defined as SYM_FUNC_*() functions or SYM_CODE_*()
functions. objtool performs static analysis of SYM_FUNC functions. It ignores
SYM_CODE functions because they have low level code that is difficult to
analyze. When objtool becomes ready eventually, SYM_FUNC functions will
be analyzed and "fixed" as necessary. So, they are not "interesting" for
the reliable unwinder.
That leaves SYM_CODE functions. It is up to the unwinder to deal with these
for a reliable stack trace. The unwinder needs to do the following:
- Recognize SYM_CODE functions in a stack trace.
- If a particular SYM_CODE function can be unwound through using
some special logic, then do it. E.g., the return trampoline for
Function Graph Tracing.
- Otherwise, mark the stack trace unreliable.
Using text sections to recognize SYM_CODE functions
===================================================
SYM_CODE functions are present in the following text sections:
(1) .entry.text
(2) .idmap.text
(3) .hyp.idmap.text
(4) .hyp.text
(5) .hibernate_exit.text
(6) .entry.tramp.text
(7) .text
(8) .init.text
For each of these sections, there are global variables that contain the
starting and ending addresses generated by the linker. So, they can be
recognized easily. Create an array called sym_code_ranges[] to contain
the ranges for sections (1) thru (6).
Check if the return PC falls in any of these sections. If it does, mark
the stack trace unreliable.
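The check itself is simple. A minimal sketch of the idea (patch 2 below
carries the full table, with config guards; in_sym_code_range() is just a
stand-in name used in this cover letter):

	struct code_range {
		unsigned long start;
		unsigned long end;
	};

	static const struct code_range sym_code_ranges[] = {
		{ (unsigned long)__entry_text_start,
		  (unsigned long)__entry_text_end },
		/* ... likewise for sections (2) thru (6) ... */
	};

	static bool in_sym_code_range(unsigned long pc)
	{
		int i;

		for (i = 0; i < ARRAY_SIZE(sym_code_ranges); i++) {
			if (pc >= sym_code_ranges[i].start &&
			    pc < sym_code_ranges[i].end)
				return true;
		}
		return false;
	}

This sketch also folds in the review feedback further down in the thread
(static, const and ARRAY_SIZE() instead of a sentinel entry).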
Sections (7) and (8)
====================
Sections (7) and (8) are generic sections which also contain tons of other
functions which are actually reliable. The SYM_CODE functions in these
sections must be dealt with differently.
Some of these are "don't care" functions so they can be left in their
current sections. E.g.,

	efi_enter_kernel()
	arm64_relocate_new_kernel()

		These functions enter the kernel. So, they will not occur
		in a stack trace examined by livepatch.

	__kernel_rt_sigreturn()

		This only gets invoked in user context.

	hypervisor vectors

		I don't believe these are interesting to livepatch.
Others need to be moved to a special section. I have created a "catch-all"
section called ".code.text" for this purpose. Add this section to
vmlinux.lds.S and sym_code_ranges[].
Reliable SYM_CODE functions
===========================
The unwinder currently has logic to recognize the return trampoline of the
Function Graph Tracer (return_to_handler()), retrieve the original return
address and use that in the stack trace. So, this is a SYM_CODE function
that the unwinder can actually unwind through.
However, the current logic only addresses stack traces taken while still in
the traced function. When the traced function returns, control is transferred
to return_to_handler(). Any stack trace taken while in the return trampoline
is not handled. This messes up the stack trace as the unwinder has to keep
track of the current index within the return address stack.
There are two options:
- Either remove this logic altogether. This would work since the
unwinder would recognize the section of the trampoline and
treat the stack trace as unreliable. So, from a live patch
perspective, this is sufficient.
- Or, fix the logic. I have taken this approach. See the patch
for more details. That said, I am open to option 1.
Other cases like kretprobe_trampoline() and ftrace entry code can be
addressed in a similar fashion. But this is outside the scope of this
work. The only reason I fixed the logic for return_to_handler() in the
unwinder is because the logic is already there.
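In outline, the fixed lookup does the following (see the last patch for
the actual code):

	ret_stack = ftrace_graph_get_ret_stack(tsk, frame->graph);
	if (!ret_stack || frame->fp != ret_stack->fp) {
		/* In the window; the entry has already been popped. */
		frame->reliable = false;
	} else {
		/* Normal case; consume one return address stack entry. */
		frame->pc = ptrauth_strip_insn_pac(ret_stack->ret);
		frame->graph++;
	}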
Last stack frame
================
If a SYM_CODE function occurs in the very last frame in the stack trace,
then the stack trace is not considered unreliable, because there is no
more unwinding to do (see the sketch after the examples). Examples:
- EL0 exception stack traces end in the top level EL0 exception
handlers.
- All kernel thread stack traces end in ret_from_fork().
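A sketch of the rule (in_sym_code_range() as in the earlier sketch;
frame->fp becomes 0 once the final frame has been reached):

	/* Only taint the trace if there is a caller left to unwind to. */
	if (in_sym_code_range(frame->pc) && frame->fp)
		frame->reliable = false;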
Special cases
=============
Some special cases need to be mentioned.
EL1 interrupt and exception handlers are present in .entry.text. So, all
EL1 interrupt and exception stack traces will be considered unreliable.
This is the correct behavior as interrupts and exceptions can happen on any
instruction including ones in the frame pointer prolog and epilog. Unless
objtool generates metadata so the unwinder can unwind through these
special cases, such stack traces will be considered unreliable.
A task can get preempted at the end of an interrupt. Stack traces of
preempted tasks will show the interrupt frame in the stack trace and
will be considered unreliable.
Breakpoints are exceptions. So, all stack traces in the breakpoint
handler (including probes) will be considered unreliable.
All of the ftrace trampoline code that gets executed at the beginning
of a traced function falls in ".code.text". All stack traces taken from
tracer functions will be considered unreliable.
The same is true for kretprobe trampolines.
---
Changelog:
v3:
- Implemented a sym_code_ranges[] array to contain the section bounds
for text sections that contain SYM_CODE_*() functions. The unwinder
checks each return PC against the sections. If it falls in any of
the sections, the stack trace is marked unreliable.
- Moved SYM_CODE functions from .text and .init.text into a new
text section called ".code.text". Added this section to
vmlinux.lds.S and sym_code_ranges[].
- Fixed the logic in the unwinder that handles Function Graph
Tracer return trampoline.
- Removed all the previous code that handles:
- ftrace entry code for traced function
- special_functions[] array that lists individual functions
- kretprobe_trampoline() special case
- Synced with mainline v5.12-rc8.
v2:
- Removed the terminating entry { 0, 0 } in special_functions[]
and replaced it with the idiom { /* sentinel */ }.
- Changed the ftrace trampoline entry ftrace_graph_call in
special_functions[] to ftrace_call + 4 and added explanatory
comments.
- Unnested #ifdefs in special_functions[] for FTRACE.
v1:
- Define a bool field in struct stackframe. This will indicate if
a stack trace is reliable.
- Implement a special_functions[] array that will be populated
with special functions in which the stack trace is considered
unreliable.
- Using kallsyms_lookup(), get the address ranges for the special
functions and record them.
- Implement an is_reliable_function(pc). This function will check
if a given return PC falls in any of the special functions. If
it does, the stack trace is unreliable.
- Implement check_reliability() function that will check if a
stack frame is reliable. Call is_reliable_function() from
check_reliability().
- Before a return PC is checked against special_functions[], it
must be validated as a proper kernel text address. Call
__kernel_text_address() from check_reliability().
- Finally, call check_reliability() from unwind_frame() for
each stack frame.
- Add EL1 exception handlers to special_functions[].
el1_sync()
el1_irq()
el1_error()
el1_sync_invalid()
el1_irq_invalid()
el1_fiq_invalid()
el1_error_invalid()
- The above functions are currently defined as LOCAL symbols.
Make them global so that they can be referenced from the
unwinder code.
- Add FTRACE trampolines to special_functions[]:
ftrace_graph_call()
ftrace_graph_caller()
return_to_handler()
- Add the kretprobe trampoline to special_functions[]:
kretprobe_trampoline()
Previous versions and discussion
================================
v2: https://lore.kernel.org/linux-arm-kernel/[email protected]/
v1: https://lore.kernel.org/linux-arm-kernel/[email protected]/
Madhavan T. Venkataraman (4):
arm64: Introduce stack trace reliability checks in the unwinder
arm64: Check the return PC against unreliable code sections
arm64: Handle miscellaneous functions in .text and .init.text
arm64: Handle function graph tracer better in the unwinder
arch/arm64/include/asm/sections.h | 1 +
arch/arm64/include/asm/stacktrace.h | 7 +
arch/arm64/kernel/entry-ftrace.S | 5 +
arch/arm64/kernel/entry.S | 6 +
arch/arm64/kernel/head.S | 3 +-
arch/arm64/kernel/probes/kprobes_trampoline.S | 2 +
arch/arm64/kernel/stacktrace.c | 129 ++++++++++++++++--
arch/arm64/kernel/vmlinux.lds.S | 7 +
8 files changed, 149 insertions(+), 11 deletions(-)
base-commit: bf05bf16c76bb44ab5156223e1e58e26dfe30a88
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
The unwinder should check for the presence of various features and
conditions that can render the stack trace unreliable and mark the
stack trace as unreliable for the benefit of the caller.
Introduce the first reliability check - if a return PC encountered in a
stack trace is not a valid kernel text address, it probably belongs to
some generated code, so mark the stack trace unreliable.
Other reliability checks will be added in the future.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
arch/arm64/include/asm/stacktrace.h | 4 ++++
arch/arm64/kernel/stacktrace.c | 19 ++++++++++++++++---
2 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index eb29b1fe8255..f1eab6b029f7 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -49,6 +49,8 @@ struct stack_info {
*
* @graph: When FUNCTION_GRAPH_TRACER is selected, holds the index of a
* replacement lr value in the ftrace graph stack.
+ *
+ * @reliable: Is this stack frame reliable?
*/
struct stackframe {
unsigned long fp;
@@ -59,6 +61,7 @@ struct stackframe {
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
int graph;
#endif
+ bool reliable;
};
extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
@@ -169,6 +172,7 @@ static inline void start_backtrace(struct stackframe *frame,
bitmap_zero(frame->stacks_done, __NR_STACK_TYPES);
frame->prev_fp = 0;
frame->prev_type = STACK_TYPE_UNKNOWN;
+ frame->reliable = true;
}
#endif /* __ASM_STACKTRACE_H */
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index d55bdfb7789c..c21a1bca28f3 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -44,6 +44,8 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
unsigned long fp = frame->fp;
struct stack_info info;
+ frame->reliable = true;
+
/* Terminal record; nothing to unwind */
if (!fp)
return -ENOENT;
@@ -86,12 +88,24 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
*/
frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
+ frame->pc = ptrauth_strip_insn_pac(frame->pc);
frame->prev_fp = fp;
frame->prev_type = info.type;
+ /*
+ * First, make sure that the return address is a proper kernel text
+ * address. A NULL or invalid return address probably means there's
+ * some generated code which __kernel_text_address() doesn't know
+ * about. Mark the stack trace as not reliable.
+ */
+ if (!__kernel_text_address(frame->pc)) {
+ frame->reliable = false;
+ return 0;
+ }
+
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
if (tsk->ret_stack &&
- (ptrauth_strip_insn_pac(frame->pc) == (unsigned long)return_to_handler)) {
+ frame->pc == (unsigned long)return_to_handler) {
struct ftrace_ret_stack *ret_stack;
/*
* This is a case where function graph tracer has
@@ -103,11 +117,10 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
if (WARN_ON_ONCE(!ret_stack))
return -EINVAL;
frame->pc = ret_stack->ret;
+ frame->pc = ptrauth_strip_insn_pac(frame->pc);
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
- frame->pc = ptrauth_strip_insn_pac(frame->pc);
-
return 0;
}
NOKPROBE_SYMBOL(unwind_frame);
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
Create a sym_code_ranges[] array to cover the following text sections that
contain functions defined as SYM_CODE_*(). These functions are low-level
functions (and do not have a proper frame pointer prolog and epilog). So,
they are inherently unreliable from a stack unwinding perspective.
.entry.text
.idmap.text
.hyp.idmap.text
.hyp.text
.hibernate_exit.text
.entry.tramp.text
If a return PC falls in any of these, mark the stack trace unreliable.
The only exception to this is - if the unwinder has reached the last
frame already, it will not mark the stack trace unreliable since there
is no more unwinding to do. E.g.,
- ret_from_fork() occurs at the end of the stack trace of
kernel tasks.
- el0_*() functions occur at the end of EL0 exception stack
traces. This covers all user task entries into the kernel.
NOTE:
- EL1 exception handlers are in .entry.text. So, stack traces that
contain those functions will be marked not reliable. This covers
interrupts, exceptions and breakpoints encountered while executing
in the kernel.
- At the end of an interrupt, the kernel can preempt the current
task if required. So, the stack traces of all preempted tasks will
show the interrupt frame and will be considered unreliable.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
arch/arm64/kernel/stacktrace.c | 54 ++++++++++++++++++++++++++++++++++
1 file changed, 54 insertions(+)
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index c21a1bca28f3..1ff14615a55a 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -15,9 +15,48 @@
#include <asm/irq.h>
#include <asm/pointer_auth.h>
+#include <asm/sections.h>
#include <asm/stack_pointer.h>
#include <asm/stacktrace.h>
+struct code_range {
+ unsigned long start;
+ unsigned long end;
+};
+
+struct code_range sym_code_ranges[] =
+{
+ /* non-unwindable ranges */
+ { (unsigned long)__entry_text_start,
+ (unsigned long)__entry_text_end },
+ { (unsigned long)__idmap_text_start,
+ (unsigned long)__idmap_text_end },
+ { (unsigned long)__hyp_idmap_text_start,
+ (unsigned long)__hyp_idmap_text_end },
+ { (unsigned long)__hyp_text_start,
+ (unsigned long)__hyp_text_end },
+#ifdef CONFIG_HIBERNATION
+ { (unsigned long)__hibernate_exit_text_start,
+ (unsigned long)__hibernate_exit_text_end },
+#endif
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+ { (unsigned long)__entry_tramp_text_start,
+ (unsigned long)__entry_tramp_text_end },
+#endif
+ { /* sentinel */ }
+};
+
+static struct code_range *lookup_range(unsigned long pc)
+{
+ struct code_range *range;
+
+ for (range = sym_code_ranges; range->start; range++) {
+ if (pc >= range->start && pc < range->end)
+ return range;
+ }
+ return range;
+}
+
/*
* AArch64 PCS assigns the frame pointer to x29.
*
@@ -43,6 +82,7 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
{
unsigned long fp = frame->fp;
struct stack_info info;
+ struct code_range *range;
frame->reliable = true;
@@ -103,6 +143,8 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
return 0;
}
+ range = lookup_range(frame->pc);
+
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
if (tsk->ret_stack &&
frame->pc == (unsigned long)return_to_handler) {
@@ -118,9 +160,21 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
return -EINVAL;
frame->pc = ret_stack->ret;
frame->pc = ptrauth_strip_insn_pac(frame->pc);
+ return 0;
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+ if (!range->start)
+ return 0;
+
+ /*
+ * The return PC falls in an unreliable function. If the final frame
+ * has been reached, no more unwinding is needed. Otherwise, mark the
+ * stack trace not reliable.
+ */
+ if (frame->fp)
+ frame->reliable = false;
+
return 0;
}
NOKPROBE_SYMBOL(unwind_frame);
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
The Function Graph Tracer modifies the return address of a traced function
to a return trampoline (return_to_handler()) to gather tracing data on
function return. When the unwinder encounters return_to_handler(), it calls
ftrace_graph_get_ret_stack() to lookup the original return address in the
return address stack.
This lookup will succeed as long as the unwinder is invoked when the traced
function is executing. However, when the traced function returns and control
goes to return_to_handler(), this lookup will not succeed because:
- the return address on the stack would not be return_to_handler. It would
be return_to_handler+someoffset. To solve this, get the address range for
return_to_handler() by looking up its symbol table entry and check if
frame->pc falls in the range. This is also required for the unwinder to
maintain the index into the return address stack correctly as it unwinds
through Function Graph trace return trampolines.
- the original return address will be popped off the return address stack
at some point. From this point till the end of return_to_handler(),
the lookup will not succeed. The stack trace is unreliable in that
window.
On arm64, each return address stack entry also stores the FP of the
caller of the traced function. Compare the FP in the current frame
with the entry that is looked up. If the FP matches, then all is
well. Else, it is in the window; mark the stack trace unreliable.
Although it is possible to close the window mentioned above, it is
not worth it. It is a tiny window.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
arch/arm64/include/asm/stacktrace.h | 3 ++
arch/arm64/kernel/stacktrace.c | 60 ++++++++++++++++++++++++-----
2 files changed, 53 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index f1eab6b029f7..e70a2a6451db 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -69,6 +69,7 @@ extern void walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
bool (*fn)(void *, unsigned long), void *data);
extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
const char *loglvl);
+extern void init_ranges(void);
DECLARE_PER_CPU(unsigned long *, irq_stack_ptr);
@@ -154,6 +155,8 @@ static inline bool on_accessible_stack(const struct task_struct *tsk,
static inline void start_backtrace(struct stackframe *frame,
unsigned long fp, unsigned long pc)
{
+ init_ranges();
+
frame->fp = fp;
frame->pc = pc;
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 33e174160f9b..7504aec79faa 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -26,6 +26,9 @@ struct code_range {
struct code_range sym_code_ranges[] =
{
+ /* unwindable ranges */
+ { (unsigned long)return_to_handler, 0 },
+
/* non-unwindable ranges */
{ (unsigned long)__entry_text_start,
(unsigned long)__entry_text_end },
@@ -48,6 +51,33 @@ struct code_range sym_code_ranges[] =
{ /* sentinel */ }
};
+void init_ranges(void)
+{
+ static char sym[KSYM_NAME_LEN];
+ static bool inited = false;
+ struct code_range *range;
+ unsigned long pc, size, offset;
+
+ if (inited)
+ return;
+
+ for (range = sym_code_ranges; range->start; range++) {
+ if (range->end)
+ continue;
+
+ pc = (unsigned long)range->start;
+ if (kallsyms_lookup(pc, &size, &offset, NULL, sym)) {
+ range->start = pc - offset;
+ range->end = range->start + size;
+ } else {
+ /* Range will only include one instruction */
+ range->start = pc;
+ range->end = pc + 4;
+ }
+ }
+ inited = true;
+}
+
static struct code_range *lookup_range(unsigned long pc)
{
struct code_range *range;
@@ -149,19 +179,29 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
if (tsk->ret_stack &&
- frame->pc == (unsigned long)return_to_handler) {
+ range->start == (unsigned long)return_to_handler) {
struct ftrace_ret_stack *ret_stack;
/*
- * This is a case where function graph tracer has
- * modified a return address (LR) in a stack frame
- * to hook a function return.
- * So replace it to an original value.
+ * Either the function graph tracer has modified a return
+ * address (LR) in a stack frame to the return trampoline.
+ * Or, the return trampoline itself is executing upon the
+ * return of a traced function. Lookup the original return
+ * address and replace frame->pc with it.
+ *
+ * However, the return trampoline pops the original return
+ * address off the return address stack at some point. So,
+ * there is a small window towards the end of the return
+ * trampoline where the lookup will fail. In that case,
+ * mark the stack trace as unreliable and proceed.
*/
- ret_stack = ftrace_graph_get_ret_stack(tsk, frame->graph++);
- if (WARN_ON_ONCE(!ret_stack))
- return -EINVAL;
- frame->pc = ret_stack->ret;
- frame->pc = ptrauth_strip_insn_pac(frame->pc);
+ ret_stack = ftrace_graph_get_ret_stack(tsk, frame->graph);
+ if (!ret_stack || frame->fp != ret_stack->fp) {
+ frame->reliable = false;
+ } else {
+ frame->pc = ret_stack->ret;
+ frame->pc = ptrauth_strip_insn_pac(frame->pc);
+ frame->graph++;
+ }
return 0;
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
There are some SYM_CODE functions that are currently in ".text" or
".init.text" sections. Some of these are functions that the unwinder
does not care about as they are not "interesting" to livepatch. These
will remain in their current sections. The rest I have moved into a
new section called ".code.text".
Include .code.text in sym_code_ranges[] so the unwinder can check it.
I have listed the names of the functions along with the name of their
existing section.
Don't care functions
====================
efi-entry.S:
efi_enter_kernel .init.text
relocate_kernel.S:
arm64_relocate_new_kernel .text
sigreturn.S:
__kernel_rt_sigreturn .text
arch/arm64/kvm/hyp/hyp-entry.S:
el2t_sync_invalid .text
el2t_irq_invalid .text
el2t_fiq_invalid .text
el2t_error_invalid .text
el2h_irq_invalid .text
el2h_fiq_invalid .text
el1_fiq_invalid .text
__kvm_hyp_vector .text
__bp_harden_hyp_vecs .text
arch/arm64/kvm/hyp/nvhe/host.S:
__kvm_hyp_host_vector .text
__kvm_hyp_host_forward_smc .text
Rest of the functions (moved to .code.text)
===========================================
entry.S:
__swpan_entry_el1 .text
__swpan_exit_el1 .text
__swpan_entry_el0 .text
__swpan_exit_el0 .text
ret_from_fork .text
__sdei_asm_handler .text
head.S:
primary_entry .init.text
preserve_boot_args .init.text
entry-ftrace.S:
ftrace_regs_caller .text
ftrace_caller .text
ftrace_common .text
ftrace_graph_caller .text
return_to_handler .text
kprobes_trampoline.S:
kretprobe_trampoline .text
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
arch/arm64/include/asm/sections.h | 1 +
arch/arm64/kernel/entry-ftrace.S | 5 +++++
arch/arm64/kernel/entry.S | 6 ++++++
arch/arm64/kernel/head.S | 3 ++-
arch/arm64/kernel/probes/kprobes_trampoline.S | 2 ++
arch/arm64/kernel/stacktrace.c | 2 ++
arch/arm64/kernel/vmlinux.lds.S | 7 +++++++
7 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/sections.h b/arch/arm64/include/asm/sections.h
index 2f36b16a5b5d..bceda68aaa79 100644
--- a/arch/arm64/include/asm/sections.h
+++ b/arch/arm64/include/asm/sections.h
@@ -20,5 +20,6 @@ extern char __exittext_begin[], __exittext_end[];
extern char __irqentry_text_start[], __irqentry_text_end[];
extern char __mmuoff_data_start[], __mmuoff_data_end[];
extern char __entry_tramp_text_start[], __entry_tramp_text_end[];
+extern char __code_text_start[], __code_text_end[];
#endif /* __ASM_SECTIONS_H */
diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
index b3e4f9a088b1..c0831a49c290 100644
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -12,7 +12,9 @@
#include <asm/ftrace.h>
#include <asm/insn.h>
+ .text
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+ .pushsection ".code.text", "ax"
/*
* Due to -fpatchable-function-entry=2, the compiler has placed two NOPs before
* the regular function prologue. For an enabled callsite, ftrace_init_nop() and
@@ -135,6 +137,7 @@ SYM_CODE_START(ftrace_graph_caller)
b ftrace_common_return
SYM_CODE_END(ftrace_graph_caller)
#endif
+ .popsection
#else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
@@ -315,6 +318,7 @@ SYM_FUNC_START(ftrace_stub)
SYM_FUNC_END(ftrace_stub)
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+ .pushsection ".code.text", "ax"
/*
* void return_to_handler(void)
*
@@ -342,4 +346,5 @@ SYM_CODE_START(return_to_handler)
ret
SYM_CODE_END(return_to_handler)
+ .popsection
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 6acfc5e6b5e0..3f9f7f80cd65 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -402,6 +402,7 @@ alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0
.endm
#ifdef CONFIG_ARM64_SW_TTBR0_PAN
+ .pushsection ".code.text", "ax"
/*
* Set the TTBR0 PAN bit in SPSR. When the exception is taken from
* EL0, there is no need to check the state of TTBR0_EL1 since
@@ -442,6 +443,7 @@ SYM_CODE_START_LOCAL(__swpan_exit_el0)
*/
b post_ttbr_update_workaround
SYM_CODE_END(__swpan_exit_el0)
+ .popsection
#endif
.macro irq_stack_entry
@@ -950,6 +952,7 @@ SYM_FUNC_START(cpu_switch_to)
SYM_FUNC_END(cpu_switch_to)
NOKPROBE(cpu_switch_to)
+ .pushsection ".code.text", "ax"
/*
* This is how we return from a fork.
*/
@@ -962,6 +965,7 @@ SYM_CODE_START(ret_from_fork)
b ret_to_user
SYM_CODE_END(ret_from_fork)
NOKPROBE(ret_from_fork)
+ .popsection
#ifdef CONFIG_ARM_SDE_INTERFACE
@@ -1040,6 +1044,7 @@ SYM_DATA_END(__sdei_asm_trampoline_next_handler)
#endif /* CONFIG_RANDOMIZE_BASE */
#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
+ .pushsection ".code.text", "ax"
/*
* Software Delegated Exception entry point.
*
@@ -1150,4 +1155,5 @@ alternative_else_nop_endif
#endif
SYM_CODE_END(__sdei_asm_handler)
NOKPROBE(__sdei_asm_handler)
+ .popsection
#endif /* CONFIG_ARM_SDE_INTERFACE */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 840bda1869e9..4ce96dfac2b8 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -75,7 +75,7 @@
__EFI_PE_HEADER
__INIT
-
+ .pushsection ".code.text", "ax"
/*
* The following callee saved general purpose registers are used on the
* primary lowlevel boot path:
@@ -120,6 +120,7 @@ SYM_CODE_START_LOCAL(preserve_boot_args)
mov x1, #0x20 // 4 x 8 bytes
b __inval_dcache_area // tail call
SYM_CODE_END(preserve_boot_args)
+ .popsection
/*
* Macro to create a table entry to the next page.
diff --git a/arch/arm64/kernel/probes/kprobes_trampoline.S b/arch/arm64/kernel/probes/kprobes_trampoline.S
index 288a84e253cc..9244e119af3e 100644
--- a/arch/arm64/kernel/probes/kprobes_trampoline.S
+++ b/arch/arm64/kernel/probes/kprobes_trampoline.S
@@ -8,6 +8,7 @@
#include <asm/assembler.h>
.text
+ .pushsection ".code.text", "ax"
.macro save_all_base_regs
stp x0, x1, [sp, #S_X0]
@@ -80,3 +81,4 @@ SYM_CODE_START(kretprobe_trampoline)
ret
SYM_CODE_END(kretprobe_trampoline)
+ .popsection
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 1ff14615a55a..33e174160f9b 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -43,6 +43,8 @@ struct code_range sym_code_ranges[] =
{ (unsigned long)__entry_tramp_text_start,
(unsigned long)__entry_tramp_text_end },
#endif
+ { (unsigned long)__code_text_start,
+ (unsigned long)__code_text_end },
{ /* sentinel */ }
};
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7eea7888bb02..c00b3232e6dc 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -103,6 +103,12 @@ jiffies = jiffies_64;
#define TRAMP_TEXT
#endif
+#define CODE_TEXT \
+ . = ALIGN(SZ_4K); \
+ __code_text_start = .; \
+ *(.code.text) \
+ __code_text_end = .;
+
/*
* The size of the PE/COFF section that covers the kernel image, which
* runs from _stext to _edata, must be a round multiple of the PE/COFF
@@ -145,6 +151,7 @@ SECTIONS
SOFTIRQENTRY_TEXT
ENTRY_TEXT
TEXT_TEXT
+ CODE_TEXT
SCHED_TEXT
CPUIDLE_TEXT
LOCK_TEXT
--
2.25.1
On Mon, May 03, 2021 at 12:36:12PM -0500, [email protected] wrote:
> + /*
> + * First, make sure that the return address is a proper kernel text
> + * address. A NULL or invalid return address probably means there's
> + * some generated code which __kernel_text_address() doesn't know
> + * about. Mark the stack trace as not reliable.
> + */
> + if (!__kernel_text_address(frame->pc)) {
> + frame->reliable = false;
> + return 0;
> + }
Do we want the return here? It means that...
> +
> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> if (tsk->ret_stack &&
> - (ptrauth_strip_insn_pac(frame->pc) == (unsigned long)return_to_handler)) {
> + frame->pc == (unsigned long)return_to_handler) {
> struct ftrace_ret_stack *ret_stack;
> /*
> * This is a case where function graph tracer has
> @@ -103,11 +117,10 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> if (WARN_ON_ONCE(!ret_stack))
> return -EINVAL;
> frame->pc = ret_stack->ret;
> + frame->pc = ptrauth_strip_insn_pac(frame->pc);
> }
...we skip this handling in the case where we're not in kernel code. I
don't know off hand if that's a case that can happen right now but it
seems more robust to run through this and anything else we add later;
even if it's not relevant now, changes either in the unwinder itself or
resulting from some future work elsewhere may mean it later becomes
important. Skipping further reliability checks is obviously fine if
we've already decided things aren't reliable but this is more than just
a reliability check.
On Mon, May 03, 2021 at 12:36:13PM -0500, [email protected] wrote:
> From: "Madhavan T. Venkataraman" <[email protected]>
>
> Create a sym_code_ranges[] array to cover the following text sections that
> contain functions defined as SYM_CODE_*(). These functions are low-level
This makes sense to me - a few bikesheddy comments below but nothing
really substantive.
> +static struct code_range *lookup_range(unsigned long pc)
This feels like it should have a prefix on the name (eg, unwinder_)
since it looks collision prone. Or lookup_code_range() rather than just
plain lookup_range().
> +{
> + struct code_range *range;
> +
> + for (range = sym_code_ranges; range->start; range++) {
It seems more idiomatic to use ARRAY_SIZE() rather than a sentinel here;
the array can't be empty.
> + range = lookup_range(frame->pc);
> +
> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> if (tsk->ret_stack &&
> frame->pc == (unsigned long)return_to_handler) {
> @@ -118,9 +160,21 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> return -EINVAL;
> frame->pc = ret_stack->ret;
> frame->pc = ptrauth_strip_insn_pac(frame->pc);
> + return 0;
> }
Do we not need to look up the range of the restored pc and validate
what's being pointed to here? It's not immediately obvious why we do
the lookup before handling the function graph tracer, especially given
that we never look at the result and there's now a return added skipping
further reliability checks. At the very least I think this needs some
additional comments so the code is more obvious.
On 5/4/21 11:05 AM, Mark Brown wrote:
> On Mon, May 03, 2021 at 12:36:13PM -0500, [email protected] wrote:
>> From: "Madhavan T. Venkataraman" <[email protected]>
>>
>> Create a sym_code_ranges[] array to cover the following text sections that
>> contain functions defined as SYM_CODE_*(). These functions are low-level
>
> This makes sense to me - a few bikesheddy comments below but nothing
> really substantive.
>
OK.
>> +static struct code_range *lookup_range(unsigned long pc)
>
> This feels like it should have a prefix on the name (eg, unwinder_)
> since it looks collision prone. Or lookup_code_range() rather than just
> plain lookup_range().
>
I will add the prefix.
>> +{
>> + struct code_range *range;
>> +
>> + for (range = sym_code_ranges; range->start; range++) {
>
> It seems more idiomatic to use ARRAY_SIZE() rather than a sentinel here;
> the array can't be empty.
>
If there is a match, I return the matched range. Else, I return the sentinel.
This is just so I don't have to check for range == NULL after calling
lookup_range().
I will change it to what you have suggested and check for NULL explicitly.
It is not a problem.
>> + range = lookup_range(frame->pc);
>> +
>> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>> if (tsk->ret_stack &&
>> frame->pc == (unsigned long)return_to_handler) {
>> @@ -118,9 +160,21 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>> return -EINVAL;
>> frame->pc = ret_stack->ret;
>> frame->pc = ptrauth_strip_insn_pac(frame->pc);
>> + return 0;
>> }
>
> Do we not need to look up the range of the restored pc and validate
> what's being pointed to here? It's not immediately obvious why we do
> the lookup before handling the function graph tracer, especially given
> that we never look at the result and there's now a return added skipping
> further reliability checks. At the very least I think this needs some
> additional comments so the code is more obvious.
I want sym_code_ranges[] to contain both unwindable and non-unwindable ranges.
Unwindable ranges will be special ranges such as the return_to_handler() and
kretprobe_trampoline() functions for which the unwinder has (or will have)
special code to unwind. So, the lookup_range() has to happen before the
function graph code. Please look at the last patch in the series for
the fix for the above function graph code.
On the question of "should the original return address be checked against
sym_code_ranges[]?" - I assumed that if there is a function graph trace on a
function, it had to be an ftraceable function. It would not be a part
of sym_code_ranges[]. Is that a wrong assumption on my part?
Madhavan
On 5/4/21 10:50 AM, Mark Brown wrote:
> On Mon, May 03, 2021 at 12:36:12PM -0500, [email protected] wrote:
>
>> + /*
>> + * First, make sure that the return address is a proper kernel text
>> + * address. A NULL or invalid return address probably means there's
>> + * some generated code which __kernel_text_address() doesn't know
>> + * about. Mark the stack trace as not reliable.
>> + */
>> + if (!__kernel_text_address(frame->pc)) {
>> + frame->reliable = false;
>> + return 0;
>> + }
>
> Do we want the return here? It means that...
>
>> +
>> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>> if (tsk->ret_stack &&
>> - (ptrauth_strip_insn_pac(frame->pc) == (unsigned long)return_to_handler)) {
>> + frame->pc == (unsigned long)return_to_handler) {
>> struct ftrace_ret_stack *ret_stack;
>> /*
>> * This is a case where function graph tracer has
>> @@ -103,11 +117,10 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>> if (WARN_ON_ONCE(!ret_stack))
>> return -EINVAL;
>> frame->pc = ret_stack->ret;
>> + frame->pc = ptrauth_strip_insn_pac(frame->pc);
>> }
>
> ...we skip this handling in the case where we're not in kernel code. I
> don't know off hand if that's a case that can happen right now but it
> seems more robust to run through this and anything else we add later;
> even if it's not relevant now, changes either in the unwinder itself or
> resulting from some future work elsewhere may mean it later becomes
> important. Skipping further reliability checks is obviously fine if
> we've already decided things aren't reliable but this is more than just
> a reliability check.
>
AFAICT, currently, all the functions that the unwinder checks do have
valid kernel text addresses. However, I don't think there is any harm
in letting it fall through and make all the checks. So, I will remove the
return statement.
Thanks!
Madhavan
On 5/4/21 2:03 PM, Madhavan T. Venkataraman wrote:
>
>
> On 5/4/21 11:05 AM, Mark Brown wrote:
>> On Mon, May 03, 2021 at 12:36:13PM -0500, [email protected] wrote:
>>> From: "Madhavan T. Venkataraman" <[email protected]>
>>>
>>> Create a sym_code_ranges[] array to cover the following text sections that
>>> contain functions defined as SYM_CODE_*(). These functions are low-level
>>
>> This makes sense to me - a few bikesheddy comments below but nothing
>> really substantive.
>>
>
> OK.
>
>>> +static struct code_range *lookup_range(unsigned long pc)
>>
>> This feels like it should have a prefix on the name (eg, unwinder_)
>> since it looks collision prone. Or lookup_code_range() rather than just
>> plain lookup_range().
>>
>
> I will add the prefix.
>
>>> +{
>>> + struct code_range *range;
>>> +
>>> + for (range = sym_code_ranges; range->start; range++) {
>>
>> It seems more idiomatic to use ARRAY_SIZE() rather than a sentinel here;
>> the array can't be empty.
>>
>
> If there is a match, I return the matched range. Else, I return the sentinel.
> This is just so I don't have to check for range == NULL after calling
> lookup_range().
>
> I will change it to what you have suggested and check for NULL explicitly.
> It is not a problem.
>
>>> + range = lookup_range(frame->pc);
>>> +
>>> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>>> if (tsk->ret_stack &&
>>> frame->pc == (unsigned long)return_to_handler) {
>>> @@ -118,9 +160,21 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>>> return -EINVAL;
>>> frame->pc = ret_stack->ret;
>>> frame->pc = ptrauth_strip_insn_pac(frame->pc);
>>> + return 0;
>>> }
>>
>> Do we not need to look up the range of the restored pc and validate
>> what's being pointed to here? It's not immediately obvious why we do
>> the lookup before handling the function graph tracer, especially given
>> that we never look at the result and there's now a return added skipping
>> further reliability checks. At the very least I think this needs some
>> additional comments so the code is more obvious.
> I want sym_code_ranges[] to contain both unwindable and non-unwindable ranges.
> Unwindable ranges will be special ranges such as the return_to_handler() and
> kretprobe_trampoline() functions for which the unwinder has (or will have)
> special code to unwind. So, the lookup_range() has to happen before the
> function graph code. Please look at the last patch in the series for
> the fix for the above function graph code.
>
> On the question of "should the original return address be checked against
> sym_code_ranges[]?" - I assumed that if there is a function graph trace on a
> function, it had to be an ftraceable function. It would not be a part
> of sym_code_ranges[]. Is that a wrong assumption on my part?
>
If you prefer, I could do something like this:
check_pc:
if (!__kernel_text_address(frame->pc))
frame->reliable = false;
range = lookup_range(frame->pc);
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
if (tsk->ret_stack &&
frame->pc == (unsigned long)return_to_handler) {
...
frame->pc = ret_stack->ret;
frame->pc = ptrauth_strip_insn_pac(frame->pc);
goto check_pc;
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
Is that acceptable?
Madhavan
On Mon, May 03, 2021 at 12:36:12PM -0500, [email protected] wrote:
> @@ -44,6 +44,8 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> unsigned long fp = frame->fp;
> struct stack_info info;
>
> + frame->reliable = true;
> +
Why set 'reliable' to true on every invocation of unwind_frame()?
Shouldn't it be remembered across frames?
Also, it looks like there are several error scenarios where it returns
-EINVAL but doesn't set 'reliable' to false.
--
Josh
On 5/4/21 4:52 PM, Josh Poimboeuf wrote:
> On Mon, May 03, 2021 at 12:36:12PM -0500, [email protected] wrote:
>> @@ -44,6 +44,8 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>> unsigned long fp = frame->fp;
>> struct stack_info info;
>>
>> + frame->reliable = true;
>> +
>
> Why set 'reliable' to true on every invocation of unwind_frame()?
> Shouldn't it be remembered across frames?
>
This is mainly for debug purposes in case a caller wants to print the whole stack and also
print which functions are unreliable. For livepatch, it does not make any difference. It will
quit as soon as it encounters an unreliable frame.
> Also, it looks like there are several error scenarios where it returns
> -EINVAL but doesn't set 'reliable' to false.
>
I wanted to make a distinction between an error situation (like stack corruption where unwinding
has to stop) and an unreliable situation (where unwinding can still proceed). E.g., when a
stack trace is taken for informational purposes or debug purposes, the unwinding will try to
proceed until either the stack trace ends or an error happens.
Madhavan
On Tue, May 04, 2021 at 06:13:39PM -0500, Madhavan T. Venkataraman wrote:
>
>
> On 5/4/21 4:52 PM, Josh Poimboeuf wrote:
> > On Mon, May 03, 2021 at 12:36:12PM -0500, [email protected] wrote:
> >> @@ -44,6 +44,8 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> >> unsigned long fp = frame->fp;
> >> struct stack_info info;
> >>
> >> + frame->reliable = true;
> >> +
> >
> > Why set 'reliable' to true on every invocation of unwind_frame()?
> > Shouldn't it be remembered across frames?
> >
>
> This is mainly for debug purposes in case a caller wants to print the whole stack and also
> print which functions are unreliable. For livepatch, it does not make any difference. It will
> quit as soon as it encounters an unreliable frame.
Hm, ok. So 'frame->reliable' refers to the current frame, not the
entire stack.
> > Also, it looks like there are several error scenarios where it returns
> > -EINVAL but doesn't set 'reliable' to false.
> >
>
> I wanted to make a distinction between an error situation (like stack corruption where unwinding
> has to stop) and an unreliable situation (where unwinding can still proceed). E.g., when a
> stack trace is taken for informational purposes or debug purposes, the unwinding will try to
> proceed until either the stack trace ends or an error happens.
Ok, but I don't understand how that relates to my comment.
Why wouldn't a stack corruption like !on_accessible_stack() set
'frame->reliable' to false?
In other words: for livepatch purposes, how does the caller tell the
difference between hitting the final stack record -- which returns an
error with reliable 'true' -- and a stack corruption like
!on_accessible_stack(), which also returns an error with reliable
'true'? Surely the latter should be considered unreliable?
--
Josh
On 5/4/21 7:07 PM, Josh Poimboeuf wrote:
> On Tue, May 04, 2021 at 06:13:39PM -0500, Madhavan T. Venkataraman wrote:
>>
>>
>> On 5/4/21 4:52 PM, Josh Poimboeuf wrote:
>>> On Mon, May 03, 2021 at 12:36:12PM -0500, [email protected] wrote:
>>>> @@ -44,6 +44,8 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>>>> unsigned long fp = frame->fp;
>>>> struct stack_info info;
>>>>
>>>> + frame->reliable = true;
>>>> +
>>>
>>> Why set 'reliable' to true on every invocation of unwind_frame()?
>>> Shouldn't it be remembered across frames?
>>>
>>
>> This is mainly for debug purposes in case a caller wants to print the whole stack and also
>> print which functions are unreliable. For livepatch, it does not make any difference. It will
>> quit as soon as it encounters an unreliable frame.
>
> Hm, ok. So 'frame->reliable' refers to the current frame, not the
> entire stack.
>
Yes.
>>> Also, it looks like there are several error scenarios where it returns
>>> -EINVAL but doesn't set 'reliable' to false.
>>>
>>
>> I wanted to make a distinction between an error situation (like stack corruption where unwinding
>> has to stop) and an unreliable situation (where unwinding can still proceed). E.g., when a
>> stack trace is taken for informational purposes or debug purposes, the unwinding will try to
>> proceed until either the stack trace ends or an error happens.
>
> Ok, but I don't understand how that relates to my comment.
>
> Why wouldn't a stack corruption like !on_accessible_stack() set
> 'frame->reliable' to false?
>
I do see your point. If an error has been hit, then the stack trace is essentially unreliable
regardless of anything else. So, I accept your comment. I will mark the stack trace as unreliable
if any kind of error is encountered.
Thanks!
Madhavan
On Tue, May 04, 2021 at 02:03:14PM -0500, Madhavan T. Venkataraman wrote:
> On 5/4/21 11:05 AM, Mark Brown wrote:
> >> @@ -118,9 +160,21 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> >> return -EINVAL;
> >> frame->pc = ret_stack->ret;
> >> frame->pc = ptrauth_strip_insn_pac(frame->pc);
> >> + return 0;
> >> }
> > Do we not need to look up the range of the restored pc and validate
> > what's being pointed to here? It's not immediately obvious why we do
> > the lookup before handling the function graph tracer, especially given
> > that we never look at the result and there's now a return added skipping
> > further reliability checks. At the very least I think this needs some
> > additional comments so the code is more obvious.
> I want sym_code_ranges[] to contain both unwindable and non-unwindable ranges.
> Unwindable ranges will be special ranges such as the return_to_handler() and
> kretprobe_trampoline() functions for which the unwinder has (or will have)
> special code to unwind. So, the lookup_range() has to happen before the
> function graph code. Please look at the last patch in the series for
> the fix for the above function graph code.
That sounds reasonable but like I say should probably be called out in
the code so it's clear to people working with it.
> On the question of "should the original return address be checked against
> sym_code_ranges[]?" - I assumed that if there is a function graph trace on a
> function, it had to be an ftraceable function. It would not be a part
> of sym_code_ranges[]. Is that a wrong assumption on my part?
I can't think of any cases where it wouldn't be right now, but it seems
easier to just do a redundant check than to have the assumption in the
code and have to think about if it's missing.
On Tue, May 04, 2021 at 02:32:35PM -0500, Madhavan T. Venkataraman wrote:
> If you prefer, I could do something like this:
>
> check_pc:
> if (!__kernel_text_address(frame->pc))
> frame->reliable = false;
>
> range = lookup_range(frame->pc);
>
> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> if (tsk->ret_stack &&
> frame->pc == (unsigned long)return_to_handler) {
> ...
> frame->pc = ret_stack->ret;
> frame->pc = ptrauth_strip_insn_pac(frame->pc);
> goto check_pc;
> }
> #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
> Is that acceptable?
I think that works even if it's hard to love the goto, might want some
defensiveness to ensure we can't somehow end up in an infinite loop with
a sufficiently badly formed stack.
On 5/5/21 11:46 AM, Mark Brown wrote:
> On Tue, May 04, 2021 at 02:32:35PM -0500, Madhavan T. Venkataraman wrote:
>
>> If you prefer, I could do something like this:
>>
>> check_pc:
>> if (!__kernel_text_address(frame->pc))
>> frame->reliable = false;
>>
>> range = lookup_range(frame->pc);
>>
>> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>> if (tsk->ret_stack &&
>> frame->pc == (unsigned long)return_to_handler) {
>> ...
>> frame->pc = ret_stack->ret;
>> frame->pc = ptrauth_strip_insn_pac(frame->pc);
>> goto check_pc;
>> }
>> #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
>
>> Is that acceptable?
>
> I think that works even if it's hard to love the goto, might want some
> defensiveness to ensure we can't somehow end up in an infinite loop with
> a sufficiently badly formed stack.
>
I could do something like this:
- Move all frame->pc checking code into a function called check_frame_pc().
bool check_frame_pc(frame)
{
Do all the checks including function graph
return frame->pc changed
}
- Then, in unwind_frame()
unwind_frame()
{
int i;
...
for (i = 0; i < MAX_CHECKS; i++) {
if (!check_frame(tsk, frame))
break;
}
if (i == MAX_CHECKS)
frame->reliable = false;
return 0;
}
The above would take care of future cases like kretprobe_trampoline().
If this is acceptable, then the only question is - what should be the value of
MAX_CHECKS (I will rename it to something more appropriate)?
Madhavan
On 5/5/21 1:48 PM, Madhavan T. Venkataraman wrote:
>
>
> On 5/5/21 11:46 AM, Mark Brown wrote:
>> On Tue, May 04, 2021 at 02:32:35PM -0500, Madhavan T. Venkataraman wrote:
>>
>>> If you prefer, I could do something like this:
>>>
>>> check_pc:
>>> if (!__kernel_text_address(frame->pc))
>>> frame->reliable = false;
>>>
>>> range = lookup_range(frame->pc);
>>>
>>> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>>> if (tsk->ret_stack &&
>>> frame->pc == (unsigned long)return_to_handler) {
>>> ...
>>> frame->pc = ret_stack->ret;
>>> frame->pc = ptrauth_strip_insn_pac(frame->pc);
>>> goto check_pc;
>>> }
>>> #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
>>
>>> Is that acceptable?
>>
>> I think that works even if it's hard to love the goto, might want some
>> defensiveness to ensure we can't somehow end up in an infinite loop with
>> a sufficiently badly formed stack.
>>
>
> I could do something like this:
>
> - Move all frame->pc checking code into a function called check_frame_pc().
>
> bool check_frame_pc(frame)
> {
> Do all the checks including function graph
> return frame->pc changed
> }
>
> - Then, in unwind_frame()
>
> unwind_frame()
> {
> int i;
> ...
>
> for (i = 0; i < MAX_CHECKS; i++) {
> if (!check_frame(tsk, frame))
Small typo in the last statement - It should be check_frame_pc().
Sorry.
Madhavan
> break;
> }
>
> if (i == MAX_CHECKS)
> frame->reliable = false;
> return 0;
> }
>
> The above would take care of future cases like kretprobe_trampoline().
>
> If this is acceptable, then the only question is - what should be the value of
> MAX_CHECKS (I will rename it to something more appropriate)?
>
> Madhavan
>
OK. I will make all the changes you suggested.
Thanks!
Madhavan
On 5/5/21 11:34 AM, Mark Brown wrote:
> On Tue, May 04, 2021 at 02:03:14PM -0500, Madhavan T. Venkataraman wrote:
>> On 5/4/21 11:05 AM, Mark Brown wrote:
>
>>>> @@ -118,9 +160,21 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>>>> return -EINVAL;
>>>> frame->pc = ret_stack->ret;
>>>> frame->pc = ptrauth_strip_insn_pac(frame->pc);
>>>> + return 0;
>>>> }
>
>>> Do we not need to look up the range of the restored pc and validate
>>> what's being pointed to here? It's not immediately obvious why we do
>>> the lookup before handling the function graph tracer, especially given
>>> that we never look at the result and there's now a return added skipping
>>> further reliability checks. At the very least I think this needs some
>>> additional comments so the code is more obvious.
>
>> I want sym_code_ranges[] to contain both unwindable and non-unwindable ranges.
>> Unwindable ranges will be special ranges such as the return_to_handler() and
>> kretprobe_trampoline() functions for which the unwinder has (or will have)
>> special code to unwind. So, the lookup_range() has to happen before the
>> function graph code. Please look at the last patch in the series for
>> the fix for the above function graph code.
>
> That sounds reasonable but like I say should probably be called out in
> the code so it's clear to people working with it.
>
OK. To make this better, I will do the lookup_range() after the function
graph code to begin with. Then, in the last patch for the function graph
code, I will move it up. This way, the code is clear and your comment
is addressed.
>> On the question of "should the original return address be checked against
>> sym_code_ranges[]?" - I assumed that if there is a function graph trace on a
>> function, it had to be an ftraceable function. It would not be a part
>> of sym_code_ranges[]. Is that a wrong assumption on my part?
>
> I can't think of any cases where it wouldn't be right now, but it seems
> easier to just do a redundant check than to have the assumption in the
> code and have to think about if it's missing.
>
Agreed. Will do the check.
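Something along these lines inside the function graph block (a sketch; the
actual change will be in v4):

	frame->pc = ret_stack->ret;
	frame->pc = ptrauth_strip_insn_pac(frame->pc);

	/* Re-check the restored return address against sym_code_ranges[]. */
	range = lookup_range(frame->pc);
	if (range->start && frame->fp)
		frame->reliable = false;

	return 0;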
Madhavan
On Mon, 3 May 2021 at 19:38, <[email protected]> wrote:
>
> From: "Madhavan T. Venkataraman" <[email protected]>
>
> Create a sym_code_ranges[] array to cover the following text sections that
> contain functions defined as SYM_CODE_*(). These functions are low-level
> functions (and do not have a proper frame pointer prolog and epilog). So,
> they are inherently unreliable from a stack unwinding perspective.
>
> .entry.text
> .idmap.text
> .hyp.idmap.text
> .hyp.text
> .hibernate_exit.text
> .entry.tramp.text
>
> If a return PC falls in any of these, mark the stack trace unreliable.
>
> The only exception to this is when the unwinder has already reached the
> last frame; in that case, it does not mark the stack trace unreliable,
> since there is no more unwinding to do. E.g.,
>
> - ret_from_fork() occurs at the end of the stack trace of
> kernel tasks.
>
> - el0_*() functions occur at the end of EL0 exception stack
> traces. This covers all user task entries into the kernel.
>
> NOTE:
> - EL1 exception handlers are in .entry.text. So, stack traces that
> contain those functions will be marked not reliable. This covers
> interrupts, exceptions and breakpoints encountered while executing
> in the kernel.
>
> - At the end of an interrupt, the kernel can preempt the current
> task if required. So, the stack traces of all preempted tasks will
> show the interrupt frame and will be considered unreliable.
>
> Signed-off-by: Madhavan T. Venkataraman <[email protected]>
> ---
> arch/arm64/kernel/stacktrace.c | 54 ++++++++++++++++++++++++++++++++++
> 1 file changed, 54 insertions(+)
>
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index c21a1bca28f3..1ff14615a55a 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -15,9 +15,48 @@
>
> #include <asm/irq.h>
> #include <asm/pointer_auth.h>
> +#include <asm/sections.h>
> #include <asm/stack_pointer.h>
> #include <asm/stacktrace.h>
>
> +struct code_range {
> + unsigned long start;
> + unsigned long end;
> +};
> +
> +struct code_range sym_code_ranges[] =
This should be static and const
> +{
> + /* non-unwindable ranges */
> + { (unsigned long)__entry_text_start,
> + (unsigned long)__entry_text_end },
> + { (unsigned long)__idmap_text_start,
> + (unsigned long)__idmap_text_end },
> + { (unsigned long)__hyp_idmap_text_start,
> + (unsigned long)__hyp_idmap_text_end },
> + { (unsigned long)__hyp_text_start,
> + (unsigned long)__hyp_text_end },
> +#ifdef CONFIG_HIBERNATION
> + { (unsigned long)__hibernate_exit_text_start,
> + (unsigned long)__hibernate_exit_text_end },
> +#endif
> +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> + { (unsigned long)__entry_tramp_text_start,
> + (unsigned long)__entry_tramp_text_end },
> +#endif
> + { /* sentinel */ }
> +};
> +
> +static struct code_range *lookup_range(unsigned long pc)
const struct code_range *
> +{
> + struct code_range *range;
const struct code_range *
> +
> + for (range = sym_code_ranges; range->start; range++) {
> + if (pc >= range->start && pc < range->end)
> + return range;
> + }
> + return range;
> +}
> +
> /*
> * AArch64 PCS assigns the frame pointer to x29.
> *
> @@ -43,6 +82,7 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> {
> unsigned long fp = frame->fp;
> struct stack_info info;
> + struct code_range *range;
const struct code_range *
>
> frame->reliable = true;
>
> @@ -103,6 +143,8 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> return 0;
> }
>
> + range = lookup_range(frame->pc);
> +
> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> if (tsk->ret_stack &&
> frame->pc == (unsigned long)return_to_handler) {
> @@ -118,9 +160,21 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> return -EINVAL;
> frame->pc = ret_stack->ret;
> frame->pc = ptrauth_strip_insn_pac(frame->pc);
> + return 0;
> }
> #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
>
> + if (!range->start)
> + return 0;
> +
> + /*
> + * The return PC falls in an unreliable function. If the final frame
> + * has been reached, no more unwinding is needed. Otherwise, mark the
> + * stack trace not reliable.
> + */
> + if (frame->fp)
> + frame->reliable = false;
> +
> return 0;
> }
> NOKPROBE_SYMBOL(unwind_frame);
> --
> 2.25.1
>
On Wed, May 05, 2021 at 01:48:21PM -0500, Madhavan T. Venkataraman wrote:
> On 5/5/21 11:46 AM, Mark Brown wrote:
> > I think that works even if it's hard to love the goto, might want some
> > defensiveness to ensure we can't somehow end up in an infinite loop with
> > a sufficiently badly formed stack.
> I could do something like this:
> unwind_frame()
> {
> int i;
> ...
>
> for (i = 0; i < MAX_CHECKS; i++) {
> if (!check_frame(tsk, frame))
> break;
> }
I think that could work, yes. Have to see the actual code (and other
people's opinions!).
> If this is acceptable, then the only question is - what should be the value of
> MAX_CHECKS (I will rename it to something more appropriate)?
I'd expect something like 10 to be way more than we'd ever need, or we
could define it down to the 2 checks we expect to be possible ATM to be
conservative. I'm tempted to be permissive if we have sufficient other
checks but I'm not 100% sure on that.
On Mon, May 03, 2021 at 12:36:14PM -0500, [email protected] wrote:
> There are some SYM_CODE functions that are currently in ".text" or
> ".init.text" sections. Some of these are functions that the unwinder
> does not care about as they are not "interesting" to livepatch. These
> will remain in their current sections. The rest I have moved into a
> new section called ".code.text".
I was thinking it'd be good to do this by modifying SYM_CODE_START() to
emit the section, that way nobody can forget to put any SYM_CODE into a
special section. That does mean we'd have to first introduce a new
variant for specifying a section that lets us override things that need
to be in some specific section and convert everything that's in a
special section over to that first which is a bit annoying but feels
like it's worth it for the robustness. It'd also put some of the don't
cares into .code.text but so long as they are actually don't cares that
should be fine!
> Don't care functions
> ====================
We also have a bunch of things like __cpu_soft_restart which don't seem
to be called out here but need to be in .idmap.text.
On Mon, May 03, 2021 at 12:36:15PM -0500, [email protected] wrote:
> From: "Madhavan T. Venkataraman" <[email protected]>
>
> The Function Graph Tracer modifies the return address of a traced function
> to a return trampoline (return_to_handler()) to gather tracing data on
> function return. When the unwinder encounters return_to_handler(), it calls
> ftrace_graph_get_ret_stack() to lookup the original return address in the
> return address stack.
This makes sense to me, I'll need to re-review properly with the changes
earlier on in the series but should be fine.
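For reference, the relevant mainline handling in unwind_frame() is roughly
this (trimmed):

#ifdef CONFIG_FUNCTION_GRAPH_TRACER
	if (tsk->ret_stack &&
	    frame->pc == (unsigned long)return_to_handler) {
		struct ftrace_ret_stack *ret_stack;

		/*
		 * The return address was replaced with the trampoline;
		 * fetch the original one from the return address stack.
		 */
		ret_stack = ftrace_graph_get_ret_stack(tsk, frame->graph++);
		if (WARN_ON_ONCE(!ret_stack))
			return -EINVAL;
		frame->pc = ret_stack->ret;
	}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */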
On 5/6/21 9:12 AM, Mark Brown wrote:
> On Mon, May 03, 2021 at 12:36:14PM -0500, [email protected] wrote:
>
>> There are some SYM_CODE functions that are currently in ".text" or
>> ".init.text" sections. Some of these are functions that the unwinder
>> does not care about as they are not "interesting" to livepatch. These
>> will remain in their current sections. The rest I have moved into a
>> new section called ".code.text".
>
> I was thinking it'd be good to do this by modifying SYM_CODE_START() to
> emit the section, that way nobody can forget to put any SYM_CODE into a
> special section. That does mean we'd have to first introduce a new
> variant for specifying a section that lets us override things that need
> to be in some specific section and convert everything that's in a
> special section over to that first which is a bit annoying but feels
> like it's worth it for the robustness. It'd also put some of the don't
> cares into .code.text but so long as they are actually don't cares that
> should be fine!
>
OK. I could make the section an argument to SYM_CODE*() so that a developer
will never miss that. Some documentation may be in order so the guidelines
are clear. I will do the doc patch separately, if that is alright with
you all.
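As a first cut, I am thinking of something like this (untested sketch;
SYM_CODE_START_SECTION is a placeholder name, modeled on the existing
SYM_START()/SYM_END() helpers in linkage.h):

#define SYM_CODE_START_SECTION(name, sect)		\
	.pushsection sect, "ax";			\
	SYM_START(name, SYM_L_GLOBAL, SYM_A_ALIGN)

#define SYM_CODE_END_SECTION(name)			\
	SYM_END(name, SYM_T_NONE);			\
	.popsection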
About the don't care functions - most of them are OK to be moved into
.code.text. But the hypervisor vector related code has a problem that I
have not debugged yet. If I place that code in .code.text, KVM
initialization fails in one case. In another case, when I actually
test with a VM, the VM does not come up.
I am not sure, but it looks like there may be some relocation issue going
on. I don't have a handle on it yet. So, for now, I will leave the
hypervisor stuff in .text, but I will mark it as TBD in the cover letter
so we don't lose track of it.
>> Don't care functions
>> ====================
>
> We also have a bunch of things like __cpu_soft_restart which don't seem
> to be called out here but need to be in .idmap.text.
>
It is already in .idmap.text.
/* SPDX-License-Identifier: GPL-2.0-only */
/*
 * CPU reset routines
 *
 * Copyright (C) 2001 Deep Blue Solutions Ltd.
 * Copyright (C) 2012 ARM Ltd.
 * Copyright (C) 2015 Huawei Futurewei Technologies.
 */

#include <linux/linkage.h>
#include <asm/assembler.h>
#include <asm/sysreg.h>
#include <asm/virt.h>

.text
.pushsection	.idmap.text, "awx"

/*
 * __cpu_soft_restart(el2_switch, entry, arg0, arg1, arg2) - Helper for
 * cpu_soft_restart.
 *
 * @el2_switch: Flag to indicate a switch to EL2 is needed.
 * @entry: Location to jump to for soft reset.
 * arg0: First argument passed to @entry. (relocation list)
 * arg1: Second argument passed to @entry. (physical kernel entry)
 * arg2: Third argument passed to @entry. (physical dtb address)
 *
 * Put the CPU into the same state as it would be if it had been reset, and
 * branch to what would be the reset vector. It must be executed with the
 * flat identity mapping.
 */
SYM_CODE_START(__cpu_soft_restart)
	mov_q	x12, INIT_SCTLR_EL1_MMU_OFF
	pre_disable_mmu_workaround
	/*
	 * either disable EL1&0 translation regime or disable EL2&0 translation
	 * regime if HCR_EL2.E2H == 1
	 */
	msr	sctlr_el1, x12
	isb

	cbz	x0, 1f				// el2_switch?
	mov	x0, #HVC_SOFT_RESTART
	hvc	#0				// no return

1:	mov	x8, x1				// entry
	mov	x0, x2				// arg0
	mov	x1, x3				// arg1
	mov	x2, x4				// arg2
	br	x8
SYM_CODE_END(__cpu_soft_restart)

.popsection
I will double check all the *.S files and make sure that every function is accounted
for in version 4.
Stay tuned.
Thanks.
Madhavan
On 5/6/21 10:30 AM, Madhavan T. Venkataraman wrote:
>> I was thinking it'd be good to do this by modifying SYM_CODE_START() to
>> emit the section, that way nobody can forget to put any SYM_CODE into a
>> special section. That does mean we'd have to first introduce a new
>> variant for specifying a section that lets us override things that need
>> to be in some specific section and convert everything that's in a
>> special section over to that first which is a bit annoying but feels
>> like it's worth it for the robustness. It'd also put some of the don't
>> cares into .code.text but so long as they are actually don't cares that
>> should be fine!
>>
> OK. I could make the section an argument to SYM_CODE*() so that a developer
> will never miss that. Some documentation may be in order so the guidelines
> are clear. I will do the doc patch separately, if that is alright with
> you all.
There is just one problem with this. Sometimes, there is some data in the
same text section as the code. That data will not end up in the new section
when we make the SYM_CODE(section) change.
Madhavan
On 5/6/21 10:44 AM, Mark Brown wrote:
> On Thu, May 06, 2021 at 10:32:30AM -0500, Madhavan T. Venkataraman wrote:
>> On 5/6/21 10:30 AM, Madhavan T. Venkataraman wrote:
>
>>> OK. I could make the section an argument to SYM_CODE*() so that a developer
>>> will never miss that. Some documentation may be in order so the guidelines
>>> are clear. I will do the doc patch separately, if that is alright with
>>> you all.
>
>> There is just one problem with this. Sometimes, there is some data in the
>> same text section as the code. That data will not end up in the new section
>> when we make the SYM_CODE(section) change.
>
> Yes, data would need to be handled separately still. That doesn't seem
> insurmountable though?
>
I will think of something.
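One possibility (just a thought, untested; example_func and example_data
are made-up names) is an explicit section push that keeps the data next to
its code:

	.pushsection ".code.text", "ax"
SYM_CODE_START(example_func)
	adr	x0, example_data	// code that references nearby data
	ldr	x0, [x0]
	ret
SYM_CODE_END(example_func)
example_data:				// data kept alongside the code
	.quad	0
	.popsection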
Madhavan
On 5/6/21 10:37 AM, Mark Brown wrote:
> On Thu, May 06, 2021 at 10:30:21AM -0500, Madhavan T. Venkataraman wrote:
>> On 5/6/21 9:12 AM, Mark Brown wrote:
>>> On Mon, May 03, 2021 at 12:36:14PM -0500, [email protected] wrote:
>
>>> I was thinking it'd be good to do this by modifying SYM_CODE_START() to
>>> emit the section, that way nobody can forget to put any SYM_CODE into a
>>> special section. That does mean we'd have to first introduce a new
>
>> OK. I could make the section an argument to SYM_CODE*() so that a developer
>> will never miss that. Some documentation may be in order so the guidelines
>> are clear. I will do the doc patch separately, if that is alright with
>> you all.
>
> I was thinking to have standard SYM_CODE default to a section then a
> variant for anything that cares (like how we have SYM_FUNC_PI and
> friends for the PI code for EFI).
>
OK.
>>> We also have a bunch of things like __cpu_soft_restart which don't seem
>>> to be called out here but need to be in .idmap.text.
>
>> It is already in .idmap.text.
>
> Right, I meant that I was expecting to see things that need to be in a
> specific section other than .code.text called out separately here if
> we're enumerating them. Though if the annotations are done separately
> then this patch wouldn't need to do that calling out at all, it'd be
> covered as part of fiddling around with the annotations.
>
OK.
Madhavan
On 5/6/21 9:43 AM, Mark Brown wrote:
> On Mon, May 03, 2021 at 12:36:15PM -0500, [email protected] wrote:
>> From: "Madhavan T. Venkataraman" <[email protected]>
>>
>> The Function Graph Tracer modifies the return address of a traced function
>> to a return trampoline (return_to_handler()) to gather tracing data on
>> function return. When the unwinder encounters return_to_handler(), it calls
>> ftrace_graph_get_ret_stack() to lookup the original return address in the
>> return address stack.
>
> This makes sense to me, I'll need to re-review properly with the changes
> earlier on in the series but should be fine.
>
I will make changes based on the comments I have received so far and send
out version 4 so everything is current for the next round of review.
Thanks!
Madhavan
On 5/6/21 8:45 AM, Mark Brown wrote:
> On Wed, May 05, 2021 at 01:48:21PM -0500, Madhavan T. Venkataraman wrote:
>> On 5/5/21 11:46 AM, Mark Brown wrote:
>
>>> I think that works even if it's hard to love the goto, might want some
>>> defensiveness to ensure we can't somehow end up in an infinite loop with
>>> a sufficiently badly formed stack.
>
>> I could do something like this:
>
>> unwind_frame()
>> {
>> int i;
>> ...
>>
>> for (i = 0; i < MAX_CHECKS; i++) {
>> if (!check_frame(tsk, frame))
>> break;
>> }
>
> I think that could work, yes. Have to see the actual code (and other
> people's opinions!).
>
>> If this is acceptable, then the only question is - what should be the value of
>> MAX_CHECKS (I will rename it to something more appropriate)?
>
> I'd expect something like 10 to be way more than we'd ever need, or we
> could define it down to the 2 checks we expect to be possible ATM to be
> conservative. I'm tempted to be permissive if we have sufficient other
> checks but I'm not 100% sure on that.
>
OK. I will implement these changes for version 4 and send it out so this
whole thing can be reviewed again with the actual changes in front of us.
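For concreteness, the bounded loop would look something like this (a
sketch; MAX_CHECKS and check_frame() are the placeholder names from the
earlier mail):

	int i;

	for (i = 0; i < MAX_CHECKS; i++) {
		if (!check_frame(tsk, frame))
			break;
	}

	/* Defensive: give up on a sufficiently malformed stack. */
	if (i == MAX_CHECKS)
		frame->reliable = false;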
Madhavan