From: "Madhavan T. Venkataraman" <[email protected]>
Introduction
============
The livepatch feature requires an unwinder that can provide a reliable stack
trace. General requirements for a reliable unwinder are described in this
document from Mark Rutland:
Documentation/livepatch/reliable-stacktrace.rst
The requirements have two parts:
1. The unwinder must be enhanced with certain features. E.g.,
- Identifying successful termination of stack trace
- Identifying unwindable and non-unwindable code
- Identifying interrupts and exceptions occurring in the frame pointer
prolog and epilog
- Identifying features such as kretprobe and ftrace graph tracing
that can modify the return address stored on the stack
- Identifying corrupted/unreliable stack contents
- Architecture-specific items that can render a stack trace unreliable
at certain points in code
2. Validation of the frame pointer
This assumes that the unwinder is based on the frame pointer (FP).
The actual frame pointer that the unwinder uses cannot just be
assumed to be correct. It needs to be validated somehow.
This patch series addresses the following:
- Identifying unwindable and non-unwindable code
- Identifying interrupts and exceptions occurring in the frame pointer
prolog and epilog
- Validation of the frame pointer
The rest are already in place AFAICT.
Validation of the FP (aka FRAME_POINTER_VALIDATION)
===================================================
The current approach in Linux is to use objtool, a build time tool, for this
purpose. When configured, objtool is invoked on every relocatable object file
during kernel build. It performs static analysis of the code in each file. It
walks the instructions in every function and notes the changes to the stack
pointer (SP) and the frame pointer (FP). It makes sure that the changes are in
accordance with the ABI rules. Objtool also performs many other checks. Once
objtool completes successfully, the kernel can then be used for livepatch
purposes.
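As a rough illustration of the bookkeeping described above, the following
sketch tracks SP and FP relative to the CFA (the SP value at function entry)
across a straight-line function. The op kinds, the struct names and walk()
are hypothetical and are not objtool's actual types. A real walker derives
the ops from decoded instructions and follows every branch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stack operations (not objtool's real types). */
enum stack_op {
	OP_NONE,	/* instruction does not touch SP or FP */
	OP_SP_ADD,	/* SP += imm (imm < 0 allocates stack) */
	OP_FP_SET,	/* FP = SP, e.g. "mov x29, sp" */
	OP_FP_RESTORE,	/* caller's FP reloaded from the stack, SP += imm */
};

struct insn {
	enum stack_op op;
	int imm;
};

/* State that holds *before* an instruction executes. */
struct cfi {
	int sp_offset;	/* CFA - SP */
	int fp_offset;	/* CFA - FP, meaningful only if fp_valid */
	bool fp_valid;	/* has this function set up its frame pointer? */
};

/* Record the state in effect at each instruction of a straight-line body. */
void walk(const struct insn *insns, size_t n, struct cfi *out)
{
	struct cfi cur = { 0, 0, false };
	size_t i;

	for (i = 0; i < n; i++) {
		out[i] = cur;

		switch (insns[i].op) {
		case OP_SP_ADD:
			cur.sp_offset -= insns[i].imm;
			break;
		case OP_FP_SET:
			cur.fp_offset = cur.sp_offset;
			cur.fp_valid = true;
			break;
		case OP_FP_RESTORE:
			cur.fp_valid = false;
			cur.sp_offset -= insns[i].imm;
			break;
		case OP_NONE:
			break;
		}
	}
}
```

For a typical prolog and epilog pair, the recorded state shows no frame at
the first two instructions and at the final return, which is where an
interrupt would make the frame unreliable.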
Objtool can have uses other than just FP validation. For instance, it can check
control flow integrity during its analysis.
Problem
=======
Objtool is complex and highly architecture-dependent. There are a lot of
different checks in objtool that all of the code in the kernel must pass
before livepatch can be enabled. If a check fails, it must be corrected
before we can proceed. Sometimes, the kernel code needs to be fixed.
Sometimes, it is a compiler bug that needs to be fixed. The challenge is
also to prove that all the work is complete for an architecture.
As such, it presents a great challenge to enable livepatch for an
architecture.
A different approach
====================
I would like to propose a different approach for FP validation. I would
like to be able to enable livepatch for an architecture as is. That is,
without "fixing" the kernel or the compiler for it.
There are three steps in this:
1. Objtool walks all the functions as usual. It computes the stack and
frame pointer offsets at each instruction as usual. It generates ORC
records and stores them in special sections as usual. This is simple
enough to do.
2. Objtool performs validation of the offsets (see below) and checks
if the frame is properly set up according to ABI rules. But the set
of checks performed is much simpler than the existing Objtool
checks for X86.
3. The unwinder in the kernel retrieves the ORC record for each PC in a
stack trace. If there is no ORC record or if the ORC data indicates that
a frame pointer has not been set up, the unwinder considers the stack
frame unreliable. Otherwise, the unwinder computes a frame pointer from
the ORC data. It compares the computed frame pointer with the actual
frame pointer. If there is a match, the frame is reliable. If not, it
isn't. A stack trace is reliable if every single frame in it is reliable.
To summarize, the frame pointer validation is done dynamically instead
of statically.
Using this scheme, the unwinder can always know what kernel code is reliable
for unwind and what is not. This is the requirement for livepatch.
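The per-frame check in step 3 can be sketched roughly as follows. This is
only an illustration under stated assumptions: struct orc_rec is a
simplified, hypothetical record (the real ORC entry layout is different),
the SP value for each frame is assumed to be known to the unwinder, and a
zero offset is taken to mean "not set up".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified ORC record; the real ORC layout differs. */
struct orc_rec {
	int16_t sp_offset;	/* CFA - SP at this PC; 0 = unknown */
	int16_t fp_offset;	/* CFA - FP at this PC; 0 = no frame set up */
};

/* One frame as seen by the unwinder (assumed fields). */
struct frame {
	const struct orc_rec *orc;	/* NULL if no record for this PC */
	uint64_t sp;			/* stack pointer for this frame */
	uint64_t fp;			/* actual frame pointer */
};

/* A frame is reliable only if the computed FP matches the actual FP. */
bool frame_reliable(const struct frame *f)
{
	uint64_t cfa, computed_fp;

	if (!f->orc)
		return false;		/* no ORC record for this PC */
	if (f->orc->sp_offset == 0 || f->orc->fp_offset == 0)
		return false;		/* frame pointer not set up */

	cfa = f->sp + (uint64_t)(int64_t)f->orc->sp_offset;
	computed_fp = cfa - (uint64_t)(int64_t)f->orc->fp_offset;

	return computed_fp == f->fp;
}

/* A stack trace is reliable only if every frame in it is reliable. */
bool trace_reliable(const struct frame *frames, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!frame_reliable(&frames[i]))
			return false;
	}
	return true;
}
```

When any frame fails the check, the whole trace is reported as unreliable
and the caller (livepatch, in this case) is expected to retry later.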
Instruction decoder
===================
To do this, an instruction decoder needs to be implemented. I have implemented
a simple, table-driven decoder for ARM64. Only a subset of the instructions
needs to be fully decoded for this purpose:
- Load-Store instructions
- Add/Sub/Mov instructions
- Branch instructions
- Call instructions
- Return instructions
- Pointer authentication instruction (paciasp)
The rest of the instructions are either don't-care from an unwind perspective
or unexpected from the compiler. I have added checks for the unexpected ones
to catch them if the compiler ever generates them.
This decoder is simpler than a full-fledged one. But if a full-fledged one
is ever implemented, my decoder can be subsumed by it.
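To give a feel for what "table-driven" means here, below is a minimal sketch
of a mask/value decode table covering a handful of A64 instructions that
matter for frame setup. The mask/value pairs follow the A64 encodings for
these instruction classes, but the table layout, the enum and the function
names are hypothetical and are not the decoder from this series.

```c
#include <stddef.h>
#include <stdint.h>

enum insn_type {
	INSN_OTHER,
	INSN_PACIASP,
	INSN_STP_PRE,	/* stp Xt, Xt2, [Xn, #imm]! (64-bit, pre-index) */
	INSN_LDP_POST,	/* ldp Xt, Xt2, [Xn], #imm  (64-bit, post-index) */
	INSN_ADD_IMM,	/* add Xd, Xn, #imm (includes "mov x29, sp") */
	INSN_BL,
	INSN_RET,
};

struct decode_entry {
	uint32_t mask;
	uint32_t value;
	enum insn_type type;
};

/* An instruction matches an entry if (insn & mask) == value. */
static const struct decode_entry decode_table[] = {
	{ 0xffffffff, 0xd503233f, INSN_PACIASP },
	{ 0xffc00000, 0xa9800000, INSN_STP_PRE },
	{ 0xffc00000, 0xa8c00000, INSN_LDP_POST },
	{ 0xff800000, 0x91000000, INSN_ADD_IMM },
	{ 0xfc000000, 0x94000000, INSN_BL },
	{ 0xfffffc1f, 0xd65f0000, INSN_RET },
};

enum insn_type decode(uint32_t insn)
{
	size_t i;

	for (i = 0; i < sizeof(decode_table) / sizeof(decode_table[0]); i++) {
		if ((insn & decode_table[i].mask) == decode_table[i].value)
			return decode_table[i].type;
	}
	return INSN_OTHER;
}

/* Signed imm7 field of a 64-bit load/store pair, scaled by 8 bytes. */
int pair_imm(uint32_t insn)
{
	int imm7 = (insn >> 15) & 0x7f;

	if (imm7 & 0x40)
		imm7 -= 0x80;	/* sign-extend from 7 bits */
	return imm7 * 8;
}
```

With this table, 0xa9bf7bfd ("stp x29, x30, [sp, #-16]!") decodes as
INSN_STP_PRE with an immediate of -16, and 0x910003fd ("mov x29, sp")
decodes as INSN_ADD_IMM. Adding an instruction class is a one-line change
to the table, which is the main appeal of the approach.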
Code reorganization and reuse
=============================
Stack validation scheme
-----------------------
Currently, the stack validation scheme supported in Objtool is static stack
validation. Static stack validation is performed in check.c. There is a lot
of code in check.c that should be shared with other validation schemes such
as the dynamic FP validation scheme that I am proposing. Accordingly, I have
moved that code into separate files:
- Code that walks instructions and decodes them
- Code that manages instructions
- CFI related code
- Code that handles unwind hints
So, all of this is shared across all architectures and validation schemes.
Architecture-dependent code
---------------------------
Currently, the ORC definitions and code are X86-specific. I have separated
out the architecture-specific stuff from the generic stuff and placed
them in appropriate files so other architectures can share.
So, these are the architecture-specific parts that need to be supplied for
a new architecture for my proposal. Everything else is shared.
- Instruction decoder as mentioned above
- ORC register definitions
- ORC support functions
- Unwind hint support
- Invoke ORC init from kernel initialization code
- Invoke ORC init from module initialization code
- Add ORC_UNWIND_TABLE to kernel data in vmlinux.lds.S
- Modify the unwinder to use ORC data to validate the frame pointer
- Add kernel config definitions for reliable stack trace and livepatch
Other than the decoder, all of this is very simple to do. Just follow the
example in ARM64.
For ARM64, the decoder turned out to be fairly simple. I cannot speak to
other architectures.
sorttable
---------
At build time, the ORC tables in special sections are sorted so that the
kernel does not have to spend time sorting them. The tables need to be
sorted for binary search. The sorttable program works without any change
in my proposal as well.
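The reason the tables must be sorted is that the lookup is a binary search
for the last entry that starts at or before the PC. A minimal sketch,
assuming a plain array of absolute start addresses (the name lookup_sorted()
and this table layout are hypothetical, chosen only for illustration):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the index of the last entry with ip[idx] <= pc, or -1 if pc
 * lies before the first entry. The ip[] array must be sorted ascending.
 */
long lookup_sorted(const uint64_t *ip, size_t n, uint64_t pc)
{
	size_t lo = 0, hi = n;	/* search the half-open range [lo, hi) */

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ip[mid] <= pc)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* lo is now the number of entries with ip <= pc. */
	return (long)lo - 1;
}
```

Each lookup costs O(log n) comparisons, which matters because the unwinder
performs one lookup per frame.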
FP prolog, epilog, leaf functions, generated code, etc
======================================================
If the unwinder is not able to find an ORC record for a given instruction
address, it considers the code to be unreliable from an unwind perspective.
This enables the unwinder to deal with:
- Generated code that will not have any ORC records.
If the unwinder finds an ORC record, but the record indicates that a frame
pointer has not been properly set up at that instruction, then the unwinder
considers that instruction unreliable from an unwind perspective. This enables
the unwinder to deal with:
- Low level assembly code (SYM_CODE) that is not walked by Objtool.
See below.
- Interrupts/exceptions in frame pointer prologs and epilogs.
- Interrupts/exceptions in leaf functions that don't have a frame
setup.
- Compiler not setting up the frame pointer properly before calling
a function. E.g., if inline assembly code occurs at the beginning
of a function and it contains a call.
If the unwinder finds an ORC record and the record indicates that a frame
pointer has been properly set up, then it computes a frame pointer from the
ORC data and compares it with the actual frame pointer. If the computed frame
pointer does not match the actual one, it considers the code to be unreliable
from an unwind perspective. This enables the unwinder to detect:
- Cases where runtime patching of the kernel resulted in a change in
the ORC for an instruction
- A corrupted frame pointer
Assembly functions
==================
Objtool does not walk SYM_CODE functions as they are low-level functions
that don't follow ABI rules or functions that manipulate register state
in such a way that unwind is unreliable. For these the ORC records will
show that the frame offset is 0. So, the unwinder will be able to tell that
they are unreliable for unwind.
As for SYM_FUNC functions, Objtool will walk them and compute ORC. However,
currently, most of the SYM_FUNC functions in ARM64 do not set up a frame.
So, these will look unreliable to the unwinder. While this will not impact
the ability to do livepatch, I plan to submit a separate patch series to add
a frame pointer prolog and epilog to many of these functions. This is to
reduce the number of retries during the livepatch process.
Unwind hints
============
Now, there are certain points in assembly code that we would like to unwind
through reliably, such as interrupt and exception handlers. This is mainly for
getting reliable stack traces in these cases and reducing the number of
retries during the livepatch process. For these, unwind hints can be placed
at strategic points in assembly code. Only a small number of these hints
should be needed.
In this work, I have defined unwind hints for the following so that stack
traces passing through them can be unwound reliably:
- Exception handlers
- Interrupt handlers
Unwind hints are collected in a special section. Objtool converts unwind hints
to ORC data. The unwinder processes unwind hints to handle special cases
mentioned above.
Now, unwind hints are generally a problem to maintain. So, I have only
defined them for the above cases.
Size of the memory consumed within the kernel for this feature
==============================================================
This depends on the amount of code in the kernel which, in turn, depends on
the number of configs turned on. E.g., on the kernel on my arm64 system, the
ORC data size for vmlinux is about 2MB.
GitHub repository
=================
My GitHub repo for this version is here:
https://github.com/madvenka786/linux/tree/orc_v3
Please feel free to clone and check it out. And, please let me know if you
find any issues.
Testing
=======
- I have run all of the livepatch selftests successfully. I have written a
couple of extra selftests myself which I will be posting separately.
- I have a test driver to induce a NULL pointer exception to make sure
that unwinding through exception handlers is reliable.
- I use the test driver to create a timer to make sure that unwinding through
the timer IRQ is reliable.
- I call the unwinder from different places during boot to make sure that
the unwinding in each of those cases is reliable.
TBD
===
- I need to perform more rigorous testing with different scenarios. This
is work in progress. Any ideas or suggestions are welcome.
- I plan to add a return address check in the unwinder. The unwinder will
decode the instruction at the call site for each frame and make sure that
it is a valid call instruction. This is just a paranoid check to catch it
if Objtool generates an incorrect ORC entry or if the FP is corrupted.
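The planned return address check could look something like the sketch
below. It is only an assumption of how such a check might be written, not
code from this series: the function names are hypothetical, the text is
modelled as an array of instruction words, and only BL and BLR are
recognized (pointer-authenticated call variants are left out).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True for "bl <label>" and "blr Xn"; other call forms are not handled. */
bool is_call_insn(uint32_t insn)
{
	if ((insn & 0xfc000000) == 0x94000000)	/* BL <label> */
		return true;
	if ((insn & 0xfffffc1f) == 0xd63f0000)	/* BLR Xn */
		return true;
	return false;
}

/*
 * Check that the instruction just before ret_addr is a call.
 * text[] holds n instruction words; text_base is the address of text[0].
 */
bool return_address_ok(const uint32_t *text, size_t n,
		       uint64_t text_base, uint64_t ret_addr)
{
	uint64_t idx;

	/* Must be 4-byte aligned and have an instruction before it. */
	if ((ret_addr & 3) || ret_addr < text_base + 4)
		return false;

	idx = (ret_addr - 4 - text_base) / 4;
	if (idx >= n)
		return false;

	return is_call_insn(text[idx]);
}
```

A return address that fails this check would make the frame, and hence the
whole trace, unreliable.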
---
Changelog:
v3:
From Mark Brown <[email protected]>
====================================
Objtool no longer uses sub commands.
I have addressed this. Objtool calls check() to perform
all the validation. I have defined a check() function for
the Dynamic FP Validation feature. Based on the config,
the correct files will be included so that there is only
one check() defined within Objtool.
From Chen Zhongjin <[email protected]>
============================================
No need to rewrite the decoder. Merge the decoder with my patch.
My decoder is table based. So, it is very compact. Also,
in that table, I have included checks for the instructions
that alter the stack or frame pointers that the compiler
should not be using. This catches all the cases where the
compiler generates unexpected code. So, for now, I am
keeping my decoder.
But your point is a good one. Maybe, in the future, our
decoders can be merged. Currently, both patchsets are
relatively new. It is not as if either has received
enough review.
From Peter Zijlstra <[email protected]>
==========================================================
Why can't you use the validate_branch() defined currently?
Why do you want to define your own?
Originally, I responded to this comment by saying that
I will use validate_branch(). I studied it. It will probably
take a long time for all the pieces to be in place for the
current validate_branch() to work for ARM64. So, I have taken
a different approach to try to shorten the time to market.
In my approach, I generate the ORC data based on the actual
code by walking all the instructions and following all the
code paths statically.
Now, a hidden code path may not be followed during my
static analysis. E.g.,
- a retpoline is obscuring an actual branch
- runtime patching can potentially change a code path
In most of the cases, even if these things occur, the ORC data
will not change. In the unlikely event that ORC data can be
different for an instruction because of the above, the unwinder
will handle that. There are two cases:
1) The SP offset or the FP offset is zero in the generated ORC.
2) The SP offset and the FP offset are both non-zero in the
generated ORC.
In case (1), the unwinder will return an error right away
and the stack trace will be considered unreliable. So,
livepatch has to retry.
In case (2), the unwinder will detect the problem because the
computed frame pointer will not match the actual one. So,
livepatch has to retry.
As long as the number of such retries is small, livepatch can
easily be supported with this approach.
AFAICT, the only way this will not work is if the actual frame
pointer gets corrupted because of buggy kernel code and just
happens to match the computed frame pointer. According to an
earlier comment from Josh (IIRC), corruption of the frame
pointer in the kernel is a super rare event. Even if it does
occur, the frame pointer getting corrupted in a fashion that
it exactly matches the computed frame pointer should be even
rarer.
Even if that were to happen, the unwinder checks must pass for
every frame in the stack trace. This will not happen if the
frame pointer is corrupted along the way.
Also, if the kernel has buggy code that can corrupt the frame
pointer, then all bets are off anyway.
If you find any holes here, please let me know. I will work
on it some more. Appreciate your feedback.
I have added a number of checks to validate the CFI since
version 2. They are described in:
[RFC PATCH v3 13/21] objtool: arm64: Walk instructions and compute CFI for each instruction
FWIW, I have also compared the CFI I am generating with DWARF
information that the compiler generates. The CFIs match
100% for Clang. In the case of gcc, the comparison fails
in 1.7% of the cases. I have analyzed those cases and found
the DWARF information generated by gcc is incorrect. The
ORC generated by my Objtool is correct.
From Miroslav Benes <[email protected]>
====================================
klp_arch_set_pc() has been replaced by ftrace_instruction_pointer_set().
klp_get_ftrace_location() is not needed either.
I have addressed this in version 3.
From me Madhavan T. Venkataraman <[email protected]>
================================================================
- I have removed the unwind hints for FTrace and the Kretprobe
trampoline. These can be added later. I have retained only
the unwind hints for exceptions and interrupts.
- I have enhanced the decoder to recognize the paciasp instruction.
Both gcc and Clang use this to begin a frame pointer prolog. This
is really useful for working out the CFI.
v2:
From Josh Poimboeuf <[email protected]>:
==========================================
DWARF is not proven to be reliable. So, depending on it for livepatch
is a problem.
I have removed the DWARF part from the patch series. Instead,
I have implemented the minimum ARM64 instruction decoder
required for this work. I have implemented code to walk all
the instructions in an object file and generate ORC data.
The ORC data will be used by the unwinder to compute a
frame pointer and validate the actual frame pointer with it.
Unwind hints are a problem from a maintenance perspective.
This is true. But there are only a few unwind hints that I
have introduced in this work. Also, if an unwind hint becomes
outdated, the dynamic frame pointer check will catch it so
that the unwinder will know that it is unreliable.
Inline ASM code can cause problems that DWARF cannot catch.
Now that I walk all of the instructions, this problem is solved.
In version 2, the ORC data is generated based on the actual
machine code in the object file. So, the data reflects the
actual code. Unreliability in any part of the code will be
caught by the unwinder when it looks up the ORC data and
performs a frame pointer check.
Rename kernel code and data that currently contains the name dwarf
to avoid confusion.
This problem is solved as I have dropped DWARF altogether.
Try to reuse the existing ORC data format and code as much as possible.
I have reorganized the code in the following ways:
- I have placed code that was in check.c in separate files
so that different stack validation schemes can share the code.
- I have separated architecture-specific code and structures
from generic ones so that different architectures can share
common stuff.
- I am using the ORC structure as it is currently defined. The
only cosmetic change I have made is to rename the fields
bp_* to fp_* (FP for frame pointer).
- I completely reuse the ORC definitions and code. E.g., in
the kernel the ORC lookup code is shared across architectures.
Objtool contains other features which other architectures are looking
into. So, should we just implement static stack validation for other
architectures or use dynamic FP validation just for the livepatch
feature?
For one thing, it will take a long time before the static
validation scheme can even be proved to be complete on ARM64.
Livepatching is an immediate need for security fixes.
Also, since I am using the traditional approach in v2 of
walking the instructions, computing CFI, generating ORC, etc,
my current approach can be combined with the traditional
approach. Dynamic FP validation can be offered either as an
alternative to static stack validation or as something that
can be combined with static stack validation to make the
feature even more robust. Objtool can always have bugs and
there can be bad ORC data.
From Peter Zijlstra <[email protected]>:
===========================================
Please use/extend ORC.
Done. Please see my description above.
Why deviate from the traditional approach of static stack validation?
I have given the answer above.
Mandating DWARF sucks. Compile times are so much worse with DWARVES.
I have removed DWARF from the work. I use the traditional
approach of decoding the instructions and computing the
same data as DWARF but in ORC format.
DWARF does not cover assembly code.
In v2, I walk all of the functions including assembly functions.
SYM_CODE functions are not walked by Objtool anyway. So, I
don't do that. But I walk all the SYM_FUNC functions. Currently,
only a few SYM_FUNC functions in ARM64 have a proper FP prolog
and epilog. So, I plan to submit a separate patch series to
add an FP prolog and epilog for other SYM_FUNC functions.
But this is only to reduce retries during the livepatch process.
It is not absolutely required for livepatch to work. But I
plan to address this separately.
Compilers don't consider DWARF generation to be a correctness issue.
I totally agree. I have myself found 4 bugs that I have had
to compensate for. So, I have dropped DWARF.
From Chen Zhongjin <[email protected]>:
=============================================
One cannot depend on compilers to generate correct DWARF info.
Agreed. I have dropped DWARF.
DWARF does not cover assembly. So, what if too many assembly
functions exist so that the livepatch process can encounter too
many retries?
DWARF has been dropped. The code in version 2 walks all the
functions including assembly functions.
There is a corner case where an interrupt or an exception can happen
in FP prologs/epilogs. The stack trace would be unreliable.
Yes. This will be caught by the reliable unwinder in the kernel
in my scheme when it retrieves the ORC data and validates
the actual frame pointer. The validation will fail and the
stack trace will be considered unreliable.
v1:
- Introduced the livepatch feature based on DWARF Call Frame
Information generated by the compilers.
Previous versions and discussion
================================
v2: https://lore.kernel.org/linux-arm-kernel/[email protected]/T/#t
v1: https://lore.kernel.org/linux-arm-kernel/[email protected]/T/#t
Madhavan T. Venkataraman (22):
objtool: Reorganize CFI code
objtool: Reorganize instruction-related code
objtool: Move decode_instructions() to a separate file
objtool: Reorganize Unwind hint code
objtool: Reorganize ORC types
objtool: Reorganize ORC code
objtool: Reorganize ORC kernel code
objtool: Introduce STATIC_CHECK
objtool: arm64: Add basic definitions and compile
objtool: arm64: Implement decoder for Dynamic FP validation
objtool: arm64: Invoke the decoder
objtool: arm64: Compute destinations for call and jump instructions
objtool: arm64: Walk instructions and compute CFI for each instruction
objtool: arm64: Generate ORC data from CFI for object files
objtool: arm64: Add unwind hint support
arm64: Add unwind hints to exception handlers
arm64: Add kernel and module support for ORC
arm64: Build the kernel with ORC information
arm64: unwinder: Add a reliability check in the unwinder based on ORC
arm64: Define HAVE_DYNAMIC_FTRACE_WITH_ARGS
arm64: Define TIF_PATCH_PENDING for livepatch
arm64: Enable livepatch for ARM64
arch/arm64/Kconfig | 5 +
arch/arm64/Kconfig.debug | 33 +
arch/arm64/include/asm/ftrace.h | 20 +
arch/arm64/include/asm/module.h | 12 +-
arch/arm64/include/asm/orc_types.h | 35 ++
arch/arm64/include/asm/stacktrace/common.h | 15 +
arch/arm64/include/asm/thread_info.h | 4 +-
arch/arm64/include/asm/unwind_hints.h | 104 ++++
arch/arm64/kernel/entry.S | 3 +
arch/arm64/kernel/module.c | 13 +-
arch/arm64/kernel/setup.c | 2 +
arch/arm64/kernel/signal.c | 4 +
arch/arm64/kernel/stacktrace.c | 167 ++++-
arch/arm64/kernel/vmlinux.lds.S | 3 +
arch/x86/include/asm/orc_types.h | 37 +-
arch/x86/include/asm/unwind.h | 5 -
arch/x86/include/asm/unwind_hints.h | 83 +++
arch/x86/kernel/module.c | 7 +-
arch/x86/kernel/unwind_orc.c | 258 +-------
arch/x86/kernel/vmlinux.lds.S | 2 +-
.../asm => include/asm-generic}/orc_lookup.h | 42 ++
include/linux/objtool.h | 72 +--
include/linux/orc_entry.h | 39 ++
kernel/Makefile | 2 +
kernel/orc_lookup.c | 261 ++++++++
scripts/Makefile | 4 +-
scripts/Makefile.lib | 9 +
tools/arch/arm64/include/asm/orc_types.h | 35 ++
tools/arch/arm64/include/asm/unwind_hints.h | 104 ++++
tools/arch/x86/include/asm/orc_types.h | 37 +-
tools/arch/x86/include/asm/unwind_hints.h | 157 +++++
tools/include/linux/objtool.h | 72 +--
tools/include/linux/orc_entry.h | 39 ++
tools/objtool/Build | 9 +-
tools/objtool/Makefile | 8 +-
tools/objtool/arch/arm64/Build | 2 +
tools/objtool/arch/arm64/decode.c | 573 ++++++++++++++++++
.../arch/arm64/include/arch/cfi_regs.h | 13 +
tools/objtool/arch/arm64/include/arch/elf.h | 9 +
.../arch/arm64/include/arch/endianness.h | 9 +
tools/objtool/arch/arm64/orc.c | 90 +++
tools/objtool/arch/x86/Build | 1 +
tools/objtool/arch/x86/include/arch/elf.h | 1 +
tools/objtool/arch/x86/orc.c | 150 +++++
tools/objtool/cfi.c | 108 ++++
tools/objtool/check.c | 492 +--------------
tools/objtool/dcheck.c | 360 +++++++++++
tools/objtool/decode.c | 107 ++++
tools/objtool/include/objtool/arch.h | 2 +
tools/objtool/include/objtool/cfi.h | 12 +
tools/objtool/include/objtool/check.h | 77 +--
tools/objtool/include/objtool/endianness.h | 1 +
tools/objtool/include/objtool/insn.h | 130 ++++
tools/objtool/include/objtool/objtool.h | 1 +
tools/objtool/include/objtool/orc.h | 18 +
tools/objtool/insn.c | 215 +++++++
tools/objtool/orc_dump.c | 63 +-
tools/objtool/orc_gen.c | 89 +--
tools/objtool/sync-check.sh | 10 +
tools/objtool/unwind_hints.c | 110 ++++
60 files changed, 3176 insertions(+), 1169 deletions(-)
create mode 100644 arch/arm64/include/asm/orc_types.h
create mode 100644 arch/arm64/include/asm/unwind_hints.h
rename {arch/x86/include/asm => include/asm-generic}/orc_lookup.h (51%)
create mode 100644 include/linux/orc_entry.h
create mode 100644 kernel/orc_lookup.c
create mode 100644 tools/arch/arm64/include/asm/orc_types.h
create mode 100644 tools/arch/arm64/include/asm/unwind_hints.h
create mode 100644 tools/arch/x86/include/asm/unwind_hints.h
create mode 100644 tools/include/linux/orc_entry.h
create mode 100644 tools/objtool/arch/arm64/Build
create mode 100644 tools/objtool/arch/arm64/decode.c
create mode 100644 tools/objtool/arch/arm64/include/arch/cfi_regs.h
create mode 100644 tools/objtool/arch/arm64/include/arch/elf.h
create mode 100644 tools/objtool/arch/arm64/include/arch/endianness.h
create mode 100644 tools/objtool/arch/arm64/orc.c
create mode 100644 tools/objtool/arch/x86/orc.c
create mode 100644 tools/objtool/cfi.c
create mode 100644 tools/objtool/dcheck.c
create mode 100644 tools/objtool/decode.c
create mode 100644 tools/objtool/include/objtool/insn.h
create mode 100644 tools/objtool/include/objtool/orc.h
create mode 100644 tools/objtool/insn.c
create mode 100644 tools/objtool/unwind_hints.c
base-commit: 830b3c68c1fb1e9176028d02ef86f3cf76aa2476
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
check.c implements static stack validation. But the instruction-related
code that it contains can be shared with other types of validation. E.g.,
dynamic FP validation. Move the instruction-related code to its own files
- insn.h and insn.c.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
tools/objtool/Build | 1 +
tools/objtool/check.c | 201 --------------------------
tools/objtool/include/objtool/check.h | 77 +---------
tools/objtool/include/objtool/insn.h | 125 ++++++++++++++++
tools/objtool/insn.c | 186 ++++++++++++++++++++++++
5 files changed, 313 insertions(+), 277 deletions(-)
create mode 100644 tools/objtool/include/objtool/insn.h
create mode 100644 tools/objtool/insn.c
diff --git a/tools/objtool/Build b/tools/objtool/Build
index 21db9d79c69f..1149048e6b3e 100644
--- a/tools/objtool/Build
+++ b/tools/objtool/Build
@@ -6,6 +6,7 @@ objtool-y += check.o
objtool-y += special.o
objtool-y += builtin-check.o
objtool-y += cfi.o
+objtool-y += insn.o
objtool-y += elf.o
objtool-y += objtool.o
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index e6a2afa08748..d208086a8a18 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -28,86 +28,6 @@ struct alternative {
bool skip_orig;
};
-struct instruction *find_insn(struct objtool_file *file,
- struct section *sec, unsigned long offset)
-{
- struct instruction *insn;
-
- hash_for_each_possible(file->insn_hash, insn, hash, sec_offset_hash(sec, offset)) {
- if (insn->sec == sec && insn->offset == offset)
- return insn;
- }
-
- return NULL;
-}
-
-static struct instruction *next_insn_same_sec(struct objtool_file *file,
- struct instruction *insn)
-{
- struct instruction *next = list_next_entry(insn, list);
-
- if (!next || &next->list == &file->insn_list || next->sec != insn->sec)
- return NULL;
-
- return next;
-}
-
-static struct instruction *next_insn_same_func(struct objtool_file *file,
- struct instruction *insn)
-{
- struct instruction *next = list_next_entry(insn, list);
- struct symbol *func = insn->func;
-
- if (!func)
- return NULL;
-
- if (&next->list != &file->insn_list && next->func == func)
- return next;
-
- /* Check if we're already in the subfunction: */
- if (func == func->cfunc)
- return NULL;
-
- /* Move to the subfunction: */
- return find_insn(file, func->cfunc->sec, func->cfunc->offset);
-}
-
-static struct instruction *prev_insn_same_sym(struct objtool_file *file,
- struct instruction *insn)
-{
- struct instruction *prev = list_prev_entry(insn, list);
-
- if (&prev->list != &file->insn_list && prev->func == insn->func)
- return prev;
-
- return NULL;
-}
-
-#define func_for_each_insn(file, func, insn) \
- for (insn = find_insn(file, func->sec, func->offset); \
- insn; \
- insn = next_insn_same_func(file, insn))
-
-#define sym_for_each_insn(file, sym, insn) \
- for (insn = find_insn(file, sym->sec, sym->offset); \
- insn && &insn->list != &file->insn_list && \
- insn->sec == sym->sec && \
- insn->offset < sym->offset + sym->len; \
- insn = list_next_entry(insn, list))
-
-#define sym_for_each_insn_continue_reverse(file, sym, insn) \
- for (insn = list_prev_entry(insn, list); \
- &insn->list != &file->insn_list && \
- insn->sec == sym->sec && insn->offset >= sym->offset; \
- insn = list_prev_entry(insn, list))
-
-#define sec_for_each_insn_from(file, insn) \
- for (; insn; insn = next_insn_same_sec(file, insn))
-
-#define sec_for_each_insn_continue(file, insn) \
- for (insn = next_insn_same_sec(file, insn); insn; \
- insn = next_insn_same_sec(file, insn))
-
static bool is_jump_table_jump(struct instruction *insn)
{
struct alt_group *alt_group = insn->alt_group;
@@ -249,21 +169,6 @@ static bool dead_end_function(struct objtool_file *file, struct symbol *func)
return __dead_end_function(file, func, 0);
}
-static void init_insn_state(struct objtool_file *file, struct insn_state *state,
- struct section *sec)
-{
- memset(state, 0, sizeof(*state));
- init_cfi_state(&state->cfi);
-
- /*
- * We need the full vmlinux for noinstr validation, otherwise we can
- * not correctly determine insn->call_dest->sec (external symbols do
- * not have a section).
- */
- if (opts.link && opts.noinstr && sec)
- state->noinstr = sec->noinstr;
-}
-
static unsigned long nr_insns;
static unsigned long nr_insns_visited;
@@ -439,19 +344,6 @@ static int init_pv_ops(struct objtool_file *file)
return 0;
}
-static struct instruction *find_last_insn(struct objtool_file *file,
- struct section *sec)
-{
- struct instruction *insn = NULL;
- unsigned int offset;
- unsigned int end = (sec->sh.sh_size > 10) ? sec->sh.sh_size - 10 : 0;
-
- for (offset = sec->sh.sh_size - 1; offset >= end && !insn; offset--)
- insn = find_insn(file, sec, offset);
-
- return insn;
-}
-
/*
* Mark "ud2" instructions and manually annotated dead ends.
*/
@@ -1072,28 +964,6 @@ __weak bool arch_is_rethunk(struct symbol *sym)
return false;
}
-#define NEGATIVE_RELOC ((void *)-1L)
-
-static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
-{
- if (insn->reloc == NEGATIVE_RELOC)
- return NULL;
-
- if (!insn->reloc) {
- if (!file)
- return NULL;
-
- insn->reloc = find_reloc_by_dest_range(file->elf, insn->sec,
- insn->offset, insn->len);
- if (!insn->reloc) {
- insn->reloc = NEGATIVE_RELOC;
- return NULL;
- }
- }
-
- return insn->reloc;
-}
-
static void remove_insn_ops(struct instruction *insn)
{
struct stack_op *op, *tmp;
@@ -1252,27 +1122,6 @@ static void add_return_call(struct objtool_file *file, struct instruction *insn,
list_add_tail(&insn->call_node, &file->return_thunk_list);
}
-static bool same_function(struct instruction *insn1, struct instruction *insn2)
-{
- return insn1->func->pfunc == insn2->func->pfunc;
-}
-
-static bool is_first_func_insn(struct objtool_file *file, struct instruction *insn)
-{
- if (insn->offset == insn->func->offset)
- return true;
-
- if (opts.ibt) {
- struct instruction *prev = prev_insn_same_sym(file, insn);
-
- if (prev && prev->type == INSN_ENDBR &&
- insn->offset == insn->func->offset + prev->len)
- return true;
- }
-
- return false;
-}
-
/*
* Find the destination instructions for all jumps.
*/
@@ -2987,56 +2836,6 @@ static int handle_insn_ops(struct instruction *insn,
return 0;
}
-static bool insn_cfi_match(struct instruction *insn, struct cfi_state *cfi2)
-{
- struct cfi_state *cfi1 = insn->cfi;
- int i;
-
- if (!cfi1) {
- WARN("CFI missing");
- return false;
- }
-
- if (memcmp(&cfi1->cfa, &cfi2->cfa, sizeof(cfi1->cfa))) {
-
- WARN_FUNC("stack state mismatch: cfa1=%d%+d cfa2=%d%+d",
- insn->sec, insn->offset,
- cfi1->cfa.base, cfi1->cfa.offset,
- cfi2->cfa.base, cfi2->cfa.offset);
-
- } else if (memcmp(&cfi1->regs, &cfi2->regs, sizeof(cfi1->regs))) {
- for (i = 0; i < CFI_NUM_REGS; i++) {
- if (!memcmp(&cfi1->regs[i], &cfi2->regs[i],
- sizeof(struct cfi_reg)))
- continue;
-
- WARN_FUNC("stack state mismatch: reg1[%d]=%d%+d reg2[%d]=%d%+d",
- insn->sec, insn->offset,
- i, cfi1->regs[i].base, cfi1->regs[i].offset,
- i, cfi2->regs[i].base, cfi2->regs[i].offset);
- break;
- }
-
- } else if (cfi1->type != cfi2->type) {
-
- WARN_FUNC("stack state mismatch: type1=%d type2=%d",
- insn->sec, insn->offset, cfi1->type, cfi2->type);
-
- } else if (cfi1->drap != cfi2->drap ||
- (cfi1->drap && cfi1->drap_reg != cfi2->drap_reg) ||
- (cfi1->drap && cfi1->drap_offset != cfi2->drap_offset)) {
-
- WARN_FUNC("stack state mismatch: drap1=%d(%d,%d) drap2=%d(%d,%d)",
- insn->sec, insn->offset,
- cfi1->drap, cfi1->drap_reg, cfi1->drap_offset,
- cfi2->drap, cfi2->drap_reg, cfi2->drap_offset);
-
- } else
- return true;
-
- return false;
-}
-
static inline bool func_uaccess_safe(struct symbol *func)
{
if (func)
diff --git a/tools/objtool/include/objtool/check.h b/tools/objtool/include/objtool/check.h
index 036129cebeee..a093f5cb100a 100644
--- a/tools/objtool/include/objtool/check.h
+++ b/tools/objtool/include/objtool/check.h
@@ -7,17 +7,7 @@
#define _CHECK_H
#include <stdbool.h>
-#include <objtool/cfi.h>
-#include <objtool/arch.h>
-
-struct insn_state {
- struct cfi_state cfi;
- unsigned int uaccess_stack;
- bool uaccess;
- bool df;
- bool noinstr;
- s8 instr;
-};
+#include <objtool/insn.h>
struct alt_group {
/*
@@ -36,74 +26,9 @@ struct alt_group {
struct cfi_state **cfi;
};
-struct instruction {
- struct list_head list;
- struct hlist_node hash;
- struct list_head call_node;
- struct section *sec;
- unsigned long offset;
- unsigned int len;
- enum insn_type type;
- unsigned long immediate;
-
- u16 dead_end : 1,
- ignore : 1,
- ignore_alts : 1,
- hint : 1,
- save : 1,
- restore : 1,
- retpoline_safe : 1,
- noendbr : 1,
- entry : 1;
- /* 7 bit hole */
-
- s8 instr;
- u8 visited;
-
- struct alt_group *alt_group;
- struct symbol *call_dest;
- struct instruction *jump_dest;
- struct instruction *first_jump_src;
- struct reloc *jump_table;
- struct reloc *reloc;
- struct list_head alts;
- struct symbol *func;
- struct list_head stack_ops;
- struct cfi_state *cfi;
-};
-
#define VISITED_BRANCH 0x01
#define VISITED_BRANCH_UACCESS 0x02
#define VISITED_BRANCH_MASK 0x03
#define VISITED_ENTRY 0x04
-static inline bool is_static_jump(struct instruction *insn)
-{
- return insn->type == INSN_JUMP_CONDITIONAL ||
- insn->type == INSN_JUMP_UNCONDITIONAL;
-}
-
-static inline bool is_dynamic_jump(struct instruction *insn)
-{
- return insn->type == INSN_JUMP_DYNAMIC ||
- insn->type == INSN_JUMP_DYNAMIC_CONDITIONAL;
-}
-
-static inline bool is_jump(struct instruction *insn)
-{
- return is_static_jump(insn) || is_dynamic_jump(insn);
-}
-
-struct instruction *find_insn(struct objtool_file *file,
- struct section *sec, unsigned long offset);
-
-#define for_each_insn(file, insn) \
- list_for_each_entry(insn, &file->insn_list, list)
-
-#define sec_for_each_insn(file, sec, insn) \
- for (insn = find_insn(file, sec, 0); \
- insn && &insn->list != &file->insn_list && \
- insn->sec == sec; \
- insn = list_next_entry(insn, list))
-
#endif /* _CHECK_H */
diff --git a/tools/objtool/include/objtool/insn.h b/tools/objtool/include/objtool/insn.h
new file mode 100644
index 000000000000..b40756a38994
--- /dev/null
+++ b/tools/objtool/include/objtool/insn.h
@@ -0,0 +1,125 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
+ */
+
+#ifndef _INSN_H
+#define _INSN_H
+
+#include <objtool/objtool.h>
+#include <objtool/arch.h>
+
+struct insn_state {
+ struct cfi_state cfi;
+ unsigned int uaccess_stack;
+ bool uaccess;
+ bool df;
+ bool noinstr;
+ s8 instr;
+};
+
+struct instruction {
+ struct list_head list;
+ struct hlist_node hash;
+ struct list_head call_node;
+ struct section *sec;
+ unsigned long offset;
+ unsigned int len;
+ enum insn_type type;
+ unsigned long immediate;
+
+ u16 dead_end : 1,
+ ignore : 1,
+ ignore_alts : 1,
+ hint : 1,
+ save : 1,
+ restore : 1,
+ retpoline_safe : 1,
+ noendbr : 1,
+ entry : 1;
+ /* 7 bit hole */
+
+ s8 instr;
+ u8 visited;
+
+ struct alt_group *alt_group;
+ struct symbol *call_dest;
+ struct instruction *jump_dest;
+ struct instruction *first_jump_src;
+ struct reloc *jump_table;
+ struct reloc *reloc;
+ struct list_head alts;
+ struct symbol *func;
+ struct list_head stack_ops;
+ struct cfi_state *cfi;
+};
+
+static inline bool is_static_jump(struct instruction *insn)
+{
+ return insn->type == INSN_JUMP_CONDITIONAL ||
+ insn->type == INSN_JUMP_UNCONDITIONAL;
+}
+
+static inline bool is_dynamic_jump(struct instruction *insn)
+{
+ return insn->type == INSN_JUMP_DYNAMIC ||
+ insn->type == INSN_JUMP_DYNAMIC_CONDITIONAL;
+}
+
+static inline bool is_jump(struct instruction *insn)
+{
+ return is_static_jump(insn) || is_dynamic_jump(insn);
+}
+
+void init_insn_state(struct objtool_file *file, struct insn_state *state,
+ struct section *sec);
+struct instruction *find_insn(struct objtool_file *file,
+ struct section *sec, unsigned long offset);
+struct instruction *find_last_insn(struct objtool_file *file,
+ struct section *sec);
+struct instruction *prev_insn_same_sym(struct objtool_file *file,
+ struct instruction *insn);
+struct instruction *next_insn_same_sec(struct objtool_file *file,
+ struct instruction *insn);
+struct instruction *next_insn_same_func(struct objtool_file *file,
+ struct instruction *insn);
+struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn);
+bool insn_cfi_match(struct instruction *insn, struct cfi_state *cfi2);
+bool same_function(struct instruction *insn1, struct instruction *insn2);
+bool is_first_func_insn(struct objtool_file *file, struct instruction *insn);
+
+#define for_each_insn(file, insn) \
+ list_for_each_entry(insn, &file->insn_list, list)
+
+#define sec_for_each_insn(file, sec, insn) \
+ for (insn = find_insn(file, sec, 0); \
+ insn && &insn->list != &file->insn_list && \
+ insn->sec == sec; \
+ insn = list_next_entry(insn, list))
+
+#define func_for_each_insn(file, func, insn) \
+ for (insn = find_insn(file, func->sec, func->offset); \
+ insn; \
+ insn = next_insn_same_func(file, insn))
+
+#define sym_for_each_insn(file, sym, insn) \
+ for (insn = find_insn(file, sym->sec, sym->offset); \
+ insn && &insn->list != &file->insn_list && \
+ insn->sec == sym->sec && \
+ insn->offset < sym->offset + sym->len; \
+ insn = list_next_entry(insn, list))
+
+#define sym_for_each_insn_continue_reverse(file, sym, insn) \
+ for (insn = list_prev_entry(insn, list); \
+ &insn->list != &file->insn_list && \
+ insn->sec == sym->sec && insn->offset >= sym->offset; \
+ insn = list_prev_entry(insn, list))
+
+#define sec_for_each_insn_from(file, insn) \
+ for (; insn; insn = next_insn_same_sec(file, insn))
+
+#define sec_for_each_insn_continue(file, insn) \
+ for (insn = next_insn_same_sec(file, insn); insn; \
+ insn = next_insn_same_sec(file, insn))
+
+#endif /* _INSN_H */
diff --git a/tools/objtool/insn.c b/tools/objtool/insn.c
new file mode 100644
index 000000000000..e570b46ad39e
--- /dev/null
+++ b/tools/objtool/insn.c
@@ -0,0 +1,186 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2015-2017 Josh Poimboeuf <[email protected]>
+ */
+
+#include <string.h>
+
+#include <objtool/builtin.h>
+#include <objtool/insn.h>
+#include <objtool/warn.h>
+
+struct instruction *find_insn(struct objtool_file *file,
+ struct section *sec, unsigned long offset)
+{
+ struct instruction *insn;
+
+ hash_for_each_possible(file->insn_hash, insn, hash, sec_offset_hash(sec, offset)) {
+ if (insn->sec == sec && insn->offset == offset)
+ return insn;
+ }
+
+ return NULL;
+}
+
+struct instruction *next_insn_same_sec(struct objtool_file *file,
+ struct instruction *insn)
+{
+ struct instruction *next = list_next_entry(insn, list);
+
+ if (!next || &next->list == &file->insn_list || next->sec != insn->sec)
+ return NULL;
+
+ return next;
+}
+
+struct instruction *next_insn_same_func(struct objtool_file *file,
+ struct instruction *insn)
+{
+ struct instruction *next = list_next_entry(insn, list);
+ struct symbol *func = insn->func;
+
+ if (!func)
+ return NULL;
+
+ if (&next->list != &file->insn_list && next->func == func)
+ return next;
+
+ /* Check if we're already in the subfunction: */
+ if (func == func->cfunc)
+ return NULL;
+
+ /* Move to the subfunction: */
+ return find_insn(file, func->cfunc->sec, func->cfunc->offset);
+}
+
+struct instruction *prev_insn_same_sym(struct objtool_file *file,
+ struct instruction *insn)
+{
+ struct instruction *prev = list_prev_entry(insn, list);
+
+ if (&prev->list != &file->insn_list && prev->func == insn->func)
+ return prev;
+
+ return NULL;
+}
+
+void init_insn_state(struct objtool_file *file, struct insn_state *state,
+ struct section *sec)
+{
+ memset(state, 0, sizeof(*state));
+ init_cfi_state(&state->cfi);
+
+ /*
+ * We need the full vmlinux for noinstr validation, otherwise we can
+ * not correctly determine insn->call_dest->sec (external symbols do
+ * not have a section).
+ */
+ if (opts.link && opts.noinstr && sec)
+ state->noinstr = sec->noinstr;
+}
+
+struct instruction *find_last_insn(struct objtool_file *file,
+ struct section *sec)
+{
+ struct instruction *insn = NULL;
+ unsigned int offset;
+ unsigned int end = (sec->sh.sh_size > 10) ? sec->sh.sh_size - 10 : 0;
+
+ for (offset = sec->sh.sh_size - 1; offset >= end && !insn; offset--)
+ insn = find_insn(file, sec, offset);
+
+ return insn;
+}
+
+#define NEGATIVE_RELOC ((void *)-1L)
+
+struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
+{
+ if (insn->reloc == NEGATIVE_RELOC)
+ return NULL;
+
+ if (!insn->reloc) {
+ if (!file)
+ return NULL;
+
+ insn->reloc = find_reloc_by_dest_range(file->elf, insn->sec,
+ insn->offset, insn->len);
+ if (!insn->reloc) {
+ insn->reloc = NEGATIVE_RELOC;
+ return NULL;
+ }
+ }
+
+ return insn->reloc;
+}
+
+bool same_function(struct instruction *insn1, struct instruction *insn2)
+{
+ return insn1->func->pfunc == insn2->func->pfunc;
+}
+
+bool is_first_func_insn(struct objtool_file *file, struct instruction *insn)
+{
+ if (insn->offset == insn->func->offset)
+ return true;
+
+ if (opts.ibt) {
+ struct instruction *prev = prev_insn_same_sym(file, insn);
+
+ if (prev && prev->type == INSN_ENDBR &&
+ insn->offset == insn->func->offset + prev->len)
+ return true;
+ }
+
+ return false;
+}
+
+bool insn_cfi_match(struct instruction *insn, struct cfi_state *cfi2)
+{
+ struct cfi_state *cfi1 = insn->cfi;
+ int i;
+
+ if (!cfi1) {
+ WARN("CFI missing");
+ return false;
+ }
+
+ if (memcmp(&cfi1->cfa, &cfi2->cfa, sizeof(cfi1->cfa))) {
+
+ WARN_FUNC("stack state mismatch: cfa1=%d%+d cfa2=%d%+d",
+ insn->sec, insn->offset,
+ cfi1->cfa.base, cfi1->cfa.offset,
+ cfi2->cfa.base, cfi2->cfa.offset);
+
+ } else if (memcmp(&cfi1->regs, &cfi2->regs, sizeof(cfi1->regs))) {
+ for (i = 0; i < CFI_NUM_REGS; i++) {
+ if (!memcmp(&cfi1->regs[i], &cfi2->regs[i],
+ sizeof(struct cfi_reg)))
+ continue;
+
+ WARN_FUNC("stack state mismatch: reg1[%d]=%d%+d reg2[%d]=%d%+d",
+ insn->sec, insn->offset,
+ i, cfi1->regs[i].base, cfi1->regs[i].offset,
+ i, cfi2->regs[i].base, cfi2->regs[i].offset);
+ break;
+ }
+
+ } else if (cfi1->type != cfi2->type) {
+
+ WARN_FUNC("stack state mismatch: type1=%d type2=%d",
+ insn->sec, insn->offset, cfi1->type, cfi2->type);
+
+ } else if (cfi1->drap != cfi2->drap ||
+ (cfi1->drap && cfi1->drap_reg != cfi2->drap_reg) ||
+ (cfi1->drap && cfi1->drap_offset != cfi2->drap_offset)) {
+
+ WARN_FUNC("stack state mismatch: drap1=%d(%d,%d) drap2=%d(%d,%d)",
+ insn->sec, insn->offset,
+ cfi1->drap, cfi1->drap_reg, cfi1->drap_offset,
+ cfi2->drap, cfi2->drap_reg, cfi2->drap_offset);
+
+ } else
+ return true;
+
+ return false;
+}
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
check.c implements static stack validation. However, decode_instructions(),
which resides in it, can be shared with other types of validation, e.g.,
dynamic FP validation. Move the function to its own file, decode.c.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
tools/objtool/Build | 1 +
tools/objtool/check.c | 97 ------------------------
tools/objtool/decode.c | 107 +++++++++++++++++++++++++++
tools/objtool/include/objtool/insn.h | 2 +
4 files changed, 110 insertions(+), 97 deletions(-)
create mode 100644 tools/objtool/decode.c
diff --git a/tools/objtool/Build b/tools/objtool/Build
index 1149048e6b3e..8afe56cd0c2d 100644
--- a/tools/objtool/Build
+++ b/tools/objtool/Build
@@ -7,6 +7,7 @@ objtool-y += special.o
objtool-y += builtin-check.o
objtool-y += cfi.o
objtool-y += insn.o
+objtool-y += decode.o
objtool-y += elf.o
objtool-y += objtool.o
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index d208086a8a18..be3f6564104a 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -169,105 +169,8 @@ static bool dead_end_function(struct objtool_file *file, struct symbol *func)
return __dead_end_function(file, func, 0);
}
-static unsigned long nr_insns;
static unsigned long nr_insns_visited;
-/*
- * Call the arch-specific instruction decoder for all the instructions and add
- * them to the global instruction list.
- */
-static int decode_instructions(struct objtool_file *file)
-{
- struct section *sec;
- struct symbol *func;
- unsigned long offset;
- struct instruction *insn;
- int ret;
-
- for_each_sec(file, sec) {
-
- if (!(sec->sh.sh_flags & SHF_EXECINSTR))
- continue;
-
- if (strcmp(sec->name, ".altinstr_replacement") &&
- strcmp(sec->name, ".altinstr_aux") &&
- strncmp(sec->name, ".discard.", 9))
- sec->text = true;
-
- if (!strcmp(sec->name, ".noinstr.text") ||
- !strcmp(sec->name, ".entry.text") ||
- !strncmp(sec->name, ".text.__x86.", 12))
- sec->noinstr = true;
-
- for (offset = 0; offset < sec->sh.sh_size; offset += insn->len) {
- insn = malloc(sizeof(*insn));
- if (!insn) {
- WARN("malloc failed");
- return -1;
- }
- memset(insn, 0, sizeof(*insn));
- INIT_LIST_HEAD(&insn->alts);
- INIT_LIST_HEAD(&insn->stack_ops);
- INIT_LIST_HEAD(&insn->call_node);
-
- insn->sec = sec;
- insn->offset = offset;
-
- ret = arch_decode_instruction(file, sec, offset,
- sec->sh.sh_size - offset,
- &insn->len, &insn->type,
- &insn->immediate,
- &insn->stack_ops);
- if (ret)
- goto err;
-
- /*
- * By default, "ud2" is a dead end unless otherwise
- * annotated, because GCC 7 inserts it for certain
- * divide-by-zero cases.
- */
- if (insn->type == INSN_BUG)
- insn->dead_end = true;
-
- hash_add(file->insn_hash, &insn->hash, sec_offset_hash(sec, insn->offset));
- list_add_tail(&insn->list, &file->insn_list);
- nr_insns++;
- }
-
- list_for_each_entry(func, &sec->symbol_list, list) {
- if (func->type != STT_FUNC || func->alias != func)
- continue;
-
- if (!find_insn(file, sec, func->offset)) {
- WARN("%s(): can't find starting instruction",
- func->name);
- return -1;
- }
-
- sym_for_each_insn(file, func, insn) {
- insn->func = func;
- if (insn->type == INSN_ENDBR && list_empty(&insn->call_node)) {
- if (insn->offset == insn->func->offset) {
- list_add_tail(&insn->call_node, &file->endbr_list);
- file->nr_endbr++;
- } else {
- file->nr_endbr_int++;
- }
- }
- }
- }
- }
-
- if (opts.stats)
- printf("nr_insns: %lu\n", nr_insns);
-
- return 0;
-
-err:
- free(insn);
- return ret;
-}
-
/*
* Read the pv_ops[] .data table to find the static initialized values.
*/
diff --git a/tools/objtool/decode.c b/tools/objtool/decode.c
new file mode 100644
index 000000000000..dcec3efc2afb
--- /dev/null
+++ b/tools/objtool/decode.c
@@ -0,0 +1,107 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2015-2017 Josh Poimboeuf <[email protected]>
+ */
+#include <linux/objtool.h>
+
+#include <objtool/builtin.h>
+#include <objtool/insn.h>
+#include <objtool/warn.h>
+
+static unsigned long nr_insns;
+
+/*
+ * Call the arch-specific instruction decoder for all the instructions and add
+ * them to the global instruction list.
+ */
+int decode_instructions(struct objtool_file *file)
+{
+ struct section *sec;
+ struct symbol *func;
+ unsigned long offset;
+ struct instruction *insn;
+ int ret;
+
+ for_each_sec(file, sec) {
+
+ if (!(sec->sh.sh_flags & SHF_EXECINSTR))
+ continue;
+
+ if (strcmp(sec->name, ".altinstr_replacement") &&
+ strcmp(sec->name, ".altinstr_aux") &&
+ strncmp(sec->name, ".discard.", 9))
+ sec->text = true;
+
+ if (!strcmp(sec->name, ".noinstr.text") ||
+ !strcmp(sec->name, ".entry.text") ||
+ !strncmp(sec->name, ".text.__x86.", 12))
+ sec->noinstr = true;
+
+ for (offset = 0; offset < sec->sh.sh_size; offset += insn->len) {
+ insn = malloc(sizeof(*insn));
+ if (!insn) {
+ WARN("malloc failed");
+ return -1;
+ }
+ memset(insn, 0, sizeof(*insn));
+ INIT_LIST_HEAD(&insn->alts);
+ INIT_LIST_HEAD(&insn->stack_ops);
+ INIT_LIST_HEAD(&insn->call_node);
+
+ insn->sec = sec;
+ insn->offset = offset;
+
+ ret = arch_decode_instruction(file, sec, offset,
+ sec->sh.sh_size - offset,
+ &insn->len, &insn->type,
+ &insn->immediate,
+ &insn->stack_ops);
+ if (ret)
+ goto err;
+
+ /*
+ * By default, "ud2" is a dead end unless otherwise
+ * annotated, because GCC 7 inserts it for certain
+ * divide-by-zero cases.
+ */
+ if (insn->type == INSN_BUG)
+ insn->dead_end = true;
+
+ hash_add(file->insn_hash, &insn->hash, sec_offset_hash(sec, insn->offset));
+ list_add_tail(&insn->list, &file->insn_list);
+ nr_insns++;
+ }
+
+ list_for_each_entry(func, &sec->symbol_list, list) {
+ if (func->type != STT_FUNC || func->alias != func)
+ continue;
+
+ if (!find_insn(file, sec, func->offset)) {
+ WARN("%s(): can't find starting instruction",
+ func->name);
+ return -1;
+ }
+
+ sym_for_each_insn(file, func, insn) {
+ insn->func = func;
+ if (insn->type == INSN_ENDBR && list_empty(&insn->call_node)) {
+ if (insn->offset == insn->func->offset) {
+ list_add_tail(&insn->call_node, &file->endbr_list);
+ file->nr_endbr++;
+ } else {
+ file->nr_endbr_int++;
+ }
+ }
+ }
+ }
+ }
+
+ if (opts.stats)
+ printf("nr_insns: %lu\n", nr_insns);
+
+ return 0;
+
+err:
+ free(insn);
+ return ret;
+}
diff --git a/tools/objtool/include/objtool/insn.h b/tools/objtool/include/objtool/insn.h
index b40756a38994..b74c7f0d9076 100644
--- a/tools/objtool/include/objtool/insn.h
+++ b/tools/objtool/include/objtool/insn.h
@@ -88,6 +88,8 @@ bool insn_cfi_match(struct instruction *insn, struct cfi_state *cfi2);
bool same_function(struct instruction *insn1, struct instruction *insn2);
bool is_first_func_insn(struct objtool_file *file, struct instruction *insn);
+int decode_instructions(struct objtool_file *file);
+
#define for_each_insn(file, insn) \
list_for_each_entry(insn, &file->insn_list, list)
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
The ORC code needs to be reorganized into arch-specific and generic parts
so that architectures other than x86 can use the generic parts.
orc_types.h contains the following ORC definitions shared between objtool
and the kernel:
- ORC register definitions, which are arch-specific.
- The orc_entry structure, which is generic.
Move orc_entry into a new file, include/linux/orc_entry.h. Also, the field
names bp_reg and bp_offset in struct orc_entry are x86-specific. Rename
them to fp_reg and fp_offset, where FP stands for frame pointer. x86 keeps
the old names through #define aliases in orc_types.h.
Currently, the type field in orc_entry is only 2 bits wide. Other
architectures will need more, so expand it to 3 bits.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
arch/x86/include/asm/orc_types.h | 37 +++++-------------------
include/linux/orc_entry.h | 39 ++++++++++++++++++++++++++
tools/arch/x86/include/asm/orc_types.h | 37 +++++-------------------
tools/include/linux/orc_entry.h | 39 ++++++++++++++++++++++++++
tools/objtool/orc_gen.c | 4 +--
tools/objtool/sync-check.sh | 1 +
6 files changed, 95 insertions(+), 62 deletions(-)
create mode 100644 include/linux/orc_entry.h
create mode 100644 tools/include/linux/orc_entry.h
diff --git a/arch/x86/include/asm/orc_types.h b/arch/x86/include/asm/orc_types.h
index 5a2baf28a1dc..851c9fb9f695 100644
--- a/arch/x86/include/asm/orc_types.h
+++ b/arch/x86/include/asm/orc_types.h
@@ -8,6 +8,13 @@
#include <linux/types.h>
#include <linux/compiler.h>
+#include <linux/orc_entry.h>
+
+/*
+ * For x86, use the appropriate name for the frame pointer in orc_entry.
+ */
+#define bp_offset fp_offset
+#define bp_reg fp_reg
/*
* The ORC_REG_* registers are base registers which are used to find other
@@ -39,34 +46,4 @@
#define ORC_REG_SP_INDIRECT 9
#define ORC_REG_MAX 15
-#ifndef __ASSEMBLY__
-#include <asm/byteorder.h>
-
-/*
- * This struct is more or less a vastly simplified version of the DWARF Call
- * Frame Information standard. It contains only the necessary parts of DWARF
- * CFI, simplified for ease of access by the in-kernel unwinder. It tells the
- * unwinder how to find the previous SP and BP (and sometimes entry regs) on
- * the stack for a given code address. Each instance of the struct corresponds
- * to one or more code locations.
- */
-struct orc_entry {
- s16 sp_offset;
- s16 bp_offset;
-#if defined(__LITTLE_ENDIAN_BITFIELD)
- unsigned sp_reg:4;
- unsigned bp_reg:4;
- unsigned type:2;
- unsigned end:1;
-#elif defined(__BIG_ENDIAN_BITFIELD)
- unsigned bp_reg:4;
- unsigned sp_reg:4;
- unsigned unused:5;
- unsigned end:1;
- unsigned type:2;
-#endif
-} __packed;
-
-#endif /* __ASSEMBLY__ */
-
#endif /* _ORC_TYPES_H */
diff --git a/include/linux/orc_entry.h b/include/linux/orc_entry.h
new file mode 100644
index 000000000000..3d49e3b9dabe
--- /dev/null
+++ b/include/linux/orc_entry.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
+ */
+
+#ifndef _ORC_ENTRY_H
+#define _ORC_ENTRY_H
+
+#ifndef __ASSEMBLY__
+#include <asm/byteorder.h>
+
+/*
+ * This struct is more or less a vastly simplified version of the DWARF Call
+ * Frame Information standard. It contains only the necessary parts of DWARF
+ * CFI, simplified for ease of access by the in-kernel unwinder. It tells the
+ * unwinder how to find the previous SP and BP (and sometimes entry regs) on
+ * the stack for a given code address. Each instance of the struct corresponds
+ * to one or more code locations.
+ */
+struct orc_entry {
+ s16 sp_offset;
+ s16 fp_offset;
+#if defined(__LITTLE_ENDIAN_BITFIELD)
+ unsigned sp_reg:4;
+ unsigned fp_reg:4;
+ unsigned type:3;
+ unsigned end:1;
+#elif defined(__BIG_ENDIAN_BITFIELD)
+ unsigned fp_reg:4;
+ unsigned sp_reg:4;
+ unsigned unused:4;
+ unsigned end:1;
+ unsigned type:3;
+#endif
+} __packed;
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ORC_ENTRY_H */
diff --git a/tools/arch/x86/include/asm/orc_types.h b/tools/arch/x86/include/asm/orc_types.h
index 5a2baf28a1dc..851c9fb9f695 100644
--- a/tools/arch/x86/include/asm/orc_types.h
+++ b/tools/arch/x86/include/asm/orc_types.h
@@ -8,6 +8,13 @@
#include <linux/types.h>
#include <linux/compiler.h>
+#include <linux/orc_entry.h>
+
+/*
+ * For x86, use the appropriate name for the frame pointer in orc_entry.
+ */
+#define bp_offset fp_offset
+#define bp_reg fp_reg
/*
* The ORC_REG_* registers are base registers which are used to find other
@@ -39,34 +46,4 @@
#define ORC_REG_SP_INDIRECT 9
#define ORC_REG_MAX 15
-#ifndef __ASSEMBLY__
-#include <asm/byteorder.h>
-
-/*
- * This struct is more or less a vastly simplified version of the DWARF Call
- * Frame Information standard. It contains only the necessary parts of DWARF
- * CFI, simplified for ease of access by the in-kernel unwinder. It tells the
- * unwinder how to find the previous SP and BP (and sometimes entry regs) on
- * the stack for a given code address. Each instance of the struct corresponds
- * to one or more code locations.
- */
-struct orc_entry {
- s16 sp_offset;
- s16 bp_offset;
-#if defined(__LITTLE_ENDIAN_BITFIELD)
- unsigned sp_reg:4;
- unsigned bp_reg:4;
- unsigned type:2;
- unsigned end:1;
-#elif defined(__BIG_ENDIAN_BITFIELD)
- unsigned bp_reg:4;
- unsigned sp_reg:4;
- unsigned unused:5;
- unsigned end:1;
- unsigned type:2;
-#endif
-} __packed;
-
-#endif /* __ASSEMBLY__ */
-
#endif /* _ORC_TYPES_H */
diff --git a/tools/include/linux/orc_entry.h b/tools/include/linux/orc_entry.h
new file mode 100644
index 000000000000..3d49e3b9dabe
--- /dev/null
+++ b/tools/include/linux/orc_entry.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
+ */
+
+#ifndef _ORC_ENTRY_H
+#define _ORC_ENTRY_H
+
+#ifndef __ASSEMBLY__
+#include <asm/byteorder.h>
+
+/*
+ * This struct is more or less a vastly simplified version of the DWARF Call
+ * Frame Information standard. It contains only the necessary parts of DWARF
+ * CFI, simplified for ease of access by the in-kernel unwinder. It tells the
+ * unwinder how to find the previous SP and BP (and sometimes entry regs) on
+ * the stack for a given code address. Each instance of the struct corresponds
+ * to one or more code locations.
+ */
+struct orc_entry {
+ s16 sp_offset;
+ s16 fp_offset;
+#if defined(__LITTLE_ENDIAN_BITFIELD)
+ unsigned sp_reg:4;
+ unsigned fp_reg:4;
+ unsigned type:3;
+ unsigned end:1;
+#elif defined(__BIG_ENDIAN_BITFIELD)
+ unsigned fp_reg:4;
+ unsigned sp_reg:4;
+ unsigned unused:4;
+ unsigned end:1;
+ unsigned type:3;
+#endif
+} __packed;
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ORC_ENTRY_H */
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index dd3c64af9db2..68c317daadbf 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -98,7 +98,7 @@ static int write_orc_entry(struct elf *elf, struct section *orc_sec,
orc = (struct orc_entry *)orc_sec->data->d_buf + idx;
memcpy(orc, o, sizeof(*orc));
orc->sp_offset = bswap_if_needed(orc->sp_offset);
- orc->bp_offset = bswap_if_needed(orc->bp_offset);
+ orc->fp_offset = bswap_if_needed(orc->fp_offset);
/* populate reloc for ip */
if (elf_add_reloc_to_insn(elf, ip_sec, idx * sizeof(int), R_X86_64_PC32,
@@ -149,7 +149,7 @@ int orc_create(struct objtool_file *file)
struct orc_entry null = {
.sp_reg = ORC_REG_UNDEFINED,
- .bp_reg = ORC_REG_UNDEFINED,
+ .fp_reg = ORC_REG_UNDEFINED,
.type = UNWIND_HINT_TYPE_CALL,
};
diff --git a/tools/objtool/sync-check.sh b/tools/objtool/sync-check.sh
index ee49b4e9e72c..ef1acb064605 100755
--- a/tools/objtool/sync-check.sh
+++ b/tools/objtool/sync-check.sh
@@ -18,6 +18,7 @@ arch/x86/include/asm/unwind_hints.h
arch/x86/lib/x86-opcode-map.txt
arch/x86/tools/gen-insn-attr-x86.awk
include/linux/static_call_types.h
+include/linux/orc_entry.h
"
SYNC_CHECK_FILES='
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
Unwind hint macros and struct unwind_hint are arch-specific. Move them
into the arch-specific file asm/unwind_hints.h. The unwind hint types,
however, are generic, so retain them in linux/objtool.h.
Unwind hints can be used with static stack validation as well as with other
forms of validation, such as dynamic FP validation. Move the function
read_unwind_hints() from check.c to a new file, unwind_hints.c, so that
it can be shared across validation schemes.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
arch/x86/include/asm/unwind_hints.h | 83 ++++++++++++
arch/x86/kernel/unwind_orc.c | 2 +-
include/linux/objtool.h | 67 ---------
tools/arch/x86/include/asm/unwind_hints.h | 157 ++++++++++++++++++++++
tools/include/linux/objtool.h | 67 ---------
tools/objtool/Build | 1 +
tools/objtool/check.c | 96 -------------
tools/objtool/include/objtool/insn.h | 1 +
tools/objtool/sync-check.sh | 1 +
tools/objtool/unwind_hints.c | 106 +++++++++++++++
10 files changed, 350 insertions(+), 231 deletions(-)
create mode 100644 tools/arch/x86/include/asm/unwind_hints.h
create mode 100644 tools/objtool/unwind_hints.c
diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h
index f66fbe6537dd..07c8d911266c 100644
--- a/arch/x86/include/asm/unwind_hints.h
+++ b/arch/x86/include/asm/unwind_hints.h
@@ -1,10 +1,93 @@
#ifndef _ASM_X86_UNWIND_HINTS_H
#define _ASM_X86_UNWIND_HINTS_H
+#ifndef __ASSEMBLY__
+
+#include <linux/types.h>
+
+/*
+ * This struct is used by asm and inline asm code to manually annotate the
+ * location of registers on the stack.
+ */
+struct unwind_hint {
+ u32 ip;
+ s16 sp_offset;
+ u8 sp_reg;
+ u8 type;
+ u8 end;
+};
+#endif
+
#include <linux/objtool.h>
#include "orc_types.h"
+#ifdef CONFIG_OBJTOOL
+
+#ifndef __ASSEMBLY__
+
+#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
+ "987: \n\t" \
+ ".pushsection .discard.unwind_hints\n\t" \
+ /* struct unwind_hint */ \
+ ".long 987b - .\n\t" \
+ ".short " __stringify(sp_offset) "\n\t" \
+ ".byte " __stringify(sp_reg) "\n\t" \
+ ".byte " __stringify(type) "\n\t" \
+ ".byte " __stringify(end) "\n\t" \
+ ".balign 4 \n\t" \
+ ".popsection\n\t"
+
+#else /* __ASSEMBLY__ */
+
+/*
+ * In asm, there are two kinds of code: normal C-type callable functions and
+ * the rest. The normal callable functions can be called by other code, and
+ * don't do anything unusual with the stack. Such normal callable functions
+ * are annotated with the ENTRY/ENDPROC macros. Most asm code falls in this
+ * category. In this case, no special debugging annotations are needed because
+ * objtool can automatically generate the ORC data for the ORC unwinder to read
+ * at runtime.
+ *
+ * Anything which doesn't fall into the above category, such as syscall and
+ * interrupt handlers, tends to not be called directly by other functions, and
+ * often does unusual non-C-function-type things with the stack pointer. Such
+ * code needs to be annotated such that objtool can understand it. The
+ * following CFI hint macros are for this type of code.
+ *
+ * These macros provide hints to objtool about the state of the stack at each
+ * instruction. Objtool starts from the hints and follows the code flow,
+ * making automatic CFI adjustments when it sees pushes and pops, filling out
+ * the debuginfo as necessary. It will also warn if it sees any
+ * inconsistencies.
+ */
+.macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 end=0
+.Lunwind_hint_ip_\@:
+ .pushsection .discard.unwind_hints
+ /* struct unwind_hint */
+ .long .Lunwind_hint_ip_\@ - .
+ .short \sp_offset
+ .byte \sp_reg
+ .byte \type
+ .byte \end
+ .balign 4
+ .popsection
+.endm
+
+#endif /* __ASSEMBLY__ */
+
+#else /* !CONFIG_OBJTOOL */
+
+#ifndef __ASSEMBLY__
+#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
+ "\n\t"
+#else
+.macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 end=0
+.endm
+#endif
+
+#endif /* CONFIG_OBJTOOL */
+
#ifdef __ASSEMBLY__
.macro UNWIND_HINT_EMPTY
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index c059820dfaea..c2bfc597d909 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -1,10 +1,10 @@
// SPDX-License-Identifier: GPL-2.0-only
-#include <linux/objtool.h>
#include <linux/module.h>
#include <linux/sort.h>
#include <asm/ptrace.h>
#include <asm/stacktrace.h>
#include <asm/unwind.h>
+#include <asm/unwind_hints.h>
#include <asm/orc_types.h>
#include <asm/orc_lookup.h>
diff --git a/include/linux/objtool.h b/include/linux/objtool.h
index 62c54ffbeeaa..1af295efc12c 100644
--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -2,23 +2,6 @@
#ifndef _LINUX_OBJTOOL_H
#define _LINUX_OBJTOOL_H
-#ifndef __ASSEMBLY__
-
-#include <linux/types.h>
-
-/*
- * This struct is used by asm and inline asm code to manually annotate the
- * location of registers on the stack.
- */
-struct unwind_hint {
- u32 ip;
- s16 sp_offset;
- u8 sp_reg;
- u8 type;
- u8 end;
-};
-#endif
-
/*
* UNWIND_HINT_TYPE_CALL: Indicates that sp_reg+sp_offset resolves to PREV_SP
* (the caller's SP right before it made the call). Used for all callable
@@ -49,18 +32,6 @@ struct unwind_hint {
#ifndef __ASSEMBLY__
-#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
- "987: \n\t" \
- ".pushsection .discard.unwind_hints\n\t" \
- /* struct unwind_hint */ \
- ".long 987b - .\n\t" \
- ".short " __stringify(sp_offset) "\n\t" \
- ".byte " __stringify(sp_reg) "\n\t" \
- ".byte " __stringify(type) "\n\t" \
- ".byte " __stringify(end) "\n\t" \
- ".balign 4 \n\t" \
- ".popsection\n\t"
-
/*
* This macro marks the given function's stack frame as "non-standard", which
* tells objtool to ignore the function when doing stack metadata validation.
@@ -108,40 +79,6 @@ struct unwind_hint {
.long 999b; \
.popsection;
-/*
- * In asm, there are two kinds of code: normal C-type callable functions and
- * the rest. The normal callable functions can be called by other code, and
- * don't do anything unusual with the stack. Such normal callable functions
- * are annotated with the ENTRY/ENDPROC macros. Most asm code falls in this
- * category. In this case, no special debugging annotations are needed because
- * objtool can automatically generate the ORC data for the ORC unwinder to read
- * at runtime.
- *
- * Anything which doesn't fall into the above category, such as syscall and
- * interrupt handlers, tends to not be called directly by other functions, and
- * often does unusual non-C-function-type things with the stack pointer. Such
- * code needs to be annotated such that objtool can understand it. The
- * following CFI hint macros are for this type of code.
- *
- * These macros provide hints to objtool about the state of the stack at each
- * instruction. Objtool starts from the hints and follows the code flow,
- * making automatic CFI adjustments when it sees pushes and pops, filling out
- * the debuginfo as necessary. It will also warn if it sees any
- * inconsistencies.
- */
-.macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 end=0
-.Lunwind_hint_ip_\@:
- .pushsection .discard.unwind_hints
- /* struct unwind_hint */
- .long .Lunwind_hint_ip_\@ - .
- .short \sp_offset
- .byte \sp_reg
- .byte \type
- .byte \end
- .balign 4
- .popsection
-.endm
-
.macro STACK_FRAME_NON_STANDARD func:req
.pushsection .discard.func_stack_frame_non_standard, "aw"
_ASM_PTR \func
@@ -174,16 +111,12 @@ struct unwind_hint {
#ifndef __ASSEMBLY__
-#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
- "\n\t"
#define STACK_FRAME_NON_STANDARD(func)
#define STACK_FRAME_NON_STANDARD_FP(func)
#define ANNOTATE_NOENDBR
#define ASM_REACHABLE
#else
#define ANNOTATE_INTRA_FUNCTION_CALL
-.macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 end=0
-.endm
.macro STACK_FRAME_NON_STANDARD func:req
.endm
.macro ANNOTATE_NOENDBR
diff --git a/tools/arch/x86/include/asm/unwind_hints.h b/tools/arch/x86/include/asm/unwind_hints.h
new file mode 100644
index 000000000000..07c8d911266c
--- /dev/null
+++ b/tools/arch/x86/include/asm/unwind_hints.h
@@ -0,0 +1,157 @@
+#ifndef _ASM_X86_UNWIND_HINTS_H
+#define _ASM_X86_UNWIND_HINTS_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/types.h>
+
+/*
+ * This struct is used by asm and inline asm code to manually annotate the
+ * location of registers on the stack.
+ */
+struct unwind_hint {
+ u32 ip;
+ s16 sp_offset;
+ u8 sp_reg;
+ u8 type;
+ u8 end;
+};
+#endif
+
+#include <linux/objtool.h>
+
+#include "orc_types.h"
+
+#ifdef CONFIG_OBJTOOL
+
+#ifndef __ASSEMBLY__
+
+#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
+ "987: \n\t" \
+ ".pushsection .discard.unwind_hints\n\t" \
+ /* struct unwind_hint */ \
+ ".long 987b - .\n\t" \
+ ".short " __stringify(sp_offset) "\n\t" \
+ ".byte " __stringify(sp_reg) "\n\t" \
+ ".byte " __stringify(type) "\n\t" \
+ ".byte " __stringify(end) "\n\t" \
+ ".balign 4 \n\t" \
+ ".popsection\n\t"
+
+#else /* __ASSEMBLY__ */
+
+/*
+ * In asm, there are two kinds of code: normal C-type callable functions and
+ * the rest. The normal callable functions can be called by other code, and
+ * don't do anything unusual with the stack. Such normal callable functions
+ * are annotated with the ENTRY/ENDPROC macros. Most asm code falls in this
+ * category. In this case, no special debugging annotations are needed because
+ * objtool can automatically generate the ORC data for the ORC unwinder to read
+ * at runtime.
+ *
+ * Anything which doesn't fall into the above category, such as syscall and
+ * interrupt handlers, tends to not be called directly by other functions, and
+ * often does unusual non-C-function-type things with the stack pointer. Such
+ * code needs to be annotated such that objtool can understand it. The
+ * following CFI hint macros are for this type of code.
+ *
+ * These macros provide hints to objtool about the state of the stack at each
+ * instruction. Objtool starts from the hints and follows the code flow,
+ * making automatic CFI adjustments when it sees pushes and pops, filling out
+ * the debuginfo as necessary. It will also warn if it sees any
+ * inconsistencies.
+ */
+.macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 end=0
+.Lunwind_hint_ip_\@:
+ .pushsection .discard.unwind_hints
+ /* struct unwind_hint */
+ .long .Lunwind_hint_ip_\@ - .
+ .short \sp_offset
+ .byte \sp_reg
+ .byte \type
+ .byte \end
+ .balign 4
+ .popsection
+.endm
+
+#endif /* __ASSEMBLY__ */
+
+#else /* !CONFIG_OBJTOOL */
+
+#ifndef __ASSEMBLY__
+#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
+ "\n\t"
+#else
+.macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 end=0
+.endm
+#endif
+
+#endif /* CONFIG_OBJTOOL */
+
+#ifdef __ASSEMBLY__
+
+.macro UNWIND_HINT_EMPTY
+ UNWIND_HINT type=UNWIND_HINT_TYPE_CALL end=1
+.endm
+
+.macro UNWIND_HINT_ENTRY
+ UNWIND_HINT type=UNWIND_HINT_TYPE_ENTRY end=1
+.endm
+
+.macro UNWIND_HINT_REGS base=%rsp offset=0 indirect=0 extra=1 partial=0
+ .if \base == %rsp
+ .if \indirect
+ .set sp_reg, ORC_REG_SP_INDIRECT
+ .else
+ .set sp_reg, ORC_REG_SP
+ .endif
+ .elseif \base == %rbp
+ .set sp_reg, ORC_REG_BP
+ .elseif \base == %rdi
+ .set sp_reg, ORC_REG_DI
+ .elseif \base == %rdx
+ .set sp_reg, ORC_REG_DX
+ .elseif \base == %r10
+ .set sp_reg, ORC_REG_R10
+ .else
+ .error "UNWIND_HINT_REGS: bad base register"
+ .endif
+
+ .set sp_offset, \offset
+
+ .if \partial
+ .set type, UNWIND_HINT_TYPE_REGS_PARTIAL
+ .elseif \extra == 0
+ .set type, UNWIND_HINT_TYPE_REGS_PARTIAL
+ .set sp_offset, \offset + (16*8)
+ .else
+ .set type, UNWIND_HINT_TYPE_REGS
+ .endif
+
+ UNWIND_HINT sp_reg=sp_reg sp_offset=sp_offset type=type
+.endm
+
+.macro UNWIND_HINT_IRET_REGS base=%rsp offset=0
+ UNWIND_HINT_REGS base=\base offset=\offset partial=1
+.endm
+
+.macro UNWIND_HINT_FUNC
+ UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=8 type=UNWIND_HINT_TYPE_FUNC
+.endm
+
+.macro UNWIND_HINT_SAVE
+ UNWIND_HINT type=UNWIND_HINT_TYPE_SAVE
+.endm
+
+.macro UNWIND_HINT_RESTORE
+ UNWIND_HINT type=UNWIND_HINT_TYPE_RESTORE
+.endm
+
+#else
+
+#define UNWIND_HINT_FUNC \
+ UNWIND_HINT(ORC_REG_SP, 8, UNWIND_HINT_TYPE_FUNC, 0)
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_X86_UNWIND_HINTS_H */
diff --git a/tools/include/linux/objtool.h b/tools/include/linux/objtool.h
index 62c54ffbeeaa..1af295efc12c 100644
--- a/tools/include/linux/objtool.h
+++ b/tools/include/linux/objtool.h
@@ -2,23 +2,6 @@
#ifndef _LINUX_OBJTOOL_H
#define _LINUX_OBJTOOL_H
-#ifndef __ASSEMBLY__
-
-#include <linux/types.h>
-
-/*
- * This struct is used by asm and inline asm code to manually annotate the
- * location of registers on the stack.
- */
-struct unwind_hint {
- u32 ip;
- s16 sp_offset;
- u8 sp_reg;
- u8 type;
- u8 end;
-};
-#endif
-
/*
* UNWIND_HINT_TYPE_CALL: Indicates that sp_reg+sp_offset resolves to PREV_SP
* (the caller's SP right before it made the call). Used for all callable
@@ -49,18 +32,6 @@ struct unwind_hint {
#ifndef __ASSEMBLY__
-#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
- "987: \n\t" \
- ".pushsection .discard.unwind_hints\n\t" \
- /* struct unwind_hint */ \
- ".long 987b - .\n\t" \
- ".short " __stringify(sp_offset) "\n\t" \
- ".byte " __stringify(sp_reg) "\n\t" \
- ".byte " __stringify(type) "\n\t" \
- ".byte " __stringify(end) "\n\t" \
- ".balign 4 \n\t" \
- ".popsection\n\t"
-
/*
* This macro marks the given function's stack frame as "non-standard", which
* tells objtool to ignore the function when doing stack metadata validation.
@@ -108,40 +79,6 @@ struct unwind_hint {
.long 999b; \
.popsection;
-/*
- * In asm, there are two kinds of code: normal C-type callable functions and
- * the rest. The normal callable functions can be called by other code, and
- * don't do anything unusual with the stack. Such normal callable functions
- * are annotated with the ENTRY/ENDPROC macros. Most asm code falls in this
- * category. In this case, no special debugging annotations are needed because
- * objtool can automatically generate the ORC data for the ORC unwinder to read
- * at runtime.
- *
- * Anything which doesn't fall into the above category, such as syscall and
- * interrupt handlers, tends to not be called directly by other functions, and
- * often does unusual non-C-function-type things with the stack pointer. Such
- * code needs to be annotated such that objtool can understand it. The
- * following CFI hint macros are for this type of code.
- *
- * These macros provide hints to objtool about the state of the stack at each
- * instruction. Objtool starts from the hints and follows the code flow,
- * making automatic CFI adjustments when it sees pushes and pops, filling out
- * the debuginfo as necessary. It will also warn if it sees any
- * inconsistencies.
- */
-.macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 end=0
-.Lunwind_hint_ip_\@:
- .pushsection .discard.unwind_hints
- /* struct unwind_hint */
- .long .Lunwind_hint_ip_\@ - .
- .short \sp_offset
- .byte \sp_reg
- .byte \type
- .byte \end
- .balign 4
- .popsection
-.endm
-
.macro STACK_FRAME_NON_STANDARD func:req
.pushsection .discard.func_stack_frame_non_standard, "aw"
_ASM_PTR \func
@@ -174,16 +111,12 @@ struct unwind_hint {
#ifndef __ASSEMBLY__
-#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
- "\n\t"
#define STACK_FRAME_NON_STANDARD(func)
#define STACK_FRAME_NON_STANDARD_FP(func)
#define ANNOTATE_NOENDBR
#define ASM_REACHABLE
#else
#define ANNOTATE_INTRA_FUNCTION_CALL
-.macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 end=0
-.endm
.macro STACK_FRAME_NON_STANDARD func:req
.endm
.macro ANNOTATE_NOENDBR
diff --git a/tools/objtool/Build b/tools/objtool/Build
index 8afe56cd0c2d..c4666d0b40ba 100644
--- a/tools/objtool/Build
+++ b/tools/objtool/Build
@@ -8,6 +8,7 @@ objtool-y += builtin-check.o
objtool-y += cfi.o
objtool-y += insn.o
objtool-y += decode.o
+objtool-y += unwind_hints.o
objtool-y += elf.o
objtool-y += objtool.o
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index be3f6564104a..d14a2b7b8b37 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1670,102 +1670,6 @@ static int add_jump_table_alts(struct objtool_file *file)
return 0;
}
-static int read_unwind_hints(struct objtool_file *file)
-{
- struct cfi_state cfi = init_cfi;
- struct section *sec, *relocsec;
- struct unwind_hint *hint;
- struct instruction *insn;
- struct reloc *reloc;
- int i;
-
- sec = find_section_by_name(file->elf, ".discard.unwind_hints");
- if (!sec)
- return 0;
-
- relocsec = sec->reloc;
- if (!relocsec) {
- WARN("missing .rela.discard.unwind_hints section");
- return -1;
- }
-
- if (sec->sh.sh_size % sizeof(struct unwind_hint)) {
- WARN("struct unwind_hint size mismatch");
- return -1;
- }
-
- file->hints = true;
-
- for (i = 0; i < sec->sh.sh_size / sizeof(struct unwind_hint); i++) {
- hint = (struct unwind_hint *)sec->data->d_buf + i;
-
- reloc = find_reloc_by_dest(file->elf, sec, i * sizeof(*hint));
- if (!reloc) {
- WARN("can't find reloc for unwind_hints[%d]", i);
- return -1;
- }
-
- insn = find_insn(file, reloc->sym->sec, reloc->addend);
- if (!insn) {
- WARN("can't find insn for unwind_hints[%d]", i);
- return -1;
- }
-
- insn->hint = true;
-
- if (hint->type == UNWIND_HINT_TYPE_SAVE) {
- insn->hint = false;
- insn->save = true;
- continue;
- }
-
- if (hint->type == UNWIND_HINT_TYPE_RESTORE) {
- insn->restore = true;
- continue;
- }
-
- if (hint->type == UNWIND_HINT_TYPE_REGS_PARTIAL) {
- struct symbol *sym = find_symbol_by_offset(insn->sec, insn->offset);
-
- if (sym && sym->bind == STB_GLOBAL) {
- if (opts.ibt && insn->type != INSN_ENDBR && !insn->noendbr) {
- WARN_FUNC("UNWIND_HINT_IRET_REGS without ENDBR",
- insn->sec, insn->offset);
- }
-
- insn->entry = 1;
- }
- }
-
- if (hint->type == UNWIND_HINT_TYPE_ENTRY) {
- hint->type = UNWIND_HINT_TYPE_CALL;
- insn->entry = 1;
- }
-
- if (hint->type == UNWIND_HINT_TYPE_FUNC) {
- insn->cfi = &func_cfi;
- continue;
- }
-
- if (insn->cfi)
- cfi = *(insn->cfi);
-
- if (arch_decode_hint_reg(hint->sp_reg, &cfi.cfa.base)) {
- WARN_FUNC("unsupported unwind_hint sp base reg %d",
- insn->sec, insn->offset, hint->sp_reg);
- return -1;
- }
-
- cfi.cfa.offset = bswap_if_needed(hint->sp_offset);
- cfi.type = hint->type;
- cfi.end = hint->end;
-
- insn->cfi = cfi_hash_find_or_add(&cfi);
- }
-
- return 0;
-}
-
static int read_noendbr_hints(struct objtool_file *file)
{
struct section *sec;
diff --git a/tools/objtool/include/objtool/insn.h b/tools/objtool/include/objtool/insn.h
index b74c7f0d9076..cfd1ae7e2e8e 100644
--- a/tools/objtool/include/objtool/insn.h
+++ b/tools/objtool/include/objtool/insn.h
@@ -89,6 +89,7 @@ bool same_function(struct instruction *insn1, struct instruction *insn2);
bool is_first_func_insn(struct objtool_file *file, struct instruction *insn);
int decode_instructions(struct objtool_file *file);
+int read_unwind_hints(struct objtool_file *file);
#define for_each_insn(file, insn) \
list_for_each_entry(insn, &file->insn_list, list)
diff --git a/tools/objtool/sync-check.sh b/tools/objtool/sync-check.sh
index 105a291ff8e7..ee49b4e9e72c 100755
--- a/tools/objtool/sync-check.sh
+++ b/tools/objtool/sync-check.sh
@@ -14,6 +14,7 @@ arch/x86/include/asm/nops.h
arch/x86/include/asm/inat_types.h
arch/x86/include/asm/orc_types.h
arch/x86/include/asm/emulate_prefix.h
+arch/x86/include/asm/unwind_hints.h
arch/x86/lib/x86-opcode-map.txt
arch/x86/tools/gen-insn-attr-x86.awk
include/linux/static_call_types.h
diff --git a/tools/objtool/unwind_hints.c b/tools/objtool/unwind_hints.c
new file mode 100644
index 000000000000..f2521659bae5
--- /dev/null
+++ b/tools/objtool/unwind_hints.c
@@ -0,0 +1,106 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2015-2017 Josh Poimboeuf <[email protected]>
+ */
+#include <asm/unwind_hints.h>
+
+#include <objtool/builtin.h>
+#include <objtool/endianness.h>
+#include <objtool/insn.h>
+#include <objtool/warn.h>
+
+int read_unwind_hints(struct objtool_file *file)
+{
+ struct cfi_state cfi = init_cfi;
+ struct section *sec, *relocsec;
+ struct unwind_hint *hint;
+ struct instruction *insn;
+ struct reloc *reloc;
+ int i;
+
+ sec = find_section_by_name(file->elf, ".discard.unwind_hints");
+ if (!sec)
+ return 0;
+
+ relocsec = sec->reloc;
+ if (!relocsec) {
+ WARN("missing .rela.discard.unwind_hints section");
+ return -1;
+ }
+
+ if (sec->sh.sh_size % sizeof(struct unwind_hint)) {
+ WARN("struct unwind_hint size mismatch");
+ return -1;
+ }
+
+ file->hints = true;
+
+ for (i = 0; i < sec->sh.sh_size / sizeof(struct unwind_hint); i++) {
+ hint = (struct unwind_hint *)sec->data->d_buf + i;
+
+ reloc = find_reloc_by_dest(file->elf, sec, i * sizeof(*hint));
+ if (!reloc) {
+ WARN("can't find reloc for unwind_hints[%d]", i);
+ return -1;
+ }
+
+ insn = find_insn(file, reloc->sym->sec, reloc->addend);
+ if (!insn) {
+ WARN("can't find insn for unwind_hints[%d]", i);
+ return -1;
+ }
+
+ insn->hint = true;
+
+ if (hint->type == UNWIND_HINT_TYPE_SAVE) {
+ insn->hint = false;
+ insn->save = true;
+ continue;
+ }
+
+ if (hint->type == UNWIND_HINT_TYPE_RESTORE) {
+ insn->restore = true;
+ continue;
+ }
+
+ if (hint->type == UNWIND_HINT_TYPE_REGS_PARTIAL) {
+ struct symbol *sym = find_symbol_by_offset(insn->sec, insn->offset);
+
+ if (sym && sym->bind == STB_GLOBAL) {
+ if (opts.ibt && insn->type != INSN_ENDBR && !insn->noendbr) {
+ WARN_FUNC("UNWIND_HINT_IRET_REGS without ENDBR",
+ insn->sec, insn->offset);
+ }
+
+ insn->entry = 1;
+ }
+ }
+
+ if (hint->type == UNWIND_HINT_TYPE_ENTRY) {
+ hint->type = UNWIND_HINT_TYPE_CALL;
+ insn->entry = 1;
+ }
+
+ if (hint->type == UNWIND_HINT_TYPE_FUNC) {
+ insn->cfi = &func_cfi;
+ continue;
+ }
+
+ if (insn->cfi)
+ cfi = *(insn->cfi);
+
+ if (arch_decode_hint_reg(hint->sp_reg, &cfi.cfa.base)) {
+ WARN_FUNC("unsupported unwind_hint sp base reg %d",
+ insn->sec, insn->offset, hint->sp_reg);
+ return -1;
+ }
+
+ cfi.cfa.offset = bswap_if_needed(hint->sp_offset);
+ cfi.type = hint->type;
+ cfi.end = hint->end;
+
+ insn->cfi = cfi_hash_find_or_add(&cfi);
+ }
+
+ return 0;
+}
--
2.25.1
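For reference, the record that the UNWIND_HINT macros emit and that
read_unwind_hints() walks can be sketched in standalone C. This is only an
illustration, not part of the patch: the <stdint.h> types stand in for the
kernel's u32/s16/u8, the 12-byte size assumes a common ABI where a 32-bit
integer is 4-byte aligned, and hint_count() is a hypothetical helper that
mirrors the "size mismatch" check in read_unwind_hints().

```c
#include <stddef.h>
#include <stdint.h>

/* Same field order as struct unwind_hint in asm/unwind_hints.h. */
struct unwind_hint {
	uint32_t ip;		/* .long  <label> - .	*/
	int16_t  sp_offset;	/* .short \sp_offset	*/
	uint8_t  sp_reg;	/* .byte  \sp_reg	*/
	uint8_t  type;		/* .byte  \type		*/
	uint8_t  end;		/* .byte  \end		*/
	/* 9 bytes of data; ".balign 4" pads each record to 12 bytes */
};

/* Assumption: the C layout matches what the assembler macro emits. */
_Static_assert(sizeof(struct unwind_hint) == 12,
	       "unwind_hint layout differs from the asm record");

/*
 * Hypothetical helper: number of hints in a section of sec_size bytes,
 * or -1 if the size is not a whole number of records.
 */
static long hint_count(size_t sec_size)
{
	if (sec_size % sizeof(struct unwind_hint))
		return -1;
	return (long)(sec_size / sizeof(struct unwind_hint));
}
```

Because the reader indexes the section as an array of struct unwind_hint, the
C struct and the assembler record have to agree on size and padding, which is
why the struct and the macros are kept together in one header.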
From: "Madhavan T. Venkataraman" <[email protected]>
The ORC code needs to be reorganized into arch-specific and generic parts
so that architectures other than x86 can use the generic parts.
Some arch-specific ORC code is present in orc_gen.c and orc_dump.c. Create
the following two files for such code:
- tools/objtool/include/objtool/orc.h
- tools/objtool/arch/x86/orc.c
Move the following arch-specific function from tools/objtool/orc_gen.c
to tools/objtool/arch/x86/orc.c:
- init_orc_entry()
Move the following arch-specific functions from tools/objtool/orc_dump.c
to tools/objtool/arch/x86/orc.c:
- reg_name()
- orc_type_name()
- print_reg() (renamed to orc_print_reg())
Create arch-specific functions, orc_print_sp() and orc_print_fp(), to print
the names of the SP and FP registers.
The relocation type for relocation entries for ORC structures is
arch-specific. Define it in tools/objtool/arch/x86/include/arch/elf.h:
#define R_PCREL R_X86_64_PC32
and use that in orc_gen.c so each architecture can provide its own
relocation type.
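The R_PCREL indirection can be sketched as follows. This is a minimal
standalone illustration under stated assumptions, not objtool code:
struct sketch_reloc and make_ip_reloc() are hypothetical names, and
R_X86_64_PC32 is defined locally (value 2, per the x86-64 psABI) only so the
sketch compiles without <elf.h>.

```c
#include <stddef.h>

/*
 * Stand-in for <arch/elf.h>: each architecture maps the generic name to
 * its own PC-relative relocation type.
 */
#ifndef R_X86_64_PC32
#define R_X86_64_PC32 2
#endif
#define R_PCREL R_X86_64_PC32

/* Hypothetical, simplified relocation record (not objtool's struct reloc). */
struct sketch_reloc {
	unsigned long offset;	/* byte offset of the slot to patch */
	unsigned int type;	/* arch-specific relocation type */
	long addend;		/* offset of the target instruction */
};

/*
 * Generic code: describe the relocation for slot idx of the ORC ip table
 * without naming any architecture-specific relocation type.
 */
static struct sketch_reloc make_ip_reloc(unsigned int idx, long insn_off)
{
	struct sketch_reloc r = {
		.offset = idx * sizeof(int),
		.type = R_PCREL,
		.addend = insn_off,
	};

	return r;
}
```

Another architecture would then only need to supply its own R_PCREL
definition in its arch/elf.h; the generic caller stays unchanged.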
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
tools/objtool/arch/x86/Build | 1 +
tools/objtool/arch/x86/include/arch/elf.h | 1 +
tools/objtool/arch/x86/orc.c | 150 ++++++++++++++++++++++
tools/objtool/include/objtool/orc.h | 18 +++
tools/objtool/orc_dump.c | 63 +--------
tools/objtool/orc_gen.c | 79 +-----------
6 files changed, 179 insertions(+), 133 deletions(-)
create mode 100644 tools/objtool/arch/x86/orc.c
create mode 100644 tools/objtool/include/objtool/orc.h
diff --git a/tools/objtool/arch/x86/Build b/tools/objtool/arch/x86/Build
index 9f7869b5c5e0..77b9a66cd6da 100644
--- a/tools/objtool/arch/x86/Build
+++ b/tools/objtool/arch/x86/Build
@@ -1,5 +1,6 @@
objtool-y += special.o
objtool-y += decode.o
+objtool-$(BUILD_ORC) += orc.o
inat_tables_script = ../arch/x86/tools/gen-insn-attr-x86.awk
inat_tables_maps = ../arch/x86/lib/x86-opcode-map.txt
diff --git a/tools/objtool/arch/x86/include/arch/elf.h b/tools/objtool/arch/x86/include/arch/elf.h
index 69cc4264b28a..3a7eb515dbb9 100644
--- a/tools/objtool/arch/x86/include/arch/elf.h
+++ b/tools/objtool/arch/x86/include/arch/elf.h
@@ -2,5 +2,6 @@
#define _OBJTOOL_ARCH_ELF
#define R_NONE R_X86_64_NONE
+#define R_PCREL R_X86_64_PC32
#endif /* _OBJTOOL_ARCH_ELF */
diff --git a/tools/objtool/arch/x86/orc.c b/tools/objtool/arch/x86/orc.c
new file mode 100644
index 000000000000..a075737d4503
--- /dev/null
+++ b/tools/objtool/arch/x86/orc.c
@@ -0,0 +1,150 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
+ */
+
+#include <stdlib.h>
+#include <string.h>
+
+#include <linux/objtool.h>
+
+#include <objtool/check.h>
+#include <objtool/orc.h>
+#include <objtool/warn.h>
+#include <objtool/endianness.h>
+
+int init_orc_entry(struct orc_entry *orc, struct cfi_state *cfi,
+ struct instruction *insn)
+{
+ struct cfi_reg *bp = &cfi->regs[CFI_BP];
+
+ memset(orc, 0, sizeof(*orc));
+
+ if (!cfi) {
+ orc->end = 0;
+ orc->sp_reg = ORC_REG_UNDEFINED;
+ return 0;
+ }
+
+ orc->end = cfi->end;
+
+ if (cfi->cfa.base == CFI_UNDEFINED) {
+ orc->sp_reg = ORC_REG_UNDEFINED;
+ return 0;
+ }
+
+ switch (cfi->cfa.base) {
+ case CFI_SP:
+ orc->sp_reg = ORC_REG_SP;
+ break;
+ case CFI_SP_INDIRECT:
+ orc->sp_reg = ORC_REG_SP_INDIRECT;
+ break;
+ case CFI_BP:
+ orc->sp_reg = ORC_REG_BP;
+ break;
+ case CFI_BP_INDIRECT:
+ orc->sp_reg = ORC_REG_BP_INDIRECT;
+ break;
+ case CFI_R10:
+ orc->sp_reg = ORC_REG_R10;
+ break;
+ case CFI_R13:
+ orc->sp_reg = ORC_REG_R13;
+ break;
+ case CFI_DI:
+ orc->sp_reg = ORC_REG_DI;
+ break;
+ case CFI_DX:
+ orc->sp_reg = ORC_REG_DX;
+ break;
+ default:
+ WARN_FUNC("unknown CFA base reg %d",
+ insn->sec, insn->offset, cfi->cfa.base);
+ return -1;
+ }
+
+ switch (bp->base) {
+ case CFI_UNDEFINED:
+ orc->bp_reg = ORC_REG_UNDEFINED;
+ break;
+ case CFI_CFA:
+ orc->bp_reg = ORC_REG_PREV_SP;
+ break;
+ case CFI_BP:
+ orc->bp_reg = ORC_REG_BP;
+ break;
+ default:
+ WARN_FUNC("unknown BP base reg %d",
+ insn->sec, insn->offset, bp->base);
+ return -1;
+ }
+
+ orc->sp_offset = cfi->cfa.offset;
+ orc->bp_offset = bp->offset;
+ orc->type = cfi->type;
+
+ return 0;
+}
+
+static const char *reg_name(unsigned int reg)
+{
+ switch (reg) {
+ case ORC_REG_PREV_SP:
+ return "prevsp";
+ case ORC_REG_DX:
+ return "dx";
+ case ORC_REG_DI:
+ return "di";
+ case ORC_REG_BP:
+ return "bp";
+ case ORC_REG_SP:
+ return "sp";
+ case ORC_REG_R10:
+ return "r10";
+ case ORC_REG_R13:
+ return "r13";
+ case ORC_REG_BP_INDIRECT:
+ return "bp(ind)";
+ case ORC_REG_SP_INDIRECT:
+ return "sp(ind)";
+ default:
+ return "?";
+ }
+}
+
+const char *orc_type_name(unsigned int type)
+{
+ switch (type) {
+ case UNWIND_HINT_TYPE_CALL:
+ return "call";
+ case UNWIND_HINT_TYPE_REGS:
+ return "regs";
+ case UNWIND_HINT_TYPE_REGS_PARTIAL:
+ return "regs (partial)";
+ default:
+ return "?";
+ }
+}
+
+void orc_print_reg(unsigned int reg, int offset)
+{
+ if (reg == ORC_REG_BP_INDIRECT)
+ printf("(bp%+d)", offset);
+ else if (reg == ORC_REG_SP_INDIRECT)
+ printf("(sp)%+d", offset);
+ else if (reg == ORC_REG_UNDEFINED)
+ printf("(und)");
+ else
+ printf("%s%+d", reg_name(reg), offset);
+}
+
+void orc_print_sp(void)
+{
+ printf(" sp:");
+}
+
+void orc_print_fp(void)
+{
+ printf(" bp:");
+}
diff --git a/tools/objtool/include/objtool/orc.h b/tools/objtool/include/objtool/orc.h
new file mode 100644
index 000000000000..bf141134c56f
--- /dev/null
+++ b/tools/objtool/include/objtool/orc.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2015-2017 Josh Poimboeuf <[email protected]>
+ */
+
+#ifndef _OBJTOOL_ORC_H
+#define _OBJTOOL_ORC_H
+
+#include <asm/orc_types.h>
+
+int init_orc_entry(struct orc_entry *orc, struct cfi_state *cfi,
+ struct instruction *insn);
+const char *orc_type_name(unsigned int type);
+void orc_print_reg(unsigned int reg, int offset);
+void orc_print_sp(void);
+void orc_print_fp(void);
+
+#endif /* _OBJTOOL_ORC_H */
diff --git a/tools/objtool/orc_dump.c b/tools/objtool/orc_dump.c
index f5a8508c42d6..61b39960ab6a 100644
--- a/tools/objtool/orc_dump.c
+++ b/tools/objtool/orc_dump.c
@@ -5,63 +5,12 @@
#include <unistd.h>
#include <linux/objtool.h>
-#include <asm/orc_types.h>
#include <objtool/objtool.h>
+#include <objtool/check.h>
+#include <objtool/orc.h>
#include <objtool/warn.h>
#include <objtool/endianness.h>
-static const char *reg_name(unsigned int reg)
-{
- switch (reg) {
- case ORC_REG_PREV_SP:
- return "prevsp";
- case ORC_REG_DX:
- return "dx";
- case ORC_REG_DI:
- return "di";
- case ORC_REG_BP:
- return "bp";
- case ORC_REG_SP:
- return "sp";
- case ORC_REG_R10:
- return "r10";
- case ORC_REG_R13:
- return "r13";
- case ORC_REG_BP_INDIRECT:
- return "bp(ind)";
- case ORC_REG_SP_INDIRECT:
- return "sp(ind)";
- default:
- return "?";
- }
-}
-
-static const char *orc_type_name(unsigned int type)
-{
- switch (type) {
- case UNWIND_HINT_TYPE_CALL:
- return "call";
- case UNWIND_HINT_TYPE_REGS:
- return "regs";
- case UNWIND_HINT_TYPE_REGS_PARTIAL:
- return "regs (partial)";
- default:
- return "?";
- }
-}
-
-static void print_reg(unsigned int reg, int offset)
-{
- if (reg == ORC_REG_BP_INDIRECT)
- printf("(bp%+d)", offset);
- else if (reg == ORC_REG_SP_INDIRECT)
- printf("(sp)%+d", offset);
- else if (reg == ORC_REG_UNDEFINED)
- printf("(und)");
- else
- printf("%s%+d", reg_name(reg), offset);
-}
-
int orc_dump(const char *_objname)
{
int fd, nr_entries, i, *orc_ip = NULL, orc_size = 0;
@@ -196,13 +145,13 @@ int orc_dump(const char *_objname)
}
- printf(" sp:");
+ orc_print_sp();
- print_reg(orc[i].sp_reg, bswap_if_needed(orc[i].sp_offset));
+ orc_print_reg(orc[i].sp_reg, bswap_if_needed(orc[i].sp_offset));
- printf(" bp:");
+ orc_print_fp();
- print_reg(orc[i].bp_reg, bswap_if_needed(orc[i].bp_offset));
+ orc_print_reg(orc[i].fp_reg, bswap_if_needed(orc[i].fp_offset));
printf(" type:%s end:%d\n",
orc_type_name(orc[i].type), orc[i].end);
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index 68c317daadbf..ea2e361ff7bc 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -7,86 +7,13 @@
#include <string.h>
#include <linux/objtool.h>
-#include <asm/orc_types.h>
+#include <arch/elf.h>
#include <objtool/check.h>
+#include <objtool/orc.h>
#include <objtool/warn.h>
#include <objtool/endianness.h>
-static int init_orc_entry(struct orc_entry *orc, struct cfi_state *cfi,
- struct instruction *insn)
-{
- struct cfi_reg *bp = &cfi->regs[CFI_BP];
-
- memset(orc, 0, sizeof(*orc));
-
- if (!cfi) {
- orc->end = 0;
- orc->sp_reg = ORC_REG_UNDEFINED;
- return 0;
- }
-
- orc->end = cfi->end;
-
- if (cfi->cfa.base == CFI_UNDEFINED) {
- orc->sp_reg = ORC_REG_UNDEFINED;
- return 0;
- }
-
- switch (cfi->cfa.base) {
- case CFI_SP:
- orc->sp_reg = ORC_REG_SP;
- break;
- case CFI_SP_INDIRECT:
- orc->sp_reg = ORC_REG_SP_INDIRECT;
- break;
- case CFI_BP:
- orc->sp_reg = ORC_REG_BP;
- break;
- case CFI_BP_INDIRECT:
- orc->sp_reg = ORC_REG_BP_INDIRECT;
- break;
- case CFI_R10:
- orc->sp_reg = ORC_REG_R10;
- break;
- case CFI_R13:
- orc->sp_reg = ORC_REG_R13;
- break;
- case CFI_DI:
- orc->sp_reg = ORC_REG_DI;
- break;
- case CFI_DX:
- orc->sp_reg = ORC_REG_DX;
- break;
- default:
- WARN_FUNC("unknown CFA base reg %d",
- insn->sec, insn->offset, cfi->cfa.base);
- return -1;
- }
-
- switch (bp->base) {
- case CFI_UNDEFINED:
- orc->bp_reg = ORC_REG_UNDEFINED;
- break;
- case CFI_CFA:
- orc->bp_reg = ORC_REG_PREV_SP;
- break;
- case CFI_BP:
- orc->bp_reg = ORC_REG_BP;
- break;
- default:
- WARN_FUNC("unknown BP base reg %d",
- insn->sec, insn->offset, bp->base);
- return -1;
- }
-
- orc->sp_offset = cfi->cfa.offset;
- orc->bp_offset = bp->offset;
- orc->type = cfi->type;
-
- return 0;
-}
-
static int write_orc_entry(struct elf *elf, struct section *orc_sec,
struct section *ip_sec, unsigned int idx,
struct section *insn_sec, unsigned long insn_off,
@@ -101,7 +28,7 @@ static int write_orc_entry(struct elf *elf, struct section *orc_sec,
orc->fp_offset = bswap_if_needed(orc->fp_offset);
/* populate reloc for ip */
- if (elf_add_reloc_to_insn(elf, ip_sec, idx * sizeof(int), R_X86_64_PC32,
+ if (elf_add_reloc_to_insn(elf, ip_sec, idx * sizeof(int), R_PCREL,
insn_sec, insn_off))
return -1;
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
Objtool currently implements static stack validation. Another method,
dynamic validation, can be supported for other architectures.
Define STATIC_CHECK to select the files required for static validation
in the objtool build. It is set to y for x86.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
tools/objtool/Build | 6 +++---
tools/objtool/Makefile | 3 ++-
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/tools/objtool/Build b/tools/objtool/Build
index c4666d0b40ba..974290dc4aac 100644
--- a/tools/objtool/Build
+++ b/tools/objtool/Build
@@ -2,13 +2,13 @@ objtool-y += arch/$(SRCARCH)/
objtool-y += weak.o
-objtool-y += check.o
-objtool-y += special.o
+objtool-$(STATIC_CHECK) += check.o
+objtool-$(STATIC_CHECK) += special.o
objtool-y += builtin-check.o
objtool-y += cfi.o
objtool-y += insn.o
objtool-y += decode.o
-objtool-y += unwind_hints.o
+objtool-$(STATIC_CHECK) += unwind_hints.o
objtool-y += elf.o
objtool-y += objtool.o
diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile
index a3a9cc24e0e3..797d1ea02db0 100644
--- a/tools/objtool/Makefile
+++ b/tools/objtool/Makefile
@@ -43,9 +43,10 @@ BUILD_ORC := n
ifeq ($(SRCARCH),x86)
BUILD_ORC := y
+ STATIC_CHECK := y
endif
-export BUILD_ORC
+export BUILD_ORC STATIC_CHECK
export srctree OUTPUT CFLAGS SRCARCH AWK
include $(srctree)/tools/build/Makefile.include
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
All of the ORC code in the kernel is currently under arch/x86. The
following parts of that code can be shared by other architectures that
wish to use ORC:
(1) ORC lookup initialization for vmlinux
(2) ORC lookup initialization for modules
(3) ORC lookup functions
Move arch/x86/include/asm/orc_lookup.h to include/asm-generic/orc_lookup.h.
Move the ORC lookup code into kernel/orc_lookup.c.
Rename the following init functions to reflect what they actually do:
unwind_module_init ==> orc_lookup_module_init
unwind_init ==> orc_lookup_init
orc_find() is the function that locates the ORC entry for a given PC.
Currently, it contains an architecture-specific part to locate ftrace
entries. Introduce a new arch-specific function called arch_orc_find()
and move the ftrace-related lookup there. If orc_find() is unable to
locate the ORC entry for a given PC in vmlinux or in the modules, it can
call arch_orc_find() to find architecture-specific entries.
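The lookup order described above can be sketched in standalone C. This is a
simplified illustration, not the kernel code: every sketch_* name is
hypothetical, the tables hold absolute addresses instead of the
self-relative offsets used by the real .orc_unwind_ip section, and the arch
hook just recognizes one made-up address range where x86 would look up
ftrace trampolines.

```c
#include <stddef.h>

struct sketch_orc_entry {
	int sp_offset;
};

struct sketch_table {
	const unsigned long *ips;		/* sorted start addresses */
	const struct sketch_orc_entry *orcs;	/* one entry per address */
	size_t n;
	unsigned long end;			/* first address past the table */
};

/* Binary search for the rightmost entry whose start address is <= ip. */
static const struct sketch_orc_entry *
sketch_table_find(const struct sketch_table *t, unsigned long ip)
{
	size_t lo = 0, hi = t->n, found = 0;

	if (!t->n || ip < t->ips[0] || ip >= t->end)
		return NULL;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (t->ips[mid] <= ip) {
			found = mid;
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}

	return &t->orcs[found];
}

/* Hypothetical arch hook covering one special range. */
static const struct sketch_orc_entry sketch_arch_entry = { .sp_offset = 64 };

static const struct sketch_orc_entry *sketch_arch_orc_find(unsigned long ip)
{
	return (ip >= 0x9000 && ip < 0x9100) ? &sketch_arch_entry : NULL;
}

/* Generic lookup: vmlinux first, then modules, then the arch hook. */
static const struct sketch_orc_entry *
sketch_orc_find(const struct sketch_table *vmlinux,
		const struct sketch_table *module, unsigned long ip)
{
	const struct sketch_orc_entry *orc;

	orc = sketch_table_find(vmlinux, ip);
	if (!orc)
		orc = sketch_table_find(module, ip);
	if (!orc)
		orc = sketch_arch_orc_find(ip);

	return orc;
}
```

Keeping the arch hook as the last step means the generic code never needs to
know about ftrace or any other architecture-specific source of entries.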
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
arch/x86/include/asm/unwind.h | 5 -
arch/x86/kernel/module.c | 7 +-
arch/x86/kernel/unwind_orc.c | 256 +----------------
arch/x86/kernel/vmlinux.lds.S | 2 +-
.../asm => include/asm-generic}/orc_lookup.h | 42 +++
kernel/Makefile | 2 +
kernel/orc_lookup.c | 261 ++++++++++++++++++
7 files changed, 316 insertions(+), 259 deletions(-)
rename {arch/x86/include/asm => include/asm-generic}/orc_lookup.h (51%)
create mode 100644 kernel/orc_lookup.c
diff --git a/arch/x86/include/asm/unwind.h b/arch/x86/include/asm/unwind.h
index 7cede4dc21f0..71af8246c69e 100644
--- a/arch/x86/include/asm/unwind.h
+++ b/arch/x86/include/asm/unwind.h
@@ -94,13 +94,8 @@ static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state,
#ifdef CONFIG_UNWINDER_ORC
void unwind_init(void);
-void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
- void *orc, size_t orc_size);
#else
static inline void unwind_init(void) {}
-static inline
-void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
- void *orc, size_t orc_size) {}
#endif
static inline
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index c032edcd3d95..24664930c917 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -23,7 +23,7 @@
#include <asm/text-patching.h>
#include <asm/page.h>
#include <asm/setup.h>
-#include <asm/unwind.h>
+#include <asm-generic/orc_lookup.h>
#if 0
#define DEBUGP(fmt, ...) \
@@ -311,8 +311,9 @@ int module_finalize(const Elf_Ehdr *hdr,
}
if (orc && orc_ip)
- unwind_module_init(me, (void *)orc_ip->sh_addr, orc_ip->sh_size,
- (void *)orc->sh_addr, orc->sh_size);
+ orc_lookup_module_init(me,
+ (void *)orc_ip->sh_addr, orc_ip->sh_size,
+ (void *)orc->sh_addr, orc->sh_size);
return 0;
}
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index c2bfc597d909..eac9ed762bf8 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -6,80 +6,9 @@
#include <asm/unwind.h>
#include <asm/unwind_hints.h>
#include <asm/orc_types.h>
-#include <asm/orc_lookup.h>
-
-#define orc_warn(fmt, ...) \
- printk_deferred_once(KERN_WARNING "WARNING: " fmt, ##__VA_ARGS__)
-
-#define orc_warn_current(args...) \
-({ \
- if (state->task == current && !state->error) \
- orc_warn(args); \
-})
-
-extern int __start_orc_unwind_ip[];
-extern int __stop_orc_unwind_ip[];
-extern struct orc_entry __start_orc_unwind[];
-extern struct orc_entry __stop_orc_unwind[];
-
-static bool orc_init __ro_after_init;
-static unsigned int lookup_num_blocks __ro_after_init;
-
-static inline unsigned long orc_ip(const int *ip)
-{
- return (unsigned long)ip + *ip;
-}
-
-static struct orc_entry *__orc_find(int *ip_table, struct orc_entry *u_table,
- unsigned int num_entries, unsigned long ip)
-{
- int *first = ip_table;
- int *last = ip_table + num_entries - 1;
- int *mid = first, *found = first;
-
- if (!num_entries)
- return NULL;
-
- /*
- * Do a binary range search to find the rightmost duplicate of a given
- * starting address. Some entries are section terminators which are
- * "weak" entries for ensuring there are no gaps. They should be
- * ignored when they conflict with a real entry.
- */
- while (first <= last) {
- mid = first + ((last - first) / 2);
-
- if (orc_ip(mid) <= ip) {
- found = mid;
- first = mid + 1;
- } else
- last = mid - 1;
- }
-
- return u_table + (found - ip_table);
-}
-
-#ifdef CONFIG_MODULES
-static struct orc_entry *orc_module_find(unsigned long ip)
-{
- struct module *mod;
-
- mod = __module_address(ip);
- if (!mod || !mod->arch.orc_unwind || !mod->arch.orc_unwind_ip)
- return NULL;
- return __orc_find(mod->arch.orc_unwind_ip, mod->arch.orc_unwind,
- mod->arch.num_orcs, ip);
-}
-#else
-static struct orc_entry *orc_module_find(unsigned long ip)
-{
- return NULL;
-}
-#endif
+#include <asm-generic/orc_lookup.h>
#ifdef CONFIG_DYNAMIC_FTRACE
-static struct orc_entry *orc_find(unsigned long ip);
-
/*
* Ftrace dynamic trampolines do not have orc entries of their own.
* But they are copies of the ftrace entries that are static and
@@ -122,19 +51,10 @@ static struct orc_entry *orc_ftrace_find(unsigned long ip)
}
#endif
-/*
- * If we crash with IP==0, the last successfully executed instruction
- * was probably an indirect function call with a NULL function pointer,
- * and we don't have unwind information for NULL.
- * This hardcoded ORC entry for IP==0 allows us to unwind from a NULL function
- * pointer into its parent and then continue normally from there.
- */
-static struct orc_entry null_orc_entry = {
- .sp_offset = sizeof(long),
- .sp_reg = ORC_REG_SP,
- .bp_reg = ORC_REG_UNDEFINED,
- .type = UNWIND_HINT_TYPE_CALL
-};
+struct orc_entry *arch_orc_find(unsigned long ip)
+{
+ return orc_ftrace_find(ip);
+}
/* Fake frame pointer entry -- used as a fallback for generated code */
static struct orc_entry orc_fp_entry = {
@@ -146,173 +66,9 @@ static struct orc_entry orc_fp_entry = {
.end = 0,
};
-static struct orc_entry *orc_find(unsigned long ip)
-{
- static struct orc_entry *orc;
-
- if (ip == 0)
- return &null_orc_entry;
-
- /* For non-init vmlinux addresses, use the fast lookup table: */
- if (ip >= LOOKUP_START_IP && ip < LOOKUP_STOP_IP) {
- unsigned int idx, start, stop;
-
- idx = (ip - LOOKUP_START_IP) / LOOKUP_BLOCK_SIZE;
-
- if (unlikely((idx >= lookup_num_blocks-1))) {
- orc_warn("WARNING: bad lookup idx: idx=%u num=%u ip=%pB\n",
- idx, lookup_num_blocks, (void *)ip);
- return NULL;
- }
-
- start = orc_lookup[idx];
- stop = orc_lookup[idx + 1] + 1;
-
- if (unlikely((__start_orc_unwind + start >= __stop_orc_unwind) ||
- (__start_orc_unwind + stop > __stop_orc_unwind))) {
- orc_warn("WARNING: bad lookup value: idx=%u num=%u start=%u stop=%u ip=%pB\n",
- idx, lookup_num_blocks, start, stop, (void *)ip);
- return NULL;
- }
-
- return __orc_find(__start_orc_unwind_ip + start,
- __start_orc_unwind + start, stop - start, ip);
- }
-
- /* vmlinux .init slow lookup: */
- if (is_kernel_inittext(ip))
- return __orc_find(__start_orc_unwind_ip, __start_orc_unwind,
- __stop_orc_unwind_ip - __start_orc_unwind_ip, ip);
-
- /* Module lookup: */
- orc = orc_module_find(ip);
- if (orc)
- return orc;
-
- return orc_ftrace_find(ip);
-}
-
-#ifdef CONFIG_MODULES
-
-static DEFINE_MUTEX(sort_mutex);
-static int *cur_orc_ip_table = __start_orc_unwind_ip;
-static struct orc_entry *cur_orc_table = __start_orc_unwind;
-
-static void orc_sort_swap(void *_a, void *_b, int size)
-{
- struct orc_entry *orc_a, *orc_b;
- struct orc_entry orc_tmp;
- int *a = _a, *b = _b, tmp;
- int delta = _b - _a;
-
- /* Swap the .orc_unwind_ip entries: */
- tmp = *a;
- *a = *b + delta;
- *b = tmp - delta;
-
- /* Swap the corresponding .orc_unwind entries: */
- orc_a = cur_orc_table + (a - cur_orc_ip_table);
- orc_b = cur_orc_table + (b - cur_orc_ip_table);
- orc_tmp = *orc_a;
- *orc_a = *orc_b;
- *orc_b = orc_tmp;
-}
-
-static int orc_sort_cmp(const void *_a, const void *_b)
-{
- struct orc_entry *orc_a;
- const int *a = _a, *b = _b;
- unsigned long a_val = orc_ip(a);
- unsigned long b_val = orc_ip(b);
-
- if (a_val > b_val)
- return 1;
- if (a_val < b_val)
- return -1;
-
- /*
- * The "weak" section terminator entries need to always be on the left
- * to ensure the lookup code skips them in favor of real entries.
- * These terminator entries exist to handle any gaps created by
- * whitelisted .o files which didn't get objtool generation.
- */
- orc_a = cur_orc_table + (a - cur_orc_ip_table);
- return orc_a->sp_reg == ORC_REG_UNDEFINED && !orc_a->end ? -1 : 1;
-}
-
-void unwind_module_init(struct module *mod, void *_orc_ip, size_t orc_ip_size,
- void *_orc, size_t orc_size)
-{
- int *orc_ip = _orc_ip;
- struct orc_entry *orc = _orc;
- unsigned int num_entries = orc_ip_size / sizeof(int);
-
- WARN_ON_ONCE(orc_ip_size % sizeof(int) != 0 ||
- orc_size % sizeof(*orc) != 0 ||
- num_entries != orc_size / sizeof(*orc));
-
- /*
- * The 'cur_orc_*' globals allow the orc_sort_swap() callback to
- * associate an .orc_unwind_ip table entry with its corresponding
- * .orc_unwind entry so they can both be swapped.
- */
- mutex_lock(&sort_mutex);
- cur_orc_ip_table = orc_ip;
- cur_orc_table = orc;
- sort(orc_ip, num_entries, sizeof(int), orc_sort_cmp, orc_sort_swap);
- mutex_unlock(&sort_mutex);
-
- mod->arch.orc_unwind_ip = orc_ip;
- mod->arch.orc_unwind = orc;
- mod->arch.num_orcs = num_entries;
-}
-#endif
-
void __init unwind_init(void)
{
- size_t orc_ip_size = (void *)__stop_orc_unwind_ip - (void *)__start_orc_unwind_ip;
- size_t orc_size = (void *)__stop_orc_unwind - (void *)__start_orc_unwind;
- size_t num_entries = orc_ip_size / sizeof(int);
- struct orc_entry *orc;
- int i;
-
- if (!num_entries || orc_ip_size % sizeof(int) != 0 ||
- orc_size % sizeof(struct orc_entry) != 0 ||
- num_entries != orc_size / sizeof(struct orc_entry)) {
- orc_warn("WARNING: Bad or missing .orc_unwind table. Disabling unwinder.\n");
- return;
- }
-
- /*
- * Note, the orc_unwind and orc_unwind_ip tables were already
- * sorted at build time via the 'sorttable' tool.
- * It's ready for binary search straight away, no need to sort it.
- */
-
- /* Initialize the fast lookup table: */
- lookup_num_blocks = orc_lookup_end - orc_lookup;
- for (i = 0; i < lookup_num_blocks-1; i++) {
- orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind,
- num_entries,
- LOOKUP_START_IP + (LOOKUP_BLOCK_SIZE * i));
- if (!orc) {
- orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n");
- return;
- }
-
- orc_lookup[i] = orc - __start_orc_unwind;
- }
-
- /* Initialize the ending block: */
- orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind, num_entries,
- LOOKUP_STOP_IP);
- if (!orc) {
- orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n");
- return;
- }
- orc_lookup[lookup_num_blocks-1] = orc - __start_orc_unwind;
-
- orc_init = true;
+ orc_lookup_init();
}
unsigned long unwind_get_return_address(struct unwind_state *state)
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 15f29053cec4..b4b93cd68136 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -29,7 +29,7 @@
#include <asm/asm-offsets.h>
#include <asm/thread_info.h>
#include <asm/page_types.h>
-#include <asm/orc_lookup.h>
+#include <asm-generic/orc_lookup.h>
#include <asm/cache.h>
#include <asm/boot.h>
diff --git a/arch/x86/include/asm/orc_lookup.h b/include/asm-generic/orc_lookup.h
similarity index 51%
rename from arch/x86/include/asm/orc_lookup.h
rename to include/asm-generic/orc_lookup.h
index 241631282e43..f299fbf41cd0 100644
--- a/arch/x86/include/asm/orc_lookup.h
+++ b/include/asm-generic/orc_lookup.h
@@ -23,6 +23,8 @@
#ifndef LINKER_SCRIPT
+#include <asm-generic/sections.h>
+
extern unsigned int orc_lookup[];
extern unsigned int orc_lookup_end[];
@@ -31,4 +33,44 @@ extern unsigned int orc_lookup_end[];
#endif /* LINKER_SCRIPT */
+#ifndef __ASSEMBLY__
+
+#include <linux/orc_entry.h>
+
+#ifdef CONFIG_UNWINDER_ORC
+void orc_lookup_init(void);
+void orc_lookup_module_init(struct module *mod,
+ void *orc_ip, size_t orc_ip_size,
+ void *orc, size_t orc_size);
+#else
+static inline void orc_lookup_init(void) {}
+static inline
+void orc_lookup_module_init(struct module *mod,
+ void *orc_ip, size_t orc_ip_size,
+ void *orc, size_t orc_size)
+{
+}
+#endif
+
+struct orc_entry *arch_orc_find(unsigned long ip);
+
+#define orc_warn(fmt, ...) \
+ printk_deferred_once(KERN_WARNING "WARNING: " fmt, ##__VA_ARGS__)
+
+#define orc_warn_current(args...) \
+({ \
+ if (state->task == current && !state->error) \
+ orc_warn(args); \
+})
+
+struct orc_entry *orc_find(unsigned long ip);
+
+extern bool orc_init;
+extern int __start_orc_unwind_ip[];
+extern int __stop_orc_unwind_ip[];
+extern struct orc_entry __start_orc_unwind[];
+extern struct orc_entry __stop_orc_unwind[];
+
+#endif /* __ASSEMBLY__ */
+
#endif /* _ORC_LOOKUP_H */
diff --git a/kernel/Makefile b/kernel/Makefile
index d754e0be1176..9a78612c4568 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -132,6 +132,8 @@ obj-$(CONFIG_WATCH_QUEUE) += watch_queue.o
obj-$(CONFIG_RESOURCE_KUNIT_TEST) += resource_kunit.o
obj-$(CONFIG_SYSCTL_KUNIT_TEST) += sysctl-test.o
+obj-$(CONFIG_UNWINDER_ORC) += orc_lookup.o
+
CFLAGS_stackleak.o += $(DISABLE_STACKLEAK_PLUGIN)
obj-$(CONFIG_GCC_PLUGIN_STACKLEAK) += stackleak.o
KASAN_SANITIZE_stackleak.o := n
diff --git a/kernel/orc_lookup.c b/kernel/orc_lookup.c
new file mode 100644
index 000000000000..88b783c41e94
--- /dev/null
+++ b/kernel/orc_lookup.c
@@ -0,0 +1,261 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/objtool.h>
+#include <linux/module.h>
+#include <linux/sort.h>
+#include <asm/orc_types.h>
+#include <asm-generic/orc_lookup.h>
+
+bool orc_init __ro_after_init;
+static unsigned int lookup_num_blocks __ro_after_init;
+
+static inline unsigned long orc_ip(const int *ip)
+{
+ return (unsigned long)ip + *ip;
+}
+
+static struct orc_entry *__orc_find(int *ip_table, struct orc_entry *u_table,
+ unsigned int num_entries, unsigned long ip)
+{
+ int *first = ip_table;
+ int *last = ip_table + num_entries - 1;
+ int *mid = first, *found = first;
+
+ if (!num_entries)
+ return NULL;
+
+ /*
+ * Do a binary range search to find the rightmost duplicate of a given
+ * starting address. Some entries are section terminators which are
+ * "weak" entries for ensuring there are no gaps. They should be
+ * ignored when they conflict with a real entry.
+ */
+ while (first <= last) {
+ mid = first + ((last - first) / 2);
+
+ if (orc_ip(mid) <= ip) {
+ found = mid;
+ first = mid + 1;
+ } else
+ last = mid - 1;
+ }
+
+ return u_table + (found - ip_table);
+}
+
+#ifdef CONFIG_MODULES
+static struct orc_entry *orc_module_find(unsigned long ip)
+{
+ struct module *mod;
+
+ mod = __module_address(ip);
+ if (!mod || !mod->arch.orc_unwind || !mod->arch.orc_unwind_ip)
+ return NULL;
+ return __orc_find(mod->arch.orc_unwind_ip, mod->arch.orc_unwind,
+ mod->arch.num_orcs, ip);
+}
+#else
+static struct orc_entry *orc_module_find(unsigned long ip)
+{
+ return NULL;
+}
+#endif
+
+/*
+ * If we crash with IP==0, the last successfully executed instruction
+ * was probably an indirect function call with a NULL function pointer,
+ * and we don't have unwind information for NULL.
+ * This hardcoded ORC entry for IP==0 allows us to unwind from a NULL function
+ * pointer into its parent and then continue normally from there.
+ */
+static struct orc_entry null_orc_entry = {
+ .sp_offset = sizeof(long),
+ .sp_reg = ORC_REG_SP,
+ .fp_reg = ORC_REG_UNDEFINED,
+ .type = UNWIND_HINT_TYPE_CALL
+};
+
+struct orc_entry *orc_find(unsigned long ip)
+{
+ static struct orc_entry *orc;
+
+ if (ip == 0)
+ return &null_orc_entry;
+
+ /* For non-init vmlinux addresses, use the fast lookup table: */
+ if (ip >= LOOKUP_START_IP && ip < LOOKUP_STOP_IP) {
+ unsigned int idx, start, stop;
+
+ if (!orc_init) {
+ /*
+ * Take the slow path if the fast lookup tables have
+ * not yet been initialized.
+ */
+ return __orc_find(__start_orc_unwind_ip,
+ __start_orc_unwind,
+ __stop_orc_unwind_ip -
+ __start_orc_unwind_ip, ip);
+ }
+
+ idx = (ip - LOOKUP_START_IP) / LOOKUP_BLOCK_SIZE;
+
+ if (unlikely((idx >= lookup_num_blocks-1))) {
+ orc_warn("WARNING: bad lookup idx: idx=%u num=%u ip=%pB\n",
+ idx, lookup_num_blocks, (void *)ip);
+ return NULL;
+ }
+
+ start = orc_lookup[idx];
+ stop = orc_lookup[idx + 1] + 1;
+
+ if (unlikely((__start_orc_unwind + start >= __stop_orc_unwind) ||
+ (__start_orc_unwind + stop > __stop_orc_unwind))) {
+ orc_warn("WARNING: bad lookup value: idx=%u num=%u start=%u stop=%u ip=%pB\n",
+ idx, lookup_num_blocks, start, stop, (void *)ip);
+ return NULL;
+ }
+
+ return __orc_find(__start_orc_unwind_ip + start,
+ __start_orc_unwind + start, stop - start, ip);
+ }
+
+ /* vmlinux .init slow lookup: */
+ if (is_kernel_inittext(ip))
+ return __orc_find(__start_orc_unwind_ip, __start_orc_unwind,
+ __stop_orc_unwind_ip - __start_orc_unwind_ip, ip);
+
+ /* Module lookup: */
+ orc = orc_module_find(ip);
+ if (orc)
+ return orc;
+
+ return arch_orc_find(ip);
+}
+
+#ifdef CONFIG_MODULES
+
+static DEFINE_MUTEX(sort_mutex);
+static int *cur_orc_ip_table = __start_orc_unwind_ip;
+static struct orc_entry *cur_orc_table = __start_orc_unwind;
+
+static void orc_sort_swap(void *_a, void *_b, int size)
+{
+ struct orc_entry *orc_a, *orc_b;
+ struct orc_entry orc_tmp;
+ int *a = _a, *b = _b, tmp;
+ int delta = _b - _a;
+
+ /* Swap the .orc_unwind_ip entries: */
+ tmp = *a;
+ *a = *b + delta;
+ *b = tmp - delta;
+
+ /* Swap the corresponding .orc_unwind entries: */
+ orc_a = cur_orc_table + (a - cur_orc_ip_table);
+ orc_b = cur_orc_table + (b - cur_orc_ip_table);
+ orc_tmp = *orc_a;
+ *orc_a = *orc_b;
+ *orc_b = orc_tmp;
+}
+
+static int orc_sort_cmp(const void *_a, const void *_b)
+{
+ struct orc_entry *orc_a;
+ const int *a = _a, *b = _b;
+ unsigned long a_val = orc_ip(a);
+ unsigned long b_val = orc_ip(b);
+
+ if (a_val > b_val)
+ return 1;
+ if (a_val < b_val)
+ return -1;
+
+ /*
+ * The "weak" section terminator entries need to always be on the left
+ * to ensure the lookup code skips them in favor of real entries.
+ * These terminator entries exist to handle any gaps created by
+ * whitelisted .o files which didn't get objtool generation.
+ */
+ orc_a = cur_orc_table + (a - cur_orc_ip_table);
+ return orc_a->sp_reg == ORC_REG_UNDEFINED && !orc_a->end ? -1 : 1;
+}
+
+void orc_lookup_module_init(struct module *mod,
+ void *_orc_ip, size_t orc_ip_size,
+ void *_orc, size_t orc_size)
+{
+ int *orc_ip = _orc_ip;
+ struct orc_entry *orc = _orc;
+ unsigned int num_entries = orc_ip_size / sizeof(int);
+
+ WARN_ON_ONCE(orc_ip_size % sizeof(int) != 0 ||
+ orc_size % sizeof(*orc) != 0 ||
+ num_entries != orc_size / sizeof(*orc));
+
+ /*
+ * The 'cur_orc_*' globals allow the orc_sort_swap() callback to
+ * associate an .orc_unwind_ip table entry with its corresponding
+ * .orc_unwind entry so they can both be swapped.
+ */
+ mutex_lock(&sort_mutex);
+ cur_orc_ip_table = orc_ip;
+ cur_orc_table = orc;
+ sort(orc_ip, num_entries, sizeof(int), orc_sort_cmp, orc_sort_swap);
+ mutex_unlock(&sort_mutex);
+
+ mod->arch.orc_unwind_ip = orc_ip;
+ mod->arch.orc_unwind = orc;
+ mod->arch.num_orcs = num_entries;
+}
+#endif
+
+void __init orc_lookup_init(void)
+{
+ size_t orc_ip_size = (void *)__stop_orc_unwind_ip - (void *)__start_orc_unwind_ip;
+ size_t orc_size = (void *)__stop_orc_unwind - (void *)__start_orc_unwind;
+ size_t num_entries = orc_ip_size / sizeof(int);
+ struct orc_entry *orc;
+ int i;
+
+ if (!num_entries || orc_ip_size % sizeof(int) != 0 ||
+ orc_size % sizeof(struct orc_entry) != 0 ||
+ num_entries != orc_size / sizeof(struct orc_entry)) {
+ orc_warn("WARNING: Bad or missing .orc_unwind table. Disabling unwinder.\n");
+ return;
+ }
+
+ /*
+ * Note, the orc_unwind and orc_unwind_ip tables were already
+ * sorted at build time via the 'sorttable' tool.
+ * It's ready for binary search straight away, no need to sort it.
+ */
+
+ /* Initialize the fast lookup table: */
+ lookup_num_blocks = orc_lookup_end - orc_lookup;
+ for (i = 0; i < lookup_num_blocks-1; i++) {
+ orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind,
+ num_entries,
+ LOOKUP_START_IP + (LOOKUP_BLOCK_SIZE * i));
+ if (!orc) {
+ orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n");
+ return;
+ }
+
+ orc_lookup[i] = orc - __start_orc_unwind;
+ }
+
+ /* Initialize the ending block: */
+ orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind, num_entries,
+ LOOKUP_STOP_IP);
+ if (!orc) {
+ orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n");
+ return;
+ }
+ orc_lookup[lookup_num_blocks-1] = orc - __start_orc_unwind;
+
+ orc_init = true;
+}
+
+__weak struct orc_entry *arch_orc_find(unsigned long ip)
+{
+ return NULL;
+}
--
2.25.1
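For reviewers, the lookup that __orc_find() performs in the patch above can be sketched in user-space C. This is only an illustration, not part of the patch: struct demo, demo_build() and find_rightmost() are hypothetical names, and the sketch works on indices rather than pointers. It shows the two properties the kernel code relies on: table slots hold offsets relative to their own address, and the search returns the rightmost entry whose address is less than or equal to the IP.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical demo layout: the offset table and the "text" it describes
 * live in one object so that every self-relative offset fits in an int.
 */
struct demo {
	int ip_table[5];
	char text[64];
};

static struct demo demo;

/* Decode a self-relative slot, in the manner of orc_ip() in the patch. */
static uintptr_t rel_ip(const int *slot)
{
	return (uintptr_t)slot + *slot;
}

/* Fill the table with sorted start offsets; offset 16 appears twice. */
static void demo_build(void)
{
	static const int starts[5] = { 0, 16, 16, 32, 48 };
	int i;

	for (i = 0; i < 5; i++)
		demo.ip_table[i] = (int)((intptr_t)&demo.text[starts[i]] -
					 (intptr_t)&demo.ip_table[i]);
}

/*
 * Return the index of the rightmost slot whose address is <= ip.
 * Returns -1 for an empty table and 0 when ip precedes every entry,
 * mirroring the NULL and first-entry results of __orc_find().
 */
static long find_rightmost(const int *table, size_t n, uintptr_t ip)
{
	long lo = 0, hi = (long)n - 1, found = 0;

	if (!n)
		return -1;

	while (lo <= hi) {
		long mid = lo + (hi - lo) / 2;

		if (rel_ip(&table[mid]) <= ip) {
			found = mid;	/* candidate; keep looking to the right */
			lo = mid + 1;
		} else {
			hi = mid - 1;
		}
	}

	return found;
}
```

After demo_build(), looking up an address 20 bytes into demo.text selects index 2, the second of the two entries that start at offset 16. That is why duplicate "weak" terminator entries are sorted to the left of real entries.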
From: "Madhavan T. Venkataraman" <[email protected]>
Add CFI definitions and endianness for ARM64.
Add DYNAMIC_CHECK option for ARM64.
Provide stubs for arch_decode_instruction() and check() just to get
objtool to build on ARM64.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
tools/objtool/Build | 1 +
tools/objtool/Makefile | 6 +++++-
tools/objtool/arch/arm64/Build | 1 +
tools/objtool/arch/arm64/decode.c | 21 +++++++++++++++++++
.../arch/arm64/include/arch/cfi_regs.h | 13 ++++++++++++
.../arch/arm64/include/arch/endianness.h | 9 ++++++++
tools/objtool/dcheck.c | 16 ++++++++++++++
7 files changed, 66 insertions(+), 1 deletion(-)
create mode 100644 tools/objtool/arch/arm64/Build
create mode 100644 tools/objtool/arch/arm64/decode.c
create mode 100644 tools/objtool/arch/arm64/include/arch/cfi_regs.h
create mode 100644 tools/objtool/arch/arm64/include/arch/endianness.h
create mode 100644 tools/objtool/dcheck.c
diff --git a/tools/objtool/Build b/tools/objtool/Build
index 974290dc4aac..fb0846b7d95e 100644
--- a/tools/objtool/Build
+++ b/tools/objtool/Build
@@ -4,6 +4,7 @@ objtool-y += weak.o
objtool-$(STATIC_CHECK) += check.o
objtool-$(STATIC_CHECK) += special.o
+objtool-$(DYNAMIC_CHECK) += dcheck.o
objtool-y += builtin-check.o
objtool-y += cfi.o
objtool-y += insn.o
diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile
index 797d1ea02db0..92583b82eb78 100644
--- a/tools/objtool/Makefile
+++ b/tools/objtool/Makefile
@@ -46,7 +46,11 @@ ifeq ($(SRCARCH),x86)
STATIC_CHECK := y
endif
-export BUILD_ORC STATIC_CHECK
+ifeq ($(SRCARCH),arm64)
+ DYNAMIC_CHECK := y
+endif
+
+export BUILD_ORC STATIC_CHECK DYNAMIC_CHECK
export srctree OUTPUT CFLAGS SRCARCH AWK
include $(srctree)/tools/build/Makefile.include
diff --git a/tools/objtool/arch/arm64/Build b/tools/objtool/arch/arm64/Build
new file mode 100644
index 000000000000..3ff1f00c6a47
--- /dev/null
+++ b/tools/objtool/arch/arm64/Build
@@ -0,0 +1 @@
+objtool-y += decode.o
diff --git a/tools/objtool/arch/arm64/decode.c b/tools/objtool/arch/arm64/decode.c
new file mode 100644
index 000000000000..69f851337537
--- /dev/null
+++ b/tools/objtool/arch/arm64/decode.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Author: Madhavan T. Venkataraman ([email protected])
+ *
+ * Copyright (C) 2022 Microsoft Corporation
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#include <objtool/check.h>
+
+int arch_decode_instruction(struct objtool_file *file,
+ const struct section *sec,
+ unsigned long offset, unsigned int maxlen,
+ unsigned int *len, enum insn_type *type,
+ unsigned long *immediate,
+ struct list_head *ops_list)
+{
+ return 0;
+}
diff --git a/tools/objtool/arch/arm64/include/arch/cfi_regs.h b/tools/objtool/arch/arm64/include/arch/cfi_regs.h
new file mode 100644
index 000000000000..cff3b04d7248
--- /dev/null
+++ b/tools/objtool/arch/arm64/include/arch/cfi_regs.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#ifndef _OBJTOOL_CFI_REGS_H
+#define _OBJTOOL_CFI_REGS_H
+
+#define CFI_FP 29
+#define CFI_BP CFI_FP
+#define CFI_RA 30
+#define CFI_SP 31
+
+#define CFI_NUM_REGS 32
+
+#endif /* _OBJTOOL_CFI_REGS_H */
diff --git a/tools/objtool/arch/arm64/include/arch/endianness.h b/tools/objtool/arch/arm64/include/arch/endianness.h
new file mode 100644
index 000000000000..7c362527da20
--- /dev/null
+++ b/tools/objtool/arch/arm64/include/arch/endianness.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _ARCH_ENDIANNESS_H
+#define _ARCH_ENDIANNESS_H
+
+#include <endian.h>
+
+#define __TARGET_BYTE_ORDER __LITTLE_ENDIAN
+
+#endif /* _ARCH_ENDIANNESS_H */
diff --git a/tools/objtool/dcheck.c b/tools/objtool/dcheck.c
new file mode 100644
index 000000000000..e2098c9ad282
--- /dev/null
+++ b/tools/objtool/dcheck.c
@@ -0,0 +1,16 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2015-2017 Josh Poimboeuf <[email protected]>
+ */
+
+#include <string.h>
+#include <stdlib.h>
+#include <inttypes.h>
+#include <sys/mman.h>
+
+#include <objtool/objtool.h>
+
+int check(struct objtool_file *file)
+{
+ return 0;
+}
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
Implement arch_decode_instruction() for ARM64. For Dynamic FP validation,
we need to walk each function's code and determine the stack and frame
offsets at each instruction. So, the following instructions are completely
decoded:
Instructions that affect the SP and FP:
- Load-Store instructions
- Add/Sub/Mov instructions
Instructions that affect control flow:
- Branch instructions
- Call instructions
- Return instructions
Miscellaneous instructions:
- Break instruction used for bugs
- PACIASP instruction that occurs at the beginning of the frame
pointer prolog
The rest of the instructions are either don't-care from an unwind
perspective or unexpected from the compiler. Add checks for the unexpected
ones to catch them if the compiler ever generates them.
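As a rough illustration of the table-driven approach, the sketch below shows how one row's mask/opcode pair could select an instruction and how its immediate could be extracted, sign-extended and scaled. It is a simplified assumption, not the code in this patch: struct row, find_row() and row_imm() are hypothetical names, only the B and BL rows are included, and rows with a multiplier of 0 are not modeled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced form of one decode table row. */
struct row {
	uint32_t mask;		/* bits that identify the instruction */
	uint32_t opcode;	/* expected value of those bits */
	unsigned int shift;	/* position of the immediate field */
	unsigned int bits;	/* width of the immediate field */
	int sign_extend;	/* treat the field as signed */
	unsigned int mult;	/* scale factor for the immediate */
};

static const struct row rows[] = {
	{ 0xFC000000, 0x14000000, 0, 26, 1, 4 },	/* B  */
	{ 0xFC000000, 0x94000000, 0, 26, 1, 4 },	/* BL */
};

/* Return the index of the first row matching insn, or -1 if none does. */
static int find_row(uint32_t insn)
{
	size_t i;

	for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++) {
		if ((insn & rows[i].mask) == rows[i].opcode)
			return (int)i;
	}
	return -1;
}

/* Extract, sign-extend and scale the immediate described by a row. */
static int64_t row_imm(const struct row *r, uint32_t insn)
{
	uint64_t field;
	int64_t imm;

	if (!r->bits)
		return 0;

	field = (insn >> r->shift) & ((1ULL << r->bits) - 1);
	imm = (int64_t)field;

	/* If the top bit of the field is set, the value is negative. */
	if (r->sign_extend && (field & (1ULL << (r->bits - 1))))
		imm -= (int64_t)1 << r->bits;

	return imm * (int64_t)r->mult;
}
```

For example, 0x14000002 matches the B row and yields a branch offset of 8 bytes, while 0x17FFFFFF yields -4. An instruction that matches no row, such as NOP (0xD503201F), would fall through to the sanity checks described above.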
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
tools/objtool/arch/arm64/decode.c | 506 ++++++++++++++++++++++++++-
tools/objtool/include/objtool/arch.h | 2 +
2 files changed, 507 insertions(+), 1 deletion(-)
diff --git a/tools/objtool/arch/arm64/decode.c b/tools/objtool/arch/arm64/decode.c
index 69f851337537..aaae16791807 100644
--- a/tools/objtool/arch/arm64/decode.c
+++ b/tools/objtool/arch/arm64/decode.c
@@ -1,5 +1,9 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
+ * decode.c - ARM64 instruction decoder for dynamic FP validation. Only a
+ * small subset of the instructions need to be decoded. The rest
+ * only need to be sanity checked.
+ *
* Author: Madhavan T. Venkataraman ([email protected])
*
* Copyright (C) 2022 Microsoft Corporation
@@ -7,15 +11,515 @@
#include <stdio.h>
#include <stdlib.h>
+#include <stdint.h>
#include <objtool/check.h>
+#include <objtool/elf.h>
+#include <objtool/warn.h>
+
+/* ARM64 instructions are all 4 bytes wide. */
+#define INSN_SIZE 4
+
+/* --------------------- instruction decode structs ------------------------ */
+
+struct decode_var {
+ u32 insn;
+ enum insn_type type;
+ s64 imm;
+ unsigned int mode1;
+ unsigned int mode2;
+ unsigned int check_reg;
+ struct list_head *ops;
+};
+
+struct decode {
+ unsigned long opmask;
+ unsigned long op;
+ unsigned int width;
+ unsigned int shift;
+ unsigned int bits;
+ unsigned int sign_extend;
+ unsigned int mult;
+ unsigned int mode1;
+ unsigned int mode2;
+ void (*func)(struct decode *decode, struct decode_var *var);
+};
+
+struct class {
+ unsigned long opmask;
+ unsigned long op;
+ void (*check)(struct decode_var *var);
+};
+
+/* ------------------------ stack operations ------------------------------- */
+
+static void add_stack_op(unsigned char src_reg, enum op_src_type src_type,
+ s64 src_offset,
+ unsigned char dest_reg, enum op_dest_type dest_type,
+ s64 dest_offset,
+ struct list_head *ops)
+{
+ struct stack_op *op;
+
+ op = calloc(1, sizeof(*op));
+ if (!op) {
+ WARN("calloc failed");
+ return;
+ }
+
+ op->src.reg = src_reg;
+ op->src.type = src_type;
+ op->src.offset = src_offset;
+ op->dest.reg = dest_reg;
+ op->dest.type = dest_type;
+ op->dest.offset = dest_offset;
+
+ list_add_tail(&op->list, ops);
+}
+
+static void add_op(struct decode_var *var,
+ unsigned char rn, s64 offset, unsigned char rd)
+{
+ add_stack_op(rn, OP_SRC_ADD, offset, rd, OP_DEST_REG, 0, var->ops);
+}
+
+static void load_op(struct decode_var *var, s64 offset, unsigned char rd)
+{
+ add_stack_op(CFI_SP, OP_SRC_REG_INDIRECT, offset, rd, OP_DEST_REG, 0,
+ var->ops);
+}
+
+static void store_op(struct decode_var *var, s64 offset, unsigned char rd)
+{
+ add_stack_op(CFI_SP, OP_SRC_REG, 0, rd, OP_DEST_REG_INDIRECT, offset,
+ var->ops);
+}
+
+/* ------------------------ decode functions ------------------------------- */
+
+#define is_saved_reg(rt) ((rt) == CFI_FP || (rt) == CFI_RA)
+#define is_frame_reg(rt) ((rt) == CFI_FP || (rt) == CFI_SP)
+
+/* ----- Add/Subtract instructions. ----- */
+
+#define CMN_OP 0x31000000 /* Alias of ADDS imm */
+#define CMP_OP 0x71000000 /* Alias of SUBS imm */
+
+static void add(struct decode *decode, struct decode_var *var)
+{
+ unsigned int rd = var->insn & 0x1F;
+ unsigned int rn = (var->insn >> 5) & 0x1F;
+ unsigned int shift = (var->insn >> 22) & 1;
+
+ if (decode->op == CMN_OP || decode->op == CMP_OP)
+ return;
+
+ if (!is_frame_reg(rd))
+ return;
+
+ if (is_frame_reg(rn)) {
+ if (shift)
+ var->imm <<= 12;
+ add_op(var, rn, var->imm, rd);
+ } else {
+ var->type = INSN_UNRELIABLE;
+ }
+}
+
+#define CMN_EXT_OP 0x2B200000 /* Alias of ADDS ext */
+#define CMP_EXT_OP 0x6B200000 /* Alias of SUBS ext */
+
+static void addc(struct decode *decode, struct decode_var *var)
+{
+ unsigned int rd = var->insn & 0x1F;
+
+ if (decode->op == CMN_EXT_OP || decode->op == CMP_EXT_OP)
+ return;
+
+ if (is_frame_reg(rd))
+ var->type = INSN_UNRELIABLE;
+}
+
+static void sub(struct decode *decode, struct decode_var *var)
+{
+ var->imm = -var->imm;
+ return add(decode, var);
+}
+
+/* ----- Load instructions. ----- */
+
+/*
+ * For some instructions, the target register cannot be FP. There are 3 cases:
+ *
+ * - The register width is 32 bits. FP cannot be 32 bits.
+ * - The register is loaded from one that is not the SP. We do not track
+ * the value of other registers in static analysis.
+ * - It does not make sense for the FP to be the target of the instruction.
+ */
+static void check_reg(unsigned int reg, struct decode_var *var)
+{
+ if (reg == CFI_FP)
+ var->type = INSN_UNRELIABLE;
+}
+
+static void ldp(struct decode *decode, struct decode_var *var)
+{
+ unsigned int rt1 = var->insn & 0x1F;
+ unsigned int rt2 = (var->insn >> 10) & 0x1F;
+ unsigned int rn = (var->insn >> 5) & 0x1F;
+ s64 imm;
+
+ if (rn != CFI_SP || var->check_reg) {
+ check_reg(rt1, var);
+ check_reg(rt2, var);
+ }
+
+ if (rn == CFI_SP) {
+ if (var->mode1 && var->mode2) /* Pre-index */
+ add_op(var, CFI_SP, var->imm, CFI_SP);
+
+ imm = var->mode1 ? 0 : var->imm;
+ if (is_saved_reg(rt1))
+ load_op(var, imm, rt1);
+ if (is_saved_reg(rt2))
+ load_op(var, imm + 8, rt2);
+
+ if (var->mode1 && !var->mode2) /* Post-index */
+ add_op(var, CFI_SP, var->imm, CFI_SP);
+ }
+}
+
+static void ldpc(struct decode *decode, struct decode_var *var)
+{
+ var->check_reg = 1;
+ ldp(decode, var);
+}
+
+static void ldr(struct decode *decode, struct decode_var *var)
+{
+ unsigned int rd = var->insn & 0x1F;
+ unsigned int rn = (var->insn >> 5) & 0x1F;
+ s64 imm;
+
+ if (rn != CFI_SP || var->check_reg)
+ check_reg(rd, var);
+
+ if (rn == CFI_SP) {
+ if (var->mode1 && var->mode2) /* Pre-index */
+ add_op(var, CFI_SP, var->imm, CFI_SP);
+
+ imm = var->mode1 ? 0 : var->imm;
+ if (is_saved_reg(rd))
+ load_op(var, imm, rd);
+
+ if (var->mode1 && !var->mode2) /* Post-index */
+ add_op(var, CFI_SP, var->imm, CFI_SP);
+ }
+}
+
+/* ----- Store instructions. ----- */
+
+static void stp(struct decode *decode, struct decode_var *var)
+{
+ unsigned int rt1 = var->insn & 0x1F;
+ unsigned int rt2 = (var->insn >> 10) & 0x1F;
+ unsigned int rn = (var->insn >> 5) & 0x1F;
+ s64 imm;
+
+ if (var->check_reg) {
+ check_reg(rt1, var);
+ check_reg(rt2, var);
+ }
+
+ if (rn == CFI_SP) {
+ if (var->mode1 && var->mode2) /* Pre-index */
+ add_op(var, CFI_SP, var->imm, CFI_SP);
+
+ imm = var->mode1 ? 0 : var->imm;
+ if (is_saved_reg(rt1))
+ store_op(var, imm, rt1);
+ if (is_saved_reg(rt2))
+ store_op(var, imm + 8, rt2);
+
+ if (var->mode1 && !var->mode2) /* Post-index */
+ add_op(var, CFI_SP, var->imm, CFI_SP);
+ }
+}
+
+static void stpc(struct decode *decode, struct decode_var *var)
+{
+ var->check_reg = 1;
+ stp(decode, var);
+}
+
+static void str(struct decode *decode, struct decode_var *var)
+{
+ unsigned int rd = var->insn & 0x1F;
+ unsigned int rn = (var->insn >> 5) & 0x1F;
+ s64 imm;
+
+ if (var->check_reg)
+ check_reg(rd, var);
+
+ if (rn == CFI_SP) {
+ if (var->mode1 && var->mode2) /* Pre-index */
+ add_op(var, CFI_SP, var->imm, CFI_SP);
+
+ imm = var->mode1 ? 0 : var->imm;
+ if (is_saved_reg(rd))
+ store_op(var, imm, rd);
+
+ if (var->mode1 && !var->mode2) /* Post-index */
+ add_op(var, CFI_SP, var->imm, CFI_SP);
+ }
+}
+
+static void strc(struct decode *decode, struct decode_var *var)
+{
+ var->check_reg = 1;
+ str(decode, var);
+}
+
+/* ----- Control transfer instructions. ----- */
+
+#define BR_UNCONDITIONAL 0x14000000
+
+static void bra(struct decode *decode, struct decode_var *var)
+{
+ if (var->imm) {
+ if (decode->op == BR_UNCONDITIONAL)
+ var->type = INSN_JUMP_UNCONDITIONAL;
+ else
+ var->type = INSN_JUMP_CONDITIONAL;
+ } else {
+ var->type = INSN_JUMP_DYNAMIC;
+ }
+}
+
+static void call(struct decode *decode, struct decode_var *var)
+{
+ var->type = var->imm ? INSN_CALL : INSN_CALL_DYNAMIC;
+}
+
+static void ret(struct decode *decode, struct decode_var *var)
+{
+ var->type = INSN_RETURN;
+}
+
+/* ----- Miscellaneous instructions. ----- */
+
+static void bug(struct decode *decode, struct decode_var *var)
+{
+ var->type = INSN_BUG;
+}
+
+static void pac(struct decode *decode, struct decode_var *var)
+{
+ var->type = INSN_START;
+}
+
+/* ------------------------ Instruction decode ----------------------------- */
+
+struct decode decode_array[] = {
+/*
+ * mask OP code mask
+ * opcode OP code
+ * width Target register width. Values can be:
+ * 64 (64-bit)
+ * 32 (32-bit)
+ * X (64-bit if bit X in the instruction is set)
+ * -X (32-bit if bit X in the instruction is set)
+ * shift Shift for the immediate value
+ * bits Number of bits in the immediate value
+ * sign Sign extend the immediate value
+ * mult Multiplier for the immediate value
+ * am1 Addressing mode bit 1
+ * am2 Addressing mode bit 2
+ * func Decode function
+ *
+ * =============================== INSTRUCTIONS ===============================
+ * mask opcode width shift bits sign mult am1 am2 func
+ * ============================================================================
+ */
+{ 0x7E400000, 0x28400000, 31, 15, 7, 1, 0, 23, 24, ldp /* LDP */},
+{ 0x7E400000, 0x68400000, 32, 15, 7, 1, 4, 23, 24, ldp /* LDPSW */},
+{ 0x7FC00000, 0x28400000, 31, 15, 7, 1, 0, 0, 0, ldpc /* LDNP */},
+{ 0xBFE00000, 0xB8400000, 30, 12, 9, 1, 1, 10, 11, ldr /* LDR */},
+{ 0xBFC00000, 0xB9400000, 30, 10, 12, 0, 0, 0, 0, ldr /* LDR off */},
+{ 0xFF200400, 0xF8200400, 64, 12, 9, 1, 8, 11, 11, ldr /* LDRA */},
+{ 0xFFC00000, 0x39400000, 32, 10, 12, 0, 1, 0, 0, ldr /* LDRB off */},
+{ 0xFFE00000, 0x38400000, 32, 12, 9, 1, 1, 10, 11, ldr /* LDRB */},
+{ 0xFFC00000, 0x79400000, 32, 10, 12, 0, 2, 0, 0, ldr /* LDRH off */},
+{ 0xFFE00000, 0x78400000, 32, 12, 9, 1, 1, 10, 11, ldr /* LDRH */},
+{ 0xFF800000, 0x39800000, -22, 10, 12, 0, 1, 0, 0, ldr /* LDRSB off */},
+{ 0xFFA00000, 0x38800000, -22, 12, 9, 1, 1, 10, 11, ldr /* LDRSB */},
+{ 0xFF800000, 0x79800000, -22, 10, 12, 0, 2, 0, 0, ldr /* LDRSH off */},
+{ 0xFFA00000, 0x78800000, -22, 12, 9, 1, 1, 10, 11, ldr /* LDRSH */},
+{ 0xFFC00000, 0xB9800000, 32, 10, 12, 0, 4, 0, 0, ldr /* LDRSW off */},
+{ 0xFFE00000, 0xB8800000, 32, 12, 9, 1, 1, 10, 11, ldr /* LDRSW */},
+{ 0x7E000000, 0x28000000, 31, 15, 7, 1, 0, 23, 24, stp /* STP */},
+{ 0x7E400000, 0x28000000, 31, 15, 7, 1, 0, 23, 24, stp /* STG */},
+{ 0xFE400000, 0x68000000, 64, 15, 7, 1, 16, 23, 24, stpc /* STGP */},
+{ 0x7FC00000, 0x28000000, 31, 15, 7, 1, 0, 0, 0, stpc /* STNP */},
+{ 0xBFC00000, 0xB9000000, 30, 10, 12, 0, 0, 0, 0, str /* STR off */},
+{ 0xBFE00000, 0xB8000000, 30, 12, 9, 1, 1, 10, 11, str /* STR */},
+{ 0xFFE00000, 0xD9200000, 64, 12, 9, 1, 16, 10, 11, strc /* STG */},
+{ 0xFFE00000, 0xD9A00000, 64, 12, 9, 1, 16, 10, 11, strc /* ST2G */},
+{ 0x7F800000, 0x11000000, 31, 10, 12, 0, 1, 0, 0, add /* ADD imm */},
+{ 0x7FE00000, 0x0B200000, 31, 10, 3, 0, 1, 0, 0, addc /* ADD ext */},
+{ 0x7F800000, 0x31000000, 31, 10, 12, 0, 1, 0, 0, add /* ADDS imm */},
+{ 0x7FE00000, 0x2B200000, 31, 10, 3, 0, 1, 0, 0, addc /* ADDS ext */},
+{ 0x7F800000, 0x51000000, 31, 10, 12, 0, 1, 0, 0, sub /* SUB imm */},
+{ 0x7FE00000, 0x4B200000, 31, 10, 3, 0, 1, 0, 0, addc /* SUB ext */},
+{ 0x7F800000, 0x71000000, 31, 10, 12, 0, 1, 0, 0, sub /* SUBS imm */},
+{ 0x7FE00000, 0x6B200000, 31, 10, 3, 0, 1, 0, 0, addc /* SUBS ext */},
+{ 0xFC000000, 0x14000000, 64, 0, 26, 1, 4, 0, 0, bra /* B */},
+{ 0xFF000010, 0x54000000, 64, 5, 19, 1, 4, 0, 0, bra /* B.cond */},
+{ 0xFF000010, 0x54000010, 64, 5, 19, 1, 4, 0, 0, bra /* BC.cond */},
+{ 0xFFFFFC1F, 0xD61F0000, 64, 0, 0, 0, 0, 0, 0, bra /* BR */},
+{ 0xFEFFF800, 0xD61F0800, 64, 0, 0, 0, 0, 0, 0, bra /* BRA */},
+{ 0x7E000000, 0x34000000, 31, 5, 19, 1, 4, 0, 0, bra /* CBZ/CBNZ */},
+{ 0x7E000000, 0x36000000, 31, 5, 14, 1, 4, 0, 0, bra /* TBZ/TBNZ */},
+{ 0xFC000000, 0x94000000, 64, 0, 26, 1, 4, 0, 0, call /* BL */},
+{ 0xFFFFFC1F, 0xD63F0000, 64, 0, 0, 0, 0, 0, 0, call /* BLR */},
+{ 0xFEFFF800, 0xD63F0800, 64, 0, 0, 0, 0, 0, 0, call /* BLRA */},
+{ 0xFFFFFC1F, 0xD65F0000, 64, 0, 0, 0, 0, 0, 0, ret /* RET */},
+{ 0xFFFFFBFF, 0xD65F0BFF, 64, 0, 0, 0, 0, 0, 0, ret /* RETA */},
+{ 0xFFFFFFFF, 0xD69F03E0, 64, 0, 0, 0, 0, 0, 0, ret /* ERET */},
+{ 0xFFFFFBFF, 0xD69F0BFF, 64, 0, 0, 0, 0, 0, 0, ret /* ERETA */},
+{ 0xFFE00000, 0xD4200000, 64, 5, 16, 0, 1, 0, 0, bug /* BRK */},
+{ 0xFFFFFFFF, 0xD503233F, 64, 0, 0, 0, 1, 0, 0, pac /* PACIASP */},
+};
+unsigned int ndecode = ARRAY_SIZE(decode_array);
+
+static void ignore(struct decode_var *var)
+{
+}
+
+static void check_target(struct decode_var *var)
+{
+ unsigned int rd = var->insn & 0x1F;
+
+ check_reg(rd, var);
+}
+
+struct class class_array[] = {
+/*
+ * mask Class OP mask
+ * opcode Class OP code
+ * check Function to perform checks
+ *
+ * ========================== INSTRUCTION CLASSES =============================
+ * mask opcode check
+ * ============================================================================
+ */
+{ 0x1E000000, 0x00000000, ignore /* RSVD_00 */ },
+{ 0x1E000000, 0x02000000, ignore /* UNALLOC_01 */ },
+{ 0x1E000000, 0x04000000, ignore /* SVE_02 */ },
+{ 0x1E000000, 0x06000000, ignore /* UNALLOC_03 */ },
+{ 0x1E000000, 0x08000000, check_target /* LOAD_STORE_04 */ },
+{ 0x1E000000, 0x0A000000, check_target /* DP_REGISTER_05 */ },
+{ 0x1E000000, 0x0C000000, ignore /* LOAD_STORE_06 */ },
+{ 0x1E000000, 0x0E000000, ignore /* SIMD_FP_07 */ },
+{ 0x1E000000, 0x12000000, check_target /* DP_IMMEDIATE_09 */ },
+{ 0x1E000000, 0x10000000, check_target /* DP_IMMEDIATE_08 */ },
+{ 0x1E000000, 0x14000000, check_target /* BR_SYS_10 */ },
+{ 0x1E000000, 0x16000000, check_target /* BR_SYS_11 */ },
+{ 0x1E000000, 0x18000000, check_target /* LOAD_STORE_12 */ },
+{ 0x1E000000, 0x1A000000, ignore /* DP_REGISTER_13 */ },
+{ 0x1E000000, 0x1C000000, check_target /* LOAD_STORE_14 */ },
+{ 0x1E000000, 0x1E000000, ignore /* SIMD_FP_15 */ },
+};
+unsigned int nclass = ARRAY_SIZE(class_array);
+
+static inline s64 sign_extend(s64 imm, unsigned int bits)
+{
+ return (imm << (64 - bits)) >> (64 - bits);
+}
int arch_decode_instruction(struct objtool_file *file,
const struct section *sec,
unsigned long offset, unsigned int maxlen,
unsigned int *len, enum insn_type *type,
unsigned long *immediate,
- struct list_head *ops_list)
+ struct list_head *ops)
{
+ struct decode *decode;
+ struct decode_var var;
+ struct class *class;
+ unsigned int width, mask, mult, i;
+
+ if (maxlen < INSN_SIZE)
+ return -1;
+ *len = INSN_SIZE;
+
+ var.insn = *(u32 *)(sec->data->d_buf + offset);
+ var.type = INSN_OTHER;
+ var.imm = 0;
+ var.ops = ops;
+
+ *type = INSN_OTHER;
+
+ /* Decode the instruction, if listed. */
+ for (i = 0; i < ndecode; i++) {
+ decode = &decode_array[i];
+
+ if ((var.insn & decode->opmask) != decode->op)
+ continue;
+
+ /* Extract addressing mode (for some instructions). */
+ var.mode1 = 0;
+ var.mode2 = 0;
+ if (decode->mode1)
+ var.mode1 = (var.insn >> decode->mode1) & 1;
+ if (decode->mode2)
+ var.mode2 = (var.insn >> decode->mode2) & 1;
+
+ /* Determine target register width. */
+ width = decode->width;
+ if (decode->width < 0) /* 'width' is unsigned; test the table value */
+ width = (var.insn & (1 << -decode->width)) ? 32 : 64;
+ else if (width < 32)
+ width = (var.insn & (1 << width)) ? 64 : 32;
+
+ /*
+ * If the target register width is 32 bits, set the check flag
+ * so that the target registers are checked to make sure they
+ * are not the FP or the RA. We should not be using 32-bit
+ * values in these registers.
+ */
+ var.check_reg = (width == 32);
+
+ /* Extract the immediate value. */
+ mask = (1 << decode->bits) - 1;
+ var.imm = (var.insn >> decode->shift) & mask;
+ if (decode->sign_extend)
+ var.imm = sign_extend(var.imm, decode->bits);
+
+ /* Scale the immediate value. */
+ mult = decode->mult;
+ if (!mult)
+ mult = (width == 32) ? 4 : 8;
+ var.imm *= mult;
+
+ /* Decode the instruction. */
+ decode->func(decode, &var);
+ goto out;
+ }
+
+ /*
+ * Sanity check to make sure that the compiler has not generated
+ * code that modifies the FP or the RA in an unexpected way.
+ */
+ for (i = 0; i < nclass; i++) {
+ class = &class_array[i];
+ if ((var.insn & class->opmask) == class->op) {
+ class->check(&var);
+ goto out;
+ }
+ }
+out:
+ *immediate = var.imm;
+ *type = var.type;
return 0;
}
diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h
index beb2f3aa94ff..3c2f8c1b8265 100644
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -29,6 +29,8 @@ enum insn_type {
INSN_TRAP,
INSN_ENDBR,
INSN_OTHER,
+ INSN_START,
+ INSN_UNRELIABLE,
};
enum op_dest_type {
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
Invoke decode_instructions() from check(). For Dynamic Validation of
the frame pointer, we only need the "-s" option for objtool.
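For readers new to the decoder that decode_instructions() drives, the
following is a minimal standalone sketch of the mask/opcode matching used
by the ARM64 decode table. It is not the objtool code: the toy_* names and
the enum are hypothetical, and only the B, BL and RET mask/opcode pairs are
taken from decode_array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes; the patch uses insn_type values instead. */
enum toy_kind { TOY_NONE, TOY_B, TOY_BL, TOY_RET };

/* Cut-down table entry: only the mask/opcode pair from decode_array. */
struct toy_decode {
	uint32_t mask;
	uint32_t opcode;
	enum toy_kind kind;
};

/* Mask/opcode values copied from the B, BL and RET rows of decode_array. */
static const struct toy_decode toy_table[] = {
	{ 0xFC000000, 0x14000000, TOY_B   },
	{ 0xFC000000, 0x94000000, TOY_BL  },
	{ 0xFFFFFC1F, 0xD65F0000, TOY_RET },
};

/* Return the kind of the first entry whose masked bits match, else TOY_NONE. */
static enum toy_kind toy_lookup(uint32_t insn)
{
	size_t i;

	for (i = 0; i < sizeof(toy_table) / sizeof(toy_table[0]); i++) {
		if ((insn & toy_table[i].mask) == toy_table[i].opcode)
			return toy_table[i].kind;
	}
	return TOY_NONE;
}

/* Usage: 0xD65F03C0 encodes "ret" (x30) and 0xD503201F encodes "nop". */
static void toy_lookup_demo(void)
{
	assert(toy_lookup(0xD65F03C0) == TOY_RET);
	assert(toy_lookup(0xD503201F) == TOY_NONE);
}
```

The first matching row wins, so the order of rows matters whenever one
mask is a subset of another.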
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
tools/objtool/dcheck.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/objtool/dcheck.c b/tools/objtool/dcheck.c
index e2098c9ad282..cd2700153408 100644
--- a/tools/objtool/dcheck.c
+++ b/tools/objtool/dcheck.c
@@ -9,8 +9,13 @@
#include <sys/mman.h>
#include <objtool/objtool.h>
+#include <objtool/builtin.h>
+#include <objtool/insn.h>
int check(struct objtool_file *file)
{
- return 0;
+ if (!opts.stackval)
+ return 1;
+
+ return decode_instructions(file);
}
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
Compute the destination address of each call and jump instruction after
decoding all the instructions.
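As an illustration of the arithmetic involved, here is a small sketch of
how the destination of a direct B/BL can be derived from the instruction
word and its offset. It assumes the 26-bit immediate at bit 0 scaled by 4,
as in the B and BL rows of the decode table; the toy_* helpers are
hypothetical stand-ins for sign_extend() and arch_jump_destination(), not
the objtool functions themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 'bits' bits of imm (same effect as sign_extend()). */
static int64_t toy_sign_extend(uint64_t imm, unsigned int bits)
{
	uint64_t sign;

	assert(bits > 0 && bits < 64);
	sign = 1ULL << (bits - 1);
	return (int64_t)((imm ^ sign) - sign);
}

/* Destination of an A64 B/BL located at byte 'offset' in its section. */
static uint64_t toy_branch_dest(uint64_t offset, uint32_t insn)
{
	uint64_t imm26 = insn & 0x03FFFFFF;            /* shift 0, 26 bits */
	int64_t disp = toy_sign_extend(imm26, 26) * 4; /* multiplier 4 */

	return offset + (uint64_t)disp;
}

/* Usage: a forward branch by 4 instructions and a backward branch by 1. */
static void toy_branch_demo(void)
{
	assert(toy_branch_dest(0x100, 0x14000004) == 0x110);
	assert(toy_branch_dest(0x100, 0x97FFFFFF) == 0xFC);
}
```

The relocation cases handled by add_jump_destinations() are not covered
here; the sketch only corresponds to the path where no relocation exists.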
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
tools/objtool/arch/arm64/decode.c | 12 ++++++++
tools/objtool/dcheck.c | 47 ++++++++++++++++++++++++++++++-
2 files changed, 58 insertions(+), 1 deletion(-)
diff --git a/tools/objtool/arch/arm64/decode.c b/tools/objtool/arch/arm64/decode.c
index aaae16791807..81653ed3c323 100644
--- a/tools/objtool/arch/arm64/decode.c
+++ b/tools/objtool/arch/arm64/decode.c
@@ -20,6 +20,18 @@
/* ARM64 instructions are all 4 bytes wide. */
#define INSN_SIZE 4
+/* --------------------- arch support functions ------------------------- */
+
+unsigned long arch_dest_reloc_offset(int addend)
+{
+ return addend;
+}
+
+unsigned long arch_jump_destination(struct instruction *insn)
+{
+ return insn->offset + insn->immediate;
+}
+
/* --------------------- instruction decode structs ------------------------ */
struct decode_var {
diff --git a/tools/objtool/dcheck.c b/tools/objtool/dcheck.c
index cd2700153408..eb806a032a32 100644
--- a/tools/objtool/dcheck.c
+++ b/tools/objtool/dcheck.c
@@ -12,10 +12,55 @@
#include <objtool/builtin.h>
#include <objtool/insn.h>
+/*
+ * Find the destination instructions for all jumps.
+ */
+static void add_jump_destinations(struct objtool_file *file)
+{
+ struct instruction *insn;
+ struct reloc *reloc;
+ struct section *dest_sec;
+ unsigned long dest_off;
+
+ for_each_insn(file, insn) {
+ if (insn->type != INSN_CALL &&
+ insn->type != INSN_JUMP_CONDITIONAL &&
+ insn->type != INSN_JUMP_UNCONDITIONAL) {
+ continue;
+ }
+
+ reloc = insn_reloc(file, insn);
+ if (!reloc) {
+ dest_sec = insn->sec;
+ dest_off = arch_jump_destination(insn);
+ } else if (reloc->sym->type == STT_SECTION) {
+ dest_sec = reloc->sym->sec;
+ dest_off = arch_dest_reloc_offset(reloc->addend);
+ } else if (reloc->sym->sec->idx) {
+ dest_sec = reloc->sym->sec;
+ dest_off = reloc->sym->sym.st_value +
+ arch_dest_reloc_offset(reloc->addend);
+ } else {
+ /* non-func asm code jumping to another file */
+ continue;
+ }
+
+ insn->jump_dest = find_insn(file, dest_sec, dest_off);
+ }
+}
+
int check(struct objtool_file *file)
{
+ int ret;
+
if (!opts.stackval)
return 1;
- return decode_instructions(file);
+ ret = decode_instructions(file);
+ if (ret)
+ return ret;
+
+ add_jump_destinations(file);
+
+ return 0;
}
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
Implement arch_initial_func_cfi_state() to initialize the CFI for a
function.
Add code to check() in dcheck.c to walk the instructions in every function
and compute the CFI information for each instruction.
Perform the following checks to validate the CFI:
- Make sure that there is exactly one frame pointer prolog for
an epilog.
- Make sure that the frame pointer register is initialized to
the location at which the previous frame pointer is stored
on the stack.
- Make sure that the frame pointer is restored in the epilog
from the same location on stack where it was saved.
- Make sure that the return address is restored in the epilog
from the same location on stack where it was saved.
- Make sure that the frame pointer and return address are saved
on the stack adjacent to each other in the correct order as
specified in the ABI.
- If an instruction can be reached via two different code paths,
make sure that the CFIs computed from traversing each path match
for the instruction.
- Every time the frame pointer or stack offset is changed, make sure
the offsets have legal values.
insn_cfi_match() is used to compare CFIs to see if they match. When there
is a mismatch, the function emits error messages. With static checking,
these errors result in failure. With dynamic checking, these errors
only mark the affected instructions as unreliable for unwind.
In the latter case, suppress the warning messages.
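To make the prolog-side checks concrete, below is a simplified sketch of
the bookkeeping, loosely modelled on update_cfi_state(). It is an
illustration under assumptions, not the patch's code: the toy_* types are
hypothetical, only four kinds of stack operation are modelled, and the
epilog checks are left out. Offsets are relative to the CFA, and 0 means
"not recorded yet".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stack operations seen in a prolog. */
enum toy_op_kind { TOY_SP_ADD, TOY_STORE_FP, TOY_STORE_RA, TOY_SET_FP };

struct toy_op {
	enum toy_op_kind kind;
	int imm; /* SP adjustment, or offset from SP */
};

struct toy_cfi {
	int cfa_offset; /* how far SP is below the CFA */
	int fp_saved;   /* where the caller's FP is stored */
	int ra_saved;   /* where the return address is stored */
	int fp_reg;     /* where the FP register points */
};

static bool toy_update(struct toy_cfi *cfi, const struct toy_op *op)
{
	int loc = -cfi->cfa_offset + op->imm; /* unused for TOY_SP_ADD */

	switch (op->kind) {
	case TOY_SP_ADD:
		cfi->cfa_offset -= op->imm;
		break;
	case TOY_STORE_FP:
		cfi->fp_saved = loc;
		break;
	case TOY_STORE_RA:
		/* Record the RA slot only if it sits right above the FP slot. */
		if (cfi->fp_saved && loc - cfi->fp_saved == 8)
			cfi->ra_saved = loc;
		break;
	case TOY_SET_FP:
		/* FP must be set once, to the slot holding the old FP. */
		if (cfi->fp_reg || loc != cfi->fp_saved)
			return false;
		cfi->fp_reg = loc;
		break;
	}
	/* Offsets must stay on the expected side of the CFA. */
	return cfi->cfa_offset >= 0 && cfi->fp_saved <= 0 &&
	       cfi->ra_saved <= 0 && cfi->fp_reg <= 0;
}

/* Model "stp x29, x30, [sp, #-16]!" then "add x29, sp, #fp_from_sp". */
static bool toy_prolog_ok(int fp_from_sp)
{
	const struct toy_op ops[] = {
		{ TOY_SP_ADD, -16 }, { TOY_STORE_FP, 0 },
		{ TOY_STORE_RA, 8 }, { TOY_SET_FP, fp_from_sp },
	};
	struct toy_cfi cfi = { 0 };
	unsigned int i;

	for (i = 0; i < sizeof(ops) / sizeof(ops[0]); i++) {
		if (!toy_update(&cfi, &ops[i]))
			return false;
	}
	return cfi.ra_saved - cfi.fp_saved == 8;
}

/* Usage: FP pointing at the saved FP passes; FP pointing 8 bytes off fails. */
static void toy_prolog_demo(void)
{
	assert(toy_prolog_ok(0));
	assert(!toy_prolog_ok(8));
}
```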
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
tools/objtool/arch/arm64/decode.c | 15 ++
tools/objtool/check.c | 2 +-
tools/objtool/dcheck.c | 287 +++++++++++++++++++++++++++
tools/objtool/include/objtool/insn.h | 3 +-
tools/objtool/insn.c | 39 ++--
5 files changed, 329 insertions(+), 17 deletions(-)
diff --git a/tools/objtool/arch/arm64/decode.c b/tools/objtool/arch/arm64/decode.c
index 81653ed3c323..f723be80c09a 100644
--- a/tools/objtool/arch/arm64/decode.c
+++ b/tools/objtool/arch/arm64/decode.c
@@ -22,6 +22,21 @@
/* --------------------- arch support functions ------------------------- */
+void arch_initial_func_cfi_state(struct cfi_init_state *state)
+{
+ int i;
+
+ for (i = 0; i < CFI_NUM_REGS; i++) {
+ state->regs[i].base = CFI_UNDEFINED;
+ state->regs[i].offset = 0;
+ }
+ state->regs[CFI_FP].base = CFI_CFA;
+
+ /* initial CFA (call frame address) */
+ state->cfa.base = CFI_SP;
+ state->cfa.offset = 0;
+}
+
unsigned long arch_dest_reloc_offset(int addend)
{
return addend;
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index d14a2b7b8b37..94efe94a566e 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2863,7 +2863,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
visited = VISITED_BRANCH << state.uaccess;
if (insn->visited & VISITED_BRANCH_MASK) {
- if (!insn->hint && !insn_cfi_match(insn, &state.cfi))
+ if (!insn->hint && !insn_cfi_match(insn, &state.cfi, true))
return 1;
if (insn->visited & visited)
diff --git a/tools/objtool/dcheck.c b/tools/objtool/dcheck.c
index eb806a032a32..8b78cb608528 100644
--- a/tools/objtool/dcheck.c
+++ b/tools/objtool/dcheck.c
@@ -49,6 +49,283 @@ static void add_jump_destinations(struct objtool_file *file)
}
}
+static bool update_cfi_state(struct cfi_state *cfi, struct stack_op *op)
+{
+ struct cfi_reg *cfa = &cfi->cfa;
+ struct cfi_reg *fp_reg = &cfi->regs[CFI_FP];
+ struct cfi_reg *fp_val = &cfi->vals[CFI_FP];
+ struct cfi_reg *ra_val = &cfi->vals[CFI_RA];
+ enum op_src_type src_type = op->src.type;
+ enum op_dest_type dest_type = op->dest.type;
+ unsigned char dest_reg = op->dest.reg;
+ int offset;
+
+ if (src_type == OP_SRC_ADD && dest_type == OP_DEST_REG) {
+
+ if (op->src.reg == CFI_SP) {
+ if (op->dest.reg == CFI_SP) {
+ cfa->offset -= op->src.offset;
+ } else {
+ if (fp_reg->offset) {
+ /* FP is already set. */
+ return false;
+ }
+ fp_reg->offset = -cfa->offset + op->src.offset;
+ if (fp_reg->offset != fp_val->offset) {
+ /*
+ * FP does not match the location
+ * where FP is stored on stack.
+ */
+ return false;
+ }
+ }
+ } else {
+ if (op->dest.reg == CFI_SP) {
+ cfa->offset =
+ -(fp_reg->offset + op->src.offset);
+ } else {
+ /* Setting the FP from itself is unreliable. */
+ return false;
+ }
+ }
+ /*
+ * When the stack pointer is restored in the frame pointer
+ * epilog, forget where the FP and RA were stored.
+ */
+ if (cfa->offset < -fp_val->offset)
+ fp_val->offset = 0;
+ if (cfa->offset < -ra_val->offset)
+ ra_val->offset = 0;
+ goto out;
+ }
+
+ if (src_type == OP_SRC_REG_INDIRECT && dest_type == OP_DEST_REG) {
+ offset = -cfa->offset + op->src.offset;
+ if (dest_reg == CFI_FP) {
+ if (!fp_val->offset || fp_val->offset != offset) {
+ /*
+ * Loading the FP from a different place than
+ * where it is stored.
+ */
+ return false;
+ }
+ if (!ra_val->offset ||
+ (ra_val->offset - fp_val->offset) != 8) {
+ /* FP and RA must be adjacent in a frame. */
+ return false;
+ }
+ fp_reg->offset = 0;
+ }
+ goto out;
+ }
+
+ if (src_type == OP_SRC_REG && dest_type == OP_DEST_REG_INDIRECT) {
+ offset = -cfa->offset + op->dest.offset;
+ if (dest_reg == CFI_FP) {
+ /* Record where the FP is stored on the stack. */
+ fp_val->offset = offset;
+ } else {
+ /* Record where the RA is stored on the stack. */
+ if (fp_val->offset && (offset - fp_val->offset) == 8)
+ ra_val->offset = offset;
+ }
+ goto out;
+ }
+ return false;
+out:
+ if (cfa->offset < 0 || fp_reg->offset > 0 ||
+ fp_val->offset > 0 || ra_val->offset > 0) {
+ /* Unexpected SP and FP offset values. */
+ return false;
+ }
+ return true;
+}
+
+static bool do_stack_ops(struct instruction *insn, struct insn_state *state)
+{
+ struct stack_op *op;
+
+ list_for_each_entry(op, &insn->stack_ops, list) {
+ if (!update_cfi_state(&state->cfi, op))
+ return false;
+ }
+ return true;
+}
+
+static bool validate_branch(struct objtool_file *file, struct section *sec,
+ struct symbol *func, struct instruction *insn,
+ struct insn_state *state)
+{
+ struct symbol *insn_func = insn->func;
+ struct instruction *dest;
+ struct cfi_state save_cfi;
+ struct cfi_reg *cfa;
+ struct cfi_reg *regs;
+ unsigned long start, end;
+
+ for (; insn; insn = next_insn_same_sec(file, insn)) {
+
+ if (insn->func != insn_func)
+ return true;
+
+ if (insn->cfi)
+ return insn_cfi_match(insn, &state->cfi, false);
+
+ insn->cfi = cfi_hash_find_or_add(&state->cfi);
+ dest = insn->jump_dest;
+
+ if (!do_stack_ops(insn, state))
+ return false;
+
+ switch (insn->type) {
+ case INSN_BUG:
+ return true;
+
+ case INSN_UNRELIABLE:
+ return false;
+
+ case INSN_RETURN:
+ cfa = &state->cfi.cfa;
+ regs = state->cfi.regs;
+ if (cfa->offset || regs[CFI_FP].offset) {
+ /* SP and FP offsets should be 0 on return. */
+ return false;
+ }
+ return true;
+
+ case INSN_CALL:
+ case INSN_CALL_DYNAMIC:
+ start = func->offset;
+ end = start + func->len;
+ /* Treat intra-function calls as jumps. */
+ if (!dest || dest->sec != sec ||
+ dest->offset <= start || dest->offset >= end) {
+ break;
+ }
+ /* Intra-function call: fall through and treat it as a jump. */
+ case INSN_JUMP_UNCONDITIONAL:
+ case INSN_JUMP_CONDITIONAL:
+ case INSN_JUMP_DYNAMIC:
+ if (dest) {
+ save_cfi = state->cfi;
+ if (!validate_branch(file, sec, func, dest,
+ state)) {
+ return false;
+ }
+ state->cfi = save_cfi;
+ }
+ if (insn->type == INSN_JUMP_UNCONDITIONAL ||
+ insn->type == INSN_JUMP_DYNAMIC) {
+ return true;
+ }
+ break;
+
+ default:
+ break;
+ }
+ }
+ return true;
+}
+
+static bool walk_reachable(struct objtool_file *file, struct section *sec,
+ struct symbol *func)
+{
+ struct instruction *insn = find_insn(file, sec, func->offset);
+ struct insn_state state;
+
+ func_for_each_insn(file, func, insn) {
+
+ if (insn->offset != func->offset &&
+ (insn->type != INSN_START || insn->cfi)) {
+ continue;
+ }
+
+ init_insn_state(file, &state, sec);
+ set_func_state(&state.cfi);
+
+ if (!validate_branch(file, sec, func, insn, &state))
+ return false;
+ }
+ return true;
+}
+
+static void remove_cfi(struct objtool_file *file, struct symbol *func)
+{
+ struct instruction *insn;
+
+ func_for_each_insn(file, func, insn) {
+ insn->cfi = NULL;
+ }
+}
+
+/*
+ * Instructions that were not visited by walk_reachable() would not have a
+ * CFI. Try to initialize their CFI. For instance, there could be a table of
+ * unconditional branches like for a switch statement. Or, code can be patched
+ * by the kernel at runtime. After patching, some of the previously unreachable
+ * code may become reachable.
+ *
+ * This follows the same pattern as the DWARF info generated by the compiler.
+ */
+static bool walk_unreachable(struct objtool_file *file, struct section *sec,
+ struct symbol *func)
+{
+ struct instruction *insn, *prev;
+ struct insn_state state;
+
+ func_for_each_insn(file, func, insn) {
+
+ if (insn->cfi)
+ continue;
+
+ prev = list_prev_entry(insn, list);
+ if (!prev || prev->func != insn->func || !prev->cfi)
+ continue;
+
+ if (prev->type != INSN_JUMP_UNCONDITIONAL &&
+ prev->type != INSN_JUMP_DYNAMIC &&
+ prev->type != INSN_BUG) {
+ continue;
+ }
+
+ /* Propagate the CFI. */
+ state.cfi = *prev->cfi;
+ if (!validate_branch(file, sec, func, insn, &state))
+ return false;
+ }
+ return true;
+}
+
+static void walk_section(struct objtool_file *file, struct section *sec)
+{
+ struct symbol *func;
+
+ list_for_each_entry(func, &sec->symbol_list, list) {
+
+ if (func->type != STT_FUNC || !func->len ||
+ func->pfunc != func || func->alias != func) {
+ /* No CFI generated for this function. */
+ continue;
+ }
+
+ if (!walk_reachable(file, sec, func) ||
+ !walk_unreachable(file, sec, func)) {
+ remove_cfi(file, func);
+ continue;
+ }
+ }
+}
+
+static void walk_sections(struct objtool_file *file)
+{
+ struct section *sec;
+
+ for_each_sec(file, sec) {
+ if (sec->sh.sh_flags & SHF_EXECINSTR)
+ walk_section(file, sec);
+ }
+}
+
int check(struct objtool_file *file)
{
int ret;
@@ -56,11 +333,21 @@ int check(struct objtool_file *file)
if (!opts.stackval)
return 1;
+ arch_initial_func_cfi_state(&initial_func_cfi);
+
+ if (!cfi_hash_alloc(1UL << (file->elf->symbol_bits - 3)))
+ return -1;
+
ret = decode_instructions(file);
if (ret)
return ret;
add_jump_destinations(file);
+ if (list_empty(&file->insn_list))
+ return 0;
+
+ walk_sections(file);
+
return 0;
}
diff --git a/tools/objtool/include/objtool/insn.h b/tools/objtool/include/objtool/insn.h
index cfd1ae7e2e8e..3a43a591b318 100644
--- a/tools/objtool/include/objtool/insn.h
+++ b/tools/objtool/include/objtool/insn.h
@@ -84,7 +84,8 @@ struct instruction *next_insn_same_sec(struct objtool_file *file,
struct instruction *next_insn_same_func(struct objtool_file *file,
struct instruction *insn);
struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn);
-bool insn_cfi_match(struct instruction *insn, struct cfi_state *cfi2);
+bool insn_cfi_match(struct instruction *insn, struct cfi_state *cfi2,
+ bool print);
bool same_function(struct instruction *insn1, struct instruction *insn2);
bool is_first_func_insn(struct objtool_file *file, struct instruction *insn);
diff --git a/tools/objtool/insn.c b/tools/objtool/insn.c
index e570b46ad39e..be3617d55aea 100644
--- a/tools/objtool/insn.c
+++ b/tools/objtool/insn.c
@@ -135,7 +135,8 @@ bool is_first_func_insn(struct objtool_file *file, struct instruction *insn)
return false;
}
-bool insn_cfi_match(struct instruction *insn, struct cfi_state *cfi2)
+bool insn_cfi_match(struct instruction *insn, struct cfi_state *cfi2,
+ bool print)
{
struct cfi_state *cfi1 = insn->cfi;
int i;
@@ -147,10 +148,12 @@ bool insn_cfi_match(struct instruction *insn, struct cfi_state *cfi2)
if (memcmp(&cfi1->cfa, &cfi2->cfa, sizeof(cfi1->cfa))) {
- WARN_FUNC("stack state mismatch: cfa1=%d%+d cfa2=%d%+d",
- insn->sec, insn->offset,
- cfi1->cfa.base, cfi1->cfa.offset,
- cfi2->cfa.base, cfi2->cfa.offset);
+ if (print) {
+ WARN_FUNC("stack state mismatch: cfa1=%d%+d cfa2=%d%+d",
+ insn->sec, insn->offset,
+ cfi1->cfa.base, cfi1->cfa.offset,
+ cfi2->cfa.base, cfi2->cfa.offset);
+ }
} else if (memcmp(&cfi1->regs, &cfi2->regs, sizeof(cfi1->regs))) {
for (i = 0; i < CFI_NUM_REGS; i++) {
@@ -158,26 +161,32 @@ bool insn_cfi_match(struct instruction *insn, struct cfi_state *cfi2)
sizeof(struct cfi_reg)))
continue;
- WARN_FUNC("stack state mismatch: reg1[%d]=%d%+d reg2[%d]=%d%+d",
- insn->sec, insn->offset,
- i, cfi1->regs[i].base, cfi1->regs[i].offset,
- i, cfi2->regs[i].base, cfi2->regs[i].offset);
+ if (print) {
+ WARN_FUNC("stack state mismatch: reg1[%d]=%d%+d reg2[%d]=%d%+d",
+ insn->sec, insn->offset,
+ i, cfi1->regs[i].base, cfi1->regs[i].offset,
+ i, cfi2->regs[i].base, cfi2->regs[i].offset);
+ }
break;
}
} else if (cfi1->type != cfi2->type) {
- WARN_FUNC("stack state mismatch: type1=%d type2=%d",
- insn->sec, insn->offset, cfi1->type, cfi2->type);
+ if (print) {
+ WARN_FUNC("stack state mismatch: type1=%d type2=%d",
+ insn->sec, insn->offset, cfi1->type, cfi2->type);
+ }
} else if (cfi1->drap != cfi2->drap ||
(cfi1->drap && cfi1->drap_reg != cfi2->drap_reg) ||
(cfi1->drap && cfi1->drap_offset != cfi2->drap_offset)) {
- WARN_FUNC("stack state mismatch: drap1=%d(%d,%d) drap2=%d(%d,%d)",
- insn->sec, insn->offset,
- cfi1->drap, cfi1->drap_reg, cfi1->drap_offset,
- cfi2->drap, cfi2->drap_reg, cfi2->drap_offset);
+ if (print) {
+ WARN_FUNC("stack state mismatch: drap1=%d(%d,%d) drap2=%d(%d,%d)",
+ insn->sec, insn->offset,
+ cfi1->drap, cfi1->drap_reg, cfi1->drap_offset,
+ cfi2->drap, cfi2->drap_reg, cfi2->drap_offset);
+ }
} else
return true;
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
Enable ORC data for ARM64.
Call orc_create() from check() in dcheck.c to generate the ORC sections in
object files for dynamic frame pointer validation.
Define support functions for ORC data creation.
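The sketch below shows one plausible way a consumer could use the two
offsets that init_orc_entry() fills in: derive the CFA from SP, derive the
expected FP from the CFA, and compare it with the actual FP. Whether the
kernel-side unwinder performs exactly this comparison is an assumption
here; struct toy_orc and toy_fp_matches() are hypothetical names, not part
of this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of an ORC entry: only the two offsets. */
struct toy_orc {
	int sp_offset; /* CFA = SP + sp_offset */
	int fp_offset; /* expected FP = CFA + fp_offset; 0 = no frame yet */
};

static bool toy_fp_matches(uint64_t sp, uint64_t fp, const struct toy_orc *orc)
{
	uint64_t cfa;

	if (!orc->fp_offset)
		return false; /* frame pointer not set up: unreliable */

	cfa = sp + (uint64_t)(int64_t)orc->sp_offset;
	return cfa + (uint64_t)(int64_t)orc->fp_offset == fp;
}

/* Usage: a 16-byte frame whose FP points at the saved FP/RA pair. */
static void toy_orc_demo(void)
{
	const struct toy_orc orc = { .sp_offset = 16, .fp_offset = -16 };
	const struct toy_orc none = { 0, 0 };

	assert(toy_fp_matches(0x1000, 0x1000, &orc));
	assert(!toy_fp_matches(0x1000, 0x1008, &orc));
	assert(!toy_fp_matches(0x1000, 0x1000, &none));
}
```

A zeroed entry is treated as unreliable, which mirrors the early return in
init_orc_entry() when the frame pointer has not been set up.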
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
arch/arm64/include/asm/orc_types.h | 35 +++++++++
tools/arch/arm64/include/asm/orc_types.h | 35 +++++++++
tools/objtool/Makefile | 1 +
tools/objtool/arch/arm64/Build | 1 +
tools/objtool/arch/arm64/include/arch/elf.h | 9 +++
tools/objtool/arch/arm64/orc.c | 86 +++++++++++++++++++++
tools/objtool/dcheck.c | 5 +-
tools/objtool/include/objtool/insn.h | 1 +
tools/objtool/include/objtool/objtool.h | 1 +
tools/objtool/insn.c | 20 +++++
tools/objtool/orc_gen.c | 12 ++-
tools/objtool/sync-check.sh | 7 ++
12 files changed, 210 insertions(+), 3 deletions(-)
create mode 100644 arch/arm64/include/asm/orc_types.h
create mode 100644 tools/arch/arm64/include/asm/orc_types.h
create mode 100644 tools/objtool/arch/arm64/include/arch/elf.h
create mode 100644 tools/objtool/arch/arm64/orc.c
diff --git a/arch/arm64/include/asm/orc_types.h b/arch/arm64/include/asm/orc_types.h
new file mode 100644
index 000000000000..c7bb690ca7d9
--- /dev/null
+++ b/arch/arm64/include/asm/orc_types.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Author: Madhavan T. Venkataraman ([email protected])
+ *
+ * Copyright (C) 2022 Microsoft Corporation
+ */
+
+#ifndef _ORC_TYPES_H
+#define _ORC_TYPES_H
+
+#include <linux/types.h>
+#include <linux/compiler.h>
+#include <linux/orc_entry.h>
+
+/*
+ * The ORC_REG_* registers are base registers which are used to find other
+ * registers on the stack.
+ *
+ * ORC_REG_PREV_SP, also known as DWARF Call Frame Address (CFA), is the
+ * address of the previous frame: the caller's SP before it called the current
+ * function.
+ *
+ * ORC_REG_UNDEFINED means the corresponding register's value didn't change in
+ * the current frame.
+ *
+ * We only use base registers SP and FP -- which the previous SP is based on --
+ * and PREV_SP and UNDEFINED -- which the previous FP is based on.
+ */
+#define ORC_REG_UNDEFINED 0
+#define ORC_REG_PREV_SP 1
+#define ORC_REG_SP 2
+#define ORC_REG_FP 3
+#define ORC_REG_MAX 4
+
+#endif /* _ORC_TYPES_H */
diff --git a/tools/arch/arm64/include/asm/orc_types.h b/tools/arch/arm64/include/asm/orc_types.h
new file mode 100644
index 000000000000..c7bb690ca7d9
--- /dev/null
+++ b/tools/arch/arm64/include/asm/orc_types.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Author: Madhavan T. Venkataraman ([email protected])
+ *
+ * Copyright (C) 2022 Microsoft Corporation
+ */
+
+#ifndef _ORC_TYPES_H
+#define _ORC_TYPES_H
+
+#include <linux/types.h>
+#include <linux/compiler.h>
+#include <linux/orc_entry.h>
+
+/*
+ * The ORC_REG_* registers are base registers which are used to find other
+ * registers on the stack.
+ *
+ * ORC_REG_PREV_SP, also known as DWARF Call Frame Address (CFA), is the
+ * address of the previous frame: the caller's SP before it called the current
+ * function.
+ *
+ * ORC_REG_UNDEFINED means the corresponding register's value didn't change in
+ * the current frame.
+ *
+ * We only use base registers SP and FP -- which the previous SP is based on --
+ * and PREV_SP and UNDEFINED -- which the previous FP is based on.
+ */
+#define ORC_REG_UNDEFINED 0
+#define ORC_REG_PREV_SP 1
+#define ORC_REG_SP 2
+#define ORC_REG_FP 3
+#define ORC_REG_MAX 4
+
+#endif /* _ORC_TYPES_H */
diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile
index 92583b82eb78..14bb324d9385 100644
--- a/tools/objtool/Makefile
+++ b/tools/objtool/Makefile
@@ -47,6 +47,7 @@ ifeq ($(SRCARCH),x86)
endif
ifeq ($(SRCARCH),arm64)
+ BUILD_ORC := y
DYNAMIC_CHECK := y
endif
diff --git a/tools/objtool/arch/arm64/Build b/tools/objtool/arch/arm64/Build
index 3ff1f00c6a47..8615abfb12cf 100644
--- a/tools/objtool/arch/arm64/Build
+++ b/tools/objtool/arch/arm64/Build
@@ -1 +1,2 @@
objtool-y += decode.o
+objtool-y += orc.o
diff --git a/tools/objtool/arch/arm64/include/arch/elf.h b/tools/objtool/arch/arm64/include/arch/elf.h
new file mode 100644
index 000000000000..4ae6df2bd90c
--- /dev/null
+++ b/tools/objtool/arch/arm64/include/arch/elf.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: BSD-3-Clause OR GPL-2.0 */
+
+#ifndef _OBJTOOL_ARCH_ELF
+#define _OBJTOOL_ARCH_ELF
+
+#define R_NONE R_AARCH64_NONE
+#define R_PCREL R_AARCH64_PREL32
+
+#endif /* _OBJTOOL_ARCH_ELF */
diff --git a/tools/objtool/arch/arm64/orc.c b/tools/objtool/arch/arm64/orc.c
new file mode 100644
index 000000000000..cef14114e1ec
--- /dev/null
+++ b/tools/objtool/arch/arm64/orc.c
@@ -0,0 +1,86 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Author: Madhavan T. Venkataraman ([email protected])
+ *
+ * Copyright (C) 2022 Microsoft Corporation
+ */
+#include <string.h>
+
+#include <linux/objtool.h>
+
+#include <objtool/insn.h>
+#include <objtool/orc.h>
+
+int init_orc_entry(struct orc_entry *orc, struct cfi_state *cfi,
+ struct instruction *insn)
+{
+ struct cfi_reg *fp = &cfi->regs[CFI_FP];
+
+ memset(orc, 0, sizeof(*orc));
+
+ orc->sp_reg = ORC_REG_SP;
+ orc->fp_reg = ORC_REG_PREV_SP;
+
+ if (!cfi || cfi->cfa.base == CFI_UNDEFINED ||
+ (cfi->type == UNWIND_HINT_TYPE_CALL && !fp->offset)) {
+ /*
+ * The frame pointer has not been set up. This instruction is
+ * unreliable from an unwind perspective.
+ */
+ return 0;
+ }
+
+ orc->sp_offset = cfi->cfa.offset;
+ orc->fp_offset = fp->offset;
+ orc->type = cfi->type;
+ orc->end = cfi->end;
+
+ return 0;
+}
+
+static const char *reg_name(unsigned int reg)
+{
+ switch (reg) {
+ case ORC_REG_PREV_SP:
+ return "cfa";
+ case ORC_REG_FP:
+ return "x29";
+ case ORC_REG_SP:
+ return "sp";
+ default:
+ return "?";
+ }
+}
+
+const char *orc_type_name(unsigned int type)
+{
+ switch (type) {
+ case UNWIND_HINT_TYPE_CALL:
+ return "call";
+ default:
+ return "?";
+ }
+}
+
+void orc_print_reg(unsigned int reg, int offset)
+{
+ if (reg == ORC_REG_UNDEFINED)
+ printf("(und)");
+ else
+ printf("%s%+d", reg_name(reg), offset);
+}
+
+void orc_print_sp(void)
+{
+ printf(" cfa:");
+}
+
+void orc_print_fp(void)
+{
+ printf(" x29:");
+}
+
+bool orc_ignore_section(struct section *sec)
+{
+ return !strcmp(sec->name, ".head.text");
+}
diff --git a/tools/objtool/dcheck.c b/tools/objtool/dcheck.c
index 8b78cb608528..57499752c523 100644
--- a/tools/objtool/dcheck.c
+++ b/tools/objtool/dcheck.c
@@ -349,5 +349,8 @@ int check(struct objtool_file *file)
walk_sections(file);
- return 0;
+ if (opts.orc)
+ ret = orc_create(file);
+
+ return ret;
}
diff --git a/tools/objtool/include/objtool/insn.h b/tools/objtool/include/objtool/insn.h
index 3a43a591b318..ac718f1e2d2f 100644
--- a/tools/objtool/include/objtool/insn.h
+++ b/tools/objtool/include/objtool/insn.h
@@ -84,6 +84,7 @@ struct instruction *next_insn_same_sec(struct objtool_file *file,
struct instruction *next_insn_same_func(struct objtool_file *file,
struct instruction *insn);
struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn);
+bool insn_can_reloc(struct instruction *insn);
bool insn_cfi_match(struct instruction *insn, struct cfi_state *cfi2,
bool print);
bool same_function(struct instruction *insn1, struct instruction *insn2);
diff --git a/tools/objtool/include/objtool/objtool.h b/tools/objtool/include/objtool/objtool.h
index 7f2d1b095333..b7655ad3e402 100644
--- a/tools/objtool/include/objtool/objtool.h
+++ b/tools/objtool/include/objtool/objtool.h
@@ -46,5 +46,6 @@ void objtool_pv_add(struct objtool_file *file, int idx, struct symbol *func);
int check(struct objtool_file *file);
int orc_dump(const char *objname);
int orc_create(struct objtool_file *file);
+bool orc_ignore_section(struct section *sec);
#endif /* _OBJTOOL_H */
diff --git a/tools/objtool/insn.c b/tools/objtool/insn.c
index be3617d55aea..af48319f2225 100644
--- a/tools/objtool/insn.c
+++ b/tools/objtool/insn.c
@@ -193,3 +193,23 @@ bool insn_cfi_match(struct instruction *insn, struct cfi_state *cfi2,
return false;
}
+
+/*
+ * This is a hack for Clang. Clang is aggressive about removing section
+ * symbols and then some. If we cannot find something to relocate an
+ * instruction against, we must not generate CFI for it or the ORC
+ * generation will fail later.
+ */
+bool insn_can_reloc(struct instruction *insn)
+{
+ struct section *insn_sec = insn->sec;
+ unsigned long insn_off = insn->offset;
+
+ if (insn_sec->sym ||
+ find_symbol_containing(insn_sec, insn_off) ||
+ find_symbol_containing(insn_sec, insn_off - 1)) {
+ /* See elf_add_reloc_to_insn(). */
+ return true;
+ }
+ return false;
+}
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index ea2e361ff7bc..bddf5889466f 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -14,6 +14,11 @@
#include <objtool/warn.h>
#include <objtool/endianness.h>
+bool __weak orc_ignore_section(struct section *sec)
+{
+ return false;
+}
+
static int write_orc_entry(struct elf *elf, struct section *orc_sec,
struct section *ip_sec, unsigned int idx,
struct section *insn_sec, unsigned long insn_off,
@@ -87,13 +92,16 @@ int orc_create(struct objtool_file *file)
struct instruction *insn;
bool empty = true;
- if (!sec->text)
+ if (!sec->text || orc_ignore_section(sec))
continue;
sec_for_each_insn(file, sec, insn) {
struct alt_group *alt_group = insn->alt_group;
int i;
+ if (!insn_can_reloc(insn))
+ continue;
+
if (!alt_group) {
if (init_orc_entry(&orc, insn->cfi, insn))
return -1;
@@ -137,7 +145,7 @@ int orc_create(struct objtool_file *file)
}
/* Add a section terminator */
- if (!empty) {
+ if (!empty && sec->sym) {
orc_list_add(&orc_list, &null, sec, sec->sh.sh_size);
nr++;
}
diff --git a/tools/objtool/sync-check.sh b/tools/objtool/sync-check.sh
index ef1acb064605..0d0656f6ce4a 100755
--- a/tools/objtool/sync-check.sh
+++ b/tools/objtool/sync-check.sh
@@ -29,6 +29,13 @@ arch/x86/lib/insn.c
'
fi
+if [ "$SRCARCH" = "arm64" ]; then
+FILES="$FILES
+arch/arm64/include/asm/orc_types.h
+include/linux/orc_entry.h
+"
+fi
+
check_2 () {
file1=$1
file2=$2
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
Implement the unwind hint macros for ARM64. Define the unwind hint types
as well.
Process the unwind hints section for dynamic FP validation for ARM64.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
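Note: the following is only an illustrative sketch of how a hint entry is
laid out and how its base register is mapped, in the spirit of
arch_decode_hint_reg() below. The EX_ORC_REG_* and EX_CFI_* values are
hypothetical placeholders; the real encodings come from asm/orc_types.h and
objtool's CFI definitions, which are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Same field order and widths as the struct unwind_hint in this patch. */
struct unwind_hint {
	uint32_t ip;        /* PC-relative reference to the annotated insn */
	int16_t  sp_offset; /* offset to add to the base register */
	int16_t  sp_reg;    /* base register selector */
	int16_t  type;      /* UNWIND_HINT_TYPE_* */
	int16_t  end;
};

/* One .long plus four .short directives: 12 bytes per entry. */
static_assert(sizeof(struct unwind_hint) == 12, "unexpected hint size");

/* Hypothetical register encodings (assumption, not the real values). */
enum { EX_ORC_REG_UNDEFINED = 0, EX_ORC_REG_SP = 1, EX_ORC_REG_FP = 2 };

/* Hypothetical CFI base identifiers (assumption, not objtool's values). */
enum { EX_CFI_UNDEFINED = -1, EX_CFI_SP = 31, EX_CFI_FP = 29 };

/* Returns 0 and sets *base on success, -1 for an unknown register. */
static int ex_decode_hint_reg(uint8_t sp_reg, int *base)
{
	switch (sp_reg) {
	case EX_ORC_REG_UNDEFINED:
		*base = EX_CFI_UNDEFINED;
		break;
	case EX_ORC_REG_SP:
		*base = EX_CFI_SP;
		break;
	case EX_ORC_REG_FP:
		*base = EX_CFI_FP;
		break;
	default:
		return -1;
	}
	return 0;
}
```

An unknown selector is reported as an error rather than mapped to a default,
which matches the "unsupported unwind_hint sp base reg" warning path in
read_unwind_hints().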
arch/arm64/include/asm/unwind_hints.h | 104 ++++++++++++++++++++
include/linux/objtool.h | 3 +
tools/arch/arm64/include/asm/unwind_hints.h | 104 ++++++++++++++++++++
tools/include/linux/objtool.h | 3 +
tools/objtool/Build | 2 +-
tools/objtool/arch/arm64/decode.c | 21 ++++
tools/objtool/arch/arm64/orc.c | 4 +
tools/objtool/dcheck.c | 4 +
tools/objtool/include/objtool/endianness.h | 1 +
tools/objtool/sync-check.sh | 1 +
tools/objtool/unwind_hints.c | 24 +++--
11 files changed, 260 insertions(+), 11 deletions(-)
create mode 100644 arch/arm64/include/asm/unwind_hints.h
create mode 100644 tools/arch/arm64/include/asm/unwind_hints.h
diff --git a/arch/arm64/include/asm/unwind_hints.h b/arch/arm64/include/asm/unwind_hints.h
new file mode 100644
index 000000000000..fb1b924d85bc
--- /dev/null
+++ b/arch/arm64/include/asm/unwind_hints.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _ASM_ARM64_UNWIND_HINTS_H
+#define _ASM_ARM64_UNWIND_HINTS_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/types.h>
+
+/*
+ * This struct is used by asm and inline asm code to manually annotate the
+ * CFI for an instruction. We have to use s16 instead of s8 for some of these
+ * fields as 8-bit fields are not relocated by some assemblers.
+ */
+struct unwind_hint {
+ u32 ip;
+ s16 sp_offset;
+ s16 sp_reg;
+ s16 type;
+ s16 end;
+};
+
+#endif
+
+#include <linux/objtool.h>
+
+#include "orc_types.h"
+
+#ifdef CONFIG_STACK_VALIDATION
+
+#ifndef __ASSEMBLY__
+
+#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
+ "987: \n\t" \
+ ".pushsection .discard.unwind_hints\n\t" \
+ /* struct unwind_hint */ \
+ ".long 987b - .\n\t" \
+ ".short " __stringify(sp_offset) "\n\t" \
+ ".short " __stringify(sp_reg) "\n\t" \
+ ".short " __stringify(type) "\n\t" \
+ ".short " __stringify(end) "\n\t" \
+ ".popsection\n\t"
+
+#else /* __ASSEMBLY__ */
+
+/*
+ * There are points in ASM code where it is useful to unwind through even
+ * though the ASM code itself may be unreliable from an unwind perspective.
+ * E.g., interrupt and exception handlers.
+ *
+ * These macros provide hints to objtool to compute the CFI information at
+ * such instructions.
+ */
+.macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
+.Lunwind_hint_pc_\@:
+ .pushsection .discard.unwind_hints
+ /* struct unwind_hint */
+ .long .Lunwind_hint_pc_\@ - .
+ .short \sp_offset
+ .short \sp_reg
+ .short \type
+ .short \end
+ .popsection
+.endm
+
+#endif /* __ASSEMBLY__ */
+
+#else /* !CONFIG_STACK_VALIDATION */
+
+#ifndef __ASSEMBLY__
+
+#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
+ "\n\t"
+#else
+.macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
+.endm
+#endif
+
+#endif /* CONFIG_STACK_VALIDATION */
+#ifdef __ASSEMBLY__
+
+.macro UNWIND_HINT_FTRACE, offset
+ .set sp_reg, ORC_REG_SP
+ .set sp_offset, \offset
+ .set type, UNWIND_HINT_TYPE_FTRACE
+ UNWIND_HINT sp_reg=sp_reg sp_offset=sp_offset type=type
+.endm
+
+.macro UNWIND_HINT_REGS, offset
+ .set sp_reg, ORC_REG_SP
+ .set sp_offset, \offset
+ .set type, UNWIND_HINT_TYPE_REGS
+ UNWIND_HINT sp_reg=sp_reg sp_offset=sp_offset type=type
+.endm
+
+.macro UNWIND_HINT_IRQ, offset
+ .set sp_reg, ORC_REG_SP
+ .set sp_offset, \offset
+ .set type, UNWIND_HINT_TYPE_IRQ_STACK
+ UNWIND_HINT sp_reg=sp_reg sp_offset=sp_offset type=type
+.endm
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_ARM64_UNWIND_HINTS_H */
diff --git a/include/linux/objtool.h b/include/linux/objtool.h
index 1af295efc12c..dcbd365944f6 100644
--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -17,6 +17,8 @@
* Useful for code which doesn't have an ELF function annotation.
*
* UNWIND_HINT_ENTRY: machine entry without stack, SYSCALL/SYSENTER etc.
+ *
+ * UNWIND_HINT_TYPE_IRQ_STACK: Used to unwind through the IRQ stack.
*/
#define UNWIND_HINT_TYPE_CALL 0
#define UNWIND_HINT_TYPE_REGS 1
@@ -25,6 +27,7 @@
#define UNWIND_HINT_TYPE_ENTRY 4
#define UNWIND_HINT_TYPE_SAVE 5
#define UNWIND_HINT_TYPE_RESTORE 6
+#define UNWIND_HINT_TYPE_IRQ_STACK 7
#ifdef CONFIG_OBJTOOL
diff --git a/tools/arch/arm64/include/asm/unwind_hints.h b/tools/arch/arm64/include/asm/unwind_hints.h
new file mode 100644
index 000000000000..fb1b924d85bc
--- /dev/null
+++ b/tools/arch/arm64/include/asm/unwind_hints.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _ASM_ARM64_UNWIND_HINTS_H
+#define _ASM_ARM64_UNWIND_HINTS_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/types.h>
+
+/*
+ * This struct is used by asm and inline asm code to manually annotate the
+ * CFI for an instruction. We have to use s16 instead of s8 for some of these
+ * fields as 8-bit fields are not relocated by some assemblers.
+ */
+struct unwind_hint {
+ u32 ip;
+ s16 sp_offset;
+ s16 sp_reg;
+ s16 type;
+ s16 end;
+};
+
+#endif
+
+#include <linux/objtool.h>
+
+#include "orc_types.h"
+
+#ifdef CONFIG_STACK_VALIDATION
+
+#ifndef __ASSEMBLY__
+
+#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
+ "987: \n\t" \
+ ".pushsection .discard.unwind_hints\n\t" \
+ /* struct unwind_hint */ \
+ ".long 987b - .\n\t" \
+ ".short " __stringify(sp_offset) "\n\t" \
+ ".short " __stringify(sp_reg) "\n\t" \
+ ".short " __stringify(type) "\n\t" \
+ ".short " __stringify(end) "\n\t" \
+ ".popsection\n\t"
+
+#else /* __ASSEMBLY__ */
+
+/*
+ * There are points in ASM code where it is useful to unwind through even
+ * though the ASM code itself may be unreliable from an unwind perspective.
+ * E.g., interrupt and exception handlers.
+ *
+ * These macros provide hints to objtool to compute the CFI information at
+ * such instructions.
+ */
+.macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
+.Lunwind_hint_pc_\@:
+ .pushsection .discard.unwind_hints
+ /* struct unwind_hint */
+ .long .Lunwind_hint_pc_\@ - .
+ .short \sp_offset
+ .short \sp_reg
+ .short \type
+ .short \end
+ .popsection
+.endm
+
+#endif /* __ASSEMBLY__ */
+
+#else /* !CONFIG_STACK_VALIDATION */
+
+#ifndef __ASSEMBLY__
+
+#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
+ "\n\t"
+#else
+.macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
+.endm
+#endif
+
+#endif /* CONFIG_STACK_VALIDATION */
+#ifdef __ASSEMBLY__
+
+.macro UNWIND_HINT_FTRACE, offset
+ .set sp_reg, ORC_REG_SP
+ .set sp_offset, \offset
+ .set type, UNWIND_HINT_TYPE_FTRACE
+ UNWIND_HINT sp_reg=sp_reg sp_offset=sp_offset type=type
+.endm
+
+.macro UNWIND_HINT_REGS, offset
+ .set sp_reg, ORC_REG_SP
+ .set sp_offset, \offset
+ .set type, UNWIND_HINT_TYPE_REGS
+ UNWIND_HINT sp_reg=sp_reg sp_offset=sp_offset type=type
+.endm
+
+.macro UNWIND_HINT_IRQ, offset
+ .set sp_reg, ORC_REG_SP
+ .set sp_offset, \offset
+ .set type, UNWIND_HINT_TYPE_IRQ_STACK
+ UNWIND_HINT sp_reg=sp_reg sp_offset=sp_offset type=type
+.endm
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_ARM64_UNWIND_HINTS_H */
diff --git a/tools/include/linux/objtool.h b/tools/include/linux/objtool.h
index 1af295efc12c..dcbd365944f6 100644
--- a/tools/include/linux/objtool.h
+++ b/tools/include/linux/objtool.h
@@ -17,6 +17,8 @@
* Useful for code which doesn't have an ELF function annotation.
*
* UNWIND_HINT_ENTRY: machine entry without stack, SYSCALL/SYSENTER etc.
+ *
+ * UNWIND_HINT_TYPE_IRQ_STACK: Used to unwind through the IRQ stack.
*/
#define UNWIND_HINT_TYPE_CALL 0
#define UNWIND_HINT_TYPE_REGS 1
@@ -25,6 +27,7 @@
#define UNWIND_HINT_TYPE_ENTRY 4
#define UNWIND_HINT_TYPE_SAVE 5
#define UNWIND_HINT_TYPE_RESTORE 6
+#define UNWIND_HINT_TYPE_IRQ_STACK 7
#ifdef CONFIG_OBJTOOL
diff --git a/tools/objtool/Build b/tools/objtool/Build
index fb0846b7d95e..2780e402babb 100644
--- a/tools/objtool/Build
+++ b/tools/objtool/Build
@@ -9,7 +9,7 @@ objtool-y += builtin-check.o
objtool-y += cfi.o
objtool-y += insn.o
objtool-y += decode.o
-objtool-$(STATIC_CHECK) += unwind_hints.o
+objtool-y += unwind_hints.o
objtool-y += elf.o
objtool-y += objtool.o
diff --git a/tools/objtool/arch/arm64/decode.c b/tools/objtool/arch/arm64/decode.c
index f723be80c09a..570069ac68ae 100644
--- a/tools/objtool/arch/arm64/decode.c
+++ b/tools/objtool/arch/arm64/decode.c
@@ -17,6 +17,8 @@
#include <objtool/elf.h>
#include <objtool/warn.h>
+#include <asm/orc_types.h>
+
/* ARM64 instructions are all 4 bytes wide. */
#define INSN_SIZE 4
@@ -47,6 +49,25 @@ unsigned long arch_jump_destination(struct instruction *insn)
return insn->offset + insn->immediate;
}
+int arch_decode_hint_reg(u8 sp_reg, int *base)
+{
+ switch (sp_reg) {
+ case ORC_REG_UNDEFINED:
+ *base = CFI_UNDEFINED;
+ break;
+ case ORC_REG_SP:
+ *base = CFI_SP;
+ break;
+ case ORC_REG_FP:
+ *base = CFI_FP;
+ break;
+ default:
+ return -1;
+ }
+
+ return 0;
+}
+
/* --------------------- instruction decode structs ------------------------ */
struct decode_var {
diff --git a/tools/objtool/arch/arm64/orc.c b/tools/objtool/arch/arm64/orc.c
index cef14114e1ec..5b155585258a 100644
--- a/tools/objtool/arch/arm64/orc.c
+++ b/tools/objtool/arch/arm64/orc.c
@@ -57,6 +57,10 @@ const char *orc_type_name(unsigned int type)
switch (type) {
case UNWIND_HINT_TYPE_CALL:
return "call";
+ case UNWIND_HINT_TYPE_REGS:
+ return "regs";
+ case UNWIND_HINT_TYPE_IRQ_STACK:
+ return "irqstack";
default:
return "?";
}
diff --git a/tools/objtool/dcheck.c b/tools/objtool/dcheck.c
index 57499752c523..567f492b0e3e 100644
--- a/tools/objtool/dcheck.c
+++ b/tools/objtool/dcheck.c
@@ -349,6 +349,10 @@ int check(struct objtool_file *file)
walk_sections(file);
+ ret = read_unwind_hints(file);
+ if (ret)
+ return ret;
+
if (opts.orc)
ret = orc_create(file);
diff --git a/tools/objtool/include/objtool/endianness.h b/tools/objtool/include/objtool/endianness.h
index 10241341eff3..9a53ab421a19 100644
--- a/tools/objtool/include/objtool/endianness.h
+++ b/tools/objtool/include/objtool/endianness.h
@@ -29,6 +29,7 @@
case 8: __ret = __NEED_BSWAP ? bswap_64(val) : (val); break; \
case 4: __ret = __NEED_BSWAP ? bswap_32(val) : (val); break; \
case 2: __ret = __NEED_BSWAP ? bswap_16(val) : (val); break; \
+ case 1: __ret = (val); break; \
default: \
BUILD_BUG(); break; \
} \
diff --git a/tools/objtool/sync-check.sh b/tools/objtool/sync-check.sh
index 0d0656f6ce4a..3742d1e2585c 100755
--- a/tools/objtool/sync-check.sh
+++ b/tools/objtool/sync-check.sh
@@ -31,6 +31,7 @@ fi
if [ "$SRCARCH" = "arm64" ]; then
FILES="$FILES
+arch/arm64/include/asm/unwind_hints.h
arch/arm64/include/asm/orc_types.h
include/linux/orc_entry.h
"
diff --git a/tools/objtool/unwind_hints.c b/tools/objtool/unwind_hints.c
index f2521659bae5..c51013c5d0b3 100644
--- a/tools/objtool/unwind_hints.c
+++ b/tools/objtool/unwind_hints.c
@@ -16,6 +16,7 @@ int read_unwind_hints(struct objtool_file *file)
struct unwind_hint *hint;
struct instruction *insn;
struct reloc *reloc;
+ u8 sp_reg, type;
int i;
sec = find_section_by_name(file->elf, ".discard.unwind_hints");
@@ -38,6 +39,9 @@ int read_unwind_hints(struct objtool_file *file)
for (i = 0; i < sec->sh.sh_size / sizeof(struct unwind_hint); i++) {
hint = (struct unwind_hint *)sec->data->d_buf + i;
+ sp_reg = bswap_if_needed(hint->sp_reg);
+ type = bswap_if_needed(hint->type);
+
reloc = find_reloc_by_dest(file->elf, sec, i * sizeof(*hint));
if (!reloc) {
WARN("can't find reloc for unwind_hints[%d]", i);
@@ -52,18 +56,18 @@ int read_unwind_hints(struct objtool_file *file)
insn->hint = true;
- if (hint->type == UNWIND_HINT_TYPE_SAVE) {
+ if (type == UNWIND_HINT_TYPE_SAVE) {
insn->hint = false;
insn->save = true;
continue;
}
- if (hint->type == UNWIND_HINT_TYPE_RESTORE) {
+ if (type == UNWIND_HINT_TYPE_RESTORE) {
insn->restore = true;
continue;
}
- if (hint->type == UNWIND_HINT_TYPE_REGS_PARTIAL) {
+ if (type == UNWIND_HINT_TYPE_REGS_PARTIAL) {
struct symbol *sym = find_symbol_by_offset(insn->sec, insn->offset);
if (sym && sym->bind == STB_GLOBAL) {
@@ -76,12 +80,12 @@ int read_unwind_hints(struct objtool_file *file)
}
}
- if (hint->type == UNWIND_HINT_TYPE_ENTRY) {
- hint->type = UNWIND_HINT_TYPE_CALL;
+ if (type == UNWIND_HINT_TYPE_ENTRY) {
+ type = UNWIND_HINT_TYPE_CALL;
insn->entry = 1;
}
- if (hint->type == UNWIND_HINT_TYPE_FUNC) {
+ if (type == UNWIND_HINT_TYPE_FUNC) {
insn->cfi = &func_cfi;
continue;
}
@@ -89,15 +93,15 @@ int read_unwind_hints(struct objtool_file *file)
if (insn->cfi)
cfi = *(insn->cfi);
- if (arch_decode_hint_reg(hint->sp_reg, &cfi.cfa.base)) {
+ if (arch_decode_hint_reg(sp_reg, &cfi.cfa.base)) {
WARN_FUNC("unsupported unwind_hint sp base reg %d",
- insn->sec, insn->offset, hint->sp_reg);
+ insn->sec, insn->offset, sp_reg);
return -1;
}
cfi.cfa.offset = bswap_if_needed(hint->sp_offset);
- cfi.type = hint->type;
- cfi.end = hint->end;
+ cfi.type = type;
+ cfi.end = bswap_if_needed(hint->end);
insn->cfi = cfi_hash_find_or_add(&cfi);
}
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
Add unwind hints to the interrupt and exception handlers.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
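Note: a rough sketch of what the UNWIND_HINT_REGS annotation allows the
unwinder to check, following the UNWIND_HINT_TYPE_REGS case of
unwind_check_reliable() later in this series. struct ex_pt_regs is a
shortened mock of the arm64 struct pt_regs (the real one has more fields),
and ex_regs_fp_ok() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock register frame; only the members used below are modelled. */
struct ex_pt_regs {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
	uint64_t stackframe[2];
};

static_assert(sizeof(struct ex_pt_regs) == 288, "mock layout changed");

/*
 * With the hint offset equal to the size of the register frame, the CFA
 * sits just above pt_regs. The frame pointer is accepted if it points at
 * the synthetic frame record or at the saved x29/x30 pair.
 */
static bool ex_regs_fp_ok(uint64_t cfa, uint64_t actual_fp)
{
	uint64_t base = cfa - sizeof(struct ex_pt_regs);
	uint64_t synthetic = base + offsetof(struct ex_pt_regs, stackframe);
	uint64_t saved = base + offsetof(struct ex_pt_regs, regs[29]);

	return actual_fp == synthetic || actual_fp == saved;
}
```

For the mock layout, ex_regs_fp_ok(cfa, cfa - 16) and
ex_regs_fp_ok(cfa, cfa - 56) both hold; any other frame pointer is rejected.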
arch/arm64/kernel/entry.S | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index e28137d64b76..d73bed56f0e6 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -28,6 +28,7 @@
#include <asm/thread_info.h>
#include <asm/asm-uaccess.h>
#include <asm/unistd.h>
+#include <asm/unwind_hints.h>
.macro clear_gp_regs
.irp n,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
@@ -560,6 +561,7 @@ SYM_CODE_START_LOCAL(el\el\ht\()_\regsize\()_\label)
.if \el == 0
b ret_to_user
.else
+ UNWIND_HINT_REGS PT_REGS_SIZE
b ret_to_kernel
.endif
SYM_CODE_END(el\el\ht\()_\regsize\()_\label)
@@ -887,6 +889,7 @@ SYM_FUNC_START(call_on_irq_stack)
/* Move to the new stack and call the function there */
mov sp, x16
blr x1
+ UNWIND_HINT_IRQ 16
/*
* Restore the SP from the FP, and restore the FP and LR from the frame
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
Call orc_lookup_init() from setup_arch() to perform ORC lookup
initialization for vmlinux.
Call orc_lookup_module_init() in module load to perform ORC lookup
initialization for modules.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
arch/arm64/kernel/module.c | 13 ++++++++++++-
arch/arm64/kernel/setup.c | 2 ++
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 76b41e4ca9fa..71264a181f61 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -19,6 +19,7 @@
#include <asm/alternative.h>
#include <asm/insn.h>
#include <asm/sections.h>
+#include <asm-generic/orc_lookup.h>
void *module_alloc(unsigned long size)
{
@@ -509,10 +510,20 @@ int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
struct module *me)
{
- const Elf_Shdr *s;
+ const Elf_Shdr *s, *orc, *orc_ip;
+
s = find_section(hdr, sechdrs, ".altinstructions");
if (s)
apply_alternatives_module((void *)s->sh_addr, s->sh_size);
+ orc = find_section(hdr, sechdrs, ".orc_unwind");
+ orc_ip = find_section(hdr, sechdrs, ".orc_unwind_ip");
+
+ if (orc && orc_ip) {
+ orc_lookup_module_init(me,
+ (void *)orc_ip->sh_addr, orc_ip->sh_size,
+ (void *)orc->sh_addr, orc->sh_size);
+ }
+
return module_init_ftrace_plt(hdr, sechdrs, me);
}
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index fea3223704b6..360304dcd8c2 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -51,6 +51,7 @@
#include <asm/efi.h>
#include <asm/xen/hypervisor.h>
#include <asm/mmu_context.h>
+#include <asm-generic/orc_lookup.h>
static int num_standard_resources;
static struct resource *standard_resources;
@@ -378,6 +379,7 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
"This indicates a broken bootloader or old kernel\n",
boot_args[1], boot_args[2], boot_args[3]);
}
+ orc_lookup_init();
}
static inline bool cpu_can_disable(unsigned int cpu)
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
Add code to scripts/Makefile.lib to define objtool options to generate
ORC data for frame pointer validation.
Define kernel configs:
- to enable dynamic FRAME_POINTER_VALIDATION
- to enable the generation of ORC data using objtool
When these configs are enabled, objtool is invoked on relocatable files
during kernel build with the following command:
objtool --stackval --orc <object-file>
Objtool creates special sections in the object files:
	.orc_unwind_ip	PC array.
	.orc_unwind	ORC structure table.
	.orc_lookup	ORC lookup table.
Change arch/arm64/kernel/vmlinux.lds.S to include ORC_UNWIND_TABLE in
the data section so that the special sections get included there. For
modules, these sections will be added to the kernel during module load.
In the future, the kernel can use these sections to find the ORC entry for
a given instruction address. The unwinder can then compute the expected FP
at that address and validate the actual FP against it.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
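Note: a minimal sketch of how a lookup over .orc_unwind_ip and .orc_unwind
could work, assuming the PC array is sorted and each slot stores a 32-bit
offset relative to its own address (the scheme used by the x86 ORC
unwinder). struct ex_orc_entry, ex_orc_ip() and ex_orc_find() are
hypothetical names for illustration, not the functions added by this
series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct orc_entry (fields are assumptions). */
struct ex_orc_entry {
	int16_t sp_offset;
	int16_t fp_offset;
	uint8_t type;
};

/* Turn a self-relative slot into the absolute address it describes. */
static uintptr_t ex_orc_ip(const int32_t *slot)
{
	return (uintptr_t)slot + (intptr_t)*slot;
}

/*
 * Binary search for the last entry whose start address is <= pc.
 * Returns NULL when pc lies before the first entry.
 */
static const struct ex_orc_entry *
ex_orc_find(const int32_t *ip_table, const struct ex_orc_entry *orc_table,
	    size_t num, uintptr_t pc)
{
	size_t lo = 0, hi = num;	/* search window is [lo, hi) */

	assert(num == 0 || (ip_table && orc_table));

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ex_orc_ip(&ip_table[mid]) <= pc)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* lo is now the number of entries that start at or before pc. */
	return lo ? &orc_table[lo - 1] : NULL;
}
```

Keeping the two tables parallel means the search only touches the compact
PC array; the ORC entry is read once, at the matching index.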
arch/arm64/Kconfig | 2 ++
arch/arm64/Kconfig.debug | 32 ++++++++++++++++++++++++++++++++
arch/arm64/include/asm/module.h | 12 +++++++++++-
arch/arm64/kernel/vmlinux.lds.S | 3 +++
include/linux/objtool.h | 2 ++
scripts/Makefile | 4 +++-
scripts/Makefile.lib | 9 +++++++++
tools/include/linux/objtool.h | 2 ++
8 files changed, 64 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 505c8a1ccbe0..73c3f30a37c7 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -230,6 +230,8 @@ config ARM64
select TRACE_IRQFLAGS_SUPPORT
select TRACE_IRQFLAGS_NMI_SUPPORT
select HAVE_SOFTIRQ_ON_OWN_STACK
+ select HAVE_STACK_VALIDATION if FRAME_POINTER_VALIDATION
+ select STACK_VALIDATION if HAVE_STACK_VALIDATION
help
ARM 64-bit (AArch64) Linux support.
diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
index 265c4461031f..a50caabdb18e 100644
--- a/arch/arm64/Kconfig.debug
+++ b/arch/arm64/Kconfig.debug
@@ -20,4 +20,36 @@ config ARM64_RELOC_TEST
depends on m
tristate "Relocation testing module"
+config UNWINDER_ORC
+ bool "ORC unwinder"
+ depends on FRAME_POINTER_VALIDATION
+ select HAVE_MOD_ARCH_SPECIFIC
+ select OBJTOOL
+ help
+ This option enables ORC (Oops Rewind Capability) for ARM64. This
+ allows the unwinder to look up ORC data for an instruction address
+ and compute the frame pointer at that address. The computed frame
+ pointer is used to validate the actual frame pointer.
+
+config UNWINDER_FRAME_POINTER
+ bool "Frame pointer unwinder"
+ depends on FRAME_POINTER_VALIDATION
+ select FRAME_POINTER
+ help
+ ARM64 already uses the frame pointer for unwinding kernel stack
+ traces. We need to enable this config to enable STACK_VALIDATION.
+ STACK_VALIDATION is needed to get objtool to do static analysis
+ of kernel code.
+
+config FRAME_POINTER_VALIDATION
+ bool "Dynamic Frame pointer validation"
+ select UNWINDER_FRAME_POINTER
+ select UNWINDER_ORC
+ help
+ This invokes objtool on every object file causing it to
+ generate ORC data for the object file. ORC data is in a custom
+ data format which is a simplified version of the DWARF
+ Call Frame Information standard. See UNWINDER_ORC for more
+ details.
+
source "drivers/hwtracing/coresight/Kconfig"
diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
index 18734fed3bdd..4362f44aae61 100644
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -6,6 +6,7 @@
#define __ASM_MODULE_H
#include <asm-generic/module.h>
+#include <asm/orc_types.h>
#ifdef CONFIG_ARM64_MODULE_PLTS
struct mod_plt_sec {
@@ -13,15 +14,24 @@ struct mod_plt_sec {
int plt_num_entries;
int plt_max_entries;
};
+#endif
+#ifdef CONFIG_HAVE_MOD_ARCH_SPECIFIC
struct mod_arch_specific {
+#ifdef CONFIG_ARM64_MODULE_PLTS
struct mod_plt_sec core;
struct mod_plt_sec init;
/* for CONFIG_DYNAMIC_FTRACE */
struct plt_entry *ftrace_trampolines;
-};
#endif
+#ifdef CONFIG_UNWINDER_ORC
+ unsigned int num_orcs;
+ int *orc_unwind_ip;
+ struct orc_entry *orc_unwind;
+#endif
+};
+#endif /* CONFIG_HAVE_MOD_ARCH_SPECIFIC */
u64 module_emit_plt_entry(struct module *mod, Elf64_Shdr *sechdrs,
void *loc, const Elf64_Rela *rela,
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 45131e354e27..bf7b55ae10ee 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -61,6 +61,7 @@
#define RUNTIME_DISCARD_EXIT
#include <asm-generic/vmlinux.lds.h>
+#include <asm-generic/orc_lookup.h>
#include <asm/cache.h>
#include <asm/kernel-pgtable.h>
#include <asm/kexec.h>
@@ -294,6 +295,8 @@ SECTIONS
__mmuoff_data_end = .;
}
+ ORC_UNWIND_TABLE
+
PECOFF_EDATA_PADDING
__pecoff_data_rawsize = ABSOLUTE(. - __initdata_begin);
_edata = .;
diff --git a/include/linux/objtool.h b/include/linux/objtool.h
index dcbd365944f6..c980522190f7 100644
--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -31,7 +31,9 @@
#ifdef CONFIG_OBJTOOL
+#ifndef CONFIG_ARM64
#include <asm/asm.h>
+#endif
#ifndef __ASSEMBLY__
diff --git a/scripts/Makefile b/scripts/Makefile
index 1575af84d557..df3e4d90f195 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -23,8 +23,10 @@ HOSTLDLIBS_sign-file = $(shell $(HOSTPKG_CONFIG) --libs libcrypto 2> /dev/null |
ifdef CONFIG_UNWINDER_ORC
ifeq ($(ARCH),x86_64)
ARCH := x86
-endif
HOSTCFLAGS_sorttable.o += -I$(srctree)/tools/arch/x86/include
+else
+HOSTCFLAGS_sorttable.o += -I$(srctree)/tools/arch/$(ARCH)/include
+endif
HOSTCFLAGS_sorttable.o += -DUNWINDER_ORC_ENABLED
endif
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 3aa384cec76b..d364871a1046 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -252,6 +252,13 @@ ifdef CONFIG_OBJTOOL
objtool := $(objtree)/tools/objtool/objtool
+ifdef CONFIG_FRAME_POINTER_VALIDATION
+
+objtool-args-$(CONFIG_STACK_VALIDATION) += --stackval
+objtool-args-$(CONFIG_UNWINDER_ORC) += --orc
+
+else
+
objtool-args-$(CONFIG_HAVE_JUMP_LABEL_HACK) += --hacks=jump_label
objtool-args-$(CONFIG_HAVE_NOINSTR_HACK) += --hacks=noinstr
objtool-args-$(CONFIG_X86_KERNEL_IBT) += --ibt
@@ -265,6 +272,8 @@ objtool-args-$(CONFIG_HAVE_STATIC_CALL_INLINE) += --static-call
objtool-args-$(CONFIG_HAVE_UACCESS_VALIDATION) += --uaccess
objtool-args-$(CONFIG_GCOV_KERNEL) += --no-unreachable
+endif
+
objtool-args = $(objtool-args-y) \
$(if $(delay-objtool), --link) \
$(if $(part-of-module), --module)
diff --git a/tools/include/linux/objtool.h b/tools/include/linux/objtool.h
index dcbd365944f6..c980522190f7 100644
--- a/tools/include/linux/objtool.h
+++ b/tools/include/linux/objtool.h
@@ -31,7 +31,9 @@
#ifdef CONFIG_OBJTOOL
+#ifndef CONFIG_ARM64
#include <asm/asm.h>
+#endif
#ifndef __ASSEMBLY__
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
Introduce a reliability flag in struct unwind_state. This will be set to
false if the PC does not have a valid ORC or if the frame pointer computed
from the ORC does not match the actual frame pointer.
Now that the unwinder can validate the frame pointer, introduce
arch_stack_walk_reliable().
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
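Note: the frame pointer check for ordinary call frames can be sketched in
isolation as below. It follows the UNWIND_HINT_TYPE_CALL branch of
unwind_check_reliable() in this patch, but struct ex_orc_entry and
struct ex_unwind_state are cut-down, hypothetical stand-ins, and the ORC
lookup, PC adjustment and hint types are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Only the two offsets the check needs (assumed field names). */
struct ex_orc_entry {
	int16_t sp_offset;	/* frame size: distance to the caller's CFA */
	int16_t fp_offset;	/* where FP sits relative to the CFA */
};

struct ex_unwind_state {
	uint64_t fp;		/* FP read from the frame record */
	uint64_t cfa;		/* SP value at the call site, 0 if unset */
	bool reliable;
};

static void ex_check_call_frame(struct ex_unwind_state *state,
				const struct ex_orc_entry *orc)
{
	uint64_t expected_fp;

	assert(state && orc);

	/* Once a frame is unreliable the CFA chain cannot be trusted. */
	if (!state->reliable)
		return;

	if (!state->cfa) {
		/* First frame: derive the CFA from the FP; nothing to compare. */
		state->cfa = state->fp - orc->fp_offset;
		return;
	}

	/* Step to the next CFA and compute where the FP must point. */
	state->cfa += orc->sp_offset;
	expected_fp = state->cfa + orc->fp_offset;

	if (state->fp != expected_fp)
		state->reliable = false;
}
```

Starting from fp = 0xfff0 with fp_offset = -16, the first call sets the CFA
to 0x10000. With sp_offset = 32, the next frame is accepted only if its FP
is 0x10010.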
arch/arm64/include/asm/stacktrace/common.h | 15 ++
arch/arm64/kernel/stacktrace.c | 167 ++++++++++++++++++++-
2 files changed, 175 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
index 508f734de46e..064aaf5dc3a0 100644
--- a/arch/arm64/include/asm/stacktrace/common.h
+++ b/arch/arm64/include/asm/stacktrace/common.h
@@ -11,6 +11,7 @@
#include <linux/kprobes.h>
#include <linux/types.h>
+#include <linux/objtool.h>
struct stack_info {
unsigned long low;
@@ -23,6 +24,7 @@ struct stack_info {
* @fp: The fp value in the frame record (or the real fp)
* @pc: The lr value in the frame record (or the real lr)
*
+ * @prev_pc: The lr value in the previous frame record.
* @kr_cur: When KRETPROBES is selected, holds the kretprobe instance
* associated with the most recently encountered replacement lr
* value.
@@ -32,10 +34,15 @@ struct stack_info {
* @stack: The stack currently being unwound.
* @stacks: An array of stacks which can be unwound.
* @nr_stacks: The number of stacks in @stacks.
+ *
+ * @cfa: The sp value at the call site of the current function.
+ * @unwind_type: The previous frame's unwind type.
+ * @reliable: Stack trace is reliable.
*/
struct unwind_state {
unsigned long fp;
unsigned long pc;
+ unsigned long prev_pc;
#ifdef CONFIG_KRETPROBES
struct llist_node *kr_cur;
#endif
@@ -44,6 +51,9 @@ struct unwind_state {
struct stack_info stack;
struct stack_info *stacks;
int nr_stacks;
+ unsigned long cfa;
+ int unwind_type;
+ bool reliable;
};
static inline struct stack_info stackinfo_get_unknown(void)
@@ -70,11 +80,15 @@ static inline void unwind_init_common(struct unwind_state *state,
struct task_struct *task)
{
state->task = task;
+ state->prev_pc = 0;
#ifdef CONFIG_KRETPROBES
state->kr_cur = NULL;
#endif
state->stack = stackinfo_get_unknown();
+ state->reliable = true;
+ state->cfa = 0;
+ state->unwind_type = UNWIND_HINT_TYPE_CALL;
}
static struct stack_info *unwind_find_next_stack(const struct unwind_state *state,
@@ -167,6 +181,7 @@ unwind_next_frame_record(struct unwind_state *state)
/*
* Record this frame record's values.
*/
+ state->prev_pc = state->pc;
state->fp = READ_ONCE(*(unsigned long *)(fp));
state->pc = READ_ONCE(*(unsigned long *)(fp + 8));
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 634279b3b03d..fbcb14539816 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -5,6 +5,8 @@
* Copyright (C) 2012 ARM Ltd.
*/
#include <linux/kernel.h>
+#include <asm/unwind_hints.h>
+#include <asm-generic/orc_lookup.h>
#include <linux/export.h>
#include <linux/ftrace.h>
#include <linux/sched.h>
@@ -16,6 +18,122 @@
#include <asm/stack_pointer.h>
#include <asm/stacktrace.h>
+static inline bool unwind_completed(struct unwind_state *state)
+{
+ if (state->fp == (unsigned long)task_pt_regs(state->task)->stackframe) {
+ /* Final frame; nothing to unwind */
+ return true;
+ }
+ return false;
+}
+
+#ifdef CONFIG_FRAME_POINTER_VALIDATION
+
+static void unwind_check_reliable(struct unwind_state *state)
+{
+ unsigned long pc, fp;
+ struct orc_entry *orc;
+ bool adjust_pc = false;
+
+ if (unwind_completed(state))
+ return;
+
+ /*
+ * If a previous frame was unreliable, the CFA cannot be reliably
+ * computed anymore.
+ */
+ if (!state->reliable)
+ return;
+
+ pc = state->pc;
+
+ /* Don't let modules unload while we're reading their ORC data. */
+ preempt_disable();
+
+ orc = orc_find(pc);
+ if (!orc || (!orc->fp_offset && orc->type == UNWIND_HINT_TYPE_CALL)) {
+ /*
+ * If the final instruction in a function happens to be a call
+ * instruction, the return address would fall outside of the
+ * function. That could be the case here. This can happen, for
+ * instance, if the called function is a "noreturn" function.
+ * The compiler can optimize away the instructions after the
+ * call. So, adjust the PC so it falls inside the function and
+ * retry.
+ *
+ * We only do this if the current and the previous frames
+ * are call frames and not hint frames.
+ */
+ if (state->unwind_type == UNWIND_HINT_TYPE_CALL) {
+ pc -= 4;
+ adjust_pc = true;
+ orc = orc_find(pc);
+ }
+ }
+ if (!orc) {
+ state->reliable = false;
+ goto out;
+ }
+ state->unwind_type = orc->type;
+
+ if (!state->cfa) {
+ /* Set up the initial CFA and return. */
+ state->cfa = state->fp - orc->fp_offset;
+ goto out;
+ }
+
+ /* Compute the next CFA and FP. */
+ switch (orc->type) {
+ case UNWIND_HINT_TYPE_CALL:
+ /* Normal call */
+ state->cfa += orc->sp_offset;
+ fp = state->cfa + orc->fp_offset;
+ break;
+
+ case UNWIND_HINT_TYPE_REGS:
+ /*
+ * pt_regs hint: The frame pointer points to either the
+ * synthetic frame within pt_regs or to the place where
+ * x29 and x30 are saved in the register save area in
+ * pt_regs.
+ */
+ state->cfa += orc->sp_offset;
+ fp = state->cfa + offsetof(struct pt_regs, stackframe) -
+ sizeof(struct pt_regs);
+ if (state->fp != fp) {
+ fp = state->cfa + offsetof(struct pt_regs, regs[29]) -
+ sizeof(struct pt_regs);
+ }
+ break;
+
+ case UNWIND_HINT_TYPE_IRQ_STACK:
+ /* Hint to unwind from the IRQ stack to the task stack. */
+ state->cfa = state->fp + orc->sp_offset;
+ fp = state->fp;
+ break;
+
+ default:
+ fp = 0;
+ break;
+ }
+
+ /* Validate the actual FP with the computed one. */
+ if (state->fp != fp)
+ state->reliable = false;
+out:
+ if (state->reliable && adjust_pc)
+ state->pc = pc;
+ preempt_enable();
+}
+
+#else /* !CONFIG_FRAME_POINTER_VALIDATION */
+
+static void unwind_check_reliable(struct unwind_state *state)
+{
+}
+
+#endif /* CONFIG_FRAME_POINTER_VALIDATION */
+
/*
* Start an unwind from a pt_regs.
*
@@ -77,11 +195,9 @@ static inline void unwind_init_from_task(struct unwind_state *state,
static int notrace unwind_next(struct unwind_state *state)
{
struct task_struct *tsk = state->task;
- unsigned long fp = state->fp;
int err;
- /* Final frame; nothing to unwind */
- if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
+ if (unwind_completed(state))
return -ENOENT;
err = unwind_next_frame_record(state);
@@ -116,18 +232,23 @@ static int notrace unwind_next(struct unwind_state *state)
}
NOKPROBE_SYMBOL(unwind_next);
-static void notrace unwind(struct unwind_state *state,
+static int notrace unwind(struct unwind_state *state, bool need_reliable,
stack_trace_consume_fn consume_entry, void *cookie)
{
- while (1) {
- int ret;
+ int ret = 0;
+ while (1) {
+ if (need_reliable && !state->reliable)
+ return -EINVAL;
if (!consume_entry(cookie, state->pc))
break;
ret = unwind_next(state);
+ if (need_reliable && !ret)
+ unwind_check_reliable(state);
if (ret < 0)
break;
}
+ return ret;
}
NOKPROBE_SYMBOL(unwind);
@@ -216,5 +337,37 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
unwind_init_from_task(&state, task);
}
- unwind(&state, consume_entry, cookie);
+ unwind(&state, false, consume_entry, cookie);
+}
+
+noinline notrace int arch_stack_walk_reliable(
+ stack_trace_consume_fn consume_entry,
+ void *cookie, struct task_struct *task)
+{
+ struct stack_info stacks[] = {
+ stackinfo_get_task(task),
+ STACKINFO_CPU(irq),
+#if defined(CONFIG_VMAP_STACK)
+ STACKINFO_CPU(overflow),
+#endif
+#if defined(CONFIG_VMAP_STACK) && defined(CONFIG_ARM_SDE_INTERFACE)
+ STACKINFO_SDEI(normal),
+ STACKINFO_SDEI(critical),
+#endif
+ };
+ struct unwind_state state = {
+ .stacks = stacks,
+ .nr_stacks = ARRAY_SIZE(stacks),
+ };
+ int ret;
+
+ if (task == current)
+ unwind_init_from_caller(&state);
+ else
+ unwind_init_from_task(&state, task);
+ unwind_check_reliable(&state);
+
+ ret = unwind(&state, true, consume_entry, cookie);
+
+ return ret == -ENOENT ? 0 : -EINVAL;
}
--
2.25.1
From: "Madhavan T. Venkataraman" <[email protected]>
- Define HAVE_DYNAMIC_FTRACE_WITH_ARGS to support livepatch.
- Supply the arch code for HAVE_DYNAMIC_FTRACE_WITH_ARGS.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
arch/arm64/Kconfig.debug | 1 +
arch/arm64/include/asm/ftrace.h | 20 ++++++++++++++++++++
2 files changed, 21 insertions(+)
diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
index a50caabdb18e..6d5dc90a0a52 100644
--- a/arch/arm64/Kconfig.debug
+++ b/arch/arm64/Kconfig.debug
@@ -45,6 +45,7 @@ config FRAME_POINTER_VALIDATION
bool "Dynamic Frame pointer validation"
select UNWINDER_FRAME_POINTER
select UNWINDER_ORC
+ select HAVE_DYNAMIC_FTRACE_WITH_ARGS
help
This invokes objtool on every object file causing it to
generate ORC data for the object file. ORC data is in a custom
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index 329dbbd4d50b..0bc03ecfb257 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -78,6 +78,26 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
return addr;
}
+#ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
+
+struct ftrace_regs {
+ struct pt_regs regs;
+};
+
+static __always_inline struct pt_regs *
+arch_ftrace_get_regs(struct ftrace_regs *fregs)
+{
+ return &fregs->regs;
+}
+
+static __always_inline void ftrace_instruction_pointer_set(
+ struct ftrace_regs *fregs, unsigned long pc)
+{
+ fregs->regs.pc = pc;
+}
+
+#endif
+
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
struct dyn_ftrace;
struct ftrace_ops;
--
2.25.1
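For readers unfamiliar with why livepatch needs these helpers, here is a minimal user-space sketch of the redirection they enable. It assumes that the livepatch ftrace handler redirects execution by overwriting the saved PC with the address of the replacement function. The `mock_` types and `mock_redirect()` are hypothetical stand-ins, not the kernel definitions.

```c
/* Hypothetical user-space stand-ins; the real types live in the kernel. */
struct mock_pt_regs {
	unsigned long pc;	/* saved program counter */
};

struct mock_ftrace_regs {
	struct mock_pt_regs regs;
};

/* Mirrors arch_ftrace_get_regs() from the patch above. */
static inline struct mock_pt_regs *
mock_arch_ftrace_get_regs(struct mock_ftrace_regs *fregs)
{
	return &fregs->regs;
}

/* Mirrors ftrace_instruction_pointer_set() from the patch above. */
static inline void
mock_ftrace_instruction_pointer_set(struct mock_ftrace_regs *fregs,
				    unsigned long pc)
{
	fregs->regs.pc = pc;
}

/*
 * Assumed handler shape: when a patched function is entered, point the
 * saved PC at the replacement so that the return from the ftrace
 * trampoline lands in the new function instead of the old one.
 */
static void mock_redirect(struct mock_ftrace_regs *fregs,
			  unsigned long new_func)
{
	mock_ftrace_instruction_pointer_set(fregs, new_func);
}
```

Calling `mock_redirect()` on a state whose saved PC is the old function leaves the saved PC pointing at the new one, which is all the helper is responsible for.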
From: "Madhavan T. Venkataraman" <[email protected]>
- Define TIF_PATCH_PENDING in arch/arm64/include/asm/thread_info.h
for livepatch.
- Check TIF_PATCH_PENDING in do_notify_resume() to patch the
current task for livepatch.
Signed-off-by: Suraj Jitindar Singh <[email protected]>
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
arch/arm64/include/asm/thread_info.h | 4 +++-
arch/arm64/kernel/signal.c | 4 ++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 848739c15de8..42ba9d37e8d8 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -68,6 +68,7 @@ int arch_dup_task_struct(struct task_struct *dst,
#define TIF_UPROBE 4 /* uprobe breakpoint or singlestep */
#define TIF_MTE_ASYNC_FAULT 5 /* MTE Asynchronous Tag Check Fault */
#define TIF_NOTIFY_SIGNAL 6 /* signal notifications exist */
+#define TIF_PATCH_PENDING 7 /* pending live patching update */
#define TIF_SYSCALL_TRACE 8 /* syscall trace active */
#define TIF_SYSCALL_AUDIT 9 /* syscall auditing */
#define TIF_SYSCALL_TRACEPOINT 10 /* syscall tracepoint for ftrace */
@@ -100,11 +101,12 @@ int arch_dup_task_struct(struct task_struct *dst,
#define _TIF_SVE (1 << TIF_SVE)
#define _TIF_MTE_ASYNC_FAULT (1 << TIF_MTE_ASYNC_FAULT)
#define _TIF_NOTIFY_SIGNAL (1 << TIF_NOTIFY_SIGNAL)
+#define _TIF_PATCH_PENDING (1 << TIF_PATCH_PENDING)
#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
_TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
_TIF_UPROBE | _TIF_MTE_ASYNC_FAULT | \
- _TIF_NOTIFY_SIGNAL)
+ _TIF_NOTIFY_SIGNAL | _TIF_PATCH_PENDING)
#define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
_TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 9ad911f1647c..dea21ba60ff1 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -17,6 +17,7 @@
#include <linux/sizes.h>
#include <linux/string.h>
#include <linux/resume_user_mode.h>
+#include <linux/livepatch.h>
#include <linux/ratelimit.h>
#include <linux/syscalls.h>
@@ -1120,6 +1121,9 @@ void do_notify_resume(struct pt_regs *regs, unsigned long thread_flags)
(void __user *)NULL, current);
}
+ if (thread_flags & _TIF_PATCH_PENDING)
+ klp_update_patch_state(current);
+
if (thread_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
do_signal(regs);
--
2.25.1
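The flag handling in this patch can be sketched in user space as follows. Only the bit numbers visible in the hunks above are reproduced; `SKETCH_WORK_MASK` is a deliberately reduced, hypothetical subset of `_TIF_WORK_MASK`, and `mock_klp_update_patch_state()` merely counts calls in place of the real `klp_update_patch_state()`.

```c
#include <stdbool.h>

/* Bit numbers as shown in the thread_info.h hunk above. */
#define TIF_NOTIFY_SIGNAL	6
#define TIF_PATCH_PENDING	7
#define TIF_SYSCALL_TRACE	8

#define _TIF_NOTIFY_SIGNAL	(1UL << TIF_NOTIFY_SIGNAL)
#define _TIF_PATCH_PENDING	(1UL << TIF_PATCH_PENDING)
#define _TIF_SYSCALL_TRACE	(1UL << TIF_SYSCALL_TRACE)

/* Hypothetical subset of _TIF_WORK_MASK, for illustration only. */
#define SKETCH_WORK_MASK	(_TIF_NOTIFY_SIGNAL | _TIF_PATCH_PENDING)

static int patch_updates;	/* number of mock patch-state updates */

/* Stand-in for klp_update_patch_state(current). */
static void mock_klp_update_patch_state(void)
{
	patch_updates++;
}

/* True if the return-to-user path has work to do for these flags. */
static bool has_exit_work(unsigned long thread_flags)
{
	return (thread_flags & SKETCH_WORK_MASK) != 0;
}

/* Reduced version of the check added to do_notify_resume(). */
static void mock_do_notify_resume(unsigned long thread_flags)
{
	if (thread_flags & _TIF_PATCH_PENDING)
		mock_klp_update_patch_state();
}
```

The point of adding the new bit to the work mask is that a task with only `TIF_PATCH_PENDING` set still takes the slow return-to-user path, where the patch state is updated.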
From: "Madhavan T. Venkataraman" <[email protected]>
Enable livepatch in arch/arm64/Kconfig.
Signed-off-by: Madhavan T. Venkataraman <[email protected]>
---
arch/arm64/Kconfig | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 73c3f30a37c7..01f802935dda 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -232,6 +232,8 @@ config ARM64
select HAVE_SOFTIRQ_ON_OWN_STACK
select HAVE_STACK_VALIDATION if FRAME_POINTER_VALIDATION
select STACK_VALIDATION if HAVE_STACK_VALIDATION
+ select HAVE_RELIABLE_STACKTRACE if STACK_VALIDATION
+ select HAVE_LIVEPATCH if HAVE_DYNAMIC_FTRACE_WITH_REGS && HAVE_RELIABLE_STACKTRACE
help
ARM 64-bit (AArch64) Linux support.
@@ -2269,3 +2271,4 @@ source "drivers/acpi/Kconfig"
source "arch/arm64/kvm/Kconfig"
+source "kernel/livepatch/Kconfig"
--
2.25.1
Hello,
I see the following build warning after this commit (gcc12):
ld: warning: orphan section `.init.orc_unwind' from `arch/arm64/kernel/pi/kaslr_early.pi.o' being placed in section `.init.orc_unwind'
ld: warning: orphan section `.init.orc_unwind_ip' from `arch/arm64/kernel/pi/kaslr_early.pi.o' being placed in section `.init.orc_unwind_ip'
...
My understanding of the cause is that arch/arm64/kernel/pi has its
own Makefile, which adds an ".init" prefix to section names via objcopy:
https://github.com/madvenka786/linux/blob/orc_v3/arch/arm64/kernel/pi/Makefile#L25
I assume these files are not relevant from a livepatch perspective, so is it
safe to exclude these sections with --remove-section, or should we handle them as well?
Regards,
Tomohiro
> Subject: [RFC PATCH v3 18/22] arm64: Build the kernel with ORC information
>
> From: "Madhavan T. Venkataraman" <[email protected]>
>
> Add code to scripts/Makefile.lib to define objtool options to generate
> ORC data for frame pointer validation.
>
> Define kernel configs:
> - to enable dynamic FRAME_POINTER_VALIDATION
> - to enable the generation of ORC data using objtool
>
> When these configs are enabled, objtool is invoked on relocatable files
> during kernel build with the following command:
>
> objtool --stackval --orc <object-file>
>
> Objtool creates special sections in the object files:
>
> .orc_unwind_ip PC array.
> .orc_unwind ORC structure table.
> .orc_lookup ORC lookup table.
>
> Change arch/arm64/kernel/vmlinux.lds.S to include ORC_UNWIND_TABLE in
> the data section so that the special sections get included there. For
> modules, these sections will be added to the kernel during module load.
>
> In the future, the kernel can use these sections to find the ORC for a
> given instruction address. The unwinder can then compute the FP at an
> instruction address and validate the actual FP with that.
>
> Signed-off-by: Madhavan T. Venkataraman <[email protected]>
> ---
> arch/arm64/Kconfig | 2 ++
> arch/arm64/Kconfig.debug | 32
> ++++++++++++++++++++++++++++++++
> arch/arm64/include/asm/module.h | 12 +++++++++++-
> arch/arm64/kernel/vmlinux.lds.S | 3 +++
> include/linux/objtool.h | 2 ++
> scripts/Makefile | 4 +++-
> scripts/Makefile.lib | 9 +++++++++
> tools/include/linux/objtool.h | 2 ++
> 8 files changed, 64 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 505c8a1ccbe0..73c3f30a37c7 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -230,6 +230,8 @@ config ARM64
> select TRACE_IRQFLAGS_SUPPORT
> select TRACE_IRQFLAGS_NMI_SUPPORT
> select HAVE_SOFTIRQ_ON_OWN_STACK
> + select HAVE_STACK_VALIDATION if
> FRAME_POINTER_VALIDATION
> + select STACK_VALIDATION if HAVE_STACK_VALIDATION
> help
> ARM 64-bit (AArch64) Linux support.
>
> diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
> index 265c4461031f..a50caabdb18e 100644
> --- a/arch/arm64/Kconfig.debug
> +++ b/arch/arm64/Kconfig.debug
> @@ -20,4 +20,36 @@ config ARM64_RELOC_TEST
> depends on m
> tristate "Relocation testing module"
>
> +config UNWINDER_ORC
> + bool "ORC unwinder"
> + depends on FRAME_POINTER_VALIDATION
> + select HAVE_MOD_ARCH_SPECIFIC
> + select OBJTOOL
> + help
> + This option enables ORC (Oops Rewind Capability) for ARM64. This
> + allows the unwinder to look up ORC data for an instruction address
> + and compute the frame pointer at that address. The computed frame
> + pointer is used to validate the actual frame pointer.
> +
> +config UNWINDER_FRAME_POINTER
> + bool "Frame pointer unwinder"
> + depends on FRAME_POINTER_VALIDATION
> + select FRAME_POINTER
> + help
> + ARM64 already uses the frame pointer for unwinding kernel stack
> + traces. We need to enable this config to enable STACK_VALIDATION.
> + STACK_VALIDATION is needed to get objtool to do static analysis
> + of kernel code.
> +
> +config FRAME_POINTER_VALIDATION
> + bool "Dynamic Frame pointer validation"
> + select UNWINDER_FRAME_POINTER
> + select UNWINDER_ORC
> + help
> + This invokes objtool on every object file causing it to
> + generate ORC data for the object file. ORC data is in a custom
> + data format which is a simplified version of the DWARF
> + Call Frame Information standard. See UNWINDER_ORC for more
> + details.
> +
> source "drivers/hwtracing/coresight/Kconfig"
> diff --git a/arch/arm64/include/asm/module.h
> b/arch/arm64/include/asm/module.h
> index 18734fed3bdd..4362f44aae61 100644
> --- a/arch/arm64/include/asm/module.h
> +++ b/arch/arm64/include/asm/module.h
> @@ -6,6 +6,7 @@
> #define __ASM_MODULE_H
>
> #include <asm-generic/module.h>
> +#include <asm/orc_types.h>
>
> #ifdef CONFIG_ARM64_MODULE_PLTS
> struct mod_plt_sec {
> @@ -13,15 +14,24 @@ struct mod_plt_sec {
> int plt_num_entries;
> int plt_max_entries;
> };
> +#endif
>
> +#ifdef CONFIG_HAVE_MOD_ARCH_SPECIFIC
> struct mod_arch_specific {
> +#ifdef CONFIG_ARM64_MODULE_PLTS
> struct mod_plt_sec core;
> struct mod_plt_sec init;
>
> /* for CONFIG_DYNAMIC_FTRACE */
> struct plt_entry *ftrace_trampolines;
> -};
> #endif
> +#ifdef CONFIG_UNWINDER_ORC
> + unsigned int num_orcs;
> + int *orc_unwind_ip;
> + struct orc_entry *orc_unwind;
> +#endif
> +};
> +#endif /* CONFIG_HAVE_MOD_ARCH_SPECIFIC */
>
> u64 module_emit_plt_entry(struct module *mod, Elf64_Shdr *sechdrs,
> void *loc, const Elf64_Rela *rela,
> diff --git a/arch/arm64/kernel/vmlinux.lds.S
> b/arch/arm64/kernel/vmlinux.lds.S
> index 45131e354e27..bf7b55ae10ee 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -61,6 +61,7 @@
> #define RUNTIME_DISCARD_EXIT
>
> #include <asm-generic/vmlinux.lds.h>
> +#include <asm-generic/orc_lookup.h>
> #include <asm/cache.h>
> #include <asm/kernel-pgtable.h>
> #include <asm/kexec.h>
> @@ -294,6 +295,8 @@ SECTIONS
> __mmuoff_data_end = .;
> }
>
> + ORC_UNWIND_TABLE
> +
> PECOFF_EDATA_PADDING
> __pecoff_data_rawsize = ABSOLUTE(. - __initdata_begin);
> _edata = .;
> diff --git a/include/linux/objtool.h b/include/linux/objtool.h
> index dcbd365944f6..c980522190f7 100644
> --- a/include/linux/objtool.h
> +++ b/include/linux/objtool.h
> @@ -31,7 +31,9 @@
>
> #ifdef CONFIG_OBJTOOL
>
> +#ifndef CONFIG_ARM64
> #include <asm/asm.h>
> +#endif
>
> #ifndef __ASSEMBLY__
>
> diff --git a/scripts/Makefile b/scripts/Makefile
> index 1575af84d557..df3e4d90f195 100644
> --- a/scripts/Makefile
> +++ b/scripts/Makefile
> @@ -23,8 +23,10 @@ HOSTLDLIBS_sign-file = $(shell $(HOSTPKG_CONFIG)
> --libs libcrypto 2> /dev/null |
> ifdef CONFIG_UNWINDER_ORC
> ifeq ($(ARCH),x86_64)
> ARCH := x86
> -endif
> HOSTCFLAGS_sorttable.o += -I$(srctree)/tools/arch/x86/include
> +else
> +HOSTCFLAGS_sorttable.o += -I$(srctree)/tools/arch/$(ARCH)/include
> +endif
> HOSTCFLAGS_sorttable.o += -DUNWINDER_ORC_ENABLED
> endif
>
> diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
> index 3aa384cec76b..d364871a1046 100644
> --- a/scripts/Makefile.lib
> +++ b/scripts/Makefile.lib
> @@ -252,6 +252,13 @@ ifdef CONFIG_OBJTOOL
>
> objtool := $(objtree)/tools/objtool/objtool
>
> +ifdef CONFIG_FRAME_POINTER_VALIDATION
> +
> +objtool-args-$(CONFIG_STACK_VALIDATION) +=
> --stackval
> +objtool-args-$(CONFIG_UNWINDER_ORC) += --orc
> +
> +else
> +
> objtool-args-$(CONFIG_HAVE_JUMP_LABEL_HACK) +=
> --hacks=jump_label
> objtool-args-$(CONFIG_HAVE_NOINSTR_HACK) += --hacks=noinstr
> objtool-args-$(CONFIG_X86_KERNEL_IBT) += --ibt
> @@ -265,6 +272,8 @@ objtool-args-$(CONFIG_HAVE_STATIC_CALL_INLINE)
> += --static-call
> objtool-args-$(CONFIG_HAVE_UACCESS_VALIDATION) +=
> --uaccess
> objtool-args-$(CONFIG_GCOV_KERNEL) +=
> --no-unreachable
>
> +endif
> +
> objtool-args = $(objtool-args-y) \
> $(if $(delay-objtool), --link) \
> $(if $(part-of-module), --module)
> diff --git a/tools/include/linux/objtool.h b/tools/include/linux/objtool.h
> index dcbd365944f6..c980522190f7 100644
> --- a/tools/include/linux/objtool.h
> +++ b/tools/include/linux/objtool.h
> @@ -31,7 +31,9 @@
>
> #ifdef CONFIG_OBJTOOL
>
> +#ifndef CONFIG_ARM64
> #include <asm/asm.h>
> +#endif
>
> #ifndef __ASSEMBLY__
>
> --
> 2.25.1
I will analyze this and respond.
Thanks for mentioning this.
Madhavan
On Thu, 2023-02-02 at 01:40 -0600, [email protected] wrote:
> From: "Madhavan T. Venkataraman" <[email protected]>
>
> The ORC code needs to be reorganized into arch-specific and generic parts
> so that architectures other than X86 can use the generic parts.
>
> orc_types.h contains the following ORC definitions shared between objtool
> and the kernel:
>
> - ORC register definitions which are arch-specific.
> - orc_entry structure which is generic.
>
> Move orc_entry into a new file include/linux/orc_entry.h. Also, the field
> names bp_reg and bp_offset in struct orc_entry are x86-specific. Change
> them to fp_reg and fp_offset. FP stands for frame pointer.
>
> Currently, the type field in orc_entry is only 2 bits. For other
> architectures, we will need more. So, expand this to 3 bits.
>
> Signed-off-by: Madhavan T. Venkataraman <[email protected]>
> ---
> arch/x86/include/asm/orc_types.h | 37 +++++-------------------
> include/linux/orc_entry.h | 39
> ++++++++++++++++++++++++++
> tools/arch/x86/include/asm/orc_types.h | 37 +++++-------------------
> tools/include/linux/orc_entry.h | 39
> ++++++++++++++++++++++++++
> tools/objtool/orc_gen.c | 4 +--
> tools/objtool/sync-check.sh | 1 +
> 6 files changed, 95 insertions(+), 62 deletions(-)
> create mode 100644 include/linux/orc_entry.h
> create mode 100644 tools/include/linux/orc_entry.h
>
[snip]
> diff --git a/tools/include/linux/orc_entry.h
> b/tools/include/linux/orc_entry.h
> new file mode 100644
> index 000000000000..3d49e3b9dabe
> --- /dev/null
> +++ b/tools/include/linux/orc_entry.h
> @@ -0,0 +1,39 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
> + */
> +
> +#ifndef _ORC_ENTRY_H
> +#define _ORC_ENTRY_H
> +
> +#ifndef __ASSEMBLY__
> +#include <asm/byteorder.h>
> +
> +/*
> + * This struct is more or less a vastly simplified version of the DWARF Call
> + * Frame Information standard. It contains only the necessary parts of DWARF
> + * CFI, simplified for ease of access by the in-kernel unwinder. It tells the
> + * unwinder how to find the previous SP and BP (and sometimes entry regs) on
> + * the stack for a given code address. Each instance of the struct corresponds
> + * to one or more code locations.
> + */
> +struct orc_entry {
> + s16 sp_offset;
> + s16 fp_offset;
> +#if defined(__LITTLE_ENDIAN_BITFIELD)
> + unsigned sp_reg:4;
> + unsigned fp_reg:4;
> + unsigned type:3;
> + unsigned end:1;
> +#elif defined(__BIG_ENDIAN_BITFIELD)
> + unsigned fp_reg:4;
> + unsigned sp_reg:4;
> + unsigned unused:4;
> + unsigned end:1;
> + unsigned type:3;
> +#
nit:
I believe you also need to update bp_reg/bp_offset -> fp_reg/fp_offset
in orc_dump() in orc_dump.c
- Suraj
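As a quick illustration of the renamed fields and the widened type field, here is a user-space sketch based on the little-endian branch of the struct quoted above. `sketch_orc_entry` and `type_fits()` are hypothetical names, and any packing attribute the real struct may carry is omitted because it is not visible in the quoted hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Little-endian layout from the quoted hunk, with the fp_* field names. */
struct sketch_orc_entry {
	int16_t sp_offset;
	int16_t fp_offset;
	unsigned sp_reg:4;
	unsigned fp_reg:4;
	unsigned type:3;	/* widened from 2 bits to 3 bits */
	unsigned end:1;
};

/*
 * Returns true if 'value' survives a round trip through the type field.
 * Storing into an unsigned bit-field truncates modulo 2^width, so values
 * that do not fit read back differently.
 */
static bool type_fits(unsigned value)
{
	struct sketch_orc_entry entry = { 0 };

	entry.type = value;
	return entry.type == value;
}
```

With a 3-bit field, type values 0 through 7 round-trip and 8 does not; with the previous 2-bit field the limit would have been 3.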
> +
> +#endif /* __ASSEMBLY__ */
> +
> +#endif /* _ORC_ENTRY_H */
> diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
> index dd3c64af9db2..68c317daadbf 100644
> --- a/tools/objtool/orc_gen.c
> +++ b/tools/objtool/orc_gen.c
> @@ -98,7 +98,7 @@ static int write_orc_entry(struct elf *elf, struct
> section *orc_sec,
> orc = (struct orc_entry *)orc_sec->data->d_buf + idx;
> memcpy(orc, o, sizeof(*orc));
> orc->sp_offset = bswap_if_needed(orc->sp_offset);
> - orc->bp_offset = bswap_if_needed(orc->bp_offset);
> + orc->fp_offset = bswap_if_needed(orc->fp_offset);
>
> /* populate reloc for ip */
> if (elf_add_reloc_to_insn(elf, ip_sec, idx * sizeof(int),
> R_X86_64_PC32,
> @@ -149,7 +149,7 @@ int orc_create(struct objtool_file *file)
>
> struct orc_entry null = {
> .sp_reg = ORC_REG_UNDEFINED,
> - .bp_reg = ORC_REG_UNDEFINED,
> + .fp_reg = ORC_REG_UNDEFINED,
> .type = UNWIND_HINT_TYPE_CALL,
> };
>
> diff --git a/tools/objtool/sync-check.sh b/tools/objtool/sync-
> check.sh
> index ee49b4e9e72c..ef1acb064605 100755
> --- a/tools/objtool/sync-check.sh
> +++ b/tools/objtool/sync-check.sh
> @@ -18,6 +18,7 @@ arch/x86/include/asm/unwind_hints.h
> arch/x86/lib/x86-opcode-map.txt
> arch/x86/tools/gen-insn-attr-x86.awk
> include/linux/static_call_types.h
> +include/linux/orc_entry.h
> "
>
> SYNC_CHECK_FILES='
On Thu, 2023-02-02 at 01:40 -0600, [email protected] wrote:
> From: "Madhavan T. Venkataraman" <[email protected]>
>
> Introduce a reliability flag in struct unwind_state. This will be set to
> false if the PC does not have a valid ORC or if the frame pointer computed
> from the ORC does not match the actual frame pointer.
>
> Now that the unwinder can validate the frame pointer, introduce
> arch_stack_walk_reliable().
>
> Signed-off-by: Madhavan T. Venkataraman <[email protected]>
> ---
> arch/arm64/include/asm/stacktrace/common.h | 15 ++
> arch/arm64/kernel/stacktrace.c | 167
> ++++++++++++++++++++-
> 2 files changed, 175 insertions(+), 7 deletions(-)
>
[snip]
> -static void notrace unwind(struct unwind_state *state,
> +static int notrace unwind(struct unwind_state *state, bool need_reliable,
> stack_trace_consume_fn consume_entry, void *cookie)
> {
> - while (1) {
> - int ret;
> + int ret = 0;
>
> + while (1) {
> + if (need_reliable && !state->reliable)
> + return -EINVAL;
> if (!consume_entry(cookie, state->pc))
> break;
> ret = unwind_next(state);
> + if (need_reliable && !ret)
> + unwind_check_reliable(state);
> if (ret < 0)
> break;
> }
> + return ret;
nit:
I think you're looking more for comments on the approach and the
correctness of these patches, but from an initial read I'm still
putting it all together in my head. So this comment is on the coding
style.
The above loop seems to check the current reliability state, then
unwind a frame, then check the reliability, and then break based on
something which couldn't have been updated by the line immediately
above. I propose something like:
unwind(...) {
	ret = 0;

	while (!ret) {
		if (need_reliable) {
			unwind_check_reliable(state);
			if (!state->reliable)
				return -EINVAL;
		}
		if (!consume_entry(cookie, state->pc))
			return -EINVAL;
		ret = unwind_next(state);
	}

	return ret;
}
This also removes the need for the call to unwind_check_reliable()
before the first unwind() below in arch_stack_walk_reliable().
- Suraj
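To make the control flow of the proposed loop concrete, here is a self-contained user-space sketch. Everything prefixed with `mock_` is a hypothetical stand-in: frames come from a plain array, and the `fp_valid` flag stands in for the ORC-based frame pointer check. The final mapping of -ENOENT to success follows arch_stack_walk_reliable() in the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types. */
struct mock_frame {
	unsigned long pc;
	bool fp_valid;		/* result the ORC-based FP check would give */
};

struct mock_unwind_state {
	const struct mock_frame *frames;
	size_t nr_frames;	/* must be at least 1 */
	size_t idx;
	unsigned long pc;
	bool reliable;
};

typedef bool (*mock_consume_fn)(void *cookie, unsigned long pc);

static void mock_check_reliable(struct mock_unwind_state *state)
{
	state->reliable = state->frames[state->idx].fp_valid;
}

/* Advance one frame; -ENOENT marks the final frame, as in the patch. */
static int mock_unwind_next(struct mock_unwind_state *state)
{
	if (state->idx + 1 >= state->nr_frames)
		return -ENOENT;
	state->idx++;
	state->pc = state->frames[state->idx].pc;
	return 0;
}

/* The loop shape proposed above: check, consume, then advance. */
static int mock_unwind(struct mock_unwind_state *state, bool need_reliable,
		       mock_consume_fn consume_entry, void *cookie)
{
	int ret = 0;

	while (!ret) {
		if (need_reliable) {
			mock_check_reliable(state);
			if (!state->reliable)
				return -EINVAL;
		}
		if (!consume_entry(cookie, state->pc))
			return -EINVAL;
		ret = mock_unwind_next(state);
	}
	return ret;
}

/* Consumer that only counts the entries it is handed. */
static bool mock_count_entry(void *cookie, unsigned long pc)
{
	(void)pc;
	(*(size_t *)cookie)++;
	return true;
}

static int mock_walk_reliable(const struct mock_frame *frames,
			      size_t nr_frames, size_t *nr_seen)
{
	struct mock_unwind_state state = {
		.frames = frames,
		.nr_frames = nr_frames,
		.pc = frames[0].pc,
	};
	int ret;

	*nr_seen = 0;
	ret = mock_unwind(&state, true, mock_count_entry, nr_seen);
	return ret == -ENOENT ? 0 : -EINVAL;
}
```

In this shape every frame, including the first, is checked before it is handed to the consumer, so no separate reliability check is needed before the first call.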
> }
> NOKPROBE_SYMBOL(unwind);
>
> @@ -216,5 +337,37 @@ noinline notrace void
> arch_stack_walk(stack_trace_consume_fn consume_entry,
> unwind_init_from_task(&state, task);
> }
>
> - unwind(&state, consume_entry, cookie);
> + unwind(&state, false, consume_entry, cookie);
> +}
> +
> +noinline notrace int arch_stack_walk_reliable(
> + stack_trace_consume_fn consume_entry,
> + void *cookie, struct task_struct *task)
> +{
> + struct stack_info stacks[] = {
> + stackinfo_get_task(task),
> + STACKINFO_CPU(irq),
> +#if defined(CONFIG_VMAP_STACK)
> + STACKINFO_CPU(overflow),
> +#endif
> +#if defined(CONFIG_VMAP_STACK) && defined(CONFIG_ARM_SDE_INTERFACE)
> + STACKINFO_SDEI(normal),
> + STACKINFO_SDEI(critical),
> +#endif
> + };
> + struct unwind_state state = {
> + .stacks = stacks,
> + .nr_stacks = ARRAY_SIZE(stacks),
> + };
> + int ret;
> +
> + if (task == current)
> + unwind_init_from_caller(&state);
> + else
> + unwind_init_from_task(&state, task);
> + unwind_check_reliable(&state);
> +
> + ret = unwind(&state, true, consume_entry, cookie);
> +
> + return ret == -ENOENT ? 0 : -EINVAL;
> }
<snip>
> Testing
> =======
>
> - I have run all of the livepatch selftests successfully. I have written a
> couple of extra selftests myself which I will be posting separately
Hi,
What test configuration/environment are you using for testing?
When I tried kselftest with a Fedora-based config on a VM, I got errors
because the livepatch transition won't finish until a signal is sent
(i.e. it takes 15s for every transition).
[excerpt from test result]
```
$ sudo ./test-livepatch.sh
TEST: basic function patching ... not ok
--- expected
+++ result
@@ -2,11 +2,13 @@
livepatch: enabling patch 'test_klp_livepatch'
livepatch: 'test_klp_livepatch': initializing patching transition
livepatch: 'test_klp_livepatch': starting patching transition
+livepatch: signaling remaining tasks
livepatch: 'test_klp_livepatch': completing patching transition
```
Thanks,
Tomohiro
>
> - I have a test driver to induce a NULL pointer exception to make sure
> that unwinding through exception handlers is reliable.
>
> - I use the test driver to create a timer to make sure that unwinding through
> the timer IRQ is reliable.
>
> - I call the unwinder from different places during boot to make sure that
> the unwinding in each of those cases is reliable.
>
On Wed 2023-03-01 03:12:08, Tomohiro Misono (Fujitsu) wrote:
> <snip>
> > Testing
> > =======
> >
> > - I have run all of the livepatch selftests successfully. I have written a
> > couple of extra selftests myself which I will be posting separately
> Hi,
>
> What test configuration/environment you are using for test?
> When I tried kselftest with fedora based config on VM, I got errors
> because livepatch transition won't finish until signal is sent
> (i.e. it takes 15s for every transition).
>
> [excerpt from test result]
> ```
> $ sudo ./test-livepatch.sh
> TEST: basic function patching ... not ok
>
> --- expected
> +++ result
> @@ -2,11 +2,13 @@
> livepatch: enabling patch 'test_klp_livepatch'
> livepatch: 'test_klp_livepatch': initializing patching transition
> livepatch: 'test_klp_livepatch': starting patching transition
> +livepatch: signaling remaining tasks
> livepatch: 'test_klp_livepatch': completing patching transition
> ```
It might be interesting to see what process is blocking the
transition. The transition state is visible in
/proc/<pid>/patch_state.
The transition is blocked when a process is in KLP_UNPATCHED state.
It is defined in include/linux/livepatch.h:
#define KLP_UNPATCHED 0
Well, the timing against the transition is important. The following
might help to see the blocking processes:
$> modprobe livepatch-sample ; \
sleep 1; \
for proc_path in \
`grep "\-1" /proc/*/patch_state | cut -d '/' -f-3` ; \
do \
cat $proc_path/comm ; \
cat $proc_path/stack ; \
echo === ; \
done
After this, the livepatch has to be manually disabled and removed:
$> echo 0 >/sys/kernel/livepatch/livepatch_sample/enabled
$> rmmod livepatch_sample
Best Regards,
Petr
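For anyone who prefers a small program over the shell loop, the per-task check can be sketched in C as below. The three state values are assumed to match include/linux/livepatch.h at the time of this thread (only KLP_UNPATCHED is quoted above), and the `SKETCH_` names and helper functions are hypothetical. The caller chooses which state to look for.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed to mirror the KLP_* values in include/linux/livepatch.h. */
#define SKETCH_KLP_UNDEFINED	(-1)
#define SKETCH_KLP_UNPATCHED	0
#define SKETCH_KLP_PATCHED	1

/* Parse the text of a patch_state file; returns 0 on success, -1 on error. */
static int parse_patch_state(const char *text, int *state)
{
	char *end;
	long val = strtol(text, &end, 10);

	if (end == text)
		return -1;	/* no digits at all */
	if (val < SKETCH_KLP_UNDEFINED || val > SKETCH_KLP_PATCHED)
		return -1;	/* not a known state */
	*state = (int)val;
	return 0;
}

static const char *patch_state_name(int state)
{
	switch (state) {
	case SKETCH_KLP_UNDEFINED:
		return "undefined";
	case SKETCH_KLP_UNPATCHED:
		return "unpatched";
	case SKETCH_KLP_PATCHED:
		return "patched";
	default:
		return "unknown";
	}
}

/* Read /proc/<pid>/patch_state; returns 0 on success, -1 on error. */
static int read_patch_state(long pid, int *state)
{
	char path[64];
	char buf[16];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%ld/patch_state", pid);
	f = fopen(path, "r");
	if (!f)
		return -1;	/* task exited or livepatch not enabled */
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_patch_state(buf, state);
}
```

A caller would iterate over the numeric entries in /proc, call `read_patch_state()` on each, and print comm and stack for the tasks in the state of interest, as the shell loop does.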
> > <snip>
> > > Testing
> > > =======
> > >
> > > - I have run all of the livepatch selftests successfully. I have written a
> > > couple of extra selftests myself which I will be posting separately
> > Hi,
> >
> > What test configuration/environment you are using for test?
> > When I tried kselftest with fedora based config on VM, I got errors
> > because livepatch transition won't finish until signal is sent
> > (i.e. it takes 15s for every transition).
> >
> > [excerpt from test result]
> > ```
> > $ sudo ./test-livepatch.sh
> > TEST: basic function patching ... not ok
> >
> > --- expected
> > +++ result
> > @@ -2,11 +2,13 @@
> > livepatch: enabling patch 'test_klp_livepatch'
> > livepatch: 'test_klp_livepatch': initializing patching transition
> > livepatch: 'test_klp_livepatch': starting patching transition
> > +livepatch: signaling remaining tasks
> > livepatch: 'test_klp_livepatch': completing patching transition
> > ```
>
> It might be interesting to see what process is blocking the
> transition. The transition state is visible in
> /proc/<pid>/patch_state.
>
> The transition is blocked when a process is in KLP_UNPATCHED state.
> It is defined in include/linux/livepatch.h:
>
> #define KLP_UNPATCHED 0
>
> Well, the timing against the transition is important. The following
> might help to see the blocking processes:
>
> $> modprobe livepatch-sample ; \
> sleep 1; \
> for proc_path in \
> `grep "\-1" /proc/*/patch_state | cut -d '/' -f-3` ; \
> do \
> cat $proc_path/comm ; \
> cat $proc_path/stack ; \
> echo === ; \
> done
>
> After this the livepatch has to be manually disabled and removed
>
> $> echo 0 >/sys/kernel/livepatch/livepatch_sample/enabled
> $> rmmod livepatch_sample
Thanks for the suggestion. This is quite helpful for debugging.
I did some tests and, in short, I was able to run all livepatch selftests successfully
on a clang15-built kernel when RANDOMIZE_KSTACK_OFFSET=n.
Below is my analysis. Please let me know if I'm wrong.
When I checked the stack state during a livepatch transition, I saw that some tasks
sleeping in a system call were not transitioned. For example, I saw a task with the
following stack:
```
sshd
[<0>] do_select+0x5cc/0x64c
[<0>] core_sys_select+0x174/0x210
[<0>] __arm64_sys_pselect6+0x11c/0x384
[<0>] invoke_syscall+0x78/0x108
[<0>] el0_svc_common+0xc0/0xfc
[<0>] do_el0_svc+0x38/0xd0
[<0>] el0_svc+0x34/0x110
[<0>] el0t_64_sync_handler+0x84/0xf0
[<0>] el0t_64_sync+0x190/0x194
```
Then, I noticed that invoke_syscall contains instructions that add a random offset
to sp when RANDOMIZE_KSTACK_OFFSET=y, which is the case in the above configuration.
Indeed, I can see sp being modified in the binary:
```
$ objdump -d vmlinux --disassemble=invoke_syscall
...
ffff80000803076c <invoke_syscall>:
...
ffff8000080307b4: 9100011f mov sp, x8
...
ffff80000803085c: d65f03c0 ret
```
This marks the instruction as UNRELIABLE because the sp value is not deterministic:
https://github.com/madvenka786/linux/blob/orc_v3/tools/objtool/arch/arm64/decode.c#L173
and in turn skips the generation of ORC data:
https://github.com/madvenka786/linux/blob/orc_v3/tools/objtool/dcheck.c#L313
I can confirm the orc result in vmlinux:
```
./tools/objtool/objtool --dump vmlinux
...
# no entry in range of invoke_syscall (ffff80000803076c - ffff80000803085c)
ffff800008030764: cfa:sp+0 x29:cfa+0 type:call end:0
ffff800008030874: cfa:(und) x29:(und) type:call end:0
ffff800008030874: cfa:sp+0 x29:cfa+0 type:call end:0
...
```
So, when a livepatch is applied, the stack trace of any task whose stack contains
invoke_syscall cannot be validated in arch_stack_walk_reliable(), and the transition
won't happen until the fake signal is delivered (unless the task's state changes).
It seems that stack validation itself works as intended.
As I said, when RANDOMIZE_KSTACK_OFFSET=n, the selftests run fine.
Or have I misunderstood something completely?
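To illustrate why a write such as "mov sp, x8" defeats static tracking, here is
a toy Python model. It is not objtool's decoder: the instruction strings, the
function name track_sp, and the rule set are all assumptions made up for this
sketch, and pre/post-indexed loads and stores are not modeled.

```python
import re

def track_sp(instructions):
    """Return the stack offset in effect before each instruction.

    Returns None as soon as SP is written with a value that cannot be
    known at build time, mirroring the "no ORC data" outcome above.
    """
    offset = 0
    offsets = []
    for insn in instructions:
        offsets.append(offset)
        imm = re.fullmatch(r"(add|sub) sp, sp, #(\d+)", insn)
        if imm:
            delta = int(imm.group(2))
            # The stack grows down: "sub" allocates, "add" releases.
            offset += delta if imm.group(1) == "sub" else -delta
        elif re.match(r"\w+ sp,", insn):
            # Any other instruction whose destination is SP (e.g. "mov sp, x8").
            return None
    return offsets
```

With this model, a plain prologue/epilogue pair yields a full list of offsets,
while a sequence containing "mov sp, x8" yields None for the whole function.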
Regards,
Tomohiro
Sorry for the delay in responding to your comments. I was out sick.
Please find my responses inline.
On 2/18/23 03:30, Suraj Jitindar Singh wrote:
> On Thu, 2023-02-02 at 01:40 -0600, [email protected] wrote:
>> From: "Madhavan T. Venkataraman" <[email protected]>
>>
>> The ORC code needs to be reorganized into arch-specific and generic parts
>> so that architectures other than X86 can use the generic parts.
>>
>> orc_types.h contains the following ORC definitions shared between objtool
>> and the kernel:
>>
>> - ORC register definitions which are arch-specific.
>> - orc_entry structure which is generic.
>>
>> Move orc_entry into a new file include/linux/orc_entry.h. Also, the field
>> names bp_reg and bp_offset in struct orc_entry are x86-specific. Change
>> them to fp_reg and fp_offset. FP stands for frame pointer.
>>
>> Currently, the type field in orc_entry is only 2 bits. For other
>> architectures, we will need more. So, expand this to 3 bits.
>>
>> Signed-off-by: Madhavan T. Venkataraman <[email protected]>
>> ---
>> arch/x86/include/asm/orc_types.h | 37 +++++-------------------
>> include/linux/orc_entry.h | 39 ++++++++++++++++++++++++++
>> tools/arch/x86/include/asm/orc_types.h | 37 +++++-------------------
>> tools/include/linux/orc_entry.h | 39 ++++++++++++++++++++++++++
>> tools/objtool/orc_gen.c | 4 +--
>> tools/objtool/sync-check.sh | 1 +
>> 6 files changed, 95 insertions(+), 62 deletions(-)
>> create mode 100644 include/linux/orc_entry.h
>> create mode 100644 tools/include/linux/orc_entry.h
>>
>
> [snip]
>
>> diff --git a/tools/include/linux/orc_entry.h b/tools/include/linux/orc_entry.h
>> new file mode 100644
>> index 000000000000..3d49e3b9dabe
>> --- /dev/null
>> +++ b/tools/include/linux/orc_entry.h
>> @@ -0,0 +1,39 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +/*
>> + * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
>> + */
>> +
>> +#ifndef _ORC_ENTRY_H
>> +#define _ORC_ENTRY_H
>> +
>> +#ifndef __ASSEMBLY__
>> +#include <asm/byteorder.h>
>> +
>> +/*
>> + * This struct is more or less a vastly simplified version of the DWARF Call
>> + * Frame Information standard. It contains only the necessary parts of DWARF
>> + * CFI, simplified for ease of access by the in-kernel unwinder. It tells the
>> + * unwinder how to find the previous SP and BP (and sometimes entry regs) on
>> + * the stack for a given code address. Each instance of the struct corresponds
>> + * to one or more code locations.
>> + */
>> +struct orc_entry {
>> + s16 sp_offset;
>> + s16 fp_offset;
>> +#if defined(__LITTLE_ENDIAN_BITFIELD)
>> + unsigned sp_reg:4;
>> + unsigned fp_reg:4;
>> + unsigned type:3;
>> + unsigned end:1;
>> +#elif defined(__BIG_ENDIAN_BITFIELD)
>> + unsigned fp_reg:4;
>> + unsigned sp_reg:4;
>> + unsigned unused:4;
>> + unsigned end:1;
>> + unsigned type:3;
>> +#
>
> nit:
> I believe you also need to update bp_reg/bp_offset -> fp_reg/fp_offset
> in orc_dump() in orc_dump.c
>
OK. Will do.
Madhavan
> - Suraj
>
>> +
>> +#endif /* __ASSEMBLY__ */
>> +
>> +#endif /* _ORC_ENTRY_H */
>> diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
>> index dd3c64af9db2..68c317daadbf 100644
>> --- a/tools/objtool/orc_gen.c
>> +++ b/tools/objtool/orc_gen.c
>> @@ -98,7 +98,7 @@ static int write_orc_entry(struct elf *elf, struct section *orc_sec,
>> orc = (struct orc_entry *)orc_sec->data->d_buf + idx;
>> memcpy(orc, o, sizeof(*orc));
>> orc->sp_offset = bswap_if_needed(orc->sp_offset);
>> - orc->bp_offset = bswap_if_needed(orc->bp_offset);
>> + orc->fp_offset = bswap_if_needed(orc->fp_offset);
>>
>> /* populate reloc for ip */
>> if (elf_add_reloc_to_insn(elf, ip_sec, idx * sizeof(int),
>> R_X86_64_PC32,
>> @@ -149,7 +149,7 @@ int orc_create(struct objtool_file *file)
>>
>> struct orc_entry null = {
>> .sp_reg = ORC_REG_UNDEFINED,
>> - .bp_reg = ORC_REG_UNDEFINED,
>> + .fp_reg = ORC_REG_UNDEFINED,
>> .type = UNWIND_HINT_TYPE_CALL,
>> };
>>
>> diff --git a/tools/objtool/sync-check.sh b/tools/objtool/sync-check.sh
>> index ee49b4e9e72c..ef1acb064605 100755
>> --- a/tools/objtool/sync-check.sh
>> +++ b/tools/objtool/sync-check.sh
>> @@ -18,6 +18,7 @@ arch/x86/include/asm/unwind_hints.h
>> arch/x86/lib/x86-opcode-map.txt
>> arch/x86/tools/gen-insn-attr-x86.awk
>> include/linux/static_call_types.h
>> +include/linux/orc_entry.h
>> "
>>
>> SYNC_CHECK_FILES='
On 3/2/23 10:23, Petr Mladek wrote:
> On Wed 2023-03-01 03:12:08, Tomohiro Misono (Fujitsu) wrote:
>> <snip>
>>> Testing
>>> =======
>>>
>>> - I have run all of the livepatch selftests successfully. I have written a
>>> couple of extra selftests myself which I will be posting separately
>> Hi,
>>
>> What test configuration/environment you are using for test?
>> When I tried kselftest with fedora based config on VM, I got errors
>> because livepatch transition won't finish until signal is sent
>> (i.e. it takes 15s for every transition).
>>
>> [excerpt from test result]
>> ```
>> $ sudo ./test-livepatch.sh
>> TEST: basic function patching ... not ok
>>
>> --- expected
>> +++ result
>> @@ -2,11 +2,13 @@
>> livepatch: enabling patch 'test_klp_livepatch'
>> livepatch: 'test_klp_livepatch': initializing patching transition
>> livepatch: 'test_klp_livepatch': starting patching transition
>> +livepatch: signaling remaining tasks
>> livepatch: 'test_klp_livepatch': completing patching transition
>> ```
>
> It might be interesting to see what process is blocking the
> transition. The transition state is visible in
> /proc/<pid>/patch_state.
>
> The transition is blocked when a process is in KLP_UNPATCHED state.
> It is defined in include/linux/livepatch.h:
>
> #define KLP_UNPATCHED 0
>
> Well, the timing against the transition is important. The following
> might help to see the blocking processes:
>
> $> modprobe livepatch-sample ; \
> sleep 1; \
> for proc_path in \
> `grep "\-1" /proc/*/patch_state | cut -d '/' -f-3` ; \
> do \
> cat $proc_path/comm ; \
> cat $proc_path/stack ; \
> echo === ; \
> done
>
> After this the livepatch has to be manually disabled and removed
>
> $> echo 0 >/sys/kernel/livepatch/livepatch_sample/enabled
> $> rmmod livepatch_sample
>
> Best Regards,
> Petr
Thanks for the suggestion. I will try to reproduce the problem and look at what process(es) are holding up
the livepatch.
Madhavan
On 2/22/23 22:07, Suraj Jitindar Singh wrote:
> On Thu, 2023-02-02 at 01:40 -0600, [email protected] wrote:
>> From: "Madhavan T. Venkataraman" <[email protected]>
>>
>> Introduce a reliability flag in struct unwind_state. This will be set to
>> false if the PC does not have a valid ORC or if the frame pointer computed
>> from the ORC does not match the actual frame pointer.
>>
>> Now that the unwinder can validate the frame pointer, introduce
>> arch_stack_walk_reliable().
>>
>> Signed-off-by: Madhavan T. Venkataraman <[email protected]>
>> ---
>> arch/arm64/include/asm/stacktrace/common.h | 15 ++
>> arch/arm64/kernel/stacktrace.c | 167 ++++++++++++++++++++-
>> 2 files changed, 175 insertions(+), 7 deletions(-)
>>
>
> [snip]
>
>> -static void notrace unwind(struct unwind_state *state,
>> +static int notrace unwind(struct unwind_state *state, bool need_reliable,
>> stack_trace_consume_fn consume_entry, void *cookie)
>> {
>> - while (1) {
>> - int ret;
>> + int ret = 0;
>>
>> + while (1) {
>> + if (need_reliable && !state->reliable)
>> + return -EINVAL;
>> if (!consume_entry(cookie, state->pc))
>> break;
>> ret = unwind_next(state);
>> + if (need_reliable && !ret)
>> + unwind_check_reliable(state);
>> if (ret < 0)
>> break;
>> }
>> + return ret;
>
> nit:
>
> I think you're looking more for comments on the approach and the
> correctness of these patches, but from an initial read I'm still
> putting it all together in my head. So this comment is on the coding
> style.
>
> The above loop seems to check the current reliability state, then
> unwind a frame, then check the reliability, and then break based on
> something which couldn't have been updated by the line immediately
> above. I propose something like:
>
> unwind(...) {
> ret = 0;
>
> while (!ret) {
> if (need_reliable) {
> unwind_check_reliable(state);
> if (!state->reliable)
> return -EINVAL;
> }
> if (!consume_entry(cookie, state->pc))
> return -EINVAL;
> ret = unwind_next(state);
> }
>
> return ret;
> }
>
> This also removes the need for the call to unwind_check_reliable()
> before the first unwind() below in arch_stack_walk_reliable().
>
OK. Suggestion sounds reasonable. Will do.
Madhavan
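For reference, the control flow of the suggested loop can be exercised outside
the kernel. The following Python model is only a sketch: UnwindState, the
per-frame (pc, reliable) tuples, and stack_walk_reliable are stand-ins invented
here, and it assumes that unwind_next() returns -ENOENT once the final frame
has been consumed, as the "ret == -ENOENT ? 0 : -EINVAL" check in the patch
implies. It also assumes a non-empty list of frames.

```python
import errno

class UnwindState:
    """Toy stand-in for struct unwind_state: a list of (pc, reliable) frames."""
    def __init__(self, frames):
        self.frames = frames
        self.index = 0
        self.reliable = True

    @property
    def pc(self):
        return self.frames[self.index][0]

def unwind_check_reliable(state):
    # Assumption: reliability is precomputed per frame in this model.
    state.reliable = state.frames[state.index][1]

def unwind_next(state):
    # Move to the caller's frame; -ENOENT signals the end of the stack.
    if state.index + 1 >= len(state.frames):
        return -errno.ENOENT
    state.index += 1
    return 0

def unwind(state, need_reliable, consume_entry):
    ret = 0
    while ret == 0:
        if need_reliable:
            unwind_check_reliable(state)
            if not state.reliable:
                return -errno.EINVAL
        if not consume_entry(state.pc):
            return -errno.EINVAL
        ret = unwind_next(state)
    return ret

def stack_walk_reliable(frames, consume_entry):
    ret = unwind(UnwindState(frames), True, consume_entry)
    return 0 if ret == -errno.ENOENT else -errno.EINVAL
```

In this shape every frame, including the first, is checked before it is handed
to consume_entry, so no separate check is needed before the loop.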
> - Suraj
>
>> }
>> NOKPROBE_SYMBOL(unwind);
>>
>> @@ -216,5 +337,37 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
>> unwind_init_from_task(&state, task);
>> }
>>
>> - unwind(&state, consume_entry, cookie);
>> + unwind(&state, false, consume_entry, cookie);
>> +}
>> +
>> +noinline notrace int arch_stack_walk_reliable(
>> + stack_trace_consume_fn consume_entry,
>> + void *cookie, struct task_struct *task)
>> +{
>> + struct stack_info stacks[] = {
>> + stackinfo_get_task(task),
>> + STACKINFO_CPU(irq),
>> +#if defined(CONFIG_VMAP_STACK)
>> + STACKINFO_CPU(overflow),
>> +#endif
>> +#if defined(CONFIG_VMAP_STACK) && defined(CONFIG_ARM_SDE_INTERFACE)
>> + STACKINFO_SDEI(normal),
>> + STACKINFO_SDEI(critical),
>> +#endif
>> + };
>> + struct unwind_state state = {
>> + .stacks = stacks,
>> + .nr_stacks = ARRAY_SIZE(stacks),
>> + };
>> + int ret;
>> +
>> + if (task == current)
>> + unwind_init_from_caller(&state);
>> + else
>> + unwind_init_from_task(&state, task);
>> + unwind_check_reliable(&state);
>> +
>> + ret = unwind(&state, true, consume_entry, cookie);
>> +
>> + return ret == -ENOENT ? 0 : -EINVAL;
>> }
On 2/28/23 21:12, Tomohiro Misono (Fujitsu) wrote:
> <snip>
>> Testing
>> =======
>>
>> - I have run all of the livepatch selftests successfully. I have written a
>> couple of extra selftests myself which I will be posting separately
> Hi,
>
> What test configuration/environment you are using for test?
> When I tried kselftest with fedora based config on VM, I got errors
> because livepatch transition won't finish until signal is sent
> (i.e. it takes 15s for every transition).
>
Sorry for not responding earlier. I was out sick.
I tested on a bare metal system (thunderx) running Ubuntu. I will try to reproduce
the error you are seeing on a VM running fedora.
Madhavan
> [excerpt from test result]
> ```
> $ sudo ./test-livepatch.sh
> TEST: basic function patching ... not ok
>
> --- expected
> +++ result
> @@ -2,11 +2,13 @@
> livepatch: enabling patch 'test_klp_livepatch'
> livepatch: 'test_klp_livepatch': initializing patching transition
> livepatch: 'test_klp_livepatch': starting patching transition
> +livepatch: signaling remaining tasks
> livepatch: 'test_klp_livepatch': completing patching transition
> ```
>
> Thanks,
> Tomohiro
>
>>
>> - I have a test driver to induce a NULL pointer exception to make sure
>> that unwinding through exception handlers is reliable.
>>
>> - I use the test driver to create a timer to make sure that unwinding through
>> the timer IRQ is reliable.
>>
>> - I call the unwinder from different places during boot to make sure that
>> the unwinding in each of those cases is reliable.
>>
Hi Madhavan,
At a high-level, I think this still falls afoul of our desire to not reverse
engineer control flow from the binary, and so I do not think this is the right
approach. I've expanded a bit on that below.
I do think it would be nice to have *some* of the objtool changes, as I do
think we will want to use objtool for some things in future (e.g. some
build-time binary patching such as table sorting).
On Thu, Feb 02, 2023 at 01:40:14AM -0600, [email protected] wrote:
> From: "Madhavan T. Venkataraman" <[email protected]>
>
> Introduction
> ============
>
> The livepatch feature requires an unwinder that can provide a reliable stack
> trace. General requirements for a reliable unwinder are described in this
> document from Mark Rutland:
>
> Documentation/livepatch/reliable-stacktrace.rst
>
> The requirements have two parts:
>
> 1. The unwinder must be enhanced with certain features. E.g.,
>
> - Identifying successful termination of stack trace
> - Identifying unwindable and non-unwindable code
> - Identifying interrupts and exceptions occurring in the frame pointer
> prolog and epilog
> - Identifying features such as kretprobe and ftrace graph tracing
> that can modify the return address stored on the stack
> - Identifying corrupted/unreliable stack contents
> - Architecture-specific items that can render a stack trace unreliable
> at certain points in code
>
> 2. Validation of the frame pointer
>
> This assumes that the unwinder is based on the frame pointer (FP).
> The actual frame pointer that the unwinder uses cannot just be
> assumed to be correct. It needs to be validated somehow.
>
> This patch series is to address the following:
>
> - Identifying unwindable and non-unwindable code
> - Identifying interrupts and exceptions occurring in the frame pointer
> prolog and epilog
> - Validation of the frame pointer
>
> The rest are already in place AFAICT.
Just as a note: there are a few issues remaining (e.g. the kretprobe and fgraph
PC recovery both have windows where they lose the original return address), and
there are a few compiler-generated trampoline functions with non-AAPCS calling
conventions that will need special care.
> Validation of the FP (aka FRAME_POINTER_VALIDATION)
> ====================
>
> The current approach in Linux is to use objtool, a build time tool, for this
> purpose. When configured, objtool is invoked on every relocatable object file
> during kernel build. It performs static analysis of the code in each file. It
> walks the instructions in every function and notes the changes to the stack
> pointer (SP) and the frame pointer (FP). It makes sure that the changes are in
> accordance with the ABI rules. There are also a lot of other checks that
> Objtool performs. Once objtool completes successfully, the kernel can then be
> used for livepatch purposes.
>
> Objtool can have uses other than just FP validation. For instance, it can check
> control flow integrity during its analysis.
>
> Problem
> =======
>
> Objtool is complex and highly architecture-dependent. There are a lot of
> different checks in objtool that all of the code in the kernel must pass
> before livepatch can be enabled. If a check fails, it must be corrected
> before we can proceed. Sometimes, the kernel code needs to be fixed.
> Sometimes, it is a compiler bug that needs to be fixed. The challenge is
> also to prove that all the work is complete for an architecture.
>
> As such, it presents a great challenge to enable livepatch for an
> architecture.
There's a more fundamental issue here in that objtool has to reverse-engineer
control flow, and so even if the kernel code and compiled code generation is
*perfect*, it's possible that objtool won't recognise the structure of the
generated code, and won't be able to reverse-engineer the correct control flow.
We've seen issues where objtool didn't understand jump tables, so support for
that got disabled on x86. A key objection from the arm64 side is that we don't
want to disable compiler code generation strategies like this. Further, as
compilers evolve, their code generation strategies will change, and it's likely
there will be other cases that crop up. This is inherently fragile.
The key objection from the arm64 side is that we don't want to
reverse-engineer details from the binary, as this is complex, fragile, and
unstable. This is why we've previously suggested that we should work with
compiler folk to get what we need.
I'll note that at the last Linux Plumbers Conference, there was a discussion
about what is now called SFrame, which *might* give us sufficient information,
but I have not had the time to dig into that as I have been chasing other
problems and trying to get other infrastructure in place.
> A different approach
> ====================
>
> I would like to propose a different approach for FP validation. I would
> like to be able to enable livepatch for an architecture as is. That is,
> without "fixing" the kernel or the compiler for it:
>
> There are three steps in this:
>
> 1. Objtool walks all the functions as usual. It computes the stack and
> frame pointer offsets at each instruction as usual. It generates ORC
> records and stores them in special sections as usual. This is simple
> enough to do.
This still requires reverse-engineering the forward-edge control flow in order
to compute those offsets, so the same objections apply with this approach. I do
not think this is the right approach.
I would *strongly* prefer that we work with compiler folk to get the
information that we need.
[...]
> FWIW, I have also compared the CFI I am generating with DWARF
> information that the compiler generates. The CFIs match a
> 100% for Clang. In the case of gcc, the comparison fails
> in 1.7% of the cases. I have analyzed those cases and found
> the DWARF information generated by gcc is incorrect. The
> ORC generated by my Objtool is correct.
Have you reported this to the GCC folk, and can you give any examples?
I'm sure they would be interested in fixing this, regardless of whether we end
up using it.
Thanks,
Mark.
> Then, I noticed that invoke_syscall generates instructions to add random offset
> in sp when RANDOMIZE_KSTACK_OFFSET=y, which is true in the above case.
I'm also seeing this behavior when compiling with
RANDOMIZE_KSTACK_OFFSET=y. I wonder if a special hint type
could/should be added to allow for skipping the reliability check for
stack frames with this randomized offset? Forgive me if this is a
naive suggestion.
Thanks,
Dylan
Hi Mark,
Sorry for the long delay in responding. I was caught up in many things.
My responses are inline.
On 3/23/23 12:17, Mark Rutland wrote:
> Hi Madhavan,
>
> At a high-level, I think this still falls afoul of our desire to not reverse
> engineer control flow from the binary, and so I do not think this is the right
> approach. I've expanded a bit on that below.
>
> I do think it would be nice to have *some* of the objtool changes, as I do
> think we will want to use objtool for some things in future (e.g. some
> build-time binary patching such as table sorting).
>
OK. I have been under the impression that the arm64 folks are basically OK with
Objtool's approach of reverse engineering from the binary. I did not see
any specific objections to previously submitted patches based on this approach
including mine.
So, if the community is not in agreement with this approach, I will go back to the
drawing board for this one.
Does anyone else have an opinion on this subject?
> On Thu, Feb 02, 2023 at 01:40:14AM -0600, [email protected] wrote:
>> From: "Madhavan T. Venkataraman" <[email protected]>
>>
>> Introduction
>> ============
>>
>> The livepatch feature requires an unwinder that can provide a reliable stack
>> trace. General requirements for a reliable unwinder are described in this
>> document from Mark Rutland:
>>
>> Documentation/livepatch/reliable-stacktrace.rst
>>
>> The requirements have two parts:
>>
>> 1. The unwinder must be enhanced with certain features. E.g.,
>>
>> - Identifying successful termination of stack trace
>> - Identifying unwindable and non-unwindable code
>> - Identifying interrupts and exceptions occurring in the frame pointer
>> prolog and epilog
>> - Identifying features such as kretprobe and ftrace graph tracing
>> that can modify the return address stored on the stack
>> - Identifying corrupted/unreliable stack contents
>> - Architecture-specific items that can render a stack trace unreliable
>> at certain points in code
>>
>> 2. Validation of the frame pointer
>>
>> This assumes that the unwinder is based on the frame pointer (FP).
>> The actual frame pointer that the unwinder uses cannot just be
>> assumed to be correct. It needs to be validated somehow.
>>
>> This patch series is to address the following:
>>
>> - Identifying unwindable and non-unwindable code
>> - Identifying interrupts and exceptions occurring in the frame pointer
>> prolog and epilog
>> - Validation of the frame pointer
>>
>> The rest are already in place AFAICT.
>
> Just as a note: there are a few issues remaining (e.g. the kretprobe and fgraph
> PC recovery both have windows where they lose the original return address), and
> there are a few compiler-generated trampoline functions with non-AAPCS calling
> conventions that will need special care.
>
OK.
>> Validation of the FP (aka FRAME_POINTER_VALIDATION)
>> ====================
>>
>> The current approach in Linux is to use objtool, a build time tool, for this
>> purpose. When configured, objtool is invoked on every relocatable object file
>> during kernel build. It performs static analysis of the code in each file. It
>> walks the instructions in every function and notes the changes to the stack
>> pointer (SP) and the frame pointer (FP). It makes sure that the changes are in
>> accordance with the ABI rules. There are also a lot of other checks that
>> Objtool performs. Once objtool completes successfully, the kernel can then be
>> used for livepatch purposes.
>>
>> Objtool can have uses other than just FP validation. For instance, it can check
>> control flow integrity during its analysis.
>>
>> Problem
>> =======
>>
>> Objtool is complex and highly architecture-dependent. There are a lot of
>> different checks in objtool that all of the code in the kernel must pass
>> before livepatch can be enabled. If a check fails, it must be corrected
>> before we can proceed. Sometimes, the kernel code needs to be fixed.
>> Sometimes, it is a compiler bug that needs to be fixed. The challenge is
>> also to prove that all the work is complete for an architecture.
>>
>> As such, it presents a great challenge to enable livepatch for an
>> architecture.
>
> There's a more fundamental issue here in that objtool has to reverse-engineer
> control flow, and so even if the kernel code and compiled code generation is
> *perfect*, it's possible that objtool won't recognise the structure of the
> generated code, and won't be able to reverse-engineer the correct control flow.
>
> We've seen issues where objtool didn't understand jump tables, so support for
> that got disabled on x86. A key objection from the arm64 side is that we don't
> want to disable compiler code generation strategies like this. Further, as
> compilers evolve, their code generation strategies will change, and it's likely
> there will be other cases that crop up. This is inherently fragile.
>
> The key objection from the arm64 side is that we don't want to
> reverse-engineer details from the binary, as this is complex, fragile, and
> unstable. This is why we've previously suggested that we should work with
> compiler folk to get what we need.
>
So, what exactly do you have in mind? What help can the compiler folk provide?
By your own argument, we cannot rely on the compiler as compiler implementations,
optimization strategies, etc can change in ways that are incompatible with any
livepatch implementation. Also, there can always be bugs in the compiler
implementations.
Can you please elaborate? Are we looking for a way for the compiler folks to
provide us with something that we can use to implement reliable stack trace?
> I'll note that at the last Linux Plumbers Conference, there was a discussion
> about what is now called SFrame, which *might* give us sufficient information,
> but I have not had the time to dig into that as I have been chasing other
> problems and trying to get other infrastructure in place.
>
I will try to locate the link. If you can provide me a link, that would be greatly
appreciated. I will study their SFrame proposal.
>> A different approach
>> ====================
>>
>> I would like to propose a different approach for FP validation. I would
>> like to be able to enable livepatch for an architecture as is. That is,
>> without "fixing" the kernel or the compiler for it:
>>
>> There are three steps in this:
>>
>> 1. Objtool walks all the functions as usual. It computes the stack and
>> frame pointer offsets at each instruction as usual. It generates ORC
>> records and stores them in special sections as usual. This is simple
>> enough to do.
>
> This still requires reverse-engineering the forward-edge control flow in order
> to compute those offsets, so the same objections apply with this approach. I do
> not think this is the right approach.
>
> I would *strongly* prefer that we work with compiler folk to get the
> information that we need.
>
I am willing to do this. But I am not clear on the kind of features we want
from the compiler. Are you suggesting something for getting a reliable
stack trace? Is there any kind of proposal out there that I need to study?
> [...]
>
>> FWIW, I have also compared the CFI I am generating with DWARF
>> information that the compiler generates. The CFIs match a
>> 100% for Clang. In the case of gcc, the comparison fails
>> in 1.7% of the cases. I have analyzed those cases and found
>> the DWARF information generated by gcc is incorrect. The
>> ORC generated by my Objtool is correct.
>
>
> Have you reported this to the GCC folk, and can you give any examples?
> I'm sure they would be interested in fixing this, regardless of whether we end
> up using it.
>
I will try to get the data again and put something together and send it to the
gcc folks.
Thanks for the suggestions.
Madhavan
Sorry for the delay. Someone else also pointed out the same thing. I just have not
had the chance to look at this. I plan to soon.
Thanks.
Madhavan
On 4/3/23 17:26, Dylan Hatch wrote:
>> Then, I noticed that invoke_syscall generates instructions to add random offset
>> in sp when RANDOMIZE_KSTACK_OFFSET=y, which is true in the above case.
>
> I'm also seeing this behavior when compiling with
> RANDOMIZE_KSTACK_OFFSET=y. I wonder if a special hint type
> could/should be added to allow for skipping the reliability check for
> stack frames with this randomized offset? Forgive me if this is a
> naive suggestion.
>
> Thanks,
> Dylan
On Fri, Apr 07, 2023 at 10:40:07PM -0500, Madhavan T. Venkataraman wrote:
> Hi Mark,
>
> Sorry for the long delay in responding. Was caught up in many things.
> My responses inline..
>
> On 3/23/23 12:17, Mark Rutland wrote:
> > Hi Madhavan,
> >
> > At a high-level, I think this still falls afoul of our desire to not reverse
> > engineer control flow from the binary, and so I do not think this is the right
> > approach. I've expanded a bit on that below.
> >
> > I do think it would be nice to have *some* of the objtool changes, as I do
> > think we will want to use objtool for some things in future (e.g. some
> > build-time binary patching such as table sorting).
>
> OK. I have been under the impression that the arm64 folks are basically OK with
> Objtool's approach of reverse engineering from the binary. I did not see
> any specific objections to previously submitted patches based on this approach
> including mine.
This has admittedly changed over time, but the preference to avoid
reverse-engineering control flow has been around for a while. For example,
during LPC 2021's "objtool on arm64" session:
https://lpc.events/event/11/contributions/971/
... where Will and I expressed strong desires to get the compiler to help,
whether that's compiler-generated metadata, (agreed upon) restrictions on code
generation, or something else.
[...]
> > There's a more fundamental issue here in that objtool has to reverse-engineer
> > control flow, and so even if the kernel code and compiled code generation is
> > *perfect*, it's possible that objtool won't recognise the structure of the
> > generated code, and won't be able to reverse-engineer the correct control flow.
> >
> > We've seen issues where objtool didn't understand jump tables, so support for
> > that got disabled on x86. A key objection from the arm64 side is that we don't
> > want to disable compiler code generation strategies like this. Further, as
> > compilers evolve, their code generation strategies will change, and it's likely
> > there will be other cases that crop up. This is inherently fragile.
> >
> > The key objection from the arm64 side is that we don't want to
> > reverse-engineer details from the binary, as this is complex, fragile, and
> > unstable. This is why we've previously suggested that we should work with
> > compiler folk to get what we need.
> >
>
> So, what exactly do you have in mind? What help can the compiler folk provide?
There are several possibilities, e.g.
* Generate some simple metadata that tells us for each PC whether to start an
unwind from the LR or FP. My understanding was that SFrame *might* be
sufficient for this.
We might need some custom metadata for assembly (e.g. exception entry,
trampolines), but it'd be ok for that to be different.
* Agree upon some restricted patterns for code generation (e.g. fixed
prologues/epilogues), so that we can identify whether to use LR or FP based
on the PC and a symbol lookup.
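
As a rough illustration of the first possibility (not taken from any posted patch), per-PC metadata could be as small as a sorted table that an unwinder searches to decide where to start. Everything below is a hypothetical sketch: the type names, the table layout, and the example addresses are assumptions made for illustration and do not reflect SFrame's actual encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: where the unwinder should start from at a given PC. */
enum unwind_start {
	UNWIND_FROM_LR,     /* frame record not set up yet, or already torn down */
	UNWIND_FROM_FP,     /* frame record is valid; follow the frame pointer */
	UNWIND_UNRELIABLE,  /* no metadata covers this PC */
};

/* Hypothetical entry: applies from 'pc' up to the next entry's 'pc'. */
struct unwind_hint {
	uint64_t pc;
	enum unwind_start start;
};

/* Made-up table for one function at 0x1000..0x1040, sorted by pc. */
static const struct unwind_hint example_hints[] = {
	{ 0x1000, UNWIND_FROM_LR },     /* prologue */
	{ 0x1008, UNWIND_FROM_FP },     /* function body */
	{ 0x1038, UNWIND_FROM_LR },     /* epilogue */
	{ 0x1040, UNWIND_UNRELIABLE },  /* end marker: nothing known past here */
};

/* Return the hint of the last entry whose pc is <= the queried pc. */
static enum unwind_start unwind_start_for_pc(const struct unwind_hint *tbl,
					     size_t n, uint64_t pc)
{
	size_t lo = 0, hi = n;

	assert(tbl != NULL || n == 0);

	if (n == 0 || pc < tbl[0].pc)
		return UNWIND_UNRELIABLE;

	/* Binary search; invariant: tbl[lo].pc <= pc, and tbl[hi].pc > pc if hi < n. */
	while (hi - lo > 1) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].pc <= pc)
			lo = mid;
		else
			hi = mid;
	}
	return tbl[lo].start;
}
```

With a table like this, a PC that falls outside any covered range is treated as unreliable rather than guessed at, which is the property a livepatch consumer would care about.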
> By your own argument, we cannot rely on the compiler as compiler implementations,
> optimization strategies, etc can change in ways that are incompatible with any
> livepatch implementation.
That's not quite my argument.
My argument is that if we assume some set of properties that compiler folk
never agreed to (and were never made aware of), then compiler folk are well
within their rights to change the compiler such that it doesn't provide those
properties, and it's very likely that such expectation will be broken. We've
seen that happen before (e.g. with jump tables).
Consequently I think we should be working with compiler folk to agree upon some
solution, where compiler folk will actually try to maintain the properties we
depend upon (and e.g. they could have tests for). That sort of co-design has
worked well so far (e.g. with things like kCFI).
Ideally we'd have people in the same room to have a discussion (e.g. at LPC).
> Also, there can always be bugs in the compiler implementations.
I don't disagree with that.
> Can you please elaborate? Are we looking for a way for the compiler folks to
> provide us with something that we can use to implement reliable stack trace?
I tried to do so a bit above.
I'm looking for some agreement between kernel folk and compiler folk on a
reliable mechanism. That might be something that already exists, or something
new. It might be metadata or some restrictions on code generation.
> > I'll note that at the last Linux Plumbers Conference, there was a discussion
> > about what is now called SFrame, which *might* give us sufficient information,
> > but I have not had the time to dig into that as I have been chasing other
> > problems and trying to get other infrastructure in place.
>
> I will try to locate the link. If you can provide me a link, that would be greatly
> appreciated. I will study their SFrame proposal.
From looking around, that session was:
https://lpc.events/event/16/contributions/1177/
At the time it was called CTF Frame, but got renamed to SFrame.
I'm not sure where to find the most recent documentation. As I mentioned above
I have not had the time to look in detail.
> >> FWIW, I have also compared the CFI I am generating with DWARF
> >> information that the compiler generates. The CFIs match
> >> 100% for Clang. In the case of gcc, the comparison fails
> >> in 1.7% of the cases. I have analyzed those cases and found
> >> the DWARF information generated by gcc is incorrect. The
> >> ORC generated by my Objtool is correct.
> >
> > Have you reported this to the GCC folk, and can you give any examples?
> > I'm sure they would be interested in fixing this, regardless of whether we end
> > up using it.
>
> I will try to get the data again and put something together and send it to the
> gcc folks.
Thanks for doing so; that's much appreciated!
Thanks,
Mark.
On Tue, Apr 11, 2023 at 02:25:11PM +0100, Mark Rutland wrote:
> > By your own argument, we cannot rely on the compiler as compiler implementations,
> > optimization strategies, etc can change in ways that are incompatible with any
> > livepatch implementation.
>
> That's not quite my argument.
>
> My argument is that if we assume some set of properties that compiler folk
> never agreed to (and were never made aware of), then compiler folk are well
> within their rights to change the compiler such that it doesn't provide those
> properties, and it's very likely that such expectation will be broken. We've
> seen that happen before (e.g. with jump tables).
>
> Consequently I think we should be working with compiler folk to agree upon some
> solution, where compiler folk will actually try to maintain the properties we
> depend upon (and e.g. they could have tests for). That sort of co-design has
> worked well so far (e.g. with things like kCFI).
>
> Ideally we'd have people in the same room to have a discussion (e.g. at LPC).
That was the goal of my talk at LPC last year:
https://lpc.events/event/16/contributions/1392/
We discussed having the compiler annotate the tricky bits of control
flow, mainly jump tables and noreturns. It's still on my TODO list to
prototype that.
Another alternative which has been suggested in the past by Indu and
others is for objtool to use DWARF/sframe as an input to help guide it
through the tricky bits.
That seems more fragile -- as Madhavan mentioned, GCC-generated DWARF
has some reliability issues -- and also defeats some of the benefits of
reverse-engineering in the first place (we've found many compiler bugs
and other surprising kernel-compiler interactions over the years).
Objtool's understanding of the control flow graph has been really
valuable for reasons beyond live patching (e.g., noinstr and uaccess
validation), so it's definitely worth finding a way to make that more
sustainable.
--
Josh
On 4/11/23 23:17, Josh Poimboeuf wrote:
> On Tue, Apr 11, 2023 at 02:25:11PM +0100, Mark Rutland wrote:
>>> By your own argument, we cannot rely on the compiler as compiler implementations,
>>> optimization strategies, etc can change in ways that are incompatible with any
>>> livepatch implementation.
>>
>> That's not quite my argument.
>>
>> My argument is that if we assume some set of properties that compiler folk
>> never agreed to (and were never made aware of), then compiler folk are well
>> within their rights to change the compiler such that it doesn't provide those
>> properties, and it's very likely that such expectation will be broken. We've
>> seen that happen before (e.g. with jump tables).
>>
>> Consequently I think we should be working with compiler folk to agree upon some
>> solution, where compiler folk will actually try to maintain the properties we
>> depend upon (and e.g. they could have tests for). That sort of co-design has
>> worked well so far (e.g. with things like kCFI).
>>
>> Ideally we'd have people in the same room to have a discussion (e.g. at LPC).
>
> That was the goal of my talk at LPC last year:
>
> https://lpc.events/event/16/contributions/1392/
>
> We discussed having the compiler annotate the tricky bits of control
> flow, mainly jump tables and noreturns. It's still on my TODO list to
> prototype that.
>
> Another alternative which has been suggested in the past by Indu and
> others is for objtool to use DWARF/sframe as an input to help guide it
> through the tricky bits.
>
I read through the SFrame spec file briefly. It looks like I can easily adapt my
version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
folks agree to properly support and maintain SFrame, then I could send the next version
of the patchset based on SFrame.
But I kinda need a clear path forward before I implement anything. I request the arm64
folks to comment on the above approach. Would it be useful to initiate an email discussion
with the compiler folks on what they plan to do to support SFrame? Or, should this all
happen face to face in some forum like LPC?
Madhavan
> That seems more fragile -- as Madhavan mentioned, GCC-generated DWARF
> has some reliability issues -- and also defeats some of the benefits of
> reverse-engineering in the first place (we've found many compiler bugs
> and other surprising kernel-compiler interactions over the years).
>
> Objtool's understanding of the control flow graph has been really
> valuable for reasons beyond live patching (e.g., noinstr and uaccess
> validation), it's definitely worth finding a way to make that more
> sustainable.
>
On 4/11/23 23:48, Madhavan T. Venkataraman wrote:
>
>
> On 4/11/23 23:17, Josh Poimboeuf wrote:
>> On Tue, Apr 11, 2023 at 02:25:11PM +0100, Mark Rutland wrote:
>>>> By your own argument, we cannot rely on the compiler as compiler implementations,
>>>> optimization strategies, etc can change in ways that are incompatible with any
>>>> livepatch implementation.
>>>
>>> That's not quite my argument.
>>>
>>> My argument is that if we assume some set of properties that compiler folk
>>> never agreed to (and were never made aware of), then compiler folk are well
>>> within their rights to change the compiler such that it doesn't provide those
>>> properties, and it's very likely that such expectation will be broken. We've
>>> seen that happen before (e.g. with jump tables).
>>>
>>> Consequently I think we should be working with compiler folk to agree upon some
>>> solution, where compiler folk will actually try to maintain the properties we
>>> depend upon (and e.g. they could have tests for). That sort of co-design has
>>> worked well so far (e.g. with things like kCFI).
>>>
>>> Ideally we'd have people in the same room to have a discussion (e.g. at LPC).
>>
>> That was the goal of my talk at LPC last year:
>>
>> https://lpc.events/event/16/contributions/1392/
>>
>> We discussed having the compiler annotate the tricky bits of control
>> flow, mainly jump tables and noreturns. It's still on my TODO list to
>> prototype that.
>>
>> Another alternative which has been suggested in the past by Indu and
>> others is for objtool to use DWARF/sframe as an input to help guide it
>> through the tricky bits.
>>
>
> I read through the SFrame spec file briefly. It looks like I can easily adapt my
> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
> folks agree to properly support and maintain SFrame, then I could send the next version
> of the patchset based on SFrame.
>
> But I kinda need a clear path forward before I implement anything. I request the arm64
> folks to comment on the above approach. Would it be useful to initiate an email discussion
> with the compiler folks on what they plan to do to support SFrame? Or, should this all
> happen face to face in some forum like LPC?
>
> Madhavan
>
Just to be clear. This is not to replace Objtool as it has other uses as well, not just
reliable stack trace. I am trying to solve the reliable stack trace issue alone with
SFrame.
Madhavan
>> That seems more fragile -- as Madhavan mentioned, GCC-generated DWARF
>> has some reliability issues -- and also defeats some of the benefits of
>> reverse-engineering in the first place (we've found many compiler bugs
>> and other surprising kernel-compiler interactions over the years).
>>
>> Objtool's understanding of the control flow graph has been really
>> valuable for reasons beyond live patching (e.g., noinstr and uaccess
>> validation), it's definitely worth finding a way to make that more
>> sustainable.
>>
On Tue, Apr 11, 2023 at 11:48:21PM -0500, Madhavan T. Venkataraman wrote:
>
>
> On 4/11/23 23:17, Josh Poimboeuf wrote:
> > On Tue, Apr 11, 2023 at 02:25:11PM +0100, Mark Rutland wrote:
> >>> By your own argument, we cannot rely on the compiler as compiler implementations,
> >>> optimization strategies, etc can change in ways that are incompatible with any
> >>> livepatch implementation.
> >>
> >> That's not quite my argument.
> >>
> >> My argument is that if we assume some set of properties that compiler folk
> >> never agreed to (and were never made aware of), then compiler folk are well
> >> within their rights to change the compiler such that it doesn't provide those
> >> properties, and it's very likely that such expectation will be broken. We've
> >> seen that happen before (e.g. with jump tables).
> >>
> >> Consequently I think we should be working with compiler folk to agree upon some
> >> solution, where compiler folk will actually try to maintain the properties we
> >> depend upon (and e.g. they could have tests for). That sort of co-design has
> >> worked well so far (e.g. with things like kCFI).
> >>
> >> Ideally we'd have people in the same room to have a discussion (e.g. at LPC).
> >
> > That was the goal of my talk at LPC last year:
> >
> > https://lpc.events/event/16/contributions/1392/
> >
> > We discussed having the compiler annotate the tricky bits of control
> > flow, mainly jump tables and noreturns. It's still on my TODO list to
> > prototype that.
> >
> > Another alternative which has been suggested in the past by Indu and
> > others is for objtool to use DWARF/sframe as an input to help guide it
> > through the tricky bits.
> >
>
> I read through the SFrame spec file briefly. It looks like I can easily adapt my
> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
> folks agree to properly support and maintain SFrame, then I could send the next version
> of the patchset based on SFrame.
>
> But I kinda need a clear path forward before I implement anything. I request the arm64
> folks to comment on the above approach. Would it be useful to initiate an email discussion
> with the compiler folks on what they plan to do to support SFrame? Or, should this all
> happen face to face in some forum like LPC?
SFrame is basically a simplified version of DWARF unwind. Using it as an
input to objtool is going to have the same issues I mentioned below (and
as was discussed with your v1).
> > That seems more fragile -- as Madhavan mentioned, GCC-generated DWARF
> > has some reliability issues -- and also defeats some of the benefits of
> > reverse-engineering in the first place (we've found many compiler bugs
> > and other surprising kernel-compiler interactions over the years).
> >
> > Objtool's understanding of the control flow graph has been really
> > valuable for reasons beyond live patching (e.g., noinstr and uaccess
> > validation), it's definitely worth finding a way to make that more
> > sustainable.
--
Josh
On 4/12/23 00:01, Josh Poimboeuf wrote:
> On Tue, Apr 11, 2023 at 11:48:21PM -0500, Madhavan T. Venkataraman wrote:
>>
>>
>> On 4/11/23 23:17, Josh Poimboeuf wrote:
>>> On Tue, Apr 11, 2023 at 02:25:11PM +0100, Mark Rutland wrote:
>>>>> By your own argument, we cannot rely on the compiler as compiler implementations,
>>>>> optimization strategies, etc can change in ways that are incompatible with any
>>>>> livepatch implementation.
>>>>
>>>> That's not quite my argument.
>>>>
>>>> My argument is that if we assume some set of properties that compiler folk
>>>> never agreed to (and were never made aware of), then compiler folk are well
>>>> within their rights to change the compiler such that it doesn't provide those
>>>> properties, and it's very likely that such expectation will be broken. We've
>>>> seen that happen before (e.g. with jump tables).
>>>>
>>>> Consequently I think we should be working with compiler folk to agree upon some
>>>> solution, where compiler folk will actually try to maintain the properties we
>>>> depend upon (and e.g. they could have tests for). That sort of co-design has
>>>> worked well so far (e.g. with things like kCFI).
>>>>
>>>> Ideally we'd have people in the same room to have a discussion (e.g. at LPC).
>>>
>>> That was the goal of my talk at LPC last year:
>>>
>>> https://lpc.events/event/16/contributions/1392/
>>>
>>> We discussed having the compiler annotate the tricky bits of control
>>> flow, mainly jump tables and noreturns. It's still on my TODO list to
>>> prototype that.
>>>
>>> Another alternative which has been suggested in the past by Indu and
>>> others is for objtool to use DWARF/sframe as an input to help guide it
>>> through the tricky bits.
>>>
>>
>> I read through the SFrame spec file briefly. It looks like I can easily adapt my
>> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
>> folks agree to properly support and maintain SFrame, then I could send the next version
>> of the patchset based on SFrame.
>>
>> But I kinda need a clear path forward before I implement anything. I request the arm64
>> folks to comment on the above approach. Would it be useful to initiate an email discussion
>> with the compiler folks on what they plan to do to support SFrame? Or, should this all
>> happen face to face in some forum like LPC?
>
> SFrame is basically a simplified version of DWARF unwind, using it as an
> input to objtool is going to have the same issues I mentioned below (and
> as was discussed with your v1).
>
Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
can provide the feature with a reliability guarantee. DWARF is too complex.
Madhavan
>>> That seems more fragile -- as Madhavan mentioned, GCC-generated DWARF
>>> has some reliability issues -- and also defeats some of the benefits of
>>> reverse-engineering in the first place (we've found many compiler bugs
>>> and other surprising kernel-compiler interactions over the years).
>>>
>>> Objtool's understanding of the control flow graph has been really
>>> valuable for reasons beyond live patching (e.g., noinstr and uaccess
>>> validation), it's definitely worth finding a way to make that more
>>> sustainable.
>
On Wed, Apr 12, 2023 at 09:50:23AM -0500, Madhavan T. Venkataraman wrote:
> >> I read through the SFrame spec file briefly. It looks like I can easily adapt my
> >> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
> >> folks agree to properly support and maintain SFrame, then I could send the next version
> >> of the patchset based on SFrame.
> >>
> >> But I kinda need a clear path forward before I implement anything. I request the arm64
> >> folks to comment on the above approach. Would it be useful to initiate an email discussion
> >> with the compiler folks on what they plan to do to support SFrame? Or, should this all
> >> happen face to face in some forum like LPC?
> >
> > SFrame is basically a simplified version of DWARF unwind, using it as an
> > input to objtool is going to have the same issues I mentioned below (and
> > as was discussed with your v1).
> >
>
> Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
> can provide the feature with a reliability guarantee. DWARF is too complex.
I don't see what the complexity (or lack thereof) of the unwinding data
format has to do with it. The unreliability comes from the underlying
data source, not the formatting of the data.
--
Josh
On 4/12/23 10:52, Josh Poimboeuf wrote:
> On Wed, Apr 12, 2023 at 09:50:23AM -0500, Madhavan T. Venkataraman wrote:
>>>> I read through the SFrame spec file briefly. It looks like I can easily adapt my
>>>> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
>>>> folks agree to properly support and maintain SFrame, then I could send the next version
>>>> of the patchset based on SFrame.
>>>>
>>>> But I kinda need a clear path forward before I implement anything. I request the arm64
>>>> folks to comment on the above approach. Would it be useful to initiate an email discussion
>>>> with the compiler folks on what they plan to do to support SFrame? Or, should this all
>>>> happen face to face in some forum like LPC?
>>>
>>> SFrame is basically a simplified version of DWARF unwind, using it as an
>>> input to objtool is going to have the same issues I mentioned below (and
>>> as was discussed with your v1).
>>>
>>
>> Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
>> can provide the feature with a reliability guarantee. DWARF is too complex.
>
> I don't see what the complexity (or lack thereof) of the unwinding data
> format has to do with it. The unreliability comes from the underlying
> data source, not the formatting of the data.
>
What I meant is - if SFrame is implemented by simply extracting unwind info from
DWARF data and placing it in a separate section (as it is probably implemented now),
then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
then they either have to make DWARF reliable or they have to implement SFrame as a
separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
The latter is a lot easier to do.
Sorry if that was not clear.
Madhavan
On Thu, Apr 13, 2023 at 09:59:31AM -0500, Madhavan T. Venkataraman wrote:
> On 4/12/23 10:52, Josh Poimboeuf wrote:
> > On Wed, Apr 12, 2023 at 09:50:23AM -0500, Madhavan T. Venkataraman wrote:
> >>>> I read through the SFrame spec file briefly. It looks like I can easily adapt my
> >>>> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
> >>>> folks agree to properly support and maintain SFrame, then I could send the next version
> >>>> of the patchset based on SFrame.
> >>>>
> >>>> But I kinda need a clear path forward before I implement anything. I request the arm64
> >>>> folks to comment on the above approach. Would it be useful to initiate an email discussion
> >>>> with the compiler folks on what they plan to do to support SFrame? Or, should this all
> >>>> happen face to face in some forum like LPC?
> >>>
> >>> SFrame is basically a simplified version of DWARF unwind, using it as an
> >>> input to objtool is going to have the same issues I mentioned below (and
> >>> as was discussed with your v1).
> >>>
> >>
> >> Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
> >> can provide the feature with a reliability guarantee. DWARF is too complex.
> >
> > I don't see what the complexity (or lack thereof) of the unwinding data
> > format has to do with it. The unreliability comes from the underlying
> > data source, not the formatting of the data.
> >
>
> What I meant is - if SFrame is implemented by simply extracting unwind info from
> DWARF data and placing it in a separate section (as it is probably implemented now),
> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
> then either they have to make DWARF reliable. Or, they have to implement SFrame as a
> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
> The latter is a lot easier to do.
[ adding linux-toolchains ]
I don't think ensuring reliability is an easy task, regardless of the
complexity of the unwinding format.
Whether it's SFrame or DWARF/eh_frame, the question would be how to
ensure it's always reliable for a compiler "power user" like the kernel
which has many edge cases (including lots of inline asm which the
compiler has no visibility to) and which uses unwinding for more than
just debugging.
It would need some kind of black-box testing on a complex code base.
(hint: kind of like what objtool already does today)
--
Josh
On Thu, Mar 23, 2023 at 05:17:14PM +0000, Mark Rutland wrote:
> Hi Madhavan,
>
> At a high-level, I think this still falls afoul of our desire to not reverse
> engineer control flow from the binary, and so I do not think this is the right
> approach. I've expanded a bit on that below.
>
> I do think it would be nice to have *some* of the objtool changes, as I do
> think we will want to use objtool for some things in future (e.g. some
> build-time binary patching such as table sorting).
>
> > Problem
> > =======
> >
> > Objtool is complex and highly architecture-dependent. There are a lot of
> > different checks in objtool that all of the code in the kernel must pass
> > before livepatch can be enabled. If a check fails, it must be corrected
> > before we can proceed. Sometimes, the kernel code needs to be fixed.
> > Sometimes, it is a compiler bug that needs to be fixed. The challenge is
> > also to prove that all the work is complete for an architecture.
> >
> > As such, it presents a great challenge to enable livepatch for an
> > architecture.
>
> There's a more fundamental issue here in that objtool has to reverse-engineer
> control flow, and so even if the kernel code and compiled code generation is
> *perfect*, it's possible that objtool won't recognise the structure of the
> generated code, and won't be able to reverse-engineer the correct control flow.
>
> We've seen issues where objtool didn't understand jump tables, so support for
> that got disabled on x86. A key objection from the arm64 side is that we don't
> want to disable compiler code generation strategies like this. Further, as
> compilers evolve, their code generation strategies will change, and it's likely
> there will be other cases that crop up. This is inherently fragile.
>
> The key objection from the arm64 side is that we don't want to
> reverse-engineer details from the binary, as this is complex, fragile, and
> unstable. This is why we've previously suggested that we should work with
> compiler folk to get what we need.
> This still requires reverse-engineering the forward-edge control flow in order
> to compute those offsets, so the same objections apply with this approach. I do
> not think this is the right approach.
>
> I would *strongly* prefer that we work with compiler folk to get the
> information that we need.
IDK if it's relevant here, but I did see a commit go by to LLVM that
seemed to include such info in a custom ELF section (for the purposes of
improving fuzzing, IIUC). Maybe such an encoding scheme could be tested
to see if it's reliable or usable?
- https://github.com/llvm/llvm-project/commit/3e52c0926c22575d918e7ca8369522b986635cd3
- https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-control-flow
>
> [...]
>
> > FWIW, I have also compared the CFI I am generating with DWARF
> > information that the compiler generates. The CFIs match
> > 100% for Clang. In the case of gcc, the comparison fails
> > in 1.7% of the cases. I have analyzed those cases and found
> > the DWARF information generated by gcc is incorrect. The
> > ORC generated by my Objtool is correct.
>
>
> Have you reported this to the GCC folk, and can you give any examples?
> I'm sure they would be interested in fixing this, regardless of whether we end
> up using it.
Yeah, at least a bug report is good. "See something, say something."
> On Thu, Mar 23, 2023 at 05:17:14PM +0000, Mark Rutland wrote:
>> Hi Madhavan,
>>
>> At a high-level, I think this still falls afoul of our desire to not reverse
>> engineer control flow from the binary, and so I do not think this is the right
>> approach. I've expanded a bit on that below.
>>
>> I do think it would be nice to have *some* of the objtool changes, as I do
>> think we will want to use objtool for some things in future (e.g. some
>> build-time binary patching such as table sorting).
>>
>> > Problem
>> > =======
>> >
>> > Objtool is complex and highly architecture-dependent. There are a lot of
>> > different checks in objtool that all of the code in the kernel must pass
>> > before livepatch can be enabled. If a check fails, it must be corrected
>> > before we can proceed. Sometimes, the kernel code needs to be fixed.
>> > Sometimes, it is a compiler bug that needs to be fixed. The challenge is
>> > also to prove that all the work is complete for an architecture.
>> >
>> > As such, it presents a great challenge to enable livepatch for an
>> > architecture.
>>
>> There's a more fundamental issue here in that objtool has to reverse-engineer
>> control flow, and so even if the kernel code and compiled code generation is
>> *perfect*, it's possible that objtool won't recognise the structure of the
>> generated code, and won't be able to reverse-engineer the correct control flow.
>>
>> We've seen issues where objtool didn't understand jump tables, so support for
>> that got disabled on x86. A key objection from the arm64 side is that we don't
>> want to disable compiler code generation strategies like this. Further, as
>> compilers evolve, their code generation strategies will change, and it's likely
>> there will be other cases that crop up. This is inherently fragile.
>>
>> The key objection from the arm64 side is that we don't want to
>> reverse-engineer details from the binary, as this is complex, fragile, and
>> unstable. This is why we've previously suggested that we should work with
>> compiler folk to get what we need.
>
>> This still requires reverse-engineering the forward-edge control flow in order
>> to compute those offsets, so the same objections apply with this approach. I do
>> not think this is the right approach.
>>
>> I would *strongly* prefer that we work with compiler folk to get the
>> information that we need.
>
> IDK if it's relevant here, but I did see a commit go by to LLVM that
> seemed to include such info in a custom ELF section (for the purposes of
> improving fuzzing, IIUC). Maybe such an encoding scheme could be tested
> to see if it's reliable or usable?
> - https://github.com/llvm/llvm-project/commit/3e52c0926c22575d918e7ca8369522b986635cd3
> - https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-control-flow
>
>>
>> [...]
>>
>> > FWIW, I have also compared the CFI I am generating with DWARF
>> > information that the compiler generates. The CFIs match
>> > 100% for Clang. In the case of gcc, the comparison fails
>> > in 1.7% of the cases. I have analyzed those cases and found
>> > the DWARF information generated by gcc is incorrect. The
>> > ORC generated by my Objtool is correct.
>>
>>
>> Have you reported this to the GCC folk, and can you give any examples?
>> I'm sure they would be interested in fixing this, regardless of whether we end
>> up using it.
>
> Yeah, at least a bug report is good. "See something, say something."
By all means, please. If you guys report these issues on CFI
divergences in the GCC bugzilla, we will look into fixing them.
https://gcc.gnu.org/bugzilla
On 4/13/23 13:15, Jose E. Marchesi wrote:
>
>> On Thu, Mar 23, 2023 at 05:17:14PM +0000, Mark Rutland wrote:
>>> Hi Madhavan,
>>>
>>> At a high-level, I think this still falls afoul of our desire to not reverse
>>> engineer control flow from the binary, and so I do not think this is the right
>>> approach. I've expanded a bit on that below.
>>>
>>> I do think it would be nice to have *some* of the objtool changes, as I do
>>> think we will want to use objtool for some things in future (e.g. some
>>> build-time binary patching such as table sorting).
>>>
>>>> Problem
>>>> =======
>>>>
>>>> Objtool is complex and highly architecture-dependent. There are a lot of
>>>> different checks in objtool that all of the code in the kernel must pass
>>>> before livepatch can be enabled. If a check fails, it must be corrected
>>>> before we can proceed. Sometimes, the kernel code needs to be fixed.
>>>> Sometimes, it is a compiler bug that needs to be fixed. The challenge is
>>>> also to prove that all the work is complete for an architecture.
>>>>
>>>> As such, it presents a great challenge to enable livepatch for an
>>>> architecture.
>>>
>>> There's a more fundamental issue here in that objtool has to reverse-engineer
>>> control flow, and so even if the kernel code and compiled code generation is
>>> *perfect*, it's possible that objtool won't recognise the structure of the
>>> generated code, and won't be able to reverse-engineer the correct control flow.
>>>
>>> We've seen issues where objtool didn't understand jump tables, so support for
>>> that got disabled on x86. A key objection from the arm64 side is that we don't
>>> want to disable compiler code generation strategies like this. Further, as
>>> compilers evolve, their code generation strategies will change, and it's likely
>>> there will be other cases that crop up. This is inherently fragile.
>>>
>>> The key objection from the arm64 side is that we don't want to
>>> reverse-engineer details from the binary, as this is complex, fragile, and
>>> unstable. This is why we've previously suggested that we should work with
>>> compiler folk to get what we need.
>>
>>> This still requires reverse-engineering the forward-edge control flow in order
>>> to compute those offsets, so the same objections apply with this approach. I do
>>> not think this is the right approach.
>>>
>>> I would *strongly* prefer that we work with compiler folk to get the
>>> information that we need.
>>
>> IDK if it's relevant here, but I did see a commit go by to LLVM that
>> seemed to include such info in a custom ELF section (for the purposes of
>> improving fuzzing, IIUC). Maybe such an encoding scheme could be tested
>> to see if it's reliable or usable?
>> - https://github.com/llvm/llvm-project/commit/3e52c0926c22575d918e7ca8369522b986635cd3
>> - https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-control-flow
>>
>>>
>>> [...]
>>>
>>>> FWIW, I have also compared the CFI I am generating with DWARF
>>>> information that the compiler generates. The CFIs match
>>>> 100% for Clang. In the case of gcc, the comparison fails
>>>> in 1.7% of the cases. I have analyzed those cases and found
>>>> the DWARF information generated by gcc is incorrect. The
>>>> ORC generated by my Objtool is correct.
>>>
>>>
>>> Have you reported this to the GCC folk, and can you give any examples?
>>> I'm sure they would be interested in fixing this, regardless of whether we end
>>> up using it.
>>
>> Yeah, at least a bug report is good. "See something, say something."
>
> By all means, please. If you guys report these issues on CFI
> divergences in the GCC bugzilla, we will look into fixing them.
>
> https://gcc.gnu.org/bugzilla
I will try to get the data again and report the problems that I see.
Thanks.
Madhavan
On 4/13/23 11:30, Josh Poimboeuf wrote:
> On Thu, Apr 13, 2023 at 09:59:31AM -0500, Madhavan T. Venkataraman wrote:
>> On 4/12/23 10:52, Josh Poimboeuf wrote:
>>> On Wed, Apr 12, 2023 at 09:50:23AM -0500, Madhavan T. Venkataraman wrote:
>>>>>> I read through the SFrame spec file briefly. It looks like I can easily adapt my
>>>>>> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
>>>>>> folks agree to properly support and maintain SFrame, then I could send the next version
>>>>>> of the patchset based on SFrame.
>>>>>>
>>>>>> But I kinda need a clear path forward before I implement anything. I request the arm64
>>>>>> folks to comment on the above approach. Would it be useful to initiate an email discussion
>>>>>> with the compiler folks on what they plan to do to support SFrame? Or, should this all
>>>>>> happen face to face in some forum like LPC?
>>>>>
>>>>> SFrame is basically a simplified version of DWARF unwind, using it as an
>>>>> input to objtool is going to have the same issues I mentioned below (and
>>>>> as was discussed with your v1).
>>>>>
>>>>
>>>> Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
>>>> can provide the feature with a reliability guarantee. DWARF is too complex.
>>>
>>> I don't see what the complexity (or lack thereof) of the unwinding data
>>> format has to do with it. The unreliability comes from the underlying
>>> data source, not the formatting of the data.
>>>
>>
>> What I meant is - if SFrame is implemented by simply extracting unwind info from
>> DWARF data and placing it in a separate section (as it is probably implemented now),
>> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
>> then either they have to make DWARF reliable. Or, they have to implement SFrame as a
>> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
>> The latter is a lot easier to do.
>
> [ adding linux-toolchains ]
>
> I don't think ensuring reliability is an easy task, regardless of the
> complexity of the unwinding format.
>
> Whether it's SFrame or DWARF/eh_frame, the question would be how to
> ensure it's always reliable for a compiler "power user" like the kernel
> which has many edge cases (including lots of inline asm which the
> compiler has no visibility to) and which uses unwinding for more than
> just debugging.
>
> It would need some kind of black-box testing on a complex code base.
> (hint: kind of like what objtool already does today)
>
I could cross-check the ORC data that I generate with the decoder against the SFrame data.
A function is reliable only if both data sources agree for the whole function.
Also, in my approach, the actual frame pointer is dynamically checked against the
frame pointer computed from the unwind data. Any mismatch indicates an unreliable stack trace.
IMHO, this is sufficient to provide livepatch. Do you agree?
Madhavan
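[ Editorial sketch, not part of the original mail. ] The build-time cross-check proposed above ("a function is reliable only if both data sources agree for the whole function") could look roughly like the following. This is a minimal illustration only: `struct unwind_row`, `func_unwind_agrees()` and the row layout are hypothetical and do not correspond to the real ORC or SFrame encodings. It assumes both tables have already been decoded into rows sorted by offset, where each row applies until the next row starts.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, format-neutral unwind row (NOT the ORC or SFrame layout). */
struct unwind_row {
	unsigned long pc_off;	/* offset from function start where the row begins */
	int cfa_off;		/* CFA = SP + cfa_off */
	int fp_off;		/* saved FP lives at CFA + fp_off */
};

static bool same_state(const struct unwind_row *a, const struct unwind_row *b)
{
	return a->cfa_off == b->cfa_off && a->fp_off == b->fp_off;
}

/*
 * Return true only if tables a and b describe the same unwind state at every
 * offset in [0, func_len). The tables may split the function at different
 * offsets, so they are walked in lockstep, interval by interval.
 */
bool func_unwind_agrees(const struct unwind_row *a, size_t na,
			const struct unwind_row *b, size_t nb,
			unsigned long func_len)
{
	size_t i = 0, j = 0;

	/* Both sources must cover the function from its first instruction. */
	if (!na || !nb || a[0].pc_off != 0 || b[0].pc_off != 0)
		return false;

	for (;;) {
		unsigned long next_a, next_b;

		if (!same_state(&a[i], &b[j]))
			return false;

		/* Where does each table's current row stop applying? */
		next_a = (i + 1 < na) ? a[i + 1].pc_off : func_len;
		next_b = (j + 1 < nb) ? b[j + 1].pc_off : func_len;

		/* Rows starting past the end of the function are malformed. */
		if (next_a > func_len || next_b > func_len)
			return false;

		if (next_a == func_len && next_b == func_len)
			return true;

		/* Advance whichever table changes state first (or both). */
		if (next_a <= next_b)
			i++;
		if (next_b <= next_a)
			j++;
	}
}
```

In this sketch, a function for which `func_unwind_agrees()` returns false would simply be marked unreliable for livepatch purposes rather than failing the build.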
On Fri, Apr 14, 2023 at 11:27:44PM -0500, Madhavan T. Venkataraman wrote:
> >> What I meant is - if SFrame is implemented by simply extracting unwind info from
> >> DWARF data and placing it in a separate section (as it is probably implemented now),
> >> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
> >> then either they have to make DWARF reliable. Or, they have to implement SFrame as a
> >> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
> >> The latter is a lot easier to do.
> >
> > [ adding linux-toolchains ]
> >
> > I don't think ensuring reliability is an easy task, regardless of the
> > complexity of the unwinding format.
> >
> > Whether it's SFrame or DWARF/eh_frame, the question would be how to
> > ensure it's always reliable for a compiler "power user" like the kernel
> > which has many edge cases (including lots of inline asm which the
> > compiler has no visibility to) and which uses unwinding for more than
> > just debugging.
> >
> > It would need some kind of black-box testing on a complex code base.
> > (hint: kind of like what objtool already does today)
> >
>
> I could cross-check the ORC data that I generate with the decoder against the SFrame data.
> A function is reliable only if both data sources agree for the whole function.
This is somewhat similar to what I'm saying in another thread:
https://lore.kernel.org/live-patching/20230415043949.7y4tvshe26zday3e@treble/
If objtool and DWARF/SFrame agree, all is well.
> Also, in my approach, the actual frame pointer is dynamically checked against the
> frame pointer computed from the unwind data. Any mismatch indicates an unreliable stack trace.
>
> IMHO, this is sufficient to provide livepatch. Do you agree?
The dynamic reliable stacktrace checks for CONFIG_FRAME_POINTER on x86
are much simpler, as they don't require ORC or any other metadata. They
just need to detect preemption and page faults on the stack, and to
identify the end of the stack. Those simple dynamic checks, combined
with objtool's build-time frame pointer validation, worked very well
until we switched to ORC.
So I'm not sure I see the benefit of the additional complexity involved
in cross-checking frame pointers with ORC at runtime. But I'm just a
bystander. What really matters is what the arm64 folks think ;-)
--
Josh
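[ Editorial sketch, not part of the original mail. ] The "simple dynamic checks" described above (finding the end of the stack and refusing to trust a trace that crosses an exception or preemption boundary) can be illustrated with a small frame-pointer walker. This is not the x86 kernel implementation: `struct frame_record`, `struct walk_ctx` and `walk_is_reliable()` are hypothetical names, and the detection of an exception boundary is reduced to an assumed return-address range check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Frame record as pushed by a frame-pointer prologue. */
struct frame_record {
	struct frame_record *fp;	/* caller's frame record */
	uintptr_t lr;			/* return address */
};

/* Hypothetical per-task context used by the walker. */
struct walk_ctx {
	uintptr_t stack_lo, stack_hi;		/* valid stack range [lo, hi) */
	const struct frame_record *final;	/* known last record on the stack */
	uintptr_t exc_lo, exc_hi;		/* assumed text range of exception entry code */
};

/*
 * Walk the frame-pointer chain and report whether the trace is reliable.
 * The walk succeeds only by reaching the known final record; any other
 * way of stopping is treated as unreliable.
 */
bool walk_is_reliable(const struct frame_record *fp, const struct walk_ctx *c)
{
	while (fp != c->final) {
		uintptr_t cur = (uintptr_t)fp;

		/* The record must be aligned and lie fully inside the stack. */
		if (cur & (sizeof(uintptr_t) - 1))
			return false;
		if (cur < c->stack_lo || cur >= c->stack_hi ||
		    c->stack_hi - cur < sizeof(*fp))
			return false;

		/* A return into exception entry code marks an interrupted frame. */
		if (fp->lr >= c->exc_lo && fp->lr < c->exc_hi)
			return false;

		/* Callers live at higher addresses; this also prevents loops. */
		if ((uintptr_t)fp->fp <= cur)
			return false;

		fp = fp->fp;
	}
	return true;
}
```

Note that these checks need no unwind metadata at run time; they rely on the frame pointers themselves having been validated at build time.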
On 4/15/23 00:05, Josh Poimboeuf wrote:
> On Fri, Apr 14, 2023 at 11:27:44PM -0500, Madhavan T. Venkataraman wrote:
>>>> What I meant is - if SFrame is implemented by simply extracting unwind info from
>>>> DWARF data and placing it in a separate section (as it is probably implemented now),
>>>> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
>>>> then either they have to make DWARF reliable. Or, they have to implement SFrame as a
>>>> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
>>>> The latter is a lot easier to do.
>>>
>>> [ adding linux-toolchains ]
>>>
>>> I don't think ensuring reliability is an easy task, regardless of the
>>> complexity of the unwinding format.
>>>
>>> Whether it's SFrame or DWARF/eh_frame, the question would be how to
>>> ensure it's always reliable for a compiler "power user" like the kernel
>>> which has many edge cases (including lots of inline asm which the
>>> compiler has no visibility to) and which uses unwinding for more than
>>> just debugging.
>>>
>>> It would need some kind of black-box testing on a complex code base.
>>> (hint: kind of like what objtool already does today)
>>>
>>
>> I could cross-check the ORC data that I generate with the decoder against the SFrame data.
>> A function is reliable only if both data sources agree for the whole function.
>
> This is somewhat similar to what I'm saying in another thread:
>
> https://lore.kernel.org/live-patching/20230415043949.7y4tvshe26zday3e@treble/
>
> If objtool and DWARF/SFrame agree, all is well.
>
>> Also, in my approach, the actual frame pointer is dynamically checked against the
>> frame pointer computed from the unwind data. Any mismatch indicates an unreliable stack trace.
>>
>> IMHO, this is sufficient to provide livepatch. Do you agree?
>
> The dynamic reliable stacktrace checks for CONFIG_FRAME_POINTER on x86
> are much simpler, as they don't require ORC or any other metadata. They
> just need to detect preemption and page faults on the stack, and to
> identify the end of the stack. Those simple dynamic checks, combined
> with objtool's build-time frame pointer validation, worked very well
> until we switched to ORC.
>
> So I'm not sure I see the benefit of the additional complexity involved
> in cross-checking frame pointers with ORC at runtime. But I'm just a
> bystander. What really matters is what the arm64 folks think ;-)
>
The unwinder on arm64 is frame-pointer based. I don't want to deviate from that.
I just want to use the metadata to validate the frame pointer. This approach
also catches the rare cases of frame pointer corruption and any bugs in
SFrame that the metadata check did not catch.
Of course, this is all moot if the arm64 folks do not even want the reverse engineering.
I guess we wait until the microconference to discuss all this.
Madhavan
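[ Editorial sketch, not part of the original mail. ] The runtime validation described above (keep the frame-pointer unwinder, but check each actual frame pointer against the value computed from the metadata) could be sketched as follows. The names `struct fp_rule`, `find_rule()` and `fp_matches_metadata()` are hypothetical, and the rule layout is an assumption made for illustration; it is not the ORC or SFrame format.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical unwind rule, valid from pc until the next rule's pc. */
struct fp_rule {
	uintptr_t pc;	/* first address the rule covers */
	long cfa_off;	/* CFA = SP + cfa_off */
	long fp_off;	/* frame record expected at CFA + fp_off */
};

/* Binary search: last rule whose start is <= pc, or NULL if none. */
static const struct fp_rule *find_rule(const struct fp_rule *tbl, size_t n,
				       uintptr_t pc)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].pc <= pc)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo ? &tbl[lo - 1] : NULL;
}

/*
 * Compare the actual frame pointer with the one implied by the metadata.
 * A missing rule or a mismatch means the stack trace must be treated as
 * unreliable; the FP is still what the unwinder follows.
 */
bool fp_matches_metadata(const struct fp_rule *tbl, size_t n,
			 uintptr_t pc, uintptr_t sp, uintptr_t fp)
{
	const struct fp_rule *r = find_rule(tbl, n, pc);

	if (!r)
		return false;

	/* Unsigned wrap-around makes negative offsets work as expected. */
	return sp + (uintptr_t)r->cfa_off + (uintptr_t)r->fp_off == fp;
}
```

With a rule of `cfa_off = 32, fp_off = -32` (a 32-byte frame whose frame record sits at the stack pointer), the check passes only when the actual FP equals SP.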
On 4/13/23 9:30 AM, Josh Poimboeuf wrote:
> On Thu, Apr 13, 2023 at 09:59:31AM -0500, Madhavan T. Venkataraman wrote:
>> On 4/12/23 10:52, Josh Poimboeuf wrote:
>>> On Wed, Apr 12, 2023 at 09:50:23AM -0500, Madhavan T. Venkataraman wrote:
>>>>>> I read through the SFrame spec file briefly. It looks like I can easily adapt my
>>>>>> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
>>>>>> folks agree to properly support and maintain SFrame, then I could send the next version
>>>>>> of the patchset based on SFrame.
>>>>>>
>>>>>> But I kinda need a clear path forward before I implement anything. I request the arm64
>>>>>> folks to comment on the above approach. Would it be useful to initiate an email discussion
>>>>>> with the compiler folks on what they plan to do to support SFrame? Or, should this all
>>>>>> happen face to face in some forum like LPC?
>>>>>
>>>>> SFrame is basically a simplified version of DWARF unwind; using it as an
>>>>> input to objtool is going to have the same issues I mentioned below (and
>>>>> as was discussed with your v1).
>>>>>
>>>>
>>>> Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
>>>> can provide the feature with a reliability guarantee. DWARF is too complex.
>>>
>>> I don't see what the complexity (or lack thereof) of the unwinding data
>>> format has to do with it. The unreliability comes from the underlying
>>> data source, not the formatting of the data.
>>>
>>
>> What I meant is - if SFrame is implemented by simply extracting unwind info from
>> DWARF data and placing it in a separate section (as it is probably implemented now),
>> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
>> then either they have to make DWARF reliable. Or, they have to implement SFrame as a
>> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
>> The latter is a lot easier to do.
SFrame stack trace data is generated by the GNU assembler, by using the
.cfi_* asm directives embedded by the compiler. So, it is true that the
source of EH_Frame info and SFrame stack trace data is the same.
That said, yes, if you see bugs/inconsistencies in SFrame/EH_Frame info,
please file the issue(s).
>
> [ adding linux-toolchains ]
>
> I don't think ensuring reliability is an easy task, regardless of the
> complexity of the unwinding format.
>
> Whether it's SFrame or DWARF/eh_frame, the question would be how to
> ensure it's always reliable for a compiler "power user" like the kernel
> which has many edge cases (including lots of inline asm which the
> compiler has no visibility to) and which uses unwinding for more than
> just debugging.
>
> It would need some kind of black-box testing on a complex code base.
> (hint: kind of like what objtool already does today)
>
Hi Mark,
I attended your presentation at LPC. You mentioned that you could use some help with some prerequisites for the Livepatch feature.
I would like to lend a hand.
What would you like me to implement?
I would also like to implement Unwind Hints for the feature. If you want a specific format for the hints, let me know.
Looking forward to helping out with the feature.
Madhavan
On Thu, Dec 14, 2023 at 02:49:29PM -0600, Madhavan T. Venkataraman wrote:
> Hi Mark,
Hi Madhavan,
> I attended your presentation at LPC. You mentioned that you could use
> some help with some prerequisites for the Livepatch feature.
> I would like to lend a hand.
Cool!
I've been meaning to send a mail round with a summary of the current state of
things, and what needs to be done going forward, but I haven't had the time
since LPC to put that together (as e.g. that requires learning some more about
SFrame). I'll be disappearing for the holiday shortly, and I intend to pick
that up in the new year.
> What would you like me to implement?
I'm not currently sure exactly what we need/want to implement, and as above I
think that needs to wait until the new year.
However, one thing that you can do that would be very useful is to write up and
report the GCC DWARF issues that you mentioned in:
https://lore.kernel.org/linux-arm-kernel/[email protected]/
... as (depending on exactly what those are) those could also affect SFrame
generation (and thus we'll need to get those fixed in GCC), and regardless it
would be useful information to know.
I understood that you planned to do that from:
https://lore.kernel.org/linux-arm-kernel/[email protected]/
... but I couldn't spot any relevant mails or issues in the GCC bugzilla, so
either I'm failing to search hard enough, or did that get forgotten about?
> I would also like to implement Unwind Hints for the feature. If you want a
> specific format for the hints, let me know.
I will get back to you on that in the new year; I think the specifics we want
are going to depend on other details above we need to analyse first.
Thanks,
Mark.
On 12/15/23 07:04, Mark Rutland wrote:
> On Thu, Dec 14, 2023 at 02:49:29PM -0600, Madhavan T. Venkataraman wrote:
>> Hi Mark,
>
> Hi Madhavan,
>
>> I attended your presentation at LPC. You mentioned that you could use
>> some help with some prerequisites for the Livepatch feature.
>> I would like to lend a hand.
>
> Cool!
>
> I've been meaning to send a mail round with a summary of the current state of
> things, and what needs to be done going forward, but I haven't had the time
> since LPC to put that together (as e.g. that requires learning some more about
> SFrame). I'll be disappearing for the holiday shortly, and I intend to pick
> that up in the new year.
>
>> What would you like me to implement?
>
> I'm not currently sure exactly what we need/want to implement, and as above I
> think that needs to wait until the new year.
>
OK.
> However, one thing that you can do that would be very useful is to write up and
> report the GCC DWARF issues that you mentioned in:
>
> https://lore.kernel.org/linux-arm-kernel/[email protected]/
>
> ... as (depending on exactly what those are) those could also affect SFrame
> generation (and thus we'll need to get those fixed in GCC), and regardless it
> would be useful information to know.
>
> I understood that you planned to do that from:
>
> https://lore.kernel.org/linux-arm-kernel/[email protected]/
>
> ... but I couldn't spot any relevant mails or issues in the GCC bugzilla, so
> either I'm failing to search hard enough, or did that get forgotten about?
>
Yeah. I had notes on that, but I seem to have lost them. I need to reproduce the
problems and analyze them again, which is not trivial. So, I have been procrastinating.
I am also disappearing for the rest of this year. I will try to look at it in the
new year.
>> I would also like to implement Unwind Hints for the feature. If you want a
>> specific format for the hints, let me know.
>
> I will get back to you on that in the new year; I think the specifics we want
> are going to depend on other details above we need to analyse first.
>
OK.
For now, I will implement something and send it out just for reference. We can revisit
this topic next year sometime.
Thanks.
Madhavan