2017-06-28 15:13:47

by Josh Poimboeuf

Subject: [PATCH v2 0/8] x86: undwarf unwinder

v2:

- 2x performance improvement by using a fast lookup table and splitting
undwarf array into two parallel arrays (Andy L)
- reduce data size by ~1MB by getting rid of 'len' field
- sort and post-process data at boot time
- don't search vmlinux tables for module addresses (Peter Z)
- disable preemption to prevent module from getting unloaded while
reading its undwarf data (Peter Z)
- avoid unwinding a running task's stack (Jiri S)
- remove '__sp' constraint from inline asm (Jiri S)
- rename "CFI_*" -> "UNWIND_HINT_*" (Andy L)
- replace '999:' label with '.Lunwind_hint_ip_\@' (Andy L)
- entry code annotation fixes: extra=0 fix, symmetrical macro
annotations, ret_from_fork fix (Andy L)
- invalidate all object files when enabling/disabling
CONFIG_UNDWARF_UNWINDER
- pass ip-1 to undwarf_find() for call return addresses to fix stack
traces for sibling calls and noreturn calls at end of function
- docs: clarify benefits vs frame pointers (Ingo)
- docs: improve wording, add more info, add performance info from Mel G
and Jiri S, move to kernel docs dir
- objtool: several minor fixes (Jiri S)
- objtool: append file instead of rewriting it
- objtool: improve elf warnings
- objtool: fix handling of the GCC DRAP register for aligned stacks
- objtool: rewrite 'undwarf dump' command to be much faster and to work
on vmlinux
- objtool: rename undwarf.c -> undwarf_gen.c

-----

Create a new 'undwarf' unwinder, enabled by CONFIG_UNDWARF_UNWINDER, and
plug it into the x86 unwinder framework. Objtool is used to generate
the undwarf debuginfo. The undwarf debuginfo format is basically a
simplified version of DWARF CFI. More details below.

The unwinder works well in my testing. It unwinds through interrupts,
exceptions, and preemption, with and without frame pointers, across
aligned stacks and dynamically allocated stacks. If something goes
wrong during an oops, it successfully falls back to printing the '?'
entries just like the frame pointer unwinder.

I'm not tied to the 'undwarf' name; other naming ideas are welcome.

Some potential future improvements:
- properly annotate or fix whitelisted functions and files
- reduce the number of base CFA registers needed in entry code
- compress undwarf debuginfo to use less memory
- make it easier to disable CONFIG_FRAME_POINTER
- add reliability checks for livepatch
- runtime NMI stack reliability checker

This code can also be found at:

git://github.com/jpoimboe/linux undwarf-v2

Here are the contents of the undwarf.txt file, which explains the 'why'
in more detail:


Undwarf unwinder debuginfo generation
=====================================

Overview
--------

The kernel CONFIG_UNDWARF_UNWINDER option enables objtool generation of
undwarf debuginfo, which is out-of-band data used by the in-kernel
undwarf unwinder. It's similar in concept to DWARF CFI debuginfo,
which would be used by a DWARF unwinder. The difference is
that the format of the undwarf data is simpler than DWARF, which in turn
allows the unwinder to be simpler and faster.

Objtool generates the undwarf data by first doing compile-time stack
metadata validation (CONFIG_STACK_VALIDATION). After analyzing all the
code paths of a .o file, it determines information about the stack state
at each instruction address in the file and outputs that information to
the .undwarf and .undwarf_ip sections.

The undwarf sections are combined at link time and are sorted at boot
time. The unwinder uses the resulting data to correlate instruction
addresses with their stack states at run time.
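
As a rough illustration, each entry in the .undwarf section records
where the caller's frame and the saved frame pointer live relative to a
base register at a given instruction address. The sketch below is
illustrative only; the real layout is defined in
arch/x86/include/asm/undwarf-types.h in this series and may differ in
field names, widths, and encodings:

    struct undwarf {
            short sp_offset;        /* caller's frame (CFA) = base reg + offset */
            short bp_offset;        /* where the previous bp was saved, if known */
            unsigned sp_reg:4;      /* base register for the CFA (sp, bp, ...) */
            unsigned bp_reg:4;      /* base register for the saved bp */
            unsigned type:2;        /* e.g. normal call frame vs. pt_regs frame */
    } __attribute__((packed));

The instruction address each entry applies to lives in the .undwarf_ip
section (the two are kept as parallel arrays; see "Unwinder
implementation details" below).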


Undwarf vs frame pointers
-------------------------

With frame pointers enabled, GCC adds instrumentation code to every
function in the kernel. The kernel's .text size increases by about
3.2%, resulting in a broad kernel-wide slowdown. Measurements by Mel
Gorman [1] have shown a slowdown of 5-10% for some workloads.

In contrast, the undwarf unwinder has no effect on text size or runtime
performance, because the debuginfo is out of band. So if you disable
frame pointers and enable undwarf, you get a nice performance
improvement across the board, and still have reliable stack traces.

Another benefit of undwarf compared to frame pointers is that it can
reliably unwind across interrupts and exceptions. Frame pointer based
unwinds can skip the caller of the interrupted function if it was a leaf
function or if the interrupt hit before the frame pointer was saved.

The main disadvantage of undwarf compared to frame pointers is that it
needs more memory to store the undwarf table: roughly 3-5MB depending on
the kernel config.


Undwarf vs DWARF
----------------

Undwarf debuginfo's advantage over DWARF itself is that it's much
simpler. It gets rid of the complex DWARF CFI state machine and of the
tracking of unnecessary registers. This allows the unwinder to be much
simpler, meaning fewer bugs, which is especially important for
mission-critical oops code.

The simpler debuginfo format also enables the unwinder to be much faster
than DWARF, which is important for perf and lockdep. In a basic
performance test by Jiri Slaby [2], the undwarf unwinder was about 20x
faster than an out-of-tree DWARF unwinder. (Note: that measurement was
taken before some performance tweaks were implemented, so the speedup
may be even higher.)

The undwarf format does have a few downsides compared to DWARF. The
undwarf table takes up ~2MB more memory than a DWARF .eh_frame table.

Another potential downside is that, as GCC evolves, it's conceivable
that the undwarf data may end up being *too* simple to describe the
state of the stack for certain optimizations. But IMO this is unlikely
because GCC saves the frame pointer for any unusual stack adjustments it
does, so I suspect we'll really only ever need to keep track of the
stack pointer and the frame pointer between call frames. But even if we
do end up having to track all the registers DWARF tracks, at least we
will still be able to control the format, e.g. no complex state
machines.


Undwarf debuginfo generation
----------------------------

The undwarf data is generated by objtool. With the existing
compile-time stack metadata validation feature, objtool already follows
all code paths, so it already has all the information it needs to
generate undwarf data from scratch. It's therefore an easy step to go
from stack validation to undwarf generation.

It should be possible to instead generate the undwarf data with a simple
tool which converts DWARF to undwarf. However, such a solution would be
incomplete due to the kernel's extensive use of asm, inline asm, and
special sections like exception tables.

That could be rectified by manually annotating those special code paths
using GNU assembler .cfi annotations in .S files, and homegrown
annotations for inline asm in .c files. But asm annotations were tried
in the past and were found to be unmaintainable. They were often
incorrect/incomplete and made the code harder to read and keep updated.
And based on looking at glibc code, annotating inline asm in .c files
might be even worse.

Objtool still needs a few annotations, but only in code which does
unusual things to the stack like entry code. And even then, far fewer
annotations are needed than what DWARF would need, so they're much more
maintainable than DWARF CFI annotations.

So the advantages of using objtool to generate the undwarf data are
that it gives more accurate debuginfo, with very few annotations. It
also insulates the kernel from toolchain bugs, which can be very
painful to deal with in the kernel since we often have to work around
issues in older versions of the toolchain for years.

The downside is that the unwinder now becomes dependent on objtool's
ability to reverse engineer GCC code paths. If GCC optimizations become
too complicated for objtool to follow, the undwarf generation might stop
working or become incomplete. (It's worth noting that livepatch already
has such a dependency on objtool's ability to follow GCC code paths.)

If newer versions of GCC come up with some optimizations which break
objtool, we may need to revisit the current implementation. Some
possible solutions would be asking GCC to make the optimizations more
palatable, or having objtool use DWARF as an additional input, or
creating a GCC plugin to assist objtool with its analysis. But for now,
objtool follows GCC code quite well.


Unwinder implementation details
-------------------------------

Objtool generates the undwarf data by integrating with the compile-time
stack metadata validation feature, which is described in detail in
tools/objtool/Documentation/stack-validation.txt. After analyzing all
the code paths of a .o file, it creates an array of undwarf structs, and
a parallel array of instruction addresses associated with those structs,
and writes them to the .undwarf and .undwarf_ip sections respectively.

The undwarf data is split into the two arrays for performance reasons,
to make the searchable part of the data (.undwarf_ip) more compact. The
arrays are sorted in parallel at boot time.

Performance is further improved by the use of a fast lookup table which
is created at runtime. The fast lookup table associates a given address
with a range of undwarf table indices, so that only a small subset of
the undwarf table needs to be searched.
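
To make that concrete, here is a small, self-contained toy model of the
lookup. All names, sizes, and table contents below are made up for
illustration; the real implementation lives in
arch/x86/kernel/unwind_undwarf.c and differs in detail:

    #include <stdio.h>

    #define BLOCK_SIZE 256  /* bytes of .text covered by one fast-table slot */

    struct undwarf {        /* per-address stack state (simplified) */
            short sp_offset;
            short bp_offset;
    };

    /* Parallel sorted arrays, standing in for .undwarf_ip and .undwarf. */
    static const unsigned long undwarf_ip[] = { 0x000, 0x010, 0x120, 0x300 };
    static const struct undwarf undwarf[] = {
            { 8, 0 }, { 16, -16 }, { 8, 0 }, { 24, -16 }
    };

    /*
     * Fast lookup table: slot b holds the index of the first undwarf_ip[]
     * entry at or past the start of text block b; the last slot is a
     * sentinel equal to the number of entries.  (No bounds checks here,
     * it's only a toy model.)
     */
    static const unsigned int fast_lookup[] = { 0, 2, 3, 3, 4 };

    static const struct undwarf *undwarf_find(unsigned long ip)
    {
            unsigned int block = ip / BLOCK_SIZE;
            unsigned int lo = fast_lookup[block];
            unsigned int hi = fast_lookup[block + 1];

            /* Binary search for the last entry with undwarf_ip[] <= ip. */
            while (lo < hi) {
                    unsigned int mid = lo + (hi - lo) / 2;

                    if (undwarf_ip[mid] <= ip)
                            lo = mid + 1;
                    else
                            hi = mid;
            }

            return lo ? &undwarf[lo - 1] : NULL;
    }

    int main(void)
    {
            const struct undwarf *u = undwarf_find(0x150);

            if (u)
                    printf("sp_offset=%d bp_offset=%d\n",
                           u->sp_offset, u->bp_offset);
            return 0;
    }

Searching only the compact ip array keeps the hot data cache friendly,
and the fast table confines each binary search to a small slice of it;
the resulting index is then used to read the full stack state from the
parallel undwarf array.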


[1] https://lkml.kernel.org/r/[email protected]
[2] https://lkml.kernel.org/r/[email protected]


Josh Poimboeuf (8):
objtool: move checking code to check.c
objtool, x86: add several functions and files to the objtool whitelist
objtool: stack validation 2.0
objtool: add undwarf debuginfo generation
objtool, x86: add facility for asm code to provide unwind hints
x86/entry: add unwind hint annotations
x86/asm: add unwind hint annotations to sync_core()
x86/unwind: add undwarf unwinder

Documentation/x86/undwarf.txt | 146 +++
arch/um/include/asm/unwind.h | 8 +
arch/x86/Kconfig | 1 +
arch/x86/Kconfig.debug | 25 +
arch/x86/crypto/Makefile | 2 +
arch/x86/crypto/sha1-mb/Makefile | 2 +
arch/x86/crypto/sha256-mb/Makefile | 2 +
arch/x86/entry/Makefile | 1 -
arch/x86/entry/calling.h | 6 +
arch/x86/entry/entry_64.S | 56 +-
arch/x86/include/asm/module.h | 9 +
arch/x86/include/asm/processor.h | 3 +
arch/x86/include/asm/undwarf-types.h | 99 ++
arch/x86/include/asm/undwarf.h | 103 ++
arch/x86/include/asm/unwind.h | 77 +-
arch/x86/kernel/Makefile | 9 +-
arch/x86/kernel/acpi/Makefile | 2 +
arch/x86/kernel/kprobes/opt.c | 9 +-
arch/x86/kernel/module.c | 12 +-
arch/x86/kernel/reboot.c | 2 +
arch/x86/kernel/setup.c | 3 +
arch/x86/kernel/unwind_frame.c | 39 +-
arch/x86/kernel/unwind_guess.c | 5 +
arch/x86/kernel/unwind_undwarf.c | 589 ++++++++++
arch/x86/kernel/vmlinux.lds.S | 2 +
arch/x86/kvm/svm.c | 2 +
arch/x86/kvm/vmx.c | 3 +
arch/x86/lib/msr-reg.S | 8 +-
arch/x86/net/Makefile | 2 +
arch/x86/platform/efi/Makefile | 1 +
arch/x86/power/Makefile | 2 +
arch/x86/xen/Makefile | 3 +
include/asm-generic/vmlinux.lds.h | 20 +-
kernel/kexec_core.c | 4 +-
lib/Kconfig.debug | 3 +
scripts/Makefile.build | 14 +-
tools/objtool/Build | 4 +
tools/objtool/Documentation/stack-validation.txt | 195 ++--
tools/objtool/Makefile | 5 +-
tools/objtool/arch.h | 64 +-
tools/objtool/arch/x86/decode.c | 400 ++++++-
tools/objtool/builtin-check.c | 1281 +---------------------
tools/objtool/builtin-undwarf.c | 70 ++
tools/objtool/builtin.h | 1 +
tools/objtool/cfi.h | 55 +
tools/objtool/{builtin-check.c => check.c} | 954 ++++++++++++----
tools/objtool/check.h | 79 ++
tools/objtool/elf.c | 265 ++++-
tools/objtool/elf.h | 21 +-
tools/objtool/objtool.c | 3 +-
tools/objtool/special.c | 6 +-
tools/objtool/undwarf-types.h | 99 ++
tools/objtool/{builtin.h => undwarf.h} | 18 +-
tools/objtool/undwarf_dump.c | 212 ++++
tools/objtool/undwarf_gen.c | 215 ++++
tools/objtool/warn.h | 10 +
56 files changed, 3466 insertions(+), 1765 deletions(-)
create mode 100644 Documentation/x86/undwarf.txt
create mode 100644 arch/um/include/asm/unwind.h
create mode 100644 arch/x86/include/asm/undwarf-types.h
create mode 100644 arch/x86/include/asm/undwarf.h
create mode 100644 arch/x86/kernel/unwind_undwarf.c
create mode 100644 tools/objtool/builtin-undwarf.c
create mode 100644 tools/objtool/cfi.h
copy tools/objtool/{builtin-check.c => check.c} (59%)
create mode 100644 tools/objtool/check.h
create mode 100644 tools/objtool/undwarf-types.h
copy tools/objtool/{builtin.h => undwarf.h} (67%)
create mode 100644 tools/objtool/undwarf_dump.c
create mode 100644 tools/objtool/undwarf_gen.c

--
2.7.5


2017-06-28 15:12:19

by Josh Poimboeuf

Subject: [PATCH v2 3/8] objtool: stack validation 2.0

This is a major rewrite of objtool. Instead of only tracking frame
pointer changes, it now tracks all stack-related operations, including
all register saves/restores.

In addition to making stack validation more robust, this also paves the
way for undwarf generation.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
tools/objtool/Documentation/stack-validation.txt | 153 +++--
tools/objtool/Makefile | 2 +-
tools/objtool/arch.h | 64 ++-
tools/objtool/arch/x86/decode.c | 400 ++++++++++++--
tools/objtool/cfi.h | 55 ++
tools/objtool/check.c | 676 ++++++++++++++++++-----
tools/objtool/check.h | 19 +-
tools/objtool/elf.c | 59 +-
tools/objtool/elf.h | 6 +-
tools/objtool/special.c | 6 +-
tools/objtool/warn.h | 10 +
11 files changed, 1130 insertions(+), 320 deletions(-)
create mode 100644 tools/objtool/cfi.h

diff --git a/tools/objtool/Documentation/stack-validation.txt b/tools/objtool/Documentation/stack-validation.txt
index 55a60d3..17c1195 100644
--- a/tools/objtool/Documentation/stack-validation.txt
+++ b/tools/objtool/Documentation/stack-validation.txt
@@ -127,28 +127,13 @@ b) 100% reliable stack traces for DWARF enabled kernels

c) Higher live patching compatibility rate

- (NOTE: This is not yet implemented)
-
- Currently with CONFIG_LIVEPATCH there's a basic live patching
- framework which is safe for roughly 85-90% of "security" fixes. But
- patches can't have complex features like function dependency or
- prototype changes, or data structure changes.
-
- There's a strong need to support patches which have the more complex
- features so that the patch compatibility rate for security fixes can
- eventually approach something resembling 100%. To achieve that, a
- "consistency model" is needed, which allows tasks to be safely
- transitioned from an unpatched state to a patched state.
-
- One of the key requirements of the currently proposed livepatch
- consistency model [*] is that it needs to walk the stack of each
- sleeping task to determine if it can be transitioned to the patched
- state. If objtool can ensure that stack traces are reliable, this
- consistency model can be used and the live patching compatibility
- rate can be improved significantly.
-
- [*] https://lkml.kernel.org/r/[email protected]
+ Livepatch has an optional "consistency model", which is needed for
+ more complex patches. In order for the consistency model to work,
+ stack traces need to be reliable (or an unreliable condition needs to
+ be detectable). Objtool makes that possible.

+ For more details, see the livepatch documentation in the Linux kernel
+ source tree at Documentation/livepatch/livepatch.txt.

Rules
-----
@@ -201,80 +186,84 @@ To achieve the validation, objtool enforces the following rules:
return normally.


-Errors in .S files
-------------------
+Objtool warnings
+----------------

-If you're getting an error in a compiled .S file which you don't
-understand, first make sure that the affected code follows the above
-rules.
+For asm files, if you're getting an error which doesn't make sense,
+first make sure that the affected code follows the above rules.
+
+For C files, the common culprits are inline asm statements and calls to
+"noreturn" functions. See below for more details.
+
+Another possible cause for errors in C code is if the Makefile removes
+-fno-omit-frame-pointer or adds -fomit-frame-pointer to the gcc options.

Here are some examples of common warnings reported by objtool, what
they mean, and suggestions for how to fix them.


-1. asm_file.o: warning: objtool: func()+0x128: call without frame pointer save/setup
+1. file.o: warning: objtool: func()+0x128: call without frame pointer save/setup

The func() function made a function call without first saving and/or
- updating the frame pointer.
-
- If func() is indeed a callable function, add proper frame pointer
- logic using the FRAME_BEGIN and FRAME_END macros. Otherwise, remove
- its ELF function annotation by changing ENDPROC to END.
+ updating the frame pointer, and CONFIG_FRAME_POINTER is enabled.

- If you're getting this error in a .c file, see the "Errors in .c
- files" section.
+ If the error is for an asm file, and func() is indeed a callable
+ function, add proper frame pointer logic using the FRAME_BEGIN and
+ FRAME_END macros. Otherwise, if it's not a callable function, remove
+ its ELF function annotation by changing ENDPROC to END, and instead
+ use the manual CFI hint macros in asm/undwarf.h.

+ If it's a GCC-compiled .c file, the error may be because the function
+ uses an inline asm() statement which has a "call" instruction. An
+ asm() statement with a call instruction must declare the use of the
+ stack pointer in its output operand. For example, on x86_64:

-2. asm_file.o: warning: objtool: .text+0x53: return instruction outside of a callable function
-
- A return instruction was detected, but objtool couldn't find a way
- for a callable function to reach the instruction.
+ register void *__sp asm("rsp");
+ asm volatile("call func" : "+r" (__sp));

- If the return instruction is inside (or reachable from) a callable
- function, the function needs to be annotated with the ENTRY/ENDPROC
- macros.
+ Otherwise the stack frame may not get created before the call.

- If you _really_ need a return instruction outside of a function, and
- are 100% sure that it won't affect stack traces, you can tell
- objtool to ignore it. See the "Adding exceptions" section below.

+2. file.o: warning: objtool: .text+0x53: unreachable instruction

-3. asm_file.o: warning: objtool: func()+0x9: function has unreachable instruction
+ Objtool couldn't find a code path to reach the instruction.

- The instruction lives inside of a callable function, but there's no
- possible control flow path from the beginning of the function to the
- instruction.
+ If the error is for an asm file, and the instruction is inside (or
+ reachable from) a callable function, the function should be annotated
+ with the ENTRY/ENDPROC macros (ENDPROC is the important one).
+ Otherwise, the code should probably be annotated with the CFI hint
+ macros in asm/undwarf.h so objtool and the unwinder can know the
+ stack state associated with the code.

- If the instruction is actually needed, and it's actually in a
- callable function, ensure that its function is properly annotated
- with ENTRY/ENDPROC.
+ If you're 100% sure the code won't affect stack traces, or if you're
+ a just a bad person, you can tell objtool to ignore it. See the
+ "Adding exceptions" section below.

If it's not actually in a callable function (e.g. kernel entry code),
change ENDPROC to END.


-4. asm_file.o: warning: objtool: func(): can't find starting instruction
+4. file.o: warning: objtool: func(): can't find starting instruction
or
- asm_file.o: warning: objtool: func()+0x11dd: can't decode instruction
+ file.o: warning: objtool: func()+0x11dd: can't decode instruction

- Did you put data in a text section? If so, that can confuse
+ Does the file have data in a text section? If so, that can confuse
objtool's instruction decoder. Move the data to a more appropriate
section like .data or .rodata.


-5. asm_file.o: warning: objtool: func()+0x6: kernel entry/exit from callable instruction
-
- This is a kernel entry/exit instruction like sysenter or sysret.
- Such instructions aren't allowed in a callable function, and are most
- likely part of the kernel entry code.
+5. file.o: warning: objtool: func()+0x6: unsupported instruction in callable function

- If the instruction isn't actually in a callable function, change
- ENDPROC to END.
+ This is a kernel entry/exit instruction like sysenter or iret. Such
+ instructions aren't allowed in a callable function, and are most
+ likely part of the kernel entry code. They should usually not have
+ the callable function annotation (ENDPROC) and should always be
+ annotated with the CFI hint macros in asm/undwarf.h.


-6. asm_file.o: warning: objtool: func()+0x26: sibling call from callable instruction with changed frame pointer
+6. file.o: warning: objtool: func()+0x26: sibling call from callable instruction with modified stack frame

- This is a dynamic jump or a jump to an undefined symbol. Stacktool
+ This is a dynamic jump or a jump to an undefined symbol. Objtool
assumed it's a sibling call and detected that the frame pointer
wasn't first restored to its original state.

@@ -282,24 +271,28 @@ they mean, and suggestions for how to fix them.
destination code to the local file.

If the instruction is not actually in a callable function (e.g.
- kernel entry code), change ENDPROC to END.
+ kernel entry code), change ENDPROC to END and annotate manually with
+ the CFI hint macros in asm/undwarf.h.


-7. asm_file: warning: objtool: func()+0x5c: frame pointer state mismatch
+7. file: warning: objtool: func()+0x5c: stack state mismatch

The instruction's frame pointer state is inconsistent, depending on
which execution path was taken to reach the instruction.

- Make sure the function pushes and sets up the frame pointer (for
- x86_64, this means rbp) at the beginning of the function and pops it
- at the end of the function. Also make sure that no other code in the
- function touches the frame pointer.
+ Make sure that, when CONFIG_FRAME_POINTER is enabled, the function
+ pushes and sets up the frame pointer (for x86_64, this means rbp) at
+ the beginning of the function and pops it at the end of the function.
+ Also make sure that no other code in the function touches the frame
+ pointer.

+ Another possibility is that the code has some asm or inline asm which
+ does some unusual things to the stack or the frame pointer. In such
+ cases it's probably appropriate to use the CFI hint macros in
+ asm/undwarf.h.

-Errors in .c files
-------------------

-1. c_file.o: warning: objtool: funcA() falls through to next function funcB()
+8. file.o: warning: objtool: funcA() falls through to next function funcB()

This means that funcA() doesn't end with a return instruction or an
unconditional jump, and that objtool has determined that the function
@@ -318,22 +311,6 @@ Errors in .c files
might be corrupt due to a gcc bug. For more details, see:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646

-2. If you're getting any other objtool error in a compiled .c file, it
- may be because the file uses an asm() statement which has a "call"
- instruction. An asm() statement with a call instruction must declare
- the use of the stack pointer in its output operand. For example, on
- x86_64:
-
- register void *__sp asm("rsp");
- asm volatile("call func" : "+r" (__sp));
-
- Otherwise the stack frame may not get created before the call.
-
-3. Another possible cause for errors in C code is if the Makefile removes
- -fno-omit-frame-pointer or adds -fomit-frame-pointer to the gcc options.
-
-Also see the above section for .S file errors for more information what
-the individual error messages mean.

If the error doesn't seem to make sense, it could be a bug in objtool.
Feel free to ask the objtool maintainer for help.
diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile
index 27e019c..0e2765e 100644
--- a/tools/objtool/Makefile
+++ b/tools/objtool/Makefile
@@ -25,7 +25,7 @@ OBJTOOL_IN := $(OBJTOOL)-in.o
all: $(OBJTOOL)

INCLUDES := -I$(srctree)/tools/include -I$(srctree)/tools/arch/$(HOSTARCH)/include/uapi
-CFLAGS += -Wall -Werror $(EXTRA_WARNINGS) -fomit-frame-pointer -O2 -g $(INCLUDES)
+CFLAGS += -Wall -Werror $(EXTRA_WARNINGS) -Wno-switch-default -Wno-switch-enum -fomit-frame-pointer -O2 -g $(INCLUDES)
LDFLAGS += -lelf $(LIBSUBCMD)

# Allow old libelf to be used:
diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index a59e061..21aeca8 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -19,25 +19,63 @@
#define _ARCH_H

#include <stdbool.h>
+#include <linux/list.h>
#include "elf.h"
+#include "cfi.h"

-#define INSN_FP_SAVE 1
-#define INSN_FP_SETUP 2
-#define INSN_FP_RESTORE 3
-#define INSN_JUMP_CONDITIONAL 4
-#define INSN_JUMP_UNCONDITIONAL 5
-#define INSN_JUMP_DYNAMIC 6
-#define INSN_CALL 7
-#define INSN_CALL_DYNAMIC 8
-#define INSN_RETURN 9
-#define INSN_CONTEXT_SWITCH 10
-#define INSN_NOP 11
-#define INSN_OTHER 12
+#define INSN_JUMP_CONDITIONAL 1
+#define INSN_JUMP_UNCONDITIONAL 2
+#define INSN_JUMP_DYNAMIC 3
+#define INSN_CALL 4
+#define INSN_CALL_DYNAMIC 5
+#define INSN_RETURN 6
+#define INSN_CONTEXT_SWITCH 7
+#define INSN_STACK 8
+#define INSN_NOP 9
+#define INSN_OTHER 10
#define INSN_LAST INSN_OTHER

+enum op_dest_type {
+ OP_DEST_REG,
+ OP_DEST_REG_INDIRECT,
+ OP_DEST_MEM,
+ OP_DEST_PUSH,
+ OP_DEST_LEAVE,
+};
+
+struct op_dest {
+ enum op_dest_type type;
+ unsigned char reg;
+ int offset;
+};
+
+enum op_src_type {
+ OP_SRC_REG,
+ OP_SRC_REG_INDIRECT,
+ OP_SRC_CONST,
+ OP_SRC_POP,
+ OP_SRC_ADD,
+ OP_SRC_AND,
+};
+
+struct op_src {
+ enum op_src_type type;
+ unsigned char reg;
+ int offset;
+};
+
+struct stack_op {
+ struct op_dest dest;
+ struct op_src src;
+};
+
+void arch_initial_func_cfi_state(struct cfi_state *state);
+
int arch_decode_instruction(struct elf *elf, struct section *sec,
unsigned long offset, unsigned int maxlen,
unsigned int *len, unsigned char *type,
- unsigned long *displacement);
+ unsigned long *immediate, struct stack_op *op);
+
+bool arch_callee_saved_reg(unsigned char reg);

#endif /* _ARCH_H */
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 6ac99e3..a36c2eb 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -27,6 +27,17 @@
#include "../../arch.h"
#include "../../warn.h"

+static unsigned char op_to_cfi_reg[][2] = {
+ {CFI_AX, CFI_R8},
+ {CFI_CX, CFI_R9},
+ {CFI_DX, CFI_R10},
+ {CFI_BX, CFI_R11},
+ {CFI_SP, CFI_R12},
+ {CFI_BP, CFI_R13},
+ {CFI_SI, CFI_R14},
+ {CFI_DI, CFI_R15},
+};
+
static int is_x86_64(struct elf *elf)
{
switch (elf->ehdr.e_machine) {
@@ -40,24 +51,50 @@ static int is_x86_64(struct elf *elf)
}
}

+bool arch_callee_saved_reg(unsigned char reg)
+{
+ switch (reg) {
+ case CFI_BP:
+ case CFI_BX:
+ case CFI_R12:
+ case CFI_R13:
+ case CFI_R14:
+ case CFI_R15:
+ return true;
+
+ case CFI_AX:
+ case CFI_CX:
+ case CFI_DX:
+ case CFI_SI:
+ case CFI_DI:
+ case CFI_SP:
+ case CFI_R8:
+ case CFI_R9:
+ case CFI_R10:
+ case CFI_R11:
+ case CFI_RA:
+ default:
+ return false;
+ }
+}
+
int arch_decode_instruction(struct elf *elf, struct section *sec,
unsigned long offset, unsigned int maxlen,
unsigned int *len, unsigned char *type,
- unsigned long *immediate)
+ unsigned long *immediate, struct stack_op *op)
{
struct insn insn;
- int x86_64;
- unsigned char op1, op2, ext;
+ int x86_64, sign;
+ unsigned char op1, op2, rex = 0, rex_b = 0, rex_r = 0, rex_w = 0,
+ modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0,
+ sib = 0;

x86_64 = is_x86_64(elf);
if (x86_64 == -1)
return -1;

- insn_init(&insn, (void *)(sec->data + offset), maxlen, x86_64);
+ insn_init(&insn, sec->data->d_buf + offset, maxlen, x86_64);
insn_get_length(&insn);
- insn_get_opcode(&insn);
- insn_get_modrm(&insn);
- insn_get_immediate(&insn);

if (!insn_complete(&insn)) {
WARN_FUNC("can't decode instruction", sec, offset);
@@ -73,67 +110,323 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
op1 = insn.opcode.bytes[0];
op2 = insn.opcode.bytes[1];

+ if (insn.rex_prefix.nbytes) {
+ rex = insn.rex_prefix.bytes[0];
+ rex_w = X86_REX_W(rex) >> 3;
+ rex_r = X86_REX_R(rex) >> 2;
+ rex_b = X86_REX_B(rex);
+ }
+
+ if (insn.modrm.nbytes) {
+ modrm = insn.modrm.bytes[0];
+ modrm_mod = X86_MODRM_MOD(modrm);
+ modrm_reg = X86_MODRM_REG(modrm);
+ modrm_rm = X86_MODRM_RM(modrm);
+ }
+
+ if (insn.sib.nbytes)
+ sib = insn.sib.bytes[0];
+
switch (op1) {
- case 0x55:
- if (!insn.rex_prefix.nbytes)
- /* push rbp */
- *type = INSN_FP_SAVE;
+
+ case 0x1:
+ case 0x29:
+ if (rex_w && !rex_b && modrm_mod == 3 && modrm_rm == 4) {
+
+ /* add/sub reg, %rsp */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ op->dest.type = OP_SRC_REG;
+ op->dest.reg = CFI_SP;
+ }
+ break;
+
+ case 0x50 ... 0x57:
+
+ /* push reg */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_REG;
+ op->src.reg = op_to_cfi_reg[op1 & 0x7][rex_b];
+ op->dest.type = OP_DEST_PUSH;
+
break;

- case 0x5d:
- if (!insn.rex_prefix.nbytes)
- /* pop rbp */
- *type = INSN_FP_RESTORE;
+ case 0x58 ... 0x5f:
+
+ /* pop reg */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_POP;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = op_to_cfi_reg[op1 & 0x7][rex_b];
+
+ break;
+
+ case 0x68:
+ case 0x6a:
+ /* push immediate */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_CONST;
+ op->dest.type = OP_DEST_PUSH;
break;

case 0x70 ... 0x7f:
*type = INSN_JUMP_CONDITIONAL;
break;

+ case 0x81:
+ case 0x83:
+ if (rex != 0x48)
+ break;
+
+ if (modrm == 0xe4) {
+ /* and imm, %rsp */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_AND;
+ op->src.reg = CFI_SP;
+ op->src.offset = insn.immediate.value;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ break;
+ }
+
+ if (modrm == 0xc4)
+ sign = 1;
+ else if (modrm == 0xec)
+ sign = -1;
+ else
+ break;
+
+ /* add/sub imm, %rsp */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_SP;
+ op->src.offset = insn.immediate.value * sign;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ break;
+
case 0x89:
- if (insn.rex_prefix.nbytes == 1 &&
- insn.rex_prefix.bytes[0] == 0x48 &&
- insn.modrm.nbytes && insn.modrm.bytes[0] == 0xe5)
- /* mov rsp, rbp */
- *type = INSN_FP_SETUP;
+ if (rex == 0x48 && modrm == 0xe5) {
+
+ /* mov %rsp, %rbp */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_REG;
+ op->src.reg = CFI_SP;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_BP;
+ break;
+ }
+ /* fallthrough */
+ case 0x88:
+ if (!rex_b &&
+ (modrm_mod == 1 || modrm_mod == 2) && modrm_rm == 5) {
+
+ /* mov reg, disp(%rbp) */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_REG;
+ op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ op->dest.type = OP_DEST_REG_INDIRECT;
+ op->dest.reg = CFI_BP;
+ op->dest.offset = insn.displacement.value;
+
+ } else if (rex_w && !rex_b && modrm_rm == 4 && sib == 0x24) {
+
+ /* mov reg, disp(%rsp) */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_REG;
+ op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ op->dest.type = OP_DEST_REG_INDIRECT;
+ op->dest.reg = CFI_SP;
+ op->dest.offset = insn.displacement.value;
+ }
+
+ break;
+
+ case 0x8b:
+ if (rex_w && !rex_b && modrm_mod == 1 && modrm_rm == 5) {
+
+ /* mov disp(%rbp), reg */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_REG_INDIRECT;
+ op->src.reg = CFI_BP;
+ op->src.offset = insn.displacement.value;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = op_to_cfi_reg[modrm_reg][rex_r];
+
+ } else if (rex_w && !rex_b && sib == 0x24 &&
+ modrm_mod != 3 && modrm_rm == 4) {
+
+ /* mov disp(%rsp), reg */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_REG_INDIRECT;
+ op->src.reg = CFI_SP;
+ op->src.offset = insn.displacement.value;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ }
+
break;

case 0x8d:
- if (insn.rex_prefix.nbytes &&
- insn.rex_prefix.bytes[0] == 0x48 &&
- insn.modrm.nbytes && insn.modrm.bytes[0] == 0x2c &&
- insn.sib.nbytes && insn.sib.bytes[0] == 0x24)
- /* lea %(rsp), %rbp */
- *type = INSN_FP_SETUP;
+ if (rex == 0x48 && modrm == 0x65) {
+
+ /* lea -disp(%rbp), %rsp */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_BP;
+ op->src.offset = insn.displacement.value;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ break;
+ }
+
+ if (rex == 0x4c && modrm == 0x54 && sib == 0x24 &&
+ insn.displacement.value == 8) {
+
+ /*
+ * lea 0x8(%rsp), %r10
+ *
+ * Here r10 is the "drap" pointer, used as a stack
+ * pointer helper when the stack gets realigned.
+ */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_SP;
+ op->src.offset = 8;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_R10;
+ break;
+ }
+
+ if (rex == 0x4c && modrm == 0x6c && sib == 0x24 &&
+ insn.displacement.value == 16) {
+
+ /*
+ * lea 0x10(%rsp), %r13
+ *
+ * Here r13 is the "drap" pointer, used as a stack
+ * pointer helper when the stack gets realigned.
+ */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_SP;
+ op->src.offset = 16;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_R13;
+ break;
+ }
+
+ if (rex == 0x49 && modrm == 0x62 &&
+ insn.displacement.value == -8) {
+
+ /*
+ * lea -0x8(%r10), %rsp
+ *
+ * Restoring rsp back to its original value after a
+ * stack realignment.
+ */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_R10;
+ op->src.offset = -8;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ break;
+ }
+
+ if (rex == 0x49 && modrm == 0x65 &&
+ insn.displacement.value == -16) {
+
+ /*
+ * lea -0x10(%r13), %rsp
+ *
+ * Restoring rsp back to its original value after a
+ * stack realignment.
+ */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_R13;
+ op->src.offset = -16;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ break;
+ }
+
+ break;
+
+ case 0x8f:
+ /* pop to mem */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_POP;
+ op->dest.type = OP_DEST_MEM;
break;

case 0x90:
*type = INSN_NOP;
break;

+ case 0x9c:
+ /* pushf */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_CONST;
+ op->dest.type = OP_DEST_PUSH;
+ break;
+
+ case 0x9d:
+ /* popf */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_POP;
+ op->dest.type = OP_DEST_MEM;
+ break;
+
case 0x0f:
+
if (op2 >= 0x80 && op2 <= 0x8f)
*type = INSN_JUMP_CONDITIONAL;
else if (op2 == 0x05 || op2 == 0x07 || op2 == 0x34 ||
op2 == 0x35)
+
/* sysenter, sysret */
*type = INSN_CONTEXT_SWITCH;
+
else if (op2 == 0x0d || op2 == 0x1f)
+
/* nopl/nopw */
*type = INSN_NOP;
- else if (op2 == 0x01 && insn.modrm.nbytes &&
- (insn.modrm.bytes[0] == 0xc2 ||
- insn.modrm.bytes[0] == 0xd8))
- /* vmlaunch, vmrun */
- *type = INSN_CONTEXT_SWITCH;
+
+ else if (op2 == 0xa0 || op2 == 0xa8) {
+
+ /* push fs/gs */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_CONST;
+ op->dest.type = OP_DEST_PUSH;
+
+ } else if (op2 == 0xa1 || op2 == 0xa9) {
+
+ /* pop fs/gs */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_POP;
+ op->dest.type = OP_DEST_MEM;
+ }

break;

- case 0xc9: /* leave */
- *type = INSN_FP_RESTORE;
+ case 0xc9:
+ /*
+ * leave
+ *
+ * equivalent to:
+ * mov bp, sp
+ * pop bp
+ */
+ *type = INSN_STACK;
+ op->dest.type = OP_DEST_LEAVE;
+
break;

- case 0xe3: /* jecxz/jrcxz */
+ case 0xe3:
+ /* jecxz/jrcxz */
*type = INSN_JUMP_CONDITIONAL;
break;

@@ -158,14 +451,27 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
break;

case 0xff:
- ext = X86_MODRM_REG(insn.modrm.bytes[0]);
- if (ext == 2 || ext == 3)
+ if (modrm_reg == 2 || modrm_reg == 3)
+
*type = INSN_CALL_DYNAMIC;
- else if (ext == 4)
+
+ else if (modrm_reg == 4)
+
*type = INSN_JUMP_DYNAMIC;
- else if (ext == 5) /*jmpf */
+
+ else if (modrm_reg == 5)
+
+ /* jmpf */
*type = INSN_CONTEXT_SWITCH;

+ else if (modrm_reg == 6) {
+
+ /* push from mem */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_CONST;
+ op->dest.type = OP_DEST_PUSH;
+ }
+
break;

default:
@@ -176,3 +482,21 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,

return 0;
}
+
+void arch_initial_func_cfi_state(struct cfi_state *state)
+{
+ int i;
+
+ for (i = 0; i < CFI_NUM_REGS; i++) {
+ state->regs[i].base = CFI_UNDEFINED;
+ state->regs[i].offset = 0;
+ }
+
+ /* initial CFA (call frame address) */
+ state->cfa.base = CFI_SP;
+ state->cfa.offset = 8;
+
+ /* initial RA (return address) */
+ state->regs[16].base = CFI_CFA;
+ state->regs[16].offset = -8;
+}
diff --git a/tools/objtool/cfi.h b/tools/objtool/cfi.h
new file mode 100644
index 0000000..443ab2c
--- /dev/null
+++ b/tools/objtool/cfi.h
@@ -0,0 +1,55 @@
+/*
+ * Copyright (C) 2015-2017 Josh Poimboeuf <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _OBJTOOL_CFI_H
+#define _OBJTOOL_CFI_H
+
+#define CFI_UNDEFINED -1
+#define CFI_CFA -2
+#define CFI_SP_INDIRECT -3
+#define CFI_BP_INDIRECT -4
+
+#define CFI_AX 0
+#define CFI_DX 1
+#define CFI_CX 2
+#define CFI_BX 3
+#define CFI_SI 4
+#define CFI_DI 5
+#define CFI_BP 6
+#define CFI_SP 7
+#define CFI_R8 8
+#define CFI_R9 9
+#define CFI_R10 10
+#define CFI_R11 11
+#define CFI_R12 12
+#define CFI_R13 13
+#define CFI_R14 14
+#define CFI_R15 15
+#define CFI_RA 16
+#define CFI_NUM_REGS 17
+
+struct cfi_reg {
+ int base;
+ int offset;
+};
+
+struct cfi_state {
+ struct cfi_reg cfa;
+ struct cfi_reg regs[CFI_NUM_REGS];
+};
+
+#endif /* _OBJTOOL_CFI_H */
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 231a360..2f80aa51 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -27,10 +27,6 @@
#include <linux/hashtable.h>
#include <linux/kernel.h>

-#define STATE_FP_SAVED 0x1
-#define STATE_FP_SETUP 0x2
-#define STATE_FENTRY 0x4
-
struct alternative {
struct list_head list;
struct instruction *insn;
@@ -38,6 +34,7 @@ struct alternative {

const char *objname;
static bool nofp;
+struct cfi_state initial_func_cfi;

static struct instruction *find_insn(struct objtool_file *file,
struct section *sec, unsigned long offset)
@@ -56,7 +53,7 @@ static struct instruction *next_insn_same_sec(struct objtool_file *file,
{
struct instruction *next = list_next_entry(insn, list);

- if (&next->list == &file->insn_list || next->sec != insn->sec)
+ if (!next || &next->list == &file->insn_list || next->sec != insn->sec)
return NULL;

return next;
@@ -67,7 +64,7 @@ static bool gcov_enabled(struct objtool_file *file)
struct section *sec;
struct symbol *sym;

- list_for_each_entry(sec, &file->elf->sections, list)
+ for_each_sec(file, sec)
list_for_each_entry(sym, &sec->symbol_list, list)
if (!strncmp(sym->name, "__gcov_.", 8))
return true;
@@ -75,9 +72,6 @@ static bool gcov_enabled(struct objtool_file *file)
return false;
}

-#define for_each_insn(file, insn) \
- list_for_each_entry(insn, &file->insn_list, list)
-
#define func_for_each_insn(file, func, insn) \
for (insn = find_insn(file, func->sec, func->offset); \
insn && &insn->list != &file->insn_list && \
@@ -94,6 +88,9 @@ static bool gcov_enabled(struct objtool_file *file)
#define sec_for_each_insn_from(file, insn) \
for (; insn; insn = next_insn_same_sec(file, insn))

+#define sec_for_each_insn_continue(file, insn) \
+ for (insn = next_insn_same_sec(file, insn); insn; \
+ insn = next_insn_same_sec(file, insn))

/*
* Check if the function has been manually whitelisted with the
@@ -103,7 +100,6 @@ static bool gcov_enabled(struct objtool_file *file)
static bool ignore_func(struct objtool_file *file, struct symbol *func)
{
struct rela *rela;
- struct instruction *insn;

/* check for STACK_FRAME_NON_STANDARD */
if (file->whitelist && file->whitelist->rela)
@@ -116,11 +112,6 @@ static bool ignore_func(struct objtool_file *file, struct symbol *func)
return true;
}

- /* check if it has a context switching instruction */
- func_for_each_insn(file, func, insn)
- if (insn->type == INSN_CONTEXT_SWITCH)
- return true;
-
return false;
}

@@ -234,6 +225,17 @@ static int dead_end_function(struct objtool_file *file, struct symbol *func)
return __dead_end_function(file, func, 0);
}

+static void clear_insn_state(struct insn_state *state)
+{
+ int i;
+
+ memset(state, 0, sizeof(*state));
+ state->cfa.base = CFI_UNDEFINED;
+ for (i = 0; i < CFI_NUM_REGS; i++)
+ state->regs[i].base = CFI_UNDEFINED;
+ state->drap_reg = CFI_UNDEFINED;
+}
+
/*
* Call the arch-specific instruction decoder for all the instructions and add
* them to the global instruction list.
@@ -246,23 +248,29 @@ static int decode_instructions(struct objtool_file *file)
struct instruction *insn;
int ret;

- list_for_each_entry(sec, &file->elf->sections, list) {
+ for_each_sec(file, sec) {

if (!(sec->sh.sh_flags & SHF_EXECINSTR))
continue;

for (offset = 0; offset < sec->len; offset += insn->len) {
insn = malloc(sizeof(*insn));
+ if (!insn) {
+ WARN("malloc failed");
+ return -1;
+ }
memset(insn, 0, sizeof(*insn));
-
INIT_LIST_HEAD(&insn->alts);
+ clear_insn_state(&insn->state);
+
insn->sec = sec;
insn->offset = offset;

ret = arch_decode_instruction(file->elf, sec, offset,
sec->len - offset,
&insn->len, &insn->type,
- &insn->immediate);
+ &insn->immediate,
+ &insn->stack_op);
if (ret)
return ret;

@@ -352,7 +360,7 @@ static void add_ignores(struct objtool_file *file)
struct section *sec;
struct symbol *func;

- list_for_each_entry(sec, &file->elf->sections, list) {
+ for_each_sec(file, sec) {
list_for_each_entry(func, &sec->symbol_list, list) {
if (func->type != STT_FUNC)
continue;
@@ -361,7 +369,7 @@ static void add_ignores(struct objtool_file *file)
continue;

func_for_each_insn(file, func, insn)
- insn->visited = true;
+ insn->ignore = true;
}
}
}
@@ -381,8 +389,7 @@ static int add_jump_destinations(struct objtool_file *file)
insn->type != INSN_JUMP_UNCONDITIONAL)
continue;

- /* skip ignores */
- if (insn->visited)
+ if (insn->ignore)
continue;

rela = find_rela_by_dest_range(insn->sec, insn->offset,
@@ -519,10 +526,13 @@ static int handle_group_alt(struct objtool_file *file,
}
memset(fake_jump, 0, sizeof(*fake_jump));
INIT_LIST_HEAD(&fake_jump->alts);
+ clear_insn_state(&fake_jump->state);
+
fake_jump->sec = special_alt->new_sec;
fake_jump->offset = -1;
fake_jump->type = INSN_JUMP_UNCONDITIONAL;
fake_jump->jump_dest = list_next_entry(last_orig_insn, list);
+ fake_jump->ignore = true;

if (!special_alt->new_len) {
*new_insn = fake_jump;
@@ -844,7 +854,7 @@ static int add_switch_table_alts(struct objtool_file *file)
if (!file->rodata || !file->rodata->rela)
return 0;

- list_for_each_entry(sec, &file->elf->sections, list) {
+ for_each_sec(file, sec) {
list_for_each_entry(func, &sec->symbol_list, list) {
if (func->type != STT_FUNC)
continue;
@@ -901,21 +911,423 @@ static bool is_fentry_call(struct instruction *insn)
return false;
}

-static bool has_modified_stack_frame(struct instruction *insn)
+static bool has_modified_stack_frame(struct insn_state *state)
{
- return (insn->state & STATE_FP_SAVED) ||
- (insn->state & STATE_FP_SETUP);
+ int i;
+
+ if (state->cfa.base != initial_func_cfi.cfa.base ||
+ state->cfa.offset != initial_func_cfi.cfa.offset ||
+ state->stack_size != initial_func_cfi.cfa.offset ||
+ state->drap)
+ return true;
+
+ for (i = 0; i < CFI_NUM_REGS; i++)
+ if (state->regs[i].base != initial_func_cfi.regs[i].base ||
+ state->regs[i].offset != initial_func_cfi.regs[i].offset)
+ return true;
+
+ return false;
+}
+
+static bool has_valid_stack_frame(struct insn_state *state)
+{
+ if (state->cfa.base == CFI_BP && state->regs[CFI_BP].base == CFI_CFA &&
+ state->regs[CFI_BP].offset == -16)
+ return true;
+
+ if (state->drap && state->regs[CFI_BP].base == CFI_BP)
+ return true;
+
+ return false;
}

-static bool has_valid_stack_frame(struct instruction *insn)
+static void save_reg(struct insn_state *state, unsigned char reg, int base,
+ int offset)
{
- return (insn->state & STATE_FP_SAVED) &&
- (insn->state & STATE_FP_SETUP);
+ if ((arch_callee_saved_reg(reg) ||
+ (state->drap && reg == state->drap_reg)) &&
+ state->regs[reg].base == CFI_UNDEFINED) {
+ state->regs[reg].base = base;
+ state->regs[reg].offset = offset;
+ }
}

-static unsigned int frame_state(unsigned long state)
+static void restore_reg(struct insn_state *state, unsigned char reg)
{
- return (state & (STATE_FP_SAVED | STATE_FP_SETUP));
+ state->regs[reg].base = CFI_UNDEFINED;
+ state->regs[reg].offset = 0;
+}
+
+/*
+ * A note about DRAP stack alignment:
+ *
+ * GCC has the concept of a DRAP register, which is used to help keep track of
+ * the stack pointer when aligning the stack. r10 or r13 is used as the DRAP
+ * register. The typical DRAP pattern is:
+ *
+ * 4c 8d 54 24 08 lea 0x8(%rsp),%r10
+ * 48 83 e4 c0 and $0xffffffffffffffc0,%rsp
+ * 41 ff 72 f8 pushq -0x8(%r10)
+ * 55 push %rbp
+ * 48 89 e5 mov %rsp,%rbp
+ * (more pushes)
+ * 41 52 push %r10
+ * ...
+ * 41 5a pop %r10
+ * (more pops)
+ * 5d pop %rbp
+ * 49 8d 62 f8 lea -0x8(%r10),%rsp
+ * c3 retq
+ *
+ * There are some variations in the epilogues, like:
+ *
+ * 5b pop %rbx
+ * 41 5a pop %r10
+ * 41 5c pop %r12
+ * 41 5d pop %r13
+ * 41 5e pop %r14
+ * c9 leaveq
+ * 49 8d 62 f8 lea -0x8(%r10),%rsp
+ * c3 retq
+ *
+ * and:
+ *
+ * 4c 8b 55 e8 mov -0x18(%rbp),%r10
+ * 48 8b 5d e0 mov -0x20(%rbp),%rbx
+ * 4c 8b 65 f0 mov -0x10(%rbp),%r12
+ * 4c 8b 6d f8 mov -0x8(%rbp),%r13
+ * c9 leaveq
+ * 49 8d 62 f8 lea -0x8(%r10),%rsp
+ * c3 retq
+ *
+ * Sometimes r13 is used as the DRAP register, in which case it's saved and
+ * restored beforehand:
+ *
+ * 41 55 push %r13
+ * 4c 8d 6c 24 10 lea 0x10(%rsp),%r13
+ * 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
+ * ...
+ * 49 8d 65 f0 lea -0x10(%r13),%rsp
+ * 41 5d pop %r13
+ * c3 retq
+ */
+static int update_insn_state(struct instruction *insn, struct insn_state *state)
+{
+ struct stack_op *op = &insn->stack_op;
+ struct cfi_reg *cfa = &state->cfa;
+ struct cfi_reg *regs = state->regs;
+
+ /* stack operations don't make sense with an undefined CFA */
+ if (cfa->base == CFI_UNDEFINED) {
+ if (insn->func) {
+ WARN_FUNC("undefined stack state", insn->sec, insn->offset);
+ return -1;
+ }
+ return 0;
+ }
+
+ switch (op->dest.type) {
+
+ case OP_DEST_REG:
+ switch (op->src.type) {
+
+ case OP_SRC_REG:
+ if (cfa->base == op->src.reg && cfa->base == CFI_SP &&
+ op->dest.reg == CFI_BP && regs[CFI_BP].base == CFI_CFA &&
+ regs[CFI_BP].offset == -cfa->offset) {
+
+ /* mov %rsp, %rbp */
+ cfa->base = op->dest.reg;
+ state->bp_scratch = false;
+ } else if (state->drap) {
+
+ /* drap: mov %rsp, %rbp */
+ regs[CFI_BP].base = CFI_BP;
+ regs[CFI_BP].offset = -state->stack_size;
+ state->bp_scratch = false;
+ } else if (!nofp) {
+
+ WARN_FUNC("unknown stack-related register move",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ break;
+
+ case OP_SRC_ADD:
+ if (op->dest.reg == CFI_SP && op->src.reg == CFI_SP) {
+
+ /* add imm, %rsp */
+ state->stack_size -= op->src.offset;
+ if (cfa->base == CFI_SP)
+ cfa->offset -= op->src.offset;
+ break;
+ }
+
+ if (op->dest.reg == CFI_SP && op->src.reg == CFI_BP) {
+
+ /* lea disp(%rbp), %rsp */
+ state->stack_size = -(op->src.offset + regs[CFI_BP].offset);
+ break;
+ }
+
+ if (op->dest.reg != CFI_BP && op->src.reg == CFI_SP &&
+ cfa->base == CFI_SP) {
+
+ /* drap: lea disp(%rsp), %drap */
+ state->drap_reg = op->dest.reg;
+ break;
+ }
+
+ if (state->drap && op->dest.reg == CFI_SP &&
+ op->src.reg == state->drap_reg) {
+
+ /* drap: lea disp(%drap), %rsp */
+ cfa->base = CFI_SP;
+ cfa->offset = state->stack_size = -op->src.offset;
+ state->drap_reg = CFI_UNDEFINED;
+ state->drap = false;
+ break;
+ }
+
+ if (op->dest.reg == state->cfa.base) {
+ WARN_FUNC("unsupported stack register modification",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ break;
+
+ case OP_SRC_AND:
+ if (op->dest.reg != CFI_SP ||
+ (state->drap_reg != CFI_UNDEFINED && cfa->base != CFI_SP) ||
+ (state->drap_reg == CFI_UNDEFINED && cfa->base != CFI_BP)) {
+ WARN_FUNC("unsupported stack pointer realignment",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ if (state->drap_reg != CFI_UNDEFINED) {
+ /* drap: and imm, %rsp */
+ cfa->base = state->drap_reg;
+ cfa->offset = state->stack_size = 0;
+ state->drap = true;
+
+ }
+
+ /*
+ * Older versions of GCC (4.8ish) realign the stack
+ * without DRAP, with a frame pointer.
+ */
+
+ break;
+
+ case OP_SRC_POP:
+ if (!state->drap && op->dest.type == OP_DEST_REG &&
+ op->dest.reg == cfa->base) {
+
+ /* pop %rbp */
+ cfa->base = CFI_SP;
+ }
+
+ if (regs[op->dest.reg].offset == -state->stack_size) {
+
+ if (state->drap && cfa->base == CFI_BP_INDIRECT &&
+ op->dest.type == OP_DEST_REG &&
+ op->dest.reg == state->drap_reg) {
+
+ /* drap: pop %drap */
+ cfa->base = state->drap_reg;
+ cfa->offset = 0;
+ }
+
+ restore_reg(state, op->dest.reg);
+ }
+
+ state->stack_size -= 8;
+ if (cfa->base == CFI_SP)
+ cfa->offset -= 8;
+
+ break;
+
+ case OP_SRC_REG_INDIRECT:
+ if (state->drap && op->src.reg == CFI_BP &&
+ op->src.offset == regs[op->dest.reg].offset) {
+
+ /* drap: mov disp(%rbp), %reg */
+ if (op->dest.reg == state->drap_reg) {
+ cfa->base = state->drap_reg;
+ cfa->offset = 0;
+ }
+
+ restore_reg(state, op->dest.reg);
+
+ } else if (op->src.reg == cfa->base &&
+ op->src.offset == regs[op->dest.reg].offset + cfa->offset) {
+
+ /* mov disp(%rbp), %reg */
+ /* mov disp(%rsp), %reg */
+ restore_reg(state, op->dest.reg);
+ }
+
+ break;
+
+ default:
+ WARN_FUNC("unknown stack-related instruction",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ break;
+
+ case OP_DEST_PUSH:
+ state->stack_size += 8;
+ if (cfa->base == CFI_SP)
+ cfa->offset += 8;
+
+ if (op->src.type != OP_SRC_REG)
+ break;
+
+ if (state->drap) {
+ if (op->src.reg == cfa->base && op->src.reg == state->drap_reg) {
+
+ /* drap: push %drap */
+ cfa->base = CFI_BP_INDIRECT;
+ cfa->offset = -state->stack_size;
+
+ /* save drap so we know when to undefine it */
+ save_reg(state, op->src.reg, CFI_CFA, -state->stack_size);
+
+ } else if (op->src.reg == CFI_BP && cfa->base == state->drap_reg) {
+
+ /* drap: push %rbp */
+ state->stack_size = 0;
+
+ } else if (regs[op->src.reg].base == CFI_UNDEFINED) {
+
+ /* drap: push %reg */
+ save_reg(state, op->src.reg, CFI_BP, -state->stack_size);
+ }
+
+ } else {
+
+ /* push %reg */
+ save_reg(state, op->src.reg, CFI_CFA, -state->stack_size);
+ }
+
+ /* detect when asm code uses rbp as a scratch register */
+ if (!nofp && insn->func && op->src.reg == CFI_BP &&
+ cfa->base != CFI_BP)
+ state->bp_scratch = true;
+ break;
+
+ case OP_DEST_REG_INDIRECT:
+
+ if (state->drap) {
+ if (op->src.reg == cfa->base && op->src.reg == state->drap_reg) {
+
+ /* drap: mov %drap, disp(%rbp) */
+ cfa->base = CFI_BP_INDIRECT;
+ cfa->offset = op->dest.offset;
+
+ /* save drap so we know when to undefine it */
+ save_reg(state, op->src.reg, CFI_CFA, op->dest.offset);
+ }
+
+ else if (regs[op->src.reg].base == CFI_UNDEFINED) {
+
+ /* drap: mov reg, disp(%rbp) */
+ save_reg(state, op->src.reg, CFI_BP, op->dest.offset);
+ }
+
+ } else if (op->dest.reg == cfa->base) {
+
+ /* mov reg, disp(%rbp) */
+ /* mov reg, disp(%rsp) */
+ save_reg(state, op->src.reg, CFI_CFA,
+ op->dest.offset - state->cfa.offset);
+ }
+
+ break;
+
+ case OP_DEST_LEAVE:
+ if ((!state->drap && cfa->base != CFI_BP) ||
+ (state->drap && cfa->base != state->drap_reg)) {
+ WARN_FUNC("leave instruction with modified stack frame",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ /* leave (mov %rbp, %rsp; pop %rbp) */
+
+ state->stack_size = -state->regs[CFI_BP].offset - 8;
+ restore_reg(state, CFI_BP);
+
+ if (!state->drap) {
+ cfa->base = CFI_SP;
+ cfa->offset -= 8;
+ }
+
+ break;
+
+ case OP_DEST_MEM:
+ if (op->src.type != OP_SRC_POP) {
+ WARN_FUNC("unknown stack-related memory operation",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ /* pop mem */
+ state->stack_size -= 8;
+ if (cfa->base == CFI_SP)
+ cfa->offset -= 8;
+
+ break;
+
+ default:
+ WARN_FUNC("unknown stack-related instruction",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ return 0;
+}
+
+static bool insn_state_match(struct instruction *insn, struct insn_state *state)
+{
+ struct insn_state *state1 = &insn->state, *state2 = state;
+ int i;
+
+ if (memcmp(&state1->cfa, &state2->cfa, sizeof(state1->cfa))) {
+ WARN_FUNC("stack state mismatch: cfa1=%d%+d cfa2=%d%+d",
+ insn->sec, insn->offset,
+ state1->cfa.base, state1->cfa.offset,
+ state2->cfa.base, state2->cfa.offset);
+
+ } else if (memcmp(&state1->regs, &state2->regs, sizeof(state1->regs))) {
+ for (i = 0; i < CFI_NUM_REGS; i++) {
+ if (!memcmp(&state1->regs[i], &state2->regs[i],
+ sizeof(struct cfi_reg)))
+ continue;
+
+ WARN_FUNC("stack state mismatch: reg1[%d]=%d%+d reg2[%d]=%d%+d",
+ insn->sec, insn->offset,
+ i, state1->regs[i].base, state1->regs[i].offset,
+ i, state2->regs[i].base, state2->regs[i].offset);
+ break;
+ }
+
+ } else if (state1->drap != state2->drap ||
+ (state1->drap && state1->drap_reg != state2->drap_reg)) {
+ WARN_FUNC("stack state mismatch: drap1=%d(%d) drap2=%d(%d)",
+ insn->sec, insn->offset,
+ state1->drap, state1->drap_reg,
+ state2->drap, state2->drap_reg);
+
+ } else
+ return true;
+
+ return false;
}

/*
@@ -924,24 +1336,22 @@ static unsigned int frame_state(unsigned long state)
* each instruction and validate all the rules described in
* tools/objtool/Documentation/stack-validation.txt.
*/
-static int validate_branch(struct objtool_file *file,
- struct instruction *first, unsigned char first_state)
+static int validate_branch(struct objtool_file *file, struct instruction *first,
+ struct insn_state state)
{
struct alternative *alt;
struct instruction *insn;
struct section *sec;
struct symbol *func = NULL;
- unsigned char state;
int ret;

insn = first;
sec = insn->sec;
- state = first_state;

if (insn->alt_group && list_empty(&insn->alts)) {
WARN_FUNC("don't know how to handle branch to middle of alternative instruction group",
sec, insn->offset);
- return 1;
+ return -1;
}

while (1) {
@@ -951,23 +1361,21 @@ static int validate_branch(struct objtool_file *file,
func->name, insn->func->name);
return 1;
}
-
- func = insn->func;
}

+ func = insn->func;
+
if (insn->visited) {
- if (frame_state(insn->state) != frame_state(state)) {
- WARN_FUNC("frame pointer state mismatch",
- sec, insn->offset);
+ if (!!insn_state_match(insn, &state))
return 1;
- }

return 0;
}

- insn->visited = true;
insn->state = state;

+ insn->visited = true;
+
list_for_each_entry(alt, &insn->alts, list) {
ret = validate_branch(file, alt->insn, state);
if (ret)
@@ -976,50 +1384,24 @@ static int validate_branch(struct objtool_file *file,

switch (insn->type) {

- case INSN_FP_SAVE:
- if (!nofp) {
- if (state & STATE_FP_SAVED) {
- WARN_FUNC("duplicate frame pointer save",
- sec, insn->offset);
- return 1;
- }
- state |= STATE_FP_SAVED;
- }
- break;
-
- case INSN_FP_SETUP:
- if (!nofp) {
- if (state & STATE_FP_SETUP) {
- WARN_FUNC("duplicate frame pointer setup",
- sec, insn->offset);
- return 1;
- }
- state |= STATE_FP_SETUP;
- }
- break;
-
- case INSN_FP_RESTORE:
- if (!nofp) {
- if (has_valid_stack_frame(insn))
- state &= ~STATE_FP_SETUP;
-
- state &= ~STATE_FP_SAVED;
- }
- break;
-
case INSN_RETURN:
- if (!nofp && has_modified_stack_frame(insn)) {
- WARN_FUNC("return without frame pointer restore",
+ if (func && has_modified_stack_frame(&state)) {
+ WARN_FUNC("return with modified stack frame",
sec, insn->offset);
return 1;
}
+
+ if (state.bp_scratch) {
+ WARN("%s uses BP as a scratch register",
+ insn->func->name);
+ return 1;
+ }
+
return 0;

case INSN_CALL:
- if (is_fentry_call(insn)) {
- state |= STATE_FENTRY;
+ if (is_fentry_call(insn))
break;
- }

ret = dead_end_function(file, insn->call_dest);
if (ret == 1)
@@ -1029,7 +1411,7 @@ static int validate_branch(struct objtool_file *file,

/* fallthrough */
case INSN_CALL_DYNAMIC:
- if (!nofp && !has_valid_stack_frame(insn)) {
+ if (!nofp && func && !has_valid_stack_frame(&state)) {
WARN_FUNC("call without frame pointer save/setup",
sec, insn->offset);
return 1;
@@ -1043,8 +1425,8 @@ static int validate_branch(struct objtool_file *file,
state);
if (ret)
return 1;
- } else if (has_modified_stack_frame(insn)) {
- WARN_FUNC("sibling call from callable instruction with changed frame pointer",
+ } else if (func && has_modified_stack_frame(&state)) {
+ WARN_FUNC("sibling call from callable instruction with modified stack frame",
sec, insn->offset);
return 1;
} /* else it's a sibling call */
@@ -1055,15 +1437,29 @@ static int validate_branch(struct objtool_file *file,
break;

case INSN_JUMP_DYNAMIC:
- if (list_empty(&insn->alts) &&
- has_modified_stack_frame(insn)) {
- WARN_FUNC("sibling call from callable instruction with changed frame pointer",
+ if (func && list_empty(&insn->alts) &&
+ has_modified_stack_frame(&state)) {
+ WARN_FUNC("sibling call from callable instruction with modified stack frame",
sec, insn->offset);
return 1;
}

return 0;

+ case INSN_CONTEXT_SWITCH:
+ if (func) {
+ WARN_FUNC("unsupported instruction in callable function",
+ sec, insn->offset);
+ return 1;
+ }
+ return 0;
+
+ case INSN_STACK:
+ if (update_insn_state(insn, &state))
+ return -1;
+
+ break;
+
default:
break;
}
@@ -1094,12 +1490,18 @@ static bool is_ubsan_insn(struct instruction *insn)
"__ubsan_handle_builtin_unreachable"));
}

-static bool ignore_unreachable_insn(struct symbol *func,
- struct instruction *insn)
+static bool ignore_unreachable_insn(struct instruction *insn)
{
int i;

- if (insn->type == INSN_NOP)
+ if (insn->ignore || insn->type == INSN_NOP)
+ return true;
+
+ /*
+ * Ignore any unused exceptions. This can happen when a whitelisted
+ * function has an exception table entry.
+ */
+ if (!strcmp(insn->sec->name, ".fixup"))
return true;

/*
@@ -1108,6 +1510,8 @@ static bool ignore_unreachable_insn(struct symbol *func,
*
* End the search at 5 instructions to avoid going into the weeds.
*/
+ if (!insn->func)
+ return false;
for (i = 0; i < 5; i++) {

if (is_kasan_insn(insn) || is_ubsan_insn(insn))
@@ -1118,7 +1522,7 @@ static bool ignore_unreachable_insn(struct symbol *func,
continue;
}

- if (insn->offset + insn->len >= func->offset + func->len)
+ if (insn->offset + insn->len >= insn->func->offset + insn->func->len)
break;
insn = list_next_entry(insn, list);
}
@@ -1131,73 +1535,58 @@ static int validate_functions(struct objtool_file *file)
struct section *sec;
struct symbol *func;
struct instruction *insn;
+ struct insn_state state;
int ret, warnings = 0;

- list_for_each_entry(sec, &file->elf->sections, list) {
+ clear_insn_state(&state);
+
+ state.cfa = initial_func_cfi.cfa;
+ memcpy(&state.regs, &initial_func_cfi.regs,
+ CFI_NUM_REGS * sizeof(struct cfi_reg));
+ state.stack_size = initial_func_cfi.cfa.offset;
+
+ for_each_sec(file, sec) {
list_for_each_entry(func, &sec->symbol_list, list) {
if (func->type != STT_FUNC)
continue;

insn = find_insn(file, sec, func->offset);
- if (!insn)
+ if (!insn || insn->ignore)
continue;

- ret = validate_branch(file, insn, 0);
+ ret = validate_branch(file, insn, state);
warnings += ret;
}
}

- list_for_each_entry(sec, &file->elf->sections, list) {
- list_for_each_entry(func, &sec->symbol_list, list) {
- if (func->type != STT_FUNC)
- continue;
-
- func_for_each_insn(file, func, insn) {
- if (insn->visited)
- continue;
-
- insn->visited = true;
-
- if (file->ignore_unreachables || warnings ||
- ignore_unreachable_insn(func, insn))
- continue;
-
- /*
- * gcov produces a lot of unreachable
- * instructions. If we get an unreachable
- * warning and the file has gcov enabled, just
- * ignore it, and all other such warnings for
- * the file.
- */
- if (!file->ignore_unreachables &&
- gcov_enabled(file)) {
- file->ignore_unreachables = true;
- continue;
- }
-
- WARN_FUNC("function has unreachable instruction", insn->sec, insn->offset);
- warnings++;
- }
- }
- }
-
return warnings;
}

-static int validate_uncallable_instructions(struct objtool_file *file)
+static int validate_reachable_instructions(struct objtool_file *file)
{
struct instruction *insn;
- int warnings = 0;
+
+ if (file->ignore_unreachables)
+ return 0;

for_each_insn(file, insn) {
- if (!insn->visited && insn->type == INSN_RETURN) {
- WARN_FUNC("return instruction outside of a callable function",
- insn->sec, insn->offset);
- warnings++;
- }
+ if (insn->visited || ignore_unreachable_insn(insn))
+ continue;
+
+ /*
+ * gcov produces a lot of unreachable instructions. If we get
+ * an unreachable warning and the file has gcov enabled, just
+ * ignore it, and all other such warnings for the file. Do
+ * this here because this is an expensive function.
+ */
+ if (gcov_enabled(file))
+ return 0;
+
+ WARN_FUNC("unreachable instruction", insn->sec, insn->offset);
+ return 1;
}

- return warnings;
+ return 0;
}

static void cleanup(struct objtool_file *file)
@@ -1226,10 +1615,8 @@ int check(const char *_objname, bool _nofp)
nofp = _nofp;

file.elf = elf_open(objname);
- if (!file.elf) {
- fprintf(stderr, "error reading elf file %s\n", objname);
+ if (!file.elf)
return 1;
- }

INIT_LIST_HEAD(&file.insn_list);
hash_init(file.insn_hash);
@@ -1238,21 +1625,28 @@ int check(const char *_objname, bool _nofp)
file.ignore_unreachables = false;
file.c_file = find_section_by_name(file.elf, ".comment");

+ arch_initial_func_cfi_state(&initial_func_cfi);
+
ret = decode_sections(&file);
if (ret < 0)
goto out;
warnings += ret;

- ret = validate_functions(&file);
- if (ret < 0)
+ if (list_empty(&file.insn_list))
goto out;
- warnings += ret;

- ret = validate_uncallable_instructions(&file);
+ ret = validate_functions(&file);
if (ret < 0)
goto out;
warnings += ret;

+ if (!warnings) {
+ ret = validate_reachable_instructions(&file);
+ if (ret < 0)
+ goto out;
+ warnings += ret;
+ }
+
out:
cleanup(&file);

diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index c0d2fde..da85f5b 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -20,22 +20,34 @@

#include <stdbool.h>
#include "elf.h"
+#include "cfi.h"
#include "arch.h"
#include <linux/hashtable.h>

+struct insn_state {
+ struct cfi_reg cfa;
+ struct cfi_reg regs[CFI_NUM_REGS];
+ int stack_size;
+ bool bp_scratch;
+ bool drap;
+ int drap_reg;
+};
+
struct instruction {
struct list_head list;
struct hlist_node hash;
struct section *sec;
unsigned long offset;
- unsigned int len, state;
+ unsigned int len;
unsigned char type;
unsigned long immediate;
- bool alt_group, visited, dead_end;
+ bool alt_group, visited, dead_end, ignore;
struct symbol *call_dest;
struct instruction *jump_dest;
struct list_head alts;
struct symbol *func;
+ struct stack_op stack_op;
+ struct insn_state state;
};

struct objtool_file {
@@ -48,4 +60,7 @@ struct objtool_file {

int check(const char *objname, bool nofp);

+#define for_each_insn(file, insn) \
+ list_for_each_entry(insn, &file->insn_list, list)
+
#endif /* _CHECK_H */
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index d897702..1a7e8aa 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -37,6 +37,9 @@
#define ELF_C_READ_MMAP ELF_C_READ
#endif

+#define WARN_ELF(format, ...) \
+ WARN(format ": %s", ##__VA_ARGS__, elf_errmsg(-1))
+
struct section *find_section_by_name(struct elf *elf, const char *name)
{
struct section *sec;
@@ -139,12 +142,12 @@ static int read_sections(struct elf *elf)
int i;

if (elf_getshdrnum(elf->elf, &sections_nr)) {
- perror("elf_getshdrnum");
+ WARN_ELF("elf_getshdrnum");
return -1;
}

if (elf_getshdrstrndx(elf->elf, &shstrndx)) {
- perror("elf_getshdrstrndx");
+ WARN_ELF("elf_getshdrstrndx");
return -1;
}

@@ -165,37 +168,36 @@ static int read_sections(struct elf *elf)

s = elf_getscn(elf->elf, i);
if (!s) {
- perror("elf_getscn");
+ WARN_ELF("elf_getscn");
return -1;
}

sec->idx = elf_ndxscn(s);

if (!gelf_getshdr(s, &sec->sh)) {
- perror("gelf_getshdr");
+ WARN_ELF("gelf_getshdr");
return -1;
}

sec->name = elf_strptr(elf->elf, shstrndx, sec->sh.sh_name);
if (!sec->name) {
- perror("elf_strptr");
+ WARN_ELF("elf_strptr");
return -1;
}

- sec->elf_data = elf_getdata(s, NULL);
- if (!sec->elf_data) {
- perror("elf_getdata");
+ sec->data = elf_getdata(s, NULL);
+ if (!sec->data) {
+ WARN_ELF("elf_getdata");
return -1;
}

- if (sec->elf_data->d_off != 0 ||
- sec->elf_data->d_size != sec->sh.sh_size) {
+ if (sec->data->d_off != 0 ||
+ sec->data->d_size != sec->sh.sh_size) {
WARN("unexpected data attributes for %s", sec->name);
return -1;
}

- sec->data = (unsigned long)sec->elf_data->d_buf;
- sec->len = sec->elf_data->d_size;
+ sec->len = sec->data->d_size;
}

/* sanity check, one more call to elf_nextscn() should return NULL */
@@ -232,15 +234,15 @@ static int read_symbols(struct elf *elf)

sym->idx = i;

- if (!gelf_getsym(symtab->elf_data, i, &sym->sym)) {
- perror("gelf_getsym");
+ if (!gelf_getsym(symtab->data, i, &sym->sym)) {
+ WARN_ELF("gelf_getsym");
goto err;
}

sym->name = elf_strptr(elf->elf, symtab->sh.sh_link,
sym->sym.st_name);
if (!sym->name) {
- perror("elf_strptr");
+ WARN_ELF("elf_strptr");
goto err;
}

@@ -322,8 +324,8 @@ static int read_relas(struct elf *elf)
}
memset(rela, 0, sizeof(*rela));

- if (!gelf_getrela(sec->elf_data, i, &rela->rela)) {
- perror("gelf_getrela");
+ if (!gelf_getrela(sec->data, i, &rela->rela)) {
+ WARN_ELF("gelf_getrela");
return -1;
}

@@ -362,12 +364,6 @@ struct elf *elf_open(const char *name)

INIT_LIST_HEAD(&elf->sections);

- elf->name = strdup(name);
- if (!elf->name) {
- perror("strdup");
- goto err;
- }
-
elf->fd = open(name, O_RDONLY);
if (elf->fd == -1) {
perror("open");
@@ -376,12 +372,12 @@ struct elf *elf_open(const char *name)

elf->elf = elf_begin(elf->fd, ELF_C_READ_MMAP, NULL);
if (!elf->elf) {
- perror("elf_begin");
+ WARN_ELF("elf_begin");
goto err;
}

if (!gelf_getehdr(elf->elf, &elf->ehdr)) {
- perror("gelf_getehdr");
+ WARN_ELF("gelf_getehdr");
goto err;
}

@@ -407,6 +403,12 @@ void elf_close(struct elf *elf)
struct symbol *sym, *tmpsym;
struct rela *rela, *tmprela;

+ if (elf->elf)
+ elf_end(elf->elf);
+
+ if (elf->fd > 0)
+ close(elf->fd);
+
list_for_each_entry_safe(sec, tmpsec, &elf->sections, list) {
list_for_each_entry_safe(sym, tmpsym, &sec->symbol_list, list) {
list_del(&sym->list);
@@ -421,11 +423,6 @@ void elf_close(struct elf *elf)
list_del(&sec->list);
free(sec);
}
- if (elf->name)
- free(elf->name);
- if (elf->fd > 0)
- close(elf->fd);
- if (elf->elf)
- elf_end(elf->elf);
+
free(elf);
}
diff --git a/tools/objtool/elf.h b/tools/objtool/elf.h
index 731973e..343968b 100644
--- a/tools/objtool/elf.h
+++ b/tools/objtool/elf.h
@@ -37,10 +37,9 @@ struct section {
DECLARE_HASHTABLE(rela_hash, 16);
struct section *base, *rela;
struct symbol *sym;
- Elf_Data *elf_data;
+ Elf_Data *data;
char *name;
int idx;
- unsigned long data;
unsigned int len;
};

@@ -86,6 +85,7 @@ struct rela *find_rela_by_dest_range(struct section *sec, unsigned long offset,
struct symbol *find_containing_func(struct section *sec, unsigned long offset);
void elf_close(struct elf *elf);

-
+#define for_each_sec(file, sec) \
+ list_for_each_entry(sec, &file->elf->sections, list)

#endif /* _OBJTOOL_ELF_H */
diff --git a/tools/objtool/special.c b/tools/objtool/special.c
index bff8abb..84f001d 100644
--- a/tools/objtool/special.c
+++ b/tools/objtool/special.c
@@ -91,16 +91,16 @@ static int get_alt_entry(struct elf *elf, struct special_entry *entry,
alt->jump_or_nop = entry->jump_or_nop;

if (alt->group) {
- alt->orig_len = *(unsigned char *)(sec->data + offset +
+ alt->orig_len = *(unsigned char *)(sec->data->d_buf + offset +
entry->orig_len);
- alt->new_len = *(unsigned char *)(sec->data + offset +
+ alt->new_len = *(unsigned char *)(sec->data->d_buf + offset +
entry->new_len);
}

if (entry->feature) {
unsigned short feature;

- feature = *(unsigned short *)(sec->data + offset +
+ feature = *(unsigned short *)(sec->data->d_buf + offset +
entry->feature);

/*
diff --git a/tools/objtool/warn.h b/tools/objtool/warn.h
index ac7e075..afd9f7a 100644
--- a/tools/objtool/warn.h
+++ b/tools/objtool/warn.h
@@ -18,6 +18,13 @@
#ifndef _WARN_H
#define _WARN_H

+#include <stdlib.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include "elf.h"
+
extern const char *objname;

static inline char *offstr(struct section *sec, unsigned long offset)
@@ -57,4 +64,7 @@ static inline char *offstr(struct section *sec, unsigned long offset)
free(_str); \
})

+#define WARN_ELF(format, ...) \
+ WARN(format ": %s", ##__VA_ARGS__, elf_errmsg(-1))
+
#endif /* _WARN_H */
--
2.7.5

2017-06-28 15:12:31

by Josh Poimboeuf

Subject: [PATCH v2 1/8] objtool: move checking code to check.c

In preparation for the new 'objtool undwarf generate' command, which
will rely on 'objtool check', move the checking code from
builtin-check.c to check.c where it can be used by other commands.
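
As an aside, once the code lives behind check.h, a later builtin can reuse
it without duplicating anything. A rough, purely illustrative sketch (the
builtin name, option handling and file below are made up; the real
'undwarf generate' command comes later in the series):

/*
 * Sketch only, not part of this patch: a hypothetical future builtin
 * reusing the moved checking code through the new check.h interface.
 */
#include <subcmd/parse-options.h>
#include "builtin.h"
#include "check.h"

static bool no_fp;

static const char * const generate_usage[] = {
	"objtool undwarf generate [<options>] file.o",
	NULL,
};

int cmd_undwarf_generate(int argc, const char **argv)
{
	const struct option options[] = {
		OPT_BOOLEAN('f', "no-fp", &no_fp, "Skip frame pointer validation"),
		OPT_END(),
	};

	argc = parse_options(argc, argv, options, generate_usage, 0);
	if (argc != 1)
		usage_with_options(generate_usage, options);

	/* all of the moved validation logic is reached through check() */
	return check(argv[0], no_fp);
}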

Signed-off-by: Josh Poimboeuf <[email protected]>
Reviewed-by: Jiri Slaby <[email protected]>
---
tools/objtool/Build | 1 +
tools/objtool/builtin-check.c | 1281 +---------------------------
tools/objtool/{builtin-check.c => check.c} | 58 +-
tools/objtool/check.h | 51 ++
4 files changed, 70 insertions(+), 1321 deletions(-)
copy tools/objtool/{builtin-check.c => check.c} (95%)
create mode 100644 tools/objtool/check.h

diff --git a/tools/objtool/Build b/tools/objtool/Build
index d6cdece..6f2e198 100644
--- a/tools/objtool/Build
+++ b/tools/objtool/Build
@@ -1,5 +1,6 @@
objtool-y += arch/$(SRCARCH)/
objtool-y += builtin-check.o
+objtool-y += check.o
objtool-y += elf.o
objtool-y += special.o
objtool-y += objtool.o
diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 5f66697f..365c34e 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2015 Josh Poimboeuf <[email protected]>
+ * Copyright (C) 2015-2017 Josh Poimboeuf <[email protected]>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
@@ -25,1287 +25,32 @@
* For more information, see tools/objtool/Documentation/stack-validation.txt.
*/

-#include <string.h>
-#include <stdlib.h>
#include <subcmd/parse-options.h>
-
#include "builtin.h"
-#include "elf.h"
-#include "special.h"
-#include "arch.h"
-#include "warn.h"
-
-#include <linux/hashtable.h>
-#include <linux/kernel.h>
-
-#define STATE_FP_SAVED 0x1
-#define STATE_FP_SETUP 0x2
-#define STATE_FENTRY 0x4
-
-struct instruction {
- struct list_head list;
- struct hlist_node hash;
- struct section *sec;
- unsigned long offset;
- unsigned int len, state;
- unsigned char type;
- unsigned long immediate;
- bool alt_group, visited, dead_end;
- struct symbol *call_dest;
- struct instruction *jump_dest;
- struct list_head alts;
- struct symbol *func;
-};
-
-struct alternative {
- struct list_head list;
- struct instruction *insn;
-};
-
-struct objtool_file {
- struct elf *elf;
- struct list_head insn_list;
- DECLARE_HASHTABLE(insn_hash, 16);
- struct section *rodata, *whitelist;
- bool ignore_unreachables, c_file;
-};
-
-const char *objname;
-static bool nofp;
-
-static struct instruction *find_insn(struct objtool_file *file,
- struct section *sec, unsigned long offset)
-{
- struct instruction *insn;
-
- hash_for_each_possible(file->insn_hash, insn, hash, offset)
- if (insn->sec == sec && insn->offset == offset)
- return insn;
-
- return NULL;
-}
-
-static struct instruction *next_insn_same_sec(struct objtool_file *file,
- struct instruction *insn)
-{
- struct instruction *next = list_next_entry(insn, list);
-
- if (&next->list == &file->insn_list || next->sec != insn->sec)
- return NULL;
-
- return next;
-}
-
-static bool gcov_enabled(struct objtool_file *file)
-{
- struct section *sec;
- struct symbol *sym;
-
- list_for_each_entry(sec, &file->elf->sections, list)
- list_for_each_entry(sym, &sec->symbol_list, list)
- if (!strncmp(sym->name, "__gcov_.", 8))
- return true;
-
- return false;
-}
-
-#define for_each_insn(file, insn) \
- list_for_each_entry(insn, &file->insn_list, list)
-
-#define func_for_each_insn(file, func, insn) \
- for (insn = find_insn(file, func->sec, func->offset); \
- insn && &insn->list != &file->insn_list && \
- insn->sec == func->sec && \
- insn->offset < func->offset + func->len; \
- insn = list_next_entry(insn, list))
-
-#define func_for_each_insn_continue_reverse(file, func, insn) \
- for (insn = list_prev_entry(insn, list); \
- &insn->list != &file->insn_list && \
- insn->sec == func->sec && insn->offset >= func->offset; \
- insn = list_prev_entry(insn, list))
-
-#define sec_for_each_insn_from(file, insn) \
- for (; insn; insn = next_insn_same_sec(file, insn))
-
-
-/*
- * Check if the function has been manually whitelisted with the
- * STACK_FRAME_NON_STANDARD macro, or if it should be automatically whitelisted
- * due to its use of a context switching instruction.
- */
-static bool ignore_func(struct objtool_file *file, struct symbol *func)
-{
- struct rela *rela;
- struct instruction *insn;
-
- /* check for STACK_FRAME_NON_STANDARD */
- if (file->whitelist && file->whitelist->rela)
- list_for_each_entry(rela, &file->whitelist->rela->rela_list, list) {
- if (rela->sym->type == STT_SECTION &&
- rela->sym->sec == func->sec &&
- rela->addend == func->offset)
- return true;
- if (rela->sym->type == STT_FUNC && rela->sym == func)
- return true;
- }
-
- /* check if it has a context switching instruction */
- func_for_each_insn(file, func, insn)
- if (insn->type == INSN_CONTEXT_SWITCH)
- return true;
-
- return false;
-}
-
-/*
- * This checks to see if the given function is a "noreturn" function.
- *
- * For global functions which are outside the scope of this object file, we
- * have to keep a manual list of them.
- *
- * For local functions, we have to detect them manually by simply looking for
- * the lack of a return instruction.
- *
- * Returns:
- * -1: error
- * 0: no dead end
- * 1: dead end
- */
-static int __dead_end_function(struct objtool_file *file, struct symbol *func,
- int recursion)
-{
- int i;
- struct instruction *insn;
- bool empty = true;
-
- /*
- * Unfortunately these have to be hard coded because the noreturn
- * attribute isn't provided in ELF data.
- */
- static const char * const global_noreturns[] = {
- "__stack_chk_fail",
- "panic",
- "do_exit",
- "do_task_dead",
- "__module_put_and_exit",
- "complete_and_exit",
- "kvm_spurious_fault",
- "__reiserfs_panic",
- "lbug_with_loc",
- "fortify_panic",
- };
-
- if (func->bind == STB_WEAK)
- return 0;
-
- if (func->bind == STB_GLOBAL)
- for (i = 0; i < ARRAY_SIZE(global_noreturns); i++)
- if (!strcmp(func->name, global_noreturns[i]))
- return 1;
-
- if (!func->sec)
- return 0;
-
- func_for_each_insn(file, func, insn) {
- empty = false;
-
- if (insn->type == INSN_RETURN)
- return 0;
- }
-
- if (empty)
- return 0;
-
- /*
- * A function can have a sibling call instead of a return. In that
- * case, the function's dead-end status depends on whether the target
- * of the sibling call returns.
- */
- func_for_each_insn(file, func, insn) {
- if (insn->sec != func->sec ||
- insn->offset >= func->offset + func->len)
- break;
-
- if (insn->type == INSN_JUMP_UNCONDITIONAL) {
- struct instruction *dest = insn->jump_dest;
- struct symbol *dest_func;
-
- if (!dest)
- /* sibling call to another file */
- return 0;
-
- if (dest->sec != func->sec ||
- dest->offset < func->offset ||
- dest->offset >= func->offset + func->len) {
- /* local sibling call */
- dest_func = find_symbol_by_offset(dest->sec,
- dest->offset);
- if (!dest_func)
- continue;
-
- if (recursion == 5) {
- WARN_FUNC("infinite recursion (objtool bug!)",
- dest->sec, dest->offset);
- return -1;
- }
-
- return __dead_end_function(file, dest_func,
- recursion + 1);
- }
- }
-
- if (insn->type == INSN_JUMP_DYNAMIC && list_empty(&insn->alts))
- /* sibling call */
- return 0;
- }
-
- return 1;
-}
-
-static int dead_end_function(struct objtool_file *file, struct symbol *func)
-{
- return __dead_end_function(file, func, 0);
-}
-
-/*
- * Call the arch-specific instruction decoder for all the instructions and add
- * them to the global instruction list.
- */
-static int decode_instructions(struct objtool_file *file)
-{
- struct section *sec;
- struct symbol *func;
- unsigned long offset;
- struct instruction *insn;
- int ret;
-
- list_for_each_entry(sec, &file->elf->sections, list) {
-
- if (!(sec->sh.sh_flags & SHF_EXECINSTR))
- continue;
-
- for (offset = 0; offset < sec->len; offset += insn->len) {
- insn = malloc(sizeof(*insn));
- memset(insn, 0, sizeof(*insn));
-
- INIT_LIST_HEAD(&insn->alts);
- insn->sec = sec;
- insn->offset = offset;
-
- ret = arch_decode_instruction(file->elf, sec, offset,
- sec->len - offset,
- &insn->len, &insn->type,
- &insn->immediate);
- if (ret)
- return ret;
-
- if (!insn->type || insn->type > INSN_LAST) {
- WARN_FUNC("invalid instruction type %d",
- insn->sec, insn->offset, insn->type);
- return -1;
- }
-
- hash_add(file->insn_hash, &insn->hash, insn->offset);
- list_add_tail(&insn->list, &file->insn_list);
- }
-
- list_for_each_entry(func, &sec->symbol_list, list) {
- if (func->type != STT_FUNC)
- continue;
-
- if (!find_insn(file, sec, func->offset)) {
- WARN("%s(): can't find starting instruction",
- func->name);
- return -1;
- }
-
- func_for_each_insn(file, func, insn)
- if (!insn->func)
- insn->func = func;
- }
- }
-
- return 0;
-}
-
-/*
- * Find all uses of the unreachable() macro, which are code path dead ends.
- */
-static int add_dead_ends(struct objtool_file *file)
-{
- struct section *sec;
- struct rela *rela;
- struct instruction *insn;
- bool found;
-
- sec = find_section_by_name(file->elf, ".rela.discard.unreachable");
- if (!sec)
- return 0;
-
- list_for_each_entry(rela, &sec->rela_list, list) {
- if (rela->sym->type != STT_SECTION) {
- WARN("unexpected relocation symbol type in %s", sec->name);
- return -1;
- }
- insn = find_insn(file, rela->sym->sec, rela->addend);
- if (insn)
- insn = list_prev_entry(insn, list);
- else if (rela->addend == rela->sym->sec->len) {
- found = false;
- list_for_each_entry_reverse(insn, &file->insn_list, list) {
- if (insn->sec == rela->sym->sec) {
- found = true;
- break;
- }
- }
-
- if (!found) {
- WARN("can't find unreachable insn at %s+0x%x",
- rela->sym->sec->name, rela->addend);
- return -1;
- }
- } else {
- WARN("can't find unreachable insn at %s+0x%x",
- rela->sym->sec->name, rela->addend);
- return -1;
- }
-
- insn->dead_end = true;
- }
-
- return 0;
-}
-
-/*
- * Warnings shouldn't be reported for ignored functions.
- */
-static void add_ignores(struct objtool_file *file)
-{
- struct instruction *insn;
- struct section *sec;
- struct symbol *func;
-
- list_for_each_entry(sec, &file->elf->sections, list) {
- list_for_each_entry(func, &sec->symbol_list, list) {
- if (func->type != STT_FUNC)
- continue;
-
- if (!ignore_func(file, func))
- continue;
-
- func_for_each_insn(file, func, insn)
- insn->visited = true;
- }
- }
-}
-
-/*
- * Find the destination instructions for all jumps.
- */
-static int add_jump_destinations(struct objtool_file *file)
-{
- struct instruction *insn;
- struct rela *rela;
- struct section *dest_sec;
- unsigned long dest_off;
-
- for_each_insn(file, insn) {
- if (insn->type != INSN_JUMP_CONDITIONAL &&
- insn->type != INSN_JUMP_UNCONDITIONAL)
- continue;
-
- /* skip ignores */
- if (insn->visited)
- continue;
-
- rela = find_rela_by_dest_range(insn->sec, insn->offset,
- insn->len);
- if (!rela) {
- dest_sec = insn->sec;
- dest_off = insn->offset + insn->len + insn->immediate;
- } else if (rela->sym->type == STT_SECTION) {
- dest_sec = rela->sym->sec;
- dest_off = rela->addend + 4;
- } else if (rela->sym->sec->idx) {
- dest_sec = rela->sym->sec;
- dest_off = rela->sym->sym.st_value + rela->addend + 4;
- } else {
- /* sibling call */
- insn->jump_dest = 0;
- continue;
- }
-
- insn->jump_dest = find_insn(file, dest_sec, dest_off);
- if (!insn->jump_dest) {
-
- /*
- * This is a special case where an alt instruction
- * jumps past the end of the section. These are
- * handled later in handle_group_alt().
- */
- if (!strcmp(insn->sec->name, ".altinstr_replacement"))
- continue;
-
- WARN_FUNC("can't find jump dest instruction at %s+0x%lx",
- insn->sec, insn->offset, dest_sec->name,
- dest_off);
- return -1;
- }
- }
-
- return 0;
-}
-
-/*
- * Find the destination instructions for all calls.
- */
-static int add_call_destinations(struct objtool_file *file)
-{
- struct instruction *insn;
- unsigned long dest_off;
- struct rela *rela;
-
- for_each_insn(file, insn) {
- if (insn->type != INSN_CALL)
- continue;
-
- rela = find_rela_by_dest_range(insn->sec, insn->offset,
- insn->len);
- if (!rela) {
- dest_off = insn->offset + insn->len + insn->immediate;
- insn->call_dest = find_symbol_by_offset(insn->sec,
- dest_off);
- if (!insn->call_dest) {
- WARN_FUNC("can't find call dest symbol at offset 0x%lx",
- insn->sec, insn->offset, dest_off);
- return -1;
- }
- } else if (rela->sym->type == STT_SECTION) {
- insn->call_dest = find_symbol_by_offset(rela->sym->sec,
- rela->addend+4);
- if (!insn->call_dest ||
- insn->call_dest->type != STT_FUNC) {
- WARN_FUNC("can't find call dest symbol at %s+0x%x",
- insn->sec, insn->offset,
- rela->sym->sec->name,
- rela->addend + 4);
- return -1;
- }
- } else
- insn->call_dest = rela->sym;
- }
-
- return 0;
-}
-
-/*
- * The .alternatives section requires some extra special care, over and above
- * what other special sections require:
- *
- * 1. Because alternatives are patched in-place, we need to insert a fake jump
- * instruction at the end so that validate_branch() skips all the original
- * replaced instructions when validating the new instruction path.
- *
- * 2. An added wrinkle is that the new instruction length might be zero. In
- * that case the old instructions are replaced with noops. We simulate that
- * by creating a fake jump as the only new instruction.
- *
- * 3. In some cases, the alternative section includes an instruction which
- * conditionally jumps to the _end_ of the entry. We have to modify these
- * jumps' destinations to point back to .text rather than the end of the
- * entry in .altinstr_replacement.
- *
- * 4. It has been requested that we don't validate the !POPCNT feature path
- * which is a "very very small percentage of machines".
- */
-static int handle_group_alt(struct objtool_file *file,
- struct special_alt *special_alt,
- struct instruction *orig_insn,
- struct instruction **new_insn)
-{
- struct instruction *last_orig_insn, *last_new_insn, *insn, *fake_jump;
- unsigned long dest_off;
-
- last_orig_insn = NULL;
- insn = orig_insn;
- sec_for_each_insn_from(file, insn) {
- if (insn->offset >= special_alt->orig_off + special_alt->orig_len)
- break;
-
- if (special_alt->skip_orig)
- insn->type = INSN_NOP;
-
- insn->alt_group = true;
- last_orig_insn = insn;
- }
-
- if (!next_insn_same_sec(file, last_orig_insn)) {
- WARN("%s: don't know how to handle alternatives at end of section",
- special_alt->orig_sec->name);
- return -1;
- }
-
- fake_jump = malloc(sizeof(*fake_jump));
- if (!fake_jump) {
- WARN("malloc failed");
- return -1;
- }
- memset(fake_jump, 0, sizeof(*fake_jump));
- INIT_LIST_HEAD(&fake_jump->alts);
- fake_jump->sec = special_alt->new_sec;
- fake_jump->offset = -1;
- fake_jump->type = INSN_JUMP_UNCONDITIONAL;
- fake_jump->jump_dest = list_next_entry(last_orig_insn, list);
-
- if (!special_alt->new_len) {
- *new_insn = fake_jump;
- return 0;
- }
-
- last_new_insn = NULL;
- insn = *new_insn;
- sec_for_each_insn_from(file, insn) {
- if (insn->offset >= special_alt->new_off + special_alt->new_len)
- break;
-
- last_new_insn = insn;
-
- if (insn->type != INSN_JUMP_CONDITIONAL &&
- insn->type != INSN_JUMP_UNCONDITIONAL)
- continue;
-
- if (!insn->immediate)
- continue;
-
- dest_off = insn->offset + insn->len + insn->immediate;
- if (dest_off == special_alt->new_off + special_alt->new_len)
- insn->jump_dest = fake_jump;
-
- if (!insn->jump_dest) {
- WARN_FUNC("can't find alternative jump destination",
- insn->sec, insn->offset);
- return -1;
- }
- }
-
- if (!last_new_insn) {
- WARN_FUNC("can't find last new alternative instruction",
- special_alt->new_sec, special_alt->new_off);
- return -1;
- }
-
- list_add(&fake_jump->list, &last_new_insn->list);
-
- return 0;
-}
-
-/*
- * A jump table entry can either convert a nop to a jump or a jump to a nop.
- * If the original instruction is a jump, make the alt entry an effective nop
- * by just skipping the original instruction.
- */
-static int handle_jump_alt(struct objtool_file *file,
- struct special_alt *special_alt,
- struct instruction *orig_insn,
- struct instruction **new_insn)
-{
- if (orig_insn->type == INSN_NOP)
- return 0;
-
- if (orig_insn->type != INSN_JUMP_UNCONDITIONAL) {
- WARN_FUNC("unsupported instruction at jump label",
- orig_insn->sec, orig_insn->offset);
- return -1;
- }
-
- *new_insn = list_next_entry(orig_insn, list);
- return 0;
-}
-
-/*
- * Read all the special sections which have alternate instructions which can be
- * patched in or redirected to at runtime. Each instruction having alternate
- * instruction(s) has them added to its insn->alts list, which will be
- * traversed in validate_branch().
- */
-static int add_special_section_alts(struct objtool_file *file)
-{
- struct list_head special_alts;
- struct instruction *orig_insn, *new_insn;
- struct special_alt *special_alt, *tmp;
- struct alternative *alt;
- int ret;
-
- ret = special_get_alts(file->elf, &special_alts);
- if (ret)
- return ret;
-
- list_for_each_entry_safe(special_alt, tmp, &special_alts, list) {
- alt = malloc(sizeof(*alt));
- if (!alt) {
- WARN("malloc failed");
- ret = -1;
- goto out;
- }
-
- orig_insn = find_insn(file, special_alt->orig_sec,
- special_alt->orig_off);
- if (!orig_insn) {
- WARN_FUNC("special: can't find orig instruction",
- special_alt->orig_sec, special_alt->orig_off);
- ret = -1;
- goto out;
- }
+#include "check.h"

- new_insn = NULL;
- if (!special_alt->group || special_alt->new_len) {
- new_insn = find_insn(file, special_alt->new_sec,
- special_alt->new_off);
- if (!new_insn) {
- WARN_FUNC("special: can't find new instruction",
- special_alt->new_sec,
- special_alt->new_off);
- ret = -1;
- goto out;
- }
- }
+bool nofp;

- if (special_alt->group) {
- ret = handle_group_alt(file, special_alt, orig_insn,
- &new_insn);
- if (ret)
- goto out;
- } else if (special_alt->jump_or_nop) {
- ret = handle_jump_alt(file, special_alt, orig_insn,
- &new_insn);
- if (ret)
- goto out;
- }
-
- alt->insn = new_insn;
- list_add_tail(&alt->list, &orig_insn->alts);
-
- list_del(&special_alt->list);
- free(special_alt);
- }
-
-out:
- return ret;
-}
-
-static int add_switch_table(struct objtool_file *file, struct symbol *func,
- struct instruction *insn, struct rela *table,
- struct rela *next_table)
-{
- struct rela *rela = table;
- struct instruction *alt_insn;
- struct alternative *alt;
-
- list_for_each_entry_from(rela, &file->rodata->rela->rela_list, list) {
- if (rela == next_table)
- break;
-
- if (rela->sym->sec != insn->sec ||
- rela->addend <= func->offset ||
- rela->addend >= func->offset + func->len)
- break;
-
- alt_insn = find_insn(file, insn->sec, rela->addend);
- if (!alt_insn) {
- WARN("%s: can't find instruction at %s+0x%x",
- file->rodata->rela->name, insn->sec->name,
- rela->addend);
- return -1;
- }
-
- alt = malloc(sizeof(*alt));
- if (!alt) {
- WARN("malloc failed");
- return -1;
- }
-
- alt->insn = alt_insn;
- list_add_tail(&alt->list, &insn->alts);
- }
-
- return 0;
-}
-
-/*
- * find_switch_table() - Given a dynamic jump, find the switch jump table in
- * .rodata associated with it.
- *
- * There are 3 basic patterns:
- *
- * 1. jmpq *[rodata addr](,%reg,8)
- *
- * This is the most common case by far. It jumps to an address in a simple
- * jump table which is stored in .rodata.
- *
- * 2. jmpq *[rodata addr](%rip)
- *
- * This is caused by a rare GCC quirk, currently only seen in three driver
- * functions in the kernel, only with certain obscure non-distro configs.
- *
- * As part of an optimization, GCC makes a copy of an existing switch jump
- * table, modifies it, and then hard-codes the jump (albeit with an indirect
- * jump) to use a single entry in the table. The rest of the jump table and
- * some of its jump targets remain as dead code.
- *
- * In such a case we can just crudely ignore all unreachable instruction
- * warnings for the entire object file. Ideally we would just ignore them
- * for the function, but that would require redesigning the code quite a
- * bit. And honestly that's just not worth doing: unreachable instruction
- * warnings are of questionable value anyway, and this is such a rare issue.
- *
- * 3. mov [rodata addr],%reg1
- * ... some instructions ...
- * jmpq *(%reg1,%reg2,8)
- *
- * This is a fairly uncommon pattern which is new for GCC 6. As of this
- * writing, there are 11 occurrences of it in the allmodconfig kernel.
- *
- * TODO: Once we have DWARF CFI and smarter instruction decoding logic,
- * ensure the same register is used in the mov and jump instructions.
- */
-static struct rela *find_switch_table(struct objtool_file *file,
- struct symbol *func,
- struct instruction *insn)
-{
- struct rela *text_rela, *rodata_rela;
- struct instruction *orig_insn = insn;
-
- text_rela = find_rela_by_dest_range(insn->sec, insn->offset, insn->len);
- if (text_rela && text_rela->sym == file->rodata->sym) {
- /* case 1 */
- rodata_rela = find_rela_by_dest(file->rodata,
- text_rela->addend);
- if (rodata_rela)
- return rodata_rela;
-
- /* case 2 */
- rodata_rela = find_rela_by_dest(file->rodata,
- text_rela->addend + 4);
- if (!rodata_rela)
- return NULL;
- file->ignore_unreachables = true;
- return rodata_rela;
- }
-
- /* case 3 */
- func_for_each_insn_continue_reverse(file, func, insn) {
- if (insn->type == INSN_JUMP_DYNAMIC)
- break;
-
- /* allow small jumps within the range */
- if (insn->type == INSN_JUMP_UNCONDITIONAL &&
- insn->jump_dest &&
- (insn->jump_dest->offset <= insn->offset ||
- insn->jump_dest->offset > orig_insn->offset))
- break;
-
- /* look for a relocation which references .rodata */
- text_rela = find_rela_by_dest_range(insn->sec, insn->offset,
- insn->len);
- if (!text_rela || text_rela->sym != file->rodata->sym)
- continue;
-
- /*
- * Make sure the .rodata address isn't associated with a
- * symbol. gcc jump tables are anonymous data.
- */
- if (find_symbol_containing(file->rodata, text_rela->addend))
- continue;
-
- return find_rela_by_dest(file->rodata, text_rela->addend);
- }
-
- return NULL;
-}
-
-static int add_func_switch_tables(struct objtool_file *file,
- struct symbol *func)
-{
- struct instruction *insn, *prev_jump = NULL;
- struct rela *rela, *prev_rela = NULL;
- int ret;
-
- func_for_each_insn(file, func, insn) {
- if (insn->type != INSN_JUMP_DYNAMIC)
- continue;
-
- rela = find_switch_table(file, func, insn);
- if (!rela)
- continue;
-
- /*
- * We found a switch table, but we don't know yet how big it
- * is. Don't add it until we reach the end of the function or
- * the beginning of another switch table in the same function.
- */
- if (prev_jump) {
- ret = add_switch_table(file, func, prev_jump, prev_rela,
- rela);
- if (ret)
- return ret;
- }
-
- prev_jump = insn;
- prev_rela = rela;
- }
-
- if (prev_jump) {
- ret = add_switch_table(file, func, prev_jump, prev_rela, NULL);
- if (ret)
- return ret;
- }
-
- return 0;
-}
-
-/*
- * For some switch statements, gcc generates a jump table in the .rodata
- * section which contains a list of addresses within the function to jump to.
- * This finds these jump tables and adds them to the insn->alts lists.
- */
-static int add_switch_table_alts(struct objtool_file *file)
-{
- struct section *sec;
- struct symbol *func;
- int ret;
-
- if (!file->rodata || !file->rodata->rela)
- return 0;
-
- list_for_each_entry(sec, &file->elf->sections, list) {
- list_for_each_entry(func, &sec->symbol_list, list) {
- if (func->type != STT_FUNC)
- continue;
-
- ret = add_func_switch_tables(file, func);
- if (ret)
- return ret;
- }
- }
-
- return 0;
-}
-
-static int decode_sections(struct objtool_file *file)
-{
- int ret;
-
- ret = decode_instructions(file);
- if (ret)
- return ret;
-
- ret = add_dead_ends(file);
- if (ret)
- return ret;
-
- add_ignores(file);
-
- ret = add_jump_destinations(file);
- if (ret)
- return ret;
-
- ret = add_call_destinations(file);
- if (ret)
- return ret;
-
- ret = add_special_section_alts(file);
- if (ret)
- return ret;
-
- ret = add_switch_table_alts(file);
- if (ret)
- return ret;
-
- return 0;
-}
-
-static bool is_fentry_call(struct instruction *insn)
-{
- if (insn->type == INSN_CALL &&
- insn->call_dest->type == STT_NOTYPE &&
- !strcmp(insn->call_dest->name, "__fentry__"))
- return true;
-
- return false;
-}
-
-static bool has_modified_stack_frame(struct instruction *insn)
-{
- return (insn->state & STATE_FP_SAVED) ||
- (insn->state & STATE_FP_SETUP);
-}
-
-static bool has_valid_stack_frame(struct instruction *insn)
-{
- return (insn->state & STATE_FP_SAVED) &&
- (insn->state & STATE_FP_SETUP);
-}
-
-static unsigned int frame_state(unsigned long state)
-{
- return (state & (STATE_FP_SAVED | STATE_FP_SETUP));
-}
-
-/*
- * Follow the branch starting at the given instruction, and recursively follow
- * any other branches (jumps). Meanwhile, track the frame pointer state at
- * each instruction and validate all the rules described in
- * tools/objtool/Documentation/stack-validation.txt.
- */
-static int validate_branch(struct objtool_file *file,
- struct instruction *first, unsigned char first_state)
-{
- struct alternative *alt;
- struct instruction *insn;
- struct section *sec;
- struct symbol *func = NULL;
- unsigned char state;
- int ret;
-
- insn = first;
- sec = insn->sec;
- state = first_state;
-
- if (insn->alt_group && list_empty(&insn->alts)) {
- WARN_FUNC("don't know how to handle branch to middle of alternative instruction group",
- sec, insn->offset);
- return 1;
- }
-
- while (1) {
- if (file->c_file && insn->func) {
- if (func && func != insn->func) {
- WARN("%s() falls through to next function %s()",
- func->name, insn->func->name);
- return 1;
- }
-
- func = insn->func;
- }
-
- if (insn->visited) {
- if (frame_state(insn->state) != frame_state(state)) {
- WARN_FUNC("frame pointer state mismatch",
- sec, insn->offset);
- return 1;
- }
-
- return 0;
- }
-
- insn->visited = true;
- insn->state = state;
-
- list_for_each_entry(alt, &insn->alts, list) {
- ret = validate_branch(file, alt->insn, state);
- if (ret)
- return 1;
- }
-
- switch (insn->type) {
-
- case INSN_FP_SAVE:
- if (!nofp) {
- if (state & STATE_FP_SAVED) {
- WARN_FUNC("duplicate frame pointer save",
- sec, insn->offset);
- return 1;
- }
- state |= STATE_FP_SAVED;
- }
- break;
-
- case INSN_FP_SETUP:
- if (!nofp) {
- if (state & STATE_FP_SETUP) {
- WARN_FUNC("duplicate frame pointer setup",
- sec, insn->offset);
- return 1;
- }
- state |= STATE_FP_SETUP;
- }
- break;
-
- case INSN_FP_RESTORE:
- if (!nofp) {
- if (has_valid_stack_frame(insn))
- state &= ~STATE_FP_SETUP;
-
- state &= ~STATE_FP_SAVED;
- }
- break;
-
- case INSN_RETURN:
- if (!nofp && has_modified_stack_frame(insn)) {
- WARN_FUNC("return without frame pointer restore",
- sec, insn->offset);
- return 1;
- }
- return 0;
-
- case INSN_CALL:
- if (is_fentry_call(insn)) {
- state |= STATE_FENTRY;
- break;
- }
-
- ret = dead_end_function(file, insn->call_dest);
- if (ret == 1)
- return 0;
- if (ret == -1)
- return 1;
-
- /* fallthrough */
- case INSN_CALL_DYNAMIC:
- if (!nofp && !has_valid_stack_frame(insn)) {
- WARN_FUNC("call without frame pointer save/setup",
- sec, insn->offset);
- return 1;
- }
- break;
-
- case INSN_JUMP_CONDITIONAL:
- case INSN_JUMP_UNCONDITIONAL:
- if (insn->jump_dest) {
- ret = validate_branch(file, insn->jump_dest,
- state);
- if (ret)
- return 1;
- } else if (has_modified_stack_frame(insn)) {
- WARN_FUNC("sibling call from callable instruction with changed frame pointer",
- sec, insn->offset);
- return 1;
- } /* else it's a sibling call */
-
- if (insn->type == INSN_JUMP_UNCONDITIONAL)
- return 0;
-
- break;
-
- case INSN_JUMP_DYNAMIC:
- if (list_empty(&insn->alts) &&
- has_modified_stack_frame(insn)) {
- WARN_FUNC("sibling call from callable instruction with changed frame pointer",
- sec, insn->offset);
- return 1;
- }
-
- return 0;
-
- default:
- break;
- }
-
- if (insn->dead_end)
- return 0;
-
- insn = next_insn_same_sec(file, insn);
- if (!insn) {
- WARN("%s: unexpected end of section", sec->name);
- return 1;
- }
- }
-
- return 0;
-}
-
-static bool is_kasan_insn(struct instruction *insn)
-{
- return (insn->type == INSN_CALL &&
- !strcmp(insn->call_dest->name, "__asan_handle_no_return"));
-}
-
-static bool is_ubsan_insn(struct instruction *insn)
-{
- return (insn->type == INSN_CALL &&
- !strcmp(insn->call_dest->name,
- "__ubsan_handle_builtin_unreachable"));
-}
-
-static bool ignore_unreachable_insn(struct symbol *func,
- struct instruction *insn)
-{
- int i;
-
- if (insn->type == INSN_NOP)
- return true;
-
- /*
- * Check if this (or a subsequent) instruction is related to
- * CONFIG_UBSAN or CONFIG_KASAN.
- *
- * End the search at 5 instructions to avoid going into the weeds.
- */
- for (i = 0; i < 5; i++) {
-
- if (is_kasan_insn(insn) || is_ubsan_insn(insn))
- return true;
-
- if (insn->type == INSN_JUMP_UNCONDITIONAL && insn->jump_dest) {
- insn = insn->jump_dest;
- continue;
- }
-
- if (insn->offset + insn->len >= func->offset + func->len)
- break;
- insn = list_next_entry(insn, list);
- }
-
- return false;
-}
-
-static int validate_functions(struct objtool_file *file)
-{
- struct section *sec;
- struct symbol *func;
- struct instruction *insn;
- int ret, warnings = 0;
-
- list_for_each_entry(sec, &file->elf->sections, list) {
- list_for_each_entry(func, &sec->symbol_list, list) {
- if (func->type != STT_FUNC)
- continue;
-
- insn = find_insn(file, sec, func->offset);
- if (!insn)
- continue;
-
- ret = validate_branch(file, insn, 0);
- warnings += ret;
- }
- }
-
- list_for_each_entry(sec, &file->elf->sections, list) {
- list_for_each_entry(func, &sec->symbol_list, list) {
- if (func->type != STT_FUNC)
- continue;
-
- func_for_each_insn(file, func, insn) {
- if (insn->visited)
- continue;
-
- insn->visited = true;
-
- if (file->ignore_unreachables || warnings ||
- ignore_unreachable_insn(func, insn))
- continue;
-
- /*
- * gcov produces a lot of unreachable
- * instructions. If we get an unreachable
- * warning and the file has gcov enabled, just
- * ignore it, and all other such warnings for
- * the file.
- */
- if (!file->ignore_unreachables &&
- gcov_enabled(file)) {
- file->ignore_unreachables = true;
- continue;
- }
-
- WARN_FUNC("function has unreachable instruction", insn->sec, insn->offset);
- warnings++;
- }
- }
- }
-
- return warnings;
-}
-
-static int validate_uncallable_instructions(struct objtool_file *file)
-{
- struct instruction *insn;
- int warnings = 0;
-
- for_each_insn(file, insn) {
- if (!insn->visited && insn->type == INSN_RETURN) {
- WARN_FUNC("return instruction outside of a callable function",
- insn->sec, insn->offset);
- warnings++;
- }
- }
-
- return warnings;
-}
-
-static void cleanup(struct objtool_file *file)
-{
- struct instruction *insn, *tmpinsn;
- struct alternative *alt, *tmpalt;
-
- list_for_each_entry_safe(insn, tmpinsn, &file->insn_list, list) {
- list_for_each_entry_safe(alt, tmpalt, &insn->alts, list) {
- list_del(&alt->list);
- free(alt);
- }
- list_del(&insn->list);
- hash_del(&insn->hash);
- free(insn);
- }
- elf_close(file->elf);
-}
-
-const char * const check_usage[] = {
+static const char * const check_usage[] = {
"objtool check [<options>] file.o",
NULL,
};

+const struct option check_options[] = {
+ OPT_BOOLEAN('f', "no-fp", &nofp, "Skip frame pointer validation"),
+ OPT_END(),
+};
+
int cmd_check(int argc, const char **argv)
{
- struct objtool_file file;
- int ret, warnings = 0;
-
- const struct option options[] = {
- OPT_BOOLEAN('f', "no-fp", &nofp, "Skip frame pointer validation"),
- OPT_END(),
- };
+ const char *objname;

- argc = parse_options(argc, argv, options, check_usage, 0);
+ argc = parse_options(argc, argv, check_options, check_usage, 0);

if (argc != 1)
- usage_with_options(check_usage, options);
+ usage_with_options(check_usage, check_options);

objname = argv[0];

- file.elf = elf_open(objname);
- if (!file.elf) {
- fprintf(stderr, "error reading elf file %s\n", objname);
- return 1;
- }
-
- INIT_LIST_HEAD(&file.insn_list);
- hash_init(file.insn_hash);
- file.whitelist = find_section_by_name(file.elf, ".discard.func_stack_frame_non_standard");
- file.rodata = find_section_by_name(file.elf, ".rodata");
- file.ignore_unreachables = false;
- file.c_file = find_section_by_name(file.elf, ".comment");
-
- ret = decode_sections(&file);
- if (ret < 0)
- goto out;
- warnings += ret;
-
- ret = validate_functions(&file);
- if (ret < 0)
- goto out;
- warnings += ret;
-
- ret = validate_uncallable_instructions(&file);
- if (ret < 0)
- goto out;
- warnings += ret;
-
-out:
- cleanup(&file);
-
- /* ignore warnings for now until we get all the code cleaned up */
- if (ret || warnings)
- return 0;
- return 0;
+ return check(objname, nofp);
}
diff --git a/tools/objtool/builtin-check.c b/tools/objtool/check.c
similarity index 95%
copy from tools/objtool/builtin-check.c
copy to tools/objtool/check.c
index 5f66697f..231a360 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/check.c
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2015 Josh Poimboeuf <[email protected]>
+ * Copyright (C) 2015-2017 Josh Poimboeuf <[email protected]>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
@@ -15,21 +15,10 @@
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/

-/*
- * objtool check:
- *
- * This command analyzes every .o file and ensures the validity of its stack
- * trace metadata. It enforces a set of rules on asm code and C inline
- * assembly code so that stack traces can be reliable.
- *
- * For more information, see tools/objtool/Documentation/stack-validation.txt.
- */
-
#include <string.h>
#include <stdlib.h>
-#include <subcmd/parse-options.h>

-#include "builtin.h"
+#include "check.h"
#include "elf.h"
#include "special.h"
#include "arch.h"
@@ -42,34 +31,11 @@
#define STATE_FP_SETUP 0x2
#define STATE_FENTRY 0x4

-struct instruction {
- struct list_head list;
- struct hlist_node hash;
- struct section *sec;
- unsigned long offset;
- unsigned int len, state;
- unsigned char type;
- unsigned long immediate;
- bool alt_group, visited, dead_end;
- struct symbol *call_dest;
- struct instruction *jump_dest;
- struct list_head alts;
- struct symbol *func;
-};
-
struct alternative {
struct list_head list;
struct instruction *insn;
};

-struct objtool_file {
- struct elf *elf;
- struct list_head insn_list;
- DECLARE_HASHTABLE(insn_hash, 16);
- struct section *rodata, *whitelist;
- bool ignore_unreachables, c_file;
-};
-
const char *objname;
static bool nofp;

@@ -1251,27 +1217,13 @@ static void cleanup(struct objtool_file *file)
elf_close(file->elf);
}

-const char * const check_usage[] = {
- "objtool check [<options>] file.o",
- NULL,
-};
-
-int cmd_check(int argc, const char **argv)
+int check(const char *_objname, bool _nofp)
{
struct objtool_file file;
int ret, warnings = 0;

- const struct option options[] = {
- OPT_BOOLEAN('f', "no-fp", &nofp, "Skip frame pointer validation"),
- OPT_END(),
- };
-
- argc = parse_options(argc, argv, options, check_usage, 0);
-
- if (argc != 1)
- usage_with_options(check_usage, options);
-
- objname = argv[0];
+ objname = _objname;
+ nofp = _nofp;

file.elf = elf_open(objname);
if (!file.elf) {
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
new file mode 100644
index 0000000..c0d2fde
--- /dev/null
+++ b/tools/objtool/check.h
@@ -0,0 +1,51 @@
+/*
+ * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _CHECK_H
+#define _CHECK_H
+
+#include <stdbool.h>
+#include "elf.h"
+#include "arch.h"
+#include <linux/hashtable.h>
+
+struct instruction {
+ struct list_head list;
+ struct hlist_node hash;
+ struct section *sec;
+ unsigned long offset;
+ unsigned int len, state;
+ unsigned char type;
+ unsigned long immediate;
+ bool alt_group, visited, dead_end;
+ struct symbol *call_dest;
+ struct instruction *jump_dest;
+ struct list_head alts;
+ struct symbol *func;
+};
+
+struct objtool_file {
+ struct elf *elf;
+ struct list_head insn_list;
+ DECLARE_HASHTABLE(insn_hash, 16);
+ struct section *rodata, *whitelist;
+ bool ignore_unreachables, c_file;
+};
+
+int check(const char *objname, bool nofp);
+
+#endif /* _CHECK_H */
--
2.7.5

2017-06-28 15:12:42

by Josh Poimboeuf

Subject: [PATCH v2 2/8] objtool, x86: add several functions and files to the objtool whitelist

In preparation for an objtool rewrite which will have broader checks,
whitelist functions and files which cause problems because they do
unusual things with the stack.

These whitelists serve as a TODO list for which functions and files
don't yet have undwarf unwinder coverage. Eventually most of the
whitelists can be removed in favor of manual CFI hint annotations or
objtool improvements.
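
For reference, the whitelisting comes in two forms: a whole object file is
skipped by setting OBJECT_FILES_NON_STANDARD_<file>.o := y in its Makefile,
and a single C function is skipped with STACK_FRAME_NON_STANDARD() from
<linux/frame.h>. A minimal sketch of the per-function form (the function
name is made up):

#include <linux/frame.h>

/* illustrative only: tell objtool to skip stack validation of this function */
static void my_unusual_stack_func(void)
{
	/* inline asm or other code objtool can't validate yet */
}
STACK_FRAME_NON_STANDARD(my_unusual_stack_func);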

Signed-off-by: Josh Poimboeuf <[email protected]>
---
arch/x86/crypto/Makefile | 2 ++
arch/x86/crypto/sha1-mb/Makefile | 2 ++
arch/x86/crypto/sha256-mb/Makefile | 2 ++
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/acpi/Makefile | 2 ++
arch/x86/kernel/kprobes/opt.c | 9 ++++++++-
arch/x86/kernel/reboot.c | 2 ++
arch/x86/kvm/svm.c | 2 ++
arch/x86/kvm/vmx.c | 3 +++
arch/x86/lib/msr-reg.S | 8 ++++----
arch/x86/net/Makefile | 2 ++
arch/x86/platform/efi/Makefile | 1 +
arch/x86/power/Makefile | 2 ++
arch/x86/xen/Makefile | 3 +++
kernel/kexec_core.c | 4 +++-
15 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
index 34b3fa2..9e32d40 100644
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -2,6 +2,8 @@
# Arch-specific CryptoAPI modules.
#

+OBJECT_FILES_NON_STANDARD := y
+
avx_supported := $(call as-instr,vpxor %xmm0$(comma)%xmm0$(comma)%xmm0,yes,no)
avx2_supported := $(call as-instr,vpgatherdd %ymm0$(comma)(%eax$(comma)%ymm1\
$(comma)4)$(comma)%ymm2,yes,no)
diff --git a/arch/x86/crypto/sha1-mb/Makefile b/arch/x86/crypto/sha1-mb/Makefile
index 2f87563..2e14acc 100644
--- a/arch/x86/crypto/sha1-mb/Makefile
+++ b/arch/x86/crypto/sha1-mb/Makefile
@@ -2,6 +2,8 @@
# Arch-specific CryptoAPI modules.
#

+OBJECT_FILES_NON_STANDARD := y
+
avx2_supported := $(call as-instr,vpgatherdd %ymm0$(comma)(%eax$(comma)%ymm1\
$(comma)4)$(comma)%ymm2,yes,no)
ifeq ($(avx2_supported),yes)
diff --git a/arch/x86/crypto/sha256-mb/Makefile b/arch/x86/crypto/sha256-mb/Makefile
index 41089e7..45b4fca 100644
--- a/arch/x86/crypto/sha256-mb/Makefile
+++ b/arch/x86/crypto/sha256-mb/Makefile
@@ -2,6 +2,8 @@
# Arch-specific CryptoAPI modules.
#

+OBJECT_FILES_NON_STANDARD := y
+
avx2_supported := $(call as-instr,vpgatherdd %ymm0$(comma)(%eax$(comma)%ymm1\
$(comma)4)$(comma)%ymm2,yes,no)
ifeq ($(avx2_supported),yes)
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 4b99423..3c7c419 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -29,6 +29,7 @@ OBJECT_FILES_NON_STANDARD_head_$(BITS).o := y
OBJECT_FILES_NON_STANDARD_relocate_kernel_$(BITS).o := y
OBJECT_FILES_NON_STANDARD_ftrace_$(BITS).o := y
OBJECT_FILES_NON_STANDARD_test_nx.o := y
+OBJECT_FILES_NON_STANDARD_paravirt_patch_$(BITS).o := y

# If instrumentation of this dir is enabled, boot hangs during first second.
# Probably could be more selective here, but note that files related to irqs,
diff --git a/arch/x86/kernel/acpi/Makefile b/arch/x86/kernel/acpi/Makefile
index 26b78d8..85a9e17 100644
--- a/arch/x86/kernel/acpi/Makefile
+++ b/arch/x86/kernel/acpi/Makefile
@@ -1,3 +1,5 @@
+OBJECT_FILES_NON_STANDARD_wakeup_$(BITS).o := y
+
obj-$(CONFIG_ACPI) += boot.o
obj-$(CONFIG_ACPI_SLEEP) += sleep.o wakeup_$(BITS).o
obj-$(CONFIG_ACPI_APEI) += apei.o
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index 901c640..69ea0bc 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -28,6 +28,7 @@
#include <linux/kdebug.h>
#include <linux/kallsyms.h>
#include <linux/ftrace.h>
+#include <linux/frame.h>

#include <asm/text-patching.h>
#include <asm/cacheflush.h>
@@ -94,6 +95,7 @@ static void synthesize_set_arg1(kprobe_opcode_t *addr, unsigned long val)
}

asm (
+ "optprobe_template_func:\n"
".global optprobe_template_entry\n"
"optprobe_template_entry:\n"
#ifdef CONFIG_X86_64
@@ -131,7 +133,12 @@ asm (
" popf\n"
#endif
".global optprobe_template_end\n"
- "optprobe_template_end:\n");
+ "optprobe_template_end:\n"
+ ".type optprobe_template_func, @function\n"
+ ".size optprobe_template_func, .-optprobe_template_func\n");
+
+void optprobe_template_func(void);
+STACK_FRAME_NON_STANDARD(optprobe_template_func);

#define TMPL_MOVE_IDX \
((long)&optprobe_template_val - (long)&optprobe_template_entry)
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 2544700..67393fc 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -9,6 +9,7 @@
#include <linux/sched.h>
#include <linux/tboot.h>
#include <linux/delay.h>
+#include <linux/frame.h>
#include <acpi/reboot.h>
#include <asm/io.h>
#include <asm/apic.h>
@@ -123,6 +124,7 @@ void __noreturn machine_real_restart(unsigned int type)
#ifdef CONFIG_APM_MODULE
EXPORT_SYMBOL(machine_real_restart);
#endif
+STACK_FRAME_NON_STANDARD(machine_real_restart);

/*
* Some Apple MacBook and MacBookPro's needs reboot=p to be able to reboot
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ba9891a..33460fc 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -36,6 +36,7 @@
#include <linux/slab.h>
#include <linux/amd-iommu.h>
#include <linux/hashtable.h>
+#include <linux/frame.h>

#include <asm/apic.h>
#include <asm/perf_event.h>
@@ -4906,6 +4907,7 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)

mark_all_clean(svm->vmcb);
}
+STACK_FRAME_NON_STANDARD(svm_vcpu_run);

static void svm_set_cr3(struct kvm_vcpu *vcpu, unsigned long root)
{
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7dd53fb..6dcc487 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -33,6 +33,7 @@
#include <linux/slab.h>
#include <linux/tboot.h>
#include <linux/hrtimer.h>
+#include <linux/frame.h>
#include "kvm_cache_regs.h"
#include "x86.h"

@@ -8661,6 +8662,7 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
);
}
}
+STACK_FRAME_NON_STANDARD(vmx_handle_external_intr);

static bool vmx_has_high_real_mode_segbase(void)
{
@@ -9043,6 +9045,7 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
vmx_recover_nmi_blocking(vmx);
vmx_complete_interrupts(vmx);
}
+STACK_FRAME_NON_STANDARD(vmx_vcpu_run);

static void vmx_switch_vmcs(struct kvm_vcpu *vcpu, struct loaded_vmcs *vmcs)
{
diff --git a/arch/x86/lib/msr-reg.S b/arch/x86/lib/msr-reg.S
index c815564..10ffa7e 100644
--- a/arch/x86/lib/msr-reg.S
+++ b/arch/x86/lib/msr-reg.S
@@ -13,14 +13,14 @@
.macro op_safe_regs op
ENTRY(\op\()_safe_regs)
pushq %rbx
- pushq %rbp
+ pushq %r12
movq %rdi, %r10 /* Save pointer */
xorl %r11d, %r11d /* Return value */
movl (%rdi), %eax
movl 4(%rdi), %ecx
movl 8(%rdi), %edx
movl 12(%rdi), %ebx
- movl 20(%rdi), %ebp
+ movl 20(%rdi), %r12d
movl 24(%rdi), %esi
movl 28(%rdi), %edi
1: \op
@@ -29,10 +29,10 @@ ENTRY(\op\()_safe_regs)
movl %ecx, 4(%r10)
movl %edx, 8(%r10)
movl %ebx, 12(%r10)
- movl %ebp, 20(%r10)
+ movl %r12d, 20(%r10)
movl %esi, 24(%r10)
movl %edi, 28(%r10)
- popq %rbp
+ popq %r12
popq %rbx
ret
3:
diff --git a/arch/x86/net/Makefile b/arch/x86/net/Makefile
index 90568c3..fefb4b6 100644
--- a/arch/x86/net/Makefile
+++ b/arch/x86/net/Makefile
@@ -1,4 +1,6 @@
#
# Arch-specific network modules
#
+OBJECT_FILES_NON_STANDARD_bpf_jit.o += y
+
obj-$(CONFIG_BPF_JIT) += bpf_jit.o bpf_jit_comp.o
diff --git a/arch/x86/platform/efi/Makefile b/arch/x86/platform/efi/Makefile
index f1d83b3..2f56e1e 100644
--- a/arch/x86/platform/efi/Makefile
+++ b/arch/x86/platform/efi/Makefile
@@ -1,4 +1,5 @@
OBJECT_FILES_NON_STANDARD_efi_thunk_$(BITS).o := y
+OBJECT_FILES_NON_STANDARD_efi_stub_$(BITS).o := y

obj-$(CONFIG_EFI) += quirks.o efi.o efi_$(BITS).o efi_stub_$(BITS).o
obj-$(CONFIG_EARLY_PRINTK_EFI) += early_printk.o
diff --git a/arch/x86/power/Makefile b/arch/x86/power/Makefile
index a6a198c..0504187 100644
--- a/arch/x86/power/Makefile
+++ b/arch/x86/power/Makefile
@@ -1,3 +1,5 @@
+OBJECT_FILES_NON_STANDARD_hibernate_asm_$(BITS).o := y
+
# __restore_processor_state() restores %gs after S3 resume and so should not
# itself be stack-protected
nostackp := $(call cc-option, -fno-stack-protector)
diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index fffb0a1..bced7a3 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -1,3 +1,6 @@
+OBJECT_FILES_NON_STANDARD_xen-asm_$(BITS).o := y
+OBJECT_FILES_NON_STANDARD_xen-pvh.o := y
+
ifdef CONFIG_FUNCTION_TRACER
# Do not profile debug and lowlevel utilities
CFLAGS_REMOVE_spinlock.o = -pg
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index ae1a3ba..154ffb4 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -38,6 +38,7 @@
#include <linux/syscore_ops.h>
#include <linux/compiler.h>
#include <linux/hugetlb.h>
+#include <linux/frame.h>

#include <asm/page.h>
#include <asm/sections.h>
@@ -874,7 +875,7 @@ int kexec_load_disabled;
* only when panic_cpu holds the current CPU number; this is the only CPU
* which processes crash_kexec routines.
*/
-void __crash_kexec(struct pt_regs *regs)
+void __noclone __crash_kexec(struct pt_regs *regs)
{
/* Take the kexec_mutex here to prevent sys_kexec_load
* running on one cpu from replacing the crash kernel
@@ -896,6 +897,7 @@ void __crash_kexec(struct pt_regs *regs)
mutex_unlock(&kexec_mutex);
}
}
+STACK_FRAME_NON_STANDARD(__crash_kexec);

void crash_kexec(struct pt_regs *regs)
{
--
2.7.5

2017-06-28 15:13:23

by Josh Poimboeuf

[permalink] [raw]
Subject: [PATCH v2 6/8] x86/entry: add unwind hint annotations

Add unwind hint annotations to entry_64.S. This will enable the undwarf
unwinder to unwind through any location in the entry code including
syscalls, interrupts, and exceptions.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
arch/x86/entry/Makefile | 1 -
arch/x86/entry/calling.h | 6 +++++
arch/x86/entry/entry_64.S | 56 ++++++++++++++++++++++++++++++++++++++++++-----
3 files changed, 56 insertions(+), 7 deletions(-)

diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile
index 9976fce..af28a8a 100644
--- a/arch/x86/entry/Makefile
+++ b/arch/x86/entry/Makefile
@@ -2,7 +2,6 @@
# Makefile for the x86 low level entry code
#

-OBJECT_FILES_NON_STANDARD_entry_$(BITS).o := y
OBJECT_FILES_NON_STANDARD_entry_64_compat.o := y

CFLAGS_syscall_64.o += $(call cc-option,-Wno-override-init,)
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 05ed3d3..4050b73 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -1,4 +1,6 @@
#include <linux/jump_label.h>
+#include <asm/undwarf.h>
+

/*

@@ -112,6 +114,7 @@ For 32-bit we have the following conventions - kernel is built with
movq %rdx, 12*8+\offset(%rsp)
movq %rsi, 13*8+\offset(%rsp)
movq %rdi, 14*8+\offset(%rsp)
+ UNWIND_HINT_REGS offset=\offset extra=0
.endm
.macro SAVE_C_REGS offset=0
SAVE_C_REGS_HELPER \offset, 1, 1, 1, 1
@@ -136,6 +139,7 @@ For 32-bit we have the following conventions - kernel is built with
movq %r12, 3*8+\offset(%rsp)
movq %rbp, 4*8+\offset(%rsp)
movq %rbx, 5*8+\offset(%rsp)
+ UNWIND_HINT_REGS offset=\offset
.endm

.macro RESTORE_EXTRA_REGS offset=0
@@ -145,6 +149,7 @@ For 32-bit we have the following conventions - kernel is built with
movq 3*8+\offset(%rsp), %r12
movq 4*8+\offset(%rsp), %rbp
movq 5*8+\offset(%rsp), %rbx
+ UNWIND_HINT_REGS offset=\offset extra=0
.endm

.macro RESTORE_C_REGS_HELPER rstor_rax=1, rstor_rcx=1, rstor_r11=1, rstor_r8910=1, rstor_rdx=1
@@ -167,6 +172,7 @@ For 32-bit we have the following conventions - kernel is built with
.endif
movq 13*8(%rsp), %rsi
movq 14*8(%rsp), %rdi
+ UNWIND_HINT_IRET_REGS offset=16*8
.endm
.macro RESTORE_C_REGS
RESTORE_C_REGS_HELPER 1,1,1,1,1
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index a9a8027..9075a6c 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -36,6 +36,7 @@
#include <asm/smap.h>
#include <asm/pgtable_types.h>
#include <asm/export.h>
+#include <asm/frame.h>
#include <linux/err.h>

.code64
@@ -43,9 +44,10 @@

#ifdef CONFIG_PARAVIRT
ENTRY(native_usergs_sysret64)
+ UNWIND_HINT_EMPTY
swapgs
sysretq
-ENDPROC(native_usergs_sysret64)
+END(native_usergs_sysret64)
#endif /* CONFIG_PARAVIRT */

.macro TRACE_IRQS_IRETQ
@@ -134,6 +136,7 @@ ENDPROC(native_usergs_sysret64)
*/

ENTRY(entry_SYSCALL_64)
+ UNWIND_HINT_EMPTY
/*
* Interrupts are off on entry.
* We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
@@ -169,6 +172,7 @@ GLOBAL(entry_SYSCALL_64_after_swapgs)
pushq %r10 /* pt_regs->r10 */
pushq %r11 /* pt_regs->r11 */
sub $(6*8), %rsp /* pt_regs->bp, bx, r12-15 not saved */
+ UNWIND_HINT_REGS extra=0

/*
* If we need to do entry work or if we guess we'll need to do
@@ -223,6 +227,7 @@ entry_SYSCALL_64_fastpath:
movq EFLAGS(%rsp), %r11
RESTORE_C_REGS_EXCEPT_RCX_R11
movq RSP(%rsp), %rsp
+ UNWIND_HINT_EMPTY
USERGS_SYSRET64

1:
@@ -316,6 +321,7 @@ syscall_return_via_sysret:
/* rcx and r11 are already restored (see code above) */
RESTORE_C_REGS_EXCEPT_RCX_R11
movq RSP(%rsp), %rsp
+ UNWIND_HINT_EMPTY
USERGS_SYSRET64

opportunistic_sysret_failed:
@@ -343,6 +349,7 @@ ENTRY(stub_ptregs_64)
DISABLE_INTERRUPTS(CLBR_ANY)
TRACE_IRQS_OFF
popq %rax
+ UNWIND_HINT_REGS extra=0
jmp entry_SYSCALL64_slow_path

1:
@@ -351,6 +358,7 @@ END(stub_ptregs_64)

.macro ptregs_stub func
ENTRY(ptregs_\func)
+ UNWIND_HINT_FUNC
leaq \func(%rip), %rax
jmp stub_ptregs_64
END(ptregs_\func)
@@ -367,6 +375,7 @@ END(ptregs_\func)
* %rsi: next task
*/
ENTRY(__switch_to_asm)
+ UNWIND_HINT_FUNC
/*
* Save callee-saved registers
* This must match the order in inactive_task_frame
@@ -406,6 +415,7 @@ END(__switch_to_asm)
* r12: kernel thread arg
*/
ENTRY(ret_from_fork)
+ UNWIND_HINT_EMPTY
movq %rax, %rdi
call schedule_tail /* rdi: 'prev' task parameter */

@@ -413,6 +423,7 @@ ENTRY(ret_from_fork)
jnz 1f /* kernel threads are uncommon */

2:
+ UNWIND_HINT_REGS
movq %rsp, %rdi
call syscall_return_slowpath /* returns with IRQs disabled */
TRACE_IRQS_ON /* user mode is traced as IRQS on */
@@ -440,10 +451,11 @@ END(ret_from_fork)
ENTRY(irq_entries_start)
vector=FIRST_EXTERNAL_VECTOR
.rept (FIRST_SYSTEM_VECTOR - FIRST_EXTERNAL_VECTOR)
+ UNWIND_HINT_IRET_REGS
pushq $(~vector+0x80) /* Note: always in signed byte range */
- vector=vector+1
jmp common_interrupt
.align 8
+ vector=vector+1
.endr
END(irq_entries_start)

@@ -495,7 +507,9 @@ END(irq_entries_start)
movq %rsp, %rdi
incl PER_CPU_VAR(irq_count)
cmovzq PER_CPU_VAR(irq_stack_ptr), %rsp
+ UNWIND_HINT_REGS base=rdi
pushq %rdi
+ UNWIND_HINT_REGS indirect=1
/* We entered an interrupt context - irqs are off: */
TRACE_IRQS_OFF

@@ -519,6 +533,7 @@ ret_from_intr:

/* Restore saved previous stack */
popq %rsp
+ UNWIND_HINT_REGS

testb $3, CS(%rsp)
jz retint_kernel
@@ -561,6 +576,7 @@ restore_c_regs_and_iret:
INTERRUPT_RETURN

ENTRY(native_iret)
+ UNWIND_HINT_IRET_REGS
/*
* Are we returning to a stack segment from the LDT? Note: in
* 64-bit mode SS:RSP on the exception stack is always valid.
@@ -633,6 +649,7 @@ native_irq_return_ldt:
orq PER_CPU_VAR(espfix_stack), %rax
SWAPGS
movq %rax, %rsp
+ UNWIND_HINT_IRET_REGS offset=8

/*
* At this point, we cannot write to the stack any more, but we can
@@ -654,6 +671,7 @@ END(common_interrupt)
*/
.macro apicinterrupt3 num sym do_sym
ENTRY(\sym)
+ UNWIND_HINT_IRET_REGS
ASM_CLAC
pushq $~(\num)
.Lcommon_\sym:
@@ -739,6 +757,8 @@ apicinterrupt IRQ_WORK_VECTOR irq_work_interrupt smp_irq_work_interrupt

.macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
ENTRY(\sym)
+ UNWIND_HINT_IRET_REGS offset=8
+
/* Sanity check */
.if \shift_ist != -1 && \paranoid == 0
.error "using shift_ist requires paranoid=1"
@@ -762,6 +782,7 @@ ENTRY(\sym)
.else
call error_entry
.endif
+ UNWIND_HINT_REGS
/* returned flag: ebx=0: need swapgs on exit, ebx=1: don't need it */

.if \paranoid
@@ -859,6 +880,7 @@ idtentry simd_coprocessor_error do_simd_coprocessor_error has_error_code=0
* edi: new selector
*/
ENTRY(native_load_gs_index)
+ FRAME_BEGIN
pushfq
DISABLE_INTERRUPTS(CLBR_ANY & ~CLBR_RDI)
SWAPGS
@@ -867,8 +889,9 @@ ENTRY(native_load_gs_index)
2: ALTERNATIVE "", "mfence", X86_BUG_SWAPGS_FENCE
SWAPGS
popfq
+ FRAME_END
ret
-END(native_load_gs_index)
+ENDPROC(native_load_gs_index)
EXPORT_SYMBOL(native_load_gs_index)

_ASM_EXTABLE(.Lgs_change, bad_gs)
@@ -898,7 +921,7 @@ ENTRY(do_softirq_own_stack)
leaveq
decl PER_CPU_VAR(irq_count)
ret
-END(do_softirq_own_stack)
+ENDPROC(do_softirq_own_stack)

#ifdef CONFIG_XEN
idtentry xen_hypervisor_callback xen_do_hypervisor_callback has_error_code=0
@@ -922,13 +945,18 @@ ENTRY(xen_do_hypervisor_callback) /* do_hypervisor_callback(struct *pt_regs) */
* Since we don't modify %rdi, evtchn_do_upall(struct *pt_regs) will
* see the correct pointer to the pt_regs
*/
+ UNWIND_HINT_FUNC
movq %rdi, %rsp /* we don't return, adjust the stack frame */
+ UNWIND_HINT_REGS
11: incl PER_CPU_VAR(irq_count)
movq %rsp, %rbp
cmovzq PER_CPU_VAR(irq_stack_ptr), %rsp
+ UNWIND_HINT_REGS base=rbp
pushq %rbp /* frame pointer backlink */
+ UNWIND_HINT_REGS indirect=1
call xen_evtchn_do_upcall
popq %rsp
+ UNWIND_HINT_REGS
decl PER_CPU_VAR(irq_count)
#ifndef CONFIG_PREEMPT
call xen_maybe_preempt_hcall
@@ -950,6 +978,7 @@ END(xen_do_hypervisor_callback)
* with its current contents: any discrepancy means we in category 1.
*/
ENTRY(xen_failsafe_callback)
+ UNWIND_HINT_EMPTY
movl %ds, %ecx
cmpw %cx, 0x10(%rsp)
jne 1f
@@ -969,11 +998,13 @@ ENTRY(xen_failsafe_callback)
pushq $0 /* RIP */
pushq %r11
pushq %rcx
+ UNWIND_HINT_IRET_REGS offset=8
jmp general_protection
1: /* Segment mismatch => Category 1 (Bad segment). Retry the IRET. */
movq (%rsp), %rcx
movq 8(%rsp), %r11
addq $0x30, %rsp
+ UNWIND_HINT_IRET_REGS
pushq $-1 /* orig_ax = -1 => not a system call */
ALLOC_PT_GPREGS_ON_STACK
SAVE_C_REGS
@@ -1019,6 +1050,7 @@ idtentry machine_check has_error_code=0 paranoid=1 do_sym=*machine_check_vec
* Return: ebx=0: need swapgs on exit, ebx=1: otherwise
*/
ENTRY(paranoid_entry)
+ UNWIND_HINT_FUNC
cld
SAVE_C_REGS 8
SAVE_EXTRA_REGS 8
@@ -1046,6 +1078,7 @@ END(paranoid_entry)
* On entry, ebx is "no swapgs" flag (1: don't need swapgs, 0: need it)
*/
ENTRY(paranoid_exit)
+ UNWIND_HINT_REGS
DISABLE_INTERRUPTS(CLBR_ANY)
TRACE_IRQS_OFF_DEBUG
testl %ebx, %ebx /* swapgs needed? */
@@ -1067,6 +1100,7 @@ END(paranoid_exit)
* Return: EBX=0: came from user mode; EBX=1: otherwise
*/
ENTRY(error_entry)
+ UNWIND_HINT_FUNC
cld
SAVE_C_REGS 8
SAVE_EXTRA_REGS 8
@@ -1151,6 +1185,7 @@ END(error_entry)
* 0: user gsbase is loaded, we need SWAPGS and standard preparation for return to usermode
*/
ENTRY(error_exit)
+ UNWIND_HINT_REGS
DISABLE_INTERRUPTS(CLBR_ANY)
TRACE_IRQS_OFF
testl %ebx, %ebx
@@ -1160,6 +1195,7 @@ END(error_exit)

/* Runs on exception stack */
ENTRY(nmi)
+ UNWIND_HINT_IRET_REGS
/*
* Fix up the exception frame if we're on Xen.
* PARAVIRT_ADJUST_EXCEPTION_FRAME is guaranteed to push at most
@@ -1231,11 +1267,13 @@ ENTRY(nmi)
cld
movq %rsp, %rdx
movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
+ UNWIND_HINT_IRET_REGS base=rdx offset=8
pushq 5*8(%rdx) /* pt_regs->ss */
pushq 4*8(%rdx) /* pt_regs->rsp */
pushq 3*8(%rdx) /* pt_regs->flags */
pushq 2*8(%rdx) /* pt_regs->cs */
pushq 1*8(%rdx) /* pt_regs->rip */
+ UNWIND_HINT_IRET_REGS
pushq $-1 /* pt_regs->orig_ax */
pushq %rdi /* pt_regs->di */
pushq %rsi /* pt_regs->si */
@@ -1252,6 +1290,7 @@ ENTRY(nmi)
pushq %r13 /* pt_regs->r13 */
pushq %r14 /* pt_regs->r14 */
pushq %r15 /* pt_regs->r15 */
+ UNWIND_HINT_REGS
ENCODE_FRAME_POINTER

/*
@@ -1406,6 +1445,7 @@ first_nmi:
.rept 5
pushq 11*8(%rsp)
.endr
+ UNWIND_HINT_IRET_REGS

/* Everything up to here is safe from nested NMIs */

@@ -1421,6 +1461,7 @@ first_nmi:
pushq $__KERNEL_CS /* CS */
pushq $1f /* RIP */
INTERRUPT_RETURN /* continues at repeat_nmi below */
+ UNWIND_HINT_IRET_REGS
1:
#endif

@@ -1470,6 +1511,7 @@ end_repeat_nmi:
* exceptions might do.
*/
call paranoid_entry
+ UNWIND_HINT_REGS

/* paranoidentry do_nmi, 0; without TRACE_IRQS_OFF */
movq %rsp, %rdi
@@ -1507,17 +1549,19 @@ nmi_restore:
END(nmi)

ENTRY(ignore_sysret)
+ UNWIND_HINT_EMPTY
mov $-ENOSYS, %eax
sysret
END(ignore_sysret)

ENTRY(rewind_stack_do_exit)
+ UNWIND_HINT_FUNC
/* Prevent any naive code from trying to unwind to our caller. */
xorl %ebp, %ebp

movq PER_CPU_VAR(cpu_current_top_of_stack), %rax
- leaq -TOP_OF_KERNEL_STACK_PADDING-PTREGS_SIZE(%rax), %rsp
+ leaq -PTREGS_SIZE(%rax), %rsp
+ UNWIND_HINT_FUNC cfa_offset=PTREGS_SIZE

call do_exit
-1: jmp 1b
END(rewind_stack_do_exit)
--
2.7.5

2017-06-28 15:13:37

by Josh Poimboeuf

[permalink] [raw]
Subject: [PATCH v2 8/8] x86/unwind: add undwarf unwinder

Add a new 'undwarf' unwinder which is enabled by
CONFIG_UNDWARF_UNWINDER. It plugs into the existing x86 unwinder
framework.

It relies on objtool to generate the needed .undwarf section.

For more details on why undwarf is used instead of DWARF, see
tools/objtool/Documentation/undwarf.txt.

Thanks to Andy Lutomirski for the performance improvement ideas:
splitting the undwarf table into two parallel arrays and creating a fast
lookup table to search a subset of the undwarf table.
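
To make the table layout concrete, here's a rough standalone sketch of the
scheme: a sorted array of instruction addresses searched with a binary
search, a parallel array of unwind state records, and a fast lookup table
which narrows each search to a single block. All names, sizes, and
addresses are illustrative only, and the sketch uses absolute addresses
where the kernel stores self-relative offsets; the real code is in
arch/x86/kernel/unwind_undwarf.c below.

#include <stdio.h>

#define NUM_ENTRIES	8
#define NUM_BLOCKS	4
#define TEXT_START	0x1000UL
#define TEXT_END	0x2000UL
#define BLOCK_SIZE	((TEXT_END - TEXT_START + NUM_BLOCKS - 1) / NUM_BLOCKS)

/* Parallel arrays: the searchable IPs live apart from the larger state data. */
static unsigned long ip_table[NUM_ENTRIES] = {
	0x1000, 0x1200, 0x1400, 0x1600, 0x1800, 0x1a00, 0x1c00, 0x1e00
};

struct state { int cfa_offset; };

static struct state state_table[NUM_ENTRIES] = {
	{8}, {16}, {24}, {8}, {16}, {32}, {8}, {16}
};

/* Fast lookup table: block index -> starting candidate index in ip_table. */
static unsigned int fast_lookup[NUM_BLOCKS + 1];

static void build_fast_lookup(void)
{
	unsigned int i, j = 0;

	for (i = 0; i <= NUM_BLOCKS; i++) {
		unsigned long block_ip = TEXT_START + i * BLOCK_SIZE;

		/* index of the rightmost entry at or below the block start */
		while (j < NUM_ENTRIES - 1 && ip_table[j + 1] <= block_ip)
			j++;
		fast_lookup[i] = j;
	}
}

/* Binary search the block's subrange for the rightmost entry with ip <= target. */
static struct state *find_state(unsigned long ip)
{
	unsigned int idx, lo, hi, mid, found;

	if (ip < TEXT_START || ip >= TEXT_END)
		return NULL;

	idx = (ip - TEXT_START) / BLOCK_SIZE;
	lo = found = fast_lookup[idx];
	hi = fast_lookup[idx + 1];

	while (lo <= hi) {
		mid = lo + (hi - lo) / 2;
		if (ip_table[mid] <= ip) {
			found = mid;
			lo = mid + 1;
		} else {
			if (mid == 0)
				break;
			hi = mid - 1;
		}
	}

	return &state_table[found];
}

int main(void)
{
	build_fast_lookup();
	printf("cfa_offset at 0x1500: %d\n", find_state(0x1500)->cfa_offset);
	return 0;
}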

Signed-off-by: Josh Poimboeuf <[email protected]>
---
Documentation/x86/undwarf.txt | 146 ++++++++++
arch/um/include/asm/unwind.h | 8 +
arch/x86/Kconfig | 1 +
arch/x86/Kconfig.debug | 25 ++
arch/x86/include/asm/module.h | 9 +
arch/x86/include/asm/unwind.h | 77 +++--
arch/x86/kernel/Makefile | 8 +-
arch/x86/kernel/module.c | 12 +-
arch/x86/kernel/setup.c | 3 +
arch/x86/kernel/unwind_frame.c | 39 ++-
arch/x86/kernel/unwind_guess.c | 5 +
arch/x86/kernel/unwind_undwarf.c | 589 ++++++++++++++++++++++++++++++++++++++
arch/x86/kernel/vmlinux.lds.S | 2 +
include/asm-generic/vmlinux.lds.h | 20 +-
lib/Kconfig.debug | 3 +
scripts/Makefile.build | 14 +-
16 files changed, 898 insertions(+), 63 deletions(-)
create mode 100644 Documentation/x86/undwarf.txt
create mode 100644 arch/um/include/asm/unwind.h
create mode 100644 arch/x86/kernel/unwind_undwarf.c

diff --git a/Documentation/x86/undwarf.txt b/Documentation/x86/undwarf.txt
new file mode 100644
index 0000000..d76c6b4
--- /dev/null
+++ b/Documentation/x86/undwarf.txt
@@ -0,0 +1,146 @@
+Undwarf unwinder debuginfo generation
+=====================================
+
+Overview
+--------
+
+The kernel CONFIG_UNDWARF_UNWINDER option enables objtool generation of
+undwarf debuginfo, which is out-of-band data which is used by the
+in-kernel undwarf unwinder. It's similar in concept to DWARF CFI
+debuginfo which would be used by a DWARF unwinder. The difference is
+that the format of the undwarf data is simpler than DWARF, which in turn
+allows the unwinder to be simpler and faster.
+
+Objtool generates the undwarf data by first doing compile-time stack
+metadata validation (CONFIG_STACK_VALIDATION). After analyzing all the
+code paths of a .o file, it determines information about the stack state
+at each instruction address in the file and outputs that information to
+the .undwarf and .undwarf_ip sections.
+
+The undwarf sections are combined at link time and are sorted at boot
+time. The unwinder uses the resulting data to correlate instruction
+addresses with their stack states at run time.
+
+
+Undwarf vs frame pointers
+-------------------------
+
+With frame pointers enabled, GCC adds instrumentation code to every
+function in the kernel. The kernel's .text size increases by about
+3.2%, resulting in a broad kernel-wide slowdown. Measurements by Mel
+Gorman [1] have shown a slowdown of 5-10% for some workloads.
+
+In contrast, the undwarf unwinder has no effect on text size or runtime
+performance, because the debuginfo is out of band. So if you disable
+frame pointers and enable undwarf, you get a nice performance
+improvement across the board, and still have reliable stack traces.
+
+Another benefit of undwarf compared to frame pointers is that it can
+reliably unwind across interrupts and exceptions. Frame pointer based
+unwinds can skip the caller of the interrupted function if it was a leaf
+function or if the interrupt hit before the frame pointer was saved.
+
+The main disadvantage of undwarf compared to frame pointers is that it
+needs more memory to store the undwarf table: roughly 2-4MB depending on
+the kernel config.
+
+
+Undwarf vs DWARF
+----------------
+
+Undwarf debuginfo's advantage over DWARF itself is that it's much
+simpler. It gets rid of the complex DWARF CFI state machine and also
+gets rid of the tracking of unnecessary registers. This allows the
+unwinder to be much simpler, meaning fewer bugs, which is especially
+important for mission critical oops code.
+
+The simpler debuginfo format also enables the unwinder to be much faster
+than DWARF, which is important for perf and lockdep. In a basic
+performance test by Jiri Slaby [2], the undwarf unwinder was about 20x
+faster than an out-of-tree DWARF unwinder. (Note: that measurement was
+taken before some performance tweaks were implemented, so the speedup
+may be even higher.)
+
+The undwarf format does have a few downsides compared to DWARF. The
+undwarf table takes up ~2MB more memory than a DWARF .eh_frame table.
+
+Another potential downside is that, as GCC evolves, it's conceivable
+that the undwarf data may end up being *too* simple to describe the
+state of the stack for certain optimizations. But IMO this is unlikely
+because GCC saves the frame pointer for any unusual stack adjustments it
+does, so I suspect we'll really only ever need to keep track of the
+stack pointer and the frame pointer between call frames. But even if we
+do end up having to track all the registers DWARF tracks, at least we
+will still be able to control the format, e.g. no complex state
+machines.
+
+
+Undwarf debuginfo generation
+----------------------------
+
+The undwarf data is generated by objtool. With the existing
+compile-time stack metadata validation feature, objtool already follows
+all code paths, and so it already has all the information it needs to be
+able to generate undwarf data from scratch. So it's an easy step to go
+from stack validation to undwarf generation.
+
+It should be possible to instead generate the undwarf data with a simple
+tool which converts DWARF to undwarf. However, such a solution would be
+incomplete due to the kernel's extensive use of asm, inline asm, and
+special sections like exception tables.
+
+That could be rectified by manually annotating those special code paths
+using GNU assembler .cfi annotations in .S files, and homegrown
+annotations for inline asm in .c files. But asm annotations were tried
+in the past and were found to be unmaintainable. They were often
+incorrect/incomplete and made the code harder to read and keep updated.
+And based on looking at glibc code, annotating inline asm in .c files
+might be even worse.
+
+Objtool still needs a few annotations, but only in code which does
+unusual things to the stack like entry code. And even then, far fewer
+annotations are needed than what DWARF would need, so they're much more
+maintainable than DWARF CFI annotations.
+
+So the advantages of using objtool to generate undwarf are that it gives
+more accurate debuginfo, with very few annotations. It also insulates
+the kernel from toolchain bugs which can be very painful to deal with in
+the kernel since we often have to work around issues in older versions of
+the toolchain for years.
+
+The downside is that the unwinder now becomes dependent on objtool's
+ability to reverse engineer GCC code paths. If GCC optimizations become
+too complicated for objtool to follow, the undwarf generation might stop
+working or become incomplete. (It's worth noting that livepatch already
+has such a dependency on objtool's ability to follow GCC code paths.)
+
+If newer versions of GCC come up with some optimizations which break
+objtool, we may need to revisit the current implementation. Some
+possible solutions would be asking GCC to make the optimizations more
+palatable, or having objtool use DWARF as an additional input, or
+creating a GCC plugin to assist objtool with its analysis. But for now,
+objtool follows GCC code quite well.
+
+
+Unwinder implementation details
+-------------------------------
+
+Objtool generates the undwarf data by integrating with the compile-time
+stack metadata validation feature, which is described in detail in
+tools/objtool/Documentation/stack-validation.txt. After analyzing all
+the code paths of a .o file, it creates an array of undwarf structs, and
+a parallel array of instruction addresses associated with those structs,
+and writes them to the .undwarf and .undwarf_ip sections respectively.
+
+The undwarf data is split into the two arrays for performance reasons,
+to make the searchable part of the data (.undwarf_ip) more compact. The
+arrays are sorted in parallel at boot time.
+
+Performance is further improved by the use of a fast lookup table which
+is created at runtime. The fast lookup table associates a given address
+with a range of undwarf table indices, so that only a small subset of
+the undwarf table needs to be searched.
+
+
+[1] https://lkml.kernel.org/r/[email protected]
+[2] https://lkml.kernel.org/r/[email protected]
diff --git a/arch/um/include/asm/unwind.h b/arch/um/include/asm/unwind.h
new file mode 100644
index 0000000..53f507c
--- /dev/null
+++ b/arch/um/include/asm/unwind.h
@@ -0,0 +1,8 @@
+#ifndef _ASM_UML_UNWIND_H
+#define _ASM_UML_UNWIND_H
+
+static inline void
+unwind_module_init(struct module *mod, void *undwarf_ip, size_t undwarf_ip_size,
+ void *undwarf, size_t undwarf_size) {}
+
+#endif /* _ASM_UML_UNWIND_H */
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 72028a1..adf3222 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -153,6 +153,7 @@ config X86
select HAVE_MEMBLOCK
select HAVE_MEMBLOCK_NODE_MAP
select HAVE_MIXED_BREAKPOINTS_REGS
+ select HAVE_MOD_ARCH_SPECIFIC
select HAVE_NMI
select HAVE_OPROFILE
select HAVE_OPTPROBES
diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index fcb7604..995434c 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -357,4 +357,29 @@ config PUNIT_ATOM_DEBUG
The current power state can be read from
/sys/kernel/debug/punit_atom/dev_power_state

+config UNDWARF_UNWINDER
+ bool "undwarf unwinder"
+ depends on X86_64
+ select STACK_VALIDATION
+ ---help---
+ This option enables the "undwarf" unwinder for unwinding kernel stack
+ traces. It uses a custom data format which is a simplified version
+ of the DWARF Call Frame Information standard.
+
+ This unwinder is more accurate across interrupt entry frames than the
+ frame pointer unwinder. It can also enable a small performance
+ improvement across the entire kernel if CONFIG_FRAME_POINTER is
+ disabled.
+
+ Enabling this option will increase the kernel's runtime memory usage
+ by roughly 2-4MB, depending on your kernel config.
+
+config FRAME_POINTER_UNWINDER
+ def_bool y
+ depends on !UNDWARF_UNWINDER && FRAME_POINTER
+
+config GUESS_UNWINDER
+ def_bool y
+ depends on !UNDWARF_UNWINDER && !FRAME_POINTER
+
endmenu
diff --git a/arch/x86/include/asm/module.h b/arch/x86/include/asm/module.h
index e3b7819..4dc6427 100644
--- a/arch/x86/include/asm/module.h
+++ b/arch/x86/include/asm/module.h
@@ -2,6 +2,15 @@
#define _ASM_X86_MODULE_H

#include <asm-generic/module.h>
+#include <asm/undwarf.h>
+
+struct mod_arch_specific {
+#ifdef CONFIG_UNDWARF_UNWINDER
+ unsigned int num_undwarves;
+ int *undwarf_ip;
+ struct undwarf *undwarf;
+#endif
+};

#ifdef CONFIG_X86_64
/* X86_64 does not define MODULE_PROC_FAMILY */
diff --git a/arch/x86/include/asm/unwind.h b/arch/x86/include/asm/unwind.h
index e667649..1f8cb78 100644
--- a/arch/x86/include/asm/unwind.h
+++ b/arch/x86/include/asm/unwind.h
@@ -12,11 +12,14 @@ struct unwind_state {
struct task_struct *task;
int graph_idx;
bool error;
-#ifdef CONFIG_FRAME_POINTER
+#if defined(CONFIG_UNDWARF_UNWINDER)
+ bool signal, full_regs;
+ unsigned long sp, bp, ip;
+ struct pt_regs *regs;
+#elif defined(CONFIG_FRAME_POINTER)
bool got_irq;
- unsigned long *bp, *orig_sp;
+ unsigned long *bp, *orig_sp, ip;
struct pt_regs *regs;
- unsigned long ip;
#else
unsigned long *sp;
#endif
@@ -24,41 +27,30 @@ struct unwind_state {

void __unwind_start(struct unwind_state *state, struct task_struct *task,
struct pt_regs *regs, unsigned long *first_frame);
-
bool unwind_next_frame(struct unwind_state *state);
-
unsigned long unwind_get_return_address(struct unwind_state *state);
+unsigned long *unwind_get_return_address_ptr(struct unwind_state *state);

static inline bool unwind_done(struct unwind_state *state)
{
return state->stack_info.type == STACK_TYPE_UNKNOWN;
}

-static inline
-void unwind_start(struct unwind_state *state, struct task_struct *task,
- struct pt_regs *regs, unsigned long *first_frame)
-{
- first_frame = first_frame ? : get_stack_pointer(task, regs);
-
- __unwind_start(state, task, regs, first_frame);
-}
-
static inline bool unwind_error(struct unwind_state *state)
{
return state->error;
}

-#ifdef CONFIG_FRAME_POINTER
-
static inline
-unsigned long *unwind_get_return_address_ptr(struct unwind_state *state)
+void unwind_start(struct unwind_state *state, struct task_struct *task,
+ struct pt_regs *regs, unsigned long *first_frame)
{
- if (unwind_done(state))
- return NULL;
+ first_frame = first_frame ? : get_stack_pointer(task, regs);

- return state->regs ? &state->regs->ip : state->bp + 1;
+ __unwind_start(state, task, regs, first_frame);
}

+#if defined(CONFIG_UNDWARF_UNWINDER) || defined(CONFIG_FRAME_POINTER)
static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state)
{
if (unwind_done(state))
@@ -66,20 +58,47 @@ static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state)

return state->regs;
}
-
-#else /* !CONFIG_FRAME_POINTER */
-
-static inline
-unsigned long *unwind_get_return_address_ptr(struct unwind_state *state)
+#else
+static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state)
{
return NULL;
}
+#endif

-static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state)
+#ifdef CONFIG_UNDWARF_UNWINDER
+void unwind_init(void);
+void unwind_module_init(struct module *mod, void *undwarf_ip,
+ size_t undwarf_ip_size, void *undwarf,
+ size_t undwarf_size);
+#else
+static inline void unwind_init(void) {}
+static inline void unwind_module_init(struct module *mod, void *undwarf_ip,
+ size_t undwarf_ip_size, void *undwarf,
+ size_t undwarf_size) {}
+#endif
+
+/*
+ * This disables KASAN checking when reading a value from another task's stack,
+ * since the other task could be running on another CPU and could have poisoned
+ * the stack in the meantime.
+ */
+#define READ_ONCE_TASK_STACK(task, x) \
+({ \
+ unsigned long val; \
+ if (task == current) \
+ val = READ_ONCE(x); \
+ else \
+ val = READ_ONCE_NOCHECK(x); \
+ val; \
+})
+
+static inline bool task_on_another_cpu(struct task_struct *task)
{
- return NULL;
+#ifdef CONFIG_SMP
+ return task != current && task->on_cpu;
+#else
+ return false;
+#endif
}

-#endif /* CONFIG_FRAME_POINTER */
-
#endif /* _ASM_X86_UNWIND_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 3c7c419..4865889 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -125,11 +125,9 @@ obj-$(CONFIG_PERF_EVENTS) += perf_regs.o
obj-$(CONFIG_TRACING) += tracepoint.o
obj-$(CONFIG_SCHED_MC_PRIO) += itmt.o

-ifdef CONFIG_FRAME_POINTER
-obj-y += unwind_frame.o
-else
-obj-y += unwind_guess.o
-endif
+obj-$(CONFIG_UNDWARF_UNWINDER) += unwind_undwarf.o
+obj-$(CONFIG_FRAME_POINTER_UNWINDER) += unwind_frame.o
+obj-$(CONFIG_GUESS_UNWINDER) += unwind_guess.o

###
# 64 bit specific files
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index f67bd32..203b5a7 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -35,6 +35,7 @@
#include <asm/page.h>
#include <asm/pgtable.h>
#include <asm/setup.h>
+#include <asm/unwind.h>

#if 0
#define DEBUGP(fmt, ...) \
@@ -213,7 +214,7 @@ int module_finalize(const Elf_Ehdr *hdr,
struct module *me)
{
const Elf_Shdr *s, *text = NULL, *alt = NULL, *locks = NULL,
- *para = NULL;
+ *para = NULL, *undwarf = NULL, *undwarf_ip = NULL;
char *secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;

for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) {
@@ -225,6 +226,10 @@ int module_finalize(const Elf_Ehdr *hdr,
locks = s;
if (!strcmp(".parainstructions", secstrings + s->sh_name))
para = s;
+ if (!strcmp(".undwarf", secstrings + s->sh_name))
+ undwarf = s;
+ if (!strcmp(".undwarf_ip", secstrings + s->sh_name))
+ undwarf_ip = s;
}

if (alt) {
@@ -248,6 +253,11 @@ int module_finalize(const Elf_Ehdr *hdr,
/* make jump label nops */
jump_label_apply_nops(me);

+ if (undwarf && undwarf_ip)
+ unwind_module_init(me, (void *)undwarf_ip->sh_addr,
+ undwarf_ip->sh_size,
+ (void *)undwarf->sh_addr, undwarf->sh_size);
+
return 0;
}

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 65622f0..d736761 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -115,6 +115,7 @@
#include <asm/microcode.h>
#include <asm/mmu_context.h>
#include <asm/kaslr.h>
+#include <asm/unwind.h>

/*
* max_low_pfn_mapped: highest direct mapped pfn under 4GB
@@ -1303,6 +1304,8 @@ void __init setup_arch(char **cmdline_p)
if (efi_enabled(EFI_BOOT))
efi_apply_memmap_quirks();
#endif
+
+ unwind_init();
}

#ifdef CONFIG_X86_32
diff --git a/arch/x86/kernel/unwind_frame.c b/arch/x86/kernel/unwind_frame.c
index b9389d7..7574ef5 100644
--- a/arch/x86/kernel/unwind_frame.c
+++ b/arch/x86/kernel/unwind_frame.c
@@ -10,20 +10,22 @@

#define FRAME_HEADER_SIZE (sizeof(long) * 2)

-/*
- * This disables KASAN checking when reading a value from another task's stack,
- * since the other task could be running on another CPU and could have poisoned
- * the stack in the meantime.
- */
-#define READ_ONCE_TASK_STACK(task, x) \
-({ \
- unsigned long val; \
- if (task == current) \
- val = READ_ONCE(x); \
- else \
- val = READ_ONCE_NOCHECK(x); \
- val; \
-})
+unsigned long unwind_get_return_address(struct unwind_state *state)
+{
+ if (unwind_done(state))
+ return 0;
+
+ return __kernel_text_address(state->ip) ? state->ip : 0;
+}
+EXPORT_SYMBOL_GPL(unwind_get_return_address);
+
+unsigned long *unwind_get_return_address_ptr(struct unwind_state *state)
+{
+ if (unwind_done(state))
+ return NULL;
+
+ return state->regs ? &state->regs->ip : state->bp + 1;
+}

static void unwind_dump(struct unwind_state *state)
{
@@ -66,15 +68,6 @@ static void unwind_dump(struct unwind_state *state)
}
}

-unsigned long unwind_get_return_address(struct unwind_state *state)
-{
- if (unwind_done(state))
- return 0;
-
- return __kernel_text_address(state->ip) ? state->ip : 0;
-}
-EXPORT_SYMBOL_GPL(unwind_get_return_address);
-
static size_t regs_size(struct pt_regs *regs)
{
/* x86_32 regs from kernel mode are two words shorter: */
diff --git a/arch/x86/kernel/unwind_guess.c b/arch/x86/kernel/unwind_guess.c
index 039f367..4f0e17b 100644
--- a/arch/x86/kernel/unwind_guess.c
+++ b/arch/x86/kernel/unwind_guess.c
@@ -19,6 +19,11 @@ unsigned long unwind_get_return_address(struct unwind_state *state)
}
EXPORT_SYMBOL_GPL(unwind_get_return_address);

+unsigned long *unwind_get_return_address_ptr(struct unwind_state *state)
+{
+ return NULL;
+}
+
bool unwind_next_frame(struct unwind_state *state)
{
struct stack_info *info = &state->stack_info;
diff --git a/arch/x86/kernel/unwind_undwarf.c b/arch/x86/kernel/unwind_undwarf.c
new file mode 100644
index 0000000..44f62af
--- /dev/null
+++ b/arch/x86/kernel/unwind_undwarf.c
@@ -0,0 +1,589 @@
+#include <linux/module.h>
+#include <linux/sort.h>
+#include <asm/ptrace.h>
+#include <asm/stacktrace.h>
+#include <asm/unwind.h>
+#include <asm/undwarf.h>
+#include <asm/sections.h>
+
+#define undwarf_warn(fmt, ...) \
+ printk_deferred_once(KERN_WARNING pr_fmt("WARNING: " fmt), ##__VA_ARGS__)
+
+extern int __start_undwarf_ip[];
+extern int __stop_undwarf_ip[];
+extern struct undwarf __start_undwarf[];
+extern struct undwarf __stop_undwarf[];
+
+bool undwarf_init;
+static DEFINE_MUTEX(sort_mutex);
+
+int *cur_undwarf_ip_table = __start_undwarf_ip;
+struct undwarf *cur_undwarf_table = __start_undwarf;
+
+/*
+ * This is a lookup table for speeding up access to the undwarf table. Given
+ * an input address offset, the corresponding lookup table entry specifies a
+ * subset of the undwarf table to search.
+ *
+ * Each block represents the end of the previous range and the start of the
+ * next range. An extra block is added to give the last range an end.
+ *
+ * Some measured performance results for different values of LOOKUP_NUM_BLOCKS:
+ *
+ * num blocks array size lookup speedup total speedup
+ * 2k 8k 1.5x 1.5x
+ * 4k 16k 1.6x 1.6x
+ * 8k 32k 1.8x 1.7x
+ * 16k 64k 2.0x 1.8x
+ * 32k 128k 2.5x 2.0x
+ * 64k 256k 2.9x 2.2x
+ * 128k 512k 3.3x 2.4x
+ *
+ * Go with 32k blocks because it doubles unwinder performance while only adding
+ * 3.5% to the undwarf data footprint.
+ */
+#define LOOKUP_NUM_BLOCKS (32 * 1024)
+static unsigned int undwarf_fast_lookup[LOOKUP_NUM_BLOCKS + 1] __ro_after_init;
+
+#define LOOKUP_START_IP (unsigned long)_stext
+#define LOOKUP_STOP_IP (unsigned long)_etext
+#define LOOKUP_BLOCK_SIZE \
+ (DIV_ROUND_UP(LOOKUP_STOP_IP - LOOKUP_START_IP, \
+ LOOKUP_NUM_BLOCKS))
+
+
+static inline unsigned long undwarf_ip(const int *ip)
+{
+ return (unsigned long)ip + *ip;
+}
+
+static struct undwarf *__undwarf_find(int *ip_table, struct undwarf *u_table,
+ unsigned int num_entries,
+ unsigned long ip)
+{
+ int *first = ip_table;
+ int *last = ip_table + num_entries - 1;
+ int *mid = first, *found = first;
+
+ if (!num_entries)
+ return NULL;
+
+ /*
+ * Do a binary range search to find the rightmost duplicate of a given
+ * starting address. Some entries are section terminators which are
+ * "weak" entries for ensuring there are no gaps. They should be
+ * ignored when they conflict with a real entry.
+ */
+ while (first <= last) {
+ mid = first + ((last - first) / 2);
+
+ if (undwarf_ip(mid) <= ip) {
+ found = mid;
+ first = mid + 1;
+ } else
+ last = mid - 1;
+ }
+
+ return u_table + (found - ip_table);
+}
+
+static struct undwarf *undwarf_find(unsigned long ip)
+{
+ struct module *mod;
+
+ if (!undwarf_init)
+ return NULL;
+
+ /* For non-init vmlinux addresses, use the fast lookup table: */
+ if (ip >= LOOKUP_START_IP && ip < LOOKUP_STOP_IP) {
+ unsigned int idx, start, stop;
+
+ idx = (ip - LOOKUP_START_IP) / LOOKUP_BLOCK_SIZE;
+
+ if (WARN_ON_ONCE(idx >= LOOKUP_NUM_BLOCKS))
+ return NULL;
+
+ start = undwarf_fast_lookup[idx];
+ stop = undwarf_fast_lookup[idx + 1] + 1;
+
+ if (WARN_ON_ONCE(__start_undwarf + start >= __stop_undwarf ||
+ __start_undwarf + stop > __stop_undwarf))
+ return NULL;
+
+ return __undwarf_find(__start_undwarf_ip + start,
+ __start_undwarf + start,
+ stop - start, ip);
+ }
+
+ /* vmlinux .init slow lookup: */
+ if (ip >= (unsigned long)_sinittext && ip < (unsigned long)_einittext)
+ return __undwarf_find(__start_undwarf_ip, __start_undwarf,
+ __stop_undwarf - __start_undwarf, ip);
+
+ /* Module lookup: */
+ mod = __module_address(ip);
+ if (!mod || !mod->arch.undwarf || !mod->arch.undwarf_ip)
+ return NULL;
+ return __undwarf_find(mod->arch.undwarf_ip, mod->arch.undwarf,
+ mod->arch.num_undwarves, ip);
+}
+
+static void undwarf_sort_swap(void *_a, void *_b, int size)
+{
+ struct undwarf *undwarf_a, *undwarf_b;
+ struct undwarf undwarf_tmp;
+ int *a = _a, *b = _b, tmp;
+ int delta = _b - _a;
+
+ /* Swap the undwarf_ip entries: */
+ tmp = *a;
+ *a = *b + delta;
+ *b = tmp - delta;
+
+ /* Swap the corresponding undwarf entries: */
+ undwarf_a = cur_undwarf_table + (a - cur_undwarf_ip_table);
+ undwarf_b = cur_undwarf_table + (b - cur_undwarf_ip_table);
+ undwarf_tmp = *undwarf_a;
+ *undwarf_a = *undwarf_b;
+ *undwarf_b = undwarf_tmp;
+}
+
+static int undwarf_sort_cmp(const void *_a, const void *_b)
+{
+ struct undwarf *undwarf_a;
+ const int *a = _a, *b = _b;
+ unsigned long a_val = undwarf_ip(a);
+ unsigned long b_val = undwarf_ip(b);
+
+ if (a_val > b_val)
+ return 1;
+ if (a_val < b_val)
+ return -1;
+
+ /*
+ * The "weak" section terminator entries need to always be on the left
+ * to ensure the lookup code skips them in favor of real entries.
+ * These terminator entries exist to handle any gaps created by
+ * whitelisted .o files which didn't get objtool generation.
+ */
+ undwarf_a = cur_undwarf_table + (a - cur_undwarf_ip_table);
+ return undwarf_a->cfa_reg == UNDWARF_REG_UNDEFINED ? -1 : 1;
+}
+
+void unwind_module_init(struct module *mod, void *_undwarf_ip,
+ size_t undwarf_ip_size, void *_undwarf,
+ size_t undwarf_size)
+{
+ int *undwarf_ip = _undwarf_ip;
+ struct undwarf *undwarf = _undwarf;
+ unsigned int num_entries = undwarf_ip_size / sizeof(int);
+
+ WARN_ON_ONCE(undwarf_ip_size % sizeof(int) != 0 ||
+ undwarf_size % sizeof(*undwarf) != 0 ||
+ num_entries != undwarf_size / sizeof(*undwarf));
+
+ /*
+ * The 'cur_undwarf_*' globals allow the undwarf_sort_swap() callback
+ * to associate an undwarf_ip table entry with its corresponding
+ * undwarf entry so they can both be swapped.
+ */
+ mutex_lock(&sort_mutex);
+ cur_undwarf_ip_table = undwarf_ip;
+ cur_undwarf_table = undwarf;
+ sort(undwarf_ip, num_entries, sizeof(int), undwarf_sort_cmp,
+ undwarf_sort_swap);
+ mutex_unlock(&sort_mutex);
+
+ mod->arch.undwarf_ip = undwarf_ip;
+ mod->arch.undwarf = undwarf;
+ mod->arch.num_undwarves = num_entries;
+}
+
+void __init unwind_init(void)
+{
+ size_t undwarf_ip_size = (void *)__stop_undwarf_ip - (void *)__start_undwarf_ip;
+ size_t undwarf_size = (void *)__stop_undwarf - (void *)__start_undwarf;
+ size_t num_entries = undwarf_ip_size / sizeof(int);
+ struct undwarf *undwarf;
+ int i;
+
+ if (!num_entries || undwarf_ip_size % sizeof(int) != 0 ||
+ undwarf_size % sizeof(struct undwarf) != 0 ||
+ num_entries != undwarf_size / sizeof(struct undwarf)) {
+ pr_warn("WARNING: Bad or missing undwarf table. Disabling unwinder.\n");
+ return;
+ }
+
+ /* Sort the undwarf table: */
+ sort(__start_undwarf_ip, num_entries, sizeof(int), undwarf_sort_cmp,
+ undwarf_sort_swap);
+
+ /* Initialize the fast lookup table: */
+ for (i = 0; i < LOOKUP_NUM_BLOCKS; i++) {
+ undwarf = __undwarf_find(__start_undwarf_ip, __start_undwarf,
+ num_entries,
+ LOOKUP_START_IP + (LOOKUP_BLOCK_SIZE * i));
+ if (!undwarf) {
+ pr_warn("WARNING: Corrupt undwarf table. Disabling unwinder.\n");
+ return;
+ }
+
+ undwarf_fast_lookup[i] = undwarf - __start_undwarf;
+ }
+
+ /* Initialize the last 'end' block: */
+ undwarf = __undwarf_find(__start_undwarf_ip, __start_undwarf,
+ num_entries, LOOKUP_STOP_IP);
+ if (!undwarf) {
+ pr_warn("WARNING: Corrupt undwarf table. Disabling unwinder.\n");
+ return;
+ }
+ undwarf_fast_lookup[LOOKUP_NUM_BLOCKS] = undwarf - __start_undwarf;
+
+ undwarf_init = true;
+}
+
+unsigned long unwind_get_return_address(struct unwind_state *state)
+{
+ if (unwind_done(state))
+ return 0;
+
+ return __kernel_text_address(state->ip) ? state->ip : 0;
+}
+EXPORT_SYMBOL_GPL(unwind_get_return_address);
+
+unsigned long *unwind_get_return_address_ptr(struct unwind_state *state)
+{
+ if (unwind_done(state))
+ return NULL;
+
+ if (state->regs)
+ return &state->regs->ip;
+
+ if (state->sp)
+ return (unsigned long *)state->sp - 1;
+
+ return NULL;
+}
+
+static bool stack_access_ok(struct unwind_state *state, unsigned long addr,
+ size_t len)
+{
+ struct stack_info *info = &state->stack_info;
+
+ /*
+ * If the address isn't on the current stack, switch to the next one.
+ *
+ * We may have to traverse multiple stacks to deal with the possibility
+ * that info->next_sp could point to an empty stack and the address
+ * could be on a subsequent stack.
+ */
+ while (!on_stack(info, (void *)addr, len))
+ if (get_stack_info(info->next_sp, state->task, info,
+ &state->stack_mask))
+ return false;
+
+ return true;
+}
+
+static bool deref_stack_reg(struct unwind_state *state, unsigned long addr,
+ unsigned long *val)
+{
+ if (!stack_access_ok(state, addr, sizeof(long)))
+ return false;
+
+ *val = READ_ONCE_TASK_STACK(state->task, *(unsigned long *)addr);
+ return true;
+}
+
+#define REGS_SIZE (sizeof(struct pt_regs))
+#define SP_OFFSET (offsetof(struct pt_regs, sp))
+#define IRET_REGS_SIZE (REGS_SIZE - offsetof(struct pt_regs, ip))
+#define IRET_SP_OFFSET (SP_OFFSET - offsetof(struct pt_regs, ip))
+
+static bool deref_stack_regs(struct unwind_state *state, unsigned long addr,
+ unsigned long *ip, unsigned long *sp, bool full)
+{
+ size_t regs_size = full ? REGS_SIZE : IRET_REGS_SIZE;
+ size_t sp_offset = full ? SP_OFFSET : IRET_SP_OFFSET;
+ struct pt_regs *regs = (struct pt_regs *)(addr + regs_size - REGS_SIZE);
+
+ if (IS_ENABLED(CONFIG_X86_64)) {
+ if (!stack_access_ok(state, addr, regs_size))
+ return false;
+
+ *ip = regs->ip;
+ *sp = regs->sp;
+
+ return true;
+ }
+
+ if (!stack_access_ok(state, addr, sp_offset))
+ return false;
+
+ *ip = regs->ip;
+
+ if (user_mode(regs)) {
+ if (!stack_access_ok(state, addr + sp_offset,
+ REGS_SIZE - SP_OFFSET))
+ return false;
+
+ *sp = regs->sp;
+ } else
+ *sp = (unsigned long)&regs->sp;
+
+ return true;
+}
+
+bool unwind_next_frame(struct unwind_state *state)
+{
+ enum stack_type prev_type = state->stack_info.type;
+ unsigned long ip_p, prev_sp = state->sp;
+ unsigned long cfa, orig_ip, orig_sp;
+ struct undwarf *undwarf;
+ struct pt_regs *ptregs;
+ bool indirect = false;
+
+ if (unwind_done(state))
+ return false;
+
+ /* Don't let modules unload while we're reading their undwarf data. */
+ preempt_disable();
+
+ /* Have we reached the end? */
+ if (state->regs && user_mode(state->regs))
+ goto done;
+
+ /*
+ * Find the undwarf table entry associated with the text address.
+ *
+ * Decrement call return addresses by one so they work for sibling
+ * calls and calls to noreturn functions.
+ */
+ undwarf = undwarf_find(state->signal ? state->ip : state->ip - 1);
+ if (!undwarf || undwarf->cfa_reg == UNDWARF_REG_UNDEFINED)
+ goto done;
+ orig_ip = state->ip;
+
+ /* Calculate the CFA (caller frame address): */
+ switch (undwarf->cfa_reg) {
+ case UNDWARF_REG_SP:
+ cfa = state->sp + undwarf->cfa_offset;
+ break;
+
+ case UNDWARF_REG_BP:
+ cfa = state->bp + undwarf->cfa_offset;
+ break;
+
+ case UNDWARF_REG_SP_INDIRECT:
+ cfa = state->sp + undwarf->cfa_offset;
+ indirect = true;
+ break;
+
+ case UNDWARF_REG_BP_INDIRECT:
+ cfa = state->bp + undwarf->cfa_offset;
+ indirect = true;
+ break;
+
+ case UNDWARF_REG_R10:
+ if (!state->regs || !state->full_regs) {
+ undwarf_warn("missing regs for base reg R10 at ip %p\n",
+ (void *)state->ip);
+ goto done;
+ }
+ cfa = state->regs->r10;
+ break;
+
+ case UNDWARF_REG_R13:
+ if (!state->regs || !state->full_regs) {
+ undwarf_warn("missing regs for base reg R13 at ip %p\n",
+ (void *)state->ip);
+ goto done;
+ }
+ cfa = state->regs->r13;
+ break;
+
+ case UNDWARF_REG_DI:
+ if (!state->regs || !state->full_regs) {
+ undwarf_warn("missing regs for base reg DI at ip %p\n",
+ (void *)state->ip);
+ goto done;
+ }
+ cfa = state->regs->di;
+ break;
+
+ case UNDWARF_REG_DX:
+ if (!state->regs || !state->full_regs) {
+ undwarf_warn("missing regs for base reg DX at ip %p\n",
+ (void *)state->ip);
+ goto done;
+ }
+ cfa = state->regs->dx;
+ break;
+
+ default:
+ undwarf_warn("unknown CFA base reg %d for ip %p\n",
+ undwarf->cfa_reg, (void *)state->ip);
+ goto done;
+ }
+
+ if (indirect) {
+ if (!deref_stack_reg(state, cfa, &cfa))
+ goto done;
+ }
+
+ /* Find IP, SP and possibly regs: */
+ switch (undwarf->type) {
+ case UNDWARF_TYPE_CFA:
+ ip_p = cfa - sizeof(long);
+
+ if (!deref_stack_reg(state, ip_p, &state->ip))
+ goto done;
+
+ state->ip = ftrace_graph_ret_addr(state->task, &state->graph_idx,
+ state->ip, (void *)ip_p);
+
+ state->sp = cfa;
+ state->regs = NULL;
+ state->signal = false;
+ break;
+
+ case UNDWARF_TYPE_REGS:
+ if (!deref_stack_regs(state, cfa, &state->ip, &state->sp, true)) {
+ undwarf_warn("can't dereference registers at %p for ip %p\n",
+ (void *)cfa, (void *)orig_ip);
+ goto done;
+ }
+
+ state->regs = (struct pt_regs *)cfa;
+ state->full_regs = true;
+ state->signal = true;
+ break;
+
+ case UNDWARF_TYPE_REGS_IRET:
+ orig_sp = state->sp;
+ if (!deref_stack_regs(state, cfa, &state->ip, &state->sp, false)) {
+ undwarf_warn("can't dereference iret registers at %p for ip %p\n",
+ (void *)cfa, (void *)orig_ip);
+ goto done;
+ }
+
+ ptregs = container_of((void *)cfa, struct pt_regs, ip);
+ if ((unsigned long)ptregs >= orig_sp &&
+ on_stack(&state->stack_info, ptregs, REGS_SIZE)) {
+ state->regs = ptregs;
+ state->full_regs = false;
+ } else
+ state->regs = NULL;
+
+ state->signal = true;
+ break;
+
+ default:
+ undwarf_warn("unknown undwarf type %d\n", undwarf->type);
+ break;
+ }
+
+ /* Find BP: */
+ switch (undwarf->bp_reg) {
+ case UNDWARF_REG_UNDEFINED:
+ if (state->regs && state->full_regs)
+ state->bp = state->regs->bp;
+ break;
+
+ case UNDWARF_REG_CFA:
+ if (!deref_stack_reg(state, cfa + undwarf->bp_offset, &state->bp))
+ goto done;
+ break;
+
+ case UNDWARF_REG_BP:
+ if (!deref_stack_reg(state, state->bp + undwarf->bp_offset, &state->bp))
+ goto done;
+ break;
+
+ default:
+ undwarf_warn("unknown BP base reg %d for ip %p\n",
+ undwarf->bp_reg, (void *)orig_ip);
+ goto done;
+ }
+
+ /* Prevent a recursive loop due to bad undwarf data: */
+ if (state->stack_info.type == prev_type &&
+ on_stack(&state->stack_info, (void *)state->sp, sizeof(long)) &&
+ state->sp <= prev_sp) {
+ undwarf_warn("stack going in the wrong direction? ip=%p\n",
+ (void *)orig_ip);
+ goto done;
+ }
+
+ preempt_enable();
+ return true;
+
+done:
+ preempt_enable();
+ state->stack_info.type = STACK_TYPE_UNKNOWN;
+ return false;
+}
+EXPORT_SYMBOL_GPL(unwind_next_frame);
+
+void __unwind_start(struct unwind_state *state, struct task_struct *task,
+ struct pt_regs *regs, unsigned long *first_frame)
+{
+ memset(state, 0, sizeof(*state));
+ state->task = task;
+
+ /*
+ * Refuse to unwind the stack of a task while it's executing on another
+ * CPU. This check is racy, but that's ok: the unwinder has other
+ * checks to prevent it from going off the rails.
+ */
+ if (task_on_another_cpu(task))
+ goto done;
+
+ if (regs) {
+ if (user_mode(regs))
+ goto done;
+
+ state->ip = regs->ip;
+ state->sp = kernel_stack_pointer(regs);
+ state->bp = regs->bp;
+ state->regs = regs;
+ state->full_regs = true;
+ state->signal = true;
+
+ } else if (task == current) {
+ asm volatile("lea (%%rip), %0\n\t"
+ "mov %%rsp, %1\n\t"
+ "mov %%rbp, %2\n\t"
+ : "=r" (state->ip), "=r" (state->sp),
+ "=r" (state->bp));
+
+ } else {
+ struct inactive_task_frame *frame = (void *)task->thread.sp;
+
+ state->ip = frame->ret_addr;
+ state->sp = task->thread.sp;
+ state->bp = frame->bp;
+ }
+
+ if (get_stack_info((unsigned long *)state->sp, state->task,
+ &state->stack_info, &state->stack_mask))
+ return;
+
+ /*
+ * The caller can provide the address of the first frame directly
+ * (first_frame) or indirectly (regs->sp) to indicate which stack frame
+ * to start unwinding at. Skip ahead until we reach it.
+ */
+ while (!unwind_done(state) &&
+ (!on_stack(&state->stack_info, first_frame, sizeof(long)) ||
+ state->sp <= (unsigned long)first_frame))
+ unwind_next_frame(state);
+
+ return;
+
+done:
+ state->stack_info.type = STACK_TYPE_UNKNOWN;
+ return;
+}
+EXPORT_SYMBOL_GPL(__unwind_start);
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index c8a3b61..e3b7cfc 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -148,6 +148,8 @@ SECTIONS

BUG_TABLE

+ UNDWARF_TABLE
+
. = ALIGN(PAGE_SIZE);
__vvar_page = .;

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 0d64658..a8ed616 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -668,6 +668,24 @@
#define BUG_TABLE
#endif

+#ifdef CONFIG_UNDWARF_UNWINDER
+#define UNDWARF_TABLE \
+ . = ALIGN(4); \
+ .undwarf_ip : AT(ADDR(.undwarf_ip) - LOAD_OFFSET) { \
+ VMLINUX_SYMBOL(__start_undwarf_ip) = .; \
+ KEEP(*(.undwarf_ip)) \
+ VMLINUX_SYMBOL(__stop_undwarf_ip) = .; \
+ } \
+ . = ALIGN(8); \
+ .undwarf : AT(ADDR(.undwarf) - LOAD_OFFSET) { \
+ VMLINUX_SYMBOL(__start_undwarf) = .; \
+ KEEP(*(.undwarf)) \
+ VMLINUX_SYMBOL(__stop_undwarf) = .; \
+ }
+#else
+#define UNDWARF_TABLE
+#endif
+
#ifdef CONFIG_PM_TRACE
#define TRACEDATA \
. = ALIGN(4); \
@@ -854,7 +872,7 @@
DATA_DATA \
CONSTRUCTORS \
} \
- BUG_TABLE
+ BUG_TABLE \

#define INIT_TEXT_SECTION(inittext_align) \
. = ALIGN(inittext_align); \
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 9c5d40a..ec79366 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -374,6 +374,9 @@ config STACK_VALIDATION
pointers (if CONFIG_FRAME_POINTER is enabled). This helps ensure
that runtime stack traces are more reliable.

+ This is also a prerequisite for generation of the undwarf debuginfo
+ format, which is needed for CONFIG_UNDWARF_UNWINDER.
+
For more information, see
tools/objtool/Documentation/stack-validation.txt.

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 733e044..7859c79 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -258,7 +258,8 @@ ifneq ($(SKIP_STACK_VALIDATION),1)

__objtool_obj := $(objtree)/tools/objtool/objtool

-objtool_args = check
+objtool_args = $(if $(CONFIG_UNDWARF_UNWINDER),undwarf generate,check)
+
ifndef CONFIG_FRAME_POINTER
objtool_args += --no-fp
endif
@@ -276,6 +277,11 @@ objtool_obj = $(if $(patsubst y%,, \
endif # SKIP_STACK_VALIDATION
endif # CONFIG_STACK_VALIDATION

+# Rebuild all objects when objtool changes, or is enabled/disabled.
+objtool_dep = $(objtool_obj) \
+ $(wildcard include/config/undwarf/unwinder.h \
+ include/config/stack/validation.h)
+
define rule_cc_o_c
$(call echo-cmd,checksrc) $(cmd_checksrc) \
$(call cmd_and_fixdep,cc_o_c) \
@@ -298,13 +304,13 @@ cmd_undef_syms = echo
endif

# Built-in and composite module parts
-$(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_obj) FORCE
+$(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE
$(call cmd,force_checksrc)
$(call if_changed_rule,cc_o_c)

# Single-part modules are special since we need to mark them in $(MODVERDIR)

-$(single-used-m): $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_obj) FORCE
+$(single-used-m): $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE
$(call cmd,force_checksrc)
$(call if_changed_rule,cc_o_c)
@{ echo $(@:.o=.ko); echo $@; \
@@ -399,7 +405,7 @@ cmd_modversions_S = \
endif
endif

-$(obj)/%.o: $(src)/%.S $(objtool_obj) FORCE
+$(obj)/%.o: $(src)/%.S $(objtool_dep) FORCE
$(call if_changed_rule,as_o_S)

targets += $(real-objs-y) $(real-objs-m) $(lib-y)
--
2.7.5

2017-06-28 15:13:58

by Josh Poimboeuf

[permalink] [raw]
Subject: [PATCH v2 5/8] objtool, x86: add facility for asm code to provide unwind hints

Some asm (and inline asm) code does special things to the stack which
objtool can't understand. (Nor can GCC or GNU assembler, for that
matter.) In such cases we need a facility for the code to provide
annotations, so the unwinder can unwind through it.

This provides such a facility, in the form of unwind hints. They're
similar to the GNU assembler .cfi* directives, but they give more
information, and are needed in far fewer places, because objtool can
fill in the blanks by following branches and adjusting the stack pointer
for pushes and pops.
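
As a side note, the hint records locate their instruction with a
self-relative 32-bit value (".long .Lunwind_hint_ip_\@ - ." in the asm
macro below); the same encoding is what undwarf_ip() decodes for the
.undwarf_ip table at run time (see patch 8). Here's a rough standalone
sketch of that round trip -- the struct loosely mirrors struct unwind_hint,
but the names and types are illustrative only:

#include <stdio.h>
#include <stdint.h>

struct hint_rec {
	int32_t ip;		/* annotated address, relative to this field */
	int16_t cfa_offset;
	uint8_t cfa_reg;
	uint8_t type;
};

/* Reader side: add the field's own address back to recover the target. */
static uintptr_t hint_target(const struct hint_rec *rec)
{
	return (uintptr_t)&rec->ip + rec->ip;
}

/* Stand-in for a code location that a hint would annotate. */
static char fake_text[64];
static struct hint_rec rec;

int main(void)
{
	uintptr_t target = (uintptr_t)&fake_text[16];

	/*
	 * Writer side: store "target - &field".  A 32-bit offset is enough
	 * as long as the record and its target are within +/-2GB of each
	 * other, which holds for the kernel image (and for this toy).
	 */
	rec.ip = (int32_t)(target - (uintptr_t)&rec.ip);
	rec.cfa_offset = 8;

	printf("round trip ok: %d\n", hint_target(&rec) == target);
	return 0;
}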

Signed-off-by: Josh Poimboeuf <[email protected]>
---
.../x86/include/asm}/undwarf-types.h | 18 +++
arch/x86/include/asm/undwarf.h | 103 ++++++++++++
tools/objtool/Makefile | 3 +
tools/objtool/check.c | 179 ++++++++++++++++++++-
tools/objtool/check.h | 4 +-
tools/objtool/undwarf-types.h | 18 +++
6 files changed, 317 insertions(+), 8 deletions(-)
copy {tools/objtool => arch/x86/include/asm}/undwarf-types.h (84%)
create mode 100644 arch/x86/include/asm/undwarf.h

diff --git a/tools/objtool/undwarf-types.h b/arch/x86/include/asm/undwarf-types.h
similarity index 84%
copy from tools/objtool/undwarf-types.h
copy to arch/x86/include/asm/undwarf-types.h
index ef92a1d..4e5e283 100644
--- a/tools/objtool/undwarf-types.h
+++ b/arch/x86/include/asm/undwarf-types.h
@@ -57,11 +57,17 @@
*
* UNDWARF_TYPE_REGS_IRET: Used in entry code to indicate that
* cfa_reg+cfa_offset points to the iret return frame.
+ *
+ * The UNWIND_HINT macros are only used for the unwind_hint struct. They are
+ * not used for the undwarf struct due to size and complexity constraints.
*/
#define UNDWARF_TYPE_CFA 0
#define UNDWARF_TYPE_REGS 1
#define UNDWARF_TYPE_REGS_IRET 2
+#define UNWIND_HINT_TYPE_SAVE 3
+#define UNWIND_HINT_TYPE_RESTORE 4

+#ifndef __ASSEMBLY__
/*
* This struct contains a simplified version of the DWARF Call Frame
* Information standard. It contains only the necessary parts of the real
@@ -78,4 +84,16 @@ struct undwarf {
unsigned type:2;
};

+/*
+ * This struct is used by asm and inline asm code to manually annotate the
+ * location of registers on the stack for the undwarf unwinder.
+ */
+struct unwind_hint {
+ unsigned int ip;
+ short cfa_offset;
+ unsigned char cfa_reg;
+ unsigned char type;
+};
+#endif /* __ASSEMBLY__ */
+
#endif /* _UNDWARF_TYPES_H */
diff --git a/arch/x86/include/asm/undwarf.h b/arch/x86/include/asm/undwarf.h
new file mode 100644
index 0000000..41384e6
--- /dev/null
+++ b/arch/x86/include/asm/undwarf.h
@@ -0,0 +1,103 @@
+#ifndef _ASM_X86_UNDWARF_H
+#define _ASM_X86_UNDWARF_H
+
+#include "undwarf-types.h"
+
+#ifdef __ASSEMBLY__
+
+/*
+ * In asm, there are two kinds of code: normal C-type callable functions and
+ * the rest. The normal callable functions can be called by other code, and
+ * don't do anything unusual with the stack. Such normal callable functions
+ * are annotated with the ENTRY/ENDPROC macros. Most asm code falls in this
+ * category. In this case, no special debugging annotations are needed because
+ * objtool can automatically generate the .undwarf section which the undwarf
+ * unwinder reads at runtime.
+ *
+ * Anything which doesn't fall into the above category, such as syscall and
+ * interrupt handlers, tends to not be called directly by other functions, and
+ * often does unusual non-C-function-type things with the stack pointer. Such
+ * code needs to be annotated such that objtool can understand it. The
+ * following CFI hint macros are for this type of code.
+ *
+ * These macros provide hints to objtool about the state of the stack at each
+ * instruction. Objtool starts from the hints and follows the code flow,
+ * making automatic CFI adjustments when it sees pushes and pops, filling out
+ * the debuginfo as necessary. It will also warn if it sees any
+ * inconsistencies.
+ */
+.macro UNWIND_HINT cfa_reg=UNDWARF_REG_SP cfa_offset=0 type=UNDWARF_TYPE_CFA
+#ifdef CONFIG_STACK_VALIDATION
+.Lunwind_hint_ip_\@:
+ .pushsection .discard.unwind_hints
+ /* struct unwind_hint */
+ .long .Lunwind_hint_ip_\@ - .
+ .short \cfa_offset
+ .byte \cfa_reg
+ .byte \type
+ .popsection
+#endif
+.endm
+
+.macro UNWIND_HINT_EMPTY
+ UNWIND_HINT cfa_reg=UNDWARF_REG_UNDEFINED
+.endm
+
+.macro UNWIND_HINT_REGS base=rsp offset=0 indirect=0 extra=1 iret=0
+ .if \base == rsp && \indirect
+ .set cfa_reg, UNDWARF_REG_SP_INDIRECT
+ .elseif \base == rsp
+ .set cfa_reg, UNDWARF_REG_SP
+ .elseif \base == rbp
+ .set cfa_reg, UNDWARF_REG_BP
+ .elseif \base == rdi
+ .set cfa_reg, UNDWARF_REG_DI
+ .elseif \base == rdx
+ .set cfa_reg, UNDWARF_REG_DX
+ .else
+ .error "UNWIND_HINT_REGS: bad base register"
+ .endif
+
+ .set cfa_offset, \offset
+
+ .if \iret
+ .set type, UNDWARF_TYPE_REGS_IRET
+ .elseif \extra == 0
+ .set type, UNDWARF_TYPE_REGS_IRET
+ .set cfa_offset, \offset + (16*8)
+ .else
+ .set type, UNDWARF_TYPE_REGS
+ .endif
+
+ UNWIND_HINT cfa_reg=cfa_reg cfa_offset=cfa_offset type=type
+.endm
+
+.macro UNWIND_HINT_IRET_REGS base=rsp offset=0
+ UNWIND_HINT_REGS base=\base offset=\offset iret=1
+.endm
+
+.macro UNWIND_HINT_FUNC cfa_offset=8
+ UNWIND_HINT cfa_offset=\cfa_offset
+.endm
+
+#else /* !__ASSEMBLY__ */
+
+#define UNWIND_HINT(cfa_reg, cfa_offset, type) \
+ "987: \n\t" \
+ ".pushsection .discard.unwind_hints\n\t" \
+ /* struct unwind_hint */ \
+ ".long 987b - .\n\t" \
+ ".short " __stringify(cfa_offset) "\n\t" \
+ ".byte " __stringify(cfa_reg) "\n\t" \
+ ".byte " __stringify(type) "\n\t" \
+ ".popsection\n\t"
+
+#define UNWIND_HINT_SAVE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_SAVE)
+
+#define UNWIND_HINT_RESTORE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_RESTORE)
+
+
+#endif /* __ASSEMBLY__ */
+
+
+#endif /* _ASM_X86_UNDWARF_H */
diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile
index 0e2765e..7997d5c 100644
--- a/tools/objtool/Makefile
+++ b/tools/objtool/Makefile
@@ -52,6 +52,9 @@ $(OBJTOOL): $(LIBSUBCMD) $(OBJTOOL_IN)
diff -I'^#include' arch/x86/insn/inat.h ../../arch/x86/include/asm/inat.h >/dev/null && \
diff -I'^#include' arch/x86/insn/inat_types.h ../../arch/x86/include/asm/inat_types.h >/dev/null) \
|| echo "warning: objtool: x86 instruction decoder differs from kernel" >&2 )) || true
+ @(test -d ../../kernel -a -d ../../tools -a -d ../objtool && (( \
+ diff ../../arch/x86/include/asm/undwarf-types.h undwarf-types.h >/dev/null) \
+ || echo "warning: objtool: undwarf-types.h differs from kernel" >&2 )) || true
$(QUIET_LINK)$(CC) $(OBJTOOL_IN) $(LDFLAGS) -o $@


diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index f76ac4c..6f83dad 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -873,6 +873,99 @@ static int add_switch_table_alts(struct objtool_file *file)
return 0;
}

+static int read_unwind_hints(struct objtool_file *file)
+{
+ struct section *sec, *relasec;
+ struct rela *rela;
+ struct unwind_hint *hint;
+ struct instruction *insn;
+ struct cfi_reg *cfa;
+ int i;
+
+ sec = find_section_by_name(file->elf, ".discard.unwind_hints");
+ if (!sec)
+ return 0;
+
+ relasec = sec->rela;
+ if (!relasec) {
+ WARN("missing .rela.discard.unwind_hints section");
+ return -1;
+ }
+
+ if (sec->len % sizeof(struct unwind_hint)) {
+ WARN("struct unwind_hint size mismatch");
+ return -1;
+ }
+
+ file->hints = true;
+
+ for (i = 0; i < sec->len / sizeof(struct unwind_hint); i++) {
+ hint = (struct unwind_hint *)sec->data->d_buf + i;
+
+ rela = find_rela_by_dest(sec, i * sizeof(*hint));
+ if (!rela) {
+ WARN("can't find rela for unwind_hints[%d]", i);
+ return -1;
+ }
+
+ insn = find_insn(file, rela->sym->sec, rela->addend);
+ if (!insn) {
+ WARN("can't find insn for unwind_hints[%d]", i);
+ return -1;
+ }
+
+ cfa = &insn->state.cfa;
+
+ if (hint->type == UNWIND_HINT_TYPE_SAVE) {
+ insn->save = true;
+ continue;
+
+ } else if (hint->type == UNWIND_HINT_TYPE_RESTORE) {
+ insn->restore = true;
+ insn->hint = true;
+ continue;
+ }
+
+ insn->hint = true;
+
+ switch (hint->cfa_reg) {
+ case UNDWARF_REG_UNDEFINED:
+ cfa->base = CFI_UNDEFINED;
+ break;
+ case UNDWARF_REG_SP:
+ cfa->base = CFI_SP;
+ break;
+ case UNDWARF_REG_BP:
+ cfa->base = CFI_BP;
+ break;
+ case UNDWARF_REG_SP_INDIRECT:
+ cfa->base = CFI_SP_INDIRECT;
+ break;
+ case UNDWARF_REG_R10:
+ cfa->base = CFI_R10;
+ break;
+ case UNDWARF_REG_R13:
+ cfa->base = CFI_R13;
+ break;
+ case UNDWARF_REG_DI:
+ cfa->base = CFI_DI;
+ break;
+ case UNDWARF_REG_DX:
+ cfa->base = CFI_DX;
+ break;
+ default:
+ WARN_FUNC("unsupported unwind_hint cfa base reg %d",
+ insn->sec, insn->offset, hint->cfa_reg);
+ return -1;
+ }
+
+ cfa->offset = hint->cfa_offset;
+ insn->state.type = hint->type;
+ }
+
+ return 0;
+}
+
static int decode_sections(struct objtool_file *file)
{
int ret;
@@ -903,6 +996,10 @@ static int decode_sections(struct objtool_file *file)
if (ret)
return ret;

+ ret = read_unwind_hints(file);
+ if (ret)
+ return ret;
+
return 0;
}

@@ -1377,7 +1474,7 @@ static int validate_branch(struct objtool_file *file, struct instruction *first,
struct insn_state state)
{
struct alternative *alt;
- struct instruction *insn;
+ struct instruction *insn, *next_insn;
struct section *sec;
struct symbol *func = NULL;
int ret;
@@ -1392,6 +1489,8 @@ static int validate_branch(struct objtool_file *file, struct instruction *first,
}

while (1) {
+ next_insn = next_insn_same_sec(file, insn);
+
if (file->c_file && insn->func) {
if (func && func != insn->func) {
WARN("%s() falls through to next function %s()",
@@ -1403,13 +1502,54 @@ static int validate_branch(struct objtool_file *file, struct instruction *first,
func = insn->func;

if (insn->visited) {
- if (!insn_state_match(insn, &state))
+ if (!insn->hint && !insn_state_match(insn, &state))
return 1;

return 0;
}

- insn->state = state;
+ if (insn->hint) {
+ if (insn->restore) {
+ struct instruction *save_insn, *i;
+
+ i = insn;
+ save_insn = NULL;
+ func_for_each_insn_continue_reverse(file, func, i) {
+ if (i->save) {
+ save_insn = i;
+ break;
+ }
+ }
+
+ if (!save_insn) {
+ WARN_FUNC("no corresponding CFI save for CFI restore",
+ sec, insn->offset);
+ return 1;
+ }
+
+ if (!save_insn->visited) {
+ /*
+ * Oops, no state to copy yet.
+ * Hopefully we can reach this
+ * instruction from another branch
+ * after the save insn has been
+ * visited.
+ */
+ if (insn == first)
+ return 0;
+
+ WARN_FUNC("objtool isn't smart enough to handle this CFI save/restore combo",
+ sec, insn->offset);
+ return 1;
+ }
+
+ insn->state = save_insn->state;
+ }
+
+ state = insn->state;
+
+ } else
+ insn->state = state;

insn->visited = true;

@@ -1484,7 +1624,7 @@ static int validate_branch(struct objtool_file *file, struct instruction *first,
return 0;

case INSN_CONTEXT_SWITCH:
- if (func) {
+ if (func && (!next_insn || !next_insn->hint)) {
WARN_FUNC("unsupported instruction in callable function",
sec, insn->offset);
return 1;
@@ -1504,7 +1644,7 @@ static int validate_branch(struct objtool_file *file, struct instruction *first,
if (insn->dead_end)
return 0;

- insn = next_insn_same_sec(file, insn);
+ insn = next_insn;
if (!insn) {
WARN("%s: unexpected end of section", sec->name);
return 1;
@@ -1514,6 +1654,27 @@ static int validate_branch(struct objtool_file *file, struct instruction *first,
return 0;
}

+static int validate_unwind_hints(struct objtool_file *file)
+{
+ struct instruction *insn;
+ int ret, warnings = 0;
+ struct insn_state state;
+
+ if (!file->hints)
+ return 0;
+
+ clear_insn_state(&state);
+
+ for_each_insn(file, insn) {
+ if (insn->hint && !insn->visited) {
+ ret = validate_branch(file, insn, state);
+ warnings += ret;
+ }
+ }
+
+ return warnings;
+}
+
static bool is_kasan_insn(struct instruction *insn)
{
return (insn->type == INSN_CALL &&
@@ -1659,8 +1820,9 @@ int check(const char *_objname, bool _nofp, bool undwarf)
hash_init(file.insn_hash);
file.whitelist = find_section_by_name(file.elf, ".discard.func_stack_frame_non_standard");
file.rodata = find_section_by_name(file.elf, ".rodata");
- file.ignore_unreachables = false;
file.c_file = find_section_by_name(file.elf, ".comment");
+ file.ignore_unreachables = false;
+ file.hints = false;

arch_initial_func_cfi_state(&initial_func_cfi);

@@ -1677,6 +1839,11 @@ int check(const char *_objname, bool _nofp, bool undwarf)
goto out;
warnings += ret;

+ ret = validate_unwind_hints(&file);
+ if (ret < 0)
+ goto out;
+ warnings += ret;
+
if (!warnings) {
ret = validate_reachable_instructions(&file);
if (ret < 0)
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index 2fe0810..2d7cff4 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -43,7 +43,7 @@ struct instruction {
unsigned int len;
unsigned char type;
unsigned long immediate;
- bool alt_group, visited, dead_end, ignore;
+ bool alt_group, visited, dead_end, ignore, hint, save, restore;
struct symbol *call_dest;
struct instruction *jump_dest;
struct list_head alts;
@@ -58,7 +58,7 @@ struct objtool_file {
struct list_head insn_list;
DECLARE_HASHTABLE(insn_hash, 16);
struct section *rodata, *whitelist;
- bool ignore_unreachables, c_file;
+ bool ignore_unreachables, c_file, hints;
};

int check(const char *objname, bool nofp, bool undwarf);
diff --git a/tools/objtool/undwarf-types.h b/tools/objtool/undwarf-types.h
index ef92a1d..4e5e283 100644
--- a/tools/objtool/undwarf-types.h
+++ b/tools/objtool/undwarf-types.h
@@ -57,11 +57,17 @@
*
* UNDWARF_TYPE_REGS_IRET: Used in entry code to indicate that
* cfa_reg+cfa_offset points to the iret return frame.
+ *
+ * The UNWIND_HINT macros are only used for the unwind_hint struct. They are
+ * not used for the undwarf struct due to size and complexity constraints.
*/
#define UNDWARF_TYPE_CFA 0
#define UNDWARF_TYPE_REGS 1
#define UNDWARF_TYPE_REGS_IRET 2
+#define UNWIND_HINT_TYPE_SAVE 3
+#define UNWIND_HINT_TYPE_RESTORE 4

+#ifndef __ASSEMBLY__
/*
* This struct contains a simplified version of the DWARF Call Frame
* Information standard. It contains only the necessary parts of the real
@@ -78,4 +84,16 @@ struct undwarf {
unsigned type:2;
};

+/*
+ * This struct is used by asm and inline asm code to manually annotate the
+ * location of registers on the stack for the undwarf unwinder.
+ */
+struct unwind_hint {
+ unsigned int ip;
+ short cfa_offset;
+ unsigned char cfa_reg;
+ unsigned char type;
+};
+#endif /* __ASSEMBLY__ */
+
#endif /* _UNDWARF_TYPES_H */
--
2.7.5

2017-06-28 15:14:08

by Josh Poimboeuf

[permalink] [raw]
Subject: [PATCH v2 4/8] objtool: add undwarf debuginfo generation

Now that objtool knows the states of all registers on the stack for each
instruction, it's straightforward to generate debuginfo for an unwinder
to use.

Instead of generating DWARF, generate a new format called undwarf, which
is more suitable for an in-kernel unwinder. See
tools/objtool/Documentation/undwarf.txt for a more detailed description
of this new debuginfo format and why it's preferable to DWARF.
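
To illustrate how this data is meant to be consumed, here is a rough sketch of
how an unwinder might look up the entry for a given instruction address in the
two parallel sections this patch emits (.undwarf_ip holds 32-bit PC-relative
offsets, .undwarf holds the matching struct undwarf entries defined in
undwarf-types.h below). The function name, and the assumption that the arrays
have already been sorted by ip, are illustrative only and not part of the
patch:

	struct undwarf *undwarf_lookup(int *undwarf_ip, struct undwarf *undwarf,
				       unsigned int nr_entries, unsigned long ip)
	{
		unsigned int lo = 0, hi = nr_entries;

		if (!nr_entries)
			return NULL;

		/*
		 * Each .undwarf_ip entry decodes to the address of the entry
		 * itself plus its 32-bit value (see undwarf_dump.c below).
		 * Binary search for the last entry whose ip is <= the
		 * requested address; boundary checks omitted for brevity.
		 */
		while (hi - lo > 1) {
			unsigned int mid = lo + (hi - lo) / 2;
			unsigned long entry_ip = (unsigned long)&undwarf_ip[mid] +
						 undwarf_ip[mid];

			if (entry_ip <= ip)
				lo = mid;
			else
				hi = mid;
		}

		return &undwarf[lo];
	}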

Signed-off-by: Josh Poimboeuf <[email protected]>
---
tools/objtool/Build | 3 +
tools/objtool/Documentation/stack-validation.txt | 46 ++---
tools/objtool/builtin-check.c | 2 +-
tools/objtool/builtin-undwarf.c | 70 ++++++++
tools/objtool/builtin.h | 1 +
tools/objtool/check.c | 59 ++++++-
tools/objtool/check.h | 15 +-
tools/objtool/elf.c | 212 ++++++++++++++++++++--
tools/objtool/elf.h | 15 +-
tools/objtool/objtool.c | 3 +-
tools/objtool/undwarf-types.h | 81 +++++++++
tools/objtool/{builtin.h => undwarf.h} | 18 +-
tools/objtool/undwarf_dump.c | 212 ++++++++++++++++++++++
tools/objtool/undwarf_gen.c | 215 +++++++++++++++++++++++
14 files changed, 892 insertions(+), 60 deletions(-)
create mode 100644 tools/objtool/builtin-undwarf.c
create mode 100644 tools/objtool/undwarf-types.h
copy tools/objtool/{builtin.h => undwarf.h} (67%)
create mode 100644 tools/objtool/undwarf_dump.c
create mode 100644 tools/objtool/undwarf_gen.c

diff --git a/tools/objtool/Build b/tools/objtool/Build
index 6f2e198..9fb3f2f 100644
--- a/tools/objtool/Build
+++ b/tools/objtool/Build
@@ -1,6 +1,9 @@
objtool-y += arch/$(SRCARCH)/
objtool-y += builtin-check.o
+objtool-y += builtin-undwarf.o
objtool-y += check.o
+objtool-y += undwarf_gen.o
+objtool-y += undwarf_dump.o
objtool-y += elf.o
objtool-y += special.o
objtool-y += objtool.o
diff --git a/tools/objtool/Documentation/stack-validation.txt b/tools/objtool/Documentation/stack-validation.txt
index 17c1195..14c0ded 100644
--- a/tools/objtool/Documentation/stack-validation.txt
+++ b/tools/objtool/Documentation/stack-validation.txt
@@ -11,9 +11,6 @@ analyzes every .o file and ensures the validity of its stack metadata.
It enforces a set of rules on asm code and C inline assembly code so
that stack traces can be reliable.

-Currently it only checks frame pointer usage, but there are plans to add
-CFI validation for C files and CFI generation for asm files.
-
For each function, it recursively follows all possible code paths and
validates the correct frame pointer state at each instruction.

@@ -23,6 +20,10 @@ alternative execution paths to a given instruction (or set of
instructions). Similarly, it knows how to follow switch statements, for
which gcc sometimes uses jump tables.

+(Objtool also has an 'undwarf generate' subcommand which generates
+debuginfo for the undwarf unwinder. See Documentation/x86/undwarf.txt
+in the kernel tree for more details.)
+

Why do we need stack metadata validation?
-----------------------------------------
@@ -93,37 +94,14 @@ a) More reliable stack traces for frame pointer enabled kernels
or at the very end of the function after the stack frame has been
destroyed. This is an inherent limitation of frame pointers.

-b) 100% reliable stack traces for DWARF enabled kernels
-
- (NOTE: This is not yet implemented)
-
- As an alternative to frame pointers, DWARF Call Frame Information
- (CFI) metadata can be used to walk the stack. Unlike frame pointers,
- CFI metadata is out of band. So it doesn't affect runtime
- performance and it can be reliable even when interrupts or exceptions
- are involved.
-
- For C code, gcc automatically generates DWARF CFI metadata. But for
- asm code, generating CFI is a tedious manual approach which requires
- manually placed .cfi assembler macros to be scattered throughout the
- code. It's clumsy and very easy to get wrong, and it makes the real
- code harder to read.
-
- Stacktool will improve this situation in several ways. For code
- which already has CFI annotations, it will validate them. For code
- which doesn't have CFI annotations, it will generate them. So an
- architecture can opt to strip out all the manual .cfi annotations
- from their asm code and have objtool generate them instead.
-
- We might also add a runtime stack validation debug option where we
- periodically walk the stack from schedule() and/or an NMI to ensure
- that the stack metadata is sane and that we reach the bottom of the
- stack.
-
- So the benefit of objtool here will be that external tooling should
- always show perfect stack traces. And the same will be true for
- kernel warning/oops traces if the architecture has a runtime DWARF
- unwinder.
+b) Out-of-band debuginfo generation (undwarf)
+
+ As an alternative to frame pointers, undwarf metadata can be used to
+ walk the stack. Unlike frame pointers, undwarf is out of band. So
+ it doesn't affect runtime performance and it can be reliable even
+ when interrupts or exceptions are involved.
+
+ For more details, see undwarf.txt.

c) Higher live patching compatibility rate

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 365c34e..eedf089 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -52,5 +52,5 @@ int cmd_check(int argc, const char **argv)

objname = argv[0];

- return check(objname, nofp);
+ return check(objname, nofp, false);
}
diff --git a/tools/objtool/builtin-undwarf.c b/tools/objtool/builtin-undwarf.c
new file mode 100644
index 0000000..900b1e5
--- /dev/null
+++ b/tools/objtool/builtin-undwarf.c
@@ -0,0 +1,70 @@
+/*
+ * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * objtool undwarf:
+ *
+ * This command analyzes a .o file and adds an .undwarf section to it, which is
+ * used by the in-kernel "undwarf" unwinder.
+ *
+ * This command is a superset of "objtool check".
+ */
+
+#include <string.h>
+#include <subcmd/parse-options.h>
+#include "builtin.h"
+#include "check.h"
+
+
+static const char *undwarf_usage[] = {
+ "objtool undwarf generate [<options>] file.o",
+ "objtool undwarf dump file.o",
+ NULL,
+};
+
+extern const struct option check_options[];
+extern bool nofp;
+
+int cmd_undwarf(int argc, const char **argv)
+{
+ const char *objname;
+
+ argc--; argv++;
+ if (!strncmp(argv[0], "gen", 3)) {
+ argc = parse_options(argc, argv, check_options, undwarf_usage, 0);
+ if (argc != 1)
+ usage_with_options(undwarf_usage, check_options);
+
+ objname = argv[0];
+
+ return check(objname, nofp, true);
+
+ }
+
+ if (!strcmp(argv[0], "dump")) {
+ if (argc != 2)
+ usage_with_options(undwarf_usage, check_options);
+
+ objname = argv[1];
+
+ return undwarf_dump(objname);
+ }
+
+ usage_with_options(undwarf_usage, check_options);
+
+ return 0;
+}
diff --git a/tools/objtool/builtin.h b/tools/objtool/builtin.h
index 34d2ba7..0b9722f 100644
--- a/tools/objtool/builtin.h
+++ b/tools/objtool/builtin.h
@@ -18,5 +18,6 @@
#define _BUILTIN_H

extern int cmd_check(int argc, const char **argv);
+extern int cmd_undwarf(int argc, const char **argv);

#endif /* _BUILTIN_H */
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 2f80aa51..f76ac4c 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -36,8 +36,8 @@ const char *objname;
static bool nofp;
struct cfi_state initial_func_cfi;

-static struct instruction *find_insn(struct objtool_file *file,
- struct section *sec, unsigned long offset)
+struct instruction *find_insn(struct objtool_file *file,
+ struct section *sec, unsigned long offset)
{
struct instruction *insn;

@@ -253,6 +253,11 @@ static int decode_instructions(struct objtool_file *file)
if (!(sec->sh.sh_flags & SHF_EXECINSTR))
continue;

+ if (strcmp(sec->name, ".altinstr_replacement") &&
+ strcmp(sec->name, ".altinstr_aux") &&
+ strncmp(sec->name, ".discard.", 9))
+ sec->text = true;
+
for (offset = 0; offset < sec->len; offset += insn->len) {
insn = malloc(sizeof(*insn));
if (!insn) {
@@ -941,6 +946,30 @@ static bool has_valid_stack_frame(struct insn_state *state)
return false;
}

+static int update_insn_state_regs(struct instruction *insn, struct insn_state *state)
+{
+ struct cfi_reg *cfa = &state->cfa;
+ struct stack_op *op = &insn->stack_op;
+
+ if (cfa->base != CFI_SP)
+ return 0;
+
+ /* push */
+ if (op->dest.type == OP_DEST_PUSH)
+ cfa->offset += 8;
+
+ /* pop */
+ if (op->src.type == OP_SRC_POP)
+ cfa->offset -= 8;
+
+ /* add immediate to sp */
+ if (op->dest.type == OP_DEST_REG && op->src.type == OP_SRC_ADD &&
+ op->dest.reg == CFI_SP && op->src.reg == CFI_SP)
+ cfa->offset -= op->src.offset;
+
+ return 0;
+}
+
static void save_reg(struct insn_state *state, unsigned char reg, int base,
int offset)
{
@@ -1026,6 +1055,10 @@ static int update_insn_state(struct instruction *insn, struct insn_state *state)
return 0;
}

+ if (state->type == UNDWARF_TYPE_REGS ||
+ state->type == UNDWARF_TYPE_REGS_IRET)
+ return update_insn_state_regs(insn, state);
+
switch (op->dest.type) {

case OP_DEST_REG:
@@ -1317,6 +1350,10 @@ static bool insn_state_match(struct instruction *insn, struct insn_state *state)
break;
}

+ } else if (state1->type != state2->type) {
+ WARN_FUNC("stack state mismatch: type1=%d type2=%d",
+ insn->sec, insn->offset, state1->type, state2->type);
+
} else if (state1->drap != state2->drap ||
(state1->drap && state1->drap_reg != state2->drap_reg)) {
WARN_FUNC("stack state mismatch: drap1=%d(%d) drap2=%d(%d)",
@@ -1606,7 +1643,7 @@ static void cleanup(struct objtool_file *file)
elf_close(file->elf);
}

-int check(const char *_objname, bool _nofp)
+int check(const char *_objname, bool _nofp, bool undwarf)
{
struct objtool_file file;
int ret, warnings = 0;
@@ -1614,7 +1651,7 @@ int check(const char *_objname, bool _nofp)
objname = _objname;
nofp = _nofp;

- file.elf = elf_open(objname);
+ file.elf = elf_open(objname, undwarf ? O_RDWR : O_RDONLY);
if (!file.elf)
return 1;

@@ -1647,6 +1684,20 @@ int check(const char *_objname, bool _nofp)
warnings += ret;
}

+ if (undwarf) {
+ ret = create_undwarf(&file);
+ if (ret < 0)
+ goto out;
+
+ ret = create_undwarf_sections(&file);
+ if (ret < 0)
+ goto out;
+
+ ret = elf_write(file.elf);
+ if (ret < 0)
+ goto out;
+ }
+
out:
cleanup(&file);

diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index da85f5b..2fe0810 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -22,12 +22,14 @@
#include "elf.h"
#include "cfi.h"
#include "arch.h"
+#include "undwarf.h"
#include <linux/hashtable.h>

struct insn_state {
struct cfi_reg cfa;
struct cfi_reg regs[CFI_NUM_REGS];
int stack_size;
+ unsigned char type;
bool bp_scratch;
bool drap;
int drap_reg;
@@ -48,6 +50,7 @@ struct instruction {
struct symbol *func;
struct stack_op stack_op;
struct insn_state state;
+ struct undwarf undwarf;
};

struct objtool_file {
@@ -58,9 +61,19 @@ struct objtool_file {
bool ignore_unreachables, c_file;
};

-int check(const char *objname, bool nofp);
+int check(const char *objname, bool nofp, bool undwarf);
+
+struct instruction *find_insn(struct objtool_file *file,
+ struct section *sec, unsigned long offset);

#define for_each_insn(file, insn) \
list_for_each_entry(insn, &file->insn_list, list)

+#define sec_for_each_insn(file, sec, insn) \
+ for (insn = find_insn(file, sec, 0); \
+ insn && &insn->list != &file->insn_list && \
+ insn->sec == sec; \
+ insn = list_next_entry(insn, list))
+
+
#endif /* _CHECK_H */
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 1a7e8aa..6e9f980 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -30,16 +30,6 @@
#include "elf.h"
#include "warn.h"

-/*
- * Fallback for systems without this "read, mmaping if possible" cmd.
- */
-#ifndef ELF_C_READ_MMAP
-#define ELF_C_READ_MMAP ELF_C_READ
-#endif
-
-#define WARN_ELF(format, ...) \
- WARN(format ": %s", ##__VA_ARGS__, elf_errmsg(-1))
-
struct section *find_section_by_name(struct elf *elf, const char *name)
{
struct section *sec;
@@ -349,9 +339,10 @@ static int read_relas(struct elf *elf)
return 0;
}

-struct elf *elf_open(const char *name)
+struct elf *elf_open(const char *name, int flags)
{
struct elf *elf;
+ Elf_Cmd cmd;

elf_version(EV_CURRENT);

@@ -364,13 +355,20 @@ struct elf *elf_open(const char *name)

INIT_LIST_HEAD(&elf->sections);

- elf->fd = open(name, O_RDONLY);
+ elf->fd = open(name, flags);
if (elf->fd == -1) {
perror("open");
goto err;
}

- elf->elf = elf_begin(elf->fd, ELF_C_READ_MMAP, NULL);
+ if ((flags & O_ACCMODE) == O_RDONLY)
+ cmd = ELF_C_READ_MMAP;
+ else if ((flags & O_ACCMODE) == O_RDWR)
+ cmd = ELF_C_RDWR;
+ else /* O_WRONLY */
+ cmd = ELF_C_WRITE;
+
+ elf->elf = elf_begin(elf->fd, cmd, NULL);
if (!elf->elf) {
WARN_ELF("elf_begin");
goto err;
@@ -397,6 +395,194 @@ struct elf *elf_open(const char *name)
return NULL;
}

+struct section *elf_create_section(struct elf *elf, const char *name,
+ size_t entsize, int nr)
+{
+ struct section *sec, *shstrtab;
+ size_t size = entsize * nr;
+ struct Elf_Scn *s;
+ Elf_Data *data;
+
+ sec = malloc(sizeof(*sec));
+ if (!sec) {
+ perror("malloc");
+ return NULL;
+ }
+ memset(sec, 0, sizeof(*sec));
+
+ INIT_LIST_HEAD(&sec->symbol_list);
+ INIT_LIST_HEAD(&sec->rela_list);
+ hash_init(sec->rela_hash);
+ hash_init(sec->symbol_hash);
+
+ list_add_tail(&sec->list, &elf->sections);
+
+ s = elf_newscn(elf->elf);
+ if (!s) {
+ WARN_ELF("elf_newscn");
+ return NULL;
+ }
+
+ sec->name = strdup(name);
+ if (!sec->name) {
+ perror("strdup");
+ return NULL;
+ }
+
+ sec->idx = elf_ndxscn(s);
+ sec->len = size;
+ sec->changed = true;
+
+ sec->data = elf_newdata(s);
+ if (!sec->data) {
+ WARN_ELF("elf_newdata");
+ return NULL;
+ }
+
+ sec->data->d_size = size;
+ sec->data->d_align = 1;
+
+ if (size) {
+ sec->data->d_buf = malloc(size);
+ if (!sec->data->d_buf) {
+ perror("malloc");
+ return NULL;
+ }
+ memset(sec->data->d_buf, 0, size);
+ }
+
+ if (!gelf_getshdr(s, &sec->sh)) {
+ WARN_ELF("gelf_getshdr");
+ return NULL;
+ }
+
+ sec->sh.sh_size = size;
+ sec->sh.sh_entsize = entsize;
+ sec->sh.sh_type = SHT_PROGBITS;
+ sec->sh.sh_addralign = 1;
+ sec->sh.sh_flags = SHF_ALLOC;
+
+
+ /* Add section name to .shstrtab */
+ shstrtab = find_section_by_name(elf, ".shstrtab");
+ if (!shstrtab) {
+ WARN("can't find .shstrtab section");
+ return NULL;
+ }
+
+ s = elf_getscn(elf->elf, shstrtab->idx);
+ if (!s) {
+ WARN_ELF("elf_getscn");
+ return NULL;
+ }
+
+ data = elf_newdata(s);
+ if (!data) {
+ WARN_ELF("elf_newdata");
+ return NULL;
+ }
+
+ data->d_buf = sec->name;
+ data->d_size = strlen(name) + 1;
+ data->d_align = 1;
+
+ sec->sh.sh_name = shstrtab->len;
+
+ shstrtab->len += strlen(name) + 1;
+ shstrtab->changed = true;
+
+ return sec;
+}
+
+struct section *elf_create_rela_section(struct elf *elf, struct section *base)
+{
+ char *relaname;
+ struct section *sec;
+
+ relaname = malloc(strlen(base->name) + strlen(".rela") + 1);
+ if (!relaname) {
+ perror("malloc");
+ return NULL;
+ }
+ strcpy(relaname, ".rela");
+ strcat(relaname, base->name);
+
+ sec = elf_create_section(elf, relaname, sizeof(GElf_Rela), 0);
+ if (!sec)
+ return NULL;
+
+ base->rela = sec;
+ sec->base = base;
+
+ sec->sh.sh_type = SHT_RELA;
+ sec->sh.sh_addralign = 8;
+ sec->sh.sh_link = find_section_by_name(elf, ".symtab")->idx;
+ sec->sh.sh_info = base->idx;
+ sec->sh.sh_flags = SHF_INFO_LINK;
+
+ return sec;
+}
+
+int elf_rebuild_rela_section(struct section *sec)
+{
+ struct rela *rela;
+ int nr, idx = 0, size;
+ GElf_Rela *relas;
+
+ nr = 0;
+ list_for_each_entry(rela, &sec->rela_list, list)
+ nr++;
+
+ size = nr * sizeof(*relas);
+ relas = malloc(size);
+ if (!relas) {
+ perror("malloc");
+ return -1;
+ }
+
+ sec->data->d_buf = relas;
+ sec->data->d_size = size;
+
+ sec->sh.sh_size = size;
+
+ idx = 0;
+ list_for_each_entry(rela, &sec->rela_list, list) {
+ relas[idx].r_offset = rela->offset;
+ relas[idx].r_addend = rela->addend;
+ relas[idx].r_info = GELF_R_INFO(rela->sym->idx, rela->type);
+ idx++;
+ }
+
+ return 0;
+}
+
+int elf_write(struct elf *elf)
+{
+ struct section *sec;
+ Elf_Scn *s;
+
+ list_for_each_entry(sec, &elf->sections, list) {
+ if (sec->changed) {
+ s = elf_getscn(elf->elf, sec->idx);
+ if (!s) {
+ WARN_ELF("elf_getscn");
+ return -1;
+ }
+ if (!gelf_update_shdr (s, &sec->sh)) {
+ WARN_ELF("gelf_update_shdr");
+ return -1;
+ }
+ }
+ }
+
+ if (elf_update(elf->elf, ELF_C_WRITE) < 0) {
+ WARN_ELF("elf_update");
+ return -1;
+ }
+
+ return 0;
+}
+
void elf_close(struct elf *elf)
{
struct section *sec, *tmpsec;
diff --git a/tools/objtool/elf.h b/tools/objtool/elf.h
index 343968b..d86e2ff1 100644
--- a/tools/objtool/elf.h
+++ b/tools/objtool/elf.h
@@ -28,6 +28,13 @@
# define elf_getshdrstrndx elf_getshstrndx
#endif

+/*
+ * Fallback for systems without this "read, mmaping if possible" cmd.
+ */
+#ifndef ELF_C_READ_MMAP
+#define ELF_C_READ_MMAP ELF_C_READ
+#endif
+
struct section {
struct list_head list;
GElf_Shdr sh;
@@ -41,6 +48,7 @@ struct section {
char *name;
int idx;
unsigned int len;
+ bool changed, text;
};

struct symbol {
@@ -75,7 +83,7 @@ struct elf {
};


-struct elf *elf_open(const char *name);
+struct elf *elf_open(const char *name, int flags);
struct section *find_section_by_name(struct elf *elf, const char *name);
struct symbol *find_symbol_by_offset(struct section *sec, unsigned long offset);
struct symbol *find_symbol_containing(struct section *sec, unsigned long offset);
@@ -83,6 +91,11 @@ struct rela *find_rela_by_dest(struct section *sec, unsigned long offset);
struct rela *find_rela_by_dest_range(struct section *sec, unsigned long offset,
unsigned int len);
struct symbol *find_containing_func(struct section *sec, unsigned long offset);
+struct section *elf_create_section(struct elf *elf, const char *name, size_t
+ entsize, int nr);
+struct section *elf_create_rela_section(struct elf *elf, struct section *base);
+int elf_rebuild_rela_section(struct section *sec);
+int elf_write(struct elf *elf);
void elf_close(struct elf *elf);

#define for_each_sec(file, sec) \
diff --git a/tools/objtool/objtool.c b/tools/objtool/objtool.c
index ecc5b1b..b2051d1 100644
--- a/tools/objtool/objtool.c
+++ b/tools/objtool/objtool.c
@@ -42,10 +42,11 @@ struct cmd_struct {
};

static const char objtool_usage_string[] =
- "objtool [OPTIONS] COMMAND [ARGS]";
+ "objtool COMMAND [ARGS]";

static struct cmd_struct objtool_cmds[] = {
{"check", cmd_check, "Perform stack metadata validation on an object file" },
+ {"undwarf", cmd_undwarf, "Generate in-place undwarf metadata for an object file" },
};

bool help;
diff --git a/tools/objtool/undwarf-types.h b/tools/objtool/undwarf-types.h
new file mode 100644
index 0000000..ef92a1d
--- /dev/null
+++ b/tools/objtool/undwarf-types.h
@@ -0,0 +1,81 @@
+/*
+ * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _UNDWARF_TYPES_H
+#define _UNDWARF_TYPES_H
+
+/*
+ * The UNDWARF_REG_* registers are base registers which are used to find other
+ * registers on the stack.
+ *
+ * The CFA (call frame address) is the value of the stack pointer on the
+ * previous frame, i.e. the caller's SP before it called the callee.
+ *
+ * The CFA is usually based on SP, unless a frame pointer has been saved, in
+ * which case it's based on BP.
+ *
+ * BP is usually either based on CFA or is undefined (meaning its value didn't
+ * change for the current frame).
+ *
+ * So the CFA base is usually either SP or BP, and the FP base is usually either
+ * CFA or undefined. The rest of the base registers are needed for special
+ * cases like entry code and gcc aligned stacks.
+ */
+#define UNDWARF_REG_UNDEFINED 0
+#define UNDWARF_REG_CFA 1
+#define UNDWARF_REG_DX 2
+#define UNDWARF_REG_DI 3
+#define UNDWARF_REG_BP 4
+#define UNDWARF_REG_SP 5
+#define UNDWARF_REG_R10 6
+#define UNDWARF_REG_R13 7
+#define UNDWARF_REG_BP_INDIRECT 8
+#define UNDWARF_REG_SP_INDIRECT 9
+#define UNDWARF_REG_MAX 15
+
+/*
+ * UNDWARF_TYPE_CFA: Indicates that cfa_reg+cfa_offset points to the caller's
+ * stack pointer (aka the CFA in DWARF terms). Used for all callable
+ * functions, i.e. all C code and all callable asm functions.
+ *
+ * UNDWARF_TYPE_REGS: Used in entry code to indicate that cfa_reg+cfa_offset
+ * points to a fully populated pt_regs from a syscall, interrupt, or exception.
+ *
+ * UNDWARF_TYPE_REGS_IRET: Used in entry code to indicate that
+ * cfa_reg+cfa_offset points to the iret return frame.
+ */
+#define UNDWARF_TYPE_CFA 0
+#define UNDWARF_TYPE_REGS 1
+#define UNDWARF_TYPE_REGS_IRET 2
+
+/*
+ * This struct contains a simplified version of the DWARF Call Frame
+ * Information standard. It contains only the necessary parts of the real
+ * DWARF, simplified for ease of access by the in-kernel unwinder. It tells
+ * the unwinder how to find the previous SP and BP (and sometimes entry regs)
+ * on the stack for a given code address (IP). Each instance of the struct
+ * corresponds to one or more code locations.
+ */
+struct undwarf {
+ short cfa_offset;
+ short bp_offset;
+ unsigned cfa_reg:4;
+ unsigned bp_reg:4;
+ unsigned type:2;
+};
+
+#endif /* _UNDWARF_TYPES_H */
diff --git a/tools/objtool/builtin.h b/tools/objtool/undwarf.h
similarity index 67%
copy from tools/objtool/builtin.h
copy to tools/objtool/undwarf.h
index 34d2ba7..c9f5116 100644
--- a/tools/objtool/builtin.h
+++ b/tools/objtool/undwarf.h
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2015 Josh Poimboeuf <[email protected]>
+ * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
@@ -14,9 +14,17 @@
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
-#ifndef _BUILTIN_H
-#define _BUILTIN_H

-extern int cmd_check(int argc, const char **argv);
+#ifndef _UNDWARF_H
+#define _UNDWARF_H

-#endif /* _BUILTIN_H */
+#include "undwarf-types.h"
+
+struct objtool_file;
+
+int create_undwarf(struct objtool_file *file);
+int create_undwarf_sections(struct objtool_file *file);
+
+int undwarf_dump(const char *objname);
+
+#endif /* _UNDWARF_H */
diff --git a/tools/objtool/undwarf_dump.c b/tools/objtool/undwarf_dump.c
new file mode 100644
index 0000000..7bab393
--- /dev/null
+++ b/tools/objtool/undwarf_dump.c
@@ -0,0 +1,212 @@
+/*
+ * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <unistd.h>
+#include "undwarf.h"
+#include "warn.h"
+
+static const char *reg_name(unsigned int reg)
+{
+ switch (reg) {
+ case UNDWARF_REG_CFA:
+ return "cfa";
+ case UNDWARF_REG_DX:
+ return "dx";
+ case UNDWARF_REG_DI:
+ return "di";
+ case UNDWARF_REG_BP:
+ return "bp";
+ case UNDWARF_REG_SP:
+ return "sp";
+ case UNDWARF_REG_R10:
+ return "r10";
+ case UNDWARF_REG_R13:
+ return "r13";
+ case UNDWARF_REG_BP_INDIRECT:
+ return "bp(ind)";
+ case UNDWARF_REG_SP_INDIRECT:
+ return "sp(ind)";
+ default:
+ return "?";
+ }
+}
+
+static const char *undwarf_type_name(unsigned int type)
+{
+ switch (type) {
+ case UNDWARF_TYPE_CFA:
+ return "cfa";
+ case UNDWARF_TYPE_REGS:
+ return "regs";
+ case UNDWARF_TYPE_REGS_IRET:
+ return "iret";
+ default:
+ return "?";
+ }
+}
+
+static void print_reg(unsigned int reg, int offset)
+{
+ if (reg == UNDWARF_REG_BP_INDIRECT)
+ printf("(bp%+d)", offset);
+ else if (reg == UNDWARF_REG_SP_INDIRECT)
+ printf("(sp%+d)", offset);
+ else if (reg == UNDWARF_REG_UNDEFINED)
+ printf("(und)");
+ else
+ printf("%s%+d", reg_name(reg), offset);
+}
+
+int undwarf_dump(const char *_objname)
+{
+ int fd, nr_entries, i, *undwarf_ip = NULL, undwarf_size = 0;
+ struct undwarf *undwarf = NULL;
+ char *name;
+ unsigned long nr_sections, undwarf_ip_addr = 0;
+ size_t shstrtab_idx;
+ Elf *elf;
+ Elf_Scn *scn;
+ GElf_Shdr sh;
+ GElf_Rela rela;
+ GElf_Sym sym;
+ Elf_Data *data, *symtab = NULL, *rela_undwarf_ip = NULL;
+
+
+ objname = _objname;
+
+ elf_version(EV_CURRENT);
+
+ fd = open(objname, O_RDONLY);
+ if (fd == -1) {
+ perror("open");
+ return -1;
+ }
+
+ elf = elf_begin(fd, ELF_C_READ_MMAP, NULL);
+ if (!elf) {
+ WARN_ELF("elf_begin");
+ return -1;
+ }
+
+ if (elf_getshdrnum(elf, &nr_sections)) {
+ WARN_ELF("elf_getshdrnum");
+ return -1;
+ }
+
+ if (elf_getshdrstrndx(elf, &shstrtab_idx)) {
+ WARN_ELF("elf_getshdrstrndx");
+ return -1;
+ }
+
+ for (i = 0; i < nr_sections; i++) {
+ scn = elf_getscn(elf, i);
+ if (!scn) {
+ WARN_ELF("elf_getscn");
+ return -1;
+ }
+
+ if (!gelf_getshdr(scn, &sh)) {
+ WARN_ELF("gelf_getshdr");
+ return -1;
+ }
+
+ name = elf_strptr(elf, shstrtab_idx, sh.sh_name);
+ if (!name) {
+ WARN_ELF("elf_strptr");
+ return -1;
+ }
+
+ data = elf_getdata(scn, NULL);
+ if (!data) {
+ WARN_ELF("elf_getdata");
+ return -1;
+ }
+
+ if (!strcmp(name, ".symtab")) {
+ symtab = data;
+ } else if (!strcmp(name, ".undwarf")) {
+ undwarf = data->d_buf;
+ undwarf_size = sh.sh_size;
+ } else if (!strcmp(name, ".undwarf_ip")) {
+ undwarf_ip = data->d_buf;
+ undwarf_ip_addr = sh.sh_addr;
+ } else if (!strcmp(name, ".rela.undwarf_ip")) {
+ rela_undwarf_ip = data;
+ }
+ }
+
+ if (!symtab || !undwarf || !undwarf_ip)
+ return 0;
+
+ if (undwarf_size % sizeof(*undwarf) != 0) {
+ WARN("bad .undwarf section size");
+ return -1;
+ }
+
+ nr_entries = undwarf_size / sizeof(*undwarf);
+ for (i = 0; i < nr_entries; i++) {
+ if (rela_undwarf_ip) {
+ if (!gelf_getrela(rela_undwarf_ip, i, &rela)) {
+ WARN_ELF("gelf_getrela");
+ return -1;
+ }
+
+ if (!gelf_getsym(symtab, GELF_R_SYM(rela.r_info), &sym)) {
+ WARN_ELF("gelf_getsym");
+ return -1;
+ }
+
+ scn = elf_getscn(elf, sym.st_shndx);
+ if (!scn) {
+ WARN_ELF("elf_getscn");
+ return -1;
+ }
+
+ if (!gelf_getshdr(scn, &sh)) {
+ WARN_ELF("gelf_getshdr");
+ return -1;
+ }
+
+ name = elf_strptr(elf, shstrtab_idx, sh.sh_name);
+ if (!name || !*name) {
+ WARN_ELF("elf_strptr");
+ return -1;
+ }
+
+ printf("%s+%lx:", name, rela.r_addend);
+
+ } else {
+ printf("%lx:", undwarf_ip_addr + (i * sizeof(int)) + undwarf_ip[i]);
+ }
+
+
+ printf(" cfa:");
+
+ print_reg(undwarf[i].cfa_reg, undwarf[i].cfa_offset);
+
+ printf(" bp:");
+
+ print_reg(undwarf[i].bp_reg, undwarf[i].bp_offset);
+
+ printf(" type:%s\n", undwarf_type_name(undwarf[i].type));
+ }
+
+ elf_end(elf);
+ close(fd);
+
+ return 0;
+}
diff --git a/tools/objtool/undwarf_gen.c b/tools/objtool/undwarf_gen.c
new file mode 100644
index 0000000..03021d8
--- /dev/null
+++ b/tools/objtool/undwarf_gen.c
@@ -0,0 +1,215 @@
+/*
+ * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdlib.h>
+#include <string.h>
+
+#include "undwarf.h"
+#include "check.h"
+#include "warn.h"
+
+int create_undwarf(struct objtool_file *file)
+{
+ struct instruction *insn;
+
+ for_each_insn(file, insn) {
+ struct undwarf *undwarf = &insn->undwarf;
+ struct cfi_reg *cfa = &insn->state.cfa;
+ struct cfi_reg *bp = &insn->state.regs[CFI_BP];
+
+ if (cfa->base == CFI_UNDEFINED) {
+ undwarf->cfa_reg = UNDWARF_REG_UNDEFINED;
+ continue;
+ }
+
+ switch (cfa->base) {
+ case CFI_SP:
+ undwarf->cfa_reg = UNDWARF_REG_SP;
+ break;
+ case CFI_SP_INDIRECT:
+ undwarf->cfa_reg = UNDWARF_REG_SP_INDIRECT;
+ break;
+ case CFI_BP:
+ undwarf->cfa_reg = UNDWARF_REG_BP;
+ break;
+ case CFI_BP_INDIRECT:
+ undwarf->cfa_reg = UNDWARF_REG_BP_INDIRECT;
+ break;
+ case CFI_R10:
+ undwarf->cfa_reg = UNDWARF_REG_R10;
+ break;
+ case CFI_R13:
+ undwarf->cfa_reg = UNDWARF_REG_R13;
+ break;
+ case CFI_DI:
+ undwarf->cfa_reg = UNDWARF_REG_DI;
+ break;
+ case CFI_DX:
+ undwarf->cfa_reg = UNDWARF_REG_DX;
+ break;
+ default:
+ WARN_FUNC("unknown CFA base reg %d",
+ insn->sec, insn->offset, cfa->base);
+ return -1;
+ }
+
+ switch(bp->base) {
+ case CFI_UNDEFINED:
+ undwarf->bp_reg = UNDWARF_REG_UNDEFINED;
+ break;
+ case CFI_CFA:
+ undwarf->bp_reg = UNDWARF_REG_CFA;
+ break;
+ case CFI_BP:
+ undwarf->bp_reg = UNDWARF_REG_BP;
+ break;
+ default:
+ WARN_FUNC("unknown BP base reg %d",
+ insn->sec, insn->offset, bp->base);
+ return -1;
+ }
+
+ undwarf->cfa_offset = cfa->offset;
+ undwarf->bp_offset = bp->offset;
+ undwarf->type = insn->state.type;
+ }
+
+ return 0;
+}
+
+static int create_undwarf_entry(struct section *u_sec, struct section *ip_relasec,
+ unsigned int idx, struct section *insn_sec,
+ unsigned long insn_off, struct undwarf *u)
+{
+ struct undwarf *undwarf;
+ struct rela *rela;
+
+ /* populate undwarf */
+ undwarf = (struct undwarf *)u_sec->data->d_buf + idx;
+ memcpy(undwarf, u, sizeof(*undwarf));
+
+ /* populate rela for ip */
+ rela = malloc(sizeof(*rela));
+ if (!rela) {
+ perror("malloc");
+ return -1;
+ }
+ memset(rela, 0, sizeof(*rela));
+
+ rela->sym = insn_sec->sym;
+ rela->addend = insn_off;
+ rela->type = R_X86_64_PC32;
+ rela->offset = idx * sizeof(int);
+
+ list_add_tail(&rela->list, &ip_relasec->rela_list);
+ hash_add(ip_relasec->rela_hash, &rela->hash, rela->offset);
+
+ return 0;
+}
+
+int create_undwarf_sections(struct objtool_file *file)
+{
+ struct instruction *insn, *prev_insn;
+ struct section *sec, *u_sec, *ip_relasec;
+ unsigned int idx;
+
+ struct undwarf empty = {
+ .cfa_reg = UNDWARF_REG_UNDEFINED,
+ .bp_reg = UNDWARF_REG_UNDEFINED,
+ .type = UNDWARF_TYPE_CFA,
+ };
+
+ sec = find_section_by_name(file->elf, ".undwarf");
+ if (sec) {
+ WARN("file already has .undwarf section, skipping");
+ return -1;
+ }
+
+ /* count the number of needed undwarves */
+ idx = 0;
+ for_each_sec(file, sec) {
+ if (!sec->text)
+ continue;
+
+ prev_insn = NULL;
+ sec_for_each_insn(file, sec, insn) {
+ if (!prev_insn ||
+ memcmp(&insn->undwarf, &prev_insn->undwarf,
+ sizeof(struct undwarf))) {
+ idx++;
+ }
+ prev_insn = insn;
+ }
+
+ /* section terminator */
+ if (prev_insn)
+ idx++;
+ }
+ if (!idx)
+ return -1;
+
+
+ /* create .undwarf_ip and .rela.undwarf_ip sections */
+ sec = elf_create_section(file->elf, ".undwarf_ip", sizeof(int), idx);
+
+ ip_relasec = elf_create_rela_section(file->elf, sec);
+ if (!ip_relasec)
+ return -1;
+
+ /* create .undwarf section */
+ u_sec = elf_create_section(file->elf, ".undwarf",
+ sizeof(struct undwarf), idx);
+
+ /* populate sections */
+ idx = 0;
+ for_each_sec(file, sec) {
+ if (!sec->text)
+ continue;
+
+ prev_insn = NULL;
+ sec_for_each_insn(file, sec, insn) {
+ if (!prev_insn || memcmp(&insn->undwarf,
+ &prev_insn->undwarf,
+ sizeof(struct undwarf))) {
+
+ if (create_undwarf_entry(u_sec, ip_relasec, idx,
+ insn->sec, insn->offset,
+ &insn->undwarf))
+ return -1;
+
+ idx++;
+ }
+ prev_insn = insn;
+ }
+
+ /* section terminator */
+ if (prev_insn) {
+ if (create_undwarf_entry(u_sec, ip_relasec, idx,
+ prev_insn->sec,
+ prev_insn->offset + prev_insn->len,
+ &empty))
+ return -1;
+
+ idx++;
+ }
+ }
+
+ if (elf_rebuild_rela_section(ip_relasec))
+ return -1;
+
+ return 0;
+}
--
2.7.5

2017-06-28 15:13:21

by Josh Poimboeuf

[permalink] [raw]
Subject: [PATCH v2 7/8] x86/asm: add unwind hint annotations to sync_core()

This enables the undwarf unwinder to grok the iret in the middle of a C
function.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
arch/x86/include/asm/processor.h | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index f3b1b27..465e5e2 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -22,6 +22,7 @@ struct vm86;
#include <asm/nops.h>
#include <asm/special_insns.h>
#include <asm/fpu/types.h>
+#include <asm/undwarf.h>

#include <linux/personality.h>
#include <linux/cache.h>
@@ -684,6 +685,7 @@ static inline void sync_core(void)
unsigned int tmp;

asm volatile (
+ UNWIND_HINT_SAVE
"mov %%ss, %0\n\t"
"pushq %q0\n\t"
"pushq %%rsp\n\t"
@@ -693,6 +695,7 @@ static inline void sync_core(void)
"pushq %q0\n\t"
"pushq $1f\n\t"
"iretq\n\t"
+ UNWIND_HINT_RESTORE
"1:"
: "=&r" (tmp), "+r" (__sp) : : "cc", "memory");
#endif
--
2.7.5

2017-06-29 07:14:30

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] objtool: add undwarf debuginfo generation


* Josh Poimboeuf <[email protected]> wrote:

> Now that objtool knows the states of all registers on the stack for each
> instruction, it's straightforward to generate debuginfo for an unwinder
> to use.
>
> Instead of generating DWARF, generate a new format called undwarf, which
> is more suitable for an in-kernel unwinder. See
> tools/objtool/Documentation/undwarf.txt for a more detailed description
> of this new debuginfo format and why it's preferable to DWARF.
>
> Signed-off-by: Josh Poimboeuf <[email protected]>
> ---
> tools/objtool/Build | 3 +
> tools/objtool/Documentation/stack-validation.txt | 46 ++---
> tools/objtool/builtin-check.c | 2 +-
> tools/objtool/builtin-undwarf.c | 70 ++++++++
> tools/objtool/builtin.h | 1 +
> tools/objtool/check.c | 59 ++++++-
> tools/objtool/check.h | 15 +-
> tools/objtool/elf.c | 212 ++++++++++++++++++++--
> tools/objtool/elf.h | 15 +-
> tools/objtool/objtool.c | 3 +-
> tools/objtool/undwarf-types.h | 81 +++++++++

Just a very quick stylistic suggestion: please name the header 'undwarf_types.h'
(note the underscore versus hyphen), which is the common naming pattern used in
the kernel.

Thanks,

Ingo

2017-06-29 07:25:23

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] objtool: add undwarf debuginfo generation


* Josh Poimboeuf <[email protected]> wrote:

> +#ifndef _UNDWARF_TYPES_H
> +#define _UNDWARF_TYPES_H
> +
> +/*
> + * The UNDWARF_REG_* registers are base registers which are used to find other
> + * registers on the stack.
> + *
> + * The CFA (call frame address) is the value of the stack pointer on the
> + * previous frame, i.e. the caller's SP before it called the callee.
> + *
> + * The CFA is usually based on SP, unless a frame pointer has been saved, in
> + * which case it's based on BP.
> + *
> + * BP is usually either based on CFA or is undefined (meaning its value didn't
> + * change for the current frame).
> + *
> + * So the CFA base is usually either SP or BP, and the FP base is usually either
> + * CFA or undefined. The rest of the base registers are needed for special
> + * cases like entry code and gcc aligned stacks.
> + */
> +#define UNDWARF_REG_UNDEFINED 0
> +#define UNDWARF_REG_CFA 1
> +#define UNDWARF_REG_DX 2
> +#define UNDWARF_REG_DI 3
> +#define UNDWARF_REG_BP 4
> +#define UNDWARF_REG_SP 5
> +#define UNDWARF_REG_R10 6
> +#define UNDWARF_REG_R13 7
> +#define UNDWARF_REG_BP_INDIRECT 8
> +#define UNDWARF_REG_SP_INDIRECT 9
> +#define UNDWARF_REG_MAX 15
> +
> +/*
> + * UNDWARF_TYPE_CFA: Indicates that cfa_reg+cfa_offset points to the caller's
> + * stack pointer (aka the CFA in DWARF terms). Used for all callable
> + * functions, i.e. all C code and all callable asm functions.
> + *
> + * UNDWARF_TYPE_REGS: Used in entry code to indicate that cfa_reg+cfa_offset
> + * points to a fully populated pt_regs from a syscall, interrupt, or exception.
> + *
> + * UNDWARF_TYPE_REGS_IRET: Used in entry code to indicate that
> + * cfa_reg+cfa_offset points to the iret return frame.
> + */
> +#define UNDWARF_TYPE_CFA 0
> +#define UNDWARF_TYPE_REGS 1
> +#define UNDWARF_TYPE_REGS_IRET 2
> +
> +/*
> + * This struct contains a simplified version of the DWARF Call Frame
> + * Information standard. It contains only the necessary parts of the real
> + * DWARF, simplified for ease of access by the in-kernel unwinder. It tells
> + * the unwinder how to find the previous SP and BP (and sometimes entry regs)
> + * on the stack for a given code address (IP). Each instance of the struct
> + * corresponds to one or more code locations.
> + */
> +struct undwarf {
> + short cfa_offset;
> + short bp_offset;
> + unsigned cfa_reg:4;
> + unsigned bp_reg:4;
> + unsigned type:2;
> +};

I never know straight away what 'CFA' stands for - could we please use natural
names, i.e. something like:

struct undwarf {
u16 sp_offset;
u16 bp_offset;
unsigned sp_reg:4;
unsigned bp_reg:4;
unsigned type:2;
};

...

struct unwind_hint {
u32 ip;
u16 sp_offset;
u8 sp_reg;
u8 type;
};

?

Also note the slightly cleaner vertical alignment, plus the conversion to more
stable data types: I believe various bits of tooling (perf and so) will eventually
learn about undwarf, so having a well defined cross-arch data structure is
probably of advantage.

Since we are not bound by DWARF anymore, we might as well use readable names and
such?

Plus, shouldn't we use __packed for 'struct undwarf' to minimize the structure's
size (to 6 bytes AFAICS?) - or is optimal packing of the main undwarf array
already guaranteed on every platform with this layout?

Thanks,

Ingo

2017-06-29 07:56:07

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v2 0/8] x86: undwarf unwinder


* Josh Poimboeuf <[email protected]> wrote:

> Undwarf vs frame pointers
> -------------------------
>
> With frame pointers enabled, GCC adds instrumentation code to every
> function in the kernel. The kernel's .text size increases by about
> 3.2%, resulting in a broad kernel-wide slowdown. Measurements by Mel
> Gorman [1] have shown a slowdown of 5-10% for some workloads.
>
> In contrast, the undwarf unwinder has no effect on text size or runtime
> performance, because the debuginfo is out of band. So if you disable
> frame pointers and enable undwarf, you get a nice performance
> improvement across the board, and still have reliable stack traces.
>
> Another benefit of undwarf compared to frame pointers is that it can
> reliably unwind across interrupts and exceptions. Frame pointer based
> unwinds can skip the caller of the interrupted function if it was a leaf
> function or if the interrupt hit before the frame pointer was saved.
>
> The main disadvantage of undwarf compared to frame pointers is that it
> needs more memory to store the undwarf table: roughly 3-5MB depending on
> the kernel config.

Note that it's not just a performance improvement, but also an instruction cache
locality improvement: 3.2% .text savings almost directly transform into a
similarly sized reduction in cache footprint. That can transform to even higher
speedups for workloads whose cache locality is borderline.

I _really_ like this feature, and the independence of the debuginfo data format.

Logistically it's too bad we are 3 days away from the merge window to be able to
pick this up:

> 56 files changed, 3466 insertions(+), 1765 deletions(-)

OTOH most of the diffstat is in objtool.

Any objections to applying the first 3 objtool patches straight away and see
whether anything breaks? That would significantly reduce the size of the rest of
the patch set.

> I'm not tied to the 'undwarf' name, other naming ideas are welcome.

Ha, a new bike shed painting job! ;-)

I think 'undwarf' isn't a bad name, it's short, catchy and describes the purpose
of the effort.

But I cannot resist some other suggestions, after 'elf' and 'dwarf' the obvious
candidates from the peoples of Middle-earth would be:

- 'Hobbit'
- 'Eagle'
- 'Ent'
- 'Dragon'
- 'Troll'
- 'Ainur'

'struct troll_entry' has a certain charm to it.

'Eagle' is even nicer IMHO: larger than a dwarf but so much faster - and eagles
are beautiful! Plus the name is 2 letters shorter than 'undwarf', win-win.

Thanks,

Ingo

2017-06-29 13:40:49

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] objtool: add undwarf debuginfo generation

On Thu, Jun 29, 2017 at 09:14:14AM +0200, Ingo Molnar wrote:
>
> * Josh Poimboeuf <[email protected]> wrote:
>
> > Now that objtool knows the states of all registers on the stack for each
> > instruction, it's straightforward to generate debuginfo for an unwinder
> > to use.
> >
> > Instead of generating DWARF, generate a new format called undwarf, which
> > is more suitable for an in-kernel unwinder. See
> > tools/objtool/Documentation/undwarf.txt for a more detailed description
> > of this new debuginfo format and why it's preferable to DWARF.
> >
> > Signed-off-by: Josh Poimboeuf <[email protected]>
> > ---
> > tools/objtool/Build | 3 +
> > tools/objtool/Documentation/stack-validation.txt | 46 ++---
> > tools/objtool/builtin-check.c | 2 +-
> > tools/objtool/builtin-undwarf.c | 70 ++++++++
> > tools/objtool/builtin.h | 1 +
> > tools/objtool/check.c | 59 ++++++-
> > tools/objtool/check.h | 15 +-
> > tools/objtool/elf.c | 212 ++++++++++++++++++++--
> > tools/objtool/elf.h | 15 +-
> > tools/objtool/objtool.c | 3 +-
> > tools/objtool/undwarf-types.h | 81 +++++++++
>
> Just a very quick stylistic suggestion: please name the header 'undwarf_types.h'
> (note the underscore versus hyphen), which is the common naming pattern used in
> the kernel.

Ok, will rename it.

--
Josh

2017-06-29 14:04:16

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] objtool: add undwarf debuginfo generation

On Thu, Jun 29, 2017 at 09:25:12AM +0200, Ingo Molnar wrote:
> > +/*
> > + * This struct contains a simplified version of the DWARF Call Frame
> > + * Information standard. It contains only the necessary parts of the real
> > + * DWARF, simplified for ease of access by the in-kernel unwinder. It tells
> > + * the unwinder how to find the previous SP and BP (and sometimes entry regs)
> > + * on the stack for a given code address (IP). Each instance of the struct
> > + * corresponds to one or more code locations.
> > + */
> > +struct undwarf {
> > + short cfa_offset;
> > + short bp_offset;
> > + unsigned cfa_reg:4;
> > + unsigned bp_reg:4;
> > + unsigned type:2;
> > +};
>
> I never know straight away what 'CFA' stands for - could we please use natural
> names, i.e. something like:
>
> struct undwarf {
> u16 sp_offset;
> u16 bp_offset;
> unsigned sp_reg:4;
> unsigned bp_reg:4;
> unsigned type:2;
> };
>
> ...
>
> struct unwind_hint {
> u32 ip;
> u16 sp_offset;
> u8 sp_reg;
> u8 type;
> };
>
> ?
>
> Also note the slightly cleaner vertical alignment, plus the conversion to more
> stable data types: I believe various bits of tooling (perf and so) will eventually
> learn about undwarf, so having a well defined cross-arch data structure is
> probably of advantage.

I agree with all your suggestions.

(Though if we want to make it truly cross-arch, 'bp' should be 'fp', for
frame pointer. But there were some objections to that, so I'll leave it
'bp' for now.)

> Since we are not bound by DWARF anymore, we might as well use readable names and
> such?
>
> Plus, shouldn't we use __packed for 'struct undwarf' to minimize the structure's
> size (to 6 bytes AFAICS?) - or is optimal packing of the main undwarf array
> already guaranteed on every platform with this layout?

Ah yes, it should definitely be packed (assuming that doesn't affect
performance negatively).
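
As a minimal standalone sketch, the packed layout under discussion might look
like the following; the field names follow the proposal quoted above, while the
explicit types and the resulting 6-byte size are assumptions to be verified,
not something taken from the posted patch:

	#include <stdint.h>

	/* Sketch only: proposed field naming plus explicit packing. */
	struct undwarf {
		uint16_t sp_offset;
		uint16_t bp_offset;
		unsigned sp_reg:4;
		unsigned bp_reg:4;
		unsigned type:2;
	} __attribute__((packed));

	/* 2 + 2 + (10 bits of bitfields packed into 2 bytes) = 6 bytes on
	 * typical x86-64 GCC/clang builds. */
	_Static_assert(sizeof(struct undwarf) == 6, "unexpected undwarf entry size");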

--
Josh

2017-06-29 14:12:59

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 0/8] x86: undwarf unwinder

On Thu, Jun 29, 2017 at 09:55:47AM +0200, Ingo Molnar wrote:
>
> * Josh Poimboeuf <[email protected]> wrote:
>
> > Undwarf vs frame pointers
> > -------------------------
> >
> > With frame pointers enabled, GCC adds instrumentation code to every
> > function in the kernel. The kernel's .text size increases by about
> > 3.2%, resulting in a broad kernel-wide slowdown. Measurements by Mel
> > Gorman [1] have shown a slowdown of 5-10% for some workloads.
> >
> > In contrast, the undwarf unwinder has no effect on text size or runtime
> > performance, because the debuginfo is out of band. So if you disable
> > frame pointers and enable undwarf, you get a nice performance
> > improvement across the board, and still have reliable stack traces.
> >
> > Another benefit of undwarf compared to frame pointers is that it can
> > reliably unwind across interrupts and exceptions. Frame pointer based
> > unwinds can skip the caller of the interrupted function if it was a leaf
> > function or if the interrupt hit before the frame pointer was saved.
> >
> > The main disadvantage of undwarf compared to frame pointers is that it
> > needs more memory to store the undwarf table: roughly 3-5MB depending on
> > the kernel config.
>
> Note that it's not just a performance improvement, but also an instruction cache
> locality improvement: 3.2% .text savings almost directly transform into a
> similarly sized reduction in cache footprint. That can transform to even higher
> speedups for workloads whose cache locality is borderline.

I'll add that detail to the docs.

> I _really_ like this feature, and the independence of the debuginfo data format.
>
> Logistically it's too bad we are 3 days away from the merge window to be able to
> pick this up:
>
> > 56 files changed, 3466 insertions(+), 1765 deletions(-)
>
> OTOH most of the diffstat is in objtool.
>
> Any objections to applying the first 3 objtool patches straight away and see
> whether anything breaks? That would significantly reduce the size of the rest of
> the patch set.

Merging the first 3 patches now sounds good to me. They implement
"stack validation 2.0" which is a good standalone improvement even
without undwarf. I think I've already ironed out all the issues
reported by the build bot.

> > I'm not tied to the 'undwarf' name, other naming ideas are welcome.
>
> Ha, a new bike shed painting job! ;-)
>
> I think 'undwarf' isn't a bad name, it's short, catchy and describes the purpose
> of the effort.
>
> But I cannot resist some other suggestions, after 'elf' and 'dwarf' the obvious
> candidates from the peoples of Middle-earth would be:
>
> - 'Hobbit'
> - 'Eagle'
> - 'Ent'
> - 'Dragon'
> - 'Troll'
> - 'Ainur'
>
> 'struct troll_entry' has a certain charm to it.
>
> 'Eagle' is even nicer IMHO: larger than a dwarf but so much faster - and eagles
> are beautiful! Plus the name is 2 letters shorter than 'undwarf', win-win.

Finally, we get to the important part ;-)

Thus far I've been partial to undwarf, and I haven't been able to shake
it.

But I like some of your suggestions. Especially troll and hobbit. Will
need to do some more deep thinking about it :-)

--
Josh

2017-06-29 14:46:29

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] objtool: add undwarf debuginfo generation


* Josh Poimboeuf <[email protected]> wrote:

> > Plus, shouldn't we use __packed for 'struct undwarf' to minimize the
> > structure's size (to 6 bytes AFAICS?) - or is optimal packing of the main
> > undwarf array already guaranteed on every platform with this layout?
>
> Ah yes, it should definitely be packed (assuming that doesn't affect performance
> negatively).

So if I count that correctly that should shave another ~1MB off a typical ~4MB
table size?

Thanks,

Ingo

2017-06-29 15:07:03

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] objtool: add undwarf debuginfo generation

On Thu, Jun 29, 2017 at 04:46:18PM +0200, Ingo Molnar wrote:
>
> * Josh Poimboeuf <[email protected]> wrote:
>
> > > Plus, shouldn't we use __packed for 'struct undwarf' to minimize the
> > > structure's size (to 6 bytes AFAICS?) - or is optimal packing of the main
> > > undwarf array already guaranteed on every platform with this layout?
> >
> > Ah yes, it should definitely be packed (assuming that doesn't affect performance
> > negatively).
>
> So if I count that correctly that should shave another ~1MB off a typical ~4MB
> table size?

Here's what my Fedora kernel looks like *before* the packed change:

$ eu-readelf -S vmlinux |grep undwarf
[15] .undwarf_ip PROGBITS ffffffff81f776d0 011776d0 0012d9d0 0 A 0 0 1
[16] .undwarf PROGBITS ffffffff820a50a0 012a50a0 0025b3a0 0 A 0 0 1

The total undwarf data size is ~3.5MB.

There are 308852 entries in each of two parallel arrays:

* .undwarf (8 bytes/entry) = 2470816 bytes
* .undwarf_ip (4 bytes/entry) = 1235408 bytes

If we pack undwarf, reducing the size of each .undwarf entry by two
bytes, it will save 308852 * 2 = 617704 bytes.

So the savings will be ~600k, and the typical size will be reduced to ~3MB.
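
For reference, here's a user-space sketch of that arithmetic. The field
names are made up -- this is *not* the real struct undwarf layout, just
a 6-byte payload that the compiler pads to 8 bytes unless the structure
is packed:

#include <stdio.h>

/* Hypothetical fields -- only the sizes matter for this calculation. */
struct entry_unpacked {
	int		ip_offset;	/* 4 bytes */
	unsigned char	cfa_reg;	/* 1 byte  */
	unsigned char	flags;		/* 1 byte  */
};					/* sizeof == 8: 2 bytes of tail padding */

struct entry_packed {
	int		ip_offset;
	unsigned char	cfa_reg;
	unsigned char	flags;
} __attribute__((packed));		/* sizeof == 6 */

int main(void)
{
	unsigned long nr = 308852;	/* entry count from the vmlinux above */

	printf("unpacked: %lu bytes\n", nr * sizeof(struct entry_unpacked));
	printf("packed:   %lu bytes\n", nr * sizeof(struct entry_packed));
	printf("saved:    %lu bytes\n",
	       nr * (sizeof(struct entry_unpacked) -
		     sizeof(struct entry_packed)));
	return 0;
}

That prints 2470816, 1853112 and 617704, matching the numbers above.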

--
Josh

2017-06-29 17:53:42

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations

There's a bug here that will need a small change to the entry code.

Mike Galbraith reported:

WARNING: can't dereference registers at ffffc900089d7e08 for ip ffffffff81740bbb

After some looking I found that it's caused by the following code
snippet in the 'interrupt' macro in entry_64.S:

	/*
	 * Save previous stack pointer, optionally switch to interrupt stack.
	 * irq_count is used to check if a CPU is already on an interrupt stack
	 * or not. While this is essentially redundant with preempt_count it is
	 * a little cheaper to use a separate counter in the PDA (short of
	 * moving irq_enter into assembly, which would be too much work)
	 */
	movq	%rsp, %rdi
	incl	PER_CPU_VAR(irq_count)
	cmovzq	PER_CPU_VAR(irq_stack_ptr), %rsp
	UNWIND_HINT_REGS base=rdi
	pushq	%rdi
	UNWIND_HINT_REGS indirect=1

The problem is that it's changing the stack pointer *before* writing the
previous stack pointer (push %rdi). So when unwinding from an NMI which
hit between the rsp write and the rdi push, the unwinder tries to access
the regs on the previous stack (by reading rdi), but the previous stack
pointer isn't there yet, so the access is considered out of bounds.
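
To make the window explicit, here's the same sequence with the race
point marked (unwind hint annotations omitted; the comments just restate
the above):

	movq	%rsp, %rdi				# old stack pointer only lives in %rdi
	incl	PER_CPU_VAR(irq_count)
	cmovzq	PER_CPU_VAR(irq_stack_ptr), %rsp	# outermost irq: %rsp now points at the irq stack
							# <-- NMI here: the top word of the irq stack
							#     is still uninitialized
	pushq	%rdi					# previous stack pointer finally stored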

--
Josh

2017-06-29 18:50:52

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations

On Thu, Jun 29, 2017 at 10:53 AM, Josh Poimboeuf <[email protected]> wrote:
> There's a bug here that will need a small change to the entry code.
>
> Mike Galbraith reported:
>
> WARNING: can't dereference registers at ffffc900089d7e08 for ip ffffffff81740bbb
>
> After some looking I found that it's caused by the following code
> snippet in the 'interrupt' macro in entry_64.S:
>
> /*
> * Save previous stack pointer, optionally switch to interrupt stack.
> * irq_count is used to check if a CPU is already on an interrupt stack
> * or not. While this is essentially redundant with preempt_count it is
> * a little cheaper to use a separate counter in the PDA (short of
> * moving irq_enter into assembly, which would be too much work)
> */
> movq %rsp, %rdi
> incl PER_CPU_VAR(irq_count)
> cmovzq PER_CPU_VAR(irq_stack_ptr), %rsp
> UNWIND_HINT_REGS base=rdi
> pushq %rdi
> UNWIND_HINT_REGS indirect=1
>
> The problem is that it's changing the stack pointer *before* writing the
> previous stack pointer (push %rdi). So when unwinding from an NMI which
> hit between the rsp write and the rdi push, the unwinder tries to access
> the regs on the previous stack (by reading rdi), but the previous stack
> pointer isn't there yet, so the access is considered out of bounds.

Ugh, that code. Does this problem go away with this patch applied:

https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_ist&id=2231ec7e0bcc1a2bc94a17081511ab54cc6badd1

If so, want to update the patch for new kernels (shouldn't conflict
with anything except your unwind hints)?

--Andy

2017-06-29 19:07:05

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations

On Thu, Jun 29, 2017 at 11:50:18AM -0700, Andy Lutomirski wrote:
> On Thu, Jun 29, 2017 at 10:53 AM, Josh Poimboeuf <[email protected]> wrote:
> > There's a bug here that will need a small change to the entry code.
> >
> > Mike Galbraith reported:
> >
> > WARNING: can't dereference registers at ffffc900089d7e08 for ip ffffffff81740bbb
> >
> > After some looking I found that it's caused by the following code
> > snippet in the 'interrupt' macro in entry_64.S:
> >
> > /*
> > * Save previous stack pointer, optionally switch to interrupt stack.
> > * irq_count is used to check if a CPU is already on an interrupt stack
> > * or not. While this is essentially redundant with preempt_count it is
> > * a little cheaper to use a separate counter in the PDA (short of
> > * moving irq_enter into assembly, which would be too much work)
> > */
> > movq %rsp, %rdi
> > incl PER_CPU_VAR(irq_count)
> > cmovzq PER_CPU_VAR(irq_stack_ptr), %rsp
> > UNWIND_HINT_REGS base=rdi
> > pushq %rdi
> > UNWIND_HINT_REGS indirect=1
> >
> > The problem is that it's changing the stack pointer *before* writing the
> > previous stack pointer (push %rdi). So when unwinding from an NMI which
> > hit between the rsp write and the rdi push, the unwinder tries to access
> > the regs on the previous stack (by reading rdi), but the previous stack
> > pointer isn't there yet, so the access is considered out of bounds.
>
> Ugh, that code. Does this problem go away with this patch applied:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_ist&id=2231ec7e0bcc1a2bc94a17081511ab54cc6badd1
>
> If so, want to update the patch for new kernels (shouldn't conflict
> with anything except your unwind hints)?

I don't think that patch will fix it, because it still updates rsp
*before* writing the old rsp on the new stack. So there's still a
window where the "previous stack" pointer is missing.

--
Josh

2017-06-29 19:13:36

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 0/8] x86: undwarf unwinder

On Thu, Jun 29, 2017 at 09:12:56AM -0500, Josh Poimboeuf wrote:
> > > I'm not tied to the 'undwarf' name, other naming ideas are welcome.
> >
> > Ha, a new bike shed painting job! ;-)
> >
> > I think 'undwarf' isn't a bad name, it's short, catchy and describes the purpose
> > of the effort.
> >
> > But I cannot resist some other suggestions, after 'elf' and 'dwarf' the obvious
> > candidates from the peoples of Middle-earth would be:
> >
> > - 'Hobbit'
> > - 'Eagle'
> > - 'Ent'
> > - 'Dragon'
> > - 'Troll'
> > - 'Ainur'
> >
> > 'struct troll_entry' has a certain charm to it.
> >
> > 'Eagle' is even nicer IMHO: larger than a dwarf but so much faster - and eagles
> > are beautiful! Plus the name is 2 letters shorter than 'undwarf', win-win.
>
> Finally, we get to the important part ;-)
>
> Thus far I've been partial to undwarf, and I haven't been able to shake
> it.
>
> But I like some of your suggestions. Especially troll and hobbit. Will
> need to do some more deep thinking about it :-)

After doing some research (i.e., skimming the "Middle-earth peoples"
article on Wikipedia), my favorite is "Orc".

I don't have a reason other than the fact that "orc unwinder" sounds
badass. And it's short. And also, orcs are enemies of dwarves :-)

I did like the symbolism of "Eagle", but unfortunately our own universe
also has eagles, which diminishes down the word's Tolkien and Germanic
mythology connotations. And I think we can all agree that such
connotations are extremely important for an unwinder data format.

That said, while I like "orc", I also still like "undwarf", since I've
been staring at that name for a few months, and as you said, it does
describe its purpose.

So I'm leaning towards either "orc" or "undwarf".

--
Josh

2017-06-29 21:10:23

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations

On Thu, Jun 29, 2017 at 12:05 PM, Josh Poimboeuf <[email protected]> wrote:
> On Thu, Jun 29, 2017 at 11:50:18AM -0700, Andy Lutomirski wrote:
>> On Thu, Jun 29, 2017 at 10:53 AM, Josh Poimboeuf <[email protected]> wrote:
>> > There's a bug here that will need a small change to the entry code.
>> >
>> > Mike Galbraith reported:
>> >
>> > WARNING: can't dereference registers at ffffc900089d7e08 for ip ffffffff81740bbb
>> >
>> > After some looking I found that it's caused by the following code
>> > snippet in the 'interrupt' macro in entry_64.S:
>> >
>> > /*
>> > * Save previous stack pointer, optionally switch to interrupt stack.
>> > * irq_count is used to check if a CPU is already on an interrupt stack
>> > * or not. While this is essentially redundant with preempt_count it is
>> > * a little cheaper to use a separate counter in the PDA (short of
>> > * moving irq_enter into assembly, which would be too much work)
>> > */
>> > movq %rsp, %rdi
>> > incl PER_CPU_VAR(irq_count)
>> > cmovzq PER_CPU_VAR(irq_stack_ptr), %rsp
>> > UNWIND_HINT_REGS base=rdi
>> > pushq %rdi
>> > UNWIND_HINT_REGS indirect=1
>> >
>> > The problem is that it's changing the stack pointer *before* writing the
>> > previous stack pointer (push %rdi). So when unwinding from an NMI which
>> > hit between the rsp write and the rdi push, the unwinder tries to access
>> > the regs on the previous stack (by reading rdi), but the previous stack
>> > pointer isn't there yet, so the access is considered out of bounds.
>>
>> Ugh, that code. Does this problem go away with this patch applied:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_ist&id=2231ec7e0bcc1a2bc94a17081511ab54cc6badd1
>>
>> If so, want to update the patch for new kernels (shouldn't conflict
>> with anything except your unwind hints)?
>
> I don't think that patch will fix it, because it still updates rsp
> *before* writing the old rsp on the new stack. So there's still a
> window where the "previous stack" pointer is missing.

But it's in a register. Is undwarf not able to grok that? I have no
fundamental problem with pushing it to the new stack first, but the
actual asm is nastier because we don't have an addressing mode that's
*(*(gs:blahblahblah)) = reg.

At least my patch makes all the copies of this code identical, so the
problem only needs to be solved once.

2017-06-29 21:41:38

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations

On Thu, Jun 29, 2017 at 02:09:54PM -0700, Andy Lutomirski wrote:
> On Thu, Jun 29, 2017 at 12:05 PM, Josh Poimboeuf <[email protected]> wrote:
> > On Thu, Jun 29, 2017 at 11:50:18AM -0700, Andy Lutomirski wrote:
> >> On Thu, Jun 29, 2017 at 10:53 AM, Josh Poimboeuf <[email protected]> wrote:
> >> > There's a bug here that will need a small change to the entry code.
> >> >
> >> > Mike Galbraith reported:
> >> >
> >> > WARNING: can't dereference registers at ffffc900089d7e08 for ip ffffffff81740bbb
> >> >
> >> > After some looking I found that it's caused by the following code
> >> > snippet in the 'interrupt' macro in entry_64.S:
> >> >
> >> > /*
> >> > * Save previous stack pointer, optionally switch to interrupt stack.
> >> > * irq_count is used to check if a CPU is already on an interrupt stack
> >> > * or not. While this is essentially redundant with preempt_count it is
> >> > * a little cheaper to use a separate counter in the PDA (short of
> >> > * moving irq_enter into assembly, which would be too much work)
> >> > */
> >> > movq %rsp, %rdi
> >> > incl PER_CPU_VAR(irq_count)
> >> > cmovzq PER_CPU_VAR(irq_stack_ptr), %rsp
> >> > UNWIND_HINT_REGS base=rdi
> >> > pushq %rdi
> >> > UNWIND_HINT_REGS indirect=1
> >> >
> >> > The problem is that it's changing the stack pointer *before* writing the
> >> > previous stack pointer (push %rdi). So when unwinding from an NMI which
> >> > hit between the rsp write and the rdi push, the unwinder tries to access
> >> > the regs on the previous stack (by reading rdi), but the previous stack
> >> > pointer isn't there yet, so the access is considered out of bounds.
> >>
> >> Ugh, that code. Does this problem go away with this patch applied:
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_ist&id=2231ec7e0bcc1a2bc94a17081511ab54cc6badd1
> >>
> >> If so, want to update the patch for new kernels (shouldn't conflict
> >> with anything except your unwind hints)?
> >
> > I don't think that patch will fix it, because it still updates rsp
> > *before* writing the old rsp on the new stack. So there's still a
> > window where the "previous stack" pointer is missing.
>
> But it's in a register. Is undwarf not able to grok that?

Sorry, I didn't explain it very well. Undwarf can find the regs pointer
in rdi, it just doesn't trust its value.

See the stack_info.next_sp field, which is set in in_irq_stack():

	/*
	 * The next stack pointer is the first thing pushed by the entry code
	 * after switching to the irq stack.
	 */
	info->next_sp = (unsigned long *)*(end - 1);

It's a safety mechanism. The unwinder needs the last word of the irq
stack page to point to the previous stack. That way it can double check
that the stack pointer it calculates is within the bounds of either the
current stack or the previous stack.

In the above code, the previous stack pointer (or next stack pointer,
depending on your perspective) hasn't been set up before it switches
stacks. So the unwinder reads an uninitialized value into
info->next_sp, and compares that with the regs pointer, and then stops
the unwind because it thinks it went off into the weeds.
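
Roughly, the convention the unwinder relies on looks like this
(simplified sketch, not the actual in_irq_stack()/unwind code):

#include <stdbool.h>

struct stack_bounds {
	unsigned long *begin;
	unsigned long *end;	/* one past the last word of the stack */
	unsigned long *next_sp;	/* where the previous stack resumes */
};

void read_irq_stack_info(unsigned long *irq_begin, unsigned long *irq_end,
			 struct stack_bounds *info)
{
	info->begin = irq_begin;
	info->end = irq_end;

	/*
	 * The entry code is expected to push the old %rsp as the first
	 * word on the irq stack.  In the buggy window above, that word
	 * hasn't been written yet, so this picks up a stale value and
	 * the later bounds check rejects the unwind.
	 */
	info->next_sp = (unsigned long *)*(irq_end - 1);
}

/* A calculated stack pointer is only trusted if it lands in a known stack. */
bool sp_in_bounds(const struct stack_bounds *info, unsigned long *sp)
{
	return sp >= info->begin && sp < info->end;
}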

--
Josh

2017-06-29 22:59:09

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations



--Andy

> On Jun 29, 2017, at 2:41 PM, Josh Poimboeuf <[email protected]> wrote:
>
>> On Thu, Jun 29, 2017 at 02:09:54PM -0700, Andy Lutomirski wrote:
>>> On Thu, Jun 29, 2017 at 12:05 PM, Josh Poimboeuf <[email protected]> wrote:
>>>> On Thu, Jun 29, 2017 at 11:50:18AM -0700, Andy Lutomirski wrote:
>>>>> On Thu, Jun 29, 2017 at 10:53 AM, Josh Poimboeuf <[email protected]> wrote:
>>>>> There's a bug here that will need a small change to the entry code.
>>>>>
>>>>> Mike Galbraith reported:
>>>>>
>>>>> WARNING: can't dereference registers at ffffc900089d7e08 for ip ffffffff81740bbb
>>>>>
>>>>> After some looking I found that it's caused by the following code
>>>>> snippet in the 'interrupt' macro in entry_64.S:
>>>>>
>>>>> /*
>>>>> * Save previous stack pointer, optionally switch to interrupt stack.
>>>>> * irq_count is used to check if a CPU is already on an interrupt stack
>>>>> * or not. While this is essentially redundant with preempt_count it is
>>>>> * a little cheaper to use a separate counter in the PDA (short of
>>>>> * moving irq_enter into assembly, which would be too much work)
>>>>> */
>>>>> movq %rsp, %rdi
>>>>> incl PER_CPU_VAR(irq_count)
>>>>> cmovzq PER_CPU_VAR(irq_stack_ptr), %rsp
>>>>> UNWIND_HINT_REGS base=rdi
>>>>> pushq %rdi
>>>>> UNWIND_HINT_REGS indirect=1
>>>>>
>>>>> The problem is that it's changing the stack pointer *before* writing the
>>>>> previous stack pointer (push %rdi). So when unwinding from an NMI which
>>>>> hit between the rsp write and the rdi push, the unwinder tries to access
>>>>> the regs on the previous stack (by reading rdi), but the previous stack
>>>>> pointer isn't there yet, so the access is considered out of bounds.
>>>>
>>>> Ugh, that code. Does this problem go away with this patch applied:
>>>>
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_ist&id=2231ec7e0bcc1a2bc94a17081511ab54cc6badd1
>>>>
>>>> If so, want to update the patch for new kernels (shouldn't conflict
>>>> with anything except your unwind hints)?
>>>
>>> I don't think that patch will fix it, because it still updates rsp
>>> *before* writing the old rsp on the new stack. So there's still a
>>> window where the "previous stack" pointer is missing.
>>
>> But it's in a register. Is undwarf not able to grok that?
>
> Sorry, I didn't explain it very well. Undwarf can find the regs pointer
> in rdi, it just doesn't trust its value.
>
> See the stack_info.next_sp field, which is set in in_irq_stack():
>
> /*
> * The next stack pointer is the first thing pushed by the entry code
> * after switching to the irq stack.
> */
> info->next_sp = (unsigned long *)*(end - 1);
>
> It's a safety mechanism. The unwinder needs the last word of the irq
> stack page to point to the previous stack. That way it can double check
> that the stack pointer it calculates is within the bounds of either the
> current stack or the previous stack.
>
> In the above code, the previous stack pointer (or next stack pointer,
> depending on your perspective) hasn't been set up before it switches
> stacks. So the unwinder reads an uninitialized value into
> info->next_sp, and compares that with the regs pointer, and then stops
> the unwind because it thinks it went off into the weeds.
>

That should be manageable, though, I think. With my patch applied (and
maybe even without it), the only exception to that rule is if regs->sp
points just above the top of the IRQ stack and the next instruction is
push reg. In that case, the reg is exactly as trustworthy as the normal
rule.* Can you teach the unwinding code that this is okay?

* If an NMI hits right there, then it relies on unwinding out of the NMI
correctly. But the usual checks that the target stack is a valid stack
should prevent us from going off into the weeds regardless.

> --
> Josh

2017-06-30 02:12:54

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations

On Thu, Jun 29, 2017 at 03:59:04PM -0700, Andy Lutomirski wrote:
> > On Jun 29, 2017, at 2:41 PM, Josh Poimboeuf <[email protected]> wrote:
> >> On Thu, Jun 29, 2017 at 02:09:54PM -0700, Andy Lutomirski wrote:
> >>> On Thu, Jun 29, 2017 at 12:05 PM, Josh Poimboeuf <[email protected]> wrote:
> >>>> On Thu, Jun 29, 2017 at 11:50:18AM -0700, Andy Lutomirski wrote:
> >>>>> On Thu, Jun 29, 2017 at 10:53 AM, Josh Poimboeuf <[email protected]> wrote:
> >>>>> There's a bug here that will need a small change to the entry code.
> >>>>>
> >>>>> Mike Galbraith reported:
> >>>>>
> >>>>> WARNING: can't dereference registers at ffffc900089d7e08 for ip ffffffff81740bbb
> >>>>>
> >>>>> After some looking I found that it's caused by the following code
> >>>>> snippet in the 'interrupt' macro in entry_64.S:
> >>>>>
> >>>>> /*
> >>>>> * Save previous stack pointer, optionally switch to interrupt stack.
> >>>>> * irq_count is used to check if a CPU is already on an interrupt stack
> >>>>> * or not. While this is essentially redundant with preempt_count it is
> >>>>> * a little cheaper to use a separate counter in the PDA (short of
> >>>>> * moving irq_enter into assembly, which would be too much work)
> >>>>> */
> >>>>> movq %rsp, %rdi
> >>>>> incl PER_CPU_VAR(irq_count)
> >>>>> cmovzq PER_CPU_VAR(irq_stack_ptr), %rsp
> >>>>> UNWIND_HINT_REGS base=rdi
> >>>>> pushq %rdi
> >>>>> UNWIND_HINT_REGS indirect=1
> >>>>>
> >>>>> The problem is that it's changing the stack pointer *before* writing the
> >>>>> previous stack pointer (push %rdi). So when unwinding from an NMI which
> >>>>> hit between the rsp write and the rdi push, the unwinder tries to access
> >>>>> the regs on the previous stack (by reading rdi), but the previous stack
> >>>>> pointer isn't there yet, so the access is considered out of bounds.
> >>>>
> >>>> Ugh, that code. Does this problem go away with this patch applied:
> >>>>
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_ist&id=2231ec7e0bcc1a2bc94a17081511ab54cc6badd1
> >>>>
> >>>> If so, want to update the patch for new kernels (shouldn't conflict
> >>>> with anything except your unwind hints)?
> >>>
> >>> I don't think that patch will fix it, because it still updates rsp
> >>> *before* writing the old rsp on the new stack. So there's still a
> >>> window where the "previous stack" pointer is missing.
> >>
> >> But it's in a register. Is undwarf not able to grok that?
> >
> > Sorry, I didn't explain it very well. Undwarf can find the regs pointer
> > in rdi, it just doesn't trust its value.
> >
> > See the stack_info.next_sp field, which is set in in_irq_stack():
> >
> > /*
> > * The next stack pointer is the first thing pushed by the entry code
> > * after switching to the irq stack.
> > */
> > info->next_sp = (unsigned long *)*(end - 1);
> >
> > It's a safety mechanism. The unwinder needs the last word of the irq
> > stack page to point to the previous stack. That way it can double check
> > that the stack pointer it calculates is within the bounds of either the
> > current stack or the previous stack.
> >
> > In the above code, the previous stack pointer (or next stack pointer,
> > depending on your perspective) hasn't been set up before it switches
> > stacks. So the unwinder reads an uninitialized value into
> > info->next_sp, and compares that with the regs pointer, and then stops
> > the unwind because it thinks it went off into the weeds.
> >
>
> That should be manageable, though, I think. With my patch applied
> (and maybe even without it), the only exception to that rule is if
> regs->sp points just above the top of the IRQ stack and the next
> instruction is push reg. In that case, the reg is exactly as
> trustworthy as the normal rule.* Can you teach the unwinding code
> that this is okay?
>
> * If an NMI hits right there, then it relies on unwinding out of the
> NMI correctly. But the usual checks that the target stack is a valid
> stack should prevent us from going off into the weeds regardless.

But that would remove a safeguard against the undwarf data being
corrupt. Sure, it would only affect the rare case where the stack
pointer is at the top of the IRQ stack, but still...

Also, the frame pointer and guess unwinders have the same issue, and
this solution wouldn't work for them.

And, worst of all, the oops stack dumping code in show_trace_log_lvl()
also has this issue. It relies on those previous stack pointers. And
it's separated from the unwinder logic by design, so it can't ask the
unwinder where the next stack is.

--
Josh

2017-06-30 05:05:37

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations

On Thu, Jun 29, 2017 at 7:12 PM, Josh Poimboeuf <[email protected]> wrote:
> On Thu, Jun 29, 2017 at 03:59:04PM -0700, Andy Lutomirski wrote:
>> >
>> > Sorry, I didn't explain it very well. Undwarf can find the regs pointer
>> > in rdi, it just doesn't trust its value.
>> >
>> > See the stack_info.next_sp field, which is set in in_irq_stack():
>> >
>> > /*
>> > * The next stack pointer is the first thing pushed by the entry code
>> > * after switching to the irq stack.
>> > */
>> > info->next_sp = (unsigned long *)*(end - 1);
>> >
>> > It's a safety mechanism. The unwinder needs the last word of the irq
>> > stack page to point to the previous stack. That way it can double check
>> > that the stack pointer it calculates is within the bounds of either the
>> > current stack or the previous stack.
>> >
>> > In the above code, the previous stack pointer (or next stack pointer,
>> > depending on your perspective) hasn't been set up before it switches
>> > stacks. So the unwinder reads an uninitialized value into
>> > info->next_sp, and compares that with the regs pointer, and then stops
>> > the unwind because it thinks it went off into the weeds.
>> >
>>
>> That should be manageable, though, I think. With my patch applied
>> (and maybe even without it), the only exception to that rule is if
>> regs->sp points just above the top of the IRQ stack and the next
>> instruction is push reg. In that case, the reg is exactly as
>> trustworthy as the normal rule.* Can you teach the unwinding code
>> that this is okay?
>>
>> * If an NMI hits right there, then it relies on unwinding out of the
>> NMI correctly. But the usual checks that the target stack is a valid
>> stack should prevent us from going off into the weeds regardless.
>
> But that would remove a safeguard against the undwarf data being
> corrupt. Sure, it would only affect the rare case where the stack
> pointer is at the top of the IRQ stack, but still...
>
> Also, the frame pointer and guess unwinders have the same issue, and
> this solution wouldn't work for them.
>
> And, worst of all, the oops stack dumping code in show_trace_log_lvl()
> also has this issue. It relies on those previous stack pointers. And
> it's separated from the unwinder logic by design, so it can't ask the
> unwinder where the next stack is.

Ugh.

I feel like we had this debate before, and I thought it was rather
silly that the unwinder cared. After all, we already have separate
safety mechanisms to make sure that the unwinder never wanders off of
the valid stacks and that it never touches any given stack more than
once. But it is indeed useful for the oops unwinder, so c'est la vie.

That being said, I bet we could get away with this (sorry for immense
whitespace damage):

	.macro ENTER_IRQ_STACK old_rsp scratch_reg
	DEBUG_ENTRY_ASSERT_IRQS_OFF
	movq	%rsp, \old_rsp
	incl	PER_CPU_VAR(irq_count)
	jnz	.Lrecurse_irq_stack_\@

	/*
	 * Right now, we just incremented irq_count to zero, so we've
	 * claimed the IRQ stack but we haven't switched to it yet.
	 * Anything that can interrupt us here without using IST
	 * must be *extremely* careful to limit its stack usage.
	 *
	 * We write old_rsp to the IRQ stack before switching %rsp,
	 * for the benefit of the OOPS unwinder.
	 */
	movq	PER_CPU_VAR(irq_stack_ptr), \scratch_reg
	movq	\old_rsp, -8(\scratch_reg)
	leaq	-8(\scratch_reg), %rsp
	jmp	.Lout_\@

.Lrecurse_irq_stack_\@:
	pushq	\old_rsp

.Lout_\@:
	.endm

After all, it looks like all the users have a scratch reg available.

Hmm. There's another option that might be considerably nicer, though:
put the IRQ stack at a known (at link time) position *in percpu
space*. (Presumably it already is -- I haven't checked.) Then we do:

	.macro ENTER_IRQ_STACK old_rsp
	DEBUG_ENTRY_ASSERT_IRQS_OFF
	movq	%rsp, \old_rsp
	incl	PER_CPU_VAR(irq_count)

	/*
	 * Right now, if we just incremented irq_count to zero, we've
	 * claimed the IRQ stack but we haven't switched to it yet.
	 * Anything that can interrupt us here without using IST
	 * must be *extremely* careful to limit its stack usage.
	 */
	jnz	.Lpush_old_rsp_\@
	movq	\old_rsp, PER_CPU_VAR(top_word_in_irq_stack)
	movq	PER_CPU_VAR(irq_stack_ptr), %rsp
.Lpush_old_rsp_\@:
	pushq	\old_rsp
	.endm

This pushes the old pointer twice, but that's easy enough to fix if we
really cared. I think I like this variant better. What do you think?
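
For what it's worth, here's a rough kernel-side sketch of the layout
that second variant assumes (the names and the stack size are
placeholders, not from any posted patch):

#include <linux/percpu-defs.h>

#define IRQ_STACK_BYTES	(16 * 1024)	/* placeholder for the real irq stack size */

struct irq_stack_area {
	char		stack[IRQ_STACK_BYTES - sizeof(unsigned long)];
	unsigned long	top_word;	/* old %rsp gets stored here on entry */
};

DECLARE_PER_CPU_PAGE_ALIGNED(struct irq_stack_area, irq_stack_area);

With a layout like that, the store in the macro could go through a fixed
PER_CPU_VAR(irq_stack_area + IRQ_STACK_BYTES - 8) slot rather than a
separate top_word_in_irq_stack variable.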

--Andy

2017-06-30 05:42:08

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations

On Thu, Jun 29, 2017 at 10:05 PM, Andy Lutomirski <[email protected]> wrote:
> Hmm. There's another option that might be considerably nicer, though:
> put the IRQ stack at a known (at link time) position *in percpu
> space*. (Presumably it already is -- I haven't checked.) Then we do:
>
> .macro ENTER_IRQ_STACK old_rsp
> DEBUG_ENTRY_ASSERT_IRQS_OFF
> movq %rsp, \old_rsp
> incl PER_CPU_VAR(irq_count)
>
> /*
> * Right now, if we just incremented irq_count to zero, we've
> * claimed the IRQ stack but we haven't switched to it yet.
> * Anything that can interrupt us here without using IST
> * must be *extremely* careful to limit its stack usage.
> */
> jnz .Lpush_old_rsp_\@
> movq \old_rsp, PER_CPU_VAR(top_word_in_irq_stack)
> movq PER_CPU_VAR(irq_stack_ptr), %rsp
> .Lpush_old_rsp_\@:
> pushq \old_rsp
> .endm
>

How about the two commits here (well, soon to be there once gitweb catches up):

https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_irq_stack&id=0f56a55bb133cd53ccb78ca51378086296618322

If you like them, want to add them to your series?

2017-06-30 08:32:09

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v2 3/8] objtool: stack validation 2.0


* Josh Poimboeuf <[email protected]> wrote:

> This is a major rewrite of objtool. Instead of only tracking frame
> pointer changes, it now tracks all stack-related operations, including
> all register saves/restores.
>
> In addition to making stack validation more robust, this also paves the
> way for undwarf generation.
>
> Signed-off-by: Josh Poimboeuf <[email protected]>

Note, I have applied the first 3 patches, and got a bunch of new warnings on x86
64-bit allmodconfig:

arch/x86/kernel/alternative.o: warning: objtool: do_sync_core()+0x1b: unsupported instruction in callable function
arch/x86/kernel/alternative.o: warning: objtool: text_poke()+0x1a8: unsupported instruction in callable function
arch/x86/kernel/ftrace.o: warning: objtool: do_sync_core()+0x16: unsupported instruction in callable function
arch/x86/kernel/cpu/mcheck/mce.o: warning: objtool: machine_check_poll()+0x166: unsupported instruction in callable function
arch/x86/kernel/cpu/mcheck/mce.o: warning: objtool: do_machine_check()+0x147: unsupported instruction in callable function

(That's the vmlinux build - plus 4 more warnings in the modules build.)

That's with GCC 5.3.1.

Let me know if you need any more info.

Thanks,

Ingo

2017-06-30 13:11:52

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations

On Thu, Jun 29, 2017 at 10:41:44PM -0700, Andy Lutomirski wrote:
> On Thu, Jun 29, 2017 at 10:05 PM, Andy Lutomirski <[email protected]> wrote:
> > Hmm. There's another option that might be considerably nicer, though:
> > put the IRQ stack at a known (at link time) position *in percpu
> > space*. (Presumably it already is -- I haven't checked.) Then we do:
> >
> > .macro ENTER_IRQ_STACK old_rsp
> > DEBUG_ENTRY_ASSERT_IRQS_OFF
> > movq %rsp, \old_rsp
> > incl PER_CPU_VAR(irq_count)
> >
> > /*
> > * Right now, if we just incremented irq_count to zero, we've
> > * claimed the IRQ stack but we haven't switched to it yet.
> > * Anything that can interrupt us here without using IST
> > * must be *extremely* careful to limit its stack usage.
> > */
> > jnz .Lpush_old_rsp_\@
> > movq \old_rsp, PER_CPU_VAR(top_word_in_irq_stack)
> > movq PER_CPU_VAR(irq_stack_ptr), %rsp
> > .Lpush_old_rsp_\@:
> > pushq \old_rsp
> > .endm
> >
>
> How about the two commits here (well, soon to be there once gitweb catches up):
>
> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_irq_stack&id=0f56a55bb133cd53ccb78ca51378086296618322
>
> If you like them, want to add them to your series?

The second patch looks good to me, thanks. I can pick up the patches.

A few comments about the first patch:

https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_irq_stack&id=3e2aa2102cc1c5e60d4a8637bff78d0478a55059

- It uses a '693:' label instead of '.Lirqs_off_\@:'

- There's a comment I don't follow:

"Anything that can interrupt us here without using IST must be
*extremely* careful to limit its stack usage."

What specifically could interrupt there without using IST?

- Since do_softirq_own_stack() is a callable function, I think it still
needs to save rbp.

- Why change the "jmp error_exit" to "ret" in
xen_do_hypervisor_callback()?

--
Josh

Subject: [tip:core/objtool] objtool: Move checking code to check.c

Commit-ID: dcc914f44f065ef73685b37e59877a5bb3cb7358
Gitweb: http://git.kernel.org/tip/dcc914f44f065ef73685b37e59877a5bb3cb7358
Author: Josh Poimboeuf <[email protected]>
AuthorDate: Wed, 28 Jun 2017 10:11:05 -0500
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 30 Jun 2017 10:19:19 +0200

objtool: Move checking code to check.c

In preparation for the new 'objtool undwarf generate' command, which
will rely on 'objtool check', move the checking code from
builtin-check.c to check.c where it can be used by other commands.

Signed-off-by: Josh Poimboeuf <[email protected]>
Reviewed-by: Jiri Slaby <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/294c5c695fd73c1a5000bbe5960a7c9bec4ee6b4.1498659915.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
---
tools/objtool/Build | 1 +
tools/objtool/builtin-check.c | 1281 +---------------------------
tools/objtool/{builtin-check.c => check.c} | 58 +-
tools/objtool/{special.h => check.h} | 43 +-
4 files changed, 45 insertions(+), 1338 deletions(-)

diff --git a/tools/objtool/Build b/tools/objtool/Build
index d6cdece..6f2e198 100644
--- a/tools/objtool/Build
+++ b/tools/objtool/Build
@@ -1,5 +1,6 @@
objtool-y += arch/$(SRCARCH)/
objtool-y += builtin-check.o
+objtool-y += check.o
objtool-y += elf.o
objtool-y += special.o
objtool-y += objtool.o
diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 5f66697f..365c34e 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2015 Josh Poimboeuf <[email protected]>
+ * Copyright (C) 2015-2017 Josh Poimboeuf <[email protected]>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
@@ -25,1287 +25,32 @@
* For more information, see tools/objtool/Documentation/stack-validation.txt.
*/

-#include <string.h>
-#include <stdlib.h>
#include <subcmd/parse-options.h>
-
#include "builtin.h"
-#include "elf.h"
-#include "special.h"
-#include "arch.h"
-#include "warn.h"
-
-#include <linux/hashtable.h>
-#include <linux/kernel.h>
-
-#define STATE_FP_SAVED 0x1
-#define STATE_FP_SETUP 0x2
-#define STATE_FENTRY 0x4
-
-struct instruction {
- struct list_head list;
- struct hlist_node hash;
- struct section *sec;
- unsigned long offset;
- unsigned int len, state;
- unsigned char type;
- unsigned long immediate;
- bool alt_group, visited, dead_end;
- struct symbol *call_dest;
- struct instruction *jump_dest;
- struct list_head alts;
- struct symbol *func;
-};
-
-struct alternative {
- struct list_head list;
- struct instruction *insn;
-};
-
-struct objtool_file {
- struct elf *elf;
- struct list_head insn_list;
- DECLARE_HASHTABLE(insn_hash, 16);
- struct section *rodata, *whitelist;
- bool ignore_unreachables, c_file;
-};
-
-const char *objname;
-static bool nofp;
-
-static struct instruction *find_insn(struct objtool_file *file,
- struct section *sec, unsigned long offset)
-{
- struct instruction *insn;
-
- hash_for_each_possible(file->insn_hash, insn, hash, offset)
- if (insn->sec == sec && insn->offset == offset)
- return insn;
-
- return NULL;
-}
-
-static struct instruction *next_insn_same_sec(struct objtool_file *file,
- struct instruction *insn)
-{
- struct instruction *next = list_next_entry(insn, list);
-
- if (&next->list == &file->insn_list || next->sec != insn->sec)
- return NULL;
-
- return next;
-}
-
-static bool gcov_enabled(struct objtool_file *file)
-{
- struct section *sec;
- struct symbol *sym;
-
- list_for_each_entry(sec, &file->elf->sections, list)
- list_for_each_entry(sym, &sec->symbol_list, list)
- if (!strncmp(sym->name, "__gcov_.", 8))
- return true;
-
- return false;
-}
-
-#define for_each_insn(file, insn) \
- list_for_each_entry(insn, &file->insn_list, list)
-
-#define func_for_each_insn(file, func, insn) \
- for (insn = find_insn(file, func->sec, func->offset); \
- insn && &insn->list != &file->insn_list && \
- insn->sec == func->sec && \
- insn->offset < func->offset + func->len; \
- insn = list_next_entry(insn, list))
-
-#define func_for_each_insn_continue_reverse(file, func, insn) \
- for (insn = list_prev_entry(insn, list); \
- &insn->list != &file->insn_list && \
- insn->sec == func->sec && insn->offset >= func->offset; \
- insn = list_prev_entry(insn, list))
-
-#define sec_for_each_insn_from(file, insn) \
- for (; insn; insn = next_insn_same_sec(file, insn))
-
-
-/*
- * Check if the function has been manually whitelisted with the
- * STACK_FRAME_NON_STANDARD macro, or if it should be automatically whitelisted
- * due to its use of a context switching instruction.
- */
-static bool ignore_func(struct objtool_file *file, struct symbol *func)
-{
- struct rela *rela;
- struct instruction *insn;
-
- /* check for STACK_FRAME_NON_STANDARD */
- if (file->whitelist && file->whitelist->rela)
- list_for_each_entry(rela, &file->whitelist->rela->rela_list, list) {
- if (rela->sym->type == STT_SECTION &&
- rela->sym->sec == func->sec &&
- rela->addend == func->offset)
- return true;
- if (rela->sym->type == STT_FUNC && rela->sym == func)
- return true;
- }
-
- /* check if it has a context switching instruction */
- func_for_each_insn(file, func, insn)
- if (insn->type == INSN_CONTEXT_SWITCH)
- return true;
-
- return false;
-}
-
-/*
- * This checks to see if the given function is a "noreturn" function.
- *
- * For global functions which are outside the scope of this object file, we
- * have to keep a manual list of them.
- *
- * For local functions, we have to detect them manually by simply looking for
- * the lack of a return instruction.
- *
- * Returns:
- * -1: error
- * 0: no dead end
- * 1: dead end
- */
-static int __dead_end_function(struct objtool_file *file, struct symbol *func,
- int recursion)
-{
- int i;
- struct instruction *insn;
- bool empty = true;
-
- /*
- * Unfortunately these have to be hard coded because the noreturn
- * attribute isn't provided in ELF data.
- */
- static const char * const global_noreturns[] = {
- "__stack_chk_fail",
- "panic",
- "do_exit",
- "do_task_dead",
- "__module_put_and_exit",
- "complete_and_exit",
- "kvm_spurious_fault",
- "__reiserfs_panic",
- "lbug_with_loc",
- "fortify_panic",
- };
-
- if (func->bind == STB_WEAK)
- return 0;
-
- if (func->bind == STB_GLOBAL)
- for (i = 0; i < ARRAY_SIZE(global_noreturns); i++)
- if (!strcmp(func->name, global_noreturns[i]))
- return 1;
-
- if (!func->sec)
- return 0;
-
- func_for_each_insn(file, func, insn) {
- empty = false;
-
- if (insn->type == INSN_RETURN)
- return 0;
- }
-
- if (empty)
- return 0;
-
- /*
- * A function can have a sibling call instead of a return. In that
- * case, the function's dead-end status depends on whether the target
- * of the sibling call returns.
- */
- func_for_each_insn(file, func, insn) {
- if (insn->sec != func->sec ||
- insn->offset >= func->offset + func->len)
- break;
-
- if (insn->type == INSN_JUMP_UNCONDITIONAL) {
- struct instruction *dest = insn->jump_dest;
- struct symbol *dest_func;
-
- if (!dest)
- /* sibling call to another file */
- return 0;
-
- if (dest->sec != func->sec ||
- dest->offset < func->offset ||
- dest->offset >= func->offset + func->len) {
- /* local sibling call */
- dest_func = find_symbol_by_offset(dest->sec,
- dest->offset);
- if (!dest_func)
- continue;
-
- if (recursion == 5) {
- WARN_FUNC("infinite recursion (objtool bug!)",
- dest->sec, dest->offset);
- return -1;
- }
-
- return __dead_end_function(file, dest_func,
- recursion + 1);
- }
- }
-
- if (insn->type == INSN_JUMP_DYNAMIC && list_empty(&insn->alts))
- /* sibling call */
- return 0;
- }
-
- return 1;
-}
-
-static int dead_end_function(struct objtool_file *file, struct symbol *func)
-{
- return __dead_end_function(file, func, 0);
-}
-
-/*
- * Call the arch-specific instruction decoder for all the instructions and add
- * them to the global instruction list.
- */
-static int decode_instructions(struct objtool_file *file)
-{
- struct section *sec;
- struct symbol *func;
- unsigned long offset;
- struct instruction *insn;
- int ret;
-
- list_for_each_entry(sec, &file->elf->sections, list) {
-
- if (!(sec->sh.sh_flags & SHF_EXECINSTR))
- continue;
-
- for (offset = 0; offset < sec->len; offset += insn->len) {
- insn = malloc(sizeof(*insn));
- memset(insn, 0, sizeof(*insn));
-
- INIT_LIST_HEAD(&insn->alts);
- insn->sec = sec;
- insn->offset = offset;
-
- ret = arch_decode_instruction(file->elf, sec, offset,
- sec->len - offset,
- &insn->len, &insn->type,
- &insn->immediate);
- if (ret)
- return ret;
-
- if (!insn->type || insn->type > INSN_LAST) {
- WARN_FUNC("invalid instruction type %d",
- insn->sec, insn->offset, insn->type);
- return -1;
- }
-
- hash_add(file->insn_hash, &insn->hash, insn->offset);
- list_add_tail(&insn->list, &file->insn_list);
- }
-
- list_for_each_entry(func, &sec->symbol_list, list) {
- if (func->type != STT_FUNC)
- continue;
-
- if (!find_insn(file, sec, func->offset)) {
- WARN("%s(): can't find starting instruction",
- func->name);
- return -1;
- }
-
- func_for_each_insn(file, func, insn)
- if (!insn->func)
- insn->func = func;
- }
- }
-
- return 0;
-}
-
-/*
- * Find all uses of the unreachable() macro, which are code path dead ends.
- */
-static int add_dead_ends(struct objtool_file *file)
-{
- struct section *sec;
- struct rela *rela;
- struct instruction *insn;
- bool found;
-
- sec = find_section_by_name(file->elf, ".rela.discard.unreachable");
- if (!sec)
- return 0;
-
- list_for_each_entry(rela, &sec->rela_list, list) {
- if (rela->sym->type != STT_SECTION) {
- WARN("unexpected relocation symbol type in %s", sec->name);
- return -1;
- }
- insn = find_insn(file, rela->sym->sec, rela->addend);
- if (insn)
- insn = list_prev_entry(insn, list);
- else if (rela->addend == rela->sym->sec->len) {
- found = false;
- list_for_each_entry_reverse(insn, &file->insn_list, list) {
- if (insn->sec == rela->sym->sec) {
- found = true;
- break;
- }
- }
-
- if (!found) {
- WARN("can't find unreachable insn at %s+0x%x",
- rela->sym->sec->name, rela->addend);
- return -1;
- }
- } else {
- WARN("can't find unreachable insn at %s+0x%x",
- rela->sym->sec->name, rela->addend);
- return -1;
- }
-
- insn->dead_end = true;
- }
-
- return 0;
-}
-
-/*
- * Warnings shouldn't be reported for ignored functions.
- */
-static void add_ignores(struct objtool_file *file)
-{
- struct instruction *insn;
- struct section *sec;
- struct symbol *func;
-
- list_for_each_entry(sec, &file->elf->sections, list) {
- list_for_each_entry(func, &sec->symbol_list, list) {
- if (func->type != STT_FUNC)
- continue;
-
- if (!ignore_func(file, func))
- continue;
-
- func_for_each_insn(file, func, insn)
- insn->visited = true;
- }
- }
-}
-
-/*
- * Find the destination instructions for all jumps.
- */
-static int add_jump_destinations(struct objtool_file *file)
-{
- struct instruction *insn;
- struct rela *rela;
- struct section *dest_sec;
- unsigned long dest_off;
-
- for_each_insn(file, insn) {
- if (insn->type != INSN_JUMP_CONDITIONAL &&
- insn->type != INSN_JUMP_UNCONDITIONAL)
- continue;
-
- /* skip ignores */
- if (insn->visited)
- continue;
-
- rela = find_rela_by_dest_range(insn->sec, insn->offset,
- insn->len);
- if (!rela) {
- dest_sec = insn->sec;
- dest_off = insn->offset + insn->len + insn->immediate;
- } else if (rela->sym->type == STT_SECTION) {
- dest_sec = rela->sym->sec;
- dest_off = rela->addend + 4;
- } else if (rela->sym->sec->idx) {
- dest_sec = rela->sym->sec;
- dest_off = rela->sym->sym.st_value + rela->addend + 4;
- } else {
- /* sibling call */
- insn->jump_dest = 0;
- continue;
- }
-
- insn->jump_dest = find_insn(file, dest_sec, dest_off);
- if (!insn->jump_dest) {
-
- /*
- * This is a special case where an alt instruction
- * jumps past the end of the section. These are
- * handled later in handle_group_alt().
- */
- if (!strcmp(insn->sec->name, ".altinstr_replacement"))
- continue;
-
- WARN_FUNC("can't find jump dest instruction at %s+0x%lx",
- insn->sec, insn->offset, dest_sec->name,
- dest_off);
- return -1;
- }
- }
-
- return 0;
-}
-
-/*
- * Find the destination instructions for all calls.
- */
-static int add_call_destinations(struct objtool_file *file)
-{
- struct instruction *insn;
- unsigned long dest_off;
- struct rela *rela;
-
- for_each_insn(file, insn) {
- if (insn->type != INSN_CALL)
- continue;
-
- rela = find_rela_by_dest_range(insn->sec, insn->offset,
- insn->len);
- if (!rela) {
- dest_off = insn->offset + insn->len + insn->immediate;
- insn->call_dest = find_symbol_by_offset(insn->sec,
- dest_off);
- if (!insn->call_dest) {
- WARN_FUNC("can't find call dest symbol at offset 0x%lx",
- insn->sec, insn->offset, dest_off);
- return -1;
- }
- } else if (rela->sym->type == STT_SECTION) {
- insn->call_dest = find_symbol_by_offset(rela->sym->sec,
- rela->addend+4);
- if (!insn->call_dest ||
- insn->call_dest->type != STT_FUNC) {
- WARN_FUNC("can't find call dest symbol at %s+0x%x",
- insn->sec, insn->offset,
- rela->sym->sec->name,
- rela->addend + 4);
- return -1;
- }
- } else
- insn->call_dest = rela->sym;
- }
-
- return 0;
-}
-
-/*
- * The .alternatives section requires some extra special care, over and above
- * what other special sections require:
- *
- * 1. Because alternatives are patched in-place, we need to insert a fake jump
- * instruction at the end so that validate_branch() skips all the original
- * replaced instructions when validating the new instruction path.
- *
- * 2. An added wrinkle is that the new instruction length might be zero. In
- * that case the old instructions are replaced with noops. We simulate that
- * by creating a fake jump as the only new instruction.
- *
- * 3. In some cases, the alternative section includes an instruction which
- * conditionally jumps to the _end_ of the entry. We have to modify these
- * jumps' destinations to point back to .text rather than the end of the
- * entry in .altinstr_replacement.
- *
- * 4. It has been requested that we don't validate the !POPCNT feature path
- * which is a "very very small percentage of machines".
- */
-static int handle_group_alt(struct objtool_file *file,
- struct special_alt *special_alt,
- struct instruction *orig_insn,
- struct instruction **new_insn)
-{
- struct instruction *last_orig_insn, *last_new_insn, *insn, *fake_jump;
- unsigned long dest_off;
-
- last_orig_insn = NULL;
- insn = orig_insn;
- sec_for_each_insn_from(file, insn) {
- if (insn->offset >= special_alt->orig_off + special_alt->orig_len)
- break;
-
- if (special_alt->skip_orig)
- insn->type = INSN_NOP;
-
- insn->alt_group = true;
- last_orig_insn = insn;
- }
-
- if (!next_insn_same_sec(file, last_orig_insn)) {
- WARN("%s: don't know how to handle alternatives at end of section",
- special_alt->orig_sec->name);
- return -1;
- }
-
- fake_jump = malloc(sizeof(*fake_jump));
- if (!fake_jump) {
- WARN("malloc failed");
- return -1;
- }
- memset(fake_jump, 0, sizeof(*fake_jump));
- INIT_LIST_HEAD(&fake_jump->alts);
- fake_jump->sec = special_alt->new_sec;
- fake_jump->offset = -1;
- fake_jump->type = INSN_JUMP_UNCONDITIONAL;
- fake_jump->jump_dest = list_next_entry(last_orig_insn, list);
-
- if (!special_alt->new_len) {
- *new_insn = fake_jump;
- return 0;
- }
-
- last_new_insn = NULL;
- insn = *new_insn;
- sec_for_each_insn_from(file, insn) {
- if (insn->offset >= special_alt->new_off + special_alt->new_len)
- break;
-
- last_new_insn = insn;
-
- if (insn->type != INSN_JUMP_CONDITIONAL &&
- insn->type != INSN_JUMP_UNCONDITIONAL)
- continue;
-
- if (!insn->immediate)
- continue;
-
- dest_off = insn->offset + insn->len + insn->immediate;
- if (dest_off == special_alt->new_off + special_alt->new_len)
- insn->jump_dest = fake_jump;
-
- if (!insn->jump_dest) {
- WARN_FUNC("can't find alternative jump destination",
- insn->sec, insn->offset);
- return -1;
- }
- }
-
- if (!last_new_insn) {
- WARN_FUNC("can't find last new alternative instruction",
- special_alt->new_sec, special_alt->new_off);
- return -1;
- }
-
- list_add(&fake_jump->list, &last_new_insn->list);
-
- return 0;
-}
-
-/*
- * A jump table entry can either convert a nop to a jump or a jump to a nop.
- * If the original instruction is a jump, make the alt entry an effective nop
- * by just skipping the original instruction.
- */
-static int handle_jump_alt(struct objtool_file *file,
- struct special_alt *special_alt,
- struct instruction *orig_insn,
- struct instruction **new_insn)
-{
- if (orig_insn->type == INSN_NOP)
- return 0;
-
- if (orig_insn->type != INSN_JUMP_UNCONDITIONAL) {
- WARN_FUNC("unsupported instruction at jump label",
- orig_insn->sec, orig_insn->offset);
- return -1;
- }
-
- *new_insn = list_next_entry(orig_insn, list);
- return 0;
-}
-
-/*
- * Read all the special sections which have alternate instructions which can be
- * patched in or redirected to at runtime. Each instruction having alternate
- * instruction(s) has them added to its insn->alts list, which will be
- * traversed in validate_branch().
- */
-static int add_special_section_alts(struct objtool_file *file)
-{
- struct list_head special_alts;
- struct instruction *orig_insn, *new_insn;
- struct special_alt *special_alt, *tmp;
- struct alternative *alt;
- int ret;
-
- ret = special_get_alts(file->elf, &special_alts);
- if (ret)
- return ret;
-
- list_for_each_entry_safe(special_alt, tmp, &special_alts, list) {
- alt = malloc(sizeof(*alt));
- if (!alt) {
- WARN("malloc failed");
- ret = -1;
- goto out;
- }
-
- orig_insn = find_insn(file, special_alt->orig_sec,
- special_alt->orig_off);
- if (!orig_insn) {
- WARN_FUNC("special: can't find orig instruction",
- special_alt->orig_sec, special_alt->orig_off);
- ret = -1;
- goto out;
- }
+#include "check.h"

- new_insn = NULL;
- if (!special_alt->group || special_alt->new_len) {
- new_insn = find_insn(file, special_alt->new_sec,
- special_alt->new_off);
- if (!new_insn) {
- WARN_FUNC("special: can't find new instruction",
- special_alt->new_sec,
- special_alt->new_off);
- ret = -1;
- goto out;
- }
- }
+bool nofp;

- if (special_alt->group) {
- ret = handle_group_alt(file, special_alt, orig_insn,
- &new_insn);
- if (ret)
- goto out;
- } else if (special_alt->jump_or_nop) {
- ret = handle_jump_alt(file, special_alt, orig_insn,
- &new_insn);
- if (ret)
- goto out;
- }
-
- alt->insn = new_insn;
- list_add_tail(&alt->list, &orig_insn->alts);
-
- list_del(&special_alt->list);
- free(special_alt);
- }
-
-out:
- return ret;
-}
-
-static int add_switch_table(struct objtool_file *file, struct symbol *func,
- struct instruction *insn, struct rela *table,
- struct rela *next_table)
-{
- struct rela *rela = table;
- struct instruction *alt_insn;
- struct alternative *alt;
-
- list_for_each_entry_from(rela, &file->rodata->rela->rela_list, list) {
- if (rela == next_table)
- break;
-
- if (rela->sym->sec != insn->sec ||
- rela->addend <= func->offset ||
- rela->addend >= func->offset + func->len)
- break;
-
- alt_insn = find_insn(file, insn->sec, rela->addend);
- if (!alt_insn) {
- WARN("%s: can't find instruction at %s+0x%x",
- file->rodata->rela->name, insn->sec->name,
- rela->addend);
- return -1;
- }
-
- alt = malloc(sizeof(*alt));
- if (!alt) {
- WARN("malloc failed");
- return -1;
- }
-
- alt->insn = alt_insn;
- list_add_tail(&alt->list, &insn->alts);
- }
-
- return 0;
-}
-
-/*
- * find_switch_table() - Given a dynamic jump, find the switch jump table in
- * .rodata associated with it.
- *
- * There are 3 basic patterns:
- *
- * 1. jmpq *[rodata addr](,%reg,8)
- *
- * This is the most common case by far. It jumps to an address in a simple
- * jump table which is stored in .rodata.
- *
- * 2. jmpq *[rodata addr](%rip)
- *
- * This is caused by a rare GCC quirk, currently only seen in three driver
- * functions in the kernel, only with certain obscure non-distro configs.
- *
- * As part of an optimization, GCC makes a copy of an existing switch jump
- * table, modifies it, and then hard-codes the jump (albeit with an indirect
- * jump) to use a single entry in the table. The rest of the jump table and
- * some of its jump targets remain as dead code.
- *
- * In such a case we can just crudely ignore all unreachable instruction
- * warnings for the entire object file. Ideally we would just ignore them
- * for the function, but that would require redesigning the code quite a
- * bit. And honestly that's just not worth doing: unreachable instruction
- * warnings are of questionable value anyway, and this is such a rare issue.
- *
- * 3. mov [rodata addr],%reg1
- * ... some instructions ...
- * jmpq *(%reg1,%reg2,8)
- *
- * This is a fairly uncommon pattern which is new for GCC 6. As of this
- * writing, there are 11 occurrences of it in the allmodconfig kernel.
- *
- * TODO: Once we have DWARF CFI and smarter instruction decoding logic,
- * ensure the same register is used in the mov and jump instructions.
- */
-static struct rela *find_switch_table(struct objtool_file *file,
- struct symbol *func,
- struct instruction *insn)
-{
- struct rela *text_rela, *rodata_rela;
- struct instruction *orig_insn = insn;
-
- text_rela = find_rela_by_dest_range(insn->sec, insn->offset, insn->len);
- if (text_rela && text_rela->sym == file->rodata->sym) {
- /* case 1 */
- rodata_rela = find_rela_by_dest(file->rodata,
- text_rela->addend);
- if (rodata_rela)
- return rodata_rela;
-
- /* case 2 */
- rodata_rela = find_rela_by_dest(file->rodata,
- text_rela->addend + 4);
- if (!rodata_rela)
- return NULL;
- file->ignore_unreachables = true;
- return rodata_rela;
- }
-
- /* case 3 */
- func_for_each_insn_continue_reverse(file, func, insn) {
- if (insn->type == INSN_JUMP_DYNAMIC)
- break;
-
- /* allow small jumps within the range */
- if (insn->type == INSN_JUMP_UNCONDITIONAL &&
- insn->jump_dest &&
- (insn->jump_dest->offset <= insn->offset ||
- insn->jump_dest->offset > orig_insn->offset))
- break;
-
- /* look for a relocation which references .rodata */
- text_rela = find_rela_by_dest_range(insn->sec, insn->offset,
- insn->len);
- if (!text_rela || text_rela->sym != file->rodata->sym)
- continue;
-
- /*
- * Make sure the .rodata address isn't associated with a
- * symbol. gcc jump tables are anonymous data.
- */
- if (find_symbol_containing(file->rodata, text_rela->addend))
- continue;
-
- return find_rela_by_dest(file->rodata, text_rela->addend);
- }
-
- return NULL;
-}
-
-static int add_func_switch_tables(struct objtool_file *file,
- struct symbol *func)
-{
- struct instruction *insn, *prev_jump = NULL;
- struct rela *rela, *prev_rela = NULL;
- int ret;
-
- func_for_each_insn(file, func, insn) {
- if (insn->type != INSN_JUMP_DYNAMIC)
- continue;
-
- rela = find_switch_table(file, func, insn);
- if (!rela)
- continue;
-
- /*
- * We found a switch table, but we don't know yet how big it
- * is. Don't add it until we reach the end of the function or
- * the beginning of another switch table in the same function.
- */
- if (prev_jump) {
- ret = add_switch_table(file, func, prev_jump, prev_rela,
- rela);
- if (ret)
- return ret;
- }
-
- prev_jump = insn;
- prev_rela = rela;
- }
-
- if (prev_jump) {
- ret = add_switch_table(file, func, prev_jump, prev_rela, NULL);
- if (ret)
- return ret;
- }
-
- return 0;
-}
-
-/*
- * For some switch statements, gcc generates a jump table in the .rodata
- * section which contains a list of addresses within the function to jump to.
- * This finds these jump tables and adds them to the insn->alts lists.
- */
-static int add_switch_table_alts(struct objtool_file *file)
-{
- struct section *sec;
- struct symbol *func;
- int ret;
-
- if (!file->rodata || !file->rodata->rela)
- return 0;
-
- list_for_each_entry(sec, &file->elf->sections, list) {
- list_for_each_entry(func, &sec->symbol_list, list) {
- if (func->type != STT_FUNC)
- continue;
-
- ret = add_func_switch_tables(file, func);
- if (ret)
- return ret;
- }
- }
-
- return 0;
-}
-
-static int decode_sections(struct objtool_file *file)
-{
- int ret;
-
- ret = decode_instructions(file);
- if (ret)
- return ret;
-
- ret = add_dead_ends(file);
- if (ret)
- return ret;
-
- add_ignores(file);
-
- ret = add_jump_destinations(file);
- if (ret)
- return ret;
-
- ret = add_call_destinations(file);
- if (ret)
- return ret;
-
- ret = add_special_section_alts(file);
- if (ret)
- return ret;
-
- ret = add_switch_table_alts(file);
- if (ret)
- return ret;
-
- return 0;
-}
-
-static bool is_fentry_call(struct instruction *insn)
-{
- if (insn->type == INSN_CALL &&
- insn->call_dest->type == STT_NOTYPE &&
- !strcmp(insn->call_dest->name, "__fentry__"))
- return true;
-
- return false;
-}
-
-static bool has_modified_stack_frame(struct instruction *insn)
-{
- return (insn->state & STATE_FP_SAVED) ||
- (insn->state & STATE_FP_SETUP);
-}
-
-static bool has_valid_stack_frame(struct instruction *insn)
-{
- return (insn->state & STATE_FP_SAVED) &&
- (insn->state & STATE_FP_SETUP);
-}
-
-static unsigned int frame_state(unsigned long state)
-{
- return (state & (STATE_FP_SAVED | STATE_FP_SETUP));
-}
-
-/*
- * Follow the branch starting at the given instruction, and recursively follow
- * any other branches (jumps). Meanwhile, track the frame pointer state at
- * each instruction and validate all the rules described in
- * tools/objtool/Documentation/stack-validation.txt.
- */
-static int validate_branch(struct objtool_file *file,
- struct instruction *first, unsigned char first_state)
-{
- struct alternative *alt;
- struct instruction *insn;
- struct section *sec;
- struct symbol *func = NULL;
- unsigned char state;
- int ret;
-
- insn = first;
- sec = insn->sec;
- state = first_state;
-
- if (insn->alt_group && list_empty(&insn->alts)) {
- WARN_FUNC("don't know how to handle branch to middle of alternative instruction group",
- sec, insn->offset);
- return 1;
- }
-
- while (1) {
- if (file->c_file && insn->func) {
- if (func && func != insn->func) {
- WARN("%s() falls through to next function %s()",
- func->name, insn->func->name);
- return 1;
- }
-
- func = insn->func;
- }
-
- if (insn->visited) {
- if (frame_state(insn->state) != frame_state(state)) {
- WARN_FUNC("frame pointer state mismatch",
- sec, insn->offset);
- return 1;
- }
-
- return 0;
- }
-
- insn->visited = true;
- insn->state = state;
-
- list_for_each_entry(alt, &insn->alts, list) {
- ret = validate_branch(file, alt->insn, state);
- if (ret)
- return 1;
- }
-
- switch (insn->type) {
-
- case INSN_FP_SAVE:
- if (!nofp) {
- if (state & STATE_FP_SAVED) {
- WARN_FUNC("duplicate frame pointer save",
- sec, insn->offset);
- return 1;
- }
- state |= STATE_FP_SAVED;
- }
- break;
-
- case INSN_FP_SETUP:
- if (!nofp) {
- if (state & STATE_FP_SETUP) {
- WARN_FUNC("duplicate frame pointer setup",
- sec, insn->offset);
- return 1;
- }
- state |= STATE_FP_SETUP;
- }
- break;
-
- case INSN_FP_RESTORE:
- if (!nofp) {
- if (has_valid_stack_frame(insn))
- state &= ~STATE_FP_SETUP;
-
- state &= ~STATE_FP_SAVED;
- }
- break;
-
- case INSN_RETURN:
- if (!nofp && has_modified_stack_frame(insn)) {
- WARN_FUNC("return without frame pointer restore",
- sec, insn->offset);
- return 1;
- }
- return 0;
-
- case INSN_CALL:
- if (is_fentry_call(insn)) {
- state |= STATE_FENTRY;
- break;
- }
-
- ret = dead_end_function(file, insn->call_dest);
- if (ret == 1)
- return 0;
- if (ret == -1)
- return 1;
-
- /* fallthrough */
- case INSN_CALL_DYNAMIC:
- if (!nofp && !has_valid_stack_frame(insn)) {
- WARN_FUNC("call without frame pointer save/setup",
- sec, insn->offset);
- return 1;
- }
- break;
-
- case INSN_JUMP_CONDITIONAL:
- case INSN_JUMP_UNCONDITIONAL:
- if (insn->jump_dest) {
- ret = validate_branch(file, insn->jump_dest,
- state);
- if (ret)
- return 1;
- } else if (has_modified_stack_frame(insn)) {
- WARN_FUNC("sibling call from callable instruction with changed frame pointer",
- sec, insn->offset);
- return 1;
- } /* else it's a sibling call */
-
- if (insn->type == INSN_JUMP_UNCONDITIONAL)
- return 0;
-
- break;
-
- case INSN_JUMP_DYNAMIC:
- if (list_empty(&insn->alts) &&
- has_modified_stack_frame(insn)) {
- WARN_FUNC("sibling call from callable instruction with changed frame pointer",
- sec, insn->offset);
- return 1;
- }
-
- return 0;
-
- default:
- break;
- }
-
- if (insn->dead_end)
- return 0;
-
- insn = next_insn_same_sec(file, insn);
- if (!insn) {
- WARN("%s: unexpected end of section", sec->name);
- return 1;
- }
- }
-
- return 0;
-}
-
-static bool is_kasan_insn(struct instruction *insn)
-{
- return (insn->type == INSN_CALL &&
- !strcmp(insn->call_dest->name, "__asan_handle_no_return"));
-}
-
-static bool is_ubsan_insn(struct instruction *insn)
-{
- return (insn->type == INSN_CALL &&
- !strcmp(insn->call_dest->name,
- "__ubsan_handle_builtin_unreachable"));
-}
-
-static bool ignore_unreachable_insn(struct symbol *func,
- struct instruction *insn)
-{
- int i;
-
- if (insn->type == INSN_NOP)
- return true;
-
- /*
- * Check if this (or a subsequent) instruction is related to
- * CONFIG_UBSAN or CONFIG_KASAN.
- *
- * End the search at 5 instructions to avoid going into the weeds.
- */
- for (i = 0; i < 5; i++) {
-
- if (is_kasan_insn(insn) || is_ubsan_insn(insn))
- return true;
-
- if (insn->type == INSN_JUMP_UNCONDITIONAL && insn->jump_dest) {
- insn = insn->jump_dest;
- continue;
- }
-
- if (insn->offset + insn->len >= func->offset + func->len)
- break;
- insn = list_next_entry(insn, list);
- }
-
- return false;
-}
-
-static int validate_functions(struct objtool_file *file)
-{
- struct section *sec;
- struct symbol *func;
- struct instruction *insn;
- int ret, warnings = 0;
-
- list_for_each_entry(sec, &file->elf->sections, list) {
- list_for_each_entry(func, &sec->symbol_list, list) {
- if (func->type != STT_FUNC)
- continue;
-
- insn = find_insn(file, sec, func->offset);
- if (!insn)
- continue;
-
- ret = validate_branch(file, insn, 0);
- warnings += ret;
- }
- }
-
- list_for_each_entry(sec, &file->elf->sections, list) {
- list_for_each_entry(func, &sec->symbol_list, list) {
- if (func->type != STT_FUNC)
- continue;
-
- func_for_each_insn(file, func, insn) {
- if (insn->visited)
- continue;
-
- insn->visited = true;
-
- if (file->ignore_unreachables || warnings ||
- ignore_unreachable_insn(func, insn))
- continue;
-
- /*
- * gcov produces a lot of unreachable
- * instructions. If we get an unreachable
- * warning and the file has gcov enabled, just
- * ignore it, and all other such warnings for
- * the file.
- */
- if (!file->ignore_unreachables &&
- gcov_enabled(file)) {
- file->ignore_unreachables = true;
- continue;
- }
-
- WARN_FUNC("function has unreachable instruction", insn->sec, insn->offset);
- warnings++;
- }
- }
- }
-
- return warnings;
-}
-
-static int validate_uncallable_instructions(struct objtool_file *file)
-{
- struct instruction *insn;
- int warnings = 0;
-
- for_each_insn(file, insn) {
- if (!insn->visited && insn->type == INSN_RETURN) {
- WARN_FUNC("return instruction outside of a callable function",
- insn->sec, insn->offset);
- warnings++;
- }
- }
-
- return warnings;
-}
-
-static void cleanup(struct objtool_file *file)
-{
- struct instruction *insn, *tmpinsn;
- struct alternative *alt, *tmpalt;
-
- list_for_each_entry_safe(insn, tmpinsn, &file->insn_list, list) {
- list_for_each_entry_safe(alt, tmpalt, &insn->alts, list) {
- list_del(&alt->list);
- free(alt);
- }
- list_del(&insn->list);
- hash_del(&insn->hash);
- free(insn);
- }
- elf_close(file->elf);
-}
-
-const char * const check_usage[] = {
+static const char * const check_usage[] = {
"objtool check [<options>] file.o",
NULL,
};

+const struct option check_options[] = {
+ OPT_BOOLEAN('f', "no-fp", &nofp, "Skip frame pointer validation"),
+ OPT_END(),
+};
+
int cmd_check(int argc, const char **argv)
{
- struct objtool_file file;
- int ret, warnings = 0;
-
- const struct option options[] = {
- OPT_BOOLEAN('f', "no-fp", &nofp, "Skip frame pointer validation"),
- OPT_END(),
- };
+ const char *objname;

- argc = parse_options(argc, argv, options, check_usage, 0);
+ argc = parse_options(argc, argv, check_options, check_usage, 0);

if (argc != 1)
- usage_with_options(check_usage, options);
+ usage_with_options(check_usage, check_options);

objname = argv[0];

- file.elf = elf_open(objname);
- if (!file.elf) {
- fprintf(stderr, "error reading elf file %s\n", objname);
- return 1;
- }
-
- INIT_LIST_HEAD(&file.insn_list);
- hash_init(file.insn_hash);
- file.whitelist = find_section_by_name(file.elf, ".discard.func_stack_frame_non_standard");
- file.rodata = find_section_by_name(file.elf, ".rodata");
- file.ignore_unreachables = false;
- file.c_file = find_section_by_name(file.elf, ".comment");
-
- ret = decode_sections(&file);
- if (ret < 0)
- goto out;
- warnings += ret;
-
- ret = validate_functions(&file);
- if (ret < 0)
- goto out;
- warnings += ret;
-
- ret = validate_uncallable_instructions(&file);
- if (ret < 0)
- goto out;
- warnings += ret;
-
-out:
- cleanup(&file);
-
- /* ignore warnings for now until we get all the code cleaned up */
- if (ret || warnings)
- return 0;
- return 0;
+ return check(objname, nofp);
}
diff --git a/tools/objtool/builtin-check.c b/tools/objtool/check.c
similarity index 95%
copy from tools/objtool/builtin-check.c
copy to tools/objtool/check.c
index 5f66697f..231a360 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/check.c
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2015 Josh Poimboeuf <[email protected]>
+ * Copyright (C) 2015-2017 Josh Poimboeuf <[email protected]>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
@@ -15,21 +15,10 @@
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/

-/*
- * objtool check:
- *
- * This command analyzes every .o file and ensures the validity of its stack
- * trace metadata. It enforces a set of rules on asm code and C inline
- * assembly code so that stack traces can be reliable.
- *
- * For more information, see tools/objtool/Documentation/stack-validation.txt.
- */
-
#include <string.h>
#include <stdlib.h>
-#include <subcmd/parse-options.h>

-#include "builtin.h"
+#include "check.h"
#include "elf.h"
#include "special.h"
#include "arch.h"
@@ -42,34 +31,11 @@
#define STATE_FP_SETUP 0x2
#define STATE_FENTRY 0x4

-struct instruction {
- struct list_head list;
- struct hlist_node hash;
- struct section *sec;
- unsigned long offset;
- unsigned int len, state;
- unsigned char type;
- unsigned long immediate;
- bool alt_group, visited, dead_end;
- struct symbol *call_dest;
- struct instruction *jump_dest;
- struct list_head alts;
- struct symbol *func;
-};
-
struct alternative {
struct list_head list;
struct instruction *insn;
};

-struct objtool_file {
- struct elf *elf;
- struct list_head insn_list;
- DECLARE_HASHTABLE(insn_hash, 16);
- struct section *rodata, *whitelist;
- bool ignore_unreachables, c_file;
-};
-
const char *objname;
static bool nofp;

@@ -1251,27 +1217,13 @@ static void cleanup(struct objtool_file *file)
elf_close(file->elf);
}

-const char * const check_usage[] = {
- "objtool check [<options>] file.o",
- NULL,
-};
-
-int cmd_check(int argc, const char **argv)
+int check(const char *_objname, bool _nofp)
{
struct objtool_file file;
int ret, warnings = 0;

- const struct option options[] = {
- OPT_BOOLEAN('f', "no-fp", &nofp, "Skip frame pointer validation"),
- OPT_END(),
- };
-
- argc = parse_options(argc, argv, options, check_usage, 0);
-
- if (argc != 1)
- usage_with_options(check_usage, options);
-
- objname = argv[0];
+ objname = _objname;
+ nofp = _nofp;

file.elf = elf_open(objname);
if (!file.elf) {
diff --git a/tools/objtool/special.h b/tools/objtool/check.h
similarity index 51%
copy from tools/objtool/special.h
copy to tools/objtool/check.h
index fad1d092..c0d2fde 100644
--- a/tools/objtool/special.h
+++ b/tools/objtool/check.h
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2015 Josh Poimboeuf <[email protected]>
+ * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
@@ -15,28 +15,37 @@
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/

-#ifndef _SPECIAL_H
-#define _SPECIAL_H
+#ifndef _CHECK_H
+#define _CHECK_H

#include <stdbool.h>
#include "elf.h"
+#include "arch.h"
+#include <linux/hashtable.h>

-struct special_alt {
+struct instruction {
struct list_head list;
+ struct hlist_node hash;
+ struct section *sec;
+ unsigned long offset;
+ unsigned int len, state;
+ unsigned char type;
+ unsigned long immediate;
+ bool alt_group, visited, dead_end;
+ struct symbol *call_dest;
+ struct instruction *jump_dest;
+ struct list_head alts;
+ struct symbol *func;
+};

- bool group;
- bool skip_orig;
- bool jump_or_nop;
-
- struct section *orig_sec;
- unsigned long orig_off;
-
- struct section *new_sec;
- unsigned long new_off;
-
- unsigned int orig_len, new_len; /* group only */
+struct objtool_file {
+ struct elf *elf;
+ struct list_head insn_list;
+ DECLARE_HASHTABLE(insn_hash, 16);
+ struct section *rodata, *whitelist;
+ bool ignore_unreachables, c_file;
};

-int special_get_alts(struct elf *elf, struct list_head *alts);
+int check(const char *objname, bool nofp);

-#endif /* _SPECIAL_H */
+#endif /* _CHECK_H */

Subject: [tip:core/objtool] objtool, x86: Add several functions and files to the objtool whitelist

Commit-ID: c207aee48037abca71c669cbec407b9891965c34
Gitweb: http://git.kernel.org/tip/c207aee48037abca71c669cbec407b9891965c34
Author: Josh Poimboeuf <[email protected]>
AuthorDate: Wed, 28 Jun 2017 10:11:06 -0500
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 30 Jun 2017 10:19:19 +0200

objtool, x86: Add several functions and files to the objtool whitelist

In preparation for an objtool rewrite which will have broader checks,
whitelist functions and files which cause problems because they do
unusual things with the stack.

These whitelists serve as a TODO list for which functions and files
don't yet have undwarf unwinder coverage. Eventually most of the
whitelists can be removed in favor of manual CFI hint annotations or
objtool improvements.
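
For reference, the C-side whitelist mechanism used throughout this patch looks
roughly like the sketch below (editorial illustration only; the function name
is made up). A whole object file can instead be skipped from its Makefile with
"OBJECT_FILES_NON_STANDARD_foo.o := y", as several hunks below do.

	#include <linux/frame.h>

	/* A function whose inline asm does unusual things with the stack: */
	static void my_nonstandard_stack_func(void)
	{
		/* ... */
	}

	/* Add it to the whitelist (and thus to the TODO list above): */
	STACK_FRAME_NON_STANDARD(my_nonstandard_stack_func);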

Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/7f934a5d707a574bda33ea282e9478e627fb1829.1498659915.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/crypto/Makefile | 2 ++
arch/x86/crypto/sha1-mb/Makefile | 2 ++
arch/x86/crypto/sha256-mb/Makefile | 2 ++
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/acpi/Makefile | 2 ++
arch/x86/kernel/kprobes/opt.c | 9 ++++++++-
arch/x86/kernel/reboot.c | 2 ++
arch/x86/kvm/svm.c | 2 ++
arch/x86/kvm/vmx.c | 3 +++
arch/x86/lib/msr-reg.S | 8 ++++----
arch/x86/net/Makefile | 2 ++
arch/x86/platform/efi/Makefile | 1 +
arch/x86/power/Makefile | 2 ++
arch/x86/xen/Makefile | 3 +++
kernel/kexec_core.c | 4 +++-
15 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
index 34b3fa2..9e32d40 100644
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -2,6 +2,8 @@
# Arch-specific CryptoAPI modules.
#

+OBJECT_FILES_NON_STANDARD := y
+
avx_supported := $(call as-instr,vpxor %xmm0$(comma)%xmm0$(comma)%xmm0,yes,no)
avx2_supported := $(call as-instr,vpgatherdd %ymm0$(comma)(%eax$(comma)%ymm1\
$(comma)4)$(comma)%ymm2,yes,no)
diff --git a/arch/x86/crypto/sha1-mb/Makefile b/arch/x86/crypto/sha1-mb/Makefile
index 2f87563..2e14acc 100644
--- a/arch/x86/crypto/sha1-mb/Makefile
+++ b/arch/x86/crypto/sha1-mb/Makefile
@@ -2,6 +2,8 @@
# Arch-specific CryptoAPI modules.
#

+OBJECT_FILES_NON_STANDARD := y
+
avx2_supported := $(call as-instr,vpgatherdd %ymm0$(comma)(%eax$(comma)%ymm1\
$(comma)4)$(comma)%ymm2,yes,no)
ifeq ($(avx2_supported),yes)
diff --git a/arch/x86/crypto/sha256-mb/Makefile b/arch/x86/crypto/sha256-mb/Makefile
index 41089e7..45b4fca 100644
--- a/arch/x86/crypto/sha256-mb/Makefile
+++ b/arch/x86/crypto/sha256-mb/Makefile
@@ -2,6 +2,8 @@
# Arch-specific CryptoAPI modules.
#

+OBJECT_FILES_NON_STANDARD := y
+
avx2_supported := $(call as-instr,vpgatherdd %ymm0$(comma)(%eax$(comma)%ymm1\
$(comma)4)$(comma)%ymm2,yes,no)
ifeq ($(avx2_supported),yes)
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 4b99423..3c7c419 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -29,6 +29,7 @@ OBJECT_FILES_NON_STANDARD_head_$(BITS).o := y
OBJECT_FILES_NON_STANDARD_relocate_kernel_$(BITS).o := y
OBJECT_FILES_NON_STANDARD_ftrace_$(BITS).o := y
OBJECT_FILES_NON_STANDARD_test_nx.o := y
+OBJECT_FILES_NON_STANDARD_paravirt_patch_$(BITS).o := y

# If instrumentation of this dir is enabled, boot hangs during first second.
# Probably could be more selective here, but note that files related to irqs,
diff --git a/arch/x86/kernel/acpi/Makefile b/arch/x86/kernel/acpi/Makefile
index 26b78d8..85a9e17 100644
--- a/arch/x86/kernel/acpi/Makefile
+++ b/arch/x86/kernel/acpi/Makefile
@@ -1,3 +1,5 @@
+OBJECT_FILES_NON_STANDARD_wakeup_$(BITS).o := y
+
obj-$(CONFIG_ACPI) += boot.o
obj-$(CONFIG_ACPI_SLEEP) += sleep.o wakeup_$(BITS).o
obj-$(CONFIG_ACPI_APEI) += apei.o
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index 901c640..69ea0bc 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -28,6 +28,7 @@
#include <linux/kdebug.h>
#include <linux/kallsyms.h>
#include <linux/ftrace.h>
+#include <linux/frame.h>

#include <asm/text-patching.h>
#include <asm/cacheflush.h>
@@ -94,6 +95,7 @@ static void synthesize_set_arg1(kprobe_opcode_t *addr, unsigned long val)
}

asm (
+ "optprobe_template_func:\n"
".global optprobe_template_entry\n"
"optprobe_template_entry:\n"
#ifdef CONFIG_X86_64
@@ -131,7 +133,12 @@ asm (
" popf\n"
#endif
".global optprobe_template_end\n"
- "optprobe_template_end:\n");
+ "optprobe_template_end:\n"
+ ".type optprobe_template_func, @function\n"
+ ".size optprobe_template_func, .-optprobe_template_func\n");
+
+void optprobe_template_func(void);
+STACK_FRAME_NON_STANDARD(optprobe_template_func);

#define TMPL_MOVE_IDX \
((long)&optprobe_template_val - (long)&optprobe_template_entry)
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 2544700..67393fc 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -9,6 +9,7 @@
#include <linux/sched.h>
#include <linux/tboot.h>
#include <linux/delay.h>
+#include <linux/frame.h>
#include <acpi/reboot.h>
#include <asm/io.h>
#include <asm/apic.h>
@@ -123,6 +124,7 @@ void __noreturn machine_real_restart(unsigned int type)
#ifdef CONFIG_APM_MODULE
EXPORT_SYMBOL(machine_real_restart);
#endif
+STACK_FRAME_NON_STANDARD(machine_real_restart);

/*
* Some Apple MacBook and MacBookPro's needs reboot=p to be able to reboot
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ba9891a..33460fc 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -36,6 +36,7 @@
#include <linux/slab.h>
#include <linux/amd-iommu.h>
#include <linux/hashtable.h>
+#include <linux/frame.h>

#include <asm/apic.h>
#include <asm/perf_event.h>
@@ -4906,6 +4907,7 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)

mark_all_clean(svm->vmcb);
}
+STACK_FRAME_NON_STANDARD(svm_vcpu_run);

static void svm_set_cr3(struct kvm_vcpu *vcpu, unsigned long root)
{
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ca5d2b9..1b469b6 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -33,6 +33,7 @@
#include <linux/slab.h>
#include <linux/tboot.h>
#include <linux/hrtimer.h>
+#include <linux/frame.h>
#include "kvm_cache_regs.h"
#include "x86.h"

@@ -8652,6 +8653,7 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
);
}
}
+STACK_FRAME_NON_STANDARD(vmx_handle_external_intr);

static bool vmx_has_high_real_mode_segbase(void)
{
@@ -9028,6 +9030,7 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
vmx_recover_nmi_blocking(vmx);
vmx_complete_interrupts(vmx);
}
+STACK_FRAME_NON_STANDARD(vmx_vcpu_run);

static void vmx_switch_vmcs(struct kvm_vcpu *vcpu, struct loaded_vmcs *vmcs)
{
diff --git a/arch/x86/lib/msr-reg.S b/arch/x86/lib/msr-reg.S
index c815564..10ffa7e 100644
--- a/arch/x86/lib/msr-reg.S
+++ b/arch/x86/lib/msr-reg.S
@@ -13,14 +13,14 @@
.macro op_safe_regs op
ENTRY(\op\()_safe_regs)
pushq %rbx
- pushq %rbp
+ pushq %r12
movq %rdi, %r10 /* Save pointer */
xorl %r11d, %r11d /* Return value */
movl (%rdi), %eax
movl 4(%rdi), %ecx
movl 8(%rdi), %edx
movl 12(%rdi), %ebx
- movl 20(%rdi), %ebp
+ movl 20(%rdi), %r12d
movl 24(%rdi), %esi
movl 28(%rdi), %edi
1: \op
@@ -29,10 +29,10 @@ ENTRY(\op\()_safe_regs)
movl %ecx, 4(%r10)
movl %edx, 8(%r10)
movl %ebx, 12(%r10)
- movl %ebp, 20(%r10)
+ movl %r12d, 20(%r10)
movl %esi, 24(%r10)
movl %edi, 28(%r10)
- popq %rbp
+ popq %r12
popq %rbx
ret
3:
diff --git a/arch/x86/net/Makefile b/arch/x86/net/Makefile
index 90568c3..fefb4b6 100644
--- a/arch/x86/net/Makefile
+++ b/arch/x86/net/Makefile
@@ -1,4 +1,6 @@
#
# Arch-specific network modules
#
+OBJECT_FILES_NON_STANDARD_bpf_jit.o += y
+
obj-$(CONFIG_BPF_JIT) += bpf_jit.o bpf_jit_comp.o
diff --git a/arch/x86/platform/efi/Makefile b/arch/x86/platform/efi/Makefile
index f1d83b3..2f56e1e 100644
--- a/arch/x86/platform/efi/Makefile
+++ b/arch/x86/platform/efi/Makefile
@@ -1,4 +1,5 @@
OBJECT_FILES_NON_STANDARD_efi_thunk_$(BITS).o := y
+OBJECT_FILES_NON_STANDARD_efi_stub_$(BITS).o := y

obj-$(CONFIG_EFI) += quirks.o efi.o efi_$(BITS).o efi_stub_$(BITS).o
obj-$(CONFIG_EARLY_PRINTK_EFI) += early_printk.o
diff --git a/arch/x86/power/Makefile b/arch/x86/power/Makefile
index a6a198c..0504187 100644
--- a/arch/x86/power/Makefile
+++ b/arch/x86/power/Makefile
@@ -1,3 +1,5 @@
+OBJECT_FILES_NON_STANDARD_hibernate_asm_$(BITS).o := y
+
# __restore_processor_state() restores %gs after S3 resume and so should not
# itself be stack-protected
nostackp := $(call cc-option, -fno-stack-protector)
diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index fffb0a1..bced7a3 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -1,3 +1,6 @@
+OBJECT_FILES_NON_STANDARD_xen-asm_$(BITS).o := y
+OBJECT_FILES_NON_STANDARD_xen-pvh.o := y
+
ifdef CONFIG_FUNCTION_TRACER
# Do not profile debug and lowlevel utilities
CFLAGS_REMOVE_spinlock.o = -pg
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index ae1a3ba..154ffb4 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -38,6 +38,7 @@
#include <linux/syscore_ops.h>
#include <linux/compiler.h>
#include <linux/hugetlb.h>
+#include <linux/frame.h>

#include <asm/page.h>
#include <asm/sections.h>
@@ -874,7 +875,7 @@ int kexec_load_disabled;
* only when panic_cpu holds the current CPU number; this is the only CPU
* which processes crash_kexec routines.
*/
-void __crash_kexec(struct pt_regs *regs)
+void __noclone __crash_kexec(struct pt_regs *regs)
{
/* Take the kexec_mutex here to prevent sys_kexec_load
* running on one cpu from replacing the crash kernel
@@ -896,6 +897,7 @@ void __crash_kexec(struct pt_regs *regs)
mutex_unlock(&kexec_mutex);
}
}
+STACK_FRAME_NON_STANDARD(__crash_kexec);

void crash_kexec(struct pt_regs *regs)
{

Subject: [tip:core/objtool] objtool: Implement stack validation 2.0

Commit-ID: baa41469a7b992c1e3db2a39854219cc7442e48f
Gitweb: http://git.kernel.org/tip/baa41469a7b992c1e3db2a39854219cc7442e48f
Author: Josh Poimboeuf <[email protected]>
AuthorDate: Wed, 28 Jun 2017 10:11:07 -0500
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 30 Jun 2017 10:19:19 +0200

objtool: Implement stack validation 2.0

This is a major rewrite of objtool. Instead of only tracking frame
pointer changes, it now tracks all stack-related operations, including
all register saves/restores.

In addition to making stack validation more robust, this also paves the
way for undwarf generation.

Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/678bd94c0566c6129bcc376cddb259c4c5633004.1498659915.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
---
tools/objtool/Documentation/stack-validation.txt | 153 +++--
tools/objtool/Makefile | 2 +-
tools/objtool/arch.h | 64 ++-
tools/objtool/arch/x86/decode.c | 400 ++++++++++++--
tools/objtool/cfi.h | 55 ++
tools/objtool/check.c | 676 ++++++++++++++++++-----
tools/objtool/check.h | 19 +-
tools/objtool/elf.c | 59 +-
tools/objtool/elf.h | 6 +-
tools/objtool/special.c | 6 +-
tools/objtool/warn.h | 10 +
11 files changed, 1130 insertions(+), 320 deletions(-)

diff --git a/tools/objtool/Documentation/stack-validation.txt b/tools/objtool/Documentation/stack-validation.txt
index 55a60d3..17c1195 100644
--- a/tools/objtool/Documentation/stack-validation.txt
+++ b/tools/objtool/Documentation/stack-validation.txt
@@ -127,28 +127,13 @@ b) 100% reliable stack traces for DWARF enabled kernels

c) Higher live patching compatibility rate

- (NOTE: This is not yet implemented)
-
- Currently with CONFIG_LIVEPATCH there's a basic live patching
- framework which is safe for roughly 85-90% of "security" fixes. But
- patches can't have complex features like function dependency or
- prototype changes, or data structure changes.
-
- There's a strong need to support patches which have the more complex
- features so that the patch compatibility rate for security fixes can
- eventually approach something resembling 100%. To achieve that, a
- "consistency model" is needed, which allows tasks to be safely
- transitioned from an unpatched state to a patched state.
-
- One of the key requirements of the currently proposed livepatch
- consistency model [*] is that it needs to walk the stack of each
- sleeping task to determine if it can be transitioned to the patched
- state. If objtool can ensure that stack traces are reliable, this
- consistency model can be used and the live patching compatibility
- rate can be improved significantly.
-
- [*] https://lkml.kernel.org/r/[email protected]
+ Livepatch has an optional "consistency model", which is needed for
+ more complex patches. In order for the consistency model to work,
+ stack traces need to be reliable (or an unreliable condition needs to
+ be detectable). Objtool makes that possible.

+ For more details, see the livepatch documentation in the Linux kernel
+ source tree at Documentation/livepatch/livepatch.txt.

Rules
-----
@@ -201,80 +186,84 @@ To achieve the validation, objtool enforces the following rules:
return normally.


-Errors in .S files
-------------------
+Objtool warnings
+----------------

-If you're getting an error in a compiled .S file which you don't
-understand, first make sure that the affected code follows the above
-rules.
+For asm files, if you're getting an error which doesn't make sense,
+first make sure that the affected code follows the above rules.
+
+For C files, the common culprits are inline asm statements and calls to
+"noreturn" functions. See below for more details.
+
+Another possible cause for errors in C code is if the Makefile removes
+-fno-omit-frame-pointer or adds -fomit-frame-pointer to the gcc options.

Here are some examples of common warnings reported by objtool, what
they mean, and suggestions for how to fix them.


-1. asm_file.o: warning: objtool: func()+0x128: call without frame pointer save/setup
+1. file.o: warning: objtool: func()+0x128: call without frame pointer save/setup

The func() function made a function call without first saving and/or
- updating the frame pointer.
-
- If func() is indeed a callable function, add proper frame pointer
- logic using the FRAME_BEGIN and FRAME_END macros. Otherwise, remove
- its ELF function annotation by changing ENDPROC to END.
+ updating the frame pointer, and CONFIG_FRAME_POINTER is enabled.

- If you're getting this error in a .c file, see the "Errors in .c
- files" section.
+ If the error is for an asm file, and func() is indeed a callable
+ function, add proper frame pointer logic using the FRAME_BEGIN and
+ FRAME_END macros. Otherwise, if it's not a callable function, remove
+ its ELF function annotation by changing ENDPROC to END, and instead
+ use the manual CFI hint macros in asm/undwarf.h.

+ If it's a GCC-compiled .c file, the error may be because the function
+ uses an inline asm() statement which has a "call" instruction. An
+ asm() statement with a call instruction must declare the use of the
+ stack pointer in its output operand. For example, on x86_64:

-2. asm_file.o: warning: objtool: .text+0x53: return instruction outside of a callable function
-
- A return instruction was detected, but objtool couldn't find a way
- for a callable function to reach the instruction.
+ register void *__sp asm("rsp");
+ asm volatile("call func" : "+r" (__sp));

- If the return instruction is inside (or reachable from) a callable
- function, the function needs to be annotated with the ENTRY/ENDPROC
- macros.
+ Otherwise the stack frame may not get created before the call.

- If you _really_ need a return instruction outside of a function, and
- are 100% sure that it won't affect stack traces, you can tell
- objtool to ignore it. See the "Adding exceptions" section below.

+2. file.o: warning: objtool: .text+0x53: unreachable instruction

-3. asm_file.o: warning: objtool: func()+0x9: function has unreachable instruction
+ Objtool couldn't find a code path to reach the instruction.

- The instruction lives inside of a callable function, but there's no
- possible control flow path from the beginning of the function to the
- instruction.
+ If the error is for an asm file, and the instruction is inside (or
+ reachable from) a callable function, the function should be annotated
+ with the ENTRY/ENDPROC macros (ENDPROC is the important one).
+ Otherwise, the code should probably be annotated with the CFI hint
+ macros in asm/undwarf.h so objtool and the unwinder can know the
+ stack state associated with the code.

- If the instruction is actually needed, and it's actually in a
- callable function, ensure that its function is properly annotated
- with ENTRY/ENDPROC.
+ If you're 100% sure the code won't affect stack traces, or if you're
+   just a bad person, you can tell objtool to ignore it.  See the
+ "Adding exceptions" section below.

If it's not actually in a callable function (e.g. kernel entry code),
change ENDPROC to END.


-4. asm_file.o: warning: objtool: func(): can't find starting instruction
+4. file.o: warning: objtool: func(): can't find starting instruction
or
- asm_file.o: warning: objtool: func()+0x11dd: can't decode instruction
+ file.o: warning: objtool: func()+0x11dd: can't decode instruction

- Did you put data in a text section? If so, that can confuse
+ Does the file have data in a text section? If so, that can confuse
objtool's instruction decoder. Move the data to a more appropriate
section like .data or .rodata.


-5. asm_file.o: warning: objtool: func()+0x6: kernel entry/exit from callable instruction
-
- This is a kernel entry/exit instruction like sysenter or sysret.
- Such instructions aren't allowed in a callable function, and are most
- likely part of the kernel entry code.
+5. file.o: warning: objtool: func()+0x6: unsupported instruction in callable function

- If the instruction isn't actually in a callable function, change
- ENDPROC to END.
+ This is a kernel entry/exit instruction like sysenter or iret. Such
+ instructions aren't allowed in a callable function, and are most
+ likely part of the kernel entry code. They should usually not have
+ the callable function annotation (ENDPROC) and should always be
+ annotated with the CFI hint macros in asm/undwarf.h.


-6. asm_file.o: warning: objtool: func()+0x26: sibling call from callable instruction with changed frame pointer
+6. file.o: warning: objtool: func()+0x26: sibling call from callable instruction with modified stack frame

- This is a dynamic jump or a jump to an undefined symbol. Stacktool
+ This is a dynamic jump or a jump to an undefined symbol. Objtool
assumed it's a sibling call and detected that the frame pointer
wasn't first restored to its original state.

@@ -282,24 +271,28 @@ they mean, and suggestions for how to fix them.
destination code to the local file.

If the instruction is not actually in a callable function (e.g.
- kernel entry code), change ENDPROC to END.
+ kernel entry code), change ENDPROC to END and annotate manually with
+ the CFI hint macros in asm/undwarf.h.


-7. asm_file: warning: objtool: func()+0x5c: frame pointer state mismatch
+7. file: warning: objtool: func()+0x5c: stack state mismatch

The instruction's frame pointer state is inconsistent, depending on
which execution path was taken to reach the instruction.

- Make sure the function pushes and sets up the frame pointer (for
- x86_64, this means rbp) at the beginning of the function and pops it
- at the end of the function. Also make sure that no other code in the
- function touches the frame pointer.
+ Make sure that, when CONFIG_FRAME_POINTER is enabled, the function
+ pushes and sets up the frame pointer (for x86_64, this means rbp) at
+ the beginning of the function and pops it at the end of the function.
+ Also make sure that no other code in the function touches the frame
+ pointer.

+ Another possibility is that the code has some asm or inline asm which
+ does some unusual things to the stack or the frame pointer. In such
+ cases it's probably appropriate to use the CFI hint macros in
+ asm/undwarf.h.

-Errors in .c files
-------------------

-1. c_file.o: warning: objtool: funcA() falls through to next function funcB()
+8. file.o: warning: objtool: funcA() falls through to next function funcB()

This means that funcA() doesn't end with a return instruction or an
unconditional jump, and that objtool has determined that the function
@@ -318,22 +311,6 @@ Errors in .c files
might be corrupt due to a gcc bug. For more details, see:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646

-2. If you're getting any other objtool error in a compiled .c file, it
- may be because the file uses an asm() statement which has a "call"
- instruction. An asm() statement with a call instruction must declare
- the use of the stack pointer in its output operand. For example, on
- x86_64:
-
- register void *__sp asm("rsp");
- asm volatile("call func" : "+r" (__sp));
-
- Otherwise the stack frame may not get created before the call.
-
-3. Another possible cause for errors in C code is if the Makefile removes
- -fno-omit-frame-pointer or adds -fomit-frame-pointer to the gcc options.
-
-Also see the above section for .S file errors for more information what
-the individual error messages mean.

If the error doesn't seem to make sense, it could be a bug in objtool.
Feel free to ask the objtool maintainer for help.
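
To put the inline-asm rule from warning 1 in context, here is a sketch of the
documented pattern inside a C function (illustration only; "some_asm_helper"
is a hypothetical symbol):

	static void call_asm_helper(void)
	{
		register void *__sp asm("rsp");

		asm volatile("call some_asm_helper" : "+r" (__sp));
	}
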
diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile
index 27e019c..0e2765e 100644
--- a/tools/objtool/Makefile
+++ b/tools/objtool/Makefile
@@ -25,7 +25,7 @@ OBJTOOL_IN := $(OBJTOOL)-in.o
all: $(OBJTOOL)

INCLUDES := -I$(srctree)/tools/include -I$(srctree)/tools/arch/$(HOSTARCH)/include/uapi
-CFLAGS += -Wall -Werror $(EXTRA_WARNINGS) -fomit-frame-pointer -O2 -g $(INCLUDES)
+CFLAGS += -Wall -Werror $(EXTRA_WARNINGS) -Wno-switch-default -Wno-switch-enum -fomit-frame-pointer -O2 -g $(INCLUDES)
LDFLAGS += -lelf $(LIBSUBCMD)

# Allow old libelf to be used:
diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index a59e061..21aeca8 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -19,25 +19,63 @@
#define _ARCH_H

#include <stdbool.h>
+#include <linux/list.h>
#include "elf.h"
+#include "cfi.h"

-#define INSN_FP_SAVE 1
-#define INSN_FP_SETUP 2
-#define INSN_FP_RESTORE 3
-#define INSN_JUMP_CONDITIONAL 4
-#define INSN_JUMP_UNCONDITIONAL 5
-#define INSN_JUMP_DYNAMIC 6
-#define INSN_CALL 7
-#define INSN_CALL_DYNAMIC 8
-#define INSN_RETURN 9
-#define INSN_CONTEXT_SWITCH 10
-#define INSN_NOP 11
-#define INSN_OTHER 12
+#define INSN_JUMP_CONDITIONAL 1
+#define INSN_JUMP_UNCONDITIONAL 2
+#define INSN_JUMP_DYNAMIC 3
+#define INSN_CALL 4
+#define INSN_CALL_DYNAMIC 5
+#define INSN_RETURN 6
+#define INSN_CONTEXT_SWITCH 7
+#define INSN_STACK 8
+#define INSN_NOP 9
+#define INSN_OTHER 10
#define INSN_LAST INSN_OTHER

+enum op_dest_type {
+ OP_DEST_REG,
+ OP_DEST_REG_INDIRECT,
+ OP_DEST_MEM,
+ OP_DEST_PUSH,
+ OP_DEST_LEAVE,
+};
+
+struct op_dest {
+ enum op_dest_type type;
+ unsigned char reg;
+ int offset;
+};
+
+enum op_src_type {
+ OP_SRC_REG,
+ OP_SRC_REG_INDIRECT,
+ OP_SRC_CONST,
+ OP_SRC_POP,
+ OP_SRC_ADD,
+ OP_SRC_AND,
+};
+
+struct op_src {
+ enum op_src_type type;
+ unsigned char reg;
+ int offset;
+};
+
+struct stack_op {
+ struct op_dest dest;
+ struct op_src src;
+};
+
+void arch_initial_func_cfi_state(struct cfi_state *state);
+
int arch_decode_instruction(struct elf *elf, struct section *sec,
unsigned long offset, unsigned int maxlen,
unsigned int *len, unsigned char *type,
- unsigned long *displacement);
+ unsigned long *immediate, struct stack_op *op);
+
+bool arch_callee_saved_reg(unsigned char reg);

#endif /* _ARCH_H */
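
As a concrete illustration of the new interface above (a sketch assuming the
struct definitions from this arch.h hunk and the cfi.h file added below, not
code from the patch): the decoder describes each stack-touching instruction as
a source/destination pair, which the checker later applies to its CFI state.

	/* How the x86 decoder represents two common prologue instructions: */
	struct stack_op push_rbp = {
		.src  = { .type = OP_SRC_REG, .reg = CFI_BP },	/* push %rbp     */
		.dest = { .type = OP_DEST_PUSH },
	};

	struct stack_op mov_rsp_rbp = {
		.src  = { .type = OP_SRC_REG, .reg = CFI_SP },	/* mov %rsp,%rbp */
		.dest = { .type = OP_DEST_REG, .reg = CFI_BP },
	};
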
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 6ac99e3..a36c2eb 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -27,6 +27,17 @@
#include "../../arch.h"
#include "../../warn.h"

+static unsigned char op_to_cfi_reg[][2] = {
+ {CFI_AX, CFI_R8},
+ {CFI_CX, CFI_R9},
+ {CFI_DX, CFI_R10},
+ {CFI_BX, CFI_R11},
+ {CFI_SP, CFI_R12},
+ {CFI_BP, CFI_R13},
+ {CFI_SI, CFI_R14},
+ {CFI_DI, CFI_R15},
+};
+
static int is_x86_64(struct elf *elf)
{
switch (elf->ehdr.e_machine) {
@@ -40,24 +51,50 @@ static int is_x86_64(struct elf *elf)
}
}

+bool arch_callee_saved_reg(unsigned char reg)
+{
+ switch (reg) {
+ case CFI_BP:
+ case CFI_BX:
+ case CFI_R12:
+ case CFI_R13:
+ case CFI_R14:
+ case CFI_R15:
+ return true;
+
+ case CFI_AX:
+ case CFI_CX:
+ case CFI_DX:
+ case CFI_SI:
+ case CFI_DI:
+ case CFI_SP:
+ case CFI_R8:
+ case CFI_R9:
+ case CFI_R10:
+ case CFI_R11:
+ case CFI_RA:
+ default:
+ return false;
+ }
+}
+
int arch_decode_instruction(struct elf *elf, struct section *sec,
unsigned long offset, unsigned int maxlen,
unsigned int *len, unsigned char *type,
- unsigned long *immediate)
+ unsigned long *immediate, struct stack_op *op)
{
struct insn insn;
- int x86_64;
- unsigned char op1, op2, ext;
+ int x86_64, sign;
+ unsigned char op1, op2, rex = 0, rex_b = 0, rex_r = 0, rex_w = 0,
+ modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0,
+ sib = 0;

x86_64 = is_x86_64(elf);
if (x86_64 == -1)
return -1;

- insn_init(&insn, (void *)(sec->data + offset), maxlen, x86_64);
+ insn_init(&insn, sec->data->d_buf + offset, maxlen, x86_64);
insn_get_length(&insn);
- insn_get_opcode(&insn);
- insn_get_modrm(&insn);
- insn_get_immediate(&insn);

if (!insn_complete(&insn)) {
WARN_FUNC("can't decode instruction", sec, offset);
@@ -73,67 +110,323 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
op1 = insn.opcode.bytes[0];
op2 = insn.opcode.bytes[1];

+ if (insn.rex_prefix.nbytes) {
+ rex = insn.rex_prefix.bytes[0];
+ rex_w = X86_REX_W(rex) >> 3;
+ rex_r = X86_REX_R(rex) >> 2;
+ rex_b = X86_REX_B(rex);
+ }
+
+ if (insn.modrm.nbytes) {
+ modrm = insn.modrm.bytes[0];
+ modrm_mod = X86_MODRM_MOD(modrm);
+ modrm_reg = X86_MODRM_REG(modrm);
+ modrm_rm = X86_MODRM_RM(modrm);
+ }
+
+ if (insn.sib.nbytes)
+ sib = insn.sib.bytes[0];
+
switch (op1) {
- case 0x55:
- if (!insn.rex_prefix.nbytes)
- /* push rbp */
- *type = INSN_FP_SAVE;
+
+ case 0x1:
+ case 0x29:
+ if (rex_w && !rex_b && modrm_mod == 3 && modrm_rm == 4) {
+
+ /* add/sub reg, %rsp */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ op->dest.type = OP_SRC_REG;
+ op->dest.reg = CFI_SP;
+ }
+ break;
+
+ case 0x50 ... 0x57:
+
+ /* push reg */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_REG;
+ op->src.reg = op_to_cfi_reg[op1 & 0x7][rex_b];
+ op->dest.type = OP_DEST_PUSH;
+
break;

- case 0x5d:
- if (!insn.rex_prefix.nbytes)
- /* pop rbp */
- *type = INSN_FP_RESTORE;
+ case 0x58 ... 0x5f:
+
+ /* pop reg */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_POP;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = op_to_cfi_reg[op1 & 0x7][rex_b];
+
+ break;
+
+ case 0x68:
+ case 0x6a:
+ /* push immediate */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_CONST;
+ op->dest.type = OP_DEST_PUSH;
break;

case 0x70 ... 0x7f:
*type = INSN_JUMP_CONDITIONAL;
break;

+ case 0x81:
+ case 0x83:
+ if (rex != 0x48)
+ break;
+
+ if (modrm == 0xe4) {
+ /* and imm, %rsp */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_AND;
+ op->src.reg = CFI_SP;
+ op->src.offset = insn.immediate.value;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ break;
+ }
+
+ if (modrm == 0xc4)
+ sign = 1;
+ else if (modrm == 0xec)
+ sign = -1;
+ else
+ break;
+
+ /* add/sub imm, %rsp */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_SP;
+ op->src.offset = insn.immediate.value * sign;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ break;
+
case 0x89:
- if (insn.rex_prefix.nbytes == 1 &&
- insn.rex_prefix.bytes[0] == 0x48 &&
- insn.modrm.nbytes && insn.modrm.bytes[0] == 0xe5)
- /* mov rsp, rbp */
- *type = INSN_FP_SETUP;
+ if (rex == 0x48 && modrm == 0xe5) {
+
+ /* mov %rsp, %rbp */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_REG;
+ op->src.reg = CFI_SP;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_BP;
+ break;
+ }
+ /* fallthrough */
+ case 0x88:
+ if (!rex_b &&
+ (modrm_mod == 1 || modrm_mod == 2) && modrm_rm == 5) {
+
+ /* mov reg, disp(%rbp) */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_REG;
+ op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ op->dest.type = OP_DEST_REG_INDIRECT;
+ op->dest.reg = CFI_BP;
+ op->dest.offset = insn.displacement.value;
+
+ } else if (rex_w && !rex_b && modrm_rm == 4 && sib == 0x24) {
+
+ /* mov reg, disp(%rsp) */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_REG;
+ op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ op->dest.type = OP_DEST_REG_INDIRECT;
+ op->dest.reg = CFI_SP;
+ op->dest.offset = insn.displacement.value;
+ }
+
+ break;
+
+ case 0x8b:
+ if (rex_w && !rex_b && modrm_mod == 1 && modrm_rm == 5) {
+
+ /* mov disp(%rbp), reg */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_REG_INDIRECT;
+ op->src.reg = CFI_BP;
+ op->src.offset = insn.displacement.value;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = op_to_cfi_reg[modrm_reg][rex_r];
+
+ } else if (rex_w && !rex_b && sib == 0x24 &&
+ modrm_mod != 3 && modrm_rm == 4) {
+
+ /* mov disp(%rsp), reg */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_REG_INDIRECT;
+ op->src.reg = CFI_SP;
+ op->src.offset = insn.displacement.value;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ }
+
break;

case 0x8d:
- if (insn.rex_prefix.nbytes &&
- insn.rex_prefix.bytes[0] == 0x48 &&
- insn.modrm.nbytes && insn.modrm.bytes[0] == 0x2c &&
- insn.sib.nbytes && insn.sib.bytes[0] == 0x24)
- /* lea %(rsp), %rbp */
- *type = INSN_FP_SETUP;
+ if (rex == 0x48 && modrm == 0x65) {
+
+ /* lea -disp(%rbp), %rsp */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_BP;
+ op->src.offset = insn.displacement.value;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ break;
+ }
+
+ if (rex == 0x4c && modrm == 0x54 && sib == 0x24 &&
+ insn.displacement.value == 8) {
+
+ /*
+ * lea 0x8(%rsp), %r10
+ *
+ * Here r10 is the "drap" pointer, used as a stack
+ * pointer helper when the stack gets realigned.
+ */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_SP;
+ op->src.offset = 8;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_R10;
+ break;
+ }
+
+ if (rex == 0x4c && modrm == 0x6c && sib == 0x24 &&
+ insn.displacement.value == 16) {
+
+ /*
+ * lea 0x10(%rsp), %r13
+ *
+ * Here r13 is the "drap" pointer, used as a stack
+ * pointer helper when the stack gets realigned.
+ */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_SP;
+ op->src.offset = 16;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_R13;
+ break;
+ }
+
+ if (rex == 0x49 && modrm == 0x62 &&
+ insn.displacement.value == -8) {
+
+ /*
+ * lea -0x8(%r10), %rsp
+ *
+ * Restoring rsp back to its original value after a
+ * stack realignment.
+ */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_R10;
+ op->src.offset = -8;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ break;
+ }
+
+ if (rex == 0x49 && modrm == 0x65 &&
+ insn.displacement.value == -16) {
+
+ /*
+ * lea -0x10(%r13), %rsp
+ *
+ * Restoring rsp back to its original value after a
+ * stack realignment.
+ */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_R13;
+ op->src.offset = -16;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ break;
+ }
+
+ break;
+
+ case 0x8f:
+ /* pop to mem */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_POP;
+ op->dest.type = OP_DEST_MEM;
break;

case 0x90:
*type = INSN_NOP;
break;

+ case 0x9c:
+ /* pushf */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_CONST;
+ op->dest.type = OP_DEST_PUSH;
+ break;
+
+ case 0x9d:
+ /* popf */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_POP;
+ op->dest.type = OP_DEST_MEM;
+ break;
+
case 0x0f:
+
if (op2 >= 0x80 && op2 <= 0x8f)
*type = INSN_JUMP_CONDITIONAL;
else if (op2 == 0x05 || op2 == 0x07 || op2 == 0x34 ||
op2 == 0x35)
+
/* sysenter, sysret */
*type = INSN_CONTEXT_SWITCH;
+
else if (op2 == 0x0d || op2 == 0x1f)
+
/* nopl/nopw */
*type = INSN_NOP;
- else if (op2 == 0x01 && insn.modrm.nbytes &&
- (insn.modrm.bytes[0] == 0xc2 ||
- insn.modrm.bytes[0] == 0xd8))
- /* vmlaunch, vmrun */
- *type = INSN_CONTEXT_SWITCH;
+
+ else if (op2 == 0xa0 || op2 == 0xa8) {
+
+ /* push fs/gs */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_CONST;
+ op->dest.type = OP_DEST_PUSH;
+
+ } else if (op2 == 0xa1 || op2 == 0xa9) {
+
+ /* pop fs/gs */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_POP;
+ op->dest.type = OP_DEST_MEM;
+ }

break;

- case 0xc9: /* leave */
- *type = INSN_FP_RESTORE;
+ case 0xc9:
+ /*
+ * leave
+ *
+ * equivalent to:
+ * mov bp, sp
+ * pop bp
+ */
+ *type = INSN_STACK;
+ op->dest.type = OP_DEST_LEAVE;
+
break;

- case 0xe3: /* jecxz/jrcxz */
+ case 0xe3:
+ /* jecxz/jrcxz */
*type = INSN_JUMP_CONDITIONAL;
break;

@@ -158,14 +451,27 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
break;

case 0xff:
- ext = X86_MODRM_REG(insn.modrm.bytes[0]);
- if (ext == 2 || ext == 3)
+ if (modrm_reg == 2 || modrm_reg == 3)
+
*type = INSN_CALL_DYNAMIC;
- else if (ext == 4)
+
+ else if (modrm_reg == 4)
+
*type = INSN_JUMP_DYNAMIC;
- else if (ext == 5) /*jmpf */
+
+ else if (modrm_reg == 5)
+
+ /* jmpf */
*type = INSN_CONTEXT_SWITCH;

+ else if (modrm_reg == 6) {
+
+ /* push from mem */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_CONST;
+ op->dest.type = OP_DEST_PUSH;
+ }
+
break;

default:
@@ -176,3 +482,21 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,

return 0;
}
+
+void arch_initial_func_cfi_state(struct cfi_state *state)
+{
+ int i;
+
+ for (i = 0; i < CFI_NUM_REGS; i++) {
+ state->regs[i].base = CFI_UNDEFINED;
+ state->regs[i].offset = 0;
+ }
+
+ /* initial CFA (call frame address) */
+ state->cfa.base = CFI_SP;
+ state->cfa.offset = 8;
+
+ /* initial RA (return address) */
+ state->regs[16].base = CFI_CFA;
+ state->regs[16].offset = -8;
+}
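
To make the initial state above concrete (editorial sketch, not part of the
patch): at the first instruction of a function the CFA is %rsp + 8 and the
return address is stored at CFA - 8, i.e. at the top of the stack, so a
consumer of this state would recover the caller's address like this:

	/* Sketch only: apply the initial CFI state at function entry. */
	static unsigned long entry_return_address(unsigned long sp)
	{
		unsigned long cfa = sp + 8;		/* cfa = {CFI_SP, 8}   */

		return *(unsigned long *)(cfa - 8);	/* RA  = {CFI_CFA, -8} */
	}
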
diff --git a/tools/objtool/cfi.h b/tools/objtool/cfi.h
new file mode 100644
index 0000000..443ab2c
--- /dev/null
+++ b/tools/objtool/cfi.h
@@ -0,0 +1,55 @@
+/*
+ * Copyright (C) 2015-2017 Josh Poimboeuf <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _OBJTOOL_CFI_H
+#define _OBJTOOL_CFI_H
+
+#define CFI_UNDEFINED -1
+#define CFI_CFA -2
+#define CFI_SP_INDIRECT -3
+#define CFI_BP_INDIRECT -4
+
+#define CFI_AX 0
+#define CFI_DX 1
+#define CFI_CX 2
+#define CFI_BX 3
+#define CFI_SI 4
+#define CFI_DI 5
+#define CFI_BP 6
+#define CFI_SP 7
+#define CFI_R8 8
+#define CFI_R9 9
+#define CFI_R10 10
+#define CFI_R11 11
+#define CFI_R12 12
+#define CFI_R13 13
+#define CFI_R14 14
+#define CFI_R15 15
+#define CFI_RA 16
+#define CFI_NUM_REGS 17
+
+struct cfi_reg {
+ int base;
+ int offset;
+};
+
+struct cfi_state {
+ struct cfi_reg cfa;
+ struct cfi_reg regs[CFI_NUM_REGS];
+};
+
+#endif /* _OBJTOOL_CFI_H */
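
A worked example of how this state evolves (sketch, not from the patch): a
standard frame-pointer prologue shifts the CFA and records where %rbp was
saved, producing exactly the state which has_valid_stack_frame() in the
check.c hunks below accepts.

	function entry:    cfa = {CFI_SP,  8}    regs[CFI_BP] = undefined
	push %rbp:         cfa = {CFI_SP, 16}    regs[CFI_BP] = {CFI_CFA, -16}
	mov  %rsp, %rbp:   cfa = {CFI_BP, 16}    regs[CFI_BP] = {CFI_CFA, -16}
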
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 231a360..2f80aa51 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -27,10 +27,6 @@
#include <linux/hashtable.h>
#include <linux/kernel.h>

-#define STATE_FP_SAVED 0x1
-#define STATE_FP_SETUP 0x2
-#define STATE_FENTRY 0x4
-
struct alternative {
struct list_head list;
struct instruction *insn;
@@ -38,6 +34,7 @@ struct alternative {

const char *objname;
static bool nofp;
+struct cfi_state initial_func_cfi;

static struct instruction *find_insn(struct objtool_file *file,
struct section *sec, unsigned long offset)
@@ -56,7 +53,7 @@ static struct instruction *next_insn_same_sec(struct objtool_file *file,
{
struct instruction *next = list_next_entry(insn, list);

- if (&next->list == &file->insn_list || next->sec != insn->sec)
+ if (!next || &next->list == &file->insn_list || next->sec != insn->sec)
return NULL;

return next;
@@ -67,7 +64,7 @@ static bool gcov_enabled(struct objtool_file *file)
struct section *sec;
struct symbol *sym;

- list_for_each_entry(sec, &file->elf->sections, list)
+ for_each_sec(file, sec)
list_for_each_entry(sym, &sec->symbol_list, list)
if (!strncmp(sym->name, "__gcov_.", 8))
return true;
@@ -75,9 +72,6 @@ static bool gcov_enabled(struct objtool_file *file)
return false;
}

-#define for_each_insn(file, insn) \
- list_for_each_entry(insn, &file->insn_list, list)
-
#define func_for_each_insn(file, func, insn) \
for (insn = find_insn(file, func->sec, func->offset); \
insn && &insn->list != &file->insn_list && \
@@ -94,6 +88,9 @@ static bool gcov_enabled(struct objtool_file *file)
#define sec_for_each_insn_from(file, insn) \
for (; insn; insn = next_insn_same_sec(file, insn))

+#define sec_for_each_insn_continue(file, insn) \
+ for (insn = next_insn_same_sec(file, insn); insn; \
+ insn = next_insn_same_sec(file, insn))

/*
* Check if the function has been manually whitelisted with the
@@ -103,7 +100,6 @@ static bool gcov_enabled(struct objtool_file *file)
static bool ignore_func(struct objtool_file *file, struct symbol *func)
{
struct rela *rela;
- struct instruction *insn;

/* check for STACK_FRAME_NON_STANDARD */
if (file->whitelist && file->whitelist->rela)
@@ -116,11 +112,6 @@ static bool ignore_func(struct objtool_file *file, struct symbol *func)
return true;
}

- /* check if it has a context switching instruction */
- func_for_each_insn(file, func, insn)
- if (insn->type == INSN_CONTEXT_SWITCH)
- return true;
-
return false;
}

@@ -234,6 +225,17 @@ static int dead_end_function(struct objtool_file *file, struct symbol *func)
return __dead_end_function(file, func, 0);
}

+static void clear_insn_state(struct insn_state *state)
+{
+ int i;
+
+ memset(state, 0, sizeof(*state));
+ state->cfa.base = CFI_UNDEFINED;
+ for (i = 0; i < CFI_NUM_REGS; i++)
+ state->regs[i].base = CFI_UNDEFINED;
+ state->drap_reg = CFI_UNDEFINED;
+}
+
/*
* Call the arch-specific instruction decoder for all the instructions and add
* them to the global instruction list.
@@ -246,23 +248,29 @@ static int decode_instructions(struct objtool_file *file)
struct instruction *insn;
int ret;

- list_for_each_entry(sec, &file->elf->sections, list) {
+ for_each_sec(file, sec) {

if (!(sec->sh.sh_flags & SHF_EXECINSTR))
continue;

for (offset = 0; offset < sec->len; offset += insn->len) {
insn = malloc(sizeof(*insn));
+ if (!insn) {
+ WARN("malloc failed");
+ return -1;
+ }
memset(insn, 0, sizeof(*insn));
-
INIT_LIST_HEAD(&insn->alts);
+ clear_insn_state(&insn->state);
+
insn->sec = sec;
insn->offset = offset;

ret = arch_decode_instruction(file->elf, sec, offset,
sec->len - offset,
&insn->len, &insn->type,
- &insn->immediate);
+ &insn->immediate,
+ &insn->stack_op);
if (ret)
return ret;

@@ -352,7 +360,7 @@ static void add_ignores(struct objtool_file *file)
struct section *sec;
struct symbol *func;

- list_for_each_entry(sec, &file->elf->sections, list) {
+ for_each_sec(file, sec) {
list_for_each_entry(func, &sec->symbol_list, list) {
if (func->type != STT_FUNC)
continue;
@@ -361,7 +369,7 @@ static void add_ignores(struct objtool_file *file)
continue;

func_for_each_insn(file, func, insn)
- insn->visited = true;
+ insn->ignore = true;
}
}
}
@@ -381,8 +389,7 @@ static int add_jump_destinations(struct objtool_file *file)
insn->type != INSN_JUMP_UNCONDITIONAL)
continue;

- /* skip ignores */
- if (insn->visited)
+ if (insn->ignore)
continue;

rela = find_rela_by_dest_range(insn->sec, insn->offset,
@@ -519,10 +526,13 @@ static int handle_group_alt(struct objtool_file *file,
}
memset(fake_jump, 0, sizeof(*fake_jump));
INIT_LIST_HEAD(&fake_jump->alts);
+ clear_insn_state(&fake_jump->state);
+
fake_jump->sec = special_alt->new_sec;
fake_jump->offset = -1;
fake_jump->type = INSN_JUMP_UNCONDITIONAL;
fake_jump->jump_dest = list_next_entry(last_orig_insn, list);
+ fake_jump->ignore = true;

if (!special_alt->new_len) {
*new_insn = fake_jump;
@@ -844,7 +854,7 @@ static int add_switch_table_alts(struct objtool_file *file)
if (!file->rodata || !file->rodata->rela)
return 0;

- list_for_each_entry(sec, &file->elf->sections, list) {
+ for_each_sec(file, sec) {
list_for_each_entry(func, &sec->symbol_list, list) {
if (func->type != STT_FUNC)
continue;
@@ -901,21 +911,423 @@ static bool is_fentry_call(struct instruction *insn)
return false;
}

-static bool has_modified_stack_frame(struct instruction *insn)
+static bool has_modified_stack_frame(struct insn_state *state)
{
- return (insn->state & STATE_FP_SAVED) ||
- (insn->state & STATE_FP_SETUP);
+ int i;
+
+ if (state->cfa.base != initial_func_cfi.cfa.base ||
+ state->cfa.offset != initial_func_cfi.cfa.offset ||
+ state->stack_size != initial_func_cfi.cfa.offset ||
+ state->drap)
+ return true;
+
+ for (i = 0; i < CFI_NUM_REGS; i++)
+ if (state->regs[i].base != initial_func_cfi.regs[i].base ||
+ state->regs[i].offset != initial_func_cfi.regs[i].offset)
+ return true;
+
+ return false;
+}
+
+static bool has_valid_stack_frame(struct insn_state *state)
+{
+ if (state->cfa.base == CFI_BP && state->regs[CFI_BP].base == CFI_CFA &&
+ state->regs[CFI_BP].offset == -16)
+ return true;
+
+ if (state->drap && state->regs[CFI_BP].base == CFI_BP)
+ return true;
+
+ return false;
}

-static bool has_valid_stack_frame(struct instruction *insn)
+static void save_reg(struct insn_state *state, unsigned char reg, int base,
+ int offset)
{
- return (insn->state & STATE_FP_SAVED) &&
- (insn->state & STATE_FP_SETUP);
+ if ((arch_callee_saved_reg(reg) ||
+ (state->drap && reg == state->drap_reg)) &&
+ state->regs[reg].base == CFI_UNDEFINED) {
+ state->regs[reg].base = base;
+ state->regs[reg].offset = offset;
+ }
}

-static unsigned int frame_state(unsigned long state)
+static void restore_reg(struct insn_state *state, unsigned char reg)
{
- return (state & (STATE_FP_SAVED | STATE_FP_SETUP));
+ state->regs[reg].base = CFI_UNDEFINED;
+ state->regs[reg].offset = 0;
+}
+
+/*
+ * A note about DRAP stack alignment:
+ *
+ * GCC has the concept of a DRAP register, which is used to help keep track of
+ * the stack pointer when aligning the stack. r10 or r13 is used as the DRAP
+ * register. The typical DRAP pattern is:
+ *
+ * 4c 8d 54 24 08 lea 0x8(%rsp),%r10
+ * 48 83 e4 c0 and $0xffffffffffffffc0,%rsp
+ * 41 ff 72 f8 pushq -0x8(%r10)
+ * 55 push %rbp
+ * 48 89 e5 mov %rsp,%rbp
+ * (more pushes)
+ * 41 52 push %r10
+ * ...
+ * 41 5a pop %r10
+ * (more pops)
+ * 5d pop %rbp
+ * 49 8d 62 f8 lea -0x8(%r10),%rsp
+ * c3 retq
+ *
+ * There are some variations in the epilogues, like:
+ *
+ * 5b pop %rbx
+ * 41 5a pop %r10
+ * 41 5c pop %r12
+ * 41 5d pop %r13
+ * 41 5e pop %r14
+ * c9 leaveq
+ * 49 8d 62 f8 lea -0x8(%r10),%rsp
+ * c3 retq
+ *
+ * and:
+ *
+ * 4c 8b 55 e8 mov -0x18(%rbp),%r10
+ * 48 8b 5d e0 mov -0x20(%rbp),%rbx
+ * 4c 8b 65 f0 mov -0x10(%rbp),%r12
+ * 4c 8b 6d f8 mov -0x8(%rbp),%r13
+ * c9 leaveq
+ * 49 8d 62 f8 lea -0x8(%r10),%rsp
+ * c3 retq
+ *
+ * Sometimes r13 is used as the DRAP register, in which case it's saved and
+ * restored beforehand:
+ *
+ * 41 55 push %r13
+ * 4c 8d 6c 24 10 lea 0x10(%rsp),%r13
+ * 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
+ * ...
+ * 49 8d 65 f0 lea -0x10(%r13),%rsp
+ * 41 5d pop %r13
+ * c3 retq
+ */
+static int update_insn_state(struct instruction *insn, struct insn_state *state)
+{
+ struct stack_op *op = &insn->stack_op;
+ struct cfi_reg *cfa = &state->cfa;
+ struct cfi_reg *regs = state->regs;
+
+ /* stack operations don't make sense with an undefined CFA */
+ if (cfa->base == CFI_UNDEFINED) {
+ if (insn->func) {
+ WARN_FUNC("undefined stack state", insn->sec, insn->offset);
+ return -1;
+ }
+ return 0;
+ }
+
+ switch (op->dest.type) {
+
+ case OP_DEST_REG:
+ switch (op->src.type) {
+
+ case OP_SRC_REG:
+ if (cfa->base == op->src.reg && cfa->base == CFI_SP &&
+ op->dest.reg == CFI_BP && regs[CFI_BP].base == CFI_CFA &&
+ regs[CFI_BP].offset == -cfa->offset) {
+
+ /* mov %rsp, %rbp */
+ cfa->base = op->dest.reg;
+ state->bp_scratch = false;
+ } else if (state->drap) {
+
+ /* drap: mov %rsp, %rbp */
+ regs[CFI_BP].base = CFI_BP;
+ regs[CFI_BP].offset = -state->stack_size;
+ state->bp_scratch = false;
+ } else if (!nofp) {
+
+ WARN_FUNC("unknown stack-related register move",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ break;
+
+ case OP_SRC_ADD:
+ if (op->dest.reg == CFI_SP && op->src.reg == CFI_SP) {
+
+ /* add imm, %rsp */
+ state->stack_size -= op->src.offset;
+ if (cfa->base == CFI_SP)
+ cfa->offset -= op->src.offset;
+ break;
+ }
+
+ if (op->dest.reg == CFI_SP && op->src.reg == CFI_BP) {
+
+ /* lea disp(%rbp), %rsp */
+ state->stack_size = -(op->src.offset + regs[CFI_BP].offset);
+ break;
+ }
+
+ if (op->dest.reg != CFI_BP && op->src.reg == CFI_SP &&
+ cfa->base == CFI_SP) {
+
+ /* drap: lea disp(%rsp), %drap */
+ state->drap_reg = op->dest.reg;
+ break;
+ }
+
+ if (state->drap && op->dest.reg == CFI_SP &&
+ op->src.reg == state->drap_reg) {
+
+ /* drap: lea disp(%drap), %rsp */
+ cfa->base = CFI_SP;
+ cfa->offset = state->stack_size = -op->src.offset;
+ state->drap_reg = CFI_UNDEFINED;
+ state->drap = false;
+ break;
+ }
+
+ if (op->dest.reg == state->cfa.base) {
+ WARN_FUNC("unsupported stack register modification",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ break;
+
+ case OP_SRC_AND:
+ if (op->dest.reg != CFI_SP ||
+ (state->drap_reg != CFI_UNDEFINED && cfa->base != CFI_SP) ||
+ (state->drap_reg == CFI_UNDEFINED && cfa->base != CFI_BP)) {
+ WARN_FUNC("unsupported stack pointer realignment",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ if (state->drap_reg != CFI_UNDEFINED) {
+ /* drap: and imm, %rsp */
+ cfa->base = state->drap_reg;
+ cfa->offset = state->stack_size = 0;
+ state->drap = true;
+
+ }
+
+ /*
+ * Older versions of GCC (4.8ish) realign the stack
+ * without DRAP, with a frame pointer.
+ */
+
+ break;
+
+ case OP_SRC_POP:
+ if (!state->drap && op->dest.type == OP_DEST_REG &&
+ op->dest.reg == cfa->base) {
+
+ /* pop %rbp */
+ cfa->base = CFI_SP;
+ }
+
+ if (regs[op->dest.reg].offset == -state->stack_size) {
+
+ if (state->drap && cfa->base == CFI_BP_INDIRECT &&
+ op->dest.type == OP_DEST_REG &&
+ op->dest.reg == state->drap_reg) {
+
+ /* drap: pop %drap */
+ cfa->base = state->drap_reg;
+ cfa->offset = 0;
+ }
+
+ restore_reg(state, op->dest.reg);
+ }
+
+ state->stack_size -= 8;
+ if (cfa->base == CFI_SP)
+ cfa->offset -= 8;
+
+ break;
+
+ case OP_SRC_REG_INDIRECT:
+ if (state->drap && op->src.reg == CFI_BP &&
+ op->src.offset == regs[op->dest.reg].offset) {
+
+ /* drap: mov disp(%rbp), %reg */
+ if (op->dest.reg == state->drap_reg) {
+ cfa->base = state->drap_reg;
+ cfa->offset = 0;
+ }
+
+ restore_reg(state, op->dest.reg);
+
+ } else if (op->src.reg == cfa->base &&
+ op->src.offset == regs[op->dest.reg].offset + cfa->offset) {
+
+ /* mov disp(%rbp), %reg */
+ /* mov disp(%rsp), %reg */
+ restore_reg(state, op->dest.reg);
+ }
+
+ break;
+
+ default:
+ WARN_FUNC("unknown stack-related instruction",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ break;
+
+ case OP_DEST_PUSH:
+ state->stack_size += 8;
+ if (cfa->base == CFI_SP)
+ cfa->offset += 8;
+
+ if (op->src.type != OP_SRC_REG)
+ break;
+
+ if (state->drap) {
+ if (op->src.reg == cfa->base && op->src.reg == state->drap_reg) {
+
+ /* drap: push %drap */
+ cfa->base = CFI_BP_INDIRECT;
+ cfa->offset = -state->stack_size;
+
+ /* save drap so we know when to undefine it */
+ save_reg(state, op->src.reg, CFI_CFA, -state->stack_size);
+
+ } else if (op->src.reg == CFI_BP && cfa->base == state->drap_reg) {
+
+ /* drap: push %rbp */
+ state->stack_size = 0;
+
+ } else if (regs[op->src.reg].base == CFI_UNDEFINED) {
+
+ /* drap: push %reg */
+ save_reg(state, op->src.reg, CFI_BP, -state->stack_size);
+ }
+
+ } else {
+
+ /* push %reg */
+ save_reg(state, op->src.reg, CFI_CFA, -state->stack_size);
+ }
+
+ /* detect when asm code uses rbp as a scratch register */
+ if (!nofp && insn->func && op->src.reg == CFI_BP &&
+ cfa->base != CFI_BP)
+ state->bp_scratch = true;
+ break;
+
+ case OP_DEST_REG_INDIRECT:
+
+ if (state->drap) {
+ if (op->src.reg == cfa->base && op->src.reg == state->drap_reg) {
+
+ /* drap: mov %drap, disp(%rbp) */
+ cfa->base = CFI_BP_INDIRECT;
+ cfa->offset = op->dest.offset;
+
+ /* save drap so we know when to undefine it */
+ save_reg(state, op->src.reg, CFI_CFA, op->dest.offset);
+ }
+
+ else if (regs[op->src.reg].base == CFI_UNDEFINED) {
+
+ /* drap: mov reg, disp(%rbp) */
+ save_reg(state, op->src.reg, CFI_BP, op->dest.offset);
+ }
+
+ } else if (op->dest.reg == cfa->base) {
+
+ /* mov reg, disp(%rbp) */
+ /* mov reg, disp(%rsp) */
+ save_reg(state, op->src.reg, CFI_CFA,
+ op->dest.offset - state->cfa.offset);
+ }
+
+ break;
+
+ case OP_DEST_LEAVE:
+ if ((!state->drap && cfa->base != CFI_BP) ||
+ (state->drap && cfa->base != state->drap_reg)) {
+ WARN_FUNC("leave instruction with modified stack frame",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ /* leave (mov %rbp, %rsp; pop %rbp) */
+
+ state->stack_size = -state->regs[CFI_BP].offset - 8;
+ restore_reg(state, CFI_BP);
+
+ if (!state->drap) {
+ cfa->base = CFI_SP;
+ cfa->offset -= 8;
+ }
+
+ break;
+
+ case OP_DEST_MEM:
+ if (op->src.type != OP_SRC_POP) {
+ WARN_FUNC("unknown stack-related memory operation",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ /* pop mem */
+ state->stack_size -= 8;
+ if (cfa->base == CFI_SP)
+ cfa->offset -= 8;
+
+ break;
+
+ default:
+ WARN_FUNC("unknown stack-related instruction",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ return 0;
+}
+
+static bool insn_state_match(struct instruction *insn, struct insn_state *state)
+{
+ struct insn_state *state1 = &insn->state, *state2 = state;
+ int i;
+
+ if (memcmp(&state1->cfa, &state2->cfa, sizeof(state1->cfa))) {
+ WARN_FUNC("stack state mismatch: cfa1=%d%+d cfa2=%d%+d",
+ insn->sec, insn->offset,
+ state1->cfa.base, state1->cfa.offset,
+ state2->cfa.base, state2->cfa.offset);
+
+ } else if (memcmp(&state1->regs, &state2->regs, sizeof(state1->regs))) {
+ for (i = 0; i < CFI_NUM_REGS; i++) {
+ if (!memcmp(&state1->regs[i], &state2->regs[i],
+ sizeof(struct cfi_reg)))
+ continue;
+
+ WARN_FUNC("stack state mismatch: reg1[%d]=%d%+d reg2[%d]=%d%+d",
+ insn->sec, insn->offset,
+ i, state1->regs[i].base, state1->regs[i].offset,
+ i, state2->regs[i].base, state2->regs[i].offset);
+ break;
+ }
+
+ } else if (state1->drap != state2->drap ||
+ (state1->drap && state1->drap_reg != state2->drap_reg)) {
+ WARN_FUNC("stack state mismatch: drap1=%d(%d) drap2=%d(%d)",
+ insn->sec, insn->offset,
+ state1->drap, state1->drap_reg,
+ state2->drap, state2->drap_reg);
+
+ } else
+ return true;
+
+ return false;
}

/*
@@ -924,24 +1336,22 @@ static unsigned int frame_state(unsigned long state)
* each instruction and validate all the rules described in
* tools/objtool/Documentation/stack-validation.txt.
*/
-static int validate_branch(struct objtool_file *file,
- struct instruction *first, unsigned char first_state)
+static int validate_branch(struct objtool_file *file, struct instruction *first,
+ struct insn_state state)
{
struct alternative *alt;
struct instruction *insn;
struct section *sec;
struct symbol *func = NULL;
- unsigned char state;
int ret;

insn = first;
sec = insn->sec;
- state = first_state;

if (insn->alt_group && list_empty(&insn->alts)) {
WARN_FUNC("don't know how to handle branch to middle of alternative instruction group",
sec, insn->offset);
- return 1;
+ return -1;
}

while (1) {
@@ -951,23 +1361,21 @@ static int validate_branch(struct objtool_file *file,
func->name, insn->func->name);
return 1;
}
-
- func = insn->func;
}

+ func = insn->func;
+
if (insn->visited) {
- if (frame_state(insn->state) != frame_state(state)) {
- WARN_FUNC("frame pointer state mismatch",
- sec, insn->offset);
+ if (!insn_state_match(insn, &state))
return 1;
- }

return 0;
}

- insn->visited = true;
insn->state = state;

+ insn->visited = true;
+
list_for_each_entry(alt, &insn->alts, list) {
ret = validate_branch(file, alt->insn, state);
if (ret)
@@ -976,50 +1384,24 @@ static int validate_branch(struct objtool_file *file,

switch (insn->type) {

- case INSN_FP_SAVE:
- if (!nofp) {
- if (state & STATE_FP_SAVED) {
- WARN_FUNC("duplicate frame pointer save",
- sec, insn->offset);
- return 1;
- }
- state |= STATE_FP_SAVED;
- }
- break;
-
- case INSN_FP_SETUP:
- if (!nofp) {
- if (state & STATE_FP_SETUP) {
- WARN_FUNC("duplicate frame pointer setup",
- sec, insn->offset);
- return 1;
- }
- state |= STATE_FP_SETUP;
- }
- break;
-
- case INSN_FP_RESTORE:
- if (!nofp) {
- if (has_valid_stack_frame(insn))
- state &= ~STATE_FP_SETUP;
-
- state &= ~STATE_FP_SAVED;
- }
- break;
-
case INSN_RETURN:
- if (!nofp && has_modified_stack_frame(insn)) {
- WARN_FUNC("return without frame pointer restore",
+ if (func && has_modified_stack_frame(&state)) {
+ WARN_FUNC("return with modified stack frame",
sec, insn->offset);
return 1;
}
+
+ if (state.bp_scratch) {
+ WARN("%s uses BP as a scratch register",
+ insn->func->name);
+ return 1;
+ }
+
return 0;

case INSN_CALL:
- if (is_fentry_call(insn)) {
- state |= STATE_FENTRY;
+ if (is_fentry_call(insn))
break;
- }

ret = dead_end_function(file, insn->call_dest);
if (ret == 1)
@@ -1029,7 +1411,7 @@ static int validate_branch(struct objtool_file *file,

/* fallthrough */
case INSN_CALL_DYNAMIC:
- if (!nofp && !has_valid_stack_frame(insn)) {
+ if (!nofp && func && !has_valid_stack_frame(&state)) {
WARN_FUNC("call without frame pointer save/setup",
sec, insn->offset);
return 1;
@@ -1043,8 +1425,8 @@ static int validate_branch(struct objtool_file *file,
state);
if (ret)
return 1;
- } else if (has_modified_stack_frame(insn)) {
- WARN_FUNC("sibling call from callable instruction with changed frame pointer",
+ } else if (func && has_modified_stack_frame(&state)) {
+ WARN_FUNC("sibling call from callable instruction with modified stack frame",
sec, insn->offset);
return 1;
} /* else it's a sibling call */
@@ -1055,15 +1437,29 @@ static int validate_branch(struct objtool_file *file,
break;

case INSN_JUMP_DYNAMIC:
- if (list_empty(&insn->alts) &&
- has_modified_stack_frame(insn)) {
- WARN_FUNC("sibling call from callable instruction with changed frame pointer",
+ if (func && list_empty(&insn->alts) &&
+ has_modified_stack_frame(&state)) {
+ WARN_FUNC("sibling call from callable instruction with modified stack frame",
sec, insn->offset);
return 1;
}

return 0;

+ case INSN_CONTEXT_SWITCH:
+ if (func) {
+ WARN_FUNC("unsupported instruction in callable function",
+ sec, insn->offset);
+ return 1;
+ }
+ return 0;
+
+ case INSN_STACK:
+ if (update_insn_state(insn, &state))
+ return -1;
+
+ break;
+
default:
break;
}
@@ -1094,12 +1490,18 @@ static bool is_ubsan_insn(struct instruction *insn)
"__ubsan_handle_builtin_unreachable"));
}

-static bool ignore_unreachable_insn(struct symbol *func,
- struct instruction *insn)
+static bool ignore_unreachable_insn(struct instruction *insn)
{
int i;

- if (insn->type == INSN_NOP)
+ if (insn->ignore || insn->type == INSN_NOP)
+ return true;
+
+ /*
+ * Ignore any unused exceptions. This can happen when a whitelisted
+ * function has an exception table entry.
+ */
+ if (!strcmp(insn->sec->name, ".fixup"))
return true;

/*
@@ -1108,6 +1510,8 @@ static bool ignore_unreachable_insn(struct symbol *func,
*
* End the search at 5 instructions to avoid going into the weeds.
*/
+ if (!insn->func)
+ return false;
for (i = 0; i < 5; i++) {

if (is_kasan_insn(insn) || is_ubsan_insn(insn))
@@ -1118,7 +1522,7 @@ static bool ignore_unreachable_insn(struct symbol *func,
continue;
}

- if (insn->offset + insn->len >= func->offset + func->len)
+ if (insn->offset + insn->len >= insn->func->offset + insn->func->len)
break;
insn = list_next_entry(insn, list);
}
@@ -1131,73 +1535,58 @@ static int validate_functions(struct objtool_file *file)
struct section *sec;
struct symbol *func;
struct instruction *insn;
+ struct insn_state state;
int ret, warnings = 0;

- list_for_each_entry(sec, &file->elf->sections, list) {
+ clear_insn_state(&state);
+
+ state.cfa = initial_func_cfi.cfa;
+ memcpy(&state.regs, &initial_func_cfi.regs,
+ CFI_NUM_REGS * sizeof(struct cfi_reg));
+ state.stack_size = initial_func_cfi.cfa.offset;
+
+ for_each_sec(file, sec) {
list_for_each_entry(func, &sec->symbol_list, list) {
if (func->type != STT_FUNC)
continue;

insn = find_insn(file, sec, func->offset);
- if (!insn)
+ if (!insn || insn->ignore)
continue;

- ret = validate_branch(file, insn, 0);
+ ret = validate_branch(file, insn, state);
warnings += ret;
}
}

- list_for_each_entry(sec, &file->elf->sections, list) {
- list_for_each_entry(func, &sec->symbol_list, list) {
- if (func->type != STT_FUNC)
- continue;
-
- func_for_each_insn(file, func, insn) {
- if (insn->visited)
- continue;
-
- insn->visited = true;
-
- if (file->ignore_unreachables || warnings ||
- ignore_unreachable_insn(func, insn))
- continue;
-
- /*
- * gcov produces a lot of unreachable
- * instructions. If we get an unreachable
- * warning and the file has gcov enabled, just
- * ignore it, and all other such warnings for
- * the file.
- */
- if (!file->ignore_unreachables &&
- gcov_enabled(file)) {
- file->ignore_unreachables = true;
- continue;
- }
-
- WARN_FUNC("function has unreachable instruction", insn->sec, insn->offset);
- warnings++;
- }
- }
- }
-
return warnings;
}

-static int validate_uncallable_instructions(struct objtool_file *file)
+static int validate_reachable_instructions(struct objtool_file *file)
{
struct instruction *insn;
- int warnings = 0;
+
+ if (file->ignore_unreachables)
+ return 0;

for_each_insn(file, insn) {
- if (!insn->visited && insn->type == INSN_RETURN) {
- WARN_FUNC("return instruction outside of a callable function",
- insn->sec, insn->offset);
- warnings++;
- }
+ if (insn->visited || ignore_unreachable_insn(insn))
+ continue;
+
+ /*
+ * gcov produces a lot of unreachable instructions. If we get
+ * an unreachable warning and the file has gcov enabled, just
+ * ignore it, and all other such warnings for the file. Do
+ * this here because this is an expensive function.
+ */
+ if (gcov_enabled(file))
+ return 0;
+
+ WARN_FUNC("unreachable instruction", insn->sec, insn->offset);
+ return 1;
}

- return warnings;
+ return 0;
}

static void cleanup(struct objtool_file *file)
@@ -1226,10 +1615,8 @@ int check(const char *_objname, bool _nofp)
nofp = _nofp;

file.elf = elf_open(objname);
- if (!file.elf) {
- fprintf(stderr, "error reading elf file %s\n", objname);
+ if (!file.elf)
return 1;
- }

INIT_LIST_HEAD(&file.insn_list);
hash_init(file.insn_hash);
@@ -1238,21 +1625,28 @@ int check(const char *_objname, bool _nofp)
file.ignore_unreachables = false;
file.c_file = find_section_by_name(file.elf, ".comment");

+ arch_initial_func_cfi_state(&initial_func_cfi);
+
ret = decode_sections(&file);
if (ret < 0)
goto out;
warnings += ret;

- ret = validate_functions(&file);
- if (ret < 0)
+ if (list_empty(&file.insn_list))
goto out;
- warnings += ret;

- ret = validate_uncallable_instructions(&file);
+ ret = validate_functions(&file);
if (ret < 0)
goto out;
warnings += ret;

+ if (!warnings) {
+ ret = validate_reachable_instructions(&file);
+ if (ret < 0)
+ goto out;
+ warnings += ret;
+ }
+
out:
cleanup(&file);

diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index c0d2fde..da85f5b 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -20,22 +20,34 @@

#include <stdbool.h>
#include "elf.h"
+#include "cfi.h"
#include "arch.h"
#include <linux/hashtable.h>

+struct insn_state {
+ struct cfi_reg cfa;
+ struct cfi_reg regs[CFI_NUM_REGS];
+ int stack_size;
+ bool bp_scratch;
+ bool drap;
+ int drap_reg;
+};
+
struct instruction {
struct list_head list;
struct hlist_node hash;
struct section *sec;
unsigned long offset;
- unsigned int len, state;
+ unsigned int len;
unsigned char type;
unsigned long immediate;
- bool alt_group, visited, dead_end;
+ bool alt_group, visited, dead_end, ignore;
struct symbol *call_dest;
struct instruction *jump_dest;
struct list_head alts;
struct symbol *func;
+ struct stack_op stack_op;
+ struct insn_state state;
};

struct objtool_file {
@@ -48,4 +60,7 @@ struct objtool_file {

int check(const char *objname, bool nofp);

+#define for_each_insn(file, insn) \
+ list_for_each_entry(insn, &file->insn_list, list)
+
#endif /* _CHECK_H */
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index d897702..1a7e8aa 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -37,6 +37,9 @@
#define ELF_C_READ_MMAP ELF_C_READ
#endif

+#define WARN_ELF(format, ...) \
+ WARN(format ": %s", ##__VA_ARGS__, elf_errmsg(-1))
+
struct section *find_section_by_name(struct elf *elf, const char *name)
{
struct section *sec;
@@ -139,12 +142,12 @@ static int read_sections(struct elf *elf)
int i;

if (elf_getshdrnum(elf->elf, &sections_nr)) {
- perror("elf_getshdrnum");
+ WARN_ELF("elf_getshdrnum");
return -1;
}

if (elf_getshdrstrndx(elf->elf, &shstrndx)) {
- perror("elf_getshdrstrndx");
+ WARN_ELF("elf_getshdrstrndx");
return -1;
}

@@ -165,37 +168,36 @@ static int read_sections(struct elf *elf)

s = elf_getscn(elf->elf, i);
if (!s) {
- perror("elf_getscn");
+ WARN_ELF("elf_getscn");
return -1;
}

sec->idx = elf_ndxscn(s);

if (!gelf_getshdr(s, &sec->sh)) {
- perror("gelf_getshdr");
+ WARN_ELF("gelf_getshdr");
return -1;
}

sec->name = elf_strptr(elf->elf, shstrndx, sec->sh.sh_name);
if (!sec->name) {
- perror("elf_strptr");
+ WARN_ELF("elf_strptr");
return -1;
}

- sec->elf_data = elf_getdata(s, NULL);
- if (!sec->elf_data) {
- perror("elf_getdata");
+ sec->data = elf_getdata(s, NULL);
+ if (!sec->data) {
+ WARN_ELF("elf_getdata");
return -1;
}

- if (sec->elf_data->d_off != 0 ||
- sec->elf_data->d_size != sec->sh.sh_size) {
+ if (sec->data->d_off != 0 ||
+ sec->data->d_size != sec->sh.sh_size) {
WARN("unexpected data attributes for %s", sec->name);
return -1;
}

- sec->data = (unsigned long)sec->elf_data->d_buf;
- sec->len = sec->elf_data->d_size;
+ sec->len = sec->data->d_size;
}

/* sanity check, one more call to elf_nextscn() should return NULL */
@@ -232,15 +234,15 @@ static int read_symbols(struct elf *elf)

sym->idx = i;

- if (!gelf_getsym(symtab->elf_data, i, &sym->sym)) {
- perror("gelf_getsym");
+ if (!gelf_getsym(symtab->data, i, &sym->sym)) {
+ WARN_ELF("gelf_getsym");
goto err;
}

sym->name = elf_strptr(elf->elf, symtab->sh.sh_link,
sym->sym.st_name);
if (!sym->name) {
- perror("elf_strptr");
+ WARN_ELF("elf_strptr");
goto err;
}

@@ -322,8 +324,8 @@ static int read_relas(struct elf *elf)
}
memset(rela, 0, sizeof(*rela));

- if (!gelf_getrela(sec->elf_data, i, &rela->rela)) {
- perror("gelf_getrela");
+ if (!gelf_getrela(sec->data, i, &rela->rela)) {
+ WARN_ELF("gelf_getrela");
return -1;
}

@@ -362,12 +364,6 @@ struct elf *elf_open(const char *name)

INIT_LIST_HEAD(&elf->sections);

- elf->name = strdup(name);
- if (!elf->name) {
- perror("strdup");
- goto err;
- }
-
elf->fd = open(name, O_RDONLY);
if (elf->fd == -1) {
perror("open");
@@ -376,12 +372,12 @@ struct elf *elf_open(const char *name)

elf->elf = elf_begin(elf->fd, ELF_C_READ_MMAP, NULL);
if (!elf->elf) {
- perror("elf_begin");
+ WARN_ELF("elf_begin");
goto err;
}

if (!gelf_getehdr(elf->elf, &elf->ehdr)) {
- perror("gelf_getehdr");
+ WARN_ELF("gelf_getehdr");
goto err;
}

@@ -407,6 +403,12 @@ void elf_close(struct elf *elf)
struct symbol *sym, *tmpsym;
struct rela *rela, *tmprela;

+ if (elf->elf)
+ elf_end(elf->elf);
+
+ if (elf->fd > 0)
+ close(elf->fd);
+
list_for_each_entry_safe(sec, tmpsec, &elf->sections, list) {
list_for_each_entry_safe(sym, tmpsym, &sec->symbol_list, list) {
list_del(&sym->list);
@@ -421,11 +423,6 @@ void elf_close(struct elf *elf)
list_del(&sec->list);
free(sec);
}
- if (elf->name)
- free(elf->name);
- if (elf->fd > 0)
- close(elf->fd);
- if (elf->elf)
- elf_end(elf->elf);
+
free(elf);
}
diff --git a/tools/objtool/elf.h b/tools/objtool/elf.h
index 731973e..343968b 100644
--- a/tools/objtool/elf.h
+++ b/tools/objtool/elf.h
@@ -37,10 +37,9 @@ struct section {
DECLARE_HASHTABLE(rela_hash, 16);
struct section *base, *rela;
struct symbol *sym;
- Elf_Data *elf_data;
+ Elf_Data *data;
char *name;
int idx;
- unsigned long data;
unsigned int len;
};

@@ -86,6 +85,7 @@ struct rela *find_rela_by_dest_range(struct section *sec, unsigned long offset,
struct symbol *find_containing_func(struct section *sec, unsigned long offset);
void elf_close(struct elf *elf);

-
+#define for_each_sec(file, sec) \
+ list_for_each_entry(sec, &file->elf->sections, list)

#endif /* _OBJTOOL_ELF_H */
diff --git a/tools/objtool/special.c b/tools/objtool/special.c
index bff8abb..84f001d 100644
--- a/tools/objtool/special.c
+++ b/tools/objtool/special.c
@@ -91,16 +91,16 @@ static int get_alt_entry(struct elf *elf, struct special_entry *entry,
alt->jump_or_nop = entry->jump_or_nop;

if (alt->group) {
- alt->orig_len = *(unsigned char *)(sec->data + offset +
+ alt->orig_len = *(unsigned char *)(sec->data->d_buf + offset +
entry->orig_len);
- alt->new_len = *(unsigned char *)(sec->data + offset +
+ alt->new_len = *(unsigned char *)(sec->data->d_buf + offset +
entry->new_len);
}

if (entry->feature) {
unsigned short feature;

- feature = *(unsigned short *)(sec->data + offset +
+ feature = *(unsigned short *)(sec->data->d_buf + offset +
entry->feature);

/*
diff --git a/tools/objtool/warn.h b/tools/objtool/warn.h
index ac7e075..afd9f7a 100644
--- a/tools/objtool/warn.h
+++ b/tools/objtool/warn.h
@@ -18,6 +18,13 @@
#ifndef _WARN_H
#define _WARN_H

+#include <stdlib.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include "elf.h"
+
extern const char *objname;

static inline char *offstr(struct section *sec, unsigned long offset)
@@ -57,4 +64,7 @@ static inline char *offstr(struct section *sec, unsigned long offset)
free(_str); \
})

+#define WARN_ELF(format, ...) \
+ WARN(format ": %s", ##__VA_ARGS__, elf_errmsg(-1))
+
#endif /* _WARN_H */

2017-06-30 13:23:46

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 3/8] objtool: stack validation 2.0

On Fri, Jun 30, 2017 at 10:32:03AM +0200, Ingo Molnar wrote:
>
> * Josh Poimboeuf <[email protected]> wrote:
>
> > This is a major rewrite of objtool. Instead of only tracking frame
> > pointer changes, it now tracks all stack-related operations, including
> > all register saves/restores.
> >
> > In addition to making stack validation more robust, this also paves the
> > way for undwarf generation.
> >
> > Signed-off-by: Josh Poimboeuf <[email protected]>
>
> Note, I have applied the first 3 patches, and got a bunch of new warnings on x86
> 64-bit allmodconfig:
>
> arch/x86/kernel/alternative.o: warning: objtool: do_sync_core()+0x1b: unsupported instruction in callable function
> arch/x86/kernel/alternative.o: warning: objtool: text_poke()+0x1a8: unsupported instruction in callable function
> arch/x86/kernel/ftrace.o: warning: objtool: do_sync_core()+0x16: unsupported instruction in callable function
> arch/x86/kernel/cpu/mcheck/mce.o: warning: objtool: machine_check_poll()+0x166: unsupported instruction in callable function
> arch/x86/kernel/cpu/mcheck/mce.o: warning: objtool: do_machine_check()+0x147: unsupported instruction in callable function
>
> (That's the vmlinux build - plus 4 more warnings in the modules build.)
>
> That's with GCC 5.3.1.
>
> Let me know if you need any more info.

Hm, this is odd. I tried with GCC 5.3.0 (which should presumably be
almost identical to 5.3.1) and I don't see any warnings. Can you send
me one of the object files it's complaining about?

--
Josh

2017-06-30 13:26:43

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 3/8] objtool: stack validation 2.0

On Fri, Jun 30, 2017 at 08:23:36AM -0500, Josh Poimboeuf wrote:
> On Fri, Jun 30, 2017 at 10:32:03AM +0200, Ingo Molnar wrote:
> >
> > * Josh Poimboeuf <[email protected]> wrote:
> >
> > > This is a major rewrite of objtool. Instead of only tracking frame
> > > pointer changes, it now tracks all stack-related operations, including
> > > all register saves/restores.
> > >
> > > In addition to making stack validation more robust, this also paves the
> > > way for undwarf generation.
> > >
> > > Signed-off-by: Josh Poimboeuf <[email protected]>
> >
> > Note, I have applied the first 3 patches, and got a bunch of new warnings on x86
> > 64-bit allmodconfig:
> >
> > arch/x86/kernel/alternative.o: warning: objtool: do_sync_core()+0x1b: unsupported instruction in callable function
> > arch/x86/kernel/alternative.o: warning: objtool: text_poke()+0x1a8: unsupported instruction in callable function
> > arch/x86/kernel/ftrace.o: warning: objtool: do_sync_core()+0x16: unsupported instruction in callable function
> > arch/x86/kernel/cpu/mcheck/mce.o: warning: objtool: machine_check_poll()+0x166: unsupported instruction in callable function
> > arch/x86/kernel/cpu/mcheck/mce.o: warning: objtool: do_machine_check()+0x147: unsupported instruction in callable function
> >
> > (That's the vmlinux build - plus 4 more warnings in the modules build.)
> >
> > That's with GCC 5.3.1.
> >
> > Let me know if you need any more info.
>
> Hm, this is odd. I tried with GCC 5.3.0 (which should presumably be
> almost identical to 5.3.1) and I don't see any warnings. Can you send
> me one of the object files it's complaining about?

Oh wait, never mind. Now that I'm actually reading the warnings, it
makes sense. Those are warnings I fixed in a later patch in the series.

I'll try to come up with something to fix those without pulling in the
rest of the patches.

--
Josh

2017-06-30 14:09:37

by Josh Poimboeuf

[permalink] [raw]
Subject: [PATCH] objtool: silence warnings for functions which use iret

On Fri, Jun 30, 2017 at 10:32:03AM +0200, Ingo Molnar wrote:
>
> * Josh Poimboeuf <[email protected]> wrote:
>
> > This is a major rewrite of objtool. Instead of only tracking frame
> > pointer changes, it now tracks all stack-related operations, including
> > all register saves/restores.
> >
> > In addition to making stack validation more robust, this also paves the
> > way for undwarf generation.
> >
> > Signed-off-by: Josh Poimboeuf <[email protected]>
>
> Note, I have applied the first 3 patches, and got a bunch of new warnings on x86
> 64-bit allmodconfig:
>
> arch/x86/kernel/alternative.o: warning: objtool: do_sync_core()+0x1b: unsupported instruction in callable function
> arch/x86/kernel/alternative.o: warning: objtool: text_poke()+0x1a8: unsupported instruction in callable function
> arch/x86/kernel/ftrace.o: warning: objtool: do_sync_core()+0x16: unsupported instruction in callable function
> arch/x86/kernel/cpu/mcheck/mce.o: warning: objtool: machine_check_poll()+0x166: unsupported instruction in callable function
> arch/x86/kernel/cpu/mcheck/mce.o: warning: objtool: do_machine_check()+0x147: unsupported instruction in callable function
>
> (That's the vmlinux build - plus 4 more warnings in the modules build.)
>
> That's with GCC 5.3.1.

Here's the fix. It can be squashed into the 3rd commit ("objtool:
Implement stack validation 2.0") or can be added as a standalone commit,
whichever you prefer.

----

From: Josh Poimboeuf <[email protected]>
Subject: [PATCH] objtool: silence warnings for functions which use iret

Previously, objtool ignored functions which have the 'iret' instruction
in them. That's because it assumed that such functions know what
they're doing with respect to frame pointers.

With the new "objtool 2.0" changes, it stopped ignoring such functions,
and started complaining about them:

arch/x86/kernel/alternative.o: warning: objtool: do_sync_core()+0x1b: unsupported instruction in callable function
arch/x86/kernel/alternative.o: warning: objtool: text_poke()+0x1a8: unsupported instruction in callable function
arch/x86/kernel/ftrace.o: warning: objtool: do_sync_core()+0x16: unsupported instruction in callable function
arch/x86/kernel/cpu/mcheck/mce.o: warning: objtool: machine_check_poll()+0x166: unsupported instruction in callable function
arch/x86/kernel/cpu/mcheck/mce.o: warning: objtool: do_machine_check()+0x147: unsupported instruction in callable function

Silence those warnings for now. They can be re-enabled later, once we
have unwind hints which will allow the code to annotate the iret usages.

Reported-by: Ingo Molnar <[email protected]>
Fixes: baa41469a7b9 ("objtool: Implement stack validation 2.0")
Signed-off-by: Josh Poimboeuf <[email protected]>
---
tools/objtool/check.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 2f80aa51..fea2221 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -100,6 +100,7 @@ static bool gcov_enabled(struct objtool_file *file)
static bool ignore_func(struct objtool_file *file, struct symbol *func)
{
struct rela *rela;
+ struct instruction *insn;

/* check for STACK_FRAME_NON_STANDARD */
if (file->whitelist && file->whitelist->rela)
@@ -112,6 +113,11 @@ static bool ignore_func(struct objtool_file *file, struct symbol *func)
return true;
}

+ /* check if it has a context switching instruction */
+ func_for_each_insn(file, func, insn)
+ if (insn->type == INSN_CONTEXT_SWITCH)
+ return true;
+
return false;
}

@@ -1446,14 +1452,6 @@ static int validate_branch(struct objtool_file *file, struct instruction *first,

return 0;

- case INSN_CONTEXT_SWITCH:
- if (func) {
- WARN_FUNC("unsupported instruction in callable function",
- sec, insn->offset);
- return 1;
- }
- return 0;
-
case INSN_STACK:
if (update_insn_state(insn, &state))
return -1;
--
2.7.5

2017-06-30 15:45:05

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations

On Fri, Jun 30, 2017 at 6:11 AM, Josh Poimboeuf <[email protected]> wrote:
> On Thu, Jun 29, 2017 at 10:41:44PM -0700, Andy Lutomirski wrote:
>> On Thu, Jun 29, 2017 at 10:05 PM, Andy Lutomirski <[email protected]> wrote:
>> > Hmm. There's another option that might be considerably nicer, though:
>> > put the IRQ stack at a known (at link time) position *in percpu
>> > space*. (Presumably it already is -- I haven't checked.) Then we do:
>> >
>> > .macro ENTER_IRQ_STACK old_rsp
>> > DEBUG_ENTRY_ASSERT_IRQS_OFF
>> > movq %rsp, \old_rsp
>> > incl PER_CPU_VAR(irq_count)
>> >
>> > /*
>> > * Right now, if we just incremented irq_count to zero, we've
>> > * claimed the IRQ stack but we haven't switched to it yet.
>> > * Anything that can interrupt us here without using IST
>> > * must be *extremely* careful to limit its stack usage.
>> > */
>> > jnz .Lpush_old_rsp_\@
>> > movq \old_rsp, PER_CPU_VAR(top_word_in_irq_stack)
>> > movq PER_CPU_VAR(irq_stack_ptr), %rsp
>> > .Lpush_old_rsp_\@:
>> > pushq \old_rsp
>> > .endm
>> >
>>
>> How about the two commits here (well, soon to be there once gitweb catches up):
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_irq_stack&id=0f56a55bb133cd53ccb78ca51378086296618322
>>
>> If you like them, want to add them to your series?
>
> The second patch looks good to me, thanks. I can pick up the patches.
>
> A few comments about the first patch:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_irq_stack&id=3e2aa2102cc1c5e60d4a8637bff78d0478a55059
>
> - It uses a '693:' label instead of '.Lirqs_off_\@:'

Touché!

>
> - There's a comment I don't follow:
>
> "Anything that can interrupt us here without using IST must be
> *extremely* careful to limit its stack usage."
>
> What specifically could interrupt there without using IST?

#DB, later on in the series. I'll update the comment.

>
> - Since do_softirq_own_stack() is a callable function, I think it still
> needs to save rbp.

Whoops.

>
> - Why change the "jmp error_exit" to "ret" in
> xen_do_hypervisor_callback()?

To match the other change I made there. I removed both.

2017-06-30 15:55:04

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations

On Fri, Jun 30, 2017 at 08:44:40AM -0700, Andy Lutomirski wrote:
> > A few comments about the first patch:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_irq_stack&id=3e2aa2102cc1c5e60d4a8637bff78d0478a55059
> >
> > - It uses a '693:' label instead of '.Lirqs_off_\@:'
>
> Touché!
>
> >
> > - There's a comment I don't follow:
> >
> > "Anything that can interrupt us here without using IST must be
> > *extremely* careful to limit its stack usage."
> >
> > What specifically could interrupt there without using IST?
>
> #DB, later on in the series. I'll update the comment.
>
> >
> > - Since do_softirq_own_stack() is a callable function, I think it still
> > needs to save rbp.
>
> Whoops.
>
> >
> > - Why change the "jmp error_exit" to "ret" in
> > xen_do_hypervisor_callback()?
>
> To match the other change I made there. I removed both.

One more thing I forgot to mention: if you could use r10 instead of r11,
that would be helpful because it means one less register undwarf needs
to know about. (It already deals with r10 because of GCC stack
realignment).

--
Josh

2017-06-30 15:56:33

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations

On Fri, Jun 30, 2017 at 8:55 AM, Josh Poimboeuf <[email protected]> wrote:
> On Fri, Jun 30, 2017 at 08:44:40AM -0700, Andy Lutomirski wrote:
>> > A few comments about the first patch:
>> >
>> > https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_irq_stack&id=3e2aa2102cc1c5e60d4a8637bff78d0478a55059
>> >
>> > - It uses a '693:' label instead of '.Lirqs_off_\@:'
>>
>> Touché!
>>
>> >
>> > - There's a comment I don't follow:
>> >
>> > "Anything that can interrupt us here without using IST must be
>> > *extremely* careful to limit its stack usage."
>> >
>> > What specifically could interrupt there without using IST?
>>
>> #DB, later on in the series. I'll update the comment.
>>
>> >
>> > - Since do_softirq_own_stack() is a callable function, I think it still
>> > needs to save rbp.
>>
>> Whoops.
>>
>> >
>> > - Why change the "jmp error_exit" to "ret" in
>> > xen_do_hypervisor_callback()?
>>
>> To match the other change I made there. I removed both.
>
> One more thing I forgot to mention: if you could use r10 instead of r11,
> that would be helpful because it means one less register undwarf needs
> to know about. (It already deals with r10 because of GCC stack
> realignment).

I'll let you figure that one out :)

(Although I think I agree with hpa: why not let it support all regs?
Or am I missing something still?)

>
> --
> Josh

2017-06-30 16:16:12

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations

On Fri, Jun 30, 2017 at 08:56:09AM -0700, Andy Lutomirski wrote:
> On Fri, Jun 30, 2017 at 8:55 AM, Josh Poimboeuf <[email protected]> wrote:
> > On Fri, Jun 30, 2017 at 08:44:40AM -0700, Andy Lutomirski wrote:
> >> > A few comments about the first patch:
> >> >
> >> > https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_irq_stack&id=3e2aa2102cc1c5e60d4a8637bff78d0478a55059
> >> >
> >> > - It uses a '693:' label instead of '.Lirqs_off_\@:'
> >>
> >> Touché!
> >>
> >> >
> >> > - There's a comment I don't follow:
> >> >
> >> > "Anything that can interrupt us here without using IST must be
> >> > *extremely* careful to limit its stack usage."
> >> >
> >> > What specifically could interrupt there without using IST?
> >>
> >> #DB, later on in the series. I'll update the comment.
> >>
> >> >
> >> > - Since do_softirq_own_stack() is a callable function, I think it still
> >> > needs to save rbp.
> >>
> >> Whoops.
> >>
> >> >
> >> > - Why change the "jmp error_exit" to "ret" in
> >> > xen_do_hypervisor_callback()?
> >>
> >> To match the other change I made there. I removed both.
> >
> > One more thing I forgot to mention: if you could use r10 instead of r11,
> > that would be helpful because it means one less register undwarf needs
> > to know about. (It already deals with r10 because of GCC stack
> > realignment).
>
> I'll let you figure that one out :)
>
> (Although I think I agree with hpa: why not let it support all regs?
> Or am I missing something still?)

Sure, it can support all regs, but as the code is structured now, doing
so means adding more switch cases, e.g.:

https://github.com/jpoimboe/linux/blob/undwarf-v2/arch/x86/kernel/unwind_undwarf.c#L388

And that makes the balrog angry!
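
For anyone following along, here's a rough, hypothetical sketch of what
"more switch cases" means in practice -- the names and types below are
invented for illustration and are *not* the actual unwind_undwarf.c code:

/*
 * Hypothetical sketch, not the real unwinder: the previous frame's
 * stack pointer is computed relative to whichever register the undwarf
 * entry names, so every register the format can reference needs its
 * own case in a switch like this one.
 */
enum base_reg { REG_SP, REG_BP, REG_R10 /*, REG_R11, ... */ };

struct saved_regs {
	unsigned long sp, bp, r10;
};

static int compute_prev_sp(enum base_reg reg, long offset,
			   const struct saved_regs *regs,
			   unsigned long *prev_sp)
{
	switch (reg) {
	case REG_SP:
		*prev_sp = regs->sp + offset;
		return 0;
	case REG_BP:
		*prev_sp = regs->bp + offset;
		return 0;
	case REG_R10:
		*prev_sp = regs->r10 + offset;
		return 0;
	/* r11, r12, ... would each add another case here */
	default:
		return -1;
	}
}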

--
Josh

Subject: [tip:core/objtool] objtool: Silence warnings for functions which use IRET

Commit-ID: 2513cbf9d622d85268655bfd787d4f004342cfc9
Gitweb: http://git.kernel.org/tip/2513cbf9d622d85268655bfd787d4f004342cfc9
Author: Josh Poimboeuf <[email protected]>
AuthorDate: Fri, 30 Jun 2017 09:09:34 -0500
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 30 Jun 2017 19:43:50 +0200

objtool: Silence warnings for functions which use IRET

Previously, objtool ignored functions which have the IRET instruction
in them. That's because it assumed that such functions know what
they're doing with respect to frame pointers.

With the new "objtool 2.0" changes, it stopped ignoring such functions,
and started complaining about them:

arch/x86/kernel/alternative.o: warning: objtool: do_sync_core()+0x1b: unsupported instruction in callable function
arch/x86/kernel/alternative.o: warning: objtool: text_poke()+0x1a8: unsupported instruction in callable function
arch/x86/kernel/ftrace.o: warning: objtool: do_sync_core()+0x16: unsupported instruction in callable function
arch/x86/kernel/cpu/mcheck/mce.o: warning: objtool: machine_check_poll()+0x166: unsupported instruction in callable function
arch/x86/kernel/cpu/mcheck/mce.o: warning: objtool: do_machine_check()+0x147: unsupported instruction in callable function

Silence those warnings for now. They can be re-enabled later, once we
have unwind hints which will allow the code to annotate the IRET usages.

Reported-by: Ingo Molnar <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Fixes: baa41469a7b9 ("objtool: Implement stack validation 2.0")
Link: http://lkml.kernel.org/r/20170630140934.mmwtpockvpupahro@treble
Signed-off-by: Ingo Molnar <[email protected]>
---
tools/objtool/check.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 2f80aa51..fea2221 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -100,6 +100,7 @@ static bool gcov_enabled(struct objtool_file *file)
static bool ignore_func(struct objtool_file *file, struct symbol *func)
{
struct rela *rela;
+ struct instruction *insn;

/* check for STACK_FRAME_NON_STANDARD */
if (file->whitelist && file->whitelist->rela)
@@ -112,6 +113,11 @@ static bool ignore_func(struct objtool_file *file, struct symbol *func)
return true;
}

+ /* check if it has a context switching instruction */
+ func_for_each_insn(file, func, insn)
+ if (insn->type == INSN_CONTEXT_SWITCH)
+ return true;
+
return false;
}

@@ -1446,14 +1452,6 @@ static int validate_branch(struct objtool_file *file, struct instruction *first,

return 0;

- case INSN_CONTEXT_SWITCH:
- if (func) {
- WARN_FUNC("unsupported instruction in callable function",
- sec, insn->offset);
- return 1;
- }
- return 0;
-
case INSN_STACK:
if (update_insn_state(insn, &state))
return -1;

2017-07-06 20:36:43

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] objtool: add undwarf debuginfo generation

On Thu, Jun 29, 2017 at 10:06:52AM -0500, Josh Poimboeuf wrote:
> On Thu, Jun 29, 2017 at 04:46:18PM +0200, Ingo Molnar wrote:
> >
> > * Josh Poimboeuf <[email protected]> wrote:
> >
> > > > Plus, shouldn't we use __packed for 'struct undwarf' to minimize the
> > > > structure's size (to 6 bytes AFAICS?) - or is optimal packing of the main
> > > > undwarf array already guaranteed on every platform with this layout?
> > >
> > > Ah yes, it should definitely be packed (assuming that doesn't affect performance
> > > negatively).
> >
> > So if I count that correctly that should shave another ~1MB off a typical ~4MB
> > table size?
>
> Here's what my Fedora kernel looks like *before* the packed change:
>
> $ eu-readelf -S vmlinux |grep undwarf
> [15] .undwarf_ip PROGBITS ffffffff81f776d0 011776d0 0012d9d0 0 A 0 0 1
> [16] .undwarf PROGBITS ffffffff820a50a0 012a50a0 0025b3a0 0 A 0 0 1
>
> The total undwarf data size is ~3.5MB.
>
> There are 308852 entries of two parallel arrays:
>
> * .undwarf (8 bytes/entry) = 2470816 bytes
> * .undwarf_ip (4 bytes/entry) = 1235408 bytes
>
> If we pack undwarf, reducing the size of the .undwarf entries by two
> bytes, it will save 308852 * 2 = 617704.
>
> So the savings will be ~600k, and the typical size will be reduced to ~3MB.

Just for the record, while packing the struct from 8 to 6 bytes did save
600k, it also made the unwinder ~7% slower. I think that's probably an
ok tradeoff, so I'll leave it packed in v3.
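
For reference, a rough illustration of where the 8 -> 6 byte saving comes
from -- the struct below is a stand-in with invented field names, not the
actual undwarf entry definition:

/*
 * Illustrative only -- not the real undwarf entry layout.  Two 16-bit
 * offsets plus a few bit-fields add up to 6 bytes of payload; without
 * the packed attribute the compiler pads the struct out to 8 bytes so
 * the bit-field word stays 4-byte aligned in arrays.
 */
#include <stdio.h>

struct entry_unpacked {
	short    cfa_offset;
	short    bp_offset;
	unsigned cfa_reg:4;
	unsigned bp_reg:4;
	unsigned type:2;
};

struct entry_packed {
	short    cfa_offset;
	short    bp_offset;
	unsigned cfa_reg:4;
	unsigned bp_reg:4;
	unsigned type:2;
} __attribute__((packed));

int main(void)
{
	/* with GCC on x86-64 this typically prints "8 6" */
	printf("%zu %zu\n", sizeof(struct entry_unpacked),
	       sizeof(struct entry_packed));
	return 0;
}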

--
Josh

2017-07-07 09:44:44

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] objtool: add undwarf debuginfo generation


* Josh Poimboeuf <[email protected]> wrote:

> On Thu, Jun 29, 2017 at 10:06:52AM -0500, Josh Poimboeuf wrote:
> > On Thu, Jun 29, 2017 at 04:46:18PM +0200, Ingo Molnar wrote:
> > >
> > > * Josh Poimboeuf <[email protected]> wrote:
> > >
> > > > > Plus, shouldn't we use __packed for 'struct undwarf' to minimize the
> > > > > structure's size (to 6 bytes AFAICS?) - or is optimal packing of the main
> > > > > undwarf array already guaranteed on every platform with this layout?
> > > >
> > > > Ah yes, it should definitely be packed (assuming that doesn't affect performance
> > > > negatively).
> > >
> > > So if I count that correctly that should shave another ~1MB off a typical ~4MB
> > > table size?
> >
> > Here's what my Fedora kernel looks like *before* the packed change:
> >
> > $ eu-readelf -S vmlinux |grep undwarf
> > [15] .undwarf_ip PROGBITS ffffffff81f776d0 011776d0 0012d9d0 0 A 0 0 1
> > [16] .undwarf PROGBITS ffffffff820a50a0 012a50a0 0025b3a0 0 A 0 0 1
> >
> > The total undwarf data size is ~3.5MB.
> >
> > There are 308852 entries of two parallel arrays:
> >
> > * .undwarf (8 bytes/entry) = 2470816 bytes
> > * .undwarf_ip (4 bytes/entry) = 1235408 bytes
> >
> > If we pack undwarf, reducing the size of the .undwarf entries by two
> > bytes, it will save 308852 * 2 = 617704.
> >
> > So the savings will be ~600k, and the typical size will be reduced to ~3MB.
>
> Just for the record, while packing the struct from 8 to 6 bytes did save 600k,
> it also made the unwinder ~7% slower. I think that's probably an ok tradeoff,
> so I'll leave it packed in v3.

So, out of curiosity, I'm wondering where that slowdown comes from: on modern x86
CPUs indexing by units of 6 bytes ought to be just as fast as indexing by 8 bytes,
unless I'm missing something? Is it maybe the not naturally aligned 32-bit words?

Or maybe there's some bad case of a 32-bit word crossing a 64-byte cache line
boundary that hits some pathological aspect of the CPU? We could probably get
around any such problems by padding by 2 bytes on 64-byte boundaries - that's only
a ~3% data size increase. The flip side would be a complication of the data
structure and its accessors - which might cost more in terms of code generation
efficiency than it buys us to begin with ...

Also, there's another aspect besides RAM footprint: a large data structure that is
~20% smaller means 20% less cache footprint: which for cache cold lookups might
matter more than the direct computational cost.

Thanks,

Ingo

2017-07-11 02:58:12

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] objtool: add undwarf debuginfo generation

On Fri, Jul 07, 2017 at 11:44:37AM +0200, Ingo Molnar wrote:
>
> * Josh Poimboeuf <[email protected]> wrote:
>
> > On Thu, Jun 29, 2017 at 10:06:52AM -0500, Josh Poimboeuf wrote:
> > > On Thu, Jun 29, 2017 at 04:46:18PM +0200, Ingo Molnar wrote:
> > > >
> > > > * Josh Poimboeuf <[email protected]> wrote:
> > > >
> > > > > > Plus, shouldn't we use __packed for 'struct undwarf' to minimize the
> > > > > > structure's size (to 6 bytes AFAICS?) - or is optimal packing of the main
> > > > > > undwarf array already guaranteed on every platform with this layout?
> > > > >
> > > > > Ah yes, it should definitely be packed (assuming that doesn't affect performance
> > > > > negatively).
> > > >
> > > > So if I count that correctly that should shave another ~1MB off a typical ~4MB
> > > > table size?
> > >
> > > Here's what my Fedora kernel looks like *before* the packed change:
> > >
> > > $ eu-readelf -S vmlinux |grep undwarf
> > > [15] .undwarf_ip PROGBITS ffffffff81f776d0 011776d0 0012d9d0 0 A 0 0 1
> > > [16] .undwarf PROGBITS ffffffff820a50a0 012a50a0 0025b3a0 0 A 0 0 1
> > >
> > > The total undwarf data size is ~3.5MB.
> > >
> > > There are 308852 entries of two parallel arrays:
> > >
> > > * .undwarf (8 bytes/entry) = 2470816 bytes
> > > * .undwarf_ip (4 bytes/entry) = 1235408 bytes
> > >
> > > If we pack undwarf, reducing the size of the .undwarf entries by two
> > > bytes, it will save 308852 * 2 = 617704.
> > >
> > > So the savings will be ~600k, and the typical size will be reduced to ~3MB.
> >
> > Just for the record, while packing the struct from 8 to 6 bytes did save 600k,
> > it also made the unwinder ~7% slower. I think that's probably an ok tradeoff,
> > so I'll leave it packed in v3.
>
> So, out of curiosity, I'm wondering where that slowdown comes from: on modern x86
> CPUs indexing by units of 6 bytes ought to be just as fast as indexing by 8 bytes,
> unless I'm missing something? Is it maybe the not naturally aligned 32-bit words?
>
> Or maybe there's some bad case of a 32-bit word crossing a 64-byte cache line
> boundary that hits some pathological aspect of the CPU? We could probably get
> around any such problems by padding by 2 bytes on 64-byte boundaries - that's only
> a ~3% data size increase. The flip side would be a complication of the data
> structure and its accessors - which might cost more in terms of code generation
> efficiency than it buys us to begin with ...
>
> Also, there's another aspect besides RAM footprint: a large data structure that is
> ~20% smaller means 20% less cache footprint: which for cache cold lookups might
> matter more than the direct computational cost.

tl;dr: Packed really seems to be more like ~2% slower, time for an adult
beverage.

So I tested again with the latest version of my code, and this time
packed was 5% *faster* than unpacked, rather than 7% slower. 'perf
stat' showed that, in both cases, most of the difference was caused by
branch misses in the binary search code. But that code doesn't even
touch the packed struct...
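
For context, the search in question is a binary search over the sorted
.undwarf_ip array to find the last entry whose IP is at or below the
address being unwound -- roughly like the sketch below (the names are
illustrative, not the actual unwinder code):

/*
 * Rough sketch of the lookup being benchmarked: find the index of the
 * last ip_table[] entry that starts at or below 'ip'.  The result also
 * indexes the parallel .undwarf array.  Illustrative only.
 */
static int undwarf_lookup(const unsigned int *ip_table, int nr_entries,
			  unsigned int ip)
{
	int lo = 0, hi = nr_entries - 1, found = -1;

	while (lo <= hi) {
		int mid = lo + (hi - lo) / 2;

		if (ip_table[mid] <= ip) {
			found = mid;
			lo = mid + 1;
		} else {
			hi = mid - 1;
		}
	}

	return found;
}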

After some hair-pulling/hand-wringing I realized that changing the
struct packing caused GCC to change some of the unwinder code a bit,
which shifted the rest of the kernel's function offsets enough that it
changed the behavior of the unwind table binary search in a way that
affected the CPU's branch prediction. And my crude benchmark was just
unwinding the same stack on repeat, so a small change in the loop
behavior had a big impact on the overall branch predictability.

Anyway, I used some linker magic to temporarily move the unwinder code
to the end of .text, so that unwinder changes don't add unexpected side
effects to the microbenchmark behavior. Now I'm getting more consistent
results: the packed struct is measuring ~2% slower. The slight slowdown
might just be explained by the fact that GCC generates some extra
instructions for extracting the fields out of the packed struct.

In the meantime, I found a ~10% speedup by making the "fast lookup
table" block size a power-of-two (256) to get rid of the need for a slow
'div' instruction.
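
The change amounts to something like the following (again, the names and
table layout here are illustrative, not the actual lookup-table code):

/*
 * Illustrative sketch of the block-size tweak.  Each "block" of
 * LOOKUP_BLOCK_SIZE bytes of .text gets one precomputed starting index
 * into the undwarf table, so the first step of a lookup is a simple
 * offset-to-block-index calculation.
 */
#define LOOKUP_BLOCK_SIZE	256	/* power of two */

/* arbitrary block size: the compiler has to emit an integer divide */
static unsigned int block_index_div(unsigned long ip, unsigned long text_start,
				    unsigned int block_size)
{
	return (ip - text_start) / block_size;
}

/* power-of-two block size: the divide becomes a cheap shift (>> 8) */
static unsigned int block_index_shift(unsigned long ip, unsigned long text_start)
{
	return (ip - text_start) / LOOKUP_BLOCK_SIZE;
}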

I think I'm done performance tweaking for now. I'll keep the packed
struct, and add the code for the 'div' removal, and hope to submit v3
soon.

--
Josh

2017-07-11 08:41:02

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] objtool: add undwarf debuginfo generation


* Josh Poimboeuf <[email protected]> wrote:

> Anyway, I used some linker magic to temporarily move the unwinder code to the
> end of .text, so that unwinder changes don't add unexpected side effects to the
> microbenchmark behavior. Now I'm getting more consistent results: the packed
> struct is measuring ~2% slower. The slight slowdown might just be explained by
> the fact that GCC generates some extra instructions for extracting the fields
> out of the packed struct.

Yeah, the 16-bit field accesses versus a zero-extended 32-bit field are more
complex to access even on x86 that has a fair amount of 16-bit legacy.

> In the meantime, I found a ~10% speedup by making the "fast lookup table" block
> size a power-of-two (256) to get rid of the need for a slow 'div' instruction.
>
> I think I'm done performance tweaking for now. I'll keep the packed struct, and
> add the code for the 'div' removal, and hope to submit v3 soon.

Sounds good to me!

~2% slowdown for ~30% RAM savings for a debug data structure that is about as
large as a typical kernel's total .text is a decent trade-off.

Thanks,

Ingo