From: Josh Poimboeuf
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, live-patching@vger.kernel.org,
    Linus Torvalds, Andy Lutomirski, Jiri Slaby, Ingo Molnar,
    "H. Peter Anvin", Peter Zijlstra
Subject: [PATCH v2 0/8] x86: undwarf unwinder
Date: Wed, 28 Jun 2017 10:11:04 -0500

v2:
- 2x performance improvement by using a fast lookup table and splitting
  undwarf array into two parallel arrays (Andy L)
- reduce data size by ~1MB by getting rid of 'len' field
- sort and post-process data at boot time
- don't search vmlinux tables for module addresses (Peter Z)
- disable preemption to prevent module from getting unloaded while
  reading its undwarf data (Peter Z)
- avoid unwinding a running task's stack (Jiri S)
- remove '__sp' constraint from inline asm (Jiri S)
- rename "CFI_*" -> "UNWIND_HINT_*" (Andy L)
- replace '999:' label with '.Lunwind_hint_ip_\@' (Andy L)
- entry code annotation fixes: extra=0 fix, symmetrical macro
  annotations, ret_from_fork fix (Andy L)
- invalidate all object files when enabling/disabling
  CONFIG_UNDWARF_UNWINDER
- pass ip-1 to undwarf_find() for call return addresses to fix stack
  traces for sibling calls and noreturn calls at end of function
- docs: clarify benefits vs frame pointers (Ingo)
- docs: improve wording, add more info, add performance info from Mel G
  and Jiri S, move to kernel docs dir
- objtool: several minor fixes (Jiri S)
- objtool: append file instead of rewriting it
- objtool: improve elf warnings
- objtool: fix handling of the GCC DRAP register for aligned stacks
- objtool: rewrite 'undwarf dump' command to be much faster and to work
  on vmlinux
- objtool: rename undwarf.c -> undwarf_gen.c

-----

Create a new 'undwarf' unwinder, enabled by CONFIG_UNDWARF_UNWINDER, and
plug it into the x86 unwinder framework. Objtool is used to generate
the undwarf debuginfo. The undwarf debuginfo format is basically a
simplified version of DWARF CFI. More details below.

The unwinder works well in my testing. It unwinds through interrupts,
exceptions, and preemption, with and without frame pointers, across
aligned stacks and dynamically allocated stacks. If something goes
wrong during an oops, it successfully falls back to printing the '?'
entries just like the frame pointer unwinder.

I'm not tied to the 'undwarf' name, other naming ideas are welcome.

Some potential future improvements:
- properly annotate or fix whitelisted functions and files
- reduce the number of base CFA registers needed in entry code
- compress undwarf debuginfo to use less memory
- make it easier to disable CONFIG_FRAME_POINTER
- add reliability checks for livepatch
- runtime NMI stack reliability checker

This code can also be found at:

  git://github.com/jpoimboe/linux undwarf-v2

Here are the contents of the undwarf.txt file which explains the 'why'
in more detail:


Undwarf unwinder debuginfo generation
=====================================

Overview
--------

The kernel CONFIG_UNDWARF_UNWINDER option enables objtool generation of
undwarf debuginfo, which is out-of-band data used by the in-kernel
undwarf unwinder. It's similar in concept to DWARF CFI debuginfo which
would be used by a DWARF unwinder. The difference is that the format of
the undwarf data is simpler than DWARF, which in turn allows the
unwinder to be simpler and faster.

Objtool generates the undwarf data by first doing compile-time stack
metadata validation (CONFIG_STACK_VALIDATION). After analyzing all the
code paths of a .o file, it determines information about the stack state
at each instruction address in the file and outputs that information to
the .undwarf and .undwarf_ip sections.

The undwarf sections are combined at link time and are sorted at boot
time. The unwinder uses the resulting data to correlate instruction
addresses with their stack states at run time.


Undwarf vs frame pointers
-------------------------

With frame pointers enabled, GCC adds instrumentation code to every
function in the kernel. The kernel's .text size increases by about
3.2%, resulting in a broad kernel-wide slowdown. Measurements by Mel
Gorman [1] have shown a slowdown of 5-10% for some workloads.

In contrast, the undwarf unwinder has no effect on text size or runtime
performance, because the debuginfo is out of band. So if you disable
frame pointers and enable undwarf, you get a nice performance
improvement across the board, and still have reliable stack traces.

Another benefit of undwarf compared to frame pointers is that it can
reliably unwind across interrupts and exceptions. Frame pointer based
unwinds can skip the caller of the interrupted function if it was a leaf
function or if the interrupt hit before the frame pointer was saved.

The main disadvantage of undwarf compared to frame pointers is that it
needs more memory to store the undwarf table: roughly 3-5MB depending on
the kernel config.


Undwarf vs DWARF
----------------

Undwarf debuginfo's advantage over DWARF itself is that it's much
simpler. It gets rid of the complex DWARF CFI state machine and also
gets rid of the tracking of unnecessary registers. This allows the
unwinder to be much simpler, meaning fewer bugs, which is especially
important for mission critical oops code.

The simpler debuginfo format also enables the unwinder to be much faster
than DWARF, which is important for perf and lockdep. In a basic
performance test by Jiri Slaby [2], the undwarf unwinder was about 20x
faster than an out-of-tree DWARF unwinder. (Note: that measurement was
taken before some performance tweaks were implemented, so the speedup
may be even higher.)

The undwarf format does have a few downsides compared to DWARF. The
undwarf table takes up ~2MB more memory than a DWARF .eh_frame table.

Another potential downside is that, as GCC evolves, it's conceivable
that the undwarf data may end up being *too* simple to describe the
state of the stack for certain optimizations. But IMO this is unlikely
because GCC saves the frame pointer for any unusual stack adjustments it
does, so I suspect we'll really only ever need to keep track of the
stack pointer and the frame pointer between call frames. But even if we
do end up having to track all the registers DWARF tracks, at least we
will still be able to control the format, e.g. no complex state
machines.


Undwarf debuginfo generation
----------------------------

The undwarf data is generated by objtool. With the existing
compile-time stack metadata validation feature, objtool already follows
all code paths, and so it already has all the information it needs to be
able to generate undwarf data from scratch. So it's an easy step to go
from stack validation to undwarf generation.

It should be possible to instead generate the undwarf data with a simple
tool which converts DWARF to undwarf. However, such a solution would be
incomplete due to the kernel's extensive use of asm, inline asm, and
special sections like exception tables.

That could be rectified by manually annotating those special code paths
using GNU assembler .cfi annotations in .S files, and homegrown
annotations for inline asm in .c files. But asm annotations were tried
in the past and were found to be unmaintainable. They were often
incorrect/incomplete and made the code harder to read and keep updated.
And based on looking at glibc code, annotating inline asm in .c files
might be even worse.

Objtool still needs a few annotations, but only in code which does
unusual things to the stack like entry code. And even then, far fewer
annotations are needed than what DWARF would need, so they're much more
maintainable than DWARF CFI annotations.

So the advantages of using objtool to generate undwarf are that it gives
more accurate debuginfo, with very few annotations. It also insulates
the kernel from toolchain bugs which can be very painful to deal with in
the kernel since we often have to work around issues in older versions
of the toolchain for years.

The downside is that the unwinder now becomes dependent on objtool's
ability to reverse engineer GCC code paths. If GCC optimizations become
too complicated for objtool to follow, the undwarf generation might stop
working or become incomplete. (It's worth noting that livepatch already
has such a dependency on objtool's ability to follow GCC code paths.)

If newer versions of GCC come up with some optimizations which break
objtool, we may need to revisit the current implementation. Some
possible solutions would be asking GCC to make the optimizations more
palatable, or having objtool use DWARF as an additional input, or
creating a GCC plugin to assist objtool with its analysis. But for now,
objtool follows GCC code quite well.


Unwinder implementation details
-------------------------------

Objtool generates the undwarf data by integrating with the compile-time
stack metadata validation feature, which is described in detail in
tools/objtool/Documentation/stack-validation.txt. After analyzing all
the code paths of a .o file, it creates an array of undwarf structs, and
a parallel array of instruction addresses associated with those structs,
and writes them to the .undwarf and .undwarf_ip sections respectively.

The undwarf data is split into the two arrays for performance reasons,
to make the searchable part of the data (.undwarf_ip) more compact. The
arrays are sorted in parallel at boot time.

Performance is further improved by the use of a fast lookup table which
is created at runtime. The fast lookup table associates a given address
with a range of undwarf table indices, so that only a small subset of
the undwarf table needs to be searched.
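
To make the parallel arrays and the fast lookup table more concrete,
here is a minimal user-space sketch in C. It is not the code from this
series: struct unwind_entry, struct unwind_table, BLOCK_SIZE,
build_lookup() and unwind_find() are all hypothetical names, the entry
layout is invented, and the 256-byte block granularity is an assumption
chosen only for illustration. The idea it shows is the one described
above: a sorted array of instruction addresses, a parallel array of
stack-state entries, and a per-block index that narrows the binary
search to a small slice of the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stack-state record; not the real undwarf struct. */
struct unwind_entry {
	int16_t cfa_offset;	/* offset from the base register to the CFA */
	uint8_t cfa_reg;	/* register the CFA is computed from */
};

#define BLOCK_SIZE 256		/* assumed lookup granularity in bytes */

struct unwind_table {
	const uint64_t *ips;			/* sorted instruction addresses */
	const struct unwind_entry *entries;	/* entries[i] describes ips[i] */
	size_t nr;				/* length of both arrays */
	uint64_t text_start;			/* first address covered */
	size_t *lookup;				/* nr_blocks + 1 slots */
	size_t nr_blocks;			/* number of BLOCK_SIZE blocks */
};

/*
 * Fill lookup[b] with the number of entries whose address lies below the
 * start of block b.  Entries that fall inside block b therefore occupy
 * the index range [lookup[b], lookup[b + 1]).
 */
static void build_lookup(struct unwind_table *t)
{
	size_t i = 0;

	assert(t->lookup != NULL);

	for (size_t b = 0; b <= t->nr_blocks; b++) {
		uint64_t block_start = t->text_start + (uint64_t)b * BLOCK_SIZE;

		while (i < t->nr && t->ips[i] < block_start)
			i++;
		t->lookup[b] = i;
	}
}

/*
 * Return the entry with the largest address <= ip, or NULL if ip is
 * outside the table.  Only the slice selected by the lookup table is
 * searched, plus the one entry just before it, which still applies when
 * ip precedes the first entry of its block.
 */
static const struct unwind_entry *
unwind_find(const struct unwind_table *t, uint64_t ip)
{
	const struct unwind_entry *found = NULL;
	size_t b, lo, hi;

	if (t->nr == 0 || ip < t->text_start)
		return NULL;

	b = (size_t)((ip - t->text_start) / BLOCK_SIZE);
	if (b >= t->nr_blocks)
		return NULL;

	lo = t->lookup[b] ? t->lookup[b] - 1 : 0;
	hi = t->lookup[b + 1];			/* exclusive upper bound */

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (t->ips[mid] <= ip) {
			found = &t->entries[mid];
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}

	return found;
}
```

Keeping the addresses in their own array means the binary search only
touches the compact key data, and the matching entry is fetched once at
the end. A caller would fill in the table, call build_lookup() once, and
then call unwind_find() per frame; for a call return address the
changelog above says ip-1 is passed to the lookup, which in this sketch
would simply be unwind_find(&table, ret_addr - 1).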

[1] https://lkml.kernel.org/r/20170602104048.jkkzssljsompjdwy@suse.de
[2] https://lkml.kernel.org/r/d2ca5435-6386-29b8-db87-7f227c2b713a@suse.cz


Josh Poimboeuf (8):
  objtool: move checking code to check.c
  objtool, x86: add several functions and files to the objtool whitelist
  objtool: stack validation 2.0
  objtool: add undwarf debuginfo generation
  objtool, x86: add facility for asm code to provide unwind hints
  x86/entry: add unwind hint annotations
  x86/asm: add unwind hint annotations to sync_core()
  x86/unwind: add undwarf unwinder

 Documentation/x86/undwarf.txt                    |  146 +++
 arch/um/include/asm/unwind.h                     |    8 +
 arch/x86/Kconfig                                 |    1 +
 arch/x86/Kconfig.debug                           |   25 +
 arch/x86/crypto/Makefile                         |    2 +
 arch/x86/crypto/sha1-mb/Makefile                 |    2 +
 arch/x86/crypto/sha256-mb/Makefile               |    2 +
 arch/x86/entry/Makefile                          |    1 -
 arch/x86/entry/calling.h                         |    6 +
 arch/x86/entry/entry_64.S                        |   56 +-
 arch/x86/include/asm/module.h                    |    9 +
 arch/x86/include/asm/processor.h                 |    3 +
 arch/x86/include/asm/undwarf-types.h             |   99 ++
 arch/x86/include/asm/undwarf.h                   |  103 ++
 arch/x86/include/asm/unwind.h                    |   77 +-
 arch/x86/kernel/Makefile                         |    9 +-
 arch/x86/kernel/acpi/Makefile                    |    2 +
 arch/x86/kernel/kprobes/opt.c                    |    9 +-
 arch/x86/kernel/module.c                         |   12 +-
 arch/x86/kernel/reboot.c                         |    2 +
 arch/x86/kernel/setup.c                          |    3 +
 arch/x86/kernel/unwind_frame.c                   |   39 +-
 arch/x86/kernel/unwind_guess.c                   |    5 +
 arch/x86/kernel/unwind_undwarf.c                 |  589 ++++++++++
 arch/x86/kernel/vmlinux.lds.S                    |    2 +
 arch/x86/kvm/svm.c                               |    2 +
 arch/x86/kvm/vmx.c                               |    3 +
 arch/x86/lib/msr-reg.S                           |    8 +-
 arch/x86/net/Makefile                            |    2 +
 arch/x86/platform/efi/Makefile                   |    1 +
 arch/x86/power/Makefile                          |    2 +
 arch/x86/xen/Makefile                            |    3 +
 include/asm-generic/vmlinux.lds.h                |   20 +-
 kernel/kexec_core.c                              |    4 +-
 lib/Kconfig.debug                                |    3 +
 scripts/Makefile.build                           |   14 +-
 tools/objtool/Build                              |    4 +
 tools/objtool/Documentation/stack-validation.txt |  195 ++--
 tools/objtool/Makefile                           |    5 +-
 tools/objtool/arch.h                             |   64 +-
 tools/objtool/arch/x86/decode.c                  |  400 ++++++-
 tools/objtool/builtin-check.c                    | 1281 +---------------------
 tools/objtool/builtin-undwarf.c                  |   70 ++
 tools/objtool/builtin.h                          |    1 +
 tools/objtool/cfi.h                              |   55 +
 tools/objtool/{builtin-check.c => check.c}       |  954 ++++++++++++----
 tools/objtool/check.h                            |   79 ++
 tools/objtool/elf.c                              |  265 ++++-
 tools/objtool/elf.h                              |   21 +-
 tools/objtool/objtool.c                          |    3 +-
 tools/objtool/special.c                          |    6 +-
 tools/objtool/undwarf-types.h                    |   99 ++
 tools/objtool/{builtin.h => undwarf.h}           |   18 +-
 tools/objtool/undwarf_dump.c                     |  212 ++++
 tools/objtool/undwarf_gen.c                      |  215 ++++
 tools/objtool/warn.h                             |   10 +
 56 files changed, 3466 insertions(+), 1765 deletions(-)
 create mode 100644 Documentation/x86/undwarf.txt
 create mode 100644 arch/um/include/asm/unwind.h
 create mode 100644 arch/x86/include/asm/undwarf-types.h
 create mode 100644 arch/x86/include/asm/undwarf.h
 create mode 100644 arch/x86/kernel/unwind_undwarf.c
 create mode 100644 tools/objtool/builtin-undwarf.c
 create mode 100644 tools/objtool/cfi.h
 copy tools/objtool/{builtin-check.c => check.c} (59%)
 create mode 100644 tools/objtool/check.h
 create mode 100644 tools/objtool/undwarf-types.h
 copy tools/objtool/{builtin.h => undwarf.h} (67%)
 create mode 100644 tools/objtool/undwarf_dump.c
 create mode 100644 tools/objtool/undwarf_gen.c

-- 
2.7.5