Date: Thu, 29 Jun 2017 09:55:47 +0200
From: Ingo Molnar
To: Josh Poimboeuf
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, live-patching@vger.kernel.org,
    Linus Torvalds, Andy Lutomirski, Jiri Slaby, "H. Peter Anvin", Peter Zijlstra
Subject: Re: [PATCH v2 0/8] x86: undwarf unwinder
Message-ID: <20170629075547.y24s7aq4nqwt2rll@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

* Josh Poimboeuf wrote:

> Undwarf vs frame pointers
> -------------------------
>
> With frame pointers enabled, GCC adds instrumentation code to every
> function in the kernel. The kernel's .text size increases by about
> 3.2%, resulting in a broad kernel-wide slowdown. Measurements by Mel
> Gorman [1] have shown a slowdown of 5-10% for some workloads.
>
> In contrast, the undwarf unwinder has no effect on text size or runtime
> performance, because the debuginfo is out of band. So if you disable
> frame pointers and enable undwarf, you get a nice performance
> improvement across the board, and still have reliable stack traces.
>
> Another benefit of undwarf compared to frame pointers is that it can
> reliably unwind across interrupts and exceptions. Frame pointer based
> unwinds can skip the caller of the interrupted function if it was a leaf
> function or if the interrupt hit before the frame pointer was saved.
>
> The main disadvantage of undwarf compared to frame pointers is that it
> needs more memory to store the undwarf table: roughly 3-5MB depending on
> the kernel config.

Note that it's not just a performance improvement, but also an instruction cache
locality improvement: 3.2% .text savings translate almost directly into a
similarly sized reduction in cache footprint. That can translate into even higher
speedups for workloads whose cache locality is borderline.

I _really_ like this feature, and the independence of the debuginfo data format.

Logistically it's too bad that we are only 3 days away from the merge window,
which is too close to be able to pick this up:

> 56 files changed, 3466 insertions(+), 1765 deletions(-)

OTOH most of the diffstat is in objtool. Any objections to applying the first 3
objtool patches straight away and seeing whether anything breaks? That would
significantly reduce the size of the rest of the patch set.

> I'm not tied to the 'undwarf' name, other naming ideas are welcome.

Ha, a new bike shed painting job! ;-)

I think 'undwarf' isn't a bad name, it's short, catchy and describes the purpose
of the effort. But I cannot resist some other suggestions, after 'elf' and 'dwarf'
the obvious candidates from the peoples of Middle-earth would be:

  - 'Hobbit'
  - 'Eagle'
  - 'Ent'
  - 'Dragon'
  - 'Troll'
  - 'Ainur'

'struct troll_entry' has a certain charm to it.

'Eagle' is even nicer IMHO: larger than a dwarf but so much faster - and eagles
are beautiful! Plus the name is 2 letters shorter than 'undwarf', win-win.

Thanks,

	Ingo
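
As an illustration of the interrupt case from the quoted text (a frame pointer unwind skipping the caller when the interrupt hits before the frame pointer is saved), here is a minimal toy model in Python. Everything in it is an assumption made for the sketch: the stack addresses, the symbolic "return addresses", and the `TABLE` layout are hypothetical and are not the real undwarf table format or kernel code.

```python
WORD = 8  # bytes per stack slot in this toy model

# Stack contents at the moment an interrupt hits `leaf` before it has
# saved a frame pointer. Call chain: start -> main -> parent -> leaf.
# Return addresses are modeled as the name of the function returned into.
MEM = {
    0x108: "start",   # return address pushed when main was called
    0x100: 0,         # main's saved frame pointer (0 ends the chain)
    0x0F8: "main",    # return address pushed when parent was called
    0x0F0: 0x100,     # parent's saved frame pointer -> main's frame
    0x0E8: "parent",  # return address pushed when leaf was called
}

# Interrupted register state: bp still points at parent's frame.
IP, SP, BP = "leaf", 0x0E8, 0x0F0


def unwind_frame_pointer(mem, ip, bp):
    """Walk the saved frame pointer chain, trusting bp blindly."""
    trace = [ip]
    while bp != 0:
        trace.append(mem[bp + WORD])  # return address sits above saved bp
        bp = mem[bp]                  # follow the chain to the caller
    return trace


# Hypothetical out-of-band table: for each function, which register the
# return address is found relative to at the interrupted point.
TABLE = {
    "leaf": ("sp", 0),     # no frame yet: return address is at sp
    "parent": ("bp", WORD),
    "main": ("bp", WORD),
}


def unwind_table(mem, ip, sp, bp):
    """Unwind using per-function metadata instead of trusting bp."""
    trace = [ip]
    while ip in TABLE:
        base, offset = TABLE[ip]
        if base == "sp":
            slot = sp + offset        # bp still belongs to the caller
        else:
            slot = bp + offset
            bp = mem[bp]              # restore the caller's frame pointer
        ip = mem[slot]
        sp = slot + WORD              # caller's sp is just above the slot
        trace.append(ip)
    return trace


print(unwind_frame_pointer(MEM, IP, BP))  # 'parent' is missing
print(unwind_table(MEM, IP, SP, BP))      # 'parent' is present
```

In this model the frame pointer walk reports `leaf`, `main`, `start` and misses `parent`, because `bp` still describes the caller's frame, while the table-driven walk recovers the full chain.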