Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755243AbbBTVW5 (ORCPT ); Fri, 20 Feb 2015 16:22:57 -0500 Received: from mx1.redhat.com ([209.132.183.28]:60991 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755156AbbBTVW4 (ORCPT ); Fri, 20 Feb 2015 16:22:56 -0500 Date: Fri, 20 Feb 2015 15:22:23 -0600 From: Josh Poimboeuf To: Ingo Molnar Cc: Jiri Kosina , Vojtech Pavlik , Peter Zijlstra , Andrew Morton , Ingo Molnar , Seth Jennings , linux-kernel@vger.kernel.org, Linus Torvalds Subject: Re: [PATCH 1/3] sched: add sched_task_call() Message-ID: <20150220212223.GH15980@treble.redhat.com> References: <20150219170353.GB15980@treble.redhat.com> <20150219171929.GA13178@suse.cz> <20150219173255.GC15980@treble.redhat.com> <20150219204036.GA16882@suse.com> <20150219214229.GD15980@treble.redhat.com> <20150220095003.GA23506@gmail.com> <20150220161219.GF15980@treble.redhat.com> <20150220200849.GC3603@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20150220200849.GC3603@gmail.com> User-Agent: Mutt/1.5.23.1-rc1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4999 Lines: 135 On Fri, Feb 20, 2015 at 09:08:49PM +0100, Ingo Molnar wrote: > * Josh Poimboeuf wrote: > > On Fri, Feb 20, 2015 at 10:50:03AM +0100, Ingo Molnar wrote: > > > * Jiri Kosina wrote: > > > > > > > Alright, so to sum it up: > > > > > > > > - current stack dumping (even looking at /proc//stack) is not > > > > guaranteed to yield "correct" results in case the task is running at the > > > > time the stack is being examined > > > > > > Don't even _think_ about trying to base something as > > > dangerous as live patching the kernel image on the concept > > > of: > > > > > > 'We can make all stack backtraces reliably correct all > > > the time, with no false positives, with no false > > > negatives, 100% of the time, and quickly discover and > > > fix bugs in that'. > > > > > > It's not going to happen: > > > > > > - The correctness of stacktraces partially depends on > > > tooling and we don't control those. > > > > What tooling are you referring to? > > > > Sure, we rely on the compiler to produce the correct > > assembly for putting frame pointers on the stack. [...] > > We also rely on all assembly code to have valid frames all > the time - and this is far from a given, we've had many > bugs in that area. Are you talking about kernel bugs or compiler bugs? Either way, would those bugs not be caught by the stack unwinding validation improvements I proposed? > > [...] But that's pretty straightforward. The kernel > > already relies on the compiler to do plenty of things > > which are much more complex than that. > > So this claim scares me because it's such nonsense: > > We rely on the compiler to create _functional code_ for us. > If the compiler messes up then functionality, actual code > breaks. > > 'Valid stack frame' is not functional code, it's in essence > debug info. Only because we _treat_ it like debug info. I'm proposing that we treat it like functional code. So that yes, if the compiler messes up frame pointers, actual code breaks. (But we could detect that before we do the patch, so it doesn't destabilize the system.) If gcc can't do frame pointers right, it's a *major* gcc bug. It's nothing more than: push %rbp mov %rsp,%rbp at the beginning of the function, and: pop %rbp right before the function returns. If you really think the compiler could possibly mess that up, I can probably write a simple script, executed by the Makefile, which validates that every C function in the kernel has the above pattern. > If a backtrace is wrong in some rare, racy case > then that won't break the kernel, it just gives slightly > misleading debug info in most cases. > > Now you want to turn 'debug info' to have functional > relevance, for something as intrusive as patching the > kernel code. > > You really need to accept this distinction! Yes, I already realize that this would be the first time the kernel relies on stack traces being 100% correct. Peter and I had already some discussion about it already elsewhere in this thread. Just because we haven't relied on its correctness in the past doesn't mean we can't rely on it in the future. > > > - More importantly, there's no strong force that ensures > > > we can rely on stack backtraces: correcting bad > > > stack traces depends on people hitting those > > > functions and situations that generate them, seeing > > > a bad stack trace, noticing that it's weird and > > > correcting whatever code or tooling quirk causes the > > > stack entry to be incorrect. > > > > > > Essentially unlike other kernel code which breaks stuff > > > if it's incorrect, there's no _functional_ dependence > > > on stack traces, so live patching would be the first > > > (and pretty much only) thing that breaks on bad stack > > > traces ... > > > > First, there are several things we do to avoid anomalies: > > > > - we don't patch asm functions > > the problem with asm functions isn't that we might want to > patch them (although sometimes we might want to: especially > the more interesting security fixes tend to be in assembly > code), it's that asm functions can (easily) mess up debug > info, without that being apparent in any functional way. Again, that's why I proposed the stack unwinding validation improvements. > > - we only walk the stack of sleeping tasks > > sleeping tasks are affected by debug info bugs too. > > > Second, currently the stack unwinding code is rather crude and > > doesn't do much in the way of error handling. > > That's a big red flag in my book ... I wasn't talking about my patch set. I was talking about the existing stack dumping code in the kernel. -- Josh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/