Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755243AbbBTQMy (ORCPT ); Fri, 20 Feb 2015 11:12:54 -0500 Received: from mx1.redhat.com ([209.132.183.28]:52165 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752120AbbBTQMx (ORCPT ); Fri, 20 Feb 2015 11:12:53 -0500 Date: Fri, 20 Feb 2015 10:12:19 -0600 From: Josh Poimboeuf To: Ingo Molnar Cc: Jiri Kosina , Vojtech Pavlik , Peter Zijlstra , Andrew Morton , Ingo Molnar , Seth Jennings , linux-kernel@vger.kernel.org, Linus Torvalds Subject: Re: [PATCH 1/3] sched: add sched_task_call() Message-ID: <20150220161219.GF15980@treble.redhat.com> References: <20150219162429.GA15980@treble.redhat.com> <20150219163359.GA25438@suse.cz> <20150219170353.GB15980@treble.redhat.com> <20150219171929.GA13178@suse.cz> <20150219173255.GC15980@treble.redhat.com> <20150219204036.GA16882@suse.com> <20150219214229.GD15980@treble.redhat.com> <20150220095003.GA23506@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20150220095003.GA23506@gmail.com> User-Agent: Mutt/1.5.23.1-rc1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3885 Lines: 104 On Fri, Feb 20, 2015 at 10:50:03AM +0100, Ingo Molnar wrote: > > * Jiri Kosina wrote: > > > Alright, so to sum it up: > > > > - current stack dumping (even looking at /proc//stack) is not > > guaranteed to yield "correct" results in case the task is running at the > > time the stack is being examined > > Don't even _think_ about trying to base something as > dangerous as live patching the kernel image on the concept > of: > > 'We can make all stack backtraces reliably correct all > the time, with no false positives, with no false > negatives, 100% of the time, and quickly discover and > fix bugs in that'. > > It's not going to happen: > > - The correctness of stacktraces partially depends on > tooling and we don't control those. What tooling are you referring to? Sure, we rely on the compiler to produce the correct assembly for putting frame pointers on the stack. But that's pretty straightforward. The kernel already relies on the compiler to do plenty of things which are much more complex than that. > - More importantly, there's no strong force that ensures > we can rely on stack backtraces: correcting bad stack > traces depends on people hitting those functions and > situations that generate them, seeing a bad stack trace, > noticing that it's weird and correcting whatever code or > tooling quirk causes the stack entry to be incorrect. > > Essentially unlike other kernel code which breaks stuff if > it's incorrect, there's no _functional_ dependence on stack > traces, so live patching would be the first (and pretty > much only) thing that breaks on bad stack traces ... First, there are several things we do to avoid anomalies: - we don't patch asm functions - we only walk the stack of sleeping tasks Second, currently the stack unwinding code is rather crude and doesn't do much in the way of error handling. There are several error conditions we could easily check for programmatically: - make sure it starts from a __schedule() call at the top (I've only proposed walking the stacks of sleeping tasks) - make sure we actually reach the bottom of the stack - make sure each stack frame's return address is immediately after a call instruction If any of these checks fail, we can either delay the patching of the task until later or we can cancel the patching process, with no harm done. Either way we can WARN the user so that we get reports of these anomalies. > If you think you can make something like dwarf annotations > work reliably to base kernel live patching on that, > reconsider. No, we would rely on CONFIG_FRAME_POINTER. > Even with frame pointer backtraces can go bad sometimes, I > wouldn't base live patching even on _that_, Other than hand-coded assembly, can you provide more details about how frame pointer stack traces can go bad? > and that's a very simple concept with a performance cost that most > distros don't want to pay. Hm, CONFIG_FRAME_POINTER is enabled on all the distros I use. > So if your design is based on being able to discover 'live' > functions in the kernel stack dump of all tasks in the > system, I think you need a serious reboot of the whole > approach and get rid of that fragility before any of that > functionality gets upstream! Just to clarify a couple of things: 1. Despite the email subject, this discussion is really about another RFC patch set [1]. It hasn't been merged, and there's still a lot of ongoing discussion about it. 2. We don't dump the stack of _all_ tasks. Only sleeping ones. [1] https://lkml.org/lkml/2015/2/9/475 -- Josh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/