Date: Fri, 20 Feb 2015 10:50:03 +0100
From: Ingo Molnar
To: Jiri Kosina
Cc: Josh Poimboeuf, Vojtech Pavlik, Peter Zijlstra, Andrew Morton,
    Ingo Molnar, Seth Jennings, linux-kernel@vger.kernel.org,
    Linus Torvalds
Subject: Re: [PATCH 1/3] sched: add sched_task_call()
Message-ID: <20150220095003.GA23506@gmail.com>

* Jiri Kosina wrote:

> Alright, so to sum it up:
>
> - current stack dumping (even looking at /proc//stack) is not
>   guaranteed to yield "correct" results in case the task is running at the
>   time the stack is being examined

Don't even _think_ about trying to base something as
dangerous as live patching the kernel image on the concept
of:

  'We can make all stack backtraces reliably correct all
   the time, with no false positives, with no false
   negatives, 100% of the time, and quickly discover and
   fix bugs in that'.

It's not going to happen:

 - The correctness of stacktraces partially depends on
   tooling and we don't control those.

 - More importantly, there's no strong force that ensures we
   can rely on stack backtraces: correcting bad stack traces
   depends on people hitting those functions and situations
   that generate them, seeing a bad stack trace, noticing
   that it's weird and correcting whatever code or tooling
   quirk causes the stack entry to be incorrect.

Essentially, unlike other kernel code which breaks stuff if
it's incorrect, there's no _functional_ dependence on stack
traces, so live patching would be the first (and pretty much
only) thing that breaks on bad stack traces ...

If you think you can make something like DWARF annotations
work reliably to base kernel live patching on that,
reconsider.

Even frame pointer backtraces can go bad sometimes; I
wouldn't base live patching even on _that_, and that's a
very simple concept with a performance cost that most
distros don't want to pay.

So if your design is based on being able to discover 'live'
functions in the kernel stack dump of all tasks in the
system, I think you need a serious reboot of the whole
approach and get rid of that fragility before any of that
functionality gets upstream!

> - For live patching use-case, the stack has to be
>   analyzed (and decision on what to do based on the
>   analysis) in the NMI handler itself, otherwise it gets
>   racy again

You simply cannot reliably determine from the kernel stack
whether a function is used by a task or not, and actually
modify the kernel image, from a stack backtrace, as things
stand today. Full stop.

Thanks,

	Ingo