Date: Fri, 20 Feb 2015 11:44:18 +0100
From: Ingo Molnar
To: Jiri Kosina
Cc: Josh Poimboeuf, Vojtech Pavlik, Peter Zijlstra, Andrew Morton,
    Ingo Molnar, Seth Jennings, linux-kernel@vger.kernel.org,
    Linus Torvalds
Subject: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())
Message-ID: <20150220104418.GD25076@gmail.com>
References: <20150219163359.GA25438@suse.cz>
    <20150219170353.GB15980@treble.redhat.com>
    <20150219171929.GA13178@suse.cz>
    <20150219173255.GC15980@treble.redhat.com>
    <20150219204036.GA16882@suse.com>
    <20150219214229.GD15980@treble.redhat.com>
    <20150220095003.GA23506@gmail.com>

* Jiri Kosina wrote:

> On Fri, 20 Feb 2015, Ingo Molnar wrote:
>
> > So if your design is based on being able to discover
> > 'live' functions in the kernel stack dump of all tasks
> > in the system, I think you need a serious reboot of the
> > whole approach and get rid of that fragility before any
> > of that functionality gets upstream!
>
> So let me repeat again, just to make sure that no more
> confusion is being spread around -- there are approaches
> which do rely on stack contents, and approaches which
> don't. kpatch (the Red Hat solution) and ksplice (the
> Oracle solution) contain stack analysis as a conceptual
> design step, kgraft (the SUSE solution) doesn't.
So just to make my position really clear: any talk about
looking at the kernel stack for backtraces is just crazy
talk, considering how stack backtrace technology stands
today and in the reasonably near future!

With that out of the way, the only safe mechanism to live
patch the kernel (for sufficiently simple sets of changes
to singular functions) I'm aware of at the moment is:

 - forcing all user space tasks out of kernel mode and
   intercepting them in a safe state. I.e. making sure that
   no kernel code is executed, no kernel stack state is
   used (modulo code closely related to the live patching
   mechanism and kernel threads in safe state, let's ignore
   them for this argument)

There are two variants of this concept, which differ in the
timing of how user-space tasks are forced into user mode:

 - the simple method: force all user-space tasks out of
   kernel mode, stop the machine for a brief moment and be
   done with the patching safely and essentially
   atomically.

 - the complicated method spread out over time: uses the
   same essential mechanism plus the ftrace patching
   machinery to detect whether all tasks have transitioned
   through a version flip. [this is what kgraft does in
   part.]

All fundamental pieces of the simple method are necessary
to get a guaranteed-time transition from the complicated
method: task tracking and transparent catching of them,
handling kthreads, etc.

My argument is that the simple method should be implemented
first and foremost.

Then people can do add-on features to possibly spread out
the new function versions in a more complicated way if they
want to avoid the stop-all-tasks transition - although I'm
not convinced about it: I'm sure many sysadmins would like
the bug patching to be over with quickly and not have their
systems in an intermediate state like kgraft does it.
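To make the difference between the two variants concrete, here is a
minimal user-space toy model in Python. It is only a sketch under
stated assumptions, not kernel code and not how kpatch, kgraft or any
real implementation works: `Task`, `ToyKernel`, `kernel_exit()`,
`patch_simple()` and `start_gradual()` are hypothetical names invented
for illustration, and the model assumes a task's version flips only
when it next leaves kernel mode.

```python
from dataclasses import dataclass
from typing import Callable, List


def old_func() -> str:
    """The function version before patching."""
    return "old"


def new_func() -> str:
    """The replacement function version."""
    return "new"


@dataclass
class Task:
    """Toy stand-in for a user-space task (hypothetical, not task_struct)."""
    name: str
    in_kernel: bool = False  # True while the task runs kernel code
    flipped: bool = False    # True once the task uses the new version


class ToyKernel:
    """Single-threaded model of the two transition variants."""

    def __init__(self, tasks: List[Task]) -> None:
        self.tasks = tasks
        self.transition_active = False

    def call(self, task: Task) -> Callable[[], str]:
        """Return the function version this task would execute."""
        return new_func if task.flipped else old_func

    def kernel_exit(self, task: Task) -> None:
        """Task returns to user mode: the safe point for a version flip."""
        task.in_kernel = False
        if self.transition_active:
            task.flipped = True

    def transition_done(self) -> bool:
        """True once every task has passed through the version flip."""
        return all(t.flipped for t in self.tasks)

    def patch_simple(self) -> None:
        """Simple method: force everyone out, then switch all at once."""
        for t in self.tasks:
            t.in_kernel = False  # models forcing the task to user mode
        # Nothing else runs between the check and the switch in this
        # single-threaded model; that stands in for stopping the machine.
        assert not any(t.in_kernel for t in self.tasks)
        for t in self.tasks:
            t.flipped = True

    def start_gradual(self) -> None:
        """Complicated method: tasks flip one by one at kernel exit."""
        self.transition_active = True
```

In this model `patch_simple()` leaves no window in which some tasks
see "old" and others "new", whereas after `start_gradual()` the system
stays in a mixed state until `transition_done()` returns True.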
In any case, as per my arguments above, examining the
kernel stack is superfluous (so we won't be exposed to the
fragility of it either): there's no need to examine it and
writing such patches is misguided...

Thanks,

	Ingo