Date: Sat, 21 Feb 2015 20:16:07 +0100
From: Ingo Molnar
To: Jiri Kosina
Cc: Vojtech Pavlik, Josh Poimboeuf, Peter Zijlstra, Andrew Morton,
    Ingo Molnar, Seth Jennings, linux-kernel@vger.kernel.org,
    Linus Torvalds
Subject: Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())
Message-ID: <20150221191607.GA9534@gmail.com>

* Jiri Kosina wrote:

> On Sat, 21 Feb 2015, Ingo Molnar wrote:
>
> > > This means that each and every sleeping task in the
> > > system has to be woken up in some way (sending a
> > > signal ...) to exit from a syscall it is sleeping in.
> > > Same for CPU hogs. All kernel threads need to be
> > > parked.
> >
> > Yes - although I'd not use signals for this, signals
> > have side effects - but yes, something functionally
> > equivalent.
>
> This is similar to my proposal I came up with not too
> long time ago; a fake signal (analogically to, but not
> exactly the same, what freezer is using), that will just
> make tasks cycle through userspace/kernelspace boundary
> without other side-effects.

Yeah.

> > > This is exactly what you need to do for kGraft to
> > > complete patching.
> >
> > My understanding of kGraft is that by default you allow
> > tasks to continue 'in the new universe' after they are
> > patched. Has this changed or have I misunderstood the
> > concept?
>
> What Vojtech meant here, I believe, is that the effort
> you have to make to force all tasks to queue themselves
> to park them on a safe place and then restart their
> execution is exactly the same as the effort you have to
> make to make kGraft converge and succeed.

Yes - with the difference that in the 'simple' method I
suggested we'd have kpatch's patching robustness (all or
nothing atomic patching - no intermediate patching state,
no reliance on mcount entries, no doubt about which version
of the function is working - sans kpatch's stack trace
logic), combined with kGraft's task parking robustness.

> But admittedly, if we reserve a special sort-of signal
> for making the tasks pass through a safe checkpoint (and
> make them queue there (your solution) or make them just
> pass through it and continue (current kGraft)), it might
> reduce the time this effort needs considerably.

Well, I think the 'simple' method has another advantage: it
can only work if all those problems (kthreads, parking
machinery) are solved, because the patching will occur only
when everything is quiescent.

So no shortcuts are allowed, by design. It starts from a
fundamentally safe, robust base, while all the other
approaches I've seen were developed in a 'let's get the
patching to work, then iteratively try to make it safer'
fashion, which really puts the cart before the horse.

So to put it more bluntly: I don't subscribe to the whole
'consistency model' nonsense: that's just crazy talk IMHO.

Make it fundamentally safe from the very beginning: the
'simple method' I suggested _won't live patch the kernel_
if the mechanism has a bug and some kthread or task does
not park.

See the difference?
Thanks,

	Ingo
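
Illustrative note (not part of the original message): the
all-or-nothing behaviour of the 'simple' method described above
can be modelled with a small user-space toy. This is only a
sketch under stated assumptions, not kernel code: the names
Task, request_park, unpark_all and try_live_patch are all
hypothetical, and "parking" is reduced to setting a flag. It
shows one property only: if any task fails to park, no function
is replaced at all.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    """Toy stand-in for a task or kthread (hypothetical model)."""
    name: str
    can_park: bool = True   # False models a task that never reaches the safe point
    parked: bool = False


def request_park(task: Task) -> None:
    """Ask a task to stop at the safe checkpoint; a stuck task ignores it."""
    if task.can_park:
        task.parked = True


def unpark_all(tasks: List[Task]) -> None:
    """Let every task resume execution."""
    for task in tasks:
        task.parked = False


def try_live_patch(
    tasks: List[Task],
    functions: Dict[str, Callable[[], str]],
    patch: Dict[str, Callable[[], str]],
) -> bool:
    """Apply the whole patch only if every task is parked, else apply nothing."""
    for task in tasks:
        request_park(task)
    try:
        if not all(task.parked for task in tasks):
            # Some task did not park: refuse to patch, leave old code in place.
            return False
        # Everything is quiescent, so no task can observe a half-applied patch.
        functions.update(patch)
        return True
    finally:
        # Tasks resume in either case: all on old code or all on new code.
        unpark_all(tasks)


if __name__ == "__main__":
    functions = {"do_work": lambda: "old"}
    patch = {"do_work": lambda: "new"}

    stuck = [Task("bash"), Task("kthread", can_park=False)]
    print(try_live_patch(stuck, functions, patch), functions["do_work"]())

    healthy = [Task("bash"), Task("kworker")]
    print(try_live_patch(healthy, functions, patch), functions["do_work"]())
```

In this model a bug in the parking machinery shows up as a
refused patch rather than as a kernel running a mix of old and
new functions, which is the design choice argued for above.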