References: <20150220194901.GB3603@gmail.com> <20150220214613.GA21598@suse.com> <20150221181852.GA8406@gmail.com> <20150221191607.GA9534@gmail.com> <20150221194840.GA10126@gmail.com> <20150222084601.GA23491@gmail.com> <20150222094639.GA23684@gmail.com> <20150222104841.GA25335@gmail.com> <20150222150148.3c566837.akpm@linux-foundation.org>
Date: Sun, 22 Feb 2015 16:44:07 -0800
Subject: Re: live kernel upgrades (was: live kernel patching design)
From: Arjan van de Ven
To: Dave Airlie
Cc: Andrew Morton, Jiri Kosina, Ingo Molnar, Vojtech Pavlik, Josh Poimboeuf, Peter Zijlstra, Ingo Molnar, Seth Jennings, LKML, Linus Torvalds, Arjan van de Ven, Thomas Gleixner, Peter Zijlstra, Borislav Petkov, live-patching@vger.kernel.org

There's failover, there's running the core services in VMs (which can migrate)...

I think 10 seconds is Ingo exaggerating a bit, since you can boot a full system in a lot less time than that, and more so if you know more about the system (e.g. don't need to spin down and then discover and spin up disks). If you're talking about inside a VM it's even more extreme than that.

Now, live patching sounds great as an ideal, but it may end up being (mostly) like hardware hotplug: everyone wants it, but nobody wants to use it (and just waits for a maintenance window instead).
In the hotplug case, while people say they want it, they're also aware that hardware hotplug is fundamentally messy, and then nobody wants to do it on that mission-critical piece of hardware outside the maintenance window. (Hotswap drives seem to have been the exception to this; that seems to have been worked out well enough, but that's replace-with-the-same.)

I would be very afraid that hot kernel patching ends up in the same space: the super-mission-critical folks are who it's aimed at, while those are the exact same folks that would rather wait for the maintenance window.

There are a lot of logistical issues (can you patch a patched system... if live patching is a first-class citizen you end up with dozens and dozens of live patches applied, some out of sequence, etc.). There's the "which patches do I have, and if the first patch for a security hole was not complete, how do I cope by applying number two?" question. There's the "which of my 50,000 servers have which patch applied" logistics.

And Ingo is absolutely right: the scope is very fuzzy. Today's bugfix is tomorrow's "oh oops, it turns out exploitable".

I will throw a different hat in the ring: maybe we don't want full kernel update as step one, maybe we want this on a kernel module level: hot-swap of kernel modules, where a kernel module makes itself go quiet and serializes its state ("suspend" pretty much), then gets swapped out (hot) by its replacement, which then unserializes the state and continues.

If we can do this on a module level, then the next step is treating more components of the kernel as modules, which is a fundamental modularity thing.

On Sun, Feb 22, 2015 at 4:18 PM, Dave Airlie wrote:
> On 23 February 2015 at 09:01, Andrew Morton wrote:
>> On Sun, 22 Feb 2015 20:13:28 +0100 (CET) Jiri Kosina wrote:
>>
>>> But if you ask the folks who are hungry for live bug patching, they
>>> wouldn't care.
>>>
>>> You mentioned "10 seconds", that's more or less equal to infinity to them.
>>
>> 10 seconds outage is unacceptable, but we're running our service on a
>> single machine with no failover. Who is doing this??
>
> if I had to guess, telcos generally, you've only got one wire between a phone
> and the exchange and if the switch on the end needs patching it better be fast.
>
> Dave.
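
To make the module hot-swap idea above (go quiet, serialize state, swap, unserialize, continue) a bit more concrete, here is a rough toy model in Python. It is a userspace illustration only, not kernel code and not an existing kernel interface: the class names, the `quiesce`/`serialize`/`unserialize` methods, the JSON state format, and `hot_swap` are all made up for this sketch.

```python
import json


class Module:
    """Toy stand-in for a kernel module; every name here is hypothetical."""

    version = 0

    def __init__(self):
        self.handled = 0    # the only piece of state this toy module owns
        self.quiet = False  # True while the module refuses new work

    def handle(self):
        # Refuse work while quiesced so the state cannot change mid-swap.
        if self.quiet:
            raise RuntimeError("module is quiesced")
        self.handled += 1
        return self.handled

    def quiesce(self):
        # "Suspend" pretty much: stop accepting new work.
        self.quiet = True

    def serialize(self):
        # Only a quiet module can hand over a consistent snapshot.
        if not self.quiet:
            raise RuntimeError("quiesce before serializing")
        return json.dumps({"format": 1, "handled": self.handled})

    def unserialize(self, blob):
        # The replacement must understand the old module's state format.
        state = json.loads(blob)
        if state.get("format") != 1:
            raise ValueError("unknown state format")
        self.handled = state["handled"]
        self.quiet = False


class ModuleV1(Module):
    version = 1


class ModuleV2(Module):
    version = 2  # the replacement carrying the fix


def hot_swap(old, new):
    """Quiesce the old module, move its state across, resume in the new one."""
    old.quiesce()
    new.quiesce()  # keep the replacement quiet until it has the state
    new.unserialize(old.serialize())
    return new
```

If a `ModuleV1` has handled two requests and is then passed to `hot_swap` together with a fresh `ModuleV2`, the next `handle()` on the replacement returns 3, while the old module stays quiesced. The versioned state format is the hard part in practice: the replacement has to understand whatever the old module serialized.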