Date: Tue, 10 Feb 2015 11:29:45 -0600
From: Josh Poimboeuf
To: Masami Hiramatsu
Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik,
    live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model
Message-ID: <20150210172945.GH21643@treble.redhat.com>
References: <54D9E8AB.3070800@hitachi.com> <20150210155958.GE21643@treble.redhat.com>
In-Reply-To: <20150210155958.GE21643@treble.redhat.com>

On Tue, Feb 10, 2015 at 09:59:58AM -0600, Josh Poimboeuf wrote:
> On Tue, Feb 10, 2015 at 08:16:59PM +0900, Masami Hiramatsu wrote:
> > (2015/02/10 2:31), Josh Poimboeuf wrote:
> > > This patch set implements a livepatch consistency model, targeted for 3.21.
> > > Now that we have a solid livepatch code base, this is the biggest remaining
> > > missing piece.
> > >
> > > This code stems from the design proposal made by Vojtech [1] in November. It
> > > makes live patching safer in general. Specifically, it allows you to apply
> > > patches which change function prototypes. It also lays the groundwork for
> > > future code changes which will enable data and data semantic changes.
> >
> > Interesting, How would you do that?
>
> As Vojtech described in the earlier thread from November, there are
> different approaches for changing data:
>
> 1. TRANSFORM_WORLD: stop the world, transform everything, resume
>
> 2.
TRANSFORM_ON_ACCESS: transform data structures when you access them
>
> I would add a third category (which is what we've been doing with
> kpatch):
>
> 3. TRANSFORM_ON_CREATE: create new data structures created after a certain point
>    are the "v2" versions

Sorry, bad wording, I meant to say:

3. TRANSFORM_ON_CREATE: create new versions of the data structures when
   you create them

If that still doesn't make sense, hopefully the below explanation
clarifies what I mean :-)

>
> I think approach 1 seems very tricky, if not impossible in many cases,
> even if you're using stop_machine(). Right now we're focusing on
> enabling approaches 2 and 3, since they seem more practical, don't
> require stop_machine(), and are generally easier to get right.
>
> With kpatch we've been using approach 3, with a lot of success. Here's
> how I would do it with livepatch:
>
> As a prerequisite, we need shadow variables, which is a way to add
> virtual fields to existing structs at runtime. For an example, see:
>
>   https://github.com/dynup/kpatch/blob/master/test/integration/shadow-newpid.patch
>
> In that example, I added "newpid" to task_struct. If it's only
> something like locking semantics that are changing, you can just add a
> "v2" field to the struct to specify that it's the 2nd version of the
> struct.
>
> When converting a patch to be used for livepatch, the patch author must
> carefully look for data struct versioning changes. It doesn't matter if
> there's a new field, or if the semantics of using that data has changed.
> Either way, the patch author must define a new version of the struct.
>
> If a struct has changed, all patched functions need to be able to deal
> with struct v1 or struct v2. This is true for those functions which
> access the structs as well as the functions which create them.
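
To make the shadow variable idea above concrete, here is a minimal
userspace sketch. It is not the kpatch or livepatch implementation: the
names shadow_attach(), shadow_get() and shadow_detach() are hypothetical,
a plain linked list keyed by (object address, field name) stands in for
whatever lookup structure a real implementation would use, and there is
no locking.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical shadow-variable store (illustration only, not a kernel API).
 * Each entry attaches extra data to an existing object without changing
 * the object's own layout.
 */
struct shadow {
	struct shadow *next;
	const void *obj;	/* address of the object being extended */
	const char *name;	/* virtual field name; assumed to outlive the entry */
	void *data;		/* private copy of the field's contents */
};

static struct shadow *shadow_list;

/* Find the entry for (obj, name), or NULL if there is none. */
static struct shadow *shadow_find(const void *obj, const char *name)
{
	struct shadow *s;

	for (s = shadow_list; s; s = s->next)
		if (s->obj == obj && !strcmp(s->name, name))
			return s;
	return NULL;
}

/*
 * Attach a copy of @size bytes from @data to @obj under @name.
 * Returns the stored copy, or NULL if the field already exists or
 * allocation fails.
 */
void *shadow_attach(const void *obj, const char *name,
		    const void *data, size_t size)
{
	struct shadow *s;

	if (shadow_find(obj, name))
		return NULL;

	s = malloc(sizeof(*s));
	if (!s)
		return NULL;

	s->data = malloc(size ? size : 1);
	if (!s->data) {
		free(s);
		return NULL;
	}
	if (data && size)
		memcpy(s->data, data, size);
	else
		memset(s->data, 0, size ? size : 1);

	s->obj = obj;
	s->name = name;
	s->next = shadow_list;
	shadow_list = s;
	return s->data;
}

/* Return the data attached to @obj under @name, or NULL if absent. */
void *shadow_get(const void *obj, const char *name)
{
	struct shadow *s = shadow_find(obj, name);

	return s ? s->data : NULL;
}

/* Remove and free the field; must be called before @obj itself is freed. */
void shadow_detach(const void *obj, const char *name)
{
	struct shadow **pp, *s;

	for (pp = &shadow_list; (s = *pp) != NULL; pp = &s->next) {
		if (s->obj == obj && !strcmp(s->name, name)) {
			*pp = s->next;
			free(s->data);
			free(s);
			return;
		}
	}
}
```

With something like this, a non-NULL shadow_get(obj, "v2") would mark an
object as the 2nd version of its struct, and a NULL result would mean the
object is still v1.
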
>
> For example, a function which accesses the struct might change to:
>
>   if (klp_shadow_has_field(struct, "v2"))
>           /* access struct the new way */
>   else
>           /* access struct the old way */
>
> A function which creates the struct might change to:
>
>   struct foo *struct_create()
>   {
>           /* kmalloc and init struct here */
>
>           if (klp_patching_complete())
>                   /* add v2 shadow fields */
>   }
>
>
> The klp_patching_complete() call is needed to prevent v1 functions from
> accessing v2 data. The creation/transformation of v2 structs shouldn't
> occur until after the patching process is complete, and all tasks are
> converged to the new universe.
>
> > > disadvantages vs kpatch:
> > > - no system-wide switch point (not really a functional limitation, just forces
> > >   the patch author to be more careful. but that's probably a good thing anyway)
> >
> > OK, we must check carefully that the old function and new function can co-exist.
>
> Agreed, and this requires the patch author to look carefully for data
> version changes, as described above. Which they should be doing
> regardless.
>
> > > My biggest concerns and questions related to this patch set are:
> > >
> > > 1) To safely examine the task stacks, the transition code locks each task's rq
> > >    struct, which requires using the scheduler's internal rq locking functions.
> > >    It seems to work well, but I'm not sure if there's a cleaner way to safely
> > >    do stack checking without stop_machine().
> >
> > We'd better ask scheduler people.
>
> Agreed, I will.
>
> > > 2) As mentioned above, kthreads which are always sleeping on a patched function
> > >    will never transition to the new universe. This is really a minor issue
> > >    (less than 1% of patches). It's not necessarily something that needs to be
> > >    resolved with this patch set, but it would be good to have some discussion
> > >    about it regardless.
> > >
> > >    To overcome this issue, I have 1/2 an idea: we could add some stack checking
> > >    code to the ftrace handler itself to transition the kthread to the new
> > >    universe after it re-enters the function it was originally sleeping on, if
> > >    the stack doesn't already have any other to-be-patched functions.
> > >    Combined with the klp_transition_work_fn()'s periodic stack checking of
> > >    sleeping tasks, that would handle most of the cases (except when trying to
> > >    patch the high-level thread_fn itself).
> >
> > It makes sense to me. (I just did a similar thing)
> >
> > >
> > >    But then how do you make the kthread wake up? As far as I can tell,
> > >    wake_up_process() doesn't seem to work on a kthread (unless I messed up my
> > >    testing somehow). What does kGraft do in this case?
> >
> > Hmm, at a glance, the code itself can work on kthread too...
> > Maybe you can also send your testing patch too.
>
> Yeah, I probably messed it up. I'll try it again :-)
>
> --
> Josh

--
Josh