Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934054AbcDLRgA (ORCPT ); Tue, 12 Apr 2016 13:36:00 -0400 Received: from youngberry.canonical.com ([91.189.89.112]:48812 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933530AbcDLRf6 (ORCPT ); Tue, 12 Apr 2016 13:35:58 -0400 Date: Tue, 12 Apr 2016 12:35:49 -0500 From: Chris J Arges To: Josh Poimboeuf Cc: Jiri Kosina , Jessica Yu , Miroslav Benes , linux-kernel@vger.kernel.org, live-patching@vger.kernel.org, Vojtech Pavlik Subject: Re: [RFC PATCH v1.9 08/14] livepatch: separate enabled and patched states Message-ID: <20160412173548.GA26049@canonical.com> References: <20160412144443.GA21882@canonical.com> <20160412171600.eikmt75i4qapjj2g@treble> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160412171600.eikmt75i4qapjj2g@treble> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5447 Lines: 150 On Tue, Apr 12, 2016 at 12:16:00PM -0500, Josh Poimboeuf wrote: > On Tue, Apr 12, 2016 at 09:44:43AM -0500, Chris J Arges wrote: > > On Fri, Mar 25, 2016 at 02:34:55PM -0500, Josh Poimboeuf wrote: > > > Once we have a consistency model, patches and their objects will be > > > enabled and disabled at different times. For example, when a patch is > > > disabled, its loaded objects' funcs can remain registered with ftrace > > > indefinitely until the unpatching operation is complete and they're no > > > longer in use. > > > > > > It's less confusing if we give them different names: patches can be > > > enabled or disabled; objects (and their funcs) can be patched or > > > unpatched: > > > > > > - Enabled means that a patch is logically enabled (but not necessarily > > > fully applied). > > > > > > - Patched means that an object's funcs are registered with ftrace and > > > added to the klp_ops func stack. > > > > > > Also, since these states are binary, represent them with booleans > > > instead of ints. > > > > > > > Josh, > > > > Awesome work here first of all! > > > > Looking through the patchset a bit I see the following bools: > > - functions: patched, transitioning > > - objects: patched > > - patches: enabled > > > > It seems this reflects the following states at a patch level: > > disabled - module inserted, not yet logically enabled > > enabled - logically enabled, but not all objects/functions patched > > transitioning - objects/functions are being applied or reverted > > patched - all objects/functions patched > > > > However each object and function could have the same state and the parent object > > just reflects the 'aggregate state'. For example if all funcs in an object are > > patched then the object is also patched. > > > > Perhaps we need more states (or maybe there will be more in the future), but > > wouldn't this just be easier to have something like for each patch, object, and > > function? > > > > enum klp_state{ > > KLP_DISABLED, > > KLP_ENABLED, > > KLP_TRANSITION, > > KLP_PATCHED, > > } > > > > > > I'm happy to help out too. > > Thanks for the comments. First let me try to explain why I chose two > bools rather than a single state variable. > > At a func level, it's always in one of the following states: > > patched=0 transition=0: unpatched > patched=0 transition=1: unpatched, temporary starting state > patched=1 transition=1: patched, may be visible to some tasks > patched=1 transition=0: patched, visible to all tasks > > And for unpatching, it goes in the reverse order: > > patched=1 transition=0: patched, visible to all tasks > patched=1 transition=1: patched, may be visible to some tasks > patched=0 transition=1: unpatched, temporary ending state > patched=0 transition=0: unpatched > > (note to self, put the above in a comment somewhere) > This is helpful and makes more sense. > Now, we could convert the states from two bools into a single enum. But > I think it would complicate the code. Because nowhere in the code does > it need to access the full state. In some places it accesses > func->patched and in other places it accesses func->transition, but it > never needs to access both. > > So for example, the following check in klp_ftrace_handler(): > > if (unlikely(func->transition)) { > > would change to: > > if (unlikely(func->state == KLP_ENABLED || func->state == KLP_TRANSITION)) { > > Sure, we could use a helper function to make that more readable. But > with the bools its clearer and you don't need a helper function. > > As another example, see the following code in klp_complete_transition(): > > klp_for_each_object(klp_transition_patch, obj) > klp_for_each_func(obj, func) > func->transition = false; > > The last line would have to be changed to something like: > > if (patching...) > func->state = KLP_PATCHED; > else /* unpatching */ > func->state = KLP_DISABLED; > > So that's why I picked two bools over a single state variable: it seems > to make the code simpler. > > As to the other idea about copying the func states to the object and > patch level, I get the feeling that would also complicate the code. We > patch and transition at a function granularity, so the "real" state is > at the func level. Proliferating that state to objects and patches > might be tricky to get right, and it could make it harder to understand > exactly what's going on in the code. And I don't really see a benefit > to doing that. > > -- > Josh > I think keeping it simple makes a ton of sense, thanks for the explanation. I'm also envisioning how to troubleshoot patches that don't converge. So if someone wanted to check if a function has been fully patched on all tasks they would check: /sys/kernel/livepatch//transition /sys/kernel/livepatch//patched And see if patched=1 and transition=0 things are applied. However if patched=1 and transition=1 then if a user wanted to dig down and see which pid's we were waiting on we could do: cat /proc/*/patch_status and check if any pid's patch_status values still contain KLP_UNIVERSE_OLD. If we wanted to see which function in the task needs patching we could: cat /proc//stack and see if anything in that stack contains the functions in question. Anyway I'll keep looking at this patchset, patching using the samples/livepatch code works for me without issue so far. --chris