Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753457AbcDNJZZ (ORCPT ); Thu, 14 Apr 2016 05:25:25 -0400 Received: from mx2.suse.de ([195.135.220.15]:36708 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751742AbcDNJZV (ORCPT ); Thu, 14 Apr 2016 05:25:21 -0400 Date: Thu, 14 Apr 2016 11:25:18 +0200 (CEST) From: Miroslav Benes To: Josh Poimboeuf cc: Jiri Kosina , Jessica Yu , linux-kernel@vger.kernel.org, live-patching@vger.kernel.org, Vojtech Pavlik Subject: Re: [RFC PATCH v1.9 12/14] livepatch: create per-task consistency model In-Reply-To: <74a2b37cea7a64a185e50876dba031137aa59a24.1458933243.git.jpoimboe@redhat.com> Message-ID: References: <74a2b37cea7a64a185e50876dba031137aa59a24.1458933243.git.jpoimboe@redhat.com> User-Agent: Alpine 2.00 (LNX 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5223 Lines: 134 On Fri, 25 Mar 2016, Josh Poimboeuf wrote: > Add a basic per-task consistency model. This is the foundation which > will eventually enable us to patch those ~10% of security patches which > change function prototypes and/or data semantics. > > When a patch is enabled, livepatch enters into a transition state where > tasks are converging from the old universe to the new universe. If a > given task isn't using any of the patched functions, it's switched to > the new universe. Once all the tasks have been converged to the new > universe, patching is complete. > > The same sequence occurs when a patch is disabled, except the tasks > converge from the new universe to the old universe. > > The /sys/kernel/livepatch//transition file shows whether a patch > is in transition. Only a single patch (the topmost patch on the stack) > can be in transition at a given time. A patch can remain in the > transition state indefinitely, if any of the tasks are stuck in the > previous universe. > > A transition can be reversed and effectively canceled by writing the > opposite value to the /sys/kernel/livepatch//enabled file while > the transition is in progress. Then all the tasks will attempt to > converge back to the original universe. So after spending some time with this patch I must say hats off to you. I haven't managed to find anything serious yet and I doubt I would. Few things below. > +/* > + * klp_update_task_universe() - change the patched state of a task > + * @task: The task to change > + * > + * Converts the patched state of the task so that it will switch to the set of > + * functions in the goal universe. > + */ > +static inline void klp_update_task_universe(struct task_struct *task) > +{ > + /* > + * The corresponding write barriers are in klp_init_transition() and > + * klp_start_transition(). See the comments there for an explanation. > + */ > + smp_rmb(); > + > + task->klp_universe = klp_universe_goal; > +} I wonder whether we should introduce a static_key for this function. klp_update_task_universe() is called in some important code paths and it is called everytime if CONFIG_LIVEPATCH is on. It does not matter much for syscall paths, because we enter slowpath there only if TIF_KLP_NEED_UPDATE is set and that is set iff patching is in progress. But we call the function also elsewhere. So maybe it would be nice to introduce static_key in the function which would be on if the patching is in progress and off if not. Maybe it is just an overoptimization though. [...] > +static bool klp_try_switch_task(struct task_struct *task) > +{ > + struct rq *rq; > + unsigned long flags; > + int ret; > + bool success = false; > + > + /* check if this task has already switched over */ > + if (task->klp_universe == klp_universe_goal) > + return true; [...] > +/* > + * This function can be called in the middle of an existing transition to > + * reverse the direction of the universe goal. This can be done to effectively > + * cancel an existing enable or disable operation if there are any tasks which > + * are stuck in the starting universe. > + */ > +void klp_reverse_transition(void) > +{ > + struct klp_patch *patch = klp_transition_patch; > + > + klp_start_transition(!klp_universe_goal); > + klp_try_complete_transition(); > + > + patch->enabled = !patch->enabled; > +} This function could be called iff the patching is in progress, there are some not-yet-migrated tasks and we wait for scheduled workqueue to run klp_try_complete_transition() again, right? I have two questions. 1. Is the call to klp_try_complete_transition() here necessary? It would be called via workqueue, wouldn't it? I suspect we don't gain much with this and we introduce possible future problems because of "double success" of klp_try_complete_transition() (nothing serious there as of now though). But I might be wrong because I got really confused by this piece of code and its context :) 2. klp_reverse_transition() works well because not-yet-migrated tasks are still in KLP_UNIVERSE_OLD (without loss of generality) and since klp_universe_goal is swapped they are in fact in their target universe. klp_try_switch_task() returns immediately in such case. The rest is migrated in an ordinary way. What about TIF_KLP_NEED_UPDATE (I know it is introduced in later patch but it is easier to discuss it here)? klp_start_transition() resets the flag for all the task. Shouldn't the flag be cleared for the task in klp_try_switch_task() if task->klp_universe == klp_universe_goal? In fact it does not matter if it is set. The patching success is determined by klp_func->klp_universe booleans. But those tasks would needlessly go through syscall slowpaths (see note above about static_key) which could hurt performance. Does this make sense? [...] transition.h: > > +extern void klp_init_transition(struct klp_patch *patch, int universe); > +extern void klp_start_transition(int universe); > +extern void klp_reverse_transition(void); > +extern bool klp_try_complete_transition(void); > +extern void klp_complete_transition(void); externs are superfluous here and we do not have them in other header files. Anyway, great job! Miroslav