Date: Wed, 11 Feb 2015 17:28:13 +0100 (CET)
From: Miroslav Benes <mbenes@suse.cz>
To: Josh Poimboeuf <jpoimboe@redhat.com>
cc: Seth Jennings <sjenning@redhat.com>, Jiri Kosina <jkosina@suse.cz>,
        Vojtech Pavlik <vojtech@suse.cz>,
        Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
        live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
In-Reply-To: <20150210165639.GF21643@treble.redhat.com>
Message-ID: <alpine.LNX.2.00.1502111703570.27943@pobox.suse.cz>
References: <cover.1423499826.git.jpoimboe@redhat.com> <2c3d1e685dae5cccc2dfdb1b24c241b2f1c89348.1423499826.git.jpoimboe@redhat.com> <alpine.LNX.2.00.1502101627350.18649@pobox.suse.cz> <20150210165639.GF21643@treble.redhat.com>
User-Agent: Alpine 2.00 (LNX 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 6831
Lines: 146

On Tue, 10 Feb 2015, Josh Poimboeuf wrote:

> On Tue, Feb 10, 2015 at 04:59:17PM +0100, Miroslav Benes wrote:
> > 
> > On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> > 
> > > Add a basic per-task consistency model.  This is the foundation which
> > > will eventually enable us to patch those ~10% of security patches which
> > > change function prototypes and/or data semantics.
> > > 
> > > When a patch is enabled, livepatch enters into a transition state where
> > > tasks are converging from the old universe to the new universe.  If a
> > > given task isn't using any of the patched functions, it's switched to
> > > the new universe.  Once all the tasks have been converged to the new
> > > universe, patching is complete.
> > > 
> > > The same sequence occurs when a patch is disabled, except the tasks
> > > converge from the new universe to the old universe.
> > > 
> > > The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> > > is in transition.  Only a single patch (the topmost patch on the stack)
> > > can be in transition at a given time.  A patch can remain in the
> > > transition state indefinitely, if any of the tasks are stuck in the
> > > previous universe.
> > > 
> > > A transition can be reversed and effectively canceled by writing the
> > > opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> > > the transition is in progress.  Then all the tasks will attempt to
> > > converge back to the original universe.
> > 
> > Hi Josh,
> > 
> > first, thanks a lot for great work. I'm starting to go through it and it's 
> > gonna take me some time to do and send a complete review.
> 
> I know there are a lot of details to look at, please take your time.  I
> really appreciate your review.  (And everybody else's, for that matter
> :-)
> 
> > > +	/* success! unpatch obsolete functions and do some cleanup */
> > > +
> > > +	if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> > > +		klp_unpatch_objects(klp_transition_patch);
> > > +
> > > +		/* prevent ftrace handler from reading old func->transition */
> > > +		synchronize_rcu();
> > > +	}
> > > +
> > > +	pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> > > +		  klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> > > +							  "unpatching");
> > > +
> > > +	klp_complete_transition();
> > > +}
> > 
> > ...synchronize_rcu() could be insufficient. There still can be some  
> > process in our ftrace handler after the call.
> > 
> > Consider the following scenario:
> > 
> > When synchronize_rcu is called some process could have been preempted on 
> > some other cpu somewhere at the start of the ftrace handler before  
> > rcu_read_lock. synchronize_rcu waits for the grace period to pass, but that 
> > does not mean anything for our process in the handler, because it is not 
> > in rcu critical section. There is no guarantee that after synchronize_rcu 
> > the process would be away from the handler. 
> > 
> > "Meanwhile" klp_try_complete_transition continues and calls 
> > klp_complete_transition. This clears func->transition flags. Now the 
> > process in the handler could be scheduled again. It reads the wrong value 
> > of func->transition and redirection to the wrong function is done.
> > 
> > What do you think? I hope I made myself clear.
> 
> You really made me think.  But I don't think there's a race here.
> 
> Consider the two separate cases, patching and unpatching:
> 
> 1. patching has completed: klp_universe_goal and all tasks'
>    klp_universes are at KLP_UNIVERSE_NEW.  In this case, the value of
>    func->transition doesn't matter, because we want to use the func at
>    the top of the stack, and if klp_universe is NEW, the ftrace handler
>    will do that, regardless of the value of func->transition.  This is
>    why I didn't do the rcu_synchronize() in this case.  But maybe you're
>    not worried about this case anyway, I just described it for the sake
>    of completeness :-)

Yes, this case shouldn't be a problem :)

> 2. unpatching has completed: klp_universe_goal and all tasks'
>    klp_universes are at KLP_UNIVERSE_OLD.  In this case, the value of
>    func->transition _does_ matter.  However, notice that
>    klp_unpatch_objects() is called before rcu_synchronize().  That
>    removes the "new" func from the klp_ops stack.  Since the ftrace
>    handler accesses the list _after_ calling rcu_read_lock(), it will
>    never see the "new" func, and thus func->transition will never be
>    set.

Hm, so indeed I messed it up. Let me rework the scenario a bit. We have a 
function foo(), which has already been patched with foo_1() from patch_1 
and foo_2() from patch_2. Now we would like to unpatch patch_2. The 
transition completes successfully and klp_try_complete_transition() calls 
klp_unpatch_objects() and synchronize_rcu(). Thus foo_2() is removed from 
the RCU list in ops. 

Now to the funny part. After synchronize_rcu() and before 
klp_complete_transition() some process might get to the ftrace handler (it 
is still there because patch_1 is still present). It gets foo_1() from 
list_first_or_null_rcu(), sees that func->transition is 1 (it hasn't been 
cleared yet) and that current->klp_universe is KLP_UNIVERSE_OLD, so it 
tries to get the previous function. There is none, so foo() is called. 
This is incorrect, because patch_1 is still applied and foo_1() should 
have been called.

It is a very similar scenario to the one in my other email earlier today. 
I think we need to clear func->transition before calling 
klp_unpatch_objects(). More or less.
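
To make the window in the scenario above concrete, here is a rough
userspace model of it. This is only a sketch: every name in it (struct
func, struct ops, handler_pick() and so on) is hypothetical and merely
mimics the handler logic as I described it above. It is not the code from
the patch, it does not model RCU or preemption, and it only looks at the
window after the unpatch step, not at the one between clearing the flags
and unpatching.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */

enum universe { UNIVERSE_OLD, UNIVERSE_NEW };
enum target { CALL_ORIGINAL, CALL_FOO_1, CALL_FOO_2 };

struct func {
	enum target target;	/* function this entry redirects to */
	int transition;		/* models func->transition */
};

#define STACK_MAX 4

struct ops {
	struct func *stack[STACK_MAX];	/* stack[n - 1] is the topmost func */
	size_t n;
};

void ops_push(struct ops *ops, struct func *f)
{
	assert(ops->n < STACK_MAX);
	ops->stack[ops->n++] = f;
}

/* Models klp_unpatch_objects() removing the topmost func from the list. */
void ops_pop(struct ops *ops)
{
	assert(ops->n > 0);
	ops->n--;
}

/*
 * Simplified handler decision: take the topmost func; if it is in
 * transition and the task is in the old universe, fall back to the
 * previous func, or to the original function if there is none.
 */
enum target handler_pick(const struct ops *ops, enum universe task_universe)
{
	const struct func *f;

	if (ops->n == 0)
		return CALL_ORIGINAL;

	f = ops->stack[ops->n - 1];
	if (f->transition && task_universe == UNIVERSE_OLD) {
		if (ops->n < 2)
			return CALL_ORIGINAL;
		f = ops->stack[ops->n - 2];
	}
	return f->target;
}

/* Order from the scenario: unpatch first, flags are cleared only later. */
enum target window_unpatch_then_clear(void)
{
	struct func foo_1 = { CALL_FOO_1, 1 };
	struct func foo_2 = { CALL_FOO_2, 1 };
	struct ops ops = { .n = 0 };
	enum target seen;

	ops_push(&ops, &foo_1);
	ops_push(&ops, &foo_2);

	ops_pop(&ops);				/* unpatch patch_2 */
	seen = handler_pick(&ops, UNIVERSE_OLD);	/* task enters in the window */
	foo_1.transition = 0;			/* flag cleared too late */
	return seen;
}

/* Suggested order: clear the flags first, then unpatch. */
enum target window_clear_then_unpatch(void)
{
	struct func foo_1 = { CALL_FOO_1, 1 };
	struct func foo_2 = { CALL_FOO_2, 1 };
	struct ops ops = { .n = 0 };

	ops_push(&ops, &foo_1);
	ops_push(&ops, &foo_2);

	foo_1.transition = 0;
	foo_2.transition = 0;
	ops_pop(&ops);				/* unpatch patch_2 */
	return handler_pick(&ops, UNIVERSE_OLD);
}
```

In this model window_unpatch_then_clear() ends up at the original foo(),
which is the wrong result, while window_clear_then_unpatch() ends up at
foo_1(). Whether the real handler behaves the same way depends on details
the model leaves out.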

>    That said, I think there is a race where the WARN_ON_ONCE(!func)
>    could trigger here, and it wouldn't be an error.  So I think I'll
>    remove the warning.
> 
> Does that make sense?
> 
> > There is the similar problem for dynamic trampolines in ftrace. You
> > cannot remove them unless there is no process in the handler. I think
> > rcu-tasks were merged a while ago for this purpose. However ftrace
> > does not use them yet and I don't know if we could exploit them to
> > solve this issue. I need to think more about it.
> 
> Ok, sounds like that's an ftrace bug that could affect us.

Fortunately it is not. Steven knows about it and he does not allow dynamic 
trampolines for CONFIG_PREEMPT and FTRACE_OPS_FL_DYNAMIC. Not yet. See the 
comment in kernel/trace/ftrace.c for ftrace_update_trampoline.

Anyway, the conclusion is that we need to be really careful with the 
ftrace handler, especially in the future with dynamic trampolines and 
especially with CONFIG_PREEMPT. For now the handler always runs in atomic 
context (at least in the cases relevant for our use), if I am not mistaken.

Miroslav