Date: Thu, 20 Jul 2017 17:50:04 +0200
From: Petr Mladek <pmladek@suse.com>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>, live-patching@vger.kernel.org,
        linux-kernel@vger.kernel.org, Jessica Yu <jeyu@kernel.org>,
        Jiri Kosina <jikos@kernel.org>, Miroslav Benes <mbenes@suse.cz>,
        Chris J Arges <chris.j.arges@canonical.com>
Subject: Re: [PATCH] livepatch: add (un)patch hooks
Message-ID: <20170720155004.GB13895@pathway.suse.cz>
References: <1499868600-10176-1-git-send-email-joe.lawrence@redhat.com>
 <1499868600-10176-2-git-send-email-joe.lawrence@redhat.com>
 <20170717155144.GF32632@pathway.suse.cz>
 <20170719204952.4fyhtig3rbw7z4w4@treble>
 <20170720041723.35r6qk2fia7xix3t@treble>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170720041723.35r6qk2fia7xix3t@treble>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 4911
Lines: 139

On Wed 2017-07-19 23:17:23, Josh Poimboeuf wrote:
> On Wed, Jul 19, 2017 at 03:49:52PM -0500, Josh Poimboeuf wrote:
> > > I am sorry for the long mail. But I have really troubles to
> > > understand and describe what can be done with these hooks
> > > a safe way.
> > > 
> > > It might help if you share some real-life examples.
> > 
> > Agreed, we should share some real world examples.  For a few cases, load
> > hooks were extremely useful.  But most of our experience has been with
> > the kpatch consistency model, so we need to revisit our past findings
> > and view them through the livepatch lens.
> > 
> > One crazy -- but potentially very useful -- idea would be if the user
> > were allowed to run stop_machine() from the load hook.  If possible,
> > that would help prevent a lot of race conditions.
> 
> To try to give us all a better idea of what's needed, here are some of
> the patches that Joe and I looked at before which seem to need load
> hooks.  A few of these are ones we actually delivered to customers with
> kpatch.  I've tried to re-analyze them in light of the livepatch CM.
> 
> 
> First, a few observations:
> 
> - Load hooks are a power feature.  They're live patching
>   in "hard mode".

I really like this description of the feature!

The original description made me feel that it was something easy
to use and rather safe because the changes were done before
the object was patched or even loaded.

>From my point of view, creating a classic livepatch is relatively
easy. The old code is redirected to a fixed one. Where the fixed
one is more or less the same as the new "upstream" fixed code.

Of course, there are some catches, for example, the inline functions,
compiler optimization, semantic changes, functions that cannot be traced.
But all this is kind of discovered and described.

But the hooks allows to do anything. They require writing a custom
code that will never be used anywhere else. It usually has very
limited review if any.

The things gets more complicated when the new code provided by
the livepatch somehow depends on the effect of the hooks and vice
versa. Then the authors must know something about the consistency
models. They might deal with many problems (races) that we were
dealing when implementing the consistency models. And the livepatch
framework does not help much here. In fact, the hooks work in
the immediate mode even when the rest of the patch is applied
using the more careful consistency model.


> - In general, the hooks seem to be useful for cases like:
> 
>   - global data updates
> 
>   - "patches" to __init and probe functions

I think that this is similar to global data updates. The variables
are only harder to find.


>   - patching otherwise unpatchable code (i.e., assembly)
> 
>   In many/most cases, it seems like stop_machine() would be very useful
>   to avoid concurrency issues.

I am not sure if stop_machine() would help here. It would make sense
in kPatch where also the ftrace handlers are added during
stop_machine(). Then it is possible to synchronize both operations
(hooks, enabling ftrace handlers) and do everything "atomically".

IMHO, the big advantage of livepatch framework is that stop_machine()
is not needed. I hope that it will stay this way.

Also it might need some additional support. You would want to stop
the machine to make sure that it is safe to do a change. Then
we might need to check stacks, ...


> - The more examples I look at, the more I'm thinking we will need both
>   pre-patch and post-patch hooks, as well as pre-unpatch and
>   post-unpatch hooks.

Yup.


> - The pre-patch and pre-unpatch hooks can be run before the
>   patching/unpatching process begins.
> 
> - The post-patch and post-unpatch hooks will need to be run from either
>   klp_complete_transition() or klp_module_coming/going(), depending on
>   whether the to-be-patched module is already loaded or is being
>   loaded/unloaded.

Makes sense.

> 
> Here's a simple example:

Thanks a lot for the examples. I have got the idea.

I would formulate it the way that the hooks will allow to
patch/unpatch things that cannot be done by simply using
fixed code instead of the old one.

The use case is:

   + modification of global variables (even more instances)

   + registering newly available services/handlers
     (always in post handlers?)

People need to be careful:

   + synchronize the changes with the rest of the system

   + be aware when the new code from the livepatch is active
     (state of the transition, state of the object(module)


The advantage is that they:

   + allow to enable/disable the changes together with the patch
     (using immediate consistency model)

   + allow to handle both patch enable/disable and object
     load/unload in one place

   + they eventually allow to reject loading the patch
     or the affected object (module) in case of error.


Best Regards,
Petr