Date: Tue, 10 Feb 2015 09:59:58 -0600
From: Josh Poimboeuf <jpoimboe@redhat.com>
To: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Seth Jennings <sjenning@redhat.com>, Jiri Kosina <jkosina@suse.cz>,
        Vojtech Pavlik <vojtech@suse.cz>, live-patching@vger.kernel.org,
        linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model
Message-ID: <20150210155958.GE21643@treble.redhat.com>
References: <cover.1423499826.git.jpoimboe@redhat.com>
 <54D9E8AB.3070800@hitachi.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <54D9E8AB.3070800@hitachi.com>
User-Agent: Mutt/1.5.23.1-rc1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 5487
Lines: 130

On Tue, Feb 10, 2015 at 08:16:59PM +0900, Masami Hiramatsu wrote:
> (2015/02/10 2:31), Josh Poimboeuf wrote:
> > This patch set implements a livepatch consistency model, targeted for 3.21.
> > Now that we have a solid livepatch code base, this is the biggest remaining
> > missing piece.
> > 
> > This code stems from the design proposal made by Vojtech [1] in November.  It
> > makes live patching safer in general.  Specifically, it allows you to apply
> > patches which change function prototypes.  It also lays the groundwork for
> > future code changes which will enable data and data semantic changes.
> 
> Interesting.  How would you do that?

As Vojtech described in the earlier thread from November, there are
different approaches for changing data:

1. TRANSFORM_WORLD: stop the world, transform everything, resume

2. TRANSFORM_ON_ACCESS: transform data structures when you access them

I would add a third category (which is what we've been doing with
kpatch):

3. TRANSFORM_ON_CREATE: data structures created after a certain point
are the "v2" versions

I think approach 1 is very tricky, if not impossible in many cases,
even if you're using stop_machine().  Right now we're focusing on
enabling approaches 2 and 3, since they seem more practical, don't
require stop_machine(), and are generally easier to get right.
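
To illustrate approach 2, here is a minimal userspace sketch of
transform-on-access.  Everything in it is made up for the example:
struct bar, its version field, and the seconds-to-milliseconds change
are assumptions, not code from the patch set.  It also ignores locking
and assumes no v1 code still touches the object.  In a real live patch
the version marker couldn't be a new member of the existing struct; it
would have to live outside of it.

```c
#include <assert.h>

/* Hypothetical object whose field meaning changed between versions. */
struct bar {
	int version;	/* 1 = old semantics, 2 = new semantics */
	long timeout;	/* v1: seconds, v2: milliseconds (made up) */
};

/* Convert a v1 object in place; a no-op for objects already at v2. */
static void bar_upgrade(struct bar *b)
{
	assert(b->version == 1 || b->version == 2);

	if (b->version == 1) {
		b->timeout *= 1000;
		b->version = 2;
	}
}

/* Patched accessor: transform on access, then use the v2 meaning. */
long bar_timeout_ms(struct bar *b)
{
	bar_upgrade(b);
	return b->timeout;
}
```

The upgrade runs on every access, but it only changes the object once.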

With kpatch we've been using approach 3, with a lot of success.  Here's
how I would do it with livepatch:

As a prerequisite, we need shadow variables, which are a way to add
virtual fields to existing structs at runtime.  For an example, see:

   https://github.com/dynup/kpatch/blob/master/test/integration/shadow-newpid.patch

In that example, I added "newpid" to task_struct.  If it's only
something like locking semantics that is changing, you can just add a
"v2" shadow field to the struct to mark it as the 2nd version of the
struct.
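
To make the idea concrete, here is a minimal userspace sketch of how
shadow variables could be modeled.  It is not the kpatch
implementation: shadow_get(), shadow_attach() and shadow_detach_all()
are hypothetical names, and a plain linked list with no locking stands
in for whatever lookup structure a real implementation would use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace model of shadow variables; not the kpatch code. */
struct shadow {
	const void *obj;	/* the existing object being extended */
	const char *name;	/* name of the virtual field */
	void *data;		/* storage for the field's value */
	struct shadow *next;
};

static struct shadow *shadow_list;

/* Look up the virtual field "name" attached to obj; NULL if absent. */
void *shadow_get(const void *obj, const char *name)
{
	struct shadow *s;

	for (s = shadow_list; s; s = s->next)
		if (s->obj == obj && !strcmp(s->name, name))
			return s->data;
	return NULL;
}

/* Attach a zeroed virtual field of the given size to obj. */
void *shadow_attach(const void *obj, const char *name, size_t size)
{
	struct shadow *s;

	assert(obj && name && size);

	s = malloc(sizeof(*s));
	if (!s)
		return NULL;
	s->data = calloc(1, size);
	if (!s->data) {
		free(s);
		return NULL;
	}
	s->obj = obj;
	s->name = name;
	s->next = shadow_list;
	shadow_list = s;
	return s->data;
}

/* Drop every virtual field attached to obj, e.g. when obj is freed. */
void shadow_detach_all(const void *obj)
{
	struct shadow **p = &shadow_list;

	while (*p) {
		struct shadow *s = *p;

		if (s->obj == obj) {
			*p = s->next;
			free(s->data);
			free(s);
		} else {
			p = &s->next;
		}
	}
}
```

The key point is that the original struct layout never changes.  The
extra field is keyed by the object's address and looked up on demand.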

When converting a patch to be used for livepatch, the patch author must
carefully look for data struct versioning changes.  It doesn't matter
whether there's a new field or the semantics of using the data have
changed.  Either way, the patch author must define a new version of the
struct.

If a struct has changed, all patched functions need to be able to deal
with struct v1 or struct v2.  This is true for those functions which
access the structs as well as the functions which create them.

For example, a function which accesses the struct might change to:

  if (klp_shadow_has_field(foo, "v2"))
      /* access struct the new way */
  else
      /* access struct the old way */

A function which creates the struct might change to:

  struct foo *struct_create()
  {
     /* kmalloc and init struct here */

     if (klp_patching_complete())
         /* add v2 shadow fields */
  }


The klp_patching_complete() call is needed to prevent v1 functions from
accessing v2 data.  The creation/transformation of v2 structs shouldn't
occur until after the patching process is complete and all tasks have
converged to the new universe.
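
Putting the two pieces together, here is a small userspace sketch of
the create/access pattern.  All of the names are hypothetical stand-ins
rather than livepatch APIs: patching_complete models what
klp_patching_complete() would report, and has_v2()/mark_v2() model the
"v2" shadow field with a fixed-size table.  The "+2 instead of +1"
change in semantics is made up purely for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* All names below are hypothetical stand-ins, not livepatch APIs. */

bool patching_complete;			/* set once all tasks have converged */

#define MAX_V2 64
static const void *v2_objs[MAX_V2];	/* objects carrying the "v2" mark */
static int nr_v2;

static bool has_v2(const void *obj)
{
	for (int i = 0; i < nr_v2; i++)
		if (v2_objs[i] == obj)
			return true;
	return false;
}

/* A real version would also drop the mark when the object is freed. */
static void mark_v2(const void *obj)
{
	assert(nr_v2 < MAX_V2);
	v2_objs[nr_v2++] = obj;
}

struct foo {
	int count;
};

/* Patched creator: only emits v2 objects once no v1 code can run. */
struct foo *foo_create(void)
{
	struct foo *f = calloc(1, sizeof(*f));

	if (f && patching_complete)
		mark_v2(f);
	return f;
}

/* Patched accessor: must cope with both versions. */
int foo_bump(struct foo *f)
{
	if (has_v2(f))
		f->count += 2;	/* new semantics (made up for the example) */
	else
		f->count += 1;	/* old semantics */
	return f->count;
}
```

Objects created before the transition completes stay v1 for their
whole lifetime, so the patched accessor has to keep the old path around
for as long as any of them exist.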

> > disadvantages vs kpatch:
> > - no system-wide switch point (not really a functional limitation, just forces
> >   the patch author to be more careful. but that's probably a good thing anyway)
> 
> OK, we must check carefully that the old function and new function can co-exist.

Agreed, and this requires the patch author to look carefully for data
version changes, as described above, which they should be doing
regardless.

> > My biggest concerns and questions related to this patch set are:
> > 
> > 1) To safely examine the task stacks, the transition code locks each task's rq
> >    struct, which requires using the scheduler's internal rq locking functions.
> >    It seems to work well, but I'm not sure if there's a cleaner way to safely
> >    do stack checking without stop_machine().
> 
> We'd better ask scheduler people.

Agreed, I will.

> > 2) As mentioned above, kthreads which are always sleeping on a patched function
> >    will never transition to the new universe.  This is really a minor issue
> >    (less than 1% of patches).  It's not necessarily something that needs to be
> >    resolved with this patch set, but it would be good to have some discussion
> >    about it regardless.
> >    
> >    To overcome this issue, I have 1/2 an idea: we could add some stack checking
> >    code to the ftrace handler itself to transition the kthread to the new
> >    universe after it re-enters the function it was originally sleeping on, if
> >    the stack doesn't already have any other to-be-patched functions.
> >    Combined with the klp_transition_work_fn()'s periodic stack checking of
> >    sleeping tasks, that would handle most of the cases (except when trying to
> >    patch the high-level thread_fn itself).
> 
> It makes sense to me. (I just did similar thing)
> 
> > 
> >    But then how do you make the kthread wake up?  As far as I can tell,
> >    wake_up_process() doesn't seem to work on a kthread (unless I messed up my
> >    testing somehow).  What does kGraft do in this case?
> 
> Hmm, at a glance, the code itself can work on kthread too...
> Maybe you can also send your testing patch.

Yeah, I probably messed it up.  I'll try it again :-)

-- 
Josh