DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com B35577CE04
From: Joe Lawrence <joe.lawrence@redhat.com>
Subject: Re: [PATCH] livepatch: add (un)patch hooks
To: Petr Mladek <pmladek@suse.com>
Cc: live-patching@vger.kernel.org, linux-kernel@vger.kernel.org,
        Josh Poimboeuf <jpoimboe@redhat.com>, Jessica Yu <jeyu@kernel.org>,
        Jiri Kosina <jikos@kernel.org>, Miroslav Benes <mbenes@suse.cz>,
        Chris J Arges <chris.j.arges@canonical.com>
References: <1499868600-10176-1-git-send-email-joe.lawrence@redhat.com>
 <1499868600-10176-2-git-send-email-joe.lawrence@redhat.com>
 <20170717155144.GF32632@pathway.suse.cz>
Organization: Red Hat
Message-ID: <5ad1126c-98ba-29bb-12a9-388af09d0944@redhat.com>
Date: Wed, 19 Jul 2017 14:59:48 -0400
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.2.0
MIME-Version: 1.0
In-Reply-To: <20170717155144.GF32632@pathway.suse.cz>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 11786
Lines: 348

On 07/17/2017 11:51 AM, Petr Mladek wrote:
> On Wed 2017-07-12 10:10:00, Joe Lawrence wrote:
>> When the livepatch core executes klp_(un)patch_object, call out to a
>> livepatch-module specified array of callback hooks.  These hooks provide
>> a notification mechanism for livepatch modules when klp_objects are
>> (un)patching. This may be most interesting when another kernel module
>> is a klp_object target and the livepatch module needs to execute code
>> after the target is loaded, but before its module_init code is run.
>>
>> The patch-hook executes right before patching objects and the
>> unpatch-hook executes right after unpatching objects.
>>
>> diff --git a/Documentation/livepatch/hooks.txt b/Documentation/livepatch/hooks.txt
>> new file mode 100644
>> index 000000000000..ef18101a3b90
>> --- /dev/null
>> +++ b/Documentation/livepatch/hooks.txt
>> @@ -0,0 +1,98 @@
>> +(Un)patching Hooks
>> +==================
>> +
>> +Livepatching (un)patch-hooks provide a mechanism to register and execute
>> +a set of callback functions when the kernel's livepatching core performs
>> +an (un)patching operation on a given kernel object.
> 
> The above is correct but it is a bit hard to understand what it really
> means. Josh's discussion about the naming suggests that I am not the only
> one who is confused ;-)
> 
> We need to make it clear that there are 4 basic situations
> where these hooks are called:
> 
>   + patch hook is called when:
> 
> 	1. livepatch is being enabled and object is loaded
> 	2. livepatch is enabled and object is being loaded
> 
>   + unpatch hook is called when
> 
> 	3. livepatch is enabled and object is being removed
> 	4. livepatch is being disabled and object is loaded
> 
> Note that this document mostly talks only about the two situations
> when the livepatch is enabled and the patched object is being
> loaded or removed.
> 
> But it is still quite tricky to understand what can be modified
> a safe way. We need to be careful about different things
> in the different situations.

Indeed, this is tricky.

> If the patched object is beeing added/removed, we know that its
> code is not being used but the code from the rest of the patch
> is already in use. The module is not yet or not longer properly
> initialized. Therefore it might be too early or too late to
> register or unregister any of its services in the rest of
> the system. Basically it limits the changes only to
> to the object (module) itself.
> 
> If the patch is being enabled, it is another story. The object
> is already initialized and its old code is used but the new
> code from the patch is not yet or not longer used. It suggests
> that it might be safe to do some changes related to the
> new code in the patch. But we need to be careful because
> the system is using the old code.
> 
> 
> But there are actually 4 more situations. If we use the consistency
> model, different parts of the system might use different code.
> I mean that:
> 
>   + patch hook is called also when:
> 
>      + livepatch is being enabled and object is being loaded
>      + livepatch is being disabled and object is being loaded
> 
>   + unpatch hook is called when:
> 
>      + livepatch is being enabled and object is being removed
>      + livepatch is being disabled and object is being removed
> 
> 
> It is a bit easier if you run the hook for vmlinux
> because it is always running.
> 
> I am sorry for the long mail. But I have really troubles to
> understand and describe what can be done with these hooks
> a safe way.

If you look at it from the inside out, the concept can be boiled down to
"kernel object patching notifiers".  Essentially whenever a klp_object
is being patched or unpatched, the callbacks are executed.

In this implementation, the patch-hook provides the callback a context
of "your object's patched code is loaded, but not active yet".
Likewise, the unpatch-hook would be "your patched code is no longer
active, but not unloaded (yet)".

I agree that explaining the patch-at-large context quickly gets
complicated.  IMHO, I think some of that discussion belongs in
Documentation/livepatch/livepatch.txt, in a section that answers, "just
when does code get patched?".  In my mind, these notifiers build on top
of the answer to that question.

That said, some kind of guidance about the various contexts in which
these hooks are executed is definitely warranted.  Perhaps some kind of
table or list can summarize things, let me think more on that.

> It might help if you share some real-life examples.

Well, kpatch utilizes "load hooks" and their application are somewhat
specific to that system.  Livepatch (and the consistency model)
introduce a more complicated world -- as you pointed out, for example,
old code and new code may be simultaneously running.

The traditional kpatch load hook example given [1] is updating some data
structure:

[1]
https://github.com/dynup/kpatch/blob/master/doc/patch-author-guide.md#use-a-kpatch-load-hook

however, I think a livepatch counterpart needs to consider a bit more:
locking, unpatched/patched code access, etc.

> 
>> +The hooks are provided and registered by a livepatch module as part of
>> +klp_objects that make up its klp_patch structure.  Both patch and
>> +unpatch-hook function signatures accept a pointer to a klp_object
>> +argument and return an integer status, ie:
> 
> I would put this into separate section and make it clear
> that it is a sample code.

It wasn't exactly /example/ code, it just seemed more succinct to
provide some /pseudo/ code to explain the data structure relationships.

If it is redundant then I can remove from v2 ... if it were any more
"working" code, then it would be nearly as elaborate as the provided
sample module. :)  Does it deserve its own "Data structure relationship
/ pseudo code" section?

>> +  static int patch_hook(struct klp_object *obj)
>> +  {
>> +  	/* ... */
>> +  }
>> +  static int unpatch_hook(struct klp_object *obj)
>> +  {
>> +  	/* ... */
>> +  }
>> +
>> +  static struct klp_hook patch_hooks[] = {
>> +  	{
>> +  		.hook = patch_hook,
>> +  	}, { }
>> +  };
>> +  static struct klp_hook unpatch_hooks[] = {
>> +  	{
>> +  		.hook = unpatch_hook,
>> +  	}, { }
>> +  };
>> +
>> +  static struct klp_object objs[] = {
>> +  	{
>> +  		/* ... */
>> +  		.patch_hooks = patch_hooks,
>> +  		.unpatch_hooks = unpatch_hooks,
>> +  	}, { }
>> +  };
>> +
>> +  static struct klp_patch patch = {
>> +  	.mod = THIS_MODULE,
>> +  	.objs = objs,
>> +  };
>> +
>> +If a hook returns non-zero status, the livepatching core will log a
>> +hook failure warning message.
> 
>> +Multiple (un)patch-hooks may be registered per klp_object.  Each hook
>> +will execute regardless of any previously executed hook's non-zero
>> +return status.
> 
> We should pass the error down the stack. If will prevent either the
> patch or the patched module of being loaded. Of course, we could
> not do much if the patch or the patched object is being removed.

I know Josh and Miroslav pointed this out elsewhere in the thread,
but I don't know how I feel about using the hooks to affect the
patch(ed) module loading.  Making these "void" feels simpler IMHO.

>> +Hooks are optional.  The livepatching core will not execute any
>> +callbacks for an empty klp_hook.hook array or a NULL klp_hook.hook
>> +value.
>> +
>> +
>> +For module targets
>> +------------------
>> +
>> +On the other hand, if a target kernel module is already present when a
>> +livepatch is loading, then the corresponding patch hook(s) will execute
>> +as soon as the livepatching kernel core enables the livepatch.
>>
>> +It may be useful for hooks to inspect the module state of the klp_object
>> +it is passed (i.e. obj->mod->state).  Patch hooks can expect to see
>> +modules in MODULE_STATE_LIVE and MODULE_STATE_COMING states.  Unpatch
>> +hooks can expect modules in MODULE_STATE_LIVE and MODULE_STATE_GOING
>> +states.
> 
> This actualy talks about the other situations but it is well hidden
> and kind of cryptic.

Its placement seemed logical in the progression from intro -> easy cases
-> complicated cases.  The ramifications of what it means to be called
when the klp_object->mod is MODULE_STATE_FOO is definitely understated
and could probably use more attention.

I think this goes back to creating some kind of table of scenarios.  I
can try to compile this for v2.

> 
>> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
>> index b9628e43c78f..ff3685470057 100644
>> --- a/kernel/livepatch/core.c
>> +++ b/kernel/livepatch/core.c
>> @@ -49,11 +49,6 @@
>>  
>>  static struct kobject *klp_root_kobj;
>>  
>> -static bool klp_is_module(struct klp_object *obj)
>> -{
>> -	return obj->name;
>> -}
>> -
>>  static bool klp_is_object_loaded(struct klp_object *obj)
>>  {
>>  	return !obj->name || obj->mod;
>> diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
>> index 52c4e907c14b..c8084a18ddb7 100644
>> --- a/kernel/livepatch/patch.c
>> +++ b/kernel/livepatch/patch.c
>> @@ -235,25 +235,60 @@ static int klp_patch_func(struct klp_func *func)
>>  	return ret;
>>  }
>>  
>> +/**
>> + * klp_run_hook - execute a given klp_hook callback
>> + * @hook:	callback hook
>> + * @obj:	kernel object that has been hooked
>> + *
>> + * Return: return value from hook, or 0 if none is currently associated
>> + */
>> +static int klp_run_hook(struct klp_hook *hook, struct klp_object *obj)
>> +{
>> +	if (hook && hook->hook)
>> +		return (*hook->hook)(obj);
>> +
>> +	return 0;
>> +}
>> +
>>  void klp_unpatch_object(struct klp_object *obj)
>>  {
>>  	struct klp_func *func;
>> +	struct klp_hook *hook;
>> +	int ret;
>>  
>>  	klp_for_each_func(obj, func)
>>  		if (func->patched)
>>  			klp_unpatch_func(func);
>>  
>>  	obj->patched = false;
>>
>> +
>> +	klp_for_each_unpatch_hook(obj, hook) {
>> +		ret = klp_run_hook(hook, obj);
>> +		if (ret) {
>> +			pr_warn("unpatch hook '%p' failed for object '%s'\n",
>> +				hook, klp_is_module(obj) ? obj->name : "vmlinux");
>> +		}
>> +	}
>> +
> 
> It probably does not matter but I would move
> 
> 	obj->patched = false;
> 
> here. Otherwise, the hooks will see "false" in both cases. In each,
> case it looks more symetric.
> 
> Or is there any reason behind the given order?

klp_unpatch_func() was just executed for all klp_object functions, so
IMHO a value of obj->patched = false reflects reality.  Is this not
symmetric:

klp_patch_object()
  klp_for_each_patch_hook     |
    klp_run_hook              | A - hooks

  klp_for_each_func           |
    klp_patch_func            |
  obj->patched = true         | B - patching

klp_unpatch_object()
  klp_for_each_func           |
    klp_unpatch_func          |
  obj->patched = false        | B' - unpatching

  klp_for_each_unpatch_hook   |
    klp_run_hook              | A' - hooks

I didn't really have a strong preference here... only for what would be
most intuitive.

>>  }
>>  
>>  int klp_patch_object(struct klp_object *obj)
>>  {
>>  	struct klp_func *func;
>> +	struct klp_hook *hook;
>>  	int ret;
>>  
>>  	if (WARN_ON(obj->patched))
>>  		return -EINVAL;
>>  
>> +	klp_for_each_patch_hook(obj, hook) {
>> +		ret = klp_run_hook(hook, obj);
>> +		if (ret) {
>> +			pr_warn("patch hook '%p' failed for object '%s'\n",
>> +				hook, klp_is_module(obj) ? obj->name : "vmlinux");
>> +		}
>> +	}
>> +
>> +
>>  	klp_for_each_func(obj, func) {
>>  		ret = klp_patch_func(func);
>>  		if (ret) {
> 
> 
> Thaks a lot for looking at this. I guess that it is useful. But
> it is also pretty dangerous at the same moment. I would like to
> understand all consequences/usecases before we add this API.

Thanks for reviewing and I appreciate the feedback, especially the dark
corners of enabling vs. loading and what would be safe when.

Regards,

-- Joe