2017-12-08 17:25:28

by Miroslav Benes

Subject: [PATCH 0/2] Remove immediate feature

Hi,

here it is. The first patch is only code removal and appropriate changes
in the documentation. The second patch is maybe more interesting. Do you
agree to do it this way, or have you got a better idea?

Miroslav Benes (2):
livepatch: Remove immediate feature
livepatch: Allow loading modules on architectures without
HAVE_RELIABLE_STACKTRACE

Documentation/livepatch/livepatch.txt | 84 ++++++++--------------------
include/linux/livepatch.h | 4 --
kernel/livepatch/core.c | 19 ++-----
kernel/livepatch/transition.c | 49 ++--------------
samples/livepatch/livepatch-callbacks-demo.c | 15 -----
samples/livepatch/livepatch-sample.c | 15 -----
samples/livepatch/livepatch-shadow-fix1.c | 15 -----
samples/livepatch/livepatch-shadow-fix2.c | 15 -----
8 files changed, 33 insertions(+), 183 deletions(-)

--
2.15.1


2017-12-08 17:25:34

by Miroslav Benes

Subject: [PATCH 1/2] livepatch: Remove immediate feature

immediate flag has been used to disable per-task consistency and patch
all tasks immediately. It could be useful if the patch doesn't change any
function or data semantics.

However, it causes problems on its own. The consistency model is
currently broken with respect to immediate patches.

func            a
patches         1i
                2i
                3

When the patch 3 is applied, only 2i function is checked (by stack
checking facility). There might be a task sleeping in 1i though. Such
task is migrated to 3, because we do not check 1i in
klp_check_stack_func() at all.
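
To make the scenario concrete, here is a minimal userspace sketch (not
kernel code). The names struct variant, variant_on_stack() and
task_migrates() are made up for illustration and the address ranges are
arbitrary. It only models the point made above: the most recently
applied version is the one compared against a task's stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one applied version of function "a". */
struct variant {
	unsigned long start;	/* first address of this version's code */
	unsigned long size;	/* code length in bytes */
};

/* True if any address in the trace falls inside v. */
static bool variant_on_stack(const struct variant *v,
			     const unsigned long *trace, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (trace[i] >= v->start && trace[i] < v->start + v->size)
			return true;
	return false;
}

/*
 * Model of the check described in the changelog: when a new patch is
 * applied, only the most recently applied version
 * (applied[napplied - 1]) is compared against the task's stack.
 * Returns true if the task would be migrated to the new patch.
 */
static bool task_migrates(const struct variant *applied, size_t napplied,
			  const unsigned long *trace, size_t n)
{
	assert(napplied > 0);
	return !variant_on_stack(&applied[napplied - 1], trace, n);
}
```

With applied[] holding ranges that stand in for 1i and 2i, a task whose
trace contains an address inside the 1i range is still reported as
migratable, because only the 2i range is examined.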

Coming atomic replace feature would be easier to implement and more
reliable without immediate.

Moreover, the fake signal and force feature give us almost the same
benefits and the user can decide to use them in problematic situations
(while immediate needs to be set before the patch is applied). It is
also more isolated in terms of code.

Thus, remove immediate feature completely and save us from the problems.

Signed-off-by: Miroslav Benes <[email protected]>
---
Documentation/livepatch/livepatch.txt | 84 ++++++++--------------------
include/linux/livepatch.h | 4 --
kernel/livepatch/core.c | 15 ++---
kernel/livepatch/transition.c | 49 ++--------------
samples/livepatch/livepatch-callbacks-demo.c | 16 +-----
samples/livepatch/livepatch-sample.c | 16 +-----
samples/livepatch/livepatch-shadow-fix1.c | 16 +-----
samples/livepatch/livepatch-shadow-fix2.c | 16 +-----
8 files changed, 40 insertions(+), 176 deletions(-)

diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt
index 896ba8941702..cab682deeda8 100644
--- a/Documentation/livepatch/livepatch.txt
+++ b/Documentation/livepatch/livepatch.txt
@@ -72,8 +72,7 @@ example, they add a NULL pointer or a boundary check, fix a race by adding
a missing memory barrier, or add some locking around a critical section.
Most of these changes are self contained and the function presents itself
the same way to the rest of the system. In this case, the functions might
-be updated independently one by one. (This can be done by setting the
-'immediate' flag in the klp_patch struct.)
+be updated independently one by one.

But there are more complex fixes. For example, a patch might change
ordering of locking in multiple functions at the same time. Or a patch
@@ -125,12 +124,6 @@ Livepatch uses several complementary approaches to determine when it's
b) Patching CPU-bound user tasks. If the task is highly CPU-bound
then it will get patched the next time it gets interrupted by an
IRQ.
- c) In the future it could be useful for applying patches for
- architectures which don't yet have HAVE_RELIABLE_STACKTRACE. In
- this case you would have to signal most of the tasks on the
- system. However this isn't supported yet because there's
- currently no way to patch kthreads without
- HAVE_RELIABLE_STACKTRACE.

3. For idle "swapper" tasks, since they don't ever exit the kernel, they
instead have a klp_update_patch_state() call in the idle loop which
@@ -138,27 +131,16 @@ Livepatch uses several complementary approaches to determine when it's

(Note there's not yet such an approach for kthreads.)

-All the above approaches may be skipped by setting the 'immediate' flag
-in the 'klp_patch' struct, which will disable per-task consistency and
-patch all tasks immediately. This can be useful if the patch doesn't
-change any function or data semantics. Note that, even with this flag
-set, it's possible that some tasks may still be running with an old
-version of the function, until that function returns.
+Architectures which don't have HAVE_RELIABLE_STACKTRACE solely rely on
+the second approach. It's highly likely that some tasks may still be
+running with an old version of the function, until that function
+returns. In this case you would have to signal the tasks. This
+especially applies to kthreads. They may not be woken up and would need
+to be forced. See below for more information.

-There's also an 'immediate' flag in the 'klp_func' struct which allows
-you to specify that certain functions in the patch can be applied
-without per-task consistency. This might be useful if you want to patch
-a common function like schedule(), and the function change doesn't need
-consistency but the rest of the patch does.
-
-For architectures which don't have HAVE_RELIABLE_STACKTRACE, the user
-must set patch->immediate which causes all tasks to be patched
-immediately. This option should be used with care, only when the patch
-doesn't change any function or data semantics.
-
-In the future, architectures which don't have HAVE_RELIABLE_STACKTRACE
-may be allowed to use per-task consistency if we can come up with
-another way to patch kthreads.
+Unless we can come up with another way to patch kthreads, architectures
+without HAVE_RELIABLE_STACKTRACE are not considered fully supported by
+the kernel livepatching.

The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
is in transition. Only a single patch (the topmost patch on the stack)
@@ -234,13 +216,6 @@ For adding consistency model support to new architectures, there are a
a good backup option for those architectures which don't have
reliable stack traces yet.

-In the meantime, patches for such architectures can bypass the
-consistency model by setting klp_patch.immediate to true. This option
-is perfectly fine for patches which don't change the semantics of the
-patched functions. In practice, this is usable for ~90% of security
-fixes. Use of this option also means the patch can't be unloaded after
-it has been disabled.
-

4. Livepatch module
===================
@@ -296,9 +271,6 @@ The patch is described by several structures that split the information
only for a particular object ( vmlinux or a kernel module ). Note that
kallsyms allows for searching symbols according to the object name.

- There's also an 'immediate' flag which, when set, patches the
- function immediately, bypassing the consistency model safety checks.
-
+ struct klp_object defines an array of patched functions (struct
klp_func) in the same object. Where the object is either vmlinux
(NULL) or a module name.
@@ -317,9 +289,6 @@ The patch is described by several structures that split the information
symbols are found. The only exception are symbols from objects
(kernel modules) that have not been loaded yet.

- Setting the 'immediate' flag applies the patch to all tasks
- immediately, bypassing the consistency model safety checks.
-
For more details on how the patch is applied on a per-task basis,
see the "Consistency model" section.

@@ -334,14 +303,12 @@ section "Livepatch life-cycle" below for more details about these
two operations.

Module removal is only safe when there are no users of the underlying
-functions. The immediate consistency model is not able to detect this. The
-code just redirects the functions at the very beginning and it does not
-check if the functions are in use. In other words, it knows when the
-functions get called but it does not know when the functions return.
-Therefore it cannot be decided when the livepatch module can be safely
-removed. This is solved by a hybrid consistency model. When the system is
-transitioned to a new patch state (patched/unpatched) it is guaranteed that
-no task sleeps or runs in the old code.
+functions. This is the reason why the force feature permanently disables
+the removal. The forced tasks entered the functions but we cannot say
+that they returned back. Therefore it cannot be decided when the
+livepatch module can be safely removed. When the system is successfully
+transitioned to a new patch state (patched/unpatched) without being
+forced it is guaranteed that no task sleeps or runs in the old code.


5. Livepatch life-cycle
@@ -355,19 +322,12 @@ First, the patch is applied only when all patched symbols for already
loaded objects are found. The error handling is much easier if this
check is done before particular functions get redirected.

-Second, the immediate consistency model does not guarantee that anyone is not
-sleeping in the new code after the patch is reverted. This means that the new
-code needs to stay around "forever". If the code is there, one could apply it
-again. Therefore it makes sense to separate the operations that might be done
-once and those that need to be repeated when the patch is enabled (applied)
-again.
-
-Third, it might take some time until the entire system is migrated
-when a more complex consistency model is used. The patch revert might
-block the livepatch module removal for too long. Therefore it is useful
-to revert the patch using a separate operation that might be called
-explicitly. But it does not make sense to remove all information
-until the livepatch module is really removed.
+Second, it might take some time until the entire system is migrated with
+the hybrid consistency model being used. The patch revert might block
+the livepatch module removal for too long. Therefore it is useful to
+revert the patch using a separate operation that might be called
+explicitly. But it does not make sense to remove all information until
+the livepatch module is really removed.


5.1. Registration
diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index fc5c1be3f6f4..4754f01c1abb 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -40,7 +40,6 @@
* @new_func: pointer to the patched function code
* @old_sympos: a hint indicating which symbol position the old function
* can be found (optional)
- * @immediate: patch the func immediately, bypassing safety mechanisms
* @old_addr: the address of the function being patched
* @kobj: kobject for sysfs resources
* @stack_node: list node for klp_ops func_stack list
@@ -76,7 +75,6 @@ struct klp_func {
* in kallsyms for the given object is used.
*/
unsigned long old_sympos;
- bool immediate;

/* internal */
unsigned long old_addr;
@@ -137,7 +135,6 @@ struct klp_object {
* struct klp_patch - patch structure for live patching
* @mod: reference to the live patch module
* @objs: object entries for kernel objects to be patched
- * @immediate: patch all funcs immediately, bypassing safety mechanisms
* @list: list node for global list of registered patches
* @kobj: kobject for sysfs resources
* @enabled: the patch is enabled (but operation may be incomplete)
@@ -147,7 +144,6 @@ struct klp_patch {
/* external */
struct module *mod;
struct klp_object *objs;
- bool immediate;

/* internal */
struct list_head list;
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 1c3c9b27c916..461c0b7dc913 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -367,10 +367,10 @@ static int __klp_enable_patch(struct klp_patch *patch)
* A reference is taken on the patch module to prevent it from being
* unloaded.
*
- * Note: For immediate (no consistency model) patches we don't allow
- * patch modules to unload since there is no safe/sane method to
- * determine if a thread is still running in the patched code contained
- * in the patch module once the ftrace registration is successful.
+ * Note: When klp_forced is set we don't allow patch modules to unload
+ * since there is no safe/sane method to determine if a thread is still
+ * running in the patched code contained in the patch module once the
+ * ftrace registration is successful.
*/
if (!try_module_get(patch->mod))
return -ENODEV;
@@ -890,12 +890,7 @@ int klp_register_patch(struct klp_patch *patch)
if (!klp_initialized())
return -ENODEV;

- /*
- * Architectures without reliable stack traces have to set
- * patch->immediate because there's currently no way to patch kthreads
- * with the consistency model.
- */
- if (!klp_have_reliable_stack() && !patch->immediate) {
+ if (!klp_have_reliable_stack()) {
pr_err("This architecture doesn't have support for the livepatch consistency model.\n");
return -ENOSYS;
}
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index be5bfa533ee8..7c6631e693bc 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -82,7 +82,6 @@ static void klp_complete_transition(void)
struct klp_func *func;
struct task_struct *g, *task;
unsigned int cpu;
- bool immediate_func = false;

pr_debug("'%s': completing %s transition\n",
klp_transition_patch->mod->name,
@@ -104,16 +103,9 @@ static void klp_complete_transition(void)
klp_synchronize_transition();
}

- if (klp_transition_patch->immediate)
- goto done;
-
- klp_for_each_object(klp_transition_patch, obj) {
- klp_for_each_func(obj, func) {
+ klp_for_each_object(klp_transition_patch, obj)
+ klp_for_each_func(obj, func)
func->transition = false;
- if (func->immediate)
- immediate_func = true;
- }
- }

/* Prevent klp_ftrace_handler() from seeing KLP_UNDEFINED state */
if (klp_target_state == KLP_PATCHED)
@@ -132,7 +124,6 @@ static void klp_complete_transition(void)
task->patch_state = KLP_UNDEFINED;
}

-done:
klp_for_each_object(klp_transition_patch, obj) {
if (!klp_is_object_loaded(obj))
continue;
@@ -146,16 +137,11 @@ static void klp_complete_transition(void)
klp_target_state == KLP_PATCHED ? "patching" : "unpatching");

/*
- * See complementary comment in __klp_enable_patch() for why we
- * keep the module reference for immediate patches.
- *
- * klp_forced or immediate_func set implies unbounded increase of
- * module's ref count if the module is disabled/enabled in a loop.
+ * klp_forced set implies unbounded increase of module's ref count if
+ * the module is disabled/enabled in a loop.
*/
- if (!klp_forced && !klp_transition_patch->immediate &&
- !immediate_func && klp_target_state == KLP_UNPATCHED) {
+ if (!klp_forced && klp_target_state == KLP_UNPATCHED)
module_put(klp_transition_patch->mod);
- }

klp_target_state = KLP_UNDEFINED;
klp_transition_patch = NULL;
@@ -223,9 +209,6 @@ static int klp_check_stack_func(struct klp_func *func,
struct klp_ops *ops;
int i;

- if (func->immediate)
- return 0;
-
for (i = 0; i < trace->nr_entries; i++) {
address = trace->entries[i];

@@ -387,13 +370,6 @@ void klp_try_complete_transition(void)

WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);

- /*
- * If the patch can be applied or reverted immediately, skip the
- * per-task transitions.
- */
- if (klp_transition_patch->immediate)
- goto success;
-
/*
* Try to switch the tasks to the target patch state by walking their
* stacks and looking for any to-be-patched or to-be-unpatched
@@ -437,7 +413,6 @@ void klp_try_complete_transition(void)
return;
}

-success:
/* we're done, now cleanup the data structures */
klp_complete_transition();
}
@@ -457,13 +432,6 @@ void klp_start_transition(void)
klp_transition_patch->mod->name,
klp_target_state == KLP_PATCHED ? "patching" : "unpatching");

- /*
- * If the patch can be applied or reverted immediately, skip the
- * per-task transitions.
- */
- if (klp_transition_patch->immediate)
- return;
-
/*
* Mark all normal tasks as needing a patch state update. They'll
* switch either in klp_try_complete_transition() or as they exit the
@@ -513,13 +481,6 @@ void klp_init_transition(struct klp_patch *patch, int state)
pr_debug("'%s': initializing %s transition\n", patch->mod->name,
klp_target_state == KLP_PATCHED ? "patching" : "unpatching");

- /*
- * If the patch can be applied or reverted immediately, skip the
- * per-task transitions.
- */
- if (patch->immediate)
- return;
-
/*
* Initialize all tasks to the initial patch state to prepare them for
* switching to the target state.
diff --git a/samples/livepatch/livepatch-callbacks-demo.c b/samples/livepatch/livepatch-callbacks-demo.c
index 3d115bd68442..bda7f3841f3e 100644
--- a/samples/livepatch/livepatch-callbacks-demo.c
+++ b/samples/livepatch/livepatch-callbacks-demo.c
@@ -197,20 +197,8 @@ static int livepatch_callbacks_demo_init(void)
{
int ret;

- if (!klp_have_reliable_stack() && !patch.immediate) {
- /*
- * WARNING: Be very careful when using 'patch.immediate' in
- * your patches. It's ok to use it for simple patches like
- * this, but for more complex patches which change function
- * semantics, locking semantics, or data structures, it may not
- * be safe. Use of this option will also prevent removal of
- * the patch.
- *
- * See Documentation/livepatch/livepatch.txt for more details.
- */
- patch.immediate = true;
- pr_notice("The consistency model isn't supported for your architecture. Bypassing safety mechanisms and applying the patch immediately.\n");
- }
+ if (!klp_have_reliable_stack())
+ pr_notice("The consistency model isn't supported for your architecture. The transition may not finish.\n");

ret = klp_register_patch(&patch);
if (ret)
diff --git a/samples/livepatch/livepatch-sample.c b/samples/livepatch/livepatch-sample.c
index 84795223f15f..a150fca6f7cd 100644
--- a/samples/livepatch/livepatch-sample.c
+++ b/samples/livepatch/livepatch-sample.c
@@ -71,20 +71,8 @@ static int livepatch_init(void)
{
int ret;

- if (!klp_have_reliable_stack() && !patch.immediate) {
- /*
- * WARNING: Be very careful when using 'patch.immediate' in
- * your patches. It's ok to use it for simple patches like
- * this, but for more complex patches which change function
- * semantics, locking semantics, or data structures, it may not
- * be safe. Use of this option will also prevent removal of
- * the patch.
- *
- * See Documentation/livepatch/livepatch.txt for more details.
- */
- patch.immediate = true;
- pr_notice("The consistency model isn't supported for your architecture. Bypassing safety mechanisms and applying the patch immediately.\n");
- }
+ if (!klp_have_reliable_stack())
+ pr_notice("The consistency model isn't supported for your architecture. The transition may not finish.\n");

ret = klp_register_patch(&patch);
if (ret)
diff --git a/samples/livepatch/livepatch-shadow-fix1.c b/samples/livepatch/livepatch-shadow-fix1.c
index fbe0a1f3d99b..415db31aca8d 100644
--- a/samples/livepatch/livepatch-shadow-fix1.c
+++ b/samples/livepatch/livepatch-shadow-fix1.c
@@ -133,20 +133,8 @@ static int livepatch_shadow_fix1_init(void)
{
int ret;

- if (!klp_have_reliable_stack() && !patch.immediate) {
- /*
- * WARNING: Be very careful when using 'patch.immediate' in
- * your patches. It's ok to use it for simple patches like
- * this, but for more complex patches which change function
- * semantics, locking semantics, or data structures, it may not
- * be safe. Use of this option will also prevent removal of
- * the patch.
- *
- * See Documentation/livepatch/livepatch.txt for more details.
- */
- patch.immediate = true;
- pr_notice("The consistency model isn't supported for your architecture. Bypassing safety mechanisms and applying the patch immediately.\n");
- }
+ if (!klp_have_reliable_stack())
+ pr_notice("The consistency model isn't supported for your architecture. The transition may not finish.\n");

ret = klp_register_patch(&patch);
if (ret)
diff --git a/samples/livepatch/livepatch-shadow-fix2.c b/samples/livepatch/livepatch-shadow-fix2.c
index 53c1794bdc5f..04b3fe23bfd3 100644
--- a/samples/livepatch/livepatch-shadow-fix2.c
+++ b/samples/livepatch/livepatch-shadow-fix2.c
@@ -128,20 +128,8 @@ static int livepatch_shadow_fix2_init(void)
{
int ret;

- if (!klp_have_reliable_stack() && !patch.immediate) {
- /*
- * WARNING: Be very careful when using 'patch.immediate' in
- * your patches. It's ok to use it for simple patches like
- * this, but for more complex patches which change function
- * semantics, locking semantics, or data structures, it may not
- * be safe. Use of this option will also prevent removal of
- * the patch.
- *
- * See Documentation/livepatch/livepatch.txt for more details.
- */
- patch.immediate = true;
- pr_notice("The consistency model isn't supported for your architecture. Bypassing safety mechanisms and applying the patch immediately.\n");
- }
+ if (!klp_have_reliable_stack())
+ pr_notice("The consistency model isn't supported for your architecture. The transition may not finish.\n");

ret = klp_register_patch(&patch);
if (ret)
--
2.15.1

2017-12-08 17:25:31

by Miroslav Benes

Subject: [PATCH 2/2] livepatch: Allow loading modules on architectures without HAVE_RELIABLE_STACKTRACE

Now that the immediate feature has been removed, it is not possible to
load livepatch modules on architectures without
HAVE_RELIABLE_STACKTRACE. Fix it by removing the guilty check in
klp_register_patch().

The architectures without HAVE_RELIABLE_STACKTRACE will now rely only on
kernelspace/userspace boundary switching, the (fake) signal and force
feature.

Also remove the check from all sample modules.

Signed-off-by: Miroslav Benes <[email protected]>
---
kernel/livepatch/core.c | 6 ++----
samples/livepatch/livepatch-callbacks-demo.c | 3 ---
samples/livepatch/livepatch-sample.c | 3 ---
samples/livepatch/livepatch-shadow-fix1.c | 3 ---
samples/livepatch/livepatch-shadow-fix2.c | 3 ---
5 files changed, 2 insertions(+), 16 deletions(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 461c0b7dc913..fa7e33aeb2a6 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -890,10 +890,8 @@ int klp_register_patch(struct klp_patch *patch)
if (!klp_initialized())
return -ENODEV;

- if (!klp_have_reliable_stack()) {
- pr_err("This architecture doesn't have support for the livepatch consistency model.\n");
- return -ENOSYS;
- }
+ if (!klp_have_reliable_stack())
+ pr_notice("This architecture doesn't have full support for the livepatch consistency model. The transition may not finish.\n");

return klp_init_patch(patch);
}
diff --git a/samples/livepatch/livepatch-callbacks-demo.c b/samples/livepatch/livepatch-callbacks-demo.c
index bda7f3841f3e..72f9e6d1387b 100644
--- a/samples/livepatch/livepatch-callbacks-demo.c
+++ b/samples/livepatch/livepatch-callbacks-demo.c
@@ -197,9 +197,6 @@ static int livepatch_callbacks_demo_init(void)
{
int ret;

- if (!klp_have_reliable_stack())
- pr_notice("The consistency model isn't supported for your architecture. The transition may not finish.\n");
-
ret = klp_register_patch(&patch);
if (ret)
return ret;
diff --git a/samples/livepatch/livepatch-sample.c b/samples/livepatch/livepatch-sample.c
index a150fca6f7cd..2d554dd930e2 100644
--- a/samples/livepatch/livepatch-sample.c
+++ b/samples/livepatch/livepatch-sample.c
@@ -71,9 +71,6 @@ static int livepatch_init(void)
{
int ret;

- if (!klp_have_reliable_stack())
- pr_notice("The consistency model isn't supported for your architecture. The transition may not finish.\n");
-
ret = klp_register_patch(&patch);
if (ret)
return ret;
diff --git a/samples/livepatch/livepatch-shadow-fix1.c b/samples/livepatch/livepatch-shadow-fix1.c
index 415db31aca8d..830c55514f9f 100644
--- a/samples/livepatch/livepatch-shadow-fix1.c
+++ b/samples/livepatch/livepatch-shadow-fix1.c
@@ -133,9 +133,6 @@ static int livepatch_shadow_fix1_init(void)
{
int ret;

- if (!klp_have_reliable_stack())
- pr_notice("The consistency model isn't supported for your architecture. The transition may not finish.\n");
-
ret = klp_register_patch(&patch);
if (ret)
return ret;
diff --git a/samples/livepatch/livepatch-shadow-fix2.c b/samples/livepatch/livepatch-shadow-fix2.c
index 04b3fe23bfd3..ff9948f0ec00 100644
--- a/samples/livepatch/livepatch-shadow-fix2.c
+++ b/samples/livepatch/livepatch-shadow-fix2.c
@@ -128,9 +128,6 @@ static int livepatch_shadow_fix2_init(void)
{
int ret;

- if (!klp_have_reliable_stack())
- pr_notice("The consistency model isn't supported for your architecture. The transition may not finish.\n");
-
ret = klp_register_patch(&patch);
if (ret)
return ret;
--
2.15.1

2017-12-20 14:35:15

by Petr Mladek

Subject: Re: [PATCH 1/2] livepatch: Remove immediate feature

On Fri 2017-12-08 18:25:22, Miroslav Benes wrote:
> immediate flag has been used to disable per-task consistency and patch
> all tasks immediately. It could be useful if the patch doesn't change any
> function or data semantics.
>
> However, it causes problems on its own. The consistency problem is
> currently broken with respect to immediate patches.
>
> func            a
> patches         1i
>                 2i
>                 3
>
> When the patch 3 is applied, only 2i function is checked (by stack
> checking facility). There might be a task sleeping in 1i though. Such
> task is migrated to 3, because we do not check 1i in
> klp_check_stack_func() at all.
>
> Coming atomic replace feature would be easier to implement and more
> reliable without immediate.
>
> Moreover, the fake signal and force feature give us almost the same
> benefits and the user can decide to use them in problematic situations
> (while immediate needs to be set before the patch is applied). It is
> also more isolated in terms of code.
>
> Thus, remove immediate feature completely and save us from the problems.

Sigh, the force feature actually has the same problem. We would use
it when a process never has a reliable stack or when it is endlessly
sleeping in a function that might have been patched immediately.

The documentation about the force feature says that the user should
consult the patch provider before using the flag. The provider
would check that it is really safe in the given situation
and eventually allow to use the force.

But what about any future livepatches? If the force flag was
safe for a given livepatch/process, it does not mean that
it would be safe for the next one. The process might still
be sleeping on the original function or on one lower in
the stack.

I see two possibilities. We could either refuse loading new
livepatches after using the force flag. Or we would need
to check all variants of the function "a" that might still
be in use.

I think that we might want to check the stack correctly.
Note that we need to take care also about livepatches that
were disabled. They are usually removed from func->stack_node.
We might need to maintain separate stack where we would
put all variants of the function that might be in
use when using the immediate or force flag.
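
A rough userspace sketch of what such a separate record could look like
follows. This is only an illustration of the idea, not a proposal for
the kernel data structures: struct maybe_in_use, track_range(),
stack_is_clear() and MAX_TRACKED are hypothetical names, and a fixed
array stands in for whatever list the real code would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACKED 8	/* arbitrary limit for this sketch */

struct code_range {
	unsigned long start;	/* first address of one function version */
	unsigned long size;	/* code length in bytes */
};

/* Hypothetical record of versions that some task may still be running. */
struct maybe_in_use {
	struct code_range ranges[MAX_TRACKED];
	size_t count;
};

/*
 * Remember a version whose stack check was bypassed. The entry stays
 * here even if the livepatch that provided it is later disabled.
 */
static bool track_range(struct maybe_in_use *m, struct code_range r)
{
	assert(r.size > 0);
	if (m->count == MAX_TRACKED)
		return false;
	m->ranges[m->count++] = r;
	return true;
}

/* True only if no tracked version appears anywhere in the trace. */
static bool stack_is_clear(const struct maybe_in_use *m,
			   const unsigned long *trace, size_t n)
{
	for (size_t i = 0; i < m->count; i++)
		for (size_t j = 0; j < n; j++)
			if (trace[j] >= m->ranges[i].start &&
			    trace[j] - m->ranges[i].start < m->ranges[i].size)
				return false;
	return true;
}
```

The cost is visible even in this toy: every bypassed version has to be
kept and walked for each task on every later transition.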

I am not sure if we still want to remove the immediate
flag then.

Best Regards,
Petr

2017-12-20 17:09:41

by Josh Poimboeuf

Subject: Re: [PATCH 1/2] livepatch: Remove immediate feature

On Wed, Dec 20, 2017 at 03:35:12PM +0100, Petr Mladek wrote:
> On Fri 2017-12-08 18:25:22, Miroslav Benes wrote:
> > immediate flag has been used to disable per-task consistency and patch
> > all tasks immediately. It could be useful if the patch doesn't change any
> > function or data semantics.
> >
> > However, it causes problems on its own. The consistency problem is
> > currently broken with respect to immediate patches.
> >
> > func            a
> > patches         1i
> >                 2i
> >                 3
> >
> > When the patch 3 is applied, only 2i function is checked (by stack
> > checking facility). There might be a task sleeping in 1i though. Such
> > task is migrated to 3, because we do not check 1i in
> > klp_check_stack_func() at all.
> >
> > Coming atomic replace feature would be easier to implement and more
> > reliable without immediate.
> >
> > Moreover, the fake signal and force feature give us almost the same
> > benefits and the user can decide to use them in problematic situations
> > (while immediate needs to be set before the patch is applied). It is
> > also more isolated in terms of code.
> >
> > Thus, remove immediate feature completely and save us from the problems.
>
> Sigh, the force feature actually have the same problem. We would use
> it when a process never has a reliable stack or when it is endlessly
> sleeping in a function that might have been patched immediately.
>
> The documentation about the force feature says that the user should
> consult the patch provider before using the flag. The provider
> would check that it is really safe in the given situation
> and eventually allow to use the force.
>
> But what about any future livepatches? If the force flag was
> safe for a given livepatch/process, it does not mean that
> it would be safe for the next one. The process might still
> be sleeping on the original function or on one lower in
> the stack.
>
> I see two possibilities. We could either refuse loading new
> livepatches after using the force flag. Or we would need
> to check all variants of the function "a" that might still
> be in use.
>
> I think that we might want to check the stack correctly.
> Note that we need to take care also about livepatches that
> were disabled. They are usually removed from func->stack_node.
> We might need to maintain separate stack where we would
> put all variants of the function that might be in
> use when using the immediate or force flag.
>
> I am not sure if we still want to remove the immediate
> flag then.

"Using the force" is a nuclear option. User beware. If you use it (or
recommend that others use it), be prepared for the consequences. That
means anticipating how forcing this patch might affect future patches,
and planning accordingly.

From a livepatch code standpoint, let's avoid adding complexity (or
limitations) where none are needed. I think all we need to do is
permanently disable patch module unloading when somebody forces a patch,
which we already do. Otherwise the user is on their own.
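
For reference, the existing behaviour can be modelled in a few lines of
userspace C. This is a simplified sketch, not the kernel code: struct
model and its helpers are hypothetical, with refcount standing in for
the patch module's reference count and forced for klp_forced. The
condition in complete_transition() mirrors the one left in
klp_complete_transition() after patch 1/2.

```c
#include <assert.h>
#include <stdbool.h>

enum target { TARGET_PATCHED, TARGET_UNPATCHED };

struct model {
	int refcount;	/* stands in for the module reference count */
	bool forced;	/* stands in for klp_forced */
};

/* Enabling a patch takes a reference on its module. */
static void enable_patch(struct model *m)
{
	m->refcount++;
}

/* Forcing a transition is sticky; it is never cleared again. */
static void force_transition(struct model *m)
{
	m->forced = true;
}

/* The reference is dropped only after an unforced unpatch transition. */
static void complete_transition(struct model *m, enum target t)
{
	if (!m->forced && t == TARGET_UNPATCHED) {
		assert(m->refcount > 0);
		m->refcount--;
	}
}

static bool can_unload(const struct model *m)
{
	return m->refcount == 0;
}
```

In this model, enabling and disabling a patch without force brings the
count back to zero, while a single forced transition leaves the
reference held for good, and each later enable adds another one.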

--
Josh

2017-12-21 13:30:14

by Petr Mladek

Subject: Re: [PATCH 1/2] livepatch: Remove immediate feature

On Wed 2017-12-20 11:09:37, Josh Poimboeuf wrote:
> On Wed, Dec 20, 2017 at 03:35:12PM +0100, Petr Mladek wrote:
> > On Fri 2017-12-08 18:25:22, Miroslav Benes wrote:
> > > immediate flag has been used to disable per-task consistency and patch
> > > all tasks immediately. It could be useful if the patch doesn't change any
> > > function or data semantics.
> > >
> > > However, it causes problems on its own. The consistency problem is
> > > currently broken with respect to immediate patches.
> > >
> > > func            a
> > > patches         1i
> > >                 2i
> > >                 3
> > >
> > > When the patch 3 is applied, only 2i function is checked (by stack
> > > checking facility). There might be a task sleeping in 1i though. Such
> > > task is migrated to 3, because we do not check 1i in
> > > klp_check_stack_func() at all.
> > >
> >
> > Sigh, the force feature actually have the same problem. We would use
> > it when a process never has a reliable stack or when it is endlessly
> > sleeping in a function that might have been patched immediately.
> >
> > I see two possibilities. We could either refuse loading new
> > livepatches after using the force flag. Or we would need
> > to check all variants of the function "a" that might still
> > be in use.
>
> "Using the force" is a nuclear option. User beware. If you use it (or
> recommend that others use it), be prepared for the consequences. That
> means anticipating how forcing this patch might affect future patches,
> and planning accordingly.
>
> From a livepatch code standpoint, let's avoid adding complexity (or
> limitations) where none are needed. I think all we need to do is
> permanently disable patch module unloading when somebody forces a patch,
> which we already do. Otherwise the user is on their own.

This looks like a rather weak protection against nuclear diseases ;-)

If we want to keep it simple and safe, we should either print
a big fat warning about this danger when the option is used,
or allow loading new patches only with yet another force flag.

Anyway, I agree that we should keep it simple. The fact is
that removing the immediate flag makes the code more
readable.

Best Regards,
Petr

2017-12-21 13:55:54

by Miroslav Benes

Subject: Re: [PATCH 1/2] livepatch: Remove immediate feature

On Thu, 21 Dec 2017, Petr Mladek wrote:

> On Wed 2017-12-20 11:09:37, Josh Poimboeuf wrote:
> > On Wed, Dec 20, 2017 at 03:35:12PM +0100, Petr Mladek wrote:
> > > On Fri 2017-12-08 18:25:22, Miroslav Benes wrote:
> > > > immediate flag has been used to disable per-task consistency and patch
> > > > all tasks immediately. It could be useful if the patch doesn't change any
> > > > function or data semantics.
> > > >
> > > > However, it causes problems on its own. The consistency problem is
> > > > currently broken with respect to immediate patches.
> > > >
> > > > func a
> > > > patches 1i
> > > > 2i
> > > > 3
> > > >
> > > > When the patch 3 is applied, only 2i function is checked (by stack
> > > > checking facility). There might be a task sleeping in 1i though. Such
> > > > task is migrated to 3, because we do not check 1i in
> > > > klp_check_stack_func() at all.
> > > >
> > >
> > > Sigh, the force feature actually have the same problem. We would use
> > > it when a process never has a reliable stack or when it is endlessly
> > > sleeping in a function that might have been patched immediately.
> > >
> > > I see two possibilities. We could either refuse loading new
> > > livepatches after using the force flag. Or we would need
> > > to check all variants of the function "a" that might still
> > > be in use.
> >
> > "Using the force" is a nuclear option. User beware. If you use it (or
> > recommend that others use it), be prepared for the consequences. That
> > means anticipating how forcing this patch might affect future patches,
> > and planning accordingly.
> >
> > From a livepatch code standpoint, let's avoid adding complexity (or
> > limitations) where none are needed. I think all we need to do is
> > permanently disable patch module unloading when somebody forces a patch,
> > which we already do. Otherwise the user is on their own.

I agree with Josh here.

If there is a problem with a patch module, the recommended action is to
simply cancel its transition (by writing 0 to enabled). If there are
serious reasons to apply the patch, there is force as the last resort. In
that case the user should probably plan for a reboot into an updated kernel
and not plan to apply more live patches.
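
A minimal userspace sketch of the cancel step described above, assuming the
patch exposes the usual sysfs attribute /sys/kernel/livepatch/<patch>/enabled.
The helper name toy_write_attr() is hypothetical and exists only for this
illustration; it is not part of any kernel or library API.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper (illustration only): write a short string to a
 * sysfs-style attribute file. Returns 0 on success or a negative errno.
 */
static int toy_write_attr(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -errno;

	if (fputs(value, f) == EOF) {
		int err = errno;

		fclose(f);
		return -err;
	}

	/* sysfs reports store errors when the buffered data is flushed. */
	if (fclose(f) == EOF)
		return -errno;

	return 0;
}
```

Cancelling a stuck transition would then amount to something like
toy_write_attr("/sys/kernel/livepatch/<patch>/enabled", "0"), run as root,
with <patch> replaced by the name of the loaded patch module.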

> This looks like a rather weak protection against nuclear diseases ;-)
>
> If we want to keep it simple and safe, we should either print
> a big fat warning about this danger when the option is used.
> Or we should allow to load new patches only with yet another
> force flag.

Having said the above, I'm not against updating the warning and the
documentation. I would not introduce another force flag to deal with it.

Thanks,
Miroslav

2017-12-21 14:58:17

by Petr Mladek

Subject: Re: [PATCH 1/2] livepatch: Remove immediate feature

Hello,

it seems that we are going to use this patch (I agree). Therefore
I am going to review the content.

On Fri 2017-12-08 18:25:22, Miroslav Benes wrote:
> immediate flag has been used to disable per-task consistency and patch
> all tasks immediately. It could be useful if the patch doesn't change any
> function or data semantics.
>
> However, it causes problems on its own. The consistency problem is
> currently broken with respect to immediate patches.
>
> func a
> patches 1i
> 2i
> 3
>
> When the patch 3 is applied, only 2i function is checked (by stack
> checking facility). There might be a task sleeping in 1i though. Such
> task is migrated to 3, because we do not check 1i in
> klp_check_stack_func() at all.
>
> Coming atomic replace feature would be easier to implement and more
> reliable without immediate.
>
> Moreover, the fake signal and force feature give us almost the same
> benefits and the user can decide to use them in problematic situations
> (while immediate needs to be set before the patch is applied). It is
> also more isolated in terms of code.
>
> Thus, remove immediate feature completely and save us from the problems.

Just for the record, the above paragraphs need to be reworded because the
problem will still be there with the force feature.
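
The gap described in the changelog can be modelled with a small userspace toy.
This is only a sketch under stated assumptions, not kernel code: the enum and
both toy_check_*() functions are made up for illustration. The first one
mimics the behaviour described above (only the most recently applied version
of "a" is looked for on a task's stack), the second is a hypothetical stricter
check that looks for every older version.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: versions of function "a" in the order they were applied. */
enum toy_version { V_ORIG, V_1I, V_2I, V_3 };

/*
 * Mimics the described behaviour: when a new version is applied, only
 * the latest previously applied version is searched for on the stack.
 * Returns true when the task is considered safe to migrate.
 */
static bool toy_check_latest_only(const enum toy_version *stack, size_t depth,
				  enum toy_version incoming)
{
	enum toy_version latest = incoming - 1;

	for (size_t i = 0; i < depth; i++)
		if (stack[i] == latest)
			return false;	/* still running the latest old version */

	return true;
}

/*
 * Hypothetical stricter variant: any version older than the incoming
 * one blocks the migration.
 */
static bool toy_check_all_versions(const enum toy_version *stack, size_t depth,
				   enum toy_version incoming)
{
	for (size_t i = 0; i < depth; i++)
		if (stack[i] < incoming)
			return false;	/* some older version is still in use */

	return true;
}
```

With a task whose stack contains only V_1I and an incoming V_3, the first
check reports the task as safe to migrate while the second does not, which is
the situation the example with 1i, 2i and 3 describes.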

> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 1c3c9b27c916..461c0b7dc913 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -367,10 +367,10 @@ static int __klp_enable_patch(struct klp_patch *patch)
> * A reference is taken on the patch module to prevent it from being
> * unloaded.
> *
> - * Note: For immediate (no consistency model) patches we don't allow
> - * patch modules to unload since there is no safe/sane method to
> - * determine if a thread is still running in the patched code contained
> - * in the patch module once the ftrace registration is successful.
> + * Note: When klp_forced is set we don't allow patch modules to unload
> + * since there is no safe/sane method to determine if a thread is still
> + * running in the patched code contained in the patch module once the
> + * ftrace registration is successful.

I would remove this paragraph completely. You removed the
cross-reference klp_complete_transition() as well.

> */
> if (!try_module_get(patch->mod))
> return -ENODEV;
> @@ -890,12 +890,7 @@ int klp_register_patch(struct klp_patch *patch)
> if (!klp_initialized())
> return -ENODEV;
>
> - /*
> - * Architectures without reliable stack traces have to set
> - * patch->immediate because there's currently no way to patch kthreads
> - * with the consistency model.
> - */
> - if (!klp_have_reliable_stack() && !patch->immediate) {
> + if (!klp_have_reliable_stack()) {
> pr_err("This architecture doesn't have support for the livepatch consistency model.\n");
> return -ENOSYS;
> }

> diff --git a/samples/livepatch/livepatch-callbacks-demo.c b/samples/livepatch/livepatch-callbacks-demo.c
> index 3d115bd68442..bda7f3841f3e 100644
> --- a/samples/livepatch/livepatch-callbacks-demo.c
> +++ b/samples/livepatch/livepatch-callbacks-demo.c
> @@ -197,20 +197,8 @@ static int livepatch_callbacks_demo_init(void)
> {
> int ret;
>
> - if (!klp_have_reliable_stack() && !patch.immediate) {
> - /*
> - * WARNING: Be very careful when using 'patch.immediate' in
> - * your patches. It's ok to use it for simple patches like
> - * this, but for more complex patches which change function
> - * semantics, locking semantics, or data structures, it may not
> - * be safe. Use of this option will also prevent removal of
> - * the patch.
> - *
> - * See Documentation/livepatch/livepatch.txt for more details.
> - */
> - patch.immediate = true;
> - pr_notice("The consistency model isn't supported for your architecture. Bypassing safety mechanisms and applying the patch immediately.\n");
> - }
> + if (!klp_have_reliable_stack())
> + pr_notice("The consistency model isn't supported for your architecture. The transition may not finish.\n");

The notice is redundant. klp_register_patch() would print a
similar message and return -ENOSYS.

The same is true for the other sample modules.

In any case, I like this patch. It simplifies the code a lot.

Best Regards,
Petr

2017-12-21 15:14:33

by Petr Mladek

Subject: Re: [PATCH 2/2] livepatch: Allow loading modules on architectures without HAVE_RELIABLE_STACKTRACE

On Fri 2017-12-08 18:25:23, Miroslav Benes wrote:
> Now that immediate feature was removed, it is not possible to load
> livepatch modules on architectures without HAVE_RELIABLE_STACKTRACE. Fix
> it by removing guilty check in klp_register_patch().
>
> The architectures without HAVE_RELIABLE_STACKTRACE will now rely only on
> kernelspace/userspace boundary switching, the (fake) signal and force
> feature.

I do not think that this is a good idea. It encourages people to use
the force feature. They might get used to it.

If people are going to provide livepatches, they should be capable
enough to provide a kernel where it is allowed. IMHO, the upstream
kernel should not support bad/dirty practices out of the box.

Best Regards,
Petr

2017-12-22 13:10:43

by Miroslav Benes

Subject: Re: [PATCH 1/2] livepatch: Remove immediate feature

On Thu, 21 Dec 2017, Petr Mladek wrote:

> Hello,
>
> it seems that we are going to use this patch (I agree). Therefore
> I am going to review the content.
>
> On Fri 2017-12-08 18:25:22, Miroslav Benes wrote:
> > immediate flag has been used to disable per-task consistency and patch
> > all tasks immediately. It could be useful if the patch doesn't change any
> > function or data semantics.
> >
> > However, it causes problems on its own. The consistency problem is
> > currently broken with respect to immediate patches.
> >
> > func a
> > patches 1i
> > 2i
> > 3
> >
> > When the patch 3 is applied, only 2i function is checked (by stack
> > checking facility). There might be a task sleeping in 1i though. Such
> > task is migrated to 3, because we do not check 1i in
> > klp_check_stack_func() at all.
> >
> > Coming atomic replace feature would be easier to implement and more
> > reliable without immediate.
> >
> > Moreover, the fake signal and force feature give us almost the same
> > benefits and the user can decide to use them in problematic situations
> > (while immediate needs to be set before the patch is applied). It is
> > also more isolated in terms of code.
> >
> > Thus, remove immediate feature completely and save us from the problems.
>
> Just for record, the above paragraphs needs to be reworded because the
> problem still will be there with the force feature.

Yes, the changelog should be rewritten.

> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index 1c3c9b27c916..461c0b7dc913 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -367,10 +367,10 @@ static int __klp_enable_patch(struct klp_patch *patch)
> > * A reference is taken on the patch module to prevent it from being
> > * unloaded.
> > *
> > - * Note: For immediate (no consistency model) patches we don't allow
> > - * patch modules to unload since there is no safe/sane method to
> > - * determine if a thread is still running in the patched code contained
> > - * in the patch module once the ftrace registration is successful.
> > + * Note: When klp_forced is set we don't allow patch modules to unload
> > + * since there is no safe/sane method to determine if a thread is still
> > + * running in the patched code contained in the patch module once the
> > + * ftrace registration is successful.
>
> I would remove this paragraph completely. You removed the
> cross-reference klp_complete_transition() as well.

Ok.

> > */
> > if (!try_module_get(patch->mod))
> > return -ENODEV;
> > @@ -890,12 +890,7 @@ int klp_register_patch(struct klp_patch *patch)
> > if (!klp_initialized())
> > return -ENODEV;
> >
> > - /*
> > - * Architectures without reliable stack traces have to set
> > - * patch->immediate because there's currently no way to patch kthreads
> > - * with the consistency model.
> > - */
> > - if (!klp_have_reliable_stack() && !patch->immediate) {
> > + if (!klp_have_reliable_stack()) {
> > pr_err("This architecture doesn't have support for the livepatch consistency model.\n");
> > return -ENOSYS;
> > }
>
> > diff --git a/samples/livepatch/livepatch-callbacks-demo.c b/samples/livepatch/livepatch-callbacks-demo.c
> > index 3d115bd68442..bda7f3841f3e 100644
> > --- a/samples/livepatch/livepatch-callbacks-demo.c
> > +++ b/samples/livepatch/livepatch-callbacks-demo.c
> > @@ -197,20 +197,8 @@ static int livepatch_callbacks_demo_init(void)
> > {
> > int ret;
> >
> > - if (!klp_have_reliable_stack() && !patch.immediate) {
> > - /*
> > - * WARNING: Be very careful when using 'patch.immediate' in
> > - * your patches. It's ok to use it for simple patches like
> > - * this, but for more complex patches which change function
> > - * semantics, locking semantics, or data structures, it may not
> > - * be safe. Use of this option will also prevent removal of
> > - * the patch.
> > - *
> > - * See Documentation/livepatch/livepatch.txt for more details.
> > - */
> > - patch.immediate = true;
> > - pr_notice("The consistency model isn't supported for your architecture. Bypassing safety mechanisms and applying the patch immediately.\n");
> > - }
> > + if (!klp_have_reliable_stack())
> > + pr_notice("The consistency model isn't supported for your architecture. The transition may not finish.\n");
>
> The notice is redundant. The klp_register_patch() would printk
> similar message and return -ENOSYS.
>
> Same is true for the other sample modules.

Yes. I wanted the patch to be a mechanical removal of immediate and do the
rest somewhere else. But that did not work out anyway, so ok.

> In each case, I like this patch. It simplifies the code a lot.

Yes. Thanks.

Miroslav

2017-12-22 13:13:05

by Miroslav Benes

Subject: Re: [PATCH 2/2] livepatch: Allow loading modules on architectures without HAVE_RELIABLE_STACKTRACE

On Thu, 21 Dec 2017, Petr Mladek wrote:

> On Fri 2017-12-08 18:25:23, Miroslav Benes wrote:
> > Now that immediate feature was removed, it is not possible to load
> > livepatch modules on architectures without HAVE_RELIABLE_STACKTRACE. Fix
> > it by removing guilty check in klp_register_patch().
> >
> > The architectures without HAVE_RELIABLE_STACKTRACE will now rely only on
> > kernelspace/userspace boundary switching, the (fake) signal and force
> > feature.
>
> I do not think that this is a good idea. It encourages people to use
> the force feature. They might get used to it.
>
> If people are going to provide livepatches, they should be capable
> enough to provide a kernel where it is allowed. IMHO, the upstream
> kernel should not support bad/dirty practices out of box.

You're probably right.

It is true that with immediate gone the only reasonable thing for all
supported architectures is to provide HAVE_RELIABLE_STACKTRACE.

Miroslav