2015-02-09 17:31:32

by Josh Poimboeuf

Subject: [RFC PATCH 0/9] livepatch: consistency model

This patch set implements a livepatch consistency model, targeted for 3.21.
Now that we have a solid livepatch code base, this is the biggest remaining
missing piece.

This code stems from the design proposal made by Vojtech [1] in November. It
makes live patching safer in general. Specifically, it allows you to apply
patches which change function prototypes. It also lays the groundwork for
future code changes which will enable data and data semantic changes.

It's basically a hybrid of kpatch and kGraft, combining kpatch's backtrace
checking with kGraft's per-task consistency. When patching, tasks are
carefully transitioned from the old universe to the new universe. A task can
only be switched to the new universe if it's not using a function that is to be
patched or unpatched. After all tasks have moved to the new universe, the
patching process is complete.
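
Roughly, the per-task switch decision (a condensed sketch of
klp_transition_task() from patch 6/9 below, not new code) looks like this:

    static bool klp_transition_task(struct task_struct *t)
    {
            struct rq *rq;
            unsigned long flags;
            int ret = 0;
            bool success = false;

            if (t->klp_universe == klp_universe_goal)
                    return true;

            /* lock the task's rq so its stack can't change under us */
            rq = task_rq_lock(t, &flags);

            /* a running task's stack can't be trusted */
            if (task_running(rq, t) && t != current)
                    goto done;

            /* ret becomes nonzero if a patched function is on the stack */
            dump_trace(t, NULL, NULL, 0, &klp_stacktrace_ops, &ret);
            if (ret)
                    goto done;

            klp_update_task_universe(t);
            success = true;
    done:
            task_rq_unlock(rq, t, &flags);
            return success;
    }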

How it transitions various tasks to the new universe:

- The stacks of all sleeping tasks are checked. Each task that is not sleeping
on a to-be-patched function is switched.

- Other user tasks are handled by do_notify_resume() (see patch 9/9). If a
task is I/O bound, it switches universes when returning from a system call.
If it's CPU bound, it switches when returning from an interrupt. If it's
sleeping on a patched function, the user can send SIGSTOP and SIGCONT to
force it to switch upon return from the signal handler.

- Idle "swapper" tasks which are sleeping on a to-be-patched function can be
switched from within the outer idle loop.

- An interrupt handler will inherit the universe of the task it interrupts.

- kthreads which are sleeping on to-be-patched functions are not yet handled
(more on this below).


I think this approach provides the best benefits of both kpatch and kGraft:

advantages vs kpatch:
- no stop machine latency
- higher patch success rate (can patch in-use functions)
- patching failures are more predictable (primary failure mode is attempting to
patch a kthread which is sleeping forever on a patched function, more on this
below)

advantages vs kGraft:
- less code complexity (don't have to hack up the code of all the different
kthreads)
- less impact to processes (don't have to signal all sleeping tasks)

disadvantages vs kpatch:
- no system-wide switch point (not really a functional limitation; it just
forces the patch author to be more careful, but that's probably a good thing
anyway)


My biggest concerns and questions related to this patch set are:

1) To safely examine the task stacks, the transition code locks each task's rq
struct, which requires using the scheduler's internal rq locking functions.
It seems to work well, but I'm not sure if there's a cleaner way to safely
do stack checking without stop_machine().

2) As mentioned above, kthreads which are always sleeping on a patched function
will never transition to the new universe. This is really a minor issue
(less than 1% of patches). It's not necessarily something that needs to be
resolved with this patch set, but it would be good to have some discussion
about it regardless.

To overcome this issue, I have 1/2 an idea: we could add some stack checking
code to the ftrace handler itself to transition the kthread to the new
universe after it re-enters the function it was originally sleeping on, if
the stack doesn't already have any other to-be-patched functions.
Combined with klp_transition_work_fn()'s periodic stack checking of
sleeping tasks, that would handle most of the cases (except when trying to
patch the high-level thread_fn itself).
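
Very roughly, and purely hypothetically (this is not part of the series;
klp_task_stack_is_clean() would be a new helper wrapping the same
dump_trace() check that klp_transition_task() uses), it might look like an
extra step in klp_ftrace_handler():

    if (unlikely(func->transition) &&
        (current->flags & PF_KTHREAD) &&
        current->klp_universe != klp_universe_goal &&
        klp_task_stack_is_clean(current))
            /* nothing else to-be-patched is on the stack, switch now */
            klp_update_task_universe(current);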

But then how do you make the kthread wake up? As far as I can tell,
wake_up_process() doesn't seem to work on a kthread (unless I messed up my
testing somehow). What does kGraft do in this case?


[1] https://lkml.org/lkml/2014/11/7/354


Josh Poimboeuf (9):
livepatch: simplify disable error path
livepatch: separate enabled and patched states
livepatch: move patching functions into patch.c
livepatch: get function sizes
sched: move task rq locking functions to sched.h
livepatch: create per-task consistency model
proc: add /proc/<pid>/universe to show livepatch status
livepatch: allow patch modules to be removed
livepatch: update task universe when exiting kernel

arch/x86/include/asm/thread_info.h | 4 +-
arch/x86/kernel/signal.c | 4 +
fs/proc/base.c | 11 ++
include/linux/livepatch.h | 38 ++--
include/linux/sched.h | 3 +
kernel/fork.c | 2 +
kernel/livepatch/Makefile | 2 +-
kernel/livepatch/core.c | 360 ++++++++++---------------------------
kernel/livepatch/patch.c | 206 +++++++++++++++++++++
kernel/livepatch/patch.h | 26 +++
kernel/livepatch/transition.c | 318 ++++++++++++++++++++++++++++++++
kernel/livepatch/transition.h | 16 ++
kernel/sched/core.c | 34 +---
kernel/sched/idle.c | 4 +
kernel/sched/sched.h | 33 ++++
15 files changed, 747 insertions(+), 314 deletions(-)
create mode 100644 kernel/livepatch/patch.c
create mode 100644 kernel/livepatch/patch.h
create mode 100644 kernel/livepatch/transition.c
create mode 100644 kernel/livepatch/transition.h

--
2.1.0


2015-02-09 17:31:31

by Josh Poimboeuf

Subject: [RFC PATCH 1/9] livepatch: simplify disable error path

If registering the function with ftrace has previously succeeded,
unregistering will almost never fail. Even if it does, it's not a fatal
error. We can still carry on and disable the klp_func from being used
by removing it from the klp_ops func stack.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
kernel/livepatch/core.c | 67 +++++++++++++------------------------------------
1 file changed, 17 insertions(+), 50 deletions(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 9adf86b..081df77 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -322,32 +322,20 @@ static void notrace klp_ftrace_handler(unsigned long ip,
klp_arch_set_pc(regs, (unsigned long)func->new_func);
}

-static int klp_disable_func(struct klp_func *func)
+static void klp_disable_func(struct klp_func *func)
{
struct klp_ops *ops;
- int ret;
-
- if (WARN_ON(func->state != KLP_ENABLED))
- return -EINVAL;

- if (WARN_ON(!func->old_addr))
- return -EINVAL;
+ WARN_ON(func->state != KLP_ENABLED);
+ WARN_ON(!func->old_addr);

ops = klp_find_ops(func->old_addr);
if (WARN_ON(!ops))
- return -EINVAL;
+ return;

if (list_is_singular(&ops->func_stack)) {
- ret = unregister_ftrace_function(&ops->fops);
- if (ret) {
- pr_err("failed to unregister ftrace handler for function '%s' (%d)\n",
- func->old_name, ret);
- return ret;
- }
-
- ret = ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0);
- if (ret)
- pr_warn("function unregister succeeded but failed to clear the filter\n");
+ WARN_ON(unregister_ftrace_function(&ops->fops));
+ WARN_ON(ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0));

list_del_rcu(&func->stack_node);
list_del(&ops->node);
@@ -357,8 +345,6 @@ static int klp_disable_func(struct klp_func *func)
}

func->state = KLP_DISABLED;
-
- return 0;
}

static int klp_enable_func(struct klp_func *func)
@@ -419,23 +405,15 @@ err:
return ret;
}

-static int klp_disable_object(struct klp_object *obj)
+static void klp_disable_object(struct klp_object *obj)
{
struct klp_func *func;
- int ret;

- for (func = obj->funcs; func->old_name; func++) {
- if (func->state != KLP_ENABLED)
- continue;
-
- ret = klp_disable_func(func);
- if (ret)
- return ret;
- }
+ for (func = obj->funcs; func->old_name; func++)
+ if (func->state == KLP_ENABLED)
+ klp_disable_func(func);

obj->state = KLP_DISABLED;
-
- return 0;
}

static int klp_enable_object(struct klp_object *obj)
@@ -451,22 +429,19 @@ static int klp_enable_object(struct klp_object *obj)

for (func = obj->funcs; func->old_name; func++) {
ret = klp_enable_func(func);
- if (ret)
- goto unregister;
+ if (ret) {
+ klp_disable_object(obj);
+ return ret;
+ }
}
obj->state = KLP_ENABLED;

return 0;
-
-unregister:
- WARN_ON(klp_disable_object(obj));
- return ret;
}

static int __klp_disable_patch(struct klp_patch *patch)
{
struct klp_object *obj;
- int ret;

/* enforce stacking: only the last enabled patch can be disabled */
if (!list_is_last(&patch->list, &klp_patches) &&
@@ -476,12 +451,8 @@ static int __klp_disable_patch(struct klp_patch *patch)
pr_notice("disabling patch '%s'\n", patch->mod->name);

for (obj = patch->objs; obj->funcs; obj++) {
- if (obj->state != KLP_ENABLED)
- continue;
-
- ret = klp_disable_object(obj);
- if (ret)
- return ret;
+ if (obj->state == KLP_ENABLED)
+ klp_disable_object(obj);
}

patch->state = KLP_DISABLED;
@@ -931,7 +902,6 @@ static void klp_module_notify_going(struct klp_patch *patch,
{
struct module *pmod = patch->mod;
struct module *mod = obj->mod;
- int ret;

if (patch->state == KLP_DISABLED)
goto disabled;
@@ -939,10 +909,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
pr_notice("reverting patch '%s' on unloading module '%s'\n",
pmod->name, mod->name);

- ret = klp_disable_object(obj);
- if (ret)
- pr_warn("failed to revert patch '%s' on module '%s' (%d)\n",
- pmod->name, mod->name, ret);
+ klp_disable_object(obj);

disabled:
klp_free_object_loaded(obj);
--
2.1.0

2015-02-09 17:33:11

by Josh Poimboeuf

Subject: [RFC PATCH 2/9] livepatch: separate enabled and patched states

Once we have a consistency model, patches and their objects will be
enabled and disabled at different times. For example, when a patch is
disabled, its loaded objects' funcs can remain registered with ftrace
indefinitely until the unpatching operation is complete and they're no
longer in use.

It's less confusing if we give them different names: patches can be
enabled or disabled; objects (and their funcs) can be patched or
unpatched:

- Enabled means that a patch is logically enabled (but not necessarily
fully applied).

- Patched means that an object's funcs are registered with ftrace and
added to the klp_ops func stack.

Also, since these states are binary, represent them with boolean-type
variables instead of enums.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
include/linux/livepatch.h | 15 ++++-----
kernel/livepatch/core.c | 79 +++++++++++++++++++++++------------------------
2 files changed, 45 insertions(+), 49 deletions(-)

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 95023fd..22a67d1 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -28,11 +28,6 @@

#include <asm/livepatch.h>

-enum klp_state {
- KLP_DISABLED,
- KLP_ENABLED
-};
-
/**
* struct klp_func - function structure for live patching
* @old_name: name of the function to be patched
@@ -42,6 +37,7 @@ enum klp_state {
* @kobj: kobject for sysfs resources
* @state: tracks function-level patch application state
* @stack_node: list node for klp_ops func_stack list
+ * @patched: the func has been added to the klp_ops list
*/
struct klp_func {
/* external */
@@ -59,8 +55,8 @@ struct klp_func {

/* internal */
struct kobject kobj;
- enum klp_state state;
struct list_head stack_node;
+ int patched;
};

/**
@@ -90,7 +86,7 @@ struct klp_reloc {
* @kobj: kobject for sysfs resources
* @mod: kernel module associated with the patched object
* (NULL for vmlinux)
- * @state: tracks object-level patch application state
+ * @patched: the object's funcs have been added to the klp_ops list
*/
struct klp_object {
/* external */
@@ -101,7 +97,7 @@ struct klp_object {
/* internal */
struct kobject *kobj;
struct module *mod;
- enum klp_state state;
+ int patched;
};

/**
@@ -111,6 +107,7 @@ struct klp_object {
* @list: list node for global list of registered patches
* @kobj: kobject for sysfs resources
* @state: tracks patch-level application state
+ * @enabled: the patch is enabled (but operation may be incomplete)
*/
struct klp_patch {
/* external */
@@ -120,7 +117,7 @@ struct klp_patch {
/* internal */
struct list_head list;
struct kobject kobj;
- enum klp_state state;
+ int enabled;
};

extern int klp_register_patch(struct klp_patch *);
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 081df77..73f9ba4 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -322,11 +322,11 @@ static void notrace klp_ftrace_handler(unsigned long ip,
klp_arch_set_pc(regs, (unsigned long)func->new_func);
}

-static void klp_disable_func(struct klp_func *func)
+static void klp_unpatch_func(struct klp_func *func)
{
struct klp_ops *ops;

- WARN_ON(func->state != KLP_ENABLED);
+ WARN_ON(!func->patched);
WARN_ON(!func->old_addr);

ops = klp_find_ops(func->old_addr);
@@ -344,10 +344,10 @@ static void klp_disable_func(struct klp_func *func)
list_del_rcu(&func->stack_node);
}

- func->state = KLP_DISABLED;
+ func->patched = 0;
}

-static int klp_enable_func(struct klp_func *func)
+static int klp_patch_func(struct klp_func *func)
{
struct klp_ops *ops;
int ret;
@@ -355,7 +355,7 @@ static int klp_enable_func(struct klp_func *func)
if (WARN_ON(!func->old_addr))
return -EINVAL;

- if (WARN_ON(func->state != KLP_DISABLED))
+ if (WARN_ON(func->patched))
return -EINVAL;

ops = klp_find_ops(func->old_addr);
@@ -394,7 +394,7 @@ static int klp_enable_func(struct klp_func *func)
list_add_rcu(&func->stack_node, &ops->func_stack);
}

- func->state = KLP_ENABLED;
+ func->patched = 1;

return 0;

@@ -405,36 +405,36 @@ err:
return ret;
}

-static void klp_disable_object(struct klp_object *obj)
+static void klp_unpatch_object(struct klp_object *obj)
{
struct klp_func *func;

for (func = obj->funcs; func->old_name; func++)
- if (func->state == KLP_ENABLED)
- klp_disable_func(func);
+ if (func->patched)
+ klp_unpatch_func(func);

- obj->state = KLP_DISABLED;
+ obj->patched = 0;
}

-static int klp_enable_object(struct klp_object *obj)
+static int klp_patch_object(struct klp_object *obj)
{
struct klp_func *func;
int ret;

- if (WARN_ON(obj->state != KLP_DISABLED))
+ if (WARN_ON(obj->patched))
return -EINVAL;

if (WARN_ON(!klp_is_object_loaded(obj)))
return -EINVAL;

for (func = obj->funcs; func->old_name; func++) {
- ret = klp_enable_func(func);
+ ret = klp_patch_func(func);
if (ret) {
- klp_disable_object(obj);
+ klp_unpatch_object(obj);
return ret;
}
}
- obj->state = KLP_ENABLED;
+ obj->patched = 1;

return 0;
}
@@ -445,17 +445,16 @@ static int __klp_disable_patch(struct klp_patch *patch)

/* enforce stacking: only the last enabled patch can be disabled */
if (!list_is_last(&patch->list, &klp_patches) &&
- list_next_entry(patch, list)->state == KLP_ENABLED)
+ list_next_entry(patch, list)->enabled)
return -EBUSY;

pr_notice("disabling patch '%s'\n", patch->mod->name);

- for (obj = patch->objs; obj->funcs; obj++) {
- if (obj->state == KLP_ENABLED)
- klp_disable_object(obj);
- }
+ for (obj = patch->objs; obj->funcs; obj++)
+ if (obj->patched)
+ klp_unpatch_object(obj);

- patch->state = KLP_DISABLED;
+ patch->enabled = 0;

return 0;
}
@@ -479,7 +478,7 @@ int klp_disable_patch(struct klp_patch *patch)
goto err;
}

- if (patch->state == KLP_DISABLED) {
+ if (!patch->enabled) {
ret = -EINVAL;
goto err;
}
@@ -497,12 +496,12 @@ static int __klp_enable_patch(struct klp_patch *patch)
struct klp_object *obj;
int ret;

- if (WARN_ON(patch->state != KLP_DISABLED))
+ if (WARN_ON(patch->enabled))
return -EINVAL;

/* enforce stacking: only the first disabled patch can be enabled */
if (patch->list.prev != &klp_patches &&
- list_prev_entry(patch, list)->state == KLP_DISABLED)
+ !list_prev_entry(patch, list)->enabled)
return -EBUSY;

pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
@@ -516,12 +515,12 @@ static int __klp_enable_patch(struct klp_patch *patch)
if (!klp_is_object_loaded(obj))
continue;

- ret = klp_enable_object(obj);
+ ret = klp_patch_object(obj);
if (ret)
goto unregister;
}

- patch->state = KLP_ENABLED;
+ patch->enabled = 1;

return 0;

@@ -579,20 +578,20 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
if (ret)
return -EINVAL;

- if (val != KLP_DISABLED && val != KLP_ENABLED)
+ if (val > 1)
return -EINVAL;

patch = container_of(kobj, struct klp_patch, kobj);

mutex_lock(&klp_mutex);

- if (val == patch->state) {
+ if (patch->enabled == val) {
/* already in requested state */
ret = -EINVAL;
goto err;
}

- if (val == KLP_ENABLED) {
+ if (val) {
ret = __klp_enable_patch(patch);
if (ret)
goto err;
@@ -617,7 +616,7 @@ static ssize_t enabled_show(struct kobject *kobj,
struct klp_patch *patch;

patch = container_of(kobj, struct klp_patch, kobj);
- return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->state);
+ return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->enabled);
}

static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
@@ -699,7 +698,7 @@ static void klp_free_patch(struct klp_patch *patch)
static int klp_init_func(struct klp_object *obj, struct klp_func *func)
{
INIT_LIST_HEAD(&func->stack_node);
- func->state = KLP_DISABLED;
+ func->patched = 0;

return kobject_init_and_add(&func->kobj, &klp_ktype_func,
obj->kobj, func->old_name);
@@ -736,7 +735,7 @@ static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
if (!obj->funcs)
return -EINVAL;

- obj->state = KLP_DISABLED;
+ obj->patched = 0;

klp_find_object_module(obj);

@@ -775,7 +774,7 @@ static int klp_init_patch(struct klp_patch *patch)

mutex_lock(&klp_mutex);

- patch->state = KLP_DISABLED;
+ patch->enabled = 0;

ret = kobject_init_and_add(&patch->kobj, &klp_ktype_patch,
klp_root_kobj, patch->mod->name);
@@ -821,7 +820,7 @@ int klp_unregister_patch(struct klp_patch *patch)
goto out;
}

- if (patch->state == KLP_ENABLED) {
+ if (patch->enabled) {
ret = -EBUSY;
goto out;
}
@@ -882,13 +881,13 @@ static void klp_module_notify_coming(struct klp_patch *patch,
if (ret)
goto err;

- if (patch->state == KLP_DISABLED)
+ if (!patch->enabled)
return;

pr_notice("applying patch '%s' to loading module '%s'\n",
pmod->name, mod->name);

- ret = klp_enable_object(obj);
+ ret = klp_patch_object(obj);
if (!ret)
return;

@@ -903,15 +902,15 @@ static void klp_module_notify_going(struct klp_patch *patch,
struct module *pmod = patch->mod;
struct module *mod = obj->mod;

- if (patch->state == KLP_DISABLED)
- goto disabled;
+ if (!patch->enabled)
+ goto free;

pr_notice("reverting patch '%s' on unloading module '%s'\n",
pmod->name, mod->name);

- klp_disable_object(obj);
+ klp_unpatch_object(obj);

-disabled:
+free:
klp_free_object_loaded(obj);
}

--
2.1.0

2015-02-09 17:33:33

by Josh Poimboeuf

Subject: [RFC PATCH 3/9] livepatch: move patching functions into patch.c

Move functions related to the actual patching of functions and objects
into a new patch.c file.

The only functional change is to remove the unnecessary
WARN_ON(!klp_is_object_loaded()) check from klp_patch_object().

Signed-off-by: Josh Poimboeuf <[email protected]>
---
kernel/livepatch/Makefile | 2 +-
kernel/livepatch/core.c | 175 +--------------------------------------------
kernel/livepatch/patch.c | 176 ++++++++++++++++++++++++++++++++++++++++++++++
kernel/livepatch/patch.h | 25 +++++++
4 files changed, 203 insertions(+), 175 deletions(-)
create mode 100644 kernel/livepatch/patch.c
create mode 100644 kernel/livepatch/patch.h

diff --git a/kernel/livepatch/Makefile b/kernel/livepatch/Makefile
index e8780c0..e136dad 100644
--- a/kernel/livepatch/Makefile
+++ b/kernel/livepatch/Makefile
@@ -1,3 +1,3 @@
obj-$(CONFIG_LIVEPATCH) += livepatch.o

-livepatch-objs := core.o
+livepatch-objs := core.o patch.o
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 73f9ba4..0c09eba 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -24,29 +24,10 @@
#include <linux/kernel.h>
#include <linux/mutex.h>
#include <linux/slab.h>
-#include <linux/ftrace.h>
#include <linux/list.h>
#include <linux/kallsyms.h>
-#include <linux/livepatch.h>

-/**
- * struct klp_ops - structure for tracking registered ftrace ops structs
- *
- * A single ftrace_ops is shared between all enabled replacement functions
- * (klp_func structs) which have the same old_addr. This allows the switch
- * between function versions to happen instantaneously by updating the klp_ops
- * struct's func_stack list. The winner is the klp_func at the top of the
- * func_stack (front of the list).
- *
- * @node: node for the global klp_ops list
- * @func_stack: list head for the stack of klp_func's (active func is on top)
- * @fops: registered ftrace ops struct
- */
-struct klp_ops {
- struct list_head node;
- struct list_head func_stack;
- struct ftrace_ops fops;
-};
+#include "patch.h"

/*
* The klp_mutex protects the global lists and state transitions of any
@@ -57,25 +38,9 @@ struct klp_ops {
static DEFINE_MUTEX(klp_mutex);

static LIST_HEAD(klp_patches);
-static LIST_HEAD(klp_ops);

static struct kobject *klp_root_kobj;

-static struct klp_ops *klp_find_ops(unsigned long old_addr)
-{
- struct klp_ops *ops;
- struct klp_func *func;
-
- list_for_each_entry(ops, &klp_ops, node) {
- func = list_first_entry(&ops->func_stack, struct klp_func,
- stack_node);
- if (func->old_addr == old_addr)
- return ops;
- }
-
- return NULL;
-}
-
static bool klp_is_module(struct klp_object *obj)
{
return obj->name;
@@ -301,144 +266,6 @@ static int klp_write_object_relocations(struct module *pmod,
return 0;
}

-static void notrace klp_ftrace_handler(unsigned long ip,
- unsigned long parent_ip,
- struct ftrace_ops *fops,
- struct pt_regs *regs)
-{
- struct klp_ops *ops;
- struct klp_func *func;
-
- ops = container_of(fops, struct klp_ops, fops);
-
- rcu_read_lock();
- func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
- stack_node);
- rcu_read_unlock();
-
- if (WARN_ON_ONCE(!func))
- return;
-
- klp_arch_set_pc(regs, (unsigned long)func->new_func);
-}
-
-static void klp_unpatch_func(struct klp_func *func)
-{
- struct klp_ops *ops;
-
- WARN_ON(!func->patched);
- WARN_ON(!func->old_addr);
-
- ops = klp_find_ops(func->old_addr);
- if (WARN_ON(!ops))
- return;
-
- if (list_is_singular(&ops->func_stack)) {
- WARN_ON(unregister_ftrace_function(&ops->fops));
- WARN_ON(ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0));
-
- list_del_rcu(&func->stack_node);
- list_del(&ops->node);
- kfree(ops);
- } else {
- list_del_rcu(&func->stack_node);
- }
-
- func->patched = 0;
-}
-
-static int klp_patch_func(struct klp_func *func)
-{
- struct klp_ops *ops;
- int ret;
-
- if (WARN_ON(!func->old_addr))
- return -EINVAL;
-
- if (WARN_ON(func->patched))
- return -EINVAL;
-
- ops = klp_find_ops(func->old_addr);
- if (!ops) {
- ops = kzalloc(sizeof(*ops), GFP_KERNEL);
- if (!ops)
- return -ENOMEM;
-
- ops->fops.func = klp_ftrace_handler;
- ops->fops.flags = FTRACE_OPS_FL_SAVE_REGS |
- FTRACE_OPS_FL_DYNAMIC |
- FTRACE_OPS_FL_IPMODIFY;
-
- list_add(&ops->node, &klp_ops);
-
- INIT_LIST_HEAD(&ops->func_stack);
- list_add_rcu(&func->stack_node, &ops->func_stack);
-
- ret = ftrace_set_filter_ip(&ops->fops, func->old_addr, 0, 0);
- if (ret) {
- pr_err("failed to set ftrace filter for function '%s' (%d)\n",
- func->old_name, ret);
- goto err;
- }
-
- ret = register_ftrace_function(&ops->fops);
- if (ret) {
- pr_err("failed to register ftrace handler for function '%s' (%d)\n",
- func->old_name, ret);
- ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0);
- goto err;
- }
-
-
- } else {
- list_add_rcu(&func->stack_node, &ops->func_stack);
- }
-
- func->patched = 1;
-
- return 0;
-
-err:
- list_del_rcu(&func->stack_node);
- list_del(&ops->node);
- kfree(ops);
- return ret;
-}
-
-static void klp_unpatch_object(struct klp_object *obj)
-{
- struct klp_func *func;
-
- for (func = obj->funcs; func->old_name; func++)
- if (func->patched)
- klp_unpatch_func(func);
-
- obj->patched = 0;
-}
-
-static int klp_patch_object(struct klp_object *obj)
-{
- struct klp_func *func;
- int ret;
-
- if (WARN_ON(obj->patched))
- return -EINVAL;
-
- if (WARN_ON(!klp_is_object_loaded(obj)))
- return -EINVAL;
-
- for (func = obj->funcs; func->old_name; func++) {
- ret = klp_patch_func(func);
- if (ret) {
- klp_unpatch_object(obj);
- return ret;
- }
- }
- obj->patched = 1;
-
- return 0;
-}
-
static int __klp_disable_patch(struct klp_patch *patch)
{
struct klp_object *obj;
diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
new file mode 100644
index 0000000..281fbca
--- /dev/null
+++ b/kernel/livepatch/patch.c
@@ -0,0 +1,176 @@
+/*
+ * patch.c - Kernel Live Patching patching functions
+ *
+ * Copyright (C) 2014 Seth Jennings <[email protected]>
+ * Copyright (C) 2014 SUSE
+ * Copyright (C) 2015 Josh Poimboeuf <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/slab.h>
+
+#include "patch.h"
+
+static LIST_HEAD(klp_ops);
+
+static void notrace klp_ftrace_handler(unsigned long ip,
+ unsigned long parent_ip,
+ struct ftrace_ops *fops,
+ struct pt_regs *regs)
+{
+ struct klp_ops *ops;
+ struct klp_func *func;
+
+ ops = container_of(fops, struct klp_ops, fops);
+
+ rcu_read_lock();
+ func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
+ stack_node);
+ rcu_read_unlock();
+
+ if (WARN_ON_ONCE(!func))
+ return;
+
+ klp_arch_set_pc(regs, (unsigned long)func->new_func);
+}
+
+struct klp_ops *klp_find_ops(unsigned long old_addr)
+{
+ struct klp_ops *ops;
+ struct klp_func *func;
+
+ list_for_each_entry(ops, &klp_ops, node) {
+ func = list_first_entry(&ops->func_stack, struct klp_func,
+ stack_node);
+ if (func->old_addr == old_addr)
+ return ops;
+ }
+
+ return NULL;
+}
+
+static void klp_unpatch_func(struct klp_func *func)
+{
+ struct klp_ops *ops;
+
+ WARN_ON(!func->patched);
+ WARN_ON(!func->old_addr);
+
+ ops = klp_find_ops(func->old_addr);
+ if (WARN_ON(!ops))
+ return;
+
+ if (list_is_singular(&ops->func_stack)) {
+ WARN_ON(unregister_ftrace_function(&ops->fops));
+ WARN_ON(ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0));
+
+ list_del_rcu(&func->stack_node);
+ list_del(&ops->node);
+ kfree(ops);
+ } else {
+ list_del_rcu(&func->stack_node);
+ }
+
+ func->patched = 0;
+}
+
+static int klp_patch_func(struct klp_func *func)
+{
+ struct klp_ops *ops;
+ int ret;
+
+ if (WARN_ON(!func->old_addr))
+ return -EINVAL;
+
+ if (WARN_ON(func->patched))
+ return -EINVAL;
+
+ ops = klp_find_ops(func->old_addr);
+ if (!ops) {
+ ops = kzalloc(sizeof(*ops), GFP_KERNEL);
+ if (!ops)
+ return -ENOMEM;
+
+ ops->fops.func = klp_ftrace_handler;
+ ops->fops.flags = FTRACE_OPS_FL_SAVE_REGS |
+ FTRACE_OPS_FL_DYNAMIC |
+ FTRACE_OPS_FL_IPMODIFY;
+
+ list_add(&ops->node, &klp_ops);
+
+ INIT_LIST_HEAD(&ops->func_stack);
+ list_add_rcu(&func->stack_node, &ops->func_stack);
+
+ ret = ftrace_set_filter_ip(&ops->fops, func->old_addr, 0, 0);
+ if (ret) {
+ pr_err("failed to set ftrace filter for function '%s' (%d)\n",
+ func->old_name, ret);
+ goto err;
+ }
+
+ ret = register_ftrace_function(&ops->fops);
+ if (ret) {
+ pr_err("failed to register ftrace handler for function '%s' (%d)\n",
+ func->old_name, ret);
+ ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0);
+ goto err;
+ }
+ } else {
+ list_add_rcu(&func->stack_node, &ops->func_stack);
+ }
+
+ func->patched = 1;
+
+ return 0;
+
+err:
+ list_del_rcu(&func->stack_node);
+ list_del(&ops->node);
+ kfree(ops);
+ return ret;
+}
+
+void klp_unpatch_object(struct klp_object *obj)
+{
+ struct klp_func *func;
+
+ for (func = obj->funcs; func->old_name; func++)
+ if (func->patched)
+ klp_unpatch_func(func);
+
+ obj->patched = 0;
+}
+
+int klp_patch_object(struct klp_object *obj)
+{
+ struct klp_func *func;
+ int ret;
+
+ if (WARN_ON(obj->patched))
+ return -EINVAL;
+
+ for (func = obj->funcs; func->old_name; func++) {
+ ret = klp_patch_func(func);
+ if (ret) {
+ klp_unpatch_object(obj);
+ return ret;
+ }
+ }
+ obj->patched = 1;
+
+ return 0;
+}
diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
new file mode 100644
index 0000000..bb34bd3
--- /dev/null
+++ b/kernel/livepatch/patch.h
@@ -0,0 +1,25 @@
+#include <linux/livepatch.h>
+
+/**
+ * struct klp_ops - structure for tracking registered ftrace ops structs
+ *
+ * A single ftrace_ops is shared between all enabled replacement functions
+ * (klp_func structs) which have the same old_addr. This allows the switch
+ * between function versions to happen instantaneously by updating the klp_ops
+ * struct's func_stack list. The winner is the klp_func at the top of the
+ * func_stack (front of the list).
+ *
+ * @node: node for the global klp_ops list
+ * @func_stack: list head for the stack of klp_func's (active func is on top)
+ * @fops: registered ftrace ops struct
+ */
+struct klp_ops {
+ struct list_head node;
+ struct list_head func_stack;
+ struct ftrace_ops fops;
+};
+
+struct klp_ops *klp_find_ops(unsigned long old_addr);
+
+extern int klp_patch_object(struct klp_object *obj);
+extern void klp_unpatch_object(struct klp_object *obj);
--
2.1.0

2015-02-09 17:31:37

by Josh Poimboeuf

Subject: [RFC PATCH 4/9] livepatch: get function sizes

For the consistency model we'll need to know the sizes of the old and
new functions to determine if they're on any task stacks.
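
For reference, these sizes feed a simple address-range check against each
stack return address in the consistency model code (patch 6/9), roughly:

    /* is this return address inside a to-be-(un)patched function? */
    if (address >= func_addr && address < func_addr + func_size)
            return -1;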

Signed-off-by: Josh Poimboeuf <[email protected]>
---
include/linux/livepatch.h | 3 +++
kernel/livepatch/core.c | 19 ++++++++++++++++++-
2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 22a67d1..0e65b4d 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -37,6 +37,8 @@
* @kobj: kobject for sysfs resources
* @state: tracks function-level patch application state
* @stack_node: list node for klp_ops func_stack list
+ * @old_size: size of the old function
+ * @new_size: size of the new function
* @patched: the func has been added to the klp_ops list
*/
struct klp_func {
@@ -56,6 +58,7 @@ struct klp_func {
/* internal */
struct kobject kobj;
struct list_head stack_node;
+ unsigned long old_size, new_size;
int patched;
};

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 0c09eba..85d4ef7 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -197,8 +197,25 @@ static int klp_find_verify_func_addr(struct klp_object *obj,
else
ret = klp_verify_vmlinux_symbol(func->old_name,
func->old_addr);
+ if (ret)
+ return ret;

- return ret;
+ ret = kallsyms_lookup_size_offset(func->old_addr, &func->old_size,
+ NULL);
+ if (!ret) {
+ pr_err("kallsyms lookup failed for '%s'\n", func->old_name);
+ return -EINVAL;
+ }
+
+ ret = kallsyms_lookup_size_offset((unsigned long)func->new_func,
+ &func->new_size, NULL);
+ if (!ret) {
+ pr_err("kallsyms lookup failed for '%s' replacement\n",
+ func->old_name);
+ return -EINVAL;
+ }
+
+ return 0;
}

/*
--
2.1.0

2015-02-09 17:31:34

by Josh Poimboeuf

Subject: [RFC PATCH 5/9] sched: move task rq locking functions to sched.h

Move task_rq_lock/unlock() to sched.h so they can be used elsewhere.
The livepatch code needs to lock each task's rq in order to safely
examine its stack and switch it to a new patch universe.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
kernel/sched/core.c | 32 --------------------------------
kernel/sched/sched.h | 33 +++++++++++++++++++++++++++++++++
2 files changed, 33 insertions(+), 32 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b5797b7..78d91e6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -326,44 +326,12 @@ static inline struct rq *__task_rq_lock(struct task_struct *p)
}
}

-/*
- * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
- */
-static struct rq *task_rq_lock(struct task_struct *p, unsigned long *flags)
- __acquires(p->pi_lock)
- __acquires(rq->lock)
-{
- struct rq *rq;
-
- for (;;) {
- raw_spin_lock_irqsave(&p->pi_lock, *flags);
- rq = task_rq(p);
- raw_spin_lock(&rq->lock);
- if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
- return rq;
- raw_spin_unlock(&rq->lock);
- raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
-
- while (unlikely(task_on_rq_migrating(p)))
- cpu_relax();
- }
-}
-
static void __task_rq_unlock(struct rq *rq)
__releases(rq->lock)
{
raw_spin_unlock(&rq->lock);
}

-static inline void
-task_rq_unlock(struct rq *rq, struct task_struct *p, unsigned long *flags)
- __releases(rq->lock)
- __releases(p->pi_lock)
-{
- raw_spin_unlock(&rq->lock);
- raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
-}
-
/*
* this_rq_lock - lock this runqueue and disable interrupts.
*/
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 9a2a45c..ae514c9 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1542,6 +1542,39 @@ static inline void double_rq_unlock(struct rq *rq1, struct rq *rq2)

#endif

+/*
+ * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
+ */
+static inline struct rq *task_rq_lock(struct task_struct *p,
+ unsigned long *flags)
+ __acquires(p->pi_lock)
+ __acquires(rq->lock)
+{
+ struct rq *rq;
+
+ for (;;) {
+ raw_spin_lock_irqsave(&p->pi_lock, *flags);
+ rq = task_rq(p);
+ raw_spin_lock(&rq->lock);
+ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
+ return rq;
+ raw_spin_unlock(&rq->lock);
+ raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
+
+ while (unlikely(task_on_rq_migrating(p)))
+ cpu_relax();
+ }
+}
+
+static inline void task_rq_unlock(struct rq *rq, struct task_struct *p,
+ unsigned long *flags)
+ __releases(rq->lock)
+ __releases(p->pi_lock)
+{
+ raw_spin_unlock(&rq->lock);
+ raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
+}
+
extern struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq);
extern struct sched_entity *__pick_last_entity(struct cfs_rq *cfs_rq);
extern void print_cfs_stats(struct seq_file *m, int cpu);
--
2.1.0

2015-02-09 17:32:48

by Josh Poimboeuf

Subject: [RFC PATCH 6/9] livepatch: create per-task consistency model

Add a basic per-task consistency model. This is the foundation which
will eventually enable us to apply the ~10% of security patches which
change function prototypes and/or data semantics.

When a patch is enabled, livepatch enters into a transition state where
tasks are converging from the old universe to the new universe. If a
given task isn't using any of the patched functions, it's switched to
the new universe. Once all the tasks have converged to the new
universe, patching is complete.

The same sequence occurs when a patch is disabled, except the tasks
converge from the new universe to the old universe.

The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
is in transition. Only a single patch (the topmost patch on the stack)
can be in transition at a given time. A patch can remain in the
transition state indefinitely, if any of the tasks are stuck in the
previous universe.

A transition can be reversed and effectively canceled by writing the
opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
the transition is in progress. Then all the tasks will attempt to
converge back to the original universe.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
include/linux/livepatch.h | 18 ++-
include/linux/sched.h | 3 +
kernel/fork.c | 2 +
kernel/livepatch/Makefile | 2 +-
kernel/livepatch/core.c | 71 ++++++----
kernel/livepatch/patch.c | 34 ++++-
kernel/livepatch/patch.h | 1 +
kernel/livepatch/transition.c | 300 ++++++++++++++++++++++++++++++++++++++++++
kernel/livepatch/transition.h | 16 +++
kernel/sched/core.c | 2 +
10 files changed, 423 insertions(+), 26 deletions(-)
create mode 100644 kernel/livepatch/transition.c
create mode 100644 kernel/livepatch/transition.h

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 0e65b4d..b8c2f15 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -40,6 +40,7 @@
* @old_size: size of the old function
* @new_size: size of the new function
* @patched: the func has been added to the klp_ops list
+ * @transition: the func is currently being applied or reverted
*/
struct klp_func {
/* external */
@@ -60,6 +61,7 @@ struct klp_func {
struct list_head stack_node;
unsigned long old_size, new_size;
int patched;
+ int transition;
};

/**
@@ -128,6 +130,20 @@ extern int klp_unregister_patch(struct klp_patch *);
extern int klp_enable_patch(struct klp_patch *);
extern int klp_disable_patch(struct klp_patch *);

-#endif /* CONFIG_LIVEPATCH */
+extern int klp_universe_goal;
+
+static inline void klp_update_task_universe(struct task_struct *t)
+{
+ /* corresponding smp_wmb() is in klp_set_universe_goal() */
+ smp_rmb();
+
+ t->klp_universe = klp_universe_goal;
+}
+
+#else /* !CONFIG_LIVEPATCH */
+
+static inline void klp_update_task_universe(struct task_struct *t) {}
+
+#endif /* !CONFIG_LIVEPATCH */

#endif /* _LINUX_LIVEPATCH_H_ */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8db31ef..a95e59a 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1701,6 +1701,9 @@ struct task_struct {
#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
unsigned long task_state_change;
#endif
+#ifdef CONFIG_LIVEPATCH
+ int klp_universe;
+#endif
};

/* Future-safe accessor for struct task_struct's cpus_allowed. */
diff --git a/kernel/fork.c b/kernel/fork.c
index 4dc2dda..1dcbebe 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -74,6 +74,7 @@
#include <linux/uprobes.h>
#include <linux/aio.h>
#include <linux/compiler.h>
+#include <linux/livepatch.h>

#include <asm/pgtable.h>
#include <asm/pgalloc.h>
@@ -1538,6 +1539,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
total_forks++;
spin_unlock(&current->sighand->siglock);
syscall_tracepoint_update(p);
+ klp_update_task_universe(p);
write_unlock_irq(&tasklist_lock);

proc_fork_connector(p);
diff --git a/kernel/livepatch/Makefile b/kernel/livepatch/Makefile
index e136dad..2b8bdb1 100644
--- a/kernel/livepatch/Makefile
+++ b/kernel/livepatch/Makefile
@@ -1,3 +1,3 @@
obj-$(CONFIG_LIVEPATCH) += livepatch.o

-livepatch-objs := core.o patch.o
+livepatch-objs := core.o patch.o transition.o
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 85d4ef7..790dc10 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -28,14 +28,17 @@
#include <linux/kallsyms.h>

#include "patch.h"
+#include "transition.h"

/*
- * The klp_mutex protects the global lists and state transitions of any
- * structure reachable from them. References to any structure must be obtained
- * under mutex protection (except in klp_ftrace_handler(), which uses RCU to
- * ensure it gets consistent data).
+ * The klp_mutex is a coarse lock which serializes access to klp data. All
+ * accesses to klp-related variables and structures must have mutex protection,
+ * except within the following functions which carefully avoid the need for it:
+ *
+ * - klp_ftrace_handler()
+ * - klp_update_task_universe()
*/
-static DEFINE_MUTEX(klp_mutex);
+DEFINE_MUTEX(klp_mutex);

static LIST_HEAD(klp_patches);

@@ -67,7 +70,6 @@ static void klp_find_object_module(struct klp_object *obj)
mutex_unlock(&module_mutex);
}

-/* klp_mutex must be held by caller */
static bool klp_is_patch_registered(struct klp_patch *patch)
{
struct klp_patch *mypatch;
@@ -285,18 +287,17 @@ static int klp_write_object_relocations(struct module *pmod,

static int __klp_disable_patch(struct klp_patch *patch)
{
- struct klp_object *obj;
+ if (klp_transition_patch)
+ return -EBUSY;

/* enforce stacking: only the last enabled patch can be disabled */
if (!list_is_last(&patch->list, &klp_patches) &&
list_next_entry(patch, list)->enabled)
return -EBUSY;

- pr_notice("disabling patch '%s'\n", patch->mod->name);
-
- for (obj = patch->objs; obj->funcs; obj++)
- if (obj->patched)
- klp_unpatch_object(obj);
+ klp_init_transition(patch, KLP_UNIVERSE_NEW);
+ klp_start_transition(KLP_UNIVERSE_OLD);
+ klp_try_complete_transition();

patch->enabled = 0;

@@ -340,6 +341,9 @@ static int __klp_enable_patch(struct klp_patch *patch)
struct klp_object *obj;
int ret;

+ if (klp_transition_patch)
+ return -EBUSY;
+
if (WARN_ON(patch->enabled))
return -EINVAL;

@@ -351,7 +355,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);

- pr_notice("enabling patch '%s'\n", patch->mod->name);
+ klp_init_transition(patch, KLP_UNIVERSE_OLD);

for (obj = patch->objs; obj->funcs; obj++) {
klp_find_object_module(obj);
@@ -360,17 +364,24 @@ static int __klp_enable_patch(struct klp_patch *patch)
continue;

ret = klp_patch_object(obj);
- if (ret)
- goto unregister;
+ if (ret) {
+ pr_warn("failed to enable patch '%s'\n",
+ patch->mod->name);
+
+ klp_unpatch_objects(patch);
+ klp_complete_transition();
+
+ return ret;
+ }
}

+ klp_start_transition(KLP_UNIVERSE_NEW);
+
+ klp_try_complete_transition();
+
patch->enabled = 1;

return 0;
-
-unregister:
- WARN_ON(__klp_disable_patch(patch));
- return ret;
}

/**
@@ -407,6 +418,7 @@ EXPORT_SYMBOL_GPL(klp_enable_patch);
* /sys/kernel/livepatch
* /sys/kernel/livepatch/<patch>
* /sys/kernel/livepatch/<patch>/enabled
+ * /sys/kernel/livepatch/<patch>/transition
* /sys/kernel/livepatch/<patch>/<object>
* /sys/kernel/livepatch/<patch>/<object>/<func>
*/
@@ -435,7 +447,9 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
goto err;
}

- if (val) {
+ if (klp_transition_patch == patch) {
+ klp_reverse_transition();
+ } else if (val) {
ret = __klp_enable_patch(patch);
if (ret)
goto err;
@@ -463,9 +477,21 @@ static ssize_t enabled_show(struct kobject *kobj,
return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->enabled);
}

+static ssize_t transition_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ struct klp_patch *patch;
+
+ patch = container_of(kobj, struct klp_patch, kobj);
+ return snprintf(buf, PAGE_SIZE-1, "%d\n",
+ klp_transition_patch == patch);
+}
+
static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
+static struct kobj_attribute transition_kobj_attr = __ATTR_RO(transition);
static struct attribute *klp_patch_attrs[] = {
&enabled_kobj_attr.attr,
+ &transition_kobj_attr.attr,
NULL
};

@@ -543,6 +569,7 @@ static int klp_init_func(struct klp_object *obj, struct klp_func *func)
{
INIT_LIST_HEAD(&func->stack_node);
func->patched = 0;
+ func->transition = 0;

return kobject_init_and_add(&func->kobj, &klp_ktype_func,
obj->kobj, func->old_name);
@@ -725,7 +752,7 @@ static void klp_module_notify_coming(struct klp_patch *patch,
if (ret)
goto err;

- if (!patch->enabled)
+ if (!patch->enabled && klp_transition_patch != patch)
return;

pr_notice("applying patch '%s' to loading module '%s'\n",
@@ -746,7 +773,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
struct module *pmod = patch->mod;
struct module *mod = obj->mod;

- if (!patch->enabled)
+ if (!patch->enabled && klp_transition_patch != patch)
goto free;

pr_notice("reverting patch '%s' on unloading module '%s'\n",
diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
index 281fbca..f12256b 100644
--- a/kernel/livepatch/patch.c
+++ b/kernel/livepatch/patch.c
@@ -24,6 +24,7 @@
#include <linux/slab.h>

#include "patch.h"
+#include "transition.h"

static LIST_HEAD(klp_ops);

@@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
ops = container_of(fops, struct klp_ops, fops);

rcu_read_lock();
+
func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
stack_node);
- rcu_read_unlock();

if (WARN_ON_ONCE(!func))
- return;
+ goto unlock;
+
+ if (unlikely(func->transition)) {
+ /* corresponding smp_wmb() is in klp_init_transition() */
+ smp_rmb();
+
+ if (current->klp_universe == KLP_UNIVERSE_OLD) {
+ /*
+ * Use the previously patched version of the function.
+ * If no previous patches exist, use the original
+ * function.
+ */
+ func = list_entry_rcu(func->stack_node.next,
+ struct klp_func, stack_node);
+
+ if (&func->stack_node == &ops->func_stack)
+ goto unlock;
+ }
+ }

klp_arch_set_pc(regs, (unsigned long)func->new_func);
+unlock:
+ rcu_read_unlock();
}

struct klp_ops *klp_find_ops(unsigned long old_addr)
@@ -174,3 +195,12 @@ int klp_patch_object(struct klp_object *obj)

return 0;
}
+
+void klp_unpatch_objects(struct klp_patch *patch)
+{
+ struct klp_object *obj;
+
+ for (obj = patch->objs; obj->funcs; obj++)
+ if (obj->patched)
+ klp_unpatch_object(obj);
+}
diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
index bb34bd3..1648259 100644
--- a/kernel/livepatch/patch.h
+++ b/kernel/livepatch/patch.h
@@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);

extern int klp_patch_object(struct klp_object *obj);
extern void klp_unpatch_object(struct klp_object *obj);
+extern void klp_unpatch_objects(struct klp_patch *patch);
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
new file mode 100644
index 0000000..2630296
--- /dev/null
+++ b/kernel/livepatch/transition.c
@@ -0,0 +1,300 @@
+/*
+ * transition.c - Kernel Live Patching transition functions
+ *
+ * Copyright (C) 2015 Josh Poimboeuf <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/cpu.h>
+#include <asm/stacktrace.h>
+#include "../sched/sched.h"
+
+#include "patch.h"
+#include "transition.h"
+
+static void klp_transition_work_fn(struct work_struct *);
+static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);
+
+struct klp_patch *klp_transition_patch;
+
+int klp_universe_goal = KLP_UNIVERSE_UNDEFINED;
+
+static void klp_set_universe_goal(int universe)
+{
+ klp_universe_goal = universe;
+
+ /* corresponding smp_rmb() is in klp_update_task_universe() */
+ smp_wmb();
+}
+
+/*
+ * The transition to the universe goal is complete. Clean up the data
+ * structures.
+ */
+void klp_complete_transition(void)
+{
+ struct klp_object *obj;
+ struct klp_func *func;
+
+ for (obj = klp_transition_patch->objs; obj->funcs; obj++)
+ for (func = obj->funcs; func->old_name; func++)
+ func->transition = 0;
+
+ klp_transition_patch = NULL;
+}
+
+static int klp_stacktrace_address_verify_func(struct klp_func *func,
+ unsigned long address)
+{
+ unsigned long func_addr, func_size;
+
+ if (klp_universe_goal == KLP_UNIVERSE_OLD) {
+ /* check the to-be-unpatched function (the func itself) */
+ func_addr = (unsigned long)func->new_func;
+ func_size = func->new_size;
+ } else {
+ /* check the to-be-patched function (previous func) */
+ struct klp_ops *ops;
+
+ ops = klp_find_ops(func->old_addr);
+
+ if (list_is_singular(&ops->func_stack)) {
+ /* original function */
+ func_addr = func->old_addr;
+ func_size = func->old_size;
+ } else {
+ /* previously patched function */
+ struct klp_func *prev;
+
+ prev = list_next_entry(func, stack_node);
+ func_addr = (unsigned long)prev->new_func;
+ func_size = prev->new_size;
+ }
+ }
+
+ if (address >= func_addr && address < func_addr + func_size)
+ return -1;
+
+ return 0;
+}
+
+/*
+ * Determine whether the given return address on the stack is within a
+ * to-be-patched or to-be-unpatched function.
+ */
+static void klp_stacktrace_address_verify(void *data, unsigned long address,
+ int reliable)
+{
+ struct klp_object *obj;
+ struct klp_func *func;
+ int *ret = data;
+
+ if (*ret)
+ return;
+
+ for (obj = klp_transition_patch->objs; obj->funcs; obj++) {
+ if (!obj->patched)
+ continue;
+ for (func = obj->funcs; func->old_name; func++) {
+ if (klp_stacktrace_address_verify_func(func, address)) {
+ *ret = -1;
+ return;
+ }
+ }
+ }
+}
+
+static int klp_stacktrace_stack(void *data, char *name)
+{
+ return 0;
+}
+
+static const struct stacktrace_ops klp_stacktrace_ops = {
+ .address = klp_stacktrace_address_verify,
+ .stack = klp_stacktrace_stack,
+ .walk_stack = print_context_stack_bp,
+};
+
+/*
+ * Try to safely transition a task to the universe goal. If the task is
+ * currently running or is sleeping on a to-be-patched or to-be-unpatched
+ * function, return false.
+ */
+static bool klp_transition_task(struct task_struct *t)
+{
+ struct rq *rq;
+ unsigned long flags;
+ int ret;
+ bool success = false;
+
+ if (t->klp_universe == klp_universe_goal)
+ return true;
+
+ rq = task_rq_lock(t, &flags);
+
+ if (task_running(rq, t) && t != current) {
+ pr_debug("%s: pid %d (%s) is running\n", __func__, t->pid,
+ t->comm);
+ goto done;
+ }
+
+ ret = 0;
+ dump_trace(t, NULL, NULL, 0, &klp_stacktrace_ops, &ret);
+ if (ret) {
+ pr_debug("%s: pid %d (%s) is sleeping on a patched function\n",
+ __func__, t->pid, t->comm);
+ goto done;
+ }
+
+ klp_update_task_universe(t);
+
+ success = true;
+done:
+ task_rq_unlock(rq, t, &flags);
+ return success;
+}
+
+/*
+ * Try to transition all tasks to the universe goal. If any tasks are still
+ * stuck in the original universe, schedule a retry.
+ */
+void klp_try_complete_transition(void)
+{
+ unsigned int cpu;
+ struct task_struct *g, *t;
+ bool complete = true;
+
+ /* try to transition all normal tasks */
+ read_lock(&tasklist_lock);
+ for_each_process_thread(g, t)
+ if (!klp_transition_task(t))
+ complete = false;
+ read_unlock(&tasklist_lock);
+
+ /* try to transition the idle "swapper" tasks */
+ get_online_cpus();
+ for_each_online_cpu(cpu)
+ if (!klp_transition_task(idle_task(cpu)))
+ complete = false;
+ put_online_cpus();
+
+ /* if not complete, try again later */
+ if (!complete) {
+ schedule_delayed_work(&klp_transition_work,
+ round_jiffies_relative(HZ));
+ return;
+ }
+
+ /* success! unpatch obsolete functions and do some cleanup */
+
+ if (klp_universe_goal == KLP_UNIVERSE_OLD) {
+ klp_unpatch_objects(klp_transition_patch);
+
+ /* prevent ftrace handler from reading old func->transition */
+ synchronize_rcu();
+ }
+
+ pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
+ klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
+ "unpatching");
+
+ klp_complete_transition();
+}
+
+static void klp_transition_work_fn(struct work_struct *work)
+{
+ mutex_lock(&klp_mutex);
+
+ if (klp_transition_patch)
+ klp_try_complete_transition();
+
+ mutex_unlock(&klp_mutex);
+}
+
+/*
+ * Start the transition to the specified universe so tasks can begin switching
+ * to it.
+ */
+void klp_start_transition(int universe)
+{
+ if (WARN_ON(klp_universe_goal == universe))
+ return;
+
+ pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
+ universe == KLP_UNIVERSE_NEW ? "patching" : "unpatching");
+
+ klp_set_universe_goal(universe);
+}
+
+/*
+ * Can be called in the middle of an existing transition to reverse the
+ * direction of the universe goal. This can be done to effectively cancel an
+ * existing enable or disable operation if there are any tasks which are stuck
+ * in the original universe.
+ */
+void klp_reverse_transition(void)
+{
+ struct klp_patch *patch = klp_transition_patch;
+
+ klp_start_transition(!klp_universe_goal);
+ klp_try_complete_transition();
+
+ patch->enabled = !patch->enabled;
+}
+
+/*
+ * Reset the universe goal and all tasks to the starting universe, and set all
+ * func->transition's to 1 to prepare for patching.
+ */
+void klp_init_transition(struct klp_patch *patch, int universe)
+{
+ struct task_struct *g, *t;
+ unsigned int cpu;
+ struct klp_object *obj;
+ struct klp_func *func;
+
+ klp_transition_patch = patch;
+
+ /*
+ * If the previous transition was in the opposite direction, we may
+ * already be in the requested initial universe.
+ */
+ if (klp_universe_goal == universe)
+ goto init_funcs;
+
+ klp_set_universe_goal(universe);
+
+ /* init all normal task universes */
+ read_lock(&tasklist_lock);
+ for_each_process_thread(g, t)
+ klp_update_task_universe(t);
+ read_unlock(&tasklist_lock);
+
+ /* init all idle "swapper" task universes */
+ get_online_cpus();
+ for_each_online_cpu(cpu)
+ klp_update_task_universe(idle_task(cpu));
+ put_online_cpus();
+
+init_funcs:
+ /* corresponding smp_rmb() is in klp_ftrace_handler() */
+ smp_wmb();
+
+ for (obj = patch->objs; obj->funcs; obj++)
+ for (func = obj->funcs; func->old_name; func++)
+ func->transition = 1;
+}
diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
new file mode 100644
index 0000000..ba9a55c
--- /dev/null
+++ b/kernel/livepatch/transition.h
@@ -0,0 +1,16 @@
+#include <linux/livepatch.h>
+
+enum {
+ KLP_UNIVERSE_UNDEFINED = -1,
+ KLP_UNIVERSE_OLD,
+ KLP_UNIVERSE_NEW,
+};
+
+extern struct mutex klp_mutex;
+extern struct klp_patch *klp_transition_patch;
+
+extern void klp_init_transition(struct klp_patch *patch, int universe);
+extern void klp_start_transition(int universe);
+extern void klp_reverse_transition(void);
+extern void klp_try_complete_transition(void);
+extern void klp_complete_transition(void);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 78d91e6..7b877f4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -74,6 +74,7 @@
#include <linux/binfmts.h>
#include <linux/context_tracking.h>
#include <linux/compiler.h>
+#include <linux/livepatch.h>

#include <asm/switch_to.h>
#include <asm/tlb.h>
@@ -4601,6 +4602,7 @@ void init_idle(struct task_struct *idle, int cpu)
#if defined(CONFIG_SMP)
sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
#endif
+ klp_update_task_universe(idle);
}

int cpuset_cpumask_can_shrink(const struct cpumask *cur,
--
2.1.0

2015-02-09 17:32:26

by Josh Poimboeuf

Subject: [RFC PATCH 7/9] proc: add /proc/<pid>/universe to show livepatch status

Expose the per-task klp_universe value so users can determine which
tasks are holding up completion of a patching operation.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
fs/proc/base.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 3f3d7ae..b9fe6b5 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2528,6 +2528,14 @@ static int proc_pid_personality(struct seq_file *m, struct pid_namespace *ns,
return err;
}

+#ifdef CONFIG_LIVEPATCH
+static int proc_pid_klp_universe(struct seq_file *m, struct pid_namespace *ns,
+ struct pid *pid, struct task_struct *task)
+{
+ return seq_printf(m, "%d\n", task->klp_universe);
+}
+#endif /* CONFIG_LIVEPATCH */
+
/*
* Thread groups
*/
@@ -2628,6 +2636,9 @@ static const struct pid_entry tgid_base_stuff[] = {
#ifdef CONFIG_CHECKPOINT_RESTORE
REG("timers", S_IRUGO, proc_timers_operations),
#endif
+#ifdef CONFIG_LIVEPATCH
+ ONE("universe", S_IRUGO, proc_pid_klp_universe),
+#endif
};

static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
--
2.1.0

2015-02-09 17:31:59

by Josh Poimboeuf

Subject: [RFC PATCH 8/9] livepatch: allow patch modules to be removed

Now that we have a consistency model, we can detect when unpatching is
complete and the patch module can be safely removed.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
kernel/livepatch/core.c | 25 ++++---------------------
kernel/livepatch/transition.c | 3 +++
2 files changed, 7 insertions(+), 21 deletions(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 790dc10..e572523 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -352,6 +352,9 @@ static int __klp_enable_patch(struct klp_patch *patch)
!list_prev_entry(patch, list)->enabled)
return -EBUSY;

+ if (!try_module_get(patch->mod))
+ return -ENODEV;
+
pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);

@@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {

static void klp_kobj_release_patch(struct kobject *kobj)
{
- /*
- * Once we have a consistency model we'll need to module_put() the
- * patch module here. See klp_register_patch() for more details.
- */
}

static struct kobj_type klp_ktype_patch = {
@@ -715,29 +714,13 @@ EXPORT_SYMBOL_GPL(klp_unregister_patch);
*/
int klp_register_patch(struct klp_patch *patch)
{
- int ret;
-
if (!klp_initialized())
return -ENODEV;

if (!patch || !patch->mod)
return -EINVAL;

- /*
- * A reference is taken on the patch module to prevent it from being
- * unloaded. Right now, we don't allow patch modules to unload since
- * there is currently no method to determine if a thread is still
- * running in the patched code contained in the patch module once
- * the ftrace registration is successful.
- */
- if (!try_module_get(patch->mod))
- return -ENODEV;
-
- ret = klp_init_patch(patch);
- if (ret)
- module_put(patch->mod);
-
- return ret;
+ return klp_init_patch(patch);
}
EXPORT_SYMBOL_GPL(klp_register_patch);

diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 2630296..20fafd2 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -54,6 +54,9 @@ void klp_complete_transition(void)
for (func = obj->funcs; func->old_name; func++)
func->transition = 0;

+ if (klp_universe_goal == KLP_UNIVERSE_OLD)
+ module_put(klp_transition_patch->mod);
+
klp_transition_patch = NULL;
}

--
2.1.0

2015-02-09 17:31:42

by Josh Poimboeuf

[permalink] [raw]
Subject: [RFC PATCH 9/9] livepatch: update task universe when exiting kernel

Update a task's universe when returning from a system call or user
space interrupt, or after handling a signal.

This greatly increases the chances of a patch operation succeeding. If
a task is I/O bound, it can switch universes when returning from a
system call. If a task is CPU bound, it can switch universes when
returning from an interrupt. If a task is sleeping on a to-be-patched
function, the user can send SIGSTOP and SIGCONT to force it to switch.
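 
As a purely illustrative sketch, a userspace helper could deliver that
signal pair to a stuck task (the pid is taken from the command line):

  #include <signal.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/types.h>

  int main(int argc, char **argv)
  {
          pid_t pid;

          if (argc != 2) {
                  fprintf(stderr, "usage: %s <pid>\n", argv[0]);
                  return 1;
          }
          pid = atoi(argv[1]);

          /*
           * Stop and immediately continue the task: it then passes through
           * the signal code and do_notify_resume() switches its universe.
           */
          if (kill(pid, SIGSTOP) || kill(pid, SIGCONT)) {
                  perror("kill");
                  return 1;
          }

          return 0;
  }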

Since the idle "swapper" tasks don't ever exit the kernel, they're
updated from within the idle loop.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
arch/x86/include/asm/thread_info.h | 4 +++-
arch/x86/kernel/signal.c | 4 ++++
include/linux/livepatch.h | 2 ++
kernel/livepatch/transition.c | 15 +++++++++++++++
kernel/sched/idle.c | 4 ++++
5 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 547e344..4e46d36 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -78,6 +78,7 @@ struct thread_info {
#define TIF_MCE_NOTIFY 10 /* notify userspace of an MCE */
#define TIF_USER_RETURN_NOTIFY 11 /* notify kernel of userspace return */
#define TIF_UPROBE 12 /* breakpointed or singlestepping */
+#define TIF_KLP_NEED_UPDATE 13 /* pending live patching update */
#define TIF_NOTSC 16 /* TSC is not accessible in userland */
#define TIF_IA32 17 /* IA32 compatibility process */
#define TIF_FORK 18 /* ret_from_fork */
@@ -102,6 +103,7 @@ struct thread_info {
#define _TIF_SECCOMP (1 << TIF_SECCOMP)
#define _TIF_MCE_NOTIFY (1 << TIF_MCE_NOTIFY)
#define _TIF_USER_RETURN_NOTIFY (1 << TIF_USER_RETURN_NOTIFY)
+#define _TIF_KLP_NEED_UPDATE (1 << TIF_KLP_NEED_UPDATE)
#define _TIF_UPROBE (1 << TIF_UPROBE)
#define _TIF_NOTSC (1 << TIF_NOTSC)
#define _TIF_IA32 (1 << TIF_IA32)
@@ -141,7 +143,7 @@ struct thread_info {
/* Only used for 64 bit */
#define _TIF_DO_NOTIFY_MASK \
(_TIF_SIGPENDING | _TIF_MCE_NOTIFY | _TIF_NOTIFY_RESUME | \
- _TIF_USER_RETURN_NOTIFY | _TIF_UPROBE)
+ _TIF_USER_RETURN_NOTIFY | _TIF_UPROBE | _TIF_KLP_NEED_UPDATE)

/* flags to check in __switch_to() */
#define _TIF_WORK_CTXSW \
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index ed37a76..1d4b8e6 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -23,6 +23,7 @@
#include <linux/user-return-notifier.h>
#include <linux/uprobes.h>
#include <linux/context_tracking.h>
+#include <linux/livepatch.h>

#include <asm/processor.h>
#include <asm/ucontext.h>
@@ -760,6 +761,9 @@ do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
if (thread_info_flags & _TIF_USER_RETURN_NOTIFY)
fire_user_return_notifiers();

+ if (unlikely(thread_info_flags & _TIF_KLP_NEED_UPDATE))
+ klp_update_task_universe(current);
+
user_enter();
}

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index b8c2f15..14f6a96 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -134,6 +134,8 @@ extern int klp_universe_goal;

static inline void klp_update_task_universe(struct task_struct *t)
{
+ clear_tsk_thread_flag(t, TIF_KLP_NEED_UPDATE);
+
/* corresponding smp_wmb() is in klp_set_universe_goal() */
smp_rmb();

diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 20fafd2..dac8ea5 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -234,6 +234,9 @@ static void klp_transition_work_fn(struct work_struct *work)
*/
void klp_start_transition(int universe)
{
+ struct task_struct *g, *t;
+ unsigned int cpu;
+
if (WARN_ON(klp_universe_goal == universe))
return;

@@ -241,6 +244,18 @@ void klp_start_transition(int universe)
universe == KLP_UNIVERSE_NEW ? "patching" : "unpatching");

klp_set_universe_goal(universe);
+
+ /* mark all normal tasks as needing a universe update */
+ read_lock(&tasklist_lock);
+ for_each_process_thread(g, t)
+ set_tsk_thread_flag(t, TIF_KLP_NEED_UPDATE);
+ read_unlock(&tasklist_lock);
+
+ /* mark all idle "swapper" tasks as needing a universe update */
+ get_online_cpus();
+ for_each_online_cpu(cpu)
+ set_tsk_thread_flag(idle_task(cpu), TIF_KLP_NEED_UPDATE);
+ put_online_cpus();
}

/*
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index c47fce7..c1390b6 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -7,6 +7,7 @@
#include <linux/tick.h>
#include <linux/mm.h>
#include <linux/stackprotector.h>
+#include <linux/livepatch.h>

#include <asm/tlb.h>

@@ -250,6 +251,9 @@ static void cpu_idle_loop(void)

sched_ttwu_pending();
schedule_preempt_disabled();
+
+ if (unlikely(test_thread_flag(TIF_KLP_NEED_UPDATE)))
+ klp_update_task_universe(current);
}
}

--
2.1.0

2015-02-09 23:15:25

by Jiri Kosina

[permalink] [raw]
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> This patch set implements a livepatch consistency model, targeted for 3.21.
> Now that we have a solid livepatch code base, this is the biggest remaining
> missing piece.

Hi Josh,

first, thanks a lot for putting this together. From a cursory look it
certainly seems to be a very solid base for future steps.

I am afraid I won't get to a proper review before the merge window concludes
though. But after that it gets moved to the top of my TODO list.

> This code stems from the design proposal made by Vojtech [1] in November. It
> makes live patching safer in general. Specifically, it allows you to apply
> patches which change function prototypes. It also lays the groundwork for
> future code changes which will enable data and data semantic changes.
>
> It's basically a hybrid of kpatch and kGraft, combining kpatch's backtrace
> checking with kGraft's per-task consistency. When patching, tasks are
> carefully transitioned from the old universe to the new universe. A task can
> only be switched to the new universe if it's not using a function that is to be
> patched or unpatched. After all tasks have moved to the new universe, the
> patching process is complete.
>
> How it transitions various tasks to the new universe:
>
> - The stacks of all sleeping tasks are checked. Each task that is not sleeping
> on a to-be-patched function is switched.
>
> - Other user tasks are handled by do_notify_resume() (see patch 9/9). If a
> task is I/O bound, it switches universes when returning from a system call.
> If it's CPU bound, it switches when returning from an interrupt.

Just one rather minor comment to this -- we can actually switch CPU-bound
processes "immediately" when we notice they are running in userspace
(assuming that we are also migrating them when they are entering the
kernel as well ... which doesn't seem to be implemented by this patchset,
but that could be easily added at low cost).

Relying on IRQs is problematic, because you can have a CPU completely
isolated from both the scheduler and IRQs (that's what realtime folks are
doing routinely), so you don't see an IRQ on that particular CPU for ages.

The way to detect whether a given CPU is running in userspace (without
interfering with it too much by, say, sending a costly IPI) is rather
tricky though. On kernels with CONFIG_CONTEXT_TRACKING we could make use
of that feature, but my gut feeling is that most people keep that
disabled.

Another alternative is what we are doing in kgraft with
kgr_needs_lazy_migration(), but admittedly that's very far from being
pretty.

--
Jiri Kosina
SUSE Labs

2015-02-10 03:06:00

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

On Tue, Feb 10, 2015 at 12:15:21AM +0100, Jiri Kosina wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> > This patch set implements a livepatch consistency model, targeted for 3.21.
> > Now that we have a solid livepatch code base, this is the biggest remaining
> > missing piece.
>
> Hi Josh,
>
> first, thanks a lot for putting this together. From a cursory look it
> certainly seems to be a very solid base for future steps.
>
> I am afraid I won't get to a proper review before the merge window concludes
> though. But after that it gets moved to the top of my TODO list.

No problem. Sorry for the inconvenient timing...

> > This code stems from the design proposal made by Vojtech [1] in November. It
> > makes live patching safer in general. Specifically, it allows you to apply
> > patches which change function prototypes. It also lays the groundwork for
> > future code changes which will enable data and data semantic changes.
> >
> > It's basically a hybrid of kpatch and kGraft, combining kpatch's backtrace
> > checking with kGraft's per-task consistency. When patching, tasks are
> > carefully transitioned from the old universe to the new universe. A task can
> > only be switched to the new universe if it's not using a function that is to be
> > patched or unpatched. After all tasks have moved to the new universe, the
> > patching process is complete.
> >
> > How it transitions various tasks to the new universe:
> >
> > - The stacks of all sleeping tasks are checked. Each task that is not sleeping
> > on a to-be-patched function is switched.
> >
> > - Other user tasks are handled by do_notify_resume() (see patch 9/9). If a
> > task is I/O bound, it switches universes when returning from a system call.
> > If it's CPU bound, it switches when returning from an interrupt.
>
> Just one rather minor comment to this -- we can actually switch CPU-bound
> processes "immediately" when we notice they are running in userspace
> (assuming that we are also migrating them when they are entering the
> kernel as well ... which doesn't seem to be implemented by this patchset,
> but that could be easily added at low cost).

We could, but I guess the trick is figuring out how to tell if the task
is in user space. But anyway, I don't really see why it would be
necessary.

> Relying on IRQs is problematic, because you can have a CPU completely
> isolated from both the scheduler and IRQs (that's what realtime folks are
> doing routinely), so you don't see an IRQ on that particular CPU for ages.

It doesn't _rely_ on IRQs, it's just another tool in the kit to help
tasks converge quickly. The front line of attack is backtrace checking
of sleeping tasks. Then it uses system call switching and IRQs as the
next wave of attack, with signals as the last resort. So you can still
fall back on sending signals if needed.

> The way to detect whether a given CPU is running in userspace (without
> interfering with it too much by, say, sending a costly IPI) is rather
> tricky though. On kernels with CONFIG_CONTEXT_TRACKING we could make use
> of that feature, but my gut feeling is that most people keep that
> disabled.

Yeah, that seems to be related to nohz. I think we'd have to have it
enabled 100% of the time on all CPUs, even when not patching. Sounds
like a lot of unnecessary overhead (unless the user already has it
enabled on all CPUs).

> Another alternative is what we are doing in kgraft with
> kgr_needs_lazy_migration(), but admittedly that's very far from being
> pretty.

Hm, is it really safe to read a stack while the task could be writing to
it?

--
Josh

2015-02-10 07:21:40

by Jiri Kosina

[permalink] [raw]
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> > The way to detect whether a given CPU is running in userspace
> > (without interfering with it too much by, say, sending a costly IPI)
> > is rather tricky though. On kernels with CONFIG_CONTEXT_TRACKING we
> > could make use of that feature, but my gut feeling is that most people
> > keep that disabled.

> Yeah, that seems to be related to nohz. I think we'd have to have it
> enabled 100% of the time on all CPUs, even when not patching. Sounds
> like a lot of unnecessary overhead (unless the user already has it
> enabled on all CPUs).

Agreed, we could make use of it when it's enabled in kernel config anyway,
but it would be impractical for us to hard require it.

> > Another alternative is what we are doing in kgraft with
> > kgr_needs_lazy_migration(), but admittedly that's very far from being
> > pretty.
>
> Hm, is it really safe to read a stack while the task could be writing to
> it?

It might indeed look like that at first sight :) but let's look at the
possible race scenarios:

(1) task is running in userspace when you start looking at its kernel
stack, and while you are examining it, it enters the kernel. That's
not a problem, because no matter what verdict kgr_needs_lazy_migration()
yields, the migration to the new universe happens during kernel entry
anyway

(2) task is actively running in kernelspace. There is no way for
print_context_stack() to return such a small number of nr_entries.
The stack context might be bogus due to the race, but it always
starts at a valid bp which can't be that low.

(3) task is running in kernelspace, but is about to exit to userspace, and
looking at the kernel stack races with this. That's again not a
problem, because no matter what verdict kgr_needs_lazy_migration()
yields, the migration to the new universe happens during kernel exit
anyway

So I agree that this is ugly as hell, and depends on the architecture-specific
implementation of print_context_stack(); but architectures are free to
give up this optimization if it can't be used.
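 
The rough shape of the check is something like this (purely illustrative,
not the actual kGraft code, and using save_stack_trace_tsk() instead of
open-coding the stack walk; the frame-count threshold is a made-up number):

  #include <linux/kernel.h>
  #include <linux/sched.h>
  #include <linux/stacktrace.h>

  /*
   * Guess whether a task is currently executing in userspace: if it is,
   * its saved kernel stack trace is trivially shallow.
   */
  static bool task_probably_in_userspace(struct task_struct *t)
  {
          unsigned long entries[4];
          struct stack_trace trace = {
                  .max_entries    = ARRAY_SIZE(entries),
                  .entries        = entries,
          };

          save_stack_trace_tsk(t, &trace);

          /* see scenarios (1)-(3) above for why the races are harmless */
          return trace.nr_entries <= 2;
  }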

But yes, we should be able to come up with something better if we want to
use this optimization upstream.

Thanks,

--
Jiri Kosina
SUSE Labs

2015-02-10 08:57:47

by Jiri Kosina

[permalink] [raw]
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> 2) As mentioned above, kthreads which are always sleeping on a patched function
> will never transition to the new universe. This is really a minor issue
> (less than 1% of patches). It's not necessarily something that needs to be
> resolved with this patch set, but it would be good to have some discussion
> about it regardless.
>
> To overcome this issue, I have 1/2 an idea: we could add some stack checking
> code to the ftrace handler itself to transition the kthread to the new
> universe after it re-enters the function it was originally sleeping on, if
> the stack doesn't already have have any other to-be-patched functions.
> Combined with the klp_transition_work_fn()'s periodic stack checking of
> sleeping tasks, that would handle most of the cases (except when trying to
> patch the high-level thread_fn itself).
>
> But then how do you make the kthread wake up? As far as I can tell,
> wake_up_process() doesn't seem to work on a kthread (unless I messed up my
> testing somehow). What does kGraft do in this case?

wake_up_process() really should work for (p->flags & PF_KTHREAD)
task_struct. What was your testing scenario?

--
Jiri Kosina
SUSE Labs

Subject: Re: [RFC PATCH 5/9] sched: move task rq locking functions to sched.h

(2015/02/10 2:31), Josh Poimboeuf wrote:
> Move task_rq_lock/unlock() to sched.h so they can be used elsewhere.
> The livepatch code needs to lock each task's rq in order to safely
> examine its stack and switch it to a new patch universe.

Hmm, why don't you just expose (extern in sched.h) those?

Thank you,

>
> Signed-off-by: Josh Poimboeuf <[email protected]>
> ---
> kernel/sched/core.c | 32 --------------------------------
> kernel/sched/sched.h | 33 +++++++++++++++++++++++++++++++++
> 2 files changed, 33 insertions(+), 32 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index b5797b7..78d91e6 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -326,44 +326,12 @@ static inline struct rq *__task_rq_lock(struct task_struct *p)
> }
> }
>
> -/*
> - * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
> - */
> -static struct rq *task_rq_lock(struct task_struct *p, unsigned long *flags)
> - __acquires(p->pi_lock)
> - __acquires(rq->lock)
> -{
> - struct rq *rq;
> -
> - for (;;) {
> - raw_spin_lock_irqsave(&p->pi_lock, *flags);
> - rq = task_rq(p);
> - raw_spin_lock(&rq->lock);
> - if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
> - return rq;
> - raw_spin_unlock(&rq->lock);
> - raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
> -
> - while (unlikely(task_on_rq_migrating(p)))
> - cpu_relax();
> - }
> -}
> -
> static void __task_rq_unlock(struct rq *rq)
> __releases(rq->lock)
> {
> raw_spin_unlock(&rq->lock);
> }
>
> -static inline void
> -task_rq_unlock(struct rq *rq, struct task_struct *p, unsigned long *flags)
> - __releases(rq->lock)
> - __releases(p->pi_lock)
> -{
> - raw_spin_unlock(&rq->lock);
> - raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
> -}
> -
> /*
> * this_rq_lock - lock this runqueue and disable interrupts.
> */
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 9a2a45c..ae514c9 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1542,6 +1542,39 @@ static inline void double_rq_unlock(struct rq *rq1, struct rq *rq2)
>
> #endif
>
> +/*
> + * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
> + */
> +static inline struct rq *task_rq_lock(struct task_struct *p,
> + unsigned long *flags)
> + __acquires(p->pi_lock)
> + __acquires(rq->lock)
> +{
> + struct rq *rq;
> +
> + for (;;) {
> + raw_spin_lock_irqsave(&p->pi_lock, *flags);
> + rq = task_rq(p);
> + raw_spin_lock(&rq->lock);
> + if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
> + return rq;
> + raw_spin_unlock(&rq->lock);
> + raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
> +
> + while (unlikely(task_on_rq_migrating(p)))
> + cpu_relax();
> + }
> +}
> +
> +static inline void task_rq_unlock(struct rq *rq, struct task_struct *p,
> + unsigned long *flags)
> + __releases(rq->lock)
> + __releases(p->pi_lock)
> +{
> + raw_spin_unlock(&rq->lock);
> + raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
> +}
> +
> extern struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq);
> extern struct sched_entity *__pick_last_entity(struct cfs_rq *cfs_rq);
> extern void print_cfs_stats(struct seq_file *m, int cpu);
>


--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

(2015/02/10 2:31), Josh Poimboeuf wrote:
> Add a basic per-task consistency model. This is the foundation which
> will eventually enable us to patch those ~10% of security patches which
> change function prototypes and/or data semantics.
>
> When a patch is enabled, livepatch enters into a transition state where
> tasks are converging from the old universe to the new universe. If a
> given task isn't using any of the patched functions, it's switched to
> the new universe. Once all the tasks have been converged to the new
> universe, patching is complete.
>
> The same sequence occurs when a patch is disabled, except the tasks
> converge from the new universe to the old universe.
>
> The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> is in transition. Only a single patch (the topmost patch on the stack)
> can be in transition at a given time. A patch can remain in the
> transition state indefinitely, if any of the tasks are stuck in the
> previous universe.
>
> A transition can be reversed and effectively canceled by writing the
> opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> the transition is in progress. Then all the tasks will attempt to
> converge back to the original universe.
>
> Signed-off-by: Josh Poimboeuf <[email protected]>
> ---
> include/linux/livepatch.h | 18 ++-
> include/linux/sched.h | 3 +
> kernel/fork.c | 2 +
> kernel/livepatch/Makefile | 2 +-
> kernel/livepatch/core.c | 71 ++++++----
> kernel/livepatch/patch.c | 34 ++++-
> kernel/livepatch/patch.h | 1 +
> kernel/livepatch/transition.c | 300 ++++++++++++++++++++++++++++++++++++++++++
> kernel/livepatch/transition.h | 16 +++
> kernel/sched/core.c | 2 +
> 10 files changed, 423 insertions(+), 26 deletions(-)
> create mode 100644 kernel/livepatch/transition.c
> create mode 100644 kernel/livepatch/transition.h
>
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 0e65b4d..b8c2f15 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -40,6 +40,7 @@
> * @old_size: size of the old function
> * @new_size: size of the new function
> * @patched: the func has been added to the klp_ops list
> + * @transition: the func is currently being applied or reverted
> */
> struct klp_func {
> /* external */
> @@ -60,6 +61,7 @@ struct klp_func {
> struct list_head stack_node;
> unsigned long old_size, new_size;
> int patched;
> + int transition;
> };
>
> /**
> @@ -128,6 +130,20 @@ extern int klp_unregister_patch(struct klp_patch *);
> extern int klp_enable_patch(struct klp_patch *);
> extern int klp_disable_patch(struct klp_patch *);
>
> -#endif /* CONFIG_LIVEPATCH */
> +extern int klp_universe_goal;
> +
> +static inline void klp_update_task_universe(struct task_struct *t)
> +{
> + /* corresponding smp_wmb() is in klp_set_universe_goal() */
> + smp_rmb();
> +
> + t->klp_universe = klp_universe_goal;
> +}
> +
> +#else /* !CONFIG_LIVEPATCH */
> +
> +static inline void klp_update_task_universe(struct task_struct *t) {}
> +
> +#endif /* !CONFIG_LIVEPATCH */
>
> #endif /* _LINUX_LIVEPATCH_H_ */
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 8db31ef..a95e59a 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1701,6 +1701,9 @@ struct task_struct {
> #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
> unsigned long task_state_change;
> #endif
> +#ifdef CONFIG_LIVEPATCH
> + int klp_universe;
> +#endif
> };
>
> /* Future-safe accessor for struct task_struct's cpus_allowed. */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 4dc2dda..1dcbebe 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -74,6 +74,7 @@
> #include <linux/uprobes.h>
> #include <linux/aio.h>
> #include <linux/compiler.h>
> +#include <linux/livepatch.h>
>
> #include <asm/pgtable.h>
> #include <asm/pgalloc.h>
> @@ -1538,6 +1539,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> total_forks++;
> spin_unlock(&current->sighand->siglock);
> syscall_tracepoint_update(p);
> + klp_update_task_universe(p);
> write_unlock_irq(&tasklist_lock);
>
> proc_fork_connector(p);
> diff --git a/kernel/livepatch/Makefile b/kernel/livepatch/Makefile
> index e136dad..2b8bdb1 100644
> --- a/kernel/livepatch/Makefile
> +++ b/kernel/livepatch/Makefile
> @@ -1,3 +1,3 @@
> obj-$(CONFIG_LIVEPATCH) += livepatch.o
>
> -livepatch-objs := core.o patch.o
> +livepatch-objs := core.o patch.o transition.o
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 85d4ef7..790dc10 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -28,14 +28,17 @@
> #include <linux/kallsyms.h>
>
> #include "patch.h"
> +#include "transition.h"
>
> /*
> - * The klp_mutex protects the global lists and state transitions of any
> - * structure reachable from them. References to any structure must be obtained
> - * under mutex protection (except in klp_ftrace_handler(), which uses RCU to
> - * ensure it gets consistent data).
> + * The klp_mutex is a coarse lock which serializes access to klp data. All
> + * accesses to klp-related variables and structures must have mutex protection,
> + * except within the following functions which carefully avoid the need for it:
> + *
> + * - klp_ftrace_handler()
> + * - klp_update_task_universe()
> */
> -static DEFINE_MUTEX(klp_mutex);
> +DEFINE_MUTEX(klp_mutex);
>
> static LIST_HEAD(klp_patches);
>
> @@ -67,7 +70,6 @@ static void klp_find_object_module(struct klp_object *obj)
> mutex_unlock(&module_mutex);
> }
>
> -/* klp_mutex must be held by caller */
> static bool klp_is_patch_registered(struct klp_patch *patch)
> {
> struct klp_patch *mypatch;
> @@ -285,18 +287,17 @@ static int klp_write_object_relocations(struct module *pmod,
>
> static int __klp_disable_patch(struct klp_patch *patch)
> {
> - struct klp_object *obj;
> + if (klp_transition_patch)
> + return -EBUSY;
>
> /* enforce stacking: only the last enabled patch can be disabled */
> if (!list_is_last(&patch->list, &klp_patches) &&
> list_next_entry(patch, list)->enabled)
> return -EBUSY;
>
> - pr_notice("disabling patch '%s'\n", patch->mod->name);
> -
> - for (obj = patch->objs; obj->funcs; obj++)
> - if (obj->patched)
> - klp_unpatch_object(obj);
> + klp_init_transition(patch, KLP_UNIVERSE_NEW);
> + klp_start_transition(KLP_UNIVERSE_OLD);
> + klp_try_complete_transition();
>
> patch->enabled = 0;
>
> @@ -340,6 +341,9 @@ static int __klp_enable_patch(struct klp_patch *patch)
> struct klp_object *obj;
> int ret;
>
> + if (klp_transition_patch)
> + return -EBUSY;
> +
> if (WARN_ON(patch->enabled))
> return -EINVAL;
>
> @@ -351,7 +355,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
> pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
> add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
>
> - pr_notice("enabling patch '%s'\n", patch->mod->name);
> + klp_init_transition(patch, KLP_UNIVERSE_OLD);
>
> for (obj = patch->objs; obj->funcs; obj++) {
> klp_find_object_module(obj);
> @@ -360,17 +364,24 @@ static int __klp_enable_patch(struct klp_patch *patch)
> continue;
>
> ret = klp_patch_object(obj);
> - if (ret)
> - goto unregister;
> + if (ret) {
> + pr_warn("failed to enable patch '%s'\n",
> + patch->mod->name);
> +
> + klp_unpatch_objects(patch);
> + klp_complete_transition();
> +
> + return ret;
> + }
> }
>
> + klp_start_transition(KLP_UNIVERSE_NEW);
> +
> + klp_try_complete_transition();
> +
> patch->enabled = 1;
>
> return 0;
> -
> -unregister:
> - WARN_ON(__klp_disable_patch(patch));
> - return ret;
> }
>
> /**
> @@ -407,6 +418,7 @@ EXPORT_SYMBOL_GPL(klp_enable_patch);
> * /sys/kernel/livepatch
> * /sys/kernel/livepatch/<patch>
> * /sys/kernel/livepatch/<patch>/enabled
> + * /sys/kernel/livepatch/<patch>/transition
> * /sys/kernel/livepatch/<patch>/<object>
> * /sys/kernel/livepatch/<patch>/<object>/<func>
> */
> @@ -435,7 +447,9 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
> goto err;
> }
>
> - if (val) {
> + if (klp_transition_patch == patch) {
> + klp_reverse_transition();
> + } else if (val) {
> ret = __klp_enable_patch(patch);
> if (ret)
> goto err;
> @@ -463,9 +477,21 @@ static ssize_t enabled_show(struct kobject *kobj,
> return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->enabled);
> }
>
> +static ssize_t transition_show(struct kobject *kobj,
> + struct kobj_attribute *attr, char *buf)
> +{
> + struct klp_patch *patch;
> +
> + patch = container_of(kobj, struct klp_patch, kobj);
> + return snprintf(buf, PAGE_SIZE-1, "%d\n",
> + klp_transition_patch == patch);
> +}
> +
> static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
> +static struct kobj_attribute transition_kobj_attr = __ATTR_RO(transition);
> static struct attribute *klp_patch_attrs[] = {
> &enabled_kobj_attr.attr,
> + &transition_kobj_attr.attr,
> NULL
> };
>
> @@ -543,6 +569,7 @@ static int klp_init_func(struct klp_object *obj, struct klp_func *func)
> {
> INIT_LIST_HEAD(&func->stack_node);
> func->patched = 0;
> + func->transition = 0;
>
> return kobject_init_and_add(&func->kobj, &klp_ktype_func,
> obj->kobj, func->old_name);
> @@ -725,7 +752,7 @@ static void klp_module_notify_coming(struct klp_patch *patch,
> if (ret)
> goto err;
>
> - if (!patch->enabled)
> + if (!patch->enabled && klp_transition_patch != patch)
> return;
>
> pr_notice("applying patch '%s' to loading module '%s'\n",
> @@ -746,7 +773,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
> struct module *pmod = patch->mod;
> struct module *mod = obj->mod;
>
> - if (!patch->enabled)
> + if (!patch->enabled && klp_transition_patch != patch)
> goto free;
>
> pr_notice("reverting patch '%s' on unloading module '%s'\n",
> diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> index 281fbca..f12256b 100644
> --- a/kernel/livepatch/patch.c
> +++ b/kernel/livepatch/patch.c
> @@ -24,6 +24,7 @@
> #include <linux/slab.h>
>
> #include "patch.h"
> +#include "transition.h"
>
> static LIST_HEAD(klp_ops);
>
> @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> ops = container_of(fops, struct klp_ops, fops);
>
> rcu_read_lock();
> +
> func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> stack_node);
> - rcu_read_unlock();
>
> if (WARN_ON_ONCE(!func))
> - return;
> + goto unlock;
> +
> + if (unlikely(func->transition)) {
> + /* corresponding smp_wmb() is in klp_init_transition() */
> + smp_rmb();
> +
> + if (current->klp_universe == KLP_UNIVERSE_OLD) {
> + /*
> + * Use the previously patched version of the function.
> + * If no previous patches exist, use the original
> + * function.
> + */
> + func = list_entry_rcu(func->stack_node.next,
> + struct klp_func, stack_node);
> +
> + if (&func->stack_node == &ops->func_stack)
> + goto unlock;
> + }
> + }
>
> klp_arch_set_pc(regs, (unsigned long)func->new_func);
> +unlock:
> + rcu_read_unlock();
> }
>
> struct klp_ops *klp_find_ops(unsigned long old_addr)
> @@ -174,3 +195,12 @@ int klp_patch_object(struct klp_object *obj)
>
> return 0;
> }
> +
> +void klp_unpatch_objects(struct klp_patch *patch)
> +{
> + struct klp_object *obj;
> +
> + for (obj = patch->objs; obj->funcs; obj++)
> + if (obj->patched)
> + klp_unpatch_object(obj);
> +}
> diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> index bb34bd3..1648259 100644
> --- a/kernel/livepatch/patch.h
> +++ b/kernel/livepatch/patch.h
> @@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);
>
> extern int klp_patch_object(struct klp_object *obj);
> extern void klp_unpatch_object(struct klp_object *obj);
> +extern void klp_unpatch_objects(struct klp_patch *patch);
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> new file mode 100644
> index 0000000..2630296
> --- /dev/null
> +++ b/kernel/livepatch/transition.c
> @@ -0,0 +1,300 @@
> +/*
> + * transition.c - Kernel Live Patching transition functions
> + *
> + * Copyright (C) 2015 Josh Poimboeuf <[email protected]>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/cpu.h>
> +#include <asm/stacktrace.h>
> +#include "../sched/sched.h"
> +
> +#include "patch.h"
> +#include "transition.h"
> +
> +static void klp_transition_work_fn(struct work_struct *);
> +static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);
> +
> +struct klp_patch *klp_transition_patch;
> +
> +int klp_universe_goal = KLP_UNIVERSE_UNDEFINED;
> +
> +static void klp_set_universe_goal(int universe)
> +{
> + klp_universe_goal = universe;
> +
> + /* corresponding smp_rmb() is in klp_update_task_universe() */
> + smp_wmb();
> +}
> +
> +/*
> + * The transition to the universe goal is complete. Clean up the data
> + * structures.
> + */
> +void klp_complete_transition(void)
> +{
> + struct klp_object *obj;
> + struct klp_func *func;
> +
> + for (obj = klp_transition_patch->objs; obj->funcs; obj++)
> + for (func = obj->funcs; func->old_name; func++)
> + func->transition = 0;
> +
> + klp_transition_patch = NULL;
> +}
> +
> +static int klp_stacktrace_address_verify_func(struct klp_func *func,
> + unsigned long address)
> +{
> + unsigned long func_addr, func_size;
> +
> + if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> + /* check the to-be-unpatched function (the func itself) */
> + func_addr = (unsigned long)func->new_func;
> + func_size = func->new_size;
> + } else {
> + /* check the to-be-patched function (previous func) */
> + struct klp_ops *ops;
> +
> + ops = klp_find_ops(func->old_addr);
> +
> + if (list_is_singular(&ops->func_stack)) {
> + /* original function */
> + func_addr = func->old_addr;
> + func_size = func->old_size;
> + } else {
> + /* previously patched function */
> + struct klp_func *prev;
> +
> + prev = list_next_entry(func, stack_node);
> + func_addr = (unsigned long)prev->new_func;
> + func_size = prev->new_size;
> + }
> + }
> +
> + if (address >= func_addr && address < func_addr + func_size)
> + return -1;
> +
> + return 0;
> +}
> +
> +/*
> + * Determine whether the given return address on the stack is within a
> + * to-be-patched or to-be-unpatched function.
> + */
> +static void klp_stacktrace_address_verify(void *data, unsigned long address,
> + int reliable)
> +{
> + struct klp_object *obj;
> + struct klp_func *func;
> + int *ret = data;
> +
> + if (*ret)
> + return;
> +
> + for (obj = klp_transition_patch->objs; obj->funcs; obj++) {
> + if (!obj->patched)
> + continue;
> + for (func = obj->funcs; func->old_name; func++) {
> + if (klp_stacktrace_address_verify_func(func, address)) {
> + *ret = -1;
> + return;
> + }
> + }
> + }
> +}
> +
> +static int klp_stacktrace_stack(void *data, char *name)
> +{
> + return 0;
> +}
> +
> +static const struct stacktrace_ops klp_stacktrace_ops = {
> + .address = klp_stacktrace_address_verify,
> + .stack = klp_stacktrace_stack,
> + .walk_stack = print_context_stack_bp,
> +};
> +
> +/*
> + * Try to safely transition a task to the universe goal. If the task is
> + * currently running or is sleeping on a to-be-patched or to-be-unpatched
> + * function, return false.
> + */
> +static bool klp_transition_task(struct task_struct *t)
> +{
> + struct rq *rq;
> + unsigned long flags;
> + int ret;
> + bool success = false;
> +
> + if (t->klp_universe == klp_universe_goal)
> + return true;
> +
> + rq = task_rq_lock(t, &flags);
> +
> + if (task_running(rq, t) && t != current) {
> + pr_debug("%s: pid %d (%s) is running\n", __func__, t->pid,
> + t->comm);
> + goto done;
> + }

Let me confirm that this always skips running tasks, and klp retries
the check by using a delayed worker, correct?

Indeed, this can work if we retry for long enough...

Thank you,

> +
> + ret = 0;
> + dump_trace(t, NULL, NULL, 0, &klp_stacktrace_ops, &ret);
> + if (ret) {
> + pr_debug("%s: pid %d (%s) is sleeping on a patched function\n",
> + __func__, t->pid, t->comm);
> + goto done;
> + }
> +
> + klp_update_task_universe(t);
> +
> + success = true;
> +done:
> + task_rq_unlock(rq, t, &flags);
> + return success;
> +}
> +
> +/*
> + * Try to transition all tasks to the universe goal. If any tasks are still
> + * stuck in the original universe, schedule a retry.
> + */
> +void klp_try_complete_transition(void)
> +{
> + unsigned int cpu;
> + struct task_struct *g, *t;
> + bool complete = true;
> +
> + /* try to transition all normal tasks */
> + read_lock(&tasklist_lock);
> + for_each_process_thread(g, t)
> + if (!klp_transition_task(t))
> + complete = false;
> + read_unlock(&tasklist_lock);
> +
> + /* try to transition the idle "swapper" tasks */
> + get_online_cpus();
> + for_each_online_cpu(cpu)
> + if (!klp_transition_task(idle_task(cpu)))
> + complete = false;
> + put_online_cpus();
> +
> + /* if not complete, try again later */
> + if (!complete) {
> + schedule_delayed_work(&klp_transition_work,
> + round_jiffies_relative(HZ));
> + return;
> + }
> +
> + /* success! unpatch obsolete functions and do some cleanup */
> +
> + if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> + klp_unpatch_objects(klp_transition_patch);
> +
> + /* prevent ftrace handler from reading old func->transition */
> + synchronize_rcu();
> + }
> +
> + pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> + klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> + "unpatching");
> +
> + klp_complete_transition();
> +}
> +
> +static void klp_transition_work_fn(struct work_struct *work)
> +{
> + mutex_lock(&klp_mutex);
> +
> + if (klp_transition_patch)
> + klp_try_complete_transition();
> +
> + mutex_unlock(&klp_mutex);
> +}
> +
> +/*
> + * Start the transition to the specified universe so tasks can begin switching
> + * to it.
> + */
> +void klp_start_transition(int universe)
> +{
> + if (WARN_ON(klp_universe_goal == universe))
> + return;
> +
> + pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> + universe == KLP_UNIVERSE_NEW ? "patching" : "unpatching");
> +
> + klp_set_universe_goal(universe);
> +}
> +
> +/*
> + * Can be called in the middle of an existing transition to reverse the
> + * direction of the universe goal. This can be done to effectively cancel an
> + * existing enable or disable operation if there are any tasks which are stuck
> + * in the original universe.
> + */
> +void klp_reverse_transition(void)
> +{
> + struct klp_patch *patch = klp_transition_patch;
> +
> + klp_start_transition(!klp_universe_goal);
> + klp_try_complete_transition();
> +
> + patch->enabled = !patch->enabled;
> +}
> +
> +/*
> + * Reset the universe goal and all tasks to the starting universe, and set all
> + * func->transition's to 1 to prepare for patching.
> + */
> +void klp_init_transition(struct klp_patch *patch, int universe)
> +{
> + struct task_struct *g, *t;
> + unsigned int cpu;
> + struct klp_object *obj;
> + struct klp_func *func;
> +
> + klp_transition_patch = patch;
> +
> + /*
> + * If the previous transition was in the opposite direction, we may
> + * already be in the requested initial universe.
> + */
> + if (klp_universe_goal == universe)
> + goto init_funcs;
> +
> + klp_set_universe_goal(universe);
> +
> + /* init all normal task universes */
> + read_lock(&tasklist_lock);
> + for_each_process_thread(g, t)
> + klp_update_task_universe(t);
> + read_unlock(&tasklist_lock);
> +
> + /* init all idle "swapper" task universes */
> + get_online_cpus();
> + for_each_online_cpu(cpu)
> + klp_update_task_universe(idle_task(cpu));
> + put_online_cpus();
> +
> +init_funcs:
> + /* corresponding smp_rmb() is in klp_ftrace_handler() */
> + smp_wmb();
> +
> + for (obj = patch->objs; obj->funcs; obj++)
> + for (func = obj->funcs; func->old_name; func++)
> + func->transition = 1;
> +}
> diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
> new file mode 100644
> index 0000000..ba9a55c
> --- /dev/null
> +++ b/kernel/livepatch/transition.h
> @@ -0,0 +1,16 @@
> +#include <linux/livepatch.h>
> +
> +enum {
> + KLP_UNIVERSE_UNDEFINED = -1,
> + KLP_UNIVERSE_OLD,
> + KLP_UNIVERSE_NEW,
> +};
> +
> +extern struct mutex klp_mutex;
> +extern struct klp_patch *klp_transition_patch;
> +
> +extern void klp_init_transition(struct klp_patch *patch, int universe);
> +extern void klp_start_transition(int universe);
> +extern void klp_reverse_transition(void);
> +extern void klp_try_complete_transition(void);
> +extern void klp_complete_transition(void);
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 78d91e6..7b877f4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -74,6 +74,7 @@
> #include <linux/binfmts.h>
> #include <linux/context_tracking.h>
> #include <linux/compiler.h>
> +#include <linux/livepatch.h>
>
> #include <asm/switch_to.h>
> #include <asm/tlb.h>
> @@ -4601,6 +4602,7 @@ void init_idle(struct task_struct *idle, int cpu)
> #if defined(CONFIG_SMP)
> sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
> #endif
> + klp_update_task_universe(idle);
> }
>
> int cpuset_cpumask_can_shrink(const struct cpumask *cur,
>


--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

(2015/02/10 2:31), Josh Poimboeuf wrote:
> This patch set implements a livepatch consistency model, targeted for 3.21.
> Now that we have a solid livepatch code base, this is the biggest remaining
> missing piece.
>
> This code stems from the design proposal made by Vojtech [1] in November. It
> makes live patching safer in general. Specifically, it allows you to apply
> patches which change function prototypes. It also lays the groundwork for
> future code changes which will enable data and data semantic changes.

Interesting. How would you do that?

> It's basically a hybrid of kpatch and kGraft, combining kpatch's backtrace
> checking with kGraft's per-task consistency. When patching, tasks are
> carefully transitioned from the old universe to the new universe. A task can
> only be switched to the new universe if it's not using a function that is to be
> patched or unpatched. After all tasks have moved to the new universe, the
> patching process is complete.
>
> How it transitions various tasks to the new universe:
>
> - The stacks of all sleeping tasks are checked. Each task that is not sleeping
> on a to-be-patched function is switched.
>
> - Other user tasks are handled by do_notify_resume() (see patch 9/9). If a
> task is I/O bound, it switches universes when returning from a system call.
> If it's CPU bound, it switches when returning from an interrupt. If it's
> sleeping on a patched function, the user can send SIGSTOP and SIGCONT to
> force it to switch upon return from the signal handler.

Ah, OK. So you can handle those without hooking switch_to :)

>
> - Idle "swapper" tasks which are sleeping on a to-be-patched function can be
> switched from within the outer idle loop.
>
> - An interrupt handler will inherit the universe of the task it interrupts.
>
> - kthreads which are sleeping on to-be-patched functions are not yet handled
> (more on this below).
>
>
> I think this approach provides the best benefits of both kpatch and kGraft:
>
> advantages vs kpatch:
> - no stop machine latency

Good! :)

> - higher patch success rate (can patch in-use functions)
> - patching failures are more predictable (primary failure mode is attempting to
> patch a kthread which is sleeping forever on a patched function, more on this
> below)
>
> advantages vs kGraft:
> - less code complexity (don't have to hack up the code of all the different
> kthreads)
> - less impact to processes (don't have to signal all sleeping tasks)
>
> disadvantages vs kpatch:
> - no system-wide switch point (not really a functional limitation, just forces
> the patch author to be more careful. but that's probably a good thing anyway)

OK, we must check carefully that the old function and the new function can co-exist.

> My biggest concerns and questions related to this patch set are:
>
> 1) To safely examine the task stacks, the transition code locks each task's rq
> struct, which requires using the scheduler's internal rq locking functions.
> It seems to work well, but I'm not sure if there's a cleaner way to safely
> do stack checking without stop_machine().

We'd better ask scheduler people.

>
> 2) As mentioned above, kthreads which are always sleeping on a patched function
> will never transition to the new universe. This is really a minor issue
> (less than 1% of patches). It's not necessarily something that needs to be
> resolved with this patch set, but it would be good to have some discussion
> about it regardless.
>
> To overcome this issue, I have 1/2 an idea: we could add some stack checking
> code to the ftrace handler itself to transition the kthread to the new
> universe after it re-enters the function it was originally sleeping on, if
> the stack doesn't already have have any other to-be-patched functions.
> Combined with the klp_transition_work_fn()'s periodic stack checking of
> sleeping tasks, that would handle most of the cases (except when trying to
> patch the high-level thread_fn itself).

It makes sense to me. (I just did a similar thing)

>
> But then how do you make the kthread wake up? As far as I can tell,
> wake_up_process() doesn't seem to work on a kthread (unless I messed up my
> testing somehow). What does kGraft do in this case?

Hmm, at a glance, the code itself can work on a kthread too...
Maybe you can also send your testing patch too.

Thank you,

>
>
> [1] https://lkml.org/lkml/2014/11/7/354
>
>
> Josh Poimboeuf (9):
> livepatch: simplify disable error path
> livepatch: separate enabled and patched states
> livepatch: move patching functions into patch.c
> livepatch: get function sizes
> sched: move task rq locking functions to sched.h
> livepatch: create per-task consistency model
> proc: add /proc/<pid>/universe to show livepatch status
> livepatch: allow patch modules to be removed
> livepatch: update task universe when exiting kernel
>
> arch/x86/include/asm/thread_info.h | 4 +-
> arch/x86/kernel/signal.c | 4 +
> fs/proc/base.c | 11 ++
> include/linux/livepatch.h | 38 ++--
> include/linux/sched.h | 3 +
> kernel/fork.c | 2 +
> kernel/livepatch/Makefile | 2 +-
> kernel/livepatch/core.c | 360 ++++++++++---------------------------
> kernel/livepatch/patch.c | 206 +++++++++++++++++++++
> kernel/livepatch/patch.h | 26 +++
> kernel/livepatch/transition.c | 318 ++++++++++++++++++++++++++++++++
> kernel/livepatch/transition.h | 16 ++
> kernel/sched/core.c | 34 +---
> kernel/sched/idle.c | 4 +
> kernel/sched/sched.h | 33 ++++
> 15 files changed, 747 insertions(+), 314 deletions(-)
> create mode 100644 kernel/livepatch/patch.c
> create mode 100644 kernel/livepatch/patch.h
> create mode 100644 kernel/livepatch/transition.c
> create mode 100644 kernel/livepatch/transition.h
>


--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2015-02-10 14:43:44

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

On Tue, Feb 10, 2015 at 09:57:44AM +0100, Jiri Kosina wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
>
> > 2) As mentioned above, kthreads which are always sleeping on a patched function
> > will never transition to the new universe. This is really a minor issue
> > (less than 1% of patches). It's not necessarily something that needs to be
> > resolved with this patch set, but it would be good to have some discussion
> > about it regardless.
> >
> > To overcome this issue, I have 1/2 an idea: we could add some stack checking
> > code to the ftrace handler itself to transition the kthread to the new
> > universe after it re-enters the function it was originally sleeping on, if
> > the stack doesn't already have have any other to-be-patched functions.
> > Combined with the klp_transition_work_fn()'s periodic stack checking of
> > sleeping tasks, that would handle most of the cases (except when trying to
> > patch the high-level thread_fn itself).
> >
> > But then how do you make the kthread wake up? As far as I can tell,
> > wake_up_process() doesn't seem to work on a kthread (unless I messed up my
> > testing somehow). What does kGraft do in this case?
>
> wake_up_process() really should work for (p->flags & PF_KTHREAD)
> task_struct. What was your testing scenario?

Hm, I probably did something stupid. I'll try it again :-)

--
Josh

2015-02-10 14:54:43

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 5/9] sched: move task rq locking functions to sched.h

On Tue, Feb 10, 2015 at 07:48:17PM +0900, Masami Hiramatsu wrote:
> (2015/02/10 2:31), Josh Poimboeuf wrote:
> > Move task_rq_lock/unlock() to sched.h so they can be used elsewhere.
> > The livepatch code needs to lock each task's rq in order to safely
> > examine its stack and switch it to a new patch universe.
>
> Hmm, why don't you just expose (extern in sched.h) those?

One reason was because task_rq_unlock was already static inline, and I
didn't want to un-inline it. But that's probably a dumb reason, since I
inlined task_rq_lock and it wasn't inlined before.

But also, there are some other inlined locking functions in sched.h:
double_lock_balance, double_rq_lock, double_lock_irq, etc. So it just
seemed to "fit" better there.

Either way works for me. I'll ask some scheduler people.

--
Josh

2015-02-10 14:59:12

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Tue, Feb 10, 2015 at 07:58:30PM +0900, Masami Hiramatsu wrote:
> (2015/02/10 2:31), Josh Poimboeuf wrote:
> > +/*
> > + * Try to safely transition a task to the universe goal. If the task is
> > + * currently running or is sleeping on a to-be-patched or to-be-unpatched
> > + * function, return false.
> > + */
> > +static bool klp_transition_task(struct task_struct *t)
> > +{
> > + struct rq *rq;
> > + unsigned long flags;
> > + int ret;
> > + bool success = false;
> > +
> > + if (t->klp_universe == klp_universe_goal)
> > + return true;
> > +
> > + rq = task_rq_lock(t, &flags);
> > +
> > + if (task_running(rq, t) && t != current) {
> > + pr_debug("%s: pid %d (%s) is running\n", __func__, t->pid,
> > + t->comm);
> > + goto done;
> > + }
>
> Let me confirm that this always skips running tasks, and klp retries
> the check by using a delayed worker, correct?

Correct. Also, patch 9 of the series adds other ways to convert tasks,
using syscalls, irqs and signals.


--
Josh

2015-02-10 15:59:22

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model


On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> Add a basic per-task consistency model. This is the foundation which
> will eventually enable us to patch those ~10% of security patches which
> change function prototypes and/or data semantics.
>
> When a patch is enabled, livepatch enters into a transition state where
> tasks are converging from the old universe to the new universe. If a
> given task isn't using any of the patched functions, it's switched to
> the new universe. Once all the tasks have been converged to the new
> universe, patching is complete.
>
> The same sequence occurs when a patch is disabled, except the tasks
> converge from the new universe to the old universe.
>
> The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> is in transition. Only a single patch (the topmost patch on the stack)
> can be in transition at a given time. A patch can remain in the
> transition state indefinitely, if any of the tasks are stuck in the
> previous universe.
>
> A transition can be reversed and effectively canceled by writing the
> opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> the transition is in progress. Then all the tasks will attempt to
> converge back to the original universe.

Hi Josh,

first, thanks a lot for the great work. I'm starting to go through it and it's
gonna take me some time to do and send a complete review. Anyway, I
suspect there is a possible race in the code. I'm still not sure though.
See below...

[...]

> @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> ops = container_of(fops, struct klp_ops, fops);
>
> rcu_read_lock();
> +
> func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> stack_node);
> - rcu_read_unlock();
>
> if (WARN_ON_ONCE(!func))
> - return;
> + goto unlock;
> +
> + if (unlikely(func->transition)) {
> + /* corresponding smp_wmb() is in klp_init_transition() */
> + smp_rmb();
> +
> + if (current->klp_universe == KLP_UNIVERSE_OLD) {
> + /*
> + * Use the previously patched version of the function.
> + * If no previous patches exist, use the original
> + * function.
> + */
> + func = list_entry_rcu(func->stack_node.next,
> + struct klp_func, stack_node);
> +
> + if (&func->stack_node == &ops->func_stack)
> + goto unlock;
> + }
> + }
>
> klp_arch_set_pc(regs, (unsigned long)func->new_func);
> +unlock:
> + rcu_read_unlock();
> }

The problem is that there is no guarantee that the ftrace handler is called
in an atomic context. Hence it could be preempted (if CONFIG_PREEMPT is y),
and that could happen anywhere before rcu_read_lock (which disables
preemption for CONFIG_PREEMPT). Ftrace often uses ftrace_ops_list_func as
a callback which calls the handlers with preemption disabled. But not
always. For dynamic trampolines it should call the handlers directly and
preemption is not disabled.

So...

> +/*
> + * Try to transition all tasks to the universe goal. If any tasks are still
> + * stuck in the original universe, schedule a retry.
> + */
> +void klp_try_complete_transition(void)
> +{
> + unsigned int cpu;
> + struct task_struct *g, *t;
> + bool complete = true;
> +
> + /* try to transition all normal tasks */
> + read_lock(&tasklist_lock);
> + for_each_process_thread(g, t)
> + if (!klp_transition_task(t))
> + complete = false;
> + read_unlock(&tasklist_lock);
> +
> + /* try to transition the idle "swapper" tasks */
> + get_online_cpus();
> + for_each_online_cpu(cpu)
> + if (!klp_transition_task(idle_task(cpu)))
> + complete = false;
> + put_online_cpus();
> +
> + /* if not complete, try again later */
> + if (!complete) {
> + schedule_delayed_work(&klp_transition_work,
> + round_jiffies_relative(HZ));
> + return;
> + }
> +
> + /* success! unpatch obsolete functions and do some cleanup */
> +
> + if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> + klp_unpatch_objects(klp_transition_patch);
> +
> + /* prevent ftrace handler from reading old func->transition */
> + synchronize_rcu();
> + }
> +
> + pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> + klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> + "unpatching");
> +
> + klp_complete_transition();
> +}

...synchronize_rcu() could be insufficient. There still can be some
process in our ftrace handler after the call.

Consider the following scenario:

When synchronize_rcu is called, some process could have been preempted on
some other cpu somewhere at the start of the ftrace handler, before
rcu_read_lock. synchronize_rcu waits for the grace period to pass, but that
does not mean anything for our process in the handler, because it is not
in an rcu critical section. There is no guarantee that after synchronize_rcu
the process has left the handler.

"Meanwhile" klp_try_complete_transition continues and calls
klp_complete_transition. This clears the func->transition flags. Now the
process in the handler could be scheduled again. It reads the wrong value
of func->transition and redirects to the wrong function.

What do you think? I hope I made myself clear.

There is a similar problem for dynamic trampolines in ftrace. You cannot
remove them unless there is no process in the handler. I think rcu-tasks
were merged a while ago for this purpose. However ftrace does not use them
yet and I don't know if we could exploit them to solve this issue. I need
to think more about it.

Anyway thanks a lot!

Miroslav

2015-02-10 16:00:05

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

On Tue, Feb 10, 2015 at 08:16:59PM +0900, Masami Hiramatsu wrote:
> (2015/02/10 2:31), Josh Poimboeuf wrote:
> > This patch set implements a livepatch consistency model, targeted for 3.21.
> > Now that we have a solid livepatch code base, this is the biggest remaining
> > missing piece.
> >
> > This code stems from the design proposal made by Vojtech [1] in November. It
> > makes live patching safer in general. Specifically, it allows you to apply
> > patches which change function prototypes. It also lays the groundwork for
> > future code changes which will enable data and data semantic changes.
>
> Interesting. How would you do that?

As Vojtech described in the earlier thread from November, there are
different approaches for changing data:

1. TRANSFORM_WORLD: stop the world, transform everything, resume

2. TRANSFORM_ON_ACCESS: transform data structures when you access them

I would add a third category (which is what we've been doing with
kpatch):

3. TRANSFORM_ON_CREATE: create new data structures created after a certain point
are the "v2" versions

I think approach 1 seems very tricky, if not impossible in many cases,
even if you're using stop_machine(). Right now we're focusing on
enabling approaches 2 and 3, since they seem more practical, don't
require stop_machine(), and are generally easier to get right.

With kpatch we've been using approach 3, with a lot of success. Here's
how I would do it with livepatch:

As a prerequisite, we need shadow variables, which is a way to add
virtual fields to existing structs at runtime. For an example, see:

https://github.com/dynup/kpatch/blob/master/test/integration/shadow-newpid.patch

In that example, I added "newpid" to task_struct. If it's only
something like locking semantics that are changing, you can just add a
"v2" field to the struct to specify that it's the 2nd version of the
struct.

When converting a patch to be used for livepatch, the patch author must
carefully look for data struct versioning changes. It doesn't matter if
there's a new field, or if the semantics of using that data has changed.
Either way, the patch author must define a new version of the struct.

If a struct has changed, all patched functions need to be able to deal
with struct v1 or struct v2. This is true for those functions which
access the structs as well as the functions which create them.

For example, a function which accesses the struct might change to:

if (klp_shadow_has_field(struct, "v2"))
/* access struct the new way */
else
/* access struct the old way */

A function which creates the struct might change to:

struct foo *struct_create()
{
/* kmalloc and init struct here */

if (klp_patching_complete())
/* add v2 shadow fields */
}


The klp_patching_complete() call is needed to prevent v1 functions from
accessing v2 data. The creation/transformation of v2 structs shouldn't
occur until after the patching process is complete, and all tasks are
converged to the new universe.
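
To make that a bit more concrete, here is a rough sketch of what a patched
creation path and accessor could look like. Note that klp_shadow_has_field(),
klp_shadow_add_field(), klp_shadow_get_field() and klp_patching_complete()
are hypothetical helpers, used only to illustrate the idea; none of them
exists in livepatch today:

	#include <linux/types.h>
	#include <linux/slab.h>

	struct foo {
		unsigned long a;
		/* v2 adds a "b" member, attached at runtime as a shadow field */
	};

	/* hypothetical shadow-variable / transition helpers */
	extern bool klp_shadow_has_field(void *obj, const char *name);
	extern int klp_shadow_add_field(void *obj, const char *name, size_t size);
	extern void *klp_shadow_get_field(void *obj, const char *name);
	extern bool klp_patching_complete(void);

	/* patched creation path: only attach v2 data once patching is complete */
	static struct foo *foo_create(void)
	{
		struct foo *f = kzalloc(sizeof(*f), GFP_KERNEL);

		if (!f)
			return NULL;

		if (klp_patching_complete())
			klp_shadow_add_field(f, "v2", sizeof(unsigned long));

		return f;
	}

	/* patched accessor: must cope with both v1 and v2 objects */
	static unsigned long foo_get_b(struct foo *f)
	{
		if (klp_shadow_has_field(f, "v2")) {
			/* access the struct the new way */
			unsigned long *b = klp_shadow_get_field(f, "v2");

			return *b;
		}

		/* access the struct the old way: v1 objects have no "b" */
		return 0;
	}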

> > disadvantages vs kpatch:
> > - no system-wide switch point (not really a functional limitation, just forces
> > the patch author to be more careful. but that's probably a good thing anyway)
>
> OK, we must check carefully that the old function and new function can be co-exist.

Agreed, and this requires the patch author to look carefully for data
version changes, as described above. Which they should be doing
regardless.

> > My biggest concerns and questions related to this patch set are:
> >
> > 1) To safely examine the task stacks, the transition code locks each task's rq
> > struct, which requires using the scheduler's internal rq locking functions.
> > It seems to work well, but I'm not sure if there's a cleaner way to safely
> > do stack checking without stop_machine().
>
> We'd better ask scheduler people.

Agreed, I will.

> > 2) As mentioned above, kthreads which are always sleeping on a patched function
> > will never transition to the new universe. This is really a minor issue
> > (less than 1% of patches). It's not necessarily something that needs to be
> > resolved with this patch set, but it would be good to have some discussion
> > about it regardless.
> >
> > To overcome this issue, I have 1/2 an idea: we could add some stack checking
> > code to the ftrace handler itself to transition the kthread to the new
> > universe after it re-enters the function it was originally sleeping on, if
> > the stack doesn't already have have any other to-be-patched functions.
> > Combined with the klp_transition_work_fn()'s periodic stack checking of
> > sleeping tasks, that would handle most of the cases (except when trying to
> > patch the high-level thread_fn itself).
>
> It makes sense to me. (I just did similar thing)
>
> >
> > But then how do you make the kthread wake up? As far as I can tell,
> > wake_up_process() doesn't seem to work on a kthread (unless I messed up my
> > testing somehow). What does kGraft do in this case?
>
> Hmm, at a glance, the code itself can work on kthread too...
> Maybe you can also send you testing patch too.

Yeah, I probably messed it up. I'll try it again :-)

--
Josh

2015-02-10 16:44:35

by Jiri Slaby

[permalink] [raw]
Subject: Re: [RFC PATCH 2/9] livepatch: separate enabled and patched states

On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> Once we have a consistency model, patches and their objects will be
> enabled and disabled at different times. For example, when a patch is
> disabled, its loaded objects' funcs can remain registered with ftrace
> indefinitely until the unpatching operation is complete and they're no
> longer in use.
>
> It's less confusing if we give them different names: patches can be
> enabled or disabled; objects (and their funcs) can be patched or
> unpatched:
>
> - Enabled means that a patch is logically enabled (but not necessarily
> fully applied).
>
> - Patched means that an object's funcs are registered with ftrace and
> added to the klp_ops func stack.
>
> Also, since these states are binary, represent them with boolean-type
> variables instead of enums.

So please do so: we have bool/true/false.

--
js
suse labs

2015-02-10 16:56:47

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Tue, Feb 10, 2015 at 04:59:17PM +0100, Miroslav Benes wrote:
>
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
>
> > Add a basic per-task consistency model. This is the foundation which
> > will eventually enable us to patch those ~10% of security patches which
> > change function prototypes and/or data semantics.
> >
> > When a patch is enabled, livepatch enters into a transition state where
> > tasks are converging from the old universe to the new universe. If a
> > given task isn't using any of the patched functions, it's switched to
> > the new universe. Once all the tasks have been converged to the new
> > universe, patching is complete.
> >
> > The same sequence occurs when a patch is disabled, except the tasks
> > converge from the new universe to the old universe.
> >
> > The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> > is in transition. Only a single patch (the topmost patch on the stack)
> > can be in transition at a given time. A patch can remain in the
> > transition state indefinitely, if any of the tasks are stuck in the
> > previous universe.
> >
> > A transition can be reversed and effectively canceled by writing the
> > opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> > the transition is in progress. Then all the tasks will attempt to
> > converge back to the original universe.
>
> Hi Josh,
>
> first, thanks a lot for great work. I'm starting to go through it and it's
> gonna take me some time to do and send a complete review.

I know there are a lot of details to look at, please take your time. I
really appreciate your review. (And everybody else's, for that matter
:-)

> > + /* success! unpatch obsolete functions and do some cleanup */
> > +
> > + if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> > + klp_unpatch_objects(klp_transition_patch);
> > +
> > + /* prevent ftrace handler from reading old func->transition */
> > + synchronize_rcu();
> > + }
> > +
> > + pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> > + klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> > + "unpatching");
> > +
> > + klp_complete_transition();
> > +}
>
> ...synchronize_rcu() could be insufficient. There still can be some
> process in our ftrace handler after the call.
>
> Consider the following scenario:
>
> When synchronize_rcu is called some process could have been preempted on
> some other cpu somewhere at the start of the ftrace handler before
> rcu_read_lock. synchronize_rcu waits for the grace period to pass, but that
> does not mean anything for our process in the handler, because it is not
> in rcu critical section. There is no guarantee that after synchronize_rcu
> the process would be away from the handler.
>
> "Meanwhile" klp_try_complete_transition continues and calls
> klp_complete_transition. This clears func->transition flags. Now the
> process in the handler could be scheduled again. It reads the wrong value
> of func->transition and redirection to the wrong function is done.
>
> What do you think? I hope I made myself clear.

You really made me think. But I don't think there's a race here.

Consider the two separate cases, patching and unpatching:

1. patching has completed: klp_universe_goal and all tasks'
klp_universes are at KLP_UNIVERSE_NEW. In this case, the value of
func->transition doesn't matter, because we want to use the func at
the top of the stack, and if klp_universe is NEW, the ftrace handler
will do that, regardless of the value of func->transition. This is
why I didn't do the rcu_synchronize() in this case. But maybe you're
not worried about this case anyway, I just described it for the sake
of completeness :-)

2. unpatching has completed: klp_universe_goal and all tasks'
klp_universes are at KLP_UNIVERSE_OLD. In this case, the value of
func->transition _does_ matter. However, notice that
klp_unpatch_objects() is called before rcu_synchronize(). That
removes the "new" func from the klp_ops stack. Since the ftrace
handler accesses the list _after_ calling rcu_read_lock(), it will
never see the "new" func, and thus func->transition will never be
set.

That said, I think there is a race where the WARN_ON_ONCE(!func)
could trigger here, and it wouldn't be an error. So I think I'll
remove the warning.

Does that make sense?

> There is the similar problem for dynamic trampolines in ftrace. You
> cannot remove them unless there is no process in the handler. I think
> rcu-tasks were merged a while ago for this purpose. However ftrace
> does not use them yet and I don't know if we could exploit them to
> solve this issue. I need to think more about it.

Ok, sounds like that's an ftrace bug that could affect us.

--
Josh

2015-02-10 17:21:47

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 2/9] livepatch: separate enabled and patched states

On Tue, Feb 10, 2015 at 05:44:30PM +0100, Jiri Slaby wrote:
> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > Once we have a consistency model, patches and their objects will be
> > enabled and disabled at different times. For example, when a patch is
> > disabled, its loaded objects' funcs can remain registered with ftrace
> > indefinitely until the unpatching operation is complete and they're no
> > longer in use.
> >
> > It's less confusing if we give them different names: patches can be
> > enabled or disabled; objects (and their funcs) can be patched or
> > unpatched:
> >
> > - Enabled means that a patch is logically enabled (but not necessarily
> > fully applied).
> >
> > - Patched means that an object's funcs are registered with ftrace and
> > added to the klp_ops func stack.
> >
> > Also, since these states are binary, represent them with boolean-type
> > variables instead of enums.
>
> So please do so: we have bool/true/false.

Will do, thanks.

--
Josh

2015-02-10 17:29:51

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

On Tue, Feb 10, 2015 at 09:59:58AM -0600, Josh Poimboeuf wrote:
> On Tue, Feb 10, 2015 at 08:16:59PM +0900, Masami Hiramatsu wrote:
> > (2015/02/10 2:31), Josh Poimboeuf wrote:
> > > This patch set implements a livepatch consistency model, targeted for 3.21.
> > > Now that we have a solid livepatch code base, this is the biggest remaining
> > > missing piece.
> > >
> > > This code stems from the design proposal made by Vojtech [1] in November. It
> > > makes live patching safer in general. Specifically, it allows you to apply
> > > patches which change function prototypes. It also lays the groundwork for
> > > future code changes which will enable data and data semantic changes.
> >
> > Interesting, How would you do that?
>
> As Vojtech described in the earlier thread from November, there are
> different approaches for changing data:
>
> 1. TRANSFORM_WORLD: stop the world, transform everything, resume
>
> 2. TRANSFORM_ON_ACCESS: transform data structures when you access them
>
> I would add a third category (which is what we've been doing with
> kpatch):
>
> 3. TRANSFORM_ON_CREATE: create new data structures created after a certain point
> are the "v2" versions

Sorry, bad wording, I meant to say:

3. TRANSFORM_ON_CREATE: create new versions of the data structures when
you create them

If that still doesn't make sense, hopefully the below explanation
clarifies what I mean :-)

>
> I think approach 1 seems very tricky, if not impossible in many cases,
> even if you're using stop_machine(). Right now we're focusing on
> enabling approaches 2 and 3, since they seem more practical, don't
> require stop_machine(), and are generally easier to get right.
>
> With kpatch we've been using approach 3, with a lot of success. Here's
> how I would do it with livepatch:
>
> As a prerequisite, we need shadow variables, which is a way to add
> virtual fields to existing structs at runtime. For an example, see:
>
> https://github.com/dynup/kpatch/blob/master/test/integration/shadow-newpid.patch
>
> In that example, I added "newpid" to task_struct. If it's only
> something like locking semantics that are changing, you can just add a
> "v2" field to the struct to specify that it's the 2nd version of the
> struct.
>
> When converting a patch to be used for livepatch, the patch author must
> carefully look for data struct versioning changes. It doesn't matter if
> there's a new field, or if the semantics of using that data has changed.
> Either way, the patch author must define a new version of the struct.
>
> If a struct has changed, all patched functions need to be able to deal
> with struct v1 or struct v2. This is true for those functions which
> access the structs as well as the functions which create them.
>
> For example, a function which accesses the struct might change to:
>
> if (klp_shadow_has_field(struct, "v2"))
> /* access struct the new way */
> else
> /* access struct the old way */
>
> A function which creates the struct might change to:
>
> struct foo *struct_create()
> {
> /* kmalloc and init struct here */
>
> if (klp_patching_complete())
> /* add v2 shadow fields */
> }
>
>
> The klp_patching_complete() call is needed to prevent v1 functions from
> accessing v2 data. The creation/transformation of v2 structs shouldn't
> occur until after the patching process is complete, and all tasks are
> converged to the new universe.
>
> > > disadvantages vs kpatch:
> > > - no system-wide switch point (not really a functional limitation, just forces
> > > the patch author to be more careful. but that's probably a good thing anyway)
> >
> > OK, we must check carefully that the old function and new function can be co-exist.
>
> Agreed, and this requires the patch author to look carefully for data
> version changes, as described above. Which they should be doing
> regardless.
>
> > > My biggest concerns and questions related to this patch set are:
> > >
> > > 1) To safely examine the task stacks, the transition code locks each task's rq
> > > struct, which requires using the scheduler's internal rq locking functions.
> > > It seems to work well, but I'm not sure if there's a cleaner way to safely
> > > do stack checking without stop_machine().
> >
> > We'd better ask scheduler people.
>
> Agreed, I will.
>
> > > 2) As mentioned above, kthreads which are always sleeping on a patched function
> > > will never transition to the new universe. This is really a minor issue
> > > (less than 1% of patches). It's not necessarily something that needs to be
> > > resolved with this patch set, but it would be good to have some discussion
> > > about it regardless.
> > >
> > > To overcome this issue, I have 1/2 an idea: we could add some stack checking
> > > code to the ftrace handler itself to transition the kthread to the new
> > > universe after it re-enters the function it was originally sleeping on, if
> > > the stack doesn't already have have any other to-be-patched functions.
> > > Combined with the klp_transition_work_fn()'s periodic stack checking of
> > > sleeping tasks, that would handle most of the cases (except when trying to
> > > patch the high-level thread_fn itself).
> >
> > It makes sense to me. (I just did similar thing)
> >
> > >
> > > But then how do you make the kthread wake up? As far as I can tell,
> > > wake_up_process() doesn't seem to work on a kthread (unless I messed up my
> > > testing somehow). What does kGraft do in this case?
> >
> > Hmm, at a glance, the code itself can work on kthread too...
> > Maybe you can also send you testing patch too.
>
> Yeah, I probably messed it up. I'll try it again :-)
>
> --
> Josh

--
Josh

2015-02-10 18:27:56

by Jiri Slaby

[permalink] [raw]
Subject: Re: [RFC PATCH 3/9] livepatch: move patching functions into patch.c

On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> Move functions related to the actual patching of functions and objects
> into a new patch.c file.
>
> The only functional change is to remove the unnecessary
> WARN_ON(!klp_is_object_loaded()) check from klp_patch_object().
>
> Signed-off-by: Josh Poimboeuf <[email protected]>
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -24,29 +24,10 @@
> #include <linux/kernel.h>
> #include <linux/mutex.h>
> #include <linux/slab.h>
> -#include <linux/ftrace.h>
> #include <linux/list.h>
> #include <linux/kallsyms.h>
> -#include <linux/livepatch.h>

I don't understand, you define some functions declared there and you
remove the include? patch.h below is not enough. When somebody shuffles
with the files again, we would have to fix this.

>
> -/**
> - * struct klp_ops - structure for tracking registered ftrace ops structs
> - *
> - * A single ftrace_ops is shared between all enabled replacement functions
> - * (klp_func structs) which have the same old_addr. This allows the switch
> - * between function versions to happen instantaneously by updating the klp_ops
> - * struct's func_stack list. The winner is the klp_func at the top of the
> - * func_stack (front of the list).
> - *
> - * @node: node for the global klp_ops list
> - * @func_stack: list head for the stack of klp_func's (active func is on top)
> - * @fops: registered ftrace ops struct
> - */
> -struct klp_ops {
> - struct list_head node;
> - struct list_head func_stack;
> - struct ftrace_ops fops;
> -};
> +#include "patch.h"

...

> --- /dev/null
> +++ b/kernel/livepatch/patch.c
> @@ -0,0 +1,176 @@
> +/*
> + * patch.c - Kernel Live Patching patching functions

...

> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/slab.h>
> +
> +#include "patch.h"
> +
> +static LIST_HEAD(klp_ops);

list.h should be included.

> +static void notrace klp_ftrace_handler(unsigned long ip,
> + unsigned long parent_ip,
> + struct ftrace_ops *fops,

ftrace.h should be included.

> + struct pt_regs *regs)
> +{
> + struct klp_ops *ops;
> + struct klp_func *func;
> +
> + ops = container_of(fops, struct klp_ops, fops);
> +
> + rcu_read_lock();
> + func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> + stack_node);

rculist.h & perhaps rcupdate.h?

> + rcu_read_unlock();
> +
> + if (WARN_ON_ONCE(!func))
> + return;
> +
> + klp_arch_set_pc(regs, (unsigned long)func->new_func);
> +}

...

> +static void klp_unpatch_func(struct klp_func *func)
> +{
> + struct klp_ops *ops;
> +
> + WARN_ON(!func->patched);
> + WARN_ON(!func->old_addr);

bug.h

> +
> + ops = klp_find_ops(func->old_addr);
> + if (WARN_ON(!ops))
> + return;
> +
> + if (list_is_singular(&ops->func_stack)) {
> + WARN_ON(unregister_ftrace_function(&ops->fops));
> + WARN_ON(ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0));
> +
> + list_del_rcu(&func->stack_node);
> + list_del(&ops->node);
> + kfree(ops);
> + } else {
> + list_del_rcu(&func->stack_node);
> + }
> +
> + func->patched = 0;
> +}
> +
> +static int klp_patch_func(struct klp_func *func)
> +{
> + struct klp_ops *ops;
> + int ret;
> +
> + if (WARN_ON(!func->old_addr))
> + return -EINVAL;
> +
> + if (WARN_ON(func->patched))
> + return -EINVAL;
> +
> + ops = klp_find_ops(func->old_addr);
> + if (!ops) {
> + ops = kzalloc(sizeof(*ops), GFP_KERNEL);
> + if (!ops)
> + return -ENOMEM;
> +
> + ops->fops.func = klp_ftrace_handler;
> + ops->fops.flags = FTRACE_OPS_FL_SAVE_REGS |
> + FTRACE_OPS_FL_DYNAMIC |
> + FTRACE_OPS_FL_IPMODIFY;
> +
> + list_add(&ops->node, &klp_ops);
> +
> + INIT_LIST_HEAD(&ops->func_stack);
> + list_add_rcu(&func->stack_node, &ops->func_stack);
> +
> + ret = ftrace_set_filter_ip(&ops->fops, func->old_addr, 0, 0);
> + if (ret) {
> + pr_err("failed to set ftrace filter for function '%s' (%d)\n",
> + func->old_name, ret);

printk.h

> + goto err;
> + }
> +
> + ret = register_ftrace_function(&ops->fops);
> + if (ret) {
> + pr_err("failed to register ftrace handler for function '%s' (%d)\n",
> + func->old_name, ret);
> + ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0);
> + goto err;
> + }
> + } else {
> + list_add_rcu(&func->stack_node, &ops->func_stack);
> + }
> +
> + func->patched = 1;
> +
> + return 0;
> +
> +err:
> + list_del_rcu(&func->stack_node);
> + list_del(&ops->node);
> + kfree(ops);
> + return ret;
> +}

...

> --- /dev/null
> +++ b/kernel/livepatch/patch.h
> @@ -0,0 +1,25 @@

This is not a correct header. Double-inclusion protection is missing.

> +#include <linux/livepatch.h>
> +
> +/**
> + * struct klp_ops - structure for tracking registered ftrace ops structs
> + *
> + * A single ftrace_ops is shared between all enabled replacement functions
> + * (klp_func structs) which have the same old_addr. This allows the switch
> + * between function versions to happen instantaneously by updating the klp_ops
> + * struct's func_stack list. The winner is the klp_func at the top of the
> + * func_stack (front of the list).
> + *
> + * @node: node for the global klp_ops list
> + * @func_stack: list head for the stack of klp_func's (active func is on top)
> + * @fops: registered ftrace ops struct
> + */
> +struct klp_ops {
> + struct list_head node;
> + struct list_head func_stack;
> + struct ftrace_ops fops;

This header obviously needs list.h and ftrace.h.

> +};
> +
> +struct klp_ops *klp_find_ops(unsigned long old_addr);
> +
> +extern int klp_patch_object(struct klp_object *obj);
> +extern void klp_unpatch_object(struct klp_object *obj);
>
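
For reference, a minimal sketch of the header structure being asked for here,
with an include guard and the missing includes (the guard name is only
illustrative):

	#ifndef _LIVEPATCH_PATCH_H
	#define _LIVEPATCH_PATCH_H

	#include <linux/list.h>
	#include <linux/ftrace.h>
	#include <linux/livepatch.h>

	/* struct klp_ops and the declarations quoted above would go here */

	#endif /* _LIVEPATCH_PATCH_H */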

regards,
--
js
suse labs

2015-02-10 18:30:55

by Jiri Slaby

[permalink] [raw]
Subject: Re: [RFC PATCH 4/9] livepatch: get function sizes

On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -197,8 +197,25 @@ static int klp_find_verify_func_addr(struct klp_object *obj,
> else
> ret = klp_verify_vmlinux_symbol(func->old_name,
> func->old_addr);
> + if (ret)
> + return ret;
>
> - return ret;
> + ret = kallsyms_lookup_size_offset(func->old_addr, &func->old_size,
> + NULL);
> + if (!ret) {
> + pr_err("kallsyms lookup failed for '%s'\n", func->old_name);
> + return -EINVAL;
> + }
> +
> + ret = kallsyms_lookup_size_offset((unsigned long)func->new_func,
> + &func->new_size, NULL);
> + if (!ret) {
> + pr_err("kallsyms lookup failed for '%s' replacement\n",
> + func->old_name);
> + return -EINVAL;

EINVAL does not seem to be an appropriate return value for "not found".
Maybe ENOENT?

regards,
--
js
suse labs

2015-02-10 18:47:21

by Jiri Slaby

[permalink] [raw]
Subject: Re: [RFC PATCH 7/9] proc: add /proc/<pid>/universe to show livepatch status

On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> Expose the per-task klp_universe value so users can determine which
> tasks are holding up completion of a patching operation.
>
> Signed-off-by: Josh Poimboeuf <[email protected]>
> ---
> fs/proc/base.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 3f3d7ae..b9fe6b5 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2528,6 +2528,14 @@ static int proc_pid_personality(struct seq_file *m, struct pid_namespace *ns,
> return err;
> }
>
> +#ifdef CONFIG_LIVEPATCH
> +static int proc_pid_klp_universe(struct seq_file *m, struct pid_namespace *ns,
> + struct pid *pid, struct task_struct *task)
> +{
> + return seq_printf(m, "%d\n", task->klp_universe);
> +}
> +#endif /* CONFIG_LIVEPATCH */
> +
> /*
> * Thread groups
> */
> @@ -2628,6 +2636,9 @@ static const struct pid_entry tgid_base_stuff[] = {
> #ifdef CONFIG_CHECKPOINT_RESTORE
> REG("timers", S_IRUGO, proc_timers_operations),
> #endif
> +#ifdef CONFIG_LIVEPATCH
> + ONE("universe", S_IRUGO, proc_pid_klp_universe),

I am not sure if this can be UGO or if it should be USR only instead.
Leaving for discussion, but I incline to use USR to avoid *any* info
leakage.

> +#endif

regards,
--
js
suse labs

2015-02-10 18:50:49

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 3/9] livepatch: move patching functions into patch.c

On Tue, Feb 10, 2015 at 07:27:51PM +0100, Jiri Slaby wrote:
> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > Move functions related to the actual patching of functions and objects
> > into a new patch.c file.
> >
> > The only functional change is to remove the unnecessary
> > WARN_ON(!klp_is_object_loaded()) check from klp_patch_object().
> >
> > Signed-off-by: Josh Poimboeuf <[email protected]>
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -24,29 +24,10 @@
> > #include <linux/kernel.h>
> > #include <linux/mutex.h>
> > #include <linux/slab.h>
> > -#include <linux/ftrace.h>
> > #include <linux/list.h>
> > #include <linux/kallsyms.h>
> > -#include <linux/livepatch.h>
>
> I don't understand, you define some functions declared there and you
> remove the include? patch.h below is not enough. When somebody shuffles
> with the files again, we would have to fix this.
>
> >
> > -/**
> > - * struct klp_ops - structure for tracking registered ftrace ops structs
> > - *
> > - * A single ftrace_ops is shared between all enabled replacement functions
> > - * (klp_func structs) which have the same old_addr. This allows the switch
> > - * between function versions to happen instantaneously by updating the klp_ops
> > - * struct's func_stack list. The winner is the klp_func at the top of the
> > - * func_stack (front of the list).
> > - *
> > - * @node: node for the global klp_ops list
> > - * @func_stack: list head for the stack of klp_func's (active func is on top)
> > - * @fops: registered ftrace ops struct
> > - */
> > -struct klp_ops {
> > - struct list_head node;
> > - struct list_head func_stack;
> > - struct ftrace_ops fops;
> > -};
> > +#include "patch.h"
>
> ...
>
> > --- /dev/null
> > +++ b/kernel/livepatch/patch.c
> > @@ -0,0 +1,176 @@
> > +/*
> > + * patch.c - Kernel Live Patching patching functions
>
> ...
>
> > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > +
> > +#include <linux/slab.h>
> > +
> > +#include "patch.h"
> > +
> > +static LIST_HEAD(klp_ops);
>
> list.h should be included.
>
> > +static void notrace klp_ftrace_handler(unsigned long ip,
> > + unsigned long parent_ip,
> > + struct ftrace_ops *fops,
>
> ftrace.h should be included.
>
> > + struct pt_regs *regs)
> > +{
> > + struct klp_ops *ops;
> > + struct klp_func *func;
> > +
> > + ops = container_of(fops, struct klp_ops, fops);
> > +
> > + rcu_read_lock();
> > + func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > + stack_node);
>
> rculist.h & perhaps rcupdate.h?
>
> > + rcu_read_unlock();
> > +
> > + if (WARN_ON_ONCE(!func))
> > + return;
> > +
> > + klp_arch_set_pc(regs, (unsigned long)func->new_func);
> > +}
>
> ...
>
> > +static void klp_unpatch_func(struct klp_func *func)
> > +{
> > + struct klp_ops *ops;
> > +
> > + WARN_ON(!func->patched);
> > + WARN_ON(!func->old_addr);
>
> bug.h
>
> > +
> > + ops = klp_find_ops(func->old_addr);
> > + if (WARN_ON(!ops))
> > + return;
> > +
> > + if (list_is_singular(&ops->func_stack)) {
> > + WARN_ON(unregister_ftrace_function(&ops->fops));
> > + WARN_ON(ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0));
> > +
> > + list_del_rcu(&func->stack_node);
> > + list_del(&ops->node);
> > + kfree(ops);
> > + } else {
> > + list_del_rcu(&func->stack_node);
> > + }
> > +
> > + func->patched = 0;
> > +}
> > +
> > +static int klp_patch_func(struct klp_func *func)
> > +{
> > + struct klp_ops *ops;
> > + int ret;
> > +
> > + if (WARN_ON(!func->old_addr))
> > + return -EINVAL;
> > +
> > + if (WARN_ON(func->patched))
> > + return -EINVAL;
> > +
> > + ops = klp_find_ops(func->old_addr);
> > + if (!ops) {
> > + ops = kzalloc(sizeof(*ops), GFP_KERNEL);
> > + if (!ops)
> > + return -ENOMEM;
> > +
> > + ops->fops.func = klp_ftrace_handler;
> > + ops->fops.flags = FTRACE_OPS_FL_SAVE_REGS |
> > + FTRACE_OPS_FL_DYNAMIC |
> > + FTRACE_OPS_FL_IPMODIFY;
> > +
> > + list_add(&ops->node, &klp_ops);
> > +
> > + INIT_LIST_HEAD(&ops->func_stack);
> > + list_add_rcu(&func->stack_node, &ops->func_stack);
> > +
> > + ret = ftrace_set_filter_ip(&ops->fops, func->old_addr, 0, 0);
> > + if (ret) {
> > + pr_err("failed to set ftrace filter for function '%s' (%d)\n",
> > + func->old_name, ret);
>
> printk.h
>
> > + goto err;
> > + }
> > +
> > + ret = register_ftrace_function(&ops->fops);
> > + if (ret) {
> > + pr_err("failed to register ftrace handler for function '%s' (%d)\n",
> > + func->old_name, ret);
> > + ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0);
> > + goto err;
> > + }
> > + } else {
> > + list_add_rcu(&func->stack_node, &ops->func_stack);
> > + }
> > +
> > + func->patched = 1;
> > +
> > + return 0;
> > +
> > +err:
> > + list_del_rcu(&func->stack_node);
> > + list_del(&ops->node);
> > + kfree(ops);
> > + return ret;
> > +}
>
> ...
>
> > --- /dev/null
> > +++ b/kernel/livepatch/patch.h
> > @@ -0,0 +1,25 @@
>
> This is not a correct header. Double-inclusion protection is missing.
>
> > +#include <linux/livepatch.h>
> > +
> > +/**
> > + * struct klp_ops - structure for tracking registered ftrace ops structs
> > + *
> > + * A single ftrace_ops is shared between all enabled replacement functions
> > + * (klp_func structs) which have the same old_addr. This allows the switch
> > + * between function versions to happen instantaneously by updating the klp_ops
> > + * struct's func_stack list. The winner is the klp_func at the top of the
> > + * func_stack (front of the list).
> > + *
> > + * @node: node for the global klp_ops list
> > + * @func_stack: list head for the stack of klp_func's (active func is on top)
> > + * @fops: registered ftrace ops struct
> > + */
> > +struct klp_ops {
> > + struct list_head node;
> > + struct list_head func_stack;
> > + struct ftrace_ops fops;
>
> This header obviously needs list.h and ftrace.h.
>
> > +};
> > +
> > +struct klp_ops *klp_find_ops(unsigned long old_addr);
> > +
> > +extern int klp_patch_object(struct klp_object *obj);
> > +extern void klp_unpatch_object(struct klp_object *obj);
> >
>

Agreed to all, thanks.


--
Josh

2015-02-10 19:49:12

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 4/9] livepatch: get function sizes

On Tue, Feb 10, 2015 at 07:30:50PM +0100, Jiri Slaby wrote:
> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -197,8 +197,25 @@ static int klp_find_verify_func_addr(struct klp_object *obj,
> > else
> > ret = klp_verify_vmlinux_symbol(func->old_name,
> > func->old_addr);
> > + if (ret)
> > + return ret;
> >
> > - return ret;
> > + ret = kallsyms_lookup_size_offset(func->old_addr, &func->old_size,
> > + NULL);
> > + if (!ret) {
> > + pr_err("kallsyms lookup failed for '%s'\n", func->old_name);
> > + return -EINVAL;
> > + }
> > +
> > + ret = kallsyms_lookup_size_offset((unsigned long)func->new_func,
> > + &func->new_size, NULL);
> > + if (!ret) {
> > + pr_err("kallsyms lookup failed for '%s' replacement\n",
> > + func->old_name);
> > + return -EINVAL;
>
> EINVAL does not seem to be an appropriate return value for "not found".
> Maybe ENOENT?

Ok.

--
Josh

2015-02-10 18:57:56

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 7/9] proc: add /proc/<pid>/universe to show livepatch status

On Tue, Feb 10, 2015 at 07:47:12PM +0100, Jiri Slaby wrote:
> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > Expose the per-task klp_universe value so users can determine which
> > tasks are holding up completion of a patching operation.
> >
> > Signed-off-by: Josh Poimboeuf <[email protected]>
> > ---
> > fs/proc/base.c | 11 +++++++++++
> > 1 file changed, 11 insertions(+)
> >
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 3f3d7ae..b9fe6b5 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -2528,6 +2528,14 @@ static int proc_pid_personality(struct seq_file *m, struct pid_namespace *ns,
> > return err;
> > }
> >
> > +#ifdef CONFIG_LIVEPATCH
> > +static int proc_pid_klp_universe(struct seq_file *m, struct pid_namespace *ns,
> > + struct pid *pid, struct task_struct *task)
> > +{
> > + return seq_printf(m, "%d\n", task->klp_universe);
> > +}
> > +#endif /* CONFIG_LIVEPATCH */
> > +
> > /*
> > * Thread groups
> > */
> > @@ -2628,6 +2636,9 @@ static const struct pid_entry tgid_base_stuff[] = {
> > #ifdef CONFIG_CHECKPOINT_RESTORE
> > REG("timers", S_IRUGO, proc_timers_operations),
> > #endif
> > +#ifdef CONFIG_LIVEPATCH
> > + ONE("universe", S_IRUGO, proc_pid_klp_universe),
>
> I am not sure if this can be UGO or if it should be USR only instead.
> Leaving for discussion, but I incline to use USR to avoid *any* info
> leakage.

That's fine. I can't think of any reason why a non-root user would need
to know the task's universe.
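
Concretely, that would just be a sketch of changing the mode in the hunk
quoted above, e.g.:

	#ifdef CONFIG_LIVEPATCH
		ONE("universe", S_IRUSR, proc_pid_klp_universe),
	#endif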

--
Josh

2015-02-10 19:02:39

by Jiri Slaby

[permalink] [raw]
Subject: Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed

On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
...
> @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
>
> static void klp_kobj_release_patch(struct kobject *kobj)
> {
> - /*
> - * Once we have a consistency model we'll need to module_put() the
> - * patch module here. See klp_register_patch() for more details.
> - */

I deliberately let you write the note in there :). What happens when I
leave some attribute in /sys open and you remove the module in the meantime?

> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -54,6 +54,9 @@ void klp_complete_transition(void)
> for (func = obj->funcs; func->old_name; func++)
> func->transition = 0;
>
> + if (klp_universe_goal == KLP_UNIVERSE_OLD)
> + module_put(klp_transition_patch->mod);
> +
> klp_transition_patch = NULL;
> }

regards,
--
js
suse labs

2015-02-10 19:28:07

by Seth Jennings

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Mon, Feb 09, 2015 at 11:31:18AM -0600, Josh Poimboeuf wrote:
> Add a basic per-task consistency model. This is the foundation which
> will eventually enable us to patch those ~10% of security patches which
> change function prototypes and/or data semantics.
>
> When a patch is enabled, livepatch enters into a transition state where
> tasks are converging from the old universe to the new universe. If a
> given task isn't using any of the patched functions, it's switched to
> the new universe. Once all the tasks have been converged to the new
> universe, patching is complete.
>
> The same sequence occurs when a patch is disabled, except the tasks
> converge from the new universe to the old universe.
>
> The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> is in transition. Only a single patch (the topmost patch on the stack)
> can be in transition at a given time. A patch can remain in the
> transition state indefinitely, if any of the tasks are stuck in the
> previous universe.
>
> A transition can be reversed and effectively canceled by writing the
> opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> the transition is in progress. Then all the tasks will attempt to
> converge back to the original universe.
>
> Signed-off-by: Josh Poimboeuf <[email protected]>
> ---
> include/linux/livepatch.h | 18 ++-
> include/linux/sched.h | 3 +
> kernel/fork.c | 2 +
> kernel/livepatch/Makefile | 2 +-
> kernel/livepatch/core.c | 71 ++++++----
> kernel/livepatch/patch.c | 34 ++++-
> kernel/livepatch/patch.h | 1 +
> kernel/livepatch/transition.c | 300 ++++++++++++++++++++++++++++++++++++++++++
> kernel/livepatch/transition.h | 16 +++
> kernel/sched/core.c | 2 +
> 10 files changed, 423 insertions(+), 26 deletions(-)
> create mode 100644 kernel/livepatch/transition.c
> create mode 100644 kernel/livepatch/transition.h
>
<snip>
> diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
> new file mode 100644
> index 0000000..ba9a55c
> --- /dev/null
> +++ b/kernel/livepatch/transition.h
> @@ -0,0 +1,16 @@
> +#include <linux/livepatch.h>
> +
> +enum {
> + KLP_UNIVERSE_UNDEFINED = -1,
> + KLP_UNIVERSE_OLD,
> + KLP_UNIVERSE_NEW,
> +};
> +
> +extern struct mutex klp_mutex;

klp_mutex isn't defined in transition.c. Maybe this extern should be in
the transition.c file or in a core.h file, since core.c provides the
definition?

Thanks,
Seth

> +extern struct klp_patch *klp_transition_patch;
> +
> +extern void klp_init_transition(struct klp_patch *patch, int universe);
> +extern void klp_start_transition(int universe);
> +extern void klp_reverse_transition(void);
> +extern void klp_try_complete_transition(void);
> +extern void klp_complete_transition(void);
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 78d91e6..7b877f4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -74,6 +74,7 @@
> #include <linux/binfmts.h>
> #include <linux/context_tracking.h>
> #include <linux/compiler.h>
> +#include <linux/livepatch.h>
>
> #include <asm/switch_to.h>
> #include <asm/tlb.h>
> @@ -4601,6 +4602,7 @@ void init_idle(struct task_struct *idle, int cpu)
> #if defined(CONFIG_SMP)
> sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
> #endif
> + klp_update_task_universe(idle);
> }
>
> int cpuset_cpumask_can_shrink(const struct cpumask *cur,
> --
> 2.1.0
>

2015-02-10 19:32:23

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Tue, Feb 10, 2015 at 01:27:59PM -0600, Seth Jennings wrote:
> On Mon, Feb 09, 2015 at 11:31:18AM -0600, Josh Poimboeuf wrote:
> > diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
> > new file mode 100644
> > index 0000000..ba9a55c
> > --- /dev/null
> > +++ b/kernel/livepatch/transition.h
> > @@ -0,0 +1,16 @@
> > +#include <linux/livepatch.h>
> > +
> > +enum {
> > + KLP_UNIVERSE_UNDEFINED = -1,
> > + KLP_UNIVERSE_OLD,
> > + KLP_UNIVERSE_NEW,
> > +};
> > +
> > +extern struct mutex klp_mutex;
>
> klp_mutex isn't defined in transition.c. Maybe this extern should be in
> the transition.c file or in a core.h file, since core.c provides the
> definition?

I originally had the extern in transition.c, but then checkpatch
complained so I moved it to transition.h. But yeah, it doesn't really
belong there either.

It's kind of ugly for transition.c to be using that mutex anyway. I
think it'll be cleaner if I just move the work_fn into core.c.

--
Josh

2015-02-10 19:57:13

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed

On Tue, Feb 10, 2015 at 08:02:34PM +0100, Jiri Slaby wrote:
> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> ...
> > @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
> >
> > static void klp_kobj_release_patch(struct kobject *kobj)
> > {
> > - /*
> > - * Once we have a consistency model we'll need to module_put() the
> > - * patch module here. See klp_register_patch() for more details.
> > - */
>
> I deliberately let you write the note in there :). What happens when I
> leave some attribute in /sys open and you remove the module in the meantime?

You're right, as was I the first time :-)

The only problem is that it would be nice if we could call
klp_unregister_patch() from the patch module's exit function, so that
doing an rmmod on the patch module unregisters it. But if we put
module_put() in the patch release function, then we have a circular
dependency and we could never rmmod it.

How about instead we do a klp_is_patch_registered() at the beginning of
all the attribute accessor functions? It's kind of ugly, but I can't
think of a better idea at the moment.

--
Josh

2015-02-11 10:21:56

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model


On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

[...]

> @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> ops = container_of(fops, struct klp_ops, fops);
>
> rcu_read_lock();
> +
> func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> stack_node);
> - rcu_read_unlock();
>
> if (WARN_ON_ONCE(!func))
> - return;
> + goto unlock;
> +
> + if (unlikely(func->transition)) {
> + /* corresponding smp_wmb() is in klp_init_transition() */
> + smp_rmb();
> +
> + if (current->klp_universe == KLP_UNIVERSE_OLD) {
> + /*
> + * Use the previously patched version of the function.
> + * If no previous patches exist, use the original
> + * function.
> + */
> + func = list_entry_rcu(func->stack_node.next,
> + struct klp_func, stack_node);
> +
> + if (&func->stack_node == &ops->func_stack)
> + goto unlock;
> + }
> + }
>
> klp_arch_set_pc(regs, (unsigned long)func->new_func);
> +unlock:
> + rcu_read_unlock();
> }

I decided to understand the code more before answering the email about the
race and found another problem. I think.

Imagine we patched some function foo() with foo_1() from patch_1 and now
we'd like to patch it again with foo_2() in patch_2. __klp_enable_patch
calls klp_init_transition which sets klp_universe for all processes to
KLP_UNIVERSE_OLD and marks the foo_2() for transition (it is gonna be 1).
Then __klp_enable_patch adds foo_2() to the RCU-protected list for foo().
BUT what if somebody calls foo() right between klp_init_transition and
the loop in __klp_enable_patch? The ftrace handler first returns the
first entry in the list which is foo_1() (foo_2() is still not present),
then it checks for func->transition. It is 1. It checks for
current->klp_universe which is KLP_UNIVERSE_OLD and so the next entry is
retrieved. There is no such and therefore foo() is called. This is
obviously wrong because foo_1() was expected.

Everything would work fine if one would call foo() before
klp_start_transition and after the loop in __klp_enable_patch. The
solution might be to move the setting of func->transition to
klp_start_transition, but this could break something different. I don't
know yet.

Am I wrong?

Miroslav

2015-02-11 10:55:10

by Jiri Slaby

[permalink] [raw]
Subject: Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed

On 02/10/2015, 08:57 PM, Josh Poimboeuf wrote:
> On Tue, Feb 10, 2015 at 08:02:34PM +0100, Jiri Slaby wrote:
>> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
>>> --- a/kernel/livepatch/core.c
>>> +++ b/kernel/livepatch/core.c
>> ...
>>> @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
>>>
>>> static void klp_kobj_release_patch(struct kobject *kobj)
>>> {
>>> - /*
>>> - * Once we have a consistency model we'll need to module_put() the
>>> - * patch module here. See klp_register_patch() for more details.
>>> - */
>>
>> I deliberately let you write the note in there :). What happens when I
>> leave some attribute in /sys open and you remove the module in the meantime?
>
> You're right, as was I the first time :-)
>
> The only problem is that it would be nice if we could call
> klp_unregister_patch() from the patch module's exit function, so that
> doing an rmmod on the patch module unregisters it. But if we put
> module_put() in the patch release function, then we have a circular
> dependency and we could never rmmod it.
>
> How about instead we do a klp_is_patch_registered() at the beginning of
> all the attribute accessor functions? It's kind of ugly, but I can't
> think of a better idea at the moment.

Ugh, no :). You even have the kobject proper in the module which would
be gone.

However we can take inspiration in kgraft. I introduced a completion
there and wait for it in rmmod. This completion is made complete in
kobject's release. See:
https://git.kernel.org/cgit/linux/kernel/git/jirislaby/kgraft.git/tree/kernel/kgraft_files.c?h=kgraft#n30
https://git.kernel.org/cgit/linux/kernel/git/jirislaby/kgraft.git/tree/kernel/kgraft_files.c?h=kgraft#n138

This should IMO work here too.
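
For reference, a minimal sketch of that completion-based pattern adapted to
livepatch (the struct and function names here are illustrative, not the
actual kGraft or livepatch code):

	#include <linux/completion.h>
	#include <linux/kernel.h>
	#include <linux/kobject.h>

	/* illustrative patch structure: the patch kobject plus a "safe to rmmod" completion */
	struct patch_example {
		struct kobject kobj;
		struct completion finished;	/* init_completion() at registration time */
	};

	/* used as the kobj_type's ->release; runs once the last reference is dropped */
	static void patch_example_kobj_release(struct kobject *kobj)
	{
		struct patch_example *p = container_of(kobj, struct patch_example, kobj);

		complete(&p->finished);
	}

	/* unregister path, callable from the patch module's exit function */
	static void patch_example_unregister(struct patch_example *p)
	{
		kobject_put(&p->kobj);

		/* block rmmod until the kobject (and its sysfs files) is really gone */
		wait_for_completion(&p->finished);
	}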

regards,
--
js
suse labs

2015-02-11 16:28:21

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Tue, 10 Feb 2015, Josh Poimboeuf wrote:

> On Tue, Feb 10, 2015 at 04:59:17PM +0100, Miroslav Benes wrote:
> >
> > On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> >
> > > Add a basic per-task consistency model. This is the foundation which
> > > will eventually enable us to patch those ~10% of security patches which
> > > change function prototypes and/or data semantics.
> > >
> > > When a patch is enabled, livepatch enters into a transition state where
> > > tasks are converging from the old universe to the new universe. If a
> > > given task isn't using any of the patched functions, it's switched to
> > > the new universe. Once all the tasks have been converged to the new
> > > universe, patching is complete.
> > >
> > > The same sequence occurs when a patch is disabled, except the tasks
> > > converge from the new universe to the old universe.
> > >
> > > The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> > > is in transition. Only a single patch (the topmost patch on the stack)
> > > can be in transition at a given time. A patch can remain in the
> > > transition state indefinitely, if any of the tasks are stuck in the
> > > previous universe.
> > >
> > > A transition can be reversed and effectively canceled by writing the
> > > opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> > > the transition is in progress. Then all the tasks will attempt to
> > > converge back to the original universe.
> >
> > Hi Josh,
> >
> > first, thanks a lot for great work. I'm starting to go through it and it's
> > gonna take me some time to do and send a complete review.
>
> I know there are a lot of details to look at, please take your time. I
> really appreciate your review. (And everybody else's, for that matter
> :-)
>
> > > + /* success! unpatch obsolete functions and do some cleanup */
> > > +
> > > + if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> > > + klp_unpatch_objects(klp_transition_patch);
> > > +
> > > + /* prevent ftrace handler from reading old func->transition */
> > > + synchronize_rcu();
> > > + }
> > > +
> > > + pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> > > + klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> > > + "unpatching");
> > > +
> > > + klp_complete_transition();
> > > +}
> >
> > ...synchronize_rcu() could be insufficient. There still can be some
> > process in our ftrace handler after the call.
> >
> > Consider the following scenario:
> >
> > When synchronize_rcu is called some process could have been preempted on
> > some other cpu somewhere at the start of the ftrace handler before
> > rcu_read_lock. synchronize_rcu waits for the grace period to pass, but that
> > does not mean anything for our process in the handler, because it is not
> > in rcu critical section. There is no guarantee that after synchronize_rcu
> > the process would be away from the handler.
> >
> > "Meanwhile" klp_try_complete_transition continues and calls
> > klp_complete_transition. This clears func->transition flags. Now the
> > process in the handler could be scheduled again. It reads the wrong value
> > of func->transition and redirection to the wrong function is done.
> >
> > What do you think? I hope I made myself clear.
>
> You really made me think. But I don't think there's a race here.
>
> Consider the two separate cases, patching and unpatching:
>
> 1. patching has completed: klp_universe_goal and all tasks'
> klp_universes are at KLP_UNIVERSE_NEW. In this case, the value of
> func->transition doesn't matter, because we want to use the func at
> the top of the stack, and if klp_universe is NEW, the ftrace handler
> will do that, regardless of the value of func->transition. This is
> why I didn't do the rcu_synchronize() in this case. But maybe you're
> not worried about this case anyway, I just described it for the sake
> of completeness :-)

Yes, this case shouldn't be a problem :)

> 2. unpatching has completed: klp_universe_goal and all tasks'
> klp_universes are at KLP_UNIVERSE_OLD. In this case, the value of
> func->transition _does_ matter. However, notice that
> klp_unpatch_objects() is called before rcu_synchronize(). That
> removes the "new" func from the klp_ops stack. Since the ftrace
> handler accesses the list _after_ calling rcu_read_lock(), it will
> never see the "new" func, and thus func->transition will never be
> set.

Hm, so indeed I messed it up. Let me rework the scenario a bit. We have a
function foo(), which has already been patched with foo_1() from patch_1 and
then with foo_2() from patch_2. Now we would like to unpatch patch_2. It
completes successfully and klp_try_complete_transition() calls
klp_unpatch_objects() and synchronize_rcu(). Thus foo_2() is removed from the
RCU list in ops.

Now to the funny part. After synchronize_rcu() and before
klp_complete_transition(), some process might get into the ftrace handler (it
is still there because patch_1 is still present). It gets foo_1 from
list_first_or_null_rcu(), sees that func->transition is 1 (it hasn't been
cleared yet) and that current->klp_universe is KLP_UNIVERSE_OLD... so it
tries to get the previous function. There is none, so foo() is called. This
is incorrect.

It is a very similar scenario to the one in my other email earlier today.
I think we need to clear func->transition before calling
klp_unpatch_objects(). More or less.

> That said, I think there is a race where the WARN_ON_ONCE(!func)
> could trigger here, and it wouldn't be an error. So I think I'll
> remove the warning.
>
> Does that make sense?
>
> > There is the similar problem for dynamic trampolines in ftrace. You
> > cannot remove them unless there is no process in the handler. I think
> > rcu-tasks were merged a while ago for this purpose. However ftrace
> > does not use them yet and I don't know if we could exploit them to
> > solve this issue. I need to think more about it.
>
> Ok, sounds like that's an ftrace bug that could affect us.

Fortunately it is not. Steven knows about it and he does not allow dynamic
trampolines for CONFIG_PREEMPT and FTRACE_OPS_FL_DYNAMIC. Not yet. See the
comment in kernel/trace/ftrace.c for ftrace_update_trampoline.

Anyway, the conclusion is that we need to be really careful with the ftrace
handler, especially in the future with dynamic trampolines and especially
with CONFIG_PREEMPT. For now the handler always runs in atomic context (at
least in the cases relevant for our use), if I am not mistaken.

Miroslav

2015-02-11 18:39:59

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed

On Wed, Feb 11, 2015 at 11:55:05AM +0100, Jiri Slaby wrote:
> On 02/10/2015, 08:57 PM, Josh Poimboeuf wrote:
> > On Tue, Feb 10, 2015 at 08:02:34PM +0100, Jiri Slaby wrote:
> >> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> >>> --- a/kernel/livepatch/core.c
> >>> +++ b/kernel/livepatch/core.c
> >> ...
> >>> @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
> >>>
> >>> static void klp_kobj_release_patch(struct kobject *kobj)
> >>> {
> >>> - /*
> >>> - * Once we have a consistency model we'll need to module_put() the
> >>> - * patch module here. See klp_register_patch() for more details.
> >>> - */
> >>
> >> I deliberately let you write the note in there :). What happens when I
> >> leave some attribute in /sys open and you remove the module in the meantime?
> >
> > You're right, as was I the first time :-)
> >
> > The only problem is that it would be nice if we could call
> > klp_unregister_patch() from the patch module's exit function, so that
> > doing an rmmod on the patch module unregisters it. But if we put
> > module_put() in the patch release function, then we have a circular
> > dependency and we could never rmmod it.
> >
> > How about instead we do a klp_is_patch_registered() at the beginning of
> > all the attribute accessor functions? It's kind of ugly, but I can't
> > think of a better idea at the moment.
>
> Ugh, no :). You even have the kobject proper in the module which would
> be gone.
>
> However we can take inspiration in kgraft. I introduced a completion
> there and wait for it in rmmod. This completion is made complete in
> kobject's release. See:
> https://git.kernel.org/cgit/linux/kernel/git/jirislaby/kgraft.git/tree/kernel/kgraft_files.c?h=kgraft#n30
> https://git.kernel.org/cgit/linux/kernel/git/jirislaby/kgraft.git/tree/kernel/kgraft_files.c?h=kgraft#n138
>
> This should IMO work here too.

Thanks, that sounds a lot better. I'll try to do something like that.

--
Josh

2015-02-11 20:19:31

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Wed, Feb 11, 2015 at 11:21:51AM +0100, Miroslav Benes wrote:
>
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
>
> [...]
>
> > @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> > ops = container_of(fops, struct klp_ops, fops);
> >
> > rcu_read_lock();
> > +
> > func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > stack_node);
> > - rcu_read_unlock();
> >
> > if (WARN_ON_ONCE(!func))
> > - return;
> > + goto unlock;
> > +
> > + if (unlikely(func->transition)) {
> > + /* corresponding smp_wmb() is in klp_init_transition() */
> > + smp_rmb();
> > +
> > + if (current->klp_universe == KLP_UNIVERSE_OLD) {
> > + /*
> > + * Use the previously patched version of the function.
> > + * If no previous patches exist, use the original
> > + * function.
> > + */
> > + func = list_entry_rcu(func->stack_node.next,
> > + struct klp_func, stack_node);
> > +
> > + if (&func->stack_node == &ops->func_stack)
> > + goto unlock;
> > + }
> > + }
> >
> > klp_arch_set_pc(regs, (unsigned long)func->new_func);
> > +unlock:
> > + rcu_read_unlock();
> > }
>
> I decided to understand the code more before answering the email about the
> race and found another problem. I think.
>
> Imagine we patched some function foo() with foo_1() from patch_1 and now
> we'd like to patch it again with foo_2() in patch_2. __klp_enable_patch
> calls klp_init_transition which sets klp_universe for all processes to
> KLP_UNIVERSE_OLD and marks the foo_2() for transition (it is gonna be 1).
> Then __klp_enable_patch adds foo_2() to the RCU-protected list for foo().
> BUT what if somebody calls foo() right between klp_init_transition and
> the loop in __klp_enable_patch? The ftrace handler first returns the
> first entry in the list which is foo_1() (foo_2() is still not present),
> then it checks for func->transition. It is 1.

No, actually foo_1()'s func->transition will be 0. Only foo_2()'s
func->transition will be 1.

> It checks for
> current->klp_universe which is KLP_UNIVERSE_OLD and so the next entry is
> retrieved. There is no such and therefore foo() is called. This is
> obviously wrong because foo_1() was expected.
>
> Everything would work fine if one would call foo() before
> klp_start_transition and after the loop in __klp_enable_patch. The
> solution might be to move the setting of func->transition to
> klp_start_transition, but this could break something different. I don't
> know yet.
>
> Am I wrong?
>
> Miroslav

--
Josh

2015-02-11 20:23:25

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Wed, Feb 11, 2015 at 05:28:13PM +0100, Miroslav Benes wrote:
> On Tue, 10 Feb 2015, Josh Poimboeuf wrote:
>
> > On Tue, Feb 10, 2015 at 04:59:17PM +0100, Miroslav Benes wrote:
> > >
> > > On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> > >
> > > > Add a basic per-task consistency model. This is the foundation which
> > > > will eventually enable us to patch those ~10% of security patches which
> > > > change function prototypes and/or data semantics.
> > > >
> > > > When a patch is enabled, livepatch enters into a transition state where
> > > > tasks are converging from the old universe to the new universe. If a
> > > > given task isn't using any of the patched functions, it's switched to
> > > > the new universe. Once all the tasks have been converged to the new
> > > > universe, patching is complete.
> > > >
> > > > The same sequence occurs when a patch is disabled, except the tasks
> > > > converge from the new universe to the old universe.
> > > >
> > > > The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> > > > is in transition. Only a single patch (the topmost patch on the stack)
> > > > can be in transition at a given time. A patch can remain in the
> > > > transition state indefinitely, if any of the tasks are stuck in the
> > > > previous universe.
> > > >
> > > > A transition can be reversed and effectively canceled by writing the
> > > > opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> > > > the transition is in progress. Then all the tasks will attempt to
> > > > converge back to the original universe.
> > >
> > > Hi Josh,
> > >
> > > first, thanks a lot for great work. I'm starting to go through it and it's
> > > gonna take me some time to do and send a complete review.
> >
> > I know there are a lot of details to look at, please take your time. I
> > really appreciate your review. (And everybody else's, for that matter
> > :-)
> >
> > > > + /* success! unpatch obsolete functions and do some cleanup */
> > > > +
> > > > + if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> > > > + klp_unpatch_objects(klp_transition_patch);
> > > > +
> > > > + /* prevent ftrace handler from reading old func->transition */
> > > > + synchronize_rcu();
> > > > + }
> > > > +
> > > > + pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> > > > + klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> > > > + "unpatching");
> > > > +
> > > > + klp_complete_transition();
> > > > +}
> > >
> > > ...synchronize_rcu() could be insufficient. There still can be some
> > > process in our ftrace handler after the call.
> > >
> > > Consider the following scenario:
> > >
> > > When synchronize_rcu is called some process could have been preempted on
> > > some other cpu somewhere at the start of the ftrace handler before
> > > rcu_read_lock. synchronize_rcu waits for the grace period to pass, but that
> > > does not mean anything for our process in the handler, because it is not
> > > in rcu critical section. There is no guarantee that after synchronize_rcu
> > > the process would be away from the handler.
> > >
> > > "Meanwhile" klp_try_complete_transition continues and calls
> > > klp_complete_transition. This clears func->transition flags. Now the
> > > process in the handler could be scheduled again. It reads the wrong value
> > > of func->transition and redirection to the wrong function is done.
> > >
> > > What do you think? I hope I made myself clear.
> >
> > You really made me think. But I don't think there's a race here.
> >
> > Consider the two separate cases, patching and unpatching:
> >
> > 1. patching has completed: klp_universe_goal and all tasks'
> > klp_universes are at KLP_UNIVERSE_NEW. In this case, the value of
> > func->transition doesn't matter, because we want to use the func at
> > the top of the stack, and if klp_universe is NEW, the ftrace handler
> > will do that, regardless of the value of func->transition. This is
> > why I didn't do the synchronize_rcu() in this case. But maybe you're
> > not worried about this case anyway, I just described it for the sake
> > of completeness :-)
>
> Yes, this case shouldn't be a problem :)
>
> > 2. unpatching has completed: klp_universe_goal and all tasks'
> > klp_universes are at KLP_UNIVERSE_OLD. In this case, the value of
> > func->transition _does_ matter. However, notice that
> > klp_unpatch_objects() is called before synchronize_rcu(). That
> > removes the "new" func from the klp_ops stack. Since the ftrace
> > handler accesses the list _after_ calling rcu_read_lock(), it will
> > never see the "new" func, and thus func->transition will never be
> > set.
>
> Hm, so indeed I messed it up. Let me rework the scenario a bit. We have a
> function foo(), which has been already patched with foo_1() from patch_1
> and foo_2() from patch_2. Now we would like to unpatch patch_2. It is
> successfully completed and klp_try_complete_transition calls
> klp_unpatch_objects and synchronize_rcu. Thus foo_2() is removed from the
> RCU list in ops.
>
> Now to the funny part. After synchronize_rcu() and before
> klp_complete_transition some process might get to the ftrace handler (it
> is still there because of the patch_1 still being present). It gets foo_1
> from the list_first_or_null_rcu, sees that func->transition is 1 (it
> hasn't been cleared yet)

Same answer as the other email, foo_1()'s func->transition will be 0 :-)

When patching, only the new klp_func gets transition set to 1.

When unpatching, only the klp_func being removed gets transition set to
1.

> , current->klp_universe is KLP_UNIVERSE_OLD... so
> it tries to get previous function. There is none and foo() is called. This
> is incorrect.
>
> It is a very similar scenario to the one in my other email from earlier today.
> I think we need to clear func->transition before calling
> klp_unpatch_objects. More or less.

--
Josh

2015-02-12 03:22:16

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

Ingo, Peter,

Would you have any objections to making task_rq_lock/unlock() non-static
(or moving them to kernel/sched/sched.h) so they can be called by the
livepatch code?

To provide some background, I'm looking for a way to temporarily prevent
a sleeping task from running while its stack is examined, to decide
whether it can be safely switched to the new patching "universe". For
more details see klp_transition_task() in the patch below.

Using task_rq_lock() is the most straightforward way I could find to
achieve that.

On Mon, Feb 09, 2015 at 11:31:18AM -0600, Josh Poimboeuf wrote:
> Add a basic per-task consistency model. This is the foundation which
> will eventually enable us to patch those ~10% of security patches which
> change function prototypes and/or data semantics.
>
> When a patch is enabled, livepatch enters into a transition state where
> tasks are converging from the old universe to the new universe. If a
> given task isn't using any of the patched functions, it's switched to
> the new universe. Once all the tasks have been converged to the new
> universe, patching is complete.
>
> The same sequence occurs when a patch is disabled, except the tasks
> converge from the new universe to the old universe.
>
> The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> is in transition. Only a single patch (the topmost patch on the stack)
> can be in transition at a given time. A patch can remain in the
> transition state indefinitely, if any of the tasks are stuck in the
> previous universe.
>
> A transition can be reversed and effectively canceled by writing the
> opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> the transition is in progress. Then all the tasks will attempt to
> converge back to the original universe.
>
> Signed-off-by: Josh Poimboeuf <[email protected]>
> ---
> include/linux/livepatch.h | 18 ++-
> include/linux/sched.h | 3 +
> kernel/fork.c | 2 +
> kernel/livepatch/Makefile | 2 +-
> kernel/livepatch/core.c | 71 ++++++----
> kernel/livepatch/patch.c | 34 ++++-
> kernel/livepatch/patch.h | 1 +
> kernel/livepatch/transition.c | 300 ++++++++++++++++++++++++++++++++++++++++++
> kernel/livepatch/transition.h | 16 +++
> kernel/sched/core.c | 2 +
> 10 files changed, 423 insertions(+), 26 deletions(-)
> create mode 100644 kernel/livepatch/transition.c
> create mode 100644 kernel/livepatch/transition.h
>
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 0e65b4d..b8c2f15 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -40,6 +40,7 @@
> * @old_size: size of the old function
> * @new_size: size of the new function
> * @patched: the func has been added to the klp_ops list
> + * @transition: the func is currently being applied or reverted
> */
> struct klp_func {
> /* external */
> @@ -60,6 +61,7 @@ struct klp_func {
> struct list_head stack_node;
> unsigned long old_size, new_size;
> int patched;
> + int transition;
> };
>
> /**
> @@ -128,6 +130,20 @@ extern int klp_unregister_patch(struct klp_patch *);
> extern int klp_enable_patch(struct klp_patch *);
> extern int klp_disable_patch(struct klp_patch *);
>
> -#endif /* CONFIG_LIVEPATCH */
> +extern int klp_universe_goal;
> +
> +static inline void klp_update_task_universe(struct task_struct *t)
> +{
> + /* corresponding smp_wmb() is in klp_set_universe_goal() */
> + smp_rmb();
> +
> + t->klp_universe = klp_universe_goal;
> +}
> +
> +#else /* !CONFIG_LIVEPATCH */
> +
> +static inline void klp_update_task_universe(struct task_struct *t) {}
> +
> +#endif /* !CONFIG_LIVEPATCH */
>
> #endif /* _LINUX_LIVEPATCH_H_ */
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 8db31ef..a95e59a 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1701,6 +1701,9 @@ struct task_struct {
> #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
> unsigned long task_state_change;
> #endif
> +#ifdef CONFIG_LIVEPATCH
> + int klp_universe;
> +#endif
> };
>
> /* Future-safe accessor for struct task_struct's cpus_allowed. */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 4dc2dda..1dcbebe 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -74,6 +74,7 @@
> #include <linux/uprobes.h>
> #include <linux/aio.h>
> #include <linux/compiler.h>
> +#include <linux/livepatch.h>
>
> #include <asm/pgtable.h>
> #include <asm/pgalloc.h>
> @@ -1538,6 +1539,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> total_forks++;
> spin_unlock(&current->sighand->siglock);
> syscall_tracepoint_update(p);
> + klp_update_task_universe(p);
> write_unlock_irq(&tasklist_lock);
>
> proc_fork_connector(p);
> diff --git a/kernel/livepatch/Makefile b/kernel/livepatch/Makefile
> index e136dad..2b8bdb1 100644
> --- a/kernel/livepatch/Makefile
> +++ b/kernel/livepatch/Makefile
> @@ -1,3 +1,3 @@
> obj-$(CONFIG_LIVEPATCH) += livepatch.o
>
> -livepatch-objs := core.o patch.o
> +livepatch-objs := core.o patch.o transition.o
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 85d4ef7..790dc10 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -28,14 +28,17 @@
> #include <linux/kallsyms.h>
>
> #include "patch.h"
> +#include "transition.h"
>
> /*
> - * The klp_mutex protects the global lists and state transitions of any
> - * structure reachable from them. References to any structure must be obtained
> - * under mutex protection (except in klp_ftrace_handler(), which uses RCU to
> - * ensure it gets consistent data).
> + * The klp_mutex is a coarse lock which serializes access to klp data. All
> + * accesses to klp-related variables and structures must have mutex protection,
> + * except within the following functions which carefully avoid the need for it:
> + *
> + * - klp_ftrace_handler()
> + * - klp_update_task_universe()
> */
> -static DEFINE_MUTEX(klp_mutex);
> +DEFINE_MUTEX(klp_mutex);
>
> static LIST_HEAD(klp_patches);
>
> @@ -67,7 +70,6 @@ static void klp_find_object_module(struct klp_object *obj)
> mutex_unlock(&module_mutex);
> }
>
> -/* klp_mutex must be held by caller */
> static bool klp_is_patch_registered(struct klp_patch *patch)
> {
> struct klp_patch *mypatch;
> @@ -285,18 +287,17 @@ static int klp_write_object_relocations(struct module *pmod,
>
> static int __klp_disable_patch(struct klp_patch *patch)
> {
> - struct klp_object *obj;
> + if (klp_transition_patch)
> + return -EBUSY;
>
> /* enforce stacking: only the last enabled patch can be disabled */
> if (!list_is_last(&patch->list, &klp_patches) &&
> list_next_entry(patch, list)->enabled)
> return -EBUSY;
>
> - pr_notice("disabling patch '%s'\n", patch->mod->name);
> -
> - for (obj = patch->objs; obj->funcs; obj++)
> - if (obj->patched)
> - klp_unpatch_object(obj);
> + klp_init_transition(patch, KLP_UNIVERSE_NEW);
> + klp_start_transition(KLP_UNIVERSE_OLD);
> + klp_try_complete_transition();
>
> patch->enabled = 0;
>
> @@ -340,6 +341,9 @@ static int __klp_enable_patch(struct klp_patch *patch)
> struct klp_object *obj;
> int ret;
>
> + if (klp_transition_patch)
> + return -EBUSY;
> +
> if (WARN_ON(patch->enabled))
> return -EINVAL;
>
> @@ -351,7 +355,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
> pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
> add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
>
> - pr_notice("enabling patch '%s'\n", patch->mod->name);
> + klp_init_transition(patch, KLP_UNIVERSE_OLD);
>
> for (obj = patch->objs; obj->funcs; obj++) {
> klp_find_object_module(obj);
> @@ -360,17 +364,24 @@ static int __klp_enable_patch(struct klp_patch *patch)
> continue;
>
> ret = klp_patch_object(obj);
> - if (ret)
> - goto unregister;
> + if (ret) {
> + pr_warn("failed to enable patch '%s'\n",
> + patch->mod->name);
> +
> + klp_unpatch_objects(patch);
> + klp_complete_transition();
> +
> + return ret;
> + }
> }
>
> + klp_start_transition(KLP_UNIVERSE_NEW);
> +
> + klp_try_complete_transition();
> +
> patch->enabled = 1;
>
> return 0;
> -
> -unregister:
> - WARN_ON(__klp_disable_patch(patch));
> - return ret;
> }
>
> /**
> @@ -407,6 +418,7 @@ EXPORT_SYMBOL_GPL(klp_enable_patch);
> * /sys/kernel/livepatch
> * /sys/kernel/livepatch/<patch>
> * /sys/kernel/livepatch/<patch>/enabled
> + * /sys/kernel/livepatch/<patch>/transition
> * /sys/kernel/livepatch/<patch>/<object>
> * /sys/kernel/livepatch/<patch>/<object>/<func>
> */
> @@ -435,7 +447,9 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
> goto err;
> }
>
> - if (val) {
> + if (klp_transition_patch == patch) {
> + klp_reverse_transition();
> + } else if (val) {
> ret = __klp_enable_patch(patch);
> if (ret)
> goto err;
> @@ -463,9 +477,21 @@ static ssize_t enabled_show(struct kobject *kobj,
> return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->enabled);
> }
>
> +static ssize_t transition_show(struct kobject *kobj,
> + struct kobj_attribute *attr, char *buf)
> +{
> + struct klp_patch *patch;
> +
> + patch = container_of(kobj, struct klp_patch, kobj);
> + return snprintf(buf, PAGE_SIZE-1, "%d\n",
> + klp_transition_patch == patch);
> +}
> +
> static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
> +static struct kobj_attribute transition_kobj_attr = __ATTR_RO(transition);
> static struct attribute *klp_patch_attrs[] = {
> &enabled_kobj_attr.attr,
> + &transition_kobj_attr.attr,
> NULL
> };
>
> @@ -543,6 +569,7 @@ static int klp_init_func(struct klp_object *obj, struct klp_func *func)
> {
> INIT_LIST_HEAD(&func->stack_node);
> func->patched = 0;
> + func->transition = 0;
>
> return kobject_init_and_add(&func->kobj, &klp_ktype_func,
> obj->kobj, func->old_name);
> @@ -725,7 +752,7 @@ static void klp_module_notify_coming(struct klp_patch *patch,
> if (ret)
> goto err;
>
> - if (!patch->enabled)
> + if (!patch->enabled && klp_transition_patch != patch)
> return;
>
> pr_notice("applying patch '%s' to loading module '%s'\n",
> @@ -746,7 +773,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
> struct module *pmod = patch->mod;
> struct module *mod = obj->mod;
>
> - if (!patch->enabled)
> + if (!patch->enabled && klp_transition_patch != patch)
> goto free;
>
> pr_notice("reverting patch '%s' on unloading module '%s'\n",
> diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> index 281fbca..f12256b 100644
> --- a/kernel/livepatch/patch.c
> +++ b/kernel/livepatch/patch.c
> @@ -24,6 +24,7 @@
> #include <linux/slab.h>
>
> #include "patch.h"
> +#include "transition.h"
>
> static LIST_HEAD(klp_ops);
>
> @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> ops = container_of(fops, struct klp_ops, fops);
>
> rcu_read_lock();
> +
> func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> stack_node);
> - rcu_read_unlock();
>
> if (WARN_ON_ONCE(!func))
> - return;
> + goto unlock;
> +
> + if (unlikely(func->transition)) {
> + /* corresponding smp_wmb() is in klp_init_transition() */
> + smp_rmb();
> +
> + if (current->klp_universe == KLP_UNIVERSE_OLD) {
> + /*
> + * Use the previously patched version of the function.
> + * If no previous patches exist, use the original
> + * function.
> + */
> + func = list_entry_rcu(func->stack_node.next,
> + struct klp_func, stack_node);
> +
> + if (&func->stack_node == &ops->func_stack)
> + goto unlock;
> + }
> + }
>
> klp_arch_set_pc(regs, (unsigned long)func->new_func);
> +unlock:
> + rcu_read_unlock();
> }
>
> struct klp_ops *klp_find_ops(unsigned long old_addr)
> @@ -174,3 +195,12 @@ int klp_patch_object(struct klp_object *obj)
>
> return 0;
> }
> +
> +void klp_unpatch_objects(struct klp_patch *patch)
> +{
> + struct klp_object *obj;
> +
> + for (obj = patch->objs; obj->funcs; obj++)
> + if (obj->patched)
> + klp_unpatch_object(obj);
> +}
> diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> index bb34bd3..1648259 100644
> --- a/kernel/livepatch/patch.h
> +++ b/kernel/livepatch/patch.h
> @@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);
>
> extern int klp_patch_object(struct klp_object *obj);
> extern void klp_unpatch_object(struct klp_object *obj);
> +extern void klp_unpatch_objects(struct klp_patch *patch);
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> new file mode 100644
> index 0000000..2630296
> --- /dev/null
> +++ b/kernel/livepatch/transition.c
> @@ -0,0 +1,300 @@
> +/*
> + * transition.c - Kernel Live Patching transition functions
> + *
> + * Copyright (C) 2015 Josh Poimboeuf <[email protected]>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/cpu.h>
> +#include <asm/stacktrace.h>
> +#include "../sched/sched.h"
> +
> +#include "patch.h"
> +#include "transition.h"
> +
> +static void klp_transition_work_fn(struct work_struct *);
> +static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);
> +
> +struct klp_patch *klp_transition_patch;
> +
> +int klp_universe_goal = KLP_UNIVERSE_UNDEFINED;
> +
> +static void klp_set_universe_goal(int universe)
> +{
> + klp_universe_goal = universe;
> +
> + /* corresponding smp_rmb() is in klp_update_task_universe() */
> + smp_wmb();
> +}
> +
> +/*
> + * The transition to the universe goal is complete. Clean up the data
> + * structures.
> + */
> +void klp_complete_transition(void)
> +{
> + struct klp_object *obj;
> + struct klp_func *func;
> +
> + for (obj = klp_transition_patch->objs; obj->funcs; obj++)
> + for (func = obj->funcs; func->old_name; func++)
> + func->transition = 0;
> +
> + klp_transition_patch = NULL;
> +}
> +
> +static int klp_stacktrace_address_verify_func(struct klp_func *func,
> + unsigned long address)
> +{
> + unsigned long func_addr, func_size;
> +
> + if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> + /* check the to-be-unpatched function (the func itself) */
> + func_addr = (unsigned long)func->new_func;
> + func_size = func->new_size;
> + } else {
> + /* check the to-be-patched function (previous func) */
> + struct klp_ops *ops;
> +
> + ops = klp_find_ops(func->old_addr);
> +
> + if (list_is_singular(&ops->func_stack)) {
> + /* original function */
> + func_addr = func->old_addr;
> + func_size = func->old_size;
> + } else {
> + /* previously patched function */
> + struct klp_func *prev;
> +
> + prev = list_next_entry(func, stack_node);
> + func_addr = (unsigned long)prev->new_func;
> + func_size = prev->new_size;
> + }
> + }
> +
> + if (address >= func_addr && address < func_addr + func_size)
> + return -1;
> +
> + return 0;
> +}
> +
> +/*
> + * Determine whether the given return address on the stack is within a
> + * to-be-patched or to-be-unpatched function.
> + */
> +static void klp_stacktrace_address_verify(void *data, unsigned long address,
> + int reliable)
> +{
> + struct klp_object *obj;
> + struct klp_func *func;
> + int *ret = data;
> +
> + if (*ret)
> + return;
> +
> + for (obj = klp_transition_patch->objs; obj->funcs; obj++) {
> + if (!obj->patched)
> + continue;
> + for (func = obj->funcs; func->old_name; func++) {
> + if (klp_stacktrace_address_verify_func(func, address)) {
> + *ret = -1;
> + return;
> + }
> + }
> + }
> +}
> +
> +static int klp_stacktrace_stack(void *data, char *name)
> +{
> + return 0;
> +}
> +
> +static const struct stacktrace_ops klp_stacktrace_ops = {
> + .address = klp_stacktrace_address_verify,
> + .stack = klp_stacktrace_stack,
> + .walk_stack = print_context_stack_bp,
> +};
> +
> +/*
> + * Try to safely transition a task to the universe goal. If the task is
> + * currently running or is sleeping on a to-be-patched or to-be-unpatched
> + * function, return false.
> + */
> +static bool klp_transition_task(struct task_struct *t)
> +{
> + struct rq *rq;
> + unsigned long flags;
> + int ret;
> + bool success = false;
> +
> + if (t->klp_universe == klp_universe_goal)
> + return true;
> +
> + rq = task_rq_lock(t, &flags);
> +
> + if (task_running(rq, t) && t != current) {
> + pr_debug("%s: pid %d (%s) is running\n", __func__, t->pid,
> + t->comm);
> + goto done;
> + }
> +
> + ret = 0;
> + dump_trace(t, NULL, NULL, 0, &klp_stacktrace_ops, &ret);
> + if (ret) {
> + pr_debug("%s: pid %d (%s) is sleeping on a patched function\n",
> + __func__, t->pid, t->comm);
> + goto done;
> + }
> +
> + klp_update_task_universe(t);
> +
> + success = true;
> +done:
> + task_rq_unlock(rq, t, &flags);
> + return success;
> +}
> +
> +/*
> + * Try to transition all tasks to the universe goal. If any tasks are still
> + * stuck in the original universe, schedule a retry.
> + */
> +void klp_try_complete_transition(void)
> +{
> + unsigned int cpu;
> + struct task_struct *g, *t;
> + bool complete = true;
> +
> + /* try to transition all normal tasks */
> + read_lock(&tasklist_lock);
> + for_each_process_thread(g, t)
> + if (!klp_transition_task(t))
> + complete = false;
> + read_unlock(&tasklist_lock);
> +
> + /* try to transition the idle "swapper" tasks */
> + get_online_cpus();
> + for_each_online_cpu(cpu)
> + if (!klp_transition_task(idle_task(cpu)))
> + complete = false;
> + put_online_cpus();
> +
> + /* if not complete, try again later */
> + if (!complete) {
> + schedule_delayed_work(&klp_transition_work,
> + round_jiffies_relative(HZ));
> + return;
> + }
> +
> + /* success! unpatch obsolete functions and do some cleanup */
> +
> + if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> + klp_unpatch_objects(klp_transition_patch);
> +
> + /* prevent ftrace handler from reading old func->transition */
> + synchronize_rcu();
> + }
> +
> + pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> + klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> + "unpatching");
> +
> + klp_complete_transition();
> +}
> +
> +static void klp_transition_work_fn(struct work_struct *work)
> +{
> + mutex_lock(&klp_mutex);
> +
> + if (klp_transition_patch)
> + klp_try_complete_transition();
> +
> + mutex_unlock(&klp_mutex);
> +}
> +
> +/*
> + * Start the transition to the specified universe so tasks can begin switching
> + * to it.
> + */
> +void klp_start_transition(int universe)
> +{
> + if (WARN_ON(klp_universe_goal == universe))
> + return;
> +
> + pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> + universe == KLP_UNIVERSE_NEW ? "patching" : "unpatching");
> +
> + klp_set_universe_goal(universe);
> +}
> +
> +/*
> + * Can be called in the middle of an existing transition to reverse the
> + * direction of the universe goal. This can be done to effectively cancel an
> + * existing enable or disable operation if there are any tasks which are stuck
> + * in the original universe.
> + */
> +void klp_reverse_transition(void)
> +{
> + struct klp_patch *patch = klp_transition_patch;
> +
> + klp_start_transition(!klp_universe_goal);
> + klp_try_complete_transition();
> +
> + patch->enabled = !patch->enabled;
> +}
> +
> +/*
> + * Reset the universe goal and all tasks to the starting universe, and set all
> + * func->transition's to 1 to prepare for patching.
> + */
> +void klp_init_transition(struct klp_patch *patch, int universe)
> +{
> + struct task_struct *g, *t;
> + unsigned int cpu;
> + struct klp_object *obj;
> + struct klp_func *func;
> +
> + klp_transition_patch = patch;
> +
> + /*
> + * If the previous transition was in the opposite direction, we may
> + * already be in the requested initial universe.
> + */
> + if (klp_universe_goal == universe)
> + goto init_funcs;
> +
> + klp_set_universe_goal(universe);
> +
> + /* init all normal task universes */
> + read_lock(&tasklist_lock);
> + for_each_process_thread(g, t)
> + klp_update_task_universe(t);
> + read_unlock(&tasklist_lock);
> +
> + /* init all idle "swapper" task universes */
> + get_online_cpus();
> + for_each_online_cpu(cpu)
> + klp_update_task_universe(idle_task(cpu));
> + put_online_cpus();
> +
> +init_funcs:
> + /* corresponding smp_rmb() is in klp_ftrace_handler() */
> + smp_wmb();
> +
> + for (obj = patch->objs; obj->funcs; obj++)
> + for (func = obj->funcs; func->old_name; func++)
> + func->transition = 1;
> +}
> diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
> new file mode 100644
> index 0000000..ba9a55c
> --- /dev/null
> +++ b/kernel/livepatch/transition.h
> @@ -0,0 +1,16 @@
> +#include <linux/livepatch.h>
> +
> +enum {
> + KLP_UNIVERSE_UNDEFINED = -1,
> + KLP_UNIVERSE_OLD,
> + KLP_UNIVERSE_NEW,
> +};
> +
> +extern struct mutex klp_mutex;
> +extern struct klp_patch *klp_transition_patch;
> +
> +extern void klp_init_transition(struct klp_patch *patch, int universe);
> +extern void klp_start_transition(int universe);
> +extern void klp_reverse_transition(void);
> +extern void klp_try_complete_transition(void);
> +extern void klp_complete_transition(void);
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 78d91e6..7b877f4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -74,6 +74,7 @@
> #include <linux/binfmts.h>
> #include <linux/context_tracking.h>
> #include <linux/compiler.h>
> +#include <linux/livepatch.h>
>
> #include <asm/switch_to.h>
> #include <asm/tlb.h>
> @@ -4601,6 +4602,7 @@ void init_idle(struct task_struct *idle, int cpu)
> #if defined(CONFIG_SMP)
> sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
> #endif
> + klp_update_task_universe(idle);
> }
>
> int cpuset_cpumask_can_shrink(const struct cpumask *cur,
> --
> 2.1.0
>

--
Josh

2015-02-12 10:45:15

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Wed, 11 Feb 2015, Josh Poimboeuf wrote:

> On Wed, Feb 11, 2015 at 11:21:51AM +0100, Miroslav Benes wrote:
> >
> > On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> >
> > [...]
> >
> > > @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> > > ops = container_of(fops, struct klp_ops, fops);
> > >
> > > rcu_read_lock();
> > > +
> > > func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > > stack_node);
> > > - rcu_read_unlock();
> > >
> > > if (WARN_ON_ONCE(!func))
> > > - return;
> > > + goto unlock;
> > > +
> > > + if (unlikely(func->transition)) {
> > > + /* corresponding smp_wmb() is in klp_init_transition() */
> > > + smp_rmb();
> > > +
> > > + if (current->klp_universe == KLP_UNIVERSE_OLD) {
> > > + /*
> > > + * Use the previously patched version of the function.
> > > + * If no previous patches exist, use the original
> > > + * function.
> > > + */
> > > + func = list_entry_rcu(func->stack_node.next,
> > > + struct klp_func, stack_node);
> > > +
> > > + if (&func->stack_node == &ops->func_stack)
> > > + goto unlock;
> > > + }
> > > + }
> > >
> > > klp_arch_set_pc(regs, (unsigned long)func->new_func);
> > > +unlock:
> > > + rcu_read_unlock();
> > > }
> >
> > I decided to understand the code more before answering the email about the
> > race and found another problem. I think.
> >
> > Imagine we patched some function foo() with foo_1() from patch_1 and now
> > we'd like to patch it again with foo_2() in patch_2. __klp_enable_patch
> > calls klp_init_transition which sets klp_universe for all processes to
> > KLP_UNIVERSE_OLD and marks the foo_2() for transition (it is gonna be 1).
> > Then __klp_enable_patch adds foo_2() to the RCU-protected list for foo().
> > BUT what if somebody calls foo() right between klp_init_transition and
> > the loop in __klp_enable_patch? The ftrace handler first returns the
> > first entry in the list which is foo_1() (foo_2() is still not present),
> > then it checks for func->transition. It is 1.
>
> No, actually foo_1()'s func->transition will be 0. Only foo_2()'s
> func->transition will be 1.

Ah, you're right in both cases. Sorry for the noise.

Miroslav

>
> > It checks for
> > current->klp_universe which is KLP_UNIVERSE_OLD and so the next entry is
> > retrieved. There is no such and therefore foo() is called. This is
> > obviously wrong because foo_1() was expected.
> >
> > Everything would work fine if one would call foo() before
> > klp_start_transition and after the loop in __klp_enable_patch. The
> > solution might be to move the setting of func->transition to
> > klp_start_transition, but this could break something different. I don't
> > know yet.
> >
> > Am I wrong?
> >
> > Miroslav
>
> --
> Josh
>

2015-02-12 11:56:36

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Wed, Feb 11, 2015 at 09:21:21PM -0600, Josh Poimboeuf wrote:
> Ingo, Peter,
>
> Would you have any objections to making task_rq_lock/unlock() non-static
> (or moving them to kernel/sched/sched.h) so they can be called by the
> livepatch code?

Basically yes. I really don't want to expose that. And
kernel/sched/sched.h is very much not intended for use outside of
kernel/sched/ so even that is a no go.

> To provide some background, I'm looking for a way to temporarily prevent
> a sleeping task from running while its stack is examined, to decide
> whether it can be safely switched to the new patching "universe". For
> more details see klp_transition_task() in the patch below.
>
> Using task_rq_lock() is the most straightforward way I could find to
> achieve that.

Its not at all clear how all this would work to me. And I'm not
motivated enough to go try and reverse engineer your patch; IMO
livepatching is utter fail.

If your infrastructure relies on the uptime of a single machine you've
lost already.

FWIW, the barriers in klp_update_task_universe() and
klp_set_universe_goal() look like complete crack, and their comments are
seriously deficient.

2015-02-12 12:25:18

by Jiri Kosina

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, 12 Feb 2015, Peter Zijlstra wrote:

> Its not at all clear how all this would work to me. And I'm not
> motivated enough to go try and reverse engineer your patch; IMO
> livepatching is utter fail.
>
> If your infrastructure relies on the uptime of a single machine you've
> lost already.

Well, the indisputable fact is that there is a demand for this. It's
not about one machine, it's about scheduling downtimes of datacentres.

But if this needs to be discussed, it should be done outside of this
thread I guess.

> FWIW, the barriers in klp_update_task_universe() and
> klp_set_universe_goal() look like complete crack, and their comments are
> seriously deficient.

These particular barriers seem correct to me; you basically need to make
sure that whenever a thread with TIF_KLP_NEED_UPDATE goes through
do_notify_resume(), it sees proper universe number to be converted to.

--
Jiri Kosina
SUSE Labs

2015-02-12 12:36:27

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, Feb 12, 2015 at 01:25:14PM +0100, Jiri Kosina wrote:
> On Thu, 12 Feb 2015, Peter Zijlstra wrote:

> > FWIW, the barriers in klp_update_task_universe() and
> > klp_set_universe_goal() look like complete crack, and their comments are
> > seriously deficient.
>
> These particular barriers seem correct to me; you basically need to make
> sure that whenever a thread with TIF_KLP_NEED_UPDATE goes through
> do_notify_resume(), it sees proper universe number to be converted to.

I'm not seeing how they're going to help with that.

The comment should describe the data race and how the barriers are
making it not happen.

putting wmb after a store and rmb before a read doesn't avoid the reader
seeing the old value in any universe I know of.

Barriers are about order, you need two consecutive stores for a wmb to
make sense, and two consecutive reads for an rmb, and if they're paired
the stores and reads need to be to the same addresses.

Without that they're pointless.

The comment doesn't describe which two variables are ordered how.

2015-02-12 12:39:09

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, Feb 12, 2015 at 01:25:14PM +0100, Jiri Kosina wrote:
> On Thu, 12 Feb 2015, Peter Zijlstra wrote:
>
> > Its not at all clear how all this would work to me. And I'm not
> > motivated enough to go try and reverse engineer your patch; IMO
> > livepatching is utter fail.
> >
> > If your infrastructure relies on the uptime of a single machine you've
> > lost already.
>
> Well, the indisputable fact is that there is a demand for this. It's
> not about one machine, it's about scheduling downtimes of datacentres.

The changelog says:

> ... A patch can remain in the
> transition state indefinitely, if any of the tasks are stuck in the
> previous universe.

Therefore there is no scheduling anything. Without timeliness guarantees
you can't make a schedule.

Might as well just reboot, at least that's fairly well guaranteed to
happen.

2015-02-12 12:39:55

by Jiri Kosina

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, 12 Feb 2015, Peter Zijlstra wrote:

> > > FWIW, the barriers in klp_update_task_universe() and
> > > klp_set_universe_goal() look like complete crack, and their comments are
> > > seriously deficient.
> >
> > These particular barriers seem correct to me; you basically need to make
> > sure that whenever a thread with TIF_KLP_NEED_UPDATE goes through
> > do_notify_resume(), it sees proper universe number to be converted to.
>
> I'm not seeing how they're going to help with that.
>
> The comment should describe the data race and how the barriers are
> making it not happen.
>
> putting wmb after a store and rmb before a read doesn't avoid the reader
> seeing the old value in any universe I know of.

This is about the dependency between klp_universe_goal and the
TIF_KLP_NEED_UPDATE bit in the threadinfo flags.

What is confusing here is that the threadinfo flag is not set in
klp_set_universe_goal() directly, but in the caller
(klp_start_transition()).

I fully agree with you that this deserves better comment though.
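
Roughly, for the barriers to order anything, the pairing being described
would have to look something like this (a sketch only; the names are from
the RFC, and it assumes TIF_KLP_NEED_UPDATE is set per task in
klp_start_transition() as mentioned above):

	/* writer, conceptually the klp_start_transition() path */
	klp_universe_goal = universe;			/* store A */
	smp_wmb();			/* order store A before store B */
	set_tsk_thread_flag(task, TIF_KLP_NEED_UPDATE);	/* store B */

	/* reader, conceptually the do_notify_resume() path */
	if (test_and_clear_tsk_thread_flag(current, TIF_KLP_NEED_UPDATE)) {
		smp_rmb();	/* order the flag load before the goal load */
		current->klp_universe = klp_universe_goal;
	}

The wmb/rmb only mean something relative to the flag store/load in the
caller, which is why the comments next to the barriers alone look opaque.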

--
Jiri Kosina
SUSE Labs

2015-02-12 12:42:05

by Jiri Kosina

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, 12 Feb 2015, Peter Zijlstra wrote:

> > Well, the indisputable fact is that there is a demand for this. It's
> > not about one machine, it's about scheduling downtimes of datacentres.
>
> The changelog says:
>
> > ... A patch can remain in the
> > transition state indefinitely, if any of the tasks are stuck in the
> > previous universe.
>
> Therefore there is no scheduling anything. Without timeliness guarantees
> you can't make a schedule.
>
> Might as well just reboot, at least that's fairly well guaranteed to
> happen.

All running (reasonably alive) tasks will be running patched code though.

You can't just claim complete victory (and get ready for accepting another
patch, etc) if there is a long-time sleeper that hasn't been converted
yet.

--
Jiri Kosina
SUSE Labs

2015-02-12 12:52:25

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, Feb 12, 2015 at 12:56:28PM +0100, Peter Zijlstra wrote:
> On Wed, Feb 11, 2015 at 09:21:21PM -0600, Josh Poimboeuf wrote:
> > Ingo, Peter,
> >
> > Would you have any objections to making task_rq_lock/unlock() non-static
> > (or moving them to kernel/sched/sched.h) so they can be called by the
> > livepatch code?
>
> Basically yes. I really don't want to expose that. And
> kernel/sched/sched.h is very much not intended for use outside of
> kernel/sched/ so even that is a no go.
>
> > To provide some background, I'm looking for a way to temporarily prevent
> > a sleeping task from running while its stack is examined, to decide
> > whether it can be safely switched to the new patching "universe". For
> > more details see klp_transition_task() in the patch below.
> >
> > Using task_rq_lock() is the most straightforward way I could find to
> > achieve that.
>
> Its not at all clear how all this would work to me. And I'm not
> motivated enough to go try and reverse engineer your patch;

The short answer is: I need a way to ensure that a task isn't sleeping
on any of the functions we're trying to patch. If it's not, then I can
switch the task over to start using new versions of functions.

Obviously, there are many more details than that. If you have specific
questions I can try to answer them.

> IMO livepatching is utter fail.
>
> If your infrastructure relies on the uptime of a single machine you've
> lost already.

It's not always about uptime. IMO it's usually more about decoupling
your reboot schedule from your distro's kernel release schedule.

Most users want to plan in advance when they're going to reboot, rather
than being at the mercy of when CVEs and kernel fixes are released.

Rebooting is costly and risky, even (or often especially) for large
systems for which you have to stagger the reboots. You want to do it at
a time when you're ready for something bad to happen, without having to
also worry about security in the mean time while you're waiting for your
reboot window.

> FWIW, the barriers in klp_update_task_universe() and
> klp_set_universe_goal() look like complete crack, and their comments are
> seriously deficient.

Ok, I'll try to improve the comments for the barriers.

--
Josh

2015-02-12 13:01:52

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, Feb 12, 2015 at 01:42:01PM +0100, Jiri Kosina wrote:
> On Thu, 12 Feb 2015, Peter Zijlstra wrote:
>
> > > Well, the indisputable fact is that there is a demand for this. It's
> > > not about one machine, it's about scheduling downtimes of datacentres.
> >
> > The changelog says:
> >
> > > ... A patch can remain in the
> > > transition state indefinitely, if any of the tasks are stuck in the
> > > previous universe.
> >
> > Therefore there is no scheduling anything. Without timeliness guarantees
> > you can't make a schedule.
> >
> > Might as well just reboot, at least that's fairly well guaranteed to
> > happen.
>
> All running (reasonably alive) tasks will be running patched code though.
>
> You can't just claim complete victory (and get ready for accepting another
> patch, etc) if there is a long-time sleeper that hasn't been converted
> yet.

Agreed. And also we have several strategies for reducing the time
needed to get all tasks to a patched state (see patch 9 of this series
for more details). The goal is to not leave systems in limbo for more
than a few seconds.

--
Josh

2015-02-12 13:08:24

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, Feb 12, 2015 at 06:51:49AM -0600, Josh Poimboeuf wrote:
> > > To provide some background, I'm looking for a way to temporarily prevent
> > > a sleeping task from running while its stack is examined, to decide
> > > whether it can be safely switched to the new patching "universe". For
> > > more details see klp_transition_task() in the patch below.
> > >
> > > Using task_rq_lock() is the most straightforward way I could find to
> > > achieve that.
> >
> > Its not at all clear how all this would work to me. And I'm not
> > motivated enough to go try and reverse engineer your patch;
>
> The short answer is: I need a way to ensure that a task isn't sleeping
> on any of the functions we're trying to patch. If it's not, then I can
> switch the task over to start using new versions of functions.
>
> Obviously, there are many more details than that. If you have specific
> questions I can try to answer them.

How can one task run new and another task old functions? Once you patch
any indirect function pointer any task will see the new call.

And what's wrong with using known good spots like the freezer?

2015-02-12 13:16:12

by Jiri Kosina

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, 12 Feb 2015, Peter Zijlstra wrote:

> > The short answer is: I need a way to ensure that a task isn't sleeping
> > on any of the functions we're trying to patch. If it's not, then I can
> > switch the task over to start using new versions of functions.
> >
> > Obviously, there are many more details than that. If you have specific
> > questions I can try to answer them.
>
> How can one task run new and another task old functions? Once you patch
> any indirect function pointer any task will see the new call.

Patched functions are redirected through the ftrace trampoline, and the
decision about which function (old or new) to redirect to is made there.

Function calls through a pointer always go first to the original function,
and get redirected from its __fentry__ site.

Once the system is in the fully patched state, the overhead of the
trampoline is reduced to a minimum (no expensive decision-making to be
done there, etc).

Sure, you will never be at 100% of the performance of the unpatched kernel
for redirected functions; the indirect call through the trampoline will
always be there (although ftrace with dynamic trampolines really minimizes
this penalty to a few extra instructions, one extra call and one extra ret
being the expensive ones).

> And what's wrong with using known good spots like the freezer?

It has undefined semantics when it comes to what you want to achieve here.

Say for example you have a kernel thread which does something like

	while (some_condition) {
		ret = foo();
		...
		try_to_freeze();
		...
	}

and you have a livepatch patching foo() and changing its return value
semantics. Then freezer doesn't really help.

--
Jiri Kosina
SUSE Labs

2015-02-12 13:16:34

by Jiri Slaby

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

Hi,

On 02/12/2015, 02:08 PM, Peter Zijlstra wrote:
> How can one task run new and another task old functions?

because this is how it is designed to work in one of the consistency models.

> Once you patch
> any indirect function pointer any task will see the new call.

It does not patch any pointers. Callees' fentrys are "patched" using ftrace.

> And what's wrong with using known good spots like the freezer?

This was already discussed too. Please STA.

thanks,
--
js
suse labs

2015-02-12 13:26:47

by Jiri Slaby

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On 02/12/2015, 04:21 AM, Josh Poimboeuf wrote:
> Ingo, Peter,
>
> Would you have any objections to making task_rq_lock/unlock() non-static
> (or moving them to kernel/sched/sched.h) so they can be called by the
> livepatch code?
>
> To provide some background, I'm looking for a way to temporarily prevent
> a sleeping task from running while its stack is examined, to decide
> whether it can be safely switched to the new patching "universe". For
> more details see klp_transition_task() in the patch below.
>
> Using task_rq_lock() is the most straightforward way I could find to
> achieve that.

Hi, I cannot speak to whether it is the proper way or not.

But if so, would it make sense to do the opposite: expose an API to walk
through the process's stack and make the decision? Concretely, move
parts of klp_stacktrace_address_verify_func to sched.c or somewhere in
kernel/sched/ and leave task_rq_lock untouched.

regards,
--
js
suse labs

2015-02-12 13:35:49

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, Feb 12, 2015 at 02:16:28PM +0100, Jiri Slaby wrote:
> > And what's wrong with using known good spots like the freezer?
>
> This was already discussed too. Please STA.

WTF is STA? You guys want something from me; I don't have time, nor
inclination to go hunt down whatever dark corner of the interweb
contains your ramblings.

If you can't be arsed to explain things, I certainly cannot be arsed to
consider your request.

So you now have my full NAK on touching the scheduler, have at it, go
deal with someone else.

2015-02-12 14:08:46

by Jiri Kosina

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, 12 Feb 2015, Peter Zijlstra wrote:

> > > And what's wrong with using known good spots like the freezer?
> >
> > This was already discussed too. Please STA.
>
> WTF is STA? You guys want something from me; I don't have time, nor
> inclination to go hunt down whatever dark corner of the interweb
> contains your ramblings.
>
> If you can't be arsed to explain things, I certainly cannot be arsed to
> consider your request.

I believe I have provided answer to the freezer question in my previous
mail, so please let's continue the discussion there if needed.

> So you now have my full NAK on touching the scheduler, have at it, go
> deal with someone else.

I personally am not a big fan of the task_rq_lock() public exposure
either. What might be generally useful though (not only for livepatching)
would be an API that would allow for a "safe" stack dump (where "safe" means
providing a guarantee that the dump wouldn't be interfered with by the
process waking up in the middle of it). Does that sound like an even
remotely acceptable idea to you?

Thanks,

--
Jiri Kosina
SUSE Labs

2015-02-12 14:20:22

by Jiri Slaby

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On 02/12/2015, 02:35 PM, Peter Zijlstra wrote:
> On Thu, Feb 12, 2015 at 02:16:28PM +0100, Jiri Slaby wrote:
>>> And what's wrong with using known good spots like the freezer?
>>
>> This was already discussed too. Please STA.
>
> > WTF is STA? You guys want something from me; I don't have time, nor
> inclination to go hunt down whatever dark corner of the interweb
> contains your ramblings.

You definitely do not need STA, if you don't want to know the details. I
think repeating the whole thread would not be productive for all of us.

The short answer you can read from the above is: it is not possible. On
top of that, Jiri provided you with a simple example showing why.

> If you can't be arsed to explain things, I certainly cannot be arsed to
> consider your request.

Please see above.

> So you now have my full NAK on touching the scheduler, have at it, go
> deal with someone else.

Ok, we already got your expressed attitude towards live patching. This
is not the kind of input we were hoping for, though. Could you comment on
the technical aspects and the proposed solutions instead?

thanks,
--
js
suse labs

2015-02-12 14:21:06

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, Feb 12, 2015 at 02:16:07PM +0100, Jiri Kosina wrote:
> On Thu, 12 Feb 2015, Peter Zijlstra wrote:
>
> > > The short answer is: I need a way to ensure that a task isn't sleeping
> > > on any of the functions we're trying to patch. If it's not, then I can
> > > switch the task over to start using new versions of functions.
> > >
> > > Obviously, there are many more details than that. If you have specific
> > > questions I can try to answer them.
> >
> > How can one task run new and another task old functions? Once you patch
> > any indirect function pointer any task will see the new call.
>
> Patched functions are redirected through the ftrace trampoline, and the
> decision about which function (old or new) to redirect to is made there.
>
> Function calls through a pointer always go first to the original function,
> and get redirected from its __fentry__ site.
>
> Once the system is in the fully patched state, the overhead of the
> trampoline is reduced to a minimum (no expensive decision-making to be
> done there, etc).
>
> Sure, you will never be at 100% of the performance of the unpatched kernel
> for redirected functions; the indirect call through the trampoline will
> always be there (although ftrace with dynamic trampolines really minimizes
> this penalty to a few extra instructions, one extra call and one extra ret
> being the expensive ones).
>
> > And what's wrong with using known good spots like the freezer?
>
> It has undefined semantics when it comes to what you want to achieve here.
>
> Say for example you have a kernel thread which does something like
>
> 	while (some_condition) {
> 		ret = foo();
> 		...
> 		try_to_freeze();
> 		...
> 	}
>
> and you have a livepatch patching foo() and changing its return value
> semantics. Then freezer doesn't really help.

Don't we have the same issue with livepatch? For example:

	while (some_condition) {
		ret = foo();
		...
		schedule();	<-- switched to the new universe while it sleeps
		...
		// use ret in an unexpected way
	}

I think it's not really a problem, just something the patch author needs
to be aware of regardless. It should be part of the checklist. You
always need to be extremely careful when changing a function's return
semantics.

IIRC, when I looked at the freezer before, the biggest problems I found
were that it's too disruptive to the process, and that not all kthreads
are freezable. And I don't see anything inherently safer about it
compared to just stack checking.

--
Josh

2015-02-12 14:27:08

by Jiri Kosina

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, 12 Feb 2015, Josh Poimboeuf wrote:

> > and you have a livepatch patching foo() and changing its return value
> > semantics. Then freezer doesn't really help.
>
> Don't we have the same issue with livepatch? For example:
>
> 	while (some_condition) {
> 		ret = foo();
> 		...
> 		schedule();	<-- switched to the new universe while it sleeps
> 		...
> 		// use ret in an unexpected way
> 	}

Well, if ret is changing semantics, the livepatch will also have to patch
the calling function (so that it handles the new semantics properly), and
therefore by looking at the stacks you would see that fact and wouldn't
migrate the scheduled-out task to the new universe.
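
Purely for illustration (a hypothetical patch-module layout using the
struct names from this series; foo/foo_new/bar/bar_new are made up): such
a patch would list both the function whose return semantics change and its
fixed-up caller, so the stack check covers both:

	static struct klp_func funcs[] = {
		{
			.old_name = "foo",
			.new_func = foo_new,	/* changed return semantics */
		},
		{
			.old_name = "bar",
			.new_func = bar_new,	/* caller fixed up for them */
		},
		{ }
	};

	static struct klp_object objs[] = {
		{
			/* NULL name means vmlinux */
			.funcs = funcs,
		},
		{ }
	};

	static struct klp_patch patch = {
		.mod = THIS_MODULE,
		.objs = objs,
	};

A task sleeping inside bar() then fails the stack check and simply stays
in the old universe until it gets out of bar().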

> I think it's not really a problem, just something the patch author needs
> to be aware of regardless.

Exactly; it's just up to the patch author to understand what the
semantic aspects of the patch he is writing are, and to make the
appropriate consistency model choice.

Thanks,

--
Jiri Kosina
SUSE Labs

2015-02-12 14:32:48

by Jiri Kosina

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, 12 Feb 2015, Peter Zijlstra wrote:

> And what's wrong with using known good spots like the freezer?

Quoting Tejun from the thread Jiri Slaby likely had on mind:

"The fact that they may coincide often can be useful as a guideline or
whatever but I'm completely against just mushing it together when it isn't
correct. This kind of things quickly lead to ambiguous situations where
people are not sure about the specific semantics or guarantees of the
construct and implement weird voodoo code followed by voodoo fixes. We
already had a full round of that with the kernel freezer itself, where
people thought that the freezer magically makes PM work properly for a
subsystem. Let's please not do that again."

The whole thread begins here, in case everything hasn't been covered here
yet:

https://lkml.org/lkml/2014/7/2/328

Thanks again for looking into this,

--
Jiri Kosina
SUSE Labs

2015-02-12 15:22:28

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed

On Tue, 10 Feb 2015, Jiri Slaby wrote:

> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> ...
> > @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
> >
> > static void klp_kobj_release_patch(struct kobject *kobj)
> > {
> > - /*
> > - * Once we have a consistency model we'll need to module_put() the
> > - * patch module here. See klp_register_patch() for more details.
> > - */
>
> I deliberately let you write the note in there :). What happens when I
> leave some attribute in /sys open and you remove the module in the meantime?

And if that attribute is <enabled> it can even lead to a deadlock. You
can try it yourself with the patchset applied and lockdep on: a simple
series of insmod, disable and rmmod of the patch.

Just for the sake of completeness...

Miroslav

>
> > --- a/kernel/livepatch/transition.c
> > +++ b/kernel/livepatch/transition.c
> > @@ -54,6 +54,9 @@ void klp_complete_transition(void)
> > for (func = obj->funcs; func->old_name; func++)
> > func->transition = 0;
> >
> > + if (klp_universe_goal == KLP_UNIVERSE_OLD)
> > + module_put(klp_transition_patch->mod);
> > +
> > klp_transition_patch = NULL;
> > }

2015-02-12 15:24:34

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, Feb 12, 2015 at 03:08:38PM +0100, Jiri Kosina wrote:
> On Thu, 12 Feb 2015, Peter Zijlstra wrote:
> I personally am not a big fan of the task_rq_lock() public exposure
> either. What might be generally useful though (not only for livepatching)
> would be an API that would allow for a "safe" stack dump (where "safe" means
> providing a guarantee that the dump wouldn't be interfered with by the
> process waking up in the middle of it).

In general, I think a safe stack dump is needed. A lot of the stack
dumping in the kernel seems dangerous. For example, it looks like doing
a `cat /proc/pid/stack` while the process is writing the stack could
easily go off into the weeds.

But I don't see how it would help the livepatch case. What happens if
the process starts running in the to-be-patched function after we call
the "safe" dump_stack() but before switching it to the new universe?

--
Josh

2015-02-12 15:49:23

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, Feb 12, 2015 at 02:26:42PM +0100, Jiri Slaby wrote:
> On 02/12/2015, 04:21 AM, Josh Poimboeuf wrote:
> > Ingo, Peter,
> >
> > Would you have any objections to making task_rq_lock/unlock() non-static
> > (or moving them to kernel/sched/sched.h) so they can be called by the
> > livepatch code?
> >
> > To provide some background, I'm looking for a way to temporarily prevent
> > a sleeping task from running while its stack is examined, to decide
> > whether it can be safely switched to the new patching "universe". For
> > more details see klp_transition_task() in the patch below.
> >
> > Using task_rq_lock() is the most straightforward way I could find to
> > achieve that.
>
> Hi, I cannot speak to whether it is the proper way or not.
>
> But if so, would it make sense to do the opposite: expose an API to walk
> through the process's stack and make the decision? Concretely, move
> parts of klp_stacktrace_address_verify_func to sched.c or somewhere in
> kernel/sched/ and leave task_rq_lock untouched.

Yeah, it makes sense in theory. But I'm not sure how to do that in a
way that prevents races when switching the task's universe. I think we
need the rq locked for both the stack walk and the universe switch.

In general, I agree it would be good to find a way to keep the rq
locking functions in sched.c.
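
For reference, a minimal sketch of that approach -- roughly what
klp_transition_task() in the RFC is doing, but simplified: klp_check_stack()
stands in for the real stack-walking helper, and error handling is omitted,
so this is not the code from the posted patches:

        static bool klp_transition_task(struct task_struct *task)
        {
                unsigned long flags;
                struct rq *rq;
                bool switched = false;

                /* Pin the task: it cannot start running while we hold its rq lock. */
                rq = task_rq_lock(task, &flags);

                /* Can't examine the stack of a task currently running on a CPU. */
                if (task_running(rq, task) && task != current)
                        goto out;

                /* Hypothetical helper: nonzero if a to-be-patched function is on the stack. */
                if (klp_check_stack(task))
                        goto out;

                /*
                 * The stack walk and the universe switch happen under the same
                 * rq lock, so the task cannot wake up in between -- this is the
                 * race a plain "safe stack dump" API would not close.
                 */
                klp_update_task_universe(task);
                switched = true;
        out:
                task_rq_unlock(rq, task, &flags);
                return switched;
        }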

--
Josh

2015-02-13 10:14:09

by Jiri Kosina

[permalink] [raw]
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> My biggest concerns and questions related to this patch set are:
>
> 1) To safely examine the task stacks, the transition code locks each task's rq
> struct, which requires using the scheduler's internal rq locking functions.
> It seems to work well, but I'm not sure if there's a cleaner way to safely
> do stack checking without stop_machine().

How about we take a slightly different approach -- put a probe (or ftrace)
on __switch_to() during a klp transition period, and examine stacktraces
for tasks that are just about to start running from there?

The only tasks that would not be covered by this would be purely CPU-bound
tasks that never schedule. But we are likely in trouble with those anyway,
because odds are that non-rescheduling CPU-bound tasks are also
RT-priority tasks running on isolated CPUs, which we will fail to handle
anyway.

I think Masami used a similar trick in his kpatch-without-stopmachine
approach.
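
For illustration, a rough sketch of what such a probe could look like.
Whether __switch_to() can actually be probed safely this way is part of the
open question here; klp_in_transition() and klp_stack_is_safe() are
hypothetical helpers, and the argument fetch is x86_64-specific:

        #include <linux/kprobes.h>
        #include <linux/ptrace.h>
        #include <linux/sched.h>

        static int klp_switch_pre_handler(struct kprobe *p, struct pt_regs *regs)
        {
                /* On x86_64, 'next' is the second argument to __switch_to(). */
                struct task_struct *next = (struct task_struct *)regs->si;

                if (klp_in_transition() && klp_stack_is_safe(next))
                        klp_update_task_universe(next);

                return 0;
        }

        static struct kprobe klp_switch_probe = {
                .symbol_name    = "__switch_to",
                .pre_handler    = klp_switch_pre_handler,
        };

        /*
         * Registered only while a transition is in progress:
         *
         *      register_kprobe(&klp_switch_probe);
         *      ...
         *      unregister_kprobe(&klp_switch_probe);
         */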

--
Jiri Kosina
SUSE Labs

2015-02-13 12:25:43

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 1/9] livepatch: simplify disable error path

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> If registering the function with ftrace has previously succeeded,
> unregistering will almost never fail. Even if it does, it's not a fatal
> error. We can still carry on and disable the klp_func from being used
> by removing it from the klp_ops func stack.
>
> Signed-off-by: Josh Poimboeuf <[email protected]>

This makes sense, so

Reviewed-by: Miroslav Benes <[email protected]>

I think this patch could be taken independently of the consistency model.
If no one else has any objection...

Miroslav

> ---
> kernel/livepatch/core.c | 67 +++++++++++++------------------------------------
> 1 file changed, 17 insertions(+), 50 deletions(-)
>
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 9adf86b..081df77 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -322,32 +322,20 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> klp_arch_set_pc(regs, (unsigned long)func->new_func);
> }
>
> -static int klp_disable_func(struct klp_func *func)
> +static void klp_disable_func(struct klp_func *func)
> {
> struct klp_ops *ops;
> - int ret;
> -
> - if (WARN_ON(func->state != KLP_ENABLED))
> - return -EINVAL;
>
> - if (WARN_ON(!func->old_addr))
> - return -EINVAL;
> + WARN_ON(func->state != KLP_ENABLED);
> + WARN_ON(!func->old_addr);
>
> ops = klp_find_ops(func->old_addr);
> if (WARN_ON(!ops))
> - return -EINVAL;
> + return;
>
> if (list_is_singular(&ops->func_stack)) {
> - ret = unregister_ftrace_function(&ops->fops);
> - if (ret) {
> - pr_err("failed to unregister ftrace handler for function '%s' (%d)\n",
> - func->old_name, ret);
> - return ret;
> - }
> -
> - ret = ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0);
> - if (ret)
> - pr_warn("function unregister succeeded but failed to clear the filter\n");
> + WARN_ON(unregister_ftrace_function(&ops->fops));
> + WARN_ON(ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0));
>
> list_del_rcu(&func->stack_node);
> list_del(&ops->node);
> @@ -357,8 +345,6 @@ static int klp_disable_func(struct klp_func *func)
> }
>
> func->state = KLP_DISABLED;
> -
> - return 0;
> }
>
> static int klp_enable_func(struct klp_func *func)
> @@ -419,23 +405,15 @@ err:
> return ret;
> }
>
> -static int klp_disable_object(struct klp_object *obj)
> +static void klp_disable_object(struct klp_object *obj)
> {
> struct klp_func *func;
> - int ret;
>
> - for (func = obj->funcs; func->old_name; func++) {
> - if (func->state != KLP_ENABLED)
> - continue;
> -
> - ret = klp_disable_func(func);
> - if (ret)
> - return ret;
> - }
> + for (func = obj->funcs; func->old_name; func++)
> + if (func->state == KLP_ENABLED)
> + klp_disable_func(func);
>
> obj->state = KLP_DISABLED;
> -
> - return 0;
> }
>
> static int klp_enable_object(struct klp_object *obj)
> @@ -451,22 +429,19 @@ static int klp_enable_object(struct klp_object *obj)
>
> for (func = obj->funcs; func->old_name; func++) {
> ret = klp_enable_func(func);
> - if (ret)
> - goto unregister;
> + if (ret) {
> + klp_disable_object(obj);
> + return ret;
> + }
> }
> obj->state = KLP_ENABLED;
>
> return 0;
> -
> -unregister:
> - WARN_ON(klp_disable_object(obj));
> - return ret;
> }
>
> static int __klp_disable_patch(struct klp_patch *patch)
> {
> struct klp_object *obj;
> - int ret;
>
> /* enforce stacking: only the last enabled patch can be disabled */
> if (!list_is_last(&patch->list, &klp_patches) &&
> @@ -476,12 +451,8 @@ static int __klp_disable_patch(struct klp_patch *patch)
> pr_notice("disabling patch '%s'\n", patch->mod->name);
>
> for (obj = patch->objs; obj->funcs; obj++) {
> - if (obj->state != KLP_ENABLED)
> - continue;
> -
> - ret = klp_disable_object(obj);
> - if (ret)
> - return ret;
> + if (obj->state == KLP_ENABLED)
> + klp_disable_object(obj);
> }
>
> patch->state = KLP_DISABLED;
> @@ -931,7 +902,6 @@ static void klp_module_notify_going(struct klp_patch *patch,
> {
> struct module *pmod = patch->mod;
> struct module *mod = obj->mod;
> - int ret;
>
> if (patch->state == KLP_DISABLED)
> goto disabled;
> @@ -939,10 +909,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
> pr_notice("reverting patch '%s' on unloading module '%s'\n",
> pmod->name, mod->name);
>
> - ret = klp_disable_object(obj);
> - if (ret)
> - pr_warn("failed to revert patch '%s' on module '%s' (%d)\n",
> - pmod->name, mod->name, ret);
> + klp_disable_object(obj);
>
> disabled:
> klp_free_object_loaded(obj);
> --
> 2.1.0
>

2015-02-13 12:44:20

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed

On Thu, Feb 12, 2015 at 04:22:24PM +0100, Miroslav Benes wrote:
> On Tue, 10 Feb 2015, Jiri Slaby wrote:
>
> > On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > > --- a/kernel/livepatch/core.c
> > > +++ b/kernel/livepatch/core.c
> > ...
> > > @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
> > >
> > > static void klp_kobj_release_patch(struct kobject *kobj)
> > > {
> > > - /*
> > > - * Once we have a consistency model we'll need to module_put() the
> > > - * patch module here. See klp_register_patch() for more details.
> > > - */
> >
> > I deliberately let you write the note in there :). What happens when I
> > leave some attribute in /sys open and you remove the module in the meantime?
>
> And if that attribute is <enabled> it can lead even to the deadlock. You
> can try it yourself with the patchset applied and lockdep on. Simple
> series of insmod, disable and rmmod of the patch.
>
> Just for the sake of completeness...

Ouch, thanks.

--
Josh

2015-02-13 12:57:45

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 2/9] livepatch: separate enabled and patched states

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> Once we have a consistency model, patches and their objects will be
> enabled and disabled at different times. For example, when a patch is
> disabled, its loaded objects' funcs can remain registered with ftrace
> indefinitely until the unpatching operation is complete and they're no
> longer in use.
>
> It's less confusing if we give them different names: patches can be
> enabled or disabled; objects (and their funcs) can be patched or
> unpatched:
>
> - Enabled means that a patch is logically enabled (but not necessarily
> fully applied).
>
> - Patched means that an object's funcs are registered with ftrace and
> added to the klp_ops func stack.
>
> Also, since these states are binary, represent them with boolean-type
> variables instead of enums.

They are binary now, but will that also hold in the future? I cannot come up
with any other possible state of the function right now, but that doesn't
mean there isn't one. It would be sad to have to go back to enums one day
:)

Also, would it be useful to expose the patched variable for functions and
objects in sysfs?

Two small things below...

> Signed-off-by: Josh Poimboeuf <[email protected]>
> ---
> include/linux/livepatch.h | 15 ++++-----
> kernel/livepatch/core.c | 79 +++++++++++++++++++++++------------------------
> 2 files changed, 45 insertions(+), 49 deletions(-)
>
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 95023fd..22a67d1 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -28,11 +28,6 @@
>
> #include <asm/livepatch.h>
>
> -enum klp_state {
> - KLP_DISABLED,
> - KLP_ENABLED
> -};
> -
> /**
> * struct klp_func - function structure for live patching
> * @old_name: name of the function to be patched
> @@ -42,6 +37,7 @@ enum klp_state {
> * @kobj: kobject for sysfs resources
> * @state: tracks function-level patch application state
> * @stack_node: list node for klp_ops func_stack list
> + * @patched: the func has been added to the klp_ops list
> */
> struct klp_func {
> /* external */
> @@ -59,8 +55,8 @@ struct klp_func {
>
> /* internal */
> struct kobject kobj;
> - enum klp_state state;
> struct list_head stack_node;
> + int patched;
> };

@state remains in the comment above

> /**
> @@ -90,7 +86,7 @@ struct klp_reloc {
> * @kobj: kobject for sysfs resources
> * @mod: kernel module associated with the patched object
> * (NULL for vmlinux)
> - * @state: tracks object-level patch application state
> + * @patched: the object's funcs have been add to the klp_ops list
> */
> struct klp_object {
> /* external */
> @@ -101,7 +97,7 @@ struct klp_object {
> /* internal */
> struct kobject *kobj;
> struct module *mod;
> - enum klp_state state;
> + int patched;
> };
>
> /**
> @@ -111,6 +107,7 @@ struct klp_object {
> * @list: list node for global list of registered patches
> * @kobj: kobject for sysfs resources
> * @state: tracks patch-level application state
> + * @enabled: the patch is enabled (but operation may be incomplete)
> */
> struct klp_patch {
> /* external */
> @@ -120,7 +117,7 @@ struct klp_patch {
> /* internal */
> struct list_head list;
> struct kobject kobj;
> - enum klp_state state;
> + int enabled;
> };

Ditto

Miroslav

2015-02-13 14:19:21

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

On Fri, Feb 13, 2015 at 11:14:01AM +0100, Jiri Kosina wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
>
> > My biggest concerns and questions related to this patch set are:
> >
> > 1) To safely examine the task stacks, the transition code locks each task's rq
> > struct, which requires using the scheduler's internal rq locking functions.
> > It seems to work well, but I'm not sure if there's a cleaner way to safely
> > do stack checking without stop_machine().
>
> How about we take a slightly different aproach -- put a probe (or ftrace)
> on __switch_to() during a klp transition period, and examine stacktraces
> for tasks that are just about to start running from there?
>
> The only tasks that would not be covered by this would be purely CPU-bound
> tasks that never schedule. But we are likely in trouble with those anyway,
> because odds are that non-rescheduling CPU-bound tasks are also
> RT-priority tasks running on isolated CPUs, which we will fail to handle
> anyway.
>
> I think Masami used similar trick in his kpatch-without-stopmachine
> aproach.

Yeah, that's definitely an option, though I'm really not too crazy about
it. Hooking into the scheduler is kind of scary and disruptive. We'd
also have to wake up all the sleeping processes.

--
Josh

2015-02-13 14:22:18

by Jiri Kosina

[permalink] [raw]
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

On Fri, 13 Feb 2015, Josh Poimboeuf wrote:

> > How about we take a slightly different aproach -- put a probe (or ftrace)
> > on __switch_to() during a klp transition period, and examine stacktraces
> > for tasks that are just about to start running from there?
> >
> > The only tasks that would not be covered by this would be purely CPU-bound
> > tasks that never schedule. But we are likely in trouble with those anyway,
> > because odds are that non-rescheduling CPU-bound tasks are also
> > RT-priority tasks running on isolated CPUs, which we will fail to handle
> > anyway.
> >
> > I think Masami used similar trick in his kpatch-without-stopmachine
> > aproach.
>
> Yeah, that's definitely an option, though I'm really not too crazy about
> it. Hooking into the scheduler is kind of scary and disruptive.

This is basically about running a stack check on ->next before
switching to it, i.e. a read-only operation (admittedly inducing some
latency, but that's the same as with locking the runqueue). And only while
in the transition phase.

> We'd also have to wake up all the sleeping processes.

Yes, I don't think there is a way around that.

Thanks,

--
Jiri Kosina
SUSE Labs

2015-02-13 14:28:34

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 3/9] livepatch: move patching functions into patch.c

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> Move functions related to the actual patching of functions and objects
> into a new patch.c file.

I am definitely for splitting the code into several different files.
Otherwise it would soon become unmanageable. However, I don't know if this
patch is the best possible split. Maybe it is just nitpicking, so let's not
spend too much time on this :)

Without this patch there are several different groups of functions in
core.c:
1. infrastructure such as global variables, klp_init and some helper
functions
2. (un)registration and initialization of the patch
3. enable/disable with patching/unpatching, ftrace handler
4. sysfs code
5. module notifier
6. relocations

I would move sysfs code away to separate file.

If we decide to move the patching code, I think it would make sense to move
the enable/disable functions along with it. Or perhaps only __klp_enable_patch
and __klp_disable_patch. It is possible, though, that the result would be
much worse.

Or we can move some other group of functions...

[...]

> diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> new file mode 100644
> index 0000000..bb34bd3
> --- /dev/null
> +++ b/kernel/livepatch/patch.h
> @@ -0,0 +1,25 @@
> +#include <linux/livepatch.h>
> +
> +/**
> + * struct klp_ops - structure for tracking registered ftrace ops structs
> + *
> + * A single ftrace_ops is shared between all enabled replacement functions
> + * (klp_func structs) which have the same old_addr. This allows the switch
> + * between function versions to happen instantaneously by updating the klp_ops
> + * struct's func_stack list. The winner is the klp_func at the top of the
> + * func_stack (front of the list).
> + *
> + * @node: node for the global klp_ops list
> + * @func_stack: list head for the stack of klp_func's (active func is on top)
> + * @fops: registered ftrace ops struct
> + */
> +struct klp_ops {
> + struct list_head node;
> + struct list_head func_stack;
> + struct ftrace_ops fops;
> +};
> +
> +struct klp_ops *klp_find_ops(unsigned long old_addr);
> +
> +extern int klp_patch_object(struct klp_object *obj);
> +extern void klp_unpatch_object(struct klp_object *obj);

Is there a reason why klp_find_ops is not extern and the other two
functions are? I think it is redundant and it is better to be consistent.

Regards,
Miroslav

2015-02-13 14:39:46

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 2/9] livepatch: separate enabled and patched states

On Fri, Feb 13, 2015 at 01:57:38PM +0100, Miroslav Benes wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
>
> > Once we have a consistency model, patches and their objects will be
> > enabled and disabled at different times. For example, when a patch is
> > disabled, its loaded objects' funcs can remain registered with ftrace
> > indefinitely until the unpatching operation is complete and they're no
> > longer in use.
> >
> > It's less confusing if we give them different names: patches can be
> > enabled or disabled; objects (and their funcs) can be patched or
> > unpatched:
> >
> > - Enabled means that a patch is logically enabled (but not necessarily
> > fully applied).
> >
> > - Patched means that an object's funcs are registered with ftrace and
> > added to the klp_ops func stack.
> >
> > Also, since these states are binary, represent them with boolean-type
> > variables instead of enums.
>
> They are binary now but will it hold also in the future? I cannot come up
> with any other possible state of the function right now, but that doesn't
> mean there isn't any. It would be sad to return it back to enums one day
> :)

I really can't think of any reason why they would become non-binary.
IMO it's more likely we could add more boolean variables, but if that
got out of hand we could just switch to using bit flags.

Either way I don't see a problem with changing them later if we need to.

> Also would it be useful to expose patched variable for functions and
> objects in sysfs?

Not that I know of. Do you have a use case in mind? I view "patched"
as an internal variable, corresponding to whether the object or its
functions are registered with ftrace/klp_ops. It doesn't mean "patched"
in a way that would really make sense to the user, because of the
gradual nature of the patching process.

>
> Two small things below...

Agreed to both, thanks.

>
> > Signed-off-by: Josh Poimboeuf <[email protected]>
> > ---
> > include/linux/livepatch.h | 15 ++++-----
> > kernel/livepatch/core.c | 79 +++++++++++++++++++++++------------------------
> > 2 files changed, 45 insertions(+), 49 deletions(-)
> >
> > diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> > index 95023fd..22a67d1 100644
> > --- a/include/linux/livepatch.h
> > +++ b/include/linux/livepatch.h
> > @@ -28,11 +28,6 @@
> >
> > #include <asm/livepatch.h>
> >
> > -enum klp_state {
> > - KLP_DISABLED,
> > - KLP_ENABLED
> > -};
> > -
> > /**
> > * struct klp_func - function structure for live patching
> > * @old_name: name of the function to be patched
> > @@ -42,6 +37,7 @@ enum klp_state {
> > * @kobj: kobject for sysfs resources
> > * @state: tracks function-level patch application state
> > * @stack_node: list node for klp_ops func_stack list
> > + * @patched: the func has been added to the klp_ops list
> > */
> > struct klp_func {
> > /* external */
> > @@ -59,8 +55,8 @@ struct klp_func {
> >
> > /* internal */
> > struct kobject kobj;
> > - enum klp_state state;
> > struct list_head stack_node;
> > + int patched;
> > };
>
> @state remains in the comment above
>
> > /**
> > @@ -90,7 +86,7 @@ struct klp_reloc {
> > * @kobj: kobject for sysfs resources
> > * @mod: kernel module associated with the patched object
> > * (NULL for vmlinux)
> > - * @state: tracks object-level patch application state
> > + * @patched: the object's funcs have been add to the klp_ops list
> > */
> > struct klp_object {
> > /* external */
> > @@ -101,7 +97,7 @@ struct klp_object {
> > /* internal */
> > struct kobject *kobj;
> > struct module *mod;
> > - enum klp_state state;
> > + int patched;
> > };
> >
> > /**
> > @@ -111,6 +107,7 @@ struct klp_object {
> > * @list: list node for global list of registered patches
> > * @kobj: kobject for sysfs resources
> > * @state: tracks patch-level application state
> > + * @enabled: the patch is enabled (but operation may be incomplete)
> > */
> > struct klp_patch {
> > /* external */
> > @@ -120,7 +117,7 @@ struct klp_patch {
> > /* internal */
> > struct list_head list;
> > struct kobject kobj;
> > - enum klp_state state;
> > + int enabled;
> > };
>
> Dtto
>
> Miroslav

--
Josh

2015-02-13 14:40:21

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

On Fri, 13 Feb 2015, Jiri Kosina wrote:

> On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
>
> > > How about we take a slightly different aproach -- put a probe (or ftrace)
> > > on __switch_to() during a klp transition period, and examine stacktraces
> > > for tasks that are just about to start running from there?
> > >
> > > The only tasks that would not be covered by this would be purely CPU-bound
> > > tasks that never schedule. But we are likely in trouble with those anyway,
> > > because odds are that non-rescheduling CPU-bound tasks are also
> > > RT-priority tasks running on isolated CPUs, which we will fail to handle
> > > anyway.
> > >
> > > I think Masami used similar trick in his kpatch-without-stopmachine
> > > aproach.
> >
> > Yeah, that's definitely an option, though I'm really not too crazy about
> > it. Hooking into the scheduler is kind of scary and disruptive.
>
> This is basically about running a stack checking for ->next before
> switching to it, i.e. read-only operation (admittedly inducing some
> latency, but that's the same with locking the runqueue). And only when in
> transition phase.
>
> > We'd also have to wake up all the sleeping processes.
>
> Yes, I don't think there is a way around that.

I think there are two options for how to do it, if I understand you correctly.

1. we would put a probe on __switch_to and afterwards wake up all the
sleeping processes.

2. we would do it in an asynchronous manner. We would put a probe and let
the processes wake up on their own. The transition delayed workqueue
would only check whether there is some non-migrated process left (a rough
sketch follows below). Of course, if some process sleeps for a long time,
it would take a long time to complete the patching. It would be up to the
user to send a signal to the process to wake it up.

Does it make sense? If yes, I cannot decide which approach is better.
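
A rough sketch of the asynchronous variant (option 2) -- a delayed work that
periodically retries the remaining tasks. klp_transition_task() and the exact
locking are simplifications here, not the body of the RFC's
klp_transition_work_fn():

        static void klp_transition_work_fn(struct work_struct *work)
        {
                struct task_struct *g, *task;
                bool complete = true;

                read_lock(&tasklist_lock);
                for_each_process_thread(g, task) {
                        if (task->klp_universe == klp_universe_goal)
                                continue;
                        /* Re-check the task's stack and switch it if possible. */
                        if (!klp_transition_task(task))
                                complete = false;
                }
                read_unlock(&tasklist_lock);

                if (!complete) {
                        /* Some tasks are still in the old universe; retry later. */
                        schedule_delayed_work(&klp_transition_work, HZ);
                        return;
                }

                klp_complete_transition();
        }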

Miroslav

2015-02-13 14:42:33

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

On Fri, Feb 13, 2015 at 03:22:15PM +0100, Jiri Kosina wrote:
> On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
>
> > > How about we take a slightly different aproach -- put a probe (or ftrace)
> > > on __switch_to() during a klp transition period, and examine stacktraces
> > > for tasks that are just about to start running from there?
> > >
> > > The only tasks that would not be covered by this would be purely CPU-bound
> > > tasks that never schedule. But we are likely in trouble with those anyway,
> > > because odds are that non-rescheduling CPU-bound tasks are also
> > > RT-priority tasks running on isolated CPUs, which we will fail to handle
> > > anyway.
> > >
> > > I think Masami used similar trick in his kpatch-without-stopmachine
> > > aproach.
> >
> > Yeah, that's definitely an option, though I'm really not too crazy about
> > it. Hooking into the scheduler is kind of scary and disruptive.
>
> This is basically about running a stack checking for ->next before
> switching to it, i.e. read-only operation (admittedly inducing some
> latency, but that's the same with locking the runqueue). And only when in
> transition phase.

Yes, but it would introduce much more latency than locking rq, since
there would be at least some added latency to every schedule() call
during the transition phase. Locking the rq would only add latency in
those cases where another CPU is trying to do a context switch while
we're holding the lock.

It also seems much more dangerous. A bug in __switch_to() could easily
do a lot of damage.

> > We'd also have to wake up all the sleeping processes.
>
> Yes, I don't think there is a way around that.

Actually this patch set is a way around that :-)

--
Josh

2015-02-13 14:46:18

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 2/9] livepatch: separate enabled and patched states

On Fri, 13 Feb 2015, Josh Poimboeuf wrote:

> On Fri, Feb 13, 2015 at 01:57:38PM +0100, Miroslav Benes wrote:
> > On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> >
> > > Once we have a consistency model, patches and their objects will be
> > > enabled and disabled at different times. For example, when a patch is
> > > disabled, its loaded objects' funcs can remain registered with ftrace
> > > indefinitely until the unpatching operation is complete and they're no
> > > longer in use.
> > >
> > > It's less confusing if we give them different names: patches can be
> > > enabled or disabled; objects (and their funcs) can be patched or
> > > unpatched:
> > >
> > > - Enabled means that a patch is logically enabled (but not necessarily
> > > fully applied).
> > >
> > > - Patched means that an object's funcs are registered with ftrace and
> > > added to the klp_ops func stack.
> > >
> > > Also, since these states are binary, represent them with boolean-type
> > > variables instead of enums.
> >
> > They are binary now but will it hold also in the future? I cannot come up
> > with any other possible state of the function right now, but that doesn't
> > mean there isn't any. It would be sad to return it back to enums one day
> > :)
>
> I really can't think of any reason why they would become non-binary.
> IMO it's more likely we could add more boolean variables, but if that
> got out of hand we could just switch to using bit flags.
>
> Either way I don't see a problem with changing them later if we need to.

Agreed.

> > Also would it be useful to expose patched variable for functions and
> > objects in sysfs?
>
> Not that I know of. Do you have a use case in mind? I view "patched"
> as an internal variable, corresponding to whether the object or its
> functions are registered with ftrace/klp_ops. It doesn't mean "patched"
> in a way that would really make sense to the user, because of the
> gradual nature of the patching process.

The only reasonable thing I could think of was the error case.
If something bad happens, it could be useful to know which state the functions
are in (patched/unpatched). Anyway, it is nothing of importance right now
and we can add it anytime later if we decide it is useful.

Miroslav

2015-02-13 14:55:45

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

On Fri, Feb 13, 2015 at 03:40:14PM +0100, Miroslav Benes wrote:
> On Fri, 13 Feb 2015, Jiri Kosina wrote:
>
> > On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> >
> > > > How about we take a slightly different aproach -- put a probe (or ftrace)
> > > > on __switch_to() during a klp transition period, and examine stacktraces
> > > > for tasks that are just about to start running from there?
> > > >
> > > > The only tasks that would not be covered by this would be purely CPU-bound
> > > > tasks that never schedule. But we are likely in trouble with those anyway,
> > > > because odds are that non-rescheduling CPU-bound tasks are also
> > > > RT-priority tasks running on isolated CPUs, which we will fail to handle
> > > > anyway.
> > > >
> > > > I think Masami used similar trick in his kpatch-without-stopmachine
> > > > aproach.
> > >
> > > Yeah, that's definitely an option, though I'm really not too crazy about
> > > it. Hooking into the scheduler is kind of scary and disruptive.
> >
> > This is basically about running a stack checking for ->next before
> > switching to it, i.e. read-only operation (admittedly inducing some
> > latency, but that's the same with locking the runqueue). And only when in
> > transition phase.
> >
> > > We'd also have to wake up all the sleeping processes.
> >
> > Yes, I don't think there is a way around that.
>
> I think there are two options how to do it if I understand you correctly.
>
> 1. we would put a probe on __switch_to and afterwards wake up all the
> sleeping processes.
>
> 2. we would do it in an asynchronous manner. We would put a probe and let
> the processes to wake themselves. The transition delayed workqueue
> would only check if there is some non-migrated process. Of course if
> some process sleeps for a long time it would take a long time to
> complete the patching. It would be up to the user to send a signal to
> the process to wake up.
>
> Does it make sense? If yes, I cannot decide which approach is better.

Option 2 wouldn't really work for kthreads because you can't signal them
to wake up from user space. And I really want to avoid having to leave
the system in a partially patched state for a long period of time.

But also option 1 wouldn't necessarily result in the system being
immediately patched, since you could have some CPU-bound tasks. So some
asynchronous patching is still needed.

--
Josh

2015-02-13 15:09:18

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 3/9] livepatch: move patching functions into patch.c

On Fri, Feb 13, 2015 at 03:28:28PM +0100, Miroslav Benes wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
>
> > Move functions related to the actual patching of functions and objects
> > into a new patch.c file.
>
> I am definitely for splitting the code to several different files.
> Otherwise it would be soon unmanageable. However I don't know if this
> patch is the best possible. Maybe it is just nitpicking so let's not spend
> too much time on this :)
>
> Without this patch there are several different groups of functions in
> core.c:
> 1. infrastructure such as global variables, klp_init and some helper
> functions
> 2. (un)registration and initialization of the patch
> 3. enable/disable with patching/unpatching, ftrace handler
> 4. sysfs code
> 5. module notifier
> 6. relocations
>
> I would move sysfs code away to separate file.

I'm not sure about moving the sysfs code to its own file, mainly because
of enabled_store():

1. It needs the klp_mutex. It's really nice and clean to keep the
klp_mutex a static variable in core.c (which I plan on doing in v2 of
the patch set).

2. It's one of the main entry points into the klp code, along with
register/unregister and enable/disable. It makes a lot of sense to
keep all of those entry points in the same file IMO.

> If we decide to move patching code I think it would make sense to move
> enable/disable functions along with it. Or perhaps __klp_enable_patch and
> __klp_disable_patch only. It is possible though that the result would be
> much worse.

I would vote to keep enable/disable in core.c for the same reasons as
stated above for enabled_store(). It's possible that
__klp_enable_patch() and __klp_disable_patch() could be moved elsewhere.
Personally I like them where they are, since they call into both
"transition" functions and "patch" functions.

So, big surprise, I agree with my own code splitting decisions ;-)

>
> Or we can move some other group of functions...
>
> [...]
>
> > diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> > new file mode 100644
> > index 0000000..bb34bd3
> > --- /dev/null
> > +++ b/kernel/livepatch/patch.h
> > @@ -0,0 +1,25 @@
> > +#include <linux/livepatch.h>
> > +
> > +/**
> > + * struct klp_ops - structure for tracking registered ftrace ops structs
> > + *
> > + * A single ftrace_ops is shared between all enabled replacement functions
> > + * (klp_func structs) which have the same old_addr. This allows the switch
> > + * between function versions to happen instantaneously by updating the klp_ops
> > + * struct's func_stack list. The winner is the klp_func at the top of the
> > + * func_stack (front of the list).
> > + *
> > + * @node: node for the global klp_ops list
> > + * @func_stack: list head for the stack of klp_func's (active func is on top)
> > + * @fops: registered ftrace ops struct
> > + */
> > +struct klp_ops {
> > + struct list_head node;
> > + struct list_head func_stack;
> > + struct ftrace_ops fops;
> > +};
> > +
> > +struct klp_ops *klp_find_ops(unsigned long old_addr);
> > +
> > +extern int klp_patch_object(struct klp_object *obj);
> > +extern void klp_unpatch_object(struct klp_object *obj);
>
> Is there a reason why klp_find_ops is not extern and the other two
> functions are? I think it is redundant and it is better to be consistent.

Good catch, thanks.

--
Josh

2015-02-13 16:04:50

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed

On Thu, Feb 12, 2015 at 04:22:24PM +0100, Miroslav Benes wrote:
> On Tue, 10 Feb 2015, Jiri Slaby wrote:
>
> > On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > > --- a/kernel/livepatch/core.c
> > > +++ b/kernel/livepatch/core.c
> > ...
> > > @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
> > >
> > > static void klp_kobj_release_patch(struct kobject *kobj)
> > > {
> > > - /*
> > > - * Once we have a consistency model we'll need to module_put() the
> > > - * patch module here. See klp_register_patch() for more details.
> > > - */
> >
> > I deliberately let you write the note in there :). What happens when I
> > leave some attribute in /sys open and you remove the module in the meantime?
>
> And if that attribute is <enabled> it can lead even to the deadlock. You
> can try it yourself with the patchset applied and lockdep on. Simple
> series of insmod, disable and rmmod of the patch.
>
> Just for the sake of completeness...

Hm, even with Jiri Slaby's suggested fix to add the completion to the
unregister path, I still get a lockdep warning. This looks more insidious,
related to the locking order of a kernfs lock and the klp lock. I'll need to
look at this some more...


[26244.952692] ======================================================
[26244.954469] [ INFO: possible circular locking dependency detected ]
[26244.954469] 3.19.0-rc1+ #99 Tainted: G W E K
[26244.954469] -------------------------------------------------------
[26244.954469] rmmod/1270 is trying to acquire lock:
[26244.954469] (s_active#70){++++.+}, at: [<ffffffff812fcb07>] kernfs_remove+0x27/0x40
[26244.954469]
[26244.954469] but task is already holding lock:
[26244.954469] (klp_mutex){+.+.+.}, at: [<ffffffff81130503>] klp_unregister_patch+0x23/0xc0
[26244.954469]
[26244.954469] which lock already depends on the new lock.
[26244.954469]
[26244.954469]
[26244.954469] the existing dependency chain (in reverse order) is:
[26244.954469]
-> #1 (klp_mutex){+.+.+.}:
[26244.954469] [<ffffffff8110cfff>] lock_acquire+0xcf/0x2a0
[26244.954469] [<ffffffff8184ea5d>] mutex_lock_nested+0x7d/0x430
[26244.954469] [<ffffffff811303cf>] enabled_store+0x5f/0xf0
[26244.954469] [<ffffffff8141b98f>] kobj_attr_store+0xf/0x20
[26244.954469] [<ffffffff812fe759>] sysfs_kf_write+0x49/0x60
[26244.954469] [<ffffffff812fe050>] kernfs_fop_write+0x140/0x1a0
[26244.954469] [<ffffffff8126fb1a>] vfs_write+0xba/0x200
[26244.954469] [<ffffffff8127080c>] SyS_write+0x5c/0xd0
[26244.954469] [<ffffffff818541a9>] system_call_fastpath+0x12/0x17
[26244.954469]
-> #0 (s_active#70){++++.+}:
[26244.954469] [<ffffffff8110c5de>] __lock_acquire+0x1c5e/0x1de0
[26244.954469] [<ffffffff8110cfff>] lock_acquire+0xcf/0x2a0
[26244.954469] [<ffffffff812fbacb>] __kernfs_remove+0x27b/0x390
[26244.954469] [<ffffffff812fcb07>] kernfs_remove+0x27/0x40
[26244.954469] [<ffffffff812ff041>] sysfs_remove_dir+0x51/0x90
[26244.954469] [<ffffffff8141bbc8>] kobject_del+0x18/0x50
[26244.954469] [<ffffffff8141bc5a>] kobject_release+0x5a/0x1c0
[26244.954469] [<ffffffff8141bb25>] kobject_put+0x35/0x70
[26244.954469] [<ffffffff8113056a>] klp_unregister_patch+0x8a/0xc0
[26244.954469] [<ffffffffa034d0c5>] livepatch_exit+0x25/0xf60 [livepatch_sample]
[26244.954469] [<ffffffff81155ddf>] SyS_delete_module+0x1cf/0x280
[26244.954469] [<ffffffff818541a9>] system_call_fastpath+0x12/0x17
[26244.954469]
[26244.954469] other info that might help us debug this:
[26244.954469]
[26244.954469] Possible unsafe locking scenario:
[26244.954469]
[26244.954469] CPU0 CPU1
[26244.954469] ---- ----
[26244.954469] lock(klp_mutex);
[26244.954469] lock(s_active#70);
[26244.954469] lock(klp_mutex);
[26244.954469] lock(s_active#70);
[26244.954469]
[26244.954469] *** DEADLOCK ***
[26244.954469]
[26244.954469] 1 lock held by rmmod/1270:
[26244.954469] #0: (klp_mutex){+.+.+.}, at: [<ffffffff81130503>] klp_unregister_patch+0x23/0xc0
[26244.954469]
[26244.954469] stack backtrace:
[26244.954469] CPU: 1 PID: 1270 Comm: rmmod Tainted: G W E K 3.19.0-rc1+ #99
[26244.954469] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
[26244.954469] 0000000000000000 000000001f4deaad ffff880079877bf8 ffffffff81849fd2
[26244.954469] 0000000000000000 ffffffff82aea9c0 ffff880079877c48 ffffffff8184710b
[26244.954469] 00000000001d6640 ffff880079877ca8 ffff8800788525c0 ffff880078852e90
[26244.954469] Call Trace:
[26244.954469] [<ffffffff81849fd2>] dump_stack+0x4c/0x65
[26244.954469] [<ffffffff8184710b>] print_circular_bug+0x202/0x213
[26244.954469] [<ffffffff8110c5de>] __lock_acquire+0x1c5e/0x1de0
[26244.954469] [<ffffffff81247b3d>] ? __slab_free+0xbd/0x390
[26244.954469] [<ffffffff810e8765>] ? sched_clock_local+0x25/0x90
[26244.954469] [<ffffffff8110cfff>] lock_acquire+0xcf/0x2a0
[26244.954469] [<ffffffff812fcb07>] ? kernfs_remove+0x27/0x40
[26244.954469] [<ffffffff812fbacb>] __kernfs_remove+0x27b/0x390
[26244.954469] [<ffffffff812fcb07>] ? kernfs_remove+0x27/0x40
[26244.954469] [<ffffffff811071cf>] ? lock_release_holdtime.part.29+0xf/0x200
[26244.954469] [<ffffffff812fcb07>] kernfs_remove+0x27/0x40
[26244.954469] [<ffffffff812ff041>] sysfs_remove_dir+0x51/0x90
[26244.954469] [<ffffffff8141bbc8>] kobject_del+0x18/0x50
[26244.954469] [<ffffffff8141bc5a>] kobject_release+0x5a/0x1c0
[26244.954469] [<ffffffff8141bb25>] kobject_put+0x35/0x70
[26244.954469] [<ffffffff8113056a>] klp_unregister_patch+0x8a/0xc0
[26244.954469] [<ffffffffa034d0c5>] livepatch_exit+0x25/0xf60 [livepatch_sample]
[26244.954469] [<ffffffff81155ddf>] SyS_delete_module+0x1cf/0x280
[26244.954469] [<ffffffff81428a9b>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[26244.954469] [<ffffffff818541a9>] system_call_fastpath+0x12/0x17


To recreate:

insmod livepatch-sample.ko

# wait for patching to complete

~/a.out & <-- simple program which opens the "enabled" file in the background

echo 0 >/sys/kernel/livepatch/livepatch_sample/enabled

# wait for unpatch to complete

rmmod livepatch-sample.ko

--
Josh

2015-02-13 16:17:19

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed

On Fri, 13 Feb 2015, Josh Poimboeuf wrote:

> On Thu, Feb 12, 2015 at 04:22:24PM +0100, Miroslav Benes wrote:
> > On Tue, 10 Feb 2015, Jiri Slaby wrote:
> >
> > > On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > > > --- a/kernel/livepatch/core.c
> > > > +++ b/kernel/livepatch/core.c
> > > ...
> > > > @@ -497,10 +500,6 @@ static struct attribute *klp_patch_attrs[] = {
> > > >
> > > > static void klp_kobj_release_patch(struct kobject *kobj)
> > > > {
> > > > - /*
> > > > - * Once we have a consistency model we'll need to module_put() the
> > > > - * patch module here. See klp_register_patch() for more details.
> > > > - */
> > >
> > > I deliberately let you write the note in there :). What happens when I
> > > leave some attribute in /sys open and you remove the module in the meantime?
> >
> > And if that attribute is <enabled> it can lead even to the deadlock. You
> > can try it yourself with the patchset applied and lockdep on. Simple
> > series of insmod, disable and rmmod of the patch.
> >
> > Just for the sake of completeness...
>
> Hm, even with Jiri Slaby's suggested fix to add the completion to the
> unregister path, I still get a lockdep warning. This looks more insidious,
> related to the locking order of a kernfs lock and the klp lock. I'll need to
> look at this some more...

Yes, I was afraid of this. The lockdep warning is a separate bug. It is caused
by taking klp_mutex in enabled_store(). During rmmod, klp_unregister_patch()
takes klp_mutex and destroys the sysfs structure. If somebody writes to
<enabled> just after unregister takes the mutex and before the sysfs
removal, that would cause a deadlock, because enabled_store() takes the
"sysfs lock" (kernfs's s_active) and then klp_mutex. That is exactly what
lockdep tells us below.

We can look for inspiration elsewhere. Grepping for s_active through the
mainline git log turns up several commits which dealt with exactly this
problem. I will browse through those...

> [26244.952692] ======================================================
> [26244.954469] [ INFO: possible circular locking dependency detected ]
> [26244.954469] 3.19.0-rc1+ #99 Tainted: G W E K
> [26244.954469] -------------------------------------------------------
> [26244.954469] rmmod/1270 is trying to acquire lock:
> [26244.954469] (s_active#70){++++.+}, at: [<ffffffff812fcb07>] kernfs_remove+0x27/0x40
> [26244.954469]
> [26244.954469] but task is already holding lock:
> [26244.954469] (klp_mutex){+.+.+.}, at: [<ffffffff81130503>] klp_unregister_patch+0x23/0xc0
> [26244.954469]
> [26244.954469] which lock already depends on the new lock.
> [26244.954469]
> [26244.954469]
> [26244.954469] the existing dependency chain (in reverse order) is:
> [26244.954469]
> -> #1 (klp_mutex){+.+.+.}:
> [26244.954469] [<ffffffff8110cfff>] lock_acquire+0xcf/0x2a0
> [26244.954469] [<ffffffff8184ea5d>] mutex_lock_nested+0x7d/0x430
> [26244.954469] [<ffffffff811303cf>] enabled_store+0x5f/0xf0
> [26244.954469] [<ffffffff8141b98f>] kobj_attr_store+0xf/0x20
> [26244.954469] [<ffffffff812fe759>] sysfs_kf_write+0x49/0x60
> [26244.954469] [<ffffffff812fe050>] kernfs_fop_write+0x140/0x1a0
> [26244.954469] [<ffffffff8126fb1a>] vfs_write+0xba/0x200
> [26244.954469] [<ffffffff8127080c>] SyS_write+0x5c/0xd0
> [26244.954469] [<ffffffff818541a9>] system_call_fastpath+0x12/0x17
> [26244.954469]
> -> #0 (s_active#70){++++.+}:
> [26244.954469] [<ffffffff8110c5de>] __lock_acquire+0x1c5e/0x1de0
> [26244.954469] [<ffffffff8110cfff>] lock_acquire+0xcf/0x2a0
> [26244.954469] [<ffffffff812fbacb>] __kernfs_remove+0x27b/0x390
> [26244.954469] [<ffffffff812fcb07>] kernfs_remove+0x27/0x40
> [26244.954469] [<ffffffff812ff041>] sysfs_remove_dir+0x51/0x90
> [26244.954469] [<ffffffff8141bbc8>] kobject_del+0x18/0x50
> [26244.954469] [<ffffffff8141bc5a>] kobject_release+0x5a/0x1c0
> [26244.954469] [<ffffffff8141bb25>] kobject_put+0x35/0x70
> [26244.954469] [<ffffffff8113056a>] klp_unregister_patch+0x8a/0xc0
> [26244.954469] [<ffffffffa034d0c5>] livepatch_exit+0x25/0xf60 [livepatch_sample]
> [26244.954469] [<ffffffff81155ddf>] SyS_delete_module+0x1cf/0x280
> [26244.954469] [<ffffffff818541a9>] system_call_fastpath+0x12/0x17
> [26244.954469]
> [26244.954469] other info that might help us debug this:
> [26244.954469]
> [26244.954469] Possible unsafe locking scenario:
> [26244.954469]
> [26244.954469] CPU0 CPU1
> [26244.954469] ---- ----
> [26244.954469] lock(klp_mutex);
> [26244.954469] lock(s_active#70);
> [26244.954469] lock(klp_mutex);
> [26244.954469] lock(s_active#70);
> [26244.954469]
> [26244.954469] *** DEADLOCK ***
> [26244.954469]
> [26244.954469] 1 lock held by rmmod/1270:
> [26244.954469] #0: (klp_mutex){+.+.+.}, at: [<ffffffff81130503>] klp_unregister_patch+0x23/0xc0
> [26244.954469]
> [26244.954469] stack backtrace:
> [26244.954469] CPU: 1 PID: 1270 Comm: rmmod Tainted: G W E K 3.19.0-rc1+ #99
> [26244.954469] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
> [26244.954469] 0000000000000000 000000001f4deaad ffff880079877bf8 ffffffff81849fd2
> [26244.954469] 0000000000000000 ffffffff82aea9c0 ffff880079877c48 ffffffff8184710b
> [26244.954469] 00000000001d6640 ffff880079877ca8 ffff8800788525c0 ffff880078852e90
> [26244.954469] Call Trace:
> [26244.954469] [<ffffffff81849fd2>] dump_stack+0x4c/0x65
> [26244.954469] [<ffffffff8184710b>] print_circular_bug+0x202/0x213
> [26244.954469] [<ffffffff8110c5de>] __lock_acquire+0x1c5e/0x1de0
> [26244.954469] [<ffffffff81247b3d>] ? __slab_free+0xbd/0x390
> [26244.954469] [<ffffffff810e8765>] ? sched_clock_local+0x25/0x90
> [26244.954469] [<ffffffff8110cfff>] lock_acquire+0xcf/0x2a0
> [26244.954469] [<ffffffff812fcb07>] ? kernfs_remove+0x27/0x40
> [26244.954469] [<ffffffff812fbacb>] __kernfs_remove+0x27b/0x390
> [26244.954469] [<ffffffff812fcb07>] ? kernfs_remove+0x27/0x40
> [26244.954469] [<ffffffff811071cf>] ? lock_release_holdtime.part.29+0xf/0x200
> [26244.954469] [<ffffffff812fcb07>] kernfs_remove+0x27/0x40
> [26244.954469] [<ffffffff812ff041>] sysfs_remove_dir+0x51/0x90
> [26244.954469] [<ffffffff8141bbc8>] kobject_del+0x18/0x50
> [26244.954469] [<ffffffff8141bc5a>] kobject_release+0x5a/0x1c0
> [26244.954469] [<ffffffff8141bb25>] kobject_put+0x35/0x70
> [26244.954469] [<ffffffff8113056a>] klp_unregister_patch+0x8a/0xc0
> [26244.954469] [<ffffffffa034d0c5>] livepatch_exit+0x25/0xf60 [livepatch_sample]
> [26244.954469] [<ffffffff81155ddf>] SyS_delete_module+0x1cf/0x280
> [26244.954469] [<ffffffff81428a9b>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [26244.954469] [<ffffffff818541a9>] system_call_fastpath+0x12/0x17
>
>
> To recreate:
>
> insmod livepatch-sample.ko
>
> # wait for patching to complete
>
> ~/a.out & <-- simple program which opens the "enabled" file in the background

I didn't even need such a program. Lockdep warned me with just insmod,
echo and rmmod. It is magically clever.

Miroslav

> echo 0 >/sys/kernel/livepatch/livepatch_sample/enabled
>
> # wait for unpatch to complete
>
> rmmod livepatch-sample.ko

2015-02-13 21:00:57

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed

On Fri, Feb 13, 2015 at 05:17:10PM +0100, Miroslav Benes wrote:
> On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> > Hm, even with Jiri Slaby's suggested fix to add the completion to the
> > unregister path, I still get a lockdep warning. This looks more insidious,
> > related to the locking order of a kernfs lock and the klp lock. I'll need to
> > look at this some more...
>
> Yes, I was afraid of this. Lockdep warning is a separate bug. It is caused
> by taking klp_mutex in enabled_store. During rmmod klp_unregister_patch
> takes klp_mutex and destroys the sysfs structure. If somebody writes to
> enabled just after unregister takes the mutex and before the sysfs
> removal, he would cause the deadlock, because enabled_store takes the
> "sysfs lock" and then klp_mutex. That is exactly what the lockdep tells us
> below.
>
> We can look for inspiration elsewhere. Grep for s_active through git log
> of the mainline offers several commits which dealt exactly with this. Will
> browse through that...

Thanks Miroslav, please let me know what you find. It wouldn't surprise
me if this were a very common problem.

One option would be to move the enabled_store() work out to a workqueue
or something.
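
For example, something along these lines -- a hypothetical sketch only
(assuming it lives in core.c next to klp_mutex, __klp_enable_patch() and
__klp_disable_patch()); it avoids taking klp_mutex under the kernfs active
reference, but glosses over error reporting and the patch-lifetime problem
this thread is about:

        struct klp_enable_work {
                struct work_struct work;
                struct klp_patch *patch;
                bool enable;
        };

        static void klp_enabled_work_fn(struct work_struct *work)
        {
                struct klp_enable_work *ew =
                        container_of(work, struct klp_enable_work, work);

                /* klp_mutex is now taken without kernfs's s_active held. */
                mutex_lock(&klp_mutex);
                if (ew->enable)
                        __klp_enable_patch(ew->patch); /* error handling omitted */
                else
                        __klp_disable_patch(ew->patch);
                mutex_unlock(&klp_mutex);

                kfree(ew);
        }

        static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
                                     const char *buf, size_t count)
        {
                struct klp_patch *patch = container_of(kobj, struct klp_patch, kobj);
                struct klp_enable_work *ew;
                unsigned long val;

                if (kstrtoul(buf, 10, &val) || val > 1)
                        return -EINVAL;

                ew = kzalloc(sizeof(*ew), GFP_KERNEL);
                if (!ew)
                        return -ENOMEM;

                INIT_WORK(&ew->work, klp_enabled_work_fn);
                ew->patch = patch;
                ew->enable = val;
                schedule_work(&ew->work);       /* klp_mutex taken in the work fn */

                return count;
        }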

> >
> > To recreate:
> >
> > insmod livepatch-sample.ko
> >
> > # wait for patching to complete
> >
> > ~/a.out & <-- simple program which opens the "enabled" file in the background
>
> I didn't even need such a program. Lockdep warned me with sole insmod,
> echo and rmmod. It is magically clever.

Ah, even easier... lockdep is awesome.

--
Josh

2015-02-14 11:40:10

by Jiri Slaby

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> Add a basic per-task consistency model. This is the foundation which
> will eventually enable us to patch those ~10% of security patches which
> change function prototypes and/or data semantics.
>
> When a patch is enabled, livepatch enters into a transition state where
> tasks are converging from the old universe to the new universe. If a
> given task isn't using any of the patched functions, it's switched to
> the new universe. Once all the tasks have been converged to the new
> universe, patching is complete.
>
> The same sequence occurs when a patch is disabled, except the tasks
> converge from the new universe to the old universe.
>
> The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> is in transition. Only a single patch (the topmost patch on the stack)
> can be in transition at a given time. A patch can remain in the
> transition state indefinitely, if any of the tasks are stuck in the
> previous universe.
>
> A transition can be reversed and effectively canceled by writing the
> opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> the transition is in progress. Then all the tasks will attempt to
> converge back to the original universe.
>
> Signed-off-by: Josh Poimboeuf <[email protected]>
> ---
> include/linux/livepatch.h | 18 ++-
> include/linux/sched.h | 3 +
> kernel/fork.c | 2 +
> kernel/livepatch/Makefile | 2 +-
> kernel/livepatch/core.c | 71 ++++++----
> kernel/livepatch/patch.c | 34 ++++-
> kernel/livepatch/patch.h | 1 +
> kernel/livepatch/transition.c | 300 ++++++++++++++++++++++++++++++++++++++++++
> kernel/livepatch/transition.h | 16 +++
> kernel/sched/core.c | 2 +
> 10 files changed, 423 insertions(+), 26 deletions(-)
> create mode 100644 kernel/livepatch/transition.c
> create mode 100644 kernel/livepatch/transition.h
>
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 0e65b4d..b8c2f15 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -40,6 +40,7 @@
> * @old_size: size of the old function
> * @new_size: size of the new function
> * @patched: the func has been added to the klp_ops list
> + * @transition: the func is currently being applied or reverted
> */
> struct klp_func {
> /* external */
> @@ -60,6 +61,7 @@ struct klp_func {
> struct list_head stack_node;
> unsigned long old_size, new_size;
> int patched;
> + int transition;
> };
>
> /**
> @@ -128,6 +130,20 @@ extern int klp_unregister_patch(struct klp_patch *);
> extern int klp_enable_patch(struct klp_patch *);
> extern int klp_disable_patch(struct klp_patch *);
>
> -#endif /* CONFIG_LIVEPATCH */
> +extern int klp_universe_goal;
> +
> +static inline void klp_update_task_universe(struct task_struct *t)
> +{
> + /* corresponding smp_wmb() is in klp_set_universe_goal() */
> + smp_rmb();
> +
> + t->klp_universe = klp_universe_goal;
> +}
> +
> +#else /* !CONFIG_LIVEPATCH */
> +
> +static inline void klp_update_task_universe(struct task_struct *t) {}
> +
> +#endif /* !CONFIG_LIVEPATCH */
>
> #endif /* _LINUX_LIVEPATCH_H_ */
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 8db31ef..a95e59a 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1701,6 +1701,9 @@ struct task_struct {
> #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
> unsigned long task_state_change;
> #endif
> +#ifdef CONFIG_LIVEPATCH
> + int klp_universe;
> +#endif
> };
>
> /* Future-safe accessor for struct task_struct's cpus_allowed. */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 4dc2dda..1dcbebe 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -74,6 +74,7 @@
> #include <linux/uprobes.h>
> #include <linux/aio.h>
> #include <linux/compiler.h>
> +#include <linux/livepatch.h>
>
> #include <asm/pgtable.h>
> #include <asm/pgalloc.h>
> @@ -1538,6 +1539,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> total_forks++;
> spin_unlock(&current->sighand->siglock);
> syscall_tracepoint_update(p);
> + klp_update_task_universe(p);
> write_unlock_irq(&tasklist_lock);
>
> proc_fork_connector(p);
> diff --git a/kernel/livepatch/Makefile b/kernel/livepatch/Makefile
> index e136dad..2b8bdb1 100644
> --- a/kernel/livepatch/Makefile
> +++ b/kernel/livepatch/Makefile
> @@ -1,3 +1,3 @@
> obj-$(CONFIG_LIVEPATCH) += livepatch.o
>
> -livepatch-objs := core.o patch.o
> +livepatch-objs := core.o patch.o transition.o
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 85d4ef7..790dc10 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -28,14 +28,17 @@
> #include <linux/kallsyms.h>
>
> #include "patch.h"
> +#include "transition.h"
>
> /*
> - * The klp_mutex protects the global lists and state transitions of any
> - * structure reachable from them. References to any structure must be obtained
> - * under mutex protection (except in klp_ftrace_handler(), which uses RCU to
> - * ensure it gets consistent data).
> + * The klp_mutex is a coarse lock which serializes access to klp data. All
> + * accesses to klp-related variables and structures must have mutex protection,
> + * except within the following functions which carefully avoid the need for it:
> + *
> + * - klp_ftrace_handler()
> + * - klp_update_task_universe()
> */
> -static DEFINE_MUTEX(klp_mutex);
> +DEFINE_MUTEX(klp_mutex);
>
> static LIST_HEAD(klp_patches);
>
> @@ -67,7 +70,6 @@ static void klp_find_object_module(struct klp_object *obj)
> mutex_unlock(&module_mutex);
> }
>
> -/* klp_mutex must be held by caller */
> static bool klp_is_patch_registered(struct klp_patch *patch)
> {
> struct klp_patch *mypatch;
> @@ -285,18 +287,17 @@ static int klp_write_object_relocations(struct module *pmod,
>
> static int __klp_disable_patch(struct klp_patch *patch)
> {
> - struct klp_object *obj;
> + if (klp_transition_patch)
> + return -EBUSY;
>
> /* enforce stacking: only the last enabled patch can be disabled */
> if (!list_is_last(&patch->list, &klp_patches) &&
> list_next_entry(patch, list)->enabled)
> return -EBUSY;
>
> - pr_notice("disabling patch '%s'\n", patch->mod->name);
> -
> - for (obj = patch->objs; obj->funcs; obj++)
> - if (obj->patched)
> - klp_unpatch_object(obj);
> + klp_init_transition(patch, KLP_UNIVERSE_NEW);
> + klp_start_transition(KLP_UNIVERSE_OLD);
> + klp_try_complete_transition();
>
> patch->enabled = 0;
>
> @@ -340,6 +341,9 @@ static int __klp_enable_patch(struct klp_patch *patch)
> struct klp_object *obj;
> int ret;
>
> + if (klp_transition_patch)
> + return -EBUSY;
> +
> if (WARN_ON(patch->enabled))
> return -EINVAL;
>
> @@ -351,7 +355,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
> pr_notice_once("tainting kernel with TAINT_LIVEPATCH\n");
> add_taint(TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
>
> - pr_notice("enabling patch '%s'\n", patch->mod->name);
> + klp_init_transition(patch, KLP_UNIVERSE_OLD);
>
> for (obj = patch->objs; obj->funcs; obj++) {
> klp_find_object_module(obj);
> @@ -360,17 +364,24 @@ static int __klp_enable_patch(struct klp_patch *patch)
> continue;
>
> ret = klp_patch_object(obj);
> - if (ret)
> - goto unregister;
> + if (ret) {
> + pr_warn("failed to enable patch '%s'\n",
> + patch->mod->name);
> +
> + klp_unpatch_objects(patch);
> + klp_complete_transition();
> +
> + return ret;
> + }
> }
>
> + klp_start_transition(KLP_UNIVERSE_NEW);
> +
> + klp_try_complete_transition();
> +
> patch->enabled = 1;
>
> return 0;
> -
> -unregister:
> - WARN_ON(__klp_disable_patch(patch));
> - return ret;
> }
>
> /**
> @@ -407,6 +418,7 @@ EXPORT_SYMBOL_GPL(klp_enable_patch);
> * /sys/kernel/livepatch
> * /sys/kernel/livepatch/<patch>
> * /sys/kernel/livepatch/<patch>/enabled
> + * /sys/kernel/livepatch/<patch>/transition
> * /sys/kernel/livepatch/<patch>/<object>
> * /sys/kernel/livepatch/<patch>/<object>/<func>
> */
> @@ -435,7 +447,9 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
> goto err;
> }
>
> - if (val) {
> + if (klp_transition_patch == patch) {
> + klp_reverse_transition();
> + } else if (val) {
> ret = __klp_enable_patch(patch);
> if (ret)
> goto err;
> @@ -463,9 +477,21 @@ static ssize_t enabled_show(struct kobject *kobj,
> return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->enabled);
> }
>
> +static ssize_t transition_show(struct kobject *kobj,
> + struct kobj_attribute *attr, char *buf)
> +{
> + struct klp_patch *patch;
> +
> + patch = container_of(kobj, struct klp_patch, kobj);
> + return snprintf(buf, PAGE_SIZE-1, "%d\n",
> + klp_transition_patch == patch);
> +}
> +
> static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
> +static struct kobj_attribute transition_kobj_attr = __ATTR_RO(transition);
> static struct attribute *klp_patch_attrs[] = {
> &enabled_kobj_attr.attr,
> + &transition_kobj_attr.attr,
> NULL
> };
>
> @@ -543,6 +569,7 @@ static int klp_init_func(struct klp_object *obj, struct klp_func *func)
> {
> INIT_LIST_HEAD(&func->stack_node);
> func->patched = 0;
> + func->transition = 0;
>
> return kobject_init_and_add(&func->kobj, &klp_ktype_func,
> obj->kobj, func->old_name);
> @@ -725,7 +752,7 @@ static void klp_module_notify_coming(struct klp_patch *patch,
> if (ret)
> goto err;
>
> - if (!patch->enabled)
> + if (!patch->enabled && klp_transition_patch != patch)
> return;
>
> pr_notice("applying patch '%s' to loading module '%s'\n",
> @@ -746,7 +773,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
> struct module *pmod = patch->mod;
> struct module *mod = obj->mod;
>
> - if (!patch->enabled)
> + if (!patch->enabled && klp_transition_patch != patch)
> goto free;
>
> pr_notice("reverting patch '%s' on unloading module '%s'\n",
> diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> index 281fbca..f12256b 100644
> --- a/kernel/livepatch/patch.c
> +++ b/kernel/livepatch/patch.c
> @@ -24,6 +24,7 @@
> #include <linux/slab.h>
>
> #include "patch.h"
> +#include "transition.h"
>
> static LIST_HEAD(klp_ops);
>
> @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> ops = container_of(fops, struct klp_ops, fops);
>
> rcu_read_lock();
> +
> func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> stack_node);
> - rcu_read_unlock();
>
> if (WARN_ON_ONCE(!func))
> - return;
> + goto unlock;
> +
> + if (unlikely(func->transition)) {
> + /* corresponding smp_wmb() is in klp_init_transition() */
> + smp_rmb();
> +
> + if (current->klp_universe == KLP_UNIVERSE_OLD) {
> + /*
> + * Use the previously patched version of the function.
> + * If no previous patches exist, use the original
> + * function.
> + */
> + func = list_entry_rcu(func->stack_node.next,
> + struct klp_func, stack_node);
> +
> + if (&func->stack_node == &ops->func_stack)
> + goto unlock;
> + }
> + }
>
> klp_arch_set_pc(regs, (unsigned long)func->new_func);
> +unlock:
> + rcu_read_unlock();
> }
>
> struct klp_ops *klp_find_ops(unsigned long old_addr)
> @@ -174,3 +195,12 @@ int klp_patch_object(struct klp_object *obj)
>
> return 0;
> }
> +
> +void klp_unpatch_objects(struct klp_patch *patch)
> +{
> + struct klp_object *obj;
> +
> + for (obj = patch->objs; obj->funcs; obj++)
> + if (obj->patched)
> + klp_unpatch_object(obj);
> +}
> diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> index bb34bd3..1648259 100644
> --- a/kernel/livepatch/patch.h
> +++ b/kernel/livepatch/patch.h
> @@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);
>
> extern int klp_patch_object(struct klp_object *obj);
> extern void klp_unpatch_object(struct klp_object *obj);
> +extern void klp_unpatch_objects(struct klp_patch *patch);
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> new file mode 100644
> index 0000000..2630296
> --- /dev/null
> +++ b/kernel/livepatch/transition.c
> @@ -0,0 +1,300 @@
> +/*
> + * transition.c - Kernel Live Patching transition functions
> + *
> + * Copyright (C) 2015 Josh Poimboeuf <[email protected]>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/cpu.h>
> +#include <asm/stacktrace.h>
> +#include "../sched/sched.h"
> +
> +#include "patch.h"
> +#include "transition.h"
> +
> +static void klp_transition_work_fn(struct work_struct *);
> +static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);
> +
> +struct klp_patch *klp_transition_patch;
> +
> +int klp_universe_goal = KLP_UNIVERSE_UNDEFINED;
> +
> +static void klp_set_universe_goal(int universe)
> +{
> + klp_universe_goal = universe;
> +
> + /* corresponding smp_rmb() is in klp_update_task_universe() */
> + smp_wmb();
> +}
> +
> +/*
> + * The transition to the universe goal is complete. Clean up the data
> + * structures.
> + */
> +void klp_complete_transition(void)
> +{
> + struct klp_object *obj;
> + struct klp_func *func;
> +
> + for (obj = klp_transition_patch->objs; obj->funcs; obj++)
> + for (func = obj->funcs; func->old_name; func++)
> + func->transition = 0;
> +
> + klp_transition_patch = NULL;
> +}
> +
> +static int klp_stacktrace_address_verify_func(struct klp_func *func,
> + unsigned long address)
> +{
> + unsigned long func_addr, func_size;
> +
> + if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> + /* check the to-be-unpatched function (the func itself) */
> + func_addr = (unsigned long)func->new_func;
> + func_size = func->new_size;
> + } else {
> + /* check the to-be-patched function (previous func) */
> + struct klp_ops *ops;
> +
> + ops = klp_find_ops(func->old_addr);
> +
> + if (list_is_singular(&ops->func_stack)) {
> + /* original function */
> + func_addr = func->old_addr;
> + func_size = func->old_size;
> + } else {
> + /* previously patched function */
> + struct klp_func *prev;
> +
> + prev = list_next_entry(func, stack_node);
> + func_addr = (unsigned long)prev->new_func;
> + func_size = prev->new_size;
> + }
> + }
> +
> + if (address >= func_addr && address < func_addr + func_size)
> + return -1;
> +
> + return 0;
> +}
> +
> +/*
> + * Determine whether the given return address on the stack is within a
> + * to-be-patched or to-be-unpatched function.
> + */
> +static void klp_stacktrace_address_verify(void *data, unsigned long address,
> + int reliable)
> +{
> + struct klp_object *obj;
> + struct klp_func *func;
> + int *ret = data;
> +
> + if (*ret)
> + return;
> +
> + for (obj = klp_transition_patch->objs; obj->funcs; obj++) {
> + if (!obj->patched)
> + continue;
> + for (func = obj->funcs; func->old_name; func++) {
> + if (klp_stacktrace_address_verify_func(func, address)) {
> + *ret = -1;
> + return;
> + }
> + }
> + }
> +}
> +
> +static int klp_stacktrace_stack(void *data, char *name)
> +{
> + return 0;
> +}
> +
> +static const struct stacktrace_ops klp_stacktrace_ops = {
> + .address = klp_stacktrace_address_verify,
> + .stack = klp_stacktrace_stack,
> + .walk_stack = print_context_stack_bp,
> +};
> +
> +/*
> + * Try to safely transition a task to the universe goal. If the task is
> + * currently running or is sleeping on a to-be-patched or to-be-unpatched
> + * function, return false.
> + */
> +static bool klp_transition_task(struct task_struct *t)
> +{
> + struct rq *rq;
> + unsigned long flags;
> + int ret;
> + bool success = false;
> +
> + if (t->klp_universe == klp_universe_goal)
> + return true;
> +
> + rq = task_rq_lock(t, &flags);
> +
> + if (task_running(rq, t) && t != current) {
> + pr_debug("%s: pid %d (%s) is running\n", __func__, t->pid,
> + t->comm);
> + goto done;
> + }
> +
> + ret = 0;
> + dump_trace(t, NULL, NULL, 0, &klp_stacktrace_ops, &ret);
> + if (ret) {
> + pr_debug("%s: pid %d (%s) is sleeping on a patched function\n",
> + __func__, t->pid, t->comm);
> + goto done;
> + }
> +
> + klp_update_task_universe(t);
> +
> + success = true;
> +done:
> + task_rq_unlock(rq, t, &flags);
> + return success;
> +}
> +
> +/*
> + * Try to transition all tasks to the universe goal. If any tasks are still
> + * stuck in the original universe, schedule a retry.
> + */
> +void klp_try_complete_transition(void)
> +{
> + unsigned int cpu;
> + struct task_struct *g, *t;
> + bool complete = true;
> +
> + /* try to transition all normal tasks */
> + read_lock(&tasklist_lock);
> + for_each_process_thread(g, t)
> + if (!klp_transition_task(t))
> + complete = false;
> + read_unlock(&tasklist_lock);
> +
> + /* try to transition the idle "swapper" tasks */
> + get_online_cpus();
> + for_each_online_cpu(cpu)
> + if (!klp_transition_task(idle_task(cpu)))
> + complete = false;
> + put_online_cpus();
> +
> + /* if not complete, try again later */
> + if (!complete) {
> + schedule_delayed_work(&klp_transition_work,
> + round_jiffies_relative(HZ));
> + return;
> + }
> +
> + /* success! unpatch obsolete functions and do some cleanup */
> +
> + if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> + klp_unpatch_objects(klp_transition_patch);
> +
> + /* prevent ftrace handler from reading old func->transition */
> + synchronize_rcu();
> + }
> +
> + pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> + klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> + "unpatching");
> +
> + klp_complete_transition();
> +}
> +
> +static void klp_transition_work_fn(struct work_struct *work)
> +{
> + mutex_lock(&klp_mutex);
> +
> + if (klp_transition_patch)
> + klp_try_complete_transition();
> +
> + mutex_unlock(&klp_mutex);
> +}
> +
> +/*
> + * Start the transition to the specified universe so tasks can begin switching
> + * to it.
> + */
> +void klp_start_transition(int universe)
> +{
> + if (WARN_ON(klp_universe_goal == universe))
> + return;
> +
> + pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> + universe == KLP_UNIVERSE_NEW ? "patching" : "unpatching");
> +
> + klp_set_universe_goal(universe);
> +}
> +
> +/*
> + * Can be called in the middle of an existing transition to reverse the
> + * direction of the universe goal. This can be done to effectively cancel an
> + * existing enable or disable operation if there are any tasks which are stuck
> + * in the original universe.
> + */
> +void klp_reverse_transition(void)
> +{
> + struct klp_patch *patch = klp_transition_patch;
> +
> + klp_start_transition(!klp_universe_goal);
> + klp_try_complete_transition();
> +
> + patch->enabled = !patch->enabled;
> +}
> +
> +/*
> + * Reset the universe goal and all tasks to the starting universe, and set all
> + * func->transition's to 1 to prepare for patching.
> + */
> +void klp_init_transition(struct klp_patch *patch, int universe)
> +{
> + struct task_struct *g, *t;
> + unsigned int cpu;
> + struct klp_object *obj;
> + struct klp_func *func;
> +
> + klp_transition_patch = patch;
> +
> + /*
> + * If the previous transition was in the opposite direction, we may
> + * already be in the requested initial universe.
> + */
> + if (klp_universe_goal == universe)
> + goto init_funcs;
> +
> + klp_set_universe_goal(universe);
> +
> + /* init all normal task universes */
> + read_lock(&tasklist_lock);
> + for_each_process_thread(g, t)
> + klp_update_task_universe(t);
> + read_unlock(&tasklist_lock);
> +
> + /* init all idle "swapper" task universes */
> + get_online_cpus();
> + for_each_online_cpu(cpu)
> + klp_update_task_universe(idle_task(cpu));
> + put_online_cpus();
> +
> +init_funcs:
> + /* corresponding smp_rmb() is in klp_ftrace_handler() */
> + smp_wmb();
> +
> + for (obj = patch->objs; obj->funcs; obj++)
> + for (func = obj->funcs; func->old_name; func++)
> + func->transition = 1;

So I finally got to the review of this one. I have only two concerns:
1) it removes the ability for the user to use 'no consistency model'.
But you don't need to worry about this, I plan to implement this as soon
as you send v2 of these.

2) How is this 'transition = 1' store above guaranteed to reach other
CPUs before you start registering ftrace handlers? The CPUs need not see
the update when some handler is already invoked before start_transition
AFAICS.

thanks,
--
js
suse labs

2015-02-16 10:16:17

by Jiri Slaby

[permalink] [raw]
Subject: Re: [RFC PATCH 9/9] livepatch: update task universe when exiting kernel

On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> Update a task's universe when returning from a system call or user
> space interrupt, or after handling a signal.
>
> This greatly increases the chances of a patch operation succeeding. If
> a task is I/O bound, it can switch universes when returning from a
> system call. If a task is CPU bound, it can switch universes when
> returning from an interrupt. If a task is sleeping on a to-be-patched
> function, the user can send SIGSTOP and SIGCONT to force it to switch.
>
> Since the idle "swapper" tasks don't ever exit the kernel, they're
> updated from within the idle loop.
>
> Signed-off-by: Josh Poimboeuf <[email protected]>
> ---
> arch/x86/include/asm/thread_info.h | 4 +++-
> arch/x86/kernel/signal.c | 4 ++++
> include/linux/livepatch.h | 2 ++
> kernel/livepatch/transition.c | 15 +++++++++++++++
> kernel/sched/idle.c | 4 ++++
...
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -7,6 +7,7 @@
> #include <linux/tick.h>
> #include <linux/mm.h>
> #include <linux/stackprotector.h>
> +#include <linux/livepatch.h>
>
> #include <asm/tlb.h>
>
> @@ -250,6 +251,9 @@ static void cpu_idle_loop(void)
>
> sched_ttwu_pending();
> schedule_preempt_disabled();
> +
> + if (unlikely(test_thread_flag(TIF_KLP_NEED_UPDATE)))
> + klp_update_task_universe(current);

Oh, this is indeed broken on non-x86 archs as kbuild reports.
(TIF_KLP_NEED_UPDATE undefined)

We need a klp_maybe_update_task_universe inline or something like that
and define it void for non-LIVEPATCH configs.
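
Something like this, perhaps (just a rough sketch, the helper name is made
up):

#ifdef CONFIG_LIVEPATCH
static inline void klp_maybe_update_task_universe(void)
{
	if (unlikely(test_thread_flag(TIF_KLP_NEED_UPDATE)))
		klp_update_task_universe(current);
}
#else
static inline void klp_maybe_update_task_universe(void) {}
#endif

The idle loop (and the other call sites) could then call it unconditionally.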

regards,
--
js
suse labs

2015-02-16 14:19:16

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> Add a basic per-task consistency model. This is the foundation which
> will eventually enable us to patch those ~10% of security patches which
> change function prototypes and/or data semantics.
>
> When a patch is enabled, livepatch enters into a transition state where
> tasks are converging from the old universe to the new universe. If a
> given task isn't using any of the patched functions, it's switched to
> the new universe. Once all the tasks have been converged to the new
> universe, patching is complete.
>
> The same sequence occurs when a patch is disabled, except the tasks
> converge from the new universe to the old universe.
>
> The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> is in transition. Only a single patch (the topmost patch on the stack)
> can be in transition at a given time. A patch can remain in the
> transition state indefinitely, if any of the tasks are stuck in the
> previous universe.
>
> A transition can be reversed and effectively canceled by writing the
> opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> the transition is in progress. Then all the tasks will attempt to
> converge back to the original universe.

I finally managed to go through this patch and I have only a few comments
apart from what Jiri has already written...

I think it would be useful to add more comments throughout the code.

[...]

> /**
> @@ -407,6 +418,7 @@ EXPORT_SYMBOL_GPL(klp_enable_patch);
> * /sys/kernel/livepatch
> * /sys/kernel/livepatch/<patch>
> * /sys/kernel/livepatch/<patch>/enabled
> + * /sys/kernel/livepatch/<patch>/transition
> * /sys/kernel/livepatch/<patch>/<object>
> * /sys/kernel/livepatch/<patch>/<object>/<func>
> */
> @@ -435,7 +447,9 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
> goto err;
> }
>
> - if (val) {
> + if (klp_transition_patch == patch) {
> + klp_reverse_transition();
> + } else if (val) {
> ret = __klp_enable_patch(patch);
> if (ret)
> goto err;
> @@ -463,9 +477,21 @@ static ssize_t enabled_show(struct kobject *kobj,
> return snprintf(buf, PAGE_SIZE-1, "%d\n", patch->enabled);
> }
>
> +static ssize_t transition_show(struct kobject *kobj,
> + struct kobj_attribute *attr, char *buf)
> +{
> + struct klp_patch *patch;
> +
> + patch = container_of(kobj, struct klp_patch, kobj);
> + return snprintf(buf, PAGE_SIZE-1, "%d\n",
> + klp_transition_patch == patch);
> +}
> +
> static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
> +static struct kobj_attribute transition_kobj_attr = __ATTR_RO(transition);
> static struct attribute *klp_patch_attrs[] = {
> &enabled_kobj_attr.attr,
> + &transition_kobj_attr.attr,
> NULL
> };

sysfs documentation (Documentation/ABI/testing/sysfs-kernel-livepatch)
should be updated as well. Also the meaning of the enabled attribute was
changed a bit (by a different patch of the set though).
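
Something along these lines for the new attribute, perhaps (the wording is
just a suggestion):

What:		/sys/kernel/livepatch/<patch>/transition
Description:
		An attribute which indicates whether the patch is currently in
		transition, i.e. whether some tasks are still running in the
		previous universe.  Writing the opposite value to the enabled
		attribute while in transition reverses it.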

[...]

> +
> +void klp_unpatch_objects(struct klp_patch *patch)
> +{
> + struct klp_object *obj;
> +
> + for (obj = patch->objs; obj->funcs; obj++)
> + if (obj->patched)
> + klp_unpatch_object(obj);
> +}

Maybe we should introduce for_each_* macros which could be used in the
code and avoid such functions. I do not have strong opinion about it.

> diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> index bb34bd3..1648259 100644
> --- a/kernel/livepatch/patch.h
> +++ b/kernel/livepatch/patch.h
> @@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);
>
> extern int klp_patch_object(struct klp_object *obj);
> extern void klp_unpatch_object(struct klp_object *obj);
> +extern void klp_unpatch_objects(struct klp_patch *patch);

[...]

> diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
> new file mode 100644
> index 0000000..ba9a55c
> --- /dev/null
> +++ b/kernel/livepatch/transition.h
> @@ -0,0 +1,16 @@
> +#include <linux/livepatch.h>
> +
> +enum {
> + KLP_UNIVERSE_UNDEFINED = -1,
> + KLP_UNIVERSE_OLD,
> + KLP_UNIVERSE_NEW,
> +};
> +
> +extern struct mutex klp_mutex;
> +extern struct klp_patch *klp_transition_patch;
> +
> +extern void klp_init_transition(struct klp_patch *patch, int universe);
> +extern void klp_start_transition(int universe);
> +extern void klp_reverse_transition(void);
> +extern void klp_try_complete_transition(void);
> +extern void klp_complete_transition(void);

Double inclusion protection is missing and externs for functions are
redundant.

Otherwise it looks quite ok.

Miroslav

2015-02-16 16:06:22

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed

On Fri, 13 Feb 2015, Josh Poimboeuf wrote:

> On Fri, Feb 13, 2015 at 05:17:10PM +0100, Miroslav Benes wrote:
> > On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> > > Hm, even with Jiri Slaby's suggested fix to add the completion to the
> > > unregister path, I still get a lockdep warning. This looks more insidious,
> > > related to the locking order of a kernfs lock and the klp lock. I'll need to
> > > look at this some more...
> >
> > Yes, I was afraid of this. Lockdep warning is a separate bug. It is caused
> > by taking klp_mutex in enabled_store. During rmmod klp_unregister_patch
> > takes klp_mutex and destroys the sysfs structure. If somebody writes to
> > enabled just after unregister takes the mutex and before the sysfs
> > removal, he would cause the deadlock, because enabled_store takes the
> > "sysfs lock" and then klp_mutex. That is exactly what the lockdep tells us
> > below.
> >
> > We can look for inspiration elsewhere. Grep for s_active through git log
> > of the mainline offers several commits which dealt exactly with this. Will
> > browse through that...
>
> Thanks Miroslav, please let me know what you find. It wouldn't surprise
> me if this were a very common problem.
>
> One option would be to move the enabled_store() work out to a workqueue
> or something.

Yes, that is one possibility. It is not the only one.

1. we could replace mutex_lock in enabled_store with mutex_trylock. If the
lock was not acquired we would return -EBUSY. Or could we 'return
restart_syscall' (maybe after some tiny msleep)?

2. we could reorganize klp_unregister_patch somehow and move sysfs removal
out of mutex protection.

Miroslav

> > >
> > > To recreate:
> > >
> > > insmod livepatch-sample.ko
> > >
> > > # wait for patching to complete
> > >
> > > ~/a.out & <-- simple program which opens the "enabled" file in the background
> >
> > I didn't even need such a program. Lockdep warned me with sole insmod,
> > echo and rmmod. It is magically clever.
>
> Ah, even easier... lockdep is awesome.
>
> --
> Josh
>

--
Miroslav Benes
SUSE Labs

2015-02-17 14:58:39

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 9/9] livepatch: update task universe when exiting kernel

On Mon, Feb 16, 2015 at 11:16:11AM +0100, Jiri Slaby wrote:
> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > Update a task's universe when returning from a system call or user
> > space interrupt, or after handling a signal.
> >
> > This greatly increases the chances of a patch operation succeeding. If
> > a task is I/O bound, it can switch universes when returning from a
> > system call. If a task is CPU bound, it can switch universes when
> > returning from an interrupt. If a task is sleeping on a to-be-patched
> > function, the user can send SIGSTOP and SIGCONT to force it to switch.
> >
> > Since the idle "swapper" tasks don't ever exit the kernel, they're
> > updated from within the idle loop.
> >
> > Signed-off-by: Josh Poimboeuf <[email protected]>
> > ---
> > arch/x86/include/asm/thread_info.h | 4 +++-
> > arch/x86/kernel/signal.c | 4 ++++
> > include/linux/livepatch.h | 2 ++
> > kernel/livepatch/transition.c | 15 +++++++++++++++
> > kernel/sched/idle.c | 4 ++++
> ...
> > --- a/kernel/sched/idle.c
> > +++ b/kernel/sched/idle.c
> > @@ -7,6 +7,7 @@
> > #include <linux/tick.h>
> > #include <linux/mm.h>
> > #include <linux/stackprotector.h>
> > +#include <linux/livepatch.h>
> >
> > #include <asm/tlb.h>
> >
> > @@ -250,6 +251,9 @@ static void cpu_idle_loop(void)
> >
> > sched_ttwu_pending();
> > schedule_preempt_disabled();
> > +
> > + if (unlikely(test_thread_flag(TIF_KLP_NEED_UPDATE)))
> > + klp_update_task_universe(current);
>
> Oh, this is indeed broken on non-x86 archs as kbuild reports.
> (TIF_KLP_NEED_UPDATE undefined)
>
> We need a klp_maybe_update_task_universe inline or something like that
> and define it void for non-LIVEPATCH configs.

Doh, thanks.

--
Josh

2015-02-17 15:18:04

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Sat, Feb 14, 2015 at 12:40:01PM +0100, Jiri Slaby wrote:
> On 02/09/2015, 06:31 PM, Josh Poimboeuf wrote:
> > [...]
>
> So I finally got to review of this one. I have only two concerns:
> 1) it removes the ability for the user to use 'no consistency model'.
> But you don't need to worry about this, I plan to implement this as soon
> as you send v2 of these.

Ok, sounds good.

> 2) How is this 'transition = 1' store above guaranteed to reach other
> CPUs before you start registering ftrace handlers? The CPUs need not see
> the update when some handler is already invoked before start_transition
> AFAICS.

Yeah, I think the order of the 'transition = 1' and adding the func to
the ops stack list should be enforced. Also I'll probably rework the
barriers a little bit in v2 so that they're more explicit and better
commented.
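
For example, one way to enforce it might be (just a sketch, the details will
probably look different in v2):

	/* when making the func visible to the ftrace handler: */
	func->transition = 1;

	/*
	 * Make sure the ->transition store is visible before the func can be
	 * found on ops->func_stack, so klp_ftrace_handler() never sees the
	 * func without also seeing ->transition set.  Pairs with the
	 * smp_rmb() in the handler.
	 */
	smp_wmb();

	list_add_rcu(&func->stack_node, &ops->func_stack);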

--
Josh

2015-02-17 15:10:57

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Mon, Feb 16, 2015 at 03:19:10PM +0100, Miroslav Benes wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
>
> > [...]
>
> I finally managed to go through this patch and I have only few comments
> apart from what Jiri has already written...
>
> I think it would be useful to add more comments throughout the code.

Ok, I'll try to add more comments throughout.

> sysfs documentation (Documentation/ABI/testing/sysfs-kernel-livepatch)
> should be updated as well. Also the meaning of enabled attribute was
> changed a bit (by different patch of the set though).

Ok.

> > +
> > +void klp_unpatch_objects(struct klp_patch *patch)
> > +{
> > + struct klp_object *obj;
> > +
> > + for (obj = patch->objs; obj->funcs; obj++)
> > + if (obj->patched)
> > + klp_unpatch_object(obj);
> > +}
>
> Maybe we should introduce for_each_* macros which could be used in the
> code and avoid such functions. I do not have strong opinion about it.

Yeah, but each such loop seems to differ a little bit, so I'm not quite
sure how to structure the macros such that they'd be useful. Maybe for
a future patch.

> > diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> > index bb34bd3..1648259 100644
> > --- a/kernel/livepatch/patch.h
> > +++ b/kernel/livepatch/patch.h
> > @@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);
> >
> > extern int klp_patch_object(struct klp_object *obj);
> > extern void klp_unpatch_object(struct klp_object *obj);
> > +extern void klp_unpatch_objects(struct klp_patch *patch);
>
> [...]
>
> > diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
> > new file mode 100644
> > index 0000000..ba9a55c
> > --- /dev/null
> > +++ b/kernel/livepatch/transition.h
> > @@ -0,0 +1,16 @@
> > +#include <linux/livepatch.h>
> > +
> > +enum {
> > + KLP_UNIVERSE_UNDEFINED = -1,
> > + KLP_UNIVERSE_OLD,
> > + KLP_UNIVERSE_NEW,
> > +};
> > +
> > +extern struct mutex klp_mutex;
> > +extern struct klp_patch *klp_transition_patch;
> > +
> > +extern void klp_init_transition(struct klp_patch *patch, int universe);
> > +extern void klp_start_transition(int universe);
> > +extern void klp_reverse_transition(void);
> > +extern void klp_try_complete_transition(void);
> > +extern void klp_complete_transition(void);
>
> Double inclusion protection is missing

Ok.

> and externs for functions are redundant.

I agree, but it seems to be the norm in Linux. I have no idea why. I'm
just following the existing convention.

> Otherwise it looks quite ok.

Thanks!

--
Josh

2015-02-17 15:48:47

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Tue, 17 Feb 2015, Josh Poimboeuf wrote:

> On Mon, Feb 16, 2015 at 03:19:10PM +0100, Miroslav Benes wrote:
> > On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> >

[...]

> > > +
> > > +void klp_unpatch_objects(struct klp_patch *patch)
> > > +{
> > > + struct klp_object *obj;
> > > +
> > > + for (obj = patch->objs; obj->funcs; obj++)
> > > + if (obj->patched)
> > > + klp_unpatch_object(obj);
> > > +}
> >
> > Maybe we should introduce for_each_* macros which could be used in the
> > code and avoid such functions. I do not have strong opinion about it.
>
> Yeah, but each such loop seems to differ a little bit, so I'm not quite
> sure how to structure the macros such that they'd be useful. Maybe for
> a future patch.

Yes, that is correct. The code in the caller of klp_unpatch_objects would
look something like this

	klp_for_each_object(obj, patch->objs)
		if (obj->patched)
			klp_unpatch_object(obj);

So there is in fact no change (compared to opencoding of
klp_unpatch_objects), but IMO it is more legible. The upside is
that we wouldn't introduce functions with similar names which could be
confusing in the future AND we could use such macros throughout the code.

One step more could be a macro klp_for_each_patched_object which would
include the check.
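
For illustration, such a macro could look roughly like this (just a sketch):

#define klp_for_each_object(obj, objs) \
	for ((obj) = (objs); (obj)->funcs; (obj)++)

klp_for_each_patched_object would then additionally test obj->patched.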

However it is a nitpick, matter of taste and it is up to you.

>
> > > diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
> > > index bb34bd3..1648259 100644
> > > --- a/kernel/livepatch/patch.h
> > > +++ b/kernel/livepatch/patch.h
> > > @@ -23,3 +23,4 @@ struct klp_ops *klp_find_ops(unsigned long old_addr);
> > >
> > > extern int klp_patch_object(struct klp_object *obj);
> > > extern void klp_unpatch_object(struct klp_object *obj);
> > > +extern void klp_unpatch_objects(struct klp_patch *patch);
> >
> > [...]
> >
> > > diff --git a/kernel/livepatch/transition.h b/kernel/livepatch/transition.h
> > > new file mode 100644
> > > index 0000000..ba9a55c
> > > --- /dev/null
> > > +++ b/kernel/livepatch/transition.h
> > > @@ -0,0 +1,16 @@
> > > +#include <linux/livepatch.h>
> > > +
> > > +enum {
> > > + KLP_UNIVERSE_UNDEFINED = -1,
> > > + KLP_UNIVERSE_OLD,
> > > + KLP_UNIVERSE_NEW,
> > > +};
> > > +
> > > +extern struct mutex klp_mutex;
> > > +extern struct klp_patch *klp_transition_patch;
> > > +
> > > +extern void klp_init_transition(struct klp_patch *patch, int universe);
> > > +extern void klp_start_transition(int universe);
> > > +extern void klp_reverse_transition(void);
> > > +extern void klp_try_complete_transition(void);
> > > +extern void klp_complete_transition(void);
> >
> > Double inclusion protection is missing
>
> Ok.
>
> > and externs for functions are redundant.
>
> I agree, but it seems to be the norm in Linux. I have no idea why. I'm
> just following the existing convention.

Yes, I know. It seems that each author does it differently. You can find
both forms even in one header file in the kernel. There is no functional
difference AFAIK (it is not the case for variables of course). So as long
as we are consistent I do not care. And since we have externs already in
livepatch.h... you can scratch this remark if you want to :)

Miroslav

2015-02-17 15:55:57

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed

On Mon, Feb 16, 2015 at 05:06:15PM +0100, Miroslav Benes wrote:
> On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
>
> > On Fri, Feb 13, 2015 at 05:17:10PM +0100, Miroslav Benes wrote:
> > > On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> > > > Hm, even with Jiri Slaby's suggested fix to add the completion to the
> > > > unregister path, I still get a lockdep warning. This looks more insidious,
> > > > related to the locking order of a kernfs lock and the klp lock. I'll need to
> > > > look at this some more...
> > >
> > > Yes, I was afraid of this. Lockdep warning is a separate bug. It is caused
> > > by taking klp_mutex in enabled_store. During rmmod klp_unregister_patch
> > > takes klp_mutex and destroys the sysfs structure. If somebody writes to
> > > enabled just after unregister takes the mutex and before the sysfs
> > > removal, he would cause the deadlock, because enabled_store takes the
> > > "sysfs lock" and then klp_mutex. That is exactly what the lockdep tells us
> > > below.
> > >
> > > We can look for inspiration elsewhere. Grep for s_active through git log
> > > of the mainline offers several commits which dealt exactly with this. Will
> > > browse through that...
> >
> > Thanks Miroslav, please let me know what you find. It wouldn't surprise
> > me if this were a very common problem.
> >
> > One option would be to move the enabled_store() work out to a workqueue
> > or something.
>
> Yes, that is one possibility. It is not the only one.
>
> 1. we could replace mutex_lock in enabled_store with mutex_trylock. If the
> lock was not acquired we would return -EBUSY. Or could we 'return
> restart_syscall' (maybe after some tiny msleep)?

Hm, doesn't that still violate the locking order rules? I thought locks
always had to be taken in the same order -- always sysfs before klp, or
klp before sysfs. Not sure if there would still be any deadlocks
lurking, but lockdep might still complain.

> 2. we could reorganize klp_unregister_patch somehow and move sysfs removal
> out of mutex protection.

Yeah, I was thinking about this too. Pretty sure we'd have to remove
both the sysfs add and the sysfs removal from mutex protection. I like
this option if we can get it to work.

--
Josh

2015-02-17 16:01:56

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Tue, Feb 17, 2015 at 04:48:39PM +0100, Miroslav Benes wrote:
> On Tue, 17 Feb 2015, Josh Poimboeuf wrote:
>
> > On Mon, Feb 16, 2015 at 03:19:10PM +0100, Miroslav Benes wrote:
> > > On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> > >
>
> [...]
>
> > > > +
> > > > +void klp_unpatch_objects(struct klp_patch *patch)
> > > > +{
> > > > + struct klp_object *obj;
> > > > +
> > > > + for (obj = patch->objs; obj->funcs; obj++)
> > > > + if (obj->patched)
> > > > + klp_unpatch_object(obj);
> > > > +}
> > >
> > > Maybe we should introduce for_each_* macros which could be used in the
> > > code and avoid such functions. I do not have strong opinion about it.
> >
> > Yeah, but each such loop seems to differ a little bit, so I'm not quite
> > sure how to structure the macros such that they'd be useful. Maybe for
> > a future patch.
>
> Yes, that is correct. The code in the caller of klp_unpatch_objects would
> look something like this
>
> 	klp_for_each_object(obj, patch->objs)
> 		if (obj->patched)
> 			klp_unpatch_object(obj);

Yeah, that is slightly more readable and less error prone. I'll do it.

> > > and externs for functions are redundant.
> >
> > I agree, but it seems to be the norm in Linux. I have no idea why. I'm
> > just following the existing convention.
>
> Yes, I know. It seems that each author does it differently. You can find
> both forms even in one header file in the kernel. There is no functional
> difference AFAIK (it is not the case for variables of course). So as long
> as we are consistent I do not care. And since we have externs already in
> livepatch.h... you can scratch this remark if you want to :)

Ok. If there are no objections, let's stick with our existing
nonsensical convention for now :-)

--
Josh

2015-02-17 16:38:21

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 8/9] livepatch: allow patch modules to be removed

On Tue, 17 Feb 2015, Josh Poimboeuf wrote:

> On Mon, Feb 16, 2015 at 05:06:15PM +0100, Miroslav Benes wrote:
> > On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> >
> > > On Fri, Feb 13, 2015 at 05:17:10PM +0100, Miroslav Benes wrote:
> > > > On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> > > > > Hm, even with Jiri Slaby's suggested fix to add the completion to the
> > > > > unregister path, I still get a lockdep warning. This looks more insidious,
> > > > > related to the locking order of a kernfs lock and the klp lock. I'll need to
> > > > > look at this some more...
> > > >
> > > > Yes, I was afraid of this. Lockdep warning is a separate bug. It is caused
> > > > by taking klp_mutex in enabled_store. During rmmod klp_unregister_patch
> > > > takes klp_mutex and destroys the sysfs structure. If somebody writes to
> > > > enabled just after unregister takes the mutex and before the sysfs
> > > > removal, he would cause the deadlock, because enabled_store takes the
> > > > "sysfs lock" and then klp_mutex. That is exactly what the lockdep tells us
> > > > below.
> > > >
> > > > We can look for inspiration elsewhere. Grep for s_active through git log
> > > > of the mainline offers several commits which dealt exactly with this. Will
> > > > browse through that...
> > >
> > > Thanks Miroslav, please let me know what you find. It wouldn't surprise
> > > me if this were a very common problem.
> > >
> > > One option would be to move the enabled_store() work out to a workqueue
> > > or something.
> >
> > Yes, that is one possibility. It is not the only one.
> >
> > 1. we could replace mutex_lock in enabled_store with mutex_trylock. If the
> > lock was not acquired we would return -EBUSY. Or could we 'return
> > restart_syscall' (maybe after some tiny msleep)?
>
> Hm, doesn't that still violate the locking order rules? I thought locks
> always had to be taken in the same order -- always sysfs before klp, or
> klp before sysfs. Not sure if there would still be any deadlocks
> lurking, but lockdep might still complain.

Yes, but in this case you break the possible deadlock order. From the
lockdep report...

       CPU0                    CPU1
       ----                    ----
  lock(klp_mutex);
                               lock(s_active#70);
                               lock(klp_mutex);
  lock(s_active#70);

CPU0 called klp_unregister_patch and CPU1 possibly hit enabled_store in the
race window. There would be no deadlock, because trylock(klp_mutex) on CPU1
would fail and enabled_store would return -EBUSY. In every other scenario
trylock would prevent the deadlock too, or klp_unregister_patch would simply
wait on klp_mutex (I hope I did not miss anything).

I tried it and lockdep did not complain.

And you can look at 36c38fb7144aa941dc072ba8f58b2dbe509c0345 or
5e33bc4165f3edd558d9633002465a95230effc1. They dealt with it the same way
(but it does not mean anything).

It would need more testing to be sure though.
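
For illustration, option 1 would look roughly like the sketch below:
enabled_store() backs off when klp_mutex is contended instead of blocking
while the sysfs "s_active" reference is held. The error handling is
simplified and the __klp_enable_patch()/__klp_disable_patch() calls merely
stand in for whatever the real handler does:

static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
			     const char *buf, size_t count)
{
	struct klp_patch *patch = container_of(kobj, struct klp_patch, kobj);
	unsigned long val;
	int ret;

	ret = kstrtoul(buf, 10, &val);
	if (ret)
		return -EINVAL;

	/* Back off instead of blocking under the sysfs lock. */
	if (!mutex_trylock(&klp_mutex))
		return restart_syscall();	/* or simply -EBUSY */

	ret = val ? __klp_enable_patch(patch) : __klp_disable_patch(patch);

	mutex_unlock(&klp_mutex);

	return ret ? ret : count;
}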

> > 2. we could reorganize klp_unregister_patch somehow and move sysfs removal
> > out of mutex protection.
>
> Yeah, I was thinking about this too. Pretty sure we'd have to remove
> both the sysfs add and the sysfs removal from mutex protection. I like
> this option if we can get it to work.

Yes, why not.
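
As a rough sketch of option 2, the teardown of the sysfs entries would move
outside of the klp_mutex critical section (simplified; the real unregister
path has more to clean up):

static int klp_unregister_patch_sketch(struct klp_patch *patch)
{
	mutex_lock(&klp_mutex);
	/* unlink the patch and check it is disabled, etc. */
	list_del(&patch->list);
	mutex_unlock(&klp_mutex);

	/* sysfs/kobject teardown happens outside of klp_mutex */
	kobject_put(&patch->kobj);

	return 0;
}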

Maybe someone else will share his opinion on this...

Miroslav

2015-02-18 12:42:59

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Tue, 17 Feb 2015, Josh Poimboeuf wrote:

> On Tue, Feb 17, 2015 at 04:48:39PM +0100, Miroslav Benes wrote:
> > On Tue, 17 Feb 2015, Josh Poimboeuf wrote:
> >
> > > On Mon, Feb 16, 2015 at 03:19:10PM +0100, Miroslav Benes wrote:
>
> > > > and externs for functions are redundant.
> > >
> > > I agree, but it seems to be the norm in Linux. I have no idea why. I'm
> > > just following the existing convention.
> >
> > Yes, I know. It seems that each author does it differently. You can find
> > both forms even in one header file in the kernel. There is no functional
> > difference AFAIK (it is not the case for variables of course). So as long
> > as we are consistent I do not care. And since we have externs already in
> > livepatch.h... you can scratch this remark if you want to :)
>
> Ok. If there are no objections, let's stick with our existing
> nonsensical convention for now :-)

So I was thinking about it again and we should not use bad patterns in our
code from the beginning. Externs do not make sense so let's get rid of
them everywhere (i.e. in the consistency model and also in livepatch.h).

The C specification talks about extern in context of internal and external
linkages or in context of inline functions but it does not make any sense
to me. Could you look at the specification and tell me if it makes any
sense to you, please?

Jiri, Vojtech, do you have any opinion about this?

Miroslav

2015-02-18 13:15:33

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Wed, Feb 18, 2015 at 01:42:56PM +0100, Miroslav Benes wrote:
> On Tue, 17 Feb 2015, Josh Poimboeuf wrote:
>
> > On Tue, Feb 17, 2015 at 04:48:39PM +0100, Miroslav Benes wrote:
> > > On Tue, 17 Feb 2015, Josh Poimboeuf wrote:
> > >
> > > > On Mon, Feb 16, 2015 at 03:19:10PM +0100, Miroslav Benes wrote:
> >
> > > > > and externs for functions are redundant.
> > > >
> > > > I agree, but it seems to be the norm in Linux. I have no idea why. I'm
> > > > just following the existing convention.
> > >
> > > Yes, I know. It seems that each author does it differently. You can find
> > > both forms even in one header file in the kernel. There is no functional
> > > difference AFAIK (it is not the case for variables of course). So as long
> > > as we are consistent I do not care. And since we have externs already in
> > > livepatch.h... you can scratch this remark if you want to :)
> >
> > Ok. If there are no objections, let's stick with our existing
> > nonsensical convention for now :-)
>
> So I was thinking about it again and we should not use bad patterns in our
> code from the beginning. Externs do not make sense so let's get rid of
> them everywhere (i.e. in the consistency model and also in livepatch.h).
>
> The C specification talks about extern in context of internal and external
> linkages or in context of inline functions but it does not make any sense
> to me. Could you look at the specification and tell me if it makes any
> sense to you, please?

Relevant parts from C11:

For an identifier declared with the storage-class specifier extern in a
scope in which a prior declaration of that identifier is visible, if the
prior declaration specifies internal or external linkage, the linkage of
the identifier at the later declaration is the same as the linkage
specified at the prior declaration. If no prior declaration is visible,
or if the prior declaration specifies no linkage, then the identifier
has external linkage.

If the declaration of an identifier for a function has no storage-class
specifier, its linkage is determined exactly as if it were declared with
the storage-class specifier extern. If the declaration of an identifier
for an object has file scope and no storage-class specifier, its linkage
is external.

Sounds to me like "extern" is redundant for functions. I'm fine with
removing it. Care to work up a patch for livepatch.h?
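
In other words, the two declarations below are equivalent; dropping "extern"
changes nothing for a function prototype (illustrative declarations only):

struct klp_patch;	/* forward declaration, for illustration */

extern int klp_register_patch(struct klp_patch *patch);	/* external linkage */
int klp_unregister_patch(struct klp_patch *patch);	/* same linkage, no "extern" */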

>
> Jiri, Vojtech, do you have any opinion about this?
>
> Miroslav

--
Josh

2015-02-18 13:42:10

by Miroslav Benes

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Wed, 18 Feb 2015, Josh Poimboeuf wrote:

> On Wed, Feb 18, 2015 at 01:42:56PM +0100, Miroslav Benes wrote:
> > On Tue, 17 Feb 2015, Josh Poimboeuf wrote:
> >
> > > On Tue, Feb 17, 2015 at 04:48:39PM +0100, Miroslav Benes wrote:
> > > > On Tue, 17 Feb 2015, Josh Poimboeuf wrote:
> > > >
> > > > > On Mon, Feb 16, 2015 at 03:19:10PM +0100, Miroslav Benes wrote:
> > >
> > > > > > and externs for functions are redundant.
> > > > >
> > > > > I agree, but it seems to be the norm in Linux. I have no idea why. I'm
> > > > > just following the existing convention.
> > > >
> > > > Yes, I know. It seems that each author does it differently. You can find
> > > > both forms even in one header file in the kernel. There is no functional
> > > > difference AFAIK (it is not the case for variables of course). So as long
> > > > as we are consistent I do not care. And since we have externs already in
> > > > livepatch.h... you can scratch this remark if you want to :)
> > >
> > > Ok. If there are no objections, let's stick with our existing
> > > nonsensical convention for now :-)
> >
> > So I was thinking about it again and we should not use bad patterns in our
> > code from the beginning. Externs do not make sense so let's get rid of
> > them everywhere (i.e. in the consistency model and also in livepatch.h).
> >
> > The C specification talks about extern in context of internal and external
> > linkages or in context of inline functions but it does not make any sense
> > to me. Could you look at the specification and tell me if it makes any
> > sense to you, please?
>
> Relevant parts from C11:
>
> For an identifier declared with the storage-class specifier extern in a
> scope in which a prior declaration of that identifier is visible, if the
> prior declaration specifies internal or external linkage, the linkage of
> the identifier at the later declaration is the same as the linkage
> specified at the prior declaration. If no prior declaration is visible,
> or if the prior declaration specifies no linkage, then the identifier
> has external linkage.
>
> If the declaration of an identifier for a function has no storage-class
> specifier, its linkage is determined exactly as if it were declared with
> the storage-class specifier extern. If the declaration of an identifier
> for an object has file scope and no storage-class specifier, its linkage
> is external.
>
> Sounds to me like "extern" is redundant for functions. I'm fine with
> removing it. Care to work up a patch for livepatch.h?

Agreed. I'll do that. Thanks.

Miroslav

2015-02-18 17:03:55

by Petr Mladek

[permalink] [raw]
Subject: Re: [RFC PATCH 1/9] livepatch: simplify disable error path

On Fri 2015-02-13 13:25:35, Miroslav Benes wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
>
> > If registering the function with ftrace has previously succeeded,
> > unregistering will almost never fail. Even if it does, it's not a fatal
> > error. We can still carry on and disable the klp_func from being used
> > by removing it from the klp_ops func stack.
> >
> > Signed-off-by: Josh Poimboeuf <[email protected]>
>
> This makes sense, so
>
> Reviewed-by: Miroslav Benes <[email protected]>
>
> I think this patch could be taken independently of the consistency model.
> If no one else has any objection...

Yup, it looks good to me.

Reviewed-by: Petr Mladek <[email protected]>

> Miroslav
>
> > ---
> > kernel/livepatch/core.c | 67 +++++++++++++------------------------------------
> > 1 file changed, 17 insertions(+), 50 deletions(-)
> >
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index 9adf86b..081df77 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -322,32 +322,20 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> > klp_arch_set_pc(regs, (unsigned long)func->new_func);
> > }
> >
> > -static int klp_disable_func(struct klp_func *func)
> > +static void klp_disable_func(struct klp_func *func)
> > {
> > struct klp_ops *ops;
> > - int ret;
> > -
> > - if (WARN_ON(func->state != KLP_ENABLED))
> > - return -EINVAL;
> >
> > - if (WARN_ON(!func->old_addr))
> > - return -EINVAL;
> > + WARN_ON(func->state != KLP_ENABLED);
> > + WARN_ON(!func->old_addr);
> >
> > ops = klp_find_ops(func->old_addr);
> > if (WARN_ON(!ops))
> > - return -EINVAL;
> > + return;
> >
> > if (list_is_singular(&ops->func_stack)) {
> > - ret = unregister_ftrace_function(&ops->fops);
> > - if (ret) {
> > - pr_err("failed to unregister ftrace handler for function '%s' (%d)\n",
> > - func->old_name, ret);
> > - return ret;
> > - }
> > -
> > - ret = ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0);
> > - if (ret)
> > - pr_warn("function unregister succeeded but failed to clear the filter\n");
> > + WARN_ON(unregister_ftrace_function(&ops->fops));
> > + WARN_ON(ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0));
> >
> > list_del_rcu(&func->stack_node);
> > list_del(&ops->node);
> > @@ -357,8 +345,6 @@ static int klp_disable_func(struct klp_func *func)
> > }
> >
> > func->state = KLP_DISABLED;
> > -
> > - return 0;
> > }
> >
> > static int klp_enable_func(struct klp_func *func)
> > @@ -419,23 +405,15 @@ err:
> > return ret;
> > }
> >
> > -static int klp_disable_object(struct klp_object *obj)
> > +static void klp_disable_object(struct klp_object *obj)
> > {
> > struct klp_func *func;
> > - int ret;
> >
> > - for (func = obj->funcs; func->old_name; func++) {
> > - if (func->state != KLP_ENABLED)
> > - continue;
> > -
> > - ret = klp_disable_func(func);
> > - if (ret)
> > - return ret;
> > - }
> > + for (func = obj->funcs; func->old_name; func++)
> > + if (func->state == KLP_ENABLED)
> > + klp_disable_func(func);
> >
> > obj->state = KLP_DISABLED;
> > -
> > - return 0;
> > }
> >
> > static int klp_enable_object(struct klp_object *obj)
> > @@ -451,22 +429,19 @@ static int klp_enable_object(struct klp_object *obj)
> >
> > for (func = obj->funcs; func->old_name; func++) {
> > ret = klp_enable_func(func);
> > - if (ret)
> > - goto unregister;
> > + if (ret) {
> > + klp_disable_object(obj);
> > + return ret;
> > + }
> > }
> > obj->state = KLP_ENABLED;
> >
> > return 0;
> > -
> > -unregister:
> > - WARN_ON(klp_disable_object(obj));
> > - return ret;
> > }
> >
> > static int __klp_disable_patch(struct klp_patch *patch)
> > {
> > struct klp_object *obj;
> > - int ret;
> >
> > /* enforce stacking: only the last enabled patch can be disabled */
> > if (!list_is_last(&patch->list, &klp_patches) &&
> > @@ -476,12 +451,8 @@ static int __klp_disable_patch(struct klp_patch *patch)
> > pr_notice("disabling patch '%s'\n", patch->mod->name);
> >
> > for (obj = patch->objs; obj->funcs; obj++) {
> > - if (obj->state != KLP_ENABLED)
> > - continue;
> > -
> > - ret = klp_disable_object(obj);
> > - if (ret)
> > - return ret;
> > + if (obj->state == KLP_ENABLED)
> > + klp_disable_object(obj);
> > }
> >
> > patch->state = KLP_DISABLED;
> > @@ -931,7 +902,6 @@ static void klp_module_notify_going(struct klp_patch *patch,
> > {
> > struct module *pmod = patch->mod;
> > struct module *mod = obj->mod;
> > - int ret;
> >
> > if (patch->state == KLP_DISABLED)
> > goto disabled;
> > @@ -939,10 +909,7 @@ static void klp_module_notify_going(struct klp_patch *patch,
> > pr_notice("reverting patch '%s' on unloading module '%s'\n",
> > pmod->name, mod->name);
> >
> > - ret = klp_disable_object(obj);
> > - if (ret)
> > - pr_warn("failed to revert patch '%s' on module '%s' (%d)\n",
> > - pmod->name, mod->name, ret);
> > + klp_disable_object(obj);
> >
> > disabled:
> > klp_free_object_loaded(obj);
> > --
> > 2.1.0
> >

2015-02-18 20:07:33

by Jiri Kosina

[permalink] [raw]
Subject: Re: [RFC PATCH 1/9] livepatch: simplify disable error path

On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> If registering the function with ftrace has previously succeeded,
> unregistering will almost never fail. Even if it does, it's not a fatal
> error. We can still carry on and disable the klp_func from being used
> by removing it from the klp_ops func stack.
>
> Signed-off-by: Josh Poimboeuf <[email protected]>

Applied to for-3.21/core, thanks.

--
Jiri Kosina
SUSE Labs

2015-02-18 20:18:03

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model


* Jiri Kosina <[email protected]> wrote:

> On Thu, 12 Feb 2015, Peter Zijlstra wrote:
>
> > And what's wrong with using known good spots like the freezer?
>
> Quoting Tejun from the thread Jiri Slaby likely had in
> mind:
>
> "The fact that they may coincide often can be useful as a
> guideline or whatever but I'm completely against just
> mushing it together when it isn't correct. This kind of
> things quickly lead to ambiguous situations where people
> are not sure about the specific semantics or guarantees
> of the construct and implement weird voodoo code followed
> by voodoo fixes. We already had a full round of that
> with the kernel freezer itself, where people thought that
> the freezer magically makes PM work properly for a
> subsystem. Let's please not do that again."

I don't follow this vague argument.

The concept of 'freezing' all userspace execution is pretty
unambiguous: tasks that are running are trapped out at
known safe points such as context switch points or syscall
entry. Once all tasks have stopped, the system is frozen in
the sense that only the code we want is running, so you can
run special code without worrying about races.

What's the problem with that? Why would it be fundamentally
unsuitable for live patching?

Thanks,

Ingo

2015-02-18 21:02:48

by Vojtech Pavlik

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Wed, Feb 18, 2015 at 09:17:55PM +0100, Ingo Molnar wrote:
>
> * Jiri Kosina <[email protected]> wrote:
>
> > On Thu, 12 Feb 2015, Peter Zijlstra wrote:
> >
> > > And what's wrong with using known good spots like the freezer?
> >
> > Quoting Tejun from the thread Jiri Slaby likely had in
> > mind:
> >
> > "The fact that they may coincide often can be useful as a
> > guideline or whatever but I'm completely against just
> > mushing it together when it isn't correct. This kind of
> > things quickly lead to ambiguous situations where people
> > are not sure about the specific semantics or guarantees
> > of the construct and implement weird voodoo code followed
> > by voodoo fixes. We already had a full round of that
> > with the kernel freezer itself, where people thought that
> > the freezer magically makes PM work properly for a
> > subsystem. Let's please not do that again."
>
> I don't follow this vague argument.
>
> The concept of 'freezing' all userspace execution is pretty
> unambiguous: tasks that are running are trapped out at
> known safe points such as context switch points or syscall
> entry. Once all tasks have stopped, the system is frozen in
> the sense that only the code we want is running, so you can
> run special code without worrying about races.
>
> What's the problem with that? Why would it be fundamentally
> unsuitable for live patching?

For live patching it doesn't matter whether code is running, sleeping or
frozen.

What matters is whether there is state before patching that may not be
valid after patching.

For userspace tasks, the exit from a syscall is a perfect moment for
switching to the "after" state, as all stacks, and thus all local
variables are gone and no local state exists in the kernel for the
thread.

The freezer is a logical choice for kernel threads, however, given that
kernel threads have no defined entry/exit point and execute within a
single main function, local variables stay and thus local state persists
from before to after freezing.

Defining that no local state within a kernel thread may be relied upon
after exiting from the freezer is certainly possible, and is already
true for many kernel threads.

It isn't a given property of the freezer itself, though. And it isn't
obvious to authors of new kernel threads either.

The ideal solution would be to convert the majority of kernel threads to
workqueues, because then there is a defined entry/exit point over which
state isn't transferred. That is a lot of work, though, and has other
drawbacks, particularly in the realtime space.
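
For illustration, a work item has that property because each invocation of
the handler starts on a fresh stack and runs to completion, so no local state
can survive across the switch (a minimal sketch with made-up names):

#include <linux/kernel.h>
#include <linux/workqueue.h>

static void example_work_fn(struct work_struct *work)
{
	int local_state = 0;	/* rebuilt from scratch on every invocation */

	pr_info("one unit of work done, state=%d\n", local_state);
}	/* returning here is the defined exit point; no locals survive */

static DECLARE_WORK(example_work, example_work_fn);

/* queue_work(system_wq, &example_work); -- every run is a clean entry/exit */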

--
Vojtech Pavlik
Director SUSE Labs

2015-02-19 09:53:05

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Wed, Feb 18, 2015 at 09:44:44PM +0100, Vojtech Pavlik wrote:
> For live patching it doesn't matter whether code is running, sleeping or
> frozen.
>
> What matters is whether there is state before patching that may not be
> valid after patching.
>
> For userspace tasks, the exit from a syscall is a perfect moment for
> switching to the "after" state, as all stacks, and thus all local
> variables are gone and no local state exists in the kernel for the
> thread.
>
> The freezer is a logical choice for kernel threads, however, given that
> kernel threads have no defined entry/exit point and execute within a
> single main function, local variables stay and thus local state persists
> from before to after freezing.
>
> Defining that no local state within a kernel thread may be relied upon
> after exiting from the freezer is certainly possible, and is already
> true for many kernel threads.
>
> It isn't a given property of the freezer itself, though. And it isn't
> obvious to authors of new kernel threads either.
>
> The ideal solution would be to convert the majority of kernel threads to
> workqueues, because then there is a defined entry/exit point over which
> state isn't transferred. That is a lot of work, though, and has other
> drawbacks, particularly in the realtime space.

kthread_park() functionality seems to be exactly what you want.
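
For illustration, a kthread main loop offering such a parking point could
look like the sketch below (a hypothetical thread, not from the patch set);
at kthread_parkme() no pre-park local state is relied upon, which is the
property discussed above:

#include <linux/kthread.h>
#include <linux/delay.h>

static int example_kthread_fn(void *data)
{
	while (!kthread_should_stop()) {
		if (kthread_should_park())
			kthread_parkme();	/* safe point: no stale locals carried over */

		/* do one self-contained unit of work here */

		msleep_interruptible(1000);
	}
	return 0;
}

/* started with kthread_run(); parked from elsewhere with kthread_park(task) */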

2015-02-19 10:11:59

by Vojtech Pavlik

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, Feb 19, 2015 at 10:52:51AM +0100, Peter Zijlstra wrote:

> On Wed, Feb 18, 2015 at 09:44:44PM +0100, Vojtech Pavlik wrote:
> > For live patching it doesn't matter whether code is running, sleeping or
> > frozen.
> >
> > What matters is whether there is state before patching that may not be
> > valid after patching.
> >
> > For userspace tasks, the exit from a syscall is a perfect moment for
> > switching to the "after" state, as all stacks, and thus all local
> > variables are gone and no local state exists in the kernel for the
> > thread.
> >
> > The freezer is a logical choice for kernel threads, however, given that
> > kernel threads have no defined entry/exit point and execute within a
> > single main function, local variables stay and thus local state persists
> > from before to after freezing.
> >
> > Defining that no local state within a kernel thread may be relied upon
> > after exiting from the freezer is certainly possible, and is already
> > true for many kernel threads.
> >
> > It isn't a given property of the freezer itself, though. And it isn't
> > obvious to authors of new kernel threads either.
> >
> > The ideal solution would be to convert the majority of kernel threads to
> > workqueues, because then there is a defined entry/exit point over which
> > state isn't transferred. That is a lot of work, though, and has other
> > drawbacks, particularly in the realtime space.
>
> kthread_park() functionality seems to be exactly what you want.

It might be exactly that, indeed. The requirement of not just cleaning
up, but also not using contents of local variables from before parking
would need to be documented.

And kernel threads would need to start using it, too. I have been able
to find one instance where this functionality is actually used. So it is
again a matter of a massive patch adding that, like with the approach of
converting kernel threads to workqueues.

By the way, if kthread_park() was implemented all through the kernel,
would we still need the freezer for kernel threads at all? Since parking
seems to be stronger than freezing, it could also be used for that
purpose.

Vojtech

2015-02-19 10:51:53

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

On Thu, Feb 19, 2015 at 11:11:53AM +0100, Vojtech Pavlik wrote:
> On Thu, Feb 19, 2015 at 10:52:51AM +0100, Peter Zijlstra wrote:
> > kthread_park() functionality seems to be exactly what you want.
>
> It might be exactly that, indeed. The requirement of not just cleaning
> up, but also not using contents of local variables from before parking
> would need to be documented.
>
> And kernel threads would need to start using it, too. I have been able
> to find one instance where this functionality is actually used.

Yeah, there's work to be done there. It was introduced for the cpu
hotplug stuff, and some per-cpu threads use this through the smpboot
infrastructure.

More need to be converted. It would be relatively straightforward to park
threaded IRQs on irq-suspend-like activity, for example.

> So it is
> again a matter of a massive patch adding that, like with the approach of
> converting kernel threads to workqueues.

Yeah, but not nearly all kthreads can be converted to workqueues. And
there are various problems with workqueues that make it undesirable for
some even if possible.

> By the way, if kthread_park() was implemented all through the kernel,
> would we still need the freezer for kernel threads at all? Since parking
> seems to be stronger than freezing, it could also be used for that
> purpose.

I think not; there might of course be horrible exceptions but in general
parking should be good enough indeed.

by Masami Hiramatsu

Subject: Re: [RFC PATCH 0/9] livepatch: consistency model

(2015/02/13 23:41), Josh Poimboeuf wrote:
> On Fri, Feb 13, 2015 at 03:22:15PM +0100, Jiri Kosina wrote:
>> On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
>>
>>>> How about we take a slightly different aproach -- put a probe (or ftrace)
>>>> on __switch_to() during a klp transition period, and examine stacktraces
>>>> for tasks that are just about to start running from there?
>>>>
>>>> The only tasks that would not be covered by this would be purely CPU-bound
>>>> tasks that never schedule. But we are likely in trouble with those anyway,
>>>> because odds are that non-rescheduling CPU-bound tasks are also
>>>> RT-priority tasks running on isolated CPUs, which we will fail to handle
>>>> anyway.
>>>>
>>>> I think Masami used similar trick in his kpatch-without-stopmachine
>>>> aproach.
>>>
>>> Yeah, that's definitely an option, though I'm really not too crazy about
>>> it. Hooking into the scheduler is kind of scary and disruptive.
>>
>> This is basically about running a stack checking for ->next before
>> switching to it, i.e. read-only operation (admittedly inducing some
>> latency, but that's the same with locking the runqueue). And only when in
>> transition phase.
>
> Yes, but it would introduce much more latency than locking rq, since
> there would be at least some added latency to every schedule() call
> during the transition phase. Locking the rq would only add latency in
> those cases where another CPU is trying to do a context switch while
> we're holding the lock.

If we can implement the checking routine at the entry to the switching
process, it will not have such a big cost. My prototype code used kprobes
just as a hack, but we can do it in the scheduler too.

>
> It also seems much more dangerous. A bug in __switch_to() could easily
> do a lot of damage.

Indeed. It requires per-task locking in the scheduler so that the switch
is safe against concurrent stack checking.
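
For illustration, the check at the switch point could be shaped roughly like
this (every name below is hypothetical; it is a sketch of the idea being
discussed, not what the posted series does):

#include <linux/sched.h>

static bool klp_transition_in_progress;		/* hypothetical transition flag */

static bool klp_stack_has_patched_funcs(struct task_struct *t)
{
	return false;	/* placeholder: would walk t's stack here */
}

static void klp_switch_task_universe(struct task_struct *t)
{
	/* placeholder: would move t to the new universe here */
}

/* Called with the runqueue lock held, just before switching to "next". */
static void klp_check_task_on_switch(struct task_struct *next)
{
	if (!klp_transition_in_progress)
		return;

	/* Migrate only tasks that are not inside a to-be-patched function. */
	if (!klp_stack_has_patched_funcs(next))
		klp_switch_task_universe(next);
}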

>>> We'd also have to wake up all the sleeping processes.
>>
>> Yes, I don't think there is a way around that.
>
> Actually this patch set is a way around that :-)

Thank you,

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]