Hello,
This is the v2 of the livepatch shadow GC patches. The changes are minor since
nobody asked for for big code changes.
Changes from v1:
* Reworked commit messages (Petr)
* Added my SoB which was missing in some patches, or the ordering was wrong. (Josh)
* Change __klp_shadow_get_or_use to __klp_shadow_get_or_add_locked and add a comment (Petr)
* Add lockdep_assert_held on __klp_shadow_get_or_add_locked (Petr)
about it's meaning (Petr)
* CCing LKML (Josh)
Some observations:
* Petr has reviewed some of the patches that we created. I kept the Reviewed-by
tags since he wrote the patches some time ago and now he reviewed them again
on the ML.
* There were questions about possible problems about using klp_shadow_types
instead of using ids, but Petr already explained that internally it still uses
the id to find the correct livepatch.
* Regarding the possibility of multiple patches use the same ID, the problem
already existed before. Petr suggested using a "stringified" version using
name and id, but nobody has commented yet. I can implement such feature in a
v3 if necessary.
Marcos Paulo de Souza (2):
livepatch/shadow: Introduce klp_shadow_type structure
livepatch/shadow: Add garbage collection of shadow variables
Petr Mladek (2):
livepatch/shadow: Separate code to get or use pre-allocated shadow
variable
livepatch/shadow: Separate code removing all shadow variables for a
given id
include/linux/livepatch.h | 50 ++-
kernel/livepatch/core.c | 39 +++
kernel/livepatch/core.h | 1 +
kernel/livepatch/shadow.c | 297 +++++++++++++-----
kernel/livepatch/transition.c | 4 +-
lib/livepatch/test_klp_shadow_vars.c | 119 ++++---
samples/livepatch/livepatch-shadow-fix1.c | 24 +-
samples/livepatch/livepatch-shadow-fix2.c | 34 +-
.../selftests/livepatch/test-shadow-vars.sh | 2 +-
9 files changed, 417 insertions(+), 153 deletions(-)
--
2.37.3
From: Petr Mladek <[email protected]>
Allow to remove all shadow variables with already taken klp_shadow_lock.
It will be needed to synchronize this with other operation during
the garbage collection of shadow variables.
It is a code refactoring without any functional changes.
Signed-off-by: Petr Mladek <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Signed-off-by: Marcos Paulo de Souza <[email protected]>
---
Changes from v1:
* Added my SoB (Josh)
kernel/livepatch/shadow.c | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)
diff --git a/kernel/livepatch/shadow.c b/kernel/livepatch/shadow.c
index 81ad7cbbd124..aba44dcc0a88 100644
--- a/kernel/livepatch/shadow.c
+++ b/kernel/livepatch/shadow.c
@@ -283,6 +283,20 @@ void klp_shadow_free(void *obj, unsigned long id, klp_shadow_dtor_t dtor)
}
EXPORT_SYMBOL_GPL(klp_shadow_free);
+static void __klp_shadow_free_all(unsigned long id, klp_shadow_dtor_t dtor)
+{
+ struct klp_shadow *shadow;
+ int i;
+
+ lockdep_assert_held(&klp_shadow_lock);
+
+ /* Delete all <*, id> from hash */
+ hash_for_each(klp_shadow_hash, i, shadow, node) {
+ if (klp_shadow_match(shadow, shadow->obj, id))
+ klp_shadow_free_struct(shadow, dtor);
+ }
+}
+
/**
* klp_shadow_free_all() - detach and free all <_, id> shadow variables
* @id: data identifier
@@ -294,18 +308,10 @@ EXPORT_SYMBOL_GPL(klp_shadow_free);
*/
void klp_shadow_free_all(unsigned long id, klp_shadow_dtor_t dtor)
{
- struct klp_shadow *shadow;
unsigned long flags;
- int i;
spin_lock_irqsave(&klp_shadow_lock, flags);
-
- /* Delete all <_, id> from hash */
- hash_for_each(klp_shadow_hash, i, shadow, node) {
- if (klp_shadow_match(shadow, shadow->obj, id))
- klp_shadow_free_struct(shadow, dtor);
- }
-
+ __klp_shadow_free_all(id, dtor);
spin_unlock_irqrestore(&klp_shadow_lock, flags);
}
EXPORT_SYMBOL_GPL(klp_shadow_free_all);
--
2.37.3
The shadow variable type will be used in klp_shadow_alloc/get/free
functions instead of id/ctor/dtor parameters. As a result, all callers
use the same callbacks consistently[*][**].
The structure will be used in the next patch that will manage the
lifetime of shadow variables and execute garbage collection automatically.
[*] From the user POV, it might have been easier to pass $id instead
of pointer to struct klp_shadow_type.
It would require registering the klp_shadow_type so that
the klp_shadow API could find ctor/dtor for the given id.
It actually will be needed for the garbage collection anyway
because it will define the lifetime of the variables.
The bigger problem is that the same klp_shadow_type might be
used by more livepatch modules. Each livepatch module need
to duplicate the definition of klp_shadow_type and ctor/dtor
callbacks. The klp_shadow API would need to choose one registered
copy.
The definitions should be compatible and they should stay as long
as the type is registered. But it still feels more safe when
klp_shadow API callers use struct klp_shadow_type and ctor/dtor
callbacks defined in the same livepatch module.
This problem is gone when each livepatch explicitly uses its
own struct klp_shadow_type pointing to its own callbacks.
[**] test_klp_shadow_vars.c uses a custom @dtor to show that it was called.
The message must be disabled when called via klp_shadow_free_all()
because the ordering of freed variables is not well defined there.
It has to be done using another hack after switching to
klp_shadow_types.
Co-developed-by: Petr Mladek <[email protected]>
Signed-off-by: Petr Mladek <[email protected]>
Signed-off-by: Marcos Paulo de Souza <[email protected]>
---
Changes from v1:
* Added my SoB (Josh)
* Added a Co-developed-by tag (Petr)
* Changed the comment about throwing away speculative allocation (Petr)
include/linux/livepatch.h | 29 +++--
kernel/livepatch/shadow.c | 107 +++++++++---------
lib/livepatch/test_klp_shadow_vars.c | 105 ++++++++++-------
samples/livepatch/livepatch-shadow-fix1.c | 18 ++-
samples/livepatch/livepatch-shadow-fix2.c | 27 +++--
.../selftests/livepatch/test-shadow-vars.sh | 2 +-
6 files changed, 165 insertions(+), 123 deletions(-)
diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 293e29960c6e..79e7bf3b35f6 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -216,15 +216,26 @@ typedef int (*klp_shadow_ctor_t)(void *obj,
void *ctor_data);
typedef void (*klp_shadow_dtor_t)(void *obj, void *shadow_data);
-void *klp_shadow_get(void *obj, unsigned long id);
-void *klp_shadow_alloc(void *obj, unsigned long id,
- size_t size, gfp_t gfp_flags,
- klp_shadow_ctor_t ctor, void *ctor_data);
-void *klp_shadow_get_or_alloc(void *obj, unsigned long id,
- size_t size, gfp_t gfp_flags,
- klp_shadow_ctor_t ctor, void *ctor_data);
-void klp_shadow_free(void *obj, unsigned long id, klp_shadow_dtor_t dtor);
-void klp_shadow_free_all(unsigned long id, klp_shadow_dtor_t dtor);
+/**
+ * struct klp_shadow_type - shadow variable type used by the klp_object
+ * @id: shadow variable type indentifier
+ * @ctor: custom constructor to initialize the shadow data (optional)
+ * @dtor: custom callback that can be used to unregister the variable
+ * and/or free data that the shadow variable points to (optional)
+ */
+struct klp_shadow_type {
+ unsigned long id;
+ klp_shadow_ctor_t ctor;
+ klp_shadow_dtor_t dtor;
+};
+
+void *klp_shadow_get(void *obj, struct klp_shadow_type *shadow_type);
+void *klp_shadow_alloc(void *obj, struct klp_shadow_type *shadow_type,
+ size_t size, gfp_t gfp_flags, void *ctor_data);
+void *klp_shadow_get_or_alloc(void *obj, struct klp_shadow_type *shadow_type,
+ size_t size, gfp_t gfp_flags, void *ctor_data);
+void klp_shadow_free(void *obj, struct klp_shadow_type *shadow_type);
+void klp_shadow_free_all(struct klp_shadow_type *shadow_type);
struct klp_state *klp_get_state(struct klp_patch *patch, unsigned long id);
struct klp_state *klp_get_prev_state(unsigned long id);
diff --git a/kernel/livepatch/shadow.c b/kernel/livepatch/shadow.c
index aba44dcc0a88..64e83853891d 100644
--- a/kernel/livepatch/shadow.c
+++ b/kernel/livepatch/shadow.c
@@ -63,24 +63,24 @@ struct klp_shadow {
* klp_shadow_match() - verify a shadow variable matches given <obj, id>
* @shadow: shadow variable to match
* @obj: pointer to parent object
- * @id: data identifier
+ * @shadow_type: type of the wanted shadow variable
*
* Return: true if the shadow variable matches.
*/
static inline bool klp_shadow_match(struct klp_shadow *shadow, void *obj,
- unsigned long id)
+ struct klp_shadow_type *shadow_type)
{
- return shadow->obj == obj && shadow->id == id;
+ return shadow->obj == obj && shadow->id == shadow_type->id;
}
/**
* klp_shadow_get() - retrieve a shadow variable data pointer
* @obj: pointer to parent object
- * @id: data identifier
+ * @shadow_type: type of the wanted shadow variable
*
* Return: the shadow variable data element, NULL on failure.
*/
-void *klp_shadow_get(void *obj, unsigned long id)
+void *klp_shadow_get(void *obj, struct klp_shadow_type *shadow_type)
{
struct klp_shadow *shadow;
@@ -89,7 +89,7 @@ void *klp_shadow_get(void *obj, unsigned long id)
hash_for_each_possible_rcu(klp_shadow_hash, shadow, node,
(unsigned long)obj) {
- if (klp_shadow_match(shadow, obj, id)) {
+ if (klp_shadow_match(shadow, obj, shadow_type)) {
rcu_read_unlock();
return shadow->data;
}
@@ -101,17 +101,16 @@ void *klp_shadow_get(void *obj, unsigned long id)
}
EXPORT_SYMBOL_GPL(klp_shadow_get);
-/* Check if the variable exists. Otherwise, add the pre-allocated one. */
-static void *__klp_shadow_get_or_add_locked(void *obj, unsigned long id,
- struct klp_shadow *new_shadow,
- klp_shadow_ctor_t ctor, void *ctor_data,
- bool *exist)
+static void *__klp_shadow_get_or_add_locked(void *obj,
+ struct klp_shadow_type *shadow_type,
+ struct klp_shadow *new_shadow,
+ void *ctor_data, bool *exist)
{
void *shadow_data;
lockdep_assert_held(&klp_shadow_lock);
- shadow_data = klp_shadow_get(obj, id);
+ shadow_data = klp_shadow_get(obj, shadow_type);
if (unlikely(shadow_data)) {
*exist = true;
return shadow_data;
@@ -119,15 +118,15 @@ static void *__klp_shadow_get_or_add_locked(void *obj, unsigned long id,
*exist = false;
new_shadow->obj = obj;
- new_shadow->id = id;
+ new_shadow->id = shadow_type->id;
- if (ctor) {
+ if (shadow_type->ctor) {
int err;
- err = ctor(obj, new_shadow->data, ctor_data);
+ err = shadow_type->ctor(obj, new_shadow->data, ctor_data);
if (err) {
pr_err("Failed to construct shadow variable <%p, %lx> (%d)\n",
- obj, id, err);
+ obj, shadow_type->id, err);
return NULL;
}
}
@@ -139,9 +138,8 @@ static void *__klp_shadow_get_or_add_locked(void *obj, unsigned long id,
return new_shadow->data;
}
-static void *__klp_shadow_get_or_alloc(void *obj, unsigned long id,
- size_t size, gfp_t gfp_flags,
- klp_shadow_ctor_t ctor, void *ctor_data,
+static void *__klp_shadow_get_or_alloc(void *obj, struct klp_shadow_type *shadow_type,
+ size_t size, gfp_t gfp_flags, void *ctor_data,
bool warn_on_exist)
{
struct klp_shadow *new_shadow;
@@ -150,7 +148,7 @@ static void *__klp_shadow_get_or_alloc(void *obj, unsigned long id,
unsigned long flags;
/* Check if the shadow variable already exists */
- shadow_data = klp_shadow_get(obj, id);
+ shadow_data = klp_shadow_get(obj, shadow_type);
if (shadow_data)
return shadow_data;
@@ -159,22 +157,25 @@ static void *__klp_shadow_get_or_alloc(void *obj, unsigned long id,
* More complex setting can be done by @ctor function. But it is
* called only when the buffer is really used (under klp_shadow_lock).
*/
- new_shadow = kzalloc(size + sizeof(*new_shadow), gfp_flags);
+ new_shadow = kzalloc(size + sizeof(struct klp_shadow), gfp_flags);
if (!new_shadow)
return NULL;
/* Look for <obj, id> again under the lock */
spin_lock_irqsave(&klp_shadow_lock, flags);
- shadow_data = __klp_shadow_get_or_add_locked(obj, id, new_shadow,
- ctor, ctor_data, &exist);
+ shadow_data = __klp_shadow_get_or_add_locked(obj, shadow_type,
+ new_shadow, ctor_data, &exist);
spin_unlock_irqrestore(&klp_shadow_lock, flags);
- /* Throw away unused speculative allocation. */
+ /*
+ * Throw away unused speculative allocation if the ctor() failed
+ * or the variable already existed.
+ */
if (!shadow_data || exist)
kfree(new_shadow);
if (exist && warn_on_exist) {
- WARN(1, "Duplicate shadow variable <%p, %lx>\n", obj, id);
+ WARN(1, "Duplicate shadow variable <%p, %lx>\n", obj, shadow_type->id);
return NULL;
}
@@ -184,10 +185,9 @@ static void *__klp_shadow_get_or_alloc(void *obj, unsigned long id,
/**
* klp_shadow_alloc() - allocate and add a new shadow variable
* @obj: pointer to parent object
- * @id: data identifier
+ * @shadow_type: type of the wanted shadow variable
* @size: size of attached data
* @gfp_flags: GFP mask for allocation
- * @ctor: custom constructor to initialize the shadow data (optional)
* @ctor_data: pointer to any data needed by @ctor (optional)
*
* Allocates @size bytes for new shadow variable data using @gfp_flags.
@@ -205,22 +205,21 @@ static void *__klp_shadow_get_or_alloc(void *obj, unsigned long id,
* Return: the shadow variable data element, NULL on duplicate or
* failure.
*/
-void *klp_shadow_alloc(void *obj, unsigned long id,
- size_t size, gfp_t gfp_flags,
- klp_shadow_ctor_t ctor, void *ctor_data)
+void *klp_shadow_alloc(void *obj, struct klp_shadow_type *shadow_type,
+ size_t size, gfp_t gfp_flags, void *ctor_data)
{
- return __klp_shadow_get_or_alloc(obj, id, size, gfp_flags,
- ctor, ctor_data, true);
+ return __klp_shadow_get_or_alloc(obj, shadow_type, size,
+ gfp_flags, ctor_data,
+ true);
}
EXPORT_SYMBOL_GPL(klp_shadow_alloc);
/**
* klp_shadow_get_or_alloc() - get existing or allocate a new shadow variable
* @obj: pointer to parent object
- * @id: data identifier
+ * @shadow_type: type of the wanted shadow variable
* @size: size of attached data
* @gfp_flags: GFP mask for allocation
- * @ctor: custom constructor to initialize the shadow data (optional)
* @ctor_data: pointer to any data needed by @ctor (optional)
*
* Returns a pointer to existing shadow data if an <obj, id> shadow
@@ -234,35 +233,33 @@ EXPORT_SYMBOL_GPL(klp_shadow_alloc);
*
* Return: the shadow variable data element, NULL on failure.
*/
-void *klp_shadow_get_or_alloc(void *obj, unsigned long id,
- size_t size, gfp_t gfp_flags,
- klp_shadow_ctor_t ctor, void *ctor_data)
+void *klp_shadow_get_or_alloc(void *obj, struct klp_shadow_type *shadow_type,
+ size_t size, gfp_t gfp_flags, void *ctor_data)
{
- return __klp_shadow_get_or_alloc(obj, id, size, gfp_flags,
- ctor, ctor_data, false);
+ return __klp_shadow_get_or_alloc(obj, shadow_type, size,
+ gfp_flags, ctor_data,
+ false);
}
EXPORT_SYMBOL_GPL(klp_shadow_get_or_alloc);
static void klp_shadow_free_struct(struct klp_shadow *shadow,
- klp_shadow_dtor_t dtor)
+ struct klp_shadow_type *shadow_type)
{
hash_del_rcu(&shadow->node);
- if (dtor)
- dtor(shadow->obj, shadow->data);
+ if (shadow_type->dtor)
+ shadow_type->dtor(shadow->obj, shadow->data);
kfree_rcu(shadow, rcu_head);
}
/**
* klp_shadow_free() - detach and free a <obj, id> shadow variable
* @obj: pointer to parent object
- * @id: data identifier
- * @dtor: custom callback that can be used to unregister the variable
- * and/or free data that the shadow variable points to (optional)
+ * @shadow_type: type of to be freed shadow variable
*
* This function releases the memory for this <obj, id> shadow variable
* instance, callers should stop referencing it accordingly.
*/
-void klp_shadow_free(void *obj, unsigned long id, klp_shadow_dtor_t dtor)
+void klp_shadow_free(void *obj, struct klp_shadow_type *shadow_type)
{
struct klp_shadow *shadow;
unsigned long flags;
@@ -273,8 +270,8 @@ void klp_shadow_free(void *obj, unsigned long id, klp_shadow_dtor_t dtor)
hash_for_each_possible(klp_shadow_hash, shadow, node,
(unsigned long)obj) {
- if (klp_shadow_match(shadow, obj, id)) {
- klp_shadow_free_struct(shadow, dtor);
+ if (klp_shadow_match(shadow, obj, shadow_type)) {
+ klp_shadow_free_struct(shadow, shadow_type);
break;
}
}
@@ -283,7 +280,7 @@ void klp_shadow_free(void *obj, unsigned long id, klp_shadow_dtor_t dtor)
}
EXPORT_SYMBOL_GPL(klp_shadow_free);
-static void __klp_shadow_free_all(unsigned long id, klp_shadow_dtor_t dtor)
+static void __klp_shadow_free_all(struct klp_shadow_type *shadow_type)
{
struct klp_shadow *shadow;
int i;
@@ -292,26 +289,24 @@ static void __klp_shadow_free_all(unsigned long id, klp_shadow_dtor_t dtor)
/* Delete all <*, id> from hash */
hash_for_each(klp_shadow_hash, i, shadow, node) {
- if (klp_shadow_match(shadow, shadow->obj, id))
- klp_shadow_free_struct(shadow, dtor);
+ if (klp_shadow_match(shadow, shadow->obj, shadow_type))
+ klp_shadow_free_struct(shadow, shadow_type);
}
}
/**
* klp_shadow_free_all() - detach and free all <_, id> shadow variables
- * @id: data identifier
- * @dtor: custom callback that can be used to unregister the variable
- * and/or free data that the shadow variable points to (optional)
+ * @shadow_type: type of to be freed shadow variables
*
* This function releases the memory for all <_, id> shadow variable
* instances, callers should stop referencing them accordingly.
*/
-void klp_shadow_free_all(unsigned long id, klp_shadow_dtor_t dtor)
+void klp_shadow_free_all(struct klp_shadow_type *shadow_type)
{
unsigned long flags;
spin_lock_irqsave(&klp_shadow_lock, flags);
- __klp_shadow_free_all(id, dtor);
+ __klp_shadow_free_all(shadow_type);
spin_unlock_irqrestore(&klp_shadow_lock, flags);
}
EXPORT_SYMBOL_GPL(klp_shadow_free_all);
diff --git a/lib/livepatch/test_klp_shadow_vars.c b/lib/livepatch/test_klp_shadow_vars.c
index b99116490858..ee47c1fae8e2 100644
--- a/lib/livepatch/test_klp_shadow_vars.c
+++ b/lib/livepatch/test_klp_shadow_vars.c
@@ -58,58 +58,64 @@ static int ptr_id(void *ptr)
* to the kernel log for testing verification. Don't display raw pointers,
* but use the ptr_id() value instead.
*/
-static void *shadow_get(void *obj, unsigned long id)
+static void *shadow_get(void *obj, struct klp_shadow_type *shadow_type)
{
int **sv;
- sv = klp_shadow_get(obj, id);
+ sv = klp_shadow_get(obj, shadow_type);
pr_info("klp_%s(obj=PTR%d, id=0x%lx) = PTR%d\n",
- __func__, ptr_id(obj), id, ptr_id(sv));
+ __func__, ptr_id(obj), shadow_type->id, ptr_id(sv));
return sv;
}
-static void *shadow_alloc(void *obj, unsigned long id, size_t size,
- gfp_t gfp_flags, klp_shadow_ctor_t ctor,
- void *ctor_data)
+static void *shadow_alloc(void *obj, struct klp_shadow_type *shadow_type,
+ size_t size, gfp_t gfp_flags, void *ctor_data)
{
int **var = ctor_data;
int **sv;
- sv = klp_shadow_alloc(obj, id, size, gfp_flags, ctor, var);
+ sv = klp_shadow_alloc(obj, shadow_type, size, gfp_flags, var);
pr_info("klp_%s(obj=PTR%d, id=0x%lx, size=%zx, gfp_flags=%pGg), ctor=PTR%d, ctor_data=PTR%d = PTR%d\n",
- __func__, ptr_id(obj), id, size, &gfp_flags, ptr_id(ctor),
+ __func__, ptr_id(obj), shadow_type->id, size, &gfp_flags, ptr_id(shadow_type->ctor),
ptr_id(*var), ptr_id(sv));
return sv;
}
-static void *shadow_get_or_alloc(void *obj, unsigned long id, size_t size,
- gfp_t gfp_flags, klp_shadow_ctor_t ctor,
- void *ctor_data)
+static void *shadow_get_or_alloc(void *obj, struct klp_shadow_type *shadow_type,
+ size_t size, gfp_t gfp_flags, void *ctor_data)
{
int **var = ctor_data;
int **sv;
- sv = klp_shadow_get_or_alloc(obj, id, size, gfp_flags, ctor, var);
+ sv = klp_shadow_get_or_alloc(obj, shadow_type, size, gfp_flags, var);
pr_info("klp_%s(obj=PTR%d, id=0x%lx, size=%zx, gfp_flags=%pGg), ctor=PTR%d, ctor_data=PTR%d = PTR%d\n",
- __func__, ptr_id(obj), id, size, &gfp_flags, ptr_id(ctor),
+ __func__, ptr_id(obj), shadow_type->id, size, &gfp_flags, ptr_id(shadow_type->ctor),
ptr_id(*var), ptr_id(sv));
return sv;
}
-static void shadow_free(void *obj, unsigned long id, klp_shadow_dtor_t dtor)
+static void shadow_free(void *obj, struct klp_shadow_type *shadow_type)
{
- klp_shadow_free(obj, id, dtor);
+ klp_shadow_free(obj, shadow_type);
pr_info("klp_%s(obj=PTR%d, id=0x%lx, dtor=PTR%d)\n",
- __func__, ptr_id(obj), id, ptr_id(dtor));
+ __func__, ptr_id(obj), shadow_type->id, ptr_id(shadow_type->dtor));
}
-static void shadow_free_all(unsigned long id, klp_shadow_dtor_t dtor)
+/*
+ * With more than one item to free in the list, order is not determined and
+ * shadow_dtor will not be passed to shadow_free_all() which would make the
+ * test fail. (see pass 6)
+ */
+static bool verbose_dtor = true;
+static void shadow_free_all(struct klp_shadow_type *shadow_type)
{
- klp_shadow_free_all(id, dtor);
- pr_info("klp_%s(id=0x%lx, dtor=PTR%d)\n", __func__, id, ptr_id(dtor));
+ verbose_dtor = false;
+ klp_shadow_free_all(shadow_type);
+ verbose_dtor = true;
+ pr_info("klp_%s(id=0x%lx, dtor=PTR%d)\n", __func__, shadow_type->id, ptr_id(shadow_type->dtor));
}
@@ -128,17 +134,14 @@ static int shadow_ctor(void *obj, void *shadow_data, void *ctor_data)
return 0;
}
-/*
- * With more than one item to free in the list, order is not determined and
- * shadow_dtor will not be passed to shadow_free_all() which would make the
- * test fail. (see pass 6)
- */
static void shadow_dtor(void *obj, void *shadow_data)
{
int **sv = shadow_data;
- pr_info("%s(obj=PTR%d, shadow_data=PTR%d)\n",
- __func__, ptr_id(obj), ptr_id(sv));
+ if (verbose_dtor) {
+ pr_info("%s(obj=PTR%d, shadow_data=PTR%d)\n",
+ __func__, ptr_id(obj), ptr_id(sv));
+ }
}
/* number of objects we simulate that need shadow vars */
@@ -148,6 +151,18 @@ static void shadow_dtor(void *obj, void *shadow_data)
#define SV_ID1 0x1234
#define SV_ID2 0x1235
+struct klp_shadow_type shadow_type_1 = {
+ .id = SV_ID1,
+ .ctor = shadow_ctor,
+ .dtor = shadow_dtor,
+};
+
+struct klp_shadow_type shadow_type_2 = {
+ .id = SV_ID2,
+ .ctor = shadow_ctor,
+ .dtor = shadow_dtor,
+};
+
/*
* The main test case adds/removes new fields (shadow var) to each of these
* test structure instances. The last group of fields in the struct represent
@@ -179,7 +194,7 @@ static int test_klp_shadow_vars_init(void)
* With an empty shadow variable hash table, expect not to find
* any matches.
*/
- sv = shadow_get(&objs[0], SV_ID1);
+ sv = shadow_get(&objs[0], &shadow_type_1);
if (!sv)
pr_info(" got expected NULL result\n");
@@ -189,13 +204,13 @@ static int test_klp_shadow_vars_init(void)
ptr_id(pnfields1[i]);
if (i % 2) {
- sv1[i] = shadow_alloc(&objs[i], SV_ID1,
+ sv1[i] = shadow_alloc(&objs[i], &shadow_type_1,
sizeof(pnfields1[i]), GFP_KERNEL,
- shadow_ctor, &pnfields1[i]);
+ &pnfields1[i]);
} else {
- sv1[i] = shadow_get_or_alloc(&objs[i], SV_ID1,
+ sv1[i] = shadow_get_or_alloc(&objs[i], &shadow_type_1,
sizeof(pnfields1[i]), GFP_KERNEL,
- shadow_ctor, &pnfields1[i]);
+ &pnfields1[i]);
}
if (!sv1[i]) {
ret = -ENOMEM;
@@ -204,8 +219,9 @@ static int test_klp_shadow_vars_init(void)
pnfields2[i] = &nfields2[i];
ptr_id(pnfields2[i]);
- sv2[i] = shadow_alloc(&objs[i], SV_ID2, sizeof(pnfields2[i]),
- GFP_KERNEL, shadow_ctor, &pnfields2[i]);
+ sv2[i] = shadow_alloc(&objs[i], &shadow_type_2,
+ sizeof(pnfields2[i]),
+ GFP_KERNEL, &pnfields2[i]);
if (!sv2[i]) {
ret = -ENOMEM;
goto out;
@@ -215,7 +231,7 @@ static int test_klp_shadow_vars_init(void)
/* pass 2: verify we find allocated svars and where they point to */
for (i = 0; i < NUM_OBJS; i++) {
/* check the "char" svar for all objects */
- sv = shadow_get(&objs[i], SV_ID1);
+ sv = shadow_get(&objs[i], &shadow_type_1);
if (!sv) {
ret = -EINVAL;
goto out;
@@ -225,7 +241,7 @@ static int test_klp_shadow_vars_init(void)
ptr_id(sv1[i]), ptr_id(*sv1[i]));
/* check the "int" svar for all objects */
- sv = shadow_get(&objs[i], SV_ID2);
+ sv = shadow_get(&objs[i], &shadow_type_2);
if (!sv) {
ret = -EINVAL;
goto out;
@@ -240,8 +256,9 @@ static int test_klp_shadow_vars_init(void)
pndup[i] = &nfields1[i];
ptr_id(pndup[i]);
- sv = shadow_get_or_alloc(&objs[i], SV_ID1, sizeof(pndup[i]),
- GFP_KERNEL, shadow_ctor, &pndup[i]);
+ sv = shadow_get_or_alloc(&objs[i], &shadow_type_1,
+ sizeof(pndup[i]),
+ GFP_KERNEL, &pndup[i]);
if (!sv) {
ret = -EINVAL;
goto out;
@@ -253,15 +270,15 @@ static int test_klp_shadow_vars_init(void)
/* pass 4: free <objs[*], SV_ID1> pairs of svars, verify removal */
for (i = 0; i < NUM_OBJS; i++) {
- shadow_free(&objs[i], SV_ID1, shadow_dtor); /* 'char' pairs */
- sv = shadow_get(&objs[i], SV_ID1);
+ shadow_free(&objs[i], &shadow_type_1); /* 'char' pairs */
+ sv = shadow_get(&objs[i], &shadow_type_1);
if (!sv)
pr_info(" got expected NULL result\n");
}
/* pass 5: check we still find <objs[*], SV_ID2> svar pairs */
for (i = 0; i < NUM_OBJS; i++) {
- sv = shadow_get(&objs[i], SV_ID2); /* 'int' pairs */
+ sv = shadow_get(&objs[i], &shadow_type_2); /* 'int' pairs */
if (!sv) {
ret = -EINVAL;
goto out;
@@ -272,9 +289,9 @@ static int test_klp_shadow_vars_init(void)
}
/* pass 6: free all the <objs[*], SV_ID2> svar pairs too. */
- shadow_free_all(SV_ID2, NULL); /* 'int' pairs */
+ shadow_free_all(&shadow_type_2); /* 'int' pairs */
for (i = 0; i < NUM_OBJS; i++) {
- sv = shadow_get(&objs[i], SV_ID2);
+ sv = shadow_get(&objs[i], &shadow_type_2);
if (!sv)
pr_info(" got expected NULL result\n");
}
@@ -283,8 +300,8 @@ static int test_klp_shadow_vars_init(void)
return 0;
out:
- shadow_free_all(SV_ID1, NULL); /* 'char' pairs */
- shadow_free_all(SV_ID2, NULL); /* 'int' pairs */
+ shadow_free_all(&shadow_type_1); /* 'char' pairs */
+ shadow_free_all(&shadow_type_2); /* 'int' pairs */
free_ptr_list();
return ret;
diff --git a/samples/livepatch/livepatch-shadow-fix1.c b/samples/livepatch/livepatch-shadow-fix1.c
index 6701641bf12d..0cc7d1e4b4bc 100644
--- a/samples/livepatch/livepatch-shadow-fix1.c
+++ b/samples/livepatch/livepatch-shadow-fix1.c
@@ -32,6 +32,8 @@
/* Shadow variable enums */
#define SV_LEAK 1
+static struct klp_shadow_type shadow_leak_type;
+
/* Allocate new dummies every second */
#define ALLOC_PERIOD 1
/* Check for expired dummies after a few new ones have been allocated */
@@ -84,8 +86,8 @@ static struct dummy *livepatch_fix1_dummy_alloc(void)
if (!leak)
goto err_leak;
- shadow_leak = klp_shadow_alloc(d, SV_LEAK, sizeof(leak), GFP_KERNEL,
- shadow_leak_ctor, &leak);
+ shadow_leak = klp_shadow_alloc(d, &shadow_leak_type, sizeof(leak),
+ GFP_KERNEL, &leak);
if (!shadow_leak) {
pr_err("%s: failed to allocate shadow variable for the leaking pointer: dummy @ %p, leak @ %p\n",
__func__, d, leak);
@@ -124,15 +126,21 @@ static void livepatch_fix1_dummy_free(struct dummy *d)
* not exist (ie, dummy structures allocated before this livepatch
* was loaded.)
*/
- shadow_leak = klp_shadow_get(d, SV_LEAK);
+ shadow_leak = klp_shadow_get(d, &shadow_leak_type);
if (shadow_leak)
- klp_shadow_free(d, SV_LEAK, livepatch_fix1_dummy_leak_dtor);
+ klp_shadow_free(d, &shadow_leak_type);
else
pr_info("%s: dummy @ %p leaked!\n", __func__, d);
kfree(d);
}
+static struct klp_shadow_type shadow_leak_type = {
+ .id = SV_LEAK,
+ .ctor = shadow_leak_ctor,
+ .dtor = livepatch_fix1_dummy_leak_dtor,
+};
+
static struct klp_func funcs[] = {
{
.old_name = "dummy_alloc",
@@ -164,7 +172,7 @@ static int livepatch_shadow_fix1_init(void)
static void livepatch_shadow_fix1_exit(void)
{
/* Cleanup any existing SV_LEAK shadow variables */
- klp_shadow_free_all(SV_LEAK, livepatch_fix1_dummy_leak_dtor);
+ klp_shadow_free_all(&shadow_leak_type);
}
module_init(livepatch_shadow_fix1_init);
diff --git a/samples/livepatch/livepatch-shadow-fix2.c b/samples/livepatch/livepatch-shadow-fix2.c
index 361046a4f10c..840100555152 100644
--- a/samples/livepatch/livepatch-shadow-fix2.c
+++ b/samples/livepatch/livepatch-shadow-fix2.c
@@ -33,6 +33,9 @@
#define SV_LEAK 1
#define SV_COUNTER 2
+static struct klp_shadow_type shadow_leak_type;
+static struct klp_shadow_type shadow_counter_type;
+
struct dummy {
struct list_head list;
unsigned long jiffies_expire;
@@ -47,9 +50,8 @@ static bool livepatch_fix2_dummy_check(struct dummy *d, unsigned long jiffies)
* already have a SV_COUNTER shadow variable, then attach a
* new one.
*/
- shadow_count = klp_shadow_get_or_alloc(d, SV_COUNTER,
- sizeof(*shadow_count), GFP_NOWAIT,
- NULL, NULL);
+ shadow_count = klp_shadow_get_or_alloc(d, &shadow_counter_type,
+ sizeof(*shadow_count), GFP_NOWAIT, NULL);
if (shadow_count)
*shadow_count += 1;
@@ -72,9 +74,9 @@ static void livepatch_fix2_dummy_free(struct dummy *d)
int *shadow_count;
/* Patch: copy the memory leak patch from the fix1 module. */
- shadow_leak = klp_shadow_get(d, SV_LEAK);
+ shadow_leak = klp_shadow_get(d, &shadow_leak_type);
if (shadow_leak)
- klp_shadow_free(d, SV_LEAK, livepatch_fix2_dummy_leak_dtor);
+ klp_shadow_free(d, &shadow_leak_type);
else
pr_info("%s: dummy @ %p leaked!\n", __func__, d);
@@ -82,16 +84,25 @@ static void livepatch_fix2_dummy_free(struct dummy *d)
* Patch: fetch the SV_COUNTER shadow variable and display
* the final count. Detach the shadow variable.
*/
- shadow_count = klp_shadow_get(d, SV_COUNTER);
+ shadow_count = klp_shadow_get(d, &shadow_counter_type);
if (shadow_count) {
pr_info("%s: dummy @ %p, check counter = %d\n",
__func__, d, *shadow_count);
- klp_shadow_free(d, SV_COUNTER, NULL);
+ klp_shadow_free(d, &shadow_counter_type);
}
kfree(d);
}
+static struct klp_shadow_type shadow_leak_type = {
+ .id = SV_LEAK,
+ .dtor = livepatch_fix2_dummy_leak_dtor,
+};
+
+static struct klp_shadow_type shadow_counter_type = {
+ .id = SV_COUNTER,
+};
+
static struct klp_func funcs[] = {
{
.old_name = "dummy_check",
@@ -123,7 +134,7 @@ static int livepatch_shadow_fix2_init(void)
static void livepatch_shadow_fix2_exit(void)
{
/* Cleanup any existing SV_COUNTER shadow variables */
- klp_shadow_free_all(SV_COUNTER, NULL);
+ klp_shadow_free_all(&shadow_leak_type);
}
module_init(livepatch_shadow_fix2_init);
diff --git a/tools/testing/selftests/livepatch/test-shadow-vars.sh b/tools/testing/selftests/livepatch/test-shadow-vars.sh
index e04cb354f56b..01ef65bc1f0c 100755
--- a/tools/testing/selftests/livepatch/test-shadow-vars.sh
+++ b/tools/testing/selftests/livepatch/test-shadow-vars.sh
@@ -67,7 +67,7 @@ $MOD_TEST: klp_shadow_get(obj=PTR9, id=0x1235) = PTR11
$MOD_TEST: got expected PTR11 -> PTR10 result
$MOD_TEST: klp_shadow_get(obj=PTR14, id=0x1235) = PTR16
$MOD_TEST: got expected PTR16 -> PTR15 result
-$MOD_TEST: klp_shadow_free_all(id=0x1235, dtor=PTR0)
+$MOD_TEST: klp_shadow_free_all(id=0x1235, dtor=PTR17)
$MOD_TEST: klp_shadow_get(obj=PTR1, id=0x1235) = PTR0
$MOD_TEST: got expected NULL result
$MOD_TEST: klp_shadow_get(obj=PTR9, id=0x1235) = PTR0
--
2.37.3
The life of shadow variables is not completely trivial to maintain.
They might be used by more livepatches and more livepatched objects.
They should stay as long as there is any user.
In practice, it requires to implement reference counting in callbacks
of all users. They should register all the user and remove the shadow
variables only when there is no user left.
This patch hides the reference counting into the klp_shadow API.
The counter is connected with the shadow variable @id. It requires
an API to take and release the reference. The release function also
calls the related dtor() when defined.
An easy solution would be to add some get_ref()/put_ref() API.
But it would need to get called from pre()/post_un() callbacks.
It might be easy to forget a callback and make it wrong.
A more safe approach is to associate the klp_shadow_type with
klp_objects that use the shadow variables. The livepatch core
code might then handle the reference counters on background.
The shadow variable type might then be added into a new @shadow_types
member of struct klp_object. They will get then automatically registered
and unregistered when the object is being livepatched. The registration
increments the reference count. Unregistration decreases the reference
count. All shadow variables of the given type are freed when the reference
count reaches zero.
All klp_shadow_alloc/get/free functions also checks whether the requested
type is registered. It will help to catch missing registration and might
also help to catch eventual races.
Signed-off-by: Petr Mladek <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Signed-off-by: Marcos Paulo de Souza <[email protected]>
---
Changes from v1:
* Reordered my SoB (Josh)
* Changed code comments (Petr)
include/linux/livepatch.h | 21 ++++
kernel/livepatch/core.c | 39 +++++++
kernel/livepatch/core.h | 1 +
kernel/livepatch/shadow.c | 124 ++++++++++++++++++++++
kernel/livepatch/transition.c | 4 +-
lib/livepatch/test_klp_shadow_vars.c | 18 +++-
samples/livepatch/livepatch-shadow-fix1.c | 8 +-
samples/livepatch/livepatch-shadow-fix2.c | 9 +-
8 files changed, 214 insertions(+), 10 deletions(-)
diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 79e7bf3b35f6..fdd82fde86e6 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -100,11 +100,14 @@ struct klp_callbacks {
bool post_unpatch_enabled;
};
+struct klp_shadow_type;
+
/**
* struct klp_object - kernel object structure for live patching
* @name: module name (or NULL for vmlinux)
* @funcs: function entries for functions to be patched in the object
* @callbacks: functions to be executed pre/post (un)patching
+ * @shadow_types: shadow variable types used by the livepatch for the klp_object
* @kobj: kobject for sysfs resources
* @func_list: dynamic list of the function entries
* @node: list node for klp_patch obj_list
@@ -118,6 +121,7 @@ struct klp_object {
const char *name;
struct klp_func *funcs;
struct klp_callbacks callbacks;
+ struct klp_shadow_type **shadow_types;
/* internal */
struct kobject kobj;
@@ -222,13 +226,30 @@ typedef void (*klp_shadow_dtor_t)(void *obj, void *shadow_data);
* @ctor: custom constructor to initialize the shadow data (optional)
* @dtor: custom callback that can be used to unregister the variable
* and/or free data that the shadow variable points to (optional)
+ * @registered: flag indicating if the variable was successfully registered
+ *
+ * All shadow variables used by the livepatch for the related klp_object
+ * must be listed here so that they are registered when the livepatch
+ * and the module is loaded. Otherwise, it will not be possible to
+ * allocate them.
*/
struct klp_shadow_type {
unsigned long id;
klp_shadow_ctor_t ctor;
klp_shadow_dtor_t dtor;
+
+ /* internal */
+ bool registered;
};
+#define klp_for_each_shadow_type(obj, shadow_type, i) \
+ for (shadow_type = obj->shadow_types ? obj->shadow_types[0] : NULL, i = 1; \
+ shadow_type; \
+ shadow_type = obj->shadow_types[i++])
+
+int klp_shadow_register(struct klp_shadow_type *shadow_type);
+void klp_shadow_unregister(struct klp_shadow_type *shadow_type);
+
void *klp_shadow_get(void *obj, struct klp_shadow_type *shadow_type);
void *klp_shadow_alloc(void *obj, struct klp_shadow_type *shadow_type,
size_t size, gfp_t gfp_flags, void *ctor_data);
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 9ada0bc5247b..44c9e5ea0d2c 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -928,6 +928,30 @@ static int klp_init_patch(struct klp_patch *patch)
return 0;
}
+void klp_unregister_shadow_types(struct klp_object *obj)
+{
+ struct klp_shadow_type *shadow_type;
+ int i;
+
+ klp_for_each_shadow_type(obj, shadow_type, i) {
+ klp_shadow_unregister(shadow_type);
+ }
+}
+
+static int klp_register_shadow_types(struct klp_object *obj)
+{
+ struct klp_shadow_type *shadow_type;
+ int i, ret;
+
+ klp_for_each_shadow_type(obj, shadow_type, i) {
+ ret = klp_shadow_register(shadow_type);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
static int __klp_disable_patch(struct klp_patch *patch)
{
struct klp_object *obj;
@@ -988,6 +1012,13 @@ static int __klp_enable_patch(struct klp_patch *patch)
if (!klp_is_object_loaded(obj))
continue;
+ ret = klp_register_shadow_types(obj);
+ if (ret) {
+ pr_warn("failed to register shadow types for object '%s'\n",
+ klp_is_module(obj) ? obj->name : "vmlinux");
+ goto err;
+ }
+
ret = klp_pre_patch_callback(obj);
if (ret) {
pr_warn("pre-patch callback failed for object '%s'\n",
@@ -1172,6 +1203,7 @@ static void klp_cleanup_module_patches_limited(struct module *mod,
klp_unpatch_object(obj);
klp_post_unpatch_callback(obj);
+ klp_unregister_shadow_types(obj);
klp_free_object_loaded(obj);
break;
@@ -1218,6 +1250,13 @@ int klp_module_coming(struct module *mod)
pr_notice("applying patch '%s' to loading module '%s'\n",
patch->mod->name, obj->mod->name);
+ ret = klp_register_shadow_types(obj);
+ if (ret) {
+ pr_warn("failed to register shadow types for object '%s'\n",
+ obj->name);
+ goto err;
+ }
+
ret = klp_pre_patch_callback(obj);
if (ret) {
pr_warn("pre-patch callback failed for object '%s'\n",
diff --git a/kernel/livepatch/core.h b/kernel/livepatch/core.h
index 38209c7361b6..0b68f2407a82 100644
--- a/kernel/livepatch/core.h
+++ b/kernel/livepatch/core.h
@@ -13,6 +13,7 @@ extern struct list_head klp_patches;
#define klp_for_each_patch(patch) \
list_for_each_entry(patch, &klp_patches, list)
+void klp_unregister_shadow_types(struct klp_object *obj);
void klp_free_patch_async(struct klp_patch *patch);
void klp_free_replaced_patches_async(struct klp_patch *new_patch);
void klp_unpatch_replaced_patches(struct klp_patch *new_patch);
diff --git a/kernel/livepatch/shadow.c b/kernel/livepatch/shadow.c
index 64e83853891d..9437dc1be7b2 100644
--- a/kernel/livepatch/shadow.c
+++ b/kernel/livepatch/shadow.c
@@ -34,6 +34,7 @@
#include <linux/hashtable.h>
#include <linux/slab.h>
#include <linux/livepatch.h>
+#include "core.h"
static DEFINE_HASHTABLE(klp_shadow_hash, 12);
@@ -59,6 +60,22 @@ struct klp_shadow {
char data[];
};
+/**
+ * struct klp_shadow_type_reg - information about a registered shadow
+ * variable type
+ * @id: shadow variable type indentifier
+ * @count: reference counter
+ * @list: list node for list of registered shadow variable types
+ */
+struct klp_shadow_type_reg {
+ unsigned long id;
+ int ref_cnt;
+ struct list_head list;
+};
+
+/* List of registered shadow variable types */
+static LIST_HEAD(klp_shadow_types);
+
/**
* klp_shadow_match() - verify a shadow variable matches given <obj, id>
* @shadow: shadow variable to match
@@ -84,6 +101,13 @@ void *klp_shadow_get(void *obj, struct klp_shadow_type *shadow_type)
{
struct klp_shadow *shadow;
+ /* Just the best effort. Can't take @klp_shadow_lock here. */
+ if (!shadow_type->registered) {
+ pr_err("Trying to get shadow variable of non-registered type: %lu\n",
+ shadow_type->id);
+ return NULL;
+ }
+
rcu_read_lock();
hash_for_each_possible_rcu(klp_shadow_hash, shadow, node,
@@ -310,3 +334,103 @@ void klp_shadow_free_all(struct klp_shadow_type *shadow_type)
spin_unlock_irqrestore(&klp_shadow_lock, flags);
}
EXPORT_SYMBOL_GPL(klp_shadow_free_all);
+
+static struct klp_shadow_type_reg *
+klp_shadow_type_get_reg(struct klp_shadow_type *shadow_type)
+{
+ struct klp_shadow_type_reg *shadow_type_reg;
+ lockdep_assert_held(&klp_shadow_lock);
+
+ list_for_each_entry(shadow_type_reg, &klp_shadow_types, list) {
+ if (shadow_type_reg->id == shadow_type->id)
+ return shadow_type_reg;
+ }
+
+ return NULL;
+}
+
+/**
+ * klp_shadow_register() - register the given shadow variable type
+ * @shadow_type: shadow type to be registered
+ *
+ * Tell the system that the given shadow type is going to used by the caller
+ * (livepatch module). It allows to check and maintain lifetime of shadow
+ * variables.
+ *
+ * Return: 0 on suceess, -ENOMEM when there is not enough memory.
+ */
+int klp_shadow_register(struct klp_shadow_type *shadow_type)
+{
+ struct klp_shadow_type_reg *shadow_type_reg;
+ struct klp_shadow_type_reg *new_shadow_type_reg;
+
+ new_shadow_type_reg =
+ kzalloc(sizeof(struct klp_shadow_type_reg), GFP_KERNEL);
+ if (!new_shadow_type_reg)
+ return -ENOMEM;
+
+ spin_lock_irq(&klp_shadow_lock);
+
+ if (shadow_type->registered) {
+ pr_err("Trying to register shadow variable type that is already registered: %lu",
+ shadow_type->id);
+ kfree(new_shadow_type_reg);
+ goto out;
+ }
+
+ shadow_type_reg = klp_shadow_type_get_reg(shadow_type);
+ if (!shadow_type_reg) {
+ shadow_type_reg = new_shadow_type_reg;
+ shadow_type_reg->id = shadow_type->id;
+ list_add(&shadow_type_reg->list, &klp_shadow_types);
+ } else {
+ kfree(new_shadow_type_reg);
+ }
+
+ shadow_type_reg->ref_cnt++;
+ shadow_type->registered = true;
+out:
+ spin_unlock_irq(&klp_shadow_lock);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(klp_shadow_register);
+
+/**
+ * klp_shadow_unregister() - unregister the give shadow variable type
+ * @shadow_type: shadow type to be unregistered
+ *
+ * Tell the system that a given shadow variable ID is not longer be used by
+ * the caller (livepatch module). All existing shadow variables are freed
+ * when it was the last registered user.
+ */
+void klp_shadow_unregister(struct klp_shadow_type *shadow_type)
+{
+ struct klp_shadow_type_reg *shadow_type_reg;
+
+ spin_lock_irq(&klp_shadow_lock);
+
+ if (!shadow_type->registered) {
+ pr_err("Trying to unregister shadow variable type that is not registered: %lu",
+ shadow_type->id);
+ goto out;
+ }
+
+ shadow_type_reg = klp_shadow_type_get_reg(shadow_type);
+ if (!shadow_type_reg) {
+ pr_err("Can't find shadow variable type registration: %lu", shadow_type->id);
+ goto out;
+ }
+
+ shadow_type->registered = false;
+ shadow_type_reg->ref_cnt--;
+
+ if (!shadow_type_reg->ref_cnt) {
+ __klp_shadow_free_all(shadow_type);
+ list_del(&shadow_type_reg->list);
+ kfree(shadow_type_reg);
+ }
+out:
+ spin_unlock_irq(&klp_shadow_lock);
+}
+EXPORT_SYMBOL_GPL(klp_shadow_unregister);
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 30187b1d8275..9c57941974a7 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -123,8 +123,10 @@ static void klp_complete_transition(void)
continue;
if (klp_target_state == KLP_PATCHED)
klp_post_patch_callback(obj);
- else if (klp_target_state == KLP_UNPATCHED)
+ else if (klp_target_state == KLP_UNPATCHED) {
klp_post_unpatch_callback(obj);
+ klp_unregister_shadow_types(obj);
+ }
}
pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
diff --git a/lib/livepatch/test_klp_shadow_vars.c b/lib/livepatch/test_klp_shadow_vars.c
index ee47c1fae8e2..6de1f6d11bcf 100644
--- a/lib/livepatch/test_klp_shadow_vars.c
+++ b/lib/livepatch/test_klp_shadow_vars.c
@@ -188,6 +188,17 @@ static int test_klp_shadow_vars_init(void)
int ret;
int i;
+ /* Registered manually since we don't have a klp_object instance. */
+ ret = klp_shadow_register(&shadow_type_1);
+ if (ret)
+ return ret;
+
+ ret = klp_shadow_register(&shadow_type_2);
+ if (ret) {
+ klp_shadow_unregister(&shadow_type_1);
+ return ret;
+ }
+
ptr_id(NULL);
/*
@@ -296,12 +307,9 @@ static int test_klp_shadow_vars_init(void)
pr_info(" got expected NULL result\n");
}
- free_ptr_list();
-
- return 0;
out:
- shadow_free_all(&shadow_type_1); /* 'char' pairs */
- shadow_free_all(&shadow_type_2); /* 'int' pairs */
+ klp_shadow_unregister(&shadow_type_1); /* 'char' pairs */
+ klp_shadow_unregister(&shadow_type_2); /* 'int' pairs */
free_ptr_list();
return ret;
diff --git a/samples/livepatch/livepatch-shadow-fix1.c b/samples/livepatch/livepatch-shadow-fix1.c
index 0cc7d1e4b4bc..6718df9ec14b 100644
--- a/samples/livepatch/livepatch-shadow-fix1.c
+++ b/samples/livepatch/livepatch-shadow-fix1.c
@@ -141,6 +141,11 @@ static struct klp_shadow_type shadow_leak_type = {
.dtor = livepatch_fix1_dummy_leak_dtor,
};
+struct klp_shadow_type *shadow_types[] = {
+ &shadow_leak_type,
+ NULL
+};
+
static struct klp_func funcs[] = {
{
.old_name = "dummy_alloc",
@@ -156,6 +161,7 @@ static struct klp_object objs[] = {
{
.name = "livepatch_shadow_mod",
.funcs = funcs,
+ .shadow_types = shadow_types,
}, { }
};
@@ -171,8 +177,6 @@ static int livepatch_shadow_fix1_init(void)
static void livepatch_shadow_fix1_exit(void)
{
- /* Cleanup any existing SV_LEAK shadow variables */
- klp_shadow_free_all(&shadow_leak_type);
}
module_init(livepatch_shadow_fix1_init);
diff --git a/samples/livepatch/livepatch-shadow-fix2.c b/samples/livepatch/livepatch-shadow-fix2.c
index 840100555152..290c7e3f96b0 100644
--- a/samples/livepatch/livepatch-shadow-fix2.c
+++ b/samples/livepatch/livepatch-shadow-fix2.c
@@ -103,6 +103,12 @@ static struct klp_shadow_type shadow_counter_type = {
.id = SV_COUNTER,
};
+struct klp_shadow_type *shadow_types[] = {
+ &shadow_leak_type,
+ &shadow_counter_type,
+ NULL
+};
+
static struct klp_func funcs[] = {
{
.old_name = "dummy_check",
@@ -118,6 +124,7 @@ static struct klp_object objs[] = {
{
.name = "livepatch_shadow_mod",
.funcs = funcs,
+ .shadow_types = shadow_types,
}, { }
};
@@ -133,8 +140,6 @@ static int livepatch_shadow_fix2_init(void)
static void livepatch_shadow_fix2_exit(void)
{
- /* Cleanup any existing SV_COUNTER shadow variables */
- klp_shadow_free_all(&shadow_leak_type);
}
module_init(livepatch_shadow_fix2_init);
--
2.37.3
From: Petr Mladek <[email protected]>
Separate code that is used in klp_shadow_get_or_alloc() under klp_mutex.
It splits a long spaghetti function into two. Also it unifies the error
handling. The old used a mix of duplicated code, direct returns,
and goto. The new code has only one unlock, free, and return calls.
It is code refactoring without any functional changes.
Signed-off-by: Petr Mladek <[email protected]>
Signed-off-by: Marcos Paulo de Souza <[email protected]>
---
Changes from v1:
* Change __klp_shadow_get_or_use to __klp_shadow_get_or_add_locked and add a comment
about it's meaning (Petr)
* Add lockdep_assert_held on __klp_shadow_get_or_add_locked
* Reworked the commit message (Petr)
* Added my SoB (Josh)
kernel/livepatch/shadow.c | 78 ++++++++++++++++++++++-----------------
1 file changed, 45 insertions(+), 33 deletions(-)
diff --git a/kernel/livepatch/shadow.c b/kernel/livepatch/shadow.c
index c2e724d97ddf..81ad7cbbd124 100644
--- a/kernel/livepatch/shadow.c
+++ b/kernel/livepatch/shadow.c
@@ -101,41 +101,22 @@ void *klp_shadow_get(void *obj, unsigned long id)
}
EXPORT_SYMBOL_GPL(klp_shadow_get);
-static void *__klp_shadow_get_or_alloc(void *obj, unsigned long id,
- size_t size, gfp_t gfp_flags,
- klp_shadow_ctor_t ctor, void *ctor_data,
- bool warn_on_exist)
+/* Check if the variable exists. Otherwise, add the pre-allocated one. */
+static void *__klp_shadow_get_or_add_locked(void *obj, unsigned long id,
+ struct klp_shadow *new_shadow,
+ klp_shadow_ctor_t ctor, void *ctor_data,
+ bool *exist)
{
- struct klp_shadow *new_shadow;
void *shadow_data;
- unsigned long flags;
- /* Check if the shadow variable already exists */
- shadow_data = klp_shadow_get(obj, id);
- if (shadow_data)
- goto exists;
-
- /*
- * Allocate a new shadow variable. Fill it with zeroes by default.
- * More complex setting can be done by @ctor function. But it is
- * called only when the buffer is really used (under klp_shadow_lock).
- */
- new_shadow = kzalloc(size + sizeof(*new_shadow), gfp_flags);
- if (!new_shadow)
- return NULL;
+ lockdep_assert_held(&klp_shadow_lock);
- /* Look for <obj, id> again under the lock */
- spin_lock_irqsave(&klp_shadow_lock, flags);
shadow_data = klp_shadow_get(obj, id);
if (unlikely(shadow_data)) {
- /*
- * Shadow variable was found, throw away speculative
- * allocation.
- */
- spin_unlock_irqrestore(&klp_shadow_lock, flags);
- kfree(new_shadow);
- goto exists;
+ *exist = true;
+ return shadow_data;
}
+ *exist = false;
new_shadow->obj = obj;
new_shadow->id = id;
@@ -145,8 +126,6 @@ static void *__klp_shadow_get_or_alloc(void *obj, unsigned long id,
err = ctor(obj, new_shadow->data, ctor_data);
if (err) {
- spin_unlock_irqrestore(&klp_shadow_lock, flags);
- kfree(new_shadow);
pr_err("Failed to construct shadow variable <%p, %lx> (%d)\n",
obj, id, err);
return NULL;
@@ -156,12 +135,45 @@ static void *__klp_shadow_get_or_alloc(void *obj, unsigned long id,
/* No <obj, id> found, so attach the newly allocated one */
hash_add_rcu(klp_shadow_hash, &new_shadow->node,
(unsigned long)new_shadow->obj);
- spin_unlock_irqrestore(&klp_shadow_lock, flags);
return new_shadow->data;
+}
+
+static void *__klp_shadow_get_or_alloc(void *obj, unsigned long id,
+ size_t size, gfp_t gfp_flags,
+ klp_shadow_ctor_t ctor, void *ctor_data,
+ bool warn_on_exist)
+{
+ struct klp_shadow *new_shadow;
+ void *shadow_data;
+ bool exist;
+ unsigned long flags;
+
+ /* Check if the shadow variable already exists */
+ shadow_data = klp_shadow_get(obj, id);
+ if (shadow_data)
+ return shadow_data;
+
+ /*
+ * Allocate a new shadow variable. Fill it with zeroes by default.
+ * More complex setting can be done by @ctor function. But it is
+ * called only when the buffer is really used (under klp_shadow_lock).
+ */
+ new_shadow = kzalloc(size + sizeof(*new_shadow), gfp_flags);
+ if (!new_shadow)
+ return NULL;
+
+ /* Look for <obj, id> again under the lock */
+ spin_lock_irqsave(&klp_shadow_lock, flags);
+ shadow_data = __klp_shadow_get_or_add_locked(obj, id, new_shadow,
+ ctor, ctor_data, &exist);
+ spin_unlock_irqrestore(&klp_shadow_lock, flags);
+
+ /* Throw away unused speculative allocation. */
+ if (!shadow_data || exist)
+ kfree(new_shadow);
-exists:
- if (warn_on_exist) {
+ if (exist && warn_on_exist) {
WARN(1, "Duplicate shadow variable <%p, %lx>\n", obj, id);
return NULL;
}
--
2.37.3
On Wed 2022-10-26 16:41:19, Marcos Paulo de Souza wrote:
> From: Petr Mladek <[email protected]>
>
> Separate code that is used in klp_shadow_get_or_alloc() under klp_mutex.
> It splits a long spaghetti function into two. Also it unifies the error
> handling. The old used a mix of duplicated code, direct returns,
> and goto. The new code has only one unlock, free, and return calls.
>
> It is code refactoring without any functional changes.
>
> Signed-off-by: Petr Mladek <[email protected]>
> Signed-off-by: Marcos Paulo de Souza <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Best Regards,
Petr
On Wed 2022-10-26 16:41:21, Marcos Paulo de Souza wrote:
> The shadow variable type will be used in klp_shadow_alloc/get/free
> functions instead of id/ctor/dtor parameters. As a result, all callers
> use the same callbacks consistently[*][**].
>
> The structure will be used in the next patch that will manage the
> lifetime of shadow variables and execute garbage collection automatically.
>
> [*] From the user POV, it might have been easier to pass $id instead
> of pointer to struct klp_shadow_type.
>
> It would require registering the klp_shadow_type so that
> the klp_shadow API could find ctor/dtor for the given id.
> It actually will be needed for the garbage collection anyway
> because it will define the lifetime of the variables.
>
> The bigger problem is that the same klp_shadow_type might be
> used by more livepatch modules. Each livepatch module need
> to duplicate the definition of klp_shadow_type and ctor/dtor
> callbacks. The klp_shadow API would need to choose one registered
> copy.
>
> The definitions should be compatible and they should stay as long
> as the type is registered. But it still feels more safe when
> klp_shadow API callers use struct klp_shadow_type and ctor/dtor
> callbacks defined in the same livepatch module.
>
> This problem is gone when each livepatch explicitly uses its
> own struct klp_shadow_type pointing to its own callbacks.
This paragraph seems redundant. It more or less repeats what
is said in the previous one.
> [**] test_klp_shadow_vars.c uses a custom @dtor to show that it was called.
> The message must be disabled when called via klp_shadow_free_all()
> because the ordering of freed variables is not well defined there.
> It has to be done using another hack after switching to
> klp_shadow_types.
>
> Co-developed-by: Petr Mladek <[email protected]>
> Signed-off-by: Petr Mladek <[email protected]>
> Signed-off-by: Marcos Paulo de Souza <[email protected]>
I am never sure how the co-authoring should be done ;-) But I believe
that the author should always be the first one. And the other author
should be listed either by Co-developer-by: or by Signed-off-by: but
not by both tags. So, it should be:
Signed-off-by: Marcos Paulo de Souza <[email protected]>
Co-developed-by: Petr Mladek <[email protected]>
With the removed paragraph and updated tags:
Reviewed-by: Petr Mladek <[email protected]>
Best Regards,
Petr
PS: No need to resend the patch. I could do the two changes
when pushing it.
On Wed 2022-10-26 16:41:18, Marcos Paulo de Souza wrote:
> Hello,
>
> This is the v2 of the livepatch shadow GC patches. The changes are minor since
> nobody asked for for big code changes.
>
> Changes from v1:
> * Reworked commit messages (Petr)
> * Added my SoB which was missing in some patches, or the ordering was wrong. (Josh)
> * Change __klp_shadow_get_or_use to __klp_shadow_get_or_add_locked and add a comment (Petr)
> * Add lockdep_assert_held on __klp_shadow_get_or_add_locked (Petr)
> about it's meaning (Petr)
> * CCing LKML (Josh)
>
> Some observations:
> * Petr has reviewed some of the patches that we created. I kept the Reviewed-by
> tags since he wrote the patches some time ago and now he reviewed them again
> on the ML.
> * There were questions about possible problems about using klp_shadow_types
> instead of using ids, but Petr already explained that internally it still uses
> the id to find the correct livepatch.
> * Regarding the possibility of multiple patches use the same ID, the problem
> already existed before. Petr suggested using a "stringified" version using
> name and id, but nobody has commented yet. I can implement such feature in a
> v3 if necessary.
>
> Marcos Paulo de Souza (2):
> livepatch/shadow: Introduce klp_shadow_type structure
> livepatch/shadow: Add garbage collection of shadow variables
>
> Petr Mladek (2):
> livepatch/shadow: Separate code to get or use pre-allocated shadow
> variable
> livepatch/shadow: Separate code removing all shadow variables for a
> given id
From my POV, the patchset is ready for pushing upstream.
Well, we need to get approval from kpatch-build users. Joe described
possible problems in replay for v3, see
https://lore.kernel.org/r/[email protected]
Best Regards,
Petr
On Wed, Oct 26, 2022 at 04:41:22PM -0300, Marcos Paulo de Souza wrote:
> The life of shadow variables is not completely trivial to maintain.
> They might be used by more livepatches and more livepatched objects.
> They should stay as long as there is any user.
>
> In practice, it requires to implement reference counting in callbacks
> of all users. They should register all the user and remove the shadow
> variables only when there is no user left.
>
> This patch hides the reference counting into the klp_shadow API.
> The counter is connected with the shadow variable @id. It requires
> an API to take and release the reference. The release function also
> calls the related dtor() when defined.
>
> An easy solution would be to add some get_ref()/put_ref() API.
> But it would need to get called from pre()/post_un() callbacks.
> It might be easy to forget a callback and make it wrong.
>
> A more safe approach is to associate the klp_shadow_type with
> klp_objects that use the shadow variables. The livepatch core
> code might then handle the reference counters on background.
>
> The shadow variable type might then be added into a new @shadow_types
> member of struct klp_object. They will get then automatically registered
> and unregistered when the object is being livepatched. The registration
> increments the reference count. Unregistration decreases the reference
> count. All shadow variables of the given type are freed when the reference
> count reaches zero.
>
> All klp_shadow_alloc/get/free functions also checks whether the requested
> type is registered. It will help to catch missing registration and might
> also help to catch eventual races.
Is there a reason the shadow variable lifetime is tied to klp_object
rather than klp_patch?
I get the feeling the latter would be easier to implement (no reference
counting; also maybe can be auto-detected with THIS_MODULE?) and harder
for the patch author to mess up (by accidentally omitting an object
which uses it).
--
Josh
On Thu 2022-11-03 18:03:27, Josh Poimboeuf wrote:
> On Wed, Oct 26, 2022 at 04:41:22PM -0300, Marcos Paulo de Souza wrote:
> > The life of shadow variables is not completely trivial to maintain.
> > They might be used by more livepatches and more livepatched objects.
> > They should stay as long as there is any user.
> >
> > In practice, it requires to implement reference counting in callbacks
> > of all users. They should register all the user and remove the shadow
> > variables only when there is no user left.
> >
> > This patch hides the reference counting into the klp_shadow API.
> > The counter is connected with the shadow variable @id. It requires
> > an API to take and release the reference. The release function also
> > calls the related dtor() when defined.
> >
> > An easy solution would be to add some get_ref()/put_ref() API.
> > But it would need to get called from pre()/post_un() callbacks.
> > It might be easy to forget a callback and make it wrong.
> >
> > A more safe approach is to associate the klp_shadow_type with
> > klp_objects that use the shadow variables. The livepatch core
> > code might then handle the reference counters on background.
> >
> > The shadow variable type might then be added into a new @shadow_types
> > member of struct klp_object. They will get then automatically registered
> > and unregistered when the object is being livepatched. The registration
> > increments the reference count. Unregistration decreases the reference
> > count. All shadow variables of the given type are freed when the reference
> > count reaches zero.
> >
> > All klp_shadow_alloc/get/free functions also checks whether the requested
> > type is registered. It will help to catch missing registration and might
> > also help to catch eventual races.
>
> Is there a reason the shadow variable lifetime is tied to klp_object
> rather than klp_patch?
Good question!
My thinking was that shadow variables might be tight to variables
from a particular livepatched module. It would make sense to free them
when the only user (livepatched module) is removed.
The question is if it is really important. I did re-read the original
issue and the real problem was when the livepatch was reloaded.
The related variables were handled using livepatched code, then
without the livepatched code, and then with the livepatched code
again.
The variable was modified with the original code that was not aware of
the shadow variable. As a result the state of the shadow variable was
not in sync when it was later used again.
Nicolai, your have the practical experience. Should the reference
counting be per-livepatched object or per-livepatch, please?
> I get the feeling the latter would be easier to implement (no reference
> counting; also maybe can be auto-detected with THIS_MODULE?) and harder
> for the patch author to mess up (by accidentally omitting an object
> which uses it).
I am not sure how you mean it. I guess that you suggest to store
the name of the livepatch module into the shadow variable.
And use the variable only when the livepatch module is still loaded.
This would not help.
We use the reference counting because the shadow variables should
survive an atomic replace of the lifepatch. In this case, the related
variables are still handled using a livepatched code that is aware
of the shadow variables.
Also it does not solve the problem that the shadow variable might
get out of sync with the related variable when the same
livepatch gets reloaded.
Best Regards,
Petr
On Fri, Nov 04, 2022 at 11:25:38AM +0100, Petr Mladek wrote:
> > I get the feeling the latter would be easier to implement (no reference
> > counting; also maybe can be auto-detected with THIS_MODULE?) and harder
> > for the patch author to mess up (by accidentally omitting an object
> > which uses it).
>
> I am not sure how you mean it. I guess that you suggest to store
> the name of the livepatch module into the shadow variable.
> And use the variable only when the livepatch module is still loaded.
Actually I was thinking the klp_patch could have references to all the
shadow variables (or shadow variable types?) it owns.
Instead of reference counting, the livepatch atomic replace code could
just migrate any previously owned shadow variables to the new klp_patch.
This of course adds the restriction that such garbage-collected shadow
variables couldn't be shared across non-replace livepatches. But I
wouldn't expect that to be much of a problem.
Additionally, I was wondering if "which klp_patch owns which shadow
variables" could be auto-detected somehow.
For example:
1) add 'struct module *owner' or 'struct klp_patch *owner' to klp_shadow
2) add klp_shadow_alloc_gc() and klp_shadow_get_or_alloc_gc(), which are
similar to their non-gc counterparts, with a few additional
arguments: the klp module owner (THIS_MODULE for the caller); and a
destructor to be used later for the garbage collection
3) When atomic replacing a patch, iterate through the klp_shadow_hash
and, for each klp_shadow which previously had an owner, change it to
be owned by the new patch
4) When unloading/freeing a patch, free all its associated klp_shadows
(also calling destructors where applicable)
I'm thinking this would be easier for the patch author, and also simpler
overall. I could work up a patch.
--
Josh
On Mon 2022-11-07 17:32:09, Josh Poimboeuf wrote:
> On Fri, Nov 04, 2022 at 11:25:38AM +0100, Petr Mladek wrote:
> > > I get the feeling the latter would be easier to implement (no reference
> > > counting; also maybe can be auto-detected with THIS_MODULE?) and harder
> > > for the patch author to mess up (by accidentally omitting an object
> > > which uses it).
> >
> > I am not sure how you mean it. I guess that you suggest to store
> > the name of the livepatch module into the shadow variable.
> > And use the variable only when the livepatch module is still loaded.
>
> Actually I was thinking the klp_patch could have references to all the
> shadow variables (or shadow variable types?) it owns.
In short, you suggest to move the array of used klp_shadow_types from
struct klp_object to struct klp_patch. Do I get it correctly?
> Instead of reference counting, the livepatch atomic replace code could
> just migrate any previously owned shadow variables to the new klp_patch.
I see. We would not need to explicitly register the shadow type using
klp_shadow_register() and allocate struct klp_shadow_type_reg
for the reference counter.
> This of course adds the restriction that such garbage-collected shadow
> variables couldn't be shared across non-replace livepatches. But I
> wouldn't expect that to be much of a problem.
From my POV, this is similar to "func_stack" in struct "klp_ops".
The list was originally designed for non-replace livepatches because
the same function might be livepatched many times.
It might theoretically get "simplified" when only the atomic replace is
supported. Then there would be the maximum of two livepatches
and the (infinite) list would be an overhead.
I say theoretically because the list is actually quite elegant
solution.
> Additionally, I was wondering if "which klp_patch owns which shadow
> variables" could be auto-detected somehow.
>
> For example:
>
> 1) add 'struct module *owner' or 'struct klp_patch *owner' to klp_shadow
>
> 2) add klp_shadow_alloc_gc() and klp_shadow_get_or_alloc_gc(), which are
> similar to their non-gc counterparts, with a few additional
> arguments: the klp module owner (THIS_MODULE for the caller); and a
> destructor to be used later for the garbage collection
>
> 3) When atomic replacing a patch, iterate through the klp_shadow_hash
> and, for each klp_shadow which previously had an owner, change it to
> be owned by the new patch
This is not clear to me. The new livepatch might also use less shadow
variables. It must not blindly take over all shadow variables which
were owned by the previous livepatch.
> 4) When unloading/freeing a patch, free all its associated klp_shadows
> (also calling destructors where applicable)
>
>
> I'm thinking this would be easier for the patch author, and also simpler
> overall. I could work up a patch.
From the patch author POV:
If the autodetection did not work then the patch author would still
need to provide the array of used shadow types. I agree that only
one array in struct klp_patch might be enough.
From the implementation POV:
I agree that the code might be easier if we support only atomic
replace. We would not need the reference counter in this case.
But I am not sure if this is acceptable for users that do not use
the atomic replace. They suffer from the same problem. Do we really
want to make this mode a 2nd citizen? IMHO, all applicable features
have been implemented for both modes so far.
Best Regards,
Petr
On Tue, Nov 08, 2022 at 10:14:18AM +0100, Petr Mladek wrote:
> On Mon 2022-11-07 17:32:09, Josh Poimboeuf wrote:
> > On Fri, Nov 04, 2022 at 11:25:38AM +0100, Petr Mladek wrote:
> > > > I get the feeling the latter would be easier to implement (no reference
> > > > counting; also maybe can be auto-detected with THIS_MODULE?) and harder
> > > > for the patch author to mess up (by accidentally omitting an object
> > > > which uses it).
> > >
> > > I am not sure how you mean it. I guess that you suggest to store
> > > the name of the livepatch module into the shadow variable.
> > > And use the variable only when the livepatch module is still loaded.
> >
> > Actually I was thinking the klp_patch could have references to all the
> > shadow variables (or shadow variable types?) it owns.
>
> In short, you suggest to move the array of used klp_shadow_types from
> struct klp_object to struct klp_patch. Do I get it correctly?
Right. Though, thinking about it more, this isn't even needed. Each
klp_shadow would have a pointer to its owning module. We already have a
global hash of klp_shadows which can be iterated when the module gets
unloaded or replaced.
> > 1) add 'struct module *owner' or 'struct klp_patch *owner' to klp_shadow
> >
> > 2) add klp_shadow_alloc_gc() and klp_shadow_get_or_alloc_gc(), which are
> > similar to their non-gc counterparts, with a few additional
> > arguments: the klp module owner (THIS_MODULE for the caller); and a
> > destructor to be used later for the garbage collection
> >
> > 3) When atomic replacing a patch, iterate through the klp_shadow_hash
> > and, for each klp_shadow which previously had an owner, change it to
> > be owned by the new patch
>
> This is not clear to me. The new livepatch might also use less shadow
> variables. It must not blindly take over all shadow variables which
> were owned by the previous livepatch.
Assuming atomic replace, the new patch is almost always a superset of
the old patch. We can optimize for that case.
If the new patch needs to remove any old shadow variables, it can do so
in its post-patch callback.
> > 4) When unloading/freeing a patch, free all its associated klp_shadows
> > (also calling destructors where applicable)
> >
> >
> > I'm thinking this would be easier for the patch author, and also simpler
> > overall. I could work up a patch.
>
> From the patch author POV:
>
> If the autodetection did not work then the patch author would still
> need to provide the array of used shadow types. I agree that only
> one array in struct klp_patch might be enough.
>
>
> From the implementation POV:
>
> I agree that the code might be easier if we support only atomic
> replace. We would not need the reference counter in this case.
>
> But I am not sure if this is acceptable for users that do not use
> the atomic replace. They suffer from the same problem. Do we really
> want to make this mode a 2nd citizen? IMHO, all applicable features
> have been implemented for both modes so far.
Non-replace patches would still be supported. Just with the restriction
that garbage-collected shadow variables are by definition owned by a
single patch module and thus can't be shared across patch modules.
--
Josh
On Tue 2022-11-08 10:44:10, Josh Poimboeuf wrote:
> On Tue, Nov 08, 2022 at 10:14:18AM +0100, Petr Mladek wrote:
> > On Mon 2022-11-07 17:32:09, Josh Poimboeuf wrote:
> > > On Fri, Nov 04, 2022 at 11:25:38AM +0100, Petr Mladek wrote:
> > > > > I get the feeling the latter would be easier to implement (no reference
> > > > > counting; also maybe can be auto-detected with THIS_MODULE?) and harder
> > > > > for the patch author to mess up (by accidentally omitting an object
> > > > > which uses it).
> > > >
> > > > I am not sure how you mean it. I guess that you suggest to store
> > > > the name of the livepatch module into the shadow variable.
> > > > And use the variable only when the livepatch module is still loaded.
> > >
> > > Actually I was thinking the klp_patch could have references to all the
> > > shadow variables (or shadow variable types?) it owns.
> >
> > In short, you suggest to move the array of used klp_shadow_types from
> > struct klp_object to struct klp_patch. Do I get it correctly?
>
> Right. Though, thinking about it more, this isn't even needed. Each
> klp_shadow would have a pointer to its owning module. We already have a
> global hash of klp_shadows which can be iterated when the module gets
> unloaded or replaced.
>
> Assuming atomic replace, the new patch is almost always a superset of
> the old patch. We can optimize for that case.
I see. But I do not agree with this assumption. The new livepatch
might also remove code that caused problems.
Also we allow downgrades. I mean that there is no versioning
of livepatches. Older livepatches might replace new ones.
We really want to support downgrades or upgrades that remove
problematic code. Or upgrades that start fixing the problem
a better way using another shadow variable.
I personally think that registering the supported klp_shadow_types
is worth the effort. It allows to do various changes a safe way.
Also it should be easy to make sure that all klp_shadow_types
are registered:
1. Grep might be used to find declarations in the source code.
IMHO, it should work even with kPatch.
2. The klp_shadow_*() API will warn when a non-registered shadow
variable is used. IMHO, this might be useful and catch some bugs
on its own.
Best Regards,
Petr
Hi all,
Petr Mladek <[email protected]> writes:
> On Thu 2022-11-03 18:03:27, Josh Poimboeuf wrote:
>> On Wed, Oct 26, 2022 at 04:41:22PM -0300, Marcos Paulo de Souza wrote:
>> > The life of shadow variables is not completely trivial to maintain.
>> > They might be used by more livepatches and more livepatched objects.
>> > They should stay as long as there is any user.
>> >
>> > In practice, it requires to implement reference counting in callbacks
>> > of all users. They should register all the user and remove the shadow
>> > variables only when there is no user left.
>> >
>> > This patch hides the reference counting into the klp_shadow API.
>> > The counter is connected with the shadow variable @id. It requires
>> > an API to take and release the reference. The release function also
>> > calls the related dtor() when defined.
>> >
>> > An easy solution would be to add some get_ref()/put_ref() API.
>> > But it would need to get called from pre()/post_un() callbacks.
>> > It might be easy to forget a callback and make it wrong.
>> >
>> > A more safe approach is to associate the klp_shadow_type with
>> > klp_objects that use the shadow variables. The livepatch core
>> > code might then handle the reference counters on background.
>> >
>> > The shadow variable type might then be added into a new @shadow_types
>> > member of struct klp_object. They will get then automatically registered
>> > and unregistered when the object is being livepatched. The registration
>> > increments the reference count. Unregistration decreases the reference
>> > count. All shadow variables of the given type are freed when the reference
>> > count reaches zero.
>> >
>> > All klp_shadow_alloc/get/free functions also checks whether the requested
>> > type is registered. It will help to catch missing registration and might
>> > also help to catch eventual races.
>>
>> Is there a reason the shadow variable lifetime is tied to klp_object
>> rather than klp_patch?
>
> Good question!
>
> My thinking was that shadow variables might be tight to variables
> from a particular livepatched module. It would make sense to free them
> when the only user (livepatched module) is removed.
>
> The question is if it is really important.
I don't think so. For shadow variables attached to "real" data objects,
I would expect those objects to be gone by the time the patched ko gets
removed. For dummy shadows attached to NULL (see below), used for
handing over shared state between atomic-replace livepatch modules, the
memory footprint is negligible, I'd say.
OTOH, callbacks are per klp_object already, so for API consistency, it
might make sense to stick to this pattern for the shadows as well. But I
don't have a real opinion on this.
> I did re-read the original issue and the real problem was when the
> livepatch was reloaded. The related variables were handled using
> livepatched code, then without the livepatched code, and then with the
> livepatched code again.
>
> The variable was modified with the original code that was not aware of
> the shadow variable. As a result the state of the shadow variable was
> not in sync when it was later used again.
>
Apologies for being late to the discussion. As I've not seen it
mentioned anywhere in this thread here -- I haven't checked v1 though --
let me outline the primary limitations (as perceived by me) of the
current shadow variables API here again, just to have it documented.
In practice, an atomic-replace klp_patch is composed of a number of
"sublivepatches" addressing individual CVEs or other issues. In general,
it is usually true that a more recent version of a livepatch contains a
superset of these sublivepatches and the individual sublivepatches'
implementations never change in practice. However, users can downgrade
an atomic-replace livepatch to an earlier version or disable a livepatch
completely.
From my experience, there are basically two relevant usage patterns of
shadow variables.
1.) To hand over global state from one sublivepatch to its pendant in
the to-be-applied livepatch module. Example: a new global mutex or
alike.
2.) The "regular" intended usage, attaching schadow variables to real
(data) objects.
To manage lifetime for 1.), we usually implement some refcount scheme,
managed from the livepatches' module_init()/_exit(): the next livepatch
would subscribe to the shared state before the previous one got a chance
to release it. This works in practice, but the code related to it is
tedious to write and quite verbose.
The second usage pattern is much more difficult to implement correctly
in light of possible livepatch downgrades to a subset of
sublivepatches. Usually a sublivepatch making use of a shadow variable
attached to real objects would livepatch the associated object's
destruction code to free up the associated shadow, if any. If the next
livepatch to be applied happened to not contain this sublivepatch in
question as well, the destruction code would effectively become
unpatched, and any existing shadows leaked. Depending on the object type
in question, this memory leakage might or might not be an actual
problem, but it isn't nice either way.
Often, there's a more subtle issue with the latter usecase though: the
shadow continues to exist, but becomes unmaintained once the transitions
has started. If said sublivepatch happens to become reactivated later
on, it would potentially find stale shadows, and these could even get
wrongly associated with a completely different object which happened to
get allocated at the same memory address. Depending on the shadow type,
this might or might not be Ok. New per-object locks or a "TLB flush
needed" boolean would probably be Ok, but some kind of refcount would
certainly not. There's not much which could be done from the pre-unpatch
callbacks, because these aren't getting invoked for atomic-replace
downgrades.
We had quite some discussion internally on how to best approach these
limitations, the outcome being (IIRC), that a more versatile callbacks
support would perhaps quickly become too complex or error-prone to use
correctly. So Petr proposed this garbage collection/refcounting
mechanism posted here, which would solve the memory leakage issue as a
first step (and would make shadow variable usage less verbose IMO).
The consistency problem would still not be fully solved: consider a
refcount-like shadow, where during the transition some acquirer had been
unpatched already, while a releaser has not yet. But my hope is that we
can later build on this work here and somehow resolve this as well.
> Nicolai, your have the practical experience. Should the reference
> counting be per-livepatched object or per-livepatch, please?
See above, I think it won't matter much from a functionality POV.
>> I get the feeling the latter would be easier to implement (no reference
>> counting; also maybe can be auto-detected with THIS_MODULE?) and harder
>> for the patch author to mess up (by accidentally omitting an object
>> which uses it).
>
> I am not sure how you mean it. I guess that you suggest to store
> the name of the livepatch module into the shadow variable.
> And use the variable only when the livepatch module is still loaded.
>
> This would not help.
No, if I'm understanding correctly, this would tie the shadow lifetime
to one single livepatch module, which would make handing them over to
the next atomic-replace livepatch impossible?
>
> We use the reference counting because the shadow variables should
> survive an atomic replace of the lifepatch. In this case, the related
> variables are still handled using a livepatched code that is aware
> of the shadow variables.
Yes.
> Also it does not solve the problem that the shadow variable might
> get out of sync with the related variable when the same
> livepatch gets reloaded.
We would still not have a complete guarantee that some shadow is
consistently maintained over its full lifetime with this patchset here,
see the refcount example from above. But the overall situation would be
much better than before, IMO.
Thanks!
Nicolai
--
SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461 Nürnberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
(HRB 36809, AG Nürnberg)
On Fri 2022-11-11 10:20:44, Nicolai Stange wrote:
> Hi all,
>
> Petr Mladek <[email protected]> writes:
>
> > On Thu 2022-11-03 18:03:27, Josh Poimboeuf wrote:
> >> On Wed, Oct 26, 2022 at 04:41:22PM -0300, Marcos Paulo de Souza wrote:
> >> > The life of shadow variables is not completely trivial to maintain.
> >> > They might be used by more livepatches and more livepatched objects.
> >> > They should stay as long as there is any user.
> >> >
> >> > In practice, it requires to implement reference counting in callbacks
> >> > of all users. They should register all the user and remove the shadow
> >> > variables only when there is no user left.
> >> >
> >> > This patch hides the reference counting into the klp_shadow API.
> >> > The counter is connected with the shadow variable @id. It requires
> >> > an API to take and release the reference. The release function also
> >> > calls the related dtor() when defined.
> >> >
> >> > An easy solution would be to add some get_ref()/put_ref() API.
> >> > But it would need to get called from pre()/post_un() callbacks.
> >> > It might be easy to forget a callback and make it wrong.
> >> >
> >> > A more safe approach is to associate the klp_shadow_type with
> >> > klp_objects that use the shadow variables. The livepatch core
> >> > code might then handle the reference counters on background.
> >> >
> >> > The shadow variable type might then be added into a new @shadow_types
> >> > member of struct klp_object. They will get then automatically registered
> >> > and unregistered when the object is being livepatched. The registration
> >> > increments the reference count. Unregistration decreases the reference
> >> > count. All shadow variables of the given type are freed when the reference
> >> > count reaches zero.
> >> >
> >> > All klp_shadow_alloc/get/free functions also checks whether the requested
> >> > type is registered. It will help to catch missing registration and might
> >> > also help to catch eventual races.
> >>
> >> Is there a reason the shadow variable lifetime is tied to klp_object
> >> rather than klp_patch?
> >
> > Good question!
> >
> > My thinking was that shadow variables might be tight to variables
> > from a particular livepatched module. It would make sense to free them
> > when the only user (livepatched module) is removed.
> >
> > The question is if it is really important.
>
> I don't think so. For shadow variables attached to "real" data objects,
> I would expect those objects to be gone by the time the patched ko gets
> removed. For dummy shadows attached to NULL (see below), used for
> handing over shared state between atomic-replace livepatch modules, the
> memory footprint is negligible, I'd say.
>
> OTOH, callbacks are per klp_object already, so for API consistency, it
> might make sense to stick to this pattern for the shadows as well. But I
> don't have a real opinion on this.
Good point!
> >From my experience, there are basically two relevant usage patterns of
> shadow variables.
> 1.) To hand over global state from one sublivepatch to its pendant in
> the to-be-applied livepatch module. Example: a new global mutex or
> alike.
> 2.) The "regular" intended usage, attaching schadow variables to real
> (data) objects.
>
> To manage lifetime for 1.), we usually implement some refcount scheme,
> managed from the livepatches' module_init()/_exit(): the next livepatch
> would subscribe to the shared state before the previous one got a chance
> to release it. This works in practice, but the code related to it is
> tedious to write and quite verbose.
>
> The second usage pattern is much more difficult to implement correctly
> in light of possible livepatch downgrades to a subset of
> sublivepatches. Usually a sublivepatch making use of a shadow variable
> attached to real objects would livepatch the associated object's
> destruction code to free up the associated shadow, if any. If the next
> livepatch to be applied happened to not contain this sublivepatch in
> question as well, the destruction code would effectively become
> unpatched, and any existing shadows leaked. Depending on the object type
> in question, this memory leakage might or might not be an actual
> problem, but it isn't nice either way.
>
> Often, there's a more subtle issue with the latter usecase though: the
> shadow continues to exist, but becomes unmaintained once the transitions
> has started. If said sublivepatch happens to become reactivated later
> on, it would potentially find stale shadows, and these could even get
> wrongly associated with a completely different object which happened to
> get allocated at the same memory address. Depending on the shadow type,
> this might or might not be Ok. New per-object locks or a "TLB flush
> needed" boolean would probably be Ok, but some kind of refcount would
> certainly not. There's not much which could be done from the pre-unpatch
> callbacks, because these aren't getting invoked for atomic-replace
> downgrades.
>
> We had quite some discussion internally on how to best approach these
> limitations, the outcome being (IIRC), that a more versatile callbacks
> support would perhaps quickly become too complex or error-prone to use
> correctly. So Petr proposed this garbage collection/refcounting
> mechanism posted here, which would solve the memory leakage issue as a
> first step (and would make shadow variable usage less verbose IMO).
>
> The consistency problem would still not be fully solved: consider a
> refcount-like shadow, where during the transition some acquirer had been
> unpatched already, while a releaser has not yet. But my hope is that we
> can later build on this work here and somehow resolve this as well.
>
> > Nicolai, your have the practical experience. Should the reference
> > counting be per-livepatched object or per-livepatch, please?
>
> See above, I think it won't matter much from a functionality POV.
I would personally keep it tied together with the livepatched object
just to be on the safe side.
Is this acceptable for kPatch users in the end?
Best Regards,
Petr
On Fri, Nov 11, 2022 at 10:55:38AM +0100, Petr Mladek wrote:
> > >From my experience, there are basically two relevant usage patterns of
> > shadow variables.
> > 1.) To hand over global state from one sublivepatch to its pendant in
> > the to-be-applied livepatch module. Example: a new global mutex or
> > alike.
> > 2.) The "regular" intended usage, attaching schadow variables to real
> > (data) objects.
> >
> > To manage lifetime for 1.), we usually implement some refcount scheme,
> > managed from the livepatches' module_init()/_exit(): the next livepatch
> > would subscribe to the shared state before the previous one got a chance
> > to release it. This works in practice, but the code related to it is
> > tedious to write and quite verbose.
> >
> > The second usage pattern is much more difficult to implement correctly
> > in light of possible livepatch downgrades to a subset of
> > sublivepatches. Usually a sublivepatch making use of a shadow variable
> > attached to real objects would livepatch the associated object's
> > destruction code to free up the associated shadow, if any. If the next
> > livepatch to be applied happened to not contain this sublivepatch in
> > question as well, the destruction code would effectively become
> > unpatched, and any existing shadows leaked. Depending on the object type
> > in question, this memory leakage might or might not be an actual
> > problem, but it isn't nice either way.
> >
> > Often, there's a more subtle issue with the latter usecase though: the
> > shadow continues to exist, but becomes unmaintained once the transitions
> > has started. If said sublivepatch happens to become reactivated later
> > on, it would potentially find stale shadows, and these could even get
> > wrongly associated with a completely different object which happened to
> > get allocated at the same memory address. Depending on the shadow type,
> > this might or might not be Ok. New per-object locks or a "TLB flush
> > needed" boolean would probably be Ok, but some kind of refcount would
> > certainly not. There's not much which could be done from the pre-unpatch
> > callbacks, because these aren't getting invoked for atomic-replace
> > downgrades.
> >
> > We had quite some discussion internally on how to best approach these
> > limitations, the outcome being (IIRC), that a more versatile callbacks
> > support would perhaps quickly become too complex or error-prone to use
> > correctly. So Petr proposed this garbage collection/refcounting
> > mechanism posted here, which would solve the memory leakage issue as a
> > first step (and would make shadow variable usage less verbose IMO).
> >
> > The consistency problem would still not be fully solved: consider a
> > refcount-like shadow, where during the transition some acquirer had been
> > unpatched already, while a releaser has not yet. But my hope is that we
> > can later build on this work here and somehow resolve this as well.
It would be great to have all this motivation for the new feature
documented in shadow-vars.rst.
> > > Nicolai, your have the practical experience. Should the reference
> > > counting be per-livepatched object or per-livepatch, please?
> >
> > See above, I think it won't matter much from a functionality POV.
>
> I would personally keep it tied together with the livepatched object
> just to be on the safe side.
If downgrades are going to be commonplace then I agree my automatic
detection idea wouldn't work so well. And ref counting does make sense.
However I'm still not convinced that per-object is the way to go.
Doesn't that create more room for human error? There's no way to
detect-and-warn if the wrong object is used, or if the variable is used
by multiple objects but only one of them is listed.
Per-patch shadow variable types would be easy to detect and warn on
misuse. And easy to automate in patch author tooling.
Also, I'm not crazy about the new API. It's even more confusing than
before.
--
Josh
Hi,
I am sorry for answering this so late. It somehow fallen under cracks.
On Sun 2022-11-13 10:51:38, Josh Poimboeuf wrote:
> On Fri, Nov 11, 2022 at 10:55:38AM +0100, Petr Mladek wrote:
> > > >From my experience, there are basically two relevant usage patterns of
> > > shadow variables.
> > > 1.) To hand over global state from one sublivepatch to its pendant in
> > > the to-be-applied livepatch module. Example: a new global mutex or
> > > alike.
> > > 2.) The "regular" intended usage, attaching shadow variables to real
> > > (data) objects.
> > >
> > > To manage lifetime for 1.), we usually implement some refcount scheme,
> > > managed from the livepatches' module_init()/_exit(): the next livepatch
> > > would subscribe to the shared state before the previous one got a chance
> > > to release it. This works in practice, but the code related to it is
> > > tedious to write and quite verbose.
> > >
> > > The second usage pattern is much more difficult to implement correctly
> > > in light of possible livepatch downgrades to a subset of
> > > sublivepatches. Usually a sublivepatch making use of a shadow variable
> > > attached to real objects would livepatch the associated object's
> > > destruction code to free up the associated shadow, if any. If the next
> > > livepatch to be applied happened to not contain this sublivepatch in
> > > question as well, the destruction code would effectively become
> > > unpatched, and any existing shadows leaked. Depending on the object type
> > > in question, this memory leakage might or might not be an actual
> > > problem, but it isn't nice either way.
> > >
> > > Often, there's a more subtle issue with the latter usecase though: the
> > > shadow continues to exist, but becomes unmaintained once the transitions
> > > has started. If said sublivepatch happens to become reactivated later
> > > on, it would potentially find stale shadows, and these could even get
> > > wrongly associated with a completely different object which happened to
> > > get allocated at the same memory address. Depending on the shadow type,
> > > this might or might not be Ok. New per-object locks or a "TLB flush
> > > needed" boolean would probably be Ok, but some kind of refcount would
> > > certainly not. There's not much which could be done from the pre-unpatch
> > > callbacks, because these aren't getting invoked for atomic-replace
> > > downgrades.
IMHO, this is the reason why we should make it per-object.
If the shadow variable was used by a livepatched module and we remove
this module then the shadow variables would get unmaintained. It would
results in the problem described in this paragraph.
> > > We had quite some discussion internally on how to best approach these
> > > limitations, the outcome being (IIRC), that a more versatile callbacks
> > > support would perhaps quickly become too complex or error-prone to use
> > > correctly. So Petr proposed this garbage collection/refcounting
> > > mechanism posted here, which would solve the memory leakage issue as a
> > > first step (and would make shadow variable usage less verbose IMO).
> > >
> > > The consistency problem would still not be fully solved: consider a
> > > refcount-like shadow, where during the transition some acquirer had been
> > > unpatched already, while a releaser has not yet. But my hope is that we
> > > can later build on this work here and somehow resolve this as well.
>
> It would be great to have all this motivation for the new feature
> documented in shadow-vars.rst.
>
> > > > Nicolai, your have the practical experience. Should the reference
> > > > counting be per-livepatched object or per-livepatch, please?
> > >
> > > See above, I think it won't matter much from a functionality POV.
> >
> > I would personally keep it tied together with the livepatched object
> > just to be on the safe side.
>
> If downgrades are going to be commonplace then I agree my automatic
> detection idea wouldn't work so well. And ref counting does make sense.
>
> However I'm still not convinced that per-object is the way to go.
> Doesn't that create more room for human error? There's no way to
> detect-and-warn if the wrong object is used, or if the variable is used
> by multiple objects but only one of them is listed.
I agree that these mistakes might happen. And I really do not see
any reasonable way how to auto-detect which livepatched object uses
which shadow variables.
Well, the per-object registration allows to register all shadow
types for the "vmlinux" object that is always loaded. It would
work as you suggested. I mean that it would keep the shadow variables
registered as long as the livepatch is loaded.
But I would still like to keep the registration per-object.
It would allow to handle re-load of livepatched modules
the right way when needed.
And I agree that we should document these pitfalls in the documentation.
> Per-patch shadow variable types would be easy to detect and warn on
> misuse. And easy to automate in patch author tooling.
>
> Also, I'm not crazy about the new API. It's even more confusing than
> before.
I do not think that it made the API much worse. It actually made some
things more safe and easier.
The ctor/dtor callbacks are defined by the shadow type. It removes
the risk to passing a wrong one.
Also there is not need to free the unused shadow variables in
post_un() callbacks. It was actually pretty tricky because
it required the reference counting.
Best Regards,
Petr
On Tue, Nov 01, 2022 at 11:43:50AM +0100, Petr Mladek wrote:
> On Wed 2022-10-26 16:41:18, Marcos Paulo de Souza wrote:
> > Hello,
> >
> > This is the v2 of the livepatch shadow GC patches. The changes are minor since
> > nobody asked for for big code changes.
> >
> > Changes from v1:
> > * Reworked commit messages (Petr)
> > * Added my SoB which was missing in some patches, or the ordering was wrong. (Josh)
> > * Change __klp_shadow_get_or_use to __klp_shadow_get_or_add_locked and add a comment (Petr)
> > * Add lockdep_assert_held on __klp_shadow_get_or_add_locked (Petr)
> > about it's meaning (Petr)
> > * CCing LKML (Josh)
> >
> > Some observations:
> > * Petr has reviewed some of the patches that we created. I kept the Reviewed-by
> > tags since he wrote the patches some time ago and now he reviewed them again
> > on the ML.
> > * There were questions about possible problems about using klp_shadow_types
> > instead of using ids, but Petr already explained that internally it still uses
> > the id to find the correct livepatch.
> > * Regarding the possibility of multiple patches use the same ID, the problem
> > already existed before. Petr suggested using a "stringified" version using
> > name and id, but nobody has commented yet. I can implement such feature in a
> > v3 if necessary.
> >
> > Marcos Paulo de Souza (2):
> > livepatch/shadow: Introduce klp_shadow_type structure
> > livepatch/shadow: Add garbage collection of shadow variables
> >
> > Petr Mladek (2):
> > livepatch/shadow: Separate code to get or use pre-allocated shadow
> > variable
> > livepatch/shadow: Separate code removing all shadow variables for a
> > given id
>
> From my POV, the patchset is ready for pushing upstream.
Petr, what do you think about merging the first two patches, since they just
cleanups and simplifications?
>
> Well, we need to get approval from kpatch-build users. Joe described
> possible problems in replay for v3, see
> https://lore.kernel.org/r/[email protected]
>
> Best Regards,
> Petr
On Mon 2023-01-23 14:33:31, Marcos Paulo de Souza wrote:
> On Tue, Nov 01, 2022 at 11:43:50AM +0100, Petr Mladek wrote:
> > On Wed 2022-10-26 16:41:18, Marcos Paulo de Souza wrote:
> > > Hello,
> > >
> > > This is the v2 of the livepatch shadow GC patches. The changes are minor since
> > > nobody asked for for big code changes.
> > >
> > > Changes from v1:
> > > * Reworked commit messages (Petr)
> > > * Added my SoB which was missing in some patches, or the ordering was wrong. (Josh)
> > > * Change __klp_shadow_get_or_use to __klp_shadow_get_or_add_locked and add a comment (Petr)
> > > * Add lockdep_assert_held on __klp_shadow_get_or_add_locked (Petr)
> > > about it's meaning (Petr)
> > > * CCing LKML (Josh)
> > >
> > > Some observations:
> > > * Petr has reviewed some of the patches that we created. I kept the Reviewed-by
> > > tags since he wrote the patches some time ago and now he reviewed them again
> > > on the ML.
> > > * There were questions about possible problems about using klp_shadow_types
> > > instead of using ids, but Petr already explained that internally it still uses
> > > the id to find the correct livepatch.
> > > * Regarding the possibility of multiple patches use the same ID, the problem
> > > already existed before. Petr suggested using a "stringified" version using
> > > name and id, but nobody has commented yet. I can implement such feature in a
> > > v3 if necessary.
> > >
> > > Marcos Paulo de Souza (2):
> > > livepatch/shadow: Introduce klp_shadow_type structure
> > > livepatch/shadow: Add garbage collection of shadow variables
> > >
> > > Petr Mladek (2):
> > > livepatch/shadow: Separate code to get or use pre-allocated shadow
> > > variable
> > > livepatch/shadow: Separate code removing all shadow variables for a
> > > given id
> >
> > From my POV, the patchset is ready for pushing upstream.
>
> Petr, what do you think about merging the first two patches, since they just
> cleanups and simplifications?
Sounds reasonable to me. I am going to push them by the end of the
week if nobody complained in the meantime.
Best Regards,
Petr
On Tue, Jan 17, 2023 at 04:01:57PM +0100, Petr Mladek wrote:
> IMHO, this is the reason why we should make it per-object.
>
> If the shadow variable was used by a livepatched module and we remove
> this module then the shadow variables would get unmaintained. It would
> results in the problem described in this paragraph.
Yes, that makes sense. Ok, I'm convinced.
BTW, this is yet another unfortunate consequence of our decision many
years ago to break the module dependency between a livepatch module and
the modules it patches. We already have a lot of technical debt as a
result of that decision and it continues to pile up.
In that vein see also Song's and my recent patches to fix module
re-patching.
--
Josh
On Wed 2023-01-25 15:22:48, Josh Poimboeuf wrote:
> On Tue, Jan 17, 2023 at 04:01:57PM +0100, Petr Mladek wrote:
> > IMHO, this is the reason why we should make it per-object.
> >
> > If the shadow variable was used by a livepatched module and we remove
> > this module then the shadow variables would get unmaintained. It would
> > results in the problem described in this paragraph.
>
> Yes, that makes sense. Ok, I'm convinced.
Thanks!
> BTW, this is yet another unfortunate consequence of our decision many
> years ago to break the module dependency between a livepatch module and
> the modules it patches. We already have a lot of technical debt as a
> result of that decision and it continues to pile up.
>
> In that vein see also Song's and my recent patches to fix module
> re-patching.
Yeah. Just for record, I have played with splitting the livepatch module
some years ago. It was quite tricky. The main problem was loading all
the needed livepatch modules and synchronizing their load with the
livepatched modules.
Few more details were mentioned in
https://lore.kernel.org/r/Ytp+u2mGPk5+7Tvf@alley
Best Regards,
Petr
On Tue 2023-01-24 16:51:08, Petr Mladek wrote:
> On Mon 2023-01-23 14:33:31, Marcos Paulo de Souza wrote:
> > On Tue, Nov 01, 2022 at 11:43:50AM +0100, Petr Mladek wrote:
> > > On Wed 2022-10-26 16:41:18, Marcos Paulo de Souza wrote:
> > > > Hello,
> > > >
> > > > This is the v2 of the livepatch shadow GC patches. The changes are minor since
> > > > nobody asked for for big code changes.
> > > >
> > > > Changes from v1:
> > > > * Reworked commit messages (Petr)
> > > > * Added my SoB which was missing in some patches, or the ordering was wrong. (Josh)
> > > > * Change __klp_shadow_get_or_use to __klp_shadow_get_or_add_locked and add a comment (Petr)
> > > > * Add lockdep_assert_held on __klp_shadow_get_or_add_locked (Petr)
> > > > about it's meaning (Petr)
> > > > * CCing LKML (Josh)
> > > >
> > > > Some observations:
> > > > * Petr has reviewed some of the patches that we created. I kept the Reviewed-by
> > > > tags since he wrote the patches some time ago and now he reviewed them again
> > > > on the ML.
> > > > * There were questions about possible problems about using klp_shadow_types
> > > > instead of using ids, but Petr already explained that internally it still uses
> > > > the id to find the correct livepatch.
> > > > * Regarding the possibility of multiple patches use the same ID, the problem
> > > > already existed before. Petr suggested using a "stringified" version using
> > > > name and id, but nobody has commented yet. I can implement such feature in a
> > > > v3 if necessary.
> > > >
> > > > Marcos Paulo de Souza (2):
> > > > livepatch/shadow: Introduce klp_shadow_type structure
> > > > livepatch/shadow: Add garbage collection of shadow variables
> > > >
> > > > Petr Mladek (2):
> > > > livepatch/shadow: Separate code to get or use pre-allocated shadow
> > > > variable
> > > > livepatch/shadow: Separate code removing all shadow variables for a
> > > > given id
> > >
> > > From my POV, the patchset is ready for pushing upstream.
> >
> > Petr, what do you think about merging the first two patches, since they just
> > cleanups and simplifications?
>
> Sounds reasonable to me. I am going to push them by the end of the
> week if nobody complained in the meantime.
Josh accepted the idea in the end so we could actually push the entire
patchset. I am not sure if anyone else would like to review it
so I going to wait a bit longer.
Resume:
I am going to push the entire patchset the following week (Wednesday?)
unless anyone asks for more time or finds a problem.
Best Regards,
Petr
On 1/26/23 11:35, Petr Mladek wrote:
>
> Josh accepted the idea in the end so we could actually push the entire
> patchset. I am not sure if anyone else would like to review it
> so I going to wait a bit longer.
>
> Resume:
>
> I am going to push the entire patchset the following week (Wednesday?)
> unless anyone asks for more time or finds a problem.
>
Hi Petr,
Re docs: patches (3) and (4) change the klp_shadow_* API. There should
be updates (and possibly examples) to
Documentation/livepatch/shadow-vars.rst.
Having this for v1/v2 would have made review a lot easier, though I
understand not wanting to waste cycles on documenting dead ends.
--
Joe
On Thu, Jan 26, 2023 at 12:05:02PM -0500, Joe Lawrence wrote:
> On 1/26/23 11:35, Petr Mladek wrote:
> >
> > Josh accepted the idea in the end so we could actually push the entire
> > patchset. I am not sure if anyone else would like to review it
> > so I going to wait a bit longer.
> >
> > Resume:
> >
> > I am going to push the entire patchset the following week (Wednesday?)
> > unless anyone asks for more time or finds a problem.
> >
>
> Hi Petr,
>
> Re docs: patches (3) and (4) change the klp_shadow_* API. There should
> be updates (and possibly examples) to
> Documentation/livepatch/shadow-vars.rst.
Agreed, a documentation update is definitely needed.
Also, as I mentioned, we need to document the motivation behind the
change and the expected use cases. Either in the commit log or the
documentation, or even better, some combination. There was some very
useful background from Nicolai and some very helpful clarifications from
Petr. We should make sure all that doesn't get lost in depths of the
mailing list.
And, while I did finally accept the idea, I still need to do a more
in-depth review of the patches. Patches 1-2 look fine, but please give
me some time to properly review patches 3 & 4.
--
Josh
On Thu 2023-01-26 10:30:05, Josh Poimboeuf wrote:
> On Thu, Jan 26, 2023 at 12:05:02PM -0500, Joe Lawrence wrote:
> > On 1/26/23 11:35, Petr Mladek wrote:
> > >
> > > Josh accepted the idea in the end so we could actually push the entire
> > > patchset. I am not sure if anyone else would like to review it
> > > so I going to wait a bit longer.
> > >
> > > Resume:
> > >
> > > I am going to push the entire patchset the following week (Wednesday?)
> > > unless anyone asks for more time or finds a problem.
> > >
> >
> > Hi Petr,
> >
> > Re docs: patches (3) and (4) change the klp_shadow_* API. There should
> > be updates (and possibly examples) to
> > Documentation/livepatch/shadow-vars.rst.
>
> Agreed, a documentation update is definitely needed.
>
> Also, as I mentioned, we need to document the motivation behind the
> change and the expected use cases. Either in the commit log or the
> documentation, or even better, some combination. There was some very
> useful background from Nicolai and some very helpful clarifications from
> Petr. We should make sure all that doesn't get lost in depths of the
> mailing list.
Great point. I have completely forgot about it. And did not catch the
request when quickly scaning the older replies yesterday.
> And, while I did finally accept the idea, I still need to do a more
> in-depth review of the patches. Patches 1-2 look fine, but please give
> me some time to properly review patches 3 & 4.
Sure. It is great that the code will get another review!
I am going to wait for v3 with updated documentation and review.
Best Regards,
Petr
On Thu, Jan 26, 2023 at 12:05:02PM -0500, Joe Lawrence wrote:
> On 1/26/23 11:35, Petr Mladek wrote:
> >
> > Josh accepted the idea in the end so we could actually push the entire
> > patchset. I am not sure if anyone else would like to review it
> > so I going to wait a bit longer.
> >
> > Resume:
> >
> > I am going to push the entire patchset the following week (Wednesday?)
> > unless anyone asks for more time or finds a problem.
> >
>
> Hi Petr,
>
> Re docs: patches (3) and (4) change the klp_shadow_* API. There should
> be updates (and possibly examples) to
> Documentation/livepatch/shadow-vars.rst.
I forgot about shadow-vars.rst! This will be added on v3.
>
> Having this for v1/v2 would have made review a lot easier, though I
> understand not wanting to waste cycles on documenting dead ends.
That's true. Next time I'll take care of the docs when the API changes. Thanks
for the reviews so far!
>
> --
> Joe
>
On Wed, Oct 26, 2022 at 04:41:21PM -0300, Marcos Paulo de Souza wrote:
> +++ b/include/linux/livepatch.h
> @@ -216,15 +216,26 @@ typedef int (*klp_shadow_ctor_t)(void *obj,
> void *ctor_data);
> typedef void (*klp_shadow_dtor_t)(void *obj, void *shadow_data);
>
> -void *klp_shadow_get(void *obj, unsigned long id);
> -void *klp_shadow_alloc(void *obj, unsigned long id,
> - size_t size, gfp_t gfp_flags,
> - klp_shadow_ctor_t ctor, void *ctor_data);
> -void *klp_shadow_get_or_alloc(void *obj, unsigned long id,
> - size_t size, gfp_t gfp_flags,
> - klp_shadow_ctor_t ctor, void *ctor_data);
> -void klp_shadow_free(void *obj, unsigned long id, klp_shadow_dtor_t dtor);
> -void klp_shadow_free_all(unsigned long id, klp_shadow_dtor_t dtor);
> +/**
> + * struct klp_shadow_type - shadow variable type used by the klp_object
> + * @id: shadow variable type indentifier
"identifier"
> diff --git a/kernel/livepatch/shadow.c b/kernel/livepatch/shadow.c
> index aba44dcc0a88..64e83853891d 100644
> --- a/kernel/livepatch/shadow.c
> +++ b/kernel/livepatch/shadow.c
> @@ -63,24 +63,24 @@ struct klp_shadow {
> * klp_shadow_match() - verify a shadow variable matches given <obj, id>
"matches given <obj, type>" ?
> * @shadow: shadow variable to match
> * @obj: pointer to parent object
> - * @id: data identifier
> + * @shadow_type: type of the wanted shadow variable
> *
> * Return: true if the shadow variable matches.
> */
> static inline bool klp_shadow_match(struct klp_shadow *shadow, void *obj,
> - unsigned long id)
> + struct klp_shadow_type *shadow_type)
> {
> - return shadow->obj == obj && shadow->id == id;
> + return shadow->obj == obj && shadow->id == shadow_type->id;
"shadow_type" is redundant, can we just call it "type"?
Same comment for all other instances of 'shadow_type' throughout the
patch.
> @@ -159,22 +157,25 @@ static void *__klp_shadow_get_or_alloc(void *obj, unsigned long id,
> * More complex setting can be done by @ctor function. But it is
> * called only when the buffer is really used (under klp_shadow_lock).
> */
> - new_shadow = kzalloc(size + sizeof(*new_shadow), gfp_flags);
> + new_shadow = kzalloc(size + sizeof(struct klp_shadow), gfp_flags);
Unnecessary change?
--
Josh
On Wed, Oct 26, 2022 at 04:41:22PM -0300, Marcos Paulo de Souza wrote:
> /**
> * struct klp_object - kernel object structure for live patching
> * @name: module name (or NULL for vmlinux)
> * @funcs: function entries for functions to be patched in the object
> * @callbacks: functions to be executed pre/post (un)patching
> + * @shadow_types: shadow variable types used by the livepatch for the klp_object
Please change the indentation of the other field descriptions so they
match (here and elsewhere in the patch).
> @@ -222,13 +226,30 @@ typedef void (*klp_shadow_dtor_t)(void *obj, void *shadow_data);
> * @ctor: custom constructor to initialize the shadow data (optional)
> * @dtor: custom callback that can be used to unregister the variable
> * and/or free data that the shadow variable points to (optional)
> + * @registered: flag indicating if the variable was successfully registered
"variable" -> "type"
> + *
> + * All shadow variables used by the livepatch for the related klp_object
"variables" -> "variable types" ?
> + * must be listed here so that they are registered when the livepatch
> + * and the module is loaded. Otherwise, it will not be possible to
Where is "here"? Does it mean to say "must be listed in the
klp_object"?
Though, I don't think even that is true, since the user can manually call
klp_shadow_register().
"module" -> "object" ?
> + * allocate them.
> */
> struct klp_shadow_type {
> unsigned long id;
> klp_shadow_ctor_t ctor;
> klp_shadow_dtor_t dtor;
> +
> + /* internal */
> + bool registered;
> };
>
> +#define klp_for_each_shadow_type(obj, shadow_type, i) \
> + for (shadow_type = obj->shadow_types ? obj->shadow_types[0] : NULL, i = 1; \
> + shadow_type; \
> + shadow_type = obj->shadow_types[i++])
> +
> +int klp_shadow_register(struct klp_shadow_type *shadow_type);
> +void klp_shadow_unregister(struct klp_shadow_type *shadow_type);
More cases where 'shadow_type' can be shortened to 'type'.
(And the same comment elsewhere)
> +
> void *klp_shadow_get(void *obj, struct klp_shadow_type *shadow_type);
> void *klp_shadow_alloc(void *obj, struct klp_shadow_type *shadow_type,
> size_t size, gfp_t gfp_flags, void *ctor_data);
I find the type-based interface to be unnecessarily clunky and
confusing:
- It obscures the fact that uniqueness is determined by ID, not by type.
- It's weird to be passing around the type even after it has been
registered: e.g., why doesn't klp remember the ctor/dtor?
- It complicates the registration interface for the normal case where
we normally don't have constructors/destructors.
- I don't understand the exposed klp_shadow_{un}register() interfaces.
What's the point of forcing their use if there's no garbage
collection?
- It's internally confusing in klp to have both 'type' and 'type_reg'.
I don't have any real suggestions yet, I'll need to think a little more
about it.
One question: has anybody actually used destructors in the real world?
> @@ -988,6 +1012,13 @@ static int __klp_enable_patch(struct klp_patch *patch)
> if (!klp_is_object_loaded(obj))
> continue;
>
> + ret = klp_register_shadow_types(obj);
> + if (ret) {
> + pr_warn("failed to register shadow types for object '%s'\n",
> + klp_is_module(obj) ? obj->name : "vmlinux");
> + goto err;
> + }
> +
If this fails for some reason then the error path doesn't unregister.
Presumably klp_register_shadow_types() needs to be self-cleaning on
error like klp_patch_object() is.
And BTW, it might be easier to do so if klp_shadow_type has a 'reg'
pointer to its corresponding reg struct. Then the 'registered' boolean
is no longer needed and klp_shadow_unregister() doesn't need to call
klp_shadow_type_get_reg().
> +++ b/kernel/livepatch/shadow.c
> @@ -34,6 +34,7 @@
> #include <linux/hashtable.h>
> #include <linux/slab.h>
> #include <linux/livepatch.h>
> +#include "core.h"
>
> static DEFINE_HASHTABLE(klp_shadow_hash, 12);
>
> @@ -59,6 +60,22 @@ struct klp_shadow {
> char data[];
> };
>
> +/**
> + * struct klp_shadow_type_reg - information about a registered shadow
> + * variable type
> + * @id: shadow variable type indentifier
"identifier"
> +static struct klp_shadow_type_reg *
> +klp_shadow_type_get_reg(struct klp_shadow_type *shadow_type)
> +{
> + struct klp_shadow_type_reg *shadow_type_reg;
> + lockdep_assert_held(&klp_shadow_lock);
> +
> + list_for_each_entry(shadow_type_reg, &klp_shadow_types, list) {
> + if (shadow_type_reg->id == shadow_type->id)
> + return shadow_type_reg;
> + }
> +
> + return NULL;
> +}
It seems like overkill to have both klp_shadow_hash and
klp_shadow_types. For example 'struct klp_shadow' could have a link to
its type and then klp_shadow_type_get_reg could iterate through the
klp_shadow_hash.
> +
> +/**
> + * klp_shadow_register() - register the given shadow variable type
> + * @shadow_type: shadow type to be registered
> + *
> + * Tell the system that the given shadow type is going to used by the caller
> + * (livepatch module). It allows to check and maintain lifetime of shadow
> + * variables.
It's probably worth mentioning here that this function typically isn't
called directly by the livepatch, and is only needed if the klp_object
doesn't have the type in its 'shadow_types' array.
> + *
> + * Return: 0 on suceess, -ENOMEM when there is not enough memory.
"success"
> + */
> +int klp_shadow_register(struct klp_shadow_type *shadow_type)
> +{
> + struct klp_shadow_type_reg *shadow_type_reg;
> + struct klp_shadow_type_reg *new_shadow_type_reg;
> +
> + new_shadow_type_reg =
> + kzalloc(sizeof(struct klp_shadow_type_reg), GFP_KERNEL);
> + if (!new_shadow_type_reg)
> + return -ENOMEM;
> +
> + spin_lock_irq(&klp_shadow_lock);
> +
> + if (shadow_type->registered) {
> + pr_err("Trying to register shadow variable type that is already registered: %lu",
> + shadow_type->id);
> + kfree(new_shadow_type_reg);
> + goto out;
Shouldn't this return -EINVAL or so?
> + }
> +
> + shadow_type_reg = klp_shadow_type_get_reg(shadow_type);
> + if (!shadow_type_reg) {
> + shadow_type_reg = new_shadow_type_reg;
> + shadow_type_reg->id = shadow_type->id;
> + list_add(&shadow_type_reg->list, &klp_shadow_types);
> + } else {
> + kfree(new_shadow_type_reg);
> + }
> +
> + shadow_type_reg->ref_cnt++;
> + shadow_type->registered = true;
> +out:
> + spin_unlock_irq(&klp_shadow_lock);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(klp_shadow_register);
> +
> +/**
> + * klp_shadow_unregister() - unregister the give shadow variable type
"given"
> + * @shadow_type: shadow type to be unregistered
> + *
> + * Tell the system that a given shadow variable ID is not longer be used by
"is no longer used"
> + * the caller (livepatch module). All existing shadow variables are freed
> + * when it was the last registered user.
> + */
> +void klp_shadow_unregister(struct klp_shadow_type *shadow_type)
> +{
> + struct klp_shadow_type_reg *shadow_type_reg;
> +
> + spin_lock_irq(&klp_shadow_lock);
> +
> + if (!shadow_type->registered) {
> + pr_err("Trying to unregister shadow variable type that is not registered: %lu",
> + shadow_type->id);
> + goto out;
> + }
> +
> + shadow_type_reg = klp_shadow_type_get_reg(shadow_type);
> + if (!shadow_type_reg) {
> + pr_err("Can't find shadow variable type registration: %lu", shadow_type->id);
> + goto out;
> + }
Since it's too late to report these errors and they might indicate a
major bug, maybe they should WARN().
--
Josh
On Mon 2023-01-30 20:40:08, Josh Poimboeuf wrote:
> On Wed, Oct 26, 2022 at 04:41:22PM -0300, Marcos Paulo de Souza wrote:
> > +#define klp_for_each_shadow_type(obj, shadow_type, i) \
> > + for (shadow_type = obj->shadow_types ? obj->shadow_types[0] : NULL, i = 1; \
> > + shadow_type; \
> > + shadow_type = obj->shadow_types[i++])
> > +
> > +int klp_shadow_register(struct klp_shadow_type *shadow_type);
> > +void klp_shadow_unregister(struct klp_shadow_type *shadow_type);
>
> More cases where 'shadow_type' can be shortened to 'type'.
> (And the same comment elsewhere)
Works for me.
> > +
> > void *klp_shadow_get(void *obj, struct klp_shadow_type *shadow_type);
> > void *klp_shadow_alloc(void *obj, struct klp_shadow_type *shadow_type,
> > size_t size, gfp_t gfp_flags, void *ctor_data);
>
> I find the type-based interface to be unnecessarily clunky and
> confusing:
>
> - It obscures the fact that uniqueness is determined by ID, not by type.
>
> - It's weird to be passing around the type even after it has been
> registered: e.g., why doesn't klp remember the ctor/dtor?
ctor/dtor must be implemented in each livepatch. klp_shadow_register()
would need to remember all registered implementations.
klp_shadow_alloc/get/free would need to find and choose one.
It would add an extra overhead and non-determines.
The overhead might be negligible. The callbacks should be compatible.
But still, is it worth the potential troubles?
IMHO, it is just about POV. Does it really matter if we pass id or type?
Switching to type might be a slight complication for existing API users
that are used to ID. But we are changing the API anyway.
> - It complicates the registration interface for the normal case where
> we normally don't have constructors/destructors.
I am sorry but I do not understand this concern. The new API hides
ctor/dtor into struct klp_shadow_type. It removes one NULL parameter
from klp_alloc/get/free API. It looks like a simplification to me.
> - I don't understand the exposed klp_shadow_{un}register() interfaces.
> What's the point of forcing their use if there's no garbage
> collection?
I probably do not understand it correctly.
Normal livepatch developers do not need to use klp_shadow_{un}register().
They are called transparently in __klp_enable_patch()/klp_complete_transition()
and klp_module_coming()/klp_cleanup_module_patches_limited().
The main reason why they are exposed via include/linux/livepatch.h
and EXPORT_SYMBOL_GPL() is the selftest lib/livepatch/test_klp_shadow_vars.c.
Well, the selftest probably could be bundled into a livepatch to go
around this.
> - It's internally confusing in klp to have both 'type' and 'type_reg'.
I do not like it much either.
An idea. We could replace "bool registered" with "struct list_head
register_node" in struct klp_shadow_type. Then we could use it
for registration.
All the registered types will be in a global list (klp_type_register).
klp_shadow_unregister() would do the garbage collection when it
removed the last entry with the given id from the global list.
> One question: has anybody actually used destructors in the real world?
>
> > @@ -988,6 +1012,13 @@ static int __klp_enable_patch(struct klp_patch *patch)
> > if (!klp_is_object_loaded(obj))
> > continue;
> >
> > + ret = klp_register_shadow_types(obj);
> > + if (ret) {
> > + pr_warn("failed to register shadow types for object '%s'\n",
> > + klp_is_module(obj) ? obj->name : "vmlinux");
> > + goto err;
> > + }
> > +
>
> If this fails for some reason then the error path doesn't unregister.
> Presumably klp_register_shadow_types() needs to be self-cleaning on
> error like klp_patch_object() is.
Good point!
They actually are unregisterd via:
+ __klp_enable_patch()
+ klp_cancel_transition() # err: in __klp_enabled_patch()
+ klp_complete_transition()
+ klp_unregister_shadow_types()
But there is a problem. It tries to unregister all types.
We have to make sure that it does not complain when a type has
not been registered yet.
> And BTW, it might be easier to do so if klp_shadow_type has a 'reg'
> pointer to its corresponding reg struct. Then the 'registered' boolean
> is no longer needed and klp_shadow_unregister() doesn't need to call
> klp_shadow_type_get_reg().
Yup. It would be easier also if the use list_head pointing to the
global register.
> > +++ b/kernel/livepatch/shadow.c
>
> > +static struct klp_shadow_type_reg *
> > +klp_shadow_type_get_reg(struct klp_shadow_type *shadow_type)
> > +{
> > + struct klp_shadow_type_reg *shadow_type_reg;
> > + lockdep_assert_held(&klp_shadow_lock);
> > +
> > + list_for_each_entry(shadow_type_reg, &klp_shadow_types, list) {
> > + if (shadow_type_reg->id == shadow_type->id)
> > + return shadow_type_reg;
> > + }
> > +
> > + return NULL;
> > +}
>
> It seems like overkill to have both klp_shadow_hash and
> klp_shadow_types. For example 'struct klp_shadow' could have a link to
> its type and then klp_shadow_type_get_reg could iterate through the
> klp_shadow_hash.
I guess that there is a misunderstanding. We need two databases:
+ One for the registered shadow_types. I does the reference
counting. It counts the number of livepatched objects
(livepatches) that might the shadow type. It says _when_ the garbage
collection must be done.
+ Second for existing shadow variables. It is needed for finding
the shadow data. It says _what_ variables have to freed when
the garbage collection is being proceed.
> > +
> > +/**
> > + * klp_shadow_register() - register the given shadow variable type
> > + * @shadow_type: shadow type to be registered
> > + *
> > + * Tell the system that the given shadow type is going to used by the caller
> > + * (livepatch module). It allows to check and maintain lifetime of shadow
> > + * variables.
>
> It's probably worth mentioning here that this function typically isn't
> called directly by the livepatch, and is only needed if the klp_object
> doesn't have the type in its 'shadow_types' array.
Yup.
> > + *
> > + * Return: 0 on suceess, -ENOMEM when there is not enough memory.
>
> "success"
>
> > + */
> > +int klp_shadow_register(struct klp_shadow_type *shadow_type)
> > +{
> > + struct klp_shadow_type_reg *shadow_type_reg;
> > + struct klp_shadow_type_reg *new_shadow_type_reg;
> > +
> > + new_shadow_type_reg =
> > + kzalloc(sizeof(struct klp_shadow_type_reg), GFP_KERNEL);
> > + if (!new_shadow_type_reg)
> > + return -ENOMEM;
> > +
> > + spin_lock_irq(&klp_shadow_lock);
> > +
> > + if (shadow_type->registered) {
> > + pr_err("Trying to register shadow variable type that is already registered: %lu",
> > + shadow_type->id);
> > + kfree(new_shadow_type_reg);
> > + goto out;
>
> Shouldn't this return -EINVAL or so?
Yes, it would make more sense.
> > + }
> > +
> > + shadow_type_reg = klp_shadow_type_get_reg(shadow_type);
> > + if (!shadow_type_reg) {
> > + shadow_type_reg = new_shadow_type_reg;
> > + shadow_type_reg->id = shadow_type->id;
> > + list_add(&shadow_type_reg->list, &klp_shadow_types);
> > + } else {
> > + kfree(new_shadow_type_reg);
> > + }
> > +
> > + shadow_type_reg->ref_cnt++;
> > + shadow_type->registered = true;
> > +out:
> > + spin_unlock_irq(&klp_shadow_lock);
> > +
> > + return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(klp_shadow_register);
> > +
> > +void klp_shadow_unregister(struct klp_shadow_type *shadow_type)
> > +{
> > + struct klp_shadow_type_reg *shadow_type_reg;
> > +
> > + spin_lock_irq(&klp_shadow_lock);
> > +
> > + if (!shadow_type->registered) {
> > + pr_err("Trying to unregister shadow variable type that is not registered: %lu",
> > + shadow_type->id);
> > + goto out;
> > + }
> > +
> > + shadow_type_reg = klp_shadow_type_get_reg(shadow_type);
> > + if (!shadow_type_reg) {
> > + pr_err("Can't find shadow variable type registration: %lu", shadow_type->id);
> > + goto out;
> > + }
>
> Since it's too late to report these errors and they might indicate a
> major bug, maybe they should WARN().
This probably won't be needed after we get rid of shadow_type_reg.
Also this is actually a valid scenario. klp_complete_transition()
might try to unregister a not-yet-registered type. It might happen
when called from __klp_enabled_patch() error path.
Thanks a lot for the review.
Best Regards,
Petr
On Tue, Jan 31, 2023 at 03:23:58PM +0100, Petr Mladek wrote:
> > > +
> > > void *klp_shadow_get(void *obj, struct klp_shadow_type *shadow_type);
> > > void *klp_shadow_alloc(void *obj, struct klp_shadow_type *shadow_type,
> > > size_t size, gfp_t gfp_flags, void *ctor_data);
> >
> > I find the type-based interface to be unnecessarily clunky and
> > confusing:
> >
> > - It obscures the fact that uniqueness is determined by ID, not by type.
> >
> > - It's weird to be passing around the type even after it has been
> > registered: e.g., why doesn't klp remember the ctor/dtor?
>
> ctor/dtor must be implemented in each livepatch. klp_shadow_register()
> would need to remember all registered implementations.
> klp_shadow_alloc/get/free would need to find and choose one.
> It would add an extra overhead and non-determines.
There's really no need to even "register" the constructor. It can
continue to just be passed as needed to the alloc functions.
(Side note, I don't see why klp_shadow_alloc() even needs a constructor
as it's typically used when its corresponding object gets allocated, so
its initialization doesn't need to be atomic. klp_shadow_get_or_alloc()
on the other hand, does need a constructor since it's used for attaching
to already-existing objects.)
The thing really complicating this whole mess of an API is the
destructor. The problem is that it can change when replacing
livepatches, so we can't just remember it in the registration.
At least at Red Hat, I don't think we've ever used a shadow destructor.
Its real world use seems somewhere between rare and non-existent.
I guess a destructor would be needed if the constructor (or some other
initialization) allocated additional memory associated with the shadow
variable. That data would need to be freed.
But in that case couldn't an additional shadow variable be used for the
additional memory? Then we could just get rid of this idea of the
destructor and this API could become much more straightforward.
Alternatively, each livepatch could just have an optional global
destructor which is called for every object which needs destructing. It
could then use the ID to decide what needs done (or pass it off to a
more specific destructor).
> > - I don't understand the exposed klp_shadow_{un}register() interfaces.
> > What's the point of forcing their use if there's no garbage
> > collection?
>
> I probably do not understand it correctly.
>
> Normal livepatch developers do not need to use klp_shadow_{un}register().
> They are called transparently in __klp_enable_patch()/klp_complete_transition()
> and klp_module_coming()/klp_cleanup_module_patches_limited().
>
> The main reason why they are exposed via include/linux/livepatch.h
> and EXPORT_SYMBOL_GPL() is the selftest lib/livepatch/test_klp_shadow_vars.c.
>
> Well, the selftest probably could be bundled into a livepatch to go
> around this.
In that case, at the very least they should be prefixed with underscores
and the function comment should make it clear they shouldn't be called
under normal usage.
> > - It's internally confusing in klp to have both 'type' and 'type_reg'.
>
> I do not like it much either.
>
> An idea. We could replace "bool registered" with "struct list_head
> register_node" in struct klp_shadow_type. Then we could use it
> for registration.
>
> All the registered types will be in a global list (klp_type_register).
> klp_shadow_unregister() would do the garbage collection when it
> removed the last entry with the given id from the global list.
Yeah, that does sound better (assuming we decide to keep this type
thing).
> > It seems like overkill to have both klp_shadow_hash and
> > klp_shadow_types. For example 'struct klp_shadow' could have a link to
> > its type and then klp_shadow_type_get_reg could iterate through the
> > klp_shadow_hash.
>
> I guess that there is a misunderstanding. We need two databases:
>
> + One for the registered shadow_types. I does the reference
> counting. It counts the number of livepatched objects
> (livepatches) that might the shadow type. It says _when_ the garbage
> collection must be done.
>
> + Second for existing shadow variables. It is needed for finding
> the shadow data. It says _what_ variables have to freed when
> the garbage collection is being proceed.
Yup, not sure what I was thinking, please ignore...
--
Josh
On Wed, Oct 26, 2022 at 04:41:22PM -0300, Marcos Paulo de Souza wrote:
> The shadow variable type might then be added into a new @shadow_types
> member of struct klp_object. They will get then automatically registered
> and unregistered when the object is being livepatched. The registration
> increments the reference count. Unregistration decreases the reference
> count. All shadow variables of the given type are freed when the reference
> count reaches zero.
How does the automatic unregistration work for replaced patches?
I see klp_unpatch_replaced_patches() is called, but I don't see where it
ends up calling klp_shadow_unregister() for the replaced patch(es).
--
Josh
On Tue 2023-01-31 16:18:17, Josh Poimboeuf wrote:
> On Wed, Oct 26, 2022 at 04:41:22PM -0300, Marcos Paulo de Souza wrote:
> > The shadow variable type might then be added into a new @shadow_types
> > member of struct klp_object. They will get then automatically registered
> > and unregistered when the object is being livepatched. The registration
> > increments the reference count. Unregistration decreases the reference
> > count. All shadow variables of the given type are freed when the reference
> > count reaches zero.
>
> How does the automatic unregistration work for replaced patches?
>
> I see klp_unpatch_replaced_patches() is called, but I don't see where it
> ends up calling klp_shadow_unregister() for the replaced patch(es).
Great catch! I forgot that replaced patches are handled separately.
We should do the following (on top of this patch):
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -123,8 +123,11 @@ static void klp_complete_transition(void)
continue;
if (klp_target_state == KLP_PATCHED)
klp_post_patch_callback(obj);
- else if (klp_target_state == KLP_UNPATCHED) {
+ else if (klp_target_state == KLP_UNPATCHED)
klp_post_unpatch_callback(obj);
+
+ if (klp_target_state == KLP_UNPATCHED ||
+ klp_transition_patch->replace) {
klp_unregister_shadow_types(obj);
}
}
Also we should add more selftests for handling the shadow variables.
Best Regards,
Petr
On Tue 2023-01-31 13:17:31, Josh Poimboeuf wrote:
> On Tue, Jan 31, 2023 at 03:23:58PM +0100, Petr Mladek wrote:
> > > > +
> > > > void *klp_shadow_get(void *obj, struct klp_shadow_type *shadow_type);
> > > > void *klp_shadow_alloc(void *obj, struct klp_shadow_type *shadow_type,
> > > > size_t size, gfp_t gfp_flags, void *ctor_data);
> > >
> > > I find the type-based interface to be unnecessarily clunky and
> > > confusing:
> > >
> > > - It obscures the fact that uniqueness is determined by ID, not by type.
> > >
> > > - It's weird to be passing around the type even after it has been
> > > registered: e.g., why doesn't klp remember the ctor/dtor?
> >
> > ctor/dtor must be implemented in each livepatch. klp_shadow_register()
> > would need to remember all registered implementations.
> > klp_shadow_alloc/get/free would need to find and choose one.
> > It would add an extra overhead and non-determines.
>
> There's really no need to even "register" the constructor. It can
> continue to just be passed as needed to the alloc functions.
Sure. But it looks weird to handle the constructor another way then
the destructor. I mean that if we registrer the destructor then we
should registrer the constructor as well.
I see an analogy with C++ here. klp_shadow_register_type() is similar
to defining a class. And both constructor and destructor are defined
in the class definition.
A custom constructor in C++ might allow to pass extra parameters
needed for initialization. In our case, these are obj, size,
gpf_flags, ctor_data.
> (Side note, I don't see why klp_shadow_alloc() even needs a constructor
> as it's typically used when its corresponding object gets allocated, so
> its initialization doesn't need to be atomic. klp_shadow_get_or_alloc()
> on the other hand, does need a constructor since it's used for attaching
> to already-existing objects.)
If we agree that klp_shadow_get_or_alloc() needed a constructor then
klp_shadow_alloc() should do the same. Different behavior would add
more confusion from my POV.
> The thing really complicating this whole mess of an API is the
> destructor. The problem is that it can change when replacing
> livepatches, so we can't just remember it in the registration.
>
> At least at Red Hat, I don't think we've ever used a shadow destructor.
> Its real world use seems somewhere between rare and non-existent.
I checked SUSE livepatches and we use the destructor once
for mutex_destroy(). I guess that it made the livepatch more safe.
Also it might have been useful for testing.
> I guess a destructor would be needed if the constructor (or some other
> initialization) allocated additional memory associated with the shadow
> variable. That data would need to be freed.
It seems that the constructor is primary used to initialize the
structure members, for example, locks.
The commit e91c2518a5d22a07642f35 ("livepatch: Initialize shadow
variables safely by a custom callback") mentions as an example
list_head. The constructor might want to link it into another list.
In that case, the destructor would do the opposite.
> But in that case couldn't an additional shadow variable be used for the
> additional memory? Then we could just get rid of this idea of the
> destructor and this API could become much more straightforward.
>
> Alternatively, each livepatch could just have an optional global
> destructor which is called for every object which needs destructing. It
> could then use the ID to decide what needs done (or pass it off to a
> more specific destructor).
Honestly, I do not think that it would really make things easier
for people developing the livepatches.
Summary:
My understanding is that you do not like the following things:
1. Problem: Having both struct klp_shadow_type and
struct klp_shadow_type_reg.
Proposal: Get rid of struct klp_shadow_type_reg. Instead, add
a list_head into struct klp_shadow_type and use it for
registration and reference counting.
2. Problem: Using struct klp_shadow_type * instead of ID in
the klp_shadow API.
Solution:
a) Find the shadow_type by ID.
b) Get rid of klp_shadow_type completely
3. Problem: The API is obscure in general and this makes it even worse
Solution:
a) Do not pass constructor in klp_shadow_alloc()
b) Do not support destructor
c) Have global destructor
Requirements:
We agree that a constructor is needed at least for
klp_shadow_get_or_alloc().
The garbage collection solves a real problem. It provides a real
solution instead of custom hacks.
My opinion:
The shadow API is used rarely so that livepatch developer do not
have much experience with it.
The shadow API is usually used for some tricky things. And it is
used for custom livepatch-specific modifications that do not
get much review.
An ideal API should be simple. But we agree that the constructor is
needed in klp_shadow_get_or_alloc(). It defines the basic
complexity of the API.
The API should be predictable. IMHO, klp_shadow_alloc() should handle
the constructor the same way as klp_shadow_get_or_alloc().
Also it would help when the API is symmetric. I would keep the
destructor. And if register destructor then we should register
constructor as well.
I do not see big difference between passing ID or struct
klp_shadow_type * in the API. IMHO, it is better than a common
destructor.
Summary: I would get rid of struct klp_shadow_type_reg and
keep the rest as is.
Proposal:
If we can't find an agreenment. Then the decision should be made
by people actually using the API. On the SUSE side, it is Nicolai
and Marcos.
IMHO, the main question is whether we need custom destructors.
We could avoid struct klp_shadow_type if the destructors are
not needed...
> > > - I don't understand the exposed klp_shadow_{un}register() interfaces.
> > > What's the point of forcing their use if there's no garbage
> > > collection?
> >
> > I probably do not understand it correctly.
> >
> > Normal livepatch developers do not need to use klp_shadow_{un}register().
> > They are called transparently in __klp_enable_patch()/klp_complete_transition()
> > and klp_module_coming()/klp_cleanup_module_patches_limited().
> >
> > The main reason why they are exposed via include/linux/livepatch.h
> > and EXPORT_SYMBOL_GPL() is the selftest lib/livepatch/test_klp_shadow_vars.c.
> >
> > Well, the selftest probably could be bundled into a livepatch to go
> > around this.
>
> In that case, at the very least they should be prefixed with underscores
> and the function comment should make it clear they shouldn't be called
> under normal usage.
Good point. Well, I think that we should start testing the klp_shadow API using
livepatches. So that it won't be necessary to export the registration API.
Best Regards,
Petr
On Thu, Feb 02, 2023 at 11:14:15AM +0100, Petr Mladek wrote:
> On Tue 2023-01-31 16:18:17, Josh Poimboeuf wrote:
> > On Wed, Oct 26, 2022 at 04:41:22PM -0300, Marcos Paulo de Souza wrote:
> > > The shadow variable type might then be added into a new @shadow_types
> > > member of struct klp_object. They will get then automatically registered
> > > and unregistered when the object is being livepatched. The registration
> > > increments the reference count. Unregistration decreases the reference
> > > count. All shadow variables of the given type are freed when the reference
> > > count reaches zero.
> >
> > How does the automatic unregistration work for replaced patches?
> >
> > I see klp_unpatch_replaced_patches() is called, but I don't see where it
> > ends up calling klp_shadow_unregister() for the replaced patch(es).
>
> Great catch! I forgot that replaced patches are handled separately.
> We should do the following (on top of this patch):
>
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -123,8 +123,11 @@ static void klp_complete_transition(void)
> continue;
> if (klp_target_state == KLP_PATCHED)
> klp_post_patch_callback(obj);
> - else if (klp_target_state == KLP_UNPATCHED) {
> + else if (klp_target_state == KLP_UNPATCHED)
> klp_post_unpatch_callback(obj);
> +
> + if (klp_target_state == KLP_UNPATCHED ||
> + klp_transition_patch->replace) {
> klp_unregister_shadow_types(obj);
> }
> }
That calls klp_unregister_shadow_types() on the objects for
klp_transition_patch, which is the replacement patch, not the replaced
patch(es).
--
Josh
On Wed, Jan 25, 2023 at 03:22:48PM -0800, Josh Poimboeuf wrote:
> On Tue, Jan 17, 2023 at 04:01:57PM +0100, Petr Mladek wrote:
> > > >From my experience, there are basically two relevant usage patterns of
> > > shadow variables.
> > > 1.) To hand over global state from one sublivepatch to its pendant in
> > > the to-be-applied livepatch module. Example: a new global mutex or
> > > alike.
> > > 2.) The "regular" intended usage, attaching shadow variables to real
> > > (data) objects.
> > >
> > > To manage lifetime for 1.), we usually implement some refcount scheme,
> > > managed from the livepatches' module_init()/_exit(): the next livepatch
> > > would subscribe to the shared state before the previous one got a chance
> > > to release it. This works in practice, but the code related to it is
> > > tedious to write and quite verbose.
> > >
> > > The second usage pattern is much more difficult to implement correctly
> > > in light of possible livepatch downgrades to a subset of
> > > sublivepatches. Usually a sublivepatch making use of a shadow variable
> > > attached to real objects would livepatch the associated object's
> > > destruction code to free up the associated shadow, if any. If the next
> > > livepatch to be applied happened to not contain this sublivepatch in
> > > question as well, the destruction code would effectively become
> > > unpatched, and any existing shadows leaked. Depending on the object type
> > > in question, this memory leakage might or might not be an actual
> > > problem, but it isn't nice either way.
> > >
> > > Often, there's a more subtle issue with the latter usecase though: the
> > > shadow continues to exist, but becomes unmaintained once the transitions
> > > has started. If said sublivepatch happens to become reactivated later
> > > on, it would potentially find stale shadows, and these could even get
> > > wrongly associated with a completely different object which happened to
> > > get allocated at the same memory address. Depending on the shadow type,
> > > this might or might not be Ok. New per-object locks or a "TLB flush
> > > needed" boolean would probably be Ok, but some kind of refcount would
> > > certainly not. There's not much which could be done from the pre-unpatch
> > > callbacks, because these aren't getting invoked for atomic-replace
> > > downgrades.
> >
> > IMHO, this is the reason why we should make it per-object.
> >
> > If the shadow variable was used by a livepatched module and we remove
> > this module then the shadow variables would get unmaintained. It would
> > results in the problem described in this paragraph.
>
> Yes, that makes sense. Ok, I'm convinced.
I've been thinking about this some more, and this justification for
making it per-object no longer makes sense to me.
A shadow variable should follow the lifetime of its associated data
object, so the only way it would leak from an unloaded patched module
would be if there's a bug either in the patched module or in the
livepatch itself, right?
Or did I misunderstand your point?
--
Josh
On Wed, Nov 09, 2022 at 03:36:23PM +0100, Petr Mladek wrote:
> > Right. Though, thinking about it more, this isn't even needed. Each
> > klp_shadow would have a pointer to its owning module. We already have a
> > global hash of klp_shadows which can be iterated when the module gets
> > unloaded or replaced.
> >
> > Assuming atomic replace, the new patch is almost always a superset of
> > the old patch. We can optimize for that case.
>
> I see. But I do not agree with this assumption. The new livepatch
> might also remove code that caused problems.
>
> Also we allow downgrades. I mean that there is no versioning
> of livepatches. Older livepatches might replace new ones.
>
> We really want to support downgrades or upgrades that remove
> problematic code. Or upgrades that start fixing the problem
> a better way using another shadow variable.
So I've been thinking about this too...
I think it's good to support downgrades. But even with this patch set,
they still aren't properly supported because of the callback design.
Unpatch callbacks aren't called for the replaced patch. Any cleanup it
needs to do (e.g., for a downgrade) is skipped. That can cause all
kinds of problems including leaks and conflicts with future patches.
And actually, that seems directly related to shadow variable garbage
collection.
Typically you would call 'klp_shadow_free_all(id, dtor)' in the
post_unpatch callback, *unless* that data is going to be used by the
replacement patch.
That "unless" is the problem. Without that problem, garbage collection
of shadow variables would be trivial.
Thing is, that "unless" exists not only for the freeing of shadow
variables, but also for the allocation of some shadow variables in
pre/post-patch callbacks, and many other types of initializations and
cleanups done by patch/unpatch callbacks.
Correct me if I'm wrong, but it seems like downgrades (and similar
scenarios like upgrades with partial revert) are the primary real-world
motivation for this patch set.
These patches feel like a partial solution to a bigger problem. If we
really want to support downgrade use cases, the callbacks will need to
be improved.
The main issue is that callbacks aren't called at the right times, and
they're not granular enough to support different levels of
upgrade/downgrade so that we can arbitrarily replace any patch with any
other version of the patch without consequence.
If the callbacks were split up and called only when they're needed, it
would be trivial to free the shadow variables in a post_unpatch
callback.
To support this we could have some kind of callback versioning, where a
klp_object can have an array of klp_callbacks, and each set of callbacks
has a version associated with it.
For example, consider a sequence of patches P1-P4, each of which is an
upgrade from the last.
(Note: I'm assuming only one klp_object here to keep it simple.)
- P1 has [ callbacks.version=1 ]
- P2 has [ callbacks.version=1 ]
- P3 has [ callbacks.version=1, callbacks.version=2 ]
- P4 has [ callbacks.version=1, callbacks.version=3 ]
Usually a patch will have the same callbacks (with same versions) as its
predecessor. P1 and P2 have the same callbacks (v1).
If needed, a patch can then add an additional set of callbacks (with
bumped version) like P3. Or it can remove its predecessor's callbacks.
Or it can do both, like P4.
When deciding which callbacks to call for a replace operation, it's
basically an XOR. If a callback version exists in one patch but not the
other, its callbacks should be called.
For example, if P3 gets upgraded to P4, call P4's v3 patch callbacks and
P3's v2 unpatch callbacks.
If P4 gets downgraded to P1, call P4's v3 unpatch callbacks.
BTW, "version" may not be the right name, as there's no concept of newer
or older: any patch can be arbitrarily replaced by any other patch. The
version is just a unique ID for a set of callbacks.
--
Josh
On Sat 2023-02-04 15:59:10, Josh Poimboeuf wrote:
> On Wed, Nov 09, 2022 at 03:36:23PM +0100, Petr Mladek wrote:
> > We really want to support downgrades or upgrades that remove
> > problematic code. Or upgrades that start fixing the problem
> > a better way using another shadow variable.
>
> So I've been thinking about this too...
>
> I think it's good to support downgrades. But even with this patch set,
> they still aren't properly supported because of the callback design.
>
> Unpatch callbacks aren't called for the replaced patch. Any cleanup it
> needs to do (e.g., for a downgrade) is skipped. That can cause all
> kinds of problems including leaks and conflicts with future patches.
>
> And actually, that seems directly related to shadow variable garbage
> collection.
>
> Typically you would call 'klp_shadow_free_all(id, dtor)' in the
> post_unpatch callback, *unless* that data is going to be used by the
> replacement patch.
Exactly.
> That "unless" is the problem. Without that problem, garbage collection
> of shadow variables would be trivial.
>
> Thing is, that "unless" exists not only for the freeing of shadow
> variables, but also for the allocation of some shadow variables in
> pre/post-patch callbacks, and many other types of initializations and
> cleanups done by patch/unpatch callbacks.
Yes.
> Correct me if I'm wrong, but it seems like downgrades (and similar
> scenarios like upgrades with partial revert) are the primary real-world
> motivation for this patch set.
Yes.
> These patches feel like a partial solution to a bigger problem. If we
> really want to support downgrade use cases, the callbacks will need to
> be improved.
You are right. All this started when I discussed problems with writing
livepatch callbacks with Nicolai. I tried to make it easier by
introducing the klp_state API. But Nicolai realized that it was
still too hard.
The garbage collection was an idea how to easily handle the most
common case, see below.
> The main issue is that callbacks aren't called at the right times, and
> they're not granular enough to support different levels of
> upgrade/downgrade so that we can arbitrarily replace any patch with any
> other version of the patch without consequence.
>
> If the callbacks were split up and called only when they're needed, it
> would be trivial to free the shadow variables in a post_unpatch
> callback.
>
> To support this we could have some kind of callback versioning, where a
> klp_object can have an array of klp_callbacks, and each set of callbacks
> has a version associated with it.
>
> For example, consider a sequence of patches P1-P4, each of which is an
> upgrade from the last.
>
> (Note: I'm assuming only one klp_object here to keep it simple.)
>
> - P1 has [ callbacks.version=1 ]
> - P2 has [ callbacks.version=1 ]
> - P3 has [ callbacks.version=1, callbacks.version=2 ]
> - P4 has [ callbacks.version=1, callbacks.version=3 ]
>
> Usually a patch will have the same callbacks (with same versions) as its
> predecessor. P1 and P2 have the same callbacks (v1).
>
> If needed, a patch can then add an additional set of callbacks (with
> bumped version) like P3. Or it can remove its predecessor's callbacks.
> Or it can do both, like P4.
>
> When deciding which callbacks to call for a replace operation, it's
> basically an XOR. If a callback version exists in one patch but not the
> other, its callbacks should be called.
>
> For example, if P3 gets upgraded to P4, call P4's v3 patch callbacks and
> P3's v2 unpatch callbacks.
>
> If P4 gets downgraded to P1, call P4's v3 unpatch callbacks.
>
> BTW, "version" may not be the right name, as there's no concept of newer
> or older: any patch can be arbitrarily replaced by any other patch. The
> version is just a unique ID for a set of callbacks.
Yes, we discussed various possibilities and it was always getting pretty
complex.
The most promising direction with the callbacks was connecting
the callbacks with the klp_state. They are versioned as you
suggest above. And we could have callbacks:
+ state_prepare (instead of pre_patch)
+ state_enable (instead of post_patch)
+ state_disable (instead of pre_unpatch)
+ state_cleanup (instead of pre_unpatch)
The above callbacks would be called only when the state is new.
We might need more callbacks if the support update/downgrade to
another version of the state:
+ state_update_prepare (instead of pre_patch)
+ state_update_enable (instead of post_patch)
+ state_downgrade_prepare (instead of pre_unpatch)
+ state_downgrade_cleanup (instead of pre_unpatch)
Also we might some callbacks when an action is needed to take
over the state from the previous livepatch:
+ state_take_over_prepare (instead of pre_patch and pre_unpatch?)
+ state_take_over_enable (instead of post_patch and post_unpatch?)
IMHO, it is better than the current state. It better describes
the complexity of the problem. It forces/allows livepatch authors
to handle all scenarios correctly.
Anyway, it is complex. The authors have to understand when exactly
each callback will be called. Also some callbacks might be called
from the old-to-be-replaced and some from the new-to-be-used livepatch.
We discussed various use-cases and realized that shadow variables
were very common use case. I have just double checked it and
it seems that we used:
+ post_patch callback for two CVEs
+ klp_shadow_free() for 14 CVEs
This is where the idea of the garbage collection came from. It looked
like an elegant solution for the most common use case. So that
livepatch authors do not need to think about the rather complex
ordering of the many callbacks. They would just register/define
the shadow variable type and be done.
Well, the single state_cleanup() callback would usually be
enough for the shadow variables.
Conclusion:
You are right that the shadow variables and callbacks
are connected. Or more precisely, the shadow variables and
klp_states have similar life-time handling. They both
are introduced/transfered/removed.
Maybe, we could merge klp_shadow and klp_state into a single
entity that would support versioning and various callbacks.
I am just not sure how the entity would be called.
The name is important because it should be clear what it
means and how it can be used. The unification and bad name
might cause more harm then good.
Note that the meaning is a bit different. klp_shadow is
for storing data. klp_state rather describes a change
of the behavior done by a callback.
IMHO, there does not exist 1:1 relation between shadow
variables and states. Using a particular shadow variable
type might define a system state. But a callback might
change a system state even without using shadow variables.
Another small problem is that klp_state is currently in
struct klp_livepatch. And it would be more safe to register
klp_shadow_type in struct klp_object.
Best Regards,
Petr