2020-04-14 16:55:50

by Josh Poimboeuf

[permalink] [raw]
Subject: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

Better late than never, these patches add simplifications and
improvements for some issues Peter found six months ago, as part of his
non-writable text code (W^X) cleanups.

Highlights:

- Remove the livepatch arch-specific .klp.arch sections, which were used
to do paravirt patching and alternatives patching for livepatch
replacement code.

- Add support for jump labels in patched code.

- Remove the last module_disable_ro() usage.

For more background, see this thread:

https://lkml.kernel.org/r/20191021135312.jbbxsuipxldocdjk@treble

I've tested this with a modified kpatch-build:

https://github.com/jpoimboe/kpatch/tree/no-klp-arch

(I'm planning to submit a github PR for kpatch-build, once I get
the updated unit/integration tests sorted out.


Josh Poimboeuf (4):
livepatch: Apply vmlinux-specific KLP relocations early
livepatch: Prevent module-specific KLP rela sections from referencing
vmlinux symbols
livepatch: Remove module_disable_ro() usage
module: Remove module_disable_ro()

Peter Zijlstra (3):
livepatch: Remove .klp.arch
s390/module: Use s390_kernel_write() for relocations
x86/module: Use text_poke() for relocations

Documentation/livepatch/module-elf-format.rst | 12 +-
arch/s390/kernel/module.c | 106 +++++++++------
arch/um/kernel/um_arch.c | 16 +++
arch/x86/kernel/Makefile | 1 -
arch/x86/kernel/livepatch.c | 53 --------
arch/x86/kernel/module.c | 34 ++++-
include/linux/livepatch.h | 19 ++-
include/linux/module.h | 2 -
kernel/livepatch/core.c | 128 +++++++++++-------
kernel/module.c | 22 +--
10 files changed, 205 insertions(+), 188 deletions(-)
delete mode 100644 arch/x86/kernel/livepatch.c

--
2.21.1


2020-04-14 16:55:54

by Josh Poimboeuf

[permalink] [raw]
Subject: [PATCH 6/7] livepatch: Remove module_disable_ro() usage

With arch_klp_init_object_loaded() gone, and apply_relocate_add() now
using text_poke(), livepatch no longer needs to use module_disable_ro().

The text_mutex usage can also be removed -- its purpose was to protect
against module permission change races.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
kernel/livepatch/core.c | 8 --------
1 file changed, 8 deletions(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 817676caddee..3a88639b3326 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -767,10 +767,6 @@ static int klp_init_object_loaded(struct klp_patch *patch,
struct klp_modinfo *info = patch->mod->klp_info;

if (klp_is_module(obj)) {
-
- mutex_lock(&text_mutex);
- module_disable_ro(patch->mod);
-
/*
* Only write module-specific relocations here
* (.klp.rela.{module}.*). vmlinux-specific relocations were
@@ -782,10 +778,6 @@ static int klp_init_object_loaded(struct klp_patch *patch,
patch->mod->core_kallsyms.strtab,
info->symndx, patch->mod,
obj->name);
-
- module_enable_ro(patch->mod, true);
- mutex_unlock(&text_mutex);
-
if (ret)
return ret;
}
--
2.21.1

2020-04-14 16:55:54

by Josh Poimboeuf

[permalink] [raw]
Subject: [PATCH 2/7] livepatch: Remove .klp.arch

From: Peter Zijlstra <[email protected]>

After the previous patch, vmlinux-specific KLP relocations are now
applied early during KLP module load. This means that .klp.arch
sections are no longer needed for *vmlinux-specific* KLP relocations.

One might think they're still needed for *module-specific* KLP
relocations. If a to-be-patched module is loaded *after* its
corresponding KLP module is loaded, any corresponding KLP relocations
will be delayed until the to-be-patched module is loaded. If any
special sections (.parainstructions, for example) rely on those
relocations, their initializations (apply_paravirt) need to be done
afterwards. Thus the apparent need for arch_klp_init_object_loaded()
and its corresponding .klp.arch sections -- it allows some of the
special section initializations to be done at a later time.

But... if you look closer, that dependency between the special sections
and the module-specific KLP relocations doesn't actually exist in
reality. Looking at the contents of the .altinstructions and
.parainstructions sections, there's not a realistic scenario in which a
KLP module's .altinstructions or .parainstructions section needs to
access a symbol in a to-be-patched module. It might need to access a
local symbol or even a vmlinux symbol; but not another module's symbol.
When a special section needs to reference a local or vmlinux symbol, a
normal rela can be used instead of a KLP rela.

Since the special section initializations don't actually have any real
dependency on module-specific KLP relocations, .klp.arch and
arch_klp_init_object_loaded() no longer have a reason to exist. So
remove them.

As Peter said much more succinctly:

So the reason for .klp.arch was that .klp.rela.* stuff would overwrite
paravirt instructions. If that happens you're doing it wrong. Those
RELAs are core kernel, not module, and thus should've happened in
.rela.* sections at patch-module loading time.

Reverting this removes the two apply_{paravirt,alternatives}() calls
from the late patching path, and means we don't have to worry about
them when removing module_disable_ro().

[ jpoimboe: Rewrote patch description. Tweaked klp_init_object_loaded()
error path. ]

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
---
Documentation/livepatch/module-elf-format.rst | 12 +----
arch/x86/kernel/Makefile | 1 -
arch/x86/kernel/livepatch.c | 53 -------------------
include/linux/livepatch.h | 3 --
kernel/livepatch/core.c | 27 ++++------
5 files changed, 10 insertions(+), 86 deletions(-)
delete mode 100644 arch/x86/kernel/livepatch.c

diff --git a/Documentation/livepatch/module-elf-format.rst b/Documentation/livepatch/module-elf-format.rst
index 2a591e6f8e6c..629ef7ffb6cf 100644
--- a/Documentation/livepatch/module-elf-format.rst
+++ b/Documentation/livepatch/module-elf-format.rst
@@ -298,17 +298,7 @@ Examples:
Note that the 'Ndx' (Section index) for these symbols is SHN_LIVEPATCH (0xff20).
"OS" means OS-specific.

-5. Architecture-specific sections
-=================================
-Architectures may override arch_klp_init_object_loaded() to perform
-additional arch-specific tasks when a target module loads, such as applying
-arch-specific sections. On x86 for example, we must apply per-object
-.altinstructions and .parainstructions sections when a target module loads.
-These sections must be prefixed with ".klp.arch.$objname." so that they can
-be easily identified when iterating through a patch module's Elf sections
-(See arch/x86/kernel/livepatch.c for a complete example).
-
-6. Symbol table and Elf section access
+5. Symbol table and Elf section access
======================================
A livepatch module's symbol table is accessible through module->symtab.

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index d6d61c4455fa..fc8834342516 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -94,7 +94,6 @@ obj-$(CONFIG_X86_MPPARSE) += mpparse.o
obj-y += apic/
obj-$(CONFIG_X86_REBOOTFIXUPS) += reboot_fixups_32.o
obj-$(CONFIG_DYNAMIC_FTRACE) += ftrace.o
-obj-$(CONFIG_LIVEPATCH) += livepatch.o
obj-$(CONFIG_FUNCTION_TRACER) += ftrace_$(BITS).o
obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o
obj-$(CONFIG_FTRACE_SYSCALLS) += ftrace.o
diff --git a/arch/x86/kernel/livepatch.c b/arch/x86/kernel/livepatch.c
deleted file mode 100644
index 6a68e41206e7..000000000000
--- a/arch/x86/kernel/livepatch.c
+++ /dev/null
@@ -1,53 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * livepatch.c - x86-specific Kernel Live Patching Core
- */
-
-#include <linux/module.h>
-#include <linux/kallsyms.h>
-#include <linux/livepatch.h>
-#include <asm/text-patching.h>
-
-/* Apply per-object alternatives. Based on x86 module_finalize() */
-void arch_klp_init_object_loaded(struct klp_patch *patch,
- struct klp_object *obj)
-{
- int cnt;
- struct klp_modinfo *info;
- Elf_Shdr *s, *alt = NULL, *para = NULL;
- void *aseg, *pseg;
- const char *objname;
- char sec_objname[MODULE_NAME_LEN];
- char secname[KSYM_NAME_LEN];
-
- info = patch->mod->klp_info;
- objname = obj->name ? obj->name : "vmlinux";
-
- /* See livepatch core code for BUILD_BUG_ON() explanation */
- BUILD_BUG_ON(MODULE_NAME_LEN < 56 || KSYM_NAME_LEN != 128);
-
- for (s = info->sechdrs; s < info->sechdrs + info->hdr.e_shnum; s++) {
- /* Apply per-object .klp.arch sections */
- cnt = sscanf(info->secstrings + s->sh_name,
- ".klp.arch.%55[^.].%127s",
- sec_objname, secname);
- if (cnt != 2)
- continue;
- if (strcmp(sec_objname, objname))
- continue;
- if (!strcmp(".altinstructions", secname))
- alt = s;
- if (!strcmp(".parainstructions", secname))
- para = s;
- }
-
- if (alt) {
- aseg = (void *) alt->sh_addr;
- apply_alternatives(aseg, aseg + alt->sh_size);
- }
-
- if (para) {
- pseg = (void *) para->sh_addr;
- apply_paravirt(pseg, pseg + para->sh_size);
- }
-}
diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index d9e9b76f6054..2509dcf14605 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -195,9 +195,6 @@ struct klp_patch {

int klp_enable_patch(struct klp_patch *);

-void arch_klp_init_object_loaded(struct klp_patch *patch,
- struct klp_object *obj);
-
/* Called from the module loader during module coming/going states */
int klp_module_coming(struct module *mod);
void klp_module_going(struct module *mod);
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index ac9e2e78ae0f..af8f06382e43 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -741,12 +741,6 @@ static int klp_init_func(struct klp_object *obj, struct klp_func *func)
func->old_sympos ? func->old_sympos : 1);
}

-/* Arches may override this to finish any remaining arch-specific tasks */
-void __weak arch_klp_init_object_loaded(struct klp_patch *patch,
- struct klp_object *obj)
-{
-}
-
/* parts of the initialization that is done only when the object is loaded */
static int klp_init_object_loaded(struct klp_patch *patch,
struct klp_object *obj)
@@ -755,10 +749,11 @@ static int klp_init_object_loaded(struct klp_patch *patch,
int ret;
struct klp_modinfo *info = patch->mod->klp_info;

- mutex_lock(&text_mutex);
- module_disable_ro(patch->mod);
-
if (klp_is_module(obj)) {
+
+ mutex_lock(&text_mutex);
+ module_disable_ro(patch->mod);
+
/*
* Only write module-specific relocations here
* (.klp.rela.{module}.*). vmlinux-specific relocations were
@@ -770,17 +765,13 @@ static int klp_init_object_loaded(struct klp_patch *patch,
patch->mod->core_kallsyms.strtab,
info->symndx, patch->mod,
obj->name);
- if (ret) {
- module_enable_ro(patch->mod, true);
- mutex_unlock(&text_mutex);
- return ret;
- }
- }

- arch_klp_init_object_loaded(patch, obj);
+ module_enable_ro(patch->mod, true);
+ mutex_unlock(&text_mutex);

- module_enable_ro(patch->mod, true);
- mutex_unlock(&text_mutex);
+ if (ret)
+ return ret;
+ }

klp_for_each_func(obj, func) {
ret = klp_find_object_symbol(obj->name, func->old_name,
--
2.21.1

2020-04-14 16:56:04

by Josh Poimboeuf

[permalink] [raw]
Subject: [PATCH 5/7] x86/module: Use text_poke() for relocations

From: Peter Zijlstra <[email protected]>

Instead of playing games with module_{dis,en}able_ro(), use existing
text poking mechanisms to apply relocations after module loading.

So far only x86, s390 and Power have HAVE_LIVEPATCH but only the first
two also have STRICT_MODULE_RWX.

This will allow removal of the last module_disable_ro() usage in
livepatch. The ultimate goal is to completely disallow making
executable mappings writable.

[ jpoimboe: Split up patches. Use mod state to determine whether
memcpy() can be used. Implement text_poke() for UML. ]

Cc: [email protected]
Suggested-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
---
arch/um/kernel/um_arch.c | 16 ++++++++++++++++
arch/x86/kernel/module.c | 34 +++++++++++++++++++++++++++-------
2 files changed, 43 insertions(+), 7 deletions(-)

diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c
index 0f40eccbd759..375ab720e4aa 100644
--- a/arch/um/kernel/um_arch.c
+++ b/arch/um/kernel/um_arch.c
@@ -362,3 +362,19 @@ void __init check_bugs(void)
void apply_alternatives(struct alt_instr *start, struct alt_instr *end)
{
}
+
+void *text_poke(void *addr, const void *opcode, size_t len)
+{
+ /*
+ * In UML, the only reference to this function is in
+ * apply_relocate_add(), which shouldn't ever actually call this
+ * because UML doesn't have live patching.
+ */
+ WARN_ON(1);
+
+ return memcpy(addr, opcode, len);
+}
+
+void text_poke_sync(void)
+{
+}
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index d5c72cb877b3..b67ec70cf35b 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -126,11 +126,12 @@ int apply_relocate(Elf32_Shdr *sechdrs,
return 0;
}
#else /*X86_64*/
-int apply_relocate_add(Elf64_Shdr *sechdrs,
+static int __apply_relocate_add(Elf64_Shdr *sechdrs,
const char *strtab,
unsigned int symindex,
unsigned int relsec,
- struct module *me)
+ struct module *me,
+ void *(*write)(void *dest, const void *src, size_t len))
{
unsigned int i;
Elf64_Rela *rel = (void *)sechdrs[relsec].sh_addr;
@@ -162,19 +163,19 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
case R_X86_64_64:
if (*(u64 *)loc != 0)
goto invalid_relocation;
- *(u64 *)loc = val;
+ write(loc, &val, 8);
break;
case R_X86_64_32:
if (*(u32 *)loc != 0)
goto invalid_relocation;
- *(u32 *)loc = val;
+ write(loc, &val, 4);
if (val != *(u32 *)loc)
goto overflow;
break;
case R_X86_64_32S:
if (*(s32 *)loc != 0)
goto invalid_relocation;
- *(s32 *)loc = val;
+ write(loc, &val, 4);
if ((s64)val != *(s32 *)loc)
goto overflow;
break;
@@ -183,7 +184,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
if (*(u32 *)loc != 0)
goto invalid_relocation;
val -= (u64)loc;
- *(u32 *)loc = val;
+ write(loc, &val, 4);
#if 0
if ((s64)val != *(s32 *)loc)
goto overflow;
@@ -193,7 +194,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
if (*(u64 *)loc != 0)
goto invalid_relocation;
val -= (u64)loc;
- *(u64 *)loc = val;
+ write(loc, &val, 8);
break;
default:
pr_err("%s: Unknown rela relocation: %llu\n",
@@ -215,6 +216,25 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
me->name);
return -ENOEXEC;
}
+
+int apply_relocate_add(Elf64_Shdr *sechdrs,
+ const char *strtab,
+ unsigned int symindex,
+ unsigned int relsec,
+ struct module *me)
+{
+ int ret;
+ bool early = me->state == MODULE_STATE_UNFORMED;
+
+ ret = __apply_relocate_add(sechdrs, strtab, symindex, relsec, me,
+ early ? memcpy : text_poke);
+
+ if (!ret && !early)
+ text_poke_sync();
+
+ return ret;
+}
+
#endif

int module_finalize(const Elf_Ehdr *hdr,
--
2.21.1

2020-04-14 16:56:08

by Josh Poimboeuf

[permalink] [raw]
Subject: [PATCH 7/7] module: Remove module_disable_ro()

module_disable_ro() has no more users. Remove it.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
include/linux/module.h | 2 --
kernel/module.c | 13 -------------
2 files changed, 15 deletions(-)

diff --git a/include/linux/module.h b/include/linux/module.h
index 1ad393e62bef..e4ef7b36feda 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -860,10 +860,8 @@ extern int module_sysfs_initialized;

#ifdef CONFIG_STRICT_MODULE_RWX
extern void module_enable_ro(const struct module *mod, bool after_init);
-extern void module_disable_ro(const struct module *mod);
#else
static inline void module_enable_ro(const struct module *mod, bool after_init) { }
-static inline void module_disable_ro(const struct module *mod) { }
#endif

#ifdef CONFIG_GENERIC_BUG
diff --git a/kernel/module.c b/kernel/module.c
index d36ea8a8c3ec..b1d30ad67e82 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -1997,19 +1997,6 @@ static void frob_writable_data(const struct module_layout *layout,
(layout->size - layout->ro_after_init_size) >> PAGE_SHIFT);
}

-/* livepatching wants to disable read-only so it can frob module. */
-void module_disable_ro(const struct module *mod)
-{
- if (!rodata_enabled)
- return;
-
- frob_text(&mod->core_layout, set_memory_rw);
- frob_rodata(&mod->core_layout, set_memory_rw);
- frob_ro_after_init(&mod->core_layout, set_memory_rw);
- frob_text(&mod->init_layout, set_memory_rw);
- frob_rodata(&mod->init_layout, set_memory_rw);
-}
-
void module_enable_ro(const struct module *mod, bool after_init)
{
if (!rodata_enabled)
--
2.21.1

2020-04-14 16:56:14

by Josh Poimboeuf

[permalink] [raw]
Subject: [PATCH 4/7] s390/module: Use s390_kernel_write() for relocations

From: Peter Zijlstra <[email protected]>

Instead of playing games with module_{dis,en}able_ro(), use existing
text poking mechanisms to apply relocations after module loading.

So far only x86, s390 and Power have HAVE_LIVEPATCH but only the first
two also have STRICT_MODULE_RWX.

This will allow removal of the last module_disable_ro() usage in
livepatch. The ultimate goal is to completely disallow making
executable mappings writable.

[ jpoimboe: Split up patches. Use mod state to determine whether
memcpy() can be used. ]

Cc: [email protected]
Cc: [email protected]
Suggested-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
---
arch/s390/kernel/module.c | 106 ++++++++++++++++++++++----------------
1 file changed, 61 insertions(+), 45 deletions(-)

diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
index ba8f19bb438b..e85e378f876e 100644
--- a/arch/s390/kernel/module.c
+++ b/arch/s390/kernel/module.c
@@ -174,7 +174,8 @@ int module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs,
}

static int apply_rela_bits(Elf_Addr loc, Elf_Addr val,
- int sign, int bits, int shift)
+ int sign, int bits, int shift,
+ void (*write)(void *dest, const void *src, size_t len))
{
unsigned long umax;
long min, max;
@@ -194,26 +195,29 @@ static int apply_rela_bits(Elf_Addr loc, Elf_Addr val,
return -ENOEXEC;
}

- if (bits == 8)
- *(unsigned char *) loc = val;
- else if (bits == 12)
- *(unsigned short *) loc = (val & 0xfff) |
+ if (bits == 8) {
+ write(loc, &val, 1);
+ } else if (bits == 12) {
+ unsigned short tmp = (val & 0xfff) |
(*(unsigned short *) loc & 0xf000);
- else if (bits == 16)
- *(unsigned short *) loc = val;
- else if (bits == 20)
- *(unsigned int *) loc = (val & 0xfff) << 16 |
- (val & 0xff000) >> 4 |
- (*(unsigned int *) loc & 0xf00000ff);
- else if (bits == 32)
- *(unsigned int *) loc = val;
- else if (bits == 64)
- *(unsigned long *) loc = val;
+ write(loc, &tmp, 2);
+ } else if (bits == 16) {
+ write(loc, &val, 2);
+ } else if (bits == 20) {
+ unsigned int tmp = (val & 0xfff) << 16 |
+ (val & 0xff000) >> 4 | (*(unsigned int *) loc & 0xf00000ff);
+ write(loc, &tmp, 4);
+ } else if (bits == 32) {
+ write(loc, &val, 4);
+ } else if (bits == 64) {
+ write(loc, &val, 8);
+ }
return 0;
}

static int apply_rela(Elf_Rela *rela, Elf_Addr base, Elf_Sym *symtab,
- const char *strtab, struct module *me)
+ const char *strtab, struct module *me,
+ void (*write)(void *dest, const void *src, size_t len))
{
struct mod_arch_syminfo *info;
Elf_Addr loc, val;
@@ -241,17 +245,17 @@ static int apply_rela(Elf_Rela *rela, Elf_Addr base, Elf_Sym *symtab,
case R_390_64: /* Direct 64 bit. */
val += rela->r_addend;
if (r_type == R_390_8)
- rc = apply_rela_bits(loc, val, 0, 8, 0);
+ rc = apply_rela_bits(loc, val, 0, 8, 0, write);
else if (r_type == R_390_12)
- rc = apply_rela_bits(loc, val, 0, 12, 0);
+ rc = apply_rela_bits(loc, val, 0, 12, 0, write);
else if (r_type == R_390_16)
- rc = apply_rela_bits(loc, val, 0, 16, 0);
+ rc = apply_rela_bits(loc, val, 0, 16, 0, write);
else if (r_type == R_390_20)
- rc = apply_rela_bits(loc, val, 1, 20, 0);
+ rc = apply_rela_bits(loc, val, 1, 20, 0, write);
else if (r_type == R_390_32)
- rc = apply_rela_bits(loc, val, 0, 32, 0);
+ rc = apply_rela_bits(loc, val, 0, 32, 0, write);
else if (r_type == R_390_64)
- rc = apply_rela_bits(loc, val, 0, 64, 0);
+ rc = apply_rela_bits(loc, val, 0, 64, 0, write);
break;
case R_390_PC16: /* PC relative 16 bit. */
case R_390_PC16DBL: /* PC relative 16 bit shifted by 1. */
@@ -260,15 +264,15 @@ static int apply_rela(Elf_Rela *rela, Elf_Addr base, Elf_Sym *symtab,
case R_390_PC64: /* PC relative 64 bit. */
val += rela->r_addend - loc;
if (r_type == R_390_PC16)
- rc = apply_rela_bits(loc, val, 1, 16, 0);
+ rc = apply_rela_bits(loc, val, 1, 16, 0, write);
else if (r_type == R_390_PC16DBL)
- rc = apply_rela_bits(loc, val, 1, 16, 1);
+ rc = apply_rela_bits(loc, val, 1, 16, 1, write);
else if (r_type == R_390_PC32DBL)
- rc = apply_rela_bits(loc, val, 1, 32, 1);
+ rc = apply_rela_bits(loc, val, 1, 32, 1, write);
else if (r_type == R_390_PC32)
- rc = apply_rela_bits(loc, val, 1, 32, 0);
+ rc = apply_rela_bits(loc, val, 1, 32, 0, write);
else if (r_type == R_390_PC64)
- rc = apply_rela_bits(loc, val, 1, 64, 0);
+ rc = apply_rela_bits(loc, val, 1, 64, 0, write);
break;
case R_390_GOT12: /* 12 bit GOT offset. */
case R_390_GOT16: /* 16 bit GOT offset. */
@@ -293,23 +297,23 @@ static int apply_rela(Elf_Rela *rela, Elf_Addr base, Elf_Sym *symtab,
val = info->got_offset + rela->r_addend;
if (r_type == R_390_GOT12 ||
r_type == R_390_GOTPLT12)
- rc = apply_rela_bits(loc, val, 0, 12, 0);
+ rc = apply_rela_bits(loc, val, 0, 12, 0, write);
else if (r_type == R_390_GOT16 ||
r_type == R_390_GOTPLT16)
- rc = apply_rela_bits(loc, val, 0, 16, 0);
+ rc = apply_rela_bits(loc, val, 0, 16, 0, write);
else if (r_type == R_390_GOT20 ||
r_type == R_390_GOTPLT20)
- rc = apply_rela_bits(loc, val, 1, 20, 0);
+ rc = apply_rela_bits(loc, val, 1, 20, 0, write);
else if (r_type == R_390_GOT32 ||
r_type == R_390_GOTPLT32)
- rc = apply_rela_bits(loc, val, 0, 32, 0);
+ rc = apply_rela_bits(loc, val, 0, 32, 0, write);
else if (r_type == R_390_GOT64 ||
r_type == R_390_GOTPLT64)
- rc = apply_rela_bits(loc, val, 0, 64, 0);
+ rc = apply_rela_bits(loc, val, 0, 64, 0, write);
else if (r_type == R_390_GOTENT ||
r_type == R_390_GOTPLTENT) {
val += (Elf_Addr) me->core_layout.base - loc;
- rc = apply_rela_bits(loc, val, 1, 32, 1);
+ rc = apply_rela_bits(loc, val, 1, 32, 1, write);
}
break;
case R_390_PLT16DBL: /* 16 bit PC rel. PLT shifted by 1. */
@@ -357,17 +361,17 @@ static int apply_rela(Elf_Rela *rela, Elf_Addr base, Elf_Sym *symtab,
val += rela->r_addend - loc;
}
if (r_type == R_390_PLT16DBL)
- rc = apply_rela_bits(loc, val, 1, 16, 1);
+ rc = apply_rela_bits(loc, val, 1, 16, 1, write);
else if (r_type == R_390_PLTOFF16)
- rc = apply_rela_bits(loc, val, 0, 16, 0);
+ rc = apply_rela_bits(loc, val, 0, 16, 0, write);
else if (r_type == R_390_PLT32DBL)
- rc = apply_rela_bits(loc, val, 1, 32, 1);
+ rc = apply_rela_bits(loc, val, 1, 32, 1, write);
else if (r_type == R_390_PLT32 ||
r_type == R_390_PLTOFF32)
- rc = apply_rela_bits(loc, val, 0, 32, 0);
+ rc = apply_rela_bits(loc, val, 0, 32, 0, write);
else if (r_type == R_390_PLT64 ||
r_type == R_390_PLTOFF64)
- rc = apply_rela_bits(loc, val, 0, 64, 0);
+ rc = apply_rela_bits(loc, val, 0, 64, 0, write);
break;
case R_390_GOTOFF16: /* 16 bit offset to GOT. */
case R_390_GOTOFF32: /* 32 bit offset to GOT. */
@@ -375,20 +379,20 @@ static int apply_rela(Elf_Rela *rela, Elf_Addr base, Elf_Sym *symtab,
val = val + rela->r_addend -
((Elf_Addr) me->core_layout.base + me->arch.got_offset);
if (r_type == R_390_GOTOFF16)
- rc = apply_rela_bits(loc, val, 0, 16, 0);
+ rc = apply_rela_bits(loc, val, 0, 16, 0, write);
else if (r_type == R_390_GOTOFF32)
- rc = apply_rela_bits(loc, val, 0, 32, 0);
+ rc = apply_rela_bits(loc, val, 0, 32, 0, write);
else if (r_type == R_390_GOTOFF64)
- rc = apply_rela_bits(loc, val, 0, 64, 0);
+ rc = apply_rela_bits(loc, val, 0, 64, 0, write);
break;
case R_390_GOTPC: /* 32 bit PC relative offset to GOT. */
case R_390_GOTPCDBL: /* 32 bit PC rel. off. to GOT shifted by 1. */
val = (Elf_Addr) me->core_layout.base + me->arch.got_offset +
rela->r_addend - loc;
if (r_type == R_390_GOTPC)
- rc = apply_rela_bits(loc, val, 1, 32, 0);
+ rc = apply_rela_bits(loc, val, 1, 32, 0, write);
else if (r_type == R_390_GOTPCDBL)
- rc = apply_rela_bits(loc, val, 1, 32, 1);
+ rc = apply_rela_bits(loc, val, 1, 32, 1, write);
break;
case R_390_COPY:
case R_390_GLOB_DAT: /* Create GOT entry. */
@@ -412,9 +416,10 @@ static int apply_rela(Elf_Rela *rela, Elf_Addr base, Elf_Sym *symtab,
return 0;
}

-int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
+static int __apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
unsigned int symindex, unsigned int relsec,
- struct module *me)
+ struct module *me,
+ void (*write)(void *dest, const void *src, size_t len))
{
Elf_Addr base;
Elf_Sym *symtab;
@@ -437,6 +442,17 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
return 0;
}

+int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
+ unsigned int symindex, unsigned int relsec,
+ struct module *me)
+{
+ int ret;
+ bool early = me->state == MODULE_STATE_UNFORMED;
+
+ return __apply_relocate_add(sechdrs, strtab, symindex, relsec, me,
+ early ? memcpy : s390_kernel_write);
+}
+
int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
struct module *me)
--
2.21.1

2020-04-14 16:57:57

by Josh Poimboeuf

[permalink] [raw]
Subject: [PATCH 3/7] livepatch: Prevent module-specific KLP rela sections from referencing vmlinux symbols

Prevent module-specific KLP rela sections from referencing vmlinux
symbols. This helps prevent ordering issues with module special section
initializations. Presumably such symbols are exported and normal relas
can be used instead.

Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
---
kernel/livepatch/core.c | 39 ++++++++++++++++++++++++++++-----------
1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index af8f06382e43..817676caddee 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -192,17 +192,20 @@ static int klp_find_object_symbol(const char *objname, const char *name,
}

static int klp_resolve_symbols(Elf64_Shdr *sechdrs, const char *strtab,
- unsigned int symndx, Elf_Shdr *relasec)
+ unsigned int symndx, Elf_Shdr *relasec,
+ const char *sec_objname)
{
- int i, cnt, vmlinux, ret;
- char objname[MODULE_NAME_LEN];
- char symname[KSYM_NAME_LEN];
+ int i, cnt, ret;
+ char sym_objname[MODULE_NAME_LEN];
+ char sym_name[KSYM_NAME_LEN];
Elf_Rela *relas;
Elf_Sym *sym;
unsigned long sympos, addr;
+ bool sym_vmlinux;
+ bool sec_vmlinux = !strcmp(sec_objname, "vmlinux");

/*
- * Since the field widths for objname and symname in the sscanf()
+ * Since the field widths for sym_objname and sym_name in the sscanf()
* call are hard-coded and correspond to MODULE_NAME_LEN and
* KSYM_NAME_LEN respectively, we must make sure that MODULE_NAME_LEN
* and KSYM_NAME_LEN have the values we expect them to have.
@@ -223,20 +226,33 @@ static int klp_resolve_symbols(Elf64_Shdr *sechdrs, const char *strtab,
return -EINVAL;
}

- /* Format: .klp.sym.objname.symname,sympos */
+ /* Format: .klp.sym.sym_objname.sym_name,sympos */
cnt = sscanf(strtab + sym->st_name,
".klp.sym.%55[^.].%127[^,],%lu",
- objname, symname, &sympos);
+ sym_objname, sym_name, &sympos);
if (cnt != 3) {
pr_err("symbol %s has an incorrectly formatted name\n",
strtab + sym->st_name);
return -EINVAL;
}

+ sym_vmlinux = !strcmp(sym_objname, "vmlinux");
+
+ /*
+ * Prevent module-specific KLP rela sections from referencing
+ * vmlinux symbols. This helps prevent ordering issues with
+ * module special section initializations. Presumably such
+ * symbols are exported and normal relas can be used instead.
+ */
+ if (!sec_vmlinux && sym_vmlinux) {
+ pr_err("invalid access to vmlinux symbol '%s' from module-specific livepatch relocation section",
+ sym_name);
+ return -EINVAL;
+ }
+
/* klp_find_object_symbol() treats a NULL objname as vmlinux */
- vmlinux = !strcmp(objname, "vmlinux");
- ret = klp_find_object_symbol(vmlinux ? NULL : objname,
- symname, sympos, &addr);
+ ret = klp_find_object_symbol(sym_vmlinux ? NULL : sym_objname,
+ sym_name, sympos, &addr);
if (ret)
return ret;

@@ -301,7 +317,8 @@ int klp_write_relocations(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
if (strcmp(objname ? objname : "vmlinux", sec_objname))
continue;

- ret = klp_resolve_symbols(sechdrs, strtab, symndx, sec);
+ ret = klp_resolve_symbols(sechdrs, strtab, symndx, sec,
+ sec_objname);
if (ret)
break;

--
2.21.1

2020-04-15 13:14:14

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

On Tue, Apr 14, 2020 at 08:27:26PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 14, 2020 at 11:28:36AM -0500, Josh Poimboeuf wrote:
> > Better late than never, these patches add simplifications and
> > improvements for some issues Peter found six months ago, as part of his
> > non-writable text code (W^X) cleanups.
>
> Excellent stuff, thanks!!
>
> I'll go brush up these two patches then:
>
> https://lkml.kernel.org/r/[email protected]
> https://lkml.kernel.org/r/[email protected]

Ah right, I meant to bring that up. I actually played around with those
patches. While it would be nice to figure out a way to converge the
ftrace module init, I didn't really like the first patch.

It bothers me that both the notifiers and the module init() both see the
same MODULE_STATE_COMING state, but only in the former case is the text
writable.

I think it's cognitively simpler if MODULE_STATE_COMING always means the
same thing, like the comments imply, "fully formed" and thus
not-writable:

enum module_state {
MODULE_STATE_LIVE, /* Normal state. */
MODULE_STATE_COMING, /* Full formed, running module_init. */
MODULE_STATE_GOING, /* Going away. */
MODULE_STATE_UNFORMED, /* Still setting it up. */
};

And, it keeps tighter constraints on what a notifier can do, which is a
good thing if we can get away with it.

> and write a patch that makes the x86 code throw a wobbly on W+X.
>
> Acked-by: Peter Zijlstra (Intel) <[email protected]>

Thanks!

--
Josh

2020-04-15 14:47:17

by Joe Lawrence

[permalink] [raw]
Subject: Re: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

On 4/14/20 9:31 PM, Josh Poimboeuf wrote:
> On Tue, Apr 14, 2020 at 08:57:15PM -0400, Joe Lawrence wrote:
>> On 4/14/20 12:28 PM, Josh Poimboeuf wrote:
>>> Better late than never, these patches add simplifications and
>>> improvements for some issues Peter found six months ago, as part of his
>>> non-writable text code (W^X) cleanups.
>>>
>>> Highlights:
>>>
>>> - Remove the livepatch arch-specific .klp.arch sections, which were used
>>> to do paravirt patching and alternatives patching for livepatch
>>> replacement code.
>>>
>>> - Add support for jump labels in patched code.
>>
>> Re: jump labels and late-module patching support...
>>
>> Is there still an issue of a non-exported static key defined in a
>> to-be-patched module referenced and resolved via klp-relocation when the
>> livepatch module is loaded first? (Basically the same case I asked Petr
>> about in his split livepatch module PoC. [1])
>>
>> Or should we declare this an invalid klp-relocation use case and force the
>> livepatch author to use static_key_enabled()?
>>
>> [1] https://lore.kernel.org/lkml/[email protected]/
>
> Right, if the static key lives in a module, then it's still not possible
> for a jump label to use it. I added a check in kpatch-build to block
> that case and suggest static_key_enabled() instead.
>

Ok good. I didn't see a negative test case for this, so I wanted to
make sure that kpatch-build wouldn't accidentally create unsupported
klp-relocations for them. I'll try to review those changes over on
github tomorrow.

-- Joe

2020-04-15 21:51:57

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

On Tue, Apr 14, 2020 at 11:28:36AM -0500, Josh Poimboeuf wrote:
> Better late than never, these patches add simplifications and
> improvements for some issues Peter found six months ago, as part of his
> non-writable text code (W^X) cleanups.

Excellent stuff, thanks!!

I'll go brush up these two patches then:

https://lkml.kernel.org/r/[email protected]
https://lkml.kernel.org/r/[email protected]

and write a patch that makes the x86 code throw a wobbly on W+X.

Acked-by: Peter Zijlstra (Intel) <[email protected]>

2020-04-15 22:00:03

by Joe Lawrence

[permalink] [raw]
Subject: Re: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

On 4/14/20 12:28 PM, Josh Poimboeuf wrote:
> Better late than never, these patches add simplifications and
> improvements for some issues Peter found six months ago, as part of his
> non-writable text code (W^X) cleanups.
>
> Highlights:
>
> - Remove the livepatch arch-specific .klp.arch sections, which were used
> to do paravirt patching and alternatives patching for livepatch
> replacement code.
>
> - Add support for jump labels in patched code.

Re: jump labels and late-module patching support...

Is there still an issue of a non-exported static key defined in a
to-be-patched module referenced and resolved via klp-relocation when the
livepatch module is loaded first? (Basically the same case I asked Petr
about in his split livepatch module PoC. [1])

Or should we declare this an invalid klp-relocation use case and force
the livepatch author to use static_key_enabled()?

[1] https://lore.kernel.org/lkml/[email protected]/

-- Joe

2020-04-15 22:00:58

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

On Tue, Apr 14, 2020 at 08:57:15PM -0400, Joe Lawrence wrote:
> On 4/14/20 12:28 PM, Josh Poimboeuf wrote:
> > Better late than never, these patches add simplifications and
> > improvements for some issues Peter found six months ago, as part of his
> > non-writable text code (W^X) cleanups.
> >
> > Highlights:
> >
> > - Remove the livepatch arch-specific .klp.arch sections, which were used
> > to do paravirt patching and alternatives patching for livepatch
> > replacement code.
> >
> > - Add support for jump labels in patched code.
>
> Re: jump labels and late-module patching support...
>
> Is there still an issue of a non-exported static key defined in a
> to-be-patched module referenced and resolved via klp-relocation when the
> livepatch module is loaded first? (Basically the same case I asked Petr
> about in his split livepatch module PoC. [1])
>
> Or should we declare this an invalid klp-relocation use case and force the
> livepatch author to use static_key_enabled()?
>
> [1] https://lore.kernel.org/lkml/[email protected]/

Right, if the static key lives in a module, then it's still not possible
for a jump label to use it. I added a check in kpatch-build to block
that case and suggest static_key_enabled() instead.

I don't know what the solution is, other than getting rid of late module
patching.

I confess I haven't looked at Petr's patches due to other distractions,
but I plan to soon.

--
Josh

2020-04-16 00:08:34

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

On Tue, Apr 14, 2020 at 02:08:14PM -0500, Josh Poimboeuf wrote:
> On Tue, Apr 14, 2020 at 08:27:26PM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 14, 2020 at 11:28:36AM -0500, Josh Poimboeuf wrote:
> > > Better late than never, these patches add simplifications and
> > > improvements for some issues Peter found six months ago, as part of his
> > > non-writable text code (W^X) cleanups.
> >
> > Excellent stuff, thanks!!
> >
> > I'll go brush up these two patches then:
> >
> > https://lkml.kernel.org/r/[email protected]
> > https://lkml.kernel.org/r/[email protected]
>
> Ah right, I meant to bring that up. I actually played around with those
> patches. While it would be nice to figure out a way to converge the
> ftrace module init, I didn't really like the first patch.

ftrace only needs it done after ftrace_module_enable(), which is before
the notifier chain happens, so we can simply do something like so
instead:

diff --git a/kernel/module.c b/kernel/module.c
index a3a8f6d0e144..89f8d02c3c3e 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3700,6 +3700,10 @@ static int prepare_coming_module(struct module *mod)
if (err)
return err;

+ module_enable_ro(mod, false);
+ module_enable_nx(mod);
+ module_enable_x(mod);
+
err = blocking_notifier_call_chain_robust(&module_notify_list,
MODULE_STATE_COMING, MODULE_STATE_GOING, mod);
err = notifier_to_errno(err);
@@ -3845,10 +3849,6 @@ static int load_module(struct load_info *info, const char __user *uargs,
if (err)
goto bug_cleanup;

- module_enable_ro(mod, false);
- module_enable_nx(mod);
- module_enable_x(mod);
-
/* Module is ready to execute: parsing args may do that. */
after_dashes = parse_args(mod->name, mod->args, mod->kp, mod->num_kp,
-32768, 32767, mod,

> It bothers me that both the notifiers and the module init() both see the
> same MODULE_STATE_COMING state, but only in the former case is the text
> writable.
>
> I think it's cognitively simpler if MODULE_STATE_COMING always means the
> same thing, like the comments imply, "fully formed" and thus
> not-writable:
>
> enum module_state {
> MODULE_STATE_LIVE, /* Normal state. */
> MODULE_STATE_COMING, /* Full formed, running module_init. */
> MODULE_STATE_GOING, /* Going away. */
> MODULE_STATE_UNFORMED, /* Still setting it up. */
> };
>
> And, it keeps tighter constraints on what a notifier can do, which is a
> good thing if we can get away with it.

Moo! -- but jump_label and static_call are on the notifier chain and I
was hoping to make it cheaper for them. Should we perhaps weane them off the
notifier and, like ftrace/klp put in explicit calls?

It'd make the error handling in prepare_coming_module() a bigger mess,
but it should work.

2020-04-16 00:15:16

by Jessica Yu

[permalink] [raw]
Subject: Re: [PATCH 6/7] livepatch: Remove module_disable_ro() usage

+++ Josh Poimboeuf [14/04/20 11:28 -0500]:
>With arch_klp_init_object_loaded() gone, and apply_relocate_add() now
>using text_poke(), livepatch no longer needs to use module_disable_ro().
>
>The text_mutex usage can also be removed -- its purpose was to protect
>against module permission change races.
>
>Signed-off-by: Josh Poimboeuf <[email protected]>
>---
> kernel/livepatch/core.c | 8 --------
> 1 file changed, 8 deletions(-)
>
>diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
>index 817676caddee..3a88639b3326 100644
>--- a/kernel/livepatch/core.c
>+++ b/kernel/livepatch/core.c
>@@ -767,10 +767,6 @@ static int klp_init_object_loaded(struct klp_patch *patch,
> struct klp_modinfo *info = patch->mod->klp_info;
>
> if (klp_is_module(obj)) {
>-
>- mutex_lock(&text_mutex);
>- module_disable_ro(patch->mod);
>-

Don't you still need the text_mutex to use text_poke() though?
(Through klp_write_relocations -> apply_relocate_add -> text_poke)
At least, I see this assertion there:

void *text_poke(void *addr, const void *opcode, size_t len)
{
lockdep_assert_held(&text_mutex);

return __text_poke(addr, opcode, len);
}

Jessica

2020-04-16 00:16:21

by Jessica Yu

[permalink] [raw]
Subject: Re: [PATCH 2/7] livepatch: Remove .klp.arch

+++ Josh Poimboeuf [14/04/20 11:28 -0500]:
[snip]
>diff --git a/Documentation/livepatch/module-elf-format.rst b/Documentation/livepatch/module-elf-format.rst
>index 2a591e6f8e6c..629ef7ffb6cf 100644
>--- a/Documentation/livepatch/module-elf-format.rst
>+++ b/Documentation/livepatch/module-elf-format.rst
>@@ -298,17 +298,7 @@ Examples:
> Note that the 'Ndx' (Section index) for these symbols is SHN_LIVEPATCH (0xff20).
> "OS" means OS-specific.
>
>-5. Architecture-specific sections
>-=================================
>-Architectures may override arch_klp_init_object_loaded() to perform
>-additional arch-specific tasks when a target module loads, such as applying
>-arch-specific sections. On x86 for example, we must apply per-object
>-.altinstructions and .parainstructions sections when a target module loads.
>-These sections must be prefixed with ".klp.arch.$objname." so that they can
>-be easily identified when iterating through a patch module's Elf sections
>-(See arch/x86/kernel/livepatch.c for a complete example).
>-
>-6. Symbol table and Elf section access
>+5. Symbol table and Elf section access

Nit: I think we need to fix the numbering in the Table of Contents too.

2020-04-16 00:19:32

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

On Wed, Apr 15, 2020 at 04:24:15PM +0200, Peter Zijlstra wrote:
> > It bothers me that both the notifiers and the module init() both see the
> > same MODULE_STATE_COMING state, but only in the former case is the text
> > writable.
> >
> > I think it's cognitively simpler if MODULE_STATE_COMING always means the
> > same thing, like the comments imply, "fully formed" and thus
> > not-writable:
> >
> > enum module_state {
> > MODULE_STATE_LIVE, /* Normal state. */
> > MODULE_STATE_COMING, /* Full formed, running module_init. */
> > MODULE_STATE_GOING, /* Going away. */
> > MODULE_STATE_UNFORMED, /* Still setting it up. */
> > };
> >
> > And, it keeps tighter constraints on what a notifier can do, which is a
> > good thing if we can get away with it.
>
> Moo! -- but jump_label and static_call are on the notifier chain and I
> was hoping to make it cheaper for them. Should we perhaps weane them off the
> notifier and, like ftrace/klp put in explicit calls?
>
> It'd make the error handling in prepare_coming_module() a bigger mess,
> but it should work.

So you're wanting to have jump labels and static_call do direct writes
instead of text pokes, right? Makes sense.

I don't feel strongly about "don't let module notifiers modify text".

But I still not a fan of the fact that COMING has two different
"states". For example, after your patch, when apply_relocate_add() is
called from klp_module_coming(), it can use memcpy(), but when called
from klp module init() it has to use text poke. But both are COMING so
there's no way to look at the module state to know which can be used.

I hate to say it, but it almost feels like another module state would be
useful.

--
Josh

2020-04-16 00:23:49

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH 6/7] livepatch: Remove module_disable_ro() usage

On Wed, Apr 15, 2020 at 05:02:16PM +0200, Jessica Yu wrote:
> +++ Josh Poimboeuf [14/04/20 11:28 -0500]:
> > With arch_klp_init_object_loaded() gone, and apply_relocate_add() now
> > using text_poke(), livepatch no longer needs to use module_disable_ro().
> >
> > The text_mutex usage can also be removed -- its purpose was to protect
> > against module permission change races.
> >
> > Signed-off-by: Josh Poimboeuf <[email protected]>
> > ---
> > kernel/livepatch/core.c | 8 --------
> > 1 file changed, 8 deletions(-)
> >
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index 817676caddee..3a88639b3326 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -767,10 +767,6 @@ static int klp_init_object_loaded(struct klp_patch *patch,
> > struct klp_modinfo *info = patch->mod->klp_info;
> >
> > if (klp_is_module(obj)) {
> > -
> > - mutex_lock(&text_mutex);
> > - module_disable_ro(patch->mod);
> > -
>
> Don't you still need the text_mutex to use text_poke() though?
> (Through klp_write_relocations -> apply_relocate_add -> text_poke)
> At least, I see this assertion there:
>
> void *text_poke(void *addr, const void *opcode, size_t len)
> {
> lockdep_assert_held(&text_mutex);
>
> return __text_poke(addr, opcode, len);
> }

Hm, guess I should have tested with lockdep ;-)

--
Josh

2020-04-16 09:00:45

by Miroslav Benes

[permalink] [raw]
Subject: Re: [PATCH 4/7] s390/module: Use s390_kernel_write() for relocations

On Tue, 14 Apr 2020, Josh Poimboeuf wrote:

> From: Peter Zijlstra <[email protected]>
>
> Instead of playing games with module_{dis,en}able_ro(), use existing
> text poking mechanisms to apply relocations after module loading.
>
> So far only x86, s390 and Power have HAVE_LIVEPATCH but only the first
> two also have STRICT_MODULE_RWX.
>
> This will allow removal of the last module_disable_ro() usage in
> livepatch. The ultimate goal is to completely disallow making
> executable mappings writable.
>
> [ jpoimboe: Split up patches. Use mod state to determine whether
> memcpy() can be used. ]
>
> Cc: [email protected]
> Cc: [email protected]
> Suggested-by: Josh Poimboeuf <[email protected]>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> Signed-off-by: Josh Poimboeuf <[email protected]>
> ---
> arch/s390/kernel/module.c | 106 ++++++++++++++++++++++----------------
> 1 file changed, 61 insertions(+), 45 deletions(-)
>
> diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
> index ba8f19bb438b..e85e378f876e 100644
> --- a/arch/s390/kernel/module.c
> +++ b/arch/s390/kernel/module.c
> @@ -174,7 +174,8 @@ int module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs,
> }
>
> static int apply_rela_bits(Elf_Addr loc, Elf_Addr val,
> - int sign, int bits, int shift)
> + int sign, int bits, int shift,
> + void (*write)(void *dest, const void *src, size_t len))
> {
> unsigned long umax;
> long min, max;
> @@ -194,26 +195,29 @@ static int apply_rela_bits(Elf_Addr loc, Elf_Addr val,
> return -ENOEXEC;
> }
>
> - if (bits == 8)
> - *(unsigned char *) loc = val;
> - else if (bits == 12)
> - *(unsigned short *) loc = (val & 0xfff) |
> + if (bits == 8) {
> + write(loc, &val, 1);
> + } else if (bits == 12) {
> + unsigned short tmp = (val & 0xfff) |
> (*(unsigned short *) loc & 0xf000);
> - else if (bits == 16)
> - *(unsigned short *) loc = val;
> - else if (bits == 20)
> - *(unsigned int *) loc = (val & 0xfff) << 16 |
> - (val & 0xff000) >> 4 |
> - (*(unsigned int *) loc & 0xf00000ff);
> - else if (bits == 32)
> - *(unsigned int *) loc = val;
> - else if (bits == 64)
> - *(unsigned long *) loc = val;
> + write(loc, &tmp, 2);
> + } else if (bits == 16) {
> + write(loc, &val, 2);
> + } else if (bits == 20) {
> + unsigned int tmp = (val & 0xfff) << 16 |
> + (val & 0xff000) >> 4 | (*(unsigned int *) loc & 0xf00000ff);
> + write(loc, &tmp, 4);
> + } else if (bits == 32) {
> + write(loc, &val, 4);
> + } else if (bits == 64) {
> + write(loc, &val, 8);
> + }
> return 0;
> }

The compiler complains about the above changes

arch/s390/kernel/module.c:199:9: warning: passing argument 1 of 'write' makes pointer from integer without a cast [-Wint-conversion]
write(loc, &val, 1);
^~~
arch/s390/kernel/module.c:199:9: note: expected 'void *' but argument is of type 'Elf64_Addr' {aka 'long long unsigned int'}

[...]

> -int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
> +static int __apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
> unsigned int symindex, unsigned int relsec,
> - struct module *me)
> + struct module *me,
> + void (*write)(void *dest, const void *src, size_t len))
> {
> Elf_Addr base;
> Elf_Sym *symtab;

You also need to update apply_rela() call site in this function. It is
missing write argument.

> @@ -437,6 +442,17 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
> return 0;
> }
>
> +int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
> + unsigned int symindex, unsigned int relsec,
> + struct module *me)
> +{
> + int ret;

ret is unused;

> + bool early = me->state == MODULE_STATE_UNFORMED;
> +
> + return __apply_relocate_add(sechdrs, strtab, symindex, relsec, me,
> + early ? memcpy : s390_kernel_write);

The compiler warns about

arch/s390/kernel/module.c: In function 'apply_relocate_add':
arch/s390/kernel/module.c:453:24: warning: pointer type mismatch in conditional expression
early ? memcpy : s390_kernel_write);

Miroslav

2020-04-16 09:30:11

by Miroslav Benes

[permalink] [raw]
Subject: Re: [PATCH 6/7] livepatch: Remove module_disable_ro() usage

On Wed, 15 Apr 2020, Josh Poimboeuf wrote:

> On Wed, Apr 15, 2020 at 05:02:16PM +0200, Jessica Yu wrote:
> > +++ Josh Poimboeuf [14/04/20 11:28 -0500]:
> > > With arch_klp_init_object_loaded() gone, and apply_relocate_add() now
> > > using text_poke(), livepatch no longer needs to use module_disable_ro().
> > >
> > > The text_mutex usage can also be removed -- its purpose was to protect
> > > against module permission change races.
> > >
> > > Signed-off-by: Josh Poimboeuf <[email protected]>
> > > ---
> > > kernel/livepatch/core.c | 8 --------
> > > 1 file changed, 8 deletions(-)
> > >
> > > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > > index 817676caddee..3a88639b3326 100644
> > > --- a/kernel/livepatch/core.c
> > > +++ b/kernel/livepatch/core.c
> > > @@ -767,10 +767,6 @@ static int klp_init_object_loaded(struct klp_patch *patch,
> > > struct klp_modinfo *info = patch->mod->klp_info;
> > >
> > > if (klp_is_module(obj)) {
> > > -
> > > - mutex_lock(&text_mutex);
> > > - module_disable_ro(patch->mod);
> > > -
> >
> > Don't you still need the text_mutex to use text_poke() though?
> > (Through klp_write_relocations -> apply_relocate_add -> text_poke)
> > At least, I see this assertion there:
> >
> > void *text_poke(void *addr, const void *opcode, size_t len)
> > {
> > lockdep_assert_held(&text_mutex);
> >
> > return __text_poke(addr, opcode, len);
> > }
>
> Hm, guess I should have tested with lockdep ;-)

:)

If I remember correctly, text_mutex must be held whenever text is modified
to prevent race due to the modification. It is not only about permission
changes even though it was how it manifested itself in our case.

Miroslav

2020-04-16 09:49:02

by Miroslav Benes

[permalink] [raw]
Subject: Re: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

On Tue, 14 Apr 2020, Josh Poimboeuf wrote:

> On Tue, Apr 14, 2020 at 08:27:26PM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 14, 2020 at 11:28:36AM -0500, Josh Poimboeuf wrote:
> > > Better late than never, these patches add simplifications and
> > > improvements for some issues Peter found six months ago, as part of his
> > > non-writable text code (W^X) cleanups.
> >
> > Excellent stuff, thanks!!
> >
> > I'll go brush up these two patches then:
> >
> > https://lkml.kernel.org/r/[email protected]
> > https://lkml.kernel.org/r/[email protected]
>
> Ah right, I meant to bring that up. I actually played around with those
> patches. While it would be nice to figure out a way to converge the
> ftrace module init, I didn't really like the first patch.
>
> It bothers me that both the notifiers and the module init() both see the
> same MODULE_STATE_COMING state, but only in the former case is the text
> writable.
>
> I think it's cognitively simpler if MODULE_STATE_COMING always means the
> same thing, like the comments imply, "fully formed" and thus
> not-writable:
>
> enum module_state {
> MODULE_STATE_LIVE, /* Normal state. */
> MODULE_STATE_COMING, /* Full formed, running module_init. */
> MODULE_STATE_GOING, /* Going away. */
> MODULE_STATE_UNFORMED, /* Still setting it up. */
> };
>
> And, it keeps tighter constraints on what a notifier can do, which is a
> good thing if we can get away with it.

Agreed.

On the other hand, the first patch would remove the tiny race window when
a module state is still UNFORMED, but the protections are (being) set up.
Patches 4/7 and 5/7 allow to use memcpy in that case, because it is early.
But it is in fact not already. I haven't checked yet if it really matters
somewhere (a race with livepatch running klp_module_coming while another
module is being loaded or anything like that).

Miroslav

2020-04-16 12:12:23

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH 4/7] s390/module: Use s390_kernel_write() for relocations

On Thu, Apr 16, 2020 at 10:56:02AM +0200, Miroslav Benes wrote:
> > + bool early = me->state == MODULE_STATE_UNFORMED;
> > +
> > + return __apply_relocate_add(sechdrs, strtab, symindex, relsec, me,
> > + early ? memcpy : s390_kernel_write);
>
> The compiler warns about
>
> arch/s390/kernel/module.c: In function 'apply_relocate_add':
> arch/s390/kernel/module.c:453:24: warning: pointer type mismatch in conditional expression
> early ? memcpy : s390_kernel_write);

Thanks, I'll get all that cleaned up.

I could have sworn I got a SUCCESS message from the kbuild bot. Does it
ignore warnings nowadays?

--
Josh

2020-04-16 12:15:17

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH 6/7] livepatch: Remove module_disable_ro() usage

On Thu, Apr 16, 2020 at 11:28:25AM +0200, Miroslav Benes wrote:
> If I remember correctly, text_mutex must be held whenever text is modified
> to prevent race due to the modification. It is not only about permission
> changes even though it was how it manifested itself in our case.

Yeah, that's what confused me. We went years without using text_mutex
and then only started using it because of a race with the permissions
changes.

--
Josh

2020-04-16 12:26:03

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

On Thu, Apr 16, 2020 at 11:45:05AM +0200, Miroslav Benes wrote:
> On Tue, 14 Apr 2020, Josh Poimboeuf wrote:
>
> > On Tue, Apr 14, 2020 at 08:27:26PM +0200, Peter Zijlstra wrote:
> > > On Tue, Apr 14, 2020 at 11:28:36AM -0500, Josh Poimboeuf wrote:
> > > > Better late than never, these patches add simplifications and
> > > > improvements for some issues Peter found six months ago, as part of his
> > > > non-writable text code (W^X) cleanups.
> > >
> > > Excellent stuff, thanks!!
> > >
> > > I'll go brush up these two patches then:
> > >
> > > https://lkml.kernel.org/r/[email protected]
> > > https://lkml.kernel.org/r/[email protected]
> >
> > Ah right, I meant to bring that up. I actually played around with those
> > patches. While it would be nice to figure out a way to converge the
> > ftrace module init, I didn't really like the first patch.
> >
> > It bothers me that both the notifiers and the module init() both see the
> > same MODULE_STATE_COMING state, but only in the former case is the text
> > writable.
> >
> > I think it's cognitively simpler if MODULE_STATE_COMING always means the
> > same thing, like the comments imply, "fully formed" and thus
> > not-writable:
> >
> > enum module_state {
> > MODULE_STATE_LIVE, /* Normal state. */
> > MODULE_STATE_COMING, /* Full formed, running module_init. */
> > MODULE_STATE_GOING, /* Going away. */
> > MODULE_STATE_UNFORMED, /* Still setting it up. */
> > };
> >
> > And, it keeps tighter constraints on what a notifier can do, which is a
> > good thing if we can get away with it.
>
> Agreed.
>
> On the other hand, the first patch would remove the tiny race window when
> a module state is still UNFORMED, but the protections are (being) set up.
> Patches 4/7 and 5/7 allow to use memcpy in that case, because it is early.
> But it is in fact not already. I haven't checked yet if it really matters
> somewhere (a race with livepatch running klp_module_coming while another
> module is being loaded or anything like that).

Maybe I'm missing your point, but I don't see any races here.

apply_relocate_add() only writes to the patch module's text, so there
can't be races with other modules.

--
Josh

2020-04-16 15:34:15

by Jessica Yu

[permalink] [raw]
Subject: Re: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

+++ Josh Poimboeuf [15/04/20 11:17 -0500]:
>On Wed, Apr 15, 2020 at 04:24:15PM +0200, Peter Zijlstra wrote:
>> > It bothers me that both the notifiers and the module init() both see the
>> > same MODULE_STATE_COMING state, but only in the former case is the text
>> > writable.
>> >
>> > I think it's cognitively simpler if MODULE_STATE_COMING always means the
>> > same thing, like the comments imply, "fully formed" and thus
>> > not-writable:
>> >
>> > enum module_state {
>> > MODULE_STATE_LIVE, /* Normal state. */
>> > MODULE_STATE_COMING, /* Full formed, running module_init. */
>> > MODULE_STATE_GOING, /* Going away. */
>> > MODULE_STATE_UNFORMED, /* Still setting it up. */
>> > };
>> >
>> > And, it keeps tighter constraints on what a notifier can do, which is a
>> > good thing if we can get away with it.
>>
>> Moo! -- but jump_label and static_call are on the notifier chain and I
>> was hoping to make it cheaper for them. Should we perhaps weane them off the
>> notifier and, like ftrace/klp put in explicit calls?
>>
>> It'd make the error handling in prepare_coming_module() a bigger mess,
>> but it should work.
>
>So you're wanting to have jump labels and static_call do direct writes
>instead of text pokes, right? Makes sense.
>
>I don't feel strongly about "don't let module notifiers modify text".
>
>But I still not a fan of the fact that COMING has two different
>"states". For example, after your patch, when apply_relocate_add() is
>called from klp_module_coming(), it can use memcpy(), but when called
>from klp module init() it has to use text poke. But both are COMING so
>there's no way to look at the module state to know which can be used.

This is a good observation, thanks for bringing it up. I agree that we
should strive to be consistent with what the module states mean. In my
head, I think it is easiest to assume/establish the following meanings
for each module state:

MODULE_STATE_UNFORMED - no protections. relocations, alternatives,
ftrace module initialization, etc. any other text modifications are
in the process of being applied. Direct writes are permissible.

MODULE_STATE_COMING - module fully formed, text modifications are
done, protections applied, module is ready to execute init or is
executing init.

I wonder if we could enforce the meaning of these two states more
consistently without needing to add another module state.

Regarding Peter's patches, with the set_all_modules_text_*() api gone,
and ftrace reliance on MODULE_STATE_COMING gone (I think?), is there
anything preventing ftrace_module_init+enable from being called
earlier (i.e., before complete_formation()) while the module is
unformed? Then you don't have to move module_enable_ro/nx later and we
keep the MODULE_STATE_COMING semantics. And if we're enforcing the
above module state meanings, I would also be OK with moving jump_label
and static_call out of the coming notifier chain and making them
explicit calls while the module is still writable.

Sorry in advance if I missed anything above, I'm still trying to wrap
my head around which callers need what module state and what module
permissions :/

Jessica

2020-04-16 17:50:22

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH 4/7] s390/module: Use s390_kernel_write() for relocations

On Thu, Apr 16, 2020 at 07:06:51AM -0500, Josh Poimboeuf wrote:
> On Thu, Apr 16, 2020 at 10:56:02AM +0200, Miroslav Benes wrote:
> > > + bool early = me->state == MODULE_STATE_UNFORMED;
> > > +
> > > + return __apply_relocate_add(sechdrs, strtab, symindex, relsec, me,
> > > + early ? memcpy : s390_kernel_write);
> >
> > The compiler warns about
> >
> > arch/s390/kernel/module.c: In function 'apply_relocate_add':
> > arch/s390/kernel/module.c:453:24: warning: pointer type mismatch in conditional expression
> > early ? memcpy : s390_kernel_write);
>
> Thanks, I'll get all that cleaned up.
>
> I could have sworn I got a SUCCESS message from the kbuild bot. Does it
> ignore warnings nowadays?

Here's a fix on top of the original patch.

I changed s390_kernel_write() to return "void *" to match memcpy()
(probably a separate patch).

I also grabbed the text_mutex for the !early case in
apply_relocate_add() -- will do something similar for x86.

Will try to test this on a 390 box.


diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
index a470f1fa9f2a..324438889fe1 100644
--- a/arch/s390/include/asm/uaccess.h
+++ b/arch/s390/include/asm/uaccess.h
@@ -276,6 +276,6 @@ static inline unsigned long __must_check clear_user(void __user *to, unsigned lo
}

int copy_to_user_real(void __user *dest, void *src, unsigned long count);
-void s390_kernel_write(void *dst, const void *src, size_t size);
+void *s390_kernel_write(void *dst, const void *src, size_t size);

#endif /* __S390_UACCESS_H */
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
index e85e378f876e..2b30ed0ce14f 100644
--- a/arch/s390/kernel/module.c
+++ b/arch/s390/kernel/module.c
@@ -19,6 +19,7 @@
#include <linux/kasan.h>
#include <linux/moduleloader.h>
#include <linux/bug.h>
+#include <linux/memory.h>
#include <asm/alternative.h>
#include <asm/nospec-branch.h>
#include <asm/facility.h>
@@ -175,10 +176,11 @@ int module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs,

static int apply_rela_bits(Elf_Addr loc, Elf_Addr val,
int sign, int bits, int shift,
- void (*write)(void *dest, const void *src, size_t len))
+ void *(*write)(void *dest, const void *src, size_t len))
{
unsigned long umax;
long min, max;
+ void *dest = (void *)loc;

if (val & ((1UL << shift) - 1))
return -ENOEXEC;
@@ -196,28 +198,28 @@ static int apply_rela_bits(Elf_Addr loc, Elf_Addr val,
}

if (bits == 8) {
- write(loc, &val, 1);
+ write(dest, &val, 1);
} else if (bits == 12) {
unsigned short tmp = (val & 0xfff) |
(*(unsigned short *) loc & 0xf000);
- write(loc, &tmp, 2);
+ write(dest, &tmp, 2);
} else if (bits == 16) {
- write(loc, &val, 2);
+ write(dest, &val, 2);
} else if (bits == 20) {
unsigned int tmp = (val & 0xfff) << 16 |
(val & 0xff000) >> 4 | (*(unsigned int *) loc & 0xf00000ff);
- write(loc, &tmp, 4);
+ write(dest, &tmp, 4);
} else if (bits == 32) {
- write(loc, &val, 4);
+ write(dest, &val, 4);
} else if (bits == 64) {
- write(loc, &val, 8);
+ write(dest, &val, 8);
}
return 0;
}

static int apply_rela(Elf_Rela *rela, Elf_Addr base, Elf_Sym *symtab,
const char *strtab, struct module *me,
- void (*write)(void *dest, const void *src, size_t len))
+ void *(*write)(void *dest, const void *src, size_t len))
{
struct mod_arch_syminfo *info;
Elf_Addr loc, val;
@@ -419,7 +421,7 @@ static int apply_rela(Elf_Rela *rela, Elf_Addr base, Elf_Sym *symtab,
static int __apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
unsigned int symindex, unsigned int relsec,
struct module *me,
- void (*write)(void *dest, const void *src, size_t len))
+ void *(*write)(void *dest, const void *src, size_t len))
{
Elf_Addr base;
Elf_Sym *symtab;
@@ -435,7 +437,7 @@ static int __apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
n = sechdrs[relsec].sh_size / sizeof(Elf_Rela);

for (i = 0; i < n; i++, rela++) {
- rc = apply_rela(rela, base, symtab, strtab, me);
+ rc = apply_rela(rela, base, symtab, strtab, me, write);
if (rc)
return rc;
}
@@ -449,8 +451,16 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
int ret;
bool early = me->state == MODULE_STATE_UNFORMED;

- return __apply_relocate_add(sechdrs, strtab, symindex, relsec, me,
- early ? memcpy : s390_kernel_write);
+ if (!early)
+ mutex_lock(&text_mutex);
+
+ ret = __apply_relocate_add(sechdrs, strtab, symindex, relsec, me,
+ early ? memcpy : s390_kernel_write);
+
+ if (!early)
+ mutex_unlock(&text_mutex);
+
+ return ret;
}

int module_finalize(const Elf_Ehdr *hdr,
diff --git a/arch/s390/mm/maccess.c b/arch/s390/mm/maccess.c
index de7ca4b6718f..22a0be655f27 100644
--- a/arch/s390/mm/maccess.c
+++ b/arch/s390/mm/maccess.c
@@ -55,19 +55,22 @@ static notrace long s390_kernel_write_odd(void *dst, const void *src, size_t siz
*/
static DEFINE_SPINLOCK(s390_kernel_write_lock);

-void notrace s390_kernel_write(void *dst, const void *src, size_t size)
+notrace void *s390_kernel_write(void *dst, const void *src, size_t size)
{
+ void *tmp = dst;
unsigned long flags;
long copied;

spin_lock_irqsave(&s390_kernel_write_lock, flags);
while (size) {
- copied = s390_kernel_write_odd(dst, src, size);
- dst += copied;
+ copied = s390_kernel_write_odd(tmp, src, size);
+ tmp += copied;
src += copied;
size -= copied;
}
spin_unlock_irqrestore(&s390_kernel_write_lock, flags);
+
+ return dst;
}

static int __no_sanitize_address __memcpy_real(void *dest, void *src, size_t count)

2020-04-16 20:37:33

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

On Thu, Apr 16, 2020 at 05:31:31PM +0200, Jessica Yu wrote:
> > But I still not a fan of the fact that COMING has two different
> > "states". For example, after your patch, when apply_relocate_add() is
> > called from klp_module_coming(), it can use memcpy(), but when called
> > from klp module init() it has to use text poke. But both are COMING so
> > there's no way to look at the module state to know which can be used.
>
> This is a good observation, thanks for bringing it up. I agree that we
> should strive to be consistent with what the module states mean. In my
> head, I think it is easiest to assume/establish the following meanings
> for each module state:
>
> MODULE_STATE_UNFORMED - no protections. relocations, alternatives,
> ftrace module initialization, etc. any other text modifications are
> in the process of being applied. Direct writes are permissible.
>
> MODULE_STATE_COMING - module fully formed, text modifications are
> done, protections applied, module is ready to execute init or is
> executing init.
>
> I wonder if we could enforce the meaning of these two states more
> consistently without needing to add another module state.
>
> Regarding Peter's patches, with the set_all_modules_text_*() api gone,
> and ftrace reliance on MODULE_STATE_COMING gone (I think?), is there
> anything preventing ftrace_module_init+enable from being called
> earlier (i.e., before complete_formation()) while the module is
> unformed? Then you don't have to move module_enable_ro/nx later and we
> keep the MODULE_STATE_COMING semantics. And if we're enforcing the
> above module state meanings, I would also be OK with moving jump_label
> and static_call out of the coming notifier chain and making them
> explicit calls while the module is still writable.
>
> Sorry in advance if I missed anything above, I'm still trying to wrap
> my head around which callers need what module state and what module
> permissions :/

Sounds reasonable to me...

BTW, instead of hard-coding the jump-label/static-call/ftrace calls, we
could instead call notifiers with MODULE_STATE_UNFORMED.

--
Josh

2020-04-17 01:38:56

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH 4/7] s390/module: Use s390_kernel_write() for relocations

On Thu, Apr 16, 2020 at 08:16:35AM -0500, Josh Poimboeuf wrote:
> On Thu, Apr 16, 2020 at 07:06:51AM -0500, Josh Poimboeuf wrote:
> > On Thu, Apr 16, 2020 at 10:56:02AM +0200, Miroslav Benes wrote:
> > > > + bool early = me->state == MODULE_STATE_UNFORMED;
> > > > +
> > > > + return __apply_relocate_add(sechdrs, strtab, symindex, relsec, me,
> > > > + early ? memcpy : s390_kernel_write);
> > >
> > > The compiler warns about
> > >
> > > arch/s390/kernel/module.c: In function 'apply_relocate_add':
> > > arch/s390/kernel/module.c:453:24: warning: pointer type mismatch in conditional expression
> > > early ? memcpy : s390_kernel_write);
> >
> > Thanks, I'll get all that cleaned up.
> >
> > I could have sworn I got a SUCCESS message from the kbuild bot. Does it
> > ignore warnings nowadays?
>
> Here's a fix on top of the original patch.
>
> I changed s390_kernel_write() to return "void *" to match memcpy()
> (probably a separate patch).
>
> I also grabbed the text_mutex for the !early case in
> apply_relocate_add() -- will do something similar for x86.
>
> Will try to test this on a 390 box.

...and that borked the box pretty nicely. Oops, big endian! Need
something like this on top.

Sorry about not testing the patch in the first place, it looked trivial
and somehow I was thinking Peter writes exclusively bug-free code.

diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
index ee0904a23e24..513e640430ae 100644
--- a/arch/s390/kernel/module.c
+++ b/arch/s390/kernel/module.c
@@ -198,21 +198,25 @@ static int apply_rela_bits(Elf_Addr loc, Elf_Addr val,
}

if (bits == 8) {
- write(dest, &val, 1);
+ unsigned char tmp = val;
+ write(dest, &tmp, 1);
} else if (bits == 12) {
unsigned short tmp = (val & 0xfff) |
(*(unsigned short *) loc & 0xf000);
write(dest, &tmp, 2);
} else if (bits == 16) {
- write(dest, &val, 2);
+ unsigned short tmp = val;
+ write(dest, &tmp, 2);
} else if (bits == 20) {
unsigned int tmp = (val & 0xfff) << 16 |
(val & 0xff000) >> 4 | (*(unsigned int *) loc & 0xf00000ff);
write(dest, &tmp, 4);
} else if (bits == 32) {
- write(dest, &val, 4);
+ unsigned int tmp = val;
+ write(dest, &tmp, 4);
} else if (bits == 64) {
- write(dest, &val, 8);
+ unsigned long tmp = val;
+ write(dest, &tmp, 8);
}
return 0;
}

--
Josh

2020-04-17 08:29:02

by Miroslav Benes

[permalink] [raw]
Subject: Re: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

On Thu, 16 Apr 2020, Josh Poimboeuf wrote:

> On Thu, Apr 16, 2020 at 05:31:31PM +0200, Jessica Yu wrote:
> > > But I still not a fan of the fact that COMING has two different
> > > "states". For example, after your patch, when apply_relocate_add() is
> > > called from klp_module_coming(), it can use memcpy(), but when called
> > > from klp module init() it has to use text poke. But both are COMING so
> > > there's no way to look at the module state to know which can be used.
> >
> > This is a good observation, thanks for bringing it up. I agree that we
> > should strive to be consistent with what the module states mean. In my
> > head, I think it is easiest to assume/establish the following meanings
> > for each module state:
> >
> > MODULE_STATE_UNFORMED - no protections. relocations, alternatives,
> > ftrace module initialization, etc. any other text modifications are
> > in the process of being applied. Direct writes are permissible.
> >
> > MODULE_STATE_COMING - module fully formed, text modifications are
> > done, protections applied, module is ready to execute init or is
> > executing init.
> >
> > I wonder if we could enforce the meaning of these two states more
> > consistently without needing to add another module state.
> >
> > Regarding Peter's patches, with the set_all_modules_text_*() api gone,
> > and ftrace reliance on MODULE_STATE_COMING gone (I think?), is there
> > anything preventing ftrace_module_init+enable from being called
> > earlier (i.e., before complete_formation()) while the module is
> > unformed? Then you don't have to move module_enable_ro/nx later and we
> > keep the MODULE_STATE_COMING semantics. And if we're enforcing the
> > above module state meanings, I would also be OK with moving jump_label
> > and static_call out of the coming notifier chain and making them
> > explicit calls while the module is still writable.
> >
> > Sorry in advance if I missed anything above, I'm still trying to wrap
> > my head around which callers need what module state and what module
> > permissions :/
>
> Sounds reasonable to me...
>
> BTW, instead of hard-coding the jump-label/static-call/ftrace calls, we
> could instead call notifiers with MODULE_STATE_UNFORMED.

That was exactly what I was thinking about too while reading Jessica's
email. Since (hopefully all if I remember correctly. I checked only
random subset now) existing module notifiers check module state,
it should not be a problem.

Miroslav

2020-04-17 08:52:55

by Jessica Yu

[permalink] [raw]
Subject: Re: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

+++ Miroslav Benes [17/04/20 10:27 +0200]:
>On Thu, 16 Apr 2020, Josh Poimboeuf wrote:
>
>> On Thu, Apr 16, 2020 at 05:31:31PM +0200, Jessica Yu wrote:
>> > > But I still not a fan of the fact that COMING has two different
>> > > "states". For example, after your patch, when apply_relocate_add() is
>> > > called from klp_module_coming(), it can use memcpy(), but when called
>> > > from klp module init() it has to use text poke. But both are COMING so
>> > > there's no way to look at the module state to know which can be used.
>> >
>> > This is a good observation, thanks for bringing it up. I agree that we
>> > should strive to be consistent with what the module states mean. In my
>> > head, I think it is easiest to assume/establish the following meanings
>> > for each module state:
>> >
>> > MODULE_STATE_UNFORMED - no protections. relocations, alternatives,
>> > ftrace module initialization, etc. any other text modifications are
>> > in the process of being applied. Direct writes are permissible.
>> >
>> > MODULE_STATE_COMING - module fully formed, text modifications are
>> > done, protections applied, module is ready to execute init or is
>> > executing init.
>> >
>> > I wonder if we could enforce the meaning of these two states more
>> > consistently without needing to add another module state.
>> >
>> > Regarding Peter's patches, with the set_all_modules_text_*() api gone,
>> > and ftrace reliance on MODULE_STATE_COMING gone (I think?), is there
>> > anything preventing ftrace_module_init+enable from being called
>> > earlier (i.e., before complete_formation()) while the module is
>> > unformed? Then you don't have to move module_enable_ro/nx later and we
>> > keep the MODULE_STATE_COMING semantics. And if we're enforcing the
>> > above module state meanings, I would also be OK with moving jump_label
>> > and static_call out of the coming notifier chain and making them
>> > explicit calls while the module is still writable.
>> >
>> > Sorry in advance if I missed anything above, I'm still trying to wrap
>> > my head around which callers need what module state and what module
>> > permissions :/
>>
>> Sounds reasonable to me...
>>
>> BTW, instead of hard-coding the jump-label/static-call/ftrace calls, we
>> could instead call notifiers with MODULE_STATE_UNFORMED.
>
>That was exactly what I was thinking about too while reading Jessica's
>email. Since (hopefully all if I remember correctly. I checked only
>random subset now) existing module notifiers check module state,
>it should not be a problem.

Agreed, especially with the growing number of callers now that want to
access the module while it is still writable, it seems reasonable.
IIRC, the module notifiers I looked at too checked the module state
value appropriately, so I think we are fine as well (thanks for checking!)

Jessica

2020-04-17 09:12:41

by Miroslav Benes

[permalink] [raw]
Subject: Re: [PATCH 0/7] livepatch,module: Remove .klp.arch and module_disable_ro()

On Thu, 16 Apr 2020, Josh Poimboeuf wrote:

> On Thu, Apr 16, 2020 at 11:45:05AM +0200, Miroslav Benes wrote:
> > On Tue, 14 Apr 2020, Josh Poimboeuf wrote:
> >
> > > On Tue, Apr 14, 2020 at 08:27:26PM +0200, Peter Zijlstra wrote:
> > > > On Tue, Apr 14, 2020 at 11:28:36AM -0500, Josh Poimboeuf wrote:
> > > > > Better late than never, these patches add simplifications and
> > > > > improvements for some issues Peter found six months ago, as part of his
> > > > > non-writable text code (W^X) cleanups.
> > > >
> > > > Excellent stuff, thanks!!
> > > >
> > > > I'll go brush up these two patches then:
> > > >
> > > > https://lkml.kernel.org/r/[email protected]
> > > > https://lkml.kernel.org/r/[email protected]
> > >
> > > Ah right, I meant to bring that up. I actually played around with those
> > > patches. While it would be nice to figure out a way to converge the
> > > ftrace module init, I didn't really like the first patch.
> > >
> > > It bothers me that both the notifiers and the module init() both see the
> > > same MODULE_STATE_COMING state, but only in the former case is the text
> > > writable.
> > >
> > > I think it's cognitively simpler if MODULE_STATE_COMING always means the
> > > same thing, like the comments imply, "fully formed" and thus
> > > not-writable:
> > >
> > > enum module_state {
> > > MODULE_STATE_LIVE, /* Normal state. */
> > > MODULE_STATE_COMING, /* Full formed, running module_init. */
> > > MODULE_STATE_GOING, /* Going away. */
> > > MODULE_STATE_UNFORMED, /* Still setting it up. */
> > > };
> > >
> > > And, it keeps tighter constraints on what a notifier can do, which is a
> > > good thing if we can get away with it.
> >
> > Agreed.
> >
> > On the other hand, the first patch would remove the tiny race window when
> > a module state is still UNFORMED, but the protections are (being) set up.
> > Patches 4/7 and 5/7 allow to use memcpy in that case, because it is early.
> > But it is in fact not already. I haven't checked yet if it really matters
> > somewhere (a race with livepatch running klp_module_coming while another
> > module is being loaded or anything like that).
>
> Maybe I'm missing your point, but I don't see any races here.
>
> apply_relocate_add() only writes to the patch module's text, so there
> can't be races with other modules.

I meant... a patch module is being loaded and at the same time a
to-be-patched module too. So apply_relocate_add() called from
klp_module_coming() would see UNFORMED in the patch module state and the
permissions would be set up already. So memcpy would not work. But we
protect against that (and many other things) by taking klp_mutex, of
course. I managed to confuse myself again, sorry about that.

Miroslav