2022-09-01 17:45:15

by Song Liu

[permalink] [raw]
Subject: [PATCH v6] livepatch: Clear relocation targets on a module removal

From: Miroslav Benes <[email protected]>

Josh reported a bug:

When the object to be patched is a module, and that module is
rmmod'ed and reloaded, it fails to load with:

module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'

The livepatch module has a relocation which references a symbol
in the _previous_ loading of nfsd. When apply_relocate_add()
tries to replace the old relocation with a new one, it sees that
the previous one is nonzero and it errors out.

On ppc64le, we have a similar issue:

module_64: livepatch_nfsd: Expected nop after call, got e8410018 at e_show+0x60/0x548 [livepatch_nfsd]
livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'

He also proposed three different solutions. We could remove the error
check in apply_relocate_add() introduced by commit eda9cec4c9a1
("x86/module: Detect and skip invalid relocations"). However the check
is useful for detecting corrupted modules.

We could also deny the patched modules to be removed. If it proved to be
a major drawback for users, we could still implement a different
approach. The solution would also complicate the existing code a lot.

We thus decided to reverse the relocation patching (clear all relocation
targets on x86_64). The solution is not
universal and is too much arch-specific, but it may prove to be simpler
in the end.

Reported-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Miroslav Benes <[email protected]>
Signed-off-by: Song Liu <[email protected]>

---

NOTE: powerpc32 code is only compile tested.

Changes v5 = v6:
1. Fix powerpc64.
2. Fix compile for powerpc32.

Changes v4 = v5:
1. Fix compile with powerpc.

Changes v3 = v4:
1. Reuse __apply_relocate_add to make it more reliable in long term.
(Josh Poimboeuf)
2. Add back ppc64 logic from v2, with changes to match current code.
(Josh Poimboeuf)

Changes v2 => v3:
1. Rewrite x86 changes to match current code style.
2. Remove powerpc changes as there is no test coverage in v3.
3. Only keep 1/3 of v2.

v2: https://lore.kernel.org/all/[email protected]/T/#u
---
arch/powerpc/kernel/module_32.c | 10 ++++
arch/powerpc/kernel/module_64.c | 49 +++++++++++++++
arch/s390/kernel/module.c | 8 +++
arch/x86/kernel/module.c | 102 +++++++++++++++++++++++---------
include/linux/moduleloader.h | 7 +++
kernel/livepatch/core.c | 41 ++++++++++++-
6 files changed, 189 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/kernel/module_32.c b/arch/powerpc/kernel/module_32.c
index ea6536171778..e3c312770453 100644
--- a/arch/powerpc/kernel/module_32.c
+++ b/arch/powerpc/kernel/module_32.c
@@ -285,6 +285,16 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,
return 0;
}

+#ifdef CONFIG_LIVEPATCH
+void clear_relocate_add(Elf32_Shdr *sechdrs,
+ const char *strtab,
+ unsigned int symindex,
+ unsigned int relsec,
+ struct module *me)
+{
+}
+#endif
+
#ifdef CONFIG_DYNAMIC_FTRACE
notrace int module_trampoline_target(struct module *mod, unsigned long addr,
unsigned long *target)
diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index 7e45dc98df8a..514951f97391 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -739,6 +739,55 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
return 0;
}

+#ifdef CONFIG_LIVEPATCH
+void clear_relocate_add(Elf64_Shdr *sechdrs,
+ const char *strtab,
+ unsigned int symindex,
+ unsigned int relsec,
+ struct module *me)
+{
+ unsigned int i;
+ Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
+ Elf64_Sym *sym;
+ unsigned long *location;
+ const char *symname;
+ u32 *instruction;
+
+ pr_debug("Clearing ADD relocate section %u to %u\n", relsec,
+ sechdrs[relsec].sh_info);
+
+ for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
+ location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
+ + rela[i].r_offset;
+ sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
+ + ELF64_R_SYM(rela[i].r_info);
+ symname = me->core_kallsyms.strtab
+ + sym->st_name;
+
+ if (ELF64_R_TYPE(rela[i].r_info) != R_PPC_REL24)
+ continue;
+ /*
+ * reverse the operations in apply_relocate_add() for case
+ * R_PPC_REL24.
+ */
+ if (sym->st_shndx != SHN_UNDEF &&
+ sym->st_shndx != SHN_LIVEPATCH)
+ continue;
+
+ instruction = (u32 *)location;
+ if (is_mprofile_ftrace_call(symname))
+ continue;
+
+ if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
+ continue;
+
+ instruction += 1;
+ patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
+ }
+
+}
+#endif
+
#ifdef CONFIG_DYNAMIC_FTRACE
int module_trampoline_target(struct module *mod, unsigned long addr,
unsigned long *target)
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
index 2d159b32885b..cc6784fbc1ac 100644
--- a/arch/s390/kernel/module.c
+++ b/arch/s390/kernel/module.c
@@ -500,6 +500,14 @@ static int module_alloc_ftrace_hotpatch_trampolines(struct module *me,
}
#endif /* CONFIG_FUNCTION_TRACER */

+#ifdef CONFIG_LIVEPATCH
+void clear_relocate_add(Elf64_Shdr *sechdrs, const char *strtab,
+ unsigned int symindex, unsigned int relsec,
+ struct module *me)
+{
+}
+#endif
+
int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
struct module *me)
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index b1abf663417c..f9632afbb84c 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -128,18 +128,20 @@ int apply_relocate(Elf32_Shdr *sechdrs,
return 0;
}
#else /*X86_64*/
-static int __apply_relocate_add(Elf64_Shdr *sechdrs,
+static int __apply_clear_relocate_add(Elf64_Shdr *sechdrs,
const char *strtab,
unsigned int symindex,
unsigned int relsec,
struct module *me,
- void *(*write)(void *dest, const void *src, size_t len))
+ void *(*write)(void *dest, const void *src, size_t len),
+ bool clear)
{
unsigned int i;
Elf64_Rela *rel = (void *)sechdrs[relsec].sh_addr;
Elf64_Sym *sym;
void *loc;
u64 val;
+ u64 zero = 0ULL;

DEBUGP("Applying relocate section %u to %u\n",
relsec, sechdrs[relsec].sh_info);
@@ -163,40 +165,60 @@ static int __apply_relocate_add(Elf64_Shdr *sechdrs,
case R_X86_64_NONE:
break;
case R_X86_64_64:
- if (*(u64 *)loc != 0)
- goto invalid_relocation;
- write(loc, &val, 8);
+ if (!clear) {
+ if (*(u64 *)loc != 0)
+ goto invalid_relocation;
+ write(loc, &val, 8);
+ } else {
+ write(loc, &zero, 8);
+ }
break;
case R_X86_64_32:
- if (*(u32 *)loc != 0)
- goto invalid_relocation;
- write(loc, &val, 4);
- if (val != *(u32 *)loc)
- goto overflow;
+ if (!clear) {
+ if (*(u32 *)loc != 0)
+ goto invalid_relocation;
+ write(loc, &val, 4);
+ if (val != *(u32 *)loc)
+ goto overflow;
+ } else {
+ write(loc, &zero, 4);
+ }
break;
case R_X86_64_32S:
- if (*(s32 *)loc != 0)
- goto invalid_relocation;
- write(loc, &val, 4);
- if ((s64)val != *(s32 *)loc)
- goto overflow;
+ if (!clear) {
+ if (*(s32 *)loc != 0)
+ goto invalid_relocation;
+ write(loc, &val, 4);
+ if ((s64)val != *(s32 *)loc)
+ goto overflow;
+ } else {
+ write(loc, &zero, 4);
+ }
break;
case R_X86_64_PC32:
case R_X86_64_PLT32:
- if (*(u32 *)loc != 0)
- goto invalid_relocation;
- val -= (u64)loc;
- write(loc, &val, 4);
+ if (!clear) {
+ if (*(u32 *)loc != 0)
+ goto invalid_relocation;
+ val -= (u64)loc;
+ write(loc, &val, 4);
#if 0
- if ((s64)val != *(s32 *)loc)
- goto overflow;
+ if ((s64)val != *(s32 *)loc)
+ goto overflow;
#endif
+ } else {
+ write(loc, &zero, 4);
+ }
break;
case R_X86_64_PC64:
- if (*(u64 *)loc != 0)
- goto invalid_relocation;
- val -= (u64)loc;
- write(loc, &val, 8);
+ if (!clear) {
+ if (*(u64 *)loc != 0)
+ goto invalid_relocation;
+ val -= (u64)loc;
+ write(loc, &val, 8);
+ } else {
+ write(loc, &zero, 8);
+ }
break;
default:
pr_err("%s: Unknown rela relocation: %llu\n",
@@ -234,8 +256,8 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
mutex_lock(&text_mutex);
}

- ret = __apply_relocate_add(sechdrs, strtab, symindex, relsec, me,
- write);
+ ret = __apply_clear_relocate_add(sechdrs, strtab, symindex, relsec, me,
+ write, false /* clear */);

if (!early) {
text_poke_sync();
@@ -245,6 +267,32 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
return ret;
}

+#ifdef CONFIG_LIVEPATCH
+
+void clear_relocate_add(Elf64_Shdr *sechdrs,
+ const char *strtab,
+ unsigned int symindex,
+ unsigned int relsec,
+ struct module *me)
+{
+ bool early = me->state == MODULE_STATE_UNFORMED;
+ void *(*write)(void *, const void *, size_t) = memcpy;
+
+ if (!early) {
+ write = text_poke;
+ mutex_lock(&text_mutex);
+ }
+
+ __apply_clear_relocate_add(sechdrs, strtab, symindex, relsec, me,
+ write, true /* clear */);
+
+ if (!early) {
+ text_poke_sync();
+ mutex_unlock(&text_mutex);
+ }
+}
+#endif
+
#endif

int module_finalize(const Elf_Ehdr *hdr,
diff --git a/include/linux/moduleloader.h b/include/linux/moduleloader.h
index 9e09d11ffe5b..958e6da7f475 100644
--- a/include/linux/moduleloader.h
+++ b/include/linux/moduleloader.h
@@ -72,6 +72,13 @@ int apply_relocate_add(Elf_Shdr *sechdrs,
unsigned int symindex,
unsigned int relsec,
struct module *mod);
+#ifdef CONFIG_LIVEPATCH
+void clear_relocate_add(Elf_Shdr *sechdrs,
+ const char *strtab,
+ unsigned int symindex,
+ unsigned int relsec,
+ struct module *me);
+#endif
#else
static inline int apply_relocate_add(Elf_Shdr *sechdrs,
const char *strtab,
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index bc475e62279d..5c0d8a4eba13 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -316,6 +316,45 @@ int klp_apply_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
return apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
}

+static void klp_clear_object_relocations(struct module *pmod,
+ struct klp_object *obj)
+{
+ int i, cnt;
+ const char *objname, *secname;
+ char sec_objname[MODULE_NAME_LEN];
+ Elf_Shdr *sec;
+
+ objname = klp_is_module(obj) ? obj->name : "vmlinux";
+
+ /* For each klp relocation section */
+ for (i = 1; i < pmod->klp_info->hdr.e_shnum; i++) {
+ sec = pmod->klp_info->sechdrs + i;
+ secname = pmod->klp_info->secstrings + sec->sh_name;
+ if (!(sec->sh_flags & SHF_RELA_LIVEPATCH))
+ continue;
+
+ /*
+ * Format: .klp.rela.sec_objname.section_name
+ * See comment in klp_resolve_symbols() for an explanation
+ * of the selected field width value.
+ */
+ secname = pmod->klp_info->secstrings + sec->sh_name;
+ cnt = sscanf(secname, ".klp.rela.%55[^.]", sec_objname);
+ if (cnt != 1) {
+ pr_err("section %s has an incorrectly formatted name\n",
+ secname);
+ continue;
+ }
+
+ if (strcmp(objname, sec_objname))
+ continue;
+
+ clear_relocate_add(pmod->klp_info->sechdrs,
+ pmod->core_kallsyms.strtab,
+ pmod->klp_info->symndx, i, pmod);
+ }
+}
+
/*
* Sysfs Interface
*
@@ -1154,7 +1193,7 @@ static void klp_cleanup_module_patches_limited(struct module *mod,
klp_unpatch_object(obj);

klp_post_unpatch_callback(obj);
-
+ klp_clear_object_relocations(patch->mod, obj);
klp_free_object_loaded(obj);
break;
}
--
2.30.2


2022-11-17 22:21:21

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

Hi folks,

It seems we forgot about this work... What shall we do to move forward?

Thanks,
Song

On Thu, Sep 1, 2022 at 10:16 AM Song Liu <[email protected]> wrote:
>
> From: Miroslav Benes <[email protected]>
>
> Josh reported a bug:
>
> When the object to be patched is a module, and that module is
> rmmod'ed and reloaded, it fails to load with:
>
> module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
> livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
>
> The livepatch module has a relocation which references a symbol
> in the _previous_ loading of nfsd. When apply_relocate_add()
> tries to replace the old relocation with a new one, it sees that
> the previous one is nonzero and it errors out.
>
> On ppc64le, we have a similar issue:
>
> module_64: livepatch_nfsd: Expected nop after call, got e8410018 at e_show+0x60/0x548 [livepatch_nfsd]
> livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
>
> He also proposed three different solutions. We could remove the error
> check in apply_relocate_add() introduced by commit eda9cec4c9a1
> ("x86/module: Detect and skip invalid relocations"). However the check
> is useful for detecting corrupted modules.
>
> We could also deny the patched modules to be removed. If it proved to be
> a major drawback for users, we could still implement a different
> approach. The solution would also complicate the existing code a lot.
>
> We thus decided to reverse the relocation patching (clear all relocation
> targets on x86_64). The solution is not
> universal and is too much arch-specific, but it may prove to be simpler
> in the end.
>
> Reported-by: Josh Poimboeuf <[email protected]>
> Signed-off-by: Miroslav Benes <[email protected]>
> Signed-off-by: Song Liu <[email protected]>
>
> ---
>
> NOTE: powerpc32 code is only compile tested.
>
> Changes v5 = v6:
> 1. Fix powerpc64.
> 2. Fix compile for powerpc32.
>
> Changes v4 = v5:
> 1. Fix compile with powerpc.
>
> Changes v3 = v4:
> 1. Reuse __apply_relocate_add to make it more reliable in long term.
> (Josh Poimboeuf)
> 2. Add back ppc64 logic from v2, with changes to match current code.
> (Josh Poimboeuf)
>
> Changes v2 => v3:
> 1. Rewrite x86 changes to match current code style.
> 2. Remove powerpc changes as there is no test coverage in v3.
> 3. Only keep 1/3 of v2.
>
> v2: https://lore.kernel.org/all/[email protected]/T/#u
> ---
> arch/powerpc/kernel/module_32.c | 10 ++++
> arch/powerpc/kernel/module_64.c | 49 +++++++++++++++
> arch/s390/kernel/module.c | 8 +++
> arch/x86/kernel/module.c | 102 +++++++++++++++++++++++---------
> include/linux/moduleloader.h | 7 +++
> kernel/livepatch/core.c | 41 ++++++++++++-
> 6 files changed, 189 insertions(+), 28 deletions(-)
>
> diff --git a/arch/powerpc/kernel/module_32.c b/arch/powerpc/kernel/module_32.c
> index ea6536171778..e3c312770453 100644
> --- a/arch/powerpc/kernel/module_32.c
> +++ b/arch/powerpc/kernel/module_32.c
> @@ -285,6 +285,16 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,
> return 0;
> }
>
> +#ifdef CONFIG_LIVEPATCH
> +void clear_relocate_add(Elf32_Shdr *sechdrs,
> + const char *strtab,
> + unsigned int symindex,
> + unsigned int relsec,
> + struct module *me)
> +{
> +}
> +#endif
> +
> #ifdef CONFIG_DYNAMIC_FTRACE
> notrace int module_trampoline_target(struct module *mod, unsigned long addr,
> unsigned long *target)
> diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
> index 7e45dc98df8a..514951f97391 100644
> --- a/arch/powerpc/kernel/module_64.c
> +++ b/arch/powerpc/kernel/module_64.c
> @@ -739,6 +739,55 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
> return 0;
> }
>
> +#ifdef CONFIG_LIVEPATCH
> +void clear_relocate_add(Elf64_Shdr *sechdrs,
> + const char *strtab,
> + unsigned int symindex,
> + unsigned int relsec,
> + struct module *me)
> +{
> + unsigned int i;
> + Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
> + Elf64_Sym *sym;
> + unsigned long *location;
> + const char *symname;
> + u32 *instruction;
> +
> + pr_debug("Clearing ADD relocate section %u to %u\n", relsec,
> + sechdrs[relsec].sh_info);
> +
> + for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
> + location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
> + + rela[i].r_offset;
> + sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
> + + ELF64_R_SYM(rela[i].r_info);
> + symname = me->core_kallsyms.strtab
> + + sym->st_name;
> +
> + if (ELF64_R_TYPE(rela[i].r_info) != R_PPC_REL24)
> + continue;
> + /*
> + * reverse the operations in apply_relocate_add() for case
> + * R_PPC_REL24.
> + */
> + if (sym->st_shndx != SHN_UNDEF &&
> + sym->st_shndx != SHN_LIVEPATCH)
> + continue;
> +
> + instruction = (u32 *)location;
> + if (is_mprofile_ftrace_call(symname))
> + continue;
> +
> + if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
> + continue;
> +
> + instruction += 1;
> + patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
> + }
> +
> +}
> +#endif
> +
> #ifdef CONFIG_DYNAMIC_FTRACE
> int module_trampoline_target(struct module *mod, unsigned long addr,
> unsigned long *target)
> diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
> index 2d159b32885b..cc6784fbc1ac 100644
> --- a/arch/s390/kernel/module.c
> +++ b/arch/s390/kernel/module.c
> @@ -500,6 +500,14 @@ static int module_alloc_ftrace_hotpatch_trampolines(struct module *me,
> }
> #endif /* CONFIG_FUNCTION_TRACER */
>
> +#ifdef CONFIG_LIVEPATCH
> +void clear_relocate_add(Elf64_Shdr *sechdrs, const char *strtab,
> + unsigned int symindex, unsigned int relsec,
> + struct module *me)
> +{
> +}
> +#endif
> +
> int module_finalize(const Elf_Ehdr *hdr,
> const Elf_Shdr *sechdrs,
> struct module *me)
> diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
> index b1abf663417c..f9632afbb84c 100644
> --- a/arch/x86/kernel/module.c
> +++ b/arch/x86/kernel/module.c
> @@ -128,18 +128,20 @@ int apply_relocate(Elf32_Shdr *sechdrs,
> return 0;
> }
> #else /*X86_64*/
> -static int __apply_relocate_add(Elf64_Shdr *sechdrs,
> +static int __apply_clear_relocate_add(Elf64_Shdr *sechdrs,
> const char *strtab,
> unsigned int symindex,
> unsigned int relsec,
> struct module *me,
> - void *(*write)(void *dest, const void *src, size_t len))
> + void *(*write)(void *dest, const void *src, size_t len),
> + bool clear)
> {
> unsigned int i;
> Elf64_Rela *rel = (void *)sechdrs[relsec].sh_addr;
> Elf64_Sym *sym;
> void *loc;
> u64 val;
> + u64 zero = 0ULL;
>
> DEBUGP("Applying relocate section %u to %u\n",
> relsec, sechdrs[relsec].sh_info);
> @@ -163,40 +165,60 @@ static int __apply_relocate_add(Elf64_Shdr *sechdrs,
> case R_X86_64_NONE:
> break;
> case R_X86_64_64:
> - if (*(u64 *)loc != 0)
> - goto invalid_relocation;
> - write(loc, &val, 8);
> + if (!clear) {
> + if (*(u64 *)loc != 0)
> + goto invalid_relocation;
> + write(loc, &val, 8);
> + } else {
> + write(loc, &zero, 8);
> + }
> break;
> case R_X86_64_32:
> - if (*(u32 *)loc != 0)
> - goto invalid_relocation;
> - write(loc, &val, 4);
> - if (val != *(u32 *)loc)
> - goto overflow;
> + if (!clear) {
> + if (*(u32 *)loc != 0)
> + goto invalid_relocation;
> + write(loc, &val, 4);
> + if (val != *(u32 *)loc)
> + goto overflow;
> + } else {
> + write(loc, &zero, 4);
> + }
> break;
> case R_X86_64_32S:
> - if (*(s32 *)loc != 0)
> - goto invalid_relocation;
> - write(loc, &val, 4);
> - if ((s64)val != *(s32 *)loc)
> - goto overflow;
> + if (!clear) {
> + if (*(s32 *)loc != 0)
> + goto invalid_relocation;
> + write(loc, &val, 4);
> + if ((s64)val != *(s32 *)loc)
> + goto overflow;
> + } else {
> + write(loc, &zero, 4);
> + }
> break;
> case R_X86_64_PC32:
> case R_X86_64_PLT32:
> - if (*(u32 *)loc != 0)
> - goto invalid_relocation;
> - val -= (u64)loc;
> - write(loc, &val, 4);
> + if (!clear) {
> + if (*(u32 *)loc != 0)
> + goto invalid_relocation;
> + val -= (u64)loc;
> + write(loc, &val, 4);
> #if 0
> - if ((s64)val != *(s32 *)loc)
> - goto overflow;
> + if ((s64)val != *(s32 *)loc)
> + goto overflow;
> #endif
> + } else {
> + write(loc, &zero, 4);
> + }
> break;
> case R_X86_64_PC64:
> - if (*(u64 *)loc != 0)
> - goto invalid_relocation;
> - val -= (u64)loc;
> - write(loc, &val, 8);
> + if (!clear) {
> + if (*(u64 *)loc != 0)
> + goto invalid_relocation;
> + val -= (u64)loc;
> + write(loc, &val, 8);
> + } else {
> + write(loc, &zero, 8);
> + }
> break;
> default:
> pr_err("%s: Unknown rela relocation: %llu\n",
> @@ -234,8 +256,8 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
> mutex_lock(&text_mutex);
> }
>
> - ret = __apply_relocate_add(sechdrs, strtab, symindex, relsec, me,
> - write);
> + ret = __apply_clear_relocate_add(sechdrs, strtab, symindex, relsec, me,
> + write, false /* clear */);
>
> if (!early) {
> text_poke_sync();
> @@ -245,6 +267,32 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
> return ret;
> }
>
> +#ifdef CONFIG_LIVEPATCH
> +
> +void clear_relocate_add(Elf64_Shdr *sechdrs,
> + const char *strtab,
> + unsigned int symindex,
> + unsigned int relsec,
> + struct module *me)
> +{
> + bool early = me->state == MODULE_STATE_UNFORMED;
> + void *(*write)(void *, const void *, size_t) = memcpy;
> +
> + if (!early) {
> + write = text_poke;
> + mutex_lock(&text_mutex);
> + }
> +
> + __apply_clear_relocate_add(sechdrs, strtab, symindex, relsec, me,
> + write, true /* clear */);
> +
> + if (!early) {
> + text_poke_sync();
> + mutex_unlock(&text_mutex);
> + }
> +}
> +#endif
> +
> #endif
>
> int module_finalize(const Elf_Ehdr *hdr,
> diff --git a/include/linux/moduleloader.h b/include/linux/moduleloader.h
> index 9e09d11ffe5b..958e6da7f475 100644
> --- a/include/linux/moduleloader.h
> +++ b/include/linux/moduleloader.h
> @@ -72,6 +72,13 @@ int apply_relocate_add(Elf_Shdr *sechdrs,
> unsigned int symindex,
> unsigned int relsec,
> struct module *mod);
> +#ifdef CONFIG_LIVEPATCH
> +void clear_relocate_add(Elf_Shdr *sechdrs,
> + const char *strtab,
> + unsigned int symindex,
> + unsigned int relsec,
> + struct module *me);
> +#endif
> #else
> static inline int apply_relocate_add(Elf_Shdr *sechdrs,
> const char *strtab,
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index bc475e62279d..5c0d8a4eba13 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -316,6 +316,45 @@ int klp_apply_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
> return apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
> }
>
> +static void klp_clear_object_relocations(struct module *pmod,
> + struct klp_object *obj)
> +{
> + int i, cnt;
> + const char *objname, *secname;
> + char sec_objname[MODULE_NAME_LEN];
> + Elf_Shdr *sec;
> +
> + objname = klp_is_module(obj) ? obj->name : "vmlinux";
> +
> + /* For each klp relocation section */
> + for (i = 1; i < pmod->klp_info->hdr.e_shnum; i++) {
> + sec = pmod->klp_info->sechdrs + i;
> + secname = pmod->klp_info->secstrings + sec->sh_name;
> + if (!(sec->sh_flags & SHF_RELA_LIVEPATCH))
> + continue;
> +
> + /*
> + * Format: .klp.rela.sec_objname.section_name
> + * See comment in klp_resolve_symbols() for an explanation
> + * of the selected field width value.
> + */
> + secname = pmod->klp_info->secstrings + sec->sh_name;
> + cnt = sscanf(secname, ".klp.rela.%55[^.]", sec_objname);
> + if (cnt != 1) {
> + pr_err("section %s has an incorrectly formatted name\n",
> + secname);
> + continue;
> + }
> +
> + if (strcmp(objname, sec_objname))
> + continue;
> +
> + clear_relocate_add(pmod->klp_info->sechdrs,
> + pmod->core_kallsyms.strtab,
> + pmod->klp_info->symndx, i, pmod);
> + }
> +}
> +
> /*
> * Sysfs Interface
> *
> @@ -1154,7 +1193,7 @@ static void klp_cleanup_module_patches_limited(struct module *mod,
> klp_unpatch_object(obj);
>
> klp_post_unpatch_callback(obj);
> -
> + klp_clear_object_relocations(patch->mod, obj);
> klp_free_object_loaded(obj);
> break;
> }
> --
> 2.30.2
>

2022-11-18 16:32:02

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Thu 2022-09-01 10:12:52, Song Liu wrote:
> From: Miroslav Benes <[email protected]>
>
> Josh reported a bug:
>
> When the object to be patched is a module, and that module is
> rmmod'ed and reloaded, it fails to load with:
>
> module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
> livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
>
> The livepatch module has a relocation which references a symbol
> in the _previous_ loading of nfsd. When apply_relocate_add()
> tries to replace the old relocation with a new one, it sees that
> the previous one is nonzero and it errors out.
>
> We thus decided to reverse the relocation patching (clear all relocation
> targets on x86_64). The solution is not
> universal and is too much arch-specific, but it may prove to be simpler
> in the end.
>
> arch/powerpc/kernel/module_32.c | 10 ++++
> arch/powerpc/kernel/module_64.c | 49 +++++++++++++++
> arch/s390/kernel/module.c | 8 +++
> arch/x86/kernel/module.c | 102 +++++++++++++++++++++++---------
> include/linux/moduleloader.h | 7 +++
> kernel/livepatch/core.c | 41 ++++++++++++-

First, thanks a lot for working on this.

I can't check or test the powerpc and s390 code easily.

I am going to comment only x86 and generic code. It looks good
but it needs some changes to improve maintainability.

> 6 files changed, 189 insertions(+), 28 deletions(-)
>
> diff --git a/arch/powerpc/kernel/module_32.c b/arch/powerpc/kernel/module_32.c
> index ea6536171778..e3c312770453 100644
> --- a/arch/powerpc/kernel/module_32.c
> +++ b/arch/powerpc/kernel/module_32.c
> @@ -285,6 +285,16 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,
> return 0;
> }
>
> +#ifdef CONFIG_LIVEPATCH
> +void clear_relocate_add(Elf32_Shdr *sechdrs,
> + const char *strtab,
> + unsigned int symindex,
> + unsigned int relsec,
> + struct module *me)
> +{
> +}
> +#endif
> +
> #ifdef CONFIG_DYNAMIC_FTRACE
> notrace int module_trampoline_target(struct module *mod, unsigned long addr,
> unsigned long *target)
> diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
> index 7e45dc98df8a..514951f97391 100644
> --- a/arch/powerpc/kernel/module_64.c
> +++ b/arch/powerpc/kernel/module_64.c
> @@ -739,6 +739,55 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
> return 0;
> }
>
> +#ifdef CONFIG_LIVEPATCH
> +void clear_relocate_add(Elf64_Shdr *sechdrs,
> + const char *strtab,
> + unsigned int symindex,
> + unsigned int relsec,
> + struct module *me)
> +{
> + unsigned int i;
> + Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
> + Elf64_Sym *sym;
> + unsigned long *location;
> + const char *symname;
> + u32 *instruction;
> +
> + pr_debug("Clearing ADD relocate section %u to %u\n", relsec,
> + sechdrs[relsec].sh_info);
> +
> + for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
> + location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
> + + rela[i].r_offset;
> + sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
> + + ELF64_R_SYM(rela[i].r_info);
> + symname = me->core_kallsyms.strtab
> + + sym->st_name;
> +
> + if (ELF64_R_TYPE(rela[i].r_info) != R_PPC_REL24)
> + continue;
> + /*
> + * reverse the operations in apply_relocate_add() for case
> + * R_PPC_REL24.
> + */
> + if (sym->st_shndx != SHN_UNDEF &&
> + sym->st_shndx != SHN_LIVEPATCH)
> + continue;
> +
> + instruction = (u32 *)location;
> + if (is_mprofile_ftrace_call(symname))
> + continue;
> +
> + if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
> + continue;
> +
> + instruction += 1;
> + patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
> + }
> +
> +}

This looks like a lot of duplicated code. Isn't it?

> +#endif
> +
> #ifdef CONFIG_DYNAMIC_FTRACE
> int module_trampoline_target(struct module *mod, unsigned long addr,
> unsigned long *target)
> --- a/arch/x86/kernel/module.c
> +++ b/arch/x86/kernel/module.c
> @@ -128,18 +128,20 @@ int apply_relocate(Elf32_Shdr *sechdrs,
> return 0;
> }
> #else /*X86_64*/
> -static int __apply_relocate_add(Elf64_Shdr *sechdrs,
> +static int __apply_clear_relocate_add(Elf64_Shdr *sechdrs,

Nit: Honestly, the combination of 4 verbs: "apply", "clear, "relocate", and "add"
is really crazy. It is far from obvious what the function does.

The name was not ideal even before. Let's not make it worse and
use on 3 verbs again.

What about __update_relocate_add or __write_relocate_add()?

Note that the "__" prefix is still needed, see below.


> const char *strtab,
> unsigned int symindex,
> unsigned int relsec,
> struct module *me,
> - void *(*write)(void *dest, const void *src, size_t len))
> + void *(*write)(void *dest, const void *src, size_t len),
> + bool clear)
> {
> unsigned int i;
> Elf64_Rela *rel = (void *)sechdrs[relsec].sh_addr;
> Elf64_Sym *sym;
> void *loc;
> u64 val;
> + u64 zero = 0ULL;
>
> DEBUGP("Applying relocate section %u to %u\n",
> relsec, sechdrs[relsec].sh_info);
> @@ -163,40 +165,60 @@ static int __apply_relocate_add(Elf64_Shdr *sechdrs,
> case R_X86_64_NONE:
> break;
> case R_X86_64_64:
> - if (*(u64 *)loc != 0)
> - goto invalid_relocation;
> - write(loc, &val, 8);
> + if (!clear) {

Nit: I would prefer to use positive check when both if/else branches
are used. I would call the parameter "apply".

> + if (*(u64 *)loc != 0)
> + goto invalid_relocation;
> + write(loc, &val, 8);
> + } else {
> + write(loc, &zero, 8);
> + }
> break;
> case R_X86_64_32:
> - if (*(u32 *)loc != 0)
> - goto invalid_relocation;
> - write(loc, &val, 4);
> - if (val != *(u32 *)loc)
> - goto overflow;
> + if (!clear) {
> + if (*(u32 *)loc != 0)
> + goto invalid_relocation;
> + write(loc, &val, 4);
> + if (val != *(u32 *)loc)
> + goto overflow;
> + } else {
> + write(loc, &zero, 4);
> + }
> break;
> case R_X86_64_32S:
> - if (*(s32 *)loc != 0)
> - goto invalid_relocation;
> - write(loc, &val, 4);
> - if ((s64)val != *(s32 *)loc)
> - goto overflow;
> + if (!clear) {
> + if (*(s32 *)loc != 0)
> + goto invalid_relocation;
> + write(loc, &val, 4);
> + if ((s64)val != *(s32 *)loc)
> + goto overflow;
> + } else {
> + write(loc, &zero, 4);
> + }
> break;
> case R_X86_64_PC32:
> case R_X86_64_PLT32:
> - if (*(u32 *)loc != 0)
> - goto invalid_relocation;
> - val -= (u64)loc;
> - write(loc, &val, 4);
> + if (!clear) {
> + if (*(u32 *)loc != 0)
> + goto invalid_relocation;
> + val -= (u64)loc;
> + write(loc, &val, 4);
> #if 0
> - if ((s64)val != *(s32 *)loc)
> - goto overflow;
> + if ((s64)val != *(s32 *)loc)
> + goto overflow;
> #endif
> + } else {
> + write(loc, &zero, 4);
> + }
> break;
> case R_X86_64_PC64:
> - if (*(u64 *)loc != 0)
> - goto invalid_relocation;
> - val -= (u64)loc;
> - write(loc, &val, 8);
> + if (!clear) {
> + if (*(u64 *)loc != 0)
> + goto invalid_relocation;
> + val -= (u64)loc;
> + write(loc, &val, 8);
> + } else {
> + write(loc, &zero, 8);
> + }
> break;
> default:
> pr_err("%s: Unknown rela relocation: %llu\n",
> @@ -245,6 +267,32 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
> return ret;
> }
>
> +#ifdef CONFIG_LIVEPATCH
> +
> +void clear_relocate_add(Elf64_Shdr *sechdrs,
> + const char *strtab,
> + unsigned int symindex,
> + unsigned int relsec,
> + struct module *me)
> +{
> + bool early = me->state == MODULE_STATE_UNFORMED;
> + void *(*write)(void *, const void *, size_t) = memcpy;
> +
> + if (!early) {
> + write = text_poke;
> + mutex_lock(&text_mutex);
> + }
> +
> + __apply_clear_relocate_add(sechdrs, strtab, symindex, relsec, me,
> + write, true /* clear */);
> +
> + if (!early) {
> + text_poke_sync();
> + mutex_unlock(&text_mutex);
> + }
> +}

This duplicates a lot of code. Please, rename apply_relocate_add() the
same way as __apply_clear_relocate_add() and add the "apply" parameter.
Then add the wrappers for this:

int write_relocate_add(Elf64_Shdr *sechdrs,
const char *strtab,
unsigned int symindex,
unsigned int relsec,
struct module *me,
bool apply)
{
int ret;
bool early = me->state == MODULE_STATE_UNFORMED;
void *(*write)(void *, const void *, size_t) = memcpy;

if (!early) {
write = text_poke;
mutex_lock(&text_mutex);
}

ret = __write_relocate_add(sechdrs, strtab, symindex, relsec, me,
write, apply);

if (!early) {
text_poke_sync();
mutex_unlock(&text_mutex);
}

return ret;
}

int apply_relocate_add(Elf64_Shdr *sechdrs,
const char *strtab,
unsigned int symindex,
unsigned int relsec,
struct module *me)
{
return write_relocate_add(sechdrs, strtab, symindex, relsec, me, true);
}

#ifdef CONFIG_LIVEPATCH
void apply_relocate_add(Elf64_Shdr *sechdrs,
const char *strtab,
unsigned int symindex,
unsigned int relsec,
struct module *me)
{
write_relocate_add(sechdrs, strtab, symindex, relsec, me, false);
}
#endif


> +#endif
> +
> #endif
>
> int module_finalize(const Elf_Ehdr *hdr,
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -316,6 +316,45 @@ int klp_apply_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
> return apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
> }
>
> +static void klp_clear_object_relocations(struct module *pmod,
> + struct klp_object *obj)
> +{
> + int i, cnt;
> + const char *objname, *secname;
> + char sec_objname[MODULE_NAME_LEN];
> + Elf_Shdr *sec;
> +
> + objname = klp_is_module(obj) ? obj->name : "vmlinux";
> +
> + /* For each klp relocation section */
> + for (i = 1; i < pmod->klp_info->hdr.e_shnum; i++) {
> + sec = pmod->klp_info->sechdrs + i;
> + secname = pmod->klp_info->secstrings + sec->sh_name;
> + if (!(sec->sh_flags & SHF_RELA_LIVEPATCH))
> + continue;
> +
> + /*
> + * Format: .klp.rela.sec_objname.section_name
> + * See comment in klp_resolve_symbols() for an explanation
> + * of the selected field width value.
> + */
> + secname = pmod->klp_info->secstrings + sec->sh_name;
> + cnt = sscanf(secname, ".klp.rela.%55[^.]", sec_objname);
> + if (cnt != 1) {
> + pr_err("section %s has an incorrectly formatted name\n",
> + secname);
> + continue;
> + }
> +
> + if (strcmp(objname, sec_objname))
> + continue;
> +
> + clear_relocate_add(pmod->klp_info->sechdrs,
> + pmod->core_kallsyms.strtab,
> + pmod->klp_info->symndx, i, pmod);
> + }
> +}

Huh, this duplicates a lot of tricky code.

It is even worse because this squashed code from two functions
klp_apply_section_relocs() and klp_apply_object_relocs()
into a single function. As a result, the code duplication is not
even obvious.

Also the suffix "_reloacations() does not match the suffix of
the related funciton:

+ klp_apply_object_relocs() (existing)
+ klp_clear_object_relocations() (new)

This all would complicate maintenance of the code.

Please, implement a common:

int klp_write_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
const char *shstrtab, const char *strtab,
unsigned int symndx, unsigned int secndx,
const char *objname, bool apply);

and

int klp_write_object_relocs(struct klp_patch *patch,
struct klp_object *obj,
bool apply);

and add the respective wrappers:

int klp_apply_section_relocs(); /* also needed in module/main.c */
int klp_apply_object_relocs();
void klp_clear_object_relocs();

Best Regards,
Petr

2022-11-18 17:18:06

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

Hi Petr,

On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
>
> On Thu 2022-09-01 10:12:52, Song Liu wrote:
[...]
> >
> > arch/powerpc/kernel/module_32.c | 10 ++++
> > arch/powerpc/kernel/module_64.c | 49 +++++++++++++++
> > arch/s390/kernel/module.c | 8 +++
> > arch/x86/kernel/module.c | 102 +++++++++++++++++++++++---------
> > include/linux/moduleloader.h | 7 +++
> > kernel/livepatch/core.c | 41 ++++++++++++-
>
> First, thanks a lot for working on this.
>
> I can't check or test the powerpc and s390 code easily.
>
> I am going to comment only x86 and generic code. It looks good
> but it needs some changes to improve maintainability.

Thanks for these comments and suggestions. I will work on them
and send v4.

Song

2022-11-21 15:52:44

by Joe Lawrence

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On 11/18/22 12:14 PM, Song Liu wrote:
> Hi Petr,
>
> On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
>>
>> On Thu 2022-09-01 10:12:52, Song Liu wrote:
> [...]
>>>
>>> arch/powerpc/kernel/module_32.c | 10 ++++
>>> arch/powerpc/kernel/module_64.c | 49 +++++++++++++++
>>> arch/s390/kernel/module.c | 8 +++
>>> arch/x86/kernel/module.c | 102 +++++++++++++++++++++++---------
>>> include/linux/moduleloader.h | 7 +++
>>> kernel/livepatch/core.c | 41 ++++++++++++-
>>
>> First, thanks a lot for working on this.
>>
>> I can't check or test the powerpc and s390 code easily.
>>
>> I am going to comment only x86 and generic code. It looks good
>> but it needs some changes to improve maintainability.
>
> Thanks for these comments and suggestions. I will work on them
> and send v4.
>

Hi Song,

I'll help with testing the arches (at least ppc64le and s390x, I can
only cross-build ppc32). I can either pick up the patches from the list
when you post, or if you want to run them through testing first, lmk.

Thanks,

--
Joe


2022-11-21 16:52:31

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Mon, Nov 21, 2022 at 8:32 AM Joe Lawrence <[email protected]> wrote:
>
> On 11/18/22 12:14 PM, Song Liu wrote:
> > Hi Petr,
> >
> > On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
> >>
> >> On Thu 2022-09-01 10:12:52, Song Liu wrote:
> > [...]
> >>>
> >>> arch/powerpc/kernel/module_32.c | 10 ++++
> >>> arch/powerpc/kernel/module_64.c | 49 +++++++++++++++
> >>> arch/s390/kernel/module.c | 8 +++
> >>> arch/x86/kernel/module.c | 102 +++++++++++++++++++++++---------
> >>> include/linux/moduleloader.h | 7 +++
> >>> kernel/livepatch/core.c | 41 ++++++++++++-
> >>
> >> First, thanks a lot for working on this.
> >>
> >> I can't check or test the powerpc and s390 code easily.
> >>
> >> I am going to comment only x86 and generic code. It looks good
> >> but it needs some changes to improve maintainability.
> >
> > Thanks for these comments and suggestions. I will work on them
> > and send v4.
> >
>
> Hi Song,
>
> I'll help with testing the arches (at least ppc64le and s390x, I can
> only cross-build ppc32). I can either pick up the patches from the list
> when you post, or if you want to run them through testing first, lmk.

Thanks Joe!

I am on vacation this week. I will pick this up next week.

Song

2022-11-29 02:02:43

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
>
[...]

> >
> > +#ifdef CONFIG_LIVEPATCH
> > +void clear_relocate_add(Elf64_Shdr *sechdrs,
> > + const char *strtab,
> > + unsigned int symindex,
> > + unsigned int relsec,
> > + struct module *me)
> > +{
> > + unsigned int i;
> > + Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
> > + Elf64_Sym *sym;
> > + unsigned long *location;
> > + const char *symname;
> > + u32 *instruction;
> > +
> > + pr_debug("Clearing ADD relocate section %u to %u\n", relsec,
> > + sechdrs[relsec].sh_info);
> > +
> > + for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
> > + location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
> > + + rela[i].r_offset;
> > + sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
> > + + ELF64_R_SYM(rela[i].r_info);
> > + symname = me->core_kallsyms.strtab
> > + + sym->st_name;
> > +
> > + if (ELF64_R_TYPE(rela[i].r_info) != R_PPC_REL24)
> > + continue;
> > + /*
> > + * reverse the operations in apply_relocate_add() for case
> > + * R_PPC_REL24.
> > + */
> > + if (sym->st_shndx != SHN_UNDEF &&
> > + sym->st_shndx != SHN_LIVEPATCH)
> > + continue;
> > +
> > + instruction = (u32 *)location;
> > + if (is_mprofile_ftrace_call(symname))
> > + continue;
> > +
> > + if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
> > + continue;
> > +
> > + instruction += 1;
> > + patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
> > + }
> > +
> > +}
>
> This looks like a lot of duplicated code. Isn't it?

TBH, I think the duplicated code is not really bad.

apply_relocate_add() is a much more complicated function, I would
rather not mess it up to make this function a little simpler.

[...]

>
> This duplicates a lot of code. Please, rename apply_relocate_add() the
> same way as __apply_clear_relocate_add() and add the "apply" parameter.
> Then add the wrappers for this:
>
> int write_relocate_add(Elf64_Shdr *sechdrs,
> const char *strtab,
> unsigned int symindex,
> unsigned int relsec,
> struct module *me,
> bool apply)
> {
> int ret;
> bool early = me->state == MODULE_STATE_UNFORMED;
> void *(*write)(void *, const void *, size_t) = memcpy;
>
> if (!early) {
> write = text_poke;
> mutex_lock(&text_mutex);
> }

How about we move the "early" logic into __write_relocate_add()?

>
> ret = __write_relocate_add(sechdrs, strtab, symindex, relsec, me,
> write, apply);
>
> if (!early) {
> text_poke_sync();
> mutex_unlock(&text_mutex);
> }
>
> return ret;
> }
>
> int apply_relocate_add(Elf64_Shdr *sechdrs,
> const char *strtab,
> unsigned int symindex,
> unsigned int relsec,
> struct module *me)
> {
> return write_relocate_add(sechdrs, strtab, symindex, relsec, me, true);

Then we just call __write_relocate_add() from here...

> }
>
> #ifdef CONFIG_LIVEPATCH
> void apply_relocate_add(Elf64_Shdr *sechdrs,
> const char *strtab,
> unsigned int symindex,
> unsigned int relsec,
> struct module *me)
> {
> write_relocate_add(sechdrs, strtab, symindex, relsec, me, false);

and here.


> }
> #endif
>
>
> > +#endif
> > +
> > #endif
> >
> > int module_finalize(const Elf_Ehdr *hdr,
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -316,6 +316,45 @@ int klp_apply_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
> > return apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
> > }
> >
> > +static void klp_clear_object_relocations(struct module *pmod,
> > + struct klp_object *obj)
> > +{
> > + int i, cnt;
> > + const char *objname, *secname;
> > + char sec_objname[MODULE_NAME_LEN];
> > + Elf_Shdr *sec;
> > +
> > + objname = klp_is_module(obj) ? obj->name : "vmlinux";
> > +
> > + /* For each klp relocation section */
> > + for (i = 1; i < pmod->klp_info->hdr.e_shnum; i++) {
> > + sec = pmod->klp_info->sechdrs + i;
> > + secname = pmod->klp_info->secstrings + sec->sh_name;
> > + if (!(sec->sh_flags & SHF_RELA_LIVEPATCH))
> > + continue;
> > +
> > + /*
> > + * Format: .klp.rela.sec_objname.section_name
> > + * See comment in klp_resolve_symbols() for an explanation
> > + * of the selected field width value.
> > + */
> > + secname = pmod->klp_info->secstrings + sec->sh_name;
> > + cnt = sscanf(secname, ".klp.rela.%55[^.]", sec_objname);
> > + if (cnt != 1) {
> > + pr_err("section %s has an incorrectly formatted name\n",
> > + secname);
> > + continue;
> > + }

Actually, I think we don't need the cnt check here. Once it is removed,
there isn't much duplicated logic.

> > +
> > + if (strcmp(objname, sec_objname))
> > + continue;
> > +
> > + clear_relocate_add(pmod->klp_info->sechdrs,
> > + pmod->core_kallsyms.strtab,
> > + pmod->klp_info->symndx, i, pmod);
> > + }
> > +}
>
> Huh, this duplicates a lot of tricky code.
>
> It is even worse because this squashed code from two functions
> klp_apply_section_relocs() and klp_apply_object_relocs()
> into a single function. As a result, the code duplication is not
> even obvious.
>
> Also the suffix "_reloacations() does not match the suffix of
> the related funciton:
>
> + klp_apply_object_relocs() (existing)
> + klp_clear_object_relocations() (new)
>
> This all would complicate maintenance of the code.
>
> Please, implement a common:
>
> int klp_write_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
> const char *shstrtab, const char *strtab,
> unsigned int symndx, unsigned int secndx,
> const char *objname, bool apply);
>
> and
>
> int klp_write_object_relocs(struct klp_patch *patch,
> struct klp_object *obj,
> bool apply);
>
> and add the respective wrappers:
>
> int klp_apply_section_relocs(); /* also needed in module/main.c */
> int klp_apply_object_relocs();
> void klp_clear_object_relocs();

With the above simplification (removing cnt check), do we still need
all these wrappers? Personally, I think they will make the code more
difficult to follow..

Thanks,
Song

2022-12-09 13:27:22

by Miroslav Benes

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

Hi,

first thank you for taking over and I also appologize for not replying
much sooner.

On Thu, 1 Sep 2022, Song Liu wrote:

> From: Miroslav Benes <[email protected]>
>
> Josh reported a bug:
>
> When the object to be patched is a module, and that module is
> rmmod'ed and reloaded, it fails to load with:
>
> module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
> livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
>
> The livepatch module has a relocation which references a symbol
> in the _previous_ loading of nfsd. When apply_relocate_add()
> tries to replace the old relocation with a new one, it sees that
> the previous one is nonzero and it errors out.
>
> On ppc64le, we have a similar issue:
>
> module_64: livepatch_nfsd: Expected nop after call, got e8410018 at e_show+0x60/0x548 [livepatch_nfsd]
> livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
>
> He also proposed three different solutions. We could remove the error
> check in apply_relocate_add() introduced by commit eda9cec4c9a1
> ("x86/module: Detect and skip invalid relocations"). However the check
> is useful for detecting corrupted modules.
>
> We could also deny the patched modules to be removed. If it proved to be
> a major drawback for users, we could still implement a different
> approach. The solution would also complicate the existing code a lot.
>
> We thus decided to reverse the relocation patching (clear all relocation
> targets on x86_64). The solution is not
> universal and is too much arch-specific, but it may prove to be simpler
> in the end.
>
> Reported-by: Josh Poimboeuf <[email protected]>
> Signed-off-by: Miroslav Benes <[email protected]>
> Signed-off-by: Song Liu <[email protected]>

Petr has commented on the code aspects. I will just add that s390x was not
dealt with at the time because there was no live patching support for
s390x back then if I remember correctly and my notes do not lie. The same
applies to powerpc32. I think that both should be fixed as well with this
patch. It might also help to clean up the ifdeffery in the patch a bit.

Miroslav

2022-12-09 13:30:47

by Petr Mladek

[permalink] [raw]
Subject: powerpc-part: was: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

Hi,

this reply is only about the powerpc-specific part.

Also adding Kamalesh and Michael into Cc who worked on the related
commit a443bf6e8a7674b86221f49 ("powerpc/modules: Add REL24 relocation
support of livepatch symbols").


On Mon 2022-11-28 17:57:06, Song Liu wrote:
> On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
> >
> > > --- a/arch/powerpc/kernel/module_64.c
> > > +++ b/arch/powerpc/kernel/module_64.c

I put back the name of the modified file so that it is easier
to know what changes we are talking about.

[...]
> > > +#ifdef CONFIG_LIVEPATCH
> > > +void clear_relocate_add(Elf64_Shdr *sechdrs,
> > > + const char *strtab,
> > > + unsigned int symindex,
> > > + unsigned int relsec,
> > > + struct module *me)
> > > +{
> > > + unsigned int i;
> > > + Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
> > > + Elf64_Sym *sym;
> > > + unsigned long *location;
> > > + const char *symname;
> > > + u32 *instruction;
> > > +
> > > + pr_debug("Clearing ADD relocate section %u to %u\n", relsec,
> > > + sechdrs[relsec].sh_info);
> > > +
> > > + for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
> > > + location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
> > > + + rela[i].r_offset;
> > > + sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
> > > + + ELF64_R_SYM(rela[i].r_info);
> > > + symname = me->core_kallsyms.strtab
> > > + + sym->st_name;

The above calculation is quite complex. It seems to be copied from
apply_relocate_add(). If I maintained this code I would want to avoid
the duplication. definitely.


> > > +
> > > + if (ELF64_R_TYPE(rela[i].r_info) != R_PPC_REL24)
> > > + continue;

Why are we interested only into R_PPC_REL24 relocation types, please?

The code for generating the special SHN_LIVEPATCH section is not in
the mainline so it is not well defined.

I guess that R_PPC_REL24 relocation type is used by kPatch. Are we
sure that other relocation types wont be needed?

Anyway, we must warn when an unsupported type is used in SHN_LIVEPATCH
section here.


> > > + /*
> > > + * reverse the operations in apply_relocate_add() for case
> > > + * R_PPC_REL24.
> > > + */
> > > + if (sym->st_shndx != SHN_UNDEF &&

Do we want to handle SHN_UNDEF symbols here?

The commit a443bf6e8a7674b86221f49 ("powerpc/modules: Add REL24
relocation support of livepatch symbols") explains that
R_PPC_REL24 relocations in SHN_LIVEPATCH section are handled
__like__ relocations in SHN_UNDEF sections.

My understanding is that R_PPC_REL24 reallocation type has
two variants. Where the variant used in SHN_UNDEF and
SHN_LIVEPATCH sections need some preprocessing.

Anyway, if this function is livepatch-specific that we should
clear only symbols from SHN_LIVEPATCH sections. I mean that
we should probably ignore SHN_UNDEF here.

> > > + sym->st_shndx != SHN_LIVEPATCH)
> > > + continue;
> > > +
> > > +
> > > + instruction = (u32 *)location;
> > > + if (is_mprofile_ftrace_call(symname))
> > > + continue;

Why do we ignore these symbols?

I can't find any counter-part in apply_relocate_add(). It looks super
tricky. It would deserve a comment.

And I have no idea how we could maintain these exceptions.

> > > + if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
> > > + continue;

Same here. It looks super tricky and there is no explanation.

> > > + instruction += 1;
> > > + patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
> > > + }
> > > +
> > > +}
> >
> > This looks like a lot of duplicated code. Isn't it?
>
> TBH, I think the duplicated code is not really bad.

How exactly do you mean it, please?

Do you think that the amount of duplicated code is small enough?
Or that the new function looks better that updating the existing one?

> apply_relocate_add() is a much more complicated function, I would
> rather not mess it up to make this function a little simpler.

IMHO, the duplicated code is quite tricky. And if we really do
not need to clear all relocation types then we could avoid
the duplication another way, for example:

int update_relocate_add(Elf64_Shdr *sechdrs,
const char *strtab,
unsigned int symindex,
unsigned int relsec,
struct module *me,
bool apply)
{
unsigned int i;
Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
Elf64_Sym *sym;
Elf64_Xword r_type;
unsigned long *location;

if (apply) {
pr_debug("Applying ADD relocate section %u to %u\n", relsec,
sechdrs[relsec].sh_info);
} else {
pr_debug("Clearing ADD relocate section %u\n", relsec");
}

for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
/* This is where to make the change */
location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
+ rela[i].r_offset;
/* This is the symbol it is referring to */
sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
+ ELF64_R_SYM(rela[i].r_info);

r_type = ELF64_R_TYPE(rela[i].r_info);

if (apply) {
apply_relocate_location(sym, location, r_type, rela[i].r_addend);
} else {
clear_relocate_location(sym, location, r_type);
}
}
}

where the two functions would implement the r_type-specific stuff.
It would remove a lot of duplicated code. Wouldn't?

My main concern is how to maintain this code. I am afraid that
if it is in #ifdef CONFIG_LIVEPATCH section than nobody would
update it when doing some changes in apply_relocate_add().

In this case, the livepatch-specific code has to be minimal,
warn about unsupported scenarios, and livepatch maintainers
should understand what is going on there.

Best Regards,
Petr

2022-12-09 13:30:49

by Petr Mladek

[permalink] [raw]
Subject: x86 part: was: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Mon 2022-11-28 17:57:06, Song Liu wrote:
> On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
> --- a/arch/x86/kernel/module.c
> +++ b/arch/x86/kernel/module.c
> > This duplicates a lot of code. Please, rename apply_relocate_add() the
> > same way as __apply_clear_relocate_add() and add the "apply" parameter.
> > Then add the wrappers for this:
> >
> > int write_relocate_add(Elf64_Shdr *sechdrs,
> > const char *strtab,
> > unsigned int symindex,
> > unsigned int relsec,
> > struct module *me,
> > bool apply)
> > {
> > int ret;
> > bool early = me->state == MODULE_STATE_UNFORMED;
> > void *(*write)(void *, const void *, size_t) = memcpy;
> >
> > if (!early) {
> > write = text_poke;
> > mutex_lock(&text_mutex);
> > }
>
> How about we move the "early" logic into __write_relocate_add()?

If I get it correctly then __write_relocate_add() has three different
return paths. I am not sure if this could be moved there a reasonable
way.

Anyway, I do not resist on the above proposal. Feel free to find
another solution that reduces the duplicated code and looks
reasonable. I am sure that there are more possibilities.

Best Regards,
Petr

2022-12-09 13:31:00

by Miroslav Benes

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

> > > --- a/kernel/livepatch/core.c
> > > +++ b/kernel/livepatch/core.c
> > > @@ -316,6 +316,45 @@ int klp_apply_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
> > > return apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
> > > }
> > >
> > > +static void klp_clear_object_relocations(struct module *pmod,
> > > + struct klp_object *obj)
> > > +{
> > > + int i, cnt;
> > > + const char *objname, *secname;
> > > + char sec_objname[MODULE_NAME_LEN];
> > > + Elf_Shdr *sec;
> > > +
> > > + objname = klp_is_module(obj) ? obj->name : "vmlinux";
> > > +
> > > + /* For each klp relocation section */
> > > + for (i = 1; i < pmod->klp_info->hdr.e_shnum; i++) {
> > > + sec = pmod->klp_info->sechdrs + i;
> > > + secname = pmod->klp_info->secstrings + sec->sh_name;
> > > + if (!(sec->sh_flags & SHF_RELA_LIVEPATCH))
> > > + continue;
> > > +
> > > + /*
> > > + * Format: .klp.rela.sec_objname.section_name
> > > + * See comment in klp_resolve_symbols() for an explanation
> > > + * of the selected field width value.
> > > + */
> > > + secname = pmod->klp_info->secstrings + sec->sh_name;
> > > + cnt = sscanf(secname, ".klp.rela.%55[^.]", sec_objname);
> > > + if (cnt != 1) {
> > > + pr_err("section %s has an incorrectly formatted name\n",
> > > + secname);
> > > + continue;
> > > + }
>
> Actually, I think we don't need the cnt check here. Once it is removed,
> there isn't much duplicated logic.

Because it must have passed the check once already during
klp_apply_section_relocs()? Yes, perhaps.

Miroslav

2022-12-09 14:15:03

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Mon 2022-11-28 17:57:06, Song Liu wrote:
> On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
> > > --- a/kernel/livepatch/core.c
> > > +++ b/kernel/livepatch/core.c
> > > @@ -316,6 +316,45 @@ int klp_apply_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
> > > return apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
> > > }
> > >
> > > +static void klp_clear_object_relocations(struct module *pmod,
> > > + struct klp_object *obj)
> > > +{
> > > + int i, cnt;
> > > + const char *objname, *secname;
> > > + char sec_objname[MODULE_NAME_LEN];
> > > + Elf_Shdr *sec;
> > > +
> > > + objname = klp_is_module(obj) ? obj->name : "vmlinux";
> > > +
> > > + /* For each klp relocation section */
> > > + for (i = 1; i < pmod->klp_info->hdr.e_shnum; i++) {
> > > + sec = pmod->klp_info->sechdrs + i;
> > > + secname = pmod->klp_info->secstrings + sec->sh_name;

secname = ...

> > > + if (!(sec->sh_flags & SHF_RELA_LIVEPATCH))
> > > + continue;
> > > +
> > > + /*
> > > + * Format: .klp.rela.sec_objname.section_name
> > > + * See comment in klp_resolve_symbols() for an explanation
> > > + * of the selected field width value.
> > > + */
> > > + secname = pmod->klp_info->secstrings + sec->sh_name;

secname = ... same a above.

> > > + cnt = sscanf(secname, ".klp.rela.%55[^.]", sec_objname);
> > > + if (cnt != 1) {
> > > + pr_err("section %s has an incorrectly formatted name\n",
> > > + secname);
> > > + continue;
> > > + }
>
> Actually, I think we don't need the cnt check here. Once it is removed,
> there isn't much duplicated logic.

Seriously?

A section with this error was skipped in klp_apply_section_relocs().
I did not cause rejecting the module! Why is it suddenly safe to
process it, please?


I see that you removed also:

if (strcmp(objname ? objname : "vmlinux", sec_objname))
return 0;

This is another bug in your "simplification".


> > > +
> > > + if (strcmp(objname, sec_objname))
> > > + continue;
> > > +
> > > + clear_relocate_add(pmod->klp_info->sechdrs,
> > > + pmod->core_kallsyms.strtab,
> > > + pmod->klp_info->symndx, i, pmod);
> > > + }
> > > +}
> >
> > Huh, this duplicates a lot of tricky code.
> >
> > It is even worse because this squashed code from two functions
> > klp_apply_section_relocs() and klp_apply_object_relocs()
> > into a single function. As a result, the code duplication is not
> > even obvious.
> >
> > Also the suffix "_reloacations() does not match the suffix of
> > the related funciton:
> >
> > + klp_apply_object_relocs() (existing)
> > + klp_clear_object_relocations() (new)
> >
> > This all would complicate maintenance of the code.
> >
> > Please, implement a common:
> >
> > int klp_write_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
> > const char *shstrtab, const char *strtab,
> > unsigned int symndx, unsigned int secndx,
> > const char *objname, bool apply);
> >
> > and
> >
> > int klp_write_object_relocs(struct klp_patch *patch,
> > struct klp_object *obj,
> > bool apply);
> >
> > and add the respective wrappers:
> >
> > int klp_apply_section_relocs(); /* also needed in module/main.c */
> > int klp_apply_object_relocs();
> > void klp_clear_object_relocs();
>
> With the above simplification (removing cnt check), do we still need
> all these wrappers? Personally, I think they will make the code more
> difficult to follow..

Sigh.

klp_apply_section_relocs() has 25 lines of code.
klp_apply_object_relocs() has 18 lines of code.

The only difference should be that klp_clear_object_relocs():

+ does not need to call klp_resolve_symbols()
+ need to call clear_relocate_add() instead of apply_relocate_add()

It is 7 different lines from in the existing 25 + 18 = 43 lines.
The duplication does not look like a good deal even from this POV.

If we introduce update_relocate_add(..., bool apply) parameter
then we could call update_relocate_add() in both situations.

Let me repeat from the last mail. klp_clear_object_relocs() even
reshuffled the duplicated code. It was even harder to find differences.

Do I still need to explain how bad idea was the code duplication,
please?

Best Regards,
Petr

2022-12-09 14:29:52

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Fri 2022-12-09 14:54:10, Petr Mladek wrote:
> On Mon 2022-11-28 17:57:06, Song Liu wrote:
> > On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
> > > > --- a/kernel/livepatch/core.c
> > > > +++ b/kernel/livepatch/core.c
> I see that you removed also:
>
> if (strcmp(objname ? objname : "vmlinux", sec_objname))
> return 0;
>
> This is another bug in your "simplification".

This actually is not a bug. It was replaced by the strcmp() check below.

> > > > +
> > > > + if (strcmp(objname, sec_objname))
> > > > + continue;

But this works only because the function is not called for "vmlinux".
It can't be unloaded.

Well, this optimization is not worth it. IMHO, it is better when
the API is able to handle "vmlinux" object a safe way.

We always try to make the livepatch API as error-proof as possible.
It is the main idea of livepatching. It should fix bugs without
breaking the running system.

Best Regards,
Petr

2022-12-09 18:43:25

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Fri, Dec 9, 2022 at 5:54 AM Petr Mladek <[email protected]> wrote:
>
> On Mon 2022-11-28 17:57:06, Song Liu wrote:
> > On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
> > > > --- a/kernel/livepatch/core.c
> > > > +++ b/kernel/livepatch/core.c
> > > > @@ -316,6 +316,45 @@ int klp_apply_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
> > > > return apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
> > > > }
> > > >
> > > > +static void klp_clear_object_relocations(struct module *pmod,
> > > > + struct klp_object *obj)
> > > > +{
> > > > + int i, cnt;
> > > > + const char *objname, *secname;
> > > > + char sec_objname[MODULE_NAME_LEN];
> > > > + Elf_Shdr *sec;
> > > > +
> > > > + objname = klp_is_module(obj) ? obj->name : "vmlinux";
> > > > +
> > > > + /* For each klp relocation section */
> > > > + for (i = 1; i < pmod->klp_info->hdr.e_shnum; i++) {
> > > > + sec = pmod->klp_info->sechdrs + i;
> > > > + secname = pmod->klp_info->secstrings + sec->sh_name;
>
> secname = ...
>
> > > > + if (!(sec->sh_flags & SHF_RELA_LIVEPATCH))
> > > > + continue;
> > > > +
> > > > + /*
> > > > + * Format: .klp.rela.sec_objname.section_name
> > > > + * See comment in klp_resolve_symbols() for an explanation
> > > > + * of the selected field width value.
> > > > + */
> > > > + secname = pmod->klp_info->secstrings + sec->sh_name;
>
> secname = ... same a above.
>
> > > > + cnt = sscanf(secname, ".klp.rela.%55[^.]", sec_objname);
> > > > + if (cnt != 1) {
> > > > + pr_err("section %s has an incorrectly formatted name\n",
> > > > + secname);
> > > > + continue;
> > > > + }
> >
> > Actually, I think we don't need the cnt check here. Once it is removed,
> > there isn't much duplicated logic.
>
> Seriously?
>
> A section with this error was skipped in klp_apply_section_relocs().
> I did not cause rejecting the module! Why is it suddenly safe to
> process it, please?

Hmm.. I think you are right, we still need the check.

>
>
> I see that you removed also:
>
> if (strcmp(objname ? objname : "vmlinux", sec_objname))
> return 0;
>
> This is another bug in your "simplification".
>
>
> > > > +
> > > > + if (strcmp(objname, sec_objname))
> > > > + continue;
> > > > +
> > > > + clear_relocate_add(pmod->klp_info->sechdrs,
> > > > + pmod->core_kallsyms.strtab,
> > > > + pmod->klp_info->symndx, i, pmod);
> > > > + }
> > > > +}
> > >
> > > Huh, this duplicates a lot of tricky code.
> > >
> > > It is even worse because this squashed code from two functions
> > > klp_apply_section_relocs() and klp_apply_object_relocs()
> > > into a single function. As a result, the code duplication is not
> > > even obvious.
> > >
> > > Also the suffix "_reloacations() does not match the suffix of
> > > the related funciton:
> > >
> > > + klp_apply_object_relocs() (existing)
> > > + klp_clear_object_relocations() (new)
> > >
> > > This all would complicate maintenance of the code.
> > >
> > > Please, implement a common:
> > >
> > > int klp_write_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
> > > const char *shstrtab, const char *strtab,
> > > unsigned int symndx, unsigned int secndx,
> > > const char *objname, bool apply);
> > >
> > > and
> > >
> > > int klp_write_object_relocs(struct klp_patch *patch,
> > > struct klp_object *obj,
> > > bool apply);
> > >
> > > and add the respective wrappers:
> > >
> > > int klp_apply_section_relocs(); /* also needed in module/main.c */
> > > int klp_apply_object_relocs();
> > > void klp_clear_object_relocs();
> >
> > With the above simplification (removing cnt check), do we still need
> > all these wrappers? Personally, I think they will make the code more
> > difficult to follow..
>
> Sigh.
>
> klp_apply_section_relocs() has 25 lines of code.
> klp_apply_object_relocs() has 18 lines of code.
>
> The only difference should be that klp_clear_object_relocs():
>
> + does not need to call klp_resolve_symbols()
> + need to call clear_relocate_add() instead of apply_relocate_add()
>
> It is 7 different lines from in the existing 25 + 18 = 43 lines.
> The duplication does not look like a good deal even from this POV.
>
> If we introduce update_relocate_add(..., bool apply) parameter
> then we could call update_relocate_add() in both situations.
>
> Let me repeat from the last mail. klp_clear_object_relocs() even
> reshuffled the duplicated code. It was even harder to find differences.
>
> Do I still need to explain how bad idea was the code duplication,
> please?

No objections that code duplication is not ideal. It is just that sometimes
it is hard to follow the difference between functions with similar names.
Well, it is probably my own problem.

Thanks,
Song

2022-12-09 18:43:28

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Fri, Dec 9, 2022 at 4:55 AM Miroslav Benes <[email protected]> wrote:
>
> Hi,
>
> first thank you for taking over and I also appologize for not replying
> much sooner.
>
> On Thu, 1 Sep 2022, Song Liu wrote:
>
> > From: Miroslav Benes <[email protected]>
> >
> > Josh reported a bug:
> >
> > When the object to be patched is a module, and that module is
> > rmmod'ed and reloaded, it fails to load with:
> >
> > module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
> > livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> > livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> >
> > The livepatch module has a relocation which references a symbol
> > in the _previous_ loading of nfsd. When apply_relocate_add()
> > tries to replace the old relocation with a new one, it sees that
> > the previous one is nonzero and it errors out.
> >
> > On ppc64le, we have a similar issue:
> >
> > module_64: livepatch_nfsd: Expected nop after call, got e8410018 at e_show+0x60/0x548 [livepatch_nfsd]
> > livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> > livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> >
> > He also proposed three different solutions. We could remove the error
> > check in apply_relocate_add() introduced by commit eda9cec4c9a1
> > ("x86/module: Detect and skip invalid relocations"). However the check
> > is useful for detecting corrupted modules.
> >
> > We could also deny the patched modules to be removed. If it proved to be
> > a major drawback for users, we could still implement a different
> > approach. The solution would also complicate the existing code a lot.
> >
> > We thus decided to reverse the relocation patching (clear all relocation
> > targets on x86_64). The solution is not
> > universal and is too much arch-specific, but it may prove to be simpler
> > in the end.
> >
> > Reported-by: Josh Poimboeuf <[email protected]>
> > Signed-off-by: Miroslav Benes <[email protected]>
> > Signed-off-by: Song Liu <[email protected]>
>
> Petr has commented on the code aspects. I will just add that s390x was not
> dealt with at the time because there was no live patching support for
> s390x back then if I remember correctly and my notes do not lie. The same
> applies to powerpc32. I think that both should be fixed as well with this
> patch. It might also help to clean up the ifdeffery in the patch a bit.

I don't have test environments for s390 and powerpc, so I really don't know
whether I am doing something sane for them.

Would you have time to finish these parts? (Or maybe the whole patch..)

Thanks,
Song

2022-12-09 19:08:49

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal



Le 09/12/2022 à 19:30, Song Liu a écrit :
> On Fri, Dec 9, 2022 at 4:55 AM Miroslav Benes <[email protected]> wrote:
>>
>> Hi,
>>
>> first thank you for taking over and I also appologize for not replying
>> much sooner.
>>
>> On Thu, 1 Sep 2022, Song Liu wrote:
>>
>>> From: Miroslav Benes <[email protected]>
>>>
>>> Josh reported a bug:
>>>
>>> When the object to be patched is a module, and that module is
>>> rmmod'ed and reloaded, it fails to load with:
>>>
>>> module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
>>> livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
>>> livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
>>>
>>> The livepatch module has a relocation which references a symbol
>>> in the _previous_ loading of nfsd. When apply_relocate_add()
>>> tries to replace the old relocation with a new one, it sees that
>>> the previous one is nonzero and it errors out.
>>>
>>> On ppc64le, we have a similar issue:
>>>
>>> module_64: livepatch_nfsd: Expected nop after call, got e8410018 at e_show+0x60/0x548 [livepatch_nfsd]
>>> livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
>>> livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
>>>
>>> He also proposed three different solutions. We could remove the error
>>> check in apply_relocate_add() introduced by commit eda9cec4c9a1
>>> ("x86/module: Detect and skip invalid relocations"). However the check
>>> is useful for detecting corrupted modules.
>>>
>>> We could also deny the patched modules to be removed. If it proved to be
>>> a major drawback for users, we could still implement a different
>>> approach. The solution would also complicate the existing code a lot.
>>>
>>> We thus decided to reverse the relocation patching (clear all relocation
>>> targets on x86_64). The solution is not
>>> universal and is too much arch-specific, but it may prove to be simpler
>>> in the end.
>>>
>>> Reported-by: Josh Poimboeuf <[email protected]>
>>> Signed-off-by: Miroslav Benes <[email protected]>
>>> Signed-off-by: Song Liu <[email protected]>
>>
>> Petr has commented on the code aspects. I will just add that s390x was not
>> dealt with at the time because there was no live patching support for
>> s390x back then if I remember correctly and my notes do not lie. The same
>> applies to powerpc32. I think that both should be fixed as well with this
>> patch. It might also help to clean up the ifdeffery in the patch a bit.
>
> I don't have test environments for s390 and powerpc, so I really don't know
> whether I am doing something sane for them.
>
> Would you have time to finish these parts? (Or maybe the whole patch..)

Setting up a powerpc test environment is fairly easy with QEMU.

Some information below:
- https://github.com/linuxppc/wiki/wiki
- https://wiki.qemu.org/Documentation/Platforms/PowerPC

Christophe

2022-12-09 19:59:57

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Fri, Dec 9, 2022 at 10:52 AM Christophe Leroy
<[email protected]> wrote:
>
>
>
> Le 09/12/2022 à 19:30, Song Liu a écrit :
> > On Fri, Dec 9, 2022 at 4:55 AM Miroslav Benes <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> first thank you for taking over and I also appologize for not replying
> >> much sooner.
> >>
> >> On Thu, 1 Sep 2022, Song Liu wrote:
> >>
> >>> From: Miroslav Benes <[email protected]>
> >>>
> >>> Josh reported a bug:
> >>>
> >>> When the object to be patched is a module, and that module is
> >>> rmmod'ed and reloaded, it fails to load with:
> >>>
> >>> module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
> >>> livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> >>> livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> >>>
> >>> The livepatch module has a relocation which references a symbol
> >>> in the _previous_ loading of nfsd. When apply_relocate_add()
> >>> tries to replace the old relocation with a new one, it sees that
> >>> the previous one is nonzero and it errors out.
> >>>
> >>> On ppc64le, we have a similar issue:
> >>>
> >>> module_64: livepatch_nfsd: Expected nop after call, got e8410018 at e_show+0x60/0x548 [livepatch_nfsd]
> >>> livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> >>> livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> >>>
> >>> He also proposed three different solutions. We could remove the error
> >>> check in apply_relocate_add() introduced by commit eda9cec4c9a1
> >>> ("x86/module: Detect and skip invalid relocations"). However the check
> >>> is useful for detecting corrupted modules.
> >>>
> >>> We could also deny the patched modules to be removed. If it proved to be
> >>> a major drawback for users, we could still implement a different
> >>> approach. The solution would also complicate the existing code a lot.
> >>>
> >>> We thus decided to reverse the relocation patching (clear all relocation
> >>> targets on x86_64). The solution is not
> >>> universal and is too much arch-specific, but it may prove to be simpler
> >>> in the end.
> >>>
> >>> Reported-by: Josh Poimboeuf <[email protected]>
> >>> Signed-off-by: Miroslav Benes <[email protected]>
> >>> Signed-off-by: Song Liu <[email protected]>
> >>
> >> Petr has commented on the code aspects. I will just add that s390x was not
> >> dealt with at the time because there was no live patching support for
> >> s390x back then if I remember correctly and my notes do not lie. The same
> >> applies to powerpc32. I think that both should be fixed as well with this
> >> patch. It might also help to clean up the ifdeffery in the patch a bit.
> >
> > I don't have test environments for s390 and powerpc, so I really don't know
> > whether I am doing something sane for them.
> >
> > Would you have time to finish these parts? (Or maybe the whole patch..)
>
> Setting up a powerpc test environment is fairly easy with QEMU.
>
> Some information below:
> - https://github.com/linuxppc/wiki/wiki
> - https://wiki.qemu.org/Documentation/Platforms/PowerPC

Thanks for these pointers! I will give it a try.

Song

PS: Sometimes I am just lazy, you know..

2022-12-09 20:09:27

by Song Liu

[permalink] [raw]
Subject: Re: powerpc-part: was: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Fri, Dec 9, 2022 at 3:41 AM Petr Mladek <[email protected]> wrote:
>
> Hi,
>
> this reply is only about the powerpc-specific part.
>
> Also adding Kamalesh and Michael into Cc who worked on the related
> commit a443bf6e8a7674b86221f49 ("powerpc/modules: Add REL24 relocation
> support of livepatch symbols").
>
>
> On Mon 2022-11-28 17:57:06, Song Liu wrote:
> > On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
> > >
> > > > --- a/arch/powerpc/kernel/module_64.c
> > > > +++ b/arch/powerpc/kernel/module_64.c
>
> I put back the name of the modified file so that it is easier
> to know what changes we are talking about.
>
> [...]
> > > > +#ifdef CONFIG_LIVEPATCH
> > > > +void clear_relocate_add(Elf64_Shdr *sechdrs,
> > > > + const char *strtab,
> > > > + unsigned int symindex,
> > > > + unsigned int relsec,
> > > > + struct module *me)
> > > > +{
> > > > + unsigned int i;
> > > > + Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
> > > > + Elf64_Sym *sym;
> > > > + unsigned long *location;
> > > > + const char *symname;
> > > > + u32 *instruction;
> > > > +
> > > > + pr_debug("Clearing ADD relocate section %u to %u\n", relsec,
> > > > + sechdrs[relsec].sh_info);
> > > > +
> > > > + for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
> > > > + location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
> > > > + + rela[i].r_offset;
> > > > + sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
> > > > + + ELF64_R_SYM(rela[i].r_info);
> > > > + symname = me->core_kallsyms.strtab
> > > > + + sym->st_name;
>
> The above calculation is quite complex. It seems to be copied from
> apply_relocate_add(). If I maintained this code I would want to avoid
> the duplication. definitely.
>
>
> > > > +
> > > > + if (ELF64_R_TYPE(rela[i].r_info) != R_PPC_REL24)
> > > > + continue;
>
> Why are we interested only into R_PPC_REL24 relocation types, please?
>
> The code for generating the special SHN_LIVEPATCH section is not in
> the mainline so it is not well defined.
>
> I guess that R_PPC_REL24 relocation type is used by kPatch. Are we
> sure that other relocation types wont be needed?
>
> Anyway, we must warn when an unsupported type is used in SHN_LIVEPATCH
> section here.
>
>
> > > > + /*
> > > > + * reverse the operations in apply_relocate_add() for case
> > > > + * R_PPC_REL24.
> > > > + */
> > > > + if (sym->st_shndx != SHN_UNDEF &&
>
> Do we want to handle SHN_UNDEF symbols here?
>
> The commit a443bf6e8a7674b86221f49 ("powerpc/modules: Add REL24
> relocation support of livepatch symbols") explains that
> R_PPC_REL24 relocations in SHN_LIVEPATCH section are handled
> __like__ relocations in SHN_UNDEF sections.
>
> My understanding is that R_PPC_REL24 reallocation type has
> two variants. Where the variant used in SHN_UNDEF and
> SHN_LIVEPATCH sections need some preprocessing.
>
> Anyway, if this function is livepatch-specific that we should
> clear only symbols from SHN_LIVEPATCH sections. I mean that
> we should probably ignore SHN_UNDEF here.
>
> > > > + sym->st_shndx != SHN_LIVEPATCH)
> > > > + continue;
> > > > +
> > > > +
> > > > + instruction = (u32 *)location;
> > > > + if (is_mprofile_ftrace_call(symname))
> > > > + continue;
>
> Why do we ignore these symbols?
>
> I can't find any counter-part in apply_relocate_add(). It looks super
> tricky. It would deserve a comment.
>
> And I have no idea how we could maintain these exceptions.
>
> > > > + if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
> > > > + continue;
>
> Same here. It looks super tricky and there is no explanation.

The two checks are from restore_r2(). But I cannot really remember
why we needed them. It is probably an updated version from an earlier
version (3 year earlier..).

>
> > > > + instruction += 1;
> > > > + patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
> > > > + }
> > > > +
> > > > +}
> > >
> > > This looks like a lot of duplicated code. Isn't it?
> >
> > TBH, I think the duplicated code is not really bad.
>
> How exactly do you mean it, please?
>
> Do you think that the amount of duplicated code is small enough?
> Or that the new function looks better that updating the existing one?
>
> > apply_relocate_add() is a much more complicated function, I would
> > rather not mess it up to make this function a little simpler.
>
> IMHO, the duplicated code is quite tricky. And if we really do
> not need to clear all relocation types then we could avoid
> the duplication another way, for example:
>
> int update_relocate_add(Elf64_Shdr *sechdrs,
> const char *strtab,
> unsigned int symindex,
> unsigned int relsec,
> struct module *me,
> bool apply)
> {
> unsigned int i;
> Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
> Elf64_Sym *sym;
> Elf64_Xword r_type;
> unsigned long *location;
>
> if (apply) {
> pr_debug("Applying ADD relocate section %u to %u\n", relsec,
> sechdrs[relsec].sh_info);
> } else {
> pr_debug("Clearing ADD relocate section %u\n", relsec");
> }
>
> for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
> /* This is where to make the change */
> location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
> + rela[i].r_offset;
> /* This is the symbol it is referring to */
> sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
> + ELF64_R_SYM(rela[i].r_info);
>
> r_type = ELF64_R_TYPE(rela[i].r_info);
>
> if (apply) {
> apply_relocate_location(sym, location, r_type, rela[i].r_addend);
> } else {
> clear_relocate_location(sym, location, r_type);
> }

I personally don't like too many "if (apply) {...} else {...}" patterns in
a function. And these new functions confuse me sometimes:

update_relocate_add(..., apply);
apply_relocate_location();
clear_relocate_location();

And I did think there wasn't too much duplicated code.

I know this is very personal. And I can understand your preference.
I can make the code to remove more duplicated code. But I guess
I need a better understanding of powerpc logic..

Thanks,
Song

2022-12-12 08:45:32

by Miroslav Benes

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

> > Petr has commented on the code aspects. I will just add that s390x was not
> > dealt with at the time because there was no live patching support for
> > s390x back then if I remember correctly and my notes do not lie. The same
> > applies to powerpc32. I think that both should be fixed as well with this
> > patch. It might also help to clean up the ifdeffery in the patch a bit.
>
> I don't have test environments for s390 and powerpc, so I really don't know
> whether I am doing something sane for them.

I would say that if you implement it, there are people here who would be
able help with the testing and reviewing the changes if CCed.

> Would you have time to finish these parts? (Or maybe the whole patch..)

Unfortunately I cannot promise anything at the moment. I am really sorry
about that.

Miroslav

2022-12-12 17:29:26

by Petr Mladek

[permalink] [raw]
Subject: Re: powerpc-part: was: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Fri 2022-12-09 11:59:35, Song Liu wrote:
> On Fri, Dec 9, 2022 at 3:41 AM Petr Mladek <[email protected]> wrote:
> > On Mon 2022-11-28 17:57:06, Song Liu wrote:
> > > On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
> > > >
> > > > > --- a/arch/powerpc/kernel/module_64.c
> > > > > +++ b/arch/powerpc/kernel/module_64.c
> > > > > +#ifdef CONFIG_LIVEPATCH
> > > > > +void clear_relocate_add(Elf64_Shdr *sechdrs,
> > > > > + const char *strtab,
> > > > > + unsigned int symindex,
> > > > > + unsigned int relsec,
> > > > > + struct module *me)
> > > > > +{

[...]

> > > > > +
> > > > > + instruction = (u32 *)location;
> > > > > + if (is_mprofile_ftrace_call(symname))
> > > > > + continue;
> >
> > Why do we ignore these symbols?
> >
> > I can't find any counter-part in apply_relocate_add(). It looks super
> > tricky. It would deserve a comment.
> >
> > And I have no idea how we could maintain these exceptions.
> >
> > > > > + if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
> > > > > + continue;
> >
> > Same here. It looks super tricky and there is no explanation.
>
> The two checks are from restore_r2(). But I cannot really remember
> why we needed them. It is probably an updated version from an earlier
> version (3 year earlier..).

This is a good sign that it has to be explained in a comment.
Or even better, it should not by copy pasted.

> > > > > + instruction += 1;
> > > > > + patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));

I believe that this is not enough. apply_relocate_add() does this:

int apply_relocate_add(Elf64_Shdr *sechdrs,
[...]
struct module *me)
{
[...]
case R_PPC_REL24:
/* FIXME: Handle weak symbols here --RR */
if (sym->st_shndx == SHN_UNDEF ||
sym->st_shndx == SHN_LIVEPATCH) {
[...]
if (!restore_r2(strtab + sym->st_name,
(u32 *)location + 1, me))
[...] return -ENOEXEC;

---> if (patch_instruction((u32 *)location, ppc_inst(value)))
return -EFAULT;

, where restore_r2() does:

static int restore_r2(const char *name, u32 *instruction, struct module *me)
{
[...]
/* ld r2,R2_STACK_OFFSET(r1) */
---> if (patch_instruction(instruction, ppc_inst(PPC_INST_LD_TOC)))
return 0;
[...]
}

By other words, apply_relocate_add() modifies two instructions:

+ patch_instruction() called in restore_r2() writes into "location + 1"
+ patch_instruction() called in apply_relocate_add() writes into "location"

IMHO, we have to clear both.

IMHO, we need to implement a function that reverts the changes done
in restore_r2(). Also we need to revert the changes done in
apply_relocate_add().

Please, avoid code duplication as much as possible. Especially,
the two checks is_mprofile_ftrace_call() and
instr_is_relative_link_branch() must be shared. IMHO, it is
the only way to keep the code maintainable. We must make sure that
we will clear the instructions only when they were patched. And
copy pasting various tricky exceptions is a way to hell.


> > int update_relocate_add(Elf64_Shdr *sechdrs,
> > const char *strtab,
> > unsigned int symindex,
> > unsigned int relsec,
> > struct module *me,
> > bool apply)
> > {
> > unsigned int i;
> > Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
> > Elf64_Sym *sym;
> > Elf64_Xword r_type;
> > unsigned long *location;
> >
> > if (apply) {
> > pr_debug("Applying ADD relocate section %u to %u\n", relsec,
> > sechdrs[relsec].sh_info);
> > } else {
> > pr_debug("Clearing ADD relocate section %u\n", relsec");
> > }
> >
> > for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
> > /* This is where to make the change */
> > location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
> > + rela[i].r_offset;
> > /* This is the symbol it is referring to */
> > sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
> > + ELF64_R_SYM(rela[i].r_info);
> >
> > r_type = ELF64_R_TYPE(rela[i].r_info);
> >
> > if (apply) {
> > apply_relocate_location(sym, location, r_type, rela[i].r_addend);
> > } else {
> > clear_relocate_location(sym, location, r_type);
> > }
>
> I personally don't like too many "if (apply) {...} else {...}" patterns in
> a function. And these new functions confuse me sometimes:
>
> update_relocate_add(..., apply);
> apply_relocate_location();
> clear_relocate_location();

Feel free to come up with another way how to avoid code duplication.

> And I did think there wasn't too much duplicated code.

I think that it looks very different when you are writing or reading
or mantainting the code. It might be easier to write code and modify
it. It is more complicated to find the differences later. Also it is
more complicated to do the same changes many times when the common
code is updated later.

Best Regards,
Petr

2022-12-12 23:45:45

by Song Liu

[permalink] [raw]
Subject: Re: powerpc-part: was: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Mon, Dec 12, 2022 at 9:12 AM Petr Mladek <[email protected]> wrote:
>
> On Fri 2022-12-09 11:59:35, Song Liu wrote:
> > On Fri, Dec 9, 2022 at 3:41 AM Petr Mladek <[email protected]> wrote:
> > > On Mon 2022-11-28 17:57:06, Song Liu wrote:
> > > > On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
> > > > >
> > > > > > --- a/arch/powerpc/kernel/module_64.c
> > > > > > +++ b/arch/powerpc/kernel/module_64.c
> > > > > > +#ifdef CONFIG_LIVEPATCH
> > > > > > +void clear_relocate_add(Elf64_Shdr *sechdrs,
> > > > > > + const char *strtab,
> > > > > > + unsigned int symindex,
> > > > > > + unsigned int relsec,
> > > > > > + struct module *me)
> > > > > > +{
>
> [...]
>
> > > > > > +
> > > > > > + instruction = (u32 *)location;
> > > > > > + if (is_mprofile_ftrace_call(symname))
> > > > > > + continue;
> > >
> > > Why do we ignore these symbols?
> > >
> > > I can't find any counter-part in apply_relocate_add(). It looks super
> > > tricky. It would deserve a comment.
> > >
> > > And I have no idea how we could maintain these exceptions.
> > >
> > > > > > + if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
> > > > > > + continue;
> > >
> > > Same here. It looks super tricky and there is no explanation.
> >
> > The two checks are from restore_r2(). But I cannot really remember
> > why we needed them. It is probably an updated version from an earlier
> > version (3 year earlier..).
>
> This is a good sign that it has to be explained in a comment.
> Or even better, it should not by copy pasted.
>
> > > > > > + instruction += 1;
> > > > > > + patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
>
> I believe that this is not enough. apply_relocate_add() does this:
>
> int apply_relocate_add(Elf64_Shdr *sechdrs,
> [...]
> struct module *me)
> {
> [...]
> case R_PPC_REL24:
> /* FIXME: Handle weak symbols here --RR */
> if (sym->st_shndx == SHN_UNDEF ||
> sym->st_shndx == SHN_LIVEPATCH) {
> [...]
> if (!restore_r2(strtab + sym->st_name,
> (u32 *)location + 1, me))
> [...] return -ENOEXEC;
>
> ---> if (patch_instruction((u32 *)location, ppc_inst(value)))
> return -EFAULT;
>
> , where restore_r2() does:
>
> static int restore_r2(const char *name, u32 *instruction, struct module *me)
> {
> [...]
> /* ld r2,R2_STACK_OFFSET(r1) */
> ---> if (patch_instruction(instruction, ppc_inst(PPC_INST_LD_TOC)))
> return 0;
> [...]
> }
>
> By other words, apply_relocate_add() modifies two instructions:
>
> + patch_instruction() called in restore_r2() writes into "location + 1"
> + patch_instruction() called in apply_relocate_add() writes into "location"
>
> IMHO, we have to clear both.
>
> IMHO, we need to implement a function that reverts the changes done
> in restore_r2(). Also we need to revert the changes done in
> apply_relocate_add().
>
> Please, avoid code duplication as much as possible. Especially,
> the two checks is_mprofile_ftrace_call() and
> instr_is_relative_link_branch() must be shared. IMHO, it is
> the only way to keep the code maintainable. We must make sure that
> we will clear the instructions only when they were patched. And
> copy pasting various tricky exceptions is a way to hell.
>
>
> > > int update_relocate_add(Elf64_Shdr *sechdrs,
> > > const char *strtab,
> > > unsigned int symindex,
> > > unsigned int relsec,
> > > struct module *me,
> > > bool apply)
> > > {
> > > unsigned int i;
> > > Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
> > > Elf64_Sym *sym;
> > > Elf64_Xword r_type;
> > > unsigned long *location;
> > >
> > > if (apply) {
> > > pr_debug("Applying ADD relocate section %u to %u\n", relsec,
> > > sechdrs[relsec].sh_info);
> > > } else {
> > > pr_debug("Clearing ADD relocate section %u\n", relsec");
> > > }
> > >
> > > for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
> > > /* This is where to make the change */
> > > location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
> > > + rela[i].r_offset;
> > > /* This is the symbol it is referring to */
> > > sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
> > > + ELF64_R_SYM(rela[i].r_info);
> > >
> > > r_type = ELF64_R_TYPE(rela[i].r_info);
> > >
> > > if (apply) {
> > > apply_relocate_location(sym, location, r_type, rela[i].r_addend);
> > > } else {
> > > clear_relocate_location(sym, location, r_type);
> > > }
> >
> > I personally don't like too many "if (apply) {...} else {...}" patterns in
> > a function. And these new functions confuse me sometimes:
> >
> > update_relocate_add(..., apply);
> > apply_relocate_location();
> > clear_relocate_location();
>
> Feel free to come up with another way how to avoid code duplication.
>
> > And I did think there wasn't too much duplicated code.
>
> I think that it looks very different when you are writing or reading
> or mantainting the code. It might be easier to write code and modify
> it. It is more complicated to find the differences later. Also it is
> more complicated to do the same changes many times when the common
> code is updated later.

Agreed that we need to keep the code understandable and maintainable.
And I need to try harder to understand how these codes work as-is.

Thanks,
Song

2022-12-13 08:42:17

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Fri, Dec 9, 2022 at 4:55 AM Miroslav Benes <[email protected]> wrote:
>
> Hi,
>
> first thank you for taking over and I also appologize for not replying
> much sooner.
>
> On Thu, 1 Sep 2022, Song Liu wrote:
>
> > From: Miroslav Benes <[email protected]>
> >
> > Josh reported a bug:
> >
> > When the object to be patched is a module, and that module is
> > rmmod'ed and reloaded, it fails to load with:
> >
> > module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
> > livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> > livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> >
> > The livepatch module has a relocation which references a symbol
> > in the _previous_ loading of nfsd. When apply_relocate_add()
> > tries to replace the old relocation with a new one, it sees that
> > the previous one is nonzero and it errors out.
> >
> > On ppc64le, we have a similar issue:
> >
> > module_64: livepatch_nfsd: Expected nop after call, got e8410018 at e_show+0x60/0x548 [livepatch_nfsd]
> > livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> > livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> >
> > He also proposed three different solutions. We could remove the error
> > check in apply_relocate_add() introduced by commit eda9cec4c9a1
> > ("x86/module: Detect and skip invalid relocations"). However the check
> > is useful for detecting corrupted modules.
> >
> > We could also deny the patched modules to be removed. If it proved to be
> > a major drawback for users, we could still implement a different
> > approach. The solution would also complicate the existing code a lot.
> >
> > We thus decided to reverse the relocation patching (clear all relocation
> > targets on x86_64). The solution is not
> > universal and is too much arch-specific, but it may prove to be simpler
> > in the end.
> >
> > Reported-by: Josh Poimboeuf <[email protected]>
> > Signed-off-by: Miroslav Benes <[email protected]>
> > Signed-off-by: Song Liu <[email protected]>
>
> Petr has commented on the code aspects. I will just add that s390x was not
> dealt with at the time because there was no live patching support for
> s390x back then if I remember correctly and my notes do not lie. The same
> applies to powerpc32. I think that both should be fixed as well with this
> patch. It might also help to clean up the ifdeffery in the patch a bit.

After reading the code (no testing), I think we don't need any logic for
ppc32 and s390.

We need clear_relocate_add() to handle module reload failure.
The failure happens when we

1) call apply_relocate_add() on klp load (or module first load,
if klp was loaded first);
2) do nothing when the module is unloaded;
3) call apply_relocate_add() on module reload, which failed.

This failure happens in the sanity check in
apply_relocate_add().

For x86, the check is something like:
if (*(s32 *)loc != 0)
goto invalid_relocation;

For ppc64, the check is in restore_r2():

if (*instruction != PPC_RAW_NOP()) {
pr_err("%s: Expected nop after call, got %08x at %pS\n",
me->name, *instruction, instruction);
return 0;
}

I don't think we have similar checks for ppc32 and s390, so
clear_relocate_add() is not needed for the two.

OTOH, we can argue that clear_relocate_add() should undo
everything apply_relocate_add() did. But I do think that
will be an overkill.

WDYT?

Thanks,
Song

2022-12-13 08:55:08

by Song Liu

[permalink] [raw]
Subject: Re: powerpc-part: was: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

)() ()On Mon, Dec 12, 2022 at 9:12 AM Petr Mladek <[email protected]> wrote:
>
> On Fri 2022-12-09 11:59:35, Song Liu wrote:
> > On Fri, Dec 9, 2022 at 3:41 AM Petr Mladek <[email protected]> wrote:
> > > On Mon 2022-11-28 17:57:06, Song Liu wrote:
> > > > On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
> > > > >
> > > > > > --- a/arch/powerpc/kernel/module_64.c
> > > > > > +++ b/arch/powerpc/kernel/module_64.c
> > > > > > +#ifdef CONFIG_LIVEPATCH
> > > > > > +void clear_relocate_add(Elf64_Shdr *sechdrs,
> > > > > > + const char *strtab,
> > > > > > + unsigned int symindex,
> > > > > > + unsigned int relsec,
> > > > > > + struct module *me)
> > > > > > +{
>
> [...]
>
> > > > > > +
> > > > > > + instruction = (u32 *)location;
> > > > > > + if (is_mprofile_ftrace_call(symname))
> > > > > > + continue;
> > >
> > > Why do we ignore these symbols?
> > >
> > > I can't find any counter-part in apply_relocate_add(). It looks super
> > > tricky. It would deserve a comment.
> > >
> > > And I have no idea how we could maintain these exceptions.
> > >
> > > > > > + if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
> > > > > > + continue;
> > >
> > > Same here. It looks super tricky and there is no explanation.
> >
> > The two checks are from restore_r2(). But I cannot really remember
> > why we needed them. It is probably an updated version from an earlier
> > version (3 year earlier..).
>
> This is a good sign that it has to be explained in a comment.
> Or even better, it should not by copy pasted.
>
> > > > > > + instruction += 1;
> > > > > > + patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
>
> I believe that this is not enough. apply_relocate_add() does this:
>
> int apply_relocate_add(Elf64_Shdr *sechdrs,
> [...]
> struct module *me)
> {
> [...]
> case R_PPC_REL24:
> /* FIXME: Handle weak symbols here --RR */
> if (sym->st_shndx == SHN_UNDEF ||
> sym->st_shndx == SHN_LIVEPATCH) {
> [...]
> if (!restore_r2(strtab + sym->st_name,
> (u32 *)location + 1, me))
> [...] return -ENOEXEC;
>
> ---> if (patch_instruction((u32 *)location, ppc_inst(value)))
> return -EFAULT;
>
> , where restore_r2() does:
>
> static int restore_r2(const char *name, u32 *instruction, struct module *me)
> {
> [...]
> /* ld r2,R2_STACK_OFFSET(r1) */
> ---> if (patch_instruction(instruction, ppc_inst(PPC_INST_LD_TOC)))
> return 0;
> [...]
> }
>
> By other words, apply_relocate_add() modifies two instructions:
>
> + patch_instruction() called in restore_r2() writes into "location + 1"
> + patch_instruction() called in apply_relocate_add() writes into "location"
>
> IMHO, we have to clear both.
>
> IMHO, we need to implement a function that reverts the changes done
> in restore_r2(). Also we need to revert the changes done in
> apply_relocate_add().

I finally got time to read all the details again and recalled what
happened with the code.

The failure happens when we
1) call apply_relocate_add() on klp load (or module first load,
if klp was loaded first);
2) do nothing when the module is unloaded;
3) call apply_relocate_add() on module reload, which failed.

The failure happens at this check in restore_r2():

if (*instruction != PPC_RAW_NOP()) {
pr_err("%s: Expected nop after call, got %08x at %pS\n",
me->name, *instruction, instruction);
return 0;
}

Therefore, apply_relocate_add only fails when "location + 1"
is not NOP. And to make it not fail, we only need to write NOP to
"location + 1" in clear_relocate_add().

IIUC, you want clear_relocate_add() to undo everything we did
in apply_relocate_add(); while I was writing clear_relocate_add()
to make the next apply_relocate_add() not fail.

I agree that, based on the name, clear_relocate_add() should
undo everything by apply_relocate_add(). But I am not sure how
to handle some cases. For example, how do we undo

case R_PPC64_ADDR32:
/* Simply set it */
*(u32 *)location = value;
break;

Shall we just write zeros? I don't think this matters.

I think this is the question we should answer first:
What shall clear_relocate_add() do?
1) undo everything by apply_relocate_add();
2) only do things needed to make the next
apply_relocate_add succeed;
3) something between 1) and 2).

WDYT?

Thanks,
Song

>
> Please, avoid code duplication as much as possible. Especially,
> the two checks is_mprofile_ftrace_call() and
> instr_is_relative_link_branch() must be shared. IMHO, it is
> the only way to keep the code maintainable. We must make sure that
> we will clear the instructions only when they were patched. And
> copy pasting various tricky exceptions is a way to hell.
>
>
> > > int update_relocate_add(Elf64_Shdr *sechdrs,
> > > const char *strtab,
> > > unsigned int symindex,
> > > unsigned int relsec,
> > > struct module *me,
> > > bool apply)
> > > {
> > > unsigned int i;
> > > Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
> > > Elf64_Sym *sym;
> > > Elf64_Xword r_type;
> > > unsigned long *location;
> > >
> > > if (apply) {
> > > pr_debug("Applying ADD relocate section %u to %u\n", relsec,
> > > sechdrs[relsec].sh_info);
> > > } else {
> > > pr_debug("Clearing ADD relocate section %u\n", relsec");
> > > }
> > >
> > > for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
> > > /* This is where to make the change */
> > > location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
> > > + rela[i].r_offset;
> > > /* This is the symbol it is referring to */
> > > sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
> > > + ELF64_R_SYM(rela[i].r_info);
> > >
> > > r_type = ELF64_R_TYPE(rela[i].r_info);
> > >
> > > if (apply) {
> > > apply_relocate_location(sym, location, r_type, rela[i].r_addend);
> > > } else {
> > > clear_relocate_location(sym, location, r_type);
> > > }
> >
> > I personally don't like too many "if (apply) {...} else {...}" patterns in
> > a function. And these new functions confuse me sometimes:
> >
> > update_relocate_add(..., apply);
> > apply_relocate_location();
> > clear_relocate_location();
>
> Feel free to come up with another way how to avoid code duplication.
>
> > And I did think there wasn't too much duplicated code.
>
> I think that it looks very different when you are writing or reading
> or mantainting the code. It might be easier to write code and modify
> it. It is more complicated to find the differences later. Also it is
> more complicated to do the same changes many times when the common
> code is updated later.
>
> Best Regards,
> Petr

2022-12-13 13:36:43

by Petr Mladek

[permalink] [raw]
Subject: Re: powerpc-part: was: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Tue 2022-12-13 00:13:46, Song Liu wrote:
> )() ()On Mon, Dec 12, 2022 at 9:12 AM Petr Mladek <[email protected]> wrote:
> >
> > On Fri 2022-12-09 11:59:35, Song Liu wrote:
> > > On Fri, Dec 9, 2022 at 3:41 AM Petr Mladek <[email protected]> wrote:
> > > > On Mon 2022-11-28 17:57:06, Song Liu wrote:
> > > > > On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
> > > > > >
> > > > > > > --- a/arch/powerpc/kernel/module_64.c
> > > > > > > +++ b/arch/powerpc/kernel/module_64.c
> > > > > > > +#ifdef CONFIG_LIVEPATCH
> > > > > > > +void clear_relocate_add(Elf64_Shdr *sechdrs,
> > > > > > > + const char *strtab,
> > > > > > > + unsigned int symindex,
> > > > > > > + unsigned int relsec,
> > > > > > > + struct module *me)
> > > > > > > +{
> >
> > [...]
> >
> > > > > > > +
> > > > > > > + instruction = (u32 *)location;
> > > > > > > + if (is_mprofile_ftrace_call(symname))
> > > > > > > + continue;
> > > >
> > > > Why do we ignore these symbols?
> > > >
> > > > I can't find any counter-part in apply_relocate_add(). It looks super
> > > > tricky. It would deserve a comment.
> > > >
> > > > And I have no idea how we could maintain these exceptions.
> > > >
> > > > > > > + if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
> > > > > > > + continue;
> > > >
> > > > Same here. It looks super tricky and there is no explanation.
> > >
> > > The two checks are from restore_r2(). But I cannot really remember
> > > why we needed them. It is probably an updated version from an earlier
> > > version (3 year earlier..).
> >
> > This is a good sign that it has to be explained in a comment.
> > Or even better, it should not by copy pasted.
> >
> > > > > > > + instruction += 1;
> > > > > > > + patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
> >
> > I believe that this is not enough. apply_relocate_add() does this:
> >
> > int apply_relocate_add(Elf64_Shdr *sechdrs,
> > [...]
> > struct module *me)
> > {
> > [...]
> > case R_PPC_REL24:
> > /* FIXME: Handle weak symbols here --RR */
> > if (sym->st_shndx == SHN_UNDEF ||
> > sym->st_shndx == SHN_LIVEPATCH) {
> > [...]
> > if (!restore_r2(strtab + sym->st_name,
> > (u32 *)location + 1, me))
> > [...] return -ENOEXEC;
> >
> > ---> if (patch_instruction((u32 *)location, ppc_inst(value)))
> > return -EFAULT;
> >
> > , where restore_r2() does:
> >
> > static int restore_r2(const char *name, u32 *instruction, struct module *me)
> > {
> > [...]
> > /* ld r2,R2_STACK_OFFSET(r1) */
> > ---> if (patch_instruction(instruction, ppc_inst(PPC_INST_LD_TOC)))
> > return 0;
> > [...]
> > }
> >
> > By other words, apply_relocate_add() modifies two instructions:
> >
> > + patch_instruction() called in restore_r2() writes into "location + 1"
> > + patch_instruction() called in apply_relocate_add() writes into "location"
> >
> > IMHO, we have to clear both.
> >
> > IMHO, we need to implement a function that reverts the changes done
> > in restore_r2(). Also we need to revert the changes done in
> > apply_relocate_add().
>
> I finally got time to read all the details again and recalled what
> happened with the code.
>
> The failure happens when we
> 1) call apply_relocate_add() on klp load (or module first load,
> if klp was loaded first);
> 2) do nothing when the module is unloaded;
> 3) call apply_relocate_add() on module reload, which failed.
>
> The failure happens at this check in restore_r2():
>
> if (*instruction != PPC_RAW_NOP()) {
> pr_err("%s: Expected nop after call, got %08x at %pS\n",
> me->name, *instruction, instruction);
> return 0;
> }
>
> Therefore, apply_relocate_add only fails when "location + 1"
> is not NOP. And to make it not fail, we only need to write NOP to
> "location + 1" in clear_relocate_add().

Yes, this should be enough to pass the existing check.

> IIUC, you want clear_relocate_add() to undo everything we did
> in apply_relocate_add(); while I was writing clear_relocate_add()
> to make the next apply_relocate_add() not fail.
>
> I agree that, based on the name, clear_relocate_add() should
> undo everything by apply_relocate_add(). But I am not sure how
> to handle some cases. For example, how do we undo
>
> case R_PPC64_ADDR32:
> /* Simply set it */
> *(u32 *)location = value;
> break;
>
> Shall we just write zeros? I don't think this matters.

I guess that it would be zeros as we do in x86_64.


> I think this is the question we should answer first:
> What shall clear_relocate_add() do?
> 1) undo everything by apply_relocate_add();
> 2) only do things needed to make the next
> apply_relocate_add succeed;
> 3) something between 1) and 2).

Good question.

Hmm, the commit a443bf6e8a7674b86221f49 ("powerpc/modules: Add REL24
relocation support of livepatch symbols") suggests that all symbols
in the section SHN_LIVEPATCH have the type R_PPC_REL24. AFAIK, the
kernel livepatches are the only user of the clear_relocate_add()
feature.

If the above is correct then it might be enough to clear only
R_PPC_REL24 type. And it might be enough to warn when clear_relocate_add()
is called for another type so that we know when the relocations
were not cleared properly.

Good question. We might need some input from people familiar
with the architecture and creating the livepatches.

Best Regards,
Petr

2022-12-13 15:30:46

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Tue 2022-12-13 00:28:34, Song Liu wrote:
> On Fri, Dec 9, 2022 at 4:55 AM Miroslav Benes <[email protected]> wrote:
> >
> > Hi,
> >
> > first thank you for taking over and I also appologize for not replying
> > much sooner.
> >
> > On Thu, 1 Sep 2022, Song Liu wrote:
> >
> > > From: Miroslav Benes <[email protected]>
> > >
> > > Josh reported a bug:
> > >
> > > When the object to be patched is a module, and that module is
> > > rmmod'ed and reloaded, it fails to load with:
> > >
> > > module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
> > > livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> > > livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> > >
> > > The livepatch module has a relocation which references a symbol
> > > in the _previous_ loading of nfsd. When apply_relocate_add()
> > > tries to replace the old relocation with a new one, it sees that
> > > the previous one is nonzero and it errors out.
> > >
> > > On ppc64le, we have a similar issue:
> > >
> > > module_64: livepatch_nfsd: Expected nop after call, got e8410018 at e_show+0x60/0x548 [livepatch_nfsd]
> > > livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> > > livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> > >
> > > He also proposed three different solutions. We could remove the error
> > > check in apply_relocate_add() introduced by commit eda9cec4c9a1
> > > ("x86/module: Detect and skip invalid relocations"). However the check
> > > is useful for detecting corrupted modules.
> > >
> > > We could also deny the patched modules to be removed. If it proved to be
> > > a major drawback for users, we could still implement a different
> > > approach. The solution would also complicate the existing code a lot.
> > >
> > > We thus decided to reverse the relocation patching (clear all relocation
> > > targets on x86_64). The solution is not
> > > universal and is too much arch-specific, but it may prove to be simpler
> > > in the end.
> > >
> > > Reported-by: Josh Poimboeuf <[email protected]>
> > > Signed-off-by: Miroslav Benes <[email protected]>
> > > Signed-off-by: Song Liu <[email protected]>
> >
> > Petr has commented on the code aspects. I will just add that s390x was not
> > dealt with at the time because there was no live patching support for
> > s390x back then if I remember correctly and my notes do not lie. The same
> > applies to powerpc32. I think that both should be fixed as well with this
> > patch. It might also help to clean up the ifdeffery in the patch a bit.
>
> After reading the code (no testing), I think we don't need any logic for
> ppc32 and s390.
>
> We need clear_relocate_add() to handle module reload failure.
>
> I don't think we have similar checks for ppc32 and s390, so
> clear_relocate_add() is not needed for the two.
>
> OTOH, we can argue that clear_relocate_add() should undo
> everything apply_relocate_add() did. But I do think that
> will be an overkill.

It is true that we do not need to clear the relocations if the values
are not checked in apply_relocated_add().

I do not have strong opinion whether we should do it or not.

One one hand, the clearing code might introduce a bug if it modifies
some wrong location. So, it might do more harm then good.

One the other hand, it feels bad when a code is jumping to a
non-existing address. I know, nobody should call this code.
But it is still a kind of a security hole.

Well, I think that we could keep the clearing functions empty
on ppc32 and s390 in this patch(set). It won't be worse than
it is now. And perfection is the enemy of good.

Best Regards,
Petr

2022-12-13 19:50:58

by Song Liu

[permalink] [raw]
Subject: Re: powerpc-part: was: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On Fri, Dec 9, 2022 at 3:41 AM Petr Mladek <[email protected]> wrote:
>
> Hi,
>
> this reply is only about the powerpc-specific part.
>
> Also adding Kamalesh and Michael into Cc who worked on the related
> commit a443bf6e8a7674b86221f49 ("powerpc/modules: Add REL24 relocation
> support of livepatch symbols").
>
>
> On Mon 2022-11-28 17:57:06, Song Liu wrote:
> > On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
> > >
> > > > --- a/arch/powerpc/kernel/module_64.c
> > > > +++ b/arch/powerpc/kernel/module_64.c
>
> I put back the name of the modified file so that it is easier
> to know what changes we are talking about.
>
> [...]
> > > > +#ifdef CONFIG_LIVEPATCH
> > > > +void clear_relocate_add(Elf64_Shdr *sechdrs,
> > > > + const char *strtab,
> > > > + unsigned int symindex,
> > > > + unsigned int relsec,
> > > > + struct module *me)
> > > > +{
> > > > + unsigned int i;
> > > > + Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
> > > > + Elf64_Sym *sym;
> > > > + unsigned long *location;
> > > > + const char *symname;
> > > > + u32 *instruction;
> > > > +
> > > > + pr_debug("Clearing ADD relocate section %u to %u\n", relsec,
> > > > + sechdrs[relsec].sh_info);
> > > > +
> > > > + for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
> > > > + location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
> > > > + + rela[i].r_offset;
> > > > + sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
> > > > + + ELF64_R_SYM(rela[i].r_info);
> > > > + symname = me->core_kallsyms.strtab
> > > > + + sym->st_name;
>
> The above calculation is quite complex. It seems to be copied from
> apply_relocate_add(). If I maintained this code I would want to avoid
> the duplication. definitely.
>

Back to this one...

If we go with option 2 that clear_relocate_add() only does things
needed to make the next apply_relocate_add() succeed, we are
not likely to have one write_relocate_add(), which is shared by
apply_relocate_add() and clear_relocate_add(). To avoid
duplication, shall we have two helpers to calculate location and
sym? Or maybe one more to calculate symname? I personally
don't like such one liner helper with multiple arguments, such as

static unsigned long *get_location(Elf64_Shdr *sechdrs,
unsigned int relsec, unsigned int idx)
{
Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;

return (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
+ rela[idx].r_offset;
}

Then use it as
location = get_location(sechdrs, relsec, i);

We also need get_sym(), which is similar to get_location.

We can probably pass in different arguments. But I don't find
any options that I like. I think duplicate some code is better in
this case. However, if you do think these helpers are better,
or have other suggestions, I won't insist further.

Thanks,
Song

2022-12-13 22:43:54

by Joe Lawrence

[permalink] [raw]
Subject: Re: powerpc-part: was: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

On 12/13/22 8:29 AM, Petr Mladek wrote:
> On Tue 2022-12-13 00:13:46, Song Liu wrote:
>> )() ()On Mon, Dec 12, 2022 at 9:12 AM Petr Mladek <[email protected]> wrote:
>>>
>>> On Fri 2022-12-09 11:59:35, Song Liu wrote:
>>>> On Fri, Dec 9, 2022 at 3:41 AM Petr Mladek <[email protected]> wrote:
>>>>> On Mon 2022-11-28 17:57:06, Song Liu wrote:
>>>>>> On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <[email protected]> wrote:
>>>>>>>
>>>>>>>> --- a/arch/powerpc/kernel/module_64.c
>>>>>>>> +++ b/arch/powerpc/kernel/module_64.c
>>>>>>>> +#ifdef CONFIG_LIVEPATCH
>>>>>>>> +void clear_relocate_add(Elf64_Shdr *sechdrs,
>>>>>>>> + const char *strtab,
>>>>>>>> + unsigned int symindex,
>>>>>>>> + unsigned int relsec,
>>>>>>>> + struct module *me)
>>>>>>>> +{
>>>
>>> [...]
>>>
>>>>>>>> +
>>>>>>>> + instruction = (u32 *)location;
>>>>>>>> + if (is_mprofile_ftrace_call(symname))
>>>>>>>> + continue;
>>>>>
>>>>> Why do we ignore these symbols?
>>>>>
>>>>> I can't find any counter-part in apply_relocate_add(). It looks super
>>>>> tricky. It would deserve a comment.
>>>>>
>>>>> And I have no idea how we could maintain these exceptions.
>>>>>
>>>>>>>> + if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
>>>>>>>> + continue;
>>>>>
>>>>> Same here. It looks super tricky and there is no explanation.
>>>>
>>>> The two checks are from restore_r2(). But I cannot really remember
>>>> why we needed them. It is probably an updated version from an earlier
>>>> version (3 year earlier..).
>>>
>>> This is a good sign that it has to be explained in a comment.
>>> Or even better, it should not by copy pasted.
>>>
>>>>>>>> + instruction += 1;
>>>>>>>> + patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
>>>
>>> I believe that this is not enough. apply_relocate_add() does this:
>>>
>>> int apply_relocate_add(Elf64_Shdr *sechdrs,
>>> [...]
>>> struct module *me)
>>> {
>>> [...]
>>> case R_PPC_REL24:
>>> /* FIXME: Handle weak symbols here --RR */
>>> if (sym->st_shndx == SHN_UNDEF ||
>>> sym->st_shndx == SHN_LIVEPATCH) {
>>> [...]
>>> if (!restore_r2(strtab + sym->st_name,
>>> (u32 *)location + 1, me))
>>> [...] return -ENOEXEC;
>>>
>>> ---> if (patch_instruction((u32 *)location, ppc_inst(value)))
>>> return -EFAULT;
>>>
>>> , where restore_r2() does:
>>>
>>> static int restore_r2(const char *name, u32 *instruction, struct module *me)
>>> {
>>> [...]
>>> /* ld r2,R2_STACK_OFFSET(r1) */
>>> ---> if (patch_instruction(instruction, ppc_inst(PPC_INST_LD_TOC)))
>>> return 0;
>>> [...]
>>> }
>>>
>>> By other words, apply_relocate_add() modifies two instructions:
>>>
>>> + patch_instruction() called in restore_r2() writes into "location + 1"
>>> + patch_instruction() called in apply_relocate_add() writes into "location"
>>>
>>> IMHO, we have to clear both.
>>>
>>> IMHO, we need to implement a function that reverts the changes done
>>> in restore_r2(). Also we need to revert the changes done in
>>> apply_relocate_add().
>>
>> I finally got time to read all the details again and recalled what
>> happened with the code.
>>
>> The failure happens when we
>> 1) call apply_relocate_add() on klp load (or module first load,
>> if klp was loaded first);
>> 2) do nothing when the module is unloaded;
>> 3) call apply_relocate_add() on module reload, which failed.
>>
>> The failure happens at this check in restore_r2():
>>
>> if (*instruction != PPC_RAW_NOP()) {
>> pr_err("%s: Expected nop after call, got %08x at %pS\n",
>> me->name, *instruction, instruction);
>> return 0;
>> }
>>
>> Therefore, apply_relocate_add only fails when "location + 1"
>> is not NOP. And to make it not fail, we only need to write NOP to
>> "location + 1" in clear_relocate_add().
>
> Yes, this should be enough to pass the existing check.
>
>> IIUC, you want clear_relocate_add() to undo everything we did
>> in apply_relocate_add(); while I was writing clear_relocate_add()
>> to make the next apply_relocate_add() not fail.
>>
>> I agree that, based on the name, clear_relocate_add() should
>> undo everything by apply_relocate_add(). But I am not sure how
>> to handle some cases. For example, how do we undo
>>
>> case R_PPC64_ADDR32:
>> /* Simply set it */
>> *(u32 *)location = value;
>> break;
>>
>> Shall we just write zeros? I don't think this matters.
>
> I guess that it would be zeros as we do in x86_64.
>
>
>> I think this is the question we should answer first:
>> What shall clear_relocate_add() do?
>> 1) undo everything by apply_relocate_add();
>> 2) only do things needed to make the next
>> apply_relocate_add succeed;
>> 3) something between 1) and 2).
>
> Good question.
>
> Hmm, the commit a443bf6e8a7674b86221f49 ("powerpc/modules: Add REL24
> relocation support of livepatch symbols") suggests that all symbols
> in the section SHN_LIVEPATCH have the type R_PPC_REL24. AFAIK, the
> kernel livepatches are the only user of the clear_relocate_add()
> feature.
>
> If the above is correct then it might be enough to clear only
> R_PPC_REL24 type. And it might be enough to warn when clear_relocate_add()
> is called for another type so that we know when the relocations
> were not cleared properly.
>
> Good question. We might need some input from people familiar
> with the architecture and creating the livepatches.
>

Adding Russell to the to CC list as he worked some of recent ppc64le
livepatch klp-relocation threads [1] [2].

Maybe it would simpler to first organize a cleanup of the code, then add
the capability to undo the relocations? According to [2] and the last
comment on [3], it sounded like the Power folks had a "full"(er)
solution in mind depending on our requirements.

Finally, I'll try to finish my v6.1 rebase of the klp-convert patchset
this week. That includes a bunch of kselftests that generate all manner
of klp-relocation types and sections. (More than I've ever seen out of
kpatch-build.)

[1] https://lore.kernel.org/linuxppc-dev/[email protected]/
[2] https://lore.kernel.org/linuxppc-dev/[email protected]/
[3] https://github.com/linuxppc/issues/issues/375

--
Joe