2022-05-18 18:24:29

by Naveen N. Rao

[permalink] [raw]
Subject: [PATCH] kexec_file: Drop weak attribute from arch_kexec_apply_relocations[_add]

Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
symbols") [1], binutils (v2.36+) started dropping section symbols that
it thought were unused. This isn't an issue in general, but with
kexec_file.c, gcc is placing kexec_arch_apply_relocations[_add] into a
separate .text.unlikely section and the section symbol ".text.unlikely"
is being dropped. Due to this, recordmcount is unable to find a non-weak
symbol in .text.unlikely to generate a relocation record against.

Address this by dropping the weak attribute from these functions:
- arch_kexec_apply_relocations() is not overridden by any architecture
today, so just drop the weak attribute.
- arch_kexec_apply_relocations_add() is only overridden by x86 and s390.
Retain the function prototype for those and move the weak
implementation into the header as a static inline for other
architectures.

[1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1

Signed-off-by: Naveen N. Rao <[email protected]>
---
include/linux/kexec.h | 28 ++++++++++++++++++++++++----
kernel/kexec_file.c | 19 +------------------
2 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 58d1b58a971e34..e656f981f43a73 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -193,10 +193,6 @@ void *kexec_purgatory_get_symbol_addr(struct kimage *image, const char *name);
int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
unsigned long buf_len);
void *arch_kexec_kernel_image_load(struct kimage *image);
-int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
- Elf_Shdr *section,
- const Elf_Shdr *relsec,
- const Elf_Shdr *symtab);
int arch_kexec_apply_relocations(struct purgatory_info *pi,
Elf_Shdr *section,
const Elf_Shdr *relsec,
@@ -229,6 +225,30 @@ extern int crash_exclude_mem_range(struct crash_mem *mem,
unsigned long long mend);
extern int crash_prepare_elf64_headers(struct crash_mem *mem, int kernel_map,
void **addr, unsigned long *sz);
+
+#if defined(CONFIG_X86_64) || defined(CONFIG_S390)
+int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
+ Elf_Shdr *section,
+ const Elf_Shdr *relsec,
+ const Elf_Shdr *symtab);
+#else
+/*
+ * arch_kexec_apply_relocations_add - apply relocations of type RELA
+ * @pi: Purgatory to be relocated.
+ * @section: Section relocations applying to.
+ * @relsec: Section containing RELAs.
+ * @symtab: Corresponding symtab.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+static inline int
+arch_kexec_apply_relocations_add(struct purgatory_info *pi, Elf_Shdr *section,
+ const Elf_Shdr *relsec, const Elf_Shdr *symtab)
+{
+ pr_err("RELA relocation unsupported.\n");
+ return -ENOEXEC;
+}
+#endif /* CONFIG_X86_64 || CONFIG_S390 */
#endif /* CONFIG_KEXEC_FILE */

#ifdef CONFIG_KEXEC_ELF
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 8347fc158d2b96..6bae253b4d315e 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -108,23 +108,6 @@ int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
}
#endif

-/*
- * arch_kexec_apply_relocations_add - apply relocations of type RELA
- * @pi: Purgatory to be relocated.
- * @section: Section relocations applying to.
- * @relsec: Section containing RELAs.
- * @symtab: Corresponding symtab.
- *
- * Return: 0 on success, negative errno on error.
- */
-int __weak
-arch_kexec_apply_relocations_add(struct purgatory_info *pi, Elf_Shdr *section,
- const Elf_Shdr *relsec, const Elf_Shdr *symtab)
-{
- pr_err("RELA relocation unsupported.\n");
- return -ENOEXEC;
-}
-
/*
* arch_kexec_apply_relocations - apply relocations of type REL
* @pi: Purgatory to be relocated.
@@ -134,7 +117,7 @@ arch_kexec_apply_relocations_add(struct purgatory_info *pi, Elf_Shdr *section,
*
* Return: 0 on success, negative errno on error.
*/
-int __weak
+int
arch_kexec_apply_relocations(struct purgatory_info *pi, Elf_Shdr *section,
const Elf_Shdr *relsec, const Elf_Shdr *symtab)
{

base-commit: ef1302160bfb19f804451d0e919266703501c875
--
2.36.1



2022-05-18 20:50:18

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] kexec_file: Drop weak attribute from arch_kexec_apply_relocations[_add]

On Wed, 18 May 2022 23:48:28 +0530 "Naveen N. Rao" <[email protected]> wrote:

> Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
> symbols") [1], binutils (v2.36+) started dropping section symbols that
> it thought were unused. This isn't an issue in general, but with
> kexec_file.c, gcc is placing kexec_arch_apply_relocations[_add] into a
> separate .text.unlikely section and the section symbol ".text.unlikely"
> is being dropped. Due to this, recordmcount is unable to find a non-weak
> symbol in .text.unlikely to generate a relocation record against.
>
> Address this by dropping the weak attribute from these functions:
> - arch_kexec_apply_relocations() is not overridden by any architecture
> today, so just drop the weak attribute.
> - arch_kexec_apply_relocations_add() is only overridden by x86 and s390.
> Retain the function prototype for those and move the weak
> implementation into the header as a static inline for other
> architectures.
>
> ...
>

Sigh. This patch demonstrates why I like __weak :<

> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -229,6 +225,30 @@ extern int crash_exclude_mem_range(struct crash_mem *mem,
> unsigned long long mend);
> extern int crash_prepare_elf64_headers(struct crash_mem *mem, int kernel_map,
> void **addr, unsigned long *sz);
> +
> +#if defined(CONFIG_X86_64) || defined(CONFIG_S390)

Let's avoid listing the architectures here? Better to add

select ARCH_HAVE_ARCH_KEXEC_APPLY_RELOCATIONS_ADD

to arch/<arch>/Kconfig?

Please cc me on any additional work on this.

2022-05-18 22:47:53

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] kexec_file: Drop weak attribute from arch_kexec_apply_relocations[_add]

"Naveen N. Rao" <[email protected]> writes:

> Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
> symbols") [1], binutils (v2.36+) started dropping section symbols that
> it thought were unused. This isn't an issue in general, but with
> kexec_file.c, gcc is placing kexec_arch_apply_relocations[_add] into a
> separate .text.unlikely section and the section symbol ".text.unlikely"
> is being dropped. Due to this, recordmcount is unable to find a non-weak
> symbol in .text.unlikely to generate a relocation record against.
>
> Address this by dropping the weak attribute from these functions:
> - arch_kexec_apply_relocations() is not overridden by any architecture
> today, so just drop the weak attribute.
> - arch_kexec_apply_relocations_add() is only overridden by x86 and s390.
> Retain the function prototype for those and move the weak
> implementation into the header as a static inline for other
> architectures.
>
> [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1

Any chance you can also get machine_kexec_post_load,
crash_free_reserved_phys_range, arch_kexec_protect_protect_crashkres,
arch_kexec_unprotect_crashkres, arch_kexec_kernel_image_probe,
arch_kexec_kernel_image_probe, arch_kimage_file_post_load_cleanup,
arch_kexec_kernel_verify_sig, and arch_kexec_locate_mem_hole as well.

That is everything in kexec that uses a __weak symbol. If we can't
count on them working we might as well just get rid of the rest
preemptively.

Could you also address Andrews concerns by using a Kconfig symbol that
the architectures that implement the symbol can select.

I don't want to ask too much of a volunteer but if you are willing
addressing both of those would be a great help.

Eric

> Signed-off-by: Naveen N. Rao <[email protected]>
> ---
> include/linux/kexec.h | 28 ++++++++++++++++++++++++----
> kernel/kexec_file.c | 19 +------------------
> 2 files changed, 25 insertions(+), 22 deletions(-)
>
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 58d1b58a971e34..e656f981f43a73 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -193,10 +193,6 @@ void *kexec_purgatory_get_symbol_addr(struct kimage *image, const char *name);
> int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> unsigned long buf_len);
> void *arch_kexec_kernel_image_load(struct kimage *image);
> -int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
> - Elf_Shdr *section,
> - const Elf_Shdr *relsec,
> - const Elf_Shdr *symtab);
> int arch_kexec_apply_relocations(struct purgatory_info *pi,
> Elf_Shdr *section,
> const Elf_Shdr *relsec,
> @@ -229,6 +225,30 @@ extern int crash_exclude_mem_range(struct crash_mem *mem,
> unsigned long long mend);
> extern int crash_prepare_elf64_headers(struct crash_mem *mem, int kernel_map,
> void **addr, unsigned long *sz);
> +
> +#if defined(CONFIG_X86_64) || defined(CONFIG_S390)
> +int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
> + Elf_Shdr *section,
> + const Elf_Shdr *relsec,
> + const Elf_Shdr *symtab);
> +#else
> +/*
> + * arch_kexec_apply_relocations_add - apply relocations of type RELA
> + * @pi: Purgatory to be relocated.
> + * @section: Section relocations applying to.
> + * @relsec: Section containing RELAs.
> + * @symtab: Corresponding symtab.
> + *
> + * Return: 0 on success, negative errno on error.
> + */
> +static inline int
> +arch_kexec_apply_relocations_add(struct purgatory_info *pi, Elf_Shdr *section,
> + const Elf_Shdr *relsec, const Elf_Shdr *symtab)
> +{
> + pr_err("RELA relocation unsupported.\n");
> + return -ENOEXEC;
> +}
> +#endif /* CONFIG_X86_64 || CONFIG_S390 */
> #endif /* CONFIG_KEXEC_FILE */
>
> #ifdef CONFIG_KEXEC_ELF
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 8347fc158d2b96..6bae253b4d315e 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -108,23 +108,6 @@ int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
> }
> #endif
>
> -/*
> - * arch_kexec_apply_relocations_add - apply relocations of type RELA
> - * @pi: Purgatory to be relocated.
> - * @section: Section relocations applying to.
> - * @relsec: Section containing RELAs.
> - * @symtab: Corresponding symtab.
> - *
> - * Return: 0 on success, negative errno on error.
> - */
> -int __weak
> -arch_kexec_apply_relocations_add(struct purgatory_info *pi, Elf_Shdr *section,
> - const Elf_Shdr *relsec, const Elf_Shdr *symtab)
> -{
> - pr_err("RELA relocation unsupported.\n");
> - return -ENOEXEC;
> -}
> -
> /*
> * arch_kexec_apply_relocations - apply relocations of type REL
> * @pi: Purgatory to be relocated.
> @@ -134,7 +117,7 @@ arch_kexec_apply_relocations_add(struct purgatory_info *pi, Elf_Shdr *section,
> *
> * Return: 0 on success, negative errno on error.
> */
> -int __weak
> +int
> arch_kexec_apply_relocations(struct purgatory_info *pi, Elf_Shdr *section,
> const Elf_Shdr *relsec, const Elf_Shdr *symtab)
> {
>
> base-commit: ef1302160bfb19f804451d0e919266703501c875

2022-05-19 07:50:18

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH] kexec_file: Drop weak attribute from arch_kexec_apply_relocations[_add]

Hi Eric,

On 05/18/22 at 04:59pm, Eric W. Biederman wrote:
> "Naveen N. Rao" <[email protected]> writes:
>
> > Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
> > symbols") [1], binutils (v2.36+) started dropping section symbols that
> > it thought were unused. This isn't an issue in general, but with
> > kexec_file.c, gcc is placing kexec_arch_apply_relocations[_add] into a
> > separate .text.unlikely section and the section symbol ".text.unlikely"
> > is being dropped. Due to this, recordmcount is unable to find a non-weak
> > symbol in .text.unlikely to generate a relocation record against.
> >
> > Address this by dropping the weak attribute from these functions:
> > - arch_kexec_apply_relocations() is not overridden by any architecture
> > today, so just drop the weak attribute.
> > - arch_kexec_apply_relocations_add() is only overridden by x86 and s390.
> > Retain the function prototype for those and move the weak
> > implementation into the header as a static inline for other
> > architectures.
> >
> > [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
>
> Any chance you can also get machine_kexec_post_load,
> crash_free_reserved_phys_range, arch_kexec_protect_protect_crashkres,
> arch_kexec_unprotect_crashkres, arch_kexec_kernel_image_probe,
> arch_kexec_kernel_image_probe, arch_kimage_file_post_load_cleanup,
> arch_kexec_kernel_verify_sig, and arch_kexec_locate_mem_hole as well.
>
> That is everything in kexec that uses a __weak symbol. If we can't
> count on them working we might as well just get rid of the rest
> preemptively.

Is there a new rule that __weak is not suggested in kernel any more?
Please help provide a pointer if yes, so that I can learn that.

In my mind, __weak is very simple and clear as a mechanism to add
ARCH related functionality.

Thanks
Baoquan


2022-05-19 12:57:12

by Naveen N. Rao

[permalink] [raw]
Subject: Re: [PATCH] kexec_file: Drop weak attribute from arch_kexec_apply_relocations[_add]

Andrew Morton wrote:
> On Wed, 18 May 2022 23:48:28 +0530 "Naveen N. Rao" <[email protected]> wrote:
>
>> Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
>> symbols") [1], binutils (v2.36+) started dropping section symbols that
>> it thought were unused. This isn't an issue in general, but with
>> kexec_file.c, gcc is placing kexec_arch_apply_relocations[_add] into a
>> separate .text.unlikely section and the section symbol ".text.unlikely"
>> is being dropped. Due to this, recordmcount is unable to find a non-weak
>> symbol in .text.unlikely to generate a relocation record against.
>>
>> Address this by dropping the weak attribute from these functions:
>> - arch_kexec_apply_relocations() is not overridden by any architecture
>> today, so just drop the weak attribute.
>> - arch_kexec_apply_relocations_add() is only overridden by x86 and s390.
>> Retain the function prototype for those and move the weak
>> implementation into the header as a static inline for other
>> architectures.
>>
>> ...
>>
>
> Sigh. This patch demonstrates why I like __weak :<
>
>> --- a/include/linux/kexec.h
>> +++ b/include/linux/kexec.h
>> @@ -229,6 +225,30 @@ extern int crash_exclude_mem_range(struct crash_mem *mem,
>> unsigned long long mend);
>> extern int crash_prepare_elf64_headers(struct crash_mem *mem, int kernel_map,
>> void **addr, unsigned long *sz);
>> +
>> +#if defined(CONFIG_X86_64) || defined(CONFIG_S390)
>
> Let's avoid listing the architectures here? Better to add
>
> select ARCH_HAVE_ARCH_KEXEC_APPLY_RELOCATIONS_ADD
>
> to arch/<arch>/Kconfig?

I followed the approach used in commit 6e7b64b9dd6d96 ("elfcore: fix
building with clang") since here again, it was overridden on only two
architectures. I also wanted to avoid touching the architecture headers
so as to make it simpler to backport.

But, as Michael points out, using a #ifdef isn't too much of a change
either. I also confirmed that those changes still apply cleanly all the
way back to v5.10. I've posted a v2 which takes this approach.

>
> Please cc me on any additional work on this.

I've copied you on the v2 patch. Thanks!


- Naveen


2022-05-19 13:47:16

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH] kexec_file: Drop weak attribute from arch_kexec_apply_relocations[_add]

"Naveen N. Rao" <[email protected]> writes:
> Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
> symbols") [1], binutils (v2.36+) started dropping section symbols that
> it thought were unused. This isn't an issue in general, but with
> kexec_file.c, gcc is placing kexec_arch_apply_relocations[_add] into a
> separate .text.unlikely section and the section symbol ".text.unlikely"
> is being dropped. Due to this, recordmcount is unable to find a non-weak
> symbol in .text.unlikely to generate a relocation record against.
>
> Address this by dropping the weak attribute from these functions:
> - arch_kexec_apply_relocations() is not overridden by any architecture
> today, so just drop the weak attribute.
> - arch_kexec_apply_relocations_add() is only overridden by x86 and s390.
> Retain the function prototype for those and move the weak
> implementation into the header as a static inline for other
> architectures.
>
> [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
>
> Signed-off-by: Naveen N. Rao <[email protected]>
> ---
> include/linux/kexec.h | 28 ++++++++++++++++++++++++----
> kernel/kexec_file.c | 19 +------------------
> 2 files changed, 25 insertions(+), 22 deletions(-)

I think it could be cleaner done with the #define foo foo style, see
patch below. It does have its downsides, but for a simple hook like this
I think it's not too bad.

cheers


(only build tested)

diff --git a/arch/s390/include/asm/kexec.h b/arch/s390/include/asm/kexec.h
index 7f3c9ac34bd8..e818b58ccc43 100644
--- a/arch/s390/include/asm/kexec.h
+++ b/arch/s390/include/asm/kexec.h
@@ -74,6 +74,8 @@ void *kexec_file_add_components(struct kimage *image,
int arch_kexec_do_relocs(int r_type, void *loc, unsigned long val,
unsigned long addr);

+#define arch_kexec_apply_relocations_add arch_kexec_apply_relocations_add
+
#define ARCH_HAS_KIMAGE_ARCH

struct kimage_arch {
diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 11b7c06e2828..58e3939a350a 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -186,6 +186,8 @@ extern int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages,
extern void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages);
#define arch_kexec_pre_free_pages arch_kexec_pre_free_pages

+#define arch_kexec_apply_relocations_add arch_kexec_apply_relocations_add
+
#endif

typedef void crash_vmclear_fn(void);
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 8347fc158d2b..6f07acb59f29 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -108,6 +108,7 @@ int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
}
#endif

+#ifndef arch_kexec_apply_relocations_add
/*
* arch_kexec_apply_relocations_add - apply relocations of type RELA
* @pi: Purgatory to be relocated.
@@ -117,14 +118,16 @@ int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
*
* Return: 0 on success, negative errno on error.
*/
-int __weak
+int
arch_kexec_apply_relocations_add(struct purgatory_info *pi, Elf_Shdr *section,
const Elf_Shdr *relsec, const Elf_Shdr *symtab)
{
pr_err("RELA relocation unsupported.\n");
return -ENOEXEC;
}
+#endif

+#ifndef arch_kexec_apply_relocations
/*
* arch_kexec_apply_relocations - apply relocations of type REL
* @pi: Purgatory to be relocated.
@@ -134,13 +137,14 @@ arch_kexec_apply_relocations_add(struct purgatory_info *pi, Elf_Shdr *section,
*
* Return: 0 on success, negative errno on error.
*/
-int __weak
+int
arch_kexec_apply_relocations(struct purgatory_info *pi, Elf_Shdr *section,
const Elf_Shdr *relsec, const Elf_Shdr *symtab)
{
pr_err("REL relocation unsupported.\n");
return -ENOEXEC;
}
+#endif

/*
* Free up memory used by kernel, initrd, and command line. This is temporary


2022-05-19 22:59:35

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] kexec_file: Drop weak attribute from arch_kexec_apply_relocations[_add]

Baoquan He <[email protected]> writes:

> Hi Eric,
>
> On 05/18/22 at 04:59pm, Eric W. Biederman wrote:
>> "Naveen N. Rao" <[email protected]> writes:
>>
>> > Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
>> > symbols") [1], binutils (v2.36+) started dropping section symbols that
>> > it thought were unused. This isn't an issue in general, but with
>> > kexec_file.c, gcc is placing kexec_arch_apply_relocations[_add] into a
>> > separate .text.unlikely section and the section symbol ".text.unlikely"
>> > is being dropped. Due to this, recordmcount is unable to find a non-weak
>> > symbol in .text.unlikely to generate a relocation record against.
>> >
>> > Address this by dropping the weak attribute from these functions:
>> > - arch_kexec_apply_relocations() is not overridden by any architecture
>> > today, so just drop the weak attribute.
>> > - arch_kexec_apply_relocations_add() is only overridden by x86 and s390.
>> > Retain the function prototype for those and move the weak
>> > implementation into the header as a static inline for other
>> > architectures.
>> >
>> > [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
>>
>> Any chance you can also get machine_kexec_post_load,
>> crash_free_reserved_phys_range, arch_kexec_protect_protect_crashkres,
>> arch_kexec_unprotect_crashkres, arch_kexec_kernel_image_probe,
>> arch_kexec_kernel_image_probe, arch_kimage_file_post_load_cleanup,
>> arch_kexec_kernel_verify_sig, and arch_kexec_locate_mem_hole as well.
>>
>> That is everything in kexec that uses a __weak symbol. If we can't
>> count on them working we might as well just get rid of the rest
>> preemptively.
>
> Is there a new rule that __weak is not suggested in kernel any more?
> Please help provide a pointer if yes, so that I can learn that.
>
> In my mind, __weak is very simple and clear as a mechanism to add
> ARCH related functionality.

You should be able to trace the conversation back for all of the details
but if you can't here is the summary.

There is a tool that some architectures use called recordmcount. The
recordmcount looks for a symbol in a section, and ignores all weak
symbols. In certain cases sections become so simple there are only weak
symbols. At which point recordmcount fails.

Which means in practice __weak symbols are unreliable and don't work
to add ARCH related functionality.

Given that __weak symbols fail randomly I would much rather have simpler
code that doesn't fail. It has never been the case that __weak symbols
have been very common in the kernel. I expect they are something like
bool that have been gaining traction. Still given that __weak symbols
don't work. I don't want them.

Eric

2022-05-22 05:39:19

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH] kexec_file: Drop weak attribute from arch_kexec_apply_relocations[_add]

On 05/19/22 at 12:59pm, Eric W. Biederman wrote:
> Baoquan He <[email protected]> writes:
>
> > Hi Eric,
> >
> > On 05/18/22 at 04:59pm, Eric W. Biederman wrote:
> >> "Naveen N. Rao" <[email protected]> writes:
> >>
> >> > Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
> >> > symbols") [1], binutils (v2.36+) started dropping section symbols that
> >> > it thought were unused. This isn't an issue in general, but with
> >> > kexec_file.c, gcc is placing kexec_arch_apply_relocations[_add] into a
> >> > separate .text.unlikely section and the section symbol ".text.unlikely"
> >> > is being dropped. Due to this, recordmcount is unable to find a non-weak
> >> > symbol in .text.unlikely to generate a relocation record against.
> >> >
> >> > Address this by dropping the weak attribute from these functions:
> >> > - arch_kexec_apply_relocations() is not overridden by any architecture
> >> > today, so just drop the weak attribute.
> >> > - arch_kexec_apply_relocations_add() is only overridden by x86 and s390.
> >> > Retain the function prototype for those and move the weak
> >> > implementation into the header as a static inline for other
> >> > architectures.
> >> >
> >> > [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
> >>
> >> Any chance you can also get machine_kexec_post_load,
> >> crash_free_reserved_phys_range, arch_kexec_protect_protect_crashkres,
> >> arch_kexec_unprotect_crashkres, arch_kexec_kernel_image_probe,
> >> arch_kexec_kernel_image_probe, arch_kimage_file_post_load_cleanup,
> >> arch_kexec_kernel_verify_sig, and arch_kexec_locate_mem_hole as well.
> >>
> >> That is everything in kexec that uses a __weak symbol. If we can't
> >> count on them working we might as well just get rid of the rest
> >> preemptively.
> >
> > Is there a new rule that __weak is not suggested in kernel any more?
> > Please help provide a pointer if yes, so that I can learn that.
> >
> > In my mind, __weak is very simple and clear as a mechanism to add
> > ARCH related functionality.
>
> You should be able to trace the conversation back for all of the details
> but if you can't here is the summary.
>
> There is a tool that some architectures use called recordmcount. The
> recordmcount looks for a symbol in a section, and ignores all weak
> symbols. In certain cases sections become so simple there are only weak
> symbols. At which point recordmcount fails.
>
> Which means in practice __weak symbols are unreliable and don't work
> to add ARCH related functionality.
>
> Given that __weak symbols fail randomly I would much rather have simpler
> code that doesn't fail. It has never been the case that __weak symbols
> have been very common in the kernel. I expect they are something like
> bool that have been gaining traction. Still given that __weak symbols
> don't work. I don't want them.

Thanks for the summary, Eric.

From Naveen's reply, what I got is, llvm's recent change makes
symbol of section .text.unlikely lost, but the secton .text.unlikely
still exists. The __weak symbol will be put in .text.unlikely partly,
when arch_kexec_apply_relocations_add() includes the pr_err line. While
removing the pr_err() line will put __weak symbol
arch_kexec_apply_relocations_add() in .text instead.

Now the status is that not only recordmcount got this problem, objtools
met it too and got an appropriate fix. Means objtools's fix doesn't need
kernel's adjustment. Recordmcount need kernel to adjust because it lacks
continuous support and developement. Naveen also told that they are
converting to objtools, just the old CI cases rely on recordmcount. In
fact, if someone stands up to get an appropriate recordmcount fix too,
the problem will be gone too.

Asking this because __weak will be sentenced to death from now on, if we
decide to change kernel. And this thread will be the pointer provided to
others when telling them not to use __weak.

I am not strongly against taking off __weak, just wondering if there's
chance to fix it in recordmcount, and the cost comparing with kernel fix;
except of this issue, any other weakness of __weak. Noticed Andrew has
picked this patch, as a witness of this moment, raise a tiny concern.


2022-05-22 19:30:46

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] kexec_file: Drop weak attribute from arch_kexec_apply_relocations[_add]

Baoquan He <[email protected]> writes:

> On 05/19/22 at 12:59pm, Eric W. Biederman wrote:
>> Baoquan He <[email protected]> writes:
>>
>> > Hi Eric,
>> >
>> > On 05/18/22 at 04:59pm, Eric W. Biederman wrote:
>> >> "Naveen N. Rao" <[email protected]> writes:
>> >>
>> >> > Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
>> >> > symbols") [1], binutils (v2.36+) started dropping section symbols that
>> >> > it thought were unused. This isn't an issue in general, but with
>> >> > kexec_file.c, gcc is placing kexec_arch_apply_relocations[_add] into a
>> >> > separate .text.unlikely section and the section symbol ".text.unlikely"
>> >> > is being dropped. Due to this, recordmcount is unable to find a non-weak
>> >> > symbol in .text.unlikely to generate a relocation record against.
>> >> >
>> >> > Address this by dropping the weak attribute from these functions:
>> >> > - arch_kexec_apply_relocations() is not overridden by any architecture
>> >> > today, so just drop the weak attribute.
>> >> > - arch_kexec_apply_relocations_add() is only overridden by x86 and s390.
>> >> > Retain the function prototype for those and move the weak
>> >> > implementation into the header as a static inline for other
>> >> > architectures.
>> >> >
>> >> > [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
>> >>
>> >> Any chance you can also get machine_kexec_post_load,
>> >> crash_free_reserved_phys_range, arch_kexec_protect_protect_crashkres,
>> >> arch_kexec_unprotect_crashkres, arch_kexec_kernel_image_probe,
>> >> arch_kexec_kernel_image_probe, arch_kimage_file_post_load_cleanup,
>> >> arch_kexec_kernel_verify_sig, and arch_kexec_locate_mem_hole as well.
>> >>
>> >> That is everything in kexec that uses a __weak symbol. If we can't
>> >> count on them working we might as well just get rid of the rest
>> >> preemptively.
>> >
>> > Is there a new rule that __weak is not suggested in kernel any more?
>> > Please help provide a pointer if yes, so that I can learn that.
>> >
>> > In my mind, __weak is very simple and clear as a mechanism to add
>> > ARCH related functionality.
>>
>> You should be able to trace the conversation back for all of the details
>> but if you can't here is the summary.
>>
>> There is a tool that some architectures use called recordmcount. The
>> recordmcount looks for a symbol in a section, and ignores all weak
>> symbols. In certain cases sections become so simple there are only weak
>> symbols. At which point recordmcount fails.
>>
>> Which means in practice __weak symbols are unreliable and don't work
>> to add ARCH related functionality.
>>
>> Given that __weak symbols fail randomly I would much rather have simpler
>> code that doesn't fail. It has never been the case that __weak symbols
>> have been very common in the kernel. I expect they are something like
>> bool that have been gaining traction. Still given that __weak symbols
>> don't work. I don't want them.
>
> Thanks for the summary, Eric.
>
> From Naveen's reply, what I got is, llvm's recent change makes
> symbol of section .text.unlikely lost,

If I have read the thread correctly this change happened in both
llvm and binutils. So both tools chains that are used to build the
kernel.

> but the secton .text.unlikely
> still exists. The __weak symbol will be put in .text.unlikely partly,
> when arch_kexec_apply_relocations_add() includes the pr_err line. While
> removing the pr_err() line will put __weak symbol
> arch_kexec_apply_relocations_add() in .text instead.

Yes. Calling pr_err has some effect. Either causing an mcount
entry to be ommitted, or causing the symbols in the function to be
placed in .text.unlikely.

> Now the status is that not only recordmcount got this problem, objtools
> met it too and got an appropriate fix. Means objtools's fix doesn't need
> kernel's adjustment. Recordmcount need kernel to adjust because it lacks
> continuous support and developement. Naveen also told that they are
> converting to objtools, just the old CI cases rely on recordmcount. In
> fact, if someone stands up to get an appropriate recordmcount fix too,
> the problem will be gone too.

If the descriptions are correct I suspect recoredmcount could just
decided to use the weak symbol, and not ignore it.

Unfortunately I looked at the code and it looks like recordmcount
is only ignoring weak symbols on arm. So without being able to
reproduce this I don't understand enough of what is going to on to fix
it.

> Asking this because __weak will be sentenced to death from now on, if we
> decide to change kernel. And this thread will be the pointer provided to
> others when telling them not to use __weak.

Well knowing that it is recordmcount all someone has to do is show that
recordmcount has been removed/fixed for the case in question.

> I am not strongly against taking off __weak, just wondering if there's
> chance to fix it in recordmcount, and the cost comparing with kernel fix;
> except of this issue, any other weakness of __weak. Noticed Andrew has
> picked this patch, as a witness of this moment, raise a tiny concern.

I just don't see what else we can realistically do.

Eric




2022-05-23 06:55:34

by Naveen N. Rao

[permalink] [raw]
Subject: Re: [PATCH] kexec_file: Drop weak attribute from arch_kexec_apply_relocations[_add]

Baoquan He wrote:
> Hi Eric,
>
> On 05/18/22 at 04:59pm, Eric W. Biederman wrote:
>> "Naveen N. Rao" <[email protected]> writes:
>>
>> > Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
>> > symbols") [1], binutils (v2.36+) started dropping section symbols that
>> > it thought were unused. This isn't an issue in general, but with
>> > kexec_file.c, gcc is placing kexec_arch_apply_relocations[_add] into a
>> > separate .text.unlikely section and the section symbol ".text.unlikely"
>> > is being dropped. Due to this, recordmcount is unable to find a non-weak
>> > symbol in .text.unlikely to generate a relocation record against.
>> >
>> > Address this by dropping the weak attribute from these functions:
>> > - arch_kexec_apply_relocations() is not overridden by any architecture
>> > today, so just drop the weak attribute.
>> > - arch_kexec_apply_relocations_add() is only overridden by x86 and s390.
>> > Retain the function prototype for those and move the weak
>> > implementation into the header as a static inline for other
>> > architectures.
>> >
>> > [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
>>
>> Any chance you can also get machine_kexec_post_load,
>> crash_free_reserved_phys_range, arch_kexec_protect_protect_crashkres,
>> arch_kexec_unprotect_crashkres, arch_kexec_kernel_image_probe,
>> arch_kexec_kernel_image_probe, arch_kimage_file_post_load_cleanup,
>> arch_kexec_kernel_verify_sig, and arch_kexec_locate_mem_hole as well.

I've posted a v2 that uses the approach suggested by Michael, and
something that was in use in kexec already. If you are ok with that
approach, I will take a stab at converting the rest of the functions
that are marked __weak.

>>
>> That is everything in kexec that uses a __weak symbol. If we can't
>> count on them working we might as well just get rid of the rest
>> preemptively.
>
> Is there a new rule that __weak is not suggested in kernel any more?
> Please help provide a pointer if yes, so that I can learn that.

I'm not aware of a move away from __weak in the kernel, in general.
Steven doesn't prefer it for ftrace, and other maintainers may have a
preference.

>
> In my mind, __weak is very simple and clear as a mechanism to add
> ARCH related functionality.

Notwithstanding the ftrace issue, the other caveat with __weak functions
are that they still make it into the final vmlinux even if they are
overridden. That is, you will have instructions from both the __weak
variant as well as from the overridden variant in the final vmlinux,
which can add up if the weak variants are non-trivial.


- Naveen


2022-05-27 12:21:49

by Naveen N. Rao

[permalink] [raw]
Subject: Re: [PATCH] kexec_file: Drop weak attribute from arch_kexec_apply_relocations[_add]

Andrew Morton wrote:
> On Fri, 20 May 2022 14:25:05 -0500 "Eric W. Biederman" <[email protected]> wrote:
>
>> > I am not strongly against taking off __weak, just wondering if there's
>> > chance to fix it in recordmcount, and the cost comparing with kernel fix;
>> > except of this issue, any other weakness of __weak. Noticed Andrew has
>> > picked this patch, as a witness of this moment, raise a tiny concern.
>>
>> I just don't see what else we can realistically do.
>
> I think converting all of the kexec __weaks to use the ifdef approach
> makes sense, if only because kexec is now using two different styles.
>
> But for now, I'll send Naveen's v2 patch in to Linus to get us out of
> trouble.

Thanks!

>
> I'm thinking that we should add cc:stable to that patch as well, to
> reduce the amount of problems which people experience when using newer
> binutils on older kernels?

Yes, please. I missed tagging this for stable. It looks like this is
applicable all the way back to v4.9 (though I haven't tested if
recordmcount fails in the same manner with those older kernel levels). I
will post backports once this gets into linus' tree.


- Naveen

2022-05-28 19:03:20

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] kexec_file: Drop weak attribute from arch_kexec_apply_relocations[_add]

On Fri, 20 May 2022 14:25:05 -0500 "Eric W. Biederman" <[email protected]> wrote:

> > I am not strongly against taking off __weak, just wondering if there's
> > chance to fix it in recordmcount, and the cost comparing with kernel fix;
> > except of this issue, any other weakness of __weak. Noticed Andrew has
> > picked this patch, as a witness of this moment, raise a tiny concern.
>
> I just don't see what else we can realistically do.

I think converting all of the kexec __weaks to use the ifdef approach
makes sense, if only because kexec is now using two different styles.

But for now, I'll send Naveen's v2 patch in to Linus to get us out of
trouble.

I'm thinking that we should add cc:stable to that patch as well, to
reduce the amount of problems which people experience when using newer
binutils on older kernels?