From: Ard Biesheuvel <[email protected]>
Weak external linkage is intended for cases where a symbol reference
can remain unsatisfied in the final link. Taking the address of such a
symbol should yield NULL if the reference was not satisfied.
Given that ordinary RIP or PC relative references cannot produce NULL,
some kind of indirection is always needed in such cases, and in position
independent code, this results in a GOT entry. In ordinary code, it is
arch specific but amounts to the same thing.
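To make the cost concrete, here is a minimal userspace sketch (not kernel
code) of a weak reference. The symbol name demo_optional_blob is made up
for illustration, and the sketch assumes GCC or Clang targeting ELF, where
an unsatisfied weak reference resolves to address zero.

```c
#include <stddef.h>

/*
 * Hypothetical symbol: no object in this program defines it, so the weak
 * reference stays unsatisfied and the link still succeeds.
 */
extern char demo_optional_blob[] __attribute__((weak));

/* Returns 1 if the linker found a definition, 0 if the reference is unsatisfied. */
int demo_optional_blob_present(void)
{
	/*
	 * The compiler may not assume the address is non-NULL, so it has to
	 * materialize the address in a way that can yield zero. In position
	 * independent code that typically means a load from the GOT.
	 */
	return demo_optional_blob != NULL;
}
```

With a plain (non-weak) declaration the compiler would be free to fold the
comparison to a constant and use a direct PC relative reference instead.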
While unavoidable in some cases, weak references are currently also used
to declare symbols that are always defined in the final link, but not in
the first linker pass. This means we end up with worse codegen for no
good reason. So let's clean this up, by providing preliminary
definitions that are only used as a fallback.
Changes since v2:
- fix build issue in patch #3 reported by Jiri
- add Arnd's acks
Changes since v1:
- update second occurrence of BTF start/end markers
- drop NULL check of __start_BTF[] which is no longer meaningful
- avoid the preliminary BTF symbols if CONFIG_DEBUG_INFO_BTF is not set
- add Andrii's ack to patch #3
- patches #1 and #2 unchanged
Cc: Masahiro Yamada <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Ard Biesheuvel (3):
kallsyms: Avoid weak references for kallsyms symbols
vmlinux: Avoid weak reference to notes section
btf: Avoid weak external references
include/asm-generic/vmlinux.lds.h | 28 ++++++++++++++++++
kernel/bpf/btf.c | 7 +++--
kernel/bpf/sysfs_btf.c | 6 ++--
kernel/kallsyms.c | 6 ----
kernel/kallsyms_internal.h | 30 ++++++++------------
kernel/ksysfs.c | 4 +--
lib/buildid.c | 4 +--
7 files changed, 52 insertions(+), 33 deletions(-)
--
2.44.0.683.g7961c838ac-goog
From: Ard Biesheuvel <[email protected]>
If the BTF code is enabled in the build configuration, the start/stop
BTF markers are guaranteed to exist in the final link but not during the
first linker pass.
Avoid GOT based relocations to these markers in the final executable by
providing preliminary definitions that will be used by the first linker
pass, and superseded by the actual definitions in the subsequent ones.
Make the preliminary definitions dependent on CONFIG_DEBUG_INFO_BTF so
that inadvertent references to this section will trigger a link failure
if they occur in code that does not honour CONFIG_DEBUG_INFO_BTF.
Note that Clang will notice that taking the address of __start_BTF cannot
yield NULL any longer, so testing for that condition is no longer
needed.
Acked-by: Andrii Nakryiko <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
---
include/asm-generic/vmlinux.lds.h | 9 +++++++++
kernel/bpf/btf.c | 7 +++++--
kernel/bpf/sysfs_btf.c | 6 +++---
3 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index e8449be62058..4cb3d88449e5 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -456,6 +456,7 @@
* independent code.
*/
#define PRELIMINARY_SYMBOL_DEFINITIONS \
+ PRELIMINARY_BTF_DEFINITIONS \
PROVIDE(kallsyms_addresses = .); \
PROVIDE(kallsyms_offsets = .); \
PROVIDE(kallsyms_names = .); \
@@ -466,6 +467,14 @@
PROVIDE(kallsyms_markers = .); \
PROVIDE(kallsyms_seqs_of_names = .);
+#ifdef CONFIG_DEBUG_INFO_BTF
+#define PRELIMINARY_BTF_DEFINITIONS \
+ PROVIDE(__start_BTF = .); \
+ PROVIDE(__stop_BTF = .);
+#else
+#define PRELIMINARY_BTF_DEFINITIONS
+#endif
+
/*
* Read only Data
*/
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 90c4a32d89ff..6d46cee47ae3 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5642,8 +5642,8 @@ static struct btf *btf_parse(const union bpf_attr *attr, bpfptr_t uattr, u32 uat
return ERR_PTR(err);
}
-extern char __weak __start_BTF[];
-extern char __weak __stop_BTF[];
+extern char __start_BTF[];
+extern char __stop_BTF[];
extern struct btf *btf_vmlinux;
#define BPF_MAP_TYPE(_id, _ops)
@@ -5971,6 +5971,9 @@ struct btf *btf_parse_vmlinux(void)
struct btf *btf = NULL;
int err;
+ if (!IS_ENABLED(CONFIG_DEBUG_INFO_BTF))
+ return ERR_PTR(-ENOENT);
+
env = kzalloc(sizeof(*env), GFP_KERNEL | __GFP_NOWARN);
if (!env)
return ERR_PTR(-ENOMEM);
diff --git a/kernel/bpf/sysfs_btf.c b/kernel/bpf/sysfs_btf.c
index ef6911aee3bb..fedb54c94cdb 100644
--- a/kernel/bpf/sysfs_btf.c
+++ b/kernel/bpf/sysfs_btf.c
@@ -9,8 +9,8 @@
#include <linux/sysfs.h>
/* See scripts/link-vmlinux.sh, gen_btf() func for details */
-extern char __weak __start_BTF[];
-extern char __weak __stop_BTF[];
+extern char __start_BTF[];
+extern char __stop_BTF[];
static ssize_t
btf_vmlinux_read(struct file *file, struct kobject *kobj,
@@ -32,7 +32,7 @@ static int __init btf_vmlinux_init(void)
{
bin_attr_btf_vmlinux.size = __stop_BTF - __start_BTF;
- if (!__start_BTF || bin_attr_btf_vmlinux.size == 0)
+ if (bin_attr_btf_vmlinux.size == 0)
return 0;
btf_kobj = kobject_create_and_add("btf", kernel_kobj);
--
2.44.0.683.g7961c838ac-goog
From: Ard Biesheuvel <[email protected]>
kallsyms is a directory of all the symbols in the vmlinux binary, and so
creating it is somewhat of a chicken-and-egg problem, as its non-zero
size affects the layout of the binary, and therefore the values of the
symbols.
For this reason, the kernel is linked more than once, and the first pass
does not include any kallsyms data at all. For the linker to accept
this, the symbol declarations describing the kallsyms metadata are
emitted as having weak linkage, so they can remain unsatisfied. During
the subsequent passes, the weak references are satisfied by the kallsyms
metadata that was constructed based on information gathered from the
preceding passes.
Weak references lead to somewhat worse codegen, because taking their
address may need to produce NULL (if the reference was unsatisfied), and
this is not usually supported by RIP or PC relative symbol references.
Given that these references are ultimately always satisfied in the final
link, let's drop the weak annotation, and instead, provide fallback
definitions in the linker script that are only emitted if an unsatisfied
reference exists.
While at it, drop the FRV specific annotation that these symbols reside
in .rodata - FRV is long gone.
Tested-by: Nick Desaulniers <[email protected]> # Boot
Reviewed-by: Nick Desaulniers <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Link: https://lkml.kernel.org/r/20230504174320.3930345-1-ardb%40kernel.org
Signed-off-by: Ard Biesheuvel <[email protected]>
---
include/asm-generic/vmlinux.lds.h | 19 +++++++++++++
kernel/kallsyms.c | 6 ----
kernel/kallsyms_internal.h | 30 ++++++++------------
3 files changed, 31 insertions(+), 24 deletions(-)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index f7749d0f2562..e8449be62058 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -448,11 +448,30 @@
#endif
#endif
+/*
+ * Some symbol definitions will not exist yet during the first pass of the
+ * link, but are guaranteed to exist in the final link. Provide preliminary
+ * definitions that will be superseded in the final link to avoid having to
+ * rely on weak external linkage, which requires a GOT when used in position
+ * independent code.
+ */
+#define PRELIMINARY_SYMBOL_DEFINITIONS \
+ PROVIDE(kallsyms_addresses = .); \
+ PROVIDE(kallsyms_offsets = .); \
+ PROVIDE(kallsyms_names = .); \
+ PROVIDE(kallsyms_num_syms = .); \
+ PROVIDE(kallsyms_relative_base = .); \
+ PROVIDE(kallsyms_token_table = .); \
+ PROVIDE(kallsyms_token_index = .); \
+ PROVIDE(kallsyms_markers = .); \
+ PROVIDE(kallsyms_seqs_of_names = .);
+
/*
* Read only Data
*/
#define RO_DATA(align) \
. = ALIGN((align)); \
+ PRELIMINARY_SYMBOL_DEFINITIONS \
.rodata : AT(ADDR(.rodata) - LOAD_OFFSET) { \
__start_rodata = .; \
*(.rodata) *(.rodata.*) \
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 18edd57b5fe8..22ea19a36e6e 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -325,12 +325,6 @@ static unsigned long get_symbol_pos(unsigned long addr,
unsigned long symbol_start = 0, symbol_end = 0;
unsigned long i, low, high, mid;
- /* This kernel should never had been booted. */
- if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
- BUG_ON(!kallsyms_addresses);
- else
- BUG_ON(!kallsyms_offsets);
-
/* Do a binary search on the sorted kallsyms_addresses array. */
low = 0;
high = kallsyms_num_syms;
diff --git a/kernel/kallsyms_internal.h b/kernel/kallsyms_internal.h
index 27fabdcc40f5..85480274fc8f 100644
--- a/kernel/kallsyms_internal.h
+++ b/kernel/kallsyms_internal.h
@@ -5,27 +5,21 @@
#include <linux/types.h>
/*
- * These will be re-linked against their real values
- * during the second link stage.
+ * These will be re-linked against their real values during the second link
+ * stage. Preliminary values must be provided in the linker script using the
+ * PROVIDE() directive so that the first link stage can complete successfully.
*/
-extern const unsigned long kallsyms_addresses[] __weak;
-extern const int kallsyms_offsets[] __weak;
-extern const u8 kallsyms_names[] __weak;
+extern const unsigned long kallsyms_addresses[];
+extern const int kallsyms_offsets[];
+extern const u8 kallsyms_names[];
-/*
- * Tell the compiler that the count isn't in the small data section if the arch
- * has one (eg: FRV).
- */
-extern const unsigned int kallsyms_num_syms
-__section(".rodata") __attribute__((weak));
-
-extern const unsigned long kallsyms_relative_base
-__section(".rodata") __attribute__((weak));
+extern const unsigned int kallsyms_num_syms;
+extern const unsigned long kallsyms_relative_base;
-extern const char kallsyms_token_table[] __weak;
-extern const u16 kallsyms_token_index[] __weak;
+extern const char kallsyms_token_table[];
+extern const u16 kallsyms_token_index[];
-extern const unsigned int kallsyms_markers[] __weak;
-extern const u8 kallsyms_seqs_of_names[] __weak;
+extern const unsigned int kallsyms_markers[];
+extern const u8 kallsyms_seqs_of_names[];
#endif // LINUX_KALLSYMS_INTERNAL_H_
--
2.44.0.683.g7961c838ac-goog
From: Ard Biesheuvel <[email protected]>
Weak references are references that are permitted to remain unsatisfied
in the final link. This means they cannot be implemented using place
relative relocations, resulting in GOT entries when using position
independent code generation.
The notes section should always exist, so the weak annotations can be
omitted.
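
As a rough userspace illustration of the same pattern (an assumption-laden
sketch, not the kernel code): ELF linkers such as GNU ld and LLD define
__start_<name> and __stop_<name> for any section whose name is a valid C
identifier, so markers of a section that always exists can be declared
without __weak. The section name demo_notes and its payload are made up.

```c
#include <stddef.h>

/*
 * Place one object in a section whose name is a valid C identifier so that
 * the linker emits the start/stop markers. "demo_notes" is a made-up name.
 */
static const char demo_note[] __attribute__((section("demo_notes"), used)) =
	"hypothetical note payload";

/* Plain (non-weak) declarations: the section always exists in this program. */
extern const char __start_demo_notes[];
extern const char __stop_demo_notes[];

/* Size of the section in bytes; a NULL check would be dead code here. */
size_t demo_notes_size(void)
{
	return (size_t)(__stop_demo_notes - __start_demo_notes);
}
```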
Acked-by: Arnd Bergmann <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
---
kernel/ksysfs.c | 4 ++--
lib/buildid.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/ksysfs.c b/kernel/ksysfs.c
index 495b69a71a5d..07fb5987b42b 100644
--- a/kernel/ksysfs.c
+++ b/kernel/ksysfs.c
@@ -228,8 +228,8 @@ KERNEL_ATTR_RW(rcu_normal);
/*
* Make /sys/kernel/notes give the raw contents of our kernel .notes section.
*/
-extern const void __start_notes __weak;
-extern const void __stop_notes __weak;
+extern const void __start_notes;
+extern const void __stop_notes;
#define notes_size (&__stop_notes - &__start_notes)
static ssize_t notes_read(struct file *filp, struct kobject *kobj,
diff --git a/lib/buildid.c b/lib/buildid.c
index 898301b49eb6..7954dd92e36c 100644
--- a/lib/buildid.c
+++ b/lib/buildid.c
@@ -182,8 +182,8 @@ unsigned char vmlinux_build_id[BUILD_ID_SIZE_MAX] __ro_after_init;
*/
void __init init_vmlinux_build_id(void)
{
- extern const void __start_notes __weak;
- extern const void __stop_notes __weak;
+ extern const void __start_notes;
+ extern const void __stop_notes;
unsigned int size = &__stop_notes - &__start_notes;
build_id_parse_buf(&__start_notes, vmlinux_build_id, size);
--
2.44.0.683.g7961c838ac-goog
On Mon, Apr 15, 2024 at 09:58:41AM +0200, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <[email protected]>
>
> If the BTF code is enabled in the build configuration, the start/stop
> BTF markers are guaranteed to exist in the final link but not during the
> first linker pass.
>
> Avoid GOT based relocations to these markers in the final executable by
> providing preliminary definitions that will be used by the first linker
> pass, and superseded by the actual definitions in the subsequent ones.
>
> Make the preliminary definitions dependent on CONFIG_DEBUG_INFO_BTF so
> that inadvertent references to this section will trigger a link failure
> if they occur in code that does not honour CONFIG_DEBUG_INFO_BTF.
>
> Note that Clang will notice that taking the address of __start_BTF cannot
> yield NULL any longer, so testing for that condition is no longer
> needed.
>
> Acked-by: Andrii Nakryiko <[email protected]>
> Acked-by: Arnd Bergmann <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
jirka
On Mon, Apr 15, 2024 at 4:58 PM Ard Biesheuvel <[email protected]> wrote:
>
> From: Ard Biesheuvel <[email protected]>
>
> If the BTF code is enabled in the build configuration, the start/stop
> BTF markers are guaranteed to exist in the final link but not during the
> first linker pass.
>
> Avoid GOT based relocations to these markers in the final executable by
> providing preliminary definitions that will be used by the first linker
> pass, and superseded by the actual definitions in the subsequent ones.
>
> Make the preliminary definitions dependent on CONFIG_DEBUG_INFO_BTF so
> that inadvertent references to this section will trigger a link failure
> if they occur in code that does not honour CONFIG_DEBUG_INFO_BTF.
>
> Note that Clang will notice that taking the address of __start_BTF cannot
> yield NULL any longer, so testing for that condition is no longer
> needed.
>
> Acked-by: Andrii Nakryiko <[email protected]>
> Acked-by: Arnd Bergmann <[email protected]>
> Signed-off-by: Ard Biesheuvel <[email protected]>
> ---
> include/asm-generic/vmlinux.lds.h | 9 +++++++++
> kernel/bpf/btf.c | 7 +++++--
> kernel/bpf/sysfs_btf.c | 6 +++---
> 3 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index e8449be62058..4cb3d88449e5 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -456,6 +456,7 @@
> * independent code.
> */
> #define PRELIMINARY_SYMBOL_DEFINITIONS \
> + PRELIMINARY_BTF_DEFINITIONS \
> PROVIDE(kallsyms_addresses = .); \
> PROVIDE(kallsyms_offsets = .); \
> PROVIDE(kallsyms_names = .); \
> @@ -466,6 +467,14 @@
> PROVIDE(kallsyms_markers = .); \
> PROVIDE(kallsyms_seqs_of_names = .);
>
> +#ifdef CONFIG_DEBUG_INFO_BTF
> +#define PRELIMINARY_BTF_DEFINITIONS \
> + PROVIDE(__start_BTF = .); \
> + PROVIDE(__stop_BTF = .);
> +#else
> +#define PRELIMINARY_BTF_DEFINITIONS
> +#endif
> +
Is this necessary?
The following code (BOUNDED_SECTION_BY)
produces __start_BTF and __stop_BTF symbols
under the same condition.
/*
* .BTF
*/
#ifdef CONFIG_DEBUG_INFO_BTF
#define BTF \
.BTF : AT(ADDR(.BTF) - LOAD_OFFSET) { \
BOUNDED_SECTION_BY(.BTF, _BTF) \
} \
. = ALIGN(4); \
.BTF_ids : AT(ADDR(.BTF_ids) - LOAD_OFFSET) { \
*(.BTF_ids) \
}
#else
#define BTF
#endif
--
Best Regards
Masahiro Yamada
On Mon, 15 Apr 2024 at 16:55, Masahiro Yamada <[email protected]> wrote:
>
> On Mon, Apr 15, 2024 at 4:58 PM Ard Biesheuvel <[email protected]> wrote:
> >
> > From: Ard Biesheuvel <[email protected]>
> >
> > If the BTF code is enabled in the build configuration, the start/stop
> > BTF markers are guaranteed to exist in the final link but not during the
> > first linker pass.
> >
> > Avoid GOT based relocations to these markers in the final executable by
> > providing preliminary definitions that will be used by the first linker
> > pass, and superseded by the actual definitions in the subsequent ones.
> >
> > Make the preliminary definitions dependent on CONFIG_DEBUG_INFO_BTF so
> > that inadvertent references to this section will trigger a link failure
> > if they occur in code that does not honour CONFIG_DEBUG_INFO_BTF.
> >
> > Note that Clang will notice that taking the address of __start_BTF cannot
> > yield NULL any longer, so testing for that condition is no longer
> > needed.
> >
> > Acked-by: Andrii Nakryiko <[email protected]>
> > Acked-by: Arnd Bergmann <[email protected]>
> > Signed-off-by: Ard Biesheuvel <[email protected]>
> > ---
> > include/asm-generic/vmlinux.lds.h | 9 +++++++++
> > kernel/bpf/btf.c | 7 +++++--
> > kernel/bpf/sysfs_btf.c | 6 +++---
> > 3 files changed, 17 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > index e8449be62058..4cb3d88449e5 100644
> > --- a/include/asm-generic/vmlinux.lds.h
> > +++ b/include/asm-generic/vmlinux.lds.h
> > @@ -456,6 +456,7 @@
> > * independent code.
> > */
> > #define PRELIMINARY_SYMBOL_DEFINITIONS \
> > + PRELIMINARY_BTF_DEFINITIONS \
> > PROVIDE(kallsyms_addresses = .); \
> > PROVIDE(kallsyms_offsets = .); \
> > PROVIDE(kallsyms_names = .); \
> > @@ -466,6 +467,14 @@
> > PROVIDE(kallsyms_markers = .); \
> > PROVIDE(kallsyms_seqs_of_names = .);
> >
> > +#ifdef CONFIG_DEBUG_INFO_BTF
> > +#define PRELIMINARY_BTF_DEFINITIONS \
> > + PROVIDE(__start_BTF = .); \
> > + PROVIDE(__stop_BTF = .);
> > +#else
> > +#define PRELIMINARY_BTF_DEFINITIONS
> > +#endif
> > +
>
>
>
> Is this necessary?
>
I think so.
This actually resulted in Jiri's build failure with v2, and the
realization that there was dead code in btf_parse_vmlinux() that
happily tried to load the contents of the BTF section if
CONFIG_DEBUG_INFO_BTF was not enabled to begin with.
So this is another pitfall with weak references: the symbol may
unexpectedly be missing altogether rather than only during the first
linker pass.
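
A small userspace sketch of that pitfall, assuming GCC or Clang on ELF.
DEMO_HAVE_BTF stands in for the Kconfig option, and the demo_* symbol
names are hypothetical:

```c
#include <errno.h>
#include <stdint.h>

/* Stand-in for a Kconfig option; 0 models CONFIG_DEBUG_INFO_BTF=n. */
#ifndef DEMO_HAVE_BTF
#define DEMO_HAVE_BTF 0
#endif

/* Hypothetical markers that nothing defines when the feature is off. */
extern char demo_start_btf[] __attribute__((weak));
extern char demo_stop_btf[] __attribute__((weak));

/* Returns the section size in bytes, or -ENOENT if the feature is disabled. */
long demo_btf_size(void)
{
	/*
	 * Without this guard the code would still link, because the
	 * references are weak, and would go on to compute a size from two
	 * zero addresses instead of reporting that the data is absent.
	 */
	if (!DEMO_HAVE_BTF)
		return -ENOENT;

	return (long)((uintptr_t)demo_stop_btf - (uintptr_t)demo_start_btf);
}
```

With non-weak declarations and no definition, the same mistake would show
up as a link error rather than at run time.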
>
>
> The following code (BOUNDED_SECTION_BY)
> produces __start_BTF and __stop_BTF symbols
> under the same condition.
>
Indeed. So if CONFIG_DEBUG_INFO_BTF=n, code can still link to
__start_BTF and __stop_BTF even though BTF is not enabled.
On Mon, Apr 15, 2024 at 11:59 PM Ard Biesheuvel <[email protected]> wrote:
>
> On Mon, 15 Apr 2024 at 16:55, Masahiro Yamada <[email protected]> wrote:
> >
> > > On Mon, Apr 15, 2024 at 4:58 PM Ard Biesheuvel <[email protected]> wrote:
> > >
> > > From: Ard Biesheuvel <[email protected]>
> > >
> > > If the BTF code is enabled in the build configuration, the start/stop
> > > BTF markers are guaranteed to exist in the final link but not during the
> > > first linker pass.
> > >
> > > Avoid GOT based relocations to these markers in the final executable by
> > > providing preliminary definitions that will be used by the first linker
> > > pass, and superseded by the actual definitions in the subsequent ones.
> > >
> > > Make the preliminary definitions dependent on CONFIG_DEBUG_INFO_BTF so
> > > that inadvertent references to this section will trigger a link failure
> > > if they occur in code that does not honour CONFIG_DEBUG_INFO_BTF.
> > >
> > > > Note that Clang will notice that taking the address of __start_BTF cannot
> > > yield NULL any longer, so testing for that condition is no longer
> > > needed.
> > >
> > > Acked-by: Andrii Nakryiko <[email protected]>
> > > Acked-by: Arnd Bergmann <[email protected]>
> > > Signed-off-by: Ard Biesheuvel <[email protected]>
> > > ---
> > > include/asm-generic/vmlinux.lds.h | 9 +++++++++
> > > kernel/bpf/btf.c | 7 +++++--
> > > kernel/bpf/sysfs_btf.c | 6 +++---
> > > 3 files changed, 17 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > > index e8449be62058..4cb3d88449e5 100644
> > > --- a/include/asm-generic/vmlinux.lds.h
> > > +++ b/include/asm-generic/vmlinux.lds.h
> > > @@ -456,6 +456,7 @@
> > > * independent code.
> > > */
> > > #define PRELIMINARY_SYMBOL_DEFINITIONS \
> > > + PRELIMINARY_BTF_DEFINITIONS \
> > > PROVIDE(kallsyms_addresses = .); \
> > > PROVIDE(kallsyms_offsets = .); \
> > > PROVIDE(kallsyms_names = .); \
> > > @@ -466,6 +467,14 @@
> > > PROVIDE(kallsyms_markers = .); \
> > > PROVIDE(kallsyms_seqs_of_names = .);
> > >
> > > +#ifdef CONFIG_DEBUG_INFO_BTF
> > > +#define PRELIMINARY_BTF_DEFINITIONS \
> > > + PROVIDE(__start_BTF = .); \
> > > + PROVIDE(__stop_BTF = .);
> > > +#else
> > > +#define PRELIMINARY_BTF_DEFINITIONS
> > > +#endif
> > > +
> >
> >
> >
> > Is this necessary?
> >
>
> I think so.
>
> This actually resulted in Jiri's build failure with v2, and the
> realization that there was dead code in btf_parse_vmlinux() that
> happily tried to load the contents of the BTF section if
> CONFIG_DEBUG_INFO_BTF was not enabled to begin with.
>
> So this is another pitfall with weak references: the symbol may
> unexpectedly be missing altogether rather than only during the first
> linker pass.
>
> >
> >
> > The following code (BOUNDED_SECTION_BY)
> > produces __start_BTF and __stop_BTF symbols
> > under the same condition.
> >
>
> Indeed. So if CONFIG_DEBUG_INFO_BTF=n, code can still link to
> __start_BTF and __stop_BTF even though BTF is not enabled.
I am talking about the case for CONFIG_DEBUG_INFO_BTF=y.
PROVIDE() is meaningless because __start_BTF and __stop_BTF
are produced by the existing code.
(The code was a bit clearer before commit
9b351be25360c5cedfb98b88d6dfd89327849e52)
So, v4 of this patch will look like 2/3, right?
Just drop the __weak attribute.
You still need

    if (!IS_ENABLED(CONFIG_DEBUG_INFO_BTF))
            return ERR_PTR(-ENOENT);

but you do not need to modify vmlinux.lds.h.
--
Best Regards
Masahiro Yamada
On Mon, 15 Apr 2024 at 17:32, Masahiro Yamada <[email protected]> wrote:
>
> On Mon, Apr 15, 2024 at 11:59 PM Ard Biesheuvel <[email protected]> wrote:
> >
> > On Mon, 15 Apr 2024 at 16:55, Masahiro Yamada <[email protected]> wrote:
> > >
> > > On Mon, Apr 15, 2024 at 4:58 PM Ard Biesheuvel <[email protected]> wrote:
> > > >
> > > > From: Ard Biesheuvel <[email protected]>
> > > >
> > > > If the BTF code is enabled in the build configuration, the start/stop
> > > > BTF markers are guaranteed to exist in the final link but not during the
> > > > first linker pass.
> > > >
> > > > Avoid GOT based relocations to these markers in the final executable by
> > > > providing preliminary definitions that will be used by the first linker
> > > > pass, and superseded by the actual definitions in the subsequent ones.
> > > >
> > > > Make the preliminary definitions dependent on CONFIG_DEBUG_INFO_BTF so
> > > > that inadvertent references to this section will trigger a link failure
> > > > if they occur in code that does not honour CONFIG_DEBUG_INFO_BTF.
> > > >
> > > > > Note that Clang will notice that taking the address of __start_BTF cannot
> > > > yield NULL any longer, so testing for that condition is no longer
> > > > needed.
> > > >
> > > > Acked-by: Andrii Nakryiko <[email protected]>
> > > > Acked-by: Arnd Bergmann <[email protected]>
> > > > Signed-off-by: Ard Biesheuvel <[email protected]>
> > > > ---
> > > > include/asm-generic/vmlinux.lds.h | 9 +++++++++
> > > > kernel/bpf/btf.c | 7 +++++--
> > > > kernel/bpf/sysfs_btf.c | 6 +++---
> > > > 3 files changed, 17 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > > > index e8449be62058..4cb3d88449e5 100644
> > > > --- a/include/asm-generic/vmlinux.lds.h
> > > > +++ b/include/asm-generic/vmlinux.lds.h
> > > > @@ -456,6 +456,7 @@
> > > > * independent code.
> > > > */
> > > > #define PRELIMINARY_SYMBOL_DEFINITIONS \
> > > > + PRELIMINARY_BTF_DEFINITIONS \
> > > > PROVIDE(kallsyms_addresses = .); \
> > > > PROVIDE(kallsyms_offsets = .); \
> > > > PROVIDE(kallsyms_names = .); \
> > > > @@ -466,6 +467,14 @@
> > > > PROVIDE(kallsyms_markers = .); \
> > > > PROVIDE(kallsyms_seqs_of_names = .);
> > > >
> > > > +#ifdef CONFIG_DEBUG_INFO_BTF
> > > > +#define PRELIMINARY_BTF_DEFINITIONS \
> > > > + PROVIDE(__start_BTF = .); \
> > > > + PROVIDE(__stop_BTF = .);
> > > > +#else
> > > > +#define PRELIMINARY_BTF_DEFINITIONS
> > > > +#endif
> > > > +
> > >
> > >
> > >
> > > Is this necessary?
> > >
> >
> > I think so.
> >
> > This actually resulted in Jiri's build failure with v2, and the
> > realization that there was dead code in btf_parse_vmlinux() that
> > happily tried to load the contents of the BTF section if
> > CONFIG_DEBUG_INFO_BTF was not enabled to begin with.
> >
> > So this is another pitfall with weak references: the symbol may
> > unexpectedly be missing altogether rather than only during the first
> > linker pass.
> >
> > >
> > >
> > > The following code (BOUNDED_SECTION_BY)
> > > produces __start_BTF and __stop_BTF symbols
> > > under the same condition.
> > >
> >
> > Indeed. So if CONFIG_DEBUG_INFO_BTF=n, code can still link to
> > __start_BTF and __stop_BTF even though BTF is not enabled.
>
>
>
> I am talking about the case for CONFIG_DEBUG_INFO_BTF=y.
>
> PROVIDE() is meaningless because __start_BTF and __stop_BTF
> are produced by the existing code.
>
> (The code was a bit clearer before commit
> 9b351be25360c5cedfb98b88d6dfd89327849e52)
>
>
>
> So, v4 of this patch will look like 2/3, right?
>
>
> Just drop __weak attribute.
>
> You still need
>
> if (!IS_ENABLED(CONFIG_DEBUG_INFO_BTF))
> return ERR_PTR(-ENOENT);
>
> But, you do not need to modify vmlinux.lds.h
>
OK, I see what you mean now.
The PROVIDE() is indeed unnecessary - I'll drop that bit in v4.
On 4/15/2024 1:28 PM, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <[email protected]>
>
> Weak external linkage is intended for cases where a symbol reference
> can remain unsatisfied in the final link. Taking the address of such a
> symbol should yield NULL if the reference was not satisfied.
>
> Given that ordinary RIP or PC relative references cannot produce NULL,
> some kind of indirection is always needed in such cases, and in position
> independent code, this results in a GOT entry. In ordinary code, it is
> arch specific but amounts to the same thing.
>
> While unavoidable in some cases, weak references are currently also used
> to declare symbols that are always defined in the final link, but not in
> the first linker pass. This means we end up with worse codegen for no
> good reason. So let's clean this up, by providing preliminary
> definitions that are only used as a fallback.
>
> Changes since v2:
> - fix build issue in patch #3 reported by Jiri
> - add Arnd's acks
>
> Changes since v1:
> - update second occurrence of BTF start/end markers
> - drop NULL check of __start_BTF[] which is no longer meaningful
> - avoid the preliminary BTF symbols if CONFIG_DEBUG_INFO_BTF is not set
> - add Andrii's ack to patch #3
> - patches #1 and #2 unchanged
>
> Cc: Masahiro Yamada <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
> Cc: Martin KaFai Lau <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: Andrii Nakryiko <[email protected]>
> Cc: Jiri Olsa <[email protected]>
>
> Ard Biesheuvel (3):
> kallsyms: Avoid weak references for kallsyms symbols
> vmlinux: Avoid weak reference to notes section
> btf: Avoid weak external references
>
> include/asm-generic/vmlinux.lds.h | 28 ++++++++++++++++++
> kernel/bpf/btf.c | 7 +++--
> kernel/bpf/sysfs_btf.c | 6 ++--
> kernel/kallsyms.c | 6 ----
> kernel/kallsyms_internal.h | 30 ++++++++------------
> kernel/ksysfs.c | 4 +--
> lib/buildid.c | 4 +--
> 7 files changed, 52 insertions(+), 33 deletions(-)
>
When I build the linux-next next-20240423 kernel
[https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tag/?h=next-20240423]
using the attached kernel config, I see a lot of hard/soft
lockups like the one below:
Apr 23 10:47:45 kernel: [ 292.932021] RIP: 0010:fixed_percpu_data+0xffffffff8c6a0b26/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] Code: cc cc cc b8 fe ff ff ff eb df e8 65 b5 f5 00 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 f3 90 <c3> cc cc cc cc 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90
Apr 23 10:47:45 kernel: [ 292.932021] RSP: 0018:ff705b99c01afe30 EFLAGS: 00000246
Apr 23 10:47:45 kernel: [ 292.932021] RAX: 0000000000000000 RBX: ff705b99c007bd00 RCX: 0000000000000000
Apr 23 10:47:45 kernel: [ 292.932021] RDX: ff47f74381c00000 RSI: 0000000000000080 RDI: ffffffff8e65e120
Apr 23 10:47:45 kernel: [ 292.932021] RBP: ff705b99c01afe70 R08: 0000000000000014 R09: 0000000000000001
Apr 23 10:47:45 kernel: [ 292.932021] R10: 0000000000000001 R11: 0000000000000001 R12: ff705b99c007bd24
Apr 23 10:47:45 kernel: [ 292.932021] R13: 0000000000000001 R14: ffffffff8e65e120 R15: 0000000000000001
Apr 23 10:47:45 kernel: [ 292.932021] FS: 0000000000000000(0000) GS:ff47f74381c00000(0000) knlGS:0000000000000000
Apr 23 10:47:45 kernel: [ 292.932021] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 23 10:47:45 kernel: [ 292.932021] CR2: ff47f7448d3ff000 CR3: 000800404a45a001 CR4: 0000000000771ef0
Apr 23 10:47:45 kernel: [ 292.932021] PKRU: 55555554
Apr 23 10:47:45 kernel: [ 292.933349] watchdog: Watchdog detected hard LOCKUP on cpu 14
Apr 23 10:47:45 kernel: [ 292.932021] Call Trace:
Apr 23 10:47:45 kernel: [ 292.932021] <IRQ>
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c4d189d/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c6c3656/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c6c3460/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c651e71/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c652fd0/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c651010/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c66591d/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c6659a9/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c518bbd/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8d5fb167/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] </IRQ>
Apr 23 10:47:45 kernel: [ 292.932021] <TASK>
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8d8016ef/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.935984] watchdog: Watchdog detected hard LOCKUP on cpu 76
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c6a0b26/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c6a0be4/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c6a0b40/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] fixed_percpu_data+0xffffffff8c6a0743/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c5a2120/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] fixed_percpu_data+0xffffffff8c5a2206/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] fixed_percpu_data+0xffffffff8c5962a8/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c5961b0/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] fixed_percpu_data+0xffffffff8c4dd4a0/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] ? fixed_percpu_data+0xffffffff8c5961b0/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] fixed_percpu_data+0xffffffff8c403f0a/0xffffffff8f030000
Apr 23 10:47:45 kernel: [ 292.932021] </TASK>
Apr 23 10:47:45 kernel: [ 288.996021] watchdog: BUG: soft lockup - CPU#67 stuck for 257s! [migration/67:424]
Apr 23 10:47:45 kernel: [ 288.996021] Modules linked in:
Apr 23 10:47:45 kernel: [ 288.996021] CPU: 67 PID: 424 Comm: migration/67 Tainted: G L 6.9.0-rc5-0bdad2836 #3
If I build the commit before this one [82d460ed], things work well.
Bad commit 0bdad28369fc5e93de39b5046228ed78e982fc71
Author: Ard Biesheuvel <[email protected]>
Date: Sat Apr 20 16:53:04 2024 +0200
I hit this only on an Ubuntu host; on a RHEL host it seems to work well. Both
have the same kconfig related to kallsyms though:
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_SELFTEST is not set
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
CONFIG_KALLSYMS_BASE_RELATIVE=y