2022-05-17 00:57:43

by Nathan Chancellor

[permalink] [raw]
Subject: objtool "no non-local symbols" error with tip of tree LLVM

Hi Josh and Peter,

After a recent change in LLVM [1], I see warnings (errors?) from objtool
when building x86_64 allmodconfig on 5.15 and 5.17:

$ make -skj"$(nproc)" KCONFIG_ALLCONFIG=<(echo CONFIG_WERROR) LLVM=1 allmodconfig all
...
mm/highmem.o: warning: objtool: no non-local symbols !?
mm/highmem.o: warning: objtool: gelf_update_symshndx: invalid section index
make[2]: *** [scripts/Makefile.build:288: mm/highmem.o] Error 255
...
security/tomoyo/load_policy.o: warning: objtool: no non-local symbols !?
security/tomoyo/load_policy.o: warning: objtool: gelf_update_symshndx: invalid section index
make[3]: *** [scripts/Makefile.build:288: security/tomoyo/load_policy.o] Error 255
...

I don't see the same errors on x86_64 allmodconfig on mainline so I
bisected the 5.17 branch and came upon commit 4abff6d48dbc ("objtool:
Fix code relocs vs weak symbols"). I wanted to see what 5.17 might be
missing and came to commit ed53a0d97192 ("x86/alternative: Use
.ibt_endbr_seal to seal indirect calls") in mainline, which I think just
hides the issue for allmodconfig. I can reproduce this problem with a
more selective set of config values on mainline:

$ make -skj"$(nproc)" LLVM=1 defconfig

$ scripts/config -e KASAN -e SECURITY_TOMOYO -e SECURITY_TOMOYO_OMIT_USERSPACE_LOADER

$ make -skj"$(nproc)" LLVM=1 olddefconfig security/tomoyo/load_policy.o
security/tomoyo/load_policy.o: warning: objtool: no non-local symbols !?
security/tomoyo/load_policy.o: warning: objtool: gelf_update_symshndx: invalid section index
make[3]: *** [scripts/Makefile.build:288: security/tomoyo/load_policy.o] Error 255
...

Looking at the object file, the '.text.asan.module_ctor' section has
disappeared.

Before:

$ llvm-nm -S security/tomoyo/load_policy.o
0000000000000000 0000000000000001 t asan.module_ctor

$ llvm-readelf -s security/tomoyo/load_policy.o

Symbol table '.symtab' contains 4 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS load_policy.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 3 .text.asan.module_ctor
3: 0000000000000000 1 FUNC LOCAL DEFAULT 3 asan.module_ctor

After:

$ llvm-nm -S security/tomoyo/load_policy.o
0000000000000000 0000000000000001 t asan.module_ctor

$ llvm-readelf -s security/tomoyo/load_policy.o

Symbol table '.symtab' contains 3 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS load_policy.c
2: 0000000000000000 1 FUNC LOCAL DEFAULT 3 asan.module_ctor

As far as I understand it, the kernel uses constructors for at least
KASAN and KCSAN, hence why that change impacts the kernel. Beyond that,
I am not really sure whether the LLVM change is problematic or objtool
just is not accounting for something that it should. I am happy to
provide any additional information that might help understand what is
going wrong here.

[1]: https://github.com/llvm/llvm-project/commit/badd088c57d7d18acd337b7868fe8c7974c88c5b

Cheers,
Nathan


2022-05-17 02:55:32

by Peter Zijlstra

[permalink] [raw]
Subject: Re: objtool "no non-local symbols" error with tip of tree LLVM

On Mon, May 16, 2022 at 01:47:15PM -0700, Nathan Chancellor wrote:
> Hi Josh and Peter,
>
> After a recent change in LLVM [1], I see warnings (errors?) from objtool
> when building x86_64 allmodconfig on 5.15 and 5.17:
>
> $ make -skj"$(nproc)" KCONFIG_ALLCONFIG=<(echo CONFIG_WERROR) LLVM=1 allmodconfig all
> ...
> mm/highmem.o: warning: objtool: no non-local symbols !?
> mm/highmem.o: warning: objtool: gelf_update_symshndx: invalid section index
> make[2]: *** [scripts/Makefile.build:288: mm/highmem.o] Error 255
> ...
> security/tomoyo/load_policy.o: warning: objtool: no non-local symbols !?
> security/tomoyo/load_policy.o: warning: objtool: gelf_update_symshndx: invalid section index
> make[3]: *** [scripts/Makefile.build:288: security/tomoyo/load_policy.o] Error 255
> ...
>
> I don't see the same errors on x86_64 allmodconfig on mainline so I
> bisected the 5.17 branch and came upon commit 4abff6d48dbc ("objtool:
> Fix code relocs vs weak symbols"). I wanted to see what 5.17 might be
> missing and came to commit ed53a0d97192 ("x86/alternative: Use
> .ibt_endbr_seal to seal indirect calls") in mainline, which I think just
> hides the issue for allmodconfig. I can reproduce this problem with a
> more selective set of config values on mainline:
>
> $ make -skj"$(nproc)" LLVM=1 defconfig
>
> $ scripts/config -e KASAN -e SECURITY_TOMOYO -e SECURITY_TOMOYO_OMIT_USERSPACE_LOADER
>
> $ make -skj"$(nproc)" LLVM=1 olddefconfig security/tomoyo/load_policy.o
> security/tomoyo/load_policy.o: warning: objtool: no non-local symbols !?
> security/tomoyo/load_policy.o: warning: objtool: gelf_update_symshndx: invalid section index
> make[3]: *** [scripts/Makefile.build:288: security/tomoyo/load_policy.o] Error 255
> ...
>
> Looking at the object file, the '.text.asan.module_ctor' section has
> disappeared.
>
> Before:
>
> $ llvm-nm -S security/tomoyo/load_policy.o
> 0000000000000000 0000000000000001 t asan.module_ctor
>
> $ llvm-readelf -s security/tomoyo/load_policy.o
>
> Symbol table '.symtab' contains 4 entries:
> Num: Value Size Type Bind Vis Ndx Name
> 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
> 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS load_policy.c
> 2: 0000000000000000 0 SECTION LOCAL DEFAULT 3 .text.asan.module_ctor
> 3: 0000000000000000 1 FUNC LOCAL DEFAULT 3 asan.module_ctor
>
> After:
>
> $ llvm-nm -S security/tomoyo/load_policy.o
> 0000000000000000 0000000000000001 t asan.module_ctor
>
> $ llvm-readelf -s security/tomoyo/load_policy.o
>
> Symbol table '.symtab' contains 3 entries:
> Num: Value Size Type Bind Vis Ndx Name
> 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
> 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS load_policy.c
> 2: 0000000000000000 1 FUNC LOCAL DEFAULT 3 asan.module_ctor
>

The problem seems to be that we need to add a local symbols because LLVM
helpfully stripped all unused section symbols.

The way we do that, is by moving a the first non-local symbol to the
end, thereby creating a hole where we can insert a new local symbol.
Because ELF very helpfully mandates that local symbols must come before
non-local symbols and keeps the symbols index of the first non-local in
sh_info.

Thing is, the above object files don't appear to have a non-local symbol
so the swizzle thing isn't needed, and apparently the value in sh_info
isn't valid either.

Does something simple like this work? If not, I'll try and reproduce
tomorrow, it shouldn't be too hard to fix.

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 583a3ec987b5..baabf38a2a11 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -618,8 +618,7 @@ static int elf_move_global_symbol(struct elf *elf, struct section *symtab,

sym = find_symbol_by_index(elf, first_non_local);
if (!sym) {
- WARN("no non-local symbols !?");
- return first_non_local;
+ return symtab->sh.sh_size / sizeof(sym->sym);
}

s = elf_getscn(elf->elf, symtab->idx);


2022-05-17 03:04:44

by Nathan Chancellor

[permalink] [raw]
Subject: Re: objtool "no non-local symbols" error with tip of tree LLVM

On Mon, May 16, 2022 at 11:40:05PM +0200, Peter Zijlstra wrote:
> On Mon, May 16, 2022 at 01:47:15PM -0700, Nathan Chancellor wrote:
> > Hi Josh and Peter,
> >
> > After a recent change in LLVM [1], I see warnings (errors?) from objtool
> > when building x86_64 allmodconfig on 5.15 and 5.17:
> >
> > $ make -skj"$(nproc)" KCONFIG_ALLCONFIG=<(echo CONFIG_WERROR) LLVM=1 allmodconfig all
> > ...
> > mm/highmem.o: warning: objtool: no non-local symbols !?
> > mm/highmem.o: warning: objtool: gelf_update_symshndx: invalid section index
> > make[2]: *** [scripts/Makefile.build:288: mm/highmem.o] Error 255
> > ...
> > security/tomoyo/load_policy.o: warning: objtool: no non-local symbols !?
> > security/tomoyo/load_policy.o: warning: objtool: gelf_update_symshndx: invalid section index
> > make[3]: *** [scripts/Makefile.build:288: security/tomoyo/load_policy.o] Error 255
> > ...
> >
> > I don't see the same errors on x86_64 allmodconfig on mainline so I
> > bisected the 5.17 branch and came upon commit 4abff6d48dbc ("objtool:
> > Fix code relocs vs weak symbols"). I wanted to see what 5.17 might be
> > missing and came to commit ed53a0d97192 ("x86/alternative: Use
> > .ibt_endbr_seal to seal indirect calls") in mainline, which I think just
> > hides the issue for allmodconfig. I can reproduce this problem with a
> > more selective set of config values on mainline:
> >
> > $ make -skj"$(nproc)" LLVM=1 defconfig
> >
> > $ scripts/config -e KASAN -e SECURITY_TOMOYO -e SECURITY_TOMOYO_OMIT_USERSPACE_LOADER
> >
> > $ make -skj"$(nproc)" LLVM=1 olddefconfig security/tomoyo/load_policy.o
> > security/tomoyo/load_policy.o: warning: objtool: no non-local symbols !?
> > security/tomoyo/load_policy.o: warning: objtool: gelf_update_symshndx: invalid section index
> > make[3]: *** [scripts/Makefile.build:288: security/tomoyo/load_policy.o] Error 255
> > ...
> >
> > Looking at the object file, the '.text.asan.module_ctor' section has
> > disappeared.
> >
> > Before:
> >
> > $ llvm-nm -S security/tomoyo/load_policy.o
> > 0000000000000000 0000000000000001 t asan.module_ctor
> >
> > $ llvm-readelf -s security/tomoyo/load_policy.o
> >
> > Symbol table '.symtab' contains 4 entries:
> > Num: Value Size Type Bind Vis Ndx Name
> > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
> > 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS load_policy.c
> > 2: 0000000000000000 0 SECTION LOCAL DEFAULT 3 .text.asan.module_ctor
> > 3: 0000000000000000 1 FUNC LOCAL DEFAULT 3 asan.module_ctor
> >
> > After:
> >
> > $ llvm-nm -S security/tomoyo/load_policy.o
> > 0000000000000000 0000000000000001 t asan.module_ctor
> >
> > $ llvm-readelf -s security/tomoyo/load_policy.o
> >
> > Symbol table '.symtab' contains 3 entries:
> > Num: Value Size Type Bind Vis Ndx Name
> > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
> > 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS load_policy.c
> > 2: 0000000000000000 1 FUNC LOCAL DEFAULT 3 asan.module_ctor
> >
>
> The problem seems to be that we need to add a local symbols because LLVM
> helpfully stripped all unused section symbols.
>
> The way we do that, is by moving a the first non-local symbol to the
> end, thereby creating a hole where we can insert a new local symbol.
> Because ELF very helpfully mandates that local symbols must come before
> non-local symbols and keeps the symbols index of the first non-local in
> sh_info.
>
> Thing is, the above object files don't appear to have a non-local symbol
> so the swizzle thing isn't needed, and apparently the value in sh_info
> isn't valid either.
>
> Does something simple like this work? If not, I'll try and reproduce
> tomorrow, it shouldn't be too hard to fix.

That diff obviously gets rid of the "no non-local symbols" message but I
still see the "invalid section index" message. I'll be offline tomorrow
but if you have issues reproducing it, I'll be happy to help on
Wednesday. At the time I am writing this, apt.llvm.org packages have not
been updated to include that LLVM change I mentioned; hopefully they
will be soon.

Thanks for the quick response!
Nathan

> diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
> index 583a3ec987b5..baabf38a2a11 100644
> --- a/tools/objtool/elf.c
> +++ b/tools/objtool/elf.c
> @@ -618,8 +618,7 @@ static int elf_move_global_symbol(struct elf *elf, struct section *symtab,
>
> sym = find_symbol_by_index(elf, first_non_local);
> if (!sym) {
> - WARN("no non-local symbols !?");
> - return first_non_local;
> + return symtab->sh.sh_size / sizeof(sym->sym);
> }
>
> s = elf_getscn(elf->elf, symtab->idx);
>

2022-05-18 03:37:56

by Peter Zijlstra

[permalink] [raw]
Subject: Re: objtool "no non-local symbols" error with tip of tree LLVM

On Tue, May 17, 2022 at 05:33:59PM +0200, Peter Zijlstra wrote:
> On Mon, May 16, 2022 at 11:40:06PM +0200, Peter Zijlstra wrote:
> > Does something simple like this work? If not, I'll try and reproduce
> > tomorrow, it shouldn't be too hard to fix.
>
> Oh, man, I so shouldn't have said that :/
>
> I have something that almost works, except it now mightly upsets
> modpost.
>
> I'm not entirely sure how the old code worked as well as it did. Oh
> well, I'll get it sorted.

Pff, it's been a *long* day.. here this works.

---
tools/objtool/elf.c | 191 ++++++++++++++++++++++++++++++++++------------------
1 file changed, 125 insertions(+), 66 deletions(-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index ebf2ba5755c1..a9c3e27527de 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -600,24 +600,24 @@ static void elf_dirty_reloc_sym(struct elf *elf, struct symbol *sym)
}

/*
- * Move the first global symbol, as per sh_info, into a new, higher symbol
- * index. This fees up the shndx for a new local symbol.
+ * The libelf API is terrible; gelf_update_sym*() takes a data block relative
+ * index value. As such, iterate the data blocks and adjust index until it fits.
+ *
+ * If no data block is found, allow adding a new data block provided the index
+ * is only one past the end.
*/
-static int elf_move_global_symbol(struct elf *elf, struct section *symtab,
- struct section *symtab_shndx)
+static int elf_update_symbol(struct elf *elf, struct section *symtab,
+ struct section *symtab_shndx, struct symbol *sym)
{
- Elf_Data *data, *shndx_data = NULL;
- Elf32_Word first_non_local;
- struct symbol *sym;
- Elf_Scn *s;
+ Elf_Data *symtab_data = NULL, *shndx_data = NULL;
+ Elf32_Word shndx = sym->sec->idx;
+ Elf_Scn *s, *t = NULL;
+ int size, idx = sym->idx;

- first_non_local = symtab->sh.sh_info;
-
- sym = find_symbol_by_index(elf, first_non_local);
- if (!sym) {
- WARN("no non-local symbols !?");
- return first_non_local;
- }
+ if (elf->ehdr.e_ident[EI_CLASS] == ELFCLASS32)
+ size = sizeof(Elf32_Sym);
+ else
+ size = sizeof(Elf64_Sym);

s = elf_getscn(elf->elf, symtab->idx);
if (!s) {
@@ -625,79 +625,120 @@ static int elf_move_global_symbol(struct elf *elf, struct section *symtab,
return -1;
}

- data = elf_newdata(s);
- if (!data) {
- WARN_ELF("elf_newdata");
- return -1;
+ if (symtab_shndx) {
+ t = elf_getscn(elf->elf, symtab_shndx->idx);
+ if (!t) {
+ WARN_ELF("elf_getscn");
+ return -1;
+ }
}

- data->d_buf = &sym->sym;
- data->d_size = sizeof(sym->sym);
- data->d_align = 1;
- data->d_type = ELF_T_SYM;
+ for (;;) {
+ symtab_data = elf_getdata(s, symtab_data);
+ if (t)
+ shndx_data = elf_getdata(t, shndx_data);

- sym->idx = symtab->sh.sh_size / sizeof(sym->sym);
- elf_dirty_reloc_sym(elf, sym);
+ if (!symtab_data) {
+ if (!idx) {
+ void *buf;

- symtab->sh.sh_info += 1;
- symtab->sh.sh_size += data->d_size;
- symtab->changed = true;
+ symtab_data = elf_newdata(s);
+ if (t)
+ shndx_data = elf_newdata(t);

- if (symtab_shndx) {
- s = elf_getscn(elf->elf, symtab_shndx->idx);
- if (!s) {
- WARN_ELF("elf_getscn");
+ buf = calloc(1, size);
+ if (!buf) {
+ WARN("malloc");
+ return -1;
+ }
+
+ symtab_data->d_buf = buf;
+ symtab_data->d_size = size;
+ symtab_data->d_align = 1;
+ symtab_data->d_type = ELF_T_SYM;
+
+ symtab->sh.sh_size += size;
+ symtab->changed = true;
+
+ if (t) {
+ shndx_data->d_buf = &sym->sec->idx;
+ shndx_data->d_size = sizeof(Elf32_Word);
+ shndx_data->d_align = 4;
+ shndx_data->d_type = ELF_T_WORD;
+
+ symtab_shndx->sh.sh_size += 4;
+ symtab_shndx->changed = true;
+ }
+
+ break;
+ }
+
+ WARN("index out of range");
return -1;
}

- shndx_data = elf_newdata(s);
- if (!shndx_data) {
- WARN_ELF("elf_newshndx_data");
+ if (!symtab_data->d_size) {
+ WARN("zero size data");
return -1;
}

- shndx_data->d_buf = &sym->sec->idx;
- shndx_data->d_size = sizeof(Elf32_Word);
- shndx_data->d_align = 4;
- shndx_data->d_type = ELF_T_WORD;
+ if (idx * size < symtab_data->d_size)
+ break;

- symtab_shndx->sh.sh_size += 4;
- symtab_shndx->changed = true;
+ idx -= symtab_data->d_size / size;
}

- return first_non_local;
+ if (idx < 0) {
+ WARN("negative index");
+ return -1;
+ }
+
+ if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
+ sym->sym.st_shndx = shndx;
+ if (!shndx_data)
+ shndx = 0;
+ } else {
+ sym->sym.st_shndx = SHN_XINDEX;
+ if (!shndx_data) {
+ WARN("no .symtab_shndx");
+ return -1;
+ }
+ }
+
+ if (!gelf_update_symshndx(symtab_data, shndx_data, idx, &sym->sym, shndx)) {
+ WARN_ELF("gelf_update_symshndx");
+ return -1;
+ }
+
+ return 0;
}

static struct symbol *
elf_create_section_symbol(struct elf *elf, struct section *sec)
{
struct section *symtab, *symtab_shndx;
- Elf_Data *shndx_data = NULL;
- struct symbol *sym;
- Elf32_Word shndx;
+ Elf32_Word first_non_local, new;
+ struct symbol *sym, *old;
+ int size;
+
+ if (elf->ehdr.e_ident[EI_CLASS] == ELFCLASS32)
+ size = sizeof(Elf32_Sym);
+ else
+ size = sizeof(Elf64_Sym);

symtab = find_section_by_name(elf, ".symtab");
if (symtab) {
symtab_shndx = find_section_by_name(elf, ".symtab_shndx");
- if (symtab_shndx)
- shndx_data = symtab_shndx->data;
} else {
WARN("no .symtab");
return NULL;
}

- sym = malloc(sizeof(*sym));
+ sym = calloc(1, sizeof(*sym));
if (!sym) {
perror("malloc");
return NULL;
}
- memset(sym, 0, sizeof(*sym));
-
- sym->idx = elf_move_global_symbol(elf, symtab, symtab_shndx);
- if (sym->idx < 0) {
- WARN("elf_move_global_symbol");
- return NULL;
- }

sym->name = sec->name;
sym->sec = sec;
@@ -707,24 +748,42 @@ elf_create_section_symbol(struct elf *elf, struct section *sec)
// st_other 0
// st_value 0
// st_size 0
- shndx = sec->idx;
- if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
- sym->sym.st_shndx = shndx;
- if (!shndx_data)
- shndx = 0;
- } else {
- sym->sym.st_shndx = SHN_XINDEX;
- if (!shndx_data) {
- WARN("no .symtab_shndx");
+
+ new = symtab->sh.sh_size / size;
+
+ /*
+ * Move the first global symbol, as per sh_info, into a new, higher
+ * symbol index. This fees up a spot for a new local symbol.
+ */
+ first_non_local = symtab->sh.sh_info;
+ old = find_symbol_by_index(elf, first_non_local);
+ if (old) {
+ old->idx = new;
+
+ hlist_del(&old->hash);
+ elf_hash_add(symbol, &old->hash, old->idx);
+
+ elf_dirty_reloc_sym(elf, old);
+
+ if (elf_update_symbol(elf, symtab, symtab_shndx, old)) {
+ WARN("elf_update_symbol move");
return NULL;
}
+
+ new = first_non_local;
}

- if (!gelf_update_symshndx(symtab->data, shndx_data, sym->idx, &sym->sym, shndx)) {
- WARN_ELF("gelf_update_symshndx");
+ sym->idx = new;
+ if (elf_update_symbol(elf, symtab, symtab_shndx, sym)) {
+ WARN("elf_update_symbol");
return NULL;
}

+ /*
+ * Either way, we added a LOCAL symbol.
+ */
+ symtab->sh.sh_info += 1;
+
elf_add_symbol(elf, sym);

return sym;

2022-05-18 03:39:21

by Nathan Chancellor

[permalink] [raw]
Subject: Re: objtool "no non-local symbols" error with tip of tree LLVM

On Tue, May 17, 2022 at 05:42:04PM +0200, Peter Zijlstra wrote:
> On Tue, May 17, 2022 at 05:33:59PM +0200, Peter Zijlstra wrote:
> > On Mon, May 16, 2022 at 11:40:06PM +0200, Peter Zijlstra wrote:
> > > Does something simple like this work? If not, I'll try and reproduce
> > > tomorrow, it shouldn't be too hard to fix.
> >
> > Oh, man, I so shouldn't have said that :/
> >
> > I have something that almost works, except it now mightly upsets
> > modpost.
> >
> > I'm not entirely sure how the old code worked as well as it did. Oh
> > well, I'll get it sorted.
>
> Pff, it's been a *long* day.. here this works.

Thanks a lot for the quick fix! It resolves the error I see on 5.17 and
I don't see any new issues on mainline.

Tested-by: Nathan Chancellor <[email protected]>

> ---
> tools/objtool/elf.c | 191 ++++++++++++++++++++++++++++++++++------------------
> 1 file changed, 125 insertions(+), 66 deletions(-)
>
> diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
> index ebf2ba5755c1..a9c3e27527de 100644
> --- a/tools/objtool/elf.c
> +++ b/tools/objtool/elf.c
> @@ -600,24 +600,24 @@ static void elf_dirty_reloc_sym(struct elf *elf, struct symbol *sym)
> }
>
> /*
> - * Move the first global symbol, as per sh_info, into a new, higher symbol
> - * index. This fees up the shndx for a new local symbol.
> + * The libelf API is terrible; gelf_update_sym*() takes a data block relative
> + * index value. As such, iterate the data blocks and adjust index until it fits.
> + *
> + * If no data block is found, allow adding a new data block provided the index
> + * is only one past the end.
> */
> -static int elf_move_global_symbol(struct elf *elf, struct section *symtab,
> - struct section *symtab_shndx)
> +static int elf_update_symbol(struct elf *elf, struct section *symtab,
> + struct section *symtab_shndx, struct symbol *sym)
> {
> - Elf_Data *data, *shndx_data = NULL;
> - Elf32_Word first_non_local;
> - struct symbol *sym;
> - Elf_Scn *s;
> + Elf_Data *symtab_data = NULL, *shndx_data = NULL;
> + Elf32_Word shndx = sym->sec->idx;
> + Elf_Scn *s, *t = NULL;
> + int size, idx = sym->idx;
>
> - first_non_local = symtab->sh.sh_info;
> -
> - sym = find_symbol_by_index(elf, first_non_local);
> - if (!sym) {
> - WARN("no non-local symbols !?");
> - return first_non_local;
> - }
> + if (elf->ehdr.e_ident[EI_CLASS] == ELFCLASS32)
> + size = sizeof(Elf32_Sym);
> + else
> + size = sizeof(Elf64_Sym);
>
> s = elf_getscn(elf->elf, symtab->idx);
> if (!s) {
> @@ -625,79 +625,120 @@ static int elf_move_global_symbol(struct elf *elf, struct section *symtab,
> return -1;
> }
>
> - data = elf_newdata(s);
> - if (!data) {
> - WARN_ELF("elf_newdata");
> - return -1;
> + if (symtab_shndx) {
> + t = elf_getscn(elf->elf, symtab_shndx->idx);
> + if (!t) {
> + WARN_ELF("elf_getscn");
> + return -1;
> + }
> }
>
> - data->d_buf = &sym->sym;
> - data->d_size = sizeof(sym->sym);
> - data->d_align = 1;
> - data->d_type = ELF_T_SYM;
> + for (;;) {
> + symtab_data = elf_getdata(s, symtab_data);
> + if (t)
> + shndx_data = elf_getdata(t, shndx_data);
>
> - sym->idx = symtab->sh.sh_size / sizeof(sym->sym);
> - elf_dirty_reloc_sym(elf, sym);
> + if (!symtab_data) {
> + if (!idx) {
> + void *buf;
>
> - symtab->sh.sh_info += 1;
> - symtab->sh.sh_size += data->d_size;
> - symtab->changed = true;
> + symtab_data = elf_newdata(s);
> + if (t)
> + shndx_data = elf_newdata(t);
>
> - if (symtab_shndx) {
> - s = elf_getscn(elf->elf, symtab_shndx->idx);
> - if (!s) {
> - WARN_ELF("elf_getscn");
> + buf = calloc(1, size);
> + if (!buf) {
> + WARN("malloc");
> + return -1;
> + }
> +
> + symtab_data->d_buf = buf;
> + symtab_data->d_size = size;
> + symtab_data->d_align = 1;
> + symtab_data->d_type = ELF_T_SYM;
> +
> + symtab->sh.sh_size += size;
> + symtab->changed = true;
> +
> + if (t) {
> + shndx_data->d_buf = &sym->sec->idx;
> + shndx_data->d_size = sizeof(Elf32_Word);
> + shndx_data->d_align = 4;
> + shndx_data->d_type = ELF_T_WORD;
> +
> + symtab_shndx->sh.sh_size += 4;
> + symtab_shndx->changed = true;
> + }
> +
> + break;
> + }
> +
> + WARN("index out of range");
> return -1;
> }
>
> - shndx_data = elf_newdata(s);
> - if (!shndx_data) {
> - WARN_ELF("elf_newshndx_data");
> + if (!symtab_data->d_size) {
> + WARN("zero size data");
> return -1;
> }
>
> - shndx_data->d_buf = &sym->sec->idx;
> - shndx_data->d_size = sizeof(Elf32_Word);
> - shndx_data->d_align = 4;
> - shndx_data->d_type = ELF_T_WORD;
> + if (idx * size < symtab_data->d_size)
> + break;
>
> - symtab_shndx->sh.sh_size += 4;
> - symtab_shndx->changed = true;
> + idx -= symtab_data->d_size / size;
> }
>
> - return first_non_local;
> + if (idx < 0) {
> + WARN("negative index");
> + return -1;
> + }
> +
> + if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
> + sym->sym.st_shndx = shndx;
> + if (!shndx_data)
> + shndx = 0;
> + } else {
> + sym->sym.st_shndx = SHN_XINDEX;
> + if (!shndx_data) {
> + WARN("no .symtab_shndx");
> + return -1;
> + }
> + }
> +
> + if (!gelf_update_symshndx(symtab_data, shndx_data, idx, &sym->sym, shndx)) {
> + WARN_ELF("gelf_update_symshndx");
> + return -1;
> + }
> +
> + return 0;
> }
>
> static struct symbol *
> elf_create_section_symbol(struct elf *elf, struct section *sec)
> {
> struct section *symtab, *symtab_shndx;
> - Elf_Data *shndx_data = NULL;
> - struct symbol *sym;
> - Elf32_Word shndx;
> + Elf32_Word first_non_local, new;
> + struct symbol *sym, *old;
> + int size;
> +
> + if (elf->ehdr.e_ident[EI_CLASS] == ELFCLASS32)
> + size = sizeof(Elf32_Sym);
> + else
> + size = sizeof(Elf64_Sym);
>
> symtab = find_section_by_name(elf, ".symtab");
> if (symtab) {
> symtab_shndx = find_section_by_name(elf, ".symtab_shndx");
> - if (symtab_shndx)
> - shndx_data = symtab_shndx->data;
> } else {
> WARN("no .symtab");
> return NULL;
> }
>
> - sym = malloc(sizeof(*sym));
> + sym = calloc(1, sizeof(*sym));
> if (!sym) {
> perror("malloc");
> return NULL;
> }
> - memset(sym, 0, sizeof(*sym));
> -
> - sym->idx = elf_move_global_symbol(elf, symtab, symtab_shndx);
> - if (sym->idx < 0) {
> - WARN("elf_move_global_symbol");
> - return NULL;
> - }
>
> sym->name = sec->name;
> sym->sec = sec;
> @@ -707,24 +748,42 @@ elf_create_section_symbol(struct elf *elf, struct section *sec)
> // st_other 0
> // st_value 0
> // st_size 0
> - shndx = sec->idx;
> - if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
> - sym->sym.st_shndx = shndx;
> - if (!shndx_data)
> - shndx = 0;
> - } else {
> - sym->sym.st_shndx = SHN_XINDEX;
> - if (!shndx_data) {
> - WARN("no .symtab_shndx");
> +
> + new = symtab->sh.sh_size / size;
> +
> + /*
> + * Move the first global symbol, as per sh_info, into a new, higher
> + * symbol index. This fees up a spot for a new local symbol.
> + */
> + first_non_local = symtab->sh.sh_info;
> + old = find_symbol_by_index(elf, first_non_local);
> + if (old) {
> + old->idx = new;
> +
> + hlist_del(&old->hash);
> + elf_hash_add(symbol, &old->hash, old->idx);
> +
> + elf_dirty_reloc_sym(elf, old);
> +
> + if (elf_update_symbol(elf, symtab, symtab_shndx, old)) {
> + WARN("elf_update_symbol move");
> return NULL;
> }
> +
> + new = first_non_local;
> }
>
> - if (!gelf_update_symshndx(symtab->data, shndx_data, sym->idx, &sym->sym, shndx)) {
> - WARN_ELF("gelf_update_symshndx");
> + sym->idx = new;
> + if (elf_update_symbol(elf, symtab, symtab_shndx, sym)) {
> + WARN("elf_update_symbol");
> return NULL;
> }
>
> + /*
> + * Either way, we added a LOCAL symbol.
> + */
> + symtab->sh.sh_info += 1;
> +
> elf_add_symbol(elf, sym);
>
> return sym;

2022-05-18 03:48:52

by Peter Zijlstra

[permalink] [raw]
Subject: Re: objtool "no non-local symbols" error with tip of tree LLVM

On Mon, May 16, 2022 at 11:40:06PM +0200, Peter Zijlstra wrote:
> Does something simple like this work? If not, I'll try and reproduce
> tomorrow, it shouldn't be too hard to fix.

Oh, man, I so shouldn't have said that :/

I have something that almost works, except it now mightly upsets
modpost.

I'm not entirely sure how the old code worked as well as it did. Oh
well, I'll get it sorted.

2022-05-18 03:52:13

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: objtool "no non-local symbols" error with tip of tree LLVM

On Tue, May 17, 2022 at 05:42:04PM +0200, Peter Zijlstra wrote:
> + for (;;) {
> + symtab_data = elf_getdata(s, symtab_data);
> + if (t)
> + shndx_data = elf_getdata(t, shndx_data);
>
> - sym->idx = symtab->sh.sh_size / sizeof(sym->sym);
> - elf_dirty_reloc_sym(elf, sym);
> + if (!symtab_data) {
> + if (!idx) {
> + void *buf;

I'm confused by whatever this is doing, how is !symtab_data possible,
i.e. why would symtab not have data?

> elf_create_section_symbol(struct elf *elf, struct section *sec)
> {
> struct section *symtab, *symtab_shndx;
> - Elf_Data *shndx_data = NULL;
> - struct symbol *sym;
> - Elf32_Word shndx;
> + Elf32_Word first_non_local, new;
> + struct symbol *sym, *old;
> + int size;
> +
> + if (elf->ehdr.e_ident[EI_CLASS] == ELFCLASS32)
> + size = sizeof(Elf32_Sym);
> + else
> + size = sizeof(Elf64_Sym);

This should probably be called 'entsize' and I think you can just get it
from symtab->sh.sh_entsize.

> + /*
> + * Either way, we added a LOCAL symbol.
> + */
> + symtab->sh.sh_info += 1;
> +
> elf_add_symbol(elf, sym);

Not sure if it matters here, but elf_add_symbol() doesn't set sym->alias
and sym->pv_target, and both of those are unconditionally initialized in
read_symbols(). Should elf_add_symbol() be changed to initialize them?

--
Josh

2022-05-18 05:43:18

by Peter Zijlstra

[permalink] [raw]
Subject: Re: objtool "no non-local symbols" error with tip of tree LLVM

On Tue, May 17, 2022 at 06:24:29PM -0700, Josh Poimboeuf wrote:
> On Tue, May 17, 2022 at 05:42:04PM +0200, Peter Zijlstra wrote:
> > + for (;;) {
> > + symtab_data = elf_getdata(s, symtab_data);
> > + if (t)
> > + shndx_data = elf_getdata(t, shndx_data);
> >
> > + if (!symtab_data) {
> > + if (!idx) {
> > + void *buf;
>
> I'm confused by whatever this is doing, how is !symtab_data possible,
> i.e. why would symtab not have data?

Elf_Data *elf_getdata(Elf_Scn *scn, Elf_Data *data);

is an iterator, if @data is null it will return the first element, which
you then feed into @data the next time to get the next element, once it
returns NULL, you've found the end.

In our specific case, we iterate the data sections, if idx fits inside
the current section, we good, otherwise we lower idx by however many did
fit and try the next.

> > elf_create_section_symbol(struct elf *elf, struct section *sec)
> > {
> > struct section *symtab, *symtab_shndx;
> > - Elf_Data *shndx_data = NULL;
> > - struct symbol *sym;
> > - Elf32_Word shndx;
> > + Elf32_Word first_non_local, new;
> > + struct symbol *sym, *old;
> > + int size;
> > +
> > + if (elf->ehdr.e_ident[EI_CLASS] == ELFCLASS32)
> > + size = sizeof(Elf32_Sym);
> > + else
> > + size = sizeof(Elf64_Sym);
>
> This should probably be called 'entsize' and I think you can just get it
> from symtab->sh.sh_entsize.

Ok, that would be easier, I'll check.

> > + /*
> > + * Either way, we added a LOCAL symbol.
> > + */
> > + symtab->sh.sh_info += 1;
> > +
> > elf_add_symbol(elf, sym);
>
> Not sure if it matters here, but elf_add_symbol() doesn't set sym->alias
> and sym->pv_target, and both of those are unconditionally initialized in
> read_symbols(). Should elf_add_symbol() be changed to initialize them?

I'll go have a look, breakfast first though! :-)

2022-05-18 07:43:52

by Peter Zijlstra

[permalink] [raw]
Subject: Re: objtool "no non-local symbols" error with tip of tree LLVM

On Tue, May 17, 2022 at 06:24:29PM -0700, Josh Poimboeuf wrote:
> On Tue, May 17, 2022 at 05:42:04PM +0200, Peter Zijlstra wrote:
> > + for (;;) {
> > + symtab_data = elf_getdata(s, symtab_data);
> > + if (t)
> > + shndx_data = elf_getdata(t, shndx_data);
> >
> > - sym->idx = symtab->sh.sh_size / sizeof(sym->sym);
> > - elf_dirty_reloc_sym(elf, sym);
> > + if (!symtab_data) {
> > + if (!idx) {
> > + void *buf;
>
> I'm confused by whatever this is doing, how is !symtab_data possible,
> i.e. why would symtab not have data?
>
> > elf_create_section_symbol(struct elf *elf, struct section *sec)
> > {
> > struct section *symtab, *symtab_shndx;
> > - Elf_Data *shndx_data = NULL;
> > - struct symbol *sym;
> > - Elf32_Word shndx;
> > + Elf32_Word first_non_local, new;
> > + struct symbol *sym, *old;
> > + int size;
> > +
> > + if (elf->ehdr.e_ident[EI_CLASS] == ELFCLASS32)
> > + size = sizeof(Elf32_Sym);
> > + else
> > + size = sizeof(Elf64_Sym);
>
> This should probably be called 'entsize' and I think you can just get it
> from symtab->sh.sh_entsize.
>
> > + /*
> > + * Either way, we added a LOCAL symbol.
> > + */
> > + symtab->sh.sh_info += 1;
> > +
> > elf_add_symbol(elf, sym);
>
> Not sure if it matters here, but elf_add_symbol() doesn't set sym->alias
> and sym->pv_target, and both of those are unconditionally initialized in
> read_symbols(). Should elf_add_symbol() be changed to initialize them?



--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -374,6 +374,9 @@ static void elf_add_symbol(struct elf *e
struct list_head *entry;
struct rb_node *pnode;

+ INIT_LIST_HEAD(&sym->pv_target);
+ sym->alias = sym;
+
sym->type = GELF_ST_TYPE(sym->sym.st_info);
sym->bind = GELF_ST_BIND(sym->sym.st_info);

@@ -438,8 +441,6 @@ static int read_symbols(struct elf *elf)
return -1;
}
memset(sym, 0, sizeof(*sym));
- INIT_LIST_HEAD(&sym->pv_target);
- sym->alias = sym;

sym->idx = i;

@@ -604,7 +605,8 @@ static void elf_dirty_reloc_sym(struct e

/*
* The libelf API is terrible; gelf_update_sym*() takes a data block relative
- * index value. As such, iterate the data blocks and adjust index until it fits.
+ * index value, *NOT* the symbol index. As such, iterate the data blocks and
+ * adjust index until it fits.
*
* If no data block is found, allow adding a new data block provided the index
* is only one past the end.
@@ -613,14 +615,10 @@ static int elf_update_symbol(struct elf
struct section *symtab_shndx, struct symbol *sym)
{
Elf_Data *symtab_data = NULL, *shndx_data = NULL;
+ Elf64_Xword entsize = symtab->sh.sh_entsize;
Elf32_Word shndx = sym->sec->idx;
Elf_Scn *s, *t = NULL;
- int size, idx = sym->idx;
-
- if (elf->ehdr.e_ident[EI_CLASS] == ELFCLASS32)
- size = sizeof(Elf32_Sym);
- else
- size = sizeof(Elf64_Sym);
+ int max_idx, idx = sym->idx;

s = elf_getscn(elf->elf, symtab->idx);
if (!s) {
@@ -637,11 +635,14 @@ static int elf_update_symbol(struct elf
}

for (;;) {
+ /* get next data descriptor for the relevant sections */
symtab_data = elf_getdata(s, symtab_data);
if (t)
shndx_data = elf_getdata(t, shndx_data);

+ /* end-of-list */
if (!symtab_data) {
+ /* if @idx == 0, it's the next contiguous entry, create it */
if (!idx) {
void *buf;

@@ -649,53 +650,60 @@ static int elf_update_symbol(struct elf
if (t)
shndx_data = elf_newdata(t);

- buf = calloc(1, size);
+ buf = calloc(1, entsize);
if (!buf) {
WARN("malloc");
return -1;
}

symtab_data->d_buf = buf;
- symtab_data->d_size = size;
+ symtab_data->d_size = entsize;
symtab_data->d_align = 1;
symtab_data->d_type = ELF_T_SYM;

- symtab->sh.sh_size += size;
+ symtab->sh.sh_size += entsize;
symtab->changed = true;

if (t) {
shndx_data->d_buf = &sym->sec->idx;
shndx_data->d_size = sizeof(Elf32_Word);
- shndx_data->d_align = 4;
+ shndx_data->d_align = sizeof(Elf32_Word);
shndx_data->d_type = ELF_T_WORD;

- symtab_shndx->sh.sh_size += 4;
+ symtab_shndx->sh.sh_size += sizeof(Elf32_Word);
symtab_shndx->changed = true;
}

break;
}

+ /* we don't do holes in symbol tables */
WARN("index out of range");
return -1;
}

+ /* empty blocks should not happen */
if (!symtab_data->d_size) {
WARN("zero size data");
return -1;
}

- if (idx * size < symtab_data->d_size)
+ /* is this the right block? */
+ max_idx = symtab_data->d_size / entsize;
+ if (idx < max_idx)
break;

- idx -= symtab_data->d_size / size;
+ /* adjust index and try again */
+ idx -= max_idx;
}

+ /* something went side-ways */
if (idx < 0) {
WARN("negative index");
return -1;
}

+ /* setup extended section index magic and write the symbol */
if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
sym->sym.st_shndx = shndx;
if (!shndx_data)
@@ -720,14 +728,8 @@ static struct symbol *
elf_create_section_symbol(struct elf *elf, struct section *sec)
{
struct section *symtab, *symtab_shndx;
- Elf32_Word first_non_local, new;
+ Elf32_Word first_non_local, new_idx;
struct symbol *sym, *old;
- int size;
-
- if (elf->ehdr.e_ident[EI_CLASS] == ELFCLASS32)
- size = sizeof(Elf32_Sym);
- else
- size = sizeof(Elf64_Sym);

symtab = find_section_by_name(elf, ".symtab");
if (symtab) {
@@ -752,16 +754,15 @@ elf_create_section_symbol(struct elf *el
// st_value 0
// st_size 0

- new = symtab->sh.sh_size / size;
-
/*
* Move the first global symbol, as per sh_info, into a new, higher
* symbol index. This fees up a spot for a new local symbol.
*/
first_non_local = symtab->sh.sh_info;
+ new_idx = symtab->sh.sh_size / symtab->sh.sh_entsize;
old = find_symbol_by_index(elf, first_non_local);
if (old) {
- old->idx = new;
+ old->idx = new_idx;

hlist_del(&old->hash);
elf_hash_add(symbol, &old->hash, old->idx);
@@ -773,10 +774,10 @@ elf_create_section_symbol(struct elf *el
return NULL;
}

- new = first_non_local;
+ new_idx = first_non_local;
}

- sym->idx = new;
+ sym->idx = new_idx;
if (elf_update_symbol(elf, symtab, symtab_shndx, sym)) {
WARN("elf_update_symbol");
return NULL;


2022-05-18 07:44:03

by Peter Zijlstra

[permalink] [raw]
Subject: [PATCH] objtool: Fix symbol creation


Subject: objtool: Fix symbol creation
From: Peter Zijlstra <[email protected]>
Date: Tue, 17 May 2022 17:42:04 +0200

Nathan reported objtool failing with the following messages:

warning: objtool: no non-local symbols !?
warning: objtool: gelf_update_symshndx: invalid section index

The problem is due to commit 4abff6d48dbc ("objtool: Fix code relocs
vs weak symbols") failing to consider the case where an object would
have no non-local symbols.

The problem that commit tries to address is adding a STB_LOCAL symbol
to the symbol table in light of the ELF spec's requirement that:

In each symbol table, all symbols with STB_LOCAL binding preced the
weak and global symbols. As ``Sections'' above describes, a symbol
table section's sh_info section header member holds the symbol table
index for the first non-local symbol.

The approach taken is to find this first non-local symbol, move that
to the end and then re-use the freed spot to insert a new local symbol
and increment sh_info.

Except it never considered the case of object files without global
symbols and got a whole bunch of details wrong -- so many in fact that
it is a wonder it ever worked :/

Specifically:

- It failed to re-hash the symbol on the new index, so a subsequent
find_symbol_by_index() would not find it at the new location and a
query for the old location would now return a non-deterministic
choice between the old and new symbol.

- It failed to appreciate that the GElf wrappers are not a valid disk
format (it works because GElf is basically Elf64 and we only
support x86_64 atm.)

- It failed to fully appreciate how horrible the libelf API really is
and got the gelf_update_symshndx() call pretty much completely
wrong; with the direct consequence that if inserting a second
STB_LOCAL symbol would require moving the same STB_GLOBAL symbol
again it would completely come unstuck.

Write a new elf_update_symbol() function that wraps all the magic
required to update or create a new symbol at a given index.

Specifically, gelf_update_sym*() require an @ndx argument that is
relative to the @data argument; this means you have to manually
iterate the section data descriptor list and update @ndx.

Fixes: 4abff6d48dbc ("objtool: Fix code relocs vs weak symbols")
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
---
tools/objtool/elf.c | 198 +++++++++++++++++++++++++++++++++-------------------
1 file changed, 129 insertions(+), 69 deletions(-)

--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -374,6 +374,9 @@ static void elf_add_symbol(struct elf *e
struct list_head *entry;
struct rb_node *pnode;

+ INIT_LIST_HEAD(&sym->pv_target);
+ sym->alias = sym;
+
sym->type = GELF_ST_TYPE(sym->sym.st_info);
sym->bind = GELF_ST_BIND(sym->sym.st_info);

@@ -438,8 +441,6 @@ static int read_symbols(struct elf *elf)
return -1;
}
memset(sym, 0, sizeof(*sym));
- INIT_LIST_HEAD(&sym->pv_target);
- sym->alias = sym;

sym->idx = i;

@@ -603,24 +604,21 @@ static void elf_dirty_reloc_sym(struct e
}

/*
- * Move the first global symbol, as per sh_info, into a new, higher symbol
- * index. This fees up the shndx for a new local symbol.
+ * The libelf API is terrible; gelf_update_sym*() takes a data block relative
+ * index value, *NOT* the symbol index. As such, iterate the data blocks and
+ * adjust index until it fits.
+ *
+ * If no data block is found, allow adding a new data block provided the index
+ * is only one past the end.
*/
-static int elf_move_global_symbol(struct elf *elf, struct section *symtab,
- struct section *symtab_shndx)
+static int elf_update_symbol(struct elf *elf, struct section *symtab,
+ struct section *symtab_shndx, struct symbol *sym)
{
- Elf_Data *data, *shndx_data = NULL;
- Elf32_Word first_non_local;
- struct symbol *sym;
- Elf_Scn *s;
-
- first_non_local = symtab->sh.sh_info;
-
- sym = find_symbol_by_index(elf, first_non_local);
- if (!sym) {
- WARN("no non-local symbols !?");
- return first_non_local;
- }
+ Elf_Data *symtab_data = NULL, *shndx_data = NULL;
+ Elf64_Xword entsize = symtab->sh.sh_entsize;
+ Elf32_Word shndx = sym->sec->idx;
+ Elf_Scn *s, *t = NULL;
+ int max_idx, idx = sym->idx;

s = elf_getscn(elf->elf, symtab->idx);
if (!s) {
@@ -628,79 +626,124 @@ static int elf_move_global_symbol(struct
return -1;
}

- data = elf_newdata(s);
- if (!data) {
- WARN_ELF("elf_newdata");
- return -1;
+ if (symtab_shndx) {
+ t = elf_getscn(elf->elf, symtab_shndx->idx);
+ if (!t) {
+ WARN_ELF("elf_getscn");
+ return -1;
+ }
}

- data->d_buf = &sym->sym;
- data->d_size = sizeof(sym->sym);
- data->d_align = 1;
- data->d_type = ELF_T_SYM;
+ for (;;) {
+ /* get next data descriptor for the relevant sections */
+ symtab_data = elf_getdata(s, symtab_data);
+ if (t)
+ shndx_data = elf_getdata(t, shndx_data);
+
+ /* end-of-list */
+ if (!symtab_data) {
+ /* if @idx == 0, it's the next contiguous entry, create it */
+ if (!idx) {
+ void *buf;
+
+ symtab_data = elf_newdata(s);
+ if (t)
+ shndx_data = elf_newdata(t);
+
+ buf = calloc(1, entsize);
+ if (!buf) {
+ WARN("malloc");
+ return -1;
+ }
+
+ symtab_data->d_buf = buf;
+ symtab_data->d_size = entsize;
+ symtab_data->d_align = 1;
+ symtab_data->d_type = ELF_T_SYM;
+
+ symtab->sh.sh_size += entsize;
+ symtab->changed = true;
+
+ if (t) {
+ shndx_data->d_buf = &sym->sec->idx;
+ shndx_data->d_size = sizeof(Elf32_Word);
+ shndx_data->d_align = sizeof(Elf32_Word);
+ shndx_data->d_type = ELF_T_WORD;
+
+ symtab_shndx->sh.sh_size += sizeof(Elf32_Word);
+ symtab_shndx->changed = true;
+ }

- sym->idx = symtab->sh.sh_size / sizeof(sym->sym);
- elf_dirty_reloc_sym(elf, sym);
+ break;
+ }

- symtab->sh.sh_info += 1;
- symtab->sh.sh_size += data->d_size;
- symtab->changed = true;
+ /* we don't do holes in symbol tables */
+ WARN("index out of range");
+ return -1;
+ }

- if (symtab_shndx) {
- s = elf_getscn(elf->elf, symtab_shndx->idx);
- if (!s) {
- WARN_ELF("elf_getscn");
+ /* empty blocks should not happen */
+ if (!symtab_data->d_size) {
+ WARN("zero size data");
return -1;
}

- shndx_data = elf_newdata(s);
+ /* is this the right block? */
+ max_idx = symtab_data->d_size / entsize;
+ if (idx < max_idx)
+ break;
+
+ /* adjust index and try again */
+ idx -= max_idx;
+ }
+
+ /* something went side-ways */
+ if (idx < 0) {
+ WARN("negative index");
+ return -1;
+ }
+
+ /* setup extended section index magic and write the symbol */
+ if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
+ sym->sym.st_shndx = shndx;
+ if (!shndx_data)
+ shndx = 0;
+ } else {
+ sym->sym.st_shndx = SHN_XINDEX;
if (!shndx_data) {
- WARN_ELF("elf_newshndx_data");
+ WARN("no .symtab_shndx");
return -1;
}
+ }

- shndx_data->d_buf = &sym->sec->idx;
- shndx_data->d_size = sizeof(Elf32_Word);
- shndx_data->d_align = 4;
- shndx_data->d_type = ELF_T_WORD;
-
- symtab_shndx->sh.sh_size += 4;
- symtab_shndx->changed = true;
+ if (!gelf_update_symshndx(symtab_data, shndx_data, idx, &sym->sym, shndx)) {
+ WARN_ELF("gelf_update_symshndx");
+ return -1;
}

- return first_non_local;
+ return 0;
}

static struct symbol *
elf_create_section_symbol(struct elf *elf, struct section *sec)
{
struct section *symtab, *symtab_shndx;
- Elf_Data *shndx_data = NULL;
- struct symbol *sym;
- Elf32_Word shndx;
+ Elf32_Word first_non_local, new_idx;
+ struct symbol *sym, *old;

symtab = find_section_by_name(elf, ".symtab");
if (symtab) {
symtab_shndx = find_section_by_name(elf, ".symtab_shndx");
- if (symtab_shndx)
- shndx_data = symtab_shndx->data;
} else {
WARN("no .symtab");
return NULL;
}

- sym = malloc(sizeof(*sym));
+ sym = calloc(1, sizeof(*sym));
if (!sym) {
perror("malloc");
return NULL;
}
- memset(sym, 0, sizeof(*sym));
-
- sym->idx = elf_move_global_symbol(elf, symtab, symtab_shndx);
- if (sym->idx < 0) {
- WARN("elf_move_global_symbol");
- return NULL;
- }

sym->name = sec->name;
sym->sec = sec;
@@ -710,24 +753,41 @@ elf_create_section_symbol(struct elf *el
// st_other 0
// st_value 0
// st_size 0
- shndx = sec->idx;
- if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
- sym->sym.st_shndx = shndx;
- if (!shndx_data)
- shndx = 0;
- } else {
- sym->sym.st_shndx = SHN_XINDEX;
- if (!shndx_data) {
- WARN("no .symtab_shndx");
+
+ /*
+ * Move the first global symbol, as per sh_info, into a new, higher
+ * symbol index. This fees up a spot for a new local symbol.
+ */
+ first_non_local = symtab->sh.sh_info;
+ new_idx = symtab->sh.sh_size / symtab->sh.sh_entsize;
+ old = find_symbol_by_index(elf, first_non_local);
+ if (old) {
+ old->idx = new_idx;
+
+ hlist_del(&old->hash);
+ elf_hash_add(symbol, &old->hash, old->idx);
+
+ elf_dirty_reloc_sym(elf, old);
+
+ if (elf_update_symbol(elf, symtab, symtab_shndx, old)) {
+ WARN("elf_update_symbol move");
return NULL;
}
+
+ new_idx = first_non_local;
}

- if (!gelf_update_symshndx(symtab->data, shndx_data, sym->idx, &sym->sym, shndx)) {
- WARN_ELF("gelf_update_symshndx");
+ sym->idx = new_idx;
+ if (elf_update_symbol(elf, symtab, symtab_shndx, sym)) {
+ WARN("elf_update_symbol");
return NULL;
}

+ /*
+ * Either way, we added a LOCAL symbol.
+ */
+ symtab->sh.sh_info += 1;
+
elf_add_symbol(elf, sym);

return sym;


2022-05-18 16:17:41

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: objtool "no non-local symbols" error with tip of tree LLVM

On Wed, May 18, 2022 at 07:30:06AM +0200, Peter Zijlstra wrote:
> On Tue, May 17, 2022 at 06:24:29PM -0700, Josh Poimboeuf wrote:
> > On Tue, May 17, 2022 at 05:42:04PM +0200, Peter Zijlstra wrote:
> > > + for (;;) {
> > > + symtab_data = elf_getdata(s, symtab_data);
> > > + if (t)
> > > + shndx_data = elf_getdata(t, shndx_data);
> > >
> > > + if (!symtab_data) {
> > > + if (!idx) {
> > > + void *buf;
> >
> > I'm confused by whatever this is doing, how is !symtab_data possible,
> > i.e. why would symtab not have data?
>
> Elf_Data *elf_getdata(Elf_Scn *scn, Elf_Data *data);
>
> is an iterator, if @data is null it will return the first element, which
> you then feed into @data the next time to get the next element, once it
> returns NULL, you've found the end.
>
> In our specific case, we iterate the data sections, if idx fits inside
> the current section, we good, otherwise we lower idx by however many did
> fit and try the next.

Ok, I think I see. But why are there multiple data blocks to begin
with? It's because of a previous call to elf_newdata() right?

If so then I don't see how it would "fit" in an existing data block,
since each block should already be full. Or... is the hole the one you
just made, by moving the old symbol out?

If so, the function seems weirdly generalized for the two distinct cases
and the loop seems unnecessary. When adding a symbol at the end, just
use elf_newdata(). When adding a symbol in the middle, the hole should
be in the first data block.

--
Josh

2022-05-18 17:15:10

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: objtool "no non-local symbols" error with tip of tree LLVM

On Wed, May 18, 2022 at 09:17:27AM -0700, Josh Poimboeuf wrote:
> On Wed, May 18, 2022 at 07:30:06AM +0200, Peter Zijlstra wrote:
> > On Tue, May 17, 2022 at 06:24:29PM -0700, Josh Poimboeuf wrote:
> > > On Tue, May 17, 2022 at 05:42:04PM +0200, Peter Zijlstra wrote:
> > > > + for (;;) {
> > > > + symtab_data = elf_getdata(s, symtab_data);
> > > > + if (t)
> > > > + shndx_data = elf_getdata(t, shndx_data);
> > > >
> > > > + if (!symtab_data) {
> > > > + if (!idx) {
> > > > + void *buf;
> > >
> > > I'm confused by whatever this is doing, how is !symtab_data possible,
> > > i.e. why would symtab not have data?
> >
> > Elf_Data *elf_getdata(Elf_Scn *scn, Elf_Data *data);
> >
> > is an iterator, if @data is null it will return the first element, which
> > you then feed into @data the next time to get the next element, once it
> > returns NULL, you've found the end.
> >
> > In our specific case, we iterate the data sections, if idx fits inside
> > the current section, we good, otherwise we lower idx by however many did
> > fit and try the next.
>
> Ok, I think I see. But why are there multiple data blocks to begin
> with? It's because of a previous call to elf_newdata() right?
>
> If so then I don't see how it would "fit" in an existing data block,
> since each block should already be full. Or... is the hole the one you
> just made, by moving the old symbol out?
>
> If so, the function seems weirdly generalized for the two distinct cases
> and the loop seems unnecessary. When adding a symbol at the end, just
> use elf_newdata(). When adding a symbol in the middle, the hole should
> be in the first data block.

Then I went for a bike ride and realized that if adding enough section
symbols to a file which didn't have very many non-locals, the hole might
occur in a later data block.

So yeah, this looks fine :-)

Another idea I had was to forego elf_newdata() entirely in favor of just
rewriting the original data block every time. But this is also fine.

--
Josh

2022-05-18 17:25:41

by Peter Zijlstra

[permalink] [raw]
Subject: Re: objtool "no non-local symbols" error with tip of tree LLVM

On Wed, May 18, 2022 at 09:17:25AM -0700, Josh Poimboeuf wrote:
> On Wed, May 18, 2022 at 07:30:06AM +0200, Peter Zijlstra wrote:
> > On Tue, May 17, 2022 at 06:24:29PM -0700, Josh Poimboeuf wrote:
> > > On Tue, May 17, 2022 at 05:42:04PM +0200, Peter Zijlstra wrote:
> > > > + for (;;) {
> > > > + symtab_data = elf_getdata(s, symtab_data);
> > > > + if (t)
> > > > + shndx_data = elf_getdata(t, shndx_data);
> > > >
> > > > + if (!symtab_data) {
> > > > + if (!idx) {
> > > > + void *buf;
> > >
> > > I'm confused by whatever this is doing, how is !symtab_data possible,
> > > i.e. why would symtab not have data?
> >
> > Elf_Data *elf_getdata(Elf_Scn *scn, Elf_Data *data);
> >
> > is an iterator, if @data is null it will return the first element, which
> > you then feed into @data the next time to get the next element, once it
> > returns NULL, you've found the end.
> >
> > In our specific case, we iterate the data sections, if idx fits inside
> > the current section, we good, otherwise we lower idx by however many did
> > fit and try the next.
>
> Ok, I think I see. But why are there multiple data blocks to begin
> with? It's because of a previous call to elf_newdata() right?

Correct.

> If so then I don't see how it would "fit" in an existing data block,
> since each block should already be full. Or... is the hole the one you
> just made, by moving the old symbol out?

Yeah, the hole can be in an arbitrary data block, also the case of not
having any global symbols, but see below...

> If so, the function seems weirdly generalized for the two distinct cases
> and the loop seems unnecessary. When adding a symbol at the end, just
> use elf_newdata(). When adding a symbol in the middle, the hole should
> be in the first data block.

I tried that, but there's a number of weird cases that made a right mess
of that.

Consider for instance the case where there is 1 global symbol and we
need to add 2 local symbols. We start with a single data block:

- L s1
| L s2
- G g1

So then we add one, say s3:

- L s1
| L s2
- L s3
- G g1

and we see we got a new data-block for g1, but then we add another local
symbol, s4, we move our g1 to a new data block but then find that our
hole is not in the original data block:

- L s1
| L s2
- L s3
- <hole>
- G g1

So while writing the global symbol can always use the new data section,
writing the new symbol can need arbitrary iteration of the data blocks.

Something somewhat similar is when there's no global symbols, then the
new symbol needs to go in the new data block instead of the old.

So it all became a tangled mess and I ended up with the one generic
function that could do it all (which is both simpler and less code than
trying to deal with all the weird cases).

2022-05-18 17:38:54

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH] objtool: Fix symbol creation

On Wed, May 18, 2022 at 09:41:52AM +0200, Peter Zijlstra wrote:
> +static int elf_update_symbol(struct elf *elf, struct section *symtab,
> + struct section *symtab_shndx, struct symbol *sym)
> {
> - Elf_Data *data, *shndx_data = NULL;
> - Elf32_Word first_non_local;
> - struct symbol *sym;
> - Elf_Scn *s;
> -
> - first_non_local = symtab->sh.sh_info;
> -
> - sym = find_symbol_by_index(elf, first_non_local);
> - if (!sym) {
> - WARN("no non-local symbols !?");
> - return first_non_local;
> - }
> + Elf_Data *symtab_data = NULL, *shndx_data = NULL;
> + Elf64_Xword entsize = symtab->sh.sh_entsize;
> + Elf32_Word shndx = sym->sec->idx;

So if it's a global UNDEF symbol then I think 'sym->sec' can be NULL and
this blows up?

> + for (;;) {
> + /* get next data descriptor for the relevant sections */
> + symtab_data = elf_getdata(s, symtab_data);
> + if (t)
> + shndx_data = elf_getdata(t, shndx_data);
> +
> + /* end-of-list */
> + if (!symtab_data) {
> + /* if @idx == 0, it's the next contiguous entry, create it */
> + if (!idx) {
> + void *buf;

Could just do the "index out of range warning" here to reduce the
indentation level.

> + /* setup extended section index magic and write the symbol */
> + if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {

> + sym->sym.st_shndx = shndx;
> + if (!shndx_data)
> + shndx = 0;

I think this '0' is SHN_UNDEF?

Also shouldn't 'sym->sym.st_shndx' get the same value?

--
Josh

2022-05-18 18:05:46

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: objtool "no non-local symbols" error with tip of tree LLVM

On Wed, May 18, 2022 at 07:25:13PM +0200, Peter Zijlstra wrote:
> So while writing the global symbol can always use the new data section,
> writing the new symbol can need arbitrary iteration of the data blocks.
>
> Something somewhat similar is when there's no global symbols, then the
> new symbol needs to go in the new data block instead of the old.
>
> So it all became a tangled mess and I ended up with the one generic
> function that could do it all (which is both simpler and less code than
> trying to deal with all the weird cases).

Makes sense, and matches my post-bike-ride insights. Thanks :-)

--
Josh

2022-05-18 22:41:22

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] objtool: Fix symbol creation

On Wed, May 18, 2022 at 10:36:04AM -0700, Josh Poimboeuf wrote:
> On Wed, May 18, 2022 at 09:41:52AM +0200, Peter Zijlstra wrote:
> > +static int elf_update_symbol(struct elf *elf, struct section *symtab,
> > + struct section *symtab_shndx, struct symbol *sym)
> > {
> > - Elf_Data *data, *shndx_data = NULL;
> > - Elf32_Word first_non_local;
> > - struct symbol *sym;
> > - Elf_Scn *s;
> > -
> > - first_non_local = symtab->sh.sh_info;
> > -
> > - sym = find_symbol_by_index(elf, first_non_local);
> > - if (!sym) {
> > - WARN("no non-local symbols !?");
> > - return first_non_local;
> > - }
> > + Elf_Data *symtab_data = NULL, *shndx_data = NULL;
> > + Elf64_Xword entsize = symtab->sh.sh_entsize;
> > + Elf32_Word shndx = sym->sec->idx;
>
> So if it's a global UNDEF symbol then I think 'sym->sec' can be NULL and
> this blows up?

Oh indeed, sym->sec ? sym->sec->idx : SHN_UNDEF it is.

> > + for (;;) {
> > + /* get next data descriptor for the relevant sections */
> > + symtab_data = elf_getdata(s, symtab_data);
> > + if (t)
> > + shndx_data = elf_getdata(t, shndx_data);
> > +
> > + /* end-of-list */
> > + if (!symtab_data) {
> > + /* if @idx == 0, it's the next contiguous entry, create it */
> > + if (!idx) {
> > + void *buf;
>
> Could just do the "index out of range warning" here to reduce the
> indentation level.

Sure.

> > + /* setup extended section index magic and write the symbol */
> > + if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
>
> > + sym->sym.st_shndx = shndx;
> > + if (!shndx_data)
> > + shndx = 0;
>
> I think this '0' is SHN_UNDEF?
>
> Also shouldn't 'sym->sym.st_shndx' get the same value?

This is when there isn't an extended section index. Specifically
gelf_update_symshndx() requires that when @shndx_data == NULL, @shndx
must be 0 too.

2022-05-19 16:15:37

by Peter Zijlstra

[permalink] [raw]
Subject: [PATCH v2] objtool: Fix symbol creation

Subject: objtool: Fix symbol creation
From: Peter Zijlstra <[email protected]>
Date: Tue, 17 May 2022 17:42:04 +0200

Nathan reported objtool failing with the following messages:

warning: objtool: no non-local symbols !?
warning: objtool: gelf_update_symshndx: invalid section index

The problem is due to commit 4abff6d48dbc ("objtool: Fix code relocs
vs weak symbols") failing to consider the case where an object would
have no non-local symbols.

The problem that commit tries to address is adding a STB_LOCAL symbol
to the symbol table in light of the ELF spec's requirement that:

In each symbol table, all symbols with STB_LOCAL binding preced the
weak and global symbols. As ``Sections'' above describes, a symbol
table section's sh_info section header member holds the symbol table
index for the first non-local symbol.

The approach taken is to find this first non-local symbol, move that
to the end and then re-use the freed spot to insert a new local symbol
and increment sh_info.

Except it never considered the case of object files without global
symbols and got a whole bunch of details wrong -- so many in fact that
it is a wonder it ever worked :/

Specifically:

- It failed to re-hash the symbol on the new index, so a subsequent
find_symbol_by_index() would not find it at the new location and a
query for the old location would now return a non-deterministic
choice between the old and new symbol.

- It failed to appreciate that the GElf wrappers are not a valid disk
format (it works because GElf is basically Elf64 and we only
support x86_64 atm.)

- It failed to fully appreciate how horrible the libelf API really is
and got the gelf_update_symshndx() call pretty much completely
wrong; with the direct consequence that if inserting a second
STB_LOCAL symbol would require moving the same STB_GLOBAL symbol
again it would completely come unstuck.

Write a new elf_update_symbol() function that wraps all the magic
required to update or create a new symbol at a given index.

Specifically, gelf_update_sym*() require an @ndx argument that is
relative to the @data argument; this means you have to manually
iterate the section data descriptor list and update @ndx.

Fixes: 4abff6d48dbc ("objtool: Fix code relocs vs weak symbols")
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
---
tools/objtool/elf.c | 198 +++++++++++++++++++++++++++++++++-------------------
1 file changed, 129 insertions(+), 69 deletions(-)

--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -374,6 +374,9 @@ static void elf_add_symbol(struct elf *e
struct list_head *entry;
struct rb_node *pnode;

+ INIT_LIST_HEAD(&sym->pv_target);
+ sym->alias = sym;
+
sym->type = GELF_ST_TYPE(sym->sym.st_info);
sym->bind = GELF_ST_BIND(sym->sym.st_info);

@@ -438,8 +441,6 @@ static int read_symbols(struct elf *elf)
return -1;
}
memset(sym, 0, sizeof(*sym));
- INIT_LIST_HEAD(&sym->pv_target);
- sym->alias = sym;

sym->idx = i;

@@ -603,24 +604,21 @@ static void elf_dirty_reloc_sym(struct e
}

/*
- * Move the first global symbol, as per sh_info, into a new, higher symbol
- * index. This fees up the shndx for a new local symbol.
+ * The libelf API is terrible; gelf_update_sym*() takes a data block relative
+ * index value, *NOT* the symbol index. As such, iterate the data blocks and
+ * adjust index until it fits.
+ *
+ * If no data block is found, allow adding a new data block provided the index
+ * is only one past the end.
*/
-static int elf_move_global_symbol(struct elf *elf, struct section *symtab,
- struct section *symtab_shndx)
+static int elf_update_symbol(struct elf *elf, struct section *symtab,
+ struct section *symtab_shndx, struct symbol *sym)
{
- Elf_Data *data, *shndx_data = NULL;
- Elf32_Word first_non_local;
- struct symbol *sym;
- Elf_Scn *s;
-
- first_non_local = symtab->sh.sh_info;
-
- sym = find_symbol_by_index(elf, first_non_local);
- if (!sym) {
- WARN("no non-local symbols !?");
- return first_non_local;
- }
+ Elf32_Word shndx = sym->sec ? sym->sec->idx : SHN_UNDEF;
+ Elf_Data *symtab_data = NULL, *shndx_data = NULL;
+ Elf64_Xword entsize = symtab->sh.sh_entsize;
+ int max_idx, idx = sym->idx;
+ Elf_Scn *s, *t = NULL;

s = elf_getscn(elf->elf, symtab->idx);
if (!s) {
@@ -628,79 +626,124 @@ static int elf_move_global_symbol(struct
return -1;
}

- data = elf_newdata(s);
- if (!data) {
- WARN_ELF("elf_newdata");
- return -1;
+ if (symtab_shndx) {
+ t = elf_getscn(elf->elf, symtab_shndx->idx);
+ if (!t) {
+ WARN_ELF("elf_getscn");
+ return -1;
+ }
}

- data->d_buf = &sym->sym;
- data->d_size = sizeof(sym->sym);
- data->d_align = 1;
- data->d_type = ELF_T_SYM;
+ for (;;) {
+ /* get next data descriptor for the relevant sections */
+ symtab_data = elf_getdata(s, symtab_data);
+ if (t)
+ shndx_data = elf_getdata(t, shndx_data);
+
+ /* end-of-list */
+ if (!symtab_data) {
+ void *buf;
+
+ if (idx) {
+ /* we don't do holes in symbol tables */
+ WARN("index out of range");
+ return -1;
+ }

- sym->idx = symtab->sh.sh_size / sizeof(sym->sym);
- elf_dirty_reloc_sym(elf, sym);
+ /* if @idx == 0, it's the next contiguous entry, create it */
+ symtab_data = elf_newdata(s);
+ if (t)
+ shndx_data = elf_newdata(t);
+
+ buf = calloc(1, entsize);
+ if (!buf) {
+ WARN("malloc");
+ return -1;
+ }

- symtab->sh.sh_info += 1;
- symtab->sh.sh_size += data->d_size;
- symtab->changed = true;
+ symtab_data->d_buf = buf;
+ symtab_data->d_size = entsize;
+ symtab_data->d_align = 1;
+ symtab_data->d_type = ELF_T_SYM;
+
+ symtab->sh.sh_size += entsize;
+ symtab->changed = true;
+
+ if (t) {
+ shndx_data->d_buf = &sym->sec->idx;
+ shndx_data->d_size = sizeof(Elf32_Word);
+ shndx_data->d_align = sizeof(Elf32_Word);
+ shndx_data->d_type = ELF_T_WORD;

- if (symtab_shndx) {
- s = elf_getscn(elf->elf, symtab_shndx->idx);
- if (!s) {
- WARN_ELF("elf_getscn");
+ symtab_shndx->sh.sh_size += sizeof(Elf32_Word);
+ symtab_shndx->changed = true;
+ }
+
+ break;
+ }
+
+ /* empty blocks should not happen */
+ if (!symtab_data->d_size) {
+ WARN("zero size data");
return -1;
}

- shndx_data = elf_newdata(s);
+ /* is this the right block? */
+ max_idx = symtab_data->d_size / entsize;
+ if (idx < max_idx)
+ break;
+
+ /* adjust index and try again */
+ idx -= max_idx;
+ }
+
+ /* something went side-ways */
+ if (idx < 0) {
+ WARN("negative index");
+ return -1;
+ }
+
+ /* setup extended section index magic and write the symbol */
+ if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
+ sym->sym.st_shndx = shndx;
+ if (!shndx_data)
+ shndx = 0;
+ } else {
+ sym->sym.st_shndx = SHN_XINDEX;
if (!shndx_data) {
- WARN_ELF("elf_newshndx_data");
+ WARN("no .symtab_shndx");
return -1;
}
+ }

- shndx_data->d_buf = &sym->sec->idx;
- shndx_data->d_size = sizeof(Elf32_Word);
- shndx_data->d_align = 4;
- shndx_data->d_type = ELF_T_WORD;
-
- symtab_shndx->sh.sh_size += 4;
- symtab_shndx->changed = true;
+ if (!gelf_update_symshndx(symtab_data, shndx_data, idx, &sym->sym, shndx)) {
+ WARN_ELF("gelf_update_symshndx");
+ return -1;
}

- return first_non_local;
+ return 0;
}

static struct symbol *
elf_create_section_symbol(struct elf *elf, struct section *sec)
{
struct section *symtab, *symtab_shndx;
- Elf_Data *shndx_data = NULL;
- struct symbol *sym;
- Elf32_Word shndx;
+ Elf32_Word first_non_local, new_idx;
+ struct symbol *sym, *old;

symtab = find_section_by_name(elf, ".symtab");
if (symtab) {
symtab_shndx = find_section_by_name(elf, ".symtab_shndx");
- if (symtab_shndx)
- shndx_data = symtab_shndx->data;
} else {
WARN("no .symtab");
return NULL;
}

- sym = malloc(sizeof(*sym));
+ sym = calloc(1, sizeof(*sym));
if (!sym) {
perror("malloc");
return NULL;
}
- memset(sym, 0, sizeof(*sym));
-
- sym->idx = elf_move_global_symbol(elf, symtab, symtab_shndx);
- if (sym->idx < 0) {
- WARN("elf_move_global_symbol");
- return NULL;
- }

sym->name = sec->name;
sym->sec = sec;
@@ -710,24 +753,41 @@ elf_create_section_symbol(struct elf *el
// st_other 0
// st_value 0
// st_size 0
- shndx = sec->idx;
- if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
- sym->sym.st_shndx = shndx;
- if (!shndx_data)
- shndx = 0;
- } else {
- sym->sym.st_shndx = SHN_XINDEX;
- if (!shndx_data) {
- WARN("no .symtab_shndx");
+
+ /*
+ * Move the first global symbol, as per sh_info, into a new, higher
+ * symbol index. This fees up a spot for a new local symbol.
+ */
+ first_non_local = symtab->sh.sh_info;
+ new_idx = symtab->sh.sh_size / symtab->sh.sh_entsize;
+ old = find_symbol_by_index(elf, first_non_local);
+ if (old) {
+ old->idx = new_idx;
+
+ hlist_del(&old->hash);
+ elf_hash_add(symbol, &old->hash, old->idx);
+
+ elf_dirty_reloc_sym(elf, old);
+
+ if (elf_update_symbol(elf, symtab, symtab_shndx, old)) {
+ WARN("elf_update_symbol move");
return NULL;
}
+
+ new_idx = first_non_local;
}

- if (!gelf_update_symshndx(symtab->data, shndx_data, sym->idx, &sym->sym, shndx)) {
- WARN_ELF("gelf_update_symshndx");
+ sym->idx = new_idx;
+ if (elf_update_symbol(elf, symtab, symtab_shndx, sym)) {
+ WARN("elf_update_symbol");
return NULL;
}

+ /*
+ * Either way, we added a LOCAL symbol.
+ */
+ symtab->sh.sh_info += 1;
+
elf_add_symbol(elf, sym);

return sym;


2022-05-19 20:24:52

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2] objtool: Fix symbol creation

On Thu, May 19, 2022 at 11:00:29AM +0200, Peter Zijlstra wrote:
> Subject: objtool: Fix symbol creation
> From: Peter Zijlstra <[email protected]>
> Date: Tue, 17 May 2022 17:42:04 +0200
>
> Nathan reported objtool failing with the following messages:
>
> warning: objtool: no non-local symbols !?
> warning: objtool: gelf_update_symshndx: invalid section index
>
> The problem is due to commit 4abff6d48dbc ("objtool: Fix code relocs
> vs weak symbols") failing to consider the case where an object would
> have no non-local symbols.
>
> The problem that commit tries to address is adding a STB_LOCAL symbol
> to the symbol table in light of the ELF spec's requirement that:
>
> In each symbol table, all symbols with STB_LOCAL binding preced the
> weak and global symbols. As ``Sections'' above describes, a symbol
> table section's sh_info section header member holds the symbol table
> index for the first non-local symbol.
>
> The approach taken is to find this first non-local symbol, move that
> to the end and then re-use the freed spot to insert a new local symbol
> and increment sh_info.
>
> Except it never considered the case of object files without global
> symbols and got a whole bunch of details wrong -- so many in fact that
> it is a wonder it ever worked :/
>
> Specifically:
>
> - It failed to re-hash the symbol on the new index, so a subsequent
> find_symbol_by_index() would not find it at the new location and a
> query for the old location would now return a non-deterministic
> choice between the old and new symbol.
>
> - It failed to appreciate that the GElf wrappers are not a valid disk
> format (it works because GElf is basically Elf64 and we only
> support x86_64 atm.)
>
> - It failed to fully appreciate how horrible the libelf API really is
> and got the gelf_update_symshndx() call pretty much completely
> wrong; with the direct consequence that if inserting a second
> STB_LOCAL symbol would require moving the same STB_GLOBAL symbol
> again it would completely come unstuck.
>
> Write a new elf_update_symbol() function that wraps all the magic
> required to update or create a new symbol at a given index.
>
> Specifically, gelf_update_sym*() require an @ndx argument that is
> relative to the @data argument; this means you have to manually
> iterate the section data descriptor list and update @ndx.
>
> Fixes: 4abff6d48dbc ("objtool: Fix code relocs vs weak symbols")
> Reported-by: Nathan Chancellor <[email protected]>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> Tested-by: Nathan Chancellor <[email protected]>

Acked-by: Josh Poimboeuf <[email protected]>

--
Josh

Subject: [tip: objtool/core] objtool: Fix symbol creation

The following commit has been merged into the objtool/core branch of tip:

Commit-ID: ead165fa1042247b033afad7be4be9b815d04ade
Gitweb: https://git.kernel.org/tip/ead165fa1042247b033afad7be4be9b815d04ade
Author: Peter Zijlstra <[email protected]>
AuthorDate: Tue, 17 May 2022 17:42:04 +02:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Fri, 20 May 2022 12:42:06 +02:00

objtool: Fix symbol creation

Nathan reported objtool failing with the following messages:

warning: objtool: no non-local symbols !?
warning: objtool: gelf_update_symshndx: invalid section index

The problem is due to commit 4abff6d48dbc ("objtool: Fix code relocs
vs weak symbols") failing to consider the case where an object would
have no non-local symbols.

The problem that commit tries to address is adding a STB_LOCAL symbol
to the symbol table in light of the ELF spec's requirement that:

In each symbol table, all symbols with STB_LOCAL binding preced the
weak and global symbols. As ``Sections'' above describes, a symbol
table section's sh_info section header member holds the symbol table
index for the first non-local symbol.

The approach taken is to find this first non-local symbol, move that
to the end and then re-use the freed spot to insert a new local symbol
and increment sh_info.

Except it never considered the case of object files without global
symbols and got a whole bunch of details wrong -- so many in fact that
it is a wonder it ever worked :/

Specifically:

- It failed to re-hash the symbol on the new index, so a subsequent
find_symbol_by_index() would not find it at the new location and a
query for the old location would now return a non-deterministic
choice between the old and new symbol.

- It failed to appreciate that the GElf wrappers are not a valid disk
format (it works because GElf is basically Elf64 and we only
support x86_64 atm.)

- It failed to fully appreciate how horrible the libelf API really is
and got the gelf_update_symshndx() call pretty much completely
wrong; with the direct consequence that if inserting a second
STB_LOCAL symbol would require moving the same STB_GLOBAL symbol
again it would completely come unstuck.

Write a new elf_update_symbol() function that wraps all the magic
required to update or create a new symbol at a given index.

Specifically, gelf_update_sym*() require an @ndx argument that is
relative to the @data argument; this means you have to manually
iterate the section data descriptor list and update @ndx.

Fixes: 4abff6d48dbc ("objtool: Fix code relocs vs weak symbols")
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
tools/objtool/elf.c | 198 ++++++++++++++++++++++++++++---------------
1 file changed, 129 insertions(+), 69 deletions(-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 583a3ec..7fb6720 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -374,6 +374,9 @@ static void elf_add_symbol(struct elf *elf, struct symbol *sym)
struct list_head *entry;
struct rb_node *pnode;

+ INIT_LIST_HEAD(&sym->pv_target);
+ sym->alias = sym;
+
sym->type = GELF_ST_TYPE(sym->sym.st_info);
sym->bind = GELF_ST_BIND(sym->sym.st_info);

@@ -438,8 +441,6 @@ static int read_symbols(struct elf *elf)
return -1;
}
memset(sym, 0, sizeof(*sym));
- INIT_LIST_HEAD(&sym->pv_target);
- sym->alias = sym;

sym->idx = i;

@@ -603,24 +604,21 @@ static void elf_dirty_reloc_sym(struct elf *elf, struct symbol *sym)
}

/*
- * Move the first global symbol, as per sh_info, into a new, higher symbol
- * index. This fees up the shndx for a new local symbol.
+ * The libelf API is terrible; gelf_update_sym*() takes a data block relative
+ * index value, *NOT* the symbol index. As such, iterate the data blocks and
+ * adjust index until it fits.
+ *
+ * If no data block is found, allow adding a new data block provided the index
+ * is only one past the end.
*/
-static int elf_move_global_symbol(struct elf *elf, struct section *symtab,
- struct section *symtab_shndx)
+static int elf_update_symbol(struct elf *elf, struct section *symtab,
+ struct section *symtab_shndx, struct symbol *sym)
{
- Elf_Data *data, *shndx_data = NULL;
- Elf32_Word first_non_local;
- struct symbol *sym;
- Elf_Scn *s;
-
- first_non_local = symtab->sh.sh_info;
-
- sym = find_symbol_by_index(elf, first_non_local);
- if (!sym) {
- WARN("no non-local symbols !?");
- return first_non_local;
- }
+ Elf32_Word shndx = sym->sec ? sym->sec->idx : SHN_UNDEF;
+ Elf_Data *symtab_data = NULL, *shndx_data = NULL;
+ Elf64_Xword entsize = symtab->sh.sh_entsize;
+ int max_idx, idx = sym->idx;
+ Elf_Scn *s, *t = NULL;

s = elf_getscn(elf->elf, symtab->idx);
if (!s) {
@@ -628,79 +626,124 @@ static int elf_move_global_symbol(struct elf *elf, struct section *symtab,
return -1;
}

- data = elf_newdata(s);
- if (!data) {
- WARN_ELF("elf_newdata");
- return -1;
+ if (symtab_shndx) {
+ t = elf_getscn(elf->elf, symtab_shndx->idx);
+ if (!t) {
+ WARN_ELF("elf_getscn");
+ return -1;
+ }
}

- data->d_buf = &sym->sym;
- data->d_size = sizeof(sym->sym);
- data->d_align = 1;
- data->d_type = ELF_T_SYM;
+ for (;;) {
+ /* get next data descriptor for the relevant sections */
+ symtab_data = elf_getdata(s, symtab_data);
+ if (t)
+ shndx_data = elf_getdata(t, shndx_data);

- sym->idx = symtab->sh.sh_size / sizeof(sym->sym);
- elf_dirty_reloc_sym(elf, sym);
+ /* end-of-list */
+ if (!symtab_data) {
+ void *buf;

- symtab->sh.sh_info += 1;
- symtab->sh.sh_size += data->d_size;
- symtab->changed = true;
+ if (idx) {
+ /* we don't do holes in symbol tables */
+ WARN("index out of range");
+ return -1;
+ }

- if (symtab_shndx) {
- s = elf_getscn(elf->elf, symtab_shndx->idx);
- if (!s) {
- WARN_ELF("elf_getscn");
+ /* if @idx == 0, it's the next contiguous entry, create it */
+ symtab_data = elf_newdata(s);
+ if (t)
+ shndx_data = elf_newdata(t);
+
+ buf = calloc(1, entsize);
+ if (!buf) {
+ WARN("malloc");
+ return -1;
+ }
+
+ symtab_data->d_buf = buf;
+ symtab_data->d_size = entsize;
+ symtab_data->d_align = 1;
+ symtab_data->d_type = ELF_T_SYM;
+
+ symtab->sh.sh_size += entsize;
+ symtab->changed = true;
+
+ if (t) {
+ shndx_data->d_buf = &sym->sec->idx;
+ shndx_data->d_size = sizeof(Elf32_Word);
+ shndx_data->d_align = sizeof(Elf32_Word);
+ shndx_data->d_type = ELF_T_WORD;
+
+ symtab_shndx->sh.sh_size += sizeof(Elf32_Word);
+ symtab_shndx->changed = true;
+ }
+
+ break;
+ }
+
+ /* empty blocks should not happen */
+ if (!symtab_data->d_size) {
+ WARN("zero size data");
return -1;
}

- shndx_data = elf_newdata(s);
+ /* is this the right block? */
+ max_idx = symtab_data->d_size / entsize;
+ if (idx < max_idx)
+ break;
+
+ /* adjust index and try again */
+ idx -= max_idx;
+ }
+
+ /* something went side-ways */
+ if (idx < 0) {
+ WARN("negative index");
+ return -1;
+ }
+
+ /* setup extended section index magic and write the symbol */
+ if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
+ sym->sym.st_shndx = shndx;
+ if (!shndx_data)
+ shndx = 0;
+ } else {
+ sym->sym.st_shndx = SHN_XINDEX;
if (!shndx_data) {
- WARN_ELF("elf_newshndx_data");
+ WARN("no .symtab_shndx");
return -1;
}
+ }

- shndx_data->d_buf = &sym->sec->idx;
- shndx_data->d_size = sizeof(Elf32_Word);
- shndx_data->d_align = 4;
- shndx_data->d_type = ELF_T_WORD;
-
- symtab_shndx->sh.sh_size += 4;
- symtab_shndx->changed = true;
+ if (!gelf_update_symshndx(symtab_data, shndx_data, idx, &sym->sym, shndx)) {
+ WARN_ELF("gelf_update_symshndx");
+ return -1;
}

- return first_non_local;
+ return 0;
}

static struct symbol *
elf_create_section_symbol(struct elf *elf, struct section *sec)
{
struct section *symtab, *symtab_shndx;
- Elf_Data *shndx_data = NULL;
- struct symbol *sym;
- Elf32_Word shndx;
+ Elf32_Word first_non_local, new_idx;
+ struct symbol *sym, *old;

symtab = find_section_by_name(elf, ".symtab");
if (symtab) {
symtab_shndx = find_section_by_name(elf, ".symtab_shndx");
- if (symtab_shndx)
- shndx_data = symtab_shndx->data;
} else {
WARN("no .symtab");
return NULL;
}

- sym = malloc(sizeof(*sym));
+ sym = calloc(1, sizeof(*sym));
if (!sym) {
perror("malloc");
return NULL;
}
- memset(sym, 0, sizeof(*sym));
-
- sym->idx = elf_move_global_symbol(elf, symtab, symtab_shndx);
- if (sym->idx < 0) {
- WARN("elf_move_global_symbol");
- return NULL;
- }

sym->name = sec->name;
sym->sec = sec;
@@ -710,24 +753,41 @@ elf_create_section_symbol(struct elf *elf, struct section *sec)
// st_other 0
// st_value 0
// st_size 0
- shndx = sec->idx;
- if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
- sym->sym.st_shndx = shndx;
- if (!shndx_data)
- shndx = 0;
- } else {
- sym->sym.st_shndx = SHN_XINDEX;
- if (!shndx_data) {
- WARN("no .symtab_shndx");
+
+ /*
+ * Move the first global symbol, as per sh_info, into a new, higher
+ * symbol index. This fees up a spot for a new local symbol.
+ */
+ first_non_local = symtab->sh.sh_info;
+ new_idx = symtab->sh.sh_size / symtab->sh.sh_entsize;
+ old = find_symbol_by_index(elf, first_non_local);
+ if (old) {
+ old->idx = new_idx;
+
+ hlist_del(&old->hash);
+ elf_hash_add(symbol, &old->hash, old->idx);
+
+ elf_dirty_reloc_sym(elf, old);
+
+ if (elf_update_symbol(elf, symtab, symtab_shndx, old)) {
+ WARN("elf_update_symbol move");
return NULL;
}
+
+ new_idx = first_non_local;
}

- if (!gelf_update_symshndx(symtab->data, shndx_data, sym->idx, &sym->sym, shndx)) {
- WARN_ELF("gelf_update_symshndx");
+ sym->idx = new_idx;
+ if (elf_update_symbol(elf, symtab, symtab_shndx, sym)) {
+ WARN("elf_update_symbol");
return NULL;
}

+ /*
+ * Either way, we added a LOCAL symbol.
+ */
+ symtab->sh.sh_info += 1;
+
elf_add_symbol(elf, sym);

return sym;

Subject: [tip: objtool/urgent] objtool: Fix symbol creation

The following commit has been merged into the objtool/urgent branch of tip:

Commit-ID: d3f526155da9529d61663fcff19ca0e707dc2077
Gitweb: https://git.kernel.org/tip/d3f526155da9529d61663fcff19ca0e707dc2077
Author: Peter Zijlstra <[email protected]>
AuthorDate: Tue, 17 May 2022 17:42:04 +02:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Thu, 19 May 2022 23:46:12 +02:00

objtool: Fix symbol creation

Nathan reported objtool failing with the following messages:

warning: objtool: no non-local symbols !?
warning: objtool: gelf_update_symshndx: invalid section index

The problem is due to commit 4abff6d48dbc ("objtool: Fix code relocs
vs weak symbols") failing to consider the case where an object would
have no non-local symbols.

The problem that commit tries to address is adding a STB_LOCAL symbol
to the symbol table in light of the ELF spec's requirement that:

In each symbol table, all symbols with STB_LOCAL binding preced the
weak and global symbols. As ``Sections'' above describes, a symbol
table section's sh_info section header member holds the symbol table
index for the first non-local symbol.

The approach taken is to find this first non-local symbol, move that
to the end and then re-use the freed spot to insert a new local symbol
and increment sh_info.

Except it never considered the case of object files without global
symbols and got a whole bunch of details wrong -- so many in fact that
it is a wonder it ever worked :/

Specifically:

- It failed to re-hash the symbol on the new index, so a subsequent
find_symbol_by_index() would not find it at the new location and a
query for the old location would now return a non-deterministic
choice between the old and new symbol.

- It failed to appreciate that the GElf wrappers are not a valid disk
format (it works because GElf is basically Elf64 and we only
support x86_64 atm.)

- It failed to fully appreciate how horrible the libelf API really is
and got the gelf_update_symshndx() call pretty much completely
wrong; with the direct consequence that if inserting a second
STB_LOCAL symbol would require moving the same STB_GLOBAL symbol
again it would completely come unstuck.

Write a new elf_update_symbol() function that wraps all the magic
required to update or create a new symbol at a given index.

Specifically, gelf_update_sym*() require an @ndx argument that is
relative to the @data argument; this means you have to manually
iterate the section data descriptor list and update @ndx.

Fixes: 4abff6d48dbc ("objtool: Fix code relocs vs weak symbols")
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
tools/objtool/elf.c | 198 ++++++++++++++++++++++++++++---------------
1 file changed, 129 insertions(+), 69 deletions(-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index ebf2ba5..22c2a07 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -374,6 +374,9 @@ static void elf_add_symbol(struct elf *elf, struct symbol *sym)
struct list_head *entry;
struct rb_node *pnode;

+ INIT_LIST_HEAD(&sym->pv_target);
+ sym->alias = sym;
+
sym->type = GELF_ST_TYPE(sym->sym.st_info);
sym->bind = GELF_ST_BIND(sym->sym.st_info);

@@ -435,8 +438,6 @@ static int read_symbols(struct elf *elf)
return -1;
}
memset(sym, 0, sizeof(*sym));
- INIT_LIST_HEAD(&sym->pv_target);
- sym->alias = sym;

sym->idx = i;

@@ -600,24 +601,21 @@ static void elf_dirty_reloc_sym(struct elf *elf, struct symbol *sym)
}

/*
- * Move the first global symbol, as per sh_info, into a new, higher symbol
- * index. This fees up the shndx for a new local symbol.
+ * The libelf API is terrible; gelf_update_sym*() takes a data block relative
+ * index value, *NOT* the symbol index. As such, iterate the data blocks and
+ * adjust index until it fits.
+ *
+ * If no data block is found, allow adding a new data block provided the index
+ * is only one past the end.
*/
-static int elf_move_global_symbol(struct elf *elf, struct section *symtab,
- struct section *symtab_shndx)
+static int elf_update_symbol(struct elf *elf, struct section *symtab,
+ struct section *symtab_shndx, struct symbol *sym)
{
- Elf_Data *data, *shndx_data = NULL;
- Elf32_Word first_non_local;
- struct symbol *sym;
- Elf_Scn *s;
-
- first_non_local = symtab->sh.sh_info;
-
- sym = find_symbol_by_index(elf, first_non_local);
- if (!sym) {
- WARN("no non-local symbols !?");
- return first_non_local;
- }
+ Elf32_Word shndx = sym->sec ? sym->sec->idx : SHN_UNDEF;
+ Elf_Data *symtab_data = NULL, *shndx_data = NULL;
+ Elf64_Xword entsize = symtab->sh.sh_entsize;
+ int max_idx, idx = sym->idx;
+ Elf_Scn *s, *t = NULL;

s = elf_getscn(elf->elf, symtab->idx);
if (!s) {
@@ -625,79 +623,124 @@ static int elf_move_global_symbol(struct elf *elf, struct section *symtab,
return -1;
}

- data = elf_newdata(s);
- if (!data) {
- WARN_ELF("elf_newdata");
- return -1;
+ if (symtab_shndx) {
+ t = elf_getscn(elf->elf, symtab_shndx->idx);
+ if (!t) {
+ WARN_ELF("elf_getscn");
+ return -1;
+ }
}

- data->d_buf = &sym->sym;
- data->d_size = sizeof(sym->sym);
- data->d_align = 1;
- data->d_type = ELF_T_SYM;
+ for (;;) {
+ /* get next data descriptor for the relevant sections */
+ symtab_data = elf_getdata(s, symtab_data);
+ if (t)
+ shndx_data = elf_getdata(t, shndx_data);

- sym->idx = symtab->sh.sh_size / sizeof(sym->sym);
- elf_dirty_reloc_sym(elf, sym);
+ /* end-of-list */
+ if (!symtab_data) {
+ void *buf;

- symtab->sh.sh_info += 1;
- symtab->sh.sh_size += data->d_size;
- symtab->changed = true;
+ if (idx) {
+ /* we don't do holes in symbol tables */
+ WARN("index out of range");
+ return -1;
+ }

- if (symtab_shndx) {
- s = elf_getscn(elf->elf, symtab_shndx->idx);
- if (!s) {
- WARN_ELF("elf_getscn");
+ /* if @idx == 0, it's the next contiguous entry, create it */
+ symtab_data = elf_newdata(s);
+ if (t)
+ shndx_data = elf_newdata(t);
+
+ buf = calloc(1, entsize);
+ if (!buf) {
+ WARN("malloc");
+ return -1;
+ }
+
+ symtab_data->d_buf = buf;
+ symtab_data->d_size = entsize;
+ symtab_data->d_align = 1;
+ symtab_data->d_type = ELF_T_SYM;
+
+ symtab->sh.sh_size += entsize;
+ symtab->changed = true;
+
+ if (t) {
+ shndx_data->d_buf = &sym->sec->idx;
+ shndx_data->d_size = sizeof(Elf32_Word);
+ shndx_data->d_align = sizeof(Elf32_Word);
+ shndx_data->d_type = ELF_T_WORD;
+
+ symtab_shndx->sh.sh_size += sizeof(Elf32_Word);
+ symtab_shndx->changed = true;
+ }
+
+ break;
+ }
+
+ /* empty blocks should not happen */
+ if (!symtab_data->d_size) {
+ WARN("zero size data");
return -1;
}

- shndx_data = elf_newdata(s);
+ /* is this the right block? */
+ max_idx = symtab_data->d_size / entsize;
+ if (idx < max_idx)
+ break;
+
+ /* adjust index and try again */
+ idx -= max_idx;
+ }
+
+ /* something went side-ways */
+ if (idx < 0) {
+ WARN("negative index");
+ return -1;
+ }
+
+ /* setup extended section index magic and write the symbol */
+ if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
+ sym->sym.st_shndx = shndx;
+ if (!shndx_data)
+ shndx = 0;
+ } else {
+ sym->sym.st_shndx = SHN_XINDEX;
if (!shndx_data) {
- WARN_ELF("elf_newshndx_data");
+ WARN("no .symtab_shndx");
return -1;
}
+ }

- shndx_data->d_buf = &sym->sec->idx;
- shndx_data->d_size = sizeof(Elf32_Word);
- shndx_data->d_align = 4;
- shndx_data->d_type = ELF_T_WORD;
-
- symtab_shndx->sh.sh_size += 4;
- symtab_shndx->changed = true;
+ if (!gelf_update_symshndx(symtab_data, shndx_data, idx, &sym->sym, shndx)) {
+ WARN_ELF("gelf_update_symshndx");
+ return -1;
}

- return first_non_local;
+ return 0;
}

static struct symbol *
elf_create_section_symbol(struct elf *elf, struct section *sec)
{
struct section *symtab, *symtab_shndx;
- Elf_Data *shndx_data = NULL;
- struct symbol *sym;
- Elf32_Word shndx;
+ Elf32_Word first_non_local, new_idx;
+ struct symbol *sym, *old;

symtab = find_section_by_name(elf, ".symtab");
if (symtab) {
symtab_shndx = find_section_by_name(elf, ".symtab_shndx");
- if (symtab_shndx)
- shndx_data = symtab_shndx->data;
} else {
WARN("no .symtab");
return NULL;
}

- sym = malloc(sizeof(*sym));
+ sym = calloc(1, sizeof(*sym));
if (!sym) {
perror("malloc");
return NULL;
}
- memset(sym, 0, sizeof(*sym));
-
- sym->idx = elf_move_global_symbol(elf, symtab, symtab_shndx);
- if (sym->idx < 0) {
- WARN("elf_move_global_symbol");
- return NULL;
- }

sym->name = sec->name;
sym->sec = sec;
@@ -707,24 +750,41 @@ elf_create_section_symbol(struct elf *elf, struct section *sec)
// st_other 0
// st_value 0
// st_size 0
- shndx = sec->idx;
- if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
- sym->sym.st_shndx = shndx;
- if (!shndx_data)
- shndx = 0;
- } else {
- sym->sym.st_shndx = SHN_XINDEX;
- if (!shndx_data) {
- WARN("no .symtab_shndx");
+
+ /*
+ * Move the first global symbol, as per sh_info, into a new, higher
+ * symbol index. This fees up a spot for a new local symbol.
+ */
+ first_non_local = symtab->sh.sh_info;
+ new_idx = symtab->sh.sh_size / symtab->sh.sh_entsize;
+ old = find_symbol_by_index(elf, first_non_local);
+ if (old) {
+ old->idx = new_idx;
+
+ hlist_del(&old->hash);
+ elf_hash_add(symbol, &old->hash, old->idx);
+
+ elf_dirty_reloc_sym(elf, old);
+
+ if (elf_update_symbol(elf, symtab, symtab_shndx, old)) {
+ WARN("elf_update_symbol move");
return NULL;
}
+
+ new_idx = first_non_local;
}

- if (!gelf_update_symshndx(symtab->data, shndx_data, sym->idx, &sym->sym, shndx)) {
- WARN_ELF("gelf_update_symshndx");
+ sym->idx = new_idx;
+ if (elf_update_symbol(elf, symtab, symtab_shndx, sym)) {
+ WARN("elf_update_symbol");
return NULL;
}

+ /*
+ * Either way, we added a LOCAL symbol.
+ */
+ symtab->sh.sh_info += 1;
+
elf_add_symbol(elf, sym);

return sym;

2022-09-07 00:51:42

by Sami Tolvanen

[permalink] [raw]
Subject: Re: [PATCH] objtool: Fix symbol creation

On Wed, May 18, 2022 at 3:10 PM Peter Zijlstra <[email protected]> wrote:
>
> On Wed, May 18, 2022 at 10:36:04AM -0700, Josh Poimboeuf wrote:
> > On Wed, May 18, 2022 at 09:41:52AM +0200, Peter Zijlstra wrote:
> > > +static int elf_update_symbol(struct elf *elf, struct section *symtab,
> > > + struct section *symtab_shndx, struct symbol *sym)
> > > {
> > > - Elf_Data *data, *shndx_data = NULL;
> > > - Elf32_Word first_non_local;
> > > - struct symbol *sym;
> > > - Elf_Scn *s;
> > > -
> > > - first_non_local = symtab->sh.sh_info;
> > > -
> > > - sym = find_symbol_by_index(elf, first_non_local);
> > > - if (!sym) {
> > > - WARN("no non-local symbols !?");
> > > - return first_non_local;
> > > - }
> > > + Elf_Data *symtab_data = NULL, *shndx_data = NULL;
> > > + Elf64_Xword entsize = symtab->sh.sh_entsize;
> > > + Elf32_Word shndx = sym->sec->idx;
> >
> > So if it's a global UNDEF symbol then I think 'sym->sec' can be NULL and
> > this blows up?
>
> Oh indeed, sym->sec ? sym->sec->idx : SHN_UNDEF it is.

elf_update_symbol seems to be a bit broken even after this. I noticed
it converts SHN_ABS symbols into SHN_UNDEF, which breaks some KCFI
builds. In fact, the function drops all the special st_shndx values
except SHN_XINDEX.

Specifically, read_symbols sets sym->sec to find_section_by_index(elf,
0) for all SHN_UNDEF and special st_shndx symbols, which means
sym->sec is non-NULL and sym->sec->idx is always 0 (= SHN_UNDEF) for
these symbols. As elf_update_symbol doesn't look at the actual
st_shndx value, it ends up marking the symbols undefined.

This quick hack fixes the issue for me, but I'm not sure if it's the
cleanest solution. Any thoughts?

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index c25e957c1e52..7e24b09b1163 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -619,6 +619,11 @@ static int elf_update_symbol(struct elf *elf,
struct section *symtab,
Elf64_Xword entsize = symtab->sh.sh_entsize;
int max_idx, idx = sym->idx;
Elf_Scn *s, *t = NULL;
+ bool is_special_shndx = sym->sym.st_shndx >= SHN_LORESERVE &&
+ sym->sym.st_shndx != SHN_XINDEX;
+
+ if (is_special_shndx)
+ shndx = sym->sym.st_shndx;

s = elf_getscn(elf->elf, symtab->idx);
if (!s) {
@@ -704,7 +709,7 @@ static int elf_update_symbol(struct elf *elf,
struct section *symtab,
}

/* setup extended section index magic and write the symbol */
- if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
+ if ((shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) || is_special_shndx) {
sym->sym.st_shndx = shndx;
if (!shndx_data)
shndx = 0;

Sami