Top of tree LLVM has optimizations related to
-fno-semantic-interposition to avoid emitting PLT relocations for
references to symbols located in the same translation unit, where it
will emit "local symbol" references.

Clang builds fall back on GNU as for assembling, currently. It appears a
bug in GNU as introduced around 2.31 is keeping around local labels in
the symbol table, despite the documentation saying:

"Local symbols are defined and used within the assembler, but they are
normally not saved in object files."

When objtool searches for a symbol at a given offset, it's finding the
incorrectly kept .L<symbol>$local symbol that should have been discarded
by the assembler.

A patch for GNU as has been authored. For now, objtool should not treat
local symbols as the expected symbol for a given offset when iterating
the symbol table.

commit 644592d32837 ("objtool: Fail the kernel build on fatal errors")
exposed this issue.
Link: https://github.com/ClangBuiltLinux/linux/issues/872
Link: https://sourceware.org/binutils/docs/as/Symbol-Names.html#Symbol-Names
Link: https://sourceware.org/ml/binutils/2020-02/msg00243.html
Link: https://travis-ci.com/ClangBuiltLinux/continuous-integration/jobs/286292010
Debugged-by: Nathan Chancellor <[email protected]>
Debugged-by: Fangrui Song <[email protected]>
Suggested-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
---
Build tested allyesconfig with ToT Clang and GCC 9.2.1.
Boot tested defconfig with ToT Clang and GCC 9.2.1.
tools/objtool/elf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index edba4745f25a..9c1e3cc928b0 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -63,7 +63,8 @@ struct symbol *find_symbol_by_offset(struct section *sec, unsigned long offset)

 	list_for_each_entry(sym, &sec->symbol_list, list)
 		if (sym->type != STT_SECTION &&
-		    sym->offset == offset)
+		    sym->offset == offset &&
+		    strstr(sym->name, ".L") != sym->name)
 			return sym;

 	return NULL;
--
2.25.0.225.g125e21ebc7-goog
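
For readers without the objtool source at hand, the effect of the change can be sketched in isolation. This is a minimal sketch, not objtool code: struct sk_symbol, sk_is_dot_l() and sk_find_symbol_by_offset() are hypothetical names, the symbol list is assumed to be a plain array, and the prefix test is written with strncmp rather than the strstr form in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, pared-down stand-in for objtool's struct symbol. */
struct sk_symbol {
	const char *name;
	unsigned char type;	/* ELF symbol type (STT_*) */
	unsigned long offset;	/* offset within the section */
};

#define SK_STT_SECTION 3	/* same value as STT_SECTION in the ELF spec */

/* Non-zero when the name carries the assembler-local ".L" prefix. */
static int sk_is_dot_l(const char *name)
{
	return strncmp(name, ".L", 2) == 0;
}

/* Return the first symbol at offset that is neither a section symbol
 * nor a .L label, or NULL if there is none. */
static const struct sk_symbol *
sk_find_symbol_by_offset(const struct sk_symbol *syms, size_t n,
			 unsigned long offset)
{
	for (size_t i = 0; i < n; i++) {
		if (syms[i].type != SK_STT_SECTION &&
		    syms[i].offset == offset &&
		    !sk_is_dot_l(syms[i].name))
			return &syms[i];
	}
	return NULL;
}
```

With a table holding both .Lprintk$local and printk at the same offset, the lookup returns printk regardless of which entry comes first.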
On 2020-02-13, Nick Desaulniers wrote:
>Top of tree LLVM has optimizations related to
>-fno-semantic-interposition to avoid emitting PLT relocations for
>references to symbols located in the same translation unit, where it
>will emit "local symbol" references.
>
>Clang builds fall back on GNU as for assembling, currently. It appears a
>bug in GNU as introduced around 2.31 is keeping around local labels in
>the symbol table, despite the documentation saying:
>
>"Local symbols are defined and used within the assembler, but they are
>normally not saved in object files."
If you can reword the paragraph above to mention the fact below without
making it more verbose, please do so.

If the reference is within the same section that defines the .L symbol,
there is no outstanding relocation. If the reference is outside the
section, there will be an R_X86_64_PLT32 referencing the .L symbol.
>When objtool searches for a symbol at a given offset, it's finding the
>incorrectly kept .L<symbol>$local symbol that should have been discarded
>by the assembler.
>
>A patch for GNU as has been authored. For now, objtool should not treat
>local symbols as the expected symbol for a given offset when iterating
>the symbol table.
Agreed. binutils 2.31 through 2.34 are affected, so objtool has to work
around the existing releases.
>commit 644592d32837 ("objtool: Fail the kernel build on fatal errors")
>exposed this issue.
>
>Link: https://github.com/ClangBuiltLinux/linux/issues/872
>Link: https://sourceware.org/binutils/docs/as/Symbol-Names.html#Symbol-Names
>Link: https://sourceware.org/ml/binutils/2020-02/msg00243.html
>Link: https://travis-ci.com/ClangBuiltLinux/continuous-integration/jobs/286292010
>Debugged-by: Nathan Chancellor <[email protected]>
>Debugged-by: Fangrui Song <[email protected]>
>Suggested-by: Josh Poimboeuf <[email protected]>
>Signed-off-by: Nick Desaulniers <[email protected]>
>---
>Build tested allyesconfig with ToT Clang and GCC 9.2.1.
>Boot tested defconfig with ToT Clang and GCC 9.2.1.
>
> tools/objtool/elf.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
>diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
>index edba4745f25a..9c1e3cc928b0 100644
>--- a/tools/objtool/elf.c
>+++ b/tools/objtool/elf.c
>@@ -63,7 +63,8 @@ struct symbol *find_symbol_by_offset(struct section *sec, unsigned long offset)
>
> list_for_each_entry(sym, &sec->symbol_list, list)
> if (sym->type != STT_SECTION &&
>- sym->offset == offset)
>+ sym->offset == offset &&
>+ strstr(sym->name, ".L") != sym->name)
strncmp(sym->name, ".L", 2) != 0
.L in the middle of a symbol name may be rare, though.
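
A small sketch comparing the two spellings of the "keep this symbol" test. The helper names keep_strstr() and keep_strncmp() are made up for illustration. Since strstr returns the first occurrence, both forms give the same answer for any name; the strncmp form just stops after two characters. Note the polarity: inside the if condition the symbol is kept when strncmp returns non-zero.

```c
#include <string.h>

/* Form used in the patch: keep the symbol unless the first ".L"
 * match is at the very start of the name. */
static int keep_strstr(const char *name)
{
	return strstr(name, ".L") != name;
}

/* Prefix-only form: a non-zero strncmp result means the name does
 * not start with ".L", so the symbol is kept. */
static int keep_strncmp(const char *name)
{
	return strncmp(name, ".L", 2) != 0;
}
```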
> return sym;
>
> return NULL;
>--
>2.25.0.225.g125e21ebc7-goog
>
On Thu, Feb 13, 2020 at 10:47:08AM -0800, Nick Desaulniers wrote:
> Top of tree LLVM has optimizations related to
> -fno-semantic-interposition to avoid emitting PLT relocations for
> references to symbols located in the same translation unit, where it
> will emit "local symbol" references.
>
> Clang builds fall back on GNU as for assembling, currently. It appears a
> bug in GNU as introduced around 2.31 is keeping around local labels in
> the symbol table, despite the documentation saying:
>
> "Local symbols are defined and used within the assembler, but they are
> normally not saved in object files."
>
> When objtool searches for a symbol at a given offset, it's finding the
> incorrectly kept .L<symbol>$local symbol that should have been discarded
> by the assembler.
>
> A patch for GNU as has been authored. For now, objtool should not treat
> local symbols as the expected symbol for a given offset when iterating
> the symbol table.
>
> commit 644592d32837 ("objtool: Fail the kernel build on fatal errors")
> exposed this issue.
Since I'm going to be dropping 644592d32837 ("objtool: Fail the kernel
build on fatal errors") anyway, I wonder if this patch is still needed?
At least the error will be downgraded to a warning. And while the
warning could be more user friendly, it still has value because it
reveals a toolchain bug.
--
Josh
On 2020-02-13, Josh Poimboeuf wrote:
>On Thu, Feb 13, 2020 at 10:47:08AM -0800, Nick Desaulniers wrote:
>> Top of tree LLVM has optimizations related to
>> -fno-semantic-interposition to avoid emitting PLT relocations for
>> references to symbols located in the same translation unit, where it
>> will emit "local symbol" references.
>>
>> Clang builds fall back on GNU as for assembling, currently. It appears a
>> bug in GNU as introduced around 2.31 is keeping around local labels in
>> the symbol table, despite the documentation saying:
>>
>> "Local symbols are defined and used within the assembler, but they are
>> normally not saved in object files."
>>
>> When objtool searches for a symbol at a given offset, it's finding the
>> incorrectly kept .L<symbol>$local symbol that should have been discarded
>> by the assembler.
>>
>> A patch for GNU as has been authored. For now, objtool should not treat
>> local symbols as the expected symbol for a given offset when iterating
>> the symbol table.
The R_X86_64_PLT32 case was fixed (just now) by
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=292676c15a615b5a95bede9ee91004d3f7ee7dfd
The fix will be included in binutils 2.35 and probably in a bug-fix release of 2.34.x.
>> commit 644592d32837 ("objtool: Fail the kernel build on fatal errors")
>> exposed this issue.
>
>Since I'm going to be dropping 644592d32837 ("objtool: Fail the kernel
>build on fatal errors") anyway, I wonder if this patch is still needed?
>
>At least the error will be downgraded to a warning. And while the
>warning could be more user friendly, it still has value because it
>reveals a toolchain bug.
I still consider such a check (tools/objtool/check.c:679) unneeded.
st_type doesn't have to be STT_FUNC. Either STT_NOTYPE or STT_FUNC is
ok. If STT_GNU_IFUNC is used, it can be ok as well.
(My clang patch skips STT_GNU_IFUNC just because rtld typically doesn't
cache R_*_IRELATIVE results. Having two STT_GNU_IFUNC symbols with the
same st_shndx and st_value can create two R_*_IRELATIVE relocations,
which need to be resolved twice at runtime.)

	} else if (rela->sym->type == STT_SECTION) {
		insn->call_dest = find_symbol_by_offset(rela->sym->sec,
							rela->addend+4);
		if (!insn->call_dest ||
		    insn->call_dest->type != STT_FUNC) {
			WARN_FUNC("can't find call dest symbol at %s+0x%x",
				  insn->sec, insn->offset,
				  rela->sym->sec->name,
				  rela->addend + 4);
			return -1;
		}

	.section .init.text,"ax",@progbits
	call printk
	call .Lprintk$local
	.text
	.globl printk
	.type printk,@function
printk:
.Lprintk$local:
	ret

% llvm-mc -filetype=obj -triple=riscv64 a.s -mattr=+relax -o a.o
% readelf -Wr a.o

Relocation section '.rela.init.text' at offset 0xa0 contains 4 entries:
    Offset             Info         Type            Symbol's Value   Symbol's Name + Addend
0000000000000000  0000000200000012  R_RISCV_CALL    0000000000000000 printk + 0
0000000000000000  0000000000000033  R_RISCV_RELAX                    0
0000000000000008  0000000100000012  R_RISCV_CALL    0000000000000000 .Lprintk$local + 0
0000000000000008  0000000000000033  R_RISCV_RELAX                    0

On RISC-V, when relaxation is enabled, .L cannot be resolved at assembly
time because sections can shrink.
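
As an aside for reading the readelf output above: the Info column packs the symbol index into the high 32 bits and the relocation type into the low 32 bits (the ELF64 r_info layout). A minimal sketch of the decoding follows; the sk_ names are hypothetical, and the two type numbers are the RISC-V psABI values for R_RISCV_CALL and R_RISCV_RELAX.

```c
#include <stdint.h>

/* ELF64 r_info: symbol index in the high 32 bits, type in the low 32. */
static uint32_t sk_r_sym(uint64_t info)
{
	return (uint32_t)(info >> 32);
}

static uint32_t sk_r_type(uint64_t info)
{
	return (uint32_t)(info & 0xffffffffu);
}

/* Relocation type numbers from the RISC-V ELF psABI. */
enum { SK_R_RISCV_CALL = 18, SK_R_RISCV_RELAX = 51 };
```

So 0000000100000012 is an R_RISCV_CALL (0x12) against symbol index 1, which the table shows as .Lprintk$local, and 0000000000000033 is an R_RISCV_RELAX (0x33) with no symbol.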
https://sourceware.org/binutils/docs/as/Symbol-Names.html
> Local symbols are defined and used within the assembler, but they are *normally* not saved in object files.
I consider the GNU as issue a missed optimization, instead of a bug.
There is no rigid rule that .L symbols cannot be saved in object files.
On Thu, Feb 13, 2020 at 2:18 PM Josh Poimboeuf <[email protected]> wrote:
>
> On Thu, Feb 13, 2020 at 10:47:08AM -0800, Nick Desaulniers wrote:
> > Top of tree LLVM has optimizations related to
> > -fno-semantic-interposition to avoid emitting PLT relocations for
> > references to symbols located in the same translation unit, where it
> > will emit "local symbol" references.
> >
> > Clang builds fall back on GNU as for assembling, currently. It appears a
> > bug in GNU as introduced around 2.31 is keeping around local labels in
> > the symbol table, despite the documentation saying:
> >
> > "Local symbols are defined and used within the assembler, but they are
> > normally not saved in object files."
> >
> > When objtool searches for a symbol at a given offset, it's finding the
> > incorrectly kept .L<symbol>$local symbol that should have been discarded
> > by the assembler.
> >
> > A patch for GNU as has been authored. For now, objtool should not treat
> > local symbols as the expected symbol for a given offset when iterating
> > the symbol table.
> >
> > commit 644592d32837 ("objtool: Fail the kernel build on fatal errors")
> > exposed this issue.
>
> Since I'm going to be dropping 644592d32837 ("objtool: Fail the kernel
> build on fatal errors") anyway, I wonder if this patch is still needed?
>
> At least the error will be downgraded to a warning. And while the
> warning could be more user friendly, it still has value because it
> reveals a toolchain bug.
Sure thing. I appreciate it, and I'm on board with helping debug or
fix any compiler bugs we might have in order to re-strengthen the
warning soon.
--
Thanks,
~Nick Desaulniers
On Thu, Feb 13, 2020 at 02:37:34PM -0800, Fangrui Song wrote:
> I still consider such a check (tools/objtool/check.c:679) unneeded.
>
> st_type doesn't have to be STT_FUNC. Either STT_NOTYPE or STT_FUNC is
> ok. If STT_GNU_IFUNC is used, it can be ok as well.
> (My clang patch skips STT_GNU_IFUNC just because rtld typically doesn't
> cache R_*_IRELATIVE results. Having two STT_GNU_IFUNC symbols with same st_shndx and
> st_value can create two R_*_IRELATIVE, which need to be resolved twice
> at runtime.)
>
> } else if (rela->sym->type == STT_SECTION) {
> insn->call_dest = find_symbol_by_offset(rela->sym->sec,
> rela->addend+4);
> if (!insn->call_dest ||
> insn->call_dest->type != STT_FUNC) {
> WARN_FUNC("can't find call dest symbol at %s+0x%x",
> insn->sec, insn->offset,
> rela->sym->sec->name,
> rela->addend + 4);
> return -1;
> }
>
>
> .section .init.text,"ax",@progbits
> call printk
> call .Lprintk$local
> .text
> .globl printk
> .type printk,@function
> printk:
> .Lprintk$local:
> ret
Objtool isn't a general ELF validator, it's more of a kernel sanity
validator. In the kernel we currently have a constraint that you can
only call STT_FUNC. At the very least it helps keep our asm code clean.
If that constraint ever becomes a problem then we could always
reconsider it.
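
To make the two positions concrete, here is a rough sketch of the strict rule next to a looser one that would also accept STT_NOTYPE and STT_GNU_IFUNC. None of this is objtool code: the cd_ helpers are hypothetical, and only the numeric symbol type values come from the ELF specification and its GNU extension.

```c
/* Symbol type values from the ELF gABI (STT_GNU_IFUNC is a GNU extension). */
enum {
	CD_STT_NOTYPE    = 0,
	CD_STT_FUNC      = 2,
	CD_STT_GNU_IFUNC = 10,
};

/* Strict rule: a call destination must be an STT_FUNC symbol. */
static int cd_call_dest_ok_strict(unsigned char type)
{
	return type == CD_STT_FUNC;
}

/* Hypothetical looser rule: also accept untyped and ifunc symbols. */
static int cd_call_dest_ok_relaxed(unsigned char type)
{
	return type == CD_STT_FUNC ||
	       type == CD_STT_NOTYPE ||
	       type == CD_STT_GNU_IFUNC;
}
```

A .L label defined with a plain `label:` is STT_NOTYPE, so it passes only the relaxed variant.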
> % llvm-mc -filetype=obj -triple=riscv64 a.s -mattr=+relax -o a.o
> % readelf -Wr a.o
>
> Relocation section '.rela.init.text' at offset 0xa0 contains 4 entries:
> Offset Info Type Symbol's Value Symbol's Name + Addend
> 0000000000000000 0000000200000012 R_RISCV_CALL 0000000000000000 printk + 0
> 0000000000000000 0000000000000033 R_RISCV_RELAX 0
> 0000000000000008 0000000100000012 R_RISCV_CALL 0000000000000000 .Lprintk$local + 0
> 0000000000000008 0000000000000033 R_RISCV_RELAX 0
>
>
> On RISC-V, when relaxation is enabled, .L cannot be resolved at assembly
> time because sections can shrink.
>
> https://sourceware.org/binutils/docs/as/Symbol-Names.html
>
> > Local symbols are defined and used within the assembler, but they are *normally* not saved in object files.
>
> I consider the GNU as issue a missed optimization, instead of a bug.
> There is no rigid rule that .L symbols cannot be saved in object files.
I know nothing about RISC-V, but if I understand correctly,
.Lprintk$local is the function's local entry point, similar to ppc64
localentry. Would it not always be a constant offset from the printk
address, such that the relocation could be "printk + 8" or so?
Regardless, it doesn't really matter for now, since objtool is x86-only.
--
Josh
On Thu, Feb 13, 2020 at 11:20:55AM -0800, Fangrui Song wrote:
> On 2020-02-13, Nick Desaulniers wrote:
> >Top of tree LLVM has optimizations related to
> >-fno-semantic-interposition to avoid emitting PLT relocations for
> >references to symbols located in the same translation unit, where it
> >will emit "local symbol" references.
> >
> >Clang builds fall back on GNU as for assembling, currently. It appears a
> >bug in GNU as introduced around 2.31 is keeping around local labels in
> >the symbol table, despite the documentation saying:
> >
> >"Local symbols are defined and used within the assembler, but they are
> >normally not saved in object files."
>
> If you can reword the paragraph above mentioning the fact below without being
> more verbose, please do that.
>
> If the reference is within the same section which defines the .L symbol,
> there is no outstanding relocation. If the reference is outside the
> section, there will be an R_X86_64_PLT32 referencing .L
>
Can you describe what case the clang change is supposed to optimize?
AFAICT, it kicks in when the symbol is known by the compiler to be local
to the DSO and defined in the same translation unit.
But then there are two cases:
(a) we have call foo, where foo is defined in the same section as the
call instruction. In this case the assembler should be able to fully
resolve foo and not generate any relocation, regardless of whether foo
is global or local.
(b) we have call foo, where foo is defined in a different section from
the call instruction. In this case the assembler must generate a
relocation regardless of whether foo is global or local, and the linker
should eliminate it.
In what case does replacing call foo with call .Lfoo$local help?
I know little about objtool, but in case it may be used by other
architectures, I hope the following explanations are not too
off-topic :)
On 2020-02-14, Arvind Sankar wrote:
>On Thu, Feb 13, 2020 at 11:20:55AM -0800, Fangrui Song wrote:
>> On 2020-02-13, Nick Desaulniers wrote:
>> >Top of tree LLVM has optimizations related to
>> >-fno-semantic-interposition to avoid emitting PLT relocations for
>> >references to symbols located in the same translation unit, where it
>> >will emit "local symbol" references.
>> >
>> >Clang builds fall back on GNU as for assembling, currently. It appears a
>> >bug in GNU as introduced around 2.31 is keeping around local labels in
>> >the symbol table, despite the documentation saying:
>> >
>> >"Local symbols are defined and used within the assembler, but they are
>> >normally not saved in object files."
>>
>> If you can reword the paragraph above mentioning the fact below without being
>> more verbose, please do that.
>>
>> If the reference is within the same section which defines the .L symbol,
>> there is no outstanding relocation. If the reference is outside the
>> section, there will be an R_X86_64_PLT32 referencing .L
>>
>
>Can you describe what case the clang change is supposed to optimize?
>AFAICT, it kicks in when the symbol is known by the compiler to be local
>to the DSO and defined in the same translation unit.
>
>But then there are two cases:
>(a) we have call foo, where foo is defined in the same section as the
>call instruction. In this case the assembler should be able to fully
>resolve foo and not generate any relocation, regardless of whether foo
>is global or local.
If foo is STB_GLOBAL or STB_WEAK, the assembler cannot fully resolve a
reference to foo in the same section, unless the assembler can assume
(the codegen tells it) the call to foo cannot be interposed by another
foo definition at runtime.
>(b) we have call foo, where foo is defined in a different section from
>the call instruction. In this case the assembler must generate a
>relocation regardless of whether foo is global or local, and the linker
>should eliminate it.
>In what case does does replacing call foo with call .Lfoo$local help?
For -fPIC -fno-semantic-interposition, the assembly emitter can perform
the following optimization:

  void foo() {}
  void bar() { foo(); }

  .globl foo, bar
  foo:
  .Lfoo$local:
    ret
  bar:
    call foo  --> call .Lfoo$local
    ret

call foo generates an R_X86_64_PLT32. In a -shared link, it creates an
unneeded PLT entry for foo.
call .Lfoo$local generates an R_X86_64_PLT32. In a -shared link, .Lfoo$local is
non-preemptible => no PLT entry is created.
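
A source-level analogue of the same idea, for illustration only. It is not what clang emits: the sketch assumes an ELF target and the GNU alias attribute (GCC and Clang), and the sk_ names are made up. The exported name stays available to other modules, while the in-file caller binds to a name with internal linkage, so there is nothing for the dynamic linker to interpose.

```c
static int sk_calls;	/* counts entries into the function body */

/* Internal-linkage definition: calls to it can never be interposed. */
static void sk_foo_local(void)
{
	sk_calls++;
}

/* Exported name for the same code (GNU alias attribute, ELF targets only). */
void sk_foo(void) __attribute__((alias("sk_foo_local")));

/* Like `call .Lfoo$local`: bind to the local name, not the exported one. */
void sk_bar(void)
{
	sk_foo_local();
}
```

Unlike a .L label, the local name here is an ordinary named symbol, so it still takes up a symbol table entry.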
For -fno-PIC and -fPIE, the final link is expected to be -no-pie or
-pie. This optimization does not save anything there, because PLT entries
will not be generated. With clang's integrated assembler, it may increase
the number of STT_SECTION symbols (because the .Lfoo$local reference will
be turned into an STT_SECTION-relative relocation), but the size increase
is very small.
I want to teach clang -fPIC to use -fno-semantic-interposition by
default. (It is currently an LLVM optimization, not realized in clang.)
clang traditionally makes various -fno-semantic-interposition
assumptions and can perform interprocedural optimizations even if the
strict ELF rule disallows them.
On Fri, Feb 14, 2020 at 10:05:27AM -0800, Fangrui Song wrote:
> I know little about objtool, but if it may be used by other
> architectures, hope the following explanations don't appear to be too
> off-topic:)
>
> On 2020-02-14, Arvind Sankar wrote:
> >Can you describe what case the clang change is supposed to optimize?
> >AFAICT, it kicks in when the symbol is known by the compiler to be local
> >to the DSO and defined in the same translation unit.
> >
> >But then there are two cases:
> >(a) we have call foo, where foo is defined in the same section as the
> >call instruction. In this case the assembler should be able to fully
> >resolve foo and not generate any relocation, regardless of whether foo
> >is global or local.
>
> If foo is STB_GLOBAL or STB_WEAK, the assembler cannot fully resolve a
> reference to foo in the same section, unless the assembler can assume
> (the codegen tells it) the call to foo cannot be interposed by another
> foo definition at runtime.
I was testing with hidden/protected visibility; I see you want this for
the no-semantic-interposition case. Actually, a bit more testing shows
some peculiarities even with hidden visibility. With the code below, the
call and lea create relocations in the object file, but the jmp doesn't.
ld does avoid creating a PLT entry for this, though.

	.text
	.globl foo, bar
	.hidden foo
bar:
	call foo
	leaq foo(%rip), %rax
	jmp foo

foo:	ret
>
> >(b) we have call foo, where foo is defined in a different section from
> >the call instruction. In this case the assembler must generate a
> >relocation regardless of whether foo is global or local, and the linker
> >should eliminate it.
> >In what case does does replacing call foo with call .Lfoo$local help?
>
> For -fPIC -fno-semantic-interposition, the assembly emitter can perform
> the following optimization:
>
> void foo() {}
> void bar() { foo(); }
>
> .globl foo, bar
> foo:
> .Lfoo$local:
> ret
> bar:
> call foo --> call .Lfoo$local
> ret
>
> call foo generates an R_X86_64_PLT32. In a -shared link, it creates an
> unneeded PLT entry for foo.
>
> call .Lfoo$local generates an R_X86_64_PLT32. In a -shared link, .Lfoo$local is
> non-preemptible => no PLT entry is created.
>
> For -fno-PIC and -fPIE, the final link is expected to be -no-pie or
> -pie. This optimization does not save anything, because PLT entries will
> not be generated. With clang's integrated assembler, it may increase the
> number of STT_SECTION symbols (because .Lfoo$local will be turned to a
> STT_SECTION relative relocation), but the size increase is very small.
>
>
> I want to teach clang -fPIC to use -fno-semantic-interposition by
> default. (It is currently an LLVM optimization, not realized in clang.)
> clang traditionally makes various -fno-semantic-interposition
> assumptions and can perform interprocedural optimizations even if the
> strict ELF rule disallows them.
FWIW, gcc with -fno-semantic-interposition also uses local aliases, but
rather than using .L labels, it creates a local alias with
	.set foo.localalias, foo
This makes the type of foo.localalias the same as foo, which I gather
should placate objtool as it'll still see an STT_FUNC no matter whether
it picks up foo.localalias or foo.
On 2020-02-14, Arvind Sankar wrote:
>On Fri, Feb 14, 2020 at 10:05:27AM -0800, Fangrui Song wrote:
>> I know little about objtool, but if it may be used by other
>> architectures, hope the following explanations don't appear to be too
>> off-topic:)
>>
>> On 2020-02-14, Arvind Sankar wrote:
>> >Can you describe what case the clang change is supposed to optimize?
>> >AFAICT, it kicks in when the symbol is known by the compiler to be local
>> >to the DSO and defined in the same translation unit.
>> >
>> >But then there are two cases:
>> >(a) we have call foo, where foo is defined in the same section as the
>> >call instruction. In this case the assembler should be able to fully
>> >resolve foo and not generate any relocation, regardless of whether foo
>> >is global or local.
>>
>> If foo is STB_GLOBAL or STB_WEAK, the assembler cannot fully resolve a
>> reference to foo in the same section, unless the assembler can assume
>> (the codegen tells it) the call to foo cannot be interposed by another
>> foo definition at runtime.
>
>I was testing with hidden/protected visibility, I see you want this for
>the no-semantic-interposition case. Actually a bit more testing shows
>some peculiarities even with hidden visibility. With the below, the call
>and lea create relocations in the object file, but the jmp doesn't. ld
>does avoid creating a plt for this though.
>
> .text
> .globl foo, bar
> .hidden foo
> bar:
> call foo
> leaq foo(%rip), %rax
> jmp foo
>
> foo: ret
Yes, GNU as is inconsistent here. While fixing
https://sourceware.org/ml/binutils/2020-02/msg00243.html , I noticed
that the rule is quite complex. There are definitely lots of places to
improve. clang 10 emits relocations consistently.
call foo # R_X86_64_PLT32
leaq foo(%rip), %rax # R_X86_64_PC32
jmp foo # R_X86_64_PLT32
We can teach the assembler to not emit relocations referencing STV_HIDDEN or
STV_INTERNAL symbols, but I favor the simpler rule that only relocations
referencing STB_LOCAL non-STT_GNU_IFUNC symbols defined in the same section are resolved.
Leave the visibility jobs to the linker.
If we ever teach GNU objcopy or llvm-objcopy an option to set
visibility, resolving relocations in the assembler may disallow such use cases.
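
The simpler rule stated above can be written as a single predicate. This is only a sketch of the rule as worded in this mail, not of GNU as or LLVM MC: struct rs_ref and rs_resolves_at_assembly() are hypothetical, and the numeric constants are the ELF values for STB_LOCAL and STT_GNU_IFUNC.

```c
enum { RS_STB_LOCAL = 0, RS_STB_GLOBAL = 1, RS_STB_WEAK = 2 };
enum { RS_STT_NOTYPE = 0, RS_STT_FUNC = 2, RS_STT_GNU_IFUNC = 10 };

/* Hypothetical description of one symbol reference seen by the assembler. */
struct rs_ref {
	unsigned char binding;	/* STB_* of the referenced symbol */
	unsigned char type;	/* STT_* of the referenced symbol */
	int same_section;	/* non-zero if defined in the referencing section */
};

/* Non-zero if the reference may be resolved without emitting a relocation:
 * STB_LOCAL, not STT_GNU_IFUNC, and defined in the same section.
 * Visibility is deliberately not consulted; that is left to the linker. */
static int rs_resolves_at_assembly(const struct rs_ref *r)
{
	return r->binding == RS_STB_LOCAL &&
	       r->type != RS_STT_GNU_IFUNC &&
	       r->same_section;
}
```

Under this rule a hidden STB_GLOBAL foo in the same section still gets a relocation, which matches the consistent clang 10 output shown above.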
Unfortunately gcc>=5 x86 and GNU ld>=2.26 x86 are in a bad state
regarding STV_PROTECTED (https://reviews.llvm.org/D72197#1866384).
(Now that I have retested it, I think I may add a special -no-integrated-as
rule to clang just to work around GNU ld x86>=2.26.)
>> >(b) we have call foo, where foo is defined in a different section from
>> >the call instruction. In this case the assembler must generate a
>> >relocation regardless of whether foo is global or local, and the linker
>> >should eliminate it.
>> >In what case does does replacing call foo with call .Lfoo$local help?
>>
>> For -fPIC -fno-semantic-interposition, the assembly emitter can perform
>> the following optimization:
>>
>> void foo() {}
>> void bar() { foo(); }
>>
>> .globl foo, bar
>> foo:
>> .Lfoo$local:
>> ret
>> bar:
>> call foo --> call .Lfoo$local
>> ret
>>
>> call foo generates an R_X86_64_PLT32. In a -shared link, it creates an
>> unneeded PLT entry for foo.
>>
>> call .Lfoo$local generates an R_X86_64_PLT32. In a -shared link, .Lfoo$local is
>> non-preemptible => no PLT entry is created.
>>
>> For -fno-PIC and -fPIE, the final link is expected to be -no-pie or
>> -pie. This optimization does not save anything, because PLT entries will
>> not be generated. With clang's integrated assembler, it may increase the
>> number of STT_SECTION symbols (because .Lfoo$local will be turned to a
>> STT_SECTION relative relocation), but the size increase is very small.
>>
>>
>> I want to teach clang -fPIC to use -fno-semantic-interposition by
>> default. (It is currently an LLVM optimization, not realized in clang.)
>> clang traditionally makes various -fno-semantic-interposition
>> assumptions and can perform interprocedural optimizations even if the
>> strict ELF rule disallows them.
>
>FWIW, gcc with no-semantic-interposition also uses local aliases, but
>rather than using .L labels, it creates a local alias by
> .set foo.localalias, foo
>This makes the type of foo.localalias the same as foo, which I gather
>should placate objtool as it'll still see an STT_FUNC no matter whether
>it picks up foo.localalias or foo.
The GCC approach costs more bytes. foo.localalias is not prefixed by .L,
thus it wastes sizeof(Elf*_Sym) bytes for each such function.
5: 0000000000401000 7 FUNC LOCAL DEFAULT 1 foo.localalias
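
For a sense of scale, the per-alias cost can be estimated as below. This is a back-of-the-envelope sketch: struct sk_elf64_sym mirrors the Elf64_Sym layout from the ELF specification, sk_alias_cost() is a made-up helper, and the string table part is an upper bound because it ignores any suffix sharing the linker may do in .strtab.

```c
#include <stdint.h>
#include <string.h>

/* Field-for-field mirror of Elf64_Sym from the ELF specification. */
struct sk_elf64_sym {
	uint32_t st_name;
	unsigned char st_info;
	unsigned char st_other;
	uint16_t st_shndx;
	uint64_t st_value;
	uint64_t st_size;
};

/* Upper bound on the bytes one extra named alias adds:
 * one symbol table entry plus its NUL-terminated name. */
static size_t sk_alias_cost(const char *alias_name)
{
	return sizeof(struct sk_elf64_sym) + strlen(alias_name) + 1;
}
```

On the usual 64-bit ABIs the entry itself is 24 bytes, so foo.localalias costs at most 24 + 15 = 39 bytes per function.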
Call/jump relocations on ARM and MIPS treat STT_FUNC differently.
If we eventually use the clang optimization for ARM and MIPS, we
should probably consider changing `.Lfoo$local:` to `.set .Lfoo$local, foo`.
The assembler is quite complex. I need to investigate LLVM MC further.
R_ARM_CALL/R_ARM_THM_CALL can be used against STT_NOTYPE symbols.
That disables interwork thunks (https://reviews.llvm.org/D73542).
If objtool is ever used for ARM and that disabling behavior is needed,
the rule should be loosened to allow STT_NOTYPE.
On Fri, Feb 14, 2020 at 02:20:46PM -0800, Fangrui Song wrote:
> On 2020-02-14, Arvind Sankar wrote:
> >
> >I was testing with hidden/protected visibility, I see you want this for
> >the no-semantic-interposition case. Actually a bit more testing shows
> >some peculiarities even with hidden visibility. With the below, the call
> >and lea create relocations in the object file, but the jmp doesn't. ld
> >does avoid creating a plt for this though.
> >
> > .text
> > .globl foo, bar
> > .hidden foo
> > bar:
> > call foo
> > leaq foo(%rip), %rax
> > jmp foo
> >
> > foo: ret
>
> Yes, GNU as is inconsistent here. While fixing
> https://sourceware.org/ml/binutils/2020-02/msg00243.html , I noticed
> that the rule is quite complex. There are definitely lots of places to
> improve. clang 10 emits relocations consistently.
>
> call foo # R_X86_64_PLT32
> leaq foo(%rip), %rax # R_X86_64_PC32
> jmp foo # R_X86_64_PLT32
>
I guess the reason is that jmp instructions can be optimized to use an
8-bit signed offset if the destination is close enough, so the assembler
wants to go through them anyway to check, while no such optimization is
possible for the call and lea.
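
If that guess is right, the size decision the assembler has to make for a direct jmp looks roughly like the sketch below. The sk_ names are made up; the encoding lengths are the standard x86-64 ones (jmp rel8 is 2 bytes, jmp rel32 and call rel32 are 5 bytes, and there is no call rel8).

```c
#include <stdint.h>

/* x86-64 encodings: jmp rel8 is EB cb (2 bytes), jmp rel32 is E9 cd
 * (5 bytes), call rel32 is E8 cd (5 bytes); there is no call rel8. */
enum { SK_JMP_SHORT_LEN = 2, SK_JMP_NEAR_LEN = 5, SK_CALL_LEN = 5 };

/* Size of a direct jmp at insn_addr to target, preferring the short form. */
static int sk_jmp_len(uint64_t insn_addr, uint64_t target)
{
	/* The displacement is measured from the end of the instruction. */
	int64_t disp = (int64_t)(target - (insn_addr + SK_JMP_SHORT_LEN));

	return (disp >= INT8_MIN && disp <= INT8_MAX) ? SK_JMP_SHORT_LEN
						      : SK_JMP_NEAR_LEN;
}
```

To pick between the two forms the assembler needs the distance to the target, which it only has when the target is in the same section; call always takes the 5-byte form, so no such check is needed for it.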
clang 9 emits no relocations for me, unless @PLT/@GOTPCREL is explicitly
used. Has that changed? (I'm just using clang -o test.o test.s on that
assembly; I'm not too familiar with the invocation syntax.)
On Fri, Feb 14, 2020 at 07:05:57PM -0500, Arvind Sankar wrote:
> On Fri, Feb 14, 2020 at 02:20:46PM -0800, Fangrui Song wrote:
> > On 2020-02-14, Arvind Sankar wrote:
> > >
> > >I was testing with hidden/protected visibility, I see you want this for
> > >the no-semantic-interposition case. Actually a bit more testing shows
> > >some peculiarities even with hidden visibility. With the below, the call
> > >and lea create relocations in the object file, but the jmp doesn't. ld
> > >does avoid creating a plt for this though.
> > >
> > > .text
> > > .globl foo, bar
> > > .hidden foo
> > > bar:
> > > call foo
> > > leaq foo(%rip), %rax
> > > jmp foo
> > >
> > > foo: ret
> >
> > Yes, GNU as is inconsistent here. While fixing
> > https://sourceware.org/ml/binutils/2020-02/msg00243.html , I noticed
> > that the rule is quite complex. There are definitely lots of places to
> > improve. clang 10 emits relocations consistently.
> >
> > call foo # R_X86_64_PLT32
> > leaq foo(%rip), %rax # R_X86_64_PC32
> > jmp foo # R_X86_64_PLT32
> >
>
> I guess the reason why is that jmp instructions can be optimized to use
> 8-bit signed offset if the destination is close enough, so the assembler
> wants to go through them anyway to check, while such optimization is not
> possible for the call and lea.
>
> clang 9 emits no relocations for me, unless @PLT/@GOTPCREL is explicitly
> used. Has that changed? (Just using clang -o test.o test.s on that
> assembler, not too familiar with invokation syntax)
Actually, wait, it does that even with default visibility. The only way
to make it allow for symbol interposition is to explicitly use @PLT etc.
Is the only reason you're adding these local symbols, then, to work
around GNU as adding PLT relocations automatically for call foo?