On 2020-06-29, Arvind Sankar wrote:
>On Mon, Jun 29, 2020 at 09:20:31AM -0700, Kees Cook wrote:
>> On Mon, Jun 29, 2020 at 06:11:59PM +0200, Ard Biesheuvel wrote:
>> > On Mon, 29 Jun 2020 at 18:09, Kees Cook <[email protected]> wrote:
>> > >
>> > > On Mon, Jun 29, 2020 at 10:09:28AM -0400, Arvind Sankar wrote:
>> > > > Add a linker script check that there are no runtime relocations, and
>> > > > remove the old one that tries to check via looking for specially-named
>> > > > sections in the object files.
>> > > >
>> > > > Drop the tests for -fPIE compiler option and -pie linker option, as they
>> > > > are available in all supported gcc and binutils versions (as well as
>> > > > clang and lld).
>> > > >
>> > > > Signed-off-by: Arvind Sankar <[email protected]>
>> > > > Reviewed-by: Ard Biesheuvel <[email protected]>
>> > > > Reviewed-by: Fangrui Song <[email protected]>
>> > > > ---
>> > > > arch/x86/boot/compressed/Makefile | 28 +++-----------------------
>> > > > arch/x86/boot/compressed/vmlinux.lds.S | 8 ++++++++
>> > > > 2 files changed, 11 insertions(+), 25 deletions(-)
>> > >
>> > > Reviewed-by: Kees Cook <[email protected]>
>> > >
>> > > question below ...
>> > >
>> > > > diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
>> > > > index a4a4a59a2628..a78510046eec 100644
>> > > > --- a/arch/x86/boot/compressed/vmlinux.lds.S
>> > > > +++ b/arch/x86/boot/compressed/vmlinux.lds.S
>> > > > @@ -42,6 +42,12 @@ SECTIONS
>> > > > *(.rodata.*)
>> > > > _erodata = . ;
>> > > > }
>> > > > + .rel.dyn : {
>> > > > + *(.rel.*)
>> > > > + }
>> > > > + .rela.dyn : {
>> > > > + *(.rela.*)
>> > > > + }
>> > > > .got : {
>> > > > *(.got)
>> > > > }
>> > >
>> > > Should these be marked (INFO) as well?
>> > >
>> >
>> > Given that sections marked as (INFO) will still be emitted into the
>> > ELF image, it does not really make a difference to do this for zero
>> > sized sections.
>>
>> Oh, I misunderstood -- I though they were _not_ emitted; I see now what
>> you said was not allocated. So, disk space used for the .got.plt case,
>> but not memory space used. Sorry for the confusion!
>>
>> -Kees
About output section type (INFO):
https://sourceware.org/binutils/docs/ld/Output-Section-Type.html#Output-Section-Type
says "These type names are supported for backward compatibility, and are
rarely used."
If all input section don't have the SHF_ALLOC flag, the output section
will not have this flag as well. This type is not useful...
If .got and .got.plt were used, they should be considered dynamic
relocations which should be part of the loadable image. So they should
have the SHF_ALLOC flag. (INFO) will not be applicable anyway.
SHT_REL[A] may be allocable or not. Usually .rel[a].dyn and .rel[a].plt
are linker created allocable sections. (INFO) does not make sense for them.
>In the case of the REL[A] and .got sections, they are actually already
>not emitted at all into the ELF file now that they are zero size.
>
>For .got.plt, it is only emitted for 32-bit (with the 3 reserved
>entries), the 64-bit linker seems to get rid of it.
On Mon, 29 Jun 2020 at 19:37, Fangrui Song <[email protected]> wrote:
>
> On 2020-06-29, Arvind Sankar wrote:
> >On Mon, Jun 29, 2020 at 09:20:31AM -0700, Kees Cook wrote:
> >> On Mon, Jun 29, 2020 at 06:11:59PM +0200, Ard Biesheuvel wrote:
> >> > On Mon, 29 Jun 2020 at 18:09, Kees Cook <[email protected]> wrote:
> >> > >
> >> > > On Mon, Jun 29, 2020 at 10:09:28AM -0400, Arvind Sankar wrote:
> >> > > > Add a linker script check that there are no runtime relocations, and
> >> > > > remove the old one that tries to check via looking for specially-named
> >> > > > sections in the object files.
> >> > > >
> >> > > > Drop the tests for -fPIE compiler option and -pie linker option, as they
> >> > > > are available in all supported gcc and binutils versions (as well as
> >> > > > clang and lld).
> >> > > >
> >> > > > Signed-off-by: Arvind Sankar <[email protected]>
> >> > > > Reviewed-by: Ard Biesheuvel <[email protected]>
> >> > > > Reviewed-by: Fangrui Song <[email protected]>
> >> > > > ---
> >> > > > arch/x86/boot/compressed/Makefile | 28 +++-----------------------
> >> > > > arch/x86/boot/compressed/vmlinux.lds.S | 8 ++++++++
> >> > > > 2 files changed, 11 insertions(+), 25 deletions(-)
> >> > >
> >> > > Reviewed-by: Kees Cook <[email protected]>
> >> > >
> >> > > question below ...
> >> > >
> >> > > > diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
> >> > > > index a4a4a59a2628..a78510046eec 100644
> >> > > > --- a/arch/x86/boot/compressed/vmlinux.lds.S
> >> > > > +++ b/arch/x86/boot/compressed/vmlinux.lds.S
> >> > > > @@ -42,6 +42,12 @@ SECTIONS
> >> > > > *(.rodata.*)
> >> > > > _erodata = . ;
> >> > > > }
> >> > > > + .rel.dyn : {
> >> > > > + *(.rel.*)
> >> > > > + }
> >> > > > + .rela.dyn : {
> >> > > > + *(.rela.*)
> >> > > > + }
> >> > > > .got : {
> >> > > > *(.got)
> >> > > > }
> >> > >
> >> > > Should these be marked (INFO) as well?
> >> > >
> >> >
> >> > Given that sections marked as (INFO) will still be emitted into the
> >> > ELF image, it does not really make a difference to do this for zero
> >> > sized sections.
> >>
> >> Oh, I misunderstood -- I though they were _not_ emitted; I see now what
> >> you said was not allocated. So, disk space used for the .got.plt case,
> >> but not memory space used. Sorry for the confusion!
> >>
> >> -Kees
>
> About output section type (INFO):
> https://sourceware.org/binutils/docs/ld/Output-Section-Type.html#Output-Section-Type
> says "These type names are supported for backward compatibility, and are
> rarely used."
>
> If all input section don't have the SHF_ALLOC flag, the output section
> will not have this flag as well. This type is not useful...
>
> If .got and .got.plt were used, they should be considered dynamic
> relocations which should be part of the loadable image. So they should
> have the SHF_ALLOC flag. (INFO) will not be applicable anyway.
>
I don't care deeply either way, but Kees indicated that he would like
to get rid of the 24 bytes of .got.plt magic entries that we have no
need for.
In fact, a lot of this mangling is caused by the fact that the linker
is creating a relocatable binary, and assumes that it is a hosted
binary that is loaded by a dynamic loader. It would actually be much
better if the compiler and linker would take -ffreestanding into
account, and suppress GOT entries, PLTs, dynamic program headers for
shared libraries altogether.
On 2020-06-29, Ard Biesheuvel wrote:
>On Mon, 29 Jun 2020 at 19:37, Fangrui Song <[email protected]> wrote:
>>
>> On 2020-06-29, Arvind Sankar wrote:
>> >On Mon, Jun 29, 2020 at 09:20:31AM -0700, Kees Cook wrote:
>> >> On Mon, Jun 29, 2020 at 06:11:59PM +0200, Ard Biesheuvel wrote:
>> >> > On Mon, 29 Jun 2020 at 18:09, Kees Cook <[email protected]> wrote:
>> >> > >
>> >> > > On Mon, Jun 29, 2020 at 10:09:28AM -0400, Arvind Sankar wrote:
>> >> > > > Add a linker script check that there are no runtime relocations, and
>> >> > > > remove the old one that tries to check via looking for specially-named
>> >> > > > sections in the object files.
>> >> > > >
>> >> > > > Drop the tests for -fPIE compiler option and -pie linker option, as they
>> >> > > > are available in all supported gcc and binutils versions (as well as
>> >> > > > clang and lld).
>> >> > > >
>> >> > > > Signed-off-by: Arvind Sankar <[email protected]>
>> >> > > > Reviewed-by: Ard Biesheuvel <[email protected]>
>> >> > > > Reviewed-by: Fangrui Song <[email protected]>
>> >> > > > ---
>> >> > > > arch/x86/boot/compressed/Makefile | 28 +++-----------------------
>> >> > > > arch/x86/boot/compressed/vmlinux.lds.S | 8 ++++++++
>> >> > > > 2 files changed, 11 insertions(+), 25 deletions(-)
>> >> > >
>> >> > > Reviewed-by: Kees Cook <[email protected]>
>> >> > >
>> >> > > question below ...
>> >> > >
>> >> > > > diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
>> >> > > > index a4a4a59a2628..a78510046eec 100644
>> >> > > > --- a/arch/x86/boot/compressed/vmlinux.lds.S
>> >> > > > +++ b/arch/x86/boot/compressed/vmlinux.lds.S
>> >> > > > @@ -42,6 +42,12 @@ SECTIONS
>> >> > > > *(.rodata.*)
>> >> > > > _erodata = . ;
>> >> > > > }
>> >> > > > + .rel.dyn : {
>> >> > > > + *(.rel.*)
>> >> > > > + }
>> >> > > > + .rela.dyn : {
>> >> > > > + *(.rela.*)
>> >> > > > + }
>> >> > > > .got : {
>> >> > > > *(.got)
>> >> > > > }
>> >> > >
>> >> > > Should these be marked (INFO) as well?
>> >> > >
>> >> >
>> >> > Given that sections marked as (INFO) will still be emitted into the
>> >> > ELF image, it does not really make a difference to do this for zero
>> >> > sized sections.
>> >>
>> >> Oh, I misunderstood -- I though they were _not_ emitted; I see now what
>> >> you said was not allocated. So, disk space used for the .got.plt case,
>> >> but not memory space used. Sorry for the confusion!
>> >>
>> >> -Kees
>>
>> About output section type (INFO):
>> https://sourceware.org/binutils/docs/ld/Output-Section-Type.html#Output-Section-Type
>> says "These type names are supported for backward compatibility, and are
>> rarely used."
>>
>> If all input section don't have the SHF_ALLOC flag, the output section
>> will not have this flag as well. This type is not useful...
>>
>> If .got and .got.plt were used, they should be considered dynamic
>> relocations which should be part of the loadable image. So they should
>> have the SHF_ALLOC flag. (INFO) will not be applicable anyway.
>>
>
>I don't care deeply either way, but Kees indicated that he would like
>to get rid of the 24 bytes of .got.plt magic entries that we have no
>need for.
>
>In fact, a lot of this mangling is caused by the fact that the linker
>is creating a relocatable binary, and assumes that it is a hosted
>binary that is loaded by a dynamic loader. It would actually be much
>better if the compiler and linker would take -ffreestanding into
>account, and suppress GOT entries, PLTs, dynamic program headers for
>shared libraries altogether.
Linkers (GNU ld and LLD) don't create .got or .got.plt just because the linker
command line has -pie or -shared. They create .got or .got.plt if there are
specific needs.
For .got.plt, if there is (1) any .plt/.iplt entry, (2) any .got.plt based
relocation (e.g. R_X86_64_GOTPC32 on x86-64), or (3) if _GLOBAL_OFFSET_TABLE_ is
referenced, .got.plt will be created (both GNU ld and LLD) with usually 3
entries (for ld.so purposes).
If (1) is not satisfied, the created .got.plt is just served as an anchor for
things that want to reference (the distance from GOT base to some point). The
linker will still reserve 3 words but the words are likely not needed.
I don't think there is a specific need for another option to teach the linker
(GNU ld or LLD) that this is a kernel link. For -ffreestanding builds, cc
-static (ld -no-pie))/-static-pie (-pie) already work quite well.
On Tue, 30 Jun 2020 at 01:34, Fangrui Song <[email protected]> wrote:
>
> On 2020-06-29, Ard Biesheuvel wrote:
> >On Mon, 29 Jun 2020 at 19:37, Fangrui Song <[email protected]> wrote:
> >>
> >> On 2020-06-29, Arvind Sankar wrote:
> >> >On Mon, Jun 29, 2020 at 09:20:31AM -0700, Kees Cook wrote:
> >> >> On Mon, Jun 29, 2020 at 06:11:59PM +0200, Ard Biesheuvel wrote:
> >> >> > On Mon, 29 Jun 2020 at 18:09, Kees Cook <[email protected]> wrote:
> >> >> > >
> >> >> > > On Mon, Jun 29, 2020 at 10:09:28AM -0400, Arvind Sankar wrote:
> >> >> > > > Add a linker script check that there are no runtime relocations, and
> >> >> > > > remove the old one that tries to check via looking for specially-named
> >> >> > > > sections in the object files.
> >> >> > > >
> >> >> > > > Drop the tests for -fPIE compiler option and -pie linker option, as they
> >> >> > > > are available in all supported gcc and binutils versions (as well as
> >> >> > > > clang and lld).
> >> >> > > >
> >> >> > > > Signed-off-by: Arvind Sankar <[email protected]>
> >> >> > > > Reviewed-by: Ard Biesheuvel <[email protected]>
> >> >> > > > Reviewed-by: Fangrui Song <[email protected]>
> >> >> > > > ---
> >> >> > > > arch/x86/boot/compressed/Makefile | 28 +++-----------------------
> >> >> > > > arch/x86/boot/compressed/vmlinux.lds.S | 8 ++++++++
> >> >> > > > 2 files changed, 11 insertions(+), 25 deletions(-)
> >> >> > >
> >> >> > > Reviewed-by: Kees Cook <[email protected]>
> >> >> > >
> >> >> > > question below ...
> >> >> > >
> >> >> > > > diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
> >> >> > > > index a4a4a59a2628..a78510046eec 100644
> >> >> > > > --- a/arch/x86/boot/compressed/vmlinux.lds.S
> >> >> > > > +++ b/arch/x86/boot/compressed/vmlinux.lds.S
> >> >> > > > @@ -42,6 +42,12 @@ SECTIONS
> >> >> > > > *(.rodata.*)
> >> >> > > > _erodata = . ;
> >> >> > > > }
> >> >> > > > + .rel.dyn : {
> >> >> > > > + *(.rel.*)
> >> >> > > > + }
> >> >> > > > + .rela.dyn : {
> >> >> > > > + *(.rela.*)
> >> >> > > > + }
> >> >> > > > .got : {
> >> >> > > > *(.got)
> >> >> > > > }
> >> >> > >
> >> >> > > Should these be marked (INFO) as well?
> >> >> > >
> >> >> >
> >> >> > Given that sections marked as (INFO) will still be emitted into the
> >> >> > ELF image, it does not really make a difference to do this for zero
> >> >> > sized sections.
> >> >>
> >> >> Oh, I misunderstood -- I though they were _not_ emitted; I see now what
> >> >> you said was not allocated. So, disk space used for the .got.plt case,
> >> >> but not memory space used. Sorry for the confusion!
> >> >>
> >> >> -Kees
> >>
> >> About output section type (INFO):
> >> https://sourceware.org/binutils/docs/ld/Output-Section-Type.html#Output-Section-Type
> >> says "These type names are supported for backward compatibility, and are
> >> rarely used."
> >>
> >> If all input section don't have the SHF_ALLOC flag, the output section
> >> will not have this flag as well. This type is not useful...
> >>
> >> If .got and .got.plt were used, they should be considered dynamic
> >> relocations which should be part of the loadable image. So they should
> >> have the SHF_ALLOC flag. (INFO) will not be applicable anyway.
> >>
> >
> >I don't care deeply either way, but Kees indicated that he would like
> >to get rid of the 24 bytes of .got.plt magic entries that we have no
> >need for.
> >
> >In fact, a lot of this mangling is caused by the fact that the linker
> >is creating a relocatable binary, and assumes that it is a hosted
> >binary that is loaded by a dynamic loader. It would actually be much
> >better if the compiler and linker would take -ffreestanding into
> >account, and suppress GOT entries, PLTs, dynamic program headers for
> >shared libraries altogether.
>
> Linkers (GNU ld and LLD) don't create .got or .got.plt just because the linker
> command line has -pie or -shared. They create .got or .got.plt if there are
> specific needs.
>
> For .got.plt, if there is (1) any .plt/.iplt entry, (2) any .got.plt based
> relocation (e.g. R_X86_64_GOTPC32 on x86-64), or (3) if _GLOBAL_OFFSET_TABLE_ is
> referenced, .got.plt will be created (both GNU ld and LLD) with usually 3
> entries (for ld.so purposes).
>
This is not the case for AArch64. There, __GLOBAL_OFFSET_TABLE__ is
always emitted, along with the magic .got.plt entries, regardless of
the input.
As for the input objects - why is '#pragma GCC visibility(hidden)' not
the default for -ffreestanding builds? This suppresses any GOT entries
emitted by the compiler, but the only way to get this behavior is
through the #pragma, which is how we ended up with '-include hidden.h'
in a couple of places.
IOW, if the toolchain behavior was not 100% geared towards shared
executables as it is today, we would not need the hacks that we need
to apply to get a relocatable bare metal binary like we need for the
KASLR kernel.
> If (1) is not satisfied, the created .got.plt is just served as an anchor for
> things that want to reference (the distance from GOT base to some point). The
> linker will still reserve 3 words but the words are likely not needed.
>
> I don't think there is a specific need for another option to teach the linker
> (GNU ld or LLD) that this is a kernel link. For -ffreestanding builds, cc
> -static (ld -no-pie))/-static-pie (-pie) already work quite well.
You mean 'ld -static -pie' right? That seems to work. Is that a recent
invention?
On Tue, Jun 30, 2020 at 06:26:43PM +0200, Ard Biesheuvel wrote:
> On Tue, 30 Jun 2020 at 01:34, Fangrui Song <[email protected]> wrote:
>
> > If (1) is not satisfied, the created .got.plt is just served as an anchor for
> > things that want to reference (the distance from GOT base to some point). The
> > linker will still reserve 3 words but the words are likely not needed.
> >
> > I don't think there is a specific need for another option to teach the linker
> > (GNU ld or LLD) that this is a kernel link. For -ffreestanding builds, cc
> > -static (ld -no-pie))/-static-pie (-pie) already work quite well.
>
> You mean 'ld -static -pie' right? That seems to work. Is that a recent
> invention?
gcc -static-pie is fairly recent [0], but it just influences how the
linker is invoked AFAIK (at least for gcc) -- in addition to passing
some linker flags, it will change what startup files get linked in (for
non-freestanding). It does not even imply -fPIE to the compiler, which
is confusing as hell. It _would_ be nice if this also told the compiler
that all symbols (perhaps unless explicitly marked) will be resolved at
static link time, so there is no need to use the GOT or PLT for globals.
As it stands, the executable can still have relocations, GOT and PLT, it
just needs to have startup code to handle them (provided by libc
typically) instead of relying on an external dynamic linker.
I don't think it's really relevant for the kernel build -- all we get is
ld -static --no-dynamic-linker, all -static does is prevent searching
shared libraries, and we already pass --no-dynamic-linker if it's
supported.
[0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81498
* Ard Biesheuvel
> On Tue, 30 Jun 2020 at 01:34, Fangrui Song <[email protected]> wrote:
> >
> > On 2020-06-29, Ard Biesheuvel wrote:
> > >On Mon, 29 Jun 2020 at 19:37, Fangrui Song <[email protected]> wrote:
> > >>
> > >> On 2020-06-29, Arvind Sankar wrote:
> > >> >On Mon, Jun 29, 2020 at 09:20:31AM -0700, Kees Cook wrote:
> > >> >> On Mon, Jun 29, 2020 at 06:11:59PM +0200, Ard Biesheuvel wrote:
> > >> >> > On Mon, 29 Jun 2020 at 18:09, Kees Cook <[email protected]> wrote:
> > >> >> > >
> > >> >> > > On Mon, Jun 29, 2020 at 10:09:28AM -0400, Arvind Sankar wrote:
> > >> >> > > > Add a linker script check that there are no runtime relocations, and
> > >> >> > > > remove the old one that tries to check via looking for specially-named
> > >> >> > > > sections in the object files.
> > >> >> > > >
> > >> >> > > > Drop the tests for -fPIE compiler option and -pie linker option, as they
> > >> >> > > > are available in all supported gcc and binutils versions (as well as
> > >> >> > > > clang and lld).
> > >> >> > > >
> > >> >> > > > Signed-off-by: Arvind Sankar <[email protected]>
> > >> >> > > > Reviewed-by: Ard Biesheuvel <[email protected]>
> > >> >> > > > Reviewed-by: Fangrui Song <[email protected]>
> > >> >> > > > ---
> > >> >> > > > arch/x86/boot/compressed/Makefile | 28 +++-----------------------
> > >> >> > > > arch/x86/boot/compressed/vmlinux.lds.S | 8 ++++++++
> > >> >> > > > 2 files changed, 11 insertions(+), 25 deletions(-)
> > >> >> > >
> > >> >> > > Reviewed-by: Kees Cook <[email protected]>
> > >> >> > >
> > >> >> > > question below ...
> > >> >> > >
> > >> >> > > > diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
> > >> >> > > > index a4a4a59a2628..a78510046eec 100644
> > >> >> > > > --- a/arch/x86/boot/compressed/vmlinux.lds.S
> > >> >> > > > +++ b/arch/x86/boot/compressed/vmlinux.lds.S
> > >> >> > > > @@ -42,6 +42,12 @@ SECTIONS
> > >> >> > > > *(.rodata.*)
> > >> >> > > > _erodata = . ;
> > >> >> > > > }
> > >> >> > > > + .rel.dyn : {
> > >> >> > > > + *(.rel.*)
> > >> >> > > > + }
> > >> >> > > > + .rela.dyn : {
> > >> >> > > > + *(.rela.*)
> > >> >> > > > + }
> > >> >> > > > .got : {
> > >> >> > > > *(.got)
> > >> >> > > > }
> > >> >> > >
> > >> >> > > Should these be marked (INFO) as well?
> > >> >> > >
> > >> >> >
> > >> >> > Given that sections marked as (INFO) will still be emitted into the
> > >> >> > ELF image, it does not really make a difference to do this for zero
> > >> >> > sized sections.
> > >> >>
> > >> >> Oh, I misunderstood -- I though they were _not_ emitted; I see now what
> > >> >> you said was not allocated. So, disk space used for the .got.plt case,
> > >> >> but not memory space used. Sorry for the confusion!
> > >> >>
> > >> >> -Kees
> > >>
> > >> About output section type (INFO):
> > >> https://sourceware.org/binutils/docs/ld/Output-Section-Type.html#Output-Section-Type
> > >> says "These type names are supported for backward compatibility, and are
> > >> rarely used."
> > >>
> > >> If all input section don't have the SHF_ALLOC flag, the output section
> > >> will not have this flag as well. This type is not useful...
> > >>
> > >> If .got and .got.plt were used, they should be considered dynamic
> > >> relocations which should be part of the loadable image. So they should
> > >> have the SHF_ALLOC flag. (INFO) will not be applicable anyway.
> > >>
> > >
> > >I don't care deeply either way, but Kees indicated that he would like
> > >to get rid of the 24 bytes of .got.plt magic entries that we have no
> > >need for.
> > >
> > >In fact, a lot of this mangling is caused by the fact that the linker
> > >is creating a relocatable binary, and assumes that it is a hosted
> > >binary that is loaded by a dynamic loader. It would actually be much
> > >better if the compiler and linker would take -ffreestanding into
> > >account, and suppress GOT entries, PLTs, dynamic program headers for
> > >shared libraries altogether.
> >
> > Linkers (GNU ld and LLD) don't create .got or .got.plt just because the linker
> > command line has -pie or -shared. They create .got or .got.plt if there are
> > specific needs.
> >
> > For .got.plt, if there is (1) any .plt/.iplt entry, (2) any .got.plt based
> > relocation (e.g. R_X86_64_GOTPC32 on x86-64), or (3) if _GLOBAL_OFFSET_TABLE_ is
> > referenced, .got.plt will be created (both GNU ld and LLD) with usually 3
> > entries (for ld.so purposes).
> >
>
> This is not the case for AArch64. There, __GLOBAL_OFFSET_TABLE__ is
> always emitted, along with the magic .got.plt entries, regardless of
> the input.
>
> As for the input objects - why is '#pragma GCC visibility(hidden)' not
> the default for -ffreestanding builds? This suppresses any GOT entries
> emitted by the compiler, but the only way to get this behavior is
> through the #pragma, which is how we ended up with '-include hidden.h'
> in a couple of places.
A -ffreestanding build may provide a shared object used by other
applications. For example, musl (libc)'s CFLAGS has -ffreestanding.
> IOW, if the toolchain behavior was not 100% geared towards shared
> executables as it is today, we would not need the hacks that we need
> to apply to get a relocatable bare metal binary like we need for the
> KASLR kernel.
My mere concern regarding "geared towards shared objects" is that ELF
assumes symbols of default visibility are preemptible by default.
So unfortunately it is difficult to make -fno-semantic-interposition the
default.
> If (1) is not satisfied, the created .got.plt is just served as an anchor for
> things that want to reference (the distance from GOT base to some point). The
> linker will still reserve 3 words but the words are likely not needed.
>
> I don't think there is a specific need for another option to teach the linker
> (GNU ld or LLD) that this is a kernel link. For -ffreestanding builds, cc
> -static (ld -no-pie))/-static-pie (-pie) already work quite well.
On 2020-06-30, Arvind Sankar wrote:
>On Tue, Jun 30, 2020 at 06:26:43PM +0200, Ard Biesheuvel wrote:
>> On Tue, 30 Jun 2020 at 01:34, Fangrui Song <[email protected]> wrote:
>>
>> > If (1) is not satisfied, the created .got.plt is just served as an anchor for
>> > things that want to reference (the distance from GOT base to some point). The
>> > linker will still reserve 3 words but the words are likely not needed.
>> >
>> > I don't think there is a specific need for another option to teach the linker
>> > (GNU ld or LLD) that this is a kernel link. For -ffreestanding builds, cc
>> > -static (ld -no-pie))/-static-pie (-pie) already work quite well.
>>
>> You mean 'ld -static -pie' right? That seems to work. Is that a recent
>> invention?
>
>gcc -static-pie is fairly recent [0], but it just influences how the
>linker is invoked AFAIK (at least for gcc) -- in addition to passing
>some linker flags, it will change what startup files get linked in (for
>non-freestanding). It does not even imply -fPIE to the compiler, which
>is confusing as hell. It _would_ be nice if this also told the compiler
>that all symbols (perhaps unless explicitly marked) will be resolved at
>static link time, so there is no need to use the GOT or PLT for globals.
>
>As it stands, the executable can still have relocations, GOT and PLT, it
>just needs to have startup code to handle them (provided by libc
>typically) instead of relying on an external dynamic linker.
If the executable is purely static, it does not need to have PLT. All
calls to a PLT can be redirected to the function itself. Some range
extension thunks (other terms: stub groups, veneers, etc) may still be
needed if the distance is too large.
There are cases where a GOT cannot be avoided, e.g.
extern char foo[] __attribute__((weak, visibility("hidden")));
char *fun() { return foo; }
If foo is a SHN_ABS, `movq foo@GOTPCREL(%rip), %rax` can't be optimized
by GOTPCRELX (https://sourceware.org/bugzilla/show_bug.cgi?id=25749 binutils>=2.35 will be good)
Many other architectures don't even have a GOT optimization.
>I don't think it's really relevant for the kernel build -- all we get is
>ld -static --no-dynamic-linker, all -static does is prevent searching
>shared libraries, and we already pass --no-dynamic-linker if it's
>supported.
>
>[0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81498
On Tue, Jun 30, 2020 at 03:00:43PM -0700, Fangrui Song wrote:
> * Ard Biesheuvel
> > On Tue, 30 Jun 2020 at 01:34, Fangrui Song <[email protected]> wrote:
>
> If the executable is purely static, it does not need to have PLT. All
> calls to a PLT can be redirected to the function itself. Some range
> extension thunks (other terms: stub groups, veneers, etc) may still be
> needed if the distance is too large.
>
> There are cases where a GOT cannot be avoided, e.g.
>
> extern char foo[] __attribute__((weak, visibility("hidden")));
> char *fun() { return foo; }
>
> If foo is a SHN_ABS, `movq foo@GOTPCREL(%rip), %rax` can't be optimized
> by GOTPCRELX (https://sourceware.org/bugzilla/show_bug.cgi?id=25749 binutils>=2.35 will be good)
> Many other architectures don't even have a GOT optimization.
Urk -- the example given in that bug report isn't even weak. Are you
guys proposing to pessimize every access to a global symbol, regardless
of visibility, by going through the GOT on the off chance that somebody
might define one of them as SHN_ABS? Can we at least gate it behind
something like __attribute__((might_be_shn_abs))?
>
> >I don't think it's really relevant for the kernel build -- all we get is
> >ld -static --no-dynamic-linker, all -static does is prevent searching
> >shared libraries, and we already pass --no-dynamic-linker if it's
> >supported.
> >
> >[0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81498
On Wed, 1 Jul 2020 at 01:28, Arvind Sankar <[email protected]> wrote:
>
> On Tue, Jun 30, 2020 at 03:00:43PM -0700, Fangrui Song wrote:
> > * Ard Biesheuvel
> > > On Tue, 30 Jun 2020 at 01:34, Fangrui Song <[email protected]> wrote:
> >
> > If the executable is purely static, it does not need to have PLT. All
> > calls to a PLT can be redirected to the function itself. Some range
> > extension thunks (other terms: stub groups, veneers, etc) may still be
> > needed if the distance is too large.
> >
> > There are cases where a GOT cannot be avoided, e.g.
> >
> > extern char foo[] __attribute__((weak, visibility("hidden")));
> > char *fun() { return foo; }
> >
> > If foo is a SHN_ABS, `movq foo@GOTPCREL(%rip), %rax` can't be optimized
> > by GOTPCRELX (https://sourceware.org/bugzilla/show_bug.cgi?id=25749 binutils>=2.35 will be good)
> > Many other architectures don't even have a GOT optimization.
>
> Urk -- the example given in that bug report isn't even weak. Are you
> guys proposing to pessimize every access to a global symbol, regardless
> of visibility, by going through the GOT on the off chance that somebody
> might define one of them as SHN_ABS? Can we at least gate it behind
> something like __attribute__((might_be_shn_abs))?
>
SHN_ABS is typically only used for constants emitted by the linker
script, so I don't think this is a huge deal.
The example above is not that different from having a statically
initialized function pointer in your object (which might be NULL), and
that is something we already have to deal with anyway.
What I was talking about is the tendency of the compiler to assume
that every function symbol with external linkage is preemptible, and
the only way to suppress this behavior is to issue a #pragma that can
be done in code only, not on the compiler command line.
> >
> > >I don't think it's really relevant for the kernel build -- all we get is
> > >ld -static --no-dynamic-linker, all -static does is prevent searching
> > >shared libraries, and we already pass --no-dynamic-linker if it's
> > >supported.
> > >
> > >[0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81498
On Wed, Jul 01, 2020 at 08:44:56AM +0200, Ard Biesheuvel wrote:
> On Wed, 1 Jul 2020 at 01:28, Arvind Sankar <[email protected]> wrote:
> >
> > On Tue, Jun 30, 2020 at 03:00:43PM -0700, Fangrui Song wrote:
> > > * Ard Biesheuvel
> > > > On Tue, 30 Jun 2020 at 01:34, Fangrui Song <[email protected]> wrote:
> > >
> > > If the executable is purely static, it does not need to have PLT. All
> > > calls to a PLT can be redirected to the function itself. Some range
> > > extension thunks (other terms: stub groups, veneers, etc) may still be
> > > needed if the distance is too large.
> > >
> > > There are cases where a GOT cannot be avoided, e.g.
> > >
> > > extern char foo[] __attribute__((weak, visibility("hidden")));
> > > char *fun() { return foo; }
> > >
> > > If foo is a SHN_ABS, `movq foo@GOTPCREL(%rip), %rax` can't be optimized
> > > by GOTPCRELX (https://sourceware.org/bugzilla/show_bug.cgi?id=25749 binutils>=2.35 will be good)
> > > Many other architectures don't even have a GOT optimization.
> >
> > Urk -- the example given in that bug report isn't even weak. Are you
> > guys proposing to pessimize every access to a global symbol, regardless
> > of visibility, by going through the GOT on the off chance that somebody
> > might define one of them as SHN_ABS? Can we at least gate it behind
> > something like __attribute__((might_be_shn_abs))?
> >
>
> SHN_ABS is typically only used for constants emitted by the linker
> script, so I don't think this is a huge deal.
>
> The example above is not that different from having a statically
> initialized function pointer in your object (which might be NULL), and
> that is something we already have to deal with anyway.
>
> What I was talking about is the tendency of the compiler to assume
> that every function symbol with external linkage is preemptible, and
> the only way to suppress this behavior is to issue a #pragma that can
> be done in code only, not on the compiler command line.
>
Yes, SHN_ABS is rare. But supporting it without an explicit annotation
for the compiler requires pessimizing the common case where the symbol
is _not_ SHN_ABS, because the compiler would have to assume that
everything might be defined as SHN_ABS. This includes accessing a simple
extern int variable.
It would have to generate the GOT-referencing code in all cases where it
doesn't see a strong definition, even with hidden visibility. And on
x86, we'd have to bump the toolchain requirement to at least 2.26 so we
can get the linker relaxations, otherwise you'd have GOT references in
the final code.