The handling of various sections in the VDSO linker script
looks pretty haphazard. This patch series cleans it up in
these regards:
- improve the coding style
- remove superfluous sections
- issue a linker error if a section is encountered which
is known not to work
- check that the .got section is empty, except for the
three entries defined by the ABI
- discard sections which are not useful to user-space
Petr Tesarik (8):
x86: Adjust the coding style of vdso-layout.lds.S
x86: Remove .sdata from the vDSO linker script
x86: Remove .dynbss from the vDSO linker script
x86: add .broken section to the vDSO linker script
x86: mark altinstr-related sections in vDSO as broken
x86: mark some standard sections as broken in a vDSO
x86: check the size of GOT in vDSO
x86: remove unneeded section from the vDSO
arch/x86/vdso/Makefile | 5 +-
arch/x86/vdso/vdso-layout.lds.S | 146 +++++++++++++++++++++++++++++++-------
2 files changed, 122 insertions(+), 29 deletions(-)
This section should not contain anything unless the vDSO is
broken. It is intended to hold sections which are known not
to work correctly inside a vDSO.
If such a "broken" section is found at link-time, the linker
issues an error with a hint how to find out more.
Signed-off-by: Petr Tesarik <[email protected]>
---
arch/x86/vdso/vdso-layout.lds.S | 22 ++++++++++++++++++++++
1 files changed, 22 insertions(+), 0 deletions(-)
diff --git a/arch/x86/vdso/vdso-layout.lds.S b/arch/x86/vdso/vdso-layout.lds.S
index c7fa74c..7a76159 100644
--- a/arch/x86/vdso/vdso-layout.lds.S
+++ b/arch/x86/vdso/vdso-layout.lds.S
@@ -72,8 +72,30 @@ SECTIONS
.text : {
*(.text*)
} :text =0x90909090
+
+ /* Some sections require some additional work from the dynamic
+ * linker to work properly. However, there is no dynamic-link
+ * pass for the vDSO, so these sections will not work.
+ * Put them into a special section and raise a link-time error
+ * if they get used by accident.
+ */
+ .broken : {
+ }
}
+/* This assert is triggered if the linker finds a section in one of its
+ * input files which is known not to work inside a vDSO.
+ *
+ * To see which is the offending sections, you may:
+ * a. use objdump -h on the object files which make up the vDSO, or
+ * b. add -Wl,-M to VDSO_LDFLAGS and examine the linker map.
+ *
+ * See individual comments in the definition of the .broken section
+ * above for more information on why any given section is considered
+ * "broken".
+ */
+ASSERT(!SIZEOF(.broken), "The vdso linker script found a section that is bad. See vdso-layout.lds.S for details.");
+
/*
* Very old versions of ld do not recognize this name token; use the constant.
*/
--
1.6.0.2
As far as I can see, the .dynbss section does not make any sense
and cannot even appear in a shared library.
Remove it.
Signed-off-by: Petr Tesarik <[email protected]>
---
arch/x86/vdso/vdso-layout.lds.S | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/arch/x86/vdso/vdso-layout.lds.S b/arch/x86/vdso/vdso-layout.lds.S
index af3fa61..c7fa74c 100644
--- a/arch/x86/vdso/vdso-layout.lds.S
+++ b/arch/x86/vdso/vdso-layout.lds.S
@@ -53,7 +53,6 @@ SECTIONS
*(.got.plt) *(.got)
*(.gnu.linkonce.d.*)
*(.bss*)
- *(.dynbss*)
*(.gnu.linkonce.b.*)
}
--
1.6.0.2
The style we try to introduce for .lds files in
arch/$ARCH/kernel/vmlinux.lds.S is much more C-like.
Use the same style in the vDSO linker script to get a consistent
style in linker scripts.
Credits to Sam Ravnborg for suggesting this change.
Signed-off-by: Petr Tesarik <[email protected]>
---
arch/x86/vdso/vdso-layout.lds.S | 76 +++++++++++++++++++++++++++------------
1 files changed, 53 insertions(+), 23 deletions(-)
diff --git a/arch/x86/vdso/vdso-layout.lds.S b/arch/x86/vdso/vdso-layout.lds.S
index 634a2cf..917df03 100644
--- a/arch/x86/vdso/vdso-layout.lds.S
+++ b/arch/x86/vdso/vdso-layout.lds.S
@@ -8,34 +8,62 @@ SECTIONS
{
. = VDSO_PRELINK + SIZEOF_HEADERS;
- .hash : { *(.hash) } :text
- .gnu.hash : { *(.gnu.hash) }
- .dynsym : { *(.dynsym) }
- .dynstr : { *(.dynstr) }
- .gnu.version : { *(.gnu.version) }
- .gnu.version_d : { *(.gnu.version_d) }
- .gnu.version_r : { *(.gnu.version_r) }
+ .hash : {
+ *(.hash)
+ } :text
+ .gnu.hash : {
+ *(.gnu.hash)
+ }
+ .dynsym : {
+ *(.dynsym)
+ }
+ .dynstr : {
+ *(.dynstr)
+ }
+ .gnu.version : {
+ *(.gnu.version)
+ }
+ .gnu.version_d : {
+ *(.gnu.version_d)
+ }
+ .gnu.version_r : {
+ *(.gnu.version_r)
+ }
- .note : { *(.note.*) } :text :note
+ .note : {
+ *(.note.*)
+ } :text :note
- .eh_frame_hdr : { *(.eh_frame_hdr) } :text :eh_frame_hdr
- .eh_frame : { KEEP (*(.eh_frame)) } :text
+ .eh_frame_hdr : {
+ *(.eh_frame_hdr)
+ } :text :eh_frame_hdr
+ .eh_frame : {
+ KEEP (*(.eh_frame))
+ } :text
- .dynamic : { *(.dynamic) } :text :dynamic
+ .dynamic : {
+ *(.dynamic)
+ } :text :dynamic
- .rodata : { *(.rodata*) } :text
- .data : {
- *(.data*)
- *(.sdata*)
- *(.got.plt) *(.got)
- *(.gnu.linkonce.d.*)
- *(.bss*)
- *(.dynbss*)
- *(.gnu.linkonce.b.*)
+ .rodata : {
+ *(.rodata*)
+ } :text
+ .data : {
+ *(.data*)
+ *(.sdata*)
+ *(.got.plt) *(.got)
+ *(.gnu.linkonce.d.*)
+ *(.bss*)
+ *(.dynbss*)
+ *(.gnu.linkonce.b.*)
}
- .altinstructions : { *(.altinstructions) }
- .altinstr_replacement : { *(.altinstr_replacement) }
+ .altinstructions : {
+ *(.altinstructions)
+ }
+ .altinstr_replacement : {
+ *(.altinstr_replacement)
+ }
/*
* Align the actual code well away from the non-instruction data.
@@ -43,7 +71,9 @@ SECTIONS
*/
. = ALIGN(0x100);
- .text : { *(.text*) } :text =0x90909090
+ .text : {
+ *(.text*)
+ } :text =0x90909090
}
/*
--
1.6.0.2
Some standard sections are not useful in the resulting binary.
.gnu.warning.* is only interesting for the linker.
.note.* may be discarded, except:
.note.Linux which contains the Linux version
.note.gnu.build-id which may be useful for identifying
the vDSO
Signed-off-by: Petr Tesarik <[email protected]>
---
arch/x86/vdso/vdso-layout.lds.S | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/arch/x86/vdso/vdso-layout.lds.S b/arch/x86/vdso/vdso-layout.lds.S
index 3b56e2f..b70c27d 100644
--- a/arch/x86/vdso/vdso-layout.lds.S
+++ b/arch/x86/vdso/vdso-layout.lds.S
@@ -31,7 +31,8 @@ SECTIONS
}
.note : {
- *(.note.*)
+ *(.note.Linux)
+ *(.note.gnu.build-id)
} :text :note
.eh_frame_hdr : {
@@ -100,6 +101,12 @@ SECTIONS
*(.altinstructions)
*(.altinstr_replacement)
}
+
+ /* These sections are not useful to user-space */
+ /DISCARD/ : {
+ *(.gnu.warning.*)
+ *(.note.*)
+ }
}
/* This assert is triggered if the linker finds a section in one of its
--
1.6.0.2
Some standard sections can't work in a vDSO. Although most of them
are not very likely to ever appear there, it doesn't hurt to list
them as a precaution.
Signed-off-by: Petr Tesarik <[email protected]>
---
arch/x86/vdso/vdso-layout.lds.S | 19 +++++++++++++++++++
1 files changed, 19 insertions(+), 0 deletions(-)
diff --git a/arch/x86/vdso/vdso-layout.lds.S b/arch/x86/vdso/vdso-layout.lds.S
index fc2ad07..ffd17e8 100644
--- a/arch/x86/vdso/vdso-layout.lds.S
+++ b/arch/x86/vdso/vdso-layout.lds.S
@@ -73,6 +73,25 @@ SECTIONS
* if they get used by accident.
*/
.broken : {
+ /* Code in the Procedure Linkage Table will segfault */
+ *(.plt)
+
+ /* Relocation will not be done, so any pointers will
+ * still point to the prelinked address, which is wrong
+ */
+ *(.data.rel.ro*)
+ *(.gnu.linkonce.d.rel.ro.*)
+
+ /* Initialization/termination won't work this way */
+ *(.init) *(.fini)
+ *(.preinit_array) *(.init_array*)
+ *(.fini_array*)
+
+ /* Thread-local storage cannot be defined like this */
+ *(.tdata .tdata.* .gnu.linkonce.td.*)
+ *(.tbss .tbss.* .gnu.linkonce.tb.*)
+ *(.tcommon)
+
/* The vDSO setup code does not handle alternative
* instructions.
*/
--
1.6.0.2
There are several reasons why this does not work for the vDSO.
1. The alt_instr records are not processed by anybody.
The .altinstruction sections end up in the vDSO binary, which
is then incorporated verbatim inside a completely different
section, so it is not seen by the main kernel.
There is no code to interpret them during vDSO setup either.
2. The pointers in the alt_instr records are wrong.
They point to the instructions and replacements at the prelinked
addresses, but that's neither where the vDSO blob is located at
startup, nor where it is copied to by the vDSO setup.
The .altinstr_replacement section is not broken per se, but it is
useless dead code without the accompanying .alt_instr section.
Signed-off-by: Petr Tesarik <[email protected]>
---
arch/x86/vdso/vdso-layout.lds.S | 12 +++++-------
1 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/arch/x86/vdso/vdso-layout.lds.S b/arch/x86/vdso/vdso-layout.lds.S
index 7a76159..fc2ad07 100644
--- a/arch/x86/vdso/vdso-layout.lds.S
+++ b/arch/x86/vdso/vdso-layout.lds.S
@@ -56,13 +56,6 @@ SECTIONS
*(.gnu.linkonce.b.*)
}
- .altinstructions : {
- *(.altinstructions)
- }
- .altinstr_replacement : {
- *(.altinstr_replacement)
- }
-
/*
* Align the actual code well away from the non-instruction data.
* This is the best thing for the I-cache.
@@ -80,6 +73,11 @@ SECTIONS
* if they get used by accident.
*/
.broken : {
+ /* The vDSO setup code does not handle alternative
+ * instructions.
+ */
+ *(.altinstructions)
+ *(.altinstr_replacement)
}
}
--
1.6.0.2
There is no .sdata section on either i386 or x86_64.
The existence of this line probably dates back to the (long abandoned)
idea that x86_64 would have .sdata for "small" data and .data for large
data. However, the final ABI has .data for "normal" data and .ldata
for large data, so this line is no longer pertinent.
---
arch/x86/vdso/vdso-layout.lds.S | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/arch/x86/vdso/vdso-layout.lds.S b/arch/x86/vdso/vdso-layout.lds.S
index 917df03..af3fa61 100644
--- a/arch/x86/vdso/vdso-layout.lds.S
+++ b/arch/x86/vdso/vdso-layout.lds.S
@@ -50,7 +50,6 @@ SECTIONS
} :text
.data : {
*(.data*)
- *(.sdata*)
*(.got.plt) *(.got)
*(.gnu.linkonce.d.*)
*(.bss*)
--
1.6.0.2
There should be no real entries in the GOT, because they are basically
pointers to dynamic symbols, and that will not work correctly without
a real dynamic linker for the vDSO.
However, the ABI pre-defines three entries in the GOT which are always
present, so the GOT section is never completely empty. We can check
that there are no extra entries beyond these three.
To make it work:
- move the GOT into a separate section
- check the size of that section
- pass -m32 or -m64 to the pre-processor to get the correct
definition of __SIZEOF_POINTER__
Signed-off-by: Petr Tesarik <[email protected]>
---
arch/x86/vdso/Makefile | 5 +++--
arch/x86/vdso/vdso-layout.lds.S | 20 +++++++++++++++++++-
2 files changed, 22 insertions(+), 3 deletions(-)
diff --git a/arch/x86/vdso/Makefile b/arch/x86/vdso/Makefile
index 16a9020..8c7f06a 100644
--- a/arch/x86/vdso/Makefile
+++ b/arch/x86/vdso/Makefile
@@ -23,7 +23,8 @@ $(obj)/vdso.o: $(obj)/vdso.so
targets += vdso.so vdso.so.dbg vdso.lds $(vobjs-y)
-export CPPFLAGS_vdso.lds += -P -C
+vdso-cppflags = -P -C
+export CPPFLAGS_vdso.lds += -m64 $(vdso-cppflags)
VDSO_LDFLAGS_vdso.lds = -m elf_x86_64 -Wl,-soname=linux-vdso.so.1 \
-Wl,-z,max-page-size=4096 -Wl,-z,common-page-size=4096
@@ -68,7 +69,7 @@ vdso32.so-$(VDSO32-y) += sysenter
vdso32-images = $(vdso32.so-y:%=vdso32-%.so)
-CPPFLAGS_vdso32.lds = $(CPPFLAGS_vdso.lds)
+CPPFLAGS_vdso32.lds = -m32 $(vdso-cppflags)
VDSO_LDFLAGS_vdso32.lds = -m elf_i386 -Wl,-soname=linux-gate.so.1
# This makes sure the $(obj) subdirectory exists even though vdso32/
diff --git a/arch/x86/vdso/vdso-layout.lds.S b/arch/x86/vdso/vdso-layout.lds.S
index ffd17e8..3b56e2f 100644
--- a/arch/x86/vdso/vdso-layout.lds.S
+++ b/arch/x86/vdso/vdso-layout.lds.S
@@ -44,13 +44,15 @@ SECTIONS
.dynamic : {
*(.dynamic)
} :text :dynamic
+ .got : {
+ *(.got.plt) *(.got)
+ } :text
.rodata : {
*(.rodata*)
} :text
.data : {
*(.data*)
- *(.got.plt) *(.got)
*(.gnu.linkonce.d.*)
*(.bss*)
*(.gnu.linkonce.b.*)
@@ -113,6 +115,22 @@ SECTIONS
*/
ASSERT(!SIZEOF(.broken), "The vdso linker script found a section that is bad. See vdso-layout.lds.S for details.");
+/* This assert is triggered if the resulting GOT is larger than the
+ * minimum defined by the ABI, i.e. there is some actual use of the
+ * GOT.
+ *
+ * To find the offending symbols you may:
+ * 1. temporarily disable this check
+ * 2. examine the dynamic relocations of the resulting vDSO with
+ * objdump -R
+ *
+ * To find the places where the symbols were used, you may:
+ * 1. add -Wl,--emit-relocs to VDSO_LDFLAGS
+ * 2. run objdump -r on the resulting vDSO and look for all
+ * GOT-type relocations.
+ */
+ASSERT(SIZEOF(.got) == 3*__SIZEOF_POINTER__, "The vdso linker script found a wrong reference to an external object. See vdso-layout.lds.S for details.");
+
/*
* Very old versions of ld do not recognize this name token; use the constant.
*/
--
1.6.0.2
On Fri, Jun 12, 2009 at 03:40:32PM +0200, Petr Tesarik wrote:
> The handling of various sections in the VDSO linker script
> looks pretty haphazard. This patch series cleans it up in
> these regards:
>
> - improve the coding style
> - remove superfluous sections
> - issue a linker error if a section is encountered which
> is known not to work
> - check that the .got section is empty, except for the
> three entries defined by the ABI
> - discard sections which are not useful to user-space
>
> Petr Tesarik (8):
> x86: Adjust the coding style of vdso-layout.lds.S
> x86: Remove .sdata from the vDSO linker script
> x86: add .broken section to the vDSO linker script
> x86: mark altinstr-related sections in vDSO as broken
> x86: mark some standard sections as broken in a vDSO
> x86: check the size of GOT in vDSO
> x86: remove unneeded section from the vDSO
The above looks good.
Acked-by: Sam Ravnborg <[email protected]>
>
> x86: Remove .dynbss from the vDSO linker script
This one I am a little reluctant about as I do not understand
why ld sometimes adds and sometimes does not add this section.
At least judging from the arch specific linker scripts some archs
include it and others do not.
I grepped the binutils source and bfd/elf64-x86-64.c mentions
dynbss several times.
Maybe just adding it to broken would suffice as we would
then be informed it it is used by some ld versions?
Sam
Sam Ravnborg píše v Pá 12. 06. 2009 v 20:06 +0200:
> On Fri, Jun 12, 2009 at 03:40:32PM +0200, Petr Tesarik wrote:
> > The handling of various sections in the VDSO linker script
> > looks pretty haphazard. This patch series cleans it up in
> > these regards:
> >
> > - improve the coding style
> > - remove superfluous sections
> > - issue a linker error if a section is encountered which
> > is known not to work
> > - check that the .got section is empty, except for the
> > three entries defined by the ABI
> > - discard sections which are not useful to user-space
> >
> > Petr Tesarik (8):
> > x86: Adjust the coding style of vdso-layout.lds.S
> > x86: Remove .sdata from the vDSO linker script
> > x86: add .broken section to the vDSO linker script
> > x86: mark altinstr-related sections in vDSO as broken
> > x86: mark some standard sections as broken in a vDSO
> > x86: check the size of GOT in vDSO
> > x86: remove unneeded section from the vDSO
> The above looks good.
> Acked-by: Sam Ravnborg <[email protected]>
>
> >
> > x86: Remove .dynbss from the vDSO linker script
> This one I am a little reluctant about as I do not understand
> why ld sometimes adds and sometimes does not add this section.
> At least judging from the arch specific linker scripts some archs
> include it and others do not.
The .dynbss section is a place to put symbols which are defined by
dynamic objects, are referenced by regular objects, and are not
functions. Space for these objects is allocated in the process image and
the dynamic linker copies the object data (using a copy relocation).
This is normally not needed, but if the process image uses the address
of a shared object in a read-only section, the link editor (ld) cannot
use the object in the DSO directly (since its address is only known at
runtime).
An example can probably explain it better:
$ cat >shared.c <<EOF
int shared_object[100];
EOF
$ cat >exe.c <<EOF
#include <stdio.h>
extern int shared_object[100];
int main(void) {
printf("%p\n", &shared_object);
return 0;
}
EOF
It seems pretty obvious to me that a .dynbss section can never occur in
a DSO, because:
1. a DSO must be position-independent, i.e. it cannot put absolute
addresses into a constant section
2. even if the DSO could do such a thing, the object is not allocated
in the DSO; consequently, there will never be a copy relocation,
and no need to have any data in .dynbss
I agree that the current documentation of the purpose of various ELF
sections is insufficient, so it's not easy to understand it. But I think
Roland also understands enough of it to confirm my reasoning.
I'm going to publish the knowledge I got during this vDSO excercise
soon.
Petr Tesarik
Petr Tesarik píše v Po 15. 06. 2009 v 17:16 +0200:
> [...]
> An example can probably explain it better:
>
> $ cat >shared.c <<EOF
> int shared_object[100];
> EOF
> $ cat >exe.c <<EOF
> #include <stdio.h>
> extern int shared_object[100];
> int main(void) {
> printf("%p\n", &shared_object);
> return 0;
> }
> EOF
BTW this example demonstrates that there is one section which should be
included in .data but is not:
*(COMMON)
I'll send a patch later.
Petr Tesarik
Petr Tesarik wrote:
> Petr Tesarik píše v Po 15. 06. 2009 v 17:16 +0200:
>> [...]
>> An example can probably explain it better:
>>
>> $ cat >shared.c <<EOF
>> int shared_object[100];
>> EOF
>> $ cat >exe.c <<EOF
>> #include <stdio.h>
>> extern int shared_object[100];
>> int main(void) {
>> printf("%p\n", &shared_object);
>> return 0;
>> }
>> EOF
>
> BTW this example demonstrates that there is one section which should be
> included in .data but is not:
>
> *(COMMON)
>
> I'll send a patch later.
>
*(COMMON) is BSS, not data...
-hpa
H. Peter Anvin píše v Po 15. 06. 2009 v 09:51 -0700:
> Petr Tesarik wrote:
> > Petr Tesarik píše v Po 15. 06. 2009 v 17:16 +0200:
> >> [...]
> >> An example can probably explain it better:
> >>
> >> $ cat >shared.c <<EOF
> >> int shared_object[100];
> >> EOF
> >> $ cat >exe.c <<EOF
> >> #include <stdio.h>
> >> extern int shared_object[100];
> >> int main(void) {
> >> printf("%p\n", &shared_object);
> >> return 0;
> >> }
> >> EOF
> >
> > BTW this example demonstrates that there is one section which should be
> > included in .data but is not:
> >
> > *(COMMON)
> >
> > I'll send a patch later.
> >
>
> *(COMMON) is BSS, not data...
Very true, but for the vDSO we decided to put both writeable and
read-only data into one section (called .data for that matter), probably
to reduce the number of sections and hence also the size of the
resulting binary.
Petr Tesarik
Petr Tesarik wrote:
>>>
>> *(COMMON) is BSS, not data...
>
> Very true, but for the vDSO we decided to put both writeable and
> read-only data into one section (called .data for that matter), probably
> to reduce the number of sections and hence also the size of the
> resulting binary.
>
BSS is neither, but I guess for the vDSO there really isn't any such
thing as uninitialized content.
I have to admit feeling funny about that, and I'm wondering if we
shouldn't compile the vDSO with -fno-common.
-hpa
H. Peter Anvin píše v Po 15. 06. 2009 v 11:33 -0700:
> Petr Tesarik wrote:
> >>>
> >> *(COMMON) is BSS, not data...
> >
> > Very true, but for the vDSO we decided to put both writeable and
> > read-only data into one section (called .data for that matter), probably
> > to reduce the number of sections and hence also the size of the
> > resulting binary.
> >
>
> BSS is neither,
Right. My typing was once again faster than my thinking. I meant
initialized vs. uninitialized, of course.
> but I guess for the vDSO there really isn't any such
> thing as uninitialized content.
Right, too. I can't even think of a valid use case, so .broken might be
the right place for both COMMON and .bss.
> I have to admit feeling funny about that, and I'm wondering if we
> shouldn't compile the vDSO with -fno-common.
Oops, sorry. It's already compiled with -fno-common, because it gets
inherited from the top-level Makefile's KBUILD_CFLAGS. Sorry for the
noise.
Anyway, my feeling is that the whole discussion is a bit academic. If I
want to be rigorous, I should take an opt-in approach, i.e. handle all
sections that work fine (e.g. also debugging sections) and then put all
else into .broken with something like:
.broken {
/* All else is dubious. */
*(*)
}
But this can become a maintenance PITA. Whenever GCC and/or binutils add
a new extension, the linker script would have to be adjusted
accordingly. Well, maybe that's even correct, because somebody at least
stops and thinks for a while about the implications of the new feature.
But I'm not really offering to become the new maintainer of this file.
Comments?
Petr Tesarik
If you are going to change the whitespace to match canonical C conventions,
then please do the same with the style of C-like comments.
/*
* Like this.
*/
Aside from that, the cosmetic changes are all fine by me.
The vDSO is unlike a normal DSO in that it has no writable PT_LOAD segment.
(This is the ELF way of expressing the fact that there is just one
contiguous read-only mapping of the whole vDSO image done by the kernel,
which is not in fact driven by any ELF headers at all.)
If any code in the vDSO uses any writable data sections, it will break at
runtime if nothing else. So indeed there should not be any such sections
in the input to the vDSO link.
It may well be true that all the non-.got* sections in the script now have
never actually appeared in a vDSO link. When I first wrote the linker
script to build a vDSO, it was by editting down the default -shared script
built into the ld of the day, not by naming only the sections that would
actually appear in my build at the time. It does no harm whatsoever to
match more input section names in the script, so I don't really see any
reason to remove any. (Probably nobody's compiler is emitting useless
1-word ones of them or whatnot, but who knows.) If you want to move them
all into a checked .broken sort of thing, that is fine.
The output sections (number of them, and cumulative length of their names)
contribute to the overall size of the whole vDSO file (and thus hasten the
day it pushes over another page boundary). So we should strive never to
add any that we can avoid.
The various ones from the beginning through .dynamic have to be separate
output sections for ld -shared to do the right thing (last we tried).
All of those names are magic, except for .note (its name doesn't matter,
just that it's a separate output section to have :note).
The reason .rodata is separate from .text is for the alignment of the code
section away from non-code data.
The reason .data is separate from .rodata is IIRC just to avoid its bogus
writable input sections "polluting" the output section with an SHF_WRITE,
which makes some tools or ELF checkers unhappy or something like that.
As I said, the stuff in .data probably doesn't exist except for .got*.
(But we're not really positive that various of the others don't exist with
some old tools or others.) The .got* sections have to be in some output
section (that is thus a writable section), because ld -shared wants to
generate them and freaks out if they are not in the script or are placed in
/DISCARD/ (or that was the state of ld when I first wrote the linker script).
You probably still can't get a way to make the .got* input sections
disappear entirely even though they are completely useless in the vDSO.
It's best to keep them placed after .rodata as it was (you gave no reason
for the effective reordering in 7/8), so that the useless garbage words are
last and with luck all the cache lines consumed are packed with the real
data that is ever actually read.
You added an output section just so you could use SIZEOF() on it.
I think you can do that same calculation for ASSERT() in a different way:
__start_got = .;
*(.got.plt) *(.got)
__stop_got = .;
There may also be a different way to do that check.
If the GOT were used, there would be dynamic relocs.
So readelf/objdump -r on vdso*.so.dbg should be empty, you can check.
Or it probably works too just to include *(.rel.*) *(.rela.*) in
.broken to ensure there are no reloc sections generated.
OTOH, I guess a non-broken link produces no .broken output section at all
(since it's empty) and so you've just done s/.data/.got/, which saves a
byte. So that's not so bad. Hey, about you show us the sizes of vdso*.so
(whole file sizes) before & after the series?
I see no rationale for changing the treatment of .note.* input sections.
Why do you think you should list two we use now and discard all others?
That makes no sense to me. Any .note.* sections we link into the vDSO, we
want in the vDSO PT_NOTE segment. What problem does this change solve?
Thanks,
Roland
Upfront, thank you very much for reviewing the series, Roland!
Roland McGrath píše v Út 16. 06. 2009 v 01:40 -0700:
> If you are going to change the whitespace to match canonical C conventions,
> then please do the same with the style of C-like comments.
>
> /*
> * Like this.
> */
Ah, the missing first almost empty line. Ok, didn't know the coding
style was that much strict...
> Aside from that, the cosmetic changes are all fine by me.
>
> The vDSO is unlike a normal DSO in that it has no writable PT_LOAD segment.
> (This is the ELF way of expressing the fact that there is just one
> contiguous read-only mapping of the whole vDSO image done by the kernel,
> which is not in fact driven by any ELF headers at all.)
>
> If any code in the vDSO uses any writable data sections, it will break at
> runtime if nothing else. So indeed there should not be any such sections
> in the input to the vDSO link.
And there currently isn't one. But the writability of the vDSO is a bit
more complicated. True, the VMA is initially mapped read-only, but it is
possible to make it writable using mprotect(). You may also set an
INT3-style breakpoint in the vDSO area. So, it's all about policy. If
there were any demand to make the vDSO writable, it could be done with a
trivial patch.
If we do not _WANT_ to make it writable, then we can simply add .data
and .bss to .broken.
> [...]
> The output sections (number of them, and cumulative length of their names)
> contribute to the overall size of the whole vDSO file (and thus hasten the
> day it pushes over another page boundary). So we should strive never to
> add any that we can avoid.
Fine with me. BTW what do you mean by "another page boundary"? The vDSO
currently fits comfortably into a single page.
> The various ones from the beginning through .dynamic have to be separate
> output sections for ld -shared to do the right thing (last we tried).
> All of those names are magic, except for .note (its name doesn't matter,
> just that it's a separate output section to have :note).
>
> The reason .rodata is separate from .text is for the alignment of the code
> section away from non-code data.
>
> The reason .data is separate from .rodata is IIRC just to avoid its bogus
> writable input sections "polluting" the output section with an SHF_WRITE,
> which makes some tools or ELF checkers unhappy or something like that.
>
> As I said, the stuff in .data probably doesn't exist except for .got*.
> (But we're not really positive that various of the others don't exist with
> some old tools or others.) The .got* sections have to be in some output
> section (that is thus a writable section), because ld -shared wants to
> generate them and freaks out if they are not in the script or are placed in
> /DISCARD/ (or that was the state of ld when I first wrote the linker script).
>
> You probably still can't get a way to make the .got* input sections
> disappear entirely even though they are completely useless in the vDSO.
> It's best to keep them placed after .rodata as it was (you gave no reason
> for the effective reordering in 7/8), so that the useless garbage words are
> last and with luck all the cache lines consumed are packed with the real
> data that is ever actually read.
For a moment I thought about removing them with objcopy -R .got (because
you run objcopy on the vDSO anyway), but for some reason (buggy
binutils?) this attempt produces ridiculous output (>1M file). So you
are right.
> You added an output section just so you could use SIZEOF() on it.
> I think you can do that same calculation for ASSERT() in a different way:
>
> __start_got = .;
> *(.got.plt) *(.got)
> __stop_got = .;
>
> There may also be a different way to do that check.
> If the GOT were used, there would be dynamic relocs.
> So readelf/objdump -r on vdso*.so.dbg should be empty, you can check.
> Or it probably works too just to include *(.rel.*) *(.rela.*) in
> .broken to ensure there are no reloc sections generated.
Yes, all .rel* sections should be in .broken, I guess.
> OTOH, I guess a non-broken link produces no .broken output section at all
> (since it's empty) and so you've just done s/.data/.got/, which saves a
> byte. So that's not so bad. Hey, about you show us the sizes of vdso*.so
> (whole file sizes) before & after the series?
No change. Before:
ptesarik@nathan:~/linux-2.6> ls -l arch/x86/vdso/vdso.so
-rw-r--r-- 1 ptesarik users 3624 Jun 16 11:51 arch/x86/vdso/vdso.so
After:
ptesarik@nathan:~/linux-2.6> ls -l arch/x86/vdso/vdso.so
-rw-r--r-- 1 ptesarik users 3624 Jun 16 11:52 arch/x86/vdso/vdso.so
> I see no rationale for changing the treatment of .note.* input sections.
> Why do you think you should list two we use now and discard all others?
> That makes no sense to me. Any .note.* sections we link into the vDSO, we
> want in the vDSO PT_NOTE segment. What problem does this change solve?
This was an attempt to discard .note.GNU-stack, which is what the
default linker script does, but it appears that all .note.GNU-stack
sections are zero-size, so this change does not make any difference.
Petr Tesarik
Sam Ravnborg píše v Pá 12. 06. 2009 v 20:06 +0200:
> On Fri, Jun 12, 2009 at 03:40:32PM +0200, Petr Tesarik wrote:
> > The handling of various sections in the VDSO linker script
> > looks pretty haphazard. This patch series cleans it up in
> > these regards:
> >
> > - improve the coding style
> > - remove superfluous sections
> > - issue a linker error if a section is encountered which
> > is known not to work
> > - check that the .got section is empty, except for the
> > three entries defined by the ABI
> > - discard sections which are not useful to user-space
> >
> > Petr Tesarik (8):
> > x86: Adjust the coding style of vdso-layout.lds.S
> > x86: Remove .sdata from the vDSO linker script
> > x86: add .broken section to the vDSO linker script
> > x86: mark altinstr-related sections in vDSO as broken
> > x86: mark some standard sections as broken in a vDSO
> > x86: check the size of GOT in vDSO
> > x86: remove unneeded section from the vDSO
> The above looks good.
> Acked-by: Sam Ravnborg <[email protected]>
What shall I do? Since Roland has some good ideas for improvement,
should I post a new patch series (v3), or should I add on top of the
acked one?
Petr Tesarik
> What shall I do? Since Roland has some good ideas for improvement,
> should I post a new patch series (v3), or should I add on top of the
> acked one?
I'd prefer to see a replacement series.
Thanks,
Roland
> Upfront, thank you very much for reviewing the series, Roland!
I'm glad to help.
> And there currently isn't one. But the writability of the vDSO is a bit
> more complicated. True, the VMA is initially mapped read-only, but it is
> possible to make it writable using mprotect().
Except for CONFIG_COMPAT_VDSO on X86_32.
> You may also set an
> INT3-style breakpoint in the vDSO area. So, it's all about policy. If
> there were any demand to make the vDSO writable, it could be done with a
> trivial patch.
There's a big difference between "writable" and "might be written".
> If we do not _WANT_ to make it writable, then we can simply add .data
> and .bss to .broken.
That is the correct thing to do.
> Yes, all .rel* sections should be in .broken, I guess.
You can toss an "void *x = &x;" into the vdso sources to verify that this
setup makes that explode gracefully.
> This was an attempt to discard .note.GNU-stack, which is what the
> default linker script does, but it appears that all .note.GNU-stack
> sections are zero-size, so this change does not make any difference.
Indeed it does not.
Thanks,
Roland