2008-07-01 22:33:43

by Denys Vlasenko

Subject: [PATCH 1/23] make section names compatible with -ffunction-sections -fdata-sections

Hi Andrew, folks,

I am unsure how to synchronize propagation of these patches
across all architectures.

Andrew, how can this be done without causing lots of pain
for arch maintainers? Please advise.

The purpose of these patches is to make the kernel buildable
with "gcc -ffunction-sections -fdata-sections".

Newer gcc and binutils can remove dead code and data
at link time. This is achieved by combining the
-ffunction-sections and -fdata-sections options for gcc with
--gc-sections for ld.

Theory of operation:

-ffunction-sections instructs gcc to place each function
(including static ones) in its own section named .text.function_name
instead of placing all functions in one big .text section.

At link time, ld normally coalesces all such sections into one
output section .text again. This is achieved by having a *(.text.*) spec
along with the *(.text) spec in the built-in linker scripts.

If ld is invoked with --gc-sections, it tracks references, starting
from the entry point, and marks all input sections which are reachable
from there. Then it discards all input sections which are not marked.

This does not buy much if you have one big .text section per .o module,
because even one referenced function will pull in the entire section.
You need -ffunction-sections in order to split .text into per-function
sections and make --gc-sections much more useful.

-fdata-sections is analogous: it places each global or static variable
into .data.variable_name, .rodata.variable_name or .bss.variable_name.

If we ever want to use the described mechanism, we need to adapt
existing code to the new section names. Basically, we need to stop using
section names of the form
.text.xxxx
.data.xxxx
.rodata.xxxx
.bss.xxxx
in the kernel - otherwise section placement done by the kernel's
custom linker scripts produces broken vmlinux and vdso images.

The following patches fix section names, one per architecture.

The patch in _this_ mail fixes the generic part.

Signed-off-by: Denys Vlasenko <[email protected]>
--
vda


--- 0.org/Documentation/mutex-design.txt Wed Jul 2 00:40:39 2008
+++ 1.fixname/Documentation/mutex-design.txt Wed Jul 2 00:44:40 2008
@@ -66,14 +66,14 @@

c0377ccb <mutex_lock>:
c0377ccb: f0 ff 08 lock decl (%eax)
- c0377cce: 78 0e js c0377cde <.text.lock.mutex>
+ c0377cce: 78 0e js c0377cde <.lock.mutex.text>
c0377cd0: c3 ret

the unlocking fastpath is equally tight:

c0377cd1 <mutex_unlock>:
c0377cd1: f0 ff 00 lock incl (%eax)
- c0377cd4: 7e 0f jle c0377ce5 <.text.lock.mutex+0x7>
+ c0377cd4: 7e 0f jle c0377ce5 <.lock.mutex.text+0x7>
c0377cd6: c3 ret

- 'struct mutex' semantics are well-defined and are enforced if
--- 0.org/include/asm-generic/vmlinux.lds.h Wed Jul 2 00:40:50 2008
+++ 1.fixname/include/asm-generic/vmlinux.lds.h Wed Jul 2 00:54:09 2008
@@ -41,7 +41,7 @@
/* .data section */
#define DATA_DATA \
*(.data) \
- *(.data.init.refok) \
+ *(.init.refok.data) \
*(.ref.data) \
DEV_KEEP(init.data) \
DEV_KEEP(exit.data) \
@@ -206,8 +206,8 @@
ALIGN_FUNCTION(); \
*(.text) \
*(.ref.text) \
- *(.text.init.refok) \
- *(.exit.text.refok) \
+ *(.init.refok.text) \
+ *(.exit.refok.text) \
DEV_KEEP(init.text) \
DEV_KEEP(exit.text) \
CPU_KEEP(init.text) \
@@ -347,8 +347,8 @@
#define PERCPU(align) \
. = ALIGN(align); \
__per_cpu_start = .; \
- .data.percpu : AT(ADDR(.data.percpu) - LOAD_OFFSET) { \
- *(.data.percpu) \
- *(.data.percpu.shared_aligned) \
+ .percpu.data : AT(ADDR(.percpu.data) - LOAD_OFFSET) { \
+ *(.percpu.data) \
+ *(.percpu.shared_aligned.data) \
} \
__per_cpu_end = .;
--- 0.org/include/linux/cache.h Wed Jul 2 00:40:51 2008
+++ 1.fixname/include/linux/cache.h Wed Jul 2 00:45:51 2008
@@ -31,7 +31,7 @@
#ifndef __cacheline_aligned
#define __cacheline_aligned \
__attribute__((__aligned__(SMP_CACHE_BYTES), \
- __section__(".data.cacheline_aligned")))
+ __section__(".cacheline_aligned.data")))
#endif /* __cacheline_aligned */

#ifndef __cacheline_aligned_in_smp
--- 0.org/include/linux/init.h Wed Jul 2 00:40:51 2008
+++ 1.fixname/include/linux/init.h Wed Jul 2 00:54:13 2008
@@ -62,9 +62,9 @@

/* backward compatibility note
* A few places hardcode the old section names:
- * .text.init.refok
- * .data.init.refok
- * .exit.text.refok
+ * .init.refok.text
+ * .init.refok.data
+ * .exit.refok.text
* They should be converted to use the defines from this file
*/

@@ -299,7 +299,7 @@
#endif

/* Data marked not to be saved by software suspend */
-#define __nosavedata __section(.data.nosave)
+#define __nosavedata __section(.nosave.data)

/* This means "can be init if no module support, otherwise module load
may call it." */
--- 0.org/include/linux/percpu.h Wed Jul 2 00:40:51 2008
+++ 1.fixname/include/linux/percpu.h Wed Jul 2 00:45:39 2008
@@ -10,13 +10,13 @@

#ifdef CONFIG_SMP
#define DEFINE_PER_CPU(type, name) \
- __attribute__((__section__(".data.percpu"))) \
+ __attribute__((__section__(".percpu.data"))) \
PER_CPU_ATTRIBUTES __typeof__(type) per_cpu__##name

#ifdef MODULE
-#define SHARED_ALIGNED_SECTION ".data.percpu"
+#define SHARED_ALIGNED_SECTION ".percpu.data"
#else
-#define SHARED_ALIGNED_SECTION ".data.percpu.shared_aligned"
+#define SHARED_ALIGNED_SECTION ".percpu.shared_aligned.data"
#endif

#define DEFINE_PER_CPU_SHARED_ALIGNED(type, name) \
--- 0.org/include/linux/spinlock.h Wed Jul 2 00:40:51 2008
+++ 1.fixname/include/linux/spinlock.h Wed Jul 2 00:44:40 2008
@@ -59,7 +59,7 @@
/*
* Must define these before including other files, inline functions need them
*/
-#define LOCK_SECTION_NAME ".text.lock."KBUILD_BASENAME
+#define LOCK_SECTION_NAME ".lock.text."KBUILD_BASENAME

#define LOCK_SECTION_START(extra) \
".subsection 1\n\t" \
--- 0.org/kernel/module.c Wed Jul 2 00:40:51 2008
+++ 1.fixname/kernel/module.c Wed Jul 2 00:45:39 2008
@@ -433,7 +433,7 @@
Elf_Shdr *sechdrs,
const char *secstrings)
{
- return find_sec(hdr, sechdrs, secstrings, ".data.percpu");
+ return find_sec(hdr, sechdrs, secstrings, ".percpu.data");
}

static void percpu_modcopy(void *pcpudest, const void *from, unsigned long size)
--- 0.org/scripts/mod/modpost.c Wed Jul 2 00:40:54 2008
+++ 1.fixname/scripts/mod/modpost.c Wed Jul 2 00:54:21 2008
@@ -794,9 +794,9 @@
/* sections that may refer to an init/exit section with no warning */
static const char *initref_sections[] =
{
- ".text.init.refok*",
- ".exit.text.refok*",
- ".data.init.refok*",
+ ".init.refok.text*",
+ ".exit.refok.text*",
+ ".init.refok.data*",
NULL
};

@@ -915,7 +915,7 @@
* Pattern 0:
* Do not warn if funtion/data are marked with __init_refok/__initdata_refok.
* The pattern is identified by:
- * fromsec = .text.init.refok* | .data.init.refok*
+ * fromsec = .init.refok.text* | .init.refok.data*
*
* Pattern 1:
* If a module parameter is declared __initdata and permissions=0
@@ -939,8 +939,8 @@
* *probe_one, *_console, *_timer
*
* Pattern 3:
- * Whitelist all refereces from .text.head to .init.data
- * Whitelist all refereces from .text.head to .init.text
+ * Whitelist all refereces from .head.text to .init.data
+ * Whitelist all refereces from .head.text to .init.text
*
* Pattern 4:
* Some symbols belong to init section but still it is ok to reference


2008-07-01 22:59:55

by Valdis Klētnieks

Subject: Re: [PATCH 1/23] make section names compatible with -ffunction-sections -fdata-sections

On Wed, 02 Jul 2008 02:33:48 +0200, Denys Vlasenko said:

> The purpose of these patches is to make kernel buildable
> with "gcc -ffunction-sections -fdata-sections".
>
> Newer gcc and binutils can do dead code and data removal
> at link time. It is achieved using combination of
> -ffunction-sections -fdata-sections options for gcc and
> --gc-sections for ld.

Interesting idea. Do you happen to have before-and-after 'size vmlinux'
numbers to show how much space is actually reclaimed?



2008-07-02 00:04:21

by Denys Vlasenko

Subject: Re: [PATCH 1/23] make section names compatible with -ffunction-sections -fdata-sections

On Wednesday 02 July 2008 00:56, [email protected] wrote:
> On Wed, 02 Jul 2008 02:33:48 +0200, Denys Vlasenko said:
>
> > The purpose of these patches is to make kernel buildable
> > with "gcc -ffunction-sections -fdata-sections".
> >
> > Newer gcc and binutils can do dead code and data removal
> > at link time. It is achieved using combination of
> > -ffunction-sections -fdata-sections options for gcc and
> > --gc-sections for ld.
>
> Interesting idea. Do you happen to have before-and-after 'size vmlinux'
> numbers to show how much space is actually reclaimed?

After this patch there will be no change - it does not do
dead code and data removal. I submitted a bigger change before,
but it was probably too big to digest.

That earlier version achieved a ~10% kernel size reduction
when the kernel is built without loadable module support
(loadable modules interfere with the linker's dead code
and data removal; some rather contrived magic is needed
to make it work there too. Left as a TODO for later).
--
vda

Subject: Re: [PATCH 1/23] make section names compatible with -ffunction-sections -fdata-sections

On Wed, 2 Jul 2008 02:33:48 +0200
Denys Vlasenko <[email protected]> wrote:

> Hi Andrew, folks,
>
> I am unsure how to synchronize propagation of these patches
> across all architectures.
>
> Andrew, how this can be done without causing lots of pain
> for arch maintainers? Please advise.

Hi,

AFAICS, there is a lot of code in .lds.S files which really is
arch-independent, but still is duplicated in every arch. Kinda messy to
change anything in there.

I noticed this while writing another patch, namely early (pre-SMP)
initcall support. Fortunately, there was a generic header included by
all .lds.S files and I could fit my modification in there.

My suggestion is (for both you and arch maintainers)... why not make an
effort to reduce code duplication in these files? Life would be so much
easier. The idea is:
- Write a macro to define all generic sections, possibly taking in
alignment as an argument.
- Have each arch's .lds.S file define arch-dependent stuff and use that
macro for generic sections.

This would surely be immediately useful, more readily accepted by
maintainers and would open up the way for a lighter version of your
patch, IMO.


Cheers,
Eduard

2008-07-02 04:37:36

by Andrew Morton

Subject: Re: [PATCH 1/23] make section names compatible with -ffunction-sections -fdata-sections

On Wed, 2 Jul 2008 02:33:48 +0200 Denys Vlasenko <[email protected]> wrote:

> Hi Andrew, folks,
>
> I am unsure how to synchronize propagation of these patches
> across all architectures.
>
> Andrew, how this can be done without causing lots of pain
> for arch maintainers? Please advise.

You didn't describe the problem which you're trying to solve, so how
can I say?

Possibilities are:

a) the generic bit depends on the arch bits

-> No probs. I can merge the generic bit once all architectures are in.

b) the arch bits depend on the generic bits

-> No probs. I can merge the generic bit then send all the arch bits.

c) they each depend on each other

-> No probs. We go round gathering acks, slam it all into
a single patch then in it goes. 2.6.28, presumably.

d) something else

-> please do tell.

> The purpose of these patches is to make kernel buildable
> with "gcc -ffunction-sections -fdata-sections".
>
> Newer gcc and binutils can do dead code and data removal
> at link time. It is achieved using combination of
> -ffunction-sections -fdata-sections options for gcc and
> --gc-sections for ld.
>
> Theory of operation:
>
> -ffunction-sections instructs gcc to place each function
> (including static ones) in it's own section named .text.function_name
> instead of placing all functions in one big .text section.
>
> At link time, ld normally coalesce all such sections into one
> output section .text again. It is achieved by having *(.text.*) spec
> along with *(.text) spec in built-in linker scripts.
>
> If ld is invoked with --gc-sections, it tracks references, starting
> from entry point and marks all input sections which are reachable
> from there. Then it discards all input sections which are not marked.
>
> This isn't buying much if you have one big .text section per .o module,
> because even one referenced function will pull in entire section.
> You need -ffunction-sections in order to split .text into per-function
> sections and make --gc-sections much more useful.
>
> -fdata-sections is analogous: it places each global or static variable
> into .data.variable_name, .rodata.variable_name or .bss.variable_name.
>
> If we ever want to use described mechanism, we need to adapt
> existing code for new section names. Basically, we need to stop using
> section names of the form
> .text.xxxx
> .data.xxxx
> .rodata.xxxx
> .bss.xxxx
> in the kernel - otherwise section placement done by kernel's
> custom linker scripts produces broken vmlinux and vdso images.
>
> The following patches fix section names, one per architecture.
>
> The patch in _this_ mail fixes generic part.

(tries to work out what it does)

oh, it does the above section renaming. So I guess we're looking at
scenario c), above?

"otherwise section placement done by kernel's custom linker scripts
produces broken vmlinux and vdso images" is an inadequate description.
Please describe the problem more completely. This is important,
because once we actually find out what the patch is fixing, perhaps
others will be aware of less intrusive ways of fixing the problem, and
we end up with a better patch.


Please be aware that last time someone tried function-sections, maybe
five years ago, problems were encountered with linker efficiency
(possibly an O(nsections) or worse algorithm in ld). Link times went
up a lot.

So it would be good to hunt down some old ld versions and run some
timings. A mention of the results in the changelog is appropriate.


Is there actually a patch anywhere which enables function-sections for
some architectures? It would be good to see that (and its associated
size-reduction results) so we can work out whether all these changes
are worth pursuing.


> ...
>
> --- 0.org/scripts/mod/modpost.c Wed Jul 2 00:40:54 2008
> +++ 1.fixname/scripts/mod/modpost.c Wed Jul 2 00:54:21 2008
> @@ -794,9 +794,9 @@
> /* sections that may refer to an init/exit section with no warning */
> static const char *initref_sections[] =
> {
> - ".text.init.refok*",
> - ".exit.text.refok*",
> - ".data.init.refok*",
> + ".init.refok.text*",
> + ".exit.refok.text*",
> + ".init.refok.data*",
> NULL
> };
>
> @@ -915,7 +915,7 @@
> * Pattern 0:
> * Do not warn if funtion/data are marked with __init_refok/__initdata_refok.
> * The pattern is identified by:
> - * fromsec = .text.init.refok* | .data.init.refok*
> + * fromsec = .init.refok.text* | .init.refok.data*
> *
> * Pattern 1:
> * If a module parameter is declared __initdata and permissions=0
> @@ -939,8 +939,8 @@
> * *probe_one, *_console, *_timer
> *
> * Pattern 3:
> - * Whitelist all refereces from .text.head to .init.data
> - * Whitelist all refereces from .text.head to .init.text
> + * Whitelist all refereces from .head.text to .init.data
> + * Whitelist all refereces from .head.text to .init.text

um, this would be a good occasion for us to have another attempt at
spelling "references".

2008-07-02 07:10:02

by Denys Vlasenko

Subject: Re: [PATCH 1/23] make section names compatible with -ffunction-sections -fdata-sections

On Wednesday 02 July 2008 06:30, Andrew Morton wrote:
> On Wed, 2 Jul 2008 02:33:48 +0200 Denys Vlasenko <[email protected]> wrote:
> > I am unsure how to synchronize propagation of these patches
> > across all architectures.
> >
> > Andrew, how this can be done without causing lots of pain
> > for arch maintainers? Please advise.
>
> You didn't describe the problem which you're trying to solve, so how
> can I say?

The problem is that with -ffunction-sections -fdata-sections gcc
will create sections like .text.head and .data.nosave
whenever someone writes innocuous code like this:

static void head(...) {...}

or this:

int f(...)
{
	static int nosave;
	...
}

somewhere in the kernel.

The kernel linker script will then be confused and put these sections
in the wrong places.

IOW: names like .text.XXXX and .data.XXXX must not be used for "magic"
sections.


> Possibilities are:
>
> a) the generic bit depends on the arch bits
>
> -> No probs. I can merge the generic bit once all architectures are in.
>
> b) the arch bits depend on the generic bits
>
> -> No probs. I can merge the generic bit then send all the arch bits.
>
> c) they each depend on each other
>
> -> No probs. We go round gaththering acks, slam it all into
> a single patch then in it goes. 2.6.28, presumably.

It's definitely (c). Changes in, say, include/linux/init.h:

-#define __nosavedata __section(.data.nosave)
+#define __nosavedata __section(.nosave.data)

must be synchronized with, say, arch/arm/kernel/vmlinux.lds.S:

. = ALIGN(4096);
__nosave_begin = .;
- *(.data.nosave)
+ *(.nosave.data)

> > The following patches fix section names, one per architecture.
> >
> > The patch in _this_ mail fixes generic part.
>
> (tries to work out what it does)
>
> oh, it does the above section renaming. So I guess we're looking at
> scenario c), above?
>
> "otherwise section placement done by kernel's custom linker scripts
> produces broken vmlinux and vdso images" is an inadequate description.
> Please describe the problem more completely. This is important,
> because once we actually find out what the patch is fixing, perhaps
> others will be aware of less intrusive ways of fixing the problem, and
> we end up with a better patch.

See above. Is that explanation ok?

> Please be aware that last time someone tried function-sections, maybe
> five years ago, problems were encountered with linker efficiency
> (possible an O(nsections) or worse algorithm in ld). Link times went
> up a lot.

Last time it was probably me :) about a year ago, I think.
The final link stage takes noticeably more time, but
nothing really awful.

> So it would be good to hunt down some old ld versions and run some
> timings. A mention of the results in the changelog is appropriate.
>
> Is there actually a patch anywhere which enables function-sections for
> some architectures? It would be good to see that (and its associated
> size-reduction results) so we can work out whether all these changes
> are worth pursuing.

Yes, I posted it twice during the last year.
(digging up old emails from the "sent" folder...) here is one:

On Friday 07 September 2007 19:30, Denys Vlasenko wrote:
> On Friday 07 September 2007 17:31, Daniel Walker wrote:
> > On Thu, 2007-09-06 at 18:07 +0100, Denys Vlasenko wrote:
> > > A bit extended version:
> > >
> > > In the process in making it work I saw ~10% vmlinux size reductions
> > > (which basically matches what Marcelo says) when I wasn't retaining
> > > sections needed for EXPORT_SYMBOLs, but module loading didn't work.
> > >
> > > Thus I fixed that by adding KEEP() directives so that EXPORT_SYMBOLs
> > > are never discarded. This was just one of many fixes until kernel
> > > started to actually boot and work.
> > >
> > > I did that before I posted patches to lkml.
> > > IOW: posted patches are not broken versus module loading.
> >
> > Ok, this is more like the explanation I was looking for..
> >
> > During this thread you seemed to indicate the patches you release
> > reduced the kernel ~10% , but now your saying that was pre-release ,
> > right?
>
> CONFIG_MODULE=n will save ~10%
> CONFIG_MODULE=y - ~1%
>
> Exact figure depends on .config (whether you happen to include
> especially "fat" code or not).

--
vda