2010-06-14 12:38:35

by Matt Fleming

[permalink] [raw]
Subject: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

From: Matt Fleming <[email protected]>

Many architectures collect text sections beginning with '.text.' in
their .text section, so move this pattern into TEXT_TEXT to stop them
all having to duplicate the pattern in their arch/ linker scripts.

Signed-off-by: Matt Fleming <[email protected]>
---
include/asm-generic/vmlinux.lds.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 48c5299..08f4680 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -369,6 +369,7 @@
ALIGN_FUNCTION(); \
*(.text.hot) \
*(.text) \
+ *(.text.*) \
*(.ref.text) \
DEV_KEEP(init.text) \
DEV_KEEP(exit.text) \
--
1.6.6.1


2010-06-14 12:38:40

by Matt Fleming

[permalink] [raw]
Subject: [PATCH 4/5] parisc: Remove *(.text.*) pattern from the linker script

From: Matt Fleming <[email protected]>

Drop the pattern from the linker script because the pattern is now
provided by TEXT_TEXT.

Signed-off-by: Matt Fleming <[email protected]>
---
arch/parisc/kernel/vmlinux.lds.S | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/parisc/kernel/vmlinux.lds.S b/arch/parisc/kernel/vmlinux.lds.S
index d64a6bb..6784e39 100644
--- a/arch/parisc/kernel/vmlinux.lds.S
+++ b/arch/parisc/kernel/vmlinux.lds.S
@@ -61,7 +61,6 @@ SECTIONS
*(.text.sys_exit)
*(.text.do_sigaltstack)
*(.text.do_fork)
- *(.text.*)
*(.fixup)
*(.lock.text) /* out-of-line lock text */
*(.gnu.warning)
--
1.6.6.1

2010-06-14 12:38:37

by Matt Fleming

[permalink] [raw]
Subject: [PATCH 5/5] score: Remove *(.text.*) pattern from the linker script

From: Matt Fleming <[email protected]>

Drop the pattern from the linker script because the pattern is now
provided by TEXT_TEXT.

Signed-off-by: Matt Fleming <[email protected]>
---
arch/score/kernel/vmlinux.lds.S | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/score/kernel/vmlinux.lds.S b/arch/score/kernel/vmlinux.lds.S
index eebcbaa..2ed49ad 100644
--- a/arch/score/kernel/vmlinux.lds.S
+++ b/arch/score/kernel/vmlinux.lds.S
@@ -42,7 +42,6 @@ SECTIONS
SCHED_TEXT
LOCK_TEXT
KPROBES_TEXT
- *(.text.*)
*(.fixup)
. = ALIGN (4) ;
_etext = .; /* End of text section */
--
1.6.6.1

2010-06-14 12:39:05

by Matt Fleming

[permalink] [raw]
Subject: [PATCH 3/5] mips: Remove *(.text.*) pattern from the linker script

From: Matt Fleming <[email protected]>

Drop the pattern from the linker script because the pattern is now
provided by TEXT_TEXT.

Signed-off-by: Matt Fleming <[email protected]>
---
arch/mips/kernel/vmlinux.lds.S | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/mips/kernel/vmlinux.lds.S b/arch/mips/kernel/vmlinux.lds.S
index f25df73..f51bca8 100644
--- a/arch/mips/kernel/vmlinux.lds.S
+++ b/arch/mips/kernel/vmlinux.lds.S
@@ -47,7 +47,6 @@ SECTIONS
LOCK_TEXT
KPROBES_TEXT
IRQENTRY_TEXT
- *(.text.*)
*(.fixup)
*(.gnu.warning)
} :text = 0
--
1.6.6.1

2010-06-14 12:39:20

by Matt Fleming

[permalink] [raw]
Subject: [PATCH 2/5] blackfin: Remove *(.text.*) pattern from the linker script

From: Matt Fleming <[email protected]>

Drop the pattern from the linker script because the pattern is now
provided by TEXT_TEXT.

Signed-off-by: Matt Fleming <[email protected]>
---
arch/blackfin/kernel/vmlinux.lds.S | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/blackfin/kernel/vmlinux.lds.S b/arch/blackfin/kernel/vmlinux.lds.S
index 984c781..2c340fd 100644
--- a/arch/blackfin/kernel/vmlinux.lds.S
+++ b/arch/blackfin/kernel/vmlinux.lds.S
@@ -42,7 +42,6 @@ SECTIONS
__einittext = .;
EXIT_TEXT
#endif
- *(.text.*)
*(.fixup)

#if !L1_CODE_LENGTH
--
1.6.6.1

2010-06-14 14:32:53

by Tim Abbott

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Mon, 14 Jun 2010, Matt Fleming wrote:

> Many architectures collect text sections beginning with '.text.' in
> their .text section, so move this pattern into TEXT_TEXT to stop them
> all having to duplicate the pattern in their arch/ linker scripts.

Hi Matt,

I think this change could result in problems such as the page-aligned text
sections (recently renamed from .text.page_aligned to .text..page_aligned)
that exist in some architectures being included the main text section in a
non-page-aligned fashion (and similar issues for other .text.foo
sections).

I was planning to submit in the next couple weeks a change that adds
support for building the kernel with -ffunction-sections -fdata-sections,
which would have as a piece of it adding to TEXT_TEXT the following
expression:

*(.text.[A-Za-z$_]*) /* handle -ffunction-sections */\

which should match the .text.foo sections generated by -ffunction-sections
but not the kernel's special sections which now all have names of the form
.text..foo. I suspect after that change, the cleanup of deleting .text.*
from the various architecture linker scripts that reference it should be
possible.

-Tim Abbott

2010-06-14 16:00:16

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 4/5] parisc: Remove *(.text.*) pattern from the linker script

On Mon, 2010-06-14 at 13:38 +0100, Matt Fleming wrote:
> From: Matt Fleming <[email protected]>
>
> Drop the pattern from the linker script because the pattern is now
> provided by TEXT_TEXT.
>
> Signed-off-by: Matt Fleming <[email protected]>
> ---
> arch/parisc/kernel/vmlinux.lds.S | 1 -
> 1 files changed, 0 insertions(+), 1 deletions(-)
>
> diff --git a/arch/parisc/kernel/vmlinux.lds.S b/arch/parisc/kernel/vmlinux.lds.S
> index d64a6bb..6784e39 100644
> --- a/arch/parisc/kernel/vmlinux.lds.S
> +++ b/arch/parisc/kernel/vmlinux.lds.S
> @@ -61,7 +61,6 @@ SECTIONS
> *(.text.sys_exit)
> *(.text.do_sigaltstack)
> *(.text.do_fork)
> - *(.text.*)
> *(.fixup)
> *(.lock.text) /* out-of-line lock text */
> *(.gnu.warning)


This would destroy all of the named parisc text ordering we do above the
removed line because now you'd have swept up all the function sections
before we get to them, won't it?

The ordering is an execution speed up on 32 bit systems because our
relative jump is so short.

James

2010-06-14 18:22:16

by Sam Ravnborg

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Mon, Jun 14, 2010 at 01:38:28PM +0100, Matt Fleming wrote:
> From: Matt Fleming <[email protected]>
>
> Many architectures collect text sections beginning with '.text.' in
> their .text section, so move this pattern into TEXT_TEXT to stop them
> all having to duplicate the pattern in their arch/ linker scripts.
>
> Signed-off-by: Matt Fleming <[email protected]>
> ---
> include/asm-generic/vmlinux.lds.h | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index 48c5299..08f4680 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -369,6 +369,7 @@
> ALIGN_FUNCTION(); \
> *(.text.hot) \
> *(.text) \
> + *(.text.*) \
> *(.ref.text) \
> DEV_KEEP(init.text) \
> DEV_KEEP(exit.text) \

This will break ia64.
>From their vmlinux.lds.h:

TEXT_TEXT
SCHED_TEXT
LOCK_TEXT
KPROBES_TEXT
*(.gnu.linkonce.t*)
}
.text2 : AT(ADDR(.text2) - LOAD_OFFSET)
{ *(.text2) }
#ifdef CONFIG_SMP
.text..lock : AT(ADDR(.text..lock) - LOAD_OFFSET)
{ *(.text..lock) }
#endif

The patch above we would match the section ".text.*" too soon and
add the .text..lock section at the wrong place.

Browsing the vmlinux.lds files other archs will be impacted too.

If you feel tempted to clean up the linker script one place to start
would be to introduce consistent indention / layout.
If you use arch/sparc/kernel/vmlinux.lds.h as a template then you
have a good start.

The above mentioned ia64 vmlinux.lds.h would be a good place to start.

Sam

2010-06-14 19:13:53

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 4/5] parisc: Remove *(.text.*) pattern from the linker script

On Mon, 14 Jun 2010 10:20:59 -0500, James Bottomley <[email protected]> wrote:
>
> This would destroy all of the named parisc text ordering we do above the
> removed line because now you'd have swept up all the function sections
> before we get to them, won't it?

Argh, sorry, you're completely right. I was a bit overzealous with this
change. I'll drop it. In fact, given the very specific things that
parisc does with the .text.* sections, this entire series makes no sense
- it doesn't make sense to change TEXT_TEXT in a way that means parisc
can no longer use it.

Thanks for the review, James.

2010-06-14 19:21:30

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Mon, 14 Jun 2010 20:22:12 +0200, Sam Ravnborg <[email protected]> wrote:
>
> The patch above we would match the section ".text.*" too soon and
> add the .text..lock section at the wrong place.
>
> Browsing the vmlinux.lds files other archs will be impacted too.

Yeah, good point. James Bottomley has already pointed out how this
change completely breaks parisc. I'll drop this series.

Thanks for the review!

2010-06-14 19:33:57

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Mon, 14 Jun 2010 10:32:46 -0400 (EDT), Tim Abbott <[email protected]> wrote:
>
> I was planning to submit in the next couple weeks a change that adds
> support for building the kernel with -ffunction-sections -fdata-sections,
> which would have as a piece of it adding to TEXT_TEXT the following
> expression:
>
> *(.text.[A-Za-z$_]*) /* handle -ffunction-sections */\
>
> which should match the .text.foo sections generated by -ffunction-sections
> but not the kernel's special sections which now all have names of the form
> .text..foo. I suspect after that change, the cleanup of deleting .text.*
> from the various architecture linker scripts that reference it should be
> possible.

Do these special kernel sections include things like the parisc
.text.do_softirq, .text.sys_exit, etc? James raised a good objection to
the parisc patch of this series. I'm guessing most people saw it already
but I'll paste it here for reference,


This would destroy all of the named parisc text ordering we do above the
removed line because now you'd have swept up all the function sections
before we get to them, won't it?

The ordering is an execution speed up on 32 bit systems because our
relative jump is so short.

James

Will you changes handle this OK?

2010-06-14 19:59:11

by Mike Frysinger

[permalink] [raw]
Subject: Re: [PATCH 2/5] blackfin: Remove *(.text.*) pattern from the linker script

On Mon, Jun 14, 2010 at 08:38, Matt Fleming wrote:
> From: Matt Fleming <[email protected]>
>
> Drop the pattern from the linker script because the pattern is now
> provided by TEXT_TEXT.
>
> Signed-off-by: Matt Fleming <[email protected]>

once the vmlinux.lds.h change gets merged:
Acked-by: Mike Frysinger <[email protected]>
-mike

2010-06-14 20:05:57

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Mon, 2010-06-14 at 20:33 +0100, Matt Fleming wrote:
> On Mon, 14 Jun 2010 10:32:46 -0400 (EDT), Tim Abbott <[email protected]> wrote:
> >
> > I was planning to submit in the next couple weeks a change that adds
> > support for building the kernel with -ffunction-sections -fdata-sections,
> > which would have as a piece of it adding to TEXT_TEXT the following
> > expression:
> >
> > *(.text.[A-Za-z$_]*) /* handle -ffunction-sections */\

Just as a point of technical interest, that won't handle
-ffunction-sections. At least on parisc, we get a
section .text.<function name> for every function. This means that any
character legal in a function name can appear there, not just letters
and underscores (we get millicode ones with dollar signs as well for
instance). That's why *(.text.*) is safer

> > which should match the .text.foo sections generated by -ffunction-sections
> > but not the kernel's special sections which now all have names of the form
> > .text..foo.

They do? I don't find any symbols like that on parisc.

Historically, the way we've differentiated is that kernel special
symbols tend to have the text designator *after* the name, whereas the
linker puts it before ... of course that has issues for functions
called things like text or init ... but we try not to do that ...

> I suspect after that change, the cleanup of deleting .text.*
> > from the various architecture linker scripts that reference it should be
> > possible.
>
> Do these special kernel sections include things like the parisc
> .text.do_softirq, .text.sys_exit, etc? James raised a good objection to
> the parisc patch of this series. I'm guessing most people saw it already
> but I'll paste it here for reference,
>
>
> This would destroy all of the named parisc text ordering we do above the
> removed line because now you'd have swept up all the function sections
> before we get to them, won't it?
>
> The ordering is an execution speed up on 32 bit systems because our
> relative jump is so short.
>
> James
>
> Will you changes handle this OK?

As long as we don't do the generic gathering of the rest of the text
sections until after the special ones are handled (and this includes the
specific function sections we name), it doesn't really matter to us how
it's done.

James

2010-06-14 22:02:25

by Tim Abbott

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Mon, 14 Jun 2010, James Bottomley wrote:

> On Mon, 2010-06-14 at 20:33 +0100, Matt Fleming wrote:
> > On Mon, 14 Jun 2010 10:32:46 -0400 (EDT), Tim Abbott <[email protected]> wrote:
> > >
> > > I was planning to submit in the next couple weeks a change that adds
> > > support for building the kernel with -ffunction-sections -fdata-sections,
> > > which would have as a piece of it adding to TEXT_TEXT the following
> > > expression:
> > >
> > > *(.text.[A-Za-z$_]*) /* handle -ffunction-sections */\
>
> Just as a point of technical interest, that won't handle
> -ffunction-sections. At least on parisc, we get a
> section .text.<function name> for every function. This means that any
> character legal in a function name can appear there, not just letters
> and underscores (we get millicode ones with dollar signs as well for
> instance). That's why *(.text.*) is safer

Hi James,

I believe that the pattern [A-Za-z$_] matches all valid characters to
start a function name (in particular, it includes "$"). If I'm missing
any valid characters for the start of a function name, please correct me.

To give some background, the strategy here for -ffunction-sections support
is to have the kernel's "magic" text sections all start with

".text.."

while the sections generated by -ffunction-sections will start with

".text." followed by a character other than "."

These sets are disjoint, and so if we have a complete set of valid next
characters in the pattern ".text.[A-Za-z$_]*", it will be just as good as
".text.*" for matching the sections produced by -ffunction-sections. (As
a sidenote, I would prefer .text.[^.], but I don't believe that is valid
linker script syntax).

While one could in principle try to handle things by not renaming the
.text.foo sections and instead just placing the linker script code for
them all before a .text.* item in the linker script, that approach is
really fragile. I think the "text..foo" approach is a good design and I
am not aware of any problems with it.

Some more detailed explanation is available here:
<http://lkml.org/lkml/2010/2/19/365>

> > > which should match the .text.foo sections generated by -ffunction-sections
> > > but not the kernel's special sections which now all have names of the form
> > > .text..foo.
>
> They do? I don't find any symbols like that on parisc.
>
> Historically, the way we've differentiated is that kernel special
> symbols tend to have the text designator *after* the name, whereas the
> linker puts it before ... of course that has issues for functions
> called things like text or init ... but we try not to do that ...

It looks like the patches for the .text.foo rename haven't been sent yet,
my mistake. I've primarily been tracking the much larger .data.foo ->
.data..foo transition. I expect a patch series to rename the remaining
sections and add -ffunction-sections support will be sent by either Denys
Vlasenko or myself in the next few weeks.

While the kernel does use section names like ".init.text", there were
before quite recently a very large number of kernel special sections with
names like ".data.page_aligned" which would conflict with:

static int page_aligned;

when compiling with -fdata-sections.

In fact, we initially sent patches that changed these all to e.g.
".page_aligned.data". But we discovered that it is problem when writing
assembly files, since if you write:

.section ".page_aligned.data"

it doesn't work -- you need to specify the load flags like so:

.section ".page_aligned.data", "aw"

while if you write e.g.

.section ".data..page_aligned"

the assembler will automatically set the right load flags for read/write
data.

Since this is a subtle issue and I'd be surprised if there weren't a lot
of people who work on assembly code in the kernel who don't know about
this subtle issue, the ".page_aligned.data" naming scheme is destined to
result in some frustrating bugs. So we settled on the more bug-proof
".data..page_aligned" naming scheme for adding -fdata-sections support.

It would probably be a good cleanup to go through and rename e.g.
.init.text to .text..init so that we're consistent everywhere, but I'd
like to get -ffunction-sections working before starting that project
myself.

-Tim Abbott

2010-06-14 22:21:13

by Tim Abbott

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Mon, 14 Jun 2010, Matt Fleming wrote:

> Do these special kernel sections include things like the parisc
> .text.do_softirq, .text.sys_exit, etc? James raised a good objection to
> the parisc patch of this series. I'm guessing most people saw it already
> but I'll paste it here for reference,
>
> This would destroy all of the named parisc text ordering we do above the
> removed line because now you'd have swept up all the function sections
> before we get to them, won't it?
>
> The ordering is an execution speed up on 32 bit systems because our
> relative jump is so short.
>
> Will you changes handle this OK?

I think I addressed this in my reply to James just now, but to be super
clear, this -ffunction-sections plan involves renaming .text.do_softirq to
.text..do_softirq (etc.) first.

-Tim Abbott

2010-06-14 23:08:46

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Mon, 2010-06-14 at 18:02 -0400, Tim Abbott wrote:
> On Mon, 14 Jun 2010, James Bottomley wrote:
>
> > On Mon, 2010-06-14 at 20:33 +0100, Matt Fleming wrote:
> > > On Mon, 14 Jun 2010 10:32:46 -0400 (EDT), Tim Abbott <[email protected]> wrote:
> > > >
> > > > I was planning to submit in the next couple weeks a change that adds
> > > > support for building the kernel with -ffunction-sections -fdata-sections,
> > > > which would have as a piece of it adding to TEXT_TEXT the following
> > > > expression:
> > > >
> > > > *(.text.[A-Za-z$_]*) /* handle -ffunction-sections */\
> >
> > Just as a point of technical interest, that won't handle
> > -ffunction-sections. At least on parisc, we get a
> > section .text.<function name> for every function. This means that any
> > character legal in a function name can appear there, not just letters
> > and underscores (we get millicode ones with dollar signs as well for
> > instance). That's why *(.text.*) is safer
>
> Hi James,
>
> I believe that the pattern [A-Za-z$_] matches all valid characters to
> start a function name (in particular, it includes "$"). If I'm missing
> any valid characters for the start of a function name, please correct me.

Well, our linker seems to generate annoying symbols with carets in
them ...

> To give some background, the strategy here for -ffunction-sections support
> is to have the kernel's "magic" text sections all start with
>
> ".text.."
>
> while the sections generated by -ffunction-sections will start with
>
> ".text." followed by a character other than "."
>
> These sets are disjoint, and so if we have a complete set of valid next
> characters in the pattern ".text.[A-Za-z$_]*", it will be just as good as
> ".text.*" for matching the sections produced by -ffunction-sections. (As
> a sidenote, I would prefer .text.[^.], but I don't believe that is valid
> linker script syntax).
>
> While one could in principle try to handle things by not renaming the
> .text.foo sections and instead just placing the linker script code for
> them all before a .text.* item in the linker script, that approach is
> really fragile. I think the "text..foo" approach is a good design and I
> am not aware of any problems with it.

OK, but how about some actual explanation? You've just characterised
the current -ffunction-sections scheme that parisc has used for decades
as "fragile"

> Some more detailed explanation is available here:
> <http://lkml.org/lkml/2010/2/19/365>

That's still a bit short on explanations.

But if I infer from the rest, someone, somewhere broke the convention
that all our special linux sections be called .XX.data and .XX.text to
distinguish them from the .text.FF and .data.YY the compiler will
generate with the relevant sectional directives? because it's been
working OK for us for a while.

To fix the breakage, the proposal now is to name all linux special
sections as .text..XX and .data..XX? I can see that's more standard
looking that XX.text and XX.data, but not necessarily that it's better.

This then introduces a problem of matching because .text.X and .text..X
are hard to distinguish using the linker matching scripts.

So even if I buy the rename of the linux symbols, what about using a
linker defined symbol that's illegal as a function as the initial
separator instead of .? So hyphen looks the obvious one ... you can
have all the linux special sections being .text- and .data- then we can
easily distinguish.

James

> > > > which should match the .text.foo sections generated by -ffunction-sections
> > > > but not the kernel's special sections which now all have names of the form
> > > > .text..foo.
> >
> > They do? I don't find any symbols like that on parisc.
> >
> > Historically, the way we've differentiated is that kernel special
> > symbols tend to have the text designator *after* the name, whereas the
> > linker puts it before ... of course that has issues for functions
> > called things like text or init ... but we try not to do that ...
>
> It looks like the patches for the .text.foo rename haven't been sent yet,
> my mistake. I've primarily been tracking the much larger .data.foo ->
> .data..foo transition. I expect a patch series to rename the remaining
> sections and add -ffunction-sections support will be sent by either Denys
> Vlasenko or myself in the next few weeks.
>
> While the kernel does use section names like ".init.text", there were
> before quite recently a very large number of kernel special sections with
> names like ".data.page_aligned" which would conflict with:
>
> static int page_aligned;
>
> when compiling with -fdata-sections.
>
> In fact, we initially sent patches that changed these all to e.g.
> ".page_aligned.data". But we discovered that it is problem when writing
> assembly files, since if you write:
>
> .section ".page_aligned.data"
>
> it doesn't work -- you need to specify the load flags like so:
>
> .section ".page_aligned.data", "aw"
>
> while if you write e.g.
>
> .section ".data..page_aligned"
>
> the assembler will automatically set the right load flags for read/write
> data.
>
> Since this is a subtle issue and I'd be surprised if there weren't a lot
> of people who work on assembly code in the kernel who don't know about
> this subtle issue, the ".page_aligned.data" naming scheme is destined to
> result in some frustrating bugs. So we settled on the more bug-proof
> ".data..page_aligned" naming scheme for adding -fdata-sections support.
>
> It would probably be a good cleanup to go through and rename e.g.
> .init.text to .text..init so that we're consistent everywhere, but I'd
> like to get -ffunction-sections working before starting that project
> myself.


2010-06-14 23:15:06

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Mon, 2010-06-14 at 18:21 -0400, Tim Abbott wrote:
> On Mon, 14 Jun 2010, Matt Fleming wrote:
>
> > Do these special kernel sections include things like the parisc
> > .text.do_softirq, .text.sys_exit, etc? James raised a good objection to
> > the parisc patch of this series. I'm guessing most people saw it already
> > but I'll paste it here for reference,
> >
> > This would destroy all of the named parisc text ordering we do above the
> > removed line because now you'd have swept up all the function sections
> > before we get to them, won't it?
> >
> > The ordering is an execution speed up on 32 bit systems because our
> > relative jump is so short.
> >
> > Will you changes handle this OK?
>
> I think I addressed this in my reply to James just now, but to be super
> clear, this -ffunction-sections plan involves renaming .text.do_softirq to
> .text..do_softirq (etc.) first.

OK, so that doesn't make a lot of sense to me; I suspect because you
don't understand what parisc is doing. These aren't names of linux
special sections ... they're names of function sections. For
efficiency, we take specific hot functions and place them together in
the linker script so the jumps between them are small enough to be coded
as relative on the 32 bit architecture. It's really just a more
efficient way of laying out the binary.

James

2010-06-15 02:45:44

by Tim Abbott

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Mon, 14 Jun 2010, James Bottomley wrote:

> > I believe that the pattern [A-Za-z$_] matches all valid characters to
> > start a function name (in particular, it includes "$"). If I'm missing
> > any valid characters for the start of a function name, please correct me.
>
> Well, our linker seems to generate annoying symbols with carets in
> them ...

The question here is: is there C code that when compiled with
-ffunction-sections will generate an ELF section with a name that starts
with ".text.^"? For that to happen, you would need a function whose name
started with "^", which isn't valid C.

The relevant namespace here is the names for ELF sections generated by
-ffunction-sections. These are in turn computed by the compiler from
function names -- there's no potential conflict created by
linker-generated symbols whose names start with a caret. Similarly, for
-fdata-sections, we only care about the names of C data objects, which
also can't start with a caret.

> > While one could in principle try to handle things by not renaming the
> > .text.foo sections and instead just placing the linker script code for
> > them all before a .text.* item in the linker script, that approach is
> > really fragile. I think the "text..foo" approach is a good design and I
> > am not aware of any problems with it.
>
> OK, but how about some actual explanation? You've just characterised
> the current -ffunction-sections scheme that parisc has used for decades
> as "fragile"

The current parisc situation is fine.

What I was trying to draw a contrast with is supporting -fdata-sections by
adding ".data.*" to DATA_DATA, and then trying to make sure that all the
architecture linker scripts handle all the kernel's special data sections
with names like ".data.foo" before the place where DATA_DATA appears in
their linker scripts. Most of the architecture linker scripts mention
more than a half dozen special kernel sections with names of the form
".data.foo", often in fairly random orders, and so it would be really
fragile to add the constraint that these sections need to all appear above
DATA_DATA.

Adding ".data.[A-Za-z$_]" to DATA_DATA doesn't have this problem.

If we similarly added ".text.[A-Za-z$_]" to TEXT_TEXT, we'd presumably
move the 4 named .text.foo sections before TEXT_TEXT; I don't think any
other architectures would require any work.

> > Some more detailed explanation is available here:
> > <http://lkml.org/lkml/2010/2/19/365>
>
> That's still a bit short on explanations.
>
> But if I infer from the rest, someone, somewhere broke the convention
> that all our special linux sections be called .XX.data and .XX.text to
> distinguish them from the .text.FF and .data.YY the compiler will
> generate with the relevant sectional directives? because it's been
> working OK for us for a while.

I don't know the full history here. But prior to the ".data.foo" ->
".data..foo" patches that were merged recently, there were a bunch of
cross-architecture sections with these sorts of names, e.g.:

.data.page_aligned
.data.nosave
.data.read_mostly
.data.cacheline_aligned
.data.lock_aligned
.data.percpu*
.data.init_task
etc.

There were also a bunch of ".text.foo" sections on individual
architectures, many of which currently don't support -ffunction-sections
(sh, ia64, x86, mips, etc.).

However, there weren't any .text.foo sections that are cross-architecture.
Since parisc only uses -ffunction-sections, and not -fdata-sections, the
popular .data.foo naming scheme doesn't cause any breakage on parisc.

The only architecture that does use -fdata-sections is frv, and there
could theoretically have been breakage there, but in practice it's likely
nobody has written kernel code that would actually conflict, e.g. "static
int percpu = 3;", yet.

> To fix the breakage, the proposal now is to name all linux special
> sections as .text..XX and .data..XX? I can see that's more standard
> looking that XX.text and XX.data, but not necessarily that it's better.

Yes, that's the proposal.

> This then introduces a problem of matching because .text.X and .text..X
> are hard to distinguish using the linker matching scripts.

Right. I believe that this is totally solvable with a simple linker
script pattern, since the space of valid names for functions and data
objects in C code is quite restricted (and that the implementation of
using e.g. ".data.[A-Za-z$_]*" solves this problem).

> So even if I buy the rename of the linux symbols, what about using a
> linker defined symbol that's illegal as a function as the initial
> separator instead of .? So hyphen looks the obvious one ... you can
> have all the linux special sections being .text- and .data- then we can
> easily distinguish.

Is "." a valid first character for a function name? I don't see the
problem with using "." here.

Both .page_aligned.data and .data-page_aligned are valid choices (and in
fact, the first patch series I sent on this topic about 18 months ago did
.page_aligned.data, I think).

The main technical difference between ".data..page_aligned" and
".page_aligned.data" in my view is that you need to be more careful when
writing assembly files with ".page_aligned.data".

To give an example, if I run the following:

$ cat > foo.s
.section .data-page-aligned
.long 0
.section .data.page_aligned
.long 1
$ gcc -c foo.s -o foo.o
$ objdump -h foo.o
foo.o: file format elf32-i386

Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000000 00000000 00000000 00000034 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .data 00000000 00000000 00000000 00000034 2**2
CONTENTS, ALLOC, LOAD, DATA
2 .bss 00000000 00000000 00000000 00000034 2**2
ALLOC
3 .data-page-aligned 00000004 00000000 00000000 00000034 2**0
CONTENTS, READONLY
4 .data.page_aligned 00000004 00000000 00000000 00000038 2**0
CONTENTS, ALLOC, LOAD, DATA

one can see that the .data-page-aligned section doesn't have the right
section flags. So I'm pretty sure the relevant assembler heuristic is
looking for things starting with ".data.", not just ".data".

The kernel has a lot of code in assembly files that just does:

.section ".data"

and so there's a very real risk that folks who are doing pattern-matching
development on assembly files will end up creating non-allocated sections
by accident (I've certainly made this mistake myself, and there's a bug of
this form in arch/x86/lib/thunk.S until commit
c6c2d7a084d14a8a701be84872aa1b77d2945f46, so I don't think I'm alone)

I also think that ".data..page_aligned" is more readable as a new name for
the former ".data.page_aligned" than ".page_aligned.data" is, but I think
that's a secondary consideration. ".data.-page_aligned" would be
technically equivalent to ".data..page_aligned", but I think it is uglier.

-Tim Abbott

2010-06-15 02:52:14

by Tim Abbott

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Mon, 14 Jun 2010, James Bottomley wrote:

> On Mon, 2010-06-14 at 18:21 -0400, Tim Abbott wrote:
> > On Mon, 14 Jun 2010, Matt Fleming wrote:
> >
> > > Do these special kernel sections include things like the parisc
> > > .text.do_softirq, .text.sys_exit, etc? James raised a good objection to
> > > the parisc patch of this series. I'm guessing most people saw it already
> > > but I'll paste it here for reference,
> > >
> > > This would destroy all of the named parisc text ordering we do above the
> > > removed line because now you'd have swept up all the function sections
> > > before we get to them, won't it?
> > >
> > > The ordering is an execution speed up on 32 bit systems because our
> > > relative jump is so short.
> > >
> > > Will you changes handle this OK?
> >
> > I think I addressed this in my reply to James just now, but to be super
> > clear, this -ffunction-sections plan involves renaming .text.do_softirq to
> > .text..do_softirq (etc.) first.
>
> OK, so that doesn't make a lot of sense to me; I suspect because you
> don't understand what parisc is doing. These aren't names of linux
> special sections ... they're names of function sections. For
> efficiency, we take specific hot functions and place them together in
> the linker script so the jumps between them are small enough to be coded
> as relative on the 32 bit architecture. It's really just a more
> efficient way of laying out the binary.

Yeah, I'd forgotten parisc was doing -ffunction-sections, now I understand
what you're talking about. What I'd recommend is just moving the
.text.do_softirq and friends like in the hunk below -- I believe that
should achieve the same performance goals, and should have no conflict
with adding a wildcard for -ffunction-sections generated sections to
TEXT_TEXT.

-Tim Abbott

diff --git a/arch/parisc/kernel/vmlinux.lds.S
b/arch/parisc/kernel/vmlinux.lds.S
index d64a6bb..ad0d3d3 100644
--- a/arch/parisc/kernel/vmlinux.lds.S
+++ b/arch/parisc/kernel/vmlinux.lds.S
@@ -52,15 +52,15 @@ SECTIONS
_text = .; /* Text and read-only data */
.text ALIGN(16) : {
HEAD_TEXT
+ *(.text.do_softirq)
+ *(.text.sys_exit)
+ *(.text.do_sigaltstack)
+ *(.text.do_fork)
TEXT_TEXT
SCHED_TEXT
LOCK_TEXT
KPROBES_TEXT
IRQENTRY_TEXT
- *(.text.do_softirq)
- *(.text.sys_exit)
- *(.text.do_sigaltstack)
- *(.text.do_fork)
*(.text.*)
*(.fixup)
*(.lock.text) /* out-of-line lock text */

2010-06-16 21:40:11

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Mon, 2010-06-14 at 22:45 -0400, Tim Abbott wrote:
> On Mon, 14 Jun 2010, James Bottomley wrote:
>
> > > I believe that the pattern [A-Za-z$_] matches all valid characters to
> > > start a function name (in particular, it includes "$"). If I'm missing
> > > any valid characters for the start of a function name, please correct me.
> >
> > Well, our linker seems to generate annoying symbols with carets in
> > them ...
>
> The question here is: is there C code that when compiled with
> -ffunction-sections will generate an ELF section with a name that starts
> with ".text.^"? For that to happen, you would need a function whose name
> started with "^", which isn't valid C.

True ... but I didn't say that it was valid C. I said the compiler is
spitting them out in the assembly ... I suspect it uses invalid C
function characters precisely to avoid clashes.

> The relevant namespace here is the names for ELF sections generated by
> -ffunction-sections. These are in turn computed by the compiler from
> function names -- there's no potential conflict created by
> linker-generated symbols whose names start with a caret. Similarly, for
> -fdata-sections, we only care about the names of C data objects, which
> also can't start with a caret.
>
> > > While one could in principle try to handle things by not renaming the
> > > .text.foo sections and instead just placing the linker script code for
> > > them all before a .text.* item in the linker script, that approach is
> > > really fragile. I think the "text..foo" approach is a good design and I
> > > am not aware of any problems with it.
> >
> > OK, but how about some actual explanation? You've just characterised
> > the current -ffunction-sections scheme that parisc has used for decades
> > as "fragile"
>
> The current parisc situation is fine.
>
> What I was trying to draw a contrast with is supporting -fdata-sections by
> adding ".data.*" to DATA_DATA, and then trying to make sure that all the
> architecture linker scripts handle all the kernel's special data sections
> with names like ".data.foo" before the place where DATA_DATA appears in
> their linker scripts. Most of the architecture linker scripts mention
> more than a half dozen special kernel sections with names of the form
> ".data.foo", often in fairly random orders, and so it would be really
> fragile to add the constraint that these sections need to all appear above
> DATA_DATA.
>
> Adding ".data.[A-Za-z$_]" to DATA_DATA doesn't have this problem.
>
> If we similarly added ".text.[A-Za-z$_]" to TEXT_TEXT, we'd presumably
> move the 4 named .text.foo sections before TEXT_TEXT; I don't think any
> other architectures would require any work.
>
> > > Some more detailed explanation is available here:
> > > <http://lkml.org/lkml/2010/2/19/365>
> >
> > That's still a bit short on explanations.
> >
> > But if I infer from the rest, someone, somewhere broke the convention
> > that all our special linux sections be called .XX.data and .XX.text to
> > distinguish them from the .text.FF and .data.YY the compiler will
> > generate with the relevant sectional directives? because it's been
> > working OK for us for a while.
>
> I don't know the full history here. But prior to the ".data.foo" ->
> ".data..foo" patches that were merged recently, there were a bunch of
> cross-architecture sections with these sorts of names, e.g.:
>
> .data.page_aligned
> .data.nosave
> .data.read_mostly
> .data.cacheline_aligned
> .data.lock_aligned
> .data.percpu*
> .data.init_task
> etc.
>
> There were also a bunch of ".text.foo" sections on individual
> architectures, many of which currently don't support -ffunction-sections
> (sh, ia64, x86, mips, etc.).
>
> However, there weren't any .text.foo sections that are cross-architecture.
> Since parisc only uses -ffunction-sections, and not -fdata-sections, the
> popular .data.foo naming scheme doesn't cause any breakage on parisc.
>
> The only architecture that does use -fdata-sections is frv, and there
> could theoretically have been breakage there, but in practice it's likely
> nobody has written kernel code that would actually conflict, e.g. "static
> int percpu = 3;", yet.

Right, but what I was curious about, since, we already sorted all the
problems with text sections out on parisc by putting the .text as a
suffix not a prefix, and you have to change all the data sections
anyway, why not just follow established practise?

> > To fix the breakage, the proposal now is to name all linux special
> > sections as .text..XX and .data..XX? I can see that's more standard
> > looking that XX.text and XX.data, but not necessarily that it's better.
>
> Yes, that's the proposal.
>
> > This then introduces a problem of matching because .text.X and .text..X
> > are hard to distinguish using the linker matching scripts.
>
> Right. I believe that this is totally solvable with a simple linker
> script pattern, since the space of valid names for functions and data
> objects in C code is quite restricted (and that the implementation of
> using e.g. ".data.[A-Za-z$_]*" solves this problem).
>
> > So even if I buy the rename of the linux symbols, what about using a
> > linker defined symbol that's illegal as a function as the initial
> > separator instead of .? So hyphen looks the obvious one ... you can
> > have all the linux special sections being .text- and .data- then we can
> > easily distinguish.
>
> Is "." a valid first character for a function name? I don't see the
> problem with using "." here.

The problem is that it's hard to get the linker to distinguish
between .text.. and .text. without funny regexp contortions. However,
if the two sections began .text- and .text. they are easy to distinguish
because one prefix isn't a subset of the other.

> Both .page_aligned.data and .data-page_aligned are valid choices (and in
> fact, the first patch series I sent on this topic about 18 months ago did
> .page_aligned.data, I think).
>
> The main technical difference between ".data..page_aligned" and
> ".page_aligned.data" in my view is that you need to be more careful when
> writing assembly files with ".page_aligned.data".
>
> To give an example, if I run the following:
>
> $ cat > foo.s
> .section .data-page-aligned
> .long 0
> .section .data.page_aligned
> .long 1
> $ gcc -c foo.s -o foo.o
> $ objdump -h foo.o
> foo.o: file format elf32-i386
>
> Sections:
> Idx Name Size VMA LMA File off Algn
> 0 .text 00000000 00000000 00000000 00000034 2**2
> CONTENTS, ALLOC, LOAD, READONLY, CODE
> 1 .data 00000000 00000000 00000000 00000034 2**2
> CONTENTS, ALLOC, LOAD, DATA
> 2 .bss 00000000 00000000 00000000 00000034 2**2
> ALLOC
> 3 .data-page-aligned 00000004 00000000 00000000 00000034 2**0
> CONTENTS, READONLY
> 4 .data.page_aligned 00000004 00000000 00000000 00000038 2**0
> CONTENTS, ALLOC, LOAD, DATA
>
> one can see that the .data-page-aligned section doesn't have the right
> section flags. So I'm pretty sure the relevant assembler heuristic is
> looking for things starting with ".data.", not just ".data".
>
> The kernel has a lot of code in assembly files that just does:
>
> .section ".data"
>
> and so there's a very real risk that folks who are doing pattern-matching
> development on assembly files will end up creating non-allocated sections
> by accident (I've certainly made this mistake myself, and there's a bug of
> this form in arch/x86/lib/thunk.S until commit
> c6c2d7a084d14a8a701be84872aa1b77d2945f46, so I don't think I'm alone)

But that commit isn't anything to do with the section not being
allocated. That's actually irrelevant to us, since we sort out the
sectional allocations out at link time with the linker script (which
ends up completely ignoring the original file flags). The problem,
specifically, is that the linker does a rename of two identically named
sections with different flags ... we would also get the same behaviour
if the linker thought two sections weren't mergeable even if they did
have identical flags.

In general, I think it's good practise for all use of sections to
specify the elf flags.

> I also think that ".data..page_aligned" is more readable as a new name for
> the former ".data.page_aligned" than ".page_aligned.data" is, but I think
> that's a secondary consideration. ".data.-page_aligned" would be
> technically equivalent to ".data..page_aligned", but I think it is uglier.

Actually, as I said, that would be .data-

But, to be honest, I don't care that much about the data names because
we don't use -fdata-sections, the only objection is that having two
separate conventions for text and data is adding needless complexity ...
I do care about the text names, and I'd rather we stuck to the existing
convention there, or provided a good reason for the change ... and
wanting .text. as a prefix for everything is weak at best.

The specific objection I have to converting everything to the .text..*
prefix for linux sections is that the gather all function sections can
no longer be *(.text.*), but has to become a regular expression, which
is adding fragility.

James

2010-06-17 18:54:26

by Denys Vlasenko

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Tue, Jun 15, 2010 at 12:02 AM, Tim Abbott <[email protected]> wrote:
> On Mon, 14 Jun 2010, James Bottomley wrote:
>
>> On Mon, 2010-06-14 at 20:33 +0100, Matt Fleming wrote:
>> > On Mon, 14 Jun 2010 10:32:46 -0400 (EDT), Tim Abbott <[email protected]> wrote:
>> > >
>> > > I was planning to submit in the next couple weeks a change that adds
>> > > support for building the kernel with -ffunction-sections -fdata-sections,
>> > > which would have as a piece of it adding to TEXT_TEXT the following
>> > > expression:
>> > >
>> > > ? *(.text.[A-Za-z$_]*) ? ?/* handle -ffunction-sections */\
>>
>> Just as a point of technical interest, that won't handle
>> -ffunction-sections. ?At least on parisc, we get a
>> section .text.<function name> for every function. ?This means that any
>> character legal in a function name can appear there, not just letters
>> and underscores (we get millicode ones with dollar signs as well for
>> instance). ?That's why *(.text.*) is safer
>
> Hi James,
>
> I believe that the pattern [A-Za-z$_] matches all valid characters to
> start a function name (in particular, it includes "$"). ?If I'm missing
> any valid characters for the start of a function name, please correct me.

Yes, I think we need to add 0-9 too. C names can't have a digit as a starting
character, but linker can produce such names when invoked as "ld -r --unique"
(incremental linking).

Currently we use just "ld -r" to combine all .o files from a directory
into one bigger .o file, but this combines all similarly-named sections.
This not only combines all .text sections from every input .o file
into one .text section (which isn't surprising), but also combines
all .text.func sections too. Which we don't want to happen when we
(eventually) want to link kernel with --gc-sections.

The fix already exists: "ld -r --unique". With --unique, ld
will create unique sections named .text.1, .text.2, .text.func.1
and such.

Therefore, in order to accomodate .text.NUM sections in the future,
we'd better use .text.[A-Za-z0-9$_]* pattern.

--
vda

2010-06-17 19:11:50

by Denys Vlasenko

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Wed, Jun 16, 2010 at 11:40 PM, James Bottomley
<[email protected]> wrote:
> True ... but I didn't say that it was valid C. ?I said the compiler is
> spitting them out in the assembly ... I suspect it uses invalid C
> function characters precisely to avoid clashes.

Then we need to add all such chars to that regexp.

> The problem is that it's hard to get the linker to distinguish
> between .text.. and .text. without funny regexp contortions. ?However,
> if the two sections began .text- and .text. they are easy to distinguish
> because one prefix isn't a subset of the other.
>
>> Both .page_aligned.data and .data-page_aligned are valid choices (and in
>> fact, the first patch series I sent on this topic about 18 months ago did
>> .page_aligned.data, I think).
>>
>> The main technical difference between ".data..page_aligned" and
>> ".page_aligned.data" in my view is that you need to be more careful when
>> writing assembly files with ".page_aligned.data".
>>
>> To give an example, if I run the following:
>>
>> $ cat > foo.s
>> .section .data-page-aligned
>> ? ? ? .long 0
>> .section .data.page_aligned
>> ? ? ? .long 1
>> $ gcc -c foo.s -o foo.o
>> $ objdump -h foo.o
>> foo.o: ? ? file format elf32-i386
>>
>> Sections:
>> Idx Name ? ? ? ? ?Size ? ? ?VMA ? ? ? LMA ? ? ? File off ?Algn
>> ? 0 .text ? ? ? ? 00000000 ?00000000 ?00000000 ?00000034 ?2**2
>> ? ? ? ? ? ? ? ? ? CONTENTS, ALLOC, LOAD, READONLY, CODE
>> ? 1 .data ? ? ? ? 00000000 ?00000000 ?00000000 ?00000034 ?2**2
>> ? ? ? ? ? ? ? ? ? CONTENTS, ALLOC, LOAD, DATA
>> ? 2 .bss ? ? ? ? ?00000000 ?00000000 ?00000000 ?00000034 ?2**2
>> ? ? ? ? ? ? ? ? ? ALLOC
>> ? 3 .data-page-aligned 00000004 ?00000000 ?00000000 ?00000034 ?2**0
>> ? ? ? ? ? ? ? ? ? CONTENTS, READONLY
>> ? 4 .data.page_aligned 00000004 ?00000000 ?00000000 ?00000038 ?2**0
>> ? ? ? ? ? ? ? ? ? CONTENTS, ALLOC, LOAD, DATA
>>
>> one can see that the .data-page-aligned section doesn't have the right
>> section flags. ?So I'm pretty sure the relevant assembler heuristic is
>> looking for things starting with ".data.", not just ".data".
>>
>> The kernel has a lot of code in assembly files that just does:
>>
>> .section ".data"
>>
>> and so there's a very real risk that folks who are doing pattern-matching
>> development on assembly files will end up creating non-allocated sections
>> by accident (I've certainly made this mistake myself, and there's a bug of
>> this form in arch/x86/lib/thunk.S until commit
>> c6c2d7a084d14a8a701be84872aa1b77d2945f46, so I don't think I'm alone)
>
> But that commit isn't anything to do with the section not being
> allocated. ?That's actually irrelevant to us, since we sort out the
> sectional allocations out at link time with the linker script (which
> ends up completely ignoring the original file flags). ?The problem,
> specifically, is that the linker does a rename of two identically named
> sections with different flags ... we would also get the same behaviour
> if the linker thought two sections weren't mergeable even if they did
> have identical flags.
>
> In general, I think it's good practise for all use of sections to
> specify the elf flags.

This is doable in asm, yes. For .bss, we need to not forget about
@nobits too: section .bss.foo,"aw",@nobits

> Actually, as I said, that would be .data-
>
> But, to be honest, I don't care that much about the data names because
> we don't use -fdata-sections, the only objection is that having two
> separate conventions for text and data is adding needless complexity ...
> I do care about the text names, and I'd rather we stuck to the existing
> convention there, or provided a good reason for the change ... and
> wanting .text. as a prefix for everything is weak at best.
>
> The specific objection I have to converting everything to the .text..*
> prefix for linux sections is that the gather all function sections can
> no longer be *(.text.*), but has to become a regular expression, which
> is adding fragility.

I agree that it's not pretty, but otherwise we add fragility in other places:
every section directive must be correct now. These places are more numerous
than linker scripts.

But, leaving aside the question whether we want to take the risk
of people forgetting to do it properly everywhere, -
is it even *possible* to specify these attributes in C - that is,
in gcc's __attribute__((section(".bss.foo")))?

I think currently gcc does not support it - it guesses attributes
based on the section name.

--
vda

2010-06-17 19:56:24

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Thu, 2010-06-17 at 21:11 +0200, Denys Vlasenko wrote:
> On Wed, Jun 16, 2010 at 11:40 PM, James Bottomley
> <[email protected]> wrote:
> > True ... but I didn't say that it was valid C. I said the compiler is
> > spitting them out in the assembly ... I suspect it uses invalid C
> > function characters precisely to avoid clashes.
>
> Then we need to add all such chars to that regexp.

Thus proving the point about fragility ...

> > The problem is that it's hard to get the linker to distinguish
> > between .text.. and .text. without funny regexp contortions. However,
> > if the two sections began .text- and .text. they are easy to distinguish
> > because one prefix isn't a subset of the other.
> >
> >> Both .page_aligned.data and .data-page_aligned are valid choices (and in
> >> fact, the first patch series I sent on this topic about 18 months ago did
> >> .page_aligned.data, I think).
> >>
> >> The main technical difference between ".data..page_aligned" and
> >> ".page_aligned.data" in my view is that you need to be more careful when
> >> writing assembly files with ".page_aligned.data".
> >>
> >> To give an example, if I run the following:
> >>
> >> $ cat > foo.s
> >> .section .data-page-aligned
> >> .long 0
> >> .section .data.page_aligned
> >> .long 1
> >> $ gcc -c foo.s -o foo.o
> >> $ objdump -h foo.o
> >> foo.o: file format elf32-i386
> >>
> >> Sections:
> >> Idx Name Size VMA LMA File off Algn
> >> 0 .text 00000000 00000000 00000000 00000034 2**2
> >> CONTENTS, ALLOC, LOAD, READONLY, CODE
> >> 1 .data 00000000 00000000 00000000 00000034 2**2
> >> CONTENTS, ALLOC, LOAD, DATA
> >> 2 .bss 00000000 00000000 00000000 00000034 2**2
> >> ALLOC
> >> 3 .data-page-aligned 00000004 00000000 00000000 00000034 2**0
> >> CONTENTS, READONLY
> >> 4 .data.page_aligned 00000004 00000000 00000000 00000038 2**0
> >> CONTENTS, ALLOC, LOAD, DATA
> >>
> >> one can see that the .data-page-aligned section doesn't have the right
> >> section flags. So I'm pretty sure the relevant assembler heuristic is
> >> looking for things starting with ".data.", not just ".data".
> >>
> >> The kernel has a lot of code in assembly files that just does:
> >>
> >> .section ".data"
> >>
> >> and so there's a very real risk that folks who are doing pattern-matching
> >> development on assembly files will end up creating non-allocated sections
> >> by accident (I've certainly made this mistake myself, and there's a bug of
> >> this form in arch/x86/lib/thunk.S until commit
> >> c6c2d7a084d14a8a701be84872aa1b77d2945f46, so I don't think I'm alone)
> >
> > But that commit isn't anything to do with the section not being
> > allocated. That's actually irrelevant to us, since we sort out the
> > sectional allocations out at link time with the linker script (which
> > ends up completely ignoring the original file flags). The problem,
> > specifically, is that the linker does a rename of two identically named
> > sections with different flags ... we would also get the same behaviour
> > if the linker thought two sections weren't mergeable even if they did
> > have identical flags.
> >
> > In general, I think it's good practise for all use of sections to
> > specify the elf flags.
>
> This is doable in asm, yes. For .bss, we need to not forget about
> @nobits too: section .bss.foo,"aw",@nobits

That's only for bss ... we have about a handful of such statements and
they always use the assembler .bss directive (which doesn't need flags).

> > Actually, as I said, that would be .data-
> >
> > But, to be honest, I don't care that much about the data names because
> > we don't use -fdata-sections, the only objection is that having two
> > separate conventions for text and data is adding needless complexity ...
> > I do care about the text names, and I'd rather we stuck to the existing
> > convention there, or provided a good reason for the change ... and
> > wanting .text. as a prefix for everything is weak at best.
> >
> > The specific objection I have to converting everything to the .text..*
> > prefix for linux sections is that the gather all function sections can
> > no longer be *(.text.*), but has to become a regular expression, which
> > is adding fragility.
>
> I agree that it's not pretty, but otherwise we add fragility in other places:
> every section directive must be correct now. These places are more numerous
> than linker scripts.

I thought I just refuted that in the above: we don't care what the
assembler sections are flagged as because the linker script gets to pick
the flags anyway ... so most bugs arrived at this way have no visible
side effects ... and section merging problems have to be accounted for
anyway in the final linker scripts.

James

> But, leaving aside the question whether we want to take the risk
> of people forgetting to do it properly everywhere, -
> is it even *possible* to specify these attributes in C - that is,
> in gcc's __attribute__((section(".bss.foo")))?
>
> I think currently gcc does not support it - it guesses attributes
> based on the section name.

2010-06-17 20:19:15

by Denys Vlasenko

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

>> This is doable in asm, yes. For .bss, we need to not forget about
>> @nobits too: section .bss.foo,"aw",@nobits
>
> That's only for bss ... we have about a handful of such statements and
> they always use the assembler .bss directive (which doesn't need flags).
>
>> > Actually, as I said, that would be .data-

You are right, in assembly we can specify needed attributes.

I am more concerned about C:

arch/x86/include/asm/cache.h:
#define __read_mostly __attribute__((__section__(".data..read_mostly")))

If we change it to

#define __read_mostly __attribute__((__section__(".data-read_mostly")))

What makes this section have correct attributes?

With current gcc, __attribute__((__section__(".bss-page_aligned")))
does get wrong attributes. That's why we settled on .bss..foo
scheme.

> I thought I just refuted that in the above: we don't care what the
> assembler sections are flagged as because the linker script gets to pick
> the flags anyway ... so most bugs arrived at this way have no visible
> side effects ... and section merging problems have to be accounted for
> anyway in the final linker scripts.

When I was working on a older iteration of this patch,
I renamed .bss.page_aligned to .page_aligned.bss
and was bitten by linker bug: linker tried to merge
the sections and corrupted them.

Aha, here is it:
http://sourceware.org/bugzilla/show_bug.cgi?id=5006

It was since fixed, and if I read the ld patch correctly,
now ld emits a warning and switches entire target section
to PROGBITS - not what we want to happen to bss.

--
vda

2010-06-17 20:38:32

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/5] vmlinux.lds.h: Include *(.text.*) in TEXT_TEXT

On Thu, 2010-06-17 at 22:19 +0200, Denys Vlasenko wrote:
> >> This is doable in asm, yes. For .bss, we need to not forget about
> >> @nobits too: section .bss.foo,"aw",@nobits
> >
> > That's only for bss ... we have about a handful of such statements and
> > they always use the assembler .bss directive (which doesn't need flags).
> >
> >> > Actually, as I said, that would be .data-
>
> You are right, in assembly we can specify needed attributes.
>
> I am more concerned about C:
>
> arch/x86/include/asm/cache.h:
> #define __read_mostly __attribute__((__section__(".data..read_mostly")))
>
> If we change it to
>
> #define __read_mostly __attribute__((__section__(".data-read_mostly")))
>
> What makes this section have correct attributes?

The fact that we specify it correctly in the sectional gather in the
linker scripts. i.e. we should have a (NOLOAD) type for the
gathered .bss section ... although currently we don't.

The point (for the third time) is that if our linker scripts specify the
sections and attributes absolutely (and correctly) it doesn't matter
what random attributes the .o files pick up. It's only if we miss a
specifier that the linker tries to work it out from the input sections.

That's how we make postfix or any other type of "nonstandard" section
name work.

James