2005-05-11 09:17:43

by Imre Deak

Subject: arm: Inconsistent kallsyms data

Hi,

building 2.6.12-rc4 results in "Inconsistent kallsyms data". Setting
CONFIG_KALLSYMS_EXTRA_PASS=y doesn't help.

I made a diff of .tmp_kallsyms[12].S after converting them to human
readable form with kallsyms_uncompress.pl .

I noticed that the error is triggered by an __initdata definition. It is
accessed only from an __init function, so that's ok I think. Removing
the __initdata attribute gets rid of the error message.

Let me know if you need more data to track the problem.

Any help is appreciated,
Imre


Attachments:
kallsyms1_2.diff (49.33 kB)

2005-05-11 10:57:46

by Keith Owens

Subject: Re: arm: Inconsistent kallsyms data

On Wed, 11 May 2005 12:05:10 +0300,
Imre Deak <[email protected]> wrote:
>building 2.6.12-rc4 results in "Inconsistent kallsyms data". Setting
>CONFIG_KALLSYMS_EXTRA_PASS=y doesn't help.
>
>I made a diff of .tmp_kallsyms[12].S after converting them to human
>readable form with kallsyms_uncompress.pl .
>
>I noticed that the error is triggered by an __initdata definition. It is
>accessed only from an __init function, so that's ok I think. Removing
>the __initdata attribute gets rid of the error message.
>
>Let me know if you need more data to track the problem.

A better approach is this patch; it extracts the maps, including the
section headers, after each stage. I sent it to lkml on Sat, 26 Feb
2005 with no response. Apply the patch, run "make debug_kallsyms" and
send me the .tmp_map* files.

---

Make it easier to generate maps for debugging kallsyms problems.
debug_kallsyms is only a debugging target so no help or silent mode.

Signed-off-by: Keith Owens <[email protected]>


Index: linux/Makefile
===================================================================
--- linux.orig/Makefile 2005-02-25 16:21:44.000000000 +1100
+++ linux/Makefile 2005-02-26 21:30:54.000000000 +1100
@@ -722,6 +722,16 @@ quiet_cmd_kallsyms = KSYM $@
 # Needs to visit scripts/ before $(KALLSYMS) can be used.
 $(KALLSYMS): scripts ;

+# Generate some data for debugging strange kallsyms problems
+debug_kallsyms: .tmp_map$(last_kallsyms)
+
+.tmp_map%: .tmp_vmlinux% FORCE
+	($(OBJDUMP) -h $< | awk '/^ +[0-9]/{print $$4 " 0 " $$2}'; $(NM) $<) | sort > $@
+
+.tmp_map3: .tmp_map2
+
+.tmp_map2: .tmp_map1
+
 endif # ifdef CONFIG_KALLSYMS

 # vmlinux image - including updated kernel symbols
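
For readers who want to see what the .tmp_map% rule produces without running a build, here is a minimal Python sketch of the same pipeline. It is an illustration only: the sample objdump and nm text is hypothetical, and Python's sorted() orders by code point, which matches sort(1) only in the C locale. The awk filter keeps the section-header lines of "objdump -h" and prints "VMA 0 name"; those lines are then merged with the nm output and sorted.

```python
import re

def section_lines(objdump_h_output):
    """Mimic: awk '/^ +[0-9]/{print $4 " 0 " $2}' applied to `objdump -h` text."""
    out = []
    for line in objdump_h_output.splitlines():
        # Section rows start with spaces followed by the index number.
        if re.match(r"^ +[0-9]", line):
            fields = line.split()          # awk-style whitespace splitting
            out.append(f"{fields[3]} 0 {fields[1]}")   # $4 = VMA, $2 = name
    return out

def build_map(objdump_h_output, nm_output):
    """Merge section headers with nm symbols and sort, like `(...; nm) | sort`."""
    return sorted(section_lines(objdump_h_output) + nm_output.splitlines())

# Hypothetical sample input, shaped like real tool output but not from the thread.
SAMPLE_OBJDUMP = """\
Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .data         00004000  c02aa554  c02aa554  002aa554  2**5
                  CONTENTS, ALLOC, LOAD, DATA
  1 .bss          00020000  c02ae560  c02ae560  002ae554  2**5
                  ALLOC
"""
SAMPLE_NM = """\
c02ae554 D _edata
c02ae560 B __bss_start
c02ae4a4 D net_table
"""

if __name__ == "__main__":
    for line in build_map(SAMPLE_OBJDUMP, SAMPLE_NM):
        print(line)
```

Because "0" sorts before any letter, a section header lands ahead of the symbols that share its address, which makes section boundaries easy to spot in the map.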


2005-05-11 11:01:19

by Paulo Marques

Subject: Re: arm: Inconsistent kallsyms data

Imre Deak wrote:
> Hi,
>
> building 2.6.12-rc4 results in "Inconsistent kallsyms data". Setting
> CONFIG_KALLSYMS_EXTRA_PASS=y doesn't help.
>
> I made a diff of .tmp_kallsyms[12].S after converting them to human
> readable form with kallsyms_uncompress.pl .

From the diff, I can see the problem is that "__bss_start" changes
position with "_edata" from the first to the second pass.

If you read my post from yesterday "Re: Linux v2.6.12-rc4" (not a very
descriptive subject), I explain there why this is a problem.

Sam, from looking at your patch, it seems that the patch shouldn't
affect these particular symbols. Am I correct?

Maybe we really need the more robust fix to kallsyms, so that this sort
of thing doesn't bite us in the future, no matter what symbols change
position.

> I noticed that the error is triggered by an __initdata definition. It is
> accessed only from an __init function, so that's ok I think. Removing
> the __initdata attribute gets rid of the error message.

This is just a "tape over" solution that makes the symbols change
positions, so that maybe these 2 symbols don't get selected for sampling.

> Let me know if you need more data to track the problem.

A simple workaround is to increase the WORKING_SET define in
scripts/kallsyms.c to something like 65536. This will include every
symbol in the token table calculation, so that even if symbol positions
change, the token table should be the same.
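
To make the sampling argument concrete, here is a toy Python model. It is not the selection logic of scripts/kallsyms.c (that code is not shown in this thread); the positional "every N-th symbol" sampler and the function name are assumptions made purely for illustration. The two symbol lists follow the order reported in the kallsyms diff.

```python
def sample_every(symbols, step):
    """Toy sampler (assumption): keep every `step`-th symbol by list position."""
    return symbols[::step]

# Symbol order around _edata in each pass, as reported in the diff.
pass1 = ["net_table", "_edata", "__bss_start", "system_state"]
pass2 = ["net_table", "__bss_start", "_edata", "system_state"]

# A sparse sample depends on position, so swapping two neighbours
# can change which symbols feed the token statistics.
sparse1 = set(sample_every(pass1, 2))
sparse2 = set(sample_every(pass2, 2))

# Sampling everything (the effect of a very large working set in this toy)
# sees the same set of symbols regardless of their order.
full1 = set(sample_every(pass1, 1))
full2 = set(sample_every(pass2, 1))

if __name__ == "__main__":
    print("sparse samples equal:", sparse1 == sparse2)
    print("full samples equal:  ", full1 == full2)
```

In this toy, the sparse samples differ between the passes while the full samples are identical, which is the property the larger WORKING_SET relies on.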

I tested this with a configuration I have here that had a similar
problem and it indeed worked as expected.

The problem with this approach is that it takes longer to calculate the
token table. (~3secs on my P4 2.8GHz, 11300 symbols)

--
Paulo Marques - http://www.grupopie.com

An expert is a person who has made all the mistakes that can be
made in a very narrow field.
Niels Bohr (1885 - 1962)

2005-05-11 21:15:27

by Sam Ravnborg

Subject: Re: arm: Inconsistent kallsyms data

> I sent it to lkml on Sat, 26 Feb 2005 with no response.

As noted in mail sent to you the 14th of March I have this patch queued
up. But since Linus has asked for a calm down period it will wait
until next kernel release opens up.

Sam

2005-05-11 21:21:30

by Sam Ravnborg

Subject: Re: arm: Inconsistent kallsyms data

On Wed, May 11, 2005 at 11:59:58AM +0100, Paulo Marques wrote:
> Imre Deak wrote:
> >Hi,
> >
> >building 2.6.12-rc4 results in "Inconsistent kallsyms data". Setting
> >CONFIG_KALLSYMS_EXTRA_PASS=y doesn't help.
> >
> >I made a diff of .tmp_kallsyms[12].S after converting them to human
> >readable form with kallsyms_uncompress.pl .
>
> From the diff, I can see the problem is that "__bss_start" changes
> position with "_edata" from the first to the second pass.
>
> If you read my post from yesterday "Re: Linux v2.6.12-rc4" (not a very
> descriptive subject), I explain there why this is a problem.
>
> Sam, from looking at your patch, it seems that the patch shouldn't
> affect these particular symbols. Am I correct?

As I read the diff, my patch will not solve this problem.
Let's see whether the output generated with Keith's debug_kallsyms
patch sheds some light on it.

Sam

2005-05-11 21:47:29

by Russell King

Subject: Re: arm: Inconsistent kallsyms data

On Wed, May 11, 2005 at 12:05:10PM +0300, Imre Deak wrote:
> I noticed that the error is triggered by an __initdata definition. It is
> accessed only from an __init function, so that's ok I think. Removing
> the __initdata attribute gets rid of the error message.

This sounds very vague. Can you show us the code please?

Note that uninitialised variables with an __initdata marking aren't
legal.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

2005-05-11 23:55:22

by Imre Deak

Subject: Re: arm: Inconsistent kallsyms data

The definition I referred to:

static struct platform_device *devices[] __initdata = {
	NULL,
};

To me, however, it seems to be irrelevant, since in the meantime I
managed to trigger the error in multiple ways by changing something else.

I sent the .tmp_map* files of one failed build to Keith.

--Imre

On Wed, 2005-05-11 at 22:47 +0100, ext Russell King wrote:
> On Wed, May 11, 2005 at 12:05:10PM +0300, Imre Deak wrote:
> > I noticed that the error is triggered by an __initdata definition. It is
> > accessed only from an __init function, so that's ok I think. Removing
> > the __initdata attribute gets rid of the error message.
>
> This sounds very vague. Can you show us the code please?
>
> Note that uninitialised variables with an __initdata marking aren't
> legal.


2005-05-12 05:21:55

by Keith Owens

Subject: Re: arm: Inconsistent kallsyms data

On Wed, 11 May 2005 12:05:10 +0300,
Imre Deak <[email protected]> wrote:
>building 2.6.12-rc4 results in "Inconsistent kallsyms data". Setting
>CONFIG_KALLSYMS_EXTRA_PASS=y doesn't help.
>
>I made a diff of .tmp_kallsyms[12].S after converting them to human
>readable form with kallsyms_uncompress.pl .
>
>--- .tmp_kallsyms1.txt 2005-05-11 11:54:50.000000000 +0300
>+++ .tmp_kallsyms2.txt 2005-05-11 11:54:55.000000000 +0300
>@@ -16718,773 +16718,773 @@
> d gss_kerberos_pfs PTR 0xc02ae46c
> d gss_kerberos_ops PTR 0xc02ae494
> D net_table PTR 0xc02ae4a4
>-D _edata PTR 0xc02ae554
>-B __bss_start PTR 0xc02ae560
>-B system_state PTR 0xc02ae560
>.....
>+B __bss_start PTR 0xc02e8a00
>+D _edata PTR 0xc02e8a00
>+B system_state PTR 0xc02e8a00
>...

The problem is being caused by alignment padding combined with a sort
order combined with the token compression in scripts/kallsyms.c.

Pass 1 generates this map extract. Note the gap between _edata and
__bss_start; this is linker-generated padding to start .bss on a 16
byte boundary.

...
c02ae4a4 D net_table
c02ae554 D _edata
c02ae560 B __bss_start
c02ae560 B system_state
c02ae564 B saved_command_line

...

Pass 2 defines the kallsyms_* symbols and generates data for them.
That adds new symbols and increases _edata, all perfectly normal.

...
c02ae4a4 D net_table
c02ae560 D kallsyms_addresses
c02bf6b0 D kallsyms_num_syms
c02bf6c0 D kallsyms_names
c02e82b0 D kallsyms_markers
c02e83d0 D kallsyms_token_table
c02e8800 D kallsyms_token_index
c02e8a00 B __bss_start
c02e8a00 B system_state
c02e8a00 D _edata
c02e8a04 B saved_command_line
...

After adding the kallsyms data in pass 2, there is no longer a gap
between _edata and __bss_start. The added kallsyms data moved _edata
to a 16 byte boundary which removed the need for the linker to add any
padding.
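
The padding arithmetic can be checked with a few lines of Python. This is only a sketch: align_up is a helper written for this example, and the 16 byte alignment is the figure stated above; the _edata addresses are taken from the two map extracts.

```python
def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)

ALIGN = 16                   # .bss alignment as stated in this message

edata_pass1 = 0xc02ae554     # _edata in the pass 1 map
edata_pass2 = 0xc02e8a00     # _edata in the pass 2 map

bss_pass1 = align_up(edata_pass1, ALIGN)
bss_pass2 = align_up(edata_pass2, ALIGN)

pad_pass1 = bss_pass1 - edata_pass1   # bytes of linker padding in pass 1
pad_pass2 = bss_pass2 - edata_pass2   # bytes of linker padding in pass 2

if __name__ == "__main__":
    print(f"pass 1: __bss_start={bss_pass1:#x} padding={pad_pass1}")
    print(f"pass 2: __bss_start={bss_pass2:#x} padding={pad_pass2}")
```

Pass 1 needs 12 bytes of padding to reach 0xc02ae560, while in pass 2 _edata is already aligned, so the padding is zero and the two symbols share an address.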

When $(NM) is run to produce the map, the -n option first sorts by
address, then by symbol type, then by name. When two symbols like
_edata and __bss_start have the same address they sort by type. D
comes after B which changes the order of the output symbols. That in
turn changes the input to the token compression code in
scripts/kallsyms.c which finally results in different kallsyms output.
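
The effect of the tie-break can be reproduced with a small Python sketch. It models the ordering exactly as described in the paragraph above (address, then type, then name) by sorting tuples; it does not claim to reproduce nm's actual implementation, and names_in_order is a helper invented for this example.

```python
def names_in_order(symbols):
    """Sort (address, type, name) tuples and return just the names."""
    return [name for _addr, _type, name in sorted(symbols)]

# Pass 1: padding keeps _edata below __bss_start.
pass1 = [
    (0xc02ae554, "D", "_edata"),
    (0xc02ae560, "B", "__bss_start"),
    (0xc02ae560, "B", "system_state"),
]

# Pass 2: no padding, so all three symbols share one address
# and the type letter decides the order ("B" sorts before "D").
pass2 = [
    (0xc02e8a00, "D", "_edata"),
    (0xc02e8a00, "B", "__bss_start"),
    (0xc02e8a00, "B", "system_state"),
]

if __name__ == "__main__":
    print("pass 1:", names_in_order(pass1))
    print("pass 2:", names_in_order(pass2))
```

In pass 1 _edata comes first; in pass 2 it moves behind both B symbols, matching the order in the pass 2 map extract.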

This problem can only affect the padding between _edata and the start
of the next section, and only when the amount of padding is zero in one
kallsyms pass and non-zero in another. The simplest solution is to
change the vmlinux.lds files to add '. = . + 1;' between the definition
of _edata and the start of the next section. That ensures that _edata
never has the same address as the start of the next section and the
problem goes away. Of course it means changing 30 vmlinux.lds files,
but it is the same one line change in all of them.

2005-05-12 09:42:22

by Imre Deak

Subject: RE: arm: Inconsistent kallsyms data

Thanks! After changing vmlinux.lds.in as you suggest the problem goes
away. The relevant parts in Pass1 and Pass2 now:

Pass1:
D net_table PTR 0xc02ae4a4
D _edata PTR 0xc02ae554
B __bss_start PTR 0xc02ae560
B system_state PTR 0xc02ae560
B saved_command_line PTR 0xc02ae564

Pass2:
D net_table PTR 0xc02ae4a4
D _edata PTR 0xc02e8a00
B __bss_start PTR 0xc02e8a20
B system_state PTR 0xc02e8a20
B saved_command_line PTR 0xc02e8a24

So the order remains.
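
As a cross-check of these numbers, here is a short Python sketch of the effect of '. = . + 1;'. Two things are assumptions: align_up and bss_start are helpers written for this example, and the 32 byte alignment is inferred from the addresses above (with 16 bytes, pass 2 would give 0xc02e8a10 rather than 0xc02e8a20), not taken from the linker script.

```python
def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)

def bss_start(edata, alignment, bump):
    """Start of .bss when the location counter is advanced by `bump`
    bytes after _edata and then aligned."""
    return align_up(edata + bump, alignment)

ALIGN = 32   # assumption: inferred from the reported addresses

EDATA = {"pass 1": 0xc02ae554, "pass 2": 0xc02e8a00}

# Without the fix (bump = 0) pass 2 collides; with it (bump = 1) neither does.
without_fix = {p: bss_start(e, ALIGN, 0) for p, e in EDATA.items()}
with_fix = {p: bss_start(e, ALIGN, 1) for p, e in EDATA.items()}

if __name__ == "__main__":
    for p, e in EDATA.items():
        print(f"{p}: _edata={e:#x} "
              f"bss without fix={without_fix[p]:#x} "
              f"bss with fix={with_fix[p]:#x}")
```

With the one-byte bump, __bss_start is strictly greater than _edata in both passes, so the tie that reordered the symbols cannot occur.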

The patch for arm is included.

--Imre

> -----Original Message-----
> From: ext Keith Owens [mailto:[email protected]]
> Sent: 12 May, 2005 08:21
> To: Deak Imre (Nokia-M/Helsinki)
> Cc: [email protected]; Sam Ravnborg; Paulo Marques
> Subject: Re: arm: Inconsistent kallsyms data
>
> The problem is being caused by alignment padding combined with a sort
> order combined with the token compression in scripts/kallsyms.c.
>
> Pass 1 generates this map extract. Note the gap between _edata and
> __bss_start, this is linker generated padding to start .bss on a 16
> byte boundary.
>
> ...
> c02ae4a4 D net_table
> c02ae554 D _edata
> c02ae560 B __bss_start
> c02ae560 B system_state
> c02ae564 B saved_command_line
>
> ...
>
> Pass 2 defines the kallsyms_* symbols and generates data for them.
> That adds new symbols and increases _edata, all perfectly normal.
>
> ...
> c02ae4a4 D net_table
> c02ae560 D kallsyms_addresses
> c02bf6b0 D kallsyms_num_syms
> c02bf6c0 D kallsyms_names
> c02e82b0 D kallsyms_markers
> c02e83d0 D kallsyms_token_table
> c02e8800 D kallsyms_token_index
> c02e8a00 B __bss_start
> c02e8a00 B system_state
> c02e8a00 D _edata
> c02e8a04 B saved_command_line
> ...
>
> After adding the kallsyms data in pass 2, there is no longer a gap
> between _edata and __bss_start. The added kallsyms data moved _edata
> to a 16 byte boundary which removed the need for the linker to add any
> padding.
>
> When $(NM) is run to produce the map, the -n option first sorts by
> address, then by symbol type, then by name. When two symbols like
> _edata and __bss_start have the same address they sort by type. D
> comes after B which changes the order of the output symbols. That in
> turn changes the input to the token compression code in
> scripts/kallsyms.c which finally results in different kallsyms output.
>
> This problem can only affect the padding between _edata and the start
> of the next section, and only when the amount of padding is zero in one
> kallsyms pass and non-zero in another. The simplest solution is to
> change the vmlinux.lds files to add '. = . + 1;' between the definition
> of _edata and the start of the next section. That ensures that _edata
> never has the same address as the start of the next section and the
> problem goes away. Of course it means changing 30 vmlinux.lds files,
> but it is the same one line change in all of them.
>
>


Attachments:
arm-kbuildfix-patch.diff (547.00 B)