2010-06-04 03:10:34

by alan

Subject: Additional info on modpost segfault

Missed adding the actual segfault message:

LD drivers/usb/built-in.o
LD drivers/built-in.o
LD vmlinux.o
MODPOST vmlinux.o
/bin/sh: line 1: 20665 Segmentation fault (core dumped)
scripts/mod/modpost -o
/home/alan/GitTrees/linux-2.6-mid-ref/Module.symvers -S vmlinux.o
make[1]: *** [vmlinux.o] Error 139
make: *** [vmlinux.o] Error 2

I have looked at the gcc 4.4.4 changelog and I can't see anything that
should cause this.

--
Truth is stranger than fiction because fiction has to make sense.


2010-06-04 04:47:23

by Cong Wang

Subject: Re: Additional info on modpost segfault

On Thu, Jun 03, 2010 at 08:10:30PM -0700, alan wrote:
>Missed adding the actual segfault message:
>
> LD drivers/usb/built-in.o
> LD drivers/built-in.o
> LD vmlinux.o
> MODPOST vmlinux.o
>/bin/sh: line 1: 20665 Segmentation fault (core dumped)
>scripts/mod/modpost -o
>/home/alan/GitTrees/linux-2.6-mid-ref/Module.symvers -S vmlinux.o
>make[1]: *** [vmlinux.o] Error 139
>make: *** [vmlinux.o] Error 2
>
>I have looked at the gcc 4.4.4 changelog and I can't see anything
>that should cause this.
>

Hmm, you need to find which program segfaults here.

What does 'file the_core_file_you_got' say?

2010-06-04 07:23:06

by Michal Marek

Subject: Re: Additional info on modpost segfault

On 4.6.2010 06:51, Américo Wang wrote:
> On Thu, Jun 03, 2010 at 08:10:30PM -0700, alan wrote:
>> Missed adding the actual segfault message:
>>
>> LD drivers/usb/built-in.o
>> LD drivers/built-in.o
>> LD vmlinux.o
>> MODPOST vmlinux.o
>> /bin/sh: line 1: 20665 Segmentation fault (core dumped)
>> scripts/mod/modpost -o
>> /home/alan/GitTrees/linux-2.6-mid-ref/Module.symvers -S vmlinux.o
>> make[1]: *** [vmlinux.o] Error 139
>> make: *** [vmlinux.o] Error 2
>>
>> I have looked at the gcc 4.4.4 changelog and I can't see anything
>> that should cause this.
>>
>
> Hmm, you need to find which program segfaults here.

It's the modpost command run on vmlinux.o. Alan, can you try
$ gdb --args scripts/mod/modpost -o Module.symvers -S vmlinux.o
(gdb) r
(wait for the segfault)
(gdb) bt full

and post the backtrace?

Thanks,
Michal

2010-06-07 16:59:42

by alan

Subject: Re: Additional info on modpost segfault

On Fri, 2010-06-04 at 09:22 +0200, Michal Marek wrote:
> On 4.6.2010 06:51, Américo Wang wrote:
> > On Thu, Jun 03, 2010 at 08:10:30PM -0700, alan wrote:
> >> Missed adding the actual segfault message:
> >>
> >> LD drivers/usb/built-in.o
> >> LD drivers/built-in.o
> >> LD vmlinux.o
> >> MODPOST vmlinux.o
> >> /bin/sh: line 1: 20665 Segmentation fault (core dumped)
> >> scripts/mod/modpost -o
> >> /home/alan/GitTrees/linux-2.6-mid-ref/Module.symvers -S vmlinux.o
> >> make[1]: *** [vmlinux.o] Error 139
> >> make: *** [vmlinux.o] Error 2
> >>
> >> I have looked at the gcc 4.4.4 changelog and I can't see anything
> >> that should cause this.
> >>
> >
> > Hmm, you need to find which program segfaults here.
>
> It's the modpost command run on vmlinux.o. Alan, can you try
> $ gdb --args scripts/mod/modpost -o Module.symvers -S vmlinux.o
> (gdb) r
> (wait for the segfault)
> (gdb) bt full
>
> and post the backtrace?

Don't know if this will help much.

This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols
from /home/alan/GitTrees/linux-2.6-mid-ref/scripts/mod/modpost...(no
debugging symbols found)...done.
(gdb) r
Starting
program: /home/alan/GitTrees/linux-2.6-mid-ref/scripts/mod/modpost -o
Module.symvers -S vmlinux.o

Program received signal SIGSEGV, Segmentation fault.
0x0000000000403711 in main ()
(gdb) bt full
#0 0x0000000000403711 in main ()
No symbol table info available.

Trying to get it to compile with debugging info.


2010-06-08 05:48:11

by Cong Wang

Subject: Re: Additional info on modpost segfault

On Mon, Jun 07, 2010 at 09:59:39AM -0700, Alan wrote:
>On Fri, 2010-06-04 at 09:22 +0200, Michal Marek wrote:
>> On 4.6.2010 06:51, Américo Wang wrote:
>> > On Thu, Jun 03, 2010 at 08:10:30PM -0700, alan wrote:
>> >> Missed adding the actual segfault message:
>> >>
>> >> LD drivers/usb/built-in.o
>> >> LD drivers/built-in.o
>> >> LD vmlinux.o
>> >> MODPOST vmlinux.o
>> >> /bin/sh: line 1: 20665 Segmentation fault (core dumped)
>> >> scripts/mod/modpost -o
>> >> /home/alan/GitTrees/linux-2.6-mid-ref/Module.symvers -S vmlinux.o
>> >> make[1]: *** [vmlinux.o] Error 139
>> >> make: *** [vmlinux.o] Error 2
>> >>
>> >> I have looked at the gcc 4.4.4 changelog and I can't see anything
>> >> that should cause this.
>> >>
>> >
>> > Hmm, you need to find which program segfaults here.
>>
>> It's the modpost command run on vmlinux.o. Alan, can you try
>> $ gdb --args scripts/mod/modpost -o Module.symvers -S vmlinux.o
>> (gdb) r
>> (wait for the segfault)
>> (gdb) bt full
>>
>> and post the backtrace?
>
>Don't know if this will help much.
>
>This GDB was configured as "x86_64-redhat-linux-gnu".
>For bug reporting instructions, please see:
><http://www.gnu.org/software/gdb/bugs/>...
>Reading symbols
>from /home/alan/GitTrees/linux-2.6-mid-ref/scripts/mod/modpost...(no
>debugging symbols found)...done.
>(gdb) r
>Starting
>program: /home/alan/GitTrees/linux-2.6-mid-ref/scripts/mod/modpost -o
>Module.symvers -S vmlinux.o
>
>Program received signal SIGSEGV, Segmentation fault.
>0x0000000000403711 in main ()
>(gdb) bt full
>#0 0x0000000000403711 in main ()
>No symbol table info available.
>
>Trying to get it to compile with debugging info.
>

Try appending "-g" to HOSTCFLAGS in the top-level Makefile. ;)

2010-06-08 18:24:20

by alan

Subject: Re: Additional info on modpost segfault

On Tue, 2010-06-08 at 13:51 +0800, Américo Wang wrote:
> On Mon, Jun 07, 2010 at 09:59:39AM -0700, Alan wrote:
> >On Fri, 2010-06-04 at 09:22 +0200, Michal Marek wrote:
> >> On 4.6.2010 06:51, Américo Wang wrote:
> >> > On Thu, Jun 03, 2010 at 08:10:30PM -0700, alan wrote:
> >> >> Missed adding the actual segfault message:
> >> >>
> >> >> LD drivers/usb/built-in.o
> >> >> LD drivers/built-in.o
> >> >> LD vmlinux.o
> >> >> MODPOST vmlinux.o
> >> >> /bin/sh: line 1: 20665 Segmentation fault (core dumped)
> >> >> scripts/mod/modpost -o
> >> >> /home/alan/GitTrees/linux-2.6-mid-ref/Module.symvers -S vmlinux.o
> >> >> make[1]: *** [vmlinux.o] Error 139
> >> >> make: *** [vmlinux.o] Error 2
> >> >>
> >> >> I have looked at the gcc 4.4.4 changelog and I can't see anything
> >> >> that should cause this.
> >> >>
> >> >
> >> > Hmm, you need to find which program segfaults here.
> >>
> >> It's the modpost command run on vmlinux.o. Alan, can you try
> >> $ gdb --args scripts/mod/modpost -o Module.symvers -S vmlinux.o
> >> (gdb) r
> >> (wait for the segfault)
> >> (gdb) bt full
> >>
> >> and post the backtrace?
> >
> >Don't know if this will help much.
> >
> >This GDB was configured as "x86_64-redhat-linux-gnu".
> >For bug reporting instructions, please see:
> ><http://www.gnu.org/software/gdb/bugs/>...
> >Reading symbols
> >from /home/alan/GitTrees/linux-2.6-mid-ref/scripts/mod/modpost...(no
> >debugging symbols found)...done.
> >(gdb) r
> >Starting
> >program: /home/alan/GitTrees/linux-2.6-mid-ref/scripts/mod/modpost -o
> >Module.symvers -S vmlinux.o
> >
> >Program received signal SIGSEGV, Segmentation fault.
> >0x0000000000403711 in main ()
> >(gdb) bt full
> >#0 0x0000000000403711 in main ()
> >No symbol table info available.
> >
> >Trying to get it to compile with debugging info.
> >
>
> Try to append "-g" to HOSTCFLAGS in the top Makefile. ;)

Thanks. Tried CFLAGS and that did not work...

Here is the backtrace:
(gdb) bt full
#0 read_symbols (argc=5, argv=0x7fffffffe198) at
scripts/mod/modpost.c:1564
license = <value optimized out>
info = {size = 209695023, hdr = 0x7fffeb263000,
sechdrs = 0x7ffff48fcc2c, symtab_start = 0x7ffff7863e2c,
symtab_stop = 0x7ffff794341c, export_sec = 37,
export_unused_sec = 0, export_gpl_sec = 48,
export_unused_gpl_sec = 0, export_gpl_future_sec = 0,
strtab = 0x7ffff794341c "", modinfo = 0x0, modinfo_len = 0}
sym = <value optimized out>
symname = <value optimized out>
version = <value optimized out>
mod = 0x610010
#1 main (argc=5, argv=0x7fffffffe198) at scripts/mod/modpost.c:1999
mod = <value optimized out>
buf = {p = 0x0, pos = 0, size = 0}
kernel_read = <value optimized out>
module_read = <value optimized out>
dump_write = 0x7fffffffe4ed "Module.symvers"
opt = <value optimized out>
err = <value optimized out>
extsym_iter = <value optimized out>
extsym_start = <value optimized out>

The line it does not like is:

read_symbols(argv[optind++]);

Not certain why...


2010-06-09 08:32:04

by Cong Wang

Subject: Re: Additional info on modpost segfault

On Tue, Jun 08, 2010 at 11:24:18AM -0700, Alan wrote:
>On Tue, 2010-06-08 at 13:51 +0800, Américo Wang wrote:
>> On Mon, Jun 07, 2010 at 09:59:39AM -0700, Alan wrote:
>> >On Fri, 2010-06-04 at 09:22 +0200, Michal Marek wrote:
>> >> On 4.6.2010 06:51, Américo Wang wrote:
>> >> > On Thu, Jun 03, 2010 at 08:10:30PM -0700, alan wrote:
>> >> >> Missed adding the actual segfault message:
>> >> >>
>> >> >> LD drivers/usb/built-in.o
>> >> >> LD drivers/built-in.o
>> >> >> LD vmlinux.o
>> >> >> MODPOST vmlinux.o
>> >> >> /bin/sh: line 1: 20665 Segmentation fault (core dumped)
>> >> >> scripts/mod/modpost -o
>> >> >> /home/alan/GitTrees/linux-2.6-mid-ref/Module.symvers -S vmlinux.o
>> >> >> make[1]: *** [vmlinux.o] Error 139
>> >> >> make: *** [vmlinux.o] Error 2
>> >> >>
>> >> >> I have looked at the gcc 4.4.4 changelog and I can't see anything
>> >> >> that should cause this.
>> >> >>
>> >> >
>> >> > Hmm, you need to find which program segfaults here.
>> >>
>> >> It's the modpost command run on vmlinux.o. Alan, can you try
>> >> $ gdb --args scripts/mod/modpost -o Module.symvers -S vmlinux.o
>> >> (gdb) r
>> >> (wait for the segfault)
>> >> (gdb) bt full
>> >>
>> >> and post the backtrace?
>> >
>> >Don't know if this will help much.
>> >
>> >This GDB was configured as "x86_64-redhat-linux-gnu".
>> >For bug reporting instructions, please see:
>> ><http://www.gnu.org/software/gdb/bugs/>...
>> >Reading symbols
>> >from /home/alan/GitTrees/linux-2.6-mid-ref/scripts/mod/modpost...(no
>> >debugging symbols found)...done.
>> >(gdb) r
>> >Starting
>> >program: /home/alan/GitTrees/linux-2.6-mid-ref/scripts/mod/modpost -o
>> >Module.symvers -S vmlinux.o
>> >
>> >Program received signal SIGSEGV, Segmentation fault.
>> >0x0000000000403711 in main ()
>> >(gdb) bt full
>> >#0 0x0000000000403711 in main ()
>> >No symbol table info available.
>> >
>> >Trying to get it to compile with debugging info.
>> >
>>
>> Try to append "-g" to HOSTCFLAGS in the top Makefile. ;)
>
>Thanks. Tried CFLAGS and that did not work...
>
>Here is the backtrace:
>(gdb) bt full
>#0 read_symbols (argc=5, argv=0x7fffffffe198) at
>scripts/mod/modpost.c:1564
> license = <value optimized out>
> info = {size = 209695023, hdr = 0x7fffeb263000,
> sechdrs = 0x7ffff48fcc2c, symtab_start = 0x7ffff7863e2c,
> symtab_stop = 0x7ffff794341c, export_sec = 37,
> export_unused_sec = 0, export_gpl_sec = 48,
> export_unused_gpl_sec = 0, export_gpl_future_sec = 0,
> strtab = 0x7ffff794341c "", modinfo = 0x0, modinfo_len = 0}
> sym = <value optimized out>
> symname = <value optimized out>
> version = <value optimized out>
> mod = 0x610010
>#1 main (argc=5, argv=0x7fffffffe198) at scripts/mod/modpost.c:1999
> mod = <value optimized out>
> buf = {p = 0x0, pos = 0, size = 0}
> kernel_read = <value optimized out>
> module_read = <value optimized out>
> dump_write = 0x7fffffffe4ed "Module.symvers"
> opt = <value optimized out>
> err = <value optimized out>
> extsym_iter = <value optimized out>
> extsym_start = <value optimized out>
>
>The line is does not like is:
>
> read_symbols(argv[optind++]);
>
>Not certain why...

Hmm, it seems the segfault happens at 'license = get_modinfo(...)'?
I can't spot any bug around that.

If I were you, I would step through it in gdb.

Thanks!

2010-06-10 23:08:24

by Krzysztof Halasa

Subject: Re: Additional info on modpost segfault

Alan <[email protected]> writes:

> program: /home/alan/GitTrees/linux-2.6-mid-ref/scripts/mod/modpost -o
> Module.symvers -S vmlinux.o
>
> Program received signal SIGSEGV, Segmentation fault.

It just hit me.
It's the offset calculation in reloc_location() which overflows:
return (void *)elf->hdr + sechdrs[section].sh_offset +
(r->r_offset - sechdrs[section].sh_addr);

E.g. for the first rodata r entry:
r->r_offset < sechdrs[section].sh_addr
and the expression in the parentheses produces 0xFFFFFFE0 or something
equally wise.

Does the attached patch fix it?

Signed-off-by: Krzysztof Hałasa <[email protected]>

--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -1317,8 +1317,8 @@ static unsigned int *reloc_location(struct elf_info *elf,
 	Elf_Shdr *sechdrs = elf->sechdrs;
 	int section = sechdr->sh_info;
 
 	return (void *)elf->hdr + sechdrs[section].sh_offset +
-		(r->r_offset - sechdrs[section].sh_addr);
+		r->r_offset - sechdrs[section].sh_addr;
 }

static int addend_386_rel(struct elf_info *elf, Elf_Shdr *sechdr, Elf_Rela *r)

2010-06-11 00:06:36

by alan

Subject: Re: Additional info on modpost segfault

On Fri, 11 Jun 2010, Krzysztof Halasa wrote:

> Alan <[email protected]> writes:
>
>> program: /home/alan/GitTrees/linux-2.6-mid-ref/scripts/mod/modpost -o
>> Module.symvers -S vmlinux.o
>>
>> Program received signal SIGSEGV, Segmentation fault.
>
> It just hit me.
> It's the offset calculation in reloc_location() which overflows:
> return (void *)elf->hdr + sechdrs[section].sh_offset +
> (r->r_offset - sechdrs[section].sh_addr);
>
> E.g. for the first rodata r entry:
> r->r_offset < sechdrs[section].sh_addr
> and the expression in the parenthesis produces 0xFFFFFFE0 or something
> equally wise.
>
> Does the attached patch fix it?

YES!

Thank you!

Now the big question is why does this compile on older versions of gcc?

This needs to get added into 2.6.35-rc2.


>
> Signed-off-by: Krzysztof Hałasa <[email protected]>
>
> --- a/scripts/mod/modpost.c
> +++ b/scripts/mod/modpost.c
> @@ -1317,8 +1317,8 @@ static unsigned int *reloc_location(struct elf_info *elf,
> Elf_Shdr *sechdrs = elf->sechdrs;
> int section = sechdr->sh_info;
>
> return (void *)elf->hdr + sechdrs[section].sh_offset +
> - (r->r_offset - sechdrs[section].sh_addr);
> + r->r_offset - sechdrs[section].sh_addr;
> }
>
> static int addend_386_rel(struct elf_info *elf, Elf_Shdr *sechdr, Elf_Rela *r)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

--
Truth is stranger than fiction because fiction has to make sense.

2010-06-11 20:42:58

by H. Peter Anvin

Subject: Re: Additional info on modpost segfault

Michal, are you sending this to Linus?

-hpa


On 06/10/2010 04:08 PM, Krzysztof Halasa wrote:
> Alan <[email protected]> writes:
>
>> program: /home/alan/GitTrees/linux-2.6-mid-ref/scripts/mod/modpost -o
>> Module.symvers -S vmlinux.o
>>
>> Program received signal SIGSEGV, Segmentation fault.
>
> It just hit me.
> It's the offset calculation in reloc_location() which overflows:
> return (void *)elf->hdr + sechdrs[section].sh_offset +
> (r->r_offset - sechdrs[section].sh_addr);
>
> E.g. for the first rodata r entry:
> r->r_offset < sechdrs[section].sh_addr
> and the expression in the parenthesis produces 0xFFFFFFE0 or something
> equally wise.
>
> Does the attached patch fix it?
>
> Signed-off-by: Krzysztof Hałasa <[email protected]>
>
> --- a/scripts/mod/modpost.c
> +++ b/scripts/mod/modpost.c
> @@ -1317,8 +1317,8 @@ static unsigned int *reloc_location(struct elf_info *elf,
> Elf_Shdr *sechdrs = elf->sechdrs;
> int section = sechdr->sh_info;
>
> return (void *)elf->hdr + sechdrs[section].sh_offset +
> - (r->r_offset - sechdrs[section].sh_addr);
> + r->r_offset - sechdrs[section].sh_addr;
> }
>
> static int addend_386_rel(struct elf_info *elf, Elf_Shdr *sechdr, Elf_Rela *r)

2010-06-11 22:49:54

by Michal Marek

Subject: Re: Additional info on modpost segfault

On Fri, Jun 11, 2010 at 01:42:47PM -0700, H. Peter Anvin wrote:
> Michal, are you sending this to Linus?

I will, once Linus takes another pull request I sent a couple of hours
ago. Thanks for the patch, Krzysztof!

Michal

>
> -hpa
>
>
> On 06/10/2010 04:08 PM, Krzysztof Halasa wrote:
> > Alan <[email protected]> writes:
> >
> >> program: /home/alan/GitTrees/linux-2.6-mid-ref/scripts/mod/modpost -o
> >> Module.symvers -S vmlinux.o
> >>
> >> Program received signal SIGSEGV, Segmentation fault.
> >
> > It just hit me.
> > It's the offset calculation in reloc_location() which overflows:
> > return (void *)elf->hdr + sechdrs[section].sh_offset +
> > (r->r_offset - sechdrs[section].sh_addr);
> >
> > E.g. for the first rodata r entry:
> > r->r_offset < sechdrs[section].sh_addr
> > and the expression in the parenthesis produces 0xFFFFFFE0 or something
> > equally wise.
> >
> > Does the attached patch fix it?
> >
> > Signed-off-by: Krzysztof Hałasa <[email protected]>
> >
> > --- a/scripts/mod/modpost.c
> > +++ b/scripts/mod/modpost.c
> > @@ -1317,8 +1317,8 @@ static unsigned int *reloc_location(struct elf_info *elf,
> > Elf_Shdr *sechdrs = elf->sechdrs;
> > int section = sechdr->sh_info;
> >
> > return (void *)elf->hdr + sechdrs[section].sh_offset +
> > - (r->r_offset - sechdrs[section].sh_addr);
> > + r->r_offset - sechdrs[section].sh_addr;
> > }
> >
> > static int addend_386_rel(struct elf_info *elf, Elf_Shdr *sechdr, Elf_Rela *r)