2020-09-15 07:52:36

by Hugh Dickins

Subject: Static call dependency on libelf version?

This is just an FYI written from a position of ignorance: I may
have got it wrong, and my build environment too piecemeal to matter
to anyone else; but what I saw was weird enough to be worth mentioning,
in case it saves someone some time.

I usually build and test on mmotm weekly rather than linux-next daily.
No problem with 5.9-rc3-mm1 from 2020-09-04, nor with 5.9-rc5, but
(on two machines) 5.9-rc5-mm1 from 2020-09-13 could not link vmlinux:

AR init/built-in.a
LD vmlinux.o
ld: warning: init/main.o has a corrupt section with a size (7472747368732e00) larger than the file size
ld: warning: init/main.o has a corrupt section with a size (7472747368732e00) larger than the file size
ld: warning: init/main.o has a corrupt section with a size (7472747368732e00) larger than the file size
ld: warning: init/main.o has a corrupt section with a size (7472747368732e00) larger than the file size
ld: init/built-in.a: member init/main.o in archive is not an object
make[1]: *** [vmlinux] Error 1
make: *** [__sub-make] Error 2

On the third machine, a more recent installation, but using the same
gcc and the same binutils, I could build the same config successfully.
init/main.o was the same size on each (49216 bytes), but diff of hd
of the good against the bad showed:

2702,2709c2702,2709
< 00bfc0 000001db 00000001 00000003 00000000 >................<
< 00bfd0 00000000 00000000 0000b316 00000000 >................<
< 00bfe0 00000018 00000000 00000000 00000000 >................<
< 00bff0 00000001 00000000 00000008 00000000 >................<
< 00c000 000001ee 00000004 00000040 00000000 >........@.......<
< 00c010 00000000 00000000 0000b330 00000000 >........0.......<
< 00c020 00000090 00000000 0000002d 00000030 >........-...0...<
< 00c030 00000008 00000000 00000018 00000000 >................<
---
> 00bfc0 00000000 00000000 000001f1 00000000 >................<
> 00bfd0 79732e00 6261746d 74732e00 62617472 >..symtab..strtab<
> 00bfe0 68732e00 74727473 2e006261 616c6572 >..shstrtab..rela<
> 00bff0 7865742e 722e0074 2e616c65 61746164 >.text..rela.data<
> 00c000 73622e00 722e0073 5f616c65 6172745f >..bss..rela__tra<
> 00c010 6f706563 73746e69 7274705f 722e0073 >cepoints_ptrs..r<
> 00c020 2e616c65 74617473 635f6369 2e6c6c61 >ela.static_call.<
> 00c030 74786574 65722e00 692e616c 2e74696e >text..rela.init.<

and 217 other .os in the build tree also "corrupted".
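
A rough sketch of how the bogus size in the ld warning relates to the dump above. It assumes hd printed little-endian 32-bit words and that a 64-byte Elf64_Shdr starts at offset 0x00bfc0 (as the "good" side suggests); the variable names are illustrative only.

```python
import struct

# Words copied from the "bad" side of the diff, offsets 0x00bfc0-0x00bfff.
# Assumption: hd printed them as little-endian 32-bit words.
bad_words = [
    0x00000000, 0x00000000, 0x000001f1, 0x00000000,
    0x79732e00, 0x6261746d, 0x74732e00, 0x62617472,
    0x68732e00, 0x74727473, 0x2e006261, 0x616c6572,
    0x7865742e, 0x722e0074, 0x2e616c65, 0x61746164,
]
raw = struct.pack("<16I", *bad_words)

# Elf64_Shdr layout: sh_name, sh_type, sh_flags, sh_addr, sh_offset,
# sh_size, sh_link, sh_info, sh_addralign, sh_entsize (64 bytes total).
fields = struct.unpack("<IIQQQQIIQQ", raw)
sh_size = fields[5]

# The "size" is the same value ld complained about.
print(hex(sh_size))

# Read back as bytes, it is a fragment of section-name strings.
size_bytes = sh_size.to_bytes(8, "little")
print(size_bytes)
```

Under those assumptions the sh_size field holds the bytes of "\0.shstrt", that is, section-name string data sitting where a section header was expected.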

CONFIG_HAVE_STATIC_CALL=y
CONFIG_HAVE_STATIC_CALL_INLINE=y
stand out as new in the .config for 5.9-rc5-mm1, and references
to objtool in static_call.h and static_call_types.h took me to
tools/objtool/Makefile, with its use of libelf.

I've copied over files of the newer libelf (0.168) to the failing
machines, which are now building the 5.9-rc5-mm1 vmlinux correctly.

It looks as if CONFIG_HAVE_STATIC_CALL=y depends on a newer libelf
than I had before (0.155), and should either insist on a minimum
version, or else be adjusted to work with older versions.

Hope this helps,
Hugh


2020-09-15 09:31:51

by Peter Zijlstra

Subject: Re: Static call dependency on libelf version?

On Tue, Sep 15, 2020 at 12:50:54AM -0700, Hugh Dickins wrote:
> This is just an FYI written from a position of ignorance: I may
> have got it wrong, and my build environment too piecemeal to matter
> to anyone else; but what I saw was weird enough to be worth mentioning,
> in case it saves someone some time.
>
> I usually build and test on mmotm weekly rather than linux-next daily.
> No problem with 5.9-rc3-mm1 from 2020-09-04, nor with 5.9-rc5, but
> (on two machines) 5.9-rc5-mm1 from 2020-09-13 could not link vmlinux:
>
> AR init/built-in.a
> LD vmlinux.o
> ld: warning: init/main.o has a corrupt section with a size (7472747368732e00) larger than the file size
> ld: warning: init/main.o has a corrupt section with a size (7472747368732e00) larger than the file size
> ld: warning: init/main.o has a corrupt section with a size (7472747368732e00) larger than the file size
> ld: warning: init/main.o has a corrupt section with a size (7472747368732e00) larger than the file size
> ld: init/built-in.a: member init/main.o in archive is not an object
> make[1]: *** [vmlinux] Error 1
> make: *** [__sub-make] Error 2
>
> On the third machine, a more recent installation, but using the same
> gcc and the same binutils, I could build the same config successfully.
> init/main.o was the same size on each (49216 bytes), but diff of hd
> of the good against the bad showed:
>
> 2702,2709c2702,2709
> < 00bfc0 000001db 00000001 00000003 00000000 >................<
> < 00bfd0 00000000 00000000 0000b316 00000000 >................<
> < 00bfe0 00000018 00000000 00000000 00000000 >................<
> < 00bff0 00000001 00000000 00000008 00000000 >................<
> < 00c000 000001ee 00000004 00000040 00000000 >........@.......<
> < 00c010 00000000 00000000 0000b330 00000000 >........0.......<
> < 00c020 00000090 00000000 0000002d 00000030 >........-...0...<
> < 00c030 00000008 00000000 00000018 00000000 >................<
> ---
> > 00bfc0 00000000 00000000 000001f1 00000000 >................<
> > 00bfd0 79732e00 6261746d 74732e00 62617472 >..symtab..strtab<
> > 00bfe0 68732e00 74727473 2e006261 616c6572 >..shstrtab..rela<
> > 00bff0 7865742e 722e0074 2e616c65 61746164 >.text..rela.data<
> > 00c000 73622e00 722e0073 5f616c65 6172745f >..bss..rela__tra<
> > 00c010 6f706563 73746e69 7274705f 722e0073 >cepoints_ptrs..r<
> > 00c020 2e616c65 74617473 635f6369 2e6c6c61 >ela.static_call.<
> > 00c030 74786574 65722e00 692e616c 2e74696e >text..rela.init.<
>
> and 217 other .os in the build tree also "corrupted".
>
> CONFIG_HAVE_STATIC_CALL=y
> CONFIG_HAVE_STATIC_CALL_INLINE=y
> stand out as new in the .config for 5.9-rc5-mm1, and references
> to objtool in static_call.h and static_call_types.h took me to
> tools/objtool/Makefile, with its use of libelf.
>
> I've copied over files of the newer libelf (0.168) to the failing
> machines, which are now building the 5.9-rc5-mm1 vmlinux correctly.
>
> It looks as if CONFIG_HAVE_STATIC_CALL=y depends on a newer libelf
> than I had before (0.155), and should either insist on a minimum
> version, or else be adjusted to work with older versions.

Hurmph, I have no idea how this happened; clearly none of my machines
have this older libelf :/ (the machines I use most seem to be on 0.180).

I'm also not sure what static_call is doing different from say orc
data generation. Both create and fill sections in similar ways.

Mark, do you have any idea?

2020-09-15 11:32:09

by Mark Wielaard

Subject: Re: Static call dependency on libelf version?

Hi Peter,

On Tue, 2020-09-15 at 11:30 +0200, [email protected] wrote:
> On Tue, Sep 15, 2020 at 12:50:54AM -0700, Hugh Dickins wrote:
> > CONFIG_HAVE_STATIC_CALL=y
> > CONFIG_HAVE_STATIC_CALL_INLINE=y
> > stand out as new in the .config for 5.9-rc5-mm1, and references
> > to objtool in static_call.h and static_call_types.h took me to
> > tools/objtool/Makefile, with its use of libelf.
> >
> > I've copied over files of the newer libelf (0.168) to the failing
> > machines, which are now building the 5.9-rc5-mm1 vmlinux correctly.
> >
> > It looks as if CONFIG_HAVE_STATIC_CALL=y depends on a newer libelf
> > than I had before (0.155), and should either insist on a minimum
> > version, or else be adjusted to work with older versions.
>
> Hurmph, I have no idea how this happened; clearly none of my machines
> have this older libelf :/ (the machines I use most seem to be on
> 0.180).
>
> I'm also not sure what static_call is doing different from say orc
> data generation. Both create and fill sections in similar ways.
>
> Mark, do you have any idea?

0.155 is more than 8 years old. Given that 0.168 (4 years old) works
fine and this might be an interaction with objtool, which if I remember
correctly uses ELF_C_RDWR to manipulate an ELF file in place, I suspect
it might be:

commit 88ad5ddb71bd1fa8ed043a840157ebf23c0057b3
Author: Mark Wielaard <[email protected]>
Date: Tue Nov 5 16:27:32 2013 +0100

libelf: Write all section headers if elf flags contains ELF_F_DIRTY.

When ehdr e_shoff changes, elf flags is set dirty. This indicates that
the section header moved because sections were added/removed or changed
in size.

Reported-by: Jiri Slaby <[email protected]>
Signed-off-by: Mark Wielaard <[email protected]>

Which is described as elfutils-0.157-15-g88ad5ddb so was in elfutils
0.158, but not before. At least the issue seems to mimic the bug
report a little:
https://sourceware.org/legacy-ml/elfutils-devel/imported/msg03724.html

But all this is for ancient versions of elfutils libelf. So it is hard
to say and my memory might be failing. If someone can confirm 0.158
(which is 6 years old) works fine I would pick that as minimum version,
otherwise simply go with 0.168 which is 4 years old and should be on
most systems by now.

Cheers,

Mark

2020-09-15 21:52:17

by Josh Poimboeuf

Subject: Re: Static call dependency on libelf version?

On Tue, Sep 15, 2020 at 10:02:31AM -0500, Josh Poimboeuf wrote:
> On Tue, Sep 15, 2020 at 04:38:02PM +0200, Mark Wielaard wrote:
> > Hi Josh,
> >
> > On Tue, 2020-09-15 at 09:17 -0500, Josh Poimboeuf wrote:
> > > On Tue, Sep 15, 2020 at 01:24:17PM +0200, Mark Wielaard wrote:
> > > > But all this is for ancient versions of elfutils libelf. So it is hard
> > > > to say and my memory might be failing. If someone can confirm 0.158
> > > > (which is 6 years old) works fine I would pick that as minimum version,
> > > > otherwise simply go with 0.168 which is 4 years old and should be on
> > > > most systems by now.
> > >
> > > I just discovered elf_version(), I assume that would allow us to check
> > > and enforce the libelf version?
> >
> > No, sorry. That is for the ELF file format version, which is and has
> > always been version 1 (and I suspect it will be for the next 20
> > years).
>
> Oh, right :-)
>
> > There is /usr/include/elfutils/version.h which provides a
> > _ELFUTILS_PREREQ(major, minor) macro if you need something during
> > compile time.
>
> Nice, I'll try that.

Confirmed that 0.158 fixes it. I'll enforce that as a minimum. Thanks!

--
Josh

2020-09-15 23:01:58

by Josh Poimboeuf

Subject: Re: Static call dependency on libelf version?

On Tue, Sep 15, 2020 at 04:38:02PM +0200, Mark Wielaard wrote:
> Hi Josh,
>
> On Tue, 2020-09-15 at 09:17 -0500, Josh Poimboeuf wrote:
> > On Tue, Sep 15, 2020 at 01:24:17PM +0200, Mark Wielaard wrote:
> > > But all this is for ancient versions of elfutils libelf. So it is hard
> > > to say and my memory might be failing. If someone can confirm 0.158
> > > (which is 6 years old) works fine I would pick that as minimum version,
> > > otherwise simply go with 0.168 which is 4 years old and should be on
> > > most systems by now.
> >
> > I just discovered elf_version(), I assume that would allow us to check
> > and enforce the libelf version?
>
> No, sorry. That is for the ELF file format version, which is and has
> always been version 1 (and I suspect it will be for the next 20
> years).

Oh, right :-)

> There is /usr/include/elfutils/version.h which provides a
> _ELFUTILS_PREREQ(major, minor) macro if you need something during
> compile time.

Nice, I'll try that.

> Note that in theory libelf is a generic library (there are variants for
> Solaris and BSD with which we try to be [source] compatible), but the
> only actively maintained version is the elfutils one.

Yeah, we've occasionally had users using another variant of the library
which is 10+ years old. Not surprisingly it didn't work well.

--
Josh

2020-09-15 23:25:15

by Mark Wielaard

Subject: Re: Static call dependency on libelf version?

Hi Josh,

On Tue, 2020-09-15 at 09:17 -0500, Josh Poimboeuf wrote:
> On Tue, Sep 15, 2020 at 01:24:17PM +0200, Mark Wielaard wrote:
> > But all this is for ancient versions of elfutils libelf. So it is hard
> > to say and my memory might be failing. If someone can confirm 0.158
> > (which is 6 years old) works fine I would pick that as minimum version,
> > otherwise simply go with 0.168 which is 4 years old and should be on
> > most systems by now.
>
> I just discovered elf_version(), I assume that would allow us to check
> and enforce the libelf version?

No, sorry. That is for the ELF file format version, which is and has
always been version 1 (and I suspect it will be for the next 20
years).

There is /usr/include/elfutils/version.h which provides a
_ELFUTILS_PREREQ(major, minor) macro if you need something during
compile time.
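
As a rough illustration of such a version gate, the sketch below reads the integer that elfutils/version.h is assumed to define as _ELFUTILS_VERSION (for example 168 for 0.168) and compares it against 158. The function names and the sample header strings are hypothetical, and this is not the check objtool ended up with.

```python
import re

# 0.158 is the candidate minimum named in this thread.
MIN_VERSION = 158

def elfutils_version(header_text):
    """Return the integer from '#define _ELFUTILS_VERSION N', or None.

    Assumption: version.h encodes 0.168 as the plain integer 168.
    """
    match = re.search(
        r"^\s*#\s*define\s+_ELFUTILS_VERSION\s+(\d+)",
        header_text,
        re.MULTILINE,
    )
    return int(match.group(1)) if match else None

def new_enough(header_text, minimum=MIN_VERSION):
    """True only if a version was found and it meets the minimum."""
    version = elfutils_version(header_text)
    return version is not None and version >= minimum

# Hypothetical header contents standing in for the two libelf versions
# mentioned in the original report.
sample_old = "#define _ELFUTILS_VERSION 155\n"
sample_new = "#define _ELFUTILS_VERSION 168\n"

print(new_enough(sample_old), new_enough(sample_new))
```

A missing or unparsable header is treated as "not new enough", which errs on the side of failing early rather than producing corrupt objects later.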

Note that in theory libelf is a generic library (there are variants for
Solaris and BSD with which we try to be [source] compatible), but the
only actively maintained version is the elfutils one.

Cheers,

Mark

2020-09-16 00:25:24

by Josh Poimboeuf

Subject: Re: Static call dependency on libelf version?

On Tue, Sep 15, 2020 at 01:24:17PM +0200, Mark Wielaard wrote:
> But all this is for ancient versions of elfutils libelf. So it is hard
> to say and my memory might be failing. If someone can confirm 0.158
> (which is 6 years old) works fine I would pick that as minimum version,
> otherwise simply go with 0.168 which is 4 years old and should be on
> most systems by now.

I just discovered elf_version(), I assume that would allow us to check
and enforce the libelf version?

--
Josh