2018-12-16 18:35:24

by Arnd Bergmann

[permalink] [raw]
Subject: objtool warnings for kernel/trace/trace_selftest_dynamic.o

Hi Josh,

In randconfig tests with gcc-8.1, I get this warning every
few hundred builds, tried it on both next/master and 4.19.y-stable:

kernel/trace/trace_selftest_dynamic.o: warning: objtool:
trace_selftest_dynamic_test_func()+0x5: call without frame pointer
save/setup
kernel/trace/trace_selftest_dynamic.o: warning: objtool:
trace_selftest_dynamic_test_func2()+0x5: call without frame pointer
save/setup

$ objdump -dr build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o

build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o:
file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <trace_selftest_dynamic_test_func>:
0: e8 00 00 00 00 callq 5
<trace_selftest_dynamic_test_func+0x5>
1: R_X86_64_PC32 __fentry__-0x4
5: e8 00 00 00 00 callq a
<trace_selftest_dynamic_test_func+0xa>
6: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4
a: 31 c0 xor %eax,%eax
c: c3 retq
d: 0f 1f 00 nopl (%rax)

0000000000000010 <trace_selftest_dynamic_test_func2>:
10: e8 00 00 00 00 callq 15
<trace_selftest_dynamic_test_func2+0x5>
11: R_X86_64_PC32 __fentry__-0x4
15: e8 00 00 00 00 callq 1a
<trace_selftest_dynamic_test_func2+0xa>
16: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4
1a: 31 c0 xor %eax,%eax
1c: c3 retq

I found this reported in
http://kisskb.ellerman.id.au/kisskb/buildresult/13499139/, but could
not find an existing fix or analysis.

Arnd


Attachments:
.config.gz (25.31 kB)

2018-12-17 20:37:26

by Peter Zijlstra

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote:

> > Yes LTO causes the to be treated like static functions.
> >
> > I guess noclone is unlikely to be really needed here because these
> > functions are unlikely to be cloned.
> >
> > So as a workaround it could be removed.
> >
> > But note we have other noclone functions in the tree (like in KVM)
> > which actually need it.
>
> How about we just use the __used attribute then? It seems to have the
> same result of preventing IPA optimizations (without the weird side
> effect of missing frame pointers).

AFAIK we don't have any in-tree LTO, so it can all go in the bin.

When/if we get the LTO trainwreck sorted -- which very much includes
getting that memory-order-consume fixed -- we can revisit all that.

2018-12-17 21:23:29

by Andi Kleen

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

> > That seems weird.
> >
> > Are you sure it's not just because they are empty? AFAIK
> > gcc doesn't necessarily generate frame pointers for empty functions.
>
> I suspected that it was because they're empty, however I didn't see this
> warning for other leaf functions. The sancov plugin is presumably
> taking care of adding frame pointers where needed. Also, adding

That would surprise me.

> -mno-omit-leaf-frame-pointer didn't fix it.
>
> And anyway I confirmed that it was fixed by removing __noclone.

So this is with a plugin?

Maybe the plugin does something wrong?

I thought this was just a standard build.

I'm not sure the problem is well enough understood yet
to really do anything.

Do you have a simple standalone test case to show the compiler behavior?


-Andi

2018-12-17 23:01:06

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Sun, Dec 16, 2018 at 07:33:11PM +0100, Arnd Bergmann wrote:
> Hi Josh,
>
> In randconfig tests with gcc-8.1, I get this warning every
> few hundred builds, tried it on both next/master and 4.19.y-stable:
>
> kernel/trace/trace_selftest_dynamic.o: warning: objtool:
> trace_selftest_dynamic_test_func()+0x5: call without frame pointer
> save/setup
> kernel/trace/trace_selftest_dynamic.o: warning: objtool:
> trace_selftest_dynamic_test_func2()+0x5: call without frame pointer
> save/setup
>
> $ objdump -dr build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o
>
> build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o:
> file format elf64-x86-64
>
> Disassembly of section .text:
>
> 0000000000000000 <trace_selftest_dynamic_test_func>:
> 0: e8 00 00 00 00 callq 5
> <trace_selftest_dynamic_test_func+0x5>
> 1: R_X86_64_PC32 __fentry__-0x4
> 5: e8 00 00 00 00 callq a
> <trace_selftest_dynamic_test_func+0xa>
> 6: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4
> a: 31 c0 xor %eax,%eax
> c: c3 retq
> d: 0f 1f 00 nopl (%rax)
>
> 0000000000000010 <trace_selftest_dynamic_test_func2>:
> 10: e8 00 00 00 00 callq 15
> <trace_selftest_dynamic_test_func2+0x5>
> 11: R_X86_64_PC32 __fentry__-0x4
> 15: e8 00 00 00 00 callq 1a
> <trace_selftest_dynamic_test_func2+0xa>
> 16: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4
> 1a: 31 c0 xor %eax,%eax
> 1c: c3 retq
>
> I found this reported in
> http://kisskb.ellerman.id.au/kisskb/buildresult/13499139/, but could
> not find an existing fix or analysis.

Thanks for reporting this Arnd.

The problem is that, for some reason, __noclone is preventing GCC from
creating frame pointers for these functions. Miroslav said that
__noclone is not recommended by GCC developers, and that __used can be
used instead for the same purpose:

https://lkml.kernel.org/r/[email protected]

Andi,

is __noclone really needed here, since the functions aren't static? Or
does LTO cause them to be treated like static functions?

Or if it is really needed, would __used be sufficient instead?

noinline __noclone int DYN_FTRACE_TEST_NAME(void)
{
/* used to call mcount */
return 0;
}

noinline __noclone int DYN_FTRACE_TEST_NAME2(void)
{
/* used to call mcount */
return 0;
}

--
Josh

2018-12-17 23:08:18

by Andi Kleen

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Mon, Dec 17, 2018 at 11:39:00AM -0600, Josh Poimboeuf wrote:
> On Sun, Dec 16, 2018 at 07:33:11PM +0100, Arnd Bergmann wrote:
> > Hi Josh,
> >
> > In randconfig tests with gcc-8.1, I get this warning every
> > few hundred builds, tried it on both next/master and 4.19.y-stable:
> >
> > kernel/trace/trace_selftest_dynamic.o: warning: objtool:
> > trace_selftest_dynamic_test_func()+0x5: call without frame pointer
> > save/setup
> > kernel/trace/trace_selftest_dynamic.o: warning: objtool:
> > trace_selftest_dynamic_test_func2()+0x5: call without frame pointer
> > save/setup
> >
> > $ objdump -dr build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o
> >
> > build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o:
> > file format elf64-x86-64
> >
> > Disassembly of section .text:
> >
> > 0000000000000000 <trace_selftest_dynamic_test_func>:
> > 0: e8 00 00 00 00 callq 5
> > <trace_selftest_dynamic_test_func+0x5>
> > 1: R_X86_64_PC32 __fentry__-0x4
> > 5: e8 00 00 00 00 callq a
> > <trace_selftest_dynamic_test_func+0xa>
> > 6: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4
> > a: 31 c0 xor %eax,%eax
> > c: c3 retq
> > d: 0f 1f 00 nopl (%rax)
> >
> > 0000000000000010 <trace_selftest_dynamic_test_func2>:
> > 10: e8 00 00 00 00 callq 15
> > <trace_selftest_dynamic_test_func2+0x5>
> > 11: R_X86_64_PC32 __fentry__-0x4
> > 15: e8 00 00 00 00 callq 1a
> > <trace_selftest_dynamic_test_func2+0xa>
> > 16: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4
> > 1a: 31 c0 xor %eax,%eax
> > 1c: c3 retq
> >
> > I found this reported in
> > http://kisskb.ellerman.id.au/kisskb/buildresult/13499139/, but could
> > not find an existing fix or analysis.
>
> Thanks for reporting this Arnd.
>
> The problem is that, for some reason, __noclone is preventing GCC from
> creating frame pointers for these functions. Miroslav said that

That seems weird.

Are you sure it's not just because they are empty? AFAIK
gcc doesn't necessarily generate frame pointers for empty functions.

> __noclone is not recommended by GCC developers, and that __used can be
> used instead for the same purpose:
>
> https://lkml.kernel.org/r/[email protected]
>
> Andi,
>
> is __noclone really needed here, since the functions aren't static? Or
> does LTO cause them to be treated like static functions?

Yes LTO causes the to be treated like static functions.

I guess noclone is unlikely to be really needed here because these
functions are unlikely to be cloned.

So as a workaround it could be removed.

But note we have other noclone functions in the tree (like in KVM)
which actually need it.


-Andi


2018-12-17 23:09:26

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Mon, Dec 17, 2018 at 10:04:34AM -0800, Andi Kleen wrote:
> On Mon, Dec 17, 2018 at 11:39:00AM -0600, Josh Poimboeuf wrote:
> > On Sun, Dec 16, 2018 at 07:33:11PM +0100, Arnd Bergmann wrote:
> > > Hi Josh,
> > >
> > > In randconfig tests with gcc-8.1, I get this warning every
> > > few hundred builds, tried it on both next/master and 4.19.y-stable:
> > >
> > > kernel/trace/trace_selftest_dynamic.o: warning: objtool:
> > > trace_selftest_dynamic_test_func()+0x5: call without frame pointer
> > > save/setup
> > > kernel/trace/trace_selftest_dynamic.o: warning: objtool:
> > > trace_selftest_dynamic_test_func2()+0x5: call without frame pointer
> > > save/setup
> > >
> > > $ objdump -dr build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o
> > >
> > > build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o:
> > > file format elf64-x86-64
> > >
> > > Disassembly of section .text:
> > >
> > > 0000000000000000 <trace_selftest_dynamic_test_func>:
> > > 0: e8 00 00 00 00 callq 5
> > > <trace_selftest_dynamic_test_func+0x5>
> > > 1: R_X86_64_PC32 __fentry__-0x4
> > > 5: e8 00 00 00 00 callq a
> > > <trace_selftest_dynamic_test_func+0xa>
> > > 6: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4
> > > a: 31 c0 xor %eax,%eax
> > > c: c3 retq
> > > d: 0f 1f 00 nopl (%rax)
> > >
> > > 0000000000000010 <trace_selftest_dynamic_test_func2>:
> > > 10: e8 00 00 00 00 callq 15
> > > <trace_selftest_dynamic_test_func2+0x5>
> > > 11: R_X86_64_PC32 __fentry__-0x4
> > > 15: e8 00 00 00 00 callq 1a
> > > <trace_selftest_dynamic_test_func2+0xa>
> > > 16: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4
> > > 1a: 31 c0 xor %eax,%eax
> > > 1c: c3 retq
> > >
> > > I found this reported in
> > > http://kisskb.ellerman.id.au/kisskb/buildresult/13499139/, but could
> > > not find an existing fix or analysis.
> >
> > Thanks for reporting this Arnd.
> >
> > The problem is that, for some reason, __noclone is preventing GCC from
> > creating frame pointers for these functions. Miroslav said that
>
> That seems weird.
>
> Are you sure it's not just because they are empty? AFAIK
> gcc doesn't necessarily generate frame pointers for empty functions.

I suspected that it was because they're empty, however I didn't see this
warning for other leaf functions. The sancov plugin is presumably
taking care of adding frame pointers where needed. Also, adding
-mno-omit-leaf-frame-pointer didn't fix it.

And anyway I confirmed that it was fixed by removing __noclone.

> > __noclone is not recommended by GCC developers, and that __used can be
> > used instead for the same purpose:
> >
> > https://lkml.kernel.org/r/[email protected]
> >
> > Andi,
> >
> > is __noclone really needed here, since the functions aren't static? Or
> > does LTO cause them to be treated like static functions?
>
> Yes LTO causes the to be treated like static functions.
>
> I guess noclone is unlikely to be really needed here because these
> functions are unlikely to be cloned.
>
> So as a workaround it could be removed.
>
> But note we have other noclone functions in the tree (like in KVM)
> which actually need it.

How about we just use the __used attribute then? It seems to have the
same result of preventing IPA optimizations (without the weird side
effect of missing frame pointers).

--
Josh

2018-12-17 23:21:22

by Andi Kleen

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote:
> On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote:
>
> > > Yes LTO causes the to be treated like static functions.
> > >
> > > I guess noclone is unlikely to be really needed here because these
> > > functions are unlikely to be cloned.
> > >
> > > So as a workaround it could be removed.
> > >
> > > But note we have other noclone functions in the tree (like in KVM)
> > > which actually need it.
> >
> > How about we just use the __used attribute then? It seems to have the
> > same result of preventing IPA optimizations (without the weird side
> > effect of missing frame pointers).
>
> AFAIK we don't have any in-tree LTO, so it can all go in the bin.

I have patches for 4.20, and I was actually thinking about resending
soon. It will need a few changes, but not too bad.

FWIW there's also a user base who used the out of tree patches
for some time.

>
> When/if we get the LTO trainwreck sorted -- which very much includes
> getting that memory-order-consume fixed -- we can revisit all that.

What do you mean? I'm not aware of any LTO problems with memory-order-consume?

-Andi

2018-12-17 23:22:12

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote:
> On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote:
>
> > > Yes LTO causes the to be treated like static functions.
> > >
> > > I guess noclone is unlikely to be really needed here because these
> > > functions are unlikely to be cloned.
> > >
> > > So as a workaround it could be removed.
> > >
> > > But note we have other noclone functions in the tree (like in KVM)
> > > which actually need it.
> >
> > How about we just use the __used attribute then? It seems to have the
> > same result of preventing IPA optimizations (without the weird side
> > effect of missing frame pointers).
>
> AFAIK we don't have any in-tree LTO, so it can all go in the bin.
>
> When/if we get the LTO trainwreck sorted -- which very much includes
> getting that memory-order-consume fixed -- we can revisit all that.

Ok, then if there are no objections I'll just send a revert of:

dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone")

--
Josh

2018-12-17 23:25:21

by Steven Rostedt

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Mon, 17 Dec 2018 15:31:26 -0600
Josh Poimboeuf <[email protected]> wrote:

> On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote:
> > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote:
> >
> > > > Yes LTO causes the to be treated like static functions.
> > > >
> > > > I guess noclone is unlikely to be really needed here because these
> > > > functions are unlikely to be cloned.
> > > >
> > > > So as a workaround it could be removed.
> > > >
> > > > But note we have other noclone functions in the tree (like in KVM)
> > > > which actually need it.
> > >
> > > How about we just use the __used attribute then? It seems to have the
> > > same result of preventing IPA optimizations (without the weird side
> > > effect of missing frame pointers).
> >
> > AFAIK we don't have any in-tree LTO, so it can all go in the bin.
> >
> > When/if we get the LTO trainwreck sorted -- which very much includes
> > getting that memory-order-consume fixed -- we can revisit all that.
>
> Ok, then if there are no objections I'll just send a revert of:
>
> dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone")
>

Should it be reverted, or just remove the noclone, and keep the
noinline?

-- Steve

2018-12-17 23:25:51

by Peter Zijlstra

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Mon, Dec 17, 2018 at 12:55:35PM -0800, Andi Kleen wrote:
> On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote:
> > When/if we get the LTO trainwreck sorted -- which very much includes
> > getting that memory-order-consume fixed -- we can revisit all that.
>
> What do you mean? I'm not aware of any LTO problems with memory-order-consume?

The compiler is basically allowed to break RCU (and anything else that
depends on read-read dependencies). LTO makes it _far_ more likely this
happens.

We need guarantees (and possible switches) from the compiler folks that
this will not happen before I'll retract my NAK from any LTO enabling.

2018-12-17 23:55:32

by Andi Kleen

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Mon, Dec 17, 2018 at 03:31:26PM -0600, Josh Poimboeuf wrote:
> On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote:
> > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote:
> >
> > > > Yes LTO causes the to be treated like static functions.
> > > >
> > > > I guess noclone is unlikely to be really needed here because these
> > > > functions are unlikely to be cloned.
> > > >
> > > > So as a workaround it could be removed.
> > > >
> > > > But note we have other noclone functions in the tree (like in KVM)
> > > > which actually need it.
> > >
> > > How about we just use the __used attribute then? It seems to have the
> > > same result of preventing IPA optimizations (without the weird side
> > > effect of missing frame pointers).
> >
> > AFAIK we don't have any in-tree LTO, so it can all go in the bin.
> >
> > When/if we get the LTO trainwreck sorted -- which very much includes
> > getting that memory-order-consume fixed -- we can revisit all that.
>
> Ok, then if there are no objections I'll just send a revert of:
>
> dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone")

I object to that.

-Andi

2018-12-18 00:01:03

by Andi Kleen

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Mon, Dec 17, 2018 at 11:35:24PM +0100, Peter Zijlstra wrote:
> On Mon, Dec 17, 2018 at 12:55:35PM -0800, Andi Kleen wrote:
> > On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote:
> > > When/if we get the LTO trainwreck sorted -- which very much includes
> > > getting that memory-order-consume fixed -- we can revisit all that.
> >
> > What do you mean? I'm not aware of any LTO problems with memory-order-consume?
>
> The compiler is basically allowed to break RCU (and anything else that
> depends on read-read dependencies). LTO makes it _far_ more likely this
> happens.
>
> We need guarantees (and possible switches) from the compiler folks that
> this will not happen before I'll retract my NAK from any LTO enabling.

Are you really saying that the current RCU code depends on cross file inlining
not happening?

If that is true it's quite bad. Of course it would be a untolerable
situation because all kinds of changes can break it, and there would
be likely already plenty of broken code in tree.

I have a hard time believing it though. Do you have a concrete
example or is this just FUD?

BTW I have a user base for LTO and so far noone has reported any issues
like this.

-Andi

2018-12-18 00:07:25

by Andi Kleen

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Mon, Dec 17, 2018 at 05:36:44PM -0500, Steven Rostedt wrote:
> On Mon, 17 Dec 2018 15:31:26 -0600
> Josh Poimboeuf <[email protected]> wrote:
>
> > On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote:
> > > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote:
> > >
> > > > > Yes LTO causes the to be treated like static functions.
> > > > >
> > > > > I guess noclone is unlikely to be really needed here because these
> > > > > functions are unlikely to be cloned.
> > > > >
> > > > > So as a workaround it could be removed.
> > > > >
> > > > > But note we have other noclone functions in the tree (like in KVM)
> > > > > which actually need it.
> > > >
> > > > How about we just use the __used attribute then? It seems to have the
> > > > same result of preventing IPA optimizations (without the weird side
> > > > effect of missing frame pointers).
> > >
> > > AFAIK we don't have any in-tree LTO, so it can all go in the bin.
> > >
> > > When/if we get the LTO trainwreck sorted -- which very much includes
> > > getting that memory-order-consume fixed -- we can revisit all that.
> >
> > Ok, then if there are no objections I'll just send a revert of:
> >
> > dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone")
> >
>
> Should it be reverted, or just remove the noclone, and keep the
> noinline?

It should not be touched for now, until it is properly debugged.

IMHO Josh's explanation doesn't make much sense and there
was a lot of handwaving

And just fixing one case isn't good enough because there are other
noclone functions in the tree.

It the problem is the plugin the plugin needs to be fixed.

If the problem is gcc we need a gcc test case and bug, with
some analysis, and then based on that select the proper workaround.


-Andi

2018-12-18 02:50:26

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Mon, Dec 17, 2018 at 04:06:18PM -0800, Andi Kleen wrote:
> On Mon, Dec 17, 2018 at 05:36:44PM -0500, Steven Rostedt wrote:
> > On Mon, 17 Dec 2018 15:31:26 -0600
> > Josh Poimboeuf <[email protected]> wrote:
> >
> > > On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote:
> > > > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote:
> > > >
> > > > > > Yes LTO causes the to be treated like static functions.
> > > > > >
> > > > > > I guess noclone is unlikely to be really needed here because these
> > > > > > functions are unlikely to be cloned.
> > > > > >
> > > > > > So as a workaround it could be removed.
> > > > > >
> > > > > > But note we have other noclone functions in the tree (like in KVM)
> > > > > > which actually need it.
> > > > >
> > > > > How about we just use the __used attribute then? It seems to have the
> > > > > same result of preventing IPA optimizations (without the weird side
> > > > > effect of missing frame pointers).
> > > >
> > > > AFAIK we don't have any in-tree LTO, so it can all go in the bin.
> > > >
> > > > When/if we get the LTO trainwreck sorted -- which very much includes
> > > > getting that memory-order-consume fixed -- we can revisit all that.
> > >
> > > Ok, then if there are no objections I'll just send a revert of:
> > >
> > > dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone")

Sorry for suggesting this prematurely, my email client stopped syncing
and I missed your later replies to Peter about this.

> > Should it be reverted, or just remove the noclone, and keep the
> > noinline?
>
> It should not be touched for now, until it is properly debugged.
>
> IMHO Josh's explanation doesn't make much sense and there
> was a lot of handwaving
>
> And just fixing one case isn't good enough because there are other
> noclone functions in the tree.
>
> It the problem is the plugin the plugin needs to be fixed.
>
> If the problem is gcc we need a gcc test case and bug, with
> some analysis, and then based on that select the proper workaround.

The plugin is only used for older versions of GCC. Newer versions have
the same functionality builtin with -fsanitize-coverage=trace-pc.

So the problem is GCC. We're using a function attribute which at least
oneGCC developer doesn't recommend. If you want to keep the LTO support
then '__used' seems like a much better choice.

--
Josh

2018-12-18 03:07:19

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Mon, Dec 17, 2018 at 05:36:44PM -0500, Steven Rostedt wrote:
> On Mon, 17 Dec 2018 15:31:26 -0600
> Josh Poimboeuf <[email protected]> wrote:
>
> > On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote:
> > > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote:
> > >
> > > > > Yes LTO causes the to be treated like static functions.
> > > > >
> > > > > I guess noclone is unlikely to be really needed here because these
> > > > > functions are unlikely to be cloned.
> > > > >
> > > > > So as a workaround it could be removed.
> > > > >
> > > > > But note we have other noclone functions in the tree (like in KVM)
> > > > > which actually need it.
> > > >
> > > > How about we just use the __used attribute then? It seems to have the
> > > > same result of preventing IPA optimizations (without the weird side
> > > > effect of missing frame pointers).
> > >
> > > AFAIK we don't have any in-tree LTO, so it can all go in the bin.
> > >
> > > When/if we get the LTO trainwreck sorted -- which very much includes
> > > getting that memory-order-consume fixed -- we can revisit all that.
> >
> > Ok, then if there are no objections I'll just send a revert of:
> >
> > dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone")
> >
>
> Should it be reverted, or just remove the noclone, and keep the
> noinline?

If we want to support out-of-tree LTO, then it should only need "used",
because wouldn't that imply noinline?

If we *don't* want to support out-of-tree LTO, then it shouldn't need
any special attributes, because the functions aren't static so GCC won't
mess with their ABI.

--
Josh

2018-12-18 04:23:23

by Andi Kleen

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

> The plugin is only used for older versions of GCC. Newer versions have
> the same functionality builtin with -fsanitize-coverage=trace-pc.

Ok and the frame pointer issue is with the older version only?

>
> So the problem is GCC. We're using a function attribute which at least
> oneGCC developer doesn't recommend. If you want to keep the LTO support
> then '__used' seems like a much better choice.

I guess __used will work for now.

Still would be better to root cause it properly, but I guess that's
ok for now. The lack of understanding may eventually come back
to bite you later.

FWIW i did some tests and I don't see noclone affecting frame pointer
with gcc 8. The only thing I saw was that empty functions ended up
without frame pointer.

I suspect you'll need to fix up the other users of noclone too.

-Andi

2018-12-18 09:22:16

by Peter Zijlstra

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Mon, Dec 17, 2018 at 03:59:50PM -0800, Andi Kleen wrote:
> BTW I have a user base for LTO and so far noone has reported any issues
> like this.

Because ordering issues are immediately obvious and easy to debug no
doubt.

2018-12-18 09:31:41

by Miroslav Benes

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Mon, 17 Dec 2018, Josh Poimboeuf wrote:

> On Mon, Dec 17, 2018 at 04:06:18PM -0800, Andi Kleen wrote:
> > On Mon, Dec 17, 2018 at 05:36:44PM -0500, Steven Rostedt wrote:
> > > On Mon, 17 Dec 2018 15:31:26 -0600
> > > Josh Poimboeuf <[email protected]> wrote:
> > >
> > > > On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote:
> > > > > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote:
> > > > >
> > > > > > > Yes LTO causes the to be treated like static functions.
> > > > > > >
> > > > > > > I guess noclone is unlikely to be really needed here because these
> > > > > > > functions are unlikely to be cloned.
> > > > > > >
> > > > > > > So as a workaround it could be removed.
> > > > > > >
> > > > > > > But note we have other noclone functions in the tree (like in KVM)
> > > > > > > which actually need it.
> > > > > >
> > > > > > How about we just use the __used attribute then? It seems to have the
> > > > > > same result of preventing IPA optimizations (without the weird side
> > > > > > effect of missing frame pointers).
> > > > >
> > > > > AFAIK we don't have any in-tree LTO, so it can all go in the bin.
> > > > >
> > > > > When/if we get the LTO trainwreck sorted -- which very much includes
> > > > > getting that memory-order-consume fixed -- we can revisit all that.
> > > >
> > > > Ok, then if there are no objections I'll just send a revert of:
> > > >
> > > > dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone")
>
> Sorry for suggesting this prematurely, my email client stopped syncing
> and I missed your later replies to Peter about this.
>
> > > Should it be reverted, or just remove the noclone, and keep the
> > > noinline?
> >
> > It should not be touched for now, until it is properly debugged.
> >
> > IMHO Josh's explanation doesn't make much sense and there
> > was a lot of handwaving
> >
> > And just fixing one case isn't good enough because there are other
> > noclone functions in the tree.
> >
> > It the problem is the plugin the plugin needs to be fixed.
> >
> > If the problem is gcc we need a gcc test case and bug, with
> > some analysis, and then based on that select the proper workaround.
>
> The plugin is only used for older versions of GCC. Newer versions have
> the same functionality builtin with -fsanitize-coverage=trace-pc.
>
> So the problem is GCC. We're using a function attribute which at least
> oneGCC developer doesn't recommend. If you want to keep the LTO support
> then '__used' seems like a much better choice.

Martin added to CC.

Martin, the thread starts here
http://lkml.kernel.org/r/CAK8P3a2K1K21ePBFbApaTKPCk+=Bqj0LyWoK1MdFb1s9ZwjfPg@mail.gmail.com

Can you explain the background of noclone vs. used attributes, please?
We discussed it yesterday and I understood that maybe we should not rely
on noclone that much. However it is used in the kernel. Should we avoid
it in general and replace it with something else (used)?

It definitely makes sense in our livepatching samples which Josh mentioned
previously in the thread.

Thanks,
Miroslav

2018-12-18 12:18:10

by Martin Jambor

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

Hi,

On Tue, Dec 18 2018, Miroslav Benes wrote:
>> Sorry for suggesting this prematurely, my email client stopped syncing
>> and I missed your later replies to Peter about this.
>>
>> > > Should it be reverted, or just remove the noclone, and keep the
>> > > noinline?
>> >
>> > It should not be touched for now, until it is properly debugged.
>> >
>> > IMHO Josh's explanation doesn't make much sense and there
>> > was a lot of handwaving
>> >
>> > And just fixing one case isn't good enough because there are other
>> > noclone functions in the tree.
>> >
>> > It the problem is the plugin the plugin needs to be fixed.
>> >
>> > If the problem is gcc we need a gcc test case and bug, with
>> > some analysis, and then based on that select the proper workaround.
>>
>> The plugin is only used for older versions of GCC. Newer versions have
>> the same functionality builtin with -fsanitize-coverage=trace-pc.
>>
>> So the problem is GCC.

That's a little bit harsh :-)

>> We're using a function attribute which at least
>> oneGCC developer doesn't recommend. If you want to keep the LTO support
>> then '__used' seems like a much better choice.
>
> Martin added to CC.
>
> Martin, the thread starts here
> http://lkml.kernel.org/r/CAK8P3a2K1K21ePBFbApaTKPCk+=Bqj0LyWoK1MdFb1s9ZwjfPg@mail.gmail.com
>
> Can you explain the background of noclone vs. used attributes, please?
> We discussed it yesterday and I understood that maybe we should not rely
> on noclone that much. However it is used in the kernel. Should we avoid
> it in general and replace it with something else (used)?

OK, I have read through it and with the caveats that I don't quite
understand what the failure is, that also believe attribute noclone
should not affect frame pointer generation, and that I don't quite get
how LTO comes into play, my comments are the following:

I am the developer who introduced attribute noclone to GCC and also the
one who advises against using it :-) ...at least without also using the
noinline attribute, the combination means "I want only one or zero
copies of this function in the compiled assembly" which you might need
if you do fancy stuff in inline assembly, for example.

I believe that when people use noclone on its own, in 99 out 100 cases
they actually want something else. Usually there is something that
references the function from code (such as assembly) or a tool that the
compiler does know about and then they should use the "used" attribute.
That is what I understood to be the use case here and therefore I
recommended it. In other cases people want all inter-procedural (IPA)
analyses to stay away from a function and then they should use attribute
noipa (or in older GCCs, the combination of noinline and noclone).

The attribute was introduced because it is useful in GCC testsuite, and
although I think occasions when it is useful on its own elsewhere are
very rare and quite obscure, they can happen. But it really only means
you want only one or zero *non-inlined* copies of the function. For
example if you have an asm in it that must appear in the compiled
assembly only once but you are confident it will be optimized out in
inlined instances. Or if you play games with __builtin_return_address()
and somehow have a very clear idea what the return values should be.

I'm afraid I cannot give an opinion what you should do in this case
without understanding the problem better. If you can isolate the case
where noclone behaves weirdly into a self-contained testcase, I'd be
happy to have a look at it.

Martin

2018-12-18 12:33:31

by Steven Rostedt

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Tue, 18 Dec 2018 13:15:40 +0100
Martin Jambor <[email protected]> wrote:

> I am the developer who introduced attribute noclone to GCC and also the
> one who advises against using it :-) ...at least without also using the
> noinline attribute, the combination means "I want only one or zero
> copies of this function in the compiled assembly" which you might need
> if you do fancy stuff in inline assembly, for example.

Sounds to me that we should not be using noclone in this instance.

>
> I believe that when people use noclone on its own, in 99 out 100 cases
> they actually want something else. Usually there is something that
> references the function from code (such as assembly) or a tool that the
> compiler does know about and then they should use the "used" attribute.
> That is what I understood to be the use case here and therefore I
> recommended it. In other cases people want all inter-procedural (IPA)
> analyses to stay away from a function and then they should use attribute
> noipa (or in older GCCs, the combination of noinline and noclone).
>
> The attribute was introduced because it is useful in GCC testsuite, and
> although I think occasions when it is useful on its own elsewhere are
> very rare and quite obscure, they can happen. But it really only means
> you want only one or zero *non-inlined* copies of the function. For
> example if you have an asm in it that must appear in the compiled
> assembly only once but you are confident it will be optimized out in
> inlined instances. Or if you play games with __builtin_return_address()
> and somehow have a very clear idea what the return values should be.

With this explanation, I really believe the answer is s/noclone/used/.

Here's the problem that the "noinline noclone" is trying to solve. When
someone compiles with LTO (which links much more and can remove unused
functions, or inline them better) the ftrace dynamic self test
functions can be inlined. If that happens, they lose the
"mcount/fentry" from gcc -pg. Which was the reason they were made, and
placed in a separate file to begin with. Then when the self tests try
to enable function tracing on these functions, they can't and the boot
up tests fail, disabling ftrace.

The "noinline noclone" was added to prevent this case. And from your
description, it sounds like "noinline used" is what we actually want.

-- Steve

2018-12-18 14:03:42

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Tue, Dec 18, 2018 at 01:15:40PM +0100, Martin Jambor wrote:
> OK, I have read through it and with the caveats that I don't quite
> understand what the failure is, that also believe attribute noclone
> should not affect frame pointer generation, and that I don't quite get
> how LTO comes into play, my comments are the following:
>
> I am the developer who introduced attribute noclone to GCC and also the
> one who advises against using it :-) ...at least without also using the
> noinline attribute, the combination means "I want only one or zero
> copies of this function in the compiled assembly" which you might need
> if you do fancy stuff in inline assembly, for example.
>
> I believe that when people use noclone on its own, in 99 out 100 cases
> they actually want something else. Usually there is something that
> references the function from code (such as assembly) or a tool that the
> compiler does know about and then they should use the "used" attribute.
> That is what I understood to be the use case here and therefore I
> recommended it. In other cases people want all inter-procedural (IPA)
> analyses to stay away from a function and then they should use attribute
> noipa (or in older GCCs, the combination of noinline and noclone).
>
> The attribute was introduced because it is useful in GCC testsuite, and
> although I think occasions when it is useful on its own elsewhere are
> very rare and quite obscure, they can happen. But it really only means
> you want only one or zero *non-inlined* copies of the function. For
> example if you have an asm in it that must appear in the compiled
> assembly only once but you are confident it will be optimized out in
> inlined instances. Or if you play games with __builtin_return_address()
> and somehow have a very clear idea what the return values should be.
>
> I'm afraid I cannot give an opinion what you should do in this case
> without understanding the problem better. If you can isolate the case
> where noclone behaves weirdly into a self-contained testcase, I'd be
> happy to have a look at it.

I whittled it down to a small test case. It turns out the problem is
caused by the "__optimize__("no-tracer")" atribute, which is used by our
__noclone macro:


# if __has_attribute(__optimize__)
# define __noclone __attribute__((__noclone__, __optimize__("no-tracer")))
# else
# define __noclone __attribute__((__noclone__))
# endif


Here's the test case. Notice it skips the frame pointer setup before
the call to __sanitizer_cov_trace_pc():


$ cat test.c
__attribute__((__optimize__("no-tracer"))) int test(void)
{
return 0;
}
$ gcc -O2 -fno-omit-frame-pointer -fsanitize-coverage=trace-pc -c -o test.o test.c
$ objdump -dr test.o

test.o: file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <test>:
0: 48 83 ec 08 sub $0x8,%rsp
4: e8 00 00 00 00 callq 9 <test+0x9>
5: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4
9: 31 c0 xor %eax,%eax
b: 48 83 c4 08 add $0x8,%rsp
f: c3 retq



It works if you remove the function attribute:

0000000000000000 <test>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: e8 00 00 00 00 callq 9 <test+0x9>
5: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4
9: 31 c0 xor %eax,%eax
b: 5d pop %rbp
c: c3 retq




Apparently we are using "no-tracer" because of:


commit 95272c29378ee7dc15f43fa2758cb28a5913a06d
Author: Paolo Bonzini <[email protected]>
Date: Thu Mar 31 09:38:51 2016 +0200

compiler-gcc: disable -ftracer for __noclone functions

-ftracer can duplicate asm blocks causing compilation to fail in
noclone functions. For example, KVM declares a global variable
in an asm like

asm("2: ... \n
.pushsection data \n
.global vmx_return \n
vmx_return: .long 2b");

and -ftracer causes a double declaration.

Cc: Andrew Morton <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: [email protected]
Cc: [email protected]
Reported-by: Linda Walsh <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>


--
Josh

2018-12-18 21:16:07

by Andi Kleen

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

> OK, I have read through it and with the caveats that I don't quite
> understand what the failure is, that also believe attribute noclone
> should not affect frame pointer generation, and that I don't quite get
> how LTO comes into play, my comments are the following:



>
> I am the developer who introduced attribute noclone to GCC and also the
> one who advises against using it :-) ...at least without also using the
> noinline attribute, the combination means "

The function in question uses noinline too.

> I want only one or zero
> copies of this function in the compiled assembly" which you might need
> if you do fancy stuff in inline assembly, for example.

For this case we only want one non inlined copy because it is used as a
test case for a function tracer.

LTO comes into play because it originally relied on being in a separate
file, so it would not be inlined, but with LTO that doesn't work.

>
> I believe that when people use noclone on its own, in 99 out 100 cases
> they actually want something else. Usually there is something that

AFAIK there is no noclone without noinline in the kernel tree.


> references the function from code (such as assembly) or a tool that the
> compiler does know about and then they should use the "used" attribute.

Neither in the ftrace case, nor in the KVM case (another user which
has fancy inline assembly that cannot be duplicated) that's the case.
It's just about having exactly one out of line instance.

So based on that I think noclone is fine. Of course there
is still the open question why exactly the frame pointer disappears.

-Andi

2018-12-18 21:31:13

by Andi Kleen

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

> I whittled it down to a small test case. It turns out the problem is
> caused by the "__optimize__("no-tracer")" atribute, which is used by our
> __noclone macro:
>
>
> # if __has_attribute(__optimize__)
> # define __noclone __attribute__((__noclone__, __optimize__("no-tracer")))
> # else
> # define __noclone __attribute__((__noclone__))
> # endif

Ah interesting. That makes sense. Thanks for tracking that down.

>
> commit 95272c29378ee7dc15f43fa2758cb28a5913a06d
> Author: Paolo Bonzini <[email protected]>
> Date: Thu Mar 31 09:38:51 2016 +0200
>
> compiler-gcc: disable -ftracer for __noclone functions
>
> -ftracer can duplicate asm blocks causing compilation to fail in
> noclone functions. For example, KVM declares a global variable
> in an asm like
>
> asm("2: ... \n
> .pushsection data \n
> .global vmx_return \n
> vmx_return: .long 2b");
>
> and -ftracer causes a double declaration.

Ok. So the real solution would be to add a no tracer to the KVM
function only.

But of course it will drop its frame pointer. So if you rely on
frame pointer for live patching you would still have a problem.

Just fixing the ftrace case independently may not be useful?

ftrace here really not interesting at all for live patching because
the ftracing self test only happens at boot, when there is no live patching.
Likely it's even in __init code.

KVM on the other hand is quite important and actually happens at boot.

So I would concentrate on fixing that one instead, but leave ftrace
alone.

My suggestion to fix KVM would be to drop the inline assembler
and use out of line assembler in a separate file. Then disabling
the tracer wouldn't be needed.

VM entries are very expensive anyways, I doubt avoiding a single call/ret
makes any difference at all.

-Andi


2018-12-18 22:09:48

by Steven Rostedt

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Tue, 18 Dec 2018 13:15:01 -0800
Andi Kleen <[email protected]> wrote:

>
> >
> > I am the developer who introduced attribute noclone to GCC and also the
> > one who advises against using it :-) ...at least without also using the
> > noinline attribute, the combination means "
>
> The function in question uses noinline too.

And that's all I care about.

>
> > I want only one or zero
> > copies of this function in the compiled assembly" which you might need
> > if you do fancy stuff in inline assembly, for example.
>
> For this case we only want one non inlined copy because it is used as a
> test case for a function tracer.

The function tracer selftest should work just fine if there's more than
one copy. Because the test will enable all copies.

>
> LTO comes into play because it originally relied on being in a separate
> file, so it would not be inlined, but with LTO that doesn't work.

The reason for it being in a separate file is because the entire
directory is marked "notrace". That file, and that file alone in that
directory, gets compiled with the -pg flags. As long as the functions
in that file don't get inlined (which will make them notrace as well)
all is good.

Thus, if used and/or noinline prevents the functions from being
inlined, and thus have their profiling removed, we should be good.

Hmm, how does that work? When does LTO do its linker magic? Because the
fentry/mcounts are added when the object is created. Are they removed
if the compiler sees that it can be inlined? Or does LTO just compile
everything in one go?

>
> >
> > I believe that when people use noclone on its own, in 99 out 100 cases
> > they actually want something else. Usually there is something that
>
> AFAIK there is no noclone without noinline in the kernel tree.
>
>
> > references the function from code (such as assembly) or a tool that the
> > compiler does know about and then they should use the "used" attribute.
>
> Neither in the ftrace case, nor in the KVM case (another user which
> has fancy inline assembly that cannot be duplicated) that's the case.
> It's just about having exactly one out of line instance.

Again, that's not the ftrace case. It doesn't care about more than one
out of line instance. Thus, for this particular use, "used" should be
good enough.

-- Steve

2018-12-18 22:16:11

by Andi Kleen

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Tue, Dec 18, 2018 at 04:57:13PM -0500, Steven Rostedt wrote:
> Hmm, how does that work? When does LTO do its linker magic? Because the
> fentry/mcounts are added when the object is created. Are they removed
> if the compiler sees that it can be inlined? Or does LTO just compile
> everything in one go?

LTO compiles everything in one go at link time. The objects
just contain immediate code.

Also in principle it should track command line options from the command
line and apply them by function, but there were bugs regarding
this in older gcc versions so it may not always work.

> Again, that's not the ftrace case. It doesn't care about more than one
> out of line instance. Thus, for this particular use, "used" should be
> good enough.

You mean noinline used?

Inlining would certainly break the test.

-Andi

2018-12-18 22:18:24

by Steven Rostedt

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Tue, 18 Dec 2018 14:13:38 -0800
Andi Kleen <[email protected]> wrote:

> > Again, that's not the ftrace case. It doesn't care about more than one
> > out of line instance. Thus, for this particular use, "used" should be
> > good enough.
>
> You mean noinline used?

I thought that someone said that "used" would also prevent inlining.
But regardless, whatever it takes to not let the functions be inlined
is indeed good enough.

-- Steve

>
> Inlining would certainly break the test.

2018-12-18 22:42:07

by Andi Kleen

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Tue, Dec 18, 2018 at 10:19:32AM +0100, Peter Zijlstra wrote:
> On Mon, Dec 17, 2018 at 03:59:50PM -0800, Andi Kleen wrote:
> > BTW I have a user base for LTO and so far noone has reported any issues
> > like this.
>
> Because ordering issues are immediately obvious and easy to debug no
> doubt.

Again if it's a real problem it should be already there
because we have a lot of code inside files that gets happily inlined.

Do you have an example in the tree where it happened or could happen?


-Andi

2018-12-18 23:51:11

by Andi Kleen

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Tue, Dec 18, 2018 at 05:16:20PM -0500, Steven Rostedt wrote:
> On Tue, 18 Dec 2018 14:13:38 -0800
> Andi Kleen <[email protected]> wrote:
>
> > > Again, that's not the ftrace case. It doesn't care about more than one
> > > out of line instance. Thus, for this particular use, "used" should be
> > > good enough.
> >
> > You mean noinline used?
>
> I thought that someone said that "used" would also prevent inlining.

that's not correct. You need noinline

-Andi

[ak@tassilo tsrc]$ cat tinline.c

int i;

inline __attribute__((used)) int finline(void)
{
i++;
}


main()
{
finline();
}

[ak@tassilo tsrc]$ gcc -O2 -S tinline.c
tinline.c:10:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
main()
^~~~
[ak@tassilo tsrc]$
[ak@tassilo tsrc]$ cat tinline.s
.file "tinline.c"
.text
.section .text.startup,"ax",@progbits
.p2align 4,,15
.globl main
.type main, @function
main:
.LFB1:
.cfi_startproc
addl $1, i(%rip)
xorl %eax, %eax
ret
.cfi_endproc
.LFE1:
.size main, .-main
.comm i,4,4
.ident "GCC: (GNU) 8.2.1 20181105 (Red Hat 8.2.1-5)"
.section .note.GNU-stack,"",@progbits
[ak@tassilo tsrc]$





2018-12-19 01:20:01

by Steven Rostedt

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

On Tue, 18 Dec 2018 15:26:36 -0800
Andi Kleen <[email protected]> wrote:

> > I thought that someone said that "used" would also prevent inlining.
>
> that's not correct. You need noinline

Sure, whatever. That's besides the point.

The thing is, I don't need noclone. Just remove that and keep the
noiniline, and if that keeps the functions in that file still able to
be function traced, then I'm happy.

I was only explaining why I mentioned "used" in the previous email. If
you read the part that you cut off in your reply, I said "But
regardless, whatever it takes to not let the functions be inlined is
indeed good enough."

I don't care if that's "used" on "noinline", you didn't need to waste
your time to post the assembly.

-- Steve

2018-12-19 04:23:21

by Sean Christopherson

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

+cc Nadav and Paolo

On Tue, Dec 18, 2018 at 01:20:42PM -0800, Andi Kleen wrote:
> > I whittled it down to a small test case. It turns out the problem is
> > caused by the "__optimize__("no-tracer")" atribute, which is used by our
> > __noclone macro:
> >
> >
> > # if __has_attribute(__optimize__)
> > # define __noclone __attribute__((__noclone__, __optimize__("no-tracer")))
> > # else
> > # define __noclone __attribute__((__noclone__))
> > # endif
>
> Ah interesting. That makes sense. Thanks for tracking that down.
>
> >
> > commit 95272c29378ee7dc15f43fa2758cb28a5913a06d
> > Author: Paolo Bonzini <[email protected]>
> > Date: Thu Mar 31 09:38:51 2016 +0200
> >
> > compiler-gcc: disable -ftracer for __noclone functions
> >
> > -ftracer can duplicate asm blocks causing compilation to fail in
> > noclone functions. For example, KVM declares a global variable
> > in an asm like
> >
> > asm("2: ... \n
> > .pushsection data \n
> > .global vmx_return \n
> > vmx_return: .long 2b");
> >
> > and -ftracer causes a double declaration.
>
> Ok. So the real solution would be to add a no tracer to the KVM
> function only.
>
> But of course it will drop its frame pointer. So if you rely on
> frame pointer for live patching you would still have a problem.
>
> Just fixing the ftrace case independently may not be useful?
>
> ftrace here really not interesting at all for live patching because
> the ftracing self test only happens at boot, when there is no live patching.
> Likely it's even in __init code.
>
> KVM on the other hand is quite important and actually happens at boot.
>
> So I would concentrate on fixing that one instead, but leave ftrace
> alone.
>
> My suggestion to fix KVM would be to drop the inline assembler
> and use out of line assembler in a separate file. Then disabling
> the tracer wouldn't be needed.
>
> VM entries are very expensive anyways, I doubt avoiding a single call/ret
> makes any difference at all.

The KVM code in question isn't inline asm for performance, but rather
so that it can reference struct offsets when marshalling data between
the CPU and software structs when loading/saving guest state. The
structs in question are local to VMX, i.e. don't belong in asm-offsets.h,
and until now there hasn't been a need to move away from inline asm.

Nadav recently pointed out that the __noclone attribute results in
vmx_vcpu_run() failing to inline even trivial code. So now there's
two reasons to get rid of __noclone on vmx_vcpu_run().

But rather than move the entire asm blob wholesale into a separate asm
file, we can move just VMLAUNCH+VMRESUME along with a minimal amount
of wrapper code. I have working code, just need to write a changelog.
I'll send a series tomorrow for the VMX change and a revert of commit
95272c29378e ("compiler-gcc: disable -ftracer for __noclone functions").

[1] https://patchwork.kernel.org/patch/8707981/#21817015

2018-12-19 19:01:45

by Martin Jambor

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

Hi,

On Tue, Dec 18 2018, Andi Kleen wrote:
>> OK, I have read through it and with the caveats that I don't quite
>> understand what the failure is, that also believe attribute noclone
>> should not affect frame pointer generation, and that I don't quite get
>> how LTO comes into play, my comments are the following:
>
>>
>> I am the developer who introduced attribute noclone to GCC and also the
>> one who advises against using it :-) ...at least without also using the
>> noinline attribute, the combination means "
>
> The function in question uses noinline too.
>
>> I want only one or zero
>> copies of this function in the compiled assembly" which you might need
>> if you do fancy stuff in inline assembly, for example.
>
> For this case we only want one non inlined copy because it is used as a
> test case for a function tracer.
>
> LTO comes into play because it originally relied on being in a separate
> file, so it would not be inlined, but with LTO that doesn't work.
>
>>
>> I believe that when people use noclone on its own, in 99 out 100 cases
>> they actually want something else. Usually there is something that
>
> AFAIK there is no noclone without noinline in the kernel tree.
>
>
>> references the function from code (such as assembly) or a tool that the
>> compiler does know about and then they should use the "used" attribute.
>
> Neither in the ftrace case, nor in the KVM case (another user which
> has fancy inline assembly that cannot be duplicated) that's the case.
> It's just about having exactly one out of line instance.
>
> So based on that I think noclone is fine. Of course there
> is still the open question why exactly the frame pointer disappears.

I agree, I originally thought the problem was something else.

Thanks for the clarification,

Martin

2018-12-19 20:43:46

by Martin Jambor

[permalink] [raw]
Subject: Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o

Hi,

On Tue, Dec 18 2018, Josh Poimboeuf wrote:
> On Tue, Dec 18, 2018 at 01:15:40PM +0100, Martin Jambor wrote:
>> I'm afraid I cannot give an opinion what you should do in this case
>> without understanding the problem better. If you can isolate the case
>> where noclone behaves weirdly into a self-contained testcase, I'd be
>> happy to have a look at it.
>
> I whittled it down to a small test case. It turns out the problem is
> caused by the "__optimize__("no-tracer")" atribute, which is used by our
> __noclone macro:

Wonderful, thanks a lot.

>
> # if __has_attribute(__optimize__)
> # define __noclone __attribute__((__noclone__, __optimize__("no-tracer")))
> # else
> # define __noclone __attribute__((__noclone__))
> # endif
>
>
> Here's the test case. Notice it skips the frame pointer setup before
> the call to __sanitizer_cov_trace_pc():
>
>
> $ cat test.c
> __attribute__((__optimize__("no-tracer"))) int test(void)
> {
> return 0;
> }
> $ gcc -O2 -fno-omit-frame-pointer -fsanitize-coverage=trace-pc -c -o test.o test.c

Well, it might not be intuitive but the __optimize__ attribute resets to
-O2 optimization defaults and thus drops -fno-omit-frame-pointer on the
floor, so no wonder there is no frame pointer. This applies to other
optimization options you pass on the command line, for example if you
still compile kernel with -fno-strict-aliasing, you get GCC assuming
strict aliasing in functions marked with your __noclone.

> Apparently we are using "no-tracer" because of:
>
>
> commit 95272c29378ee7dc15f43fa2758cb28a5913a06d
> Author: Paolo Bonzini <[email protected]>
> Date: Thu Mar 31 09:38:51 2016 +0200
>
> compiler-gcc: disable -ftracer for __noclone functions
>
> -ftracer can duplicate asm blocks causing compilation to fail in
> noclone functions. For example, KVM declares a global variable
> in an asm like
>
> asm("2: ... \n
> .pushsection data \n
> .global vmx_return \n
> vmx_return: .long 2b");
>
> and -ftracer causes a double declaration.

There is no way for a user to reliably prevent a duplication of an
inline assembly statement. The correct fix is to move such statements
to an assembly files. You have disabled tracer which duplicated it in
your case but other passes might do that too in the future.

The correct fix is to put such assembler snippets into separate assembly
files or, if you for for some reason insist on inline assembly, to use
'%=' inline assembler template format string (it is expanded to a number
that is unique in a compilation unit).

Martin