2022-09-15 12:29:09

by Peter Zijlstra

[permalink] [raw]
Subject: [PATCH v3 32/59] objtool: Allow STT_NOTYPE -> STT_FUNC+0 tail-calls

From: Peter Zijlstra <[email protected]>

Allow STT_NOTYPE to tail-call into STT_FUNC, per definition STT_NOTYPE
is not a sub-function of the STT_FUNC.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
tools/objtool/check.c | 29 ++++++++++++++++++++---------
1 file changed, 20 insertions(+), 9 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1370,6 +1370,16 @@ static void add_return_call(struct objto

static bool same_function(struct instruction *insn1, struct instruction *insn2)
{
+ if (!insn1->func && !insn2->func)
+ return true;
+
+ /* Allow STT_NOTYPE -> STT_FUNC+0 tail-calls */
+ if (!insn1->func && insn1->func != insn2->func)
+ return false;
+
+ if (!insn2->func)
+ return true;
+
return insn1->func->pfunc == insn2->func->pfunc;
}

@@ -1487,18 +1497,19 @@ static int add_jump_destinations(struct
strstr(jump_dest->func->name, ".cold")) {
insn->func->cfunc = jump_dest->func;
jump_dest->func->pfunc = insn->func;
-
- } else if (!same_function(insn, jump_dest) &&
- is_first_func_insn(file, jump_dest)) {
- /*
- * Internal sibling call without reloc or with
- * STT_SECTION reloc.
- */
- add_call_dest(file, insn, jump_dest->func, true);
- continue;
}
}

+ if (!same_function(insn, jump_dest) &&
+ is_first_func_insn(file, jump_dest)) {
+ /*
+ * Internal sibling call without reloc or with
+ * STT_SECTION reloc.
+ */
+ add_call_dest(file, insn, jump_dest->func, true);
+ continue;
+ }
+
insn->jump_dest = jump_dest;
}




2022-09-22 05:31:09

by Pawan Gupta

[permalink] [raw]
Subject: Re: [PATCH v3 32/59] objtool: Allow STT_NOTYPE -> STT_FUNC+0 tail-calls

On Thu, Sep 15, 2022 at 01:11:11PM +0200, Peter Zijlstra wrote:
> From: Peter Zijlstra <[email protected]>
>
> Allow STT_NOTYPE to tail-call into STT_FUNC, per definition STT_NOTYPE
> is not a sub-function of the STT_FUNC.
>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> Signed-off-by: Thomas Gleixner <[email protected]>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> ---
> tools/objtool/check.c | 29 ++++++++++++++++++++---------
> 1 file changed, 20 insertions(+), 9 deletions(-)
>
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -1370,6 +1370,16 @@ static void add_return_call(struct objto
>
> static bool same_function(struct instruction *insn1, struct instruction *insn2)
> {
> + if (!insn1->func && !insn2->func)
> + return true;
> +
> + /* Allow STT_NOTYPE -> STT_FUNC+0 tail-calls */
> + if (!insn1->func && insn1->func != insn2->func)
> + return false;

Looks like this check is triggering below warning:

vmlinux.o: warning: objtool: ftrace_replace_code.cold+0x0: unreachable instruction

(.config is attached)

> +
> + if (!insn2->func)
> + return true;
> +
> return insn1->func->pfunc == insn2->func->pfunc;
> }


Attachments:
(No filename) (1.21 kB)
.config (139.68 kB)
Download all attachments

2022-09-22 10:47:00

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v3 32/59] objtool: Allow STT_NOTYPE -> STT_FUNC+0 tail-calls

On Wed, Sep 21, 2022 at 10:27:50PM -0700, Pawan Gupta wrote:
> On Thu, Sep 15, 2022 at 01:11:11PM +0200, Peter Zijlstra wrote:
> > From: Peter Zijlstra <[email protected]>
> >
> > Allow STT_NOTYPE to tail-call into STT_FUNC, per definition STT_NOTYPE
> > is not a sub-function of the STT_FUNC.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> > Signed-off-by: Thomas Gleixner <[email protected]>
> > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> > ---
> > tools/objtool/check.c | 29 ++++++++++++++++++++---------
> > 1 file changed, 20 insertions(+), 9 deletions(-)
> >
> > --- a/tools/objtool/check.c
> > +++ b/tools/objtool/check.c
> > @@ -1370,6 +1370,16 @@ static void add_return_call(struct objto
> >
> > static bool same_function(struct instruction *insn1, struct instruction *insn2)
> > {
> > + if (!insn1->func && !insn2->func)
> > + return true;
> > +
> > + /* Allow STT_NOTYPE -> STT_FUNC+0 tail-calls */
> > + if (!insn1->func && insn1->func != insn2->func)
> > + return false;
>
> Looks like this check is triggering below warning:
>
> vmlinux.o: warning: objtool: ftrace_replace_code.cold+0x0: unreachable instruction
>

I managed to reproduce with 12.2.0 -- my 12.1.0 compiler didn't
reproduce.

$ ./scripts/objdump-func vmlinux.o ftrace_replace_code
0000 00000000001032f0 <ftrace_replace_code>:
0000 1032f0: f3 0f 1e fa endbr64
0004 1032f4: 41 57 push %r15
0006 1032f6: 41 56 push %r14
0008 1032f8: 41 55 push %r13
000a 1032fa: 41 89 fd mov %edi,%r13d
000d 1032fd: 41 54 push %r12
000f 1032ff: 55 push %rbp
0010 103300: 53 push %rbx
0011 103301: 48 83 ec 10 sub $0x10,%rsp
0015 103305: 65 48 8b 04 25 28 00 00 00 mov %gs:0x28,%rax
001e 10330e: 48 89 44 24 08 mov %rax,0x8(%rsp)
0023 103313: 31 c0 xor %eax,%eax
0025 103315: e8 00 00 00 00 call 10331a <ftrace_replace_code+0x2a> 103316: R_X86_64_PLT32 ftrace_rec_iter_start-0x4
002a 10331a: 45 85 ed test %r13d,%r13d
002d 10331d: 41 0f 95 c4 setne %r12b
0031 103321: 48 85 c0 test %rax,%rax
0034 103324: 0f 84 e1 00 00 00 je 10340b <ftrace_replace_code+0x11b>
003a 10332a: 48 89 c3 mov %rax,%rbx
003d 10332d: 45 0f b6 e4 movzbl %r12b,%r12d
0041 103331: 48 c7 c5 00 00 00 00 mov $0x0,%rbp 103334: R_X86_64_32S .bss+0x16538
0048 103338: 48 89 df mov %rbx,%rdi
004b 10333b: e8 00 00 00 00 call 103340 <ftrace_replace_code+0x50> 10333c: R_X86_64_PLT32 ftrace_rec_iter_record-0x4
0050 103340: 44 89 e6 mov %r12d,%esi
0053 103343: 48 89 c7 mov %rax,%rdi
0056 103346: 49 89 c6 mov %rax,%r14
0059 103349: e8 00 00 00 00 call 10334e <ftrace_replace_code+0x5e> 10334a: R_X86_64_PLT32 ftrace_test_record-0x4
005e 10334e: 83 f8 01 cmp $0x1,%eax
0061 103351: 0f 84 61 01 00 00 je 1034b8 <ftrace_replace_code+0x1c8>
0067 103357: 83 e8 02 sub $0x2,%eax
006a 10335a: 83 f8 01 cmp $0x1,%eax
006d 10335d: 0f 87 94 00 00 00 ja 1033f7 <ftrace_replace_code+0x107>
0073 103363: 4c 89 f7 mov %r14,%rdi
0076 103366: 49 c7 c7 00 00 00 00 mov $0x0,%r15 103369: R_X86_64_32S .bss+0x16538
007d 10336d: e8 00 00 00 00 call 103372 <ftrace_replace_code+0x82> 10336e: R_X86_64_PLT32 ftrace_get_addr_curr-0x4
0082 103372: 49 8b 16 mov (%r14),%rdx
0085 103375: c6 45 00 e8 movb $0xe8,0x0(%rbp)
0089 103379: 48 83 c2 05 add $0x5,%rdx
008d 10337d: 48 29 d0 sub %rdx,%rax
0090 103380: 89 45 01 mov %eax,0x1(%rbp)
0093 103383: 49 8b 36 mov (%r14),%rsi
0096 103386: ba 05 00 00 00 mov $0x5,%edx
009b 10338b: 48 8d 7c 24 03 lea 0x3(%rsp),%rdi
00a0 103390: c7 44 24 03 00 00 00 00 movl $0x0,0x3(%rsp)
00a8 103398: c6 44 24 07 00 movb $0x0,0x7(%rsp)
00ad 10339d: e8 00 00 00 00 call 1033a2 <ftrace_replace_code+0xb2> 10339e: R_X86_64_PLT32 copy_from_kernel_nofault-0x4
00b2 1033a2: 48 85 c0 test %rax,%rax
00b5 1033a5: 0f 85 3e 01 00 00 jne 1034e9 <ftrace_replace_code+0x1f9>
00bb 1033ab: 41 8b 07 mov (%r15),%eax
00be 1033ae: 39 44 24 03 cmp %eax,0x3(%rsp)
00c2 1033b2: 74 38 je 1033ec <ftrace_replace_code+0xfc>
00c4 1033b4: 4c 89 3d 00 00 00 00 mov %r15,0x0(%rip) # 1033bb <ftrace_replace_code+0xcb> 1033b7: R_X86_64_PC32 ftrace_expected-0x4
00cb 1033bb: 0f 0b ud2
00cd 1033bd: bf ea ff ff ff mov $0xffffffea,%edi
00d2 1033c2: 48 8b 44 24 08 mov 0x8(%rsp),%rax
00d7 1033c7: 65 48 2b 04 25 28 00 00 00 sub %gs:0x28,%rax
00e0 1033d0: 0f 85 1f 01 00 00 jne 1034f5 <ftrace_replace_code+0x205>
00e6 1033d6: 48 83 c4 10 add $0x10,%rsp
00ea 1033da: 4c 89 f6 mov %r14,%rsi
00ed 1033dd: 5b pop %rbx
00ee 1033de: 5d pop %rbp
00ef 1033df: 41 5c pop %r12
00f1 1033e1: 41 5d pop %r13
00f3 1033e3: 41 5e pop %r14
00f5 1033e5: 41 5f pop %r15
00f7 1033e7: e9 00 00 00 00 jmp 1033ec <ftrace_replace_code+0xfc> 1033e8: R_X86_64_PLT32 ftrace_bug-0x4
00fc 1033ec: 41 0f b6 47 04 movzbl 0x4(%r15),%eax
0101 1033f1: 38 44 24 07 cmp %al,0x7(%rsp)
0105 1033f5: 75 bd jne 1033b4 <ftrace_replace_code+0xc4>
0107 1033f7: 48 89 df mov %rbx,%rdi
010a 1033fa: e8 00 00 00 00 call 1033ff <ftrace_replace_code+0x10f> 1033fb: R_X86_64_PLT32 ftrace_rec_iter_next-0x4
010f 1033ff: 48 89 c3 mov %rax,%rbx
0112 103402: 48 85 c0 test %rax,%rax
0115 103405: 0f 85 2d ff ff ff jne 103338 <ftrace_replace_code+0x48>
011b 10340b: e8 00 00 00 00 call 103410 <ftrace_replace_code+0x120> 10340c: R_X86_64_PLT32 ftrace_rec_iter_start-0x4
0120 103410: 45 31 e4 xor %r12d,%r12d
0123 103413: 45 85 ed test %r13d,%r13d
0126 103416: 49 c7 c5 00 00 00 00 mov $0x0,%r13 103419: R_X86_64_32S .bss+0x16538
012d 10341d: 41 0f 95 c4 setne %r12b
0131 103421: 48 89 c3 mov %rax,%rbx
0134 103424: 48 85 c0 test %rax,%rax
0137 103427: 75 1a jne 103443 <ftrace_replace_code+0x153>
0139 103429: eb 6a jmp 103495 <ftrace_replace_code+0x1a5>
013b 10342b: 85 c0 test %eax,%eax
013d 10342d: 0f 8f 91 00 00 00 jg 1034c4 <ftrace_replace_code+0x1d4>
0143 103433: 48 89 df mov %rbx,%rdi
0146 103436: e8 00 00 00 00 call 10343b <ftrace_replace_code+0x14b> 103437: R_X86_64_PLT32 ftrace_rec_iter_next-0x4
014b 10343b: 48 89 c3 mov %rax,%rbx
014e 10343e: 48 85 c0 test %rax,%rax
0151 103441: 74 52 je 103495 <ftrace_replace_code+0x1a5>
0153 103443: 48 89 df mov %rbx,%rdi
0156 103446: e8 00 00 00 00 call 10344b <ftrace_replace_code+0x15b> 103447: R_X86_64_PLT32 ftrace_rec_iter_record-0x4
015b 10344b: 44 89 e6 mov %r12d,%esi
015e 10344e: 48 89 c7 mov %rax,%rdi
0161 103451: 48 89 c5 mov %rax,%rbp
0164 103454: e8 00 00 00 00 call 103459 <ftrace_replace_code+0x169> 103455: R_X86_64_PLT32 ftrace_test_record-0x4
0169 103459: 83 f8 02 cmp $0x2,%eax
016c 10345c: 7e cd jle 10342b <ftrace_replace_code+0x13b>
016e 10345e: 83 f8 03 cmp $0x3,%eax
0171 103461: 75 d0 jne 103433 <ftrace_replace_code+0x143>
0173 103463: 48 8b 35 00 00 00 00 mov 0x0(%rip),%rsi # 10346a <ftrace_replace_code+0x17a> 103466: R_X86_64_PC32 x86_nops+0x24
017a 10346a: 48 8b 7d 00 mov 0x0(%rbp),%rdi
017e 10346e: 31 c9 xor %ecx,%ecx
0180 103470: ba 05 00 00 00 mov $0x5,%edx
0185 103475: e8 00 00 00 00 call 10347a <ftrace_replace_code+0x18a> 103476: R_X86_64_PLT32 text_poke_queue-0x4
018a 10347a: 44 89 e6 mov %r12d,%esi
018d 10347d: 48 89 ef mov %rbp,%rdi
0190 103480: e8 00 00 00 00 call 103485 <ftrace_replace_code+0x195> 103481: R_X86_64_PLT32 ftrace_update_record-0x4
0195 103485: 48 89 df mov %rbx,%rdi
0198 103488: e8 00 00 00 00 call 10348d <ftrace_replace_code+0x19d> 103489: R_X86_64_PLT32 ftrace_rec_iter_next-0x4
019d 10348d: 48 89 c3 mov %rax,%rbx
01a0 103490: 48 85 c0 test %rax,%rax
01a3 103493: 75 ae jne 103443 <ftrace_replace_code+0x153>
01a5 103495: 48 8b 44 24 08 mov 0x8(%rsp),%rax
01aa 10349a: 65 48 2b 04 25 28 00 00 00 sub %gs:0x28,%rax
01b3 1034a3: 75 50 jne 1034f5 <ftrace_replace_code+0x205>
01b5 1034a5: 48 83 c4 10 add $0x10,%rsp
01b9 1034a9: 5b pop %rbx
01ba 1034aa: 5d pop %rbp
01bb 1034ab: 41 5c pop %r12
01bd 1034ad: 41 5d pop %r13
01bf 1034af: 41 5e pop %r14
01c1 1034b1: 41 5f pop %r15
01c3 1034b3: e9 00 00 00 00 jmp 1034b8 <ftrace_replace_code+0x1c8> 1034b4: R_X86_64_PLT32 text_poke_finish-0x4
01c8 1034b8: 4c 8b 3d 00 00 00 00 mov 0x0(%rip),%r15 # 1034bf <ftrace_replace_code+0x1cf> 1034bb: R_X86_64_PC32 x86_nops+0x24
01cf 1034bf: e9 bf fe ff ff jmp 103383 <ftrace_replace_code+0x93>
01d4 1034c4: 48 89 ef mov %rbp,%rdi
01d7 1034c7: e8 00 00 00 00 call 1034cc <ftrace_replace_code+0x1dc> 1034c8: R_X86_64_PLT32 ftrace_get_addr_new-0x4
01dc 1034cc: 48 8b 55 00 mov 0x0(%rbp),%rdx
01e0 1034d0: 48 c7 c6 00 00 00 00 mov $0x0,%rsi 1034d3: R_X86_64_32S .bss+0x16538
01e7 1034d7: 41 c6 45 00 e8 movb $0xe8,0x0(%r13)
01ec 1034dc: 48 83 c2 05 add $0x5,%rdx
01f0 1034e0: 48 29 d0 sub %rdx,%rax
01f3 1034e3: 41 89 45 01 mov %eax,0x1(%r13)
01f7 1034e7: eb 81 jmp 10346a <ftrace_replace_code+0x17a>
01f9 1034e9: 0f 0b ud2
01fb 1034eb: bf f2 ff ff ff mov $0xfffffff2,%edi
0200 1034f0: e9 cd fe ff ff jmp 1033c2 <ftrace_replace_code+0xd2>
0205 1034f5: e8 00 00 00 00 call 1034fa <ftrace_replace_code+0x20a> 1034f6: R_X86_64_PLT32 __stack_chk_fail-0x4
020a 1034fa: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
0210 103500: 90 nop
0211 103501: 90 nop
0212 103502: 90 nop
0213 103503: 90 nop
0214 103504: 90 nop
0215 103505: 90 nop
0216 103506: 90 nop
0217 103507: 90 nop
0218 103508: 90 nop
0219 103509: 90 nop
021a 10350a: 90 nop
021b 10350b: 90 nop
021c 10350c: 90 nop
021d 10350d: 90 nop
021e 10350e: 90 nop
021f 10350f: 90 nop
0000 0000000000012a83 <ftrace_replace_code.cold>:
0000 12a83: 48 89 de mov %rbx,%rsi
0003 12a86: 89 c7 mov %eax,%edi
0005 12a88: 5b pop %rbx
0006 12a89: 5d pop %rbp
0007 12a8a: 41 5c pop %r12
0009 12a8c: 41 5d pop %r13
000b 12a8e: 41 5e pop %r14
000d 12a90: e9 62 fd ff ff jmp 127f7 <ftrace_bug>


Seems to suggest objtool is actually right; I cannot find a reference to
that cold symbol.

2022-09-22 11:28:40

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v3 32/59] objtool: Allow STT_NOTYPE -> STT_FUNC+0 tail-calls

On Thu, Sep 22, 2022 at 12:29:58PM +0200, Peter Zijlstra wrote:
> I managed to reproduce with 12.2.0 -- my 12.1.0 compiler didn't
> reproduce.
>
> $ ./scripts/objdump-func vmlinux.o ftrace_replace_code

> 0000 0000000000012a83 <ftrace_replace_code.cold>:
> 0000 12a83: 48 89 de mov %rbx,%rsi
> 0003 12a86: 89 c7 mov %eax,%edi
> 0005 12a88: 5b pop %rbx
> 0006 12a89: 5d pop %rbp
> 0007 12a8a: 41 5c pop %r12
> 0009 12a8c: 41 5d pop %r13
> 000b 12a8e: 41 5e pop %r14
> 000d 12a90: e9 62 fd ff ff jmp 127f7 <ftrace_bug>
>
>
> Seems to suggest objtool is actually right; I cannot find a reference to
> that cold symbol.

Ohhhh, ftrace_replace_cold is a weak function, so it could be the
original weak symbol had a reference to the cold thing.

We have some code to deal with crap like that, lemme try and figure out
what went wrong.

2022-09-22 13:25:25

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v3 32/59] objtool: Allow STT_NOTYPE -> STT_FUNC+0 tail-calls

On Thu, Sep 22, 2022 at 12:47:26PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 22, 2022 at 12:29:58PM +0200, Peter Zijlstra wrote:
> > I managed to reproduce with 12.2.0 -- my 12.1.0 compiler didn't
> > reproduce.
> >
> > $ ./scripts/objdump-func vmlinux.o ftrace_replace_code
>
> > 0000 0000000000012a83 <ftrace_replace_code.cold>:
> > 0000 12a83: 48 89 de mov %rbx,%rsi
> > 0003 12a86: 89 c7 mov %eax,%edi
> > 0005 12a88: 5b pop %rbx
> > 0006 12a89: 5d pop %rbp
> > 0007 12a8a: 41 5c pop %r12
> > 0009 12a8c: 41 5d pop %r13
> > 000b 12a8e: 41 5e pop %r14
> > 000d 12a90: e9 62 fd ff ff jmp 127f7 <ftrace_bug>
> >
> >
> > Seems to suggest objtool is actually right; I cannot find a reference to
> > that cold symbol.
>
> Ohhhh, ftrace_replace_cold is a weak function, so it could be the
> original weak symbol had a reference to the cold thing.
>
> We have some code to deal with crap like that, lemme try and figure out
> what went wrong.

The original weak function:

$ ./scripts/objdump-func kernel/trace/ftrace.o ftrace_replace_code
0000 0000000000003bc0 <ftrace_replace_code>:
0000 3bc0: f3 0f 1e fa endbr64
0004 3bc4: 8b 15 00 00 00 00 mov 0x0(%rip),%edx # 3bca <ftrace_replace_code+0xa> 3bc6: R_X86_64_PC32 .data..read_mostly+0xc
000a 3bca: 41 56 push %r14
000c 3bcc: 41 55 push %r13
000e 3bce: 41 89 fd mov %edi,%r13d
0011 3bd1: 41 54 push %r12
0013 3bd3: 41 83 e5 02 and $0x2,%r13d
0017 3bd7: 55 push %rbp
0018 3bd8: 53 push %rbx
0019 3bd9: 85 d2 test %edx,%edx
001b 3bdb: 75 67 jne 3c44 <ftrace_replace_code+0x84>
001d 3bdd: 4c 8b 35 00 00 00 00 mov 0x0(%rip),%r14 # 3be4 <ftrace_replace_code+0x24> 3be0: R_X86_64_PC32 .bss+0x54
0024 3be4: 41 89 fc mov %edi,%r12d
0027 3be7: 41 83 e4 01 and $0x1,%r12d
002b 3beb: 4d 85 f6 test %r14,%r14
002e 3bee: 74 54 je 3c44 <ftrace_replace_code+0x84>
0030 3bf0: 41 8b 46 10 mov 0x10(%r14),%eax
0034 3bf4: 31 ed xor %ebp,%ebp
0036 3bf6: 85 c0 test %eax,%eax
0038 3bf8: 7f 0b jg 3c05 <ftrace_replace_code+0x45>
003a 3bfa: eb 40 jmp 3c3c <ftrace_replace_code+0x7c>
003c 3bfc: 83 c5 01 add $0x1,%ebp
003f 3bff: 41 39 6e 10 cmp %ebp,0x10(%r14)
0043 3c03: 7e 37 jle 3c3c <ftrace_replace_code+0x7c>
0045 3c05: 48 63 dd movslq %ebp,%rbx
0048 3c08: 48 c1 e3 04 shl $0x4,%rbx
004c 3c0c: 49 03 5e 08 add 0x8(%r14),%rbx
0050 3c10: f6 43 0b 02 testb $0x2,0xb(%rbx)
0054 3c14: 75 e6 jne 3bfc <ftrace_replace_code+0x3c>
0056 3c16: 44 89 e6 mov %r12d,%esi
0059 3c19: 48 89 df mov %rbx,%rdi
005c 3c1c: e8 3f fb ff ff call 3760 <__ftrace_replace_code>
0061 3c21: 85 c0 test %eax,%eax
0063 3c23: 0f 85 00 00 00 00 jne 3c29 <ftrace_replace_code+0x69> 3c25: R_X86_64_PC32 .text.unlikely+0x368

Jumps to the cold subfunction right here ^

0069 3c29: 45 85 ed test %r13d,%r13d
006c 3c2c: 74 ce je 3bfc <ftrace_replace_code+0x3c>
006e 3c2e: e8 00 00 00 00 call 3c33 <ftrace_replace_code+0x73> 3c2f: R_X86_64_PLT32 __SCT__cond_resched-0x4
0073 3c33: 83 c5 01 add $0x1,%ebp
0076 3c36: 41 39 6e 10 cmp %ebp,0x10(%r14)
007a 3c3a: 7f c9 jg 3c05 <ftrace_replace_code+0x45>
007c 3c3c: 4d 8b 36 mov (%r14),%r14
007f 3c3f: 4d 85 f6 test %r14,%r14
0082 3c42: 75 ac jne 3bf0 <ftrace_replace_code+0x30>
0084 3c44: 5b pop %rbx
0085 3c45: 5d pop %rbp
0086 3c46: 41 5c pop %r12
0088 3c48: 41 5d pop %r13
008a 3c4a: 41 5e pop %r14
008c 3c4c: e9 00 00 00 00 jmp 3c51 <ftrace_replace_code+0x91> 3c4d: R_X86_64_PLT32 __x86_return_thunk-0x4
0091 3c51: 66 66 2e 0f 1f 84 00 00 00 00 00 data16 cs nopw 0x0(%rax,%rax,1)
009c 3c5c: 0f 1f 40 00 nopl 0x0(%rax)
00a0 3c60: 90 nop
00a1 3c61: 90 nop
00a2 3c62: 90 nop
00a3 3c63: 90 nop
00a4 3c64: 90 nop
00a5 3c65: 90 nop
00a6 3c66: 90 nop
00a7 3c67: 90 nop
00a8 3c68: 90 nop
00a9 3c69: 90 nop
00aa 3c6a: 90 nop
00ab 3c6b: 90 nop
00ac 3c6c: 90 nop
00ad 3c6d: 90 nop
00ae 3c6e: 90 nop
00af 3c6f: 90 nop
0000 000000000000036c <ftrace_replace_code.cold>:
0000 36c: 48 89 de mov %rbx,%rsi
0003 36f: 89 c7 mov %eax,%edi
0005 371: 5b pop %rbx
0006 372: 5d pop %rbp
0007 373: 41 5c pop %r12
0009 375: 41 5d pop %r13
000b 377: 41 5e pop %r14
000d 379: e9 62 fd ff ff jmp e0 <ftrace_bug>

And yes, you're right this patch broke it.

Because the weak function symbol gets deleted by the linker, these
instructions will belong to no function and insn->func will be NULL.

Then the new: STT_NOTYPE -> STT_FUNC+0 logic kicks in and turns this
into a tail-call.

That means insn->jump_dest is unset in favour of insn->call_dest, and
then the unreachable logic falls on it's face because that doesn't deal
with call_dest.

Ho-Humm.. let me find a non hacky solution.

2022-09-23 15:14:41

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v3 32/59] objtool: Allow STT_NOTYPE -> STT_FUNC+0 tail-calls

On Thu, Sep 22, 2022 at 03:15:10PM +0200, Peter Zijlstra wrote:

> Ho-Humm.. let me find a non hacky solution.

I've ended up here:

https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=call-depth-tracking&id=c2a8b8187418956af92d2d954ee4574f87f5e188
https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=call-depth-tracking&id=c74976b9664fb6a1856d85ce246d84db3fd2c2cd

which is arguably quite a lot of work for something relatively minor,
but I do think it leaves us in a better place.

Combo patch below..

---
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -62,12 +62,12 @@ static struct instruction *next_insn_sam
struct instruction *insn)
{
struct instruction *next = list_next_entry(insn, list);
- struct symbol *func = insn->func;
+ struct symbol *func = insn_func(insn);

if (!func)
return NULL;

- if (&next->list != &file->insn_list && next->func == func)
+ if (&next->list != &file->insn_list && insn_func(next) == func)
return next;

/* Check if we're already in the subfunction: */
@@ -83,7 +83,7 @@ static struct instruction *prev_insn_sam
{
struct instruction *prev = list_prev_entry(insn, list);

- if (&prev->list != &file->insn_list && prev->func == insn->func)
+ if (&prev->list != &file->insn_list && insn_func(prev) == insn_func(insn))
return prev;

return NULL;
@@ -129,16 +129,13 @@ static bool is_jump_table_jump(struct in
static bool is_sibling_call(struct instruction *insn)
{
/*
- * Assume only ELF functions can make sibling calls. This ensures
- * sibling call detection consistency between vmlinux.o and individual
- * objects.
+ * Assume only STT_FUNC calls have jump-tables.
*/
- if (!insn->func)
- return false;
-
- /* An indirect jump is either a sibling call or a jump to a table. */
- if (insn->type == INSN_JUMP_DYNAMIC)
- return !is_jump_table_jump(insn);
+ if (insn_func(insn)) {
+ /* An indirect jump is either a sibling call or a jump to a table. */
+ if (insn->type == INSN_JUMP_DYNAMIC)
+ return !is_jump_table_jump(insn);
+ }

/* add_jump_destinations() sets insn->call_dest for sibling calls. */
return (is_static_jump(insn) && insn->call_dest);
@@ -207,7 +204,7 @@ static bool __dead_end_function(struct o
return false;

insn = find_insn(file, func->sec, func->offset);
- if (!insn->func)
+ if (!insn_func(insn))
return false;

func_for_each_insn(file, func, insn) {
@@ -243,7 +240,7 @@ static bool __dead_end_function(struct o
return false;
}

- return __dead_end_function(file, dest->func, recursion+1);
+ return __dead_end_function(file, insn_func(dest), recursion+1);
}
}

@@ -427,7 +424,7 @@ static int decode_instructions(struct ob
}

list_for_each_entry(func, &sec->symbol_list, list) {
- if (func->type != STT_FUNC || func->alias != func)
+ if (func->return_thunk || func->alias != func)
continue;

if (!find_insn(file, sec, func->offset)) {
@@ -437,9 +434,11 @@ static int decode_instructions(struct ob
}

sym_for_each_insn(file, func, insn) {
- insn->func = func;
- if (insn->type == INSN_ENDBR && list_empty(&insn->call_node)) {
- if (insn->offset == insn->func->offset) {
+ insn->sym = func;
+ if (func->type == STT_FUNC &&
+ insn->type == INSN_ENDBR &&
+ list_empty(&insn->call_node)) {
+ if (insn->offset == func->offset) {
list_add_tail(&insn->call_node, &file->endbr_list);
file->nr_endbr++;
} else {
@@ -1372,21 +1371,18 @@ static void add_return_call(struct objto
list_add_tail(&insn->call_node, &file->return_thunk_list);
}

-static bool same_function(struct instruction *insn1, struct instruction *insn2)
-{
- return insn1->func->pfunc == insn2->func->pfunc;
-}
-
-static bool is_first_func_insn(struct objtool_file *file, struct instruction *insn)
+static bool is_first_func_insn(struct objtool_file *file,
+ struct instruction *insn, struct symbol *sym)
{
- if (insn->offset == insn->func->offset)
+ if (insn->offset == sym->offset)
return true;

+ /* Allow direct CALL/JMP past ENDBR */
if (opts.ibt) {
struct instruction *prev = prev_insn_same_sym(file, insn);

if (prev && prev->type == INSN_ENDBR &&
- insn->offset == insn->func->offset + prev->len)
+ insn->offset == sym->offset + prev->len)
return true;
}

@@ -1394,6 +1390,32 @@ static bool is_first_func_insn(struct ob
}

/*
+ * A sibling call is a tail-call to another symbol -- to differentiate from a
+ * recursive tail-call which is to the same symbol.
+ */
+static bool jump_is_sibling_call(struct objtool_file *file,
+ struct instruction *from, struct instruction *to)
+{
+ struct symbol *fs = from->sym;
+ struct symbol *ts = to->sym;
+
+ /* Not a sibling call if from/to a symbol hole */
+ if (!fs || !ts)
+ return false;
+
+ /* Not a sibling call if not targeting the start of a symbol. */
+ if (!is_first_func_insn(file, to, ts))
+ return false;
+
+ /* Disallow sibling calls into STT_NOTYPE */
+ if (ts->type == STT_NOTYPE)
+ return false;
+
+ /* Must not be self to be a sibling */
+ return fs->pfunc != ts->pfunc;
+}
+
+/*
* Find the destination instructions for all jumps.
*/
static int add_jump_destinations(struct objtool_file *file)
@@ -1427,7 +1449,7 @@ static int add_jump_destinations(struct
} else if (reloc->sym->return_thunk) {
add_return_call(file, insn, true);
continue;
- } else if (insn->func) {
+ } else if (insn_func(insn)) {
/*
* External sibling call or internal sibling call with
* STT_FUNC reloc.
@@ -1469,8 +1491,8 @@ static int add_jump_destinations(struct
/*
* Cross-function jump.
*/
- if (insn->func && jump_dest->func &&
- insn->func != jump_dest->func) {
+ if (insn_func(insn) && insn_func(jump_dest) &&
+ insn_func(insn) != insn_func(jump_dest)) {

/*
* For GCC 8+, create parent/child links for any cold
@@ -1487,22 +1509,22 @@ static int add_jump_destinations(struct
* case where the parent function's only reference to a
* subfunction is through a jump table.
*/
- if (!strstr(insn->func->name, ".cold") &&
- strstr(jump_dest->func->name, ".cold")) {
- insn->func->cfunc = jump_dest->func;
- jump_dest->func->pfunc = insn->func;
-
- } else if (!same_function(insn, jump_dest) &&
- is_first_func_insn(file, jump_dest)) {
- /*
- * Internal sibling call without reloc or with
- * STT_SECTION reloc.
- */
- add_call_dest(file, insn, jump_dest->func, true);
- continue;
+ if (!strstr(insn_func(insn)->name, ".cold") &&
+ strstr(insn_func(jump_dest)->name, ".cold")) {
+ insn_func(insn)->cfunc = insn_func(jump_dest);
+ insn_func(jump_dest)->pfunc = insn_func(insn);
}
}

+ if (jump_is_sibling_call(file, insn, jump_dest)) {
+ /*
+ * Internal sibling call without reloc or with
+ * STT_SECTION reloc.
+ */
+ add_call_dest(file, insn, insn_func(jump_dest), true);
+ continue;
+ }
+
insn->jump_dest = jump_dest;
}

@@ -1549,7 +1571,7 @@ static int add_call_destinations(struct
return -1;
}

- if (insn->func && insn->call_dest->type != STT_FUNC) {
+ if (insn_func(insn) && insn->call_dest->type != STT_FUNC) {
WARN_FUNC("unsupported call to non-function",
insn->sec, insn->offset);
return -1;
@@ -1645,7 +1667,7 @@ static int handle_group_alt(struct objto
nop->offset = special_alt->new_off + special_alt->new_len;
nop->len = special_alt->orig_len - special_alt->new_len;
nop->type = INSN_NOP;
- nop->func = orig_insn->func;
+ nop->sym = orig_insn->sym;
nop->alt_group = new_alt_group;
nop->ignore = orig_insn->ignore_alts;
}
@@ -1665,7 +1687,7 @@ static int handle_group_alt(struct objto
last_new_insn = insn;

insn->ignore = orig_insn->ignore_alts;
- insn->func = orig_insn->func;
+ insn->sym = orig_insn->sym;
insn->alt_group = new_alt_group;

/*
@@ -1859,7 +1881,7 @@ static int add_jump_table(struct objtool
struct reloc *reloc = table;
struct instruction *dest_insn;
struct alternative *alt;
- struct symbol *pfunc = insn->func->pfunc;
+ struct symbol *pfunc = insn_func(insn)->pfunc;
unsigned int prev_offset = 0;

/*
@@ -1886,7 +1908,7 @@ static int add_jump_table(struct objtool
break;

/* Make sure the destination is in the same function: */
- if (!dest_insn->func || dest_insn->func->pfunc != pfunc)
+ if (!insn_func(dest_insn) || insn_func(dest_insn)->pfunc != pfunc)
break;

alt = malloc(sizeof(*alt));
@@ -1926,7 +1948,7 @@ static struct reloc *find_jump_table(str
* it.
*/
for (;
- insn && insn->func && insn->func->pfunc == func;
+ insn && insn_func(insn) && insn_func(insn)->pfunc == func;
insn = insn->first_jump_src ?: prev_insn_same_sym(file, insn)) {

if (insn != orig_insn && insn->type == INSN_JUMP_DYNAMIC)
@@ -1943,7 +1965,7 @@ static struct reloc *find_jump_table(str
if (!table_reloc)
continue;
dest_insn = find_insn(file, table_reloc->sym->sec, table_reloc->addend);
- if (!dest_insn || !dest_insn->func || dest_insn->func->pfunc != func)
+ if (!dest_insn || !insn_func(dest_insn) || insn_func(dest_insn)->pfunc != func)
continue;

return table_reloc;
@@ -2395,6 +2417,13 @@ static int decode_sections(struct objtoo
if (ret)
return ret;

+ /*
+ * Must be before add_{jump_call}_destination.
+ */
+ ret = classify_symbols(file);
+ if (ret)
+ return ret;
+
ret = decode_instructions(file);
if (ret)
return ret;
@@ -2414,13 +2443,6 @@ static int decode_sections(struct objtoo
return ret;

/*
- * Must be before add_{jump_call}_destination.
- */
- ret = classify_symbols(file);
- if (ret)
- return ret;
-
- /*
* Must be before add_jump_destinations(), which depends on 'func'
* being set for alternatives, to enable proper sibling call detection.
*/
@@ -2628,7 +2650,7 @@ static int update_cfi_state(struct instr

/* stack operations don't make sense with an undefined CFA */
if (cfa->base == CFI_UNDEFINED) {
- if (insn->func) {
+ if (insn_func(insn)) {
WARN_FUNC("undefined stack state", insn->sec, insn->offset);
return -1;
}
@@ -2974,7 +2996,7 @@ static int update_cfi_state(struct instr
}

/* detect when asm code uses rbp as a scratch register */
- if (opts.stackval && insn->func && op->src.reg == CFI_BP &&
+ if (opts.stackval && insn_func(insn) && op->src.reg == CFI_BP &&
cfa->base != CFI_BP)
cfi->bp_scratch = true;
break;
@@ -3284,7 +3306,7 @@ static int validate_sibling_call(struct
struct instruction *insn,
struct insn_state *state)
{
- if (has_modified_stack_frame(insn, state)) {
+ if (insn_func(insn) && has_modified_stack_frame(insn, state)) {
WARN_FUNC("sibling call from callable instruction with modified stack frame",
insn->sec, insn->offset);
return 1;
@@ -3370,9 +3392,9 @@ static int validate_branch(struct objtoo
while (1) {
next_insn = next_insn_to_validate(file, insn);

- if (func && insn->func && func != insn->func->pfunc) {
+ if (func && insn_func(insn) && func != insn_func(insn)->pfunc) {
WARN("%s() falls through to next function %s()",
- func->name, insn->func->name);
+ func->name, insn_func(insn)->name);
return 1;
}

@@ -3614,7 +3636,7 @@ static int validate_unwind_hints(struct

while (&insn->list != &file->insn_list && (!sec || insn->sec == sec)) {
if (insn->hint && !insn->visited && !insn->ignore) {
- ret = validate_branch(file, insn->func, insn, state);
+ ret = validate_branch(file, insn_func(insn), insn, state);
if (ret && opts.backtrace)
BT_FUNC("<=== (hint)", insn);
warnings += ret;
@@ -3837,7 +3859,7 @@ static bool ignore_unreachable_insn(stru
* In this case we'll find a piece of code (whole function) that is not
* covered by a !section symbol. Ignore them.
*/
- if (opts.link && !insn->func) {
+ if (opts.link && !insn_func(insn)) {
int size = find_symbol_hole_containing(insn->sec, insn->offset);
unsigned long end = insn->offset + size;

@@ -3861,10 +3883,10 @@ static bool ignore_unreachable_insn(stru
/*
* If this hole jumps to a .cold function, mark it ignore too.
*/
- if (insn->jump_dest && insn->jump_dest->func &&
- strstr(insn->jump_dest->func->name, ".cold")) {
+ if (insn->jump_dest && insn_func(insn->jump_dest) &&
+ strstr(insn_func(insn->jump_dest)->name, ".cold")) {
struct instruction *dest = insn->jump_dest;
- func_for_each_insn(file, dest->func, dest)
+ func_for_each_insn(file, insn_func(dest), dest)
dest->ignore = true;
}
}
@@ -3872,10 +3894,10 @@ static bool ignore_unreachable_insn(stru
return false;
}

- if (!insn->func)
+ if (!insn_func(insn))
return false;

- if (insn->func->static_call_tramp)
+ if (insn_func(insn)->static_call_tramp)
return true;

/*
@@ -3906,7 +3928,7 @@ static bool ignore_unreachable_insn(stru

if (insn->type == INSN_JUMP_UNCONDITIONAL) {
if (insn->jump_dest &&
- insn->jump_dest->func == insn->func) {
+ insn_func(insn->jump_dest) == insn_func(insn)) {
insn = insn->jump_dest;
continue;
}
@@ -3914,7 +3936,7 @@ static bool ignore_unreachable_insn(stru
break;
}

- if (insn->offset + insn->len >= insn->func->offset + insn->func->len)
+ if (insn->offset + insn->len >= insn_func(insn)->offset + insn_func(insn)->len)
break;

insn = list_next_entry(insn, list);
@@ -3943,7 +3965,7 @@ static int validate_symbol(struct objtoo

state->uaccess = sym->uaccess_safe;

- ret = validate_branch(file, insn->func, insn, *state);
+ ret = validate_branch(file, insn_func(insn), insn, *state);
if (ret && opts.backtrace)
BT_FUNC("<=== (sym)", insn);
return ret;
@@ -4080,7 +4102,7 @@ static int validate_ibt_insn(struct objt
continue;
}

- if (dest->func && dest->func == insn->func) {
+ if (insn_func(dest) && insn_func(dest) == insn_func(insn)) {
/*
* Anything from->to self is either _THIS_IP_ or
* IRET-to-self.
--- a/tools/objtool/include/objtool/check.h
+++ b/tools/objtool/include/objtool/check.h
@@ -67,11 +67,21 @@ struct instruction {
struct reloc *jump_table;
struct reloc *reloc;
struct list_head alts;
- struct symbol *func;
+ struct symbol *sym;
struct list_head stack_ops;
struct cfi_state *cfi;
};

+static inline struct symbol *insn_func(struct instruction *insn)
+{
+ struct symbol *sym = insn->sym;
+
+ if (sym && sym->type != STT_FUNC)
+ sym = NULL;
+
+ return sym;
+}
+
#define VISITED_BRANCH 0x01
#define VISITED_BRANCH_UACCESS 0x02
#define VISITED_BRANCH_MASK 0x03

2022-09-23 18:54:37

by Pawan Gupta

[permalink] [raw]
Subject: Re: [PATCH v3 32/59] objtool: Allow STT_NOTYPE -> STT_FUNC+0 tail-calls

On Fri, Sep 23, 2022 at 04:35:13PM +0200, Peter Zijlstra wrote:
>On Thu, Sep 22, 2022 at 03:15:10PM +0200, Peter Zijlstra wrote:
>
>> Ho-Humm.. let me find a non hacky solution.
>
>I've ended up here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=call-depth-tracking&id=c2a8b8187418956af92d2d954ee4574f87f5e188
> https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=call-depth-tracking&id=c74976b9664fb6a1856d85ce246d84db3fd2c2cd
>
>which is arguably quite a lot of work for something relatively minor,
>but I do think it leaves us in a better place.
>
>Combo patch below..

Confirming that the patch fixes the warning. Thanks.