2022-09-13 21:55:01

by Nadav Amit

Subject: [RFC PATCH] x86/syscalls: allow tracing of __do_sys_[syscall] functions

From: Nadav Amit <[email protected]>

Tracing - through ftrace function tracer and kprobes - of certain common
syscall functions is currently disabled. Setting kprobes on these
functions is specifically useful for debugging of syscall failures.

Such tracing is disabled since __do_sys_[syscall] functions are declared
as "inline". "inline" in the kernel is actually defined as a macro that
in addition to using the inline keyword also disables tracing (notrace).
According to the comments in the code, tracing inline functions can
wreak havoc, which is probably true in some cases.

In practice, however, this might be too extensive. The compiler regards
the "inline" keyword only as a hint, which it is free to ignore. In
fact, in my builds gcc ignores the "inline" hint for many
__do_sys_[syscall] since some of these functions are quite big and
called from multiple locations (for compat). As a result, these
functions cannot be traced.

There are 4 possible solutions for enabling the tracing of
__do_sys_[syscall]:

1. Mark __do_sys_[syscall] as __always_inline instead of inline. This
would increase the executable size, which might not be desired.

2. Remove the inline hint from __do_sys_[syscall]. Again, it might
affect the generated code, inducing function call overhead for some
syscalls.

3. Remove "notrace" from the "inline" macro definition, and require
functions that cannot be traced to be marked explicitly as "notrace".
This might be the most correct solution, which would also enable tracing
of additional useful functions. But finding the functions that cannot
be traced is not easy without some automation.

4. Avoid the use of "notrace" specifically for __do_sys_[syscall].

Use the last approach to enable the tracing of __do_sys_[syscall]
functions. Introduce an "inline_trace" macro that sets the "__inline"
keyword without "notrace". Use it for the syscall wrappers.

This enables the tracing of 54 useful functions on my build, for
instance, __do_sys_vmsplice(), __do_sys_mremap() and
__do_sys_process_madvise().
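
As a rough user-space analogue of the two macros (not the kernel code
itself), the sketch below uses the hypothetical names k_inline and
k_inline_trace and assumes GCC or Clang. The hooks only fire when the file
is built with -finstrument-functions; the kernel's function tracer relies on
a different entry mechanism, but its notrace typically expands to the same
no_instrument_function attribute.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the kernel macros discussed above. */
#define notrace        __attribute__((no_instrument_function))
#define k_inline       inline notrace   /* mimics the kernel's "inline" */
#define k_inline_trace inline           /* mimics the proposed "inline_trace" */

static int traced_entries;

/*
 * Entry/exit hooks the compiler calls for every instrumented function,
 * but only when this file is built with -finstrument-functions.
 * The hooks themselves must not be instrumented.
 */
notrace void __cyg_profile_func_enter(void *fn, void *call_site)
{
	(void)fn;
	(void)call_site;
	traced_entries++;
}

notrace void __cyg_profile_func_exit(void *fn, void *call_site)
{
	(void)fn;
	(void)call_site;
}

/* Hidden from instrumentation, whether or not the compiler inlines it. */
static k_inline long hidden_add(long a, long b)
{
	return a + b;
}

/* Stays visible to instrumentation. */
static k_inline_trace long visible_add(long a, long b)
{
	return a + b;
}

/* Resets the counter, calls both helpers, and returns their sum. */
notrace long run_demo(void)
{
	traced_entries = 0;
	return hidden_add(1, 2) + visible_add(3, 4);
}
```

Built with -finstrument-functions, only visible_add() should bump
traced_entries in run_demo(); without the flag the counter stays at zero,
since no hooks are emitted at all.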

Cc: "Peter Zijlstra (Intel)" <[email protected]>
Cc: "Steven Rostedt (Google)" <[email protected]>
Signed-off-by: Nadav Amit <[email protected]>
---
arch/x86/include/asm/syscall_wrapper.h | 8 ++++----
include/linux/compat.h | 4 ++--
include/linux/compiler_types.h | 6 +++++-
3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/syscall_wrapper.h b/arch/x86/include/asm/syscall_wrapper.h
index 59358d1bf880..2673e3551aad 100644
--- a/arch/x86/include/asm/syscall_wrapper.h
+++ b/arch/x86/include/asm/syscall_wrapper.h
@@ -201,14 +201,14 @@ extern long __ia32_sys_ni_syscall(const struct pt_regs *regs);

#define COMPAT_SYSCALL_DEFINEx(x, name, ...) \
static long __se_compat_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
- static inline long __do_compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
+ static inline_trace long __do_compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
__IA32_COMPAT_SYS_STUBx(x, name, __VA_ARGS__) \
__X32_COMPAT_SYS_STUBx(x, name, __VA_ARGS__) \
static long __se_compat_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
{ \
return __do_compat_sys##name(__MAP(x,__SC_DELOUSE,__VA_ARGS__));\
} \
- static inline long __do_compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))
+ static inline_trace long __do_compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))

/*
* As some compat syscalls may not be implemented, we need to expand
@@ -227,7 +227,7 @@ extern long __ia32_sys_ni_syscall(const struct pt_regs *regs);

#define __SYSCALL_DEFINEx(x, name, ...) \
static long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
- static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
+ static inline_trace long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
__X64_SYS_STUBx(x, name, __VA_ARGS__) \
__IA32_SYS_STUBx(x, name, __VA_ARGS__) \
static long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
@@ -237,7 +237,7 @@ extern long __ia32_sys_ni_syscall(const struct pt_regs *regs);
__PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
return ret; \
} \
- static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))
+ static inline_trace long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))

/*
* As the generic SYSCALL_DEFINE0() macro does not decode any parameters for
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 594357881b0b..4d786581219b 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -75,7 +75,7 @@
asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
__attribute__((alias(__stringify(__se_compat_sys##name)))); \
ALLOW_ERROR_INJECTION(compat_sys##name, ERRNO); \
- static inline long __do_compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
+ static inline_trace long __do_compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
asmlinkage long __se_compat_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
asmlinkage long __se_compat_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
{ \
@@ -84,7 +84,7 @@
return ret; \
} \
__diag_pop(); \
- static inline long __do_compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))
+ static inline_trace long __do_compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))
#endif /* COMPAT_SYSCALL_DEFINEx */

struct compat_iovec {
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 4f2a819fd60a..d88bfcf387ea 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -152,8 +152,12 @@ struct ftrace_likely_data {
* externally visible function. This makes extern inline behave as per gnu89
* semantics rather than c99. This prevents multiple symbol definition errors
* of extern inline functions at link time.
- * A lot of inline functions can cause havoc with function tracing.
+ *
+ * A lot of inline functions can cause havoc with function tracing. If the
+ * function is known to be safe for tracing, inline_trace can be used. Otherwise
+ * inline would prevent tracing.
*/
+#define inline_trace __inline __gnu_inline __inline_maybe_unused
#define inline inline __gnu_inline __inline_maybe_unused notrace

/*
--
2.25.1


2022-09-20 02:53:24

by Nadav Amit

Subject: Re: [RFC PATCH] x86/syscalls: allow tracing of __do_sys_[syscall] functions

On Sep 13, 2022, at 6:52 AM, Nadav Amit <[email protected]> wrote:

> From: Nadav Amit <[email protected]>
>
> Tracing - through ftrace function tracer and kprobes - of certain common
> syscall functions is currently disabled. Setting kprobes on these
> functions is specifically useful for debugging of syscall failures.
>
> Such tracing is disabled since __do_sys_[syscall] functions are declared
> as "inline". "inline" in the kernel is actually defined as a macro that
> in addition to using the inline keyword also disables tracing (notrace).
> According to the comments in the code, tracing inline functions can
> wreak havoc, which is probably true in some cases.
>
> In practice, however, this might be too extensive. The compiler regards
> the "inline" keyword only as a hint, which it is free to ignore. In
> fact, in my builds gcc ignores the "inline" hint for many
> __do_sys_[syscall] since some of these functions are quite big and
> called from multiple locations (for compat). As a result, these
> functions cannot be traced.
>
> There are 4 possible solutions for enabling the tracing of
> __do_sys_[syscall]:
>
> 1. Mark __do_sys_[syscall] as __always_inline instead of inline. This
> would increase the executable size, which might not be desired.
>
> 2. Remove the inline hint from __do_sys_[syscall]. Again, it might
> affect the generated code, inducing function call overhead for some
> syscalls.
>
> 3. Remove "notrace" from the "inline" macro definition, and require
> functions that cannot be traced to be marked explicitly as "notrace".
> This might be the most correct solution, which would also enable tracing
> of additional useful functions. But finding the functions that cannot
> be traced is not easy without some automation.
>
> 4. Avoid the use of "notrace" specifically for __do_sys_[syscall].

Steven, Peter, assistance would be helpful now that you are hopefully less
busy.

Using kprobes, I repeatedly get various crashes such as the one below that
happen in functions that are called from do_idle(). I am not sure that I
fully understand the reason (although it seems to be related to RCU). For
some reason, the traces show that when I patch a 2-byte JCC instruction, the
RIP always points to the second byte of the instruction, which I would not
expect to ever happen.

So basically, I have two questions for you:

1. What is the reason that inline functions are marked with notrace?

2. Is probing a function that is called from do_idle() supposed to work, or
should the kernel prevent it?

Thanks,
Nadav


[ 2381.637652] BUG: unable to handle page fault for address: ffffc90077cb6e4b
[ 2381.645568] #PF: supervisor read access in kernel mode
[ 2381.651367] #PF: error_code(0x0000) - not-present page
[ 2381.657155] PGD 100000067 P4D 100000067 PUD 0
[ 2381.662154] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 2381.667040] CPU: 19 PID: 0 Comm: swapper/19 Not tainted 5.17.0-rc4 #6
[ 2381.674749] Hardware name: Cisco Systems Inc UCSC-C220-M5SX/UCSC-C220-M5SX, BIOS C220M5.4.0.1i.0.0522190226 05/22/2019
[ 2381.686733] RIP: 0010:poke_int3_handler+0x6d/0x140
[ 2381.692108] Code: 32 31 c0 48 81 c6 00 00 00 81 48 39 f1 74 39 f0 41 ff 08 c3 31 c0 c3 49 89 d3 49 89 c2 49 d1 ea 4c 89 d2 48 c1 e2 04 4c 01 da <48> 63 32 48 81 c6 00 00 00 81 48 39 f1 0f 82 b7 00 00 00 0f 87 98
[ 2381.713139] RSP: 0000:ffffc9000c6a7c90 EFLAGS: 00010086
[ 2381.718994] RAX: 000000000d4f7e0a RBX: 0000000000000000 RCX: ffffffff81226ffb
[ 2381.726990] RDX: ffffc90077cb6e4b RSI: ffffffff82400a99 RDI: ffffc9000c6a7cb8
[ 2381.734986] RBP: ffffc9000c6a7ca8 R08: ffffc9000d4f7d84 R09: ffffffff81226ffc
[ 2381.742979] R10: 0000000006a7bf05 R11: ffffc9000d4f7dfb R12: ffffc9000c6a7cb8
[ 2381.750974] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 2381.758966] FS: 0000000000000000(0000) GS:ffff88afe3740000(0000) knlGS:0000000000000000
[ 2381.768031] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2381.774469] CR2: ffffc90077cb6e4b CR3: 000000000680a001 CR4: 00000000007706e0
[ 2381.782462] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2381.790455] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2381.798447] PKRU: 55555554
[ 2381.801481] Call Trace:
[ 2381.804224] <TASK>
[ 2381.806575] ? exc_int3+0x10/0xd0
[ 2381.810292] asm_exc_int3+0x31/0x40
[ 2381.814205] RIP: 0010:__hrtimer_next_event_base+0x6c/0xe0
[ 2381.820260] Code: 01 00 00 00 48 8b 41 28 49 39 c6 74 67 48 c1 e6 06 48 8b 50 18 4a 2b 54 26 78 4c 39 ea 7d 08 49 89 d5 4d 85 f6 74 40 85 db cc <21> 44 89 f8 89 d9 f3 48 0f bc c9 d3 e0 89 ce 89 c9 48 83 c1 01 f7
[ 2381.841292] RSP: 0000:ffffc9000c6a7d98 EFLAGS: 00000046
[ 2381.847148] RAX: ffff88afe3763960 RBX: 0000000000000000 RCX: ffff88afe3763200
[ 2381.855141] RDX: 0000022a7fa0e539 RSI: 0000000000000000 RDI: ffff88afe37631c0
[ 2381.863133] RBP: ffffc9000c6a7dc8 R08: 00000066a1725854 R09: 000002490d303b39
[ 2381.871117] R10: ffff88afe376fba4 R11: ffff88afe376fb84 R12: ffff88afe37631c0
[ 2381.879109] R13: 0000022a7fa0e539 R14: ffff88afe3763700 R15: 0000000000000001
[ 2381.887107] ? __hrtimer_next_event_base+0x6c/0xe0
[ 2381.892478] elfcorehdr_read+0x40/0x40
[ 2381.896681] tick_nohz_get_sleep_length+0x9d/0xc0
[ 2381.901955] menu_select+0x4bb/0x630
[ 2381.905965] cpuidle_select+0x16/0x20
[ 2381.910069] do_idle+0x1d2/0x270
[ 2381.913689] cpu_startup_entry+0x20/0x30
[ 2381.918086] start_secondary+0x118/0x150
[ 2381.922484] secondary_startup_64_no_verify+0xc3/0xcb
[ 2381.928147] </TASK>
[ 2381.931535] Modules linked in: zram
[ 2381.936365] CR2: ffffc90077cb6e4b
[ 2381.940998] ---[ end trace 0000000000000000 ]---
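
For reading the trace above: INT3 is a trap, so the RIP saved on the
exception frame is the address just past the 1-byte 0xcc opcode. When the
breakpoint replaces the first byte of a 2-byte instruction, the saved RIP
therefore lands on that instruction's second byte. A minimal sketch of the
arithmetic, with a hypothetical helper name:

```c
#include <stdint.h>

/* Size of the INT3 opcode (0xcc) on x86. */
#define INT3_INSN_SIZE 1

/*
 * Hypothetical helper: recover the address of the breakpoint byte from
 * the RIP that the trap saved (which points just past the 0xcc).
 */
static inline uint64_t int3_site(uint64_t saved_rip)
{
	return saved_rip - INT3_INSN_SIZE;
}
```

Applied to the offsets above, a saved RIP of __hrtimer_next_event_base+0x6c
puts the breakpoint byte at +0x6b, which matches the cc byte shown
immediately before the marked <21> in the second Code: line.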

2022-09-20 09:55:09

by Peter Zijlstra

Subject: Re: [RFC PATCH] x86/syscalls: allow tracing of __do_sys_[syscall] functions

On Tue, Sep 13, 2022 at 06:52:13AM -0700, Nadav Amit wrote:
> From: Nadav Amit <[email protected]>
>
> Tracing - through ftrace function tracer and kprobes - of certain common
> syscall functions is currently disabled. Setting kprobes on these
> functions is specifically useful for debugging of syscall failures.
>
> Such tracing is disabled since __do_sys_[syscall] functions are declared
> as "inline". "inline" in the kernel is actually defined as a macro that
> in addition to using the inline keyword also disables tracing (notrace).
> According to the comments in the code, tracing inline functions can
> wreak havoc, which is probably true in some cases.
>
> In practice, however, this might be too extensive. The compiler regards
> the "inline" keyword only as a hint, which it is free to ignore. In
> fact, in my builds gcc ignores the "inline" hint for many
> __do_sys_[syscall] since some of these functions are quite big and
> called from multiple locations (for compat). As a result, these
> functions cannot be traced.
>
> There are 4 possible solutions for enabling the tracing of
> __do_sys_[syscall]:
>
> 1. Mark __do_sys_[syscall] as __always_inline instead of inline. This
> would increase the executable size, which might not be desired.
>
> 2. Remove the inline hint from __do_sys_[syscall]. Again, it might
> affect the generated code, inducing function call overhead for some
> syscalls.
>
> 3. Remove "notrace" from the "inline" macro definition, and require
> functions that cannot be traced to be marked explicitly as "notrace".
> This might be the most correct solution, which would also enable tracing
> of additional useful functions. But finding the functions that cannot
> be traced is not easy without some automation.
>
> 4. Avoid the use of "notrace" specifically for __do_sys_[syscall].
>
> Use the last approach to enable the tracing of __do_sys_[syscall]
> functions. Introduce an "inline_trace" macro that sets the "__inline"
> keyword without "notrace". Use it for the syscall wrappers.
>
> This enables the tracing of 54 useful functions on my build, for
> instance, __do_sys_vmsplice(), __do_sys_mremap() and
> __do_sys_process_madvise().
>
> Cc: "Peter Zijlstra (Intel)" <[email protected]>
> Cc: "Steven Rostedt (Google)" <[email protected]>
> Signed-off-by: Nadav Amit <[email protected]>

So at least for x86 these functions cannot be inlined; at all times the
syscalls are laundered through the syscall table.

It is very hard to take the address of an inline function and stuff it
in a table.

Additionally, all indirect syscall table calls are in instrumentable
code, so tracing should not be an issue -- again speaking for x86.
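
The address-taking point can be illustrated with a toy dispatch table in
plain C. All names here are hypothetical and the layering is simplified: in
the macros touched by the patch, the table entry would be a stub that calls
__se_sys##name, which in turn calls the inline __do_sys##name.

```c
#include <stddef.h>

typedef long (*toy_sys_fn_t)(long, long);

/* "inline" is only a hint; these are ordinary functions to the language. */
static inline long toy_sys_add(long a, long b)
{
	return a + b;
}

static inline long toy_sys_sub(long a, long b)
{
	return a - b;
}

/*
 * Storing the addresses in a table means the compiler has to emit an
 * out-of-line copy of each function for the pointers to refer to.
 */
static const toy_sys_fn_t toy_sys_table[] = {
	toy_sys_add,
	toy_sys_sub,
};

/* Dispatch by number at run time, the way a syscall table is indexed. */
long toy_dispatch(size_t nr, long a, long b)
{
	if (nr >= sizeof(toy_sys_table) / sizeof(toy_sys_table[0]))
		return -1;
	return toy_sys_table[nr](a, b);
}
```

With a run-time index, the call goes through the pointer, so an out-of-line
body exists regardless of the inline hint. That out-of-line body is what a
tracer could attach to if notrace did not hide it.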

For the above suggestions:

#1 above should refuse to build IMO, one shouldn't be allowed to take
the address of an __always_inline function.

#2 purely x86 speaking -- I don't see an issue with just taking the
'inline' keyword away entirely.

#3 I think Steve's concern is that the traceability of a function then
depends on the compiler's whim -- but yeah, who cares ;-)

#4 not a fan, but I also don't see anything wrong with it -- from x86
pov.


IOW, please figure out why these things are inline to begin with; this
might require auditing all architecture syscall code. While doing that
audit, make sure to determine if all of them can handle tracing at these
points.

2022-09-20 11:12:10

by Peter Zijlstra

Subject: Re: [RFC PATCH] x86/syscalls: allow tracing of __do_sys_[syscall] functions

On Mon, Sep 19, 2022 at 07:35:42PM -0700, Nadav Amit wrote:

> 1. What is the reason that inline functions are marked with notrace?

IIRC the concern is a notrace function using an inline function: GCC
decides not to inline it, and the notrace code then still hits tracing.

For noinstr we've mandated __always_inline to avoid this problem. The
direct advantage is that those inlined into instrumented code get, well,
instrumented.

> 2. Is probing a function that is called from do_idle() supposed to work, or
> should the kernel prevent it?

Should work for some :-) Specifically it doesn't work for those that
disable RCU, and that's (largely) being fixed here:

https://lore.kernel.org/all/[email protected]/T/#u

Although looking at it just now, I think I missed a spot.. lemme go fix
;-)

I'm failing to find this callchain; where is
tick_nohz_get_sleep_length() calling to elfcorehdr_read() ?!?

> [ 2381.892478] elfcorehdr_read+0x40/0x40
> [ 2381.896681] tick_nohz_get_sleep_length+0x9d/0xc0
> [ 2381.901955] menu_select+0x4bb/0x630
> [ 2381.905965] cpuidle_select+0x16/0x20
> [ 2381.910069] do_idle+0x1d2/0x270
> [ 2381.913689] cpu_startup_entry+0x20/0x30
> [ 2381.918086] start_secondary+0x118/0x150
> [ 2381.922484] secondary_startup_64_no_verify+0xc3/0xcb
> [ 2381.928147] </TASK>
> [ 2381.931535] Modules linked in: zram
> [ 2381.936365] CR2: ffffc90077cb6e4b
> [ 2381.940998] ---[ end trace 0000000000000000 ]---
>

2022-09-20 17:20:12

by Nadav Amit

Subject: Re: [RFC PATCH] x86/syscalls: allow tracing of __do_sys_[syscall] functions

On Sep 20, 2022, at 4:02 AM, Peter Zijlstra <[email protected]> wrote:

> On Mon, Sep 19, 2022 at 07:35:42PM -0700, Nadav Amit wrote:
>
>> 1. What is the reason that inline functions are marked with notrace?
>
> IIRC the concern is a notrace function using an inline function: GCC
> decides not to inline it, and the notrace code then still hits tracing.
>
> For noinstr we've mandated __always_inline to avoid this problem. The
> direct advantage is that those inlined into instrumented code get, well,
> instrumented.

I fully understand the __always_inline. I do not understand the inline,
which is a hint. Anyhow, I just thought that you would probably know, but
I’ll do the digging and look at the tables to see how they look with and
without inline implying notrace.

>
>> 2. Is probing a function that is called from do_idle() supposed to work, or
>> should the kernel prevent it?
>
> Should work for some :-) Specifically it doesn't work for those that
> disable RCU, and that's (largely) being fixed here:
>
> https://lore.kernel.org/all/[email protected]/T/#u
>
> Although looking at it just now, I think I missed a spot.. lemme go fix
> ;-)
>

Thank you. I’ll give it a spin as soon as I finish some stuff (which can
be days).


> I'm failing to find this callchain; where is
> tick_nohz_get_sleep_length() calling to elfcorehdr_read() ?!?

Very strange. According to DWARF and disassembly, the call in the code is
actually to hrtimer_next_event_without() and nothing more, and
elfcorehdr_read+0x40/0x40 is actually after the ret.

The strangest part is that I actually collected additional similar crashes,
and I only now notice that all of them have elfcorehdr_read(). Good catch!
(which makes no sense)…

I’ll move to a newer kernel, apply your patches and dig into it too.

Thanks again,
Nadav



>> [ 2381.892478] elfcorehdr_read+0x40/0x40
>> [ 2381.896681] tick_nohz_get_sleep_length+0x9d/0xc0
>> [ 2381.901955] menu_select+0x4bb/0x630
>> [ 2381.905965] cpuidle_select+0x16/0x20
>> [ 2381.910069] do_idle+0x1d2/0x270
>> [ 2381.913689] cpu_startup_entry+0x20/0x30
>> [ 2381.918086] start_secondary+0x118/0x150
>> [ 2381.922484] secondary_startup_64_no_verify+0xc3/0xcb
>> [ 2381.928147] </TASK>
>> [ 2381.931535] Modules linked in: zram
>> [ 2381.936365] CR2: ffffc90077cb6e4b
>> [ 2381.940998] ---[ end trace 0000000000000000 ]---


2022-09-21 02:19:02

by Nadav Amit

Subject: Re: [RFC PATCH] x86/syscalls: allow tracing of __do_sys_[syscall] functions

Just following my questions with the answers I figured out, just to save
others time.

On Sep 20, 2022, at 9:48 AM, Nadav Amit <[email protected]> wrote:

> On Sep 20, 2022, at 4:02 AM, Peter Zijlstra <[email protected]> wrote:
>
>> On Mon, Sep 19, 2022 at 07:35:42PM -0700, Nadav Amit wrote:
>>
>>> 1. What is the reason that inline functions are marked with notrace?
>>
>> IIRC the concern is a notrace function using an inline function: GCC
>> decides not to inline it, and the notrace code then still hits tracing.
>>
>> For noinstr we've mandated __always_inline to avoid this problem. The
>> direct advantage is that those inlined into instrumented code get, well,
>> instrumented.

Commit 45959ee7aa645 (“ftrace: Do not function trace inlined functions”)
gives two reasons which correspond with what you were saying: (1)
consistency and (2) functions that should not be traced are mostly marked as
inline.

I am not sure I fully agree with the arguments, specifically the consistency
(any function might be inlined and not traceable). But I am too afraid/lazy
to cause damage and fix it. I will remove the inline and play a bit with the
kernel to see how it behaves.

>>> 2. Is probing a function that is called from do_idle() supposed to work, or
>>> should the kernel prevent it?
>>
>> Should work for some :-) Specifically it doesn't work for those that
>> disable RCU, and that's (largely) being fixed here:
>>
>> https://lore.kernel.org/all/[email protected]/T/#u
>>
>> Although looking at it just now, I think I missed a spot.. lemme go fix
>> ;-)

I did not try your patches, but I do think I figured out what’s wrong and
sent a patch.

https://lore.kernel.org/lkml/[email protected]/T/#t

Thanks again,
Nadav

2022-09-26 17:35:20

by Steven Rostedt

Subject: Re: [RFC PATCH] x86/syscalls: allow tracing of __do_sys_[syscall] functions

On Tue, 20 Sep 2022 18:31:24 -0700
Nadav Amit <[email protected]> wrote:

> Commit 45959ee7aa645 (“ftrace: Do not function trace inlined functions”)
> gives two reasons which correspond with what you were saying: (1)
> consistency and (2) functions that should not be traced are mostly marked as
> inline.
>
> I am not sure I fully agree with the arguments, specifically the consistency
> (any function might be inlined and not traceable). But I am too afraid/lazy
> to cause damage and fix it. I will remove the inline and play a bit with the
> kernel to see how it behaves.

The main concern is twofold.

1) In the beginning, the function tracer was very susceptible to recursion
crashes (it's much more robust now), and whether the compiler decided to
inline a function or not would determine whether a recursive function
crashed the kernel. It was a nightmare to debug!

2) Consistency. I was tired of getting bug reports that would say "hey
kernel X on machine M1 has function F available for tracing, but kernel X on
machine M2 does not have function F available". The cause was that the
compiler for M1 did not inline the function, whereas the one for M2 did.

-- Steve
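
The recursion hazard in point 1 can be sketched in user space. This is a
simplified model with hypothetical names, not the kernel's actual recursion
protection: a trace callback calls a helper that is itself traced, and a
guard flag is what keeps the callback from re-entering itself without bound.

```c
/* Counters for the toy tracer; all names are hypothetical. */
static int in_tracer;   /* set while the trace callback is running */
static int recorded;    /* events the tracer accepted */
static int suppressed;  /* nested events dropped by the guard */

static void traced_helper(void);

/* Trace callback invoked on entry to every "traced" function. */
static void trace_enter(void)
{
	if (in_tracer) {
		/* Without this guard the callback would recurse forever. */
		suppressed++;
		return;
	}
	in_tracer = 1;
	recorded++;
	traced_helper();   /* the callback uses a helper that is traced too */
	in_tracer = 0;
}

/* A helper that ends up traced, e.g. because it was not inlined. */
static void traced_helper(void)
{
	trace_enter();
}

/* An ordinary traced function. */
void traced_function(void)
{
	trace_enter();
}
```

If traced_helper() were inlined into the callback, or marked so that it is
never traced, the nested trace_enter() call would not happen at all. That
dependence on the compiler's inlining decision is the inconsistency
described above.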

2022-09-26 17:53:19

by Steven Rostedt

Subject: Re: [RFC PATCH] x86/syscalls: allow tracing of __do_sys_[syscall] functions

On Tue, 13 Sep 2022 06:52:13 -0700
Nadav Amit <[email protected]> wrote:

> +++ b/include/linux/compiler_types.h
> @@ -152,8 +152,12 @@ struct ftrace_likely_data {
> * externally visible function. This makes extern inline behave as per gnu89
> * semantics rather than c99. This prevents multiple symbol definition errors
> * of extern inline functions at link time.
> - * A lot of inline functions can cause havoc with function tracing.
> + *
> + * A lot of inline functions can cause havoc with function tracing. If the
> + * function is known to be safe for tracing, inline_trace can be used. Otherwise
> + * inline would prevent tracing.

Perhaps add:

* Don't complain if this function is not available to trace!

;-)

-- Steve

> */
> +#define inline_trace __inline __gnu_inline __inline_maybe_unused
> #define inline inline __gnu_inline __inline_maybe_unused notrace
>

2022-09-26 18:18:42

by Steven Rostedt

[permalink] [raw]
Subject: Re: [RFC PATCH] x86/syscalls: allow tracing of __do_sys_[syscall] functions

On Tue, 20 Sep 2022 11:47:11 +0200
Peter Zijlstra <[email protected]> wrote:

> #3 I think Steve's concern is that the traceability of a function then
> depends on the compiler's whim -- but yeah, who cares ;-)

As I mentioned in my other email, I was tired of the bug reports telling me
it was a bug that a function was no longer available for tracing because
the compiler decided to inline it. By having notrace on all inlines, it
made it consistent no matter what the compiler decided to do, and those bug
reports went away ;-)

-- Steve