2016-10-11 10:43:05

by Marcin Nowakowski

[permalink] [raw]
Subject: [PATCH v3 00/11] syscall/tracing: compat syscall support

This patchset adds support syscall event tracing for compat syscalls.

Second patchset revision has received very little response - so I'm hoping
to get either acks from syscall/trace and arch maintainers or more feedback
on what else needs to be changed.

Patch 1 removes the unnecessary syscall_nr field from syscall metadata,
which was one of the obstacles for adding proper support for compat syscalls.

Patch 2 adds a method to distinguish handling of syscalls for compat tasks
if an arch requires that. In disussions about an earlier version of these
patches (http://marc.info/?l=linux-mips&m=147259973128606&w=2) it was suggested
to use audit arch for detecting syscall type. After analysing the code
for various arches it seemed to me that this would add an unnecessary
complexity (as would require extra APIs to enumerate and map all audit
arch types) and I've just simply used compat task status to determine call
type. I cannot see any added value from using the audit arch type in this
context.

Patch 3 add compat syscall metadata - this is mostly a copy of a set of macros
for generating metadata for standard syscalls.

Further patches add arch-specific methods required for differentiating between
standard and compat syscalls as well as for finding syscall addresses from
inside syscall tables.

I have tried to keep the tracing system working for bisections - and most
things work as previously until arch-specific patch is applied. The only
exception here is x86 which had extra methods to prevent incorrect syscall
reporting for compat tasks - this may happen after patch 2 is applied and
without x86/tracing patch.

version 3:
- rebase on top of linux-next-20161011
- tile: change in_compat_syscall to is_compat_task (suggested and signed-off by
Chris Metcalf)

Marcin Nowakowski (11):
tracing/syscalls: remove syscall_nr from syscall metadata
tracing/syscalls: add handling for compat tasks
tracing/syscalls: add compat syscall metadata
syscall/tracing: allow arch to override syscall_get_nr for ftrace
x86/tracing: fix compat syscall handling
s390/tracing: fix compat syscall handling
arm64/tracing: fix compat syscall handling
powerpc/tracing: fix compat syscall handling
tile/tracing: fix compat syscall handling
sparc/tracing: fix compat syscall handling
parisc/tracing: fix compat syscall handling

arch/arm64/include/asm/ftrace.h | 12 +-
arch/arm64/include/asm/unistd.h | 1 +
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/ftrace.c | 16 +++
arch/mips/kernel/ftrace.c | 4 +-
arch/parisc/include/asm/ftrace.h | 10 ++
arch/parisc/kernel/Makefile | 1 +
arch/parisc/kernel/ftrace.c | 15 +++
arch/powerpc/include/asm/ftrace.h | 26 +++-
arch/powerpc/kernel/ftrace.c | 6 +-
arch/s390/include/asm/ftrace.h | 11 ++
arch/s390/include/asm/syscall.h | 1 +
arch/s390/kernel/Makefile | 1 +
arch/s390/kernel/ftrace.c | 12 ++
arch/sparc/include/asm/ftrace.h | 10 ++
arch/sparc/kernel/Makefile | 1 +
arch/sparc/kernel/ftrace.c | 14 ++
arch/tile/include/asm/ftrace.h | 10 ++
arch/tile/kernel/Makefile | 1 +
arch/tile/kernel/ftrace.c | 13 ++
arch/x86/include/asm/ftrace.h | 14 +-
arch/x86/include/asm/syscall.h | 9 ++
arch/x86/kernel/ftrace.c | 15 +++
include/linux/compat.h | 74 +++++++++++
include/linux/ftrace.h | 2 +-
include/linux/syscalls.h | 1 -
include/trace/syscall.h | 2 -
kernel/trace/trace.h | 17 ++-
kernel/trace/trace_syscalls.c | 260 +++++++++++++++++++++++---------------
29 files changed, 421 insertions(+), 139 deletions(-)

--
2.7.4


2016-10-11 10:43:07

by Marcin Nowakowski

[permalink] [raw]
Subject: [PATCH v3 01/11] tracing/syscalls: remove syscall_nr from syscall metadata

Some architectures map multiple syscall numbers to a single syscall.
This meant that on those platforms, some system calls could not be
properly traced using syscall event tracing mechanism, as a different
number of a syscall was used for registration to the one used by
applications.
We can use syscall lookup together with the syscall metadata table
traversal to register for appropriate events instead. This slightly
increases the overhead during event (un)registration, but does not
impact the trace events themselves, which still use syscall numbers
directly.

Signed-off-by: Marcin Nowakowski <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
include/linux/syscalls.h | 1 -
include/trace/syscall.h | 2 -
kernel/trace/trace_syscalls.c | 127 ++++++++++++++++++++++++------------------
3 files changed, 73 insertions(+), 57 deletions(-)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 0d7abb8..88324cc 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -160,7 +160,6 @@ extern struct trace_event_functions exit_syscall_print_funcs;
static struct syscall_metadata __used \
__syscall_meta_##sname = { \
.name = "sys"#sname, \
- .syscall_nr = -1, /* Filled in at boot */ \
.nb_args = nb, \
.types = nb ? types_##sname : NULL, \
.args = nb ? args_##sname : NULL, \
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
index 7434f0f..b5fbebe 100644
--- a/include/trace/syscall.h
+++ b/include/trace/syscall.h
@@ -13,7 +13,6 @@
* A syscall entry in the ftrace syscalls array.
*
* @name: name of the syscall
- * @syscall_nr: number of the syscall
* @nb_args: number of parameters it takes
* @types: list of types as strings
* @args: list of args as strings (args[i] matches types[i])
@@ -23,7 +22,6 @@
*/
struct syscall_metadata {
const char *name;
- int syscall_nr;
int nb_args;
const char **types;
const char **args;
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 5e10395..07887b4 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -405,16 +405,21 @@ static int reg_event_syscall_enter(struct trace_event_file *file,
int ret = 0;
int num;

- num = ((struct syscall_metadata *)call->data)->syscall_nr;
- if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls))
- return -ENOSYS;
mutex_lock(&syscall_trace_lock);
- if (!tr->sys_refcount_enter)
+ if (!tr->sys_refcount_enter) {
ret = register_trace_sys_enter(ftrace_syscall_enter, tr);
- if (!ret) {
- rcu_assign_pointer(tr->enter_syscall_files[num], file);
- tr->sys_refcount_enter++;
+ if (ret)
+ goto out_unlock;
+ }
+
+ for (num = 0; num < NR_syscalls; num++) {
+ if (syscalls_metadata[num] &&
+ (syscalls_metadata[num] == call->data))
+ rcu_assign_pointer(tr->enter_syscall_files[num], file);
}
+ tr->sys_refcount_enter++;
+
+out_unlock:
mutex_unlock(&syscall_trace_lock);
return ret;
}
@@ -425,12 +430,13 @@ static void unreg_event_syscall_enter(struct trace_event_file *file,
struct trace_array *tr = file->tr;
int num;

- num = ((struct syscall_metadata *)call->data)->syscall_nr;
- if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls))
- return;
mutex_lock(&syscall_trace_lock);
tr->sys_refcount_enter--;
- RCU_INIT_POINTER(tr->enter_syscall_files[num], NULL);
+ for (num = 0; num < NR_syscalls; num++) {
+ if (syscalls_metadata[num] &&
+ (syscalls_metadata[num] == call->data))
+ RCU_INIT_POINTER(tr->enter_syscall_files[num], NULL);
+ }
if (!tr->sys_refcount_enter)
unregister_trace_sys_enter(ftrace_syscall_enter, tr);
mutex_unlock(&syscall_trace_lock);
@@ -443,16 +449,21 @@ static int reg_event_syscall_exit(struct trace_event_file *file,
int ret = 0;
int num;

- num = ((struct syscall_metadata *)call->data)->syscall_nr;
- if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls))
- return -ENOSYS;
mutex_lock(&syscall_trace_lock);
- if (!tr->sys_refcount_exit)
- ret = register_trace_sys_exit(ftrace_syscall_exit, tr);
- if (!ret) {
- rcu_assign_pointer(tr->exit_syscall_files[num], file);
- tr->sys_refcount_exit++;
+ if (!tr->sys_refcount_exit) {
+ ret = register_trace_sys_enter(ftrace_syscall_exit, tr);
+ if (ret)
+ goto out_unlock;
}
+
+ for (num = 0; num < NR_syscalls; num++) {
+ if (syscalls_metadata[num] &&
+ (syscalls_metadata[num] == call->data))
+ rcu_assign_pointer(tr->exit_syscall_files[num], file);
+ }
+ tr->sys_refcount_exit++;
+
+out_unlock:
mutex_unlock(&syscall_trace_lock);
return ret;
}
@@ -463,12 +474,13 @@ static void unreg_event_syscall_exit(struct trace_event_file *file,
struct trace_array *tr = file->tr;
int num;

- num = ((struct syscall_metadata *)call->data)->syscall_nr;
- if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls))
- return;
mutex_lock(&syscall_trace_lock);
tr->sys_refcount_exit--;
- RCU_INIT_POINTER(tr->exit_syscall_files[num], NULL);
+ for (num = 0; num < NR_syscalls; num++) {
+ if (syscalls_metadata[num] &&
+ (syscalls_metadata[num] == call->data))
+ RCU_INIT_POINTER(tr->exit_syscall_files[num], NULL);
+ }
if (!tr->sys_refcount_exit)
unregister_trace_sys_exit(ftrace_syscall_exit, tr);
mutex_unlock(&syscall_trace_lock);
@@ -477,14 +489,6 @@ static void unreg_event_syscall_exit(struct trace_event_file *file,
static int __init init_syscall_trace(struct trace_event_call *call)
{
int id;
- int num;
-
- num = ((struct syscall_metadata *)call->data)->syscall_nr;
- if (num < 0 || num >= NR_syscalls) {
- pr_debug("syscall %s metadata not mapped, disabling ftrace event\n",
- ((struct syscall_metadata *)call->data)->name);
- return -ENOSYS;
- }

if (set_syscall_print_fmt(call) < 0)
return -ENOMEM;
@@ -547,7 +551,6 @@ void __init init_ftrace_syscalls(void)
if (!meta)
continue;

- meta->syscall_nr = i;
syscalls_metadata[i] = meta;
}
}
@@ -604,17 +607,23 @@ static int perf_sysenter_enable(struct trace_event_call *call)
int ret = 0;
int num;

- num = ((struct syscall_metadata *)call->data)->syscall_nr;
-
mutex_lock(&syscall_trace_lock);
- if (!sys_perf_refcount_enter)
+ if (!sys_perf_refcount_enter) {
ret = register_trace_sys_enter(perf_syscall_enter, NULL);
- if (ret) {
- pr_info("event trace: Could not activate syscall entry trace point");
- } else {
- set_bit(num, enabled_perf_enter_syscalls);
- sys_perf_refcount_enter++;
+ if (ret) {
+ pr_info("event trace: Could not activate syscall entry trace point");
+ goto out_unlock;
+ }
+ }
+
+ for (num = 0; num < NR_syscalls; num++) {
+ if (syscalls_metadata[num] &&
+ (syscalls_metadata[num] == call->data))
+ set_bit(num, enabled_perf_enter_syscalls);
}
+ sys_perf_refcount_enter++;
+
+out_unlock:
mutex_unlock(&syscall_trace_lock);
return ret;
}
@@ -623,11 +632,13 @@ static void perf_sysenter_disable(struct trace_event_call *call)
{
int num;

- num = ((struct syscall_metadata *)call->data)->syscall_nr;
-
mutex_lock(&syscall_trace_lock);
sys_perf_refcount_enter--;
- clear_bit(num, enabled_perf_enter_syscalls);
+ for (num = 0; num < NR_syscalls; num++) {
+ if (syscalls_metadata[num] &&
+ (syscalls_metadata[num] == call->data))
+ clear_bit(num, enabled_perf_enter_syscalls);
+ }
if (!sys_perf_refcount_enter)
unregister_trace_sys_enter(perf_syscall_enter, NULL);
mutex_unlock(&syscall_trace_lock);
@@ -675,17 +686,23 @@ static int perf_sysexit_enable(struct trace_event_call *call)
int ret = 0;
int num;

- num = ((struct syscall_metadata *)call->data)->syscall_nr;
-
mutex_lock(&syscall_trace_lock);
- if (!sys_perf_refcount_exit)
+ if (!sys_perf_refcount_exit) {
ret = register_trace_sys_exit(perf_syscall_exit, NULL);
- if (ret) {
- pr_info("event trace: Could not activate syscall exit trace point");
- } else {
- set_bit(num, enabled_perf_exit_syscalls);
- sys_perf_refcount_exit++;
+ if (ret) {
+ pr_info("event trace: Could not activate syscall exit trace point");
+ goto out_unlock;
+ }
+ }
+
+ for (num = 0; num < NR_syscalls; num++) {
+ if (syscalls_metadata[num] &&
+ (syscalls_metadata[num] == call->data))
+ set_bit(num, enabled_perf_exit_syscalls);
}
+ sys_perf_refcount_exit++;
+
+out_unlock:
mutex_unlock(&syscall_trace_lock);
return ret;
}
@@ -694,11 +711,13 @@ static void perf_sysexit_disable(struct trace_event_call *call)
{
int num;

- num = ((struct syscall_metadata *)call->data)->syscall_nr;
-
mutex_lock(&syscall_trace_lock);
sys_perf_refcount_exit--;
- clear_bit(num, enabled_perf_exit_syscalls);
+ for (num = 0; num < NR_syscalls; num++) {
+ if (syscalls_metadata[num] &&
+ (syscalls_metadata[num] == call->data))
+ clear_bit(num, enabled_perf_exit_syscalls);
+ }
if (!sys_perf_refcount_exit)
unregister_trace_sys_exit(perf_syscall_exit, NULL);
mutex_unlock(&syscall_trace_lock);
--
2.7.4

2016-10-11 10:43:14

by Marcin Nowakowski

[permalink] [raw]
Subject: [PATCH v3 05/11] x86/tracing: fix compat syscall handling

Adapt the arch-specific code to new syscall tracing interface:
arch_trace_is_compat_syscall() now only indicates if a syscall is ia32,
as x32 syscalls exist in the same syscall table as native 64 bit ones,
so should not be treated as compat ones
Add arch_trace_syscall_get_nr that removes the x32 bit from syscall
numbers.

Signed-off-by: Marcin Nowakowski <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
---
arch/x86/include/asm/ftrace.h | 14 ++------------
arch/x86/include/asm/syscall.h | 9 +++++++++
arch/x86/kernel/ftrace.c | 15 +++++++++++++++
3 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index eccd0ac..69f1674 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -50,20 +50,10 @@ int ftrace_int3_handler(struct pt_regs *regs);
#if defined(CONFIG_FTRACE_SYSCALLS) && defined(CONFIG_IA32_EMULATION)
#include <asm/compat.h>

-/*
- * Because ia32 syscalls do not map to x86_64 syscall numbers
- * this screws up the trace output when tracing a ia32 task.
- * Instead of reporting bogus syscalls, just do not trace them.
- *
- * If the user really wants these, then they should use the
- * raw syscall tracepoints with filtering.
- */
-#define ARCH_TRACE_IGNORE_COMPAT_SYSCALLS 1
+#define ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP 1
static inline bool arch_trace_is_compat_syscall(struct pt_regs *regs)
{
- if (in_compat_syscall())
- return true;
- return false;
+ return in_ia32_syscall();
}
#endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_IA32_EMULATION */
#endif /* !__ASSEMBLY__ && !COMPILE_OFFSETS */
diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index e3c95e8..732b5ab 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -239,4 +239,13 @@ static inline int syscall_get_arch(void)
}
#endif /* CONFIG_X86_32 */

+#ifdef CONFIG_FTRACE_SYSCALLS
+static inline
+int arch_trace_syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
+{
+ return syscall_get_nr(task, regs) & __SYSCALL_MASK;
+}
+#define arch_trace_syscall_get_nr arch_trace_syscall_get_nr
+#endif /* CONFIG_FTRACE_SYSCALLS */
+
#endif /* _ASM_X86_SYSCALL_H */
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 8639bb2..c26d4ae 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -28,6 +28,7 @@
#include <asm/kprobes.h>
#include <asm/ftrace.h>
#include <asm/nops.h>
+#include <asm/syscall.h>

#ifdef CONFIG_DYNAMIC_FTRACE

@@ -1035,3 +1036,17 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
}
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+
+#ifdef CONFIG_FTRACE_SYSCALLS
+#if defined(CONFIG_X86_64) && defined(CONFIG_IA32_EMULATION)
+
+unsigned long __init arch_syscall_addr(int nr, bool compat)
+{
+ if (compat)
+ return (unsigned long)ia32_sys_call_table[nr];
+
+ return (unsigned long)sys_call_table[nr];
+}
+
+#endif /* CONFIG_X86_64 && CONFIG_IA32_EMULATION */
+#endif /* CONFIG_FTRACE_SYSCALLS */
--
2.7.4

2016-10-11 10:43:21

by Marcin Nowakowski

[permalink] [raw]
Subject: [PATCH v3 11/11] parisc/tracing: fix compat syscall handling

Add missing arch code - arch_trace_is_compat_syscall and
arch_syscall_addr

Signed-off-by: Marcin Nowakowski <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: [email protected]
---
arch/parisc/include/asm/ftrace.h | 10 ++++++++++
arch/parisc/kernel/Makefile | 1 +
arch/parisc/kernel/ftrace.c | 15 +++++++++++++++
3 files changed, 26 insertions(+)

diff --git a/arch/parisc/include/asm/ftrace.h b/arch/parisc/include/asm/ftrace.h
index d635c6b..0b54385 100644
--- a/arch/parisc/include/asm/ftrace.h
+++ b/arch/parisc/include/asm/ftrace.h
@@ -12,6 +12,16 @@ extern unsigned long return_address(unsigned int);

#define ftrace_return_address(n) return_address(n)

+#if defined(CONFIG_FTRACE_SYSCALLS) && defined(CONFIG_COMPAT)
+#include <linux/compat.h>
+
+#define ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP 1
+static inline bool arch_trace_is_compat_syscall(struct pt_regs *regs)
+{
+ return in_compat_syscall();
+}
+#endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_COMPAT */
+
#endif /* __ASSEMBLY__ */

#endif /* _ASM_PARISC_FTRACE_H */
diff --git a/arch/parisc/kernel/Makefile b/arch/parisc/kernel/Makefile
index 69a1118..7d1fd39 100644
--- a/arch/parisc/kernel/Makefile
+++ b/arch/parisc/kernel/Makefile
@@ -31,3 +31,4 @@ obj64-$(CONFIG_AUDIT) += compat_audit.o
obj-$(CONFIG_64BIT) += perf.o perf_asm.o $(obj64-y)
obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o
obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o
+obj-$(CONFIG_FTRACE_SYSCALLS) += ftrace.o
diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 5a5506a..41db7ca 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -56,6 +56,7 @@ static void __hot prepare_ftrace_return(unsigned long *parent,
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */

+#ifdef CONFIG_FUNCTION_TRACER
void notrace __hot ftrace_function_trampoline(unsigned long parent,
unsigned long self_addr,
unsigned long org_sp_gr3)
@@ -85,4 +86,18 @@ void notrace __hot ftrace_function_trampoline(unsigned long parent,
}
#endif
}
+#endif /* CONFIG_FUNCTION_TRACER */
+#if (defined CONFIG_FTRACE_SYSCALLS) && (defined CONFIG_64BIT)
+extern const unsigned int sys_call_table64[];

+unsigned long __init arch_syscall_addr(int nr, bool compat)
+{
+#ifdef CONFIG_COMPAT
+ if (compat)
+ return (unsigned long)sys_call_table[nr];
+#endif /* CONFIG_COMPAT */
+
+ return (unsigned long)sys_call_table64[nr];
+}
+
+#endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_64BIT */
--
2.7.4

2016-10-11 10:43:19

by Marcin Nowakowski

[permalink] [raw]
Subject: [PATCH v3 07/11] arm64/tracing: fix compat syscall handling

Add arch_syscall_addr for arm64 and define NR_compat_syscalls, as the
number of compat syscalls for arm64 exceeds the number defined by
NR_syscalls.

Signed-off-by: Marcin Nowakowski <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
---
arch/arm64/include/asm/ftrace.h | 12 +-----------
arch/arm64/include/asm/unistd.h | 1 +
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/ftrace.c | 16 ++++++++++++++++
4 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index caa955f..b57ff7c 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -41,17 +41,7 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)

#define ftrace_return_address(n) return_address(n)

-/*
- * Because AArch32 mode does not share the same syscall table with AArch64,
- * tracing compat syscalls may result in reporting bogus syscalls or even
- * hang-up, so just do not trace them.
- * See kernel/trace/trace_syscalls.c
- *
- * x86 code says:
- * If the user really wants these, then they should use the
- * raw syscall tracepoints with filtering.
- */
-#define ARCH_TRACE_IGNORE_COMPAT_SYSCALLS
+#define ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP 1
static inline bool arch_trace_is_compat_syscall(struct pt_regs *regs)
{
return is_compat_task();
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index e78ac26..276d049 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -45,6 +45,7 @@
#define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE+5)

#define __NR_compat_syscalls 394
+#define NR_compat_syscalls (__NR_compat_syscalls)
#endif

#define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 7d66bba..7ea9cd3 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -30,6 +30,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
arm64-obj-$(CONFIG_COMPAT) += sys32.o kuser32.o signal32.o \
sys_compat.o entry32.o
arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
+arm64-obj-$(CONFIG_FTRACE_SYSCALLS) += ftrace.o
arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
arm64-obj-$(CONFIG_ARM64_MODULE_PLTS) += module-plts.o
arm64-obj-$(CONFIG_PERF_EVENTS) += perf_regs.o perf_callchain.o
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 40ad08a..75d010f 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -176,4 +176,20 @@ int ftrace_disable_ftrace_graph_caller(void)
return ftrace_modify_graph_caller(false);
}
#endif /* CONFIG_DYNAMIC_FTRACE */
+
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+
+#if (defined CONFIG_FTRACE_SYSCALLS) && (defined CONFIG_COMPAT)
+
+extern const void *sys_call_table[];
+extern const void *compat_sys_call_table[];
+
+unsigned long __init arch_syscall_addr(int nr, bool compat)
+{
+ if (compat)
+ return (unsigned long)compat_sys_call_table[nr];
+
+ return (unsigned long)sys_call_table[nr];
+}
+
+#endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_COMPAT */
--
2.7.4

2016-10-11 10:43:16

by Marcin Nowakowski

[permalink] [raw]
Subject: [PATCH v3 09/11] tile/tracing: fix compat syscall handling

Add missing arch code - arch_trace_is_compat_syscall and
arch_syscall_addr

Signed-off-by: Marcin Nowakowski <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Chris Metcalf <[email protected]>
---
arch/tile/include/asm/ftrace.h | 10 ++++++++++
arch/tile/kernel/Makefile | 1 +
arch/tile/kernel/ftrace.c | 13 +++++++++++++
3 files changed, 24 insertions(+)

diff --git a/arch/tile/include/asm/ftrace.h b/arch/tile/include/asm/ftrace.h
index 738d239..e4fc08f 100644
--- a/arch/tile/include/asm/ftrace.h
+++ b/arch/tile/include/asm/ftrace.h
@@ -35,6 +35,16 @@ struct dyn_arch_ftrace {
};
#endif /* CONFIG_DYNAMIC_FTRACE */

+#if defined(CONFIG_FTRACE_SYSCALLS) && defined(CONFIG_COMPAT)
+#include <asm/compat.h>
+
+#define ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP 1
+static inline bool arch_trace_is_compat_syscall(struct pt_regs *regs)
+{
+ return is_compat_task();
+}
+#endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_COMPAT */
+
#endif /* __ASSEMBLY__ */

#endif /* CONFIG_FUNCTION_TRACER */
diff --git a/arch/tile/kernel/Makefile b/arch/tile/kernel/Makefile
index 09936d0..4de0989 100644
--- a/arch/tile/kernel/Makefile
+++ b/arch/tile/kernel/Makefile
@@ -30,6 +30,7 @@ obj-$(CONFIG_USE_PMC) += pmc.o
obj-$(CONFIG_TILE_USB) += usb.o
obj-$(CONFIG_TILE_HVGLUE_TRACE) += hvglue_trace.o
obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o mcount_64.o
+obj-$(CONFIG_FTRACE_SYSCALLS) += ftrace.o
obj-$(CONFIG_KPROBES) += kprobes.o
obj-$(CONFIG_KGDB) += kgdb.o
obj-$(CONFIG_JUMP_LABEL) += jump_label.o
diff --git a/arch/tile/kernel/ftrace.c b/arch/tile/kernel/ftrace.c
index b827a41..61768f3 100644
--- a/arch/tile/kernel/ftrace.c
+++ b/arch/tile/kernel/ftrace.c
@@ -21,6 +21,7 @@
#include <asm/ftrace.h>
#include <asm/sections.h>
#include <asm/insn.h>
+#include <asm/syscall.h>

#include <arch/opcode.h>

@@ -237,3 +238,15 @@ int ftrace_disable_ftrace_graph_caller(void)
}
#endif /* CONFIG_DYNAMIC_FTRACE */
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+
+#if (defined CONFIG_FTRACE_SYSCALLS) && (defined CONFIG_COMPAT)
+
+unsigned long __init arch_syscall_addr(int nr, bool compat)
+{
+ if (compat)
+ return (unsigned long)compat_sys_call_table[nr];
+
+ return (unsigned long)sys_call_table[nr];
+}
+
+#endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_COMPAT */
--
2.7.4

2016-10-11 10:43:51

by Marcin Nowakowski

[permalink] [raw]
Subject: [PATCH v3 10/11] sparc/tracing: fix compat syscall handling

Add missing arch code - arch_trace_is_compat_syscall and
arch_syscall_addr

Signed-off-by: Marcin Nowakowski <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: [email protected]
---
arch/sparc/include/asm/ftrace.h | 10 ++++++++++
arch/sparc/kernel/Makefile | 1 +
arch/sparc/kernel/ftrace.c | 14 ++++++++++++++
3 files changed, 25 insertions(+)

diff --git a/arch/sparc/include/asm/ftrace.h b/arch/sparc/include/asm/ftrace.h
index 62755a3..82e5269 100644
--- a/arch/sparc/include/asm/ftrace.h
+++ b/arch/sparc/include/asm/ftrace.h
@@ -6,6 +6,16 @@
#define MCOUNT_INSN_SIZE 4 /* sizeof mcount call */

#ifndef __ASSEMBLY__
+
+#if defined(CONFIG_FTRACE_SYSCALLS) && defined(CONFIG_COMPAT)
+#include <asm/compat.h>
+
+#define ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP 1
+static inline bool arch_trace_is_compat_syscall(struct pt_regs *regs)
+{
+ return in_compat_syscall();
+}
+#endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_COMPAT */
void _mcount(void);
#endif

diff --git a/arch/sparc/kernel/Makefile b/arch/sparc/kernel/Makefile
index fa3c02d..7808084 100644
--- a/arch/sparc/kernel/Makefile
+++ b/arch/sparc/kernel/Makefile
@@ -93,6 +93,7 @@ obj-$(CONFIG_KGDB) += kgdb_$(BITS).o

obj-$(CONFIG_DYNAMIC_FTRACE) += ftrace.o
obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o
+obj-$(CONFIG_FTRACE_SYSCALLS) += ftrace.o

obj-$(CONFIG_EARLYFB) += btext.o
obj-$(CONFIG_STACKTRACE) += stacktrace.o
diff --git a/arch/sparc/kernel/ftrace.c b/arch/sparc/kernel/ftrace.c
index 6bcff69..e76f012 100644
--- a/arch/sparc/kernel/ftrace.c
+++ b/arch/sparc/kernel/ftrace.c
@@ -145,3 +145,17 @@ unsigned long prepare_ftrace_return(unsigned long parent,
return return_hooker;
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+
+#if (defined CONFIG_FTRACE_SYSCALLS) && (defined CONFIG_COMPAT)
+extern const unsigned int sys_call_table[];
+extern const unsigned int sys_call_table32[];
+
+unsigned long __init arch_syscall_addr(int nr, bool compat)
+{
+ if (compat)
+ return (unsigned long)sys_call_table32[nr];
+
+ return (unsigned long)sys_call_table[nr];
+}
+
+#endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_COMPAT */
--
2.7.4

2016-10-11 10:44:14

by Marcin Nowakowski

[permalink] [raw]
Subject: [PATCH v3 08/11] powerpc/tracing: fix compat syscall handling

Adapt the code to make use of new syscall handling interface

Signed-off-by: Marcin Nowakowski <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: [email protected]
---
arch/powerpc/include/asm/ftrace.h | 11 +++++++++++
arch/powerpc/kernel/ftrace.c | 4 ++++
2 files changed, 15 insertions(+)

diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
index 9697a73..d2ea3b6 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -86,6 +86,17 @@ static inline bool arch_syscall_match_sym_name(const char *sym, const char *name
return !strcmp(sym + prefix_len + 1, name + prefix_len);
}
#endif
+
+#if defined(CONFIG_COMPAT)
+#include <linux/compat.h>
+#include <asm/compat.h>
+
+#define ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP 1
+static inline bool arch_trace_is_compat_syscall(struct pt_regs *regs)
+{
+ return in_compat_syscall();
+}
+#endif /* CONFIG_COMPAT */
#endif /* CONFIG_FTRACE_SYSCALLS && !__ASSEMBLY__ */

#endif /* _ASM_POWERPC_FTRACE */
diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c
index 5bfb35f..12c00f7 100644
--- a/arch/powerpc/kernel/ftrace.c
+++ b/arch/powerpc/kernel/ftrace.c
@@ -606,6 +606,10 @@ unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip)
#if defined(CONFIG_FTRACE_SYSCALLS) && defined(CONFIG_PPC64)
unsigned long __init arch_syscall_addr(int nr, bool compat)
{
+#ifdef CONFIG_COMPAT
+ if (compat)
+ return sys_call_table[nr*2+1];
+#endif
return sys_call_table[nr*2];
}
#endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_PPC64 */
--
2.7.4

2016-10-11 10:43:12

by Marcin Nowakowski

[permalink] [raw]
Subject: [PATCH v3 03/11] tracing/syscalls: add compat syscall metadata

Now that compat syscalls are properly distinguished from native calls,
we can add metadata for compat syscalls as well.
All the macros used to generate the metadata are the same as for
standard syscalls, but with a compat_ prefix to distinguish them easily.

Signed-off-by: Marcin Nowakowski <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: [email protected]
---
arch/powerpc/include/asm/ftrace.h | 15 +++++---
include/linux/compat.h | 74 +++++++++++++++++++++++++++++++++++++++
kernel/trace/trace_syscalls.c | 8 +++--
3 files changed, 90 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
index 686c5f7..9697a73 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -73,12 +73,17 @@ struct dyn_arch_ftrace {
static inline bool arch_syscall_match_sym_name(const char *sym, const char *name)
{
/*
- * Compare the symbol name with the system call name. Skip the .sys or .SyS
- * prefix from the symbol name and the sys prefix from the system call name and
- * just match the rest. This is only needed on ppc64 since symbol names on
- * 32bit do not start with a period so the generic function will work.
+ * Compare the symbol name with the system call name. Skip the .sys,
+ * .SyS or .compat_sys prefix from the symbol name and the sys prefix
+ * from the system call name and just match the rest. This is only
+ * needed on ppc64 since symbol names on 32bit do not start with a
+ * period so the generic function will work.
*/
- return !strcmp(sym + 4, name + 3);
+ int prefix_len = 3;
+
+ if (!strncasecmp(name, "compat_", 7))
+ prefix_len = 10;
+ return !strcmp(sym + prefix_len + 1, name + prefix_len);
}
#endif
#endif /* CONFIG_FTRACE_SYSCALLS && !__ASSEMBLY__ */
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 6360939..ef2a70f 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -30,7 +30,80 @@
#define __SC_DELOUSE(t,v) ((t)(unsigned long)(v))
#endif

+#ifdef CONFIG_FTRACE_SYSCALLS
+#ifndef __SC_STR_ADECL
+#define __SC_STR_ADECL(t, a) #a
+#endif
+
+#ifndef __SC_STR_TDECL
+#define __SC_STR_TDECL(t, a) #t
+#endif
+
+extern struct trace_event_class event_class_syscall_enter;
+extern struct trace_event_class event_class_syscall_exit;
+extern struct trace_event_functions enter_syscall_print_funcs;
+extern struct trace_event_functions exit_syscall_print_funcs;
+
+#define COMPAT_SYSCALL_TRACE_ENTER_EVENT(sname) \
+ static struct syscall_metadata __syscall_meta_compat_##sname; \
+ static struct trace_event_call __used \
+ event_enter_compat_##sname = { \
+ .class = &event_class_syscall_enter, \
+ { \
+ .name = "compat_sys_enter"#sname, \
+ }, \
+ .event.funcs = &enter_syscall_print_funcs, \
+ .data = (void *)&__syscall_meta_compat_##sname,\
+ .flags = TRACE_EVENT_FL_CAP_ANY, \
+ }; \
+ static struct trace_event_call __used \
+ __attribute__((section("_ftrace_events"))) \
+ *__event_enter_compat_##sname = &event_enter_compat_##sname;
+
+#define COMPAT_SYSCALL_TRACE_EXIT_EVENT(sname) \
+ static struct syscall_metadata __syscall_meta_compat_##sname; \
+ static struct trace_event_call __used \
+ event_exit_compat_##sname = { \
+ .class = &event_class_syscall_exit, \
+ { \
+ .name = "compat_sys_exit"#sname, \
+ }, \
+ .event.funcs = &exit_syscall_print_funcs, \
+ .data = (void *)&__syscall_meta_compat_##sname,\
+ .flags = TRACE_EVENT_FL_CAP_ANY, \
+ }; \
+ static struct trace_event_call __used \
+ __attribute__((section("_ftrace_events"))) \
+ *__event_exit_compat_##sname = &event_exit_compat_##sname;
+
+#define COMPAT_SYSCALL_METADATA(sname, nb, ...) \
+ static const char *types_compat_##sname[] = { \
+ __MAP(nb,__SC_STR_TDECL,__VA_ARGS__) \
+ }; \
+ static const char *args_compat_##sname[] = { \
+ __MAP(nb,__SC_STR_ADECL,__VA_ARGS__) \
+ }; \
+ COMPAT_SYSCALL_TRACE_ENTER_EVENT(sname); \
+ COMPAT_SYSCALL_TRACE_EXIT_EVENT(sname); \
+ static struct syscall_metadata __used \
+ __syscall_meta_compat_##sname = { \
+ .name = "compat_sys"#sname, \
+ .nb_args = nb, \
+ .types = nb ? types_compat_##sname : NULL, \
+ .args = nb ? args_compat_##sname : NULL, \
+ .enter_event = &event_enter_compat_##sname, \
+ .exit_event = &event_exit_compat_##sname, \
+ .enter_fields = LIST_HEAD_INIT(__syscall_meta_compat_##sname.enter_fields), \
+ }; \
+ static struct syscall_metadata __used \
+ __attribute__((section("__syscalls_metadata"))) \
+ *__p_syscall_meta_compat_##sname = &__syscall_meta_compat_##sname;
+#else
+#define COMPAT_SYSCALL_METADATA(sname, nb, ...)
+#endif
+
#define COMPAT_SYSCALL_DEFINE0(name) \
+ COMPAT_SYSCALL_METADATA(_##name, 0); \
asmlinkage long compat_sys_##name(void)

#define COMPAT_SYSCALL_DEFINE1(name, ...) \
@@ -47,6 +120,7 @@
COMPAT_SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)

#define COMPAT_SYSCALL_DEFINEx(x, name, ...) \
+ COMPAT_SYSCALL_METADATA(name, x, __VA_ARGS__) \
asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))\
__attribute__((alias(__stringify(compat_SyS##name)))); \
static inline long C_SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index fe7fc33..409f83e 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -35,12 +35,16 @@ static struct syscall_metadata **syscalls_metadata;
static inline bool arch_syscall_match_sym_name(const char *sym, const char *name)
{
/*
- * Only compare after the "sys" prefix. Archs that use
+ * Only compare after the "sys" or "compat_sys" prefix. Archs that use
* syscall wrappers may have syscalls symbols aliases prefixed
* with ".SyS" or ".sys" instead of "sys", leading to an unwanted
* mismatch.
*/
- return !strcmp(sym + 3, name + 3);
+ int prefix_len = 3;
+
+ if (!strncasecmp(sym, "compat_", 7))
+ prefix_len = 10;
+ return !strcmp(sym + prefix_len, name + prefix_len);
}
#endif

--
2.7.4

2016-10-11 10:43:11

by Marcin Nowakowski

[permalink] [raw]
Subject: [PATCH v3 04/11] syscall/tracing: allow arch to override syscall_get_nr for ftrace

Some architectures modify syscall numbers to indicate ABI used
and as a result syscall number as returned by syscall_get_nr
does not correspond to the syscall number used inside the kernel.
Allow an arch to provide a separate implementation for ftrace
that returns the 'real' syscall number from the regs provided.

This will be used by x86 in a follow-up change.

Signed-off-by: Marcin Nowakowski <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
kernel/trace/trace_syscalls.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 409f83e..0ab9b90 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -74,7 +74,11 @@ static inline bool trace_is_compat_syscall(struct pt_regs *regs)
static inline int
trace_get_syscall_nr(struct task_struct *task, struct pt_regs *regs)
{
+#ifdef arch_trace_syscall_get_nr
+ return arch_trace_syscall_get_nr(task, regs);
+#else
return syscall_get_nr(task, regs);
+#endif
}


--
2.7.4

2016-10-11 10:44:37

by Marcin Nowakowski

[permalink] [raw]
Subject: [PATCH v3 06/11] s390/tracing: fix compat syscall handling

Add missing arch code - arch_trace_is_compat_syscall and
arch_syscall_addr

Signed-off-by: Marcin Nowakowski <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: [email protected]
---
arch/s390/include/asm/ftrace.h | 11 +++++++++++
arch/s390/include/asm/syscall.h | 1 +
arch/s390/kernel/Makefile | 1 +
arch/s390/kernel/ftrace.c | 12 ++++++++++++
4 files changed, 25 insertions(+)

diff --git a/arch/s390/include/asm/ftrace.h b/arch/s390/include/asm/ftrace.h
index 64053d9..af8ba9c 100644
--- a/arch/s390/include/asm/ftrace.h
+++ b/arch/s390/include/asm/ftrace.h
@@ -82,5 +82,16 @@ static inline void ftrace_generate_call_insn(struct ftrace_insn *insn,
#endif
}

+#if defined(CONFIG_FTRACE_SYSCALLS) && defined(CONFIG_COMPAT)
+#include <linux/compat.h>
+#include <asm/compat.h>
+
+#define ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP 1
+static inline bool arch_trace_is_compat_syscall(struct pt_regs *regs)
+{
+ return in_compat_syscall();
+}
+#endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_COMPAT */
+
#endif /* __ASSEMBLY__ */
#endif /* _ASM_S390_FTRACE_H */
diff --git a/arch/s390/include/asm/syscall.h b/arch/s390/include/asm/syscall.h
index 6ba0bf9..a08db5b 100644
--- a/arch/s390/include/asm/syscall.h
+++ b/arch/s390/include/asm/syscall.h
@@ -16,6 +16,7 @@
#include <linux/sched.h>
#include <linux/err.h>
#include <asm/ptrace.h>
+#include <linux/compat.h>

/*
* The syscall table always contains 32 bit pointers since we know that the
diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile
index 1f0fe98..6254eec 100644
--- a/arch/s390/kernel/Makefile
+++ b/arch/s390/kernel/Makefile
@@ -73,6 +73,7 @@ obj-$(CONFIG_COMPAT) += compat_wrapper.o $(compat-obj-y)
obj-$(CONFIG_STACKTRACE) += stacktrace.o
obj-$(CONFIG_KPROBES) += kprobes.o
obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o
+obj-$(CONFIG_FTRACE_SYSCALLS) += ftrace.o
obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
obj-$(CONFIG_UPROBES) += uprobes.o

diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index 60a8a4e..3ce668e 100644
--- a/arch/s390/kernel/ftrace.c
+++ b/arch/s390/kernel/ftrace.c
@@ -243,3 +243,15 @@ int ftrace_disable_ftrace_graph_caller(void)
}

#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+
+#if (defined CONFIG_FTRACE_SYSCALLS) && (defined CONFIG_COMPAT)
+
+unsigned long __init arch_syscall_addr(int nr, bool compat)
+{
+ if (compat)
+ return (unsigned long)sys_call_table_emu[nr];
+
+ return (unsigned long)sys_call_table[nr];
+}
+
+#endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_COMPAT */
--
2.7.4

2016-10-11 10:45:02

by Marcin Nowakowski

[permalink] [raw]
Subject: [PATCH v3 02/11] tracing/syscalls: add handling for compat tasks

Extend the syscall tracing subsystem by adding a handler for compat
tasks. For some architectures, where compat tasks' syscall numbers have
an exclusive set of syscall numbers, this already works since the
removal of syscall_nr.
Architectures where the same syscall may use a different syscall number
for compat tasks need to define ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP and
define a method arch_trace_is_compat_syscall(struct pt_regs*) that tells
if a current task is a compat one.
For architectures that define ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP:
- if an architecture defines NR_compat_syscalls, then the number of
trace event files equals to NR_compat_syscalls + NR_syscalls
- if an architecture doesn't define NR_compat_syscalls, the number of
trace event files is calculated as 2*NR_syscalls
Compat syscall trace events are identified by the syscall number offset
by NR_syscalls.

Signed-off-by: Marcin Nowakowski <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
arch/mips/kernel/ftrace.c | 4 +-
arch/powerpc/kernel/ftrace.c | 2 +-
include/linux/ftrace.h | 2 +-
kernel/trace/trace.h | 17 +++++-
kernel/trace/trace_syscalls.c | 137 +++++++++++++++++++++++++-----------------
5 files changed, 102 insertions(+), 60 deletions(-)

diff --git a/arch/mips/kernel/ftrace.c b/arch/mips/kernel/ftrace.c
index 30a3b75..dd3dae3 100644
--- a/arch/mips/kernel/ftrace.c
+++ b/arch/mips/kernel/ftrace.c
@@ -412,7 +412,7 @@ void prepare_ftrace_return(unsigned long *parent_ra_addr, unsigned long self_ra,
#ifdef CONFIG_FTRACE_SYSCALLS

#ifdef CONFIG_32BIT
-unsigned long __init arch_syscall_addr(int nr)
+unsigned long __init arch_syscall_addr(int nr, bool compat)
{
return (unsigned long)sys_call_table[nr - __NR_O32_Linux];
}
@@ -420,7 +420,7 @@ unsigned long __init arch_syscall_addr(int nr)

#ifdef CONFIG_64BIT

-unsigned long __init arch_syscall_addr(int nr)
+unsigned long __init arch_syscall_addr(int nr, bool compat)
{
#ifdef CONFIG_MIPS32_N32
if (nr >= __NR_N32_Linux && nr <= __NR_N32_Linux + __NR_N32_Linux_syscalls)
diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c
index a95639b..5bfb35f 100644
--- a/arch/powerpc/kernel/ftrace.c
+++ b/arch/powerpc/kernel/ftrace.c
@@ -604,7 +604,7 @@ unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip)
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */

#if defined(CONFIG_FTRACE_SYSCALLS) && defined(CONFIG_PPC64)
-unsigned long __init arch_syscall_addr(int nr)
+unsigned long __init arch_syscall_addr(int nr, bool compat)
{
return sys_call_table[nr*2];
}
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index c45ed06..4d1ca1c 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -957,7 +957,7 @@ static inline void disable_trace_on_warning(void) { }

#ifdef CONFIG_FTRACE_SYSCALLS

-unsigned long arch_syscall_addr(int nr);
+unsigned long arch_syscall_addr(int nr, bool compat);

#endif /* CONFIG_FTRACE_SYSCALLS */

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index fd24b1f..821ee82 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -237,8 +237,21 @@ struct trace_array {
#ifdef CONFIG_FTRACE_SYSCALLS
int sys_refcount_enter;
int sys_refcount_exit;
- struct trace_event_file __rcu *enter_syscall_files[NR_syscalls];
- struct trace_event_file __rcu *exit_syscall_files[NR_syscalls];
+
+#if (defined ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP) && (defined CONFIG_COMPAT)
+
+#ifdef NR_compat_syscalls
+#define FTRACE_SYSCALL_CNT (NR_syscalls + NR_compat_syscalls)
+#else /* !NR_compat_syscalls */
+#define FTRACE_SYSCALL_CNT (2 * NR_syscalls)
+#endif /* NR_compat_syscalls */
+
+#else
+#define FTRACE_SYSCALL_CNT (NR_syscalls)
+#endif
+
+ struct trace_event_file __rcu *enter_syscall_files[FTRACE_SYSCALL_CNT];
+ struct trace_event_file __rcu *exit_syscall_files[FTRACE_SYSCALL_CNT];
#endif
int stop_count;
int clock_id;
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 07887b4..fe7fc33 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -44,37 +44,35 @@ static inline bool arch_syscall_match_sym_name(const char *sym, const char *name
}
#endif

-#ifdef ARCH_TRACE_IGNORE_COMPAT_SYSCALLS
+#ifdef ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP
/*
* Some architectures that allow for 32bit applications
* to run on a 64bit kernel, do not map the syscalls for
* the 32bit tasks the same as they do for 64bit tasks.
*
- * *cough*x86*cough*
- *
- * In such a case, instead of reporting the wrong syscalls,
- * simply ignore them.
- *
- * For an arch to ignore the compat syscalls it needs to
- * define ARCH_TRACE_IGNORE_COMPAT_SYSCALLS as well as
+ * If a set of syscall numbers for 32-bit tasks overlaps
+ * the set of syscall numbers for 64-bit tasks, define
+ * ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP as well as
* define the function arch_trace_is_compat_syscall() to let
- * the tracing system know that it should ignore it.
+ * the tracing system know that a compat syscall is being handled.
*/
-static int
-trace_get_syscall_nr(struct task_struct *task, struct pt_regs *regs)
+static inline bool trace_is_compat_syscall(struct pt_regs *regs)
{
- if (unlikely(arch_trace_is_compat_syscall(regs)))
- return -1;
-
- return syscall_get_nr(task, regs);
+ return arch_trace_is_compat_syscall(regs);
}
#else
+static inline bool trace_is_compat_syscall(struct pt_regs *regs)
+{
+ return false;
+}
+#endif /* ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP */
+
static inline int
trace_get_syscall_nr(struct task_struct *task, struct pt_regs *regs)
{
return syscall_get_nr(task, regs);
}
-#endif /* ARCH_TRACE_IGNORE_COMPAT_SYSCALLS */
+

static __init struct syscall_metadata *
find_syscall_meta(unsigned long syscall)
@@ -98,9 +96,9 @@ find_syscall_meta(unsigned long syscall)
return NULL;
}

-static struct syscall_metadata *syscall_nr_to_meta(int nr)
+static struct syscall_metadata *trace_syscall_nr_to_meta(int nr)
{
- if (!syscalls_metadata || nr >= NR_syscalls || nr < 0)
+ if (!syscalls_metadata || nr >= FTRACE_SYSCALL_CNT || nr < 0)
return NULL;

return syscalls_metadata[nr];
@@ -110,7 +108,7 @@ const char *get_syscall_name(int syscall)
{
struct syscall_metadata *entry;

- entry = syscall_nr_to_meta(syscall);
+ entry = trace_syscall_nr_to_meta(syscall);
if (!entry)
return NULL;

@@ -130,7 +128,7 @@ print_syscall_enter(struct trace_iterator *iter, int flags,

trace = (typeof(trace))ent;
syscall = trace->nr;
- entry = syscall_nr_to_meta(syscall);
+ entry = trace_syscall_nr_to_meta(syscall);

if (!entry)
goto end;
@@ -176,7 +174,7 @@ print_syscall_exit(struct trace_iterator *iter, int flags,

trace = (typeof(trace))ent;
syscall = trace->nr;
- entry = syscall_nr_to_meta(syscall);
+ entry = trace_syscall_nr_to_meta(syscall);

if (!entry) {
trace_seq_putc(s, '\n');
@@ -304,6 +302,26 @@ static int __init syscall_exit_define_fields(struct trace_event_call *call)
return ret;
}

+static int ftrace_check_syscall_nr(struct pt_regs *regs)
+{
+ int syscall_nr;
+
+ syscall_nr = trace_get_syscall_nr(current, regs);
+ if (syscall_nr < 0)
+ return -1;
+
+ if (trace_is_compat_syscall(regs)) {
+ syscall_nr += NR_syscalls;
+ if (syscall_nr >= FTRACE_SYSCALL_CNT)
+ return -1;
+ } else {
+ if (syscall_nr > NR_syscalls)
+ return -1;
+ }
+
+ return syscall_nr;
+}
+
static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
{
struct trace_array *tr = data;
@@ -317,8 +335,8 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
int syscall_nr;
int size;

- syscall_nr = trace_get_syscall_nr(current, regs);
- if (syscall_nr < 0 || syscall_nr >= NR_syscalls)
+ syscall_nr = ftrace_check_syscall_nr(regs);
+ if (syscall_nr < 0)
return;

/* Here we're inside tp handler's rcu_read_lock_sched (__DO_TRACE) */
@@ -329,7 +347,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
if (trace_trigger_soft_disabled(trace_file))
return;

- sys_data = syscall_nr_to_meta(syscall_nr);
+ sys_data = trace_syscall_nr_to_meta(syscall_nr);
if (!sys_data)
return;

@@ -364,8 +382,8 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
int pc;
int syscall_nr;

- syscall_nr = trace_get_syscall_nr(current, regs);
- if (syscall_nr < 0 || syscall_nr >= NR_syscalls)
+ syscall_nr = ftrace_check_syscall_nr(regs);
+ if (syscall_nr < 0)
return;

/* Here we're inside tp handler's rcu_read_lock_sched (__DO_TRACE()) */
@@ -376,7 +394,7 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
if (trace_trigger_soft_disabled(trace_file))
return;

- sys_data = syscall_nr_to_meta(syscall_nr);
+ sys_data = trace_syscall_nr_to_meta(syscall_nr);
if (!sys_data)
return;

@@ -412,7 +430,7 @@ static int reg_event_syscall_enter(struct trace_event_file *file,
goto out_unlock;
}

- for (num = 0; num < NR_syscalls; num++) {
+ for (num = 0; num < FTRACE_SYSCALL_CNT; num++) {
if (syscalls_metadata[num] &&
(syscalls_metadata[num] == call->data))
rcu_assign_pointer(tr->enter_syscall_files[num], file);
@@ -432,7 +450,7 @@ static void unreg_event_syscall_enter(struct trace_event_file *file,

mutex_lock(&syscall_trace_lock);
tr->sys_refcount_enter--;
- for (num = 0; num < NR_syscalls; num++) {
+ for (num = 0; num < FTRACE_SYSCALL_CNT; num++) {
if (syscalls_metadata[num] &&
(syscalls_metadata[num] == call->data))
RCU_INIT_POINTER(tr->enter_syscall_files[num], NULL);
@@ -456,7 +474,7 @@ static int reg_event_syscall_exit(struct trace_event_file *file,
goto out_unlock;
}

- for (num = 0; num < NR_syscalls; num++) {
+ for (num = 0; num < FTRACE_SYSCALL_CNT; num++) {
if (syscalls_metadata[num] &&
(syscalls_metadata[num] == call->data))
rcu_assign_pointer(tr->exit_syscall_files[num], file);
@@ -476,7 +494,7 @@ static void unreg_event_syscall_exit(struct trace_event_file *file,

mutex_lock(&syscall_trace_lock);
tr->sys_refcount_exit--;
- for (num = 0; num < NR_syscalls; num++) {
+ for (num = 0; num < FTRACE_SYSCALL_CNT; num++) {
if (syscalls_metadata[num] &&
(syscalls_metadata[num] == call->data))
RCU_INIT_POINTER(tr->exit_syscall_files[num], NULL);
@@ -527,38 +545,47 @@ struct trace_event_class __refdata event_class_syscall_exit = {
.raw_init = init_syscall_trace,
};

-unsigned long __init __weak arch_syscall_addr(int nr)
+unsigned long __init __weak arch_syscall_addr(int nr, bool compat)
{
return (unsigned long)sys_call_table[nr];
}

-void __init init_ftrace_syscalls(void)
+void __init init_ftrace_syscalls_meta(bool compat)
{
struct syscall_metadata *meta;
unsigned long addr;
int i;

- syscalls_metadata = kcalloc(NR_syscalls, sizeof(*syscalls_metadata),
- GFP_KERNEL);
- if (!syscalls_metadata) {
- WARN_ON(1);
- return;
- }
-
for (i = 0; i < NR_syscalls; i++) {
- addr = arch_syscall_addr(i);
+ addr = arch_syscall_addr(i, compat);
meta = find_syscall_meta(addr);
if (!meta)
continue;

- syscalls_metadata[i] = meta;
+ syscalls_metadata[compat * NR_syscalls + i] = meta;
}
}

+void __init init_ftrace_syscalls(void)
+{
+ syscalls_metadata = kcalloc(FTRACE_SYSCALL_CNT,
+ sizeof(*syscalls_metadata), GFP_KERNEL);
+ if (!syscalls_metadata) {
+ WARN_ON(1);
+ return;
+ }
+
+ init_ftrace_syscalls_meta(false);
+#ifdef ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP
+ if (IS_ENABLED(CONFIG_COMPAT))
+ init_ftrace_syscalls_meta(true);
+#endif
+}
+
#ifdef CONFIG_PERF_EVENTS

-static DECLARE_BITMAP(enabled_perf_enter_syscalls, NR_syscalls);
-static DECLARE_BITMAP(enabled_perf_exit_syscalls, NR_syscalls);
+static DECLARE_BITMAP(enabled_perf_enter_syscalls, FTRACE_SYSCALL_CNT);
+static DECLARE_BITMAP(enabled_perf_exit_syscalls, FTRACE_SYSCALL_CNT);
static int sys_perf_refcount_enter;
static int sys_perf_refcount_exit;

@@ -571,13 +598,14 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
int rctx;
int size;

- syscall_nr = trace_get_syscall_nr(current, regs);
- if (syscall_nr < 0 || syscall_nr >= NR_syscalls)
+ syscall_nr = ftrace_check_syscall_nr(regs);
+ if (syscall_nr < 0)
return;
+
if (!test_bit(syscall_nr, enabled_perf_enter_syscalls))
return;

- sys_data = syscall_nr_to_meta(syscall_nr);
+ sys_data = trace_syscall_nr_to_meta(syscall_nr);
if (!sys_data)
return;

@@ -616,7 +644,7 @@ static int perf_sysenter_enable(struct trace_event_call *call)
}
}

- for (num = 0; num < NR_syscalls; num++) {
+ for (num = 0; num < FTRACE_SYSCALL_CNT; num++) {
if (syscalls_metadata[num] &&
(syscalls_metadata[num] == call->data))
set_bit(num, enabled_perf_enter_syscalls);
@@ -634,7 +662,7 @@ static void perf_sysenter_disable(struct trace_event_call *call)

mutex_lock(&syscall_trace_lock);
sys_perf_refcount_enter--;
- for (num = 0; num < NR_syscalls; num++) {
+ for (num = 0; num < FTRACE_SYSCALL_CNT; num++) {
if (syscalls_metadata[num] &&
(syscalls_metadata[num] == call->data))
clear_bit(num, enabled_perf_enter_syscalls);
@@ -653,13 +681,14 @@ static void perf_syscall_exit(void *ignore, struct pt_regs *regs, long ret)
int rctx;
int size;

- syscall_nr = trace_get_syscall_nr(current, regs);
- if (syscall_nr < 0 || syscall_nr >= NR_syscalls)
+ syscall_nr = ftrace_check_syscall_nr(regs);
+ if (syscall_nr < 0)
return;
+
if (!test_bit(syscall_nr, enabled_perf_exit_syscalls))
return;

- sys_data = syscall_nr_to_meta(syscall_nr);
+ sys_data = trace_syscall_nr_to_meta(syscall_nr);
if (!sys_data)
return;

@@ -695,7 +724,7 @@ static int perf_sysexit_enable(struct trace_event_call *call)
}
}

- for (num = 0; num < NR_syscalls; num++) {
+ for (num = 0; num < FTRACE_SYSCALL_CNT; num++) {
if (syscalls_metadata[num] &&
(syscalls_metadata[num] == call->data))
set_bit(num, enabled_perf_exit_syscalls);
@@ -713,7 +742,7 @@ static void perf_sysexit_disable(struct trace_event_call *call)

mutex_lock(&syscall_trace_lock);
sys_perf_refcount_exit--;
- for (num = 0; num < NR_syscalls; num++) {
+ for (num = 0; num < FTRACE_SYSCALL_CNT; num++) {
if (syscalls_metadata[num] &&
(syscalls_metadata[num] == call->data))
clear_bit(num, enabled_perf_exit_syscalls);
--
2.7.4

2016-10-11 14:01:37

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v3 07/11] arm64/tracing: fix compat syscall handling

On Tue, Oct 11, 2016 at 12:42:52PM +0200, Marcin Nowakowski wrote:
> Add arch_syscall_addr for arm64 and define NR_compat_syscalls, as the
> number of compat syscalls for arm64 exceeds the number defined by
> NR_syscalls.
>
> Signed-off-by: Marcin Nowakowski <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: [email protected]
> ---
> arch/arm64/include/asm/ftrace.h | 12 +-----------
> arch/arm64/include/asm/unistd.h | 1 +
> arch/arm64/kernel/Makefile | 1 +
> arch/arm64/kernel/ftrace.c | 16 ++++++++++++++++
> 4 files changed, 19 insertions(+), 11 deletions(-)
>
> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> index caa955f..b57ff7c 100644
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -41,17 +41,7 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
>
> #define ftrace_return_address(n) return_address(n)
>
> -/*
> - * Because AArch32 mode does not share the same syscall table with AArch64,
> - * tracing compat syscalls may result in reporting bogus syscalls or even
> - * hang-up, so just do not trace them.
> - * See kernel/trace/trace_syscalls.c
> - *
> - * x86 code says:
> - * If the user really wants these, then they should use the
> - * raw syscall tracepoints with filtering.
> - */
> -#define ARCH_TRACE_IGNORE_COMPAT_SYSCALLS
> +#define ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP 1
> static inline bool arch_trace_is_compat_syscall(struct pt_regs *regs)
> {
> return is_compat_task();
> diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
> index e78ac26..276d049 100644
> --- a/arch/arm64/include/asm/unistd.h
> +++ b/arch/arm64/include/asm/unistd.h
> @@ -45,6 +45,7 @@
> #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE+5)
>
> #define __NR_compat_syscalls 394
> +#define NR_compat_syscalls (__NR_compat_syscalls)

We may as well just define NR_compat_syscalls instead of
__NR_compat_syscalls and move the handful of users over.

> diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
> index 40ad08a..75d010f 100644
> --- a/arch/arm64/kernel/ftrace.c
> +++ b/arch/arm64/kernel/ftrace.c
> @@ -176,4 +176,20 @@ int ftrace_disable_ftrace_graph_caller(void)
> return ftrace_modify_graph_caller(false);
> }
> #endif /* CONFIG_DYNAMIC_FTRACE */
> +
> #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
> +
> +#if (defined CONFIG_FTRACE_SYSCALLS) && (defined CONFIG_COMPAT)
> +
> +extern const void *sys_call_table[];
> +extern const void *compat_sys_call_table[];
> +
> +unsigned long __init arch_syscall_addr(int nr, bool compat)
> +{
> + if (compat)
> + return (unsigned long)compat_sys_call_table[nr];
> +
> + return (unsigned long)sys_call_table[nr];
> +}

Do we care about the compat private syscalls (from base 0x0f0000)? We
need to make sure that we exhibit the same behaviour as a native
32-bit ARM machine.

Will

2016-10-12 07:07:16

by Marcin Nowakowski

[permalink] [raw]
Subject: Re: [PATCH v3 07/11] arm64/tracing: fix compat syscall handling

Hi Will,

On 11.10.2016 15:36, Will Deacon wrote:
> On Tue, Oct 11, 2016 at 12:42:52PM +0200, Marcin Nowakowski wrote:
>> Add arch_syscall_addr for arm64 and define NR_compat_syscalls, as the
>> number of compat syscalls for arm64 exceeds the number defined by
>> NR_syscalls.
>>
>> Signed-off-by: Marcin Nowakowski <[email protected]>
>> Cc: Steven Rostedt <[email protected]>
>> Cc: Ingo Molnar <[email protected]>
>> Cc: Catalin Marinas <[email protected]>
>> Cc: Will Deacon <[email protected]>
>> Cc: [email protected]
>> ---
>> arch/arm64/include/asm/ftrace.h | 12 +-----------
>> arch/arm64/include/asm/unistd.h | 1 +
>> arch/arm64/kernel/Makefile | 1 +
>> arch/arm64/kernel/ftrace.c | 16 ++++++++++++++++
>> 4 files changed, 19 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
>> index caa955f..b57ff7c 100644
>> --- a/arch/arm64/include/asm/ftrace.h
>> +++ b/arch/arm64/include/asm/ftrace.h
>> @@ -41,17 +41,7 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
>>
>> #define ftrace_return_address(n) return_address(n)
>>
>> -/*
>> - * Because AArch32 mode does not share the same syscall table with AArch64,
>> - * tracing compat syscalls may result in reporting bogus syscalls or even
>> - * hang-up, so just do not trace them.
>> - * See kernel/trace/trace_syscalls.c
>> - *
>> - * x86 code says:
>> - * If the user really wants these, then they should use the
>> - * raw syscall tracepoints with filtering.
>> - */
>> -#define ARCH_TRACE_IGNORE_COMPAT_SYSCALLS
>> +#define ARCH_COMPAT_SYSCALL_NUMBERS_OVERLAP 1
>> static inline bool arch_trace_is_compat_syscall(struct pt_regs *regs)
>> {
>> return is_compat_task();
>> diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
>> index e78ac26..276d049 100644
>> --- a/arch/arm64/include/asm/unistd.h
>> +++ b/arch/arm64/include/asm/unistd.h
>> @@ -45,6 +45,7 @@
>> #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE+5)
>>
>> #define __NR_compat_syscalls 394
>> +#define NR_compat_syscalls (__NR_compat_syscalls)
>
> We may as well just define NR_compat_syscalls instead of
> __NR_compat_syscalls and move the handful of users over.

I had tried to minimise the amount of arch-specific changes here -
especially those that are not directly related to the proposed syscall
handling change. But I agree having these 2 #defines is a bit
unnecessary ...

>> diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
>> index 40ad08a..75d010f 100644
>> --- a/arch/arm64/kernel/ftrace.c
>> +++ b/arch/arm64/kernel/ftrace.c
>> @@ -176,4 +176,20 @@ int ftrace_disable_ftrace_graph_caller(void)
>> return ftrace_modify_graph_caller(false);
>> }
>> #endif /* CONFIG_DYNAMIC_FTRACE */
>> +
>> #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
>> +
>> +#if (defined CONFIG_FTRACE_SYSCALLS) && (defined CONFIG_COMPAT)
>> +
>> +extern const void *sys_call_table[];
>> +extern const void *compat_sys_call_table[];
>> +
>> +unsigned long __init arch_syscall_addr(int nr, bool compat)
>> +{
>> + if (compat)
>> + return (unsigned long)compat_sys_call_table[nr];
>> +
>> + return (unsigned long)sys_call_table[nr];
>> +}
>
> Do we care about the compat private syscalls (from base 0x0f0000)? We
> need to make sure that we exhibit the same behaviour as a native
> 32-bit ARM machine.
>
> Will

Tracing of such syscalls has been disabled for a long time (see
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=086ba77a6db0).
Apart from using non-contiguous numbers, they are not defined using
standard SYSCALL macros, so they do not have any metadata generated either.
My suggestion is that if you wanted those to be included in the trace
then it should be done separately from these changes.

Marcin

2016-10-12 08:50:56

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH v3 03/11] tracing/syscalls: add compat syscall metadata

Marcin Nowakowski <[email protected]> writes:

> Now that compat syscalls are properly distinguished from native calls,
> we can add metadata for compat syscalls as well.
> All the macros used to generate the metadata are the same as for
> standard syscalls, but with a compat_ prefix to distinguish them easily.
>
> Signed-off-by: Marcin Nowakowski <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: [email protected]
> ---
> arch/powerpc/include/asm/ftrace.h | 15 +++++---
> include/linux/compat.h | 74 +++++++++++++++++++++++++++++++++++++++
> kernel/trace/trace_syscalls.c | 8 +++--
> 3 files changed, 90 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
> index 686c5f7..9697a73 100644
> --- a/arch/powerpc/include/asm/ftrace.h
> +++ b/arch/powerpc/include/asm/ftrace.h
> @@ -73,12 +73,17 @@ struct dyn_arch_ftrace {
> static inline bool arch_syscall_match_sym_name(const char *sym, const char *name)
> {
> /*
> - * Compare the symbol name with the system call name. Skip the .sys or .SyS
> - * prefix from the symbol name and the sys prefix from the system call name and
> - * just match the rest. This is only needed on ppc64 since symbol names on
> - * 32bit do not start with a period so the generic function will work.
> + * Compare the symbol name with the system call name. Skip the .sys,
> + * .SyS or .compat_sys prefix from the symbol name and the sys prefix
> + * from the system call name and just match the rest. This is only
> + * needed on ppc64 since symbol names on 32bit do not start with a
> + * period so the generic function will work.
> */
> - return !strcmp(sym + 4, name + 3);
> + int prefix_len = 3;
> +
> + if (!strncasecmp(name, "compat_", 7))
> + prefix_len = 10;
> + return !strcmp(sym + prefix_len + 1, name + prefix_len);
> }

It's annoying that we have to duplicate all that just to do a + 1.

How about this as a precursor?

cheers


diff --git a/Documentation/trace/ftrace-design.txt b/Documentation/trace/ftrace-design.txt
index dd5f916b351d..bd65f2adeb09 100644
--- a/Documentation/trace/ftrace-design.txt
+++ b/Documentation/trace/ftrace-design.txt
@@ -226,10 +226,6 @@ You need very few things to get the syscalls tracing in an arch.
- If the system call table on this arch is more complicated than a simple array
of addresses of the system calls, implement an arch_syscall_addr to return
the address of a given system call.
-- If the symbol names of the system calls do not match the function names on
- this arch, define ARCH_HAS_SYSCALL_MATCH_SYM_NAME in asm/ftrace.h and
- implement arch_syscall_match_sym_name with the appropriate logic to return
- true if the function name corresponds with the symbol name.
- Tag this arch as HAVE_SYSCALL_TRACEPOINTS.


diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
index 686c5f70eb84..dc48f5b2878d 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -60,6 +60,12 @@ struct dyn_arch_ftrace {
struct module *mod;
};
#endif /* CONFIG_DYNAMIC_FTRACE */
+
+#ifdef PPC64_ELF_ABI_v1
+/* On ppc64 ABIv1 (BE) we have to skip the leading '.' in the symbol name */
+#define ARCH_SYM_NAME_SKIP_CHARS 1
+#endif
+
#endif /* __ASSEMBLY__ */

#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
@@ -67,20 +73,4 @@ struct dyn_arch_ftrace {
#endif
#endif

-#if defined(CONFIG_FTRACE_SYSCALLS) && !defined(__ASSEMBLY__)
-#ifdef PPC64_ELF_ABI_v1
-#define ARCH_HAS_SYSCALL_MATCH_SYM_NAME
-static inline bool arch_syscall_match_sym_name(const char *sym, const char *name)
-{
- /*
- * Compare the symbol name with the system call name. Skip the .sys or .SyS
- * prefix from the symbol name and the sys prefix from the system call name and
- * just match the rest. This is only needed on ppc64 since symbol names on
- * 32bit do not start with a period so the generic function will work.
- */
- return !strcmp(sym + 4, name + 3);
-}
-#endif
-#endif /* CONFIG_FTRACE_SYSCALLS && !__ASSEMBLY__ */
-
#endif /* _ASM_POWERPC_FTRACE */
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index b2b6efc083a4..91a7315dbe43 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -31,8 +31,11 @@ extern struct syscall_metadata *__stop_syscalls_metadata[];

static struct syscall_metadata **syscalls_metadata;

-#ifndef ARCH_HAS_SYSCALL_MATCH_SYM_NAME
-static inline bool arch_syscall_match_sym_name(const char *sym, const char *name)
+#ifndef ARCH_SYM_NAME_SKIP_CHARS
+#define ARCH_SYM_NAME_SKIP_CHARS 0
+#endif
+
+static inline bool syscall_match_sym_name(const char *sym, const char *name)
{
/*
* Only compare after the "sys" prefix. Archs that use
@@ -40,9 +43,8 @@ static inline bool arch_syscall_match_sym_name(const char *sym, const char *name
* with ".SyS" or ".sys" instead of "sys", leading to an unwanted
* mismatch.
*/
- return !strcmp(sym + 3, name + 3);
+ return !strcmp(sym + 3 + ARCH_SYM_NAME_SKIP_CHARS, name + 3);
}
-#endif

#ifdef ARCH_TRACE_IGNORE_COMPAT_SYSCALLS
/*
@@ -88,11 +90,11 @@ find_syscall_meta(unsigned long syscall)
stop = __stop_syscalls_metadata;
kallsyms_lookup(syscall, NULL, NULL, NULL, str);

- if (arch_syscall_match_sym_name(str, "sys_ni_syscall"))
+ if (syscall_match_sym_name(str, "sys_ni_syscall"))
return NULL;

for ( ; start < stop; start++) {
- if ((*start)->name && arch_syscall_match_sym_name(str, (*start)->name))
+ if ((*start)->name && syscall_match_sym_name(str, (*start)->name))
return *start;
}
return NULL;

2016-10-12 09:59:51

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH v3 08/11] powerpc/tracing: fix compat syscall handling

Marcin Nowakowski <[email protected]> writes:

> Adapt the code to make use of new syscall handling interface
>
> Signed-off-by: Marcin Nowakowski <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: [email protected]
> ---
> arch/powerpc/include/asm/ftrace.h | 11 +++++++++++
> arch/powerpc/kernel/ftrace.c | 4 ++++

I went to test this and noticed the exit and enter events appear to be
reversed in time? (your series on top of 24532f768121)

ls-4221 [003] .... 83.766113: compat_sys_rt_sigprocmask -> 0x2
ls-4221 [003] .... 83.766137: compat_sys_rt_sigprocmask(how: 2, nset: 1010db30, oset: 0, sigsetsize: 8)
ls-4221 [003] .... 83.766175: compat_sys_rt_sigaction -> 0x14
ls-4221 [003] .... 83.766175: compat_sys_rt_sigaction(sig: 14, act: ffbd33c4, oact: ffbd3338, sigsetsize: 8)
ls-4221 [003] .... 83.766177: compat_sys_rt_sigaction -> 0x15
ls-4221 [003] .... 83.766177: compat_sys_rt_sigaction(sig: 15, act: ffbd33c4, oact: ffbd3338, sigsetsize: 8)
ls-4221 [003] .... 83.766178: compat_sys_rt_sigaction -> 0x16
ls-4221 [003] .... 83.766178: compat_sys_rt_sigaction(sig: 16, act: ffbd33d4, oact: ffbd3348, sigsetsize: 8)
ls-4221 [003] .... 83.766179: sys_setpgid -> 0x107d
ls-4221 [003] .... 83.766179: sys_setpgid(pid: 107d, pgid: 107d)
ls-4221 [003] .... 83.766180: compat_sys_rt_sigprocmask -> 0x0
ls-4221 [003] .... 83.766181: compat_sys_rt_sigprocmask(how: 0, nset: ffbd34b0, oset: ffbd3530, sigsetsize: 8)
ls-4221 [003] .... 83.766186: compat_sys_ioctl -> 0xff
ls-4221 [003] .... 83.766187: compat_sys_ioctl(fd: ff, cmd: 80047476, arg32: ffbd3488)
ls-4221 [003] .... 83.766188: compat_sys_rt_sigprocmask -> 0x2
ls-4221 [003] .... 83.766189: compat_sys_rt_sigprocmask(how: 2, nset: ffbd3530, oset: 0, sigsetsize: 8)
ls-4221 [003] .... 83.766189: sys_close -> 0x4
ls-4221 [003] .... 83.766190: sys_close(fd: 4)
ls-4221 [003] .... 83.766191: sys_read -> 0x3
ls-4221 [003] .... 83.766191: sys_read(fd: 3, buf: ffbd35dc, count: 1)
ls-4221 [003] .... 83.766235: sys_close -> 0x3
ls-4221 [003] .... 83.766235: sys_close(fd: 3)

cheers

2016-10-12 10:06:16

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v3 07/11] arm64/tracing: fix compat syscall handling

On Wed, Oct 12, 2016 at 09:07:03AM +0200, Marcin Nowakowski wrote:
> On 11.10.2016 15:36, Will Deacon wrote:
> >On Tue, Oct 11, 2016 at 12:42:52PM +0200, Marcin Nowakowski wrote:
> >>diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
> >>index e78ac26..276d049 100644
> >>--- a/arch/arm64/include/asm/unistd.h
> >>+++ b/arch/arm64/include/asm/unistd.h
> >>@@ -45,6 +45,7 @@
> >> #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE+5)
> >>
> >> #define __NR_compat_syscalls 394
> >>+#define NR_compat_syscalls (__NR_compat_syscalls)
> >
> >We may as well just define NR_compat_syscalls instead of
> >__NR_compat_syscalls and move the handful of users over.
>
> I had tried to minimise the amount of arch-specific changes here -
> especially those that are not directly related to the proposed syscall
> handling change. But I agree having these 2 #defines is a bit unnecessary

There's only three users of __NR_compat_syscalls, so I think you can
move them over.

> >>diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
> >>index 40ad08a..75d010f 100644
> >>--- a/arch/arm64/kernel/ftrace.c
> >>+++ b/arch/arm64/kernel/ftrace.c
> >>@@ -176,4 +176,20 @@ int ftrace_disable_ftrace_graph_caller(void)
> >> return ftrace_modify_graph_caller(false);
> >> }
> >> #endif /* CONFIG_DYNAMIC_FTRACE */
> >>+
> >> #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
> >>+
> >>+#if (defined CONFIG_FTRACE_SYSCALLS) && (defined CONFIG_COMPAT)
> >>+
> >>+extern const void *sys_call_table[];
> >>+extern const void *compat_sys_call_table[];
> >>+
> >>+unsigned long __init arch_syscall_addr(int nr, bool compat)
> >>+{
> >>+ if (compat)
> >>+ return (unsigned long)compat_sys_call_table[nr];
> >>+
> >>+ return (unsigned long)sys_call_table[nr];
> >>+}
> >
> >Do we care about the compat private syscalls (from base 0x0f0000)? We
> >need to make sure that we exhibit the same behaviour as a native
> >32-bit ARM machine.
> >
> >Will
>
> Tracing of such syscalls has been disabled for a long time (see
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=086ba77a6db0).
> Apart from using non-contiguous numbers, they are not defined using standard
> SYSCALL macros, so they do not have any metadata generated either.
> My suggestion is that if you wanted those to be included in the trace then
> it should be done separately from these changes.

Fine by me -- I just wanted to make sure our compat behaviour matched
the behaviour of native arch/arm/. It sounds like it does, so no need to
change anything here.

Acked-by: Will Deacon <[email protected]>

Will

2016-10-12 13:10:15

by Marcin Nowakowski

[permalink] [raw]
Subject: Re: [PATCH v3 03/11] tracing/syscalls: add compat syscall metadata

On 12.10.2016 10:50, Michael Ellerman wrote:
> <...>
> It's annoying that we have to duplicate all that just to do a + 1.
>
> How about this as a precursor?
> <...>

Thanks for the suggestion - unless anyone sees a reason to keep the
current solution I'll change it.

Marcin

2016-10-12 13:10:48

by Marcin Nowakowski

[permalink] [raw]
Subject: Re: [PATCH v3 08/11] powerpc/tracing: fix compat syscall handling



On 12.10.2016 11:59, Michael Ellerman wrote:
> I went to test this and noticed the exit and enter events appear to be
> reversed in time? (your series on top of 24532f768121)

thanks for testing the patch - I've found a bug that has sneaked in
while cleaning up the patches before submission ... I'll fix it in the
next iteration.

Marcin

2016-10-12 13:55:54

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v3 05/11] x86/tracing: fix compat syscall handling

On Tue, 11 Oct 2016, Marcin Nowakowski wrote:
> -/*
> - * Because ia32 syscalls do not map to x86_64 syscall numbers
> - * this screws up the trace output when tracing a ia32 task.
> - * Instead of reporting bogus syscalls, just do not trace them.
> - *
> - * If the user really wants these, then they should use the
> - * raw syscall tracepoints with filtering.

Why aren't you moving this information to the new place? It's valuable.

Thanks,

tglx

2016-10-13 06:35:22

by Marcin Nowakowski

[permalink] [raw]
Subject: Re: [PATCH v3 05/11] x86/tracing: fix compat syscall handling

On 12.10.2016 15:53, Thomas Gleixner wrote:
> On Tue, 11 Oct 2016, Marcin Nowakowski wrote:
>> -/*
>> - * Because ia32 syscalls do not map to x86_64 syscall numbers
>> - * this screws up the trace output when tracing a ia32 task.
>> - * Instead of reporting bogus syscalls, just do not trace them.
>> - *
>> - * If the user really wants these, then they should use the
>> - * raw syscall tracepoints with filtering.
>
> Why aren't you moving this information to the new place? It's valuable.

Well - much of this information is no longer true as with these fixes
tracing can work properly, but I can keep a note clarifying that the
numbers are mapped differently and why arch_syscall_addr() needs the
sort of handling it does.

Marcin

2016-10-13 09:41:30

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH v3 03/11] tracing/syscalls: add compat syscall metadata

Marcin Nowakowski <[email protected]> writes:

> On 12.10.2016 10:50, Michael Ellerman wrote:
>> <...>
>> It's annoying that we have to duplicate all that just to do a + 1.
>>
>> How about this as a precursor?
> > <...>
>
> Thanks for the suggestion - unless anyone sees a reason to keep the
> current solution I'll change it.

Thanks. I forgot to add my SOB so here it is:

Signed-off-by: Michael Ellerman <[email protected]>

cheers