2019-06-06 09:51:19

by Leo Yan

[permalink] [raw]
Subject: [PATCH v2 0/4] perf augmented_raw_syscalls: Support for arm64

When I tried to run the trace on arm64 platform with eBPF program
augmented_raw_syscalls, it reports several failures for eBPF program
compilation. So tried to resolve these issues and this patch set is
the working result.

0001 patch lets perf command to exit directly if find eBPF program
building failure.

0002 patch is minor refactoring code to remove duplicate macro.

0003 patch is to add support arm64 raw syscalls numbers.

0004 patch is to document clang configuration so that can easily use
this program on both x86_64 and aarch64 platforms.

Changes from v1:
* Removed duplicated macro and aligned the numbers indention for arm64.

Leo Yan (4):
perf trace: Exit when build eBPF program failure
perf augmented_raw_syscalls: Remove duplicate macros
perf augmented_raw_syscalls: Support arm64 raw syscalls
perf augmented_raw_syscalls: Document clang configuration

tools/perf/builtin-trace.c | 8 ++
.../examples/bpf/augmented_raw_syscalls.c | 101 +++++++++++++++++-
2 files changed, 108 insertions(+), 1 deletion(-)

--
2.17.1


2019-06-06 09:51:48

by Leo Yan

[permalink] [raw]
Subject: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

This patch adds support for arm64 raw syscall numbers so that we can use
it on arm64 platform.

After applied this patch, we need to specify macro -D__aarch64__ or
-D__x86_64__ in compilation option so Clang can use the corresponding
syscall numbers for arm64 or x86_64 respectively, other architectures
will report failure when compilation.

Signed-off-by: Leo Yan <[email protected]>
---
.../examples/bpf/augmented_raw_syscalls.c | 81 +++++++++++++++++++
1 file changed, 81 insertions(+)

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index 5c4a4e715ae6..a3701a4daf2e 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -45,6 +45,83 @@ struct augmented_filename {
char value[PATH_MAX];
};

+#if defined(__aarch64__)
+
+/* syscalls where the first arg is a string */
+#define SYS_OPEN 1024
+#define SYS_STAT 1038
+#define SYS_LSTAT 1039
+#define SYS_ACCESS 1033
+#define SYS_EXECVE 221
+#define SYS_TRUNCATE 45
+#define SYS_CHDIR 49
+#define SYS_RENAME 1034
+#define SYS_MKDIR 1030
+#define SYS_RMDIR 1031
+#define SYS_CREAT 1064
+#define SYS_LINK 1025
+#define SYS_UNLINK 1026
+#define SYS_SYMLINK 1036
+#define SYS_READLINK 1035
+#define SYS_CHMOD 1028
+#define SYS_CHOWN 1029
+#define SYS_LCHOWN 1032
+#define SYS_MKNOD 1027
+#define SYS_STATFS 1056
+#define SYS_PIVOT_ROOT 41
+#define SYS_CHROOT 51
+#define SYS_ACCT 89
+#define SYS_SWAPON 224
+#define SYS_SWAPOFF 225
+#define SYS_DELETE_MODULE 106
+#define SYS_SETXATTR 5
+#define SYS_LSETXATTR 6
+#define SYS_GETXATTR 8
+#define SYS_LGETXATTR 9
+#define SYS_LISTXATTR 11
+#define SYS_LLISTXATTR 12
+#define SYS_REMOVEXATTR 14
+#define SYS_LREMOVEXATTR 15
+#define SYS_MQ_OPEN 180
+#define SYS_MQ_UNLINK 181
+#define SYS_ADD_KEY 217
+#define SYS_REQUEST_KEY 218
+#define SYS_SYMLINKAT 36
+#define SYS_MEMFD_CREATE 279
+
+/* syscalls where the second arg is a string */
+#define SYS_PWRITE64 68
+#define SYS_RENAME 1034
+#define SYS_QUOTACTL 60
+#define SYS_FSETXATTR 7
+#define SYS_FGETXATTR 10
+#define SYS_FREMOVEXATTR 16
+#define SYS_MQ_TIMEDSEND 182
+#define SYS_REQUEST_KEY 218
+#define SYS_INOTIFY_ADD_WATCH 27
+#define SYS_OPENAT 56
+#define SYS_MKDIRAT 34
+#define SYS_MKNODAT 33
+#define SYS_FCHOWNAT 54
+#define SYS_FUTIMESAT 1066
+#define SYS_NEWFSTATAT 1054
+#define SYS_UNLINKAT 35
+#define SYS_RENAMEAT 38
+#define SYS_LINKAT 37
+#define SYS_READLINKAT 78
+#define SYS_FCHMODAT 53
+#define SYS_FACCESSAT 48
+#define SYS_UTIMENSAT 88
+#define SYS_NAME_TO_HANDLE_AT 264
+#define SYS_FINIT_MODULE 273
+#define SYS_RENAMEAT2 276
+#define SYS_EXECVEAT 281
+#define SYS_STATX 291
+#define SYS_MOVE_MOUNT 429
+#define SYS_FSPICK 433
+
+#elif defined(__x86_64__)
+
/* syscalls where the first arg is a string */
#define SYS_OPEN 2
#define SYS_STAT 4
@@ -119,6 +196,10 @@ struct augmented_filename {
#define SYS_MOVE_MOUNT 429
#define SYS_FSPICK 433

+#else
+#error "unsupported architecture"
+#endif
+
pid_filter(pids_filtered);

struct augmented_args_filename {
--
2.17.1

2019-06-06 12:00:17

by Leo Yan

[permalink] [raw]
Subject: [PATCH v2 4/4] perf augmented_raw_syscalls: Document clang configuration

To build this program successfully with clang, there have three
compiler options need to be specified:

- Header file path: tools/perf/include/bpf;
- Specify architecture;
- Define macro __NR_CPUS__.

This patch add comments to explain the reasons for building failure and
give two examples for llvm.clang-opt variable, one is for x86_64
architecture and another is for aarch64 architecture.

Signed-off-by: Leo Yan <[email protected]>
---
.../examples/bpf/augmented_raw_syscalls.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index a3701a4daf2e..fb6987edab2c 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -6,6 +6,25 @@
*
* perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c cat /etc/passwd > /dev/null
*
+ * This program include two header files 'unistd.h' and 'pid_filter.h', which
+ * are placed in the folder tools/perf/include/bpf, but this folder is not
+ * included in env $KERNEL_INC_OPTIONS and it leads to compilation failure.
+ * For building this code, we also need to specify architecture and define macro
+ * __NR_CPUS__. To resolve these issues, variable llvm.clang-opt can be set in
+ * the file ~/.perfconfig:
+ *
+ * E.g. Test on a platform with 8 CPUs with x86_64 architecture:
+ *
+ * [llvm]
+ * clang-opt = "-D__NR_CPUS__=8 -D__x86_64__ \
+ * -I./tools/perf/include/bpf"
+ *
+ * E.g. Test on a platform with 5 CPUs with aarch64 architecture:
+ *
+ * [llvm]
+ * clang-opt = "-D__NR_CPUS__=5 -D__aarch64__ \
+ * -I./tools/perf/include/bpf"
+
* This exactly matches what is marshalled into the raw_syscall:sys_enter
* payload expected by the 'perf trace' beautifiers.
*
--
2.17.1

2019-06-06 13:40:28

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

Em Thu, Jun 06, 2019 at 05:48:44PM +0800, Leo Yan escreveu:
> This patch adds support for arm64 raw syscall numbers so that we can use
> it on arm64 platform.
>
> After applied this patch, we need to specify macro -D__aarch64__ or
> -D__x86_64__ in compilation option so Clang can use the corresponding
> syscall numbers for arm64 or x86_64 respectively, other architectures
> will report failure when compilation.

So, please check what I have in my perf/core branch, I've completely
removed arch specific stuff from augmented_raw_syscalls.c.

What is done now is use a map to specify what to copy, that same map
that is used to state which syscalls should be traced.

It uses that tools/perf/arch/arm64/entry/syscalls/mksyscalltbl to figure
out the mapping of syscall names to ids, just like is done for x86_64
and other arches, falling back to audit-libs when that syscalltbl thing
is not present.

- Arnaldo

> Signed-off-by: Leo Yan <[email protected]>
> ---
> .../examples/bpf/augmented_raw_syscalls.c | 81 +++++++++++++++++++
> 1 file changed, 81 insertions(+)
>
> diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
> index 5c4a4e715ae6..a3701a4daf2e 100644
> --- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
> +++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
> @@ -45,6 +45,83 @@ struct augmented_filename {
> char value[PATH_MAX];
> };
>
> +#if defined(__aarch64__)
> +
> +/* syscalls where the first arg is a string */
> +#define SYS_OPEN 1024
> +#define SYS_STAT 1038
> +#define SYS_LSTAT 1039
> +#define SYS_ACCESS 1033
> +#define SYS_EXECVE 221
> +#define SYS_TRUNCATE 45
> +#define SYS_CHDIR 49
> +#define SYS_RENAME 1034
> +#define SYS_MKDIR 1030
> +#define SYS_RMDIR 1031
> +#define SYS_CREAT 1064
> +#define SYS_LINK 1025
> +#define SYS_UNLINK 1026
> +#define SYS_SYMLINK 1036
> +#define SYS_READLINK 1035
> +#define SYS_CHMOD 1028
> +#define SYS_CHOWN 1029
> +#define SYS_LCHOWN 1032
> +#define SYS_MKNOD 1027
> +#define SYS_STATFS 1056
> +#define SYS_PIVOT_ROOT 41
> +#define SYS_CHROOT 51
> +#define SYS_ACCT 89
> +#define SYS_SWAPON 224
> +#define SYS_SWAPOFF 225
> +#define SYS_DELETE_MODULE 106
> +#define SYS_SETXATTR 5
> +#define SYS_LSETXATTR 6
> +#define SYS_GETXATTR 8
> +#define SYS_LGETXATTR 9
> +#define SYS_LISTXATTR 11
> +#define SYS_LLISTXATTR 12
> +#define SYS_REMOVEXATTR 14
> +#define SYS_LREMOVEXATTR 15
> +#define SYS_MQ_OPEN 180
> +#define SYS_MQ_UNLINK 181
> +#define SYS_ADD_KEY 217
> +#define SYS_REQUEST_KEY 218
> +#define SYS_SYMLINKAT 36
> +#define SYS_MEMFD_CREATE 279
> +
> +/* syscalls where the second arg is a string */
> +#define SYS_PWRITE64 68
> +#define SYS_RENAME 1034
> +#define SYS_QUOTACTL 60
> +#define SYS_FSETXATTR 7
> +#define SYS_FGETXATTR 10
> +#define SYS_FREMOVEXATTR 16
> +#define SYS_MQ_TIMEDSEND 182
> +#define SYS_REQUEST_KEY 218
> +#define SYS_INOTIFY_ADD_WATCH 27
> +#define SYS_OPENAT 56
> +#define SYS_MKDIRAT 34
> +#define SYS_MKNODAT 33
> +#define SYS_FCHOWNAT 54
> +#define SYS_FUTIMESAT 1066
> +#define SYS_NEWFSTATAT 1054
> +#define SYS_UNLINKAT 35
> +#define SYS_RENAMEAT 38
> +#define SYS_LINKAT 37
> +#define SYS_READLINKAT 78
> +#define SYS_FCHMODAT 53
> +#define SYS_FACCESSAT 48
> +#define SYS_UTIMENSAT 88
> +#define SYS_NAME_TO_HANDLE_AT 264
> +#define SYS_FINIT_MODULE 273
> +#define SYS_RENAMEAT2 276
> +#define SYS_EXECVEAT 281
> +#define SYS_STATX 291
> +#define SYS_MOVE_MOUNT 429
> +#define SYS_FSPICK 433
> +
> +#elif defined(__x86_64__)
> +
> /* syscalls where the first arg is a string */
> #define SYS_OPEN 2
> #define SYS_STAT 4
> @@ -119,6 +196,10 @@ struct augmented_filename {
> #define SYS_MOVE_MOUNT 429
> #define SYS_FSPICK 433
>
> +#else
> +#error "unsupported architecture"
> +#endif
> +
> pid_filter(pids_filtered);
>
> struct augmented_args_filename {
> --
> 2.17.1

--

- Arnaldo

2019-06-06 13:48:40

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

Em Thu, Jun 06, 2019 at 10:38:38AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, Jun 06, 2019 at 05:48:44PM +0800, Leo Yan escreveu:
> > This patch adds support for arm64 raw syscall numbers so that we can use
> > it on arm64 platform.
> >
> > After applied this patch, we need to specify macro -D__aarch64__ or
> > -D__x86_64__ in compilation option so Clang can use the corresponding
> > syscall numbers for arm64 or x86_64 respectively, other architectures
> > will report failure when compilation.
>
> So, please check what I have in my perf/core branch, I've completely
> removed arch specific stuff from augmented_raw_syscalls.c.
>
> What is done now is use a map to specify what to copy, that same map
> that is used to state which syscalls should be traced.
>
> It uses that tools/perf/arch/arm64/entry/syscalls/mksyscalltbl to figure
> out the mapping of syscall names to ids, just like is done for x86_64
> and other arches, falling back to audit-libs when that syscalltbl thing
> is not present.

Also added:

Fixes: ac96287cae08 ("perf trace: Allow specifying a set of events to add in perfconfig")

For the [email protected] folks to automagically pick this.

- Arnaldo

2019-06-06 14:08:24

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

Em Thu, Jun 06, 2019 at 10:46:24AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, Jun 06, 2019 at 10:38:38AM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Thu, Jun 06, 2019 at 05:48:44PM +0800, Leo Yan escreveu:
> > > This patch adds support for arm64 raw syscall numbers so that we can use
> > > it on arm64 platform.

> > > After applied this patch, we need to specify macro -D__aarch64__ or
> > > -D__x86_64__ in compilation option so Clang can use the corresponding
> > > syscall numbers for arm64 or x86_64 respectively, other architectures
> > > will report failure when compilation.

> > So, please check what I have in my perf/core branch, I've completely
> > removed arch specific stuff from augmented_raw_syscalls.c.

> > What is done now is use a map to specify what to copy, that same map
> > that is used to state which syscalls should be traced.

> > It uses that tools/perf/arch/arm64/entry/syscalls/mksyscalltbl to figure
> > out the mapping of syscall names to ids, just like is done for x86_64
> > and other arches, falling back to audit-libs when that syscalltbl thing
> > is not present.

> Also added:

> Fixes: ac96287cae08 ("perf trace: Allow specifying a set of events to add in perfconfig")

> For the [email protected] folks to automagically pick this.

And this extra patch is needed, with yours and this one we finally get
what we want, which points to the kernel verifier blocking something,
exactly what is it that is blocking
(/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o)
and who asked for it, (trace.add_events=...) in a config key-value pair:

[root@quaco ~]# perf trace ls
event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o'
\___ Kernel verifier blocks program loading

(add -v to see detail)
Run 'perf list' for a list of valid events
Error: wrong config key-value pair trace.add_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
[root@quaco ~]#


commit 6455f983af2657b950d5dd5c45783e31e41ead4a
Author: Arnaldo Carvalho de Melo <[email protected]>
Date: Thu Jun 6 10:56:55 2019 -0300

perf config: Bail out when a handler returns failure for a key-value pair

So perf_config() uses:

int ret = 0;

perf_config_set__for_each_entry(config_set, section, item) {
...
ret = fn();
if (ret < 0)
break;
}

return ret;

Expecting that that break will imediatelly go to function exit to return
that error value (ret).

The problem is that perf_config_set__for_each_entry() expands into two
nested for() loops, one traversing the sections in a config and the
second the items in each of those sections, so we have to change that
'break' to a goto label right before that final 'return ret'.

With that, for instance 'perf trace' now correctly bails out when a
event that is requested to be added via its 'trace.add_events'
~/.perfconfig entry gets rejected by the kernel BPF verifier:

# perf trace ls
event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o'
\___ Kernel verifier blocks program loading

(add -v to see detail)
Run 'perf list' for a list of valid events
Error: wrong config key-value pair trace.add_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
#

While before it would continue and explode later, when trying to find
maps that would have been in place had that augmented_raw_syscalls.o
precompiled BPF proggie been accepted by the, humm, bast... rigorous
kernel BPF verifier 8-)

Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Taeung Song <[email protected]>
Cc: Yonghong Song <[email protected]>
Fixes: 8a0a9c7e9146 ("perf config: Introduce new init() and exit()")
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index 7e3c1b60120c..e7d2c08d263a 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -739,11 +739,15 @@ int perf_config(config_fn_t fn, void *data)
if (ret < 0) {
pr_err("Error: wrong config key-value pair %s=%s\n",
key, value);
- break;
+ /*
+ * Can't be just a 'break', as perf_config_set__for_each_entry()
+ * expands to two nested for() loops.
+ */
+ goto out;
}
}
}
-
+out:
return ret;
}

2019-06-06 14:11:03

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] perf augmented_raw_syscalls: Document clang configuration

Em Thu, Jun 06, 2019 at 05:48:45PM +0800, Leo Yan escreveu:
> To build this program successfully with clang, there have three
> compiler options need to be specified:
>
> - Header file path: tools/perf/include/bpf;
> - Specify architecture;
> - Define macro __NR_CPUS__.

So, this shouldn't be needed, all of this is supposed to be done
automagically, have you done a 'make -C tools/perf install'?

- Arnaldo

> This patch add comments to explain the reasons for building failure and
> give two examples for llvm.clang-opt variable, one is for x86_64
> architecture and another is for aarch64 architecture.
>
> Signed-off-by: Leo Yan <[email protected]>
> ---
> .../examples/bpf/augmented_raw_syscalls.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
> index a3701a4daf2e..fb6987edab2c 100644
> --- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
> +++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
> @@ -6,6 +6,25 @@
> *
> * perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c cat /etc/passwd > /dev/null
> *
> + * This program include two header files 'unistd.h' and 'pid_filter.h', which
> + * are placed in the folder tools/perf/include/bpf, but this folder is not
> + * included in env $KERNEL_INC_OPTIONS and it leads to compilation failure.
> + * For building this code, we also need to specify architecture and define macro
> + * __NR_CPUS__. To resolve these issues, variable llvm.clang-opt can be set in
> + * the file ~/.perfconfig:
> + *
> + * E.g. Test on a platform with 8 CPUs with x86_64 architecture:
> + *
> + * [llvm]
> + * clang-opt = "-D__NR_CPUS__=8 -D__x86_64__ \
> + * -I./tools/perf/include/bpf"
> + *
> + * E.g. Test on a platform with 5 CPUs with aarch64 architecture:
> + *
> + * [llvm]
> + * clang-opt = "-D__NR_CPUS__=5 -D__aarch64__ \
> + * -I./tools/perf/include/bpf"
> +
> * This exactly matches what is marshalled into the raw_syscall:sys_enter
> * payload expected by the 'perf trace' beautifiers.
> *
> --
> 2.17.1

--

- Arnaldo

2019-06-06 14:38:23

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] perf augmented_raw_syscalls: Document clang configuration

On Thu, Jun 06, 2019 at 11:08:00AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jun 06, 2019 at 05:48:45PM +0800, Leo Yan escreveu:
> > To build this program successfully with clang, there have three
> > compiler options need to be specified:
> >
> > - Header file path: tools/perf/include/bpf;
> > - Specify architecture;
> > - Define macro __NR_CPUS__.
>
> So, this shouldn't be needed, all of this is supposed to be done
> automagically, have you done a 'make -C tools/perf install'?

I missed the up operation. But after git pulled the lastest code base
from perf/core branch and used the command 'make -C tools/perf
install', I still saw the eBPF build failure.

Just now this issue is fixed after I removed the config
'clang-bpf-cmd-template' from ~/.perfconfig; the reason is I followed
up the Documentation/perf-config.txt to set the config as below:

clang-bpf-cmd-template = "$CLANG_EXEC -D__KERNEL__ $CLANG_OPTIONS \
$KERNEL_INC_OPTIONS -Wno-unused-value \
-Wno-pointer-sign -working-directory \
$WORKING_DIR -c $CLANG_SOURCE -target bpf \
-O2 -o -"

In fact, util/llvm-utils.c has updated the default configuration as
below:

#define CLANG_BPF_CMD_DEFAULT_TEMPLATE \
"$CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS "\
"-DLINUX_VERSION_CODE=$LINUX_VERSION_CODE " \
"$CLANG_OPTIONS $PERF_BPF_INC_OPTIONS $KERNEL_INC_OPTIONS " \
"-Wno-unused-value -Wno-pointer-sign " \
"-working-directory $WORKING_DIR " \
"-c \"$CLANG_SOURCE\" -target bpf $CLANG_EMIT_LLVM -O2 -o - $LLVM_OPTIONS_PIPE"

Maybe should update Documentation/perf-config.txt to tell users the
real default value of clang-bpf-cmd-template?

Thanks,
Leo Yan

2019-06-06 14:47:19

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

Em Thu, Jun 06, 2019 at 10:12:31PM +0800, Leo Yan escreveu:
> Hi Arnaldo,
>
> On Thu, Jun 06, 2019 at 10:38:38AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Jun 06, 2019 at 05:48:44PM +0800, Leo Yan escreveu:
> > > This patch adds support for arm64 raw syscall numbers so that we can use
> > > it on arm64 platform.
> > >
> > > After applied this patch, we need to specify macro -D__aarch64__ or
> > > -D__x86_64__ in compilation option so Clang can use the corresponding
> > > syscall numbers for arm64 or x86_64 respectively, other architectures
> > > will report failure when compilation.
> >
> > So, please check what I have in my perf/core branch, I've completely
> > removed arch specific stuff from augmented_raw_syscalls.c.
> >
> > What is done now is use a map to specify what to copy, that same map
> > that is used to state which syscalls should be traced.
> >
> > It uses that tools/perf/arch/arm64/entry/syscalls/mksyscalltbl to figure
> > out the mapping of syscall names to ids, just like is done for x86_64
> > and other arches, falling back to audit-libs when that syscalltbl thing
> > is not present.
>
> Actually I have noticed mksyscalltbl has been enabled for arm64, and
> had to say your approach is much better :)
>
> Thanks for the info and I will try your patch at my side.

That is excellent news! I'm eager to hear from you if this perf+BPF
integration experiment works for arm64.

I'm now trying to get past the verifier when checking if more than one
syscall arg is a filename, i.e. things like the rename* family, that
take two filenames.

An exercise in loop unrolling, providing the right hints to the
verifier, making sure clang don't trash those via explicit barriers, and
a lot of patience, limitless fun! ;-)

- Arnaldo

2019-06-06 17:12:45

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

Hi Arnaldo,

On Thu, Jun 06, 2019 at 10:38:38AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jun 06, 2019 at 05:48:44PM +0800, Leo Yan escreveu:
> > This patch adds support for arm64 raw syscall numbers so that we can use
> > it on arm64 platform.
> >
> > After applied this patch, we need to specify macro -D__aarch64__ or
> > -D__x86_64__ in compilation option so Clang can use the corresponding
> > syscall numbers for arm64 or x86_64 respectively, other architectures
> > will report failure when compilation.
>
> So, please check what I have in my perf/core branch, I've completely
> removed arch specific stuff from augmented_raw_syscalls.c.
>
> What is done now is use a map to specify what to copy, that same map
> that is used to state which syscalls should be traced.
>
> It uses that tools/perf/arch/arm64/entry/syscalls/mksyscalltbl to figure
> out the mapping of syscall names to ids, just like is done for x86_64
> and other arches, falling back to audit-libs when that syscalltbl thing
> is not present.

Actually I have noticed mksyscalltbl has been enabled for arm64, and
had to say your approach is much better :)

Thanks for the info and I will try your patch at my side.

[...]

Thanks,
Leo Yan

2019-06-06 21:37:24

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] perf augmented_raw_syscalls: Document clang configuration

Em Thu, Jun 06, 2019 at 10:35:32PM +0800, Leo Yan escreveu:
> On Thu, Jun 06, 2019 at 11:08:00AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Jun 06, 2019 at 05:48:45PM +0800, Leo Yan escreveu:
> > > To build this program successfully with clang, there have three
> > > compiler options need to be specified:
> > >
> > > - Header file path: tools/perf/include/bpf;
> > > - Specify architecture;
> > > - Define macro __NR_CPUS__.
> >
> > So, this shouldn't be needed, all of this is supposed to be done
> > automagically, have you done a 'make -C tools/perf install'?
>
> I missed the up operation. But after git pulled the lastest code base
> from perf/core branch and used the command 'make -C tools/perf
> install', I still saw the eBPF build failure.
>
> Just now this issue is fixed after I removed the config
> 'clang-bpf-cmd-template' from ~/.perfconfig; the reason is I followed
> up the Documentation/perf-config.txt to set the config as below:
>
> clang-bpf-cmd-template = "$CLANG_EXEC -D__KERNEL__ $CLANG_OPTIONS \
> $KERNEL_INC_OPTIONS -Wno-unused-value \
> -Wno-pointer-sign -working-directory \
> $WORKING_DIR -c $CLANG_SOURCE -target bpf \
> -O2 -o -"
>
> In fact, util/llvm-utils.c has updated the default configuration as
> below:
>
> #define CLANG_BPF_CMD_DEFAULT_TEMPLATE \
> "$CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS "\
> "-DLINUX_VERSION_CODE=$LINUX_VERSION_CODE " \
> "$CLANG_OPTIONS $PERF_BPF_INC_OPTIONS $KERNEL_INC_OPTIONS " \
> "-Wno-unused-value -Wno-pointer-sign " \
> "-working-directory $WORKING_DIR " \
> "-c \"$CLANG_SOURCE\" -target bpf $CLANG_EMIT_LLVM -O2 -o - $LLVM_OPTIONS_PIPE"
>
> Maybe should update Documentation/perf-config.txt to tell users the
> real default value of clang-bpf-cmd-template?

Sure, if you fell like doing this, please update and also please figure
out when the this changed and add a Fixes: that cset,

Its great that you're going thru the docs and making sure the
differences are noted so that we update the docs, thanks a lot!

- Arnaldo

2019-06-07 10:01:34

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

Hi Arnaldo,

On Thu, Jun 06, 2019 at 11:44:12AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jun 06, 2019 at 10:12:31PM +0800, Leo Yan escreveu:
> > Hi Arnaldo,
> >
> > On Thu, Jun 06, 2019 at 10:38:38AM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Thu, Jun 06, 2019 at 05:48:44PM +0800, Leo Yan escreveu:
> > > > This patch adds support for arm64 raw syscall numbers so that we can use
> > > > it on arm64 platform.
> > > >
> > > > After applied this patch, we need to specify macro -D__aarch64__ or
> > > > -D__x86_64__ in compilation option so Clang can use the corresponding
> > > > syscall numbers for arm64 or x86_64 respectively, other architectures
> > > > will report failure when compilation.
> > >
> > > So, please check what I have in my perf/core branch, I've completely
> > > removed arch specific stuff from augmented_raw_syscalls.c.
> > >
> > > What is done now is use a map to specify what to copy, that same map
> > > that is used to state which syscalls should be traced.
> > >
> > > It uses that tools/perf/arch/arm64/entry/syscalls/mksyscalltbl to figure
> > > out the mapping of syscall names to ids, just like is done for x86_64
> > > and other arches, falling back to audit-libs when that syscalltbl thing
> > > is not present.
> >
> > Actually I have noticed mksyscalltbl has been enabled for arm64, and
> > had to say your approach is much better :)
> >
> > Thanks for the info and I will try your patch at my side.
>
> That is excellent news! I'm eager to hear from you if this perf+BPF
> integration experiment works for arm64.

I tested with the lastest perf/core branch which contains the patch:
'perf augmented_raw_syscalls: Tell which args are filenames and how
many bytes to copy' and got the error as below:

# perf trace -e string -e /mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c
Error: Invalid syscall access, chmod, chown, creat, futimesat, lchown, link, lstat, mkdir, mknod, newfstatat, open, readlink, rename,
rmdir, stat, statfs, symlink, truncate, unlink
Hint: try 'perf list syscalls:sys_enter_*'
Hint: and: 'man syscalls'

So seems mksyscalltbl has not included completely for syscalls, I
use below command to generate syscalltbl_arm64[] array and it don't
include related entries for access, chmod, chown, etc ...

You could refer the generated syscalltbl_arm64 in:
http://paste.ubuntu.com/p/8Bj7Jkm2mP/

> I'm now trying to get past the verifier when checking if more than one
> syscall arg is a filename, i.e. things like the rename* family, that
> take two filenames.
>
> An exercise in loop unrolling, providing the right hints to the
> verifier, making sure clang don't trash those via explicit barriers, and
> a lot of patience, limitless fun! ;-)

Cool! Please feel free let me know if need me to do testing for this.

Thanks,
Leo Yan

2019-06-07 14:40:21

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] perf augmented_raw_syscalls: Document clang configuration

Hi Arnaldo,

On Thu, Jun 06, 2019 at 03:29:41PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jun 06, 2019 at 10:35:32PM +0800, Leo Yan escreveu:
> > On Thu, Jun 06, 2019 at 11:08:00AM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Thu, Jun 06, 2019 at 05:48:45PM +0800, Leo Yan escreveu:
> > > > To build this program successfully with clang, there have three
> > > > compiler options need to be specified:
> > > >
> > > > - Header file path: tools/perf/include/bpf;
> > > > - Specify architecture;
> > > > - Define macro __NR_CPUS__.
> > >
> > > So, this shouldn't be needed, all of this is supposed to be done
> > > automagically, have you done a 'make -C tools/perf install'?
> >
> > I missed the up operation. But after git pulled the lastest code base
> > from perf/core branch and used the command 'make -C tools/perf
> > install', I still saw the eBPF build failure.
> >
> > Just now this issue is fixed after I removed the config
> > 'clang-bpf-cmd-template' from ~/.perfconfig; the reason is I followed
> > up the Documentation/perf-config.txt to set the config as below:
> >
> > clang-bpf-cmd-template = "$CLANG_EXEC -D__KERNEL__ $CLANG_OPTIONS \
> > $KERNEL_INC_OPTIONS -Wno-unused-value \
> > -Wno-pointer-sign -working-directory \
> > $WORKING_DIR -c $CLANG_SOURCE -target bpf \
> > -O2 -o -"
> >
> > In fact, util/llvm-utils.c has updated the default configuration as
> > below:
> >
> > #define CLANG_BPF_CMD_DEFAULT_TEMPLATE \
> > "$CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS "\
> > "-DLINUX_VERSION_CODE=$LINUX_VERSION_CODE " \
> > "$CLANG_OPTIONS $PERF_BPF_INC_OPTIONS $KERNEL_INC_OPTIONS " \
> > "-Wno-unused-value -Wno-pointer-sign " \
> > "-working-directory $WORKING_DIR " \
> > "-c \"$CLANG_SOURCE\" -target bpf $CLANG_EMIT_LLVM -O2 -o - $LLVM_OPTIONS_PIPE"
> >
> > Maybe should update Documentation/perf-config.txt to tell users the
> > real default value of clang-bpf-cmd-template?
>
> Sure, if you fell like doing this, please update and also please figure
> out when the this changed and add a Fixes: that cset,

Thanks for guidance. Have sent patch for this [1].

> Its great that you're going thru the docs and making sure the
> differences are noted so that we update the docs, thanks a lot!

You are welcome!

Thanks,
Leo Yan

[1] https://lkml.org/lkml/2019/6/7/477

2019-06-07 19:02:12

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] perf augmented_raw_syscalls: Document clang configuration

Em Fri, Jun 07, 2019 at 10:38:49PM +0800, Leo Yan escreveu:
> Hi Arnaldo,
>
> On Thu, Jun 06, 2019 at 03:29:41PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Jun 06, 2019 at 10:35:32PM +0800, Leo Yan escreveu:
> > > On Thu, Jun 06, 2019 at 11:08:00AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > Em Thu, Jun 06, 2019 at 05:48:45PM +0800, Leo Yan escreveu:
> > > > > To build this program successfully with clang, there have three
> > > > > compiler options need to be specified:
> > > > >
> > > > > - Header file path: tools/perf/include/bpf;
> > > > > - Specify architecture;
> > > > > - Define macro __NR_CPUS__.
> > > >
> > > > So, this shouldn't be needed, all of this is supposed to be done
> > > > automagically, have you done a 'make -C tools/perf install'?
> > >
> > > I missed the up operation. But after git pulled the lastest code base
> > > from perf/core branch and used the command 'make -C tools/perf
> > > install', I still saw the eBPF build failure.
> > >
> > > Just now this issue is fixed after I removed the config
> > > 'clang-bpf-cmd-template' from ~/.perfconfig; the reason is I followed
> > > up the Documentation/perf-config.txt to set the config as below:
> > >
> > > clang-bpf-cmd-template = "$CLANG_EXEC -D__KERNEL__ $CLANG_OPTIONS \
> > > $KERNEL_INC_OPTIONS -Wno-unused-value \
> > > -Wno-pointer-sign -working-directory \
> > > $WORKING_DIR -c $CLANG_SOURCE -target bpf \
> > > -O2 -o -"
> > >
> > > In fact, util/llvm-utils.c has updated the default configuration as
> > > below:
> > >
> > > #define CLANG_BPF_CMD_DEFAULT_TEMPLATE \
> > > "$CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS "\
> > > "-DLINUX_VERSION_CODE=$LINUX_VERSION_CODE " \
> > > "$CLANG_OPTIONS $PERF_BPF_INC_OPTIONS $KERNEL_INC_OPTIONS " \
> > > "-Wno-unused-value -Wno-pointer-sign " \
> > > "-working-directory $WORKING_DIR " \
> > > "-c \"$CLANG_SOURCE\" -target bpf $CLANG_EMIT_LLVM -O2 -o - $LLVM_OPTIONS_PIPE"
> > >
> > > Maybe should update Documentation/perf-config.txt to tell users the
> > > real default value of clang-bpf-cmd-template?
> >
> > Sure, if you fell like doing this, please update and also please figure
> > out when the this changed and add a Fixes: that cset,
>
> Thanks for guidance. Have sent patch for this [1].

yeah, applied already.

- Arnaldo

> > Its great that you're going thru the docs and making sure the
> > differences are noted so that we update the docs, thanks a lot!
>
> You are welcome!


2019-06-09 13:20:37

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

On Fri, Jun 07, 2019 at 05:58:31PM +0800, Leo Yan wrote:
> Hi Arnaldo,
>
> On Thu, Jun 06, 2019 at 11:44:12AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Jun 06, 2019 at 10:12:31PM +0800, Leo Yan escreveu:
> > > Hi Arnaldo,
> > >
> > > On Thu, Jun 06, 2019 at 10:38:38AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > Em Thu, Jun 06, 2019 at 05:48:44PM +0800, Leo Yan escreveu:
> > > > > This patch adds support for arm64 raw syscall numbers so that we can use
> > > > > it on arm64 platform.
> > > > >
> > > > > After applied this patch, we need to specify macro -D__aarch64__ or
> > > > > -D__x86_64__ in compilation option so Clang can use the corresponding
> > > > > syscall numbers for arm64 or x86_64 respectively, other architectures
> > > > > will report failure when compilation.
> > > >
> > > > So, please check what I have in my perf/core branch, I've completely
> > > > removed arch specific stuff from augmented_raw_syscalls.c.
> > > >
> > > > What is done now is use a map to specify what to copy, that same map
> > > > that is used to state which syscalls should be traced.
> > > >
> > > > It uses that tools/perf/arch/arm64/entry/syscalls/mksyscalltbl to figure
> > > > out the mapping of syscall names to ids, just like is done for x86_64
> > > > and other arches, falling back to audit-libs when that syscalltbl thing
> > > > is not present.
> > >
> > > Actually I have noticed mksyscalltbl has been enabled for arm64, and
> > > had to say your approach is much better :)
> > >
> > > Thanks for the info and I will try your patch at my side.
> >
> > That is excellent news! I'm eager to hear from you if this perf+BPF
> > integration experiment works for arm64.
>
> I tested with the lastest perf/core branch which contains the patch:
> 'perf augmented_raw_syscalls: Tell which args are filenames and how
> many bytes to copy' and got the error as below:
>
> # perf trace -e string -e /mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c
> Error: Invalid syscall access, chmod, chown, creat, futimesat, lchown, link, lstat, mkdir, mknod, newfstatat, open, readlink, rename,
> rmdir, stat, statfs, symlink, truncate, unlink
> Hint: try 'perf list syscalls:sys_enter_*'
> Hint: and: 'man syscalls'
>
> So seems mksyscalltbl has not included completely for syscalls, I
> use below command to generate syscalltbl_arm64[] array and it don't
> include related entries for access, chmod, chown, etc ...
>
> You could refer the generated syscalltbl_arm64 in:
> http://paste.ubuntu.com/p/8Bj7Jkm2mP/

After digging into this issue on Arm64, below is summary info:

- arm64 uses the header include/uapi/linux/unistd.h to define system
call numbers, in this header some system calls are not defined (I
think the reason is these system calls are obsolete at the end) so the
corresponding strings are missed in the array syscalltbl_native,
for arm64 the array is defined in the file:
tools/perf/arch/arm64/include/generated/asm/syscalls.c.

On the other hand, the file tools/perf/trace/strace/groups/string
stores the required system call strings, these system call strings
are based on x86_64 platform but not for arm64, the strings mismatch
with the system call defined in the array syscalltbl_native. This
is the reason why reports the fail: "Error: Invalid syscall access,
chmod, chown, creat, futimesat, lchown, link, lstat, mkdir, mknod,
newfstatat, open, readlink, rename, rmdir, stat, statfs, symlink,
truncate, unlink".

I tried to manually remove these reported strings from
tools/perf/trace/strace/groups/string, then 'perf trace' can work
well.

But I don't know what's a good way to proceed. Seems to me, we can
create a dedicated string file
tools/perf/trace/strace/groups/uapi_string which can be used to
match with system calls definitions in include/uapi/linux/unistd.h.
If there have other more general methods, will be great.

- As a side topic, arm64 also supports aarch32 compat system call
which are defined in header arch/arm64/include/asm/unistd32.h.

For either aarch64 or aarch32 system call, both of them finally will
invoke function el0_svc_common() to handle system call [1]. But so
far we don't distinguish the system call numbers is for aarch64 or
aarch32 and always consider it's aarch64 system call.

I think we can set an extra bit (e.g. use the 16th bit in 32 bits
signed int) to indicate it's a aarch32 compat system call, but not
sure if this is general method or not.

Maybe there have existed solution in other architectures for this,
especially other platforms also should support 32 bits and 64 bits
system calls along with the architecture evoluation, so want to
inquiry firstly to avoid duplicate works.

Thanks a lot for suggestions!
Leo.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/kernel/syscall.c#n93

2019-06-10 18:49:16

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

Em Sun, Jun 09, 2019 at 09:18:49PM +0800, Leo Yan escreveu:
> On Fri, Jun 07, 2019 at 05:58:31PM +0800, Leo Yan wrote:
> > Hi Arnaldo,
> >
> > On Thu, Jun 06, 2019 at 11:44:12AM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Thu, Jun 06, 2019 at 10:12:31PM +0800, Leo Yan escreveu:
> > > > Hi Arnaldo,
> > > >
> > > > On Thu, Jun 06, 2019 at 10:38:38AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > Em Thu, Jun 06, 2019 at 05:48:44PM +0800, Leo Yan escreveu:
> > > > > > This patch adds support for arm64 raw syscall numbers so that we can use
> > > > > > it on arm64 platform.
> > > > > >
> > > > > > After applied this patch, we need to specify macro -D__aarch64__ or
> > > > > > -D__x86_64__ in compilation option so Clang can use the corresponding
> > > > > > syscall numbers for arm64 or x86_64 respectively, other architectures
> > > > > > will report failure when compilation.
> > > > >
> > > > > So, please check what I have in my perf/core branch, I've completely
> > > > > removed arch specific stuff from augmented_raw_syscalls.c.
> > > > >
> > > > > What is done now is use a map to specify what to copy, that same map
> > > > > that is used to state which syscalls should be traced.
> > > > >
> > > > > It uses that tools/perf/arch/arm64/entry/syscalls/mksyscalltbl to figure
> > > > > out the mapping of syscall names to ids, just like is done for x86_64
> > > > > and other arches, falling back to audit-libs when that syscalltbl thing
> > > > > is not present.
> > > >
> > > > Actually I have noticed mksyscalltbl has been enabled for arm64, and
> > > > had to say your approach is much better :)
> > > >
> > > > Thanks for the info and I will try your patch at my side.
> > >
> > > That is excellent news! I'm eager to hear from you if this perf+BPF
> > > integration experiment works for arm64.
> >
> > I tested with the lastest perf/core branch which contains the patch:
> > 'perf augmented_raw_syscalls: Tell which args are filenames and how
> > many bytes to copy' and got the error as below:
> >
> > # perf trace -e string -e /mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c
> > Error: Invalid syscall access, chmod, chown, creat, futimesat, lchown, link, lstat, mkdir, mknod, newfstatat, open, readlink, rename,
> > rmdir, stat, statfs, symlink, truncate, unlink

Humm, I think that we can just make the code that parses the
tools/perf/trace/strace/groups/string file to ignore syscalls it can't
find in the syscall_tbl, i.e. trace those if they exist in the arch.

> > Hint: try 'perf list syscalls:sys_enter_*'
> > Hint: and: 'man syscalls'
> >
> > So seems mksyscalltbl has not included completely for syscalls, I
> > use below command to generate syscalltbl_arm64[] array and it don't
> > include related entries for access, chmod, chown, etc ...

So, we need to investigate why is that these are missing, good thing we
have this 'strings' group :-)

> > You could refer the generated syscalltbl_arm64 in:
> > http://paste.ubuntu.com/p/8Bj7Jkm2mP/
>
> After digging into this issue on Arm64, below is summary info:
>
> - arm64 uses the header include/uapi/linux/unistd.h to define system
> call numbers, in this header some system calls are not defined (I
> think the reason is these system calls are obsolete at the end) so the
> corresponding strings are missed in the array syscalltbl_native,
> for arm64 the array is defined in the file:
> tools/perf/arch/arm64/include/generated/asm/syscalls.c.

Yeah, I looked at the 'access' case and indeed it is not present in
include/uapi/asm-generic/unistd.h, which is the place
include/uapi/linux/unistd.h ends up.

Ok please take a look at the patch at the end of this message, should be ok?

I tested it by changing the strace/gorups/string file to have a few
unknown syscalls, running it with -v we see:

[root@quaco perf]# perf trace -v -e string ls
Skipping unknown syscalls: access99, acct99, add_key99
<SNIP other verbose messages>
normal operation not considering those unknown syscalls.

> On the other hand, the file tools/perf/trace/strace/groups/string
> stores the required system call strings, these system call strings
> are based on x86_64 platform but not for arm64, the strings mismatch
> with the system call defined in the array syscalltbl_native. This
> is the reason why reports the fail: "Error: Invalid syscall access,
> chmod, chown, creat, futimesat, lchown, link, lstat, mkdir, mknod,
> newfstatat, open, readlink, rename, rmdir, stat, statfs, symlink,
> truncate, unlink".
>
> I tried to manually remove these reported strings from
> tools/perf/trace/strace/groups/string, then 'perf trace' can work
> well.
>
> But I don't know what's a good way to proceed. Seems to me, we can
> create a dedicated string file
> tools/perf/trace/strace/groups/uapi_string which can be used to
> match with system calls definitions in include/uapi/linux/unistd.h.
> If there have other more general methods, will be great.

> - As a side topic, arm64 also supports aarch32 compat system call
> which are defined in header arch/arm64/include/asm/unistd32.h.
>
> For either aarch64 or aarch32 system call, both of them finally will
> invoke function el0_svc_common() to handle system call [1]. But so
> far we don't distinguish the system call numbers is for aarch64 or
> aarch32 and always consider it's aarch64 system call.
>
> I think we can set an extra bit (e.g. use the 16th bit in 32 bits
> signed int) to indicate it's a aarch32 compat system call, but not
> sure if this is general method or not.

compat syscalls were not supported because of limitations in the
raw_syscalls:sys_{enter,exit} tracepoints I can't recall from the top of
my head right now, I think that looking at that code
(raw_syscalls:sys_{enter,exit}) git log history may provide the
explanation.

> Maybe there have existed solution in other architectures for this,
> especially other platforms also should support 32 bits and 64 bits
> system calls along with the architecture evoluation, so want to
> inquiry firstly to avoid duplicate works.
>
> Thanks a lot for suggestions!
> Leo.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/kernel/syscall.c#n93

- Arnaldo


commit e0b34a78c4ed0a6422f5b2dafa0c8936e537ee41
Author: Arnaldo Carvalho de Melo <[email protected]>
Date: Mon Jun 10 15:37:45 2019 -0300

perf trace: Skip unknown syscalls when expanding strace like syscall groups

We have $INSTALL_DIR/share/perf-core/strace/groups/string files with
syscalls that should be selected when 'string' is used, meaning, in this
case, syscalls that receive as one of its arguments a string, like a
pathname.

But those were first selected and tested on x86_64, and end up failing
in architectures where some of those syscalls are not available, like
the 'access' syscall on arm64, which makes using 'perf trace -e string'
in such archs to fail.

Since this the routine doing the validation is used only when reading
such files, do not fail when some syscall is not found in the
syscalltbl, instead just use pr_debug() to register that in case people
are suspicious of problems.

Now using 'perf trace -e string' should work on arm64, selecting only
the syscalls that have a string and are available on that architecture.

Reported-by: Leo Yan <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Cc: Yonghong Song <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 1a2a605cf068..eb70a4b71755 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1529,6 +1529,7 @@ static int trace__read_syscall_info(struct trace *trace, int id)
static int trace__validate_ev_qualifier(struct trace *trace)
{
int err = 0, i;
+ bool printed_invalid_prefix = false;
size_t nr_allocated;
struct str_node *pos;

@@ -1555,14 +1556,15 @@ static int trace__validate_ev_qualifier(struct trace *trace)
if (id >= 0)
goto matches;

- if (err == 0) {
- fputs("Error:\tInvalid syscall ", trace->output);
- err = -EINVAL;
+ if (!printed_invalid_prefix) {
+ pr_debug("Skipping unknown syscalls: ");
+ printed_invalid_prefix = true;
} else {
- fputs(", ", trace->output);
+ pr_debug(", ");
}

- fputs(sc, trace->output);
+ pr_debug("%s", sc);
+ continue;
}
matches:
trace->ev_qualifier_ids.entries[i++] = id;
@@ -1591,15 +1593,14 @@ static int trace__validate_ev_qualifier(struct trace *trace)
}
}

- if (err < 0) {
- fputs("\nHint:\ttry 'perf list syscalls:sys_enter_*'"
- "\nHint:\tand: 'man syscalls'\n", trace->output);
-out_free:
- zfree(&trace->ev_qualifier_ids.entries);
- trace->ev_qualifier_ids.nr = 0;
- }
out:
+ if (printed_invalid_prefix)
+ pr_debug("\n");
return err;
+out_free:
+ zfree(&trace->ev_qualifier_ids.entries);
+ trace->ev_qualifier_ids.nr = 0;
+ goto out;
}

/*

2019-06-11 04:19:07

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

On Mon, Jun 10, 2019 at 03:47:54PM -0300, Arnaldo Carvalho de Melo wrote:

[...]

> > > I tested with the lastest perf/core branch which contains the patch:
> > > 'perf augmented_raw_syscalls: Tell which args are filenames and how
> > > many bytes to copy' and got the error as below:
> > >
> > > # perf trace -e string -e /mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c
> > > Error: Invalid syscall access, chmod, chown, creat, futimesat, lchown, link, lstat, mkdir, mknod, newfstatat, open, readlink, rename,
> > > rmdir, stat, statfs, symlink, truncate, unlink
>
> Humm, I think that we can just make the code that parses the
> tools/perf/trace/strace/groups/string file to ignore syscalls it can't
> find in the syscall_tbl, i.e. trace those if they exist in the arch.

Agree.

> > > Hint: try 'perf list syscalls:sys_enter_*'
> > > Hint: and: 'man syscalls'
> > >
> > > So seems mksyscalltbl has not included completely for syscalls, I
> > > use below command to generate syscalltbl_arm64[] array and it don't
> > > include related entries for access, chmod, chown, etc ...
>
> So, we need to investigate why is that these are missing, good thing we
> have this 'strings' group :-)
>
> > > You could refer the generated syscalltbl_arm64 in:
> > > http://paste.ubuntu.com/p/8Bj7Jkm2mP/
> >
> > After digging into this issue on Arm64, below is summary info:
> >
> > - arm64 uses the header include/uapi/linux/unistd.h to define system
> > call numbers, in this header some system calls are not defined (I
> > think the reason is these system calls are obsolete at the end) so the
> > corresponding strings are missed in the array syscalltbl_native,
> > for arm64 the array is defined in the file:
> > tools/perf/arch/arm64/include/generated/asm/syscalls.c.
>
> Yeah, I looked at the 'access' case and indeed it is not present in
> include/uapi/asm-generic/unistd.h, which is the place
> include/uapi/linux/unistd.h ends up.
>
> Ok please take a look at the patch at the end of this message, should be ok?
>
> I tested it by changing the strace/gorups/string file to have a few
> unknown syscalls, running it with -v we see:
>
> [root@quaco perf]# perf trace -v -e string ls
> Skipping unknown syscalls: access99, acct99, add_key99
> <SNIP other verbose messages>
> normal operation not considering those unknown syscalls.

I did testing with the patch, but it failed after I added eBPF event
with below command, I even saw segmentation fault; please see below
inline comments.

perf --debug verbose=10 trace -e string -e \
/mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c

[...]

> commit e0b34a78c4ed0a6422f5b2dafa0c8936e537ee41
> Author: Arnaldo Carvalho de Melo <[email protected]>
> Date: Mon Jun 10 15:37:45 2019 -0300
>
> perf trace: Skip unknown syscalls when expanding strace like syscall groups
>
> We have $INSTALL_DIR/share/perf-core/strace/groups/string files with
> syscalls that should be selected when 'string' is used, meaning, in this
> case, syscalls that receive as one of its arguments a string, like a
> pathname.
>
> But those were first selected and tested on x86_64, and end up failing
> in architectures where some of those syscalls are not available, like
> the 'access' syscall on arm64, which makes using 'perf trace -e string'
> in such archs to fail.
>
> Since this the routine doing the validation is used only when reading
> such files, do not fail when some syscall is not found in the
> syscalltbl, instead just use pr_debug() to register that in case people
> are suspicious of problems.
>
> Now using 'perf trace -e string' should work on arm64, selecting only
> the syscalls that have a string and are available on that architecture.
>
> Reported-by: Leo Yan <[email protected]>
> Cc: Adrian Hunter <[email protected]>
> Cc: Alexander Shishkin <[email protected]>
> Cc: Alexei Starovoitov <[email protected]>
> Cc: Daniel Borkmann <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Martin KaFai Lau <[email protected]>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Mike Leach <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Song Liu <[email protected]>
> Cc: Suzuki K Poulose <[email protected]>
> Cc: Yonghong Song <[email protected]>
> Link: https://lkml.kernel.org/n/[email protected]
> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
>
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 1a2a605cf068..eb70a4b71755 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -1529,6 +1529,7 @@ static int trace__read_syscall_info(struct trace *trace, int id)
> static int trace__validate_ev_qualifier(struct trace *trace)
> {
> int err = 0, i;
> + bool printed_invalid_prefix = false;
> size_t nr_allocated;
> struct str_node *pos;
>
> @@ -1555,14 +1556,15 @@ static int trace__validate_ev_qualifier(struct trace *trace)
> if (id >= 0)
> goto matches;
>
> - if (err == 0) {
> - fputs("Error:\tInvalid syscall ", trace->output);
> - err = -EINVAL;
> + if (!printed_invalid_prefix) {
> + pr_debug("Skipping unknown syscalls: ");
> + printed_invalid_prefix = true;
> } else {
> - fputs(", ", trace->output);
> + pr_debug(", ");
> }
>
> - fputs(sc, trace->output);
> + pr_debug("%s", sc);
> + continue;

Here adds 'continue' so that we want to let ev_qualifier_ids.entries
to only store valid system call ids. But this is not sufficient,
because we have initialized ev_qualifier_ids.nr at the beginning of
the function:

trace->ev_qualifier_ids.nr = strlist__nr_entries(trace->ev_qualifier);

This sentence will set ids number to the string table's length; but
actually some strings are not really supported; this leads to some
items in trace->ev_qualifier_ids.entries[] will be not initialized
properly.

If we want to get neat entries and entry number, I suggest at the
beginning of the function we use variable 'nr_allocated' to store
string table length and use it to allocate entries:

nr_allocated = strlist__nr_entries(trace->ev_qualifier);
trace->ev_qualifier_ids.entries = malloc(nr_allocated *
sizeof(trace->ev_qualifier_ids.entries[0]));

If we find any matched string, then increment the nr field under
'matches' tag:

matches:
trace->ev_qualifier_ids.nr++;
trace->ev_qualifier_ids.entries[i++] = id;

This can ensure the entries[0..nr-1] has valid id and we can use
ev_qualifier_ids.nr to maintain the valid system call numbers.

> }
> matches:
> trace->ev_qualifier_ids.entries[i++] = id;
> @@ -1591,15 +1593,14 @@ static int trace__validate_ev_qualifier(struct trace *trace)
> }
> }
>
> - if (err < 0) {
> - fputs("\nHint:\ttry 'perf list syscalls:sys_enter_*'"
> - "\nHint:\tand: 'man syscalls'\n", trace->output);
> -out_free:
> - zfree(&trace->ev_qualifier_ids.entries);
> - trace->ev_qualifier_ids.nr = 0;
> - }
> out:
> + if (printed_invalid_prefix)
> + pr_debug("\n");
> return err;
> +out_free:
> + zfree(&trace->ev_qualifier_ids.entries);
> + trace->ev_qualifier_ids.nr = 0;
> + goto out;

Nitpick: directly return err and 'goto out' is not necessary.

Thanks,
Leo Yan

> }
>
> /*

2019-06-12 07:41:44

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

Em Tue, Jun 11, 2019 at 12:18:31PM +0800, Leo Yan escreveu:
> On Mon, Jun 10, 2019 at 03:47:54PM -0300, Arnaldo Carvalho de Melo wrote:
>
> [...]
>
> > > > I tested with the lastest perf/core branch which contains the patch:
> > > > 'perf augmented_raw_syscalls: Tell which args are filenames and how
> > > > many bytes to copy' and got the error as below:
> > > >
> > > > # perf trace -e string -e /mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c
> > > > Error: Invalid syscall access, chmod, chown, creat, futimesat, lchown, link, lstat, mkdir, mknod, newfstatat, open, readlink, rename,
> > > > rmdir, stat, statfs, symlink, truncate, unlink
> >
> > Humm, I think that we can just make the code that parses the
> > tools/perf/trace/strace/groups/string file to ignore syscalls it can't
> > find in the syscall_tbl, i.e. trace those if they exist in the arch.
>
> Agree.
>
> > > > Hint: try 'perf list syscalls:sys_enter_*'
> > > > Hint: and: 'man syscalls'
> > > >
> > > > So seems mksyscalltbl has not included completely for syscalls, I
> > > > use below command to generate syscalltbl_arm64[] array and it don't
> > > > include related entries for access, chmod, chown, etc ...
> >
> > So, we need to investigate why is that these are missing, good thing we
> > have this 'strings' group :-)
> >
> > > > You could refer the generated syscalltbl_arm64 in:
> > > > http://paste.ubuntu.com/p/8Bj7Jkm2mP/
> > >
> > > After digging into this issue on Arm64, below is summary info:
> > >
> > > - arm64 uses the header include/uapi/linux/unistd.h to define system
> > > call numbers, in this header some system calls are not defined (I
> > > think the reason is these system calls are obsolete at the end) so the
> > > corresponding strings are missed in the array syscalltbl_native,
> > > for arm64 the array is defined in the file:
> > > tools/perf/arch/arm64/include/generated/asm/syscalls.c.
> >
> > Yeah, I looked at the 'access' case and indeed it is not present in
> > include/uapi/asm-generic/unistd.h, which is the place
> > include/uapi/linux/unistd.h ends up.
> >
> > Ok please take a look at the patch at the end of this message, should be ok?
> >
> > I tested it by changing the strace/gorups/string file to have a few
> > unknown syscalls, running it with -v we see:
> >
> > [root@quaco perf]# perf trace -v -e string ls
> > Skipping unknown syscalls: access99, acct99, add_key99
> > <SNIP other verbose messages>
> > normal operation not considering those unknown syscalls.
>
> I did testing with the patch, but it failed after I added eBPF event
> with below command, I even saw segmentation fault; please see below
> inline comments.
>
> perf --debug verbose=10 trace -e string -e \
> /mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c
>
> [...]
>
> > commit e0b34a78c4ed0a6422f5b2dafa0c8936e537ee41
> > Author: Arnaldo Carvalho de Melo <[email protected]>
> > Date: Mon Jun 10 15:37:45 2019 -0300
> >
> > perf trace: Skip unknown syscalls when expanding strace like syscall groups
> >
> > We have $INSTALL_DIR/share/perf-core/strace/groups/string files with
> > syscalls that should be selected when 'string' is used, meaning, in this
> > case, syscalls that receive as one of its arguments a string, like a
> > pathname.
> >
> > But those were first selected and tested on x86_64, and end up failing
> > in architectures where some of those syscalls are not available, like
> > the 'access' syscall on arm64, which makes using 'perf trace -e string'
> > in such archs to fail.
> >
> > Since this the routine doing the validation is used only when reading
> > such files, do not fail when some syscall is not found in the
> > syscalltbl, instead just use pr_debug() to register that in case people
> > are suspicious of problems.
> >
> > Now using 'perf trace -e string' should work on arm64, selecting only
> > the syscalls that have a string and are available on that architecture.
> >
> > Reported-by: Leo Yan <[email protected]>
> > Cc: Adrian Hunter <[email protected]>
> > Cc: Alexander Shishkin <[email protected]>
> > Cc: Alexei Starovoitov <[email protected]>
> > Cc: Daniel Borkmann <[email protected]>
> > Cc: Jiri Olsa <[email protected]>
> > Cc: Martin KaFai Lau <[email protected]>
> > Cc: Mathieu Poirier <[email protected]>
> > Cc: Mike Leach <[email protected]>
> > Cc: Namhyung Kim <[email protected]>
> > Cc: Song Liu <[email protected]>
> > Cc: Suzuki K Poulose <[email protected]>
> > Cc: Yonghong Song <[email protected]>
> > Link: https://lkml.kernel.org/n/[email protected]
> > Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
> >
> > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> > index 1a2a605cf068..eb70a4b71755 100644
> > --- a/tools/perf/builtin-trace.c
> > +++ b/tools/perf/builtin-trace.c
> > @@ -1529,6 +1529,7 @@ static int trace__read_syscall_info(struct trace *trace, int id)
> > static int trace__validate_ev_qualifier(struct trace *trace)
> > {
> > int err = 0, i;
> > + bool printed_invalid_prefix = false;
> > size_t nr_allocated;
> > struct str_node *pos;
> >
> > @@ -1555,14 +1556,15 @@ static int trace__validate_ev_qualifier(struct trace *trace)
> > if (id >= 0)
> > goto matches;
> >
> > - if (err == 0) {
> > - fputs("Error:\tInvalid syscall ", trace->output);
> > - err = -EINVAL;
> > + if (!printed_invalid_prefix) {
> > + pr_debug("Skipping unknown syscalls: ");
> > + printed_invalid_prefix = true;
> > } else {
> > - fputs(", ", trace->output);
> > + pr_debug(", ");
> > }
> >
> > - fputs(sc, trace->output);
> > + pr_debug("%s", sc);
> > + continue;
>
> Here adds 'continue' so that we want to let ev_qualifier_ids.entries
> to only store valid system call ids. But this is not sufficient,
> because we have initialized ev_qualifier_ids.nr at the beginning of
> the function:
>
> trace->ev_qualifier_ids.nr = strlist__nr_entries(trace->ev_qualifier);
> This sentence will set ids number to the string table's length; but
> actually some strings are not really supported; this leads to some
> items in trace->ev_qualifier_ids.entries[] will be not initialized
> properly.
>
> If we want to get neat entries and entry number, I suggest at the
> beginning of the function we use variable 'nr_allocated' to store
> string table length and use it to allocate entries:
>
> nr_allocated = strlist__nr_entries(trace->ev_qualifier);
> trace->ev_qualifier_ids.entries = malloc(nr_allocated *
> sizeof(trace->ev_qualifier_ids.entries[0]));
>
> If we find any matched string, then increment the nr field under
> 'matches' tag:
>
> matches:
> trace->ev_qualifier_ids.nr++;
> trace->ev_qualifier_ids.entries[i++] = id;
>
> This can ensure the entries[0..nr-1] has valid id and we can use
> ev_qualifier_ids.nr to maintain the valid system call numbers.

yeah, you're right, I'll address these issues in a followup patch,
tomorrow.

- Arnaldo

>
> > }
> > matches:
> > trace->ev_qualifier_ids.entries[i++] = id;
> > @@ -1591,15 +1593,14 @@ static int trace__validate_ev_qualifier(struct trace *trace)
> > }
> > }
> >
> > - if (err < 0) {
> > - fputs("\nHint:\ttry 'perf list syscalls:sys_enter_*'"
> > - "\nHint:\tand: 'man syscalls'\n", trace->output);
> > -out_free:
> > - zfree(&trace->ev_qualifier_ids.entries);
> > - trace->ev_qualifier_ids.nr = 0;
> > - }
> > out:
> > + if (printed_invalid_prefix)
> > + pr_debug("\n");
> > return err;
> > +out_free:
> > + zfree(&trace->ev_qualifier_ids.entries);
> > + trace->ev_qualifier_ids.nr = 0;
> > + goto out;
>
> Nitpick: directly return err and 'goto out' is not necessary.
>
> Thanks,
> Leo Yan
>
> > }
> >
> > /*

--

- Arnaldo

2019-06-13 20:43:02

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

Em Tue, Jun 11, 2019 at 11:49:17PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Jun 11, 2019 at 12:18:31PM +0800, Leo Yan escreveu:
> > On Mon, Jun 10, 2019 at 03:47:54PM -0300, Arnaldo Carvalho de Melo wrote:
> >
> > [...]
> >
> > > > > I tested with the lastest perf/core branch which contains the patch:
> > > > > 'perf augmented_raw_syscalls: Tell which args are filenames and how
> > > > > many bytes to copy' and got the error as below:
> > > > >
> > > > > # perf trace -e string -e /mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c
> > > > > Error: Invalid syscall access, chmod, chown, creat, futimesat, lchown, link, lstat, mkdir, mknod, newfstatat, open, readlink, rename,
> > > > > rmdir, stat, statfs, symlink, truncate, unlink
> > >
> > > Humm, I think that we can just make the code that parses the
> > > tools/perf/trace/strace/groups/string file to ignore syscalls it can't
> > > find in the syscall_tbl, i.e. trace those if they exist in the arch.
> >
> > Agree.
> >
> > > > > Hint: try 'perf list syscalls:sys_enter_*'
> > > > > Hint: and: 'man syscalls'
> > > > >
> > > > > So seems mksyscalltbl has not included completely for syscalls, I
> > > > > use below command to generate syscalltbl_arm64[] array and it don't
> > > > > include related entries for access, chmod, chown, etc ...
> > >
> > > So, we need to investigate why is that these are missing, good thing we
> > > have this 'strings' group :-)
> > >
> > > > > You could refer the generated syscalltbl_arm64 in:
> > > > > http://paste.ubuntu.com/p/8Bj7Jkm2mP/
> > > >
> > > > After digging into this issue on Arm64, below is summary info:
> > > >
> > > > - arm64 uses the header include/uapi/linux/unistd.h to define system
> > > > call numbers, in this header some system calls are not defined (I
> > > > think the reason is these system calls are obsolete at the end) so the
> > > > corresponding strings are missed in the array syscalltbl_native,
> > > > for arm64 the array is defined in the file:
> > > > tools/perf/arch/arm64/include/generated/asm/syscalls.c.
> > >
> > > Yeah, I looked at the 'access' case and indeed it is not present in
> > > include/uapi/asm-generic/unistd.h, which is the place
> > > include/uapi/linux/unistd.h ends up.
> > >
> > > Ok please take a look at the patch at the end of this message, should be ok?
> > >
> > > I tested it by changing the strace/gorups/string file to have a few
> > > unknown syscalls, running it with -v we see:
> > >
> > > [root@quaco perf]# perf trace -v -e string ls
> > > Skipping unknown syscalls: access99, acct99, add_key99
> > > <SNIP other verbose messages>
> > > normal operation not considering those unknown syscalls.
> >
> > I did testing with the patch, but it failed after I added eBPF event
> > with below command, I even saw segmentation fault; please see below
> > inline comments.
> >
> > perf --debug verbose=10 trace -e string -e \
> > /mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c
> >
> > [...]
> >
> > > commit e0b34a78c4ed0a6422f5b2dafa0c8936e537ee41
> > > Author: Arnaldo Carvalho de Melo <[email protected]>
> > > Date: Mon Jun 10 15:37:45 2019 -0300
> > >
> > > perf trace: Skip unknown syscalls when expanding strace like syscall groups
> > >
> > > We have $INSTALL_DIR/share/perf-core/strace/groups/string files with
> > > syscalls that should be selected when 'string' is used, meaning, in this
> > > case, syscalls that receive as one of its arguments a string, like a
> > > pathname.
> > >
> > > But those were first selected and tested on x86_64, and end up failing
> > > in architectures where some of those syscalls are not available, like
> > > the 'access' syscall on arm64, which makes using 'perf trace -e string'
> > > in such archs to fail.
> > >
> > > Since this the routine doing the validation is used only when reading
> > > such files, do not fail when some syscall is not found in the
> > > syscalltbl, instead just use pr_debug() to register that in case people
> > > are suspicious of problems.
> > >
> > > Now using 'perf trace -e string' should work on arm64, selecting only
> > > the syscalls that have a string and are available on that architecture.
> > >
> > > Reported-by: Leo Yan <[email protected]>
> > > Cc: Adrian Hunter <[email protected]>
> > > Cc: Alexander Shishkin <[email protected]>
> > > Cc: Alexei Starovoitov <[email protected]>
> > > Cc: Daniel Borkmann <[email protected]>
> > > Cc: Jiri Olsa <[email protected]>
> > > Cc: Martin KaFai Lau <[email protected]>
> > > Cc: Mathieu Poirier <[email protected]>
> > > Cc: Mike Leach <[email protected]>
> > > Cc: Namhyung Kim <[email protected]>
> > > Cc: Song Liu <[email protected]>
> > > Cc: Suzuki K Poulose <[email protected]>
> > > Cc: Yonghong Song <[email protected]>
> > > Link: https://lkml.kernel.org/n/[email protected]
> > > Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
> > >
> > > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> > > index 1a2a605cf068..eb70a4b71755 100644
> > > --- a/tools/perf/builtin-trace.c
> > > +++ b/tools/perf/builtin-trace.c
> > > @@ -1529,6 +1529,7 @@ static int trace__read_syscall_info(struct trace *trace, int id)
> > > static int trace__validate_ev_qualifier(struct trace *trace)
> > > {
> > > int err = 0, i;
> > > + bool printed_invalid_prefix = false;
> > > size_t nr_allocated;
> > > struct str_node *pos;
> > >
> > > @@ -1555,14 +1556,15 @@ static int trace__validate_ev_qualifier(struct trace *trace)
> > > if (id >= 0)
> > > goto matches;
> > >
> > > - if (err == 0) {
> > > - fputs("Error:\tInvalid syscall ", trace->output);
> > > - err = -EINVAL;
> > > + if (!printed_invalid_prefix) {
> > > + pr_debug("Skipping unknown syscalls: ");
> > > + printed_invalid_prefix = true;
> > > } else {
> > > - fputs(", ", trace->output);
> > > + pr_debug(", ");
> > > }
> > >
> > > - fputs(sc, trace->output);
> > > + pr_debug("%s", sc);
> > > + continue;
> >
> > Here adds 'continue' so that we want to let ev_qualifier_ids.entries
> > to only store valid system call ids. But this is not sufficient,
> > because we have initialized ev_qualifier_ids.nr at the beginning of
> > the function:
> >
> > trace->ev_qualifier_ids.nr = strlist__nr_entries(trace->ev_qualifier);
> > This sentence will set ids number to the string table's length; but
> > actually some strings are not really supported; this leads to some
> > items in trace->ev_qualifier_ids.entries[] will be not initialized
> > properly.
> >
> > If we want to get neat entries and entry number, I suggest at the
> > beginning of the function we use variable 'nr_allocated' to store
> > string table length and use it to allocate entries:
> >
> > nr_allocated = strlist__nr_entries(trace->ev_qualifier);
> > trace->ev_qualifier_ids.entries = malloc(nr_allocated *
> > sizeof(trace->ev_qualifier_ids.entries[0]));
> >
> > If we find any matched string, then increment the nr field under
> > 'matches' tag:
> >
> > matches:
> > trace->ev_qualifier_ids.nr++;
> > trace->ev_qualifier_ids.entries[i++] = id;
> >
> > This can ensure the entries[0..nr-1] has valid id and we can use
> > ev_qualifier_ids.nr to maintain the valid system call numbers.
>
> yeah, you're right, I'll address these issues in a followup patch,
> tomorrow.

This is equivalent and I think the smallest patch, I'll add one on top
doing what you suggested about nr_allocated getting the
strlist__nr_entries() and also will rename i to nr_used to contrast with
nr_allocated, and then at the end set ev_qualifier_ids.nr to nr_used.

- Arnaldo

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index eb70a4b71755..bd1f00e7a2eb 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1528,9 +1528,9 @@ static int trace__read_syscall_info(struct trace *trace, int id)

static int trace__validate_ev_qualifier(struct trace *trace)
{
- int err = 0, i;
+ int err = 0;
bool printed_invalid_prefix = false;
- size_t nr_allocated;
+ size_t nr_allocated, i;
struct str_node *pos;

trace->ev_qualifier_ids.nr = strlist__nr_entries(trace->ev_qualifier);
@@ -1575,7 +1575,7 @@ static int trace__validate_ev_qualifier(struct trace *trace)
id = syscalltbl__strglobmatch_next(trace->sctbl, sc, &match_next);
if (id < 0)
break;
- if (nr_allocated == trace->ev_qualifier_ids.nr) {
+ if (nr_allocated == i) {
void *entries;

nr_allocated += 8;
@@ -1588,11 +1588,11 @@ static int trace__validate_ev_qualifier(struct trace *trace)
}
trace->ev_qualifier_ids.entries = entries;
}
- trace->ev_qualifier_ids.nr++;
trace->ev_qualifier_ids.entries[i++] = id;
}
}

+ trace->ev_qualifier_ids.nr = i;
out:
if (printed_invalid_prefix)
pr_debug("\n");

2019-06-15 05:53:25

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

On Thu, Jun 13, 2019 at 03:15:14PM -0300, Arnaldo Carvalho de Melo wrote:

[...]

> > > > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> > > > index 1a2a605cf068..eb70a4b71755 100644
> > > > --- a/tools/perf/builtin-trace.c
> > > > +++ b/tools/perf/builtin-trace.c
> > > > @@ -1529,6 +1529,7 @@ static int trace__read_syscall_info(struct trace *trace, int id)
> > > > static int trace__validate_ev_qualifier(struct trace *trace)
> > > > {
> > > > int err = 0, i;
> > > > + bool printed_invalid_prefix = false;
> > > > size_t nr_allocated;
> > > > struct str_node *pos;
> > > >
> > > > @@ -1555,14 +1556,15 @@ static int trace__validate_ev_qualifier(struct trace *trace)
> > > > if (id >= 0)
> > > > goto matches;
> > > >
> > > > - if (err == 0) {
> > > > - fputs("Error:\tInvalid syscall ", trace->output);
> > > > - err = -EINVAL;
> > > > + if (!printed_invalid_prefix) {
> > > > + pr_debug("Skipping unknown syscalls: ");
> > > > + printed_invalid_prefix = true;
> > > > } else {
> > > > - fputs(", ", trace->output);
> > > > + pr_debug(", ");
> > > > }
> > > >
> > > > - fputs(sc, trace->output);
> > > > + pr_debug("%s", sc);
> > > > + continue;
> > >
> > > Here adds 'continue' so that we want to let ev_qualifier_ids.entries
> > > to only store valid system call ids. But this is not sufficient,
> > > because we have initialized ev_qualifier_ids.nr at the beginning of
> > > the function:
> > >
> > > trace->ev_qualifier_ids.nr = strlist__nr_entries(trace->ev_qualifier);
> > > This sentence will set ids number to the string table's length; but
> > > actually some strings are not really supported; this leads to some
> > > items in trace->ev_qualifier_ids.entries[] will be not initialized
> > > properly.
> > >
> > > If we want to get neat entries and entry number, I suggest at the
> > > beginning of the function we use variable 'nr_allocated' to store
> > > string table length and use it to allocate entries:
> > >
> > > nr_allocated = strlist__nr_entries(trace->ev_qualifier);
> > > trace->ev_qualifier_ids.entries = malloc(nr_allocated *
> > > sizeof(trace->ev_qualifier_ids.entries[0]));
> > >
> > > If we find any matched string, then increment the nr field under
> > > 'matches' tag:
> > >
> > > matches:
> > > trace->ev_qualifier_ids.nr++;
> > > trace->ev_qualifier_ids.entries[i++] = id;
> > >
> > > This can ensure the entries[0..nr-1] has valid id and we can use
> > > ev_qualifier_ids.nr to maintain the valid system call numbers.
> >
> > yeah, you're right, I'll address these issues in a followup patch,
> > tomorrow.
>
> This is equivalent and I think the smallest patch, I'll add one on top
> doing what you suggested about nr_allocated getting the
> strlist__nr_entries() and also will rename i to nr_used to contrast with
> nr_allocated, and then at the end set ev_qualifier_ids.nr to nr_used.

Thanks for this patch, I tested below changes and 'perf trace' works
well. You could add my test tag:

Tested-by: Leo Yan <[email protected]>

> - Arnaldo
>
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index eb70a4b71755..bd1f00e7a2eb 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -1528,9 +1528,9 @@ static int trace__read_syscall_info(struct trace *trace, int id)
>
> static int trace__validate_ev_qualifier(struct trace *trace)
> {
> - int err = 0, i;
> + int err = 0;
> bool printed_invalid_prefix = false;
> - size_t nr_allocated;
> + size_t nr_allocated, i;
> struct str_node *pos;
>
> trace->ev_qualifier_ids.nr = strlist__nr_entries(trace->ev_qualifier);
> @@ -1575,7 +1575,7 @@ static int trace__validate_ev_qualifier(struct trace *trace)
> id = syscalltbl__strglobmatch_next(trace->sctbl, sc, &match_next);
> if (id < 0)
> break;
> - if (nr_allocated == trace->ev_qualifier_ids.nr) {
> + if (nr_allocated == i) {
> void *entries;
>
> nr_allocated += 8;
> @@ -1588,11 +1588,11 @@ static int trace__validate_ev_qualifier(struct trace *trace)
> }
> trace->ev_qualifier_ids.entries = entries;
> }
> - trace->ev_qualifier_ids.nr++;
> trace->ev_qualifier_ids.entries[i++] = id;
> }
> }
>
> + trace->ev_qualifier_ids.nr = i;
> out:
> if (printed_invalid_prefix)
> pr_debug("\n");

Subject: [tip:perf/core] perf trace: Skip unknown syscalls when expanding strace like syscall groups

Commit-ID: 04c41bcb862bbec1fb225243ecf07a3219593f81
Gitweb: https://git.kernel.org/tip/04c41bcb862bbec1fb225243ecf07a3219593f81
Author: Arnaldo Carvalho de Melo <[email protected]>
AuthorDate: Mon, 10 Jun 2019 15:37:45 -0300
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Mon, 10 Jun 2019 17:50:04 -0300

perf trace: Skip unknown syscalls when expanding strace like syscall groups

We have $INSTALL_DIR/share/perf-core/strace/groups/string files with
syscalls that should be selected when 'string' is used, meaning, in this
case, syscalls that receive as one of its arguments a string, like a
pathname.

But those were first selected and tested on x86_64, and end up failing
in architectures where some of those syscalls are not available, like
the 'access' syscall on arm64, which makes using 'perf trace -e string'
in such archs to fail.

Since this the routine doing the validation is used only when reading
such files, do not fail when some syscall is not found in the
syscalltbl, instead just use pr_debug() to register that in case people
are suspicious of problems.

Now using 'perf trace -e string' should work on arm64, selecting only
the syscalls that have a string and are available on that architecture.

Reported-by: Leo Yan <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Cc: Yonghong Song <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-trace.c | 25 +++++++++++++------------
1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 1a2a605cf068..eb70a4b71755 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1529,6 +1529,7 @@ static int trace__read_syscall_info(struct trace *trace, int id)
static int trace__validate_ev_qualifier(struct trace *trace)
{
int err = 0, i;
+ bool printed_invalid_prefix = false;
size_t nr_allocated;
struct str_node *pos;

@@ -1555,14 +1556,15 @@ static int trace__validate_ev_qualifier(struct trace *trace)
if (id >= 0)
goto matches;

- if (err == 0) {
- fputs("Error:\tInvalid syscall ", trace->output);
- err = -EINVAL;
+ if (!printed_invalid_prefix) {
+ pr_debug("Skipping unknown syscalls: ");
+ printed_invalid_prefix = true;
} else {
- fputs(", ", trace->output);
+ pr_debug(", ");
}

- fputs(sc, trace->output);
+ pr_debug("%s", sc);
+ continue;
}
matches:
trace->ev_qualifier_ids.entries[i++] = id;
@@ -1591,15 +1593,14 @@ matches:
}
}

- if (err < 0) {
- fputs("\nHint:\ttry 'perf list syscalls:sys_enter_*'"
- "\nHint:\tand: 'man syscalls'\n", trace->output);
-out_free:
- zfree(&trace->ev_qualifier_ids.entries);
- trace->ev_qualifier_ids.nr = 0;
- }
out:
+ if (printed_invalid_prefix)
+ pr_debug("\n");
return err;
+out_free:
+ zfree(&trace->ev_qualifier_ids.entries);
+ trace->ev_qualifier_ids.nr = 0;
+ goto out;
}

/*

Subject: [tip:perf/core] perf trace: Fix exclusion of not available syscall names from selector list

Commit-ID: a4066d64d9391a734ee0e49c8d2757f5685013b4
Gitweb: https://git.kernel.org/tip/a4066d64d9391a734ee0e49c8d2757f5685013b4
Author: Arnaldo Carvalho de Melo <[email protected]>
AuthorDate: Thu, 13 Jun 2019 17:35:09 -0300
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Mon, 17 Jun 2019 15:57:19 -0300

perf trace: Fix exclusion of not available syscall names from selector list

We were just skipping the syscalls not available in a particular
architecture without reflecting this in the number of entries in the
ev_qualifier_ids.nr variable, fix it.

This was done with the most minimalistic way, reusing the index variable
'i', a followup patch will further clean this by making 'i' renamed to
'nr_used' and using 'nr_allocated' in a few more places.

Reported-by: Leo Yan <[email protected]>
Tested-by: Leo Yan <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Fixes: 04c41bcb862b ("perf trace: Skip unknown syscalls when expanding strace like syscall groups")
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-trace.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 12f5ad98f8c1..fa9eb467eb4c 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1527,9 +1527,9 @@ static int trace__read_syscall_info(struct trace *trace, int id)

static int trace__validate_ev_qualifier(struct trace *trace)
{
- int err = 0, i;
+ int err = 0;
bool printed_invalid_prefix = false;
- size_t nr_allocated;
+ size_t nr_allocated, i;
struct str_node *pos;

trace->ev_qualifier_ids.nr = strlist__nr_entries(trace->ev_qualifier);
@@ -1574,7 +1574,7 @@ matches:
id = syscalltbl__strglobmatch_next(trace->sctbl, sc, &match_next);
if (id < 0)
break;
- if (nr_allocated == trace->ev_qualifier_ids.nr) {
+ if (nr_allocated == i) {
void *entries;

nr_allocated += 8;
@@ -1587,11 +1587,11 @@ matches:
}
trace->ev_qualifier_ids.entries = entries;
}
- trace->ev_qualifier_ids.nr++;
trace->ev_qualifier_ids.entries[i++] = id;
}
}

+ trace->ev_qualifier_ids.nr = i;
out:
if (printed_invalid_prefix)
pr_debug("\n");