2014-04-08 18:17:10

by Stephane Eranian

[permalink] [raw]
Subject: [BUG] perf record: does not propagate command error code

Hi,

There is a discrepancy in the way perf stat and perf record propagate
command error code back when they launch a process:

$ perf record -e cycles false && echo "yes" || echo "no"
yes

That's wrong!

But perf stat:

$ perf stat -e cycles false && echo "yes" || echo "no"
no

That's correct!

You want the error to be propagated back because it helps
catch errors in monitoring scripts.

I looked at the perf record code handling the error from the child.
It is complicated and uses atexit() to cleanup the child. It is hard
to get the child exit status back. I wonder why this was handled
that way.

Anybody has a fix for this?

Thanks


2014-04-16 00:30:48

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 1/2] perf record: Propagate exit status of a command line workload

Currently perf record doesn't propagate the exit status of a workload
given by the command line. But sometimes it'd useful if it's
propagated so that a monitoring script can handle errors
appropriately.

To do that, it got rid of exit handlers and run/call them directly in
the __cmd_record(). I don't see any reason why those are in a form of
exit handlers in the first place. Also it cleaned up the resouce
management code in record__exit().

With this change, perf record returns the child exit status in case of
normal termination. (Not sure what should be returned on abnormal
cases though).

Example run of Stephane's case:

$ perf record false && echo yes || echo no
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data (~589 samples) ]
no

Reported-by: Stephane Eranian <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/builtin-record.c | 74 ++++++++++++++++++---------------------------
1 file changed, 30 insertions(+), 44 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index eb524f91bffe..3d587d693873 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -152,26 +152,6 @@ static void sig_handler(int sig)
signr = sig;
}

-static void record__sig_exit(int exit_status __maybe_unused, void *arg)
-{
- struct record *rec = arg;
- int status;
-
- if (rec->evlist->workload.pid > 0) {
- if (!child_finished)
- kill(rec->evlist->workload.pid, SIGTERM);
-
- wait(&status);
- if (WIFSIGNALED(status))
- psignal(WTERMSIG(status), rec->progname);
- }
-
- if (signr == -1 || signr == SIGUSR1)
- return;
-
- signal(signr, SIG_DFL);
-}
-
static int record__open(struct record *rec)
{
char msg[512];
@@ -243,14 +223,11 @@ static int process_buildids(struct record *rec)
size, &build_id__mark_dso_hit_ops);
}

-static void record__exit(int status, void *arg)
+static void record__exit(void *arg)
{
struct record *rec = arg;
struct perf_data_file *file = &rec->file;

- if (status != 0)
- return;
-
if (!file->is_pipe) {
rec->session->header.data_size += rec->bytes_written;

@@ -259,8 +236,6 @@ static void record__exit(int status, void *arg)
perf_session__write_header(rec->session, rec->evlist,
file->fd, true);
perf_session__delete(rec->session);
- perf_evlist__delete(rec->evlist);
- symbol__exit();
}
}

@@ -356,6 +331,7 @@ static void workload_exec_failed_signal(int signo, siginfo_t *info,
static int __cmd_record(struct record *rec, int argc, const char **argv)
{
int err;
+ int status = 0;
unsigned long waking = 0;
const bool forks = argc > 0;
struct machine *machine;
@@ -367,7 +343,6 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)

rec->progname = argv[0];

- on_exit(record__sig_exit, rec);
signal(SIGCHLD, sig_handler);
signal(SIGINT, sig_handler);
signal(SIGTERM, sig_handler);
@@ -400,11 +375,6 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
if (!rec->evlist->nr_groups)
perf_header__clear_feat(&session->header, HEADER_GROUP_DESC);

- /*
- * perf_session__delete(session) will be called at record__exit()
- */
- on_exit(record__exit, rec);
-
if (file->is_pipe) {
err = perf_header__write_pipe(file->fd);
if (err < 0)
@@ -541,21 +511,36 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
goto out_delete_session;
}

- if (quiet || signr == SIGUSR1)
- return 0;
+ if (!quiet) {
+ fprintf(stderr, "[ perf record: Woken up %ld times to write data ]\n", waking);

- fprintf(stderr, "[ perf record: Woken up %ld times to write data ]\n", waking);
+ /*
+ * Approximate RIP event size: 24 bytes.
+ */
+ fprintf(stderr,
+ "[ perf record: Captured and wrote %.3f MB %s (~%" PRIu64 " samples) ]\n",
+ (double)rec->bytes_written / 1024.0 / 1024.0,
+ file->path,
+ rec->bytes_written / 24);
+ }

- /*
- * Approximate RIP event size: 24 bytes.
- */
- fprintf(stderr,
- "[ perf record: Captured and wrote %.3f MB %s (~%" PRIu64 " samples) ]\n",
- (double)rec->bytes_written / 1024.0 / 1024.0,
- file->path,
- rec->bytes_written / 24);
+ if (forks) {
+ int exit_status;

- return 0;
+ if (!child_finished)
+ kill(rec->evlist->workload.pid, SIGTERM);
+
+ wait(&exit_status);
+
+ if (WIFEXITED(exit_status))
+ status = WEXITSTATUS(exit_status);
+ else if (WIFSIGNALED(status))
+ psignal(WTERMSIG(status), rec->progname);
+ }
+
+ record__exit(rec);
+
+ return status;

out_delete_session:
perf_session__delete(session);
@@ -988,6 +973,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)

err = __cmd_record(&record, argc, argv);
out_symbol_exit:
+ perf_evlist__delete(rec->evlist);
symbol__exit();
return err;
}
--
1.9.2

2014-04-16 00:30:47

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 2/2] perf tools: Get rid of on_exit() feature test

The on_exit() function was only used in perf record but it's gone in
previous patch.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/builtin-record.c | 31 -------------------------
tools/perf/config/Makefile | 8 -------
tools/perf/config/feature-checks/Makefile | 4 ----
tools/perf/config/feature-checks/test-all.c | 5 ----
tools/perf/config/feature-checks/test-on-exit.c | 16 -------------
5 files changed, 64 deletions(-)
delete mode 100644 tools/perf/config/feature-checks/test-on-exit.c

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 3d587d693873..c52e8b1ee744 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -30,37 +30,6 @@
#include <sched.h>
#include <sys/mman.h>

-#ifndef HAVE_ON_EXIT_SUPPORT
-#ifndef ATEXIT_MAX
-#define ATEXIT_MAX 32
-#endif
-static int __on_exit_count = 0;
-typedef void (*on_exit_func_t) (int, void *);
-static on_exit_func_t __on_exit_funcs[ATEXIT_MAX];
-static void *__on_exit_args[ATEXIT_MAX];
-static int __exitcode = 0;
-static void __handle_on_exit_funcs(void);
-static int on_exit(on_exit_func_t function, void *arg);
-#define exit(x) (exit)(__exitcode = (x))
-
-static int on_exit(on_exit_func_t function, void *arg)
-{
- if (__on_exit_count == ATEXIT_MAX)
- return -ENOMEM;
- else if (__on_exit_count == 0)
- atexit(__handle_on_exit_funcs);
- __on_exit_funcs[__on_exit_count] = function;
- __on_exit_args[__on_exit_count++] = arg;
- return 0;
-}
-
-static void __handle_on_exit_funcs(void)
-{
- int i;
- for (i = 0; i < __on_exit_count; i++)
- __on_exit_funcs[i] (__exitcode, __on_exit_args[i]);
-}
-#endif

struct record {
struct perf_tool tool;
diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index ee21fa95ebcf..7b7003d11983 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -156,7 +156,6 @@ CORE_FEATURE_TESTS = \
libpython-version \
libslang \
libunwind \
- on-exit \
stackprotector-all \
timerfd \
libdw-dwarf-unwind
@@ -182,7 +181,6 @@ VF_FEATURE_TESTS = \
libelf-getphdrnum \
libelf-mmap \
libpython-version \
- on-exit \
stackprotector-all \
timerfd \
libunwind-debug-frame \
@@ -541,12 +539,6 @@ ifneq ($(filter -lbfd,$(EXTLIBS)),)
CFLAGS += -DHAVE_LIBBFD_SUPPORT
endif

-ifndef NO_ON_EXIT
- ifeq ($(feature-on-exit), 1)
- CFLAGS += -DHAVE_ON_EXIT_SUPPORT
- endif
-endif
-
ifndef NO_BACKTRACE
ifeq ($(feature-backtrace), 1)
CFLAGS += -DHAVE_BACKTRACE_SUPPORT
diff --git a/tools/perf/config/feature-checks/Makefile b/tools/perf/config/feature-checks/Makefile
index 2da103c53f89..64c84e5f0514 100644
--- a/tools/perf/config/feature-checks/Makefile
+++ b/tools/perf/config/feature-checks/Makefile
@@ -24,7 +24,6 @@ FILES= \
test-libslang.bin \
test-libunwind.bin \
test-libunwind-debug-frame.bin \
- test-on-exit.bin \
test-stackprotector-all.bin \
test-timerfd.bin \
test-libdw-dwarf-unwind.bin
@@ -133,9 +132,6 @@ test-liberty-z.bin:
test-cplus-demangle.bin:
$(BUILD) -liberty

-test-on-exit.bin:
- $(BUILD)
-
test-backtrace.bin:
$(BUILD)

diff --git a/tools/perf/config/feature-checks/test-all.c b/tools/perf/config/feature-checks/test-all.c
index fc37eb3ca17b..fe5c1e5c952f 100644
--- a/tools/perf/config/feature-checks/test-all.c
+++ b/tools/perf/config/feature-checks/test-all.c
@@ -69,10 +69,6 @@
# include "test-libbfd.c"
#undef main

-#define main main_test_on_exit
-# include "test-on-exit.c"
-#undef main
-
#define main main_test_backtrace
# include "test-backtrace.c"
#undef main
@@ -110,7 +106,6 @@ int main(int argc, char *argv[])
main_test_gtk2(argc, argv);
main_test_gtk2_infobar(argc, argv);
main_test_libbfd();
- main_test_on_exit();
main_test_backtrace();
main_test_libnuma();
main_test_timerfd();
diff --git a/tools/perf/config/feature-checks/test-on-exit.c b/tools/perf/config/feature-checks/test-on-exit.c
deleted file mode 100644
index 8e88b16e6ded..000000000000
--- a/tools/perf/config/feature-checks/test-on-exit.c
+++ /dev/null
@@ -1,16 +0,0 @@
-#include <stdio.h>
-#include <stdlib.h>
-
-static void exit_fn(int status, void *__data)
-{
- printf("exit status: %d, data: %d\n", status, *(int *)__data);
-}
-
-static int data = 123;
-
-int main(void)
-{
- on_exit(exit_fn, &data);
-
- return 321;
-}
--
1.9.2

2014-04-16 13:34:49

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH 1/2] perf record: Propagate exit status of a command line workload

On Wed, Apr 16, 2014 at 09:30:43AM +0900, Namhyung Kim wrote:

SNIP

>
> @@ -356,6 +331,7 @@ static void workload_exec_failed_signal(int signo, siginfo_t *info,
> static int __cmd_record(struct record *rec, int argc, const char **argv)
> {
> int err;
> + int status = 0;
> unsigned long waking = 0;
> const bool forks = argc > 0;
> struct machine *machine;
> @@ -367,7 +343,6 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>
> rec->progname = argv[0];
>
> - on_exit(record__sig_exit, rec);

so record__sig_exit took care of waiting the child even
in case the record fails before running the workload

after your change all error cases just delete session
and go away, ending up with parent dead and go_pipe
released executing the workload

with attached patch, current perf code:

[jolsa@krava perf]$ ./perf.old record -e cycles -m 10G true
rounding mmap pages size to 17179869184 bytes (4194304 pages)
Permission error mapping pages.
Consider increasing /proc/sys/kernel/perf_event_mlock_kb,
or try again with a smaller value of -m/--mmap_pages.
(current value: 4194304)
true: Terminated

with attached patch, your change:

[jolsa@krava perf]$ ./perf record -e cycles -m 10G true
rounding mmap pages size to 17179869184 bytes (4194304 pages)
Permission error mapping pages.
Consider increasing /proc/sys/kernel/perf_event_mlock_kb,
or try again with a smaller value of -m/--mmap_pages.
(current value: 4194304)
unable to read pipe: No such file or directory


I think that after creating the workload, all error paths
need to release(wait) the child if there's any

thanks,
jirka

---
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 59ef280..185f30a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1072,7 +1072,7 @@ int perf_evlist__prepare_workload(struct perf_evlist *evlist, struct target *tar
/*
* Wait until the parent tells us to go.
*/
- if (read(go_pipe[0], &bf, 1) == -1)
+ if (read(go_pipe[0], &bf, 1) != 1)
perror("unable to read pipe");

execvp(argv[0], (char **)argv);

2014-04-17 08:22:01

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 1/2] perf record: Propagate exit status of a command line workload

Hi Jiri,

On Wed, 16 Apr 2014 15:34:32 +0200, Jiri Olsa wrote:
> On Wed, Apr 16, 2014 at 09:30:43AM +0900, Namhyung Kim wrote:
> I think that after creating the workload, all error paths
> need to release(wait) the child if there's any

Right. I'll resend v2.

Thanks,
Namhyung