2017-03-06 19:42:39

by Arnaldo Carvalho de Melo

Subject: [GIT PULL 00/35] perf/core improvements and fixes

From: Arnaldo Carvalho de Melo <[email protected]>

Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 9d020d33fc1b2faa0eb35859df1381ca5dc94ffe:

Merge branch 'linus' into perf/urgent, to resolve conflict (2017-03-02 08:05:45 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170306

for you to fetch changes up to 001916b94a04809a94abb07daba6f9ace01906ba:

perf bench numa: Add more comment for -c option (2017-03-06 12:39:30 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Allow sorting by symbol_size in 'perf report' and 'perf top' (Charles Baylis)

E.g.:

# perf report -s symbol_size,symbol

Samples: 9K of event 'cycles:k', Event count (approx.): 2870461623
Overhead Symbol size Symbol
14.55% 326 [k] flush_tlb_mm_range
7.20% 1045 [k] filemap_map_pages
5.82% 124 [k] vma_interval_tree_insert
5.18% 2430 [k] unmap_page_range
2.57% 571 [k] vma_interval_tree_remove
1.94% 494 [k] page_add_file_rmap
1.82% 740 [k] page_remove_rmap
1.66% 1017 [k] release_pages
1.57% 1636 [k] update_blocked_averages
1.57% 76 [k] unlock_page

- Add support for -p/--pid, -a/--all-cpus and -C/--cpu in 'perf ftrace' (Namhyung Kim)
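To make the symbol_size sort key above concrete, here is a minimal sketch of a size comparator. It assumes a symbol's size is simply its end address minus its start address; the struct and function names are hypothetical and are not taken from tools/perf/util/sort.c, and the addresses in any caller would be made up as well.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol: a half-open address range [start, end). */
struct sketch_symbol {
	uint64_t start;
	uint64_t end;
	const char *name;
};

/* Assumption: size is end - start; a missing symbol counts as size 0. */
static uint64_t sketch_symbol_size(const struct sketch_symbol *sym)
{
	return sym ? sym->end - sym->start : 0;
}

/* Three-way comparison by size: -1, 0 or 1, as a sort key would need. */
int sketch_symbol_size_cmp(const struct sketch_symbol *l,
			   const struct sketch_symbol *r)
{
	uint64_t size_l = sketch_symbol_size(l);
	uint64_t size_r = sketch_symbol_size(r);

	if (size_l < size_r)
		return -1;
	return size_l == size_r ? 0 : 1;
}
```

With the sizes from the report above, a 326 byte flush_tlb_mm_range would compare greater than a 76 byte unlock_page.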

Change in behaviour:

- Make system wide (-a) the default option if no target was specified and one
of the following conditions is met:

- No workload specified (current behaviour)

- A workload is specified but all requested events are system wide ones,
like uncore ones. (Jiri Olsa)

Fixes:

- Add missing initialization to the instruction decoder used in the
intel PT/BTS code, which was causing lots of failures in 'perf test',
looking for a value when there was none (Adrian Hunter)

Infrastructure:

- Add arch code needed to adopt the kernel's refcount_t to aid in
catching bugs when using atomic_t as a reference counter, basically
cmpxchg related functions (Arnaldo Carvalho de Melo)

- Convert the code using atomic_t as reference counts to refcount_t
(Elena Reshetova)

- Add feature test for sched_getcpu() to more easily check for its
presence in the many libc implementations and across different
versions of such C libraries (Arnaldo Carvalho de Melo)

- Issue a HW watchdog disable hint in 'perf stat' for when some of the
requested events can't get counted because a PMU counter is taken by that
watchdog (Borislav Petkov).

- Add mapping for Intel's KnightsMill PMU events (Karol Wachowski)

Documentation:

- Clarify the term 'convergence' in:

perf bench numa numa-mem -h --show_convergence (Jiri Olsa)

Kernel code:

- Ensure probe location is at function entry in kretprobes (Naveen N. Rao)

- Allow return probes with offsets and absolute addresses (Naveen N. Rao)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

----------------------------------------------------------------
Adrian Hunter (1):
perf intel-PT/BTS: Add missing initialization

Arnaldo Carvalho de Melo (12):
tools include: Adopt __compiletime_error
tools arch x86: Include asm/cmpxchg.h
tools arch x86: Introduce atomic_cmpxchg()
tools include: Introduce atomic_cmpxchg_{relaxed,release}()
tools include: Provide gcc based cmpxchg fallback for !x86
tools include: Add UINT_MAX def to kernel.h
tools include: Adopt kernel's refcount.h
perf evlist: Clarify a bit the use of perf_mmap->refcnt
tools build: Add test for sched_getcpu()
perf bench futex: Use __maybe_unused
perf bench futex: Fix build on musl + clang
tools build: Use the same CC for feature detection and actual build

Borislav Petkov (1):
perf stat: Issue a HW watchdog disable hint

Charles Baylis (1):
perf tools: Allow sorting by symbol size

Elena Reshetova (9):
perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t
perf cpumap: Convert cpu_map.refcnt from atomic_t to refcount_t
perf comm: Convert comm_str.refcnt from atomic_t to refcount_t
perf dso: Convert dso.refcnt from atomic_t to refcount_t
perf map: Convert map.refcnt from atomic_t to refcount_t
perf map: Convert map_groups.refcnt from atomic_t to refcount_t
perf evlist: Convert perf_map.refcnt from atomic_t to refcount_t
perf thread: convert thread.refcnt from atomic_t to refcount_t
perf thread_map: Convert thread_map.refcnt from atomic_t to refcount_t

Jiri Olsa (2):
perf tools: Force uncore events to system wide monitoring
perf bench numa: Add more comment for -c option

Karol Wachowski (1):
perf vendor events: Add mapping for KnightsMill PMU events

Namhyung Kim (4):
perf ftrace: Add support for --pid option
perf cpumap: Introduce cpu_map__snprint_mask()
perf ftrace: Add support for -a and -C option
perf ftrace: Use pager for displaying result

Naveen N. Rao (3):
kretprobes: Ensure probe location is at function entry
trace/kprobes: Allow return probes with offsets and absolute addresses
perf probe: Generalize probe event file open routine

Steven Rostedt (VMware) (1):
trace/kprobes: Add back warning about offset in return probes

include/linux/kprobes.h | 1 +
kernel/kprobes.c | 13 ++
kernel/trace/trace.c | 1 +
kernel/trace/trace_kprobe.c | 9 +-
tools/arch/x86/include/asm/atomic.h | 7 +
tools/arch/x86/include/asm/cmpxchg.h | 89 ++++++++++++
tools/build/Makefile.feature | 1 +
tools/build/feature/Makefile | 10 +-
tools/build/feature/test-all.c | 5 +
tools/build/feature/test-sched_getcpu.c | 7 +
tools/include/asm-generic/atomic-gcc.h | 8 ++
tools/include/linux/atomic.h | 6 +
tools/include/linux/compiler-gcc.h | 4 +
tools/include/linux/compiler.h | 4 +
tools/include/linux/kernel.h | 4 +
tools/include/linux/refcount.h | 151 ++++++++++++++++++++
tools/perf/Documentation/perf-ftrace.txt | 18 +++
tools/perf/Documentation/perf-report.txt | 1 +
tools/perf/MANIFEST | 2 +
tools/perf/Makefile.config | 4 +
tools/perf/bench/futex-hash.c | 1 +
tools/perf/bench/futex-lock-pi.c | 1 +
tools/perf/bench/futex-requeue.c | 1 +
tools/perf/bench/futex-wake-parallel.c | 1 +
tools/perf/bench/futex-wake.c | 1 +
tools/perf/bench/futex.h | 10 +-
tools/perf/bench/numa.c | 3 +-
tools/perf/builtin-ftrace.c | 152 +++++++++++++++++----
tools/perf/builtin-stat.c | 44 +++++-
tools/perf/pmu-events/arch/x86/mapfile.csv | 1 +
tools/perf/tests/cpumap.c | 2 +-
tools/perf/tests/thread-map.c | 6 +-
tools/perf/tests/thread-mg-share.c | 12 +-
tools/perf/util/cgroup.c | 6 +-
tools/perf/util/cgroup.h | 4 +-
tools/perf/util/cloexec.h | 6 -
tools/perf/util/comm.c | 15 +-
tools/perf/util/cpumap.c | 62 +++++++--
tools/perf/util/cpumap.h | 5 +-
tools/perf/util/dso.c | 6 +-
tools/perf/util/dso.h | 4 +-
tools/perf/util/evlist.c | 31 +++--
tools/perf/util/evlist.h | 4 +-
tools/perf/util/hist.h | 1 +
.../util/intel-pt-decoder/intel-pt-insn-decoder.c | 2 +
tools/perf/util/machine.c | 2 +-
tools/perf/util/map.c | 10 +-
tools/perf/util/map.h | 10 +-
tools/perf/util/parse-events.c | 5 +-
tools/perf/util/probe-file.c | 20 +--
tools/perf/util/probe-file.h | 1 +
tools/perf/util/sort.c | 41 ++++++
tools/perf/util/sort.h | 1 +
tools/perf/util/thread.c | 6 +-
tools/perf/util/thread.h | 4 +-
tools/perf/util/thread_map.c | 20 +--
tools/perf/util/thread_map.h | 4 +-
tools/perf/util/util.h | 4 +-
tools/scripts/Makefile.include | 9 ++
59 files changed, 720 insertions(+), 143 deletions(-)
create mode 100644 tools/arch/x86/include/asm/cmpxchg.h
create mode 100644 tools/build/feature/test-sched_getcpu.c
create mode 100644 tools/include/linux/refcount.h

Test results:

The first ones are container (docker) based builds of tools/perf with and
without libelf support, objtool where it is supported and samples/bpf/, ditto.
Where clang is available, it is also used to build perf with/without libelf.

Several are cross builds (the ones with -x-ARCH, and the android one), and those
may not have all the features built, due to the lack of multi-arch devel packages,
which so far are available and being used on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there are the 'make -C tools/perf build-test' ones, which build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in touch if you are interested in helping to put this in place.

[root@jouet ~]# waitp `pidof perf` ; time dm
1 alpine:3.4: Ok
2 alpine:3.5: Ok
3 alpine:edge: Ok
4 android-ndk:r12b-arm: Ok
5 archlinux:latest: Ok
6 centos:5: Ok
7 centos:6: Ok
8 centos:7: Ok
9 debian:7: Ok
10 debian:8: Ok
11 debian:experimental: Ok
12 debian:experimental-x-arm64: Ok
13 debian:experimental-x-mips: Ok
14 debian:experimental-x-mips64: Ok
15 debian:experimental-x-mipsel: Ok
16 fedora:20: Ok
17 fedora:21: Ok
18 fedora:22: Ok
19 fedora:23: Ok
20 fedora:24: Ok
21 fedora:24-x-ARC-uClibc: Ok
22 fedora:25: Ok
23 fedora:rawhide: Ok
24 mageia:5: Ok
25 opensuse:13.2: Ok
26 opensuse:42.1: Ok
27 opensuse:tumbleweed: Ok
28 ubuntu:12.04.5: Ok
29 ubuntu:14.04.4: Ok
30 ubuntu:14.04.4-x-linaro-arm64: Ok
31 ubuntu:15.10: Ok
32 ubuntu:16.04: Ok
33 ubuntu:16.04-x-arm: Ok
34 ubuntu:16.04-x-arm64: Ok
35 ubuntu:16.04-x-powerpc: Ok
36 ubuntu:16.04-x-powerpc64: Ok
37 ubuntu:16.04-x-s390: Ok
38 ubuntu:16.10: Ok
39 ubuntu:17.04: Ok
[root@jouet ~]#

[root@zoo ~]# uname -a
Linux zoo 4.9.13-100.fc24.x86_64 #1 SMP Mon Feb 27 16:57:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@zoo ~]# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Parse event definition strings : Ok
6: PERF_RECORD_* events & perf_sample fields : Ok
7: Parse perf pmu format : Ok
8: DSO data read : Ok
9: DSO data cache : Ok
10: DSO data reopen : Ok
11: Roundtrip evsel->name : Ok
12: Parse sched tracepoints fields : Ok
13: syscalls:sys_enter_openat event fields : Ok
14: Setup struct perf_event_attr : Ok
15: Match and link multiple hists : Ok
16: 'import perf' in python : Ok
17: Breakpoint overflow signal handler : Ok
18: Breakpoint overflow sampling : Ok
19: Number of exit events of a simple workload : Ok
20: Software clock events period values : Ok
21: Object code reading : Ok
22: Sample parsing : Ok
23: Use a dummy software event to keep tracking: Ok
24: Parse with no sample_id_all bit set : Ok
25: Filter hist entries : Ok
26: Lookup mmap thread : Ok
27: Share thread mg : Ok
28: Sort output of hist entries : Ok
29: Cumulate child hist entries : Ok
30: Track with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: kmod_path__parse : Ok
34: Thread map : Ok
35: LLVM search and compile :
35.1: Basic BPF llvm compile : Ok
35.2: kbuild searching : Ok
35.3: Compile source for BPF prologue generation: Ok
35.4: Compile source for BPF relocation : Ok
36: Session topology : Ok
37: BPF filter :
37.1: Basic BPF filtering : Ok
37.2: BPF pinning : Ok
37.3: BPF prologue generation : Ok
37.4: BPF relocation checker : Ok
38: Synthesize thread map : Ok
39: Remove thread map : Ok
40: Synthesize cpu map : Ok
41: Synthesize stat config : Ok
42: Synthesize stat : Ok
43: Synthesize stat round : Ok
44: Synthesize attr update : Ok
45: Event times : Ok
46: Read backward ring buffer : Ok
47: Print cpu map : Ok
48: Probe SDT events : Ok
49: is_printable_array : Ok
50: Print bitmap : Ok
51: perf hooks : Ok
52: builtin clang support : Skip (not compiled in)
53: unit_number__scnprintf : Ok
54: x86 rdpmc : Ok
55: Convert perf time to TSC : Ok
56: DWARF unwind : Ok
57: x86 instruction decoder - new instructions : Ok
58: Intel cqm nmi context read : Skip
[root@zoo ~]#

[acme@jouet linux]$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_pure_O: make
make_doc_O: make doc
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_static_O: make LDFLAGS=-static
make_help_O: make help
make_no_libnuma_O: make NO_LIBNUMA=1
make_clean_all_O: make clean all
make_no_libelf_O: make NO_LIBELF=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_tags_O: make tags
make_debug_O: make DEBUG=1
make_no_newt_O: make NO_NEWT=1
make_install_prefix_O: make install prefix=/tmp/krava
make_install_bin_O: make install-bin
make_perf_o_O: make perf.o
make_no_slang_O: make NO_SLANG=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_util_map_o_O: make util/map.o
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_gtk2_O: make NO_GTK2=1
make_no_libbpf_O: make NO_LIBBPF=1
make_install_O: make install
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
OK
[acme@jouet linux]$


2017-03-06 19:38:53

by Arnaldo Carvalho de Melo

Subject: [PATCH 08/35] tools include: Add UINT_MAX def to kernel.h

From: Arnaldo Carvalho de Melo <[email protected]>

The kernel has it, and some files we got from there would otherwise require us
to include the userland header for it, so add it conditionally.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/include/linux/kernel.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/tools/include/linux/kernel.h b/tools/include/linux/kernel.h
index 28607db02bd3..adb4d0147755 100644
--- a/tools/include/linux/kernel.h
+++ b/tools/include/linux/kernel.h
@@ -5,6 +5,10 @@
#include <stddef.h>
#include <assert.h>

+#ifndef UINT_MAX
+#define UINT_MAX (~0U)
+#endif
+
#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))

#define PERF_ALIGN(x, a) __PERF_ALIGN_MASK(x, (typeof(x))(a)-1)
--
2.9.3

2017-03-06 19:39:05

by Arnaldo Carvalho de Melo

Subject: [PATCH 04/35] tools arch x86: Include asm/cmpxchg.h

From: Arnaldo Carvalho de Melo <[email protected]>

Will be included from atomic.h and used in refcount.h
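
As a rough illustration of the semantics this header provides (compare OLD with MEM, store NEW if they are identical, return the initial value in MEM), here is a sketch built on the GCC/clang __sync_val_compare_and_swap() builtin. It is not the inline asm from this patch, it only handles 32-bit values, and both function names are hypothetical.

```c
#include <stdint.h>

/*
 * Illustrative stand-in for cmpxchg() on a 32-bit value: returns the
 * value found in *ptr; 'newval' was stored only if that value equals
 * 'old'. Uses a compiler builtin, not the asm from the patch.
 */
static inline uint32_t sketch_cmpxchg32(volatile uint32_t *ptr,
					uint32_t old, uint32_t newval)
{
	return __sync_val_compare_and_swap(ptr, old, newval);
}

/* Hypothetical caller: atomically add 'delta' unless the value is zero. */
int sketch_add_unless_zero(volatile uint32_t *ptr, uint32_t delta)
{
	uint32_t old = *ptr;

	for (;;) {
		uint32_t seen;

		if (old == 0)
			return 0;	/* nothing added */

		seen = sketch_cmpxchg32(ptr, old, old + delta);
		if (seen == old)
			return 1;	/* the swap succeeded */

		old = seen;	/* lost a race, retry with the fresh value */
	}
}
```

The retry loop is the usual way such a primitive gets used: success is detected by comparing the returned value with the expected one.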

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/arch/x86/include/asm/cmpxchg.h | 89 ++++++++++++++++++++++++++++++++++++
tools/perf/MANIFEST | 1 +
tools/scripts/Makefile.include | 9 ++++
3 files changed, 99 insertions(+)
create mode 100644 tools/arch/x86/include/asm/cmpxchg.h

diff --git a/tools/arch/x86/include/asm/cmpxchg.h b/tools/arch/x86/include/asm/cmpxchg.h
new file mode 100644
index 000000000000..f5253260f3cc
--- /dev/null
+++ b/tools/arch/x86/include/asm/cmpxchg.h
@@ -0,0 +1,89 @@
+#ifndef TOOLS_ASM_X86_CMPXCHG_H
+#define TOOLS_ASM_X86_CMPXCHG_H
+
+#include <linux/compiler.h>
+
+/*
+ * Non-existant functions to indicate usage errors at link time
+ * (or compile-time if the compiler implements __compiletime_error().
+ */
+extern void __cmpxchg_wrong_size(void)
+ __compiletime_error("Bad argument size for cmpxchg");
+
+/*
+ * Constants for operation sizes. On 32-bit, the 64-bit size it set to
+ * -1 because sizeof will never return -1, thereby making those switch
+ * case statements guaranteeed dead code which the compiler will
+ * eliminate, and allowing the "missing symbol in the default case" to
+ * indicate a usage error.
+ */
+#define __X86_CASE_B 1
+#define __X86_CASE_W 2
+#define __X86_CASE_L 4
+#ifdef __x86_64__
+#define __X86_CASE_Q 8
+#else
+#define __X86_CASE_Q -1 /* sizeof will never return -1 */
+#endif
+
+/*
+ * Atomic compare and exchange. Compare OLD with MEM, if identical,
+ * store NEW in MEM. Return the initial value in MEM. Success is
+ * indicated by comparing RETURN with OLD.
+ */
+#define __raw_cmpxchg(ptr, old, new, size, lock) \
+({ \
+ __typeof__(*(ptr)) __ret; \
+ __typeof__(*(ptr)) __old = (old); \
+ __typeof__(*(ptr)) __new = (new); \
+ switch (size) { \
+ case __X86_CASE_B: \
+ { \
+ volatile u8 *__ptr = (volatile u8 *)(ptr); \
+ asm volatile(lock "cmpxchgb %2,%1" \
+ : "=a" (__ret), "+m" (*__ptr) \
+ : "q" (__new), "0" (__old) \
+ : "memory"); \
+ break; \
+ } \
+ case __X86_CASE_W: \
+ { \
+ volatile u16 *__ptr = (volatile u16 *)(ptr); \
+ asm volatile(lock "cmpxchgw %2,%1" \
+ : "=a" (__ret), "+m" (*__ptr) \
+ : "r" (__new), "0" (__old) \
+ : "memory"); \
+ break; \
+ } \
+ case __X86_CASE_L: \
+ { \
+ volatile u32 *__ptr = (volatile u32 *)(ptr); \
+ asm volatile(lock "cmpxchgl %2,%1" \
+ : "=a" (__ret), "+m" (*__ptr) \
+ : "r" (__new), "0" (__old) \
+ : "memory"); \
+ break; \
+ } \
+ case __X86_CASE_Q: \
+ { \
+ volatile u64 *__ptr = (volatile u64 *)(ptr); \
+ asm volatile(lock "cmpxchgq %2,%1" \
+ : "=a" (__ret), "+m" (*__ptr) \
+ : "r" (__new), "0" (__old) \
+ : "memory"); \
+ break; \
+ } \
+ default: \
+ __cmpxchg_wrong_size(); \
+ } \
+ __ret; \
+})
+
+#define __cmpxchg(ptr, old, new, size) \
+ __raw_cmpxchg((ptr), (old), (new), (size), LOCK_PREFIX)
+
+#define cmpxchg(ptr, old, new) \
+ __cmpxchg(ptr, old, new, sizeof(*(ptr)))
+
+
+#endif /* TOOLS_ASM_X86_CMPXCHG_H */
diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
index 8672f835ae4e..e2c52190cf28 100644
--- a/tools/perf/MANIFEST
+++ b/tools/perf/MANIFEST
@@ -12,6 +12,7 @@ tools/arch/sparc/include/asm/barrier_32.h
tools/arch/sparc/include/asm/barrier_64.h
tools/arch/tile/include/asm/barrier.h
tools/arch/x86/include/asm/barrier.h
+tools/arch/x86/include/asm/cmpxchg.h
tools/arch/x86/include/asm/cpufeatures.h
tools/arch/x86/include/asm/disabled-features.h
tools/arch/x86/include/asm/required-features.h
diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include
index 621578aa12d6..fc74db62fef4 100644
--- a/tools/scripts/Makefile.include
+++ b/tools/scripts/Makefile.include
@@ -43,6 +43,15 @@ ifneq ($(CC), clang)
EXTRA_WARNINGS += -Wstrict-aliasing=3
endif

+# Hack to avoid type-punned warnings on old systems such as RHEL5:
+# We should be changing CFLAGS and checking gcc version, but this
+# will do for now and keep the above -Wstrict-aliasing=3 in place
+# in newer systems.
+# Needed for the __raw_cmpxchg in tools/arch/x86/include/asm/cmpxchg.h
+ifneq ($(filter 3.%,$(MAKE_VERSION)),) # make-3
+EXTRA_WARNINGS += -fno-strict-aliasing
+endif
+
ifneq ($(findstring $(MAKEFLAGS), w),w)
PRINT_DIR = --no-print-directory
else
--
2.9.3

2017-03-06 19:39:10

by Arnaldo Carvalho de Melo

Subject: [PATCH 03/35] tools include: Adopt __compiletime_error

From: Arnaldo Carvalho de Melo <[email protected]>

From the kernel, get the gcc one and provide the fallback so that we can
continue building with other compilers, such as clang.

Will be used by tools/arch/x86/include/asm/cmpxchg.h.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/include/linux/compiler-gcc.h | 4 ++++
tools/include/linux/compiler.h | 4 ++++
2 files changed, 8 insertions(+)

diff --git a/tools/include/linux/compiler-gcc.h b/tools/include/linux/compiler-gcc.h
index 48af2f10a42d..616935f1ff56 100644
--- a/tools/include/linux/compiler-gcc.h
+++ b/tools/include/linux/compiler-gcc.h
@@ -12,3 +12,7 @@
#if GCC_VERSION >= 70000 && !defined(__CHECKER__)
# define __fallthrough __attribute__ ((fallthrough))
#endif
+
+#if GCC_VERSION >= 40300
+# define __compiletime_error(message) __attribute__((error(message)))
+#endif /* GCC_VERSION >= 40300 */
diff --git a/tools/include/linux/compiler.h b/tools/include/linux/compiler.h
index 8de163b17c0d..c9e65e8faacd 100644
--- a/tools/include/linux/compiler.h
+++ b/tools/include/linux/compiler.h
@@ -5,6 +5,10 @@
#include <linux/compiler-gcc.h>
#endif

+#ifndef __compiletime_error
+# define __compiletime_error(message)
+#endif
+
/* Optimization barrier */
/* The "volatile" is due to gcc bugs */
#define barrier() __asm__ __volatile__("": : :"memory")
--
2.9.3

2017-03-06 19:39:28

by Arnaldo Carvalho de Melo

Subject: [PATCH 21/35] perf ftrace: Add support for --pid option

From: Namhyung Kim <[email protected]>

The -p (--pid) option enables tracing an existing process by its pid.
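
Since several pids may be given, the patch appends each one to set_ftrace_pid instead of truncating the file on every write. The open-flag selection can be sketched as below. This is a simplified illustration that works on any existing regular file; the function name is hypothetical, and the special handling the kernel applies to the tracing control files is not modelled here.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write 'val' to 'path', either replacing the previous contents
 * (append == 0) or adding to them (append != 0).
 * Returns 0 on success, -1 on failure. The file must already exist,
 * as the tracing control files do.
 */
int sketch_write_file(const char *path, const char *val, int append)
{
	ssize_t size = (ssize_t)strlen(val);
	int flags = O_WRONLY | (append ? O_APPEND : O_TRUNC);
	int fd, ret = -1;

	fd = open(path, flags);
	if (fd < 0)
		return -1;

	/* A short write is treated as a failure, to keep the sketch simple. */
	if (write(fd, val, size) == size)
		ret = 0;

	close(fd);
	return ret;
}
```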

Committer notes:

Testing it:

Use the function_graph tracer on a process that is just waiting for user
input, so that 'perf ftrace' sits there waiting as well, then press
any key in that mutt session and see what happens:

# perf ftrace -t function_graph -p `pidof mutt` | head -40
2) 1.038 us | switch_mm_irqs_off();
------------------------------------------
2) <idle>-0 => mutt-3595
------------------------------------------

2) | finish_task_switch() {
2) | smp_irq_work_interrupt() {
2) | irq_enter() {
2) 0.180 us | rcu_irq_enter();
2) 1.248 us | }
2) | __wake_up() {
2) 0.126 us | _raw_spin_lock_irqsave();
2) | __wake_up_common() {
2) | pollwake() {
2) | default_wake_function() {
2) | try_to_wake_up() {
2) 0.662 us | _raw_spin_lock_irqsave();
2) | select_task_rq_fair() {
2) 1.719 us | effective_load.isra.41();
2) 1.343 us | effective_load.isra.41();
2) | select_idle_sibling() {
2) 0.331 us | idle_cpu();
2) 1.458 us | }
2) 8.350 us | }
2) 0.200 us | _raw_spin_lock();
2) | ttwu_do_activate() {
2) | activate_task() {
2) 0.136 us | update_rq_clock.part.77();
2) | enqueue_task_fair() {
2) | enqueue_entity() {
2) 0.146 us | update_curr();
2) 0.330 us | account_entity_enqueue();
2) 0.280 us | update_cfs_shares();
2) 0.321 us | place_entity();
2) 0.206 us | __enqueue_entity();
2) 6.926 us | }
2) | enqueue_entity() {
2) 0.105 us | update_curr();
2) 0.175 us | account_entity_enqueue();
2) 0.531 us | update_cfs_shares();
#

Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-ftrace.txt | 4 ++
tools/perf/builtin-ftrace.c | 91 ++++++++++++++++++++++----------
2 files changed, 68 insertions(+), 27 deletions(-)

diff --git a/tools/perf/Documentation/perf-ftrace.txt b/tools/perf/Documentation/perf-ftrace.txt
index 2d96de6132a9..2d39397f3f30 100644
--- a/tools/perf/Documentation/perf-ftrace.txt
+++ b/tools/perf/Documentation/perf-ftrace.txt
@@ -30,6 +30,10 @@ OPTIONS
--verbose=::
Verbosity level.

+-p::
+--pid=::
+ Trace on existing process id (comma separated list).
+

SEE ALSO
--------
diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index c3e643666c72..85eee9c444ae 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -11,6 +11,7 @@

#include <unistd.h>
#include <signal.h>
+#include <fcntl.h>

#include "debug.h"
#include <subcmd/parse-options.h>
@@ -50,11 +51,12 @@ static void ftrace__workload_exec_failed_signal(int signo __maybe_unused,
done = true;
}

-static int write_tracing_file(const char *name, const char *val)
+static int __write_tracing_file(const char *name, const char *val, bool append)
{
char *file;
int fd, ret = -1;
ssize_t size = strlen(val);
+ int flags = O_WRONLY;

file = get_tracing_file(name);
if (!file) {
@@ -62,7 +64,12 @@ static int write_tracing_file(const char *name, const char *val)
return -1;
}

- fd = open(file, O_WRONLY);
+ if (append)
+ flags |= O_APPEND;
+ else
+ flags |= O_TRUNC;
+
+ fd = open(file, flags);
if (fd < 0) {
pr_debug("cannot open tracing file: %s\n", name);
goto out;
@@ -79,6 +86,16 @@ static int write_tracing_file(const char *name, const char *val)
return ret;
}

+static int write_tracing_file(const char *name, const char *val)
+{
+ return __write_tracing_file(name, val, false);
+}
+
+static int append_tracing_file(const char *name, const char *val)
+{
+ return __write_tracing_file(name, val, true);
+}
+
static int reset_tracing_files(struct perf_ftrace *ftrace __maybe_unused)
{
if (write_tracing_file("tracing_on", "0") < 0)
@@ -93,11 +110,27 @@ static int reset_tracing_files(struct perf_ftrace *ftrace __maybe_unused)
return 0;
}

+static int set_tracing_pid(struct perf_ftrace *ftrace)
+{
+ int i;
+ char buf[16];
+
+ if (target__has_cpu(&ftrace->target))
+ return 0;
+
+ for (i = 0; i < thread_map__nr(ftrace->evlist->threads); i++) {
+ scnprintf(buf, sizeof(buf), "%d",
+ ftrace->evlist->threads->map[i]);
+ if (append_tracing_file("set_ftrace_pid", buf) < 0)
+ return -1;
+ }
+ return 0;
+}
+
static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)
{
char *trace_file;
int trace_fd;
- char *trace_pid;
char buf[4096];
struct pollfd pollfd = {
.events = POLLIN,
@@ -108,42 +141,37 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)
return -1;
}

- if (argc < 1)
- return -1;
-
signal(SIGINT, sig_handler);
signal(SIGUSR1, sig_handler);
signal(SIGCHLD, sig_handler);

- reset_tracing_files(ftrace);
+ if (reset_tracing_files(ftrace) < 0)
+ goto out;

/* reset ftrace buffer */
if (write_tracing_file("trace", "0") < 0)
goto out;

- if (perf_evlist__prepare_workload(ftrace->evlist, &ftrace->target,
- argv, false, ftrace__workload_exec_failed_signal) < 0)
- goto out;
-
- if (write_tracing_file("current_tracer", ftrace->tracer) < 0) {
- pr_err("failed to set current_tracer to %s\n", ftrace->tracer);
+ if (argc && perf_evlist__prepare_workload(ftrace->evlist,
+ &ftrace->target, argv, false,
+ ftrace__workload_exec_failed_signal) < 0) {
goto out;
}

- if (asprintf(&trace_pid, "%d", thread_map__pid(ftrace->evlist->threads, 0)) < 0) {
- pr_err("failed to allocate pid string\n");
- goto out;
+ if (set_tracing_pid(ftrace) < 0) {
+ pr_err("failed to set ftrace pid\n");
+ goto out_reset;
}

- if (write_tracing_file("set_ftrace_pid", trace_pid) < 0) {
- pr_err("failed to set pid: %s\n", trace_pid);
- goto out_free_pid;
+ if (write_tracing_file("current_tracer", ftrace->tracer) < 0) {
+ pr_err("failed to set current_tracer to %s\n", ftrace->tracer);
+ goto out_reset;
}

trace_file = get_tracing_file("trace_pipe");
if (!trace_file) {
pr_err("failed to open trace_pipe\n");
- goto out_free_pid;
+ goto out_reset;
}

trace_fd = open(trace_file, O_RDONLY);
@@ -152,7 +180,7 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)

if (trace_fd < 0) {
pr_err("failed to open trace_pipe\n");
- goto out_free_pid;
+ goto out_reset;
}

fcntl(trace_fd, F_SETFL, O_NONBLOCK);
@@ -191,11 +219,9 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)

out_close_fd:
close(trace_fd);
-out_free_pid:
- free(trace_pid);
-out:
+out_reset:
reset_tracing_files(ftrace);
-
+out:
return done ? 0 : -1;
}

@@ -227,13 +253,15 @@ int cmd_ftrace(int argc, const char **argv, const char *prefix __maybe_unused)
.target = { .uid = UINT_MAX, },
};
const char * const ftrace_usage[] = {
- "perf ftrace [<options>] <command>",
+ "perf ftrace [<options>] [<command>]",
"perf ftrace [<options>] -- <command> [<options>]",
NULL
};
const struct option ftrace_options[] = {
OPT_STRING('t', "tracer", &ftrace.tracer, "tracer",
"tracer to use: function_graph(default) or function"),
+ OPT_STRING('p', "pid", &ftrace.target.pid, "pid",
+ "trace on existing process id"),
OPT_INCR('v', "verbose", &verbose,
"be more verbose"),
OPT_END()
@@ -245,9 +273,18 @@ int cmd_ftrace(int argc, const char **argv, const char *prefix __maybe_unused)

argc = parse_options(argc, argv, ftrace_options, ftrace_usage,
PARSE_OPT_STOP_AT_NON_OPTION);
- if (!argc)
+ if (!argc && target__none(&ftrace.target))
usage_with_options(ftrace_usage, ftrace_options);

+ ret = target__validate(&ftrace.target);
+ if (ret) {
+ char errbuf[512];
+
+ target__strerror(&ftrace.target, ret, errbuf, 512);
+ pr_err("%s\n", errbuf);
+ return -EINVAL;
+ }
+
ftrace.evlist = perf_evlist__new();
if (ftrace.evlist == NULL)
return -ENOMEM;
--
2.9.3

2017-03-06 19:39:51

by Arnaldo Carvalho de Melo

Subject: [PATCH 27/35] perf probe: Generalize probe event file open routine

From: "Naveen N. Rao" <[email protected]>

Generalize probe event file open routine into a generic function for opening
trace files.
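
For readers unfamiliar with the routine, the idea is roughly: build "<tracing dir>/<file name>", then open it read-only or read-write. The sketch below is an assumption about the general shape, not the body of open_trace_file(); it takes the tracing directory as an explicit parameter, uses a fixed-size buffer, and all names are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Open "<tracing_dir>/<trace_file>".
 * Returns a file descriptor on success or a negative errno value.
 */
int sketch_open_trace_file(const char *tracing_dir, const char *trace_file,
			   bool readwrite)
{
	char buf[4096];
	int ret;

	ret = snprintf(buf, sizeof(buf), "%s/%s", tracing_dir, trace_file);
	if (ret < 0 || (size_t)ret >= sizeof(buf))
		return -ENAMETOOLONG;

	ret = open(buf, readwrite ? O_RDWR : O_RDONLY);
	return ret < 0 ? -errno : ret;
}

/* Hypothetical caller: report whether the file can be opened read-only. */
bool sketch_trace_file_readable(const char *tracing_dir, const char *trace_file)
{
	int fd = sketch_open_trace_file(tracing_dir, trace_file, false);

	if (fd < 0)
		return false;

	close(fd);
	return true;
}
```

Returning a plain file descriptor is what lets a caller wrap it with fdopen(), as probe_type_is_available() does in the diff below.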

Signed-off-by: Naveen N. Rao <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/b580465c7a4dcd5d3b40fdf8568e6be45d0a6333.1487849577.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/probe-file.c | 20 +++++++++++---------
tools/perf/util/probe-file.h | 1 +
2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index 436b64731f65..1a62daceb028 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -70,7 +70,7 @@ static void print_both_open_warning(int kerr, int uerr)
}
}

-static int open_probe_events(const char *trace_file, bool readwrite)
+int open_trace_file(const char *trace_file, bool readwrite)
{
char buf[PATH_MAX];
int ret;
@@ -92,12 +92,12 @@ static int open_probe_events(const char *trace_file, bool readwrite)

static int open_kprobe_events(bool readwrite)
{
- return open_probe_events("kprobe_events", readwrite);
+ return open_trace_file("kprobe_events", readwrite);
}

static int open_uprobe_events(bool readwrite)
{
- return open_probe_events("uprobe_events", readwrite);
+ return open_trace_file("uprobe_events", readwrite);
}

int probe_file__open(int flag)
@@ -899,6 +899,7 @@ bool probe_type_is_available(enum probe_type type)
size_t len = 0;
bool target_line = false;
bool ret = probe_type_table[type].avail;
+ int fd;

if (type >= PROBE_TYPE_END)
return false;
@@ -906,14 +907,16 @@ bool probe_type_is_available(enum probe_type type)
if (ret || probe_type_table[type].checked)
return ret;

- if (asprintf(&buf, "%s/README", tracing_path) < 0)
+ fd = open_trace_file("README", false);
+ if (fd < 0)
return ret;

- fp = fopen(buf, "r");
- if (!fp)
- goto end;
+ fp = fdopen(fd, "r");
+ if (!fp) {
+ close(fd);
+ return ret;
+ }

- zfree(&buf);
while (getline(&buf, &len, fp) > 0 && !ret) {
if (!target_line) {
target_line = !!strstr(buf, " type: ");
@@ -928,7 +931,6 @@ bool probe_type_is_available(enum probe_type type)
probe_type_table[type].avail = ret;

fclose(fp);
-end:
free(buf);

return ret;
diff --git a/tools/perf/util/probe-file.h b/tools/perf/util/probe-file.h
index eba44c3e9dca..a17a82eff8a0 100644
--- a/tools/perf/util/probe-file.h
+++ b/tools/perf/util/probe-file.h
@@ -35,6 +35,7 @@ enum probe_type {

/* probe-file.c depends on libelf */
#ifdef HAVE_LIBELF_SUPPORT
+int open_trace_file(const char *trace_file, bool readwrite);
int probe_file__open(int flag);
int probe_file__open_both(int *kfd, int *ufd, int flag);
struct strlist *probe_file__get_namelist(int fd);
--
2.9.3

2017-03-06 19:39:40

by Arnaldo Carvalho de Melo

Subject: [PATCH 11/35] perf cpumap: Convert cpu_map.refcnt from atomic_t to refcount_t

From: Elena Reshetova <[email protected]>

The refcount_t type and corresponding API should be used instead of atomic_t
when the variable is used as a reference counter.

This helps avoid accidental reference counter overflows that might lead to
use-after-free situations.
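
As a rough illustration of why a dedicated reference counter type helps, here
is a minimal single-threaded sketch of a saturating counter. The
sketch_refcount names are hypothetical and this is not the kernel's or perf's
refcount_t implementation (which is atomic); it only shows the general idea
that an increment sticks at UINT_MAX instead of wrapping to zero, so an
overflowed object is leaked rather than freed while still in use.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for refcount_t; NOT atomic, illustration only. */
struct sketch_refcount {
	unsigned int refs;
};

static inline void sketch_refcount_set(struct sketch_refcount *r, unsigned int n)
{
	r->refs = n;
}

static inline unsigned int sketch_refcount_read(const struct sketch_refcount *r)
{
	return r->refs;
}

/* Increment, but stick at UINT_MAX instead of wrapping around to 0. */
static inline void sketch_refcount_inc(struct sketch_refcount *r)
{
	if (r->refs != UINT_MAX)
		r->refs++;
}

/* Drop one reference; returns true when the caller released the last one. */
static inline bool sketch_refcount_dec_and_test(struct sketch_refcount *r)
{
	assert(r->refs != 0);		/* dropping a reference nobody holds is a bug */
	if (r->refs == UINT_MAX)	/* saturated: never report "last reference" */
		return false;
	return --r->refs == 0;
}
```

With a plain wrapping counter, enough unbalanced increments would bring the
value back to a small number, and a later put could free an object that is
still referenced; the saturating variant trades that for a leak.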

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: David Windsor <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Kook <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: David Windsor <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Hans Liljestrand <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Kook <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matija Glavinic Pecotic <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
[ Fixed missed conversion to refcount_t in tests/cpumap.c ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

---
tools/perf/tests/cpumap.c | 2 +-
tools/perf/util/cpumap.c | 16 ++++++++--------
tools/perf/util/cpumap.h | 4 ++--
3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/tools/perf/tests/cpumap.c b/tools/perf/tests/cpumap.c
index f168a85992d0..4478773cdb97 100644
--- a/tools/perf/tests/cpumap.c
+++ b/tools/perf/tests/cpumap.c
@@ -66,7 +66,7 @@ static int process_event_cpus(struct perf_tool *tool __maybe_unused,
TEST_ASSERT_VAL("wrong nr", map->nr == 2);
TEST_ASSERT_VAL("wrong cpu", map->map[0] == 1);
TEST_ASSERT_VAL("wrong cpu", map->map[1] == 256);
- TEST_ASSERT_VAL("wrong refcnt", atomic_read(&map->refcnt) == 1);
+ TEST_ASSERT_VAL("wrong refcnt", refcount_read(&map->refcnt) == 1);
cpu_map__put(map);
return 0;
}
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 8c7504939113..39ad2caccf56 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -29,7 +29,7 @@ static struct cpu_map *cpu_map__default_new(void)
cpus->map[i] = i;

cpus->nr = nr_cpus;
- atomic_set(&cpus->refcnt, 1);
+ refcount_set(&cpus->refcnt, 1);
}

return cpus;
@@ -43,7 +43,7 @@ static struct cpu_map *cpu_map__trim_new(int nr_cpus, int *tmp_cpus)
if (cpus != NULL) {
cpus->nr = nr_cpus;
memcpy(cpus->map, tmp_cpus, payload_size);
- atomic_set(&cpus->refcnt, 1);
+ refcount_set(&cpus->refcnt, 1);
}

return cpus;
@@ -252,7 +252,7 @@ struct cpu_map *cpu_map__dummy_new(void)
if (cpus != NULL) {
cpus->nr = 1;
cpus->map[0] = -1;
- atomic_set(&cpus->refcnt, 1);
+ refcount_set(&cpus->refcnt, 1);
}

return cpus;
@@ -269,7 +269,7 @@ struct cpu_map *cpu_map__empty_new(int nr)
for (i = 0; i < nr; i++)
cpus->map[i] = -1;

- atomic_set(&cpus->refcnt, 1);
+ refcount_set(&cpus->refcnt, 1);
}

return cpus;
@@ -278,7 +278,7 @@ struct cpu_map *cpu_map__empty_new(int nr)
static void cpu_map__delete(struct cpu_map *map)
{
if (map) {
- WARN_ONCE(atomic_read(&map->refcnt) != 0,
+ WARN_ONCE(refcount_read(&map->refcnt) != 0,
"cpu_map refcnt unbalanced\n");
free(map);
}
@@ -287,13 +287,13 @@ static void cpu_map__delete(struct cpu_map *map)
struct cpu_map *cpu_map__get(struct cpu_map *map)
{
if (map)
- atomic_inc(&map->refcnt);
+ refcount_inc(&map->refcnt);
return map;
}

void cpu_map__put(struct cpu_map *map)
{
- if (map && atomic_dec_and_test(&map->refcnt))
+ if (map && refcount_dec_and_test(&map->refcnt))
cpu_map__delete(map);
}

@@ -357,7 +357,7 @@ int cpu_map__build_map(struct cpu_map *cpus, struct cpu_map **res,
/* ensure we process id in increasing order */
qsort(c->map, c->nr, sizeof(int), cmp_ids);

- atomic_set(&c->refcnt, 1);
+ refcount_set(&c->refcnt, 1);
*res = c;
return 0;
}
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 1a0549af8f5c..e84491636c1b 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -3,13 +3,13 @@

#include <stdio.h>
#include <stdbool.h>
-#include <linux/atomic.h>
+#include <linux/refcount.h>

#include "perf.h"
#include "util/debug.h"

struct cpu_map {
- atomic_t refcnt;
+ refcount_t refcnt;
int nr;
int map[];
};
--
2.9.3

2017-03-06 19:40:03

by Arnaldo Carvalho de Melo

Subject: [PATCH 29/35] trace/kprobes: Add back warning about offset in return probes

From: "Steven Rostedt (VMware)" <[email protected]>

Let's not remove the warning about offsets and return probes when the
offset is invalid.

Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Acked-by: Naveen N. Rao <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
kernel/trace/trace_kprobe.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 18775ef182f8..2b7d0dd938ba 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -695,6 +695,11 @@ static int create_trace_kprobe(int argc, char **argv)
pr_info("Failed to parse symbol.\n");
return ret;
}
+ if (offset && is_return &&
+ !arch_function_offset_within_entry(offset)) {
+ pr_info("Given offset is not valid for return probe.\n");
+ return -EINVAL;
+ }
}
argc -= 2; argv += 2;

--
2.9.3

2017-03-06 19:40:16

by Arnaldo Carvalho de Melo

Subject: [PATCH 14/35] perf map: Convert map.refcnt from atomic_t to refcount_t

From: Elena Reshetova <[email protected]>

The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.

This helps avoid accidental reference counter overflows that might lead to
use-after-free situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: David Windsor <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Kook <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: David Windsor <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Hans Liljestrand <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Kook <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matija Glavinic Pecotic <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/map.c | 6 +++---
tools/perf/util/map.h | 6 +++---
2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 0a943e7b1ea7..f0e2428efd0b 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -141,7 +141,7 @@ void map__init(struct map *map, enum map_type type,
RB_CLEAR_NODE(&map->rb_node);
map->groups = NULL;
map->erange_warned = false;
- atomic_set(&map->refcnt, 1);
+ refcount_set(&map->refcnt, 1);
}

struct map *map__new(struct machine *machine, u64 start, u64 len,
@@ -255,7 +255,7 @@ void map__delete(struct map *map)

void map__put(struct map *map)
{
- if (map && atomic_dec_and_test(&map->refcnt))
+ if (map && refcount_dec_and_test(&map->refcnt))
map__delete(map);
}

@@ -354,7 +354,7 @@ struct map *map__clone(struct map *from)
struct map *map = memdup(from, sizeof(*map));

if (map != NULL) {
- atomic_set(&map->refcnt, 1);
+ refcount_set(&map->refcnt, 1);
RB_CLEAR_NODE(&map->rb_node);
dso__get(map->dso);
map->groups = NULL;
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index abdacf800c98..9545ff343ec5 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -1,7 +1,7 @@
#ifndef __PERF_MAP_H
#define __PERF_MAP_H

-#include <linux/atomic.h>
+#include <linux/refcount.h>
#include <linux/compiler.h>
#include <linux/list.h>
#include <linux/rbtree.h>
@@ -51,7 +51,7 @@ struct map {

struct dso *dso;
struct map_groups *groups;
- atomic_t refcnt;
+ refcount_t refcnt;
};

struct kmap {
@@ -150,7 +150,7 @@ struct map *map__clone(struct map *map);
static inline struct map *map__get(struct map *map)
{
if (map)
- atomic_inc(&map->refcnt);
+ refcount_inc(&map->refcnt);
return map;
}

--
2.9.3

2017-03-06 19:40:26

by Arnaldo Carvalho de Melo

Subject: [PATCH 24/35] perf ftrace: Use pager for displaying result

From: Namhyung Kim <[email protected]>

It's convenient to use the pager when viewing many lines of output.

Note that setup_pager() should be called after perf_evlist__prepare_workload()
since they can interfere with each other regarding shared stdio streams.

Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-ftrace.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index d5b566ed7178..6087295f8827 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -195,6 +195,7 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)
signal(SIGINT, sig_handler);
signal(SIGUSR1, sig_handler);
signal(SIGCHLD, sig_handler);
+ signal(SIGPIPE, sig_handler);

if (reset_tracing_files(ftrace) < 0)
goto out;
@@ -247,6 +248,8 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)
goto out_close_fd;
}

+ setup_pager();
+
perf_evlist__start_workload(ftrace->evlist);

while (!done) {
--
2.9.3

2017-03-06 19:40:58

by Arnaldo Carvalho de Melo

Subject: [PATCH 13/35] perf dso: Convert dso.refcnt from atomic_t to refcount_t

From: Elena Reshetova <[email protected]>

The refcount_t type and corresponding API should be used instead of atomic_t
when the variable is used as a reference counter.

This helps avoid accidental reference counter overflows that might lead to
use-after-free situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: David Windsor <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Kook <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: David Windsor <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Hans Liljestrand <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Kook <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matija Glavinic Pecotic <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/dso.c | 6 +++---
tools/perf/util/dso.h | 4 ++--
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index d38b62a700ca..42db00d78573 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1109,7 +1109,7 @@ struct dso *dso__new(const char *name)
INIT_LIST_HEAD(&dso->node);
INIT_LIST_HEAD(&dso->data.open_entry);
pthread_mutex_init(&dso->lock, NULL);
- atomic_set(&dso->refcnt, 1);
+ refcount_set(&dso->refcnt, 1);
}

return dso;
@@ -1147,13 +1147,13 @@ void dso__delete(struct dso *dso)
struct dso *dso__get(struct dso *dso)
{
if (dso)
- atomic_inc(&dso->refcnt);
+ refcount_inc(&dso->refcnt);
return dso;
}

void dso__put(struct dso *dso)
{
- if (dso && atomic_dec_and_test(&dso->refcnt))
+ if (dso && refcount_dec_and_test(&dso->refcnt))
dso__delete(dso);
}

diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index ecc4bbd3f82e..12350b171727 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -1,7 +1,7 @@
#ifndef __PERF_DSO
#define __PERF_DSO

-#include <linux/atomic.h>
+#include <linux/refcount.h>
#include <linux/types.h>
#include <linux/rbtree.h>
#include <sys/types.h>
@@ -187,7 +187,7 @@ struct dso {
void *priv;
u64 db_id;
};
- atomic_t refcnt;
+ refcount_t refcnt;
char name[0];
};

--
2.9.3

2017-03-06 19:40:39

by Arnaldo Carvalho de Melo

Subject: [PATCH 32/35] perf bench futex: Use __maybe_unused

From: Arnaldo Carvalho de Melo <[email protected]>

Instead of assigning a variable to itself to silence the compiler, use
the attribute designed for that, avoiding this:

In file included from bench/futex-hash.c:24:
bench/futex.h:95:7: error: explicitly assigning value of variable of type 'pthread_attr_t *' to itself [-Werror,-Wself-assign]
attr = attr;
~~~~ ^ ~~~~
bench/futex.h:96:13: error: explicitly assigning value of variable of type 'size_t' (aka 'unsigned long') to itself [-Werror,-Wself-assign]
cpusetsize = cpusetsize;
~~~~~~~~~~ ^ ~~~~~~~~~~
bench/futex.h:97:9: error: explicitly assigning value of variable of type 'cpu_set_t *' (aka 'struct cpu_set_t *') to itself [-Werror,-Wself-assign]
cpuset = cpuset;
~~~~~~ ^ ~~~~~~

That is only triggered when HAVE_PTHREAD_ATTR_SETAFFINITY_NP isn't set.
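
For readers unfamiliar with the attribute, a minimal sketch of the pattern
follows. The macro definition below is an assumption that mirrors how
__maybe_unused is commonly defined for gcc and clang, and
sketch_set_affinity() is a hypothetical stub, not the function touched by
this patch.

```c
#include <stddef.h>

/* Assumption: the usual gcc/clang spelling behind __maybe_unused. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

/*
 * Hypothetical fallback stub: it accepts its arguments and reports success
 * without touching them. The attribute tells the compiler this is intended,
 * so no self-assignments are needed to silence unused-parameter warnings.
 */
static inline int sketch_set_affinity(void *attr __maybe_unused,
				      size_t cpusetsize __maybe_unused,
				      void *cpuset __maybe_unused)
{
	return 0;
}
```

The attribute is attached to each parameter declaration, so the function body
stays free of statements that exist only to placate the compiler.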

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/bench/futex.h | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/tools/perf/bench/futex.h b/tools/perf/bench/futex.h
index b2e06d1190d0..e44fd3239530 100644
--- a/tools/perf/bench/futex.h
+++ b/tools/perf/bench/futex.h
@@ -88,13 +88,11 @@ futex_cmp_requeue(u_int32_t *uaddr, u_int32_t val, u_int32_t *uaddr2, int nr_wak

#ifndef HAVE_PTHREAD_ATTR_SETAFFINITY_NP
#include <pthread.h>
-static inline int pthread_attr_setaffinity_np(pthread_attr_t *attr,
- size_t cpusetsize,
- cpu_set_t *cpuset)
+#include <linux/compiler.h>
+static inline int pthread_attr_setaffinity_np(pthread_attr_t *attr __maybe_unused,
+ size_t cpusetsize __maybe_unused,
+ cpu_set_t *cpuset __maybe_unused)
{
- attr = attr;
- cpusetsize = cpusetsize;
- cpuset = cpuset;
return 0;
}
#endif
--
2.9.3

2017-03-06 19:41:06

by Arnaldo Carvalho de Melo

Subject: [PATCH 17/35] perf thread: convert thread.refcnt from atomic_t to refcount_t

From: Elena Reshetova <[email protected]>

The refcount_t type and corresponding API should be used instead of atomic_t
when the variable is used as a reference counter.

This helps avoid accidental reference counter overflows that might lead to
use-after-free situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: David Windsor <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Kook <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: David Windsor <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Hans Liljestrand <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Kook <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matija Glavinic Pecotic <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
[ Did the missing conversion in __machine__remove_thread() ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

---
tools/perf/util/machine.c | 2 +-
tools/perf/util/thread.c | 6 +++---
tools/perf/util/thread.h | 4 ++--
3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 71c9720d4973..b9974fe41bc1 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1439,7 +1439,7 @@ static void __machine__remove_thread(struct machine *machine, struct thread *th,
if (machine->last_match == th)
machine->last_match = NULL;

- BUG_ON(atomic_read(&th->refcnt) == 0);
+ BUG_ON(refcount_read(&th->refcnt) == 0);
if (lock)
pthread_rwlock_wrlock(&machine->threads_lock);
rb_erase_init(&th->rb_node, &machine->threads);
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index f5af87f66663..74e79d26b421 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -53,7 +53,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
goto err_thread;

list_add(&comm->list, &thread->comm_list);
- atomic_set(&thread->refcnt, 1);
+ refcount_set(&thread->refcnt, 1);
RB_CLEAR_NODE(&thread->rb_node);
}

@@ -88,13 +88,13 @@ void thread__delete(struct thread *thread)
struct thread *thread__get(struct thread *thread)
{
if (thread)
- atomic_inc(&thread->refcnt);
+ refcount_inc(&thread->refcnt);
return thread;
}

void thread__put(struct thread *thread)
{
- if (thread && atomic_dec_and_test(&thread->refcnt)) {
+ if (thread && refcount_dec_and_test(&thread->refcnt)) {
/*
* Remove it from the dead_threads list, as last reference
* is gone.
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 99263cb6e6b6..e57188546465 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -1,7 +1,7 @@
#ifndef __PERF_THREAD_H
#define __PERF_THREAD_H

-#include <linux/atomic.h>
+#include <linux/refcount.h>
#include <linux/rbtree.h>
#include <linux/list.h>
#include <unistd.h>
@@ -23,7 +23,7 @@ struct thread {
pid_t tid;
pid_t ppid;
int cpu;
- atomic_t refcnt;
+ refcount_t refcnt;
char shortname[3];
bool comm_set;
int comm_len;
--
2.9.3

2017-03-06 19:40:50

by Arnaldo Carvalho de Melo

Subject: [PATCH 15/35] perf map: Convert map_groups.refcnt from atomic_t to refcount_t

From: Elena Reshetova <[email protected]>

The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.

This helps avoid accidental reference counter overflows that might lead to
use-after-free situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: David Windsor <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Kook <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: David Windsor <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Hans Liljestrand <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Kook <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matija Glavinic Pecotic <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
[ Did the missing conversion of tests/thread-mg-share.c too ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

---
tools/perf/tests/thread-mg-share.c | 12 ++++++------
tools/perf/util/map.c | 4 ++--
tools/perf/util/map.h | 4 ++--
3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tools/perf/tests/thread-mg-share.c b/tools/perf/tests/thread-mg-share.c
index 188b63140fc8..76686dd6f5ec 100644
--- a/tools/perf/tests/thread-mg-share.c
+++ b/tools/perf/tests/thread-mg-share.c
@@ -43,7 +43,7 @@ int test__thread_mg_share(int subtest __maybe_unused)
leader && t1 && t2 && t3 && other);

mg = leader->mg;
- TEST_ASSERT_EQUAL("wrong refcnt", atomic_read(&mg->refcnt), 4);
+ TEST_ASSERT_EQUAL("wrong refcnt", refcount_read(&mg->refcnt), 4);

/* test the map groups pointer is shared */
TEST_ASSERT_VAL("map groups don't match", mg == t1->mg);
@@ -71,25 +71,25 @@ int test__thread_mg_share(int subtest __maybe_unused)
machine__remove_thread(machine, other_leader);

other_mg = other->mg;
- TEST_ASSERT_EQUAL("wrong refcnt", atomic_read(&other_mg->refcnt), 2);
+ TEST_ASSERT_EQUAL("wrong refcnt", refcount_read(&other_mg->refcnt), 2);

TEST_ASSERT_VAL("map groups don't match", other_mg == other_leader->mg);

/* release thread group */
thread__put(leader);
- TEST_ASSERT_EQUAL("wrong refcnt", atomic_read(&mg->refcnt), 3);
+ TEST_ASSERT_EQUAL("wrong refcnt", refcount_read(&mg->refcnt), 3);

thread__put(t1);
- TEST_ASSERT_EQUAL("wrong refcnt", atomic_read(&mg->refcnt), 2);
+ TEST_ASSERT_EQUAL("wrong refcnt", refcount_read(&mg->refcnt), 2);

thread__put(t2);
- TEST_ASSERT_EQUAL("wrong refcnt", atomic_read(&mg->refcnt), 1);
+ TEST_ASSERT_EQUAL("wrong refcnt", refcount_read(&mg->refcnt), 1);

thread__put(t3);

/* release other group */
thread__put(other_leader);
- TEST_ASSERT_EQUAL("wrong refcnt", atomic_read(&other_mg->refcnt), 1);
+ TEST_ASSERT_EQUAL("wrong refcnt", refcount_read(&other_mg->refcnt), 1);

thread__put(other);

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index f0e2428efd0b..1d9ebcf9e38e 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -485,7 +485,7 @@ void map_groups__init(struct map_groups *mg, struct machine *machine)
maps__init(&mg->maps[i]);
}
mg->machine = machine;
- atomic_set(&mg->refcnt, 1);
+ refcount_set(&mg->refcnt, 1);
}

static void __maps__purge(struct maps *maps)
@@ -547,7 +547,7 @@ void map_groups__delete(struct map_groups *mg)

void map_groups__put(struct map_groups *mg)
{
- if (mg && atomic_dec_and_test(&mg->refcnt))
+ if (mg && refcount_dec_and_test(&mg->refcnt))
map_groups__delete(mg);
}

diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 9545ff343ec5..c8a5a644c0a9 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -67,7 +67,7 @@ struct maps {
struct map_groups {
struct maps maps[MAP__NR_TYPES];
struct machine *machine;
- atomic_t refcnt;
+ refcount_t refcnt;
};

struct map_groups *map_groups__new(struct machine *machine);
@@ -77,7 +77,7 @@ bool map_groups__empty(struct map_groups *mg);
static inline struct map_groups *map_groups__get(struct map_groups *mg)
{
if (mg)
- atomic_inc(&mg->refcnt);
+ refcount_inc(&mg->refcnt);
return mg;
}

--
2.9.3

2017-03-06 19:41:16

by Arnaldo Carvalho de Melo

Subject: [PATCH 35/35] perf bench numa: Add more comment for -c option

From: Jiri Olsa <[email protected]>

Add more commentary for the -c/--show_convergence option, to explain how
convergence is defined.

Before:
-c, --show_convergence
show convergence details

Now:
-c, --show_convergence
convergence is reached when each process \
(all its threads) is running on a single NUMA node.

Suggested-by: Jiri Hladky <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Hladky <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Rephrased a bit based on an IRC conversation with Jiri ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

---
tools/perf/bench/numa.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/bench/numa.c b/tools/perf/bench/numa.c
index 3083fc36282b..6bd0581de298 100644
--- a/tools/perf/bench/numa.c
+++ b/tools/perf/bench/numa.c
@@ -187,7 +187,8 @@ static const struct option options[] = {
OPT_INCR ('d', "show_details" , &p0.show_details, "Show details"),
OPT_INCR ('a', "all" , &p0.run_all, "Run all tests in the suite"),
OPT_INTEGER('H', "thp" , &p0.thp, "MADV_NOHUGEPAGE < 0 < MADV_HUGEPAGE"),
- OPT_BOOLEAN('c', "show_convergence", &p0.show_convergence, "show convergence details"),
+ OPT_BOOLEAN('c', "show_convergence", &p0.show_convergence, "show convergence details, "
+ "convergence is reached when each process (all its threads) is running on a single NUMA node."),
OPT_BOOLEAN('m', "measure_convergence", &p0.measure_convergence, "measure convergence latency"),
OPT_BOOLEAN('q', "quiet" , &p0.show_quiet, "quiet mode"),
OPT_BOOLEAN('S', "serialize-startup", &p0.serialize_startup,"serialize thread startup"),
--
2.9.3

2017-03-06 19:41:29

by Arnaldo Carvalho de Melo

Subject: [PATCH 33/35] perf bench futex: Fix build on musl + clang

From: Arnaldo Carvalho de Melo <[email protected]>

When building with clang on a musl libc system (Alpine Linux), we hit a
problem where memset() is used but its prototype is not present. Include
<string.h> to avoid this:

bench/futex-wake.c:99:3: error: implicitly declaring library function 'memset' with type 'void *(void *, int, unsigned long)'
[-Werror,-Wimplicit-function-declaration]
CPU_ZERO(&cpu);
^
/usr/include/sched.h:127:23: note: expanded from macro 'CPU_ZERO'
#define CPU_ZERO(set) CPU_ZERO_S(sizeof(cpu_set_t),set)
^
/usr/include/sched.h:110:30: note: expanded from macro 'CPU_ZERO_S'
#define CPU_ZERO_S(size,set) memset(set,0,size)
^
bench/futex-wake.c:99:3: note: include the header <string.h> or explicitly provide a declaration for 'memset'

Found while updating my test build containers to build perf with clang on more
systems.
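
The failure mode can be reproduced in miniature. In the sketch below the
SKETCH_ZERO() macros and struct sketch_set are hypothetical, shaped like the
sched.h definitions quoted in the error output; the point is only that a
macro expanding to memset() relies on <string.h> having been included at the
point where the macro is used.

```c
#include <stddef.h>
#include <string.h>	/* supplies the memset() prototype used by the macros below */

/* Hypothetical set type and macros, shaped like the quoted sched.h ones. */
struct sketch_set {
	unsigned long bits[16];
};

#define SKETCH_ZERO_S(size, set)	memset(set, 0, size)
#define SKETCH_ZERO(set)		SKETCH_ZERO_S(sizeof(struct sketch_set), set)

/* Fill a set with non-zero data, clear it via the macro, count non-zero words. */
static inline size_t sketch_nonzero_words_after_clear(void)
{
	struct sketch_set set;
	size_t i, nonzero = 0;

	for (i = 0; i < sizeof(set.bits) / sizeof(set.bits[0]); i++)
		set.bits[i] = ~0UL;

	SKETCH_ZERO(&set);

	for (i = 0; i < sizeof(set.bits) / sizeof(set.bits[0]); i++)
		nonzero += set.bits[i] != 0;

	return nonzero;
}
```

Dropping the <string.h> include from this sketch should make a compiler that
treats implicit function declarations as errors reject the SKETCH_ZERO() use,
in the same way as the CPU_ZERO() case above.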

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/bench/futex-hash.c | 1 +
tools/perf/bench/futex-lock-pi.c | 1 +
tools/perf/bench/futex-requeue.c | 1 +
tools/perf/bench/futex-wake-parallel.c | 1 +
tools/perf/bench/futex-wake.c | 1 +
5 files changed, 5 insertions(+)

diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index da04b8c5568a..2499e1b0c6fb 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -9,6 +9,7 @@
*/

/* For the CLR_() macros */
+#include <string.h>
#include <pthread.h>

#include <errno.h>
diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c
index 91877777ec6e..a20814d94af1 100644
--- a/tools/perf/bench/futex-lock-pi.c
+++ b/tools/perf/bench/futex-lock-pi.c
@@ -3,6 +3,7 @@
*/

/* For the CLR_() macros */
+#include <string.h>
#include <pthread.h>

#include <signal.h>
diff --git a/tools/perf/bench/futex-requeue.c b/tools/perf/bench/futex-requeue.c
index 2b9705a8734c..9fad1e4fcd3e 100644
--- a/tools/perf/bench/futex-requeue.c
+++ b/tools/perf/bench/futex-requeue.c
@@ -9,6 +9,7 @@
*/

/* For the CLR_() macros */
+#include <string.h>
#include <pthread.h>

#include <signal.h>
diff --git a/tools/perf/bench/futex-wake-parallel.c b/tools/perf/bench/futex-wake-parallel.c
index 2c8fa67ad537..40f5fcf1d120 100644
--- a/tools/perf/bench/futex-wake-parallel.c
+++ b/tools/perf/bench/futex-wake-parallel.c
@@ -8,6 +8,7 @@
*/

/* For the CLR_() macros */
+#include <string.h>
#include <pthread.h>

#include <signal.h>
diff --git a/tools/perf/bench/futex-wake.c b/tools/perf/bench/futex-wake.c
index e246b1b8388a..789490281ae3 100644
--- a/tools/perf/bench/futex-wake.c
+++ b/tools/perf/bench/futex-wake.c
@@ -9,6 +9,7 @@
*/

/* For the CLR_() macros */
+#include <string.h>
#include <pthread.h>

#include <signal.h>
--
2.9.3

2017-03-06 19:42:24

by Arnaldo Carvalho de Melo

Subject: [PATCH 34/35] tools build: Use the same CC for feature detection and actual build

From: Arnaldo Carvalho de Melo <[email protected]>

When building with 'make CC=clang', we were not using that CC for
feature detection, so features were detected with gcc and the actual
tools were then built with clang.

Most of the time these compilers are compatible enough, so no
problem was noticed.

But on a system where clang is old enough to lack the cpuid.h header
while gcc provides it, the "get_cpuid" feature is detected as available,
and then the code that uses it fails to compile.

Noticed with this combination:

/ $ gcc --version | head -1
gcc (Alpine 6.3.0) 6.3.0
/ $ clang --version | head -1
clang version 3.8.1 (tags/RELEASE_381/final)
/ $ cat /etc/alpine-release
3.5.0
/ $

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/build/feature/Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index ab1e2bbc2e96..09c9626ea666 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -53,8 +53,8 @@ FILES= \

FILES := $(addprefix $(OUTPUT),$(FILES))

-CC := $(CROSS_COMPILE)gcc -MD
-CXX := $(CROSS_COMPILE)g++ -MD
+CC ?= $(CROSS_COMPILE)gcc -MD
+CXX ?= $(CROSS_COMPILE)g++ -MD
PKG_CONFIG := $(CROSS_COMPILE)pkg-config
LLVM_CONFIG ?= llvm-config

--
2.9.3

2017-03-06 19:42:45

by Arnaldo Carvalho de Melo

Subject: [PATCH 18/35] perf thread_map: Convert thread_map.refcnt from atomic_t to refcount_t

From: Elena Reshetova <[email protected]>

The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.

This helps avoid accidental reference counter overflows that might lead to
use-after-free situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: David Windsor <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Kook <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: David Windsor <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Hans Liljestrand <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Kook <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matija Glavinic Pecotic <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
[ Did the missing tests/thread-map.c conversion ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

---
tools/perf/tests/thread-map.c | 6 +++---
tools/perf/util/thread_map.c | 20 ++++++++++----------
tools/perf/util/thread_map.h | 4 ++--
3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/tools/perf/tests/thread-map.c b/tools/perf/tests/thread-map.c
index f2d2e542d0ee..a63d6945807b 100644
--- a/tools/perf/tests/thread-map.c
+++ b/tools/perf/tests/thread-map.c
@@ -29,7 +29,7 @@ int test__thread_map(int subtest __maybe_unused)
thread_map__comm(map, 0) &&
!strcmp(thread_map__comm(map, 0), NAME));
TEST_ASSERT_VAL("wrong refcnt",
- atomic_read(&map->refcnt) == 1);
+ refcount_read(&map->refcnt) == 1);
thread_map__put(map);

/* test dummy pid */
@@ -44,7 +44,7 @@ int test__thread_map(int subtest __maybe_unused)
thread_map__comm(map, 0) &&
!strcmp(thread_map__comm(map, 0), "dummy"));
TEST_ASSERT_VAL("wrong refcnt",
- atomic_read(&map->refcnt) == 1);
+ refcount_read(&map->refcnt) == 1);
thread_map__put(map);
return 0;
}
@@ -71,7 +71,7 @@ static int process_event(struct perf_tool *tool __maybe_unused,
thread_map__comm(threads, 0) &&
!strcmp(thread_map__comm(threads, 0), NAME));
TEST_ASSERT_VAL("wrong refcnt",
- atomic_read(&threads->refcnt) == 1);
+ refcount_read(&threads->refcnt) == 1);
thread_map__put(threads);
return 0;
}
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 7c3fcc538a70..9026408ea55b 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -66,7 +66,7 @@ struct thread_map *thread_map__new_by_pid(pid_t pid)
for (i = 0; i < items; i++)
thread_map__set_pid(threads, i, atoi(namelist[i]->d_name));
threads->nr = items;
- atomic_set(&threads->refcnt, 1);
+ refcount_set(&threads->refcnt, 1);
}

for (i=0; i<items; i++)
@@ -83,7 +83,7 @@ struct thread_map *thread_map__new_by_tid(pid_t tid)
if (threads != NULL) {
thread_map__set_pid(threads, 0, tid);
threads->nr = 1;
- atomic_set(&threads->refcnt, 1);
+ refcount_set(&threads->refcnt, 1);
}

return threads;
@@ -105,7 +105,7 @@ struct thread_map *thread_map__new_by_uid(uid_t uid)
goto out_free_threads;

threads->nr = 0;
- atomic_set(&threads->refcnt, 1);
+ refcount_set(&threads->refcnt, 1);

while ((dirent = readdir(proc)) != NULL) {
char *end;
@@ -235,7 +235,7 @@ static struct thread_map *thread_map__new_by_pid_str(const char *pid_str)
out:
strlist__delete(slist);
if (threads)
- atomic_set(&threads->refcnt, 1);
+ refcount_set(&threads->refcnt, 1);
return threads;

out_free_namelist:
@@ -255,7 +255,7 @@ struct thread_map *thread_map__new_dummy(void)
if (threads != NULL) {
thread_map__set_pid(threads, 0, -1);
threads->nr = 1;
- atomic_set(&threads->refcnt, 1);
+ refcount_set(&threads->refcnt, 1);
}
return threads;
}
@@ -300,7 +300,7 @@ struct thread_map *thread_map__new_by_tid_str(const char *tid_str)
}
out:
if (threads)
- atomic_set(&threads->refcnt, 1);
+ refcount_set(&threads->refcnt, 1);
return threads;

out_free_threads:
@@ -326,7 +326,7 @@ static void thread_map__delete(struct thread_map *threads)
if (threads) {
int i;

- WARN_ONCE(atomic_read(&threads->refcnt) != 0,
+ WARN_ONCE(refcount_read(&threads->refcnt) != 0,
"thread map refcnt unbalanced\n");
for (i = 0; i < threads->nr; i++)
free(thread_map__comm(threads, i));
@@ -337,13 +337,13 @@ static void thread_map__delete(struct thread_map *threads)
struct thread_map *thread_map__get(struct thread_map *map)
{
if (map)
- atomic_inc(&map->refcnt);
+ refcount_inc(&map->refcnt);
return map;
}

void thread_map__put(struct thread_map *map)
{
- if (map && atomic_dec_and_test(&map->refcnt))
+ if (map && refcount_dec_and_test(&map->refcnt))
thread_map__delete(map);
}

@@ -423,7 +423,7 @@ static void thread_map__copy_event(struct thread_map *threads,
threads->map[i].comm = strndup(event->entries[i].comm, 16);
}

- atomic_set(&threads->refcnt, 1);
+ refcount_set(&threads->refcnt, 1);
}

struct thread_map *thread_map__new_event(struct thread_map_event *event)
diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index ea0ef08c6303..bd34d7a0b9fa 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -3,7 +3,7 @@

#include <sys/types.h>
#include <stdio.h>
-#include <linux/atomic.h>
+#include <linux/refcount.h>

struct thread_map_data {
pid_t pid;
@@ -11,7 +11,7 @@ struct thread_map_data {
};

struct thread_map {
- atomic_t refcnt;
+ refcount_t refcnt;
int nr;
struct thread_map_data map[];
};
--
2.9.3

2017-03-06 19:43:01

by Arnaldo Carvalho de Melo

Subject: [PATCH 16/35] perf evlist: Convert perf_mmap.refcnt from atomic_t to refcount_t

From: Elena Reshetova <[email protected]>

The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.

This avoids accidental refcounter overflows that might lead to
use-after-free situations.
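
A minimal user-space sketch of the overflow in question, assuming C11
atomics rather than the tools/ atomic_t wrappers. The helper names
(sat_inc_not_zero, wrapping_inc, saturating_inc) are hypothetical and are
not part of the refcount_t API. A plain increment wraps from UINT_MAX to 0,
which looks like "no references left", while a saturating increment stays
pinned at UINT_MAX and leaks the object instead:

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: increment unless zero, pin the value at UINT_MAX. */
static bool sat_inc_not_zero(atomic_uint *refs)
{
	unsigned int val;

	assert(refs != NULL);
	val = atomic_load_explicit(refs, memory_order_relaxed);

	for (;;) {
		if (val == 0)
			return false;	/* object was already released */
		if (val == UINT_MAX)
			return true;	/* saturated: leave the counter alone */
		/* On failure, val is refreshed with the current counter value. */
		if (atomic_compare_exchange_weak_explicit(refs, &val, val + 1,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return true;
	}
}

/* Counter value after one plain (wrapping) atomic increment. */
unsigned int wrapping_inc(unsigned int start)
{
	atomic_uint refs;

	atomic_init(&refs, start);
	atomic_fetch_add(&refs, 1);
	return atomic_load(&refs);
}

/* Counter value after one saturating increment. */
unsigned int saturating_inc(unsigned int start)
{
	atomic_uint refs;

	atomic_init(&refs, start);
	(void)sat_inc_not_zero(&refs);
	return atomic_load(&refs);
}
```

With a counter already at UINT_MAX, wrapping_inc() yields 0, so the next
put would free an object that still has users; saturating_inc() leaves the
value at UINT_MAX.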

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: David Windsor <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Kook <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: David Windsor <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Hans Liljestrand <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Kook <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matija Glavinic Pecotic <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/evlist.c | 18 +++++++++---------
tools/perf/util/evlist.h | 4 ++--
2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index b601f2814a30..564b924fb48a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -777,7 +777,7 @@ union perf_event *perf_mmap__read_forward(struct perf_mmap *md, bool check_messu
/*
* Check if event was unmapped due to a POLLHUP/POLLERR.
*/
- if (!atomic_read(&md->refcnt))
+ if (!refcount_read(&md->refcnt))
return NULL;

head = perf_mmap__read_head(md);
@@ -794,7 +794,7 @@ perf_mmap__read_backward(struct perf_mmap *md)
/*
* Check if event was unmapped due to a POLLHUP/POLLERR.
*/
- if (!atomic_read(&md->refcnt))
+ if (!refcount_read(&md->refcnt))
return NULL;

head = perf_mmap__read_head(md);
@@ -856,7 +856,7 @@ void perf_mmap__read_catchup(struct perf_mmap *md)
{
u64 head;

- if (!atomic_read(&md->refcnt))
+ if (!refcount_read(&md->refcnt))
return;

head = perf_mmap__read_head(md);
@@ -875,14 +875,14 @@ static bool perf_mmap__empty(struct perf_mmap *md)

static void perf_mmap__get(struct perf_mmap *map)
{
- atomic_inc(&map->refcnt);
+ refcount_inc(&map->refcnt);
}

static void perf_mmap__put(struct perf_mmap *md)
{
- BUG_ON(md->base && atomic_read(&md->refcnt) == 0);
+ BUG_ON(md->base && refcount_read(&md->refcnt) == 0);

- if (atomic_dec_and_test(&md->refcnt))
+ if (refcount_dec_and_test(&md->refcnt))
perf_mmap__munmap(md);
}

@@ -894,7 +894,7 @@ void perf_mmap__consume(struct perf_mmap *md, bool overwrite)
perf_mmap__write_tail(md, old);
}

- if (atomic_read(&md->refcnt) == 1 && perf_mmap__empty(md))
+ if (refcount_read(&md->refcnt) == 1 && perf_mmap__empty(md))
perf_mmap__put(md);
}

@@ -937,7 +937,7 @@ static void perf_mmap__munmap(struct perf_mmap *map)
munmap(map->base, perf_mmap__mmap_len(map));
map->base = NULL;
map->fd = -1;
- atomic_set(&map->refcnt, 0);
+ refcount_set(&map->refcnt, 0);
}
auxtrace_mmap__munmap(&map->auxtrace_mmap);
}
@@ -1001,7 +1001,7 @@ static int perf_mmap__mmap(struct perf_mmap *map,
* evlist layer can't just drop it when filtering events in
* perf_evlist__filter_pollfd().
*/
- atomic_set(&map->refcnt, 2);
+ refcount_set(&map->refcnt, 2);
map->prev = 0;
map->mask = mp->mask;
map->base = mmap(NULL, perf_mmap__mmap_len(map), mp->prot,
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 389b9ccdf8c7..39942995f537 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -1,7 +1,7 @@
#ifndef __PERF_EVLIST_H
#define __PERF_EVLIST_H 1

-#include <linux/atomic.h>
+#include <linux/refcount.h>
#include <linux/list.h>
#include <api/fd/array.h>
#include <stdio.h>
@@ -29,7 +29,7 @@ struct perf_mmap {
void *base;
int mask;
int fd;
- atomic_t refcnt;
+ refcount_t refcnt;
u64 prev;
struct auxtrace_mmap auxtrace_mmap;
char event_copy[PERF_SAMPLE_MAX_SIZE] __attribute__((aligned(8)));
--
2.9.3

2017-03-06 19:43:11

by Arnaldo Carvalho de Melo

Subject: [PATCH 22/35] perf cpumap: Introduce cpu_map__snprint_mask()

From: Namhyung Kim <[email protected]>

cpu_map__snprint_mask() generates a string representation of a
cpumask bitmap. For cpus 0 to 11, it returns "fff".
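
The conversion can be sketched outside of perf as follows. This is an
illustration under stated assumptions, not the patch itself:
cpus_to_hex_mask() is a hypothetical name that takes a sorted, non-empty
array of cpu numbers instead of a struct cpu_map. It also sizes the bitmap
as last_cpu / 8 + 1 bytes, where the patch uses (last_cpu + 7) / 8, so that
the byte holding last_cpu is always allocated (for last_cpu = 8 the latter
gives one byte while bitmap[1] is written):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static char hex_char(unsigned char val)
{
	if (val < 10)
		return val + '0';
	if (val < 16)
		return val - 10 + 'a';
	return '?';
}

/*
 * Hypothetical stand-alone variant: cpus[] must be sorted ascending and
 * non-empty. buf needs one hex digit per 4 cpus, one ',' per 32 cpus and
 * a trailing NUL. Returns the string length, or 0 on failure.
 */
size_t cpus_to_hex_mask(const int *cpus, int nr, char *buf, size_t size)
{
	int last_cpu, i, cpu;
	size_t needed;
	unsigned char *bitmap;
	char *ptr = buf;

	assert(nr > 0 && size > 0);
	last_cpu = cpus[nr - 1];
	needed = last_cpu / 4 + 1 + last_cpu / 32 + 1;

	buf[0] = '\0';
	if (size < needed)
		return 0;

	/* One bit per cpu, including the byte that holds last_cpu. */
	bitmap = calloc(last_cpu / 8 + 1, 1);
	if (bitmap == NULL)
		return 0;

	for (i = 0; i < nr; i++)
		bitmap[cpus[i] / 8] |= 1 << (cpus[i] % 8);

	/* Emit nibbles from the highest cpu group down to cpus 0-3. */
	for (cpu = last_cpu / 4 * 4; cpu >= 0; cpu -= 4) {
		unsigned char bits = bitmap[cpu / 8];

		if (cpu % 8)
			bits >>= 4;	/* upper nibble of the byte */
		else
			bits &= 0xf;	/* lower nibble of the byte */

		*ptr++ = hex_char(bits);
		if ((cpu % 32) == 0 && cpu > 0)
			*ptr++ = ',';
	}
	*ptr = '\0';
	free(bitmap);

	return ptr - buf;
}
```

For cpus 0 to 11 this fills buf with "fff"; for cpus 0 and 32 it gives
"1,00000001", with the comma separating 32-bit groups.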

Committer notes:

Fix compiler warning on some toolchains:

19 fedora:24-x-ARC-uClibc: FAIL

CC /tmp/build/perf/util/cpumap.o
util/cpumap.c: In function 'hex_char':
util/cpumap.c:679:2: error: comparison is always true due to limited range of data type [-Werror=type-limits]
if (0 <= val && val <= 9)
^
cc1: all warnings being treated as errors

Apply the patch from Namhyung that makes the function take an 'unsigned
char', which is what the callers pass to it.

Signed-off-by: Namhyung Kim <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/cpumap.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
tools/perf/util/cpumap.h | 1 +
2 files changed, 47 insertions(+)

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 39ad2caccf56..061018b42393 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -673,3 +673,49 @@ size_t cpu_map__snprint(struct cpu_map *map, char *buf, size_t size)
pr_debug("cpumask list: %s\n", buf);
return ret;
}
+
+static char hex_char(unsigned char val)
+{
+ if (val < 10)
+ return val + '0';
+ if (val < 16)
+ return val - 10 + 'a';
+ return '?';
+}
+
+size_t cpu_map__snprint_mask(struct cpu_map *map, char *buf, size_t size)
+{
+ int i, cpu;
+ char *ptr = buf;
+ unsigned char *bitmap;
+ int last_cpu = cpu_map__cpu(map, map->nr - 1);
+
+ bitmap = zalloc((last_cpu + 7) / 8);
+ if (bitmap == NULL) {
+ buf[0] = '\0';
+ return 0;
+ }
+
+ for (i = 0; i < map->nr; i++) {
+ cpu = cpu_map__cpu(map, i);
+ bitmap[cpu / 8] |= 1 << (cpu % 8);
+ }
+
+ for (cpu = last_cpu / 4 * 4; cpu >= 0; cpu -= 4) {
+ unsigned char bits = bitmap[cpu / 8];
+
+ if (cpu % 8)
+ bits >>= 4;
+ else
+ bits &= 0xf;
+
+ *ptr++ = hex_char(bits);
+ if ((cpu % 32) == 0 && cpu > 0)
+ *ptr++ = ',';
+ }
+ *ptr = '\0';
+ free(bitmap);
+
+ buf[size - 1] = '\0';
+ return ptr - buf;
+}
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index e84491636c1b..6b8bff87481d 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -20,6 +20,7 @@ struct cpu_map *cpu_map__dummy_new(void);
struct cpu_map *cpu_map__new_data(struct cpu_map_data *data);
struct cpu_map *cpu_map__read(FILE *file);
size_t cpu_map__snprint(struct cpu_map *map, char *buf, size_t size);
+size_t cpu_map__snprint_mask(struct cpu_map *map, char *buf, size_t size);
size_t cpu_map__fprintf(struct cpu_map *map, FILE *fp);
int cpu_map__get_socket_id(int cpu);
int cpu_map__get_socket(struct cpu_map *map, int idx, void *data);
--
2.9.3

2017-03-06 19:43:26

by Arnaldo Carvalho de Melo

Subject: [PATCH 10/35] perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t

From: Elena Reshetova <[email protected]>

The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.

This avoids accidental refcounter overflows that might lead to
use-after-free situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: David Windsor <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Kook <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: [email protected]
Cc: Andrew Morton <[email protected]>
Cc: David Windsor <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Hans Liljestrand <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Kook <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matija Glavinic Pecotic <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/cgroup.c | 6 +++---
tools/perf/util/cgroup.h | 4 ++--
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c
index eafbf11442b2..86399eda3684 100644
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c
@@ -127,19 +127,19 @@ static int add_cgroup(struct perf_evlist *evlist, char *str)
goto found;
n++;
}
- if (atomic_read(&cgrp->refcnt) == 0)
+ if (refcount_read(&cgrp->refcnt) == 0)
free(cgrp);

return -1;
found:
- atomic_inc(&cgrp->refcnt);
+ refcount_inc(&cgrp->refcnt);
counter->cgrp = cgrp;
return 0;
}

void close_cgroup(struct cgroup_sel *cgrp)
{
- if (cgrp && atomic_dec_and_test(&cgrp->refcnt)) {
+ if (cgrp && refcount_dec_and_test(&cgrp->refcnt)) {
close(cgrp->fd);
zfree(&cgrp->name);
free(cgrp);
diff --git a/tools/perf/util/cgroup.h b/tools/perf/util/cgroup.h
index 31f8dcdbd7ef..d91966b97cbd 100644
--- a/tools/perf/util/cgroup.h
+++ b/tools/perf/util/cgroup.h
@@ -1,14 +1,14 @@
#ifndef __CGROUP_H__
#define __CGROUP_H__

-#include <linux/atomic.h>
+#include <linux/refcount.h>

struct option;

struct cgroup_sel {
char *name;
int fd;
- atomic_t refcnt;
+ refcount_t refcnt;
};


--
2.9.3

2017-03-06 19:43:16

by Arnaldo Carvalho de Melo

Subject: [PATCH 26/35] trace/kprobes: Allow return probes with offsets and absolute addresses

From: "Naveen N. Rao" <[email protected]>

Since the kernel includes many non-global functions with the same names,
we will need to use offsets from other symbols (typically _text/_stext) or
absolute addresses to place return probes on specific functions. Also,
the core register_kretprobe() API never forbade the use of offsets or
absolute addresses with kretprobes.

Allow its use with the trace infrastructure. To distinguish kernels that
support this, update ftrace README to explicitly call this out.

Signed-off-by: Naveen N. Rao <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/183e7ce2921a08c9c755ee9a5da3134febc6695b.1487770934.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
kernel/trace/trace.c | 1 +
kernel/trace/trace_kprobe.c | 8 --------
2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index f35109514a01..0ed834d6beb0 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4355,6 +4355,7 @@ static const char readme_msg[] =
"\t -:[<group>/]<event>\n"
#ifdef CONFIG_KPROBE_EVENTS
"\t place: [<module>:]<symbol>[+<offset>]|<memaddr>\n"
+ "place (kretprobe): [<module>:]<symbol>[+<offset>]|<memaddr>\n"
#endif
#ifdef CONFIG_UPROBE_EVENTS
"\t place: <path>:<offset>\n"
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index eadd96ef772f..18775ef182f8 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -680,10 +680,6 @@ static int create_trace_kprobe(int argc, char **argv)
return -EINVAL;
}
if (isdigit(argv[1][0])) {
- if (is_return) {
- pr_info("Return probe point must be a symbol.\n");
- return -EINVAL;
- }
/* an address specified */
ret = kstrtoul(&argv[1][0], 0, (unsigned long *)&addr);
if (ret) {
@@ -699,10 +695,6 @@ static int create_trace_kprobe(int argc, char **argv)
pr_info("Failed to parse symbol.\n");
return ret;
}
- if (offset && is_return) {
- pr_info("Return probe must be used without offset.\n");
- return -EINVAL;
- }
}
argc -= 2; argv += 2;

--
2.9.3

2017-03-06 19:43:36

by Arnaldo Carvalho de Melo

Subject: [PATCH 19/35] perf evlist: Clarify a bit the use of perf_mmap->refcnt

From: Arnaldo Carvalho de Melo <[email protected]>

This is an odd refcount use case, so add some more comments to help
understand that when it hits zero it really means that the mmap()ed area
(on a perf_event_open() returned fd) has been munmap()ed.
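
A toy model of that lifecycle, as a sketch only: struct toy_mmap and its
helpers are hypothetical names, the counter is a plain unsigned int
(single-threaded), and an anonymous mapping stands in for the mapping of a
perf_event_open() fd. The count starts at 2 when the area is mapped, and
reaching zero is the point where the area is unmapped:

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS with strict -std= */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical stand-in for struct perf_mmap. */
struct toy_mmap {
	void		*base;
	size_t		len;
	unsigned int	refcnt;
};

/* Map the area and take the initial two references. */
int toy_mmap__map(struct toy_mmap *map, size_t len)
{
	map->refcnt = 2;
	map->len = len;
	map->base = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map->base == MAP_FAILED) {
		map->base = NULL;
		map->refcnt = 0;
		return -1;
	}
	return 0;
}

/* Drop one reference; the last one unmaps the area. */
void toy_mmap__put(struct toy_mmap *map)
{
	assert(map->refcnt > 0);

	if (--map->refcnt == 0) {
		munmap(map->base, map->len);
		map->base = NULL;
	}
}
```

After toy_mmap__map() the count is 2; the first toy_mmap__put() leaves the
area mapped, and the second one unmaps it and clears base.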

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/evlist.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 564b924fb48a..50420cd35446 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -974,8 +974,19 @@ static struct perf_mmap *perf_evlist__alloc_mmap(struct perf_evlist *evlist)
if (!map)
return NULL;

- for (i = 0; i < evlist->nr_mmaps; i++)
+ for (i = 0; i < evlist->nr_mmaps; i++) {
map[i].fd = -1;
+ /*
+ * When the perf_mmap() call is made we grab one refcount, plus
+ * one extra to let perf_evlist__mmap_consume() get the last
+ * events after all real references (perf_mmap__get()) are
+ * dropped.
+ *
+ * Each PERF_EVENT_IOC_SET_OUTPUT points to this mmap and
+ * thus does perf_mmap__get() on it.
+ */
+ refcount_set(&map[i].refcnt, 0);
+ }
return map;
}

--
2.9.3

2017-03-06 19:43:45

by Arnaldo Carvalho de Melo

Subject: [PATCH 25/35] kretprobes: Ensure probe location is at function entry

From: "Naveen N. Rao" <[email protected]>

kretprobes can be registered by specifying an absolute address or by
specifying an offset from a symbol. However, we need to ensure the probe
falls at function entry so as to be able to determine the return address.

Validate the same during kretprobe registration. By default, there
should not be any offset from a function entry, as determined through a
kallsyms_lookup(). Introduce arch_function_offset_within_entry() as a
way for architectures to override this.
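
The override mechanism can be sketched in user space as below, assuming a
GCC or Clang toolchain for the weak attribute. The toy_ names are
hypothetical and only mirror the shape of the check: a weak default that
accepts offset 0 only, which a strong definition in another object file
would replace at link time:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Weak default policy: only offset 0 counts as function entry. */
__attribute__((weak)) bool toy_offset_within_entry(unsigned long offset)
{
	return !offset;
}

/* Hypothetical stand-in for the registration check: 0 or -EINVAL. */
int toy_check_retprobe_offset(unsigned long offset)
{
	if (!toy_offset_within_entry(offset))
		return -EINVAL;

	return 0;
}
```

With only the weak default linked in, offset 0 is accepted and any other
offset is rejected with -EINVAL.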

Signed-off-by: Naveen N. Rao <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/f1583bc4839a3862cfc2acefcc56f9c8837fa2ba.1487770934.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
include/linux/kprobes.h | 1 +
kernel/kprobes.c | 13 +++++++++++++
2 files changed, 14 insertions(+)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index c328e4f7dcad..177bdf6c6aeb 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -267,6 +267,7 @@ extern int arch_init_kprobes(void);
extern void show_registers(struct pt_regs *regs);
extern void kprobes_inc_nmissed_count(struct kprobe *p);
extern bool arch_within_kprobe_blacklist(unsigned long addr);
+extern bool arch_function_offset_within_entry(unsigned long offset);

extern bool within_kprobe_blacklist(unsigned long addr);

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 699c5bc51a92..448759d4a263 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1875,12 +1875,25 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
}
NOKPROBE_SYMBOL(pre_handler_kretprobe);

+bool __weak arch_function_offset_within_entry(unsigned long offset)
+{
+ return !offset;
+}
+
int register_kretprobe(struct kretprobe *rp)
{
int ret = 0;
struct kretprobe_instance *inst;
int i;
void *addr;
+ unsigned long offset;
+
+ addr = kprobe_addr(&rp->kp);
+ if (!kallsyms_lookup_size_offset((unsigned long)addr, NULL, &offset))
+ return -EINVAL;
+
+ if (!arch_function_offset_within_entry(offset))
+ return -EINVAL;

if (kretprobe_blacklist_size) {
addr = kprobe_addr(&rp->kp);
--
2.9.3

2017-03-06 19:43:57

by Arnaldo Carvalho de Melo

Subject: [PATCH 30/35] perf tools: Force uncore events to system wide monitoring

From: Jiri Olsa <[email protected]>

Make system wide (-a) the default option if no target was specified and
one of the following conditions is met:

- there's no workload specified (current behaviour)
- there is a workload specified but all requested
events are system wide ones
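
The two conditions above can be written as a pure function. This is a
sketch with hypothetical names (toy_defaults_to_system_wide and its
parameters); it only answers whether the system wide default applies, and
it treats an empty event list with a workload as "no":

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * has_target:   the user specified an explicit target
 * has_workload: a command to run was given
 * event_is_system_wide[i]: event i is a system wide one (e.g. uncore)
 */
bool toy_defaults_to_system_wide(bool has_target, bool has_workload,
				 const bool *event_is_system_wide,
				 size_t nr_events)
{
	size_t i;

	assert(nr_events == 0 || event_is_system_wide != NULL);

	if (has_target)
		return false;	/* explicit target: no default applied */

	if (!has_workload)
		return true;	/* current behaviour */

	for (i = 0; i < nr_events; i++) {
		if (!event_is_system_wide[i])
			return false;	/* mixed events keep per-task mode */
	}

	return nr_events > 0;
}
```

A workload with only uncore events switches to system wide mode, while a
workload mixing core and uncore events does not.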

Mixed events core/uncore with workload:

$ perf stat -e 'uncore_cbox_0/clockticks/,cycles' sleep 1

Performance counter stats for 'sleep 1':

<not supported> uncore_cbox_0/clockticks/
980,489 cycles

1.000897406 seconds time elapsed

Uncore event with workload:

$ perf stat -e 'uncore_cbox_0/clockticks/' sleep 1

Performance counter stats for 'system wide':

281,473,897,192,670 uncore_cbox_0/clockticks/

1.000833784 seconds time elapsed

Committer note:

When testing I realized the default case for !root, i.e. no events
passed via -e, was broken by v2 of this patch. I reported it, and after a
fix provided by Jiri it is working again:

[acme@jouet linux]$ perf stat usleep 1

Performance counter stats for 'usleep 1':

0.401335 task-clock:u (msec) # 0.297 CPUs utilized
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
48 page-faults:u # 0.120 M/sec
458,146 cycles:u # 1.142 GHz
245,113 instructions:u # 0.54 insn per cycle
47,991 branches:u # 119.578 M/sec
4,022 branch-misses:u # 8.38% of all branches

0.001350029 seconds time elapsed

[acme@jouet linux]$

Suggested-and-Tested-by: Borislav Petkov <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/20170227094818.GA12764@krava
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 33 ++++++++++++++++++++++++++++++---
tools/perf/util/parse-events.c | 5 +++--
2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f4f555a67e9b..f53f449d864d 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2350,6 +2350,35 @@ static int __cmd_report(int argc, const char **argv)
return 0;
}

+static void setup_system_wide(int forks)
+{
+ /*
+ * Make system wide (-a) the default target if
+ * no target was specified and one of following
+ * conditions is met:
+ *
+ * - there's no workload specified
+ * - there is workload specified but all requested
+ * events are system wide events
+ */
+ if (!target__none(&target))
+ return;
+
+ if (!forks)
+ target.system_wide = true;
+ else {
+ struct perf_evsel *counter;
+
+ evlist__for_each_entry(evsel_list, counter) {
+ if (!counter->system_wide)
+ return;
+ }
+
+ if (evsel_list->nr_entries)
+ target.system_wide = true;
+ }
+}
+
int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
{
const char * const stat_usage[] = {
@@ -2456,9 +2485,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
} else if (big_num_opt == 0) /* User passed --no-big-num */
big_num = false;

- /* Make system wide (-a) the default target. */
- if (!argc && target__none(&target))
- target.system_wide = true;
+ setup_system_wide(argc);

if (run_count < 0) {
pr_err("Run count must be a positive number\n");
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 67a8aebc67ab..54355d3caf09 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -316,8 +316,9 @@ __add_event(struct list_head *list, int *idx,
return NULL;

(*idx)++;
- evsel->cpus = cpu_map__get(cpus);
- evsel->own_cpus = cpu_map__get(cpus);
+ evsel->cpus = cpu_map__get(cpus);
+ evsel->own_cpus = cpu_map__get(cpus);
+ evsel->system_wide = !!cpus;

if (name)
evsel->name = strdup(name);
--
2.9.3

2017-03-06 19:44:10

by Arnaldo Carvalho de Melo

Subject: [PATCH 23/35] perf ftrace: Add support for -a and -C option

From: Namhyung Kim <[email protected]>

The -a/--all-cpus and -C/--cpu options control which cpus are traced.

Signed-off-by: Namhyung Kim <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-ftrace.txt | 14 ++++++++
tools/perf/builtin-ftrace.c | 60 ++++++++++++++++++++++++++++++++
2 files changed, 74 insertions(+)

diff --git a/tools/perf/Documentation/perf-ftrace.txt b/tools/perf/Documentation/perf-ftrace.txt
index 2d39397f3f30..6e6a8b22c859 100644
--- a/tools/perf/Documentation/perf-ftrace.txt
+++ b/tools/perf/Documentation/perf-ftrace.txt
@@ -34,6 +34,20 @@ OPTIONS
--pid=::
Trace on existing process id (comma separated list).

+-a::
+--all-cpus::
+ Force system-wide collection. Scripts run without a <command>
+ normally use -a by default, while scripts run with a <command>
+ normally don't - this option allows the latter to be run in
+ system-wide mode.
+
+-C::
+--cpu=::
+ Only trace for the list of CPUs provided. Multiple CPUs can
+ be provided as a comma separated list with no space like: 0,1.
+ Ranges of CPUs are specified with -: 0-2.
+ Default is to trace on all online CPUs.
+

SEE ALSO
--------
diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index 85eee9c444ae..d5b566ed7178 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -17,6 +17,7 @@
#include <subcmd/parse-options.h>
#include "evlist.h"
#include "target.h"
+#include "cpumap.h"
#include "thread_map.h"
#include "util/config.h"

@@ -96,6 +97,8 @@ static int append_tracing_file(const char *name, const char *val)
return __write_tracing_file(name, val, true);
}

+static int reset_tracing_cpu(void);
+
static int reset_tracing_files(struct perf_ftrace *ftrace __maybe_unused)
{
if (write_tracing_file("tracing_on", "0") < 0)
@@ -107,6 +110,9 @@ static int reset_tracing_files(struct perf_ftrace *ftrace __maybe_unused)
if (write_tracing_file("set_ftrace_pid", " ") < 0)
return -1;

+ if (reset_tracing_cpu() < 0)
+ return -1;
+
return 0;
}

@@ -127,6 +133,51 @@ static int set_tracing_pid(struct perf_ftrace *ftrace)
return 0;
}

+static int set_tracing_cpumask(struct cpu_map *cpumap)
+{
+ char *cpumask;
+ size_t mask_size;
+ int ret;
+ int last_cpu;
+
+ last_cpu = cpu_map__cpu(cpumap, cpumap->nr - 1);
+ mask_size = (last_cpu + 3) / 4 + 1;
+ mask_size += last_cpu / 32; /* ',' is needed for every 32th cpus */
+
+ cpumask = malloc(mask_size);
+ if (cpumask == NULL) {
+ pr_debug("failed to allocate cpu mask\n");
+ return -1;
+ }
+
+ cpu_map__snprint_mask(cpumap, cpumask, mask_size);
+
+ ret = write_tracing_file("tracing_cpumask", cpumask);
+
+ free(cpumask);
+ return ret;
+}
+
+static int set_tracing_cpu(struct perf_ftrace *ftrace)
+{
+ struct cpu_map *cpumap = ftrace->evlist->cpus;
+
+ if (!target__has_cpu(&ftrace->target))
+ return 0;
+
+ return set_tracing_cpumask(cpumap);
+}
+
+static int reset_tracing_cpu(void)
+{
+ struct cpu_map *cpumap = cpu_map__new(NULL);
+ int ret;
+
+ ret = set_tracing_cpumask(cpumap);
+ cpu_map__put(cpumap);
+ return ret;
+}
+
static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)
{
char *trace_file;
@@ -163,6 +214,11 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)
goto out_reset;
}

+ if (set_tracing_cpu(ftrace) < 0) {
+ pr_err("failed to set tracing cpumask\n");
+ goto out_reset;
+ }
+
if (write_tracing_file("current_tracer", ftrace->tracer) < 0) {
pr_err("failed to set current_tracer to %s\n", ftrace->tracer);
goto out_reset;
@@ -264,6 +320,10 @@ int cmd_ftrace(int argc, const char **argv, const char *prefix __maybe_unused)
"trace on existing process id"),
OPT_INCR('v', "verbose", &verbose,
"be more verbose"),
+ OPT_BOOLEAN('a', "all-cpus", &ftrace.target.system_wide,
+ "system-wide collection from all CPUs"),
+ OPT_STRING('C', "cpu", &ftrace.target.cpu_list, "cpu",
+ "list of cpus to monitor"),
OPT_END()
};

--
2.9.3

2017-03-06 19:44:22

by Arnaldo Carvalho de Melo

Subject: [PATCH 06/35] tools include: Introduce atomic_cmpxchg_{relaxed,release}()

From: Arnaldo Carvalho de Melo <[email protected]>

Will be used by refcount.h

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/include/linux/atomic.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/tools/include/linux/atomic.h b/tools/include/linux/atomic.h
index 4e3d3d18ebab..9f21fc2b092b 100644
--- a/tools/include/linux/atomic.h
+++ b/tools/include/linux/atomic.h
@@ -3,4 +3,10 @@

#include <asm/atomic.h>

+/* atomic_cmpxchg_relaxed */
+#ifndef atomic_cmpxchg_relaxed
+#define atomic_cmpxchg_relaxed atomic_cmpxchg
+#define atomic_cmpxchg_release atomic_cmpxchg
+#endif /* atomic_cmpxchg_relaxed */
+
#endif /* __TOOLS_LINUX_ATOMIC_H */
--
2.9.3

2017-03-06 19:44:33

by Arnaldo Carvalho de Melo

Subject: [PATCH 09/35] tools include: Adopt kernel's refcount.h

From: Arnaldo Carvalho de Melo <[email protected]>

To aid in catching bugs when using atomics as a reference count.

This is a trimmed down version with just what is used by tools/ at
this point.

After this, the patches submitted by Elena for tools/ doing the
conversion from atomic_ to refcount_ methods can be applied and tested.

To activate it, build perf with:

make DEBUG=1 -C tools/perf

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/include/linux/refcount.h | 151 +++++++++++++++++++++++++++++++++++++++++
tools/perf/MANIFEST | 1 +
2 files changed, 152 insertions(+)
create mode 100644 tools/include/linux/refcount.h

diff --git a/tools/include/linux/refcount.h b/tools/include/linux/refcount.h
new file mode 100644
index 000000000000..a0177c1f55b1
--- /dev/null
+++ b/tools/include/linux/refcount.h
@@ -0,0 +1,151 @@
+#ifndef _TOOLS_LINUX_REFCOUNT_H
+#define _TOOLS_LINUX_REFCOUNT_H
+
+/*
+ * Variant of atomic_t specialized for reference counts.
+ *
+ * The interface matches the atomic_t interface (to aid in porting) but only
+ * provides the few functions one should use for reference counting.
+ *
+ * It differs in that the counter saturates at UINT_MAX and will not move once
+ * there. This avoids wrapping the counter and causing 'spurious'
+ * use-after-free issues.
+ *
+ * Memory ordering rules are slightly relaxed wrt regular atomic_t functions
+ * and provide only what is strictly required for refcounts.
+ *
+ * The increments are fully relaxed; these will not provide ordering. The
+ * rationale is that whatever is used to obtain the object we're increasing the
+ * reference count on will provide the ordering. For locked data structures,
+ * its the lock acquire, for RCU/lockless data structures its the dependent
+ * load.
+ *
+ * Do note that inc_not_zero() provides a control dependency which will order
+ * future stores against the inc, this ensures we'll never modify the object
+ * if we did not in fact acquire a reference.
+ *
+ * The decrements will provide release order, such that all the prior loads and
+ * stores will be issued before, it also provides a control dependency, which
+ * will order us against the subsequent free().
+ *
+ * The control dependency is against the load of the cmpxchg (ll/sc) that
+ * succeeded. This means the stores aren't fully ordered, but this is fine
+ * because the 1->0 transition indicates no concurrency.
+ *
+ * Note that the allocator is responsible for ordering things between free()
+ * and alloc().
+ *
+ */
+
+#include <linux/atomic.h>
+#include <linux/kernel.h>
+
+#ifdef NDEBUG
+#define REFCOUNT_WARN(cond, str) (void)(cond)
+#define __refcount_check
+#else
+#define REFCOUNT_WARN(cond, str) BUG_ON(cond)
+#define __refcount_check __must_check
+#endif
+
+typedef struct refcount_struct {
+ atomic_t refs;
+} refcount_t;
+
+#define REFCOUNT_INIT(n) { .refs = ATOMIC_INIT(n), }
+
+static inline void refcount_set(refcount_t *r, unsigned int n)
+{
+ atomic_set(&r->refs, n);
+}
+
+static inline unsigned int refcount_read(const refcount_t *r)
+{
+ return atomic_read(&r->refs);
+}
+
+/*
+ * Similar to atomic_inc_not_zero(), will saturate at UINT_MAX and WARN.
+ *
+ * Provides no memory ordering, it is assumed the caller has guaranteed the
+ * object memory to be stable (RCU, etc.). It does provide a control dependency
+ * and thereby orders future stores. See the comment on top.
+ */
+static inline __refcount_check
+bool refcount_inc_not_zero(refcount_t *r)
+{
+ unsigned int old, new, val = atomic_read(&r->refs);
+
+ for (;;) {
+ new = val + 1;
+
+ if (!val)
+ return false;
+
+ if (unlikely(!new))
+ return true;
+
+ old = atomic_cmpxchg_relaxed(&r->refs, val, new);
+ if (old == val)
+ break;
+
+ val = old;
+ }
+
+ REFCOUNT_WARN(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");
+
+ return true;
+}
+
+/*
+ * Similar to atomic_inc(), will saturate at UINT_MAX and WARN.
+ *
+ * Provides no memory ordering, it is assumed the caller already has a
+ * reference on the object, will WARN when this is not so.
+ */
+static inline void refcount_inc(refcount_t *r)
+{
+ REFCOUNT_WARN(!refcount_inc_not_zero(r), "refcount_t: increment on 0; use-after-free.\n");
+}
+
+/*
+ * Similar to atomic_dec_and_test(), it will WARN on underflow and fail to
+ * decrement when saturated at UINT_MAX.
+ *
+ * Provides release memory ordering, such that prior loads and stores are done
+ * before, and provides a control dependency such that free() must come after.
+ * See the comment on top.
+ */
+static inline __refcount_check
+bool refcount_sub_and_test(unsigned int i, refcount_t *r)
+{
+ unsigned int old, new, val = atomic_read(&r->refs);
+
+ for (;;) {
+ if (unlikely(val == UINT_MAX))
+ return false;
+
+ new = val - i;
+ if (new > val) {
+ REFCOUNT_WARN(new > val, "refcount_t: underflow; use-after-free.\n");
+ return false;
+ }
+
+ old = atomic_cmpxchg_release(&r->refs, val, new);
+ if (old == val)
+ break;
+
+ val = old;
+ }
+
+ return !new;
+}
+
+static inline __refcount_check
+bool refcount_dec_and_test(refcount_t *r)
+{
+ return refcount_sub_and_test(1, r);
+}
+
+
+#endif /* _TOOLS_LINUX_REFCOUNT_H */
diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
index e2c52190cf28..28648c09dcd6 100644
--- a/tools/perf/MANIFEST
+++ b/tools/perf/MANIFEST
@@ -79,6 +79,7 @@ tools/include/uapi/linux/perf_event.h
tools/include/linux/poison.h
tools/include/linux/rbtree.h
tools/include/linux/rbtree_augmented.h
+tools/include/linux/refcount.h
tools/include/linux/string.h
tools/include/linux/stringify.h
tools/include/linux/types.h
--
2.9.3

2017-03-06 19:42:53

by Arnaldo Carvalho de Melo

Subject: [PATCH 12/35] perf comm: Convert comm_str.refcnt from atomic_t to refcount_t

From: Elena Reshetova <[email protected]>

The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.

This helps avoid accidental refcounter overflows that might lead to
use-after-free situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: David Windsor <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: David Windsor <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Hans Liljestrand <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matija Glavinic Pecotic <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
[ Reinstated comm_str__get() function, needed when reusing entries in the rbtree ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/comm.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/comm.c b/tools/perf/util/comm.c
index 21b7ff382c3f..32837b6f7879 100644
--- a/tools/perf/util/comm.c
+++ b/tools/perf/util/comm.c
@@ -2,12 +2,12 @@
#include "util.h"
#include <stdlib.h>
#include <stdio.h>
-#include <linux/atomic.h>
+#include <linux/refcount.h>

struct comm_str {
char *str;
struct rb_node rb_node;
- atomic_t refcnt;
+ refcount_t refcnt;
};

/* Should perhaps be moved to struct machine */
@@ -16,13 +16,13 @@ static struct rb_root comm_str_root;
static struct comm_str *comm_str__get(struct comm_str *cs)
{
if (cs)
- atomic_inc(&cs->refcnt);
+ refcount_inc(&cs->refcnt);
return cs;
}

static void comm_str__put(struct comm_str *cs)
{
- if (cs && atomic_dec_and_test(&cs->refcnt)) {
+ if (cs && refcount_dec_and_test(&cs->refcnt)) {
rb_erase(&cs->rb_node, &comm_str_root);
zfree(&cs->str);
free(cs);
@@ -43,7 +43,7 @@ static struct comm_str *comm_str__alloc(const char *str)
return NULL;
}

- atomic_set(&cs->refcnt, 0);
+ refcount_set(&cs->refcnt, 1);

return cs;
}
@@ -61,7 +61,7 @@ static struct comm_str *comm_str__findnew(const char *str, struct rb_root *root)

cmp = strcmp(str, iter->str);
if (!cmp)
- return iter;
+ return comm_str__get(iter);

if (cmp < 0)
p = &(*p)->rb_left;
@@ -95,8 +95,6 @@ struct comm *comm__new(const char *str, u64 timestamp, bool exec)
return NULL;
}

- comm_str__get(comm->comm_str);
-
return comm;
}

@@ -108,7 +106,6 @@ int comm__override(struct comm *comm, const char *str, u64 timestamp, bool exec)
if (!new)
return -ENOMEM;

- comm_str__get(new);
comm_str__put(old);
comm->comm_str = new;
comm->start = timestamp;
--
2.9.3

2017-03-06 19:46:32

by Arnaldo Carvalho de Melo

Subject: [PATCH 07/35] tools include: Provide gcc based cmpxchg fallback for !x86

From: Arnaldo Carvalho de Melo <[email protected]>

We've been using an atomic_t implementation subset based on the gcc
builtin functions for a while. Now, with refcount.h, we also need
cmpxchg(), so use gcc's __sync_val_compare_and_swap() for that.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/include/asm-generic/atomic-gcc.h | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/tools/include/asm-generic/atomic-gcc.h b/tools/include/asm-generic/atomic-gcc.h
index 2ba78c9f5701..5e9738f97bf3 100644
--- a/tools/include/asm-generic/atomic-gcc.h
+++ b/tools/include/asm-generic/atomic-gcc.h
@@ -60,4 +60,12 @@ static inline int atomic_dec_and_test(atomic_t *v)
return __sync_sub_and_fetch(&v->counter, 1) == 0;
}

+#define cmpxchg(ptr, oldval, newval) \
+ __sync_val_compare_and_swap(ptr, oldval, newval)
+
+static inline int atomic_cmpxchg(atomic_t *v, int oldval, int newval)
+{
+ return cmpxchg(&(v)->counter, oldval, newval);
+}
+
#endif /* __TOOLS_ASM_GENERIC_ATOMIC_H */
--
2.9.3

2017-03-06 19:46:42

by Arnaldo Carvalho de Melo

Subject: [PATCH 31/35] tools build: Add test for sched_getcpu()

From: Arnaldo Carvalho de Melo <[email protected]>

Instead of adding ever more ifdef conditions, do a feature test that
defines HAVE_SCHED_GETCPU_SUPPORT, then use it to decide whether to
provide the prototype. No need to change the stub, as it is already a
__weak symbol.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/build/Makefile.feature | 1 +
tools/build/feature/Makefile | 6 +++++-
tools/build/feature/test-all.c | 5 +++++
tools/build/feature/test-sched_getcpu.c | 7 +++++++
tools/perf/Makefile.config | 4 ++++
tools/perf/util/cloexec.h | 6 ------
tools/perf/util/util.h | 4 ++--
7 files changed, 24 insertions(+), 9 deletions(-)
create mode 100644 tools/build/feature/test-sched_getcpu.c

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index e3fb5ecbdcb6..523911f316ce 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -63,6 +63,7 @@ FEATURE_TESTS_BASIC := \
lzma \
get_cpuid \
bpf \
+ sched_getcpu \
sdt

# FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index b564a2eea039..ab1e2bbc2e96 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -48,7 +48,8 @@ FILES= \
test-get_cpuid.bin \
test-sdt.bin \
test-cxx.bin \
- test-jvmti.bin
+ test-jvmti.bin \
+ test-sched_getcpu.bin

FILES := $(addprefix $(OUTPUT),$(FILES))

@@ -91,6 +92,9 @@ $(OUTPUT)test-libelf.bin:
$(OUTPUT)test-glibc.bin:
$(BUILD)

+$(OUTPUT)test-sched_getcpu.bin:
+ $(BUILD)
+
DWARFLIBS := -ldw
ifeq ($(findstring -static,${LDFLAGS}),-static)
DWARFLIBS += -lelf -lebl -lz -llzma -lbz2
diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c
index 699e43627397..cc6c7c01f4ca 100644
--- a/tools/build/feature/test-all.c
+++ b/tools/build/feature/test-all.c
@@ -117,6 +117,10 @@
# include "test-pthread-attr-setaffinity-np.c"
#undef main

+#define main main_test_sched_getcpu
+# include "test-sched_getcpu.c"
+#undef main
+
# if 0
/*
* Disable libbabeltrace check for test-all, because the requested
@@ -182,6 +186,7 @@ int main(int argc, char *argv[])
main_test_get_cpuid();
main_test_bpf();
main_test_libcrypto();
+ main_test_sched_getcpu();
main_test_sdt();

return 0;
diff --git a/tools/build/feature/test-sched_getcpu.c b/tools/build/feature/test-sched_getcpu.c
new file mode 100644
index 000000000000..c4a148dd7104
--- /dev/null
+++ b/tools/build/feature/test-sched_getcpu.c
@@ -0,0 +1,7 @@
+#define _GNU_SOURCE
+#include <sched.h>
+
+int main(void)
+{
+ return sched_getcpu();
+}
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 27c9fbca7bd9..2b656de99495 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -317,6 +317,10 @@ ifdef NO_DWARF
NO_LIBDW_DWARF_UNWIND := 1
endif

+ifeq ($(feature-sched_getcpu), 1)
+ CFLAGS += -DHAVE_SCHED_GETCPU_SUPPORT
+endif
+
ifndef NO_LIBELF
CFLAGS += -DHAVE_LIBELF_SUPPORT
EXTLIBS += -lelf
diff --git a/tools/perf/util/cloexec.h b/tools/perf/util/cloexec.h
index d0d465953d36..94a5a7d829d5 100644
--- a/tools/perf/util/cloexec.h
+++ b/tools/perf/util/cloexec.h
@@ -3,10 +3,4 @@

unsigned long perf_event_open_cloexec_flag(void);

-#ifdef __GLIBC_PREREQ
-#if !__GLIBC_PREREQ(2, 6) && !defined(__UCLIBC__)
-int sched_getcpu(void) __THROW;
-#endif
-#endif
-
#endif /* __PERF_CLOEXEC_H */
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index c74708da8571..b2cfa47990dc 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -355,8 +355,8 @@ void print_binary(unsigned char *data, size_t len,
size_t bytes_per_line, print_binary_t printer,
void *extra);

-#if !defined(__GLIBC__) && !defined(__ANDROID__)
-extern int sched_getcpu(void);
+#ifndef HAVE_SCHED_GETCPU_SUPPORT
+int sched_getcpu(void);
#endif

int is_printable_array(char *p, unsigned int len);
--
2.9.3

2017-03-06 19:47:04

by Arnaldo Carvalho de Melo

Subject: [PATCH 20/35] perf tools: Allow sorting by symbol size

From: Charles Baylis <[email protected]>

Add a new sort key, 'symbol_size', to allow the user to sort by symbol size, or
(more usefully) display the symbol size using --fields=...,symbol_size.

Committer note:

Testing it together with the recently added -q, to remove the headers,
and using the '+' sign with -s, to add the symbol_size sort order to
the default, which is '-s/--sort comm,dso,symbol':

# perf report -q -s +symbol_size | head -10
10.39% swapper [kernel.vmlinux] [k] intel_idle 270
3.45% swapper [kernel.vmlinux] [k] update_blocked_averages 1546
2.61% swapper [kernel.vmlinux] [k] update_load_avg 1292
2.36% swapper [kernel.vmlinux] [k] update_cfs_shares 240
1.83% swapper [kernel.vmlinux] [k] __hrtimer_run_queues 606
1.74% swapper [kernel.vmlinux] [k] update_cfs_rq_load_avg. 1187
1.66% swapper [kernel.vmlinux] [k] apic_timer_interrupt 152
1.60% CPU 0/KVM [kvm] [k] kvm_set_msr_common 3046
1.60% gnome-shell libglib-2.0.so.0 [.] g_slist_find 37
1.46% gnome-termina libglib-2.0.so.0 [.] g_hash_table_lookup 370
#

Signed-off-by: Charles Baylis <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Maxim Kuvyrkov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Use symbol__size(), remove needless %lld + (long long) casting ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-report.txt | 1 +
tools/perf/util/hist.h | 1 +
tools/perf/util/sort.c | 41 ++++++++++++++++++++++++++++++++
tools/perf/util/sort.h | 1 +
4 files changed, 44 insertions(+)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index c04cc0647c16..33f91906f5dc 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -80,6 +80,7 @@ OPTIONS
- pid: command and tid of the task
- dso: name of library or module executed at the time of sample
- symbol: name of function executed at the time of sample
+ - symbol_size: size of function executed at the time of sample
- parent: name of function matched to the parent regex filter. Unmatched
entries are displayed as "[other]".
- cpu: cpu number the task ran at the time of sample
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 28c216e3d5b7..2e839bf40bdd 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -57,6 +57,7 @@ enum hist_column {
HISTC_SRCLINE_FROM,
HISTC_SRCLINE_TO,
HISTC_TRACE,
+ HISTC_SYM_SIZE,
HISTC_NR_COLS, /* Last entry */
};

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 0ff622288d24..f8f16c0e20b6 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1396,6 +1396,46 @@ struct sort_entry sort_transaction = {
.se_width_idx = HISTC_TRANSACTION,
};

+/* --sort symbol_size */
+
+static int64_t _sort__sym_size_cmp(struct symbol *sym_l, struct symbol *sym_r)
+{
+ int64_t size_l = sym_l != NULL ? symbol__size(sym_l) : 0;
+ int64_t size_r = sym_r != NULL ? symbol__size(sym_r) : 0;
+
+ return size_l < size_r ? -1 :
+ size_l == size_r ? 0 : 1;
+}
+
+static int64_t
+sort__sym_size_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ return _sort__sym_size_cmp(right->ms.sym, left->ms.sym);
+}
+
+static int _hist_entry__sym_size_snprintf(struct symbol *sym, char *bf,
+ size_t bf_size, unsigned int width)
+{
+ if (sym)
+ return repsep_snprintf(bf, bf_size, "%*d", width, symbol__size(sym));
+
+ return repsep_snprintf(bf, bf_size, "%*s", width, "unknown");
+}
+
+static int hist_entry__sym_size_snprintf(struct hist_entry *he, char *bf,
+ size_t size, unsigned int width)
+{
+ return _hist_entry__sym_size_snprintf(he->ms.sym, bf, size, width);
+}
+
+struct sort_entry sort_sym_size = {
+ .se_header = "Symbol size",
+ .se_cmp = sort__sym_size_cmp,
+ .se_snprintf = hist_entry__sym_size_snprintf,
+ .se_width_idx = HISTC_SYM_SIZE,
+};
+
+
struct sort_dimension {
const char *name;
struct sort_entry *entry;
@@ -1418,6 +1458,7 @@ static struct sort_dimension common_sort_dimensions[] = {
DIM(SORT_GLOBAL_WEIGHT, "weight", sort_global_weight),
DIM(SORT_TRANSACTION, "transaction", sort_transaction),
DIM(SORT_TRACE, "trace", sort_trace),
+ DIM(SORT_SYM_SIZE, "symbol_size", sort_sym_size),
};

#undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 796c847e2f00..f583325a3743 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -211,6 +211,7 @@ enum sort_type {
SORT_GLOBAL_WEIGHT,
SORT_TRANSACTION,
SORT_TRACE,
+ SORT_SYM_SIZE,

/* branch stack specific sort keys */
__SORT_BRANCH_STACK,
--
2.9.3

2017-03-06 19:47:15

by Arnaldo Carvalho de Melo

Subject: [PATCH 28/35] perf intel-PT/BTS: Add missing initialization

From: Adrian Hunter <[email protected]>

$ perf test decoder
57: x86 instruction decoder - new instructions : FAILED!
$

Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 80 78 56 34 12 bndstx %bnd0,0x12345678(%rax)
Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 85 78 56 34 12 bndstx %bnd0,0x12345678(%rbp)
Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 84 01 78 56 34 12 bndstx %bnd0,0x12345678(%rcx,%rax,1)
Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 84 05 78 56 34 12 bndstx %bnd0,0x12345678(%rbp,%rax,1)
Failed to decode 'rel' value (0xfffffffc vs expected 0): 0f 1b 84 08 78 56 34 12 bndstx %bnd0,0x12345678(%rax,%rcx,1)

There is a missing initialization of 'rel'. It only affects the test, because
the test checks 'rel' even in cases where there is no value.

Fix it.

Reported-and-Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
index 7913363bde5c..55b6250350d7 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
@@ -39,6 +39,8 @@ static void intel_pt_insn_decoder(struct insn *insn,
enum intel_pt_insn_branch branch = INTEL_PT_BR_NO_BRANCH;
int ext;

+ intel_pt_insn->rel = 0;
+
if (insn_is_avx(insn)) {
intel_pt_insn->op = INTEL_PT_OP_OTHER;
intel_pt_insn->branch = INTEL_PT_BR_NO_BRANCH;
--
2.9.3

2017-03-06 19:47:29

by Arnaldo Carvalho de Melo

Subject: [PATCH 01/35] perf vendor events: Add mapping for KnightsMill PMU events

From: Karol Wachowski <[email protected]>

Reuse events from KnightsLanding for KnightsMill.

Signed-off-by: Karol Wachowski <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Piotr Luc <[email protected]>
Cc: Srinivas Pandruvada <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/pmu-events/arch/x86/mapfile.csv | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 12181bb1da2a..d1a12e584c1b 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -17,6 +17,7 @@ GenuineIntel-6-3A,v18,ivybridge,core
GenuineIntel-6-3E,v19,ivytown,core
GenuineIntel-6-2D,v20,jaketown,core
GenuineIntel-6-57,v9,knightslanding,core
+GenuineIntel-6-85,v9,knightslanding,core
GenuineIntel-6-1E,v2,nehalemep,core
GenuineIntel-6-1F,v2,nehalemep,core
GenuineIntel-6-1A,v2,nehalemep,core
--
2.9.3

2017-03-06 19:47:24

by Arnaldo Carvalho de Melo

Subject: [PATCH 02/35] perf stat: Issue a HW watchdog disable hint

From: Borislav Petkov <[email protected]>

When using perf stat on an AMD F15h system with the default hw events
attributes, some of the events don't get counted:

Performance counter stats for 'sleep 1':

0.749208 task-clock (msec) # 0.001 CPUs utilized
1 context-switches # 0.001 M/sec
0 cpu-migrations # 0.000 K/sec
54 page-faults # 0.072 M/sec
1,122,815 cycles # 1.499 GHz
286,740 stalled-cycles-frontend # 25.54% frontend cycles idle
<not counted> stalled-cycles-backend (0.00%)
^^^^^^^^^^^^
<not counted> instructions (0.00%)
^^^^^^^^^^^^
<not counted> branches (0.00%)
<not counted> branch-misses (0.00%)

1.001550070 seconds time elapsed

The reason is that we have the HW watchdog consuming one PMU counter and
when perf tries to schedule 6 events on 6 counters and some of those
events are constrained to only a specific subset of PMCs by the
hardware, the event scheduling fails.

So issue a hint to disable the HW watchdog around a perf stat session.

Committer note:

Testing it...

# perf stat -d usleep 1

Performance counter stats for 'usleep 1':

1.180203 task-clock (msec) # 0.490 CPUs utilized
1 context-switches # 0.847 K/sec
0 cpu-migrations # 0.000 K/sec
54 page-faults # 0.046 M/sec
184,754 cycles # 0.157 GHz
714,553 instructions # 3.87 insn per cycle
154,661 branches # 131.046 M/sec
7,247 branch-misses # 4.69% of all branches
219,984 L1-dcache-loads # 186.395 M/sec
17,600 L1-dcache-load-misses # 8.00% of all L1-dcache hits (90.16%)
<not counted> LLC-loads (0.00%)
<not counted> LLC-load-misses (0.00%)

0.002406823 seconds time elapsed

Some events weren't counted. Try disabling the NMI watchdog:
echo 0 > /proc/sys/kernel/nmi_watchdog
perf stat ...
echo 1 > /proc/sys/kernel/nmi_watchdog
#

Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Vince Weaver <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 13b54999ad79..f4f555a67e9b 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -146,6 +146,7 @@ static aggr_get_id_t aggr_get_id;
static bool append_file;
static const char *output_name;
static int output_fd;
+static int print_free_counters_hint;

struct perf_stat {
bool record;
@@ -1109,6 +1110,9 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
counter->supported ? CNTR_NOT_COUNTED : CNTR_NOT_SUPPORTED,
csv_sep);

+ if (counter->supported)
+ print_free_counters_hint = 1;
+
fprintf(stat_config.output, "%-*s%s",
csv_output ? 0 : unit_width,
counter->unit, csv_sep);
@@ -1477,6 +1481,13 @@ static void print_footer(void)
avg_stats(&walltime_nsecs_stats));
}
fprintf(output, "\n\n");
+
+ if (print_free_counters_hint)
+ fprintf(output,
+"Some events weren't counted. Try disabling the NMI watchdog:\n"
+" echo 0 > /proc/sys/kernel/nmi_watchdog\n"
+" perf stat ...\n"
+" echo 1 > /proc/sys/kernel/nmi_watchdog\n");
}

static void print_counters(struct timespec *ts, int argc, const char **argv)
--
2.9.3

2017-03-06 19:47:28

by Arnaldo Carvalho de Melo

Subject: [PATCH 05/35] tools arch x86: Introduce atomic_cmpxchg()

From: Arnaldo Carvalho de Melo <[email protected]>

Will be used by atomic_cmpxchg_relaxed(), in turn used by refcount.h.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/arch/x86/include/asm/atomic.h | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/tools/arch/x86/include/asm/atomic.h b/tools/arch/x86/include/asm/atomic.h
index 059e33e94260..328eeceec709 100644
--- a/tools/arch/x86/include/asm/atomic.h
+++ b/tools/arch/x86/include/asm/atomic.h
@@ -7,6 +7,8 @@

#define LOCK_PREFIX "\n\tlock; "

+#include <asm/cmpxchg.h>
+
/*
* Atomic operations that C can't guarantee us. Useful for
* resource counting etc..
@@ -62,4 +64,9 @@ static inline int atomic_dec_and_test(atomic_t *v)
GEN_UNARY_RMWcc(LOCK_PREFIX "decl", v->counter, "%0", "e");
}

+static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new)
+{
+ return cmpxchg(&v->counter, old, new);
+}
+
#endif /* _TOOLS_LINUX_ASM_X86_ATOMIC_H */
--
2.9.3

2017-03-07 07:18:27

by Ingo Molnar

Subject: Re: [GIT PULL 00/35] perf/core improvements and fixes


* Arnaldo Carvalho de Melo <[email protected]> wrote:

> From: Arnaldo Carvalho de Melo <[email protected]>
>
> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
> Test results at the end of this message, as usual.
>
> The following changes since commit 9d020d33fc1b2faa0eb35859df1381ca5dc94ffe:
>
> Merge branch 'linus' into perf/urgent, to resolve conflict (2017-03-02 08:05:45 +0100)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170306
>
> for you to fetch changes up to 001916b94a04809a94abb07daba6f9ace01906ba:
>
> perf bench numa: Add more comment for -c option (2017-03-06 12:39:30 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> New features:
>
> - Allow sorting by symbol_size in 'perf report' and 'perf top' (Charles Baylis)
>
> E.g.:
>
> # perf report -s symbol_size,symbol
>
> Samples: 9K of event 'cycles:k', Event count (approx.): 2870461623
> Overhead Symbol size Symbol
> 14.55% 326 [k] flush_tlb_mm_range
> 7.20% 1045 [k] filemap_map_pages
> 5.82% 124 [k] vma_interval_tree_insert
> 5.18% 2430 [k] unmap_page_range
> 2.57% 571 [k] vma_interval_tree_remove
> 1.94% 494 [k] page_add_file_rmap
> 1.82% 740 [k] page_remove_rmap
> 1.66% 1017 [k] release_pages
> 1.57% 1636 [k] update_blocked_averages
> 1.57% 76 [k] unlock_page
>
> - Add support for -p/--pid, -a/--all-cpus and -C/--cpu in 'perf ftrace' (Namhyung Kim)
>
> Change in behaviour:
>
> - Make system wide (-a) the default option if no target was specified and one
> of following conditions is met:
>
> - No workload specified (current behaviour)
>
> - A workload is specified but all requested events are system wide ones,
> like uncore ones. (Jiri Olsa)
>
> Fixes:
>
> - Add missing initialization to the instruction decoder used in the
> intel PT/BTS code, which was causing lots of failures in 'perf test',
> looking for a value when there was none (Adrian Hunter)
>
> Infrastructure:
>
> - Add arch code needed to adopt the kernel's refcount_t to aid in
> catching bugs when using atomic_t as a reference counter, basically
> cmpxchg related functions (Arnaldo Carvalho de Melo)
>
> - Convert the code using atomic_t as reference counts to refcount_t
> (Elena Reshetova)
>
> - Add feature test for sched_getcpu() to more easily check for its
> presence in the many libc implementations and across different
> versions of such C libraries (Arnaldo Carvalho de Melo)
>
> - Issue a HW watchdog disable hint in 'perf stat' for when some of the
> requested events can't get counted because a PMU counter is taken by that
> watchdog (Borislav Petkov).
>
> - Add mapping for Intel's KnightsMill PMU events (Karol Wachowski)
>
> Documentation:
>
> - Clarify the term 'convergence' in:
>
> perf bench numa numa-mem -h --show_convergence (Jiri Olsa)
>
> Kernel code:
>
> - Ensure probe location is at function entry in kretprobes (Naveen N. Rao)
>
> - Allow return probes with offsets and absolute addresses (Naveen N. Rao)
>
> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
>
> ----------------------------------------------------------------
> Adrian Hunter (1):
> perf intel-PT/BTS: Add missing initialization
>
> Arnaldo Carvalho de Melo (12):
> tools include: Adopt __compiletime_error
> tools arch x86: Include asm/cmpxchg.h
> tools arch x86: Introduce atomic_cmpxchg()
> tools include: Introduce atomic_cmpxchg_{relaxed,release}()
> tools include: Provide gcc based cmpxchg fallback for !x86
> tools include: Add UINT_MAX def to kernel.h
> tools include: Adopt kernel's refcount.h
> perf evlist: Clarify a bit the use of perf_mmap->refcnt
> tools build: Add test for sched_getcpu()
> perf bench futex: Use __maybe_unused
> perf bench futex: Fix build on musl + clang
> tools build: Use the same CC for feature detection and actual build
>
> Borislav Petkov (1):
> perf stat: Issue a HW watchdog disable hint
>
> Charles Baylis (1):
> perf tools: Allow sorting by symbol size
>
> Elena Reshetova (9):
> perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t
> perf cpumap: Convert cpu_map.refcnt from atomic_t to refcount_t
> perf comm: Convert comm_str.refcnt from atomic_t to refcount_t
> perf dso: Convert dso.refcnt from atomic_t to refcount_t
> perf map: Convert map.refcnt from atomic_t to refcount_t
> perf map: Convert map_groups.refcnt from atomic_t to refcount_t
> perf evlist: Convert perf_map.refcnt from atomic_t to refcount_t
> perf thread: convert thread.refcnt from atomic_t to refcount_t
> perf thread_map: Convert thread_map.refcnt from atomic_t to refcount_t
>
> Jiri Olsa (2):
> perf tools: Force uncore events to system wide monitoring
> perf bench numa: Add more comment for -c option
>
> Karol Wachowski (1):
> perf vendor events: Add mapping for KnightsMill PMU events
>
> Namhyung Kim (4):
> perf ftrace: Add support for --pid option
> perf cpumap: Introduce cpu_map__snprint_mask()
> perf ftrace: Add support for -a and -C option
> perf ftrace: Use pager for displaying result
>
> Naveen N. Rao (3):
> kretprobes: Ensure probe location is at function entry
> trace/kprobes: Allow return probes with offsets and absolute addresses
> perf probe: Generalize probe event file open routine
>
> Steven Rostedt (VMware) (1):
> trace/kprobes: Add back warning about offset in return probes
>
> include/linux/kprobes.h | 1 +
> kernel/kprobes.c | 13 ++
> kernel/trace/trace.c | 1 +
> kernel/trace/trace_kprobe.c | 9 +-
> tools/arch/x86/include/asm/atomic.h | 7 +
> tools/arch/x86/include/asm/cmpxchg.h | 89 ++++++++++++
> tools/build/Makefile.feature | 1 +
> tools/build/feature/Makefile | 10 +-
> tools/build/feature/test-all.c | 5 +
> tools/build/feature/test-sched_getcpu.c | 7 +
> tools/include/asm-generic/atomic-gcc.h | 8 ++
> tools/include/linux/atomic.h | 6 +
> tools/include/linux/compiler-gcc.h | 4 +
> tools/include/linux/compiler.h | 4 +
> tools/include/linux/kernel.h | 4 +
> tools/include/linux/refcount.h | 151 ++++++++++++++++++++
> tools/perf/Documentation/perf-ftrace.txt | 18 +++
> tools/perf/Documentation/perf-report.txt | 1 +
> tools/perf/MANIFEST | 2 +
> tools/perf/Makefile.config | 4 +
> tools/perf/bench/futex-hash.c | 1 +
> tools/perf/bench/futex-lock-pi.c | 1 +
> tools/perf/bench/futex-requeue.c | 1 +
> tools/perf/bench/futex-wake-parallel.c | 1 +
> tools/perf/bench/futex-wake.c | 1 +
> tools/perf/bench/futex.h | 10 +-
> tools/perf/bench/numa.c | 3 +-
> tools/perf/builtin-ftrace.c | 152 +++++++++++++++++----
> tools/perf/builtin-stat.c | 44 +++++-
> tools/perf/pmu-events/arch/x86/mapfile.csv | 1 +
> tools/perf/tests/cpumap.c | 2 +-
> tools/perf/tests/thread-map.c | 6 +-
> tools/perf/tests/thread-mg-share.c | 12 +-
> tools/perf/util/cgroup.c | 6 +-
> tools/perf/util/cgroup.h | 4 +-
> tools/perf/util/cloexec.h | 6 -
> tools/perf/util/comm.c | 15 +-
> tools/perf/util/cpumap.c | 62 +++++++--
> tools/perf/util/cpumap.h | 5 +-
> tools/perf/util/dso.c | 6 +-
> tools/perf/util/dso.h | 4 +-
> tools/perf/util/evlist.c | 31 +++--
> tools/perf/util/evlist.h | 4 +-
> tools/perf/util/hist.h | 1 +
> .../util/intel-pt-decoder/intel-pt-insn-decoder.c | 2 +
> tools/perf/util/machine.c | 2 +-
> tools/perf/util/map.c | 10 +-
> tools/perf/util/map.h | 10 +-
> tools/perf/util/parse-events.c | 5 +-
> tools/perf/util/probe-file.c | 20 +--
> tools/perf/util/probe-file.h | 1 +
> tools/perf/util/sort.c | 41 ++++++
> tools/perf/util/sort.h | 1 +
> tools/perf/util/thread.c | 6 +-
> tools/perf/util/thread.h | 4 +-
> tools/perf/util/thread_map.c | 20 +--
> tools/perf/util/thread_map.h | 4 +-
> tools/perf/util/util.h | 4 +-
> tools/scripts/Makefile.include | 9 ++
> 59 files changed, 720 insertions(+), 143 deletions(-)
> create mode 100644 tools/arch/x86/include/asm/cmpxchg.h
> create mode 100644 tools/build/feature/test-sched_getcpu.c
> create mode 100644 tools/include/linux/refcount.h

Pulled, thanks a lot Arnaldo!

Ingo

2017-09-08 08:11:39

by Jörg Krause

Subject: Re: [PATCH 33/35] perf bench futex: Fix build on musl + clang

Hi Arnaldo,

On Mon, 2017-03-06 at 16:38 -0300, Arnaldo Carvalho de Melo wrote:
> From: Arnaldo Carvalho de Melo <[email protected]>
>
> When building with clang on a musl libc system, Alpine Linux, we end up
> hitting a problem where memset() is used but its prototype is not
> present, add it to avoid this:
>
> bench/futex-wake.c:99:3: error: implicitly declaring library function 'memset' with type 'void *(void *, int, unsigned long)'
> [-Werror,-Wimplicit-function-declaration]
> CPU_ZERO(&cpu);
> ^
> /usr/include/sched.h:127:23: note: expanded from macro 'CPU_ZERO'
> #define CPU_ZERO(set) CPU_ZERO_S(sizeof(cpu_set_t),set)
> ^
> /usr/include/sched.h:110:30: note: expanded from macro 'CPU_ZERO_S'
> #define CPU_ZERO_S(size,set) memset(set,0,size)
> ^
> bench/futex-wake.c:99:3: note: include the header <string.h> or explicitly provide a declaration for 'memset'

In my opinion the musl <sched.h> header file should include <string.h>.
I've reported the issue to the musl mailing list:

http://www.openwall.com/lists/musl/2017/09/08/1
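
For illustration, the workaround in the patch amounts to pulling in
<string.h> before the CPU_*() macros are used, so that memset() has a
visible prototype when CPU_ZERO() expands to it. A minimal sketch, not
taken from the perf sources: the function name single_cpu_mask_count()
is hypothetical, and _GNU_SOURCE is assumed to be what exposes the
CPU_*() macros on the libc in use.

```c
/* Assumption: _GNU_SOURCE exposes cpu_set_t and the CPU_*() macros. */
#define _GNU_SOURCE

/* memset() prototype: some <sched.h> versions expand CPU_ZERO() to
 * memset() without including <string.h> themselves. */
#include <string.h>
#include <sched.h>
#include <assert.h>

/* Hypothetical helper: build a mask holding only cpu_id and return
 * the number of CPUs set in it. */
int single_cpu_mask_count(int cpu_id)
{
	cpu_set_t cpu;

	assert(cpu_id >= 0 && cpu_id < CPU_SETSIZE);

	CPU_ZERO(&cpu);		/* may expand to memset(&cpu, 0, sizeof(cpu_set_t)) */
	CPU_SET(cpu_id, &cpu);

	return CPU_COUNT(&cpu);
}
```

With the <string.h> include in place this should build cleanly under
-Werror,-Wimplicit-function-declaration whether or not <sched.h>
includes <string.h> on its own.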

> Found while updating my test build containers to build perf with clang in more
> systems.
>
> Cc: Adrian Hunter <[email protected]>
> Cc: David Ahern <[email protected]>
> Cc: Davidlohr Bueso <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Wang Nan <[email protected]>
> Link: http://lkml.kernel.org/n/[email protected]
> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
> ---
> tools/perf/bench/futex-hash.c | 1 +
> tools/perf/bench/futex-lock-pi.c | 1 +
> tools/perf/bench/futex-requeue.c | 1 +
> tools/perf/bench/futex-wake-parallel.c | 1 +
> tools/perf/bench/futex-wake.c | 1 +
> 5 files changed, 5 insertions(+)
>
> diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
> index da04b8c5568a..2499e1b0c6fb 100644
> --- a/tools/perf/bench/futex-hash.c
> +++ b/tools/perf/bench/futex-hash.c
> @@ -9,6 +9,7 @@
> */
>
> /* For the CLR_() macros */
> +#include <string.h>
> #include <pthread.h>
>
> #include <errno.h>
> diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c
> index 91877777ec6e..a20814d94af1 100644
> --- a/tools/perf/bench/futex-lock-pi.c
> +++ b/tools/perf/bench/futex-lock-pi.c
> @@ -3,6 +3,7 @@
> */
>
> /* For the CLR_() macros */
> +#include <string.h>
> #include <pthread.h>
>
> #include <signal.h>
> diff --git a/tools/perf/bench/futex-requeue.c b/tools/perf/bench/futex-requeue.c
> index 2b9705a8734c..9fad1e4fcd3e 100644
> --- a/tools/perf/bench/futex-requeue.c
> +++ b/tools/perf/bench/futex-requeue.c
> @@ -9,6 +9,7 @@
> */
>
> /* For the CLR_() macros */
> +#include <string.h>
> #include <pthread.h>
>
> #include <signal.h>
> diff --git a/tools/perf/bench/futex-wake-parallel.c b/tools/perf/bench/futex-wake-parallel.c
> index 2c8fa67ad537..40f5fcf1d120 100644
> --- a/tools/perf/bench/futex-wake-parallel.c
> +++ b/tools/perf/bench/futex-wake-parallel.c
> @@ -8,6 +8,7 @@
> */
>
> /* For the CLR_() macros */
> +#include <string.h>
> #include <pthread.h>
>
> #include <signal.h>
> diff --git a/tools/perf/bench/futex-wake.c b/tools/perf/bench/futex-wake.c
> index e246b1b8388a..789490281ae3 100644
> --- a/tools/perf/bench/futex-wake.c
> +++ b/tools/perf/bench/futex-wake.c
> @@ -9,6 +9,7 @@
> */
>
> /* For the CLR_() macros */
> +#include <string.h>
> #include <pthread.h>
>
> #include <signal.h>

Best regards,
Jörg Krause

2017-09-08 13:48:02

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 33/35] perf bench futex: Fix build on musl + clang

Em Fri, Sep 08, 2017 at 10:04:05AM +0200, Jörg Krause escreveu:
> On Mon, 2017-03-06 at 16:38 -0300, Arnaldo Carvalho de Melo wrote:
> > When building with clang on a musl libc system, Alpine Linux, we end up
> > hitting a problem where memset() is used but its prototype is not
> > present, add it to avoid this:

> > bench/futex-wake.c:99:3: error: implicitly declaring library function 'memset' with type 'void *(void *, int, unsigned long)'
> > [-Werror,-Wimplicit-function-declaration]
> > CPU_ZERO(&cpu);
> > ^
> > /usr/include/sched.h:127:23: note: expanded from macro 'CPU_ZERO'
> > #define CPU_ZERO(set) CPU_ZERO_S(sizeof(cpu_set_t),set)
> > ^
> > /usr/include/sched.h:110:30: note: expanded from macro 'CPU_ZERO_S'
> > #define CPU_ZERO_S(size,set) memset(set,0,size)
> > ^
> > bench/futex-wake.c:99:3: note: include the header <string.h> or explicitly provide a declaration for 'memset'

> In my opinion the musl <sched.h> header file should include <string.h>.

Agreed.

> I've reported the issue to the musl mailing list:

> http://www.openwall.com/lists/musl/2017/09/08/1

Thanks for reporting that to them,

- Arnaldo