2019-08-27 01:42:19

by Lubashev, Igor

[permalink] [raw]
Subject: [PATCH 0/5] perf: Treat perf_event_paranoid and kptr_restrict like the kernel does it

This is a follow up series to the ensure perf treats perf_event_paranoid and
kptr_restrict in a way that is similar to the kernel's. That includes use of
capabilities instead of euid==0, when possible, as well as adjusting the logic
and fixing bugs.

Prior discussion: https://lkml.kernel.org/lkml/[email protected]

=== Testing notes ===

I have tested on x86 with perf binary installed according to
Documentation/admin-guide/perf-security.rst (cap_sys_admin, cap_sys_ptrace,
cap_syslog assigned to the perf executable).

I tested each permutation of:

* 7 commits:
1. HEAD of perf/core
2. patch 01 on top of perf/core
3. patches 01-02 on top of perf/core
4. patches 01-03 on top of perf/core
5. patches 01-04 on top of perf/core
6. patches 01-05 on top of perf/core
7. HEAD of perf/cap (with known bug fixed by patch 01 of this series)

* 2 build environments: with and without libcap-dev

* 3 kernel.kptr_restrict values: 0, 1, 2

* 4 kernel.perf_event_paranoid values: -1, 0, 1, 2

* 2 users: root and non-root

Total: 336 permutations

Each permutation consisted of:
perf test
perf record -e instructions -- sleep 1

All test runs were expected. Also, as expected, the following permutation (just
that permutation) resulted in segmentation failure:
commit: perf/cap
build: no libcap-dev
kernel.kptr_restrict: 0
kernel.perf_event_paranoid: 2
user: non-root

The perf/cap commit was included in the test to ensure that we can reproduce the
crash and hence test that the patch series fixes the crash, while retaining the
desired behavior of perf/cap.

=== Series Contents ===

01: perf event: Check ref_reloc_sym before using it
Fix the pre-existing cause of the crash above: use of ref_reloc_sym without
a check for NULL

02: perf tools: Use CAP_SYS_ADMIN with perf_event_paranoid checks
Replace the use of euid==0 with a check for CAP_SYS_ADMIN whenever
perf_event_paranoid level is verified.
* This patch has been reviewed previously and is unchanged.
* I kept Acks and Sign-offs.

03: perf util: kernel profiling is disallowed only when perf_event_paranoid>1
Align perf logic regarding perf_event_paranoid to match kernel's.
This has been reported by Arnaldo.

04: perf symbols: Use CAP_SYSLOG with kptr_restrict checks
Replace the use of uid and euid with a check for CAP_SYSLOG when
kptr_restrict is verified (similar to kernel/kallsyms.c and lib/vsprintf.c).
Consult perf_event_paranoid when kptr_restrict==0 (see kernel/kallsyms.c).
* A previous version of this patch has been reviewed previously, but I
* modified it in a non-trivial way, so I removed Acks.

05: perf: warn perf_event_paranoid restrict kernel symbols
Warn that /proc/sys/kernel/perf_event_paranoid can also restrict kernel
symbols.

Igor Lubashev (5):
perf event: Check ref_reloc_sym before using it
perf tools: Use CAP_SYS_ADMIN with perf_event_paranoid checks
perf util: kernel profiling is disallowed only when perf_event_paranoid > 1
perf symbols: Use CAP_SYSLOG with kptr_restrict checks
perf: warn that perf_event_paranoid can restrict kernel symbols

tools/perf/arch/arm/util/cs-etm.c | 3 ++-
tools/perf/arch/arm64/util/arm-spe.c | 3 ++-
tools/perf/arch/x86/util/intel-bts.c | 3 ++-
tools/perf/arch/x86/util/intel-pt.c | 2 +-
tools/perf/builtin-record.c | 2 +-
tools/perf/builtin-top.c | 2 +-
tools/perf/builtin-trace.c | 2 +-
tools/perf/util/event.c | 7 ++++---
tools/perf/util/evsel.c | 2 +-
tools/perf/util/symbol.c | 15 ++++++++++++---
10 files changed, 27 insertions(+), 14 deletions(-)

--
2.7.4


2019-08-27 01:42:31

by Lubashev, Igor

[permalink] [raw]
Subject: [PATCH 4/5] perf symbols: Use CAP_SYSLOG with kptr_restrict checks

The kernel is using CAP_SYSLOG capability instead of uid==0 and euid==0
when checking kptr_restrict. Make perf do the same.

Also, the kernel is a more restrictive than "no restrictions" in case of
kptr_restrict==0, so add the same logic to perf.

Signed-off-by: Igor Lubashev <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: James Morris <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/291d2cda6ee75b4cd4c9ce717c177db18bf03a31.1565188228.git.ilubashe@akamai.com
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/symbol.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 4efde7879474..035f2e75728c 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -4,6 +4,7 @@
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
+#include <linux/capability.h>
#include <linux/kernel.h>
#include <linux/mman.h>
#include <linux/time64.h>
@@ -15,8 +16,10 @@
#include <inttypes.h>
#include "annotate.h"
#include "build-id.h"
+#include "cap.h"
#include "util.h"
#include "debug.h"
+#include "event.h"
#include "machine.h"
#include "map.h"
#include "symbol.h"
@@ -2195,13 +2198,19 @@ static bool symbol__read_kptr_restrict(void)
char line[8];

if (fgets(line, sizeof(line), fp) != NULL)
- value = ((geteuid() != 0) || (getuid() != 0)) ?
- (atoi(line) != 0) :
- (atoi(line) == 2);
+ value = perf_cap__capable(CAP_SYSLOG) ?
+ (atoi(line) >= 2) :
+ (atoi(line) != 0);

fclose(fp);
}

+ /* Per kernel/kallsyms.c:
+ * we also restrict when perf_event_paranoid > 1 w/o CAP_SYSLOG
+ */
+ if (perf_event_paranoid() > 1 && !perf_cap__capable(CAP_SYSLOG))
+ value = true;
+
return value;
}

--
2.7.4

2019-08-27 01:42:33

by Lubashev, Igor

[permalink] [raw]
Subject: [PATCH 2/5] perf tools: Use CAP_SYS_ADMIN with perf_event_paranoid checks

The kernel is using CAP_SYS_ADMIN instead of euid==0 to override
perf_event_paranoid check. Make perf do the same.

Signed-off-by: Igor Lubashev <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: James Morris <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/ad56df5452eeafb99dda9fc3d30f0f487aace503.1565188228.git.ilubashe@akamai.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/arch/arm/util/cs-etm.c | 3 ++-
tools/perf/arch/arm64/util/arm-spe.c | 3 ++-
tools/perf/arch/x86/util/intel-bts.c | 3 ++-
tools/perf/arch/x86/util/intel-pt.c | 2 +-
tools/perf/util/evsel.c | 2 +-
5 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index 5cb07e8cb296..b87a1ca2968f 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -18,6 +18,7 @@
#include "../../perf.h"
#include "../../util/auxtrace.h"
#include "../../util/cpumap.h"
+#include "../../util/event.h"
#include "../../util/evlist.h"
#include "../../util/evsel.h"
#include "../../util/pmu.h"
@@ -254,7 +255,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
struct evsel *evsel, *cs_etm_evsel = NULL;
struct perf_cpu_map *cpus = evlist->core.cpus;
- bool privileged = (geteuid() == 0 || perf_event_paranoid() < 0);
+ bool privileged = perf_event_paranoid_check(-1);
int err = 0;

ptr->evlist = evlist;
diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
index 00915b8fd05b..29275a0544cd 100644
--- a/tools/perf/arch/arm64/util/arm-spe.c
+++ b/tools/perf/arch/arm64/util/arm-spe.c
@@ -12,6 +12,7 @@
#include <time.h>

#include "../../util/cpumap.h"
+#include "../../util/event.h"
#include "../../util/evsel.h"
#include "../../util/evlist.h"
#include "../../util/session.h"
@@ -66,7 +67,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
container_of(itr, struct arm_spe_recording, itr);
struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
struct evsel *evsel, *arm_spe_evsel = NULL;
- bool privileged = geteuid() == 0 || perf_event_paranoid() < 0;
+ bool privileged = perf_event_paranoid_check(-1);
struct evsel *tracking_evsel;
int err;

diff --git a/tools/perf/arch/x86/util/intel-bts.c b/tools/perf/arch/x86/util/intel-bts.c
index 7b23318ebd7b..56a76142e9fd 100644
--- a/tools/perf/arch/x86/util/intel-bts.c
+++ b/tools/perf/arch/x86/util/intel-bts.c
@@ -12,6 +12,7 @@
#include <linux/zalloc.h>

#include "../../util/cpumap.h"
+#include "../../util/event.h"
#include "../../util/evsel.h"
#include "../../util/evlist.h"
#include "../../util/session.h"
@@ -107,7 +108,7 @@ static int intel_bts_recording_options(struct auxtrace_record *itr,
struct perf_pmu *intel_bts_pmu = btsr->intel_bts_pmu;
struct evsel *evsel, *intel_bts_evsel = NULL;
const struct perf_cpu_map *cpus = evlist->core.cpus;
- bool privileged = geteuid() == 0 || perf_event_paranoid() < 0;
+ bool privileged = perf_event_paranoid_check(-1);

btsr->evlist = evlist;
btsr->snapshot_mode = opts->auxtrace_snapshot_mode;
diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index a8e633aa278a..7abccc0b0dc0 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -578,7 +578,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
bool have_timing_info, need_immediate = false;
struct evsel *evsel, *intel_pt_evsel = NULL;
const struct perf_cpu_map *cpus = evlist->core.cpus;
- bool privileged = geteuid() == 0 || perf_event_paranoid() < 0;
+ bool privileged = perf_event_paranoid_check(-1);
u64 tsc_bit;
int err;

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 0a33f7322ecc..0b3b5af33954 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -279,7 +279,7 @@ struct evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)

static bool perf_event_can_profile_kernel(void)
{
- return geteuid() == 0 || perf_event_paranoid() == -1;
+ return perf_event_paranoid_check(-1);
}

struct evsel *perf_evsel__new_cycles(bool precise)
--
2.7.4

2019-08-27 01:42:36

by Lubashev, Igor

[permalink] [raw]
Subject: [PATCH 3/5] perf util: kernel profiling is disallowed only when perf_event_paranoid > 1

Perf was too restrictive about sysctl kernel.perf_event_paranoid. The
kernel only disallows profiling when perf_event_paranoid > 1. Make perf do
the same.

Signed-off-by: Igor Lubashev <[email protected]>
---
tools/perf/util/evsel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 0b3b5af33954..bfe6ed2abcc2 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -279,7 +279,7 @@ struct evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)

static bool perf_event_can_profile_kernel(void)
{
- return perf_event_paranoid_check(-1);
+ return perf_event_paranoid_check(1);
}

struct evsel *perf_evsel__new_cycles(bool precise)
--
2.7.4

2019-08-27 01:42:55

by Lubashev, Igor

[permalink] [raw]
Subject: [PATCH 1/5] perf event: Check ref_reloc_sym before using it

Check for ref_reloc_sym before using it instead of checking
symbol_conf.kptr_restrict and relying solely on that check.

Signed-off-by: Igor Lubashev <[email protected]>
Reported-by: Mathieu Poirier <[email protected]>
---
tools/perf/util/event.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index f440fdc3e953..b84f5f3c9651 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -913,11 +913,13 @@ static int __perf_event__synthesize_kernel_mmap(struct perf_tool *tool,
int err;
union perf_event *event;

- if (symbol_conf.kptr_restrict)
- return -1;
if (map == NULL)
return -1;

+ kmap = map__kmap(map);
+ if (!kmap->ref_reloc_sym)
+ return -1;
+
/*
* We should get this from /sys/kernel/sections/.text, but till that is
* available use this, and after it is use this as a fallback for older
@@ -940,7 +942,6 @@ static int __perf_event__synthesize_kernel_mmap(struct perf_tool *tool,
event->header.misc = PERF_RECORD_MISC_GUEST_KERNEL;
}

- kmap = map__kmap(map);
size = snprintf(event->mmap.filename, sizeof(event->mmap.filename),
"%s%s", machine->mmap_name, kmap->ref_reloc_sym->name) + 1;
size = PERF_ALIGN(size, sizeof(u64));
--
2.7.4

2019-08-27 13:47:04

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 3/5] perf util: kernel profiling is disallowed only when perf_event_paranoid > 1

Em Mon, Aug 26, 2019 at 09:39:14PM -0400, Igor Lubashev escreveu:
> Perf was too restrictive about sysctl kernel.perf_event_paranoid. The
> kernel only disallows profiling when perf_event_paranoid > 1. Make perf do
> the same.

Thanks for following up on this, I added these notes to this cset commit
log message:

--------------------------------- 8< ------------------------------------

perf evsel: Kernel profiling is disallowed only when perf_event_paranoid > 1

Perf was too restrictive about sysctl kernel.perf_event_paranoid. The
kernel only disallows profiling when perf_event_paranoid > 1. Make perf
do the same.

Committer testing:

For a non-root user:

$ id
uid=1000(acme) gid=1000(acme) groups=1000(acme),10(wheel) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
$

Before:

We were restricting it to just userspace (:u suffix) even for a
workload started by the user:

$ perf record sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
$ perf evlist
cycles:u
$ perf evlist -v
cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
$ perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 8 of event 'cycles:u'
# Event count (approx.): 1040396
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ......................
#
68.36% sleep libc-2.29.so [.] _dl_addr
27.33% sleep ld-2.29.so [.] dl_main
3.80% sleep ld-2.29.so [.] _dl_setup_hash
#
# (Tip: Order by the overhead of source file name and line number: perf report -s srcline)
#
$
$

After:

When the kernel allows profiling the kernel in that scenario:

$ perf record sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.023 MB perf.data (11 samples) ]
$ perf evlist
cycles
$ perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
$
$ perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 11 of event 'cycles'
# Event count (approx.): 1601964
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ..........................
#
28.14% sleep [kernel.vmlinux] [k] __rb_erase_color
27.21% sleep [kernel.vmlinux] [k] unmap_page_range
27.20% sleep ld-2.29.so [.] __tunable_get_val
15.24% sleep [kernel.vmlinux] [k] thp_get_unmapped_area
1.96% perf [kernel.vmlinux] [k] perf_event_exec
0.22% perf [kernel.vmlinux] [k] native_sched_clock
0.02% perf [kernel.vmlinux] [k] intel_bts_enable_local
0.00% perf [kernel.vmlinux] [k] native_write_msr
#
# (Tip: Boolean options have negative forms, e.g.: perf report --no-children)
#
$

Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Igor Lubashev <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: James Morris <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# Author: Igor Lubashev <[email protected]>
# Date: Mon Aug 26 21:39:14 2019 -0400
#
# On branch perf/core
# Changes to be committed:
# modified: tools/perf/util/evsel.c
#
# Untracked files:
# a
# a.c
# bla
# f_mode
# perf.data
# perf.data.old
# q
#
# ------------------------ >8 ------------------------
# Do not modify or remove the line above.
# Everything below it will be ignored.
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 7c704b8f0e5c..d4540bfe4574 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -282,7 +282,7 @@ struct evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)

static bool perf_event_can_profile_kernel(void)
{
- return perf_event_paranoid_check(-1);
+ return perf_event_paranoid_check(1);
}

struct evsel *perf_evsel__new_cycles(bool precise)

2019-08-28 19:32:58

by Mathieu Poirier

[permalink] [raw]
Subject: Re: [PATCH 0/5] perf: Treat perf_event_paranoid and kptr_restrict like the kernel does it

On Mon, 26 Aug 2019 at 19:40, Igor Lubashev <[email protected]> wrote:
>
> This is a follow up series to the ensure perf treats perf_event_paranoid and
> kptr_restrict in a way that is similar to the kernel's. That includes use of
> capabilities instead of euid==0, when possible, as well as adjusting the logic
> and fixing bugs.
>
> Prior discussion: https://lkml.kernel.org/lkml/[email protected]
>
> === Testing notes ===
>
> I have tested on x86 with perf binary installed according to
> Documentation/admin-guide/perf-security.rst (cap_sys_admin, cap_sys_ptrace,
> cap_syslog assigned to the perf executable).
>
> I tested each permutation of:
>
> * 7 commits:
> 1. HEAD of perf/core
> 2. patch 01 on top of perf/core
> 3. patches 01-02 on top of perf/core
> 4. patches 01-03 on top of perf/core
> 5. patches 01-04 on top of perf/core
> 6. patches 01-05 on top of perf/core
> 7. HEAD of perf/cap (with known bug fixed by patch 01 of this series)
>
> * 2 build environments: with and without libcap-dev
>
> * 3 kernel.kptr_restrict values: 0, 1, 2
>
> * 4 kernel.perf_event_paranoid values: -1, 0, 1, 2
>
> * 2 users: root and non-root
>
> Total: 336 permutations
>
> Each permutation consisted of:
> perf test
> perf record -e instructions -- sleep 1
>
> All test runs were expected. Also, as expected, the following permutation (just
> that permutation) resulted in segmentation failure:
> commit: perf/cap
> build: no libcap-dev
> kernel.kptr_restrict: 0
> kernel.perf_event_paranoid: 2
> user: non-root
>
> The perf/cap commit was included in the test to ensure that we can reproduce the
> crash and hence test that the patch series fixes the crash, while retaining the
> desired behavior of perf/cap.
>
> === Series Contents ===
>
> 01: perf event: Check ref_reloc_sym before using it
> Fix the pre-existing cause of the crash above: use of ref_reloc_sym without
> a check for NULL
>
> 02: perf tools: Use CAP_SYS_ADMIN with perf_event_paranoid checks
> Replace the use of euid==0 with a check for CAP_SYS_ADMIN whenever
> perf_event_paranoid level is verified.
> * This patch has been reviewed previously and is unchanged.
> * I kept Acks and Sign-offs.
>
> 03: perf util: kernel profiling is disallowed only when perf_event_paranoid>1
> Align perf logic regarding perf_event_paranoid to match kernel's.
> This has been reported by Arnaldo.
>
> 04: perf symbols: Use CAP_SYSLOG with kptr_restrict checks
> Replace the use of uid and euid with a check for CAP_SYSLOG when
> kptr_restrict is verified (similar to kernel/kallsyms.c and lib/vsprintf.c).
> Consult perf_event_paranoid when kptr_restrict==0 (see kernel/kallsyms.c).
> * A previous version of this patch has been reviewed previously, but I
> * modified it in a non-trivial way, so I removed Acks.
>
> 05: perf: warn perf_event_paranoid restrict kernel symbols
> Warn that /proc/sys/kernel/perf_event_paranoid can also restrict kernel
> symbols.
>
> Igor Lubashev (5):
> perf event: Check ref_reloc_sym before using it
> perf tools: Use CAP_SYS_ADMIN with perf_event_paranoid checks
> perf util: kernel profiling is disallowed only when perf_event_paranoid > 1
> perf symbols: Use CAP_SYSLOG with kptr_restrict checks
> perf: warn that perf_event_paranoid can restrict kernel symbols
>
> tools/perf/arch/arm/util/cs-etm.c | 3 ++-

For the coresight part:

Reviewed-by: Mathieu Poirier <[email protected]>

> tools/perf/arch/arm64/util/arm-spe.c | 3 ++-
> tools/perf/arch/x86/util/intel-bts.c | 3 ++-
> tools/perf/arch/x86/util/intel-pt.c | 2 +-
> tools/perf/builtin-record.c | 2 +-
> tools/perf/builtin-top.c | 2 +-
> tools/perf/builtin-trace.c | 2 +-
> tools/perf/util/event.c | 7 ++++---
> tools/perf/util/evsel.c | 2 +-
> tools/perf/util/symbol.c | 15 ++++++++++++---
> 10 files changed, 27 insertions(+), 14 deletions(-)

For the set:

Tested-by: Mathieu Poirier <[email protected]>

>
> --
> 2.7.4
>

2019-08-28 21:02:01

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 0/5] perf: Treat perf_event_paranoid and kptr_restrict like the kernel does it

Em Wed, Aug 28, 2019 at 01:31:21PM -0600, Mathieu Poirier escreveu:
> On Mon, 26 Aug 2019 at 19:40, Igor Lubashev <[email protected]> wrote:
> > Igor Lubashev (5):
> > perf event: Check ref_reloc_sym before using it
> > perf tools: Use CAP_SYS_ADMIN with perf_event_paranoid checks
> > perf util: kernel profiling is disallowed only when perf_event_paranoid > 1
> > perf symbols: Use CAP_SYSLOG with kptr_restrict checks
> > perf: warn that perf_event_paranoid can restrict kernel symbols

> > tools/perf/arch/arm/util/cs-etm.c | 3 ++-

> For the coresight part:

> Reviewed-by: Mathieu Poirier <[email protected]>

> > tools/perf/arch/arm64/util/arm-spe.c | 3 ++-
> > tools/perf/arch/x86/util/intel-bts.c | 3 ++-
> > tools/perf/arch/x86/util/intel-pt.c | 2 +-
> > tools/perf/builtin-record.c | 2 +-
> > tools/perf/builtin-top.c | 2 +-
> > tools/perf/builtin-trace.c | 2 +-
> > tools/perf/util/event.c | 7 ++++---
> > tools/perf/util/evsel.c | 2 +-
> > tools/perf/util/symbol.c | 15 ++++++++++++---
> > 10 files changed, 27 insertions(+), 14 deletions(-)
>
> For the set:
>
> Tested-by: Mathieu Poirier <[email protected]>

Thanks, updated the patches with your tags,

- Arnaldo

Subject: [tip: perf/core] perf tools: Use CAP_SYS_ADMIN with perf_event_paranoid checks

The following commit has been merged into the perf/core branch of tip:

Commit-ID: dda1bf8ea78add78739d128a20b555c4a1a19c27
Gitweb: https://git.kernel.org/tip/dda1bf8ea78add78739d128a20b555c4a1a19c27
Author: Igor Lubashev <[email protected]>
AuthorDate: Mon, 26 Aug 2019 21:39:13 -04:00
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitterDate: Wed, 28 Aug 2019 17:18:08 -03:00

perf tools: Use CAP_SYS_ADMIN with perf_event_paranoid checks

The kernel is using CAP_SYS_ADMIN instead of euid==0 to override
perf_event_paranoid check. Make perf do the same.

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Mathieu Poirier <[email protected]>
Reviewed-by: Mathieu Poirier <[email protected]> # coresight part
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: James Morris <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Igor Lubashev <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/arch/arm/util/cs-etm.c | 3 ++-
tools/perf/arch/arm64/util/arm-spe.c | 3 ++-
tools/perf/arch/x86/util/intel-bts.c | 3 ++-
tools/perf/arch/x86/util/intel-pt.c | 2 +-
tools/perf/util/evsel.c | 2 +-
5 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index a185dab..5d856ed 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -18,6 +18,7 @@
#include "../../util/record.h"
#include "../../util/auxtrace.h"
#include "../../util/cpumap.h"
+#include "../../util/event.h"
#include "../../util/evlist.h"
#include "../../util/evsel.h"
#include "../../util/pmu.h"
@@ -254,7 +255,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
struct evsel *evsel, *cs_etm_evsel = NULL;
struct perf_cpu_map *cpus = evlist->core.cpus;
- bool privileged = (geteuid() == 0 || perf_event_paranoid() < 0);
+ bool privileged = perf_event_paranoid_check(-1);
int err = 0;

ptr->evlist = evlist;
diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
index cdd5c0c..c7b38f0 100644
--- a/tools/perf/arch/arm64/util/arm-spe.c
+++ b/tools/perf/arch/arm64/util/arm-spe.c
@@ -12,6 +12,7 @@
#include <time.h>

#include "../../util/cpumap.h"
+#include "../../util/event.h"
#include "../../util/evsel.h"
#include "../../util/evlist.h"
#include "../../util/session.h"
@@ -67,7 +68,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
container_of(itr, struct arm_spe_recording, itr);
struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
struct evsel *evsel, *arm_spe_evsel = NULL;
- bool privileged = geteuid() == 0 || perf_event_paranoid() < 0;
+ bool privileged = perf_event_paranoid_check(-1);
struct evsel *tracking_evsel;
int err;

diff --git a/tools/perf/arch/x86/util/intel-bts.c b/tools/perf/arch/x86/util/intel-bts.c
index 1f2cf61..16d26ea 100644
--- a/tools/perf/arch/x86/util/intel-bts.c
+++ b/tools/perf/arch/x86/util/intel-bts.c
@@ -12,6 +12,7 @@
#include <linux/zalloc.h>

#include "../../util/cpumap.h"
+#include "../../util/event.h"
#include "../../util/evsel.h"
#include "../../util/evlist.h"
#include "../../util/session.h"
@@ -108,7 +109,7 @@ static int intel_bts_recording_options(struct auxtrace_record *itr,
struct perf_pmu *intel_bts_pmu = btsr->intel_bts_pmu;
struct evsel *evsel, *intel_bts_evsel = NULL;
const struct perf_cpu_map *cpus = evlist->core.cpus;
- bool privileged = geteuid() == 0 || perf_event_paranoid() < 0;
+ bool privileged = perf_event_paranoid_check(-1);

btsr->evlist = evlist;
btsr->snapshot_mode = opts->auxtrace_snapshot_mode;
diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index 44cfe72..746981c 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -579,7 +579,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
bool have_timing_info, need_immediate = false;
struct evsel *evsel, *intel_pt_evsel = NULL;
const struct perf_cpu_map *cpus = evlist->core.cpus;
- bool privileged = geteuid() == 0 || perf_event_paranoid() < 0;
+ bool privileged = perf_event_paranoid_check(-1);
u64 tsc_bit;
int err;

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index fa67635..7c704b8 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -282,7 +282,7 @@ struct evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)

static bool perf_event_can_profile_kernel(void)
{
- return geteuid() == 0 || perf_event_paranoid() == -1;
+ return perf_event_paranoid_check(-1);
}

struct evsel *perf_evsel__new_cycles(bool precise)

Subject: [tip: perf/core] perf event: Check ref_reloc_sym before using it

The following commit has been merged into the perf/core branch of tip:

Commit-ID: e9a6882f267a8105461066e3ea6b4b6b9be1b807
Gitweb: https://git.kernel.org/tip/e9a6882f267a8105461066e3ea6b4b6b9be1b807
Author: Igor Lubashev <[email protected]>
AuthorDate: Mon, 26 Aug 2019 21:39:12 -04:00
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitterDate: Wed, 28 Aug 2019 17:17:51 -03:00

perf event: Check ref_reloc_sym before using it

Check for ref_reloc_sym before using it instead of checking
symbol_conf.kptr_restrict and relying solely on that check.

Reported-by: Mathieu Poirier <[email protected]>
Signed-off-by: Igor Lubashev <[email protected]>
Tested-by: Mathieu Poirier <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: James Morris <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/event.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 33616ea..e33dd1a 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -913,11 +913,13 @@ static int __perf_event__synthesize_kernel_mmap(struct perf_tool *tool,
int err;
union perf_event *event;

- if (symbol_conf.kptr_restrict)
- return -1;
if (map == NULL)
return -1;

+ kmap = map__kmap(map);
+ if (!kmap->ref_reloc_sym)
+ return -1;
+
/*
* We should get this from /sys/kernel/sections/.text, but till that is
* available use this, and after it is use this as a fallback for older
@@ -940,7 +942,6 @@ static int __perf_event__synthesize_kernel_mmap(struct perf_tool *tool,
event->header.misc = PERF_RECORD_MISC_GUEST_KERNEL;
}

- kmap = map__kmap(map);
size = snprintf(event->mmap.filename, sizeof(event->mmap.filename),
"%s%s", machine->mmap_name, kmap->ref_reloc_sym->name) + 1;
size = PERF_ALIGN(size, sizeof(u64));

Subject: [tip: perf/core] perf symbols: Use CAP_SYSLOG with kptr_restrict checks

The following commit has been merged into the perf/core branch of tip:

Commit-ID: 8859aedefefe7eeea5e67968b7fe39c828d589a0
Gitweb: https://git.kernel.org/tip/8859aedefefe7eeea5e67968b7fe39c828d589a0
Author: Igor Lubashev <[email protected]>
AuthorDate: Mon, 26 Aug 2019 21:39:15 -04:00
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitterDate: Wed, 28 Aug 2019 17:19:19 -03:00

perf symbols: Use CAP_SYSLOG with kptr_restrict checks

The kernel is using CAP_SYSLOG capability instead of uid==0 and euid==0
when checking kptr_restrict. Make perf do the same.

Also, the kernel is a more restrictive than "no restrictions" in case of
kptr_restrict==0, so add the same logic to perf.

Signed-off-by: Igor Lubashev <[email protected]>
Tested-by: Mathieu Poirier <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: James Morris <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/symbol.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 4efde78..035f2e7 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -4,6 +4,7 @@
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
+#include <linux/capability.h>
#include <linux/kernel.h>
#include <linux/mman.h>
#include <linux/time64.h>
@@ -15,8 +16,10 @@
#include <inttypes.h>
#include "annotate.h"
#include "build-id.h"
+#include "cap.h"
#include "util.h"
#include "debug.h"
+#include "event.h"
#include "machine.h"
#include "map.h"
#include "symbol.h"
@@ -2195,13 +2198,19 @@ static bool symbol__read_kptr_restrict(void)
char line[8];

if (fgets(line, sizeof(line), fp) != NULL)
- value = ((geteuid() != 0) || (getuid() != 0)) ?
- (atoi(line) != 0) :
- (atoi(line) == 2);
+ value = perf_cap__capable(CAP_SYSLOG) ?
+ (atoi(line) >= 2) :
+ (atoi(line) != 0);

fclose(fp);
}

+ /* Per kernel/kallsyms.c:
+ * we also restrict when perf_event_paranoid > 1 w/o CAP_SYSLOG
+ */
+ if (perf_event_paranoid() > 1 && !perf_cap__capable(CAP_SYSLOG))
+ value = true;
+
return value;
}

Subject: [tip: perf/core] perf evsel: Kernel profiling is disallowed only when perf_event_paranoid > 1

The following commit has been merged into the perf/core branch of tip:

Commit-ID: aa97293ff129f504e7c8589e56007ecfe3e3e835
Gitweb: https://git.kernel.org/tip/aa97293ff129f504e7c8589e56007ecfe3e3e835
Author: Igor Lubashev <[email protected]>
AuthorDate: Mon, 26 Aug 2019 21:39:14 -04:00
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitterDate: Wed, 28 Aug 2019 17:19:05 -03:00

perf evsel: Kernel profiling is disallowed only when perf_event_paranoid > 1

Perf was too restrictive about sysctl kernel.perf_event_paranoid. The
kernel only disallows profiling when perf_event_paranoid > 1. Make perf
do the same.

Committer testing:

For a non-root user:

$ id
uid=1000(acme) gid=1000(acme) groups=1000(acme),10(wheel) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
$

Before:

We were restricting it to just userspace (:u suffix) even for a
workload started by the user:

$ perf record sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
$ perf evlist
cycles:u
$ perf evlist -v
cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
$ perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 8 of event 'cycles:u'
# Event count (approx.): 1040396
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ......................
#
68.36% sleep libc-2.29.so [.] _dl_addr
27.33% sleep ld-2.29.so [.] dl_main
3.80% sleep ld-2.29.so [.] _dl_setup_hash
#
# (Tip: Order by the overhead of source file name and line number: perf report -s srcline)
#
$
$

After:

When the kernel allows profiling the kernel in that scenario:

$ perf record sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.023 MB perf.data (11 samples) ]
$ perf evlist
cycles
$ perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
$
$ perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 11 of event 'cycles'
# Event count (approx.): 1601964
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ..........................
#
28.14% sleep [kernel.vmlinux] [k] __rb_erase_color
27.21% sleep [kernel.vmlinux] [k] unmap_page_range
27.20% sleep ld-2.29.so [.] __tunable_get_val
15.24% sleep [kernel.vmlinux] [k] thp_get_unmapped_area
1.96% perf [kernel.vmlinux] [k] perf_event_exec
0.22% perf [kernel.vmlinux] [k] native_sched_clock
0.02% perf [kernel.vmlinux] [k] intel_bts_enable_local
0.00% perf [kernel.vmlinux] [k] native_write_msr
#
# (Tip: Boolean options have negative forms, e.g.: perf report --no-children)
#
$

Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Igor Lubashev <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Mathieu Poirier <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: James Morris <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/evsel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 7c704b8..d4540bf 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -282,7 +282,7 @@ struct evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)

static bool perf_event_can_profile_kernel(void)
{
- return perf_event_paranoid_check(-1);
+ return perf_event_paranoid_check(1);
}

struct evsel *perf_evsel__new_cycles(bool precise)