Changes in v2:
- implemented minor doc and code changes to substitute CAP_SYS_ADMIN
with CAP_PERFMON capability;
- introduced Perf doc file with instructions on how to enable and use
perf_event LSM hooks for mandatory access control to perf_event_open()
syscall;
v1: https://lore.kernel.org/lkml/[email protected]/
repo: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
sha1: ee097e8ee56f8867cbbf45fe2a06f6b9e660c39c
Extend the Perf tool to check the /sys/fs/selinux/enforce value and notify
the user when access to the perf_event_open() syscall is restricted by
enforced SELinux policy settings. See the newly added security.txt file for
the exact steps showing what the changes look like and how to test the patch set.
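The enforce check described above can be sketched outside the perf tree roughly as follows. This is only an illustration under stated assumptions: selinux_enforce_state() is a hypothetical helper name, and it opens the file path directly with stdio, whereas the patch itself goes through perf's sysfs helper.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of perf: report the SELinux mode found in
 * an "enforce" file such as /sys/fs/selinux/enforce.
 * Returns 1 for enforcing, 0 for permissive, and -1 if the file cannot be
 * read or parsed (for example when selinuxfs is not mounted).
 */
static int selinux_enforce_state(const char *path)
{
	FILE *fp;
	int value = 0;
	int parsed;

	assert(path != NULL);

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	/* The file holds a single decimal digit: 0 or 1. */
	parsed = fscanf(fp, "%d", &value);
	fclose(fp);

	if (parsed != 1)
		return -1;
	return value != 0;
}
```

A caller would pass "/sys/fs/selinux/enforce" and print the MAC policy hint only when the result is 1, treating -1 the same as "not enforcing".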
---
Alexey Budankov (4):
perf trace: substitute CAP_SYS_ADMIN with CAP_PERFMON in error message
perf docs: substitute CAP_SYS_ADMIN with CAP_PERFMON where needed
perf tool: make Perf tool aware of SELinux access control
perf docs: introduce security.txt file to document related issues
tools/perf/Documentation/perf-intel-pt.txt | 2 +-
tools/perf/Documentation/security.txt | 236 +++++++++++++++++++++
tools/perf/builtin-ftrace.c | 2 +-
tools/perf/design.txt | 3 +-
tools/perf/util/cloexec.c | 4 +-
tools/perf/util/evsel.c | 40 ++--
6 files changed, 265 insertions(+), 22 deletions(-)
create mode 100644 tools/perf/Documentation/security.txt
--
2.24.1
Update error message to mention CAP_PERFMON only. CAP_SYS_ADMIN still
works in keeping with user space backward compatibility approach.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-ftrace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index 55eda54240fb..39d43ad02f30 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -288,7 +288,7 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)
perf_cap__capable(CAP_SYS_ADMIN))) {
pr_err("ftrace only works for %s!\n",
#ifdef HAVE_LIBCAP_SUPPORT
- "users with the CAP_PERFMON or CAP_SYS_ADMIN capability"
+ "users with the CAP_PERFMON capability"
#else
"root"
#endif
--
2.24.1
Substitute CAP_SYS_ADMIN with CAP_PERFMON in the docs where admin
is mentioned. CAP_SYS_ADMIN still works in keeping with user space
backward compatibility approach.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/Documentation/perf-intel-pt.txt | 2 +-
tools/perf/design.txt | 3 +--
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
index 456fdcbf26ac..176597be0755 100644
--- a/tools/perf/Documentation/perf-intel-pt.txt
+++ b/tools/perf/Documentation/perf-intel-pt.txt
@@ -687,7 +687,7 @@ The v4.2 kernel introduced support for a context switch metadata event,
PERF_RECORD_SWITCH, which allows unprivileged users to see when their processes
are scheduled out and in, just not by whom, which is left for the
PERF_RECORD_SWITCH_CPU_WIDE, that is only accessible in system wide context,
-which in turn requires CAP_SYS_ADMIN.
+which in turn requires CAP_PERFMON.
Please see the 45ac1403f564 ("perf: Add PERF_RECORD_SWITCH to indicate context
switches") commit, that introduces these metadata events for further info.
diff --git a/tools/perf/design.txt b/tools/perf/design.txt
index a42fab308ff6..6fd879440c40 100644
--- a/tools/perf/design.txt
+++ b/tools/perf/design.txt
@@ -258,8 +258,7 @@ gets schedule to. Per task counters can be created by any user, for
their own tasks.
A 'pid == -1' and 'cpu == x' counter is a per CPU counter that counts
-all events on CPU-x. Per CPU counters need CAP_PERFMON or CAP_SYS_ADMIN
-privilege.
+all events on CPU-x. Per CPU counters need CAP_PERFMON privilege.
The 'flags' parameter is currently unused and must be zero.
--
2.24.1
Implement SELinux sysfs check to see if the system is in enforcing
mode and print warning message with pointers to check audit logs.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/util/cloexec.c | 4 ++--
tools/perf/util/evsel.c | 40 +++++++++++++++++++++++----------------
2 files changed, 26 insertions(+), 18 deletions(-)
diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
index a12872f2856a..9c8ec816261b 100644
--- a/tools/perf/util/cloexec.c
+++ b/tools/perf/util/cloexec.c
@@ -65,7 +65,7 @@ static int perf_flag_probe(void)
return 1;
}
- WARN_ONCE(err != EINVAL && err != EBUSY,
+ WARN_ONCE(err != EINVAL && err != EBUSY && err != EACCES,
"perf_event_open(..., PERF_FLAG_FD_CLOEXEC) failed with unexpected error %d (%s)\n",
err, str_error_r(err, sbuf, sizeof(sbuf)));
@@ -83,7 +83,7 @@ static int perf_flag_probe(void)
if (fd >= 0)
close(fd);
- if (WARN_ONCE(fd < 0 && err != EBUSY,
+ if (WARN_ONCE(fd < 0 && err != EBUSY && err != EACCES,
"perf_event_open(..., 0) failed unexpectedly with error %d (%s)\n",
err, str_error_r(err, sbuf, sizeof(sbuf))))
return -1;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 9fa92649adb4..82492ca12405 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2514,32 +2514,40 @@ int perf_evsel__open_strerror(struct evsel *evsel, struct target *target,
int err, char *msg, size_t size)
{
char sbuf[STRERR_BUFSIZE];
- int printed = 0;
+ int printed = 0, enforced = 0;
switch (err) {
case EPERM:
case EACCES:
+ printed += scnprintf(msg + printed, size - printed,
+ "Access to performance monitoring and observability operations is limited.\n");
+
+ if (!sysfs__read_int("fs/selinux/enforce", &enforced)) {
+ if (enforced) {
+ printed += scnprintf(msg + printed, size - printed,
+ "Enforced MAC policy settings (SELinux) can limit access to performance\n"
+ "monitoring and observability operations. Inspect system audit records for\n"
+ "more perf_event access control information and adjusting the policy.\n");
+ }
+ }
+
if (err == EPERM)
- printed = scnprintf(msg, size,
- "No permission to enable %s event.\n\n",
+ printed += scnprintf(msg + printed, size - printed,
+ "No permission to enable %s event.\n",
perf_evsel__name(evsel));
return scnprintf(msg + printed, size - printed,
- "You may not have permission to collect %sstats.\n\n"
- "Consider tweaking /proc/sys/kernel/perf_event_paranoid,\n"
- "which controls use of the performance events system by\n"
- "unprivileged users (without CAP_PERFMON or CAP_SYS_ADMIN).\n\n"
- "The current value is %d:\n\n"
+ "Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open\n"
+ "access to performance monitoring and observability operations for users\n"
+ "without CAP_PERFMON capability. perf_event_paranoid setting is %d:\n"
" -1: Allow use of (almost) all events by all users\n"
" Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK\n"
- ">= 0: Disallow ftrace function tracepoint by users without CAP_PERFMON or CAP_SYS_ADMIN\n"
- " Disallow raw tracepoint access by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n"
- ">= 1: Disallow CPU event access by users without CAP_PERFMON or CAP_SYS_ADMIN\n"
- ">= 2: Disallow kernel profiling by users without CAP_PERFMON or CAP_SYS_ADMIN\n\n"
- "To make this setting permanent, edit /etc/sysctl.conf too, e.g.:\n\n"
- " kernel.perf_event_paranoid = -1\n" ,
- target->system_wide ? "system-wide " : "",
- perf_event_paranoid());
+ ">= 0: Disallow raw and ftrace function tracepoint access\n"
+ ">= 1: Disallow CPU event access\n"
+ ">= 2: Disallow kernel profiling\n"
+ "To make the adjusted perf_event_paranoid setting permanent preserve it\n"
+ "in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)",
+ perf_event_paranoid());
case ENOENT:
return scnprintf(msg, size, "The %s event is not supported.",
perf_evsel__name(evsel));
--
2.24.1
Publish instructions on how to apply LSM hooks for access control
to perf_event_open() syscall on Fedora (v31) with Targeted
SELinux policy.
Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/Documentation/security.txt | 236 ++++++++++++++++++++++++++
1 file changed, 236 insertions(+)
create mode 100644 tools/perf/Documentation/security.txt
diff --git a/tools/perf/Documentation/security.txt b/tools/perf/Documentation/security.txt
new file mode 100644
index 000000000000..7ca9377c1526
--- /dev/null
+++ b/tools/perf/Documentation/security.txt
@@ -0,0 +1,236 @@
+Overview
+========
+
+For general security related questions of perf_event_open() syscall usage,
+performance monitoring and observability operations by Perf see here:
+https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
+
+Enabling LSM based mandatory access control (MAC) to perf_event_open() syscall
+==============================================================================
+
+LSM hooks for mandatory access control for perf_event_open() syscall can be
+used starting from Linux v5.3. Below are the steps to extend Fedora (v31) with
+Targeted policy with perf_event_open() access control capabilities:
+
+1. Download selinux-policy SRPM package (e.g. selinux-policy-3.14.4-48.fc31.src.rpm on FC31)
+ and install it so rpmbuild directory would exist in the current working directory:
+
+ # rpm -Uhv selinux-policy-3.14.4-48.fc31.src.rpm
+
+2. Get into rpmbuild/SPECS directory and unpack the source code:
+
+ # rpmbuild -bp selinux-policy.spec
+
+3. Place patch below at rpmbuild/BUILD/selinux-policy-b86eaaf4dbcf2d51dd4432df7185c0eaf3cbcc02
+ directory and apply it:
+
+ # patch -p1 < selinux-policy-perf-events-perfmon.patch
+ patching file policy/flask/access_vectors
+ patching file policy/flask/security_classes
+ # cat selinux-policy-perf-events-perfmon.patch
+diff -Nura a/policy/flask/access_vectors b/policy/flask/access_vectors
+--- a/policy/flask/access_vectors 2020-02-04 18:19:53.000000000 +0300
++++ b/policy/flask/access_vectors 2020-02-28 23:37:25.000000000 +0300
+@@ -174,6 +174,7 @@
+ wake_alarm
+ block_suspend
+ audit_read
++ perfmon
+ }
+
+ #
+@@ -1099,3 +1100,15 @@
+
+ class xdp_socket
+ inherits socket
++
++class perf_event
++{
++ open
++ cpu
++ kernel
++ tracepoint
++ read
++ write
++}
++
++
+diff -Nura a/policy/flask/security_classes b/policy/flask/security_classes
+--- a/policy/flask/security_classes 2020-02-04 18:19:53.000000000 +0300
++++ b/policy/flask/security_classes 2020-02-28 21:35:17.000000000 +0300
+@@ -200,4 +200,6 @@
+
+ class xdp_socket
+
++class perf_event
++
+ # FLASK
+
+4. Get into rpmbuild/SPECS directory and build policy packages from patched sources:
+
+ # rpmbuild --noclean --noprep -ba selinux-policy.spec
+
+ so you have this:
+
+ # ls -alh rpmbuild/RPMS/noarch/
+ total 33M
+ drwxr-xr-x. 2 root root 4.0K Mar 20 12:16 .
+ drwxr-xr-x. 3 root root 4.0K Mar 20 12:16 ..
+ -rw-r--r--. 1 root root 112K Mar 20 12:16 selinux-policy-3.14.4-48.fc31.noarch.rpm
+ -rw-r--r--. 1 root root 1.2M Mar 20 12:17 selinux-policy-devel-3.14.4-48.fc31.noarch.rpm
+ -rw-r--r--. 1 root root 2.3M Mar 20 12:17 selinux-policy-doc-3.14.4-48.fc31.noarch.rpm
+ -rw-r--r--. 1 root root 12M Mar 20 12:17 selinux-policy-minimum-3.14.4-48.fc31.noarch.rpm
+ -rw-r--r--. 1 root root 4.5M Mar 20 12:16 selinux-policy-mls-3.14.4-48.fc31.noarch.rpm
+ -rw-r--r--. 1 root root 111K Mar 20 12:16 selinux-policy-sandbox-3.14.4-48.fc31.noarch.rpm
+ -rw-r--r--. 1 root root 14M Mar 20 12:17 selinux-policy-targeted-3.14.4-48.fc31.noarch.rpm
+
+5. Install SELinux packages from the Fedora repository, if not already installed, and
+ update them with the patched rpms above:
+
+ # rpm -Uhv rpmbuild/RPMS/noarch/selinux-policy-*
+
+6. Enable SELinux Permissive mode for the Targeted policy, if not already enabled:
+
+ # cat /etc/selinux/config
+
+ # This file controls the state of SELinux on the system.
+ # SELINUX= can take one of these three values:
+ # enforcing - SELinux security policy is enforced.
+ # permissive - SELinux prints warnings instead of enforcing.
+ # disabled - No SELinux policy is loaded.
+ SELINUX=permissive
+ # SELINUXTYPE= can take one of these three values:
+ # targeted - Targeted processes are protected,
+ # minimum - Modification of targeted policy. Only selected processes are protected.
+ # mls - Multi Level Security protection.
+ SELINUXTYPE=targeted
+
+7. Enable filesystem SELinux labeling at the next reboot:
+
+ # touch /.autorelabel
+
+8. Reboot machine and it will label filesystems and load Targeted policy into the kernel;
+
+9. Login and check that dmesg output doesn't mention that perf_event class is unknown to SELinux subsystem;
+
+10. Check that SELinux is enabled and in Permissive mode:
+
+ # getenforce
+ Permissive
+
+11. Turn SELinux into Enforcing mode:
+
+ # setenforce 1
+ # getenforce
+ Enforcing
+
+Opening access to perf_event_open() syscall on Fedora with SELinux
+==================================================================
+
+Access to performance monitoring and observability operations by Perf
+can be limited even for a superuser or CAP_PERFMON privileged process.
+MAC policy settings (e.g. SELinux) can be loaded into the kernel and
+prevent unauthorized access to the perf_event_open() syscall. In such a
+case the Perf tool provides a message similar to the one below:
+
+ # perf stat
+ Error:
+ Access to performance monitoring and observability operations is limited.
+ Enforced MAC policy settings (SELinux) can limit access to performance
+ monitoring and observability operations. Inspect system audit records for
+ more perf_event access control information and adjusting the policy.
+ Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open
+ access to performance monitoring and observability operations for users
+ without CAP_PERFMON capability. perf_event_paranoid setting is -1:
+ -1: Allow use of (almost) all events by all users
+ Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
+ >= 0: Disallow raw and ftrace function tracepoint access
+ >= 1: Disallow CPU event access
+ >= 2: Disallow kernel profiling
+ To make the adjusted perf_event_paranoid setting permanent preserve it
+ in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
+
+To make sure that access is limited by MAC policy settings, inspect the system
+audit records using the journalctl command or /var/log/audit/audit.log; the
+output should contain AVC denied records related to perf_event:
+
+ # journalctl --reverse --no-pager | grep perf_event
+
+ python3[1318099]: SELinux is preventing perf from open access on the perf_event labeled unconfined_t.
+ If you believe that perf should be allowed open access on perf_event labeled unconfined_t by default.
+ setroubleshoot[1318099]: SELinux is preventing perf from open access on the perf_event labeled unconfined_t. For complete SELinux messages run: sealert -l 4595ce5b-e58f-462c-9d86-3bc2074935de
+ audit[1318098]: AVC avc: denied { open } for pid=1318098 comm="perf" scontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=perf_event permissive=0
+
+In order to open access to the perf_event_open() syscall, MAC policy settings
+may need to be extended. On an SELinux system this can be done by loading a
+special policy module extending the base policy settings. A Perf related policy
+module can be generated from the system audit records about blocked perf_event
+access. Run the command below to generate the my-perf.te policy extension file
+with perf_event related rules:
+
+ # ausearch -c 'perf' --raw | audit2allow -M my-perf && cat my-perf.te
+
+ module my-perf 1.0;
+
+ require {
+ type unconfined_t;
+ class perf_event { cpu kernel open read tracepoint write };
+ }
+
+ #============= unconfined_t ==============
+ allow unconfined_t self:perf_event { cpu kernel open read tracepoint write };
+
+Now compile, pack and load my-perf.pp extension module into the kernel:
+
+ # checkmodule -M -m -o my-perf.mod my-perf.te
+ # semodule_package -o my-perf.pp -m my-perf.mod
+ # semodule -X 300 -i my-perf.pp
+
+After the steps above, access to the perf_event_open() syscall should
+be allowed by the policy settings. Check access by running Perf like this:
+
+ # perf stat
+ ^C
+ Performance counter stats for 'system wide':
+
+ 36,387.41 msec cpu-clock # 7.999 CPUs utilized
+ 2,629 context-switches # 0.072 K/sec
+ 57 cpu-migrations # 0.002 K/sec
+ 1 page-faults # 0.000 K/sec
+ 263,721,559 cycles # 0.007 GHz
+ 175,746,713 instructions # 0.67 insn per cycle
+ 19,628,798 branches # 0.539 M/sec
+ 1,259,201 branch-misses # 6.42% of all branches
+
+ 4.549061439 seconds time elapsed
+
+The generated my-perf.pp policy extension module can be removed
+from the kernel using this command:
+
+ # semodule -X 300 -r my-perf
+
+Alternatively the module can be temporarily disabled and enabled back using
+these two commands:
+
+ # semodule -d my-perf
+ # semodule -e my-perf
+
+If something went wrong
+=======================
+
+To turn SELinux into Permissive mode:
+ # setenforce 0
+
+To fully disable SELinux during kernel boot [3] set kernel command line parameter selinux=0
+
+To remove SELinux labeling from local filesystems:
+ # find / -mount -print0 | xargs -0 setfattr -h -x security.selinux
+
+To fully turn SELinux off on a machine, set SELINUX=disabled in the /etc/selinux/config file and reboot.
+
+Links
+=====
+
+[1] https://download-ib01.fedoraproject.org/pub/fedora/linux/updates/31/Everything/SRPMS/Packages/s/selinux-policy-3.14.4-49.fc31.src.rpm
+[2] https://docs.fedoraproject.org/en-US/Fedora/11/html/Security-Enhanced_Linux/sect-Security-Enhanced_Linux-Working_with_SELinux-Enabling_and_Disabling_SELinux.html
+[3] https://danwalsh.livejournal.com/10972.html
--
2.24.1
On Wed, Apr 22, 2020 at 05:44:02PM +0300, Alexey Budankov wrote:
>
> Update error message to mention CAP_PERFMON only. CAP_SYS_ADMIN still
> works in keeping with user space backward compatibility approach.
This will confuse users who build the latest perf to use on older
systems where CAP_PERFMON isn't available. In these cases we probably
need to check for the existence of CAP_PERFMON to provide a better
warning message, something like:
You need CAP_SYS_ADMIN, or update your kernel and libcap to ones that
support CAP_PERFMON.
That would be for systems without CAP_PERFMON, while mentioning only
CAP_PERFMON for systems where it is present, right?
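One possible shape for such an existence check, as a minimal sketch only: it assumes the running kernel exposes its highest capability number in /proc/sys/kernel/cap_last_cap and relies on CAP_PERFMON being capability number 38. The helper names are hypothetical, and the sketch covers kernel support only, not whether the installed libcap knows the capability name.

```c
#include <assert.h>
#include <stdio.h>

/* CAP_PERFMON is capability number 38 in linux/capability.h. */
#define SKETCH_CAP_PERFMON 38

/*
 * Hypothetical helper: read the highest capability number known to the
 * running kernel from a file such as /proc/sys/kernel/cap_last_cap.
 * Returns 0 on success and stores the number in *last_cap, -1 otherwise.
 */
static int read_cap_last_cap(const char *path, int *last_cap)
{
	FILE *fp;
	int parsed;

	assert(path != NULL && last_cap != NULL);

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	parsed = fscanf(fp, "%d", last_cap);
	fclose(fp);
	return parsed == 1 ? 0 : -1;
}

/* True when the kernel's capability range includes CAP_PERFMON. */
static int kernel_knows_cap_perfmon(int last_cap)
{
	return last_cap >= SKETCH_CAP_PERFMON;
}

/* Pick the wording for the "ftrace only works for %s!" message. */
static const char *ftrace_privilege_hint(int last_cap)
{
	if (kernel_knows_cap_perfmon(last_cap))
		return "users with the CAP_PERFMON capability";
	return "users with the CAP_SYS_ADMIN capability "
	       "(CAP_PERFMON needs a newer kernel and libcap)";
}
```

If reading the file fails, falling back to the CAP_SYS_ADMIN wording would be the conservative choice, since that capability works on both old and new kernels.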
> Signed-off-by: Alexey Budankov <[email protected]>
> ---
> tools/perf/builtin-ftrace.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
> index 55eda54240fb..39d43ad02f30 100644
> --- a/tools/perf/builtin-ftrace.c
> +++ b/tools/perf/builtin-ftrace.c
> @@ -288,7 +288,7 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)
> perf_cap__capable(CAP_SYS_ADMIN))) {
> pr_err("ftrace only works for %s!\n",
> #ifdef HAVE_LIBCAP_SUPPORT
> - "users with the CAP_PERFMON or CAP_SYS_ADMIN capability"
> + "users with the CAP_PERFMON capability"
> #else
> "root"
> #endif
> --
> 2.24.1
>
--
- Arnaldo
On Wed, Apr 22, 2020 at 05:45:34PM +0300, Alexey Budankov wrote:
>
> Implement SELinux sysfs check to see if the system is in enforcing
> mode and print warning message with pointers to check audit logs.
>
> Signed-off-by: Alexey Budankov <[email protected]>
> ---
> tools/perf/util/cloexec.c | 4 ++--
> tools/perf/util/evsel.c | 40 +++++++++++++++++++++++----------------
> 2 files changed, 26 insertions(+), 18 deletions(-)
>
> diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
> index a12872f2856a..9c8ec816261b 100644
> --- a/tools/perf/util/cloexec.c
> +++ b/tools/perf/util/cloexec.c
> @@ -65,7 +65,7 @@ static int perf_flag_probe(void)
> return 1;
> }
>
> - WARN_ONCE(err != EINVAL && err != EBUSY,
> + WARN_ONCE(err != EINVAL && err != EBUSY && err != EACCES,
> "perf_event_open(..., PERF_FLAG_FD_CLOEXEC) failed with unexpected error %d (%s)\n",
> err, str_error_r(err, sbuf, sizeof(sbuf)));
>
> @@ -83,7 +83,7 @@ static int perf_flag_probe(void)
> if (fd >= 0)
> close(fd);
>
> - if (WARN_ONCE(fd < 0 && err != EBUSY,
> + if (WARN_ONCE(fd < 0 && err != EBUSY && err != EACCES,
> "perf_event_open(..., 0) failed unexpectedly with error %d (%s)\n",
> err, str_error_r(err, sbuf, sizeof(sbuf))))
> return -1;
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 9fa92649adb4..82492ca12405 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -2514,32 +2514,40 @@ int perf_evsel__open_strerror(struct evsel *evsel, struct target *target,
> int err, char *msg, size_t size)
> {
> char sbuf[STRERR_BUFSIZE];
> - int printed = 0;
> + int printed = 0, enforced = 0;
>
> switch (err) {
> case EPERM:
> case EACCES:
> + printed += scnprintf(msg + printed, size - printed,
> + "Access to performance monitoring and observability operations is limited.\n");
> +
> + if (!sysfs__read_int("fs/selinux/enforce", &enforced)) {
> + if (enforced) {
> + printed += scnprintf(msg + printed, size - printed,
> + "Enforced MAC policy settings (SELinux) can limit access to performance\n"
> + "monitoring and observability operations. Inspect system audit records for\n"
> + "more perf_event access control information and adjusting the policy.\n");
> + }
> + }
> +
> if (err == EPERM)
> - printed = scnprintf(msg, size,
> - "No permission to enable %s event.\n\n",
> + printed += scnprintf(msg + printed, size - printed,
> + "No permission to enable %s event.\n",
> perf_evsel__name(evsel));
This removal of a newline doesn't seem necessary to this patch.
> return scnprintf(msg + printed, size - printed,
> - "You may not have permission to collect %sstats.\n\n"
> - "Consider tweaking /proc/sys/kernel/perf_event_paranoid,\n"
> - "which controls use of the performance events system by\n"
> - "unprivileged users (without CAP_PERFMON or CAP_SYS_ADMIN).\n\n"
> - "The current value is %d:\n\n"
> + "Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open\n"
> + "access to performance monitoring and observability operations for users\n"
> + "without CAP_PERFMON capability. perf_event_paranoid setting is %d:\n"
Here we need as well to check if the kernel/libcap supports CAP_PERFMON
to provide a better error message.
> " -1: Allow use of (almost) all events by all users\n"
> " Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK\n"
> - ">= 0: Disallow ftrace function tracepoint by users without CAP_PERFMON or CAP_SYS_ADMIN\n"
> - " Disallow raw tracepoint access by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n"
> - ">= 1: Disallow CPU event access by users without CAP_PERFMON or CAP_SYS_ADMIN\n"
> - ">= 2: Disallow kernel profiling by users without CAP_PERFMON or CAP_SYS_ADMIN\n\n"
> - "To make this setting permanent, edit /etc/sysctl.conf too, e.g.:\n\n"
> - " kernel.perf_event_paranoid = -1\n" ,
> - target->system_wide ? "system-wide " : "",
> - perf_event_paranoid());
> + ">= 0: Disallow raw and ftrace function tracepoint access\n"
> + ">= 1: Disallow CPU event access\n"
> + ">= 2: Disallow kernel profiling\n"
> + "To make the adjusted perf_event_paranoid setting permanent preserve it\n"
> + "in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)",
> + perf_event_paranoid());
> case ENOENT:
> return scnprintf(msg, size, "The %s event is not supported.",
> perf_evsel__name(evsel));
> --
> 2.24.1
>
>
--
- Arnaldo
On Thu, Apr 23, 2020 at 05:49:32PM +0300, Alexey Budankov wrote:
>
> On 23.04.2020 16:20, Arnaldo Carvalho de Melo wrote:
> > On Wed, Apr 22, 2020 at 05:44:02PM +0300, Alexey Budankov wrote:
> >>
> >> Update error message to mention CAP_PERFMON only. CAP_SYS_ADMIN still
> >> works in keeping with user space backward compatibility approach.
> >
> > This will confuse users that build the latest perf to use in older
> > systems where CAP_PERFMON isn't available, probably we need to, in these
> > cases, check for the existence of CAP_PERFMON to provide a better
> > warning message, something like:
> >
> > You need CAP_SYS_ADMIN or update your kernel and libcap to one that supports
> > CAP_PERFMON.
> >
> > For systems without CAP_PERFMON, while mentioning only CAP_PERFMON for
> > systems where it is present, right?
>
> Right, but this ideal implementation requires more effort, so staying with
> two caps in the message and letting users decide which one to use looks like
> a good balance already.
Agreed.
- Arnaldo
On Wed, Apr 22, 2020 at 05:44:53PM +0300, Alexey Budankov wrote:
>
> Substitute CAP_SYS_ADMIN with CAP_PERFMON in the docs where admin
> is mentioned. CAP_SYS_ADMIN still works in keeping with user space
> backward compatibility approach.
Same issue as with the previous patch: the documentation is for the
tool, which may be used on older kernels, so we need to clarify that
CAP_PERFMON requires updating libcap and the kernel. If that isn't
possible, then CAP_SYS_ADMIN is needed.
- Arnaldo
> Signed-off-by: Alexey Budankov <[email protected]>
> ---
> tools/perf/Documentation/perf-intel-pt.txt | 2 +-
> tools/perf/design.txt | 3 +--
> 2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
> index 456fdcbf26ac..176597be0755 100644
> --- a/tools/perf/Documentation/perf-intel-pt.txt
> +++ b/tools/perf/Documentation/perf-intel-pt.txt
> @@ -687,7 +687,7 @@ The v4.2 kernel introduced support for a context switch metadata event,
> PERF_RECORD_SWITCH, which allows unprivileged users to see when their processes
> are scheduled out and in, just not by whom, which is left for the
> PERF_RECORD_SWITCH_CPU_WIDE, that is only accessible in system wide context,
> -which in turn requires CAP_SYS_ADMIN.
> +which in turn requires CAP_PERFMON.
>
> Please see the 45ac1403f564 ("perf: Add PERF_RECORD_SWITCH to indicate context
> switches") commit, that introduces these metadata events for further info.
> diff --git a/tools/perf/design.txt b/tools/perf/design.txt
> index a42fab308ff6..6fd879440c40 100644
> --- a/tools/perf/design.txt
> +++ b/tools/perf/design.txt
> @@ -258,8 +258,7 @@ gets schedule to. Per task counters can be created by any user, for
> their own tasks.
>
> A 'pid == -1' and 'cpu == x' counter is a per CPU counter that counts
> -all events on CPU-x. Per CPU counters need CAP_PERFMON or CAP_SYS_ADMIN
> -privilege.
> +all events on CPU-x. Per CPU counters need CAP_PERFMON privilege.
>
> The 'flags' parameter is currently unused and must be zero.
>
> --
> 2.24.1
>
>
--
- Arnaldo
On 23.04.2020 16:22, Arnaldo Carvalho de Melo wrote:
> On Wed, Apr 22, 2020 at 05:44:53PM +0300, Alexey Budankov wrote:
>>
>> Substitute CAP_SYS_ADMIN with CAP_PERFMON in the docs where admin
>> is mentioned. CAP_SYS_ADMIN still works in keeping with user space
>> backward compatibility approach.
>
> Same issue as with the previous patch, the documentation is for the
> tool, that may be used in older kernels, so we need to clarify that
> CAP_PERFMON requires updating libcap and the kernel, if that isn't
> possible, then CAP_SYS_ADMIN is needed.
Then it is just a matter of extending each single mention of "CAP_SYS_ADMIN"
to "CAP_PERFMON or CAP_SYS_ADMIN" where required.
Thanks,
Alexey
On 23.04.2020 16:20, Arnaldo Carvalho de Melo wrote:
> On Wed, Apr 22, 2020 at 05:44:02PM +0300, Alexey Budankov wrote:
>>
>> Update error message to mention CAP_PERFMON only. CAP_SYS_ADMIN still
>> works in keeping with user space backward compatibility approach.
>
> This will confuse users that build the latest perf to use in older
> systems where CAP_PERFMON isn't available, probably we need to, in these
> cases, check for the existence of CAP_PERFMON to provide a better
> warning message, something like:
>
> You need CAP_SYS_ADMIN or update your kernel and libcap to one that supports
> CAP_PERFMON.
>
> For systems without CAP_PERFMON, while mentioning only CAP_PERFMON for
> systems where it is present, right?
Right, but this ideal implementation requires more effort, so staying with
two caps in the message and letting users decide which one to use looks like
a good balance already.
Thanks,
Alexey
On 23.04.2020 16:27, Arnaldo Carvalho de Melo wrote:
> On Wed, Apr 22, 2020 at 05:45:34PM +0300, Alexey Budankov wrote:
>>
>> Implement SELinux sysfs check to see if the system is in enforcing
>> mode and print warning message with pointers to check audit logs.
>>
>> Signed-off-by: Alexey Budankov <[email protected]>
>> ---
>> tools/perf/util/cloexec.c | 4 ++--
>> tools/perf/util/evsel.c | 40 +++++++++++++++++++++++----------------
>> 2 files changed, 26 insertions(+), 18 deletions(-)
>>
>> diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
>> index a12872f2856a..9c8ec816261b 100644
>> --- a/tools/perf/util/cloexec.c
>> +++ b/tools/perf/util/cloexec.c
>> @@ -65,7 +65,7 @@ static int perf_flag_probe(void)
>> return 1;
>> }
>>
>> - WARN_ONCE(err != EINVAL && err != EBUSY,
>> + WARN_ONCE(err != EINVAL && err != EBUSY && err != EACCES,
>> "perf_event_open(..., PERF_FLAG_FD_CLOEXEC) failed with unexpected error %d (%s)\n",
>> err, str_error_r(err, sbuf, sizeof(sbuf)));
>>
>> @@ -83,7 +83,7 @@ static int perf_flag_probe(void)
>> if (fd >= 0)
>> close(fd);
>>
>> - if (WARN_ONCE(fd < 0 && err != EBUSY,
>> + if (WARN_ONCE(fd < 0 && err != EBUSY && err != EACCES,
>> "perf_event_open(..., 0) failed unexpectedly with error %d (%s)\n",
>> err, str_error_r(err, sbuf, sizeof(sbuf))))
>> return -1;
>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>> index 9fa92649adb4..82492ca12405 100644
>> --- a/tools/perf/util/evsel.c
>> +++ b/tools/perf/util/evsel.c
>> @@ -2514,32 +2514,40 @@ int perf_evsel__open_strerror(struct evsel *evsel, struct target *target,
>> int err, char *msg, size_t size)
>> {
>> char sbuf[STRERR_BUFSIZE];
>> - int printed = 0;
>> + int printed = 0, enforced = 0;
>>
>> switch (err) {
>> case EPERM:
>> case EACCES:
>> + printed += scnprintf(msg + printed, size - printed,
>> + "Access to performance monitoring and observability operations is limited.\n");
>> +
>> + if (!sysfs__read_int("fs/selinux/enforce", &enforced)) {
>> + if (enforced) {
>> + printed += scnprintf(msg + printed, size - printed,
>> + "Enforced MAC policy settings (SELinux) can limit access to performance\n"
>> + "monitoring and observability operations. Inspect system audit records for\n"
>> + "more perf_event access control information and adjusting the policy.\n");
>> + }
>> + }
>> +
>> if (err == EPERM)
>> - printed = scnprintf(msg, size,
>> - "No permission to enable %s event.\n\n",
>> + printed += scnprintf(msg + printed, size - printed,
>> + "No permission to enable %s event.\n",
>> perf_evsel__name(evsel));
>
> This removal of a newline doesn't seem necessary to this patch.
There will be a break in the middle of the message then, but ok.
>
>> return scnprintf(msg + printed, size - printed,
>> - "You may not have permission to collect %sstats.\n\n"
>> - "Consider tweaking /proc/sys/kernel/perf_event_paranoid,\n"
>> - "which controls use of the performance events system by\n"
>> - "unprivileged users (without CAP_PERFMON or CAP_SYS_ADMIN).\n\n"
>> - "The current value is %d:\n\n"
>> + "Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open\n"
>> + "access to performance monitoring and observability operations for users\n"
>> + "without CAP_PERFMON capability. perf_event_paranoid setting is %d:\n"
>
> Here we need as well to check if the kernel/libcap supports CAP_PERFMON
> to provide a better error message.
I will change "CAP_PERFMON" to "CAP_PERFMON or CAP_SYS_ADMIN" in the new message.
>
>> " -1: Allow use of (almost) all events by all users\n"
>> " Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK\n"
>> - ">= 0: Disallow ftrace function tracepoint by users without CAP_PERFMON or CAP_SYS_ADMIN\n"
>> - " Disallow raw tracepoint access by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n"
>> - ">= 1: Disallow CPU event access by users without CAP_PERFMON or CAP_SYS_ADMIN\n"
>> - ">= 2: Disallow kernel profiling by users without CAP_PERFMON or CAP_SYS_ADMIN\n\n"
>> - "To make this setting permanent, edit /etc/sysctl.conf too, e.g.:\n\n"
>> - " kernel.perf_event_paranoid = -1\n" ,
>> - target->system_wide ? "system-wide " : "",
>> - perf_event_paranoid());
>> + ">= 0: Disallow raw and ftrace function tracepoint access\n"
>> + ">= 1: Disallow CPU event access\n"
>> + ">= 2: Disallow kernel profiling\n"
>> + "To make the adjusted perf_event_paranoid setting permanent preserve it\n"
>> + "in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)",
>> + perf_event_paranoid());
>> case ENOENT:
>> return scnprintf(msg, size, "The %s event is not supported.",
>> perf_evsel__name(evsel));
Thanks,
Alexey
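
For reference, the perf_event_paranoid levels listed in the error message discussed in this thread can be expressed as a small lookup. This is a minimal sketch, not code from the patch: the enum and both helper names are hypothetical, and the thresholds are taken directly from the message text (>= 0 tracepoints, >= 1 CPU events, >= 2 kernel profiling) for users without the relevant capability.

```c
#include <assert.h>
#include <stdio.h>

/* Operation classes named in the perf_event_paranoid help text. */
enum perf_op {
	PERF_OP_TRACEPOINT,       /* raw and ftrace function tracepoint access */
	PERF_OP_CPU_EVENT,        /* CPU event access */
	PERF_OP_KERNEL_PROFILING  /* kernel profiling */
};

/*
 * Hypothetical helper: read the current level from a file such as
 * /proc/sys/kernel/perf_event_paranoid.
 * Returns 0 on success and stores the level in *level, -1 otherwise.
 */
static int read_paranoid_level(const char *path, int *level)
{
	FILE *fp;
	int parsed;

	assert(path != NULL && level != NULL);

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	parsed = fscanf(fp, "%d", level);
	fclose(fp);
	return parsed == 1 ? 0 : -1;
}

/* True when the given level denies the operation to unprivileged users. */
static int paranoid_denies(int level, enum perf_op op)
{
	switch (op) {
	case PERF_OP_TRACEPOINT:
		return level >= 0;
	case PERF_OP_CPU_EVENT:
		return level >= 1;
	case PERF_OP_KERNEL_PROFILING:
		return level >= 2;
	}
	return 0;
}
```

With a level of -1 nothing in this table is denied, which matches the "Allow use of (almost) all events by all users" line of the message.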