2020-02-17 08:03:02

by Alexey Budankov

Subject: [PATCH v7 00/12] Introduce CAP_PERFMON to secure system performance monitoring and observability


Currently access to perf_events, i915_perf and other performance
monitoring and observability subsystems of the kernel is open only for
a privileged process [1] with CAP_SYS_ADMIN capability enabled in the
process effective set [2].

This patch set introduces CAP_PERFMON capability designed to secure
system performance monitoring and observability operations so that
CAP_PERFMON would assist CAP_SYS_ADMIN capability in its governing role
for performance monitoring and observability subsystems of the kernel.

CAP_PERFMON intends to harden system security and integrity during
performance monitoring and observability operations by decreasing the
attack surface that is available to a CAP_SYS_ADMIN privileged process
[2]. Providing access to performance monitoring and observability
operations under the CAP_PERFMON capability alone, without the rest of
the CAP_SYS_ADMIN credentials, removes the chance to misuse those
credentials and makes the operation more secure. Thus, CAP_PERFMON
implements the principle of least privilege for performance monitoring
and observability operations (POSIX IEEE 1003.1e: 2.2.2.39 principle of
least privilege: A security design principle that states that a process
or program be granted only those privileges (e.g., capabilities)
necessary to accomplish its legitimate function, and only for the time
that such privileges are actually required).

CAP_PERFMON intends to meet the demand to secure system performance
monitoring and observability operations for adoption in security
sensitive, restricted, multiuser production environments (e.g. HPC
clusters, cloud and virtual compute environments), where root or
CAP_SYS_ADMIN credentials are not available to mass users of a system,
and securely unblock accessibility of system performance monitoring and
observability operations beyond root and CAP_SYS_ADMIN use cases.

CAP_PERFMON intends to take over CAP_SYS_ADMIN credentials related to
system performance monitoring and observability operations and balance
amount of CAP_SYS_ADMIN credentials following the recommendations in
the capabilities man page [2] for CAP_SYS_ADMIN: "Note: this capability
is overloaded; see Notes to kernel developers, below." For backward
compatibility reasons access to system performance monitoring and
observability subsystems of the kernel remains open for CAP_SYS_ADMIN
privileged processes but CAP_SYS_ADMIN capability usage for secure
system performance monitoring and observability operations is
discouraged with respect to the designed CAP_PERFMON capability.

A possible alternative to this security hardening and capability
balancing would be to use the existing CAP_SYS_PTRACE capability to
govern system performance monitoring and observability subsystems.
However, CAP_SYS_PTRACE still provides users with more credentials
than are required for secure performance monitoring and observability
operations, and CAP_PERFMON is designed to avoid this excess.

Although software running under CAP_PERFMON cannot ensure avoidance of
related hardware issues, the software can still mitigate those issues
following the official hardware issues mitigation procedure [3]. Bugs
in the software itself can be fixed following the standard kernel
development process [4] to maintain and harden security of system
performance monitoring and observability operations. Finally, the patch
set is shaped so that any issues it may induce are as simple as
possible to backtrack [5].

The patch set is for tip perf/core repository:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
sha1: fdb64822443ec9fb8c3a74b598a74790ae8d2e22

---
Changes in v7:
- updated and extended kernel.rst and perf-security.rst documentation
files with the information about CAP_PERFMON capability and its use cases
- documented the case of double audit logging of CAP_PERFMON and CAP_SYS_ADMIN
capabilities on a SELinux enabled system
Changes in v6:
- avoided noaudit checks in perfmon_capable() to explicitly advertise
CAP_PERFMON usage thru audit logs to secure system performance
monitoring and observability
Changes in v5:
- renamed CAP_SYS_PERFMON to CAP_PERFMON
- extended perfmon_capable() with noaudit checks
Changes in v4:
- converted perfmon_capable() into an inline function
- made perf_events kprobes, uprobes, hw breakpoints and namespaces data
available to CAP_SYS_PERFMON privileged processes
- applied perfmon_capable() to drivers/perf and drivers/oprofile
- extended __cmd_ftrace() with support of CAP_SYS_PERFMON
Changes in v3:
- implemented perfmon_capable() macros aggregating required capabilities
checks
Changes in v2:
- made perf_events trace points available to CAP_SYS_PERFMON privileged
processes
- made perf_event_paranoid_check() treat CAP_SYS_PERFMON equally to
CAP_SYS_ADMIN
- applied CAP_SYS_PERFMON to i915_perf, bpf_trace, powerpc and parisc
system performance monitoring and observability related subsystems

---
Alexey Budankov (12):
capabilities: introduce CAP_PERFMON to kernel and user space
perf/core: open access to the core for CAP_PERFMON privileged process
perf/core: open access to probes for CAP_PERFMON privileged process
perf tool: extend Perf tool with CAP_PERFMON capability support
drm/i915/perf: open access for CAP_PERFMON privileged process
trace/bpf_trace: open access for CAP_PERFMON privileged process
powerpc/perf: open access for CAP_PERFMON privileged process
parisc/perf: open access for CAP_PERFMON privileged process
drivers/perf: open access for CAP_PERFMON privileged process
drivers/oprofile: open access for CAP_PERFMON privileged process
doc/admin-guide: update perf-security.rst with CAP_PERFMON information
doc/admin-guide: update kernel.rst with CAP_PERFMON information

Documentation/admin-guide/perf-security.rst | 65 +++++++++++++--------
Documentation/admin-guide/sysctl/kernel.rst | 16 +++--
arch/parisc/kernel/perf.c | 2 +-
arch/powerpc/perf/imc-pmu.c | 4 +-
drivers/gpu/drm/i915/i915_perf.c | 13 ++---
drivers/oprofile/event_buffer.c | 2 +-
drivers/perf/arm_spe_pmu.c | 4 +-
include/linux/capability.h | 4 ++
include/linux/perf_event.h | 6 +-
include/uapi/linux/capability.h | 8 ++-
kernel/events/core.c | 6 +-
kernel/trace/bpf_trace.c | 2 +-
security/selinux/include/classmap.h | 4 +-
tools/perf/builtin-ftrace.c | 5 +-
tools/perf/design.txt | 3 +-
tools/perf/util/cap.h | 4 ++
tools/perf/util/evsel.c | 10 ++--
tools/perf/util/util.c | 1 +
18 files changed, 98 insertions(+), 61 deletions(-)

---
Validation (Intel Skylake, 8 cores, Fedora 29, 5.5.0-rc3+, x86_64):

libcap library [6], [7], [8] and Perf tool can be used to apply
CAP_PERFMON capability for secure system performance monitoring and
observability beyond the scope permitted by the system wide
perf_event_paranoid kernel setting [9] and below are the steps for
evaluation:

- patch, build and boot the kernel
- patch, build Perf tool e.g. to /home/user/perf
...
# git clone git://git.kernel.org/pub/scm/libs/libcap/libcap.git libcap
# pushd libcap
# patch libcap/include/uapi/linux/capability.h with [PATCH 1]
# make
# pushd progs
# ./setcap "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
# ./setcap -v "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
/home/user/perf: OK
# ./getcap /home/user/perf
/home/user/perf = cap_sys_ptrace,cap_syslog,cap_perfmon+ep
# echo 2 > /proc/sys/kernel/perf_event_paranoid
# cat /proc/sys/kernel/perf_event_paranoid
2
...
$ /home/user/perf top
... works as expected ...
$ cat /proc/`pidof perf`/status
Name: perf
Umask: 0002
State: S (sleeping)
Tgid: 2958
Ngid: 0
Pid: 2958
PPid: 9847
TracerPid: 0
Uid: 500 500 500 500
Gid: 500 500 500 500
FDSize: 256
...
CapInh: 0000000000000000
CapPrm: 0000004400080000
CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
cap_perfmon,cap_sys_ptrace,cap_syslog
CapBnd: 0000007fffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: ff
Cpus_allowed_list: 0-7
...

Usage of cap_perfmon effectively avoids unused credentials excess:

- with cap_sys_admin:
CapEff: 0000007fffffffff => 01111111 11111111 11111111 11111111 11111111

- with cap_perfmon:
CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
bit 38: perfmon, bit 34: syslog, bit 19: sys_ptrace

---
[1] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
[2] http://man7.org/linux/man-pages/man7/capabilities.7.html
[3] https://www.kernel.org/doc/html/latest/process/embargoed-hardware-issues.html
[4] https://www.kernel.org/doc/html/latest/admin-guide/security-bugs.html
[5] https://www.kernel.org/doc/html/latest/process/management-style.html#decisions
[6] http://man7.org/linux/man-pages/man8/setcap.8.html
[7] https://git.kernel.org/pub/scm/libs/libcap/libcap.git
[8] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf
[9] http://man7.org/linux/man-pages/man2/perf_event_open.2.html


2020-02-17 08:06:44

by Alexey Budankov

Subject: [PATCH v7 01/12] capabilities: introduce CAP_PERFMON to kernel and user space


Introduce CAP_PERFMON capability designed to secure system performance
monitoring and observability operations so that CAP_PERFMON would assist
CAP_SYS_ADMIN capability in its governing role for performance
monitoring and observability subsystems.

CAP_PERFMON hardens system security and integrity during performance
monitoring and observability operations by decreasing the attack surface
that is available to a CAP_SYS_ADMIN privileged process [1]. Providing
access to system performance monitoring and observability operations
under the CAP_PERFMON capability alone, without the rest of the
CAP_SYS_ADMIN credentials, removes the chance to misuse those
credentials and makes the operation more secure. Thus, CAP_PERFMON
implements the principle of least privilege for performance monitoring
and observability operations (POSIX IEEE 1003.1e: 2.2.2.39 principle of
least privilege: A security design principle that states that a process
or program be granted only those privileges (e.g., capabilities)
necessary to accomplish its legitimate function, and only for the time
that such privileges are actually required).

CAP_PERFMON meets the demand to secure system performance monitoring and
observability operations for adoption in security sensitive, restricted,
multiuser production environments (e.g. HPC clusters, cloud and virtual
compute environments), where root or CAP_SYS_ADMIN credentials are not
available to mass users of a system, and securely unblocks accessibility
of system performance monitoring and observability operations beyond
the root and CAP_SYS_ADMIN use cases.

CAP_PERFMON takes over CAP_SYS_ADMIN credentials related to system
performance monitoring and observability operations and balances amount
of CAP_SYS_ADMIN credentials following the recommendations in the
capabilities man page [1] for CAP_SYS_ADMIN: "Note: this capability is
overloaded; see Notes to kernel developers, below." For backward
compatibility reasons access to system performance monitoring and
observability subsystems of the kernel remains open for CAP_SYS_ADMIN
privileged processes but CAP_SYS_ADMIN usage for secure system
performance monitoring and observability operations is discouraged with
respect to the designed CAP_PERFMON capability.

Although the software running under CAP_PERFMON can not ensure avoidance
of related hardware issues, the software can still mitigate these issues
following the official hardware issues mitigation procedure [2].
The bugs in the software itself can be fixed following the standard
kernel development process [3] to maintain and harden security of system
performance monitoring and observability operations.

[1] http://man7.org/linux/man-pages/man7/capabilities.7.html
[2] https://www.kernel.org/doc/html/latest/process/embargoed-hardware-issues.html
[3] https://www.kernel.org/doc/html/latest/admin-guide/security-bugs.html

Signed-off-by: Alexey Budankov <[email protected]>
---
include/linux/capability.h | 4 ++++
include/uapi/linux/capability.h | 8 +++++++-
security/selinux/include/classmap.h | 4 ++--
3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/linux/capability.h b/include/linux/capability.h
index ecce0f43c73a..027d7e4a853b 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -251,6 +251,10 @@ extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct
extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
+static inline bool perfmon_capable(void)
+{
+ return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
+}

/* audit system wants to get cap info from files as well */
extern int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data *cpu_caps);
diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 272dc69fa080..e58c9636741b 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -367,8 +367,14 @@ struct vfs_ns_cap_data {

#define CAP_AUDIT_READ 37

+/*
+ * Allow system performance and observability privileged operations
+ * using perf_events, i915_perf and other kernel subsystems
+ */
+
+#define CAP_PERFMON 38

-#define CAP_LAST_CAP CAP_AUDIT_READ
+#define CAP_LAST_CAP CAP_PERFMON

#define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)

diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 986f3ac14282..d233ab3f1533 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -27,9 +27,9 @@
"audit_control", "setfcap"

#define COMMON_CAP2_PERMS "mac_override", "mac_admin", "syslog", \
- "wake_alarm", "block_suspend", "audit_read"
+ "wake_alarm", "block_suspend", "audit_read", "perfmon"

-#if CAP_LAST_CAP > CAP_AUDIT_READ
+#if CAP_LAST_CAP > CAP_PERFMON
#error New capability defined, please update COMMON_CAP2_PERMS.
#endif

--
2.20.1


2020-02-17 08:08:00

by Alexey Budankov

Subject: [PATCH v7 02/12] perf/core: open access to the core for CAP_PERFMON privileged process


Open access to monitoring of kernel code, cpus, tracepoints and
namespaces data for a CAP_PERFMON privileged process. Providing the
access under CAP_PERFMON capability singly, without the rest of
CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials
and makes operation more secure.

CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and only
for the time that such privileges are actually required)

For backward compatibility reasons access to perf_events subsystem
remains open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN
usage for secure perf_events monitoring is discouraged with respect to
CAP_PERFMON capability.

Signed-off-by: Alexey Budankov <[email protected]>
---
include/linux/perf_event.h | 6 +++---
kernel/events/core.c | 2 +-
2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 68e21e828893..5cbfc06c56b3 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1297,7 +1297,7 @@ static inline int perf_is_paranoid(void)

static inline int perf_allow_kernel(struct perf_event_attr *attr)
{
- if (sysctl_perf_event_paranoid > 1 && !capable(CAP_SYS_ADMIN))
+ if (sysctl_perf_event_paranoid > 1 && !perfmon_capable())
return -EACCES;

return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
@@ -1305,7 +1305,7 @@ static inline int perf_allow_kernel(struct perf_event_attr *attr)

static inline int perf_allow_cpu(struct perf_event_attr *attr)
{
- if (sysctl_perf_event_paranoid > 0 && !capable(CAP_SYS_ADMIN))
+ if (sysctl_perf_event_paranoid > 0 && !perfmon_capable())
return -EACCES;

return security_perf_event_open(attr, PERF_SECURITY_CPU);
@@ -1313,7 +1313,7 @@ static inline int perf_allow_cpu(struct perf_event_attr *attr)

static inline int perf_allow_tracepoint(struct perf_event_attr *attr)
{
- if (sysctl_perf_event_paranoid > -1 && !capable(CAP_SYS_ADMIN))
+ if (sysctl_perf_event_paranoid > -1 && !perfmon_capable())
return -EPERM;

return security_perf_event_open(attr, PERF_SECURITY_TRACEPOINT);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3f1f77de7247..46464367c47a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11205,7 +11205,7 @@ SYSCALL_DEFINE5(perf_event_open,
}

if (attr.namespaces) {
- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EACCES;
}

--
2.20.1

2020-02-17 08:09:05

by Alexey Budankov

Subject: [PATCH v7 04/12] perf tool: extend Perf tool with CAP_PERFMON capability support


Extend error messages to mention CAP_PERFMON capability as an option
to substitute CAP_SYS_ADMIN capability for secure system performance
monitoring and observability. Make perf_event_paranoid_check() and
__cmd_ftrace() aware of CAP_PERFMON capability.

CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and only
for the time that such privileges are actually required)

For backward compatibility reasons access to perf_events subsystem
remains open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN
usage for secure perf_events monitoring is discouraged with respect to
CAP_PERFMON capability.

Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-ftrace.c | 5 +++--
tools/perf/design.txt | 3 ++-
tools/perf/util/cap.h | 4 ++++
tools/perf/util/evsel.c | 10 +++++-----
tools/perf/util/util.c | 1 +
5 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index d5adc417a4ca..55eda54240fb 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -284,10 +284,11 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)
.events = POLLIN,
};

- if (!perf_cap__capable(CAP_SYS_ADMIN)) {
+ if (!(perf_cap__capable(CAP_PERFMON) ||
+ perf_cap__capable(CAP_SYS_ADMIN))) {
pr_err("ftrace only works for %s!\n",
#ifdef HAVE_LIBCAP_SUPPORT
- "users with the SYS_ADMIN capability"
+ "users with the CAP_PERFMON or CAP_SYS_ADMIN capability"
#else
"root"
#endif
diff --git a/tools/perf/design.txt b/tools/perf/design.txt
index 0453ba26cdbd..a42fab308ff6 100644
--- a/tools/perf/design.txt
+++ b/tools/perf/design.txt
@@ -258,7 +258,8 @@ gets schedule to. Per task counters can be created by any user, for
their own tasks.

A 'pid == -1' and 'cpu == x' counter is a per CPU counter that counts
-all events on CPU-x. Per CPU counters need CAP_SYS_ADMIN privilege.
+all events on CPU-x. Per CPU counters need CAP_PERFMON or CAP_SYS_ADMIN
+privilege.

The 'flags' parameter is currently unused and must be zero.

diff --git a/tools/perf/util/cap.h b/tools/perf/util/cap.h
index 051dc590ceee..ae52878c0b2e 100644
--- a/tools/perf/util/cap.h
+++ b/tools/perf/util/cap.h
@@ -29,4 +29,8 @@ static inline bool perf_cap__capable(int cap __maybe_unused)
#define CAP_SYSLOG 34
#endif

+#ifndef CAP_PERFMON
+#define CAP_PERFMON 38
+#endif
+
#endif /* __PERF_CAP_H */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index c8dc4450884c..da57d1d4c601 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2493,14 +2493,14 @@ int perf_evsel__open_strerror(struct evsel *evsel, struct target *target,
"You may not have permission to collect %sstats.\n\n"
"Consider tweaking /proc/sys/kernel/perf_event_paranoid,\n"
"which controls use of the performance events system by\n"
- "unprivileged users (without CAP_SYS_ADMIN).\n\n"
+ "unprivileged users (without CAP_PERFMON or CAP_SYS_ADMIN).\n\n"
"The current value is %d:\n\n"
" -1: Allow use of (almost) all events by all users\n"
" Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK\n"
- ">= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN\n"
- " Disallow raw tracepoint access by users without CAP_SYS_ADMIN\n"
- ">= 1: Disallow CPU event access by users without CAP_SYS_ADMIN\n"
- ">= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN\n\n"
+ ">= 0: Disallow ftrace function tracepoint by users without CAP_PERFMON or CAP_SYS_ADMIN\n"
+ " Disallow raw tracepoint access by users without CAP_PERFMON or CAP_SYS_ADMIN\n"
+ ">= 1: Disallow CPU event access by users without CAP_PERFMON or CAP_SYS_ADMIN\n"
+ ">= 2: Disallow kernel profiling by users without CAP_PERFMON or CAP_SYS_ADMIN\n\n"
"To make this setting permanent, edit /etc/sysctl.conf too, e.g.:\n\n"
" kernel.perf_event_paranoid = -1\n" ,
target->system_wide ? "system-wide " : "",
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 969ae560dad9..51cf3071db74 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -272,6 +272,7 @@ int perf_event_paranoid(void)
bool perf_event_paranoid_check(int max_level)
{
return perf_cap__capable(CAP_SYS_ADMIN) ||
+ perf_cap__capable(CAP_PERFMON) ||
perf_event_paranoid() <= max_level;
}

--
2.20.1


2020-02-17 08:09:34

by Alexey Budankov

Subject: [PATCH v7 05/12] drm/i915/perf: open access for CAP_PERFMON privileged process


Open access to i915_perf monitoring for CAP_PERFMON privileged process.
Providing the access under CAP_PERFMON capability singly, without the
rest of CAP_SYS_ADMIN credentials, excludes chances to misuse the
credentials and makes operation more secure.

CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and only
for the time that such privileges are actually required)

For backward compatibility reasons access to i915_perf subsystem
remains open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN
usage for secure i915_perf monitoring is discouraged with respect to
CAP_PERFMON capability.

Signed-off-by: Alexey Budankov <[email protected]>
---
drivers/gpu/drm/i915/i915_perf.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 0f556d80ba36..a3f32bd0aa47 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3378,10 +3378,10 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf,
/* Similar to perf's kernel.perf_paranoid_cpu sysctl option
* we check a dev.i915.perf_stream_paranoid sysctl option
* to determine if it's ok to access system wide OA counters
- * without CAP_SYS_ADMIN privileges.
+ * without CAP_PERFMON or CAP_SYS_ADMIN privileges.
*/
if (privileged_op &&
- i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
+ i915_perf_stream_paranoid && !perfmon_capable()) {
DRM_DEBUG("Insufficient privileges to open i915 perf stream\n");
ret = -EACCES;
goto err_ctx;
@@ -3574,9 +3574,8 @@ static int read_properties_unlocked(struct i915_perf *perf,
} else
oa_freq_hz = 0;

- if (oa_freq_hz > i915_oa_max_sample_rate &&
- !capable(CAP_SYS_ADMIN)) {
- DRM_DEBUG("OA exponent would exceed the max sampling frequency (sysctl dev.i915.oa_max_sample_rate) %uHz without root privileges\n",
+ if (oa_freq_hz > i915_oa_max_sample_rate && !perfmon_capable()) {
+ DRM_DEBUG("OA exponent would exceed the max sampling frequency (sysctl dev.i915.oa_max_sample_rate) %uHz without CAP_PERFMON or CAP_SYS_ADMIN privileges\n",
i915_oa_max_sample_rate);
return -EACCES;
}
@@ -3997,7 +3996,7 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
return -EINVAL;
}

- if (i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
+ if (i915_perf_stream_paranoid && !perfmon_capable()) {
DRM_DEBUG("Insufficient privileges to add i915 OA config\n");
return -EACCES;
}
@@ -4144,7 +4143,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
return -ENOTSUPP;
}

- if (i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
+ if (i915_perf_stream_paranoid && !perfmon_capable()) {
DRM_DEBUG("Insufficient privileges to remove i915 OA config\n");
return -EACCES;
}
--
2.20.1


2020-02-17 08:09:42

by Alexey Budankov

Subject: [PATCH v7 03/12] perf/core: open access to probes for CAP_PERFMON privileged process


Open access to monitoring via kprobes and uprobes and eBPF tracing for
CAP_PERFMON privileged process. Providing the access under CAP_PERFMON
capability singly, without the rest of CAP_SYS_ADMIN credentials,
excludes chances to misuse the credentials and makes operation more
secure.

perf kprobes and uprobes are used by ftrace and eBPF. perf probe uses
ftrace to define new kprobe events, and those events are treated as
tracepoint events. eBPF defines new probes via perf_event_open interface
and then the probes are used in eBPF tracing.

CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and only
for the time that such privileges are actually required)

For backward compatibility reasons access to perf_events subsystem
remains open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN
usage for secure perf_events monitoring is discouraged with respect to
CAP_PERFMON capability.

Signed-off-by: Alexey Budankov <[email protected]>
---
kernel/events/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 46464367c47a..4564caa2c527 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9107,7 +9107,7 @@ static int perf_kprobe_event_init(struct perf_event *event)
if (event->attr.type != perf_kprobe.type)
return -ENOENT;

- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EACCES;

/*
@@ -9167,7 +9167,7 @@ static int perf_uprobe_event_init(struct perf_event *event)
if (event->attr.type != perf_uprobe.type)
return -ENOENT;

- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EACCES;

/*
--
2.20.1


2020-02-17 08:10:26

by Alexey Budankov

Subject: [PATCH v7 06/12] trace/bpf_trace: open access for CAP_PERFMON privileged process


Open access to bpf_trace monitoring for CAP_PERFMON privileged process.
Providing the access under CAP_PERFMON capability singly, without the
rest of CAP_SYS_ADMIN credentials, excludes chances to misuse the
credentials and makes operation more secure.

CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and only
for the time that such privileges are actually required)

For backward compatibility reasons access to bpf_trace monitoring
remains open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN
usage for secure bpf_trace monitoring is discouraged with respect to
CAP_PERFMON capability.

Signed-off-by: Alexey Budankov <[email protected]>
---
kernel/trace/bpf_trace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 19e793aa441a..70e8249eebe5 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1416,7 +1416,7 @@ int perf_event_query_prog_array(struct perf_event *event, void __user *info)
u32 *ids, prog_cnt, ids_len;
int ret;

- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EPERM;
if (event->attr.type != PERF_TYPE_TRACEPOINT)
return -EINVAL;
--
2.20.1


2020-02-17 08:11:24

by Alexey Budankov

Subject: [PATCH v7 08/12] parisc/perf: open access for CAP_PERFMON privileged process


Open access to monitoring for CAP_PERFMON privileged process.
Providing the access under CAP_PERFMON capability singly, without
the rest of CAP_SYS_ADMIN credentials, excludes chances to misuse
the credentials and makes operation more secure.

CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and only
for the time that such privileges are actually required)

For backward compatibility reasons access to the monitoring remains
open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage
for secure monitoring is discouraged with respect to CAP_PERFMON
capability.

Signed-off-by: Alexey Budankov <[email protected]>
---
arch/parisc/kernel/perf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/parisc/kernel/perf.c b/arch/parisc/kernel/perf.c
index e1a8fee3ad49..d46b6709ec56 100644
--- a/arch/parisc/kernel/perf.c
+++ b/arch/parisc/kernel/perf.c
@@ -300,7 +300,7 @@ static ssize_t perf_write(struct file *file, const char __user *buf,
else
return -EFAULT;

- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EACCES;

if (count != sizeof(uint32_t))
--
2.20.1


2020-02-17 08:11:41

by Alexey Budankov

Subject: [PATCH v7 07/12] powerpc/perf: open access for CAP_PERFMON privileged process


Open access to monitoring for CAP_PERFMON privileged process.
Providing the access under CAP_PERFMON capability singly, without
the rest of CAP_SYS_ADMIN credentials, excludes chances to misuse
the credentials and makes operation more secure.

CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and
only for the time that such privileges are actually required)

For backward compatibility reasons access to the monitoring remains
open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage
for secure monitoring is discouraged with respect to CAP_PERFMON
capability.

Signed-off-by: Alexey Budankov <[email protected]>
---
arch/powerpc/perf/imc-pmu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index cb50a9e1fd2d..e837717492e4 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -898,7 +898,7 @@ static int thread_imc_event_init(struct perf_event *event)
if (event->attr.type != event->pmu->type)
return -ENOENT;

- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EACCES;

/* Sampling not supported */
@@ -1307,7 +1307,7 @@ static int trace_imc_event_init(struct perf_event *event)
if (event->attr.type != event->pmu->type)
return -ENOENT;

- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EACCES;

/* Return if this is a couting event */
--
2.20.1
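
Both hunks above swap capable(CAP_SYS_ADMIN) for perfmon_capable(). The
helper itself comes from patch 01/12, which is not shown in this part of
the thread. Going by the commit message (CAP_PERFMON alone is enough,
CAP_SYS_ADMIN is still accepted for backward compatibility), its effect
can be modeled in user space roughly as follows. This is only a sketch
under those assumptions, not the kernel implementation: model_capable(),
model_perfmon_capable() and the bitmask view of the effective set are
names invented for the illustration. The capability numbers are the ones
used in include/uapi/linux/capability.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Capability numbers as defined in include/uapi/linux/capability.h. */
#define MODEL_CAP_SYS_ADMIN 21
#define MODEL_CAP_PERFMON   38

/* Hypothetical stand-in for capable(): test one bit of an effective set. */
bool model_capable(uint64_t effective, int cap)
{
	assert(cap >= 0 && cap < 64);
	return (effective >> cap) & 1ULL;
}

/*
 * Model of the check the converted call sites rely on: CAP_PERFMON is
 * sufficient, and CAP_SYS_ADMIN is still accepted for backward
 * compatibility.
 */
bool model_perfmon_capable(uint64_t effective)
{
	return model_capable(effective, MODEL_CAP_PERFMON) ||
	       model_capable(effective, MODEL_CAP_SYS_ADMIN);
}
```

With this model, an effective set holding only bit 38 or only bit 21
passes the check, while an empty set or a set holding only an unrelated
capability does not.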


2020-02-17 08:12:09

by Alexey Budankov

Subject: [PATCH v7 09/12] drivers/perf: open access for CAP_PERFMON privileged process


Open access to monitoring for a CAP_PERFMON privileged process.
Providing the access under the CAP_PERFMON capability alone, without
the rest of the CAP_SYS_ADMIN credentials, removes the chance to misuse
those credentials and makes the operation more secure.

CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and
only for the time that such privileges are actually required).

For backward compatibility reasons, access to the monitoring remains
open for CAP_SYS_ADMIN privileged processes, but using CAP_SYS_ADMIN
for secure monitoring is discouraged in favor of the CAP_PERFMON
capability.

Signed-off-by: Alexey Budankov <[email protected]>
---
drivers/perf/arm_spe_pmu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 4e4984a55cd1..5dff81bc3324 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -274,7 +274,7 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
if (!attr->exclude_kernel)
reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);

- if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && capable(CAP_SYS_ADMIN))
+ if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);

return reg;
@@ -700,7 +700,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
return -EOPNOTSUPP;

reg = arm_spe_event_to_pmscr(event);
- if (!capable(CAP_SYS_ADMIN) &&
+ if (!perfmon_capable() &&
(reg & (BIT(SYS_PMSCR_EL1_PA_SHIFT) |
BIT(SYS_PMSCR_EL1_CX_SHIFT) |
BIT(SYS_PMSCR_EL1_PCT_SHIFT))))
--
2.20.1
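
The second hunk above gates three sampling features (PA, CX and PCT) on
the privilege check, while plain kernel-mode sampling (E1SPE) is not
gated there. A minimal user-space sketch of that decision is given below.
Everything in it is illustrative: the MODEL_* bit positions are
hypothetical and do not match the real SYS_PMSCR_EL1_*_SHIFT values,
model_spe_event_init() is an invented name, and the -EACCES return value
is an assumption because the hunk is cut before the return statement.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this illustration. */
#define MODEL_PA_BIT    (1ULL << 0)  /* physical address collection */
#define MODEL_CX_BIT    (1ULL << 1)  /* CONTEXTIDR collection */
#define MODEL_PCT_BIT   (1ULL << 2)  /* physical counter timestamps */
#define MODEL_E1SPE_BIT (1ULL << 3)  /* kernel-mode sampling, not gated */
#define MODEL_ALL_BITS  (MODEL_PA_BIT | MODEL_CX_BIT | \
			 MODEL_PCT_BIT | MODEL_E1SPE_BIT)

/*
 * Returns 0 if the requested register value is acceptable for the
 * caller, or -EACCES (assumed error code) if an unprivileged caller
 * asks for one of the gated features.
 */
int model_spe_event_init(uint64_t reg, bool privileged)
{
	const uint64_t gated = MODEL_PA_BIT | MODEL_CX_BIT | MODEL_PCT_BIT;

	/* Reject bits this model does not define. */
	assert((reg & ~MODEL_ALL_BITS) == 0);

	if (!privileged && (reg & gated))
		return -EACCES;
	return 0;
}
```

Here "privileged" stands for the result of perfmon_capable(), so after
this patch a process holding either CAP_PERFMON or CAP_SYS_ADMIN takes
the permissive path.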


2020-02-17 08:12:48

by Alexey Budankov

Subject: [PATCH v7 10/12] drivers/oprofile: open access for CAP_PERFMON privileged process


Open access to monitoring for a CAP_PERFMON privileged process.
Providing the access under the CAP_PERFMON capability alone, without
the rest of the CAP_SYS_ADMIN credentials, removes the chance to misuse
those credentials and makes the operation more secure.

CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and only
for the time that such privileges are actually required).

For backward compatibility reasons, access to the monitoring remains
open for CAP_SYS_ADMIN privileged processes, but using CAP_SYS_ADMIN
for secure monitoring is discouraged in favor of the CAP_PERFMON
capability.

Signed-off-by: Alexey Budankov <[email protected]>
---
drivers/oprofile/event_buffer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/oprofile/event_buffer.c b/drivers/oprofile/event_buffer.c
index 12ea4a4ad607..6c9edc8bbc95 100644
--- a/drivers/oprofile/event_buffer.c
+++ b/drivers/oprofile/event_buffer.c
@@ -113,7 +113,7 @@ static int event_buffer_open(struct inode *inode, struct file *file)
{
int err = -EPERM;

- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EPERM;

if (test_and_set_bit_lock(0, &buffer_opened))
--
2.20.1


2020-02-17 08:13:43

by Alexey Budankov

Subject: [PATCH v7 11/12] doc/admin-guide: update perf-security.rst with CAP_PERFMON information


Update the perf-security.rst documentation file with information
on using the CAP_PERFMON capability to secure performance
monitoring and observability operations in the system.

Signed-off-by: Alexey Budankov <[email protected]>
---
Documentation/admin-guide/perf-security.rst | 65 +++++++++++++--------
1 file changed, 40 insertions(+), 25 deletions(-)

diff --git a/Documentation/admin-guide/perf-security.rst b/Documentation/admin-guide/perf-security.rst
index 72effa7c23b9..81202d46a1ae 100644
--- a/Documentation/admin-guide/perf-security.rst
+++ b/Documentation/admin-guide/perf-security.rst
@@ -1,6 +1,6 @@
.. _perf_security:

-Perf Events and tool security
+Perf events and tool security
=============================

Overview
@@ -42,11 +42,11 @@ categories:
Data that belong to the fourth category can potentially contain
sensitive process data. If PMUs in some monitoring modes capture values
of execution context registers or data from process memory then access
-to such monitoring capabilities requires to be ordered and secured
-properly. So, perf_events/Perf performance monitoring is the subject for
-security access control management [5]_ .
+to such monitoring modes needs to be ordered and secured properly.
+So, perf_events performance monitoring and observability operations are
+subject to security access control management [5]_ .

-perf_events/Perf access control
+perf_events access control
-------------------------------

To perform security checks, the Linux implementation splits processes
@@ -66,11 +66,25 @@ into distinct units, known as capabilities [6]_ , which can be
independently enabled and disabled on per-thread basis for processes and
files of unprivileged users.

-Unprivileged processes with enabled CAP_SYS_ADMIN capability are treated
+Unprivileged processes with enabled CAP_PERFMON capability are treated
as privileged processes with respect to perf_events performance
-monitoring and bypass *scope* permissions checks in the kernel.
-
-Unprivileged processes using perf_events system call API is also subject
+monitoring and observability operations and thus bypass *scope* permissions
+checks in the kernel. CAP_PERFMON implements the principle of least
+privilege [13]_ (POSIX 1003.1e: 2.2.2.39) for performance monitoring and
+observability operations in the kernel and provides a secure approach to
+performance monitoring and observability in the system.
+
+For backward compatibility reasons access to perf_events monitoring and
+observability operations is also open for CAP_SYS_ADMIN privileged
+processes but CAP_SYS_ADMIN usage for secure monitoring and observability
+use cases is discouraged with respect to CAP_PERFMON capability.
+If system audit records [14]_ for a process using perf_events system call
+API contain denial records of acquiring both CAP_PERFMON and CAP_SYS_ADMIN
+capabilities then providing the process with CAP_PERFMON capability singly
+is recommended as the preferred secure approach to resolve double access
+denial logging related to usage of performance monitoring and observability.
+
+Unprivileged processes using perf_events system call are also subject
for PTRACE_MODE_READ_REALCREDS ptrace access mode check [7]_ , whose
outcome determines whether monitoring is permitted. So unprivileged
processes provided with CAP_SYS_PTRACE capability are effectively
@@ -82,14 +96,14 @@ performance analysis of monitored processes or a system. For example,
CAP_SYSLOG capability permits reading kernel space memory addresses from
/proc/kallsyms file.

-perf_events/Perf privileged users
+Privileged Perf user groups
---------------------------------

Mechanisms of capabilities, privileged capability-dumb files [6]_ and
-file system ACLs [10]_ can be used to create a dedicated group of
-perf_events/Perf privileged users who are permitted to execute
-performance monitoring without scope limits. The following steps can be
-taken to create such a group of privileged Perf users.
+file system ACLs [10]_ can be used to create dedicated groups of
+privileged Perf users who are permitted to execute performance monitoring
+and observability without scope limits. The following steps can be
+taken to create such groups of privileged Perf users.

1. Create perf_users group of privileged Perf users, assign perf_users
group to Perf tool executable and limit access to the executable for
@@ -108,30 +122,30 @@ taken to create such a group of privileged Perf users.
-rwxr-x--- 2 root perf_users 11M Oct 19 15:12 perf

2. Assign the required capabilities to the Perf tool executable file and
- enable members of perf_users group with performance monitoring
+ enable members of perf_users group with monitoring and observability
privileges [6]_ :

::

- # setcap "cap_sys_admin,cap_sys_ptrace,cap_syslog=ep" perf
- # setcap -v "cap_sys_admin,cap_sys_ptrace,cap_syslog=ep" perf
+ # setcap "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" perf
+ # setcap -v "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" perf
perf: OK
# getcap perf
- perf = cap_sys_ptrace,cap_sys_admin,cap_syslog+ep
+ perf = cap_sys_ptrace,cap_syslog,cap_perfmon+ep

As a result, members of perf_users group are capable of conducting
-performance monitoring by using functionality of the configured Perf
-tool executable that, when executes, passes perf_events subsystem scope
-checks.
+performance monitoring and observability by using functionality of the
+configured Perf tool executable that, when executed, passes perf_events
+subsystem scope checks.

This specific access control management is only available to superuser
or root running processes with CAP_SETPCAP, CAP_SETFCAP [6]_
capabilities.

-perf_events/Perf unprivileged users
+Unprivileged users
-----------------------------------

-perf_events/Perf *scope* and *access* control for unprivileged processes
+perf_events *scope* and *access* control for unprivileged processes
is governed by perf_event_paranoid [2]_ setting:

-1:
@@ -166,7 +180,7 @@ is governed by perf_event_paranoid [2]_ setting:
perf_event_mlock_kb locking limit is imposed but ignored for
unprivileged processes with CAP_IPC_LOCK capability.

-perf_events/Perf resource control
+Resource control
---------------------------------

Open file descriptors
@@ -227,4 +241,5 @@ Bibliography
.. [10] `<http://man7.org/linux/man-pages/man5/acl.5.html>`_
.. [11] `<http://man7.org/linux/man-pages/man2/getrlimit.2.html>`_
.. [12] `<http://man7.org/linux/man-pages/man5/limits.conf.5.html>`_
-
+.. [13] `<https://sites.google.com/site/fullycapable>`_
+.. [14] `<http://man7.org/linux/man-pages/man8/auditd.8.html>`_
--
2.20.1
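
The setcap/getcap steps documented above grant cap_perfmon,
cap_sys_ptrace and cap_syslog to the Perf executable. One way to confirm
from inside a running process that the capability is effective is to
read the CapEff line of /proc/self/status, which holds the effective set
as a hexadecimal mask, and test bit 38 (CAP_PERFMON). The sketch below
covers only the mask test; capeff_has_cap() is a hypothetical helper
name, and reading the CapEff value out of the status file is left to the
caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Capability numbers as defined in include/uapi/linux/capability.h. */
#define MODEL_CAP_SYS_PTRACE 19
#define MODEL_CAP_SYS_ADMIN  21
#define MODEL_CAP_SYSLOG     34
#define MODEL_CAP_PERFMON    38

/*
 * Hypothetical helper: capeff_hex is the value part of a CapEff line,
 * for example "0000004400080000". Returns true if bit 'cap' is set.
 */
bool capeff_has_cap(const char *capeff_hex, int cap)
{
	uint64_t mask;

	assert(capeff_hex != NULL && cap >= 0 && cap < 64);
	mask = strtoull(capeff_hex, NULL, 16);
	return (mask >> cap) & 1ULL;
}
```

For the capability set used in the documentation (bits 19, 34 and 38)
the mask is 0x4400080000, so the CAP_PERFMON test succeeds while a test
for CAP_SYS_ADMIN (bit 21) fails, which is the intended outcome of
replacing cap_sys_admin with cap_perfmon in the setcap command.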


2020-02-17 08:14:46

by Alexey Budankov

Subject: [PATCH v7 12/12] doc/admin-guide: update kernel.rst with CAP_PERFMON information


Update the kernel.rst documentation file with information
on using the CAP_PERFMON capability to secure performance
monitoring and observability operations in the system.

Signed-off-by: Alexey Budankov <[email protected]>
---
Documentation/admin-guide/sysctl/kernel.rst | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index def074807cee..b06ae9389809 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -720,20 +720,26 @@ perf_event_paranoid:
====================

Controls use of the performance events system by unprivileged
-users (without CAP_SYS_ADMIN). The default value is 2.
+users (without CAP_PERFMON). The default value is 2.
+
+For backward compatibility reasons access to system performance
+monitoring and observability remains open for CAP_SYS_ADMIN
+privileged processes but CAP_SYS_ADMIN usage for secure system
+performance monitoring and observability operations is discouraged
+with respect to CAP_PERFMON use cases.

=== ==================================================================
-1 Allow use of (almost) all events by all users

Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK

->=0 Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
+>=0 Disallow ftrace function tracepoint by users without CAP_PERFMON

- Disallow raw tracepoint access by users without CAP_SYS_ADMIN
+ Disallow raw tracepoint access by users without CAP_PERFMON

->=1 Disallow CPU event access by users without CAP_SYS_ADMIN
+>=1 Disallow CPU event access by users without CAP_PERFMON

->=2 Disallow kernel profiling by users without CAP_SYS_ADMIN
+>=2 Disallow kernel profiling by users without CAP_PERFMON
=== ==================================================================


--
2.20.1
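
The perf_event_paranoid table in the hunk above can be read as a set of
thresholds: an operation is refused to an unprivileged process once the
setting reaches the level listed for it. The following sketch encodes
that reading. It is a simplified model built from the table only, not
the kernel logic; the enum, model_perf_allowed() and the "privileged"
flag (standing for a process that passes the CAP_PERFMON or
CAP_SYS_ADMIN check) are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Operations named in the perf_event_paranoid table. */
enum model_perf_op {
	MODEL_OP_RAW_TRACEPOINT,    /* refused at levels >= 0 */
	MODEL_OP_CPU_EVENT,         /* refused at levels >= 1 */
	MODEL_OP_KERNEL_PROFILING   /* refused at levels >= 2 */
};

/*
 * Returns true if the operation is permitted for the given
 * perf_event_paranoid level. Privileged processes are not limited by
 * the setting in this model.
 */
bool model_perf_allowed(int paranoid, enum model_perf_op op, bool privileged)
{
	int threshold = 0;

	if (privileged)
		return true;

	switch (op) {
	case MODEL_OP_RAW_TRACEPOINT:
		threshold = 0;
		break;
	case MODEL_OP_CPU_EVENT:
		threshold = 1;
		break;
	case MODEL_OP_KERNEL_PROFILING:
		threshold = 2;
		break;
	default:
		assert(!"unknown operation");
		return false;
	}
	return paranoid < threshold;
}
```

Under this model the default level of 2 refuses all three operations to
unprivileged processes, level 1 still permits kernel profiling but not
CPU event access, and level -1 permits everything.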


2020-02-18 18:21:57

by Stephen Smalley

Subject: Re: [PATCH v7 01/12] capabilities: introduce CAP_PERFMON to kernel and user space

On 2/17/20 3:06 AM, Alexey Budankov wrote:
>
> Introduce CAP_PERFMON capability designed to secure system performance
> monitoring and observability operations so that CAP_PERFMON would assist
> CAP_SYS_ADMIN capability in its governing role for performance
> monitoring and observability subsystems.
>
> CAP_PERFMON hardens system security and integrity during performance
> monitoring and observability operations by decreasing attack surface
> that is available to a CAP_SYS_ADMIN privileged process [2]. Providing
> the access to system performance monitoring and observability operations
> under CAP_PERFMON capability singly, without the rest of CAP_SYS_ADMIN
> credentials, excludes chances to misuse the credentials and makes the
> operation more secure. Thus, CAP_PERFMON implements the principle of
> least privilege for performance monitoring and observability operations
> (POSIX IEEE 1003.1e: 2.2.2.39 principle of least privilege: A security
> design principle that states that a process or program be granted only
> those privileges (e.g., capabilities) necessary to accomplish its
> legitimate function, and only for the time that such privileges are
> actually required)
>
> CAP_PERFMON meets the demand to secure system performance monitoring and
> observability operations for adoption in security sensitive, restricted,
> multiuser production environments (e.g. HPC clusters, cloud and virtual
> compute environments), where root or CAP_SYS_ADMIN credentials are not
> available to mass users of a system, and securely unblocks accessibility
> of system performance monitoring and observability operations beyond
> the root and CAP_SYS_ADMIN use cases.
>
> CAP_PERFMON takes over CAP_SYS_ADMIN credentials related to system
> performance monitoring and observability operations and balances amount
> of CAP_SYS_ADMIN credentials following the recommendations in the
> capabilities man page [1] for CAP_SYS_ADMIN: "Note: this capability is
> overloaded; see Notes to kernel developers, below." For backward
> compatibility reasons access to system performance monitoring and
> observability subsystems of the kernel remains open for CAP_SYS_ADMIN
> privileged processes but CAP_SYS_ADMIN usage for secure system
> performance monitoring and observability operations is discouraged with
> respect to the designed CAP_PERFMON capability.
>
> Although the software running under CAP_PERFMON can not ensure avoidance
> of related hardware issues, the software can still mitigate these issues
> following the official hardware issues mitigation procedure [2].
> The bugs in the software itself can be fixed following the standard
> kernel development process [3] to maintain and harden security of system
> performance monitoring and observability operations.
>
> [1] http://man7.org/linux/man-pages/man7/capabilities.7.html
> [2] https://www.kernel.org/doc/html/latest/process/embargoed-hardware-issues.html
> [3] https://www.kernel.org/doc/html/latest/admin-guide/security-bugs.html
>
> Signed-off-by: Alexey Budankov <[email protected]>

Acked-by: Stephen Smalley <[email protected]>

[...]

2020-02-18 19:23:24

by James Morris

Subject: Re: [PATCH v7 01/12] capabilities: introduce CAP_PERFMON to kernel and user space

On Mon, 17 Feb 2020, Alexey Budankov wrote:

>
> Introduce CAP_PERFMON capability designed to secure system performance
> monitoring and observability operations so that CAP_PERFMON would assist
> CAP_SYS_ADMIN capability in its governing role for performance
> monitoring and observability subsystems.


Acked-by: James Morris <[email protected]>


--
James Morris
<[email protected]>

2020-02-18 19:24:03

by James Morris

Subject: Re: [PATCH v7 02/12] perf/core: open access to the core for CAP_PERFMON privileged process

On Mon, 17 Feb 2020, Alexey Budankov wrote:

>
> Open access to monitoring of kernel code, cpus, tracepoints and
> namespaces data for a CAP_PERFMON privileged process. Providing the
> access under CAP_PERFMON capability singly, without the rest of
> CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials
> and makes operation more secure.
>
> CAP_PERFMON implements the principle of least privilege for performance
> monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
> principle of least privilege: A security design principle that states
> that a process or program be granted only those privileges (e.g.,
> capabilities) necessary to accomplish its legitimate function, and only
> for the time that such privileges are actually required)
>
> For backward compatibility reasons access to perf_events subsystem
> remains open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN
> usage for secure perf_events monitoring is discouraged with respect to
> CAP_PERFMON capability.
>
> Signed-off-by: Alexey Budankov <[email protected]>


Reviewed-by: James Morris <[email protected]>


--
James Morris
<[email protected]>

2020-02-18 19:24:52

by James Morris

Subject: Re: [PATCH v7 03/12] perf/core: open access to probes for CAP_PERFMON privileged process

On Mon, 17 Feb 2020, Alexey Budankov wrote:

>
> Open access to monitoring via kprobes and uprobes and eBPF tracing for
> CAP_PERFMON privileged process. Providing the access under CAP_PERFMON
> capability singly, without the rest of CAP_SYS_ADMIN credentials,
> excludes chances to misuse the credentials and makes operation more
> secure.
>
> perf kprobes and uprobes are used by ftrace and eBPF. perf probe uses
> ftrace to define new kprobe events, and those events are treated as
> tracepoint events. eBPF defines new probes via perf_event_open interface
> and then the probes are used in eBPF tracing.
>
> CAP_PERFMON implements the principle of least privilege for performance
> monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
> principle of least privilege: A security design principle that states
> that a process or program be granted only those privileges (e.g.,
> capabilities) necessary to accomplish its legitimate function, and only
> for the time that such privileges are actually required)
>
> For backward compatibility reasons access to perf_events subsystem
> remains open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN
> usage for secure perf_events monitoring is discouraged with respect to
> CAP_PERFMON capability.
>
> Signed-off-by: Alexey Budankov <[email protected]>


Reviewed-by: James Morris <[email protected]>


--
James Morris
<[email protected]>

2020-02-18 19:25:14

by James Morris

Subject: Re: [PATCH v7 04/12] perf tool: extend Perf tool with CAP_PERFMON capability support

On Mon, 17 Feb 2020, Alexey Budankov wrote:

>
> Extend error messages to mention CAP_PERFMON capability as an option
> to substitute CAP_SYS_ADMIN capability for secure system performance
> monitoring and observability. Make perf_event_paranoid_check() and
> __cmd_ftrace() to be aware of CAP_PERFMON capability.
>
> CAP_PERFMON implements the principle of least privilege for performance
> monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
> principle of least privilege: A security design principle that states
> that a process or program be granted only those privileges (e.g.,
> capabilities) necessary to accomplish its legitimate function, and only
> for the time that such privileges are actually required)
>
> For backward compatibility reasons access to perf_events subsystem
> remains open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN
> usage for secure perf_events monitoring is discouraged with respect to
> CAP_PERFMON capability.
>
> Signed-off-by: Alexey Budankov <[email protected]>


Reviewed-by: James Morris <[email protected]>


--
James Morris
<[email protected]>

2020-02-18 19:27:17

by James Morris

Subject: Re: [PATCH v7 05/12] drm/i915/perf: open access for CAP_PERFMON privileged process

On Mon, 17 Feb 2020, Alexey Budankov wrote:

>
> Open access to i915_perf monitoring for CAP_PERFMON privileged process.
> Providing the access under CAP_PERFMON capability singly, without the
> rest of CAP_SYS_ADMIN credentials, excludes chances to misuse the
> credentials and makes operation more secure.
>
> CAP_PERFMON implements the principle of least privilege for performance
> monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
> principle of least privilege: A security design principle that states
> that a process or program be granted only those privileges (e.g.,
> capabilities) necessary to accomplish its legitimate function, and only
> for the time that such privileges are actually required)
>
> For backward compatibility reasons access to i915_events subsystem
> remains open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN
> usage for secure i915_events monitoring is discouraged with respect to
> CAP_PERFMON capability.
>
> Signed-off-by: Alexey Budankov <[email protected]>


Reviewed-by: James Morris <[email protected]>


--
James Morris
<[email protected]>

2020-02-18 19:27:48

by James Morris

Subject: Re: [PATCH v7 06/12] trace/bpf_trace: open access for CAP_PERFMON privileged process

On Mon, 17 Feb 2020, Alexey Budankov wrote:

>
> Open access to bpf_trace monitoring for CAP_PERFMON privileged process.
> Providing the access under CAP_PERFMON capability singly, without the
> rest of CAP_SYS_ADMIN credentials, excludes chances to misuse the
> credentials and makes operation more secure.
>
> CAP_PERFMON implements the principle of least privilege for performance
> monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
> principle of least privilege: A security design principle that states
> that a process or program be granted only those privileges (e.g.,
> capabilities) necessary to accomplish its legitimate function, and only
> for the time that such privileges are actually required)
>
> For backward compatibility reasons access to bpf_trace monitoring
> remains open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN
> usage for secure bpf_trace monitoring is discouraged with respect to
> CAP_PERFMON capability.
>
> Signed-off-by: Alexey Budankov <[email protected]>


Reviewed-by: James Morris <[email protected]>


--
James Morris
<[email protected]>

2020-02-18 19:30:49

by James Morris

Subject: Re: [PATCH v7 07/12] powerpc/perf: open access for CAP_PERFMON privileged process

On Mon, 17 Feb 2020, Alexey Budankov wrote:

> For backward compatibility reasons access to the monitoring remains
> open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage
> for secure monitoring is discouraged with respect to CAP_PERFMON
> capability.
>
> Signed-off-by: Alexey Budankov <[email protected]>


Reviewed-by: James Morris <[email protected]>


--
James Morris
<[email protected]>

2020-02-18 19:32:16

by James Morris

Subject: Re: [PATCH v7 08/12] parisc/perf: open access for CAP_PERFMON privileged process

On Mon, 17 Feb 2020, Alexey Budankov wrote:

> For backward compatibility reasons access to the monitoring remains
> open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage
> for secure monitoring is discouraged with respect to CAP_PERFMON
> capability.
>
> Signed-off-by: Alexey Budankov <[email protected]>


Reviewed-by: James Morris <[email protected]>

--
James Morris
<[email protected]>

2020-02-18 19:45:59

by James Morris

Subject: Re: [PATCH v7 09/12] drivers/perf: open access for CAP_PERFMON privileged process

On Mon, 17 Feb 2020, Alexey Budankov wrote:

> For backward compatibility reasons access to the monitoring remains
> open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage
> for secure monitoring is discouraged with respect to CAP_PERFMON
> capability.
>
> Signed-off-by: Alexey Budankov <[email protected]>


Reviewed-by: James Morris <[email protected]>


--
James Morris
<[email protected]>

2020-02-18 19:47:32

by James Morris

Subject: Re: [PATCH v7 10/12] drivers/oprofile: open access for CAP_PERFMON privileged process

On Mon, 17 Feb 2020, Alexey Budankov wrote:

> For backward compatibility reasons access to the monitoring remains
> open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage
> for secure monitoring is discouraged with respect to CAP_PERFMON
> capability.
>
> Signed-off-by: Alexey Budankov <[email protected]>


Reviewed-by: James Morris <[email protected]>


--
James Morris
<[email protected]>

2020-02-19 07:56:22

by Alexey Budankov

Subject: Re: [PATCH v7 01/12] capabilities: introduce CAP_PERFMON to kernel and user space


On 18.02.2020 22:21, James Morris wrote:
> On Mon, 17 Feb 2020, Alexey Budankov wrote:
>
>>
>> Introduce CAP_PERFMON capability designed to secure system performance
>> monitoring and observability operations so that CAP_PERFMON would assist
>> CAP_SYS_ADMIN capability in its governing role for performance
>> monitoring and observability subsystems.
>
>
> Acked-by: James Morris <[email protected]>

Thanks James!
I appreciate your involvement and collaboration
w.r.t. the whole patch set.

Gratefully,
Alexey

2020-02-25 09:56:27

by Alexey Budankov

Subject: Re: [PATCH v7 00/12] Introduce CAP_PERFMON to secure system performance monitoring and observability


Hi,

Is there anything else I could do in order to move the changes forward,
or is something still missing from this patch set?
Could you please share your thoughts?

Thanks,
Alexey

On 17.02.2020 11:02, Alexey Budankov wrote:
>
> Currently access to perf_events, i915_perf and other performance
> monitoring and observability subsystems of the kernel is open only for
> a privileged process [1] with CAP_SYS_ADMIN capability enabled in the
> process effective set [2].
>
> This patch set introduces CAP_PERFMON capability designed to secure
> system performance monitoring and observability operations so that
> CAP_PERFMON would assist CAP_SYS_ADMIN capability in its governing role
> for performance monitoring and observability subsystems of the kernel.
>
> CAP_PERFMON intends to harden system security and integrity during
> performance monitoring and observability operations by decreasing attack
> surface that is available to a CAP_SYS_ADMIN privileged process [2].
> Providing the access to performance monitoring and observability
> operations under CAP_PERFMON capability singly, without the rest of
> CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials
> and makes the operation more secure. Thus, CAP_PERFMON implements the
> principle of least privilege for performance monitoring and
> observability operations (POSIX IEEE 1003.1e: 2.2.2.39 principle of
> least privilege: A security design principle that states that a process
> or program be granted only those privileges (e.g., capabilities)
> necessary to accomplish its legitimate function, and only for the time
> that such privileges are actually required)
>
> CAP_PERFMON intends to meet the demand to secure system performance
> monitoring and observability operations for adoption in security
> sensitive, restricted, multiuser production environments (e.g. HPC
> clusters, cloud and virtual compute environments), where root or
> CAP_SYS_ADMIN credentials are not available to mass users of a system,
> and securely unblock accessibility of system performance monitoring and
> observability operations beyond root and CAP_SYS_ADMIN use cases.
>
> CAP_PERFMON intends to take over CAP_SYS_ADMIN credentials related to
> system performance monitoring and observability operations and balance
> amount of CAP_SYS_ADMIN credentials following the recommendations in
> the capabilities man page [2] for CAP_SYS_ADMIN: "Note: this capability
> is overloaded; see Notes to kernel developers, below." For backward
> compatibility reasons access to system performance monitoring and
> observability subsystems of the kernel remains open for CAP_SYS_ADMIN
> privileged processes but CAP_SYS_ADMIN capability usage for secure
> system performance monitoring and observability operations is
> discouraged with respect to the designed CAP_PERFMON capability.
>
> Possible alternative solution to this system security hardening,
> capabilities balancing task of making performance monitoring and
> observability operations more secure and accessible could be to use
> the existing CAP_SYS_PTRACE capability to govern system performance
> monitoring and observability subsystems. However CAP_SYS_PTRACE
> capability still provides users with more credentials than are
> required for secure performance monitoring and observability
> operations and this excess is avoided by the designed CAP_PERFMON.
>
> Although software running under CAP_PERFMON can not ensure avoidance of
> related hardware issues, the software can still mitigate those issues
> following the official hardware issues mitigation procedure [3]. The
> bugs in the software itself can be fixed following the standard kernel
> development process [4] to maintain and harden security of system
> performance monitoring and observability operations. Finally, the patch
> set is shaped in the way that simplifies backtracking procedure of
> possible induced issues [5] as much as possible.
>
> The patch set is for tip perf/core repository:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
> sha1: fdb64822443ec9fb8c3a74b598a74790ae8d2e22
>
> ---
> Changes in v7:
> - updated and extended kernel.rst and perf-security.rst documentation
> files with the information about CAP_PERFMON capability and its use cases
> - documented the case of double audit logging of CAP_PERFMON and CAP_SYS_ADMIN
> capabilities on a SELinux enabled system
> Changes in v6:
> - avoided noaudit checks in perfmon_capable() to explicitly advertise
> CAP_PERFMON usage thru audit logs to secure system performance
> monitoring and observability
> Changes in v5:
> - renamed CAP_SYS_PERFMON to CAP_PERFMON
> - extended perfmon_capable() with noaudit checks
> Changes in v4:
> - converted perfmon_capable() into an inline function
> - made perf_events kprobes, uprobes, hw breakpoints and namespaces data
> available to CAP_SYS_PERFMON privileged processes
> - applied perfmon_capable() to drivers/perf and drivers/oprofile
> - extended __cmd_ftrace() with support of CAP_SYS_PERFMON
> Changes in v3:
> - implemented perfmon_capable() macros aggregating required capabilities
> checks
> Changes in v2:
> - made perf_events trace points available to CAP_SYS_PERFMON privileged
> processes
> - made perf_event_paranoid_check() treat CAP_SYS_PERFMON equally to
> CAP_SYS_ADMIN
> - applied CAP_SYS_PERFMON to i915_perf, bpf_trace, powerpc and parisc
> system performance monitoring and observability related subsystems
>
> ---
> Alexey Budankov (12):
> capabilities: introduce CAP_PERFMON to kernel and user space
> perf/core: open access to the core for CAP_PERFMON privileged process
> perf/core: open access to probes for CAP_PERFMON privileged process
> perf tool: extend Perf tool with CAP_PERFMON capability support
> drm/i915/perf: open access for CAP_PERFMON privileged process
> trace/bpf_trace: open access for CAP_PERFMON privileged process
> powerpc/perf: open access for CAP_PERFMON privileged process
> parisc/perf: open access for CAP_PERFMON privileged process
> drivers/perf: open access for CAP_PERFMON privileged process
> drivers/oprofile: open access for CAP_PERFMON privileged process
> doc/admin-guide: update perf-security.rst with CAP_PERFMON information
> doc/admin-guide: update kernel.rst with CAP_PERFMON information
>
> Documentation/admin-guide/perf-security.rst | 65 +++++++++++++--------
> Documentation/admin-guide/sysctl/kernel.rst | 16 +++--
> arch/parisc/kernel/perf.c | 2 +-
> arch/powerpc/perf/imc-pmu.c | 4 +-
> drivers/gpu/drm/i915/i915_perf.c | 13 ++---
> drivers/oprofile/event_buffer.c | 2 +-
> drivers/perf/arm_spe_pmu.c | 4 +-
> include/linux/capability.h | 4 ++
> include/linux/perf_event.h | 6 +-
> include/uapi/linux/capability.h | 8 ++-
> kernel/events/core.c | 6 +-
> kernel/trace/bpf_trace.c | 2 +-
> security/selinux/include/classmap.h | 4 +-
> tools/perf/builtin-ftrace.c | 5 +-
> tools/perf/design.txt | 3 +-
> tools/perf/util/cap.h | 4 ++
> tools/perf/util/evsel.c | 10 ++--
> tools/perf/util/util.c | 1 +
> 18 files changed, 98 insertions(+), 61 deletions(-)
>
> ---
> Validation (Intel Skylake, 8 cores, Fedora 29, 5.5.0-rc3+, x86_64):
>
> The libcap library [6], [7], [8] and the Perf tool can be used to apply
> the CAP_PERFMON capability for secure system performance monitoring and
> observability beyond the scope permitted by the system-wide
> perf_event_paranoid kernel setting [9]. The steps for evaluation are
> below:
>
> - patch, build and boot the kernel
> - patch, build Perf tool e.g. to /home/user/perf
> ...
> # git clone git://git.kernel.org/pub/scm/libs/libcap/libcap.git libcap
> # pushd libcap
> # patch libcap/include/uapi/linux/capabilities.h with [PATCH 1]
> # make
> # pushd progs
> # ./setcap "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
> # ./setcap -v "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
> /home/user/perf: OK
> # ./getcap /home/user/perf
> /home/user/perf = cap_sys_ptrace,cap_syslog,cap_perfmon+ep
> # echo 2 > /proc/sys/kernel/perf_event_paranoid
> # cat /proc/sys/kernel/perf_event_paranoid
> 2
> ...
> $ /home/user/perf top
> ... works as expected ...
> $ cat /proc/`pidof perf`/status
> Name: perf
> Umask: 0002
> State: S (sleeping)
> Tgid: 2958
> Ngid: 0
> Pid: 2958
> PPid: 9847
> TracerPid: 0
> Uid: 500 500 500 500
> Gid: 500 500 500 500
> FDSize: 256
> ...
> CapInh: 0000000000000000
> CapPrm: 0000004400080000
> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
> cap_perfmon,cap_sys_ptrace,cap_syslog
> CapBnd: 0000007fffffffff
> CapAmb: 0000000000000000
> NoNewPrivs: 0
> Seccomp: 0
> Speculation_Store_Bypass: thread vulnerable
> Cpus_allowed: ff
> Cpus_allowed_list: 0-7
> ...
>
> Using cap_perfmon avoids granting credentials that go unused:
>
> - with cap_sys_admin:
> CapEff: 0000007fffffffff => 01111111 11111111 11111111 11111111 11111111
>
> - with cap_perfmon:
> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
> (bit 38 = perfmon, bit 34 = syslog, bit 19 = sys_ptrace)
>
> ---
> [1] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
> [2] http://man7.org/linux/man-pages/man7/capabilities.7.html
> [3] https://www.kernel.org/doc/html/latest/process/embargoed-hardware-issues.html
> [4] https://www.kernel.org/doc/html/latest/admin-guide/security-bugs.html
> [5] https://www.kernel.org/doc/html/latest/process/management-style.html#decisions
> [6] http://man7.org/linux/man-pages/man8/setcap.8.html
> [7] https://git.kernel.org/pub/scm/libs/libcap/libcap.git
> [8] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf
> [9] http://man7.org/linux/man-pages/man2/perf_event_open.2.html
>
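
The CapEff masks in the validation output quoted above can be checked by
hand. The following is a minimal sketch, not part of the patch set, that
decodes a /proc/<pid>/status capability mask into bit names and builds a
mask from bit numbers. The helper names decode_caps() and encode_caps()
and the small CAP_NAMES table are illustrative assumptions; the bit
numbers come from include/uapi/linux/capability.h, with CAP_PERFMON = 38
as introduced by this series.

```python
# Subset of capability bit numbers from include/uapi/linux/capability.h.
# CAP_PERFMON = 38 is the value added by this patch set; the table is
# deliberately partial and only covers the capabilities discussed here.
CAP_NAMES = {
    19: "cap_sys_ptrace",
    21: "cap_sys_admin",
    34: "cap_syslog",
    38: "cap_perfmon",
}


def decode_caps(mask_hex):
    """Return names of the bits set in a CapEff-style hex mask, lowest bit first."""
    mask = int(mask_hex, 16)
    names = []
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            # Bits missing from the partial table get a generic placeholder name.
            names.append(CAP_NAMES.get(bit, "cap_%d" % bit))
        bit += 1
    return names


def encode_caps(bits):
    """Build the 16-digit hex mask, as printed in /proc/<pid>/status, from bit numbers."""
    mask = 0
    for b in bits:
        mask |= 1 << b
    return "%016x" % mask


if __name__ == "__main__":
    print(decode_caps("0000004400080000"))
    print(encode_caps([38, 34, 19]))
```

Decoding 0000004400080000 this way yields only the three capabilities set
on the perf binary, while 0000007fffffffff has all 39 low bits set.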

2020-03-02 00:20:43

by Serge E. Hallyn

Subject: Re: [Intel-gfx] [PATCH v7 00/12] Introduce CAP_PERFMON to secure system performance monitoring and observability

Thanks, this looks good to me, in keeping with the CAP_SYSLOG break.

Acked-by: Serge E. Hallyn <[email protected]>

for the set.

James/Ingo/Peter, if no one has remaining objections, whose branch
should these go in through?

thanks,
-serge

On Tue, Feb 25, 2020 at 12:55:54PM +0300, Alexey Budankov wrote:
>
> Hi,
>
> Is there anything else I could do in order to move the changes forward
> or is something still missing from this patch set?
> Could you please share your thoughts?
>
> Thanks,
> Alexey
>

2020-03-02 19:52:16

by James Morris

Subject: Re: [Intel-gfx] [PATCH v7 00/12] Introduce CAP_PERFMON to secure system performance monitoring and observability

On Sun, 1 Mar 2020, Serge Hallyn wrote:

> Thanks, this looks good to me, in keeping with the CAP_SYSLOG break.
>
> Acked-by: Serge E. Hallyn <[email protected]>
>
> for the set.
>
> James/Ingo/Peter, if no one has remaining objections, whose branch
> should these go in through?
>
> thanks,

I was assuming via the perf tree, but I am happy to take them.


> -serge
>
> On Tue, Feb 25, 2020 at 12:55:54PM +0300, Alexey Budankov wrote:
> >
> > Hi,
> >
> > Is there anything else I could do in order to move the changes forward
> > or is something still missing from this patch set?
> > Could you please share your thoughts?
> >
> > Thanks,
> > Alexey
> >

--
James Morris
<[email protected]>

2020-03-26 23:31:59

by James Morris

Subject: Re: [Intel-gfx] [PATCH v7 00/12] Introduce CAP_PERFMON to secure system performance monitoring and observability

On Sun, 1 Mar 2020, Serge Hallyn wrote:

> Thanks, this looks good to me, in keeping with the CAP_SYSLOG break.
>
> Acked-by: Serge E. Hallyn <[email protected]>
>
> for the set.
>
> James/Ingo/Peter, if no one has remaining objections, whose branch
> should these go in through?
>
> thanks,
> -serge
>
> On Tue, Feb 25, 2020 at 12:55:54PM +0300, Alexey Budankov wrote:
> >
> > Hi,
> >
> > Is there anything else I could do in order to move the changes forward
> > or is something still missing from this patch set?
> > Could you please share your thoughts?

Alexey,

It seems some of the previous Acks are not included in this patchset, e.g.
https://lkml.org/lkml/2020/1/22/655

Every patch needs a Reviewed-by or Acked-by from maintainers of the code
being changed.

You have enough from the security folk, but I can't see any included from
the perf folk.


--
James Morris
<[email protected]>