2019-12-05 16:18:05

by Alexey Budankov

Subject: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups


Currently, access to perf_events functionality [1] beyond the scope permitted
by the perf_event_paranoid [1] kernel setting is allowed to a privileged process
[2] with the CAP_SYS_ADMIN capability enabled in its effective set [3].

This patch set introduces the CAP_SYS_PERFMON capability, dedicated to secure
performance monitoring activity, so that CAP_SYS_PERFMON can assist CAP_SYS_ADMIN
in its governing role for perf_events based performance monitoring of a system.

CAP_SYS_PERFMON aims to harden system security and integrity when processes and
Perf privileged users [2] monitor performance using the perf_events subsystem,
by decreasing the attack surface that is available to CAP_SYS_ADMIN privileged
processes [3].

CAP_SYS_PERFMON aims to take over the CAP_SYS_ADMIN credentials related to the
performance monitoring functionality of perf_events and to balance the amount of
CAP_SYS_ADMIN credentials in accordance with the recommendations provided in
the man page for CAP_SYS_ADMIN [3]: "Note: this capability is overloaded;
see Notes to kernel developers, below."

For backward compatibility reasons, the performance monitoring functionality of
the perf_events subsystem remains available under CAP_SYS_ADMIN, but its use for
secure performance monitoring use cases is discouraged in favor of the
introduced CAP_SYS_PERFMON capability.

In the suggested implementation, CAP_SYS_PERFMON enables Perf privileged users
[2] to conduct secure performance monitoring using perf_events in the scope
of available online CPUs when executing code in kernel and user modes.

A possible alternative solution to this capability balancing and system security
hardening task would be to use the existing CAP_SYS_PTRACE capability to govern
the performance monitoring functionality of perf_events, since process debugging
is similar to performance monitoring with respect to providing insights into
process memory and execution details. However, CAP_SYS_PTRACE still provides
users with more credentials than are required for secure performance monitoring
using the perf_events subsystem, and this excess is avoided by using the
dedicated CAP_SYS_PERFMON capability.

The libcap library utilities [4], [5] and the Perf tool can be used to apply the
CAP_SYS_PERFMON capability for secure performance monitoring beyond the scope
permitted by the system-wide perf_event_paranoid kernel setting. Below are the
steps to evaluate the change suggested by the patch set:

- patch, build and boot the kernel
- patch, build Perf tool e.g. to /home/user/perf
...
# git clone git://git.kernel.org/pub/scm/libs/libcap/libcap.git libcap
# pushd libcap
# patch libcap/include/uapi/linux/capability.h with [PATCH 1/3]
# make
# pushd progs
# ./setcap "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
# ./setcap -v "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
/home/user/perf: OK
# ./getcap /home/user/perf
/home/user/perf = cap_sys_ptrace,cap_syslog,cap_sys_perfmon+ep
# echo 2 > /proc/sys/kernel/perf_event_paranoid
# cat /proc/sys/kernel/perf_event_paranoid
2
...
$ /home/user/perf top
... works as expected ...
$ cat /proc/`pidof perf`/status
Name: perf
Umask: 0002
State: S (sleeping)
Tgid: 2958
Ngid: 0
Pid: 2958
PPid: 9847
TracerPid: 0
Uid: 500 500 500 500
Gid: 500 500 500 500
FDSize: 256
...
CapInh: 0000000000000000
CapPrm: 0000004400080000
CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
cap_sys_perfmon,cap_sys_ptrace,cap_syslog
CapBnd: 0000007fffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: ff
Cpus_allowed_list: 0-7
...

Using cap_sys_perfmon avoids granting excess, unused credentials:
- with cap_sys_admin:
CapEff: 0000007fffffffff => 01111111 11111111 11111111 11111111 11111111
- with cap_sys_perfmon:
CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
set bits: 38 (sys_perfmon), 34 (syslog), 19 (sys_ptrace)
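
For readers who want to check the CapEff arithmetic above, here is a minimal
sketch in C. It is only an illustration: the helper names (cap_mask,
cap_is_set, cap_count) are hypothetical and not part of the patch set or of
libcap, and the value 38 for CAP_SYS_PERFMON is the number proposed in
[PATCH 1/3], not one taken from an installed header.

```c
#include <stddef.h>
#include <stdint.h>

/* Capability numbers used in the example above. */
#define CAP_SYS_PTRACE  19
#define CAP_SYSLOG      34
#define CAP_SYS_PERFMON 38 /* assumption: value proposed by [PATCH 1/3] */

/* Hypothetical helper: build a mask with one bit set per capability number. */
static uint64_t cap_mask(const int *caps, size_t n)
{
	uint64_t mask = 0;

	for (size_t i = 0; i < n; i++)
		mask |= UINT64_C(1) << caps[i];
	return mask;
}

/* Hypothetical helper: return 1 if capability number cap is set in mask. */
static int cap_is_set(uint64_t mask, int cap)
{
	return (int)((mask >> cap) & 1);
}

/* Hypothetical helper: count how many capabilities a mask grants. */
static int cap_count(uint64_t mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1) /* clear the lowest set bit each pass */
		n++;
	return n;
}
```

Calling cap_mask() on {19, 34, 38} reproduces the 0000004400080000 value shown
for the cap_sys_perfmon case, and cap_count() makes the difference in granted
credentials between the two CapEff masks explicit.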

The patch set is for tip perf/core repository:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
tip sha1: ceb9e77324fa661b1001a0ae66f061b5fcb4e4e6

[1] http://man7.org/linux/man-pages/man2/perf_event_open.2.html
[2] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
[3] http://man7.org/linux/man-pages/man7/capabilities.7.html
[4] http://man7.org/linux/man-pages/man8/setcap.8.html
[5] https://git.kernel.org/pub/scm/libs/libcap/libcap.git
[6] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf

---
Alexey Budankov (3):
capabilities: introduce CAP_SYS_PERFMON to kernel and user space
perf/core: apply CAP_SYS_PERFMON to CPUs and kernel monitoring
perf tool: extend Perf tool with CAP_SYS_PERFMON support

include/linux/perf_event.h | 6 ++++--
include/uapi/linux/capability.h | 10 +++++++++-
security/selinux/include/classmap.h | 4 ++--
tools/perf/design.txt | 3 ++-
tools/perf/util/cap.h | 4 ++++
tools/perf/util/evsel.c | 10 +++++-----
tools/perf/util/util.c | 15 +++++++++++++--
7 files changed, 39 insertions(+), 13 deletions(-)

--
2.20.1


2019-12-05 16:20:44

by Alexey Budankov

Subject: [PATCH v1 1/3] capabilities: introduce CAP_SYS_PERFMON to kernel and user space


Introduce the CAP_SYS_PERFMON capability, dedicated to secure performance
monitoring activity, so that CAP_SYS_PERFMON can assist the CAP_SYS_ADMIN
capability in its governing role for perf_events based performance
monitoring of a system.

CAP_SYS_PERFMON aims to harden system security and integrity during
performance monitoring by decreasing the attack surface that is available
to CAP_SYS_ADMIN privileged processes.

CAP_SYS_PERFMON aims to take over the CAP_SYS_ADMIN credentials related to the
performance monitoring functionality of perf_events and to balance the amount of
CAP_SYS_ADMIN credentials in accordance with the recommendations provided in
the capabilities(7) man page for CAP_SYS_ADMIN: "Note: this capability is
overloaded; see Notes to kernel developers, below."

Signed-off-by: Alexey Budankov <[email protected]>
---
include/uapi/linux/capability.h | 10 +++++++++-
security/selinux/include/classmap.h | 4 ++--
2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 240fdb9a60f6..c9514f034be1 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -366,8 +366,16 @@ struct vfs_ns_cap_data {

#define CAP_AUDIT_READ 37

+/*
+ * Allow usage of perf_event_open() syscall (perf_events subsystem):
+ * http://man7.org/linux/man-pages/man2/perf_event_open.2.html
+ * beyond the scope permitted by perf_event_paranoid kernel setting.
+ * See Documentation/admin-guide/perf-security.rst for more information.
+ */
+
+#define CAP_SYS_PERFMON 38

-#define CAP_LAST_CAP CAP_AUDIT_READ
+#define CAP_LAST_CAP CAP_SYS_PERFMON

#define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)

diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 7db24855e12d..bae602c623b0 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -27,9 +27,9 @@
"audit_control", "setfcap"

#define COMMON_CAP2_PERMS "mac_override", "mac_admin", "syslog", \
- "wake_alarm", "block_suspend", "audit_read"
+ "wake_alarm", "block_suspend", "audit_read", "sys_perfmon"

-#if CAP_LAST_CAP > CAP_AUDIT_READ
+#if CAP_LAST_CAP > CAP_SYS_PERFMON
#error New capability defined, please update COMMON_CAP2_PERMS.
#endif

--
2.20.1

2019-12-05 16:22:41

by Alexey Budankov

Subject: [PATCH v1 2/3] perf/core: apply CAP_SYS_PERFMON to CPUs and kernel monitoring


Allow a CAP_SYS_PERFMON privileged process to perform secure performance
monitoring of available online CPUs when executing code in kernel and user modes.

For backward compatibility reasons, the performance monitoring functionality of
the perf_events subsystem remains available under CAP_SYS_ADMIN, but its use for
secure performance monitoring use cases is discouraged in favor of the
introduced CAP_SYS_PERFMON capability.

Signed-off-by: Alexey Budankov <[email protected]>
---
include/linux/perf_event.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 34c7c6910026..e8dc8411de9a 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1285,7 +1285,8 @@ static inline int perf_is_paranoid(void)

static inline int perf_allow_kernel(struct perf_event_attr *attr)
{
- if (sysctl_perf_event_paranoid > 1 && !capable(CAP_SYS_ADMIN))
+ if (sysctl_perf_event_paranoid > 1 &&
+ !(capable(CAP_SYS_PERFMON) || capable(CAP_SYS_ADMIN)))
return -EACCES;

return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
@@ -1293,7 +1294,8 @@ static inline int perf_allow_kernel(struct perf_event_attr *attr)

static inline int perf_allow_cpu(struct perf_event_attr *attr)
{
- if (sysctl_perf_event_paranoid > 0 && !capable(CAP_SYS_ADMIN))
+ if (sysctl_perf_event_paranoid > 0 &&
+ !(capable(CAP_SYS_PERFMON) || capable(CAP_SYS_ADMIN)))
return -EACCES;

return security_perf_event_open(attr, PERF_SECURITY_CPU);
--
2.20.1
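
The effect of the two hunks above can be modelled outside the kernel. The
following is a rough userspace sketch, not kernel code: struct creds_model and
the *_model function names are hypothetical stand-ins for capable(), and the
LSM hook security_perf_event_open() that the real helpers call afterwards is
deliberately left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the caller's effective capabilities. */
struct creds_model {
	bool cap_sys_admin;
	bool cap_sys_perfmon;
};

/* Either capability is sufficient after the patch. */
static bool perfmon_capable_model(const struct creds_model *c)
{
	return c->cap_sys_perfmon || c->cap_sys_admin;
}

/* Models the patched perf_allow_kernel(): restricted when paranoid > 1. */
static int allow_kernel_model(int paranoid, const struct creds_model *c)
{
	if (paranoid > 1 && !perfmon_capable_model(c))
		return -EACCES;
	return 0; /* the kernel helper continues with the LSM check here */
}

/* Models the patched perf_allow_cpu(): restricted when paranoid > 0. */
static int allow_cpu_model(int paranoid, const struct creds_model *c)
{
	if (paranoid > 0 && !perfmon_capable_model(c))
		return -EACCES;
	return 0; /* the kernel helper continues with the LSM check here */
}
```

With perf_event_paranoid set to 2, a process holding only cap_sys_perfmon
passes both checks in this model, while a process holding neither capability
gets -EACCES from both.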

2019-12-05 16:51:12

by Casey Schaufler

Subject: Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups

On 12/5/2019 8:15 AM, Alexey Budankov wrote:
> Currently access to perf_events functionality [1] beyond the scope permitted
> by perf_event_paranoid [1] kernel setting is allowed to a privileged process
> [2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
>
> This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
> monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
> governing role for perf_events based performance monitoring of a system.
>
> CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
> performance using perf_events subsystem by processes and Perf privileged users
> [2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
> privileged processes [3].

Are there use cases where you would need CAP_SYS_PERFMON where you
would not also need CAP_SYS_ADMIN? If you separate a new capability
from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
with the new capability it is all rather pointless.

The scope you've defined for this CAP_SYS_PERFMON is very small.
Is there a larger set of privilege checks that might be applicable
for it?
 


2019-12-05 17:08:04

by Alexey Budankov

Subject: Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups

Hello Casey,

On 05.12.2019 19:49, Casey Schaufler wrote:
> On 12/5/2019 8:15 AM, Alexey Budankov wrote:
>> Currently access to perf_events functionality [1] beyond the scope permitted
>> by perf_event_paranoid [1] kernel setting is allowed to a privileged process
>> [2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
>>
>> This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
>> monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
>> governing role for perf_events based performance monitoring of a system.
>>
>> CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
>> performance using perf_events subsystem by processes and Perf privileged users
>> [2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
>> privileged processes [3].
>
> Are there use cases where you would need CAP_SYS_PERFMON where you
> would not also need CAP_SYS_ADMIN? If you separate a new capability

Actually, there are. The Perf tool, which has record, stat and top modes, could
run with the CAP_SYS_PERFMON capability, as in the cover letter example, and
provide system-wide performance data. Currently, for that to work, the tool
needs to be granted CAP_SYS_ADMIN.

> from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
> with the new capability it is all rather pointless.
>
> The scope you've defined for this CAP_SYS_PERFMON is very small.
> Is there a larger set of privilege checks that might be applicable
> for it?

CAP_SYS_PERFMON could be applied more broadly. This patch set, though, enables
the record and stat mode use cases for system-wide performance monitoring in
kernel and user modes.

Thanks,
Alexey


2019-12-05 17:37:45

by Casey Schaufler

Subject: Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups

On 12/5/2019 9:05 AM, Alexey Budankov wrote:
> Hello Casey,
>
> On 05.12.2019 19:49, Casey Schaufler wrote:
>> On 12/5/2019 8:15 AM, Alexey Budankov wrote:
>>> Currently access to perf_events functionality [1] beyond the scope permitted
>>> by perf_event_paranoid [1] kernel setting is allowed to a privileged process
>>> [2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
>>>
>>> This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
>>> monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
>>> governing role for perf_events based performance monitoring of a system.
>>>
>>> CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
>>> performance using perf_events subsystem by processes and Perf privileged users
>>> [2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
>>> privileged processes [3].
>> Are there use cases where you would need CAP_SYS_PERFMON where you
>> would not also need CAP_SYS_ADMIN? If you separate a new capability
> Actually, there are. Perf tool that has record, stat and top modes could run with
> CAP_SYS_PERFMON capability as mentioned below and provide system wide performance
> data. Currently for that to work the tool needs to be granted with CAP_SYS_ADMIN.

The question isn't whether the tool could use the capability, it's whether
the tool would also need CAP_SYS_ADMIN to be useful. Are there existing
tools that could stop using CAP_SYS_ADMIN in favor of CAP_SYS_PERFMON?
My bet is that any tool that does performance monitoring is going to need
CAP_SYS_ADMIN for other reasons.

>
>> from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
>> with the new capability it is all rather pointless.
>>
>> The scope you've defined for this CAP_SYS_PERFMON is very small.
>> Is there a larger set of privilege checks that might be applicable
>> for it?
> CAP_SYS_PERFMON could be applied broadly, though, this patch set enables record
> and stat mode use cases for system wide performance monitoring in kernel and
> user modes.

The granularity of capabilities is something we have to watch
very carefully. Sure, CAP_SYS_ADMIN covers a lot of things, but
if we broke it up "properly" we'd have hundreds of capabilities.
If you want control that finely we have SELinux.


2019-12-05 17:59:34

by Alexey Budankov

Subject: [PATCH v1 3/3] perf tool: extend Perf tool with CAP_SYS_PERFMON support


Extend error messages to mention the CAP_SYS_PERFMON capability as an option
to substitute for CAP_SYS_ADMIN credentials where applicable.

Make perf_event_paranoid_check() aware of CAP_SYS_PERFMON when the max_level
argument is >= 0.

Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/design.txt | 3 ++-
tools/perf/util/cap.h | 4 ++++
tools/perf/util/evsel.c | 10 +++++-----
tools/perf/util/util.c | 15 +++++++++++++--
4 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/tools/perf/design.txt b/tools/perf/design.txt
index 0453ba26cdbd..71755b3e1303 100644
--- a/tools/perf/design.txt
+++ b/tools/perf/design.txt
@@ -258,7 +258,8 @@ gets schedule to. Per task counters can be created by any user, for
their own tasks.

A 'pid == -1' and 'cpu == x' counter is a per CPU counter that counts
-all events on CPU-x. Per CPU counters need CAP_SYS_ADMIN privilege.
+all events on CPU-x. Per CPU counters need CAP_SYS_PERFMON or
+CAP_SYS_ADMIN privilege.

The 'flags' parameter is currently unused and must be zero.

diff --git a/tools/perf/util/cap.h b/tools/perf/util/cap.h
index 051dc590ceee..0f79fbf6638b 100644
--- a/tools/perf/util/cap.h
+++ b/tools/perf/util/cap.h
@@ -29,4 +29,8 @@ static inline bool perf_cap__capable(int cap __maybe_unused)
#define CAP_SYSLOG 34
#endif

+#ifndef CAP_SYS_PERFMON
+#define CAP_SYS_PERFMON 38
+#endif
+
#endif /* __PERF_CAP_H */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f4dea055b080..3a46325e3702 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2468,14 +2468,14 @@ int perf_evsel__open_strerror(struct evsel *evsel, struct target *target,
"You may not have permission to collect %sstats.\n\n"
"Consider tweaking /proc/sys/kernel/perf_event_paranoid,\n"
"which controls use of the performance events system by\n"
- "unprivileged users (without CAP_SYS_ADMIN).\n\n"
+ "unprivileged users (without CAP_SYS_PERFMON or CAP_SYS_ADMIN).\n\n"
"The current value is %d:\n\n"
" -1: Allow use of (almost) all events by all users\n"
" Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK\n"
- ">= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN\n"
- " Disallow raw tracepoint access by users without CAP_SYS_ADMIN\n"
- ">= 1: Disallow CPU event access by users without CAP_SYS_ADMIN\n"
- ">= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN\n\n"
+ ">= 0: Disallow ftrace function tracepoint by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n"
+ " Disallow raw tracepoint access by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n"
+ ">= 1: Disallow CPU event access by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n"
+ ">= 2: Disallow kernel profiling by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n\n"
"To make this setting permanent, edit /etc/sysctl.conf too, e.g.:\n\n"
" kernel.perf_event_paranoid = -1\n" ,
target->system_wide ? "system-wide " : "",
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 969ae560dad9..d8334fa97c85 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -271,8 +271,19 @@ int perf_event_paranoid(void)

bool perf_event_paranoid_check(int max_level)
{
-	return perf_cap__capable(CAP_SYS_ADMIN) ||
-			perf_event_paranoid() <= max_level;
+	bool res = false;
+
+	res = perf_cap__capable(CAP_SYS_ADMIN);
+
+	if (!res) {
+		if (max_level >= 0)
+			res = perf_cap__capable(CAP_SYS_PERFMON);
+	}
+
+	if (!res)
+		res = perf_event_paranoid() <= max_level;
+
+	return res;
}

static int
--
2.20.1
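
The order of the checks in the util.c hunk above can be easier to follow as a stand-alone model. The following is only a sketch, not the perf source: model_paranoid_check() and its boolean parameters are hypothetical stand-ins for perf_cap__capable() and perf_event_paranoid(), which in the real tool query the process capabilities and the sysctl themselves.

```c
#include <stdbool.h>

/*
 * Stand-alone model of the check order in the patched
 * perf_event_paranoid_check(). All parameters are hypothetical stand-ins
 * for values the real function looks up on its own.
 */
static bool model_paranoid_check(bool has_sys_admin, bool has_sys_perfmon,
				 int paranoid, int max_level)
{
	/* CAP_SYS_ADMIN still passes every check, for compatibility. */
	if (has_sys_admin)
		return true;

	/* CAP_SYS_PERFMON is only consulted when max_level >= 0. */
	if (max_level >= 0 && has_sys_perfmon)
		return true;

	/* Otherwise fall back to the perf_event_paranoid threshold. */
	return paranoid <= max_level;
}
```

Under this model a process holding only CAP_SYS_PERFMON passes a check with max_level 0 at paranoid level 2, but not a check with max_level -1, which matches the max_level >= 0 condition in the hunk.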

2019-12-05 18:17:11

by Andi Kleen

Subject: Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups

> The question isn't whether the tool could use the capability, it's whether
> the tool would also need CAP_SYS_ADMIN to be useful. Are there existing
> tools that could stop using CAP_SYS_ADMIN in favor of CAP_SYS_PERFMON?
> My bet is that any tool that does performance monitoring is going to need
> CAP_SYS_ADMIN for other reasons.

At least perf stat won't.

-Andi

2019-12-05 18:38:42

by Alexey Budankov

Subject: Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups

On 05.12.2019 20:33, Casey Schaufler wrote:
> On 12/5/2019 9:05 AM, Alexey Budankov wrote:
>> Hello Casey,
>>
>> On 05.12.2019 19:49, Casey Schaufler wrote:
>>> On 12/5/2019 8:15 AM, Alexey Budankov wrote:
>>>> Currently access to perf_events functionality [1] beyond the scope permitted
>>>> by perf_event_paranoid [1] kernel setting is allowed to a privileged process
>>>> [2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
>>>>
>>>> This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
>>>> monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
>>>> governing role for perf_events based performance monitoring of a system.
>>>>
>>>> CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
>>>> performance using perf_events subsystem by processes and Perf privileged users
>>>> [2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
>>>> privileged processes [3].
>>> Are there use cases where you would need CAP_SYS_PERFMON where you
>>> would not also need CAP_SYS_ADMIN? If you separate a new capability
>> Actually, there are. Perf tool that has record, stat and top modes could run with
>> CAP_SYS_PERFMON capability as mentioned below and provide system wide performance
>> data. Currently for that to work the tool needs to be granted with CAP_SYS_ADMIN.
>
> The question isn't whether the tool could use the capability, it's whether
> the tool would also need CAP_SYS_ADMIN to be useful. Are there existing
> tools that could stop using CAP_SYS_ADMIN in favor of CAP_SYS_PERFMON?
> My bet is that any tool that does performance monitoring is going to need
> CAP_SYS_ADMIN for other reasons.

Yes, sorry. The tool is the perf tool (part of the kernel tree). If its binary is granted
the CAP_SYS_ADMIN capability, the tool can collect performance data in system-wide
mode for some group of unprivileged users.

This patch set allows replacing CAP_SYS_ADMIN with CAP_SYS_PERFMON, e.g. for the perf
tool: with only CAP_SYS_PERFMON granted, the tool can still provide performance data
in system-wide scope for the same group of unprivileged users.

I hope that makes it clearer. Feel free to ask more.

Thanks,
Alexey

>
>>
>>> from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
>>> with the new capability it is all rather pointless.
>>>
>>> The scope you've defined for this CAP_SYS_PERFMON is very small.
>>> Is there a larger set of privilege checks that might be applicable
>>> for it?
>> CAP_SYS_PERFMON could be applied broadly, though, this patch set enables record
>> and stat mode use cases for system wide performance monitoring in kernel and
>> user modes.
>
> The granularity of capabilities is something we have to watch
> very carefully. Sure, CAP_SYS_ADMIN covers a lot of things, but
> if we broke it up "properly" we'd have hundreds of capabilities.
> If you want control that finely we have SELinux.
>
>>
>> Thanks,
>> Alexey
>>
>>>  
>>>
>>>> CAP_SYS_PERFMON aims to take over CAP_SYS_ADMIN credentials related to
>>>> performance monitoring functionality of perf_events and balance amount of
>>>> CAP_SYS_ADMIN credentials in accordance with the recommendations provided in
>>>> the man page for CAP_SYS_ADMIN [3]: "Note: this capability is overloaded;
>>>> see Notes to kernel developers, below."
>>>>
>>>> For backward compatibility reasons performance monitoring functionality of
>>>> perf_events subsystem remains available under CAP_SYS_ADMIN but its usage for
>>>> secure performance monitoring use cases is discouraged with respect to the
>>>> introduced CAP_SYS_PERFMON capability.
>>>>
>>>> In the suggested implementation CAP_SYS_PERFMON enables Perf privileged users
>>>> [2] to conduct secure performance monitoring using perf_events in the scope
>>>> of available online CPUs when executing code in kernel and user modes.
>>>>
>>>> Possible alternative solution to this capabilities balancing, system security
>>>> hardening task could be to use the existing CAP_SYS_PTRACE capability to govern
>>>> perf_events' performance monitoring functionality, since process debugging is
>>>> similar to performance monitoring with respect to providing insights into
>>>> process memory and execution details. However CAP_SYS_PTRACE still provides
>>>> users with more credentials than are required for secure performance monitoring
>>>> using perf_events subsystem and this excess is avoided by using the dedicated
>>>> CAP_SYS_PERFMON capability.
>>>>
>>>> libcap library utilities [4], [5] and Perf tool can be used to apply
>>>> CAP_SYS_PERFMON capability for secure performance monitoring beyond the scope
>>>> permitted by system wide perf_event_paranoid kernel setting and below are the
>>>> steps to evaluate the advancement suggested by the patch set:
>>>>
>>>> - patch, build and boot the kernel
>>>> - patch, build Perf tool e.g. to /home/user/perf
>>>> ...
>>>> # git clone git://git.kernel.org/pub/scm/libs/libcap/libcap.git libcap
>>>> # pushd libcap
>>>> # patch libcap/include/uapi/linux/capabilities.h with [PATCH 1/3]
>>>> # make
>>>> # pushd progs
>>>> # ./setcap "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
>>>> # ./setcap -v "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
>>>> /home/user/perf: OK
>>>> # ./getcap /home/user/perf
>>>> /home/user/perf = cap_sys_ptrace,cap_syslog,cap_sys_perfmon+ep
>>>> # echo 2 > /proc/sys/kernel/perf_event_paranoid
>>>> # cat /proc/sys/kernel/perf_event_paranoid
>>>> 2
>>>> ...
>>>> $ /home/user/perf top
>>>> ... works as expected ...
>>>> $ cat /proc/`pidof perf`/status
>>>> Name: perf
>>>> Umask: 0002
>>>> State: S (sleeping)
>>>> Tgid: 2958
>>>> Ngid: 0
>>>> Pid: 2958
>>>> PPid: 9847
>>>> TracerPid: 0
>>>> Uid: 500 500 500 500
>>>> Gid: 500 500 500 500
>>>> FDSize: 256
>>>> ...
>>>> CapInh: 0000000000000000
>>>> CapPrm: 0000004400080000
>>>> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
>>>> cap_sys_perfmon,cap_sys_ptrace,cap_syslog
>>>> CapBnd: 0000007fffffffff
>>>> CapAmb: 0000000000000000
>>>> NoNewPrivs: 0
>>>> Seccomp: 0
>>>> Speculation_Store_Bypass: thread vulnerable
>>>> Cpus_allowed: ff
>>>> Cpus_allowed_list: 0-7
>>>> ...
>>>>
>>>> Usage of cap_sys_perfmon effectively avoids unused credentials excess:
>>>> - with cap_sys_admin:
>>>> CapEff: 0000007fffffffff => 01111111 11111111 11111111 11111111 11111111
>>>> - with cap_sys_perfmon:
>>>> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
>>>> 38 34 19
>>>> sys_perfmon syslog sys_ptrace
>>>>
>>>> The patch set is for tip perf/core repository:
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
>>>> tip sha1: ceb9e77324fa661b1001a0ae66f061b5fcb4e4e6
>>>>
>>>> [1] http://man7.org/linux/man-pages/man2/perf_event_open.2.html
>>>> [2] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
>>>> [3] http://man7.org/linux/man-pages/man7/capabilities.7.html
>>>> [4] http://man7.org/linux/man-pages/man8/setcap.8.html
>>>> [5] https://git.kernel.org/pub/scm/libs/libcap/libcap.git
>>>> [6] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf
>>>>
>>>> ---
>>>> Alexey Budankov (3):
>>>> capabilities: introduce CAP_SYS_PERFMON to kernel and user space
>>>> perf/core: apply CAP_SYS_PERFMON to CPUs and kernel monitoring
>>>> perf tool: extend Perf tool with CAP_SYS_PERFMON support
>>>>
>>>> include/linux/perf_event.h | 6 ++++--
>>>> include/uapi/linux/capability.h | 10 +++++++++-
>>>> security/selinux/include/classmap.h | 4 ++--
>>>> tools/perf/design.txt | 3 ++-
>>>> tools/perf/util/cap.h | 4 ++++
>>>> tools/perf/util/evsel.c | 10 +++++-----
>>>> tools/perf/util/util.c | 15 +++++++++++++--
>>>> 7 files changed, 39 insertions(+), 13 deletions(-)
>>>>
>>>
>
>
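
The CapEff comparison quoted above (0000004400080000 versus 0000007fffffffff) can be checked mechanically. The sketch below is a hypothetical helper, not part of libcap or perf; it lists the set bits of a capability mask. The numbers 19 (sys_ptrace) and 34 (syslog) are the ones shown in the cover letter, and 38 is the value this series proposes for CAP_SYS_PERFMON.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Collect the indices of the set bits in a capability mask, such as the
 * CapEff value from /proc/<pid>/status. Returns the number of indices
 * written to out (at most max). Hypothetical helper for illustration.
 */
static size_t cap_mask_bits(uint64_t mask, int *out, size_t max)
{
	size_t n = 0;

	for (int bit = 0; bit < 64 && n < max; bit++) {
		if (mask & (UINT64_C(1) << bit))
			out[n++] = bit;
	}
	return n;
}
```

Applied to 0x0000004400080000 this yields the three bits 19, 34 and 38, while 0x0000007fffffffff has all 39 bits from 0 to 38 set.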

2019-12-11 11:14:57

by Alexey Budankov

Subject: Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups


On 05.12.2019 20:33, Casey Schaufler wrote:
> On 12/5/2019 9:05 AM, Alexey Budankov wrote:
>> Hello Casey,
>>
>> On 05.12.2019 19:49, Casey Schaufler wrote:
>>> On 12/5/2019 8:15 AM, Alexey Budankov wrote:
>>>> Currently access to perf_events functionality [1] beyond the scope permitted
>>>> by perf_event_paranoid [1] kernel setting is allowed to a privileged process
>>>> [2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
>>>>
>>>> This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
>>>> monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
>>>> governing role for perf_events based performance monitoring of a system.
>>>>
>>>> CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
>>>> performance using perf_events subsystem by processes and Perf privileged users
>>>> [2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
>>>> privileged processes [3].
>>> Are there use cases where you would need CAP_SYS_PERFMON where you
>>> would not also need CAP_SYS_ADMIN? If you separate a new capability
>> Actually, there are. Perf tool that has record, stat and top modes could run with
>> CAP_SYS_PERFMON capability as mentioned below and provide system wide performance
>> data. Currently for that to work the tool needs to be granted with CAP_SYS_ADMIN.
>
> The question isn't whether the tool could use the capability, it's whether
> the tool would also need CAP_SYS_ADMIN to be useful. Are there existing
> tools that could stop using CAP_SYS_ADMIN in favor of CAP_SYS_PERFMON?
> My bet is that any tool that does performance monitoring is going to need
> CAP_SYS_ADMIN for other reasons.
>
>>
>>> from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
>>> with the new capability it is all rather pointless.
>>>
>>> The scope you've defined for this CAP_SYS_PERFMON is very small.
>>> Is there a larger set of privilege checks that might be applicable
>>> for it?
>> CAP_SYS_PERFMON could be applied broadly, though, this patch set enables record
>> and stat mode use cases for system wide performance monitoring in kernel and
>> user modes.
>
> The granularity of capabilities is something we have to watch
> very carefully. Sure, CAP_SYS_ADMIN covers a lot of things, but
> if we broke it up "properly" we'd have hundreds of capabilities.

Fully agree, and this broader discussion is really helpful for coming up with
a properly balanced solution.

> If you want control that finely we have SELinux.

Undoubtedly, SELinux is a powerful, mature layer of functionality that
could provide benefits not only for the perf_events subsystem. However, perf_events
is built around capabilities to provide access control to its functionality,
so perf_events would require considerable rework before it could be controlled
through SELinux. Adoption could then also require changes to the installed
infrastructure just for the sake of adopting an alternative access control mechanism.

On the other hand, there are already users and use cases built around
CAP_SYS_ADMIN based access control, and the Perf tool, which is
the native Linux kernel observability and performance profiling tool, provides
means to operate in restricted multiuser environments (HPC clusters, cloud and
virtual environments) for groups of unprivileged users under admin control [1].

In these circumstances CAP_SYS_PERFMON looks like a smart, balanced advancement that
trades off perf_events subsystem extensions, the required level of control
and configurability of perf_events, and the adoption effort for existing users, and
it brings the security hardening benefit of a decreased attack surface for the
existing users and use cases.

Well, yes, it is really good that Linux nowadays provides a handful of
security mechanisms, but proper balance is what usually makes valuable
features happen, keeps their users happy and moves things forward.

Gratefully,
Alexey

[1] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html

2019-12-11 15:25:29

by Peter Zijlstra

Subject: Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups

On Wed, Dec 11, 2019 at 01:52:15PM +0300, Alexey Budankov wrote:
> Undoubtedly, SELinux is the powerful, mature, whole level of functionality that
> could provide benefits not only for perf_events subsystem. However perf_events
> is built around capabilities to provide access control to its functionality,
> thus perf_events would require considerable rework prior it could be controlled
> thru SELinux.

You mean this:

da97e18458fb ("perf_event: Add support for LSM and SELinux checks")

?

> Then the adoption could also require changes to the installed
> infrastructure just for the sake of adopting alternative access control mechanism.

This is still very much true.

2019-12-11 17:01:06

by Alexey Budankov

Subject: Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups


On 11.12.2019 18:24, Peter Zijlstra wrote:
> On Wed, Dec 11, 2019 at 01:52:15PM +0300, Alexey Budankov wrote:
>> Undoubtedly, SELinux is the powerful, mature, whole level of functionality that
>> could provide benefits not only for perf_events subsystem. However perf_events
>> is built around capabilities to provide access control to its functionality,
>> thus perf_events would require considerable rework prior it could be controlled
>> thru SELinux.
>
> You mean this:
>
> da97e18458fb ("perf_event: Add support for LSM and SELinux checks")
>
> ?

Yes, I do.

This feature adds a great deal to MAC access control [1], [2] for perf_events,
in addition to the already existing DAC [3]. However, there is still the whole
other part of the MAC story on the user space side.

Fortunately, MAC and DAC access control mechanisms are designed so that they are
naturally layered and coexist in the system, so I don't see any contradiction
in advancing either mechanism to meet the demands of diverse use cases.

There is not much rationale for favoring one mechanism over the other,
because together they make up complete security access control
and configurability for the diverse use cases of perf_events.

>
>> Then the adoption could also require changes to the installed
>> infrastructure just for the sake of adopting alternative access control mechanism.
>
> This is still very much true.

It is enough to imagine some HPC cluster or cloud lab with
several hundred nodes to be upgraded.

Thanks,
Alexey

[1] https://en.wikipedia.org/wiki/Security-Enhanced_Linux
[2] https://en.wikipedia.org/wiki/Mandatory_access_control
[3] https://en.wikipedia.org/wiki/Discretionary_access_control

2019-12-11 18:10:26

by Casey Schaufler

Subject: Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups

On 12/11/2019 2:52 AM, Alexey Budankov wrote:
> On 05.12.2019 20:33, Casey Schaufler wrote:
>> On 12/5/2019 9:05 AM, Alexey Budankov wrote:
>>> Hello Casey,
>>>
>>> On 05.12.2019 19:49, Casey Schaufler wrote:
>>>> On 12/5/2019 8:15 AM, Alexey Budankov wrote:
>>>>> Currently access to perf_events functionality [1] beyond the scope permitted
>>>>> by perf_event_paranoid [1] kernel setting is allowed to a privileged process
>>>>> [2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
>>>>>
>>>>> This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
>>>>> monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
>>>>> governing role for perf_events based performance monitoring of a system.
>>>>>
>>>>> CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
>>>>> performance using perf_events subsystem by processes and Perf privileged users
>>>>> [2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
>>>>> privileged processes [3].
>>>> Are there use cases where you would need CAP_SYS_PERFMON where you
>>>> would not also need CAP_SYS_ADMIN? If you separate a new capability
>>> Actually, there are. Perf tool that has record, stat and top modes could run with
>>> CAP_SYS_PERFMON capability as mentioned below and provide system wide performance
>>> data. Currently for that to work the tool needs to be granted with CAP_SYS_ADMIN.
>> The question isn't whether the tool could use the capability, it's whether
>> the tool would also need CAP_SYS_ADMIN to be useful. Are there existing
>> tools that could stop using CAP_SYS_ADMIN in favor of CAP_SYS_PERFMON?
>> My bet is that any tool that does performance monitoring is going to need
>> CAP_SYS_ADMIN for other reasons.
>>
>>>> from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
>>>> with the new capability it is all rather pointless.
>>>>
>>>> The scope you've defined for this CAP_SYS_PERFMON is very small.
>>>> Is there a larger set of privilege checks that might be applicable
>>>> for it?
>>> CAP_SYS_PERFMON could be applied broadly, though, this patch set enables record
>>> and stat mode use cases for system wide performance monitoring in kernel and
>>> user modes.
>> The granularity of capabilities is something we have to watch
>> very carefully. Sure, CAP_SYS_ADMIN covers a lot of things, but
>> if we broke it up "properly" we'd have hundreds of capabilities.
> Fully agree and this broader discussion is really helpful to come up with
> properly balanced solution.
>
>> If you want control that finely we have SELinux.
> Undoubtedly, SELinux is the powerful, mature, whole level of functionality that
> could provide benefits not only for perf_events subsystem. However perf_events
> is built around capabilities to provide access control to its functionality,
> thus perf_events would require considerable rework prior it could be controlled
> thru SELinux. Then the adoption could also require changes to the installed
> infrastructure just for the sake of adopting alternative access control mechanism.
>
> On the other hand there are currently already existing users and use cases that
> are built around the CAP_SYS_ADMIN based access control, and Perf tool, which is
> the native Linux kernel observability and performance profiling tool, provides
> means to operate in restricted multiuser environments(HPC clusters, cloud and
> virtual environments) for groups of unprivileged users under admins control [1].
>
> In this circumstances CAP_SYS_PERFMON looks like smart balanced advancement that
> trade-offs between perf_events subsystem extensions, required level of control
> and configurability of perf_events, existing users adoption effort, and it brings
> security hardening benefits of decreasing attack surface for the existing users
> and use cases.

I'm not 100% opposed to CAP_SYS_PERFMON. I am 100% opposed to new capabilities
that have a single use. Surely there are other CAP_SYS_ADMIN users that [cs]ould
be converted to CAP_SYS_PERFMON as well. If there is a class of system performance
privileged operations, say a dozen or so, you may have a viable argument.


>
> Well, yes, it is really good that Linux nowadays provides a handful of various
> security assuring mechanisms but proper balance is what usually makes valuable
> features happen and its users happy and moves forward.
>
> Gratefully,
> Alexey
>
> [1] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html

2019-12-11 19:06:09

by Stephane Eranian

Subject: Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups

On Thu, Dec 5, 2019 at 9:35 AM Casey Schaufler <[email protected]> wrote:
>
> On 12/5/2019 9:05 AM, Alexey Budankov wrote:
> > Hello Casey,
> >
> > On 05.12.2019 19:49, Casey Schaufler wrote:
> >> On 12/5/2019 8:15 AM, Alexey Budankov wrote:
> >>> Currently access to perf_events functionality [1] beyond the scope permitted
> >>> by perf_event_paranoid [1] kernel setting is allowed to a privileged process
> >>> [2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
> >>>
> >>> This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
> >>> monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
> >>> governing role for perf_events based performance monitoring of a system.
> >>>
> >>> CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
> >>> performance using perf_events subsystem by processes and Perf privileged users
> >>> [2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
> >>> privileged processes [3].
> >> Are there use cases where you would need CAP_SYS_PERFMON where you
> >> would not also need CAP_SYS_ADMIN? If you separate a new capability
> > Actually, there are. Perf tool that has record, stat and top modes could run with
> > CAP_SYS_PERFMON capability as mentioned below and provide system wide performance
> > data. Currently for that to work the tool needs to be granted with CAP_SYS_ADMIN.
>
> The question isn't whether the tool could use the capability, it's whether
> the tool would also need CAP_SYS_ADMIN to be useful. Are there existing
> tools that could stop using CAP_SYS_ADMIN in favor of CAP_SYS_PERFMON?

The answer is yes. I have recently been alerted to a problem with paranoid=2 and the
popular rr debugger (https://rr-project.org/). This debugger uses several perf_events
features, including profiling of PMU events and tracepoints (context-switches). With
paranoid=2, it does not work anymore. We would need a privilege between regular
user and admin to make it work again. Note that context switches tracepoint is only
applied to self (not system-wide).
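
For readers who want to see what self-monitoring of context switches looks like at the syscall level, here is a minimal sketch. It is an assumption-laden illustration, not rr's code: it uses the software context-switch event (PERF_COUNT_SW_CONTEXT_SWITCHES) rather than the tracepoint, and the helper names init_ctxsw_attr() and open_self_ctxsw_counter() are made up for this example.

```c
#include <errno.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fill attr for a software context-switch counter, initially disabled. */
static void init_ctxsw_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_CONTEXT_SWITCHES;
	attr->disabled = 1;
	/* exclude_kernel is left at 0, so kernel-side events are requested. */
}

/*
 * Try to open the counter on the calling thread (pid 0, any CPU, no group).
 * Returns a file descriptor, or a negative errno value on failure.
 */
static int open_self_ctxsw_counter(void)
{
	struct perf_event_attr attr;
	long fd;

	init_ctxsw_attr(&attr);
	fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
	return fd < 0 ? -errno : (int)fd;
}
```

Whether the open succeeds depends on perf_event_paranoid, the capabilities of the caller and the exclude_* bits. Because exclude_kernel is not set here, an unprivileged caller would typically expect a refusal such as -EACCES at level 2, which is the kind of failure a capability between regular user and admin would be meant to lift.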


> My bet is that any tool that does performance monitoring is going to need
> CAP_SYS_ADMIN for other reasons.
>
> >
> >> from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
> >> with the new capability it is all rather pointless.
> >>
> >> The scope you've defined for this CAP_SYS_PERFMON is very small.
> >> Is there a larger set of privilege checks that might be applicable
> >> for it?
> > CAP_SYS_PERFMON could be applied broadly, though, this patch set enables record
> > and stat mode use cases for system wide performance monitoring in kernel and
> > user modes.
>
> The granularity of capabilities is something we have to watch
> very carefully. Sure, CAP_SYS_ADMIN covers a lot of things, but
> if we broke it up "properly" we'd have hundreds of capabilities.
> If you want control that finely we have SELinux.
>
> >
> > Thanks,
> > Alexey
> >
> >>
> >>
> >>> CAP_SYS_PERFMON aims to take over CAP_SYS_ADMIN credentials related to
> >>> performance monitoring functionality of perf_events and balance amount of
> >>> CAP_SYS_ADMIN credentials in accordance with the recommendations provided in
> >>> the man page for CAP_SYS_ADMIN [3]: "Note: this capability is overloaded;
> >>> see Notes to kernel developers, below."
> >>>
> >>> For backward compatibility reasons performance monitoring functionality of
> >>> perf_events subsystem remains available under CAP_SYS_ADMIN but its usage for
> >>> secure performance monitoring use cases is discouraged with respect to the
> >>> introduced CAP_SYS_PERFMON capability.
> >>>
> >>> In the suggested implementation CAP_SYS_PERFMON enables Perf privileged users
> >>> [2] to conduct secure performance monitoring using perf_events in the scope
> >>> of available online CPUs when executing code in kernel and user modes.
> >>>
> >>> Possible alternative solution to this capabilities balancing, system security
> >>> hardening task could be to use the existing CAP_SYS_PTRACE capability to govern
> >>> perf_events' performance monitoring functionality, since process debugging is
> >>> similar to performance monitoring with respect to providing insights into
> >>> process memory and execution details. However CAP_SYS_PTRACE still provides
> >>> users with more credentials than are required for secure performance monitoring
> >>> using perf_events subsystem and this excess is avoided by using the dedicated
> >>> CAP_SYS_PERFMON capability.
> >>>
> >>> libcap library utilities [4], [5] and Perf tool can be used to apply
> >>> CAP_SYS_PERFMON capability for secure performance monitoring beyond the scope
> >>> permitted by system wide perf_event_paranoid kernel setting and below are the
> >>> steps to evaluate the advancement suggested by the patch set:
> >>>
> >>> - patch, build and boot the kernel
> >>> - patch, build Perf tool e.g. to /home/user/perf
> >>> ...
> >>> # git clone git://git.kernel.org/pub/scm/libs/libcap/libcap.git libcap
> >>> # pushd libcap
> >>> # patch libcap/include/uapi/linux/capabilities.h with [PATCH 1/3]
> >>> # make
> >>> # pushd progs
> >>> # ./setcap "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
> >>> # ./setcap -v "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
> >>> /home/user/perf: OK
> >>> # ./getcap /home/user/perf
> >>> /home/user/perf = cap_sys_ptrace,cap_syslog,cap_sys_perfmon+ep
> >>> # echo 2 > /proc/sys/kernel/perf_event_paranoid
> >>> # cat /proc/sys/kernel/perf_event_paranoid
> >>> 2
> >>> ...
> >>> $ /home/user/perf top
> >>> ... works as expected ...
> >>> $ cat /proc/`pidof perf`/status
> >>> Name: perf
> >>> Umask: 0002
> >>> State: S (sleeping)
> >>> Tgid: 2958
> >>> Ngid: 0
> >>> Pid: 2958
> >>> PPid: 9847
> >>> TracerPid: 0
> >>> Uid: 500 500 500 500
> >>> Gid: 500 500 500 500
> >>> FDSize: 256
> >>> ...
> >>> CapInh: 0000000000000000
> >>> CapPrm: 0000004400080000
> >>> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
> >>> cap_sys_perfmon,cap_sys_ptrace,cap_syslog
> >>> CapBnd: 0000007fffffffff
> >>> CapAmb: 0000000000000000
> >>> NoNewPrivs: 0
> >>> Seccomp: 0
> >>> Speculation_Store_Bypass: thread vulnerable
> >>> Cpus_allowed: ff
> >>> Cpus_allowed_list: 0-7
> >>> ...
> >>>
> >>> Usage of cap_sys_perfmon effectively avoids unused credentials excess:
> >>> - with cap_sys_admin:
> >>> CapEff: 0000007fffffffff => 01111111 11111111 11111111 11111111 11111111
> >>> - with cap_sys_perfmon:
> >>> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
> >>> 38 34 19
> >>> sys_perfmon syslog sys_ptrace
> >>>
> >>> The patch set is for tip perf/core repository:
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
> >>> tip sha1: ceb9e77324fa661b1001a0ae66f061b5fcb4e4e6
> >>>
> >>> [1] http://man7.org/linux/man-pages/man2/perf_event_open.2.html
> >>> [2] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
> >>> [3] http://man7.org/linux/man-pages/man7/capabilities.7.html
> >>> [4] http://man7.org/linux/man-pages/man8/setcap.8.html
> >>> [5] https://git.kernel.org/pub/scm/libs/libcap/libcap.git
> >>> [6] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf
> >>>
> >>> ---
> >>> Alexey Budankov (3):
> >>> capabilities: introduce CAP_SYS_PERFMON to kernel and user space
> >>> perf/core: apply CAP_SYS_PERFMON to CPUs and kernel monitoring
> >>> perf tool: extend Perf tool with CAP_SYS_PERFMON support
> >>>
> >>> include/linux/perf_event.h | 6 ++++--
> >>> include/uapi/linux/capability.h | 10 +++++++++-
> >>> security/selinux/include/classmap.h | 4 ++--
> >>> tools/perf/design.txt | 3 ++-
> >>> tools/perf/util/cap.h | 4 ++++
> >>> tools/perf/util/evsel.c | 10 +++++-----
> >>> tools/perf/util/util.c | 15 +++++++++++++--
> >>> 7 files changed, 39 insertions(+), 13 deletions(-)
> >>>
> >>
>

2019-12-11 20:37:32

by Andi Kleen

Subject: Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups

> > In this circumstances CAP_SYS_PERFMON looks like smart balanced advancement that
> > trade-offs between perf_events subsystem extensions, required level of control
> > and configurability of perf_events, existing users adoption effort, and it brings
> > security hardening benefits of decreasing attack surface for the existing users
> > and use cases.
>
> I'm not 100% opposed to CAP_SYS_PERFMON. I am 100% opposed to new capabilities
> that have a single use. Surely there are other CAP_SYS_ADMIN users that [cs]ould
> be converted to CAP_SYS_PERFMON as well. If there is a class of system performance
> privileged operations, say a dozen or so, you may have a viable argument.

perf events is not a single use. It has a bazillion sub-functionalities,
including hardware tracing, software tracing, PMU counters, software counters,
uncore counters, breakpoints and various other stuff in its PMU drivers.

See it more as a whole, quite heterogeneous driver subsystem.

I guess CAP_SYS_PERFMON is not a good name because perf is much more
than just Perfmon. Perhaps call it CAP_SYS_PERF_EVENTS.

-Andi

2019-12-11 21:26:52

by Casey Schaufler

Subject: Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups

On 12/11/2019 12:36 PM, Andi Kleen wrote:
>>> In this circumstances CAP_SYS_PERFMON looks like smart balanced advancement that
>>> trade-offs between perf_events subsystem extensions, required level of control
>>> and configurability of perf_events, existing users adoption effort, and it brings
>>> security hardening benefits of decreasing attack surface for the existing users
>>> and use cases.
>> I'm not 100% opposed to CAP_SYS_PERFMON. I am 100% opposed to new capabilities
>> that have a single use. Surely there are other CAP_SYS_ADMIN users that [cs]ould
>> be converted to CAP_SYS_PERFMON as well. If there is a class of system performance
>> privileged operations, say a dozen or so, you may have a viable argument.
> perf events is not a single use.

If it is only being called in two places, it is single use.

> It has a bazillion of sub functionalities,
> including hardware tracing, software tracing, pmu counters, software counters,
> uncore counters, break points and various other stuff in its PMU drivers.
>
> See it more as a whole quite heterogenous driver subsystem.
>
> I guess CAP_SYS_PERFMON is not a good name because perf is much more
> than just Perfmon. Perhaps call it CAP_SYS_PERF_EVENTS
>
> -Andi

2019-12-12 14:25:43

by Stephen Smalley

Subject: Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups

On 12/11/19 3:36 PM, Andi Kleen wrote:
>>> In this circumstances CAP_SYS_PERFMON looks like smart balanced advancement that
>>> trade-offs between perf_events subsystem extensions, required level of control
>>> and configurability of perf_events, existing users adoption effort, and it brings
>>> security hardening benefits of decreasing attack surface for the existing users
>>> and use cases.
>>
>> I'm not 100% opposed to CAP_SYS_PERFMON. I am 100% opposed to new capabilities
>> that have a single use. Surely there are other CAP_SYS_ADMIN users that [cs]ould
>> be converted to CAP_SYS_PERFMON as well. If there is a class of system performance
>> privileged operations, say a dozen or so, you may have a viable argument.
>
> perf events is not a single use. It has a bazillion sub-functionalities,
> including hardware tracing, software tracing, PMU counters, software counters,
> uncore counters, breakpoints and various other stuff in its PMU drivers.
>
> See it more as a whole, quite heterogeneous driver subsystem.
>
> I guess CAP_SYS_PERFMON is not a good name because perf is much more
> than just Perfmon. Perhaps call it CAP_SYS_PERF_EVENTS

That seems misleading since it isn't being checked for all perf_events
operations IIUC (CAP_SYS_ADMIN is still required for some?) and it is
even more specialized than CAP_SYS_PERFMON, making it less likely that
we could ever use this capability as a check for other kernel
performance monitoring facilities beyond perf_events.

I'm not as opposed to fine-grained capabilities as Casey is but I do
recognize that there are a limited number of available bits (although we
do have a fair number of unused ones currently given the extension to
64-bits) and that it would be easy to consume them all if we allocated
one for every kernel feature. That said, this might be a sufficiently
important use case to justify it.
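
The bit budget mentioned above can be illustrated with a minimal
userspace sketch. It decodes a capability mask as printed in the CapEff
field of /proc/<pid>/status and counts how many of the 64 bits remain
unallocated. The capability numbers are assumptions taken from
include/uapi/linux/capability.h, with CAP_AUDIT_READ (37) assumed to be
the last allocated capability at the time of this thread. The helper
names are hypothetical and not part of the patch set.

```python
# Capability numbers below are assumed from include/uapi/linux/capability.h.
CAP_SYS_PTRACE = 19
CAP_SYS_ADMIN = 21
CAP_AUDIT_READ = 37   # assumed to be CAP_LAST_CAP when this thread was written
CAP_SET_BITS = 64     # width of a capability set after the 64-bit extension


def decode_caps(mask_hex):
    """Return the set of capability numbers whose bits are set in a hex mask."""
    mask = int(mask_hex, 16)
    return {bit for bit in range(CAP_SET_BITS) if (mask >> bit) & 1}


def unused_bits(last_cap=CAP_AUDIT_READ):
    """Count the bits above last_cap that are still unallocated."""
    return CAP_SET_BITS - (last_cap + 1)


def effective_caps(status_path="/proc/self/status"):
    """Read the effective capability set of the current process (Linux only)."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return decode_caps(line.split()[1])
    return set()
```

On a Linux system, `CAP_SYS_ADMIN in effective_caps()` reports whether
the calling process currently holds CAP_SYS_ADMIN in its effective set.
Under the assumptions above, unused_bits() leaves 26 bits, so each new
single-purpose capability consumes a noticeable share of what remains.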

Obviously I'd encourage you to consider leveraging SELinux as well but I
understand that you are looking for a solution that doesn't depend on a
distro using a particular LSM or a particular policy. I will note that
SELinux doesn't suffer from the limited bits problem because one can
always define a new SELinux security class with its own access vector
permissions bitmap, as has been done for the recently added LSM/SELinux
perf_event hooks.

I don't know who actually gets to decide when/if a new capability is
allocated. Maybe Serge and/or James as capabilities and LSM maintainers.

I have no objections to these patches from a SELinux POV.

2019-12-15 11:54:35

by Alexey Budankov

Subject: Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups


On 12.12.2019 17:24, Stephen Smalley wrote:
> On 12/11/19 3:36 PM, Andi Kleen wrote:
>>>> In these circumstances CAP_SYS_PERFMON looks like a smart balanced advancement that
>>>> trade-offs between perf_events subsystem extensions, required level of control
>>>> and configurability of perf_events, existing users adoption effort, and it brings
>>>> security hardening benefits of decreasing attack surface for the existing users
>>>> and use cases.
>>>
>>> I'm not 100% opposed to CAP_SYS_PERFMON. I am 100% opposed to new capabilities
>>> that have a single use. Surely there are other CAP_SYS_ADMIN users that [cs]ould
>>> be converted to CAP_SYS_PERFMON as well. If there is a class of system performance
>>> privileged operations, say a dozen or so, you may have a viable argument.
>>
>> perf events is not a single use. It has a bazillion sub-functionalities,
>> including hardware tracing, software tracing, PMU counters, software counters,
>> uncore counters, breakpoints and various other stuff in its PMU drivers.
>>
>> See it more as a whole, quite heterogeneous driver subsystem.
>>
>> I guess CAP_SYS_PERFMON is not a good name because perf is much more
>> than just Perfmon. Perhaps call it CAP_SYS_PERF_EVENTS
>
> That seems misleading since it isn't being checked for all perf_events
> operations IIUC (CAP_SYS_ADMIN is still required for some?) and it is
> even more specialized than CAP_SYS_PERFMON, making it less likely that
> we could ever use this capability as a check for other kernel
> performance monitoring facilities beyond perf_events.
>
> I'm not as opposed to fine-grained capabilities as Casey is but I do
> recognize that there are a limited number of available bits (although we
> do have a fair number of unused ones currently given the extension to
> 64-bits) and that it would be easy to consume them all if we allocated
> one for every kernel feature. That said, this might be a sufficiently
> important use case to justify it.
>
> Obviously I'd encourage you to consider leveraging SELinux as well but I
> understand that you are looking for a solution that doesn't depend on a
> distro using a particular LSM or a particular policy. I will note that
> SELinux doesn't suffer from the limited bits problem because one can
> always define a new SELinux security class with its own access vector
> permissions bitmap, as has been done for the recently added LSM/SELinux
> perf_event hooks.
>
> I don't know who actually gets to decide when/if a new capability is
> allocated. Maybe Serge and/or James as capabilities and LSM maintainers.
>
> I have no objections to these patches from a SELinux POV.

Stephen, thanks for the meaningful input!

~Alexey