Extend the perf_event_open(2) man page with information about the
CAP_PERFMON capability, which is designed to secure performance monitoring
and observability operations in a system according to the principle
of least privilege [1] (POSIX IEEE 1003.1e, 2.2.2.39).
[1] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf
Signed-off-by: Alexey Budankov <[email protected]>
---
man2/perf_event_open.2 | 32 ++++++++++++++++++++++++++++++--
1 file changed, 30 insertions(+), 2 deletions(-)
diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index 4827a359d..9810bc554 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -97,6 +97,8 @@ when running on the specified CPU.
.BR "pid == \-1" " and " "cpu >= 0"
This measures all processes/threads on the specified CPU.
This requires
+.B CAP_PERFMON
+(since Linux 5.8) or
.B CAP_SYS_ADMIN
capability or a
.I /proc/sys/kernel/perf_event_paranoid
@@ -108,9 +110,11 @@ This setting is invalid and will return an error.
When
.I pid
is greater than zero, permission to perform this system call
-is governed by a ptrace access mode
+is governed by
+.B CAP_PERFMON
+(since Linux 5.9) and a ptrace access mode
.B PTRACE_MODE_READ_REALCREDS
-check; see
+check on older Linux versions; see
.BR ptrace (2).
.PP
The
@@ -2925,6 +2929,8 @@ to hold the result.
This allows attaching a Berkeley Packet Filter (BPF)
program to an existing kprobe tracepoint event.
You need
+.B CAP_PERFMON
+(since Linux 5.8) or
.B CAP_SYS_ADMIN
privileges to use this ioctl.
.IP
@@ -2967,6 +2973,8 @@ have multiple events attached to a tracepoint.
Querying this value on one tracepoint event returns the id
of all BPF programs in all events attached to the tracepoint.
You need
+.B CAP_PERFMON
+(since Linux 5.8) or
.B CAP_SYS_ADMIN
privileges to use this ioctl.
.IP
@@ -3175,6 +3183,8 @@ it was expecting.
.TP
.B EACCES
Returned when the requested event requires
+.B CAP_PERFMON
+(since Linux 5.8) or
.B CAP_SYS_ADMIN
permissions (or a more permissive perf_event paranoid setting).
Some common cases where an unprivileged process
@@ -3296,6 +3306,8 @@ setting is specified.
It can also happen, as with
.BR EACCES ,
when the requested event requires
+.B CAP_PERFMON
+(since Linux 5.8) or
.B CAP_SYS_ADMIN
permissions (or a more permissive perf_event paranoid setting).
This includes setting a breakpoint on a kernel address,
@@ -3326,6 +3338,22 @@ The official way of knowing if
support is enabled is checking
for the existence of the file
.IR /proc/sys/kernel/perf_event_paranoid .
+.PP
+.B CAP_PERFMON
+capability (since Linux 5.8) provides a secure approach to
+performance monitoring and observability operations in a system
+according to the principle of least privilege (POSIX IEEE 1003.1e).
+Accessing system performance monitoring and observability operations
+using
+.B CAP_PERFMON
+rather than the much more powerful
+.B CAP_SYS_ADMIN
+reduces the chance of misusing credentials and makes operations more secure.
+.B CAP_SYS_ADMIN
+usage for secure system performance monitoring and observability
+is discouraged in favor of the
+.B CAP_PERFMON
+capability.
.SH BUGS
The
.B F_SETOWN_EX
--
2.24.1
Hello Alexey,
On 10/27/20 5:48 PM, Alexey Budankov wrote:
>
> Extend the perf_event_open(2) man page with information about the
> CAP_PERFMON capability, which is designed to secure performance monitoring
> and observability operations in a system according to the principle
> of least privilege [1] (POSIX IEEE 1003.1e, 2.2.2.39).
>
> [1] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf
>
> Signed-off-by: Alexey Budankov <[email protected]>
Thanks for this. I've applied. I have a few questions/comments below.
> ---
> man2/perf_event_open.2 | 32 ++++++++++++++++++++++++++++++--
> 1 file changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
> index 4827a359d..9810bc554 100644
> --- a/man2/perf_event_open.2
> +++ b/man2/perf_event_open.2
> @@ -97,6 +97,8 @@ when running on the specified CPU.
> .BR "pid == \-1" " and " "cpu >= 0"
> This measures all processes/threads on the specified CPU.
> This requires
> +.B CAP_PERFMON
> +(since Linux 5.8) or
> .B CAP_SYS_ADMIN
> capability or a
> .I /proc/sys/kernel/perf_event_paranoid
> @@ -108,9 +110,11 @@ This setting is invalid and will return an error.
> When
> .I pid
> is greater than zero, permission to perform this system call
> -is governed by a ptrace access mode
> +is governed by
> +.B CAP_PERFMON
> +(since Linux 5.9) and a ptrace access mode
I want to check: did you really mean 5.9 here? (Everywhere else,
5.8 is mentioned, but perhaps this change came in the next kernel
version.)
> .B PTRACE_MODE_READ_REALCREDS
> -check; see
> +check on older Linux versions; see
> .BR ptrace (2).
> .PP
> The
> @@ -2925,6 +2929,8 @@ to hold the result.
> This allows attaching a Berkeley Packet Filter (BPF)
> program to an existing kprobe tracepoint event.
> You need
> +.B CAP_PERFMON
> +(since Linux 5.8) or
> .B CAP_SYS_ADMIN
> privileges to use this ioctl.
> .IP
> @@ -2967,6 +2973,8 @@ have multiple events attached to a tracepoint.
> Querying this value on one tracepoint event returns the id
> of all BPF programs in all events attached to the tracepoint.
> You need
> +.B CAP_PERFMON
> +(since Linux 5.8) or
> .B CAP_SYS_ADMIN
> privileges to use this ioctl.
> .IP
> @@ -3175,6 +3183,8 @@ it was expecting.
> .TP
> .B EACCES
> Returned when the requested event requires
> +.B CAP_PERFMON
> +(since Linux 5.8) or
> .B CAP_SYS_ADMIN
> permissions (or a more permissive perf_event paranoid setting).
> Some common cases where an unprivileged process
> @@ -3296,6 +3306,8 @@ setting is specified.
> It can also happen, as with
> .BR EACCES ,
> when the requested event requires
> +.B CAP_PERFMON
> +(since Linux 5.8) or
> .B CAP_SYS_ADMIN
> permissions (or a more permissive perf_event paranoid setting).
> This includes setting a breakpoint on a kernel address,
> @@ -3326,6 +3338,22 @@ The official way of knowing if
> support is enabled is checking
> for the existence of the file
> .IR /proc/sys/kernel/perf_event_paranoid .
> +.PP
> +.B CAP_PERFMON
> +capability (since Linux 5.8) provides a secure approach to
> +performance monitoring and observability operations in a system
> +according to the principle of least privilege (POSIX IEEE 1003.1e).
> +Accessing system performance monitoring and observability operations
> +using
> +.B CAP_PERFMON
> +rather than the much more powerful
> +.B CAP_SYS_ADMIN
> +reduces the chance of misusing credentials and makes operations more secure.
> +.B CAP_SYS_ADMIN
> +usage for secure system performance monitoring and observability
> +is discouraged in favor of the
> +.B CAP_PERFMON
> +capability.
Thank you for adding the above piece. That point of course
really needs to be emphasized!
Thanks,
Michael
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/