2020-11-05 08:30:49

by Namhyung Kim

Subject: [PATCH] perf_event_open.2: Update man page with recent changes

From: Namhyung Kim <[email protected]>

There are lots of changes as usual. I've tried to fill some missing
bits in the man page but it'd be nice if you could take a look and put
more info there.

Signed-off-by: Namhyung Kim <[email protected]>
---
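
Not part of the commit message: a minimal sketch of the bit budget behind
the __reserved_1 change. The seven new one-bit flags are taken out of the
reserved field (37 - 7 = 30), so the flag word stays 64 bits wide. The
struct and function names below are hypothetical, and earlier_flags lumps
together the 27 bits (64 - 37) that precede write_backward; real code
would use struct perf_event_attr from <linux/perf_event.h> instead.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the 64-bit flag word of struct perf_event_attr.
 * earlier_flags is an assumption made for brevity: the real struct names
 * each of those 27 bits individually.
 */
struct attr_flags_sketch {
    uint64_t earlier_flags  : 27,
             write_backward : 1,
             namespaces     : 1,
             ksymbol        : 1,
             bpf_event      : 1,
             aux_output     : 1,
             cgroup         : 1,
             text_poke      : 1,
             reserved_1     : 30;   /* was 37 before the seven new bits */
};

/* 27 + 7 + 30 == 64, so the flag word still occupies exactly one u64. */
_Static_assert(sizeof(struct attr_flags_sketch) == sizeof(uint64_t),
               "flag word must stay 64 bits wide");

/* Request the new record types documented in this patch. */
static struct attr_flags_sketch request_new_records(void)
{
    struct attr_flags_sketch f = {0};

    f.namespaces = 1;   /* PERF_RECORD_NAMESPACES */
    f.ksymbol    = 1;   /* PERF_RECORD_KSYMBOL */
    f.bpf_event  = 1;   /* PERF_RECORD_BPF_EVENT */
    f.cgroup     = 1;   /* PERF_RECORD_CGROUP */
    f.text_poke  = 1;   /* PERF_RECORD_TEXT_POKE */
    return f;
}
```

write_backward and aux_output are left clear here because they change how
data is written rather than enabling a new record type.
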
man2/perf_event_open.2 | 262 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 260 insertions(+), 2 deletions(-)

diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index 72afafb50..e86adfa41 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -247,8 +247,15 @@ struct perf_event_attr {
due to exec */
use_clockid : 1, /* use clockid for time fields */
context_switch : 1, /* context switch data */
+ write_backward : 1, /* Write ring buffer from end to beginning */
+ namespaces : 1, /* include namespaces data */
+ ksymbol : 1, /* include ksymbol events */
+ bpf_event : 1, /* include bpf events */
+ aux_output : 1, /* generate AUX records instead of events */
+ cgroup : 1, /* include cgroup events */
+ text_poke : 1, /* include text poke events */

- __reserved_1 : 37;
+ __reserved_1 : 30;

union {
__u32 wakeup_events; /* wakeup every n events */
@@ -854,6 +861,20 @@ is set higher than zero then the register
values returned are those captured by
hardware at the time of the sampled
instruction's retirement.
+.TP
+.BR PERF_SAMPLE_PHYS_ADDR " (since Linux 4.13)"
+.\" commit fc7ce9c74c3ad232b084d80148654f926d01ece7
+Records the physical address of the data, analogous to the address recorded by
+.BR PERF_SAMPLE_ADDR .
+.TP
+.BR PERF_SAMPLE_CGROUP " (since Linux 5.7)"
+.\" commit 96aaab686505c449e24d76e76507290dcc30e008
+Records the (perf_event) cgroup ID of the process.
+This corresponds to the
+.I id
+field in the
+.B PERF_RECORD_CGROUP
+event.
.RE
.TP
.IR "read_format"
@@ -1189,6 +1210,47 @@ information even with strict
.I perf_event_paranoid
settings.
.TP
+.IR "write_backward" " (since Linux 4.6)"
+.\" commit 9ecda41acb971ebd07c8fb35faf24005c0baea12
+This makes the ring buffer is written from end to beginning.
+This is to support reading from overwritable ring buffer.
+.TP
+.IR "namespaces" " (since Linux 4.11)"
+.\" commit e422267322cd319e2695a535e47c5b1feeac45eb
+This enables the generation of
+.B PERF_RECORD_NAMESPACES
+records when a task enters a new namespace. Each namespace has a
+combination of device and inode numbers.
+.TP
+.IR "ksymbol" " (since Linux 5.0)"
+.\" commit 76193a94522f1d4edf2447a536f3f796ce56343b
+This enables the generation of
+.B PERF_RECORD_KSYMBOL
+records when new kernel symbols are registered or unregistered.
+This helps in analyzing dynamic kernel functions such as eBPF programs.
+.TP
+.IR "bpf_event" " (since Linux 5.0)"
+.\" commit 6ee52e2a3fe4ea35520720736e6791df1fb67106
+This enables the generation of
+.B PERF_RECORD_BPF_EVENT
+records when an eBPF program is loaded or unloaded.
+.IR "aux_output" " (since Linux 5.4)"
+.\" commit ab43762ef010967e4ccd53627f70a2eecbeafefb
+This allows normal (non-AUX) events to generate data for AUX events
+if the hardware supports it.
+.IR "cgroup" " (since Linux 5.7)"
+.\" commit 96aaab686505c449e24d76e76507290dcc30e008
+This enables the generation of
+.B PERF_RECORD_CGROUP
+records when a new cgroup is created (and activated).
+.TP
+.IR "text_poke" " (since Linux 5.8)"
+.\" commit e17d43b93e544f5016c0251d2074c15568d5d963
+This enables the generation of
+.B PERF_RECORD_TEXT_POKE
+records when there's a changes to the kernel text (i.e. self-modifying
+code).
+.TP
.IR "wakeup_events" ", " "wakeup_watermark"
This union sets how many samples
.RI ( wakeup_events )
@@ -2101,7 +2163,7 @@ struct {
u64 nr; /* if PERF_SAMPLE_CALLCHAIN */
u64 ips[nr]; /* if PERF_SAMPLE_CALLCHAIN */
u32 size; /* if PERF_SAMPLE_RAW */
- char data[size]; /* if PERF_SAMPLE_RAW */
+ char data[size]; /* if PERF_SAMPLE_RAW */
u64 bnr; /* if PERF_SAMPLE_BRANCH_STACK */
struct perf_branch_entry lbr[bnr];
/* if PERF_SAMPLE_BRANCH_STACK */
@@ -2118,6 +2180,8 @@ struct {
u64 abi; /* if PERF_SAMPLE_REGS_INTR */
u64 regs[weight(mask)];
/* if PERF_SAMPLE_REGS_INTR */
+ u64 phys_addr; /* if PERF_SAMPLE_PHYS_ADDR */
+ u64 cgroup; /* if PERF_SAMPLE_CGROUP */
};
.EE
.in
@@ -2744,6 +2808,200 @@ or next (if switching out) process on the CPU.
The thread ID of the previous (if switching in)
or next (if switching out) thread on the CPU.
.RE
+.TP
+.BR PERF_RECORD_NAMESPACES " (since Linux 4.11)"
+.\" commit e422267322cd319e2695a535e47c5b1feeac45eb
+This record includes various namespace information of a process.
+.IP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u32 pid;
+ u32 tid;
+ u64 nr_namespaces;
+ struct { u64 dev, inode } [nr_namespaces];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.RS
+.TP
+.I pid
+is the process ID
+.TP
+.I tid
+is the thread ID
+.TP
+.I nr_namespaces
+is the number of namespaces in this record
+.RE
+.IP
+Each namespace has
+.I dev
+and
+.I inode
+fields and is recorded at a
+fixed position, as follows:
+.RS
+.TP
+.BR NET_NS_INDEX = 0
+Network namespace
+.TP
+.BR UTS_NS_INDEX = 1
+UTS namespace
+.TP
+.BR IPC_NS_INDEX = 2
+IPC namespace
+.TP
+.BR PID_NS_INDEX = 3
+PID namespace
+.TP
+.BR USER_NS_INDEX = 4
+User namespace
+.TP
+.BR MNT_NS_INDEX = 5
+Mount namespace
+.TP
+.BR CGROUP_NS_INDEX = 6
+Cgroup namespace
+.PP
+.RE
+.TP
+.BR PERF_RECORD_KSYMBOL " (since Linux 5.0)"
+.\" commit 76193a94522f1d4edf2447a536f3f796ce56343b
+This record indicates kernel symbol register/unregister events.
+.IP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u64 addr;
+ u32 len;
+ u16 ksym_type;
+ u16 flags;
+ char name[];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.RS
+.TP
+.I addr
+is the address of the kernel symbol
+.TP
+.I len
+is the length of the kernel symbol
+.TP
+.I ksym_type
+is the type of the kernel symbol. Currently, the following types are available:
+.RS
+.TP
+.B PERF_RECORD_KSYMBOL_TYPE_BPF
+The kernel symbol is a BPF function.
+.RE
+.TP
+.I flags
+If the
+.B PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER
+is set, then this event is for unregistering the kernel symbol.
+.RE
+.TP
+.BR PERF_RECORD_BPF_EVENT " (since Linux 5.0)"
+.\" commit 6ee52e2a3fe4ea35520720736e6791df1fb67106
+This record indicates that a BPF program was loaded or unloaded.
+.IP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u16 type;
+ u16 flags;
+ u32 id;
+ u8 tag[BPF_TAG_SIZE];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.RS
+.TP
+.I type
+is one of the following values:
+.RS
+.TP
+.B PERF_BPF_EVENT_PROG_LOAD
+A BPF program is loaded
+.TP
+.B PERF_BPF_EVENT_PROG_UNLOAD
+A BPF program is unloaded
+.RE
+.TP
+.I id
+is the id of the BPF program.
+.TP
+.I tag
+is the tag of the BPF program. Currently
+.BR BPF_TAG_SIZE
+is defined as 8.
+.RE
+.TP
+.BR PERF_RECORD_CGROUP " (since Linux 5.7)"
+.\" commit 96aaab686505c449e24d76e76507290dcc30e008
+This record indicates that a new cgroup was created and activated.
+.IP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u64 id;
+ char path[];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.RS
+.TP
+.I id
+is the cgroup identifier. This can also be retrieved by
+.BR name_to_handle_at (2)
+on the cgroup path (as a file handle).
+.TP
+.I path
+is the path of the cgroup from the root.
+.RE
+.TP
+.BR PERF_RECORD_TEXT_POKE " (since Linux 5.8)"
+.\" commit e17d43b93e544f5016c0251d2074c15568d5d963
+This record indicates a change in the kernel text. This includes the
+addition or removal of text, in which case the old or new length,
+respectively, is zero.
+.IP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u64 addr;
+ u16 old_len;
+ u16 new_len;
+ u8 bytes[];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.RS
+.TP
+.I addr
+is the address of the change
+.TP
+.I old_len
+is the old length
+.TP
+.I new_len
+is the new length
+.TP
+.I bytes
+contains old bytes immediately followed by new bytes.
+.RE
.RE
.SS Overflow handling
Events can be set to notify when a threshold is crossed,
--
2.29.1.341.ge80a0c044ae-goog


2020-11-05 15:33:20

by Alejandro Colomar

Subject: [PATCH v2] perf_event_open.2: Update man page with recent changes

From: Namhyung Kim <[email protected]>

There are lots of changes as usual. I've tried to fill some missing
bits in the man page but it'd be nice if you could take a look and put
more info there.

Signed-off-by: Namhyung Kim <[email protected]>
[[email protected]: ffix + tfix]
Co-developed-by: Alejandro Colomar <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
---

I wrapped a few lines, and did some formatting fixes to the patch.
However, there are some parts where
I found the text to be a bit unclear to me.
Maybe you could rephrase them:
- The paragraph right under 'write_backward'.
- Text right under 'text_poke': "there's a changes"

I would change
[[
struct { u64 dev, inode } [nr_namespaces];
]]
to
[[
struct {
u64 dev;
u64 inode;
} [nr_namespaces];
]]
Wouldn't you?
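
To illustrate the layout with one field per line, here is a minimal
sketch of how a reader of the record might index the namespace array.
The type and function names (ns_entry_sketch, ns_entry_at) are
hypothetical and only for illustration; the index values are the ones
listed in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* One namespace entry, with each field on its own line as proposed. */
struct ns_entry_sketch {
    uint64_t dev;
    uint64_t inode;
};

/* Fixed positions of the entries, as listed in the patch. */
enum ns_index_sketch {
    NET_NS_INDEX    = 0,
    UTS_NS_INDEX    = 1,
    IPC_NS_INDEX    = 2,
    PID_NS_INDEX    = 3,
    USER_NS_INDEX   = 4,
    MNT_NS_INDEX    = 5,
    CGROUP_NS_INDEX = 6,
};

/*
 * Return the entry at position idx, or NULL if the record carries
 * fewer than idx + 1 entries. entries points at the array that
 * follows nr_namespaces in the record body.
 */
static const struct ns_entry_sketch *
ns_entry_at(const struct ns_entry_sketch *entries, uint64_t nr_namespaces,
            unsigned int idx)
{
    if (idx >= nr_namespaces)
        return NULL;
    return &entries[idx];
}
```

Checking idx against nr_namespaces keeps the lookup safe if a record
carries fewer entries than the reader expects.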

Thanks,

Alex

man2/perf_event_open.2 | 265 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 263 insertions(+), 2 deletions(-)

diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index 72afafb50..4adeccdde 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -247,8 +247,17 @@ struct perf_event_attr {
due to exec */
use_clockid : 1, /* use clockid for time fields */
context_switch : 1, /* context switch data */
+ write_backward : 1, /* Write ring buffer from end
+ to beginning */
+ namespaces : 1, /* include namespaces data */
+ ksymbol : 1, /* include ksymbol events */
+ bpf_event : 1, /* include bpf events */
+ aux_output : 1, /* generate AUX records
+ instead of events */
+ cgroup : 1, /* include cgroup events */
+ text_poke : 1, /* include text poke events */

- __reserved_1 : 37;
+ __reserved_1 : 30;

union {
__u32 wakeup_events; /* wakeup every n events */
@@ -854,6 +863,20 @@ is set higher than zero then the register
values returned are those captured by
hardware at the time of the sampled
instruction's retirement.
+.TP
+.BR PERF_SAMPLE_PHYS_ADDR " (since Linux 4.13)"
+.\" commit fc7ce9c74c3ad232b084d80148654f926d01ece7
+Records the physical address of the data, analogous to the address recorded by
+.BR PERF_SAMPLE_ADDR .
+.TP
+.BR PERF_SAMPLE_CGROUP " (since Linux 5.7)"
+.\" commit 96aaab686505c449e24d76e76507290dcc30e008
+Records the (perf_event) cgroup ID of the process.
+This corresponds to the
+.I id
+field in the
+.B PERF_RECORD_CGROUP
+event.
.RE
.TP
.IR "read_format"
@@ -1189,6 +1212,47 @@ information even with strict
.I perf_event_paranoid
settings.
.TP
+.IR write_backward " (since Linux 4.6)"
+.\" commit 9ecda41acb971ebd07c8fb35faf24005c0baea12
+This makes the ring buffer is written from end to beginning.
+This is to support reading from overwritable ring buffer.
+.TP
+.IR namespaces " (since Linux 4.11)"
+.\" commit e422267322cd319e2695a535e47c5b1feeac45eb
+This enables the generation of
+.B PERF_RECORD_NAMESPACES
+records when a task enters a new namespace.
+Each namespace has a combination of device and inode numbers.
+.TP
+.IR ksymbol " (since Linux 5.0)"
+.\" commit 76193a94522f1d4edf2447a536f3f796ce56343b
+This enables the generation of
+.B PERF_RECORD_KSYMBOL
+records when new kernel symbols are registered or unregistered.
+This helps in analyzing dynamic kernel functions such as eBPF programs.
+.TP
+.IR bpf_event " (since Linux 5.0)"
+.\" commit 6ee52e2a3fe4ea35520720736e6791df1fb67106
+This enables the generation of
+.B PERF_RECORD_BPF_EVENT
+records when an eBPF program is loaded or unloaded.
+.IR aux_output " (since Linux 5.4)"
+.\" commit ab43762ef010967e4ccd53627f70a2eecbeafefb
+This allows normal (non-AUX) events to generate data for AUX events
+if the hardware supports it.
+.IR cgroup " (since Linux 5.7)"
+.\" commit 96aaab686505c449e24d76e76507290dcc30e008
+This enables the generation of
+.B PERF_RECORD_CGROUP
+records when a new cgroup is created (and activated).
+.TP
+.IR text_poke " (since Linux 5.8)"
+.\" commit e17d43b93e544f5016c0251d2074c15568d5d963
+This enables the generation of
+.B PERF_RECORD_TEXT_POKE
+records when there's a changes to the kernel text
+(i.e. self-modifying code).
+.TP
.IR "wakeup_events" ", " "wakeup_watermark"
This union sets how many samples
.RI ( wakeup_events )
@@ -2101,7 +2165,7 @@ struct {
u64 nr; /* if PERF_SAMPLE_CALLCHAIN */
u64 ips[nr]; /* if PERF_SAMPLE_CALLCHAIN */
u32 size; /* if PERF_SAMPLE_RAW */
- char data[size]; /* if PERF_SAMPLE_RAW */
+ char data[size]; /* if PERF_SAMPLE_RAW */
u64 bnr; /* if PERF_SAMPLE_BRANCH_STACK */
struct perf_branch_entry lbr[bnr];
/* if PERF_SAMPLE_BRANCH_STACK */
@@ -2118,6 +2182,8 @@ struct {
u64 abi; /* if PERF_SAMPLE_REGS_INTR */
u64 regs[weight(mask)];
/* if PERF_SAMPLE_REGS_INTR */
+ u64 phys_addr; /* if PERF_SAMPLE_PHYS_ADDR */
+ u64 cgroup; /* if PERF_SAMPLE_CGROUP */
};
.EE
.in
@@ -2744,6 +2810,201 @@ or next (if switching out) process on the CPU.
The thread ID of the previous (if switching in)
or next (if switching out) thread on the CPU.
.RE
+.TP
+.BR PERF_RECORD_NAMESPACES " (since Linux 4.11)"
+.\" commit e422267322cd319e2695a535e47c5b1feeac45eb
+This record includes various namespace information of a process.
+.RS
+.PP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u32 pid;
+ u32 tid;
+ u64 nr_namespaces;
+ struct { u64 dev, inode } [nr_namespaces];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.TP
+.I pid
+is the process ID
+.TP
+.I tid
+is the thread ID
+.TP
+.I nr_namespaces
+is the number of namespaces in this record
+.PP
+Each namespace has
+.I dev
+and
+.I inode
+fields and is recorded at a
+fixed position, as follows:
+.TP
+.BR NET_NS_INDEX = 0
+Network namespace
+.TP
+.BR UTS_NS_INDEX = 1
+UTS namespace
+.TP
+.BR IPC_NS_INDEX = 2
+IPC namespace
+.TP
+.BR PID_NS_INDEX = 3
+PID namespace
+.TP
+.BR USER_NS_INDEX = 4
+User namespace
+.TP
+.BR MNT_NS_INDEX = 5
+Mount namespace
+.TP
+.BR CGROUP_NS_INDEX = 6
+Cgroup namespace
+.PP
+.RE
+.TP
+.BR PERF_RECORD_KSYMBOL " (since Linux 5.0)"
+.\" commit 76193a94522f1d4edf2447a536f3f796ce56343b
+This record indicates kernel symbol register/unregister events.
+.RS
+.PP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u64 addr;
+ u32 len;
+ u16 ksym_type;
+ u16 flags;
+ char name[];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.TP
+.I addr
+is the address of the kernel symbol
+.TP
+.I len
+is the length of the kernel symbol
+.TP
+.I ksym_type
+is the type of the kernel symbol.
+Currently, the following types are available:
+.RS
+.TP
+.B PERF_RECORD_KSYMBOL_TYPE_BPF
+The kernel symbol is a BPF function.
+.RE
+.TP
+.I flags
+If the
+.B PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER
+is set, then this event is for unregistering the kernel symbol.
+.RE
+.TP
+.BR PERF_RECORD_BPF_EVENT " (since Linux 5.0)"
+.\" commit 6ee52e2a3fe4ea35520720736e6791df1fb67106
+This record indicates that a BPF program was loaded or unloaded.
+.RS
+.PP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u16 type;
+ u16 flags;
+ u32 id;
+ u8 tag[BPF_TAG_SIZE];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.TP
+.I type
+is one of the following values:
+.RS
+.TP
+.B PERF_BPF_EVENT_PROG_LOAD
+A BPF program is loaded
+.TP
+.B PERF_BPF_EVENT_PROG_UNLOAD
+A BPF program is unloaded
+.RE
+.TP
+.I id
+is the id of the BPF program.
+.TP
+.I tag
+is the tag of the BPF program.
+Currently,
+.B BPF_TAG_SIZE
+is defined as 8.
+.RE
+.TP
+.BR PERF_RECORD_CGROUP " (since Linux 5.7)"
+.\" commit 96aaab686505c449e24d76e76507290dcc30e008
+This record indicates that a new cgroup was created and activated.
+.RS
+.PP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u64 id;
+ char path[];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.TP
+.I id
+is the cgroup identifier.
+This can also be retrieved by
+.BR name_to_handle_at (2)
+on the cgroup path (as a file handle).
+.TP
+.I path
+is the path of the cgroup from the root.
+.RE
+.TP
+.BR PERF_RECORD_TEXT_POKE " (since Linux 5.8)"
+.\" commit e17d43b93e544f5016c0251d2074c15568d5d963
+This record indicates a change in the kernel text.
+This includes the addition or removal of text,
+in which case the old or new length, respectively, is zero.
+.RS
+.PP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u64 addr;
+ u16 old_len;
+ u16 new_len;
+ u8 bytes[];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.TP
+.I addr
+is the address of the change
+.TP
+.I old_len
+is the old length
+.TP
+.I new_len
+is the new length
+.TP
+.I bytes
+contains old bytes immediately followed by new bytes.
+.RE
.RE
.SS Overflow handling
Events can be set to notify when a threshold is crossed,
--
2.28.0

2020-11-06 04:43:10

by Namhyung Kim

Subject: Re: [PATCH v2] perf_event_open.2: Update man page with recent changes

Hello,

On Fri, Nov 6, 2020 at 12:31 AM Alejandro Colomar
<[email protected]> wrote:
>
> From: Namhyung Kim <[email protected]>
>
> There are lots of changes as usual. I've tried to fill some missing
> bits in the man page but it'd be nice if you could take a look and put
> more info there.
>
> Signed-off-by: Namhyung Kim <[email protected]>
> [[email protected]: ffix + tfix]
> Co-developed-by: Alejandro Colomar <[email protected]>
> Signed-off-by: Alejandro Colomar <[email protected]>
> ---
>
> I wrapped a few lines, and did some formatting fixes to the patch.
> However, there are some parts where
> I found the text to be a bit unclear to me.
> Maybe you could rephrase them:
> - The paragraph right under 'write_backward'.
> - Text right under 'text_poke': "there's a changes"

Yeah, thank you for checking. I'll update them.
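
For reference while rewording the text_poke part, a minimal sketch of
how a reader might split the bytes[] payload of PERF_RECORD_TEXT_POKE
into its old and new halves. The names text_poke_view_sketch and
text_poke_split, and the avail parameter, are hypothetical; a real
consumer would derive the available size from header.size.

```c
#include <stddef.h>
#include <stdint.h>

/* Views into bytes[]: old bytes first, new bytes immediately after. */
struct text_poke_view_sketch {
    const uint8_t *old_bytes;
    size_t         old_len;
    const uint8_t *new_bytes;
    size_t         new_len;
};

/*
 * Split bytes[] using old_len and new_len from the record.
 * avail is the number of bytes actually present in bytes[]
 * (hypothetical parameter, see above).
 * Returns 0 on success, -1 if the two lengths do not fit in avail.
 */
static int text_poke_split(const uint8_t *bytes, size_t avail,
                           uint16_t old_len, uint16_t new_len,
                           struct text_poke_view_sketch *out)
{
    if ((size_t)old_len + new_len > avail)
        return -1;
    out->old_bytes = bytes;
    out->old_len   = old_len;
    out->new_bytes = bytes + old_len;
    out->new_len   = new_len;
    return 0;
}
```

With old_len equal to zero the new bytes start at the beginning of
bytes[], which matches the "addition of text" case in the patch; with
new_len equal to zero only old bytes are present.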

>
> I would change
> [[
> struct { u64 dev, inode } [nr_namespaces];
> ]]
> to
> [[
> struct {
> u64 dev;
> u64 inode;
> } [nr_namespaces];
> ]]
> Wouldn't you?

Yep, will change.

Thanks,
Namhyung


2020-11-12 10:34:51

by Alejandro Colomar

Subject: [PATCH v3] perf_event_open.2: Update man page with recent changes

From: Namhyung Kim <[email protected]>

There are lots of changes as usual. I've tried to fill some missing
bits in the man page but it'd be nice if you could take a look and put
more info there.

Signed-off-by: Namhyung Kim <[email protected]>
[alx: ffix + tfix]
Cowritten-by: Alejandro Colomar <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
---

Hi Namhyung,

I fixed a few more typos
and changed the formatting a bit.

Cheers,

Alex

man2/perf_event_open.2 | 266 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 264 insertions(+), 2 deletions(-)

diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index 4d93a0be2..9a3e37bf6 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -247,8 +247,17 @@ struct perf_event_attr {
due to exec */
use_clockid : 1, /* use clockid for time fields */
context_switch : 1, /* context switch data */
+ write_backward : 1, /* Write ring buffer from end
+ to beginning */
+ namespaces : 1, /* include namespaces data */
+ ksymbol : 1, /* include ksymbol events */
+ bpf_event : 1, /* include bpf events */
+ aux_output : 1, /* generate AUX records
+ instead of events */
+ cgroup : 1, /* include cgroup events */
+ text_poke : 1, /* include text poke events */

- __reserved_1 : 37;
+ __reserved_1 : 30;

union {
__u32 wakeup_events; /* wakeup every n events */
@@ -875,6 +884,20 @@ is set higher than zero then the register
values returned are those captured by
hardware at the time of the sampled
instruction's retirement.
+.TP
+.BR PERF_SAMPLE_PHYS_ADDR " (since Linux 4.13)"
+.\" commit fc7ce9c74c3ad232b084d80148654f926d01ece7
+Records the physical address of the data, analogous to the address recorded by
+.BR PERF_SAMPLE_ADDR .
+.TP
+.BR PERF_SAMPLE_CGROUP " (since Linux 5.7)"
+.\" commit 96aaab686505c449e24d76e76507290dcc30e008
+Records the (perf_event) cgroup ID of the process.
+This corresponds to the
+.I id
+field in the
+.B PERF_RECORD_CGROUP
+event.
.RE
.TP
.I read_format
@@ -1218,6 +1241,48 @@ information even with strict
.I perf_event_paranoid
settings.
.TP
+.IR write_backward " (since Linux 4.6)"
+.\" commit 9ecda41acb971ebd07c8fb35faf24005c0baea12
+This causes the ring buffer to be written from the end to the beginning.
+This is to support reading from an overwritable ring buffer.
+.TP
+.IR namespaces " (since Linux 4.11)"
+.\" commit e422267322cd319e2695a535e47c5b1feeac45eb
+This enables the generation of
+.B PERF_RECORD_NAMESPACES
+records when a task enters a new namespace.
+Each namespace has a combination of device and inode numbers.
+.TP
+.IR ksymbol " (since Linux 5.0)"
+.\" commit 76193a94522f1d4edf2447a536f3f796ce56343b
+This enables the generation of
+.B PERF_RECORD_KSYMBOL
+records when new kernel symbols are registered or unregistered.
+This helps in analyzing dynamic kernel functions such as eBPF programs.
+.TP
+.IR bpf_event " (since Linux 5.0)"
+.\" commit 6ee52e2a3fe4ea35520720736e6791df1fb67106
+This enables the generation of
+.B PERF_RECORD_BPF_EVENT
+records when an eBPF program is loaded or unloaded.
+.TP
+.IR aux_output " (since Linux 5.4)"
+.\" commit ab43762ef010967e4ccd53627f70a2eecbeafefb
+This allows normal (non-AUX) events to generate data for AUX events
+if the hardware supports it.
+.IR cgroup " (since Linux 5.7)"
+.\" commit 96aaab686505c449e24d76e76507290dcc30e008
+This enables the generation of
+.B PERF_RECORD_CGROUP
+records when a new cgroup is created (and activated).
+.TP
+.IR text_poke " (since Linux 5.8)"
+.\" commit e17d43b93e544f5016c0251d2074c15568d5d963
+This enables the generation of
+.B PERF_RECORD_TEXT_POKE
+records when there is a change to the kernel text
+(i.e., self-modifying code).
+.TP
.IR wakeup_events ", " wakeup_watermark
.RS
This union sets how many samples
@@ -2132,7 +2197,7 @@ struct {
u64 nr; /* if PERF_SAMPLE_CALLCHAIN */
u64 ips[nr]; /* if PERF_SAMPLE_CALLCHAIN */
u32 size; /* if PERF_SAMPLE_RAW */
- char data[size]; /* if PERF_SAMPLE_RAW */
+ char data[size]; /* if PERF_SAMPLE_RAW */
u64 bnr; /* if PERF_SAMPLE_BRANCH_STACK */
struct perf_branch_entry lbr[bnr];
/* if PERF_SAMPLE_BRANCH_STACK */
@@ -2149,6 +2214,8 @@ struct {
u64 abi; /* if PERF_SAMPLE_REGS_INTR */
u64 regs[weight(mask)];
/* if PERF_SAMPLE_REGS_INTR */
+ u64 phys_addr; /* if PERF_SAMPLE_PHYS_ADDR */
+ u64 cgroup; /* if PERF_SAMPLE_CGROUP */
};
.EE
.in
@@ -2775,6 +2842,201 @@ or next (if switching out) process on the CPU.
The thread ID of the previous (if switching in)
or next (if switching out) thread on the CPU.
.RE
+.TP
+.BR PERF_RECORD_NAMESPACES " (since Linux 4.11)"
+.\" commit e422267322cd319e2695a535e47c5b1feeac45eb
+.RS
+This record includes various namespace information of a process.
+.PP
+.RS 4
+.EX
+struct {
+ struct perf_event_header header;
+ u32 pid;
+ u32 tid;
+ u64 nr_namespaces;
+ struct { u64 dev, inode } [nr_namespaces];
+ struct sample_id sample_id;
+};
+.EE
+.RE
+.TP
+.I pid
+is the process ID
+.TP
+.I tid
+is the thread ID
+.TP
+.I nr_namespaces
+is the number of namespaces in this record
+.PP
+Each namespace has
+.I dev
+and
+.I inode
+fields and is recorded at a
+fixed position, as follows:
+.TP
+.BR NET_NS_INDEX = 0
+Network namespace
+.TP
+.BR UTS_NS_INDEX = 1
+UTS namespace
+.TP
+.BR IPC_NS_INDEX = 2
+IPC namespace
+.TP
+.BR PID_NS_INDEX = 3
+PID namespace
+.TP
+.BR USER_NS_INDEX = 4
+User namespace
+.TP
+.BR MNT_NS_INDEX = 5
+Mount namespace
+.TP
+.BR CGROUP_NS_INDEX = 6
+Cgroup namespace
+.PP
+.RE
+.TP
+.BR PERF_RECORD_KSYMBOL " (since Linux 5.0)"
+.\" commit 76193a94522f1d4edf2447a536f3f796ce56343b
+.RS
+This record indicates kernel symbol register/unregister events.
+.PP
+.RS 4
+.EX
+struct {
+ struct perf_event_header header;
+ u64 addr;
+ u32 len;
+ u16 ksym_type;
+ u16 flags;
+ char name[];
+ struct sample_id sample_id;
+};
+.EE
+.RE
+.TP
+.I addr
+is the address of the kernel symbol
+.TP
+.I len
+is the length of the kernel symbol
+.TP
+.I ksym_type
+.RS
+is the type of the kernel symbol.
+Currently, the following types are available:
+.TP
+.B PERF_RECORD_KSYMBOL_TYPE_BPF
+The kernel symbol is a BPF function.
+.RE
+.TP
+.I flags
+If the
+.B PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER
+is set, then this event is for unregistering the kernel symbol.
+.RE
+.TP
+.BR PERF_RECORD_BPF_EVENT " (since Linux 5.0)"
+.\" commit 6ee52e2a3fe4ea35520720736e6791df1fb67106
+.RS
+This record indicates that a BPF program is loaded or unloaded.
+.PP
+.RS 4
+.EX
+struct {
+ struct perf_event_header header;
+ u16 type;
+ u16 flags;
+ u32 id;
+ u8 tag[BPF_TAG_SIZE];
+ struct sample_id sample_id;
+};
+.EE
+.RE
+.TP
+.I type
+.RS
+is one of the following values:
+.TP
+.B PERF_BPF_EVENT_PROG_LOAD
+A BPF program is loaded
+.TP
+.B PERF_BPF_EVENT_PROG_UNLOAD
+A BPF program is unloaded
+.RE
+.TP
+.I id
+is the id of the BPF program.
+.TP
+.I tag
+is the tag of the BPF program.
+Currently,
+.B BPF_TAG_SIZE
+is defined as 8.
+.RE
+.TP
+.BR PERF_RECORD_CGROUP " (since Linux 5.7)"
+.\" commit 96aaab686505c449e24d76e76507290dcc30e008
+.RS
+This record indicates that a new cgroup is created and activated.
+.PP
+.RS 4
+.EX
+struct {
+ struct perf_event_header header;
+ u64 id;
+ char path[];
+ struct sample_id sample_id;
+};
+.EE
+.RE
+.TP
+.I id
+is the cgroup identifier.
+This can also be retrieved by
+.BR name_to_handle_at (2)
+on the cgroup path (as a file handle).
+.TP
+.I path
+is the path of the cgroup from the root.
+.RE
+.TP
+.BR PERF_RECORD_TEXT_POKE " (since Linux 5.8)"
+.\" commit e17d43b93e544f5016c0251d2074c15568d5d963
+.RS
+This record indicates a change in the kernel text.
+This includes addition and removal of text,
+in which case the corresponding length is zero.
+.PP
+.RS 4
+.EX
+struct {
+ struct perf_event_header header;
+ u64 addr;
+ u16 old_len;
+ u16 new_len;
+ u8 bytes[];
+ struct sample_id sample_id;
+};
+.EE
+.RE
+.TP
+.I addr
+is the address of the change
+.TP
+.I old_len
+is the old length
+.TP
+.I new_len
+is the new length
+.TP
+.I bytes
+contains old bytes immediately followed by new bytes.
+.RE
.RE
.SS Overflow handling
Events can be set to notify when a threshold is crossed,
--
2.28.0
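
The fixed-position namespace array described for PERF_RECORD_NAMESPACES in the patch above could be consumed roughly as follows. This is only a sketch: the index values are copied from the patch text, while `struct ns_link`, `ns_lookup()` and `ns_name()` are hypothetical helper names that do not come from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index values as listed in the PERF_RECORD_NAMESPACES description. */
enum ns_index {
    NET_NS_INDEX    = 0,
    UTS_NS_INDEX    = 1,
    IPC_NS_INDEX    = 2,
    PID_NS_INDEX    = 3,
    USER_NS_INDEX   = 4,
    MNT_NS_INDEX    = 5,
    CGROUP_NS_INDEX = 6,
};

/* Hypothetical local mirror of one { u64 dev, inode } array entry. */
struct ns_link {
    uint64_t dev;
    uint64_t inode;
};

/*
 * Return the entry for idx, or NULL if the record carries fewer
 * entries than idx requires (nr_namespaces comes from the record).
 */
static const struct ns_link *
ns_lookup(const struct ns_link *links, uint64_t nr_namespaces,
          enum ns_index idx)
{
    assert(links != NULL);
    if ((uint64_t)idx >= nr_namespaces)
        return NULL;
    return &links[idx];
}

/* Short label for an index; "?" for anything not in the list above. */
static const char *
ns_name(enum ns_index idx)
{
    switch (idx) {
    case NET_NS_INDEX:    return "net";
    case UTS_NS_INDEX:    return "uts";
    case IPC_NS_INDEX:    return "ipc";
    case PID_NS_INDEX:    return "pid";
    case USER_NS_INDEX:   return "user";
    case MNT_NS_INDEX:    return "mnt";
    case CGROUP_NS_INDEX: return "cgroup";
    default:              return "?";
    }
}
```

Checking the index against nr_namespaces before indexing means a record with a shorter array yields NULL instead of an out-of-bounds read.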

2020-11-13 21:31:18

by Alejandro Colomar

[permalink] [raw]
Subject: [PATCH v4] perf_event_open.2: Update man page with recent changes

From: Namhyung Kim <[email protected]>

There are lots of changes as usual. I've tried to fill some missing
bits in the man page but it'd be nice if you could take a look and put
more info there.

Signed-off-by: Namhyung Kim <[email protected]>
[alx: ffix + tfix]
Cowritten-by : Alejandro Colomar <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
---

Hi Namhyung,

I fixed another typo,
and mainly fixed many formatting changes I introduced
a few days ago because we were discussing
trying to improve the formatting,
but finally decided to continue with the old way.

Cheers,

Alex


man2/perf_event_open.2 | 267 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 265 insertions(+), 2 deletions(-)

diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index e7b0aa132..e1c7789b9 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -247,8 +247,17 @@ struct perf_event_attr {
due to exec */
use_clockid : 1, /* use clockid for time fields */
context_switch : 1, /* context switch data */
+ write_backward : 1, /* Write ring buffer from end
+ to beginning */
+ namespaces : 1, /* include namespaces data */
+ ksymbol : 1, /* include ksymbol events */
+ bpf_event : 1, /* include bpf events */
+ aux_output : 1, /* generate AUX records
+ instead of events */
+ cgroup : 1, /* include cgroup events */
+ text_poke : 1, /* include text poke events */

- __reserved_1 : 37;
+ __reserved_1 : 30;

union {
__u32 wakeup_events; /* wakeup every n events */
@@ -867,6 +876,20 @@ is set higher than zero then the register
values returned are those captured by
hardware at the time of the sampled
instruction's retirement.
+.TP
+.BR PERF_SAMPLE_PHYS_ADDR " (since Linux 4.13)"
+.\" commit fc7ce9c74c3ad232b084d80148654f926d01ece7
+Records the physical address of data, as in
+.BR PERF_SAMPLE_ADDR .
+.TP
+.BR PERF_SAMPLE_CGROUP " (since Linux 5.7)"
+.\" commit 96aaab686505c449e24d76e76507290dcc30e008
+Records the (perf_event) cgroup ID of the process.
+This corresponds to the
+.I id
+field in the
+.B PERF_RECORD_CGROUP
+event.
.RE
.TP
.I read_format
@@ -1202,6 +1225,48 @@ information even with strict
.I perf_event_paranoid
settings.
.TP
+.IR write_backward " (since Linux 4.6)"
+.\" commit 9ecda41acb971ebd07c8fb35faf24005c0baea12
+This causes the ring buffer to be written from end to beginning.
+This is to support reading from an overwritable ring buffer.
+.TP
+.IR namespaces " (since Linux 4.11)"
+.\" commit e422267322cd319e2695a535e47c5b1feeac45eb
+This enables the generation of
+.B PERF_RECORD_NAMESPACES
+records when a task enters a new namespace.
+Each namespace has a combination of device and inode numbers.
+.TP
+.IR ksymbol " (since Linux 5.0)"
+.\" commit 76193a94522f1d4edf2447a536f3f796ce56343b
+This enables the generation of
+.B PERF_RECORD_KSYMBOL
+records when kernel symbols are registered or unregistered.
+This is useful for analyzing dynamic kernel functions like eBPF.
+.TP
+.IR bpf_event " (since Linux 5.0)"
+.\" commit 6ee52e2a3fe4ea35520720736e6791df1fb67106
+This enables the generation of
+.B PERF_RECORD_BPF_EVENT
+records when an eBPF program is loaded or unloaded.
+.TP
+.IR aux_output " (since Linux 5.4)"
+.\" commit ab43762ef010967e4ccd53627f70a2eecbeafefb
+This allows normal (non-AUX) events to generate data for AUX events
+if the hardware supports it.
+.IR cgroup " (since Linux 5.7)"
+.\" commit 96aaab686505c449e24d76e76507290dcc30e008
+This enables the generation of
+.B PERF_RECORD_CGROUP
+records when a new cgroup is created (and activated).
+.TP
+.IR text_poke " (since Linux 5.8)"
+.\" commit e17d43b93e544f5016c0251d2074c15568d5d963
+This enables the generation of
+.B PERF_RECORD_TEXT_POKE
+records when there are changes to the kernel text
+(i.e., self-modifying code).
+.TP
.IR wakeup_events ", " wakeup_watermark
This union sets how many samples
.RI ( wakeup_events )
@@ -2131,7 +2196,7 @@ struct {
u64 nr; /* if PERF_SAMPLE_CALLCHAIN */
u64 ips[nr]; /* if PERF_SAMPLE_CALLCHAIN */
u32 size; /* if PERF_SAMPLE_RAW */
- char data[size]; /* if PERF_SAMPLE_RAW */
+ char data[size]; /* if PERF_SAMPLE_RAW */
u64 bnr; /* if PERF_SAMPLE_BRANCH_STACK */
struct perf_branch_entry lbr[bnr];
/* if PERF_SAMPLE_BRANCH_STACK */
@@ -2148,6 +2213,8 @@ struct {
u64 abi; /* if PERF_SAMPLE_REGS_INTR */
u64 regs[weight(mask)];
/* if PERF_SAMPLE_REGS_INTR */
+ u64 phys_addr; /* if PERF_SAMPLE_PHYS_ADDR */
+ u64 cgroup; /* if PERF_SAMPLE_CGROUP */
};
.EE
.in
@@ -2776,6 +2843,202 @@ or next (if switching out) process on the CPU.
The thread ID of the previous (if switching in)
or next (if switching out) thread on the CPU.
.RE
+.TP
+.BR PERF_RECORD_NAMESPACES " (since Linux 4.11)"
+.\" commit e422267322cd319e2695a535e47c5b1feeac45eb
+This record includes various namespace information of a process.
+.IP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u32 pid;
+ u32 tid;
+ u64 nr_namespaces;
+ struct { u64 dev, inode } [nr_namespaces];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.RS
+.TP
+.I pid
+is the process ID
+.TP
+.I tid
+is the thread ID
+.TP
+.I nr_namespaces
+is the number of namespaces in this record
+.RE
+.IP
+Each namespace has
+.I dev
+and
+.I inode
+fields and is recorded at a
+fixed position, as listed below:
+.RS
+.TP
+.BR NET_NS_INDEX = 0
+Network namespace
+.TP
+.BR UTS_NS_INDEX = 1
+UTS namespace
+.TP
+.BR IPC_NS_INDEX = 2
+IPC namespace
+.TP
+.BR PID_NS_INDEX = 3
+PID namespace
+.TP
+.BR USER_NS_INDEX = 4
+User namespace
+.TP
+.BR MNT_NS_INDEX = 5
+Mount namespace
+.TP
+.BR CGROUP_NS_INDEX = 6
+Cgroup namespace
+.RE
+.TP
+.BR PERF_RECORD_KSYMBOL " (since Linux 5.0)"
+.\" commit 76193a94522f1d4edf2447a536f3f796ce56343b
+This record indicates kernel symbol register/unregister events.
+.IP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u64 addr;
+ u32 len;
+ u16 ksym_type;
+ u16 flags;
+ char name[];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.RS
+.TP
+.I addr
+is the address of the kernel symbol
+.TP
+.I len
+is the length of the kernel symbol
+.TP
+.I ksym_type
+is the type of the kernel symbol.
+Currently, the following types are available:
+.RS
+.TP
+.B PERF_RECORD_KSYMBOL_TYPE_BPF
+The kernel symbol is a BPF function.
+.RE
+.TP
+.I flags
+If the
+.B PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER
+is set, then this event is for unregistering the kernel symbol.
+.RE
+.TP
+.BR PERF_RECORD_BPF_EVENT " (since Linux 5.0)"
+.\" commit 6ee52e2a3fe4ea35520720736e6791df1fb67106
+This record indicates that a BPF program is loaded or unloaded.
+.IP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u16 type;
+ u16 flags;
+ u32 id;
+ u8 tag[BPF_TAG_SIZE];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.RS
+.TP
+.I type
+is one of the following values:
+.RS
+.TP
+.B PERF_BPF_EVENT_PROG_LOAD
+A BPF program is loaded
+.TP
+.B PERF_BPF_EVENT_PROG_UNLOAD
+A BPF program is unloaded
+.RE
+.TP
+.I id
+is the id of the BPF program.
+.TP
+.I tag
+is the tag of the BPF program.
+Currently,
+.B BPF_TAG_SIZE
+is defined as 8.
+.RE
+.TP
+.BR PERF_RECORD_CGROUP " (since Linux 5.7)"
+.\" commit 96aaab686505c449e24d76e76507290dcc30e008
+This record indicates that a new cgroup is created and activated.
+.IP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u64 id;
+ char path[];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.RS
+.TP
+.I id
+is the cgroup identifier.
+This can also be retrieved by
+.BR name_to_handle_at (2)
+on the cgroup path (as a file handle).
+.TP
+.I path
+is the path of the cgroup from the root.
+.RE
+.TP
+.BR PERF_RECORD_TEXT_POKE " (since Linux 5.8)"
+.\" commit e17d43b93e544f5016c0251d2074c15568d5d963
+This record indicates a change in the kernel text.
+This includes addition and removal of text,
+in which case the corresponding length is zero.
+.IP
+.in +4n
+.EX
+struct {
+ struct perf_event_header header;
+ u64 addr;
+ u16 old_len;
+ u16 new_len;
+ u8 bytes[];
+ struct sample_id sample_id;
+};
+.EE
+.in
+.RS
+.TP
+.I addr
+is the address of the change
+.TP
+.I old_len
+is the old length
+.TP
+.I new_len
+is the new length
+.TP
+.I bytes
+contains old bytes immediately followed by new bytes.
+.RE
.RE
.SS Overflow handling
Events can be set to notify when a threshold is crossed,
--
2.28.0
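
Since the PERF_RECORD_TEXT_POKE description says that `bytes` holds the old bytes immediately followed by the new bytes, a consumer might split the payload as sketched below. `struct text_poke_view` and the two helpers are hypothetical names for illustration only; the sketch assumes the caller has already located `bytes` inside the record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the fixed fields plus a pointer to bytes[]. */
struct text_poke_view {
    uint64_t       addr;     /* address of the change */
    uint16_t       old_len;  /* length of the old text, may be 0 */
    uint16_t       new_len;  /* length of the new text, may be 0 */
    const uint8_t *bytes;    /* old_len bytes, then new_len bytes */
};

/* Old text: the first old_len bytes; NULL when text was added. */
static const uint8_t *
text_poke_old(const struct text_poke_view *tp, size_t *len)
{
    assert(tp != NULL && len != NULL);
    *len = tp->old_len;
    return tp->old_len ? tp->bytes : NULL;
}

/* New text: starts right after the old bytes; NULL when text was removed. */
static const uint8_t *
text_poke_new(const struct text_poke_view *tp, size_t *len)
{
    assert(tp != NULL && len != NULL);
    *len = tp->new_len;
    return tp->new_len ? tp->bytes + tp->old_len : NULL;
}
```

A zero length on either side corresponds to the addition or removal case mentioned in the record description.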

2020-11-16 16:21:37

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v4] perf_event_open.2: Update man page with recent changes

Hello Alex,

On Sat, Nov 14, 2020 at 6:28 AM Alejandro Colomar
<[email protected]> wrote:
>
> From: Namhyung Kim <[email protected]>
>
> There are lots of changes as usual. I've tried to fill some missing
> bits in the man page but it'd be nice if you could take a look and put
> more info there.
>
> Signed-off-by: Namhyung Kim <[email protected]>
> [alx: ffix + tfix]
> Cowritten-by : Alejandro Colomar <[email protected]>
> Signed-off-by: Alejandro Colomar <[email protected]>
> ---
>
> Hi Namhyung,
>
> I fixed another typo,
> and mainly fixed many formatting changes I introduced
> a few days ago because we were discussing
> trying to improve the formatting,
> but finally decided to continue with the old way.

Thanks a lot for fixing them!

I also found some broken formatting below and would like
to add more description of PERF_RECORD_SAMPLE.

>
> man2/perf_event_open.2 | 267 ++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 265 insertions(+), 2 deletions(-)
>
> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
> index e7b0aa132..e1c7789b9 100644
> --- a/man2/perf_event_open.2
> +++ b/man2/perf_event_open.2
> @@ -247,8 +247,17 @@ struct perf_event_attr {
> due to exec */
> use_clockid : 1, /* use clockid for time fields */
> context_switch : 1, /* context switch data */
> + write_backward : 1, /* Write ring buffer from end
> + to beginning */
> + namespaces : 1, /* include namespaces data */
> + ksymbol : 1, /* include ksymbol events */
> + bpf_event : 1, /* include bpf events */
> + aux_output : 1, /* generate AUX records
> + instead of events */
> + cgroup : 1, /* include cgroup events */
> + text_poke : 1, /* include text poke events */
>
> - __reserved_1 : 37;
> + __reserved_1 : 30;
>
> union {
> __u32 wakeup_events; /* wakeup every n events */
> @@ -867,6 +876,20 @@ is set higher than zero then the register
> values returned are those captured by
> hardware at the time of the sampled
> instruction's retirement.
> +.TP
> +.BR PERF_SAMPLE_PHYS_ADDR " (since Linux 4.13)"
> +.\" commit fc7ce9c74c3ad232b084d80148654f926d01ece7
> +Records the physical address of data, as in
> +.BR PERF_SAMPLE_ADDR .
> +.TP
> +.BR PERF_SAMPLE_CGROUP " (since Linux 5.7)"
> +.\" commit 96aaab686505c449e24d76e76507290dcc30e008
> +Records the (perf_event) cgroup ID of the process.
> +This corresponds to the
> +.I id
> +field in the
> +.B PERF_RECORD_CGROUP
> +event.
> .RE
> .TP
> .I read_format
> @@ -1202,6 +1225,48 @@ information even with strict
> .I perf_event_paranoid
> settings.
> .TP
> +.IR write_backward " (since Linux 4.6)"
> +.\" commit 9ecda41acb971ebd07c8fb35faf24005c0baea12
> +This causes the ring buffer to be written from end to beginning.
> +This is to support reading from an overwritable ring buffer.
> +.TP
> +.IR namespaces " (since Linux 4.11)"
> +.\" commit e422267322cd319e2695a535e47c5b1feeac45eb
> +This enables the generation of
> +.B PERF_RECORD_NAMESPACES
> +records when a task enters a new namespace.
> +Each namespace has a combination of device and inode numbers.
> +.TP
> +.IR ksymbol " (since Linux 5.0)"
> +.\" commit 76193a94522f1d4edf2447a536f3f796ce56343b
> +This enables the generation of
> +.B PERF_RECORD_KSYMBOL
> +records when kernel symbols are registered or unregistered.
> +This is useful for analyzing dynamic kernel functions like eBPF.
> +.TP
> +.IR bpf_event " (since Linux 5.0)"
> +.\" commit 6ee52e2a3fe4ea35520720736e6791df1fb67106
> +This enables the generation of
> +.B PERF_RECORD_BPF_EVENT
> +records when an eBPF program is loaded or unloaded.
> +.TP
> +.IR aux_output " (since Linux 5.4)"
> +.\" commit ab43762ef010967e4ccd53627f70a2eecbeafefb
> +This allows normal (non-AUX) events to generate data for AUX events
> +if the hardware supports it.

.TP

> +.IR cgroup " (since Linux 5.7)"
> +.\" commit 96aaab686505c449e24d76e76507290dcc30e008
> +This enables the generation of
> +.B PERF_RECORD_CGROUP
> +records when a new cgroup is created (and activated).
> +.TP
> +.IR text_poke " (since Linux 5.8)"
> +.\" commit e17d43b93e544f5016c0251d2074c15568d5d963
> +This enables the generation of
> +.B PERF_RECORD_TEXT_POKE
> +records when there are changes to the kernel text
> +(i.e., self-modifying code).
> +.TP
> .IR wakeup_events ", " wakeup_watermark
> This union sets how many samples
> .RI ( wakeup_events )
> @@ -2131,7 +2196,7 @@ struct {
> u64 nr; /* if PERF_SAMPLE_CALLCHAIN */
> u64 ips[nr]; /* if PERF_SAMPLE_CALLCHAIN */
> u32 size; /* if PERF_SAMPLE_RAW */
> - char data[size]; /* if PERF_SAMPLE_RAW */
> + char data[size]; /* if PERF_SAMPLE_RAW */
> u64 bnr; /* if PERF_SAMPLE_BRANCH_STACK */
> struct perf_branch_entry lbr[bnr];
> /* if PERF_SAMPLE_BRANCH_STACK */
> @@ -2148,6 +2213,8 @@ struct {
> u64 abi; /* if PERF_SAMPLE_REGS_INTR */
> u64 regs[weight(mask)];
> /* if PERF_SAMPLE_REGS_INTR */
> + u64 phys_addr; /* if PERF_SAMPLE_PHYS_ADDR */
> + u64 cgroup; /* if PERF_SAMPLE_CGROUP */

I think I should add descriptions for these fields too:

.TP
.I phys_addr
If the
.B PERF_SAMPLE_PHYS_ADDR
flag is set, then the 64-bit physical address is recorded.
.TP
.I cgroup
If the
.B PERF_SAMPLE_CGROUP
flag is set, then the 64-bit cgroup ID (for the perf_event subsystem) is recorded.
To get the pathname of the cgroup, the ID should match the one in a
.B PERF_RECORD_CGROUP
record.
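
As a rough illustration of how a ring-buffer reader might pick up these two fields, here is a sketch that reads them in the order they appear in the struct above. It assumes the caller has already skipped every earlier optional field of the sample, and it takes plain booleans instead of testing `sample_type` directly; `struct sample_tail` and `read_sample_tail()` are hypothetical names, not part of any header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical holder for the two trailing sample fields. */
struct sample_tail {
    uint64_t phys_addr;  /* valid only if PERF_SAMPLE_PHYS_ADDR was set */
    uint64_t cgroup;     /* valid only if PERF_SAMPLE_CGROUP was set */
};

/*
 * Read the optional fields in struct order (phys_addr, then cgroup).
 * Returns the number of bytes consumed, or -1 if the buffer is too short.
 */
static long
read_sample_tail(const uint8_t *p, size_t avail,
                 bool has_phys_addr, bool has_cgroup,
                 struct sample_tail *out)
{
    size_t off = 0;

    assert(p != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    if (has_phys_addr) {
        if (avail - off < sizeof(uint64_t))
            return -1;
        /* memcpy avoids unaligned access on the raw buffer. */
        memcpy(&out->phys_addr, p + off, sizeof(uint64_t));
        off += sizeof(uint64_t);
    }
    if (has_cgroup) {
        if (avail - off < sizeof(uint64_t))
            return -1;
        memcpy(&out->cgroup, p + off, sizeof(uint64_t));
        off += sizeof(uint64_t);
    }
    return (long)off;
}
```

The resulting `cgroup` value is what would then be matched against the `id` of a PERF_RECORD_CGROUP record to recover the path.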

Thanks,
Namhyung



2020-11-17 02:00:33

by Alejandro Colomar

[permalink] [raw]
Subject: Re: [PATCH v4] perf_event_open.2: Update man page with recent changes


On 11/16/20 5:17 PM, Namhyung Kim wrote:
> Hello Alex,
>
> On Sat, Nov 14, 2020 at 6:28 AM Alejandro Colomar
> <[email protected]> wrote:
>>
>> From: Namhyung Kim <[email protected]>
>>
>> There are lots of changes as usual. I've tried to fill some missing
>> bits in the man page but it'd be nice if you could take a look and put
>> more info there.
>>
>> Signed-off-by: Namhyung Kim <[email protected]>
>> [alx: ffix + tfix]
>> Cowritten-by : Alejandro Colomar <[email protected]>
>> Signed-off-by: Alejandro Colomar <[email protected]>
>> ---
>>
>> Hi Namhyung,
>>
>> I fixed another typo,
>> and mainly fixed many formatting changes I introduced
>> a few days ago because we were discussing
>> trying to improve the formatting,
>> but finally decided to continue with the old way.
>
> Thanks a lot for fixing them!
>
> I also found some broken formatting below and would like
> to add more description of PERF_RECORD_SAMPLE.

Hi Namhyung,

Fine, could you send an updated patch with the changes?

Thanks,

Alex