2020-10-22 05:19:05

by Alexey Budankov

Subject: [PATCH v2 08/15] perf record: write trace data into mmap trace files


Write trace buffer data into per-mmap trace files located
in the data directory. The streaming thread adjusts its affinity
according to the mask of the buffer being processed.

Signed-off-by: Alexey Budankov <[email protected]>
---
tools/perf/builtin-record.c | 44 ++++++++++++++++++++++++++++++++-----
tools/perf/util/record.h | 1 +
2 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 619aaee11231..ba26d75c51d6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -120,6 +120,11 @@ static const char *affinity_tags[PERF_AFFINITY_MAX] = {
 	"SYS", "NODE", "CPU"
 };

+static int record__threads_enabled(struct record *rec)
+{
+	return rec->opts.threads;
+}
+
 static bool switch_output_signal(struct record *rec)
 {
 	return rec->switch_output.signal &&
@@ -894,6 +899,20 @@ static int record__mmap_evlist(struct record *rec,
return -EINVAL;
}
}
+
+	if (record__threads_enabled(rec)) {
+		int i, ret, nr = evlist->core.nr_mmaps;
+		struct mmap *mmaps = rec->opts.overwrite ?
+				     evlist->overwrite_mmap : evlist->mmap;
+
+		ret = perf_data__create_dir(&rec->data, evlist->core.nr_mmaps);
+		if (ret)
+			return ret;
+
+		for (i = 0; i < nr; i++)
+			mmaps[i].file = &rec->data.dir.files[i];
+	}
+
 	return 0;
 }

@@ -1184,8 +1203,12 @@ static int record__mmap_read_evlist(struct record *rec, struct evlist *evlist,
 	/*
 	 * Mark the round finished in case we wrote
 	 * at least one event.
+	 *
+	 * No need for round events in directory mode,
+	 * because per-cpu maps and files have data
+	 * sorted by kernel.
 	 */
-	if (bytes_written != rec->bytes_written)
+	if (!record__threads_enabled(rec) && bytes_written != rec->bytes_written)
 		rc = record__write(rec, NULL, &finished_round_event, sizeof(finished_round_event));

 	if (overwrite)
@@ -1231,7 +1254,9 @@ static void record__init_features(struct record *rec)
 	if (!rec->opts.use_clockid)
 		perf_header__clear_feat(&session->header, HEADER_CLOCK_DATA);

-	perf_header__clear_feat(&session->header, HEADER_DIR_FORMAT);
+	if (!record__threads_enabled(rec))
+		perf_header__clear_feat(&session->header, HEADER_DIR_FORMAT);
+
 	if (!record__comp_enabled(rec))
 		perf_header__clear_feat(&session->header, HEADER_COMPRESSED);

@@ -1242,15 +1267,21 @@ static void
 record__finish_output(struct record *rec)
 {
 	struct perf_data *data = &rec->data;
-	int fd = perf_data__fd(data);
+	int i, fd = perf_data__fd(data);

 	if (data->is_pipe)
 		return;

 	rec->session->header.data_size += rec->bytes_written;
 	data->file.size = lseek(perf_data__fd(data), 0, SEEK_CUR);
+	if (record__threads_enabled(rec)) {
+		for (i = 0; i < data->dir.nr; i++)
+			data->dir.files[i].size = lseek(data->dir.files[i].fd, 0, SEEK_CUR);
+	}

 	if (!rec->no_buildid) {
+		/* this will be recalculated during process_buildids() */
+		rec->samples = 0;
 		process_buildids(rec);

 		if (rec->buildid_all)
@@ -2041,8 +2072,6 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
status = err;

 	record__synthesize(rec, true);
-	/* this will be recalculated during process_buildids() */
-	rec->samples = 0;

 	if (!err) {
 		if (!rec->timestamp_filename) {
@@ -2680,9 +2709,12 @@ int cmd_record(int argc, const char **argv)

}

-	if (rec->opts.kcore)
+	if (rec->opts.kcore || record__threads_enabled(rec))
 		rec->data.is_dir = true;

+	if (record__threads_enabled(rec))
+		rec->opts.affinity = PERF_AFFINITY_CPU;
+
 	if (rec->opts.comp_level != 0) {
 		pr_debug("Compression enabled, disabling build id collection at the end of the session.\n");
 		rec->no_buildid = true;
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index 266760ac9143..aeda3cdaa3e9 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -74,6 +74,7 @@ struct record_opts {
 	int ctl_fd;
 	int ctl_fd_ack;
 	bool ctl_fd_close;
+	bool threads;
 };

extern const char * const *record_usage;
--
2.24.1
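
A note on the idiom used in record__finish_output() above: lseek(fd, 0, SEEK_CUR) does not move the file position and returns the current offset, which equals the number of bytes written when a file has been written sequentially from the start. The sketch below is only an illustration under stated assumptions: the helper names (current_offset, offset_after_write) and the temporary file are hypothetical and not part of perf.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not a perf function: current offset of fd, or -1 on error. */
off_t current_offset(int fd)
{
	return lseek(fd, 0, SEEK_CUR);
}

/*
 * Illustration only: write len bytes to a fresh temporary file and
 * report the file offset afterwards. Returns -1 on any failure.
 */
off_t offset_after_write(const char *buf, size_t len)
{
	char path[] = "/tmp/offset-demo-XXXXXX";
	off_t off = -1;
	int fd;

	assert(buf != NULL);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;

	if (write(fd, buf, len) == (ssize_t)len)
		off = current_offset(fd);

	close(fd);
	unlink(path);
	return off;
}
```

On a pipe lseek() fails with ESPIPE, which is consistent with the early return for data->is_pipe before the sizes are recorded.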



2020-10-24 15:51:19

by Jiri Olsa

Subject: Re: [PATCH v2 08/15] perf record: write trace data into mmap trace files

On Wed, Oct 21, 2020 at 07:02:56PM +0300, Alexey Budankov wrote:

SNIP

>
> record__synthesize(rec, true);
> - /* this will be recalculated during process_buildids() */
> - rec->samples = 0;
>
> if (!err) {
> if (!rec->timestamp_filename) {
> @@ -2680,9 +2709,12 @@ int cmd_record(int argc, const char **argv)
>
> }
>
> - if (rec->opts.kcore)
> + if (rec->opts.kcore || record__threads_enabled(rec))
> rec->data.is_dir = true;
>
> + if (record__threads_enabled(rec))
> + rec->opts.affinity = PERF_AFFINITY_CPU;

so all the threads will pin to cpu and back before reading?
it makes sense for one thread, but why not pin every thread
at the start?

jirka
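
The alternative raised in this reply, pinning each streaming thread once when it starts, could look roughly like the sketch below. It is not taken from the patch: the helper names (first_allowed_cpu, pin_self_to_cpu) are hypothetical, and only sched_getaffinity() and sched_setaffinity() are real glibc calls. Passing pid 0 applies the mask to the calling thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for cpu_set_t and the CPU_* macros */
#endif
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: lowest CPU the calling thread may run on, or -1. */
int first_allowed_cpu(void)
{
	cpu_set_t set;
	int cpu;

	if (sched_getaffinity(0, sizeof(set), &set))
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &set))
			return cpu;

	return -1;
}

/*
 * Hypothetical helper: pin the calling thread to a single CPU.
 * Returns 0 on success, -1 on failure (errno set by the kernel).
 */
int pin_self_to_cpu(int cpu)
{
	cpu_set_t set;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}
```

A streaming thread would call pin_self_to_cpu() once at the top of its thread function, with the CPU of the mmap it serves, so no affinity check is needed on each read.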

2020-10-26 10:47:03

by Jiri Olsa

Subject: Re: [PATCH v2 08/15] perf record: write trace data into mmap trace files

On Mon, Oct 26, 2020 at 11:52:21AM +0300, Alexei Budankov wrote:
>
> On 24.10.2020 18:44, Jiri Olsa wrote:
> > On Wed, Oct 21, 2020 at 07:02:56PM +0300, Alexey Budankov wrote:
> >
> > SNIP
> >
> >>
> >> record__synthesize(rec, true);
> >> - /* this will be recalculated during process_buildids() */
> >> - rec->samples = 0;
> >>
> >> if (!err) {
> >> if (!rec->timestamp_filename) {
> >> @@ -2680,9 +2709,12 @@ int cmd_record(int argc, const char **argv)
> >>
> >> }
> >>
> >> - if (rec->opts.kcore)
> >> + if (rec->opts.kcore || record__threads_enabled(rec))
> >> rec->data.is_dir = true;
> >>
> >> + if (record__threads_enabled(rec))
> >> + rec->opts.affinity = PERF_AFFINITY_CPU;
> >
> > so all the threads will pin to cpu and back before reading?
>
> No, they will not go back. The thread mask is compared to the mmap mask
> before the read, and the thread migrates if the masks don't match. This
> happens once, on the first mmap read, so explicit pinning can be avoided.

hum, is that right? the check in record__adjust_affinity
is checking global 'rec->affinity_mask', at least I assume
it's still global ;-)

if (rec->opts.affinity != PERF_AFFINITY_SYS &&
!bitmap_equal(rec->affinity_mask.bits, map->affinity_mask.bits,
rec->affinity_mask.nbits)) {

I think this can never be equal if you have more than one map

when I check on sched_setaffinity syscalls:

# perf trace -e syscalls:sys_enter_sched_setaffinity

while running record --threads, I see sched_setaffinity
calls all the time

jirka

2020-10-26 11:52:55

by Alexei Budankov

Subject: Re: [PATCH v2 08/15] perf record: write trace data into mmap trace files


On 24.10.2020 18:44, Jiri Olsa wrote:
> On Wed, Oct 21, 2020 at 07:02:56PM +0300, Alexey Budankov wrote:
>
> SNIP
>
>>
>> record__synthesize(rec, true);
>> - /* this will be recalculated during process_buildids() */
>> - rec->samples = 0;
>>
>> if (!err) {
>> if (!rec->timestamp_filename) {
>> @@ -2680,9 +2709,12 @@ int cmd_record(int argc, const char **argv)
>>
>> }
>>
>> - if (rec->opts.kcore)
>> + if (rec->opts.kcore || record__threads_enabled(rec))
>> rec->data.is_dir = true;
>>
>> + if (record__threads_enabled(rec))
>> + rec->opts.affinity = PERF_AFFINITY_CPU;
>
> so all the threads will pin to cpu and back before reading?

No, they will not go back. The thread mask is compared to the mmap mask
before the read, and the thread migrates if the masks don't match. This
happens once, on the first mmap read, so explicit pinning can be avoided.

Alexei

2020-10-26 14:22:26

by Alexei Budankov

Subject: Re: [PATCH v2 08/15] perf record: write trace data into mmap trace files


On 26.10.2020 13:32, Jiri Olsa wrote:
> On Mon, Oct 26, 2020 at 11:52:21AM +0300, Alexei Budankov wrote:
>>
>> On 24.10.2020 18:44, Jiri Olsa wrote:
>>> On Wed, Oct 21, 2020 at 07:02:56PM +0300, Alexey Budankov wrote:
>>>
>>> SNIP
>>>
>>>>
>>>> record__synthesize(rec, true);
>>>> - /* this will be recalculated during process_buildids() */
>>>> - rec->samples = 0;
>>>>
>>>> if (!err) {
>>>> if (!rec->timestamp_filename) {
>>>> @@ -2680,9 +2709,12 @@ int cmd_record(int argc, const char **argv)
>>>>
>>>> }
>>>>
>>>> - if (rec->opts.kcore)
>>>> + if (rec->opts.kcore || record__threads_enabled(rec))
>>>> rec->data.is_dir = true;
>>>>
>>>> + if (record__threads_enabled(rec))
>>>> + rec->opts.affinity = PERF_AFFINITY_CPU;
>>>
>>> so all the threads will pin to cpu and back before reading?
>>
>> No, they will not go back. The thread mask is compared to the mmap mask
>> before the read, and the thread migrates if the masks don't match. This
>> happens once, on the first mmap read, so explicit pinning can be avoided.
>
> hum, is that right? the check in record__adjust_affinity
> is checking global 'rec->affinity_mask', at least I assume
> it's still global ;-)

Yes, rec->affinity_mask should also be per-thread. Good catch. Thanks!

Alexei

>
> if (rec->opts.affinity != PERF_AFFINITY_SYS &&
> !bitmap_equal(rec->affinity_mask.bits, map->affinity_mask.bits,
> rec->affinity_mask.nbits)) {
>
> I think this can never be equal if you have more than one map
>
> when I check on sched_setaffinity syscalls:
>
> # perf trace -e syscalls:sys_enter_sched_setaffinity
>
> while running record --threads, I see sched_setaffinity
> calls all the time
>
> jirka
>
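
The effect discussed in this thread can be reproduced with a small model. This is a simplified sketch, not perf code: masks are plain 64-bit integers with one bit per CPU, each map is assumed to be bound to a distinct CPU, the reads are interleaved sequentially instead of running in real threads, and all names (adjust_affinity, migrations_shared_mask, migrations_per_thread_mask, NR_MAPS) are hypothetical. A "migration" stands for the point where real code would call sched_setaffinity().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_MAPS 4	/* assumed number of mmaps, one per CPU */

/*
 * Same shape as the quoted check: if the cached mask differs from the
 * map's mask, remember the new mask and report that a migration
 * (a sched_setaffinity() call in real code) would happen.
 */
static bool adjust_affinity(uint64_t *cached, uint64_t map_mask)
{
	if (*cached == map_mask)
		return false;

	*cached = map_mask;
	return true;
}

/* One cached mask shared by the readers of all maps. */
int migrations_shared_mask(int rounds)
{
	uint64_t shared = 0;
	int migrations = 0;

	assert(rounds > 0);

	for (int r = 0; r < rounds; r++)
		for (int i = 0; i < NR_MAPS; i++)
			migrations += adjust_affinity(&shared, UINT64_C(1) << i);

	return migrations;
}

/* One cached mask per reader, as agreed at the end of the thread. */
int migrations_per_thread_mask(int rounds)
{
	uint64_t cached[NR_MAPS] = { 0 };
	int migrations = 0;

	assert(rounds > 0);

	for (int r = 0; r < rounds; r++)
		for (int i = 0; i < NR_MAPS; i++)
			migrations += adjust_affinity(&cached[i], UINT64_C(1) << i);

	return migrations;
}
```

With a shared mask every read sees a mask left behind by a different map, so the count grows with the number of reads. With per-thread masks each reader migrates once and the count stays at NR_MAPS.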