2021-08-21 09:44:39

by Riccardo Mancini

Subject: [GSoC] Multi-threading in perf: Final Report

Hi,

this is the final report of my project "Multi-threading in perf",
developed as part of the Google Summer of Code with the Linux Foundation.
https://summerofcode.withgoogle.com/projects/#4670070929752064

The final outcome of my project is a new utility library in perf for
asynchronous execution of tasks. This new abstraction follows the kernel's
workqueue API. This utility is then used to replace the manual
perf synthesis threads and to add multithreading to the evlist
operations (open, enable, disable, and close lists of perf events).
At the moment, not all features of the kernel workqueue are supported
(e.g. no work chaining), since I focused on what I needed to implement
the use cases above. Hopefully, in the future, the library will be
tested and improved, and could then be moved to tools/lib for other
tools to use.
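
As a rough illustration of the queue-then-flush pattern described above,
here is a minimal pthread-based sketch. It is not the code from the
patchset: every sketch_* name is hypothetical, the pool size is fixed at
creation, and most error handling is reduced to abort() for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define SKETCH_MAX_THREADS 16

/* All sketch_* names are hypothetical; this is not the perf workqueue API. */
struct sketch_work {
	void (*func)(struct sketch_work *work);
	struct sketch_work *next;
};

struct sketch_wq {
	pthread_mutex_t lock;
	pthread_cond_t cond;		/* broadcast on every state change */
	struct sketch_work *head, **tail;
	int pending, stop, nr_threads;	/* pending: queued plus running items */
	pthread_t threads[SKETCH_MAX_THREADS];
};

static void *sketch_worker(void *arg)
{
	struct sketch_wq *wq = arg;
	struct sketch_work *work;

	pthread_mutex_lock(&wq->lock);
	for (;;) {
		while (!wq->head && !wq->stop)
			pthread_cond_wait(&wq->cond, &wq->lock);
		if (!wq->head)
			break;		/* stop requested and queue drained */
		work = wq->head;
		wq->head = work->next;
		if (!wq->head)
			wq->tail = &wq->head;
		pthread_mutex_unlock(&wq->lock);
		work->func(work);	/* run the item without holding the lock */
		pthread_mutex_lock(&wq->lock);
		if (--wq->pending == 0)
			pthread_cond_broadcast(&wq->cond);
	}
	pthread_mutex_unlock(&wq->lock);
	return NULL;
}

static struct sketch_wq *sketch_wq_create(int nr_threads)
{
	struct sketch_wq *wq = calloc(1, sizeof(*wq));

	assert(wq && nr_threads >= 1 && nr_threads <= SKETCH_MAX_THREADS);
	wq->tail = &wq->head;
	pthread_mutex_init(&wq->lock, NULL);
	pthread_cond_init(&wq->cond, NULL);
	for (; wq->nr_threads < nr_threads; wq->nr_threads++)
		if (pthread_create(&wq->threads[wq->nr_threads], NULL, sketch_worker, wq))
			abort();
	return wq;
}

static void sketch_queue_work(struct sketch_wq *wq, struct sketch_work *work)
{
	work->next = NULL;
	pthread_mutex_lock(&wq->lock);
	*wq->tail = work;		/* append at the tail: FIFO order */
	wq->tail = &work->next;
	wq->pending++;
	pthread_cond_broadcast(&wq->cond);
	pthread_mutex_unlock(&wq->lock);
}

static void sketch_flush(struct sketch_wq *wq)
{
	pthread_mutex_lock(&wq->lock);
	while (wq->pending)		/* wait until every queued item has run */
		pthread_cond_wait(&wq->cond, &wq->lock);
	pthread_mutex_unlock(&wq->lock);
}

static void sketch_wq_destroy(struct sketch_wq *wq)
{
	pthread_mutex_lock(&wq->lock);
	wq->stop = 1;
	pthread_cond_broadcast(&wq->cond);
	pthread_mutex_unlock(&wq->lock);
	for (int i = 0; i < wq->nr_threads; i++)
		pthread_join(wq->threads[i], NULL);
	free(wq);
}

struct sketch_item {
	struct sketch_work work;	/* first member, so the cast below is valid */
	int done;
};

static void sketch_mark_done(struct sketch_work *work)
{
	((struct sketch_item *)work)->done = 1;
}
```

A caller would embed struct sketch_work in its own item type (as
struct sketch_item does), set work.func, call sketch_queue_work() for
each item, and then call sketch_flush() before reading the results. The
embedded-work layout mirrors the kernel's convention; the single
condition variable is a simplification chosen to keep the sketch short.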

The results I obtained from testing on my machines are promising, but the
patchset needs a more thorough test on bigger machines, since that is
the use case for which parallelization is being introduced in the first
place.

I would be very happy to continue working on this in the future to help my
project get merged, once it is deemed ready. However, in the next few weeks I
will not have enough time to do so, since I need to complete a couple of things
before graduation.

Below you can find a breakdown of my activities during GSoC with links to
the original emails on lkml (refer to those for more details about the
workqueue and how to use it), i.e.:
- the main project patchsets I sent in the last few days;
- other patches I sent during the GSoC, mainly fixing memory bugs;
- review activity.

Finally, I would like to thank my mentors -- Ian, Arnaldo, and Namhyung --
for all the precious time they dedicated to me, for their useful
suggestions, and for the overall GSoC experience, which was really great.

Thanks,
Riccardo

---

Project patchsets:
PATCHSET perf: add workqueue library and use it in synthetic-events
Status: v3
Link: https://lore.kernel.org/lkml/[email protected]/
[01/15] perf workqueue: threadpool creation and destruction
[02/15] perf tests: add test for workqueue
[03/15] perf workqueue: add threadpool start and stop functions
[04/15] perf workqueue: add threadpool execute and wait functions
[05/15] tools: add sparse context/locking annotations in compiler-types.h
[06/15] perf workqueue: introduce workqueue struct
[07/15] perf workqueue: implement worker thread and management
[08/15] perf workqueue: add queue_work and flush_workqueue functions
[09/15] perf workqueue: spinup threads when needed
[10/15] perf workqueue: create global workqueue
[11/15] perf workqueue: add utility to execute a for loop in parallel
[12/15] perf record: setup global workqueue
[13/15] perf top: setup global workqueue
[14/15] perf test/synthesis: setup global workqueue
[15/15] perf synthetic-events: use workqueue parallel_for
PATCHSET perf: use workqueue for evlist operations
Status: v1, not reviewed
Link: https://lore.kernel.org/lkml/[email protected]/
[01/37] libperf cpumap: improve idx function
[02/37] libperf cpumap: improve max function
[03/37] perf evlist: replace evsel__cpu_iter* functions with evsel__find_cpu
[04/37] perf util: add mmap_cpu_mask__duplicate function
[05/37] perf util/mmap: add missing bitops.h header
[06/37] perf workqueue: add affinities to threadpool
[07/37] perf workqueue: add support for setting affinities to workers
[08/37] perf workqueue: add method to execute work on specific CPU
[09/37] perf python: add workqueue dependency
[10/37] perf evlist: add multithreading helper
[11/37] perf evlist: add multithreading to evlist__disable
[12/37] perf evlist: add multithreading to evlist__enable
[13/37] perf evlist: add multithreading to evlist__close
[14/37] perf evsel: remove retry_sample_id goto label
[15/37] perf evsel: separate open preparation from open itself
[16/37] perf evsel: save open flags in evsel
[17/37] perf evsel: separate missing feature disabling from evsel__open_cpu
[18/37] perf evsel: add evsel__prepare_open function
[19/37] perf evsel: separate missing feature detection from evsel__open_cpu
[20/37] perf evsel: separate rlimit increase from evsel__open_cpu
[21/37] perf evsel: move ignore_missing_thread to fallback code
[22/37] perf evsel: move test_attr__open to success path in evsel__open_cpu
[23/37] perf evsel: move bpf_counter__install_pe to success path in evsel__open_cpu
[24/37] perf evsel: handle precise_ip fallback in evsel__open_cpu
[25/37] perf evsel: move event open in evsel__open_cpu to separate function
[26/37] perf evsel: add evsel__open_per_cpu_no_fallback function
[27/37] perf evlist: add evlist__for_each_entry_from macro
[28/37] perf evlist: add multithreading to evlist__open
[29/37] perf evlist: add custom fallback to evlist__open
[30/37] perf record: use evlist__open_custom
[31/37] tools lib/subcmd: add OPT_UINTEGER_OPTARG option type
[32/37] perf record: add --threads option
[33/37] perf record: pin threads to monitored cpus if enough threads available
[34/37] perf record: apply multithreading in init and fini phases
[35/37] perf test/evlist-open-close: add multithreading
[36/37] perf test/evlist-open-close: use inline func to convert timeval to usec
[37/37] perf test/evlist-open-close: add detailed output mode
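
The "execute a for loop in parallel" utility listed in the first patchset
can be pictured with the sketch below. It is only an assumption about the
general technique (split an index range into contiguous chunks and run
each chunk on its own thread); the names, the signature, and the use of
raw pthreads instead of a workqueue are all hypothetical and do not come
from the patches.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* All sketch_* names are hypothetical; this is not perf's parallel_for. */
typedef void (*sketch_for_func_t)(int i, void *args);

struct sketch_chunk {
	pthread_t thread;
	int from, to, started;
	sketch_for_func_t func;
	void *args;
};

static void *sketch_chunk_run(void *arg)
{
	struct sketch_chunk *c = arg;

	for (int i = c->from; i < c->to; i++)
		c->func(i, c->args);
	return NULL;
}

/* Call func(i, args) for every i in [from, to). Returns 0, or -1 if out of memory. */
static int sketch_parallel_for(int from, int to, int nr_threads,
			       sketch_for_func_t func, void *args)
{
	int n = to - from, next = from;
	struct sketch_chunk *chunks;

	assert(func);
	if (n <= 0)
		return 0;
	if (nr_threads < 1)
		nr_threads = 1;
	if (nr_threads > n)
		nr_threads = n;
	chunks = calloc(nr_threads, sizeof(*chunks));
	if (!chunks)
		return -1;
	for (int t = 0; t < nr_threads; t++) {
		struct sketch_chunk *c = &chunks[t];

		/* the first n % nr_threads chunks get one extra iteration */
		c->from = next;
		next += n / nr_threads + (t < n % nr_threads);
		c->to = next;
		c->func = func;
		c->args = args;
		c->started = !pthread_create(&c->thread, NULL, sketch_chunk_run, c);
		if (!c->started)
			sketch_chunk_run(c);	/* fall back to running inline */
	}
	for (int t = 0; t < nr_threads; t++)
		if (chunks[t].started)
			pthread_join(chunks[t].thread, NULL);
	free(chunks);
	return 0;
}

/* Example loop body: store i squared into slot i of a long array. */
static void sketch_square(int i, void *args)
{
	((long *)args)[i] = (long)i * i;
}
```

For example, sketch_parallel_for(0, 10, 3, sketch_square, out) fills a
10-element array using three threads. Each iteration writes only its own
slot, so the loop body needs no locking; a body that touches shared state
would have to synchronize on its own.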


Other patches:
Merged patches:
da963834fe6975a1 perf test: Iterate over shell tests in alphabetical order
Link: http://lore.kernel.org/lkml/[email protected]
69c9ffed6cede9c1 perf symbol-elf: Fix memory leak by freeing sdt_note.args
Link: http://lore.kernel.org/lkml/[email protected]
67069a1f0fe5f9ee perf env: Fix memory leak of bpf_prog_info_linear member
Link: http://lore.kernel.org/lkml/[email protected]
c087e9480cf33672 perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL
Link: http://lore.kernel.org/lkml/[email protected]
6de249d66d2e7881 perf annotate: Allow 's' on source code lines
Link: http://lore.kernel.org/lkml/[email protected]
cf96b8e45a9bf74d perf session: Add missing evlist__delete when deleting a session
Link: http://lore.kernel.org/lkml/[email protected]
5a4451e4d562d5c3 perf annotate: Fix 's' on source line when disasm is empty
Link: http://lore.kernel.org/lkml/[email protected]
83952286f2683716 perf top: Fix overflow in elf_sec__is_text()
Link: http://lore.kernel.org/lkml/[email protected]
eb7261f14e1a86f0 perf test: Add free() calls for scandir() returned dirent entries
Link: http://lore.kernel.org/lkml/[email protected]
PATCHSET perf: fix several memory leaks reported by ASan on perf-test
Link: https://lore.kernel.org/lkml/[email protected]/
0967ebffe0981571 perf inject: Fix dso->nsinfo refcounting
2d6b74baa7147251 perf map: Fix dso->nsinfo refcounting
dedeb4be203b382b perf probe: Fix dso->nsinfo refcounting
42db3d9ded555f71 perf env: Fix sibling_dies memory leak
233f2dc1c2843372 perf test session_topology: Delete session->evlist
fc56f54f6fcd5337 perf test event_update: Fix memory leak of evlist
dccfca926c351ba0 perf test event_update: Fix memory leak of unit
581e295a0f6b5c29 perf dso: Fix memory leak in dso__new_map()
244d1797c8c8e850 perf test maps__merge_in: Fix memory leak of maps
da6b7c6c06269014 perf env: Fix memory leak of cpu_pmu_caps
a37338aad8c4d867 perf report: Free generated help strings for sort option
02e6246f5364d526 perf inject: Close inject.output on exit
423b9174f5f71fd3 perf session: Cleanup trace_event
1b1f57cf9e4c8eb1 perf script: Release zstd data
faf3ac305d61341c perf script: Fix memory 'threads' and 'cpus' leaks on exit
f8cbb0f926ae1e1f perf lzma: Close lzma stream on exit
6c7f0ab04707c288 perf trace: Free malloc'd trace fields on exit
f2ebf8ffe7af10bf perf trace: Free syscall->arg_fmt
3cb4d5e00e037c70 perf trace: Free syscall tp fields in evsel->priv
659ede7d13f1cc37 perf trace: Free strings in trace__parse_events_option()
937654ce497fb6e9 perf test bpf: Free obj_buf
e0fa7ab42232e742 perf probe-file: Delete namelist in del_events() on the error path
d4b3eedce151e639 perf data: Close all files in close_dir()
Link: http://lore.kernel.org/lkml/[email protected]
4241eabf59d5b7e9 perf bench: Add benchmark for evlist open/close operations
Link: http://lore.kernel.org/lkml/[email protected]

Unmerged patches:
PATCH perf: fix segfault when wrong option for --debug is provided
Link: https://lore.kernel.org/lkml/[email protected]/
Status: rejected, already fixed in earlier patch by Ian
PATCHSET tools: add gettid to libc_compat.h
Link: https://lore.kernel.org/lkml/[email protected]/
Status: withdrawn due to compilation issues in BPF
[01/03] tools libc_compat: add gettid
[02/03] perf jvmti: use gettid from libc_compat
[03/03] perf test: mmap-thread-lookup: use gettid
PATCH perf test: make --skip work on shell tests
Link: https://lore.kernel.org/lkml/[email protected]/
Status: accepted
PATCH perf tests: dlfilter: free desc and long_desc in check_filter_desc
Link: https://lore.kernel.org/lkml/[email protected]/
Status: accepted
PATCH perf config: fix caching and memory leak in perf_home_perfconfig
Link: https://lore.kernel.org/lkml/[email protected]/
Status: needs improvement

Unsent patches:
PATCHSET perf mmaps: grab refcount in maps__find
Link: https://github.com/Manciukic/linux/commits/perf/mem-leaks/patches/grab-refcnt-in-maps-find
Status: never sent due to the difficulty of testing such a big change. Some commits
have been cherry-picked into other (approved) patchsets.
[01/17] perf: prepare space for exit statements in preparation for maps__find to grab a refcnt
[02/17] perf: have maps__find grab a refcount on map while holding the lock
[03/17] perf: propagate refcnt'ed map from maps__find_symbol
[04/17] perf: propagate refcnt'ed map from maps__find_ams
[05/17] perf: rename addr_location__put to addr_location__put_members
[06/17] perf: add refcounts to addr_location members
[07/17] perf: add addr_location__put_members
[08/17] perf: return refcnt'ed map from maps__find_symbol_by_name
[09/17] perf: return refcnt'ed map from kernel_get_ref_reloc_sym
[10/17] perf: unwind: return refcnt'ed map from find_map
[11/17] perf: add utility functions to put members of branch_info and map_symbol
[12/17] perf: fix refcounting on he->mem_info
[13/17] perf: add missing puts on branch_info
[14/17] perf: unwind-libdw: add refcounts to map_symbol in ui->entries
[15/17] perf: hist: fix refcounts for he->ms
[16/17] perf: nsinfo: fix refcounting
[17/17] perf: missing map__put in arch__post_process_probe_trace_events
PATCH perf: add read lock in maps__first
Link: https://github.com/Manciukic/linux/commit/d1a46bcdd3447ad56cb54fdd3a21a280eab3cd4f
Status: ready to send
PATCH perf: ensure that a read lock is held when looping over maps entries
Link: https://github.com/Manciukic/linux/commit/ad948ef8e771c1ab03838c92afd3c2690019c694
Status: needs splitting


Review activity:
PATCHSET Introduce threaded trace streaming for basic perf record operation
Link: https://lore.kernel.org/lkml/[email protected]/
Contribution: helped in fixing some bugs, performed extensive testing
PATCHSET perf tools: Add PMU alias support
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/lkml/[email protected]/
Contribution: helped in fixing some memory bugs



2021-08-23 11:42:01

by Bayduraev, Alexey V

Subject: Re: [GSoC] Multi-threading in perf: Final Report

On 21.08.2021 12:41, Riccardo Mancini wrote:
> Hi,
>
> this is the final report of my project "Multi-threading in perf",
> developed as part of the Google Summer of Code with the Linux Foundation.
> https://summerofcode.withgoogle.com/projects/#4670070929752064
<SNIP>
>
> Review activity:
> PATCHSET Introduce threaded trace streaming for basic perf record operation
> Link: https://lore.kernel.org/lkml/[email protected]/
> Contribution: helped in fixing some bugs, performed extensive testing

Hi Riccardo,

Thank you very much for the deep review and extensive testing of
this patchset, it was very helpful and allowed us to improve
the quality of the feature used in our product.

Good luck,
Alexey

> PATCHSET perf tools: Add PMU alias support
> Link: https://lore.kernel.org/lkml/[email protected]/
> Link: https://lore.kernel.org/lkml/[email protected]/
> Contribution: helped in fixing some memory bugs
>
>

2021-08-23 20:50:11

by Ian Rogers

Subject: Re: [GSoC] Multi-threading in perf: Final Report

On Mon, Aug 23, 2021 at 4:40 AM Bayduraev, Alexey V
<[email protected]> wrote:
>
> On 21.08.2021 12:41, Riccardo Mancini wrote:
> > Hi,
> >
> > this is the final report of my project "Multi-threading in perf",
> > developed as part of the Google Summer of Code with the Linux Foundation.
> > https://summerofcode.withgoogle.com/projects/#4670070929752064
> <SNIP>
> >
> > Review activity:
> > PATCHSET Introduce threaded trace streaming for basic perf record operation
> > Link: https://lore.kernel.org/lkml/[email protected]/
> > Contribution: helped in fixing some bugs, performed extensive testing
>
> Hi Riccardo,
>
> Thank you very much for the deep review and extensive testing of
> this patchset, it was very helpful and allowed us to improve
> the quality of the feature used in our product.
>
> Good luck,
> Alexey

Likewise, thank you Riccardo! It is always implied but not said often
enough, thank you Arnaldo! I'm hoping the success of Riccardo's work
will be an example for next year and we can also get more mentor
volunteers.

Thanks!
Ian

> > PATCHSET perf tools: Add PMU alias support
> > Link: https://lore.kernel.org/lkml/[email protected]/
> > Link: https://lore.kernel.org/lkml/[email protected]/
> > Contribution: helped in fixing some memory bugs
> >
> >

2021-08-24 18:02:27

by Arnaldo Carvalho de Melo

Subject: Re: [GSoC] Multi-threading in perf: Final Report

Em Mon, Aug 23, 2021 at 01:47:51PM -0700, Ian Rogers escreveu:
> On Mon, Aug 23, 2021 at 4:40 AM Bayduraev, Alexey V
> <[email protected]> wrote:
> >
> > On 21.08.2021 12:41, Riccardo Mancini wrote:
> > > Hi,
> > >
> > > this is the final report of my project "Multi-threading in perf",
> > > developed as part of the Google Summer of Code with the Linux Foundation.
> > > https://summerofcode.withgoogle.com/projects/#4670070929752064
> > <SNIP>
> > >
> > > Review activity:
> > > PATCHSET Introduce threaded trace streaming for basic perf record operation
> > > Link: https://lore.kernel.org/lkml/[email protected]/
> > > Contribution: helped in fixing some bugs, performed extensive testing
> >
> > Hi Riccardo,
> >
> > Thank you very much for the deep review and extensive testing of
> > this patchset, it was very helpful and allowed us to improve
> > the quality of the feature used in our product.
> >
> > Good luck,
> > Alexey
>
> Likewise, thank you Riccardo! It is always implied but not said often
> enough, thank you Arnaldo! I'm hoping the success of Riccardo's work
> will be an example for next year and we can also get more mentor
> volunteers.

Yeah, it was a great experience. Now we need to actually run the tests
Riccardo asked for on big machines and get his and Alexey's work
processed :-)

- Arnaldo

2021-08-27 00:35:09

by Namhyung Kim

Subject: Re: [GSoC] Multi-threading in perf: Final Report

On Tue, Aug 24, 2021 at 11:00 AM Arnaldo Carvalho de Melo
<[email protected]> wrote:
>
> Em Mon, Aug 23, 2021 at 01:47:51PM -0700, Ian Rogers escreveu:
> > On Mon, Aug 23, 2021 at 4:40 AM Bayduraev, Alexey V
> > <[email protected]> wrote:
> > >
> > > On 21.08.2021 12:41, Riccardo Mancini wrote:
> > > > Hi,
> > > >
> > > > this is the final report of my project "Multi-threading in perf",
> > > > developed as part of the Google Summer of Code with the Linux Foundation.
> > > > https://summerofcode.withgoogle.com/projects/#4670070929752064
> > > <SNIP>
> > > >
> > > > Review activity:
> > > > PATCHSET Introduce threaded trace streaming for basic perf record operation
> > > > Link: https://lore.kernel.org/lkml/[email protected]/
> > > > Contribution: helped in fixing some bugs, performed extensive testing
> > >
> > > Hi Riccardo,
> > >
> > > Thank you very much for the deep review and extensive testing of
> > > this patchset, it was very helpful and allowed us to improve
> > > the quality of the feature used in our product.
> > >
> > > Good luck,
> > > Alexey
> >
> > Likewise, thank you Riccardo! It is always implied but not said often
> > enough, thank you Arnaldo! I'm hoping the success of Riccardo's work
> > will be an example for next year and we can also get more mentor
> > volunteers.
>
> Yeah, it was a great experience. Now we need to actually run the tests
> Riccardo asked for on big machines and get his and Alexey's work
> processed :-)

Thanks Riccardo for all your good work!

Namhyung