by Jiri Olsa

[permalink] [raw]

Subject: Re: [PATCH v2 06/15] perf session: load data directory into tool process memory

On Tue, Oct 27, 2020 at 05:43:20PM +0300, Alexey Budankov wrote:
>
> On 27.10.2020 15:21, Jiri Olsa wrote:
> > On Tue, Oct 27, 2020 at 10:37:58AM +0300, Alexey Budankov wrote:
> >>
> >> On 24.10.2020 18:43, Jiri Olsa wrote:
> >>> On Wed, Oct 21, 2020 at 07:01:19PM +0300, Alexey Budankov wrote:
> >>>>
> >>>> Read trace files located in data directory into tool process memory.
> >>>> Basic analysis support of data directories is provided for report
> >>>> mode. Raw dump (-D) and aggregated reports are available for data
> >>>> directories, still with no memory consumption optimizations. However
> >>>> data directories collected with --compression-level option enabled
> >>>> can be analyzed with little less memory because trace files are
> >>>> unmaped from tool process memory after loading collected data.
> >>>> The implementation is based on the prototype [1], [2].
> >>>>
> >>>> [1] git clone https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git -b perf/record_threads
> >>>> [2] https://lore.kernel.org/lkml/[email protected]/
> >>>>
> >>>> Suggested-by: Jiri Olsa <[email protected]>
> >>>
> >>> very loosely ;-) so there was a reason for all that reader refactoring,
> >>> so we could have __perf_session__process_dir_events function:
> >>>
> >>> https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/commit/?h=perf/record_threads&id=308aa7cff1fed335401cfc02c7bac1a4644af68e
> >>
> >> Nonetheless. All that are necessary parts to make threaded data streaming
> >> and analysis eventually merged into the mainline as joint Perf developers
> >> community effort.
> >>
> >>>
> >>> when reporting the threaded record data on really big servers,
> >>> you will run out of memory, so you need to read and flush all
> >>> the files together by smaller pieces
> >>
> >> Yes, handling all that _big_ data after collection to make it
> >> helpful for analysis of performance issues is the other part
> >> of this story so that possible OOM should be somehow avoided.
> >>
> >>>
> >>> IMO we need to have this change before we allow threaded record
> >>
> >> There are use cases of perf tool as a data provider, btw VTune is not
> >> the only one of them, and for those use cases threaded trace streaming
> >> lets its users get to their data that the users just were loosing before.
> >> This is huge difference and whole new level of support for such users.
> >> Post-process scripting around perf (e.g. Python based) will benefit
> >> from threaded trace streaming. Pipe mode can be extended to stream into
> >> open and passed fds using threads (e.g. perf record -o -fd:13,14,15,16).
> >> VTune-like tools can get performance data, load it into a (relational)
> >> DB files and provide analysis. And all that uses perf tool at its core.
> >>
> >> I agree perf report OOM issue can exist on really-big servers but data
> >> directories support for report mode for not-so-big servers and desktops
> >> is already enabled with this smaller change. Also really-big-servers
> >> come with really-big amount of memory and collection could possibly be
> >> limited to only interesting phases of execution so the issue could likely
> >> be avoided. At the same time threaded trace streaming could clarify on
> >> real use cases that are blocked by perf report OOM issue and that would
> >> clarify on exact required solution. So perf report OOM issue shouldn't
> >> be the showstopper for upstream of threaded trace streaming.
> >
> > so the short answer is no, right? ;-)
>
> Answer to what question? Resolve OOM in perf report for data directories?
> I don't see a simple solution for that. The next issue after OOM is resolved
> is a very long processing of data directories. And again there is no simple
> solution for that as well. But it still need progress in order to be resolved
> eventually.

it's right here:
https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/commit/?h=perf/record_threads&id=308aa7cff1fed335401cfc02c7bac1a4644af68e

jirka

>
> >
> > I understand all the excuses, but from my point of view we are
> > adding another pain point (and there's already few ;-) ) that
> > will make perf (even more) not user friendly
>
> I would not name it a paint point but instead a growth opportunity.
> Now --threads can't be and is not enabled by default. When a user
> asks --threads the tool can print warning in advance about lots of
> data and possible perf report OOM limitation so confusion and
> disappointment for users of perf report can be avoided in advance.
>
> >
> > if we allow really friendly way to create huge data, we should
> > do our best to be able to process it as best as we can
>
> It is just little to no more friendly as it is already now.
> Everyone can grab patches apply and get threaded streaming.
> Inclusion into mainline will standardize solution to build
> and evolve upon and this is necessary step towards complete
> support of data directories in perf tool suite.
>
> Alexei
>
> >
> > jirka
> >
>

2020-10-29 11:15:01

by Namhyung Kim

[permalink] [raw]

Subject: Re: [PATCH v2 06/15] perf session: load data directory into tool process memory

Hi Jiri,

On Thu, Oct 29, 2020 at 12:40 AM Jiri Olsa <[email protected]> wrote:
>
> On Wed, Oct 28, 2020 at 04:22:49PM +0900, Namhyung Kim wrote:
> > On Tue, Oct 27, 2020 at 11:43 PM Alexey Budankov
> > <[email protected]> wrote:
> > >
> > >
> > > On 27.10.2020 15:21, Jiri Olsa wrote:
> > > > On Tue, Oct 27, 2020 at 10:37:58AM +0300, Alexey Budankov wrote:
> > > >> I agree perf report OOM issue can exist on really-big servers but data
> > > >> directories support for report mode for not-so-big servers and desktops
> > > >> is already enabled with this smaller change. Also really-big-servers
> > > >> come with really-big amount of memory and collection could possibly be
> > > >> limited to only interesting phases of execution so the issue could likely
> > > >> be avoided. At the same time threaded trace streaming could clarify on
> > > >> real use cases that are blocked by perf report OOM issue and that would
> > > >> clarify on exact required solution. So perf report OOM issue shouldn't
> > > >> be the showstopper for upstream of threaded trace streaming.
> > > >
> > > > so the short answer is no, right? ;-)
> > >
> > > Answer to what question? Resolve OOM in perf report for data directories?
> > > I don't see a simple solution for that. The next issue after OOM is resolved
> > > is a very long processing of data directories. And again there is no simple
> > > solution for that as well. But it still need progress in order to be resolved
> > > eventually.
> >
> > I think we should find a better way than just adding all events to the
> > ordered events queue in memory then processing them one by one.
> >
> > Separating tracking events (FORK/MMAP/...) might be the first step.
>
> I recall seeing this change before for threaded perf report,
> maybe even from you, right? ;-)

Yes, I did it a couple of years ago. The last version is here:

https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git/log/?h=perf/threaded-v5

Thanks
Namhyung