by Borislav Petkov

[permalink] [raw]

Subject: Re: [PATCH v2 00/14] perf, persistent: Kernel updates for perf tool integration

On Thu, Jun 27, 2013 at 02:46:36PM +0900, Namhyung Kim wrote:
> How about using 2 bits for perfsistent flag, 1 for connecting to an
> existing one, 2 for creating new one.

No need since persistent events don't need to be duplicated. Think of a
tracepoint: the samples you get there are the same, no matter how many
times you enable it.

So, if you open a persistent event which doesn't exist, it gets created.
If you open an existing one, you get read-only access to its buffers. No
need for two bits. Actually, with the detach/attach ioctl we don't even
need a single flag but having it makes the implementation a lot simpler.

> > A problem here is that mmap'ed buffer size (number of pages) must be
> > be equal to the pre-existing buffer size and thus to be known somehow.
>
> What about also exporting the buffer size via sysfs pmu directory?

Yes, we've been discussing buffer size. The simplest thing is to
hardcode the buffer size so that it is implicitly known to all agents.
However, I don't know whether there is a use case where different buffer
sizes actually make sense. I'd tend to do the simplest thing initially.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-27 08:51:07

by Ingo Molnar

[permalink] [raw]

Subject: Re: [PATCH v2 00/14] perf, persistent: Kernel updates for perf tool integration

* Borislav Petkov <[email protected]> wrote:

> > > A problem here is that mmap'ed buffer size (number of pages) must be
> > > be equal to the pre-existing buffer size and thus to be known
> > > somehow.
> >
> > What about also exporting the buffer size via sysfs pmu directory?
>
> Yes, we've been discussing buffer size. The simplest thing is to
> hardcode the buffer size so that it is implicitly known to all agents.
> However, I don't know whether there is a use case where different buffer
> sizes actually make sense. I'd tend to do the simplest thing initially.

Btw., in terms of testing and design focus I'd suggest concentrating not
on rare and relatively singular events like RAS MCE events, but on a more
'everyday tracing' flow of events:

- large, per cpu trace buffers

- all events output into a single trace buffer to get a coherent trace

- [ possibly revive the tracepoint-multiplexing patch from peterz/tglx,
to be able to get a rich selection of tracepoints. ]

That is I think the workflow most people will be interested in:

- they have a workload to analyze and they want to do some 'tracing' to
understand it better or to pinpoint a problem.

- based on the problem they want to trace a selection of tracepoints, as
easily and quickly as possible.

- we could possibly provide a few 'groups' of tracepoints for typical
uses - for example scheduler tracing, or MM tracing, or IO tracing,
etc.), so people wouldn't have to specify a dozen tracepoints but could
symbolically refer to any of these pre-cooked groups. [this is probably
a tooling detail.]

- they want to persistently trace into a generously sized trace buffer,
without any tool running while the collection period lasts.

- to refine the result they'd like to stop/start tracing, reset/clear the
tracebuffer for failed attempts, and generally see how large the
tracebuffer is.

- and extract/analyze traces - both in a detailed form and in summary
fashion.

- they possibly want to save it to a perf.data and have equivalent
facilities of analysis.

If that workflow works well then the RAS angle will work well too.

Thanks,

Ingo