Hi Don,
On Mon, 24 Mar 2014 15:36:54 -0400, Don Zickus wrote:
> From: Arnaldo Carvalho de Melo <[email protected]>
>
> This is the start of a new perf tool that will collect information about
> memory accesses and analyse it to find things like hot cachelines, etc.
So why not integrate this into the existing 'perf mem' command if it's all
about analyzing memory accesses?
>
> This is basically trying to get a prototype written by Richard Fowles
> written using the tools/perf coding style and libraries.
>
> Start it from 'perf sched', this patch starts the process by adding the
> 'record' subcommand to collect the needed mem loads and stores samples.
>
> It also have the basic 'report' skeleton, resolving the sample address
> and hooking the events found in a perf.data file with methods to handle
> them, right now just printing the resolved perf_sample data structure
> after each event name.
>
> [dcz: refreshed to latest upstream changes]
[SNIP]
> +perf-c2c(1)
> +===========
> +
> +NAME
> +----
> +perf-c2c - Shared Data C2C/HITM Analyzer.
> +
> +SYNOPSIS
> +--------
> +[verse]
> +'perf c2c' record
> +
> +DESCRIPTION
> +-----------
> +These are the variants of perf c2c:
> +
> + 'perf c2c record <command>' to record the memory accesses of an arbitrary
> + workload.
> +
> +SEE ALSO
> +--------
> +linkperf:perf-record[1], linkperf:perf-mem[1]
This document is very terse and only mentions the 'record' subcommand -
also it's not updated throughout the series. So I'd like to suggest
adding a separate documentation patch with full/verbose descriptions at
the end of this series.
[SNIP]
> +static int perf_c2c__read_events(struct perf_c2c *c2c)
> +{
> + int err = -1;
> + struct perf_session *session;
> + struct perf_data_file file = {
> + .path = input_name,
> + .mode = PERF_DATA_MODE_READ,
> + };
> + struct perf_evsel *evsel;
> +
> + session = perf_session__new(&file, 0, &c2c->tool);
> + if (session == NULL) {
> + pr_debug("No memory for session\n");
> + goto out;
> + }
> +
> + /* setup the evsel handlers for each event type */
> + evlist__for_each(session->evlist, evsel) {
> + const char *name = perf_evsel__name(evsel);
> + unsigned int i;
> +
> + for (i = 0; i < ARRAY_SIZE(handlers); i++) {
> + if (!strcmp(name, handlers[i].name))
> + evsel->handler = handlers[i].handler;
> + }
> + }
> +
> + err = perf_session__process_events(session, &c2c->tool);
> + if (err)
> + pr_err("Failed to process events, error %d", err);
You may want to add perf_session__delete() here.
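To show the shape I mean, here is a rough, standalone sketch rather than a
patch against the real code. The mock_session type, its helpers and the
live_sessions counter are hypothetical stand-ins for struct perf_session and
friends. The point is only that the delete call sits before the 'out' label,
so it runs whether or not event processing fails, and is skipped when the
session was never created.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct perf_session; not the real perf type. */
struct mock_session {
	int nr_events;
};

static int live_sessions;	/* sessions created but not yet deleted */

static struct mock_session *mock_session__new(int nr_events)
{
	struct mock_session *s = malloc(sizeof(*s));

	if (s) {
		s->nr_events = nr_events;
		live_sessions++;
	}
	return s;
}

static void mock_session__delete(struct mock_session *s)
{
	assert(s != NULL);
	free(s);
	live_sessions--;
}

/* Pretend that processing fails when the session holds no events. */
static int mock_session__process_events(struct mock_session *s)
{
	return s->nr_events > 0 ? 0 : -1;
}

static int read_events(int nr_events)
{
	int err = -1;
	struct mock_session *session = mock_session__new(nr_events);

	if (session == NULL)
		goto out;

	err = mock_session__process_events(session);
	if (err)
		fprintf(stderr, "Failed to process events, error %d\n", err);

	/* Reached on success and on processing failure alike. */
	mock_session__delete(session);
out:
	return err;
}
```

With this layout, read_events() leaves live_sessions at zero on both the
success path and the failure path.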
> +
> +out:
> + return err;
> +}
[SNIP]
> +int cmd_c2c(int argc, const char **argv, const char *prefix __maybe_unused)
> +{
> + struct perf_c2c c2c = {
> + .tool = {
> + .sample = perf_c2c__process_sample,
> + .comm = perf_event__process_comm,
> + .exit = perf_event__process_exit,
> + .fork = perf_event__process_fork,
> + .lost = perf_event__process_lost,
It seems that it also needs to handle mmap[2] events otherwise it cannot
find symbols from an address.
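In the tool struct I believe this would mean also setting .mmap and .mmap2
to the stock perf_event__process_mmap and perf_event__process_mmap2
handlers, but please double check against the current tree. To illustrate
why it matters, here is a toy, standalone model of the lookup. The mock_map
struct, process_mmap() and resolve_dso() are hypothetical names of mine, not
perf APIs: an address can only be resolved once a mapping covering it has
been recorded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of what an mmap event carries. */
struct mock_map {
	uint64_t start;
	uint64_t len;
	const char *dso;
};

#define MAX_MAPS 16

static struct mock_map maps[MAX_MAPS];
static size_t nr_maps;

/* What an mmap handler boils down to: remember the mapping. */
static int process_mmap(uint64_t start, uint64_t len, const char *dso)
{
	assert(dso != NULL);

	if (nr_maps == MAX_MAPS)
		return -1;

	maps[nr_maps++] = (struct mock_map){
		.start = start,
		.len = len,
		.dso = dso,
	};
	return 0;
}

/* Resolve a sample address to the DSO that maps it, or NULL if unknown. */
static const char *resolve_dso(uint64_t addr)
{
	size_t i;

	for (i = 0; i < nr_maps; i++) {
		if (addr >= maps[i].start && addr - maps[i].start < maps[i].len)
			return maps[i].dso;
	}
	return NULL;
}
```

Before any mmap event has been processed, resolve_dso() has nothing to
search and returns NULL for every address, which is the situation the tool
is in without the handlers.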
Thanks,
Namhyung
> + .ordered_samples = true,
> + },
> + };
On Tue, Apr 08, 2014 at 03:59:15PM +0900, Namhyung Kim wrote:
> Hi Don,
Oh by the way, thank you for your review. I will clean up a bunch of
stuff based on your suggestions.
Cheers,
Don
>
> On Mon, 24 Mar 2014 15:36:54 -0400, Don Zickus wrote:
> > From: Arnaldo Carvalho de Melo <[email protected]>
> >
> > This is the start of a new perf tool that will collect information about
> > memory accesses and analyse it to find things like hot cachelines, etc.
>
> So why not integrate this into the existing 'perf mem' command if it's all
> about analyzing memory accesses?
>
> >
> > This is basically trying to get a prototype written by Richard Fowles
> > written using the tools/perf coding style and libraries.
> >
> > Start it from 'perf sched', this patch starts the process by adding the
> > 'record' subcommand to collect the needed mem loads and stores samples.
> >
> > It also have the basic 'report' skeleton, resolving the sample address
> > and hooking the events found in a perf.data file with methods to handle
> > them, right now just printing the resolved perf_sample data structure
> > after each event name.
> >
> > [dcz: refreshed to latest upstream changes]
>
> [SNIP]
> > +perf-c2c(1)
> > +===========
> > +
> > +NAME
> > +----
> > +perf-c2c - Shared Data C2C/HITM Analyzer.
> > +
> > +SYNOPSIS
> > +--------
> > +[verse]
> > +'perf c2c' record
> > +
> > +DESCRIPTION
> > +-----------
> > +These are the variants of perf c2c:
> > +
> > + 'perf c2c record <command>' to record the memory accesses of an arbitrary
> > + workload.
> > +
> > +SEE ALSO
> > +--------
> > +linkperf:perf-record[1], linkperf:perf-mem[1]
>
> This document is very terse and only mentions the 'record' subcommand -
> also it's not updated throughout the series. So I'd like to suggest
> adding a separate documentation patch with full/verbose descriptions at
> the end of this series.
>
>
> [SNIP]
> > +static int perf_c2c__read_events(struct perf_c2c *c2c)
> > +{
> > + int err = -1;
> > + struct perf_session *session;
> > + struct perf_data_file file = {
> > + .path = input_name,
> > + .mode = PERF_DATA_MODE_READ,
> > + };
> > + struct perf_evsel *evsel;
> > +
> > + session = perf_session__new(&file, 0, &c2c->tool);
> > + if (session == NULL) {
> > + pr_debug("No memory for session\n");
> > + goto out;
> > + }
> > +
> > + /* setup the evsel handlers for each event type */
> > + evlist__for_each(session->evlist, evsel) {
> > + const char *name = perf_evsel__name(evsel);
> > + unsigned int i;
> > +
> > + for (i = 0; i < ARRAY_SIZE(handlers); i++) {
> > + if (!strcmp(name, handlers[i].name))
> > + evsel->handler = handlers[i].handler;
> > + }
> > + }
> > +
> > + err = perf_session__process_events(session, &c2c->tool);
> > + if (err)
> > + pr_err("Failed to process events, error %d", err);
>
> You may want to add perf_session__delete() here.
>
> > +
> > +out:
> > + return err;
> > +}
>
>
> [SNIP]
> > +int cmd_c2c(int argc, const char **argv, const char *prefix __maybe_unused)
> > +{
> > + struct perf_c2c c2c = {
> > + .tool = {
> > + .sample = perf_c2c__process_sample,
> > + .comm = perf_event__process_comm,
> > + .exit = perf_event__process_exit,
> > + .fork = perf_event__process_fork,
> > + .lost = perf_event__process_lost,
>
> It seems that it also needs to handle mmap[2] events otherwise it cannot
> find symbols from an address.
>
> Thanks,
> Namhyung
>
>
> > + .ordered_samples = true,
> > + },
> > + };
On Tue, Apr 08, 2014 at 03:59:15PM +0900, Namhyung Kim wrote:
> Hi Don,
>
> On Mon, 24 Mar 2014 15:36:54 -0400, Don Zickus wrote:
> > From: Arnaldo Carvalho de Melo <[email protected]>
> >
> > This is the start of a new perf tool that will collect information about
> > memory accesses and analyse it to find things like hot cachelines, etc.
>
> So why not integrate this into the existing 'perf mem' command if it's all
> about analyzing memory accesses?
Our expectations were different. We expected to do system-wide analysis
with loads and stores. With 'perf mem' you didn't have the ability to
analyze both loads and stores at the same time.

In all my private conversations with Stephane, Arnaldo and Jiri, it was
never brought up. We had just assumed that it made more sense to keep it
separate.
>
> >
> > This is basically trying to get a prototype written by Richard Fowles
> > written using the tools/perf coding style and libraries.
> >
> > Start it from 'perf sched', this patch starts the process by adding the
> > 'record' subcommand to collect the needed mem loads and stores samples.
> >
> > It also have the basic 'report' skeleton, resolving the sample address
> > and hooking the events found in a perf.data file with methods to handle
> > them, right now just printing the resolved perf_sample data structure
> > after each event name.
> >
> > [dcz: refreshed to latest upstream changes]
>
> [SNIP]
> > +perf-c2c(1)
> > +===========
> > +
> > +NAME
> > +----
> > +perf-c2c - Shared Data C2C/HITM Analyzer.
> > +
> > +SYNOPSIS
> > +--------
> > +[verse]
> > +'perf c2c' record
> > +
> > +DESCRIPTION
> > +-----------
> > +These are the variants of perf c2c:
> > +
> > + 'perf c2c record <command>' to record the memory accesses of an arbitrary
> > + workload.
> > +
> > +SEE ALSO
> > +--------
> > +linkperf:perf-record[1], linkperf:perf-mem[1]
>
> This document is very terse and only mentions the 'record' subcommand -
> also it's not updated throughout the series. So I'd like to suggest
> adding a separate documentation patch with full/verbose descriptions at
> the end of this series.
>
>
> [SNIP]
> > +static int perf_c2c__read_events(struct perf_c2c *c2c)
> > +{
> > + int err = -1;
> > + struct perf_session *session;
> > + struct perf_data_file file = {
> > + .path = input_name,
> > + .mode = PERF_DATA_MODE_READ,
> > + };
> > + struct perf_evsel *evsel;
> > +
> > + session = perf_session__new(&file, 0, &c2c->tool);
> > + if (session == NULL) {
> > + pr_debug("No memory for session\n");
> > + goto out;
> > + }
> > +
> > + /* setup the evsel handlers for each event type */
> > + evlist__for_each(session->evlist, evsel) {
> > + const char *name = perf_evsel__name(evsel);
> > + unsigned int i;
> > +
> > + for (i = 0; i < ARRAY_SIZE(handlers); i++) {
> > + if (!strcmp(name, handlers[i].name))
> > + evsel->handler = handlers[i].handler;
> > + }
> > + }
> > +
> > + err = perf_session__process_events(session, &c2c->tool);
> > + if (err)
> > + pr_err("Failed to process events, error %d", err);
>
> You may want to add perf_session__delete() here.
>
> > +
> > +out:
> > + return err;
> > +}
>
>
> [SNIP]
> > +int cmd_c2c(int argc, const char **argv, const char *prefix __maybe_unused)
> > +{
> > + struct perf_c2c c2c = {
> > + .tool = {
> > + .sample = perf_c2c__process_sample,
> > + .comm = perf_event__process_comm,
> > + .exit = perf_event__process_exit,
> > + .fork = perf_event__process_fork,
> > + .lost = perf_event__process_lost,
>
> It seems that it also needs to handle mmap[2] events otherwise it cannot
> find symbols from an address.
>
> Thanks,
> Namhyung
>
>
> > + .ordered_samples = true,
> > + },
> > + };
On Tue, 8 Apr 2014 10:22:26 -0400, Don Zickus wrote:
> On Tue, Apr 08, 2014 at 03:59:15PM +0900, Namhyung Kim wrote:
>> Hi Don,
>>
>> On Mon, 24 Mar 2014 15:36:54 -0400, Don Zickus wrote:
>> > From: Arnaldo Carvalho de Melo <[email protected]>
>> >
>> > This is the start of a new perf tool that will collect information about
>> > memory accesses and analyse it to find things like hot cachelines, etc.
>>
>> So why not integrate this into the existing 'perf mem' command if it's all
>> about analyzing memory accesses?
>
> Our expectations were different. We expected to do system-wide analysis
> with loads and stores. With 'perf mem' you didn't have the ability to
> analyze both loads and stores at the same time.
But it's very simple to change perf mem to work with both IMHO.
>
> In all my private conversations with Stephane, Arnaldo and Jiri, it was
> never brought up. We had just assumed that it made more sense to keep it
> separate.
Well, I'm not sure ;-) Yes, the c2c is a complex tool which might
deserve its own command, but the functionality is very similar and I
guess there's something to share between them.
Thanks,
Namhyung
Namhyung Kim <[email protected]> writes:
>
> Well, I'm not sure ;-) Yes, the c2c is a complex tool which might
> deserve its own command, but the functionality is very similar and I
> guess there's something to share between them.
They work very differently. I don't see a lot of potential
for sharing.
perf mem is basically just a way to annotate normal samples slightly
with addresses, while c2c is fundamentally address driven.
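To make "address driven" a bit more concrete, here is a rough standalone
sketch of keying the analysis on the data address instead of on the sample.
Everything in it is an assumption for illustration: the 64-byte cacheline
size, the cl_stat struct and the account() helper are mine, not code from
the c2c patches. Each load or store is bucketed by the cacheline its address
falls in, so hot lines show up directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_SIZE 64	/* assumed; typical on x86 */
#define MAX_LINES 32

/* The mask trick below only works for power-of-two line sizes. */
static_assert((CACHELINE_SIZE & (CACHELINE_SIZE - 1)) == 0,
	      "cacheline size must be a power of two");

/* Hypothetical per-cacheline counters. */
struct cl_stat {
	uint64_t line;		/* cacheline base address */
	unsigned int loads;
	unsigned int stores;
};

static struct cl_stat lines[MAX_LINES];
static size_t nr_lines;

/* Round a data address down to the start of its cacheline. */
static uint64_t cacheline_of(uint64_t addr)
{
	return addr & ~(uint64_t)(CACHELINE_SIZE - 1);
}

/* Account one access against its cacheline; NULL if the table is full. */
static struct cl_stat *account(uint64_t addr, int is_store)
{
	uint64_t line = cacheline_of(addr);
	size_t i;

	for (i = 0; i < nr_lines; i++) {
		if (lines[i].line == line)
			break;
	}
	if (i == nr_lines) {
		if (nr_lines == MAX_LINES)
			return NULL;
		lines[nr_lines++] = (struct cl_stat){ .line = line };
	}

	if (is_store)
		lines[i].stores++;
	else
		lines[i].loads++;
	return &lines[i];
}
```

A load at 0x1000 and a store at 0x1038 land in the same bucket here, while a
load at 0x1040 starts a new one.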
-Andi
--
[email protected] -- Speaking for myself only