2012-06-12 05:37:56

by Yan, Zheng

Subject: [PATCH V5 0/13] perf: Intel uncore pmu counting support

Hi, all

Here are the V5 patches that add uncore counting support for Nehalem,
Sandy Bridge and Sandy Bridge-EP, applied on top of current tip.
The code is based on Lin Ming's old patches.

For Nehalem and Sandy Bridge-EP, a few general events are exported
under the sysfs directory:
/sys/bus/event_source/devices/${uncore_dev}/events/

Each file in the events directory defines an event. The content is
a string such as:
config=1,config1=2
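
As a rough illustration of how such a string breaks down into terms,
here is a minimal user-space sketch. It is not the perf tool's parser
(which uses the bison/flex event grammar); struct alias_term and
parse_alias_terms() are hypothetical names, and the sketch assumes
every term has the form name=value, while the real grammar also
accepts other term forms.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical term record; the perf tool uses struct parse_events__term. */
struct alias_term {
	char name[32];
	unsigned long long value;
};

/*
 * Split a string such as "config=1,config1=2" into name/value pairs.
 * Returns the number of terms stored, or -1 on malformed input.
 */
static int parse_alias_terms(const char *str, struct alias_term *out, int max)
{
	const char *p = str;
	int n = 0;

	assert(str && out);

	/* Stop at the end of the string or at a trailing newline. */
	while (*p && *p != '\n') {
		const char *eq = strchr(p, '=');
		size_t name_len;
		char *end;

		if (!eq || eq == p || n >= max)
			return -1;

		/* The name must fit and must not span a ',' separator. */
		name_len = (size_t)(eq - p);
		if (name_len >= sizeof(out[n].name) || memchr(p, ',', name_len))
			return -1;
		memcpy(out[n].name, p, name_len);
		out[n].name[name_len] = '\0';

		/* Base 0 accepts decimal, octal and 0x-prefixed hex values. */
		out[n].value = strtoull(eq + 1, &end, 0);
		if (end == eq + 1)
			return -1;	/* no digits after '=' */
		n++;

		if (*end == ',')
			p = end + 1;
		else if (*end == '\0' || *end == '\n')
			p = end;
		else
			return -1;
	}
	return n;
}
```

Calling parse_alias_terms("config=1,config1=2", terms, 4) fills two
entries, config with value 1 and config1 with value 2.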

You can use 'perf stat' to access the uncore pmu. For example:
perf stat -a -C 0 -e 'Uncore_iMC_0/CAS_COUNT_RD/' sleep 1
perf stat -a -C 0 -e 'Uncore_iMC_0/event=CAS_COUNT_RD/' sleep 1

Any comments are appreciated.
Thank you
---
Changes since v1:
- Modify perf tool to parse events from sysfs
- A few minor code cleanups

Changes since v2:
- Place all events for a particular socket onto a single cpu
- Make the events parser in perf tool reentrant
- A few code cleanups

Changes since v3:
- Use per cpu pointer to track uncore box
- Rework the cpu hotplug code because topology_physical_package_id()
returns the wrong result when the cpu is offline
- Rework the event alias code; event terms are stored in the alias
structure instead of the events string

Changes since v4:
- Include Jiri's patch set of uncore-related changes
- Add pmu/event=alias/ syntax support


2012-06-12 05:37:58

by Yan, Zheng

Subject: [PATCH V5 01/13] perf: Export perf_assign_events

From: "Yan, Zheng" <[email protected]>

Export perf_assign_events so the uncore code can use it to
schedule events.

Signed-off-by: Zheng Yan <[email protected]>
---
arch/x86/kernel/cpu/perf_event.c | 6 +++---
arch/x86/kernel/cpu/perf_event.h | 2 ++
2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 9000590..0c9041c 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -626,7 +626,7 @@ static bool __perf_sched_find_counter(struct perf_sched *sched)
c = sched->constraints[sched->state.event];

/* Prefer fixed purpose counters */
- if (x86_pmu.num_counters_fixed) {
+ if (c->idxmsk64 & ((u64)-1 << X86_PMC_IDX_FIXED)) {
idx = X86_PMC_IDX_FIXED;
for_each_set_bit_from(idx, c->idxmsk, X86_PMC_IDX_MAX) {
if (!__test_and_set_bit(idx, sched->state.used))
@@ -693,8 +693,8 @@ static bool perf_sched_next_event(struct perf_sched *sched)
/*
* Assign a counter for each event.
*/
-static int perf_assign_events(struct event_constraint **constraints, int n,
- int wmin, int wmax, int *assign)
+int perf_assign_events(struct event_constraint **constraints, int n,
+ int wmin, int wmax, int *assign)
{
struct perf_sched sched;

diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 3df3de9..83238f2 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -481,6 +481,8 @@ static inline void __x86_pmu_enable_event(struct hw_perf_event *hwc,

void x86_pmu_enable_all(int added);

+int perf_assign_events(struct event_constraint **constraints, int n,
+ int wmin, int wmax, int *assign);
int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign);

void x86_pmu_stop(struct perf_event *event, int flags);
--
1.7.10.2

2012-06-12 05:38:09

by Yan, Zheng

Subject: [PATCH V5 02/13] perf: Avoid race between cpu hotplug and installing event

From: "Yan, Zheng" <[email protected]>

perf_event_open requires that the cpu on which the event is installed
is online, but the cpu can go offline after perf_event_open checks that.
Add a get_online_cpus()/put_online_cpus() pair to avoid the race.

Signed-off-by: Zheng Yan <[email protected]>
---
kernel/events/core.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index f85c015..d71a2d6 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6248,6 +6248,8 @@ SYSCALL_DEFINE5(perf_event_open,
}
}

+ get_online_cpus();
+
event = perf_event_alloc(&attr, cpu, task, group_leader, NULL,
NULL, NULL);
if (IS_ERR(event)) {
@@ -6387,6 +6389,8 @@ SYSCALL_DEFINE5(perf_event_open,
perf_unpin_context(ctx);
mutex_unlock(&ctx->mutex);

+ put_online_cpus();
+
event->owner = current;

mutex_lock(&current->perf_event_mutex);
@@ -6415,6 +6419,7 @@ err_context:
err_alloc:
free_event(event);
err_task:
+ put_online_cpus();
if (task)
put_task_struct(task);
err_group_fd:
--
1.7.10.2

2012-06-12 05:38:14

by Yan, Zheng

Subject: [PATCH V5 09/13] perf, tool: Use data struct for arg passing in event parse function

From: Jiri Olsa <[email protected]>

Move all the bison arguments into a structure. In upcoming
patches we are going to
- add more arguments
- reuse the grammar for term parsing

so it's clearer to pack/separate related arguments.
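
The pattern can be sketched outside of bison as follows. This is only
an illustration under assumed names: struct parser_data,
action_add_event() and run_parser() are hypothetical stand-ins for the
data structure, a grammar action and the generated parser.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the structure handed to the parser. */
struct parser_data {
	int idx;	/* running event index */
	int nr_added;	/* stand-in for the list of parsed events */
};

/* A grammar action sees only the opaque pointer and casts it back. */
static int action_add_event(void *_data)
{
	struct parser_data *data = _data;

	assert(data != NULL);
	data->idx++;
	data->nr_added++;
	return 0;
}

/* Stand-in for the generated parser: one entry point, one argument. */
static int run_parser(void *_data, int nr_events)
{
	int i;

	for (i = 0; i < nr_events; i++) {
		if (action_add_event(_data))
			return -1;
	}
	return 0;
}
```

Adding a field to the structure later does not change the signature of
the parser entry point or of the error callback, which is the benefit
this patch is after.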

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/parse-events.c | 16 +++++++------
tools/perf/util/parse-events.h | 8 +++++--
tools/perf/util/parse-events.y | 52 +++++++++++++++++++++++++++-------------
3 files changed, 50 insertions(+), 26 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 05dbc8b..c71b29a 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -26,7 +26,7 @@ struct event_symbol {
#ifdef PARSER_DEBUG
extern int parse_events_debug;
#endif
-int parse_events_parse(struct list_head *list, int *idx);
+int parse_events_parse(void *data);

#define CHW(x) .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_##x
#define CSW(x) .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_##x
@@ -789,25 +789,27 @@ int parse_events_modifier(struct list_head *list, char *str)

int parse_events(struct perf_evlist *evlist, const char *str, int unset __used)
{
- LIST_HEAD(list);
- LIST_HEAD(list_tmp);
+ struct parse_events_data__events data = {
+ .list = LIST_HEAD_INIT(data.list),
+ .idx = evlist->nr_entries,
+ };
YY_BUFFER_STATE buffer;
- int ret, idx = evlist->nr_entries;
+ int ret;

buffer = parse_events__scan_string(str);

#ifdef PARSER_DEBUG
parse_events_debug = 1;
#endif
- ret = parse_events_parse(&list, &idx);
+ ret = parse_events_parse(&data);

parse_events__flush_buffer(buffer);
parse_events__delete_buffer(buffer);
parse_events_lex_destroy();

if (!ret) {
- int entries = idx - evlist->nr_entries;
- perf_evlist__splice_list_tail(evlist, &list, entries);
+ int entries = data.idx - evlist->nr_entries;
+ perf_evlist__splice_list_tail(evlist, &data.list, entries);
return 0;
}

diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 8cac57a..dc3c83a 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -63,6 +63,11 @@ struct parse_events__term {
struct list_head list;
};

+struct parse_events_data__events {
+ struct list_head list;
+ int idx;
+};
+
int parse_events__is_hardcoded_term(struct parse_events__term *term);
int parse_events__term_num(struct parse_events__term **_term,
int type_term, char *config, long num);
@@ -83,8 +88,7 @@ int parse_events_add_pmu(struct list_head **list, int *idx,
char *pmu , struct list_head *head_config);
void parse_events_update_lists(struct list_head *list_event,
struct list_head *list_all);
-void parse_events_error(struct list_head *list_all,
- int *idx, char const *msg);
+void parse_events_error(void *data, char const *msg);
int parse_events__test(void);

void print_events(const char *event_glob);
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 362cc59..e533bf7 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -1,7 +1,6 @@

%name-prefix "parse_events_"
-%parse-param {struct list_head *list_all}
-%parse-param {int *idx}
+%parse-param {void *_data}

%{

@@ -64,18 +63,22 @@ events ',' event | event
event:
event_def PE_MODIFIER_EVENT
{
+ struct parse_events_data__events *data = _data;
+
/*
* Apply modifier on all events added by single event definition
* (there could be more events added for multiple tracepoint
* definitions via '*?'.
*/
ABORT_ON(parse_events_modifier($1, $2));
- parse_events_update_lists($1, list_all);
+ parse_events_update_lists($1, &data->list);
}
|
event_def
{
- parse_events_update_lists($1, list_all);
+ struct parse_events_data__events *data = _data;
+
+ parse_events_update_lists($1, &data->list);
}

event_def: event_pmu |
@@ -89,9 +92,10 @@ event_def: event_pmu |
event_pmu:
PE_NAME '/' event_config '/'
{
+ struct parse_events_data__events *data = _data;
struct list_head *list = NULL;

- ABORT_ON(parse_events_add_pmu(&list, idx, $1, $3));
+ ABORT_ON(parse_events_add_pmu(&list, &data->idx, $1, $3));
parse_events__free_terms($3);
$$ = list;
}
@@ -99,91 +103,106 @@ PE_NAME '/' event_config '/'
event_legacy_symbol:
PE_VALUE_SYM '/' event_config '/'
{
+ struct parse_events_data__events *data = _data;
struct list_head *list = NULL;
int type = $1 >> 16;
int config = $1 & 255;

- ABORT_ON(parse_events_add_numeric(&list, idx, type, config, $3));
+ ABORT_ON(parse_events_add_numeric(&list, &data->idx,
+ type, config, $3));
parse_events__free_terms($3);
$$ = list;
}
|
PE_VALUE_SYM sep_slash_dc
{
+ struct parse_events_data__events *data = _data;
struct list_head *list = NULL;
int type = $1 >> 16;
int config = $1 & 255;

- ABORT_ON(parse_events_add_numeric(&list, idx, type, config, NULL));
+ ABORT_ON(parse_events_add_numeric(&list, &data->idx,
+ type, config, NULL));
$$ = list;
}

event_legacy_cache:
PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT '-' PE_NAME_CACHE_OP_RESULT
{
+ struct parse_events_data__events *data = _data;
struct list_head *list = NULL;

- ABORT_ON(parse_events_add_cache(&list, idx, $1, $3, $5));
+ ABORT_ON(parse_events_add_cache(&list, &data->idx, $1, $3, $5));
$$ = list;
}
|
PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT
{
+ struct parse_events_data__events *data = _data;
struct list_head *list = NULL;

- ABORT_ON(parse_events_add_cache(&list, idx, $1, $3, NULL));
+ ABORT_ON(parse_events_add_cache(&list, &data->idx, $1, $3, NULL));
$$ = list;
}
|
PE_NAME_CACHE_TYPE
{
+ struct parse_events_data__events *data = _data;
struct list_head *list = NULL;

- ABORT_ON(parse_events_add_cache(&list, idx, $1, NULL, NULL));
+ ABORT_ON(parse_events_add_cache(&list, &data->idx, $1, NULL, NULL));
$$ = list;
}

event_legacy_mem:
PE_PREFIX_MEM PE_VALUE ':' PE_MODIFIER_BP sep_dc
{
+ struct parse_events_data__events *data = _data;
struct list_head *list = NULL;

- ABORT_ON(parse_events_add_breakpoint(&list, idx, (void *) $2, $4));
+ ABORT_ON(parse_events_add_breakpoint(&list, &data->idx,
+ (void *) $2, $4));
$$ = list;
}
|
PE_PREFIX_MEM PE_VALUE sep_dc
{
+ struct parse_events_data__events *data = _data;
struct list_head *list = NULL;

- ABORT_ON(parse_events_add_breakpoint(&list, idx, (void *) $2, NULL));
+ ABORT_ON(parse_events_add_breakpoint(&list, &data->idx,
+ (void *) $2, NULL));
$$ = list;
}

event_legacy_tracepoint:
PE_NAME ':' PE_NAME
{
+ struct parse_events_data__events *data = _data;
struct list_head *list = NULL;

- ABORT_ON(parse_events_add_tracepoint(&list, idx, $1, $3));
+ ABORT_ON(parse_events_add_tracepoint(&list, &data->idx, $1, $3));
$$ = list;
}

event_legacy_numeric:
PE_VALUE ':' PE_VALUE
{
+ struct parse_events_data__events *data = _data;
struct list_head *list = NULL;

- ABORT_ON(parse_events_add_numeric(&list, idx, $1, $3, NULL));
+ ABORT_ON(parse_events_add_numeric(&list, &data->idx, $1, $3, NULL));
$$ = list;
}

event_legacy_raw:
PE_RAW
{
+ struct parse_events_data__events *data = _data;
struct list_head *list = NULL;

- ABORT_ON(parse_events_add_numeric(&list, idx, PERF_TYPE_RAW, $1, NULL));
+ ABORT_ON(parse_events_add_numeric(&list, &data->idx,
+ PERF_TYPE_RAW, $1, NULL));
$$ = list;
}

@@ -267,8 +286,7 @@ sep_slash_dc: '/' | ':' |

%%

-void parse_events_error(struct list_head *list_all __used,
- int *idx __used,
+void parse_events_error(void *data __used,
char const *msg __used)
{
}
--
1.7.10.2

2012-06-12 05:38:19

by Yan, Zheng

Subject: [PATCH V5 11/13] perf, tool: Add support to reuse event grammar to parse out terms

From: Jiri Olsa <[email protected]>

We want to reuse the event grammar for parsing aliased terms.
The obvious reason is that we don't need to add new code when there's
already support for this in the event grammar.

This is done by adding terms and events start entries into the event
parse grammar. The grammar forks at the beginning based on the
starting token, which is supplied via the bison interface into the
lexer. The lexer then returns the starting token as the first
token, thus making the grammar switch accordingly.

Currently 2 starting tokens/grammars are supported:
PE_START_TERMS, PE_START_EVENTS

The PE_START_TERMS related grammar uses 'event_config' part
of the grammar for term parsing.
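
The start-token trick described above can be sketched without
bison/flex. Everything here is a hypothetical stand-in: the token
values, struct scanner and the functions are made up for illustration,
and the "grammar" is reduced to a switch on the first token.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token values; the real ones are generated by bison. */
enum token { TOK_EOF = 0, TOK_START_EVENTS = 256, TOK_START_TERMS, TOK_CHAR };

/* Which half of the grammar the start token selected. */
enum mode { MODE_ERROR, MODE_EVENTS, MODE_TERMS };

struct scanner {
	int start_token;	/* pending synthetic token, 0 once consumed */
	const char *pos;	/* next input character */
};

static void scanner_init(struct scanner *s, const char *str, int start_token)
{
	assert(s && str);
	s->start_token = start_token;
	s->pos = str;
}

/* Return the synthetic start token once, then the real input tokens. */
static int scanner_next(struct scanner *s)
{
	if (s->start_token) {
		int tok = s->start_token;

		s->start_token = 0;
		return tok;
	}
	if (*s->pos == '\0')
		return TOK_EOF;
	s->pos++;
	return TOK_CHAR;
}

/* The "grammar": the first token selects which rule set applies. */
static enum mode parser_mode(struct scanner *s)
{
	switch (scanner_next(s)) {
	case TOK_START_EVENTS:
		return MODE_EVENTS;
	case TOK_START_TERMS:
		return MODE_TERMS;
	default:
		return MODE_ERROR;
	}
}
```

Because the start token never appears in the input string, the caller
alone decides which sub-grammar is used for a given string.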

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/parse-events.c | 28 +++++++++++++++++++++++++---
tools/perf/util/parse-events.h | 5 +++++
tools/perf/util/parse-events.l | 13 +++++++++++++
tools/perf/util/parse-events.y | 12 ++++++++++++
4 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index ca8665e..d002170 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -12,6 +12,7 @@
#include "header.h"
#include "debugfs.h"
#include "parse-events-bison.h"
+#define YY_EXTRA_TYPE int
#include "parse-events-flex.h"
#include "pmu.h"

@@ -788,13 +789,13 @@ int parse_events_modifier(struct list_head *list, char *str)
return 0;
}

-static int parse_events__scanner(const char *str, void *data)
+static int parse_events__scanner(const char *str, void *data, int start_token)
{
YY_BUFFER_STATE buffer;
void *scanner;
int ret;

- ret = parse_events_lex_init(&scanner);
+ ret = parse_events_lex_init_extra(start_token, &scanner);
if (ret)
return ret;

@@ -811,6 +812,27 @@ static int parse_events__scanner(const char *str, void *data)
return ret;
}

+/*
+ * parse event config string, return a list of event terms.
+ */
+int parse_events_terms(struct list_head *terms, const char *str)
+{
+ struct parse_events_data__terms data = {
+ .terms = NULL,
+ };
+ int ret;
+
+ ret = parse_events__scanner(str, &data, PE_START_TERMS);
+ if (!ret) {
+ list_splice(data.terms, terms);
+ free(data.terms);
+ return 0;
+ }
+
+ parse_events__free_terms(data.terms);
+ return ret;
+}
+
int parse_events(struct perf_evlist *evlist, const char *str, int unset __used)
{
struct parse_events_data__events data = {
@@ -819,7 +841,7 @@ int parse_events(struct perf_evlist *evlist, const char *str, int unset __used)
};
int ret;

- ret = parse_events__scanner(str, &data);
+ ret = parse_events__scanner(str, &data, PE_START_EVENTS);
if (!ret) {
int entries = data.idx - evlist->nr_entries;
perf_evlist__splice_list_tail(evlist, &data.list, entries);
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index fa2b19b..9896eda 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -33,6 +33,7 @@ extern int parse_events_option(const struct option *opt, const char *str,
int unset);
extern int parse_events(struct perf_evlist *evlist, const char *str,
int unset);
+extern int parse_events_terms(struct list_head *terms, const char *str);
extern int parse_filter(const struct option *opt, const char *str, int unset);

#define EVENTS_HELP_MAX (128*1024)
@@ -68,6 +69,10 @@ struct parse_events_data__events {
int idx;
};

+struct parse_events_data__terms {
+ struct list_head *terms;
+};
+
int parse_events__is_hardcoded_term(struct parse_events__term *term);
int parse_events__term_num(struct parse_events__term **_term,
int type_term, char *config, long num);
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 329794e..488362e 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -79,6 +79,19 @@ modifier_event [ukhpGH]{1,8}
modifier_bp [rwx]

%%
+
+%{
+ {
+ int start_token;
+
+ start_token = (int) parse_events_get_extra(yyscanner);
+ if (start_token) {
+ parse_events_set_extra(NULL, yyscanner);
+ return start_token;
+ }
+ }
+%}
+
cpu-cycles|cycles { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES); }
stalled-cycles-frontend|idle-cycles-frontend { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); }
stalled-cycles-backend|idle-cycles-backend { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); }
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 2a93d5c..9525c45 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -25,6 +25,7 @@ do { \

%}

+%token PE_START_EVENTS PE_START_TERMS
%token PE_VALUE PE_VALUE_SYM PE_RAW PE_TERM
%token PE_NAME
%token PE_MODIFIER_EVENT PE_MODIFIER_BP
@@ -60,6 +61,11 @@ do { \
}
%%

+start:
+PE_START_EVENTS events
+|
+PE_START_TERMS terms
+
events:
events ',' event | event

@@ -209,6 +215,12 @@ PE_RAW
$$ = list;
}

+terms: event_config
+{
+ struct parse_events_data__terms *data = _data;
+ data->terms = $1;
+}
+
event_config:
event_config ',' event_term
{
--
1.7.10.2

2012-06-12 05:38:18

by Yan, Zheng

Subject: [PATCH V5 12/13] perf, tool: Add pmu event alias support

From: Jiri Olsa <[email protected]>

Adding support to specify alias term within the event description.

The definition of pmu event alias is located at:
${sysfs_mount}/bus/event_source/devices/${pmu}/events/

Each file in the 'events' directory defines an event alias. Its contents
look like:
config=1,config1=2

Using a pmu event alias, an event can now be specified like:
uncore/CLOCKTICKS/
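
The substitution can be pictured with the simplified sketch below. It
uses fixed-size arrays instead of the list_head lists in the patch, and
struct term, struct alias and expand_alias() are hypothetical names
chosen for illustration. It assumes an alias is referenced by a bare
term name with no value.

```c
#include <assert.h>
#include <string.h>

#define MAX_TERMS 8

/* Hypothetical simplified term: has_value == 0 means a bare name. */
struct term {
	const char *name;
	int has_value;
	long value;
};

struct alias {
	const char *name;
	struct term terms[MAX_TERMS];
	int nr_terms;
};

/*
 * Copy 'in' to 'out', replacing each bare term that names the alias
 * with the alias's own terms. Returns the output count, or -1 if the
 * result would not fit in MAX_TERMS entries.
 */
static int expand_alias(const struct alias *alias,
			const struct term *in, int nr_in,
			struct term *out)
{
	int i, j, n = 0;

	assert(alias && in && out);

	for (i = 0; i < nr_in; i++) {
		if (!in[i].has_value && !strcmp(in[i].name, alias->name)) {
			/* Splice in the alias definition at this position. */
			for (j = 0; j < alias->nr_terms; j++) {
				if (n >= MAX_TERMS)
					return -1;
				out[n++] = alias->terms[j];
			}
			continue;
		}
		if (n >= MAX_TERMS)
			return -1;
		out[n++] = in[i];
	}
	return n;
}
```

With an alias CLOCKTICKS defined as config=1,config1=2, the term list
"CLOCKTICKS,umask=3" expands to "config=1,config1=2,umask=3"; terms
that do not name the alias pass through unchanged.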

Signed-off-by: Zheng Yan <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/parse-events.c | 10 +++
tools/perf/util/parse-events.h | 2 +
tools/perf/util/pmu.c | 152 ++++++++++++++++++++++++++++++++++++++++
tools/perf/util/pmu.h | 11 ++-
4 files changed, 174 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index d002170..3339424 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -701,6 +701,9 @@ int parse_events_add_pmu(struct list_head **list, int *idx,

memset(&attr, 0, sizeof(attr));

+ if (perf_pmu__check_alias(pmu, head_config))
+ return -EINVAL;
+
/*
* Configure hardcoded terms first, no need to check
* return value when called with fail == 0 ;)
@@ -1143,6 +1146,13 @@ int parse_events__term_str(struct parse_events__term **term,
config, str, 0);
}

+int parse_events__term_clone(struct parse_events__term **new,
+ struct parse_events__term *term)
+{
+ return new_term(new, term->type_val, term->type_term, term->config,
+ term->val.str, term->val.num);
+}
+
void parse_events__free_terms(struct list_head *terms)
{
struct parse_events__term *term, *h;
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 9896eda..a2c7168 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -78,6 +78,8 @@ int parse_events__term_num(struct parse_events__term **_term,
int type_term, char *config, long num);
int parse_events__term_str(struct parse_events__term **_term,
int type_term, char *config, char *str);
+int parse_events__term_clone(struct parse_events__term **new,
+ struct parse_events__term *term);
void parse_events__free_terms(struct list_head *terms);
int parse_events_modifier(struct list_head *list, char *str);
int parse_events_add_tracepoint(struct list_head **list, int *idx,
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index a119a53..9bf61f6 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -80,6 +80,130 @@ static int pmu_format(char *name, struct list_head *format)
return 0;
}

+static int perf_pmu__new_alias(struct list_head *list, char *name, FILE *file)
+{
+ struct perf_pmu__alias *alias;
+ char buf[256];
+ int ret;
+
+ ret = fread(buf, 1, sizeof(buf), file);
+ if (ret == 0)
+ return -EINVAL;
+ buf[ret] = 0;
+
+ alias = malloc(sizeof(*alias));
+ if (!alias)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(&alias->terms);
+ ret = parse_events_terms(&alias->terms, buf);
+ if (ret) {
+ free(alias);
+ return ret;
+ }
+
+ alias->name = strdup(name);
+ list_add_tail(&alias->list, list);
+ return 0;
+}
+
+/*
+ * Process all the sysfs attributes located under the directory
+ * specified in 'dir' parameter.
+ */
+static int pmu_aliases_parse(char *dir, struct list_head *head)
+{
+ struct dirent *evt_ent;
+ DIR *event_dir;
+ int ret = 0;
+
+ event_dir = opendir(dir);
+ if (!event_dir)
+ return -EINVAL;
+
+ while (!ret && (evt_ent = readdir(event_dir))) {
+ char path[PATH_MAX];
+ char *name = evt_ent->d_name;
+ FILE *file;
+
+ if (!strcmp(name, ".") || !strcmp(name, ".."))
+ continue;
+
+ snprintf(path, PATH_MAX, "%s/%s", dir, name);
+
+ ret = -EINVAL;
+ file = fopen(path, "r");
+ if (!file)
+ break;
+ ret = perf_pmu__new_alias(head, name, file);
+ fclose(file);
+ }
+
+ closedir(event_dir);
+ return ret;
+}
+
+/*
+ * Reading the pmu event aliases definition, which should be located at:
+ * /sys/bus/event_source/devices/<dev>/events as sysfs group attributes.
+ */
+static int pmu_aliases(char *name, struct list_head *head)
+{
+ struct stat st;
+ char path[PATH_MAX];
+ const char *sysfs;
+
+ sysfs = sysfs_find_mountpoint();
+ if (!sysfs)
+ return -1;
+
+ snprintf(path, PATH_MAX,
+ "%s/bus/event_source/devices/%s/events", sysfs, name);
+
+ if (stat(path, &st) < 0)
+ return -1;
+
+ if (pmu_aliases_parse(path, head))
+ return -1;
+
+ return 0;
+}
+
+static struct perf_pmu__alias *pmu_find_alias(struct list_head *aliases,
+ struct parse_events__term *term)
+{
+ struct perf_pmu__alias *alias;
+
+ if (term->type_val != PARSE_EVENTS__TERM_TYPE_STR || term->val.str)
+ return NULL;
+
+ list_for_each_entry(alias, aliases, list) {
+ if (!strcmp(alias->name, term->config))
+ return alias;
+ }
+
+ return NULL;
+}
+
+static int pmu_alias_terms(struct perf_pmu__alias *alias,
+ struct list_head *terms)
+{
+ struct parse_events__term *term, *clone;
+ LIST_HEAD(list);
+ int ret;
+
+ list_for_each_entry(term, &alias->terms, list) {
+ ret = parse_events__term_clone(&clone, term);
+ if (ret) {
+ parse_events__free_terms(&list);
+ return ret;
+ }
+ list_add_tail(&clone->list, &list);
+ }
+ list_splice(&list, terms);
+ return 0;
+}
+
/*
* Reading/parsing the default pmu type value, which should be
* located at:
@@ -118,6 +242,7 @@ static struct perf_pmu *pmu_lookup(char *name)
{
struct perf_pmu *pmu;
LIST_HEAD(format);
+ LIST_HEAD(aliases);
__u32 type;

/*
@@ -135,8 +260,12 @@ static struct perf_pmu *pmu_lookup(char *name)
if (!pmu)
return NULL;

+ pmu_aliases(name, &aliases);
+
INIT_LIST_HEAD(&pmu->format);
+ INIT_LIST_HEAD(&pmu->aliases);
list_splice(&format, &pmu->format);
+ list_splice(&aliases, &pmu->aliases);
pmu->name = strdup(name);
pmu->type = type;
return pmu;
@@ -279,6 +408,29 @@ int perf_pmu__config(struct perf_pmu *pmu, struct perf_event_attr *attr,
return pmu_config(&pmu->format, attr, head_terms);
}

+/*
+ * Find alias in the terms list and replace it with the terms
+ * defined for the alias
+ */
+int perf_pmu__check_alias(struct perf_pmu *pmu, struct list_head *head_terms)
+{
+ struct parse_events__term *term, *h;
+ struct perf_pmu__alias *alias;
+ int ret;
+
+ list_for_each_entry_safe(term, h, head_terms, list) {
+ alias = pmu_find_alias(&pmu->aliases, term);
+ if (!alias)
+ continue;
+ ret = pmu_alias_terms(alias, &term->list);
+ if (ret)
+ return ret;
+ list_del(&term->list);
+ free(term);
+ }
+ return 0;
+}
+
int perf_pmu__new_format(struct list_head *list, char *name,
int config, unsigned long *bits)
{
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 68c0db9..535f2c5 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -19,17 +19,26 @@ struct perf_pmu__format {
struct list_head list;
};

+struct perf_pmu__alias {
+ char *name;
+ struct list_head terms;
+ struct list_head list;
+};
+
struct perf_pmu {
char *name;
__u32 type;
struct list_head format;
+ struct list_head aliases;
struct list_head list;
};

struct perf_pmu *perf_pmu__find(char *name);
int perf_pmu__config(struct perf_pmu *pmu, struct perf_event_attr *attr,
struct list_head *head_terms);
-
+int perf_pmu__check_alias(struct perf_pmu *pmu, struct list_head *head_terms);
+struct list_head *perf_pmu__alias(struct perf_pmu *pmu,
+ struct list_head *head_terms);
int perf_pmu_wrap(void);
void perf_pmu_error(struct list_head *list, char *name, char const *msg);

--
1.7.10.2

2012-06-12 05:38:48

by Yan, Zheng

Subject: [PATCH V5 13/13] perf, tool: Add automated test for pure terms parsing

From: Jiri Olsa <[email protected]>

Add an automated test for parsing terms out of the event grammar.
Also slightly change the current event parsing test functions to
follow a more generic naming scheme.

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/parse-events-test.c | 122 +++++++++++++++++++++++++++++++++--
1 file changed, 117 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/parse-events-test.c b/tools/perf/util/parse-events-test.c
index 76b98e2..af1039c 100644
--- a/tools/perf/util/parse-events-test.c
+++ b/tools/perf/util/parse-events-test.c
@@ -430,6 +430,49 @@ static int test__checkevent_pmu_name(struct perf_evlist *evlist)
return 0;
}

+static int test__checkterms_simple(struct list_head *terms)
+{
+ struct parse_events__term *term;
+
+ /* config=10 */
+ term = list_entry(terms->next, struct parse_events__term, list);
+ TEST_ASSERT_VAL("wrong type term",
+ term->type_term == PARSE_EVENTS__TERM_TYPE_CONFIG);
+ TEST_ASSERT_VAL("wrong type val",
+ term->type_val == PARSE_EVENTS__TERM_TYPE_NUM);
+ TEST_ASSERT_VAL("wrong val", term->val.num == 10);
+ TEST_ASSERT_VAL("wrong config", !term->config);
+
+ /* config1 */
+ term = list_entry(term->list.next, struct parse_events__term, list);
+ TEST_ASSERT_VAL("wrong type term",
+ term->type_term == PARSE_EVENTS__TERM_TYPE_CONFIG1);
+ TEST_ASSERT_VAL("wrong type val",
+ term->type_val == PARSE_EVENTS__TERM_TYPE_NUM);
+ TEST_ASSERT_VAL("wrong val", term->val.num == 1);
+ TEST_ASSERT_VAL("wrong config", !term->config);
+
+ /* config2=3 */
+ term = list_entry(term->list.next, struct parse_events__term, list);
+ TEST_ASSERT_VAL("wrong type term",
+ term->type_term == PARSE_EVENTS__TERM_TYPE_CONFIG2);
+ TEST_ASSERT_VAL("wrong type val",
+ term->type_val == PARSE_EVENTS__TERM_TYPE_NUM);
+ TEST_ASSERT_VAL("wrong val", term->val.num == 3);
+ TEST_ASSERT_VAL("wrong config", !term->config);
+
+ /* umask=1*/
+ term = list_entry(term->list.next, struct parse_events__term, list);
+ TEST_ASSERT_VAL("wrong type term",
+ term->type_term == PARSE_EVENTS__TERM_TYPE_USER);
+ TEST_ASSERT_VAL("wrong type val",
+ term->type_val == PARSE_EVENTS__TERM_TYPE_NUM);
+ TEST_ASSERT_VAL("wrong val", term->val.num == 1);
+ TEST_ASSERT_VAL("wrong config", !strcmp(term->config, "umask"));
+
+ return 0;
+}
+
struct test__event_st {
const char *name;
__u32 type;
@@ -559,7 +602,23 @@ static struct test__event_st test__events_pmu[] = {
#define TEST__EVENTS_PMU_CNT (sizeof(test__events_pmu) / \
sizeof(struct test__event_st))

-static int test(struct test__event_st *e)
+struct test__term {
+ const char *str;
+ __u32 type;
+ int (*check)(struct list_head *terms);
+};
+
+static struct test__term test__terms[] = {
+ [0] = {
+ .str = "config=10,config1,config2=3,umask=1",
+ .check = test__checkterms_simple,
+ },
+};
+
+#define TEST__TERMS_CNT (sizeof(test__terms) / \
+ sizeof(struct test__term))
+
+static int test_event(struct test__event_st *e)
{
struct perf_evlist *evlist;
int ret;
@@ -590,7 +649,48 @@ static int test_events(struct test__event_st *events, unsigned cnt)
struct test__event_st *e = &events[i];

pr_debug("running test %d '%s'\n", i, e->name);
- ret = test(e);
+ ret = test_event(e);
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
+
+static int test_term(struct test__term *t)
+{
+ struct list_head *terms;
+ int ret;
+
+ terms = malloc(sizeof(*terms));
+ if (!terms)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(terms);
+
+ ret = parse_events_terms(terms, t->str);
+ if (ret) {
+ pr_debug("failed to parse terms '%s', err %d\n",
+ t->str , ret);
+ return ret;
+ }
+
+ ret = t->check(terms);
+ parse_events__free_terms(terms);
+
+ return ret;
+}
+
+static int test_terms(struct test__term *terms, unsigned cnt)
+{
+ int ret = 0;
+ unsigned i;
+
+ for (i = 0; i < cnt; i++) {
+ struct test__term *t = &terms[i];
+
+ pr_debug("running test %d '%s'\n", i, t->str);
+ ret = test_term(t);
if (ret)
break;
}
@@ -617,9 +717,21 @@ int parse_events__test(void)
{
int ret;

- ret = test_events(test__events, TEST__EVENTS_CNT);
- if (!ret && test_pmu())
- ret = test_events(test__events_pmu, TEST__EVENTS_PMU_CNT);
+ do {
+ ret = test_events(test__events, TEST__EVENTS_CNT);
+ if (ret)
+ break;
+
+ if (test_pmu()) {
+ ret = test_events(test__events_pmu,
+ TEST__EVENTS_PMU_CNT);
+ if (ret)
+ break;
+ }
+
+ ret = test_terms(test__terms, TEST__TERMS_CNT);
+
+ } while (0);

return ret;
}
--
1.7.10.2

2012-06-12 05:38:13

by Yan, Zheng

Subject: [PATCH V5 08/13] perf: Add Sandy Bridge-EP uncore support

From: "Yan, Zheng" <[email protected]>

Add Intel Sandy Bridge-EP uncore pmu support. The uncore
subsystem in Sandy Bridge-EP consists of 8 components (Ubox,
Caching Agent, Home Agent, Memory controller, Power Control,
QPI Link Layer, R2PCIe, R3QPI).

Signed-off-by: Zheng Yan <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 484 +++++++++++++++++++++++++
arch/x86/kernel/cpu/perf_event_intel_uncore.h | 86 +++++
include/linux/pci_ids.h | 11 +
3 files changed, 581 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index 9a43fb4..f2c536f 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -21,6 +21,482 @@ DEFINE_UNCORE_FORMAT_ATTR(edge, edge, "config:18");
DEFINE_UNCORE_FORMAT_ATTR(inv, inv, "config:23");
DEFINE_UNCORE_FORMAT_ATTR(cmask5, cmask, "config:24-28");
DEFINE_UNCORE_FORMAT_ATTR(cmask8, cmask, "config:24-31");
+DEFINE_UNCORE_FORMAT_ATTR(thresh8, thresh, "config:24-31");
+DEFINE_UNCORE_FORMAT_ATTR(thresh5, thresh, "config:24-28");
+DEFINE_UNCORE_FORMAT_ATTR(occ_sel, occ_sel, "config:14-15");
+DEFINE_UNCORE_FORMAT_ATTR(occ_invert, occ_invert, "config:30");
+DEFINE_UNCORE_FORMAT_ATTR(occ_edge, occ_edge, "config:14-51");
+
+/* Sandy Bridge-EP uncore support */
+static void snbep_uncore_pci_disable_box(struct intel_uncore_box *box)
+{
+ struct pci_dev *pdev = box->pci_dev;
+ int box_ctl = uncore_pci_box_ctl(box);
+ u32 config;
+
+ pci_read_config_dword(pdev, box_ctl, &config);
+ config |= SNBEP_PMON_BOX_CTL_FRZ;
+ pci_write_config_dword(pdev, box_ctl, config);
+}
+
+static void snbep_uncore_pci_enable_box(struct intel_uncore_box *box)
+{
+ struct pci_dev *pdev = box->pci_dev;
+ int box_ctl = uncore_pci_box_ctl(box);
+ u32 config;
+
+ pci_read_config_dword(pdev, box_ctl, &config);
+ config &= ~SNBEP_PMON_BOX_CTL_FRZ;
+ pci_write_config_dword(pdev, box_ctl, config);
+}
+
+static void snbep_uncore_pci_enable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ struct pci_dev *pdev = box->pci_dev;
+ struct hw_perf_event *hwc = &event->hw;
+
+ pci_write_config_dword(pdev, hwc->config_base, hwc->config |
+ SNBEP_PMON_CTL_EN);
+}
+
+static void snbep_uncore_pci_disable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ struct pci_dev *pdev = box->pci_dev;
+ struct hw_perf_event *hwc = &event->hw;
+
+ pci_write_config_dword(pdev, hwc->config_base, hwc->config);
+}
+
+static u64 snbep_uncore_pci_read_counter(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ struct pci_dev *pdev = box->pci_dev;
+ struct hw_perf_event *hwc = &event->hw;
+ u64 count;
+
+ pci_read_config_dword(pdev, hwc->event_base, (u32 *)&count);
+ pci_read_config_dword(pdev, hwc->event_base + 4, (u32 *)&count + 1);
+ return count;
+}
+
+static void snbep_uncore_pci_init_box(struct intel_uncore_box *box)
+{
+ struct pci_dev *pdev = box->pci_dev;
+ pci_write_config_dword(pdev, SNBEP_PCI_PMON_BOX_CTL,
+ SNBEP_PMON_BOX_CTL_INT);
+}
+
+static void snbep_uncore_msr_disable_box(struct intel_uncore_box *box)
+{
+ u64 config;
+ unsigned msr;
+
+ msr = uncore_msr_box_ctl(box);
+ if (msr) {
+ rdmsrl(msr, config);
+ config |= SNBEP_PMON_BOX_CTL_FRZ;
+ wrmsrl(msr, config);
+ return;
+ }
+}
+
+static void snbep_uncore_msr_enable_box(struct intel_uncore_box *box)
+{
+ u64 config;
+ unsigned msr;
+
+ msr = uncore_msr_box_ctl(box);
+ if (msr) {
+ rdmsrl(msr, config);
+ config &= ~SNBEP_PMON_BOX_CTL_FRZ;
+ wrmsrl(msr, config);
+ return;
+ }
+}
+
+static void snbep_uncore_msr_enable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ wrmsrl(hwc->config_base, hwc->config | SNBEP_PMON_CTL_EN);
+}
+
+static void snbep_uncore_msr_disable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ wrmsrl(hwc->config_base, hwc->config);
+}
+
+static u64 snbep_uncore_msr_read_counter(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ u64 count;
+
+ rdmsrl(hwc->event_base, count);
+ return count;
+}
+
+static void snbep_uncore_msr_init_box(struct intel_uncore_box *box)
+{
+ unsigned msr = uncore_msr_box_ctl(box);
+ if (msr)
+ wrmsrl(msr, SNBEP_PMON_BOX_CTL_INT);
+}
+
+static struct attribute *snbep_uncore_formats_attr[] = {
+ &format_attr_event.attr,
+ &format_attr_umask.attr,
+ &format_attr_edge.attr,
+ &format_attr_inv.attr,
+ &format_attr_thresh8.attr,
+ NULL,
+};
+
+static struct attribute *snbep_uncore_ubox_formats_attr[] = {
+ &format_attr_event.attr,
+ &format_attr_umask.attr,
+ &format_attr_edge.attr,
+ &format_attr_inv.attr,
+ &format_attr_thresh5.attr,
+ NULL,
+};
+
+static struct attribute *snbep_uncore_pcu_formats_attr[] = {
+ &format_attr_event.attr,
+ &format_attr_occ_sel.attr,
+ &format_attr_edge.attr,
+ &format_attr_inv.attr,
+ &format_attr_thresh5.attr,
+ &format_attr_occ_invert.attr,
+ &format_attr_occ_edge.attr,
+ NULL,
+};
+
+static struct uncore_event_desc snbep_uncore_imc_events[] = {
+ INTEL_UNCORE_EVENT_DESC(CLOCKTICKS, "config=0xffff"),
+ /* read */
+ INTEL_UNCORE_EVENT_DESC(CAS_COUNT_RD, "event=0x4,umask=0x3"),
+ /* write */
+ INTEL_UNCORE_EVENT_DESC(CAS_COUNT_WR, "event=0x4,umask=0xc"),
+ { /* end: all zeroes */ },
+};
+
+static struct uncore_event_desc snbep_uncore_qpi_events[] = {
+ INTEL_UNCORE_EVENT_DESC(CLOCKTICKS, "event=0x14"),
+ /* outgoing data+nondata flits */
+ INTEL_UNCORE_EVENT_DESC(TxL_FLITS_ACTIVE, "event=0x0,umask=0x6"),
+ /* DRS data received */
+ INTEL_UNCORE_EVENT_DESC(DRS_DATA, "event=0x2,umask=0x8"),
+ /* NCB data received */
+ INTEL_UNCORE_EVENT_DESC(NCB_DATA, "event=0x3,umask=0x4"),
+ { /* end: all zeroes */ },
+};
+
+static struct attribute_group snbep_uncore_format_group = {
+ .name = "format",
+ .attrs = snbep_uncore_formats_attr,
+};
+
+static struct attribute_group snbep_uncore_ubox_format_group = {
+ .name = "format",
+ .attrs = snbep_uncore_ubox_formats_attr,
+};
+
+static struct attribute_group snbep_uncore_pcu_format_group = {
+ .name = "format",
+ .attrs = snbep_uncore_pcu_formats_attr,
+};
+
+static struct intel_uncore_ops snbep_uncore_msr_ops = {
+ .init_box = snbep_uncore_msr_init_box,
+ .disable_box = snbep_uncore_msr_disable_box,
+ .enable_box = snbep_uncore_msr_enable_box,
+ .disable_event = snbep_uncore_msr_disable_event,
+ .enable_event = snbep_uncore_msr_enable_event,
+ .read_counter = snbep_uncore_msr_read_counter,
+};
+
+static struct intel_uncore_ops snbep_uncore_pci_ops = {
+ .init_box = snbep_uncore_pci_init_box,
+ .disable_box = snbep_uncore_pci_disable_box,
+ .enable_box = snbep_uncore_pci_enable_box,
+ .disable_event = snbep_uncore_pci_disable_event,
+ .enable_event = snbep_uncore_pci_enable_event,
+ .read_counter = snbep_uncore_pci_read_counter,
+};
+
+static struct event_constraint snbep_uncore_cbo_constraints[] = {
+ UNCORE_EVENT_CONSTRAINT(0x01, 0x1),
+ UNCORE_EVENT_CONSTRAINT(0x02, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x04, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x05, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x07, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x11, 0x1),
+ UNCORE_EVENT_CONSTRAINT(0x12, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x13, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x1b, 0xc),
+ UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
+ UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
+ UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
+ UNCORE_EVENT_CONSTRAINT(0x1f, 0xe),
+ UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x31, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x32, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x35, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x36, 0x1),
+ UNCORE_EVENT_CONSTRAINT(0x37, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x38, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x39, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x3b, 0x1),
+ EVENT_CONSTRAINT_END
+};
+
+static struct event_constraint snbep_uncore_r2pcie_constraints[] = {
+ UNCORE_EVENT_CONSTRAINT(0x10, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x11, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x12, 0x1),
+ UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x24, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x25, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x26, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x32, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
+ EVENT_CONSTRAINT_END
+};
+
+static struct event_constraint snbep_uncore_r3qpi_constraints[] = {
+ UNCORE_EVENT_CONSTRAINT(0x10, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x11, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x12, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x13, 0x1),
+ UNCORE_EVENT_CONSTRAINT(0x20, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x22, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x24, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x25, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x26, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x30, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x31, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x32, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x36, 0x3),
+ UNCORE_EVENT_CONSTRAINT(0x37, 0x3),
+ EVENT_CONSTRAINT_END
+};
+
+static struct intel_uncore_type snbep_uncore_ubox = {
+ .name = "UBox",
+ .num_counters = 2,
+ .num_boxes = 1,
+ .perf_ctr_bits = 44,
+ .fixed_ctr_bits = 48,
+ .perf_ctr = SNBEP_U_MSR_PMON_CTR0,
+ .event_ctl = SNBEP_U_MSR_PMON_CTL0,
+ .event_mask = SNBEP_U_MSR_PMON_RAW_EVENT_MASK,
+ .fixed_ctr = SNBEP_U_MSR_PMON_UCLK_FIXED_CTR,
+ .fixed_ctl = SNBEP_U_MSR_PMON_UCLK_FIXED_CTL,
+ .ops = &snbep_uncore_msr_ops,
+ .format_group = &snbep_uncore_ubox_format_group,
+};
+
+static struct intel_uncore_type snbep_uncore_cbo = {
+ .name = "Cbo",
+ .num_counters = 4,
+ .num_boxes = 8,
+ .perf_ctr_bits = 44,
+ .event_ctl = SNBEP_C0_MSR_PMON_CTL0,
+ .perf_ctr = SNBEP_C0_MSR_PMON_CTR0,
+ .event_mask = SNBEP_PMON_RAW_EVENT_MASK,
+ .box_ctl = SNBEP_C0_MSR_PMON_BOX_CTL,
+ .msr_offset = SNBEP_CBO_MSR_OFFSET,
+ .constraints = snbep_uncore_cbo_constraints,
+ .ops = &snbep_uncore_msr_ops,
+ .format_group = &snbep_uncore_format_group,
+};
+
+static struct intel_uncore_type snbep_uncore_pcu = {
+ .name = "PCU",
+ .num_counters = 4,
+ .num_boxes = 1,
+ .perf_ctr_bits = 48,
+ .perf_ctr = SNBEP_PCU_MSR_PMON_CTR0,
+ .event_ctl = SNBEP_PCU_MSR_PMON_CTL0,
+ .event_mask = SNBEP_PCU_MSR_PMON_RAW_EVENT_MASK,
+ .box_ctl = SNBEP_PCU_MSR_PMON_BOX_CTL,
+ .ops = &snbep_uncore_msr_ops,
+ .format_group = &snbep_uncore_pcu_format_group,
+};
+
+static struct intel_uncore_type *snbep_msr_uncores[] = {
+ &snbep_uncore_ubox,
+ &snbep_uncore_cbo,
+ &snbep_uncore_pcu,
+ NULL,
+};
+
+#define SNBEP_UNCORE_PCI_COMMON_INIT() \
+ .perf_ctr = SNBEP_PCI_PMON_CTR0, \
+ .event_ctl = SNBEP_PCI_PMON_CTL0, \
+ .event_mask = SNBEP_PMON_RAW_EVENT_MASK, \
+ .box_ctl = SNBEP_PCI_PMON_BOX_CTL, \
+ .ops = &snbep_uncore_pci_ops, \
+ .format_group = &snbep_uncore_format_group
+
+static struct intel_uncore_type snbep_uncore_ha = {
+ .name = "HA",
+ .num_counters = 4,
+ .num_boxes = 1,
+ .perf_ctr_bits = 48,
+ SNBEP_UNCORE_PCI_COMMON_INIT(),
+};
+
+static struct intel_uncore_type snbep_uncore_imc = {
+ .name = "iMC",
+ .num_counters = 4,
+ .num_boxes = 4,
+ .perf_ctr_bits = 48,
+ .fixed_ctr_bits = 48,
+ .fixed_ctr = SNBEP_MC_CHy_PCI_PMON_FIXED_CTR,
+ .fixed_ctl = SNBEP_MC_CHy_PCI_PMON_FIXED_CTL,
+ .event_descs = snbep_uncore_imc_events,
+ SNBEP_UNCORE_PCI_COMMON_INIT(),
+};
+
+static struct intel_uncore_type snbep_uncore_qpi = {
+ .name = "QPI",
+ .num_counters = 4,
+ .num_boxes = 2,
+ .perf_ctr_bits = 48,
+ .event_descs = snbep_uncore_qpi_events,
+ SNBEP_UNCORE_PCI_COMMON_INIT(),
+};
+
+
+static struct intel_uncore_type snbep_uncore_r2pcie = {
+ .name = "R2PCIe",
+ .num_counters = 4,
+ .num_boxes = 1,
+ .perf_ctr_bits = 44,
+ .constraints = snbep_uncore_r2pcie_constraints,
+ SNBEP_UNCORE_PCI_COMMON_INIT(),
+};
+
+static struct intel_uncore_type snbep_uncore_r3qpi = {
+ .name = "R3QPI",
+ .num_counters = 3,
+ .num_boxes = 2,
+ .perf_ctr_bits = 44,
+ .constraints = snbep_uncore_r3qpi_constraints,
+ SNBEP_UNCORE_PCI_COMMON_INIT(),
+};
+
+static struct intel_uncore_type *snbep_pci_uncores[] = {
+ &snbep_uncore_ha,
+ &snbep_uncore_imc,
+ &snbep_uncore_qpi,
+ &snbep_uncore_r2pcie,
+ &snbep_uncore_r3qpi,
+ NULL,
+};
+
+static DEFINE_PCI_DEVICE_TABLE(snbep_uncore_pci_ids) = {
+ { /* Home Agent */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_HA),
+ .driver_data = (unsigned long)&snbep_uncore_ha,
+ },
+ { /* MC Channel 0 */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_IMC0),
+ .driver_data = (unsigned long)&snbep_uncore_imc,
+ },
+ { /* MC Channel 1 */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_IMC1),
+ .driver_data = (unsigned long)&snbep_uncore_imc,
+ },
+ { /* MC Channel 2 */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_IMC2),
+ .driver_data = (unsigned long)&snbep_uncore_imc,
+ },
+ { /* MC Channel 3 */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_IMC3),
+ .driver_data = (unsigned long)&snbep_uncore_imc,
+ },
+ { /* QPI Port 0 */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_QPI0),
+ .driver_data = (unsigned long)&snbep_uncore_qpi,
+ },
+ { /* QPI Port 1 */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_QPI1),
+ .driver_data = (unsigned long)&snbep_uncore_qpi,
+ },
+ { /* R2PCIe */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_R2PCIE),
+ .driver_data = (unsigned long)&snbep_uncore_r2pcie,
+ },
+ { /* R3QPI Link 0 */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_R3QPI0),
+ .driver_data = (unsigned long)&snbep_uncore_r3qpi,
+ },
+ { /* R3QPI Link 1 */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_R3QPI1),
+ .driver_data = (unsigned long)&snbep_uncore_r3qpi,
+ },
+ { /* end: all zeroes */ }
+};
+
+static struct pci_driver snbep_uncore_pci_driver = {
+ .name = "snbep_uncore",
+ .id_table = snbep_uncore_pci_ids,
+};
+
+/*
+ * build pci bus to socket mapping
+ */
+static void snbep_pci2phy_map_init(void)
+{
+ struct pci_dev *ubox_dev = NULL;
+ int i, bus, nodeid;
+ u32 config;
+
+ while (1) {
+ /* find the UBOX device */
+ ubox_dev = pci_get_device(PCI_VENDOR_ID_INTEL,
+ PCI_DEVICE_ID_INTEL_JAKETOWN_UBOX,
+ ubox_dev);
+ if (!ubox_dev)
+ break;
+ bus = ubox_dev->bus->number;
+ /* get the Node ID of the local register */
+ pci_read_config_dword(ubox_dev, 0x40, &config);
+ nodeid = config;
+ /* get the Node ID mapping */
+ pci_read_config_dword(ubox_dev, 0x54, &config);
+ /*
+ * each three-bit field in the Node ID mapping register maps
+ * to a particular node.
+ */
+ for (i = 0; i < 8; i++) {
+ if (nodeid == ((config >> (3 * i)) & 0x7)) {
+ pcibus_to_physid[bus] = i;
+ break;
+ }
+ }
+ }
+ return;
+}
+/* end of Sandy Bridge-EP uncore support */
+

/* Sandy Bridge uncore support */
static void snb_uncore_msr_enable_event(struct intel_uncore_box *box,
@@ -892,6 +1368,11 @@ static int __init uncore_pci_init(void)
int ret;

switch (boot_cpu_data.x86_model) {
+ case 45: /* Sandy Bridge-EP */
+ pci_uncores = snbep_pci_uncores;
+ uncore_pci_driver = &snbep_uncore_pci_driver;
+ snbep_pci2phy_map_init();
+ break;
default:
return 0;
}
@@ -1152,6 +1633,9 @@ static int __init uncore_cpu_init(void)
case 42: /* Sandy Bridge */
msr_uncores = snb_msr_uncores;
break;
+ case 45: /* Sandy Bridge-EP */
+ msr_uncores = snbep_msr_uncores;
+ break;
default:
return 0;
}
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
index aa01df8..4d52db0 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -65,6 +65,92 @@
#define NHM_UNC_PERFEVTSEL0 0x3c0
#define NHM_UNC_UNCORE_PMC0 0x3b0

+/* SNB-EP Box level control */
+#define SNBEP_PMON_BOX_CTL_RST_CTRL (1 << 0)
+#define SNBEP_PMON_BOX_CTL_RST_CTRS (1 << 1)
+#define SNBEP_PMON_BOX_CTL_FRZ (1 << 8)
+#define SNBEP_PMON_BOX_CTL_FRZ_EN (1 << 16)
+#define SNBEP_PMON_BOX_CTL_INT (SNBEP_PMON_BOX_CTL_RST_CTRL | \
+ SNBEP_PMON_BOX_CTL_RST_CTRS | \
+ SNBEP_PMON_BOX_CTL_FRZ_EN)
+/* SNB-EP event control */
+#define SNBEP_PMON_CTL_EV_SEL_MASK 0x000000ff
+#define SNBEP_PMON_CTL_UMASK_MASK 0x0000ff00
+#define SNBEP_PMON_CTL_RST (1 << 17)
+#define SNBEP_PMON_CTL_EDGE_DET (1 << 18)
+#define SNBEP_PMON_CTL_EV_SEL_EXT (1 << 21) /* only for QPI */
+#define SNBEP_PMON_CTL_EN (1 << 22)
+#define SNBEP_PMON_CTL_INVERT (1 << 23)
+#define SNBEP_PMON_CTL_TRESH_MASK 0xff000000
+#define SNBEP_PMON_RAW_EVENT_MASK (SNBEP_PMON_CTL_EV_SEL_MASK | \
+ SNBEP_PMON_CTL_UMASK_MASK | \
+ SNBEP_PMON_CTL_EDGE_DET | \
+ SNBEP_PMON_CTL_INVERT | \
+ SNBEP_PMON_CTL_TRESH_MASK)
+
+/* SNB-EP Ubox event control */
+#define SNBEP_U_MSR_PMON_CTL_TRESH_MASK 0x1f000000
+#define SNBEP_U_MSR_PMON_RAW_EVENT_MASK \
+ (SNBEP_PMON_CTL_EV_SEL_MASK | \
+ SNBEP_PMON_CTL_UMASK_MASK | \
+ SNBEP_PMON_CTL_EDGE_DET | \
+ SNBEP_PMON_CTL_INVERT | \
+ SNBEP_U_MSR_PMON_CTL_TRESH_MASK)
+
+/* SNB-EP PCU event control */
+#define SNBEP_PCU_MSR_PMON_CTL_OCC_SEL_MASK 0x0000c000
+#define SNBEP_PCU_MSR_PMON_CTL_TRESH_MASK 0x1f000000
+#define SNBEP_PCU_MSR_PMON_CTL_OCC_INVERT (1 << 30)
+#define SNBEP_PCU_MSR_PMON_CTL_OCC_EDGE_DET (1 << 31)
+#define SNBEP_PCU_MSR_PMON_RAW_EVENT_MASK \
+ (SNBEP_PMON_CTL_EV_SEL_MASK | \
+ SNBEP_PCU_MSR_PMON_CTL_OCC_SEL_MASK | \
+ SNBEP_PMON_CTL_EDGE_DET | \
+ SNBEP_PMON_CTL_INVERT | \
+ SNBEP_PCU_MSR_PMON_CTL_TRESH_MASK | \
+ SNBEP_PCU_MSR_PMON_CTL_OCC_INVERT | \
+ SNBEP_PCU_MSR_PMON_CTL_OCC_EDGE_DET)
+
+/* SNB-EP pci control register */
+#define SNBEP_PCI_PMON_BOX_CTL 0xf4
+#define SNBEP_PCI_PMON_CTL0 0xd8
+/* SNB-EP pci counter register */
+#define SNBEP_PCI_PMON_CTR0 0xa0
+
+/* SNB-EP home agent register */
+#define SNBEP_HA_PCI_PMON_BOX_ADDRMATCH0 0x40
+#define SNBEP_HA_PCI_PMON_BOX_ADDRMATCH1 0x44
+#define SNBEP_HA_PCI_PMON_BOX_OPCODEMATCH 0x48
+/* SNB-EP memory controller register */
+#define SNBEP_MC_CHy_PCI_PMON_FIXED_CTL 0xf0
+#define SNBEP_MC_CHy_PCI_PMON_FIXED_CTR 0xd0
+/* SNB-EP QPI register */
+#define SNBEP_Q_Py_PCI_PMON_PKT_MATCH0 0x228
+#define SNBEP_Q_Py_PCI_PMON_PKT_MATCH1 0x22c
+#define SNBEP_Q_Py_PCI_PMON_PKT_MASK0 0x238
+#define SNBEP_Q_Py_PCI_PMON_PKT_MASK1 0x23c
+
+/* SNB-EP Ubox register */
+#define SNBEP_U_MSR_PMON_CTR0 0xc16
+#define SNBEP_U_MSR_PMON_CTL0 0xc10
+
+#define SNBEP_U_MSR_PMON_UCLK_FIXED_CTL 0xc08
+#define SNBEP_U_MSR_PMON_UCLK_FIXED_CTR 0xc09
+
+/* SNB-EP Cbo register */
+#define SNBEP_C0_MSR_PMON_CTR0 0xd16
+#define SNBEP_C0_MSR_PMON_CTL0 0xd10
+#define SNBEP_C0_MSR_PMON_BOX_FILTER 0xd14
+#define SNBEP_C0_MSR_PMON_BOX_CTL 0xd04
+#define SNBEP_CBO_MSR_OFFSET 0x20
+
+/* SNB-EP PCU register */
+#define SNBEP_PCU_MSR_PMON_CTR0 0xc36
+#define SNBEP_PCU_MSR_PMON_CTL0 0xc30
+#define SNBEP_PCU_MSR_PMON_BOX_FILTER 0xc34
+#define SNBEP_PCU_MSR_PMON_BOX_CTL 0xc24
+#define SNBEP_PCU_MSR_CORE_C3_CTR 0x3fc
+#define SNBEP_PCU_MSR_CORE_C6_CTR 0x3fd

struct intel_uncore_ops;
struct intel_uncore_pmu;
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 05fd02e..fc35260 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2756,6 +2756,17 @@
#define PCI_DEVICE_ID_INTEL_IOAT_SNB7 0x3c27
#define PCI_DEVICE_ID_INTEL_IOAT_SNB8 0x3c2e
#define PCI_DEVICE_ID_INTEL_IOAT_SNB9 0x3c2f
+#define PCI_DEVICE_ID_INTEL_UNC_HA 0x3c46
+#define PCI_DEVICE_ID_INTEL_UNC_IMC0 0x3cb0
+#define PCI_DEVICE_ID_INTEL_UNC_IMC1 0x3cb1
+#define PCI_DEVICE_ID_INTEL_UNC_IMC2 0x3cb4
+#define PCI_DEVICE_ID_INTEL_UNC_IMC3 0x3cb5
+#define PCI_DEVICE_ID_INTEL_UNC_QPI0 0x3c41
+#define PCI_DEVICE_ID_INTEL_UNC_QPI1 0x3c42
+#define PCI_DEVICE_ID_INTEL_UNC_R2PCIE 0x3c43
+#define PCI_DEVICE_ID_INTEL_UNC_R3QPI0 0x3c44
+#define PCI_DEVICE_ID_INTEL_UNC_R3QPI1 0x3c45
+#define PCI_DEVICE_ID_INTEL_JAKETOWN_UBOX 0x3ce0
#define PCI_DEVICE_ID_INTEL_IOAT_SNB 0x402f
#define PCI_DEVICE_ID_INTEL_5100_16 0x65f0
#define PCI_DEVICE_ID_INTEL_5100_21 0x65f5
--
1.7.10.2
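
Note on the bus to socket mapping in snbep_pci2phy_map_init() above:
the following is only a userspace sketch of the lookup, assuming the
register layout described in the patch comment (eight 3-bit fields in
the Node ID mapping register). The names bus_map_init(),
bus_map_record() and NR_BUSES are made up for illustration and do not
exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define NR_BUSES 256	/* hypothetical name; the patch uses a 256-entry array */

/* Mark every bus as "socket unknown", like the -1 initialiser in the driver. */
void bus_map_init(int map[NR_BUSES])
{
	int i;

	for (i = 0; i < NR_BUSES; i++)
		map[i] = -1;
}

/*
 * mapping holds eight 3-bit fields. The index of the first field that
 * equals nodeid is recorded as the socket id of the given bus. If no
 * field matches, the entry is left untouched.
 */
void bus_map_record(int map[NR_BUSES], unsigned int bus,
		    uint32_t nodeid, uint32_t mapping)
{
	int i;

	assert(bus < NR_BUSES);
	for (i = 0; i < 8; i++) {
		if (nodeid == ((mapping >> (3 * i)) & 0x7)) {
			map[bus] = i;
			break;
		}
	}
}
```

Octal literals are convenient for trying this out, since each octal
digit is exactly one 3-bit field: with mapping 010, Node ID 1 sits in
field 1, so a bus whose local Node ID is 1 is recorded as socket 1.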

2012-06-12 05:39:12

by Yan, Zheng

Subject: [PATCH V5 10/13] perf, tool: Make the event parser reentrantable

From: Jiri Olsa <[email protected]>

Make the event parser reentrant by creating a separate scanner
for each parse. The scanner is passed to the bison parser as an
argument and forwarded from there to the lexer.

Signed-off-by: Zheng Yan <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
---
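Note on what "reentrant" means here: the sketch below is not the
flex-generated code, just a hand-written illustration of the same idea.
All scanner state lives in an object owned by the caller instead of in
globals, so two parses can be in flight at once. struct mini_scanner
and both functions are hypothetical names used only for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical per-parse state; nothing is kept in global variables. */
struct mini_scanner {
	const char *pos;	/* next unread character */
};

void mini_scanner_init(struct mini_scanner *s, const char *str)
{
	assert(s && str);
	s->pos = str;
}

/*
 * Store the next decimal number of the input in *val.
 * Returns 1 on success and 0 at end of input. Because all state is
 * in *s, several scanners can be active at the same time.
 */
int mini_scanner_next(struct mini_scanner *s, long *val)
{
	long num = 0;

	assert(s && val);

	/* skip separators */
	while (*s->pos && !isdigit((unsigned char)*s->pos))
		s->pos++;
	if (!*s->pos)
		return 0;

	while (isdigit((unsigned char)*s->pos))
		num = num * 10 + (*s->pos++ - '0');

	*val = num;
	return 1;
}
```

Two scanners initialised on "1,2" and "30,40" can be advanced
alternately without disturbing each other, which is the property the
non-reentrant parser (with parse_events_text and parse_events_lval as
globals) could not provide.
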
tools/perf/util/parse-events.c | 35 ++++++++----
tools/perf/util/parse-events.h | 2 +-
tools/perf/util/parse-events.l | 116 +++++++++++++++++++++++-----------------
tools/perf/util/parse-events.y | 9 ++--
4 files changed, 98 insertions(+), 64 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index c71b29a..ca8665e 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -11,6 +11,7 @@
#include "cache.h"
#include "header.h"
#include "debugfs.h"
+#include "parse-events-bison.h"
#include "parse-events-flex.h"
#include "pmu.h"

@@ -26,7 +27,7 @@ struct event_symbol {
#ifdef PARSER_DEBUG
extern int parse_events_debug;
#endif
-int parse_events_parse(void *data);
+int parse_events_parse(void *data, void *scanner);

#define CHW(x) .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_##x
#define CSW(x) .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_##x
@@ -787,26 +788,38 @@ int parse_events_modifier(struct list_head *list, char *str)
return 0;
}

-int parse_events(struct perf_evlist *evlist, const char *str, int unset __used)
+static int parse_events__scanner(const char *str, void *data)
{
- struct parse_events_data__events data = {
- .list = LIST_HEAD_INIT(data.list),
- .idx = evlist->nr_entries,
- };
YY_BUFFER_STATE buffer;
+ void *scanner;
int ret;

- buffer = parse_events__scan_string(str);
+ ret = parse_events_lex_init(&scanner);
+ if (ret)
+ return ret;
+
+ buffer = parse_events__scan_string(str, scanner);

#ifdef PARSER_DEBUG
parse_events_debug = 1;
#endif
- ret = parse_events_parse(&data);
+ ret = parse_events_parse(data, scanner);
+
+ parse_events__flush_buffer(buffer, scanner);
+ parse_events__delete_buffer(buffer, scanner);
+ parse_events_lex_destroy(scanner);
+ return ret;
+}

- parse_events__flush_buffer(buffer);
- parse_events__delete_buffer(buffer);
- parse_events_lex_destroy();
+int parse_events(struct perf_evlist *evlist, const char *str, int unset __used)
+{
+ struct parse_events_data__events data = {
+ .list = LIST_HEAD_INIT(data.list),
+ .idx = evlist->nr_entries,
+ };
+ int ret;

+ ret = parse_events__scanner(str, &data);
if (!ret) {
int entries = data.idx - evlist->nr_entries;
perf_evlist__splice_list_tail(evlist, &data.list, entries);
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index dc3c83a..fa2b19b 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -88,7 +88,7 @@ int parse_events_add_pmu(struct list_head **list, int *idx,
char *pmu , struct list_head *head_config);
void parse_events_update_lists(struct list_head *list_event,
struct list_head *list_all);
-void parse_events_error(void *data, char const *msg);
+void parse_events_error(void *data, void *scanner, char const *msg);
int parse_events__test(void);

void print_events(const char *event_glob);
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 618a8e7..329794e 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -1,4 +1,6 @@

+%option reentrant
+%option bison-bridge
%option prefix="parse_events_"
%option stack

@@ -8,7 +10,10 @@
#include "parse-events-bison.h"
#include "parse-events.h"

-static int __value(char *str, int base, int token)
+char *parse_events_get_text(yyscan_t yyscanner);
+YYSTYPE *parse_events_get_lval(yyscan_t yyscanner);
+
+static int __value(YYSTYPE *yylval, char *str, int base, int token)
{
long num;

@@ -17,35 +22,48 @@ static int __value(char *str, int base, int token)
if (errno)
return PE_ERROR;

- parse_events_lval.num = num;
+ yylval->num = num;
return token;
}

-static int value(int base)
+static int value(yyscan_t scanner, int base)
{
- return __value(parse_events_text, base, PE_VALUE);
+ YYSTYPE *yylval = parse_events_get_lval(scanner);
+ char *text = parse_events_get_text(scanner);
+
+ return __value(yylval, text, base, PE_VALUE);
}

-static int raw(void)
+static int raw(yyscan_t scanner)
{
- return __value(parse_events_text + 1, 16, PE_RAW);
+ YYSTYPE *yylval = parse_events_get_lval(scanner);
+ char *text = parse_events_get_text(scanner);
+
+ return __value(yylval, text + 1, 16, PE_RAW);
}

-static int str(int token)
+static int str(yyscan_t scanner, int token)
{
- parse_events_lval.str = strdup(parse_events_text);
+ YYSTYPE *yylval = parse_events_get_lval(scanner);
+ char *text = parse_events_get_text(scanner);
+
+ yylval->str = strdup(text);
return token;
}

-static int sym(int type, int config)
+static int sym(yyscan_t scanner, int type, int config)
{
- parse_events_lval.num = (type << 16) + config;
+ YYSTYPE *yylval = parse_events_get_lval(scanner);
+
+ yylval->num = (type << 16) + config;
return PE_VALUE_SYM;
}

-static int term(int type)
+static int term(yyscan_t scanner, int type)
{
- parse_events_lval.num = type;
+ YYSTYPE *yylval = parse_events_get_lval(scanner);
+
+ yylval->num = type;
return PE_TERM;
}

@@ -61,25 +79,25 @@ modifier_event [ukhpGH]{1,8}
modifier_bp [rwx]

%%
-cpu-cycles|cycles { return sym(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES); }
-stalled-cycles-frontend|idle-cycles-frontend { return sym(PERF_TYPE_HARDWARE, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); }
-stalled-cycles-backend|idle-cycles-backend { return sym(PERF_TYPE_HARDWARE, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); }
-instructions { return sym(PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS); }
-cache-references { return sym(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_REFERENCES); }
-cache-misses { return sym(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_MISSES); }
-branch-instructions|branches { return sym(PERF_TYPE_HARDWARE, PERF_COUNT_HW_BRANCH_INSTRUCTIONS); }
-branch-misses { return sym(PERF_TYPE_HARDWARE, PERF_COUNT_HW_BRANCH_MISSES); }
-bus-cycles { return sym(PERF_TYPE_HARDWARE, PERF_COUNT_HW_BUS_CYCLES); }
-ref-cycles { return sym(PERF_TYPE_HARDWARE, PERF_COUNT_HW_REF_CPU_CYCLES); }
-cpu-clock { return sym(PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CPU_CLOCK); }
-task-clock { return sym(PERF_TYPE_SOFTWARE, PERF_COUNT_SW_TASK_CLOCK); }
-page-faults|faults { return sym(PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS); }
-minor-faults { return sym(PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS_MIN); }
-major-faults { return sym(PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS_MAJ); }
-context-switches|cs { return sym(PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CONTEXT_SWITCHES); }
-cpu-migrations|migrations { return sym(PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CPU_MIGRATIONS); }
-alignment-faults { return sym(PERF_TYPE_SOFTWARE, PERF_COUNT_SW_ALIGNMENT_FAULTS); }
-emulation-faults { return sym(PERF_TYPE_SOFTWARE, PERF_COUNT_SW_EMULATION_FAULTS); }
+cpu-cycles|cycles { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES); }
+stalled-cycles-frontend|idle-cycles-frontend { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); }
+stalled-cycles-backend|idle-cycles-backend { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); }
+instructions { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS); }
+cache-references { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_REFERENCES); }
+cache-misses { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_MISSES); }
+branch-instructions|branches { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_BRANCH_INSTRUCTIONS); }
+branch-misses { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_BRANCH_MISSES); }
+bus-cycles { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_BUS_CYCLES); }
+ref-cycles { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_REF_CPU_CYCLES); }
+cpu-clock { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CPU_CLOCK); }
+task-clock { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_TASK_CLOCK); }
+page-faults|faults { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS); }
+minor-faults { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS_MIN); }
+major-faults { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS_MAJ); }
+context-switches|cs { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CONTEXT_SWITCHES); }
+cpu-migrations|migrations { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CPU_MIGRATIONS); }
+alignment-faults { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_ALIGNMENT_FAULTS); }
+emulation-faults { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_EMULATION_FAULTS); }

L1-dcache|l1-d|l1d|L1-data |
L1-icache|l1-i|l1i|L1-instruction |
@@ -87,14 +105,14 @@ LLC|L2 |
dTLB|d-tlb|Data-TLB |
iTLB|i-tlb|Instruction-TLB |
branch|branches|bpu|btb|bpc |
-node { return str(PE_NAME_CACHE_TYPE); }
+node { return str(yyscanner, PE_NAME_CACHE_TYPE); }

load|loads|read |
store|stores|write |
prefetch|prefetches |
speculative-read|speculative-load |
refs|Reference|ops|access |
-misses|miss { return str(PE_NAME_CACHE_OP_RESULT); }
+misses|miss { return str(yyscanner, PE_NAME_CACHE_OP_RESULT); }

/*
* These are event config hardcoded term names to be specified
@@ -102,20 +120,20 @@ misses|miss { return str(PE_NAME_CACHE_OP_RESULT); }
* so we can put them here directly. In case the we have a conflict
* in future, this needs to go into '//' condition block.
*/
-config { return term(PARSE_EVENTS__TERM_TYPE_CONFIG); }
-config1 { return term(PARSE_EVENTS__TERM_TYPE_CONFIG1); }
-config2 { return term(PARSE_EVENTS__TERM_TYPE_CONFIG2); }
-name { return term(PARSE_EVENTS__TERM_TYPE_NAME); }
-period { return term(PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD); }
-branch_type { return term(PARSE_EVENTS__TERM_TYPE_BRANCH_SAMPLE_TYPE); }
+config { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG); }
+config1 { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG1); }
+config2 { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG2); }
+name { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NAME); }
+period { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD); }
+branch_type { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_BRANCH_SAMPLE_TYPE); }

mem: { BEGIN(mem); return PE_PREFIX_MEM; }
-r{num_raw_hex} { return raw(); }
-{num_dec} { return value(10); }
-{num_hex} { return value(16); }
+r{num_raw_hex} { return raw(yyscanner); }
+{num_dec} { return value(yyscanner, 10); }
+{num_hex} { return value(yyscanner, 16); }

-{modifier_event} { return str(PE_MODIFIER_EVENT); }
-{name} { return str(PE_NAME); }
+{modifier_event} { return str(yyscanner, PE_MODIFIER_EVENT); }
+{name} { return str(yyscanner, PE_NAME); }
"/" { return '/'; }
- { return '-'; }
, { return ','; }
@@ -123,17 +141,17 @@ r{num_raw_hex} { return raw(); }
= { return '='; }

<mem>{
-{modifier_bp} { return str(PE_MODIFIER_BP); }
+{modifier_bp} { return str(yyscanner, PE_MODIFIER_BP); }
: { return ':'; }
-{num_dec} { return value(10); }
-{num_hex} { return value(16); }
+{num_dec} { return value(yyscanner, 10); }
+{num_hex} { return value(yyscanner, 16); }
/*
* We need to separate 'mem:' scanner part, in order to get specific
* modifier bits parsed out. Otherwise we would need to handle PE_NAME
* and we'd need to parse it manually. During the escape from <mem>
* state we need to put the escaping char back, so we dont miss it.
*/
-. { unput(*parse_events_text); BEGIN(INITIAL); }
+. { unput(*yytext); BEGIN(INITIAL); }
/*
* We destroy the scanner after reaching EOF,
* but anyway just to be sure get back to INIT state.
@@ -143,7 +161,7 @@ r{num_raw_hex} { return raw(); }

%%

-int parse_events_wrap(void)
+int parse_events_wrap(void *scanner __used)
{
return 1;
}
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index e533bf7..2a93d5c 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -1,6 +1,8 @@
-
+%pure-parser
%name-prefix "parse_events_"
%parse-param {void *_data}
+%parse-param {void *scanner}
+%lex-param {void* scanner}

%{

@@ -11,8 +13,9 @@
#include "types.h"
#include "util.h"
#include "parse-events.h"
+#include "parse-events-bison.h"

-extern int parse_events_lex (void);
+extern int parse_events_lex (YYSTYPE* lvalp, void* scanner);

#define ABORT_ON(val) \
do { \
@@ -286,7 +289,7 @@ sep_slash_dc: '/' | ':' |

%%

-void parse_events_error(void *data __used,
+void parse_events_error(void *data __used, void *scanner __used,
char const *msg __used)
{
}
--
1.7.10.2

2012-06-12 05:39:45

by Yan, Zheng

Subject: [PATCH V5 07/13] perf: Generic pci uncore device support

From: "Yan, Zheng" <[email protected]>

This patch adds generic support for uncore PMUs that are exposed
as PCI devices.

Signed-off-by: Zheng Yan <[email protected]>
---
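Note on how uncore_pci_add() picks a pmu for a new PCI device: the
loop matches on the device's devfn, so the same function on different
sockets ends up on the same pmu, and a new devfn claims the first free
slot. A userspace sketch of just that selection logic follows.
struct box_slots, MAX_BOXES and the two functions are hypothetical
stand-ins for the func_id fields of one uncore type.

```c
#include <assert.h>

#define MAX_BOXES 8	/* hypothetical upper bound for this sketch */

/* Stand-in for the func_id slots of one uncore type. */
struct box_slots {
	int num_boxes;
	int func_id[MAX_BOXES];	/* -1 means "slot not claimed yet" */
};

void box_slots_init(struct box_slots *t, int num_boxes)
{
	int i;

	assert(t && num_boxes >= 0 && num_boxes <= MAX_BOXES);
	t->num_boxes = num_boxes;
	for (i = 0; i < MAX_BOXES; i++)
		t->func_id[i] = -1;
}

/*
 * Return the slot index for devfn: a slot already bound to the same
 * devfn, else the first free slot (which is claimed), else -1 when
 * all slots are bound to other functions.
 */
int box_slots_find(struct box_slots *t, int devfn)
{
	int i;

	assert(t);
	for (i = 0; i < t->num_boxes; i++) {
		if (t->func_id[i] == devfn)
			return i;
		if (t->func_id[i] < 0) {
			t->func_id[i] = devfn;
			return i;
		}
	}
	return -1;
}
```

The -1 result corresponds to the -EINVAL path in the patch, where more
distinct functions show up than the type declares in num_boxes.
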
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 175 ++++++++++++++++++++++++-
arch/x86/kernel/cpu/perf_event_intel_uncore.h | 66 ++++++++++
2 files changed, 236 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index 84f9ae6..9a43fb4 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -2,6 +2,11 @@

static struct intel_uncore_type *empty_uncore[] = { NULL, };
static struct intel_uncore_type **msr_uncores = empty_uncore;
+static struct intel_uncore_type **pci_uncores = empty_uncore;
+/* pci bus to socket mapping */
+static int pcibus_to_physid[256] = { [0 ... 255] = -1, };
+
+static DEFINE_RAW_SPINLOCK(uncore_box_lock);

/* mask of cpus that collect uncore events */
static cpumask_t uncore_cpu_mask;
@@ -205,13 +210,13 @@ static void uncore_assign_hw_event(struct intel_uncore_box *box,
hwc->last_tag = ++box->tags[idx];

if (hwc->idx == UNCORE_PMC_IDX_FIXED) {
- hwc->event_base = uncore_msr_fixed_ctr(box);
- hwc->config_base = uncore_msr_fixed_ctl(box);
+ hwc->event_base = uncore_fixed_ctr(box);
+ hwc->config_base = uncore_fixed_ctl(box);
return;
}

- hwc->config_base = uncore_msr_event_ctl(box, hwc->idx);
- hwc->event_base = uncore_msr_perf_ctr(box, hwc->idx);
+ hwc->config_base = uncore_event_ctl(box, hwc->idx);
+ hwc->event_base = uncore_perf_ctr(box, hwc->idx);
}

static void uncore_perf_event_update(struct intel_uncore_box *box,
@@ -305,6 +310,22 @@ struct intel_uncore_box *uncore_alloc_box(int cpu)
static struct intel_uncore_box *
uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu)
{
+ struct intel_uncore_box *box;
+
+ box = *per_cpu_ptr(pmu->box, cpu);
+ if (box)
+ return box;
+
+ raw_spin_lock(&uncore_box_lock);
+ list_for_each_entry(box, &pmu->box_list, list) {
+ if (box->phys_id == topology_physical_package_id(cpu)) {
+ atomic_inc(&box->refcnt);
+ *per_cpu_ptr(pmu->box, cpu) = box;
+ break;
+ }
+ }
+ raw_spin_unlock(&uncore_box_lock);
+
return *per_cpu_ptr(pmu->box, cpu);
}

@@ -706,6 +727,13 @@ static void __init uncore_type_exit(struct intel_uncore_type *type)
type->attr_groups[1] = NULL;
}

+static void uncore_types_exit(struct intel_uncore_type **types)
+{
+ int i;
+ for (i = 0; types[i]; i++)
+ uncore_type_exit(types[i]);
+}
+
static int __init uncore_type_init(struct intel_uncore_type *type)
{
struct intel_uncore_pmu *pmus;
@@ -725,6 +753,7 @@ static int __init uncore_type_init(struct intel_uncore_type *type)
pmus[i].func_id = -1;
pmus[i].pmu_idx = i;
pmus[i].type = type;
+ INIT_LIST_HEAD(&pmus[i].box_list);
pmus[i].box = alloc_percpu(struct intel_uncore_box *);
if (!pmus[i].box)
goto fail;
@@ -771,6 +800,127 @@ fail:
return ret;
}

+static struct pci_driver *uncore_pci_driver;
+static bool pcidrv_registered;
+
+/*
+ * add a pci uncore device
+ */
+static int __devinit uncore_pci_add(struct intel_uncore_type *type,
+ struct pci_dev *pdev)
+{
+ struct intel_uncore_pmu *pmu;
+ struct intel_uncore_box *box;
+ int i, phys_id;
+
+ phys_id = pcibus_to_physid[pdev->bus->number];
+ if (phys_id < 0)
+ return -ENODEV;
+
+ box = uncore_alloc_box(0);
+ if (!box)
+ return -ENOMEM;
+
+ /*
+ * for a performance monitoring unit with multiple boxes,
+ * each box has a different function id.
+ */
+ for (i = 0; i < type->num_boxes; i++) {
+ pmu = &type->pmus[i];
+ if (pmu->func_id == pdev->devfn)
+ break;
+ if (pmu->func_id < 0) {
+ pmu->func_id = pdev->devfn;
+ break;
+ }
+ pmu = NULL;
+ }
+
+ if (!pmu) {
+ kfree(box);
+ return -EINVAL;
+ }
+
+ box->phys_id = phys_id;
+ box->pci_dev = pdev;
+ box->pmu = pmu;
+ uncore_box_init(box);
+ pci_set_drvdata(pdev, box);
+
+ raw_spin_lock(&uncore_box_lock);
+ list_add_tail(&box->list, &pmu->box_list);
+ raw_spin_unlock(&uncore_box_lock);
+
+ return 0;
+}
+
+static void __devexit uncore_pci_remove(struct pci_dev *pdev)
+{
+ struct intel_uncore_box *box = pci_get_drvdata(pdev);
+ struct intel_uncore_pmu *pmu = box->pmu;
+ int cpu, phys_id = pcibus_to_physid[pdev->bus->number];
+
+ if (WARN_ON_ONCE(phys_id != box->phys_id))
+ return;
+
+ raw_spin_lock(&uncore_box_lock);
+ list_del(&box->list);
+ raw_spin_unlock(&uncore_box_lock);
+
+ for_each_possible_cpu(cpu) {
+ if (*per_cpu_ptr(pmu->box, cpu) == box) {
+ *per_cpu_ptr(pmu->box, cpu) = NULL;
+ atomic_dec(&box->refcnt);
+ }
+ }
+
+ WARN_ON_ONCE(atomic_read(&box->refcnt) != 1);
+ kfree(box);
+}
+
+static int __devinit uncore_pci_probe(struct pci_dev *pdev,
+ const struct pci_device_id *id)
+{
+ struct intel_uncore_type *type;
+
+ type = (struct intel_uncore_type *)id->driver_data;
+ return uncore_pci_add(type, pdev);
+}
+
+static int __init uncore_pci_init(void)
+{
+ int ret;
+
+ switch (boot_cpu_data.x86_model) {
+ default:
+ return 0;
+ }
+
+ ret = uncore_types_init(pci_uncores);
+ if (ret)
+ return ret;
+
+ uncore_pci_driver->probe = uncore_pci_probe;
+ uncore_pci_driver->remove = uncore_pci_remove;
+
+ ret = pci_register_driver(uncore_pci_driver);
+ if (ret == 0)
+ pcidrv_registered = true;
+ else
+ uncore_types_exit(pci_uncores);
+
+ return ret;
+}
+
+static void __init uncore_pci_exit(void)
+{
+ if (pcidrv_registered) {
+ pcidrv_registered = false;
+ pci_unregister_driver(uncore_pci_driver);
+ uncore_types_exit(pci_uncores);
+ }
+}
+
static void __cpuinit uncore_cpu_dying(int cpu)
{
struct intel_uncore_type *type;
@@ -919,6 +1069,7 @@ static void __cpuinit uncore_event_exit_cpu(int cpu)
cpumask_set_cpu(target, &uncore_cpu_mask);

uncore_change_context(msr_uncores, cpu, target);
+ uncore_change_context(pci_uncores, cpu, target);
}

static void __cpuinit uncore_event_init_cpu(int cpu)
@@ -934,6 +1085,7 @@ static void __cpuinit uncore_event_init_cpu(int cpu)
cpumask_set_cpu(cpu, &uncore_cpu_mask);

uncore_change_context(msr_uncores, -1, cpu);
+ uncore_change_context(pci_uncores, -1, cpu);
}

static int __cpuinit uncore_cpu_notifier(struct notifier_block *self,
@@ -1048,6 +1200,14 @@ static int __init uncore_pmus_register(void)
}
}

+ for (i = 0; pci_uncores[i]; i++) {
+ type = pci_uncores[i];
+ for (j = 0; j < type->num_boxes; j++) {
+ pmu = &type->pmus[j];
+ uncore_pmu_register(pmu);
+ }
+ }
+
return 0;
}

@@ -1058,9 +1218,14 @@ static int __init intel_uncore_init(void)
if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
return -ENODEV;

- ret = uncore_cpu_init();
+ ret = uncore_pci_init();
if (ret)
goto fail;
+ ret = uncore_cpu_init();
+ if (ret) {
+ uncore_pci_exit();
+ goto fail;
+ }

uncore_pmus_register();
return 0;
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
index eeb5ca5..aa01df8 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -1,5 +1,6 @@
#include <linux/module.h>
#include <linux/slab.h>
+#include <linux/pci.h>
#include <linux/perf_event.h>
#include "perf_event.h"

@@ -110,6 +111,7 @@ struct intel_uncore_pmu {
int func_id;
struct intel_uncore_type *type;
struct intel_uncore_box ** __percpu box;
+ struct list_head box_list;
};

struct intel_uncore_box {
@@ -123,6 +125,7 @@ struct intel_uncore_box {
struct perf_event *event_list[UNCORE_PMC_IDX_MAX];
unsigned long active_mask[BITS_TO_LONGS(UNCORE_PMC_IDX_MAX)];
u64 tags[UNCORE_PMC_IDX_MAX];
+ struct pci_dev *pci_dev;
struct intel_uncore_pmu *pmu;
struct hrtimer hrtimer;
struct list_head list;
@@ -161,6 +164,33 @@ static ssize_t uncore_event_show(struct kobject *kobj,
return sprintf(buf, "%s", event->config);
}

+static inline unsigned uncore_pci_box_ctl(struct intel_uncore_box *box)
+{
+ return box->pmu->type->box_ctl;
+}
+
+static inline unsigned uncore_pci_fixed_ctl(struct intel_uncore_box *box)
+{
+ return box->pmu->type->fixed_ctl;
+}
+
+static inline unsigned uncore_pci_fixed_ctr(struct intel_uncore_box *box)
+{
+ return box->pmu->type->fixed_ctr;
+}
+
+static inline
+unsigned uncore_pci_event_ctl(struct intel_uncore_box *box, int idx)
+{
+ return idx * 4 + box->pmu->type->event_ctl;
+}
+
+static inline
+unsigned uncore_pci_perf_ctr(struct intel_uncore_box *box, int idx)
+{
+ return idx * 8 + box->pmu->type->perf_ctr;
+}
+
static inline
unsigned uncore_msr_box_ctl(struct intel_uncore_box *box)
{
@@ -200,6 +230,42 @@ unsigned uncore_msr_perf_ctr(struct intel_uncore_box *box, int idx)
box->pmu->type->msr_offset * box->pmu->pmu_idx;
}

+static inline
+unsigned uncore_fixed_ctl(struct intel_uncore_box *box)
+{
+ if (box->pci_dev)
+ return uncore_pci_fixed_ctl(box);
+ else
+ return uncore_msr_fixed_ctl(box);
+}
+
+static inline
+unsigned uncore_fixed_ctr(struct intel_uncore_box *box)
+{
+ if (box->pci_dev)
+ return uncore_pci_fixed_ctr(box);
+ else
+ return uncore_msr_fixed_ctr(box);
+}
+
+static inline
+unsigned uncore_event_ctl(struct intel_uncore_box *box, int idx)
+{
+ if (box->pci_dev)
+ return uncore_pci_event_ctl(box, idx);
+ else
+ return uncore_msr_event_ctl(box, idx);
+}
+
+static inline
+unsigned uncore_perf_ctr(struct intel_uncore_box *box, int idx)
+{
+ if (box->pci_dev)
+ return uncore_pci_perf_ctr(box, idx);
+ else
+ return uncore_msr_perf_ctr(box, idx);
+}
+
static inline int uncore_perf_ctr_bits(struct intel_uncore_box *box)
{
return box->pmu->type->perf_ctr_bits;
--
1.7.10.2

2012-06-12 05:39:46

by Yan, Zheng

Subject: [PATCH V5 06/13] perf: Add Nehalem and Sandy Bridge uncore support

From: "Yan, Zheng" <[email protected]>

Add Intel Nehalem and Sandy Bridge uncore pmu support.
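
As a rough illustration of the format attributes this patch exports
(event at config:0-7, umask at config:8-15, edge at config:18, inv at
config:23, cmask at config:24-28 on Sandy Bridge or config:24-31 on
Nehalem), the user-space sketch below packs those fields into a raw
config value. The helper name unc_pack_config() and the UNC_* macros
are hypothetical and are not part of the patch; only the bit positions
are taken from it.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions copied from the DEFINE_UNCORE_FORMAT_ATTR() lines. */
#define UNC_EVENT_SHIFT	0	/* event: config:0-7 */
#define UNC_UMASK_SHIFT	8	/* umask: config:8-15 */
#define UNC_EDGE_BIT	18	/* edge:  config:18 */
#define UNC_INV_BIT	23	/* inv:   config:23 */
#define UNC_CMASK_SHIFT	24	/* cmask: config:24-28 or config:24-31 */

/*
 * Hypothetical helper: pack the individual fields into one config
 * value. cmask_bits is 5 for the Sandy Bridge layout and 8 for the
 * Nehalem layout; out-of-range field bits are masked off.
 */
static uint64_t unc_pack_config(unsigned event, unsigned umask,
				int edge, int inv,
				unsigned cmask, unsigned cmask_bits)
{
	uint64_t config = 0;

	assert(cmask_bits == 5 || cmask_bits == 8);

	config |= (uint64_t)(event & 0xff) << UNC_EVENT_SHIFT;
	config |= (uint64_t)(umask & 0xff) << UNC_UMASK_SHIFT;
	config |= (uint64_t)(edge ? 1 : 0) << UNC_EDGE_BIT;
	config |= (uint64_t)(inv ? 1 : 0) << UNC_INV_BIT;
	config |= (uint64_t)(cmask & ((1u << cmask_bits) - 1))
			<< UNC_CMASK_SHIFT;
	return config;
}
```

With this layout, the QMC_NORMAL_READS_ANY description
"event=0x2c,umask=0xf" corresponds to
unc_pack_config(0x2c, 0xf, 0, 0, 0, 8), which yields 0x0f2c.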

Signed-off-by: Zheng Yan <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 194 +++++++++++++++++++++++++
arch/x86/kernel/cpu/perf_event_intel_uncore.h | 50 +++++++
2 files changed, 244 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index e33ea16..84f9ae6 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -10,6 +10,192 @@ static cpumask_t uncore_cpu_mask;
static struct event_constraint constraint_fixed =
EVENT_CONSTRAINT((u64)-1, 1 << UNCORE_PMC_IDX_FIXED, (u64)-1);

+DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-7");
+DEFINE_UNCORE_FORMAT_ATTR(umask, umask, "config:8-15");
+DEFINE_UNCORE_FORMAT_ATTR(edge, edge, "config:18");
+DEFINE_UNCORE_FORMAT_ATTR(inv, inv, "config:23");
+DEFINE_UNCORE_FORMAT_ATTR(cmask5, cmask, "config:24-28");
+DEFINE_UNCORE_FORMAT_ATTR(cmask8, cmask, "config:24-31");
+
+/* Sandy Bridge uncore support */
+static void snb_uncore_msr_enable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (hwc->idx < UNCORE_PMC_IDX_FIXED)
+ wrmsrl(hwc->config_base, hwc->config | SNB_UNC_CTL_EN);
+ else
+ wrmsrl(hwc->config_base, SNB_UNC_CTL_EN);
+}
+
+static void snb_uncore_msr_disable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ wrmsrl(event->hw.config_base, 0);
+}
+
+static u64 snb_uncore_msr_read_counter(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ u64 count;
+ rdmsrl(event->hw.event_base, count);
+ return count;
+}
+
+static void snb_uncore_msr_init_box(struct intel_uncore_box *box)
+{
+ if (box->pmu->pmu_idx == 0) {
+ wrmsrl(SNB_UNC_PERF_GLOBAL_CTL,
+ SNB_UNC_GLOBAL_CTL_EN | SNB_UNC_GLOBAL_CTL_CORE_ALL);
+ }
+}
+
+static struct attribute *snb_uncore_formats_attr[] = {
+ &format_attr_event.attr,
+ &format_attr_umask.attr,
+ &format_attr_edge.attr,
+ &format_attr_inv.attr,
+ &format_attr_cmask5.attr,
+ NULL,
+};
+
+static struct attribute_group snb_uncore_format_group = {
+ .name = "format",
+ .attrs = snb_uncore_formats_attr,
+};
+
+static struct intel_uncore_ops snb_uncore_msr_ops = {
+ .init_box = snb_uncore_msr_init_box,
+ .disable_event = snb_uncore_msr_disable_event,
+ .enable_event = snb_uncore_msr_enable_event,
+ .read_counter = snb_uncore_msr_read_counter,
+};
+
+static struct event_constraint snb_uncore_cbo_constraints[] = {
+ UNCORE_EVENT_CONSTRAINT(0x80, 0x1),
+ UNCORE_EVENT_CONSTRAINT(0x83, 0x1),
+ EVENT_CONSTRAINT_END
+};
+
+static struct intel_uncore_type snb_uncore_cbo = {
+ .name = "C-Box",
+ .num_counters = 2,
+ .num_boxes = 4,
+ .perf_ctr_bits = 44,
+ .fixed_ctr_bits = 48,
+ .perf_ctr = SNB_UNC_CBO_0_PER_CTR0,
+ .event_ctl = SNB_UNC_CBO_0_PERFEVTSEL0,
+ .fixed_ctr = SNB_UNC_FIXED_CTR,
+ .fixed_ctl = SNB_UNC_FIXED_CTR_CTRL,
+ .single_fixed = 1,
+ .event_mask = SNB_UNC_RAW_EVENT_MASK,
+ .msr_offset = SNB_UNC_CBO_MSR_OFFSET,
+ .constraints = snb_uncore_cbo_constraints,
+ .ops = &snb_uncore_msr_ops,
+ .format_group = &snb_uncore_format_group,
+};
+
+static struct intel_uncore_type *snb_msr_uncores[] = {
+ &snb_uncore_cbo,
+ NULL,
+};
+/* end of Sandy Bridge uncore support */
+
+/* Nehalem uncore support */
+static void nhm_uncore_msr_disable_box(struct intel_uncore_box *box)
+{
+ wrmsrl(NHM_UNC_PERF_GLOBAL_CTL, 0);
+}
+
+static void nhm_uncore_msr_enable_box(struct intel_uncore_box *box)
+{
+ wrmsrl(NHM_UNC_PERF_GLOBAL_CTL,
+ NHM_UNC_GLOBAL_CTL_EN_PC_ALL | NHM_UNC_GLOBAL_CTL_EN_FC);
+}
+
+static void nhm_uncore_msr_enable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (hwc->idx < UNCORE_PMC_IDX_FIXED)
+ wrmsrl(hwc->config_base, hwc->config | SNB_UNC_CTL_EN);
+ else
+ wrmsrl(hwc->config_base, NHM_UNC_FIXED_CTR_CTL_EN);
+}
+
+static struct attribute *nhm_uncore_formats_attr[] = {
+ &format_attr_event.attr,
+ &format_attr_umask.attr,
+ &format_attr_edge.attr,
+ &format_attr_inv.attr,
+ &format_attr_cmask8.attr,
+ NULL,
+};
+
+static struct attribute_group nhm_uncore_format_group = {
+ .name = "format",
+ .attrs = nhm_uncore_formats_attr,
+};
+
+static struct uncore_event_desc nhm_uncore_events[] = {
+ INTEL_UNCORE_EVENT_DESC(CLOCKTICKS, "config=0xffff"),
+ /* full cache line writes to DRAM */
+ INTEL_UNCORE_EVENT_DESC(QMC_WRITES_FULL_ANY, "event=0x2f,umask=0xf"),
+ /* Quickpath Memory Controller normal priority read requests */
+ INTEL_UNCORE_EVENT_DESC(QMC_NORMAL_READS_ANY, "event=0x2c,umask=0xf"),
+ /* Quickpath Home Logic read requests from the IOH */
+ INTEL_UNCORE_EVENT_DESC(QHL_REQUEST_IOH_READS,
+ "event=0x20,umask=0x1"),
+ /* Quickpath Home Logic write requests from the IOH */
+ INTEL_UNCORE_EVENT_DESC(QHL_REQUEST_IOH_WRITES,
+ "event=0x20,umask=0x2"),
+ /* Quickpath Home Logic read requests from a remote socket */
+ INTEL_UNCORE_EVENT_DESC(QHL_REQUEST_REMOTE_READS,
+ "event=0x20,umask=0x4"),
+ /* Quickpath Home Logic write requests from a remote socket */
+ INTEL_UNCORE_EVENT_DESC(QHL_REQUEST_REMOTE_WRITES,
+ "event=0x20,umask=0x8"),
+ /* Quickpath Home Logic read requests from the local socket */
+ INTEL_UNCORE_EVENT_DESC(QHL_REQUEST_LOCAL_READS,
+ "event=0x20,umask=0x10"),
+ /* Quickpath Home Logic write requests from the local socket */
+ INTEL_UNCORE_EVENT_DESC(QHL_REQUEST_LOCAL_WRITES,
+ "event=0x20,umask=0x20"),
+ { /* end: all zeroes */ },
+};
+
+static struct intel_uncore_ops nhm_uncore_msr_ops = {
+ .disable_box = nhm_uncore_msr_disable_box,
+ .enable_box = nhm_uncore_msr_enable_box,
+ .disable_event = snb_uncore_msr_disable_event,
+ .enable_event = nhm_uncore_msr_enable_event,
+ .read_counter = snb_uncore_msr_read_counter,
+};
+
+static struct intel_uncore_type nhm_uncore = {
+ .name = "",
+ .num_counters = 8,
+ .num_boxes = 1,
+ .perf_ctr_bits = 48,
+ .fixed_ctr_bits = 48,
+ .event_ctl = NHM_UNC_PERFEVTSEL0,
+ .perf_ctr = NHM_UNC_UNCORE_PMC0,
+ .fixed_ctr = NHM_UNC_FIXED_CTR,
+ .fixed_ctl = NHM_UNC_FIXED_CTR_CTRL,
+ .event_mask = NHM_UNC_RAW_EVENT_MASK,
+ .event_descs = nhm_uncore_events,
+ .ops = &nhm_uncore_msr_ops,
+ .format_group = &nhm_uncore_format_group,
+};
+
+static struct intel_uncore_type *nhm_msr_uncores[] = {
+ &nhm_uncore,
+ NULL,
+};
+/* end of Nehalem uncore support */
+
static void uncore_assign_hw_event(struct intel_uncore_box *box,
struct perf_event *event, int idx)
{
@@ -806,6 +992,14 @@ static int __init uncore_cpu_init(void)
int ret, cpu;

switch (boot_cpu_data.x86_model) {
+ case 26: /* Nehalem */
+ case 30:
+ case 37: /* Westmere */
+ msr_uncores = nhm_msr_uncores;
+ break;
+ case 42: /* Sandy Bridge */
+ msr_uncores = snb_msr_uncores;
+ break;
default:
return 0;
}
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
index 49a6bfb..eeb5ca5 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -15,6 +15,56 @@

#define UNCORE_EVENT_CONSTRAINT(c, n) EVENT_CONSTRAINT(c, n, 0xff)

+/* SNB event control */
+#define SNB_UNC_CTL_EV_SEL_MASK 0x000000ff
+#define SNB_UNC_CTL_UMASK_MASK 0x0000ff00
+#define SNB_UNC_CTL_EDGE_DET (1 << 18)
+#define SNB_UNC_CTL_EN (1 << 22)
+#define SNB_UNC_CTL_INVERT (1 << 23)
+#define SNB_UNC_CTL_CMASK_MASK 0x1f000000
+#define NHM_UNC_CTL_CMASK_MASK 0xff000000
+#define NHM_UNC_FIXED_CTR_CTL_EN (1 << 0)
+
+#define SNB_UNC_RAW_EVENT_MASK (SNB_UNC_CTL_EV_SEL_MASK | \
+ SNB_UNC_CTL_UMASK_MASK | \
+ SNB_UNC_CTL_EDGE_DET | \
+ SNB_UNC_CTL_INVERT | \
+ SNB_UNC_CTL_CMASK_MASK)
+
+#define NHM_UNC_RAW_EVENT_MASK (SNB_UNC_CTL_EV_SEL_MASK | \
+ SNB_UNC_CTL_UMASK_MASK | \
+ SNB_UNC_CTL_EDGE_DET | \
+ SNB_UNC_CTL_INVERT | \
+ NHM_UNC_CTL_CMASK_MASK)
+
+/* SNB global control register */
+#define SNB_UNC_PERF_GLOBAL_CTL 0x391
+#define SNB_UNC_FIXED_CTR_CTRL 0x394
+#define SNB_UNC_FIXED_CTR 0x395
+
+/* SNB uncore global control */
+#define SNB_UNC_GLOBAL_CTL_CORE_ALL ((1 << 4) - 1)
+#define SNB_UNC_GLOBAL_CTL_EN (1 << 29)
+
+/* SNB Cbo register */
+#define SNB_UNC_CBO_0_PERFEVTSEL0 0x700
+#define SNB_UNC_CBO_0_PER_CTR0 0x706
+#define SNB_UNC_CBO_MSR_OFFSET 0x10
+
+/* NHM global control register */
+#define NHM_UNC_PERF_GLOBAL_CTL 0x391
+#define NHM_UNC_FIXED_CTR 0x394
+#define NHM_UNC_FIXED_CTR_CTRL 0x395
+
+/* NHM uncore global control */
+#define NHM_UNC_GLOBAL_CTL_EN_PC_ALL ((1ULL << 8) - 1)
+#define NHM_UNC_GLOBAL_CTL_EN_FC (1ULL << 32)
+
+/* NHM uncore register */
+#define NHM_UNC_PERFEVTSEL0 0x3c0
+#define NHM_UNC_UNCORE_PMC0 0x3b0
+
+
struct intel_uncore_ops;
struct intel_uncore_pmu;
struct intel_uncore_box;
--
1.7.10.2

2012-06-12 05:38:07

by Yan, Zheng

Subject: [PATCH V5 03/13] perf: Allow pmu to choose cpu on which to install event

From: "Yan, Zheng" <[email protected]>

Allow the pmu->event_init callback to change event->cpu, so that a pmu
can choose the cpu on which to install the event.
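
The effect can be pictured with a small user-space model. This is
only a sketch: the topology tables, struct model_event and the
model_*() functions are made up for illustration and do not exist in
the kernel. The point it shows is that the caller must use the cpu
stored in the event after event_init has run, not the cpu that was
originally requested.

```c
#include <assert.h>

#define MODEL_NR_CPUS 8

/* Made-up topology: cpus 0-3 in package 0, cpus 4-7 in package 1. */
static const int model_cpu_to_package[MODEL_NR_CPUS] = {
	0, 0, 0, 0, 1, 1, 1, 1
};

/* Made-up choice of the cpu that collects events for each package. */
static const int model_package_collector[2] = { 0, 4 };

struct model_event {
	int cpu;
};

/* Stand-in for a pmu->event_init callback that redirects the event. */
static int model_event_init(struct model_event *event)
{
	if (event->cpu < 0 || event->cpu >= MODEL_NR_CPUS)
		return -1;

	event->cpu = model_package_collector[model_cpu_to_package[event->cpu]];
	return 0;
}

/*
 * Stand-in for the caller: after event_init, the context lookup and
 * the install step use event.cpu rather than requested_cpu.
 */
static int model_open(int requested_cpu)
{
	struct model_event event = { .cpu = requested_cpu };

	if (model_event_init(&event))
		return -1;

	return event.cpu;
}
```

In this model a request for cpu 6 ends up installed on cpu 4, the
collector of package 1.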

Signed-off-by: Zheng Yan <[email protected]>
---
kernel/events/core.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index d71a2d6..2c05027 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6302,7 +6302,7 @@ SYSCALL_DEFINE5(perf_event_open,
/*
* Get the target context (task or percpu):
*/
- ctx = find_get_context(pmu, task, cpu);
+ ctx = find_get_context(pmu, task, event->cpu);
if (IS_ERR(ctx)) {
err = PTR_ERR(ctx);
goto err_alloc;
@@ -6375,16 +6375,16 @@ SYSCALL_DEFINE5(perf_event_open,
mutex_lock(&ctx->mutex);

if (move_group) {
- perf_install_in_context(ctx, group_leader, cpu);
+ perf_install_in_context(ctx, group_leader, event->cpu);
get_ctx(ctx);
list_for_each_entry(sibling, &group_leader->sibling_list,
group_entry) {
- perf_install_in_context(ctx, sibling, cpu);
+ perf_install_in_context(ctx, sibling, event->cpu);
get_ctx(ctx);
}
}

- perf_install_in_context(ctx, event, cpu);
+ perf_install_in_context(ctx, event, event->cpu);
++ctx->generation;
perf_unpin_context(ctx);
mutex_unlock(&ctx->mutex);
--
1.7.10.2

2012-06-12 05:38:05

by Yan, Zheng

Subject: [PATCH V5 04/13] perf: Introduce perf_pmu_migrate_context

From: "Yan, Zheng" <[email protected]>

Originally from Peter Zijlstra. The helper migrates perf events
from one cpu to another cpu.
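
The two-phase shape of the helper (detach everything from the source
context, then attach it to the destination context) can be sketched in
user space as follows. This is a simplified model, not the kernel
code: struct model_ctx, struct model_event and model_migrate() are
hypothetical, locking and RCU are reduced to comments, and a plain
counter stands in for the context reference count.

```c
#include <assert.h>
#include <stddef.h>

struct model_event {
	int id;
	int cpu;
	struct model_event *next;
};

struct model_ctx {
	struct model_event *head;
	int refcount;
};

static void model_migrate(struct model_ctx *src, struct model_ctx *dst,
			  int dst_cpu)
{
	struct model_event *pending = NULL, *ev;

	/* Phase 1: the kernel holds the source context mutex here. */
	while ((ev = src->head) != NULL) {
		src->head = ev->next;
		src->refcount--;
		ev->next = pending;
		pending = ev;
	}

	/* The kernel waits for an RCU grace period between the phases. */

	/* Phase 2: the kernel holds the destination context mutex here. */
	while ((ev = pending) != NULL) {
		pending = ev->next;
		/* events not bound to a cpu (cpu == -1) stay unbound */
		if (ev->cpu != -1)
			ev->cpu = dst_cpu;
		ev->next = dst->head;
		dst->head = ev;
		dst->refcount++;
	}
}
```

Keeping the detached events on a private list means the two contexts
are never locked at the same time in this model.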

Signed-off-by: Zheng Yan <[email protected]>
---
include/linux/perf_event.h | 2 ++
kernel/events/core.c | 36 ++++++++++++++++++++++++++++++++++++
2 files changed, 38 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 1ce887a..76c5c8b 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1107,6 +1107,8 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr,
struct task_struct *task,
perf_overflow_handler_t callback,
void *context);
+extern void perf_pmu_migrate_context(struct pmu *pmu,
+ int src_cpu, int dst_cpu);
extern u64 perf_event_read_value(struct perf_event *event,
u64 *enabled, u64 *running);

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2c05027..2e54e74 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1641,6 +1641,8 @@ perf_install_in_context(struct perf_event_context *ctx,
lockdep_assert_held(&ctx->mutex);

event->ctx = ctx;
+ if (event->cpu != -1)
+ event->cpu = cpu;

if (!task) {
/*
@@ -6375,6 +6377,7 @@ SYSCALL_DEFINE5(perf_event_open,
mutex_lock(&ctx->mutex);

if (move_group) {
+ synchronize_rcu();
perf_install_in_context(ctx, group_leader, event->cpu);
get_ctx(ctx);
list_for_each_entry(sibling, &group_leader->sibling_list,
@@ -6480,6 +6483,39 @@ err:
}
EXPORT_SYMBOL_GPL(perf_event_create_kernel_counter);

+void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
+{
+ struct perf_event_context *src_ctx;
+ struct perf_event_context *dst_ctx;
+ struct perf_event *event, *tmp;
+ LIST_HEAD(events);
+
+ src_ctx = &per_cpu_ptr(pmu->pmu_cpu_context, src_cpu)->ctx;
+ dst_ctx = &per_cpu_ptr(pmu->pmu_cpu_context, dst_cpu)->ctx;
+
+ mutex_lock(&src_ctx->mutex);
+ list_for_each_entry_safe(event, tmp, &src_ctx->event_list,
+ event_entry) {
+ perf_remove_from_context(event);
+ put_ctx(src_ctx);
+ list_add(&event->event_entry, &events);
+ }
+ mutex_unlock(&src_ctx->mutex);
+
+ synchronize_rcu();
+
+ mutex_lock(&dst_ctx->mutex);
+ list_for_each_entry_safe(event, tmp, &events, event_entry) {
+ list_del(&event->event_entry);
+ if (event->state >= PERF_EVENT_STATE_OFF)
+ event->state = PERF_EVENT_STATE_INACTIVE;
+ perf_install_in_context(dst_ctx, event, dst_cpu);
+ get_ctx(dst_ctx);
+ }
+ mutex_unlock(&dst_ctx->mutex);
+}
+EXPORT_SYMBOL_GPL(perf_pmu_migrate_context);
+
static void sync_child_event(struct perf_event *child_event,
struct task_struct *child)
{
--
1.7.10.2

2012-06-12 05:40:37

by Yan, Zheng

Subject: [PATCH V5 05/13] perf: Generic intel uncore support

From: "Yan, Zheng" <[email protected]>

This patch adds the generic Intel uncore pmu support, including helper
functions that add/delete uncore events, an hrtimer that periodically
polls the counters to avoid overflow, and code that places all events
for a particular socket onto a single cpu. The code design is based on
the structure of Sandy Bridge-EP's uncore subsystem, which consists of
a variety of components, each of which contains one or more boxes.
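
Polling only works if the difference between two reads is computed
correctly when a counter narrower than 64 bits wraps. The sketch
below isolates the shift arithmetic used by uncore_perf_event_update()
in this patch as a user-space function; the name unc_counter_delta()
is hypothetical, and the assertion on the counter width is an addition
for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of events between two reads of a
 * counter that is 'bits' wide. Shifting both values to the top of a
 * 64-bit word makes the subtraction wrap at the counter width.
 */
static uint64_t unc_counter_delta(uint64_t prev, uint64_t now,
				  unsigned bits)
{
	unsigned shift;
	uint64_t delta;

	assert(bits >= 1 && bits <= 64);

	shift = 64 - bits;
	delta = (now << shift) - (prev << shift);
	return delta >> shift;
}
```

For a 44-bit counter that was last read at 0xFFFFFFFFFFF and now
reads 5, the function returns 6. The result is only meaningful if the
counter wrapped at most once between the two reads, which is what the
periodic poll is meant to guarantee.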

Signed-off-by: Zheng Yan <[email protected]>
---
arch/x86/kernel/cpu/Makefile | 4 +-
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 876 +++++++++++++++++++++++++
arch/x86/kernel/cpu/perf_event_intel_uncore.h | 204 ++++++
3 files changed, 1083 insertions(+), 1 deletion(-)
create mode 100644 arch/x86/kernel/cpu/perf_event_intel_uncore.c
create mode 100644 arch/x86/kernel/cpu/perf_event_intel_uncore.h

diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 6ab6aa2..bac4c38 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -32,7 +32,9 @@ obj-$(CONFIG_PERF_EVENTS) += perf_event.o

ifdef CONFIG_PERF_EVENTS
obj-$(CONFIG_CPU_SUP_AMD) += perf_event_amd.o
-obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_p6.o perf_event_p4.o perf_event_intel_lbr.o perf_event_intel_ds.o perf_event_intel.o
+obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_p6.o perf_event_p4.o
+obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_lbr.o perf_event_intel_ds.o perf_event_intel.o
+obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_uncore.o
endif

obj-$(CONFIG_X86_MCE) += mcheck/
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
new file mode 100644
index 0000000..e33ea16
--- /dev/null
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -0,0 +1,876 @@
+#include "perf_event_intel_uncore.h"
+
+static struct intel_uncore_type *empty_uncore[] = { NULL, };
+static struct intel_uncore_type **msr_uncores = empty_uncore;
+
+/* mask of cpus that collect uncore events */
+static cpumask_t uncore_cpu_mask;
+
+/* constraint for the fixed counter */
+static struct event_constraint constraint_fixed =
+ EVENT_CONSTRAINT((u64)-1, 1 << UNCORE_PMC_IDX_FIXED, (u64)-1);
+
+static void uncore_assign_hw_event(struct intel_uncore_box *box,
+ struct perf_event *event, int idx)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ hwc->idx = idx;
+ hwc->last_tag = ++box->tags[idx];
+
+ if (hwc->idx == UNCORE_PMC_IDX_FIXED) {
+ hwc->event_base = uncore_msr_fixed_ctr(box);
+ hwc->config_base = uncore_msr_fixed_ctl(box);
+ return;
+ }
+
+ hwc->config_base = uncore_msr_event_ctl(box, hwc->idx);
+ hwc->event_base = uncore_msr_perf_ctr(box, hwc->idx);
+}
+
+static void uncore_perf_event_update(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ u64 prev_count, new_count, delta;
+ int shift;
+
+ if (event->hw.idx >= UNCORE_PMC_IDX_FIXED)
+ shift = 64 - uncore_fixed_ctr_bits(box);
+ else
+ shift = 64 - uncore_perf_ctr_bits(box);
+
+ /* the hrtimer might modify the previous event value */
+again:
+ prev_count = local64_read(&event->hw.prev_count);
+ new_count = uncore_read_counter(box, event);
+ if (local64_xchg(&event->hw.prev_count, new_count) != prev_count)
+ goto again;
+
+ delta = (new_count << shift) - (prev_count << shift);
+ delta >>= shift;
+
+ local64_add(delta, &event->count);
+}
+
+/*
+ * The overflow interrupt is unavailable for SandyBridge-EP and is broken
+ * for SandyBridge, so we use an hrtimer to periodically poll the counter
+ * to avoid overflow.
+ */
+static enum hrtimer_restart uncore_pmu_hrtimer(struct hrtimer *hrtimer)
+{
+ struct intel_uncore_box *box;
+ unsigned long flags;
+ int bit;
+
+ box = container_of(hrtimer, struct intel_uncore_box, hrtimer);
+ if (!box->n_active || box->cpu != smp_processor_id())
+ return HRTIMER_NORESTART;
+ /*
+ * disable local interrupts to prevent uncore_pmu_event_start/stop
+ * from interrupting the update process
+ */
+ local_irq_save(flags);
+
+ for_each_set_bit(bit, box->active_mask, UNCORE_PMC_IDX_MAX)
+ uncore_perf_event_update(box, box->events[bit]);
+
+ local_irq_restore(flags);
+
+ hrtimer_forward_now(hrtimer, ns_to_ktime(UNCORE_PMU_HRTIMER_INTERVAL));
+ return HRTIMER_RESTART;
+}
+
+static void uncore_pmu_start_hrtimer(struct intel_uncore_box *box)
+{
+ __hrtimer_start_range_ns(&box->hrtimer,
+ ns_to_ktime(UNCORE_PMU_HRTIMER_INTERVAL), 0,
+ HRTIMER_MODE_REL_PINNED, 0);
+}
+
+static void uncore_pmu_cancel_hrtimer(struct intel_uncore_box *box)
+{
+ hrtimer_cancel(&box->hrtimer);
+}
+
+static void uncore_pmu_init_hrtimer(struct intel_uncore_box *box)
+{
+ hrtimer_init(&box->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+ box->hrtimer.function = uncore_pmu_hrtimer;
+}
+
+struct intel_uncore_box *uncore_alloc_box(int cpu)
+{
+ struct intel_uncore_box *box;
+
+ box = kmalloc_node(sizeof(*box), GFP_KERNEL | __GFP_ZERO,
+ cpu_to_node(cpu));
+ if (!box)
+ return NULL;
+
+ uncore_pmu_init_hrtimer(box);
+ atomic_set(&box->refcnt, 1);
+ box->cpu = -1;
+ box->phys_id = -1;
+
+ return box;
+}
+
+static struct intel_uncore_box *
+uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu)
+{
+ return *per_cpu_ptr(pmu->box, cpu);
+}
+
+static struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
+{
+ return container_of(event->pmu, struct intel_uncore_pmu, pmu);
+}
+
+static struct intel_uncore_box *uncore_event_to_box(struct perf_event *event)
+{
+ /*
+ * perf core schedules events on a per-cpu basis, while uncore events
+ * are collected by one of the cpus inside a physical package.
+ */
+ return uncore_pmu_to_box(uncore_event_to_pmu(event),
+ smp_processor_id());
+}
+
+static int uncore_collect_events(struct intel_uncore_box *box,
+ struct perf_event *leader, bool dogrp)
+{
+ struct perf_event *event;
+ int n, max_count;
+
+ max_count = box->pmu->type->num_counters;
+ if (box->pmu->type->fixed_ctl)
+ max_count++;
+
+ if (box->n_events >= max_count)
+ return -EINVAL;
+
+ n = box->n_events;
+ box->event_list[n] = leader;
+ n++;
+ if (!dogrp)
+ return n;
+
+ list_for_each_entry(event, &leader->sibling_list, group_entry) {
+ if (event->state <= PERF_EVENT_STATE_OFF)
+ continue;
+
+ if (n >= max_count)
+ return -EINVAL;
+
+ box->event_list[n] = event;
+ n++;
+ }
+ return n;
+}
+
+static struct event_constraint *
+uncore_event_constraint(struct intel_uncore_type *type,
+ struct perf_event *event)
+{
+ struct event_constraint *c;
+
+ if (event->hw.config == (u64)-1)
+ return &constraint_fixed;
+
+ if (type->constraints) {
+ for_each_event_constraint(c, type->constraints) {
+ if ((event->hw.config & c->cmask) == c->code)
+ return c;
+ }
+ }
+
+ return &type->unconstrainted;
+}
+
+static int uncore_assign_events(struct intel_uncore_box *box,
+ int assign[], int n)
+{
+ unsigned long used_mask[BITS_TO_LONGS(UNCORE_PMC_IDX_MAX)];
+ struct event_constraint *c, *constraints[UNCORE_PMC_IDX_MAX];
+ int i, ret, wmin, wmax;
+ struct hw_perf_event *hwc;
+
+ bitmap_zero(used_mask, UNCORE_PMC_IDX_MAX);
+
+ for (i = 0, wmin = UNCORE_PMC_IDX_MAX, wmax = 0; i < n; i++) {
+ c = uncore_event_constraint(box->pmu->type,
+ box->event_list[i]);
+ constraints[i] = c;
+ wmin = min(wmin, c->weight);
+ wmax = max(wmax, c->weight);
+ }
+
+ /* fastpath, try to reuse previous register */
+ for (i = 0; i < n; i++) {
+ hwc = &box->event_list[i]->hw;
+ c = constraints[i];
+
+ /* never assigned */
+ if (hwc->idx == -1)
+ break;
+
+ /* constraint still honored */
+ if (!test_bit(hwc->idx, c->idxmsk))
+ break;
+
+ /* not already used */
+ if (test_bit(hwc->idx, used_mask))
+ break;
+
+ __set_bit(hwc->idx, used_mask);
+ assign[i] = hwc->idx;
+ }
+ if (i == n)
+ return 0;
+
+ /* slow path */
+ ret = perf_assign_events(constraints, n, wmin, wmax, assign);
+ return ret ? -EINVAL : 0;
+}
+
+static void uncore_pmu_event_start(struct perf_event *event, int flags)
+{
+ struct intel_uncore_box *box = uncore_event_to_box(event);
+ int idx = event->hw.idx;
+
+ if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
+ return;
+
+ if (WARN_ON_ONCE(idx == -1 || idx >= UNCORE_PMC_IDX_MAX))
+ return;
+
+ event->hw.state = 0;
+ box->events[idx] = event;
+ box->n_active++;
+ __set_bit(idx, box->active_mask);
+
+ local64_set(&event->hw.prev_count, uncore_read_counter(box, event));
+ uncore_enable_event(box, event);
+
+ if (box->n_active == 1) {
+ uncore_enable_box(box);
+ uncore_pmu_start_hrtimer(box);
+ }
+}
+
+static void uncore_pmu_event_stop(struct perf_event *event, int flags)
+{
+ struct intel_uncore_box *box = uncore_event_to_box(event);
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (__test_and_clear_bit(hwc->idx, box->active_mask)) {
+ uncore_disable_event(box, event);
+ box->n_active--;
+ box->events[hwc->idx] = NULL;
+ WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
+ hwc->state |= PERF_HES_STOPPED;
+
+ if (box->n_active == 0) {
+ uncore_disable_box(box);
+ uncore_pmu_cancel_hrtimer(box);
+ }
+ }
+
+ if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+ /*
+ * Drain the remaining delta count out of an event
+ * that we are disabling:
+ */
+ uncore_perf_event_update(box, event);
+ hwc->state |= PERF_HES_UPTODATE;
+ }
+}
+
+static int uncore_pmu_event_add(struct perf_event *event, int flags)
+{
+ struct intel_uncore_box *box = uncore_event_to_box(event);
+ struct hw_perf_event *hwc = &event->hw;
+ int assign[UNCORE_PMC_IDX_MAX];
+ int i, n, ret;
+
+ if (!box)
+ return -ENODEV;
+
+ ret = n = uncore_collect_events(box, event, false);
+ if (ret < 0)
+ return ret;
+
+ hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+ if (!(flags & PERF_EF_START))
+ hwc->state |= PERF_HES_ARCH;
+
+ ret = uncore_assign_events(box, assign, n);
+ if (ret)
+ return ret;
+
+ /* save events moving to new counters */
+ for (i = 0; i < box->n_events; i++) {
+ event = box->event_list[i];
+ hwc = &event->hw;
+
+ if (hwc->idx == assign[i] &&
+ hwc->last_tag == box->tags[assign[i]])
+ continue;
+ /*
+ * Ensure we don't accidentally enable a stopped
+ * counter simply because we rescheduled.
+ */
+ if (hwc->state & PERF_HES_STOPPED)
+ hwc->state |= PERF_HES_ARCH;
+
+ uncore_pmu_event_stop(event, PERF_EF_UPDATE);
+ }
+
+ /* reprogram moved events into new counters */
+ for (i = 0; i < n; i++) {
+ event = box->event_list[i];
+ hwc = &event->hw;
+
+ if (hwc->idx != assign[i] ||
+ hwc->last_tag != box->tags[assign[i]])
+ uncore_assign_hw_event(box, event, assign[i]);
+ else if (i < box->n_events)
+ continue;
+
+ if (hwc->state & PERF_HES_ARCH)
+ continue;
+
+ uncore_pmu_event_start(event, 0);
+ }
+ box->n_events = n;
+
+ return 0;
+}
+
+static void uncore_pmu_event_del(struct perf_event *event, int flags)
+{
+ struct intel_uncore_box *box = uncore_event_to_box(event);
+ int i;
+
+ uncore_pmu_event_stop(event, PERF_EF_UPDATE);
+
+ for (i = 0; i < box->n_events; i++) {
+ if (event == box->event_list[i]) {
+ while (++i < box->n_events)
+ box->event_list[i - 1] = box->event_list[i];
+
+ --box->n_events;
+ break;
+ }
+ }
+
+ event->hw.idx = -1;
+ event->hw.last_tag = ~0ULL;
+}
+
+static void uncore_pmu_event_read(struct perf_event *event)
+{
+ struct intel_uncore_box *box = uncore_event_to_box(event);
+ uncore_perf_event_update(box, event);
+}
+
+/*
+ * validation ensures the group can be loaded onto the
+ * PMU if it was the only group available.
+ */
+static int uncore_validate_group(struct intel_uncore_pmu *pmu,
+ struct perf_event *event)
+{
+ struct perf_event *leader = event->group_leader;
+ struct intel_uncore_box *fake_box;
+ int assign[UNCORE_PMC_IDX_MAX];
+ int ret = -EINVAL, n;
+
+ fake_box = uncore_alloc_box(smp_processor_id());
+ if (!fake_box)
+ return -ENOMEM;
+
+ fake_box->pmu = pmu;
+ /*
+ * the event is not yet connected with its
+ * siblings therefore we must first collect
+ * existing siblings, then add the new event
+ * before we can simulate the scheduling
+ */
+ n = uncore_collect_events(fake_box, leader, true);
+ if (n < 0)
+ goto out;
+
+ fake_box->n_events = n;
+ n = uncore_collect_events(fake_box, event, false);
+ if (n < 0)
+ goto out;
+
+ fake_box->n_events = n;
+
+ ret = uncore_assign_events(fake_box, assign, n);
+out:
+ kfree(fake_box);
+ return ret;
+}
+
+int uncore_pmu_event_init(struct perf_event *event)
+{
+ struct intel_uncore_pmu *pmu;
+ struct intel_uncore_box *box;
+ struct hw_perf_event *hwc = &event->hw;
+ int ret;
+
+ if (event->attr.type != event->pmu->type)
+ return -ENOENT;
+
+ pmu = uncore_event_to_pmu(event);
+ /* no device found for this pmu */
+ if (pmu->func_id < 0)
+ return -ENOENT;
+
+ /*
+ * The uncore PMU measures at all privilege levels all the time,
+ * so it doesn't make sense to specify any exclude bits.
+ */
+ if (event->attr.exclude_user || event->attr.exclude_kernel ||
+ event->attr.exclude_hv || event->attr.exclude_idle)
+ return -EINVAL;
+
+ /* Sampling not supported yet */
+ if (hwc->sample_period)
+ return -EINVAL;
+
+ /*
+ * Place all uncore events for a particular physical package
+ * onto a single cpu
+ */
+ if (event->cpu < 0)
+ return -EINVAL;
+ box = uncore_pmu_to_box(pmu, event->cpu);
+ if (!box || box->cpu < 0)
+ return -EINVAL;
+ event->cpu = box->cpu;
+
+ if (event->attr.config == UNCORE_FIXED_EVENT) {
+ /* no fixed counter */
+ if (!pmu->type->fixed_ctl)
+ return -EINVAL;
+ /*
+ * if there is only one fixed counter, only the first pmu
+ * can access the fixed counter
+ */
+ if (pmu->type->single_fixed && pmu->pmu_idx > 0)
+ return -EINVAL;
+ hwc->config = (u64)-1;
+ } else {
+ hwc->config = event->attr.config & pmu->type->event_mask;
+ }
+
+ event->hw.idx = -1;
+ event->hw.last_tag = ~0ULL;
+
+ if (event->group_leader != event)
+ ret = uncore_validate_group(pmu, event);
+ else
+ ret = 0;
+
+ return ret;
+}
+
+static int __init uncore_pmu_register(struct intel_uncore_pmu *pmu)
+{
+ int ret;
+
+ pmu->pmu = (struct pmu) {
+ .attr_groups = pmu->type->attr_groups,
+ .task_ctx_nr = perf_invalid_context,
+ .event_init = uncore_pmu_event_init,
+ .add = uncore_pmu_event_add,
+ .del = uncore_pmu_event_del,
+ .start = uncore_pmu_event_start,
+ .stop = uncore_pmu_event_stop,
+ .read = uncore_pmu_event_read,
+ };
+
+ if (pmu->type->num_boxes == 1) {
+ if (strlen(pmu->type->name) > 0)
+ sprintf(pmu->name, "Uncore_%s", pmu->type->name);
+ else
+ sprintf(pmu->name, "Uncore");
+ } else {
+ sprintf(pmu->name, "Uncore_%s_%d", pmu->type->name,
+ pmu->pmu_idx);
+ }
+
+ ret = perf_pmu_register(&pmu->pmu, pmu->name, -1);
+ return ret;
+}
+
+static void __init uncore_type_exit(struct intel_uncore_type *type)
+{
+ int i;
+
+ for (i = 0; i < type->num_boxes; i++)
+ free_percpu(type->pmus[i].box);
+ kfree(type->pmus);
+ type->pmus = NULL;
+ kfree(type->attr_groups[1]);
+ type->attr_groups[1] = NULL;
+}
+
+static int __init uncore_type_init(struct intel_uncore_type *type)
+{
+ struct intel_uncore_pmu *pmus;
+ struct attribute_group *events_group;
+ struct attribute **attrs;
+ int i, j;
+
+ pmus = kzalloc(sizeof(*pmus) * type->num_boxes, GFP_KERNEL);
+ if (!pmus)
+ return -ENOMEM;
+
+ type->unconstrainted = (struct event_constraint)
+ __EVENT_CONSTRAINT(0, (1ULL << type->num_counters) - 1,
+ 0, type->num_counters, 0);
+
+ for (i = 0; i < type->num_boxes; i++) {
+ pmus[i].func_id = -1;
+ pmus[i].pmu_idx = i;
+ pmus[i].type = type;
+ pmus[i].box = alloc_percpu(struct intel_uncore_box *);
+ if (!pmus[i].box)
+ goto fail;
+ }
+
+ if (type->event_descs) {
+ for (i = 0; type->event_descs[i].attr.attr.name; i++);
+
+ events_group = kzalloc(sizeof(struct attribute *) * (i + 1) +
+ sizeof(*events_group), GFP_KERNEL);
+ if (!events_group)
+ goto fail;
+
+ attrs = (struct attribute **)(events_group + 1);
+ events_group->name = "events";
+ events_group->attrs = attrs;
+
+ for (j = 0; j < i; j++)
+ attrs[j] = &type->event_descs[j].attr.attr;
+
+ type->attr_groups[1] = events_group;
+ }
+
+ type->pmus = pmus;
+ return 0;
+fail:
+ uncore_type_exit(type);
+ return -ENOMEM;
+}
+
+static int __init uncore_types_init(struct intel_uncore_type **types)
+{
+ int i, ret;
+
+ for (i = 0; types[i]; i++) {
+ ret = uncore_type_init(types[i]);
+ if (ret)
+ goto fail;
+ }
+ return 0;
+fail:
+ while (--i >= 0)
+ uncore_type_exit(types[i]);
+ return ret;
+}
+
+static void __cpuinit uncore_cpu_dying(int cpu)
+{
+ struct intel_uncore_type *type;
+ struct intel_uncore_pmu *pmu;
+ struct intel_uncore_box *box;
+ int i, j;
+
+ for (i = 0; msr_uncores[i]; i++) {
+ type = msr_uncores[i];
+ for (j = 0; j < type->num_boxes; j++) {
+ pmu = &type->pmus[j];
+ box = *per_cpu_ptr(pmu->box, cpu);
+ *per_cpu_ptr(pmu->box, cpu) = NULL;
+ if (box && atomic_dec_and_test(&box->refcnt))
+ kfree(box);
+ }
+ }
+}
+
+static int __cpuinit uncore_cpu_starting(int cpu)
+{
+ struct intel_uncore_type *type;
+ struct intel_uncore_pmu *pmu;
+ struct intel_uncore_box *box, *exist;
+ int i, j, k, phys_id;
+
+ phys_id = topology_physical_package_id(cpu);
+
+ for (i = 0; msr_uncores[i]; i++) {
+ type = msr_uncores[i];
+ for (j = 0; j < type->num_boxes; j++) {
+ pmu = &type->pmus[j];
+ box = *per_cpu_ptr(pmu->box, cpu);
+ /* called by uncore_cpu_init? */
+ if (box && box->phys_id >= 0) {
+ uncore_box_init(box);
+ continue;
+ }
+
+ for_each_online_cpu(k) {
+ exist = *per_cpu_ptr(pmu->box, k);
+ if (exist && exist->phys_id == phys_id) {
+ atomic_inc(&exist->refcnt);
+ *per_cpu_ptr(pmu->box, cpu) = exist;
+ kfree(box);
+ box = NULL;
+ break;
+ }
+ }
+
+ if (box) {
+ box->phys_id = phys_id;
+ uncore_box_init(box);
+ }
+ }
+ }
+ return 0;
+}
+
+static int __cpuinit uncore_cpu_prepare(int cpu, int phys_id)
+{
+ struct intel_uncore_type *type;
+ struct intel_uncore_pmu *pmu;
+ struct intel_uncore_box *box;
+ int i, j;
+
+ for (i = 0; msr_uncores[i]; i++) {
+ type = msr_uncores[i];
+ for (j = 0; j < type->num_boxes; j++) {
+ pmu = &type->pmus[j];
+ if (pmu->func_id < 0)
+ pmu->func_id = j;
+
+ box = uncore_alloc_box(cpu);
+ if (!box)
+ return -ENOMEM;
+
+ box->pmu = pmu;
+ box->phys_id = phys_id;
+ *per_cpu_ptr(pmu->box, cpu) = box;
+ }
+ }
+ return 0;
+}
+
+static void __cpuinit uncore_change_context(struct intel_uncore_type **uncores,
+ int old_cpu, int new_cpu)
+{
+ struct intel_uncore_type *type;
+ struct intel_uncore_pmu *pmu;
+ struct intel_uncore_box *box;
+ int i, j;
+
+ for (i = 0; uncores[i]; i++) {
+ type = uncores[i];
+ for (j = 0; j < type->num_boxes; j++) {
+ pmu = &type->pmus[j];
+ if (old_cpu < 0)
+ box = uncore_pmu_to_box(pmu, new_cpu);
+ else
+ box = uncore_pmu_to_box(pmu, old_cpu);
+ if (!box)
+ continue;
+
+ if (old_cpu < 0) {
+ WARN_ON_ONCE(box->cpu != -1);
+ box->cpu = new_cpu;
+ continue;
+ }
+
+ WARN_ON_ONCE(box->cpu != old_cpu);
+ if (new_cpu >= 0) {
+ uncore_pmu_cancel_hrtimer(box);
+ perf_pmu_migrate_context(&pmu->pmu,
+ old_cpu, new_cpu);
+ box->cpu = new_cpu;
+ } else {
+ box->cpu = -1;
+ }
+ }
+ }
+}
+
+static void __cpuinit uncore_event_exit_cpu(int cpu)
+{
+ int i, phys_id, target;
+
+ /* if exiting cpu is used for collecting uncore events */
+ if (!cpumask_test_and_clear_cpu(cpu, &uncore_cpu_mask))
+ return;
+
+ /* find a new cpu to collect uncore events */
+ phys_id = topology_physical_package_id(cpu);
+ target = -1;
+ for_each_online_cpu(i) {
+ if (i == cpu)
+ continue;
+ if (phys_id == topology_physical_package_id(i)) {
+ target = i;
+ break;
+ }
+ }
+
+ /* migrate uncore events to the new cpu */
+ if (target >= 0)
+ cpumask_set_cpu(target, &uncore_cpu_mask);
+
+ uncore_change_context(msr_uncores, cpu, target);
+}
+
+static void __cpuinit uncore_event_init_cpu(int cpu)
+{
+ int i, phys_id;
+
+ phys_id = topology_physical_package_id(cpu);
+ for_each_cpu(i, &uncore_cpu_mask) {
+ if (phys_id == topology_physical_package_id(i))
+ return;
+ }
+
+ cpumask_set_cpu(cpu, &uncore_cpu_mask);
+
+ uncore_change_context(msr_uncores, -1, cpu);
+}
+
+static int __cpuinit uncore_cpu_notifier(struct notifier_block *self,
+ unsigned long action, void *hcpu)
+{
+ unsigned int cpu = (long)hcpu;
+
+ /* allocate/free data structure for uncore box */
+ switch (action & ~CPU_TASKS_FROZEN) {
+ case CPU_UP_PREPARE:
+ uncore_cpu_prepare(cpu, -1);
+ break;
+ case CPU_STARTING:
+ uncore_cpu_starting(cpu);
+ break;
+ case CPU_UP_CANCELED:
+ case CPU_DYING:
+ uncore_cpu_dying(cpu);
+ break;
+ default:
+ break;
+ }
+
+ /* select the cpu that collects uncore events */
+ switch (action & ~CPU_TASKS_FROZEN) {
+ case CPU_DOWN_FAILED:
+ case CPU_STARTING:
+ uncore_event_init_cpu(cpu);
+ break;
+ case CPU_DOWN_PREPARE:
+ uncore_event_exit_cpu(cpu);
+ break;
+ default:
+ break;
+ }
+
+ return NOTIFY_OK;
+}
+
+static struct notifier_block uncore_cpu_nb __cpuinitdata = {
+ .notifier_call = uncore_cpu_notifier,
+ /*
+ * to migrate uncore events, our notifier should be executed
+ * before perf core's notifier.
+ */
+ .priority = CPU_PRI_PERF + 1,
+};
+
+static void __init uncore_cpu_setup(void *dummy)
+{
+ uncore_cpu_starting(smp_processor_id());
+}
+
+static int __init uncore_cpu_init(void)
+{
+ int ret, cpu;
+
+ switch (boot_cpu_data.x86_model) {
+ default:
+ return 0;
+ }
+
+ ret = uncore_types_init(msr_uncores);
+ if (ret)
+ return ret;
+
+ get_online_cpus();
+
+ for_each_online_cpu(cpu) {
+ int i, phys_id = topology_physical_package_id(cpu);
+
+ for_each_cpu(i, &uncore_cpu_mask) {
+ if (phys_id == topology_physical_package_id(i)) {
+ phys_id = -1;
+ break;
+ }
+ }
+ if (phys_id < 0)
+ continue;
+
+ uncore_cpu_prepare(cpu, phys_id);
+ uncore_event_init_cpu(cpu);
+ }
+ on_each_cpu(uncore_cpu_setup, NULL, 1);
+
+ register_cpu_notifier(&uncore_cpu_nb);
+
+ put_online_cpus();
+
+ return 0;
+}
+
+static int __init uncore_pmus_register(void)
+{
+ struct intel_uncore_pmu *pmu;
+ struct intel_uncore_type *type;
+ int i, j;
+
+ for (i = 0; msr_uncores[i]; i++) {
+ type = msr_uncores[i];
+ for (j = 0; j < type->num_boxes; j++) {
+ pmu = &type->pmus[j];
+ uncore_pmu_register(pmu);
+ }
+ }
+
+ return 0;
+}
+
+static int __init intel_uncore_init(void)
+{
+ int ret;
+
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+ return -ENODEV;
+
+ ret = uncore_cpu_init();
+ if (ret)
+ goto fail;
+
+ uncore_pmus_register();
+ return 0;
+fail:
+ return ret;
+}
+device_initcall(intel_uncore_init);
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
new file mode 100644
index 0000000..49a6bfb
--- /dev/null
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -0,0 +1,204 @@
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/perf_event.h>
+#include "perf_event.h"
+
+#define UNCORE_PMU_NAME_LEN 32
+#define UNCORE_BOX_HASH_SIZE 8
+
+#define UNCORE_PMU_HRTIMER_INTERVAL (60 * NSEC_PER_SEC)
+
+#define UNCORE_FIXED_EVENT 0xffff
+#define UNCORE_PMC_IDX_MAX_GENERIC 8
+#define UNCORE_PMC_IDX_FIXED UNCORE_PMC_IDX_MAX_GENERIC
+#define UNCORE_PMC_IDX_MAX (UNCORE_PMC_IDX_FIXED + 1)
+
+#define UNCORE_EVENT_CONSTRAINT(c, n) EVENT_CONSTRAINT(c, n, 0xff)
+
+struct intel_uncore_ops;
+struct intel_uncore_pmu;
+struct intel_uncore_box;
+struct uncore_event_desc;
+
+struct intel_uncore_type {
+ const char *name;
+ int num_counters;
+ int num_boxes;
+ int perf_ctr_bits;
+ int fixed_ctr_bits;
+ int single_fixed;
+ unsigned perf_ctr;
+ unsigned event_ctl;
+ unsigned event_mask;
+ unsigned fixed_ctr;
+ unsigned fixed_ctl;
+ unsigned box_ctl;
+ unsigned msr_offset;
+ struct event_constraint unconstrainted;
+ struct event_constraint *constraints;
+ struct intel_uncore_pmu *pmus;
+ struct intel_uncore_ops *ops;
+ struct uncore_event_desc *event_descs;
+ const struct attribute_group *attr_groups[3];
+};
+
+#define format_group attr_groups[0]
+
+struct intel_uncore_ops {
+ void (*init_box)(struct intel_uncore_box *);
+ void (*disable_box)(struct intel_uncore_box *);
+ void (*enable_box)(struct intel_uncore_box *);
+ void (*disable_event)(struct intel_uncore_box *, struct perf_event *);
+ void (*enable_event)(struct intel_uncore_box *, struct perf_event *);
+ u64 (*read_counter)(struct intel_uncore_box *, struct perf_event *);
+};
+
+struct intel_uncore_pmu {
+ struct pmu pmu;
+ char name[UNCORE_PMU_NAME_LEN];
+ int pmu_idx;
+ int func_id;
+ struct intel_uncore_type *type;
+ struct intel_uncore_box ** __percpu box;
+};
+
+struct intel_uncore_box {
+ int phys_id;
+ int n_active; /* number of active events */
+ int n_events;
+ int cpu; /* cpu to collect events */
+ unsigned long flags;
+ atomic_t refcnt;
+ struct perf_event *events[UNCORE_PMC_IDX_MAX];
+ struct perf_event *event_list[UNCORE_PMC_IDX_MAX];
+ unsigned long active_mask[BITS_TO_LONGS(UNCORE_PMC_IDX_MAX)];
+ u64 tags[UNCORE_PMC_IDX_MAX];
+ struct intel_uncore_pmu *pmu;
+ struct hrtimer hrtimer;
+ struct list_head list;
+};
+
+#define UNCORE_BOX_FLAG_INITIATED 0
+
+struct uncore_event_desc {
+ struct kobj_attribute attr;
+ const char *config;
+};
+
+#define INTEL_UNCORE_EVENT_DESC(_name, _config) \
+{ \
+ .attr = __ATTR(_name, 0444, uncore_event_show, NULL), \
+ .config = _config, \
+}
+
+#define DEFINE_UNCORE_FORMAT_ATTR(_var, _name, _format) \
+static ssize_t __uncore_##_var##_show(struct kobject *kobj, \
+ struct kobj_attribute *attr, \
+ char *page) \
+{ \
+ BUILD_BUG_ON(sizeof(_format) >= PAGE_SIZE); \
+ return sprintf(page, _format "\n"); \
+} \
+static struct kobj_attribute format_attr_##_var = \
+ __ATTR(_name, 0444, __uncore_##_var##_show, NULL)
+
+
+static ssize_t uncore_event_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ struct uncore_event_desc *event =
+ container_of(attr, struct uncore_event_desc, attr);
+ return sprintf(buf, "%s", event->config);
+}
+
+static inline
+unsigned uncore_msr_box_ctl(struct intel_uncore_box *box)
+{
+ if (!box->pmu->type->box_ctl)
+ return 0;
+ return box->pmu->type->box_ctl +
+ box->pmu->type->msr_offset * box->pmu->pmu_idx;
+}
+
+static inline
+unsigned uncore_msr_fixed_ctl(struct intel_uncore_box *box)
+{
+ if (!box->pmu->type->fixed_ctl)
+ return 0;
+ return box->pmu->type->fixed_ctl +
+ box->pmu->type->msr_offset * box->pmu->pmu_idx;
+}
+
+static inline
+unsigned uncore_msr_fixed_ctr(struct intel_uncore_box *box)
+{
+ return box->pmu->type->fixed_ctr +
+ box->pmu->type->msr_offset * box->pmu->pmu_idx;
+}
+
+static inline
+unsigned uncore_msr_event_ctl(struct intel_uncore_box *box, int idx)
+{
+ return idx + box->pmu->type->event_ctl +
+ box->pmu->type->msr_offset * box->pmu->pmu_idx;
+}
+
+static inline
+unsigned uncore_msr_perf_ctr(struct intel_uncore_box *box, int idx)
+{
+ return idx + box->pmu->type->perf_ctr +
+ box->pmu->type->msr_offset * box->pmu->pmu_idx;
+}
+
+static inline int uncore_perf_ctr_bits(struct intel_uncore_box *box)
+{
+ return box->pmu->type->perf_ctr_bits;
+}
+
+static inline int uncore_fixed_ctr_bits(struct intel_uncore_box *box)
+{
+ return box->pmu->type->fixed_ctr_bits;
+}
+
+static inline int uncore_num_counters(struct intel_uncore_box *box)
+{
+ return box->pmu->type->num_counters;
+}
+
+static inline void uncore_disable_box(struct intel_uncore_box *box)
+{
+ if (box->pmu->type->ops->disable_box)
+ box->pmu->type->ops->disable_box(box);
+}
+
+static inline void uncore_enable_box(struct intel_uncore_box *box)
+{
+ if (box->pmu->type->ops->enable_box)
+ box->pmu->type->ops->enable_box(box);
+}
+
+static inline void uncore_disable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ box->pmu->type->ops->disable_event(box, event);
+}
+
+static inline void uncore_enable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ box->pmu->type->ops->enable_event(box, event);
+}
+
+static inline u64 uncore_read_counter(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ return box->pmu->type->ops->read_counter(box, event);
+}
+
+static inline void uncore_box_init(struct intel_uncore_box *box)
+{
+ if (!test_and_set_bit(UNCORE_BOX_FLAG_INITIATED, &box->flags)) {
+ if (box->pmu->type->ops->init_box)
+ box->pmu->type->ops->init_box(box);
+ }
+}
--
1.7.10.2

2012-06-12 10:17:36

by Stephane Eranian

Subject: Re: [PATCH V5 03/13] perf: Allow pmu to choose cpu on which to install event

On Tue, Jun 12, 2012 at 7:37 AM, Yan, Zheng <[email protected]> wrote:
> From: "Yan, Zheng" <[email protected]>
>
> Allow the pmu->event_init callback to change event->cpu, so pmu can
> choose cpu on which to install event.
>
So now, the user can say perf record -e xxxx -C 1 -a and then get nothing
out of perf report -C1 because under the cover the kernel has swapped
it for another CPU?

> Signed-off-by: Zheng Yan <[email protected]>
> ---
>  kernel/events/core.c |    8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index d71a2d6..2c05027 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6302,7 +6302,7 @@ SYSCALL_DEFINE5(perf_event_open,
>        /*
>         * Get the target context (task or percpu):
>         */
> -       ctx = find_get_context(pmu, task, cpu);
> +       ctx = find_get_context(pmu, task, event->cpu);
>        if (IS_ERR(ctx)) {
>                err = PTR_ERR(ctx);
>                goto err_alloc;
> @@ -6375,16 +6375,16 @@ SYSCALL_DEFINE5(perf_event_open,
>        mutex_lock(&ctx->mutex);
>
>        if (move_group) {
> -               perf_install_in_context(ctx, group_leader, cpu);
> +               perf_install_in_context(ctx, group_leader, event->cpu);
>                get_ctx(ctx);
>                list_for_each_entry(sibling, &group_leader->sibling_list,
>                                    group_entry) {
> -                       perf_install_in_context(ctx, sibling, cpu);
> +                       perf_install_in_context(ctx, sibling, event->cpu);
>                        get_ctx(ctx);
>                }
>        }
>
> -       perf_install_in_context(ctx, event, cpu);
> +       perf_install_in_context(ctx, event, event->cpu);
>        ++ctx->generation;
>        perf_unpin_context(ctx);
>        mutex_unlock(&ctx->mutex);
> --
> 1.7.10.2
>

2012-06-12 15:38:53

by Stephane Eranian

Subject: Re: [PATCH V5 0/13] perf: Intel uncore pmu counting support

On Tue, Jun 12, 2012 at 7:37 AM, Yan, Zheng <[email protected]> wrote:
> Hi, all
>
> Here is the V5 patches to add uncore counting support for Nehalem,
> Sandy Bridge and Sandy Bridge-EP, applied on top of current tip.
> The code is based on Lin Ming's old patches.
>
> For Nehalem and Sandy Bridge-EP, A few general events are exported
> under sysfs directory:
>  /sys/bus/event_source/devices/${uncore_dev}/events/
>
On NHM, I tried:
perf stat -a -e Uncore/CLOCKTICKS/
invalid or unsupported event: 'Uncore/CLOCKTICKS/'

I started tracking this down but gave up because it's coming
once again from the too complex parser. You are looking for
a format match (inv, edge, cmask) with the word CLOCKTICKS
instead of config.

Please fix this.

Also I don't think using upper case for the PMU name is a good idea.
Just call it uncore.


> Each file in the events directory defines an event. The content is
> a string such as:
>  config=1,config1=2
>
> You can use 'perf stat' to access to the uncore pmu. For example:
>  perf stat -a -C 0 -e 'Uncore_iMC_0/CAS_COUNT_RD/' sleep 1
>  perf stat -a -C 0 -e 'Uncore_iMC_0/event=CAS_COUNT_RD/' sleep 1
>
> Any comment is appreciated.
> Thank you
> ---
> Changes since v1:
>  - Modify perf tool to parse events from sysfs
>  - A few minor code cleanup
>
> Changes since v2:
>  - Place all events for a particular socket onto a single cpu
>  - Make the events parser in perf tool reentrantable
>  - A few code cleanup
>
> Changes since v3:
>  - Use per cpu pointer to track uncore box
>  - Rework the cpu hotplug code because topology_physical_package_id()
>   return wrong result when the cpu is offline
>  - Rework the event alias code, event terms are stored in the alias
>   structure instead events string
>
> Changes since v4:
>  - Include Jiri's uncore related changes patch set
>  - Add pmu/event=alias/ syntax support
>

2012-06-13 01:41:35

by Yan, Zheng

Subject: Re: [PATCH V5 0/13] perf: Intel uncore pmu counting support

On 06/12/2012 11:38 PM, Stephane Eranian wrote:
> On Tue, Jun 12, 2012 at 7:37 AM, Yan, Zheng <[email protected]> wrote:
>> Hi, all
>>
>> Here is the V5 patches to add uncore counting support for Nehalem,
>> Sandy Bridge and Sandy Bridge-EP, applied on top of current tip.
>> The code is based on Lin Ming's old patches.
>>
>> For Nehalem and Sandy Bridge-EP, A few general events are exported
>> under sysfs directory:
>> /sys/bus/event_source/devices/${uncore_dev}/events/
>>
> On NHM, I tried:
> perf stat -a -e Uncore/CLOCKTICKS/
> invalid or unsupported event: 'Uncore/CLOCKTICKS/'

Strange. Did you recompile the perf tool? Are there files under the
directory /sys/bus/event_source/devices/Uncore/events?
>
> I started tracking this down but gave up because it's coming
> once again from the too complex parser. You are looking for
> a format match (inv, edge, cmask) with the word CLOCKTICKS
> instead of config.
>
> Please fix this.
>
> Also I don't think using upper case for the PMU name is a good idea.
> Just call it uncore.
>

Peter suggests keeping the uncore names as they're listed in the Intel
documentation. For Sandy Bridge-EP, the uncore names are something like
Cbo, iMC, QPI. I think Uncore_Cbo_0 looks better than uncore_Cbo_0.

Regards
Yan, Zheng
>
>> Each file in the events directory defines an event. The content is
>> a string such as:
>> config=1,config1=2
>>
>> You can use 'perf stat' to access to the uncore pmu. For example:
>> perf stat -a -C 0 -e 'Uncore_iMC_0/CAS_COUNT_RD/' sleep 1
>> perf stat -a -C 0 -e 'Uncore_iMC_0/event=CAS_COUNT_RD/' sleep 1
>>
>> Any comment is appreciated.
>> Thank you
>> ---
>> Changes since v1:
>> - Modify perf tool to parse events from sysfs
>> - A few minor code cleanup
>>
>> Changes since v2:
>> - Place all events for a particular socket onto a single cpu
>> - Make the events parser in perf tool reentrantable
>> - A few code cleanup
>>
>> Changes since v3:
>> - Use per cpu pointer to track uncore box
>> - Rework the cpu hotplug code because topology_physical_package_id()
>> return wrong result when the cpu is offline
>> - Rework the event alias code, event terms are stored in the alias
>> structure instead events string
>>
>> Changes since v4:
>> - Include Jiri's uncore related changes patch set
>> - Add pmu/event=alias/ syntax support
>>

2012-06-13 01:57:28

by Yan, Zheng

Subject: Re: [PATCH V5 03/13] perf: Allow pmu to choose cpu on which to install event

On 06/12/2012 06:17 PM, Stephane Eranian wrote:
> On Tue, Jun 12, 2012 at 7:37 AM, Yan, Zheng <[email protected]> wrote:
>> From: "Yan, Zheng" <[email protected]>
>>
>> Allow the pmu->event_init callback to change event->cpu, so pmu can
>> choose cpu on which to install event.
>>
> So now, the user can say perf record -e xxxx -C 1 -a and then get nothing
> out of perf report -C1 because under the cover the kernel has swapped
> it for another CPU?

This change is for the uncore PMU, which does not support 'perf report'.

Regards
Yan, Zheng
>
>> Signed-off-by: Zheng Yan <[email protected]>
>> ---
>> kernel/events/core.c | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index d71a2d6..2c05027 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -6302,7 +6302,7 @@ SYSCALL_DEFINE5(perf_event_open,
>> /*
>> * Get the target context (task or percpu):
>> */
>> - ctx = find_get_context(pmu, task, cpu);
>> + ctx = find_get_context(pmu, task, event->cpu);
>> if (IS_ERR(ctx)) {
>> err = PTR_ERR(ctx);
>> goto err_alloc;
>> @@ -6375,16 +6375,16 @@ SYSCALL_DEFINE5(perf_event_open,
>> mutex_lock(&ctx->mutex);
>>
>> if (move_group) {
>> - perf_install_in_context(ctx, group_leader, cpu);
>> + perf_install_in_context(ctx, group_leader, event->cpu);
>> get_ctx(ctx);
>> list_for_each_entry(sibling, &group_leader->sibling_list,
>> group_entry) {
>> - perf_install_in_context(ctx, sibling, cpu);
>> + perf_install_in_context(ctx, sibling, event->cpu);
>> get_ctx(ctx);
>> }
>> }
>>
>> - perf_install_in_context(ctx, event, cpu);
>> + perf_install_in_context(ctx, event, event->cpu);
>> ++ctx->generation;
>> perf_unpin_context(ctx);
>> mutex_unlock(&ctx->mutex);
>> --
>> 1.7.10.2
>>

2012-06-13 03:31:51

by Andi Kleen

Subject: Re: [PATCH V5 0/13] perf: Intel uncore pmu counting support

"Yan, Zheng" <[email protected]> writes:

> Peter suggests keeping the uncore names as they're listed in the intel
> doc. For Sandy Bridge-EP, uncore names are something like: Cbo, iMC, QPI.
> I think Uncore_Cbo_0 appears better than uncore_Cbo_0

How about a case insensitive match for the sysfs directories? That can
be implemented in user land with nftw(). Since sysfs is all virtual
it should not be too expensive, as long as you only walk the pmu
parts.

I think that would be most user friendly.
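
A minimal sketch of such a lookup, under some assumptions: since the
devices directory is flat, it uses a plain readdir() scan rather than
nftw(), and it takes the directory as a parameter (in practice this would
be /sys/bus/event_source/devices). find_pmu_nocase() is a hypothetical
helper name, not something the perf tool provides.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>
#include <strings.h>

/*
 * Hypothetical helper: look for an entry in 'base' whose name equals
 * 'wanted' when case is ignored. On success, copy the on-disk spelling
 * into 'out' and return 0. Return -1 if nothing matches, the name does
 * not fit in 'out', or the directory cannot be opened.
 */
int find_pmu_nocase(const char *base, const char *wanted,
		    char *out, size_t outlen)
{
	struct dirent *ent;
	DIR *dir;
	int ret = -1;

	assert(base && wanted && out && outlen > 0);

	dir = opendir(base);
	if (!dir)
		return -1;

	while ((ent = readdir(dir)) != NULL) {
		if (strcasecmp(ent->d_name, wanted) != 0)
			continue;
		if (strlen(ent->d_name) >= outlen)
			break;		/* name would not fit in 'out' */
		strcpy(out, ent->d_name);
		ret = 0;
		break;
	}

	closedir(dir);
	return ret;
}
```

With this, a user typing uncore_imc_0 would be resolved to whatever
spelling the kernel registered, for example Uncore_iMC_0.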

-Andi

--
[email protected] -- Speaking for myself only

2012-06-13 06:37:52

by Stephane Eranian

Subject: Re: [PATCH V5 0/13] perf: Intel uncore pmu counting support

On Wed, Jun 13, 2012 at 3:41 AM, Yan, Zheng <[email protected]> wrote:
> On 06/12/2012 11:38 PM, Stephane Eranian wrote:
>> On Tue, Jun 12, 2012 at 7:37 AM, Yan, Zheng <[email protected]> wrote:
>>> Hi, all
>>>
>>> Here is the V5 patches to add uncore counting support for Nehalem,
>>> Sandy Bridge and Sandy Bridge-EP, applied on top of current tip.
>>> The code is based on Lin Ming's old patches.
>>>
>>> For Nehalem and Sandy Bridge-EP, A few general events are exported
>>> under sysfs directory:
>>>  /sys/bus/event_source/devices/${uncore_dev}/events/
>>>
>> On NHM, I tried:
>>     perf stat -a -e Uncore/CLOCKTICKS/
>> invalid or unsupported event: 'Uncore/CLOCKTICKS/'
>
> Strange enough. Did you re-compile the perf tool? Are there
> files under directory /sys/bus/event_source/devices/Uncore/events
>>
Of course I did recompile. And of course there are files under events. Otherwise
I would not have gone that far in the analysis of the parsing problem.

>> I started tracking this down but gave up because it's coming
>> once again from the too complex parser. You are looking for
>> a format match (inv, edge, cmask) with the word CLOCKTICKS
>> instead of config.
>>
>> Please fix this.
>>
>> Also I don't think using upper case for the PMU name is a good idea.
>> Just call it uncore.
>>
>
> Peter suggests keeping the uncore names as they're listed in the intel
> doc. For Sandy Bridge-EP, uncore names are something like: Cbo, iMC, QPI.
> I think Uncore_Cbo_0 appears better than uncore_Cbo_0
>
Keeping the name is different from keeping the upper vs. lower case letters.
I think this needs to be case insensitive. What does it buy you to be case
sensitive? I don't think Intel is ever going to create two events which differ
only by lower vs. upper case. And I think it adds frustration when you type
the event name, because you also have to remember the letter case.

> Regards
> Yan, Zheng
>>
>>> Each file in the events directory defines an event. The content is
>>> a string such as:
>>>  config=1,config1=2
>>>
>>> You can use 'perf stat' to access to the uncore pmu. For example:
>>>  perf stat -a -C 0 -e 'Uncore_iMC_0/CAS_COUNT_RD/' sleep 1
>>>  perf stat -a -C 0 -e 'Uncore_iMC_0/event=CAS_COUNT_RD/' sleep 1
>>>
>>> Any comment is appreciated.
>>> Thank you
>>> ---
>>> Changes since v1:
>>>  - Modify perf tool to parse events from sysfs
>>>  - A few minor code cleanup
>>>
>>> Changes since v2:
>>>  - Place all events for a particular socket onto a single cpu
>>>  - Make the events parser in perf tool reentrantable
>>>  - A few code cleanup
>>>
>>> Changes since v3:
>>>  - Use per cpu pointer to track uncore box
>>>  - Rework the cpu hotplug code because topology_physical_package_id()
>>>   return wrong result when the cpu is offline
>>>  - Rework the event alias code, event terms are stored in the alias
>>>   structure instead events string
>>>
>>> Changes since v4:
>>>  - Include Jiri's uncore related changes patch set
>>>  - Add pmu/event=alias/ syntax support
>>>
>
>

2012-06-13 06:58:19

by Yan, Zheng

Subject: Re: [PATCH V5 12/13] perf, tool: Add pmu event alias support

I'm sorry, I previously sent the wrong patch. Here is the new patch.

---
From c7c762f2f8588a62ea3c8c1794c2577137a895b8 Mon Sep 17 00:00:00 2001
From: Jiri Olsa <[email protected]>
Date: Mon, 21 May 2012 09:36:52 +0200
Subject: [PATCH 12/13] perf, tool: Add pmu event alias support

Add support for specifying an alias term within the event description.

The definition of pmu event alias is located at:
${sysfs_mount}/bus/event_source/devices/${pmu}/events/

Each file in the 'events' directory defines an event alias. Its content
looks like:
config=1,config1=2

With a pmu event alias, an event can now be specified as:
uncore/CLOCKTICKS/ or uncore/event=CLOCKTICKS/

Signed-off-by: Zheng Yan <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/parse-events.c | 10 +++
tools/perf/util/parse-events.h | 2 +
tools/perf/util/pmu.c | 166 ++++++++++++++++++++++++++++++++++++++++
tools/perf/util/pmu.h | 11 +++-
4 files changed, 188 insertions(+), 1 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index d002170..3339424 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -701,6 +701,9 @@ int parse_events_add_pmu(struct list_head **list, int *idx,

memset(&attr, 0, sizeof(attr));

+ if (perf_pmu__check_alias(pmu, head_config))
+ return -EINVAL;
+
/*
* Configure hardcoded terms first, no need to check
* return value when called with fail == 0 ;)
@@ -1143,6 +1146,13 @@ int parse_events__term_str(struct parse_events__term **term,
config, str, 0);
}

+int parse_events__term_clone(struct parse_events__term **new,
+ struct parse_events__term *term)
+{
+ return new_term(new, term->type_val, term->type_term, term->config,
+ term->val.str, term->val.num);
+}
+
void parse_events__free_terms(struct list_head *terms)
{
struct parse_events__term *term, *h;
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 9896eda..a2c7168 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -78,6 +78,8 @@ int parse_events__term_num(struct parse_events__term **_term,
int type_term, char *config, long num);
int parse_events__term_str(struct parse_events__term **_term,
int type_term, char *config, char *str);
+int parse_events__term_clone(struct parse_events__term **new,
+ struct parse_events__term *term);
void parse_events__free_terms(struct list_head *terms);
int parse_events_modifier(struct list_head *list, char *str);
int parse_events_add_tracepoint(struct list_head **list, int *idx,
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index a119a53..336f790 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -80,6 +80,114 @@ static int pmu_format(char *name, struct list_head *format)
return 0;
}

+static int perf_pmu__new_alias(struct list_head *list, char *name, FILE *file)
+{
+ struct perf_pmu__alias *alias;
+ char buf[256];
+ int ret;
+
+ /* leave room for the terminating NUL written below */
+ ret = fread(buf, 1, sizeof(buf) - 1, file);
+ if (ret == 0)
+ return -EINVAL;
+ buf[ret] = 0;
+
+ alias = malloc(sizeof(*alias));
+ if (!alias)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(&alias->terms);
+ ret = parse_events_terms(&alias->terms, buf);
+ if (ret) {
+ free(alias);
+ return ret;
+ }
+
+ alias->name = strdup(name);
+ list_add_tail(&alias->list, list);
+ return 0;
+}
+
+/*
+ * Process all the sysfs attributes located under the directory
+ * specified in 'dir' parameter.
+ */
+static int pmu_aliases_parse(char *dir, struct list_head *head)
+{
+ struct dirent *evt_ent;
+ DIR *event_dir;
+ int ret = 0;
+
+ event_dir = opendir(dir);
+ if (!event_dir)
+ return -EINVAL;
+
+ while (!ret && (evt_ent = readdir(event_dir))) {
+ char path[PATH_MAX];
+ char *name = evt_ent->d_name;
+ FILE *file;
+
+ if (!strcmp(name, ".") || !strcmp(name, ".."))
+ continue;
+
+ snprintf(path, PATH_MAX, "%s/%s", dir, name);
+
+ ret = -EINVAL;
+ file = fopen(path, "r");
+ if (!file)
+ break;
+ ret = perf_pmu__new_alias(head, name, file);
+ fclose(file);
+ }
+
+ closedir(event_dir);
+ return ret;
+}
+
+/*
+ * Reading the pmu event aliases definition, which should be located at:
+ * /sys/bus/event_source/devices/<dev>/events as sysfs group attributes.
+ */
+static int pmu_aliases(char *name, struct list_head *head)
+{
+ struct stat st;
+ char path[PATH_MAX];
+ const char *sysfs;
+
+ sysfs = sysfs_find_mountpoint();
+ if (!sysfs)
+ return -1;
+
+ snprintf(path, PATH_MAX,
+ "%s/bus/event_source/devices/%s/events", sysfs, name);
+
+ if (stat(path, &st) < 0)
+ return -1;
+
+ if (pmu_aliases_parse(path, head))
+ return -1;
+
+ return 0;
+}
+
+static int pmu_alias_terms(struct perf_pmu__alias *alias,
+ struct list_head *terms)
+{
+ struct parse_events__term *term, *clone;
+ LIST_HEAD(list);
+ int ret;
+
+ list_for_each_entry(term, &alias->terms, list) {
+ ret = parse_events__term_clone(&clone, term);
+ if (ret) {
+ parse_events__free_terms(&list);
+ return ret;
+ }
+ list_add_tail(&clone->list, &list);
+ }
+ list_splice(&list, terms);
+ return 0;
+}
+
/*
* Reading/parsing the default pmu type value, which should be
* located at:
@@ -118,6 +226,7 @@ static struct perf_pmu *pmu_lookup(char *name)
{
struct perf_pmu *pmu;
LIST_HEAD(format);
+ LIST_HEAD(aliases);
__u32 type;

/*
@@ -135,8 +244,12 @@ static struct perf_pmu *pmu_lookup(char *name)
if (!pmu)
return NULL;

+ pmu_aliases(name, &aliases);
+
INIT_LIST_HEAD(&pmu->format);
+ INIT_LIST_HEAD(&pmu->aliases);
list_splice(&format, &pmu->format);
+ list_splice(&aliases, &pmu->aliases);
pmu->name = strdup(name);
pmu->type = type;
return pmu;
@@ -279,6 +392,59 @@ int perf_pmu__config(struct perf_pmu *pmu, struct perf_event_attr *attr,
return pmu_config(&pmu->format, attr, head_terms);
}

+static struct perf_pmu__alias *pmu_find_alias(struct perf_pmu *pmu,
+ struct parse_events__term *term)
+{
+ struct perf_pmu__alias *alias;
+ char *name;
+
+ if (parse_events__is_hardcoded_term(term))
+ return NULL;
+
+ if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM) {
+ if (term->val.num != 1)
+ return NULL;
+ if (pmu_find_format(&pmu->format, term->config))
+ return NULL;
+ name = term->config;
+ } else if (term->type_val == PARSE_EVENTS__TERM_TYPE_STR) {
+ if (strcmp(term->config, "event"))
+ return NULL;
+ name = term->val.str;
+ } else {
+ return NULL;
+ }
+
+ list_for_each_entry(alias, &pmu->aliases, list) {
+ if (!strcmp(alias->name, name))
+ return alias;
+ }
+ return NULL;
+}
+
+/*
+ * Find alias in the terms list and replace it with the terms
+ * defined for the alias
+ */
+int perf_pmu__check_alias(struct perf_pmu *pmu, struct list_head *head_terms)
+{
+ struct parse_events__term *term, *h;
+ struct perf_pmu__alias *alias;
+ int ret;
+
+ list_for_each_entry_safe(term, h, head_terms, list) {
+ alias = pmu_find_alias(pmu, term);
+ if (!alias)
+ continue;
+ ret = pmu_alias_terms(alias, &term->list);
+ if (ret)
+ return ret;
+ list_del(&term->list);
+ free(term);
+ }
+ return 0;
+}
+
int perf_pmu__new_format(struct list_head *list, char *name,
int config, unsigned long *bits)
{
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 68c0db9..535f2c5 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -19,17 +19,26 @@ struct perf_pmu__format {
struct list_head list;
};

+struct perf_pmu__alias {
+ char *name;
+ struct list_head terms;
+ struct list_head list;
+};
+
struct perf_pmu {
char *name;
__u32 type;
struct list_head format;
+ struct list_head aliases;
struct list_head list;
};

struct perf_pmu *perf_pmu__find(char *name);
int perf_pmu__config(struct perf_pmu *pmu, struct perf_event_attr *attr,
struct list_head *head_terms);
-
+int perf_pmu__check_alias(struct perf_pmu *pmu, struct list_head *head_terms);
+struct list_head *perf_pmu__alias(struct perf_pmu *pmu,
+ struct list_head *head_terms);
int perf_pmu_wrap(void);
void perf_pmu_error(struct list_head *list, char *name, char const *msg);

--
1.7.6.5
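
The alias files handled by perf_pmu__new_alias() hold a short term string such as config=1,config1=2. A minimal user-space sketch of reading such a file into a NUL-terminated buffer is below. read_alias_terms() is a hypothetical helper, not part of the patch; it reads at most size - 1 bytes so that the terminator always fits, and it assumes the attribute may end with a newline.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): read an alias definition,
 * for example "config=1,config1=2", from an already opened file into
 * buf, which has room for size bytes.
 * At most size - 1 bytes are read so the terminating NUL always fits.
 * Returns the resulting string length, or -1 if nothing could be read.
 */
int read_alias_terms(FILE *file, char *buf, size_t size)
{
	size_t len;

	if (size < 2)
		return -1;

	len = fread(buf, 1, size - 1, file);
	if (len == 0)
		return -1;
	buf[len] = '\0';

	/* Drop a trailing newline, if the attribute ends with one. */
	while (len > 0 && (buf[len - 1] == '\n' || buf[len - 1] == '\r'))
		buf[--len] = '\0';

	return (int)len;
}
```

With a 256-byte buffer, a file containing "config=1,config1=2\n" yields the 18-character string without the newline; with an 8-byte buffer only the first 7 bytes are kept.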

2012-06-13 07:38:53

by Ingo Molnar

Subject: Re: [PATCH V5 0/13] perf: Intel uncore pmu counting support


* Stephane Eranian <[email protected]> wrote:

> > Peter suggests keeping the uncore names as they're listed in
> > the intel doc. For Sandy Bridge-EP, uncore names are
> > something like: Cbo, iMC, QPI. I think Uncore_Cbo_0 appears
> > better than uncore_Cbo_0
>
> Keeping the name is different from keeping the upper vs. lower
> case letters. I think this needs to be case insensitive. What
> does it buy you to be case sensitive? Don't think Intel is
> ever going to create two events which differ only by the lower
> vs. upper case. And, I think that's extra frustration when you
> type the event name because you have to also remember the
> letter case.

Yeah - the right approach is to make it all lowercase in sysfs -
then user-space can tolower() the string provided by the user.

There should be no case sensitivity in event specifications
anywhere.

Thanks,

Ingo
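
A minimal sketch of the user-space half of this approach, assuming sysfs exports lowercase names only. The helper names and the pmu names used in the example ("uncore_cbox_0", "uncore_imc_0") are hypothetical, not taken from the patch set.

```c
#include <ctype.h>
#include <string.h>

/* Lowercase a user-supplied name in place, using tolower(). */
void name_tolower(char *s)
{
	for (; *s; s++)
		*s = (char)tolower((unsigned char)*s);
}

/*
 * Return the index of `name` in a list of (already lowercase) pmu
 * names, or -1 if it is not present. `name` is lowercased first, so
 * "Uncore_iMC_0" and "uncore_imc_0" select the same entry.
 */
int pmu_name_lookup(char *name, const char *const *pmus, int nr)
{
	int i;

	name_tolower(name);
	for (i = 0; i < nr; i++) {
		if (!strcmp(name, pmus[i]))
			return i;
	}
	return -1;
}
```

The lookup stays a plain strcmp() against the exported names; only the user's input is normalized, so no case-insensitive directory walk is needed.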

2012-06-13 07:44:19

by Ingo Molnar

Subject: Re: [PATCH V5 0/13] perf: Intel uncore pmu counting support


* Andi Kleen <[email protected]> wrote:

> "Yan, Zheng" <[email protected]> writes:
>
> > Peter suggests keeping the uncore names as they're listed in the intel
> > doc. For Sandy Bridge-EP, uncore names are something like: Cbo, iMC, QPI.
> > I think Uncore_Cbo_0 appears better than uncore_Cbo_0
>
> How about a case insensitive match for the sysfs directories?
> [...]

That's idiotic, avoidable lookup complexity.

> [...] That can be implemented in user land with nftw(). Since
> sysfs is all virtual it should not be too expensive, as long
> as you only walk the pmu parts.
>
> I think that would be most user friendly.

It would be most idiotic, stop suggesting crap.

Why not rot13 it as well? Could be decoded in user-space as well
with some helpers. Hey, md5 checksum it too.

Thanks,

Ingo

2012-06-13 09:02:35

by Peter Zijlstra

Subject: Re: [PATCH V5 0/13] perf: Intel uncore pmu counting support

On Wed, 2012-06-13 at 09:41 +0800, Yan, Zheng wrote:
> Peter suggests keeping the uncore names as they're listed in the intel
> doc. For Sandy Bridge-EP, uncore names are something like: Cbo, iMC,
> QPI.

No they're not, they're C-Box etc.. but I'm fine with doing a tolower on
all of it.

2012-06-14 02:18:43

by Yan, Zheng

Subject: Re: [PATCH V5 0/13] perf: Intel uncore pmu counting support

On 06/13/2012 05:02 PM, Peter Zijlstra wrote:
> On Wed, 2012-06-13 at 09:41 +0800, Yan, Zheng wrote:
>> Peter suggests keeping the uncore names as they're listed in the intel
>> doc. For Sandy Bridge-EP, uncore names are something like: Cbo, iMC,
>> QPI.
>
> No they're not, they're C-Box etc.. but I'm fine with doing a tolower on
> all of it.
>

The reason I chose CBox instead of C-Box is that '-' is a separate token
in the flex rules. '-' is used for matching events such as LLC-load-misses.
I don't know how to allow the character '-' in the pmu name without
introducing ambiguity.

Regards
Yan, Zheng

2012-06-14 05:41:28

by Stephane Eranian

Subject: Re: [PATCH V5 0/13] perf: Intel uncore pmu counting support

On Thu, Jun 14, 2012 at 4:18 AM, Yan, Zheng <[email protected]> wrote:
> On 06/13/2012 05:02 PM, Peter Zijlstra wrote:
>> On Wed, 2012-06-13 at 09:41 +0800, Yan, Zheng wrote:
>>> Peter suggests keeping the uncore names as they're listed in the intel
>>> doc. For Sandy Bridge-EP, uncore names are something like: Cbo, iMC,
>>> QPI.
>>
>> No they're not, they're C-Box etc.. but I'm fine with doing a tolower on
>> all of it.
>>
>
> The reason I choose CBox instead of C-Box is that '-' is a separate symbol
> in the flex rules. '-' is used for matching events such as LLC-load-misses.
> I don't know how to allow letter '-' in the pmu name, but without leading
> to ambiguity.
>
I would drop the -, just call it cbox.
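
One way to apply this consistently is to derive the pmu name from the documented name mechanically. The sketch below is only an illustration; pmu_name_from_doc() is a hypothetical helper, and the rule it uses (keep letters, digits and '_', lowercase the rest) is an assumption rather than something the patch set defines.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy `doc` into `out`, keeping only letters,
 * digits and '_' and lowercasing letters, so "C-Box" becomes "cbox".
 * Returns 0 on success, -1 if `out` is too small.
 */
int pmu_name_from_doc(const char *doc, char *out, size_t size)
{
	size_t n = 0;

	if (size == 0)
		return -1;

	for (; *doc; doc++) {
		unsigned char c = (unsigned char)*doc;

		if (!isalnum(c) && c != '_')
			continue;	/* drops '-' and other separators */
		if (n + 1 >= size)
			return -1;	/* no room left for the NUL */
		out[n++] = (char)tolower(c);
	}
	out[n] = '\0';
	return 0;
}
```

Because the result never contains '-', the event parser's existing use of '-' in names such as LLC-load-misses is left untouched.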

2012-06-27 01:05:17

by Stephane Eranian

Subject: Re: [PATCH V5 0/13] perf: Intel uncore pmu counting support

Hi,

If you compile the uncore support in 32-bit mode, you will
get a warning on the hrtimer_start_range_ns() calls
because the interval is passed as ktime_t whereas the
function expects unsigned long. On 64-bit this is not a problem,
since ktime_t is a union with s64. But in 32-bit mode, there is
a possible truncation of the delta. This needs to be
fixed.
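
To illustrate the kind of truncation meant here, the user-space sketch below models a 64-bit nanosecond value being passed through a 32-bit parameter. uint32_t stands in for unsigned long on a 32-bit build, the function names are hypothetical, and the 60 second interval mentioned afterwards is an assumed value chosen only for the example.

```c
#include <stdint.h>

/*
 * Model what happens when a 64-bit nanosecond value is passed through
 * a parameter that is only 32 bits wide, as `unsigned long` is on a
 * 32-bit build. uint32_t stands in for that parameter type here.
 */
uint32_t pass_as_32bit_ulong(int64_t ns)
{
	return (uint32_t)ns;	/* keeps only the low 32 bits */
}

/* Returns 1 if the value survives the narrowing unchanged. */
int fits_in_32bit_ulong(int64_t ns)
{
	return ns >= 0 && ns <= (int64_t)UINT32_MAX;
}
```

One second (1000000000 ns) fits, but 60 seconds (60000000000 ns) does not: only the low 32 bits remain, giving 4165425152 ns, or roughly 4.2 seconds.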

On Thu, Jun 14, 2012 at 7:41 AM, Stephane Eranian <[email protected]> wrote:
> On Thu, Jun 14, 2012 at 4:18 AM, Yan, Zheng <[email protected]> wrote:
>> On 06/13/2012 05:02 PM, Peter Zijlstra wrote:
>>> On Wed, 2012-06-13 at 09:41 +0800, Yan, Zheng wrote:
>>>> Peter suggests keeping the uncore names as they're listed in the intel
>>>> doc. For Sandy Bridge-EP, uncore names are something like: Cbo, iMC,
>>>> QPI.
>>>
>>> No they're not, they're C-Box etc.. but I'm fine with doing a tolower on
>>> all of it.
>>>
>>
>> The reason I choose CBox instead of C-Box is that '-' is a separate symbol
>> in the flex rules. '-' is used for matching events such as LLC-load-misses.
>> I don't know how to allow letter '-' in the pmu name, but without leading
>> to ambiguity.
>>
> I would drop the -, just call it cbox.

2012-06-27 02:09:05

by Yan, Zheng

Subject: Re: [PATCH V5 0/13] perf: Intel uncore pmu counting support

On 06/27/2012 09:05 AM, Stephane Eranian wrote:
> Hi,
>
> If you compile the uncore support in 32-bit mode, you will
> get a warning on the hrtimer_start_range_ns() functions
> because the interval is passed as ktime_t whereas the
> function expects unsigned long. With 64-bit, no problem
> ktime_t is a union with s64. But in 32-bit mode, there is
> a possible truncation of the delta. This needs to be
> fixed.
>
Thank you for mentioning it, but I think someone has already submitted a patch.

Yan, Zheng

> On Thu, Jun 14, 2012 at 7:41 AM, Stephane Eranian <[email protected]> wrote:
>> On Thu, Jun 14, 2012 at 4:18 AM, Yan, Zheng <[email protected]> wrote:
>>> On 06/13/2012 05:02 PM, Peter Zijlstra wrote:
>>>> On Wed, 2012-06-13 at 09:41 +0800, Yan, Zheng wrote:
>>>>> Peter suggests keeping the uncore names as they're listed in the intel
>>>>> doc. For Sandy Bridge-EP, uncore names are something like: Cbo, iMC,
>>>>> QPI.
>>>>
>>>> No they're not, they're C-Box etc.. but I'm fine with doing a tolower on
>>>> all of it.
>>>>
>>>
>>> The reason I choose CBox instead of C-Box is that '-' is a separate symbol
>>> in the flex rules. '-' is used for matching events such as LLC-load-misses.
>>> I don't know how to allow letter '-' in the pmu name, but without leading
>>> to ambiguity.
>>>
>> I would drop the -, just call it cbox.

2012-06-27 09:17:07

by Peter Zijlstra

Subject: Re: [PATCH V5 0/13] perf: Intel uncore pmu counting support

On Wed, 2012-06-27 at 03:05 +0200, Stephane Eranian wrote:
> If you compile the uncore support in 32-bit mode, you will
> get a warning on the hrtimer_start_range_ns() functions
> because the interval is passed as ktime_t whereas the
> function expects unsigned long. With 64-bit, no problem
> ktime_t is a union with s64. But in 32-bit mode, there is
> a possible truncation of the delta. This needs to be
> fixed.

Right, Andrew has a patch for that, I guess I'd better make sure it
appears in tip as well.